<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Lecture Notes | Haobin Tan</title><link>https://haobin-tan.netlify.app/tags/lecture-notes/</link><atom:link href="https://haobin-tan.netlify.app/tags/lecture-notes/index.xml" rel="self" type="application/rss+xml"/><description>Lecture Notes</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 14 Sep 2022 00:00:00 +0000</lastBuildDate><image><url>https://haobin-tan.netlify.app/media/icon_hu7d15bc7db65c8eaf7a4f66f5447d0b42_15095_512x512_fill_lanczos_center_3.png</url><title>Lecture Notes</title><link>https://haobin-tan.netlify.app/tags/lecture-notes/</link></image><item><title>Math</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/</link><pubDate>Sat, 04 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/</guid><description>&lt;p>Tutorials&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="https://welt-der-bwl.de/Statistik">Statistik&lt;/a>: Zusammenfassung von Statistik&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://studyflix.de/statistik">Statistik Tutorials von Studyflix&lt;/a> 👍&lt;/p>
&lt;/li>
&lt;li>
&lt;p>YouTube channel &amp;ldquo;&lt;a href="https://www.youtube.com/c/MathebyDanielJung">Math by Daniel Jung&lt;/a>&amp;rdquo; (clearly explained, with examples) 👍&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Ereignis und Wahrscheinlichkeit</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/ereignisse_und_wahrscheinlichtkeit/</link><pubDate>Sat, 04 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/ereignisse_und_wahrscheinlichtkeit/</guid><description>&lt;h1 id="ereignisse">Ereignisse&lt;/h1>
&lt;p>A &lt;strong>finite sample space&lt;/strong> of a &lt;strong>random experiment&lt;/strong> is a nonempty set&lt;/p>
&lt;p>
$$
\Omega=\left\{\omega_{1}, \omega_{2}, \ldots, \omega_{N}\right\}.
$$
&lt;em>I.e.,&lt;/em> $\Omega$ contains all possible outcomes.&lt;/p>
&lt;p>The elements $\omega_{n} \in \Omega$
are called &lt;mark>&lt;strong>outcomes&lt;/strong>&lt;/mark>: the possible results of the random experiment.&lt;/p>
&lt;p>Every subset $A \subset \Omega$ is called an &lt;mark>&lt;strong>event&lt;/strong>&lt;/mark>, i.e. a collection of outcomes.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Every one-element subset $\left\{\omega_{n}\right\} \subset \Omega$
is called an &lt;mark>&lt;strong>elementary event&lt;/strong>&lt;/mark> (it contains exactly one outcome).&lt;/p>
&lt;p>$\rightarrow$ The sample space $\Omega$ (the &lt;strong>certain event&lt;/strong>) and the empty set $\emptyset$ (the &lt;strong>impossible event&lt;/strong>) are always events.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>For two events $A$ and $B$:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>If $A \subset B$, then $A$ is a &lt;mark>&lt;strong>sub-event&lt;/strong>&lt;/mark> of $B$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The &lt;strong>intersection&lt;/strong> $(A \cap B)$, the &lt;strong>union&lt;/strong> $(A \cup B)$, and the &lt;strong>difference&lt;/strong> $(A-B)$ are also events.&lt;/p>
&lt;ul>
&lt;li>Intersection and union are &lt;em>commutative&lt;/em> and &lt;em>associative&lt;/em>, and each is &lt;em>distributive&lt;/em> over the other.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>The &lt;strong>opposite event&lt;/strong> $\bar{A}$ of $A$ is also an event and is called the &lt;mark>&lt;strong>negation&lt;/strong>&lt;/mark> or &lt;mark>&lt;strong>complement&lt;/strong>&lt;/mark> of $A$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If $A \cap B=\varnothing$, then $A$ and $B$ are called &lt;mark>&lt;strong>disjoint&lt;/strong>&lt;/mark> or &lt;mark>&lt;strong>mutually exclusive&lt;/strong>&lt;/mark>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>De Morgan&amp;rsquo;s laws&lt;/strong>&lt;/p>
$$
\begin{array}{l}
\overline{A \cup B}=\bar{A} \cap \bar{B} \\
\overline{A \cap B}=\bar{A} \cup \bar{B}
\end{array}
$$
&lt;/li>
&lt;/ul>
&lt;details class="spoiler " id="spoiler-4">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>Würfel werfen.&lt;/p>
&lt;ul>
&lt;li>Ergebnisraum $\Omega = \\{1, 2, 3, 4, 5, 6\\}$
(Also $\|\Omega\| = 6$)&lt;/li>
&lt;li>Beispiel Ereignise
&lt;ul>
&lt;li>&amp;ldquo;Der Würfel zeight eine ungerade Zahl.&amp;rdquo;&lt;/li>
&lt;li>&amp;ldquo;Der Würfel zeigt eine 3.&amp;rdquo;&lt;/li>
&lt;li>&amp;ldquo;Der Würfel zeigt eine 3.&amp;rdquo; (das unmögliche Ereignis)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Ereignis $A$ = &amp;ldquo;Der Würfel zeight eine ungerade Zahl.&amp;rdquo; = $\\{1, 3, 5\\}$. Ereignis $B$ = &amp;ldquo;Der Würfel zeight eine gerade Zahl&amp;rdquo; = $\\{2, 4, 6\\}$. $A \cap B = \emptyset$ $\Rightarrow$ $A$ und $B$ sind disjunkt oder unvereinbar.&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/details>
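&lt;p>The set operations above map directly onto Python&amp;rsquo;s built-in sets. The following is a minimal sketch for the die example; the names &lt;code>omega&lt;/code>, &lt;code>complement&lt;/code>, &lt;code>C&lt;/code> and &lt;code>D&lt;/code> are illustrative choices and do not come from the lecture notes.&lt;/p>

```python
# Sketch: events of a single die roll modelled as Python sets.
# All names here are illustrative assumptions.
omega = frozenset({1, 2, 3, 4, 5, 6})   # sample space
A = frozenset({1, 3, 5})                # "the die shows an odd number"
B = frozenset({2, 4, 6})                # "the die shows an even number"


def complement(event, space=omega):
    """Return the complementary event relative to the sample space."""
    return space - event


# A and B are disjoint, and together they cover the sample space.
are_disjoint = A.isdisjoint(B)
cover_omega = A.union(B) == omega

# De Morgan's laws, checked on two overlapping events C and D.
C = frozenset({1, 2, 3})
D = frozenset({3, 4})
de_morgan_union = complement(C.union(D)) == complement(C).intersection(complement(D))
de_morgan_intersection = complement(C.intersection(D)) == complement(C).union(complement(D))

print(are_disjoint, cover_omega, de_morgan_union, de_morgan_intersection)
```

&lt;p>Using &lt;code>frozenset&lt;/code> keeps events immutable and hashable, so a system of events can itself be stored in a set.&lt;/p>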
&lt;h2 id="wahrscheinlichkeit-von-kolmogoroff">Wahrscheinlichkeit (von Kolmogoroff)&lt;/h2>
&lt;p>Ein nichtleeres System $\mathfrak{B}$ von Teilmengen eines Ergebnisraums $\Omega$ heißt &lt;mark>&lt;strong>$\sigma$-Algebra&lt;/strong>&lt;/mark> (über $\Omega$), wenn gilt&lt;/p>
$$
\begin{array}{c}
A \in \mathfrak{B} \quad \Rightarrow \quad \bar{A} \in \mathfrak{B}, \\
A_{n} \in \mathfrak{B} ; n=1,2, \ldots \quad \Rightarrow \quad \bigcup_{n=1}^{\infty} A_{n} \in \mathfrak{B}.
\end{array}
$$
&lt;p>An at most countable system of pairwise disjoint events&lt;/p>
$$\left\{A_{n} \in \mathfrak{B}: A_{k} \cap A_{n}=\varnothing, k \neq n\right\}$$
&lt;p>is called a &lt;mark>&lt;strong>complete system of events&lt;/strong>&lt;/mark> (&lt;em>vollständige Ereignisdisjunktion&lt;/em>, i.e. a partition of $\Omega$) if $\bigcup_{n=1}^{\infty} A_{n}=\Omega$.&lt;/p>
&lt;h3 id="kolmogoroffsche-axiome">Kolmogoroffsche Axiome&lt;/h3>
&lt;p>Gegeben seien ein Ergebnisraum $\Omega$ und eine geeignete $\sigma$-Algebra $\mathfrak{B}$ über $\Omega$. Die Elemente von $\mathfrak{B}$ sind also die Ereignisse eines Zufallsexperiments.&lt;/p>
&lt;p>Eine Funktion $P$, die jedem Ereignis $A \in \mathfrak{B}$ eine relle Zahl zuordnet, erfülle&lt;/p>
$$
\begin{aligned}
\mathrm{P}(\Omega) &amp;=1 \quad &amp;(\text{Normiertheit})\\
\mathrm{P}(A) &amp; \geq 0 \quad \forall A \in \mathfrak{B} \quad &amp;(\text{Nicht-negativität}) \\
\mathrm{P}\left(\bigcup_{n=1}^{\infty} A_{n}\right) &amp;=\sum_{n=1}^{\infty} \mathrm{P}\left(A_{n}\right) \quad A_i \cap A_j = \emptyset, \forall i,j \quad &amp;(\text{Additivität})
\end{aligned}
$$
&lt;p>dann heißt $P(A)$ die &lt;mark>&lt;strong>Wahrscheinlichkeit&lt;/strong>&lt;/mark> des Ereignisses $A$.&lt;/p>
&lt;details class="spoiler " id="spoiler-9">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>Würfelwurf&lt;/p>
&lt;p>Ergebnisraum $\Omega = \\{1, 2, 3, 4, 5, 6\\}$&lt;/p>
&lt;p>Ereignis $E = \text{Zahlen von 1 bis 6}$, also $E_i$
ist die Zahl $i$ (z.B $E_1$
ist die Zahl 1).&lt;/p>
&lt;p>Dann haben wir:&lt;/p>
$$
\begin{aligned}
P(E_1) &amp;= \frac{1}{6} \\
P(E_2) &amp;= \frac{1}{6} \\
P(\Omega) &amp;= \frac{6}{6} = 1 \\
P(E_1 \cup E_2) &amp;= \frac{1}{6} + \frac{1}{6} = \frac{2}{6} \quad (E_1 \cap E_2 = \emptyset)
\end{aligned}
$$
&lt;/div>
&lt;/details>
&lt;p>From these axioms it follows that&lt;/p>
$$
\begin{aligned}
\mathrm{P}(\varnothing) &amp;=0, \\
\mathrm{P}(\bar{A}) &amp;=1-\mathrm{P}(A), \\
0 \leq \mathrm{P}(A) &amp; \leq 1, \\
\mathrm{P}(A \cup B) &amp;=\mathrm{P}(A)+\mathrm{P}(B)-\mathrm{P}(A \cap B), \\
\mathrm{P}\left(\bigcup_{n=1}^{\infty} A_{n}\right) &amp;=1 \quad \text { for every complete system of events } A_{n} .
\end{aligned}
$$
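&lt;p>These consequences can be checked on the fair-die example. The sketch below assumes the uniform (Laplace) probability $P(A) = |A| / |\Omega|$; the helper name &lt;code>prob&lt;/code> and the events &lt;code>A&lt;/code> and &lt;code>B&lt;/code> are hypothetical and chosen only for illustration.&lt;/p>

```python
from fractions import Fraction

# Assumption: a fair die, so every outcome has probability 1/6.
omega = frozenset(range(1, 7))


def prob(event):
    """Uniform probability of an event, i.e. |event| / |omega|."""
    return Fraction(len(event.intersection(omega)), len(omega))


A = frozenset({1, 3, 5})    # odd number
B = frozenset({1, 2})       # 1 or 2

p_omega = prob(omega)                    # normalization
p_empty = prob(frozenset())              # impossible event
p_not_a = prob(omega - A)                # complement rule
p_union = prob(A.union(B))               # left-hand side of the addition rule
inclusion_exclusion = prob(A) + prob(B) - prob(A.intersection(B))

print(p_omega, p_empty, p_not_a, p_union, inclusion_exclusion)
```

&lt;p>Exact rational arithmetic via &lt;code>fractions.Fraction&lt;/code> avoids floating-point rounding, so the identities can be compared with &lt;code>==&lt;/code>.&lt;/p>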
&lt;h2 id="bedingte-wahrscheinlichkeiten">Bedingte Wahrscheinlichkeiten&lt;/h2>
&lt;p>Sei $B \subset \Omega$ als &lt;strong>vorausgesetztes Ereignis&lt;/strong>, $A, B \in \mathfrak{B}$ und $\mathrm{P}(B)>0$. Dann heißt&lt;/p>
$$
\mathrm{P}(A \mid B)=\frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)}
$$
&lt;p>is called the &lt;mark>&lt;strong>conditional probability&lt;/strong>&lt;/mark> of $A$ given $B$.&lt;/p>
&lt;h3 id="multiplikationsregel-fur-wahrscheinlichkeiten">Multiplication rule for probabilities&lt;/h3>
$$
\mathrm{P}(A \cap B)=\mathrm{P}(A \mid B) \mathrm{P}(B)
$$
&lt;p>In general, $\mathrm{P}(A \mid B) \neq \mathrm{P}(B \mid A)$. The two are related by&lt;/p>
$$
\mathrm{P}(A \mid B) \mathrm{P}(B)=\mathrm{P}(A \cap B) = \mathrm{P}(B \mid A) \mathrm{P}(A)
$$
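&lt;p>A small numerical check of the definition and the multiplication rule, again on a fair die. This is only a sketch: the helpers &lt;code>prob&lt;/code> and &lt;code>cond_prob&lt;/code> and the two events are assumptions made for the example.&lt;/p>

```python
from fractions import Fraction

omega = frozenset(range(1, 7))          # fair die, all outcomes equally likely


def prob(event):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(event), len(omega))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); the condition b needs positive probability."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a.intersection(b)) / prob(b)


A = frozenset({1, 3, 5})                # the die shows an odd number
B = frozenset({1, 2})                   # the die shows 1 or 2

p_a_given_b = cond_prob(A, B)
p_b_given_a = cond_prob(B, A)
p_joint = prob(A.intersection(B))

print(p_a_given_b, p_b_given_a, p_joint)
```

&lt;p>Here the two conditional probabilities differ, yet both products $\mathrm{P}(A \mid B)\mathrm{P}(B)$ and $\mathrm{P}(B \mid A)\mathrm{P}(A)$ equal $\mathrm{P}(A \cap B)$.&lt;/p>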
&lt;p>Generalization: applying the multiplication rule repeatedly to the intersection of $N$ random events yields&lt;/p>
$$
\begin{aligned}
&amp;\mathrm{P}\left(\bigcap_{n=1}^{N} A_{n}\right) \\
=&amp;\mathrm{P}\left(\bigcap_{n=2}^{N} A_{n} \mid A_{1}\right) \mathrm{P}\left(A_{1}\right) \\
=&amp;\mathrm{P}\left(\bigcap_{n=3}^{N} A_{n} \mid A_{2} \cap A_{1}\right) \mathrm{P}\left(A_{2} \mid A_{1}\right) \mathrm{P}\left(A_{1}\right) \\
=&amp;\mathrm{P}\left(\bigcap_{n=4}^{N} A_{n} \mid A_{3} \cap A_{2} \cap A_{1}\right) \mathrm{P}\left(A_{3} \mid A_{2} \cap A_{1}\right) \mathrm{P}\left(A_{2} \mid A_{1}\right) \mathrm{P}\left(A_{1}\right) \\
=&amp;\mathrm{P}\left(A_{N} \mid \bigcap_{n=1}^{N-1} A_{n}\right) \cdots \mathrm{P}\left(A_{4} \mid A_{3} \cap A_{2} \cap A_{1}\right) \mathrm{P}\left(A_{3} \mid A_{2} \cap A_{1}\right) \mathrm{P}\left(A_{2} \mid A_{1}\right) \mathrm{P}\left(A_{1}\right)
\end{aligned}
$$
&lt;details class="spoiler " id="spoiler-15">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>Vereinfachung mit 3 Ereignisse&lt;/p>
$$
\begin{array}{ll}
&amp;P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B) \\\\
=&amp;P(A) \cdot \frac{P(A \cap B)}{P(A)} \cdot \frac{P(C \mid A \cap B)}{P(A \cap B)} \\\\
=&amp;P(A \cap B \cap C)
\end{array}
$$
&lt;/div>
&lt;/details>
&lt;h3 id="formel-von-der-totalen-wahrscheinlichkeit">Formel von der totalen Wahrscheinlichkeit&lt;/h3>
&lt;p>Die Ereignisse $A_{n}(1 \leq n \leq N)$
seien eine vollständige &lt;em>Ereignisdisjunktion&lt;/em> (also $A_i \cap A_j = \emptyset, \forall i, j$
) und es gelte $\mathrm{P}\left(A_{n}\right)>0, \forall n$
. Dann folgt für $\forall B \in \mathfrak{B}$ die &lt;strong>Formel von der totalen Wahrscheinlichkeit&lt;/strong>&lt;/p>
$$
\mathrm{P}(B)=\sum_{n=1}^{N} \mathrm{P}\left(B \mid A_{n}\right) \mathrm{P}\left(A_{n}\right)
$$
&lt;details class="spoiler " id="spoiler-20">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>$A \cap \bar{A} = \emptyset$&lt;/p>
$$
\begin{array}{l}
P(B)&amp;=P(B \cap A)+P(B \cap \bar{A}) \\\\
&amp;=P(A)P(B \mid A)+P(\bar{A})P(B \mid \bar{A})
\end{array}
$$
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-21">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;/div>
&lt;/details>
&lt;p>If in addition $P(B) > 0$, &lt;strong>Bayes&amp;rsquo; theorem&lt;/strong> follows:&lt;/p>
$$
\mathrm{P}\left(A_{n} \mid B\right)=\frac{\mathrm{P}\left(B \mid A_{n}\right) \mathrm{P}\left(A_{n}\right)}{\sum_{k=1}^{N} \mathrm{P}\left(B \mid A_{k}\right) \mathrm{P}\left(A_{k}\right)}
$$
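&lt;p>The two formulas can be sketched in a few lines of Python. The priors and likelihoods below are made-up numbers for three hypothetical events $A_1, A_2, A_3$; only the structure of the computation follows the formulas above.&lt;/p>

```python
from fractions import Fraction

# Hypothetical numbers: P(A_n) for a complete system of three events ...
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
# ... and the conditional probabilities P(B | A_n).
likelihood = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]


def total_probability(prior, likelihood):
    """P(B) as the sum over n of P(B | A_n) * P(A_n)."""
    return sum(l * p for l, p in zip(likelihood, prior))


def bayes(prior, likelihood):
    """Posterior probabilities P(A_n | B) for every n; requires P(B) above zero."""
    p_b = total_probability(prior, likelihood)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return [l * p / p_b for l, p in zip(likelihood, prior)]


p_b = total_probability(prior, likelihood)
posterior = bayes(prior, likelihood)

print(p_b, posterior, sum(posterior))
```

&lt;p>The denominator in Bayes&amp;rsquo; theorem is exactly the total probability $\mathrm{P}(B)$, which is why the posteriors sum to 1.&lt;/p>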
&lt;p>In general, $\mathrm{P}(A) \neq \mathrm{P}(A \mid B)$. However, if for $A, B \in \mathfrak{B}$&lt;/p>
$$
\mathrm{P}(A \mid B)=\mathrm{P}(A),
$$
&lt;p>then $A$ is called &lt;mark>&lt;strong>independent&lt;/strong>&lt;/mark> of $B$.&lt;/p>
&lt;p>For independent events it follows that&lt;/p>
$$
\begin{array}{c}
\mathrm{P}(A \cap B)=\mathrm{P}(A \mid B) \mathrm{P}(B)=\mathrm{P}(A) \mathrm{P}(B) \\
\mathrm{P}(B \mid A)=\frac{\mathrm{P}(A \cap B)}{\mathrm{P}(A)}=\mathrm{P}(B)
\end{array}
$$</description></item><item><title>Glossary</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/01-glossary/</link><pubDate>Mon, 01 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/01-glossary/</guid><description>&lt;h2 id="router">Router&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>ISP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>nternet &lt;strong>S&lt;/strong>ervice &lt;strong>P&lt;/strong>rovider&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>IXP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>nternet e&lt;strong>X&lt;/strong>change &lt;strong>P&lt;/strong>oint&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>FIB&lt;/strong>&lt;/td>
&lt;td>&lt;strong>F&lt;/strong>orwarding &lt;strong>I&lt;/strong>nformation &lt;strong>B&lt;/strong>ase&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>CAM&lt;/strong>&lt;/td>
&lt;td>&lt;strong>C&lt;/strong>ontent-&lt;strong>A&lt;/strong>ddressable &lt;strong>M&lt;/strong>emory&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="internet-routing">Internet Routing&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>AS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>utonomous &lt;strong>S&lt;/strong>ystem&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>IGP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>nterior &lt;strong>G&lt;/strong>ateway &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>EGP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>E&lt;/strong>xterior &lt;strong>G&lt;/strong>ateway &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ASN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>utonomous &lt;strong>S&lt;/strong>ystem &lt;strong>N&lt;/strong>umber&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>CDN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>C&lt;/strong>ontent &lt;strong>D&lt;/strong>elivery &lt;strong>N&lt;/strong>etwork&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>OSPF&lt;/strong>&lt;/td>
&lt;td>&lt;strong>O&lt;/strong>pen &lt;strong>S&lt;/strong>hortest &lt;strong>P&lt;/strong>ath &lt;strong>F&lt;/strong>irst&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>LSA&lt;/strong>&lt;/td>
&lt;td>&lt;strong>L&lt;/strong>ink &lt;strong>S&lt;/strong>tate &lt;strong>A&lt;/strong>dvertisement&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ABR&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>rea &lt;strong>B&lt;/strong>order &lt;strong>R&lt;/strong>outer&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>BGP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>B&lt;/strong>order &lt;strong>G&lt;/strong>ateway &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>RIB&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>outing &lt;strong>I&lt;/strong>nformation &lt;strong>B&lt;/strong>ase&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="label-switching">Label Switching&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>MPLS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>M&lt;/strong>ulti&lt;strong>p&lt;/strong>rotocol &lt;strong>L&lt;/strong>abel &lt;strong>S&lt;/strong>witching&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>LSR&lt;/strong>&lt;/td>
&lt;td>&lt;strong>L&lt;/strong>abel-&lt;strong>s&lt;/strong>witching &lt;strong>r&lt;/strong>outer&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>LER&lt;/strong>&lt;/td>
&lt;td>&lt;strong>L&lt;/strong>abel &lt;strong>e&lt;/strong>dge &lt;strong>r&lt;/strong>outer&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>FEC&lt;/strong>&lt;/td>
&lt;td>&lt;strong>F&lt;/strong>orwarding &lt;strong>e&lt;/strong>quivalence &lt;strong>c&lt;/strong>lass&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>RSVP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>esource Re&lt;strong>S&lt;/strong>er&lt;strong>V&lt;/strong>ation &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>VPN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>V&lt;/strong>irtual &lt;strong>P&lt;/strong>rivate &lt;strong>N&lt;/strong>etwork&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="software-defined-network-sdn">Software Defined Network (SDN)&lt;/h2>
&lt;h2 id="network-function-virtualization-nfv">Network Function Virtualization (NFV)&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>NAT&lt;/strong>&lt;/td>
&lt;td>&lt;strong>N&lt;/strong>etwork &lt;strong>A&lt;/strong>ddress &lt;strong>T&lt;/strong>ranslation&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>NFVI&lt;/strong>&lt;/td>
&lt;td>&lt;strong>N&lt;/strong>etwork &lt;strong>F&lt;/strong>unction &lt;strong>V&lt;/strong>irtualization &lt;strong>I&lt;/strong>nfrastructure&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>VNF&lt;/strong>&lt;/td>
&lt;td>&lt;strong>V&lt;/strong>irtualized &lt;strong>N&lt;/strong>etwork &lt;strong>F&lt;/strong>unctions&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>MANO&lt;/strong>&lt;/td>
&lt;td>Management and Orchestration&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>SFC&lt;/strong>&lt;/td>
&lt;td>&lt;strong>S&lt;/strong>ervice &lt;strong>F&lt;/strong>unction &lt;strong>C&lt;/strong>haining&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="congestion-control">Congestion Control&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>RTT&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>ound &lt;strong>T&lt;/strong>rip &lt;strong>T&lt;/strong>ime&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>EWMA&lt;/strong>&lt;/td>
&lt;td>&lt;strong>E&lt;/strong>xponential &lt;strong>W&lt;/strong>eighted &lt;strong>M&lt;/strong>oving &lt;strong>A&lt;/strong>verage&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>RTO&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>etransmission &lt;strong>T&lt;/strong>ime&lt;strong>O&lt;/strong>ut&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>AIMD&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>dditively &lt;strong>I&lt;/strong>ncrease &lt;strong>M&lt;/strong>ultiplicatively &lt;strong>D&lt;/strong>ecrease&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>AQM&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>ctive &lt;strong>Q&lt;/strong>ueue &lt;strong>M&lt;/strong>anagement&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ECN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>E&lt;/strong>xplicit &lt;strong>C&lt;/strong>ongestion &lt;strong>N&lt;/strong>otification&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>RED&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>andom &lt;strong>E&lt;/strong>arly &lt;strong>D&lt;/strong>etection&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="ethernet">Ethernet&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>CSMA&lt;/strong>&lt;/td>
&lt;td>&lt;strong>C&lt;/strong>arrier &lt;strong>S&lt;/strong>ense &lt;strong>M&lt;/strong>ultiple &lt;strong>A&lt;/strong>ccess&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>CD&lt;/strong>&lt;/td>
&lt;td>&lt;strong>C&lt;/strong>ollision &lt;strong>D&lt;/strong>etection&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>CA&lt;/strong>&lt;/td>
&lt;td>&lt;strong>C&lt;/strong>ollision &lt;strong>A&lt;/strong>voidance&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>IFS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>nter &lt;strong>F&lt;/strong>rame &lt;strong>S&lt;/strong>pace&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>BPDU&lt;/strong>&lt;/td>
&lt;td>&lt;strong>B&lt;/strong>ridge &lt;strong>P&lt;/strong>rotocol &lt;strong>D&lt;/strong>ata &lt;strong>U&lt;/strong>nits&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>STP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>S&lt;/strong>panning &lt;strong>T&lt;/strong>ree &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>RSTP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>R&lt;/strong>apid &lt;strong>S&lt;/strong>panning &lt;strong>T&lt;/strong>ree &lt;strong>P&lt;/strong>rotocol&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="data-center">Data Center&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>PFC&lt;/strong>&lt;/td>
&lt;td>&lt;strong>P&lt;/strong>riority-based &lt;strong>F&lt;/strong>low &lt;strong>C&lt;/strong>ontrol&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>PCP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>P&lt;/strong>riority &lt;strong>C&lt;/strong>ode &lt;strong>P&lt;/strong>oint&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ETS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>E&lt;/strong>nhanced &lt;strong>T&lt;/strong>ransmission &lt;strong>S&lt;/strong>election&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>PG&lt;/strong>&lt;/td>
&lt;td>&lt;strong>P&lt;/strong>riority &lt;strong>G&lt;/strong>roups&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>QCN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>Q&lt;/strong>uantized &lt;strong>C&lt;/strong>ongestion &lt;strong>N&lt;/strong>otification&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>SPB&lt;/strong>&lt;/td>
&lt;td>&lt;strong>S&lt;/strong>hortest &lt;strong>P&lt;/strong>ath &lt;strong>B&lt;/strong>ridging&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>TRILL&lt;/strong>&lt;/td>
&lt;td>&lt;strong>Tr&lt;/strong>ansparent &lt;strong>I&lt;/strong>nterconnection of &lt;strong>L&lt;/strong>ots of &lt;strong>L&lt;/strong>inks&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>IS-IS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>ntermediate-&lt;strong>S&lt;/strong>ystem-to-&lt;strong>I&lt;/strong>ntermediate-&lt;strong>S&lt;/strong>ystem&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>DCTCP&lt;/strong>&lt;/td>
&lt;td>&lt;strong>D&lt;/strong>ata &lt;strong>C&lt;/strong>enter &lt;strong>TCP&lt;/strong>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ECN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>E&lt;/strong>xplicit &lt;strong>C&lt;/strong>ongestion &lt;strong>N&lt;/strong>otification&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="tcp-evolution">TCP Evolution&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>TLV&lt;/strong>&lt;/td>
&lt;td>&lt;strong>T&lt;/strong>ype-&lt;strong>L&lt;/strong>ength-&lt;strong>V&lt;/strong>alue&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>TFO&lt;/strong>&lt;/td>
&lt;td>&lt;strong>T&lt;/strong>CP &lt;strong>F&lt;/strong>ast &lt;strong>O&lt;/strong>pen&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="accees-networks">Accees Networks&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Acronym&lt;/th>
&lt;th>Full Name&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>ISDN&lt;/strong>&lt;/td>
&lt;td>&lt;strong>I&lt;/strong>ntegrated &lt;strong>S&lt;/strong>ervices &lt;strong>D&lt;/strong>igital &lt;strong>N&lt;/strong>etwork&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>NT&lt;/strong>&lt;/td>
&lt;td>&lt;strong>N&lt;/strong>etwork &lt;strong>T&lt;/strong>ermination&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>DSL&lt;/strong>&lt;/td>
&lt;td>&lt;strong>D&lt;/strong>igital &lt;strong>S&lt;/strong>ubscriber &lt;strong>L&lt;/strong>ine&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ADSL&lt;/strong>&lt;/td>
&lt;td>&lt;strong>A&lt;/strong>symmetric &lt;strong>DSL&lt;/strong>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>SDSL&lt;/strong>&lt;/td>
&lt;td>&lt;strong>S&lt;/strong>ymmetric &lt;strong>DSL&lt;/strong>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>BRAS&lt;/strong>&lt;/td>
&lt;td>&lt;strong>B&lt;/strong>roadband &lt;strong>R&lt;/strong>emote &lt;strong>A&lt;/strong>ccess &lt;strong>S&lt;/strong>erver&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table></description></item><item><title>(Diracsche) Delta-Distribution / Delta-Funktion</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/dirac_funktion/</link><pubDate>Sat, 04 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/dirac_funktion/</guid><description>&lt;h2 id="definition">Definition&lt;/h2>
&lt;p>The &lt;mark>&lt;strong>delta distribution&lt;/strong>&lt;/mark> (also known as the &lt;strong>Dirac function&lt;/strong>, &lt;strong>Dirac measure&lt;/strong>, or &lt;strong>impulse function&lt;/strong>) is a special singular (non-regular) &lt;a href="https://de.wikipedia.org/wiki/Distribution_(Mathematik)">distribution&lt;/a> with &lt;a href="https://de.wikipedia.org/wiki/Kompakter_Raum">compact&lt;/a> &lt;a href="https://de.wikipedia.org/wiki/Tr%C3%A4ger_(Mathematik)">support&lt;/a>.&lt;/p>
$$
\begin{array}{c}
\delta(x)=0, \quad x \neq 0 \\\\
\displaystyle \int_{a}^{b} \delta(x) \mathrm{d} x=1, \quad a&lt;0&lt;b
\end{array}
$$
&lt;p>Illustration: a delta function at the origin is drawn as an arrow at $x=0$ and represents a point charge (source: &lt;a href="https://de.universaldenker.org/lektionen/235">Dirac&amp;rsquo;sche Delta-Funktion und ihre Eigenschaften&lt;/a>).&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/dirac-delta-graph.svg" alt="A delta function at the origin drawn as an arrow" style="width: 50%;" />
&lt;h2 id="delta-funktion-im-koordinatenursprung">Delta function at the origin&lt;/h2>
&lt;p>Consider the integral of the delta function multiplied by a &lt;strong>test function&lt;/strong> $f(x)$, with $a&lt;0&lt;b$:&lt;/p>
$$
\int_{a}^{b} f(x) \delta(x) \mathrm{d} x
$$
&lt;p>$\delta(x)$ is $0$ everywhere except at $x=0$.&lt;/p>
&lt;p>$\Rightarrow$ $f(x)\delta(x)$ is $0$ everywhere except at $x=0$.&lt;/p>
&lt;p>$\Rightarrow$ Only the function value $f(0)$ contributes to the integral. It does not depend on $x$ and can be pulled out of the integral.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/dirac-delta-function-picks-function-value-at-origin-with-boundaries.svg" alt="The delta function picks out the function value at the origin within an interval" style="width:50%;" />
&lt;p>Hence:&lt;/p>
$$
\int_{a}^{b} f(x) \delta(x) \mathrm{d} x= \int_{a}^{b} f(0)\delta(x) \mathrm{d} x=f(0) \underbrace{\int_{a}^{b} \delta(x)\mathrm{d} x}_{=1} = f(0)
$$
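&lt;p>The sifting property can be illustrated numerically by replacing $\delta(x)$ with a narrow normalized Gaussian. This is a sketch under stated assumptions: the Gaussian approximation, its width &lt;code>eps&lt;/code>, the midpoint rule and all function names are choices made for this example and are not part of the notes.&lt;/p>

```python
import math


def delta_approx(x, eps=1e-2):
    """Normalized Gaussian of width eps; it approaches delta(x) as eps shrinks."""
    return math.exp(-x * x / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))


def integrate(func, a, b, n=200_000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def sift_at_origin(f, a=-1.0, b=1.0, eps=1e-2):
    """Approximate the integral of f(x) * delta(x) over [a, b]; needs a below 0 below b."""
    return integrate(lambda x: f(x) * delta_approx(x, eps), a, b)


approx = sift_at_origin(math.cos)   # should be close to cos(0)
exact = math.cos(0.0)

print(approx, exact)
```

&lt;p>Shrinking &lt;code>eps&lt;/code> (while keeping the grid fine enough to resolve the peak) moves the approximation closer to $f(0)$.&lt;/p>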
&lt;h2 id="eigenschaften">Eigenschaften&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Bei Berechnen/Verweden/Überprüfen der Eigenschaften von Dirac-Funktion ist es wichtig, die &lt;a href="https://de.wikipedia.org/wiki/Integration_durch_Substitution">Substitutionsregel&lt;/a> zu verwenden.&lt;/span>
&lt;/div>
&lt;h3 id="verschobene-delta-funktion">Verschobene Delta-Funktion&lt;/h3>
&lt;p>Verschiebe die Ladung an eine andere Stelle auf der $x$-Achse (z.B an die Stelle $x=x_0$). Das Argument der Delta-Funktion wird zu $\delta(x-x_0)$.&lt;/p>
&lt;p>Die verschobene Delta-Funktion mit einer anderen Funktion $f(x)$ im Integral multipliziert:&lt;/p>
$$
\int_{a}^{b} f(x) \delta\left(x-x_{0}\right) \mathrm{d} x=f\left(x_{0}\right)
$$
&lt;details class="spoiler " id="spoiler-4">
&lt;summary class="cursor-pointer">Beweis&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/verschobene_Dirac_Fkt.gif" alt="verschobene_Dirac_Fkt">
&lt;/div>
&lt;/details>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/shifted-dirac-delta-function.svg" alt="Verschobene Delta-Funktion pickt einen Funktionswert heraus" style="width:50%;" />
&lt;p>Nach rechts verschobene Delta-Funktion pickt den Wert $f(x_0)$ der Funktion an der Stelle $x=x_0$.&lt;/p>
&lt;details class="spoiler " id="spoiler-5">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-02%2012.10.45.png" alt="截屏2022-06-02 12.10.45">
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-6">
&lt;summary class="cursor-pointer">Beispiel&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
Eine Delta-Funktion außerhlad der Integrationsgrenzen
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-02%2012.11.43.png" alt="截屏2022-06-02 12.11.43">
&lt;/div>
&lt;/details>
&lt;h3 id="symmetrie">Symmetrie&lt;/h3>
&lt;p>Delta-Funktion ist symmetrisch (gerade)&lt;/p>
$$
\delta(x) = \delta(-x)
$$
&lt;details class="spoiler " id="spoiler-8">
&lt;summary class="cursor-pointer">Beweis&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-02%2012.47.40.png" alt="截屏2022-06-02 12.47.40">
&lt;/div>
&lt;/details>
&lt;h3 id="skalierung">Skalierung&lt;/h3>
&lt;p>Skaliertes Argument der Delta-Funktion&lt;/p>
$$
\int_{a}^{b} f(x) \delta(|k| x) \mathrm{d} x=\frac{1}{|k|} f(0)
$$
&lt;details class="spoiler " id="spoiler-10">
&lt;summary class="cursor-pointer">Beweis&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-02%2016.27.14.png" alt="截屏2022-06-02 16.27.14">
&lt;/div>
&lt;/details>
&lt;h3 id="hintereinanderausführung">Hintereinanderausführung&lt;/h3>
$$
\int_{-\infty}^{\infty} f(x) \delta(g(x)) \mathrm{d} x=\sum_{i=1}^{n} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|}
$$
&lt;p>where the $x_i$ ($i = 1, \ldots, n$) are the zeros of $g$, i.e. $g(x_i) = 0$, and $g^\prime(x_i) \neq 0$.&lt;/p>
&lt;details class="spoiler " id="spoiler-12">
&lt;summary class="cursor-pointer">Beweis&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>Substituiere&lt;/p>
$$
u := g(x)
$$
&lt;p>Then:&lt;/p>
$$
\begin{aligned}
x &amp;= g^{-1}(u) \\\\
\frac{du}{dx} &amp;= g^\prime(x) = g^\prime(g^{-1}(u))
\end{aligned}
$$
&lt;p>Since $\delta(g(x))$ is nonzero only where $g(x) = 0$, we can split the domain of integration into small intervals around each zero $x_i$ of $g(x)$, on which $g(x)$ is monotonic and hence invertible.&lt;/p>
$$
\begin{aligned}
\int f(x) \delta(g(x)) d x &amp;=\sum_{i} \int_{x_{i}-\varepsilon_{i}}^{x_{i}+\varepsilon_{i}} f(x) \delta(g(x)) d x \\\\
&amp;=\sum_{i} \int_{g\left(x_{i}-\varepsilon_{i}\right)}^{g\left(x_{i}+\varepsilon_{i}\right)} f\left(g^{-1}(u)\right) \delta(u) \frac{1}{g^{\prime}\left(g^{-1}(u)\right)} d u \\\\
&amp;=\sum_{i} \int_{g\left(x_{i}-\varepsilon_{i}\right)}^{g\left(x_{i}+\varepsilon_{i}\right)} \frac{f\left(g^{-1}(u)\right)}{g^{\prime}\left(g^{-1}(u)\right)} \delta(u) d u \\\\
&amp;=\sum_{i} \int_{g\left(x_{i}-\varepsilon_{i}\right)}^{g\left(x_{i}+\varepsilon_{i}\right)} \frac{f\left(x_{i}\right)}{g^{\prime}\left(x_{i}\right)} \delta(u) d u \quad(\ast)
\end{aligned}
$$
&lt;p>Case $g^\prime (x_i) > 0$:&lt;/p>
$$
\begin{aligned}
(\ast) &amp;=\sum_{i} \frac{f\left(x_{i}\right)}{g^{\prime}\left(x_{i}\right)} \underbrace{\int_{g\left(x_{i}-\varepsilon_{i}\right)}^{g\left(x_{i}+\varepsilon_{i}\right)} \delta(u) d u}_{=1} \\
&amp;=\sum_{i} \frac{f\left(x_{i}\right)}{g^{\prime}\left(x_{i}\right)} \\
&amp;=\sum_{i} \frac{f\left(x_{i}\right)}{|g^{\prime}\left(x_{i}\right)|}
\end{aligned}
$$
&lt;p>$g^\prime (x_i) &lt; 0$
:&lt;/p>
&lt;p>Dann ist&lt;/p>
$$
g(x_i + \varepsilon_i) &lt; g(x_i - \varepsilon_i)
$$
&lt;p>so the lower limit of each integral in $(\ast)$ is larger than the upper limit. Swapping the limits flips the sign:&lt;/p>
$$
\begin{aligned}
(\ast) &amp;=\sum_{i} \int_{g\left(x_{i}-\varepsilon_{i}\right)}^{g\left(x_{i}+\varepsilon_{i}\right)} \frac{f\left(x_{i}\right)}{g^{\prime}\left(x_{i}\right)} \delta(u) d u \\\\
&amp;=\sum_{i} \int_{g\left(x_{i}+\varepsilon_{i}\right)}^{g\left(x_{i}-\varepsilon_{i}\right)}-\frac{f\left(x_{i}\right)}{g^{\prime}\left(x_{i}\right)} \delta(u) d u \\\\
&amp;=\sum_{i} \int_{g\left(x_{i}+\varepsilon_{i}\right)}^{g\left(x_{i}-\varepsilon_{i}\right)} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|} \delta(u) d u \\\\
&amp;=\sum_{i} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|} \underbrace{\int_{g\left(x_{i}+\varepsilon_{i}\right)}^{g\left(x_{i}-\varepsilon_{i}\right)} \delta(u) d u}_{=1} \\\\
&amp;=\sum_{i} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|}
\end{aligned}
$$
&lt;p>Therefore&lt;/p>
$$
\int_{-\infty}^{\infty} f(x) \delta(g(x)) \mathrm{d} x=\sum_{i=1}^{n} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|} \qquad (\square)
$$
&lt;/div>
&lt;/details>
&lt;p>Ref: &lt;a href="https://math.stackexchange.com/questions/276583/dirac-delta-function-of-a-function">Dirac Delta Function of a Function&lt;/a>&lt;/p>
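&lt;p>A short worked example may make the formula more concrete. The choice $g(x) = x^2 - a^2$ with $a > 0$ is only an illustration and is not taken from the notes above. Its zeros are $x_{1,2} = \pm a$, and $g^\prime(x) = 2x$, so $\left|g^\prime(\pm a)\right| = 2a \neq 0$:&lt;/p>
$$
\int_{-\infty}^{\infty} f(x) \delta\left(x^{2}-a^{2}\right) \mathrm{d} x=\frac{f(a)}{2 a}+\frac{f(-a)}{2 a}=\frac{f(a)+f(-a)}{2 a}
$$
&lt;p>For $f(x) = x^2$, for instance, this gives $\frac{a^2 + a^2}{2a} = a$.&lt;/p>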
&lt;h2 id="reference">Reference&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://de.universaldenker.org/lektionen/235">Dirac&amp;rsquo;sche Delta-Funktion und ihre Eigenschaften&lt;/a> 👍👍👍&lt;/li>
&lt;/ul></description></item><item><title>Router</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/02-router/</link><pubDate>Mon, 01 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/02-router/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Router%20%281%29.png"
alt="Schematic view and generic architecture of router">&lt;figcaption>
&lt;p>Schematic view and generic architecture of router&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h2 id="basic-functionalities">Basic Functionalities&lt;/h2>
&lt;h3 id="intermediate-systems">Intermediate Systems&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Forward data from input port(s) to output port(s)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Forwarding is a task of the &lt;strong>data path&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2011.31.22.png" alt="截屏2021-03-01 11.31.22">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>May operate on different layers&lt;/p>
&lt;ul>
&lt;li>Hubs operate on layer 1&lt;/li>
&lt;li>Bridges operate on layer 2&lt;/li>
&lt;li>&lt;strong>Routers operate on layer 3&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="routing">Routing&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Determines the path that the packets follow&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Routing is part of the control path
$\rightarrow$ Requires &lt;strong>routing algorithms&lt;/strong> and &lt;strong>routing protocols&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2011.34.13.png" alt="截屏2021-03-01 11.34.13" style="zoom:80%;" />
&lt;h3 id="forwarding-within-a-router">Forwarding within a Router&lt;/h3>
&lt;p>&lt;strong>Main task&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Lookup in forwarding table&lt;/li>
&lt;li>Forward data from input port to output port(s)&lt;/li>
&lt;/ul>
&lt;p>🎯 &lt;strong>Goals&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Forwarding at line &lt;strong>speed&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Short&lt;/strong> queues&lt;/li>
&lt;li>&lt;strong>Small&lt;/strong> tables&lt;/li>
&lt;/ul>
&lt;p>Schematic View of an IP-Router:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2011.47.21.png" alt="截屏2021-03-01 11.47.21" style="zoom:80%;" />
&lt;h3 id="forwarding-functionality">Forwarding Functionality&lt;/h3>
&lt;p>&lt;strong>Basic functions&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Check the headers of an IP packet
&lt;ul>
&lt;li>Version number&lt;/li>
&lt;li>Valid header length&lt;/li>
&lt;li>Checksum&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Check time to live
&lt;ul>
&lt;li>Decrement of TTL field&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Recalculate checksum&lt;/li>
&lt;li>Lookup
&lt;ul>
&lt;li>Determine output port for a packet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Fragmentation&lt;/li>
&lt;li>Handle IP options&lt;/li>
&lt;/ul>
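&lt;p>The basic functions above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: it handles a 20-byte IPv4 header without options, and the names &lt;code>ipv4_checksum&lt;/code> and &lt;code>forward_step&lt;/code> are hypothetical helpers, not part of any router software. Lookup, fragmentation and option handling are left out.&lt;/p>

```python
import struct
from typing import Optional


def ipv4_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total > 0xFFFF:
        # Fold the carry bits back into the low 16 bits.
        total = (total % 0x10000) + (total // 0x10000)
    return 0xFFFF - total


def forward_step(header: bytes) -> Optional[bytes]:
    """Check a 20-byte IPv4 header, decrement TTL, recalculate the checksum.

    Returns the updated header, or None if the packet has to be dropped.
    Assumption: no IP options, so the header length field must be 5 words.
    """
    if len(header) != 20:
        return None
    version, header_words = header[0] // 16, header[0] % 16
    if version != 4 or header_words != 5:
        return None  # wrong version number or unsupported header length
    if ipv4_checksum(header) != 0:
        return None  # a valid header sums to zero once its checksum is included
    ttl = header[8]
    if not ttl > 1:
        return None  # time to live expired
    updated = bytearray(header)
    updated[8] = ttl - 1
    updated[10:12] = b"\x00\x00"  # clear the old checksum before recomputing
    struct.pack_into("!H", updated, 10, ipv4_checksum(bytes(updated)))
    return bytes(updated)
```

&lt;p>Recomputing the checksum over the whole header keeps the sketch simple. Because only the TTL changes, an incremental update of the checksum is also possible.&lt;/p>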
&lt;p>&lt;strong>Possibly: differentiated treatment of packets&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Classification&lt;/li>
&lt;li>Prioritization&lt;/li>
&lt;/ul>
&lt;h2 id="challenge-line-speed">Challenge: Line Speed&lt;/h2>
&lt;ul>
&lt;li>Bandwidth demand increases&lt;/li>
&lt;li>Link capacity has to increase as well to keep up&lt;/li>
&lt;/ul>
&lt;h3 id="types-of-routers">Types of Routers&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Core router&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Used by &lt;strong>service providers&lt;/strong>&lt;/li>
&lt;li>Need to handle large amounts of aggregated traffic&lt;/li>
&lt;li>High speed and reliability essential
&lt;ul>
&lt;li>
&lt;p>Fast lookup and forwarding needed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Redundancy to increase reliability (dual power supply &amp;hellip;)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Cost secondary issue&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Enterprise router&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Connect end systems in &lt;strong>companies, universities&lt;/strong> &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provide connectivity to large number of end systems&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Support of VLANs, firewalls &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Low cost per port, large number of ports, ease of maintenance&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Edge router (access router)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>At edge of service provider&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provide connectivity to customer from &lt;strong>home, small businesses&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Support for PPTP, IPsec, VPNs &amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="forwarding-table-lookup">Forwarding Table Lookup&lt;/h2>
&lt;p>Example of a forwarding table&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2012.18.53.png" alt="截屏2021-03-01 12.18.53" style="zoom:80%;" />
&lt;p>&lt;strong>Prefix&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Identifies a block of addresses&lt;/li>
&lt;li>Continuous blocks of addresses per output port are beneficial
&lt;ul>
&lt;li>Does not require a separate entry for each IP address $\rightarrow$ &lt;strong>Scalability&lt;/strong> 👏&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Longest Prefix Matching&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Consider a typical problem: What to do if there are multiple prefixes in the forwarding table that match on a given destination address?&lt;/li>
&lt;li>🔧 Solution: Select &lt;em>most specific&lt;/em> prefix
&lt;ul>
&lt;li>&lt;strong>most specific prefix = the longest prefix&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Example&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2012.22.49.png" alt="截屏2021-03-01 12.22.49">&lt;/p>
&lt;p>&lt;strong>Efficient Prefix Search&lt;/strong>&lt;/p>
&lt;p>Different approaches for fast prefix search (in software)&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#binary-trie">Binary trie&lt;/a>&lt;/li>
&lt;li>&lt;a href="#path-compression">Path-compressed trie&lt;/a>&lt;/li>
&lt;li>&lt;a href="#multibit-trie">Multibit-Tries&lt;/a>&lt;/li>
&lt;li>&lt;a href="#hash-tables">Hash tables&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="efficient-data-structures">Efficient data structures&lt;/h3>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>Fast lookup&lt;/li>
&lt;li>Low memory&lt;/li>
&lt;li>Fast updates&lt;/li>
&lt;/ul>
&lt;h4 id="naive-approach-simple-array">Naïve approach: Simple Array&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Variables&lt;/p>
&lt;ul>
&lt;li>$N$ = number of prefixes&lt;/li>
&lt;li>$W$ = length of a prefix (e.g., $W=32$ for full IPv4 addresses)&lt;/li>
&lt;li>$k$ = length of a stride (only for multibit tries)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How does it work?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Store prefixes in a simple array (unordered)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Linear search&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Remember best match while walking through array&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Evaluation&lt;/p>
&lt;ul>
&lt;li>Worst case lookup speed: $O(N)$ $\rightarrow$ pretty bad 🤪&lt;/li>
&lt;li>Memory requirement: $O(N \cdot W)$ $\rightarrow$ pretty bad 🤪&lt;/li>
&lt;li>Updates: $O(1)$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
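&lt;p>As a rough sketch, the naïve approach could look as follows in Python. The table contents and the function name &lt;code>linear_lpm&lt;/code> are made up for illustration. The standard-library &lt;code>ipaddress&lt;/code> module does the prefix matching.&lt;/p>

```python
import ipaddress


def linear_lpm(table, destination):
    """Scan every (prefix, port) entry and remember the longest match so far."""
    address = ipaddress.ip_address(destination)
    best_len, best_port = -1, None
    for prefix, port in table:
        network = ipaddress.ip_network(prefix)
        if address in network and network.prefixlen > best_len:
            best_len, best_port = network.prefixlen, port
    return best_port


# Hypothetical, unordered forwarding table: (prefix, output port).
table = [("0.0.0.0/0", 0), ("109.21.0.0/16", 2), ("109.0.0.0/8", 1)]
```

&lt;p>Every lookup touches all $N$ entries, which is where the $O(N)$ worst case comes from. Appending a new entry to the unordered list is $O(1)$.&lt;/p>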
&lt;h4 id="binary-trie">Binary Trie&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Tries&lt;/strong> $\rightarrow$ tree-based data structures to store and search prefix information&lt;/p>
&lt;ul>
&lt;li>From „re&lt;strong>trie&lt;/strong>val“ (find something)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 &lt;strong>Idea: Bits in the prefix tell the algorithm which branch to take&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2012.58.32.png" alt="截屏2021-03-01 12.58.32" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>Evaluation&lt;/p>
&lt;ul>
&lt;li>Worst case lookup speed: $O(W)$
&lt;ul>
&lt;li>Maximum of one node per bit in the prefix&lt;/li>
&lt;li>But much better than naïve approach ($W \ll N$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Memory requirement: $O(N \cdot W)$
&lt;ul>
&lt;li>Assumption: prefixes stored as linked list starting from root node&lt;/li>
&lt;li>Every prefix (out of $N$) can have up to $W$ nodes $\rightarrow$ Maximum of $N \cdot W$ entries&lt;/li>
&lt;li>No improvement (compared with naïve approach) 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Updates: $O(W)$
&lt;ul>
&lt;li>A maximum of $W$ nodes has to be inserted or deleted (similar to lookup procedure)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Performance&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Can find prefix in $W$ steps $\rightarrow$ address space = $2^W$&lt;/p>
&lt;ul>
&lt;li>$W = $ number of bits in address ($W = 32$ for IPv4, $W = 128$ for IPv6)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Assumption: separate memory access required for each step&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Memory access time $t_{\text{access}} = 10\,\text{ns} = 10^{-8}\,\text{s}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Maximum lookups $L$ per second:
&lt;/p>
$$
t_{\text{lookup}} = 32 \cdot t_{\text{access}} = 320\,\text{ns} \rightarrow L = \frac{1}{t_{\text{lookup}}} = 3{,}125{,}000 \text{ lookups/s}
$$
&lt;p>
With 100-byte packets (800 bits each), this corresponds to only $2.5$ Gbit/s&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;details>
&lt;summary>Construct binary trie&lt;/summary>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Binary_Trie-no_path_compression.png" alt="Binary_Trie-no_path_compression">&lt;/p>
&lt;/details>
&lt;/li>
&lt;li>
&lt;p>Optimization&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#path-compression">Path compression&lt;/a>&lt;/li>
&lt;li>&lt;a href="#multibit-trie">Multibit-Tries&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
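&lt;p>A binary trie with longest prefix matching can be sketched as below. Prefixes and addresses are given as strings of &lt;code>0&lt;/code> and &lt;code>1&lt;/code> to keep the example short. The class and function names are assumptions made for this sketch.&lt;/p>

```python
class TrieNode:
    def __init__(self):
        self.children = [None, None]  # index 0: next bit is 0, index 1: next bit is 1
        self.port = None              # set only if a prefix ends at this node


def insert(root, prefix_bits, port):
    """Walk or create one node per prefix bit, then store the output port."""
    node = root
    for bit in prefix_bits:
        index = int(bit)
        if node.children[index] is None:
            node.children[index] = TrieNode()
        node = node.children[index]
    node.port = port


def lookup(root, address_bits):
    """Follow the address bits and remember the last prefix seen on the way."""
    node, best = root, root.port
    for bit in address_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        if node.port is not None:
            best = node.port
    return best
```

&lt;p>Both functions visit at most one node per bit, which matches the $O(W)$ bounds for lookup and update given above.&lt;/p>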
&lt;h4 id="path-compression">Path Compression&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Long sequences of one-child nodes waste memory&lt;/p>
&lt;ul>
&lt;li>
&lt;p>E.g., the highlighted (red) search paths in the following trie are not required for any branching decision&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2022.37.07.png" alt="截屏2021-03-01 22.37.07">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>💡 Idea: Eliminate those sequences from trie&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Lookup operation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Additional information required&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Store bit index that has to be examined next&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2022.54.13.png" alt="截屏2021-03-01 22.54.13" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2022.54.43.png" alt="截屏2021-03-01 22.54.43" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Evaluation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Worst case lookup speed: $O(W)$&lt;/p>
&lt;blockquote>
&lt;p>If there are no one-child nodes on a path, number of nodes to search is equal to length of prefix&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Memory requirement: $O(N)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Maximum of $N$ leaf nodes, $N-1$ for the internal nodes&lt;/p>
&lt;p>$\rightarrow$ Maximum of $2N-1$ entries&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Improvement over the binary trie &amp;#x1f44f;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Updates: $O(W)$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;details>
&lt;summary>Construct binary trie with path compression&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Binary_trie_with_compresssion.png" alt="Binary_trie_with_compresssion" style="zoom:80%;" />
&lt;/details>
&lt;/li>
&lt;/ul>
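&lt;p>The memory saving can be illustrated by counting nodes with and without path compression. The sketch below does not implement a compressed lookup. It only builds a plain binary trie (as nested dictionaries) and counts how many nodes would survive if every one-child node that stores no prefix were removed. The prefixes are hypothetical, and the root is always kept in this sketch.&lt;/p>

```python
def build_trie(prefixes):
    """Binary trie as nested dicts; the key 'prefix' marks a stored prefix."""
    root = {}
    for prefix in prefixes:
        node = root
        for bit in prefix:
            node = node.setdefault(bit, {})
        node["prefix"] = prefix
    return root


def count_nodes(node, compressed, is_root=True):
    """Count trie nodes, optionally skipping nodes that compression removes."""
    children = [node[bit] for bit in "01" if bit in node]
    # A one-child node without a prefix takes no branching decision.
    removable = (
        compressed
        and not is_root
        and "prefix" not in node
        and len(children) == 1
    )
    own = 0 if removable else 1
    return own + sum(count_nodes(child, compressed, False) for child in children)


trie = build_trie(["0000", "0001", "1111"])
plain = count_nodes(trie, compressed=False)
packed = count_nodes(trie, compressed=True)
```

&lt;p>For these three prefixes the plain trie has 10 nodes and the compressed one has 5, which equals $2N-1$ for $N=3$.&lt;/p>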
&lt;h4 id="multibit-trie">Multibit Trie&lt;/h4>
&lt;details>
&lt;summary>Example: Homework 03&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2012.03.23.png" alt="截屏2021-03-05 12.03.23" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2012.03.52.png" alt="截屏2021-03-05 12.03.52" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2012.04.11.png" alt="截屏2021-03-05 12.04.11" style="zoom:67%;" />
&lt;/details>
&lt;h3 id="hash-tables">Hash Tables&lt;/h3>
&lt;ul>
&lt;li>🎯 Objectives
&lt;ul>
&lt;li>
&lt;p>Improve lookup speed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hash tables can perform lookup in $O(1)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>However: a hash table alone cannot perform longest prefix matching 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Instead: &lt;strong>use an &lt;em>additional&lt;/em> hash table&lt;/strong>
&lt;ul>
&lt;li>Stores results of trie lookups
&lt;ul>
&lt;li>E.g., destination IP address 109.21.33.9 $\rightarrow$ output port 2&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Significant improvement for large forwarding tables 👏&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>For each received IP packet
&lt;ul>
&lt;li>Does an entry for destination IP address exist in hash table?
&lt;ul>
&lt;li>Yes $\rightarrow$ no trie lookup&lt;/li>
&lt;li>No $\rightarrow$ trie lookup&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Works well if addresses show &lt;strong>„locality“&lt;/strong> characteristics
&lt;ul>
&lt;li>I.e., most IP packets are covered by a small set of prefixes&lt;/li>
&lt;li>Not applicable in the Internet backbone&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
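&lt;p>The idea of an additional hash table in front of the trie can be sketched with a Python dictionary. The class &lt;code>CachedLookup&lt;/code> and the stand-in function &lt;code>prefix_lookup&lt;/code> are assumptions for this example. In a real router the slow path would be the trie lookup.&lt;/p>

```python
import ipaddress


def prefix_lookup(destination):
    """Stand-in for the trie: one hypothetical prefix plus a default route."""
    network = ipaddress.ip_network("109.21.0.0/16")
    return 2 if ipaddress.ip_address(destination) in network else 0


class CachedLookup:
    def __init__(self, slow_lookup):
        self.slow_lookup = slow_lookup  # e.g. a trie-based longest prefix match
        self.cache = {}                 # destination address -> output port
        self.misses = 0

    def lookup(self, destination):
        if destination in self.cache:
            return self.cache[destination]    # hit: no trie lookup
        self.misses += 1
        port = self.slow_lookup(destination)  # miss: fall back to the trie
        self.cache[destination] = port
        return port
```

&lt;p>The sketch ignores two things a practical cache would need: a bound on the number of entries, and invalidation when the forwarding table changes.&lt;/p>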
&lt;h3 id="comparsion-between-binary-trie-path-compression-and-multibit-trie">Comparison between Binary Trie, Path Compression, and Multibit Trie&lt;/h3>
&lt;ul>
&lt;li>$N$ = number of prefixes&lt;/li>
&lt;li>$W$ = length of a prefix (e.g., $W=32$ for full IPv4 addresses)
&lt;ul>
&lt;li>$N \gg W$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$k$ = length of a stride (only for multibit tries)&lt;/li>
&lt;/ul>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>&lt;/th>
&lt;th>Lookup Speed&lt;/th>
&lt;th>Memory Requirement&lt;/th>
&lt;th>Update&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Binary trie&lt;/td>
&lt;td>$O(W)$&lt;/td>
&lt;td>$O(NW)$&lt;/td>
&lt;td>$O(W)$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Path compression&lt;/td>
&lt;td>$O(W)$&lt;/td>
&lt;td>$O(N)$&lt;/td>
&lt;td>$O(W)$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Multibit trie&lt;/td>
&lt;td>$O(W/k)$&lt;/td>
&lt;td>$O(2^k \cdot N \cdot W/k)$&lt;/td>
&lt;td>$O(W/k + 2^k)$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="longest-prefix-matching-in-hardware">Longest Prefix Matching in Hardware&lt;/h3>
&lt;h4 id="ram-based-access">RAM-based Access&lt;/h4>
&lt;ul>
&lt;li>💡Basic idea
&lt;ul>
&lt;li>Read information with a single memory access&lt;/li>
&lt;li>Use destination IP address as RAM address&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>🔴 Problem
&lt;ul>
&lt;li>Independent of number of prefixes in use
&lt;ul>
&lt;li>IPv4 addresses (32 bit) $\rightarrow$ $2^{32}$ entries, i.e. 4 GByte at one byte per entry&lt;/li>
&lt;li>IPv6 addresses (128 bit) $\rightarrow$ $2^{128}$ entries, i.e. roughly $3.4 \times 10^{29}$ GByte at one byte per entry&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Waste of memory
&lt;ul>
&lt;li>Required memory size grows &lt;em>exponentially&lt;/em> with size of address!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
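&lt;p>The memory figures above can be checked with a few lines of arithmetic. The sketch assumes one byte per table entry, which is the assumption that reproduces the 4 GByte figure for IPv4. The function name is hypothetical.&lt;/p>

```python
def direct_table_bytes(address_bits, bytes_per_entry=1):
    """Size of a table that has one entry for every possible address."""
    return 2 ** address_bits * bytes_per_entry


ipv4_gib = direct_table_bytes(32) / 2 ** 30   # IPv4 table size in units of 2^30 bytes
ipv6_gb = direct_table_bytes(128) / 10 ** 9   # IPv6 table size in units of 10^9 bytes
```

&lt;p>Each additional address bit doubles the table, regardless of how many prefixes are actually in use.&lt;/p>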
&lt;h4 id="content-addressable-memory-cam">Content-Addressable Memory (CAM)&lt;/h4>
&lt;p>&lt;strong>CAM&lt;/strong>: takes data and returns address (opposite to RAM)&lt;/p>
&lt;ul>
&lt;li>CAM can search all stored entries &lt;strong>in a single clock cycle&lt;/strong> (very fast!)&lt;/li>
&lt;li>Application for networking: use addresses as search input to perform very fast address lookups (IP $\rightarrow$ output port)&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.53.50.png" alt="截屏2021-03-01 23.53.50">&lt;/p>
&lt;p>&lt;strong>Structure of CAM&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-01%2023.54.40.png" alt="截屏2021-03-01 23.54.40" style="zoom: 67%;" />
&lt;details>
&lt;summary>&lt;b>How does CAM work?&lt;/b>&lt;/summary>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.56.34.png" alt="截屏2021-03-01 23.56.34">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.56.44.png" alt="截屏2021-03-01 23.56.44">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.56.50.png" alt="截屏2021-03-01 23.56.50">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.57.03.png" alt=" ">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.57.18.png" alt="截屏2021-03-01 23.57.18">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.59.04.png" alt="截屏2021-03-01 23.59.04">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-01%2023.58.00.png" alt="截屏2021-03-01 23.58.00">&lt;/p>
&lt;/details>
&lt;p>&lt;strong>Example&lt;/strong>&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/cam-architecture.png">&lt;figcaption>
&lt;h4>Source: &lt;a href="https://www.pagiamtzis.com/cam/camintro/">Content-Addressable Memory Introduction&lt;/a>&lt;/h4>
&lt;/figcaption>
&lt;/figure>
&lt;h4 id="ternary-cam-tcam">Ternary CAM (TCAM)&lt;/h4>
&lt;ul>
&lt;li>An extension of CAM that supports a &lt;strong>&amp;ldquo;don&amp;rsquo;t care&amp;rdquo; state X&lt;/strong> (matching both a 0 and a 1 in that position)
&lt;ul>
&lt;li>Allows longest prefix matching&lt;/li>
&lt;li>Prefixes are stored in the CAM &lt;strong>sorted by prefix length&lt;/strong> (from long to short)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-02%2021.57.54.png" alt="截屏2021-03-02 21.57.54" style="zoom:80%;" />
&lt;ul>
&lt;li>👍 Advantage: Very fast lookups (1 clock cycle)&lt;/li>
&lt;li>🔴 Problems: Severe scalability limitations
&lt;ul>
&lt;li>High energy demand
&lt;ul>
&lt;li>All search words are looked up in parallel&lt;/li>
&lt;li>Every core cell is required for every lookup&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>High cost / low density
&lt;ul>
&lt;li>TCAM requires 2-3 times the transistors compared to SRAM&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Longest matching prefix requires strict ordering of prefixes in the TCAM
&lt;ul>
&lt;li>New entries can require the TCAM to be „re-ordered“
$\rightarrow$ This can take a significant amount of time!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
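&lt;p>The matching rule of a TCAM can be imitated in software as follows. This is only a behavioural sketch: a real TCAM compares all entries in parallel in hardware, whereas the loop below walks through them one by one. Prefixes are bit strings, &lt;code>x&lt;/code> stands for the don&amp;rsquo;t care state, and all names are assumptions made for this example.&lt;/p>

```python
def matches(pattern, key):
    """A ternary pattern matches if every non-'x' position equals the key bit."""
    return all(p in ("x", k) for p, k in zip(pattern, key))


def to_entries(prefixes, width):
    """Pad each prefix with don't-care symbols and sort from long to short."""
    ordered = sorted(prefixes.items(), key=lambda item: len(item[0]), reverse=True)
    return [(prefix + "x" * (width - len(prefix)), port) for prefix, port in ordered]


def tcam_lookup(entries, key):
    """Return the port of the first matching entry.

    Because the entries are sorted by prefix length, the first match
    is the longest matching prefix.
    """
    for pattern, port in entries:
        if matches(pattern, key):
            return port
    return None


# Hypothetical 8-bit example: prefix -> output port.
entries = to_entries({"10": 1, "1011": 2, "": 0}, width=8)
```

&lt;p>The sort in &lt;code>to_entries&lt;/code> corresponds to the strict ordering requirement mentioned above: inserting a new prefix may shift many existing entries.&lt;/p>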
&lt;details>
&lt;summary>&lt;b>Example: Homework 04&lt;/b>&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2019.42.57.png" alt="截屏2021-03-05 19.42.57" style="zoom:67%;" />
&lt;p>&lt;strong>💡 Idea:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Sort prefixes by length (longest to shortest)&lt;/strong>&lt;/li>
&lt;li>&lt;strong>CAM part: (prefix, index) pair&lt;/strong>&lt;/li>
&lt;li>&lt;strong>RAM part: (index, egress port) pair&lt;/strong>&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2019.43.53.png" alt="截屏2021-03-05 19.43.53" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2019.44.17.png" alt="截屏2021-03-05 19.44.17" style="zoom:67%;" />
&lt;/details>
&lt;h2 id="router-architecture">Router Architecture&lt;/h2>
&lt;p>Basic components&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Network interfaces&lt;/strong>
&lt;ul>
&lt;li>Realize access to one of the attached networks&lt;/li>
&lt;li>Functionalities of layers 1 and 2&lt;/li>
&lt;li>Basic functions of IP
&lt;ul>
&lt;li>Including forwarding table lookup&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Routing processor&lt;/strong>
&lt;ul>
&lt;li>Routing protocol&lt;/li>
&lt;li>Management functionality&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Switch fabric&lt;/strong>
&lt;ul>
&lt;li>„Backplane“&lt;/li>
&lt;li>Realizes internal forwarding of packets from the input to the output port&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="generic-router-architecture">Generic Router Architecture&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-02%2022.03.55.png" alt="截屏2021-03-02 22.03.55" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Conflicting design goals&lt;/p>
&lt;ul>
&lt;li>&lt;strong>High efficiency&lt;/strong>
&lt;ul>
&lt;li>Line speed&lt;/li>
&lt;li>Low delay&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Vs. &lt;strong>low cost&lt;/strong>
&lt;ul>
&lt;li>Type and amount of required storage&lt;/li>
&lt;li>Type of switch fabric&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Blocking&lt;/p>
&lt;ul>
&lt;li>
&lt;p>E.g., packets arriving at the same time at different input ports that need the same output port&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Measures that can help prevent blocking&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Overprovisioning&lt;/strong>&lt;/p>
&lt;p>Internal circuits in switch fabric operate at a &lt;em>higher&lt;/em> speed than the individual input ports&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Buffering&lt;/strong>&lt;/p>
&lt;p>Queue packets at appropriate locations until resources are available&lt;/p>
&lt;ul>
&lt;li>
&lt;p>At network interfaces&lt;/p>
&lt;/li>
&lt;li>
&lt;p>In switch fabric&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Backpressure&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Signal the overload back towards the input ports&lt;/li>
&lt;li>Input ports can then reduce load&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Parallel switch fabrics&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Allows parallel transport of multiple packets to output ports&lt;/li>
&lt;li>Requires higher access speed at output ports&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="buffers">Buffers&lt;/h3>
&lt;p>Problem: Simultaneous arrival of multiple packets for an output port&lt;/p>
&lt;ul>
&lt;li>Sequential processing required, since packets cannot be sent in parallel&lt;/li>
&lt;li>Packets have to be buffered&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-02%2022.15.11.png" alt="截屏2021-03-02 22.15.11">&lt;/p>
&lt;ul>
&lt;li>Packets arrive at input ports E1 and E2 at the same time, both must be forwarded to output A1&lt;/li>
&lt;li>One out of the two packets requires buffering&lt;/li>
&lt;/ul>
&lt;p>Where to place the memory elements for buffering?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#input-buffer">Input buffer&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#output-buffer">Output buffer&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#distributed-buffer">Distributed buffer&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#central-buffer">Central buffer&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="evaluation-of-alternatives">Evaluation of Alternatives&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Parameters of switch fabric&lt;/p>
&lt;ul>
&lt;li>$N$: Number of input and output ports&lt;/li>
&lt;li>$M$: Total storage capacity&lt;/li>
&lt;li>$S$: Speedup factor of the switch fabric
&lt;ul>
&lt;li>Relative to the speed of the input and output ports&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$Z$: Cycle time of memory accesses
&lt;ul>
&lt;li>Relative to the transmission time of a packet at input and output ports&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Delay and jitter (= variance of the delay)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Important&lt;/p>
&lt;ul>
&lt;li>Additional mechanisms are required, e.g. flow control&lt;/li>
&lt;li>Organization of memories, e.g. FIFO or RAM&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In the following: simplifying assumptions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>All ports operate at same data rate&lt;/p>
&lt;/li>
&lt;li>
&lt;p>All packets have same length&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="input-buffer">Input buffer&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: conflict resolution &lt;strong>at input of switch fabric&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-02%2022.30.38.png" alt="截屏2021-03-02 22.30.38">&lt;/p>
&lt;ul>
&lt;li>FIFO buffer per input port&lt;/li>
&lt;li>Scheduling of inputs, e.g.
&lt;ul>
&lt;li>Round robin, priority controlled, depending on buffer levels, &amp;hellip;&lt;/li>
&lt;li>Jitter varies&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Switch fabric internally non-blocking, i.e., no internal conflicts 👏&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>Internal exchange with speed of connections ($S=1$)&lt;/li>
&lt;li>Cycle time $Z = \frac{1}{2}$ (One packet in, one packet out)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>🔴 Problem: &lt;strong>Head-of-Line blocking&lt;/strong>&lt;/p>
&lt;p>Waiting packet at head of the buffer blocks packet behind it that could be serviced&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Input_Buffer.png">&lt;figcaption>
&lt;h4>Suppose that in the buffer of $I1$, the 1st packet is destined for $O1$ and the 2nd packet for $O2$. If the 1st packet is currently blocked, the 2nd packet cannot be processed either, although $O2$ is not occupied. In other words, the 1st packet &lt;strong>blocks&lt;/strong> the 2nd packet.&lt;/h4>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;li>
&lt;p>Maximum throughput is 75% for $N = 2$ and approaches $2 - \sqrt{2} \approx 58.58\%$ for $N \to \infty$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
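&lt;p>The throughput limit caused by Head-of-Line blocking can be estimated with a small simulation. The model below is a sketch under common textbook assumptions that the notes do not spell out: every input always has packets waiting, destinations are uniformly random, and each output serves one randomly chosen contender per time slot. The function name and parameters are hypothetical.&lt;/p>

```python
import random


def hol_throughput(ports, slots, seed=0):
    """Estimate the saturation throughput of a FIFO input-buffered switch."""
    rng = random.Random(seed)
    # Destination of the head-of-line packet at each (always backlogged) input.
    heads = [rng.randrange(ports) for _ in range(ports)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(heads):
            contenders.setdefault(out, []).append(inp)
        for inputs in contenders.values():
            winner = rng.choice(inputs)           # one packet per output and slot
            heads[winner] = rng.randrange(ports)  # next packet moves to the head
            delivered += 1
        # Losing inputs keep their head packet, which blocks everything behind it.
    return delivered / (ports * slots)
```

&lt;p>With &lt;code>ports=2&lt;/code> and a few thousand slots the estimate should come out close to 0.75, and it decreases towards the limit quoted above as the number of ports grows.&lt;/p>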
&lt;h4 id="output-buffer">Output buffer&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: conflict resolution &lt;strong>at output of switch fabric&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-02%2023.22.21.png" alt="截屏2021-03-02 23.22.21">&lt;/p>
&lt;ul>
&lt;li>FIFO buffer per output port&lt;/li>
&lt;li>Switch fabric internally non-blocking, i.e., no internal conflicts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Internal switching of packets at $N$ times the speed of the input ports:
&lt;/p>
$$
S = N
$$
&lt;blockquote>
&lt;p>Switch fabric internally non-blocking&lt;/p>
&lt;p>$\rightarrow$ $N$ inputs must be processed at the same time (simultaneously)&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Switching of $N$ packets during one cycle possible $\Rightarrow$
&lt;/p>
$$
Z = \frac{1}{N + 1}
$$
&lt;blockquote>
&lt;p>In worst case, a buffer must take $N$ packets in and send one packet out.&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Output buffer must be able to accept packets at $N$ times the speed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input buffer necessary to accept a packet&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>Maximum throughput near 100%, usually at approx. 80-85%&lt;/li>
&lt;li>Good behavior with respect to delay and jitter&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="distributed-buffer">Distributed buffer&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: conflict resolution &lt;strong>inside switch fabric&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-02%2023.35.03.png" alt="截屏2021-03-02 23.35.03">&lt;/p>
&lt;ul>
&lt;li>Switch fabric as matrix&lt;/li>
&lt;li>FIFO buffer per crosspoint&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Matrix structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Internal exchange with speed of connections: $S = 1$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cycle time: $Z = \frac{1}{2}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>No Head-of-Line blocking &amp;#x1f44f;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Higher memory requirement $M$ than input or output buffering 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="central-buffer">Central buffer&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: conflict resolution with &lt;strong>shared buffer&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>All input and output ports are connected to a shared buffer (organization: RAM)&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-02%2023.40.36.png" alt="截屏2021-03-02 23.40.36">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>Cycle time $Z = \frac{1}{2N}$&lt;/li>
&lt;li>Address and control memory
for address information of packets and control of parallel memory accesses&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Significantly lower memory requirements&lt;/p>
&lt;/li>
&lt;li>
&lt;p>But: requirements with respect to memory access time are higher 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
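&lt;p>The cycle time requirements of the four buffer placements can be put side by side in a small helper. The function name and the strategy labels are assumptions for this sketch. The values are the ones stated in the sections above, expressed as fractions of one packet transmission time.&lt;/p>

```python
from fractions import Fraction


def memory_cycle_time(strategy, ports):
    """Required memory cycle time Z for a switch with `ports` inputs and outputs."""
    accesses_per_packet_time = {
        "input": 2,            # one packet in, one packet out
        "output": ports + 1,   # up to N packets in, one packet out
        "distributed": 2,      # one packet in, one packet out per crosspoint
        "central": 2 * ports,  # N packets in and N packets out on shared memory
    }[strategy]
    return Fraction(1, accesses_per_packet_time)
```

&lt;p>For a 16-port switch, for example, the central buffer has to be accessed 32 times per packet time, compared with twice for an input buffer.&lt;/p>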
&lt;h4 id="buffer-placement-summary">Buffer placement summary&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-02%2023.42.42.png" alt="截屏2021-03-02 23.42.42" style="zoom:80%;" />
&lt;h3 id="switch-fabric">Switch fabric&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Four typical &lt;strong>basic structures&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Shared memory&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#bus-or-ring-structure">Bus / ring structure&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#crossbar">Crossbar&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#multi-level-switching-networks">Multi-level switching networks&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Evaluation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The internal blocking behavior (Blocking / non-blocking)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The presence of buffers (Buffered / unbuffered)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Topology and number of levels of the switching network and number of possible routes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The control principle for packet routing (Self-controlling / table-controlled)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The internal connection concept (Connection oriented / connectionless)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="bus-or-ring-structure">Bus or ring structure&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Conflict-free access through time-division multiplexing&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transmission capacity of the bus / ring&lt;/p>
&lt;ul>
&lt;li>At least the sum of the transmission capacities of all input ports&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-03%2012.54.06.png" alt="截屏2021-03-03 12.54.06">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Easy support for multicast and broadcast&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Spatial extension of a bus system is limited. Usually low number of connections (up to approx. 16)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="crossbar">Crossbar&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: Each input connected to each output via &lt;strong>crossbar&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-03%2012.55.14.png" alt="截屏2021-03-03 12.55.14">&lt;/p>
&lt;ul>
&lt;li>$N$ inputs, $N$ outputs $\Rightarrow$ $N^2$ crosspoints&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Partially parallel switching of packets is possible&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiple packets for the same output $\rightarrow$ Blocking $\to$ Buffering required&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High wiring costs with a large number of inputs and outputs&lt;/p>
&lt;ul>
&lt;li>Mostly limited to 2x2 or 16x16 matrices&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Especially efficient with packets of the same size&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
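&lt;p>The blocking behavior of the crossbar can be sketched as follows. This is a minimal illustration, not a real scheduler: it assumes that each input offers at most one packet per round and that the first request for an output wins, which is a policy chosen here only for the example. The function name is hypothetical.&lt;/p>

```python
def schedule_crossbar(requests):
    """Split (input_port, output_port) requests for one switching round.

    Only one packet can reach a given output at a time, so the first
    request per output is switched and every later one must be buffered.
    """
    switched, buffered = [], []
    busy_outputs = set()
    for inp, out in requests:
        if out in busy_outputs:
            buffered.append((inp, out))   # output already taken: blocking
        else:
            busy_outputs.add(out)
            switched.append((inp, out))   # can be switched in parallel
    return switched, buffered


requests = [(0, 2), (1, 3), (2, 2), (3, 0)]
switched, buffered = schedule_crossbar(requests)
print(switched)  # [(0, 2), (1, 3), (3, 0)]
print(buffered)  # [(2, 2)]
```

&lt;p>Three of the four packets pass the crossbar in parallel; the second packet for output 2 is blocked and has to wait in a buffer.&lt;/p>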
&lt;h4 id="multi-level-switching-networks">Multi-level Switching Networks&lt;/h4>
&lt;p>Multi-level switching networks are built from elementary switching matrices. Each elementary switching matrix has the following switching states:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-03%2013.04.35.png" alt="截屏2021-03-03 13.04.35">&lt;/p>
&lt;p>Several of these matrices are combined into a multi-level network, e.g.:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-03%2013.05.14.png" alt="截屏2021-03-03 13.05.14">&lt;/p>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Less wiring effort than crossbar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each input can be connected to each output&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Not all connections possible at the same time&lt;/p>
&lt;ul>
&lt;li>internal blocking possible&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="self-test">Self-test&lt;/h2>
&lt;details>
&lt;summary>What are important responsibilities of the network layer?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>Which basic operations are usually performed by an IP router in order to forward a packet to its destination?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>Why are high link-speeds such a big problem for modern forwarding hardware?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>How does longest prefix matching work in general?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>What are efficient (software) data structures for handling longest prefix matching and how do they work?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>In what way can hash tables support a trie-based address lookup?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>What is a TCAM?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>What are the main benefits and problems of the TCAM technology?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>What does the introduced generic router architecture look like?&lt;/summary>
&lt;/details>
&lt;details>
&lt;summary>Where can buffer elements be placed inside a switch? What are the associated benefits and drawbacks?&lt;/summary>
&lt;/details></description></item><item><title>Zufallsvariable</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/zufallsvariable/</link><pubDate>Sat, 04 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/zufallsvariable/</guid><description>&lt;h2 id="zufallsvariablen">Zufallsvariablen&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-yellow-100 dark:bg-yellow-900">
&lt;span class="pr-3 pt-1 text-red-400">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="M12 9v3.75m-9.303 3.376c-.866 1.5.217 3.374 1.948 3.374h14.71c1.73 0 2.813-1.874 1.948-3.374L13.949 3.378c-.866-1.5-3.032-1.5-3.898 0zM12 15.75h.007v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>On the SI exercise sheets, random variables are denoted by lowercase bold letters, e.g. $\boldsymbol{x}$. These notes write them as uppercase letters, e.g. $X$.&lt;/p>
&lt;p>This notation is not used in the handwritten lecture notes, so random variables and &amp;ldquo;ordinary&amp;rdquo; variables usually have to be told apart from the context. 🤪&lt;/p>
&lt;/span>
&lt;/div>
&lt;p>A &lt;mark>&lt;strong>random variable&lt;/strong>&lt;/mark> is a function that assigns exactly one number $x$ to each outcome $\omega$ of a random experiment.&lt;/p>
&lt;ul>
&lt;li>It maps the outcomes of a random experiment to real numbers&lt;/li>
&lt;li>It describes, so to speak, the result of a random experiment that has not been carried out yet&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>It is called a variable because the number obtained in the end is variable.&lt;/p>
&lt;/blockquote>
&lt;p>‼️&lt;strong>Important: distinguish between $X$ and $x$.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>$X$: the actual random variable, which has no fixed value. It represents the currently unknown result of a random experiment&lt;/li>
&lt;li>$x$: the result after the experiment, and therefore a concrete number.&lt;/li>
&lt;/ul>
&lt;p>Example: rolling two dice&lt;/p>
&lt;ul>
&lt;li>Random variable $X$ = sum of the pips&lt;/li>
&lt;li>$P(X = 6)$: &amp;ldquo;The probability that the sum of two dice is six&amp;rdquo; (here $x=6$)&lt;/li>
&lt;/ul>
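&lt;p>The two-dice example can be checked by enumerating all outcomes. This is a small sketch that assumes two fair, independent dice; the variable names are chosen only for illustration.&lt;/p>

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (omega) of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))

# The random variable X maps each outcome to the sum of the pips
favourable = [w for w in outcomes if sum(w) == 6]

p_six = Fraction(len(favourable), len(outcomes))
print(favourable)  # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(p_six)       # 5/36
```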
&lt;h3 id="diskrete-zufallsvariable">Discrete random variable&lt;/h3>
&lt;p>A random variable is called &lt;mark>&lt;strong>discrete&lt;/strong>&lt;/mark> if it takes only &lt;strong>finitely many&lt;/strong> or &lt;strong>countably&lt;/strong> infinitely many values.&lt;/p>
&lt;ul>
&lt;li>Scale types: nominal or ordinal scale&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&amp;ldquo;Countably infinite&amp;rdquo; means that the set of possible values can be enumerated.&lt;/span>
&lt;/div>
&lt;p>Example: the result of rolling a die is $x \in \Omega = \{1, 2, 3, 4, 5, 6\}$, so $|\Omega| = 6$.&lt;/p>
&lt;h4 id="wahrscheinlichkeitsfunktion">Probability mass function&lt;/h4>
&lt;p>For discrete random variables one uses the &lt;mark>&lt;strong>probability mass function (PMF)&lt;/strong>&lt;/mark>, which gives the probability of one specific outcome.&lt;/p>
$$
f(x): \Omega \rightarrow[0,1], x \in \mathbb{N}_{0}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.20.17.png" alt="截屏2022-05-31 22.20.17" style="zoom: 50%;" />
&lt;p>The function value&lt;/p>
$$
f(x) = P(X=x)
$$
&lt;p>is the probability that $X$ takes the value $x$. Therefore&lt;/p>
$$
\sum_{x \in \Omega} f(x)=1
$$
&lt;blockquote>
&lt;p>For the &amp;ldquo;density&amp;rdquo; of a discrete random variable whose individual probabilities $p_n = P(X = x_n)$ are given, one also writes
&lt;/p>
$$
f_{X}(x)=\sum_{n=1}^{\infty} \mathrm{P}\left(X=x_{n}\right) \delta\left(x-x_{n}\right)=\sum_{n=1}^{\infty} p_{n} \delta\left(x-x_{n}\right)
$$
&lt;ul>
&lt;li>$\delta(\cdot)$: &lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/dirac_funktion/">delta distribution&lt;/a>&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;h4 id="verteilungsfunktion">Distribution function&lt;/h4>
&lt;p>The &lt;mark>&lt;strong>distribution function (cumulative distribution function, CDF)&lt;/strong>&lt;/mark> gives the probability that the result of the random experiment is &lt;em>less than or equal to&lt;/em> a given value.&lt;/p>
&lt;ul>
&lt;li>To this end, the probabilities of all outcomes up to this value are aggregated, i.e. &amp;ldquo;added up&amp;rdquo;. This is why it is often called the &lt;strong>cumulative distribution function&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;p>To obtain the discrete distribution function, all probability values are accumulated step by step. This sum is the discrete counterpart of the integral under a density function.&lt;/p>
$$
F(x): \boldsymbol{\Omega} \rightarrow[\mathbf{0}, \mathbf{1}], x \in \mathbb{N}_{\mathbf{0}}
$$
$$
F(x)= P(X \leq x) = \sum_{x_{i} \leq x} f\left(x_{i}\right)
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.20.17.png" alt="截屏2022-05-31 22.20.17" style="zoom: 40%; float: left" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.43.01.png" alt="截屏2022-05-31 22.43.01" style="zoom:40%; float:right" />
&lt;p>Properties&lt;/p>
&lt;ul>
&lt;li>$\lim _{x \rightarrow-\infty} F_{X}(x)=0 ; \lim _{x \rightarrow \infty} F_{X}(x)=1$
&lt;/li>
&lt;li>$F_{X}(x)$ is monotonically non-decreasing and right-continuous&lt;/li>
&lt;/ul>
&lt;details class="spoiler " id="spoiler-9">
&lt;summary class="cursor-pointer">Example&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;p>Rolling a die:&lt;/p>
&lt;p>Probability mass function:&lt;/p>
$$
f(k) = P(X=k) = \frac{1}{6} \quad k \in \{1, 2, 3, 4, 5, 6\}
$$
&lt;p>Distribution function:&lt;/p>
$$
F(3) = P(X \leq 3) = \sum_{i\leq 3}f(i) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}
$$
&lt;/div>
&lt;/details>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">In the SI lecture and exercises, the distribution function of the random variable $X$ is written as $F_{X}(x)$.&lt;/span>
&lt;/div>
&lt;p>Difference between two cumulative probabilities:&lt;/p>
$$
F(b) - F(a) = P(X\leq b) - P(X \leq a) = P(a &lt; X \leq b)
$$
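&lt;p>The probability mass function and the distribution function of a fair die can be written down directly. The sketch below uses exact fractions; the names &lt;code>pmf&lt;/code> and &lt;code>cdf&lt;/code> are chosen only for this illustration.&lt;/p>

```python
from fractions import Fraction

# Probability mass function of a fair die: f(k) = P(X = k) = 1/6
pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def cdf(x):
    """Distribution function: add up f(k) for every value k not larger than x."""
    return sum((p for k, p in pmf.items() if x >= k), Fraction(0))


print(sum(pmf.values()))   # 1    (all probabilities add up to one)
print(cdf(3))              # 1/2  (= 1/6 + 1/6 + 1/6)
print(cdf(5) - cdf(2))     # 1/2  (probability of the values 3, 4 and 5)
```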
&lt;h3 id="stetige-zufallsvariable">Continuous random variable&lt;/h3>
&lt;p>A &lt;mark>&lt;strong>continuous&lt;/strong>&lt;/mark> random variable&lt;/p>
&lt;ul>
&lt;li>takes &lt;strong>uncountably&lt;/strong> many values, i.e. &lt;em>infinitely many values that cannot be enumerated&lt;/em>.&lt;/li>
&lt;li>mostly occurs in measurements (e.g. time, length or temperature)&lt;/li>
&lt;li>Scale types: interval or ratio scale&lt;/li>
&lt;/ul>
&lt;p>For a continuous random variable, probabilities can only be determined for &lt;strong>intervals&lt;/strong> and NOT for exact values.&lt;/p>
&lt;ul>
&lt;li>There are infinitely many possible values, so a single exact value has probability zero.&lt;/li>
&lt;li>e.g.
&lt;ul>
&lt;li>&amp;ldquo;What is the probability that a randomly chosen female student is between 165 cm and 170 cm tall?&amp;rdquo;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>In the continuous case, the &lt;strong>distribution function&lt;/strong> is used to compute probabilities.&lt;/li>
&lt;/ul>
&lt;h4 id="dichtefunktion">Density function&lt;/h4>
&lt;p>The &lt;mark>&lt;strong>probability density function (PDF)&lt;/strong>&lt;/mark>, or &lt;strong>density&lt;/strong> for short, describes how densely the values under consideration lie around an arbitrary point.&lt;/p>
$$
f(x): \mathbf{\Omega} \rightarrow \mathbb{R}^{+}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.24.46.png" alt="截屏2022-05-31 22.24.46" style="zoom:50%;" />
&lt;ul>
&lt;li>Properties of $f$:&lt;/li>
&lt;/ul>
$$
\begin{array}{l}
f \text{ is integrable}\\
f(x) \geq 0 \quad \forall x \in \mathbb{R} \\
\displaystyle \int_{-\infty}^{+\infty} f(x) \mathrm{d} x=1
\end{array}
$$
&lt;ul>
&lt;li>
&lt;p>Difference from the probability mass function&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The density function does not yield a probability, but ONLY the &amp;ldquo;probability density&amp;rdquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For a continuous random variable, which has uncountably many possible values, the probability of each specific value is 0
&lt;/p>
$$
P(X=x) = 0 \quad \forall x \in \mathbb{R}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>The probability that $X$ takes a value $x \in [a, b]$ equals the area $S$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.37.24.png" alt="截屏2022-05-31 22.37.24" style="zoom:50%;" />
$$
P(a \leq X \leq b)=\int_{a}^{b} f(x) \mathrm{d} x=S
$$
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">In the SI lecture and exercises, the density function of the random variable $X$ is written as $f_{X}(x)$.&lt;/span>
&lt;/div>
&lt;h4 id="verteilungsfunktion-1">Distribution function&lt;/h4>
$$
F(x): \Omega \rightarrow[0,1], x \in \mathbb{R}
$$
$$
F(x)=\int_{-\infty}^{x} f(t) \mathrm{d} t, \quad f(x)=\frac{\mathrm{d} F(x)}{\mathrm{d} x}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2022.24.46.png" alt="截屏2022-05-31 22.24.46" style="zoom:40%; float:left" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2023.01.08.png" alt="截屏2022-05-31 23.01.08" style="zoom:40%; float:right" />
&lt;p>The distribution function is the area under the density function up to a given point $c$:&lt;/p>
$$
F(c)=P(X \leq c)=\int_{-\infty}^{c} f(x) \mathrm{d} x
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2023.05.33.png" alt="截屏2022-05-31 23.05.33" style="zoom:50%;" />
&lt;p>The difference between two values of the distribution function is therefore:&lt;/p>
$$
F(b)-F(a)=P(a \leq X \leq b)=\int_{a}^{b} f(x) \mathrm{d} x
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-05-31%2023.07.26.png" alt="截屏2022-05-31 23.07.26" style="zoom:50%;" />
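&lt;p>The relation between the area under the density and the difference of two values of the distribution function can be checked numerically. The sketch below assumes an exponential distribution with a hypothetical rate parameter, chosen only because its density and distribution function have simple closed forms; the midpoint-rule integrator is likewise only an illustration.&lt;/p>

```python
import math

RATE = 2.0  # hypothetical parameter of an exponential distribution


def pdf(x):
    """Density f(x) of the exponential distribution."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """Distribution function F(x) of the exponential distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


a, b = 0.5, 1.5
area = integrate(pdf, a, b)                       # area S under the density
print(math.isclose(area, cdf(b) - cdf(a), rel_tol=1e-6))  # True
```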
&lt;h4 id="dichtefunktion-vs-verteilungsfunktion">Density function vs. distribution function&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>The density function describes how the probability is distributed over the individual values&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Distribution function&lt;/p>
&lt;ul>
&lt;li>Accumulates the probabilities $\rightarrow$ determines the probability of an interval&lt;/li>
&lt;li>Gives the probability that the result is $\leq$ a given value&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="diskrete-vs-stetige-zufallsvariable">Discrete vs. continuous random variable&lt;/h3>
&lt;style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
.tg .tg-7btt{border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
&lt;/style>
&lt;table class="tg">
&lt;thead>
&lt;tr>
&lt;th class="tg-c3ow">Random&lt;br>variable&lt;/th>
&lt;th class="tg-7btt">&lt;span style="font-style:normal">Discrete&lt;/span>&lt;/th>
&lt;th class="tg-7btt">&lt;span style="font-style:normal">Continuous&lt;/span>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tg-c3ow">&lt;span style="font-style:normal">Example&lt;/span>&lt;/td>
&lt;td class="tg-7btt">&lt;span style="font-weight:400;font-style:normal">Rolling a die&lt;/span>&lt;/td>
&lt;td class="tg-c3ow">Time&lt;br>Temperature&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-c3ow">Probability &lt;br>for&lt;/td>
&lt;td class="tg-c3ow">a specific point&lt;br>$P(X=x) \in [0, 1]$&lt;/td>
&lt;td class="tg-c3ow">ONLY an interval&lt;br>($P(X=x) = 0$)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-c3ow">Probability mass function /&lt;br>density function&lt;/td>
&lt;td class="tg-c3ow">&lt;span style="font-style:normal">Probability mass function&lt;/span>&lt;br>$f(x): \Omega \rightarrow[0,1], x \in \mathbb{N}_{0}$&lt;br>$f(x) = P(X=x)$&lt;br>$\sum_{x \in \Omega} f(x)=1$&lt;/td>
&lt;td class="tg-c3ow">Density function&lt;br>$f(x): \mathbf{\Omega} \rightarrow \mathbb{R}^{+}$&lt;br>$f$ is integrable&lt;br>$f(x) \geq 0 \quad \forall x \in \mathbb{R}$&lt;br>$\displaystyle \int_{-\infty}^{+\infty} f(x) \mathrm{d} x=1$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-c3ow">&lt;span style="font-style:normal">Distribution function&lt;/span>&lt;/td>
&lt;td class="tg-c3ow">$F(x): \boldsymbol{\Omega} \rightarrow[\mathbf{0}, \mathbf{1}], x \in \mathbb{N}_{\mathbf{0}}$&lt;br>$F(x)= P(X \leq x) = \sum_{x_{i} \leq x} f\left(x_{i}\right)$&lt;/td>
&lt;td class="tg-c3ow">$F(x): \Omega \rightarrow[0,1], x \in \mathbb{R}$&lt;br>$F(x)=\int_{-\infty}^{x} f(t) \mathrm{d} t, \quad f(x)=\frac{\mathrm{d} F(x)}{\mathrm{d} x}$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Note: for the &amp;ldquo;density&amp;rdquo; of a discrete random variable whose individual probabilities $p_n = P(X = x_n)$ are given, one also writes&lt;/p>
$$
f_{X}(x)=\sum_{n=1}^{\infty} \mathrm{P}\left(X=x_{n}\right) \delta\left(x-x_{n}\right)=\sum_{n=1}^{\infty} p_{n} \delta\left(x-x_{n}\right),
$$
&lt;p>where $\delta(\cdot)$ is the delta distribution. With this, the following relation holds for both continuous and discrete random variables:&lt;/p>
$$
\frac{\mathrm{d}}{\mathrm{d} x} F_{X}(x) = f_{X}(x).
$$
&lt;h2 id="kenntwerte-von-zufallsvariablen">Characteristic values of random variables&lt;/h2>
&lt;h3 id="erwartungswert">Expected value&lt;/h3>
&lt;p>&lt;mark>&lt;strong>Expected value&lt;/strong>&lt;/mark> (also &lt;strong>mean&lt;/strong>): the average obtained when an experiment is repeated infinitely often&lt;/p>
$$
E_{f_X}\{X\} = \hat{X} = \mu_{X} = \int_{-\infty}^{\infty} x f_{X}(x) d x
$$
&lt;ul>
&lt;li>Notation: $\mu$, $E(X)$, $E[X]$, $E\{X\}$&lt;/li>
&lt;/ul>
&lt;h4 id="rechenregeln">Calculation rules&lt;/h4>
$\mathrm{E}_{f_{X}}\{aX + b\}=a \mathrm{E}_{f_{X}}\{X\}+b$
&lt;details class="spoiler " id="spoiler-25">
&lt;summary class="cursor-pointer">Proof&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
$$
\begin{array}{ll}
&amp;\mathrm{E}_{f_{X}}\{a X+b\} \\
=&amp;\int_{-\infty}^{\infty}(a x+b) f_{X}(x) \mathrm{d} x \\
=&amp;a \int_{-\infty}^{\infty} x f_{X}(x) \mathrm{d} x+b \int_{-\infty}^{\infty} f_{X}(x) \mathrm{d} x \\
=&amp;a \cdot \mathrm{E}_{f_{X}}\{X\}+b \cdot 1
\end{array}
$$
&lt;/div>
&lt;/details>
&lt;p>More rules:&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.52.26.png"
alt="Basic expectation rules. (Source: kalmanfilter.net)">&lt;figcaption>
&lt;p>Basic expectation rules. (Source: &lt;a href="https://www.kalmanfilter.net/background2.html">kalmanfilter.net&lt;/a>)&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h3 id="k-te-moment">$k$-th moment&lt;/h3>
&lt;p>The expected value&lt;/p>
$$
\mathrm{E}_{f_X}\left\{X^{k}\right\}=\int_{-\infty}^{\infty} x^{k} f_{X}(x) \mathrm{d} x
$$
&lt;p>is the &lt;mark>&lt;strong>$k$-th moment&lt;/strong>&lt;/mark> of the random variable $X$.&lt;/p>
&lt;p>The expected value&lt;/p>
$$
\mathrm{E}_{f_X}\left\{\left(X-\mathrm{E}\{X\}\right)^{k}\right\}=\int_{-\infty}^{\infty}\left(x-\mu_{X}\right)^{k} f_{X}(x) \mathrm{d} x
$$
&lt;p>is the &lt;mark>&lt;strong>$k$-th central moment&lt;/strong>&lt;/mark> of the random variable $X$.&lt;/p>
&lt;h3 id="varianz">Variance&lt;/h3>
&lt;p>&lt;strong>Variance&lt;/strong> := the expected &lt;em>squared&lt;/em> deviation from the expected value&lt;/p>
$$
E_{f_X}\{(X - \mu_X)^2\} = \operatorname{Var}(X) = \sigma_X^2
$$
&lt;ul>
&lt;li>
&lt;p>the second central moment&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The larger the variance, the more widely the values are spread around $E(X)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Notation: $\sigma^2$, $\operatorname{Var}(X)$, $\operatorname{Var}[X]$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="rechenregeln-1">Calculation rules&lt;/h4>
$\operatorname{Var}_{f_X}\{aX+b\} = a^2 \operatorname{Var}_{f_X}\{X\}$
&lt;details class="spoiler " id="spoiler-31">
&lt;summary class="cursor-pointer">Proof&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
$$
\begin{array}{ll}
&amp;\operatorname{Var}_{f_{X}}\{a X+b\} \\
=&amp;\mathrm{E}_{f_{X}}\left\{\left(a X+b-\mathrm{E}_{f_{X}}\{a X+b\}\right)^{2}\right\} \\
=&amp;\mathrm{E}_{f_{X}}\left\{\left(a X+b-\left(a \mu_{X}+b\right)\right)^{2}\right\}\\
=&amp;\mathrm{E}_{f_{X}}\left\{\left(a\left(X-\mu_{X}\right)\right)^{2}\right\} \\
=&amp;\int_{-\infty}^{\infty}\left(a\left(x-\mu_{X}\right)\right)^{2} f_{X}(x) \mathrm{d} x \\
=&amp;a^{2} \int_{-\infty}^{\infty}\left(x-\mu_{X}\right)^{2} f_{X}(x) \mathrm{d} x \\
=&amp;a^{2} \mathrm{E}_{f_{X}}\left\{\left(X-\mu_{X}\right)^{2}\right\} \\
=&amp;a^{2} \operatorname{Var}_{f_{X}}\{X\}
\end{array}
$$
&lt;/div>
&lt;/details>
&lt;/br>
$\operatorname{Var}_{f_{X}}\{X\}=\mathrm{E}_{f_{X}}\left\{X^{2}\right\}-\left(\mathrm{E}_{f_{X}}\{X\}\right)^{2}$
&lt;details class="spoiler " id="spoiler-33">
&lt;summary class="cursor-pointer">Proof&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
$$
\begin{aligned}
\operatorname{Var}_{f_{X}}\{X\}=&amp; \int_{-\infty}^{\infty}\left(x-\mathrm{E}_{f_{X}}\{X\}\right)^{2} f_{X}(x) \mathrm{d} x \\
=&amp; \int_{-\infty}^{\infty}\left(x-\mu_{X}\right)^{2} f_{X}(x) \mathrm{d} x \\
=&amp; \int_{-\infty}^{\infty}\left(x^{2}-2 x \mu_{X}+\mu_{X}^{2}\right) f_{X}(x) \mathrm{d} x \\
=&amp; \int_{-\infty}^{\infty} x^{2} f_{X}(x) \mathrm{d} x-2 \mu_{X} \int_{-\infty}^{\infty} x f_{X}(x) \mathrm{d} x+\mu_{X}^{2} \int_{-\infty}^{\infty} f_{X}(x) \mathrm{d} x \\
=&amp; \mathrm{E}_{f_{X}}\left\{X^{2}\right\}-2 \mu_{X} \mathrm{E}_{f_{X}}\{X\}+\mu_{X}^{2} \cdot 1 \\
=&amp; \mathrm{E}_{f_{X}}\left\{X^{2}\right\}-2 \mu_{X} \mu_{X}+\mu_{X}^{2} \cdot 1 \\
=&amp; \mathrm{E}_{f_{X}}\left\{X^{2}\right\}-\mu_{X}^{2}
\end{aligned}
$$
&lt;/div>
&lt;/details>
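&lt;p>Both variance rules can be checked on a small example. The sketch below uses a fair die, i.e. a discrete random variable, so the integrals of the proofs become sums; the helper &lt;code>expect&lt;/code> and the constants $a = 2$, $b = 3$ are chosen only for illustration.&lt;/p>

```python
from fractions import Fraction

# Fair die: every value 1..6 has probability 1/6
pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def expect(g):
    """Expected value of g(X): sum of g(x) * P(X = x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())


mu = expect(lambda x: x)                    # E{X}   = 7/2
var = expect(lambda x: (x - mu) ** 2)       # Var{X} = 35/12
second_moment = expect(lambda x: x ** 2)    # E{X^2} = 91/6

a, b = 2, 3
mu_lin = expect(lambda x: a * x + b)                    # E{aX + b}
var_lin = expect(lambda x: (a * x + b - mu_lin) ** 2)   # Var{aX + b}

print(var == second_moment - mu ** 2)   # True
print(mu_lin == a * mu + b)             # True
print(var_lin == a ** 2 * var)          # True
```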
&lt;p>More rules:&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.55.30.png"
alt="Basic variance and covariance rules. (Source: kalmanfilter.net)">&lt;figcaption>
&lt;p>Basic variance and covariance rules. (Source: &lt;a href="https://www.kalmanfilter.net/background2.html">kalmanfilter.net&lt;/a>)&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;details class="spoiler " id="spoiler-35">
&lt;summary class="cursor-pointer">Proof of rule 10&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.57.26.png" alt="截屏2022-07-04 10.57.26">
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-36">
&lt;summary class="cursor-pointer">Proof of rule 11&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.57.54.png" alt="截屏2022-07-04 10.57.54">
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-37">
&lt;summary class="cursor-pointer">Proof of rule 13&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.59.10.png" alt="截屏2022-07-04 10.59.10">
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-38">
&lt;summary class="cursor-pointer">Proof of rule 14&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-04%2010.59.30.png" alt="截屏2022-07-04 10.59.30">
&lt;/div>
&lt;/details>
&lt;h3 id="standardabweichung">Standard deviation&lt;/h3>
&lt;p>&lt;strong>Standard deviation&lt;/strong>: a measure of spread that has the same unit as $X$&lt;/p>
$$
\sigma=\sqrt{\operatorname{Var}(X)}
$$
&lt;p>Large $\sigma$ $\rightarrow$ large spread&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Section2Module7HighLowStandardDeviation.jpg" alt="Standard Deviation" style="zoom:75%;" />
&lt;table class="tg">
&lt;thead>
&lt;tr>
&lt;th class="tg-c3ow">Random&lt;br>variable&lt;/th>
&lt;th class="tg-7btt">&lt;span style="font-style:normal">Discrete&lt;/span>&lt;/th>
&lt;th class="tg-7btt">&lt;span style="font-style:normal">Continuous&lt;/span>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tg-7btt">Expected value&lt;br>($\mu$, $E(X)$)&lt;/td>
&lt;td class="tg-c3ow">$\sum_{i \in \Omega} x_{i} \cdot p_{i}$&lt;/td>
&lt;td class="tg-c3ow">$\int_{-\infty}^{+\infty} x \cdot f(x) \mathrm{d} x$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-7btt">Variance&lt;br>($\sigma^2$, $\operatorname{Var}(X)$)&lt;/td>
&lt;td class="tg-c3ow">$\sum_{i \in \Omega}\left(x_{i}-\mu\right)^{2} \cdot p_{i}$&lt;/td>
&lt;td class="tg-c3ow">$\int_{-\infty}^{+\infty}(x-\mu)^{2} \cdot f(x) \mathrm{d} x$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-7btt">Standard deviation&lt;br>($\sigma$)&lt;/td>
&lt;td class="tg-c3ow">$\sqrt{\operatorname{Var}(X)}$&lt;/td>
&lt;td class="tg-c3ow">&lt;span style="font-weight:400;font-style:normal">$\sqrt{\operatorname{Var}(X)}$&lt;/span>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="normalverteilte-zufallsvariable">Normally distributed random variable&lt;/h3>
&lt;p>A &lt;mark>&lt;strong>normally distributed random variable&lt;/strong>&lt;/mark> $X$ has the density&lt;/p>
$$
f_{X}(x)=\mathcal{N}\left(x-\mu, \sigma^{2}\right)=\frac{1}{\sqrt{2 \pi} \sigma} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}
$$
&lt;p>Its $k$-th central moment is, in general,&lt;/p>
$$
\mathrm{E}_{f_{X}}\left\{(X-\mu)^{k}\right\}=\left\{\begin{array}{ll}
1 \cdot 3 \cdot 5 \cdots(k-1) \sigma^{k} &amp; \text { if } k \text { is even } \\
0 &amp; \text { if } k \text { is odd }
\end{array}\right.
$$
&lt;p>The normal distribution is therefore completely characterized by $\mu$ and $\sigma$.&lt;/p>
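&lt;p>As a quick check of the moment formula above, the first even central moments work out as follows (all odd central moments vanish):&lt;/p>
$$
\begin{array}{ll}
k=2: &amp; \mathrm{E}_{f_{X}}\left\{(X-\mu)^{2}\right\}=1 \cdot \sigma^{2}=\sigma^{2} \\
k=4: &amp; \mathrm{E}_{f_{X}}\left\{(X-\mu)^{4}\right\}=1 \cdot 3 \cdot \sigma^{4}=3 \sigma^{4} \\
k=6: &amp; \mathrm{E}_{f_{X}}\left\{(X-\mu)^{6}\right\}=1 \cdot 3 \cdot 5 \cdot \sigma^{6}=15 \sigma^{6}
\end{array}
$$
&lt;p>The case $k=2$ reproduces the variance $\sigma^{2}$, as expected.&lt;/p>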
&lt;h3 id="standardisierte-zufallsvariable">Standardized random variable&lt;/h3>
&lt;p>A random variable $X$ with expected value $\mu_X = E_{f_X}\{X\}$ and variance $\sigma_X^2$ is transformed by&lt;/p>
$$
Y = \frac{X - \mu_X}{\sigma_X}
$$
&lt;p>into a &lt;mark>&lt;strong>standardized random variable&lt;/strong>&lt;/mark> $Y$, which has expected value 0 and variance 1.&lt;/p>
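&lt;p>The standardization can be tried out on a small data set. The sketch below assumes a discrete random variable that takes each of the listed values with equal probability; the values themselves are hypothetical and chosen so that the numbers come out evenly.&lt;/p>

```python
import math

# Hypothetical values, each taken with the same probability 1/8
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)

mu = sum(values) / n                                        # 5.0
sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)   # 2.0

# Y = (X - mu) / sigma
standardized = [(x - mu) / sigma for x in values]

mean_y = sum(standardized) / n
var_y = sum((y - mean_y) ** 2 for y in standardized) / n
print(mean_y, var_y)  # 0.0 1.0
```

&lt;p>The same result follows from the calculation rules: shifting by $\mu_X$ sets the expected value to 0, and dividing by $\sigma_X$ scales the variance by $1/\sigma_X^2$.&lt;/p>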
&lt;h3 id="modalwert-quantil-median">Mode, quantile, median&lt;/h3>
&lt;p>A value at which the density function $f_X(x)$ attains a local maximum is called a &lt;mark>&lt;strong>mode&lt;/strong>&lt;/mark> of the continuous random variable $X$.&lt;/p>
&lt;p>A value $x_p$ that satisfies the inequalities&lt;/p>
$$
P(X &lt; x_p) \leq p, \quad P(X > x_p) \leq 1 - p \quad (0 &lt; p &lt; 1)
$$
&lt;p>is called a &lt;mark>&lt;strong>$p$-th quantile&lt;/strong>&lt;/mark>.&lt;/p>
&lt;ul>
&lt;li>For a continuous random variable $X$, a $p$-th quantile $x_p$ is given by $F_X(x_p) = p$&lt;/li>
&lt;li>A quantile of order $p=\frac{1}{2}$ is called a &lt;mark>&lt;strong>median&lt;/strong>&lt;/mark> of the random variable $X$&lt;/li>
&lt;li>For normally distributed random variables, expected value, mode and median coincide.&lt;/li>
&lt;/ul>
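&lt;p>For a continuous random variable the quantile condition $F_X(x_p) = p$ can be solved directly when the distribution function is invertible. The sketch below assumes an exponential distribution with a hypothetical rate parameter, for which the solution has a closed form; the function names are chosen only for illustration.&lt;/p>

```python
import math

RATE = 2.0  # hypothetical parameter of an exponential distribution


def cdf(x):
    """Distribution function F(x) of the exponential distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def quantile(p):
    """Solve F(x_p) = p for x_p (only defined for p strictly between 0 and 1)."""
    if not (p > 0.0 and 1.0 > p):
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(1.0 - p) / RATE


median = quantile(0.5)
print(math.isclose(median, math.log(2) / RATE))  # True
print(math.isclose(cdf(median), 0.5))            # True
```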
&lt;h2 id="reference">Reference&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Wahrscheinlichkeits-, Dichte- und Verteilungsfunktion diskreter und stetiger Zufallsvariablen&lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/_lq7zfecSpw?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;/li>
&lt;li>
&lt;p>Expected value&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Kenngrößen (Momente) von Zufallsvariablen I: Erwartungswert, Varianz, Standardabweichung&lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/KKr-aLFrSVA?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Internet Routing</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/03-internet_routing/</link><pubDate>Thu, 04 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/03-internet_routing/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Internet_Routing.png"
alt="Summary">&lt;figcaption>
&lt;p>Summary&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h2 id="baiscs">Basics&lt;/h2>
&lt;p>Internet: network of networks&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2013.02.45.png" alt="截屏2021-03-04 13.02.45" style="zoom:67%;" />
&lt;h3 id="high-level-view-on-an-ip-router">High-level View on an IP Router&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2013.03.49.png" alt="截屏2021-03-04 13.03.49" style="zoom: 67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Control Plane&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Routing protocols&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Exchange of routing messages for calculation of routes&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Data Plane&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Lookup&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Forwarding of packets at layer 3&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Routing table&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Generated by routing protocol&lt;/li>
&lt;li>Entries: Mapping of destination IP prefixes to next hop (IP address)&lt;/li>
&lt;li>Optimized for the particular routing algorithm&lt;/li>
&lt;li>Performance is not critical
&lt;ul>
&lt;li>Implemented in software&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Forwarding table&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Used for packet forwarding&lt;/li>
&lt;li>Entries: Mapping of IP prefixes to outgoing ports (interface ID and MAC address)&lt;/li>
&lt;li>Optimized for longest prefix matching&lt;/li>
&lt;li>Performance is critical (lookup at line speed)!
&lt;ul>
&lt;li>Partially uses dedicated hardware&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Routing metric&lt;/strong> (also named &lt;strong>cost&lt;/strong>, &lt;strong>weight&lt;/strong>)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Metric used by a router to make routing decisions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Can be applied to an individual link or to the overall path&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>Examples&lt;/em>&lt;/p>
&lt;ul>
&lt;li>&lt;em>Utilization, latency, data rate&lt;/em>&lt;/li>
&lt;li>&lt;em>Number of hops&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Routing policy&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Policy-based routing decisions&lt;/li>
&lt;li>Policies are defined by network operator / owner&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
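&lt;p>The forwarding table above is described as optimized for longest prefix matching. A minimal sketch of that lookup rule follows, assuming a hypothetical table that maps prefixes to made-up interface names (the MAC address part of an entry is left out). It uses a plain linear scan for readability; real routers typically rely on specialized data structures or dedicated hardware to reach line speed.&lt;/p>

```python
import ipaddress

# Hypothetical forwarding table: IP prefix -> outgoing interface (names are made up).
FORWARDING_TABLE = {
    "0.0.0.0/0": "eth0",    # default route
    "10.0.0.0/8": "eth1",
    "10.1.0.0/16": "eth2",
    "10.1.2.0/24": "eth3",
}


def lookup(destination, table):
    """Return the interface of the longest prefix that contains destination."""
    address = ipaddress.ip_address(destination)
    candidates = []
    for prefix, interface in table.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            candidates.append((network.prefixlen, interface))
    if not candidates:
        raise LookupError(f"no route to {destination}")
    # The entry with the largest prefix length is the most specific match.
    return max(candidates)[1]


print(lookup("10.1.2.3", FORWARDING_TABLE))  # eth3
print(lookup("8.8.8.8", FORWARDING_TABLE))   # eth0
```

&lt;p>All four prefixes match &lt;code>10.1.2.3&lt;/code>, but only the &lt;code>/24&lt;/code> entry is used because it is the most specific one.&lt;/p>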
&lt;h3 id="distributed-adaptive-routing">Distributed Adaptive Routing&lt;/h3>
&lt;ul>
&lt;li>Currently most commonly used in the Internet&lt;/li>
&lt;li>An instance of the routing protocol &lt;strong>in each router&lt;/strong>
&lt;ul>
&lt;li>Exchange of routing information via routing messages&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Adaptation of the paths to the current situation in the network&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2014.29.56.png" alt="截屏2021-03-04 14.29.56">&lt;/p>
&lt;h3 id="path-computation">Path computation&lt;/h3>
&lt;p>Network is modeled as &lt;strong>graph&lt;/strong>
&lt;/p>
$$
G = (N, E)
$$
&lt;ul>
&lt;li>$N$: nodes (routers)&lt;/li>
&lt;li>$E$: edges
&lt;ul>
&lt;li>Links between routers are edges&lt;/li>
&lt;li>Edges are associated with metric&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2014.31.28.png" alt="截屏2021-03-04 14.31.28">&lt;/p>
&lt;h2 id="autonomous-systems">Autonomous Systems&lt;/h2>
&lt;h3 id="structuring-into-autonomous-systems">Structuring into autonomous systems&lt;/h3>
&lt;p>For routing purposes, the Internet is divided into &lt;strong>Autonomous Systems (ASes)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Routing &lt;strong>inside&lt;/strong> an autonomous system using &lt;strong>Interior Gateway Protocol (IGP)&lt;/strong>&lt;/li>
&lt;li>Routing between autonomous systems using &lt;strong>Exterior Gateway Protocol (EGP)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2014.42.27.png" alt="截屏2021-03-04 14.42.27" style="zoom: 67%;" />
&lt;h4 id="autonomous-systems-1">Autonomous Systems&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Identification: Unique number called &lt;strong>Autonomous System Number (ASN)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Originally 16 bits; now 32 bits&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Properties&lt;/p>
&lt;ul>
&lt;li>Appears as a single entity to the outside&lt;/li>
&lt;li>Uniform routing policy&lt;/li>
&lt;li>Typically uniform interior routing protocol
&lt;ul>
&lt;li>Different ASes can use different interior routing protocols&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Separated administrative domains&lt;/li>
&lt;li>Scalability by using two logical levels
&lt;ul>
&lt;li>Routing protocol inside an AS (not global)&lt;/li>
&lt;li>Routing protocol between ASes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Important Properties&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Scalability of routing protocols
&lt;ul>
&lt;li>Overhead increases with size of the network 📈
&lt;ul>
&lt;li>Space for storing routing information&lt;/li>
&lt;li>Number of routing messages to exchange&lt;/li>
&lt;li>Computation overhead&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Operator autonomy
&lt;ul>
&lt;li>Choice of interior routing protocol&lt;/li>
&lt;li>Hiding of internal network structure&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Allocation&lt;/p>
&lt;ul>
&lt;li>IANA (Internet Assigned Numbers Authority) delegates allocation to &lt;strong>Regional Internet Registries (RIR)&lt;/strong>, e.g.,
&lt;ul>
&lt;li>
&lt;p>ARIN (North America)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>RIPE NCC (Europe, Middle East and Central Asia)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>APNIC (Asia-Pacific)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>LACNIC (Latin America, Caribbean)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AfriNIC (Africa)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="subdivision-into-ass">Subdivision into ASes&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2014.53.42.png" alt="截屏2021-03-04 14.53.42" style="zoom: 67%;" />
&lt;h4 id="classification-of-ases">Classification of ASes&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Classification based on &lt;strong>role&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Stub AS&lt;/strong>
&lt;ul>
&lt;li>Small organizations and enterprises (Mostly operate only regionally)&lt;/li>
&lt;li>Connected to exactly one provider&lt;/li>
&lt;li>No transit traffic&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Multihomed AS&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Large enterprises&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Connected to several providers (reliability)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No transit traffic&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Transit AS&lt;/strong>
&lt;ul>
&lt;li>Provider (Often global scope)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Classification based on &lt;strong>“economic position/influence”&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Tier 1, tier 2, tier 3 &amp;hellip;&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2014.57.26.png" alt="截屏2021-03-04 14.57.26">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="different-roles">Different Roles&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>End customer&lt;/strong>
&lt;ul>
&lt;li>Uses Internet applications&lt;/li>
&lt;li>Examples
&lt;ul>
&lt;li>Universities&lt;/li>
&lt;li>Enterprises&lt;/li>
&lt;li>Customers of Internet Service Providers (ISP)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#content-delivery-provider">&lt;strong>Content delivery provider&lt;/strong>&lt;/a>
&lt;ul>
&lt;li>Requested by end customers / Internet application
&lt;ul>
&lt;li>Provide content&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Examples: Google, Akamai, Yahoo, YouTube, Facebook&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="reachability-across-autonomous-systems">Reachability across autonomous systems&lt;/h3>
&lt;h4 id="reachability">Reachability&lt;/h4>
&lt;p>Main problem&lt;/p>
&lt;ul>
&lt;li>How to ensure mutual reachability?&lt;/li>
&lt;li>Cooperation among autonomous systems?&lt;/li>
&lt;/ul>
&lt;p>Basic concepts&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#connectivity-and-transit">&lt;strong>Transit&lt;/strong>&lt;/a>: Purchased connectivity 💸&lt;/li>
&lt;li>&lt;a href="#peering">&lt;strong>Peering&lt;/strong>&lt;/a>: Direct connection, typically between ASes of the same tier&lt;/li>
&lt;/ul>
&lt;h4 id="connectivity-and-transit">Connectivity and Transit&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Establish connectivity&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Establish paths to all other ASes in the Internet&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AS operator purchases connectivity from one or more ASes&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Transit&lt;/strong>&lt;/p>
&lt;blockquote>
&lt;p>&lt;strong>Internet transit&lt;/strong> is the service of allowing network traffic to cross or &amp;ldquo;transit&amp;rdquo; a computer network, usually used to connect a smaller &lt;a href="https://en.wikipedia.org/wiki/Internet_service_provider">Internet service provider&lt;/a> (ISP) to the larger &lt;a href="https://en.wikipedia.org/wiki/Internet">Internet&lt;/a>. (&lt;a href="https://en.wikipedia.org/wiki/Internet_transit">wiki&lt;/a>)&lt;/p>
&lt;/blockquote>
&lt;ul>
&lt;li>
&lt;p>Purchased connectivity 💸&lt;/p>
&lt;ul>
&lt;li>Upstream: provider (seller) of transit&lt;/li>
&lt;li>Downstream: customer (buyer)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Traffic exchange&lt;/p>
&lt;ul>
&lt;li>
&lt;p>In both directions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Only the downstream AS pays&lt;/strong>; usually charged by traffic volume&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Transit AS: Provider AS that offers transit&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Options for connecting a stub AS&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Stub AS&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2015.23.16.png" alt="截屏2021-03-04 15.23.16">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dualhomed stub AS&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2015.23.52.png" alt="截屏2021-03-04 15.23.52">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multihomed stub AS&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2015.24.16.png" alt="截屏2021-03-04 15.24.16">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="peering">Peering&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Private peering&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Direct connection between two ASes, usually of same tier&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No cost for traffic exchange; costs for network infrastructure apply&lt;/p>
&lt;/li>
&lt;li>
&lt;p>However&lt;/p>
&lt;ul>
&lt;li>Mostly &lt;strong>only data traffic between privately peered ASes&lt;/strong>&lt;/li>
&lt;li>NO transit traffic of other ASes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Video explanation&lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/T2jb1tzXzMw?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;/li>
&lt;li>
&lt;p>Example: peering and transit combination&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2015.34.37.png" alt="截屏2021-03-04 15.34.37" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Benefits both ASes: both save transit costs that would otherwise apply&lt;/li>
&lt;li>Shorter data paths: fewer AS hops between source and destination&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>Directly connecting ASes is complicated (different geographical locations)&lt;/li>
&lt;li>A full mesh of $n$ ASes requires $\frac{n(n-1)}{2}$ separate connections!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Public peering&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Through &lt;strong>Internet exchange points (IXPs)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Central public authority for interconnection&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2015.59.13.png" alt="截屏2021-03-04 15.59.13" style="zoom:67%;" />
&lt;ul>
&lt;li>Neutral traffic forwarding on layer 2&lt;/li>
&lt;li>No differentiation regardless of customer, content, or type of service&lt;/li>
&lt;li>Examples
&lt;ul>
&lt;li>DE-CIX (one of the world’s largest IXPs)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Members / customers: Monthly fixed charges per network port&lt;/p>
&lt;ul>
&lt;li>Necessary for operation and maintenance of the IXP’s switching platform&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Different peering policies&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Open&lt;/strong>: AS is open for peering with all other ASes&lt;/li>
&lt;li>&lt;strong>Selective&lt;/strong>: Peering only under given terms and conditions&lt;/li>
&lt;li>&lt;strong>Restrictive&lt;/strong>: AS does not engage in new peering relationships&lt;/li>
&lt;li>&lt;strong>No Peering&lt;/strong>: AS does not do any peering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
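&lt;p>To make the scaling problem of private peering concrete, the sketch below compares the number of direct connections in a full mesh with the number of ports needed at an IXP. The one-port-per-AS figure for public peering is a simplifying assumption for illustration; the function names are made up.&lt;/p>

```python
def full_mesh_links(n):
    """Direct connections needed so that each of n ASes peers with every other one."""
    return n * (n - 1) // 2


def ixp_ports(n):
    """Assumption: with public peering, each AS needs a single port at the IXP."""
    return n


for n in (5, 50, 500):
    print(n, full_mesh_links(n), ixp_ports(n))
```

&lt;p>The full mesh grows quadratically with the number of ASes, while the number of IXP ports grows only linearly under this assumption.&lt;/p>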
&lt;h4 id="autonomous-systems-and-transitpeering">Autonomous Systems and Transit/Peering&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Tier 1&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Large global ASes with access to (all) other ASes
&lt;ul>
&lt;li>Do not buy any transit. Sell transit&lt;/li>
&lt;li>Peering with other tier 1 ASes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;em>Examples: Deutsche Telekom, AT&amp;amp;T&amp;hellip;&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Tier 2&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Big national and inter-regional ASes
&lt;ul>
&lt;li>Connection to providers of Internet applications&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Downstream of tier 1 ASes
&lt;ul>
&lt;li>Sell transit to other ASes&lt;/li>
&lt;li>Usually employ peering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;em>Examples: Vodafone, Comcast, Tele2&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Tier 3&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Small mostly regional ASes
&lt;ul>
&lt;li>Connections with small providers of Internet applications&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Downstream of tier 2 providers
&lt;ul>
&lt;li>Usually do not sell transit to other ASes&lt;/li>
&lt;li>Sell transit mostly to end customers/users&lt;/li>
&lt;li>Usually employ peering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;em>Examples: KabelBW, NETHINKS, Alice&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="content-delivery-provider">Content Delivery Provider&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>🎯 Goal: FAST delivery of content (i.e. low latencies)&lt;/strong>&lt;/p>
&lt;p>$\rightarrow$ Locations close to tier 1 peering points are preferred&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Two basic alternatives&lt;/p>
&lt;ul>
&lt;li>Web servers are hosted &lt;strong>directly in tier 1 ASes&lt;/strong> (no separate AS number required)&lt;/li>
&lt;li>Web servers are connected over own routers
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#content-delivery-network">&lt;strong>Content delivery network (CDN)&lt;/strong>&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Own AS number required&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Peering with essential providers at important peering points&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>Examples: Google, Yahoo, Akamai&lt;/em>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="content-delivery-network">Content Delivery Network&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Worldwide network with its own AS number&lt;/p>
&lt;ul>
&lt;li>Thousands of &lt;strong>Points of Presence (PoP)&lt;/strong> spread across the world&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Point of Presence&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Consists of access routers and core routers&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Access router at the edge of a CDN&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Core router inside a CDN&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Customers connect through access routers&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Objectives&lt;/p>
&lt;ul>
&lt;li>Load balancing at access routers&lt;/li>
&lt;li>Be close to customers $\rightarrow$ low latencies&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="routing-in-and-between-autonomous-systems">Routing in and between Autonomous Systems&lt;/h3>
&lt;p>Classification&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Interior gateway protocols (IGPs)&lt;/strong> &lt;em>INSIDE&lt;/em> one AS
&lt;ul>
&lt;li>
&lt;p>A.k.a &lt;strong>intra-domain routing protocols&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Are encapsulated inside an AS, i.e., not visible to the outside&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different IGPs in different ASes possible&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Metric-based&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Exterior gateway protocols (EGPs)&lt;/strong> &lt;em>BETWEEN&lt;/em> ASes
&lt;ul>
&lt;li>
&lt;p>Also named &lt;strong>inter-domain routing protocols&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Single protocol between &lt;em>all&lt;/em> ASes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Policy-based&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-04%2016.48.23.png" alt="截屏2021-03-04 16.48.23">&lt;/p>
&lt;h2 id="rip-routing-information-protocol">RIP: Routing Information Protocol&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Interior&lt;/strong> gateway protocol&lt;/li>
&lt;li>Very simple protocol that requires very little configuration&lt;/li>
&lt;/ul>
&lt;h3 id="rip-in-the-protocol-stack">RIP in the Protocol Stack&lt;/h3>
&lt;ul>
&lt;li>The application process &lt;code>routed&lt;/code> implements RIP and manages the forwarding table&lt;/li>
&lt;li>RIP routing messages are sent over UDP $\rightarrow$ NOT reliable&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2023.16.57.png" alt="截屏2021-03-04 23.16.57" style="zoom:80%;" />
&lt;h3 id="routing-metric">Routing Metric&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Distance between source and destination = number of hops on the path (hop count)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hop count&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Refers to the number of intermediate devices through which data must pass between source and destination.&lt;/p>
&lt;ul>
&lt;li>Each time that a packet of data moves from one router (or device) to another, that is considered one HOP.&lt;/li>
&lt;/ul>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Hop-count-trans-20210304235011333.png"
alt="An illustration of hops in a wired network. The hop count between the computers in this case is 2.">&lt;figcaption>
&lt;p>An illustration of hops in a wired network. The hop count between the computers in this case is 2.&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;li>
&lt;p>Limited range of values: 1 - 15&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Value of 16 corresponds to &amp;ldquo;infinity&amp;rdquo;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="rip-routing-messages">RIP Routing Messages&lt;/h3>
&lt;ul>
&lt;li>RIP protocol entities exchange routing messages
&lt;ul>
&lt;li>UDP is used as transport protocol&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Types of routing messages
&lt;ul>
&lt;li>&lt;strong>Request&lt;/strong> message
&lt;ul>
&lt;li>Requests the complete routing table or part of it&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Response&lt;/strong> message for different reasons
&lt;ul>
&lt;li>Response to specific query&lt;/li>
&lt;li>Regular update&lt;/li>
&lt;li>Triggered update&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="routing-updates">Routing Updates&lt;/h3>
&lt;h4 id="outgoing">Outgoing&lt;/h4>
&lt;ul>
&lt;li>Regular routing update
&lt;ul>
&lt;li>Periodically, every 30 seconds&lt;/li>
&lt;li>Sends entire routing table to all its neighbors&lt;/li>
&lt;li>Entries in the routing table are periodically refreshed&lt;/li>
&lt;li>No refresh for at least 180 seconds? $\rightarrow$ Hop count is set to 16 (“infinite”) and the corresponding route is invalidated&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>When the metric of a route changes (&lt;em>triggered&lt;/em> update)
&lt;ul>
&lt;li>Only changes since the last update are communicated, not the complete routing table&lt;/li>
&lt;li>Rate limitation in order to reduce load on the network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="incoming">Incoming&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Entry for a destination address does not exist in routing table and received metric is not “infinite” $\rightarrow$ &lt;strong>Insert&lt;/strong> new entry in routing table&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Current entry for a destination address in routing table has larger metric &lt;strong>or&lt;/strong> routing update was sent by the “next router” for this destination $\rightarrow$ &lt;strong>Modify&lt;/strong> entry&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Otherwise $\rightarrow$ &lt;strong>Ignore&lt;/strong> routing update&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="example">Example&lt;/h4>
&lt;p>Scenario&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2023.42.49.png" alt="截屏2021-03-04 23.42.49" style="zoom:80%;" />
&lt;ul>
&lt;li>Connecting lines represent either direct links or LANs between routers&lt;/li>
&lt;li>Ovals represent routers&lt;/li>
&lt;/ul>
&lt;p>We have the routing table of router D&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2023.44.02.png" alt="截屏2021-03-04 23.44.02" style="zoom:80%;" />
&lt;p>30 seconds later, D receives a new routing update from router A&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-04%2023.44.38.png" alt="截屏2021-03-04 23.44.38" style="zoom:80%;" />
&lt;ul>
&lt;li>A tells D: &amp;ldquo;Hey, now I can reach Z in 4 hops&amp;rdquo;.&lt;/li>
&lt;li>I.e., now D can reach Z through $4+1=5$ hops&lt;/li>
&lt;/ul>
&lt;p>As 5 &amp;lt; 7 (the old number of hops to reach Z), D updates its routing table:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-05%2000.03.37.png" alt="截屏2021-03-05 00.03.37" style="zoom:80%;" />
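&lt;p>The insert / modify / ignore rules for incoming updates and the example above can be sketched as follows. This is a simplified illustration, not a complete RIP implementation: timers and triggered updates are left out, the table layout is an assumption, and the old next hop &lt;code>"B"&lt;/code> is hypothetical, since the original next hop appears only in the figure.&lt;/p>

```python
INFINITY = 16  # RIP hop count that means "unreachable"


def process_update(table, sender, advertised):
    """Apply one RIP response received from neighbor `sender`.

    table:      destination -> (next_hop, hop_count)
    advertised: destination -> hop count as seen by the sender
    """
    for destination, hops in advertised.items():
        # Reaching the destination via the sender costs one additional hop.
        new_hops = min(hops + 1, INFINITY)
        current = table.get(destination)
        if current is None:
            if new_hops != INFINITY:
                table[destination] = (sender, new_hops)  # insert
        elif current[1] > new_hops or current[0] == sender:
            table[destination] = (sender, new_hops)      # modify
        # otherwise: ignore the update for this destination
    return table


# Router D currently reaches Z in 7 hops via a hypothetical neighbor B.
table_d = {"Z": ("B", 7)}
process_update(table_d, "A", {"Z": 4})
print(table_d)  # {'Z': ('A', 5)}
```

&lt;p>Note that an update from the current next hop is always accepted, even if the metric gets worse, because the existing route depends on that neighbor.&lt;/p>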
&lt;h2 id="ospf-open-shortest-path-first">OSPF: Open Shortest Path First&lt;/h2>
&lt;h3 id="ospf-basics">OSPF Basics&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Interior&lt;/strong> gateway protocol&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Link state&lt;/strong> protocol&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Each router in the network needs to learn complete &lt;em>topology&lt;/em> of the network (Otherwise, calculated paths are inconsistent)&lt;/p>
&lt;ul>
&lt;li>Topology = Nodes and links with their costs (weights)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Each router separately computes shortest paths based on network topology&lt;/p>
&lt;ul>
&lt;li>Dijkstra shortest path algorithm&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
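&lt;p>Once a router knows the complete topology, it can compute shortest paths locally. The following is a minimal sketch of Dijkstra's algorithm on a graph $G = (N, E)$ stored as a dictionary; the router names and link costs are made up for illustration, and the function also records the first hop towards each destination, which is what a routing table entry needs.&lt;/p>

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra: return (dist, first_hop) for all nodes reachable from source.

    graph: node -> {neighbor: link_cost}, all costs positive
    """
    dist = {source: 0}
    first_hop = {}
    queue = [(0, source, None)]
    while queue:
        cost, node, hop = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if dist.get(neighbor, float("inf")) > new_cost:
                dist[neighbor] = new_cost
                # Directly attached neighbors are their own first hop.
                next_hop = neighbor if node == source else hop
                first_hop[neighbor] = next_hop
                heapq.heappush(queue, (new_cost, neighbor, next_hop))
    return dist, first_hop


# Hypothetical topology with symmetric link costs.
TOPOLOGY = {
    "R1": {"R2": 1, "R3": 4},
    "R2": {"R1": 1, "R3": 2, "R4": 7},
    "R3": {"R1": 4, "R2": 2, "R4": 3},
    "R4": {"R2": 7, "R3": 3},
}

dist, first_hop = shortest_paths(TOPOLOGY, "R1")
print(dist)       # {'R1': 0, 'R2': 1, 'R3': 3, 'R4': 6}
print(first_hop)  # {'R2': 'R2', 'R3': 'R2', 'R4': 'R2'}
```

&lt;p>Because every router runs the same computation on the same topology, the resulting next hops are consistent across the network.&lt;/p>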
&lt;h4 id="ospf-in-the-protocol-stack">OSPF in the Protocol Stack&lt;/h4>
&lt;p>OSPF is located on top of IP $\rightarrow$ OSPF uses an &lt;em>unreliable&lt;/em> communication service&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2011.51.52.png" alt="截屏2021-03-07 11.51.52" style="zoom:80%;" />
&lt;h4 id="know-the-neighbors">Know the Neighbors&lt;/h4>
&lt;p>Each router&lt;/p>
&lt;ul>
&lt;li>learns its neighbors and&lt;/li>
&lt;li>monitors the state of the links to them&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2011.56.39.png" alt="截屏2021-03-07 11.56.39" style="zoom:80%;" />
&lt;h4 id="link-states-of-a-router">Link States of a Router&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Router ID of neighbors&lt;/strong>: dynamically discovered by hello protocol&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Availability: dynamically discovered by hello protocol or physical layer&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Everything else is configured&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2012.03.17.png" alt="截屏2021-03-07 12.03.17" style="zoom:80%;" />
&lt;h4 id="pre-configuration-of-ospf-router">Pre-Configuration of OSPF Router&lt;/h4>
&lt;p>Each router is &lt;strong>pre-configured&lt;/strong> with the following parameters&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Router ID&lt;/strong>: unique ID of a router in the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Per-interface parameters&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Interface IP address (and mask)&lt;/li>
&lt;li>Interface output cost – metric
&lt;ul>
&lt;li>Typically, &lt;em>inversely proportional&lt;/em> to link data rate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="routing-metric-1">Routing Metric&lt;/h4>
&lt;p>Each link is associated with &lt;strong>link costs&lt;/strong>&lt;/p>
&lt;p>Example: prefer links with higher data rate
&lt;/p>
$$
\text { Cost }=\frac{\text { Reference Data Rate }}{\text { Interface Data Rate }}
$$
&lt;ul>
&lt;li>$\text{Reference Data Rate}$ can be configured
&lt;ul>
&lt;li>E.g., to 1 Gbit/s or 10 Gbit/s&lt;/li>
&lt;li>Should be consistent across all routers in network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
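&lt;p>A small worked example of the cost formula is sketched below. Rounding down and clamping the result to at least 1 is an assumption made here to obtain integer costs; the function name and default reference rate are illustrative as well.&lt;/p>

```python
def ospf_cost(interface_rate_bps, reference_rate_bps=10**9):
    """Cost = reference data rate / interface data rate.

    Assumption: the result is rounded down and never smaller than 1.
    """
    return max(1, reference_rate_bps // interface_rate_bps)


# Reference data rate of 1 Gbit/s
print(ospf_cost(10**8))    # 100 Mbit/s link: 10
print(ospf_cost(10**9))    # 1 Gbit/s link:   1
print(ospf_cost(10**10))   # 10 Gbit/s link:  1

# Reference data rate of 10 Gbit/s
print(ospf_cost(10**9, 10**10))   # 1 Gbit/s link:  10
print(ospf_cost(10**10, 10**10))  # 10 Gbit/s link: 1
```

&lt;p>With a reference of 1 Gbit/s, links of 1 Gbit/s and 10 Gbit/s end up with the same cost under this rounding, which is one reason to choose a sufficiently high reference data rate and to configure it consistently on all routers.&lt;/p>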
&lt;h4 id="link-state-advertisement-lsa">Link State Advertisement (LSA)&lt;/h4>
&lt;p>Each router constructs router &lt;strong>link state advertisements (LSAs)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Router LSAs consist of information about its neighbors and links&lt;/li>
&lt;li>Example&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2012.19.30.png" alt="截屏2021-03-07 12.19.30" style="zoom:80%;" />
&lt;p>Router floods its LSA on &lt;em>all&lt;/em> its interfaces $\rightarrow$ All routers in the network must receive an identical copy of this LSA&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-07%2012.47.21.png" alt="截屏2021-03-07 12.47.21">&lt;/p>
&lt;h4 id="link-state-database">Link State Database&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Each router maintains a link state database&lt;/p>
&lt;ul>
&lt;li>Stores most recent LSAs from all other routers in the network&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-07%2012.48.45.png" alt="截屏2021-03-07 12.48.45">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Link state database is used to&lt;/p>
&lt;ul>
&lt;li>Construct topology graph of the network&lt;/li>
&lt;li>Calculate routing table&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Routers have identical knowledge of network topology iff their link state databases are &lt;strong>synchronized&lt;/strong>, i.e., they have identical content at all routers.&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>Initial Synchronization of link state database&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>(Re-)start&lt;/strong> of a router&lt;/p>
&lt;p>New router has an empty link state database&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Initial database synchronization&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Router asks neighboring router to share its database&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Performed immediately after a “handshake” of the hello protocol&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Routers exchange LSA headers with each other&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If an LSA is missing it is requested from the neighbor router&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ the routers are now considered as adjacent to each other&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="link-state-advertisement">Link State Advertisement&lt;/h3>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2013.02.37.png" alt="截屏2021-03-07 13.02.37" style="zoom:80%;" />
&lt;p>Each LSA is associated with a lifetime (&lt;strong>LS &lt;code>Age&lt;/code>&lt;/strong>)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Set to “0” by advertising router&lt;/p>
&lt;ul>
&lt;li>When flooded, incremented by transmission delay (estimated value)&lt;/li>
&lt;li>As LSA is stored in database, Age is &lt;strong>incremented over time&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>When LSA’s age reaches &lt;strong>&lt;code>MaxAge&lt;/code>&lt;/strong>, LSA is considered &lt;strong>out-of-date&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;code>MaxAge&lt;/code> is set to 1 hour&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Consequence: routers must refresh their LSAs every &lt;code>LSRefreshTime&lt;/code>&lt;/p>
&lt;ul>
&lt;li>&lt;code>LSRefreshTime&lt;/code> is set to 30 minutes&lt;/li>
&lt;li>Minimum interval between two originations of any particular LSA: 5 seconds&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
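&lt;p>The aging rules can be sketched with a few helper functions. The two constants come from the values stated above; the function names are hypothetical, and the estimated transmission delay added during flooding is ignored for simplicity.&lt;/p>

```python
MAX_AGE = 60 * 60          # MaxAge: 1 hour, in seconds
LS_REFRESH_TIME = 30 * 60  # LSRefreshTime: 30 minutes, in seconds


def age_after(ls_age, elapsed_seconds):
    """Age of a stored LSA after some time has passed, capped at MaxAge."""
    return min(ls_age + elapsed_seconds, MAX_AGE)


def is_out_of_date(ls_age):
    """An LSA whose age has reached MaxAge is considered out-of-date."""
    return ls_age >= MAX_AGE


def needs_refresh(ls_age):
    """The originating router re-advertises its own LSA every LSRefreshTime."""
    return ls_age >= LS_REFRESH_TIME


age = age_after(0, 45 * 60)  # LSA originated 45 minutes ago
print(age, needs_refresh(age), is_out_of_date(age))  # 2700 True False
```

&lt;p>Since the refresh interval is half of &lt;code>MaxAge&lt;/code>, a single lost refresh does not immediately cause the LSA to expire at other routers.&lt;/p>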
&lt;h3 id="hello-protocol">Hello Protocol&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goals&lt;/p>
&lt;ul>
&lt;li>Ensure bi-directional communication between neighboring OSPF routers&lt;/li>
&lt;li>Establish and maintain logical adjacencies&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2013.06.50.png" alt="截屏2021-03-07 13.06.50" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>Determines identity and liveness of neighboring routers&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="hello-message">Hello Message&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Contains own &lt;code>router ID&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Contains &lt;code>router ID&lt;/code> of neighboring router, if known&lt;/p>
&lt;ul>
&lt;li>If not yet known $\rightarrow$ &lt;code>router ID&lt;/code> is set to &lt;code>0.0.0.0&lt;/code>&lt;/li>
&lt;li>If own &lt;code>router ID&lt;/code> is contained in neighbor&amp;rsquo;s hello message $\rightarrow$ Communication is considered to be &lt;strong>bi-directional&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Destination IP address of hello message: &lt;code>224.0.0.5&lt;/code> (multicast address, “&lt;code>AllSPFRouters&lt;/code>”)&lt;/p>
&lt;p>$\rightarrow$ hello message is received and processed only by OSPF routers&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="simplified-workflow">Simplified Workflow&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>A router &lt;strong>periodically&lt;/strong> sends a hello message on all its links&lt;/p>
&lt;ul>
&lt;li>&lt;em>“Hello, I am R1, I am still here”&lt;/em>&lt;/li>
&lt;li>If known, hello message contains &lt;code>router ID&lt;/code> of neighboring router
&lt;ul>
&lt;li>&lt;em>“my neighbor on this link is R2”&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-09%2011.40.56.png" alt="截屏2021-03-09 11.40.56" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>If no hello message is received for a pre-defined period of time $\rightarrow$ the link is considered to be down 🤪&lt;/p>
&lt;ul>
&lt;li>Standard value for periodic hello messages: every 10-30 seconds&lt;/li>
&lt;li>Fast hello extension: &amp;lt; 1 second&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
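&lt;p>The bi-directionality check and the link-down detection described above can be sketched as follows. The &lt;code>Hello&lt;/code> class models only the two router IDs mentioned in these notes, not the real packet format, and the 40-second timeout is an assumed value standing in for the pre-defined period.&lt;/p>

```python
from dataclasses import dataclass

UNKNOWN_ROUTER = "0.0.0.0"  # used while the neighbor's router ID is not yet known


@dataclass
class Hello:
    router_id: str                      # ID of the sending router
    neighbor_id: str = UNKNOWN_ROUTER   # ID of the neighbor on this link, if known


def is_bidirectional(own_id, received):
    """Communication is bi-directional if the neighbor lists our own router ID."""
    return received.neighbor_id == own_id


def link_is_down(now, last_hello_at, dead_interval=40.0):
    """Assumed timeout: the link is down if no hello arrived within dead_interval seconds."""
    return now - last_hello_at > dead_interval


first = Hello(router_id="2.2.2.2")                         # R2 has not heard from R1 yet
later = Hello(router_id="2.2.2.2", neighbor_id="1.1.1.1")  # R2 has seen R1's hello

print(is_bidirectional("1.1.1.1", first))  # False
print(is_bidirectional("1.1.1.1", later))  # True
print(link_is_down(now=100.0, last_hello_at=50.0))  # True
```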
&lt;h3 id="ospf-message">OSPF Message&lt;/h3>
&lt;h4 id="header-of-ospf-messages">&lt;strong>Header&lt;/strong> of OSPF messages&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07 13.27.37.png" alt="截屏2021-03-07 13.27.37" style="zoom:67%;" />
&lt;ul>
&lt;li>&lt;strong>Version&lt;/strong>: OSPF Version, currently 2 for IPv4 and 3 for IPv6&lt;/li>
&lt;li>&lt;strong>Type&lt;/strong>
&lt;ul>
&lt;li>Hello&lt;/li>
&lt;li>database description&lt;/li>
&lt;li>link state request&lt;/li>
&lt;li>link state update&lt;/li>
&lt;li>link state acknowledgement&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Router ID&lt;/strong>: ID of originating router&lt;/li>
&lt;li>&lt;strong>Area ID&lt;/strong>: OSPF area&lt;/li>
&lt;li>&lt;strong>Checksum&lt;/strong>: Internet checksum over entire OSPF message&lt;/li>
&lt;li>&lt;strong>AUType and Authentication&lt;/strong>: Optional authentication of originating router&lt;/li>
&lt;/ul>
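&lt;p>To make the header fields concrete, the following sketch decodes the fixed 24-byte OSPFv2 header with Python's &lt;code>struct&lt;/code> module. The field order follows RFC 2328, including the packet length field that is not listed above; the function name &lt;code>parse_ospf_header&lt;/code> and the example bytes are made up for illustration.&lt;/p>

```python
import struct
from ipaddress import IPv4Address

# OSPFv2 common header: 24 bytes, network byte order
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

OSPF_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgement",
}


def parse_ospf_header(data: bytes) -> dict:
    """Decode the fixed OSPFv2 header at the start of `data`."""
    version, msg_type, length, router_id, area_id, checksum, au_type, auth = \
        OSPF_HEADER.unpack_from(data)
    return {
        "version": version,
        "type": OSPF_TYPES.get(msg_type, "unknown"),
        "length": length,
        "router_id": str(IPv4Address(router_id)),
        "area_id": str(IPv4Address(area_id)),
        "checksum": checksum,
        "au_type": au_type,
        "authentication": auth,
    }


# Hand-built example header: version 2, Hello, router 1.1.1.1 in area 0.0.0.0
raw = OSPF_HEADER.pack(2, 1, 44, bytes([1, 1, 1, 1]), bytes(4), 0, 0, bytes(8))
header = parse_ospf_header(raw)
```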
&lt;h4 id="link-state-update-message">Link State Update Message&lt;/h4>
&lt;p>Structure of a Link State Advertisement&lt;/p>
&lt;ul>
&lt;li>Consists of a &lt;strong>header&lt;/strong> and a &lt;strong>body&lt;/strong>&lt;/li>
&lt;li>&lt;strong>LSA header&lt;/strong>: contains information used to uniquely identify the LSA
&lt;ul>
&lt;li>Advertising router&lt;/li>
&lt;li>Sequence number of LSA at advertising router&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>LSA body&lt;/strong>: contains information of all operational links of the router
&lt;ul>
&lt;li>Associated cost&lt;/li>
&lt;li>Type of link&lt;/li>
&lt;li>Reachability information&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2013.32.39.png" alt="截屏2021-03-07 13.32.39" style="zoom:67%;" />
&lt;p>LSA Header&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>LS Age&lt;/strong>&lt;/p>
&lt;p>Time in seconds since LSA was originated&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Options&lt;/strong>
Optional capabilities supported by OSPF domain&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>LS Type&lt;/strong>
Router LSA, network LSA &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Link State ID&lt;/strong>
Uniquely identifies an LSA&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Advertising Router&lt;/strong>
OSPF router ID of originating router&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>LS Sequence Number&lt;/strong>
Incremented each time a new LSA is generated&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Checksum&lt;/strong>
Over the entire LSA except the LS Age field&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Length&lt;/strong>
#bytes for entire LSA including header&lt;/p>
&lt;/li>
&lt;/ul>
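&lt;p>One way to picture how the header identifies an LSA: in RFC 2328 the triple (LS Type, Link State ID, Advertising Router) names the LSA, while age, sequence number, and checksum distinguish instances of it. The sketch below models this with a hypothetical &lt;code>LSAHeader&lt;/code> tuple; all field values are invented.&lt;/p>

```python
from typing import NamedTuple


class LSAHeader(NamedTuple):
    ls_age: int              # seconds since origination
    options: int
    ls_type: int             # 1 = router LSA, 2 = network LSA, ...
    link_state_id: str
    advertising_router: str
    ls_sequence_number: int
    checksum: int
    length: int              # bytes, header included

    def key(self):
        """Identity of the LSA; the remaining fields tell its instances apart."""
        return (self.ls_type, self.link_state_id, self.advertising_router)


old = LSAHeader(120, 0x02, 1, "1.1.1.1", "1.1.1.1", 7, 0xBEEF, 36)
new = old._replace(ls_age=0, ls_sequence_number=8, checksum=0xCAFE)
same_lsa = old.key() == new.key()   # both are instances of the same LSA
```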
&lt;h3 id="coping-with-dynamic-changes">Coping with Dynamic Changes&lt;/h3>
&lt;h4 id="issuing-lsas">Issuing LSAs&lt;/h4>
&lt;ul>
&lt;li>If nothing changes (link, router), nothing needs to be reported with respect to routing $\rightarrow$ &lt;strong>keep quiet&lt;/strong>
&lt;ul>
&lt;li>LSAs are refreshed every 30 minutes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Besides periodic refreshes, communication is only needed in case of changes
&lt;ul>
&lt;li>Interface changed to up or down&lt;/li>
&lt;li>Neighboring router on link is unreachable&lt;/li>
&lt;li>Configuration changes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Minimum time between two consecutive LSAs of a router is set to 5 seconds (for stability reasons)&lt;/li>
&lt;/ul>
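&lt;p>The timing rules above can be condensed into a single decision function. This is only a sketch: the function name &lt;code>should_originate&lt;/code> and its parameters are assumptions for this example, and the two constants carry the values stated in the list (30-minute refresh, 5-second minimum spacing).&lt;/p>

```python
LS_REFRESH_TIME = 30 * 60  # periodic refresh, in seconds
MIN_LS_INTERVAL = 5        # minimum spacing between two LSAs of a router, in seconds


def should_originate(now, last_originated, state_changed):
    """Decide whether a router may issue a new LSA at time `now` (seconds)."""
    elapsed = now - last_originated
    if MIN_LS_INTERVAL > elapsed:
        return False  # too soon, even if something changed
    return state_changed or elapsed >= LS_REFRESH_TIME
```

&lt;p>For instance, a change 3 seconds after the previous LSA is held back, whereas without any change the function only returns &lt;code>True&lt;/code> once 30 minutes have passed.&lt;/p>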
&lt;h4 id="synchronized-link-state-databases">Synchronized Link State Databases&lt;/h4>
&lt;p>🎯 Goal: link state databases of all routers need to have &lt;strong>identical&lt;/strong> content (need to be &lt;strong>synchronized&lt;/strong>)&lt;/p>
&lt;p>Following actions are needed&lt;/p>
&lt;ul>
&lt;li>Ensure that each LSA is received by every router in the network (&lt;strong>&lt;a href="#reliable-flooding">reliable flooding&lt;/a>&lt;/strong>)&lt;/li>
&lt;li>Ensure that all routers consistently either store or discard each LSA $\rightarrow$ fully deterministic comparison rules&lt;/li>
&lt;li>Ensure that expired LSAs are deleted from link state databases of every router&lt;/li>
&lt;/ul>
&lt;h4 id="reliable-flooding">Reliable Flooding&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Reception of each LSA is &lt;strong>acknowledged&lt;/strong> by neighboring router&lt;/p>
&lt;ul>
&lt;li>Hop-by-hop acknowledgements&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-07%2014.32.37.png" alt="截屏2021-03-07 14.32.37">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Router &lt;strong>R&lt;/strong> stores received LSA&lt;/p>
&lt;ul>
&lt;li>
&lt;p>If R &lt;strong>does not have&lt;/strong> an LSA from the advertising router&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If the received LSA is &lt;strong>newer&lt;/strong> than the one in the database&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="k">def&lt;/span> &lt;span class="nf">is_new_LSA&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">received_LSA&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">cur_stored_LSA&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">MAX_AGE&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">3600&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    &lt;span class="c1"># Simplified rules; LS Age counts seconds and MaxAge is 1 hour = 3600 s (RFC 2328)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    &lt;span class="k">if&lt;/span> &lt;span class="n">received_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sequence_Nr&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="n">cur_stored_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sequence_Nr&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">        &lt;span class="k">return&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    &lt;span class="k">elif&lt;/span> &lt;span class="n">received_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sequence_Nr&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="n">cur_stored_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sequence_Nr&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">        &lt;span class="c1"># Tie-breakers: larger checksum, received copy at MaxAge, or smaller age&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">        &lt;span class="k">if&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">received_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">checksum&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="n">cur_stored_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">checksum&lt;/span> &lt;span class="ow">or&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">                &lt;span class="n">received_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">age&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="n">MAX_AGE&lt;/span> &lt;span class="ow">or&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">                &lt;span class="n">received_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">age&lt;/span> &lt;span class="o">&amp;lt;&lt;/span> &lt;span class="n">cur_stored_LSA&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">age&lt;/span>&lt;span class="p">):&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">            &lt;span class="k">return&lt;/span> &lt;span class="kc">True&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">    &lt;span class="k">return&lt;/span> &lt;span class="kc">False&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>If R stores the LSA, it forwards it to its neighbors&lt;/p>
&lt;ul>
&lt;li>Uses multicast address &lt;code>224.0.0.5&lt;/code> with hop limit of 1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="lsa-flooding-example">LSA Flooding Example&lt;/h4>
&lt;p>Router R receives LSA from advertising router R1&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-07%2014.47.58.png" alt="截屏2021-03-07 14.47.58" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>If &lt;code>received_LSA.age == MAX_AGE&lt;/code> &lt;strong>and&lt;/strong> no LSA from R1 is known&lt;/p>
&lt;p>$\rightarrow$ Send ACK and discard&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If there is no LSA from R1 in database &lt;strong>or&lt;/strong> received LSA is newer $\rightarrow$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Store/replace LSA&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Send ACK&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Update Age and flood LSA to neighbors&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>If already stored copy is newer&lt;/p>
&lt;p>$\rightarrow$ Send stored copy back to advertising router R1&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If LSA and stored copy are the same&lt;/p>
&lt;p>$\rightarrow$ Discard LSA&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Re-compute routes if content of link state database changed&lt;/p>
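&lt;p>The four cases of the flooding example can be written down as one handler. The sketch below is heavily simplified and rests on assumptions made only for this illustration: the database is keyed by advertising router alone, "newer" is decided by the sequence number only, &lt;code>MAX_AGE&lt;/code> is taken as 3600 seconds, and the handler returns action names instead of sending packets.&lt;/p>

```python
from dataclasses import dataclass

MAX_AGE = 3600  # seconds; assumed value of MaxAge (1 hour in RFC 2328)


@dataclass
class LSA:
    advertising_router: str
    sequence_nr: int
    age: int = 0


def is_newer(a, b):
    """Simplified comparison: only the sequence number is considered here."""
    return a.sequence_nr > b.sequence_nr


def handle_lsa(db, lsa):
    """Apply the four cases to database `db`; return the list of actions taken."""
    stored = db.get(lsa.advertising_router)
    if stored is None and lsa.age == MAX_AGE:
        return ["ack", "discard"]
    if stored is None or is_newer(lsa, stored):
        db[lsa.advertising_router] = lsa
        return ["store", "ack", "flood"]
    if is_newer(stored, lsa):
        return ["send_stored_copy_back"]
    return ["discard"]


db = {}
first = handle_lsa(db, LSA("R1", sequence_nr=1))      # unknown so far: store, ack, flood
duplicate = handle_lsa(db, LSA("R1", sequence_nr=1))  # same instance: discard
outdated = handle_lsa(db, LSA("R1", sequence_nr=0))   # stored copy is newer
```

&lt;p>Only the "store" branch changes the database, so only that branch would trigger a route re-computation.&lt;/p>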
&lt;h3 id="ospf-areas">OSPF Areas&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Basic situation: Autonomous systems can grow rather large&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Scalability problem&lt;/p>
&lt;ul>
&lt;li>LSA flooding and&lt;/li>
&lt;li>Route computation overhead&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ do NOT scale 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🔧 Solution: Divide an AS into &lt;strong>areas&lt;/strong> (i.e., introduce additional level of hierarchy)&lt;/p>
&lt;ul>
&lt;li>Apply routing only within an area
&lt;ul>
&lt;li>LSA flooding and route computation limited to an area&lt;/li>
&lt;li>Only routers within the same area have identical link state databases.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Areas exchange summary information with each other
&lt;ul>
&lt;li>Addresses reachable from these areas&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Typical size of an area: &amp;lt;100 routers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="ospf-areas-structure">OSPF Areas structure&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2016.47.53.png" alt="截屏2021-03-08 16.47.53" style="zoom:80%;" />
&lt;ul>
&lt;li>Two levels of hierarchy
&lt;ul>
&lt;li>&lt;strong>Area 0&lt;/strong> – &lt;strong>backbone&lt;/strong> of the autonomous system; the backbone must always be connected&lt;/li>
&lt;li>All other areas are directly connected to the backbone&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Area border routers (ABRs)&lt;/strong> interconnect areas
&lt;ul>
&lt;li>
&lt;p>They belong to both: their area and the backbone&lt;/p>
&lt;/li>
&lt;li>
&lt;p>They run an instance of OSPF for each area they are connected to&lt;/p>
&lt;/li>
&lt;li>
&lt;p>They generate summary LSAs&lt;/p>
&lt;ul>
&lt;li>Contain ABR’s routing table for corresponding area
&lt;ul>
&lt;li>List of destinations reachable within the area&lt;/li>
&lt;li>Associated with path cost from the ABR to destination&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>ABR’s routing table is constructed after intra-area path computation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Handle summary LSAs: Same way as “regular” LSAs&lt;/p>
&lt;ul>
&lt;li>ABRs forward summary LSAs of an area into the backbone&lt;/li>
&lt;li>ABRs forward summary LSAs from the backbone into the area&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="inter-area-forwarding">Inter-Area Forwarding&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Data between areas is forwarded through the backbone (area 0)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>End-to-end path consists of &lt;strong>path segments&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Segment between source and ABR of originating area&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Segment between two ABRs in area 0, and&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Segment between ABR of target area and destination&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Routers within an area select ABRs so that resulting end-to-end path is a &lt;strong>shortest&lt;/strong> path&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Based on path costs of ABRs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2017.02.09.png" alt="截屏2021-03-08 17.02.09" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2017.02.36.png" alt="截屏2021-03-08 17.02.36">&lt;/p>
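&lt;p>The ABR selection rule can be illustrated with a few lines of Python: a router adds its own intra-area cost to each ABR to the cost that this ABR advertises for the destination and picks the minimum. The function &lt;code>select_abr&lt;/code>, the ABR names, and all cost values are hypothetical and not taken from the figures.&lt;/p>

```python
def select_abr(cost_to_abr, abr_cost_to_dest):
    """Pick the ABR that minimizes (cost to reach the ABR) + (cost the ABR advertises)."""
    candidates = {
        abr: cost_to_abr[abr] + abr_cost_to_dest[abr]
        for abr in cost_to_abr
        if abr in abr_cost_to_dest
    }
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


# Hypothetical costs: ABR1 is closer, but ABR2 advertises a much cheaper onward path
cost_to_abr = {"ABR1": 2, "ABR2": 5}
abr_cost_to_dest = {"ABR1": 20, "ABR2": 10}
best_abr, total_cost = select_abr(cost_to_abr, abr_cost_to_dest)
```

&lt;p>Here the nearer ABR1 yields a total of 22, the farther ABR2 a total of 15, so ABR2 is selected.&lt;/p>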
&lt;h3 id="rip-vs-ospf">RIP vs. OSPF&lt;/h3>
&lt;p>&lt;strong>RIP&lt;/strong>: based on distance vector&lt;/p>
&lt;ul>
&lt;li>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>Limited in metric selection and size
&lt;ul>
&lt;li>
&lt;p>Only one metric (hop count)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Maximum path length of 15 hops&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Periodic updates every 30 seconds, even without changes&lt;/li>
&lt;li>Slow convergence, count-to-infinity $\rightarrow$ Not suitable for large networks 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantage: simpler and requires fewer resources than OSPF&lt;/p>
&lt;ul>
&lt;li>Still sometimes used in small networks&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>OSPF&lt;/strong>: based on link-state&lt;/p>
&lt;ul>
&lt;li>Addresses shortcomings of RIP
&lt;ul>
&lt;li>Faster convergence, no count-to-infinity, lower signaling overhead &amp;hellip; 👏&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Large networks can be divided into areas&lt;/li>
&lt;li>Standard in large ASes (together with IS-IS)&lt;/li>
&lt;/ul>
&lt;h2 id="bgp-border-gateway-protocol">BGP: Border Gateway Protocol&lt;/h2>
&lt;blockquote>
&lt;p>Good explanation:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://www.imperva.com/blog/bgp-routing-explained/">BGP for Humans: Making Sense of Border Gateway Protocol&lt;/a>&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;h3 id="exterior-gateway-protocols">Exterior Gateway Protocols&lt;/h3>
&lt;p>In the preceding sections, a large network was divided into different autonomous systems (ASes). For autonomous systems to be able to communicate with each other, each needs at least one special intermediate system that serves as an &lt;strong>interface to other ASes&lt;/strong>.&lt;/p>
&lt;p>👍 Advantages:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Scalability&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Size of routing tables depends on size of AS&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Changes in routing tables are only propagated within an AS&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Autonomy&lt;/strong>
&lt;ul>
&lt;li>Internet = network of networks&lt;/li>
&lt;li>Routing can be controlled in the own network
&lt;ul>
&lt;li>
&lt;p>Uniform interior routing protocol within the AS&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Interior routing protocols of different ASes do not have to be identical&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="border-gateway-protocol">Border Gateway Protocol&lt;/h3>
&lt;ul>
&lt;li>The most important &lt;strong>exterior&lt;/strong> gateway protocol&lt;/li>
&lt;li>&lt;strong>Path vector&lt;/strong> protocol
&lt;ul>
&lt;li>Extension of distance vector approach&lt;/li>
&lt;li>BGP distributes &lt;strong>paths&lt;/strong>, not metrics like costs etc.
&lt;ul>
&lt;li>With paths it is easy to guarantee that no loops exist&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Based on policies of network operator&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="bgp-in-a-nutshell">BGP in a Nutshell&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>What exactly is being distributed?&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Paths&lt;/strong> (also called &lt;strong>routes&lt;/strong>) that consist of
&lt;ul>
&lt;li>Target: &lt;strong>prefixes&lt;/strong> (also called: network, network prefixes, IP address ranges)&lt;/li>
&lt;li>Attributes: path, next hop
&lt;ul>
&lt;li>Each traversed AS adds its own AS number to the path&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Traffic &amp;ldquo;follows&amp;rdquo; UPDATE messages in opposite direction&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2022.01.15.png" alt="截屏2021-03-08 22.01.15">&lt;/p>
&lt;/li>
&lt;/ul>
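&lt;p>A short sketch of the path-vector idea: each AS prepends its own number when passing a route on, and a route whose path already contains the own AS number is rejected, which rules out loops. The &lt;code>Route&lt;/code> class, the helper names, and the next-hop addresses are assumptions for this example; the prefix and AS 100 are borrowed from the homework example.&lt;/p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple   # most recently traversed AS first
    next_hop: str


def accept(route, own_as):
    """Loop prevention: reject any route whose AS path already contains our own AS."""
    return own_as not in route.as_path


def export(route, own_as, own_address):
    """Prepend the own AS number and set the next hop before announcing to a neighbor AS."""
    return Route(route.prefix, (own_as,) + route.as_path, own_address)


origin = Route("1.6.17.0/24", as_path=(100,), next_hop="10.0.0.1")
via_200 = export(origin, own_as=200, own_address="10.0.1.1")
```

&lt;p>If &lt;code>via_200&lt;/code> were ever announced back to AS 100, &lt;code>accept(via_200, 100)&lt;/code> would return &lt;code>False&lt;/code> and the route would be dropped.&lt;/p>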
&lt;details>
&lt;summary>&lt;b>Example: HW07&lt;/b>&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-09%2022.53.41.png" alt="截屏2021-03-09 22.53.41" style="zoom:67%;" />
&lt;p>AS 100 announces prefix &lt;code>1.6.17.0/24&lt;/code>. Describe how the routing information is distributed in the network.&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.54.30.png" alt="截屏2021-03-09 22.54.30">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.54.41.png" alt="截屏2021-03-09 22.54.41">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.54.58.png" alt="截屏2021-03-09 22.54.58">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.55.12.png" alt="截屏2021-03-09 22.55.12">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.55.45.png" alt="截屏2021-03-09 22.55.45">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.55.57.png" alt="截屏2021-03-09 22.55.57">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.56.15.png" alt="截屏2021-03-09 22.56.15">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.58.43.png" alt="截屏2021-03-09 22.58.43">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-09%2022.59.08.png" alt="截屏2021-03-09 22.59.08">&lt;/p>
&lt;p>The other two UPDATE messages (sent from R1 to R31 and R21) are handled in a similar way.&lt;/p>
&lt;/details>
&lt;h3 id="bgp-structure">BGP Structure&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2022.02.02.png" alt="截屏2021-03-08 22.02.02" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>External BGP (EBGP)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Spoken between BGP routers of &lt;strong>neighboring&lt;/strong> ASes&lt;/li>
&lt;li>Announcement and forwarding of path information&lt;/li>
&lt;li>Internal details of AS are NOT exchanged&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Internal BGP (IBGP)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Between BGP routers &lt;strong>within&lt;/strong> an AS&lt;/li>
&lt;li>Synchronization of BGP routers of an AS&lt;/li>
&lt;li>Establishment of transit routes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="categorization-of-routing-protocols">Categorization of Routing Protocols&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2022.05.46.png" alt="截屏2021-03-08 22.05.46" style="zoom: 67%;" />
&lt;h3 id="interplay-of-the-routing-approaches">Interplay of the Routing Approaches&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2022.07.37.png" alt="截屏2021-03-08 22.07.37" style="zoom:80%;" />
&lt;h3 id="routing-with-bgp-and-igps">Routing with BGP and IGPs&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2022.09.40.png" alt="截屏2021-03-08 22.09.40">&lt;/p>
&lt;p>Assume Alice wants to send a packet to an external target (not part of the local IGP domain, e.g., &lt;code>2.3.4.5&lt;/code>).&lt;/p>
&lt;p>How does an IGP router know what to do with this packet?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Is not strictly prescribed by BGP&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Network operators can configure this freely&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different approaches possible&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="approach-1-igp-distributes-default-routes">Approach 1: IGP distributes &amp;ldquo;default&amp;rdquo; routes&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Packets to unknown addresses/prefixes are routed to the &lt;strong>default BGP&lt;/strong> router via the shortest path&lt;/p>
&lt;ul>
&lt;li>Good option for stub ASes&lt;/li>
&lt;li>Not practicable for transit ASes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2022.14.45.png" alt="截屏2021-03-08 22.14.45">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="approach-2-publication-of-external-routes-via-igp">Approach 2: Publication of external routes via IGP&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Allows more fine-grained control such as &lt;em>„all Google traffic goes this way“&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cannot be done with all external routes (scalability!)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Usually combined with default route&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-09%2014.59.38.png" alt="截屏2021-03-09 14.59.38" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="approach-3-igp-router-also-speaks-bgp">Approach 3: IGP router also speaks BGP&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Forwarding table is built from two routing tables (BGP + IGP)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often the case with big backbone providers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2022.25.22.png" alt="截屏2021-03-08 22.25.22">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="bgp-sessions">BGP-Sessions&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Point-to-point&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-08%2022.33.53.png" alt="截屏2021-03-08 22.33.53">&lt;/p>
&lt;ul>
&lt;li>Usually only between &lt;strong>directly connected routers&lt;/strong>
&lt;ul>
&lt;li>Neighbors are called &amp;ldquo;&lt;strong>peers&lt;/strong>&amp;rdquo;&lt;/li>
&lt;li>BGP uses TCP connections between these routers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How to establish TCP connection without working IP routing?&lt;/p>
&lt;ul>
&lt;li>IBGP: IGP of AS can be used&lt;/li>
&lt;li>EBGP
&lt;ul>
&lt;li>Usually direct physical connection $\rightarrow$ no routing required&lt;/li>
&lt;li>Manual configuration at both ends of connection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="ibgp-connections">IBGP Connections&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Simplest case: all BGP routers are fully meshed and connected directly to each other&lt;/p>
&lt;ul>
&lt;li>BGP sessions must be kept active all the time&lt;/li>
&lt;li>Bad scalability 👎&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Alternative: Concentrate IBGP traffic in a single router&lt;/p>
&lt;ul>
&lt;li>Called &lt;strong>route reflector&lt;/strong>&lt;/li>
&lt;li>Only route reflector has to maintain sessions with everyone else&lt;/li>
&lt;li>More than one reflector used in practice for reliability reasons&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Alternative: Form hierarchies of sub ASes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Called &lt;strong>AS confederations&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Can also be used to implement more complex policies&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Confederation appears to outside as single AS&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="bgp-messages">BGP Messages&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>OPEN&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Establishment of BGP connection to peer BGP router&lt;/p>
&lt;ul>
&lt;li>Important: TCP connection must already exist!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Authentication&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>UPDATE&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Announcement of new or withdrawal of outdated path&lt;/li>
&lt;li>Note: only sent when new, better paths are available&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>KEEPALIVE&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Keeps connection alive in absence of UPDATE messages&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Acknowledgment for an OPEN request&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Recommended KeepAliveTimer: 30 s&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>NOTIFICATION&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Error message and tear down of BGP connection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="routing-information-base">Routing Information Base&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>BGP provides mechanisms for distributing path information&lt;/p>
&lt;ul>
&lt;li>Does NOT dictate how routes should be chosen&lt;/li>
&lt;li>No predefined routing metric&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>BGP uses &lt;strong>policies&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>BGP instance of a router collects received and dispatched routing information in various internal tables&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Routing Information Base (RIB)&lt;/strong>&lt;/li>
&lt;li>Mainly for logical structuring&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Structure&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-08%2022.48.36.png" alt="截屏2021-03-08 22.48.36" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Adj-RIB-In (Adjacency RIB Incoming)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Exists per peer&lt;/li>
&lt;li>Stores information received from this peer&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Loc-RIB (Local RIB, Routing Information Base)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>„Actual routing table“
&lt;ul>
&lt;li>Only &lt;strong>preferred (= best = shortest)&lt;/strong> routes to destination networks are included here&lt;/li>
&lt;li>Forwarding Information Base (FIB) is built based on Loc-RIB&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Adj-RIB-Out (Adjacency RIB Outgoing)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Exists per peer&lt;/li>
&lt;li>Contains routes published to this peer&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
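&lt;p>How the tables relate can be sketched as follows: the routes in the per-peer Adj-RIB-In tables are compared and the preferred one per prefix ends up in the Loc-RIB. The sketch uses AS path length as the only criterion, which is an assumption for this illustration; since BGP itself does not dictate how routes are chosen, an operator's policy may rank routes differently. Peer names and paths are invented.&lt;/p>

```python
def build_loc_rib(adj_rib_in):
    """adj_rib_in: {peer: {prefix: as_path}} -> Loc-RIB {prefix: (peer, as_path)}."""
    loc_rib = {}
    for peer, routes in adj_rib_in.items():
        for prefix, as_path in routes.items():
            best = loc_rib.get(prefix)
            # Keep the route with the shortest AS path seen so far
            if best is None or len(best[1]) > len(as_path):
                loc_rib[prefix] = (peer, as_path)
    return loc_rib


adj_rib_in = {
    "R3": {"1.6.17.0/24": (300, 200, 100)},
    "R4": {"1.6.17.0/24": (400, 100)},
}
loc_rib = build_loc_rib(adj_rib_in)
```

&lt;p>The route learned from R4 has the shorter AS path and is therefore the one installed for &lt;code>1.6.17.0/24&lt;/code>.&lt;/p>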
&lt;details>
&lt;summary>&lt;b>Routing Table example: HW08&lt;/b>&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-09%2022.53.41.png" alt="截屏2021-03-09 22.53.41" style="zoom:67%;" />
&lt;p>AS 100 announces prefix &lt;code>1.6.17.0/24&lt;/code>. Fill out the simplified routing table of R5&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.02.53.png" alt="截屏2021-03-10 11.02.53">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.02.56.png" alt="截屏2021-03-10 11.02.56">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.03.24.png" alt="截屏2021-03-10 11.03.24">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.03.43.png" alt="截屏2021-03-10 11.03.43">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.04.07.png" alt="截屏2021-03-10 11.04.07">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2011.04.27.png" alt="截屏2021-03-10 11.04.27">&lt;/p>
&lt;/details>
&lt;h3 id="-challenges">🔴 Challenges&lt;/h3>
&lt;p>BGP &amp;ldquo;struggles&amp;rdquo; with many challenges and problems, e.g.,&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Maintaining scalability&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Security problems&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Zweidimensionale Zufallsvariable</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/2_dim_zufallsvaraible/</link><pubDate>Sun, 05 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/2_dim_zufallsvaraible/</guid><description>&lt;h2 id="verteilungsfunktion-und-dichte">Distribution Function and Density&lt;/h2>
&lt;p>A vector-valued function&lt;/p>
$$
\underline{X}=\underline{X}(\omega): \Omega \rightarrow \mathbb{R}^{2}
$$
&lt;p>that assigns a vector $\underline{x}=\left[\begin{array}{l}x_{1} \\ x_{2}\end{array}\right]$ to every outcome $\omega \in \Omega$ is called a &lt;mark>&lt;strong>multidimensional random variable&lt;/strong>&lt;/mark> (here: two-dimensional) if the preimage of every interval $I_{\underline{a}}=\left(-\infty, a_{1}\right] \times\left(-\infty, a_{2}\right] \subset \mathbb{R}^{2}$ is an event:&lt;/p>
$$
\underline{X}^{-1}\left(I_{\underline{a}}\right) \in \mathfrak{B}, \quad \forall \underline{a} \in \mathbb{R}^{2}.
$$
&lt;h3 id="verteilungsfunktion">Distribution Function&lt;/h3>
&lt;p>The function&lt;/p>
$$
\begin{aligned}
F_{\underline{X}}(\underline{x}) &amp;=F_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right) \\
&amp;=\mathrm{P}\left(X_{1} \leq x_{1}, X_{2} \leq x_{2}\right)
\end{aligned}
$$
&lt;p>of the two-dimensional random variable $\underline{X}$ is called the &lt;mark>&lt;strong>distribution function&lt;/strong>&lt;/mark> of $\underline{X}$.&lt;/p>
&lt;h3 id="dichte">Density&lt;/h3>
&lt;p>The &lt;mark>&lt;strong>density&lt;/strong>&lt;/mark> of the two-dimensional random variable $\underline{X}$ is the mixed second partial derivative of the distribution function $F_{\underline{X}}(\underline{x})$:&lt;/p>
$$
f_{\underline{X}}(\underline{x})=f_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)=\frac{\partial^{2}}{\partial x_{1} \partial x_{2}} F_{X_{1}, X_{2}}\left(x_{1}, x_{2}\right)
$$
&lt;p>If both components are discretely distributed, their “density” is written as&lt;/p>
$$
f_{\underline{X}}(\underline{x})=\sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mathrm{P}\left(X_{1}=x_{1, n}, X_{2}=x_{2, k}\right) \cdot \delta\left(x_{1}-x_{1, n}, x_{2}-x_{2, k}\right)
$$
&lt;p>with the two-dimensional $\delta$ distribution $\delta(x_1, x_2)$ and the individual probabilities $\mathrm{P}\left(X_{1}=x_{1, n}, X_{2}=x_{2, k}\right)$.&lt;/p>
&lt;h2 id="randdichten-und-bedingte-dichten">Marginal and Conditional Densities&lt;/h2>
&lt;p>Let $\underline{X}$ be a two-dimensional random variable with density $f_{\underline{X}}(\underline{x})=f_{\underline{X}}\left(x_{1}, x_{2}\right)$. Then&lt;/p>
$$
\begin{array}{l}
f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} f_{\underline{X}}\left(x_{1}, x_{2}\right) \mathrm{d} x_{2} \\
f_{X_{2}}\left(x_{2}\right)=\int_{-\infty}^{\infty} f_{\underline{X}}\left(x_{1}, x_{2}\right) \mathrm{d} x_{1}
\end{array}
$$
&lt;p>are called the &lt;mark>&lt;strong>marginal densities&lt;/strong>&lt;/mark> of $\underline{X}$.&lt;/p>
&lt;p>Let $\underline{X}$ be a two-dimensional random variable with density $f_{\underline{X}}(x_1, x_2)$, and let $f_{X_1}(x_1) > 0$ and $f_{X_2}(x_2) > 0$. Then&lt;/p>
$$
f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right)=\frac{f_{\underline{X}}\left(x_{1}, x_{2}\right)}{f_{X_{2}}\left(x_{2}\right)}
$$
&lt;p>is called the &lt;mark>&lt;strong>conditional density&lt;/strong>&lt;/mark> of $X_1$ given $X_2 = x_2$.&lt;/p>
$$
f_{X_{2}}\left(x_{2} \mid X_{1}=x_{1}\right)=\frac{f_{\underline{X}}\left(x_{1}, x_{2}\right)}{f_{X_{1}}\left(x_{1}\right)}
$$
&lt;p>is the conditional density of $X_2$ given $X_1 = x_1$.&lt;/p>
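&lt;p>As a small worked example (the density is chosen purely for illustration and is not from the lecture), take $f_{\underline{X}}(x_1, x_2) = x_1 + x_2$ on the unit square $0 \leq x_1, x_2 \leq 1$ and zero elsewhere. The marginal density of $X_2$ and the conditional density of $X_1$ given $X_2 = x_2$ then follow directly from the definitions:&lt;/p>
$$
\begin{aligned}
f_{X_{2}}\left(x_{2}\right) &amp;=\int_{0}^{1}\left(x_{1}+x_{2}\right) \mathrm{d} x_{1}=x_{2}+\frac{1}{2} \\
f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right) &amp;=\frac{x_{1}+x_{2}}{x_{2}+\frac{1}{2}}, \quad 0 \leq x_{1} \leq 1
\end{aligned}
$$
&lt;p>Integrating the conditional density over $x_1 \in [0, 1]$ gives $\frac{\frac{1}{2}+x_{2}}{x_{2}+\frac{1}{2}}=1$, as required for a density.&lt;/p>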
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Law of total probability for densities&lt;/strong>&lt;/p>
$$
f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right) f_{X_{2}}\left(x_{2}\right) \mathrm{d} x_{2}
$$&lt;/span>
&lt;/div>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Bayes' theorem for densities&lt;/strong>&lt;/p>
$$
f_{X_{2}}\left(x_{2} \mid X_{1}=x_{1}\right)=\frac{f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right) f_{X_{2}}\left(x_{2}\right)}{\int_{-\infty}^{\infty} f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right) f_{X_{2}}\left(x_{2}\right) \mathrm{d} x_{2}}
$$&lt;/span>
&lt;/div>
&lt;p>The &lt;mark>&lt;strong>conditional expectation&lt;/strong>&lt;/mark> of a random variable $X_1$ given $X_2 = x_2$ is&lt;/p>
$$
\mathrm{E}_{f_{\underline{X}}}\left\{X_{1} \mid X_{2}=x_{2}\right\}=\int_{-\infty}^{\infty} x_{1} f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right) \mathrm{d} x_{1}
$$
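&lt;p>As a worked example (the density below is an assumption chosen for illustration, not taken from the lecture), let $f_{\underline{X}}(x_1, x_2) = x_1 + x_2$ for $x_1, x_2 \in [0, 1]$ and $0$ otherwise. The marginal density, the conditional density and the conditional expectation then follow directly from the definitions above:&lt;/p>
$$
\begin{array}{l}
f_{X_{2}}\left(x_{2}\right)=\int_{0}^{1}\left(x_{1}+x_{2}\right) \mathrm{d} x_{1}=\frac{1}{2}+x_{2} \\
f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right)=\frac{x_{1}+x_{2}}{\frac{1}{2}+x_{2}} \\
\mathrm{E}\left\{X_{1} \mid X_{2}=x_{2}\right\}=\int_{0}^{1} x_{1} \frac{x_{1}+x_{2}}{\frac{1}{2}+x_{2}} \mathrm{d} x_{1}=\frac{\frac{1}{3}+\frac{x_{2}}{2}}{\frac{1}{2}+x_{2}}=\frac{2+3 x_{2}}{3+6 x_{2}}
\end{array}
$$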
&lt;h2 id="unabhangigkeit-von-zufallsvariablen">Independence of Random Variables&lt;/h2>
&lt;p>Two random variables $X, Y$ are called &lt;mark>&lt;strong>independent&lt;/strong>&lt;/mark> if&lt;/p>
$$
f_{X, Y}(x, y)=f_{X}(x) \cdot f_{Y}(y)
$$
&lt;p>It then also holds that&lt;/p>
$$
f_{X}(x \mid Y=y)=f_{X}(x)
$$
&lt;p>&lt;strong>Expectation&lt;/strong> for two-dimensional random variables:&lt;/p>
$$
\mathrm{E}_{f_{X, Y}}\{g(X, Y)\}=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f_{X, Y}(x, y) \mathrm{d} x \mathrm{~d} y
$$
&lt;p>The &lt;strong>covariance&lt;/strong> $\sigma_{X, Y}=\operatorname{Cov}_{f_{X, Y}}\{X, Y\}$
of two random variables $X$ and $Y$ is&lt;/p>
$$
\sigma_{X, Y}=\operatorname{Cov}_{f_{X, Y}}\{X, Y\}=\mathrm{E}\{(X-\mathrm{E}\{X\}) \cdot(Y-\mathrm{E}\{Y\})\}=\mathrm{E}\left\{\left(X-\mu_{x}\right) \cdot\left(Y-\mu_{y}\right)\right\}
$$
&lt;p>The &lt;strong>correlation coefficient&lt;/strong> of $X$ and $Y$:&lt;/p>
$$
\rho_{X, Y}=\frac{\operatorname{Cov}_{f_{X, Y}}\{X, Y\}}{\sqrt{\operatorname{Var}_{f_{X}}\{X\} \operatorname{Var}_{f_{Y}}\{Y\}}}=\frac{\sigma_{X, Y}}{\sigma_{X} \cdot \sigma_{Y}} \in [-1, 1]
$$
&lt;ul>
&lt;li>is a measure of (linear) &lt;em>similarity&lt;/em> between the random variables $X$ and $Y$
&lt;ul>
&lt;li>$\left|\rho_{X, Y}\right|=1$: $X$ and $Y$ are maximally similar (exact linear relationship)&lt;/li>
&lt;li>$\rho_{X, Y}=0$: $X$ and $Y$ are completely dissimilar in this sense (&lt;em>i.e.&lt;/em>, $X$ and $Y$ are &lt;strong>uncorrelated&lt;/strong>)
&lt;ul>
&lt;li>Independent random variables are uncorrelated. (The converse does NOT hold in general!)&lt;/li>
&lt;li>If $[X, Y]^\top$ has a two-dimensional normal distribution (so that $X$ and $Y$ are each normally distributed), uncorrelatedness $\rho_{X, Y} = 0$ also implies that $X$ and $Y$ are independent&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
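&lt;p>A minimal Python sketch of the point that uncorrelated does not imply independent. The example distribution ($X$ uniform on $-1, 0, 1$ and $Y = X^2$) and the helper names are assumptions chosen for illustration; they do not come from the lecture.&lt;/p>

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Covariance E{(X - mu_x)(Y - mu_y)} for equally likely value pairs."""
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Correlation coefficient rho = sigma_xy / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# X takes the values -1, 0, 1 with equal probability.
xs = [-1.0, 0.0, 1.0]

# Y = X^2 is completely determined by X, yet the two are uncorrelated.
rho_dependent = correlation(xs, [x * x for x in xs])

# Y = 2X + 1 is an exact linear relationship, so rho is 1.
rho_linear = correlation(xs, [2 * x + 1 for x in xs])

print(rho_dependent, rho_linear)
```

&lt;p>The first value is zero although $Y$ is a deterministic function of $X$, which is exactly the case the warning above refers to.&lt;/p>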
&lt;p>If $\underline{X}=\left[X_{1}, X_{2}, \ldots, X_{N}\right]^{\top}$
is an $N$-dimensional random vector, its &lt;strong>covariance matrix&lt;/strong> is&lt;/p>
$$
\begin{array}{l}
\operatorname{Cov}_{f_{\underline{X}}}\{\underline{X}\}=\mathrm{E}_{f_{\underline{X}}}\left\{(\underline{X}-\underline{\mu})(\underline{X}-\underline{\mu})^{\top}\right\}\\
\newline
=\left[\begin{array}{cccc}
\operatorname{Var}_{X_{1}}\left\{X_{1}\right\} &amp; \operatorname{Cov}_{X_{1}, X_{2}}\left\{X_{1}, X_{2}\right\} &amp; \cdots &amp; \operatorname{Cov}_{X_{1}, X_{N}}\left\{X_{1}, X_{N}\right\} \\
\operatorname{Cov}_{X_{2}, X_{1}}\left\{X_{2}, X_{1}\right\} &amp; \operatorname{Var}_{X_{2}}\left\{X_{2}\right\} &amp; \cdots &amp; \operatorname{Cov}_{X_{2}, X_{N}}\left\{X_{2}, X_{N}\right\} \\
\vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
\operatorname{Cov}_{X_{N}, X_{1}}\left\{X_{N}, X_{1}\right\} &amp; \operatorname{Cov}_{X_{N}, X_{2}}\left\{X_{N}, X_{2}\right\} &amp; \cdots &amp; \operatorname{Var}_{X_{N}}\left\{X_{N}\right\}
\end{array}\right]\\
\newline
=\left[\begin{array}{cccc}
\sigma_{X_{1}}^{2} &amp; \rho_{X_{1}, X_{2}} \sigma_{X_{1}} \sigma_{X_{2}} &amp; \cdots &amp; \rho_{X_{1}, X_{N}} \sigma_{X_{1}} \sigma_{X_{N}} \\
\rho_{X_{2}, X_{1}} \sigma_{X_{2}} \sigma_{X_{1}} &amp; \sigma_{X_{2}}^{2} &amp; \cdots &amp; \rho_{X_{2}, X_{N}} \sigma_{X_{2}} \sigma_{X_{N}} \\
\vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
\rho_{X_{N}, X_{1}} \sigma_{X_{N}} \sigma_{X_{1}} &amp; \rho_{X_{N}, X_{2}} \sigma_{X_{N}} \sigma_{X_{2}} &amp; \cdots &amp; \sigma_{X_{N}}^{2}
\end{array}\right]
\end{array}
$$
&lt;details class="spoiler " id="spoiler-19">
&lt;summary class="cursor-pointer">Detail&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-05%2018.47.05.png" alt="截屏2022-06-05 18.47.05">
&lt;/div>
&lt;/details>
&lt;p>A covariance matrix is always &lt;strong>symmetric&lt;/strong> and &lt;strong>positive &lt;a href="https://de.wikipedia.org/wiki/Definitheit">semidefinite&lt;/a>&lt;/strong>; it is positive definite exactly when no nontrivial linear combination of the components of $\underline{X}$ has zero variance.&lt;/p></description></item><item><title>Label Switching</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/04-label_switching/</link><pubDate>Wed, 10 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/04-label_switching/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Label_Switching%20%281%29.png"
alt="Summary of Label Switching">&lt;figcaption>
&lt;p>Summary of Label Switching&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Issues related to IP based routing&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Lookup is rather complex&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Longest matching prefix $\rightarrow$ high performance forwarding needed&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Shortest path routing selects shortest path to destination&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Multiple paths to destination cannot be utilized concurrently $\rightarrow$ traffic engineering desirable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Strictly packet based&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Each IP datagram is handled individually – no support for data streams (flows) 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="flows">Flows&lt;/h2>
&lt;h3 id="what-is-a-flow">What is a flow?&lt;/h3>
&lt;p>A &lt;mark>flow&lt;/mark> is a sequence of packets traversing a network that share a set of header field values.&lt;/p>
&lt;p>Different levels of granularity possible, e.g.,&lt;/p>
&lt;ul>
&lt;li>All packets belonging to a particular TCP connection&lt;/li>
&lt;li>HTTPS traffic&lt;/li>
&lt;li>VoIP traffic
&lt;ul>
&lt;li>Of a particular sender&lt;/li>
&lt;li>Within a network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-10%2013.28.11.png" alt="截屏2021-03-10 13.28.11" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-10%2013.28.35.png" alt="截屏2021-03-10 13.28.35" style="zoom:67%;" />
&lt;h3 id="flow-based-forwarding">Flow Based Forwarding&lt;/h3>
&lt;ul>
&lt;li>Fundamental concept, independent of certain layers
&lt;ul>
&lt;li>Can span multiple layers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Incorporates classic routing/forwarding concepts&lt;/li>
&lt;li>Goes beyond classic concepts&lt;/li>
&lt;/ul>
&lt;h3 id="aggregation">Aggregation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Micro-flows&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Consider a single “connection” e.g., a TCP connection&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Fine grained control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High number of flows possible&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Macro-flows&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Higher level of aggregation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Aggregation of several “connections”&lt;/p>
&lt;ul>
&lt;li>e.g., IP destination address in specific subnet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Lower number of flows&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
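&lt;p>The two aggregation levels can be sketched in Python as two different flow keys over the same packets. The packet tuples, the /24 prefix length and the function names are hypothetical and only serve as an illustration.&lt;/p>

```python
import ipaddress
from collections import Counter

# Hypothetical packets: (src IP, dst IP, protocol, src port, dst port)
packets = [
    ("10.0.0.1", "192.168.1.10", "TCP", 40001, 443),
    ("10.0.0.1", "192.168.1.10", "TCP", 40001, 443),
    ("10.0.0.2", "192.168.1.20", "TCP", 40002, 443),
    ("10.0.0.3", "192.168.2.30", "UDP", 5004, 5004),
]


def micro_flow_key(packet):
    """Micro-flow: a single 'connection', identified by the full 5-tuple."""
    return packet


def macro_flow_key(packet, prefix_len=24):
    """Macro-flow: all packets whose destination lies in the same subnet."""
    dst = packet[1]
    return ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)


micro_flows = Counter(micro_flow_key(p) for p in packets)
macro_flows = Counter(macro_flow_key(p) for p in packets)

print(len(micro_flows), "micro-flows")
print(len(macro_flows), "macro-flows")
```

&lt;p>The finer key yields more flows (and therefore more state), while the coarser key folds several connections into one entry.&lt;/p>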
&lt;h2 id="label-switching">Label Switching&lt;/h2>
&lt;h3 id="classification-of-communication-networks">Classification of Communication Networks&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2013.32.22.png" alt="截屏2021-03-10 13.32.22">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2013.32.41.png" alt="截屏2021-03-10 13.32.41">&lt;/p>
&lt;h3 id="label-switching-1">Label Switching&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Combination of&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Packet switching&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Packets are forwarded individually (data path is NOT fixed)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Packets include metadata needed for forwarding decision&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Circuit switching&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Paths established for flows through the network (data path is fixed)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Simple forwarding decision&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Differentiation of flows possible&lt;/p>
&lt;ul>
&lt;li>Load balancing&lt;/li>
&lt;li>Quality of service (QoS)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Implementation&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Switching&lt;/strong> at layer 2, instead of routing at layer 3&lt;/li>
&lt;li>&lt;strong>Labels&lt;/strong>: Identification which is only locally valid&lt;/li>
&lt;li>&lt;strong>Virtual circuits&lt;/strong>: Sequence of labels&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="label">Label&lt;/h3>
&lt;ul>
&lt;li>Short unstructured identification of fixed length
&lt;ul>
&lt;li>
&lt;p>Does NOT carry any layer-3-information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Unique: only locally at the corresponding switch&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Label swapping: Mapping from input label to output label&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Virtual circuit: Identified through sequence of labels at the path&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2016.55.43.png" alt="截屏2021-03-10 16.55.43">&lt;/p>
&lt;h3 id="transport-of-label">Transport of Label&lt;/h3>
&lt;p>Label must be transported within the packet&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Additional “header” in the packet, between headers of layer 2 and layer 3 $\rightarrow$ &lt;strong>layer 2.5&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.10.57.png" alt="截屏2021-03-10 17.10.57">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Alternative: In specialized fields within existing packet headers&lt;/p>
&lt;ul>
&lt;li>IPv6: flow label (20 bit field in IPv6 header, to identify micro flows more easily)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="label-switching-domain">Label Switching Domain&lt;/h3>
&lt;p>Basic architecture&lt;/p>
&lt;ul>
&lt;li>Border of the domain (&lt;strong>edge devices&lt;/strong>)
&lt;ul>
&lt;li>
&lt;p>Add / remove label&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Map flow to forwarding class&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Access control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Within the domain (&lt;strong>switching device&lt;/strong>)
&lt;ul>
&lt;li>Forward packets based on label information&lt;/li>
&lt;li>Label swapping&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.12.48.png" alt="截屏2021-03-10 17.12.48">&lt;/p>
&lt;h3 id="label-forwarding-information-base">Label Forwarding Information Base&lt;/h3>
&lt;p>Forwarding table in case of label switching: Efficient access through label (NO longest prefix matching needed).&lt;/p>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-10%2017.14.25.png" alt="截屏2021-03-10 17.14.25" style="zoom:80%;" />
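&lt;p>A minimal Python sketch of such a table, assuming one label space per switch: the lookup is an exact match on the incoming label, followed by label swapping. The labels, interface names and the function name are hypothetical and are not taken from the figure.&lt;/p>

```python
# Hypothetical LFIB of one switch:
# incoming label maps to (outgoing interface, outgoing label).
lfib = {
    17: ("if2", 42),
    23: ("if1", 17),
}


def forward(in_label, payload):
    """Exact-match lookup on the label, then swap it for the outgoing label."""
    out_interface, out_label = lfib[in_label]  # no longest prefix matching
    return out_interface, out_label, payload


hop = forward(17, "IP datagram")
print(hop)
```

&lt;p>Because the key is a short fixed-length value, the lookup is a single table access rather than a search over prefixes of different lengths.&lt;/p>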
&lt;h2 id="multiprotocol-label-switching-mpls">Multiprotocol Label Switching (MPLS)&lt;/h2>
&lt;h3 id="general-aspects">General Aspects&lt;/h3>
&lt;p>&lt;strong>MPLS&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Based on label switching&lt;/li>
&lt;li>Originally: data plane optimization&lt;/li>
&lt;li>Standardized within the IETF&lt;/li>
&lt;li>Increasingly applied in larger autonomous systems&lt;/li>
&lt;li>Main Features
&lt;ul>
&lt;li>Fast forwarding (due to reduced amount of packet processing)&lt;/li>
&lt;li>QoS support
&lt;ul>
&lt;li>Guarantees on latency and capacity, e.g., for voice traffic&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Traffic engineering
&lt;ul>
&lt;li>Supports load balancing in order to optimize network utilization &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Virtual private networks
&lt;ul>
&lt;li>Isolate traffic from other packets on the Internet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Multiple networks support
&lt;ul>
&lt;li>Usable on different network technologies, e.g., IP, ATM &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Clear separation of forwarding (label switching) and control (manipulation of label binding)&lt;/li>
&lt;li>Not limited to IP&lt;/li>
&lt;li>Support of metrics&lt;/li>
&lt;li>Versatile concept&lt;/li>
&lt;li>Scales&lt;/li>
&lt;/ul>
&lt;h3 id="architecture-components-and-basic-operation">Architecture, Components and Basic Operation&lt;/h3>
&lt;h4 id="architecture">Architecture&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.20.17.png" alt="截屏2021-03-10 17.20.17">&lt;/p>
&lt;h4 id="components">Components&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Label-switching router (LSR)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>MPLS-capable IP router&lt;/p>
&lt;ul>
&lt;li>Can forward packets based on both IP prefixes and MPLS labels&lt;/li>
&lt;li>Typically: IP for control plane and MPLS for data plane&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Architecture:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-10%2017.35.34.png" alt="截屏2021-03-10 17.35.34" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Label edge router (LER)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Router at the edge of an MPLS domain
&lt;ul>
&lt;li>Each LSR with a non-MPLS capable neighbor is an LER&lt;/li>
&lt;li>Also called: label ingress router resp. label egress router&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Classifies packets that enter the MPLS domain
&lt;ul>
&lt;li>&lt;a href="#forwarding-equivalence-classs">Forwarding equivalence class (FEC)&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>MPLS-Node&lt;/strong>: General term for MPLS-capable intermediate systems, like LSRs&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="forwarding-equivalence-classs">Forwarding Equivalence Class&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Class of packets that should be treated &lt;strong>equally&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Same path through the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Same QoS properties&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Basis for label assignment&lt;/p>
&lt;/li>
&lt;li>
&lt;p>MPLS-specific term, roughly comparable to “flow”&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Same address prefix and same type-of-service field&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Same IP addresses and same port numbers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>VoIP traffic with destination address in subnet X&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Granularity&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Coarse-grained&lt;/strong>: Important for quick forwarding and scalability&lt;/li>
&lt;li>&lt;strong>Fine-grained&lt;/strong>: Important for differentiated treatment of packets or flows&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Example 1: Very fine-grained FEC (“micro flow”)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A single TCP connection, identified by 5-tuple&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.41.47.png" alt="截屏2021-03-10 17.41.47">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Example 2: Differentiation of data streams&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.42.35.png" alt="截屏2021-03-10 17.42.35">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Traffic engineering&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Usage of different paths&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Goals&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Load balancing&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Utilization of all available resources&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Prioritization of individual data streams&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>(realized through separate virtual connections)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Support of quality of service&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different quality of service for different data streams&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="label-switched-path">Label Switched Path&lt;/h4>
&lt;p>&lt;em>Virtual&lt;/em> connection: Sequence of labels on a path through MPLS domain.&lt;/p>
&lt;p>Example:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.45.30.png" alt="截屏2021-03-10 17.45.30">&lt;/p>
&lt;h4 id="mpls-label">MPLS-Label&lt;/h4>
&lt;p>Encapsulation: &lt;strong>Between headers of layer 2 (Data Link layer) and layer 3 (Network layer)&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.47.06.png" alt="截屏2021-03-10 17.47.06">&lt;/p>
&lt;ul>
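&lt;li>Label (20 bits): the label itself&lt;/li>
&lt;li>Exp (3 bits): Bits for experimental usage (renamed to Traffic Class in RFC 5462)&lt;/li>
&lt;li>S (1 bit): Stack bit, set for the last (bottom) entry of the label stack&lt;/li>
&lt;li>TTL (8 bits): Time-to-live&lt;/li>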
&lt;/ul>
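&lt;p>The four fields can be packed into one 32-bit word. The following Python sketch assumes the standard layout (label 20 bits, Exp 3 bits, S 1 bit, TTL 8 bits, in that order); the function names are hypothetical, and the arithmetic uses multiplication and division instead of bit shifts purely for readability.&lt;/p>

```python
import struct


def encode_entry(label, exp, s, ttl):
    """Pack one label stack entry into 4 bytes in network byte order."""
    for value, size in ((label, 2**20), (exp, 8), (s, 2), (ttl, 256)):
        if value not in range(size):
            raise ValueError("field out of range")
    # Equivalent to shifting: label by 12 bits, exp by 9 bits, s by 8 bits.
    word = ((label * 8 + exp) * 2 + s) * 256 + ttl
    return struct.pack("!I", word)


def decode_entry(data):
    """Split the 4 bytes back into (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data)
    word, ttl = divmod(word, 256)
    word, s = divmod(word, 2)
    label, exp = divmod(word, 8)
    return label, exp, s, ttl


entry = encode_entry(label=5, exp=0, s=1, ttl=64)
print(entry.hex())
print(decode_entry(entry))
```

&lt;p>With the stack bit set, this entry is the last one before the layer 3 header.&lt;/p>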
&lt;h3 id="label-distribution">Label Distribution&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Label Binding&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Associate specific label to FEC&lt;/li>
&lt;li>Stored in &lt;strong>label forwarding information base&lt;/strong>
&lt;ul>
&lt;li>Used as &lt;em>incoming&lt;/em> label&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Label distribution&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Label binding is distributed to neighboring routers&lt;/li>
&lt;li>Stored in &lt;strong>label forwarding information base&lt;/strong>
&lt;ul>
&lt;li>Used as &lt;em>outgoing&lt;/em> label&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="types-of-label-distribution">Types of Label Distribution&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>“Roles” of a label-switching router&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-10%2017.56.41.png" alt="截屏2021-03-10 17.56.41">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Downstream LSR&lt;/strong>: &lt;em>In&lt;/em> direction of data flow&lt;/li>
&lt;li>&lt;strong>Upstream LSR&lt;/strong>: &lt;em>Against&lt;/em> direction of data flow&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Unsolicited downstream&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Router generates label bindings as soon as it is ready to forward MPLS packets of the respective FEC
&lt;ul>
&lt;li>Upstream neighbors (according to IP routing): update forwarding tables
&lt;ul>
&lt;li>Label used as outgoing label&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Non-upstream neighbors can store label for later use
&lt;ul>
&lt;li>Quicker reactions on route changes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Downstream on demand&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Downstream router generates label binding on demand&lt;/li>
&lt;li>Upstream router has to request label binding for FEC&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="label-distribution-protocol">Label Distribution Protocol&lt;/h4>
&lt;h5 id="rsvp-resource-reservation-protocol">&lt;strong>RSVP (Resource ReserVation Protocol)&lt;/strong>&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: bandwidth reservation for end-to-end data streams&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Soft state principle&lt;/p>
&lt;ul>
&lt;li>Establish a session and periodically signal that session is still alive&lt;/li>
&lt;li>In case of failure state is automatically removed after some time&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Signaling&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-11%2013.07.14.png" alt="截屏2021-03-11 13.07.14">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Path message&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>From sender to receiver&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Find path to receiver&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each hop is recorded in the message&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Resv message&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>From receiver to sender&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Bandwidth reservation on return path&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="rsvp-te-traffic-engineering">&lt;strong>RSVP-TE (Traffic Engineering)&lt;/strong>&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Extension to RSVP to support label distribution&lt;/p>
&lt;ul>
&lt;li>Many additional fields and functionality, e.g., fast reroute&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Signaling&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-11%2013.08.51.png" alt="截屏2021-03-11 13.08.51">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Path message&lt;/strong>
&lt;ul>
&lt;li>From upstream LER to downstream LER&lt;/li>
&lt;li>Label request&lt;/li>
&lt;li>Source route (“explicit route”) [optional]&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Resv message&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>In response to path message&lt;/p>
&lt;/li>
&lt;li>
&lt;p>From downstream LER to upstream LER&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Label binding (hop-by-hop)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
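&lt;p>The two-phase signaling can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual protocol: the Path message only records hops, the Resv message only carries label bindings, and the router names, label values and the callback name are hypothetical.&lt;/p>

```python
def setup_lsp(explicit_route, allocate_label):
    """Return the incoming and outgoing label of each router for one LSP.

    explicit_route: routers from upstream LER to downstream LER.
    allocate_label: hypothetical callback, router name -> locally free label.
    """
    # Path message: travels downstream and records every hop.
    recorded_hops = []
    for router in explicit_route:
        recorded_hops.append(router)

    # Resv message: travels upstream along the recorded hops. Each downstream
    # router binds an incoming label and announces it to its upstream
    # neighbour, which stores it as outgoing label.
    tables = {router: {"in": None, "out": None} for router in recorded_hops}
    hop_pairs = list(zip(recorded_hops, recorded_hops[1:]))
    for upstream, downstream in reversed(hop_pairs):
        label = allocate_label(downstream)
        tables[downstream]["in"] = label
        tables[upstream]["out"] = label
    return tables


free_labels = {"LSR-Z": 7, "LER-P": 1}  # hypothetical local choices
tables = setup_lsp(["LER-KA", "LSR-Z", "LER-P"], free_labels.get)
print(tables)
```

&lt;p>Labels are assigned by the downstream router of each link, which is why the bindings appear in the order of the Resv message, from the downstream LER back to the upstream LER.&lt;/p>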
&lt;h3 id="virtual-private-networks">Virtual Private Networks&lt;/h3>
&lt;ul>
&lt;li>MPLS is useful for virtual private networks (VPNs)&lt;/li>
&lt;li>Use case: VPN traffic engineering
&lt;ul>
&lt;li>Customer with sites at different locations (e.g., different cities) wants to lease seamless “network” service&lt;/li>
&lt;li>Requirements
&lt;ul>
&lt;li>
&lt;p>Connect physically remote locations&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Carry IP-based intranet traffic&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each customer has obtained an IP address block&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Guaranteed bandwidth / SLAs&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Options
&lt;ul>
&lt;li>“Dark fibre” provider&lt;/li>
&lt;li>VPN backbone provider&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="example-private-networks-over-dark-fibre">Example: Private Networks over “Dark Fibre”&lt;/h4>
&lt;p>Suppose that three companies have sites at remote locations&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Company A: Karlsruhe, Paris, Zürich&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Company B: Karlsruhe, Paris&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Company C: Karlsruhe, Paris&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Each company runs a private network&lt;/p>
&lt;ul>
&lt;li>Different subnet for each site from the customer's IP address space&lt;/li>
&lt;li>Router connects site to other site(s)&lt;/li>
&lt;li>Data is transported over leased fiber optic cables (“dark fibre”)
&lt;ul>
&lt;li>Capacity 155 Mbit/s, utilization marked in graph&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-11%2015.25.04-20210311155949753.png" alt="截屏2021-03-11 15.25.04">&lt;/p>
&lt;p>A provider uses MPLS to offer virtual private networks&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Has “points of presence” (PoPs) in all three cities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Offers bandwidth at arbitrary rates&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Is cheaper than leasing fiber optic cables&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.39.48.png" alt="截屏2021-03-11 15.39.48" style="zoom:67%;" />
&lt;p>Question: Can the provider serve the needs of all three companies?&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.40.12.png" alt="截屏2021-03-11 15.40.12" style="zoom:67%;" />
&lt;p>The answer is: YES! By utilizing &lt;strong>non-shortest paths&lt;/strong>!&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.40.49.png" alt="截屏2021-03-11 15.40.49" style="zoom:67%;" />
&lt;p>We can achieve that using &lt;strong>VPNs implemented by Label Switching&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Outer label: identifies path to LER&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Inner label: identifies VPN instance / customer&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>For company A:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.42.49.png" alt="截屏2021-03-11 15.42.49" style="zoom:67%;" />
&lt;ul>
&lt;li>Inner label $5$: Indicates that this packet belongs to company A&lt;/li>
&lt;li>Outer labels $2, 7, 1$: swapped hop by hop (label switching/swapping)&lt;/li>
&lt;/ul>
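&lt;p>A small Python sketch of the two-level label stack for company A, assuming the outer label takes the values $2$, $7$, $1$ on successive hops while the inner label $5$ stays untouched. The function names and the dictionary representation of a packet are hypothetical.&lt;/p>

```python
# The stack is a list whose last element is the top (outer) label.

def ingress_push(payload, vpn_label, path_label):
    """Ingress LER: push the inner VPN label, then the outer path label."""
    return {"stack": [vpn_label, path_label], "payload": payload}


def swap_outer(labelled, new_outer):
    """LSR inside the domain: swap only the outer label."""
    stack = labelled["stack"][:-1] + [new_outer]
    return {"stack": stack, "payload": labelled["payload"]}


def egress_pop(labelled):
    """Egress LER: remove the labels; the inner label selects the customer."""
    vpn_label = labelled["stack"][0]
    return vpn_label, labelled["payload"]


labelled = ingress_push("IP packet of company A", vpn_label=5, path_label=2)
for next_outer in (7, 1):
    labelled = swap_outer(labelled, next_outer)

customer, payload = egress_pop(labelled)
print(labelled["stack"], customer)
```

&lt;p>Only the outer label is involved in forwarding inside the domain, so the switching devices need no knowledge of individual customers.&lt;/p>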
&lt;p>For company B:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.45.11.png" alt="截屏2021-03-11 15.45.11" style="zoom:67%;" />
&lt;p>For company C:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.45.33.png" alt="截屏2021-03-11 15.45.33" style="zoom:67%;" />
&lt;h4 id="label-distribution-1">Label Distribution&lt;/h4>
&lt;p>Recall VPN example from above&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.40.49-20210311160013646.png" alt="截屏2021-03-11 15.40.49" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>LSP for customer B (Karlsruhe $\rightarrow$ Paris) should take a “detour” over Zürich to meet the bandwidth requirements&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Setup of LSPs over explicitly given route with &lt;a href="#rsvp-te-traffic-engineering">RSVP-TE&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Example: LSP “Karlsruhe to Paris over Zürich”
&lt;ul>
&lt;li>RSVP-TE signaling initiated at upstream LER (LER-KA)&lt;/li>
&lt;li>Note: LSPs are unidirectional!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>How are the labels distributed?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>LER-KA1 (upstream) sends Path Message to LER-P (downstream).&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.58.04.png" alt="截屏2021-03-11 15.58.04" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.58.16.png" alt="截屏2021-03-11 15.58.16" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.58.28.png" alt="截屏2021-03-11 15.58.28" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2015.58.48.png" alt="截屏2021-03-11 15.58.48" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>LER-P receives the Path Message and sends a Resv Message back.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2016.06.48.png" alt="截屏2021-03-11 16.06.48" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2016.07.23.png" alt="截屏2021-03-11 16.07.23" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2016.07.34.png" alt="截屏2021-03-11 16.07.34" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-11%2016.07.58.png" alt="截屏2021-03-11 16.07.58" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Notice that label $2$ appears both in the 5th step and in the 8th step. This is valid because labels are only &lt;strong>locally&lt;/strong> valid: each label is unique only at the switch that assigned it.&lt;/span>
&lt;/div>
&lt;h2 id="resource">Resource&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>MPLS - Multiprotocol Label Switching (2.5 layer protocol)&lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/BuIWNecUAE8?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;/li>
&lt;/ul></description></item><item><title>Differenzierensregeln für Matrizen</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/matrix_differenzieren/</link><pubDate>Fri, 17 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/matrix_differenzieren/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>For a matrix $\mathbf{C}$, the following holds:&lt;/p>
$$
\frac{\partial}{\partial \mathbf{C}}\left(\underline{a}^{\top} \cdot \mathbf{C} \cdot \underline{b}\right)=\underline{a} \cdot \underline{b}^{\top}
$$&lt;/span>
&lt;/div>
&lt;p>Example&lt;/p>
$$
Q=\underbrace{\left[\begin{array}{ll}
a_{1} &amp; a_{2}
\end{array}\right]}_{\boldsymbol{a}^\top}\left[\begin{array}{ll}
c_{11} &amp; c_{12} \\
c_{21} &amp; c_{22}
\end{array}\right]\underbrace{\left[\begin{array}{l}
b_{1} \\
b_{2}
\end{array}\right]}_{\boldsymbol{b}}=a_{1} b_{1} c_{11}+a_{2} b_{1} c_{21}+a_{1} b_{2} c_{12}+a_{2} b_{2} c_{22}
$$
$$
\frac{\partial Q}{\partial \mathbf{C}}=\left[\begin{array}{ll}
\frac{\partial Q}{\partial c_{11}} &amp; \frac{\partial Q}{\partial c_{12}} \\
\frac{\partial Q}{\partial c_{21}} &amp; \frac{\partial Q}{\partial c_{22}}
\end{array}\right]=\left[\begin{array}{ll}
a_{1} b_{1} &amp; a_{1} b_{2} \\
a_{2} b_{1} &amp; a_{2} b_{2}
\end{array}\right]=\left[\begin{array}{l}
a_{1} \\
a_{2}
\end{array}\right]\left[\begin{array}{ll}
b_{1} &amp; b_{2}
\end{array}\right]
$$
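&lt;p>The rule can be checked numerically. The following Python sketch compares central finite differences of $Q = \underline{a}^{\top} \mathbf{C} \underline{b}$ with the outer product $\underline{a} \cdot \underline{b}^{\top}$; the concrete numbers and the function names are arbitrary assumptions for illustration.&lt;/p>

```python
def quadratic_form(a, C, b):
    """Q = a^T C b for lists a, b and a nested-list matrix C."""
    return sum(a[i] * C[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))


def numerical_gradient(a, C, b, h=1e-6):
    """Central finite differences of Q with respect to every entry c_ij."""
    grad = [[0.0] * len(b) for _ in a]
    for i in range(len(a)):
        for j in range(len(b)):
            plus = [row[:] for row in C]
            minus = [row[:] for row in C]
            plus[i][j] += h
            minus[i][j] -= h
            diff = quadratic_form(a, plus, b) - quadratic_form(a, minus, b)
            grad[i][j] = diff / (2 * h)
    return grad


a = [1.0, 2.0]
b = [3.0, -1.0]
C = [[0.5, 1.5], [-2.0, 4.0]]

numeric = numerical_gradient(a, C, b)
analytic = [[a_i * b_j for b_j in b] for a_i in a]  # outer product a b^T

print(numeric)
print(analytic)
```

&lt;p>Since $Q$ is linear in the entries of $\mathbf{C}$, the two results agree up to rounding error, and the gradient does not depend on $\mathbf{C}$ itself.&lt;/p>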
&lt;p>For a symmetric matrix $\mathbf{C}$:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>With $\underline{a}=\underline{e}$ and $\underline{b} = D \cdot \underline{e}$:&lt;/p>
$$
\frac{\partial}{\partial \mathbf{C}} (\underline{e}^\top \mathbf{C} D \underline{e}) = \underline{e} \cdot \underline{e}^\top \cdot D^\top
$$
&lt;/li>
&lt;li>
&lt;p>With $\underline{a}=D \cdot \underline{e}$ and $\underline{b} = \underline{e}$:&lt;/p>
$$
\frac{\partial}{\partial \mathbf{C}} (\underline{e}^\top D^\top \mathbf{C} \underline{e}) = D\cdot \underline{e}\cdot \underline{e}^\top
$$
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">$$
\frac{\partial}{\partial \mathbf{K}}\left(\boldsymbol{a}^{\top} \cdot \mathbf{K} \cdot \mathbf{C} \cdot \mathbf{K}^{\top} \boldsymbol{b} \right)=\boldsymbol{a} \boldsymbol{b}^{\top} \mathbf{K} \mathbf{C}^{\top}+\boldsymbol{b} \boldsymbol{a}^{\top} \mathbf{K} \mathbf{C}
$$
&lt;/span>
&lt;/div>
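&lt;p>A short derivation sketch of this rule, using only the product rule and the identity $\frac{\partial}{\partial \mathbf{C}}\left(\underline{a}^{\top} \mathbf{C}\, \underline{b}\right)=\underline{a}\, \underline{b}^{\top}$ from the beginning of this page. The helper vectors $\boldsymbol{v}$ and $\boldsymbol{u}$ are introduced here for illustration only; each is held fixed while differentiating with respect to the remaining occurrence of $\mathbf{K}$.&lt;/p>
&lt;p>First occurrence of $\mathbf{K}$ (the factor $\mathbf{K}^{\top}$ is treated as constant):&lt;/p>
$$
\boldsymbol{v} := \mathbf{C} \mathbf{K}^{\top} \boldsymbol{b} \quad\Rightarrow\quad \frac{\partial}{\partial \mathbf{K}}\left(\boldsymbol{a}^{\top} \mathbf{K} \boldsymbol{v}\right) = \boldsymbol{a} \boldsymbol{v}^{\top} = \boldsymbol{a} \boldsymbol{b}^{\top} \mathbf{K} \mathbf{C}^{\top}
$$
&lt;p>Second occurrence of $\mathbf{K}$: a scalar equals its own transpose, so $\boldsymbol{a}^{\top} \mathbf{K} \mathbf{C} \mathbf{K}^{\top} \boldsymbol{b} = \boldsymbol{b}^{\top} \mathbf{K} \mathbf{C}^{\top} \mathbf{K}^{\top} \boldsymbol{a}$, and therefore&lt;/p>
$$
\boldsymbol{u} := \mathbf{C}^{\top} \mathbf{K}^{\top} \boldsymbol{a} \quad\Rightarrow\quad \frac{\partial}{\partial \mathbf{K}}\left(\boldsymbol{b}^{\top} \mathbf{K} \boldsymbol{u}\right) = \boldsymbol{b} \boldsymbol{u}^{\top} = \boldsymbol{b} \boldsymbol{a}^{\top} \mathbf{K} \mathbf{C}
$$
&lt;p>Adding both contributions gives $\boldsymbol{a} \boldsymbol{b}^{\top} \mathbf{K} \mathbf{C}^{\top}+\boldsymbol{b} \boldsymbol{a}^{\top} \mathbf{K} \mathbf{C}$.&lt;/p>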
&lt;p>Let $\boldsymbol{a} = \boldsymbol{e}$, $\boldsymbol{b} = \boldsymbol{e}$, and let $\mathbf{C}$ be symmetric. Then&lt;/p>
$$
\frac{\partial}{\partial \mathbf{K}}\left(\boldsymbol{e}^{\top} \cdot \mathbf{K} \cdot \mathbf{C} \cdot \mathbf{K}^{\top} \boldsymbol{e} \right)=\boldsymbol{e} \boldsymbol{e}^{\top} \mathbf{K} \mathbf{C}^{\top}+\boldsymbol{e} \boldsymbol{e}^{\top} \mathbf{K} \mathbf{C} = 2\boldsymbol{e} \boldsymbol{e}^{\top} \mathbf{K} \mathbf{C}
$$</description></item><item><title>Software Defined Networks (SDNs)</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/05-software_defined_network/</link><pubDate>Thu, 11 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/05-software_defined_network/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/SDN-SDN_summary.png"
alt="SDN summary">&lt;figcaption>
&lt;p>SDN summary&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h2 id="basics-and-architecture">Basics and Architecture&lt;/h2>
&lt;h3 id="high-level-view-on-traditional-ip-networks">High Level View on Traditional IP Networks&lt;/h3>
&lt;p>Abstract view on an IP router&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2010.22.31.png" alt="截屏2021-03-12 10.22.31">&lt;/p>
&lt;ul>
&lt;li>Control plane
&lt;ul>
&lt;li>Exchange of routing messages for calculation of routes &amp;hellip;&lt;/li>
&lt;li>Additional tasks, such as load balancing, access control, &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Data plane: Forwarding of packets at layer 3&lt;/li>
&lt;/ul>
&lt;p>Every router has control and data plane functions&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2010.40.20.png" alt="截屏2021-03-12 10.40.20" style="zoom:80%;" />
&lt;ul>
&lt;li>Control plane: software running on the router&lt;/li>
&lt;li>Data plane
&lt;ul>
&lt;li>Usually implemented in application-specific integrated circuits (ASICs)&lt;/li>
&lt;li>Can also be realized in software (virtual switches)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Control is &lt;strong>decentralized&lt;/strong>.&lt;/p>
&lt;p>🔴 Limitations: Limited flexibility for network operators&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Manufacturer-specific management interfaces&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Difficult (and often impossible) to introduce new functions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Complex to operate: highly qualified operators required&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Expensive (at least for core routers)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="current-trend-software-defined-networks-sdn">Current Trend: &lt;strong>Software-Defined Networks (SDN)&lt;/strong>&lt;/h3>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Increase flexibility&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Decrease dependencies on hardware and manufacturers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Commercial off-the-shelf switches (cheaper)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="characteristics">Characteristics&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Separation&lt;/strong> of control plane and data plane&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2010.50.06.png" alt="截屏2021-03-12 10.50.06">&lt;/p>
&lt;ul>
&lt;li>Control functionality resides on a &lt;strong>logically centralized SDN controller&lt;/strong>
&lt;ul>
&lt;li>Controller is executed on commodity hardware $\rightarrow$ Reduces need for specialized routing hardware&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Data plane consists of &lt;strong>simple packet processors (SDN switches)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Control plane has &lt;em>&lt;strong>global&lt;/strong>&lt;/em> network view&lt;/p>
&lt;ul>
&lt;li>Knows &lt;em>all&lt;/em> switches and their configurations&lt;/li>
&lt;li>Knows network &lt;em>topology&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Network is &lt;strong>software-programmable&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Functionality is provided by network applications (network apps)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different apps can realize different functionality&lt;/p>
&lt;/li>
&lt;li>
&lt;p>SDN controller can execute multiple apps in parallel&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Processing is based on &lt;strong>flows&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="basic-operation">Basic Operation&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2011.08.24.png" alt="截屏2021-03-12 11.08.24">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Control functionality is placed on the SDN controller&lt;/p>
&lt;ul>
&lt;li>E.g., routing including routing table&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Forwarding table is placed on SDN switch&lt;/p>
&lt;ul>
&lt;li>Called &lt;strong>flow table&lt;/strong> in the context of SDN&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>SDN controller programs entries in flow table according to its control functionality&lt;/p>
&lt;ul>
&lt;li>Requires a protocol between controller and switch&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>For every incoming packet in the SDN switch&lt;/p>
&lt;ul>
&lt;li>A suitable entry in the flow table needs to be determined&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="flows-and-flow-table">Flows and Flow Table&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Flows&lt;/strong>: sequence of packets traversing a network that &lt;em>share&lt;/em> a set of header field values&lt;/p>
&lt;ul>
&lt;li>Here: Identified through &lt;strong>match fields&lt;/strong>, e.g., IP address, port number&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Flow table contains, among others, &lt;strong>match fields and actions&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Matches select appropriate flow table entry&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Actions are applied to all packets that satisfy a match&lt;/p>
&lt;/li>
&lt;li>
&lt;p>E.g.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.13.29.png" alt="截屏2021-03-12 11.13.29" style="zoom:80%;" />
&lt;ul>
&lt;li>Flow rule
&lt;ul>
&lt;li>
&lt;p>Decision of controller&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Described in the form of match fields, actions, and the switches it applies to&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
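&lt;p>To make the relation between flows, match fields and actions more concrete, here is a minimal Python sketch of a flow table lookup. The field names (&lt;code>ip_src&lt;/code>, &lt;code>ip_dst&lt;/code>, &lt;code>tcp_dst&lt;/code>), the action strings and the first-match lookup are simplifying assumptions, not the format of any particular switch.&lt;/p>

```python
# Hypothetical, simplified flow table: each entry pairs match fields with actions.
flow_table = [
    {"match": {"ip_dst": "1.2.3.4"}, "actions": ["output:2"]},
    {"match": {"ip_src": "10.0.0.7", "tcp_dst": 80}, "actions": ["drop"]},
]

def matches(entry, packet):
    """A packet satisfies an entry if it agrees on every match field."""
    return all(packet.get(field) == value for field, value in entry["match"].items())

def lookup(table, packet):
    """Return the actions of the first matching entry, or None on a table miss."""
    for entry in table:
        if matches(entry, packet):
            return entry["actions"]
    return None

# p1 and p2 share the header values named in the first match: same flow.
p1 = {"ip_src": "10.0.0.1", "ip_dst": "1.2.3.4", "tcp_dst": 443}
p2 = {"ip_src": "10.0.0.2", "ip_dst": "1.2.3.4", "tcp_dst": 22}
p3 = {"ip_src": "10.0.0.9", "ip_dst": "5.6.7.8", "tcp_dst": 22}

print(lookup(flow_table, p1))
print(lookup(flow_table, p2))
print(lookup(flow_table, p3))
```

&lt;p>Both &lt;code>p1&lt;/code> and &lt;code>p2&lt;/code> receive the same actions because they agree on the only field the first entry matches on, while &lt;code>p3&lt;/code> matches no entry at all.&lt;/p>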
&lt;h4 id="flow-rule-and-flow-table-entries">Flow Rule and Flow Table Entries&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.15.02.png" alt="截屏2021-03-12 11.15.02" style="zoom:80%;" />
&lt;ol>
&lt;li>
&lt;p>Controller (more precisely: an app executed by the controller) makes a &lt;strong>high level decision&lt;/strong>, for example&lt;/p>
&lt;p>a) Traffic for destination X has to be dropped&lt;/p>
&lt;p>b) Connection between end system A and B has to go through switch S4&lt;/p>
&lt;p>c) &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High level decision is represented in a certain format, i.e., as a set of &lt;strong>flow rules&lt;/strong> in the form of match fields, actions and switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Flow rules are transmitted (“installed”) to switches with the help of a communication protocol. They are stored as &lt;strong>flow table entries&lt;/strong> in flow tables&lt;/p>
&lt;/li>
&lt;/ol>
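&lt;p>The three steps can be sketched in a few lines of Python. Everything here is hypothetical: the classes &lt;code>FlowRule&lt;/code> and &lt;code>Switch&lt;/code>, the helpers &lt;code>drop_traffic_to&lt;/code> and &lt;code>install&lt;/code>, and the switch names are made up for illustration, and the "transmission" is just a function call instead of a real controller-to-switch protocol.&lt;/p>

```python
# Hypothetical names throughout; real controllers expose different APIs.
from dataclasses import dataclass, field

@dataclass
class FlowRule:
    switch: str      # switch the rule is meant for
    match: dict      # header fields that select the flow
    actions: list    # what the switch does with matching packets

@dataclass
class Switch:
    name: str
    flow_table: list = field(default_factory=list)

def drop_traffic_to(destination, switch_names):
    """Step 2: express 'drop traffic for destination X' as one rule per switch."""
    return [FlowRule(name, {"ip_dst": destination}, ["drop"]) for name in switch_names]

def install(rules, switches):
    """Step 3: hand each rule to its switch, which stores it as a flow table entry."""
    for rule in rules:
        switches[rule.switch].flow_table.append((rule.match, rule.actions))

switches = {name: Switch(name) for name in ("S1", "S2")}
rules = drop_traffic_to("1.2.3.4", list(switches))   # step 1: decision made by the app
install(rules, switches)

print(len(rules))
print(switches["S1"].flow_table)
```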
&lt;h4 id="flow-programming">Flow Programming&lt;/h4>
&lt;p>SDN provides two different modes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>&lt;em>Proactive&lt;/em> flow programming&lt;/strong>&lt;/p>
&lt;p>Flow rules are programmed &lt;em>&lt;strong>before&lt;/strong>&lt;/em> first packet of flow arrives&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.19.45.png" alt="截屏2021-03-12 11.19.45" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>&lt;em>Reactive&lt;/em> flow programming&lt;/strong>&lt;/p>
&lt;p>Flow rules are programmed &lt;em>&lt;strong>in reaction to&lt;/strong>&lt;/em> receipt of first packet of a flow&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.20.21.png" alt="截屏2021-03-12 11.20.21" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;h5 id="three-important-interactions">Three Important Interactions&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.26.11.png" alt="截屏2021-03-12 11.26.11" style="zoom:80%;" />
&lt;h5 id="example-proactive-flow-programming">Example: Proactive Flow Programming&lt;/h5>
&lt;p>Scenario&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.28.13.png" alt="截屏2021-03-12 11.28.13" style="zoom:80%;" />
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2011.35.40.png" alt="截屏2021-03-12 11.35.40">&lt;/p>
&lt;details>
&lt;Summary>Details&lt;/Summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.42.08.png" alt="截屏2021-03-12 11.42.08" style="zoom: 67%;" />
&lt;/details>
&lt;h5 id="example-reactive-flow-programming">Example: Reactive Flow Programming&lt;/h5>
&lt;p>Same scenario as above&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.43.41.png" alt="截屏2021-03-12 11.43.41" style="zoom:80%;" />
&lt;details>
&lt;Summary>Details&lt;/Summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.44.19.png" alt="截屏2021-03-12 11.44.19" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2011.44.31.png" alt="截屏2021-03-12 11.44.31" style="zoom:67%;" />
&lt;/details>
&lt;h5 id="proactive-vs-reactive-flow-programming">Proactive vs. Reactive Flow Programming&lt;/h5>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Flow Programming&lt;/th>
&lt;th>Characteristics&lt;/th>
&lt;th>Additional delay for new flows?&lt;/th>
&lt;th>Effect of losing controller connectivity&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Proactive&lt;/strong>&lt;/td>
&lt;td>coarse grained, pre-defined&lt;/td>
&lt;td>No&lt;/td>
&lt;td>Does not disrupt traffic&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Reactive&lt;/strong>&lt;/td>
&lt;td>fine grained, on demand&lt;/td>
&lt;td>Yes&lt;/td>
&lt;td>New flows cannot be installed&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;ul>
&lt;li>
&lt;p>Proactive&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Flow table entries have to be programmed before actual traffic arrives&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Usually coarse grained &lt;strong>pre-defined&lt;/strong> decisions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Not always applicable 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>No additional delays for new connections&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Loss of controller connectivity does not disrupt traffic&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Reactive&lt;/p>
&lt;ul>
&lt;li>Allows fine grained &lt;strong>on-demand&lt;/strong> control
&lt;ul>
&lt;li>Increased visibility of flows that are active in the network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Setup time for each flow $\rightarrow$ High overhead for short-lived flows&lt;/li>
&lt;li>New flows cannot be installed if controller connectivity is lost&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
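&lt;p>The trade-off can be illustrated with a toy model that counts how many packets need a round trip to the controller. This is only a sketch under simplifying assumptions: the class &lt;code>ToySwitch&lt;/code>, the &lt;code>policy&lt;/code> function and the port numbers are invented for the example, and the flow table is keyed by destination address only.&lt;/p>

```python
# Toy model: count how many packets need a controller round trip.
class ToySwitch:
    def __init__(self, controller_reachable=True):
        self.flow_table = {}                  # destination -> output port
        self.controller_reachable = controller_reachable
        self.controller_round_trips = 0

    def install(self, dst, port):
        self.flow_table[dst] = port

    def receive(self, dst, decide_port):
        """Forward a packet; on a table miss ask the controller (reactive case)."""
        if dst in self.flow_table:
            return self.flow_table[dst]
        if not self.controller_reachable:
            return None                       # new flow cannot be set up
        self.controller_round_trips += 1
        port = decide_port(dst)
        self.install(dst, port)
        return port

def policy(dst):
    """Hypothetical control decision made by the app."""
    return 2 if dst == "1.2.3.4" else 1

proactive = ToySwitch()
proactive.install("1.2.3.4", 2)               # programmed before traffic arrives
reactive = ToySwitch()
for _ in range(3):
    proactive.receive("1.2.3.4", policy)
    reactive.receive("1.2.3.4", policy)

offline = ToySwitch(controller_reachable=False)
offline_result = offline.receive("1.2.3.4", policy)

print(proactive.controller_round_trips, reactive.controller_round_trips)
print(offline_result)
```

&lt;p>In this model only the first packet of a reactively programmed flow pays the setup cost, and a switch without controller connectivity cannot set up the new flow at all.&lt;/p>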
&lt;h3 id="sdn-architecture">SDN Architecture&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2011.51.50.png" alt="截屏2021-03-12 11.51.50">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Application Plane&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Network apps perform network control and management tasks&lt;/li>
&lt;li>Interacts via &lt;strong>northbound API&lt;/strong> with control plane&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Control Plane&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Control tasks are „outsourced“ from data plane to logically centralized control plane
&lt;ul>
&lt;li>E.g., standard tasks such as topology detection, ARP &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>More complex tasks can be delegated to application plane
&lt;ul>
&lt;li>E.g., routing decisions, load balancing &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Data Plane&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Responsible for packet forwarding / processing&lt;/li>
&lt;li>SDN switches are relatively simple devices
&lt;ul>
&lt;li>Efficient implementations in hardware (ASIC) or in software (virtual switches)&lt;/li>
&lt;li>Supports basic operations such as match, forward, drop&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Interacts via &lt;strong>southbound API&lt;/strong> with control plane&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Interfaces&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Northbound API&lt;/strong>: between controller and network apps&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Exposes control plane functions to apps&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Abstracts from details, so that apps can operate on a consistent network view&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Southbound API&lt;/strong>: between controller and switches&lt;/p>
&lt;ul>
&lt;li>Exposes data plane functions to controller&lt;/li>
&lt;li>Abstracts from hardware details&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Westbound API&lt;/strong>: between controllers&lt;/p>
&lt;ul>
&lt;li>Synchronization of network state information&lt;/li>
&lt;li>E.g., coordinated flow setup, exchange of reachability information&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Eastbound API&lt;/strong>: interface to legacy infrastructures&lt;/p>
&lt;ul>
&lt;li>Usually proprietary&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="sdn-workflow-in-practice">SDN Workflow in Practice&lt;/h2>
&lt;h3 id="workflow-and-primitives">Workflow and Primitives&lt;/h3>
&lt;p>High level view:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.05.50.png" alt="截屏2021-03-12 12.05.50" style="zoom:80%;" />
&lt;p>In practice:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2012.08.56.png" alt="截屏2021-03-12 12.08.56">&lt;/p>
&lt;ol>
&lt;li>
&lt;p>We need a piece of software (app) that realizes the new behavior&lt;/p>
&lt;p>&lt;code>control_my_network.java&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We need primitives to assist with creating the app&lt;/p>
&lt;p>&lt;code>import OFMatch, OFAction, ...&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We need a runtime environment that can execute our app&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">$ ./myController --runApp control_my_network.java
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>We need hardware support for SDN in the switches&lt;/p>
&lt;p>&lt;strong>Flow table(s)&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h4 id="primitives-for-sdn-programming">Primitives for SDN Programming&lt;/h4>
&lt;p>🎯 Goal: From intended behavior to lower level flow rules&lt;/p>
&lt;p>$\rightarrow$ This requires &lt;strong>SDN programming primitives&lt;/strong>&lt;/p>
&lt;p>Three important areas to cover&lt;/p>
&lt;h5 id="1-create-and-install-flow-rules">(1) Create and install flow rules&lt;/h5>
&lt;p>Sufficient for proactive use cases.&lt;/p>
&lt;p>Example: Traffic with IP destination address &lt;code>1.2.3.4&lt;/code> has to be forwarded to network B by switch S1&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.25.11.png" alt="截屏2021-03-12 12.25.11" style="zoom:80%;" />
&lt;p>Needed: App that implements the corresponding logic&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Represent the decision as flow rules&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Program appropriate flow table entries into the switch&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Suppose that we have &lt;code>static_forwarding.java&lt;/code>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Creates a new flow rule&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sends the flow rule to S1&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.34.13.png" alt="截屏2021-03-12 12.34.13" style="zoom:80%;" />
&lt;details>
&lt;summary>Details&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.34.44.png" alt="截屏2021-03-12 12.34.44" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.35.36.png" alt="截屏2021-03-12 12.35.36" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.36.01.png" alt="截屏2021-03-12 12.36.01" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.36.17.png" alt="截屏2021-03-12 12.36.17" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.36.52.png" alt="截屏2021-03-12 12.36.52" style="zoom:67%;" />
&lt;/details>
&lt;br>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>Here we use a simple &lt;strong>pseudo programming language&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Language used in practice depends on controller&lt;/li>
&lt;li>Different controllers support different languages: Java, Python, C, C++, &amp;hellip;&lt;/li>
&lt;/ul>&lt;/span>
&lt;/div>
&lt;p>&lt;strong>Overview: Matches&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.43.27.png" alt="截屏2021-03-12 12.43.27" style="zoom:67%;" />
&lt;p>&lt;strong>Overview: Actions&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.43.56.png" alt="截屏2021-03-12 12.43.56" style="zoom:67%;" />
&lt;p>&lt;strong>Priorities&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Priorities come into play if there are &lt;strong>overlapping&lt;/strong> flow rules&lt;/p>
&lt;ul>
&lt;li>No overlap = every potential packet is matched by at most one rule&lt;/li>
&lt;li>Overlap = at least one packet could be matched by more than one rule&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2012.46.39.png" alt="截屏2021-03-12 12.46.39" style="zoom:67%;" />
&lt;p>Assume that all rules are created with the same &lt;strong>default priority (=1)&lt;/strong>&lt;/p>
&lt;p>If two rules can overlap, priority has to be changed explicitly&lt;/p>
&lt;ul>
&lt;li>Higher values = higher priority&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2012.50.47.png" alt="截屏2021-03-12 12.50.47">&lt;/p>
&lt;/li>
&lt;/ul>
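&lt;p>A minimal Python sketch of priority-based selection among overlapping rules. The rule layout (dictionaries with &lt;code>match&lt;/code>, &lt;code>action&lt;/code>, &lt;code>priority&lt;/code>) and the concrete rules are assumptions for illustration; an empty match stands for a rule that matches every packet.&lt;/p>

```python
# Overlapping rules: a catch-all and a more specific rule match the same packet.
rules = [
    {"match": {}, "action": "drop", "priority": 1},                      # matches everything
    {"match": {"ip_dst": "1.2.3.4"}, "action": "output:2", "priority": 2},
]

def matching_rules(packet):
    """Return all rules whose match fields agree with the packet."""
    return [r for r in rules
            if all(packet.get(k) == v for k, v in r["match"].items())]

def select_action(packet):
    """Pick the matching rule with the highest priority value."""
    candidates = matching_rules(packet)
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["priority"])["action"]

overlap_packet = {"ip_dst": "1.2.3.4"}
other_packet = {"ip_dst": "9.9.9.9"}

print(len(matching_rules(overlap_packet)), select_action(overlap_packet))
print(len(matching_rules(other_packet)), select_action(other_packet))
```

&lt;p>If both rules kept the same priority, the outcome for &lt;code>overlap_packet&lt;/code> would depend on an arbitrary tie-break, which is why the priority has to be set explicitly for overlapping rules.&lt;/p>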
&lt;p>&lt;strong>Multiple Flow Tables&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>SDN switches can support more than one flow table&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2012.51.59.png" alt="截屏2021-03-12 12.51.59">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Using multiple tables has several benefits&lt;/p>
&lt;ul>
&lt;li>Can be used to isolate flow rules from different apps&lt;/li>
&lt;li>Logical separation between different tasks (one table for monitoring, one table for security, &amp;hellip;)&lt;/li>
&lt;li>In some situations: fewer flow table entries overall&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Similar to single table case&lt;/p>
&lt;ul>
&lt;li>&lt;code>r.TABLE(x)&lt;/code>: specify the table for this rule&lt;/li>
&lt;li>&lt;code>r.ACTION('GOTO', y)&lt;/code>: specify that processing continues in table y&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Avoid cycles: &lt;code>GOTO&lt;/code> can NOT jump to the same or a lower flow table number&lt;/p>
&lt;ul>
&lt;li>&lt;code>GOTO&lt;/code> from table x to table y $\Rightarrow$ y &amp;gt; x&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2013.04.07.png" alt="截屏2021-03-12 13.04.07" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
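&lt;p>The following Python sketch walks a packet through two flow tables and enforces the "only forward" rule for &lt;code>GOTO&lt;/code>. Table contents, action names such as &lt;code>COUNT&lt;/code>, and the first-match lookup are assumptions for illustration; every table in the example ends with a catch-all entry so that a match always exists.&lt;/p>

```python
# Toy pipeline with two flow tables; table numbers and rules are made up.
tables = {
    0: [{"match": {}, "actions": [("COUNT", "monitor"), ("GOTO", 1)]}],
    1: [{"match": {"ip_dst": "1.2.3.4"}, "actions": [("OUTPUT", 2)]},
        {"match": {}, "actions": [("DROP", None)]}],
}

def process(packet, table_id=0):
    """Walk the tables; GOTO may only point to a higher table number."""
    applied = []
    while table_id is not None:
        entry = next(e for e in tables[table_id]
                     if all(packet.get(k) == v for k, v in e["match"].items()))
        next_table = None
        for name, arg in entry["actions"]:
            if name == "GOTO":
                if not arg > table_id:
                    raise ValueError("GOTO must target a higher table number")
                next_table = arg
            else:
                applied.append((name, arg))
        table_id = next_table
    return applied

forwarded = process({"ip_dst": "1.2.3.4"})
dropped = process({"ip_dst": "9.9.9.9"})
print(forwarded)
print(dropped)
```

&lt;p>Because the table number strictly increases with every &lt;code>GOTO&lt;/code>, the loop is guaranteed to terminate.&lt;/p>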
&lt;h5 id="2-react-to-data-plane-events">(2) React to data plane events&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>&lt;code>onPacketIn(packet, switch, inport)&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Called if the controller receives a packet that was forwarded via &lt;code>r.ACTION('CONTROLLER')&lt;/code>&lt;/li>
&lt;li>Parameters
&lt;ul>
&lt;li>&lt;code>packet&lt;/code>: contains the packet that was forwarded and grants access to its header fields
&lt;ul>
&lt;li>&lt;code>packet.IP_SRC&lt;/code>&lt;/li>
&lt;li>&lt;code>packet.IP_DST&lt;/code>&lt;/li>
&lt;li>&lt;code>packet.MAC_SRC&lt;/code>&lt;/li>
&lt;li>&lt;code>packet.MAC_DST&lt;/code>&lt;/li>
&lt;li>&lt;code>packet.TTL&lt;/code>&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;code>switch&lt;/code>: the switch the packet was received at (e.g., S1)&lt;/li>
&lt;li>&lt;code>inport&lt;/code>: the interface the packet was received at (e.g., port 1)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2015.36.45.png" alt="截屏2021-03-12 15.36.45" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Sketch&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Create a low priority flow rule that sends „all unknown packets“ to the controller&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-java" data-lang="java">&lt;span class="line">&lt;span class="cl">&lt;span class="n">r&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="na">MATCH&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sc">&amp;#39;*&amp;#39;&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// match on everything &lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">r&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="na">ACTION&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="err">&amp;#39;&lt;/span>&lt;span class="n">CONTROLLER&lt;/span>&lt;span class="err">&amp;#39;&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// send packet to controller &lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">r&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="na">PRIORITY&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">0&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="c1">// use lowest priority for this flow rule&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>Use &lt;code>onPacketIn()&lt;/code> to create and install flow rules on demand&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
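&lt;p>The two steps of the sketch can be tried out with a small Python mock of the pseudo API. This is not a real controller: the &lt;code>Rule&lt;/code> class, the &lt;code>installed&lt;/code> dictionary, the choice to install the catch-all rule inside &lt;code>onConnect&lt;/code>, and the fixed output port are all assumptions made for illustration.&lt;/p>

```python
# Minimal mock of the pseudo API; class and helper names are assumptions.
class Rule:
    def __init__(self):
        self.match, self.actions, self.priority = {}, [], 1   # default priority 1

    def MATCH(self, field, value=None):
        self.match[field] = value

    def ACTION(self, name, arg=None):
        self.actions.append((name, arg))

    def PRIORITY(self, value):
        self.priority = value

installed = {}                      # switch name -> list of installed rules

def send_rule(rule, switch):
    installed.setdefault(switch, []).append(rule)

def onConnect(switch):
    """Step 1: low-priority catch-all that sends unknown packets to the controller."""
    r = Rule()
    r.MATCH('*')
    r.ACTION('CONTROLLER')
    r.PRIORITY(0)
    send_rule(r, switch)

def onPacketIn(packet, switch, inport):
    """Step 2: install a specific rule on demand for the flow of this packet."""
    r = Rule()
    r.MATCH('IP_DST', packet['IP_DST'])
    r.ACTION('OUTPUT', 2)           # hypothetical output port
    send_rule(r, switch)

onConnect('S1')
onPacketIn({'IP_DST': '1.2.3.4'}, 'S1', 1)
summary = [(r.match, r.actions, r.priority) for r in installed['S1']]
print(summary)
```

&lt;p>The on-demand rule keeps the default priority 1 and therefore wins over the catch-all rule with priority 0 for all later packets of the same flow.&lt;/p>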
&lt;details>
&lt;summary>Details&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2015.41.49.png" alt="截屏2021-03-12 15.41.49" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2015.42.12.png" alt="截屏2021-03-12 15.42.12" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2015.42.27.png" alt="截屏2021-03-12 15.42.27" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2015.42.34.png" alt="截屏2021-03-12 15.42.34" style="zoom:80%;" />
&lt;/details>
&lt;h5 id="3-inject-individual-packets">(3) Inject individual packets&lt;/h5>
&lt;p>Handle individual packets from within the app&lt;/p>
&lt;ul>
&lt;li>Forward a packet that was sent to the controller&lt;/li>
&lt;li>Perform topology detection&lt;/li>
&lt;li>Active monitoring („probe packets“)&lt;/li>
&lt;li>Answer ARP requests&lt;/li>
&lt;/ul>
&lt;p>&lt;code>send_packet(packet, switch, rule)&lt;/code>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Injects a single packet into a switch&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Parameters&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;code>packet&lt;/code>: contains the packet that should be injected&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>switch&lt;/code>: the switch where the packet is injected&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>rule&lt;/code> (optional): a flow rule that is applied to this packet instead of the default flow table processing&lt;/p>
&lt;ul>
&lt;li>Only &lt;code>rule.ACTION()&lt;/code> is allowed here&lt;/li>
&lt;li>No matches, no priorities&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Different from installing flow rules&lt;/p>
&lt;ul>
&lt;li>Used for a single packet only&lt;/li>
&lt;li>The flow table is not changed&lt;/li>
&lt;li>Even if the &lt;code>rule&lt;/code> parameter is present, this does NOT create a new flow table entry&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
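&lt;p>A toy model of these semantics in Python. The &lt;code>ToySwitch&lt;/code> class, its &lt;code>sent&lt;/code> log, and the packet dictionaries are hypothetical; the optional third argument is modelled as a plain list of actions, since only actions are allowed in the custom rule.&lt;/p>

```python
# Toy switch illustrating that send_packet leaves the flow table untouched.
class ToySwitch:
    def __init__(self):
        self.flow_table = [({"ip_dst": "1.2.3.4"}, [("OUTPUT", 2)])]
        self.sent = []              # log of (packet id, applied actions)

    def apply(self, packet, actions):
        self.sent.append((packet["id"], actions))

def send_packet(packet, switch, rule_actions=None):
    """Inject one packet; use rule_actions if given, else the normal table lookup."""
    if rule_actions is not None:
        switch.apply(packet, rule_actions)
        return
    for match, actions in switch.flow_table:
        if all(packet.get(k) == v for k, v in match.items()):
            switch.apply(packet, actions)
            return

s1 = ToySwitch()
before = list(s1.flow_table)

# Processed with the existing flow table entries.
send_packet({"id": "probe", "ip_dst": "1.2.3.4"}, s1)
# Processed with custom actions only; no table lookup takes place.
send_packet({"id": "arp-reply", "ip_dst": "7.7.7.7"}, s1, [("OUTPUT", 1)])

print(s1.sent)
print(s1.flow_table == before)
```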
&lt;p>Inject and process injected packet with a custom rule&lt;/p>
&lt;ul>
&lt;li>Directly attaches the actions to the injected packet&lt;/li>
&lt;li>Rule is only used for a single packet&lt;/li>
&lt;li>Flow table remains unchanged&lt;/li>
&lt;li>Advantages
&lt;ul>
&lt;li>Efficient&lt;/li>
&lt;li>Consistent&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-java" data-lang="java">&lt;span class="line">&lt;span class="cl">&lt;span class="n">newPacket&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">createNewPacket&lt;/span>&lt;span class="p">()&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">customRule&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">Rule&lt;/span>&lt;span class="p">()&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">customRule&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="na">ACTION&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="err">&amp;#39;&lt;/span>&lt;span class="n">OUTPUT&lt;/span>&lt;span class="err">&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">1&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="n">send_packet&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">newPacket&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">switch&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="n">customRule&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h4 id="summary-on-primitives">Summary on Primitives&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Entry point primitives&lt;/strong>: Callbacks to implement custom logic&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;code>onConnect(switch)&lt;/code>&lt;/p>
&lt;p>Called if a new control connection to a switch is established&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>onPacketIn(packet, switch, inport)&lt;/code>&lt;/p>
&lt;p>Called if a packet was forwarded to the controller&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Flow rule creation primitives&lt;/strong>: Used to define flow rules&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;code>Rule.MATCH()&lt;/code>&lt;/p>
&lt;p>Select packets based on certain header fields&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>Rule.ACTION()&lt;/code>&lt;/p>
&lt;p>Specify what happens to a packet in the switch&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>Rule.PRIORITY()&lt;/code>&lt;/p>
&lt;p>Specify the priority of the created flow rule&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>Rule.TABLE()&lt;/code>&lt;/p>
&lt;p>Specify the flow table the rule should be applied to&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Switch interaction primitives&lt;/strong>: Used to handle flow rule installation and packet injection&lt;/p>
&lt;ul>
&lt;li>&lt;code>send_rule(rule, switch)&lt;/code>
Installs a flow rule and creates the associated flow table entry in the switch&lt;/li>
&lt;li>&lt;code>send_packet(packet, switch)&lt;/code>
Injects a single packet into a switch; the packet is processed with the existing flow table entries&lt;/li>
&lt;li>&lt;code>send_packet(packet, switch, rule)&lt;/code>
Injects a single packet into a switch; the packet is processed with the custom rule instead&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="learning-switch-example">Learning Switch Example&lt;/h3>
&lt;p>Goal: learn port-address association of end systems&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Switch receives packet and does not know destination address&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Floods&lt;/strong> packets on all active ports&lt;/p>
&lt;/li>
&lt;li>
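&lt;p>&lt;strong>Learns&lt;/strong> the &amp;ldquo;location&amp;rdquo; of end systems from the source addresses of the packets it receives; the end system with this destination address becomes known as soon as it sends a packet itself&lt;/p>
&lt;ul>
&lt;li>Remembers that the end system is accessible via the port its packet arrived on&lt;/li>
&lt;li>Entry in table &lt;code>&amp;lt;MAC address, port, lifetime&amp;gt;&lt;/code>&lt;/li>
&lt;/ul>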
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Switch receives packet and knows destination address&lt;/p>
&lt;ul>
&lt;li>Forwards packet via corresponding port&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
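&lt;p>The behaviour of such a classic learning switch can be sketched in Python as follows. The class name, the lifetime of 300 seconds and the explicit &lt;code>now&lt;/code> parameter (used instead of a real clock) are assumptions made for illustration.&lt;/p>

```python
# Sketch of a classic learning switch; the lifetime value is an assumption.
LIFETIME = 300                       # seconds an entry stays valid

class LearningSwitch:
    def __init__(self, ports):
        self.ports = ports
        self.table = {}              # MAC address -> (port, expiry time)

    def receive(self, src, dst, inport, now):
        """Return the list of ports the packet is sent out on."""
        self.table[src] = (inport, now + LIFETIME)       # learn the sender's location
        entry = self.table.get(dst)
        if entry is not None and entry[1] > now:
            return [entry[0]]                            # known destination: forward
        return [p for p in self.ports if p != inport]    # unknown destination: flood

sw = LearningSwitch(ports=[1, 2, 3])
first = sw.receive("A", "B", inport=1, now=0)      # B is unknown
reply = sw.receive("B", "A", inport=2, now=1)      # A was learned from the first packet
second = sw.receive("A", "B", inport=1, now=2)     # B was learned from the reply
late = sw.receive("A", "B", inport=1, now=500)     # entry for B has expired
print(first, reply, second, late)
```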
&lt;p>We can do the same with SDN: Learning switch app&lt;/p>
&lt;ul>
&lt;li>Controller observes packets&lt;/li>
&lt;li>Derive locations of end systems&lt;/li>
&lt;li>Program forwarding rules to allow connectivity between end systems based on MAC addresses and port numbers&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2023.00.57.png" alt="截屏2021-03-12 23.00.57" style="zoom:67%;" />
&lt;h4 id="naive-approach">Naïve Approach&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Send all packets to controller&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Controller looks at &lt;code>INPORT&lt;/code> and &lt;strong>source MAC address&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Controller creates rules based on these two pieces of information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Packets with unknown destination addresses are flooded to all ports&lt;/p>
&lt;/li>
&lt;/ul>
&lt;details>
&lt;summary>Implementation&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2023.16.13.png" alt="截屏2021-03-12 23.16.13" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2023.17.25.png" alt="截屏2021-03-12 23.17.25" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2023.18.13.png" alt="截屏2021-03-12 23.18.13" style="zoom:80%;" />
&lt;/details>
&lt;p>🔴 Problem&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.27.47.png" alt="截屏2021-03-12 23.27.47">&lt;/p>
&lt;h4 id="version-2">Version 2&lt;/h4>
&lt;ul>
&lt;li>Delay rule installation until the destination address has been learned (not the source address)&lt;/li>
&lt;li>Avoids installing rules &amp;ldquo;too early&amp;rdquo;&lt;/li>
&lt;/ul>
&lt;details>
&lt;summary>Implementation&lt;/summary>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.38.41.png" alt="截屏2021-03-12 23.38.41">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.38.58.png" alt="截屏2021-03-12 23.38.58">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.39.09.png" alt="截屏2021-03-12 23.39.09">&lt;/p>
&lt;/details>
&lt;p>Consider the example above:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.41.11.png" alt="截屏2021-03-12 23.41.11">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.41.16.png" alt="截屏2021-03-12 23.41.16">&lt;/p>
&lt;p>🔴 Problem&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-12%2023.43.04.png" alt="截屏2021-03-12 23.43.04" style="zoom:80%;" />
&lt;h4 id="version-3">Version 3&lt;/h4>
&lt;ul>
&lt;li>Matching only on the destination address is not specific enough&lt;/li>
&lt;li>&lt;strong>Use more specific matches&lt;/strong>&lt;/li>
&lt;li>Makes sure that all end systems can be learned by controller&lt;/li>
&lt;/ul>
&lt;p>Implementation&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.50.10.png" alt="截屏2021-03-12 23.50.10">&lt;/p>
&lt;p>Consider the example in Version 2:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.56.11.png" alt="截屏2021-03-12 23.56.11">&lt;/p>
&lt;p>🔴 Problem: flow table resources&lt;/p>
&lt;ul>
&lt;li>Needs N*N flow entries for N end systems&lt;/li>
&lt;li>May exceed table capacity! 🤪&lt;/li>
&lt;/ul>
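&lt;p>To see where the quadratic growth comes from, here is a minimal sketch of a Version 3 style controller. It is not the implementation from the slides: the class, its methods and the three hosts are hypothetical, and &lt;code>send_rule&lt;/code> is only a stand-in that records the flow entry. In this sketch a rule is installed per ordered (source, destination) pair, which gives N*(N-1) entries, the same order of magnitude as the N*N stated above.&lt;/p>

```python
from itertools import permutations


class PairMatchController:
    """Toy controller for Version 3: rules match on (source, destination)."""

    def __init__(self):
        self.location = {}    # MAC address -> port, learned from packet-ins
        self.rules = {}       # (src MAC, dst MAC) -> output port (the flow table)

    def send_rule(self, match, out_port):
        # Stand-in for send_rule(rule, switch): just record the flow entry.
        self.rules[match] = out_port

    def on_packet_in(self, src, dst, in_port):
        self.location[src] = in_port              # learn the sender
        if dst in self.location:
            # Destination known: install a rule specific to this pair.
            self.send_rule((src, dst), self.location[dst])
            return self.location[dst]
        return "FLOOD"                            # destination not learned yet

    def handle(self, src, dst, in_port):
        """Switch behaviour: use a matching rule, otherwise ask the controller."""
        if (src, dst) in self.rules:
            return self.rules[(src, dst)]
        return self.on_packet_in(src, dst, in_port)


hosts = {"A": 1, "B": 2, "C": 3}                  # hypothetical hosts and ports
ctrl = PairMatchController()
for _ in range(2):                                # two rounds so every pair gets a rule
    for src, dst in permutations(hosts, 2):
        ctrl.handle(src, dst, hosts[src])
print(len(ctrl.rules), "flow entries for", len(hosts), "end systems")
```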
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">The number of flow table entries required is an important factor for usability and scalability.&lt;/span>
&lt;/div>
&lt;h4 id="version-4">Version 4&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Separate flow tables for learning and forwarding&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-12%2023.59.02.png" alt="截屏2021-03-12 23.59.02">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Flow table FT1&lt;/strong> matches on source address and forwards to controller, if address was not yet learned&lt;/li>
&lt;li>&lt;strong>Flow table FT2&lt;/strong> matches on destination address and forwards packet to destination (if learned) or floods packet (if not learned)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Only 2*N rules for N end systems&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem: Hardware often does not support multiple flow tables due to cost, energy or space constraints&lt;/p>
&lt;/li>
&lt;/ul>
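&lt;p>A rough sketch of the two-table idea, assuming that the controller installs one FT1 entry and one FT2 entry per learned end system. The class and its fields are hypothetical and merge switch and controller logic into one object for brevity.&lt;/p>

```python
class TwoTableSwitch:
    """Toy model of Version 4: FT1 learns sources, FT2 forwards by destination."""

    def __init__(self):
        self.ft1 = set()      # source MACs already learned (no packet-in needed)
        self.ft2 = {}         # destination MAC -> output port
        self.packet_ins = 0   # how often the controller was involved

    def process(self, src, dst, in_port):
        # FT1: unknown source, so the controller is informed, learns it
        # and installs one rule in each table for this end system.
        if src not in self.ft1:
            self.packet_ins += 1
            self.ft1.add(src)
            self.ft2[src] = in_port
        # FT2: forward to the learned port, or flood if the destination is unknown.
        return self.ft2.get(dst, "FLOOD")

    def rule_count(self):
        return len(self.ft1) + len(self.ft2)


sw = TwoTableSwitch()
hosts = {"A": 1, "B": 2, "C": 3, "D": 4}          # hypothetical hosts and ports
for src in hosts:
    for dst in hosts:
        if src != dst:
            sw.process(src, dst, hosts[src])
n = len(hosts)
print(sw.rule_count(), "rules with two tables;", n * n, "would be N*N")
```

&lt;p>Because learning and forwarding are decoupled, the controller is contacted once per end system rather than once per pair.&lt;/p>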
&lt;h2 id="openflow">OpenFlow&lt;/h2>
&lt;h3 id="rough-overview">Rough Overview&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>A standard for an SDN &lt;strong>southbound&lt;/strong> interface&lt;/p>
&lt;ul>
&lt;li>Defines the &lt;strong>interaction&lt;/strong> between controller and switches&lt;/li>
&lt;li>Defines a &lt;strong>logical architecture&lt;/strong> for SDN switches (flow table, &amp;hellip;)&lt;/li>
&lt;li>Defined by the Open Networking Foundation (ONF)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Supports&lt;/p>
&lt;ul>
&lt;li>All basic structures and primitives discussed in previous section
&lt;ul>
&lt;li>
&lt;p>Matches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Actions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Priorities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiple flow tables&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Protocol mechanisms for&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Creating flow rules&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Reacting to data plane events&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Injecting individual packets&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>More sophisticated features
&lt;ul>
&lt;li>Group table&lt;/li>
&lt;li>Rate limiting&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="structure">Structure&lt;/h3>
&lt;p>Provides a uniform view on SDN-capable switches&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2000.07.45.png" alt="截屏2021-03-13 00.07.45">&lt;/p>
&lt;h4 id="ports">&lt;strong>Ports&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Represent logical forwarding targets&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Can be selected by the output action&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Physical ports = hardware interfaces&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Reserved ports (special meaning)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;code>ALL&lt;/code>&lt;/p>
&lt;ul>
&lt;li>Represents all ports eligible to forward a specific packet (= flooding);&lt;/li>
&lt;li>Ingress port is automatically excluded from forwarding&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;code>IN_PORT&lt;/code>&lt;/p>
&lt;p>Always references ingress port of a packet (= send packet back the way it came)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>CONTROLLER&lt;/code>&lt;/p>
&lt;p>Forwarding a packet on this port sends it to the controller&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;code>NORMAL&lt;/code>&lt;/p>
&lt;p>Yields control of the forwarding process to the vendor-specific switch implementation&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Logical ports&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Provide abstract forwarding targets (vendor-specific)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Link aggregation&lt;/strong>: Multiple interfaces are combined to a single logical port&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Transparent tunneling&lt;/strong>: Traffic is forwarded via intermediate switches&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
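&lt;p>The reserved ports can be read as a small dispatch on the target of an output action. The following sketch only illustrates the semantics listed above; the function name and the string placeholders for the controller and the vendor pipeline are assumptions.&lt;/p>

```python
def resolve_output(port, in_port, physical_ports):
    """Map the port of an output action to concrete targets (simplified model)."""
    if port == "ALL":
        # All ports except the one the packet came in on.
        return [p for p in physical_ports if p != in_port]
    if port == "IN_PORT":
        return [in_port]                  # send the packet back the way it came
    if port == "CONTROLLER":
        return ["controller"]             # handed over to the controller
    if port == "NORMAL":
        return ["vendor-pipeline"]        # left to the switch's own forwarding logic
    return [port]                         # an ordinary physical or logical port


ports = [1, 2, 3, 4]
print(resolve_output("ALL", in_port=2, physical_ports=ports))
print(resolve_output("IN_PORT", in_port=2, physical_ports=ports))
print(resolve_output(3, in_port=2, physical_ports=ports))
```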
&lt;h4 id="flow-table">&lt;strong>Flow table&lt;/strong>&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.12.09.png" alt="截屏2021-03-13 10.12.09">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Counters&lt;/strong>&lt;/p>
&lt;p>The number of processed packets (counter)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Timeouts&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Maximum lifetime of a flow&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Enables automatic removal of flows&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Cookie&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Marker value set by an SDN controller&lt;/li>
&lt;li>Not used during packet processing&lt;/li>
&lt;li>Simplifies flow management&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Flags&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Indicate how a flow is managed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>E.g., notify controller when a flow is automatically removed&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Pipeline Processing&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Multiple flow tables can be chained in a flow table &lt;strong>pipeline&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Flow tables are numbered in the order they can be traversed by packets&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Processing starts at flow table 0&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Only “forward” traversal is possible $\rightarrow$ no recursion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Actions are accumulated in an action set during pipeline processing&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Divided into &lt;strong>ingress&lt;/strong> and &lt;strong>egress processing&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.15.12.png" alt="截屏2021-03-13 10.15.12">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;details>
&lt;summary>Building an action set&lt;/summary>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.26.56.png" alt="截屏2021-03-13 10.26.56">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.27.15.png" alt="截屏2021-03-13 10.27.15">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.28.06.png" alt="截屏2021-03-13 10.28.06">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.28.52.png" alt="截屏2021-03-13 10.28.52">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.28.29.png" alt="截屏2021-03-13 10.28.29">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.29.47.png" alt="截屏2021-03-13 10.29.47">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.30.06.png" alt="截屏2021-03-13 10.30.06">&lt;/p>
&lt;/details>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
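&lt;p>The per-entry fields can be pictured as a small record. This is a minimal sketch with hypothetical field names (in particular &lt;code>hard_timeout&lt;/code> and &lt;code>notify_on_removal&lt;/code>), showing how a timeout leads to automatic removal and how a flag decides whether the controller is told about it.&lt;/p>

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    match: dict
    actions: list
    hard_timeout: float = 0.0         # maximum lifetime in seconds, 0 = never expires
    cookie: int = 0                   # controller-chosen marker, ignored when matching
    notify_on_removal: bool = False   # flag: tell the controller when the entry is removed
    installed_at: float = 0.0
    packet_count: int = 0             # counter: packets processed by this entry

    def expired(self, now):
        return self.hard_timeout > 0 and now - self.installed_at >= self.hard_timeout


def expire_flows(table, now):
    """Remove timed-out entries; return the cookies the controller is notified about."""
    removed = [e for e in table if e.expired(now)]
    table[:] = [e for e in table if not e.expired(now)]
    return [e.cookie for e in removed if e.notify_on_removal]


table = [
    FlowEntry({"eth_dst": "A"}, ["output:1"], hard_timeout=10, cookie=7, notify_on_removal=True),
    FlowEntry({"eth_dst": "B"}, ["output:2"]),
]
print(expire_flows(table, now=15.0), len(table))
```

&lt;p>The cookie lets the controller recognise which of its flows was removed without inspecting the match.&lt;/p>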
&lt;h5 id="ingress-processing">Ingress Processing&lt;/h5>
&lt;ul>
&lt;li>Starts at flow table 0&lt;/li>
&lt;li>Initial action set is empty&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.16.46.png" alt="截屏2021-03-13 10.16.46">&lt;/p>
&lt;h5 id="egress-processing">Egress Processing&lt;/h5>
&lt;p>Optionally follows ingress or group table processing&lt;/p>
&lt;ul>
&lt;li>Egress flow tables must have higher table numbers than ingress tables $\rightarrow$ No return to ingress processing&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2011.54.11.png" alt="截屏2021-03-13 11.54.11" style="zoom:67%;" />
&lt;h5 id="group-tables">Group Tables&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2010.37.07.png" alt="截屏2021-03-13 10.37.07" style="zoom:80%;" />
&lt;p>Group entry:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2010.38.12.png" alt="截屏2021-03-13 10.38.12" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Group tables represent additional forwarding methods (e.g., link selection, fast failover, &amp;hellip;)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Group entries can be invoked from other tables via group actions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>They are referenced by their unique group identifier&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Flow table entries can perform group actions during ingress processing&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Effect of group processing depends on the &lt;strong>group type&lt;/strong> and its &lt;strong>action buckets&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Action buckets&lt;/p>
&lt;ul>
&lt;li>Each group references zero or more action buckets
&lt;ul>
&lt;li>Not every action bucket of a group has to be executed&lt;/li>
&lt;li>A group with no action buckets drops a packet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>An action bucket contains a set of actions to execute (just like an action set)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Group types&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2010.45.46.png" alt="截屏2021-03-13 10.45.46">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>All&lt;/strong>: executes all buckets in a group (E.g., for broadcast)&lt;/li>
&lt;li>&lt;strong>Indirect&lt;/strong>: executes the single bucket in a group
&lt;ul>
&lt;li>
&lt;p>Indirect groups must reference exactly one action bucket&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Useful to avoid changing multiple flow table entries with common actions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Select&lt;/strong>: selects one of many buckets of a group (E.g., select by round-robin or hashing of packet data)&lt;/li>
&lt;li>&lt;strong>Fast failover&lt;/strong>: executes &lt;strong>first live bucket&lt;/strong> in a group
&lt;ul>
&lt;li>Each bucket is associated with a port that determines its liveness&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;details>
&lt;summary>Indirect Group Tables&lt;/summary>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2011.13.01-20210313111743768.png" alt="截屏2021-03-13 11.13.01">&lt;/p>
&lt;p>🎯 Goal: Reroute flows to avoid forwarding via switch S2&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Output ports specified in flow tables are subject to change&lt;/p>
&lt;/li>
&lt;li>
&lt;p>SDN controller must send multiple modify-state messages to SDN switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>One message for each flow that needs to be updated 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2011.14.33.png" alt="截屏2021-03-13 11.14.33">&lt;/p>
&lt;p>Optimization&lt;/p>
&lt;ul>
&lt;li>Use an &lt;strong>indirect group&lt;/strong> to avoid sending multiple modify-state messages&lt;/li>
&lt;li>Redirect flows with identical forwarding behavior to that group&lt;/li>
&lt;li>Modify the group&amp;rsquo;s actions when forwarding behavior changes&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2011.15.40.png" alt="截屏2021-03-13 11.15.40">&lt;/p>
&lt;p>Advantage: Instead of modifying a large number of entries in the flow table, we only need to modify one entry in the group table!&lt;/p>
&lt;/details>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
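&lt;p>The four group types differ only in which action buckets they execute. The sketch below models that choice. The bucket layout, the &lt;code>watch_port&lt;/code> key and the CRC32 hash for &lt;em>select&lt;/em> are assumptions made for illustration, since the selection algorithm is left to the switch.&lt;/p>

```python
import zlib


def run_group(group_type, buckets, packet_key="", live_ports=()):
    """Return the action buckets that are executed for one packet.

    Each bucket is a dict with "actions" and, for fast failover, "watch_port".
    """
    if not buckets:
        return []                                     # no buckets: packet is dropped
    if group_type == "all":
        return list(buckets)                          # e.g. broadcast
    if group_type == "indirect":
        if len(buckets) != 1:
            raise ValueError("indirect groups need exactly one bucket")
        return [buckets[0]]
    if group_type == "select":
        # A hash of packet data picks one bucket (round-robin would also be valid).
        index = zlib.crc32(packet_key.encode()) % len(buckets)
        return [buckets[index]]
    if group_type == "fast_failover":
        for bucket in buckets:
            if bucket["watch_port"] in live_ports:
                return [bucket]                       # first live bucket wins
        return []                                     # nothing live: drop
    raise ValueError(f"unknown group type: {group_type}")


buckets = [
    {"actions": ["output:1"], "watch_port": 1},
    {"actions": ["output:2"], "watch_port": 2},
]
print(run_group("fast_failover", buckets, live_ports={2}))
print(len(run_group("all", buckets)))
```

&lt;p>With port 1 down, the fast failover group falls back to the second bucket without any controller involvement.&lt;/p>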
&lt;h3 id="additional-material-on-openflow">Additional material on OpenFlow&lt;/h3>
&lt;h4 id="flow-table-in-openflow">Flow Table in OpenFlow&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Flow tables contain match/action-associations&lt;/p>
&lt;ul>
&lt;li>Matches select the appropriate flow table entries&lt;/li>
&lt;li>Actions are applied to all packets that satisfy a match&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Table-miss flows&lt;/strong> capture all unmatched packets&lt;/p>
&lt;ul>
&lt;li>Enables reactive flow programming&lt;/li>
&lt;li>Corresponding flow table entry has &lt;em>lowest&lt;/em> priority&lt;/li>
&lt;li>Synonym: &lt;strong>default flow&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2011.30.00.png" alt="截屏2021-03-13 11.30.00" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="matches-in-openflow">Matches in OpenFlow&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Matches have &lt;strong>priorities&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Only the entry with the &lt;strong>highest priority&lt;/strong> is selected&lt;/li>
&lt;li>Disambiguation of similar match fields&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2011.32.10.png" alt="截屏2021-03-13 11.32.10">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Wildcard matching can be performed using &lt;strong>bitmasks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Empty match fields match all flows&lt;/p>
&lt;/li>
&lt;/ul>
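&lt;p>Priorities and bitmask wildcards can be combined in a short lookup routine. This is a simplified sketch with a hypothetical table layout where each match field is a (pattern, mask) pair. &lt;code>operator.and_&lt;/code> is Python's function form of the bitwise AND.&lt;/p>

```python
from operator import and_   # and_(a, b) is the bitwise AND of two integers


def field_matches(value, pattern, mask):
    """Wildcard match: only the bits set in mask are compared."""
    return and_(value, mask) == and_(pattern, mask)


def lookup(table, packet):
    """Return the actions of the highest-priority matching entry, or None."""
    candidates = [
        entry for entry in table
        if all(field_matches(packet[f], pattern, mask)
               for f, (pattern, mask) in entry["match"].items())
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry["priority"])["actions"]


table = [
    # Empty match: matches everything, lowest priority (table-miss style entry).
    {"priority": 0, "match": {}, "actions": ["controller"]},
    # Upper 8 bits of the IPv4 destination must equal 10 (i.e. 10.0.0.0/8).
    {"priority": 1, "match": {"ip_dst": (0x0A000000, 0xFF000000)}, "actions": ["output:1"]},
    # Exact match on 10.0.0.5 with a higher priority.
    {"priority": 2, "match": {"ip_dst": (0x0A000005, 0xFFFFFFFF)}, "actions": ["output:2"]},
]
print(lookup(table, {"ip_dst": 0x0A000005}))   # all three match, priority 2 wins
print(lookup(table, {"ip_dst": 0x0A010203}))   # only the /8 and the empty match apply
print(lookup(table, {"ip_dst": 0xC0A80001}))   # only the empty match applies
```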
&lt;h4 id="actions-in-openflow">Actions in OpenFlow&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Basic functionality is simple: &amp;ldquo;determine what happens to a packet&amp;rdquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>In reality, OpenFlow makes a distinction between actions, action sets and more general instructions (linked to how the OpenFlow pipeline works)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Action&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>A concrete command to manipulate packets like &amp;ldquo;output on port&amp;rdquo; or &amp;ldquo;push MPLS&amp;rdquo;&lt;/li>
&lt;li>OpenFlow supports
&lt;ul>
&lt;li>&lt;strong>Output&lt;/strong>: forwards a packet&lt;/li>
&lt;li>&lt;strong>Set-field&lt;/strong>: modifies a header field of a packet&lt;/li>
&lt;li>&lt;strong>Push-tag&lt;/strong>: pushes a new tag onto a packet&lt;/li>
&lt;li>&lt;strong>Pop-tag&lt;/strong>: removes a tag from a packet&lt;/li>
&lt;li>Drop a packet: Implicitly defined when no output action is specified&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Action set&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Every packet has its own action set while it is being processed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Changes to the packet can be stored in the set / deleted from the set&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Actual changes are applied when processing ends&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Set is carried between flow tables (in one switch)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>An action set contains &lt;strong>at most one action of a specific type&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Previous instances are overwritten&lt;/li>
&lt;li>Exception: an action set may contain multiple set-field actions, one per header field&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Execution proceeds in a well-defined order&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Modifications to the action set&lt;/p>
&lt;ul>
&lt;li>&lt;strong>write-actions&lt;/strong>: writing new actions to a set&lt;/li>
&lt;li>&lt;strong>clear-actions&lt;/strong>: Removing all actions from the set&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>(Check out the example in &lt;a href="#flow-table">Flow Table&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Instructions&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Control how packets are processed in the switch&lt;/li>
&lt;li>Each flow table entry is associated with a set of instructions
&lt;ul>
&lt;li>
&lt;p>Change the packet immediately (&lt;code>apply&lt;/code>-action)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Change the action set&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Continue processing in another table (&lt;code>goto&lt;/code>-table command)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
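&lt;p>How instructions, the action set and the pipeline fit together can be sketched as a small interpreter. This is a toy model, not the OpenFlow data structures: tables are plain Python functions, the instruction names &lt;code>apply&lt;/code>, &lt;code>write&lt;/code>, &lt;code>clear&lt;/code> and &lt;code>goto&lt;/code> are shorthand, and the action set is a dict keyed by action type (set-field actions would additionally be keyed by field).&lt;/p>

```python
def run_pipeline(tables, packet):
    """Process a packet through numbered flow tables.

    tables: list indexed by table number; each table is a function
    packet -> list of (instruction, argument) pairs, or None on no match.
    Returns the (possibly modified) packet and the final action set.
    """
    action_set = {}                 # action type -> argument (one per type)
    table_id = 0                    # processing starts at flow table 0
    while table_id is not None:
        instructions = tables[table_id](packet) or []
        next_table = None
        for name, arg in instructions:
            if name == "apply":                     # change the packet immediately
                field, value = arg
                packet = {**packet, field: value}
            elif name == "write":                   # add to / overwrite the action set
                action_set.update(arg)
            elif name == "clear":                   # remove everything collected so far
                action_set.clear()
            elif name == "goto":
                if not arg > table_id:
                    raise ValueError("goto-table may only move forward")
                next_table = arg
        table_id = next_table
    # Processing ended: the accumulated action set is executed now.
    # No output action in the set means the packet is dropped.
    return packet, action_set


tables = [
    lambda p: [("write", {"output": 1}), ("goto", 1)],
    lambda p: [("apply", ("vlan", 10)), ("write", {"output": 2})],
]
print(run_pipeline(tables, {"eth_dst": "A"}))
```

&lt;p>The second write of an output action overwrites the first, so the packet leaves on port 2 only, while the VLAN change from &lt;code>apply&lt;/code> takes effect immediately.&lt;/p>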
&lt;h4 id="openflow-channel">OpenFlow Channel&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Connects each switch to a controller&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provides the southbound API functionality of an OpenFlow switch&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Management&lt;/strong> and &lt;strong>configuration&lt;/strong> of switches by controllers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Signaling of events&lt;/strong> from switches to controllers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Monitoring&lt;/strong> of liveness, error states, statistics, &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Experimentation&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Multiple channels to different controllers can be established&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Three message types&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Controller-to-Switch messages&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Inject controller-generated packets (&lt;strong>packet-out&lt;/strong> message)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Modify port properties or switch table entries (&lt;strong>modify-state&lt;/strong> message)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Collect runtime information (&lt;strong>read-state&lt;/strong> message)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Asynchronous messages&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Packet-in&lt;/strong> message transfers control of packet to the controller&lt;/li>
&lt;li>State changes signaled by switches&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Symmetric messages&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Handle connection setup and ensure correct operation&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Hello&lt;/strong>: exchanged on connection startup (e.g., indicate supported versions)&lt;/li>
&lt;li>&lt;strong>Echo&lt;/strong>: verify liveness of controller-switch connections&lt;/li>
&lt;li>&lt;strong>Error&lt;/strong>: indicate error states of the controller or switch&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Experimenter&lt;/strong> messages can offer additional functionality&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="meter-tables">Meter Tables&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2012.00.07.png" alt="截屏2021-03-13 12.00.07" style="zoom:80%;" />
&lt;ul>
&lt;li>Meter table entry&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2012.01.10.png" alt="截屏2021-03-13 12.01.10">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Meters measure and control the &lt;strong>rate&lt;/strong> of packets and bytes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>They are managed in the meter table&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each meter has a unique meter identifier&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Meters are invoked from flow table entries through the meter action&lt;/p>
&lt;/li>
&lt;li>
&lt;p>When invoked, each meter keeps track of the measured rate of packets&lt;/p>
&lt;/li>
&lt;li>
&lt;p>One of several &lt;strong>meter bands&lt;/strong> is triggered when the measured rate exceeds that band&amp;rsquo;s target rate&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Meter bands&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2012.02.29.png" alt="截屏2021-03-13 12.02.29" style="zoom:80%;" />
&lt;ul>
&lt;li>Packet processing by a meter band depends on its &lt;strong>band type&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>DSCP remark&lt;/strong>: implements differentiated services&lt;/li>
&lt;li>&lt;strong>Drop&lt;/strong>: implements simple rate-limiting&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Rate&lt;/strong> and &lt;strong>burst&lt;/strong> determine when a band is executed&lt;/li>
&lt;li>Band types may have additional type-specific &lt;strong>arguments&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
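&lt;p>A minimal sketch of how a meter might pick a band, assuming that among all exceeded bands the one with the highest target rate is applied and ignoring the burst parameter. The dict layout, the &lt;code>prec_level&lt;/code> argument and the kbit/s unit are illustrative assumptions.&lt;/p>

```python
def apply_meter(bands, measured_rate):
    """Pick the meter band for a packet given the currently measured rate.

    bands: list of dicts with "type" ("drop" or "dscp_remark") and "rate".
    Returns the triggered band, or None if the rate is below every band.
    """
    exceeded = [b for b in bands if measured_rate > b["rate"]]
    if not exceeded:
        return None
    # Assumption: among the exceeded bands, the one with the highest rate applies.
    return max(exceeded, key=lambda b: b["rate"])


def process(packet, band):
    if band is None:
        return packet                                  # within limits: unchanged
    if band["type"] == "drop":
        return None                                    # simple rate limiting
    if band["type"] == "dscp_remark":
        # Type-specific argument: by how much the drop precedence is raised.
        return {**packet, "dscp_prec": packet.get("dscp_prec", 0) + band["prec_level"]}
    raise ValueError("unknown band type")


bands = [
    {"type": "dscp_remark", "rate": 1000, "prec_level": 1},   # rates in kbit/s (assumed unit)
    {"type": "drop", "rate": 5000},
]
for rate in (500, 2000, 8000):
    print(rate, process({"id": 1}, apply_meter(bands, rate)))
```

&lt;p>Below 1000 the packet passes untouched, between 1000 and 5000 it is remarked, and above 5000 it is dropped.&lt;/p>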
&lt;h2 id="the-power-of-abstraction">The Power of Abstraction&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2012.08.26.png" alt="截屏2021-03-13 12.08.26" style="zoom:80%;" />
&lt;h3 id="different-abstractions-for-different-apps">Different Abstractions for Different Apps&lt;/h3>
&lt;p>Controller can provide different abstractions to network apps&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Apps should not deal with low level / unnecessary details&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apps only have an abstract view of the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Global view of controller can be different from abstract view of an app&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2012.10.01.png" alt="截屏2021-03-13 12.10.01" style="zoom:80%;" />
&lt;h3 id="examples">Examples&lt;/h3>
&lt;h4 id="big-switch-abstraction">&amp;ldquo;Big Switch Abstraction&amp;rdquo;&lt;/h4>
&lt;p>Consider a security application that manages access control lists&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Controls the access of end systems E1, &amp;hellip; En to services S1, &amp;hellip; Sm&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Details such as the exact position of an end system / service are not required for the application $\rightarrow$ Can be hidden in the abstraction&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2012.11.22.png" alt="截屏2021-03-13 12.11.22">&lt;/p>
&lt;h4 id="network-slicing">Network Slicing&lt;/h4>
&lt;p>Consider a network that has to be &lt;strong>virtualized between multiple customers&lt;/strong>, e.g., Alice and Bob&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Alice is only allowed to utilize S1, S2, and S3&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Bob is only allowed to utilize S2, S3, S5, and S6&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Both customers get an individual (full-meshed) view of the network&lt;/p>
&lt;ul>
&lt;li>This is often called a network &lt;strong>slice&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2012.12.34.png" alt="截屏2021-03-13 12.12.34" style="zoom:80%;" />
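&lt;p>The slicing example can be expressed as a simple filter over the switch set. This sketch only uses the switch names given above. The helper functions and the idea of checking a requested path against the slice are assumptions about how a controller might enforce the separation.&lt;/p>

```python
from itertools import combinations

# Which switches each customer may use (taken from the example above).
SLICES = {
    "Alice": {"S1", "S2", "S3"},
    "Bob": {"S2", "S3", "S5", "S6"},
}


def slice_view(customer):
    """Return the full-meshed virtual topology a customer gets to see."""
    switches = sorted(SLICES[customer])
    return {"switches": switches, "links": list(combinations(switches, 2))}


def allowed(customer, path):
    """Check that a requested path stays inside the customer's slice."""
    return set(path).issubset(SLICES[customer])


print(slice_view("Alice")["links"])
print(allowed("Alice", ["S1", "S2", "S5"]))   # S5 belongs to Bob's slice only
```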
&lt;h2 id="-sdn-challenges">🔴 SDN Challenges&lt;/h2>
&lt;h3 id="controller-connectivity">Controller connectivity&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>SDN requires connectivity between controller and switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Two different connectivity modes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Out-of-band&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Dedicated&lt;/strong> (physical) control channel for messages between controller and switch&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2012.41.52.png" alt="截屏2021-03-13 12.41.52">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cost intensive&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>In-band&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Control messages use &lt;strong>same&lt;/strong> channel as “normal” traffic (data)&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2012.51.35.png" alt="截屏2021-03-13 12.51.35">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiple applications can configure switch&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="scalability">Scalability&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Logically centralized approach requires powerful controllers $\rightarrow$ Size / load of bigger networks can easily overload control plane 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Important parameters with scalability implications&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Number of remotely controlled switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Number of end systems / flows in the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Number of messages processed by controller&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Communication delay between switches and controller&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Possible solution: &lt;strong>Distributed controllers&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2012.53.35.png" alt="截屏2021-03-13 12.53.35">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="consistency">Consistency&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Network view must remain &lt;strong>consistent&lt;/strong> for applications&lt;/p>
&lt;ul>
&lt;li>Synchronize network state information&lt;/li>
&lt;li>Done via the &lt;strong>westbound&lt;/strong> interface&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2016.37.25.png" alt="截屏2021-03-13 16.37.25" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>Controller directly applies internal operations (inside partition) and notifies remote controllers of relevant changes of the network&lt;/p>
&lt;ul>
&lt;li>E.g., C1 applies internal operations in Partition 1 and then notifies C2 of the change.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Apps can perform data plane operations on remote switches&lt;/p>
&lt;ul>
&lt;li>Apps operate on a consistent network view&lt;/li>
&lt;li>Operations are delegated to responsible SDN controller&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2016.45.17.png" alt="截屏2021-03-13 16.45.17" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>Note: Control plane with multiple controllers is a &lt;strong>distributed&lt;/strong> system&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Desirable properties&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Consistency&lt;/strong>&lt;/p>
&lt;p>System responds identically to a request no matter which node receives the request (or does not respond at all)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Availability&lt;/strong>&lt;/p>
&lt;p>System always responds to a request (although response may not be consistent or correct)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Partition tolerance&lt;/strong>&lt;/p>
&lt;p>System continues to function even when specific messages are lost or parts of the network fail&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>CAP theorem&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>It is impossible to provide (atomic) consistency, availability and partition tolerance in a distributed system all at once&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Only &lt;strong>two&lt;/strong> of these can be satisfied at the same time&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-plane-limitations">Data plane limitations&lt;/h3>
&lt;ul>
&lt;li>Flow Table Capacity&lt;/li>
&lt;li>Flow Setup Latency&lt;/li>
&lt;/ul>
&lt;h2 id="sdn-use-cases">SDN Use Cases&lt;/h2>
&lt;ul>
&lt;li>Google B4&lt;/li>
&lt;li>Defense4All&lt;/li>
&lt;li>VMware NSX&lt;/li>
&lt;/ul>
&lt;h2 id="tools">Tools&lt;/h2>
&lt;h3 id="controller-platforms">Controller Platforms&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2016.51.31.png" alt="截屏2021-03-13 16.51.31" style="zoom:67%;" />
&lt;h3 id="virtual-switches">Virtual Switches&lt;/h3>
&lt;ul>
&lt;li>Core component in modern data centers&lt;/li>
&lt;li>Used as &lt;strong>“virtual” Top-of-Rack&lt;/strong> switches&lt;/li>
&lt;/ul>
&lt;h2 id="flow-programming-example">Flow Programming Example&lt;/h2>
&lt;p>This example is taken from HW09.&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2022.25.12.png" alt="截屏2021-03-13 22.25.12">&lt;/p>
&lt;h4 id="describe-the-functionality-that-is-implemented-by-app_1java">Describe the functionality that is implemented by &lt;code>app_1.java&lt;/code>&lt;/h4>
&lt;p>The application has proactive and reactive parts&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Proactive: &lt;code>onConnect()&lt;/code>&lt;/p>
&lt;ul>
&lt;li>&lt;code>r1&lt;/code>: Forward all packets whose IP destination address belongs to &lt;code>28.0.0.0/8&lt;/code> to port 1 (i.e. network N1)&lt;/li>
&lt;li>&lt;code>r2&lt;/code>: Default rule, drops everything&lt;/li>
&lt;li>&lt;code>r3&lt;/code>: Send packets from N1 to controller&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Reactive &lt;code>onPacketIn()&lt;/code>&lt;/p>
&lt;ul>
&lt;li>If a packet is sent to the controller by &lt;code>r3&lt;/code>, check whether the MAC address is valid. If it is valid, forward the packet to port 4 (i.e., network N2). Otherwise, drop it.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="what-port-is-connected-to-the-internet-in-the-given-example">What port is connected to the Internet in the given example?&lt;/h4>
&lt;p>A reasonable assumption here is that N1 is the internal network (because the application can check source validity with MAC addresses) and N2 is the Internet (i.e., the answer is port 4)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-13%2022.32.30.png" alt="截屏2021-03-13 22.32.30" style="zoom:80%;" />
&lt;h4 id="why-r2priority0-is-required">Why is &lt;code>r2.PRIORITY(0)&lt;/code> required?&lt;/h4>
&lt;p>&lt;code>r2.PRIORITY(0)&lt;/code> is required, because &lt;code>r2&lt;/code> is the default rule in this case&lt;/p>
&lt;ul>
&lt;li>Default rules usually have &lt;code>*&lt;/code> match&lt;/li>
&lt;li>0 is the lowest priority (lower than the default priority = 1)&lt;/li>
&lt;/ul>
&lt;h4 id="why-r1priority2-is-required">Why is &lt;code>r1.PRIORITY(2)&lt;/code> required?&lt;/h4>
&lt;p>&lt;code>r1.PRIORITY(2)&lt;/code> is required to ensure that overlapping rules are resolved unambiguously&lt;/p>
&lt;ul>
&lt;li>With the default priority on &lt;code>r1&lt;/code>, &lt;code>r1&lt;/code> and &lt;code>r3&lt;/code> would overlap with equal priority if a packet from N1 is sent with a destination address in 28.0.0.0/8&lt;/li>
&lt;/ul>
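&lt;p>The effect of the three priorities can be illustrated with a small lookup sketch. It assumes that the switch applies the matching rule with the highest priority and that &lt;code>r3&lt;/code> matches on ingress port 1 (the port of N1). The &lt;code>Rule&lt;/code> class, the &lt;code>lookup&lt;/code> function and the packet field names are hypothetical and not taken from &lt;code>app_1.java&lt;/code>.&lt;/p>

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Callable, List


@dataclass
class Rule:
    name: str
    priority: int
    matches: Callable[[dict], bool]  # hypothetical match predicate
    action: str


def lookup(table: List[Rule], packet: dict) -> Rule:
    """Return the matching rule with the highest priority."""
    candidates = [rule for rule in table if rule.matches(packet)]
    return max(candidates, key=lambda rule: rule.priority)


N1_PREFIX = ip_network("28.0.0.0/8")

table = [
    # r1: destination in 28.0.0.0/8 -> port 1 (network N1)
    Rule("r1", 2, lambda p: ip_address(p["ip_dst"]) in N1_PREFIX, "output:1"),
    # r2: wildcard default rule, lowest priority
    Rule("r2", 0, lambda p: True, "drop"),
    # r3: packets arriving from N1 (assumed: ingress port 1) -> controller
    Rule("r3", 1, lambda p: p["in_port"] == 1, "controller"),
]

# A packet from N1 to an address inside 28.0.0.0/8 matches r1, r2 and r3.
internal = {"in_port": 1, "ip_dst": "28.1.2.3"}
# A packet from N1 to any other address matches r2 and r3.
outbound = {"in_port": 1, "ip_dst": "8.8.8.8"}
# A packet from port 4 to an unknown destination matches only r2.
stray = {"in_port": 4, "ip_dst": "9.9.9.9"}

for pkt in (internal, outbound, stray):
    rule = lookup(table, pkt)
    print(rule.name, rule.action)
```

&lt;p>In this sketch the first packet is handled by &lt;code>r1&lt;/code> only because its priority is higher than that of &lt;code>r3&lt;/code>; with equal priorities the result of the lookup would not be well defined.&lt;/p>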
&lt;h4 id="draw-a-sequence-diagram-illustrating-the-processing-of-the-six-consecutive-packets-p1---p6-shown-below-the-diagram-should-contain-the-two-networks-n1-n2-the-switch-s-and-the-controller-c-mark-the-arrows-with-send_rule-packet_in-and-send_packet">Draw a sequence diagram illustrating the processing of the six consecutive packets P1 - P6 shown below. The diagram should contain the two networks (N1, N2), the switch (S) and the controller (C). Mark the arrows with send_rule, packet_in and send_packet&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-13%2022.37.00.png" alt="截屏2021-03-13 22.37.00">&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/SDN-HW09.png"
alt="Solution">&lt;figcaption>
&lt;p>Solution&lt;/p>
&lt;/figcaption>
&lt;/figure></description></item><item><title>Network Function Virtualization (NFV)</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/06-network_function_virtualization/</link><pubDate>Sun, 14 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/06-network_function_virtualization/</guid><description>&lt;h2 id="network-functions">Network Functions&lt;/h2>
&lt;h3 id="middleboxes-and-network-functions">Middleboxes and Network Functions&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Middlebox&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Device on the data path between a source and destination end system&lt;/li>
&lt;li>Performs functions other than the normal, standard functions of an IP router&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.06.02.png" alt="截屏2021-03-14 11.06.02" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Network function&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Functionality of a middlebox&lt;/li>
&lt;li>Executed on the data path&lt;/li>
&lt;li>&lt;em>E.g. &lt;a href="#network-address-translation-nat">Network address translation (NAT)&lt;/a>, &lt;a href="#firewall">firewall&lt;/a>, proxy, load balancing, intrusion detection, &amp;hellip;&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="network-address-translation-nat">Network Address Translation (NAT)&lt;/h4>
&lt;p>Connects a realm with &lt;strong>private addresses&lt;/strong> to an external realm with &lt;strong>globally unique addresses&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Problem: private addresses cannot be used for routing in the Internet&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Solution: Exchange globally unique and private addresses when packets traverse network boundaries&lt;/p>
&lt;p>$\rightarrow$ Clients in the private address range can share globally unique addresses&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.13.14.png" alt="截屏2021-03-14 11.13.14" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
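&lt;p>The address exchange can be sketched as a translation table. The following is a minimal illustration, assuming a port-translating NAT with a single public address; the class name &lt;code>Nat&lt;/code>, the addresses and the port range are made up for the example.&lt;/p>

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class Nat:
    """Toy port-translating NAT with a single public address."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, Endpoint] = {}  # private -> public
        self.in_map: Dict[Endpoint, Endpoint] = {}   # public -> private

    def outbound(self, src: Endpoint) -> Endpoint:
        """Rewrite the private source endpoint of an outgoing packet."""
        if src not in self.out_map:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out_map[src] = public
            self.in_map[public] = src
        return self.out_map[src]

    def inbound(self, dst: Endpoint) -> Endpoint:
        """Rewrite the public destination endpoint of an incoming packet."""
        return self.in_map[dst]  # KeyError means: no mapping, drop the packet


nat = Nat("203.0.113.5")
a = nat.outbound(("10.0.0.2", 5555))
b = nat.outbound(("10.0.0.3", 5555))
print(a, b)            # both clients share the same public address
print(nat.inbound(a))  # the reply is mapped back to the first client
```

&lt;p>Both private clients appear under the same globally unique address and are told apart only by the translated port.&lt;/p>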
&lt;h4 id="firewall">Firewall&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Monitors and controls incoming and outgoing traffic&lt;/p>
&lt;ul>
&lt;li>Establishes barrier between trusted and untrusted networks&lt;/li>
&lt;li>Forwards or drops packets based on pre-defined rule set&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.14.28.png" alt="截屏2021-03-14 11.14.28" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>Variants, e.g.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Shallow vs. deep packet inspection&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>Shallow&lt;/strong>: decisions are based on header fields only (e.g., IP and TCP protocol information)&lt;/li>
&lt;li>&lt;strong>Deep&lt;/strong>: inspects content of higher layer protocols (e.g., detection of malware traffic in application layer protocols)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Stateful vs. stateless processing&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>Stateless&lt;/strong>: every packet is inspected independently of other packets&lt;/li>
&lt;li>&lt;strong>Stateful&lt;/strong>: keeps state between packets (e.g., for every TCP connection to detect invalid sequence numbers)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
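&lt;p>The difference between stateless and stateful processing can be sketched as follows. This is an illustrative toy model, not a real firewall: the rule set &lt;code>ALLOWED_PORTS&lt;/code>, the class &lt;code>StatefulFirewall&lt;/code> and the addresses are assumptions made for the example.&lt;/p>

```python
from typing import Set, Tuple

Flow = Tuple[str, str, int]  # (source IP, destination IP, destination port)

ALLOWED_PORTS = {80, 443}  # hypothetical pre-defined rule set


def stateless_allow(dst_port: int) -> bool:
    """Decide on header fields of the current packet only."""
    return dst_port in ALLOWED_PORTS


class StatefulFirewall:
    """Remembers outgoing flows and admits only matching replies."""

    def __init__(self) -> None:
        self.seen: Set[Flow] = set()

    def outgoing(self, src: str, dst: str, dst_port: int) -> bool:
        if not stateless_allow(dst_port):
            return False
        self.seen.add((src, dst, dst_port))
        return True

    def incoming(self, src: str, src_port: int, dst: str) -> bool:
        # A reply is valid only if the reverse flow was seen before.
        return (dst, src, src_port) in self.seen


fw = StatefulFirewall()
print(fw.outgoing("10.0.0.2", "198.51.100.7", 443))  # allowed, state stored
print(fw.incoming("198.51.100.7", 443, "10.0.0.2"))  # reply to a known flow
print(fw.incoming("198.51.100.9", 443, "10.0.0.2"))  # unsolicited packet
```

&lt;p>The stateless check looks at each packet in isolation, whereas the stateful variant needs memory per flow, which is what makes it more expensive to scale.&lt;/p>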
&lt;h3 id="traditional-middlebox-deployment">Traditional Middlebox Deployment&lt;/h3>
&lt;p>Example: Caching&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Single content provider&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.18.32.png" alt="截屏2021-03-14 11.18.32" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>Multiple content providers&lt;/p>
&lt;p>Place multiple middleboxes at different locations in the network&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.19.00.png" alt="截屏2021-03-14 11.19.00" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Middleboxes are often built as proprietary hardware&lt;/p>
&lt;ul>
&lt;li>Fast, but very inflexible&lt;/li>
&lt;li>Usually closed source $\rightarrow$ black box for the infrastructure operator&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Static wiring&lt;/p>
&lt;ul>
&lt;li>Hard to set up / tear down&lt;/li>
&lt;li>Hard to move&lt;/li>
&lt;li>Hard to upgrade $\rightarrow$ introduce new or bigger boxes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Network operators have to manage many different vendor-specific boxes&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="network-function-virtualization-nfv">Network Function Virtualization (NFV)&lt;/h3>
&lt;p>💡Mimic ideas of cloud computing&lt;/p>
&lt;ul>
&lt;li>Implement network functions in software&lt;/li>
&lt;li>Use virtualization technology to decouple network functions from hardware&lt;/li>
&lt;li>Consolidate functionality on high volume servers, switches and storage&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.21.37.png" alt="截屏2021-03-14 11.21.37" style="zoom: 67%;" />
&lt;p>Network services combine multiple network functions&lt;/p>
&lt;ul>
&lt;li>End-to-end behavior of a network service is the combination of the
individual network functions&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.23.10.png" alt="截屏2021-03-14 11.23.10" style="zoom:67%;" />
&lt;p>👍 Benefits&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Resource sharing&lt;/strong>: Single platform for different applications and users&lt;/li>
&lt;li>&lt;strong>Agility and flexibility&lt;/strong>: Services can scale to address changing demands&lt;/li>
&lt;li>&lt;strong>Rapid deployment and innovation cycles&lt;/strong>: Providers can easily trial and evolve services&lt;/li>
&lt;li>&lt;strong>Reduced costs&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>Consider the caching example above: Networks provide infrastructure for executing software-based network functions (&lt;strong>NFV Infrastructure, NFVI&lt;/strong>)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.27.44.png" alt="截屏2021-03-14 11.27.44" style="zoom:67%;" />
&lt;h4 id="main-building-blocks-of-nfv">Main Building Blocks of NFV&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2011.31.12.png" alt="截屏2021-03-14 11.31.12" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Virtualized Network Functions (VNFs)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>The actual network functions provided in software&lt;/li>
&lt;li>Independent of its deployment (e.g., hardware)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>NFV Management and Orchestration (MANO)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Lifecycle management of VNFs and network services&lt;/li>
&lt;li>Requests resources for VNFs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>NFV Infrastructure (NFVI)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Provides hardware, software and network resources for VNFs&lt;/li>
&lt;li>Decouples VNFs from underlying hardware&lt;/li>
&lt;li>Can contain multiple Points of Presence (PoP)
&lt;ul>
&lt;li>Small data centers, located at different points in the infrastructure&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>SDN is used to transparently reroute flows to PoPs
&lt;ul>
&lt;li>Could also be done with MPLS or other technologies&lt;/li>
&lt;li>SDN and NFV complement each other very well&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.05.20.png" alt="截屏2021-03-14 12.05.20" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Simple deployment example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.06.11.png" alt="截屏2021-03-14 12.06.11" style="zoom: 67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="virtualization">Virtualization&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Provides a &lt;strong>software abstraction layer&lt;/strong> between&lt;/p>
&lt;ul>
&lt;li>Hardware and&lt;/li>
&lt;li>Operating system and applications running in a virtual machine&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ Offers a standardized platform for applications&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The abstraction layer is referred to as &lt;strong>hypervisor&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&amp;ldquo;Resource broker&amp;rdquo; between hardware and virtual machines&lt;/li>
&lt;li>Translates I/O from virtual machines to physical server devices&lt;/li>
&lt;li>Allows multiple operating systems to coexist on a single physical host&lt;/li>
&lt;li>Allows live migration of virtual machines to other hosts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="type-1-hypervisor">Type 1 Hypervisor&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.09.01.png" alt="截屏2021-03-14 12.09.01" style="zoom: 67%;" />
&lt;ul>
&lt;li>Runs &lt;strong>directly on hardware&lt;/strong>
&lt;ul>
&lt;li>High performance&lt;/li>
&lt;li>Strong isolation between virtual machines&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Synchronizes the access of virtual machines to the hardware&lt;/li>
&lt;/ul>
&lt;h3 id="type-2-hypervisor">Type 2 Hypervisor&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.10.35.png" alt="截屏2021-03-14 12.10.35" style="zoom: 67%;" />
&lt;ul>
&lt;li>Runs &lt;strong>on top of a host operating system&lt;/strong>
&lt;ul>
&lt;li>Hypervisor is executed as an application in user space&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Virtual machines provide &lt;strong>virtual hardware&lt;/strong> to guest operating systems
&lt;ul>
&lt;li>Interaction with virtual hardware is directed to physical devices through a
virtual machine driver or the host operating system&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="container-based-virtualization">Container-Based Virtualization&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.14.00.png" alt="截屏2021-03-14 12.14.00" style="zoom: 67%;" />
&lt;ul>
&lt;li>
&lt;p>Single kernel provides multiple &lt;strong>instances (containers)&lt;/strong> of same host operating system&lt;/p>
&lt;ul>
&lt;li>
&lt;p>No hypervisor involved&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Isolation of containers is enforced by host operating system kernel&lt;/p>
&lt;ul>
&lt;li>Each container has its own view of the operating system&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Applications in containers are executed by the host operating system&lt;/p>
&lt;p>$\rightarrow$ Applications depend on host operating system&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Kernel synchronizes access of containers to the hardware&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="service-function-chaining-sfc">Service Function Chaining (SFC)&lt;/h2>
&lt;ul>
&lt;li>Ordered set of network functions
&lt;ul>
&lt;li>Specifies ordering constraints that must be applied to flows&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Enables the creation of composite network services
&lt;ul>
&lt;li>Transparent to end systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Examples
&lt;ul>
&lt;li>Firewall $\rightarrow$ authentication server&lt;/li>
&lt;li>Load balancer $\rightarrow$ cache&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="example-advanced-caching-scenario">Example: Advanced Caching Scenario&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Place additional firewall, authentication and cache on the data path&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sketch&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Required VNFs are instantiated at appropriate PoPs&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.39.24.png" alt="截屏2021-03-14 12.39.24" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>Service function chain is established (flow table entries in the data plane)&lt;/p>
&lt;p>$\rightarrow$ Flow table entries enforce correct order of VNF traversal&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/NFV_example.gif" alt="NFV_example" style="zoom:67%;" />
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;h3 id="mpls-based-service-function-chaining">MPLS-based Service Function Chaining&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2012.28.15.png" alt="截屏2021-03-14 12.28.15" style="zoom: 67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Service classifiers&lt;/strong> select appropriate service function chains (step 1)&lt;/p>
&lt;ul>
&lt;li>Select traffic to be processed in the chain&lt;/li>
&lt;li>Attach a stack of MPLS labels to packets to determine their path through the chain&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Service function forwarders&lt;/strong> deliver packets to network functions&lt;/p>
&lt;ul>
&lt;li>The service function indicated by the topmost MPLS label is applied&lt;/li>
&lt;li>The topmost label is removed from the stack afterwards&lt;/li>
&lt;/ul>
&lt;p>(step 2 - 4)&lt;/p>
&lt;ul>
&lt;li>Normal traffic flow resumes when the MPLS stack is empty (step 5)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
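&lt;p>The label-stack processing described above can be sketched in a few lines. This is a simplified model under stated assumptions: the label values, the table &lt;code>FUNCTIONS&lt;/code>, the classification on destination port 80 and the &lt;code>trace&lt;/code> field are hypothetical, and real MPLS forwarding is not modeled.&lt;/p>

```python
from typing import Callable, Dict, List

# Hypothetical label-to-function table of the service function forwarders.
FUNCTIONS: Dict[int, Callable[[dict], dict]] = {
    100: lambda pkt: {**pkt, "trace": pkt["trace"] + ["firewall"]},
    200: lambda pkt: {**pkt, "trace": pkt["trace"] + ["authentication"]},
    300: lambda pkt: {**pkt, "trace": pkt["trace"] + ["cache"]},
}


def classify(pkt: dict) -> List[int]:
    """Service classifier: choose a chain and return its label stack.

    The first list element is the topmost label.
    """
    if pkt["dst_port"] == 80:
        return [100, 200, 300]
    return []  # traffic that is not processed by any chain


def forward(pkt: dict) -> dict:
    stack = classify(pkt)
    while stack:
        label = stack[0]             # topmost label selects the function
        pkt = FUNCTIONS[label](pkt)  # apply the service function
        stack.pop(0)                 # remove the topmost label afterwards
    return pkt                       # empty stack: normal forwarding resumes


print(forward({"dst_port": 80, "trace": []})["trace"])
print(forward({"dst_port": 22, "trace": []})["trace"])
```

&lt;p>The order of the labels pushed by the classifier is the only thing that fixes the order in which the functions are traversed.&lt;/p>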
&lt;h2 id="-challenges">🔴 Challenges&lt;/h2>
&lt;ul>
&lt;li>Security&lt;/li>
&lt;li>VNF performance&lt;/li>
&lt;li>VNF placement&lt;/li>
&lt;li>Reliability&lt;/li>
&lt;li>Testing and debugging&lt;/li>
&lt;li>Carrier-grade requirements&lt;/li>
&lt;li>Coexistence with legacy networks&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul></description></item><item><title>HMM and Wonham Filter</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/hmm_und_wonham_filter/</link><pubDate>Wed, 29 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/hmm_und_wonham_filter/</guid><description>&lt;p>The &lt;strong>Hidden Markov Model (HMM)&lt;/strong> is a &lt;a href="https://de.wikipedia.org/wiki/Stochastik">stochastic&lt;/a> &lt;a href="https://de.wikipedia.org/wiki/Mathematisches_Modell">model&lt;/a> in which a system is modeled by a &lt;a href="https://de.wikipedia.org/wiki/Markowkette">Markov chain&lt;/a> with unobserved states.&lt;/p>
&lt;blockquote>
&lt;p>Modeling the system as a Markov chain means that it moves randomly from one state to another, where the &lt;a href="https://de.wikipedia.org/wiki/%C3%9Cbergangswahrscheinlichkeit">transition probabilities&lt;/a> depend only on the current state and not on the states visited before.&lt;/p>
&lt;/blockquote>
&lt;p>An HMM consists of&lt;/p>
&lt;ul>
&lt;li>
&lt;p>System model / transition probabilities / transition matrix $\mathbf{A}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Measurement model / emission probabilities / measurement matrix $\mathbf{B}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>State space; state probabilities $\xi_{k}^{\boldsymbol{x}}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Measurements; emission probabilities $\xi_{k}^{\boldsymbol{y}}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Initial state $x_0$ or initial state probability $\xi_{0}^{\boldsymbol{x}}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Example (exercise sheet 4.2)&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-29%2015.45.15.png" alt="截屏2022-06-29 15.45.15">&lt;/p>
&lt;ul>
&lt;li>
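&lt;p>State space (the symbols $S$, $R$, $N$ follow the German terms &lt;em>sonnig&lt;/em>, &lt;em>regnerisch&lt;/em>, &lt;em>neblig&lt;/em>)&lt;/p>
$$
\begin{aligned}
S &amp;=\{\text { Sunny day }\} \\
R &amp;=\{\text { Rainy day }\} \\
N &amp;=\{\text { Foggy day }\}
\end{aligned}
$$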
&lt;/li>
&lt;li>
&lt;p>State vector&lt;/p>
$$
\xi_{k}^{\boldsymbol{x}}=\left[\begin{array}{l}
\mathrm{P}\left(\boldsymbol{x}_{k}=S\right) \\
\mathrm{P}\left(\boldsymbol{x}_{k}=R\right) \\
\mathrm{P}\left(\boldsymbol{x}_{k}=N\right)
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Transition matrix&lt;/p>
$$
\mathbf{A}=\left[\begin{array}{lll}
0.7 &amp; 0.2 &amp; 0.1 \\
0.2 &amp; 0.6 &amp; 0.2 \\
0.4 &amp; 0.3 &amp; 0.3
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Measured values (the symbols $d$, $s$ follow the German terms &lt;em>dreckig&lt;/em>, &lt;em>sauber&lt;/em>)&lt;/p>
$$
\begin{array}{l}
d=\{\text { dirty shoes }\} \\
s=\{\text { clean shoes }\}
\end{array}
$$
&lt;/li>
&lt;li>
&lt;p>Measurement vector&lt;/p>
$$
\underline{\xi}_{k}^{\boldsymbol{y}}=\left[\begin{array}{l}
\mathrm{P}\left(\boldsymbol{y}_{k}=d\right) \\
\mathrm{P}\left(\boldsymbol{y}_{k}=s\right)
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Measurement matrix (rows: states $S$, $R$, $N$; columns: measurements $d$, $s$)&lt;/p>
$$
\mathbf{B}=\left[\begin{array}{ll}
0.1 &amp; 0.9 \\
0.8 &amp; 0.2 \\
0.4 &amp; 0.6
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Initial state probability $\xi_{0}^{\boldsymbol{x}}$ and initial state $x_0$&lt;/p>
$$
\underline{\xi}_{0}^{\boldsymbol{x}}=\left[\begin{array}{l}
1 \\
0 \\
0
\end{array}\right] ; \quad x_{0}=S
$$
&lt;/li>
&lt;/ul>
&lt;p>Model as a state diagram with transition probabilities&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-HMM.drawio.png" alt="wertdiskrete_systeme-HMM.drawio" style="zoom:80%;" />
&lt;h2 id="wonham-filter">Wonham-Filter&lt;/h2>
&lt;p>The Wonham filter is a recursive filter for state estimation in discrete-valued systems.&lt;/p>
&lt;p>The Wonham filter consists of two phases&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Prediction&lt;/p>
$$
\underline{\xi}_{k \mid 1: k-1}^{x}=\mathbf{A}_{k}^{\top} \underline{\xi}_{k-1\mid1: k-1}^{x}
$$
&lt;ul>
&lt;li>$\mathbf{A}_k$
: transition matrix&lt;/li>
&lt;li>$\underline{\xi}_{k-1\mid1: k-1}^{x}$
: previous state estimate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Filtering&lt;/p>
&lt;p>For measurement $y_k = m$:&lt;/p>
$$
\underline{\xi}_{k \mid 1: k}^{x} =\frac{\mathbf{B}(:, m) \odot \xi_{k \mid 1: k-1}^{x}}{\mathbb{1}_{N}^{T} \operatorname{diag}(\mathbf{B}(:, m)) \cdot \xi_{k \mid 1: k-1}^{x}} =\frac{\mathbf{B}(:, m) \odot \xi_{k \mid 1: k-1}^{x}}{\mathbf{B}(:, m)^\top \cdot \xi_{k \mid 1: k-1}^{x}}
$$
&lt;/li>
&lt;/ol>
&lt;p>(For more on the Wonham filter, see &lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/zustandsschaetzung/#zustandsschätzung">here&lt;/a>)&lt;/p>
&lt;p>&lt;strong>Example (continued)&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-29%2016.05.37.png" alt="截屏2022-06-29 16.05.37">&lt;/p>
&lt;p>Time step $k=1$:&lt;/p>
$$
\begin{array}{l}
\underline{\xi}_{1}^{p}=\mathbf{A}^{\top} \underline{\xi}_{0}^{\boldsymbol{x}}=\left[\begin{array}{l}
0.7 \\
0.2 \\
0.1
\end{array}\right] \\\\
\underline{\xi}_{1}^{e}=\frac{\mathbf{B}(:, 1) \odot \underline{\xi}_{1}^{p}}{\mathbf{B}(:, 1)^{\top} \underline{\xi}_{1}^{p}}=\frac{\left[\begin{array}{l}
0.1 \\
0.8 \\
0.4
\end{array}\right] \odot\left[\begin{array}{l}
0.7 \\
0.2 \\
0.1
\end{array}\right]}{\left[\begin{array}{lll}
0.1 &amp; 0.8 &amp; 0.4
\end{array}\right]\left[\begin{array}{l}
0.7 \\
0.2 \\
0.1
\end{array}\right]}=\frac{\left[\begin{array}{l}
0.07 \\
0.16 \\
0.04
\end{array}\right]}{0.27}=\left[\begin{array}{l}
0.25926 \\
0.59259 \\
0.14815
\end{array}\right]
\end{array}
$$
&lt;p>$P(\boldsymbol{x}_1 = R) = 0.59259$ is the largest entry of $\underline{\xi}_{1}^{e}$. $\Rightarrow$ The estimate indicates a rainy day.&lt;/p>
&lt;p>Time step $k=2$:&lt;/p>
$$
\begin{aligned}
\underline{\xi}_{2}^{p} &amp;=\mathbf{A}^{\top} \underline{\xi}_{1}^{e}=\left[\begin{array}{l}
0.35926 \\
0.45185 \\
0.18889
\end{array}\right] \\
\underline{\xi}_{2}^{e} &amp;=\frac{\mathbf{B}(:, 1) \odot \xi_{2}^{p}}{\mathbf{B}(:, 1)^{\top} \xi_{2}^{p}}=\left[\begin{array}{l}
0.07596 \\
0.76429 \\
0.15975
\end{array}\right]
\end{aligned}
$$
&lt;p>$\Rightarrow$ The estimate indicates a rainy day.&lt;/p>
&lt;p>Time step $k=3$:&lt;/p>
$$
\underline{\xi}_{3}^{p}=\mathbf{A}^{\top} \underline{\xi}_{2}^{e}=\left[\begin{array}{l}
0.26993 \\
0.52169 \\
0.20838
\end{array}\right]
$$
$$
\xi_{3}^{e}=\frac{\mathbf{B}(:, 2) \odot \xi_{3}^{p}}{\mathbf{B}(:, 2)^{\top} \xi_{3}^{p}}=\left[\begin{array}{l}
0.51437 \\
0.22091 \\
0.26472
\end{array}\right]
$$
&lt;p>$\Rightarrow$ The estimate indicates a sunny day.&lt;/p>
&lt;p>Time step $k=4$:&lt;/p>
$$
\begin{array}{l}
\underline{\xi}_{4}^{p}=\mathbf{A}^{\top} \underline{\xi}_{3}^{e}=\left[\begin{array}{l}
0.51013 \\
0.31484 \\
0.17504
\end{array}\right]\\
\xi_{4}^{e}=\frac{\mathbf{B}(:, 2) \odot \xi_{4}^{p}}{\mathbf{B}(:, 2)^{\top} \xi_{4}^{p}}=\left[\begin{array}{l}
0.73212 \\
0.10041 \\
0.16747
\end{array}\right]
\end{array}
$$
&lt;p>$\Rightarrow$ The estimate indicates a sunny day.&lt;/p>
&lt;p>&lt;strong>Example (continued)&lt;/strong>&lt;/p>
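&lt;p>The four time steps above can be reproduced with a few lines of plain Python. This is a minimal sketch that assumes the matrices $\mathbf{A}$ and $\mathbf{B}$ from the example and the measurement sequence $d, d, s, s$ implied by the columns $\mathbf{B}(:,1)$, $\mathbf{B}(:,1)$, $\mathbf{B}(:,2)$, $\mathbf{B}(:,2)$ used in the calculation. The function names &lt;code>predict&lt;/code> and &lt;code>update&lt;/code> are illustrative, and columns are indexed from 0 in the code.&lt;/p>

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]

A: Matrix = [[0.7, 0.2, 0.1],
             [0.2, 0.6, 0.2],
             [0.4, 0.3, 0.3]]  # rows: current state S, R, N
B: Matrix = [[0.1, 0.9],
             [0.8, 0.2],
             [0.4, 0.6]]       # columns: measurement d, s


def predict(xi: Vector) -> Vector:
    """Prediction step: xi_p = A^T xi."""
    n = len(xi)
    return [sum(A[i][j] * xi[i] for i in range(n)) for j in range(n)]


def update(xi_p: Vector, m: int) -> Vector:
    """Filter step: element-wise product with column m of B, then normalize."""
    unnormalized = [B[i][m] * xi_p[i] for i in range(len(xi_p))]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


xi = [1.0, 0.0, 0.0]    # initial state probability: x_0 = S
for m in [0, 0, 1, 1]:  # measurements d, d, s, s (0-based column index)
    xi = update(predict(xi), m)
    print([round(value, 5) for value in xi])
```

&lt;p>The normalization in &lt;code>update&lt;/code> corresponds to the denominator $\mathbf{B}(:, m)^\top \xi^{p}$ of the filter equation, so every printed vector sums to one.&lt;/p>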
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-29%2018.20.53.png" alt="截屏2022-06-29 18.20.53">&lt;/p>
&lt;p>Solution:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-06-29%2018.21.12.png" alt="截屏2022-06-29 18.21.12">&lt;/p></description></item><item><title>Internet Congestion Control</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/07-internet_congestion_control/</link><pubDate>Sun, 14 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/07-internet_congestion_control/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/TCP_congestion_control%20%283%29.png"
alt="TCP congestion control summary">&lt;figcaption>
&lt;p>TCP congestion control summary&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;p>Focus on&lt;/p>
&lt;ul>
&lt;li>congestion control in the context of the Internet and its transport protocol TCP&lt;/li>
&lt;li>implicit window-based congestion control unless explicitly stated differently&lt;/li>
&lt;/ul>
&lt;h2 id="basics">Basics&lt;/h2>
&lt;h3 id="shared-network-resources">Shared (Network) Resources&lt;/h3>
&lt;ul>
&lt;li>General problem: Multiple users use same resource
&lt;ul>
&lt;li>E.g., multiple video streams use same network link&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>🎯 High level objective with respect to networks
&lt;ul>
&lt;li>Provide good utilization of network resources&lt;/li>
&lt;li>Provide acceptable performance for users&lt;/li>
&lt;li>Provide fairness among users / data streams&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Mechanisms that deal with shared resources
&lt;ul>
&lt;li>Scheduling&lt;/li>
&lt;li>Medium access control&lt;/li>
&lt;li>Congestion control&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="congestion-control-problem">Congestion Control Problem&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.09.31.png" alt="截屏2021-03-14 17.09.31" style="zoom:67%;" />
&lt;ul>
&lt;li>&lt;strong>Adjusts load&lt;/strong> introduced to shared resource in order to avoid overload situations&lt;/li>
&lt;li>Utilizes feedback information (implicit or explicit)&lt;/li>
&lt;/ul>
&lt;p>“Critical” Situations&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Example 1&lt;/p>
&lt;p>Router concurrently receives two packets from different input interfaces which are directed towards the same output interface. $\rightarrow$ Only one of these packets can be sent at a time.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.11.40.png" alt="截屏2021-03-14 17.11.40" style="zoom:80%;" />
&lt;p>What to do with the other packet?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Buffer or&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Drop&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example 2&lt;/p>
&lt;p>Router has interfaces with different data rates&lt;/p>
&lt;ul>
&lt;li>Input interface has high data rate&lt;/li>
&lt;li>Output interface has low data rate&lt;/li>
&lt;/ul>
&lt;p>Two successive packets from the same sender or from different senders arrive at the input interface.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.12.58.png" alt="截屏2021-03-14 17.12.58" style="zoom:80%;" />
&lt;p>What to do with the second packet? The output interface is still busy sending the first packet while the second arrives.&lt;/p>
&lt;ul>
&lt;li>Buffer or&lt;/li>
&lt;li>Drop&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="buffer">Buffer&lt;/h4>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">The terms &lt;strong>buffer&lt;/strong> and &lt;strong>queue&lt;/strong> are used interchangeably.&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>Routers need buffers (queues) to cope with temporary traffic bursts&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Packets that can NOT be transmitted immediately are placed in the buffer&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If buffer is filled up, packets need to be dropped 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Buffers add latency&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Typically implemented as FIFO queues&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Router can only start sending a queued packet after all packets in front of it have been sent&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-14%2017.20.06.png"
alt="Five green packets introduce queueing delay for blue packet">&lt;figcaption>
&lt;p>Five green packets introduce queueing delay for blue packet&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>End-to-end latency&lt;/strong> of a packet includes&lt;/p>
&lt;ul>
&lt;li>Propagation delay&lt;/li>
&lt;li>Transmission delay&lt;/li>
&lt;li>Queueing delay&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
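&lt;p>The three latency components can be added up for a single link. The sketch below assumes equally sized packets, a FIFO queue in which no packet is currently being transmitted, and made-up numbers (1500-byte packets, 10 Mbit/s, 5 ms propagation delay); the function names are illustrative.&lt;/p>

```python
def transmission_delay(packet_bits: int, rate_bps: float) -> float:
    """Time needed to put one packet onto the link."""
    return packet_bits / rate_bps


def end_to_end_latency(packet_bits: int, rate_bps: float,
                       propagation_s: float, packets_ahead: int) -> float:
    """Latency over a single link with a FIFO queue of equally sized packets."""
    queueing = packets_ahead * transmission_delay(packet_bits, rate_bps)
    return propagation_s + transmission_delay(packet_bits, rate_bps) + queueing


# Illustrative numbers: 1500-byte packets, 10 Mbit/s link, 5 ms propagation.
bits = 1500 * 8
empty = end_to_end_latency(bits, 10e6, 0.005, packets_ahead=0)
queued = end_to_end_latency(bits, 10e6, 0.005, packets_ahead=5)
print(f"{empty * 1000:.1f} ms with an empty queue")
print(f"{queued * 1000:.1f} ms behind five queued packets")
```

&lt;p>With these numbers the five queued packets roughly double the latency, although propagation and transmission delay are unchanged.&lt;/p>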
&lt;p>&lt;strong>General Problem&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.22.37.png" alt="截屏2021-03-14 17.22.37" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Sender wants to send data through the network to the receiver&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On every network path, the link with the &lt;strong>lowest available data rate&lt;/strong> limits the maximum data rate that can be achieved end-to-end&lt;/p>
&lt;ul>
&lt;li>This link is called &lt;strong>bottleneck link&lt;/strong>&lt;/li>
&lt;li>The maximum data rate of a link is called &lt;strong>link capacity&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem: sender can send more data than bottleneck link can handle&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Sender can &lt;strong>overload&lt;/strong> bottleneck link! 🤪&lt;/p>
&lt;p>$\rightarrow$ Sender has to adjust its sending rate&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
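&lt;p>Finding the bottleneck of a known path amounts to taking the minimum over the available data rates. The link names and rates below are hypothetical, and a real sender does not have this view of the path, which is exactly why the problem is hard.&lt;/p>

```python
from typing import Dict

# Hypothetical available data rates along one path, in Mbit/s.
path: Dict[str, float] = {
    "sender-r1": 1000.0,
    "r1-r2": 100.0,
    "r2-receiver": 400.0,
}

bottleneck = min(path, key=path.get)  # link with the lowest available rate
max_rate = path[bottleneck]
offered = 250.0                       # rate the sender would like to use
excess = max(0.0, offered - max_rate)

print(f"bottleneck link: {bottleneck} ({max_rate} Mbit/s)")
print(f"excess load at the bottleneck: {excess} Mbit/s")
```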
&lt;p>How to find the “optimal” sending rate?&lt;/p>
&lt;p>&lt;strong>Congestion Control vs. Flow Control&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.25.55.png" alt="截屏2021-03-14 17.25.55" style="zoom:80%;" />
&lt;ul>
&lt;li>&lt;strong>Flow control&lt;/strong>
&lt;ul>
&lt;li>Bottleneck is located at &lt;strong>receiver&lt;/strong> side&lt;/li>
&lt;li>Receiver cannot cope with the desired data rate of the sender&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Congestion control&lt;/strong>
&lt;ul>
&lt;li>Bottleneck is located in the &lt;strong>network&lt;/strong>&lt;/li>
&lt;li>Bottleneck link does not provide sufficient available data rate
&lt;ul>
&lt;li>Leads to congested router / intermediate system&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="congestion-collapse">Congestion Collapse&lt;/h3>
&lt;h4 id="throughput-vs-goodput">Throughput vs. Goodput&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Throughput&lt;/strong>: Amount of network layer data delivered in a time interval&lt;/p>
&lt;ul>
&lt;li>E.g., 1 Gbit/s&lt;/li>
&lt;li>Counts everything &lt;strong>including&lt;/strong> retransmissions&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ the aggregated amount of data that flows through the router/link&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Goodput&lt;/strong>: “Application-level” throughput&lt;/p>
&lt;ul>
&lt;li>Amount of application data delivered in a time interval&lt;/li>
&lt;li>Retransmissions at the transport layer do NOT count&lt;/li>
&lt;li>Packets dropped in transmission do NOT count&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
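&lt;p>The two metrics can be compared on a small packet log. The log below is invented for illustration and covers an interval of one second. The sketch assumes that throughput counts all delivered network layer bytes (headers and retransmissions included), while goodput counts each delivered piece of application data only once.&lt;/p>

```python
from typing import List, Tuple

# Hypothetical log: (sequence number, payload bytes, header bytes, delivered)
log: List[Tuple[int, int, int, bool]] = [
    (1, 1460, 40, True),
    (2, 1460, 40, False),  # dropped in transmission
    (2, 1460, 40, True),   # retransmission, first successful delivery
    (1, 1460, 40, True),   # spurious retransmission, duplicate data
]

delivered = [entry for entry in log if entry[3]]

# Throughput: every delivered packet counts, including headers and duplicates.
throughput_bits = sum((payload + header) * 8
                      for _, payload, header, _ in delivered)

# Goodput: payload of each sequence number is counted once.
unique_payload = {seq: payload for seq, payload, _, _ in delivered}
goodput_bits = sum(unique_payload.values()) * 8

print(f"throughput: {throughput_bits} bit/s, goodput: {goodput_bits} bit/s")
```

&lt;p>The gap between the two values grows with the number of retransmissions, which is the effect behind the congestion collapse discussed next.&lt;/p>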
&lt;p>&lt;strong>Observation&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-14%2017.42.58.png" alt="截屏2021-03-14 17.42.58" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Load is small (below network capacity) $\rightarrow$ network keeps up with load&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Load reaches network capacity (&lt;strong>knee&lt;/strong>)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Goodput stops increasing, buffers build up, end-to-end latency increases&lt;/p>
&lt;p>$\rightarrow$ &lt;strong>Network is congested!&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Load increases beyond &lt;strong>cliff&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Packets start to be dropped, goodput drastically decreases
$\rightarrow$ &lt;strong>Congestion collapse&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;ul>
&lt;li>&lt;strong>Load&lt;/strong> refers to aggregated network layer traffic that is introduced by all active data streams. This includes TCP retransmissions.&lt;/li>
&lt;li>&lt;strong>Network capacity&lt;/strong> refers to maximum load that network can handle.&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
&lt;h4 id="how-could-congestion-collapse-happen">How Could Congestion Collapse Happen?&lt;/h4>
&lt;p>Congestion due to&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Single&lt;/strong> TCP connection
&lt;ul>
&lt;li>Exceeds available capacity at bottleneck link&lt;/li>
&lt;li>Prerequisite: flow control window is large enough&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Multiple&lt;/strong> TCP connections
&lt;ul>
&lt;li>Aggregated load exceeds available capacity&lt;/li>
&lt;li>Single TCP connection has no knowledge about other TCP connections&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="knee-and-cliff">Knee and Cliff&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-14%2017.46.50.png" alt="截屏2021-03-14 17.46.50">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Keep traffic load around knee&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Good utilization of network capacity&lt;/li>
&lt;li>Low latencies&lt;/li>
&lt;li>Stable goodput&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Prevent traffic from going over the cliff&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>High latencies&lt;/li>
&lt;li>High packet losses&lt;/li>
&lt;li>Highly decreased goodput&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="challenge-of-congestion-control">Challenge of Congestion Control&lt;/h4>
&lt;ul>
&lt;li>Challenge: Find “optimal” sending rate&lt;/li>
&lt;li>Usually, sender has NO global view of the network&lt;/li>
&lt;li>NO trivial answer
&lt;ul>
&lt;li>Lots of algorithms for congestion control&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="types-of-congestion-control">Types of Congestion Control&lt;/h2>
&lt;h4 id="window-based-congestion-control">Window-based Congestion Control&lt;/h4>
&lt;p>Congestion Control Window (&lt;strong>𝐶𝑊𝑛𝑑&lt;/strong>)&lt;/p>
&lt;ul>
&lt;li>Determines maximum number of unacknowledged packets allowed per
TCP connection&lt;/li>
&lt;li>Assumes that packets are acknowledged by receiver&lt;/li>
&lt;li>Basic window mechanism is similar to sliding window as applied for flow control purposes&lt;/li>
&lt;li>Adjusts sending rate of source to bottleneck capacity $\rightarrow$ self-clocking&lt;/li>
&lt;/ul>
&lt;h4 id="rate-based-congestion-control">Rate-based Congestion Control&lt;/h4>
&lt;p>Controls sending rate, no congestion control window&lt;/p>
&lt;ul>
&lt;li>Implemented by timers that determine inter packet intervals
&lt;ul>
&lt;li>High precision required&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>🔴 Problem: NO comparable cut-off mechanism, such as missing acknowledgements
&lt;ul>
&lt;li>Sender keeps sending even in case of congestion&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Needed in case no acknowledgements are used
&lt;ul>
&lt;li>E.g., UDP&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="implicit-vs-explicit-congestion-signals">Implicit vs. Explicit Congestion Signals&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Implicit&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Without dedicated support of the network&lt;/li>
&lt;li>Implicit congestion signals
&lt;ul>
&lt;li>Timeout of retransmission timer&lt;/li>
&lt;li>Receipt of duplicate acknowledgements&lt;/li>
&lt;li>Round-Trip Time (RTT) variation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Explicit&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Nodes inside the network indicate congestion&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>On the internet&lt;/p>
&lt;ul>
&lt;li>Usually NO support for explicit congestion signals&lt;/li>
&lt;li>Congestion control must work with implicit congestion signals only&lt;/li>
&lt;/ul>
&lt;h4 id="end-to-end-vs-hop-by-hop">End-to-end vs. Hop-by-hop&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>End-to-end&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Congestion control operates on an &lt;strong>end system basis&lt;/strong>&lt;/li>
&lt;li>Nodes inside the network are NOT involved&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Hop-by-hop&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Congestion control operates on a &lt;strong>per hop basis&lt;/strong>&lt;/li>
&lt;li>Nodes inside the network are actively involved&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="improved-versions-of-tcp">Improved Versions of TCP&lt;/h3>
&lt;p>🎯 Goal&lt;/p>
&lt;ul>
&lt;li>Estimate available network capacity in order to avoid overload situations
&lt;ul>
&lt;li>Provide feedback (&lt;strong>congestion signal&lt;/strong>)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Limit the traffic introduced into the network accordingly
&lt;ul>
&lt;li>Apply &lt;strong>congestion control&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="tcp-tahoe">TCP Tahoe&lt;/h2>
&lt;h3 id="tcp-recap">TCP Recap&lt;/h3>
&lt;ul>
&lt;li>Connection &lt;strong>establishment&lt;/strong>
&lt;ul>
&lt;li>3 way handshake $\rightarrow$ Full duplex connection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Connection &lt;strong>termination&lt;/strong>
&lt;ul>
&lt;li>Separately for each direction of transmission&lt;/li>
&lt;li>4 way handshake&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Data transfer&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>Byte&lt;/strong>-oriented sequence numbers&lt;/li>
&lt;li>Go-back-N
&lt;ul>
&lt;li>Positive cumulative acknowledgements&lt;/li>
&lt;li>Timeout&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Flow control (sliding window)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="tcp-tahoe-in-a-nutshell">TCP Tahoe in a Nutshell&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Mechanisms used for congestion control&lt;/p>
&lt;ul>
&lt;li>Slow start&lt;/li>
&lt;li>Timeout&lt;/li>
&lt;li>Congestion avoidance&lt;/li>
&lt;li>Fast retransmit&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Congestion signal&lt;/p>
&lt;ul>
&lt;li>Retransmission timeout or&lt;/li>
&lt;li>Receipt of duplicate acknowledgements (𝑑𝑢𝑝𝑎𝑐𝑘)&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ In case of congestion signal: slow start&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The following must always be valid
&lt;/p>
$$
\text { LastByteSent }-\text { LastByteAcked } \leq \text { min\{CWnd, RcvWindow\} }
$$
&lt;ul>
&lt;li>$\text{CWnd}$: Congestion Control Window&lt;/li>
&lt;li>$\text{RcvWindow}$: Flow Control Window&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Variables&lt;/p>
&lt;ul>
&lt;li>$\text{CWnd}$: Congestion window&lt;/li>
&lt;li>$\text{SSThresh}$: Slow Start Threshold
&lt;ul>
&lt;li>Value of $\text{CWnd}$ at which the TCP instance switches from slow start to congestion avoidance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Basic approach: &lt;strong>AIMD (additive increase, multiplicative decrease)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Additive increase&lt;/strong> of $\text{CWnd}$ after receipt of an acknowledgement&lt;/li>
&lt;li>&lt;strong>Multiplicative decrease&lt;/strong> of $\text{CWnd}$ if packet loss is assumed (congestion signal)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Initial values&lt;/p>
&lt;ul>
&lt;li>$\text{CWnd}=1 \text{ MSS}$
&lt;ul>
&lt;li>$\text{MSS}$: Maximum Segment Size&lt;/li>
&lt;li>Since RFC 2581: Initial Window $\text{IW} \leq 2 \cdot \text{MSS}$ and $\text{CWnd}=\text{IW}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$\text{SSThresh}$ initially set to “infinite”&lt;/li>
&lt;li>Number of duplicate ACKs (congestion signal): 3&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="algorithm">&lt;strong>Algorithm&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>$\text{CWnd} &lt; \text{SSThres}$ and ACKs are being received: &lt;strong>slow start&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Exponential increase of congestion window
&lt;ul>
&lt;li>Upon receipt of an ACK: $\text{CWnd } \text{+= } 1$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$\text{CWnd} \geq \text{SSThresh}$ and ACKs are being received: &lt;strong>congestion avoidance&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Linear increase of congestion window
&lt;ul>
&lt;li>Upon receipt of an ACK : $\text{CWnd } \text{+= } 1/\text{CWnd}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Congestion signal: timeout or 3 duplicate acknowledgements: &lt;strong>slow start&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Congestion is assumed&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Set
&lt;/p>
$$
\text{SSThresh} = \max (\text{FlightSize} / 2,\ 2 \cdot \text{MSS})
$$
&lt;ul>
&lt;li>$\text { FlightSize }$: amount of data that has been sent but not yet acknowledged
&lt;ul>
&lt;li>This amount is currently in transit&lt;/li>
&lt;li>Might also be limited due to flow control&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Set $\text{CWnd}=1 \text{ MSS}$ or $\text{CWnd}=\text{IW}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On 3 duplicate ACKs: retransmission of potentially lost TCP segment&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
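&lt;p>The per-ACK rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: windows are counted in units of MSS, and the class and method names (&lt;code>TahoeSender&lt;/code>, &lt;code>on_ack&lt;/code>, &lt;code>on_congestion_signal&lt;/code>) are hypothetical and not part of any real TCP stack.&lt;/p>

```python
from dataclasses import dataclass

MSS = 1  # assumption: windows are counted in segments (units of MSS)


@dataclass
class TahoeSender:
    cwnd: float = 1.0                # congestion window, starts at 1 MSS
    ssthresh: float = float("inf")   # slow start threshold, initially "infinite"

    def on_ack(self) -> None:
        """An ACK for new data arrived."""
        if self.ssthresh > self.cwnd:
            self.cwnd += 1.0               # slow start: +1 MSS per ACK
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance: about +1 MSS per RTT

    def on_congestion_signal(self, flight_size: float) -> None:
        """Timeout or 3 duplicate ACKs: Tahoe always restarts slow start."""
        self.ssthresh = max(flight_size / 2, 2 * MSS)
        self.cwnd = 1.0 * MSS


sender = TahoeSender()
for _ in range(15):                  # 15 ACKs in slow start: 1 -> 16
    sender.on_ack()
print(sender.cwnd)                   # 16.0
sender.on_congestion_signal(flight_size=sender.cwnd)
print(sender.ssthresh, sender.cwnd)  # 8.0 1.0
```

&lt;p>Retransmitting the lost segment on 3 duplicate ACKs is deliberately left out here; the sketch only tracks the two window variables.&lt;/p>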
&lt;h4 id="example">Example&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.24.12.png" alt="截屏2021-03-15 17.24.12" style="zoom:67%;" />
&lt;h4 id="evolution-of-congestion-window">Evolution of Congestion Window&lt;/h4>
&lt;p>Assumptions&lt;/p>
&lt;ul>
&lt;li>No transmission errors, no packet losses&lt;/li>
&lt;li>All TCP segments and acknowledgements are transmitted/received within single RTT&lt;/li>
&lt;li>Flight-size equals CWnd&lt;/li>
&lt;li>Congestion signal occurs during RTT&lt;/li>
&lt;/ul>
&lt;p>Initialize $\text{CWnd} = 1 \text{ MSS}$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.26.23.png" alt="截屏2021-03-15 17.26.23" style="zoom: 50%;" />
&lt;p>$\text{CWnd}$ grows in &amp;ldquo;slow start&amp;rdquo; mode. When $\text{CWnd} = 16 \text{ MSS}$, a timeout occurs.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.28.10.png" alt="截屏2021-03-15 17.28.10" style="zoom: 50%;" />
&lt;p>This is a congestion signal. So we go back to &amp;ldquo;slow start&amp;rdquo;&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Set $\text{SSThresh} = \max (\text{FlightSize} / 2,\ 2 \cdot \text{MSS})$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>In this case, $\text{FlightSize} = 16 \text{ MSS}$.&lt;/p>
&lt;p>So $\text{SSThresh} = \max (16 / 2,\ 2) \text{ MSS} = 8 \text{ MSS}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Set $\text{CWnd}=1 \text{ MSS}$ or $\text{CWnd}=\text{IW}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.31.41.png" alt="截屏2021-03-15 17.31.41" style="zoom: 50%;" />
&lt;p>Now $\text{CWnd} \geq \text{SSThresh}$ $\rightarrow$ Switch to &amp;ldquo;congestion avoidance&amp;rdquo;!&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.37.19.png" alt="截屏2021-03-15 17.37.19" style="zoom: 50%;" />
&lt;p>When $\text{CWnd} = 12 \text{ MSS}$, another timeout occurs.&lt;/p>
&lt;p>We just perform the same handling as above.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.37.34.png" alt="截屏2021-03-15 17.37.34" style="zoom: 50%;" />
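&lt;p>The walk-through above can be reproduced with a small per-RTT simulation. It is a sketch under the stated assumptions (FlightSize equals CWnd, one window per RTT, timeouts injected when CWnd reaches 16 and then 12 MSS); the function name &lt;code>tahoe_trace&lt;/code> and its parameters are made up for this illustration.&lt;/p>

```python
def tahoe_trace(timeouts_at, rounds):
    """Return CWnd (in MSS) at the start of each RTT round."""
    cwnd, ssthresh = 1, float("inf")
    pending = list(timeouts_at)   # CWnd values at which a timeout is injected
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if pending and cwnd == pending[0]:
            pending.pop(0)
            ssthresh = max(cwnd // 2, 2)    # FlightSize equals CWnd here
            cwnd = 1                        # restart slow start
        elif ssthresh > cwnd:
            cwnd = min(2 * cwnd, ssthresh)  # slow start: doubles every RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 MSS per RTT
    return trace


print(tahoe_trace(timeouts_at=[16, 12], rounds=16))
# [1, 2, 4, 8, 16, 1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4]
```

&lt;p>After the first timeout the threshold is 8 MSS, so the window doubles up to 8 and then grows by one segment per RTT until the second timeout at 12 MSS, which lowers the threshold to 6 MSS.&lt;/p>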
&lt;h4 id="fast-retransmit">Fast Retransmit&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Assume the following scenario&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-15%2017.50.38.png" alt="截屏2021-03-15 17.50.38">&lt;/p>
&lt;p>(Note: Not every segment that is received out of order indicates congestion.
E.g., only one segment is dropped, otherwise data transfer is ok)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>What would happen?
Wait until retransmission timer expires, then retransmission&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Waiting time is longer than a round trip time (RTT) $\rightarrow$ It will take a long time!🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Our goal is &lt;strong>faster reaction&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Retransmit after receiving a pre-defined number of duplicate ACKs&lt;/p>
&lt;p>$\rightarrow$ Much faster than waiting for expiration of retransmission timer&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example: suppose the pre-defined number of duplicate ACKs is 3&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-15%2017.53.01.png" alt="截屏2021-03-15 17.53.01" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;h2 id="tcp-reno">TCP Reno&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Differentiation between&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Major congestion signal&lt;/strong>: Timeout of retransmission timer&lt;/li>
&lt;li>&lt;strong>Minor congestion signal&lt;/strong>: Receipt of duplicate ACKs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In case of a major congestion signal&lt;/p>
&lt;ul>
&lt;li>Reset to slow start as in TCP Tahoe&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In case of minor congestion signal&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>No reset to slow start&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Receipt of duplicate ACK implies successful delivery of new segments, i.e., packets have left the network&lt;/li>
&lt;li>New packets can also be injected in the network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In addition to the mechanisms of TCP Tahoe: &lt;strong>fast recovery&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Controls sending of new segments until receipt of a non-duplicate ACK&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="fast-recovery">Fast Recovery&lt;/h3>
&lt;ul>
&lt;li>Starting condition: Receipt of a specified number of duplicate ACKs
&lt;ul>
&lt;li>Usually set to 3 duplicate ACKs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>💡 Idea: New segments should continue to be sent, even if packet loss is not yet recovered
&lt;ul>
&lt;li>Self-clocking continues&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Reaction
&lt;ul>
&lt;li>Reduce network load by halving the congestion window&lt;/li>
&lt;li>Retransmit first missing segment (fast retransmit)&lt;/li>
&lt;li>Consider continuous activity, i.e., further received segments while no new data is acknowledged
&lt;ul>
&lt;li>Increase congestion window by number of duplicate ACKs (usually 3)&lt;/li>
&lt;li>Further increase after receipt of each additional duplicate ACK&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Receipt of new ACK (new data is acknowledged)
&lt;ul>
&lt;li>Set congestion window to $\text{SSThresh}$, i.e., the halved value determined at the beginning of fast recovery&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="in-congestion-avoidance">In Congestion Avoidance&lt;/h3>
&lt;ul>
&lt;li>If timeout: &lt;strong>slow start&lt;/strong>
&lt;ul>
&lt;li>Set $\text{SSThresh} = \max (\text{FlightSize} / 2,\ 2 \cdot \text{MSS})$&lt;/li>
&lt;li>$\text{CWnd}=1$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>If 3 duplicate ACKs: &lt;strong>fast recovery&lt;/strong>
&lt;ul>
&lt;li>Retransmission of oldest unacknowledged segment (&lt;strong>fast retransmit&lt;/strong>)&lt;/li>
&lt;li>Set $\text{SSThresh} = \max (\text{FlightSize} / 2,\ 2 \cdot \text{MSS})$&lt;/li>
&lt;li>Set $\text{CWnd} = \text{SSThresh} + 3\text{MSS}$&lt;/li>
&lt;li>Receipt of additional duplicate ACK
&lt;ul>
&lt;li>$\text{CWnd } \text{+= } 1$&lt;/li>
&lt;li>Send new, i.e., not yet sent segments (if available)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Receipt of a “new” ACK: &lt;strong>congestion avoidance&lt;/strong>
&lt;ul>
&lt;li>$\text{CWnd} = \text{SSThresh}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
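&lt;p>The Reno reactions listed above can be written as a small state machine. The following Python sketch counts windows in units of MSS and does not model the actual retransmission; &lt;code>RenoSender&lt;/code> and its method names are hypothetical names chosen for this example.&lt;/p>

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    cwnd: float = 1.0                # congestion window in MSS
    ssthresh: float = float("inf")   # slow start threshold in MSS
    in_fast_recovery: bool = False
    dup_acks: int = 0

    def on_timeout(self, flight_size: float) -> None:
        # Major congestion signal: back to slow start, as in Tahoe
        self.ssthresh = max(flight_size / 2, 2.0)
        self.cwnd = 1.0
        self.in_fast_recovery = False
        self.dup_acks = 0

    def on_dup_ack(self, flight_size: float) -> None:
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.cwnd += 1.0   # one more segment has left the network
        elif self.dup_acks == 3:
            # Minor congestion signal: fast retransmit (not modelled here),
            # then enter fast recovery with an inflated window
            self.ssthresh = max(flight_size / 2, 2.0)
            self.cwnd = self.ssthresh + 3.0
            self.in_fast_recovery = True

    def on_new_ack(self) -> None:
        self.dup_acks = 0
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh      # deflate, continue in congestion avoidance
            self.in_fast_recovery = False
        elif self.ssthresh > self.cwnd:
            self.cwnd += 1.0               # slow start
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance


sender = RenoSender(cwnd=12.0, ssthresh=8.0)
for _ in range(3):
    sender.on_dup_ack(flight_size=12.0)
print(sender.ssthresh, sender.cwnd)   # 6.0 9.0
sender.on_dup_ack(flight_size=12.0)
print(sender.cwnd)                    # 10.0
sender.on_new_ack()
print(sender.cwnd)                    # 6.0
```

&lt;p>Unlike Tahoe, the window never drops to 1 MSS on duplicate ACKs: it is halved, temporarily inflated during fast recovery, and then continues at the halved value.&lt;/p>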
&lt;h3 id="evolution-of-congestion-window-with-tcp-reno">Evolution of Congestion Window with TCP Reno&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-15%2021.11.05.png" alt="截屏2021-03-15 21.11.05">&lt;/p>
&lt;h2 id="analysis-of-improvements">Analysis of Improvements&lt;/h2>
&lt;ul>
&lt;li>After observing congestion collapses, the following mechanisms (among others) were introduced to the original TCP (RFC 793)
&lt;ul>
&lt;li>
&lt;p>Slow-Start&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Round-trip time variance estimation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Exponential retransmission timer backoff&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dynamic window sizing on congestion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>More aggressive receiver acknowledgement policy&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>🎯 Goal: Enforce packet conservation in order to achieve network stability&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h3 id="self-clocking">Self Clocking&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Recap: TCP uses window-based flow control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Basic assumption&lt;/p>
&lt;ul>
&lt;li>Complete flow control window in transit
&lt;ul>
&lt;li>In TCP: receive window $𝑅𝑐𝑣𝑊𝑖𝑛𝑑𝑜𝑤$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Bottleneck link with low data rate on the path to the receiver&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Basic scenario&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2010.24.22.png" alt="截屏2021-03-16 10.24.22" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2010.33.11.png" alt="截屏2021-03-16 10.33.11" />
&lt;/li>
&lt;/ul>
&lt;h4 id="conservation-of-packets">Conservation of Packets&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: get TCP connection in equilibrium&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Full window of data in transit&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>“Conservative”&lt;/strong>: NO new segment is injected into the network before an old segment leaves the network&lt;/p>
&lt;p>$\rightarrow$ A system with this property should be &lt;strong>robust&lt;/strong> in the face of congestion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Three ways for packet conservation to fail&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#slow-start">Connection does not get to equilibrium&lt;/a>&lt;/li>
&lt;li>&lt;a href="#retransmission-timer">Sender injects new packet before an old packet has exited&lt;/a>&lt;/li>
&lt;li>&lt;a href="#congestion-avoidance">Resource limits along the path hinder equilibrium&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="slow-start">Slow Start&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: bring TCP connection into equilibrium&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Connection has just started or&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Restart after assumption of (major) congestion&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem: get the “clock” started (at the beginning of
a connection there is no “clock” available)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>💡 Basic idea (per TCP connection)&lt;/p>
&lt;ul>
&lt;li>Do not send complete receive window (flow control) immediately&lt;/li>
&lt;li>Gradually increase number of segments that can be sent without receiving an ACK
&lt;ul>
&lt;li>Increase the amount of data that can be in transit (“in-flight”)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Apply &lt;strong>congestion window&lt;/strong>, in addition to receive window&lt;/p>
&lt;ul>
&lt;li>Minimum of congestion and receive window can be sent
&lt;ul>
&lt;li>Congestion Window: $𝐶𝑊𝑛𝑑$ $[𝑀𝑆𝑆]$&lt;/li>
&lt;li>Receive Window: $Rcv𝑊𝑖𝑛𝑑𝑜𝑤$ $[𝐵𝑦𝑡𝑒]$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>New connection or congestion assumed&lt;/p>
&lt;p>$\rightarrow$ Reset of congestion window: $𝐶𝑊𝑛𝑑 = 1$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Incoming ACK for sent (not retransmitted) segment&lt;/p>
&lt;ul>
&lt;li>Increase congestion window by one: $𝐶𝑊𝑛𝑑 = 𝐶𝑊𝑛𝑑 + 1$&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ Leads to exponential growth of 𝐶𝑊𝑛𝑑&lt;/p>
&lt;ul>
&lt;li>Sending rate is at most twice as high as the bottleneck capacity!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="retransmission-timer">Retransmission Timer&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Assumption: Complete receive window in transit&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Alternative 1: &lt;strong>ACK received&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>A segment was delivered and, thus, exited the network $\rightarrow$ conservation of packets is fulfilled&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Alternative 2: &lt;strong>retransmission timer expired&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Segment is dropped in the network: conservation of packets is fulfilled&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Segment is delayed but not dropped: conservation of packets NOT fulfilled&lt;/p>
&lt;p>$\rightarrow$ Too short retransmission timeout causes connection to leave equilibrium&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Good estimation of Round Trip Time (RTT) essential for a good timer value!&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Value too small: unnecessary retransmissions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Value too large: slow reaction to packet losses&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="estimation-of-round-trip-time">Estimation of Round Trip Time&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Timer-based RTT measurement&lt;/p>
&lt;ul>
&lt;li>Timer resolution varies (up to 500 ms)&lt;/li>
&lt;li>Requirements regarding timer resolutions vary&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>SampleRTT&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Time interval between transmission of a segment and reception of corresponding acknowledgement&lt;/li>
&lt;li>Single measurement&lt;/li>
&lt;li>Retransmissions are ignored&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>EstimatedRTT&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Smoothed value across a number of measurements&lt;/li>
&lt;li>Observation: measured values can fluctuate heavily&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Apply an &lt;strong>exponentially weighted moving average (EWMA)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Influence of each value becomes gradually less as it ages&lt;/li>
&lt;li>Unbiased estimator for average value&lt;/li>
&lt;/ul>
$$
EstimatedRTT=(1-\alpha) * EstimatedRTT+\alpha * SampleRTT
$$
&lt;p>(Typical value for $\alpha$: 0.125)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Derive value for &lt;strong>retransmission timeout (RTO)&lt;/strong>
&lt;/p>
$$
RTO = \beta * EstimatedRTT
$$
&lt;ul>
&lt;li>Recommended value for $\beta$: 2&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="estimation-of-deviation">Estimation of Deviation&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Avoid the observed occasional retransmissions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Observation: Variation of RTT can greatly increase in higher loaded networks&lt;/p>
&lt;ul>
&lt;li>Consequently, $EstimatedRTT$ requires higher “safety margin”&lt;/li>
&lt;li>Estimation error: difference between measured/sampled and estimated RTT&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Computation
&lt;/p>
$$
\begin{array}{l}
Deviation =(1-\gamma) * Deviation+\gamma * \left|SampleRTT- EstimatedRTT \right| \\\\
RTO =EstimatedRTT +\beta * Deviation
\end{array}
$$
&lt;ul>
&lt;li>Recommended values: $\alpha = 0.125, \beta = 4, \gamma = 0.25$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
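&lt;p>The two update formulas and the resulting RTO can be tried out with a short Python sketch. Two points are assumptions that the notes do not state: the deviation is updated before the new sample is folded into the estimate, and the first sample initializes the deviation to half of its value (as RFC 6298 does). The class name &lt;code>RttEstimator&lt;/code> is hypothetical.&lt;/p>

```python
class RttEstimator:
    """EWMA-based RTT and deviation estimation (times in milliseconds)."""

    def __init__(self, first_sample, alpha=0.125, beta=4.0, gamma=0.25):
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.estimated_rtt = first_sample
        self.deviation = first_sample / 2   # assumption: initialization as in RFC 6298

    def update(self, sample_rtt):
        # Assumption: the error is measured against the estimate from
        # before this sample is taken into account
        error = abs(sample_rtt - self.estimated_rtt)
        self.deviation = (1 - self.gamma) * self.deviation + self.gamma * error
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.rto()

    def rto(self):
        return self.estimated_rtt + self.beta * self.deviation


est = RttEstimator(first_sample=100.0)
print(est.update(200.0))                 # 362.5
print(est.estimated_rtt, est.deviation)  # 112.5 62.5
```

&lt;p>A single outlier of 200 ms moves the estimate only from 100 ms to 112.5 ms, while the deviation term widens the safety margin so that the RTO stays well above the outlier.&lt;/p>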
&lt;h4 id="multiple-retransmissions">Multiple Retransmissions&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>How large should the time interval be between two subsequent retransmissions of the same segment?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Approach: &lt;strong>Exponential backoff&lt;/strong>&lt;/p>
&lt;p>After each new retransmission RTO doubles:
&lt;/p>
$$
RTO = 2 * RTO
$$
&lt;ul>
&lt;li>A maximum value should be applied to $RTO$. It should be $\geq$ 60 seconds&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>To which segment does the received ACK belong – to the original segment or to the retransmission?&lt;/p>
&lt;ul>
&lt;li>Approach: &lt;strong>Karn's Algorithm&lt;/strong>
&lt;ul>
&lt;li>ACKs for retransmitted segments are not included into the calculation of $EstimatedRTT$ and $Deviation$&lt;/li>
&lt;li>Backoff is calculated as before&lt;/li>
&lt;li>Timeout value is set to the value calculated by backoff algorithm until an ACK to a non-retransmitted segment is received&lt;/li>
&lt;li>Then original algorithm is reactivated&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
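&lt;p>Both ideas fit into a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the cap of exactly 60 seconds is an assumption based on the bound mentioned above, and ACKs are represented as simple &lt;code>(sample_rtt, was_retransmitted)&lt;/code> pairs.&lt;/p>

```python
MAX_RTO = 60.0  # seconds; assumption: cap chosen at the 60 s bound


def backoff_schedule(initial_rto, retransmissions, max_rto=MAX_RTO):
    """RTO values used while the same segment keeps timing out."""
    rto, schedule = initial_rto, []
    for _ in range(retransmissions):
        rto = min(2 * rto, max_rto)   # exponential backoff with upper bound
        schedule.append(rto)
    return schedule


def usable_rtt_samples(acks):
    """Karn's algorithm: ignore ACKs that belong to retransmitted segments.

    acks is an iterable of (sample_rtt, was_retransmitted) pairs.
    """
    return [rtt for rtt, was_retransmitted in acks if not was_retransmitted]


print(backoff_schedule(1.0, 8))
# [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0, 60.0]
print(usable_rtt_samples([(0.10, False), (0.45, True), (0.12, False)]))
# [0.1, 0.12]
```

&lt;p>The ambiguous sample of 0.45 s is dropped because it cannot be attributed to either the original transmission or the retransmission.&lt;/p>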
&lt;h3 id="congestion-avoidance">Congestion Avoidance&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Consider &lt;strong>multiple&lt;/strong> concurrent TCP connections&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assumption: TCP connection operates in equilibrium&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Packet loss is with a high probability caused by a newly started TCP connection&lt;/p>
&lt;ul>
&lt;li>New connection requires resources on bottleneck router/link&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ Load of already existing TCP connection(s) needs to be reduced&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Basic components&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Implicit congestion signals&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Retransmission timeout&lt;/li>
&lt;li>Duplicate acknowledgements&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Strategy to adjust traffic load: &lt;strong>AIMD&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Additively increase&lt;/strong> load if no congestion signal is experienced
&lt;ul>
&lt;li>On acknowledgement received: $CWnd = CWnd + 1/CWnd$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Multiplicatively decrease&lt;/strong> load in case a congestion signal was experienced
&lt;ul>
&lt;li>On retransmission timeout: $CWnd = \gamma * CWnd, \quad 0 &lt; \gamma &lt; 1$&lt;/li>
&lt;li>In TCP Tahoe: $\gamma = 1/2$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="optimization-criteria">Optimization Criteria&lt;/h2>
&lt;p>Basic Scenario&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2011.46.23.png" alt="截屏2021-03-16 11.46.23" style="zoom:67%;" />
&lt;ul>
&lt;li>$N$ senders that use the same bottleneck link
&lt;ul>
&lt;li>
&lt;p>Data rate of sender $i$: $r_i(t)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Capacity of bottleneck link: $C$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Bottleneck link&lt;/strong>: Link with lowest available data rate on the path to the receiver&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Network-limited&lt;/strong> sender&lt;/p>
&lt;ul>
&lt;li>Assume that the sender always has data to send and data are sent as quickly as possible&lt;/li>
&lt;li>Sender can send a full window of data&lt;/li>
&lt;li>Congestion control limits the data rate of such a sender to the available capacity at the bottleneck link&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Application-limited&lt;/strong> sender&lt;/p>
&lt;ul>
&lt;li>Data rate of the sender is limited by the application and not by the network&lt;/li>
&lt;li>Sender sends less data than allowed by the current window&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Efficiency&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Closeness of the total load on the bottleneck link to its link capacity
&lt;ul>
&lt;li>$\sum_{i=1}^{N} r_{i}(t)$ should be as close to $C$ as possible, i.e., close to the knee&lt;/li>
&lt;li>Overload and underload are not desirable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Fairness&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>All senders that share the bottleneck link get a fair allocation of the bottleneck link capacity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Jain's fairness index&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Quantify “amount” of unfairness
&lt;/p>
$$
F\left(r_{1}, \ldots, r_{N}\right)=\frac{\left(\sum_{i=1}^{N} r_{i}\right)^{2}}{N\left(\sum_{i=1}^{N} r_{i}^{2}\right)}
$$
&lt;/li>
&lt;li>
&lt;p>Fairness index $\in [0, 1]$&lt;/p>
&lt;ul>
&lt;li>Totally fair allocation has fairness index of $1$ (i.e., all $r_i$ are equal)&lt;/li>
&lt;li>Totally unfair allocation has fairness index of $1/N$ (i.e., one user gets entire capacity)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Max-min fairness&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Situation&lt;/p>
&lt;ul>
&lt;li>Users share resource. Each user has an equal right to the resource&lt;/li>
&lt;li>But: some users intrinsically demand fewer resources than others (E.g., in case of application-limited senders)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Intuitive allocation of fair share&lt;/p>
&lt;ul>
&lt;li>Allocates users with a “small” demand what they want&lt;/li>
&lt;li>Equally distributes unused resources to “big” users&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 Max-min fair allocation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Resources are allocated in order of increasing demand&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No source gets a resource share larger than its demand&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sources with unsatisfied demands get an equal share of the resource&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Implementation&lt;/p>
&lt;ul>
&lt;li>Senders $1, 2, \ldots, N$ with demanded sending rates $s_1, s_2, \ldots, s_N$
&lt;ul>
&lt;li>Without loss of generality: $s_1 \leq s_2 \leq \ldots \leq s_N$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$C$: capacity&lt;/li>
&lt;li>Give $\frac{C}{N}$ to the sender with the smallest demand
&lt;ul>
&lt;li>In case this is more than demanded, then $\frac{C}{N} - s_1$ is still available to others&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$\frac{C}{N} - s_1$ is equally distributed among the others $\Rightarrow$ each gets $\frac{C}{N} + \frac{\frac{C}{N} - s_1}{N - 1}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/z5uHTkM17P8?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
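&lt;p>Both fairness notions above are easy to compute. The Python sketch below implements Jain's index directly from the formula and max-min fairness as a simple iterative procedure that serves senders in order of increasing demand; the function names and the example numbers are chosen for illustration only.&lt;/p>

```python
def jain_index(rates):
    """Jain's fairness index: (sum r_i)^2 / (N * sum r_i^2)."""
    n = len(rates)
    return sum(rates) ** 2 / (n * sum(r * r for r in rates))


def max_min_allocation(demands, capacity):
    """Max-min fair shares for the given demands on a link of given capacity."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Serve senders in order of increasing demand
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for pos, i in enumerate(order):
        fair_share = remaining / (len(order) - pos)   # equal split among unserved senders
        alloc[i] = float(min(demands[i], fair_share)) # never more than demanded
        remaining -= alloc[i]
    return alloc


print(jain_index([5, 5, 5, 5]))    # 1.0
print(jain_index([20, 0, 0, 0]))   # 0.25
print(max_min_allocation([2, 8, 10], capacity=12))   # [2.0, 5.0, 5.0]
```

&lt;p>In the last call, $C/N = 4$ exceeds the smallest demand $s_1 = 2$, so the surplus of 2 is split between the two remaining senders, giving each $4 + 2/2 = 5$, in line with the formula above.&lt;/p>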
&lt;p>&lt;strong>Convergence&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-16%2012.35.16.png" alt="截屏2021-03-16 12.35.16">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Responsiveness&lt;/strong>: Speed with which $r_i$ gets to equilibrium rate at knee after starting from any starting state
&lt;ul>
&lt;li>May oscillate around goal (= network capacity)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Smoothness&lt;/strong>: Size of oscillations around network capacity at steady state&lt;/li>
&lt;/ul>
&lt;p>(Smaller is better in both cases)&lt;/p>
&lt;h3 id="on-fairness">On Fairness&lt;/h3>
&lt;p>How to divide resources among TCP connections?&lt;/p>
&lt;p>$\rightarrow$ Strive for &lt;strong>fair&lt;/strong> allocation 💪&lt;/p>
&lt;p>🎯 &lt;strong>Goal&lt;/strong>: all TCP connections receive &lt;strong>equal share&lt;/strong> of bottleneck resource&lt;/p>
&lt;ul>
&lt;li>the share should be non-zero&lt;/li>
&lt;li>equal share is not ideal for all applications 🤔&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Example&lt;/strong>: $N$ TCP connections share the same bottleneck; each TCP connection receives $(1/N)$-th of the bottleneck capacity&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2012.40.35.png" alt="截屏2021-03-16 12.40.35" style="zoom:67%;" />
&lt;p>&lt;strong>Observation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>“Greedy” user&lt;/strong>: opens multiple TCP connections concurrently&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Link with capacity $𝐷$, two users, one connection per user&lt;/p>
&lt;p>$\rightarrow$ Each user gets capacity $\frac{D}{2}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Link with capacity $𝐷$, two users, user 1 with a single connection, user 2 with nine connections&lt;/p>
&lt;p>$\rightarrow$ User 1 can use $\frac{1}{10}D$ , user 2 can use $\frac{9}{10}D$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>“Greedy” receiver&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Can send several ACKs per received segment&lt;/li>
&lt;li>Can send ACKs faster than it receives segments&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="additive-increase-multiplicative-decrease">Additive Increase Multiplicative Decrease&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>General feedback control algorithm&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Applied to congestion control&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Additive increase of data rate until congestion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiplicative decrease of data rate in case of congestion signal&lt;/p>
&lt;/li>
&lt;/ul>
$$
r_{i}(t+1)=
\begin{cases}
r_{i}(t)+a &amp; \text { if no congestion is detected } \\\\
r_{i}(t) * b &amp; \text { if congestion is detected }
\end{cases}
$$
&lt;/li>
&lt;li>
&lt;p>Converges to equal share of capacity at bottleneck link&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="aimd-fairness">AIMD: Fairness&lt;/h4>
&lt;ul>
&lt;li>Network with two sources that share a bottleneck link with capacity $𝐶$&lt;/li>
&lt;li>🎯 Goal: bring system close to optimal point $(\frac{𝐶}{2} , \frac{𝐶}{2})$&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2013.00.42.png" alt="截屏2021-03-16 13.00.42" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Efficiency line&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>$r\_1 + r\_2 = C$ holds for all points on the line&lt;/li>
&lt;li>Points below the line: link is underloaded $\rightarrow$ Control decision: increase rate&lt;/li>
&lt;li>Points above the line: link is overloaded $\rightarrow$ Control decision: decrease rate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Fairness line&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>All points with a fair allocation, i.e. $r\_1 = r\_2$&lt;/li>
&lt;li>Multiplying by $𝑏$ preserves a fair allocation: $br\_1 = br\_2$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Optimal operating point&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Intersection of efficiency line and fairness line: point $(\frac{𝐶}{2} , \frac{𝐶}{2})$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Optimality of AIMD&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Additive increase&lt;/p>
&lt;ul>
&lt;li>Resource allocation of both users is increased by $a$&lt;/li>
&lt;li>In the graph: moving up along a 45-degree line&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Multiplicative decrease&lt;/p>
&lt;p>Both allocations are scaled by $b$. In the graph: moving towards the origin along the line that connects the current point to the origin&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ Point of operation iteratively moves closer to optimal operating point 👏&lt;/p>
&lt;/li>
&lt;/ul>
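&lt;p>The convergence argument can be checked numerically. The sketch below assumes two flows that update synchronously and both see the congestion signal whenever $r\_1 + r\_2 > C$; the function names and the parameter values ($a = 1$, $b = 0.5$, $C = 100$) are illustrative assumptions, not part of the lecture.&lt;/p>

```python
def aimd_step(r1, r2, capacity, a=1.0, b=0.5):
    """One synchronous AIMD update for two flows sharing a bottleneck."""
    if r1 + r2 > capacity:
        # Overloaded: both flows see the congestion signal and back off.
        return r1 * b, r2 * b
    # Not overloaded: both flows probe for more capacity.
    return r1 + a, r2 + a


def simulate(r1, r2, capacity, steps, a=1.0, b=0.5):
    """Iterate aimd_step and return the final allocation."""
    for _ in range(steps):
        r1, r2 = aimd_step(r1, r2, capacity, a, b)
    return r1, r2


# Start from a clearly unfair allocation (illustrative values).
start = (80.0, 10.0)
end = simulate(start[0], start[1], capacity=100.0, steps=500)
gap_start = abs(start[0] - start[1])
gap_end = abs(end[0] - end[1])
```

&lt;p>Additive increase leaves the difference $|r\_1 - r\_2|$ unchanged, while every multiplicative decrease shrinks it by the factor $b$, so &lt;code>gap_end&lt;/code> is much smaller than &lt;code>gap_start&lt;/code>.&lt;/p>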
&lt;h2 id="periodic-model">Periodic Model&lt;/h2>
&lt;p>&lt;strong>Performance metrics&lt;/strong> of interest&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Throughput&lt;/strong>
How much data can be transferred in which time interval?&lt;/li>
&lt;li>&lt;strong>Latency&lt;/strong>
How high is the experienced delay?&lt;/li>
&lt;li>&lt;strong>Completion time&lt;/strong>
How long until the transfer of an object/file is finished?&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Variables&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>$X$: Sending rate measured in segments per time interval&lt;/li>
&lt;li>$RTT$: Round trip time [seconds]&lt;/li>
&lt;li>$p$: Loss probability of a segment&lt;/li>
&lt;li>$MSS$: Maximum segment size [bit]&lt;/li>
&lt;li>$W$: Value of a congestion window [MSS]&lt;/li>
&lt;li>$D$: Data rate measured in bit per second [bit/s]&lt;/li>
&lt;/ul>
&lt;h4 id="periodic-model-1">Periodic Model&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Simple model – strong simplifications&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🎯 Goals&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Model &lt;strong>long-term steady state behavior&lt;/strong> of TCP&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Evaluate achievable &lt;strong>throughput&lt;/strong> of a TCP connection under certain network conditions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Basic assumptions&lt;/p>
&lt;ul>
&lt;li>Network has constant loss probability $p$&lt;/li>
&lt;li>Observed TCP connection does not influence $p$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Further simplification: &lt;strong>periodic losses&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For an individual connection segment losses are equally spaced&lt;/p>
&lt;p>$\rightarrow$ Link delivers $N = \frac{1}{p}$ segments followed by a segment loss&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Additional simplifications / model assumptions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Slow start is ignored&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Congestion window increases linearly (congestion avoidance)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>RTT is constant&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Losses are detected using duplicate ACKs (No timeouts)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Retransmissions are not modelled&lt;/p>
&lt;ul>
&lt;li>Go-Back-N is not modelled&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Connection only limited by $CWnd$&lt;/p>
&lt;ul>
&lt;li>Flow control (receive window) is never a limiting factor&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Always $MSS$ sized segments are sent&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Under given assumptions we have the diagram:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2016.44.03.png" alt="截屏2021-03-16 16.44.03" style="zoom: 67%;" />
&lt;ul>
&lt;li>Progress of CWnd: Perfect periodic &lt;strong>saw tooth curve&lt;/strong>
$$
\frac{W}{2}*MSS \leq CWnd \leq W * MSS
$$
Note: Here $W$ is unitless.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Data rate when segment loss occurs?
&lt;/p>
$$
D = \frac{W * MSS}{RTT}
$$
&lt;/li>
&lt;li>
&lt;p>How long until congestion window reaches 𝑊 again?
&lt;/p>
$$
\frac{W}{2} * RTT
$$
&lt;/li>
&lt;li>
&lt;p>Average data rate of a TCP connection?
&lt;/p>
$$
D = \frac{0.75W * MSS}{RTT}
$$
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Step 1: Determine $W$ as a function of $p$&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Minimal value of congestion window: $\frac{W}{2}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Congestion window opens by one segment per RTT&lt;/p>
&lt;ul>
&lt;li>Duration of a period:
$$
t = \frac{W}{2} \text{ round trip times } = \frac{W}{2}*RTT \text{ seconds }
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Number of delivered segments within one period&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Corresponds to the area under the saw tooth curve
&lt;/p>
$$
N=\left(\frac{W}{2}\right)^{2}+\frac{1}{2}\left(\frac{W}{2}\right)^{2}=\frac{3}{8} W^{2}
$$
&lt;/li>
&lt;li>
&lt;p>According to the assumptions $N = \frac{1}{p}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow W = \sqrt{\frac{8}{3p}}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Step 2: Determine data rate $D$ as a function of $p$&lt;/strong>&lt;/p>
&lt;p>Average data rate
&lt;/p>
$$
D=\frac{N * M S S}{t}
$$
&lt;p>
By assumption, $N = \frac{1}{p}$, and the duration of a period is $t = \frac{W}{2}*RTT$ [s]
&lt;/p>
$$
\Rightarrow D=\frac{\frac{1}{p} * M S S}{R T T * \frac{W}{2}}
$$
&lt;p>
Substituting $W=\sqrt{\frac{8}{3 p}}$ from step 1 gives
&lt;/p>
$$
D=\frac{1}{R T T} \sqrt{\frac{3}{2 p}} * M S S
$$
&lt;p>
This is called &lt;strong>&amp;ldquo;Inverse Square-Root $𝑝$ Law&amp;rdquo;&lt;/strong>&lt;/p>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2017.37.44.png" alt="截屏2021-03-16 17.37.44" style="zoom:67%;" />
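&lt;p>The two steps of the periodic model can be evaluated directly. The following sketch computes $W$ and $D$ from $p$, $RTT$ and $MSS$ and cross-checks the closed form against $D = \frac{N * MSS}{t}$; the function names and the sample values are assumptions chosen for illustration only.&lt;/p>

```python
import math


def window_at_loss(p):
    """Peak congestion window W in segments: W = sqrt(8 / (3p))."""
    return math.sqrt(8.0 / (3.0 * p))


def average_rate(p, rtt_s, mss_bits):
    """Average data rate in bit/s: D = (MSS / RTT) * sqrt(3 / (2p))."""
    return (mss_bits / rtt_s) * math.sqrt(3.0 / (2.0 * p))


def average_rate_from_period(p, rtt_s, mss_bits):
    """Same rate via D = N * MSS / t with N = 1/p and t = (W/2) * RTT."""
    w = window_at_loss(p)
    return (1.0 / p) * mss_bits / ((w / 2.0) * rtt_s)


# Illustrative values only: 1 % loss, 100 ms RTT, 1460-byte segments.
p, rtt_s, mss_bits = 0.01, 0.1, 1460 * 8
d_closed = average_rate(p, rtt_s, mss_bits)
d_period = average_rate_from_period(p, rtt_s, mss_bits)
d_sawtooth = 0.75 * window_at_loss(p) * mss_bits / rtt_s
```

&lt;p>All three expressions describe the same quantity, which also shows that the average rate $\frac{0.75 W * MSS}{RTT}$ agrees with the inverse square-root law.&lt;/p>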
&lt;h2 id="active-queue-management-aqm">Active Queue Management (AQM)&lt;/h2>
&lt;h3 id="simple-queue-management">Simple Queue Management&lt;/h3>
&lt;ul>
&lt;li>Buffer in the router is full
&lt;ul>
&lt;li>Next segment must be dropped $\rightarrow$ Tail drop&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>TCP detects congestion and backs off&lt;/li>
&lt;li>🔴 Problems
&lt;ul>
&lt;li>Synchronization: Segments of several TCP connections are dropped (almost) at the same time&lt;/li>
&lt;li>Nearly full buffer cannot absorb short bursts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="active-queue-management">Active Queue Management&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Basic approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Detect arising congestion within the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Give early feedback to senders&lt;/p>
&lt;ul>
&lt;li>Intentionally trigger implicit congestion signal: &lt;strong>packet loss&lt;/strong>&lt;/li>
&lt;li>Alternative: Send &lt;a href="#explicit-congestion-notification">&lt;strong>explicit congestion notification (ECN)&lt;/strong>&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Routers drop (or mark) segments before the queue is completely full&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Randomization&lt;/strong>: random decision on which segment to be dropped&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Observation at the receiver on layer 4: typically only a single segment is missing&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>AQM algorithms&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#random-early-detection">Random Early Detection (RED)&lt;/a>&lt;/li>
&lt;li>Newer algorithms: CoDel, FQ-CoDel, PIE &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="random-early-detection">Random Early Detection&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Approach&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-16%2017.47.38.png" alt="截屏2021-03-16 17.47.38" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Average queue occupancy $&lt; q\_{min}$&lt;/p>
&lt;ul>
&lt;li>No drop of segments ($𝑝 = 0$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$q\_{min} \leq$ average queue occupancy $&lt; q\_{max}$&lt;/p>
&lt;ul>
&lt;li>Probability of dropping an incoming packet is linearly increased with average queue occupancy&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Average queue occupancy $ \geq q\_{max}$&lt;/p>
&lt;ul>
&lt;li>Drop all segments ($𝑝 = 1$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
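&lt;p>The three RED regions map onto a small piecewise function. This is a sketch under assumptions: the name &lt;code>red_drop_probability&lt;/code> is hypothetical, and the parameter &lt;code>p_max&lt;/code> (the drop probability reached just below $q\_{max}$) is an assumption, since the notes only state that the probability grows linearly.&lt;/p>

```python
def red_drop_probability(avg_queue, q_min, q_max, p_max=1.0):
    """Drop probability as a function of the average queue occupancy.

    p_max is an assumed parameter: the value the linear ramp reaches
    just below q_max. At q_max and above, every segment is dropped.
    """
    if q_min >= q_max:
        raise ValueError("q_min must be smaller than q_max")
    if q_min > avg_queue:
        return 0.0  # below q_min: no drops
    if avg_queue >= q_max:
        return 1.0  # at or above q_max: drop all segments
    # Between the thresholds: linear increase with the average occupancy.
    return p_max * (avg_queue - q_min) / (q_max - q_min)
```

&lt;p>A router would evaluate this function for every incoming packet and drop (or mark) the packet with the returned probability.&lt;/p>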
&lt;h4 id="explicit-congestion-notification">Explicit Congestion Notification&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Send explicit congestion signal, avoid unnecessary packet drops&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Enable AQM to explicitly notify about congestion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AQM does not have to drop packets to create implicit congestion signal&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How to notify?&lt;/p>
&lt;ul>
&lt;li>Mark IP datagram, but do not drop it&lt;/li>
&lt;li>Marked IP datagram is forwarded to receiver&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How to react?&lt;/p>
&lt;ul>
&lt;li>Marked IP datagram is delivered to receiver instance of IP&lt;/li>
&lt;li>Information must be passed to corresponding receiver instance of TCP&lt;/li>
&lt;li>TCP sender must be notified&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="additional-resource">Additional Resource&lt;/h2>
&lt;h4 id="tcp-congestion-control-">TCP Congestion Control 👍&lt;/h4>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/kRS4J-m5n04?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;h4 id="37---tcp-congestion-control--fhu---computer-networks-">3.7 - TCP Congestion Control | FHU - Computer Networks 👍&lt;/h4>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/cPLDaypKQkU?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item><item><title>Gaußverteilung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/gauss_verteilung/</link><pubDate>Sun, 03 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/gauss_verteilung/</guid><description>&lt;h2 id="skalarer-fall-1d">Skalarer Fall (1D)&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/WP_Normalverteilung_01-1024x576.jpg" alt="Eigenschaften Normalverteilung, Normalverteilung, Wendestellen, Standardabweichung, Varianz, Mittelwert, Sigma, Mü, Maximum, Erwartungswert, Funktion Normalverteilung" style="zoom: 50%;" />
$$
f(x)=\frac{1}{\sqrt{2 \pi} \sigma} \exp \left\{-\frac{1}{2} \frac{(x-\hat{x})^{2}}{\sigma^{2}}\right\}
$$
&lt;ul>
&lt;li>
&lt;p>Expected value
&lt;/p>
$$
E_{f}\{x\}=\hat{x}
$$
&lt;/li>
&lt;li>
&lt;p>Variance&lt;/p>
$$
E_{f}\left\{(x-\hat{x})^{2}\right\}=\sigma^{2}
$$
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Given the parameters $\mu$ and $\sigma$ of a Gaussian density, mean and variance are already given. On the other hand, assume that we wish to approximate a given density $\tilde{f}_x$ with a simpler density of the same mean and standard deviation. Then, given the mean and the standard deviation of the density $\tilde{f}_x$, an appropriate Gaussian density is immediately obtained. This is a property not generally shared by more complicated densities.&lt;/p>
&lt;/blockquote>
&lt;h2 id="2d-normalverteilung">2D Normalverteilung&lt;/h2>
$$
\begin{aligned}
f_{x y}(x, y)&amp;=\frac{1}{2 \pi \sigma_{x} \sigma_{y} \sqrt{1-r^{2}}} \exp \left\{-\frac{1}{2} Q(x, y)\right\} \\
Q(x, y)&amp;=\frac{1}{1-r^{2}}\left\{\frac{(x-\hat{x})^{2}}{\sigma_{x}^{2}}-2 r \frac{x-\hat{x}}{\sigma_{x}} \frac{y-\hat{y}}{\sigma_{y}}+\frac{(y-\hat{y})^{2}}{\sigma_{y}^{2}}\right\}
\end{aligned}
$$
&lt;ul>
&lt;li>$r \in [-1, 1]$: correlation coefficient (in some literature also written as $\rho$)&lt;/li>
&lt;/ul>
&lt;p>Alternatively&lt;/p>
$$
f_{x y}(x, y)=\mathcal{N} \left(\left[\begin{array}{l}
x \\
y
\end{array}\right],\left[\begin{array}{l}
\hat{x} \\
\hat{y}
\end{array}\right],\left[\begin{array}{ll}
C_{x x} &amp; C_{x y} \\
C_{y x} &amp; C_{y y}
\end{array}\right]\right)
$$
&lt;p>with&lt;/p>
$$
\left[\begin{array}{ll}
C_{x x} &amp; C_{x y} \\
C_{y x} &amp; C_{y y}
\end{array}\right]=\left[\begin{array}{lc}
\sigma_{x}^{2} &amp; r \sigma_{x} \sigma_{y} \\
r \sigma_{x} \sigma_{y} &amp; \sigma_{y}^{2}
\end{array}\right]
$$
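&lt;p>The scalar form and the covariance-matrix form describe the same density, which can be checked numerically. The sketch below uses only the Python standard library, takes the factor in $Q(x, y)$ as $\frac{1}{1-r^{2}}$, and inverts the $2 \times 2$ covariance matrix by hand; the function names and test values are illustrative assumptions.&lt;/p>

```python
import math


def gauss2d_scalar(x, y, mx, my, sx, sy, r):
    """Density via the scalar form with Q(x, y) and the factor 1 / (1 - r^2)."""
    dx = (x - mx) / sx
    dy = (y - my) / sy
    q = (dx * dx - 2.0 * r * dx * dy + dy * dy) / (1.0 - r * r)
    norm = 2.0 * math.pi * sx * sy * math.sqrt(1.0 - r * r)
    return math.exp(-0.5 * q) / norm


def gauss2d_matrix(x, y, mx, my, sx, sy, r):
    """Density via the covariance matrix C, inverted by hand (2x2 case)."""
    cxx, cxy, cyy = sx * sx, r * sx * sy, sy * sy
    det = cxx * cyy - cxy * cxy
    dx, dy = x - mx, y - my
    # Quadratic form d^T C^{-1} d for a symmetric 2x2 matrix.
    quad = (cyy * dx * dx - 2.0 * cxy * dx * dy + cxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Illustrative evaluation point and parameters.
args = (0.3, -1.2, 0.0, 0.5, 1.5, 0.8, 0.6)
value_scalar = gauss2d_scalar(*args)
value_matrix = gauss2d_matrix(*args)
```

&lt;p>Both functions return the same value because $|\mathbf{C}| = \sigma_{x}^{2} \sigma_{y}^{2} (1-r^{2})$.&lt;/p>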
&lt;h3 id="correlationskoeffizient">Correlationskoeffizient&lt;/h3>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Figure_1.png"
alt="Correlation of bivariate Gaussian distribution ($\rho$ is the correlation coefficient).">&lt;figcaption>
&lt;p>Correlation of bivariate Gaussian distribution ($\rho$ is the correlation coefficient).&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;ul>
&lt;li>
&lt;p>uncorrelated ($r = 0$) (Figure 1 right)&lt;/p>
&lt;ul>
&lt;li>$\Rightarrow \boldsymbol{x}, \boldsymbol{y}$ uncorrelated&lt;/li>
&lt;li>$\Rightarrow$ (only for Gaussians) $\boldsymbol{x}, \boldsymbol{y}$ independent ($f_{\boldsymbol{x}, \boldsymbol{y}}(x, y) = f_{\boldsymbol{x}}(x) f_{\boldsymbol{y}}(y)$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>positively correlated ($r > 0$) (Figure 1 left)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>negatively correlated ($r &lt; 0$) (Figure 1 middle)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="n-dim-normalverteilung">$N$-dim. Normalverteilung&lt;/h2>
$$
f_{\boldsymbol{x}}(x)=\frac{1}{\sqrt{(2 \pi)^{N} \cdot|\mathbf{C}|}} \exp \left\{-\frac{1}{2}(\underline{x}-\underline{\hat{x}})^{\top} \mathbf{C}^{-1}(\underline{x}-\underline{\hat{x}})\right\}
$$
&lt;ul>
&lt;li>$\underline{\hat{x}}$
: Mean&lt;/li>
&lt;li>$\mathbf{C}$
: Kovarianzmatrix&lt;/li>
&lt;/ul></description></item><item><title>Ethernet</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/08-ethernet/</link><pubDate>Thu, 18 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/08-ethernet/</guid><description>&lt;h2 id="aloha-slotted-aloha">Aloha, Slotted Aloha&lt;/h2>
&lt;h3 id="aloha">Aloha&lt;/h3>
&lt;p>First MAC protocol for packet-based wireless networks&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Media access control (MAC)&lt;/p>
&lt;ul>
&lt;li>Time multiplex, variable, random access&lt;/li>
&lt;li>NO previous sensing of medium and no announcement of intended transmission&lt;/li>
&lt;li>&lt;strong>Asynchronous&lt;/strong> access&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem: Collision possible&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Schema&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2012.02.54.png" alt="截屏2021-03-18 12.02.54" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;h3 id="slotted-aloha">Slotted Aloha&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Like Aloha, but&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Uses time slots&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Synchronized&lt;/strong> access only at beginning of time slot&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>On average &lt;strong>fewer&lt;/strong> collisions than with Aloha&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Schema&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2012.05.04.png" alt="截屏2021-03-18 12.05.04" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;h3 id="evaluation">Evaluation&lt;/h3>
&lt;p>How well can the capacity of the medium be utilized?&lt;/p>
&lt;h4 id="evaluation-of-slotted-aloha">Evaluation of Slotted Aloha&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Assumptions&lt;/p>
&lt;ul>
&lt;li>Based on the design
&lt;ul>
&lt;li>All systems start transmissions at beginning of time slot&lt;/li>
&lt;li>All systems work synchronized&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Simplifications
&lt;ul>
&lt;li>All packets have same length and fit into one time slot
&lt;ul>
&lt;li>If a collision arises, all systems notice it before end of the time slot&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>All systems always want to send data
&lt;ul>
&lt;li>Every system sends in each time slot with a probability of $𝑝$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>If a collision occurs
&lt;ul>
&lt;li>Packet will be repeated with a probability of $𝑝$ in all following time slots until the transmission is successful&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>There are $𝑁$ active systems in the network&lt;/p>
&lt;ul>
&lt;li>Probability that a system starts sending: $𝑝$&lt;/li>
&lt;li>Probability that the other $𝑁 − 1$ systems are not sending: $(1 - p)^{N-1}$&lt;/li>
&lt;li>Probability that a given system succeeds: $p(1 - p)^{N-1}$&lt;/li>
&lt;li>Probability for successful transmission of any one system: $Np(1 - p)^{N-1}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Seeking the maximum utilization $U\_{max}$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Need $p^*$ s.t. $Np(1 - p)^{N-1}$ reaches its maximum&lt;/p>
&lt;ul>
&lt;li>Solution: $p^\* = \frac{1}{N}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Therefore:&lt;/p>
$$
\begin{array}{l}
&amp;N p^{\*}\left(1-p^{\*}\right)^{N-1}=\left(1-\frac{1}{N}\right)^{N-1}\\\\
&amp;\displaystyle{\lim \_{N \rightarrow \infty}}\left(1-\frac{1}{N}\right)^{N-1}=\frac{1}{e}\\\\
&amp;U\_{\max }=\frac{1}{e} \approx 0.37
\end{array}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
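&lt;p>The limit can be observed numerically by evaluating the success probability at $p^\* = \frac{1}{N}$ for growing $N$. This is a minimal sketch; the function names are hypothetical and the chosen values of $N$ are arbitrary.&lt;/p>

```python
import math


def slotted_aloha_success(n, p):
    """Probability that exactly one of n systems sends in a time slot."""
    return n * p * (1.0 - p) ** (n - 1)


def slotted_aloha_max_utilization(n):
    """Utilization at the optimal sending probability p* = 1/n."""
    return slotted_aloha_success(n, 1.0 / n)


# Utilization for a few network sizes (arbitrary sample values of N).
utilization_by_n = {
    n: slotted_aloha_max_utilization(n) for n in (2, 10, 100, 10000)
}
limit = 1.0 / math.e
```

&lt;p>The values in &lt;code>utilization_by_n&lt;/code> decrease with $N$ and approach &lt;code>limit&lt;/code>, i.e. $\frac{1}{e}$.&lt;/p>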
&lt;h4 id="evaluation-of-aloha">Evaluation of Aloha&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Simplifying assumptions&lt;/p>
&lt;ul>
&lt;li>All packets have same length&lt;/li>
&lt;li>Immediate notification about collisions&lt;/li>
&lt;li>On collision: Packet will be repeated immediately with probability $𝑝$&lt;/li>
&lt;li>On successful transmission
&lt;ul>
&lt;li>Wait for transmission time of packet&lt;/li>
&lt;li>Then: continue sending with probability $𝑝$ and continue waiting with probability $1 − 𝑝$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Observation: Collision occurs&lt;/p>
&lt;p>a) if the previous packet from another system has not been sent completely, &lt;strong>or&lt;/strong>&lt;/p>
&lt;p>b) if another system starts sending before the ongoing transmission is finished&lt;/p>
&lt;/li>
&lt;li>
&lt;p>There are $𝑁$ active systems in the network&lt;/p>
&lt;ul>
&lt;li>Probability that a system starts sending: $𝑝$&lt;/li>
&lt;li>Probability that (a) does not occur, and likewise that (b) does not occur: $(1 - p)^{N-1}$ each&lt;/li>
&lt;li>Probability for successful transmission of any one system: $Np(1 - p)^{2(N-1)}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Further analysis is analogous to Slotted Aloha; the success probability is maximal at $p^\* = \frac{1}{2N-1}$
&lt;/p>
$$
\begin{array}{l}
\displaystyle{\lim\_{N \rightarrow \infty}} \frac{N}{2 N-1}\left(1-\frac{1}{2 N-1}\right)^{2(N-1)}=\frac{1}{2 e} \\\\
\Rightarrow U_{\max }=\frac{1}{2 e} \approx 0.18
\end{array}
$$
&lt;/li>
&lt;/ul>
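&lt;p>For pure Aloha the maximizing probability follows from setting the derivative of $Np(1 - p)^{2(N-1)}$ to zero, which gives $p = \frac{1}{2N-1}$. The sketch below confirms this with a brute-force grid search; the function names, the grid resolution and the choice $N = 20$ are assumptions made for illustration.&lt;/p>

```python
def aloha_success(n, p):
    """Success probability of pure Aloha: N p (1 - p)^(2 (N - 1))."""
    return n * p * (1.0 - p) ** (2 * (n - 1))


def best_p_on_grid(n, grid_size=100000):
    """Sending probability on a uniform grid that maximizes aloha_success."""
    candidates = (k / grid_size for k in range(1, grid_size))
    return max(candidates, key=lambda p: aloha_success(n, p))


n = 20  # arbitrary example size
p_grid = best_p_on_grid(n)
p_closed = 1.0 / (2 * n - 1)
```

&lt;p>The grid maximum &lt;code>p_grid&lt;/code> agrees with the closed form &lt;code>p_closed&lt;/code> up to the grid resolution.&lt;/p>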
&lt;h4 id="comparison-of-utilization-between-aloha-and-slotted-aloha">Comparison of Utilization Between Aloha and Slotted Aloha&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2012.25.40.png" alt="截屏2021-03-18 12.25.40" style="zoom:67%;" />
&lt;h2 id="csma-based-approaches">CSMA-based Approaches&lt;/h2>
&lt;p>&lt;strong>CSMA&lt;/strong> = &lt;strong>C&lt;/strong>arrier &lt;strong>S&lt;/strong>ense &lt;strong>M&lt;/strong>ultiple &lt;strong>A&lt;/strong>ccess&lt;/p>
&lt;ul>
&lt;li>&lt;strong>CSMA/CD&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>CD&lt;/strong> = &lt;strong>C&lt;/strong>ollision &lt;strong>D&lt;/strong>etection (&amp;ldquo;Listen before talk, listen while talk&amp;rdquo;)&lt;/li>
&lt;li>Sending system can detect collisions by listening&lt;/li>
&lt;li>Usage example: Ethernet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>CSMA/CA&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>CA&lt;/strong> = &lt;strong>C&lt;/strong>ollision &lt;strong>A&lt;/strong>voidance&lt;/li>
&lt;li>Sending system assumes collisions when acknowledgement is missing
&lt;ul>
&lt;li>MAC-layer acknowledgements, stop-and-wait&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Usage example: WLAN&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="ethernet-variants">Ethernet Variants&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Ethernet variants&lt;/th>
&lt;th>Data rate&lt;/th>
&lt;th>Topology&lt;/th>
&lt;th>Medium access&lt;/th>
&lt;th>Evaluation&lt;/th>
&lt;th>Layers&lt;/th>
&lt;th>Flow control&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Original&lt;/td>
&lt;td>10 Mbit/s&lt;/td>
&lt;td>bus&lt;/td>
&lt;td>CSMA/CD&lt;ul>&lt;li>Check medium&lt;/li>&lt;li>1-persistent sending&lt;/li>&lt;li>Collision detection by sender&lt;/li>&lt;li>Exponential backoff&lt;/li>&lt;/ul>&lt;/td>
&lt;td>Utilization&lt;/td>
&lt;td>1 and 2a&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Fast Ethernet&lt;/td>
&lt;td>100 Mbit/s&lt;/td>
&lt;td>star&lt;/td>
&lt;td>CSMA/CD&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>Implicit / Explicit&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Gigabit Ethernet&lt;/td>
&lt;td>1 Gbit/s&lt;/td>
&lt;td>star&lt;/td>
&lt;td>Carrier extension, frame bursting&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="the-original">The Original&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Standardized as &lt;strong>IEEE 802.3&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Medium access control&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Time multiplex, variable, random access&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Asynchronous access&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Uses CSMA/CD&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Collision detection through listening&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Exponential backoff&lt;/p>
&lt;/li>
&lt;li>
&lt;p>1-persistent&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Network topology&lt;/p>
&lt;ul>
&lt;li>Originally: Bus topology&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Data rate&lt;/p>
&lt;ul>
&lt;li>Originally: 10 Mbit/s&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Wire based&lt;/p>
&lt;ul>
&lt;li>Originally: Coaxial cable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Standard consists of&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Layer 1 and&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Layer 2a (MAC-Protocol)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>CSMA/CD-based approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Check medium&lt;/p>
&lt;ul>
&lt;li>Considered free if no activity is detected for &lt;strong>96 bit times&lt;/strong>
&lt;ul>
&lt;li>96 bit times = &lt;strong>Inter Frame Space (IFS)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Sending: &lt;strong>1-persistent&lt;/strong>&lt;/p>
&lt;blockquote>
&lt;p>1-persistent&lt;/p>
&lt;p>1-persistent CSMA is an aggressive transmission algorithm. When the transmitting node is ready to transmit, it senses the transmission medium for idle or busy.&lt;/p>
&lt;ul>
&lt;li>If idle, then it transmits immediately.&lt;/li>
&lt;li>If busy, then it senses the transmission medium continuously until it becomes idle, then transmits the message (a &lt;a href="https://en.wikipedia.org/wiki/Frame_(telecommunications)">frame&lt;/a>) unconditionally (i.e. with probability=1).&lt;/li>
&lt;li>In case of a &lt;a href="https://en.wikipedia.org/wiki/Collision_(telecommunications)">collision&lt;/a>, the sender waits for a &lt;a href="https://en.wikipedia.org/wiki/Randomness">random&lt;/a> period of time and attempts the same procedure again.&lt;/li>
&lt;/ul>
&lt;p>1-persistent CSMA is used in CSMA/CD systems including &lt;a href="https://en.wikipedia.org/wiki/Ethernet">Ethernet&lt;/a>.&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Collision detection by sender&lt;/p>
&lt;ul>
&lt;li>Abort sending&lt;/li>
&lt;li>Send jamming signal (length of 48 bit, format &lt;code>1010...&lt;/code>)&lt;/li>
&lt;li>Ensure collision detection: Minimum length of frame&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Exponential backoff for repeated transmissions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="collision-detection">Collision Detection&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Collision detection by &lt;strong>sender&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Detection must happen before transmission is finished&lt;/p>
&lt;p>$\rightarrow$ A minimum sending duration is required&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Twice the maximum propagation delay $t\_a$ of the medium&lt;/p>
&lt;p>$\rightarrow$ &lt;strong>Minimum length&lt;/strong> of a 802.3-MAC frame required&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In case of shorter frames&lt;/p>
&lt;ul>
&lt;li>No reliable collision detection 🤪&lt;/li>
&lt;li>No CSMA/CD, only CSMA 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How to enforce minimum frame length?&lt;/p>
&lt;ul>
&lt;li>Implemented transparently for the application
&lt;ul>
&lt;li>I.e., application can transmit small portions of data if desired&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Frame is extended by &lt;strong>padding field (PAD)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
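&lt;p>The minimum frame length follows from the requirement that sending lasts at least twice the maximum propagation delay. A minimal sketch, assuming a data rate of 10 Mbit/s and a hypothetical maximum propagation delay of 25.6 µs; the function names are illustrative and not part of any standard API.&lt;/p>

```python
def min_frame_bits(data_rate_bps, max_propagation_delay_s):
    """Bits that must be sent so that sending lasts 2 * t_a seconds."""
    return 2.0 * max_propagation_delay_s * data_rate_bps


def padding_bits(frame_bits, minimum_bits):
    """Number of PAD bits needed to reach the minimum frame length."""
    return max(0, minimum_bits - frame_bits)


# Assumed example values: 10 Mbit/s and t_a = 25.6 microseconds.
minimum = min_frame_bits(10e6, 25.6e-6)
pad_for_short_frame = padding_bits(400, 512)
pad_for_long_frame = padding_bits(1000, 512)
```

&lt;p>With these assumed values the result corresponds to the 64-byte (512-bit) minimum frame of classic Ethernet; frames that are already long enough need no padding.&lt;/p>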
&lt;h4 id="ethernet-frame">Ethernet Frame&lt;/h4>
&lt;p>Structure&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2012.36.37.png" alt="截屏2021-03-18 12.36.37" style="zoom:67%;" />
&lt;p>Between two frames: &lt;strong>IFS&lt;/strong>&lt;/p>
&lt;h4 id="evaluation-ethernet-utilization">Evaluation Ethernet: Utilization&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Derive upper bound of utilization $U\_{max}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assumption&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Perfect protocol&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>No transmission errors, no overhead, no processing time, &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Achieved throughput
&lt;/p>
$$
r\_{e}=\frac{X}{t\_{s}+t\_{a}}=\frac{X}{X / r+d / v}
$$
&lt;ul>
&lt;li>$r\_e$: effective data rate&lt;/li>
&lt;li>$X$: #bits to transmit&lt;/li>
&lt;li>$t\_a$: propagation delay&lt;/li>
&lt;li>$t\_s$: transmission delay&lt;/li>
&lt;li>$r$: data rate&lt;/li>
&lt;li>$d$: length of the medium&lt;/li>
&lt;li>$v$: signal propagation speed&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Parameter $𝑎$ often used for performance evaluation
&lt;/p>
$$
a= \frac{\text{propagation delay}}{\text{transmission delay}} = \frac{t\_{a}}{t\_{s}}=\frac{d / v}{X / r}=\frac{r d}{X v}
$$
&lt;/li>
&lt;li>
&lt;p>Utilization under optimal circumstances
&lt;/p>
$$
U\_{\max }=\frac{r\_{e}}{r}=\frac{1}{1+a}
$$
&lt;/li>
&lt;li>
&lt;p>Local network with $𝑁$ active systems&lt;/p>
&lt;ul>
&lt;li>Each system can always send a frame&lt;/li>
&lt;li>System sends frames with probability $𝑝$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Maximum normalized propagation delay of $𝑎$&lt;/p>
&lt;ul>
&lt;li>I.e., transmission time $t\_s$ of each frame is normalized to 1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Time is logically partitioned in time slots&lt;/p>
&lt;ul>
&lt;li>Slot length is twice the end-to-end propagation delay (i.e., $2a$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Observations&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Two types of time intervals&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Transmission intervals&lt;/strong>: $\frac{1}{2a}$ time slots&lt;/p>
&lt;blockquote>
&lt;ul>
&lt;li>Transmission time $t\_s$ is normalized to 1&lt;/li>
&lt;li>Length of each time slot is $2a$&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ We need $\frac{1}{2a}$ time slots&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Collision intervals&lt;/strong>: collisions or no transmissions&lt;/p>
&lt;/li>
&lt;/ul>
$$
U\_{\max }=\frac{\text { Transmission interval }}{\text { Transmission interval }+\text { Collision interval }}
$$
&lt;/li>
&lt;li>
&lt;p>Evaluation
&lt;/p>
$$
\lim \_{N \rightarrow \infty} U\_{\max }=\frac{1}{1+3.44 a}
$$
&lt;details>
&lt;summary>Details&lt;/summary>
&lt;p>Average length $l\_k$ of a collision interval (measured in time slots)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Probability $A$ that exactly one system is sending:
&lt;/p>
$$
A = Np(1 - p)^{N-1}
$$
&lt;/li>
&lt;li>
&lt;p>Function has maximum at $p^\* = \frac{1}{N} \Rightarrow A^\* = (1 - \frac{1}{N})^{N-1}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Probability that $i$ consecutive time slots see a collision or no transmission, followed by a time slot with a successful transmission
&lt;/p>
$$
\left(1-A^{\*}\right)^{i} A^{\*}
$$
&lt;/li>
&lt;li>
&lt;p>Average length $l\_k$:
&lt;/p>
$$
E\left[l\_{k}\right]=\sum\_{i=1}^{\infty} i\left(1-A^{\*}\right)^{i} A^{\*} = \frac{1-A^{\*}}{A^{\*}}
$$
&lt;/li>
&lt;/ul>
&lt;p>Therefore
&lt;/p>
$$
U\_{\max }=\frac{\text { Transmission interval }}{\text { Transmission interval }+\text { Collision interval }} = \frac{1 /(2 a)}{1 /(2 a)+\left(1-A^{\*}\right) / A^{\*}}=\frac{1}{1+2 a\left(1-A^{\*}\right) / A^{\*}}
$$
&lt;p>
For increasing number $N$ of systems
&lt;/p>
$$
\lim \_{N \rightarrow \infty} A^{\*}=\lim \_{N \rightarrow \infty}\left(1-\frac{1}{N}\right)^{N-1}=1 / e
$$
$$
\Rightarrow \lim \_{N \rightarrow \infty} U\_{\max }=\frac{1}{1+3.44 a}
$$
&lt;/details>
&lt;/li>
&lt;/ul>
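&lt;p>The utilization bound can be evaluated for finite $N$ and compared with the limit. This sketch follows the formulas of the derivation; the function names and the sample value $a = 0.1$ are assumptions for illustration.&lt;/p>

```python
import math


def success_probability(n):
    """A* = (1 - 1/n)^(n - 1): exactly one of n systems sends, with p = 1/n."""
    return (1.0 - 1.0 / n) ** (n - 1)


def max_utilization(a, n):
    """U_max = 1 / (1 + 2a (1 - A*) / A*) for n active systems."""
    a_star = success_probability(n)
    return 1.0 / (1.0 + 2.0 * a * (1.0 - a_star) / a_star)


def max_utilization_limit(a):
    """Limit for n to infinity, where A* tends to 1/e."""
    return 1.0 / (1.0 + 2.0 * (math.e - 1.0) * a)


a = 0.1  # assumed ratio of propagation delay to transmission delay
u_small = max_utilization(a, 5)
u_large = max_utilization(a, 10**6)
u_limit = max_utilization_limit(a)
```

&lt;p>The constant in the limit is $2(e - 1) \approx 3.44$, and a smaller $a$ (short medium or long frames) yields a higher utilization.&lt;/p>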
&lt;h3 id="fast-ethernet">Fast Ethernet&lt;/h3>
&lt;ul>
&lt;li>Standardization: 1995 standardized as IEEE 802.3u (100Base-TX)&lt;/li>
&lt;li>Important features
&lt;ul>
&lt;li>Data rate: 100 Mbit/s
&lt;ul>
&lt;li>Switchable between 10 Mbit/s and 100 Mbit/s&lt;/li>
&lt;li>Automatic negotiation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Network topology: &lt;strong>Star&lt;/strong>&lt;/li>
&lt;li>Medium access control
&lt;ul>
&lt;li>CSMA/CD (for half duplex links)&lt;/li>
&lt;li>Preserve Ethernet frame format&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Modified encoding&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="ethernet-flow-control">Ethernet Flow Control&lt;/h4>
&lt;ul>
&lt;li>Goal: Avoid packet losses due to buffer overflow&lt;/li>
&lt;li>Approach: Reduce traffic transmitted to the switch&lt;/li>
&lt;li>Apply flow control at layer 2
&lt;ul>
&lt;li>&lt;strong>Half-duplex&lt;/strong> links (shared LAN)
&lt;ul>
&lt;li>&lt;strong>Implicit&lt;/strong> flow control&lt;/li>
&lt;li>Backpressure: prevent potential transmitter from actually sending traffic&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Full-duplex&lt;/strong> links
&lt;ul>
&lt;li>&lt;strong>Explicit&lt;/strong> flow control&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="implicit-flow-control">Implicit Flow Control&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Half-duplex links&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Two backpressure methods&lt;/p>
&lt;ul>
&lt;li>Enforce collision&lt;/li>
&lt;li>Pretend medium is busy&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="explicit-flow-control">Explicit Flow Control&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Full duplex link&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Pause function&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Receiver transmits PAUSE frame in case of an overload situation&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-28%2016.00.50.png" alt="截屏2021-03-28 16.00.50">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sender stops transmitting data frames when receiving a PAUSE frame&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Implicit continuation&lt;/strong> after the pause time given in the PAUSE frame (a multiple of the time needed to send 512 bits)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Explicit continuation&lt;/strong> when receiving PAUSE frame with time=0&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>PAUSE function is part of the &lt;strong>sublayer MAC control&lt;/strong> (extension of MAC layer)&lt;/p>
&lt;/li>
&lt;/ul>
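&lt;p>The pause time arithmetic can be sketched as follows. This is a minimal illustration, not part of the lecture: the function name and the 16-bit range check on the pause time field are assumptions, while the 512-bit unit comes from the notes above.&lt;/p>

```python
BITS_PER_QUANTUM = 512  # one pause unit = time needed to send 512 bits


def pause_duration_seconds(pause_quanta: int, link_rate_bps: float) -> float:
    """Return how long the sender stays silent for a given pause time value."""
    # Assumption: the pause time is carried in a 16-bit field.
    if pause_quanta not in range(0x10000):
        raise ValueError("pause time must fit into 16 bits")
    return pause_quanta * BITS_PER_QUANTUM / link_rate_bps
```

&lt;p>Under these assumptions, a pause time of 0 yields a duration of 0 seconds, which corresponds to the explicit continuation case.&lt;/p>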
&lt;h5 id="mac-control-sublayer">MAC Control sublayer&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Handling of frames&lt;/p>
&lt;ul>
&lt;li>MAC control frames terminate on the MAC control sublayer or are generated by it&lt;/li>
&lt;li>All other frames are passed from/to higher layers&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-28%2016.05.24-20210328205233536.png" alt="截屏2021-03-28 16.05.24">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>MAC Control Frame&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-28%2016.06.23.png" alt="截屏2021-03-28 16.06.23">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Type = 0x8808&lt;/p>
&lt;/li>
&lt;li>
&lt;p>MCC: MAC Control Opcode&lt;/p>
&lt;ul>
&lt;li>Code for selected control function&lt;/li>
&lt;li>0x0001: PAUSE frame&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>MAC Control Parameters: Unused part filled with zeros at the end&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
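&lt;p>As a rough sketch, a PAUSE frame could be assembled like this. The type &lt;code>0x8808&lt;/code>, the opcode &lt;code>0x0001&lt;/code> and the zero padding come from the notes; the destination address &lt;code>01:80:C2:00:00:01&lt;/code>, the 46-byte minimum payload and the helper name &lt;code>build_pause_frame&lt;/code> are assumptions added for illustration. Preamble and frame check sequence are left out.&lt;/p>

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # assumed reserved multicast address
MAC_CONTROL_TYPE = 0x8808                  # type field of MAC control frames
PAUSE_OPCODE = 0x0001                      # MAC control opcode for PAUSE
MIN_PAYLOAD = 46                           # assumed minimum Ethernet payload in bytes


def build_pause_frame(src_mac: bytes, pause_quanta: int) -> bytes:
    """Build a PAUSE frame without preamble and frame check sequence."""
    header = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL_TYPE)
    # Opcode and pause time, both 16 bit, network byte order.
    payload = struct.pack("!HH", PAUSE_OPCODE, pause_quanta)
    # The unused part of the parameter field is filled with zeros at the end.
    payload = payload.ljust(MIN_PAYLOAD, b"\x00")
    return header + payload
```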
&lt;h3 id="gigabit-ethernet">Gigabit Ethernet&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Important characteristics&lt;/p>
&lt;ul>
&lt;li>Data rate: &lt;strong>1 Gbit/s&lt;/strong>&lt;/li>
&lt;li>Network topology: &lt;strong>Star&lt;/strong>&lt;/li>
&lt;li>Medium access control
&lt;ul>
&lt;li>CSMA/CD (for half-duplex links)&lt;/li>
&lt;li>Preserve Ethernet frame format&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>New concepts&lt;/p>
&lt;ul>
&lt;li>Modify medium access control: &lt;strong>Carrier extension&lt;/strong>&lt;/li>
&lt;li>Optional possibilities for improved throughput: &lt;strong>Frame bursting&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="carrier-extension">Carrier Extension&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Goal: Ensure collision detection&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approach: Increase transmission delay &lt;em>without&lt;/em> modifying Ethernet frame structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Basic enhancements&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Length of time slot ≠ minimum length of frame&lt;/p>
&lt;ul>
&lt;li>Minimum frame length: 512 bits&lt;/li>
&lt;li>New length of time slot: 512 bytes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Frame with carrier extension&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-28%2016.11.34.png" alt="3">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="frame-bursting">Frame Bursting&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Goal: Efficient transmission of short frames&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approach: Systems are permitted to send a burst of frames &lt;strong>directly following each other&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>First frame with extension, if required&lt;/li>
&lt;li>Following frames directly back-to-back (no extension)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Schema&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-28%2016.12.58.png" alt="截屏2021-03-28 16.12.58" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
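&lt;p>The effect of carrier extension and frame bursting on short frames can be sketched with a few lines of arithmetic. This is a simplified model under stated assumptions: preamble, interframe gaps and any limit on the burst length are ignored, and all function names are made up for illustration.&lt;/p>

```python
SLOT_BYTES = 512  # Gigabit Ethernet time slot, as given in the notes


def carrier_extension_bytes(frame_bytes: int) -> int:
    """Extension needed so that the transmission fills one time slot."""
    return max(0, SLOT_BYTES - frame_bytes)


def wire_bytes_separate(frame_sizes):
    """Every frame is sent on its own and extended if required."""
    return sum(size + carrier_extension_bytes(size) for size in frame_sizes)


def wire_bytes_burst(frame_sizes):
    """Only the first frame of a burst is extended, the rest follow directly."""
    if not frame_sizes:
        return 0
    first, *rest = frame_sizes
    return first + carrier_extension_bytes(first) + sum(rest)
```

&lt;p>In this model, three minimum-size frames of 64 bytes occupy 1536 bytes on the medium when sent separately, but only 640 bytes when sent as one burst.&lt;/p>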
&lt;h3 id="summary">Summary&lt;/h3>
&lt;ul>
&lt;li>Today's Ethernet is very different from the original version developed by Metcalfe and Boggs&lt;/li>
&lt;li>One constant has remained: The Ethernet frame format&lt;/li>
&lt;/ul>
&lt;h2 id="spanning-tree">Spanning Tree&lt;/h2>
&lt;h3 id="bridges">Bridges&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Connect local area networks (LANs) on &lt;strong>layer 2&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Properties&lt;/p>
&lt;ul>
&lt;li>Filter function: Separates intra-network traffic within one LAN from inter-network traffic to other LANs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Schema&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2014.43.57.png" alt="截屏2021-03-18 14.43.57" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Types&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Source-Routing bridges&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>End systems add forwarding information to the packets they send
&lt;ul>
&lt;li>
&lt;p>Bridges forward the packets based on this information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sending packets is &lt;strong>NOT transparent&lt;/strong> for the end system – it has to know the path&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Technically easy but not often used in practice 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Transparent bridges&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Local forwarding decisions in each bridge&lt;/p>
&lt;ul>
&lt;li>Forwarding information normally stored in a table (forwarding table)&lt;/li>
&lt;li>Static entries as well as dynamically learned entries&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>End system is NOT involved in forwarding decisions&lt;/p>
&lt;p>$\rightarrow$ Existence of bridges is transparent to end systems&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often used in practice (e.g., switches)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Comparison&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2014.46.50.png" alt="截屏2021-03-18 14.46.50" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="transparent-bridges-resp-switches">Transparent Bridges and Switches&lt;/h4>
&lt;p>Important features&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2014.49.28.png" alt="截屏2021-03-18 14.49.28" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Each network interface has its &lt;strong>own layer 1 and MAC instance&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Data path: Through MAC relay (implements forwarding on layer 2)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Control path&lt;/p>
&lt;ul>
&lt;li>E.g., bridge protocol, bridge management&lt;/li>
&lt;li>Logical Link Control (LLC) instances are involved&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="basic-tasks">Basic Tasks&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Establishing a &lt;strong>loop-free topology&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Ensures that packets do not loop endlessly in the network&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ &lt;strong>&lt;a href="#spanning-tree-algorithm">Spanning-tree algorithm&lt;/a>&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Forwarding of packets&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Learning the “location” of end systems&lt;/p>
&lt;ul>
&lt;li>Creation of the forwarding table&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Filtering or forwarding of packets&lt;/p>
&lt;ul>
&lt;li>Based on the information of the forwarding table&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
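&lt;p>Learning, filtering and forwarding can be sketched with a small forwarding table. The class below is a hypothetical illustration, not an excerpt from any real switch implementation; in particular, flooding frames with an unknown destination on all other ports is an assumption that the notes do not spell out.&lt;/p>

```python
class LearningBridge:
    """Minimal transparent bridge with a dynamically learned forwarding table."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port behind which it was last seen

    def receive(self, src, dst, in_port):
        """Process one frame and return the set of ports it is sent out on."""
        self.table[src] = in_port          # learn the "location" of the sender
        out_port = self.table.get(dst)
        if out_port is None:
            return self.ports - {in_port}  # unknown destination: flood (assumed)
        if out_port == in_port:
            return set()                   # filter: destination is in the same LAN
        return {out_port}                  # forward based on the table entry
```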
&lt;h3 id="spanning-tree-algorithm">Spanning-Tree Algorithm&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Task&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Organize bridges in a &lt;strong>tree topology&lt;/strong> (NO loops!)&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Nodes&lt;/strong>: bridges and local networks&lt;/li>
&lt;li>&lt;strong>Edges&lt;/strong>: connections between interfaces and local networks&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Not all bridges have to be part of the tree topology&lt;/p>
&lt;ul>
&lt;li>Resources might not be used optimally&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Forwarding of packets (Only possible along the tree)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Bridge protocol&lt;/strong> implements the Spanning-Tree algorithm&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Requirements for using the bridge protocol&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Group address to address all bridges in the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Unique bridge identifier per bridge in the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Unique interface identifier per interface in each bridge&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Path costs for all interfaces of a bridge have to be known&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="bpdus">BPDUs&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Bridges send special packets: &lt;strong>Bridge Protocol Data Units (BPDUs)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>BPDU contains (among others)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Identifier of the &lt;strong>sending bridge&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Identifier of the bridge that is assumed as &lt;strong>root bridge&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Path cost&lt;/strong> from sending bridge to root bridge&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="basic-steps">Basic Steps&lt;/h4>
&lt;ol>
&lt;li>
&lt;p>Determine &lt;strong>root bridge&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Initially
&lt;ul>
&lt;li>Bridges have no topology information&lt;/li>
&lt;li>Every bridge initially assumes: “I am the root bridge”
&lt;ul>
&lt;li>Periodically send BPDU with itself as root bridge&lt;/li>
&lt;li>Bridges only relay BPDUs, no “normal” packets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Receiving BPDU with &lt;strong>smaller&lt;/strong> bridge identifier
&lt;ul>
&lt;li>Bridge no longer assumes that it is the root bridge&lt;/li>
&lt;li>No longer issues own BPDUs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>On receiving a BPDU, a bridge may update its configuration
&lt;ul>
&lt;li>BPDU contains root bridge with smaller identifier&lt;/li>
&lt;li>BPDU with same root bridge identifier but cheaper path to root bridge&lt;/li>
&lt;li>Bridge notices that it is not the designated bridge $\rightarrow$ No longer forwards BPDUs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Determine &lt;strong>root interfaces&lt;/strong> for each bridge&lt;/p>
&lt;ul>
&lt;li>Calculate the path costs to the root bridge (Sum over costs of all interfaces on path to the root bridge)&lt;/li>
&lt;li>Select interface with the lowest costs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Determine &lt;strong>designated bridge&lt;/strong> for each LAN (loop free!)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>LAN can have multiple bridges&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Select bridge with lowest costs on root interface&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Responsible for forwarding of packets&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Other bridges in the LAN will be deactivated&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
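&lt;p>The three steps can be sketched as a centralized computation. This is only an illustration under stated assumptions: real bridges reach the same result by exchanging BPDUs, the input format (one &lt;code>(bridge_id, lan, cost)&lt;/code> tuple per interface) and the function name are made up, the network is assumed to be connected, and ties are broken in a simplified way (by LAN name for root interfaces, by bridge identifier for designated bridges).&lt;/p>

```python
import math


def spanning_tree(ports):
    """ports: list of (bridge_id, lan, cost) tuples, one per bridge interface."""
    bridges = sorted({b for b, _, _ in ports})
    root = bridges[0]                      # step 1: smallest identifier wins
    cost = {b: math.inf for b in bridges}
    cost[root] = 0

    def offers(bridge):
        # (path cost to the root via this interface, LAN) per interface
        result = []
        for b, lan, port_cost in ports:
            if b != bridge:
                continue
            others = [cost[o] for o, l, _ in ports if l == lan and o != bridge]
            if others:
                result.append((min(others) + port_cost, lan))
        return result

    changed = True
    while changed:                         # repeat until all costs are stable
        changed = False
        for b in bridges:
            if b == root:
                continue
            best = min([c for c, _ in offers(b)], default=math.inf)
            if best != cost[b]:
                cost[b], changed = best, True

    # Step 2: root interface = interface with the lowest path cost to the root.
    root_port = {b: min(offers(b))[1] for b in bridges if b != root}
    # Step 3: designated bridge per LAN = attached bridge with the lowest cost.
    lans = sorted({lan for _, lan, _ in ports})
    designated = {
        lan: min((cost[b], b) for b, l, _ in ports if l == lan)[1]
        for lan in lans
    }
    return root, cost, root_port, designated
```

&lt;p>For three bridges 1, 2 and 3 connected in a triangle via LANs A, B and C with all interface costs equal to 1, this sketch selects bridge 1 as root and bridge 2 as designated bridge for LAN C, so the interface of bridge 3 towards LAN C does not forward packets.&lt;/p>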
&lt;h4 id="stable-phase">Stable Phase&lt;/h4>
&lt;ul>
&lt;li>Root bridge periodically issues BPDUs
&lt;ul>
&lt;li>Only “active” bridges forward BPDUs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>If no more BPDUs are received
&lt;ul>
&lt;li>Bridge again assumes that it is the root bridge&lt;/li>
&lt;li>Algorithm re-starts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>After stabilization packets are forwarded over the respective ports
&lt;ul>
&lt;li>Based on the entries in the forwarding table&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="example-1">Example 1&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2018.08.49.png" alt="截屏2021-03-18 18.08.49" style="zoom: 67%;" />
&lt;p>Calculate path costs to root bridge&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2018.09.40.png" alt="截屏2021-03-18 18.09.40" style="zoom:67%;" />
&lt;p>Determine designated bridges&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2018.10.37.png" alt="截屏2021-03-18 18.10.37" style="zoom:67%;" />
&lt;p>The resulting spanning tree:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2018.13.29.png" alt="截屏2021-03-18 18.13.29" style="zoom: 67%;" />
&lt;h4 id="example-2">Example 2&lt;/h4>
&lt;details>
&lt;summary>&lt;b>HW15&lt;/b>&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.29.29.png" alt="截屏2021-03-19 10.29.29" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.29.55.png" alt="截屏2021-03-19 10.29.55" style="zoom:67%;" />
&lt;p>Solution:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>a)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.33.13.png" alt="截屏2021-03-19 10.33.13" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>b) Note: only &lt;strong>non-root bridges&lt;/strong> have a root interface&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.31.01.png" alt="截屏2021-03-19 10.31.01" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>c) When determining the designated interface, start from the LAN and consider the shortest path&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.38.43.png" alt="截屏2021-03-19 10.38.43" style="zoom:67%;" />
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.41.05.png" alt="截屏2021-03-19 10.41.05" style="zoom:67%;" />&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.35.11.png" alt="截屏2021-03-19 10.35.11" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>d)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.35.48.png" alt="截屏2021-03-19 10.35.48" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.35.59.png" alt="截屏2021-03-19 10.35.59" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2010.41.48.png" alt="截屏2021-03-19 10.41.48" style="zoom:67%;" />&lt;/p>
&lt;/details>
&lt;h3 id="rapid-spanning-tree-protocol-rstp">Rapid Spanning Tree Protocol (RSTP)&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Overview of some relevant changes&lt;/p>
&lt;ul>
&lt;li>&lt;strong>New port roles&lt;/strong>
&lt;ul>
&lt;li>&lt;strong>Alternate Port&lt;/strong>: best alternative path to root bridge&lt;/li>
&lt;li>&lt;strong>Backup Port&lt;/strong>: alternative path to a network that already has a connection
&lt;ul>
&lt;li>Bridge has two ports which connect to the same network&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Sending BPDUs&lt;/strong>
&lt;ul>
&lt;li>are additionally used as “keep-alive” messages&lt;/li>
&lt;li>Every bridge sends periodic BPDUs (Hello-Timer = 2s)
&lt;ul>
&lt;li>
&lt;p>To the next hierarchy level in the tree&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Failure of a neighbor is assumed when three consecutive BPDUs are missed&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-18%2023.47.21.png" alt="截屏2021-03-18 23.47.21" style="zoom:67%;" />&lt;/li>
&lt;/ul></description></item><item><title>Data Center</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/09-data_center/</link><pubDate>Fri, 19 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/09-data_center/</guid><description>&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/fat_tree.png"
alt="Summary of fat tree">&lt;figcaption>
&lt;p>Summary of &lt;strong>fat tree&lt;/strong>&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;h3 id="data-center">Data Center&lt;/h3>
&lt;ul>
&lt;li>Typically has
&lt;ul>
&lt;li>Large number of compute servers with virtual machine support&lt;/li>
&lt;li>Extensive storage facilities&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Typically uses
&lt;ul>
&lt;li>Off-the-shelf commodity hardware devices
&lt;ul>
&lt;li>Huge amount of servers&lt;/li>
&lt;li>Switches with small buffers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Commodity protocols: &lt;strong>TCP/IP, Ethernet&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Should be
&lt;ul>
&lt;li>&lt;strong>Extensible&lt;/strong> without massive reorganization&lt;/li>
&lt;li>&lt;strong>Reliable&lt;/strong>
&lt;ul>
&lt;li>Requires adequate redundancy&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Highly performant&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-center-network">Data Center Network&lt;/h3>
&lt;ul>
&lt;li>Interconnects data center servers and storage components with each other&lt;/li>
&lt;li>Connects data center to the Internet&lt;/li>
&lt;li>Two types of traffic
&lt;ul>
&lt;li>Between external clients and internal servers&lt;/li>
&lt;li>Between internal servers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Border routers&lt;/strong>: Connect internal network of the data center to the public Internet&lt;/li>
&lt;li>Commodity protocols
&lt;ul>
&lt;li>TCP/IP&lt;/li>
&lt;li>Ethernet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="simplified-sketch">Simplified Sketch&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2011.30.14.png" alt="截屏2021-03-19 11.30.14" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Top-of-Rack (ToR) Ethernet switches&lt;/strong>&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-19%2011.31.30.png" alt="截屏2021-03-19 11.31.30" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>Connect servers within a rack&lt;/li>
&lt;li>Switches typically have small buffers&lt;/li>
&lt;li>Can be placed directly at the “top” of the rack&lt;/li>
&lt;li>A typical data center rack has 42-48 rack units&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="routingforwarding-within-data-center">Routing/Forwarding within Data Center&lt;/h3>
&lt;p>Requirements&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Efficient way to communicate between any two servers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Utilize network efficiently&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Avoid forwarding loops&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Detect failures quickly&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provide flexible and efficient migration of virtual machines between servers&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="fat-tree-topologies">Fat-Tree Topologies&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Connect large number of servers by using switches that only have a limited number of ports&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Characteristics&lt;/p>
&lt;ul>
&lt;li>For any switch, the number of links going down to its children is equal to the number of links going up to its parents&lt;/li>
&lt;li>The links get &lt;strong>“fatter”&lt;/strong> towards the top of the tree&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Structure&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2011.38.57.png" alt="截屏2021-03-19 11.38.57" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>East-west traffic&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Between internal servers and server racks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Result of internal applications, e.g.,&lt;/p>
&lt;ul>
&lt;li>MapReduce,&lt;/li>
&lt;li>Storage data movement between servers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>North-south traffic&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Result of external request from the public Internet&lt;/li>
&lt;li>Between external clients and internal servers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problems: Switches need different numbers of ports&lt;/p>
&lt;ul>
&lt;li>Switches with high number of ports are expensive 💸&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="k-pod-fat-tree">K-Pod Fat-Tree&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Each switch has &lt;strong>$k$ ports&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Edge&lt;/strong> and &lt;strong>aggregation&lt;/strong> switches are arranged in &lt;strong>$k$ pods&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\frac{k}{2}$ edge switches and $\frac{k}{2}$ aggregation switches per pod&lt;/p>
&lt;p>$\Rightarrow$ Overall: $\frac{k^2}{2}$ edge and $\frac{k^2}{2}$ aggregation switches&lt;/p>
&lt;p>$\Rightarrow$ $k^2$ switches in all pods&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$(\frac{k}{2})^2$ &lt;strong>core switches&lt;/strong>, each connects to $k$ pods&lt;/p>
&lt;p>$\Rightarrow$ Overall $k^2 + (\frac{k}{2})^2 = \frac{5}{4}k^2$ switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each edge switch connected to $\frac{k}{2}$ servers&lt;/p>
&lt;p>$\Rightarrow$ Overall $\frac{k^2}{2} \cdot \frac{k}{2} = \frac{k^3}{4}$ servers can be connected&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each aggregation switch connected to $\frac{k}{2}$ edge and $\frac{k}{2}$ core switches&lt;/p>
&lt;p>$\Rightarrow$ Overall $2 \cdot (k \cdot \frac{k}{2}) \cdot \frac{k}{2} = \frac{k^3}{2}$ links (links to servers not included)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Summary: $k$-pod fat-tree&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Component&lt;/th>
&lt;th>Number&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>pod&lt;/td>
&lt;td>$k$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>edge switch&lt;/td>
&lt;td>$\frac{k^2}{2}$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>aggregation switch&lt;/td>
&lt;td>$\frac{k^2}{2}$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>core switch&lt;/td>
&lt;td>$(\frac{k}{2})^2$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>server&lt;/td>
&lt;td>$\frac{k^3}{4}$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>links between switches&lt;/td>
&lt;td>$\frac{k^3}{2}$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/blockquote>
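&lt;p>The formulas in the summary table can be checked with a short calculation. The function below is a minimal sketch; its name and the dictionary keys are chosen for illustration only.&lt;/p>

```python
def fat_tree_counts(k: int) -> dict:
    """Component counts of a k-pod fat-tree built from k-port switches."""
    if not (k >= 2 and k % 2 == 0):
        raise ValueError("k must be a positive even number")
    half = k // 2
    edge = k * half          # k/2 edge switches in each of the k pods
    aggregation = k * half   # k/2 aggregation switches in each of the k pods
    core = half * half       # (k/2)^2 core switches
    return {
        "pods": k,
        "edge_switches": edge,
        "aggregation_switches": aggregation,
        "core_switches": core,
        "switches": edge + aggregation + core,
        "servers": edge * half,            # k/2 servers per edge switch
        "switch_links": aggregation * k,   # k/2 down and k/2 up per aggregation switch
    }
```

&lt;p>For $k = 4$ this gives 20 switches, 16 servers and 32 links between switches; for $k = 48$ it gives 2880 switches and 27648 servers.&lt;/p>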
&lt;ul>
&lt;li>
&lt;p>Every link is in fact a physical cable $\rightarrow$ high cabling complexity 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>Example: $k(=4)$-Pod Fat-Tree&lt;/em>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2011.41.39.png" alt="截屏2021-03-19 11.41.39" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>
&lt;p>All switches are identical&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cheap commodity switches can be used&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiple equal cost paths between any hosts&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Disadvantages: High cabling complexity&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="routing-paths">Routing Paths&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Within a pod&lt;/strong>: $\frac{k}{2}$ paths from source to destination&lt;/p>
&lt;ul>
&lt;li>Example&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2011.54.07.png" alt="截屏2021-03-19 11.54.07" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Between servers in different pods&lt;/strong>: $\frac{k^2}{4}$ ($= \frac{k}{2} \cdot \frac{k}{2}$) paths from source to destination&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2012.00.06.png" alt="截屏2021-03-19 12.00.06" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="address-assignment">Address Assignment&lt;/h3>
&lt;p>Assume that addresses are assigned from the private IPv4 address block &lt;code>10.0.0.0/8&lt;/code>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Pods&lt;/strong> are enumerated from left to right: $[0, k - 1]$
&lt;ul>
&lt;li>&lt;strong>Switches in a pod&lt;/strong>: IP address &lt;code>10.pod.switch.1&lt;/code>
&lt;ul>
&lt;li>Edge switches are enumerated from left to right: $[0, \frac{k}{2} - 1]$&lt;/li>
&lt;li>Enumeration continues with aggregation switches from left to right: $[ \frac{k}{2}, k - 1]$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Servers&lt;/strong>: IP address &lt;code>10.pod.switch.ID&lt;/code>
&lt;ul>
&lt;li>Based on the IP address of the connected edge switch&lt;/li>
&lt;li>IDs are assigned to servers from left to right starting with &lt;strong>2&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Core switches&lt;/strong>: IP address &lt;code>10.k.x.y&lt;/code>
&lt;ul>
&lt;li>&lt;code>x&lt;/code> : starts at 1 and increments every $\frac{k}{2}$ core switches&lt;/li>
&lt;li>&lt;code>y&lt;/code> : enumerates each switch in a block of $\frac{k}{2}$ core switches from left to right, starting with 1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example: IP address assignment for pod 0&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2012.40.54.png" alt="截屏2021-03-19 12.40.54" style="zoom:67%;" />
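&lt;p>The addressing scheme can also be written down as a small generator. This is a sketch of the rules listed above; the function names and the returned data layout are assumptions made for illustration.&lt;/p>

```python
def pod_addresses(k: int, pod: int):
    """Addresses of the switches and servers in one pod of a k-pod fat-tree."""
    half = k // 2
    # Edge switches take the numbers 0 .. k/2 - 1, aggregation switches k/2 .. k - 1.
    edge = [f"10.{pod}.{s}.1" for s in range(half)]
    aggregation = [f"10.{pod}.{s}.1" for s in range(half, k)]
    # Servers reuse the prefix of their edge switch, with IDs starting at 2.
    servers = {
        f"10.{pod}.{s}.1": [f"10.{pod}.{s}.{i}" for i in range(2, half + 2)]
        for s in range(half)
    }
    return edge, aggregation, servers


def core_addresses(k: int):
    """Addresses 10.k.x.y of the core switches, with x and y starting at 1."""
    half = k // 2
    return [f"10.{k}.{x}.{y}" for x in range(1, half + 1) for y in range(1, half + 1)]
```

&lt;p>For $k = 4$ and pod 0, the edge switches are &lt;code>10.0.0.1&lt;/code> and &lt;code>10.0.1.1&lt;/code>, the aggregation switches are &lt;code>10.0.2.1&lt;/code> and &lt;code>10.0.3.1&lt;/code>, and the servers below &lt;code>10.0.0.1&lt;/code> are &lt;code>10.0.0.2&lt;/code> and &lt;code>10.0.0.3&lt;/code>.&lt;/p>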
&lt;h3 id="two-level-routing-tables">Two-level Routing Tables&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2012.44.00.png" alt="截屏2021-03-19 12.44.00" style="zoom:67%;" />
&lt;details>
&lt;summary>&lt;b>Example: HW17&lt;/b>&lt;/summary>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.03.08.png" alt="截屏2021-03-22 17.03.08" style="zoom: 67%;" />
&lt;p>Solution for (a):&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.04.18.png" alt="截屏2021-03-22 17.04.18" style="zoom: 67%;" />
&lt;p>Solution for (b):&lt;/p>
&lt;p>Use the following short-hand notation for the TCAM-based routing tables&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.05.21.png" alt="截屏2021-03-22 17.05.21" style="zoom:67%;" />
&lt;p>x &amp;ndash;&amp;gt; a:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.06.32.png" alt="截屏2021-03-22 17.06.32" style="zoom:67%;" />
&lt;blockquote>
&lt;p>&lt;strong>💡 Idea: if &lt;code>x.x.x.2&lt;/code>, then choose left; if &lt;code>x.x.x.3&lt;/code> then choose right&lt;/strong>&lt;/p>
&lt;p>Switch &lt;code>10.1.0.1&lt;/code> is connected with&lt;/p>
&lt;ul>
&lt;li>Server x (&lt;code>10.1.0.2&lt;/code>)&lt;/li>
&lt;li>Server a (&lt;code>10.1.0.3&lt;/code>)&lt;/li>
&lt;li>Aggregation switch &lt;code>10.1.2.1&lt;/code>&lt;/li>
&lt;li>Aggregation switch &lt;code>10.1.3.1&lt;/code>&lt;/li>
&lt;/ul>
&lt;p>In TCAM table&lt;/p>
&lt;ul>
&lt;li>For &lt;code>10.1.0.2&lt;/code> and &lt;code>10.1.0.3&lt;/code>, there&amp;rsquo;s only ONE way to go&lt;/li>
&lt;li>For &lt;code>x.x.x.2&lt;/code> (which is the first/left server connected to the edge switch), next hop will be the first/left connected aggregation switch (in this case, &lt;code>10.1.2.1&lt;/code>)&lt;/li>
&lt;li>For &lt;code>x.x.x.3&lt;/code> (which is the second/right server connected to the edge switch), next hop will be the second/right connected aggregation switch (in this case, &lt;code>10.1.3.1&lt;/code>)&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;p>x &amp;ndash;&amp;gt; b:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.15.37.png" alt="截屏2021-03-22 17.15.37" style="zoom:67%;" />
&lt;p>x &amp;ndash;&amp;gt; c:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.15.57.png" alt="截屏2021-03-22 17.15.57" style="zoom:67%;" />
&lt;/details>
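&lt;p>The idea from the example (local servers are reached directly, everything else is spread over the aggregation switches based on the last address byte) can be sketched for an edge switch as follows. The function name is hypothetical, and the lookup is written as plain arithmetic instead of an actual TCAM table with prefix and suffix entries.&lt;/p>

```python
def edge_next_hop(switch_ip: str, dst_ip: str, k: int) -> str:
    """Next hop chosen by the edge switch `switch_ip` for destination `dst_ip`."""
    _, pod, switch, _ = (int(part) for part in switch_ip.split("."))
    dst = [int(part) for part in dst_ip.split(".")]
    if dst[1] == pod and dst[2] == switch:
        return dst_ip                      # server is attached to this switch
    half = k // 2
    # Server IDs start at 2: ID 2 maps to the first (left) aggregation
    # switch of the pod, ID 3 to the next one, and so on.
    uplink = half + (dst[3] - 2) % half
    return f"10.{pod}.{uplink}.1"
```

&lt;p>With $k = 4$, the edge switch &lt;code>10.1.0.1&lt;/code> delivers packets for &lt;code>10.1.0.3&lt;/code> directly, sends packets for any other &lt;code>x.x.x.2&lt;/code> to &lt;code>10.1.2.1&lt;/code> and packets for any other &lt;code>x.x.x.3&lt;/code> to &lt;code>10.1.3.1&lt;/code>.&lt;/p>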
&lt;h2 id="ethernet">Ethernet&lt;/h2>
&lt;h2 id="within-data-centers">within Data Centers&lt;/h2>
&lt;p>🎯 Goal&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Unification of network technologies in the context of data centers&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Storage Area Networks (SANs)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>HPC networking (High Performance Computing)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Ethernet as a &amp;ldquo;fabric&amp;rdquo; for data centers&lt;/p>
&lt;ul>
&lt;li>Has to cope with a mix of different types of traffic $\rightarrow$ Prioritization required&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-center-bridging">Data Center Bridging&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Unified, Ethernet-based solution for a wide variety of data center applications&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Extensions&lt;/strong> to Ethernet&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Priority-based flow control (PFC)&lt;/strong>&lt;/p>
&lt;p>Link level flow control independent for each priority&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Enhanced transmission selection (ETS)&lt;/strong>&lt;/p>
&lt;p>Assignment of bandwidth to traffic classes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Quantized congestion notification&lt;/strong>&lt;/p>
&lt;p>Support for end-to-end congestion control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Data Center Bridge Exchange&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="priority-based-flow-control-pfc">Priority-based Flow Control (PFC)&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Objective: avoid data loss due to congestion&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Simple flow control already provided by Ethernet: &lt;strong>PAUSE frame&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>All traffic on the corresponding port is paused&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Priority flow control pause&lt;/strong> frame&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Eight priority levels&lt;/strong> on one link&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-19%2012.50.47.png" alt="截屏2021-03-19 12.50.47" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Use of the &lt;strong>VLAN tag&lt;/strong> (3-bit PCP field)&lt;/p>
&lt;p>$\rightarrow$ Eight virtual links on a physical link&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Pause time can be individually selected for each priority level&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>$\rightarrow$ Differentiated quality of service possible 👏&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Prioritization with Ethernet: &lt;strong>Virtual LAN&lt;/strong>s&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Introduction of a new field for VLAN tags: &lt;strong>Q header&lt;/strong>&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-19%2012.53.10.png" alt="截屏2021-03-19 12.53.10" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Differentiation of traffic according to priority chosen by PCP&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="enhanced-transmission-selection-ets">Enhanced Transmission Selection (ETS)&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Reservation of bandwidth&lt;/p>
&lt;ul>
&lt;li>Introduction of &lt;strong>priority groups (PGs)&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Can contain multiple priority levels of a traffic type&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different virtual queues in the network interface&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Traffic within one priority group can be handled differently&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Guarantee a &lt;strong>minimum data rate&lt;/strong> per priority group
&lt;ul>
&lt;li>Unused capacity usable by other priority groups&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2013.04.49.png" alt="截屏2021-03-19 13.04.49" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="quantized-congestion-notification-qcn">Quantized Congestion Notification (QCN)&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Can be used by switch to notify source node that causes congestion&lt;/p>
&lt;ul>
&lt;li>Note: a PAUSE frame is only sent to the neighboring node&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Three main functions of QCN protocol&lt;/p>
&lt;ul>
&lt;li>Congestion &lt;strong>detection&lt;/strong>
&lt;ul>
&lt;li>Estimation of the strength of congestion&lt;/li>
&lt;li>Evaluation of buffer occupancy
&lt;ul>
&lt;li>Predefined threshold reached $\rightarrow$ notification&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Congestion &lt;strong>notification&lt;/strong>
&lt;ul>
&lt;li>Feedback to the congestion source via a congestion notification message
&lt;ul>
&lt;li>Contains quantized feedback&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Congestion &lt;strong>response&lt;/strong>
&lt;ul>
&lt;li>Source can limit data rate using a &lt;strong>rate limiter&lt;/strong>&lt;/li>
&lt;li>Algorithm with additive increase, multiplicative decrease (AIMD) used
&lt;ul>
&lt;li>Increase data rate (additive)
&lt;ul>
&lt;li>Autonomously in absence of feedback&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Decrease data rate (multiplicative)
&lt;ul>
&lt;li>Upon receipt of a congestion notification message&lt;/li>
&lt;li>The data rate is lowered by at most 50%&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
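&lt;p>The congestion response can be sketched as a toy rate limiter. This is a simplified model, not the algorithm from the standard: the class name &lt;code>QcnRateLimiter&lt;/code>, the 6-bit feedback range (&lt;code>fb_max=63&lt;/code>) and the fixed additive step are assumptions chosen so that the strongest feedback yields exactly the 50% cut mentioned above.&lt;/p>

```python
class QcnRateLimiter:
    """Toy reaction point: multiplicative decrease on feedback, additive increase otherwise."""

    def __init__(self, rate, line_rate, increase_step, fb_max=63):
        self.rate = rate                    # current sending rate (e.g. Mbit/s)
        self.line_rate = line_rate          # upper bound for the rate
        self.increase_step = increase_step  # additive step per quiet interval
        self.fb_max = fb_max                # largest quantized feedback value (assumed)

    def on_congestion_notification(self, feedback):
        # Stronger congestion gives a larger cut; the cut is capped at 50 %.
        strength = min(abs(feedback), self.fb_max) / self.fb_max
        self.rate *= 1.0 - 0.5 * strength

    def on_quiet_interval(self):
        # No feedback received: increase the rate autonomously.
        self.rate = min(self.rate + self.increase_step, self.line_rate)


limiter = QcnRateLimiter(rate=1000.0, line_rate=1000.0, increase_step=5.0)
limiter.on_congestion_notification(63)   # maximum feedback: rate is halved
limiter.on_quiet_interval()              # one additive recovery step
```

&lt;p>Because the feedback is quantized, a weak congestion signal reduces the rate only slightly, while the maximum value halves it.&lt;/p>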
&lt;h4 id="data-center-bridge-exchange-dcbx-protocol">Data Center Bridge Exchange (DCBX) Protocol&lt;/h4>
&lt;p>Detection of capabilities and configuration of neighbors&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For example, priority-based flow control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Periodic broadcasts to the neighbors&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2013.13.46.png" alt="截屏2021-03-19 13.13.46" style="zoom:67%;" />
&lt;h3 id="beyond-the-spanning-tree">Beyond the Spanning Tree&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goals&lt;/p>
&lt;ul>
&lt;li>&lt;strong>More flexibility&lt;/strong> in terms of network topology and usage&lt;/li>
&lt;li>&lt;strong>Better utilization&lt;/strong> of the total available capacity&lt;/li>
&lt;li>&lt;strong>Scalability&lt;/strong> for networks with many bridges&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Various concepts developed&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Shortest Path Bridging (SPB)&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Transparent Interconnection of Lots of Links (TRILL)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Common characteristics of SPB and TRILL&lt;/p>
&lt;ul>
&lt;li>Provide multipath routing at layer 2&lt;/li>
&lt;li>Use of link state routing: modified Intermediate-System-to-Intermediate-System (IS-IS) protocol&lt;/li>
&lt;li>Use of en-/decapsulation of frames at domain border&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="shortest-path-bridging">Shortest Path Bridging&lt;/h4>
&lt;ul>
&lt;li>Method
&lt;ul>
&lt;li>Every bridge in the LAN calculates shortest paths
&lt;ul>
&lt;li>Shortest path trees (unique identifier in the LAN)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Paths have to be symmetric&lt;/li>
&lt;li>Learning of MAC addresses&lt;/li>
&lt;li>Support for equal cost multipath&lt;/li>
&lt;li>Same paths for unicast and multicast&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="transparent-interconnection-of-lots-of-links">Transparent Interconnection of Lots of Links&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Routing bridges (RBridges)&lt;/strong> implement TRILL&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Each RBridge in the LAN calculates shortest routes to all other RBridges $\rightarrow$ Tree&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Encapsulation example: data sent from S to D&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2015.59.27.png" alt="截屏2021-03-19 15.59.27" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>RBridge RB1 encapsulates frame from S&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Specifies RBridge RB3 as the target because D is behind RB3&lt;/p>
&lt;/li>
&lt;li>
&lt;p>RBridge RB3 decapsulates frame&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>RBridges&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Encapsulation: insert TRILL header&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Resulting overall header&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2016.00.41.png" alt="截屏2021-03-19 16.00.41" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Outer Ethernet&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>MAC addresses for point-to-point forwarding&lt;/li>
&lt;li>Change on every hop&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Current source and destination Bridge MAC addresses&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>TRILL header includes among others&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Nickname of ingress RBridge&lt;/li>
&lt;li>Nickname of egress RBridge&lt;/li>
&lt;li>Hop count&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Nicknames of overall source (ingress) and destination (egress) bridges&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Inner Ethernet&lt;/strong>: Source and destination MAC addresses of communicating end systems&lt;/p>
&lt;blockquote>
&lt;p>MAC addresses of source and destination end systems&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-28%2021.26.25.png" alt="截屏2021-03-28 21.26.25" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;/li>
&lt;/ul>
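&lt;p>The three header layers can be made concrete with a minimal sketch of encapsulation, per-hop forwarding and decapsulation. The data classes, the function names, the transit bridge RB2 and the initial hop count of 8 are hypothetical; real TRILL headers carry further fields that are omitted here.&lt;/p>

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EthernetFrame:
    src_mac: str
    dst_mac: str
    payload: bytes


@dataclass(frozen=True)
class TrillFrame:
    outer_src_mac: str      # MAC of the RBridge sending on this hop
    outer_dst_mac: str      # MAC of the next-hop RBridge
    ingress_nickname: str   # stays fixed along the whole path
    egress_nickname: str    # stays fixed along the whole path
    hop_count: int
    inner: EthernetFrame    # original frame of the end systems


def encapsulate(inner, ingress, egress, my_mac, next_hop_mac, hop_count=8):
    return TrillFrame(my_mac, next_hop_mac, ingress, egress, hop_count, inner)


def forward(frame, my_mac, next_hop_mac):
    # Only the outer Ethernet header and the hop count change per hop.
    if not frame.hop_count > 0:
        raise ValueError("hop count exhausted, frame dropped")
    return replace(frame, outer_src_mac=my_mac, outer_dst_mac=next_hop_mac,
                   hop_count=frame.hop_count - 1)


def decapsulate(frame, my_nickname):
    if frame.egress_nickname != my_nickname:
        raise ValueError("not the egress RBridge for this frame")
    return frame.inner


original = EthernetFrame("mac-S", "mac-D", b"hello")
at_rb1 = encapsulate(original, "RB1", "RB3", "mac-RB1", "mac-RB2")
at_rb2 = forward(at_rb1, "mac-RB2", "mac-RB3")
delivered = decapsulate(at_rb2, "RB3")
```

&lt;p>The inner frame from S to D is never modified; only the outer MAC addresses are rewritten on every hop.&lt;/p>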
&lt;h2 id="tcp-within-data-centers">TCP within Data Centers&lt;/h2>
&lt;p>Relevant Properties&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Low round trip times (RTT)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Servers typically in close geographical proximity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Values in the range of microseconds instead of milliseconds&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Incast communication&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Many-to-one: multiple sources transmit data to one sink (synchronized)&lt;/li>
&lt;li>Application examples: MapReduce, web search, advertising, recommendation systems &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Multiple paths&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Mix of long-lived and short-lived flows&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Little statistical multiplexing&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Virtualization&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Ethernet as a &amp;ldquo;fabric&amp;rdquo; for data centers&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Commodity switches&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="incast-problem-in-data-centers">Incast Problem in Data Centers&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Incast: many-to-one communication pattern&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Request is distributed to multiple servers&lt;/li>
&lt;li>Servers respond almost synchronously
&lt;ul>
&lt;li>Often, applications cannot continue until all responses have been received, or they deliver worse results if responses are missing&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Total number of responses can cause overflows in small switch buffers&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-19%2016.15.07.png" alt="截屏2021-03-19 16.15.07" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Packet Loss in Ethernet Switch&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Situation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Ports often share buffers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Individual response may be small (a few kilobytes)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Packet losses in switch possible because&lt;/p>
&lt;ul>
&lt;li>A large number of responses can overload a port&lt;/li>
&lt;li>High background traffic on the same port as the incast traffic, or&lt;/li>
&lt;li>High background traffic on a different port than the incast traffic (ports share buffers)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Packet loss causes TCP retransmission timeout&lt;/p>
&lt;p>$\rightarrow$ No further data is received, so no duplicate ACKs can be generated and fast retransmit cannot be triggered&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-19%2016.17.10.png" alt="截屏2021-03-19 16.17.10" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Barrier synchronization&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The slowest TCP connection determines the overall efficiency&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Affected TCP instance must wait for retransmission timeout&lt;/p>
&lt;p>$\rightarrow$ Long periods in which the TCP connection cannot transfer data&lt;/p>
&lt;p>$\rightarrow$ Application is blocked, i.e., response time increases&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Improvements&lt;/p>
&lt;ul>
&lt;li>Smaller minimum retransmission timeout&lt;/li>
&lt;li>Desynchronization&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-center-tcp-dctcp">Data Center TCP (DCTCP)&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Achieve &lt;strong>high burst tolerance&lt;/strong>, &lt;strong>low latencies&lt;/strong> and &lt;strong>high throughput&lt;/strong> with shallow-buffered commodity switches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Property: DCTCP works with low utilization of queues without reducing throughput&lt;/p>
&lt;/li>
&lt;li>
&lt;p>How does DCTCP achieve its goal?&lt;/p>
&lt;ul>
&lt;li>Responds to strength of congestion and not to its presence&lt;/li>
&lt;li>DCTCP
&lt;ul>
&lt;li>Modifies explicit congestion notification (ECN)&lt;/li>
&lt;li>Estimates fraction of bytes that encountered congestion&lt;/li>
&lt;li>Scales TCP congestion window based on estimate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>ECN in the Switch&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Modified explicit congestion notification (ECN)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Very simple active queue management using a threshold parameter $K$&lt;/p>
&lt;ul>
&lt;li>If $\text{\# elements in queue} > K$: set the CE codepoint&lt;/li>
&lt;li>Marking based on instantaneous rather than average queue length&lt;/li>
&lt;/ul>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-19%2016.27.08.png" alt="截屏2021-03-19 16.27.08" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>Suggestion: $K > (RTT \cdot C)/7$
&lt;ul>
&lt;li>$C$: data rate in packets/s&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
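&lt;p>As a worked example of the suggested bound, the sketch below computes the smallest integer threshold for a hypothetical link. The link speed (10 Gbit/s), packet size (1500 bytes) and RTT (100 microseconds) are assumed values, as are the function names.&lt;/p>

```python
import math


def min_threshold_packets(rtt_seconds, link_bps, packet_bytes=1500):
    """Lower bound for K in packets, following K > RTT * C / 7."""
    capacity_pps = link_bps / (packet_bytes * 8)   # C in packets/s
    return rtt_seconds * capacity_pps / 7


def should_mark_ce(queue_length, k):
    # Marking uses the instantaneous queue length, not an average.
    return queue_length > k


bound = min_threshold_packets(rtt_seconds=100e-6, link_bps=10e9)
k = math.floor(bound) + 1   # smallest integer strictly above the bound
```

&lt;p>For these assumed values the bound is about 11.9 packets, so the smallest admissible integer threshold is 12 packets. Deployed thresholds may be chosen higher than this lower bound.&lt;/p>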
&lt;p>&lt;strong>ECN Echo at the Receiver&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>New boolean TCP state variable: &lt;strong>DCTCP Congestion Encountered (&lt;code>DCTCP.CE&lt;/code>)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Receiving segments&lt;/p>
&lt;ul>
&lt;li>
&lt;p>If CE codepoint is set and &lt;code>DCTCP.CE&lt;/code> is false&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Set DCTCP.CE to true&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Send an immediate ACK&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>If CE codepoint is not set and &lt;code>DCTCP.CE&lt;/code> is true&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Set DCTCP.CE to false&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Send an immediate ACK&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Otherwise: Ignore CE codepoint&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
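&lt;p>The receiver rules above amount to a two-state machine: an immediate ACK is sent exactly when the CE marking of an arriving segment differs from the stored state. A minimal sketch, with the hypothetical class name &lt;code>DctcpReceiver&lt;/code> and a boolean return value standing in for "send an immediate ACK":&lt;/p>

```python
class DctcpReceiver:
    def __init__(self):
        self.ce = False   # DCTCP.CE state variable

    def on_segment(self, ce_marked):
        """Process one segment; return True if an immediate ACK is required."""
        if ce_marked != self.ce:
            # Marking changed: flip the state and acknowledge right away.
            self.ce = ce_marked
            return True
        # Marking unchanged: normal (possibly delayed) ACK handling applies.
        return False


receiver = DctcpReceiver()
immediate = [receiver.on_segment(mark) for mark in (False, True, True, False)]
```

&lt;p>Only the second and fourth segment trigger an immediate ACK here, because these are the points at which the marking changes. This lets the sender reconstruct the exact run of marked bytes even when ACKs are otherwise delayed.&lt;/p>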
&lt;p>&lt;strong>Controller at the Sender&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Estimates fraction of bytes sent that encountered congestion (&lt;code>DCTCP.Alpha&lt;/code>)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialized to 1&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Update:
&lt;/p>
$$
\text{DCTCP.Alpha} = (1-g) * \text{DCTCP.Alpha} + g * M
$$
&lt;ul>
&lt;li>
&lt;p>$g$: estimation gain ($0 &lt; g &lt; 1$)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$M$: fraction of bytes sent that encountered congestion during previous observation window (approximately $RTT$)
&lt;/p>
$$
M=\frac{\text{\# marked bytes}}{\text{\# bytes acked (total)}}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Update congestion window in case of congestion
&lt;/p>
$$
CWnd = (1 - \text{DCTCP.Alpha} / 2) * CWnd
$$
&lt;ul>
&lt;li>If $\text{DCTCP.Alpha}$ is close to 0, $CWnd$ is only slightly reduced&lt;/li>
&lt;li>If $\text{DCTCP.Alpha} = 1$, $CWnd$ is halved&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Handling of congestion window growth as in conventional TCP&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apply as usual&lt;/p>
&lt;ul>
&lt;li>Slow start, additive increase, recovery from lost packets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
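&lt;p>The sender-side controller can be sketched as follows. The class and method names are hypothetical, the gain &lt;code>g = 1/16&lt;/code> is an assumed value, and the sketch only covers the estimator and the window reduction; slow start, additive increase and loss recovery are left out.&lt;/p>

```python
class DctcpSender:
    def __init__(self, cwnd, g=1 / 16):
        self.cwnd = cwnd      # congestion window (any consistent unit)
        self.g = g            # estimation gain, 0 to 1 (exclusive)
        self.alpha = 1.0      # DCTCP.Alpha, initialized to 1
        self.acked = 0        # bytes acked in the current observation window
        self.marked = 0       # bytes acked with congestion echo set

    def on_ack(self, acked_bytes, ece):
        self.acked += acked_bytes
        if ece:
            self.marked += acked_bytes

    def end_of_window(self):
        """Call once per observation window (roughly one RTT)."""
        if self.acked == 0:
            return
        m = self.marked / self.acked
        self.alpha = (1 - self.g) * self.alpha + self.g * m
        if self.marked > 0:
            # Congestion was signalled: scale the window by the estimate.
            self.cwnd = (1 - self.alpha / 2) * self.cwnd
        self.acked = 0
        self.marked = 0


sender = DctcpSender(cwnd=100.0)
sender.on_ack(1000, ece=True)    # every byte in this window was marked
sender.end_of_window()
window_after_congestion = sender.cwnd
sender.on_ack(1000, ece=False)   # no marks in the next window
sender.end_of_window()
```

&lt;p>With all bytes marked and &lt;code>DCTCP.Alpha&lt;/code> still at 1, the window is halved as in conventional TCP. In the following unmarked window the estimate decays to $15/16$ and the window is left untouched.&lt;/p>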
&lt;p>👍 &lt;strong>Benefits of DCTCP&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Incast
&lt;ul>
&lt;li>If number of small flows is too large, no congestion control will help&lt;/li>
&lt;li>If queue is built up over multiple RTTs, early reaction of DCTCP will help&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Queue buildup: DCTCP reacts if queue is longer than $K$ (instantaneously)
&lt;ul>
&lt;li>Reduces queueing delays&lt;/li>
&lt;li>Minimizes impact of long-lived flows on completion time of small flows&lt;/li>
&lt;li>More buffer space to absorb transient micro-bursts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Buffer pressure
&lt;ul>
&lt;li>
&lt;p>Queue of a loaded port is kept small&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Mutual influence among ports is reduced in shared memory switches&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>TCP Evolution</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/10-tcp_evolution/</link><pubDate>Fri, 19 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/10-tcp_evolution/</guid><description>&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2017.21.19.png" alt="截屏2021-03-21 17.21.19">&lt;/p>
&lt;h2 id="tcp-extensions">TCP Extensions&lt;/h2>
&lt;h3 id="tcp-options-basics">TCP Options: Basics&lt;/h3>
&lt;h4 id="tcp-header">TCP Header&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2017.25.54.png" alt="截屏2021-03-21 17.25.54">&lt;/p>
&lt;h4 id="tcp-options">TCP Options&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Flexibility for new developments&lt;/p>
&lt;/li>
&lt;li>
&lt;p>TCP header field&lt;/p>
&lt;ul>
&lt;li>Each option is coded in &lt;strong>TLV format (Type-Length-Value)&lt;/strong>&lt;/li>
&lt;li>Has variable but &lt;strong>limited&lt;/strong> length
&lt;ul>
&lt;li>&lt;strong>number of options is limited (max. 40 bytes)&lt;/strong>&lt;/li>
&lt;li>TCP header length at most 60 bytes in total (incl. options)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>TLV format&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2017.31.16.png" alt="截屏2021-03-21 17.31.16">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Options occupy a multiple of 32-bit words (padding is added if necessary)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Type&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#option-selective-acknowledgements">Selective acknowledgements&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Time stamps&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#option-window-scaling">Window scaling&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Maximum segment size&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multipath TCP&lt;/p>
&lt;/li>
&lt;li>
&lt;p>TCP fast open&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Length: Length of option&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Value: Option data&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
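&lt;p>A minimal parser illustrates the TLV layout. The function name is hypothetical. The sketch assumes the usual conventions that the length byte counts the type and length bytes themselves, and that the two single-byte kinds 0 (end of option list) and 1 (no-operation, used for padding) carry no length field.&lt;/p>

```python
def parse_tcp_options(raw):
    """Return a list of (kind, value) tuples from the raw option bytes."""
    options = []
    i = 0
    while len(raw) > i:
        kind = raw[i]
        if kind == 0:          # End of Option List: the rest is padding
            break
        if kind == 1:          # No-Operation: single byte, no length field
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]    # covers kind, length and value
        if 2 > length or i + length > len(raw):
            raise ValueError("invalid option length")
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return options


# MSS = 1460 (kind 2), NOP, window scale 7 (kind 3), SACK permitted (kind 4),
# padded with two end-of-list bytes to 12 bytes, i.e. three 32-bit words.
raw = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 4, 2, 0, 0])
options = parse_tcp_options(raw)
mss = int.from_bytes(options[0][1], "big")
```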
&lt;h3 id="option-selective-acknowledgements">Option Selective Acknowledgements&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>TCP uses &lt;strong>cumulative acknowledgements&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>👍 Pro: Very robust against loss of ACK segments&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👎 Cons: Inefficient loss recovery&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Sender can only learn about a single lost segment per RTT&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Consequently&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Fast retransmit/fast recovery can only recover one lost segment per RTT&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Multiple losses often lead to retransmission timeouts and head-of-line blocking&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Improvement: &lt;strong>selective acknowledgements&lt;/strong> (SACK)&lt;/p>
&lt;ul>
&lt;li>Also acknowledge “out-of-order” data&lt;/li>
&lt;li>Implemented as TCP option&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 Idea: &lt;strong>Separately acknowledge continuous blocks of out-of-order data&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Usage of SACK option negotiated during connection establishment&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2017.39.41.png" alt="截屏2021-03-21 17.39.41">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>SACK option format&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2017.40.45.png" alt="截屏2021-03-21 17.40.45">&lt;/p>
&lt;ul>
&lt;li>Typically, only 2-4 blocks can be “SACKed” in one segment&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Case&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2018.16.04.png" alt="截屏2021-03-21 18.16.04">&lt;/p>
&lt;p>Handling:&lt;/p>
&lt;ul>
&lt;li>Use first entry of SACK option to report &lt;strong>new&lt;/strong> information&lt;/li>
&lt;li>Use subsequent entries of the SACK option for redundancy
&lt;ul>
&lt;li>
&lt;p>Helps if prior ACKs were lost&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Should repeat the most recently reported first blocks&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Different alternatives&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-21%2018.17.23.png" alt="截屏2021-03-21 18.17.23">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2018.17.42.png" alt="截屏2021-03-21 18.17.42" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
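&lt;p>The handling rule (newest information first, older blocks repeated for redundancy) can be sketched with a receiver-side block list. The function name and the half-open byte ranges &lt;code>(left, right)&lt;/code> are assumptions for illustration; trimming the list to the number of blocks that fit into the option is left to the caller.&lt;/p>

```python
def update_sack_blocks(blocks, start, end):
    """Insert a newly received out-of-order range and return the new block list.

    blocks: list of (left, right) ranges, most recently changed block first.
    The block that contains the new data is always moved to the front.
    """
    merged_left, merged_right = start, end
    untouched = []
    for left, right in blocks:
        if left > merged_right or merged_left > right:
            untouched.append((left, right))          # disjoint: keep as is
        else:
            merged_left = min(merged_left, left)     # overlapping or adjacent
            merged_right = max(merged_right, right)
    return [(merged_left, merged_right)] + untouched


blocks = []
blocks = update_sack_blocks(blocks, 1500, 2000)
blocks = update_sack_blocks(blocks, 3000, 3500)
blocks = update_sack_blocks(blocks, 2000, 2500)   # extends the oldest block
first_three = blocks[:3]                          # what fits into one option
```

&lt;p>After the third segment, the extended block from 1500 to 2500 moves to the front because it carries the new information, and the block from 3000 to 3500 is repeated behind it.&lt;/p>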
&lt;h3 id="option-window-scaling">Option Window Scaling&lt;/h3>
&lt;ul>
&lt;li>Header field receive window remains unchanged (16 bit)&lt;/li>
&lt;li>&lt;strong>Scaling factor can be changed&lt;/strong>
&lt;ul>
&lt;li>E.g., measure window size in 32 bit words instead of bytes&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Option is negotiated during connection establishment
&lt;ul>
&lt;li>Within SYN and SYN/ACK segments&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Scaling factor remains unchanged during lifetime of a TCP connection&lt;/li>
&lt;/ul>
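&lt;p>Numerically, the scaling factor is a power of two that is applied to the unchanged 16-bit header field. A short sketch, assuming the usual maximum shift of 14 and a hypothetical function name:&lt;/p>

```python
MAX_SHIFT = 14   # assumed upper limit for the negotiated shift count


def effective_window(window_field, shift):
    """Receive window in bytes for a 16-bit header field and a shift count."""
    if shift > MAX_SHIFT or 0 > shift:
        raise ValueError("shift count out of range")
    if window_field > 0xFFFF or 0 > window_field:
        raise ValueError("window field must fit into 16 bits")
    return window_field * 2 ** shift


unscaled = effective_window(0xFFFF, 0)    # classic 64 KB limit
in_words = effective_window(0xFFFF, 2)    # window counted in 32-bit words
largest = effective_window(0xFFFF, MAX_SHIFT)
```

&lt;p>A shift of 2 corresponds to the example of measuring the window in 32-bit words; the largest shift raises the limit to roughly 1 GB.&lt;/p>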
&lt;h3 id="extension-syn-cookies">Extension SYN Cookies&lt;/h3>
&lt;h2 id="multipath-tcp-mptcp">Multipath TCP (MPTCP)&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Motivation&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2021.37.30.png" alt="截屏2021-03-21 21.37.30" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>🎯 Goal: Extension of TCP for parallel usage of multiple paths &lt;strong>within a single TCP connection&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Improves reliability&lt;/li>
&lt;li>Increases performance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Important requirements&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Application compatibility&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Network compatibility&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Challenges&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Middleboxes&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="connection-vs-subflow">Connection vs. Subflow&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>MPTCP connection&lt;/strong>
&lt;ul>
&lt;li>Communication relation between sender and receiver&lt;/li>
&lt;li>Consists of one or multiple &lt;strong>MPTCP subflows&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>MPTCP subflow&lt;/strong>
&lt;ul>
&lt;li>Flow of TCP segments operating over an individual path&lt;/li>
&lt;li>Started and terminated like a “regular” TCP connection
&lt;ul>
&lt;li>
&lt;p>Started with 3-way handshake&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Closed with FIN or RST&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Can be dynamically added and removed to/from an MPTCP connection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="embedding-into-protocol-stack">Embedding into Protocol Stack&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2021.43.10.png" alt="截屏2021-03-21 21.43.10" style="zoom:67%;" />
&lt;h3 id="connection-establishment">Connection Establishment&lt;/h3>
&lt;p>3-way handshake of TCP&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2021.44.49.png" alt="截屏2021-03-21 21.44.49" style="zoom: 67%;" />
&lt;p>TCP option &lt;code>MP_CAPABLE&lt;/code>&lt;/p>
&lt;ul>
&lt;li>&lt;code>X&lt;/code>, &lt;code>Y&lt;/code>: token for client and server
&lt;ul>
&lt;li>Identification for subsequent addition/removal of subflows&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="adding-a-subflow">Adding a Subflow&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2021.47.59.png" alt="截屏2021-03-21 21.47.59" style="zoom:67%;" />
&lt;p>TCP option &lt;code>MP_JOIN&lt;/code>&lt;/p>
&lt;ul>
&lt;li>3-way handshake of TCP&lt;/li>
&lt;li>Use tokens exchanged during MPTCP connection establishment&lt;/li>
&lt;/ul>
&lt;h3 id="sequence-numbers">Sequence Numbers&lt;/h3>
&lt;p>Each MPTCP segment carries &lt;strong>two&lt;/strong> sequence numbers&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2021.57.57.png" alt="截屏2021-03-21 21.57.57" style="zoom:67%;" />
&lt;ul>
&lt;li>&lt;strong>Data sequence number&lt;/strong> for overall MPTCP connection&lt;/li>
&lt;li>&lt;strong>Subflow sequence number&lt;/strong> for individual flow
&lt;ul>
&lt;li>Each subflow has coherent sequence numbers without “holes”&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
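&lt;p>The interplay of the two sequence number spaces can be illustrated with a toy scheduler. The round-robin distribution, the tuple layout &lt;code>(subflow, subflow_seq, data_seq, payload)&lt;/code> and the function names are assumptions; a real MPTCP scheduler decides differently and signals the mapping in TCP options.&lt;/p>

```python
def split_over_subflows(data, chunk_size, subflows):
    """Distribute a byte stream round-robin over the given subflows."""
    next_subflow_seq = {name: 0 for name in subflows}
    segments = []
    for index, data_seq in enumerate(range(0, len(data), chunk_size)):
        name = subflows[index % len(subflows)]
        payload = data[data_seq:data_seq + chunk_size]
        # Subflow sequence numbers are contiguous per subflow,
        # data sequence numbers are contiguous for the whole connection.
        segments.append((name, next_subflow_seq[name], data_seq, payload))
        next_subflow_seq[name] += len(payload)
    return segments


def reassemble(segments):
    """Restore the byte stream from segments arriving in any order."""
    ordered = sorted(segments, key=lambda seg: seg[2])
    return b"".join(seg[3] for seg in ordered)


segments = split_over_subflows(b"abcdefghij", 3, ["A", "B"])
restored = reassemble(reversed(segments))
```

&lt;p>Each subflow sees a gap-free sequence space of its own (0, 3, ... on both A and B), which keeps it looking like a regular TCP connection to middleboxes, while the data sequence number lets the receiver restore the overall order.&lt;/p>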
&lt;h3 id="congestion-control">Congestion Control&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goals of MPTCP&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Improve throughput&lt;/strong>&lt;/p>
&lt;p>Multipath flow should perform at least as well as a single path congestion control would on the best available path&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Do not harm&lt;/strong>&lt;/p>
&lt;p>Multipath flow should not take up more capacity from any of the resources shared than if it were a single flow&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Balance congestion&lt;/strong>&lt;/p>
&lt;p>A multipath flow should move as much traffic as possible off its most congested paths&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Congestion Control algorithm only applies to increase phase of congestion avoidance&lt;/p>
&lt;ul>
&lt;li>Unchanged: slow start, fast retransmit, fast recovery and multiplicative decrease&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Different congestion windows&lt;/p>
&lt;ul>
&lt;li>$CWnd_i$ per subflow $i$&lt;/li>
&lt;li>$CWnd_{total}$ per MPTCP connection (multipath flow)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Assumption: Congestion window maintained in &lt;strong>bytes&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Basic approach: &lt;strong>Couple&lt;/strong> congestion control of different subflows&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Linked increase&lt;/strong> (congestion avoidance)&lt;/p>
&lt;p>For each ACK received on subflow $i$, increase $CWnd_i$ by
&lt;/p>
$$
\min \left( \underbrace{\frac{\alpha * \text{bytes}_{\text{acked}} * MSS_{i}}{CWnd_{\text{total}}}}_{\text{Increase for multipath subflow}}, \underbrace{\frac{\text{bytes}_{\text{acked}} * MSS_{i}}{CWnd_{i}}}_{\text{Increase “regular” TCP would get in same scenario}}\right)
$$
&lt;p>
(The minimum ensures that no multipath subflow is more aggressive than a regular TCP flow in the same circumstances: "do not harm")&lt;/p>
&lt;ul>
&lt;li>$\alpha$: Describes &lt;strong>aggressiveness&lt;/strong> of multipath flow
$$
\alpha=CWnd_{\text{total}} \cdot \frac{\max_{i}\left(\frac{CWnd_{i}}{RTT_{i}^{2}}\right)}{\left(\sum_{i} \frac{CWnd_{i}}{RTT_{i}}\right)^{2}}
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
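&lt;p>The linked increase can be evaluated directly from the two formulas above. The function names and the example values (two subflows with identical windows of ten 1460-byte segments and identical RTTs) are assumptions chosen to keep the arithmetic easy to follow.&lt;/p>

```python
def lia_alpha(cwnds, rtts):
    """Aggressiveness factor alpha of the multipath flow."""
    total = sum(cwnds)
    best = max(c / r ** 2 for c, r in zip(cwnds, rtts))
    denominator = sum(c / r for c, r in zip(cwnds, rtts)) ** 2
    return total * best / denominator


def lia_increase(i, cwnds, rtts, bytes_acked, mss):
    """Window increase in bytes for one ACK on subflow i."""
    coupled = lia_alpha(cwnds, rtts) * bytes_acked * mss / sum(cwnds)
    uncoupled = bytes_acked * mss / cwnds[i]   # what regular TCP would add
    return min(coupled, uncoupled)


cwnds = [14600, 14600]   # bytes
rtts = [0.01, 0.01]      # seconds
alpha = lia_alpha(cwnds, rtts)
increase = lia_increase(0, cwnds, rtts, bytes_acked=1460, mss=1460)
```

&lt;p>For two identical subflows $\alpha$ evaluates to 0.5 and each ACK grows the subflow window by 36.5 bytes, compared with 146 bytes for an uncoupled TCP flow with the same window. The coupled term therefore wins the minimum and the multipath flow stays less aggressive than two independent TCP flows.&lt;/p>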
&lt;h2 id="tcp-in-networks-with-high-bdp">TCP in Networks with High BDP&lt;/h2>
&lt;h3 id="scalability-issues">Scalability Issues&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>It can take very long until the available data rate is fully utilized&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cause&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Very conservative behavior of congestion avoidance&lt;/p>
&lt;ul>
&lt;li>Congestion window grows by one MSS per RTT&lt;/li>
&lt;li>Slow window growth in congestion avoidance causes low average data rate&lt;/li>
&lt;/ul>
&lt;p>➡️ NOT efficient in networks with high bandwidth-delay products&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Require &lt;strong>faster increase&lt;/strong> of the congestion window in congestion avoidance&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="faster-increase-of-congestion-window">Faster Increase of Congestion Window&lt;/h3>
&lt;ul>
&lt;li>🎯 Goals
&lt;ul>
&lt;li>
&lt;p>High resource utilization in networks with high bandwidth delay product&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Quick reactions to changes of the situation within the network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Fairness with respect to other TCP variants&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Different types of fairness
&lt;ul>
&lt;li>&lt;strong>intra protocol fairness&lt;/strong>
&lt;ul>
&lt;li>All senders use &lt;strong>same&lt;/strong> TCP variant&lt;/li>
&lt;li>Goal: All flows should achieve &lt;strong>same&lt;/strong> data rate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>With new TCP variants: &lt;strong>inter protocol fairness&lt;/strong>&lt;/li>
&lt;li>Furthermore: &lt;strong>RTT fairness&lt;/strong>
&lt;ul>
&lt;li>Fairness among TCP flows with different RTTs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="cubic-tcp">CUBIC TCP&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>🎯 Goals&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Provide simple algorithm for networks with high bandwidth-delay product&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>TCP-friendly&lt;/strong>&lt;/p>
&lt;p>Behaves like standard TCP (i.e., TCP Reno) in networks with short RTTs and small bandwidth&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Congestion avoidance&lt;/strong>&lt;/p>
&lt;p>Applies &lt;strong>cubic&lt;/strong> function instead of linear window increase&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Performance should not be worse than TCP Reno&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In comparison to TCP Reno&lt;/p>
&lt;ul>
&lt;li>Better RTT fairness (Window growth independent of RTT)&lt;/li>
&lt;li>Better scalability to high data rates&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Currently default congestion control in all major operating systems&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Congestion Window Increase&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Independent&lt;/strong> from RTT&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Uses the real time $t$ that has elapsed since the last congestion event, i.e., window growth depends on the time between consecutive congestion events and not on the number of RTTs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apply &lt;strong>cubic&lt;/strong> function
&lt;/p>
$$
W(t)=C(t-K)^{3}+W_{\max } \quad \text { with } \mathrm{K}=\sqrt[3]{\frac{W_{\max }(1-\beta)}{C}}
$$
&lt;ul>
&lt;li>$C$: predefined constant that determines aggressiveness of increase&lt;/li>
&lt;li>$W_{\max}$: congestion window size at latest congestion incident&lt;/li>
&lt;li>$K$: time period that it takes to increase current window to $W_{\max}$ (if no further congestion occurs)&lt;/li>
&lt;li>$\beta$: multiplicative decrease factor of the congestion window
&lt;ul>
&lt;li>$\beta = 0.5$ for TCP Reno&lt;/li>
&lt;li>$\beta = 0.7$ for CUBIC TCP&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2023.21.53.png" alt="截屏2021-03-21 23.21.53" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
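&lt;p>Evaluating the cubic function at a few points shows how the parameters fit together. The value $C = 0.4$ and the window of 100 segments are assumed example values, and the function names are hypothetical.&lt;/p>

```python
def cubic_k(w_max, c=0.4, beta=0.7):
    """Time (seconds) needed to grow back to w_max after a reduction."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t, w_max, c=0.4, beta=0.7):
    """Window (segments) t seconds after the last congestion event."""
    return c * (t - cubic_k(w_max, c, beta)) ** 3 + w_max


k = cubic_k(100)
start = cubic_window(0.0, 100)      # right after the congestion event
plateau = cubic_window(k, 100)      # back at the old maximum
```

&lt;p>With these values $K$ is about 4.2 seconds. At $t = 0$ the function yields $\beta \cdot W_{\max} = 70$ segments, which is exactly the window after the multiplicative decrease, and at $t = K$ it returns to $W_{\max} = 100$ segments.&lt;/p>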
&lt;p>&lt;strong>Congestion Window over Time&lt;/strong>&lt;/p>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-21%2023.23.36.png" alt="截屏2021-03-21 23.23.36" style="zoom:80%;" />
&lt;p>&lt;strong>Three CUBIC Modes&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>TCP-friendly region&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Ensures that CUBIC achieves at least same data rate as standard TCP in networks with small RTT&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Observation: in networks with small RTTs, CUBIC's congestion window grows &lt;em>slower&lt;/em> than with TCP Reno&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Approach: “emulation” of TCP Reno (which uses AIMD)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$AIMD(\alpha, \beta)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\alpha$: additive increase factor
&lt;/p>
$$
W = W + \alpha
$$
&lt;/li>
&lt;li>
&lt;p>$\beta$: multiplicative decrease factor
&lt;/p>
$$
W = \beta \cdot W
$$
&lt;/li>
&lt;/ul>
&lt;p>TCP Reno uses $AIMD(1, \frac{1}{2})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>TCP-fair increment
&lt;/p>
$$
\alpha=3 \cdot \frac{1-\beta}{1+\beta}
$$
&lt;ul>
&lt;li>
&lt;p>Achieves same $W_{avg}$ as $AIMD(1, \frac{1}{2})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Average data rate of AIMD
&lt;/p>
$$
W_{avg} = \frac{1}{RTT} \sqrt{\frac{\alpha \cdot(1+\beta)}{2 \cdot(1-\beta) \cdot p}}
$$
&lt;ul>
&lt;li>$p$: loss rate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Window size of emulated TCP at time $t$
&lt;/p>
$$
W_{TCP}=W_{\max} \cdot \beta+\frac{3 \cdot(1-\beta)}{1+\beta} \cdot \frac{t}{RTT}
$$
&lt;/li>
&lt;li>
&lt;p>Recall window size of TCP cubic
&lt;/p>
$$
W(t)=C(t-K)^{3}+W_{\max }
$$
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ Rule&lt;/p>
&lt;ul>
&lt;li>If $W_{Cubic} &lt; W_{TCP}$, $CWnd$ is set to $W_{TCP}$ each time an ACK is received&lt;/li>
&lt;li>Otherwise, $CWnd$ is set to $W_{Cubic}$ each time an ACK is received&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Concave region&lt;/strong>: $CWnd &lt; W_{\max}$ and not in TCP-friendly region&lt;/p>
&lt;ul>
&lt;li>For each received ACK
$$
CWnd = CWnd+\frac{W_{Cubic}(t+RTT)-CWnd}{CWnd}
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Convex region&lt;/strong>: $CWnd > W_{\max}$ and not in TCP-friendly region&lt;/p>
&lt;ul>
&lt;li>$CWnd$ is increased very carefully&lt;/li>
&lt;li>Searching for a new $W_{\max}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
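&lt;p>The choice between the three modes can be sketched by comparing the emulated Reno window with the cubic window. This is a simplification: the region is derived from the target windows rather than from the current $CWnd$, the per-ACK update of the concave region is not modelled, and $C = 0.4$, $\beta = 0.7$ and all function names are assumptions.&lt;/p>

```python
def cubic_window(t, w_max, c=0.4, beta=0.7):
    k = (w_max * (1 - beta) / c) ** (1 / 3)
    return c * (t - k) ** 3 + w_max


def emulated_reno_window(t, w_max, rtt, beta=0.7):
    # Window an AIMD flow with the TCP-fair increment would have reached.
    return w_max * beta + 3 * (1 - beta) / (1 + beta) * t / rtt


def target_window(t, w_max, rtt):
    """Return (window in segments, region name) t seconds after a loss."""
    w_cubic = cubic_window(t, w_max)
    w_tcp = emulated_reno_window(t, w_max, rtt)
    if w_tcp > w_cubic:
        return w_tcp, "tcp-friendly"
    if w_max > w_cubic:
        return w_cubic, "concave"
    return w_cubic, "convex"


short_rtt = target_window(1.0, 100, rtt=0.01)
long_rtt_early = target_window(1.0, 100, rtt=0.1)
long_rtt_late = target_window(8.0, 100, rtt=0.1)
```

&lt;p>With a 10 ms RTT the emulated Reno window overtakes the cubic curve after one second, so the TCP-friendly region applies. With a 100 ms RTT the cubic curve dominates: it is still below $W_{\max}$ after one second (concave) and above it after eight seconds (convex).&lt;/p>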
&lt;h2 id="tcp-and-response-time">TCP and Response Time&lt;/h2>
&lt;h3 id="basic-issue">Basic Issue&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Response time&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Time between initiation of a TCP connection and receipt of the requested data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Important components&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.27.37.png" alt="截屏2021-03-22 17.27.37" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Handshake of TCP connection establishment&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Slow start&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transmission of the object&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Macroscopic Model&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Response time without applying congestion control&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2021-03-22%2017.28.54.png" alt="截屏2021-03-22 17.28.54">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>After 1st RTT: Client sends object request&lt;/p>
&lt;/li>
&lt;li>
&lt;p>After 2nd RTT&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Client begins to receive object data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Receiver needs
&lt;/p>
$$
t = \frac{\text{object size } O}{\text{data rate } D}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ lower bound:
&lt;/p>
$$
\text{Response time} \geq 2 RTT + \frac{O}{D}
$$
&lt;p>
(With small objects, the response time is dominated by the $RTT$s)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Used Variables&lt;/p>
&lt;ul>
&lt;li>$RTT$: round trip time [Seconds]&lt;/li>
&lt;li>$MSS$: maximum segment size [bit]&lt;/li>
&lt;li>$W$: Size of congestion window [MSS], given as multiples of MSS&lt;/li>
&lt;li>$O$: Size of object that has to be transferred [bit]&lt;/li>
&lt;li>$D$: Data rate [bit/s]&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Observation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$RTT$s have significant influence on response time&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On connection establishment: 2 $RTT$s until reception of object begins&lt;/p>
&lt;/li>
&lt;li>
&lt;p>During object transmission&lt;/p>
&lt;ul>
&lt;li>Small windows create pauses: waiting for ACKs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>The majority of TCP connections on the Web are short-lived&lt;/p>
&lt;p>$\rightarrow$ Slow start has significant impact on response time&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🎯 Goals&lt;/p>
&lt;ul>
&lt;li>Avoid “empty” RTTs without data transport&lt;/li>
&lt;li>Reduce RTTs needed for slow start&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
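&lt;p>As a rough illustration of the lower bound above, the following sketch evaluates $2 \cdot RTT + O/D$ for a small and a large object. The function name and all example numbers are assumptions made for this illustration, not values from the lecture.&lt;/p>

```python
def response_time_lower_bound(rtt_s: float, object_bits: float, rate_bit_per_s: float) -> float:
    """Lower bound 2*RTT + O/D of the macroscopic model (no congestion control)."""
    return 2 * rtt_s + object_bits / rate_bit_per_s


# Hypothetical example values: 50 ms RTT and a data rate of 10 Mbit/s.
rtt = 0.05
rate = 10_000_000

small = response_time_lower_bound(rtt, 8_000, rate)        # 1 kB object
large = response_time_lower_bound(rtt, 80_000_000, rate)   # 10 MB object

print(f"small object: {small:.4f} s")
print(f"large object: {large:.4f} s")
```

&lt;p>For the small object, the two RTTs account for almost the entire response time, whereas for the large object the term $O/D$ dominates.&lt;/p>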
&lt;h3 id="bigger-initial-congestion-window">Bigger Initial Congestion Window&lt;/h3>
&lt;p>&lt;strong>💡 Idea: Increase initial congestion window (IW)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>At least 10 segments, i.e., about 15 KB (assuming a typical MSS of 1460 bytes)&lt;/li>
&lt;/ul>
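&lt;p>The effect of a larger initial window can be estimated with a small sketch. It assumes an idealized slow start in which the congestion window doubles every RTT, no segment is lost, and the receiver window is never the limit. The function name and the object size of 100 segments are hypothetical.&lt;/p>

```python
def slow_start_rounds(num_segments: int, initial_window: int) -> int:
    """Count the RTT rounds needed to send num_segments in idealized slow start."""
    rounds = 0
    window = initial_window  # congestion window in segments
    sent = 0
    while num_segments > sent:
        sent += window       # send one full window in this round
        window *= 2          # window doubles after each round (assumption)
        rounds += 1
    return rounds


# Hypothetical object of 100 segments.
print(slow_start_rounds(100, 1))   # initial window of 1 segment
print(slow_start_rounds(100, 10))  # initial window of 10 segments
```

&lt;p>Under these assumptions the object needs 7 rounds with an initial window of 1 segment, but only 4 rounds with an initial window of 10 segments.&lt;/p>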
&lt;h3 id="tcp-fast-open">TCP Fast Open&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: Reduce delays that precede the transmission of an object&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>TCP Cookie&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Goal&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Avoid DoS attacks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Disallow sending data in the SYN segment of the very first connection establishment to a server&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Establish cookie for subsequent connections&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Use cookie $\rightarrow$ avoid state keeping at server&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Basic steps&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Client requests TFO cookie from server&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.40.26.png" alt="截屏2021-03-22 17.40.26" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Client uses TFO cookies in subsequent TCP connections&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-22%2017.40.45.png" alt="截屏2021-03-22 17.40.45" style="zoom:67%;" />
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="http2">HTTP/2&lt;/h3>
&lt;h3 id="quic">QUIC&lt;/h3></description></item><item><title>Access Networks</title><link>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/11-access_networks/</link><pubDate>Wed, 24 Mar 2021 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/telematics/lecture-notes/11-access_networks/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;h3 id="circuit-switching">Circuit Switching&lt;/h3>
&lt;p>“Circuit”&lt;/p>
&lt;ul>
&lt;li>Logical circuit with reserved resources for data transmission
&lt;ul>
&lt;li>no physical cable!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>No metadata (header, trailer) required during data exchange&lt;/li>
&lt;li>No buffer overflows in intermediate systems!&lt;/li>
&lt;li>But: possibly bad resource utilization&lt;/li>
&lt;li>Use case: telephone network&lt;/li>
&lt;/ul>
&lt;h2 id="isdn">ISDN&lt;/h2>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/ISDN.png"
alt="ISDN summary">&lt;figcaption>
&lt;p>ISDN summary&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;p>&lt;strong>ISDN&lt;/strong> = &lt;strong>I&lt;/strong>ntegrated &lt;strong>S&lt;/strong>ervices &lt;strong>D&lt;/strong>igital &lt;strong>N&lt;/strong>etwork&lt;/p>
&lt;ul>
&lt;li>🎯 Goals
&lt;ul>
&lt;li>&lt;strong>Digital&lt;/strong> up to the subscriber&lt;/li>
&lt;li>&lt;strong>Integration&lt;/strong> of different services (e.g., voice, data, images)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Offering additional services
&lt;ul>
&lt;li>Redialing&lt;/li>
&lt;li>Direct call&lt;/li>
&lt;li>Automatic call-back if receiver access is busy&lt;/li>
&lt;li>Re-direction of calls&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="architecture">Architecture&lt;/h3>
&lt;p>Clear Separation of Access and Network&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.30.26.png" alt="截屏2021-03-24 22.30.26" style="zoom:67%;" />
&lt;p>Example Topology&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.32.34.png" alt="截屏2021-03-24 22.32.34" style="zoom:67%;" />
&lt;p>Simplified Architecture at Subscriber Interface&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.33.06.png" alt="截屏2021-03-24 22.33.06" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Network Termination (NT)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Termination of technical transmission&lt;/p>
&lt;ul>
&lt;li>Of network ($U\_{k0}$ interface)&lt;/li>
&lt;li>Of subscriber installation ($S\_0$ interface)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Power supply for subscriber installation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Detect frame errors&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Local telephone switch&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Media access to signaling channel (D channel, layer 2)&lt;/li>
&lt;li>Signaling at layer 3&lt;/li>
&lt;li>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Adaptor&lt;/strong>: Provide ISDN functionality for non-ISDN capable device&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="isdn-subscriber-interface">ISDN Subscriber Interface&lt;/h4>
&lt;p>Basic access&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.37.29.png" alt="截屏2021-03-24 22.37.29" style="zoom:67%;" />
&lt;ul>
&lt;li>2 ∗ 64kbit/s+16kbit/s ($2 ∗ 𝐵 + 𝐷\_{16}$)&lt;/li>
&lt;li>Two types of logical channels
&lt;ul>
&lt;li>B channel: data transfer&lt;/li>
&lt;li>D channel: signaling traffic&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>B channel&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>User data transmission&lt;/li>
&lt;li>Data rate: 64 kbit/s&lt;/li>
&lt;li>Two B channels available
&lt;ul>
&lt;li>Operate independently of each other&lt;/li>
&lt;li>Can transmit in different directions&lt;/li>
&lt;li>Can transmit different data types (voice, images, &amp;hellip;)&lt;/li>
&lt;li>Do not have to (but can) be active at the same time&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Medium access
&lt;ul>
&lt;li>Fixed&lt;/li>
&lt;li>Each time slot is assigned to one of the two B channels&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>D channel&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Signaling&lt;/strong> (establish B channel between end systems)&lt;/li>
&lt;li>Data rate: 16 kbit/s&lt;/li>
&lt;li>Bidirectional communication: end system &amp;lt;&amp;ndash;&amp;gt; network termination&lt;/li>
&lt;li>Medium access&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>E(cho) channel&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Data rate: 16 kbit/s&lt;/li>
&lt;li>Unidirectional communication: network termination &amp;ndash;&amp;gt; end system&lt;/li>
&lt;li>Required for medium access
&lt;ul>
&lt;li>Carrier sensing (CS)&lt;/li>
&lt;li>Collision detection (CD)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Channels and Layering&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.42.42.png" alt="截屏2021-03-24 22.42.42" style="zoom: 67%;" />
&lt;ul>
&lt;li>Subscriber installation
&lt;ul>
&lt;li>B channels
&lt;ul>
&lt;li>Layer 1 standardized&lt;/li>
&lt;li>Layers 2-7 usage dependent&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>D channel: Layers 1-3 standardized&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="subscriber-interface">Subscriber Interface&lt;/h3>
&lt;h4 id="subscriber-interface-s_0">Subscriber Interface $S\_0$&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Four-wire transmission&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.54.14.png" alt="截屏2021-03-24 22.54.14" style="zoom:67%;" />
&lt;ul>
&lt;li>&lt;strong>One twin conductor per direction&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Simplex&lt;/strong> operation, both directions separated&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Multiplexing at $S\_0$ interface&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Space division multiplex&lt;/strong>: Separation of directions&lt;/li>
&lt;li>&lt;strong>Time division multiplex&lt;/strong>: Frame structure ($S\_0$ frames)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="bus-topology-at-s_0-interface">Bus Topology at $S\_0$ Interface&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.55.45.png" alt="截屏2021-03-24 22.55.45" style="zoom:80%;" />
&lt;ul>
&lt;li>Each end system has two connections to the bus
&lt;ul>
&lt;li>In direction to network termination: &lt;strong>write&lt;/strong> access&lt;/li>
&lt;li>In direction to end system: &lt;strong>read&lt;/strong> access&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="s_0-frames">$S\_0$ Frames&lt;/h4>
&lt;p>Time division multiplex in both directions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>End system &amp;ndash;&amp;gt; network termination&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.58.20.png" alt="截屏2021-03-24 22.58.20" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>End system &amp;lt;&amp;ndash; network termination&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2022.57.58.png" alt="截屏2021-03-24 22.57.58" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>NT mirrors D channel into echo channel of incoming $S\_0$ frames&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2023.05.29.png" alt="截屏2021-03-24 23.05.29" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="channel-encoding">Channel Encoding&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Inverse AMI code (0 &amp;ldquo;overwrites&amp;rdquo; 1)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2023.06.40.png" alt="截屏2021-03-24 23.06.40" style="zoom:67%;" />
&lt;ul>
&lt;li>0: alternating positive and negative level, held over the whole clock interval&lt;/li>
&lt;li>1: represented by 0 level&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
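&lt;p>The encoding rule above can be sketched in a few lines. The function name is hypothetical, and the polarity of the first “0” pulse is an assumption, since the notes only state that the polarity alternates.&lt;/p>

```python
def inverse_ami_encode(bits, first_polarity=1):
    """Map bits to line levels: 1 -> 0 level, 0 -> alternating +1 / -1 level."""
    levels = []
    polarity = first_polarity  # polarity of the next 0 pulse (assumed start: +1)
    for bit in bits:
        if bit == 0:
            levels.append(polarity)
            polarity = -polarity   # the next 0 uses the opposite polarity
        else:
            levels.append(0)       # a 1 is sent as zero level
    return levels


print(inverse_ami_encode([0, 1, 0, 0, 1]))  # [1, 0, -1, 1, 0]
```

&lt;p>Because a “1” puts no level on the line, an idle line reads as a sequence of ones, and any station sending a “0” overwrites it.&lt;/p>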
&lt;h3 id="d-channel-medium-access">D Channel: Medium Access&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Systems access the D channel independently of each other&lt;/p>
&lt;ul>
&lt;li>E.g., to establish a connection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>CSMA/CD based approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Check medium&lt;/strong> (echo channel as mirror of D channel)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The medium is free when no activity is visible for a duration of 8 bits&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The layer-2 protocol on the D channel is a variant of HDLC&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Format of an HDLC frame&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2023.12.12.png" alt="截屏2021-03-24 23.12.12" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Delimited by flag (&lt;code>01111110&lt;/code>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Bit stuffing&lt;/strong> to preserve data transparency for higher layers&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2023.15.50.png" alt="截屏2021-03-24 23.15.50" style="zoom:80%;" />
&lt;ul>
&lt;li>After 5 consecutive binary “1”s, the sender inserts a binary “0”
&lt;ul>
&lt;li>This happens only between the flags&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>After 5 consecutive binary “1”s, the receiver removes the following binary “0”&lt;/li>
&lt;li>Bit stuffing is done when sending the bit stream
&lt;ul>
&lt;li>Calculate the checksum before bit stuffing&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Inverse bit stuffing is done when receiving the bit stream
&lt;ul>
&lt;li>Verify the checksum after inverse bit stuffing&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>8 bit times without activity on the D channel correspond to 8 consecutive ones (inverse AMI code); because of bit stuffing, this pattern cannot occur inside a frame&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Send&lt;/strong>: 1-persistent&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Collision detection&lt;/strong> through sending system&lt;/p>
&lt;ul>
&lt;li>Systems listen on E channel while sending&lt;/li>
&lt;li>Does the signal received on the E channel differ from the one sent on the D channel?
&lt;ul>
&lt;li>0 overwrites 1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Detecting system aborts sending and continues to check medium
&lt;ul>
&lt;li>No further bit is sent on the D channel&lt;/li>
&lt;li>No exponential backoff&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The other system does not notice anything and continues sending successfully&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
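&lt;p>The following minimal sketch illustrates two mechanisms from the list above: HDLC bit stuffing and collision resolution via the echo channel. It is a simplification under stated assumptions: all stations start sending in the same bit slot, all frames have the same length, and the echo is modeled as the minimum of all sent bits (0 overwrites 1). The function names are hypothetical.&lt;/p>

```python
FLAG = [0, 1, 1, 1, 1, 1, 1, 0]  # HDLC frame delimiter 01111110


def bit_stuff(bits):
    """Sender side: insert a 0 after every run of five consecutive 1s."""
    out, ones = [], 0
    for bit in bits:
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            out.append(0)  # stuffed bit
            ones = 0
    return out


def bit_unstuff(bits):
    """Receiver side: drop the bit that follows five consecutive 1s."""
    out, ones, drop_next = [], 0, False
    for bit in bits:
        if drop_next:  # this bit is the stuffed 0
            drop_next, ones = False, 0
            continue
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        drop_next = ones == 5
    return out


def arbitrate(frames):
    """Return the indices of the stations that are still sending at the end.

    In every bit slot the echo is the minimum of all sent bits (0 overwrites 1).
    A station whose own bit differs from the echo aborts and stays silent.
    """
    active = set(range(len(frames)))
    for position in range(len(frames[0])):
        echo = min(frames[i][position] for i in active)
        active = {i for i in active if frames[i][position] == echo}
    return active


payload = [1, 1, 1, 1, 1, 1, 0, 1]
stuffed = bit_stuff(payload)
frame = FLAG + stuffed + FLAG            # flags are added after stuffing
print(stuffed)                           # [1, 1, 1, 1, 1, 0, 1, 0, 1]
print(bit_unstuff(stuffed) == payload)   # True

# Three stations start at the same time; station 1 sends the first 0 where
# the others send a 1, so it wins and never notices the collision.
print(arbitrate([[0, 1, 1, 0], [0, 1, 0, 1], [0, 1, 1, 1]]))  # {1}
```

&lt;p>The sketch shows why no backoff is needed: exactly one frame survives each collision undamaged, and the losing stations simply return to checking the medium.&lt;/p>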
&lt;h2 id="dsl">DSL&lt;/h2>
&lt;p>&lt;strong>DSL&lt;/strong> = &lt;strong>D&lt;/strong>igital &lt;strong>S&lt;/strong>ubscriber &lt;strong>L&lt;/strong>ine&lt;/p>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal&lt;/p>
&lt;ul>
&lt;li>High-performance solution for the subscriber connection&lt;/li>
&lt;li>Support data services with higher data rates&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>“Invariant”: Twin conductor at the U interface = connection to the customer premises&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Categories&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>ADSL (Asymmetric DSL)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Follows the typical communication model of the WWW&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A lot of data is received from the server&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Much less data is sent to the server&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Downstream and upstream data rates are &lt;strong>asymmetric&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Downstream (From server to subscriber): 768 kbit/s – 8 Mbit/s&lt;/li>
&lt;li>Upstream (From subscriber to server): 128 kbit/s – 576 kbit/s&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Subscriber connection&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-24%2023.23.53.png" alt="截屏2021-03-24 23.23.53" style="zoom:80%;" />
&lt;ul>
&lt;li>&lt;strong>Splitter&lt;/strong>
&lt;ul>
&lt;li>Separates the signal into a telephone signal and a data signal&lt;/li>
&lt;li>Required at the subscriber as well as in the telephone switch&lt;/li>
&lt;li>Works passively: the telephone signal stays available even when the splitter fails&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Copper twin conductor&lt;/strong>
&lt;ul>
&lt;li>Between splitters at subscriber and telephone switch&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>DSLAM&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>DSL Access Multiplexer&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Counterpart to DSL modem at subscriber&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>SDSL (Symmetric DSL)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Mainly used by business customers&lt;/li>
&lt;li>Most often much more expensive than ADSL&lt;/li>
&lt;li>Only data, i.e., no parallel phone calls possible&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-transmission-at-dsl-access">Data Transmission at DSL Access&lt;/h3>
&lt;h4 id="frequency-multiplexing">Frequency Multiplexing&lt;/h4>
&lt;p>Different frequencies for&lt;/p>
&lt;ul>
&lt;li>Telephony&lt;/li>
&lt;li>DSL upstream&lt;/li>
&lt;li>DSL downstream&lt;/li>
&lt;/ul>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-28%2023.39.41.png" alt="截屏2021-03-28 23.39.41" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h4 id="sources-of-signal-disturbance">Sources of Signal Disturbance&lt;/h4>
&lt;p>&lt;strong>Damping&lt;/strong> (attenuation): primarily influenced by three parameters&lt;/p>
&lt;ul>
&lt;li>Distance, interference, cable diameter
&lt;ul>
&lt;li>
&lt;p>Damping decreases with increasing cable diameter&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;ndash;&amp;gt; A larger diameter permits higher data rates over the same distance&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Crosstalk&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Interference between sender and receiver&lt;/li>
&lt;li>Interference between senders &amp;ndash;&amp;gt; Only some twin conductors of a cable bundle can be used for ADSL&lt;/li>
&lt;/ul>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2021-03-25%2000.33.07.png" alt="截屏2021-03-25 00.33.07" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="adsl2-vdsl2">ADSL2, VDSL2&lt;/h3>
&lt;h3 id="dsl-access-network">DSL Access Network&lt;/h3>
&lt;h4 id="basic-configuration">Basic configuration&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-25%2000.34.14.png" alt="截屏2021-03-25 00.34.14" style="zoom:67%;" />
&lt;ul>
&lt;li>&lt;strong>BRAS: Broadband Remote Access Server&lt;/strong>
&lt;ul>
&lt;li>Part of the ISP's core network&lt;/li>
&lt;li>Tasks
&lt;ul>
&lt;li>
&lt;p>Routes traffic to/from broadband access devices (e.g., DSLAM)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Aggregates traffic of multiple DSLAMs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Can support policy management, quality-of-service&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provides layer-2-connectivity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Provides layer-3-connectivity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Interfaces to AAA (Authentication, Authorization, Accounting)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assigns IP addresses to clients&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="setting-up-an-adsl-connection">Setting up an ADSL Connection&lt;/h4>
&lt;p>The provider is also the network provider: use &lt;strong>PPP (Point-to-Point Protocol)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Establish&lt;/strong> phase &amp;ndash;&amp;gt; LCP (link control protocol)
&lt;ul>
&lt;li>
&lt;p>Set up the PPP connection&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Negotiate connection parameters&lt;/p>
&lt;ul>
&lt;li>Data rate, used carriers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Negotiate authentication method&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Negotiate the Data Rate&lt;/p>
&lt;ul>
&lt;li>Fixed rate
&lt;ul>
&lt;li>Data rate is set to fixed value&lt;/li>
&lt;li>Contains “safety margin”&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Adaptive rate
&lt;ul>
&lt;li>Negotiate the maximum reachable data rate&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Authentication&lt;/strong> phase
&lt;ul>
&lt;li>Authentication based on negotiated method&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Network&lt;/strong> phase
&lt;ul>
&lt;li>Assignment of IP address&lt;/li>
&lt;li>Announcing address of the DNS server&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Provider uses DSL resale link&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-25%2000.42.48.png" alt="截屏2021-03-25 00.42.48" style="zoom:67%;" />
&lt;ul>
&lt;li>Sequence
&lt;ul>
&lt;li>Abort previous sequence in the authentication phase
&lt;ul>
&lt;li>Only at this point is it known that the subscriber is a customer of a different provider&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Thereafter
&lt;ul>
&lt;li>Forwarding all data to other provider&lt;/li>
&lt;li>Restart complete sequence&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="further-access-technologies">Further Access Technologies&lt;/h2>
&lt;h3 id="cable-tv-network">Cable TV Network&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Initially only designed for &lt;strong>TV and broadcast transmission&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Today also usable for &lt;strong>telephony&lt;/strong> and &lt;strong>Internet&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Topology&lt;/p>
&lt;ul>
&lt;li>Initially pure tree topology with coaxial cables&lt;/li>
&lt;li>Today combination of glass fiber and coaxial cables&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Configuration at household&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-25%2000.46.01.png" alt="截屏2021-03-25 00.46.01" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>Architecture&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-25%2000.46.27.png" alt="截屏2021-03-25 00.46.27" style="zoom:80%;" />
&lt;ul>
&lt;li>CMTS: Cable Modem Termination System&lt;/li>
&lt;/ul>
&lt;p>From hub to households&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2021-03-25%2000.47.48.png" alt="截屏2021-03-25 00.47.48" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Data transfer&lt;/p>
&lt;ul>
&lt;li>Downstream
&lt;ul>
&lt;li>Broadcast: all subscribers receive same signal&lt;/li>
&lt;li>Cable modem filters out “own” packets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Upstream
&lt;ul>
&lt;li>Access to channels controlled by time multiplex (time slots)&lt;/li>
&lt;li>Time slots are assigned by CMTS in the head-end&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Shared medium: Reachable data rate depends on number of concurrent users&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="powerline">Powerline&lt;/h3></description></item><item><title>Wertdiskrete Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/</link><pubDate>Fri, 27 May 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/</guid><description/></item><item><title>Wert- und Zeitdiskrete Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/wert_und_zeitdiskret_system/</link><pubDate>Fri, 03 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/wert_und_zeitdiskret_system/</guid><description>&lt;h2 id="vorbemerkungen">Vorbemerkungen&lt;/h2>
&lt;h3 id="signale-in-kontinuierlicher-und-diskreter-zeit">Signals in Continuous and Discrete Time&lt;/h3>
&lt;p>&lt;strong>Continuous time&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Time is a continuous variable&lt;/li>
&lt;li>The signal $s(t)$ takes a particular value $s^*(t^*)$
for an arbitrarily short time span&lt;/li>
&lt;li>Between any two points in time $t_1$ and $t_2$ there are &lt;em>infinitely&lt;/em> many points in time $t_1 \leq t \leq t_2$&lt;/li>
&lt;li>Values can be continuous or discrete&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-06-03%2012.24.25.png" alt="截屏2022-06-03 12.24.25" style="zoom:60%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-06-03%2012.24.51.png" alt="截屏2022-06-03 12.24.51" style="zoom:60%;" />
&lt;ul>
&lt;li>Continuous in time and value $\rightarrow$ analog signal&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Discrete time&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Discrete points in time $t_k, k \in \mathbb{Z}$
&lt;/p>
$$
s_k := s(t_k)
$$
&lt;p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-deskrete_zeit.png" alt="wertdiskrete_systeme-deskrete_zeit" style="zoom:67%;" />&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The temporal spacing of the $t_k$ is arbitrary, but in many cases &lt;em>equidistant&lt;/em>
&lt;/p>
$$
t_k = k \cdot \Delta \quad k \in \mathbb{Z}
$$
&lt;/li>
&lt;li>
&lt;p>Values can be continuous or discrete&lt;/p>
&lt;ul>
&lt;li>Discrete in time and value $\rightarrow$ digital signal&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-06-03%2014.54.54.png" alt="截屏2022-06-03 14.54.54" style="zoom:50%" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-06-03%2014.55.06.png" alt="截屏2022-06-03 14.55.06" style="zoom:50%" />
&lt;/li>
&lt;li>
&lt;p>Signals can be inherently discrete-time, or they can result from sampling continuous-time signals.&lt;/p>
&lt;/li>
&lt;/ul>
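&lt;p>Equidistant sampling $s_k = s(k \cdot \Delta)$ and the step from a continuous-valued to a discrete-valued signal can be sketched as follows. The sine signal, the sampling interval and the quantization step are hypothetical choices for illustration only.&lt;/p>

```python
import math


def sample(signal, delta, indices):
    """Sample a continuous-time signal at the equidistant times t_k = k * delta."""
    return [signal(k * delta) for k in indices]


def quantize(value, step):
    """Map a continuous value to the nearest multiple of step (discrete value)."""
    return step * round(value / step)


# Hypothetical continuous-time, continuous-valued signal.
def s(t):
    return math.sin(2 * math.pi * t)


delta = 0.125
s_k = sample(s, delta, range(9))             # discrete time, continuous value
digital = [quantize(v, 0.5) for v in s_k]    # discrete time, discrete value

print(s_k)
print(digital)
```

&lt;p>Here &lt;code>s_k&lt;/code> is discrete in time only, while &lt;code>digital&lt;/code> is discrete in both time and value.&lt;/p>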
&lt;h3 id="kategoriale-und-kardinale-variablen">Categorical and Cardinal Variables&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-06-03%2015.03.01.png" alt="截屏2022-06-03 15.03.01" style="zoom: 50%;" />
&lt;p>&lt;strong>Categorical variables&lt;/strong>
&lt;details class="spoiler " id="spoiler-1">
&lt;summary class="cursor-pointer">Nominal&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;ul>
&lt;li>
&lt;p>The nominal scale is made up of pure labels.&lt;/p>
&lt;ul>
&lt;li>The only meaningful question to ask is whether two variables have the same value: the nominal scale only allows comparing two values w.r.t. equivalence.&lt;/li>
&lt;li>There is no meaningful transformation besides relabeling.&lt;/li>
&lt;li>No empirical operation is permissible, i.e., there is no mathematical operation of nominal features that is also meaningful in the material world.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>A typical example is the sex of a human.&lt;/p>
&lt;ul>
&lt;li>The two possible values can be written either as “f” vs. “m” or as “female” vs. “male”. The labels are different, but the meaning is the same.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Although nominal values are sometimes represented by digits, one must not interpret them as numbers.&lt;/p>
&lt;ul>
&lt;li>For example, the postal codes used in Germany are digits, but there is no meaning in, e.g., adding two postal codes.&lt;/li>
&lt;li>Similarly, nominal features do not have an ordering, i.e., the postal code 12345 is not “smaller” than the postal code 56789. Of course, most of the time there are options for how to introduce some kind of lexicographic sorting scheme, but this is purely artificial and has no meaning for the underlying objects.
With respect to statistics, the permissible average is not the mean (since summation is not allowed) or the median (since there is no ordering), but the mode, i.e., the most common value in the dataset.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/details>&lt;/p>
&lt;details class="spoiler " id="spoiler-2">
&lt;summary class="cursor-pointer">Ordinal&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;ul>
&lt;li>The ordinal scale allows comparing values &lt;strong>w.r.t. equivalence and rank.&lt;/strong>
&lt;ul>
&lt;li>Any transformation of the domain must preserve the order, which means that the transformation must be strictly increasing.&lt;/li>
&lt;li>But there is still no way to add an offset to one value in order to obtain a new value or to take the difference between two values.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Example: school grades.
&lt;ul>
&lt;li>In the German grading system, the grade 1 (“excellent”) is better than 2 (“good”), which is better than 3 (“satisfactory”) and so on.
&lt;ul>
&lt;li>But quite surely the difference in a student’s skills is not the same between the grades 1 and 2 as between 2 and 3, although the “difference” in the grades is unity in both cases.&lt;/li>
&lt;li>In addition, teachers often report the arithmetic mean of the grades in an exam, even though the arithmetic mean does not exist on the ordinal scale. In consequence, it is syntactically possible to compute the mean, even though the result, e.g., 2.47 has no place on the grading scale, other than it being “closer” to a 2 than a 3. The Anglo-Saxon grading system, which uses the letters “A” to “F”, is somewhat immune to this confusion.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The correct average involving an ordinal scale is obtained by the median.&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/details>
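&lt;p>The permissible averages for the two categorical scales can be tried out with Python's standard &lt;code>statistics&lt;/code> module. The sample data below is made up for illustration.&lt;/p>

```python
import statistics

# Nominal scale: postal codes are pure labels, so only the mode is meaningful.
postal_codes = ["76131", "10115", "76131", "80331"]  # hypothetical sample
print(statistics.mode(postal_codes))      # 76131

# Ordinal scale: school grades have a rank order, so the median is permissible.
# median_low always returns a value that actually occurs in the data, so the
# result stays on the grading scale even for an even number of grades.
grades = [1, 2, 2, 3, 5]                  # hypothetical sample
print(statistics.median_low(grades))      # 2
```

&lt;p>Using &lt;code>median_low&lt;/code> instead of &lt;code>median&lt;/code> is a design choice of this sketch: for an even number of values, &lt;code>median&lt;/code> would average the two middle grades, which is exactly the operation the ordinal scale does not support.&lt;/p>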
&lt;p>&lt;strong>Cardinal variables&lt;/strong>&lt;/p>
&lt;details class="spoiler " id="spoiler-3">
&lt;summary class="cursor-pointer">Interval&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;ul>
&lt;li>The interval scale allows adding an offset to one value to obtain a new one, or to calculate the difference between two values—hence the name.&lt;/li>
&lt;li>However, the interval scale lacks a naturally defined zero. Values from the interval scale are typically represented using real numbers, which contain the symbol “0,” but this symbol has no special meaning and its position on the scale is arbitrary. For this reason, the scalar multiplication of two values from the interval scale is meaningless. Permissible transformations preserve the order, but may shift the position of the zero.&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-4">
&lt;summary class="cursor-pointer">Ratio&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;ul>
&lt;li>
&lt;p>The ratio scale has a well defined, non-arbitrary zero, and therefore allows calculating ratios of two values.&lt;/p>
&lt;ul>
&lt;li>This implies that there is a scalar multiplication and that any transformation must preserve the zero.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Many features from the field of physics belong to this category and any transformation is merely a change of units.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/details>
&lt;details class="spoiler " id="spoiler-5">
&lt;summary class="cursor-pointer">Absolute&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
The absolute scale shares these properties, but is equipped with a natural unit and features of this scale can NOT be negative. In other words, features of the absolute scale represent counts of some quantities. Therefore, the only allowed transformation is the identity.
&lt;/div>
&lt;/details>
&lt;h2 id="wertdiskrete-systeme">Discrete-Valued Systems&lt;/h2>
&lt;h3 id="statische-systeme">Static Systems&lt;/h3>
&lt;p>Input/output: random variables $u_k$ (input) and $y_k$ (output), $k \in \mathbb{N}_0$&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-statistische_systeme.drawio.png" alt="wertdiskrete_systeme-statistische_systeme.drawio">&lt;/p>
&lt;p>$u_k$ and $y_k$ are discrete-valued, where w.l.o.g.&lt;/p>
$$
\begin{array}{l}
u_{k} \in\{1,2, \ldots, P\} \\
y_{k} \in\{1,2, \ldots, M\}
\end{array}
$$
&lt;p>Stochastic dependence of $y_k$ on $u_k$:&lt;/p>
$$
P\left(y_{k}=i \mid u_{k}=j\right) \qquad j \in\{1, \ldots, P\}, i \in\{1, \ldots, M\}
$$
&lt;p>Arrangement of the probabilities in the matrix $\mathbf{A}_k$:&lt;/p>
$$
\mathbf{A}_{k}=\left(\begin{array}{ccc}
P\left(y_{k}=1 \mid u_{k}=1\right) &amp; \cdots &amp; P\left(y_{k}=M \mid u_{k}=1\right) \\
\vdots &amp; &amp; \vdots \\
P\left(y_{k}=1 \mid u_{k}=P\right) &amp; \cdots &amp; P\left(y_{k}=M \mid u_{k}=P\right)
\end{array}\right)
$$
&lt;ul>
&lt;li>
&lt;p>Elements $\geq 0$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Row sums $= 1$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Probabilities of occurrence as vectors:&lt;/p>
$$
\eta_{k}^{u}=\left(\begin{array}{c}
P\left(u_{k}=1\right) \\
P\left(u_{k}=2\right) \\
\vdots \\
P\left(u_{k}=P\right)
\end{array}\right) \qquad \eta_{k}^{y}=\left(\begin{array}{c}
P\left(y_{k}=1\right) \\
P\left(y_{k}=2\right) \\
\vdots \\
P\left(y_{k}=M\right)
\end{array}\right)
$$
&lt;/li>
&lt;/ul>
&lt;p>Computing $\eta_k^y$ from $\eta_k^u$ (in vector-matrix form):&lt;/p>
$$
\eta_{k}^{y}=\mathbf{A}_{k}^{\top} \eta_{k}^{u}
$$
&lt;details class="spoiler " id="spoiler-11">
&lt;summary class="cursor-pointer">Details&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
$$
\begin{aligned}
P\left(y_{k}=i\right) &amp;=\sum_{j=1}^{P} P\left(y_{k}=i, u_{k}=j\right) \\\\
&amp;=\sum_{j=1}^{P} P\left(y_{k}=i \mid u_{k}=j\right) \cdot P\left(u_{k}=j\right)
\end{aligned}
$$
&lt;/div>
&lt;/details>
&lt;p>Special case: $u_k = j^*$ is known, i.e.&lt;/p>
$$
\begin{array}{l}
P\left(u_{k}=j^{*}\right)=1 \\
P\left(u_{k}=j\right)=0 \quad j=1, \ldots, P, \; j \neq j^{*}
\end{array}
$$
$$
\begin{aligned}
\Rightarrow \quad P\left(y_{k}=i\right) &amp;=\sum_{j=1}^{P} P\left(y_{k}=i \mid u_{k}=j\right) P\left(u_{k}=j\right) \\
&amp;=P\left(y_{k}=i \mid u_{k}=j^{*}\right)
\end{aligned}
$$
&lt;p>In vector-matrix form:&lt;/p>
$$
\eta_{k}^{y}={\underbrace{\mathbf{A}_{k}\left(j^{*}, :\right)}_{\text{row } j^* \text{ of } \mathbf{A}_k}}^\top=\left(P\left(y_{k}=1 \mid u_{k}=j^{*}\right) \cdots P\left(y_{k}=M \mid u_{k}=j^{*}\right)\right)^{\top}
$$
$$
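&lt;p>A small numerical sketch of $\eta_k^y = \mathbf{A}_k^\top \eta_k^u$ and of the special case of a known input. The matrix entries and the function name are hypothetical; the example uses two input values and three output values, and plain Python lists instead of a matrix library.&lt;/p>

```python
def output_distribution(A, eta_u):
    """Compute eta_y = A^T eta_u for a row-stochastic matrix A.

    A[j][i] holds P(y = i+1 | u = j+1), eta_u[j] holds P(u = j+1).
    """
    num_inputs = len(A)
    num_outputs = len(A[0])
    return [
        sum(A[j][i] * eta_u[j] for j in range(num_inputs))
        for i in range(num_outputs)
    ]


# Hypothetical system with 2 input values and 3 output values.
A = [
    [0.7, 0.2, 0.1],  # P(y = i | u = 1), row sum 1
    [0.1, 0.3, 0.6],  # P(y = i | u = 2), row sum 1
]

# General case: uncertain input.
eta_u = [0.5, 0.5]
eta_y = output_distribution(A, eta_u)
print(eta_y)

# Special case: the input is known to be j* = 2, so eta_u is a unit vector
# and the result is the second row of A.
eta_y_known = output_distribution(A, [0.0, 1.0])
print(eta_y_known)
```

&lt;p>Since every row of $\mathbf{A}_k$ sums to 1, the resulting vector $\eta_k^y$ again sums to 1.&lt;/p>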
&lt;h3 id="dynamische-systeme">Dynamic Systems&lt;/h3>
&lt;ul>
&lt;li>The current output $y_k$ depends on
&lt;ul>
&lt;li>the current input $u_k$&lt;/li>
&lt;li>the current state $x_k$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The dynamic system is split into two parts
&lt;ul>
&lt;li>&lt;strong>&lt;a href="#systemabbildung">System mapping&lt;/a>&lt;/strong> (&lt;em>dynamic&lt;/em> part): describes the temporal evolution of the state $x_k$&lt;/li>
&lt;li>&lt;strong>&lt;a href="#messabbildung">Measurement mapping&lt;/a>&lt;/strong> (&lt;em>static&lt;/em> part): describes how the output $y_k$ is obtained from the state $x_k$ (and possibly from the current input $u_k$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="systemabbildung">System mapping&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Random variable $x_k, k \in \mathbb{N}_0$ with $x_k \in \{1, 2, \dots, N\}$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The evolution of the state $x_k$ is described by&lt;/p>
$$
P(x_{k+1}=i | x_k, \dots, x_1, x_0, u_k)
$$
&lt;p>($u_k$ is often omitted from the notation)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>&lt;/p>
&lt;p>The sequence $x_k$ is a &lt;mark>&lt;strong>Markov chain&lt;/strong>&lt;/mark> (of first order) if&lt;/p>
$$
P\left(x_{k+1}=i \mid x_{k}, \ldots, x_{1}, x_{0}, u_{k}\right)=P\left(x_{k+1}=i \mid x_{k}, u_{k}\right)
$$
&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>The future state $x_{k+1}$ is conditionally independent of the past states $x_{k-1}, \dots, x_1, x_0$ given the current state $x_k$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Simplified transition probability&lt;/p>
$$
P(x_{k+1} = j| x_k = i)
$$
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>&lt;/p>
&lt;p>A Markov chain is called &lt;mark>&lt;strong>time-homogeneous&lt;/strong>&lt;/mark>, or more generally &lt;mark>&lt;strong>time-invariant&lt;/strong>&lt;/mark>, if the transition probabilities do not depend on the time index, i.e.&lt;/p>
$$
P\left(x_{k+1}=j \mid x_{k}=i\right)=\mathbf{A}(i, j)
$$
&lt;p>Transition matrix (time-homogeneous):&lt;/p>
$$
\mathbf{A}=\left(\begin{array}{cccc}
A(1,1) &amp; A(1,2) &amp; \ldots &amp; A(1, N) \\\\
A(2,1) &amp; A(2,2) &amp; \cdots &amp; A(2, N) \\\\
\vdots &amp; \vdots &amp; &amp; \vdots \\\\
A(N, 1) &amp; A(N, 2) &amp; \cdots &amp; A(N, N)
\end{array}\right)
$$
&lt;/span>
&lt;/div>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>&lt;/p>
&lt;p>A square matrix $\mathbf{A}$ is called a &lt;mark>&lt;strong>Markov matrix&lt;/strong>&lt;/mark> if&lt;/p>
&lt;ul>
&lt;li>
&lt;p>all entries are &lt;strong>non-negative&lt;/strong>&lt;/p>
$$
A(i, j) \geq 0 \quad \text{for } i, j \in \{1, \dots, N\}
$$
&lt;/li>
&lt;li>
&lt;p>every row sums to 1&lt;/p>
$$
\sum_{j=1}^{N} A(i, j)=1 \quad \text{for } i \in \{1, \dots, N\}
$$
&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
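&lt;p>The two conditions of the definition can be checked mechanically. The following is a small sketch; the function name &lt;code>is_markov_matrix&lt;/code> and the tolerance are assumptions made for illustration.&lt;/p>

```python
import numpy as np


def is_markov_matrix(A, tol=1e-12):
    """Check the two defining properties of a (row-stochastic) Markov matrix."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False  # not a square matrix
    nonnegative = bool(np.all(A >= 0.0))
    rows_sum_to_one = bool(np.allclose(A.sum(axis=1), 1.0, atol=tol))
    return nonnegative and rows_sum_to_one


valid = is_markov_matrix([[0.9, 0.1], [0.3, 0.7]])
# Columns sum to 1 here, but rows do not: not a Markov matrix in this convention.
column_stochastic = is_markov_matrix([[0.9, 0.3], [0.1, 0.7]])
```

&lt;p>Note that the check is on row sums, because row $i$ of $\mathbf{A}$ is the distribution of $x_{k+1}$ given $x_k = i$.&lt;/p>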
&lt;p>Graphical representation of a Markov chain:&lt;/p>
&lt;p>e.g. $N=2, x_k \in \{1, 2\}$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Markov_Kette.drawio.png" alt="wertdiskrete_systeme-Markov_Kette.drawio" style="zoom:80%;" />
&lt;h4 id="messabbildung">Measurement mapping&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>The state is typically NOT directly observable (latent variable)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Measurement mapping from the state $x_k$ and the current input $u_k$ to the current output $y_k$&lt;/p>
$$
P\left(y_{k}=j \mid x_{k}=i, u_{k}=m\right)
$$
&lt;ul>
&lt;li>$u_k$ is often omitted from the notation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Time-homogeneous (in general: time-invariant)&lt;/p>
$$
P\left(y_{k}=j|x_{k}=i\right)=B(i, j)
$$
&lt;/li>
&lt;li>
&lt;p>Measurement/observation matrix&lt;/p>
$$
\mathbf{B}=\left[\begin{array}{ccc}
B(1,1) &amp; \cdots &amp; B(1, M) \\
\vdots &amp; &amp; \vdots \\
B(N, 1) &amp; \cdots &amp; B(N, M)
\end{array}\right]
$$
&lt;/li>
&lt;/ul>
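&lt;p>To illustrate how the system mapping $\mathbf{A}$ and the measurement mapping $\mathbf{B}$ work together, here is a hedged sampling sketch for the time-homogeneous case without inputs. The function name &lt;code>simulate_hmm&lt;/code>, the matrix values and the seed are assumptions for this example only.&lt;/p>

```python
import numpy as np


def simulate_hmm(A, B, eta0, steps, rng):
    """Sample states x_0..x_steps and outputs y_1..y_steps (values are 1-based)."""
    N, M = B.shape
    x = rng.choice(N, p=eta0)          # draw the initial state from eta_0^x
    states, outputs = [int(x) + 1], []
    for _ in range(steps):
        x = rng.choice(N, p=A[x])      # system mapping: row x of A
        y = rng.choice(M, p=B[x])      # measurement mapping: row x of B
        states.append(int(x) + 1)
        outputs.append(int(y) + 1)
    return states, outputs


# Hypothetical model: N = 2 states, M = 3 output values.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])
eta0 = np.array([0.5, 0.5])

states, outputs = simulate_hmm(A, B, eta0, steps=5, rng=np.random.default_rng(0))
```

&lt;p>In a real application only &lt;code>outputs&lt;/code> would be observed; &lt;code>states&lt;/code> stays hidden and has to be estimated.&lt;/p>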
&lt;h4 id="gesamtes-dynamisches-system">Complete dynamic system&lt;/h4>
&lt;p>&lt;strong>Hidden Markov Model&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>State&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Value $x_k, k=1,2,\dots$&lt;/li>
&lt;li>Distribution $\eta_k^x, k=1,2,\dots$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Initial state&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Value $x_0$&lt;/li>
&lt;li>Distribution $\eta_0^x$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Inputs&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Values $u_k, k=0,1,\dots$&lt;/li>
&lt;li>Distribution $\eta_k^u,k=0,1,\dots$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Outputs&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Values $y_k, k=0,1,\dots$&lt;/li>
&lt;li>Distribution $\eta_k^y,k=0,1,\dots$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>System mapping&lt;/strong> $\mathbf{A}_k$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Measurement mapping&lt;/strong> $\mathbf{B}_k$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Graphical representation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Unrolled temporal dependencies of the random variables&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Markov_kette_ausgerollte.png"
alt="Markov chain (unrolled representation)">&lt;figcaption>
&lt;p>Markov chain (unrolled representation)&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;li>
&lt;p>Recursive representation of the temporal dependencies of the random variables&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-rekursiv_Markov_kettee.png"
alt="Markov chain (recursive representation)">&lt;figcaption>
&lt;p>Markov chain (recursive representation)&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;li>
&lt;p>Representation emphasizing transitions and probabilities&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Markov_kette_uebergaenge_und_wahrscheinlichkeit.drawio.png"
alt="Markov chain (emphasizing transitions and probabilities)">&lt;figcaption>
&lt;p>Markov chain (emphasizing transitions and probabilities)&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;/ul></description></item><item><title>Zustandsschätzung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/zustandsschaetzung/</link><pubDate>Wed, 08 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertdiskrete_systeme/zustandsschaetzung/</guid><description>&lt;h2 id="vorbemerkungen">Preliminary remarks&lt;/h2>
&lt;h3 id="bayessches-gesetz-und-erweiterte-konditionierung">Bayes' rule and extended conditioning&lt;/h3>
$$
\begin{array}{l}
&amp;P(a \mid b) \cdot P(b)=P(a, b)=P(b \mid a) \cdot P(a) \\\\
\Rightarrow &amp;P(b \mid a)=\frac{P(a | b) \cdot P(b)}{P(a)}
\end{array}
$$
&lt;p>Extended conditioning:&lt;/p>
$$
\begin{array}{l}
P(b \mid a, c) \cdot \underbrace{P(a, c)}_{P(a \mid c) \cdot P(c)}=P(a, b, c)=P(a \mid b, c) \cdot \underbrace{P(b, c)}_{P(b \mid c) \cdot P(c)} \\\\
\Rightarrow P(b \mid a, c) \cdot P(a \mid c)=P(a \mid b, c) \cdot P(b \mid c) \quad (\triangle) \\\\
\Rightarrow P(b \mid a, c)=\frac{P(a \mid b, c) \cdot P(b \mid c)}{P(a \mid c)}
\end{array}
$$
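&lt;p>The identity $(\triangle)$ can be sanity-checked numerically on an arbitrary joint distribution. The sketch below uses a randomly generated table $P(a, b, c)$; the array shapes and the seed are assumptions for illustration.&lt;/p>

```python
import numpy as np

# Random joint distribution P(a, b, c) with axes ordered (a, b, c).
rng = np.random.default_rng(1)
joint = rng.random((2, 3, 4))
joint /= joint.sum()

P_ac = joint.sum(axis=1)        # P(a, c), shape (2, 4)
P_bc = joint.sum(axis=0)        # P(b, c), shape (3, 4)
P_c = joint.sum(axis=(0, 1))    # P(c),    shape (4,)

P_b_given_ac = joint / P_ac[:, None, :]   # P(b | a, c)
P_a_given_bc = joint / P_bc[None, :, :]   # P(a | b, c)
P_a_given_c = P_ac / P_c                  # P(a | c)
P_b_given_c = P_bc / P_c                  # P(b | c)

# Both sides of the identity, evaluated for every (a, b, c).
lhs = P_b_given_ac * P_a_given_c[:, None, :]
rhs = P_a_given_bc * P_b_given_c[None, :, :]
```

&lt;p>Both sides equal $P(a, b \mid c)$, which is why the identity holds for any joint distribution with $P(c) > 0$.&lt;/p>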
&lt;h3 id="notation-zu-abhängigkeit-vom-eingang">Notation for the dependence on the input&lt;/h3>
&lt;p>Dependence of the system matrices $\mathbf{A}_k$ (transition matrix) and $\mathbf{B}_k$ (measurement/observation matrix) on the input $u_k$ (four-dimensional arrays):&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-abhaengigkeit_von_Eingang.png" alt="wertdiskrete_systeme-abhaengigkeit_von_Eingang" style="zoom:67%;" />
&lt;p>Notation:&lt;/p>
$$
\begin{array}{l}
&amp;A\left(k, u_{k}, X_{k+1} = x_{k+1}, X_{k}=x_{k}\right) \\
= &amp;A\left(k, u_{k}, x_{k+1}, x_{k}\right) \\
= &amp;A_{k}\left(u_{k}, x_{k+1}, x_{k}\right) \\
= &amp;A_{k}^{u_{k}}\left(x_{k+1}, x_{k}\right) \\
\end{array}
$$
&lt;p>&amp;ldquo;At time $k$ the current state is $X_k=x_k$. What is the probability of the next state $X_{k+1}=x_{k+1}$ if the input is $u_k$?&amp;rdquo;&lt;/p>
&lt;p>Time-invariant case:&lt;/p>
$$
A(u_k, x_{k+1}, x_k) = A_{u_k}(x_{k+1}, x_k)
$$
&lt;h2 id="zustandsschätzung">State estimation&lt;/h2>
&lt;h3 id="ziel">Goal&lt;/h3>
&lt;p>&lt;strong>Reconstruction&lt;/strong> of the &lt;em>internal&lt;/em> state from measurements and inputs (assumption: $\mathbf{A}_k, \mathbf{B}_k$ are known)&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-rekusiver_Zustandschaetzer.png"
alt="Estimator of the internal state">&lt;figcaption>
&lt;p>Estimator of the internal state&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h3 id="problemformulierung">Problem formulation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Inputs $u_k, k = 0, \dots, k_u$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Measurements $y_k, k = 1, \dots, k_y$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Wanted: reconstruction of the state&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\hat{x}_k, k = 1, \dots, k_x$ (all internal states)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$\hat{x}_{k_x}$ (the most recent state)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example illustration&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Schaetzung_Problemformulierung_grid.png" alt="wertdiskrete_systeme-Schaetzung_Problemformulierung_grid" style="zoom:50%;" />
&lt;/li>
&lt;li>
&lt;p>Paradigm: use all available data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Two important cases/phases&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#pr%C3%A4diktion">Prediction&lt;/a> ($k_u + 1 = k_x > k_y$)&lt;/p>
&lt;p>Predict the current state based on the estimate of the previous state&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#filterung">Filtering&lt;/a> ($k_u + 1 = k_x = k_y$)&lt;/p>
&lt;p>Update/refine the prediction using the observed measurements&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="prädiktion">Prediction&lt;/h3>
&lt;h4 id="allgemein">General case&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>
&lt;p>An estimate of the state at a time $m$ that incorporates the entire input and measurement history up to that time&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Inputs $u_k$ for $k > m$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>System matrices $A_k$ for $k>m$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Interpretation&lt;/p>
&lt;p>From time $m+1$ onward no measurements are available. How does the system evolve based on the system model alone?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Wanted&lt;/p>
&lt;p>Prediction for a later time $k>m$, given the inputs up to $k-1$&lt;/p>
$$
P(x_k \mid y_{1:m}, u_{0: k- 1})
$$
&lt;p>for $x_k \in \{1, \dots, N\}$
&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Example:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$m = 2, k =3$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Prediction:&lt;/p>
$$
P(x_3 \mid y_{1:2}, u_{0:2})
$$
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Praediktion.drawio.png" alt="wertdiskrete_systeme-Praediktion.drawio" style="zoom:80%;" />
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>It is important to use Bayes' rule with extended conditioning
&lt;/p>
$$
P(a, b \mid c) = P(a \mid b, c) \cdot P(b \mid c) \qquad (\ast)
$$&lt;/span>
&lt;/div>
&lt;p>At time $k > m$:&lt;/p>
$$
\begin{array}{ll}
&amp;P\left(x_{k} \mid y_{1: m}, u_{0: k-1}\right)\\
=&amp;P\left(x_{k} \mid y_{1}, \cdots, y_{m}, u_{0}, u_{1}, \cdots, u_{k-1}\right) \quad \mid \text{marginalization}\\
=&amp; \displaystyle \sum_{x_{k-1}=1}^{N} P\left(x_{k}, x_{k-1} \mid y_{1: m}, u_{0: k-1}\right)\\
\overset{(\ast)}{=}&amp;\displaystyle \sum_{x_{k-1}=1}^{N} P\left(x_{k} \mid x_{k-1}, y_{1: m}, u_{0: k-1}\right) P\left(x_{k-1} \mid y_{1: m}, u_{0: k-1}\right) \quad \mid \text{Markov}\\
=&amp;\displaystyle \sum_{x_{k-1}=1}^{N} \underbrace{P\left(x_{k} \mid x_{k-1}, u_{k-1}\right)}_{\text {transition probability}} \cdot \underbrace{P\left(x_{k-1} \mid y_{1: m}, u_{0 : k-2}\right)}_{\text {estimate for } k-1} \quad \text { (recursion forward in time) }
\end{array}
$$
&lt;p>(The sum describes a matrix-vector multiplication.)&lt;/p>
&lt;p>Stacking the individual probabilities into vectors:&lt;/p>
$$
\eta_{k \mid 1: m}^{x} = \left(\begin{array}{c}
P\left(x_{k}=1 \mid y_{1: m}, u_{0: k-1}\right) \\
\vdots \\
P\left(x_{k}=N \mid y_{1: m}, u_{0: k-1}\right)
\end{array}\right)
\qquad
\eta_{k-1 \mid 1: m}^{x}=\left(\begin{array}{c}
P\left(x_{k-1}=1 \mid y_{1: m}, u_{0: k-2}\right) \\
\vdots \\
P\left(x_{k-1}=N \mid y_{1: m}, u_{0: k-2}\right)
\end{array}\right)
$$
&lt;p>Recursive prediction:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialization: estimate vector $\eta_{m \mid 1: m}^{x}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Recursion: for $k > m$&lt;/p>
$$
\eta_{k \mid 1: m}^{x}=\mathbf{A}_{k}^{\top} \eta_{k-1 \mid 1 : m}^{x}
$$
&lt;/li>
&lt;li>
&lt;p>Special case: one-step prediction ($k = m + 1$)&lt;/p>
&lt;/li>
&lt;/ul>
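&lt;p>The recursion above can be sketched in a few lines. This is an illustrative sketch, not the lecture's implementation: the function name &lt;code>predict&lt;/code>, the matrix values and the starting vector are assumptions. The list passed in holds the transition matrices that apply to the successive prediction steps.&lt;/p>

```python
import numpy as np


def predict(eta_m, transition_matrices):
    """Apply eta_k = A_k^T eta_{k-1} once per transition matrix, in order."""
    eta = np.asarray(eta_m, dtype=float)
    for A_k in transition_matrices:
        eta = A_k.T @ eta
    return eta


# Hypothetical time-invariant transition matrix with N = 2 states.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
eta_m = np.array([1.0, 0.0])           # state known to be 1 at time m

eta_one_step = predict(eta_m, [A])     # one-step prediction (k = m + 1)
eta_two_step = predict(eta_m, [A, A])  # two-step prediction (k = m + 2)
```

&lt;p>Without measurements, each additional step only applies the system model, so the predicted distribution typically becomes less concentrated over time.&lt;/p>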
&lt;h4 id="konkretes-beispiel">Concrete example&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>System model (time-invariant)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>System mapping&lt;/p>
$$
\mathbf{A}_{u_{k}}=\left(\begin{array}{ll}
a_{1}\left(u_{k}\right) &amp; 1-a_{1}\left(u_{k}\right) \\
a_{2}\left(u_{k}\right) &amp; 1-a_{2}\left(u_{k}\right)
\end{array}\right) \qquad a_{1}\left(u_{k}\right), a_{2}\left(u_{k}\right) \in[0,1]
$$
&lt;blockquote>
&lt;p>Reminder: $A(i, j):=P\left(x_{k+1}=j \mid x_{k}=i\right)$&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Measurement mapping&lt;/p>
$$
\mathbf{B}=\left(\begin{array}{ll}
b_{1} &amp; 1-b_{1} \\
b_{2} &amp; 1-b_{2}
\end{array}\right) \qquad b_1, b_2 \in [0, 1]
$$
&lt;blockquote>
&lt;p>Reminder: $B(i, j):=P\left(y_{k}=j \mid x_{k}=i\right)$&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;ul>
&lt;li>
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initial state estimate vector&lt;/p>
$$
\eta_{0}^{x}=\left[\begin{array}{c}
p_{0} \\
1-p_{0}
\end{array}\right] (=P(x_0))
\qquad p_{0} \in[0,1]
$$
&lt;p>(That is: $P(x_0 = 1) = p_0, P(x_0 = 2) = 1- p_0$)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Values of the inputs $u_0, u_1, u_2$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No measurements&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Wanted&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Joint distribution for the time steps $k = 1, 2, 3$&lt;/p>
$$
P\left(x_{1}, x_{2}, x_{3} \mid u_{0}, u_{1}, u_{2}\right)=: P\left(x_{1:3} \mid u_{0: 2}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Distribution at time $k=3$&lt;/p>
$$
P\left(x_{3} \mid u_{0}, u_{1}, u_{2}\right)=P\left(x_{3} \mid u_{0: 2}\right)=\eta_{3}^{x}\left(x_{3}\right)
$$
&lt;/li>
&lt;/ul>
&lt;p>So we are at time step 0 and want to predict&lt;/p>
&lt;ul>
&lt;li>the future states $x_k, k = 1, 2, 3$&lt;/li>
&lt;li>the future measurements $y_k, k=1,2,3$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Factorization of the joint distribution for $k = 0, 1, 2, 3$:&lt;/p>
$$
\begin{aligned}
&amp; P\left(x_{0: 3} \mid u_{0: 2}\right) \\
\overset{(\ast)}{=}&amp; P\left(x_{3} \mid x_{0: 2}, u_{0: 2}\right) \cdot P\left(x_{0: 2} \mid u_{0: 2}\right) \\
\overset{\text{Markov}}{=}&amp; P\left(x_{3} \mid x_{2}, u_{2}\right) \cdot P\left(x_{2} \mid x_{0: 1}, u_{0: 2}\right) \cdot P\left(x_{0: 1} \mid u_{0: 2}\right) \\
\overset{\text{Markov}}{=}&amp; P\left(x_{3} \mid x_{2}, u_{2}\right) \cdot P\left(x_{2} \mid x_{1}, u_{1}\right) P\left(x_{1} \mid x_{0}, u_{0: 2}\right) \cdot P\left(x_{0} \mid u_{0: 2}\right) \\
=&amp; P\left(x_{3} \mid x_{2}, u_{2}\right) \cdot P\left(x_{2} \mid x_{1}, u_{1}\right) \cdot P\left(x_{1} \mid x_{0}, u_{0}\right) \cdot P\left(x_{0}\right) \\
=&amp; A_{u_{2}}\left(x_{2}, x_{3}\right) \cdot A_{u_{1}}\left(x_{1}, x_{2}\right) \cdot A_{u_{0}}\left(x_{0}, x_{1}\right) \cdot \eta_{0}^{x}\left(x_{0}\right)
\end{aligned}
$$
&lt;p>Joint distribution for $k = 1, 2, 3$:&lt;/p>
$$
\begin{aligned}
P\left(x_{1: 3} \mid u_{0: 2}\right) &amp;=\sum_{x_{0}=1}^{2} P\left(x_{0: 3} \mid u_{0: 2}\right) \\
&amp;=\underbrace{P\left(x_{3} \mid x_{2}, u_{2}\right)}_{=\mathbf{A}_{u_{2}}\left(x_{2}, x_{3}\right)} \cdot \underbrace{P\left(x_{2} \mid x_{1}, u_{1}\right)}_{=\mathbf{A}_{u_{1}}\left(x_{1}, x_{2}\right)} \cdot \underbrace{\sum_{x_0=1}^{2} P\left(x_{1} \mid x_{0}, u_{0}\right) \cdot P\left(x_{0}\right)}_{=P\left(x_{1} \mid u_{0}\right)=\eta_{1}^{x}\left(x_{1}\right)}
\end{aligned}
$$
&lt;blockquote>
&lt;p>$P\left(x_{1: 3} \mid u_{0: 2}\right)$ means: $P$ is indexed by the three-dimensional index vector $x_{1:3} = (x_1, x_2, x_3)^\top$, each component of which can take 2 values.&lt;/p>
&lt;/blockquote>
$$
\begin{array}{ll}
\eta_{1}^{x}\left(x_{1}\right) &amp;= \sum_{x_{0}=1}^{2} A_{u_{0}}\left(x_{0}, x_{1}\right) \cdot \eta_{0}^{x}\left(x_{0}\right)\\
&amp;=A_{u_{0}}\left(x_{0}=1, x_{1}\right) \underbrace{p_{0}}_{=P(x_0 = 1)}+A_{u_{0}}\left(x_{0}=2, x_{1}\right) \underbrace{\left(1-p_{0}\right)}_{=P\left(x_{0}=2\right)} \quad (\text{marginalization})\\
&amp;=\left\{\begin{array}{ll}
a_{1}(u_0) \cdot p_{0}+a_{2}(u_0)\left(1-p_{0}\right) &amp; x_{1}=1 \\
\left(1-a_{1}(u_0)\right) p_{0}+\left(1-a_{2}(u_0)\right)\left(1-p_{0}\right) &amp; x_{1}=2
\end{array}\right.
\end{array}
$$
$$
\begin{aligned}
P\left(x_{3} \mid u_{0: 2}\right)=&amp;\displaystyle \sum_{x_{2}=1}^{2} \sum_{x_{1}=1}^{2} P\left(x_{1: 3} \mid u_{0: 2}\right)\\
=&amp;\displaystyle \sum_{x_{2}=1}^{2} A_{u_{2}}\left(x_{2}, x_{3}\right) \underbrace{\displaystyle \sum_{x_{1} = 1}^{2} A_{u_{1}}\left(x_{1}, x_{2}\right) \eta_{1}^{x}\left(x_{1}\right)}_{=P\left(x_{2} \mid u_{0:1}\right)=\eta_{2}^{x}\left(x_{2}\right)}\\
=&amp;\sum_{x_{2}=1}^{2} A_{u_{2}}\left(x_{2}, x_{3}\right) \cdot \eta_{2}^{x}\left(x_{2}\right)\\\\
=&amp; \eta_{3}^{x}\left(x_{3}\right)
\end{aligned}
$$
$$
\begin{aligned}
\eta_{3}^{x} &amp;=\mathbf{A}_{u_{2}}^{\top} \cdot \underbrace{\eta_{2}^{x}}_{=\mathbf{A}_{u_{1}}^{\top} \cdot \eta_{1}^{x}} \\
&amp;=\mathbf{A}_{u_{2}}^{\top} \cdot (\mathbf{A}_{u_{1}}^{\top} \cdot \underbrace{\eta_{1}^{x}}_{=\mathbf{A}_{u_{0}}^{\top} \cdot \eta_{0}^{x}})\\
&amp;=\mathbf{A}_{u_{2}}^{\top} \cdot (\mathbf{A}_{u_{1}}^{\top} \cdot (\mathbf{A}_{u_{0}}^{\top} \cdot \eta_{0}^{x})) \quad \text { (rekursive Berechnung) }
\end{aligned}
$$
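&lt;p>As a plausibility check of the recursion, the following sketch computes $\eta_3^x$ in two ways: recursively via $\mathbf{A}_{u}^{\top}$ and by brute-force summation of the factorized joint distribution over $x_0, x_1, x_2$. The numerical values of $a_1(u_k)$, $a_2(u_k)$ and $p_0$ are made up for illustration, and &lt;code>A_u&lt;/code> is a hypothetical helper.&lt;/p>

```python
import itertools

import numpy as np


def A_u(a1, a2):
    """Build the 2x2 transition matrix from a_1(u_k) and a_2(u_k)."""
    return np.array([[a1, 1.0 - a1],
                     [a2, 1.0 - a2]])


# Hypothetical values for the inputs u_0, u_1, u_2 and for p_0.
A_seq = [A_u(0.9, 0.2), A_u(0.6, 0.4), A_u(0.7, 0.1)]
eta0 = np.array([0.3, 0.7])

# Recursive computation: eta_3 = A_{u2}^T (A_{u1}^T (A_{u0}^T eta_0)).
eta3_recursive = eta0
for A in A_seq:
    eta3_recursive = A.T @ eta3_recursive

# Brute force: sum the factorized joint over x_0, x_1, x_2 (zero-based indices).
eta3_brute = np.zeros(2)
for x0, x1, x2, x3 in itertools.product(range(2), repeat=4):
    eta3_brute[x3] += (A_seq[2][x2, x3] * A_seq[1][x1, x2]
                       * A_seq[0][x0, x1] * eta0[x0])
```

&lt;p>The brute-force sum has $2^4$ terms here and grows exponentially with the horizon, while the recursion needs one matrix-vector product per step.&lt;/p>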
&lt;p>Prediction of the measurements for $k=1,2,3$:&lt;/p>
$$
\begin{aligned}
&amp; P(y_{1}, y_{2}, y_{3}, x_{1: 3} \mid u_{0: 2}) \\\\
\overset{(\ast)}{=}&amp; P\left(y_{1: 3} \mid x_{1: 3}, u_{0: 2}\right) \cdot P\left(x_{1: 3} \mid u_{0: 2}\right) \\\\
=&amp; P\left(y_{1: 3} \mid x_{1: 3}\right) P\left(x_{1: 3} \mid u_{0: 2}\right) \\\\
=&amp; P\left(y_{1} \mid x_{1: 3}\right) P\left(y_{2} \mid x_{1: 3}\right) P\left(y_{3} \mid x_{1: 3}\right) P\left(x_{1: 3} \mid u_{0: 2}\right) \\\\
=&amp; P\left(y_{1} \mid x_{1}\right) \cdot P\left(y_{2} \mid x_{2}\right) \cdot P\left(y_{3} \mid x_{3}\right) P\left(x_{1: 3} \mid u_{0: 2}\right) \\\\
=&amp; B\left(x_{1}, y_{1}\right) B\left(x_{2}, y_{2}\right) B\left(x_{3}, y_{3}\right) P\left(x_{1: 3} \mid u_{0: 2}\right)
\end{aligned}
$$
&lt;p>Prediction of the measurement for $k=3$:&lt;/p>
$$
\begin{aligned}
&amp; P\left(y_{3}, x_{3} \mid u_{0: 2}\right) \\
\overset{(\ast)}{=}&amp; P\left(y_{3} \mid x_{3}, u_{0: 2}\right) \cdot P\left(x_{3} \mid u_{0: 2}\right) \\
=&amp; P\left(y_{3} \mid x_{3}\right) \cdot P\left(x_{3} \mid u_{0: 2}\right) \\
=&amp; B\left(x_{3}, y_{3}\right) \cdot \eta_{3}^{x}\left(x_{3}\right)
\end{aligned}
$$
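&lt;p>As a worked follow-up (not stated explicitly in the notes, but obtained by marginalizing the joint distribution above over $x_3$), the predicted measurement distribution takes the same vector-matrix form as for static systems. The symbol $\eta_3^y$ for the stacked probabilities $P(y_3 \mid u_{0:2})$ is introduced here by analogy with $\eta_k^y$:&lt;/p>
$$
P\left(y_{3} \mid u_{0: 2}\right)=\sum_{x_{3}=1}^{2} B\left(x_{3}, y_{3}\right) \cdot \eta_{3}^{x}\left(x_{3}\right) \quad \Longleftrightarrow \quad \eta_{3}^{y}=\mathbf{B}^{\top} \eta_{3}^{x}
$$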
&lt;h3 id="filterung-wonham-filter">Filtering (Wonham filter)&lt;/h3>
&lt;p>What does $P\left(x_{k} \mid y_{1: k}, u_{0: k-1}\right)$ look like, given the prediction $P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right)$?&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertdiskrete_systeme-Filterung.drawio.png" alt="wertdiskrete_systeme-Filterung.drawio">&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Reminder&lt;/strong>&lt;/p>
$$
P(b \mid a, c) \cdot P(a \mid c)=P(a \mid b, c) \cdot P(b \mid c) \quad (\triangle)
$$&lt;/span>
&lt;/div>
$$
\begin{aligned}
&amp; P\left(x_{k} \mid y_{1: k}, u_{0: k-1}\right) \\
=&amp;\quad P(\underbrace{x_{k}}_{b} \mid \underbrace{y_{k}}_{a}, \underbrace{\left.y_{1: k-1}, u_{0: k-1}\right)}_{c}\\\\
\overset{(\triangle)}{=}&amp; \frac{P\left(y_{k} \mid x_{k}, y_{1: k-1}, u_{0: k-1}\right) \cdot P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}{P\left(y_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}\\\\
= &amp; \frac{\overbrace{P\left(y_{k} \mid x_{k}\right)}^{\text{likelihood}} \cdot \overbrace{P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}^{\text{one-step prediction}}}{\underbrace{P\left(y_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}_{\text{normalization constant}}}
\end{aligned}
$$
&lt;ul>
&lt;li>Likelihood&lt;/li>
&lt;/ul>
$$
P\left(y_{k} \mid x_{k}\right)=B_{k}\left(x_{k}, y_{k}\right) \qquad(\text{entry of the measurement matrix})
$$
&lt;ul>
&lt;li>Normalization constant&lt;/li>
&lt;/ul>
$$
\begin{aligned}
&amp; P\left(y_{k} \mid y_{1: k-1}, u_{0: k-1}\right) \\\\
\stackrel{\text { Margin. }}{=} &amp; \sum_{x_{k}=1}^{N} P\left(y_{k}, x_{k} \mid y_{1: k-1}, u_{0: k-1}\right) \\\\
\overset{(\ast)}{=}&amp; \sum_{x_{k}=1}^{N} P\left(y_{k} \mid x_{k}, y_{1: k-1}, u_{0: k-1}\right) \cdot P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right) \\\\
=&amp; \sum_{x_{k} = 1}^{N} P\left(y_{k} \mid x_{k}\right) \cdot P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right)
\end{aligned}
$$
&lt;ul>
&lt;li>One-step prediction&lt;/li>
&lt;/ul>
$$
\eta_{k \mid 1: k-1}^{x}=\mathbf{A}_{k}^{\top} \eta_{k-1\mid1: k-1}^{x}
$$
&lt;p>Filtering in vector-matrix form:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For $y_k = m$, form the diagonal matrix $\operatorname{diag}(\mathbf{B}(:, m))$ from the $m$-th column $\mathbf{B}(:, m)$ of the measurement matrix&lt;/p>
$$
\begin{aligned}
\eta_{k \mid 1: k}^{x} &amp;\overset{y_k = m}{=}\frac{\operatorname{diag}(\mathbf{B}(:, m)) \cdot \eta_{k \mid 1: k-1}^{x}}{\mathbf{1}_{N}^{\top} \operatorname{diag}(\mathbf{B}(:, m)) \cdot \eta_{k \mid 1: k-1}^{x}} \\\\
&amp;=\frac{\mathbf{B}(:, m) \odot \eta_{k \mid 1: k-1}^{x}}{\mathbf{B}(:, m)^\top \cdot \eta_{k \mid 1:k-1}^{x}}
\end{aligned}
$$
&lt;ul>
&lt;li>$\mathbf{1}_N$: vector of ones&lt;/li>
&lt;li>$\odot$: element-wise multiplication&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>This is a fully recursive filter $\rightarrow$ &lt;mark>&lt;strong>Wonham filter&lt;/strong>&lt;/mark>&lt;/p>
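&lt;p>One cycle of prediction and filtering can be sketched as follows. The function name &lt;code>wonham_step&lt;/code> and all numbers are assumptions made for illustration; measurements are 1-based as in the notes, and the sketch assumes the observed value has non-zero predicted probability (otherwise the normalization would divide by zero).&lt;/p>

```python
import numpy as np


def wonham_step(eta_prev, A_k, B, y_k):
    """One prediction + filter step for a measured value y_k in {1, ..., M}."""
    eta_pred = A_k.T @ eta_prev               # one-step prediction
    unnormalized = B[:, y_k - 1] * eta_pred   # likelihood (column of B) times prediction
    return unnormalized / unnormalized.sum()  # divide by the normalization constant


# Hypothetical model with N = 2 states and M = 2 measurement values.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],
              [0.3, 0.7]])

eta_first = wonham_step(np.array([0.5, 0.5]), A, B, 1)

# Run the filter recursively over a short measurement sequence.
eta = np.array([0.5, 0.5])
for y in [1, 1, 2]:
    eta = wonham_step(eta, A, B, y)
```

&lt;p>The element-wise product corresponds to $\mathbf{B}(:, m) \odot \eta_{k \mid 1:k-1}^{x}$, and the sum in the denominator to $\mathbf{B}(:, m)^\top \eta_{k \mid 1:k-1}^{x}$.&lt;/p>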
&lt;p>For an example, see &lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/hmm_und_wonham_filter/">here&lt;/a>.&lt;/p></description></item><item><title>Wertekontinuierliche lineare Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/</link><pubDate>Wed, 15 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/</guid><description/></item><item><title>Statische und Dynamische Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/statische_systeme/</link><pubDate>Thu, 16 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/statische_systeme/</guid><description>&lt;h2 id="linearität">Linearity&lt;/h2>
&lt;p>Given a system $S$&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertekontinuierliche_lineare_systeme.drawio.png" alt="wertekontinuierliche_lineare_systeme.drawio">&lt;/p>
$$
\underline{x}_k \rightarrow \underline{y}_k \qquad k \in \mathbb{N}_0
$$
&lt;p>Two conditions for linearity&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Scaling&lt;/p>
$$
\underline{x}_k \rightarrow \underline{y}_k \Rightarrow A \cdot \underline{x}_k \rightarrow A \cdot \underline{y}_k
$$
&lt;/li>
&lt;li>
&lt;p>Superposition&lt;/p>
$$
\begin{aligned}
\underline{x}_k^1 \rightarrow \underline{y}_k^1, \quad \underline{x}_k^2 \rightarrow \underline{y}_k^2 \\
\Rightarrow \underline{x}_k^1 + \underline{x}_k^2 \rightarrow \underline{y}_k^1 + \underline{y}_k^2
\end{aligned}
$$
&lt;/li>
&lt;/ul>
&lt;h2 id="statische-systeme">Static systems&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Inputs/outputs: random vectors $\underline{u}_k$ and $\underline{y}_k$ ($k \in \mathbb{N}_0$ is the time step)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertekontinuierliche_lineare_systeme-statische_systeme.drawio.png" alt="wertekontinuierliche_lineare_systeme-statische_systeme.drawio" style="zoom:80%;" />
&lt;ul>
&lt;li>$\underline{u}_k \in \mathbb{R}^P$ and $\underline{y}_k \in \mathbb{R}^M$ are continuous-valued&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$\underline{u}_k$ is mapped to $\underline{y}_k$ by a linear map&lt;/p>
$$
\underline{y}_k = \mathbf{A}_k \cdot \underline{u}_k
$$
&lt;p>where $\mathbf{A}_k \in \mathbb{R}^{M \times P}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The uncertainties in $\underline{u}_k$ and $\underline{y}_k$ are described by the first two moments&lt;/p>
&lt;ul>
&lt;li>Expected value
&lt;ul>
&lt;li>$\underline{\hat{u}}_k := E\{\underline{u}_k\}$
&lt;/li>
&lt;li>$\underline{\hat{y}}_k := E\{\underline{y}_k\}$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Covariance matrix
&lt;ul>
&lt;li>$C_k^u := \operatorname{Cov}\{\underline{u}_k\}$
&lt;/li>
&lt;li>$C_k^y := \operatorname{Cov}\{\underline{y}_k\}$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Computing the moments $\underline{\hat{y}}_k, C_k^y$ for given $\underline{\hat{u}}_k, C_k^u$&lt;/p>
$$
\begin{aligned}
\underline{\hat{y}}_{k} &amp;=E\left\{\underline{y}_{k}\right\} \\
&amp;=E\left\{\mathbf{A}_{k} \cdot \underline{u}_{k}\right\} \\
&amp;=\mathbf{A}_{k} \cdot E\left\{\underline{u}_{k}\right\} \\
&amp;=\mathbf{A}_{k} \cdot \underline{\hat{u}}_{k} \\\\
C_{k}^{y} &amp;=\operatorname{Cov}\left\{\underline{y}_{k}\right\} \\
&amp;=E\left\{\left(\underline{y}_{k}-\underline{\hat{y}}_{k}\right)\left(\underline{y}_{k}-\underline{\hat{y}}_{k}\right)^{\top}\right\} \\
&amp;=E\left\{\mathbf{A}_{k}\left(\underline{u}_{k}-\underline{\hat{u}}_{k}\right)\left(\underline{u}_{k}-\underline{\hat{u}}_{k}\right)^{\top} \mathbf{A}_{k}^{\top}\right\} \\
&amp;=\mathbf{A}_{k} E\left\{\left(\underline{u}_{k}-\underline{\hat{u}}_{k}\right)\left(\underline{u}_{k}-\underline{\hat{u}}_{k}\right)^{\top}\right\} \mathbf{A}_{k}^{\top} \\
&amp;=\mathbf{A}_{k} \cdot C_{k}^{u} \cdot \mathbf{A}_{k}^{\top}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
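&lt;p>The two resulting formulas, $\underline{\hat{y}}_k = \mathbf{A}_k \underline{\hat{u}}_k$ and $C_k^y = \mathbf{A}_k C_k^u \mathbf{A}_k^\top$, can be illustrated with a small numeric sketch. The function name &lt;code>propagate_moments&lt;/code> and the numbers ($P = 2$, $M = 3$) are assumptions for this example.&lt;/p>

```python
import numpy as np


def propagate_moments(A_k, u_hat, C_u):
    """Propagate mean and covariance through the linear map y = A_k u."""
    y_hat = A_k @ u_hat
    C_y = A_k @ C_u @ A_k.T
    return y_hat, C_y


# Hypothetical values: input dimension P = 2, output dimension M = 3.
A_k = np.array([[1.0, 2.0],
                [0.0, 1.0],
                [1.0, -1.0]])
u_hat = np.array([1.0, 2.0])
C_u = np.array([[2.0, 0.5],
                [0.5, 1.0]])

y_hat, C_y = propagate_moments(A_k, u_hat, C_u)
```

&lt;p>Since $M > P$ in this example, $C_k^y$ is a $3 \times 3$ matrix of rank at most 2, i.e. it is symmetric and positive semi-definite but singular.&lt;/p>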
&lt;h2 id="dynamische-systeme">Dynamic systems&lt;/h2>
&lt;ul>
&lt;li>The system response depends not only on the current input $\underline{u}_k$ but also on the current state (analogous to discrete-valued systems)&lt;/li>
&lt;li>States are stored in internal memory&lt;/li>
&lt;li>The complete system (&amp;quot;&lt;strong>Gauss-Markov model&lt;/strong>&amp;quot;) consists of
&lt;ul>
&lt;li>System mapping&lt;/li>
&lt;li>Measurement mapping&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertekontinuierliche_lineare_systeme-Gau%C3%9F-Markov-Modell.drawio.png"
alt="Graphical representation of a dynamic system">&lt;figcaption>
&lt;p>Graphical representation of a dynamic system&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;h3 id="systemabbildung">System mapping&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>&lt;/p>
&lt;p>A linear state-space model is called &lt;mark>&lt;strong>time-invariant&lt;/strong> (linear time-invariant, LTI)&lt;/mark> if the system matrices do not depend on the time index $k$, i.e.&lt;/p>
$$
\mathbf{A}_{k} = \mathbf{A}, \quad \mathbf{B}_{k} = \mathbf{B}
$$
&lt;/span>
&lt;/div>
&lt;p>Temporal evolution (linear)&lt;/p>
$$
\underline{x}_{k+1}=\mathbf{A}_{k} \cdot \underline{x}_{k}+\mathbf{B}_{k} \cdot \underbrace{(\underline{\tilde{u}}_{k}+\underline{w}_{k})}_{=\underline{u}_{k}}
$$
&lt;ul>
&lt;li>
&lt;p>State: random vector $\underline{x}_k \in \mathbb{R}^N, k\in \mathbb{N}_0$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Markov model (of first order): $\underline{x}_{k+1}$ depends ONLY on $\underline{x}_{k}$ and $\underline{u}_{k}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often $\underline{u}_{k}$ is modeled as a known input plus zero-mean noise&lt;/p>
$$
\underline{u}_{k}=\underline{\tilde{u}}_{k}+\underline{w}_{k}
$$
&lt;ul>
&lt;li>$\underline{\tilde{u}}_{k}$ is known&lt;/li>
&lt;li>Random vector $\underline{w}_{k}$ with $E\{\underline{w}_k\} = \underline{0}, \operatorname{Cov}\{\underline{w}_k\} = C_k^w$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
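&lt;p>As a hedged worked step, the moment formulas for static systems can be applied to the system equation. This sketch assumes, beyond what is stated above, that $\underline{x}_k$ and $\underline{w}_k$ are uncorrelated, and it introduces the notation $\underline{\hat{x}}_k := E\{\underline{x}_k\}$ and $C_k^x := \operatorname{Cov}\{\underline{x}_k\}$:&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_{k+1} &amp;=\mathbf{A}_{k} \cdot \underline{\hat{x}}_{k}+\mathbf{B}_{k} \cdot \underline{\tilde{u}}_{k} \\
C_{k+1}^{x} &amp;=\mathbf{A}_{k} \cdot C_{k}^{x} \cdot \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \cdot C_{k}^{w} \cdot \mathbf{B}_{k}^{\top}
\end{aligned}
$$
&lt;p>The known part $\underline{\tilde{u}}_{k}$ only shifts the mean, while the noise $\underline{w}_{k}$ only adds to the covariance.&lt;/p>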
&lt;h3 id="messabbildung">Measurement mapping&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The state $\underline{x}_k$ is typically NOT directly available&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The output $\underline{y}_{k}$ depends on $\underline{x}_k$ and possibly on $\underline{u}_k$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Linear measurement mapping&lt;/p>
$$
\underline{y}_{k}=\mathbf{H}_{k} \cdot \underline{x}_{k}+\underline{v}_{k}
$$
&lt;ul>
&lt;li>$\underline{v}_{k}$: additive zero-mean measurement noise ($E\{\underline{v}_k\} = \underline{0}, \operatorname{Cov}\{\underline{v}_k\} = C_k^v$)&lt;/li>
&lt;li>The measurement mapping is time-invariant if $\mathbf{H}_{k} = \mathbf{H}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="einschub-systemeigenschaften-zeitdiskreter-systeme">Aside: system properties of discrete-time systems&lt;/h2>
&lt;p>For definitions of the system properties of discrete-time systems, see &lt;strong>Signale und Systeme&lt;/strong>&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>, pages 312-314.&lt;/p>
&lt;h3 id="linearitat">Linearity&lt;/h3>
&lt;p>A discrete-time system $\mathcal{S}$ is called &lt;strong>linear&lt;/strong> if, for any two input signals $y_{\mathrm{e} 1, n}$ and $y_{\mathrm{e} 2, n}$ and any two constants $c_1, c_2 \in \mathbb{R}$ or $\mathbb{C}$,&lt;/p>
$$
\mathcal{S}\left\{c_{1} y_{\mathrm{e} 1, n}+c_{2} y_{\mathrm{e} 2, n}\right\}=c_{1} \mathcal{S}\left\{y_{\mathrm{e} 1, n}\right\}+c_{2} \mathcal{S}\left\{y_{\mathrm{e} 2, n}\right\}
$$
&lt;p>holds.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Extension to $N$ input signals&lt;/p>
$$
\mathcal{S}\left\{\sum_{i=1}^{N} c_{i} y_{\mathrm{e} i, n}\right\}=\sum_{i=1}^{N} c_{i} \mathcal{S}\left\{y_{\mathrm{e} i, n}\right\}
$$
&lt;/li>
&lt;li>
&lt;p>Extension to infinitely many input signals&lt;/p>
$$
\mathcal{S}\left\{\sum_{i=-\infty}^{\infty} c_{i} y_{\mathrm{e} i, n}\right\}=\sum_{i=-\infty}^{\infty} c_{i} \mathcal{S}\left\{y_{\mathrm{e} i, n}\right\}
$$
&lt;/li>
&lt;/ul>
&lt;h3 id="zeitinvarianz">Time invariance&lt;/h3>
&lt;p>A discrete-time system $\mathcal{S}$
is called &lt;strong>time-invariant&lt;/strong> if it responds to a time-shifted input signal $y_{\mathrm{e}, n-n_{0}}$
with the correspondingly time-shifted output signal $y_{\mathrm{a}, n-n_{0}}$:&lt;/p>
$$
y_{\mathrm{a}, n}=\mathcal{S}\left\{y_{\mathrm{e}, n}\right\} \quad \Longrightarrow \quad y_{\mathrm{a}, n-n_{0}}=\mathcal{S}\left\{y_{\mathrm{e}, n-n_{0}}\right\}.
$$
&lt;p>Otherwise the system is called &lt;strong>time-variant&lt;/strong>.&lt;/p>
&lt;h3 id="kausalität">Causality&lt;/h3>
&lt;p>A discrete-time system $\mathcal{S}$ is called &lt;strong>causal&lt;/strong> if its response depends ONLY on &lt;em>present&lt;/em> or &lt;em>past&lt;/em> values of the input signal, not on future ones.&lt;/p>
&lt;p>This means that, for a system $\mathcal{S}$,&lt;/p>
$$
y_{\mathrm{e} 1, n}=y_{\mathrm{e} 2, n} \quad \text { for } n \leq n_{1}
$$
&lt;p>and&lt;/p>
$$
y_{\mathrm{a} 1, n}=\mathcal{S}\left\{y_{\mathrm{e} 1, n}\right\}, \quad y_{\mathrm{a} 2, n}=\mathcal{S}\left\{y_{\mathrm{e} 2, n}\right\}
$$
&lt;p>always imply&lt;/p>
$$
y_{\mathrm{a} 1, n}=y_{\mathrm{a} 2, n} \quad \text { for } n \leq n_{1}.
$$
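&lt;p>The definitions of linearity and time invariance can be probed numerically. The following is a minimal sketch, not part of the lecture: the three example systems and all function names are hypothetical, and signals are assumed to be finite lists that are zero before index 0. A failed check is a counterexample that disproves the property; a passed check on a few inputs does not prove it.&lt;/p>

```python
# Hypothetical example systems acting on finite lists (zero before index 0).

def system_fir(x):
    """y[n] = 2*x[n] + x[n-1]: linear and time-invariant."""
    return [2 * x[n] + (x[n - 1] if n >= 1 else 0.0) for n in range(len(x))]


def system_square(x):
    """y[n] = x[n]**2: time-invariant, but not linear."""
    return [v * v for v in x]


def system_ramp_gain(x):
    """y[n] = n * x[n]: linear, but time-variant."""
    return [n * x[n] for n in range(len(x))]


def is_close(a, b, tol=1e-9):
    """Element-wise comparison of two equally long signals."""
    return len(a) == len(b) and all(tol >= abs(p - q) for p, q in zip(a, b))


def check_superposition(system, x1, x2, c1, c2):
    """Test S{c1*x1 + c2*x2} == c1*S{x1} + c2*S{x2} for one pair of inputs."""
    combined = [c1 * a + c2 * b for a, b in zip(x1, x2)]
    lhs = system(combined)
    rhs = [c1 * a + c2 * b for a, b in zip(system(x1), system(x2))]
    return is_close(lhs, rhs)


def check_shift(system, x, n0):
    """Test that delaying the input by n0 samples delays the output by n0 samples."""
    delayed_input = [0.0] * n0 + x
    expected = [0.0] * n0 + system(x)
    return is_close(system(delayed_input), expected)


x1 = [1.0, 2.0, 3.0]
x2 = [0.0, 1.0, -1.0]
for name, system in [("fir", system_fir), ("square", system_square), ("ramp", system_ramp_gain)]:
    print(name, check_superposition(system, x1, x2, 2.0, -3.0), check_shift(system, x1, 2))
```

&lt;p>For these inputs the squaring system fails the superposition check and the ramp-gain system fails the shift check, which matches the way the exercise below argues with a squared term and a time-dependent coefficient.&lt;/p>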
&lt;h2 id="beispiel">Example&lt;/h2>
&lt;p>(Exercise sheet 5, Problem 1)&lt;/p>
&lt;p>A discrete-time, continuous-valued system $S$ is described by the difference equation&lt;/p>
$$
y_{k}-2^{k} \cdot y_{k+1}+3 \cdot y_{k+2}^{2}=4 \cdot u_{k}-2 \cdot u_{k+1}
$$
&lt;ol>
&lt;li>
&lt;p>Is the system $S$ linear?&lt;/p>
&lt;p>The system $S$ is NOT linear because of the term $y_{k+2}^{2}$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Is the system $S$ time-invariant?&lt;/p>
&lt;p>The system $S$ is time-variant because the coefficient $2^k$ of $y_{k+1}$ depends on time.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Is the system $S$ causal?&lt;/p>
&lt;p>The system $S$ is causal, since $y_{k+2}$ depends only on past input values ($u_{k}$ and $u_{k+1}$).&lt;/p>
&lt;/li>
&lt;/ol>
&lt;div class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1">
&lt;p>F. P. León and H. Jäkel. Signale und Systeme. De Gruyter Oldenbourg, Berlin, Boston, 02 Sep. 2019. ISBN 978-3-11-062632-2. doi: &lt;a href="https://doi.org/10.1515/9783110626322">https://doi.org/10.1515/9783110626322&lt;/a>. URL &lt;a href="https://www.degruyter.com/view/title/543041">https://www.degruyter.com/view/title/543041&lt;/a>.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/div></description></item><item><title>Zustandsschätzung: Kalman Filter</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/zustandsschaetzung/</link><pubDate>Thu, 16 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/zustandsschaetzung/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">For a detailed summary of the Kalman filter, see &lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/understanding/kalman_filter/">here&lt;/a>.&lt;/span>
&lt;/div>
&lt;h2 id="prädiktion">Prediction&lt;/h2>
&lt;p>We want to make a &lt;em>one-step&lt;/em> prediction of the state, i.e. at time step $k$ ($k > m$, $m:= \text{\#measurements}$) we predict the state $\underline{x}_{k+1}$&lt;/p>
&lt;p>Model:&lt;/p>
$$
\underline{x}_{k+1}=\mathbf{A}_{k} \cdot \underline{x}_{k}+\mathbf{B}_{k} \cdot \underbrace{\left(\underline{\tilde{u}}_{k}+\underline{w}_{k}\right)}_{=\underline{u}_{k}}
$$
&lt;ul>
&lt;li>
&lt;p>Initial estimate at time step $k$:&lt;/p>
$$
\underline{x}_{k|1:m}
$$
&lt;ul>
&lt;li>based on the measurements $\underline{y}_{1}, \dots, \underline{y}_{m}$
&lt;/li>
&lt;li>and on the inputs $\underline{\tilde{u}}_{0}, \dots, \underline{\tilde{u}}_{k-1}$
&lt;/li>
&lt;li>with mean $\underline{\hat{x}}_{k|1:m}$
and covariance matrix $\mathbf{C}_{k|1:m}^x$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Computing the mean for $k+1$&lt;/p>
$$
\begin{aligned}
&amp;E\left\{\underline{x}_{k+1}\right\}\\\\
=&amp;E\left\{\mathbf{A}_{k} \cdot \underline{x}_{k}+\mathbf{B}_{k}\left(\underline{\tilde{u}}_{k}+\underline{w}_{k}\right)\right\}\\\\
=&amp;E\left\{\mathbf{A}_{k} \cdot \underline{x}_{k}+\mathbf{B}_{k} \underline{\tilde{u}}_{k}+\mathbf{B}_{k} \underline{w}_{k}\right\}\\\\
=&amp;\mathbf{A}_{k} \cdot E\left\{\underline{x}_{k}\right\}+\mathbf{B}_{k} \cdot \underbrace{E\left\{\underline{\tilde{u}}_{k}\right\}}_{=\underline{\tilde{u}}_{k} \text{ (since } \underline{\tilde{u}}_{k} \text{ is known)}}+\mathbf{B}_{k} \cdot\underbrace{E\left\{\underline{w}_{k}\right\}}_{=\underline{0} \text{ (zero-mean)}}\\\\
=&amp;\mathbf{A}_{k} \cdot \underline{\hat{x}}_{k|1: m}+\mathbf{B}_{k} \underline{\tilde{u}}_{k} \qquad (+)
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Computing the covariance matrix $\mathbf{C}_{k+1|1:m}^x$
&lt;/p>
$$
\begin{aligned}
\underline{x}_{k+1} &amp;=\mathbf{A}_{k} \underline{x}_{k}+\mathbf{B}_{k} \underline{u}_{k} \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]
\end{aligned}
$$
$$
\begin{aligned}
\underline{x}_{k+1}-\hat{\underline{x}}_{k+1} &amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k}-\hat{\underline{x}}_{k} \\
\underline{u}_{k}-\underline{\hat{u}}_{k}
\end{array}\right] \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k}-\underline{\hat{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]
\end{aligned}
$$
&lt;p>Assumption: the state and the system noise are uncorrelated&lt;/p>
$$
\begin{aligned}
\operatorname{Cov}\left\{\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]\right\} &amp;=E\left\{\left[\begin{array}{c}
\underline{x}_{k}-\underline{\hat{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]\left[\left(\underline{x}_{k}-\underline{\hat{x}}_{k}\right)^{\top} \underline{w}_{k}^{\top}\right]\right\} \\
&amp;=\left[\begin{array}{cc}
C_{k \mid 1: m}^{x} &amp; 0 \\
0 &amp; C_{k}^{w}
\end{array}\right]
\end{aligned}
$$
$$
\begin{aligned}
\mathbf{C}_{k+1 \mid 1 : m}^{x} &amp;=E\left\{\left(\underline{x}_{k+1}-\hat{\underline{x}}_{k+1}\right)\left(\underline{x}_{k+1} - \hat{\underline{x}}_{k+1}\right)^\top\right\} \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right] \cdot E\left\{\left[\begin{array}{c}
\underline{x}_{k}-\hat{\underline{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]\left[\begin{array}{ll}
\underline{x}_{k}-\hat{\underline{x}}_{k} &amp; \underline{w}_{k}
\end{array}\right]^\top\right\} \cdot\left[\begin{array}{l}
\mathbf{A}_{k}^{\top} \\
\mathbf{B}_{k}^{\top}
\end{array}\right] \\\\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right] \cdot\left[\begin{array}{cc}
\mathbf{C}_{k \mid 1:m}^{x} &amp; 0 \\
0 &amp; \mathbf{C}_{k}^{w}
\end{array}\right] \cdot\left[\begin{array}{l}
\mathbf{A}_{k}^{\top} \\
\mathbf{B}_{k}^{\top}
\end{array}\right] \\
&amp;=\mathbf{A}_{k} \cdot \mathbf{C}_{k \mid 1: m}^{x} \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top} \qquad(++)
\end{aligned}
$$
&lt;/li>
&lt;/ul>
&lt;p>Recursive prediction&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Start with the mean $\underline{\hat{x}}_{m|1:m}$
and the covariance matrix $\mathbf{C}_{m|1:m}^x$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Recursion with $(+)$ and $(++)$ for $k > m$&lt;/p>
&lt;/li>
&lt;/ul>
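&lt;p>The prediction step $(+)$, $(++)$ can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper functions, the function name &lt;code>predict&lt;/code> and the constant-velocity example values are hypothetical and do not come from the lecture or the exercise sheets. Vectors are stored as one-column matrices (lists of rows).&lt;/p>

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matadd(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def predict(x_hat, C_x, A, B, u_tilde, C_w):
    """One prediction step: mean via (+), covariance via (++)."""
    x_pred = matadd(matmul(A, x_hat), matmul(B, u_tilde))
    C_pred = matadd(matmul(matmul(A, C_x), transpose(A)),
                    matmul(matmul(B, C_w), transpose(B)))
    return x_pred, C_pred


# Hypothetical example: state = (position, velocity), scalar input.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.5], [1.0]]
x_hat = [[0.0], [1.0]]            # current mean
C_x = [[1.0, 0.0], [0.0, 1.0]]    # current covariance
u_tilde = [[2.0]]                 # known input
C_w = [[0.1]]                     # covariance of the system noise

x_pred, C_pred = predict(x_hat, C_x, A, B, u_tilde, C_w)
print(x_pred)
print(C_pred)
```

&lt;p>Calling &lt;code>predict&lt;/code> again on its own output gives the recursive prediction. Without new measurements, the term $\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top}$ is added to the covariance in every step.&lt;/p>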
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Examples: Exercise sheet 5, Problem 4&lt;/span>
&lt;/div>
&lt;h2 id="filterung">Filtering&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Reminder&lt;/strong>&lt;/p>
&lt;p>&lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/statische_systeme/#dynamische-systeme">Structure of the dynamic system&lt;/a>&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertekontinuierliche_lineare_systeme-Gau%C3%9F-Markov-Modell.drawio.png"
alt="Graphical representation of dynamic systems">&lt;figcaption>
&lt;p>Graphical representation of dynamic systems&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;p>&lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_lineare_systeme/statische_systeme/#messabbildung">Measurement mapping&lt;/a>&lt;/p>
$$
\underline{y}_{k}=\mathbf{H}_{k} \cdot \underline{x}_{k}+\underline{v}_{k}
$$
&lt;/span>
&lt;/div>
&lt;p>Approach: &lt;strong>linear estimator&lt;/strong>&lt;/p>
$$
\underline{x}_{k \mid 1: k}=\mathbf{K}_{k}^{(1)} \underline{x}_{k \mid 1: k-1}+\mathbf{K}_{k}^{(2)} \underline{y}_{k} \qquad(\ast)
$$
&lt;p>🎯 We are looking for the so-called &lt;strong>BLUE filter&lt;/strong> (Best Linear Unbiased Estimator) 💪&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>An estimator is called &lt;mark>&lt;strong>unbiased&lt;/strong>&lt;/mark> if its expected value equals the true value of the parameter being estimated.&lt;/p>
&lt;p>If an estimator is not unbiased, it is called &lt;mark>&lt;strong>biased&lt;/strong>&lt;/mark>. The deviation of its expected value from the true value is called the &lt;strong>bias&lt;/strong>. The bias expresses the systematic error of the estimator.&lt;/p>
&lt;p>Source and examples: &lt;a href="https://de.wikipedia.org/wiki/Erwartungstreue">Wiki&lt;/a>&lt;/p>
&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>Unbiasedness&lt;/p>
$$
\begin{aligned}
E\left\{\underline{x}_{k \mid 1: k}\right\}&amp;=E\left\{\mathbf{K}_{k}^{(1)} \underline{x}_{k \mid 1: k-1}+\mathbf{K}_{k}^{(2)} \underline{y}_{k}\right\} \\
E\left\{\underline{x}_{k \mid 1: k}\right\}&amp;=\mathbf{K}_{k}^{(1)} E\left\{\underline{x}_{k \mid 1: k-1}\right\}+\mathbf{K}_{k}^{(2)} E\left\{\underline{y}_{k}\right\} \\
E\left\{\underline{x}_{k \mid 1: k}\right\}&amp;=\mathbf{K}_{k}^{(1)} E\left\{\underline{x}_{k \mid 1: k-1}\right\}+\mathbf{K}_{k}^{(2)} E\left\{\mathbf{H}_{k} \cdot \underline{x}_{k}+\underline{v}_{k}\right\} \\
E\left\{\underline{x}_{k \mid 1: k}\right\}&amp;=\mathbf{K}_{k}^{(1)} E\left\{\underline{x}_{k \mid 1: k-1}\right\}+\mathbf{K}_{k}^{(2)} \mathbf{H}_{k} E\left\{\underline{x}_{k}\right\} \quad \mid \text { unbiased } \\
\underline{\tilde{x}}&amp;=\mathbf{K}_{k}^{(1)} \underline{\tilde{x}}+\mathbf{K}_{k}^{(2)} \mathbf{H}_{k} \cdot \underline{\tilde{x}} \\
\Rightarrow \mathbf{I} &amp;=\mathbf{K}_{k}^{(1)}+\mathbf{K}_{k}^{(2)} \mathbf{H}_{k}
\end{aligned}
$$
&lt;p>For example:&lt;/p>
$$
\begin{aligned}
\mathbf{K}_{k}^{(1)} &amp;= \mathbf{I} - \mathbf{K}_{k}\mathbf{H}_{k} \\
\mathbf{K}_{k}^{(2)} &amp;= \mathbf{K}_{k}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
&lt;p>Substituting into $(\ast)$:&lt;/p>
$$
\underbrace{\underline{x}_{k \mid 1: k}}_{=: \underline{x}_{k}^{e}}=\left(\mathbf{I}-\mathbf{K}_{k}\mathbf{H}_{k} \right) \underbrace{\underline{x}_{k \mid 1: k-1}}_{=: \underline{x}_{k}^{p}}+\mathbf{K}_{k} \underline{y}_{k} \qquad(* *)
$$
&lt;p>However, the estimator is not yet fully determined, because $\mathbf{K}_{k}$ has not been fixed.&lt;/p>
&lt;p>$\Rightarrow$ We look for the $\mathbf{K}_{k}$ for which the resulting estimator has MINIMAL covariance (&amp;ldquo;minimum-variance estimator&amp;rdquo;).&lt;/p>
&lt;p>Assume that the measurement noise is uncorrelated with the prior estimate. From $(\ast\ast)$ it follows that&lt;/p>
$$
\underbrace{\mathbf{C}_{k \mid 1: k}^{x}\left(\mathbf{K}_{k}\right)}_{=: \mathbf{C}_{k}^{e}\left(\mathbf{K}_{k}\right)}=\left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right) \underbrace{\mathbf{C}_{k \mid 1: k-1}^{x}}_{=: \mathbf{C}_{k}^{p}}\left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right)^{\top}+\mathbf{K}_{k} \mathbf{C}_{k}^{v} \mathbf{K}_{k}^{\top} \qquad(\ast\ast\ast)
$$
&lt;p>We now regard the filter covariance $\mathbf{C}_{k}^{e}$
as a function of $\mathbf{K}_{k}$,
i.e. $\mathbf{C}_{k}^{e}(\mathbf{K}_k)$.
The goal is to find the $\mathbf{K}_{k}$
that makes the filter covariance as small as possible.&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>Trick: reduce the problem to a &lt;strong>scalar quality measure&lt;/strong>&lt;/p>
&lt;p>That is, to compare covariance matrices in general, we use functions that map an $n \times n$ matrix to $\mathbb{R}^1$. In other words, they assign a scalar to a covariance matrix, because only scalars can be compared directly.&lt;/p>
&lt;p>For example, projection with an arbitrary unit vector $\underline{e}$&lt;/p>
$$
P(\mathbf{K}) = \underline{e}^\top \cdot \mathbf{C}_e(\mathbf{K}) \cdot \underline{e}
$$
&lt;p>MINIMAL covariance $\Leftrightarrow$ $P(\mathbf{K})$ is minimal for every unit vector $\underline{e}$.&lt;/p>
&lt;p>Other possible scalar quality measures:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\operatorname{Spur}(\cdot)$: trace, i.e. the sum of the diagonal elements&lt;/p>
$$
\begin{equation}
\operatorname{Spur}(\mathbf{C})=\sigma_{x}^{2}+\sigma_{y}^{2}
\end{equation}
$$
&lt;/li>
&lt;li>
&lt;p>$\operatorname{det}(\cdot)$: determinant, i.e. the product of the eigenvalues (shown here for a diagonal $2 \times 2$ covariance matrix)&lt;/p>
$$
\operatorname{det}(\mathbf{C})=\sigma_{x}^{2} \cdot \sigma_{y}^{2}
$$
&lt;/li>
&lt;/ul>
&lt;details class="spoiler " id="spoiler-3">
&lt;summary class="cursor-pointer">Example&lt;/summary>
&lt;div class="rounded-lg bg-neutral-50 dark:bg-neutral-800 p-2">
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-03%2010.33.08.png" alt="Screenshot 2022-07-03 10.33.08">
&lt;/div>
&lt;/details>
&lt;/span>
&lt;/div>
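&lt;p>To see how such scalar measures behave, the following small sketch evaluates the projection, the trace and the determinant for two covariance matrices. The matrices and function names are made up for illustration, and the determinant helper only covers the $2 \times 2$ case.&lt;/p>

```python
import math


def trace(C):
    """Sum of the diagonal elements."""
    return sum(C[i][i] for i in range(len(C)))


def det2(C):
    """Determinant of a 2 x 2 matrix."""
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]


def projection(C, e):
    """Scalar measure e^T C e for a unit vector e."""
    n = len(C)
    return sum(e[i] * C[i][j] * e[j] for i in range(n) for j in range(n))


# Two hypothetical covariance matrices with the same trace.
C_a = [[4.0, 1.0], [1.0, 2.0]]
C_b = [[3.0, 0.0], [0.0, 3.0]]

e_x = [1.0, 0.0]
e_y = [0.0, 1.0]
e_diag = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

for name, C in [("C_a", C_a), ("C_b", C_b)]:
    print(name, trace(C), det2(C),
          projection(C, e_x), projection(C, e_y), projection(C, e_diag))
```

&lt;p>Both matrices have trace 6, but $\mathbf{C}_a$ has the smaller determinant (7 versus 9). Along $\underline{e}_x$ the matrix $\mathbf{C}_b$ is smaller (3 versus 4), along $\underline{e}_y$ it is larger (3 versus 2). This is why &amp;ldquo;minimal&amp;rdquo; is required for every unit vector $\underline{e}$ rather than for a single direction.&lt;/p>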
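&lt;p>Differentiating with the &lt;a href="https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/math/matrix_differenzieren/">matrix differentiation rules&lt;/a> (the time index $k$ is dropped, and $\mathbf{C}_{y}$ denotes the covariance of the measurement noise):&lt;/p>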
$$
\begin{aligned}
\frac{\partial}{\partial \mathbf{K}} P(\mathbf{K}) &amp;=\frac{\partial}{\partial \mathbf{K}}\left\{\underline{e}^{\top}\left[(\mathbf{I}-\mathbf{K} \mathbf{H}) \mathbf{C}_{p}(\mathbf{I}-\mathbf{K} \mathbf{H})^{\top}+\mathbf{K} \mathbf{C}_{y} \mathbf{K}^{\top}\right] \underline{e}\right\} \\
&amp;=\frac{\partial}{\partial \mathbf{K}}\left\{\underline{e}^{\top}\left[\mathbf{C}_{p}-\mathbf{C}_{p} \mathbf{H}^{\top} \mathbf{K}^{\top}-\mathbf{K} \mathbf{H} \mathbf{C}_{p}+\mathbf{K} \mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top} \mathbf{K}^{\top}+\mathbf{K} \mathbf{C}_{y} \mathbf{K}^{\top}\right] \underline{e}\right\} \\
&amp;=-\left[\mathbf{H} \mathbf{C}_{p} \underline{e} \underline{e}^{\top}\right]^{\top}-\underline{e} \underline{e}^{\top}\left(\mathbf{H} \mathbf{C}_{p}\right)^{\top}+2 \underline{e} \underline{e}^{\top} \mathbf{K} \mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}+2 \underline{e} \underline{e}^{\top} \cdot \mathbf{K} \mathbf{C}_{y} \\
&amp;\overset{!}{=} \mathbf{0}
\end{aligned}
$$
&lt;p>Hence&lt;/p>
$$
\begin{array}{l}
-\mathbf{C}_{p} \cdot \mathbf{\mathbf{H}}^{\top}-\mathbf{C}_{p} \mathbf{H}^{\top}+2 \mathbf{K} \mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}+2 \mathbf{K} \mathbf{C}_{y} \stackrel{!}{=} \mathbf{0} \\
\mathbf{K}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)=\mathbf{C}_{p} \mathbf{H}^{\top} \\
\mathbf{K}=\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{\mathbf{H}}^{\top}\right)^{-1} \quad \text { (Kalman gain) }
\end{array}
$$
&lt;p>Substituting $\mathbf{K}$ into $(\ast \ast)$:&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_{e} &amp;=(\mathbf{I}-\mathbf{K} \mathbf{H}) \underline{\hat{x}}_{p}+\mathbf{K} \cdot \underline{\hat{y}} \qquad \text { (combination form) } \\
&amp;=\underline{\hat{x}}_{p}+\mathbf{K}\left(\underline{\hat{y}}-\mathbf{H} \cdot \underline{\hat{x}}_{p}\right) \qquad \text { (feedback form) } \\
&amp;=\underline{\hat{x}}_{p}+\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1}\left(\underline{\hat{y}}-\mathbf{H} \cdot \underline{\hat{x}}_{p}\right)
\end{aligned}
$$
&lt;p>This is the &lt;mark>&lt;strong>Kalman filter&lt;/strong>&lt;/mark>.&lt;/p>
&lt;p>Now substitute $\mathbf{K}$ into $(\ast \ast \ast)$ to compute the covariance matrix.&lt;/p>
$$
\begin{aligned}
\mathbf{C}_{e}=&amp; {\left[\mathbf{I}-\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H}\right] \cdot \mathbf{C}_{p} } \\
&amp; \cdot\left[\mathbf{I}-\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H}\right]^{\top} \\
&amp;+\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{C}_{y}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p} \\\\
=&amp; \mathbf{C}_{p}-2 \mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p} \\
&amp;+\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p} \\
&amp;+\mathbf{C}_{p} \mathbf{H}^{\top}(\underbrace{\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}}_{=:\mathbf{D}})^{-1} \mathbf{C}_{y}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p}\\\\
=&amp; \mathbf{C}_{p}-2 \mathbf{C}_{p} \mathbf{H}^{\top} \mathbf{D}^{-1} \mathbf{H} \mathbf{C}_{p}+\mathbf{C}_{p} \mathbf{H}^{\top} \mathbf{D}^{-1} \mathbf{D} \mathbf{D}^{-1} \mathbf{H} \mathbf{C}_{p} \\\\
=&amp; \mathbf{C}_{p}-\mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{y}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \mathbf{H} \mathbf{C}_{p}
\end{aligned}
$$
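&lt;p>The filter step can be checked numerically. The sketch below is an illustration only and makes simplifying assumptions: the measurement is scalar, so $\mathbf{H}$ is a single row and $\mathbf{C}_{y}$ a variance, which turns the matrix inverse into a division. All function names and numbers are hypothetical. It compares the short covariance formula $\mathbf{C}_{p}-\mathbf{K}\mathbf{H}\mathbf{C}_{p}$ with the general form $(\ast\ast\ast)$.&lt;/p>

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matadd(a, b, sign=1.0):
    """a + sign * b, element-wise."""
    return [[p + sign * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def kalman_update(x_p, C_p, H, y, C_y):
    """Filter step for a scalar measurement y (H is 1 x N, C_y is a variance)."""
    Ht = transpose(H)
    d = matmul(matmul(H, C_p), Ht)[0][0] + C_y       # C_y + H C_p H^T (scalar here)
    K = [[row[0] / d] for row in matmul(C_p, Ht)]    # Kalman gain, N x 1
    innovation = y - matmul(H, x_p)[0][0]            # y - H x_p
    x_e = [[x_p[i][0] + K[i][0] * innovation] for i in range(len(x_p))]
    C_e = matadd(C_p, matmul(matmul(K, H), C_p), sign=-1.0)  # C_p - K H C_p
    return x_e, C_e, K


def general_covariance(C_p, H, K, C_y):
    """(I - K H) C_p (I - K H)^T + K C_y K^T, valid for any gain K."""
    n = len(C_p)
    KH = matmul(K, H)
    I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(n)] for i in range(n)]
    first = matmul(matmul(I_KH, C_p), transpose(I_KH))
    second = [[K[i][0] * C_y * K[j][0] for j in range(n)] for i in range(n)]
    return matadd(first, second)


# Hypothetical prior (predicted) estimate; only the first state is measured.
x_p = [[2.0], [3.0]]
C_p = [[2.0, 1.0], [1.0, 1.0]]
H = [[1.0, 0.0]]
y = 2.5
C_y = 0.5

x_e, C_e, K = kalman_update(x_p, C_p, H, y, C_y)
C_check = general_covariance(C_p, H, K, C_y)
print(x_e, C_e, C_check)
```

&lt;p>Up to rounding, both covariance formulas agree for the Kalman gain. For any other gain only the general form $(\ast\ast\ast)$ is valid, since the simplification in the derivation above uses the specific $\mathbf{K}$.&lt;/p>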
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Step-by-step derivation: Exercise sheet 6, Problem 1 (very detailed and helpful! 👍)&lt;/span>
&lt;/div>
&lt;h2 id="beispiel">Examples&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Complete Kalman filter: Exercise sheet 5, Problem 3&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Prediction: Exercise sheet 5, Problem 4&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Filtering: Exercise sheet 6, Problem 1&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Wertekontinuierliche Nichtlineare Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/</link><pubDate>Thu, 30 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/</guid><description/></item><item><title>Statische und Dynamische Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/statische_und_dynamische_systeme/</link><pubDate>Thu, 30 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/statische_und_dynamische_systeme/</guid><description>&lt;h2 id="statische-systeme">Statische Systeme&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Input/output: random vectors $\underline{u}_k$ and $\underline{y}_k$ ($k \in \mathbb{N}_0$ is the time step)&lt;/p>
&lt;ul>
&lt;li>$\underline{u}_k \in \mathbb{R}^P$ and $\underline{y}_k \in \mathbb{R}^M$ are continuous-valued&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$\underline{u}_k$ is mapped to $\underline{y}_k$ by a &lt;strong>nonlinear&lt;/strong> mapping&lt;/p>
$$
\underline{y}_{k}=\underline{a}_{k}\left(\underline{u}_{k}\right)
\tag{Generatives Modell}
$$
&lt;/li>
&lt;li>
&lt;p>The uncertainty in $\underline{u}_k$ and $\underline{y}_k$ is described by densities&lt;/p>
&lt;ul>
&lt;li>$\underline{u}_k$
: $f_{k}^{u}\left(\underline{u}_{k}\right)$
&lt;/li>
&lt;li>$\underline{y}_k$
: $f_k^y(\underline{y}_k)$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Sought: $f_k^y(\underline{y}_k)$
for a given $f_{k}^{u}\left(\underline{u}_{k}\right)$
&lt;/p>
&lt;/li>
&lt;/ul>
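&lt;p>One simple way to get a feeling for the output density is to push samples of the input through the nonlinear mapping. The sketch below is not a method taken from these notes. It assumes a hypothetical scalar mapping $a(u)=u^2$ and a standard normal input, for which the exact values $E\{y\}=1$ and $\operatorname{Var}\{y\}=2$ are known, and it only estimates moments of $f^y$, not the density itself.&lt;/p>

```python
import random


def propagate_samples(a, draw_u, n_samples, seed=0):
    """Draw input samples and map each one through the nonlinear function a."""
    rng = random.Random(seed)
    return [a(draw_u(rng)) for _ in range(n_samples)]


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def a(u):
    """Hypothetical nonlinear mapping y = u^2."""
    return u * u


def draw_standard_normal(rng):
    """Input density f^u: standard normal (an assumption for this example)."""
    return rng.gauss(0.0, 1.0)


y_samples = propagate_samples(a, draw_standard_normal, 20000)
y_mean = sample_mean(y_samples)
y_var = sample_variance(y_samples)
print(y_mean, y_var)
```

&lt;p>The sample mean and variance should land close to 1 and 2. A histogram of &lt;code>y_samples&lt;/code> would show that the output is clearly not Gaussian, even though the input is.&lt;/p>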
&lt;h2 id="dynamische-systeme">Dynamic systems&lt;/h2>
&lt;h3 id="systemabbildung">System mapping&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>State $\underline{x}_k, k \in \mathbb{N}_0$
with $\underline{x}_k \in \mathbb{R}^N$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Nonlinear system (general)&lt;/p>
$$
\underline{x}_{k+1}=\underline{a}_{k}\left(\underline{x}_{k}, \underline{\hat{u}}_{k}, \underline{w}_{k}\right)
$$
&lt;/li>
&lt;li>
&lt;p>$\underline{x}_k$
is described by the density $f_k^x(\underline{x}_k)$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Special noise structure: &lt;strong>additive noise&lt;/strong>&lt;/p>
$$
\underline{x}_{k+1}=\underline{a}_{k}\left(\underline{x}_{k}, \underline{\hat{u}}_{k}\right)+\underline{w}_{k}
$$
&lt;ul>
&lt;li>
&lt;p>The system noise $\underline{w}_k$
is described by the density $f_k^w(\underline{w}_k)$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Typical assumptions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\underline{w}_k$
is Gaussian distributed with known parameters&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$\underline{w}_k$
is white noise&lt;/p>
&lt;blockquote>
&lt;p>White noise: uncertainties taken at different time steps are independent&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="messabbildung">Measurement mapping&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Nonlinear mapping (general)&lt;/p>
$$
\underline{y}_{k}=\underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)
$$
&lt;ul>
&lt;li>
&lt;p>Special case: &lt;strong>additive noise&lt;/strong>&lt;/p>
$$
\underline{y}_{k}=\underline{h}_{k}\left(\underline{x}_{k}\right) + \underline{v}_{k}
$$
&lt;p>The noise $\underline{v}_{k}$
is described by $f_k^v(\underline{v}_k)$
&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="gesammtsystem">Overall system&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system.drawio.png" alt="wertkontinuierliche_nichtlineare_system.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Note: The system is encapsulated. From the outside we can only observe $\underline{\hat{u}}_{k}$
and $\underline{y}_k$.&lt;/span>
&lt;/div>
&lt;h2 id="lineare-vs-nichtlineare-systeme">Linear vs. nonlinear systems&lt;/h2>
&lt;style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
.tg .tg-7btt{border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
&lt;/style>
&lt;table class="tg">
&lt;thead>
&lt;tr>
&lt;th class="tg-c3ow">&lt;/th>
&lt;th class="tg-7btt">Linear&lt;/th>
    &lt;th class="tg-7btt">Nonlinear&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
    &lt;td class="tg-7btt">System mapping&lt;/td>
    &lt;td class="tg-c3ow">$\underline{x}_{k+1} = \mathbf{A}_k \underline{x}_k + \mathbf{B}_k (\underline{\tilde{u}}_k + \underline{w}_k)$&lt;/td>
    &lt;td class="tg-c3ow">$\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k, \underline{\hat{u}}_k, \underline{w}_k)$&lt;/td>
&lt;/tr>
&lt;tr>
    &lt;td class="tg-7btt">Measurement mapping&lt;/td>
&lt;td class="tg-c3ow">$\underline{y}_{k} = \mathbf{H}_k \underline{x}_k + \underline{v}_k$&lt;/td>
&lt;td class="tg-c3ow">$\underline{y}_k = \underline{h}_k (\underline{x}_k, \underline{v}_k)$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table></description></item><item><title>Nichtlineare Schätzung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/nichtlineare_schaetzung/</link><pubDate>Thu, 30 Jun 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/nichtlineare_schaetzung/</guid><description>&lt;h2 id="approximation-durch-linearisierung">Approximation durch Linearisierung&lt;/h2>
&lt;p>Idea&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Linearize the nonlinear function&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-07-03%2021.42.26.png" alt="Screenshot 2022-07-03 21.42.26" style="zoom: 25%;" />
&lt;/li>
&lt;li>
&lt;p>Apply the (standard/linear) Kalman filter&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h3 id="systemmodell">System model&lt;/h3>
$$
\underline{x}_{k+1}=\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)
\tag{Systemmodell}
$$
&lt;p>Linearization of the right-hand side of $\text{(Systemmodell)}$ by a Taylor expansion around $\underline{\overline{x}}_k, \underline{\overline{u}}_k$
:&lt;/p>
$$
\begin{aligned}
\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right) = &amp;\ \underline{a}_{k}\left(\underline{\overline{x}}_k, \underline{\overline{u}}_k\right) + \overbrace{\left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\bar{x}}_{k}, \underline{u}_{k}=\underline{\bar{u}}_{k}}}^{=\mathbf{A}_{k}} \cdot \overbrace{(\underline{x}_{k} - \underline{\overline{x}}_k)}^{=\Delta \underline{x}_{k}} \\\\
&amp; + \underbrace{\left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{u}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\bar{x}}_{k}, \underline{u}_{k}=\underline{\bar{u}}_{k}}}_{=\mathbf{B}_{k}} \cdot \underbrace{(\underline{u}_{k} - \underline{\overline{u}}_k)}_{=\Delta \underline{u}_{k}} + \text{THO}
\end{aligned}
$$
&lt;ul>
&lt;li>
&lt;p>$\text{THO}$: terms of higher order&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Jacobian matrices&lt;/p>
$$
\begin{array}{l}
\mathbf{A}_{k}=\left[\begin{array}{ccc}
\frac{\partial a_{k, 1}}{\partial x_{k, 1}} &amp; \cdots &amp; \frac{\partial a_{k, 1}}{\partial x_{k, N}} \\
\vdots &amp; &amp; \vdots \\
\frac{\partial a_{k, N}}{\partial x_{k, 1}} &amp; \cdots &amp; \frac{\partial a_{k, N}}{\partial x_{k, N}}
\end{array}\right]_{\underline{x}_{k}=\overline{\underline{x}}_{k}, \underline{u}_{k}= \bar{\underline{u}}_{k}} \\\\
\mathbf{B}_{k}=\left[\begin{array}{ccc}
\frac{\partial a_{k, 1}}{\partial u_{k, 1}} &amp; \cdots &amp; \frac{\partial a_{k, 1}}{\partial u_{k, P}} \\
\vdots &amp; &amp; \vdots \\
\frac{\partial a_{k, N}}{\partial u_{k, 1}} &amp; \cdots &amp; \frac{\partial a_{k, N}}{\partial u_{k, P}}
\end{array}\right]_{\underline{x}_{k}=\overline{\underline{x}}_{k}, \underline{u}_{k}= \bar{\underline{u}}_{k}}
\end{array}
$$
&lt;/li>
&lt;/ul>
&lt;p>Assumptions&lt;/p>
&lt;ul>
&lt;li>The derivatives exist&lt;/li>
&lt;li>$\underline{a}_k(\cdot, \cdot)$
is sufficiently linear around $\underline{\overline{x}}_k, \underline{\overline{u}}_k$
&lt;/li>
&lt;/ul>
&lt;p>Neglecting $\text{THO} \Rightarrow$&lt;/p>
$$
\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right) \approx \underline{a}_{k}\left(\underline{\overline{x}}_k, \underline{\overline{u}}_k\right)+\mathbf{A}_{k}\left(\underline{x}_k-\underline{\overline{x}}_k\right)+\mathbf{B}_{k}\left(\underline{u}_{k}-\underline{\overline{u}}_k\right)
$$
&lt;p>For the left-hand side of $(\text{Systemmodell})$:&lt;/p>
$$
\underline{x}_{k+1}= \underline{\overline{x}}_{k+1} + \Delta \underline{x}_{k+1}
$$
&lt;p>For $\underline{u}_{k}$
we define&lt;/p>
$$
\underline{u}_{k}:=\underline{\hat{u}}_{k}+\underline{w}_{k}
$$
&lt;p>with $E\left\{\underline{w}_{k}\right\}=\underline{0}, \operatorname{Cov}\left\{\underline{w}_{k}\right\}=\mathbf{C}_{k}^{w}$
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Linearization: $\overline{\underline{u}}_{k} \overset{!}{=} \hat{\underline{u}}_{k} \Rightarrow \Delta \underline{u}_{k}= \underline{u}_{k} -\overline{\underline{u}}_{k} = \underline{w}_{k}$
(i.e. the deviation $\underline{w}_k$
is a noise term)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Equivalent noise&lt;/p>
$$
\underline{w}_{k}^{\prime}=\mathbf{B}_{k} \cdot \underline{w}_{k} \Rightarrow E\left\{\underline{w}_{k}^{\prime}\right\}=\underline{0}, \operatorname{Cov}\left\{\underline{w}_{k}^{\prime}\right\}=\mathbf{B}_{k} \cdot \mathbf{C}_{k}^{w} \cdot \mathbf{B}_{k}^{\top}
$$
&lt;/li>
&lt;/ul>
&lt;p>With both sides linearized as above, the system model can be written as:&lt;/p>
$$
\underline{\overline{x}}_{k+1} + \Delta \underline{x}_{k+1} \approx \underline{a}_{k}\left(\underline{\overline{x}}_k, \underline{\overline{u}}_k\right)+\mathbf{A}_{k}\Delta \underline{x}_k+\underline{w}_k^\prime
$$
&lt;ul>
&lt;li>
&lt;p>Nominal part&lt;/p>
$$
\underline{\overline{x}}_{k+1} = \underline{a}_{k}\left(\underline{\overline{x}}_k, \underline{\overline{u}}_k\right)
$$
&lt;/li>
&lt;li>
&lt;p>Differential part&lt;/p>
$$
\Delta \underline{x}_{k+1} \approx \mathbf{A}_{k}\Delta \underline{x}_k+\underline{w}_k^\prime
$$
&lt;/li>
&lt;/ul>
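&lt;p>The split into a nominal part and a differential part can be tried out numerically. The following is a minimal scalar sketch, not part of the lecture: the system function &lt;code>a&lt;/code>, the nominal point and the finite-difference step &lt;code>eps&lt;/code> are hypothetical choices, and the Jacobians $\mathbf{A}_k$, $\mathbf{B}_k$ are approximated by central differences instead of being derived analytically.&lt;/p>

```python
import math

def a(x, u):
    """Hypothetical nonlinear system function a_k(x_k, u_k) (illustration only)."""
    return math.sin(x) + u

def jacobians(f, x_bar, u_bar, eps=1e-6):
    """Central-difference approximations of A = da/dx and B = da/du at the nominal point."""
    A = (f(x_bar + eps, u_bar) - f(x_bar - eps, u_bar)) / (2 * eps)
    B = (f(x_bar, u_bar + eps) - f(x_bar, u_bar - eps)) / (2 * eps)
    return A, B

x_bar, u_bar = 0.5, 0.1          # nominal point (assumed values)
A, B = jacobians(a, x_bar, u_bar)

def a_lin(x, u):
    """First-order Taylor model: nominal part plus A * dx + B * du."""
    return a(x_bar, u_bar) + A * (x - x_bar) + B * (u - u_bar)

err_near = abs(a(0.51, 0.1) - a_lin(0.51, 0.1))   # close to the nominal point
err_far = abs(a(1.50, 0.1) - a_lin(1.50, 0.1))    # far from the nominal point
print(A, B, err_near, err_far)
```

&lt;p>The approximation error is tiny close to the nominal point and grows quickly away from it, which is why the assumption "sufficiently linear around the nominal point" matters.&lt;/p>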
&lt;h3 id="messgleichung">Messgleichung&lt;/h3>
$$
\underline{y}_{k}=\underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)
\tag{Messgleichung}
$$
&lt;p>Linearization of the right-hand side around $\underline{\bar{x}}_{k}, \underline{\bar{v}}_{k}$
:&lt;/p>
$$
\underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right) \approx \underline{h}_{k}\left(\underline{\bar{x}}_{k}, \underline{\bar{v}}_{k}\right)+\mathbf{H}_{k} \cdot \underbrace{\left(\underline{x}_{k}-\underline{\bar{x}}_{k}\right)}_{=\Delta \underline{x}_k}+\mathbf{L}_{k} \cdot\left(\underline{v}_{k}-\underline{\bar{v}}_{k}\right)
$$
&lt;p>with the Jacobian matrices&lt;/p>
$$
\mathbf{H}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\bar{x}}_{k}, \underline{v}_{k}=\underline{\bar{v}}_{k}} \qquad
\mathbf{L}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{v}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\bar{x}}_{k}, \underline{v}_{k}=\underline{\bar{v}}_{k}}
$$
&lt;p>Choose $\underline{\bar{v}}_{k} = \underline{\hat{v}}_{k}$;
for &lt;a href="https://de.wikipedia.org/wiki/Mittelwertfreiheit">zero-mean&lt;/a> $\underline{v}_{k}$ this means $\underline{\hat{v}}_{k} = \underline{0}$
&lt;/p>
&lt;p>The effective noise is then&lt;/p>
$$
\underline{v}_{k}^\prime = \mathbf{L}_{k} \cdot \underline{v}_{k}
$$
&lt;p>with&lt;/p>
$$
E\left\{\underline{v}_{k}^{\prime}\right\}=\underline{0}, \quad \operatorname{Cov}\left\{\underline{v}_{k}^{\prime}\right\}=\mathbf{L}_{k} \cdot \mathbf{C}_{k}^{v} \cdot \mathbf{L}_{k}^{\top}
$$
&lt;p>The measurement equation can thus be rewritten as:&lt;/p>
$$
\underline{y}_{k}=\underline{\bar{y}}_{k}+\Delta \underline{y}_{k} \approx \underline{h}_{k}\left(\underline{\bar{x}}_{k}, \underline{\bar{v}}_{k}\right)+\mathbf{H}_{k} \Delta \underline{x}_{k}+\underline{v}_{k}^{\prime}
$$
&lt;ul>
&lt;li>
&lt;p>Nominal part&lt;/p>
$$
\underline{\bar{y}}_{k} = \underline{h}_{k}\left(\underline{\bar{x}}_{k}, \underline{\bar{v}}_{k}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Differential part&lt;/p>
$$
\Delta \underline{y}_{k} \approx \mathbf{H}_{k} \Delta \underline{x}_{k}+\underline{v}_{k}^{\prime}
$$
&lt;/li>
&lt;/ul>
&lt;h3 id="erweitertes-kalmanfilter-ekf">Erweitertes Kalmanfilter (EKF)&lt;/h3>
&lt;p>💡 Linearize around the current best estimate in each step&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prediction&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Compute the expected value with the nonlinear function&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p}=\underline{a}_{k}\left(\underline{\hat{x}}_{k}^{e}, \hat{\underline{u}}_{k}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Compute the covariance matrix with the linearization&lt;/p>
$$
\mathbf{C}_{k+1}^{p} \approx \mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{C}_{k}^{w^{\prime}}=\mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top}
$$
&lt;p>with&lt;/p>
$$
\mathbf{A}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}} \qquad
\mathbf{B}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{u}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Filtering&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Compute $\underline{\bar{y}}_k$
(the measurement expected from the prior estimate, i.e. the prediction; it serves as the nominal value at the current time step)&lt;/p>
$$
\underline{\bar{y}}_k = \underline{h}_k(\underline{\hat{x}}_k^p, \underline{\hat{v}}_k)
$$
&lt;/li>
&lt;li>
&lt;p>Compute $\Delta \underline{y}_k$
&lt;/p>
$$
\Delta \underline{y}_{k}=\underline{\hat{y}}_{k}-\underline{\bar{y}}_{k}
$$
&lt;ul>
&lt;li>$\underline{\hat{y}}_{k}$
: actual measurement&lt;/li>
&lt;/ul>
&lt;p>and&lt;/p>
$$
\Delta \underline{y}_{k} \approx \mathbf{H}_{k} \cdot\left(\underline{x}_{k}^{e}-\underline{\hat{x}}_{k}^{p}\right)+\underline{v}_{k}^{\prime}
$$
&lt;p>with&lt;/p>
$$
\mathbf{H}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}} \qquad
\mathbf{L}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{v}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}}
$$
&lt;/li>
&lt;/ul>
&lt;p>Filter step&lt;/p>
$$
\begin{aligned}
\mathbf{K}_{k}&amp;=\mathbf{C}_{k}^{p} \mathbf{H}_{k}^{\top}\left(\mathbf{L}_{k} \mathbf{C}_{k}^{v} \mathbf{L}_{k}^{\top}+\mathbf{H}_{k} \mathbf{C}_{k}^{p} \mathbf{H}_{k}^{\top}\right)^{-1} \\\\
\hat{\underline{x}}_{k}^{e}&amp;=\hat{\underline{x}}_{k}^{p}+\mathbf{K}_{k}\left[\hat{\underline{y}}_{k}-\underline{h}_{k}\left(\hat{\underline{x}}_{k}^{p}, \hat{\underline{v}}_{k}\right)\right] \\\\
\mathbf{C}_{k}^{e}&amp;=\mathbf{C}_{k}^{p}-\mathbf{K}_{k} \mathbf{H}_{k} \mathbf{C}_{k}^{p} = (\mathbf{I} - \mathbf{K}_{k} \mathbf{H}_{k})\mathbf{C}_{k}^{p}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
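&lt;p>The prediction and filter step above can be sketched for the scalar case as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the function &lt;code>ekf_step&lt;/code>, the example models &lt;code>a&lt;/code> and &lt;code>h&lt;/code> and all numbers are hypothetical, the derivatives are supplied by hand, and the measurement noise is assumed zero-mean ($\hat{v}_k = 0$).&lt;/p>

```python
def ekf_step(x_e, C_e, u_hat, C_w, y_meas, C_v, a, da_dx, da_du, h, dh_dx, dh_dv):
    """One scalar EKF cycle: prediction followed by the filter step."""
    # Prediction: mean through the nonlinear function, covariance through the linearization.
    A = da_dx(x_e, u_hat)
    B = da_du(x_e, u_hat)
    x_p = a(x_e, u_hat)
    C_p = A * C_e * A + B * C_w * B
    # Filter step: linearize the measurement function at the predicted mean, v_hat = 0.
    H = dh_dx(x_p, 0.0)
    L = dh_dv(x_p, 0.0)
    K = C_p * H / (L * C_v * L + H * C_p * H)
    x_new = x_p + K * (y_meas - h(x_p, 0.0))
    C_new = (1.0 - K * H) * C_p
    return x_p, C_p, x_new, C_new

# Hypothetical example models (illustration only).
a = lambda x, u: x - 0.1 * x ** 3 + u
da_dx = lambda x, u: 1.0 - 0.3 * x ** 2
da_du = lambda x, u: 1.0
h = lambda x, v: x ** 2 + v
dh_dx = lambda x, v: 2.0 * x
dh_dv = lambda x, v: 1.0

x_p, C_p, x_new, C_new = ekf_step(1.0, 0.04, 0.0, 0.01, 0.9, 0.01,
                                  a, da_dx, da_du, h, dh_dx, dh_dv)
print(x_p, C_p, x_new, C_new)
```

&lt;p>Note that the Jacobians are re-evaluated at the current estimate in every call, so the linearized model changes from step to step.&lt;/p>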
&lt;h3 id="probleme-bei-linearisierung">Probleme bei Linearisierung&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The posterior distribution is computed well only for &amp;ldquo;weak&amp;rdquo; nonlinearities&lt;/p>
&lt;p>$\rightarrow$ Induced nonlinearity: how nonlinear the function effectively is depends on the uncertainty of the prior density&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-24%2015.52.30.png" alt="截屏2022-08-24 15.52.30" style="zoom:50%;" />
&lt;blockquote>
&lt;p>If the prior density has small (narrow) noise (bottom, black), the linearization works well.&lt;/p>
&lt;p>If we make the noise wider (bottom, green), a problem appears: the resulting density of $y$ is no longer symmetric.&lt;/p>
&lt;p>Induced nonlinearity means that we cannot say in absolute terms whether the function is particularly linear or particularly nonlinear. It has the potential to cause problems, but it causes none as long as the density stays entirely to the left or entirely to the right of the &amp;ldquo;kink&amp;rdquo;. As soon as the density extends across the kink, problems arise. &lt;strong>This is the induced nonlinearity: it is induced by the noise.&lt;/strong>&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Linearization is carried out around a single point only&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The linearized system is in general time-variant even if the original system is time-invariant, because the linearization depends on the estimate.&lt;/p>
&lt;/li>
&lt;/ul>
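&lt;p>The effect of induced nonlinearity can be reproduced with a small numerical experiment. The sketch below is only an illustration under assumptions: the kinked function &lt;code>g&lt;/code>, the prior mean and the two standard deviations are hypothetical choices, and the reference mean is obtained with a simple midpoint rule rather than any method from the lecture.&lt;/p>

```python
import math

def g(x):
    """Piecewise-linear function with a kink at x = 0 (slope 1 on the right, 0.2 on the left)."""
    return max(x, 0.2 * x)

def gauss_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)

def true_mean(m, s, n=20001, width=8.0):
    """E{g(x)} for x ~ N(m, s^2) via a midpoint rule on [m - width*s, m + width*s]."""
    dx = 2.0 * width * s / n
    total = 0.0
    for i in range(n):
        x = m - width * s + (i + 0.5) * dx
        total += g(x) * gauss_pdf(x, m, s) * dx
    return total

m = 1.0
for s in (0.1, 2.0):
    lin_mean = g(m)   # linearization-based mean: the function evaluated at the prior mean
    print(s, lin_mean, true_mean(m, s))
```

&lt;p>With the narrow prior the density stays on one side of the kink and both means agree; with the wide prior the density extends across the kink and the linearization-based mean is clearly off.&lt;/p>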
&lt;h2 id="schätzung-in-probabilistischer-form-nichtlineares-kalmanfilter">Schätzung in probabilistischer Form: Nichtlineares Kalmanfilter&lt;/h2>
&lt;h3 id="erwartungswertbildung">Erwartungswertbildung&lt;/h3>
&lt;p>Given:&lt;/p>
&lt;ul>
&lt;li>Function $\underline{y}=\underline{g}(\underline{x})$
&lt;/li>
&lt;li>$\underline{x} \sim f_x(\underline{x})$
&lt;/li>
&lt;/ul>
&lt;p>Wanted: certain moments of $\underline{y}$
&lt;/p>
&lt;p>E.g. in the scalar case&lt;/p>
$$
y = g(x)
$$
&lt;p>we look for $E\left\{y^{j}\right\}, j \in \mathbb{N}$
.&lt;/p>
&lt;p>We know that&lt;/p>
$$
E\left\{y^{j}\right\}=\int_{\mathbb{R}} y^{j} f_{y}(y) d y
$$
&lt;p>But&lt;/p>
&lt;ul>
&lt;li>$f_y(y)$
, the posterior density, is often not easy to compute&lt;/li>
&lt;li>Even if it can be computed, calculating $f_y(y)$
is far too expensive when only moments are needed&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>Theorem (duality in computing expected values)&lt;/p>
$$
E\_{f\_y}\left\\{y^{j}\right\\}=E\_{f\_{x}}\left\\{[g(x)]^{j}\right\\}=\int\_{\mathbb{R}}[g(x)]^{j} f\_{x}(x) d x
$$&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>$f_y(y)$
therefore does not have to be computed.&lt;/li>
&lt;li>Useful when
&lt;ul>
&lt;li>$f_y(y)$
is hard to compute&lt;/li>
&lt;li>the nonlinear moments $E\left\{[g(x)]^{j}\right\}$ with respect to $f_x(\cdot)$
are easy to compute&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>For a sample-based approximation of the prior density $f_x(x)$
&lt;/p>
$$
f_{x}(x)=\sum_{i=1}^{L} w_{i} \delta\left(x-x_{i}\right)
$$
&lt;p>computing the posterior density $f_y(y)$
is trivial:&lt;/p>
$$
f_y(y)=\sum_{i=1}^{L} w_{i} \delta\left(y-y_{i}\right), \quad y_{i}=g\left(x_{i}\right)
$$
&lt;p>Thus&lt;/p>
$$
E\left\{y^{j}\right\}=\int_{\mathbb{R}} y^{j} f_{y}(y) d y=\sum_{i=1}^{L} w_{i} y_{i}^{j}=\sum_{i=1}^{L} w_{i}\left[g\left(x_{i}\right)\right]^{j}
$$
&lt;p>and&lt;/p>
$$
E\left\{[g(x)]^{j}\right\}=\int_{\mathbb{R}}[g(x)]^{j} f_{x}(x) d x=\sum_{i=1}^{L} w_{i}\left[g\left(x_{i}\right)\right]^{j}
$$
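&lt;p>In the sample-based case the two computations are identical. In general they are NOT: both yield the same value (by the theorem above), but the first requires $f_y(y)$ while the second works directly with $f_x(x)$. 🤪&lt;/p>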
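&lt;p>For a Dirac mixture both routes can be written down in a few lines. The sketch below is illustrative only: the function names, the sample positions, the weights and the choice $g(x)=x^3$ are hypothetical.&lt;/p>

```python
def moment_via_fx(samples, weights, g, j):
    """E_{f_x}{[g(x)]^j}: integrate [g(x)]^j against the prior Dirac mixture."""
    return sum(w * g(x) ** j for x, w in zip(samples, weights))

def moment_via_fy(samples, weights, g, j):
    """E_{f_y}{y^j}: first propagate the samples, then take the moment of y."""
    ys = [g(x) for x in samples]          # y_i = g(x_i), weights stay unchanged
    return sum(w * y ** j for y, w in zip(ys, weights))

xs = [-1.0, 0.0, 1.0, 2.0]                # sample positions x_i (assumed)
ws = [0.1, 0.4, 0.4, 0.1]                 # weights w_i, summing to 1 (assumed)
g = lambda x: x ** 3

for j in (1, 2):
    print(j, moment_via_fx(xs, ws, g, j), moment_via_fy(xs, ws, g, j))
```

&lt;p>Both functions perform the same weighted sum, which is exactly the statement that the two computations coincide for sample-based densities.&lt;/p>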
&lt;h3 id="prädiktion-in-probabilistischer-form">Prädiktion in probabilistischer Form&lt;/h3>
&lt;p>System model&lt;/p>
$$
\underline{x}_{k+1}=\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)
$$
&lt;ul>
&lt;li>$\underline{x}$
: state&lt;/li>
&lt;li>$\underline{u}$
: disturbance (Störgröße)&lt;/li>
&lt;/ul>
&lt;p>For a Kalman filter we need the expected value and the covariance matrix at the next time step.&lt;/p>
&lt;p>Expected value&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p}=E\left\{\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)\right\}=\int_{\mathbb{R}^{N}} \int_{\mathbb{R}^{p}} \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right) f_{k}^{x u}\left(\underline{x}_{k}, \underline{u}_{k}\right) d\underline{x}_{k} d\underline{u}_{k}
$$
&lt;p>As a rule, $\underline{x}_{k}$ and $\underline{u}_{k}$
are independent, so&lt;/p>
$$
f_{k}^{x u}\left(\underline{x}_{k}, \underline{u}_{k}\right) = f_{k}^{e}\left(\underline{x}_{k}\right) \cdot f_{k}^{u}\left(\underline{u}_{k}\right).
$$
&lt;p>We further assume that $\underline{x}_{k}, \underline{u}_{k}$
are normally distributed, i.e.&lt;/p>
$$
\begin{aligned}
f_{k}^{e}\left(\underline{x}_{k}\right) &amp;= \mathcal{N}(\underline{x}_{k}, \underline{\hat{x}}_{k}^{e}, \mathbf{C}_{k}^{e}) \\\\
f_{k}^{u}\left(\underline{u}_{k}\right) &amp;= \mathcal{N}\left(\underline{u}_{k}, \hat{\underline{u}}_{k}, \mathbf{C}_{k}^{w}\right)
\end{aligned}
$$
&lt;p>For additive noise&lt;/p>
$$
\underline{x}_{k+1}=\underline{a}_{k}\left(\underline{x}_{k}\right)+\underline{u}_{k} \left(= \underline{a}_{k}\left(\underline{x}_{k}\right)+(\underline{\hat{u}}_{k} + \underline{w}_k)\right)
$$
&lt;p>we have&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p}=\int_{\mathbb{R}^{N}} \underline{a}_{k}\left(\underline{x}_{k}\right) \cdot f_{k}^{e}\left(\underline{x}_{k}\right) d \underline{x}_{k}+\underline{\hat{u}}_{k}
$$
&lt;p>Then&lt;/p>
$$
\begin{aligned}
\underline{x}_{k+1} - \underline{\hat{x}}_{k+1}^{p} &amp;= (\underline{a}_{k}(\underline{x}_{k})+(\underline{\hat{u}}_{k} + \underline{w}_k)) - \left(\int_{\mathbb{R}^{N}} \underline{a}_{k}\left(\underline{x}_{k}\right) \cdot f_{k}^{e}\left(\underline{x}_{k}\right) d \underline{x}_{k}+\underline{\hat{u}}_{k}\right) \\\\
&amp;= \underbrace{\underline{a}_{k}(\underline{x}_{k}) - \int_{\mathbb{R}^{N}} \underline{a}_{k}\left(\underline{x}_{k}\right) \cdot f_{k}^{e}\left(\underline{x}_{k}\right) d \underline{x}_{k}}_{:= \underline{\bar{a}}_{k}(\underline{x}_{k})} + \underline{w}_k
\end{aligned}
$$
&lt;p>The covariance matrix is&lt;/p>
$$
\begin{aligned}
\mathbf{C}_{k+1}^{p} &amp;= E\left\{\left(\underline{x}_{k+1}-\underline{\hat{x}}_{k+1}^{p}\right)(\underline{x}_{k+1}-\underline{\hat{x}}_{k+1}^{p})^{\top}\right\} \\\\
&amp;= E\left\{(\underline{\bar{a}}_{k}(\underline{x}_{k}) + \underline{w}_k) (\underline{\bar{a}}_{k}(\underline{x}_{k}) + \underline{w}_k) ^ \top\right\} \\\\
&amp;= E\left\{\underline{\bar{a}}_{k}(\underline{x}_k) \underline{\bar{a}}_{k}^\top(\underline{x}_k) + \underline{w}_k\underline{\bar{a}}_{k}^\top(\underline{x}_k) + \underline{\bar{a}}_{k}(\underline{x}_k)\underline{w}_k^\top + \underline{w}_k\underline{w}_k^\top\right\} \\\\
&amp;= E\left\{\underline{\bar{a}}_{k}(\underline{x}_k) \underline{\bar{a}}_{k}^\top(\underline{x}_k)\right\} + \underbrace{E\left\{\underline{w}_k\underline{\bar{a}}_{k}^\top(\underline{x}_k)\right\}}_{=0} + \underbrace{E\left\{\underline{\bar{a}}_{k}(\underline{x}_k)\underline{w}_k^\top\right\}}_{=0} + E\left\{\underline{w}_k\underline{w}_k^\top\right\} \\\\
&amp;= E\left\{\underline{\bar{a}}_{k}(\underline{x}_k) \underline{\bar{a}}_{k}^\top(\underline{x}_k)\right\} + E\left\{\underline{w}_k\underline{w}_k^\top\right\} \\\\
&amp;= \int_{\mathbb{R}^{N}} \overline{\underline{a}}_{k}\left(\underline{x}_{k}\right) \overline{\underline{a}}_{k}^{\top}\left(\underline{x}_{k}\right) f_{k}^{e}\left(\underline{x}_{k}\right) d \underline{x}_{k}+\mathbf{C}_{k}^{w}
\end{aligned}
$$
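&lt;p>For additive noise the two integrals can be checked on a scalar example. The following sketch is an illustration under assumptions: the system function $a(x)=x^2$, the numbers, and the helper names &lt;code>expect&lt;/code> and &lt;code>predict&lt;/code> are hypothetical, and the integrals are evaluated with a plain midpoint rule. For this choice the closed forms are $E\{x^2\}=\hat{x}^2+\sigma^2$ and $\operatorname{Var}\{x^2\}=4\hat{x}^2\sigma^2+2\sigma^4$.&lt;/p>

```python
import math

def gauss_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)

def expect(func, m, s, n=4001, width=8.0):
    """E{func(x)} for x ~ N(m, s^2), midpoint rule on [m - width*s, m + width*s]."""
    dx = 2.0 * width * s / n
    total = 0.0
    for i in range(n):
        x = m - width * s + (i + 0.5) * dx
        total += func(x) * gauss_pdf(x, m, s) * dx
    return total

def predict(a, x_e, C_e, u_hat, C_w):
    """Predicted mean and variance for x_{k+1} = a(x_k) + u_k with u_k = u_hat + w_k."""
    s = math.sqrt(C_e)
    a_mean = expect(a, x_e, s)
    x_p = a_mean + u_hat
    # Centered function a_bar(x) = a(x) - E{a(x)}; the noise covariance is added on top.
    C_p = expect(lambda x: (a(x) - a_mean) ** 2, x_e, s) + C_w
    return x_p, C_p

x_p, C_p = predict(lambda x: x ** 2, x_e=1.0, C_e=0.25, u_hat=0.2, C_w=0.1)
print(x_p, C_p)   # closed form for this example: 1.45 and 1.225
```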
&lt;h3 id="filterung-in-probabilistischer-form">Filterung in probabilistischer Form&lt;/h3>
&lt;h4 id="einschub-konditionierung-einer-gaußschen-verbunddichte">Einschub: Konditionierung einer Gaußschen Verbunddichte&lt;/h4>
&lt;p>Random vector $\underline{z}=\left[\begin{array}{l}\underline{x} \\ \underline{y}\end{array}\right]$
with a joint Gaussian distribution:&lt;/p>
$$
f(\underline{z})=\mathcal{N}\left(\underline{z}, \underline{\hat{z}}, \mathbf{C}_{z}\right), \quad \underline{\hat{z}}=\left[\begin{array}{l}
\underline{\hat{x}} \\
\underline{\hat{y}}
\end{array}\right], \quad \mathbf{C}_{z}=\left[\begin{array}{ll}
C_{x x} &amp; C_{x y} \\
C_{y x} &amp; C_{y y}
\end{array}\right]
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-Konditionierung_gauss_verbunddichte.drawio.png" alt="wertkontinuierliche_nichtlineare_system-Konditionierung_gauss_verbunddichte.drawio" style="zoom:67%;" />
&lt;p>Given: measurement $\underline{y}^\ast$
&lt;/p>
&lt;p>Conditional distribution:&lt;/p>
$$
\begin{equation}
f\left(\underline{x} \mid \underline{y}^{\ast}\right)= \mathcal{N}\left(\underline{x}, \underline{\hat{x}}^{\ast}, \mathbf{C}_{x}^{\ast}\right)
\end{equation}
$$
&lt;p>Then&lt;/p>
$$
\begin{array}{l}
\underline{\hat{x}}^{\ast}=\underline{\hat{x}}+C_{x y} C_{y y}^{-1}\left(\underline{y}^{*}-\underline{\hat{y}}\right) \\
\mathbf{C}_{x}^{\ast}=C_{x x}-C_{x y} C_{y y}^{-1} C_{y x}
\end{array}
\tag{*}
$$
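&lt;p>In the scalar case the conditioning formulas $(*)$ reduce to two lines of arithmetic. The sketch below is illustrative: the function name &lt;code>condition_gaussian&lt;/code> and the numbers are hypothetical.&lt;/p>

```python
def condition_gaussian(x_hat, y_hat, C_xx, C_xy, C_yy, y_star):
    """Mean and variance of x given y = y_star for a jointly Gaussian scalar pair (x, y)."""
    gain = C_xy / C_yy                       # C_xy * C_yy^{-1}
    x_star = x_hat + gain * (y_star - y_hat)
    C_star = C_xx - gain * C_xy              # C_yx equals C_xy in the scalar case
    return x_star, C_star

x_star, C_star = condition_gaussian(x_hat=1.0, y_hat=2.0,
                                    C_xx=4.0, C_xy=1.5, C_yy=3.0, y_star=3.0)
print(x_star, C_star)
```

&lt;p>The conditional variance does not depend on the measured value $y^{\ast}$, only on the covariances.&lt;/p>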
&lt;h4 id="alternative-herleitung-kalmanfilter">Alternative Herleitung Kalmanfilter&lt;/h4>
&lt;p>Measurement model&lt;/p>
$$
\underline{y}=\mathbf{H} \cdot \underline{x}+\underline{v}
$$
&lt;p>Given:&lt;/p>
$$
\underline{x}_{p} \sim \mathcal{N}\left(\hat{\underline{x}}_{p}, \mathbf{C}_{p}\right), \quad \underline{v}\sim \mathcal{N}\left(\underline{0}, \mathbf{C}_{v}\right)
$$
&lt;p>We define $\underline{z}$
as the joint vector of $\underline{x}_{p}$
and $\underline{y}$
&lt;/p>
$$
\underline{z}:=\left[\begin{array}{l}
\underline{x}_{p} \\
\underline{y}
\end{array}\right]=\left[\begin{array}{l}
\underline{x}_{p} \\
\mathbf{H} \cdot \underline{x}_{p}+\underline{v}
\end{array}\right]=\left[\begin{array}{ll}
\mathbf{I} &amp; 0 \\
\mathbf{H} &amp; \mathbf{I}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{p} \\
\underline{v}
\end{array}\right]
$$
&lt;p>The expected value is then&lt;/p>
$$
\underline{\hat{z}}=\left[\begin{array}{c}
\underline{\hat{x}}_{p} \\
\mathbf{H} \cdot \underline{\hat{x}}_{p}
\end{array}\right]
$$
&lt;p>The covariance matrix of $\underline{z}$
:&lt;/p>
$$
\begin{array}{ll}
\operatorname{Cov}\{\underline{z}\}&amp;=E\left\{\left[\begin{array}{ll}
\mathbf{I} &amp; 0 \\
\mathbf{H} &amp; \mathbf{I}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{p}-\hat{\underline{x}}_{p} \\
\underline{v}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{p}-\underline{\hat{x}}_{p} \\
\underline{v}
\end{array}\right]^{\top}\left[\begin{array}{cc}
\mathbf{I} &amp; \mathbf{H}^{\top} \\
0 &amp; \mathbf{I}
\end{array}\right]\right\}\\\\
&amp;=\left[\begin{array}{ll}
\mathbf{I} &amp; 0 \\
\mathbf{H} &amp; \mathbf{I}
\end{array}\right] E\left\{\left[\begin{array}{cc}
\underbrace{\left(\underline{x}_{p}-\underline{\hat{x}}_{p}\right)(\underline{x}_{p}-\underline{\hat{x}}_{p})^{\top}}_{=\mathbf{C}_p} &amp; \underbrace{\left(\underline{x}_{p}-\underline{\hat{x}}_{p}\right) \underline{v}^{\top}}_{=0} \\
\underbrace{\underline{v}\left(\underline{x}_{p}-\underline{\hat{x}}_{p}\right)^{\top}}_{=0} &amp; \underbrace{\underline{v} \underline{v}^{\top}}_{=\mathbf{C}_v}
\end{array}\right]\right\}\left[\begin{array}{cc}
\mathbf{I} &amp; \mathbf{H}^{\top} \\
0 &amp; \mathbf{I}
\end{array}\right]\\\\
&amp;=\left[\begin{array}{ll}
\mathbf{I} &amp; 0 \\
\mathbf{H} &amp; \mathbf{I}
\end{array}\right]\left[\begin{array}{cc}
\mathbf{C}_p &amp; 0 \\
0 &amp; \mathbf{C}_v
\end{array}\right]\left[\begin{array}{cc}
\mathbf{I} &amp; \mathbf{H}^{\top} \\
0 &amp; \mathbf{I}
\end{array}\right]=\left[\begin{array}{cc}
\mathbf{C}_{p} &amp; \mathbf{C}_{p} \mathbf{H}^{\top} \\
\mathbf{H} \mathbf{C}_{p} &amp; \mathbf{C}_{v}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}
\end{array}\right]
\end{array}
$$
&lt;p>Setting&lt;/p>
$$
\mathbf{C}_{x x}=\mathbf{C}_{p} \quad \mathbf{C}_{x y}=\mathbf{C}_{p} \mathbf{H}^{\top} \quad \mathbf{C}_{y x}=\mathbf{H} \mathbf{C}_{p} \quad \mathbf{C}_{y y}=\mathbf{C}_{v}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}
$$
&lt;p>and substituting into $(*)$ yields the Kalman filter.&lt;/p>
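&lt;p>Written out, the substitution reads as follows. This is a sketch of the intermediate step: the symbol $\mathbf{K}$ and the subscript $e$ for the posterior estimate are introduced here for illustration, and $\underline{y}^{\ast}$ denotes the measurement as in $(*)$.&lt;/p>
$$
\begin{aligned}
\mathbf{K} &amp;:= \mathbf{C}_{x y} \mathbf{C}_{y y}^{-1} = \mathbf{C}_{p} \mathbf{H}^{\top}\left(\mathbf{C}_{v}+\mathbf{H} \mathbf{C}_{p} \mathbf{H}^{\top}\right)^{-1} \\\\
\underline{\hat{x}}_{e} &amp;= \underline{\hat{x}}_{p}+\mathbf{K}\left(\underline{y}^{\ast}-\mathbf{H} \underline{\hat{x}}_{p}\right) \\\\
\mathbf{C}_{e} &amp;= \mathbf{C}_{p}-\mathbf{K} \mathbf{H} \mathbf{C}_{p}
\end{aligned}
$$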
&lt;h4 id="filterung-in-probabilistischer-form-1">Filterung in probabilistischer Form&lt;/h4>
&lt;p>Measurement equation&lt;/p>
$$
\underline{y}=\underline{h}(\underline{x}, \underline{v})
$$
&lt;p>Define&lt;/p>
$$
\underline{z}=\left[\begin{array}{l}
\underline{x} \\
\underline{y}
\end{array}\right]=\left[\begin{array}{c}
\underline{x} \\
\underline{h}(\underline{x}, \underline{v})
\end{array}\right] \Rightarrow E\{\underline{z}\}=\left[\begin{array}{c}
\underline{\hat{x}}_{p} \\
E\{\underline{h}(\underline{x}, \underline{v})\}
\end{array}\right]
$$
&lt;p>For additive noise&lt;/p>
$$
\underline{y}=\underline{h}(\underline{x})+\underline{v}
$$
&lt;p>we have&lt;/p>
$$
E\{\underline{z}\}=\left[\begin{array}{c}
\underline{\hat{x}}_{p} \\
E\{\underline{h}(x)\}
\end{array}\right], \quad E\{\underline{h}(x)\}=\int_{\mathbb{R}^{N}} \underline{h}(x) \underbrace{f_{p}(x)}_{= \mathcal{N}(\underline{x}, \underline{\hat{x}}_{p}, \mathbf{C}_p)} d \underline{x} \in \mathbb{R}^{M}
$$
&lt;p>Covariance matrix:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/image-20220711122807030.png" alt="image-20220711122807030">&lt;/p>
$$
E\left\{\left(\underline{x}-\underline{\hat{x}}_{p}\right) \underline{\bar{h}}^{\top}(\underline{x})\right\} = \int_{\mathbb{R}^N} (\underline{x}-\underline{\hat{x}}_{p}) \underline{\bar{h}}^{\top}(\underline{x}) f_p(\underline{x}) d\underline{x} \in \mathbb{R}^{N \times M}
$$
$$
E\left\{\underline{\bar{h}}(\underline{x}) \underline{\bar{h}}^{\top}(\underline{x})\right\} = \int_{\mathbb{R}^N} \underline{\bar{h}}(\underline{x}) \underline{\bar{h}}^{\top}(\underline{x}) f_p(\underline{x}) d\underline{x} \in \mathbb{R}^{M \times M}
$$
$$
\operatorname{Cov}\{\underline{z}\}=\left[\begin{array}{ll}
\overbrace{C_{x x}}^{\mathbb{R}^{N \times N}} &amp; \overbrace{C_{x y}}^{\mathbb{R}^{N \times M}}\\
\underbrace{C_{y x}}_{\mathbb{R}^{M \times N}} &amp; \underbrace{C_{y y}}_{\mathbb{R}^{M \times M}}
\end{array}\right] \in \mathbb{R}^{(N+M) \times (N+M)}
$$
&lt;p>Substituting into $(\ast)$ yields the &lt;strong>nonlinear Kalman filter&lt;/strong>:&lt;/p>
$$
\begin{array}{l}
\underline{\hat{x}}_{e}=\underline{\hat{x}}_{p}+\mathbf{C}_{x y} \mathbf{C}_{y y}^{-1}(\underline{\hat{y}}-E\{\underline{h}(\underline{x})\}) \\
\mathbf{C}_{e}=\mathbf{C}_{p}-\mathbf{C}_{x y} \mathbf{C}_{y y}^{-1} \mathbf{C}_{y x}
\end{array}
$$</description></item><item><title>Berechnung der Momente: Unscented Kalman Filter (UKF)</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/berechnung_der_momente/</link><pubDate>Mon, 11 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/berechnung_der_momente/</guid><description>&lt;h2 id="analytische-momente">Analytische Momente&lt;/h2>
&lt;ul>
&lt;li>Seemingly the best method: fast, with a fixed runtime 👍&lt;/li>
&lt;li>But
&lt;ul>
&lt;li>The derivation is laborious&lt;/li>
&lt;li>The formulas quickly become unwieldy&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="beispiel-kubisches-sensorproblem-skalar">Beispiel: Kubisches Sensorproblem (skalar)&lt;/h3>
&lt;p>The output $y$ depends nonlinearly on the state $x$:&lt;/p>
$$
y=h(x)+v=x^{3}+v
$$
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>Prior estimate $x_p \sim \mathcal{N}(\hat{x}_p, \sigma_p^2)$
&lt;/li>
&lt;li>Measurement $\hat{y}$
&lt;/li>
&lt;li>Noise $v$ is Gaussian with $E\{v\}=0, \operatorname{Cov}\{v\}=\sigma_{v}^{2}$
&lt;/li>
&lt;/ul>
&lt;p>Define&lt;/p>
$$
\underline{z} := \left[\begin{array}{l}
x \\
y
\end{array}\right] \Rightarrow E\{\underline{z}\}=\left[\begin{array}{c}
\hat{x}_{p} \\
E\{h(x)\}
\end{array}\right]
$$
&lt;p>with&lt;/p>
$$
E\{h(x)\}=\int_{\mathbb{R}} h(x) f_{p}(x) d x=\int_{\mathbb{R}} x^{3} f_{p}(x) d x=\hat{x}_{p}^{3}+3 \hat{x}_{p} \sigma_{p}^{2}=:E_{3}
$$
&lt;p>Define&lt;/p>
$$
\bar{h}(x)=h(x)-E\{h(x)\}
$$
&lt;p>Then&lt;/p>
$$
\operatorname{Cov}\{\underline{z}\}=\left[\begin{array}{ll}
\mathbf{C}_{x x} &amp; \mathbf{C}_{x y} \\
\mathbf{C}_{y x} &amp; \mathbf{C}_{y y}
\end{array}\right]=\left[\begin{array}{cc}
\sigma_{p}^{2} &amp; E\left\{\left(x-\hat{x}_{p}\right) \bar{h}(x)\right\} \\
E\left\{\left(x-\hat{x}_{p}\right) \bar{h}(x)\right\} &amp; E\left\{\overline{h}^{2}(x)\right\}+\sigma_{v}^{2}
\end{array}\right]
$$
$$
\begin{aligned}
E\left\{\left(x-\hat{x}_{p}\right)\bar{h}(x)\right\} &amp;= E\left\{\left(x-\hat{x}_{p}\right)\left(x^{3}-E_{3}\right)\right\} \\
&amp;= E\left\{x^{4}-\hat{x}_{p} x^{3}-E_{3} x+\hat{x}_{p} E_{3}\right\} \\
&amp;= E_4 - \hat{x}_p E_3 - E_3 \hat{x}_p + \hat{x}_p E_3 \\
&amp;= E_4 - \hat{x}_p E_3
\end{aligned}
$$
&lt;p>with&lt;/p>
$$
\begin{aligned}
E_{4}&amp;=E\left\{x^{4}\right\}=\hat{x}_{p}^{4}+6 \hat{x}_{p}^{2} \sigma_{p}^{2}+3\sigma_{p}^{4} \\\\
E_{4}-\hat{x}_{p} E_{3}&amp;=\hat{x}_{p}^{4}+6 \hat{x}_{p}^{2} \sigma_{p}^{2}+3\sigma_{p}^{4}-\hat{x}_{p}^{4}-3 \hat{x}_{p}^{2} \sigma_{p}^{2} \\\\
&amp;=3 \sigma_{p}^{4}+3 \hat{x}_{p}^{2} \sigma_{p}^{2} \\\\
&amp;=3\sigma_{p}^{2}\left(\hat{x}_{p}^{2}+\sigma_{p}^{2}\right)
\end{aligned}
$$
&lt;p>and&lt;/p>
$$
E\left\{\bar{h}^{2}(x)\right\}=9 \hat{x}_{p}^{4} \sigma_{p}^{2}+36 \hat{x}_{p}^{2} \sigma_{p}^{4}+15\sigma_{p}^{6}
$$
&lt;p>Substituting into the Kalman filter update equations yields&lt;/p>
$$
\begin{array}{l}
\hat{x}_{e}=\hat{x}_{p}+\mathbf{C}_{xy}\mathbf{C}_{yy}^{-1}(\hat{y}-E\{h(x)\}) \overset{\text{skalar}}{=} \hat{x}_{p}+\frac{\mathbf{C}_{x y}}{\mathbf{C}_{y y}}(\hat{y}-E\{h(x)\}) \\
\sigma_{e}^{2}= \sigma_{p}^{2}-\mathbf{C}_{xy}\mathbf{C}_{yy}^{-1}\mathbf{C}_{yx} \overset{\text{skalar}}{=} \sigma_{p}^{2}-\frac{\mathbf{C}_{x y}^{2}}{\mathbf{C}_{y y}}
\end{array}
$$
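&lt;p>The analytic filter step for the cubic sensor fits into a single function. The sketch below is illustrative: the function name &lt;code>cubic_sensor_update&lt;/code> and the example numbers are hypothetical, and it uses the Gaussian moment $E\{x^3\}=\hat{x}_p^3+3\hat{x}_p\sigma_p^2$ together with $\mathbf{C}_{xy}=3\sigma_p^2(\hat{x}_p^2+\sigma_p^2)$ and the expression for $E\{\bar{h}^2(x)\}$ given above.&lt;/p>

```python
def cubic_sensor_update(x_p, var_p, y_meas, var_v):
    """Analytic nonlinear Kalman filter step for y = x^3 + v with a Gaussian prior."""
    E3 = x_p ** 3 + 3.0 * x_p * var_p                       # E{h(x)} = E{x^3}
    C_xy = 3.0 * var_p * (x_p ** 2 + var_p)                 # E{(x - x_p) * h_bar(x)}
    C_yy = (9.0 * x_p ** 4 * var_p + 36.0 * x_p ** 2 * var_p ** 2
            + 15.0 * var_p ** 3 + var_v)                    # E{h_bar^2(x)} + sigma_v^2
    x_e = x_p + C_xy / C_yy * (y_meas - E3)
    var_e = var_p - C_xy ** 2 / C_yy
    return x_e, var_e

x_e, var_e = cubic_sensor_update(x_p=1.0, var_p=0.25, y_meas=1.5, var_v=0.5)
print(x_e, var_e)
```

&lt;p>The runtime is fixed and no integration is needed, but every new measurement function requires its own derivation of these moments.&lt;/p>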
&lt;h3 id="einschub-momente-gaußdichte">Einschub: Momente Gaußdichte&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Theorem&lt;/strong>&lt;/p>
&lt;p>The central moments of a Gaussian density are given by&lt;/p>
$$
C\_{i}=E\_{f}\left\\{(\boldsymbol{x}-\hat{x})^{i}\right\\}=\left\\{\begin{array}{ll}
\displaystyle\prod\_{j=1, j\text{ odd}}^{i-1} j \sigma^{i}=1 \cdot 3 \cdot 5 \cdots(i-1) \sigma^{i} &amp; i \text { even } \\\\
0 &amp; i \text { odd }
\end{array}\right.
$$
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2022-07-12%2016.27.43.png" alt="截屏2022-07-12 16.27.43">&lt;/p>
&lt;/span>
&lt;/div>
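&lt;p>The theorem translates directly into a short function. This is a minimal sketch; the function name &lt;code>gaussian_central_moment&lt;/code> is hypothetical.&lt;/p>

```python
def gaussian_central_moment(i, sigma):
    """Central moment C_i of a Gaussian with standard deviation sigma."""
    if i % 2 == 1:
        return 0.0                      # odd central moments vanish
    product = 1
    for j in range(1, i, 2):            # j = 1, 3, ..., i - 1
        product *= j
    return product * sigma ** i

for i in range(1, 7):
    print(i, gaussian_central_moment(i, 1.0))
```

&lt;p>For $i = 4$ and $i = 6$ this gives $3\sigma^4$ and $15\sigma^6$, the terms that appear in the cubic sensor example.&lt;/p>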
&lt;h2 id="numerische-momente">Numerische Momente&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Use standard numerical integration methods&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Fast existing implementations can be used&lt;/li>
&lt;li>Adjustable accuracy&lt;/li>
&lt;li>Adaptive integration&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Disadvantages&lt;/p>
&lt;ul>
&lt;li>Not tailored to the specific problem of computing moments&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
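&lt;p>The trade-off between cost and accuracy can be seen with a simple fixed-grid midpoint rule. The sketch below is an illustration under assumptions: the function &lt;code>moment_numeric&lt;/code> and the numbers are hypothetical, the rule is not adaptive, and a real application would more likely call an existing integration library. The reference value is the Gaussian moment $E\{x^3\}=m^3+3m\sigma^2$.&lt;/p>

```python
import math

def moment_numeric(h, m, s, n, width=8.0):
    """E{h(x)} for x ~ N(m, s^2) with an n-point midpoint rule on [m - width*s, m + width*s]."""
    dx = 2.0 * width * s / n
    total = 0.0
    for i in range(n):
        x = m - width * s + (i + 0.5) * dx
        pdf = math.exp(-0.5 * ((x - m) / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)
        total += h(x) * pdf * dx
    return total

m, s = 1.0, 0.5
exact = m ** 3 + 3.0 * m * s ** 2        # closed-form E{x^3}
for n in (5, 50, 2001):
    print(n, abs(moment_numeric(lambda x: x ** 3, m, s, n) - exact))
```

&lt;p>The number of grid points &lt;code>n&lt;/code> is the accuracy knob: a very coarse grid gives a visibly wrong moment, a fine grid reproduces the closed form to near machine precision.&lt;/p>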
&lt;h2 id="basierend-auf-abtastwerten-der-prioren-dichte">Basierend auf Abtastwerten der prioren Dichte&lt;/h2>
&lt;p>&lt;strong>Approximating the prior Gaussian density by samples&lt;/strong>&lt;/p>
&lt;p>Various methods that differ in complexity, efficiency, and accuracy&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Random sampling&lt;/strong> with a random number generator $\rightarrow$ independent samples&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Deterministic sampling on a grid&lt;/strong> (e.g. an equidistant grid)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Minimal approximation&lt;/strong> on the principal axes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Uses $2N$ or $2N + 1$ samples ($N$: number of dimensions)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-Minimal_Approximation.drawio.png" alt="wertkontinuierliche_nichtlineare_system-Minimal_Approximation.drawio" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Accurate approximation&lt;/strong> on the principal axes&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-Genau_Approximation.drawio.png" alt="wertkontinuierliche_nichtlineare_system-Genau_Approximation.drawio" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>General sample approximation&lt;/strong> $\rightarrow$ systematic approximation by minimizing a quality measure&lt;/p>
&lt;/li>
&lt;/ul>
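&lt;p>As an illustration of the minimal approximation with $2N$ samples, the sketch below places two equally weighted samples on each coordinate axis of a standard normal distribution. The placement at $\pm\sqrt{N}$ with weights $1/(2N)$ is an assumption made here because it reproduces mean $\underline{0}$ and covariance $\mathbf{I}_N$ exactly; the function names are hypothetical.&lt;/p>

```python
import math

def axis_samples(N):
    """2N equally weighted samples at +/- sqrt(N) on each coordinate axis."""
    radius = math.sqrt(N)
    samples = []
    for i in range(N):
        for sign in (1.0, -1.0):
            point = [0.0] * N
            point[i] = sign * radius
            samples.append(point)
    weights = [1.0 / (2 * N)] * (2 * N)
    return samples, weights

def sample_mean_cov(samples, weights):
    """Weighted mean vector and covariance matrix of a Dirac mixture."""
    N = len(samples[0])
    mean = [sum(w * p[d] for p, w in zip(samples, weights)) for d in range(N)]
    cov = [[sum(w * (p[r] - mean[r]) * (p[c] - mean[c]) for p, w in zip(samples, weights))
            for c in range(N)] for r in range(N)]
    return mean, cov

samples, weights = axis_samples(3)
mean, cov = sample_mean_cov(samples, weights)
print(mean)
print(cov)
```

&lt;p>Only the first two moments are matched by this construction; higher moments of the Gaussian are in general not reproduced by so few samples.&lt;/p>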
&lt;h3 id="einschub-diracsche-deltafunktion">Einschub: Diracsche Deltafunktion&lt;/h3>
&lt;p>Consider the limiting case of a Gaussian density&lt;/p>
$$
f(x, m, \sigma)=\frac{1}{\sqrt{2 \pi} \sigma} \exp \left\{-\frac{1}{2} \frac{(x-m)^{2}}{\sigma^{2}}\right\}
$$
&lt;p>for $\sigma \rightarrow 0$
&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/main-qimg-1fa89b5961b0560efe603acf42e3a438.png"
alt="Plot of several Gaussian densities for $m=0$.">&lt;figcaption>
&lt;p>Plot of several Gaussian densities for $m=0$.&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;p>Dirac delta function&lt;/p>
$$
\delta(x-m)=\lim _{\sigma \rightarrow 0} f(x, m, \sigma)
$$
&lt;p>As the width goes to 0 ($\sigma \to 0$), the height goes to infinity, while the area stays 1:&lt;/p>
$$
\int_{-\infty}^{\infty} \delta(x-m) d x=1
$$
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>: Dirac delta function&lt;/p>
$$
\delta(x)=\left\\{\begin{array}{cc}
\text{undefined} &amp; x=0 \\\\
0 &amp; \text { otherwise }
\end{array}\right.
$$
$$
\int_{-\infty}^{\infty} \delta(x) d x=\int_{-\varepsilon}^{\varepsilon} \delta(x) d x=1, \varepsilon>0
$$&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>By definition, the Dirac delta function has all the properties of a density&lt;/li>
&lt;li>Important properties
&lt;ul>
&lt;li>$f(x) \cdot \delta(x-m)=f(m) \delta(x-m)$
&lt;/li>
&lt;li>$\int_{\mathbb{R}} f(x) \delta(x-m) d x=f(m)$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="heaviside-funktion-unit-step-function">Heaviside Funktion (Unit Step Function)&lt;/h4>
&lt;p>Cumulative distribution function of the Gaussian density&lt;/p>
$$
F(x)=P(\boldsymbol{x} \leq x)=\int_{-\infty}^{x} f(t) d t=\frac{1}{2}\left\{1+\operatorname{erf}\left(\frac{x-m}{\sqrt{2} \sigma}\right)\right\}
$$
&lt;p>It holds that&lt;/p>
$$
f(x)=\frac{d}{d x} F(x)
$$
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;p>&lt;strong>Definition&lt;/strong>: Heaviside function&lt;/p>
$$
H(x-m)=\lim\_{\sigma \to 0} F(x)=\left\\{\begin{array}{ll}
1 &amp; x>m \\\\
\frac{1}{2} &amp; x=m \\\\
0 &amp; x&lt;m
\end{array}\right.
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-Heaviside_funktion.drawio.png" alt="wertkontinuierliche_nichtlineare_system-Heaviside_funktion.drawio" style="zoom:67%;" />&lt;/span>
&lt;/div>
&lt;p>The cumulative distribution function of $\delta(x)$ is $H(x)$, with&lt;/p>
$$
\begin{array}{l}
H(x)=\displaystyle\int_{-\infty}^{x} \delta(t) d t \\\\
\delta(x)=\frac{d}{d x} H(x)
\end{array}
$$
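&lt;p>The limit $\sigma \to 0$ can be observed numerically with the closed-form Gaussian distribution function. The sketch below is illustrative; the function name &lt;code>gauss_cdf&lt;/code> and the evaluation points are hypothetical, while &lt;code>math.erf&lt;/code> is the error function from the Python standard library.&lt;/p>

```python
import math

def gauss_cdf(x, m, sigma):
    """Cumulative distribution function of N(m, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - m) / (math.sqrt(2.0) * sigma)))

m = 0.0
for sigma in (1.0, 0.1, 0.001):
    # Values slightly left of m, at m, and slightly right of m.
    print(sigma, gauss_cdf(-0.05, m, sigma), gauss_cdf(0.0, m, sigma), gauss_cdf(0.05, m, sigma))
```

&lt;p>As $\sigma$ shrinks, the three values approach $0$, $\frac{1}{2}$ and $1$, i.e. the Heaviside function $H(x-m)$.&lt;/p>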
&lt;h4 id="multivariate-diracsche-deltafunktion">Multivariate Diracsche Deltafunktion&lt;/h4>
&lt;p>Dirac mixture density&lt;/p>
$$
f(x)=\sum_{i=1}^{L} \omega_{i} \delta \left(x-x_{i}\right)
$$
&lt;p>Multivariate Dirac density&lt;/p>
$$
\delta(\underline{x})=\delta\left(x_{1}\right) \cdot \delta\left(x_{2}\right) \cdot \ldots, \quad \underline{x}=\left[x_{1}, x_{2}, \ldots\right]^{\top}
$$
&lt;p>Multivariate Dirac mixture density&lt;/p>
$$
f(\underline{x})=\sum_{i=1}^{L} \omega_{i} \delta\left(\underline{x}-\underline{x}_{i}\right)
$$
&lt;h3 id="umrechnung-snv-rightarrow-allgemeine-gaußdichte">Converting the SNV $\rightarrow$ General Gaussian Density&lt;/h3>
&lt;p>(SNV = standard normal distribution $\mathcal{N}(0, 1)$, German: Standardnormalverteilung)&lt;/p>
&lt;ul>
&lt;li>A natural solution to the problem&lt;/li>
&lt;li>Several approaches exist, differing in complexity and efficiency&lt;/li>
&lt;/ul>
&lt;p>Assume we have an approximation method that can approximate a standard normal distribution in multiple (higher) dimensions.&lt;/p>
&lt;ul>
&lt;li>Given: Gaussian density with $\underline{\hat{x}}=\underline{0}$
and $\mathbf{C}_x = \mathbf{I}_N$
($N$-dimensional identity matrix)&lt;/li>
&lt;li>Sought: density with arbitrary mean $\underline{\hat{y}}$
and covariance matrix $\mathbf{C}_y$
&lt;/li>
&lt;/ul>
&lt;p>We apply the &lt;strong>Cholesky decomposition&lt;/strong>&lt;/p>
$$
\mathbf{C}_{y}=\mathcal{C}_{y} \cdot \mathcal{C}_{y}^{\top}
$$
&lt;p>where $\mathcal{C}_y$
is a lower triangular matrix.&lt;/p>
&lt;p>Conversion&lt;/p>
$$
\underline{y}=\mathcal{C}_{y} \cdot \underline{x}+\underline{\hat{y}}
$$
&lt;p>Proof:&lt;/p>
$$
E\{\underline{y}\}=E\left\{\mathcal{C}_{y} \cdot \underline{x}+\underline{\hat{y}}\right\}=\mathcal{C}_{y} \underbrace{E\{\underline{x}\}}_{=\underline{0}}+\underbrace{E\{\underline{\hat{y}}\}}_{=\underline{\hat{y}}}=\underline{\hat{y}}
$$
$$
\begin{aligned}
\operatorname{Cov}\{\underline{y}\} &amp;=E\left\{(\underline{y}-E\{\underline{y}\})(\underline{y}-E\{\underline{y}\})^{\top}\right\} \\
&amp;=E\left\{(\underline{y}-\underline{\hat{y}})(\underline{y}-\underline{\hat{y}})^{\top}\right\} \\
&amp;=E\left\{\mathcal{C}_{y} \cdot \underline{x} \cdot \underline{x}^{\top} \mathcal{C}_{y}^{\top}\right\}\\
&amp;=\mathcal{C}_{y} \cdot \underbrace{E\left\{\underline{x}\underline{x}^{\top}\right\}}_{=\mathbf{C}_{x}=\mathbf{I}_{N}} \cdot \mathcal{C}_{y}^{\top} \\
&amp;=\mathcal{C}_{y} \cdot \mathbf{I}_{N} \cdot \mathcal{C}_{y}^{\top}=\mathcal{C}_{y} \cdot \mathcal{C}_{y}^{\top} = \mathbf{C}_{y}
\end{aligned}
$$
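&lt;p>The conversion can be sketched in plain Python for the 2D case. This is a minimal illustration under stated assumptions: the helper names, the example values for $\underline{\hat{y}}$ and $\mathbf{C}_y$, and the four axis samples at $\pm\sqrt{2}$ (a Dirac mixture with zero mean and identity covariance) are chosen here for demonstration and are not taken from the lecture.&lt;/p>

```python
import math

def cholesky_2x2(c):
    """Lower-triangular factor l with l * l^T == c for a symmetric positive definite 2x2 matrix."""
    l11 = math.sqrt(c[0][0])
    l21 = c[1][0] / l11
    l22 = math.sqrt(c[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def transform(sample, chol, mean):
    """Apply y = chol * x + mean to one 2D sample."""
    return [
        chol[0][0] * sample[0] + chol[0][1] * sample[1] + mean[0],
        chol[1][0] * sample[0] + chol[1][1] * sample[1] + mean[1],
    ]

def weighted_moments(samples, weights):
    """Mean vector and covariance matrix of a weighted 2D sample set."""
    mean = [sum(w * s[i] for s, w in zip(samples, weights)) for i in range(2)]
    cov = [
        [sum(w * (s[i] - mean[i]) * (s[j] - mean[j]) for s, w in zip(samples, weights))
         for j in range(2)]
        for i in range(2)
    ]
    return mean, cov

# Assumed input: four equally weighted samples with zero mean and identity covariance.
r = math.sqrt(2.0)
x_samples = [[-r, 0.0], [r, 0.0], [0.0, -r], [0.0, r]]
weights = [0.25] * 4

# Hypothetical target moments.
y_mean = [1.0, -2.0]
c_y = [[4.0, 1.2], [1.2, 1.0]]

chol = cholesky_2x2(c_y)
y_samples = [transform(x, chol, y_mean) for x in x_samples]
mean, cov = weighted_moments(y_samples, weights)
```

&lt;p>The transformed samples reproduce the requested mean and covariance, which is exactly what the proof above states for the moments.&lt;/p>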
&lt;h4 id="minimale-approximation-snv-auf-hauptachsen">Minimal Approximation of the SNV on the Principal Axes&lt;/h4>
&lt;p>&lt;strong>1D case&lt;/strong>&lt;/p>
&lt;p>Let the true density $\tilde{f}(x)$
be a 1D standard normal distribution (SNV). We want to represent it by a Dirac mixture&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-SNV_approx_1D.drawio.png" alt="wertkontinuierliche_nichtlineare_system-SNV_approx_1D.drawio" style="zoom:67%;" />
$$
f(x)=w_{1} \delta\left(x-x_{1}\right)+w_{2} \delta\left(x-x_{2}\right) \qquad w_{1}, w_{2} \geqslant 0
$$
&lt;p>The Gaussian density is symmetric $\Rightarrow$&lt;/p>
$$
w_{1}=w_{2}=w, \quad x_{1}=-x_{2}
$$
&lt;p>The integral must equal 1:&lt;/p>
$$
\int_{\mathbb{R}} f(x) d x=w_{1}+w_{2}=2 w \stackrel{!}{=} 1 \Rightarrow w=\frac{1}{2}
$$
&lt;p>Expected value:&lt;/p>
$$
E_{f}\{x\}=0=E_{\tilde{f}}\{x\}
$$
&lt;p>Variance:&lt;/p>
$$
E_{f}\left\{x^{2}\right\}=\int_{\mathbb{R}} x^{2} f(x) d x=w x_{1}^{2}+w x_{2}^{2}=2 w x_{1}^{2} \stackrel{!}{=} 1 \Rightarrow x_{1}^{2}=1 \Rightarrow x_1 = -1, x_2 = 1
$$
&lt;p>&lt;strong>2D case&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/wertkontinuierliche_nichtlineare_system-SNV_approx_2D.drawio.png" alt="wertkontinuierliche_nichtlineare_system-SNV_approx_2D.drawio" style="zoom:67%;" />
$$
\begin{aligned}
f(x, y)=&amp; w_{1} \delta\left(x-x_{1}\right) \delta(y)+w_{2} \delta\left(x-x_{2}\right) \delta(y) &amp; w_{1}, w_{2} \geqslant 0 \\
&amp;+v_{1} \delta(x) \delta\left(y-y_{1}\right)+v_{2} \delta(x) \delta\left(y-y_{2}\right) &amp; v_{1}, v_{2} \geqslant 0
\end{aligned}
$$
&lt;p>Symmetry $\Rightarrow$&lt;/p>
$$
w_{1}=w_{2}=v_{1}=v_{2}=w, \quad x_{1}=-x_{2}, \quad y_{1}=-y_{2}
$$
&lt;p>Integral = 1&lt;/p>
$$
\int_{\mathbb{R}^{2}} f(x, y) d x d y=w\left\{\int_{\mathbb{R}} \delta\left(x-x_{1}\right) d x \int_{\mathbb{R}} \delta(y) d y+\ldots\right\}=4 w \stackrel{!}{=} 1 \Rightarrow w=\frac{1}{4}
$$
&lt;p>Variance&lt;/p>
$$
\iint_{\mathbb{R}^{2}} x^{2} f(x, y) d x d y=w x_{1}^{2}+w x_{2}^{2}=2 w x_{1}^{2} \stackrel{!}{=} 1 \Rightarrow x_{1}^{2}=2 \Rightarrow x_1 = -\sqrt{2}, x_2 = \sqrt{2}
$$
&lt;p>$x, y$ are uncorrelated but not independent:&lt;/p>
$$
f(x, y) \neq f(x) \cdot f(y), E\{x \cdot y\}=0
$$
&lt;p>&lt;strong>N-dimensional case&lt;/strong>&lt;/p>
$$
\begin{array}{c}
w=\frac{1}{2 N}, \quad \underline{x}=\left[x^{(1)}, x^{(2)}, \ldots, x^{(N)}\right]^{\top} \\
\Rightarrow x_{1}^{(i)}=-\sqrt{N}, \quad x_{2}^{(i)}=+\sqrt{N}, \quad i=1, \ldots, N
\end{array}
$$
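&lt;p>The $N$-dimensional rule can be checked with a short sketch. The function names &lt;code>axis_samples&lt;/code> and &lt;code>moments&lt;/code> are hypothetical helpers introduced only for this illustration; the code places two samples at $\pm\sqrt{N}$ on each principal axis with weight $\frac{1}{2N}$ and computes the resulting mean and covariance.&lt;/p>

```python
import math

def axis_samples(n):
    """Return 2n samples at +-sqrt(n) on each principal axis and the common weight 1/(2n)."""
    radius = math.sqrt(n)
    samples = []
    for i in range(n):
        for sign in (-1.0, 1.0):
            point = [0.0] * n
            point[i] = sign * radius
            samples.append(point)
    return samples, 1.0 / (2 * n)

def moments(samples, weight):
    """Mean vector and covariance matrix of an equally weighted sample set."""
    n = len(samples[0])
    mean = [sum(weight * s[i] for s in samples) for i in range(n)]
    cov = [
        [sum(weight * (s[i] - mean[i]) * (s[j] - mean[j]) for s in samples)
         for j in range(n)]
        for i in range(n)
    ]
    return mean, cov

samples, weight = axis_samples(3)
mean, cov = moments(samples, weight)
```

&lt;p>For $N = 3$ this gives six samples whose mean is the zero vector and whose covariance is the identity matrix, up to floating-point rounding.&lt;/p>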
&lt;h4 id="ablauf-des-filters-mit-sampling-der-prioren-dichte">Filter Procedure with Sampling of the Prior Density&lt;/h4>
&lt;p>Measurement function (example)&lt;/p>
$$
y = x^3 + v
$$
&lt;p>Prior estimate: Gaussian density $\tilde{f}_{p}(x)=\mathcal{N}\left(x, \hat{x}_{p}, \sigma_{p}^{2}\right)$
&lt;/p>
&lt;p>Noise: $v \sim \tilde{f}_v(v) = \mathcal{N}(v, 0, \sigma_v^2)$
&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-09-12%2022.29.08.png" alt="截屏2022-09-12 22.29.08" style="zoom: 33%;" />
&lt;p>Approximation&lt;/p>
$$
f_{p}(x)=\frac{1}{2} \delta\left(x-x_{1}\right)+\frac{1}{2} \delta\left(x-x_{2}\right)
$$
&lt;p>where&lt;/p>
$$
x_1 = \hat{x}_p - \sigma_p \quad x_2 = \hat{x}_p + \sigma_p
$$
$$
f_v(v)=\frac{1}{2} \delta(v-v_{1})+\frac{1}{2} \delta(v-v_{2}), \qquad v_{1}=-\sigma_{v}, \quad v_{2}=+\sigma_{v}
$$
&lt;p>Then&lt;/p>
$$
y_{i j}=x_{i}^{3}+v_{j} \qquad i=1,2 , j=1,2
$$
&lt;p>We draw 2 samples each for $x$ and $v$. This gives 4 pairs $(x, y)$: $(x_1, y_{11}), (x_1, y_{12}), (x_2, y_{21}), (x_2, y_{22})$, i.e. the 4 violet points in the figure.&lt;/p>
&lt;p>We assume that $x$ and $y$ are jointly Gaussian. From these 4 points we compute the mean and the covariance and fit a Gaussian density (moment matching).&lt;/p>
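&lt;p>The procedure can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the function name &lt;code>sampled_filter_step&lt;/code>, the numerical values, and the use of the standard Kalman update with the sample moments $C_{xy}$ and $C_{yy}$ are assumptions made for this example.&lt;/p>

```python
def sampled_filter_step(x_p, sigma_p, sigma_v, y_hat):
    """One filter step for y = x^3 + v using two samples each for x and v."""
    xs = [x_p - sigma_p, x_p + sigma_p]          # samples of the prior density
    vs = [-sigma_v, sigma_v]                     # samples of the noise density
    pairs = [(x, x ** 3 + v) for x in xs for v in vs]   # the four (x, y) points
    weight = 1.0 / len(pairs)

    # Moment matching: mean and covariance of the joint (x, y) samples.
    mean_x = sum(weight * x for x, _ in pairs)
    mean_y = sum(weight * y for _, y in pairs)
    c_xx = sum(weight * (x - mean_x) ** 2 for x, _ in pairs)
    c_xy = sum(weight * (x - mean_x) * (y - mean_y) for x, y in pairs)
    c_yy = sum(weight * (y - mean_y) ** 2 for _, y in pairs)

    # Kalman update of the fitted joint Gaussian with the measurement y_hat.
    gain = c_xy / c_yy
    x_e = mean_x + gain * (y_hat - mean_y)
    c_e = c_xx - gain * c_xy
    return {"mean_y": mean_y, "c_xy": c_xy, "c_yy": c_yy, "x_e": x_e, "c_e": c_e}

# Hypothetical numbers chosen only for demonstration.
result = sampled_filter_step(x_p=1.0, sigma_p=0.5, sigma_v=0.1, y_hat=2.0)
```

&lt;p>With only four points the fitted Gaussian is a coarse approximation of the true joint density; the sketch only shows the order of the steps.&lt;/p>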
&lt;p>Wir haben auch die Messung $\hat{y}$, die diese approximierte Gaußdichte schneidet. Mit $\hat{y}$ können wir jetzt den probabilistischen Kalman Filter durchführen.&lt;/p></description></item><item><title>Ensemble Kalmanfilter (EnKF)</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/ensemble_kalmanfilter/</link><pubDate>Fri, 15 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/wertekontinuierliche_nichtlineare_systeme/ensemble_kalmanfilter/</guid><description>&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Prediction step of the nonlinear Kalman filter (NLKF) $\rightarrow$ specifically, the sample-based variant&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF.drawio.png" alt="EnKF.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Re-approximation with a Gaussian density $\rightarrow$ additional information is lost&lt;/p>
&lt;p>If no measurements are available and several prediction steps follow one another $\rightarrow$ the approximation can be omitted temporarily&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF-EnKF_Motivation.drawio.png" alt="EnKF-EnKF_Motivation.drawio" style="zoom:67%;" />
&lt;p>Filter step of the NLKF&lt;/p>
$$
\begin{array}{l}
\underline{\hat{x}}_{e}=\underline{\hat{x}}_{p}+\mathbf{C}_{x y} \mathbf{C}_{y y}^{-1}(\underline{\hat{y}}-E\{\underline{h}(\underline{x})\}) \\
\mathbf{C}_{e}=\mathbf{C}_{p}-\mathbf{C}_{x y} \mathbf{C}_{y y}^{-1} \mathbf{C}_{y x}
\end{array}
$$
&lt;p>where&lt;/p>
$$
\begin{array}{ll}
\mathbf{C}_{x x}=\mathbf{C}_{p} \in \mathbb{R}^{N \times N}\quad &amp;\mathbf{C}_{x y} \in \mathbb{R}^{N \times M} \\
\mathbf{C}_{y x} \in \mathbb{R}^{M \times N}\quad &amp;\mathbf{C}_{yy} \in \mathbb{R}^{M \times M}
\end{array}
$$
&lt;p>Regardless of the chosen form of moment computation $\rightarrow$ high cost for computing and storing the covariance matrices 🤪&lt;/p>
&lt;h2 id="idee">Idea&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Keep the samples after the prediction step $\rightarrow$ no re-approximation by a Gaussian&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The shape information is thus preserved and the uncertainty is stored in the samples.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Storage complexity&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Kalman filter (KF)&lt;/p>
&lt;ul>
&lt;li>Expected value: $N$&lt;/li>
&lt;li>Covariance matrix: $\frac{N(N+1)}{2}$&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ In total $\frac{N^2 + 3N}{2}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>EnKF&lt;/p>
&lt;ul>
&lt;li>One sample: $N$&lt;/li>
&lt;li>$L$ samples: $L \cdot N$ (e.g. with sampling on the principal axes, $L = 2N \rightarrow 2N^2$)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>But: it saves the effort of computing the covariance matrix&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🎯 Goal: recursive computation of the prediction step&lt;/p>
&lt;/li>
&lt;/ul>
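&lt;p>As a quick worked comparison of the two storage counts above (the value $N = 10$ is chosen purely for illustration):&lt;/p>
$$
\text{KF: } \frac{N^{2}+3 N}{2}=\frac{100+30}{2}=65, \qquad \text{EnKF with } L=2 N: \; 2 N^{2}=200
$$
&lt;p>In this example the ensemble needs roughly three times the memory; the benefit lies in avoiding the covariance computations.&lt;/p>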
&lt;h2 id="herausforderungen">Challenges&lt;/h2>
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$L$ Samples $\underline{x}_{k, i}, i = 1, \dots, L$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>System map&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k, \underline{w}_k)
$$
&lt;/li>
&lt;/ul>
&lt;p>Sought: $L^\prime$ samples $\underline{x}_{k+1, i}, i = 1, \dots, L^\prime$
&lt;/p>
&lt;p>We need samples for $\underline{w}_k$: $\underline{w}_{k, j}, j = 1, \dots, Q$
&lt;/p>
&lt;p>‼️ Problem: mapping every combination of samples $\Rightarrow$ &lt;span style="color: Red">Cartesian product!!! $\Rightarrow$ the number of samples grows &lt;em>exponentially&lt;/em> under recursive prediction!!!&lt;/span>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF-EnKF_Herausforderung.drawio.png" alt="EnKF-EnKF_Herausforderung.drawio" style="zoom:67%;" />
&lt;p>&lt;strong>Solution idea: limit the number of samples&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Goal: an adjustable number of samples $\rightarrow$ so that the complexity can be controlled&lt;/li>
&lt;li>Simple case: a constant number of samples over time&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Approach 1: Via reduction&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prior&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF-Reduktion.drawio.png" alt="EnKF-Reduktion.drawio" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>Posterior (i.e. reduction of $\underline{x}_{k+1, i}$
)&lt;/p>
&lt;ul>
&lt;li>requires $L \cdot Q$ mappings&lt;/li>
&lt;li>but the result is often better&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Approach 2: Selecting pairs with Latin Hypercube Sampling (LHS)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Each row and each column may contain ONLY &lt;em>one&lt;/em> element&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF-LHS.drawio.png" alt="EnKF-LHS.drawio" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>The optimal choice is difficult&lt;/p>
&lt;ul>
&lt;li>A discrete quality measure is usually rather complicated&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Trivial practical implementation: draw (a constant number of) samples from $\underline{w}_k$
for each $\underline{x}_{k, i}$
(but this works poorly with few samples)&lt;/p>
&lt;p>Arrangement&lt;/p>
$$
\mathcal{X}_{k}=[\underbrace{\underline{x}_{k, 1}}_{\mathbb{R}^N}, \underline{x}_{k, 2}, \ldots, \underline{x}_{k, L}] \in \mathbb{R}^{N \times L}, \quad \mathcal{W}_{k}=\left[\underline{w}_{k, 1}, \underline{w}_{k, 2}, \ldots, \underline{w}_{k, L}\right] \in \mathbb{R}^{N \times L}
$$
&lt;blockquote>
&lt;p>Each $\underline{x}_{k, i}$
and $\underline{w}_{k, j}$
is a vector.&lt;/p>
&lt;/blockquote>
&lt;p>Overloading $\underline{a}_k$
to act on the sample matrices:&lt;/p>
$$
\mathcal{X}_{k+1} = \underline{a}_k(\mathcal{X}_{k}, \mathcal{W}_{k})
$$
&lt;/li>
&lt;/ul>
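&lt;p>The difference between the Cartesian product and the trivial pairing can be sketched as follows. The scalar map &lt;code>system_map&lt;/code>, the noise values and the helper names are hypothetical and serve only to show how the sample count develops over recursive prediction steps.&lt;/p>

```python
import random

def system_map(x, w):
    """Hypothetical scalar system a_k(x, w); chosen only for illustration."""
    return 0.9 * x + 0.1 * x * x + w

def predict_cartesian(xs, ws):
    """Map every (x, w) combination: the sample count is multiplied by len(ws) in every step."""
    return [system_map(x, w) for x in xs for w in ws]

def predict_paired(xs, rng, sigma_w):
    """Draw one fresh noise sample per state sample: the sample count stays constant."""
    return [system_map(x, rng.gauss(0.0, sigma_w)) for x in xs]

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(4)]   # L = 4 prior samples
ws = [-0.1, 0.1]                               # Q = 2 noise samples

cartesian = xs
paired = xs
for _ in range(3):                             # three recursive prediction steps
    cartesian = predict_cartesian(cartesian, ws)
    paired = predict_paired(paired, rng, 0.1)
```

&lt;p>After three steps the Cartesian variant holds $L \cdot Q^3$ samples, while the paired variant still holds $L$ samples.&lt;/p>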
&lt;h2 id="filterschritt">Filter Step&lt;/h2>
&lt;p>🎯 Goal&lt;/p>
&lt;ul>
&lt;li>Perform the filter step using ONLY samples
&lt;ul>
&lt;li>Direct conversion of the prior samples into posterior samples&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Avoid using the update formulas for the covariance matrix
&lt;ul>
&lt;li>Uncertainties are represented purely by samples&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Linear measurement map&lt;/p>
$$
\underline{y}=\mathbf{H} \cdot \underline{x}+\underline{v}
$$
&lt;p>For a given measurement $\underline{\hat{y}}$:&lt;/p>
$$
\underbrace{\underline{\hat{y}}-\underline{v}}_{=:\hat{\mathcal{Y}}}=\mathbf{H} \cdot \underline{x}
$$
&lt;p>Measurement sample set:&lt;/p>
$$
\hat{\mathcal{Y}}=\underline{\hat{y}} \cdot \underline{\mathbb{1}}^{\top}-\mathcal{V} \qquad \mathcal{V}=\left[\underline{v}_{1}, \underline{v}_{2}, \ldots, \underline{v}_{L}\right]
$$
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/EnKF-Mess-sampleset.drawio.png" alt="EnKF-Mess-sampleset.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>This yields the state update in &amp;ldquo;combination form&amp;rdquo;&lt;/p>
$$
\mathcal{X}_{e}=(\mathbf{I}-\mathbf{K} \mathbf{H}) \mathcal{X}_{p}+\mathbf{K} \mathcal{\hat{Y}}
$$
&lt;blockquote>
&lt;p>$\mathcal{X}$ and $\mathcal{Y}$ are matrices&lt;/p>
&lt;/blockquote>
&lt;p>This would be restricted to additive noise, but it works directly for a nonlinear measurement map $\underline{y}=\underline{h}(\underline{x}, \underline{v})$.&lt;/p>
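&lt;p>A scalar sketch of this sample-based update is given below. It is an illustration under assumptions: the gain is estimated from the samples as $K = C_{xy} C_{yy}^{-1}$ (the form used in the NLKF filter step), and the function names and numbers are hypothetical. The code evaluates the combination form and, for comparison, the equivalent correction $\mathcal{X}_p + K(\hat{y} - \mathcal{Y})$ with predicted measurement samples $\mathcal{Y} = H \mathcal{X}_p + \mathcal{V}$.&lt;/p>

```python
import random

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def enkf_update(x_prior, y_hat, h, sigma_v, rng):
    """Scalar EnKF filter step for y = h * x + v, returning both update forms."""
    v = [rng.gauss(0.0, sigma_v) for _ in x_prior]       # one noise sample per state sample
    y_pred = [h * x + vi for x, vi in zip(x_prior, v)]   # predicted measurement samples
    y_meas = [y_hat - vi for vi in v]                    # measurement sample set
    gain = covariance(x_prior, y_pred) / covariance(y_pred, y_pred)
    combination = [(1.0 - gain * h) * x + gain * ym for x, ym in zip(x_prior, y_meas)]
    feedback = [x + gain * (y_hat - yp) for x, yp in zip(x_prior, y_pred)]
    return combination, feedback

rng = random.Random(1)
x_prior = [rng.gauss(0.0, 1.0) for _ in range(50)]
combination, feedback = enkf_update(x_prior, y_hat=2.0, h=1.0, sigma_v=0.5, rng=rng)
```

&lt;p>Both lists agree sample by sample up to rounding, since the two expressions differ only by rearranging terms.&lt;/p>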
&lt;p>Alternative derivation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Predicted measurement samples based on the prior samples and the noise samples:&lt;/p>
$$
\mathcal{Y} = \mathbf{H} \cdot \mathcal{X}_p + \mathcal{V}
$$
&lt;/li>
&lt;li>
&lt;p>State update in &amp;ldquo;feedback form&amp;rdquo;&lt;/p>
$$
\begin{aligned}
\mathcal{X}_e &amp;= \mathcal{X}_p + \mathbf{K}(\underbrace{\underline{\hat{y}} \cdot \underline{\mathbf{1}}^\top}_{\text{measured}} - \underbrace{\mathcal{Y}}_{\text{prediction}}) \\\\
&amp;= \mathcal{X}_p + \mathbf{K}(\underline{\hat{y}} \cdot \underline{\mathbf{1}}^\top - \mathbf{H} \mathcal{X}_p - \mathcal{V})\\\\
&amp;= (\mathbf{I} - \mathbf{K}\mathbf{H})\mathcal{X}_p + \mathbf{K}(\underbrace{\underline{\hat{y}} \cdot \underline{\mathbf{1}}^\top - \mathcal{V}}_{=\hat{\mathcal{Y}}})
\end{aligned}
$$
&lt;/li>
&lt;/ul></description></item><item><title>Allgemeine Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/</link><pubDate>Sun, 17 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/</guid><description/></item><item><title>Motivation</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/motivation/</link><pubDate>Sun, 24 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/motivation/</guid><description>&lt;p>Bisher: Systeme immer durch Gaußdichte repräsentiert.&lt;/p>
&lt;p>The system equation&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_k (\underline{x}_k, \underline{w}_k)
$$
&lt;p>can be described by the transition density $f(\underline{x}_{k+1} | \underline{x}_k)$.&lt;/p>
&lt;p>The measurement equation&lt;/p>
$$
\underline{y}_k = \underline{h}_k (\underline{x}_k, \underline{v}_k)
$$
&lt;p>can be described by the likelihood $f(\underline{y}_k | \underline{x}_k)$.&lt;/p>
&lt;p>In general, for both equations:&lt;/p>
$$
\underline{z} = \underline{h}(\underline{x}, \underline{v})
$$
&lt;p>For linear systems, a representation by a Gaussian density $\mathcal{N}(x, \mu, \sigma)$
is adequate:&lt;/p>
$$
z = Hx + v
$$
&lt;p>Expected value&lt;/p>
$$
E(z | x) = Hx
$$
&lt;p>Covariance&lt;/p>
$$
\operatorname{Cov}(z \mid x)=E\left(\left[z-E(z|x)\right]^{2} \mid x\right)=\sigma_{v}^{2}
$$
&lt;p>Hence&lt;/p>
$$
\begin{aligned}
f(z \mid x) &amp;=\mathcal{N}\left(z, H \cdot x, \sigma_{v}\right) \\\\
&amp; \propto \exp \left(-\frac{1}{2} \frac{(z-H \cdot x)^{2}}{\sigma_{v}^{2}}\right) \quad | \text { Gaussian in } z \\\\
&amp; \propto \exp \left\{-\frac{1}{2} \frac{\left(x-\frac{z}{H}\right)^{2}}{\left(\sigma_{v} / H\right)^{2}}\right\} \quad | \text { Gaussian in } x \\\\
&amp; = \mathcal{N}(x, \frac{z}{H}, \frac{\sigma_v}{H})
\end{aligned}
$$
&lt;p>In general, however, for&lt;/p>
$$
z = h(x) + v
\tag{additive noise}
$$
&lt;p>$f(z|x)$ is NOT Gaussian in $x$!&lt;/p>
&lt;p>Wir benötigen eine Methode zur Berechnung von $f(z|x)$ im allgemeinen Fall. 💪&lt;/p></description></item><item><title>Dirac’sche Deltafunktion</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/dirac_funktion/</link><pubDate>Sun, 24 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/dirac_funktion/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Mehr zu Dirac&amp;rsquo;sche Deltafunktion siehe:&lt;/span>
&lt;/div>
&lt;h2 id="eigenschaften">Properties&lt;/h2>
&lt;h3 id="symmetrie">Symmetry&lt;/h3>
$$
\delta (x) = \delta (-x)
$$
&lt;h3 id="skalierung">Scaling&lt;/h3>
$$
\delta (ax) = \frac{1}{|a|}\delta (x)
$$
&lt;h3 id="kompizierte-argumente">Complicated Arguments&lt;/h3>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots of $g$, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
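&lt;p>A short worked example of this rule (chosen here for illustration): for $g(x) = x^2 - c^2$ with $c > 0$, the roots are $x_1 = c$ and $x_2 = -c$, and $g^\prime(x) = 2x$ gives $|g^\prime(\pm c)| = 2c$. Therefore&lt;/p>
$$
\delta\left(x^{2}-c^{2}\right)=\frac{1}{2 c}\left[\delta(x-c)+\delta(x+c)\right]
$$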
&lt;h3 id="ableitung-der-heaviside-step-funktion">Derivative of the Heaviside Step Function&lt;/h3>
$$
\delta(x) = \frac{d}{dx} H(x)
$$
&lt;p>where $H(x)$ is the Heaviside step function&lt;/p>
$$
H(x):= \begin{cases}1, &amp; x>0 \\ 0, &amp; x \leq 0\end{cases}
$$
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/325px-Dirac_distribution_CDF.svg.png" alt="Dirac distribution CDF.svg" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p></description></item><item><title>Funktionen von Zufallsvariablen</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/funktion_zufallsvariable/</link><pubDate>Sun, 24 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/funktion_zufallsvariable/</guid><description>&lt;p>Abbildung&lt;/p>
$$
y = h(x)
$$
&lt;ul>
&lt;li>
&lt;p>Given: $x \sim f_x(x)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sought: $y \sim f_y(y)$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Joint density&lt;/p>
$$
f_{xy}(x, y) = f(y | x) \cdot f_x(x)
$$
&lt;p>$f(y|x)$ can be regarded as a probabilistic description of the mapping.&lt;/p>
&lt;p>Density of $y$&lt;/p>
$$
f_{y}(y)=\int_{\mathbb{R}} f_{x y}(x, y) d x=\int_{\mathbb{R}} f(y \mid x) \cdot f_{x}(x) d x
$$
&lt;p>Probabilistic mapping:&lt;/p>
$$
f(y|x) = \delta(y - h(x))
$$
&lt;p>It follows that&lt;/p>
$$
f_y(y)=\int_{\mathbb{R}} \delta(\underbrace{y-h(x)}_{g(x)}) f_x(x) d x
$$
&lt;h2 id="beispiel">Examples&lt;/h2>
&lt;h3 id="beispiel-1">Example 1&lt;/h3>
&lt;p>Given&lt;/p>
$$
y = \frac{1}{x} \qquad x \sim f_x(x)
$$
&lt;p>Probabilistic mapping:&lt;/p>
$$
f(y|x) = \delta(\underbrace{y - \frac{1}{x}}_{=g(x)})
$$
$$
g(x_1) = 0 \Rightarrow x_1 = \frac{1}{y} \qquad g^\prime(x) = \frac{1}{x^2}
$$
&lt;p>By the rule&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>we obtain&lt;/p>
$$
f(y|x) = \delta(y - \frac{1}{x}) = x_1^2 \delta(x - \frac{1}{y}) = \frac{1}{y^2} \delta(x - \frac{1}{y})
$$
$$
\begin{aligned}
f_y(y) &amp;= \int_{\mathbb{R}} f(y | x) \cdot f_{x}(x) d x \\
&amp;= \int_{\mathbb{R}} \frac{1}{y^2} \delta(x - \frac{1}{y}) f_{x}(x) d x \\
&amp;= \frac{1}{y^2} f_{x}(\frac{1}{y})
\end{aligned}
$$
&lt;p>For example, if $x$ is Gaussian with $f_x(x) = \frac{1}{\sqrt{\pi}} e^{-x^2}$
(mean $0$, variance $\frac{1}{2}$), the density of $y$ follows immediately:&lt;/p>
$$
f_y(y) = \frac{1}{\sqrt{\pi}} \frac{1}{y^2} e^{-\frac{1}{y^2}}
$$
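&lt;p>The result $f_y(y) = \frac{1}{y^2} f_x(\frac{1}{y})$ can be checked numerically. The sketch below assumes the normalized Gaussian $f_x(x) = \frac{1}{\sqrt{\pi}} e^{-x^2}$ and compares the probability of $y \in [1, 2]$, obtained by integrating $f_y$, with the probability of $x \in [0.5, 1]$, obtained from the error function. The helper &lt;code>midpoint_integral&lt;/code> is a simple quadrature written for this example.&lt;/p>

```python
import math

def f_x(x):
    """Gaussian density with mean 0 and variance 1/2, i.e. exp(-x^2) / sqrt(pi)."""
    return math.exp(-x * x) / math.sqrt(math.pi)

def f_y(y):
    """Density of y = 1 / x obtained from the sifting property of the Dirac delta."""
    return f_x(1.0 / y) / (y * y)

def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))

# y = 1 / x maps the interval [0.5, 1] onto [1, 2], so both probabilities must agree.
prob_from_f_y = midpoint_integral(f_y, 1.0, 2.0)
prob_from_f_x = 0.5 * (math.erf(1.0) - math.erf(0.5))
```

&lt;p>The two values agree up to the quadrature error, which supports the factor $\frac{1}{y^2}$ in the transformed density.&lt;/p>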
&lt;h3 id="beispiel-2-quadratic-function">Example 2: Quadratic Function&lt;/h3>
$$
\begin{aligned}
&amp;\delta(g(x))=\delta\left(y-a x^{2}\right), \quad a>0 \\
&amp;\Rightarrow g(x)=y-a x^{2} \\
&amp;\Rightarrow g^{\prime}(x)=-2 a x
\end{aligned}
$$
&lt;p>Case distinction&lt;/p>
$$
\begin{aligned}
&amp;g\left(x_{i}\right)=0 \\
&amp;y \geq 0: N=2, \quad x_{1}=\sqrt{\frac{y}{a}}, \quad x_{2}=-\sqrt{\frac{y}{a}} \\
&amp;y&lt;0: N=0, \quad \text { no roots. }
\end{aligned}
$$
$$
\begin{aligned}
f(y|x) &amp;= \delta\left(y-a x^{2}\right) \\\\
&amp;= \begin{cases}\frac{1}{\left|g^{\prime}\left(x_{1}\right)\right|} \delta\left(x-x_{1}\right)+\frac{1}{\left|g^{\prime}\left(x_{2}\right)\right|} \delta\left(x-x_{2}\right) &amp; , y \geq 0 \\
0 &amp; , y&lt;0\end{cases} \\\\
&amp;= \begin{cases}\frac{1}{2 \cdot \sqrt{a y}}\left(\delta\left(x-\sqrt{\frac{y}{a}}\right)+\delta\left(x+\sqrt{\frac{y}{a}}\right)\right) &amp; , y \geq 0 \\
0 &amp; , y&lt;0\end{cases} \\\\
\end{aligned}
$$
$$
\begin{aligned}
f_y(y) &amp;= \int_{\mathbb{R}} f(y | x) \cdot f_{x}(x) d x \\
&amp;= \frac{1}{2 \sqrt{a y}}\left\{f_{x}\left(-\sqrt{\frac{y}{a}}\right)+f_x\left(\sqrt{\frac{y}{a}}\right)\right\} \cdot u(y) \qquad u(y)= \begin{cases}1, &amp; y \geqslant 0 \\ 0, &amp; \text { otherwise }\end{cases}
\end{aligned}
$$
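&lt;p>This two-root formula can be verified numerically for a Gaussian $f_x$. The sketch assumes $x \sim \mathcal{N}(0, \sigma^2)$ and hypothetical values for $a$, $\sigma$ and the threshold $c$; it integrates $f_y$ over $(0, c]$ and compares the result with the probability that $|x| \leq \sqrt{c/a}$, which is available through the error function.&lt;/p>

```python
import math

def gaussian(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def f_y(y, a, sigma):
    """Density of y = a * x^2 for x ~ N(0, sigma^2), from the two-root formula; y must be positive."""
    root = math.sqrt(y / a)
    return (gaussian(-root, sigma) + gaussian(root, sigma)) / (2.0 * math.sqrt(a * y))

def prob_y_below(c, a, sigma, steps=20000):
    """Integrate f_y over (0, c] using y = t^2, which removes the 1/sqrt(y) singularity."""
    upper = math.sqrt(c)
    width = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += f_y(t * t, a, sigma) * 2.0 * t
    return total * width

a, sigma, c = 2.0, 1.5, 3.0   # hypothetical example values
from_density = prob_y_below(c, a, sigma)
from_gaussian = math.erf(math.sqrt(c / a) / (sigma * math.sqrt(2.0)))
```

&lt;p>The substitution $y = t^2$ is a design choice of this sketch: it makes the integrand smooth at the origin so that a plain midpoint rule is sufficient.&lt;/p>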
&lt;p>For $f_x(x) = \mathcal{N}(x, 0, \sigma)$
:&lt;/p>
$$
f_{y}(y)=\frac{1}{\sqrt{2 \pi a y}\, \sigma} \exp \left\{-\frac{1}{2} \frac{y}{a \sigma^{2}}\right\} u(y)
$$</description></item><item><title>Probabilistische Systemmodelle</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/probabilistische_systemmodelle/</link><pubDate>Sun, 24 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/probabilistische_systemmodelle/</guid><description>&lt;h2 id="mit-additivem-rauschen">Mit Additivem Rauschen&lt;/h2>
&lt;p>In general:&lt;/p>
$$
\underline{z} = \underline{a}(\underline{x}) + \underline{v}
$$
&lt;p>$\Rightarrow$&lt;/p>
$$
f(\underline{z} \mid \underline{x})=f_v(\underline{z}-\underline{a}(\underline{x}))
$$
&lt;p>Example:&lt;/p>
$$
z = x^2 + v \qquad v \sim f_v(v)
$$
&lt;p>Sought: $f(z|x)$&lt;/p>
$$
f(z \mid x, v)=\delta\left(z-x^{2}-v\right), \quad f(z, v \mid x)=f(z \mid x, v) \cdot f_v(v)
$$
$$
\begin{aligned}
f(z \mid x) &amp;\overset{\text{marginalization}}{=}\int_{\mathbb{R}} f(z, v \mid x) d v\\
&amp;=\int_{\mathbb{R}} f(z \mid x, v) \cdot f_v(v) d v \\
&amp;=\int_{\mathbb{R}} \delta\left(z-x^{2}-v\right) \cdot f_v(v) d v \\
&amp;=f_{v}\left(z-x^{2}\right)
\end{aligned}
$$
&lt;p>For the case&lt;/p>
$$
z = x_{k + 1}, \quad x = x_{k},
$$
&lt;p>the density&lt;/p>
$$
f(z \mid x) = f(x_{k+1} \mid x_k) = f_v(x_{k+1} - a(x_k))
\tag{additive}
$$
&lt;p>is called the &lt;strong>transition density&lt;/strong> (German: Transitionsdichte).&lt;/p>
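&lt;p>For additive noise, the transition density is simply the noise density evaluated at the residual $x_{k+1} - a(x_k)$. A minimal sketch, assuming zero-mean Gaussian noise with standard deviation $\sigma_v$ and the example map $a(x) = x^2$; the function names are hypothetical.&lt;/p>

```python
import math

def gaussian(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def transition_density(x_next, x_now, system_map, sigma_v):
    """f(x_next | x_now) = f_v(x_next - a(x_now)) for additive Gaussian noise."""
    return gaussian(x_next - system_map(x_now), sigma_v)

def square(x):
    """Example map a(x) = x^2, as in z = x^2 + v."""
    return x * x

peak = transition_density(4.0, 2.0, square, 0.5)       # evaluated at x_next = a(x_now)
off_peak = transition_density(5.0, 2.0, square, 0.5)   # one unit away from a(x_now)
```

&lt;p>Viewed as a function of $x_{k+1}$ for fixed $x_k$, this is a Gaussian centred at $a(x_k)$; viewed as a function of $x_k$ it is generally not Gaussian.&lt;/p>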
&lt;h2 id="mit-multiplikativem-rauschen">With Multiplicative Noise&lt;/h2>
&lt;p>Mapping&lt;/p>
$$
z = x \cdot v \quad v \sim \mathcal{N}(v, 0, \sigma_v)
$$
&lt;p>Assumption: $z, x, v$ are positive.&lt;/p>
&lt;p>Sought: $f(z \mid x)$&lt;/p>
&lt;p>Reduction to the additive case using $\log(\cdot)$:&lt;/p>
$$
\underbrace{\log (z)}_{\bar{z}}=\log (x \cdot v)=\underbrace{\log (x)}_{\bar{x}}+\underbrace{\log (v)}_{\bar{v}} \Leftrightarrow \bar{z}=\bar{x}+\bar{v}
$$
&lt;p>Density of $\bar{v} = \log(v)$
:&lt;/p>
$$
f(\bar{v} \mid v) = \delta(\bar{v} - \log(v)) = \exp(\bar{v})\delta(v - \exp(\bar{v}))
$$
$$
\begin{aligned}
f_{\bar{v}}(\bar{v}) &amp;= \int_{\mathbb{R}} f(\bar{v} \mid v) f_v(v) dv \\\\
&amp;= \int_{\mathbb{R}} \exp(\bar{v})\delta(v - \exp(\bar{v})) f_v(v) dv \\\\
&amp;= \exp(\bar{v}) f_v(\exp(\bar{v})) \\\\
&amp;= \frac{1}{\sqrt{2 \pi} \sigma_{v}} \exp (\bar{v}) \exp\left\{-\frac{1}{2} \frac{[\exp(\bar{v})]^{2}}{\sigma_{v}^{2}}\right\}
\end{aligned}
$$
&lt;p>Then&lt;/p>
$$
\begin{aligned}
f(\bar{z} \mid \bar{x}) &amp;= f_{\bar{v}}(\bar{z} - \bar{x}) \\
&amp;= \frac{1}{\sqrt{2 \pi} \sigma_{v}} \exp \{\bar{z} - \bar{x}\} \exp\left\{-\frac{1}{2} \frac{[\exp(\bar{z} - \bar{x})]^{2}}{\sigma_{v}^{2}}\right\}
\end{aligned}
$$
$$
\begin{aligned}
z = \exp\{\bar{z}\} &amp;\Rightarrow g(\bar{z}) = z - \exp(\bar{z}) \\
&amp;\Rightarrow g^{\prime}(\bar{z}) = -\exp(\bar{z}) \quad \text{root}: \bar{z} = \log(z)
\end{aligned}
$$
$$
f(z \mid \bar{x}) = \frac{1}{|z|} f(\log(z) \mid \bar{x})
$$
&lt;p>$x = \exp(\bar{x}) \Rightarrow$&lt;/p>
$$
f(z \mid x)=\frac{1}{\sqrt{2 \pi} \sigma_{v}} \frac{1}{|x|} \exp \left\{-\frac{1}{2} \frac{z^{2}}{\sigma_{v}^{2} x^{2}}\right\}
$$
&lt;p>&lt;strong>Direct solution:&lt;/strong>&lt;/p>
$$
f(z \mid x, v) = \delta(z - x \cdot v)
$$
$$
f(z, v \mid x) = f(z \mid x, v) \cdot f_v(v) = \delta(z - x \cdot v) f_v(v)
$$
$$
f(z \mid x) = \int_{\mathbb{R}} f(z, v \mid x) dv = \int_{\mathbb{R}}\delta(z - x \cdot v) f_v(v) dv
$$
&lt;p>Set&lt;/p>
$$
\begin{aligned}
g(v) := z - xv &amp;\Rightarrow g^\prime(v) = -x, \quad \text{root } v = \frac{z}{x}
\end{aligned}
$$
&lt;p>Hence&lt;/p>
$$
\begin{aligned}
f(z \mid x)&amp;=\int_{\mathbb{R}} \frac{1}{|x|} \delta\left(v-\frac{z}{x}\right) \cdot f_v(v) d v \\
&amp;=\frac{1}{|x|} \cdot f_v\left(\frac{z}{x}\right) \qquad \qquad (\text{multiplicative})
\end{aligned}
$$
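&lt;p>The multiplicative result can be sketched directly. Assuming zero-mean Gaussian noise $v$ with standard deviation $\sigma_v$ and a nonzero $x$, the formula $\frac{1}{|x|} f_v(\frac{z}{x})$ should coincide with a zero-mean Gaussian in $z$ with standard deviation $|x| \sigma_v$. The function names and numbers below are hypothetical.&lt;/p>

```python
import math

def gaussian(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def multiplicative_likelihood(z, x, sigma_v):
    """f(z | x) = f_v(z / x) / |x| for z = x * v with v ~ N(0, sigma_v^2); x must be nonzero."""
    return gaussian(z / x, sigma_v) / abs(x)

value = multiplicative_likelihood(1.2, -3.0, 0.5)
reference = gaussian(1.2, 3.0 * 0.5)   # Gaussian in z with standard deviation |x| * sigma_v
```

&lt;p>The absolute value in $\frac{1}{|x|}$ keeps the density positive for negative $x$ as well.&lt;/p>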
&lt;h3 id="mixed-additive-and-multiplicative-noise-script-chp-922">Mixed Additive and Multiplicative Noise (Script Chp. 9.2.2)&lt;/h3>
&lt;p>System equation&lt;/p>
$$
x_{k+1} = x_k v_k + w_k
$$
&lt;p>with additive noise $w_k$ and multiplicative noise $v_k$. The noise terms are jointly distributed according to $f_{k}^{vw}(v_k, w_k)$.&lt;/p>
&lt;p>The joint density of the state at time step $k+1$ is&lt;/p>
$$
f\left(x_{k+1}, v_{k}, w_{k} \mid x_{k}\right)=f\left(x_{k+1} \mid x_{k}, v_{k}, w_{k}\right) f_{k}^{v w}\left(v_{k}, w_{k}\right),
$$
&lt;p>where according to the system equation the density of the state at time step $k + 1$ conditioned on the state at time step $k$ and the noise terms $v_k$ and $w_k$ is&lt;/p>
$$
f(x_{k+1} \mid x_{k}, v_{k}, w_{k}) = \delta(x_{k+1} - x_{k}v_{k} - w_{k}).
$$
&lt;p>The desired transition density is now given by&lt;/p>
$$
\begin{aligned}
f\left(x_{k+1} \mid x_{k}\right) &amp;=\int_{\mathbb{R}} \int_{\mathbb{R}} f\left(x_{k+1}, v_{k}, w_{k} \mid x_{k}\right) d w_{k} d v_{k} \\
&amp;=\int_{\mathbb{R}} \int_{\mathbb{R}} \delta\left(x_{k+1}-x_{k} v_{k}-w_{k}\right) f_{k}^{v w}\left(v_{k}, w_{k}\right) \mathrm{d} w_{k} \mathrm{~d} v_{k}\\
&amp;\overset{\text{additive}}{=}\int_{\mathbb{R}} f_{k}^{v w}\left(v_{k}, x_{k+1}-x_{k} v_{k}\right) \mathrm{d} v_{k} \\
&amp;=\int_{\mathbb{R}} f_{k}^{v}\left(v_{k}\right) f_{k}^{w}\left(x_{k+1}-x_{k} v_{k}\right) \mathrm{d} v_{k} \qquad (v_k, w_k \text{ independent})
\end{aligned}
$$
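&lt;p>The last integral can be evaluated numerically. The sketch below assumes independent zero-mean Gaussian noise terms with standard deviations $\sigma_v$ and $\sigma_w$, a special case chosen only because a closed form exists for comparison: given $x_k$, the sum $x_k v_k + w_k$ is then Gaussian with variance $x_k^2 \sigma_v^2 + \sigma_w^2$. The function names and numbers are hypothetical.&lt;/p>

```python
import math

def gaussian(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def mixed_transition_density(x_next, x_now, sigma_v, sigma_w, steps=4000, span=8.0):
    """Midpoint-rule approximation of the integral over v of f_v(v) * f_w(x_next - x_now * v)."""
    lower = -span * sigma_v
    width = 2.0 * span * sigma_v / steps
    total = 0.0
    for k in range(steps):
        v = lower + (k + 0.5) * width
        total += gaussian(v, sigma_v) * gaussian(x_next - x_now * v, sigma_w)
    return total * width

x_next, x_now, sigma_v, sigma_w = 0.7, 1.5, 0.4, 0.3
numeric = mixed_transition_density(x_next, x_now, sigma_v, sigma_w)
closed_form = gaussian(x_next, math.sqrt((x_now * sigma_v) ** 2 + sigma_w ** 2))
```

&lt;p>For other noise densities the same quadrature applies, but no closed form is generally available for comparison.&lt;/p>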
&lt;p>These expressions cannot in general be solved analytically.&lt;/p></description></item><item><title>Abstraktion</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/abstraktion/</link><pubDate>Wed, 27 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/abstraktion/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Skript 10.1, 10.2&lt;/span>
&lt;/div>
&lt;h2 id="abstrahierte-systembeschreibung--eigenschaften">Abstract System Description &amp;amp; Properties&lt;/h2>
&lt;p>All components of a system can be described by&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-07-27%2022.02.34.png" alt="截屏2022-07-27 22.02.34" style="zoom:50%;" />
&lt;p>($\underline{a} \in \mathbb{R}^A, \underline{b}\in \mathbb{R}^B$
).&lt;/p>
&lt;p>Causality: $a$ (cause) produces $b$ (effect).&lt;/p>
&lt;p>For a given $\underline{a}$, $f(\underline{b} \mid \cdot)$, viewed as a function of $\underline{b}$, is called the &lt;mark>&lt;strong>transition density&lt;/strong>&lt;/mark>.&lt;/p>
&lt;p>For a given $\underline{b}$, $f(\cdot \mid \underline{a})$, viewed as a function of $\underline{a}$, is called the &lt;mark>&lt;strong>likelihood&lt;/strong>&lt;/mark>.&lt;/p>
&lt;h3 id="eigenschaften-von-probabilistischer-systembeschreibung">Properties of the Probabilistic System Description&lt;/h3>
&lt;p>In general, it holds that&lt;/p>
$$
\int_{\mathbb{R}^{B}} f(\underline{b} \mid \underline{a}) d \underline{b}=1 \quad \forall \underline{a}
$$
&lt;p>However, in general&lt;/p>
$$
\int_{\mathbb{R}^{A}} f(\underline{b} \mid \underline{a}) d \underline{a} \neq 1,
$$
&lt;p>and the integral may not even be defined.&lt;/p>
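&lt;p>A short worked example (an illustration, not taken from the script): for the scalar linear model $b = H a + v$ with $v \sim \mathcal{N}(v, 0, \sigma_v)$ and $H \neq 0$, the density is $f(b \mid a) = \mathcal{N}(b, H a, \sigma_v)$. Integrating over $b$ gives 1, whereas integrating over $a$ with the substitution $u = H a$ gives&lt;/p>
$$
\int_{\mathbb{R}} f(b \mid a) d a=\frac{1}{|H|} \int_{\mathbb{R}} \mathcal{N}\left(b, u, \sigma_{v}\right) d u=\frac{1}{|H|}
$$
&lt;p>For $H = 0$ the density $f(b \mid a)$ does not depend on $a$ at all, so the integral over $a$ diverges.&lt;/p>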
&lt;h2 id="vorwarts-ruckwartsinferenz">Forward/Backward Inference&lt;/h2>
&lt;p>&lt;strong>Forward inference&lt;/strong>&lt;/p>
&lt;p>&lt;em>&amp;ldquo;Given information about $\underline{a}$, we desire information about $\underline{b}$.&amp;rdquo;&lt;/em>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-07-27%2022.18.55.png" alt="截屏2022-07-27 22.18.55" style="zoom:50%;" />
&lt;ul>
&lt;li>Given: a value $\underline{\hat{a}}$ or a density $f(\underline{a})$&lt;/li>
&lt;li>Sought: $f(\underline{b})$&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Backward inference&lt;/strong>&lt;/p>
&lt;p>&lt;em>&amp;ldquo;Information about the output $\underline{b}$ is given and we desire to reconstruct an appropriate description of $\underline{a}$.&amp;rdquo;&lt;/em>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-07-27%2022.19.17.png" alt="截屏2022-07-27 22.19.17" style="zoom:50%;" />
&lt;ul>
&lt;li>Given: a value $\underline{\hat{b}}$ or a density $f(\underline{b})$&lt;/li>
&lt;li>Sought: $f(\underline{a})$&lt;/li>
&lt;/ul>
&lt;h2 id="vorwartsinferenz">Forward Inference&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise sheet, problem 9.1&lt;/span>
&lt;/div>
&lt;p>&lt;strong>Assumption: NO prior knowledge about $f(\underline{b})$&lt;/strong>&lt;/p>
&lt;p>Consider a simple generative system map:&lt;/p>
$$
\underline{b} = \underline{g}(\underline{a}) \quad \underline{a} \in \mathbb{R}^A, \underline{b} \in \mathbb{R}^B
$$
&lt;p>Probabilistic system map:&lt;/p>
$$
f(\underline{b} \mid \underline{a}) = \delta(\underline{b} - \underline{g}(\underline{a}))
$$
&lt;p>Marginalization yields:&lt;/p>
$$
\begin{aligned}
f(\underline{b}) &amp;= \int_{\mathbb{R}^A} f(\underline{a}, \underline{b}) d\underline{a} \\\\
&amp;= \int_{\mathbb{R}^A} f(\underline{b} \mid \underline{a}) f(\underline{a}) d\underline{a} \\\\
&amp;= \int_{\mathbb{R}^A} \delta(\underline{b} - \underline{g}(\underline{a})) f(\underline{a}) d\underline{a}
\end{aligned}
$$
&lt;p>Further simplification is possible ONLY for a concrete $\underline{g}(\cdot)$.&lt;/p>
&lt;p>In the special case where a fixed value $\underline{\hat{a}}$ is given, we have&lt;/p>
$$
f(\underline{a}) = \delta(\underline{a} - \underline{\hat{a}})
$$
&lt;p>Hence&lt;/p>
$$
\begin{aligned}
f(\underline{b}) &amp;= \int_{\mathbb{R}^A} \delta(\underline{b} - \underline{g}(\underline{a})) f(\underline{a}) d\underline{a} \\\\
&amp;= \int_{\mathbb{R}^A} \delta(\underline{b} - \underline{g}(\underline{a})) \delta(\underline{a} - \underline{\hat{a}}) d\underline{a} \\\\
&amp;= \delta(\underline{b} - \underbrace{\underline{g}(\underline{\hat{a}})}_{\underline{\hat{b}}})
\end{aligned}
$$
&lt;p>As expected, the result is&lt;/p>
$$
f(\underline{b}) = \delta(\underline{b} - \underline{\hat{b}})
$$
&lt;p>with $\underline{\hat{b}} = \underline{g}(\underline{\hat{a}})$.&lt;/p>
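&lt;p>The forward inference above can be illustrated with samples. The following is a minimal sketch, not part of the original notes: the mapping $g$, the Gaussian density $f(a)$ and all numbers are assumptions chosen so that the exact result is known.&lt;/p>

```python
import numpy as np


def g(a):
    """Hypothetical system mapping b = g(a); linear so that f(b) is known exactly."""
    return 3.0 * a + 2.0


# Case 1: a density f(a) is given, here N(1, 0.5^2), represented by samples.
rng = np.random.default_rng(0)
a_samples = rng.normal(loc=1.0, scale=0.5, size=200_000)

# The factor delta(b - g(a)) maps every sample a deterministically to b = g(a).
b_samples = g(a_samples)
b_mean = b_samples.mean()   # exact value for this g: 3 * 1 + 2 = 5
b_std = b_samples.std()     # exact value for this g: 3 * 0.5 = 1.5

# Case 2: a single value a_hat is given, so f(b) = delta(b - b_hat).
a_hat = 1.0
b_hat = g(a_hat)

print(f"mean(b) = {b_mean:.3f}, std(b) = {b_std:.3f}, b_hat = {b_hat}")
```

&lt;p>For a nonlinear $g$ the same sampling step still works, but $f(\underline{b})$ generally no longer has a simple closed form.&lt;/p>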
&lt;h2 id="probabilistisches-nichtlineares-systemmodell">Probabilistic nonlinear system model&lt;/h2>
&lt;p>General system model&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k, \underline{w}_k)
$$
&lt;p>brought into the form $f(\underline{b} \mid \underline{a})$:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-prob_nichtlin_sys.drawio.png" alt="allg_sys-prob_nichtlin_sys.drawio" style="zoom:67%;" />
$$
\underline{a}=\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{w}_{k}
\end{array}\right], \quad \underline{b}=\underline{x}_{k+1}
$$
$$
f(\underline{b} \mid \underline{a}) = \delta \left(\underline{x}_{k+1} - \underline{a}_k(\underline{x}_k, \underline{w}_k)\right)
$$
&lt;p>With different system boundaries:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Copy%20of%20prob_nichtlin_sys.drawio.png" alt="allg_sys-Copy of prob_nichtlin_sys.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
$$
\begin{aligned}
f(\underline{b} \mid \underline{a}^\prime) &amp;= f(\underline{x}_{k+1} \mid \underline{x}_k) \\\\
&amp;= \int_{\mathbb{R}^N} \underbrace{f(\underline{x}_{k+1} \mid \underline{x}_k, \underline{w}_k)}_{f(\underline{b} \mid \underline{a})} \cdot f(\underline{w}_k) d\underline{w}_k
\end{aligned}
$$
&lt;p>In this case, $f(\underline{b} \mid \underline{a}^\prime)$ contains the system noise $\rightarrow$ it can no longer be described by a $\delta$ function.&lt;/p></description></item><item><title>Prädiktion nichtlinearer Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/pradiktion_nichtlinearer_systeme/</link><pubDate>Wed, 27 Jul 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/pradiktion_nichtlinearer_systeme/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Lecture notes 10.2, 10.3&lt;/span>
&lt;/div>
&lt;h2 id="chapman-kolmogorov-gleichung">Chapman-Kolmogorov equation&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise sheet, Problem 10.1&lt;/span>
&lt;/div>
&lt;p>Joint density&lt;/p>
$$
f\left(\underline{x}_{k+1}, \underline{x}_{k}\right)=f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k}\right)
$$
&lt;p>Marginalization&lt;/p>
$$
f\left(\underline{x}_{k+1}\right)=\int_{\mathbb{R}^{N}} f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k}\right) d \underline{x}_{k}
$$
&lt;p>Definition&lt;/p>
&lt;ul>
&lt;li>
&lt;p>estimated density at time step $k$, including the latest measurement&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=f\left(\underline{x}_{k} \mid \underline{\hat{y}}_{k}, \underline{\hat{y}}_{k-1}, \ldots, \underline{\hat{y}}_{1}, \underline{\hat{u}}_{k-1}, \underline{\hat{u}}_{k-2}, \ldots, \underline{\hat{u}}_{0}\right)
$$
&lt;/li>
&lt;li>
&lt;p>predicted density at time step $k+1$ (the measurement at $k+1$ is not yet included)&lt;/p>
$$
f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=f\left(\underline{x}_{k+1} \mid \underline{\hat{y}}_{k}, \underline{\hat{y}}_{k-1}, \ldots, \underline{\hat{y}}_{1}, \underline{\hat{u}}_{k}, \underline{\hat{u}}_{k-1}, \ldots, \underline{\hat{u}}_{0}\right)
$$
&lt;/li>
&lt;/ul>
&lt;p>Prediction for dynamic systems (&lt;mark>&lt;strong>Chapman-Kolmogorov equation&lt;/strong>&lt;/mark>)&lt;/p>
$$
f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=\int_{\mathbb{R}^{N}} \underbrace{f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)}_{\text{transition density}} f_{k}^{e}\left(\underline{x}_{k}\right) \mathrm{d} \underline{x}_{k}
$$
&lt;h3 id="erklärung">Explanation&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 10.1&lt;/span>
&lt;/div>
&lt;p>The Chapman-Kolmogorov equation computes the density of $\underline{x}_{k+1}$
from a given &lt;em>density&lt;/em> $f_{k}^{e}\left(\underline{x}_{k}\right)$
of $\underline{x}_{k}$,
whereas the probabilistic system description $f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)$
yields the density of $\underline{x}_{k+1}$
for a &lt;em>concrete value&lt;/em> of $\underline{x}_{k}$.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-28%2017.02.13.png" alt="截屏2022-08-28 17.02.13" style="zoom:50%;" />
&lt;ul>
&lt;li>
$f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)$
&lt;ul>
&lt;li>
&lt;p>the probabilistic system model, which yields a probability density for the next state $\underline{x}_{k+1}$
given the current state $\underline{x}_{k}$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We can compute this transition density $f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)$
from the given system model $\underline{x}_{k+1} = \underline{a}(\underline{x}_{k}, \underline{u}_{k}, \underline{v}_{k})$:
it is simply the probabilistic representation of that model&lt;/p>
$$
f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right) = \int_{\mathbb{R}^N} \delta(\underline{x}_{k+1} - \underline{a}(\underline{x}_{k}, \underline{u}_{k}, \underline{v}_{k})) \cdot f_k^v(\underline{v}_k) d \underline{v}_k
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
$f_{k}^{e}\left(\underline{x}_{k}\right)$
&lt;p>the best estimate we have of the system state at time step $k$, given as a probability density&lt;/p>
&lt;/li>
&lt;li>
$f_{k+1}^{p}\left(\underline{x}_{k+1}\right)$
&lt;ul>
&lt;li>the best prediction of the state at time step $(k+1)$ that can be computed from the knowledge about the state $f_{k}^{e}\left(\underline{x}_{k}\right)$
and the system model $\underline{x}_{k+1} = \underline{a}(\underline{x}_{k}, \underline{u}_{k}, \underline{v}_{k})$
(generative representation) or $f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)$
(probabilistic representation).&lt;/li>
&lt;li>A prediction generally increases the (relative) uncertainty.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="problem">Problem&lt;/h3>
&lt;p>‼️ &lt;span style="color: Red">This is a parameter integral!&lt;/span>&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color: Red">The integrand depends on $\underline{x}_{k+1}$ (which in general cannot be pulled out of the integral)&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">A closed-form result is only available if the integral can be solved analytically&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Otherwise the integral has to be solved (numerically) for every $\underline{x}_{k+1}$&lt;/span>&lt;/li>
&lt;/ul>
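&lt;p>To make the cost of the parameter integral concrete, here is a minimal grid-based sketch of the prediction step. It is only an illustration under assumptions that are not in the original notes: a scalar system $x_{k+1} = a(x_k) + w_k$ with the hypothetical linear function $a(x) = 0.5x$, unit-variance Gaussian noise and a Gaussian $f_k^e$. For this choice the exact prediction is a Gaussian with mean $1$ and variance $1.25$, which the numerical result should reproduce.&lt;/p>

```python
import numpy as np


def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def a(x):
    """Hypothetical system function (linear, so the exact result is known)."""
    return 0.5 * x


grid = np.linspace(-10.0, 10.0, 1001)   # used for both x_k and x_{k+1}
dx = grid[1] - grid[0]

f_e = gauss(grid, 2.0, 1.0)             # estimated density f_k^e(x_k)

# transition[i, j] = f(x_{k+1} = grid[i] | x_k = grid[j]) = f_w(grid[i] - a(grid[j]))
transition = gauss(grid[:, None] - a(grid)[None, :], 0.0, 1.0)

# One integral over x_k for every grid value of x_{k+1} (a matrix-vector product).
f_p = (transition @ f_e) * dx

mean_p = np.sum(grid * f_p) * dx
var_p = np.sum((grid - mean_p) ** 2 * f_p) * dx
print(mean_p, var_p)
```

&lt;p>The transition matrix has one entry per pair $(x_{k+1}, x_k)$, so the effort grows quadratically with the number of grid points per dimension. This is why the approximations in the following sections are of interest.&lt;/p>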
&lt;h3 id="weiter-nützliche-form-der-ck-gleichung">Another useful form of the CK equation&lt;/h3>
$$
\begin{aligned}
f(\underline{x}_{k + 2}, \underline{x}_{k}) &amp;= \int_{\mathbb{R}^N} f(\underline{x}_{k+2}, \underline{x}_{k+1}, \underline{x}_{k}) d\underline{x}_{k+1} \\\\
f(\underline{x}_{k + 2} \mid \underline{x}_{k}) f(\underline{x}_{k}) &amp;= \int_{\mathbb{R}^N} f(\underline{x}_{k+2} \mid \underline{x}_{k+1}, \underline{x}_{k}) f(\underline{x}_{k+1}, \underline{x}_{k}) d\underline{x}_{k+1} \quad | \quad \text{Markov} \\\\
f(\underline{x}_{k + 2} \mid \underline{x}_{k}) f(\underline{x}_{k}) &amp;= \int_{\mathbb{R}^N} f(\underline{x}_{k+2} \mid \underline{x}_{k+1}) f(\underline{x}_{k+1}, \underline{x}_{k}) d\underline{x}_{k+1} \\\\
f(\underline{x}_{k + 2} \mid \underline{x}_{k}) f(\underline{x}_{k}) &amp;= \int_{\mathbb{R}^N} f(\underline{x}_{k+2} \mid \underline{x}_{k+1}) f(\underline{x}_{k+1} \mid \underline{x}_{k}) f(\underline{x}_{k})d\underline{x}_{k+1} \\\\
f(\underline{x}_{k + 2} \mid \underline{x}_{k}) &amp;= \int_{\mathbb{R}^N} f(\underline{x}_{k+2} \mid \underline{x}_{k+1}) f(\underline{x}_{k+1} \mid \underline{x}_{k}) d\underline{x}_{k+1}
\end{aligned}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-CK_Gleichung.drawio.png" alt="allg_sys-CK_Gleichung.drawio" style="zoom:67%;" />
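&lt;p>As a worked example (a sketch under the assumption of a scalar linear system $x_{k+1} = a\,x_k + w_k$ with white noise $w_k \sim \mathcal{N}(0, \sigma_w^2)$, which is not part of the original notes), the two-step transition density follows in closed form. Here $\mathcal{N}(x, m, \sigma^2)$ denotes a Gaussian density in $x$ with mean $m$ and variance $\sigma^2$:&lt;/p>
$$
\begin{aligned}
f(x_{k+1} \mid x_k) &amp;= \mathcal{N}(x_{k+1}, a\,x_k, \sigma_w^2) \\
f(x_{k+2} \mid x_k) &amp;= \int_{\mathbb{R}} \mathcal{N}(x_{k+2}, a\,x_{k+1}, \sigma_w^2) \cdot \mathcal{N}(x_{k+1}, a\,x_k, \sigma_w^2) \, dx_{k+1} \\
&amp;= \mathcal{N}\left(x_{k+2}, a^2 x_k, (a^2 + 1)\,\sigma_w^2\right)
\end{aligned}
$$
&lt;p>This agrees with the generative view $x_{k+2} = a^2 x_k + a\,w_k + w_{k+1}$, whose noise terms are independent and have total variance $(a^2 + 1)\,\sigma_w^2$.&lt;/p>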
&lt;h3 id="pradiktion-mit-ck-glg-losungsansatze">Prediction with the CK equation: solution approaches&lt;/h3>
&lt;p>In the general case, the CK equation can NOT be solved exactly 🤪&lt;/p>
&lt;p>Exceptions (examples)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The system is linear and $f_{k}^{e}(\cdot)$ can be described by its first two moments&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$f_{k}^{e}(\underline{x}_k)$ is represented by samples.&lt;/p>
$$
\begin{aligned}
&amp; f_{k}^{e}\left(\underline{x}_{k}\right)=\sum_{i=1}^{L} w_{i} \delta\left(\underline{x}_{k}-\hat{\underline{x}}_{k, i}\right) \qquad w_i \geq 0, \sum_i w_i = 1\\
\Rightarrow \qquad &amp; f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=\sum_{i=1}^{L} w_{i} f\left(\underline{x}_{k+1} \mid \hat{\underline{x}}_{k, i}\right)
\end{aligned}
$$
&lt;/li>
&lt;/ul>
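&lt;p>The sample-based case can be sketched in a few lines. The example below is an illustration only: the system function $a(x) = \sin(x) + x$, the additive Gaussian noise, the three samples and their weights are all assumptions. Each sample contributes one copy of the transition density, so the predicted density is a mixture with $L$ components.&lt;/p>

```python
import numpy as np


def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def a(x):
    """Hypothetical nonlinear system function."""
    return np.sin(x) + x


samples = np.array([-1.0, 0.0, 2.0])   # sample positions x_hat_{k,i} (assumed)
weights = np.array([0.2, 0.5, 0.3])    # weights w_i, non-negative, summing to 1
noise_var = 0.25                       # variance of the additive noise (assumed)


def predicted_density(x_next):
    """f^p(x_{k+1}) = sum_i w_i f(x_{k+1} | x_hat_{k,i}) for additive Gaussian noise."""
    x_next = np.asarray(x_next, dtype=float)
    components = gauss(x_next[..., None] - a(samples), 0.0, noise_var)
    return components @ weights


grid = np.linspace(-10.0, 10.0, 4001)
dx = grid[1] - grid[0]
f_p = predicted_density(grid)

total = np.sum(f_p) * dx                    # should be close to 1
mean_p = np.sum(grid * f_p) * dx            # mean of the mixture
expected_mean = np.sum(weights * a(samples))
print(total, mean_p, expected_mean)
```

&lt;p>No integral has to be evaluated here: the Dirac components of $f_k^e$ select the transition density at the sample positions.&lt;/p>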
&lt;h2 id="vereinfachte-pradiktion">Simplified prediction&lt;/h2>
&lt;h3 id="systemmodell-mit-additivem-rauschen">System model with additive noise&lt;/h3>
&lt;p>We start with additive noise.&lt;/p>
&lt;p>Generative model&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_{k}(\underline{x}_{k}) + \underline{w}_{k} \qquad \underline{x}_{k+1}, \underline{x}_{k}, \underline{w}_{k} \in \mathbb{R}^N
$$
&lt;p>Simplified notation&lt;/p>
$$
\underline{z} = \underline{a}(\underline{x}) + \underline{w}
$$
&lt;p>Probabilistic model (incl. noise)&lt;/p>
$$
f(\underline{z} \mid \underline{x}) = f_w(\underline{z} - \underline{a}(\underline{x}))
$$
&lt;p>Simplification: partition into discrete &amp;ldquo;strips&amp;rdquo;:&lt;/p>
$$
f\left(\underline{z} \mid \underline{\hat{x}}_{i}\right)=f_w\left(\underline{z}-\underline{a}\left(\underline{\hat{x}}_{i}\right)\right) \qquad i \in \mathbb{Z}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-09-13%2015.49.14.png" alt="截屏2022-09-13 15.49.14" style="zoom: 33%;" />
&lt;p>In the &amp;ldquo;gaps&amp;rdquo; between the strips, however, $\int f(\underline{z} \mid \underline{x}) \, d\underline{z} = 1$
does NOT hold. We therefore define a &amp;ldquo;fill function&amp;rdquo; $f_i(\underline{x})$:&lt;/p>
$$
f_i(\underline{x}) = \mathcal{N}(\underline{x}, \underline{\hat{x}}_i, C_i) \qquad i \in \mathbb{Z}
$$
&lt;p>with&lt;/p>
$$
f(\underline{x}) = \sum_{i \in \mathbb{Z}} w_if_i(\underline{x}) \approx 1
$$
&lt;blockquote>
&lt;p>E.g. the scalar case&lt;/p>
$$
f(x)=\sum_{i \in \mathbb{Z}} w_{i} f_i(x), \quad f_i(x)=\exp \left(-\frac{1}{2} \frac{\left(x-\hat{x}_{i}\right)^{2}}{\sigma^{2}}\right)
$$
&lt;p>
with a suitable $\sigma$.&lt;/p>
&lt;/blockquote>
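&lt;p>What a &amp;ldquo;suitable&amp;rdquo; $\sigma$ means can be checked numerically. The sketch below is an illustration under assumptions of my own: equally spaced centers $\hat{x}_i$ with spacing $d$, the unnormalized fill functions of the scalar example, $\sigma = d$, and equal weights $w_i = d / (\sqrt{2\pi}\,\sigma)$. With these choices the weighted sum stays very close to $1$ between the centers.&lt;/p>

```python
import numpy as np

d = 0.5                                    # spacing of the centers x_hat_i (assumed)
sigma = 0.5                                # width of the fill functions (assumed: sigma = d)
centers = np.arange(-20.0, 20.0 + d, d)    # finite stand-in for the index set Z
w = d / (np.sqrt(2.0 * np.pi) * sigma)     # equal weights w_i


def fill_sum(x):
    """f(x) = sum_i w_i exp(-0.5 (x - x_hat_i)^2 / sigma^2)."""
    x = np.asarray(x, dtype=float)
    f_i = np.exp(-0.5 * (x[..., None] - centers) ** 2 / sigma ** 2)
    return w * f_i.sum(axis=-1)


# Evaluate well inside the covered range, away from the truncated ends.
x_test = np.linspace(-5.0, 5.0, 1001)
max_dev = np.max(np.abs(fill_sum(x_test) - 1.0))
print(max_dev)
```

&lt;p>Making $\sigma$ much smaller than the spacing $d$ produces visible ripples in the sum, while a much larger $\sigma$ smears neighboring strips into each other.&lt;/p>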
&lt;p>Consider each component $i$ separately&lt;/p>
$$
f_i(\underline{z} \mid \underline{x}) = f(\underline{z} \mid \underline{\hat{x}}_i) \cdot f_i(\underline{x})
$$
&lt;p>The overall density is&lt;/p>
$$
f(\underline{z} \mid \underline{x}) \approx \sum_{i \in \mathbb{Z}} w_i f(\underline{z} \mid \underline{\hat{x}}_i) \cdot f_i(\underline{x})
$$
&lt;p>It holds that&lt;/p>
$$
\begin{aligned}
\int_{\mathbb{R}^{N}} f(\underline{z} \mid \underline{x}) d \underline{z}&amp;\approx\sum_{i \in \mathbb{Z}} w_{i} f_{i}(\underline{x}) \underbrace{\int_{\mathbb{R}^N}f(\underline{z} \mid \underline{\hat{x}}_i) d\underline{z}}_{=1}\\
&amp;=\sum_{i \in \mathbb{Z}} w_{i} f_{i}(\underline{x}) \approx 1
\end{aligned}
$$
&lt;p>If the noise $\underline{w}_k$ is Gaussian, $f(\underline{z} \mid \underline{x} )$ is a Gaussian mixture&lt;/p>
$$
f(\underline{z} \mid \underline{x}) = \sum_{i \in \mathbb{Z}} w_i \underbrace{f_i^z(\underline{z})}_{f_w(\underline{z} - \underline{a}(\underline{\hat{x}}_i))} \cdot f_i^x(\underline{x})
$$
&lt;h3 id="allgemeine-systemmodelle">General system models&lt;/h3>
$$
\underline{x}_{k+1} = \underline{a}_k (\underline{x}_k, \underline{w}_k)
$$
&lt;p>Simplified notation:&lt;/p>
$$
\underline{z} = \underline{a}(\underline{x}, \underline{w})
$$
&lt;p>This yields a general transition density $f(\underline{z} | \underline{x})$, which can also be approximated by a mixture&lt;/p>
$$
f(\underline{z} | \underline{x}) = \sum_{i \in \mathbb{Z}} w_i f_i^z(\underline{z}) \cdot f_i^x(\underline{x})
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-09-13%2016.11.08.png" alt="截屏2022-09-13 16.11.08" style="zoom:50%;" />
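&lt;p>The important point is that the individual components are decoupled: each one is a product of a factor in $\underline{z}$ and a factor in $\underline{x}$. 👏&lt;/p>
&lt;h3 id="bsp-1">Example 1&lt;/h3>
&lt;p>Assumption: $f(\underline{x})$ is a Gaussian density.&lt;/p>
&lt;p>Substituting into the CK equation:&lt;/p>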
$$
\begin{aligned}
f(\underline{z})&amp;=\int_{\mathbb{R}^{N}}\left(\sum_{i \in \mathbb{Z}} w_{i} f_{i}^{z}(\underline{z}) \cdot f_{i}^{x}(\underline{x})\right) f(\underline{x}) d \underline{x}\\
&amp;=\sum_{i \in \mathbb{Z}} w_{i} f_{i}^{z}(\underline{z}) \underbrace{\int_{\mathbb{R}^{N}} f_{i}^{x}(\underline{x}) \cdot f(\underline{x}) d \underline{x}}_{\text{constant } c_{i}}\\
&amp;= \sum_{i \in \mathbb{Z}} \underbrace{w_{i} c_i}_{=: \bar{w}_i} f_{i}^{z}(\underline{z})\\
&amp;=\sum_{i \in \mathbb{Z}} \bar{w}_{i} f_{i}^{z}(\underline{z})
\end{aligned}
$$
&lt;blockquote>
&lt;p>Here we see that $f_i^z(\underline{z})$ can simply be pulled out of the integral; only terms in $\underline{x}$ remain inside.&lt;/p>
&lt;/blockquote>
&lt;p>Special case: $f(\underline{x}) = \delta(\underline{x} - \underline{\hat{x}}) \Rightarrow$&lt;/p>
$$
c_i = \int_{\mathbb{R}^{N}} f_{i}^{x}(\underline{x}) \delta \left(\underline{x}-\underline{\hat{x}}\right) d \underline{x}=f_i^{x}(\underline{\hat{x}})
$$
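&lt;p>The reweighting in Example 1 can be sketched numerically for the scalar case. Everything specific below is an assumption for illustration: the system function $a(x) = \sin(x)$, the noise variance, the Gaussian $f(x)$, and normalized Gaussian fill functions on an equally spaced set of centers. For two Gaussians the constant $c_i$ has a closed form (a Gaussian evaluated at the distance of the means with the summed variances), which the code cross-checks by brute-force integration.&lt;/p>

```python
import numpy as np


def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def a(x):
    """Hypothetical nonlinear system function."""
    return np.sin(x)


mu, var = 1.0, 0.8          # assumed Gaussian density f(x) = N(x, mu, var)
noise_var = 0.1             # assumed variance of the additive noise w
d = 0.25                    # assumed spacing of the strip centers x_hat_i
fill_var = d ** 2           # assumed variance of the normalized fill functions
centers = np.arange(-15.0, 15.0 + d, d)
w = d                       # equal weights w_i, so that sum_i w_i f_i(x) is close to 1

# c_i = integral of f_i^x(x) f(x) dx, in closed form for two Gaussians.
c = gauss(centers, mu, var + fill_var)
w_bar = w * c               # new weights of the components f_i^z


def predicted_density(z):
    """f(z) = sum_i w_bar_i f_w(z - a(x_hat_i))."""
    z = np.asarray(z, dtype=float)
    return gauss(z[..., None] - a(centers), 0.0, noise_var) @ w_bar


# Numerical cross-check of one constant c_i by brute-force integration.
grid = np.linspace(-15.0, 15.0, 30001)
dx = grid[1] - grid[0]
i = 64                      # centers[64] is 1.0
c_num = np.sum(gauss(grid, centers[i], fill_var) * gauss(grid, mu, var)) * dx
print(c[i], c_num, w_bar.sum())
```

&lt;p>The integrals over $x$ are computed once per component; afterwards $f(z)$ can be evaluated for any $z$ without further integration.&lt;/p>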
&lt;h3 id="bsp-2">Example 2&lt;/h3>
&lt;p>Assumption: Gaussian mixture&lt;/p>
$$
f(\underline{x}) = \sum_{j=1}^L v_j f_j^*(\underline{x})
$$
&lt;p>Substituting into the CK equation:&lt;/p>
$$
\begin{aligned}
f(\underline{z})&amp;=\int_{\mathbb{R}^{N}}\left\{\sum_{i \in \mathbb{Z}} w_{i} f_{i}^{z}(\underline{z}) f_{i}^{x}(\underline{x})\right\} \cdot \left\{\sum_{j=1}^{L} v_{j} f_{j}^{*}(\underline{x})\right\} d \underline{x}\\
&amp;=\sum_{i \in \mathbb{Z}} w_{i} f_{i}^{z}(\underline{z}) \underbrace{\sum_{j=1}^{L} v_{j} \underbrace{\int_{\mathbb{R}^N} f_{i}^{x}(\underline{x}) \cdot f_{j}^{*}(\underline{x}) d \underline{x}}_{\text{constant}}}_{\text{constant } C_{i}} \\
&amp;=\sum_{i \in \mathbb{Z}} \underbrace{w_{i}C_i}_{=: \bar{w}_i} f_{i}^{z}(\underline{z}) \\
&amp;=\sum_{i \in \mathbb{Z}} \bar{w}_{i} f_{i}^{z}(\underline{z})
\end{aligned}
$$
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2022-09-13%2020.55.22.png" alt="截屏2022-09-13 20.55.22" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p></description></item><item><title>Filterschritt für nichtlineare Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/filterschritt_nichtlinear_systeme/</link><pubDate>Wed, 03 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/filterschritt_nichtlinear_systeme/</guid><description>&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Lecture notes 10.4&lt;/span>
&lt;/div>
&lt;p>&lt;strong>Backward inference&lt;/strong>: inference &lt;strong>against&lt;/strong> the direction of the modeled dependency, given prior knowledge&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-07-27%2022.19.17.png" alt="截屏2022-07-27 22.19.17" style="zoom:50%;" />
&lt;p>Two cases&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#r%C3%BCckw%C3%A4rtsinferenz-mit-konkrektem-messwert">&lt;strong>Concrete value&lt;/strong> given for the output (measurement)&lt;/a>&lt;/li>
&lt;li>&lt;a href="#r%C3%BCckw%C3%A4rtsinferenz-mit-dichte">&lt;strong>Density&lt;/strong> given for the output&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="rückwärtsinferenz-mit-konkrektem-messwert">Backward inference with a concrete measurement&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;ul>
&lt;li>Lecture notes 10.4.1&lt;/li>
&lt;li>Exercise sheet, Problems 9.2, 9.3&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
&lt;p>Stochastic mapping from $\underline{a} \in \mathbb{R}^N$ to $\underline{b} \in \mathbb{R}^M$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-05%2009.54.23.png" alt="截屏2022-08-05 09.54.23" style="zoom: 67%;" />
&lt;p>Probabilistic model $f(\underline{b} \mid \underline{a})$ (graphical)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-05%2009.54.37.png" alt="截屏2022-08-05 09.54.37" style="zoom:67%;" />
&lt;p>For a concrete $\underline{\hat{b}}$, we seek $f(\underline{a} \mid \underline{\hat{b}})$ 💪&lt;/p>
$$
\begin{aligned}
&amp;f(\underline{a} \mid \underline{\hat{b}}) f(\underline{\hat{b}})=f(\underline{\hat{b}} \mid \underline{a}) \cdot f(\underline{a}) \\
&amp;\Rightarrow \underbrace{ f(\underline{a} \mid \underline{\hat{b}})}_{\text{posterior}}=\underbrace{\frac{1}{f(\underline{\hat{b}})}}_{\text{normalization constant}} \cdot \underbrace{f(\underline{\hat{b}} \mid \underline{a})}_{\text{likelihood}} \cdot \underbrace{f(\underline{a})}_{\text{prior knowledge}}
\end{aligned}
$$
&lt;p>For the measurement model&lt;/p>
&lt;ul>
&lt;li>Likelihood: $f(\underline{\hat{y}} \mid \underline{x})$, where $\underline{\hat{y}}$ is the measurement&lt;/li>
&lt;li>$f^p(\underline{x})$: given prior distribution (i.e. the prediction) of the state&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ Posterior distribution:&lt;/p>
$$
f^e(\underline{x}) = f(\underline{x} \mid \underline{\hat{y}}) \propto f(\underline{\hat{y}} \mid \underline{x}) \cdot f^p(\underline{x})
$$
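&lt;p>The filter step can be sketched on a grid. The example is an illustration under assumptions that do not come from the notes: a scalar state, the linear measurement model $y = x + v$ with unit-variance Gaussian noise, a standard Gaussian prior $f^p$ and the measurement $\hat{y} = 2$. For this linear Gaussian case the exact posterior is Gaussian with mean $1$ and variance $0.5$, which serves as a check.&lt;/p>

```python
import numpy as np


def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


grid = np.linspace(-10.0, 10.0, 4001)
dx = grid[1] - grid[0]

prior = gauss(grid, 0.0, 1.0)                 # f^p(x), assumed N(0, 1)
y_hat = 2.0                                   # assumed measurement
likelihood = gauss(y_hat - grid, 0.0, 1.0)    # f(y_hat | x) = f_v(y_hat - x)

unnormalized = likelihood * prior             # likelihood times prior
evidence = np.sum(unnormalized) * dx          # normalization constant f(y_hat)
posterior = unnormalized / evidence           # f^e(x) = f(x | y_hat)

mean_e = np.sum(grid * posterior) * dx
var_e = np.sum((grid - mean_e) ** 2 * posterior) * dx
print(mean_e, var_e, evidence)
```

&lt;p>Note that the likelihood is evaluated as a function of $x$ for the fixed $\hat{y}$; the normalization constant only rescales the product.&lt;/p>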
&lt;h2 id="rückwärtsinferenz-mit-dichte">Backward inference with a density&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;ul>
&lt;li>Lecture notes 10.4.2&lt;/li>
&lt;li>Exercise sheet, Problem 9.4&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
&lt;p>Special case: additive noise&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Lecture notes 10.4.3&lt;/span>
&lt;/div>
$$
\underline{y} = \underline{g}(\underline{x}) + \underline{v} = \underline{t} + \underline{v}
$$
&lt;p>Generative model&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-backward_inferenz_dichte.drawio.png" alt="allg_sys-backward_inferenz_dichte.drawio" style="zoom:67%;" />
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>Prior knowledge about the state $\underline{x}$ in the form of $f_x(\underline{x})$&lt;/li>
&lt;li>Measurement $\underline{\hat{y}}$&lt;/li>
&lt;li>Characterization of the measurement noise $\underline{v}$ by $f_v(\underline{v})$&lt;/li>
&lt;/ul>
&lt;p>Wanted: $f(\underline{x} \mid \underline{\hat{y}})$&lt;/p>
&lt;p>Probabilistic model: factorized description of the joint density
&lt;/p>
$$
\begin{aligned}
f(\underline{t}, \underline{v}, \underline{x}, \underline{y}) &amp;= f(\underline{y} \mid \underline{t}, \underline{v}, \underline{x}) \cdot f(\underline{t}, \underline{v}, \underline{x}) \quad | \quad \underline{y} \text{ is indep. of } \underline{x} \text{ given } \underline{t}, \underline{v} \\\\
&amp;= f(\underline{y} \mid \underline{t}, \underline{v}) \cdot f(\underline{t} \mid \underline{v}, \underline{x}) \cdot f(\underline{v}, \underline{x}) \quad | \quad \underline{t} \text{ is indep. of } \underline{v} \text{ given } \underline{x} \\\\
&amp;= f(\underline{y} \mid \underline{t}, \underline{v}) \cdot f(\underline{t} \mid \underline{x}) \cdot f(\underline{v}, \underline{x}) \quad | \quad \underline{v}, \underline{x} \text{ are indep.}\\\\
&amp;= \delta(\underline{y} - \underline{t} - \underline{v}) \cdot \delta(\underline{t} - \underline{g}(\underline{x})) \cdot f_v(\underline{v}) \cdot f_x(\underline{x})
\end{aligned}
$$
&lt;p>
Graphical model&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-backward_inferenz_dichte_grafisch.png" alt="allg_sys-backward_inferenz_dichte_grafisch" style="zoom:67%;" />
&lt;p>&lt;strong>Approach 1: Direct marginalization&lt;/strong>&lt;/p>
$$
\begin{aligned}
f(\underline{x} \mid \underline{\hat{y}}) &amp;= \frac{f(\underline{x}, \underline{\hat{y}})}{f(\underline{\hat{y}})} \\
&amp;= \frac{1}{f(\underline{\hat{y}})} \int_{\mathbb{R}^M} \int_{\mathbb{R}^M} f(\underline{t}, \underline{v}, \underline{x}, \underline{\hat{y}}) d\underline{v} d\underline{t} \\
&amp;= \frac{1}{f(\underline{\hat{y}})} \int_{\mathbb{R}^M} \int_{\mathbb{R}^M} \delta(\underline{\hat{y}} - \underline{t} - \underline{v}) \cdot \delta(\underline{t} - \underline{g}(\underline{x})) \cdot f_v(\underline{v}) \cdot f_x(\underline{x}) d\underline{v} d\underline{t} \\
&amp;= \frac{1}{f(\underline{\hat{y}})} f_x(\underline{x}) \int_{\mathbb{R}^M} \delta(\underline{t} - \underline{g}(\underline{x})) f_v(\underline{\hat{y}} - \underline{t}) d\underline{t} \\
&amp;= \frac{1}{f(\underline{\hat{y}})} f_x(\underline{x})f_v(\underline{\hat{y}} - \underline{g}(\underline{x}))
\end{aligned}
$$
&lt;p>&lt;strong>Approach 2: Uncertain system and deterministic measurement&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Copy%20of%20backward_inferenz_dichte_grafisch.drawio.png" alt="allg_sys-Copy of backward_inferenz_dichte_grafisch.drawio">&lt;/p>
&lt;p>We regard $f(\underline{y} \mid \underline{x})$ as a substitute system.&lt;/p>
$$
\begin{aligned}
f(\underline{y} \mid \underline{x}) &amp;= \frac{1}{f_x(\underline{x})} f(\underline{x}, \underline{y}) \\\\
&amp;= \int_{\mathbb{R}^M} \int_{\mathbb{R}^M} \delta(\underline{t} - \underline{g}(\underline{x})) \delta(\underline{y} - \underline{t} - \underline{v}) f_v(\underline{v}) d\underline{v} d\underline{t} \\\\
&amp;= f_v(\underline{y} - \underline{g}(\underline{x}))
\end{aligned}
$$
&lt;p>For the simplified system it follows that&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-backward_inferenz_dichte_grafisch_vereinfachte.drawio.png" alt="allg_sys-backward_inferenz_dichte_grafisch_vereinfachte.drawio" style="zoom:67%;" />
&lt;p>Sought posterior density&lt;/p>
$$
\begin{aligned}
f(\underline{x} \mid \underline{\hat{y}}) &amp;= \frac{f(\underline{x}, \underline{\hat{y}})}{f(\underline{\hat{y}})} \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \cdot f(\underline{\hat{y}} \mid \underline{x}) \cdot f(\underline{x}) \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \cdot f_v(\underline{\hat{y}} - \underline{g}(\underline{x})) \cdot f_x(\underline{x})
\end{aligned}
$$
&lt;p>&lt;strong>Approach 3: Deterministic system and uncertain measurement&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-backward_inferenz_dichte_grafisch_Betrachtung3.drawio%20%281%29.png" alt="allg_sys-backward_inferenz_dichte_grafisch_Betrachtung3.drawio (1)" style="zoom:67%;" />
$$
\begin{aligned}
f(\underline{\hat{y}} \mid \underline{t}) &amp;= \frac{f(\underline{\hat{y}}, \underline{t})}{f(\underline{t})} \\\\
&amp;= \frac{1}{f(\underline{t})} \int_{\mathbb{R}^M} \underbrace{f(\underline{v}, \underline{t}, \underline{\hat{y}})}_{= f(\underline{\hat{y}} \mid \underline{v}, \underline{t}) f(\underline{v}, \underline{t}) = f(\underline{\hat{y}} \mid \underline{v}, \underline{t}) f(\underline{v}) f(\underline{t})} d\underline{v} \\\\
&amp;= \frac{1}{f(\underline{t})} f(\underline{t}) \int_{\mathbb{R}^M} f_v(\underline{v}) \delta(\underline{\hat{y}} - \underline{t} - \underline{v}) d\underline{v} \\\\
&amp;= f_v(\underline{\hat{y}} - \underline{t})
\end{aligned}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-backward_inferenz_dichte_grafisch_Betrachtung3_vereinfacht.drawio%20%281%29.png" alt="allg_sys-backward_inferenz_dichte_grafisch_Betrachtung3_vereinfacht.drawio (1)" style="zoom:67%;" />
&lt;p>Sought posterior density&lt;/p>
$$
\begin{aligned}
f(\underline{x} \mid \underline{\hat{y}}) &amp;= \frac{f(\underline{x}, \underline{\hat{y}})}{f(\underline{\hat{y}})} \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \int_{\mathbb{R}^M} f(\underline{x}, \underline{t}, \underline{\hat{y}}) d\underline{t} \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \int_{\mathbb{R}^M} \underbrace{f(\underline{\hat{y}} \mid \underline{x}, \underline{t})}_{=f(\underline{\hat{y}} \mid \underline{t})} f(\underline{x}, \underline{t}) d\underline{t} \quad \mid f(\underline{x}, \underline{t}) = f(\underline{t} \mid \underline{x}) f(\underline{x}) = \delta(\underline{t} - \underline{g}(\underline{x})) f(\underline{x}) \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \int_{\mathbb{R}^M} f(\underline{\hat{y}} \mid \underline{t}) \delta(\underline{t} - \underline{g}(\underline{x})) f(\underline{x}) d\underline{t} \\\\
&amp;= \frac{1}{f(\underline{\hat{y}})} \cdot f_v(\underline{\hat{y}} - \underline{g}(\underline{x})) \cdot f(\underline{x})
\end{aligned}
$$
&lt;h2 id="schwierigkeiten-filterschritt">Difficulties of the filter step&lt;/h2>
&lt;h3 id="problem-1-type-der-dichte-zur-beschreibung-der-schätzung-ändert-sich">Problem 1: The type of density describing the estimate changes&lt;/h3>
&lt;p>Example:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prior&lt;/p>
$$
f^p(x) \propto \exp \left[-\frac{1}{2} \frac{(x - x^p)^2}{\sigma_p^2}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Measurement mapping&lt;/p>
$$
y = x^2 + v \quad v \sim f^v(v)
$$
&lt;p>e.g. $f^v(v)$ is Gaussian with zero mean and variance $1$&lt;/p>
$$
f^L(y \mid x) = f^v(y - x^2) \propto \exp \left[-\frac{1}{2} (y - x^2)^2\right]
$$
&lt;/li>
&lt;li>
&lt;p>Posterior&lt;/p>
$$
\begin{aligned}
f^{e}(x) &amp; \propto f^{p}(x) \cdot f^{L}(\hat{y} \mid x)\\
&amp; \propto \exp \left[-\frac{1}{2}\left(\frac{x-x^{p}}{\sigma_{p}}\right)^{2}\right] \cdot \exp \left[-\frac{1}{2}\left(\hat{y}-x^{2}\right)^{2}\right] \\
&amp; \propto \exp \left[a x^{4}+b x^{3}+c x^{2}+d x+e\right]
\end{aligned}
$$
&lt;p>which is no longer Gaussian! 🤪&lt;/p>
&lt;/li>
&lt;/ul>
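&lt;p>The change of density type can be made visible numerically. The sketch below evaluates the posterior of this example on a grid; the concrete numbers ($x^p = 0$, $\sigma_p = 1$, $\hat{y} = 4$) are assumptions chosen for illustration. With these values the exponent is a fourth-order polynomial with two maxima at $x = \pm\sqrt{3.5}$, so the posterior is bimodal and cannot be a Gaussian.&lt;/p>

```python
import numpy as np


def gauss(x, mean, var):
    """Scalar Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


grid = np.linspace(-5.0, 5.0, 2001)
dx = grid[1] - grid[0]

x_p, sigma_p = 0.0, 1.0      # assumed prior parameters
y_hat = 4.0                  # assumed measurement

prior = gauss(grid, x_p, sigma_p ** 2)
likelihood = gauss(y_hat - grid ** 2, 0.0, 1.0)   # f^v(y_hat - x^2)
posterior = prior * likelihood
posterior /= np.sum(posterior) * dx               # normalize on the grid

# Count strict local maxima of the posterior on the grid.
inner = posterior[1:-1]
is_peak = np.logical_and(inner > posterior[:-2], inner > posterior[2:])
n_modes = int(np.count_nonzero(is_peak))
mode_locations = grid[1:-1][is_peak]
print(n_modes, mode_locations)
```

&lt;p>The measurement $y = x^2 + v$ cannot distinguish the sign of $x$, which is why both modes survive when the prior is centered at zero.&lt;/p>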
&lt;h3 id="problem-2-dichte-wrid-mit-jedem-schritt-komplexer">Problem 2: The density becomes more complex with every step&lt;/h3>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The prior is a mixture with 2 components&lt;/p>
$$
f^p(x) = \sum_{i=1}^2 f^{p, i}(x)
$$
&lt;/li>
&lt;li>
&lt;p>Measurement mapping&lt;/p>
$$
y = x + v \quad v \sim f^v(v) = \sum_{j=1}^2 f^{v, j}(v)
$$
&lt;/li>
&lt;li>
&lt;p>Posterior&lt;/p>
$$
\begin{aligned}
f^e(x) &amp; \propto f^{p}(x) \cdot f^{v}(\hat{y}-x) \\
&amp;=\left(\sum_{i=1}^{2} f^{p, i}(x)\right) \cdot\left(\sum_{j=1}^{2} f^{v, j}(\hat{y}-x)\right) \\
&amp;=\sum_{i=1}^{4} f^{e, i}(x)
\end{aligned}
$$
&lt;p>$\Rightarrow$ Overall, approximation is unavoidable! 🤪&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Faktorgraphen und Message Passing</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/faktorgraph_und_message_passing/</link><pubDate>Wed, 03 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/faktorgraph_und_message_passing/</guid><description>&lt;h2 id="faktorgraphen">Factor graphs&lt;/h2>
&lt;h3 id="regeln">Rules&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Faktorgraph.drawio%20%281%29.png" alt="allg_sys-Faktorgraph.drawio (1)" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="beispiel">Example&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Faktorgraph_Bsp.drawio.png" alt="allg_sys-Faktorgraph_Bsp.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="message-passing">Message Passing&lt;/h2>
&lt;p>Define a message on an edge&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Nachricht.drawio%20%281%29.png" alt="allg_sys-Nachricht.drawio (1)" style="zoom:67%;" />
&lt;p>A cut splits the system into 2 parts&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-System_Schnitt.drawio%20%281%29.png" alt="allg_sys-System_Schnitt.drawio (1)" style="zoom:67%;" />
&lt;p>Consider a block with one input and one output&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Block.drawio%20%282%29.png" alt="allg_sys-Block.drawio (2)" style="zoom:67%;" />
&lt;p>Given: $R_x$ and $L_y$&lt;/p>
$$
\begin{aligned}
&amp;R_{y}(y)=\int f(y \mid x) \cdot R_{x}(x) d x \\
&amp;L_{x}(x)=\int f(y \mid x) \cdot L_{y}(y) d y
\end{aligned}
$$
&lt;p>Special case: linear system&lt;/p>
$$
\begin{aligned}
y &amp;= Hx\\
\Rightarrow f(y \mid x) &amp;= \delta(y - Hx)
\end{aligned}
$$
$$
\begin{aligned}
R_y(y) &amp;= \int \delta(y-Hx) R_x(x) dx \quad \mid g(x):=y-Hx, g^\prime(x) = -H, x_1 = \frac{y}{H}\\
&amp;= \int \frac{1}{|H|} \delta(x - \frac{y}{H}) R_x(x) dx \\
&amp;= \frac{1}{|H|} R_x(\frac{y}{H})
\end{aligned}
$$
$$
\begin{aligned}
L_{x}(x) &amp;=\int f(y \mid x) L_{y}(y) d y \\
&amp;=\int \delta(y-H x) \cdot L_{y}(y) d y \\
&amp;=L_{y}(H \cdot x)
\end{aligned}
$$
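&lt;p>The two closed-form messages for the linear special case can be checked numerically. The following is only a minimal sketch for the scalar case: the gain `H` and the Gaussian shapes chosen for $R_x$ and $L_y$ are assumptions made for illustration, not values from the lecture.&lt;/p>

```python
import numpy as np

def gauss(x, mean, std):
    """Scalar Gaussian density."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

H = 2.0  # assumed scalar gain of the linear system y = H * x

def R_x(x):
    """Assumed incoming R message (a density over x)."""
    return gauss(x, 1.0, 0.5)

def L_y(y):
    """Assumed incoming L message (a function of y)."""
    return gauss(y, 3.0, 1.0)

def R_y(y):
    """Outgoing R message: R_y(y) = R_x(y / H) / |H|."""
    return R_x(y / H) / abs(H)

def L_x(x):
    """Outgoing L message: L_x(x) = L_y(H * x)."""
    return L_y(H * x)

# Sanity check: R_y is the density of y = H * x, so it should integrate to 1.
y_grid = np.linspace(-20.0, 20.0, 40001)
area = np.sum(R_y(y_grid)) * (y_grid[1] - y_grid[0])
print(area)
```

&lt;p>The factor $1/|H|$ is what keeps $R_y$ normalized. $L_x$ needs no such factor because it is a likelihood-type function of $x$, not a density.&lt;/p>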
&lt;h3 id="beispiel-1">Example&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Message_Passing_Bsp.drawio.png" alt="allg_sys-Message_Passing_Bsp.drawio" style="zoom: 67%;" />
&lt;ul>
&lt;li>
&lt;p>Given: $\underline{\hat{x}}_4$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sought: $f(\underline{x}_2 \mid \underline{\hat{x}}_4)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Goal: recursive computation of the messages&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Directly given:&lt;/p>
$$
R_{1}\left(\underline{x}_{1}\right)=f_{1}\left(\underline{x}_{1}\right) \quad L_{3}\left(\underline{x}_{3}\right)=f\left(\underline{\hat{x}}_{4} \mid \underline{x}_{3}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Required: $L_2(\underline{x}_2)$ and $R_2(\underline{x}_2)$&lt;/p>
$$
\begin{aligned}
&amp;R_{2}\left(\underline{x}_{2}\right)=\int f\left(\underline{x}_{2} \mid \underline{x}_{1}\right) R_{1}\left(\underline{x}_{1}\right) d \underline{x}_{1} \\
&amp;L_{2}\left(\underline{x}_{2}\right)=\int f\left(\underline{x}_{3} \mid \underline{x}_{2}\right) L_{3}\left(\underline{x}_{3}\right) d \underline{x}_{3}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ Fusion result:&lt;/p>
$$
f\left(\underline{x}_{2} \mid \underline{\hat{x}}_{4}\right) \propto L_{2}\left(\underline{x}_{2}\right) \cdot R_{2}\left(\underline{x}_{2}\right)
$$</description></item><item><title>Vereinfachte Filterung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/vereinfachte_filterung/</link><pubDate>Wed, 03 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/vereinfachte_filterung/</guid><description>&lt;h2 id="approximation-der-likelihood">Approximation der Likelihood&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Simplification of the likelihood $f(\underline{y} \mid \underline{x})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Analogous to the simplified prediction&lt;/p>
&lt;ul>
&lt;li>Approximate representation by a Gaussian mixture&lt;/li>
&lt;li>Important: decoupled components&lt;/li>
&lt;/ul>
$$
f(\underline{y} \mid \underline{x}) = \sum_{i \in \mathbb{Z}} f_i^y(\underline{y}) f_i^x(\underline{x})
$$
&lt;/li>
&lt;/ul>
&lt;h2 id="resultierender-vereinfachter-filterschritt">Resulting simplified filter step&lt;/h2>
&lt;p>Likelihood for a concrete measurement $\underline{\hat{y}}$:&lt;/p>
$$
f^{L}(\underline{x})=f(\underline{\hat{y}} \mid \underline{x})=\sum_{i \in \mathbb{Z}} f_{i}^{y}(\underline{\hat{y}}) \cdot f_{i}^{x}(\underline{x})
$$
&lt;p>Prior Gaussian mixture:&lt;/p>
$$
f^{p}(\underline{x})=\sum_{j=1}^{L} f_{j}^{p}(\underline{x})
$$
&lt;p>$\Rightarrow$ Posterior:&lt;/p>
$$
\begin{aligned}
f^{e}(\underline{x}) &amp; \propto f^{p}(\underline{x}) \cdot f^{L}(\underline{x}) \\
&amp;= \sum_{i \in \mathbb{Z}} \sum_{j=1}^{L} f_{i}^{y}(\underline{\hat{y}}) \cdot f_{j}^{p}(\underline{x}) \cdot f_{i}^{x}(\underline{x})
\end{aligned}
$$
&lt;p>Aber Anzahl der Komponenten nimmt zu! 🤪&lt;/p></description></item><item><title>Einfache Filter für stark nichtlineare Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/einfacher_filter_stark_nichlin_sys/</link><pubDate>Tue, 09 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/einfacher_filter_stark_nichlin_sys/</guid><description>&lt;h2 id="nutzung-einfacher-filter-für-stark-nichtlineare-systeme">Nutzung „einfacher“ Filter für stark nichtlineare Systeme&lt;/h2>
&lt;p>Two variants&lt;/p>
&lt;ul>
&lt;li>Approximation of the state densities by a &lt;a href="#gaussian-mixture-filter">Gaussian mixture&lt;/a> $\rightarrow$ bank of nonlinear Kalman filters for prediction and filtering&lt;/li>
&lt;li>Approximation of all densities by a discrete-valued representation $\rightarrow$ &lt;a href="#rasterbasierte-filter">discrete-valued (grid-based) filter&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="gaussian-mixture-filter">Gaussian Mixture Filter&lt;/h2>
&lt;h3 id="motivation">Motivation&lt;/h3>
&lt;p>Approximation of the state estimate by a Gaussian mixture&lt;/p>
$$
f(\underline{x})=\sum_{i=1}^{L} w_{i} \mathcal{N}\left(\underline{x}-\underline{\hat{x}}_{i}, C_{i}\right)
$$
&lt;p>with&lt;/p>
$$
\begin{aligned}
&amp;w_{i} \geqslant 0, \quad i \in\{1, \ldots,L\} \\
&amp;\sum_{i=1}^{L} w_{i}=1
\end{aligned}
$$
&lt;p>(With these constraints, the Gaussian mixture is a valid density for any $L$)&lt;/p>
&lt;p>Parameters&lt;/p>
&lt;ul>
&lt;li>Weight vector $\underline{w} = [w_1, \dots, w_L]^T$&lt;/li>
&lt;li>Means $\underline{\hat{x}}_1, \dots, \underline{\hat{x}}_L$&lt;/li>
&lt;li>Covariance matrices $C_1, \dots, C_L$&lt;/li>
&lt;/ul>
&lt;p>Gaussian mixtures are universal approximators. If $L$ is large enough, any density can be approximated with arbitrary accuracy.&lt;/p>
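&lt;p>As a small illustration of the parametrization, the sketch below evaluates a scalar Gaussian mixture and checks numerically that it integrates to 1 when the weights sum to 1. The function name `gaussian_mixture_pdf` and all parameter values are assumptions chosen for this example.&lt;/p>

```python
import numpy as np

def gaussian_mixture_pdf(x, weights, means, variances):
    """Evaluate a scalar Gaussian mixture sum_i w_i N(x - mean_i, var_i)."""
    x = np.asarray(x, dtype=float)[..., None]          # shape (n, 1)
    comps = np.exp(-0.5 * (x - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    return comps @ weights                              # weighted sum over components

# Hypothetical parameters: two components, weights are nonnegative and sum to 1
weights = np.array([0.3, 0.7])
means = np.array([-2.0, 1.0])
variances = np.array([0.5, 1.5])

grid = np.linspace(-15.0, 15.0, 30001)
area = np.sum(gaussian_mixture_pdf(grid, weights, means, variances)) * (grid[1] - grid[0])
print(area)
```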
&lt;h3 id="vorgehen">Procedure&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Goal: transfer the insights from the Kalman filter for weakly nonlinear systems $\rightarrow$ to the strongly nonlinear case&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Therefore: process each component $i$ individually (i.e. neglect the overlap between components)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>This yields a bank of nonlinear Kalman filters that work in parallel.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Works particularly well if&lt;/p>
&lt;ul>
&lt;li>the overlap of the components is small&lt;/li>
&lt;li>the individual components are narrow (induced nonlinearity)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="prädiktionsschritt">Prediction step&lt;/h3>
&lt;p>System model&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k) + \underline{w}_k
$$
&lt;p>Simplified notation:&lt;/p>
$$
\underline{z} = \underline{a}(\underline{x}) + \underline{w} \quad \underline{w} \sim \text{Gaussian}
$$
&lt;p>&lt;strong>💡Key idea: splitting the Chapman-Kolmogorov equation&lt;/strong>&lt;/p>
$$
\begin{aligned}
f^{p}(\underline{z})&amp;=\int_{\mathbb{R}^{N}} f^{w}(\underline{z}-\underline{a}(\underline{x})) \cdot f^{e}(\underline{x}) d \underline{x}\\
&amp;=\int_{\mathbb{R}^{N}} f^{w}(\underline{z}-\underline{a}(\underline{x})) \cdot\left[\sum_{i=1}^{L} w_{i} \mathcal{N} \left(\underline{x}-\underline{\hat{x}}_{i}^{e}, C_{i}^{e}\right)\right] d \underline{x}\\
&amp;=\sum_{i=1}^{L} w_{i} \underbrace{\int_{\mathbb{R}^{N}} f^{w}(\underline{z}-\underline{a}(\underline{x})) \mathcal{N}\left(\underline{x}-\underline{\hat{x}}_{i}^{e}, C_{i}^{e}\right) d \underline{x}}_{\approx \mathcal{N}(\underline{z} - \underline{z}_{i+1}^p, C_{i+1}^p)}
\end{aligned}
$$
&lt;p>That is, for each $i$ we approximate the integral by a local density, which is again Gaussian because the component is so narrow.&lt;/p>
&lt;p> $\underline{z}_{i+1}^p, C_{i+1}^p$
are obtained by applying a nonlinear Kalman filter.&lt;/p>
&lt;h3 id="filterschritt">Filter step&lt;/h3>
&lt;p>Measurement model:&lt;/p>
$$
\underline{y}_k=\underline{h}_{k}\left(\underline{x}_{k}\right)+\underline{v}_{k}
$$
&lt;p>Simplified notation:&lt;/p>
$$
\underline{y}=\underline{h}(\underline{x})+\underline{v} \quad \underline{v} \sim \text{Gaussian}
$$
&lt;p>Filter step&lt;/p>
$$
\begin{aligned}
f^{e}(\underline{x}) &amp;= \underbrace{c^{e}}_{\text{normalization constant}} f^{v}\left(\underline{\hat{y}}-\underline{h}(\underline{x})\right) \cdot \sum_{i=1}^{L} w_{i} \mathcal{N}\left(\underline{x}-\hat{\underline{x}}_{i}^{p}, C_{i}^{p}\right) \\
&amp;=c^{e} \sum_{i=1}^{L} w_{i} \underbrace{f^{v}(\underline{\hat{y}}-\underline{h}(\underline{x})) \cdot \mathcal{N} \left(\underline{x}-\underline{\hat{x}}_{i}^{p}, C_{i}^{p}\right)}_{\approx k_i \mathcal{N} \left(\underline{x}-\underline{\hat{x}}_{i}^{e}, C_{i}^{e}\right)} \\
&amp;= c^e \sum_{i=1}^{L} w_{i} k_i \mathcal{N} \left(\underline{x}-\underline{\hat{x}}_{i}^{e}, C_{i}^{e}\right)
\end{aligned}
$$
&lt;p>$\underline{\hat{x}}_{i}^{e}, C_{i}^{e}$
are determined by a nonlinear Kalman filter.&lt;/p>
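&lt;p>The filter step above can be sketched for the simplest possible setting. The code assumes a scalar state and a linear measurement model $y = Hx + v$ with $v \sim \mathcal{N}(0, R)$, so that every component update is an ordinary Kalman update and $k_i$ is the Gaussian likelihood of the measurement under component $i$. For a nonlinear $\underline{h}$, the per-component update would be replaced by a nonlinear Kalman filter. The function name and all numbers are hypothetical.&lt;/p>

```python
import numpy as np

def gm_filter_step(weights, means, variances, y_hat, H, R):
    """Filter step for a scalar Gaussian mixture prior and y = H * x + v, v ~ N(0, R)."""
    S = H * variances * H + R                     # innovation variance per component
    K = variances * H / S                         # Kalman gain per component
    means_e = means + K * (y_hat - H * means)     # posterior means
    variances_e = (1.0 - K * H) * variances       # posterior variances
    # k_i: likelihood of the measurement under component i
    k = np.exp(-0.5 * (y_hat - H * means) ** 2 / S) / np.sqrt(2.0 * np.pi * S)
    weights_e = weights * k
    return weights_e / np.sum(weights_e), means_e, variances_e

# Hypothetical prior with two well-separated components and a measurement near the second one
w_e, m_e, C_e = gm_filter_step(
    weights=np.array([0.5, 0.5]),
    means=np.array([-2.0, 2.0]),
    variances=np.array([1.0, 1.0]),
    y_hat=2.0, H=1.0, R=1.0,
)
print(w_e, m_e, C_e)
```

&lt;p>The reweighting by $k_i$ is what lets the measurement shift probability mass between components. Dividing by the sum of the new weights plays the role of $c^e$.&lt;/p>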
&lt;h2 id="rasterbasierte-filter">Grid-based filters&lt;/h2>
&lt;h3 id="rasterbasierte-repräsentation-von-dichten">Grid-based representation of densities&lt;/h3>
&lt;p>First: scalar case&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Given: density $f(x), x \in \mathbb{R}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sought: discrete-valued representation&lt;/p>
$$
\underline{\eta} \in \mathbb{R}_{+}^{L}, \quad \underline{\mathbf{1}}^{\top} \cdot \underline{\eta}=1 \text{ (normalization)}
$$
&lt;/li>
&lt;/ul>
&lt;h3 id="rasterbasierter-filter--und-prädiktionsschritt">Grid-based filter and prediction step&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-rasterbasierte_repr_dichte.drawio.png" alt="allg_sys-rasterbasierte_repr_dichte.drawio" style="zoom:67%;" />
$$
\underline{\eta}=\left[\begin{array}{c}
\eta_{1} \\
\eta_{2} \\
\vdots \\
\eta_{L}
\end{array}\right]
$$
&lt;p>Assumption: represent $\eta_i$ by a Dirac delta function at the &lt;strong>center&lt;/strong> of each interval&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Copy%20of%20rasterbasierte_repr_dichte.drawio.png" alt="allg_sys-Copy of rasterbasierte_repr_dichte.drawio" style="zoom:67%;" />
&lt;p>Criterion: the integral values should be equal.&lt;/p>
$$
\begin{aligned}
&amp;\int_{x_{i-1}}^{x_i}f(x)dx \overset{!}{=} \underbrace{\int_{x_{i-1}}^{x_i} \eta_i \cdot \delta(x - \frac{x_i + x_{i-1}}{2}) dx}_{=\eta_i} \\
&amp;\Rightarrow \eta_i \propto \int_{x_{i-1}}^{x_i}f(x)dx \quad i \in \{1, \dots, L\}
\end{aligned}
$$
&lt;p>Normalization required:&lt;/p>
$$
\eta_{i}:=\frac{\eta_{i}}{\sum_{j=1}^{L} \eta_{j}} \quad i \in\left\{1, \dots, L\right\}
$$
&lt;p>In many cases the integral over $f(x)$ cannot be solved analytically. $\Rightarrow$ Integration is too expensive.&lt;/p>
&lt;p>Alternative: piecewise constant approximation of $f(x)$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-rasterbasierte_repr_dichte_stueckweise_konstant_approx.drawio.png" alt="allg_sys-rasterbasierte_repr_dichte_stueckweise_konstant_approx.drawio" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>But: an optimal fit also requires integration&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Therefore: use the density value at the interval midpoint&lt;/p>
$$
h_i = f(\frac{x_i + x_{i-1}}{2})
$$
&lt;p>This gives&lt;/p>
$$
\begin{aligned}
&amp;\int_{x_{i-1}}^{x_i} f(x) dx \approx \int_{x_{i-1}}^{x_i} h_i dx = h_i(\underbrace{x_i - x_{i-1}}_{=\Delta}) \\
&amp; \Rightarrow \eta_i \propto \Delta \cdot h_i = \Delta \cdot f(\frac{x_i + x_{i-1}}{2})
\end{aligned}
$$
&lt;p>followed by normalization&lt;/p>
&lt;/li>
&lt;/ul>
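&lt;p>The midpoint rule $\eta_i \propto \Delta \cdot f\left(\frac{x_i + x_{i-1}}{2}\right)$ followed by normalization can be sketched as follows. The helper name `quantize_density`, the grid limits, and the standard normal test density are assumptions for illustration.&lt;/p>

```python
import numpy as np

def quantize_density(f, x0, delta, L):
    """Quantize a scalar density f on L cells of width delta starting at x0."""
    # Midpoint of cell i (i = 1..L): (2i - 1) / 2 * delta + x0
    midpoints = x0 + (np.arange(1, L + 1) - 0.5) * delta
    eta = delta * f(midpoints)          # piecewise constant approximation of the cell integral
    return midpoints, eta / np.sum(eta) # normalization so that the weights sum to 1

def standard_normal(x):
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

midpoints, eta = quantize_density(standard_normal, x0=-5.0, delta=0.1, L=100)
print(eta.sum(), np.sum(eta * midpoints))
```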
&lt;h4 id="rasterbasierter-filterschritt">&lt;strong>Grid-based filter step&lt;/strong>&lt;/h4>
&lt;p>Generative model&lt;/p>
$$
y = h(x, v)
$$
&lt;p>Convert it into the probabilistic model $f(y \mid x)$&lt;/p>
&lt;p>Measurements $\hat{y}$ are not discrete-valued $\rightarrow$ quantization of $f(\hat{y} \mid x) = f^L(x)$&lt;/p>
&lt;p>Since $f(\hat{y} \mid x)$ is generally not analytically integrable $\rightarrow$&lt;/p>
$$
\eta_{i}^{L} \propto \Delta f^{L}\left(\frac{x_{i}+x_{i-1}}{2}\right) \quad i \in\{1, \dots,L\}
$$
&lt;p>followed by normalization.&lt;/p>
&lt;p>For a given density $\underline{\eta}^{p}=\left[\eta_{1}^{p}, \eta_{2}^{p}, \ldots, \eta_{L}^{p}\right]^{\top}$&lt;/p>
$$
\underline{\eta}^{e} \propto \underline{\eta}^{L} \odot \underline{\eta}^{p}
\tag{posterior distribution}
$$
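&lt;p>A minimal sketch of the grid-based filter step: the likelihood is sampled at the cell midpoints, multiplied element-wise with the prior weights, and normalized. The measurement model $y = x + v$ with $v \sim \mathcal{N}(0, 1)$, the measurement value, and the five-cell grid are assumptions made only for this example.&lt;/p>

```python
import numpy as np

def grid_filter_step(eta_p, likelihood, midpoints, delta):
    """Posterior weights: eta_e proportional to eta_L (element-wise) eta_p."""
    eta_L = delta * likelihood(midpoints)   # quantized likelihood f^L at the midpoints
    eta_e = eta_L * eta_p                   # Hadamard product
    return eta_e / np.sum(eta_e)            # normalization

y_hat = 0.4  # hypothetical measurement

def likelihood(x):
    """f(y_hat | x) for y = x + v, v ~ N(0, 1); constant factors cancel in the normalization."""
    return np.exp(-0.5 * (y_hat - x) ** 2)

midpoints = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # cells of width 1 on [-2.5, 2.5]
eta_p = np.full(5, 0.2)                            # uniform prior weights
eta_e = grid_filter_step(eta_p, likelihood, midpoints, delta=1.0)
print(eta_e)
```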
&lt;h4 id="rasterbasierter-pradiktionsschritt">&lt;strong>Grid-based prediction step&lt;/strong>&lt;/h4>
&lt;p>Generative model&lt;/p>
$$
x_{k+1} = a_k(x_k, w_k)
$$
&lt;p>Simplified notation&lt;/p>
$$
z = a(x, w)
$$
&lt;p>Probabilistic model: $f(z \mid x)$&lt;/p>
&lt;p>Here we have to quantize a 2D density even for scalar states.&lt;/p>
&lt;p>$\Rightarrow$ The result is a matrix&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-rasterbasiert_prädiktion.drawio.png" alt="allg_sys-rasterbasiert_prädiktion.drawio" style="zoom:67%;" />
$$
A_{i j} \propto f\left(\frac{z_{j}+z_{j-1}}{2} \mid \frac{x_{i}+x_{i-1}}{2}\right)
$$
&lt;p>Normalization&lt;/p>
&lt;ul>
&lt;li>$A$ is a transition matrix&lt;/li>
&lt;li>Stochastic matrix, row sums = 1&lt;/li>
&lt;li>$A_{i j}:=\displaystyle\frac{A_{i j}}{\sum_{l=1}^{L} A_{i l}}, \quad i, j \in\{1, \ldots,L\}$
&lt;/li>
&lt;/ul>
&lt;p>Given:&lt;/p>
&lt;ul>
&lt;li>Transition matrix $A \in \mathbb{R}_{+}^{L \times L}$&lt;/li>
&lt;li>Estimate from the last filter step $\underline{\eta}^e \in \mathbb{R}_{+}^{L}$&lt;/li>
&lt;/ul>
&lt;p>Result of the prediction step:&lt;/p>
$$
\underline{\eta}^p = A^\top \underline{\eta}^e
$$
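&lt;p>More expensive than the filter step: a matrix-vector product ($\mathcal{O}(L^2)$) instead of an element-wise product ($\mathcal{O}(L)$) 🤪&lt;/p>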
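&lt;p>The prediction step can be sketched in the same style. The code builds the transition matrix from a conditional density evaluated at the cell midpoints, normalizes each row, and then computes $\underline{\eta}^p = A^\top \underline{\eta}^e$. The system $z = x + w$ with Gaussian noise of standard deviation 0.5 and the shared grid for $x$ and $z$ are assumptions for illustration.&lt;/p>

```python
import numpy as np

def transition_matrix(cond_density, z_mid, x_mid):
    """A[i, j] proportional to f(z_j | x_i) at the cell midpoints, rows normalized to 1."""
    A = cond_density(z_mid[None, :], x_mid[:, None])
    return A / A.sum(axis=1, keepdims=True)

def grid_prediction_step(A, eta_e):
    """Predicted weights eta_p = A^T eta_e."""
    return A.T @ eta_e

def cond_density(z, x):
    """Assumed model z = x + w with w ~ N(0, 0.5^2); constant factors cancel per row."""
    return np.exp(-0.5 * ((z - x) / 0.5) ** 2)

mid = np.arange(-2.0, 3.0)                      # shared midpoints -2, -1, 0, 1, 2
A = transition_matrix(cond_density, mid, mid)
eta_e = np.array([0.0, 0.0, 1.0, 0.0, 0.0])     # all mass in the middle cell
eta_p = grid_prediction_step(A, eta_e)
print(eta_p)
```

&lt;p>Because the rows of $A$ sum to 1, the predicted weights stay normalized without an extra normalization step.&lt;/p>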
&lt;h3 id="erweiterung-prädiktionsschritt">Extension of the prediction step&lt;/h3>
&lt;p>Assumed so far: the grid for $z$ (i.e. $x_{k+1}$) is already known/fixed&lt;/p>
&lt;p>Unfortunately this is not practical, because the value range results from the mapping.&lt;/p>
&lt;p>Special case: linear system with additive noise (the general case is harder)&lt;/p>
$$
z = \underbrace{x + u}_{z^\prime} + w \quad w \sim f^w(w)
$$
&lt;ul>
&lt;li>
&lt;p>Intermediate quantity $z^\prime$: use the input $\hat{u}$ to shift the grid (moving grid)&lt;/p>
$$
z_i^\prime = x_i + \hat{u} \quad i \in \{1, \dots, L\}
$$
&lt;p>We set $z_i = z_i^\prime$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Then convolution with $f^w(w)$:&lt;/p>
$$
f\left(z \mid z^{\prime}\right)=f^{w}\left(z-z^{\prime}\right)
$$
&lt;p>Then quantization of $f(z \mid z^\prime) \Rightarrow$&lt;/p>
$$
\begin{aligned}
A_{i j} &amp;=f\left(\frac{z_{j}+z_{j - 1}}{2} \mid \frac{z_{i}^{\prime}+z_{i - 1}^\prime}{2}\right) \\
&amp;=f^{w}\left(\frac{1}{2}\left[z_{j}+z_{j-1}-\left(z_{i}^{\prime}+z_{i-1}^{\prime}\right)\right]\right)
\end{aligned}
$$
&lt;p>We know&lt;/p>
$$
\begin{aligned}
\frac{z_{j}+z_{j-1}}{2}&amp;=\frac{2 j-1}{2} \Delta+z_{0} \\
\frac{z_{i}^{\prime}+z_{i-1}^{\prime}}{2}&amp;=\frac{2 i-1}{2} \Delta+z_{0}^{\prime} \\
\Rightarrow A_{ij} &amp;= f^w(\Delta[j - i]), \text{ if } z_0 = z_0^\prime \\
\Rightarrow j - i &amp;\in \{-(L-1), \dots, -1, 0, 1, \dots, L - 1\}
\end{aligned}
$$
&lt;p>Pre-discretization of $f^w(\cdot)$&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Vorabdiskretisierung.drawio.png" alt="allg_sys-Vorabdiskretisierung.drawio">&lt;/p>
&lt;p>Enter the values into the transition matrix $A$ with $A_{ij} = f^w(\Delta(j-i))$&lt;/p>
&lt;p>Then the computation proceeds as before.&lt;/p>
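&lt;p>Since $A_{ij}$ depends only on $j - i$, the noise density needs to be evaluated at just $2L - 1$ offsets, and the matrix is filled from that table. The sketch below assumes Gaussian noise and adds the row normalization from the earlier prediction step. The names `toeplitz_transition` and `noise_density` are hypothetical.&lt;/p>

```python
import numpy as np

def toeplitz_transition(f_w, delta, L):
    """Build A[i, j] = f_w(delta * (j - i)) from a pre-discretized noise density."""
    offsets = np.arange(-(L - 1), L)         # all possible values of j - i
    table = f_w(delta * offsets)             # 2L - 1 evaluations instead of L * L
    i, j = np.indices((L, L))
    A = table[(j - i) + (L - 1)]             # look up each entry by its offset
    return A / A.sum(axis=1, keepdims=True)  # row normalization (stochastic matrix)

def noise_density(w):
    """Assumed noise w ~ N(0, 1); constant factors cancel in the normalization."""
    return np.exp(-0.5 * w ** 2)

A = toeplitz_transition(noise_density, delta=0.5, L=4)
print(A)
```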
&lt;h3 id="rekonstruktion-kontinuierlicher-dichten">Reconstruction of continuous densities&lt;/h3>
&lt;p>The result of prediction and filtering is available in &lt;em>discrete-valued&lt;/em> form $\underline{\eta} \in \mathbb{R}_+^L$&lt;/p>
&lt;p>Computing characteristic quantities is simple, but requires the positions&lt;/p>
&lt;p>Expected value&lt;/p>
$$
\begin{aligned}
\hat{x} &amp;=\int_{\mathbb{R}} x \sum_{i=1}^{L} \eta_{i} \delta\left(x-\frac{x_{i}+x_{i-1}}{2}\right) d x \\
&amp;=\sum_{i=1}^{L} \eta_{i} \frac{x_{i}+x_{i-1}}{2} \quad \mid \frac{x_{i}+x_{i-1}}{2}=\frac{2 i-1}{2} \Delta+x_{0} \\
&amp;= \sum_{i=1}^{L} \eta_{i} (\frac{2i-1}{2} \Delta + x_0)
\end{aligned}
$$
&lt;p>The variance is obtained analogously.&lt;/p>
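&lt;p>A short sketch of both moments, using the midpoint positions $\frac{2i-1}{2}\Delta + x_0$ from the derivation above. The function name `grid_moments` and the three-cell example are assumptions for illustration.&lt;/p>

```python
import numpy as np

def grid_moments(eta, x0, delta):
    """Mean and variance of the Dirac mixture located at the cell midpoints."""
    i = np.arange(1, len(eta) + 1)
    positions = (2 * i - 1) / 2 * delta + x0
    mean = np.sum(eta * positions)
    var = np.sum(eta * (positions - mean) ** 2)
    return mean, var

# Hypothetical weights on three cells of width 2 starting at x0 = 0 (midpoints 1, 3, 5)
mean, var = grid_moments(np.array([0.25, 0.5, 0.25]), x0=0.0, delta=2.0)
print(mean, var)
```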
&lt;p>Sought: continuous reconstruction $f(x)$ from $\underline{\eta}$&lt;/p>
&lt;p>As a Dirac mixture&lt;/p>
$$
f(x) \approx \sum_{i=1}^{L} \eta_{i} \delta\left(x-\frac{x_{i}+x_{i-1}}{2}\right)
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Rekonstruktion.drawio.png" alt="allg_sys-Rekonstruktion.drawio" style="zoom:67%;" />
&lt;p>Different options for interpolation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Piecewise constant interpolation&lt;/p>
$$
\int_{x_{i-1}}^{x_{i}} h_{i} d x \overset{!}{=} \int_{x_{i-1}}^{x_{i}} \eta_{i} \delta\left(x-\frac{x_{i}+x_{i-1}}{2}\right) d x \Rightarrow h_{i}=\frac{\eta_{i}}{\Delta}
$$
&lt;/li>
&lt;li>
&lt;p>Continuous, piecewise linear interpolation&lt;/p>
$$
(t_i + t_{i-1}) \frac{\Delta}{2} = \eta_i
$$
&lt;p>with the additional condition&lt;/p>
$$
t_0 = t_1
$$
&lt;/li>
&lt;/ul>
&lt;h3 id="erweiterungen">Extensions&lt;/h3>
&lt;p>Multidimensional case: $\underline{x} \in \mathbb{R}^N$&lt;/p>
&lt;ul>
&lt;li>Filter step: analogous&lt;/li>
&lt;li>Prediction step: $f(\underline{z} \mid \underline{x})$ now maps from $\mathbb{R}^N$ to $\mathbb{R}^N \Rightarrow A \in \mathbb{R}^{2N}$&lt;/li>
&lt;/ul>
&lt;p>Remedies&lt;/p>
&lt;ul>
&lt;li>Moving grid for nonlinear system models&lt;/li>
&lt;li>Adaptive resolution of an equidistant / homogeneous grid&lt;/li>
&lt;li>Inhomogeneous grid $\rightarrow$ variable resolution&lt;/li>
&lt;li>Efficient implementation, e.g. sparse matrices ($0$ entries are not stored explicitly)&lt;/li>
&lt;/ul></description></item><item><title>Zusammenfassung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/summary/</link><pubDate>Sun, 07 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/allgemine_systeme/summary/</guid><description>&lt;h2 id="vorwärtsinferenz">Vorwärtsinferenz&lt;/h2>
&lt;ul>
&lt;li>Given
&lt;ul>
&lt;li>$f_a(a)$&lt;/li>
&lt;li>$g(a)$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Sought: $f_b(b)$&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Vorwärtsinferenz.drawio.png" alt="allg_sys-Vorwärtsinferenz.drawio" style="zoom:67%;" />
&lt;p>&lt;strong>Steps&lt;/strong>:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Rewrite $f(b \mid a) = \delta(b - g(a))$ using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute $f_b(b)$ using the &lt;strong>Chapman-Kolmogorov equation&lt;/strong>&lt;/p>
$$
f(b) = \int f(b \mid a) f(a) da
$$
&lt;p>and insert the rewritten form of $f(b \mid a)$ from step 1. This yields the sought density $f_b(b)$ as a function of $f_a(a)$.&lt;/p>
&lt;/li>
&lt;/ol>
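&lt;p>As a worked instance of the two steps, the sketch below assumes $g(a) = a^2$ and a standard normal $f_a$ (both chosen only for illustration, not taken from the exercises). For $b > 0$, the function $b - a^2$ has the roots $a_{1,2} = \pm\sqrt{b}$ with $|g^\prime(a_i)| = 2\sqrt{b}$, so $f_b(b) = \frac{f_a(\sqrt{b}) + f_a(-\sqrt{b})}{2\sqrt{b}}$.&lt;/p>

```python
import math

def f_a(a):
    """Assumed input density: standard normal."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def f_b(b):
    """Density of b = a**2 obtained from the Dirac delta identity."""
    if b > 0.0:
        root = math.sqrt(b)
        # Two roots +root and -root, each weighted by 1 / |g'(root)| = 1 / (2 * root)
        return (f_a(root) + f_a(-root)) / (2.0 * root)
    return 0.0  # no real roots for negative b, so the density vanishes

print(f_b(1.0))
```

&lt;p>For this choice the result is the chi-square density with one degree of freedom, $e^{-b/2} / \sqrt{2 \pi b}$, which gives an independent check of the formula.&lt;/p>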
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Example: Exercise 9.1&lt;/span>
&lt;/div>
&lt;h2 id="rückwartsinferenz">Backward inference&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Rückwärtsinferenz.drawio%20%281%29.png" alt="allg_sys-Rückwärtsinferenz.drawio (1)" style="zoom:67%;" />
&lt;h3 id="konkrete-messung">Concrete measurement&lt;/h3>
&lt;ol>
&lt;li>
&lt;p>Rewrite $f_b(b \mid a) = \delta(b - g(a))$ using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute $f_b(b)$&lt;/p>
$$
f_b(b) = \int f_{a, b}(a, b) da = \int f_{b}(b \mid a) f_a(a) da
$$
&lt;p>by inserting the rewritten form of $f_b(b \mid a)$ from step 1&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compute $f_a(a \mid b)$ using Bayes' rule&lt;/p>
$$
f_a(a \mid b) = \frac{f_b(b \mid a) f_a(a)}{f_b(b)} = \frac{\overbrace{\delta(b - g(a))}^{\text{Step 1}} f_a(a)}{\underbrace{f_b(b)}_{\text{Step 2}}}
$$
&lt;/li>
&lt;/ol>
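&lt;p>For a concrete measurement the posterior is a Dirac mixture located at the roots of $\hat{b} - g(a)$. The sketch below again assumes $g(a) = a^2$, now with a Gaussian prior of hypothetical mean and standard deviation. Each root $a_i$ receives the weight $f_a(a_i) / (|g^\prime(a_i)| f_b(\hat{b}))$. Since $|g^\prime(a_i)| = 2\sqrt{\hat{b}}$ is the same for both roots, only the prior values decide the weights.&lt;/p>

```python
import math

def gauss(a, mean, std):
    """Scalar Gaussian density."""
    return math.exp(-0.5 * ((a - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def backward_inference_square(b_hat, mean, std):
    """Posterior of a given b_hat = a**2 (b_hat must be positive): Dirac positions and weights."""
    roots = [math.sqrt(b_hat), -math.sqrt(b_hat)]
    # The common factor 1 / |g'(a_i)| cancels when dividing by f_b(b_hat)
    raw = [gauss(a, mean, std) for a in roots]
    total = sum(raw)
    return roots, [r / total for r in raw]

roots, weights = backward_inference_square(b_hat=4.0, mean=1.0, std=1.0)
print(roots, weights)
```

&lt;p>With a prior centered at 1, the positive root is much more plausible than the negative one. A prior that is symmetric about zero would give both roots the weight 0.5.&lt;/p>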
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Example: Exercises 9.2, 9.3&lt;/span>
&lt;/div>
&lt;h3 id="unsichere-messung">Uncertain measurement&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Ru%cc%88ckwa%cc%88rtsinferenz_dichte.drawio.png" alt="allg_sys-Rückwärtsinferenz_dichte.drawio" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>&lt;strong>Steps&lt;/strong>:&lt;/p>
&lt;ol start="0">
&lt;li>Extend the system by an additional stochastic mapping and a fixed output $\hat{z}$&lt;/li>
&lt;/ol>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-08%2016.51.24.png" alt="截屏2022-08-08 16.51.24" style="zoom: 50%;" />
&lt;ol>
&lt;li>
&lt;p>Determine $f(\hat{z} \mid y)$&lt;/p>
$$
\begin{aligned}
f(\hat{z} \mid y) &amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{f(y)} \\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int f(y, x) dx} \\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int f(y \mid x)f(x) dx} \\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int \delta(y - g(x)) f(x) dx}
\end{aligned}
$$
&lt;p>Then insert the rewritten form of $\delta(y - g(x))$, using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;p>into the last expression.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compute the backward inference $f(x \mid \hat{z})$&lt;/p>
$$
\begin{aligned}
f(x \mid \hat{z}) &amp;=\frac{1}{f\left(\hat{z}\right)} \cdot f(x, \hat{z}) \quad \mid \text{marginalization over } y\\
&amp;=\frac{1}{f(\hat{z})} \int f(x, y, \hat{z}) d y \\
&amp;=\frac{1}{f(\hat{z})} \int f(\hat{z} \mid y, x) \cdot f(y , x) d y \quad \mid \hat{z}, x \text{ are conditionally independent given } y\\
&amp;=\frac{1}{f(\hat{z})} \int f(\hat{z} \mid y) \cdot f(y \mid x) \cdot f(x) d y \\
&amp;=\frac{1}{f(\hat{z})} \int \underbrace{f(\hat{z} \mid y)}_{\text{computed in step 1}} \cdot \underbrace{f(y \mid x)}_{\text{system model}} \cdot f(x) d y
\end{aligned}
$$
&lt;/li>
&lt;/ol>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Example: Exercise 9.4&lt;/span>
&lt;/div></description></item><item><title>Sample-basierte Filter</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/</guid><description/></item><item><title>Stochastische Informationsverarbeitung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/</link><pubDate>Thu, 26 May 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/</guid><description>&lt;h2 id="meta-information">Meta Information&lt;/h2>
&lt;p>Lecture website: &lt;a href="https://isas.iar.kit.edu/de/LehreWS2122_SI.php">&lt;strong>Stochastische Informationsverarbeitung&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Semester: WS21/22&lt;/p>
&lt;p>Language: German&lt;/p>
&lt;p>Lecturer:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://isas.iar.kit.edu/de/Mitarbeiter_Hanebeck.php">Prof. Dr.-Ing. Uwe Hanebeck&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://isas.iar.kit.edu/de/Mitarbeiter_Frisch.php">Daniel Frisch&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>Exam type: Oral&lt;/p>
&lt;h2 id="beschreibung">Description&lt;/h2>
&lt;h3 id="inhalt">Content&lt;/h3>
&lt;p>The course SI teaches the fundamental and formal foundations of state estimation, centered on prediction and filtering.&lt;/p>
&lt;p>Models and state estimators for &lt;strong>discrete-valued and continuous-valued&lt;/strong> linear systems as well as &lt;strong>general&lt;/strong> systems are covered:&lt;/p>
&lt;ul>
&lt;li>For discrete-valued and continuous-valued linear systems
&lt;ul>
&lt;li>Prediction and filtering (HMM, Kalman filter)&lt;/li>
&lt;li>Smoothing for discrete-valued systems (in addition)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Modeling of general static and dynamic systems
&lt;ul>
&lt;li>Deriving a probabilistic system description from a generative one&lt;/li>
&lt;li>Different types of noise influence (additive, multiplicative) and different density representations are examined.&lt;/li>
&lt;li>Fundamental methods of state estimation for general systems&lt;/li>
&lt;li>Challenges in implementing generic estimators&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="ziel">Goals&lt;/h3>
&lt;ul>
&lt;li>Review of the fundamentals of probability theory&lt;/li>
&lt;li>Intuition for systems theory and the handling of uncertainties&lt;/li>
&lt;li>Understanding of
&lt;ul>
&lt;li>system modeling and system identification&lt;/li>
&lt;li>fundamental estimation, fusion, filtering, and prediction methods&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Awareness of difficulties and challenges&lt;/li>
&lt;li>Derivation and application of exact estimators for
&lt;ul>
&lt;li>discrete-valued systems&lt;/li>
&lt;li>linear continuous-valued systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Derivation and application of approximate estimators for
&lt;ul>
&lt;li>weakly nonlinear systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="struktur">Structure&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Discrete-valued systems&lt;/strong>
&lt;ul>
&lt;li>Static systems&lt;/li>
&lt;li>Dynamic systems: Markov chain, measurement model&lt;/li>
&lt;li>State estimation in the hidden Markov model&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Continuous-valued linear systems&lt;/strong>
&lt;ul>
&lt;li>Static systems&lt;/li>
&lt;li>Dynamic systems: system model with Markov property, measurement model&lt;/li>
&lt;li>State estimation: Kalman filter&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Continuous-valued and weakly nonlinear systems&lt;/strong>
&lt;ul>
&lt;li>Static systems&lt;/li>
&lt;li>Dynamic systems&lt;/li>
&lt;li>Nonlinear estimation by linearization (EKF)&lt;/li>
&lt;li>Nonlinear estimation with the Kalman filter in probabilistic form&lt;/li>
&lt;li>Computation of the moments: analytically, numerically, based on samples (UKF)&lt;/li>
&lt;li>Ensemble Kalman filter (EnKF)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>General systems&lt;/strong>
&lt;ul>
&lt;li>Dirac delta function&lt;/li>
&lt;li>Functions of random variables&lt;/li>
&lt;li>Probabilistic system models, abstraction&lt;/li>
&lt;li>Prediction for nonlinear systems&lt;/li>
&lt;li>Filter step for nonlinear systems&lt;/li>
&lt;li>Factor graphs and message passing&lt;/li>
&lt;li>Simple filters for strongly nonlinear systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Sample-based filters&lt;/strong>
&lt;ul>
&lt;li>Reapproximation of continuous densities with samples&lt;/li>
&lt;li>Particle filter&lt;/li>
&lt;li>Progressive filtering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Empirische Momente von zufälligen und deterministischen Samples</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/exkurs_empirische_momente/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/exkurs_empirische_momente/</guid><description>&lt;p>Erzeugung von Samples&lt;/p>
$$
y_i = m + \sigma \cdot w_i \quad w_i \sim \mathcal{N}(0, 1) \quad i = 1, \dots , L
$$
&lt;ul>
&lt;li>$w_i$: base samples&lt;/li>
&lt;li>$m$: mean&lt;/li>
&lt;/ul>
&lt;p>Check:&lt;/p>
$$
\begin{aligned}
&amp;E\left\{y_{i}\right\}=E\left\{m+\sigma \cdot w_{i}\right\}=E\{m\}+\sigma \cdot E\left\{w_{i}\right\}=m \\
&amp;E\left\{\left(y_{i}-m\right)^{2}\right\}=E\left\{\left(\sigma \cdot w_{i}\right)^{2}\right\}=\sigma^{2} E\left\{w_{i}^{2}\right\}=\sigma^{2}
\end{aligned}
$$
&lt;p>Empirical estimators&lt;/p>
$$
\hat{m}=\frac{1}{L} \sum_{i=1}^{L} y_{i}
$$
$$
\begin{aligned}
\hat{c}=\hat{\sigma}^{2} &amp;=\frac{1}{L} \sum_{i=1}^{L}\left(y_{i}-\hat{m}\right)^{2} \\
&amp;=\frac{1}{L} \sum_{i=1}^{L}\left(y_{i}^{2}-2 \hat{m} y_{i}+\hat{m}^{2}\right) \\
&amp;=\frac{1}{L} \sum_{i=1}^{L} y_{i}^{2}-\left(\frac{1}{L} \sum_{i=1}^{L} y_{i}\right)^{2} \\
&amp;=\frac{1}{L} \sum_{i=1}^{L} y_{i}^{2}-\frac{1}{L^{2}} \sum_{i=1}^{L} \sum_{j=1}^{L} y_{i} y_{j}
\end{aligned}
$$
&lt;p>Verification&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Mean&lt;/p>
$$
\begin{aligned}
E\{\hat{m}\} &amp;=E\left\{\frac{1}{L} \sum_{i=1}^{L}\left(m+\sigma w_{i}\right)\right\} \\
&amp;=\frac{1}{L} \sum_{i=1}^{L} E\left\{m+\sigma w_{i}\right\} \\
&amp;=m \quad ✅
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Variance&lt;/p>
$$
\begin{aligned}
E\{\hat{c}\}&amp;=E\left\{\frac{1}{L} \sum_{i=1}^{L}\left(m+\sigma w_{i}\right)^{2}-\frac{1}{L^{2}} \sum_{i=1}^{L} \sum_{j=1}^{L}\left(m+\sigma w_{i}\right)\left(m+\sigma w_{j}\right)\right\}\\
&amp;=\frac{1}{L} \sum_{i=1}^{L} E\left\{m^{2}+2 m \sigma w_{i}+\sigma^{2} w_{i}^{2}\right\}- \\
&amp; \qquad\frac{1}{L^{2}} \sum_{i=1}^{L} \sum_{j=1}^{L} E\left\{m^{2}+m \sigma w_{i}+m \sigma w_{j}+\sigma^{2} w_{i} w_{j}\right\}\\
&amp;=\frac{1}{L} \sum_{i=1}^{L}\left(m^{2}+\sigma^{2}\right)-\frac{1}{L^{2}} \sum_{i=1}^{L} \sum_{j=1}^{L}\left(m^{2}+\sigma^{2} E\left\{w_{i} w_{j}\right\}\right)\\
&amp;=m^{2}+\sigma^{2}-m^{2}-\frac{1}{L^{2}} \cdot L \cdot \sigma^{2}\\
&amp;=\sigma^{2}-\frac{1}{L} \sigma^{2}\\
&amp;=\frac{L-1}{L} \cdot \sigma^{2}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
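&lt;p>The bias factor $\frac{L-1}{L}$ of the empirical variance can be observed in a small Monte Carlo experiment. This is only a sketch: the values of $m$, $\sigma$, $L$, the number of runs, and the random seed are arbitrary assumptions.&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(0)
m, sigma, L, runs = 1.0, 2.0, 5, 200_000

w = rng.standard_normal((runs, L))                  # base samples w_i ~ N(0, 1)
y = m + sigma * w                                   # samples y_i = m + sigma * w_i
m_hat = y.mean(axis=1)                              # empirical mean per run
c_hat = ((y - m_hat[:, None]) ** 2).mean(axis=1)    # empirical variance with 1/L normalization

print(m_hat.mean())   # close to m
print(c_hat.mean())   # close to (L - 1) / L * sigma**2
```

&lt;p>Averaged over many runs, the empirical mean matches $m$, while the empirical variance with the $\frac{1}{L}$ normalization systematically underestimates $\sigma^2$.&lt;/p>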
&lt;p>For deterministic samples (e.g.)&lt;/p>
$$
\begin{aligned}
&amp;y_{1}=m-\sigma \\
&amp;y_{2}=m+\sigma
\end{aligned}
$$
$$
\begin{aligned}
\hat{m} &amp;=\frac{1}{2}(m-\sigma+m+\sigma)=m \quad ✅ \\
\hat{\sigma}^{2} &amp;=\frac{1}{2}\left[(m-\sigma)^{2}+(m+\sigma)^{2}\right]-\frac{1}{4}(m-\sigma+m+\sigma)^{2} \\
&amp;=\frac{1}{2}\left[m^{2}-2 m \sigma+\sigma^{2}+m^{2}+2 m \sigma+\sigma^{2}\right]-m^{2} \\
&amp;=\frac{1}{2}\left[2 m^{2}+2\sigma^{2}\right]-m^{2} \\
&amp;=\sigma^{2}
\end{aligned}
$$</description></item><item><title>Reapproximation von Dichten</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/reapproximation_von_dichte/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/reapproximation_von_dichte/</guid><description>&lt;p>4 cases&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-10%2021.58.40.png" alt="截屏2022-08-10 21.58.40" style="zoom: 33%;" />
&lt;p>Examples for reapproximation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Continuous → continuous: Gaussian mixture reduction&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Continuous → discrete: deterministic sampling, i.e., replacing a continuous density with Dirac mixture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Discrete → continuous: density estimation, i.e., finding a continuous density representing a set of given samples&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Discrete → discrete: Dirac mixture reduction&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Challenge:&lt;/strong> Three cases involving discrete densities&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Continuous → continuous case: Use standard distance measures, e.g. integral squared distance (ISD)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Discrete densities prohibit use of standard distance measures&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Here we focus on &lt;strong>continuous&lt;/strong> → &lt;strong>discrete reapproximation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Given: Continuous density $\tilde{f}(\underline{x})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Deterministic sampling, i.e., approximation with Dirac mixture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Definition of Dirac mixture with $L$ components&lt;/p>
$$
f(\underline{x})=\sum_{i=1}^{L} w_{i} \cdot \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right)
$$
&lt;ul>
&lt;li>Weights $w_{i}>0, \displaystyle \sum_{i=1}^{L} w_{i}=1$&lt;/li>
&lt;li>$\underline{\hat{x}}_i$: locations / samples&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>🎯 Goal: Systematic approximation of given continuous density&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Application examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Mapping of random variables through nonlinear functions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sample-based fusion and estimation (UKF)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="univariate-case-1d">Univariate Case (1D)&lt;/h2>
&lt;h3 id="synthesis">Synthesis&lt;/h3>
&lt;p>Instead of comparing densities $\tilde{f}(x), f(x)$, we compare &lt;strong>cumulative distribution functions (CDFs)&lt;/strong> $\tilde{F}(x), F(x)$&lt;/p>
&lt;p>CDF of $f(x)$:&lt;/p>
$$
F(x)=P(\boldsymbol{x} \leq x)=\int_{-\infty}^{x} f(u) \mathrm{d} u
$$
&lt;ul>
&lt;li>
&lt;p>This definition is &lt;em>unique&lt;/em> in the sense that the alternative definition $\bar{F}(x)=P(\boldsymbol{x}>x)$ carries no additional information; it follows directly from $F(x)$:&lt;/p>
$$
\begin{aligned}
\bar{F}(x)&amp;=P(\boldsymbol{x}>x) \\
&amp;=\int_{x}^{\infty} f(u) d u \\
&amp;=1-\int_{-\infty}^{x} f(u) d u \\
&amp;=1-P(\boldsymbol{x} \leq x) \\
&amp;=1-F(x)
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>Dirac mixture density&lt;/p>
$$
f(x, \underline{\hat{x}})=\sum_{i=1}^{L} w_{i} \delta\left(x-\hat{x}_{i}\right)
$$
&lt;p>Dirac mixture cumulative distribution&lt;/p>
$$
F(x, \underline{\hat{x}})=\sum_{i=1}^{L} w_{i} \mathrm{H}\left(x-\hat{x}_{i}\right) \text { with } \mathrm{H}(x)=\int_{-\infty}^{x} \delta(t) \mathrm{d} t= \begin{cases}0 &amp; x&lt;0 \\ \frac{1}{2} &amp; x=0 \\ 1 &amp; x>0\end{cases}
$$
&lt;p>with the vector of Dirac positions&lt;/p>
$$
\underline{\hat{x}}=\left[\hat{x}_{1}, \hat{x}_{2}, \ldots, \hat{x}_{L}\right]^{\top}
$$
&lt;/li>
&lt;/ul>
&lt;p>CDF of $\tilde{f}(x)$ follows analogously:&lt;/p>
$$
\tilde{F}(x)=\int_{-\infty}^{x} \tilde{f}(u) \mathrm{d} u
$$
&lt;ul>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>Gaussian density:&lt;/p>
$$
\tilde{f}(x)=\frac{1}{\sqrt{2 \pi}} \exp \left(-\frac{1}{2} x^{2}\right)
$$
&lt;p>$\Rightarrow$ Gaussian cumulative distribution:&lt;/p>
$$
\tilde{F}(x)=\frac{1}{2}\left(1+\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right)
$$
&lt;/li>
&lt;/ul>
&lt;p>We compare $\tilde{F}(x)$ and $F(x)$ using the &lt;strong>Cramér–von Mises distance&lt;/strong>:&lt;/p>
$$
D(\underline{\hat{x}})=\int_{\mathbb{R}}\left(\tilde{F}(x)-F(x, \underline{\hat{x}})\right)^{2} \mathrm{~d} x
$$
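&lt;p>As a rough numerical illustration, the distance can be approximated by a midpoint rule. This sketch assumes a standard normal target density; the function names, the integration bounds and the step count are arbitrary choices for this example.&lt;/p>

```python
from statistics import NormalDist


def heaviside(d):
    """H(d) with H(0) = 1/2, matching the definition above."""
    return 1.0 if d > 0 else (0.5 if d == 0 else 0.0)


def cvm_distance(locations, weights, lo=-8.0, hi=8.0, steps=16000):
    """Approximate D = integral of (F_tilde(x) - F(x))^2 dx with the midpoint rule."""
    target = NormalDist()  # assumed target: standard normal
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        # CDF of the Dirac mixture: weighted sum of step functions
        F = sum(w * heaviside(x - x_i) for w, x_i in zip(weights, locations))
        total += (target.cdf(x) - F) ** 2 * h
    return total
```

&lt;p>With a single Dirac component, placing it at the median ($\hat{x}_1 = 0$) should give a smaller value than a shifted location such as $\hat{x}_1 = 0.5$.&lt;/p>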
&lt;h3 id="minimization-of-cramervon-mises-distance">Minimization of Cramér–von Mises distance&lt;/h3>
&lt;p>Gradient of the distance measure $D(\underline{\hat{x}})$:&lt;/p>
$$
\underline{G}(\underline{\hat{x}})=\nabla D(\underline{\hat{x}})=\frac{\partial D(\underline{\hat{x}})}{\partial \underline{\hat{x}}}=\left[\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{1}}, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{2}}, \ldots, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{L}}\right]^{\top}
$$
&lt;p>with&lt;/p>
$$
\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i} \int_{-\infty}^{\infty}[\tilde{F}(t)-F(t, \underline{\hat{x}})] \delta\left(t-\hat{x}_{i}\right) \mathrm{d} t
$$
&lt;p>or&lt;/p>
$$
\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i}\left[\tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)\right] \text { with } F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=\sum_{j=1}^{L} w_{j} \mathrm{H}\left(\hat{x}_{i}-\hat{x}_{j}\right)
$$
&lt;p>The Hessian matrix is&lt;/p>
$$
\mathbf{H}(\underline{\hat{x}})=\operatorname{diag}\left(\left[\frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{1}^{2}}, \frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{2}^{2}}, \ldots, \frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{L}^{2}}\right]\right)
$$
&lt;p>with&lt;/p>
$$
\frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{i}^{2}}=2 w_{i} \tilde{f}\left(\hat{x}_{i}\right)
$$
&lt;h3 id="sorted-locations--equal-weights">Sorted Locations &amp;amp; Equal Weights&lt;/h3>
&lt;p>When the location vector $\underline{\hat{x}}$ is sorted, i.e., $\hat{x}_{1}&lt;\hat{x}_{2}&lt;\ldots&lt;\hat{x}_{L}$, we obtain&lt;/p>
$$
\mathrm{H}\left(\hat{x}_{i}-\hat{x}_{j}\right)=
\begin{cases}
0 &amp; i &lt; j \\
\frac{1}{2} &amp; i=j \\
1 &amp; i > j
\end{cases}
$$
&lt;p>Cumulative distribution can be simplified&lt;/p>
$$
F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=\frac{w_{i}}{2}+\sum_{j=1}^{i-1} w_{j}
$$
&lt;p>When samples are &lt;em>equally&lt;/em> weighted (i.e. $w_i = \frac{1}{L}$), we get&lt;/p>
$$
F(\hat{x}_{i}, \underline{\hat{x}}) = \frac{1}{2L} + \frac{i-1}{L} = \frac{2i - 1}{2L} \qquad i = 1, \dots, L
$$
&lt;p>Analytic solution (possible in some special cases): setting the gradient to zero yields&lt;/p>
$$
\tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=0 \Rightarrow \hat{x}_{i}=\tilde{F}^{-1}(\frac{2 i-1}{2 L}) \qquad i=1, \ldots, L
$$
&lt;ul>
&lt;li>
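&lt;p>E.g. standard Gaussian distribution with $\tilde{F}(x)=\frac{1}{2}\left(1+\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right)$:&lt;/p>
$$
\hat{x}_{i}=\tilde{F}^{-1}\left(\frac{2 i-1}{2 L}\right)=\sqrt{2} \operatorname{erfinv}\left(\frac{2 i-1}{L}-1\right)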
$$
&lt;/li>
&lt;/ul>
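&lt;p>For the standard normal case, the closed-form locations can be computed with the Python standard library. This is a minimal sketch; the function name is hypothetical, and &lt;code>statistics.NormalDist.inv_cdf&lt;/code> plays the role of $\tilde{F}^{-1}$.&lt;/p>

```python
from statistics import NormalDist


def dirac_locations_standard_normal(L):
    """Locations x_i = F_tilde^{-1}((2i - 1) / (2L)), i = 1..L, for equal weights 1/L."""
    target = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [target.inv_cdf((2 * i - 1) / (2 * L)) for i in range(1, L + 1)]
```

&lt;p>The resulting locations are sorted and symmetric around zero, e.g. the quartile points of the standard normal for $L = 2$.&lt;/p>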
&lt;h4 id="example-dma-of-standard-normal-distribution">Example: Dirac Mixture Approximation (DMA) of the Standard Normal Distribution&lt;/h4>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Kapture%202022-08-11%20at%2009.59.37.gif"
alt="With increasing number of Dirac functions, the CDF can be well approximated.">&lt;figcaption>
&lt;p>With increasing number of Dirac functions, the CDF can be well approximated.&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-11%2009.57.08.png" alt="截屏2022-08-11 09.57.08" style="zoom: 50%;" />
&lt;h3 id="general-optimization">General Optimization&lt;/h3>
&lt;p>In general: Minimum of $D(\underline{\hat{x}})$ is obtained iteratively using &lt;strong>Newton’s method&lt;/strong>&lt;/p>
$$
\Delta \underline{\hat{x}}=-\mathbf{H}^{-1}(\underline{\hat{x}}) \underline{G}(\underline{\hat{x}})
$$
&lt;p>with&lt;/p>
$$
\underline{G}(\underline{\hat{x}})=\left[\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{1}}, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{2}}, \ldots, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{L}}\right]^{\top}
$$
&lt;p>and&lt;/p>
$$
\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i}\left[\tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)\right]
$$
&lt;p>The Hessian $\mathbf{H}(\underline{\hat{x}})$ is given by&lt;/p>
$$
\mathbf{H}(\underline{\hat{x}})=2 \operatorname{diag}\left(\left[w_{1} \tilde{f}\left(\hat{x}_{1}\right), w_{2} \tilde{f}\left(\hat{x}_{2}\right), \ldots, w_{L} \tilde{f}\left(\hat{x}_{L}\right)\right]\right)
$$
&lt;p>The resulting Newton step:&lt;/p>
$$
\Delta \underline{\hat{x}}=-\left[\frac{\tilde{F}\left(\hat{x}_{1}\right)-F\left(\hat{x}_{1}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{1}\right)}, \frac{\tilde{F}\left(\hat{x}_{2}\right)-F\left(\hat{x}_{2}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{2}\right)}, \ldots, \frac{\tilde{F}\left(\hat{x}_{L}\right)-F\left(\hat{x}_{L}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{L}\right)}\right]^{\top}
$$
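&lt;p>The Newton iteration can be sketched as follows, assuming a standard normal target and equal weights $w_i = \frac{1}{L}$. The function names and the fixed iteration count are assumptions made for this example; convergence depends on a reasonable initialization, since the step divides by $\tilde{f}(\hat{x}_i)$.&lt;/p>

```python
from statistics import NormalDist


def heaviside(d):
    """H(d) with H(0) = 1/2."""
    return 1.0 if d > 0 else (0.5 if d == 0 else 0.0)


def newton_dirac_approximation(initial, n_iter=20):
    """Repeat x_i := x_i - (F_tilde(x_i) - F(x_i, x)) / f_tilde(x_i) for all i."""
    target = NormalDist()  # assumed target density: standard normal
    x = list(initial)
    w = 1.0 / len(x)       # equal weights
    for _ in range(n_iter):
        # CDF of the current Dirac mixture, evaluated at its own locations
        F = [sum(w * heaviside(x_i - x_j) for x_j in x) for x_i in x]
        x = [x_i - (target.cdf(x_i) - F_i) / target.pdf(x_i) for x_i, F_i in zip(x, F)]
    return x
```

&lt;p>Starting from, e.g., &lt;code>[-1.0, -0.5, 0.0, 0.5, 1.0]&lt;/code>, the iterates should approach the closed-form locations $\tilde{F}^{-1}\left(\frac{2i-1}{2L}\right)$.&lt;/p>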
&lt;h2 id="extension-to-multivariate-distributions">Extension to Multivariate Distributions&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Extension to multivariate case is not trivial&lt;/p>
&lt;ul>
&lt;li>Less nice properties of multivariate cumulative distributions 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Distinguish several classes of multivariate methods&lt;/p>
&lt;ul>
&lt;li>Methods that generalize concept of CDF&lt;/li>
&lt;li>Methods that perform reduction to univariate case&lt;/li>
&lt;li>Kernel-based methods&lt;/li>
&lt;li>Continuous flow between density approximations&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="challenge-of-multivariate-cumulative-distributions">Challenge of Multivariate Cumulative Distributions&lt;/h3>
&lt;p>Definition for 2D:&lt;/p>
$$
F(u, v)=\int_{-\infty}^{u} \int_{-\infty}^{v} f(x, y) \mathrm{d} y \mathrm{~d} x
$$
&lt;p>However, $F(u, v)$ is &lt;em>asymmetric&lt;/em> and the definition is not unique.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-11%2011.44.30.png" alt="截屏2022-08-11 11.44.30" style="zoom:50%;" />
&lt;p>Alternative definitions:&lt;/p>
$$
\begin{aligned}
&amp;F(u, v)=\int_{-\infty}^{u} \int_{v}^{\infty} f(x, y) \mathrm{d} y \mathrm{~d} x \\
&amp;F(u, v)=\int_{u}^{\infty} \int_{v}^{\infty} f(x, y) \mathrm{d} y \mathrm{~d} x \\
&amp;F(u, v)=\int_{u}^{\infty} \int_{-\infty}^{v} f(x, y) \mathrm{d} y \mathrm{~d} x
\end{aligned}
$$
&lt;ul>
&lt;li>Three of the CDFs are independent, the fourth is dependent.&lt;/li>
&lt;/ul>
&lt;p>For general $N$-dimensional random vectors: $2^N$ different variants, of which $2^N - 1$ are independent&lt;/p>
&lt;p>$\rightarrow$ exponentially complex!&lt;/p>
&lt;ul>
&lt;li>Use in statistical tests difficult&lt;/li>
&lt;li>Results differ depending on variant&lt;/li>
&lt;/ul>
&lt;p>Thus, we require generalization of concept of CDF. 💪&lt;/p>
&lt;h3 id="localized-cumulative-distributions-lcds">Localized Cumulative Distributions (LCDs)&lt;/h3>
&lt;p>&lt;strong>Univariate case (1D)&lt;/strong>&lt;/p>
&lt;p>💡 Key idea&lt;/p>
&lt;ul>
&lt;li>Compare local probability masses of $\tilde{f}(x)$ and $f(x)$&lt;/li>
&lt;li>Integrate over intervals at all positions $m$ and all widths $b$&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-11%2012.04.07.png" alt="截屏2022-08-11 12.04.07" style="zoom: 67%;" />
&lt;p>Compare $\tilde{A}(m, b)$ and $A(m,b), \forall m, b$&lt;/p>
&lt;p>Symmetric, unique, but redundant&amp;hellip;&lt;/p>
&lt;p>&lt;strong>Multivariate case (2D)&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-11%2012.13.09.png" alt="截屏2022-08-11 12.13.09" style="zoom: 67%;" />
&lt;p>Different kernels possible&lt;/p>
&lt;ul>
&lt;li>Rectangular kernels&lt;/li>
&lt;li>Gaussian kernels&lt;/li>
&lt;li>Anisotropic vs. isotropic kernels&lt;/li>
&lt;li>Separable vs. inseparable kernels&lt;/li>
&lt;/ul>
&lt;h4 id="cumulative-transformation-of-densities">&lt;strong>Cumulative Transformation of Densities&lt;/strong>&lt;/h4>
&lt;p>Given&lt;/p>
&lt;ul>
&lt;li>Random vector $\underline{x} \in \mathbb{R}^N$&lt;/li>
&lt;li>Probability density function $f(\underline{x}): \mathbb{R}^N \to \mathbb{R}_+$&lt;/li>
&lt;/ul>
&lt;p>&lt;mark>&lt;strong>Localized Cumulative Distribution (LCD)&lt;/strong>&lt;/mark>:&lt;/p>
$$
F(\underline{m}, b)=\int_{\mathbb{R}^{N}} f(\underline{x}) K(\underline{x}-\underline{m}, b) \mathrm{d} \underline{x}
$$
&lt;ul>
&lt;li>$K(\cdot, \cdot)$: Kernel&lt;/li>
&lt;li>$\underline{m}$: Kernel location&lt;/li>
&lt;li>$b$: Kernel width&lt;/li>
&lt;/ul>
&lt;p>Specific kernel employed:&lt;/p>
$$
K(\underline{x}-\underline{m}, b)=\prod_{k=1}^{N} \exp \left(-\frac{1}{2} \frac{\left(x_{k}-m_{k}\right)^{2}}{b^{2}}\right)
$$
&lt;ul>
&lt;li>Separable (i.e. in form of product)&lt;/li>
&lt;li>Isotropic (i.e. same in each direction)&lt;/li>
&lt;li>Gaussian&lt;/li>
&lt;/ul>
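&lt;p>For a Dirac mixture, the sifting property of the Dirac delta reduces the LCD integral to a weighted sum of kernel values, $F(\underline{m}, b)=\sum_{i} w_{i} K(\underline{\hat{x}}_{i}-\underline{m}, b)$. A minimal Python sketch with the Gaussian kernel above (function names are hypothetical):&lt;/p>

```python
from math import exp


def gaussian_kernel(d, b):
    """Separable isotropic Gaussian kernel K(d, b) = prod_k exp(-d_k^2 / (2 b^2))."""
    return exp(-0.5 * sum(d_k ** 2 for d_k in d) / b ** 2)


def lcd_dirac_mixture(m, b, locations, weights):
    """LCD F(m, b) = sum_i w_i * K(x_hat_i - m, b) of a Dirac mixture."""
    return sum(
        w * gaussian_kernel([x_k - m_k for x_k, m_k in zip(x, m)], b)
        for w, x in zip(weights, locations)
    )
```

&lt;p>A single Dirac component evaluated at its own location gives $F = 1$; moving the kernel location one kernel width away along one axis gives $\exp(-\frac{1}{2})$.&lt;/p>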
&lt;p>Properties of LCD:&lt;/p>
&lt;ul>
&lt;li>Symmetric&lt;/li>
&lt;li>Unique&lt;/li>
&lt;li>Multivariate&lt;/li>
&lt;/ul>
&lt;p>Examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>LCD&lt;/strong> &lt;strong>(Rectangular Kernel)&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-13%2010.44.51.png" alt="截屏2022-08-13 10.44.51" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>LCD (Gaussian Kernel)&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-13%2010.47.53.png" alt="截屏2022-08-13 10.47.53" style="zoom: 33%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="generalized-cramervon-mises-distance-gcvd">&lt;strong>Generalized Cramér–von Mises Distance (GCvD)&lt;/strong>&lt;/h4>
&lt;p>Given:&lt;/p>
&lt;ul>
&lt;li>LCD of given continuous density $\tilde{F}(\underline{m}, b)$&lt;/li>
&lt;li>LCD of Dirac mixture $F(\underline{m}, b)$&lt;/li>
&lt;/ul>
&lt;p>Definition:&lt;/p>
$$
D=\int_{\mathbb{R}_{+}} w(b) \int_{\mathbb{R}^{N}}(\tilde{F}(\underline{m}, b)-F(\underline{m}, b))^{2} \mathrm{~d} \underline{m} \mathrm{~d} b
$$
&lt;p>Minimization of GCvD:&lt;/p>
&lt;ul>
&lt;li>For many Dirac components → high-dimensional optimization problem&lt;/li>
&lt;li>Gradient available, Hessian more difficult&lt;/li>
&lt;li>Use Quasi-Newton method: L-BFGS&lt;/li>
&lt;/ul>
&lt;h3 id="projected-cumulative-distributions-pcd">Projected Cumulative Distributions (PCD)&lt;/h3>
&lt;h4 id="options-for-reduction-to-univariate-case">Options for Reduction to Univariate Case&lt;/h4>
&lt;p>Reapproximation methods for univariate case readily available. How can we use univariate methods in multivariate case?&lt;/p>
&lt;p>Solution&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Approximation on principal axis of PDF&lt;/p>
&lt;ul>
&lt;li>Limited to densities where principal axis can be defined&lt;/li>
&lt;li>Examples: Gaussian PDF, Bingham PDF on sphere&lt;/li>
&lt;li>Does not cover the entire density 🤪&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-13%2010.57.53.png" alt="截屏2022-08-13 10.57.53" style="zoom: 33%;" />
&lt;/li>
&lt;li>
&lt;p>Cartesian product of 1D approximation 👎&lt;/p>
&lt;ul>
&lt;li>Curse of dimensionality (as very similar to grid)&lt;/li>
&lt;li>Only for product densities (or rotations thereof)&lt;/li>
&lt;li>Inefficient coverage&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Representing PDFs by &lt;em>all&lt;/em> one-dimensional projections (a.k.a. the Radon transform)&lt;/strong> 👍&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Represent the two densities $\tilde{f}(\underline{x})$ and $f(\underline{x})$ by infinite set of one-dimensional projections&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Projections onto all unit vectors $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We obtain sets of projections $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ ($r$: coordinate along the unit vector $\underline{u}$)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>These are the Radon transforms of $\tilde{f}(\underline{x})$ and $f(\underline{x})$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>We compare the sets of projections $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ for every $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For comparison, we use the univariate cumulative distribution functions $\tilde{F}(r \mid \underline{u})$ and $F(r \mid \underline{u})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>These are unique, well defined, and easy to calculate 👏&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Resulting distance measures&lt;/p>
$$
D_{1}(\underline{u})=D(\tilde{f}(r \mid \underline{u}), f(r \mid \underline{u}))
$$
&lt;p>depend on the projection vector $\underline{u}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>We integrate these one-dimensional distance measures $D_1(\underline{u})$ over all unit vectors $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
&lt;ul>
&lt;li>This gives multivariate distance measure $D(\tilde{f}(\underline{x}), f(\underline{x}))$&lt;/li>
&lt;li>Typically a discretized subset of $\underline{u} \in \mathbb{S}^{N-1}$ is used&lt;/li>
&lt;li>Distance measure minimized via univariate Newton updates&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;h4 id="radon-transform">Radon Transform&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Represent general $N$-dimensional probability density functions via the set of all one-dimensional projections&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Linear projection of random vector $\underline{\boldsymbol{x}} \in \mathbb{R}^{N}$ to scalar random variable $\boldsymbol{r} \in \mathbb{R}$ onto line described by unit vector $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
$$
\boldsymbol{r} = \underline{u}^\top \underline{\boldsymbol{x}}
$$
&lt;/li>
&lt;li>
&lt;p>Given probability density function $f(\underline{x})$ of random vector $\underline{\boldsymbol{x}}$, density $f_r(r \mid \underline{u})$ of $\boldsymbol{r}$ is given by&lt;/p>
$$
f_{r}(r \mid \underline{u})=\int_{\mathbb{R}^{N}} f(\underline{t}) \delta\left(r-\underline{u}^{\top} \underline{t}\right) \mathrm{d} \underline{t}
$$
&lt;/li>
&lt;li>
&lt;p>$f_r(r \mid \underline{u})$, taken over all $\underline{u} \in \mathbb{S}^{N-1}$, is the Radon transform of $f(\underline{x})$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Visualization:&lt;/p>
&lt;blockquote>
&lt;p>$u$ is the unit vector.&lt;/p>
&lt;p>We project the density on $u$ and we get the projection (yellow area).&lt;/p>
&lt;p>Then if we cut through the projection, it gives us the red line.&lt;/p>
&lt;/blockquote>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-13%2012.11.44.png" alt="截屏2022-08-13 12.11.44" style="zoom: 33%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-13%2012.12.06.png" alt="截屏2022-08-13 12.12.06" style="zoom:33%;" />
&lt;p>&lt;strong>Dirac Mixture Densities&lt;/strong>&lt;/p>
$$
f(\underline{x} \mid \hat{\mathbf{X}})=\sum_{i=1}^{L} w_{i} \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right)
$$
&lt;p>with&lt;/p>
$$
\hat{\mathbf{X}}=\left[\underline{\hat{x}}_{1}, \underline{\hat{x}}_{2}, \ldots, \underline{\hat{x}}_{L}\right]
$$
&lt;p>Radon transform is given by&lt;/p>
$$
f_{r}(r \mid \underline{\hat{r}}, \underline{u})=\sum_{i=1}^{L} w_{i} \delta\left(\underline{u}^{\top} \underline{x} - \underline{u}^{\top} \underline{\hat{x}}_{i}\right)=\sum_{i=1}^{L} w_{i} \delta\left(r-\hat{r}_{i}(\underline{u})\right)
$$
&lt;ul>
&lt;li>$\hat{r}_{i}(\underline{u})=\underline{u}^{\top} \underline{\hat{x}}_{i}, i=1, \ldots, L$ are the projected Dirac locations&lt;/li>
&lt;/ul>
&lt;p>Collect projected Dirac locations $\hat{r}_{i}(\underline{u})$ in vector&lt;/p>
$$
\underline{\hat{r}}=\left[\hat{r}_{1}(\underline{u}), \hat{r}_{2}(\underline{u}), \ldots, \hat{r}_{L}(\underline{u})\right]^{\top}
$$
&lt;p>&lt;strong>Gaussian Densities&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For a Gaussian density $f(\underline{x})$ with mean vector $\underline{\hat{x}}$ and covariance matrix $\mathbf{C}_x$, the density $f_r(r \mid \underline{u})$ resulting from the projection is also Gaussian&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Its mean $\hat{r}(\underline{u})$ can simply be calculated by taking the expected value&lt;/p>
$$
\hat{r}(\underline{u})=\mathrm{E}\{\boldsymbol{r}(\underline{u})\}=\mathrm{E}\left\{\underline{u}^{\top} \underline{\boldsymbol{x}}\right\}=\underline{u}^{\top} \underline{\hat{x}}
$$
&lt;/li>
&lt;li>
&lt;p>Its variance $\sigma_r^2(\underline{u})$ is given by&lt;/p>
$$
\sigma_{r}^{2}(\underline{u})=\mathrm{E}\left\{(\boldsymbol{r}(\underline{u})-\hat{r}(\underline{u}))^{2}\right\}=\mathrm{E}\left\{\left(\underline{u}^{\top} \underline{\boldsymbol{x}}-\underline{u}^{\top} \underline{\hat{x}}\right)^{2}\right\}=\mathrm{E}\left\{\underline{u}^{\top}(\underline{\boldsymbol{x}}-\underline{\hat{x}})(\underline{\boldsymbol{x}}-\underline{\hat{x}})^{\top} \underline{u}\right\}=\underline{u}^{\top} \mathbf{C}_{x} \underline{u}
$$
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Gaussian Mixture Densities&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For $N$-dimensional Gaussian mixture densities $f(\underline{x})$ of the form&lt;/p>
$$
f(\underline{x})=\sum_{i=1}^{M} w_{i} \frac{1}{\sqrt{(2 \pi)^{N}\left|\mathbf{C}_{x, i}\right|}} \exp \left(-\frac{1}{2}\left(\underline{x}-\underline{\hat{x}}_{i}\right)^{\top} \mathbf{C}_{x, i}^{-1}\left(\underline{x}-\underline{\hat{x}}_{i}\right)\right)
$$
&lt;p>the density $f_r(r \mid \underline{u})$ is also a Gaussian mixture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Due to the linearity of the projection operator, it is given by&lt;/p>
$$
f_{r}(r \mid \underline{u})=\sum_{i=1}^{M} w_{i} \frac{1}{\sqrt{2 \pi} \sigma_{r, i}(\underline{u})} \exp \left(-\frac{1}{2} \frac{\left(r-\hat{r}_{i}(\underline{u})\right)^{2}}{\sigma_{r, i}^{2}(\underline{u})}\right)
$$
&lt;p>with&lt;/p>
$$
\hat{r}_{i}(\underline{u})=\underline{u}^{\top} \underline{\hat{x}}_{i}
$$
&lt;p>and&lt;/p>
$$
\sigma_{r, i}(\underline{u})=\sqrt{\underline{u}^{\top} \mathbf{C}_{x, i} \underline{u}}
$$
&lt;/li>
&lt;/ul>
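&lt;p>The projection of a Gaussian mixture onto a unit vector only requires $\underline{u}^{\top} \underline{\hat{x}}_{i}$ and $\underline{u}^{\top} \mathbf{C}_{x, i} \underline{u}$ per component; the weights are unchanged. A minimal Python sketch using plain lists (the function name and the data layout are assumptions for this example):&lt;/p>

```python
from math import sqrt


def project_gaussian_mixture(u, means, covariances):
    """Return [(r_hat_i, sigma_r_i)] of the 1D mixture obtained by projecting onto unit vector u."""
    projected = []
    for mean, C in zip(means, covariances):
        r_hat = sum(u_k * m_k for u_k, m_k in zip(u, mean))               # u^T x_hat_i
        Cu = [sum(c_kl * u_l for c_kl, u_l in zip(row, u)) for row in C]  # C_i u
        var = sum(u_k * cu_k for u_k, cu_k in zip(u, Cu))                 # u^T C_i u
        projected.append((r_hat, sqrt(var)))
    return projected
```

&lt;p>For a single component with mean $[1, 2]^{\top}$ and covariance $\operatorname{diag}(4, 1)$, projecting onto the first coordinate axis gives mean 1 and standard deviation 2.&lt;/p>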
&lt;h3 id="multivariate-cramer-von-mises-distance">Multivariate Cramér-von Mises Distance&lt;/h3>
&lt;p>Multivariate distance measure between two continuous and/or discrete probability density functions&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>One-dimensional Projections via Radon Transform&lt;/strong>&lt;/p>
&lt;p>The given density $\tilde{f}(\underline{x})$ and its approximation $f(\underline{x})$ are represented by their Radon transforms $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ (i.e. by their 1D projections onto unit vectors $\underline{u} \in \mathbb{S}^{N-1}$)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>One-dimensional Cumulative Distributions&lt;/strong>&lt;/p>
&lt;p>Based on Radon transform $\tilde{f}(r \mid \underline{u})$, calculate one-dimensional cumulative distributions of the projected densities as&lt;/p>
$$
\tilde{F}(r \mid \underline{u})=\int_{-\infty}^{r} \tilde{f}(t \mid \underline{u}) \mathrm{d} t
$$
&lt;p>and similarly for $F(r \mid \underline{u})$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Example: For Dirac mixture approximation, cumulative distribution function of its Radon transform is given by&lt;/p>
$$
F(r \mid \underline{\hat{r}}, \underline{u})=\sum_{i=1}^{L} w_{i} \mathrm{H}\left(r-\hat{r}_{i}(\underline{u})\right)
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>One-dimensional Distance&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For comparing the one-dimensional projections, we compare their cumulative distributions $\tilde{F}(r \mid \underline{u})$ and $F(r \mid \underline{\hat{r}}, \underline{u})$ for all $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>As distance measure use integral squared distance&lt;/p>
$$
D_{1}(\underline{\hat{r}}, \underline{u})=\int_{\mathbb{R}}[\tilde{F}(r \mid \underline{u})-F(r \mid \underline{\hat{r}}, \underline{u})]^{2} \mathrm{~d} r
$$
&lt;/li>
&lt;li>
&lt;p>Gives distance between the projected densities in the direction of the unit vector $\underline{u}$ for all $\underline{u}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>One-dimensional Newton Step&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Newton step can now be written as&lt;/li>
&lt;/ul>
$$
\Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})=-\mathbf{H}^{-1}(\underline{\hat{r}}, \underline{u}) \underline{G}(\underline{\hat{r}}, \underline{u})
$$
&lt;p>with&lt;/p>
$$
\begin{aligned}
\underline{G}(\underline{\hat{r}}, \underline{u})&amp;=\left[\frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{1}}, \frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{2}}, \ldots, \frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{L}}\right]^{\top} \\
\frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{i}}&amp;=2 w_{i}\left[\tilde{F}\left(\hat{r}_{i} \mid \underline{u}\right)-F\left(\hat{r}_{i} \mid \underline{\hat{r}}, \underline{u}\right)\right]
\end{aligned}
$$
&lt;ul>
&lt;li>
&lt;p>Hessian $\mathbf{H}(\underline{\hat{r}}, \underline{u})$ is given by&lt;/p>
$$
\mathbf{H}(\underline{\hat{r}}, \underline{u})=2 \operatorname{diag}\left(\left[w_{1} \tilde{f}\left(\hat{r}_{1} \mid \underline{u}\right), w_{2} \tilde{f}\left(\hat{r}_{2} \mid \underline{u}\right), \ldots, w_{L} \tilde{f}\left(\hat{r}_{L} \mid \underline{u}\right)\right]\right)
$$
&lt;/li>
&lt;li>
&lt;p>$\Rightarrow$ Resulting Newton step&lt;/p>
$$
\Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})=-\left[\frac{\tilde{F}\left(\hat{r}_{1} \mid \underline{u}\right)-F\left(\hat{r}_{1} \mid \underline{\hat{r}}, \underline{u}\right)}{\tilde{f}\left(\hat{r}_{1} \mid \underline{u}\right)}, \ldots, \frac{\tilde{F}\left(\hat{r}_{L} \mid \underline{u}\right)-F\left(\hat{r}_{L} \mid \underline{\hat{r}}, \underline{u}\right)}{\tilde{f}\left(\hat{r}_{L} \mid \underline{u}\right)}\right]^{\top}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Backprojection to $N$-dimensional space&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For specific projection vector $\underline{u}$: Newton update $\Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Backprojection into original $N$-dimensional space: Update can be used to modify original Dirac locations in direction along the vector $\underline{u}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For every location vector $\underline{\hat{x}}_i$ we obtain&lt;/p>
$$
\Delta \underline{\hat{x}}_{i}(\underline{u})=\Delta \hat{r}_{i}(\underline{\hat{r}}, \underline{u}) \cdot \underline{u}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Assemble Multivariate Distance&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Individual 1D distances $D_1(\underline{\hat{r}}, \underline{u})$ can be assembled to form a multivariate distance measure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Performed by integrating over all 1D distances depending on unit vector $\underline{u}$&lt;/p>
$$
D_{N}(\hat{\mathbf{X}})=\int_{\mathbb{S}^{N-1}} D_{1}(\underline{\hat{r}}, \underline{u}) \mathrm{d} \underline{u}
$$
&lt;/li>
&lt;li>
&lt;p>Plugging in $D_1(\underline{\hat{r}}, \underline{u})$:&lt;/p>
$$
D_{N}(\hat{\mathbf{X}})=\int_{\mathbb{S}^{N-1}} \int_{\mathbb{R}}[\tilde{F}(r \mid \underline{u})-F(r \mid \underline{\hat{r}}, \underline{u})]^{2} \mathrm{~d} r \mathrm{~d} \underline{u} \quad \text { with } r=\underline{u}^{\top} \cdot \underline{x}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Perform Full Newton Update&lt;/strong>&lt;/p>
&lt;p>Full Newton update by integrating over all partial updates along projection vectors $\underline{u}$&lt;/p>
$$
\Delta \underline{\hat{x}}_{i}=\int_{\mathbb{S}^{N-1}} \Delta \underline{\hat{x}}_{i}(\underline{u}) \mathrm{d} \underline{u}
$$
&lt;/li>
&lt;/ol>
&lt;h4 id="discretization-of-space-of-unit-vectors">&lt;strong>Discretization of Space of Unit Vectors&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>In practice, the space $\mathbb{S}^{N-1}$ containing the unit vectors $\underline{u}$ has to be discretized&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Two options are available for performing the discretization:&lt;/p>
&lt;ul>
&lt;li>Deterministic discretization, e.g., by calculating a grid&lt;/li>
&lt;li>Random discretization by drawing uniform samples from the hypersphere&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>For both cases: Consider $K$ samples $\underline{u}_k$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Integration reduces to summation&lt;/p>
$$
\Delta \underline{\hat{x}}_{i} \approx \frac{1}{K} \sum_{k=1}^{K} \Delta \underline{\hat{x}}_{i}\left(\underline{u}_{k}\right) \quad \text { for } i=1,2, \ldots, L
$$
&lt;/li>
&lt;li>
&lt;p>Stopping criterion&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Given: initial locations of the Dirac components&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Full Newton updates are performed until the maximum change over all location vectors&lt;/p>
$$
\max _{i}\left|\Delta \underline{\hat{x}}_{i}\right|
$$
&lt;p>falls below a given threshold&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
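&lt;p>One discretized update (projection, 1D Newton step, backprojection, averaging over $K$ directions) can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm shown in the figure: the target is a standard normal in every dimension (so each projected target is again a standard normal), the weights are equal, and the function names are hypothetical.&lt;/p>

```python
from math import cos, pi, sin
from statistics import NormalDist


def heaviside(d):
    """H(d) with H(0) = 1/2."""
    return 1.0 if d > 0 else (0.5 if d == 0 else 0.0)


def projected_newton_update(points, directions):
    """Average the backprojected 1D Newton steps over the given unit vectors."""
    target = NormalDist()  # projection of the assumed standard normal target
    w = 1.0 / len(points)  # equal weights
    K = len(directions)
    update = [[0.0] * len(points[0]) for _ in points]
    for u in directions:
        # projected Dirac locations r_i = u^T x_i
        r = [sum(u_k * x_k for u_k, x_k in zip(u, x)) for x in points]
        for i, r_i in enumerate(r):
            F_i = sum(w * heaviside(r_i - r_j) for r_j in r)     # Dirac mixture CDF at r_i
            dr_i = -(target.cdf(r_i) - F_i) / target.pdf(r_i)    # 1D Newton step
            for k, u_k in enumerate(u):
                update[i][k] += dr_i * u_k / K                   # backprojection and averaging
    return update


def circle_grid(K):
    """Deterministic discretization of the unit circle (N = 2) with K directions."""
    return [(cos(2 * pi * k / K), sin(2 * pi * k / K)) for k in range(K)]
```

&lt;p>A full iteration would add the returned update to the locations and repeat until $\max_i |\Delta \underline{\hat{x}}_i|$ falls below the chosen threshold. For the randomized variant, &lt;code>circle_grid&lt;/code> would be replaced by uniformly drawn unit vectors.&lt;/p>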
&lt;p>&lt;strong>Complete Algorithm (Randomized Variant)&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-14%2011.44.09.png" alt="截屏2022-08-14 11.44.09" style="zoom: 50%;" /></description></item><item><title>Partikel Filter</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/partikel_filter/</link><pubDate>Sun, 14 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/partikel_filter/</guid><description>&lt;h2 id="naives-partikelfilter">Naives Partikelfilter&lt;/h2>
&lt;h3 id="pradiktions--und-filterschritt">Prediction and Filter Step&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise sheet, problem 13.2&lt;/span>
&lt;/div>
&lt;h4 id="pradiktionsschritt">Prediction Step&lt;/h4>
&lt;p>At the start time (e.g. $k=0$): initial samples are given&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=\sum_{i=1}^{L} w_{k}^{e, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{e, i}\right) \qquad w_{k}^{e, i}=\frac{1}{L}, i \in\left\{1, \ldots, L\right\}
$$
&lt;p>Prediction using the probabilistic system model $f(\underline{x}_{k+1} \mid \underline{x}_k)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Draw samples for time step $k+1$&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p, i} \sim f\left(\underline{x}_{k+1} \mid \underline{\hat{x}}_{k}^{e, i}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Weights remain unchanged&lt;/p>
$$
w_{k+1}^{p, i} = w_{k}^{e, i}
$$
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$&lt;/p>
$$
f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=\sum_{i=1}^{L} w_{k+1}^{p, i} \delta\left(\underline{x}_{k+1}-\underline{\hat{x}}_{k+1}^{p, i}\right)
$$
&lt;p>For a system model given in explicit (generative) form&lt;/p>
$$
\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k, \underline{w}_k)
$$
&lt;ul>
&lt;li>Draw $\underline{w}_k^i \sim f_k^w(\cdot)$
&lt;/li>
&lt;li>$\underline{\hat{x}}_{k+1}^{p, i}=\underline{a}_{k}\left(\underline{\hat{x}}_{k}^{e, i}, \underline{w}_{k}^{i}\right), i \in\left\{1, \ldots,L\right\}$
&lt;/li>
&lt;/ul>
&lt;h4 id="filterschritt">Filter Step&lt;/h4>
&lt;p>Filtering using the probabilistic measurement model $f(\underline{y}_k \mid \underline{x}_k)$, if a measurement is available.&lt;/p>
&lt;p>Measurement update&lt;/p>
$$
\begin{aligned}
f_{k}^{e}\left(\underline{x}_{k}\right) &amp;\propto f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f_{k}^{p}\left(\underline{x}_{k}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot \sum_{i=1}^{L} w_{k}^{p, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{p, i}\right)\\
&amp;=\sum_{i=1}^{L} \underbrace{w_{k}^{p, i} \cdot f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{p, i}\right)}_{\propto w_{k}^{e, i}} \cdot \delta(\underline{x}_{k}-\underbrace{\underline{\hat{x}}_{k}^{p, i}}_{\underline{\hat{x}}_{k}^{e, i}})
\end{aligned}
$$
&lt;ul>
&lt;li>
&lt;p>The positions stay the same&lt;/p>
$$
\underline{\hat{x}}_{k}^{e, i} = \underline{\hat{x}}_{k}^{p, i}
$$
&lt;/li>
&lt;li>
&lt;p>The weights are adapted&lt;/p>
$$
w_{k}^{e, i} \propto w_{k}^{p, i} \cdot f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{p, i}\right)
$$
&lt;ul>
&lt;li>
&lt;p>Normalization is required&lt;/p>
$$
w_{k}^{e, i}:=\frac{w_{k}^{e, i}}{\displaystyle \sum_{j=1}^{L} w_{k}^{e,j}}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
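&lt;p>As a small sketch of the filter step: the weights are multiplied by the likelihood and then normalized, while the positions are passed through untouched. The Gaussian measurement model $y = x + v$ and the names &lt;code>gaussian_likelihood&lt;/code> and &lt;code>filter_step&lt;/code> are assumptions chosen for illustration.&lt;/p>

```python
import math

def gaussian_likelihood(y, x, sigma_v):
    """f(y | x) for the assumed measurement model y = x + v with v ~ N(0, sigma_v^2)."""
    return math.exp(-0.5 * ((y - x) / sigma_v) ** 2) / (math.sqrt(2.0 * math.pi) * sigma_v)

def filter_step(particles, weights, y, likelihood):
    """Reweight the particles with the likelihood; positions are returned unchanged."""
    unnormalized = [w * likelihood(y, x) for x, w in zip(particles, weights)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("all particles have zero likelihood (filter degenerated)")
    return list(particles), [w / total for w in unnormalized]

particles = [-1.0, 0.0, 1.0]
weights = [1.0 / 3.0] * 3
post_x, post_w = filter_step(
    particles, weights, y=0.8,
    likelihood=lambda y, x: gaussian_likelihood(y, x, 0.5),
)
```

&lt;p>With the measurement $y = 0.8$, the particle at $1.0$ receives the largest weight and the particle at $-1.0$ the smallest.&lt;/p>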
&lt;h3 id="ablauf">Ablauf&lt;/h3>
&lt;p>Gewichte sind repräsentiert mit Kreise.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-14%2017.37.55.png" alt="截屏2022-08-14 17.37.55" style="zoom: 33%;" />
&lt;h3 id="vor--und-nachteile">Vor- und Nachteile&lt;/h3>
&lt;p>👍 &lt;span style="color: ForestGreen">Vorteile&lt;/span>&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color: ForestGreen">Problemlose Behandlung vom nichtlinearen System- und Messmodellen&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: ForestGreen">Einstellbare Genauigkeit und Rechenaufwand nach Anzahl der Partikel balancieren&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: ForestGreen">Extreme einfache Implementierung&lt;/span>&lt;/li>
&lt;/ul>
&lt;p>👎 &lt;span style="color: Red">Nachteile&lt;/span>&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color: Red">Varianz der Samples erhöht sich mit Filterschritten&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Partikel sterben aus $\rightarrow$ Degenerierung des Filters&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Aussterben schneller, je genauer die Messung, da Likelihood schmaler ($\rightarrow$ Paradox!)&lt;/span>&lt;/li>
&lt;/ul>
&lt;h2 id="verbesserungen">Verbesserungen&lt;/h2>
&lt;h3 id="resampling">Resampling&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Measure to reduce the variance of the samples&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approximation of the weighted samples by unweighted ones&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=\sum_{i=1}^{L} w_{k}^{e, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{e, i}\right) \approx \sum_{i=1}^{L} \frac{1}{L} \delta\left(\underline{x}_{k}-\underline{\hat{\hat{x}}}_{k}^{e, i}\right)
$$
&lt;p>$\underline{\hat{\hat{x}}}_{k}^{e, i}$
: in general not at the same locations as $\underline{\hat{x}}_{k}^{e, i}$ (some locations are dropped, others appear several times)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Very simple procedure&lt;/p>
&lt;ul>
&lt;li>Discard samples with small weights&lt;/li>
&lt;li>Duplicate samples with large weights, proportionally to $w_i$ (importance resampling)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>The positions of the samples are not changed&lt;/p>
&lt;ul>
&lt;li>The positions only change in the subsequent prediction step&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="partikelfilter-mit-resampling">Partikelfilter mit Resampling&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-14%2018.04.26.png" alt="截屏2022-08-14 18.04.26" style="zoom:50%;" />
&lt;ul>
&lt;li>Start with equally weighted samples&lt;/li>
&lt;li>At $k=1$
&lt;ul>
&lt;li>Propagate through the prediction step. The locations change while the weights stay the same.&lt;/li>
&lt;li>In the filter step the weights change. The locations stay the same.&lt;/li>
&lt;li>In the resampling step, samples with large weights are replicated and samples with very small weights are dropped.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="verschiedene-techniken-fur-resampling">Verschiedene Techniken für Resampling&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">This
gives a much clearer explanation. 👍&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>Given: $L$ particles with weights $w_i$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Wanted: $L$ particles with weights $\frac{1}{L}$ (equally weighted)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Note&lt;/p>
&lt;ul>
&lt;li>Particles are only duplicated here&lt;/li>
&lt;li>The positions of the particles remain &lt;em>unchanged&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Can be viewed as sampling from a categorical distribution&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2009.43.53.png" alt="截屏2022-08-15 09.43.53" style="zoom:50%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="rouletterad">Rouletterad&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Resampling &lt;strong>proportional&lt;/strong> to the weights $w_i$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Consider the cumulative distribution $F(i)$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2009.46.51.png" alt="截屏2022-08-15 09.46.51" style="zoom:50%;" />
&lt;/li>
&lt;li>
&lt;p>Draw a uniform random number $u \in [0, 1]$ $L$ times and each time select the largest $i$ with $F(i) \leq u$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Corresponds to selection with a roulette wheel (here e.g. $L=5$)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2009.49.11.png" alt="截屏2022-08-15 09.49.11" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Problem&lt;/p>
&lt;ul>
&lt;li>Very small weights are not drawn in sufficient proportion.&lt;/li>
&lt;li>Very large weights are favored.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Given: Set $S$ of weighted samples&lt;/p>
&lt;p>Wanted: Random sample, where the probability of drawing $x_i$ is given by $w_i$&lt;/p>
&lt;p>Typically done $n$ times with replacement to generate new sample set $S^\prime$&lt;/p>
&lt;p>We have a roulette ring, where the arc length is proportional to the weight.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-31%2022.57.24.png" alt="截屏2022-08-31 22.57.24" style="zoom: 33%;" />
&lt;p>We can think of the sampling as follows:&lt;/p>
&lt;p>We randomly pick a direction. If we hit $w_3$, we take sample no. 3.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-31%2023.00.05.png" alt="截屏2022-08-31 23.00.05" style="zoom: 33%;" />
&lt;p>We repeat this $n$ times.&lt;/p>
&lt;/blockquote>
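&lt;p>A minimal sketch of roulette-wheel resampling follows. It assumes the convention that $F(i)$ is the total weight in front of particle $i$ (so $F(1) = 0$), which makes "largest $i$ with $F(i) \leq u$" pick particle $i$ with probability $w_i$. The function name &lt;code>roulette_resample&lt;/code> and the example weights are illustrative assumptions.&lt;/p>

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_resample(particles, weights, rng):
    """Draw L particles with replacement, each with probability proportional to its weight."""
    total = sum(weights)
    # Exclusive cumulative sums: F[i] is the normalized weight in front of particle i, F[0] = 0.
    F = [0.0] + list(accumulate(w / total for w in weights))[:-1]
    L = len(particles)
    resampled = []
    for _ in range(L):
        u = rng.random()               # one independent spin of the wheel, uniform in [0, 1)
        i = bisect_right(F, u) - 1     # largest index i whose F[i] does not exceed u
        resampled.append(particles[i])
    return resampled, [1.0 / L] * L

rng = random.Random(0)
new_particles, new_weights = roulette_resample(
    [0.0, 1.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.4, 0.2, 0.1], rng
)
```

&lt;p>Because every spin is independent, the number of copies of each particle fluctuates from run to run, which is the noise that the next method reduces.&lt;/p>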
&lt;h4 id="stochastic-universal-sampling">Stochastic Universal Sampling&lt;/h4>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb A13.1 (a)&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>So far&lt;/p>
&lt;ul>
&lt;li>strong noise $\rightarrow$ the selection varies strongly&lt;/li>
&lt;li>large weights are favored&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Therefore: deterministic selection&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Randomization by drawing $\epsilon \in \left[0, \frac{1}{L}\right)$ once&lt;/p>
$$
u_i = \frac{i}{L} - \epsilon \qquad i \in \{1, \dots, L\}
$$
&lt;p>For $\epsilon = \frac{1}{2L}$:&lt;/p>
$$
u_i = \frac{2i - 1}{2L}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example: $L=5$&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2010.01.04.png" alt="截屏2022-08-15 10.01.04" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2010.01.22.png" alt="截屏2022-08-15 10.01.22" style="zoom: 67%;" />
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>We model the roulette wheel like a wagon wheel:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-31%2023.01.47.png" alt="截屏2022-08-31 23.01.47" style="zoom:33%;" />
&lt;p>We use a set of $n$ spokes that are spaced $\frac{1}{n}$ of a full rotation apart.&lt;/p>
&lt;p>We put the wheel down at a random angle and read off which $w$ each spoke hits.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-31%2023.05.06.png" alt="截屏2022-08-31 23.05.06" style="zoom:33%;" />
&lt;p>Compared to roulette wheel:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-31%2023.06.19.png" alt="截屏2022-08-31 23.06.19" style="zoom: 25%;" />
&lt;/blockquote>
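&lt;p>Stochastic universal sampling can be sketched as follows: a single random offset $\epsilon$ fixes all $L$ equally spaced pointers $u_i = \frac{i}{L} - \epsilon$, and each pointer selects the first particle whose cumulative weight reaches it. The function name, the optional &lt;code>epsilon&lt;/code> argument and the handling of rounding at the last particle are assumptions of this sketch.&lt;/p>

```python
import random

def stochastic_universal_resample(particles, weights, rng, epsilon=None):
    """Resample with L equally spaced pointers u_j = j/L - epsilon (one random offset)."""
    L = len(particles)
    total = sum(weights)
    if epsilon is None:
        epsilon = rng.random() / L          # single draw, uniform in [0, 1/L)
    resampled = []
    i = 0
    cumulative = weights[0] / total         # cumulative weight up to and including particle i
    for j in range(1, L + 1):
        u = j / L - epsilon
        # Advance until the cumulative weight reaches the pointer (stop at the last particle).
        while u > cumulative and L - 1 > i:
            i += 1
            cumulative += weights[i] / total
        resampled.append(particles[i])
    return resampled, [1.0 / L] * L

# Deterministic variant from the text: epsilon = 1 / (2L)
picked, equal_w = stochastic_universal_resample(
    ["a", "b", "c", "d"], [0.5, 0.25, 0.1875, 0.0625], random.Random(0), epsilon=0.125
)
```

&lt;p>Since the pointers are $\frac{1}{L}$ apart, a particle with weight $w_i$ is copied either $\lfloor L w_i \rfloor$ or $\lceil L w_i \rceil$ times, so the selection varies far less than with $L$ independent spins.&lt;/p>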
&lt;h2 id="importance-sampling">Importance Sampling&lt;/h2>
&lt;p>&lt;strong>🎯 Goal: computing the integral&lt;/strong>&lt;/p>
$$
E = \int_{\mathbb{R}^N} g(\underline{x}) f(\underline{x}) d\underline{x}
$$
&lt;ul>
&lt;li>
&lt;p>$E$: expected value&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$g(\underline{x})$
: nonlinear function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$f(\underline{x})$
: probability density&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>If samples $\underline{\hat{x}}_i$ of $f(\underline{x})$ are available:&lt;/p>
$$
E \approx \int_{\mathbb{R}^{N}} g(\underline{x}) \sum_{i=1}^{L} w_{i} \cdot \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right) d \underline{x}=\sum_{i=1}^{L} w_{i} \, g(\underline{\hat{x}}_i)
$$
&lt;p>But: sampling from $f(\underline{x})$ is often &lt;u>not&lt;/u> possible 🤪&lt;/p>
&lt;p>Remedy: &lt;strong>Proposal distribution&lt;/strong> (a.k.a. instrumental distribution, importance distribution) $p(\underline{x})$ with&lt;/p>
$$
\operatorname{supp}\{f(\cdot)\} \subset \operatorname{supp}\{p(\cdot)\}
$$
&lt;p>($\operatorname{supp}$ denotes the support)&lt;/p>
&lt;p>i.e. $p(\underline{x}) > 0$ wherever $f(\underline{x}) > 0$.&lt;/p>
&lt;p>We choose $p(\underline{x})$ such that sampling from it is easy (e.g. a Gaussian density).&lt;/p>
&lt;p>Substituting:&lt;/p>
$$
E=\int_{\mathbb{R}^{N}} g(\underline{x}) \cdot \frac{p(\underline{x})}{p(\underline{x})} \cdot f(\underline{x}) d \underline{x}=\int_{\mathbb{R}^{N}} g(\underline{x}) \cdot \frac{f(\underline{x})}{p(\underline{x})} \cdot p(\underline{x}) d \underline{x}
$$
&lt;p>Now we do &lt;em>not&lt;/em> expand $f(\underline{x})$
into a Dirac mixture but $p(\underline{x})$
, because we can sample from it.&lt;/p>
$$
p(\underline{x}) \approx \sum_{i=1}^{L} w_{i} \cdot \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right)
$$
$$
\begin{aligned}
\Rightarrow E &amp;\approx \int_{\mathbb{R}^{N}} g(\underline{x}) \cdot \frac{f(\underline{x})}{p(\underline{x})} \cdot \sum_{i=1}^{L} w_{i} \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right) d \underline{x} \\\\
&amp;= \sum_{i=1}^{L} g(\underline{\hat{x}}_{i}) \cdot \underbrace{\frac{f(\underline{\hat{x}}_i)}{p(\underline{\hat{x}}_{i})} \cdot w_i}_{=: w_{i}^\prime} \\\\
&amp;= \sum_{i=1}^{L} w_{i}^\prime \cdot g(\underline{\hat{x}}_{i})
\end{aligned}
$$
&lt;p>The estimate converges to $E$ for $L \to \infty$.&lt;/p>
&lt;p>I.e., we split the expression such that we draw the samples $\underline{\hat{x}}_i$
from the proposal $p(\underline{x})$
and adjust their original weights $w_i$
by the &amp;ldquo;importance&amp;rdquo; $\frac{f(\underline{\hat{x}}_i)}{p(\underline{\hat{x}}_{i})}$
.&lt;/p>
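&lt;p>The estimator above can be tried out on a toy problem. The sketch below assumes $f = \mathcal{N}(0, 1)$, $g(x) = x^2$ (so the exact value is $E = 1$) and a wider Gaussian proposal $p = \mathcal{N}(0, 2^2)$; these choices and all function names are illustrative only.&lt;/p>

```python
import math
import random

def normal_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def importance_sampling_estimate(g, f, p, sample_p, L, rng):
    """Estimate E = integral of g(x) f(x) dx using L samples drawn from the proposal p."""
    estimate = 0.0
    for _ in range(L):
        x = sample_p(rng)                  # sample from the proposal, not from f
        w_prime = (f(x) / p(x)) / L        # w_i' = f(x_i) / p(x_i) * w_i with w_i = 1/L
        estimate += w_prime * g(x)
    return estimate

rng = random.Random(42)
estimate = importance_sampling_estimate(
    g=lambda x: x * x,
    f=lambda x: normal_pdf(x, 0.0, 1.0),
    p=lambda x: normal_pdf(x, 0.0, 2.0),
    sample_p=lambda r: r.gauss(0.0, 2.0),
    L=20000,
    rng=rng,
)
```

&lt;p>The proposal is deliberately wider than $f$, so the support condition holds and the importance ratios $f/p$ stay bounded.&lt;/p>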
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Check
for a clear explanation and visualization.&lt;/span>
&lt;/div>
&lt;h3 id="sequential-importance-sampling">Sequential Importance Sampling&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Übungsblatt Aufg. 13.3&lt;/span>
&lt;/div>
&lt;p>&lt;strong>Pre-positioning of samples&lt;/strong>&lt;/p>
&lt;p>&lt;strong>🎯 Goal: systematic and correct positioning of the samples at locations of high likelihood &lt;u>before&lt;/u> the filter step&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Use a &lt;strong>proposal&lt;/strong> instead of the system model $f(\underline{x}_{k+1} \mid \underline{x}_k)$
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>💡 Idea: importance sampling for $f(\underline{x}_k, \underline{x}_{k-1} \mid \underline{y}_{1:k})$
&lt;/strong> (the measurement is taken into account as well)&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=f\left(\underline{x}_{k} \mid \underline{y}_{1: k}\right)=\int_{\mathbb{R}^{N}} \cdots \int_{\mathbb{R}^{N}} f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right) d \underline{x}_{1: k-1}
$$
&lt;p>Proposal: $p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)$
also depends on $\underline{y}_k$
.&lt;/p>
&lt;p>Thus:&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right) = \int_{\mathbb{R}^{N}} \cdots \int_{\mathbb{R}^{N}} \underbrace{\frac{f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}}_{=: w_k^{e, i}} p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right) d \underline{x}_{1: k-1}
$$
&lt;p>Assumption: the proposal factorizes&lt;/p>
$$
p\left(\underline{x}_{1: k} \mid \underline{y}_{1: k}\right)=p\left(\underline{x}_{k} \mid \underline{x}_{1: k - 1}, \underline{y}_{1: k}\right) \cdot p\left(\underline{x}_{1: k -1} \mid \underline{y}_{1: k - 1}\right)
$$
&lt;p>For a given sample $\underline{\hat{x}}_{k-1}^{e, i}$
from the previous time step, draw&lt;/p>
$$
\underline{\hat{x}}_{k}^{e, i} \sim p\left(\underline{x}_{k} \mid \hat{\underline{x}}_{k-1}^{e, i}, \underline{y}_{k}\right)
$$
&lt;p>Now rewrite $\frac{f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}$
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Numerator&lt;/p>
$$
\begin{aligned}
f\left(\underline{x}_{1: k} \mid \underline{y}_{1: k}\right) &amp;\propto f\left(\underline{y}_{k} \mid \underline{x}_{1: k}, \underline{y}_{1: k - 1}\right) \cdot f\left(\underline{x}_{1: k} \mid \underline{y}_{1: k-1}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{1:k-1}, \underline{y}_{1:k-1}\right) \cdot f\left(\underline{x}_{1:k-1} \mid \underline{y}_{1: k-1}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right) \cdot f\left(\underline{x}_{1: k-1} \mid \underline{y}_{1: k-1}\right)
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Denominator&lt;/p>
$$
p\left(\underline{x}_{1: k} \mid \underline{y}_{1: k}\right)=p\left(\underline{x}_{k} \mid \underline{x}_{1: k - 1}, \underline{y}_{1: k}\right) \cdot p\left(\underline{x}_{1: k -1} \mid \underline{y}_{1: k - 1}\right)
$$
&lt;/li>
&lt;/ul>
&lt;p>$\Rightarrow$ Weight for sample $i$:&lt;/p>
$$
w_k^{e, i} = \frac{f\left(\underline{\hat{x}}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{\hat{x}}_{1: k} \mid \underline{y}_{1 : k}\right)} \propto \frac{f\left(\underline{y}_{k} \mid \underline{x}_{k}^i\right) \cdot f\left(\underline{x}_{k}^i\mid \underline{x}_{k-1}^i\right)}{p\left(\underline{x}_{k}^i \mid \underline{x}_{1: k - 1}^i, \underline{y}_{1: k}\right)} \cdot \underbrace{\frac{f\left(\underline{x}_{1: k-1}^i \mid \underline{y}_{1: k - 1}\right)}{p\left(\underline{x}_{1: k -1}^i \mid \underline{y}_{1: k - 1}\right)}}_{=w_{k-1}^{e, i}}
$$
&lt;p>followed by normalization.&lt;/p>
&lt;h3 id="spezielle-proposal">Spezielle Proposal&lt;/h3>
&lt;h4 id="standard-proposal">&lt;strong>Standard Proposal&lt;/strong>&lt;/h4>
&lt;p>Einfache Verwendung der Systemdynamik:&lt;/p>
$$
p\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) \stackrel{!}{=} f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right)
$$
&lt;p>This yields&lt;/p>
$$
w_{k}^{e, i} \propto \frac{f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{i}\right) \cdot f\left(\hat{\underline{x}}_{k}^{i} \mid \hat{\underline{x}}_{k-1}^{i}\right)}{p\left(\underline{\hat{x}}_{k}^{i} \mid \hat{\underline{x}}_{k-1}^{i}, \underline{y}_k\right)} \cdot w_{k-1}^{e, i}=f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{i}\right) \cdot w_{k - 1}^{e, i}
$$
&lt;p>Very simple, but NO improvement in performance 🤪&lt;/p>
&lt;h4 id="optimales-proposal">&lt;strong>Optimal proposal&lt;/strong>&lt;/h4>
&lt;p>Use&lt;/p>
$$
\begin{aligned}
p\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) &amp;=f\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) \\
&amp; \propto f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right)
\end{aligned}
$$
&lt;p>Then we would have&lt;/p>
$$
w_k^{e, i} = w_{k-1}^{e, i}
$$
&lt;p>This is called the &lt;strong>optimal proposal&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Minimal variance of the weights&lt;/li>
&lt;li>The variance of the weights does not change&lt;/li>
&lt;/ul>
&lt;p>‼️ But typically we cannot sample from it $\rightarrow$ only usable in special cases.&lt;/p>
&lt;h2 id="einfaches-praktisches-filter-sir-partikelfilter">Einfaches praktisches Filter: SIR-Partikelfilter&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Standard proposal&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Resampling after every filter step&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Since the weights are equal after resampling and remain unchanged in the prediction step&lt;/p>
$$
w_{k-1}^{e, i} = \frac{1}{L}
$$
&lt;p>and therefore&lt;/p>
$$
w_k^{e, i} \propto f(\underline{y}_k \mid \underline{\hat{x}}_k^i)
$$
&lt;/li>
&lt;li>
&lt;p>Simplest practical particle filter&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Algorithm&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Input&lt;/p>
&lt;ul>
&lt;li>
$\underline{\hat{x}}_{k-1}^{e, i}$
&lt;/li>
&lt;li>
$w_{k-1}^{e, i} = \frac{1}{L}, i \in \{1, \dots, L\}$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>For $i \in \{1, \dots, L\}$
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Draw&lt;/p>
$$
\underline{\hat{x}}_{k}^{e, i} \sim f(\underline{x}_k \mid \underline{\hat{x}}_{k-1}^{e, i})
$$
&lt;/li>
&lt;li>
&lt;p>Weighting&lt;/p>
$$
w_k^{e, i} \propto f(\underline{y}_k \mid \underline{\hat{x}}_{k}^{e, i})
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Normalize the weights $w_k^{e, i}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Resampling&lt;/p>
$$
\underline{\hat{x}}_{k}^{e, i}, \quad w_{k}^{e, i} = \frac{1}{L} \qquad i \in \{1, \dots, L\}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Gauß Rechenregeln</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/gauss_rechenregeln/</link><pubDate>Mon, 15 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/gauss_rechenregeln/</guid><description>&lt;h2 id="produkt-zweier-gaußdichten">Product of two Gaussian densities&lt;/h2>
&lt;p>Given&lt;/p>
$$
\begin{aligned}
f_1(x) &amp;= \frac{1}{\sqrt{2\pi} \sigma_1} \exp\left\{-\frac{1}{2} \frac{(x - m_1)^2}{\sigma_1^2}\right\} \\
f_2(x) &amp;= \frac{1}{\sqrt{2\pi} \sigma_2} \exp\left\{-\frac{1}{2} \frac{(x - m_2)^2}{\sigma_2^2}\right\}
\end{aligned}
$$
&lt;p>Wanted:&lt;/p>
$$
\begin{aligned}
f(x) &amp;= \frac{1}{\sqrt{2\pi} \sigma} \exp\left\{-\frac{1}{2} \frac{(x - m)^2}{\sigma^2}\right\} \\\\
&amp;\propto f_1(x) \cdot f_2(x) = \frac{1}{\sqrt{2\pi} \sigma_1}\frac{1}{\sqrt{2\pi} \sigma_2} \cdot e^{-\frac{1}{2} \frac{(x - m_1)^2}{\sigma_1^2}} e^{-\frac{1}{2} \frac{(x - m_2)^2}{\sigma_2^2}}
\end{aligned}
$$
&lt;p>Exponent:&lt;/p>
$$
\begin{aligned}
&amp;\frac{\left(x-m_{1}\right)^{2}}{\sigma_{1}^{2}}+\frac{\left(x-m_{2}\right)^{2}}{\sigma_{2}^{2}} \overset{!}{=} \frac{(x-m)^{2}}{\sigma^{2}}+2C \\\\
&amp;\frac{x^{2}-2 m_{1} x+m_{1}^{2}}{\sigma_{1}^{2}}+\frac{x^{2}-2 m_{2} x+m_{2}^{2}}{\sigma_{2}^{2}} \stackrel{!}{=} \frac{x^{2}-2 mx+m^{2}}{\sigma^{2}}+2 C \\\\
&amp;x^{2}\left(\frac{1}{\sigma_{1}^{2}}+\frac{1}{\sigma_{2}^{2}}-\frac{1}{\sigma^{2}}\right)-2\left(\frac{m_{1}}{\sigma_{1}^{2}}+\frac{m_{2}}{\sigma_{2}^{2}}-\frac{m}{\sigma^{2}}\right) \cdot x +\frac{m_{1}^{2}}{\sigma_{1}^{2}}+\frac{m_{2}^{2}}{\sigma_{2}^{2}}-\frac{m^{2}}{\sigma^{2}}-2 C \stackrel{!}{=} 0
\end{aligned}
$$
&lt;p>Result:&lt;/p>
$$
\begin{aligned}
\sigma^{2}&amp;=\frac{1}{\frac{1}{\sigma_{1}^{2}}+\frac{1}{\sigma_{2}^{2}}}=\frac{\sigma_{1}^{2} \sigma_{2}^{2}}{\sigma_{1}^{2}+\sigma_{2}^{2}} \\\\
m &amp;= \sigma^2 \left(\frac{m_1}{\sigma_1^2} + \frac{m_2}{\sigma_2^2} \right)\\\\
2C &amp;= \frac{m_1^2}{\sigma_1^2} + \frac{m_2^2}{\sigma_2^2} - \frac{m^2}{\sigma^2} = \frac{(m_1 - m_2)^2}{\sigma_1^2 + \sigma_2^2}
\end{aligned}
$$
&lt;p>(See also: &lt;a href="https://ccrma.stanford.edu/~jos/sasp/Product_Two_Gaussian_PDFs.html">Product of Two Gaussian PDFs&lt;/a>)&lt;/p>
&lt;p>Alternative representation:&lt;/p>
$$
\begin{aligned}
f(x) &amp;\propto \frac{1}{\sqrt{2\pi} \sigma_1}\frac{1}{\sqrt{2\pi} \sigma_2} \cdot e^{-\frac{1}{2} \frac{(m_1 - m_2)^2}{\sigma_1^2 + \sigma_2^2}} e^{-\frac{1}{2} \frac{(x - m)^2}{\sigma^2}} \\\\
&amp;= \underbrace{\frac{1}{\sqrt{2\pi} \sqrt{\sigma_1^2 + \sigma_2^2}} e^{-\frac{1}{2} \frac{(m_1 - m_2)^2}{\sigma_1^2 + \sigma_2^2}}}_{\text{weight (normalized)}} \cdot \underbrace{\frac{1}{\sqrt{2\pi} \sigma} e^{-\frac{1}{2} \frac{(x - m)^2}{\sigma^2}}}_{\text{resulting density (normalized)}}
\end{aligned}
$$
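&lt;p>The result can be checked numerically. The sketch below computes $m$, $\sigma$ and the scalar weight from the formulas above and compares $f_1(x) \cdot f_2(x)$ with the weighted Gaussian at one test point; the function names and the test values are assumptions for illustration.&lt;/p>

```python
import math

def gaussian(x, m, sigma):
    """Normalized Gaussian density with mean m and standard deviation sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def product_of_gaussians(m1, sigma1, m2, sigma2):
    """Return (weight, m, sigma) with f1(x) * f2(x) = weight * gaussian(x, m, sigma)."""
    var1, var2 = sigma1 ** 2, sigma2 ** 2
    var = var1 * var2 / (var1 + var2)                    # sigma^2
    m = var * (m1 / var1 + m2 / var2)                    # fused mean
    weight = gaussian(m1, m2, math.sqrt(var1 + var2))    # Gaussian in (m1 - m2)
    return weight, m, math.sqrt(var)

weight, m, sigma = product_of_gaussians(0.0, 1.0, 2.0, 0.5)
x = 0.3
lhs = gaussian(x, 0.0, 1.0) * gaussian(x, 2.0, 0.5)
rhs = weight * gaussian(x, m, sigma)
```

&lt;p>Here &lt;code>lhs&lt;/code> and &lt;code>rhs&lt;/code> agree up to rounding, and the fused mean lies closer to $m_2$ because $f_2$ is the narrower density.&lt;/p>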
&lt;h2 id="dekomposition-einer-gaußdichten">Dekomposition einer Gaußdichten&lt;/h2>
&lt;p>Gegeben: Gaußdichte mit $m, \sigma$&lt;/p>
&lt;p>Gesucht: Dekomposition, d.h. mögliche Werte für $m_1, m_2, \sigma_1, \sigma_2$&lt;/p>
$$
\begin{aligned}
\frac{1}{\sigma^2} &amp;= \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \\\\
\Rightarrow \kappa^2 &amp;= \kappa_1^2 + \kappa_2^2 \qquad \text{with } \kappa := \frac{1}{\sigma},\ \kappa_i := \frac{1}{\sigma_i} \\\\
\Rightarrow \kappa_1^2 &amp;= (1 - \gamma)\kappa^2, \kappa_2^2 = \gamma \cdot \kappa^2 \qquad \gamma \in [0, 1]
\end{aligned}
$$
$$
m=\frac{1}{\kappa^{2}}\left((1-\gamma) \kappa^{2} \cdot m_{1}+\gamma \kappa^{2} \cdot m_{2}\right)=(1-\gamma) \cdot m_{1}+\gamma \cdot m_{2}
\tag{*}
$$
&lt;p>This obviously holds for $m_1 = m_2 = m$
, but any other choice of $m_1, m_2$ satisfying $(*)$ is possible as well&lt;/p></description></item><item><title>Progressive Filterung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/progressive_filterung/</link><pubDate>Mon, 15 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/sample_basierte_filter/progressive_filterung/</guid><description>&lt;h2 id="systematisches-resampling">Systematic resampling&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Given: prior Dirac mixture&lt;/p>
$$
f_{p}(\underline{x})=\sum_{i=1}^{L} w_{i}^{p} \delta(\underline{x}-\underline{\hat{x}}_i^p)
$$
&lt;/li>
&lt;li>
&lt;p>Filter step (Bayes)&lt;/p>
$$
\tilde{f}_e(\underline{x}) \propto f_{p}(\underline{x}) \cdot f_{L}(\underline{x})=\sum_{i=1}^{L} \underbrace{w_{i}^{p} \cdot f_{L}\left(\hat{\underline{x}}_{i}^{p}\right)}_{w_{i}^{e}} \cdot \delta(\underline{x} - \underbrace{\underline{\hat{x}}_{i}^{p}}_{\underline{\hat{x}}_{i}^{e}})
$$
&lt;/li>
&lt;li>
&lt;p>Problems&lt;/p>
&lt;ul>
&lt;li>If the support of $f_L(\cdot)$ is smaller than the support of $f_p(\cdot)$, many particles die out!&lt;/li>
&lt;li>Positions $\underline{\hat{x}}_i^e$
die out!&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Solution&lt;/p>
&lt;ul>
&lt;li>Progressive processing&lt;/li>
&lt;li>Reapproximation by optimization&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="progressive-filterung">Progressive Filterung&lt;/h2>
&lt;p>Progressiv = Der Filterschritt wird nicht in einen Schlag verwendet, sondern wir verwenden mehrere Likelihood, um die Filterung durchzuführen.&lt;/p>
&lt;p>Effektives Support:&lt;/p>
$$
\alpha_{\varepsilon}(f(\cdot))=\{x: f(x) \geqslant \varepsilon\} \qquad (\alpha\text{-cut at } \varepsilon)
$$
&lt;p>Given: likelihood $f_L(\underline{x})$ with $\alpha_{\varepsilon}(f_L(\cdot)) \ll \alpha_{\varepsilon}(f_p(\cdot))$
&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2022.21.19.png" alt="截屏2022-08-15 22.21.19" style="zoom:50%;" />
&lt;p>Decomposition of $f_L(\underline{x})$
&lt;/p>
$$
f_L(\underline{x}) = f_L^1(\underline{x}) \cdot f_L^2(\underline{x}) \cdots f_L^k(\underline{x})
$$
&lt;p>Product of densities: each factor $f_L^i(\underline{x})$ is &amp;ldquo;wider&amp;rdquo; than $f_L(\underline{x})$ $\rightarrow$ its effective support is larger ($\alpha_{\varepsilon}(f_L^i(\cdot)) > \alpha_{\varepsilon}(f_L(\cdot))$
)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2022.33.16.png" alt="截屏2022-08-15 22.33.16" style="zoom:67%;" />
&lt;p>Note: the decomposition is NOT unique.&lt;/p>
&lt;p>With this, the filter step can be decomposed:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2022.43.30.png" alt="截屏2022-08-15 22.43.30" style="zoom: 33%;" />
&lt;ul>
&lt;li>In each step, the prior Dirac mixture is reweighted&lt;/li>
&lt;li>Reapproximation after each partial filter step&lt;/li>
&lt;/ul>
&lt;p>Algorithm:&lt;/p>
&lt;ul>
&lt;li>
$f_e^0 (\underline{x}) = f_p(\underline{x})$
&lt;/li>
&lt;li>
&lt;p>For $i \in \{1, \dots, k\}$&lt;/p>
$$
\begin{aligned}
\tilde{f}_e^i(\underline{x}) &amp;= f_e^{i-1}(\underline{x}) \cdot f_L^i(\underline{x}) \quad \text{(weighted)} \\
f_e^{i}(\underline{x}) &amp;= \operatorname{Reapproximate}(\tilde{f}_e^i(\underline{x})) \quad \text{(unweighted)} \\
f_e(\underline{x}) &amp;= f_e^k(\underline{x}) \quad \text{(result after the last step)}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
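&lt;p>A rough sketch of this loop for a scalar state is given below. Two simplifying assumptions are made and are not taken from the lecture: the Gaussian likelihood is split into $k$ identical factors $f_L^i = f_L^{1/k}$ (one of many possible decompositions), and &lt;code>systematic_resample&lt;/code> merely stands in for the Reapproximate step, whereas the reapproximation described in these notes also moves the positions.&lt;/p>

```python
import math

def split_gaussian_likelihood(y, sigma, k):
    """Split exp(-(y - x)^2 / (2 sigma^2)) into k identical, wider factors."""
    def factor(x):
        return math.exp(-0.5 * (y - x) ** 2 / (k * sigma ** 2))
    return [factor] * k

def systematic_resample(positions, weights):
    """Stand-in for Reapproximate: deterministic pointers u_j = (2j - 1) / (2L)."""
    L = len(positions)
    out, i, cumulative = [], 0, weights[0]
    for j in range(1, L + 1):
        u = (2 * j - 1) / (2 * L)
        while u > cumulative and L - 1 > i:
            i += 1
            cumulative += weights[i]
        out.append(positions[i])
    return out

def progressive_filter(positions, factors, reapproximate):
    """Apply the likelihood factors one by one, reapproximating after each sub-step."""
    L = len(positions)
    for f_i in factors:
        weights = [f_i(x) / L for x in positions]       # weighted Dirac mixture
        total = sum(weights)
        weights = [w / total for w in weights]
        positions = reapproximate(positions, weights)   # back to L equally weighted samples
    return positions

prior = [i / 10 for i in range(-20, 21)]                # equally weighted prior samples
factors = split_gaussian_likelihood(y=1.0, sigma=0.1, k=4)
posterior = progressive_filter(prior, factors, systematic_resample)
```

&lt;p>Each factor has standard deviation $\sigma \sqrt{k}$, so every sub-step removes fewer particles than a single update with the full likelihood would.&lt;/p>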
&lt;h2 id="reapproximation">Reapproximation&lt;/h2>
&lt;h4 id="ziel">Ziel&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Given: weighted Dirac mixture&lt;/p>
$$
\tilde{f}(\underline{x}) = \sum_{i=1}^L \tilde{w}_i \cdot \delta(\underline{x} - \underline{\tilde{x}}_i)
$$
&lt;/li>
&lt;li>
&lt;p>Wanted: unweighted Dirac mixture with new positions $\underline{\hat{x}}_i$&lt;/p>
$$
\tilde{f}(\underline{x}) \approx f(\underline{x}) = \sum_{i=1}^L \frac{1}{L} \cdot \delta(\underline{x} - \underline{\hat{x}}_i)
$$
&lt;/li>
&lt;li>
&lt;p>Quality measure: distance $D(\tilde{f}(\underline{x}) , f(\underline{x}))$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>But a distance between Dirac mixtures is hard to define in the density domain 🤪 $\rightarrow$ we consider the cumulative distributions $\tilde{F}(\underline{x}), F(\underline{x})$ instead, where $H(\cdot)$ is the Heaviside step function&lt;/p>
$$
\begin{aligned}
\tilde{F}(\underline{x}) &amp;= \sum_{i=1}^L \tilde{w}_i \cdot H(\underline{x} - \underline{\tilde{x}}_i) \\
F(\underline{x}) &amp;= \sum_{i=1}^L \frac{1}{L} \cdot H(\underline{x} - \underline{\hat{x}}_i) \\
\end{aligned}
$$
&lt;/li>
&lt;/ul>
&lt;h3 id="herausforderungen">Herausforderungen&lt;/h3>
&lt;p>Minimalbeispiel: Approximation von zwei Dirac Komponenten durch eine Komponent&lt;/p>
$$
\begin{aligned}
\tilde{F}(x) &amp;= w_1 \cdot H(x - \tilde{x}_1) + w_2 \cdot H(x - \tilde{x}_2) \qquad w_1, w_2 > 0, w_1 + w_2 = 1 \\
F(x) &amp;= H(x - \hat{x})
\end{aligned}
$$
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-15%2022.59.28.png" alt="截屏2022-08-15 22.59.28" style="zoom: 50%;" />
&lt;p>Cramér–von Mises distance:&lt;/p>
$$
D=\int_{-\infty}^{\infty}[\tilde{F}(x)-F(x)]^{2} d x=\left(\hat{x}-\tilde{x}_{1}\right) \cdot w_{1}^{2}+\left(\tilde{x}_{2}-\hat{x}\right) \cdot w_{2}^{2} \quad \text{for} \quad \tilde{x}_{1} \leq \hat{x} \leq \tilde{x}_{2}
$$
$$
\frac{\partial D}{\partial \hat{x}} = w_1^2 - w_2^2
$$
&lt;p>I.e., the derivative does not depend on $\hat{x}$. For $w_1 = w_2$, every $\hat{x}$ with $\tilde{x}_{1} \leq \hat{x} \leq \tilde{x}_{2}$ minimizes $D$ $\rightarrow$ NOT unique! For $w_1 \neq w_2$, the minimum lies at the position with the larger weight, which ignores the other component entirely.&lt;/p>
&lt;h3 id="wasserstein-distanz">Wasserstein-Distanz&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-16%2009.40.17.png" alt="截屏2022-08-16 09.40.17" style="zoom:50%;" />
$$
D=\int_{0}^{1}\left[\tilde{F}^{-1}(y)-F^{-1}(y)\right]^{2} d y=w_{1}\left(\hat{x}-\tilde{x}_{1}\right)^{2}+w_{2}\left(\hat{x}-\tilde{x}_{2}\right)^{2}
$$
$$
\begin{aligned}
&amp;\frac{\partial D}{\partial \hat{x}}=2 w_{1}\left(\hat{x}-\tilde{x}_{1}\right)+2 w_{2}\left(\hat{x}-\tilde{x}_{2}\right) \stackrel{!}{=} 0 \\
&amp;\Rightarrow \hat{x}=\frac{w_{1} \cdot \tilde{x}_{1}+w_{2} \tilde{x}_{2}}{w_{1}+w_{2}} \quad \text{(weighted mean)}
\end{aligned}
$$
&lt;h4 id="allgemeines-verfahren">Allgemeines Verfahren&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-16%2009.50.33.png" alt="截屏2022-08-16 09.50.33" style="zoom:50%;" />
$$
\begin{aligned}
&amp;\hat{x}_{1}=\frac{w_{1} \cdot \tilde{x}_{1}+\left(0.5-w_{1}\right) \cdot \tilde{x}_{2}}{0.5} \\
&amp;\hat{x}_{2}=\frac{\left(w_{1}+w_{2}-0.5\right) \tilde{x}_{2}+\left(1-w_{1}-w_{2}\right) \tilde{x}_{3}}{0.5}
\end{aligned}
$$
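&lt;p>The two formulas above follow a general pattern in the scalar case: each new component collects a probability mass of $\frac{1}{L}$ from the sorted weighted components and is placed at the weighted mean of that mass. The sketch below implements this pattern; the function name and the rounding tolerance &lt;code>1e-12&lt;/code> are assumptions, and the multivariate case is not covered.&lt;/p>

```python
def wasserstein_reapproximate(positions, weights, L):
    """1D sketch: replace a weighted Dirac mixture by L equally weighted components."""
    pairs = sorted(zip(positions, weights))
    total = sum(w for _, w in pairs)
    new_positions = []
    idx = 0                                  # current component of the weighted mixture
    remaining = pairs[0][1] / total          # mass of that component not yet used
    for _ in range(L):
        need = 1.0 / L                       # mass to collect for this new component
        acc = 0.0                            # accumulates position * mass
        while need > 1e-12:
            take = min(need, remaining)
            acc += take * pairs[idx][0]
            need -= take
            remaining -= take
            if 1e-12 >= remaining and len(pairs) - 1 > idx:
                idx += 1                     # move on to the next weighted component
                remaining = pairs[idx][1] / total
            elif 1e-12 >= remaining:
                acc += need * pairs[idx][0]  # rounding leftovers go to the last component
                break
        new_positions.append(acc * L)        # mean position of the collected mass
    return new_positions

# Three weighted components, two equally weighted ones (as in the formulas above)
x_hat = wasserstein_reapproximate([0.0, 1.0, 3.0], [0.2, 0.5, 0.3], L=2)
```

&lt;p>For this input the formulas give $\hat{x}_1 = \frac{0.2 \cdot 0 + 0.3 \cdot 1}{0.5} = 0.6$ and $\hat{x}_2 = \frac{0.2 \cdot 1 + 0.3 \cdot 3}{0.5} = 2.2$, which is what the function returns up to rounding. With $L = 1$ it reduces to the weighted mean.&lt;/p>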
&lt;h2 id="gesamtverfahren-progressives-filterverfahren-mit-laufender-reapproximation">Gesamtverfahren: Progressives Filterverfahren mit laufender Reapproximation&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-16%2010.00.49.png" alt="截屏2022-08-16 10.00.49" style="zoom: 25%;" /></description></item><item><title>Zusammenfassung</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/</link><pubDate>Thu, 18 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/</guid><description/></item><item><title>Mindmap</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/mindmap/</link><pubDate>Wed, 14 Sep 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/mindmap/</guid><description>&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/SI_Zusammenfassung.png" alt="SI_Zusammenfassung" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p></description></item><item><title>Allgemeine Fragen</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/allg_fragen/</link><pubDate>Thu, 18 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/allg_fragen/</guid><description>&lt;h2 id="vorlesung-in-eigenen-worten-zusammenfassen">Summarize the lecture in your own words&lt;/h2>
&lt;p>The SI lecture covers the fundamental and formal foundations of state estimation, centered on prediction and filtering.&lt;/p>
&lt;h2 id="vier-behandelten-typen-von-systemen">The four types of systems covered&lt;/h2>
&lt;p>Explain:&lt;/p>
&lt;ul>
&lt;li>Name them&lt;/li>
&lt;li>Relationships&lt;/li>
&lt;li>Differences&lt;/li>
&lt;li>Limitations&lt;/li>
&lt;li>Complexity of implementing the corresponding estimators&lt;/li>
&lt;/ul>
&lt;p>The four types of systems:&lt;/p>
&lt;ul>
&lt;li>Discrete-valued systems&lt;/li>
&lt;li>Continuous-valued linear systems&lt;/li>
&lt;li>Continuous-valued, mildly nonlinear systems&lt;/li>
&lt;li>General systems&lt;/li>
&lt;/ul>
&lt;h2 id="wann-kann-man-mit-1d-messungen-auch-auf-einen-3d-zustand-schließen-wie-sehen-dann-die-unsicherheits-ellipsen-uber-der-zeit-aus">When can a 3D state be inferred from 1D measurements? What do the uncertainty ellipses look like over time?&lt;/h2>
&lt;h2 id="definition">Definition&lt;/h2>
&lt;h3 id="induzierte-nichtlinearitat">Induced nonlinearity&lt;/h3>
&lt;h3 id="bedingte-unabhängigkeit">Conditional independence&lt;/h3>
&lt;p>Two variables $A, B$ are conditionally independent given $C$ $\Leftrightarrow$&lt;/p>
$$
P(A, B | C) = P(A | C) P(B | C)
$$
&lt;p>Equivalent formulations are:
$$
P(A | B,C) = P(A | C) \qquad P(B | A,C) = P(B | C)
$$
&lt;/p>
&lt;h3 id="zustand">State&lt;/h3>
&lt;p>(Script P19)&lt;/p>
&lt;p>The state of a dynamic system is defined as the smallest set of variables, the so-called &lt;strong>state variables&lt;/strong>, that completely determine the behavior of the system for $t \geq t_0$ given their values at $t_0$ together with the input function for $t \geq t_0$.&lt;/p>
&lt;ul>
&lt;li>When modeling a system, the choice of state variables is not unique.&lt;/li>
&lt;li>State variables do not need to exist physically. They also do not need to be measurable.&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>Der Zustand eines dynamischen Systems ist definiert als die kleinste Menge von Variablen, den so genannten &lt;strong>Zustandsvariablen&lt;/strong>, die das Verhalten des Systems für $t \geq t_0$ vollständig bestimmen/beschreiben, wenn man ihre Werte bei $t_0$ zusammen mit der Eingangsfunktion für $t \geq t_0$ betrachtet.&lt;/p>
&lt;/blockquote>
&lt;h3 id="zustandsschätzung">State estimation&lt;/h3>
&lt;p>Reconstruction of the internal state from measurements and inputs.&lt;/p>
&lt;h3 id="komplexität-einer-rekursion">Complexity of a recursion&lt;/h3>
&lt;h3 id="dichtefunktion-likelihood">Density function, likelihood&lt;/h3>
&lt;p>&lt;strong>Distribution function&lt;/strong> or &lt;strong>cumulative distribution function&lt;/strong> $F_{\boldsymbol{x}}(x)$ of the random variable $\boldsymbol{x}$:&lt;/p>
$$
F_{\boldsymbol{x}}: \mathbb{R} \rightarrow[0,1] \qquad F_{\boldsymbol{x}}(x):=\mathrm{P}(\boldsymbol{x} \leq x)
$$
&lt;p>Properties of $F_{\boldsymbol{x}}(x)$:&lt;/p>
&lt;ul>
&lt;li>$\lim _{x \rightarrow-\infty} F_{\boldsymbol{x}}(x)=0$
&lt;/li>
&lt;li>$\lim _{x \rightarrow\infty} F_{\boldsymbol{x}}(x)=1$
&lt;/li>
&lt;li>monotonically non-decreasing and right-continuous.&lt;/li>
&lt;/ul>
&lt;p>For a continuous random variable:&lt;/p>
$$
F_{\boldsymbol{x}}(x)=\int_{-\infty}^{x} f_{\boldsymbol{x}}(u) \mathrm{d} u
$$
&lt;p>$f_{\boldsymbol{x}}(x)$ is called the &lt;strong>density&lt;/strong> of $\boldsymbol{x}$.&lt;/p>
&lt;p>&amp;ldquo;Density&amp;rdquo; of a discrete random variable:&lt;/p>
$$
f_{\boldsymbol{x}}(x)=\sum_{n=1}^{\infty} \mathrm{P}\left(\boldsymbol{x}=x_{n}\right) \delta\left(x-x_{n}\right)=\sum_{n=1}^{\infty} p_{n} \delta\left(x-x_{n}\right)
$$
&lt;h3 id="zufallsvariable">Random variable&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 1, A4, A5&lt;/span>
&lt;/div>
&lt;p>A &lt;strong>random variable&lt;/strong> is a numerical description of the outcome of a random experiment. It is a function that maps an outcome $\omega$ from a sample space $\Omega$ to the real numbers $\mathbb{R}$:&lt;/p>
$$
\boldsymbol{x}=\boldsymbol{x}(\omega): \Omega \rightarrow \mathbb{R}
$$
&lt;p>Two types:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Discrete&lt;/strong>: the set of outcomes is finite or at most countably infinite&lt;/li>
&lt;li>&lt;strong>Continuous&lt;/strong>: the set of events and the set of values are uncountable.&lt;/li>
&lt;/ul>
&lt;h4 id="momente">Moments&lt;/h4>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 2, A1.1&lt;/span>
&lt;/div>
&lt;p>&lt;strong>Expected value&lt;/strong> (mean, first moment) of the random variable $\boldsymbol{x}$:&lt;/p>
$$
\mathrm{E}_{f_{\boldsymbol{x}}}\{\boldsymbol{x}\}=\hat{\boldsymbol{x}}=\mu_{\boldsymbol{x}}=\int_{-\infty}^{\infty} x f_{\boldsymbol{x}}(x) \mathrm{d} x
$$
&lt;p>&lt;strong>$k$-th moment&lt;/strong> of the random variable $\boldsymbol{x}$:
$$
\mathrm{E}_{f_{\boldsymbol{x}}}\left\{\boldsymbol{x}^{k}\right\}=\int_{-\infty}^{\infty} x^{k} f_{\boldsymbol{x}}(x) \mathrm{d} x
$$
&lt;/p>
&lt;p>&lt;strong>$k$-th central moment&lt;/strong> of the random variable $\boldsymbol{x}$:&lt;/p>
$$
\mathrm{E}_{f_{\boldsymbol{x}}}\left\{\left(\boldsymbol{x}-\mathrm{E}_{f_{\boldsymbol{x}}}\{\boldsymbol{x}\}\right)^{k}\right\}=\int_{-\infty}^{\infty}\left(x-\mu_{\boldsymbol{x}}\right)^{k} f_{\boldsymbol{x}}(x) \mathrm{d} x
$$
&lt;p>&lt;strong>Variance&lt;/strong> (second central moment) of the random variable $\boldsymbol{x}$:&lt;/p>
$$
\sigma_{\boldsymbol{x}}^{2}=\mathrm{E}_{f_{\boldsymbol{x}}}\left\{\left(\boldsymbol{x}-\mathrm{E}_{f_{\boldsymbol{x}}}\{\boldsymbol{x}\}\right)^{2}\right\}=\int_{-\infty}^{\infty}\left(x-\mu_{\boldsymbol{x}}\right)^{2} f_{\boldsymbol{x}}(x) \mathrm{d} x
$$
&lt;ul>
&lt;li>$\sigma_{\boldsymbol{x}}$: standard deviation of the random variable $\boldsymbol{x}$ (the square root of the variance)&lt;/li>
&lt;/ul>
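&lt;p>The moment definitions above can be checked numerically. The following is a minimal sketch for a discrete random variable, where the integrals reduce to sums over the values $x_n$ with probabilities $p_n$. The fair die and the helper names &lt;code>moment&lt;/code> and &lt;code>central_moment&lt;/code> are assumptions made for illustration and do not come from the lecture.&lt;/p>

```python
import numpy as np

# Hypothetical example: a fair six-sided die as a discrete random variable.
values = np.arange(1, 7, dtype=float)   # possible outcomes x_n
probs = np.full(6, 1.0 / 6.0)           # probabilities p_n


def moment(k):
    """k-th (raw) moment E{x^k} = sum_n x_n^k * p_n."""
    return float(np.sum(values**k * probs))


def central_moment(k):
    """k-th central moment E{(x - mu)^k}."""
    mu = moment(1)
    return float(np.sum((values - mu) ** k * probs))


mean = moment(1)                  # expected value (first moment)
variance = central_moment(2)      # variance (second central moment)
std_dev = float(np.sqrt(variance))
```

&lt;p>For this example the second central moment agrees with $\mathrm{E}\{\boldsymbol{x}^2\} - \mu_{\boldsymbol{x}}^2$, which is a quick sanity check on the implementation.&lt;/p>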
&lt;h3 id="2-dim-zufallsvariable">Two-dimensional random variable&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 2, A2.2&lt;/span>
&lt;/div>
&lt;p>Let $\underline{X}$ be a two-dimensional random variable with density $f(\underline{X})=f_{\underline{X}}\left(x_{1}, x_{2}\right)$.&lt;/p>
&lt;p>&lt;strong>Marginal density&lt;/strong>&lt;/p>
$$
\begin{array}{l}
f_{X_{1}}\left(x_{1}\right)=\int_{-\infty}^{\infty} f_{\underline{X}}\left(x_{1}, x_{2}\right) \mathrm{d} x_{2} \\
f_{X_{2}}\left(x_{2}\right)=\int_{-\infty}^{\infty} f_{\underline{X}}\left(x_{1}, x_{2}\right) \mathrm{d} x_{1}
\end{array}
$$
&lt;p>&lt;strong>Conditional density&lt;/strong>&lt;/p>
&lt;p>Conditional density of $X_1$ given $X_2 = x_2$&lt;/p>
$$
f_{X_{1}}\left(x_{1} \mid X_{2}=x_{2}\right)=\frac{f_{\underline{X}}\left(x_{1}, x_{2}\right)}{f_{X_{2}}\left(x_{2}\right)}
$$
&lt;p>Conditional density of $X_2$ given $X_1 = x_1$&lt;/p>
$$
f_{X_{2}}\left(x_{2} \mid X_{1}=x_{1}\right)=\frac{f_{\underline{X}}\left(x_{1}, x_{2}\right)}{f_{X_{1}}\left(x_{1}\right)}
$$
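&lt;p>For a discrete pair of random variables the same marginalization and conditioning rules apply with sums in place of integrals. The sketch below uses a made-up $2 \times 2$ joint probability table (an assumption for illustration only) to show both operations.&lt;/p>

```python
import numpy as np

# Hypothetical joint probabilities P(X1 = i, X2 = j); rows index X1, columns index X2.
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])

# Marginals: sum out the other variable.
marg_x1 = joint.sum(axis=1)   # P(X1 = i)
marg_x2 = joint.sum(axis=0)   # P(X2 = j)

# Conditionals: joint divided by the marginal of the conditioning variable.
cond_x1_given_x2 = joint / marg_x2            # column j holds P(X1 | X2 = j)
cond_x2_given_x1 = joint / marg_x1[:, None]   # row i holds P(X2 | X1 = i)
```

&lt;p>Each column of &lt;code>cond_x1_given_x2&lt;/code> and each row of &lt;code>cond_x2_given_x1&lt;/code> sums to one, as a conditional distribution must.&lt;/p>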
&lt;h3 id="unabhängigkeit-und-unkorreliertheit-von-zufallsvariablen">Independence and uncorrelatedness of random variables&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 2, A2.3&lt;/span>
&lt;/div>
&lt;p>$X, Y$ are independent $\Leftrightarrow$&lt;/p>
$$
f_{X, Y}(x, y)=f_{X}(x) \cdot f_{Y}(y)
$$
&lt;p>It then also holds that&lt;/p>
$$
f_{X}(x \mid Y=y)=f_{X}(x)
$$
&lt;p>The &lt;strong>covariance&lt;/strong> $\sigma_{X, Y}=\operatorname{Cov}_{f_{X, Y}}\{X, Y\}$
of $X$ and $Y$:&lt;/p>
$$
\sigma_{X, Y}=\operatorname{Cov}_{f_{X, Y}}\{X, Y\}=\mathrm{E}\{(X-\mathrm{E}\{X\}) \cdot(Y-\mathrm{E}\{Y\})\}=\mathrm{E}\left\{\left(X-\mu_{x}\right) \cdot\left(Y-\mu_{y}\right)\right\}
$$
&lt;p>The &lt;strong>correlation coefficient&lt;/strong> of $X$ and $Y$:&lt;/p>
$$
\rho_{X, Y}=\frac{\sigma_{X, Y}}{\sigma_{X} \cdot \sigma_{Y}} \in [-1, 1]
$$
&lt;ul>
&lt;li>$\left|\rho_{X, Y}\right|=1$: $X$ and $Y$ are maximally similar (perfectly linearly related)&lt;/li>
&lt;li>$\rho_{X, Y}=0$: $X$ and $Y$ are &lt;strong>uncorrelated&lt;/strong> (no linear relationship)&lt;/li>
&lt;/ul>
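&lt;p>Independence and uncorrelatedness: independence always implies uncorrelatedness; the converse holds if $X$ and $Y$ are jointly normally distributed.&lt;/p>
$$
\text{Independence} \underset{\text{+ normal distribution}}{\rightleftharpoons} \text{Uncorrelatedness}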
$$
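&lt;p>That uncorrelated variables need not be independent can be seen in a standard textbook counterexample, sketched here under the assumption that $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The example and all variable names are illustrative and not taken from the lecture.&lt;/p>

```python
import numpy as np

# X is uniform on {-1, 0, 1}; Y = X**2 is a deterministic function of X.
x_vals = np.array([-1.0, 0.0, 1.0])
p_x = np.full(3, 1.0 / 3.0)
y_vals = x_vals**2

mu_x = float(np.sum(x_vals * p_x))
mu_y = float(np.sum(y_vals * p_x))

# Covariance E{(X - mu_x)(Y - mu_y)}; it vanishes, so X and Y are uncorrelated.
cov_xy = float(np.sum((x_vals - mu_x) * (y_vals - mu_y) * p_x))

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint_00 = 1.0 / 3.0   # X = 0 forces Y = 0
p_x0 = 1.0 / 3.0
p_y0 = 1.0 / 3.0         # Y = 0 occurs only when X = 0
independent = bool(np.isclose(p_joint_00, p_x0 * p_y0))
```

&lt;p>Here the covariance is zero while the factorization test fails, so the pair is uncorrelated but dependent.&lt;/p>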
&lt;h3 id="erwartungswert">Expected value&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;ul>
&lt;li>Üb 1, A7&lt;/li>
&lt;li>Üb 2, A3.4&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
&lt;p>The &lt;strong>expected value&lt;/strong> can be interpreted as the mean of all possible values $x_n$ that a (discrete) random variable $\boldsymbol{x}$ can take, each weighted by its probability of occurrence $p_n$.&lt;/p>
$$
\mathrm{E}\{\boldsymbol{x}\}=\sum_{n=1}^{N} x_{n} p_{n}
$$
&lt;p>Continuous case:&lt;/p>
$$
\mathrm{E}_{f_\boldsymbol{x}}\{\boldsymbol{x}\} = \int_{-\infty}^{\infty}x f_\boldsymbol{x}(x) dx
$$
&lt;p>Expected value of a function of a random variable:&lt;/p>
$$
\mathrm{E}_{f_{\boldsymbol{x}}}\{g(\boldsymbol{x})\}=\int_{-\infty}^{\infty} g(x) f_{\boldsymbol{x}}(x) \mathrm{d} x
$$
&lt;p>Rules:&lt;/p>
&lt;ul>
&lt;li>$\mathrm{E}_{f_{X}}\{a X+b\}=a \mathrm{E}_{f_{X}}\{X\}+b$
&lt;/li>
&lt;li>If $a$ is a constant: $E(a)=a$
&lt;/li>
&lt;li>$E(X \pm Y)=E(X) \pm E(Y)$
&lt;/li>
&lt;li>$E(XY) = E(X) E(Y)$
, if $X, Y$ are independent&lt;/li>
&lt;/ul>
&lt;h3 id="varianz">Variance&lt;/h3>
$$
E_{f_X}\{(X - \mu_X)^2\} = \operatorname{Var}(X) = \sigma_X^2
$$
&lt;p>Rules:&lt;/p>
&lt;ul>
&lt;li>$\operatorname{Var}_{f_X}\{aX+b\} = a^2 \operatorname{Var}_{f_X}\{X\}$
&lt;/li>
&lt;li>$\operatorname{Var}_{f_{X}}\{X\}=\mathrm{E}_{f_{X}}\left\{X^{2}\right\}-\left(\mathrm{E}_{f_{X}}\{X\}\right)^{2}$
&lt;/li>
&lt;li>If $a$ is a constant:
&lt;ul>
&lt;li>$\operatorname{Var}_{f_X}\{a\} = 0$&lt;/li>
&lt;li>$\operatorname{Var}_{f_X}\{a \pm X\} = \operatorname{Var}_{f_X}\{X\}$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$\operatorname{Cov}\{X, Y\} = E\{XY\} - \mu_X \mu_Y $
&lt;/li>
&lt;/ul>
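&lt;p>As a short worked derivation, the second rule above follows from expanding the square and using the linearity of the expected value (writing $\mu_X = \mathrm{E}_{f_X}\{X\}$):&lt;/p>
$$
\begin{aligned}
\operatorname{Var}_{f_X}\{X\} &amp;= \mathrm{E}_{f_X}\left\{(X-\mu_X)^2\right\} \\
&amp;= \mathrm{E}_{f_X}\left\{X^2\right\} - 2 \mu_X \mathrm{E}_{f_X}\{X\} + \mu_X^2 \\
&amp;= \mathrm{E}_{f_X}\left\{X^2\right\} - \mu_X^2
\end{aligned}
$$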
&lt;h3 id="kovarianzmatrix">Covariance matrix&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">&lt;ul>
&lt;li>Üb 2, A2.3&lt;/li>
&lt;li>Üb 4, A5&lt;/li>
&lt;/ul>
&lt;/span>
&lt;/div>
$$
\begin{array}{l}
\operatorname{Cov}_{f_{\underline{X}}}\{\underline{X}\}=\mathrm{E}_{f_{\underline{X}}}\left\{(\underline{X}-\underline{\mu})(\underline{X}-\underline{\mu})^{\top}\right\}\\
\newline
=\left[\begin{array}{cccc}
\sigma_{X_{1}}^{2} &amp; \sigma_{X_1 X_2} &amp; \cdots &amp; \sigma_{X_1 X_N} \\
\sigma_{X_2 X_1} &amp; \sigma_{X_{2}}^{2} &amp; \cdots &amp; \sigma_{X_2 X_N} \\
\vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
\sigma_{X_N X_1} &amp; \sigma_{X_N X_2} &amp; \cdots &amp; \sigma_{X_{N}}^{2}
\end{array}\right]
\newline
=\left[\begin{array}{cccc}
\sigma_{X_{1}}^{2} &amp; \rho_{X_{1}, X_{2}} \sigma_{X_{1}} \sigma_{X_{2}} &amp; \cdots &amp; \rho_{X_{1}, X_{N}} \sigma_{X_{1}} \sigma_{X_{N}} \\
\rho_{X_{2}, X_{1}} \sigma_{X_{2}} \sigma_{X_{1}} &amp; \sigma_{X_{2}}^{2} &amp; \cdots &amp; \rho_{X_{2}, X_{N}} \sigma_{X_{2}} \sigma_{X_{N}} \\
\vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
\rho_{X_{N}, X_{1}} \sigma_{X_{N}} \sigma_{X_{1}} &amp; \rho_{X_{N}, X_{2}} \sigma_{X_{N}} \sigma_{X_{2}} &amp; \cdots &amp; \sigma_{X_{N}}^{2}
\end{array}\right]
\end{array}
$$
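&lt;p>The second form of the matrix, with entries $\rho_{X_i, X_j} \sigma_{X_i} \sigma_{X_j}$, translates directly into code. The following sketch builds a covariance matrix from standard deviations and a correlation matrix; the numbers are invented for illustration.&lt;/p>

```python
import numpy as np

# Hypothetical standard deviations and correlation matrix for N = 2.
sigma = np.array([2.0, 0.5])
rho = np.array([[1.0, 0.3],
                [0.3, 1.0]])

# C[i, j] = rho[i, j] * sigma[i] * sigma[j]; the diagonal holds the variances.
C = rho * np.outer(sigma, sigma)

# Recover the correlation matrix again as a consistency check.
rho_back = C / np.outer(np.sqrt(np.diag(C)), np.sqrt(np.diag(C)))
```

&lt;p>The resulting matrix is symmetric, and its diagonal contains $\sigma_{X_1}^2$ and $\sigma_{X_2}^2$.&lt;/p>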
&lt;h3 id="positiv-definit-positiv-semidefinit">Positive definite, positive semidefinite&lt;/h3>
&lt;p>An arbitrary (possibly symmetric or Hermitian) $n \times n$ matrix $A$ is&lt;/p>
&lt;ul>
&lt;li>
&lt;p>positive definite if, for all $x \neq 0$,&lt;/p>
$$
x^T A x > 0
$$
&lt;/li>
&lt;li>
&lt;p>positive semidefinite if, for all $x$,&lt;/p>
$$
x^T A x \geq 0
&lt;/li>
&lt;/ul>
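&lt;p>For a symmetric matrix, the two conditions above are equivalent to all eigenvalues being positive (respectively non-negative). A minimal sketch of such a check follows; the function name &lt;code>definiteness&lt;/code>, the tolerance and the test matrices are assumptions made for illustration.&lt;/p>

```python
import numpy as np


def definiteness(A, tol=1e-12):
    """Classify a symmetric matrix by its smallest eigenvalue."""
    eigvals = np.linalg.eigvalsh(A)   # eigenvalues of a symmetric matrix, ascending
    if eigvals[0] > tol:
        return "positive definite"
    if eigvals[0] >= -tol:
        return "positive semidefinite"
    return "not positive semidefinite"


A_pd = np.array([[2.0, 1.0], [1.0, 2.0]])      # eigenvalues 1 and 3
A_psd = np.array([[1.0, 1.0], [1.0, 1.0]])     # eigenvalues 0 and 2
A_indef = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3

labels = [definiteness(M) for M in (A_pd, A_psd, A_indef)]
```

&lt;p>A small tolerance is used because the zero eigenvalue of a semidefinite matrix is usually computed only up to rounding error.&lt;/p>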
&lt;h3 id="weißes-rauschen">White noise&lt;/h3>
&lt;p>The noise terms (uncertainties) at different time steps are mutually independent.&lt;/p>
&lt;h3 id="system-eigenschaften-dynamisch-statisch-linear-zeitinvariant">System properties: dynamic, static, linear, time-invariant&lt;/h3>
&lt;p>&lt;strong>Static&lt;/strong>: the current output $y_k$ depends only on the current input $u_k$.&lt;/p>
&lt;p>&lt;strong>Dynamic&lt;/strong>: the current output $y_k$ depends on&lt;/p>
&lt;ul>
&lt;li>the current input $u_k$&lt;/li>
&lt;li>the current state $x_k$&lt;/li>
&lt;/ul>
&lt;p>For continuous-valued linear systems:&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 5, A1&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Linear&lt;/strong>&lt;/p>
$$
\mathcal{S}\left\{\sum_{i=1}^{N} c_{i} y_{\mathrm{e} i, n}\right\}=\sum_{i=1}^{N} c_{i} \mathcal{S}\left\{y_{\mathrm{e} i, n}\right\}
$$
&lt;p>(i.e., the highest exponent is $\leq 1$)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Zeitinvariant&lt;/strong>&lt;/p>
&lt;p>The system responds to a time-shifted input signal $y_{\mathrm{e}, n-n_{0}}$
with the correspondingly time-shifted output signal $y_{\mathrm{a}, n-n_{0}}$
&lt;/p>
$$
y_{\mathrm{a}, n}=\mathcal{S}\left\{y_{\mathrm{e}, n}\right\} \quad \Longrightarrow \quad y_{\mathrm{a}, n-n_{0}}=\mathcal{S}\left\{y_{\mathrm{e}, n-n_{0}}\right\}.
$$
&lt;p>(i.e., the system behavior does not depend on the time index)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Causality&lt;/strong>&lt;/p>
&lt;p>A discrete-time system $\mathcal{S}$ is called &lt;strong>causal&lt;/strong> if its response depends ONLY on &lt;em>present&lt;/em> or &lt;em>past&lt;/em> values of the input signal, NOT on future ones.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="dirac-funktion">Dirac function&lt;/h3>
&lt;p>Definition:&lt;/p>
$$
\begin{aligned}
\delta(x)&amp;=0, \quad x \neq 0 \\
\int_{a}^b \delta(x) dx &amp;= 1, \quad a &lt; 0 &lt; b
\end{aligned}
$$
&lt;p>&lt;strong>Rules&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Shift (sifting property, for $a &lt; x_0 &lt; b$)&lt;/p>
$$
\int_{a}^{b} f(x) \delta\left(x-x_{0}\right) \mathrm{d} x=f\left(x_{0}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Symmetry&lt;/p>
$$
\delta(x) = \delta(-x)
$$
&lt;/li>
&lt;li>
&lt;p>Scaling (for $k \neq 0$ and $a &lt; 0 &lt; b$)&lt;/p>
$$
\int_{a}^{b} f(x) \delta(k x) \mathrm{d} x=\frac{1}{|k|} f(0)
$$
&lt;/li>
&lt;li>
&lt;p>Composition&lt;/p>
$$
\int_{-\infty}^{\infty} f(x) \delta(g(x)) \mathrm{d} x=\sum_{i=1}^{n} \frac{f\left(x_{i}\right)}{\left|g^{\prime}\left(x_{i}\right)\right|}
$$
&lt;p>where $g(x_i) = 0$ and $g^\prime(x_i) \neq 0$.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Resolving a composition (very important!)&lt;/p>
$$
\delta(g(x)) = \sum_i \frac{1}{\left|g^\prime(x_i)\right|} \delta(x - x_i)
$$
&lt;p>where $g(x_i) = 0$ and $g^\prime(x_i) \neq 0$.&lt;/p>
&lt;/li>
&lt;/ul>
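&lt;p>As a small worked example of the composition rule (chosen here for illustration, not taken from the lecture), take $g(x) = x^2 - 1$ with the simple roots $x_{1,2} = \pm 1$ and $g^\prime(x) = 2x$, so that $\left|g^\prime(\pm 1)\right| = 2$:&lt;/p>
$$
\begin{aligned}
\delta(x^2 - 1) &amp;= \frac{1}{2}\left[\delta(x-1) + \delta(x+1)\right] \\
\int_{-\infty}^{\infty} f(x)\, \delta(x^2 - 1)\, \mathrm{d} x &amp;= \frac{f(1) + f(-1)}{2}
\end{aligned}
$$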
&lt;h3 id="dirac-mixture">Dirac Mixture&lt;/h3>
$$
f(\underline{x})=\sum_{i=1}^{L} w_{i} \delta(\underline{x}-\underline{\hat{x}}_i)
$$</description></item><item><title>Wertediskrete Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/wertediskrete_sys/</link><pubDate>Sat, 20 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/wertediskrete_sys/</guid><description>&lt;h2 id="wonham-filter">Wonham Filter&lt;/h2>
&lt;p>State estimation for discrete-valued systems: &lt;strong>Wonham filter&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prediction&lt;/p>
$$
\underline{\xi}_{k}^{p}=\mathbf{A}^{\top} \underline{\xi}_{k-1}^{e}
$$
&lt;/li>
&lt;li>
&lt;p>Filtering&lt;/p>
$$
\underline{\xi}_{k}^{e} \overset{y_k = m}{=} \frac{\mathbf{B}(:, m) \odot \underline{\xi}_{k}^{p}}{\mathbf{B}(:, m)^\top \cdot \underline{\xi}_{k}^{p}}
$$
&lt;/li>
&lt;/ul>
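&lt;p>The two steps above can be sketched in a few lines of NumPy. The sketch assumes that $\mathbf{A}$ is stored with rows indexing the previous state and columns the next state, and that $\mathbf{B}$ has rows indexing the state and columns the measurement value. The function names and the two-state example numbers are hypothetical.&lt;/p>

```python
import numpy as np


def wonham_predict(A, xi_e):
    """Prediction step: xi_p = A^T xi_e, with A[i, j] = P(x_k = j | x_{k-1} = i)."""
    return A.T @ xi_e


def wonham_filter(B, xi_p, m):
    """Filter step for the measurement y_k = m, with B[i, m] = P(y_k = m | x_k = i)."""
    unnormalized = B[:, m] * xi_p               # element-wise (Hadamard) product
    return unnormalized / unnormalized.sum()    # the sum equals B[:, m]^T xi_p


# Hypothetical two-state example.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])
xi_e0 = np.array([0.5, 0.5])

xi_p1 = wonham_predict(A, xi_e0)
xi_e1 = wonham_filter(B, xi_p1, m=1)
```

&lt;p>Both &lt;code>xi_p1&lt;/code> and &lt;code>xi_e1&lt;/code> remain valid probability vectors: prediction preserves the sum because each row of $\mathbf{A}$ sums to one, and filtering renormalizes explicitly.&lt;/p>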
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Üb 4, A2&lt;/span>
&lt;/div>
&lt;p>Derivation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prediction $P(x_k \mid y_{1:m}, u_{0:k-1})$
for $k > m$&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Marginalize over $x_{k-1}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apply Bayes (product rule)&lt;/p>
$$
P(a, b \mid c) = P(a \mid b, c) \cdot P(b \mid c) \qquad (\ast)
$$
&lt;/li>
&lt;li>
&lt;p>Use the Markov property&lt;/p>
&lt;/li>
&lt;/ol>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-22%2010.14.29.png" alt="截屏2022-08-22 10.14.29" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Filtering: $P\left(x_{k} \mid y_{1: k}, u_{0: k-1}\right)$
&lt;/p>
&lt;ol>
&lt;li>
$P\left(x_{k} \mid y_{1: k}, u_{0: k-1}\right) = P(x_{k} \mid y_k, y_{1: k-1}, u_{0: k-1})$
&lt;/li>
&lt;li>
&lt;p>Apply Bayes&lt;/p>
$$
P(b \mid a, c) \cdot P(a \mid c)=P(a \mid b, c) \cdot P(b \mid c) \quad (\triangle)
$$
&lt;/li>
&lt;li>
&lt;p>Write in the form $\frac{\text{likelihood} \cdot \text{prediction}}{\text{normalization constant}}$
&lt;/p>
$$
P\left(x_{k} \mid y_{1: k}, u_{0: k-1}\right) = \frac{\overbrace{P\left(y_{k} \mid x_{k}\right)}^{\text{likelihood}} \cdot \overbrace{P\left(x_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}^{\text{one-step prediction}}}{\underbrace{P\left(y_{k} \mid y_{1: k-1}, u_{0: k-1}\right)}_{\text{normalization constant}}}
$$
&lt;ul>
&lt;li>
&lt;p>Likelihood: $P\left(y_{k} \mid x_{k}\right) = \mathbf{B}(x_k, y_k)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The prediction is obtained from the prediction step&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Normalization constant&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Marginalize over $x_k$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apply Bayes (product rule)&lt;/p>
$$
P(a, b \mid c) = P(a \mid b, c) \cdot P(b \mid c) \qquad (\ast)
$$
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;h2 id="komplexitätsproblem-bei-der-diskretisierung-eines-allgemeinen-systems">Komplexitätsproblem bei der Diskretisierung eines allgemeinen Systems&lt;/h2>
&lt;p>Riesiger Speicherbedarf von Wahrscheinlichkeitsvektor und Transitionsmatrix&lt;/p></description></item><item><title>Wertekontinuierliche lineare Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/wertekont_lin_sys/</link><pubDate>Mon, 22 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/wertekont_lin_sys/</guid><description>&lt;h2 id="kalman-filter">Kalman Filter&lt;/h2>
&lt;p>&lt;strong>Prediction&lt;/strong>&lt;/p>
$$
\underline{\hat{x}}_k^p = \mathbf{A}_{k-1}\underline{\hat{x}}_{k-1}^e + \mathbf{B}_{k-1} \underline{\hat{u}}_{k-1}
$$
$$
\mathbf{C}_k^p = \mathbf{A}_{k-1} \mathbf{C}_{k-1}^e \mathbf{A}_{k-1}^\top + \mathbf{B}_{k-1} \mathbf{C}_{k-1}^w \mathbf{B}_{k-1}^\top
$$
&lt;p>&lt;strong>Filtering&lt;/strong>&lt;/p>
$$
\mathbf{K}_k = \mathbf{C}_k^p \mathbf{H}_k^\top (\mathbf{C}_k^v + \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k ^\top)^{-1}
\tag{Kalman Gain}
$$
$$
\underline{\hat{x}}_k^e = (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k) \underline{\hat{x}}_k^p + \mathbf{K}_k \underline{\hat{y}}_k = \underline{\hat{x}}_k^p + \mathbf{K}_k(\underline{\hat{y}}_k - \mathbf{H}_k \underline{\hat{x}}_k^p)
$$
$$
\mathbf{C}_k^e = (\mathbf{I} - \mathbf{K}_k\mathbf{H}_k)\mathbf{C}_k^p = \mathbf{C}_k^p - \mathbf{C}_k^p \mathbf{H}_k^\top (\mathbf{C}_k^v + \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k ^\top)^{-1}\mathbf{H}_k \mathbf{C}_k^p
$$
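&lt;p>The prediction and filter equations above map almost one-to-one onto NumPy. The following is a minimal sketch under the stated model; the function names &lt;code>kf_predict&lt;/code> and &lt;code>kf_update&lt;/code> and the scalar example values are assumptions for illustration, and the explicit matrix inverse is used only because it mirrors the formula.&lt;/p>

```python
import numpy as np


def kf_predict(x_e, C_e, A, B, u, C_w):
    """Prediction: propagate the mean and covariance through the linear system model."""
    x_p = A @ x_e + B @ u
    C_p = A @ C_e @ A.T + B @ C_w @ B.T
    return x_p, C_p


def kf_update(x_p, C_p, y, H, C_v):
    """Filter step: fuse the prediction with the measurement y."""
    S = C_v + H @ C_p @ H.T                  # covariance of the predicted measurement
    K = C_p @ H.T @ np.linalg.inv(S)         # Kalman gain
    x_e = x_p + K @ (y - H @ x_p)
    C_e = (np.eye(len(x_p)) - K @ H) @ C_p
    return x_e, C_e, K


# Hypothetical scalar example, written with 1x1 matrices.
A = np.array([[1.0]])
B = np.array([[1.0]])
H = np.array([[1.0]])
C_w = np.array([[1.0]])
C_v = np.array([[2.0]])

x_p, C_p = kf_predict(np.array([0.0]), np.array([[1.0]]), A, B, np.array([0.0]), C_w)
x_e1, C_e1, K = kf_update(x_p, C_p, np.array([4.0]), H, C_v)
```

&lt;p>In this example the predicted variance is 2, the gain is 0.5, and the estimate lands halfway between the prediction 0 and the measurement 4, with the variance reduced to 1.&lt;/p>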
&lt;h2 id="kalman-filter-vektoriell-herleiten">Deriving the Kalman filter (vector case)&lt;/h2>
&lt;h3 id="prädiktion">Prediction&lt;/h3>
&lt;p>System model&lt;/p>
$$
\underline{x}_{k+1}=\mathbf{A}_{k} \cdot \underline{x}_{k}+\mathbf{B}_{k} \cdot \underbrace{\left(\underline{\tilde{u}}_{k}+\underline{w}_{k}\right)}_{\underline{u}_k}
$$
&lt;p>Steps&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Compute the expected value for time step $k+1$&lt;/p>
$$
E\left\{\underline{x}_{k+1}\right\}=\mathbf{A}_{k} \cdot \underline{\hat{x}}_{k|1: m}+\mathbf{B}_{k} \tilde{\underline{u}}_{k} \qquad (+)
$$
&lt;/li>
&lt;li>
&lt;p>Compute the covariance matrix $\mathbf{C}_{k+1|1:m}^x$
under the assumption that the state and the system noise are uncorrelated&lt;/p>
$$
\begin{aligned}
\underline{x}_{k+1} &amp;=\mathbf{A}_{k} \underline{x}_{k}+\mathbf{B}_{k} \underline{u}_{k} \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]
\end{aligned}
$$
&lt;ol>
&lt;li>
&lt;p>Compute $\operatorname{Cov}\left\{\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]\right\}$
&lt;/p>
$$
\begin{aligned}
\underline{x}_{k+1}-\hat{\underline{x}}_{k+1} &amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k}-\hat{\underline{x}}_{k} \\
\underline{u}_{k}-\underline{\hat{u}}_{k}
\end{array}\right] \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right]\left[\begin{array}{c}
\underline{x}_{k}-\underline{\hat{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]
\end{aligned}
$$
$$
\begin{aligned}
\operatorname{Cov}\left\{\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]\right\} &amp;=E\left\{\left[\begin{array}{c}
\underline{x}_{k}-\underline{\hat{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]\left[\left(\underline{x}_{k}-\underline{\hat{x}}_{k}\right)^{\top} \underline{w}_{k}^{\top}\right]\right\} \\
&amp;=\left[\begin{array}{cc}
C_{k \mid 1: m}^{x} &amp; 0 \\
0 &amp; C_{k}^{w}
\end{array}\right]
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Substitute $\operatorname{Cov}\left\{\left[\begin{array}{c}
\underline{x}_{k} \\
\underline{u}_{k}
\end{array}\right]\right\}$
into the computation of $\mathbf{C}_{k+1|1:m}^x$
&lt;/p>
$$
\begin{aligned}
\mathbf{C}_{k+1 \mid 1 : m}^{x} &amp;=E\left\{\left(\underline{x}_{k+1}-\hat{\underline{x}}_{k+1}\right)\left(\underline{x}_{k+1} - \hat{\underline{x}}_{k+1}\right)^\top\right\} \\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right] \cdot E\left\{\left[\begin{array}{c}
\underline{x}_{k}-\hat{\underline{x}}_{k} \\
\underline{w}_{k}
\end{array}\right]\left[\begin{array}{ll}
\left(\underline{x}_{k}-\hat{\underline{x}}_{k}\right)^{\top} &amp; \underline{w}_{k}^{\top}
\end{array}\right]\right\} \cdot\left[\begin{array}{l}
\mathbf{A}_{k}^{\top} \\
\mathbf{B}_{k}^{\top}
\end{array}\right] \\\\
&amp;=\left[\begin{array}{ll}
\mathbf{A}_{k} &amp; \mathbf{B}_{k}
\end{array}\right] \cdot\left[\begin{array}{cc}
\mathbf{C}_{k \mid 1:m}^{x} &amp; 0 \\
0 &amp; \mathbf{C}_{k}^{w}
\end{array}\right] \cdot\left[\begin{array}{l}
\mathbf{A}_{k}^{\top} \\
\mathbf{B}_{k}^{\top}
\end{array}\right] \\
&amp;=\mathbf{A}_{k} \cdot \mathbf{C}_{k \mid 1: m}^{x} \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top} \qquad(++)
\end{aligned}
$$
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ol>
&lt;h3 id="filterung">Filtering&lt;/h3>
&lt;p>Measurement model&lt;/p>
$$
\underline{y}_{k}=\mathbf{H}_{k} \cdot \underline{x}_{k}+\underline{v}_{k}
$$
&lt;p>Steps&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Write $\underline{x}_k^e$ as a linear combination of $\underline{x}_k^p$ and $\underline{y}_k$&lt;/p>
$$
\underline{x}_{k}^e=\mathbf{K}_{k}^{(1)} \underline{x}_{k}^p+\mathbf{K}_{k}^{(2)} \underline{y}_{k}
$$
&lt;/li>
&lt;li>
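&lt;p>Requiring the estimator to be unbiased (BLUE filter), with $E\{\underline{x}_{k}^p\}=E\{\underline{x}_{k}\}$ and zero-mean measurement noise, i.e. $E\{\underline{y}_{k}\}=\mathbf{H}_{k} E\{\underline{x}_{k}\}$, gives&lt;/p>
$$
E\{\underline{x}_{k}^e\}=E\{\mathbf{K}_{k}^{(1)} \underline{x}_{k}^p+\mathbf{K}_{k}^{(2)} \underline{y}_{k}\}=\left(\mathbf{K}_{k}^{(1)}+\mathbf{K}_{k}^{(2)} \mathbf{H}_{k}\right) E\{\underline{x}_{k}\} \overset{!}{=} E\{\underline{x}_{k}\}
$$
&lt;p>$\Rightarrow$&lt;/p>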
$$
\begin{aligned}
\mathbf{K}_{k}^{(1)} &amp;= \mathbf{I} - \mathbf{K}_{k}\mathbf{H}_{k} \\
\mathbf{K}_{k}^{(2)} &amp;= \mathbf{K}_{k}
\end{aligned}
$$
&lt;p>and&lt;/p>
$$
\underline{x}_{k}^e=(\mathbf{I} - \mathbf{K}_{k}\mathbf{H}_{k}) \underline{x}_{k}^p+\mathbf{K}_{k} \underline{y}_{k}
$$
&lt;/li>
&lt;li>
&lt;p>Compute the covariance matrix $\mathbf{C}_k^e$&lt;/p>
$$
\mathbf{C}_{k}^{e}\left(\mathbf{K}_{k}\right)=\operatorname{Cov}\{\underline{x}_k^e - \underline{x}_k\} = \left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right) \mathbf{C}_{k}^{p}\left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right)^{\top}+\mathbf{K}_{k} \mathbf{C}_{k}^{v} \mathbf{K}_{k}^{\top}
$$
&lt;p>We seek $\mathbf{K}_{k}$ such that the resulting estimator has minimal covariance.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Reduce the problem to a scalar quality measure (with an arbitrary unit vector $\underline{e}$)&lt;/p>
$$
P(\mathbf{K}_{k}) = \underline{e}^\top \left( \left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right) \mathbf{C}_{k}^{p}\left(\mathbf{I}-\mathbf{K}_{k} \mathbf{H}_{k}\right)^{\top}+\mathbf{K}_{k} \mathbf{C}_{k}^{v} \mathbf{K}_{k}^{\top}\right) \underline{e}
$$
&lt;/li>
&lt;li>
$\frac{\partial}{\partial \mathbf{K}_{k}} P(\mathbf{K}_{k})\overset{!}{=} 0 \Rightarrow$
$$
\mathbf{K}_k = \mathbf{C}_k^p \mathbf{H}_k^\top (\mathbf{C}_k^v + \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k^\top)^{-1}
$$
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>Substitute $\mathbf{K}_k$ into $\underline{x}_{k}^e$ and $\mathbf{C}_{k}^{e}$&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="ergebnis-von-gauß-mal-gauß">Result of &amp;ldquo;Gaussian times Gaussian&amp;rdquo;&lt;/h2>
&lt;h2 id="drei-gütemaße-für-die-größe-einer-kovarianzmatrix">Three quality measures for the &amp;ldquo;size&amp;rdquo; of a covariance matrix&lt;/h2>
&lt;p>Possible quality measures for comparing covariance matrices in general:&lt;/p>
$$
f: \mathbb{R}^{n \times n} \to \mathbb{R}^1
$$
&lt;p>A function that maps a covariance matrix to a scalar, because only scalars can be compared directly.&lt;/p>
&lt;p>Three quality measures:&lt;/p>
&lt;ul>
&lt;li>Projection onto a unit vector&lt;/li>
&lt;li>Trace&lt;/li>
&lt;li>Determinant&lt;/li>
&lt;/ul>
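&lt;p>Each of the three measures is a one-liner in NumPy. The sketch below evaluates them for a made-up $2 \times 2$ covariance matrix; the matrix and the choice of unit vector are assumptions for illustration.&lt;/p>

```python
import numpy as np

# Hypothetical covariance matrix and unit vector defining the projection direction.
C = np.array([[4.0, 1.0],
              [1.0, 2.0]])
e = np.array([1.0, 0.0])

projection = float(e @ C @ e)            # variance along the direction e
trace = float(np.trace(C))               # sum of the variances on the diagonal
determinant = float(np.linalg.det(C))    # product of the eigenvalues
```

&lt;p>The projection depends on the chosen direction, whereas trace and determinant summarize the whole matrix: the trace adds up the individual variances, and the determinant is related to the volume of the uncertainty ellipsoid.&lt;/p>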
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-22%2012.31.16.png" alt="截屏2022-08-22 12.31.16" style="zoom:67%;" /></description></item><item><title>Schwach nichtlineare wertekontinuierliche Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/schwach_nichtlin_wertkont_sys/</link><pubDate>Wed, 24 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/schwach_nichtlin_wertkont_sys/</guid><description>&lt;h2 id="lineare-vs-nichtlineare-systeme">Lineare Vs. Nichtlineare Systeme&lt;/h2>
&lt;style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
.tg .tg-7btt{border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
&lt;/style>
&lt;table class="tg">
&lt;thead>
&lt;tr>
&lt;th class="tg-c3ow">&lt;/th>
&lt;th class="tg-7btt">Linear&lt;/th>
&lt;th class="tg-7btt">Nonlinear&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tg-7btt">System model&lt;/td>
&lt;td class="tg-c3ow">$\underline{x}_{k+1} = \mathbf{A}_k \underline{x}_k + \mathbf{B}_k (\underline{u}_k + \underline{w}_k)$&lt;/td>
&lt;td class="tg-c3ow">$\underline{x}_{k+1} = \underline{a}_k(\underline{x}_k, \underline{u}_k, \underline{w}_k)$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-7btt">Measurement model&lt;/td>
&lt;td class="tg-c3ow">$\underline{y}_{k} = \mathbf{H}_k \underline{x}_k + \underline{v}_k$&lt;/td>
&lt;td class="tg-c3ow">$\underline{y}_k = \underline{h}_k (\underline{x}_k, \underline{v}_k)$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="extended-kalman-filter-ekf">Extended Kalman Filter (EKF)&lt;/h2>
&lt;p>💡 Idea: linearize with a first-order Taylor expansion around the best available estimate so that the (linear) Kalman filter can be applied.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>System model&lt;/p>
$$
\underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right) \approx \underbrace{\underline{a}_{k}\left(\underline{\overline{x}}_k, \underline{\overline{u}}_k\right)}_{\text{nominal part}}+\underbrace{\mathbf{A}_{k}\left(\underline{x}_k-\underline{\overline{x}}_k\right)+\mathbf{B}_{k}\left(\underline{u}_{k}-\underline{\overline{u}}_k\right)}_{\text{differential part}}
$$
&lt;/li>
&lt;li>
&lt;p>Measurement model&lt;/p>
$$
\underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right) \approx \underbrace{\underline{h}_{k}\left(\underline{\bar{x}}_{k}, \underline{\bar{v}}_{k}\right)}_{\text{nominal part}}+ \underbrace{\mathbf{H}_{k} \cdot \left(\underline{x}_{k}-\underline{\bar{x}}_{k}\right)+\mathbf{L}_{k} \cdot\left(\underline{v}_{k}-\underline{\bar{v}}_{k}\right)}_{\text{differential part}}
$$
&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 7, Problem 2&lt;/span>
&lt;/div>
&lt;p>Prediction&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Compute the expected value by evaluating the nonlinear function&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p}=\underline{a}_{k}\left(\underline{\hat{x}}_{k}^{e}, \hat{\underline{u}}_{k}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Compute the covariance matrix via the linearization&lt;/p>
$$
\mathbf{C}_{k+1}^{p} \approx \mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{C}_{k}^{w^{\prime}}=\mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top}
$$
&lt;p>with&lt;/p>
$$
\mathbf{A}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}} \qquad
\mathbf{B}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{u}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}}
$$
&lt;/li>
&lt;/ul>
&lt;p>Filtering&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Linearization around $\underline{\hat{x}}_k^p$ and $\underline{\hat{v}}_k$&lt;/p>
$$
\mathbf{H}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}} \qquad
\mathbf{L}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{v}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}}
$$
&lt;/li>
&lt;li>
&lt;p>KF filter step with the linearization&lt;/p>
$$
\begin{aligned}
\mathbf{K}_{k}&amp;=\mathbf{C}_{k}^{p} \mathbf{H}_{k}^{\top}\left(\mathbf{L}_{k} \mathbf{C}_{k}^{v} \mathbf{L}_{k}^{\top}+\mathbf{H}_{k} \mathbf{C}_{k}^{p} \mathbf{H}_{k}^{\top}\right)^{-1} \\\\
\hat{\underline{x}}_{k}^{e}&amp;=\hat{\underline{x}}_{k}^{p}+\mathbf{K}_{k}\left[\hat{\underline{y}}_{k}-\underline{h}_{k}\left(\hat{\underline{x}}_{k}^{p}, \hat{\underline{v}}_{k}\right)\right] \overset{\underline{v} \text{ zero-mean}}{=} \hat{\underline{x}}_{k}^{p}+\mathbf{K}_{k}\left[\hat{\underline{y}}_{k}-\underline{h}_{k}\left(\hat{\underline{x}}_{k}^{p}, 0\right)\right]\\\\
\mathbf{C}_{k}^{e}&amp;=\mathbf{C}_{k}^{p}-\mathbf{K}_{k} \mathbf{H}_{k} \mathbf{C}_{k}^{p} = (\mathbf{I} - \mathbf{K}_{k} \mathbf{H}_{k})\mathbf{C}_{k}^{p}
\end{aligned}
$$
&lt;/li>
&lt;/ul>
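&lt;p>The prediction and filter steps above can be sketched for a scalar toy problem. This is only an illustration under stated assumptions, not taken from the lecture: the system function $a(x, u) = \sin(x) + u$ with a noisy input, the measurement function $h(x, v) = x^2 + v$ with zero-mean noise, and the name &lt;code>ekf_step&lt;/code> are hypothetical and were chosen so that the Jacobians can be read off directly.&lt;/p>

```python
import math


def ekf_step(x_e, c_e, u_hat, y_hat, c_w, c_v):
    """One scalar EKF cycle (prediction, then filtering) for a toy model.

    Assumed model (hypothetical, for illustration only):
        x_{k+1} = a(x_k, u_k) = sin(x_k) + u_k, input noise variance c_w
        y_k     = h(x_k, v_k) = x_k**2 + v_k,   zero-mean noise, variance c_v
    """
    # Prediction: mean through the nonlinear function, variance via Jacobians
    a_jac = math.cos(x_e)  # A_k = da/dx evaluated at the last estimate
    b_jac = 1.0            # B_k = da/du
    x_p = math.sin(x_e) + u_hat
    c_p = a_jac * c_e * a_jac + b_jac * c_w * b_jac

    # Filtering: linearize h around the predicted estimate
    h_jac = 2.0 * x_p      # H_k = dh/dx evaluated at x_p
    l_jac = 1.0            # L_k = dh/dv
    gain = c_p * h_jac / (l_jac * c_v * l_jac + h_jac * c_p * h_jac)
    x_new = x_p + gain * (y_hat - x_p ** 2)  # noise mean is zero
    c_new = (1.0 - gain * h_jac) * c_p
    return x_new, c_new, x_p, c_p
```

&lt;p>Calling &lt;code>ekf_step(0.0, 1.0, 1.0, 1.2, 0.1, 0.5)&lt;/code> runs one cycle. Note that the gain depends on the current estimate through the Jacobians, which is why the linearized system is time-variant.&lt;/p>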
&lt;h2 id="linear-kf-vs-ekf">(Linear) KF vs. EKF&lt;/h2>
&lt;style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-1wig{font-weight:bold;text-align:left;vertical-align:top}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
.tg .tg-fymr{border-color:inherit;font-weight:bold;text-align:left;vertical-align:top}
.tg .tg-0lax{text-align:left;vertical-align:top}
&lt;/style>
&lt;table class="tg">
&lt;thead>
&lt;tr>
&lt;th class="tg-0pky">&lt;/th>
&lt;th class="tg-fymr">(Linear) KF&lt;/th>
&lt;th class="tg-fymr">EKF&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tg-fymr">Prediction&lt;/td>
&lt;td class="tg-0pky">$\underline{\hat{x}}_k^p = \mathbf{A}_{k-1}\underline{\hat{x}}_{k-1}^e + \mathbf{B}_{k-1} \underline{\hat{u}}_{k-1}$&lt;br>$\mathbf{C}_k^p = \mathbf{A}_{k-1} \mathbf{C}_{k-1}^e \mathbf{A}_{k-1}^\top + \mathbf{B}_{k-1} \mathbf{C}_{k-1}^w \mathbf{B}_{k-1}^\top$&lt;/td>
&lt;td class="tg-0pky">$\underline{\hat{x}}_{k+1}^{p}=\underline{a}_{k}\left(\underline{\hat{x}}_{k}^{e}, \hat{\underline{u}}_{k}\right)$&lt;br>$\mathbf{C}_{k+1}^{p} \approx \mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{C}_{k}^{w^{\prime}}=\mathbf{A}_{k} \mathbf{C}_{k}^{e} \mathbf{A}_{k}^{\top}+\mathbf{B}_{k} \mathbf{C}_{k}^{w} \mathbf{B}_{k}^{\top}$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-fymr">Filtering&lt;/td>
&lt;td class="tg-0pky">$\mathbf{K}_k = \mathbf{C}_k^p \mathbf{H}_k^\top (\mathbf{C}_k^v + \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k ^\top)^{-1}$&lt;br>$\underline{\hat{x}}_k^e = \underline{\hat{x}}_k^p + \mathbf{K}_k(\underline{\hat{y}}_k - \mathbf{H}_k \underline{\hat{x}}_k^p)$&lt;br>$\mathbf{C}_k^e = (\mathbf{I} - \mathbf{K}_k\mathbf{H}_k)\mathbf{C}_k^p$&lt;/td>
&lt;td class="tg-0pky">$\begin{aligned}&lt;br> \mathbf{K}_{k}&amp;amp;=\mathbf{C}_{k}^{p} \mathbf{H}_{k}^{\top}\left(\mathbf{L}_{k} \mathbf{C}_{k}^{v} \mathbf{L}_{k}^{\top}+\mathbf{H}_{k} \mathbf{C}_{k}^{p} \mathbf{H}_{k}^{T}\right)^{-1} \\&lt;br> \hat{\underline{x}}_{k}^{e}&amp;amp;=\hat{\underline{x}}_{k}^{p}+\mathbf{K}_{k}\left[\hat{\underline{y}}_{k}-\underline{h}_{k}\left(\hat{\underline{x}}_{k}^{p}, \hat{\underline{v}}_{k}\right)\right] \\&lt;br> \mathbf{C}_{k}^{e}&amp;amp;=\mathbf{C}_{k}^{p}-\mathbf{K}_{k} \mathbf{H}_{k} \mathbf{C}_{k}^{p} = (\mathbf{I} - \mathbf{K}_{k} \mathbf{H}_{k})\mathbf{C}_{k}^{p}&lt;br> \end{aligned}$&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tg-1wig">Auxiliary&lt;/td>
&lt;td class="tg-0lax">&lt;/td>
&lt;td class="tg-0lax">$\mathbf{A}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}}$&lt;br>$\mathbf{B}_k = \left.\frac{\partial \underline{a}_{k}\left(\underline{x}_{k}, \underline{u}_{k}\right)}{\partial \underline{u}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{e}, \underline{u}_{k}=\hat{\underline{u}}_{k}}$&lt;br>$\mathbf{H}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{x}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}}$&lt;br>$\mathbf{L}_{k}=\left.\frac{\partial \underline{h}_{k}\left(\underline{x}_{k}, \underline{v}_{k}\right)}{\partial \underline{v}_{k}^{\top}}\right|_{\underline{x}_{k}=\underline{\hat{x}}_{k}^{p}, \underline{v}_{k}=\underline{\hat{v}}_{k}}$&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="probleme">Problems&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The posterior distribution is approximated well only for “weak” nonlinearities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The linearization is performed around a single point only&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The linearized system is in general time-variant even if the original system is time-invariant, because the linearization depends on the estimate.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="kalman-filter-in-probabilistischer-form">Kalman filter in probabilistic form&lt;/h2>
&lt;p>&lt;strong>Filtering&lt;/strong>&lt;/p>
&lt;p>(Assumption: $\underline{x}_k$
and $\underline{y}_k$
are jointly Gaussian)&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Define $\underline{z}:=\left[\begin{array}{l}
\underline{x} \\
\underline{y}
\end{array}\right]$
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compute the mean and covariance of $\underline{z}$.&lt;/p>
$$
\underline{\mu}_z=\left[\begin{array}{l}
\underline{\mu}_x \\
\underline{\mu}_y
\end{array}\right]=\frac{1}{L}\sum_{i=1}^L\left[\begin{array}{l}
\underline{x}_i \\
\underline{y}_i
\end{array}\right], \quad \mathbf{C}_{z} = \frac{1}{L}\sum_{i=1}^L(\underline{z}_i - \underline{\mu}_z)(\underline{z}_i - \underline{\mu}_z)^\top = \left[\begin{array}{ll}
\mathbf{C}_{x x} &amp; \mathbf{C}_{x y} \\
\mathbf{C}_{y x} &amp; \mathbf{C}_{y y}
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Filtering in probabilistic form with the measurement $\hat{\underline{y}}$
&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_k^e &amp;= \underline{\hat{x}}_k^p + \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} (\underline{\hat{y}} - \underline{\mu}_y) \\
\mathbf{C}_k^e &amp;= \mathbf{C}_k^p - \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} \mathbf{C}_{yx}
\end{aligned}
$$
&lt;/li>
&lt;/ol>
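&lt;p>As a consistency check, the probabilistic form can be specialized to the linear measurement model $\underline{y}_k = \mathbf{H}_k \underline{x}_k + \underline{v}_k$. The sketch below assumes zero-mean measurement noise that is independent of the state. Under these assumptions the joint moments are available in closed form and the update reduces to the standard Kalman filter step:&lt;/p>
$$
\begin{aligned}
\underline{\mu}_y &amp;= \mathbf{H}_k \underline{\hat{x}}_k^p \\
\mathbf{C}_{xy} &amp;= \mathbf{C}_k^p \mathbf{H}_k^\top, \qquad \mathbf{C}_{yx} = \mathbf{H}_k \mathbf{C}_k^p \\
\mathbf{C}_{yy} &amp;= \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k^\top + \mathbf{C}_k^v \\
\mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} &amp;= \mathbf{C}_k^p \mathbf{H}_k^\top \left(\mathbf{C}_k^v + \mathbf{H}_k \mathbf{C}_k^p \mathbf{H}_k^\top\right)^{-1} = \mathbf{K}_k \\
\mathbf{C}_k^e &amp;= \mathbf{C}_k^p - \mathbf{K}_k \mathbf{H}_k \mathbf{C}_k^p
\end{aligned}
$$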
&lt;h3 id="unscented-kalman-filter-ukf">Unscented Kalman Filter (UKF)&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 7, Problem 3&lt;/span>
&lt;/div>
&lt;p>Unscented principles&lt;/p>
&lt;ul>
&lt;li>Nonlinearly transforming a single point is easy.&lt;/li>
&lt;li>It is easy to find a point cloud whose sample mean and sample variance match the moments of the given density.&lt;/li>
&lt;li>It is easy to determine the mean and variance of a point cloud.&lt;/li>
&lt;/ul>
&lt;p>Example: additive noise&lt;/p>
$$
\begin{aligned}
\underline{x}_{k+1} &amp;= \underline{a}_{k}(\underline{x}_{k}) + \underline{w}_{k} \\
\underline{y}_{k} &amp;= \underline{h}_{k}(\underline{x}_{k}) + \underline{v}_{k}
\end{aligned}
$$
&lt;p>&lt;strong>Prediction&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Propagate the samples/particles/points&lt;/p>
$$
\underline{x}_{k}^{p, i} = \underline{a}_{k-1}(\underline{x}_{k-1}^{e, i})
$$
&lt;/li>
&lt;li>
&lt;p>Compute the mean and covariance from the samples&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_{k}^p &amp;= \frac{1}{L} \sum_{i=1}^L \underline{x}_{k}^{p, i} \\
\mathbf{C}_k^p &amp;= \frac{1}{L} \sum_{i=1}^L (\underline{x}_{k}^{p, i} - \underline{\hat{x}}_{k}^p) (\underline{x}_{k}^{p, i} - \underline{\hat{x}}_{k}^p)^\top + \mathbf{C}_k^w
\end{aligned}
$$
&lt;/li>
&lt;/ol>
&lt;p>&lt;strong>Filtering&lt;/strong>&lt;/p>
&lt;ol start="0">
&lt;li>
&lt;p>Sampling:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>For the prior estimate: $2N$ or $2N+1$ samples on the principal axes for dimension $N$&lt;/p>
&lt;blockquote>
&lt;p>Example: in the scalar case ($N=1$), 2 samples:&lt;/p>
$$
x_1 = \mu_p + \sigma_p \quad x_2 = \mu_p - \sigma_p
$$
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>Analogously for samples of the measurement noise&lt;/p>
&lt;blockquote>
&lt;p>Example: in the scalar case ($N=1$), 2 samples:&lt;/p>
$$
v_1 = \mu_v + \sigma_v \quad v_2 = \mu_v - \sigma_v
$$
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Propagate the points&lt;/p>
$$
\underline{y}_{k}^{p, i} = \underline{h}_{k}(\underline{x}_{k}^{p, i})
$$
&lt;p>or, when the measurement noise is sampled as well,&lt;/p>
$$
\underline{y}_{k}^{i, j} = \underline{h}_{k}(\underline{x}_{k}^{p, i}, \underline{v}_k^j)
$$
&lt;/li>
&lt;li>
&lt;p>Construct the joint space $\underline{z}=\left[\begin{array}{l}
\underline{x} \\
\underline{y}
\end{array}\right]$
(assumption: $\underline{x}_k$
and $\underline{y}_k$
are jointly Gaussian).
Compute the mean and covariance of $\underline{z}$.&lt;/p>
$$
\underline{\mu}_z=\left[\begin{array}{l}
\underline{\mu}_x \\
\underline{\mu}_y
\end{array}\right]=\frac{1}{L}\sum_{i=1}^L\left[\begin{array}{l}
\underline{x}_i \\
\underline{y}_i
\end{array}\right], \quad \mathbf{C}_{z} = \frac{1}{L}\sum_{i=1}^L(\underline{z}_i - \underline{\mu}_z)(\underline{z}_i - \underline{\mu}_z)^\top = \left[\begin{array}{ll}
\mathbf{C}_{x x} &amp; \mathbf{C}_{x y} \\
\mathbf{C}_{y x} &amp; \mathbf{C}_{y y}
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Filtering in probabilistic form with the measurement $\hat{\underline{y}}$
&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_k^e &amp;= \underline{\hat{x}}_k^p + \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} (\underline{\hat{y}} - \underline{\mu}_y) \\
\mathbf{C}_k^e &amp;= \mathbf{C}_k^p - \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} \mathbf{C}_{yx}
\end{aligned}
$$
&lt;/li>
&lt;/ol>
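&lt;p>The filter steps above can be sketched for the scalar case with two samples and additive noise. This is a minimal illustration under assumptions that are not in the notes: the function name &lt;code>ukf_filter_step&lt;/code> is hypothetical, the measurement function is passed in by the caller, and the noise variance is added to the sample variance of the propagated points, in analogy to how $\mathbf{C}_k^w$ is added in the prediction step.&lt;/p>

```python
import math


def ukf_filter_step(mu_p, var_p, y_hat, var_v, h):
    """Scalar UKF filter step with two samples and additive noise.

    mu_p, var_p: predicted mean and variance of the state
    y_hat:       actual measurement
    var_v:       variance of the zero-mean additive measurement noise
    h:           measurement function (supplied by the caller)
    """
    sigma_p = math.sqrt(var_p)
    xs = [mu_p + sigma_p, mu_p - sigma_p]  # samples on the principal axis
    ys = [h(x) for x in xs]                # propagate every point through h
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Blocks of the joint covariance of z = [x, y]
    c_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    c_yy = sum((y - mu_y) ** 2 for y in ys) / n + var_v
    # Filtering in probabilistic form
    x_e = mu_p + c_xy / c_yy * (y_hat - mu_y)
    var_e = var_p - c_xy / c_yy * c_xy
    return x_e, var_e
```

&lt;p>For a linear measurement function such as &lt;code>lambda x: 2.0 * x&lt;/code> the result coincides with the linear Kalman filter step, which is a quick way to sanity-check the sketch.&lt;/p>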
&lt;h4 id="sampling">Sampling&lt;/h4>
&lt;p>Samples are placed on the principal axes only: $2N$ or $2N+1$ in total ($N$: number of dimensions)&lt;/p>
&lt;h4 id="vorteil-von-ukf-gegen-ekf">Advantages of the UKF over the EKF&lt;/h4>
&lt;ul>
&lt;li>The UKF may reduce the linearization error of the EKF&lt;/li>
&lt;li>No Jacobian matrices need to be computed 👏&lt;/li>
&lt;/ul>
&lt;h3 id="analytische-momente">Analytic moments&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 7, Problem 4&lt;/span>
&lt;/div>
&lt;ol>
&lt;li>
&lt;p>Construct the joint space $\underline{z}$&lt;/p>
$$
z := \left[\begin{array}{l}
x \\
y
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Compute the mean of $\underline{z}$ (using higher-order moments of the Gaussian density)&lt;/p>
$$
E\{\underline{z}\}=\left[\begin{array}{c}
\hat{x}_{p} \\
E\{h(x)\}
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Compute the difference between $h(x)$ and $E\{h(x)\}$&lt;/p>
$$
\bar{h}(x)=h(x)-E\{h(x)\}
$$
&lt;/li>
&lt;li>
&lt;p>Compute $\operatorname{Cov}\{\underline{z}\}$&lt;/p>
$$
\operatorname{Cov}\{\underline{z}\}=\left[\begin{array}{ll}
\mathbf{C}_{x x} &amp; \mathbf{C}_{x y} \\
\mathbf{C}_{y x} &amp; \mathbf{C}_{y y}
\end{array}\right]=\left[\begin{array}{cc}
\sigma_{p}^{2} &amp; E\left\{\left(x-\hat{x}_{p}\right) \bar{h}(x)\right\} \\
E\left\{\left(x-\hat{x}_{p}\right) \bar{h}(x)\right\} &amp; E\left\{\overline{h}^{2}(x)\right\}+\sigma_{v}^{2}
\end{array}\right]
$$
&lt;/li>
&lt;li>
&lt;p>Filtering in probabilistic form.&lt;/p>
$$
\begin{aligned}
\underline{\hat{x}}_k^e &amp;= \underline{\hat{x}}_k^p + \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} (\underline{\hat{y}} - \underline{\mu}_y) \\
\mathbf{C}_k^e &amp;= \mathbf{C}_k^p - \mathbf{C}_{xy} \mathbf{C}_{yy}^{-1} \mathbf{C}_{yx}
\end{aligned}
$$
&lt;/li>
&lt;/ol>
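&lt;p>A worked example may make the steps above concrete. It assumes, purely for illustration, the scalar measurement function $h(x) = x^2$ with additive zero-mean noise of variance $\sigma_v^2$ and a Gaussian prior $x \sim \mathcal{N}(\hat{x}_p, \sigma_p^2)$. Writing $e = x - \hat{x}_p$ and using the Gaussian moments $E\{e\} = 0$, $E\{e^2\} = \sigma_p^2$, $E\{e^3\} = 0$, $E\{e^4\} = 3\sigma_p^4$ gives&lt;/p>
$$
\begin{aligned}
E\{h(x)\} &amp;= \hat{x}_p^2 + \sigma_p^2 \\
\bar{h}(x) &amp;= 2 \hat{x}_p e + e^2 - \sigma_p^2 \\
\mathbf{C}_{xy} &amp;= E\{e \, \bar{h}(x)\} = 2 \hat{x}_p \sigma_p^2 \\
\mathbf{C}_{yy} &amp;= E\{\bar{h}^2(x)\} + \sigma_v^2 = 4 \hat{x}_p^2 \sigma_p^2 + 2 \sigma_p^4 + \sigma_v^2
\end{aligned}
$$
&lt;p>These moments are then inserted into the filter step in probabilistic form with $\underline{\mu}_y = E\{h(x)\}$.&lt;/p>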
&lt;h2 id="ensemble-kalman-filter-enkf">Ensemble Kalman Filter (EnKF)&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 6, Problem 4 (f)&lt;/span>
&lt;/div>
&lt;p>💡 Represent the uncertain estimate by the “spread” of a point cloud.&lt;/p>
&lt;p>As the “uncertain state”, use $L$ $N$-dimensional vectors as &lt;strong>samples&lt;/strong>&lt;/p>
$$
\mathcal{X}_{k}=[\underbrace{\underline{x}_{k, 1}}_{\mathbb{R}^N}, \underline{x}_{k, 2}, \ldots, \underline{x}_{k, L}] \in \mathbb{R}^{N \times L}, \quad \mathcal{W}_{k}=\left[\underline{w}_{k, 1}, \underline{w}_{k, 2}, \ldots, \underline{w}_{k, L}\right] \in \mathbb{R}^{N \times L}
$$
&lt;p>where the samples can be collected compactly as the columns of a matrix.&lt;/p>
&lt;p>&lt;strong>Prediction&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Nonlinear&lt;/p>
$$
\mathcal{X}_{k}^p = \underline{a}_{k-1}(\mathcal{X}_{k-1}^e, \underline{u}_{k-1}, \mathcal{W}_{k-1})
$$
&lt;/li>
&lt;li>
&lt;p>Linear&lt;/p>
$$
\mathcal{X}_{k}^p = \mathbf{A}_{k-1}\mathcal{X}_{k-1}^e + \mathbf{B}_{k-1}(\underline{u}_{k-1} + \mathcal{W}_{k-1})
$$
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Filtering&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Perform the filter step using ONLY the samples&lt;/li>
&lt;li>Avoid the update formulas for the covariance matrix (the uncertainty is represented purely by the samples)&lt;/li>
&lt;/ul>
&lt;p>Steps&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Compute the “predicted” measurement samples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Linear&lt;/p>
$$
\mathcal{Y}_k = \mathbf{H}_k \mathcal{X}_{k}^p + \mathcal{V}_{k}
$$
&lt;/li>
&lt;li>
&lt;p>Nonlinear&lt;/p>
$$
\mathcal{Y}_k = \underline{h}_k (\mathcal{X}_{k}^p, \mathcal{V}_{k})
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute the Kalman gain&lt;/p>
$$
\begin{aligned}
\mathbf{C}_{x y} &amp;=\frac{1}{L} \sum_{i=1}^{L} \underline{x}_{k, i}^{\mathrm{p}} \cdot \underline{y}_{k, i}^{\top} \\
&amp;=\frac{1}{L} \mathcal{X}_{k}^{\mathrm{p}} \cdot \mathcal{Y}_{k}^{\top} \\\\
\mathbf{C}_{y y} &amp;=\frac{1}{L} \sum_{i=1}^{L} \underline{y}_{k, i} \cdot \underline{y}_{k, i}^{\top} \\
&amp;=\frac{1}{L} \mathcal{Y}_{k} \cdot \mathcal{Y}_{k}^{\top} \\\\
\mathbf{K} &amp;=\mathbf{C}_{x y} \cdot \mathbf{C}_{y y}^{-1}
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Filter step with the actual measurement $\underline{\hat{y}}_k$&lt;/p>
$$
\mathcal{X}_{k}^e = \mathcal{X}_{k}^p + \mathbf{K} (\underline{\hat{y}}_k \cdot \underline{\mathbb{1}}^\top - \mathcal{Y}_k)
$$
&lt;/li>
&lt;/ol></description></item><item><title>Allgemeine Systeme</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/allg_sys/</link><pubDate>Thu, 25 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/allg_sys/</guid><description>&lt;h2 id="generatives-und-probabilistisches-modell">Generatives und probabilistisches Modell&lt;/h2>
&lt;p>For the derivations it is essential to apply the following property of the Dirac delta function:&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;ul>
&lt;li>$g(x_i) = 0$ (the $x_i$ are the roots of $g$, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;h3 id="mit-additivem-rauschen">With additive noise&lt;/h3>
&lt;p>Generative model:&lt;/p>
$$
z = a(x) + v \quad v \sim f_v(v)
$$
&lt;p>Probabilistic model:&lt;/p>
$$
f(z \mid x) = f_v(z - a(x))
$$
&lt;h3 id="mit-multiplikativem-rauschen">With multiplicative noise&lt;/h3>
&lt;p>Generative model:&lt;/p>
$$
z = x \cdot v \quad v \sim f_v(v)
$$
&lt;p>Probabilistic model:&lt;/p>
$$
f(z \mid x) = \frac{1}{|x|}f_v\left(\frac{z}{x}\right)
$$
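&lt;p>The multiplicative case follows from the Dirac property stated at the top of this section. As a short sketch, assuming $x \neq 0$ and noise $v$ that is independent of $x$: marginalizing over $v$ and treating $g(v) = z - x v$ as a function of $v$ with the single root $v_1 = z / x$ and $|g^\prime(v_1)| = |x|$ gives&lt;/p>
$$
\begin{aligned}
f(z \mid x) &amp;= \int \delta(z - x v) f_v(v) \, \mathrm{d}v \\
&amp;= \int \frac{1}{|x|} \delta\left(v - \frac{z}{x}\right) f_v(v) \, \mathrm{d}v \\
&amp;= \frac{1}{|x|} f_v\left(\frac{z}{x}\right)
\end{aligned}
$$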
&lt;h3 id="warum-lässt-sich-das-nur-bei-bestimmten-modellen-exakt-lösen">Why can this be solved exactly only for certain models?&lt;/h3>
&lt;p>&amp;ldquo;For the general generative model, where the noise enters the system in an arbitrary fashion.&amp;rdquo; (Script P149)&lt;/p>
&lt;h2 id="abstraktion">Abstraction&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-27%2010.48.03.png" alt="Screenshot 2022-08-27 10.48.03" style="zoom: 33%;" />
&lt;h2 id="prädiktion-vorwärtsinferenz">Prediction (forward inference)&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 9, Problems 2 and 3&lt;/span>
&lt;/div>
&lt;ul>
&lt;li>Given
&lt;ul>
&lt;li>$f_a(a)$&lt;/li>
&lt;li>$g(a)$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Wanted: $f_b(b)$&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Vorwaertsinferenz.drawio.png" alt="allg_sys-Vorwaertsinferenz.drawio" style="zoom: 67%;" />
&lt;h3 id="chapman-kolmogorov-gleichung">Chapman-Kolmogorov equation&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise, Problem 10.1&lt;/span>
&lt;/div>
$$
f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=\int_{\mathbb{R}^{N}} \underbrace{f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right)}_{\text{transition density}} f_{k}^{e}\left(\underline{x}_{k}\right) \mathrm{d} \underline{x}_{k}
$$
&lt;p>The derivation is straightforward: joint density + marginalization&lt;/p>
$$
f\left(\underline{x}_{k+1}\right)= \int_{\mathbb{R}^{N}} f\left(\underline{x}_{k+1}, \underline{x}_{k}\right) \mathrm{d} \underline{x}_{k}= \int_{\mathbb{R}^{N}} f\left(\underline{x}_{k+1} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k}\right) \mathrm{d} \underline{x}_{k}
$$
&lt;p>‼️ &lt;span style="color: Red">Problem: &lt;strong>parameter integral&lt;/strong>&lt;/span>&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color: Red">The integrand depends on $\underline{x}_{k+1}$ (which in general cannot be pulled out of the integral)&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">The integral has to be solved for every $\underline{x}_{k+1}$&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Only feasible if an analytic solution exists&lt;/span>&lt;/li>
&lt;/ul>
&lt;h3 id="prädiktionsschritte">Prediction steps&lt;/h3>
&lt;ol>
&lt;li>
&lt;p>Rewrite $f(b \mid a) = \delta(b - g(a))$ using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute $f_b(b)$ with the &lt;strong>Chapman-Kolmogorov equation&lt;/strong>&lt;/p>
$$
f(b) = \int f(b \mid a) f(a) da
$$
&lt;p>and insert the rewritten form of $f(b \mid a)$ from step 1. This yields the desired density $f_b(b)$ as a function of $f_a(a)$.&lt;/p>
&lt;/li>
&lt;/ol>
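&lt;p>The two steps above can be tried on a small example. The sketch below assumes, for illustration only, the mapping $g(a) = a^2$ and a standard Gaussian prior $f_a$; the function names are hypothetical. For $b > 0$ the function $a \mapsto b - a^2$ has the two roots $a_{1,2} = \pm\sqrt{b}$ with $|g^\prime(a_i)| = 2\sqrt{b}$, so the Chapman-Kolmogorov integral collapses to a sum over the roots.&lt;/p>

```python
import math


def gauss_pdf(a, mean=0.0, std=1.0):
    """Density of a scalar Gaussian (the assumed prior f_a)."""
    z = (a - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def predicted_pdf(b, f_a=gauss_pdf):
    """Density f_b(b) for b = g(a) = a**2, following the two-step recipe."""
    if not b > 0.0:
        # No real roots for negative b; at b = 0 the derivative vanishes,
        # so the Dirac property does not apply there.
        return 0.0
    root = math.sqrt(b)
    roots = [root, -root]  # zeros of b - a**2, seen as a function of a
    # Step 1: delta(b - a**2) = sum_i delta(a - a_i) / |2 * a_i|
    # Step 2: integrating against f_a leaves one term per root
    return sum(f_a(a_i) / (2.0 * abs(a_i)) for a_i in roots)
```

&lt;p>With the standard Gaussian prior the result should agree with the chi-square density with one degree of freedom, $e^{-b/2} / \sqrt{2 \pi b}$, which gives an independent check of the recipe.&lt;/p>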
&lt;h3 id="vereinfachte-prädiktion">Simplified prediction&lt;/h3>
&lt;p>For&lt;/p>
$$
\underline{z} = \underline{a}(\underline{x}, \underline{w})
$$
&lt;p>the transition density $f(\underline{z} \mid \underline{x})$ can be approximated by a mixture&lt;/p>
$$
f(\underline{z} \mid \underline{x}) = \sum_{i \in \mathbb{Z}} f_i^z(\underline{z}) \cdot f_i^x(\underline{x})
$$
&lt;p>where $f_i^z(\underline{z})$
and $f_i^x(\underline{x})$
can be arbitrary densities (e.g. Gaussian densities).&lt;/p>
&lt;p>Notation with $\underline{x}_{k+1}$
and $\underline{x}_{k}$
:&lt;/p>
$$
f\left(\underline{x}_{k+1} \mid \underline{x}_k\right)=\sum_{i=1}^L w_k^{(i)} f_{k+1}^{(i)}\left(\underline{x}_{k+1}\right) f_k^{(i)}\left(\underline{x}_k\right)
$$
&lt;h2 id="filterung">Filtering&lt;/h2>
&lt;h3 id="rückwartsinferenz">Backward inference&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Rueckwaertsinferenz.drawio.png" alt="allg_sys-Rueckwaertsinferenz.drawio" style="zoom:67%;" />
&lt;p>For backward inference it is essential to apply Bayes' rule.&lt;/p>
$$
f(a \mid b) = \frac{f(a, b)}{f(b)} = \frac{f(b \mid a) f(a)}{f(b)} = \underbrace{\frac{1}{f(b)}}_{\text{normalization constant}} \cdot \underbrace{f(b \mid a)}_{\text{likelihood}} \cdot \underbrace{f(a)}_{\text{prior knowledge}}
$$
&lt;h3 id="konkrete-messung">Concrete measurement&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise 9, Problems 2 and 3&lt;/span>
&lt;/div>
&lt;ol>
&lt;li>
&lt;p>Rewrite $f_b(b \mid a) = \delta(b - g(a))$ using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute $f_b(b)$&lt;/p>
$$
f_b(b) = \int f_{a, b}(a, b) da = \int f_{b}(b \mid a) f_a(a) da
$$
&lt;p>by inserting the rewritten form of $f_b(b \mid a)$ from step 1&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compute $f_a(a \mid \hat{b})$ with Bayes' rule&lt;/p>
$$
f_a(a \mid \hat{b}) = \frac{f_b(\hat{b} \mid a) f_a(a)}{f_b(\hat{b})} = \frac{\overbrace{\delta(\hat{b} - g(a))}^{\text{step 1}} f_a(a)}{\underbrace{f_b(\hat{b})}_{\text{step 2}}}
$$
&lt;/li>
&lt;/ol>
&lt;h3 id="unsichere-messung">Uncertain measurement&lt;/h3>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise, Problem 9.4&lt;/span>
&lt;/div>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/allg_sys-Rueckwaertsinferenz_dichte.drawio.png" alt="allg_sys-Rueckwaertsinferenz_dichte.drawio" style="zoom:67%;" />
&lt;p>&lt;strong>Steps&lt;/strong>:&lt;/p>
&lt;ol start="0">
&lt;li>Extend the system by an additional stochastic mapping and a fixed output $\hat{z}$&lt;/li>
&lt;/ol>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-08 16.51.24.png" alt="Screenshot 2022-08-08 16.51.24" style="zoom: 50%;" />
&lt;ol>
&lt;li>
&lt;p>Determine $f(\hat{z} \mid y)$&lt;/p>
$$
\begin{aligned}
f(\hat{z} \mid y) &amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{f(y)} \\\\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int f(y, x) dx} \\\\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int f(y|x)f(x) dx} \\\\
&amp;= \frac{f(y \mid \hat{z})f(\hat{z})}{\int \delta(y - g(x)) f(x) dx} \\\\
\end{aligned}
$$
&lt;p>Then insert the rewritten form of $\delta(y - g(x))$ into the denominator, using&lt;/p>
$$
\delta (g(x)) = \sum_{i=1}^N \frac{1}{|g^\prime(x_i)|}\delta (x - x_i)
$$
&lt;p>where&lt;/p>
&lt;ul>
&lt;li>$g(x_i) = 0$ (i.e. the $x_i$ are the roots, $i = 1, 2, \dots, N$)&lt;/li>
&lt;li>$g^\prime(x_i) \neq 0$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Compute the backward inference $f(x \mid \hat{z})$&lt;/p>
$$
\begin{aligned}
f(x \mid \hat{z}) &amp;=\frac{1}{f\left(\hat{z}\right)} \cdot f(x, \hat{z}) \quad \mid \text{marginalization over } y\\
&amp;=\frac{1}{f(\hat{z})} \int f(x, y, \hat{z}) d y \\
&amp;=\frac{1}{f(\hat{z})} \int f(\hat{z} \mid y, x) \cdot f(y , x) d y \quad \mid \hat{z}, x \text{ are conditionally independent given } y\\
&amp;=\frac{1}{f(\hat{z})} \int f(\hat{z} \mid y) \cdot f(y \mid x) \cdot f(x) d y \\
&amp;=\frac{1}{f(\hat{z})} \int \underbrace{f(\hat{z} \mid y)}_{\text{computed in step 1}} \cdot \underbrace{f(y \mid x)}_{\text{system model}} \cdot f(x) d y
\end{aligned}
$$
&lt;/li>
&lt;/ol>
&lt;h3 id="schwierigkeit-vom-filterschritt">Difficulties of the filter step&lt;/h3>
&lt;ol>
&lt;li>The type of density that describes the estimate changes&lt;/li>
&lt;li>The density becomes more complex with every step&lt;/li>
&lt;/ol>
&lt;h3 id="vereinfachte-filterung">Simplified filtering&lt;/h3>
&lt;p>Simplify the likelihood $f(\underline{y} \mid \underline{x})$
by a mixture (analogous to the simplified prediction)&lt;/p>
$$
f(\underline{y} \mid \underline{x}) = \sum_{i \in \mathbb{Z}} f_i^y(\underline{y}) f_i^x(\underline{x})
$$</description></item><item><title>Sampling</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/sampling/</link><pubDate>Sun, 28 Aug 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/sampling/</guid><description>&lt;h2 id="reapproximation-von-dichten">Reapproximation von Dichten&lt;/h2>
&lt;p>Approximate the original continuous density by a discrete Dirac mixture&lt;/p>
$$
f(\underline{x})=\sum_{i=1}^{L} w_{i} \cdot \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right)
$$
&lt;ul>
&lt;li>Weights $w_{i}>0, \displaystyle \sum_{i=1}^{L} w_{i}=1$&lt;/li>
&lt;li>$\underline{\hat{x}}_i$: locations / samples&lt;/li>
&lt;/ul>
&lt;p>In univariate case (1D), compare &lt;strong>cumulative distribution functions (CDFs)&lt;/strong> $\tilde{F}(x), F(x)$ using &lt;strong>Cramér–von Mises distance&lt;/strong>:&lt;/p>
$$
D(\underline{\hat{x}})=\int_{\mathbb{R}}\left(\tilde{F}(x)-F(x, \underline{\hat{x}})\right)^{2} \mathrm{~d} x
$$
&lt;p>$F(x, \underline{\hat{x}})$
: Dirac mixture cumulative distribution&lt;/p>
$$
F(x, \underline{\hat{x}})=\sum_{i=1}^{L} w_{i} \mathrm{H}\left(x-\hat{x}_{i}\right) \text { with } \mathrm{H}(x)=\int_{-\infty}^{x} \delta(t) \mathrm{d} t= \begin{cases}0 &amp; x&lt;0 \\ \frac{1}{2} &amp; x=0 \\ 1 &amp; x>0\end{cases}
$$
&lt;p>with the vector of Dirac positions&lt;/p>
$$
\underline{\hat{x}}=\left[\hat{x}_{1}, \hat{x}_{2}, \ldots, \hat{x}_{L}\right]^{\top}
$$
&lt;p>We minimize the Cramér–von Mises distance $D(\underline{\hat{x}})$
with Newton&amp;rsquo;s method.&lt;/p>
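&lt;p>The Dirac mixture CDF defined above is a staircase function and is easy to evaluate. The following is a minimal sketch with hypothetical function names; it uses equal weights $w_i = 1/L$ unless weights are passed explicitly and follows the convention $\mathrm{H}(0) = \frac{1}{2}$ from the definition above.&lt;/p>

```python
def heaviside(x):
    """Heaviside step H(x) with the convention H(0) = 1/2."""
    if x > 0:
        return 1.0
    if x == 0:
        return 0.5
    return 0.0


def dirac_mixture_cdf(x, positions, weights=None):
    """F(x, x_hat) = sum_i w_i * H(x - x_hat_i).

    positions: Dirac positions x_hat_1, ..., x_hat_L
    weights:   positive weights summing to one (equal weights if omitted)
    """
    if weights is None:
        weights = [1.0 / len(positions)] * len(positions)
    return sum(w * heaviside(x - p) for w, p in zip(weights, positions))
```

&lt;p>For example, &lt;code>dirac_mixture_cdf(0.0, [-1.0, 0.0, 1.0])&lt;/code> counts the first position fully and the second one by half. A distance such as $D(\underline{\hat{x}})$ compares this staircase with the continuous CDF $\tilde{F}(x)$.&lt;/p>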
&lt;h3 id="generalization-of-concept-of-cdf">Generalization of concept of CDF&lt;/h3>
&lt;h4 id="localized-cumulative-distribution-lcd">Localized Cumulative Distribution (LCD)&lt;/h4>
$$
F(\underline{m}, b)=\int_{\mathbb{R}^{N}} f(\underline{x}) K(\underline{x}-\underline{m}, b) \mathrm{d} \underline{x}
$$
&lt;ul>
&lt;li>
&lt;p>$K(\cdot, \cdot)$: Kernel&lt;/p>
$$
K(\underline{x}-\underline{m}, b)=\prod_{k=1}^{N} \exp \left(-\frac{1}{2} \frac{\left(x_{k}-m_{k}\right)^{2}}{b^{2}}\right)
$$
&lt;/li>
&lt;li>
&lt;p>$\underline{m}$: Kernel location&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$b$: Kernel width&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Properties of LCD:&lt;/p>
&lt;ul>
&lt;li>Symmetric&lt;/li>
&lt;li>Unique&lt;/li>
&lt;li>Multivariate&lt;/li>
&lt;/ul>
&lt;p>Generalized Cramér–von Mises Distance (GCvD)&lt;/p>
$$
D=\int_{\mathbb{R}_{+}} w(b) \int_{\mathbb{R}^{N}}(\tilde{F}(\underline{m}, b)-F(\underline{m}, b))^{2} \mathrm{~d} \underline{m} \mathrm{~d} b
$$
&lt;ul>
&lt;li>$\tilde{F}(\underline{m}, b)$: LCD of continuous density&lt;/li>
&lt;li>$F(\underline{m}, b)$: LCD of Dirac mixture&lt;/li>
&lt;/ul>
&lt;p>Minimization of GCvD: Quasi-Newton method (L-BFGS)&lt;/p>
&lt;h4 id="projected-cumulative-distribution-pcd">Projected Cumulative Distribution (PCD)&lt;/h4>
&lt;p>Use reapproximation methods for univariate case in multivariate case.&lt;/p>
&lt;p>&lt;strong>Radon Transform&lt;/strong>&lt;/p>
&lt;p>Represent general $N$-dimensional probability density functions via the set of all one-dimensional projections&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Linear projection of the random vector $\underline{\boldsymbol{x}} \in \mathbb{R}^{N}$ to the scalar random variable $\boldsymbol{r} \in \mathbb{R}$ on the line described by the unit vector $\underline{u} \in \mathbb{S}^{N-1}$&lt;/p>
$$
\boldsymbol{r} = \underline{u}^\top \underline{\boldsymbol{x}}
$$
&lt;/li>
&lt;li>
&lt;p>Given the probability density function $f(\underline{x})$ of the random vector $\underline{\boldsymbol{x}}$, the density $f_r(r \mid \underline{u})$, taken over all $\underline{u} \in \mathbb{S}^{N-1}$, is the Radon transform of $f(\underline{x})$&lt;/p>
$$
f_{r}(r \mid \underline{u})=\int_{\mathbb{R}^{N}} f(\underline{t}) \delta\left(r-\underline{u}^{\top} \underline{t}\right) \mathrm{d} \underline{t}
$$
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Representing PDFs by &lt;em>all&lt;/em> one-dimensional projections&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Represent the two densities $\tilde{f}(\underline{x})$ and $f(\underline{x})$ by their Radon transforms $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compare the sets of projections $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ for every $\underline{u} \in \mathbb{S}^{N-1}$. The resulting distance is&lt;/p>
$$
D_{1}(\underline{u})=D(\tilde{f}(r \mid \underline{u}), f(r \mid \underline{u}))
$$
&lt;/li>
&lt;li>
&lt;p>Integrate these one-dimensional distance measures $D_1(\underline{u})$ over all unit vectors $\underline{u} \in \mathbb{S}^{N-1}$ to get the multivariate distance measure $D(\tilde{f}(\underline{x}), f(\underline{x}))$. Minimize via univariate Newton updates.&lt;/p>
&lt;/li>
&lt;/ol>
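&lt;p>The three steps above can be sketched for two sample sets (Dirac mixtures). This is only a rough sketch under assumptions that are not part of the notes: the integral over $\mathbb{S}^{N-1}$ is approximated by averaging over randomly drawn unit vectors, both sets have the same number of samples, and the univariate distance is simply the mean squared gap between sorted projections (a stand-in for whichever one-dimensional distance $D$ is actually used). All function names are hypothetical.&lt;/p>

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere S^(dim-1)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(samples, u):
    """Projection of a Dirac mixture: r_i = u^T x_i for every sample x_i."""
    return [sum(ui * xi for ui, xi in zip(u, x)) for x in samples]


def distance_1d(r_a, r_b):
    """Assumed univariate distance: mean squared gap between sorted projections."""
    pairs = zip(sorted(r_a), sorted(r_b))
    return sum((a - b) ** 2 for a, b in pairs) / len(r_a)


def projected_distance(samples_a, samples_b, num_directions=200, seed=0):
    """Average the one-dimensional distances D_1(u) over random unit vectors u."""
    rng = random.Random(seed)
    dim = len(samples_a[0])
    total = 0.0
    for _ in range(num_directions):
        u = random_unit_vector(dim, rng)
        total += distance_1d(project(samples_a, u), project(samples_b, u))
    return total / num_directions


samples_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
samples_b = [[x + 5.0, y] for x, y in samples_a]  # same shape, shifted along x
same = projected_distance(samples_a, samples_a)
shifted = projected_distance(samples_a, samples_b)
```

&lt;p>Each projection is a purely univariate problem, which is what allows univariate methods to be reused in the multivariate case.&lt;/p>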
&lt;h2 id="navies-partikel-filter">Naive Particle Filter&lt;/h2>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Exercise A13.2&lt;/span>
&lt;/div>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-08-14%2017.37.55.png" alt="截屏2022-08-14 17.37.55" style="zoom: 33%;" />
&lt;h3 id="prädiktion">Prediction&lt;/h3>
&lt;p>💡 Update the sample positions. The weights stay the same.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Represent $f_{k}^{e}\left(\underline{x}_{k}\right)$
as a Dirac mixture&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=\sum_{i=1}^{L} w_{k}^{e, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{e, i}\right) \qquad w_{k}^{e, i}=\frac{1}{L}, i \in\left\{1, \ldots, L\right\}
$$
&lt;/li>
&lt;li>
&lt;p>Draw samples at time step $k+1$&lt;/p>
$$
\underline{\hat{x}}_{k+1}^{p, i} \sim f\left(\underline{x}_{k+1} \mid \underline{\hat{x}}_{k}^{e, i}\right)
$$
&lt;p>The weights stay the same&lt;/p>
$$
w_{k+1}^{p, i} = w_{k}^{e, i}
$$
&lt;/li>
&lt;li>
&lt;p>Represent $f_{k+1}^{p}\left(\underline{x}_{k+1}\right)$
as a Dirac mixture&lt;/p>
$$
f_{k+1}^{p}\left(\underline{x}_{k+1}\right)=\sum_{i=1}^{L} w_{k+1}^{p, i} \delta\left(\underline{x}_{k+1}-\underline{\hat{x}}_{k+1}^{p, i}\right)
$$
&lt;/li>
&lt;/ol>
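&lt;p>A minimal sketch of this prediction step, assuming a scalar state and a hypothetical system model $x_{k+1} = 0.9\, x_k + w_k$ with $w_k \sim \mathcal{N}(0, 0.5^2)$. The model and all names are illustrative assumptions, not taken from the lecture.&lt;/p>

```python
import random


def sample_transition(x, rng):
    """Hypothetical scalar model: x_{k+1} = 0.9 * x_k + w_k, w_k ~ N(0, 0.5^2)."""
    return 0.9 * x + rng.gauss(0.0, 0.5)


def predict(particles, weights, rng):
    """Prediction step: every particle is moved, every weight is kept."""
    predicted = [sample_transition(x, rng) for x in particles]
    return predicted, list(weights)


rng = random.Random(0)
L = 5
particles = [rng.gauss(0.0, 1.0) for _ in range(L)]  # samples of f_k^e
weights = [1.0 / L] * L                              # w_k^{e,i} = 1/L
predicted, predicted_weights = predict(particles, weights, rng)
```

&lt;p>Only sampling from the transition density is required here; the density itself never has to be evaluated.&lt;/p>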
&lt;h3 id="filterung">Filtering&lt;/h3>
&lt;p>💡 Update the weights. The sample positions stay the same.&lt;/p>
$$
\begin{aligned}
f_{k}^{e}\left(\underline{x}_{k}\right) &amp;\propto f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f_{k}^{p}\left(\underline{x}_{k}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot \sum_{i=1}^{L} w_{k}^{p, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{p, i}\right)\\
&amp;=\sum_{i=1}^{L} \underbrace{w_{k}^{p, i} \cdot f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{p, i}\right)}_{\propto w_{k}^{e, i}} \cdot \delta(\underline{x}_{k}-\underbrace{\underline{\hat{x}}_{k}^{p, i}}_{\underline{\hat{x}}_{k}^{e, i}})
\end{aligned}
$$
&lt;ol>
&lt;li>
&lt;p>The positions stay the same&lt;/p>
$$
\underline{\hat{x}}_{k}^{e, i} = \underline{\hat{x}}_{k}^{p, i}
$$
&lt;/li>
&lt;li>
&lt;p>Adapt the weights&lt;/p>
$$
w_{k}^{e, i} \propto w_{k}^{p, i} \cdot f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{p, i}\right)
$$
&lt;p>and normalize&lt;/p>
$$
w_{k}^{e, i}:=\frac{w_{k}^{e, i}}{\displaystyle \sum_{j=1}^{L} w_{k}^{e,j}}
$$
&lt;/li>
&lt;/ol>
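&lt;p>The filter step can be sketched as follows, assuming a scalar state and a hypothetical measurement model $y_k = x_k + v_k$ with Gaussian noise $v_k \sim \mathcal{N}(0, \sigma^2)$. The model and the function names are assumptions made for illustration.&lt;/p>

```python
import math


def gaussian_likelihood(y, x, sigma):
    """Hypothetical measurement model y_k = x_k + v_k, v_k ~ N(0, sigma^2)."""
    return math.exp(-0.5 * ((y - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def filter_step(particles, weights, y, sigma):
    """Filter step: positions stay fixed, weights are scaled by the likelihood."""
    unnormalized = [w * gaussian_likelihood(y, x, sigma)
                    for x, w in zip(particles, weights)]
    total = sum(unnormalized)
    return list(particles), [w / total for w in unnormalized]


particles = [-1.0, 0.0, 1.0, 2.0]     # predicted positions
weights = [0.25, 0.25, 0.25, 0.25]    # predicted weights
posterior_particles, posterior_weights = filter_step(particles, weights, y=0.9, sigma=0.5)
```

&lt;p>In this toy setup the particle closest to the measurement receives the largest posterior weight.&lt;/p>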
&lt;h3 id="problem">Problem&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The variance of the samples increases with each filter step&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Particles die out $\rightarrow$ degeneration of the filter&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Particles die out faster the more accurate the measurement is, because the likelihood becomes narrower (paradox!)&lt;/p>
&lt;/li>
&lt;/ul>
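&lt;p>The paradox can be made concrete with a small numerical sketch. It assumes five equally weighted particles and a Gaussian likelihood with standard deviation &lt;code>sigma&lt;/code>; the numbers are made up purely for illustration.&lt;/p>

```python
import math


def normalized_weights(particles, y, sigma):
    """Equal prior weights times a Gaussian likelihood N(y; x, sigma^2), normalized."""
    lik = [math.exp(-0.5 * ((y - x) / sigma) ** 2) for x in particles]
    total = sum(lik)
    return [v / total for v in lik]


particles = [-2.0, -1.0, 0.0, 1.0, 2.0]
broad = normalized_weights(particles, y=0.2, sigma=2.0)   # inaccurate sensor
narrow = normalized_weights(particles, y=0.2, sigma=0.1)  # accurate sensor

# Share of the total weight carried by the single best particle.
print(max(broad), max(narrow))
```

&lt;p>With the narrow likelihood almost all of the weight ends up on one particle, so the remaining particles no longer contribute to the estimate.&lt;/p>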
&lt;h2 id="resampling">Resampling&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Approximate the weighted samples by unweighted (equally weighted) ones&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=\sum_{i=1}^{L} w_{k}^{e, i} \cdot \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{e, i}\right) \approx \sum_{i=1}^{L} \frac{1}{L} \delta\left(\underline{x}_{k}-\underline{\hat{x}}_{k}^{e, i}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Given: $L$ particles with weights $w_i$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Wanted: $L$ particles with weights $\frac{1}{L}$ (equally weighted)&lt;/p>
&lt;/li>
&lt;/ul>
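&lt;p>One common way to do this is multinomial resampling: draw $L$ times with replacement, with probability proportional to the weights. The notes do not state which resampling scheme is meant, so the choice below is an assumption; it relies on &lt;code>random.choices&lt;/code> from the Python standard library.&lt;/p>

```python
import random


def resample(particles, weights, rng):
    """Multinomial resampling: draw L particles with probability proportional
    to their weights, then assign every drawn particle the weight 1/L."""
    L = len(particles)
    drawn = rng.choices(particles, weights=weights, k=L)
    return drawn, [1.0 / L] * L


rng = random.Random(42)
particles = [0.0, 1.0, 2.0, 3.0]
weights = [0.0, 0.7, 0.3, 0.0]
new_particles, new_weights = resample(particles, weights, rng)
```

&lt;p>Particles with large weights tend to be duplicated, while particles with zero weight are never drawn.&lt;/p>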
&lt;h2 id="sequential-importance-sampling">Sequential Importance Sampling&lt;/h2>
&lt;ol start="0">
&lt;li>
&lt;p>Obtain $f_{k}^{e}\left(\underline{x}_{k}\right)=f\left(\underline{x}_{k} \mid \underline{y}_{1: k}\right)$
by marginalizing over $\underline{x}_{1: k-1}$&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right)=f\left(\underline{x}_{k} \mid \underline{y}_{1: k}\right)=\int_{\mathbb{R}^{N}} \cdots \int_{\mathbb{R}^{N}} f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right) d \underline{x}_{1: k-1}
$$
&lt;/li>
&lt;li>
&lt;p>Importance sampling for $f(\underline{x}_{1:k} \mid \underline{y}_{1:k})$
&lt;/p>
$$
f_{k}^{e}\left(\underline{x}_{k}\right) = \int_{\mathbb{R}^{N}} \cdots \int_{\mathbb{R}^{N}} \underbrace{\frac{f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}}_{=: w_k^{e, i}} p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right) d \underline{x}_{1: k-1}
$$
&lt;/li>
&lt;li>
&lt;p>Rewrite $\frac{f\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{x}_{1: k} \mid \underline{y}_{1 : k}\right)}$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Numerator&lt;/p>
$$
\begin{aligned}
f\left(\underline{x}_{1: k} \mid \underline{y}_{1: k}\right) &amp;\propto f\left(\underline{y}_{k} \mid \underline{x}_{1: k}, \underline{y}_{1: k - 1}\right) \cdot f\left(\underline{x}_{1: k} \mid \underline{y}_{1: k-1}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{1:k-1}, \underline{y}_{1:k-1}\right) \cdot f\left(\underline{x}_{1:k-1} \mid \underline{y}_{1: k-1}\right)\\
&amp;=f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right) \cdot f\left(\underline{x}_{1: k-1} \mid \underline{y}_{1: k-1}\right)
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Denominator&lt;/p>
$$
p\left(\underline{x}_{1: k} \mid \underline{y}_{1: k}\right)=p\left(\underline{x}_{k} \mid \underline{x}_{1: k - 1}, \underline{y}_{1: k}\right) \cdot p\left(\underline{x}_{1: k -1} \mid \underline{y}_{1: k - 1}\right)
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Substitute and write $w_k^{e, i}$ in recursive form&lt;/p>
$$
w_k^{e, i} = \frac{f\left(\underline{\hat{x}}_{1: k} \mid \underline{y}_{1 : k}\right)}{p\left(\underline{\hat{x}}_{1: k} \mid \underline{y}_{1 : k}\right)} \propto \frac{f\left(\underline{y}_{k} \mid \underline{x}_{k}^i\right) \cdot f\left(\underline{x}_{k}^i\mid \underline{x}_{k-1}^i\right)}{p\left(\underline{x}_{k}^i \mid \underline{x}_{1: k - 1}^i, \underline{y}_{1: k}\right)} \cdot \underbrace{\frac{f\left(\underline{x}_{1: k-1}^i \mid \underline{y}_{1: k-1}\right)}{p\left(\underline{x}_{1: k -1}^i \mid \underline{y}_{1: k - 1}\right)}}_{=w_{k-1}^{e, i}}
$$
&lt;p>and normalize.&lt;/p>
&lt;/li>
&lt;/ol>
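&lt;p>The recursive weight update can be sketched as below. The sketch assumes a scalar state, hypothetical Gaussian system and measurement models, and a proposal that depends only on $\underline{x}_{k-1}$ and $\underline{y}_k$; all names and numbers are illustrative assumptions.&lt;/p>

```python
import math


def gauss(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(y, x):
    """Hypothetical measurement model: y_k = x_k + v_k, v_k ~ N(0, 0.5^2)."""
    return gauss(y, x, 0.5)


def transition(x, x_prev):
    """Hypothetical system model: x_k = 0.9 * x_{k-1} + w_k, w_k ~ N(0, 1)."""
    return gauss(x, 0.9 * x_prev, 1.0)


def transition_as_proposal(x, x_prev, y):
    """Proposal density chosen equal to the transition density (ignores y)."""
    return transition(x, x_prev)


def sis_weight_update(prev_weights, prev_particles, new_particles, y, proposal):
    """One SIS step: w_k is proportional to
    likelihood * transition / proposal * w_{k-1}, followed by normalization."""
    unnormalized = [
        w * likelihood(y, x) * transition(x, x_prev) / proposal(x, x_prev, y)
        for w, x_prev, x in zip(prev_weights, prev_particles, new_particles)
    ]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


prev_particles = [0.0, 1.0, 2.0]
new_particles = [0.1, 0.8, 2.1]   # assumed to have been drawn from the proposal
prev_weights = [1.0 / 3.0] * 3
new_weights = sis_weight_update(prev_weights, prev_particles, new_particles,
                                y=1.0, proposal=transition_as_proposal)
```

&lt;p>Because the proposal equals the transition density in this example, the ratio cancels and the new weights are proportional to the likelihood times the old weights.&lt;/p>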
&lt;h3 id="spezielle-proposals">Special Proposals&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Standard Proposal&lt;/strong>&lt;/p>
$$
p\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) \stackrel{!}{=} f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right)
$$
&lt;p>Then&lt;/p>
$$
w_{k}^{e, i} \propto \frac{f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{i}\right) \cdot f\left(\hat{\underline{x}}_{k}^{i} \mid \hat{\underline{x}}_{k-1}^{i}\right)}{p\left(\underline{\hat{x}}_{k}^{i} \mid \hat{\underline{x}}_{k-1}^{i}, \underline{y}_k\right)} \cdot w_{k-1}^{e, i}=f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k}^{i}\right) \cdot w_{k - 1}^{e, i}
$$
&lt;p>Very simple, but no improvement in performance.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Optimal Proposal&lt;/strong>&lt;/p>
$$
\begin{aligned}
p\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) &amp;=f\left(\underline{x}_{k} \mid \underline{x}_{k-1}, \underline{y}_{k}\right) \\
&amp; \propto f\left(\underline{y}_{k} \mid \underline{x}_{k}\right) \cdot f\left(\underline{x}_{k} \mid \underline{x}_{k-1}\right)
\end{aligned}
$$
&lt;p>Then&lt;/p>
$$
w_{k}^{e, i} \propto f\left(\underline{y}_{k} \mid \hat{\underline{x}}_{k-1}^{i}\right) \cdot w_{k-1}^{e, i}
$$
&lt;p>Minimizes the variance of the weights, but is only usable in special cases.&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Frequently Asked Exam Questions</title><link>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/haeufige_fragen/</link><pubDate>Tue, 13 Sep 2022 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/notes/stochastische_informationsverarbeitung/zusammenfassung/haeufige_fragen/</guid><description>&lt;h2 id="allgemeine-fragen">General Questions&lt;/h2>
&lt;h3 id="was-haben-wir-in-der-vorlesung-gemachtgelerntbehandelt">What did we do/learn/cover in the lecture?&lt;/h3>
&lt;h3 id="was-ist-zustandsschätzung">What is state estimation?&lt;/h3>
&lt;h3 id="was-ist-zustand">What is a state?&lt;/h3>
&lt;h3 id="welche-arten-von-systemen-sind-einfach-warum">Which types of systems are simple? Why?&lt;/h3>
&lt;p>Discrete-valued systems and continuous-valued linear systems.&lt;/p>
&lt;p>Reason: constant computational and memory requirements&lt;/p>
&lt;h2 id="wertdiskrete-systeme">Discrete-Valued Systems&lt;/h2>
&lt;h3 id="wonham-filter-herleiten">Derive the Wonham filter&lt;/h3>
&lt;h2 id="wertkontinuierliche-lineare-systeme">Continuous-Valued Linear Systems&lt;/h2>
&lt;h3 id="linear-kalman-filter-herleiten">Derive the linear Kalman filter&lt;/h3>
&lt;h3 id="eigeenschaften-des-kfs">Properties of the KF&lt;/h3>
&lt;h2 id="wertkontinuierliche-schwache-nichtlineare-systeme">Continuous-Valued Weakly Nonlinear Systems&lt;/h2>
&lt;h3 id="wie-kann-man-erkennen-ob-ein-system-stark-oder-schwach-nichtlinear">How can one tell whether a system is strongly or weakly nonlinear?&lt;/h3>
&lt;ul>
&lt;li>Comparison with the first-order Taylor expansion&lt;/li>
&lt;li>Induced nonlinearity&lt;/li>
&lt;/ul>
&lt;h3 id="was-kann-man-machen-wenn-das-system-schwach-nichtlinear-ist">What can be done if the system is weakly nonlinear?&lt;/h3>
&lt;h3 id="wie-funktioniert-die-zustandsschätzung-bei-schwach-nichtlinearen-systemen">How does state estimation work for weakly nonlinear systems?&lt;/h3>
&lt;h3 id="wie-funktioniert-das-ekf-ekf-herleiten">How does the EKF work? Derive the EKF&lt;/h3>
&lt;h3 id="ukf-erklären-und-herleiten">Explain and derive the UKF&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>How does filtering with samples work?&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-09-12 22.29.08.png" alt="截屏2022-09-12 22.29.08" style="zoom: 33%;" />
&lt;/li>
&lt;li>
&lt;p>How can we generate samples from the prior?&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="unterschied-zwischen-ukf-und-enkf">Difference between UKF and EnKF?&lt;/h3>
&lt;h3 id="nlkf-kf-in-probabilistischer-form">NLKF (KF in probabilistic form)&lt;/h3>
&lt;h2 id="allgemine-systeme">General Systems&lt;/h2>
&lt;h3 id="chapman-komolgorov-gleichung-herleiten">Derive the Chapman-Kolmogorov equation&lt;/h3>
&lt;h3 id="problem-von-allgemeinen-systeme">What is the problem with general systems?&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Prediction: parameter integral in the Chapman-Kolmogorov equation&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color: Red">The integrand depends on $\underline{x}_{k+1}$ (in general it cannot be pulled out of the integral)&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Only feasible if an analytic solution exists&lt;/span>&lt;/li>
&lt;li>&lt;span style="color: Red">Otherwise the integral has to be solved (numerically) for every $\underline{x}_{k+1}$&lt;/span>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Filtering&lt;/p>
&lt;ul>
&lt;li>The type of density that describes the estimate changes&lt;/li>
&lt;li>The density becomes more complex with every step&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="wie-kann-man-gegen-parameterintergral-bei-prädiktion-tun">What can be done about the parameter integral in the prediction step?&lt;/h3>
&lt;p>Approximate the transition density $f\left(\underline{x}_{k+1} \mid \underline{x}_k\right)$
by a decoupled mixture&lt;/p>
$$
f\left(\underline{x}_{k+1} \mid \underline{x}_k\right) \approx \sum_{i=1}^L w_k^{(i)} f_{k+1}^{(i)}\left(\underline{x}_{k+1}\right) f_k^{(i)}\left(\underline{x}_k\right)
$$
&lt;p>Advantage: the factor of the integrand in the Chapman-Kolmogorov equation that depends on $\underline{x}_{k+1}$
can be pulled out of the integral. The remaining integral is a constant and is used as a factor for the new weight.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2022-09-13 20.55.22.png" alt="截屏2022-09-13 20.55.22" style="zoom: 33%;" />
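&lt;p>Written out, inserting the decoupled mixture into the Chapman-Kolmogorov equation gives the following (a short sketch; the symbol $c_k^{(i)}$ is introduced here only for illustration and does not appear in the notes):&lt;/p>
$$
\begin{aligned}
f_{k+1}^{p}\left(\underline{x}_{k+1}\right) &amp;=\int_{\mathbb{R}^{N}} f\left(\underline{x}_{k+1} \mid \underline{x}_k\right) \cdot f_{k}^{e}\left(\underline{x}_{k}\right) \mathrm{d} \underline{x}_k \\
&amp;\approx \sum_{i=1}^L w_k^{(i)} f_{k+1}^{(i)}\left(\underline{x}_{k+1}\right) \underbrace{\int_{\mathbb{R}^{N}} f_k^{(i)}\left(\underline{x}_k\right) \cdot f_{k}^{e}\left(\underline{x}_{k}\right) \mathrm{d} \underline{x}_k}_{=: c_k^{(i)}} \\
&amp;=\sum_{i=1}^L w_k^{(i)} c_k^{(i)} \cdot f_{k+1}^{(i)}\left(\underline{x}_{k+1}\right)
\end{aligned}
$$
&lt;p>Each $c_k^{(i)}$ no longer depends on $\underline{x}_{k+1}$, so it has to be computed only once per component, and the predicted density is again a mixture with weights $w_k^{(i)} c_k^{(i)}$.&lt;/p>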
&lt;h3 id="generatives-system-mit-additivem-oder-multiplikativem-rauschen-in-probabilistische-überführen-und-herleiten">Convert a generative system (with additive or multiplicative noise) into a probabilistic one and derive it&lt;/h3>
&lt;h2 id="sampling">Sampling&lt;/h2>
&lt;h3 id="wie-funktioniert-partikel-filter">How does the particle filter work?&lt;/h3></description></item><item><title>Introduction</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/00-intro/</link><pubDate>Sun, 13 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/00-intro/</guid><description>&lt;h2 id="what-is-nlp">What is NLP?&lt;/h2>
&lt;blockquote>
&lt;p>Wikipedia:
&lt;strong>Natural language processing&lt;/strong> (&lt;strong>NLP&lt;/strong>) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. As such, NLP is related to the area of human–computer interaction.&lt;/p>
&lt;/blockquote>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-10%2023.56.28.png" alt="截屏2020-05-10 23.56.28" style="zoom: 67%;" />
&lt;h2 id="what-is-dialog-modeling">What is Dialog Modeling?&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Designing/building a spoken dialog system with its goals, user handling etc.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Synonymous with dialog management (DM)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-10%2023.58.28.png" alt="截屏2020-05-10 23.58.28" style="zoom:50%;" />
&lt;/li>
&lt;li>
&lt;p>Examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Goal-oriented dialog&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Social dialog / Chat bot&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="how-to-do-nlp">How to do NLP?&lt;/h2>
&lt;ul>
&lt;li>Aim: Understand &lt;strong>linguistic&lt;/strong> structure of communication&lt;/li>
&lt;li>Idea: There are rules to decide if a sentence is correct or not
&lt;ul>
&lt;li>A proper sentence needs to have:
&lt;ul>
&lt;li>1 Subject&lt;/li>
&lt;li>1 Verb&lt;/li>
&lt;li>several objects (depending on the verb&amp;rsquo;s valence)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="tldr">TL;DR&lt;/h3>
&lt;ul>
&lt;li>Task:
&lt;ul>
&lt;li>Linguistic dimension: Syntax, semantics, pragmatics&lt;/li>
&lt;li>Level: Word, word groups, sentence, beyond sentences&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Approaches
&lt;ul>
&lt;li>Technique:
&lt;ul>
&lt;li>Rule-based,&lt;/li>
&lt;li>Statistical,&lt;/li>
&lt;li>Neural&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Learning scenario:
&lt;ul>
&lt;li>Supervised,&lt;/li>
&lt;li>semi-supervised,&lt;/li>
&lt;li>unsupervised,&lt;/li>
&lt;li>reinforcement learning&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Model:
&lt;ul>
&lt;li>Classification,&lt;/li>
&lt;li>sequence classification,&lt;/li>
&lt;li>sequence labeling,&lt;/li>
&lt;li>sequence to sequence,&lt;/li>
&lt;li>structure prediction&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="technique">Technique&lt;/h3>
&lt;h4 id="hand-written-rules-to-parse-the-sentences-rule-based">Hand-written rules to parse the sentences (Rule-based)&lt;/h4>
&lt;p>‼️Problems&lt;/p>
&lt;ul>
&lt;li>There is no fixed set of rules&lt;/li>
&lt;li>Language changes over time&lt;/li>
&lt;li>A(ny?) language is constantly influenced by other languages&lt;/li>
&lt;li>Classification of words into POS tags not always clear&lt;/li>
&lt;/ul>
&lt;h4 id="corpus-based-approaches-to-nlp-statistical">&lt;strong>Corpus-based Approaches to NLP&lt;/strong> (Statistical)&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Corpus = large collection of &lt;em>annotated&lt;/em> texts (or speech files)&lt;/strong>&lt;/li>
&lt;li>👍 &lt;strong>advantages&lt;/strong>:
&lt;ul>
&lt;li>Automatically learn rules from data&lt;/li>
&lt;li>Statistical Models → no hard decision&lt;/li>
&lt;li>Use machine learning approaches
&lt;ul>
&lt;li>Feasible now that larger computational resources are available&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Corpus will concentrate on most common approaches&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Input&lt;/strong>:
&lt;ul>
&lt;li>Data (Text corpora)&lt;/li>
&lt;li>Machine learning algorithm&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Output&lt;/strong>: Statistical model&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2000.43.05.png" alt="截屏2020-05-11 00.43.05" style="zoom: 67%;" />
&lt;ul>
&lt;li>&lt;strong>Problems of simple statistical models&lt;/strong>: feature engineering
&lt;ul>
&lt;li>Which features are important for determining the POS tag?
&lt;ul>
&lt;li>Word ending&lt;/li>
&lt;li>Surrounding words&lt;/li>
&lt;li>Capitalization&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="deep-learning-approaches-to-nlp-neural">&lt;strong>Deep learning Approaches to NLP&lt;/strong> (Neural)&lt;/h4>
&lt;ul>
&lt;li>Use neural networks to automatically infer features&lt;/li>
&lt;li>Better generalization&lt;/li>
&lt;li>Successfully applied to many NLP tasks&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2000.45.08.png" alt="截屏2020-05-11 00.45.08" style="zoom:67%;" />
&lt;h3 id="learning-scenarios">Learning scenarios&lt;/h3>
&lt;ul>
&lt;li>Supervised learning&lt;/li>
&lt;li>Unsupervised learning&lt;/li>
&lt;li>Semi-supervised learning&lt;/li>
&lt;li>Reinforcement learning&lt;/li>
&lt;/ul>
&lt;h3 id="model-types">Model types&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Model type&lt;/th>
&lt;th>Input&lt;/th>
&lt;th>Output&lt;/th>
&lt;th>Example task&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Classification&lt;/td>
&lt;td>&lt;strong>Fixed&lt;/strong> input size &lt;br />&lt;em>(E.g. word and surrounding k words)&lt;/em>&lt;/td>
&lt;td>Label&lt;/td>
&lt;td>Word sense disambiguation&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Sequence classification&lt;/td>
&lt;td>Sequence with &lt;strong>variable&lt;/strong> length&lt;/td>
&lt;td>Label&lt;/td>
&lt;td>Sentiment analysis&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Sequence labelling&lt;/td>
&lt;td>Sequence with &lt;strong>variable&lt;/strong> length&lt;/td>
&lt;td>Label sequence with &lt;strong>same&lt;/strong> length&lt;/td>
&lt;td>Named entity recognition&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Sequence to Sequence model&lt;/td>
&lt;td>Sequence with &lt;strong>variable&lt;/strong> length&lt;/td>
&lt;td>Sequence with &lt;strong>variable&lt;/strong> length&lt;/td>
&lt;td>Summarization&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Structure prediction&lt;/td>
&lt;td>Sequence with &lt;strong>variable&lt;/strong> length&lt;/td>
&lt;td>Complex structure&lt;/td>
&lt;td>Parsing&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="resources">Resources&lt;/h3>
&lt;ul>
&lt;li>Texts
&lt;ul>
&lt;li>Brown Corpus&lt;/li>
&lt;li>Penn Treebank&lt;/li>
&lt;li>Europarl&lt;/li>
&lt;li>Google books corpus&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Dictionaries/Ontologies
&lt;ul>
&lt;li>WordNet,&lt;/li>
&lt;li>GermaNet,&lt;/li>
&lt;li>EuroWordNet&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="approaches-to-dialog-modeling">Approaches to Dialog Modeling&lt;/h3>
&lt;ul>
&lt;li>Many problems of NLP also apply to Dialog Modeling&lt;/li>
&lt;li>Use conversational corpora for learning interaction patterns
&lt;ul>
&lt;li>Meeting Corpus (multiparty conversation)&lt;/li>
&lt;li>Switchboard Corpus (telephone speech)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Problems ‼️
&lt;ul>
&lt;li>Very domain dependent&lt;/li>
&lt;li>Need human interaction in training&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="why-is-nlp-hard">Why is NLP hard?&lt;/h2>
&lt;p>&lt;span style="color:red"> Ambiguities! Ambiguities! Ambiguities!&lt;/span>&lt;/p>
&lt;h3 id="ambiguities">Ambiguities&lt;/h3>
&lt;p>Examples:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2011.19.29.png" alt="截屏2020-05-11 11.19.29" style="zoom:67%;" />
&lt;h3 id="rare-events">Rare events&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Calculate probabilities for events/words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Most words occur only very rarely&lt;/p>
&lt;ul>
&lt;li>Most words occur only once&lt;/li>
&lt;li>What to do with words that do not occur in the training data? 🧐&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Zipf&amp;rsquo;s Law&lt;/strong>
&lt;/p>
$$
f \propto \frac{1}{r}
$$
&lt;ul>
&lt;li>$f$: frequency of a word; order the list of words by frequency of occurrence&lt;/li>
&lt;li>$r$: rank, i.e. the position of the word in that list&lt;/li>
&lt;/ul>
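&lt;p>A rank-frequency table can be built in a few lines. The sketch below uses a made-up toy sentence, so it only shows the mechanics; a meaningful check of Zipf&amp;rsquo;s law needs a large corpus.&lt;/p>

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    ranked = Counter(tokens).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


tokens = "the cat and the dog and the bird saw the fox".split()
table = rank_frequency(tokens)
for rank, word, count in table:
    # Under Zipf's law, rank * count stays roughly constant on large corpora.
    print(rank, word, count, rank * count)
```

&lt;p>If $f \propto 1/r$ holds, the product of rank and frequency in the last column is roughly the same for every word.&lt;/p>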
&lt;blockquote>
&lt;p>The frequency of any word is &lt;a href="https://en.wikipedia.org/wiki/Inversely_proportional">inversely proportional&lt;/a> to its rank in the &lt;a href="https://en.wikipedia.org/wiki/Frequency_table">frequency table&lt;/a>.&lt;/p>
&lt;p>Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, etc.&lt;/p>
&lt;p>For example, in the &lt;a href="https://en.wikipedia.org/wiki/Brown_Corpus">Brown Corpus&lt;/a> of American English text, the word &lt;em>&lt;a href="https://en.wikipedia.org/wiki/English_articles#Definite_article">the&lt;/a>&lt;/em> is the most frequently occurring word, and by itself accounts for nearly 7% of all word occurrences (69,971 out of slightly over 1 million). True to Zipf&amp;rsquo;s Law, the second-place word &lt;em>of&lt;/em> accounts for slightly over 3.5% of words (36,411 occurrences), followed by &lt;em>and&lt;/em> (28,852).&lt;/p>
&lt;/blockquote></description></item><item><title>Word Sense Disambiguation</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/01-word-sense-disambiguation/</link><pubDate>Mon, 14 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/01-word-sense-disambiguation/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2012.26.47.png" alt="截屏2020-05-11 12.26.47" style="zoom:67%;" />
&lt;h3 id="definition">Definition&lt;/h3>
&lt;p>&lt;strong>Word Sense Disambiguation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Determine which sense/meaning of a word is used in a particular context&lt;/li>
&lt;li>Classification problem&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Sense inventory&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>The set of senses considered for each word&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Word Sense Discrimination&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Divide usages of a word into different meanings&lt;/li>
&lt;li>Unsupervised Algorithms&lt;/li>
&lt;/ul>
&lt;h3 id="task">Task&lt;/h3>
&lt;p>&lt;strong>Determine which sense of a word is activated in a context&lt;/strong>&lt;/p>
&lt;p>Find mapping $A$ for word $w_i$:
&lt;/p>
$$
A(i) \subseteq \operatorname{Senses}_{D}\left(w_{i}\right)
$$
&lt;ul>
&lt;li>Mostly $|A(i)|=1$&lt;/li>
&lt;/ul>
&lt;p>Model as classification problem:&lt;/p>
&lt;ul>
&lt;li>Assign sense based on context and external knowledge sources&lt;/li>
&lt;li>Every word has different number of classes&lt;/li>
&lt;li>$n$ distinct classification tasks ($n$ = vocabulary size)&lt;/li>
&lt;/ul>
&lt;h4 id="task-conditions">Task-conditions&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Word senses&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Finite set of senses for every word&lt;/li>
&lt;li>Automatic clustering of word senses&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Sense inventories&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>coarse-grained&lt;/li>
&lt;li>fine-grained&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Text characteristics&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>domain-oriented&lt;/li>
&lt;li>unrestricted&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Target words&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>one target word per sentence&lt;/li>
&lt;li>all words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="resources">Resources&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Annotated data&lt;/strong>
&lt;ul>
&lt;li>Input data X and output/label data Y&lt;/li>
&lt;li>Hard to acquire, but important&lt;/li>
&lt;li>Supervised training&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Unlabeled data&lt;/strong>
&lt;ul>
&lt;li>Input data X&lt;/li>
&lt;li>Large amounts&lt;/li>
&lt;li>Unsupervised training&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Structured resources&lt;/strong>
&lt;ul>
&lt;li>Thesauri&lt;/li>
&lt;li>Machine-readable dictionaries (MRDs)&lt;/li>
&lt;li>Computational lexicons (WordNet)&lt;/li>
&lt;li>Ontologies&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Unstructured resources&lt;/strong>
&lt;ul>
&lt;li>Corpora&lt;/li>
&lt;li>Collocation resources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="-problems">🔴 Problems&lt;/h2>
&lt;ul>
&lt;li>Sense definition is task dependent&lt;/li>
&lt;li>Different algorithms for different applications&lt;/li>
&lt;li>No discrete sense division possible&lt;/li>
&lt;li>Knowledge acquisition bottleneck&lt;/li>
&lt;li>Intermediate task&lt;/li>
&lt;/ul>
&lt;h2 id="application">Application&lt;/h2>
&lt;ul>
&lt;li>Machine Translation (MT)&lt;/li>
&lt;li>Information Retrieval (IR)&lt;/li>
&lt;li>Question Answering (QA)&lt;/li>
&lt;li>Semantic interpretation&lt;/li>
&lt;/ul>
&lt;h2 id="approaches">Approaches&lt;/h2>
&lt;h3 id="dictionary--and-knowledge-based">Dictionary- and Knowledge-Based&lt;/h3>
&lt;h4 id="lesk-method--gloss-overlap">Lesk method / Gloss overlap&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>💡 Idea: Words used together in a text are related&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Method: Find the word senses whose dictionary definitions have the &lt;strong>most overlap&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input: Dictionary with definitions of the different word senses&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Overlap calculation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Two words $w_1$ and $w_2$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For each pair of senses $S_1$ in $\operatorname{Senses}(w_1)$ and $S_2$ in $\operatorname{Senses}(w_2)$:
&lt;/p>
$$
\operatorname{score}\left(S_1, S_{2}\right)=\left|\operatorname{gloss}(S_1) \cap \operatorname{gloss}\left(S_{2}\right)\right|
$$
&lt;ul>
&lt;li>$\operatorname{gloss}(S_1)=\text{bag of words of definition of } S_1$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Problem: With many words in the context the calculation becomes very slow 🤪, because the number of sense combinations to score is
&lt;/p>
$$
\prod_{i=1}^{n}\left|\operatorname{Senses}\left(w_{i}\right)\right|
$$
&lt;/li>
&lt;li>
&lt;p>Variant (simplified): Calculate overlap between context (set of words in surrounding sentence or paragraph) and gloss:
&lt;/p>
$$
\operatorname{score}(S)=|\operatorname{context}(w) \cap \operatorname{gloss}(S)|
$$
&lt;ul>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2021.56.13.png" alt="截屏2020-05-11 21.56.13" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Problems:&lt;/p>
&lt;ul>
&lt;li>The result depends heavily on the exact wording of the definitions&lt;/li>
&lt;li>Definitions are often very short&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
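&lt;p>The simplified variant can be sketched as follows. The two-sense inventory for &amp;ldquo;bank&amp;rdquo; and its glosses are made up for illustration, words are compared as lowercase surface forms, and no stop-word removal or stemming is applied (all of these are assumptions of the sketch, not part of the method description).&lt;/p>

```python
def simplified_lesk(context_words, sense_glosses):
    """Pick the sense whose gloss shares the most words with the context."""
    context = set(w.lower() for w in context_words)
    best_sense, best_score = None, -1
    for sense, gloss in sense_glosses.items():
        # score(S) = |context(w) intersected with gloss(S)|
        score = len(context.intersection(gloss.lower().split()))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense, best_score


# Hypothetical sense inventory; the glosses are invented for this example.
glosses = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water such as a river",
}
sentence = "she went to the bank to deposit money into her account".split()
sense, score = simplified_lesk(sentence, glosses)
```

&lt;p>The example also shows the weakness noted above: &amp;ldquo;deposit&amp;rdquo; does not match &amp;ldquo;deposits&amp;rdquo;, so the decision rests on the single shared word &amp;ldquo;money&amp;rdquo;.&lt;/p>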
&lt;h3 id="supervised">Supervised&lt;/h3>
&lt;p>💡 Train classifier using &lt;strong>annotated&lt;/strong> examples (i.e., annotated text corpora)&lt;/p>
&lt;ul>
&lt;li>Input features: Use context to disambiguate words&lt;/li>
&lt;li>Problems:
&lt;ul>
&lt;li>High dimensionality of the feature space&lt;/li>
&lt;li>data sparseness problem&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Techniques:
&lt;ul>
&lt;li>&lt;a href="#naive-bayes-classifier">Naive Bayes classifier&lt;/a>&lt;/li>
&lt;li>&lt;a href="#instance-based-learning">Instance-based Learning&lt;/a>&lt;/li>
&lt;li>SVM&lt;/li>
&lt;li>&lt;a href="#ensemble-methods">Ensemble Methods&lt;/a>&lt;/li>
&lt;li>Neural Networks (e.g. Bi-LSTM)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="feature-extraction">Feature extraction&lt;/h4>
&lt;p>&lt;strong>Feature vector&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>Vector describing input data&lt;/li>
&lt;li>Fixed number of dimensions
&lt;ul>
&lt;li>Challenges:
&lt;ul>
&lt;li>Variable sentence length&lt;/li>
&lt;li>Unknown number of words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Two kinds of features in the vectors:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Collocational&lt;/strong>: Features about words at &lt;strong>specific&lt;/strong> positions near target word&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Think of it as an (&lt;em>ordered&lt;/em>) list&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often limited to just word identity and POS&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2023.04.45.png" alt="截屏2020-05-11 23.04.45" style="zoom:50%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Bag-of-words&lt;/strong>: Features about words that occur anywhere in the window (regardless of position)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Think of it as &amp;ldquo;an &lt;em>unordered&lt;/em> set of words&amp;rdquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Typically limited to frequency counts&lt;/p>
&lt;/li>
&lt;li>
&lt;p>How does it work?&lt;/p>
&lt;ul>
&lt;li>Counts of the words that occur within the window.&lt;/li>
&lt;li>First choose a vocabulary&lt;/li>
&lt;li>Then count how often each of those terms occurs in a given window
&lt;ul>
&lt;li>sometimes just a binary “indicator” 1 or 0&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-11%2023.06.20.png" alt="截屏2020-05-11 23.06.20" style="zoom:50%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
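&lt;p>Both feature types can be sketched on a tokenized sentence. The sketch uses word identity only (no POS tags), a window of two words on each side, and a hand-picked vocabulary; the example sentence, the padding symbol and the function names are illustrative assumptions.&lt;/p>

```python
from collections import Counter


def collocational_features(tokens, target_index, window=2, pad="PAD"):
    """Ordered words at the fixed offsets -window..+window (target excluded)."""
    features = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        pos = target_index + offset
        inside = pos >= 0 and len(tokens) > pos
        features.append(tokens[pos] if inside else pad)
    return features


def bag_of_words_features(tokens, target_index, vocabulary, window=2):
    """Counts of each vocabulary word inside the window, ignoring position."""
    start = max(0, target_index - window)
    neighbours = (tokens[start:target_index]
                  + tokens[target_index + 1:target_index + 1 + window])
    counts = Counter(neighbours)
    return [counts[word] for word in vocabulary]


tokens = "an electric guitar and bass player stand off to one side".split()
target = tokens.index("bass")
vocabulary = ["guitar", "fishing", "player", "band"]

colloc = collocational_features(tokens, target)
bow = bag_of_words_features(tokens, target, vocabulary)
```

&lt;p>The collocational vector keeps the order of the neighbouring words, whereas the bag-of-words vector only records how often each vocabulary word appears in the window. Both have a fixed length regardless of sentence length.&lt;/p>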
&lt;p>&lt;strong>Text processing&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Tokenization&lt;/li>
&lt;li>Part-of-speech tagging&lt;/li>
&lt;li>Lemmatization&lt;/li>
&lt;li>Chunking: divide text into syntactically correlated parts&lt;/li>
&lt;li>Parsing&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Feature definition&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Local features
&lt;ul>
&lt;li>surrounding words, POS tags, position with respect to target word&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Topical/Global features
&lt;ul>
&lt;li>general topic of a text&lt;/li>
&lt;li>mostly bag-of-words representation of (sentence, paragraph, &amp;hellip;)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Syntactic features
&lt;ul>
&lt;li>syntactic clues&lt;/li>
&lt;li>can be outside the local context&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Semantic features
&lt;ul>
&lt;li>previously determined senses of words in the context&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="naive-bayes-classifier">Naive Bayes classifier&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Input:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>a word $w$ &lt;em>in a text window&lt;/em> $d$ &lt;em>(which we’ll call a “document”)&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>a fixed set of classes $C = \{c_1, c_2, \dots, c_j\}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>a training set of $m$ hand-labeled text windows, again called “documents”: $(d_1, c_1), \dots, (d_m, c_m)$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Output: a learned classifier
&lt;/p>
$$
\gamma: d \to c
$$
&lt;/li>
&lt;li>
&lt;p>$P(c)$: prior probability of that sense&lt;/p>
&lt;ul>
&lt;li>Counting in a labeled training set&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$P(w|c)$: conditional probability of a word given a particular sense&lt;/p>
&lt;ul>
&lt;li>$P(w|c) = \frac{\operatorname{count}(w, c)}{\operatorname{count}(c)}$&lt;/li>
&lt;/ul>
&lt;p>(We get both of these from a tagged corpus)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-05-11%2022.45.57.png"
alt="Example of a Naive Bayes classifier">&lt;figcaption>
&lt;p>Example of a Naive Bayes classifier&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;/ul>
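&lt;p>A minimal sketch of such a classifier, assuming each &amp;ldquo;document&amp;rdquo; is a list of context words. The add-one smoothing, the function names and the toy training data are assumptions made for this illustration and are not taken from the lecture.&lt;/p>

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (list_of_context_words, sense) pairs."""
    sense_counts = Counter(sense for _, sense in documents)
    word_counts = defaultdict(Counter)
    for words, sense in documents:
        word_counts[sense].update(words)
    vocabulary = {w for words, _ in documents for w in words}
    return sense_counts, word_counts, vocabulary


def classify(words, model):
    """Return the sense c that maximizes log P(c) + sum of log P(w|c)."""
    sense_counts, word_counts, vocabulary = model
    total_docs = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n_docs in sense_counts.items():
        score = math.log(n_docs / total_docs)  # prior P(c) from counting
        n_words = sum(word_counts[sense].values())
        for w in words:
            # Add-one smoothing (an assumption) avoids log(0) for unseen words.
            score += math.log((word_counts[sense][w] + 1) / (n_words + len(vocabulary)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical hand-labeled windows around the ambiguous word "bank".
training = [
    (["fish", "river", "water"], "bank/shore"),
    (["river", "boat"], "bank/shore"),
    (["money", "loan", "account"], "bank/finance"),
    (["money", "deposit"], "bank/finance"),
]
model = train_naive_bayes(training)
```

&lt;p>Calling &lt;code>classify(["river", "water"], model)&lt;/code> picks the shore sense, because both words were only seen with that sense in the toy data.&lt;/p>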
&lt;h4 id="instance-based-learning">Instance-based Learning&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Build classification model based on examples&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>k-Nearest Neighbor (k-NN)&lt;/strong> algorithm&lt;/p>
&lt;/li>
&lt;li>
&lt;p>💡Idea:&lt;/p>
&lt;ul>
&lt;li>represent examples in vector space&lt;/li>
&lt;li>define distance metric in vector space&lt;/li>
&lt;li>find the $k$ nearest neighbors&lt;/li>
&lt;li>take the most common sense among the $k$ nearest neighbors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Distance: e.g., Hamming distance
&lt;/p>
$$
\Delta\left(x, x_{i}\right)=\sum_{j} w_{j} \delta\left(x_{j}, x_{i_{j}}\right)
$$
&lt;ul>
&lt;li>$\delta\left(x_{j}, x_{i_j}\right)=0$ if $x_{j}=x_{i_j},$ else 1&lt;/li>
&lt;li>$w_j$: weight (e.g., gain ratio measure)&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>In &lt;a href="https://en.wikipedia.org/wiki/Information_theory">information theory&lt;/a>, the &lt;strong>Hamming distance&lt;/strong> between two &lt;a href="https://en.wikipedia.org/wiki/String_(computer_science)">strings&lt;/a> of equal length is the number of positions at which the corresponding &lt;a href="https://en.wikipedia.org/wiki/Symbol">symbols&lt;/a> are different. In other words, it measures the minimum number of &lt;em>substitutions&lt;/em> required to change one string into the other, or the minimum number of &lt;em>errors&lt;/em> that could have transformed one string into the other.&lt;/p>
&lt;p>Example:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-05-12%2011.29.21-20200914235008349.png" alt="截屏2020-05-12 11.29.21">&lt;/p>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
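&lt;p>The k-NN idea with a weighted Hamming distance could look like the sketch below. The feature layout, the weights and the example data are made up for illustration; in practice the weights would come from a measure such as gain ratio.&lt;/p>

```python
from collections import Counter


def weighted_hamming(x, y, weights):
    """Sum of the weights of the positions where the two feature vectors differ."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def knn_classify(x, examples, weights, k=3):
    """examples: list of (feature_vector, sense); returns the majority sense of the k nearest."""
    nearest = sorted(examples, key=lambda ex: weighted_hamming(x, ex[0], weights))[:k]
    return Counter(sense for _, sense in nearest).most_common(1)[0][0]


# Hypothetical features: (previous word, next word, POS tag of next word).
examples = [
    (("river", "of", "IN"), "shore"),
    (("the", "of", "IN"), "shore"),
    (("the", "account", "NN"), "finance"),
    (("my", "account", "NN"), "finance"),
    (("a", "loan", "NN"), "finance"),
]
weights = [1.0, 2.0, 0.5]  # made-up feature weights
prediction = knn_classify(("my", "loan", "NN"), examples, weights, k=3)
```

&lt;p>For the query above, the three nearest examples all carry the finance sense, so that sense is returned.&lt;/p>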
&lt;h4 id="ensemble-methods">Ensemble Methods&lt;/h4>
&lt;p>Combine different classifiers&lt;/p>
&lt;ul>
&lt;li>classifiers have strengths in different situations&lt;/li>
&lt;li>improve by asking several experts&lt;/li>
&lt;/ul>
&lt;p>Algorithm:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Score the input with several first-order classifiers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Combine results&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Result:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Only best hypothesis (majority vote)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>take decision of most classifiers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>if tie, randomly choose between them&lt;/p>
$$
\hat{S}=\underset{S_i \in \operatorname{Sense}_D(w)}{\operatorname{argmax}}\left|\{j: \operatorname{vote}(C_{j})=S_{i}\}\right|
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Scores for all hypotheses (Probability Mixture)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Normalize the scores of every classifier to get probabilities
&lt;/p>
$$
P_{C_{j}}(S_i)=\frac{\operatorname{score}\left(C_{j}, S_i\right)}{\sum_{k} \operatorname{score}\left(C_{j}, S_k\right)}
$$
&lt;/li>
&lt;li>
&lt;p>Take class with &lt;em>highest&lt;/em> sum of probabilities&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
$$
\hat{S}=\underset{S_i \in \operatorname{Sense}_D(w)}{\operatorname{argmax}}\sum_{j=1}^{m}P_{C_j}(S_i)
$$
&lt;ul>
&lt;li>Ranking of all hypotheses (Rank-based Combination)
$$
\hat{S}=\underset{S_i \in \operatorname{Sense}_D(w)}{\operatorname{argmax}}\sum_{j=1}^{m} -\operatorname{Rank}_{C_j}(S_i)
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
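&lt;p>The three combination schemes can be sketched side by side. The function names and the toy scores are assumptions. Ties in the majority vote are resolved here by first occurrence rather than randomly, which is a simplification of the rule stated above.&lt;/p>

```python
from collections import Counter


def majority_vote(votes):
    """votes: one predicted sense per classifier (ties: first seen wins, a simplification)."""
    return Counter(votes).most_common(1)[0][0]


def probability_mixture(scores):
    """scores: one dict {sense: raw score} per classifier."""
    totals = Counter()
    for classifier_scores in scores:
        norm = sum(classifier_scores.values())
        for sense, score in classifier_scores.items():
            totals[sense] += score / norm  # normalized score acts as a probability
    return max(totals, key=totals.get)


def rank_combination(scores):
    """Sum the negative ranks (best sense has rank 1) and take the argmax."""
    totals = Counter()
    for classifier_scores in scores:
        ranked = sorted(classifier_scores, key=classifier_scores.get, reverse=True)
        for rank, sense in enumerate(ranked, start=1):
            totals[sense] -= rank
    return max(totals, key=totals.get)


# Made-up raw scores of three classifiers for three senses of "bank".
scores = [
    {"shore": 3.0, "finance": 6.0, "seat": 1.0},
    {"shore": 5.0, "finance": 4.0, "seat": 1.0},
    {"shore": 3.0, "finance": 1.5, "seat": 0.5},
]
votes = [max(s, key=s.get) for s in scores]  # best hypothesis of each classifier
```

&lt;p>On these toy scores all three schemes agree on the shore sense, although the first classifier alone prefers the finance sense.&lt;/p>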
&lt;h3 id="semi-supervised">Semi-supervised&lt;/h3>
&lt;p>‼️ &lt;strong>Knowledge acquisition bottleneck&lt;/strong>: &lt;span style="color:red">hard to get large amounts of annotated data&lt;/span>&lt;/p>
&lt;p>💡 &lt;strong>Idea&lt;/strong> of Semi-supervised approaches:&lt;/p>
&lt;ul>
&lt;li>Some initial model trained on small amounts of annotated data&lt;/li>
&lt;li>Improve model using raw data&lt;/li>
&lt;/ul>
&lt;h4 id="bootstrapping">Bootstrapping&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Seed data:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>manually annotated&lt;/p>
&lt;/li>
&lt;li>
&lt;p>surefire decision rules&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Train classifier on annotated data A&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Select subset U’ of unlabeled data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Annotate U’ with classifier&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Filter most reliable examples&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Add examples to A&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Repeat from training&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-05-12%2011.39.03-20200915105745838.png" alt="截屏2020-05-12 11.39.03">&lt;/p>
&lt;/li>
&lt;/ul>
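&lt;p>The bootstrapping loop can be sketched generically. Everything concrete here is an assumption for illustration: the confidence threshold, the number of rounds, and the toy one-dimensional nearest-centroid classifier that stands in for a real sense classifier.&lt;/p>

```python
def bootstrap(labeled, unlabeled, train, predict, threshold=0.8, rounds=5):
    """Grow the annotated set A with confidently classified examples from the raw data."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        model = train(labeled)                  # train classifier on A
        confident, rest = [], []
        for x in unlabeled:                     # annotate the raw data with the classifier
            label, confidence = predict(model, x)
            if confidence >= threshold:         # keep only the most reliable examples
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:                       # nothing reliable left: stop early
            break
        labeled.extend(confident)               # add examples to A and repeat
        unlabeled = rest
    return labeled, unlabeled


def train_centroids(labeled):
    """Toy classifier: one centroid (mean value) per label."""
    sums = {}
    for x, label in labeled:
        total, n = sums.get(label, (0.0, 0))
        sums[label] = (total + x, n + 1)
    return {label: total / n for label, (total, n) in sums.items()}


def predict_centroid(model, x):
    """Return the nearest label and a confidence in [0.5, 1] based on the distance ratio."""
    distances = sorted((abs(x - c), label) for label, c in model.items())
    (d1, label), (d2, _) = distances[0], distances[1]
    confidence = 1.0 if d1 + d2 == 0 else d2 / (d1 + d2)
    return label, confidence


seed = [(0.0, "a"), (10.0, "b")]       # manually annotated seed data
pool = [1.0, 9.0, 4.0, 6.0, 5.0]       # raw, unlabeled data
grown, remaining = bootstrap(seed, pool, train_centroids, predict_centroid)
```

&lt;p>In this toy run only the two points close to a seed example are added; the ambiguous points in the middle never reach the threshold and stay unlabeled.&lt;/p>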
&lt;h4 id="self-training">Self-training&lt;/h4>
&lt;ul>
&lt;li>Always use the same classifier&lt;/li>
&lt;/ul>
&lt;h4 id="co-training">Co-training&lt;/h4>
&lt;ul>
&lt;li>Train classifier 1 (e.g., using local features)&lt;/li>
&lt;li>Annotate $P'$ with classifier 1&lt;/li>
&lt;li>Train classifier 2 (e.g., using topical information) on $P'$ and A&lt;/li>
&lt;li>Annotate $P'_2$ with classifier 2&lt;/li>
&lt;li>Train classifier 1 &amp;hellip;&lt;/li>
&lt;/ul>
&lt;h3 id="unsupervised">Unsupervised&lt;/h3>
&lt;h4 id="-idea">💡 Idea&lt;/h4>
&lt;ul>
&lt;li>If a word is used in similar contexts, its meaning should be similar&lt;/li>
&lt;li>If a word is used in completely different contexts, it has different meanings&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Approach&lt;/strong>: Cluster contexts of words&lt;/p>
&lt;h4 id="context-clustering">Context clustering&lt;/h4>
&lt;p>&lt;strong>Word space model&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Vector space with one dimension per word in the vocabulary&lt;/p>
&lt;/li>
&lt;li>
&lt;p>vector for word $w$:&lt;/p>
&lt;ul>
&lt;li>$j$-th component: number of co-occurrences of $w$ and $w_j$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Similarity&lt;/strong>:
&lt;/p>
$$
\operatorname{sim}(v, w)=\frac{v \cdot w}{|v|\,|w|}=\frac{\displaystyle\sum_{i=1}^{m} v_{i} w_{i}}{\sqrt{\displaystyle\sum_{i=1}^{m} v_{i}^{2} \displaystyle\sum_{i=1}^{m} w_{i}^{2}}}
$$
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;ul>
&lt;li>Dimension: (food, bank)&lt;/li>
&lt;li>restaurant=(210, 80)&lt;/li>
&lt;li>money = (100, 250)&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-12%2011.58.38.png" alt="截屏2020-05-12 11.58.38" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>‼️ Problem:&lt;/p>
&lt;ul>
&lt;li>sparse representation&lt;/li>
&lt;li>remedy: latent semantic analysis (LSA)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
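&lt;p>As a quick check of the similarity formula, the sketch below computes the cosine similarity for the two example vectors given above (dimensions: food, bank). The function name is an assumption.&lt;/p>

```python
import math


def cosine_similarity(v, w):
    """Dot product of v and w divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_v * norm_w)


restaurant = (210, 80)
money = (100, 250)
similarity = cosine_similarity(restaurant, money)
```

&lt;p>For these vectors the similarity is $41000 / \sqrt{50500 \cdot 72500} \approx 0.68$.&lt;/p>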
&lt;p>&lt;strong>Context representation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Second-order vectors: &lt;em>average&lt;/em> of all word vectors in the context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-12%2012.02.09.png" alt="截屏2020-05-12 12.02.09" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Cluster contexts&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Agglomerative clustering
&lt;ul>
&lt;li>Start with one context per cluster&lt;/li>
&lt;li>Merge most similar clusters&lt;/li>
&lt;li>Continue until threshold is reached&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Co-occurrence Graphs&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>HyperLex: Co-occurrence graph for one target ambiguous word $w$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Nodes: all words occurring in a paragraph with $w$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Edges: connect words that occur in the same paragraph&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Weight:
&lt;/p>
$$
\begin{array}{c}
w_{ij}=1-\max \left(P\left(w_{i} | w_{j}\right), P\left(w_{j} | w_{i}\right)\right) \\
P\left(w_{i} | w_{j}\right)=\frac{\operatorname{freq}_{ij}}{\operatorname{freq}_{j}}
\end{array}
$$
&lt;ul>
&lt;li>
&lt;p>Low weight -&amp;gt; High probability of co-occurring&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Discard edges with very high weight&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-12%2012.20.11.png" alt="截屏2020-05-12 12.20.11" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How does HyperLex work?&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Select Hubs (Nodes with highest degree)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Connect the target word to the hubs with weight 0&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Calculate the minimum spanning tree&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Given the target word in a context $W = (w_1, w_2, \dots, w_n)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Calculate a score vector for every context word $w_j$ with components $s_k$ (if hub $h_k$ is an ancestor of $w_j$ in the tree, otherwise $s_k = 0$)
&lt;/p>
$$
s_{k}=\frac{1}{1+d\left(h_{k}, w_{j}\right)}
$$
&lt;/li>
&lt;li>
&lt;p>Sum all vectors and assign to hub with highest sum&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-12%2012.23.12.png" alt="截屏2020-05-12 12.23.12" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
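&lt;p>The two HyperLex formulas above are simple enough to check by hand. The sketch below uses made-up frequency counts; the function names are assumptions.&lt;/p>

```python
def edge_weight(freq_ij, freq_i, freq_j):
    """w_ij = 1 - max(P(w_i | w_j), P(w_j | w_i)) from co-occurrence counts."""
    p_i_given_j = freq_ij / freq_j
    p_j_given_i = freq_ij / freq_i
    return 1 - max(p_i_given_j, p_j_given_i)


def hub_score(distance_to_hub):
    """s_k = 1 / (1 + d(h_k, w_j)) for a word at the given tree distance from hub h_k."""
    return 1 / (1 + distance_to_hub)


# Made-up counts: the two words co-occur 3 times; they occur 10 and 4 times overall.
weight = edge_weight(freq_ij=3, freq_i=10, freq_j=4)
```

&lt;p>Here $P(w_i | w_j) = 3/4$ dominates, so the edge weight is $0.25$: a low weight, meaning the words co-occur with high probability.&lt;/p>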
&lt;h4 id="evaluation">Evaluation&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Hand-annotated data&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Precision&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Recall&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Task&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Lexical sample: only some words need to be disambiguated&lt;/p>
&lt;/li>
&lt;li>
&lt;p>All-words: all words need to be disambiguated&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Baseline&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Random baseline: Randomly choose one class&lt;/p>
&lt;/li>
&lt;li>
&lt;p>First Sense Baseline: Always take most common sense&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Sentiment Analysis</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/02-sentiment-analysis/</link><pubDate>Tue, 15 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/02-sentiment-analysis/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-12%2023.38.49.png" alt="截屏2020-05-12 23.38.49" style="zoom:50%;" />
&lt;h3 id="definition">Definition&lt;/h3>
&lt;p>&lt;strong>Sentiment analysis / opinion mining&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Determine opinion, sentiment and subjectivity in text
&lt;ul>
&lt;li>What is the author&amp;rsquo;s opinion about something?&lt;/li>
&lt;li>What are the pros and cons?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Important task in natural language processing&lt;/li>
&lt;/ul>
&lt;h3 id="application">Application&lt;/h3>
&lt;ul>
&lt;li>Automatically maintain review and opinion-aggregation websites&lt;/li>
&lt;li>Web search targeted towards reviews
&lt;ul>
&lt;li>generate results with variety of opinions&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Improve customer relationship management
&lt;ul>
&lt;li>Automatically analyze customer feedback&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Predict public attitudes towards brand/politics&lt;/li>
&lt;li>Ad placement
&lt;ul>
&lt;li>Advertise products near positive text&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Summarization&lt;/li>
&lt;li>Question-answering&lt;/li>
&lt;/ul>
&lt;h3 id="challenges">Challenges&lt;/h3>
&lt;ul>
&lt;li>Deep understanding&lt;/li>
&lt;li>Co-reference resolution&lt;/li>
&lt;li>Negation handling&lt;/li>
&lt;li>Different hints in the text&lt;/li>
&lt;/ul>
&lt;h2 id="tasks-of-sa">Tasks of SA&lt;/h2>
&lt;ul>
&lt;li>&lt;u>Polarity classification&lt;/u>
&lt;ul>
&lt;li>binary classification: is a text (sentence, document) positive or negative?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Agreement detection
&lt;ul>
&lt;li>Do two texts agree in their opinions?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Rating
&lt;ul>
&lt;li>How does the user rate a product (1 to 5 stars)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Subjectivity detection
&lt;ul>
&lt;li>Is a text or sentence subjective or objective?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Feature/aspect-based sentiment analysis
&lt;ul>
&lt;li>Opinions expressed about different features/aspects&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Viewpoints and perspectives&lt;/li>
&lt;/ul>
&lt;h2 id="polar-classification">Polarity classification&lt;/h2>
&lt;p>&lt;strong>Task&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>Input: Text (Sentence, Document, Several Documents) (variable length)&lt;/li>
&lt;li>Output: positive or negative opinion&lt;/li>
&lt;li>Sequence classification&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Techniques&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#keyword-spotting">Keyword spotting&lt;/a>&lt;/li>
&lt;li>&lt;a href="#lexical-affinity">Lexical affinity&lt;/a>&lt;/li>
&lt;li>&lt;a href="#statistical-methods">Statistical methods&lt;/a>&lt;/li>
&lt;li>&lt;a href="#concept-based-approaches">concept-based approaches&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="keyword-spotting">Keyword spotting&lt;/h3>
&lt;ul>
&lt;li>Classify based on occurrence of unambiguous affect words
&lt;ul>
&lt;li>E.g.: happy, sad, afraid, bored&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>‼️ Problems
&lt;ul>
&lt;li>&lt;span style="color:red">affect-negated words&lt;/span>&lt;/li>
&lt;li>E.g.: “&lt;em>today was a happy day&lt;/em>” vs. “&lt;em>today wasn’t a happy day at all&lt;/em>”&lt;/li>
&lt;li>&lt;span style="color:red">surface features&lt;/span>
&lt;ul>
&lt;li>Often no obvious affect words are present&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
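&lt;p>A deliberately naive keyword spotter makes the negation problem concrete. The word lists and the function name are assumptions chosen for this illustration.&lt;/p>

```python
# Tiny, made-up lists of unambiguous affect words.
POSITIVE = {"happy", "great", "good"}
NEGATIVE = {"sad", "afraid", "bored"}


def keyword_polarity(text):
    """Classify by counting affect words; negation and context are ignored on purpose."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score == 0:
        return "neutral"
    return "negative"


plain = keyword_polarity("today was a happy day")
negated = keyword_polarity("today wasn't a happy day at all")
```

&lt;p>Both sentences come out as positive, because the spotter only sees the word &lt;em>happy&lt;/em> and has no notion of the negation around it.&lt;/p>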
&lt;h3 id="lexical-affinity">Lexical affinity&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Increase the number of considered words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assign “probable” affinity to particular emotions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example: &lt;em>Accident&lt;/em> (75% probability of indicating a negative affect, e.g., a car accident)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Train probabilities from linguistic corpora&lt;/p>
&lt;/li>
&lt;li>
&lt;p>‼️ Problems&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;span style="color:red">negated sentences&lt;/span> (&lt;em>&amp;ldquo;I avoided an accident&amp;rdquo;&lt;/em>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;span style="color:red">Words with different meaning&lt;/span> (&lt;em>&amp;ldquo;I met my girlfriend by accident&amp;rdquo;&lt;/em>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;span style="color:red">Bias towards training data &amp;ndash;&amp;gt; domain-dependent&lt;/span>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="statistical-methods">Statistical methods&lt;/h3>
&lt;p>💡 Use &lt;strong>machine-learning algorithm&lt;/strong> to train classifier&lt;/p>
&lt;ul>
&lt;li>Input:
&lt;ul>
&lt;li>Represent input text as a &lt;a href="#features">feature&lt;/a> vector
&lt;ul>
&lt;li>&lt;strong>Feature selection&lt;/strong> important for classification performance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Classifier:
&lt;ul>
&lt;li>Naive Bayes&lt;/li>
&lt;li>Support Vector Machines&lt;/li>
&lt;li>Maximum-entropy-based classification&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="features">Features&lt;/h4>
&lt;ul>
&lt;li>Word representation&lt;/li>
&lt;li>Position information&lt;/li>
&lt;li>POS information&lt;/li>
&lt;li>Syntax: Tree-based features&lt;/li>
&lt;/ul>
&lt;h5 id="feeatures-negation">Feature negation&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Negation should invert the features of the sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approaches:&lt;/p>
&lt;ul>
&lt;li>Attach &lt;strong>NOT&lt;/strong> to all words near a negation&lt;/li>
&lt;/ul>
&lt;p>However&lt;/p>
&lt;ul>
&lt;li>Not all negations reverse the meaning
&lt;ul>
&lt;li>&lt;em>“No wonder this is considered one of the best”&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Negation often does not use a keyword
&lt;ul>
&lt;li>&lt;em>“it avoids all clichés and predictability found in Hollywood movies”&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
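&lt;p>One possible reading of &amp;ldquo;attach NOT to all words near a negation&amp;rdquo; is sketched below. The scope rule (mark every token after a negation word until the next punctuation mark), the word lists and the &lt;code>NOT_&lt;/code> prefix are assumptions for this illustration.&lt;/p>

```python
NEGATIONS = {"not", "no", "never", "n't"}
PUNCTUATION = {".", ",", "!", "?", ";", ":"}


def mark_negation(tokens):
    """Prefix tokens that follow a negation word with NOT_ until the next punctuation."""
    marked, negated = [], False
    for token in tokens:
        if token in PUNCTUATION:
            negated = False                      # punctuation ends the negation scope
            marked.append(token)
        elif token.lower() in NEGATIONS or token.lower().endswith("n't"):
            negated = True                       # start of a negation scope
            marked.append(token)
        else:
            marked.append("NOT_" + token if negated else token)
    return marked


tokens = "I did n't like this movie , but the music was good".split()
marked = mark_negation(tokens)
```

&lt;p>The classifier then sees &lt;code>NOT_like&lt;/code> as a feature distinct from &lt;code>like&lt;/code>. As the two counterexamples above show, such a keyword-based rule cannot handle negations that do not reverse the meaning or that use no negation keyword.&lt;/p>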
&lt;h5 id="topic-oriented-features">Topic-oriented features&lt;/h5>
&lt;ul>
&lt;li>The opinion of a sentence depends on the topic of the article&lt;/li>
&lt;li>Approach: Replace subject of the article by general term&lt;/li>
&lt;/ul>
&lt;h4 id="domain-adaptation">Domain adaptation&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Meaning depends on the domain&lt;/strong>&lt;/li>
&lt;li>Different approaches to transfer knowledge from one domain to another
&lt;ul>
&lt;li>Search domain-independent features&lt;/li>
&lt;li>Structural correspondence learning algorithm&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="unsupervised-approaches">Unsupervised approaches&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Unsupervised lexicon induction&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Find adjectives using linguistic heuristics&lt;/p>
&lt;ul>
&lt;li>words that co-occur with “but”
&lt;ul>
&lt;li>&lt;em>elegant but over-priced&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;ul>
&lt;li>words that co-occur with “and”
&lt;ul>
&lt;li>&lt;em>clever and informative&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Build graph&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cluster or build binary-partition&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assign polarity using some seed words&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="relation-identification">Relation identification&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Sentence relationship&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Objective and subjective sentence in a review&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No random order&lt;/p>
&lt;ul>
&lt;li>A subjective sentence is most likely followed by another subjective sentence&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>First cluster the sentences into objective and subjective&lt;/p>
&lt;ul>
&lt;li>Use labels of the surrounding sentences&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Then use &lt;strong>only the subjective sentences&lt;/strong> to classify the polarity of the review&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Order of sentences is important&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>End is more important than beginning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Use trajectory of local sentiments&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Dialog structure&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Class structure&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>One-vs.-all multi-class categorization&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Model as Metric labeling problem&lt;/strong>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="-problems-of-statistical-methods">‼️ Problems of statistical methods&lt;/h4>
&lt;ul>
&lt;li>Need enough text to perform classification&lt;/li>
&lt;li>Good performance on page and paragraph level&lt;/li>
&lt;li>Problems on sentence or clause level&lt;/li>
&lt;/ul>
&lt;h3 id="concept-based-approaches">Concept-based approaches&lt;/h3>
&lt;ul>
&lt;li>Perform semantic text analysis
&lt;ul>
&lt;li>Resources:
&lt;ul>
&lt;li>Web ontologies&lt;/li>
&lt;li>Semantic networks&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>try to recognize meaning/features&lt;/li>
&lt;li>heavily rely on depth and breadth of knowledge base&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="opinion-summarization">Opinion summarization&lt;/h2>
&lt;h3 id="opinion-oriented-extraction">&lt;strong>Opinion-oriented extraction&lt;/strong>&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Example: &lt;em>&amp;ldquo;What is best about the new iPhone?&amp;rdquo;&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approach&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Extract product features&lt;/p>
&lt;ul>
&lt;li>nouns / frequent nouns&lt;/li>
&lt;li>heuristic pruning&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>extract opinions associated with these features&lt;/p>
&lt;ul>
&lt;li>sometimes also extract the opinion holder&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="what-is-opinion-summarization">What is opinion summarization?&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Generate summary of large number of opinions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Aggregate results of sentiment prediction&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Structured summaries:&lt;/p>
&lt;ul>
&lt;li>breakdown by aspects/topics&lt;/li>
&lt;li>text or visualization&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Conceptual Framework&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#aspect-based">aspect-based summarization&lt;/a>&lt;/li>
&lt;li>&lt;a href="#non-aspect-based-opinion-summarization">non-aspect-based summarization&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="aspect-based">Aspect-based&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>💡 &lt;strong>Divide input text into aspects/features/subtopics&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;em>E.g.: Review on iPod&lt;/em>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;em>battery life&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>design&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>price&lt;/em>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Show structured details&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="how-aspect-based-opinion-summarization-works">How does aspect-based opinion summarization work?&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Framework&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>&lt;a href="#aspectfeature-identification">aspect/feature identification&lt;/a>&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>find important topics&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>&lt;a href="#sentiment-prediction">sentiment prediction&lt;/a>&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>determine the sentiment orientation&lt;/li>
&lt;li>is the aspect judged positive/negative?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>&lt;a href="#summary-generation">summary generation&lt;/a>&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>present results&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-13%2011.34.26.png" alt="截屏2020-05-13 11.34.26" style="zoom:67%;" />
&lt;h4 id="aspectfeature-identification">&lt;strong>Aspect/Feature Identification&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>Find subtopics (In some cases already known)&lt;/li>
&lt;li>Techniques
&lt;ul>
&lt;li>NLP-based approaches using POS-tagging /parse trees&lt;/li>
&lt;li>Shallow parsing&lt;/li>
&lt;li>use additional knowledge&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="sentiment-prediction">&lt;strong>Sentiment prediction&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Predict sentiment for the different aspects&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Learning approach:&lt;/p>
&lt;ul>
&lt;li>Learn aspect level ratings using the global rating&lt;/li>
&lt;li>Naive Bayes classifier&lt;/li>
&lt;li>‼️ Problem: labeling examples is expensive&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Most approaches use &lt;strong>lexicon/rule-based&lt;/strong> methods&lt;/p>
&lt;ul>
&lt;li>&lt;em>e.g., a list of positive and negative words&lt;/em> (extended with WordNet)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="summary-generation">&lt;strong>Summary Generation&lt;/strong>&lt;/h4>
&lt;p>Generate and present the opinion summaries&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Statistical Summary&lt;/p>
&lt;ul>
&lt;li>show statistics about the opinions on different aspects&lt;/li>
&lt;li>directly use sentiment prediction output&lt;/li>
&lt;li>easy to understand&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Text selection&lt;/p>
&lt;ul>
&lt;li>show small pieces of text as the summary&lt;/li>
&lt;li>show strongest opinion words for every aspect&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Aggregate Ratings&lt;/p>
&lt;ul>
&lt;li>Show statistics and text selection&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Summary with timeline&lt;/p>
&lt;ul>
&lt;li>Show opinion trends over a timeline&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-15%2019.05.16.png" alt="截屏2020-09-15 19.05.16">&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-15%2018.55.28.png" alt="截屏2020-09-15 18.55.28">&lt;/p>
&lt;h4 id="integrated-approaches">Integrated Approaches&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>No clear separation of the different steps&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Topic Sentiment Mixture Model&lt;/p>
&lt;ul>
&lt;li>unsupervised approach&lt;/li>
&lt;li>sentiment prediction and aspect identification in &lt;strong>one&lt;/strong> step&lt;/li>
&lt;li>Model: &lt;strong>Probabilistic latent semantic analysis (PLSA)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="multi-task-learning">&lt;strong>Multi-task learning&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>CNN-based approach&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>C predefined aspect mappers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sentiment classifiers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>shared word embedding layer&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-13%2011.42.47.png" alt="截屏2020-05-13 11.42.47" style="zoom:50%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>LSTM with attention&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Input: word embedding and aspect embedding&lt;/li>
&lt;li>Relevant parts of the sentence identified through &lt;strong>attention&lt;/strong> mechanism&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-13%2011.43.52.png" alt="截屏2020-05-13 11.43.52" style="zoom:50%;" />
&lt;/li>
&lt;/ul>
&lt;h3 id="non-aspect-based-opinion-summarization">&lt;strong>Non-aspect-based opinion summarization&lt;/strong>&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Basic Sentiment Summarization&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>Classify each input text separately&lt;/li>
&lt;li>Count number of positive and negative opinions&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Text Summarization&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Opinion Integration:&lt;/p>
&lt;ul>
&lt;li>Expert opinions: complete, but rarely updated&lt;/li>
&lt;li>Ordinary opinions: unstructured, but updated more often&lt;/li>
&lt;li>Combine both by first extracting information from expert opinions&lt;/li>
&lt;li>Add information from the ordinary opinions&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Contrastive Opinion Summarization&lt;/p>
&lt;ul>
&lt;li>Show positive and negative aspects&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Abstractive Text summarization&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Part-of-Speech Tagging</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/03-part-of-speech-tagging/</link><pubDate>Tue, 15 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/03-part-of-speech-tagging/</guid><description>&lt;h2 id="part-of-speech-tagging">Part-of-Speech Tagging&lt;/h2>
&lt;h3 id="what-is-part-of-speech-tagging">What is Part-of-Speech Tagging?&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-15%2019.18.00.png" alt="截屏2020-09-15 19.18.00">&lt;/p>
&lt;p>Part-of-Speech tagging:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Grammatical tagging&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Word-category disambiguation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Task: Marking up a word in a text as corresponding to a particular part of speech&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>based on definition and context&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Word level&lt;/strong> task: Assign one class to every word&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Variations:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>English schools: 9 POS&lt;/p>
&lt;ul>
&lt;li>noun, verb, article, adjective, preposition, pronoun, adverb, conjunction, and interjection.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>POS-tagger: 50 – 150 classes&lt;/p>
&lt;ul>
&lt;li>Plural, singular&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>POS + Morph tags:&lt;/p>
&lt;ul>
&lt;li>More than 600&lt;/li>
&lt;li>Gender, case, &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="data-sources">Data sources&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Brown corpus&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Penn Tree Bank&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Tiger Treebank&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="-problems">🔴 Problems&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Ambiguities&lt;/strong>
&lt;ul>
&lt;li>E.g.: &amp;ldquo;&lt;em>A &lt;u>can&lt;/u> of beans&lt;/em>&amp;rdquo; vs. &amp;ldquo;&lt;em>We &lt;u>can&lt;/u> do it&lt;/em>&amp;rdquo;&lt;/li>
&lt;li>Many content words in English can have more than 1 POS tag&lt;/li>
&lt;li>E.g.: &lt;em>play&lt;/em>, &lt;em>flour&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Data sparseness&lt;/strong>: What to do with rare words?&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Disambiguate using context information&lt;/strong> &amp;#x1f4aa;&lt;/p>
&lt;h3 id="example-applications">Example applications&lt;/h3>
&lt;ul>
&lt;li>Information extraction&lt;/li>
&lt;li>QA&lt;/li>
&lt;li>Shallow parsing&lt;/li>
&lt;li>Machine Translation&lt;/li>
&lt;/ul>
&lt;h2 id="how-to-do-pos-tagging">How to do POS Tagging?&lt;/h2>
&lt;h3 id="rule-based">Rule-based&lt;/h3>
&lt;blockquote>
&lt;p>Rule-based taggers use dictionary or lexicon for getting possible tags for tagging each word. If the word has more than one possible tag, then rule-based taggers use hand-written rules to identify the correct tag. Disambiguation can also be performed in rule-based tagging by analyzing the linguistic features of a word along with its preceding as well as following words. For example, suppose if the preceding word of a word is article then word must be a noun.&lt;/p>
&lt;/blockquote>
&lt;h4 id="design-rules-to-assign-pos-tags-to-words">Design rules to assign POS tags to words&lt;/h4>
&lt;p>How can one decide on the right POS tag used in a context?&lt;/p>
&lt;p>Two sources of information:&lt;/p>
&lt;ul>
&lt;li>Tags of other words in the context of the word we are interested in&lt;/li>
&lt;li>knowing the word itself gives a lot of information about the correct tag&lt;/li>
&lt;/ul>
&lt;h5 id="syntagmatic-approach">Syntagmatic approach&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>most obvious source of information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>With rule-based approach only 77% tagged correctly 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Should &lt;em>play&lt;/em> get an &lt;code>NN&lt;/code> or &lt;code>VBP&lt;/code> tag?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Take the more common POS tag sequence for phrase &lt;em>a new play&lt;/em>:&lt;/p>
&lt;p>&lt;code>AT&lt;/code> &lt;code>JJ&lt;/code> &lt;code>NN&lt;/code> vs. &lt;code>AT&lt;/code> &lt;code>JJ&lt;/code> &lt;code>VBP&lt;/code>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="lexical-information">Lexical information&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>assign &lt;strong>the most common tag&lt;/strong> to a word&lt;/p>
&lt;/li>
&lt;li>
&lt;p>90% correct !!! (favorable conditions)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>This works so well because the distribution of a word&amp;rsquo;s usages across different POS is typically extremely uneven → most words usually occur with a single POS&lt;/p>
&lt;/li>
&lt;/ul>
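&lt;p>The lexical baseline (assign every word its most common tag) fits in a few lines. The tiny tagged corpus, the function names and the fallback tag &lt;code>NN&lt;/code> for unknown words are assumptions for this illustration.&lt;/p>

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_corpus):
    """Map every word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, lexicon, default="NN"):
    """Tag each word with its most frequent tag; unknown words get the default tag."""
    return [(w, lexicon.get(w.lower(), default)) for w in words]


# Made-up training data: "play" occurs twice as NN and once as VBP.
corpus = [
    ("a", "AT"), ("new", "JJ"), ("play", "NN"),
    ("they", "PPSS"), ("play", "VBP"),
    ("the", "AT"), ("play", "NN"),
]
lexicon = train_most_frequent_tag(corpus)
tagged = tag(["a", "new", "play"], lexicon)
```

&lt;p>The baseline tags &lt;em>play&lt;/em> as &lt;code>NN&lt;/code> everywhere, which is right for &lt;em>a new play&lt;/em> but wrong for &lt;em>they play&lt;/em>; this is where syntagmatic information has to help.&lt;/p>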
&lt;p>All modern taggers use a combination of syntagmatic and lexical information.&lt;/p>
&lt;p>Statistical approaches should work well on POS tagging, assuming a word has different POS tags according to certain &lt;em>a priori&lt;/em> probabilities.&lt;/p>
&lt;h4 id="brill-tagger">Brill-Tagger&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Developed by Eric Brill in 1995&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Algorithm&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialize:&lt;/p>
&lt;ul>
&lt;li>Every word gets most frequent POS&lt;/li>
&lt;li>Unknown: Noun&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Until no longer possible&lt;/p>
&lt;ul>
&lt;li>Apply rules&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Rules&lt;/p>
&lt;ul>
&lt;li>Linguistically motivated&lt;/li>
&lt;li>Machine learning algorithms&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
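&lt;p>A minimal sketch of the transformation step, assuming rules of the form &amp;ldquo;change tag A to tag B if the previous tag is C&amp;rdquo;. The lexicon, the single rule and the function names are illustrative assumptions; a real Brill tagger learns its rules from tagged data and supports richer rule templates.&lt;/p>

```python
def initial_tags(words, lexicon):
    """Every known word gets its most frequent tag; unknown words are tagged as nouns."""
    return [lexicon.get(w.lower(), "NN") for w in words]


def apply_rules(tags, rules):
    """rules: (from_tag, to_tag, previous_tag) triples, applied until nothing changes.

    Assumes the rules do not undo each other, otherwise the loop would not terminate.
    """
    tags = list(tags)
    changed = True
    while changed:
        changed = False
        for from_tag, to_tag, previous_tag in rules:
            for i in range(1, len(tags)):
                if tags[i] == from_tag and tags[i - 1] == previous_tag:
                    tags[i] = to_tag
                    changed = True
    return tags


# Hypothetical lexicon and one hand-written rule: NN becomes VB after TO.
lexicon = {"we": "PPSS", "want": "VB", "to": "TO", "play": "NN"}
rules = [("NN", "VB", "TO")]
words = "we want to play".split()
start = initial_tags(words, lexicon)
final = apply_rules(start, rules)
```

&lt;p>The initial tagging labels &lt;em>play&lt;/em> as a noun; the rule then corrects it to a verb because it follows &lt;em>to&lt;/em>.&lt;/p>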
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Brill_tagger">Wiki&lt;/a>:&lt;/p>
&lt;p>The &lt;strong>Brill tagger&lt;/strong> is an inductive method for &lt;a href="https://en.wikipedia.org/wiki/Part-of-speech_tagging">part-of-speech tagging&lt;/a>. It can be summarized as an &amp;ldquo;error-driven transformation-based tagger&amp;rdquo;.&lt;/p>
&lt;p>It is:&lt;/p>
&lt;ul>
&lt;li>a form of &lt;a href="https://en.wikipedia.org/wiki/Supervised_learning">supervised learning&lt;/a>, which aims to minimize error; and,&lt;/li>
&lt;li>a transformation-based process, in the sense that a tag is assigned to each word and changed using a set of predefined rules.&lt;/li>
&lt;/ul>
&lt;p>In the transformation process,&lt;/p>
&lt;ul>
&lt;li>if the word is known, it first assigns the most frequent tag,&lt;/li>
&lt;li>if the word is unknown, it naively assigns the tag &amp;ldquo;noun&amp;rdquo; to it.&lt;/li>
&lt;/ul>
&lt;p>Applying over and over these rules, changing the incorrect tags, a quite high accuracy is achieved.&lt;/p>
&lt;/blockquote>
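&lt;p>A minimal sketch of the tagging loop described above, assuming rules of the simple form &amp;ldquo;change tag A to tag B if the previous tag is C&amp;rdquo;. The function &lt;code>brill_tag&lt;/code>, the lexicon and the single rule are hypothetical; a real Brill tagger learns its rule list from an annotated corpus and supports more rule templates.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def brill_tag(words, lexicon, rules):
    """Tag words with a most-frequent-tag lexicon, then apply transformation rules."""
    # Initialization: most frequent tag for known words, noun for unknown words.
    tags = [lexicon.get(w, "NN") for w in words]
    changed = True
    while changed:  # repeat until no rule changes any tag (assumes non-cyclic rules)
        changed = False
        for from_tag, to_tag, prev_tag in rules:
            for i in range(1, len(tags)):
                if tags[i] == from_tag and tags[i - 1] == prev_tag:
                    tags[i] = to_tag
                    changed = True
    return tags

# Hypothetical lexicon and one rule: change NN to VB when the previous tag is TO.
lexicon = {"to": "TO", "play": "NN", "a": "AT", "new": "JJ"}
rules = [("NN", "VB", "TO")]
result = brill_tag(["to", "play", "a", "new", "game"], lexicon, rules)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Here &lt;em>play&lt;/em> starts as &lt;code>NN&lt;/code> (its most frequent tag) and is corrected to &lt;code>VB&lt;/code> by the rule, while the unknown word &lt;em>game&lt;/em> keeps the default noun tag.&lt;/p>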
&lt;h3 id="statistical">Statistical&lt;/h3>
&lt;p>Probabilistic tagging: model POS tagging as a &lt;strong>sequence labeling&lt;/strong> problem&lt;/p>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Sequence_labeling">Wiki&lt;/a>:&lt;/p>
&lt;p>In &lt;a href="https://en.wikipedia.org/wiki/Machine_learning">machine learning&lt;/a>, &lt;strong>sequence labeling&lt;/strong> is a type of &lt;a href="https://en.wikipedia.org/wiki/Pattern_recognition">pattern recognition&lt;/a> task that involves the algorithmic assignment of a &lt;a href="https://en.wikipedia.org/wiki/Categorical_data">categorical&lt;/a> label to each member of a sequence of observed values.&lt;/p>
&lt;p>A common example of a sequence labeling task is &lt;a href="https://en.wikipedia.org/wiki/Part_of_speech_tagging">part of speech tagging&lt;/a>, which seeks to assign a &lt;a href="https://en.wikipedia.org/wiki/Part_of_speech">part of speech&lt;/a> to each word in an input sentence or document. Sequence labeling can be treated as a set of independent &lt;a href="https://en.wikipedia.org/wiki/Classification_(machine_learning)">classification&lt;/a> tasks, one per member of the sequence. However, accuracy is generally improved by making the optimal label for a given element dependent on the choices of nearby elements, using special algorithms to choose the &lt;em>globally&lt;/em> best set of labels for the entire sequence at once.&lt;/p>
&lt;/blockquote>
&lt;ul>
&lt;li>
&lt;p>Sequence labeling&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Input: sequence $x\_1, \dots, x\_n$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Output: Sequence $y\_1, \dots, y\_n$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-15%2019.46.19.png" alt="截屏2020-09-15 19.46.19" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Model as Machine Learning Problem&lt;/p>
&lt;ul>
&lt;li>
&lt;p>💡 Classify each token independently, but use information about the surrounding tokens (sliding window) as input features.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Training data&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Labeled sequences $\left\\{\left(x^{1}, y^{1}\right),\left(x^{2}, y^{2}\right), \ldots,\left(x^{M}, y^{M}\right)\right\\}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Learn model: $X \to Y$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Problem: &lt;em>Exponential&lt;/em> number of solutions!!!&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Number of solutions: $\text{#Classes}^{\text{#Words}}$&lt;/p>
&lt;p>-&amp;gt; Can NOT directly model $P(y|x)$ or $P(x, y)$ 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-21%2015.17.11.png" alt="截屏2020-05-21 15.17.11" style="zoom: 67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
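&lt;p>To get a feeling for the size of this search space, assume (purely for illustration) a tagset with 45 classes and a sentence of only 10 words:&lt;/p>
$$
\text{#Classes}^{\text{#Words}} = 45^{10} \approx 3.4 \times 10^{16}
$$
&lt;p>Enumerating every candidate label sequence is therefore infeasible even for short sentences, which is why the models below factorize the probability into local terms.&lt;/p>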
&lt;p>Any model that uses frequencies or probabilities (statistics) can be called &lt;strong>stochastic&lt;/strong>. Accordingly, many different approaches to part-of-speech tagging are referred to as &lt;strong>stochastic taggers&lt;/strong>.&lt;/p>
&lt;h4 id="decision-trees">Decision Trees&lt;/h4>
&lt;p>Automatically learn which questions to ask&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-25%2015.18.08.png" alt="截屏2020-05-25 15.18.08" style="zoom:80%;" />
&lt;p>Probabilistic tagging&lt;/p>
&lt;ul>
&lt;li>Define probability for tag sequence recursively&lt;/li>
&lt;li>Using two models
&lt;ul>
&lt;li>$P(t\_n | t\_{n-1}, t\_{n-2})$: model using decision tree&lt;/li>
&lt;li>$P(w\_n | t\_n)$
&lt;ul>
&lt;li>Lexicon&lt;/li>
&lt;li>Suffix lexicon for unknown words
&lt;ul>
&lt;li>Decides which POS tag is attached to unknown words&lt;/li>
&lt;li>Depending on the word ending, some POS tags are more probable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
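&lt;p>One way to write out the recursive definition, assuming the two component models above are simply multiplied at each position (a sketch, not necessarily the exact formulation used in the lecture):&lt;/p>
$$
P(t\_1, \dots, t\_n, w\_1, \dots, w\_n) = P(t\_1, \dots, t\_{n-1}, w\_1, \dots, w\_{n-1}) \cdot P(t\_n | t\_{n-1}, t\_{n-2}) \cdot P(w\_n | t\_n)
$$
&lt;p>Unrolling the recursion gives a product of one transition term (from the decision tree) and one lexical term (from the lexicon or suffix lexicon) per word.&lt;/p>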
&lt;h4 id="condition-random-fields-crfs">Conditional Random Fields (CRFs)&lt;/h4>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Conditional_random_field">Wiki&lt;/a>:&lt;/p>
&lt;p>&lt;strong>Conditional random fields&lt;/strong> (&lt;strong>CRFs&lt;/strong>) are a class of &lt;a href="https://en.wikipedia.org/wiki/Statistical_model">statistical modeling method&lt;/a> often applied in &lt;a href="https://en.wikipedia.org/wiki/Pattern_recognition">pattern recognition&lt;/a> and &lt;a href="https://en.wikipedia.org/wiki/Machine_learning">machine learning&lt;/a> and used for &lt;a href="https://en.wikipedia.org/wiki/Structured_prediction">structured prediction&lt;/a>. Whereas a &lt;a href="https://en.wikipedia.org/wiki/Statistical_classification">classifier&lt;/a> predicts a label for a single sample without considering &amp;ldquo;neighboring&amp;rdquo; samples, a CRF can &lt;strong>take context into account&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;p>&lt;strong>Hidden Markov Model (HMM)&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>Hidden states: POS&lt;/li>
&lt;li>Output: Words&lt;/li>
&lt;li>Task: Estimate state sequence from output&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Generative model&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Assign a joint probability $P(x, y)$ to paired observation and label sequences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Problem when modeling $P(x)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Introduce highly dependent features&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example: Word, Capitalization, Suffix, Prefix&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Possible solutions:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Model dependencies&lt;/p>
&lt;ul>
&lt;li>How does the capitalization depend on the suffix?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Independence assumption&lt;/p>
&lt;ul>
&lt;li>Hurts performance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Discriminative Model&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Directly model $P(y|x)$&lt;/li>
&lt;li>No model for $P(x)$ is involved
&lt;ul>
&lt;li>Not needed for classification since $x$ is observed&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="linear-chain-conditional-random-fields">&lt;strong>Linear Chain Conditional Random Fields&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>$x$: random variable (Representing the input)&lt;/li>
&lt;li>$y$: random variable (POS tags)&lt;/li>
&lt;li>$\theta$: Parameter&lt;/li>
&lt;li>$f(y, y', x)$: feature function&lt;/li>
&lt;/ul>
&lt;p>Model:
&lt;/p>
$$
p(\mathbf{y} | \mathbf{x})=\frac{1}{Z(\mathbf{x})} \prod\_{t=1}^{T} \exp \left\\{\sum\_{k=1}^{K} \theta\_{k} f\_{k}\left(y\_{t}, y\_{t-1}, \mathbf{x}\_{t}\right)\right\\}
$$
$$
Z(\mathbf{x})=\sum\_{\mathbf{y}} \prod\_{t=1}^{T} \exp \left\\{\sum\_{k=1}^{K} \theta\_{k} f\_{k}\left(y\_{t}, y\_{t-1}, \mathbf{x}\_{t}\right)\right\\}
$$
&lt;h5 id="feature-functions">Feature functions&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2009.50.04.png" alt="截屏2020-05-26 09.50.04" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>First-order dependencies&lt;/p>
&lt;ul>
&lt;li>$\mathbf{1}(y'=\text{DET}, y=\text{NN})$&lt;/li>
&lt;li>$\mathbf{1}(y'=\text{DET}, y=\text{VB})$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Lexical: $\mathbf{1}(y=\text{DET}, x=\text{"the"})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Lexical with context: $\mathbf{1}(y=\text{NN}, x=\text{"can"}, \operatorname{pre}(x)=\text{"the"})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Additional features: $\mathbf{1}(y=\text{NN}, \operatorname{cap}(x)=true)$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h5 id="inference">Inference&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Task: Get the &lt;strong>most probable&lt;/strong> POS sequence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Problem: Exponential number of label sequences 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Linear-chain layout&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Dynamic programming can be used&lt;/p>
&lt;p>$\rightarrow$ Efficient computing&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
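&lt;p>The dynamic program usually used for this is the Viterbi algorithm. Below is a minimal sketch for a linear-chain model. The function names &lt;code>viterbi&lt;/code> and &lt;code>toy_score&lt;/code> and all weights are assumptions for illustration; &lt;code>score(prev, cur, t)&lt;/code> stands for the weighted feature sum $\sum\_k \theta\_k f\_k$ at position $t$.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def viterbi(n_steps, labels, score):
    """Return the highest-scoring label sequence for a linear-chain model.

    score(prev_label, label, t) is the log-potential at position t;
    prev_label is None at t = 0.
    """
    # best[t][y] = (score of the best path ending in y at position t, backpointer)
    best = [{y: (score(None, y, 0), None) for y in labels}]
    for t in range(1, n_steps):
        column = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[t - 1][p][0] + score(p, y, t))
            column[y] = (best[t - 1][prev][0] + score(prev, y, t), prev)
        best.append(column)
    # Follow the backpointers from the best final label.
    y = max(labels, key=lambda label: best[-1][label][0])
    path = [y]
    for t in range(n_steps - 1, 0, -1):
        y = best[t][y][1]
        path.append(y)
    return path[::-1]

words = ["the", "can"]

def toy_score(prev, cur, t):
    """Hypothetical weighted features for the two-word example."""
    s = 0.0
    if cur == "DET" and words[t] == "the":
        s += 2.0  # lexical feature
    if prev == "DET" and cur == "NN":
        s += 1.0  # first-order feature
    if cur == "VB" and words[t] == "can":
        s += 0.5  # competing lexical feature
    return s

path = viterbi(len(words), ["DET", "NN", "VB"], toy_score)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>The table has one entry per position and label, and each entry looks at every possible previous label, so the cost grows linearly with the sentence length instead of exponentially.&lt;/p>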
&lt;h5 id="training">Training&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Task: How to find the best weight $\theta$ ?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>💡 &lt;strong>Maximum (Log-)Likelihood estimation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Maximize probability of the training data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Given: $M$ labeled sequences $(x^1, y^1), \dots, (x^M, y^M)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Maximize
&lt;/p>
$$
l(\theta)=\sum\_{k=1}^{M} \log \left(P\left(y^{k} | x^{k}, \theta\right)\right)
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Regularization&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Prevent overfitting by preferring smaller weights
&lt;/p>
$$
\sum\_{k=1}^{M} \log \left(P\left(y^{k} | x^{k}, \theta\right)\right)-\frac{1}{2} C\|\theta\|^{2}
$$
&lt;/li>
&lt;li>
&lt;p>The penalized log-likelihood is concave (equivalently, its negative is convex)&lt;/p>
&lt;p>$\Rightarrow$ Can use gradient-based optimization to find the global optimum 👏&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
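&lt;p>For reference, the gradient that such an optimizer follows has a well-known form. The index $j$ for the feature and the shorthand $F\_{j}(\mathbf{y}, \mathbf{x})=\sum\_{t} f\_{j}(y\_{t}, y\_{t-1}, \mathbf{x}\_{t})$ are notation introduced here, not taken from the lecture:&lt;/p>
$$
\frac{\partial}{\partial \theta\_{j}}\left[\sum\_{k=1}^{M} \log P\left(y^{k} | x^{k}, \theta\right)-\frac{1}{2} C\|\theta\|^{2}\right]=\sum\_{k=1}^{M}\left(F\_{j}\left(y^{k}, x^{k}\right)-\mathbb{E}\_{\mathbf{y} \sim P\left(\mathbf{y} | x^{k}, \theta\right)}\left[F\_{j}\left(\mathbf{y}, x^{k}\right)\right]\right)-C \theta\_{j}
$$
&lt;p>In words: observed feature counts minus the feature counts expected under the current model, minus the regularization term. The expectation can be computed efficiently with dynamic programming because of the linear-chain structure.&lt;/p>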
&lt;h4 id="neural-network">Neural Network&lt;/h4>
&lt;p>🔴 &lt;strong>Data sparseness Problem&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Many words are rarely seen in training $\Rightarrow$ Hard to estimate probabilities 🤪&lt;/li>
&lt;li>CRFs:
&lt;ul>
&lt;li>Use many features to represent the word&lt;/li>
&lt;li>&lt;span style="color:red">Problem: A lot of engineering!&lt;/span>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="neural-networks">Neural networks&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Able to learn hidden representation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Learn representation of words based on letters, E.g.:&lt;/p>
&lt;ul>
&lt;li>Words ending in &lt;em>-ness&lt;/em> will be &lt;code>noun&lt;/code>s&lt;/li>
&lt;li>Words ending in &lt;em>-phobia&lt;/em> will be &lt;code>noun&lt;/code>s&lt;/li>
&lt;li>Words ending in &lt;em>-ly&lt;/em> are often &lt;code>adverb&lt;/code>s&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="structure">Structure&lt;/h5>
&lt;ul>
&lt;li>First layer: Word representation
&lt;ul>
&lt;li>
&lt;p>CNN&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Learn mapping: Word $\to$ continuous vector&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-15%2020.01.55.png" alt="截屏2020-09-15 20.01.55" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-15%2020.02.18.png" alt="截屏2020-09-15 20.02.18" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2010.25.49.png" alt="截屏2020-05-26 10.25.49" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2010.25.57.png" alt="截屏2020-05-26 10.25.57" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Second layer:&lt;/p>
&lt;ul>
&lt;li>Use several words to predict POS tag&lt;/li>
&lt;li>Feed forward net&lt;/li>
&lt;li>RNN: contains the complete history&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2010.29.01.png" alt="截屏2020-05-26 10.29.01" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Training&lt;/strong>&lt;/p>
&lt;p>Train both layers together using &lt;strong>backpropagation&lt;/strong>&lt;/p></description></item><item><title>Named Entity Recognition</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/04-name-entity-recognition/</link><pubDate>Tue, 15 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/04-name-entity-recognition/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;h3 id="definition">Definition&lt;/h3>
&lt;p>&lt;strong>Named Entity&lt;/strong>: some entity represented by a name&lt;/p>
&lt;p>&lt;strong>Named Entity Recognition&lt;/strong>: Find and classify named entities in text&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2018.13.30.png" alt="截屏2020-05-26 18.13.30" style="zoom: 40%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2018.13.49.png" alt="截屏2020-05-26 18.13.49" style="zoom:40%;" />
&lt;h3 id="why-useful">Why useful?&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Create indices &amp;amp; hyperlinks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Information extraction&lt;/p>
&lt;ul>
&lt;li>Establish relationships between named entities, build knowledge base&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Question answering: answers are often NEs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Machine translation: NEs require special care&lt;/p>
&lt;ul>
&lt;li>NEs are often unknown words and are usually passed through without translation.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="why-difficult">Why difficult?&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>World knowledge&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Non-local decisions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Domain specificity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Labeled data is very expensive&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="label-representation">Label Representation&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>IO&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>I&lt;/strong>: Inside&lt;/li>
&lt;li>&lt;strong>O&lt;/strong>: Outside (indicates that a token belongs to no chunk)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>BIO&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>B&lt;/strong>: Begin&lt;/p>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)">Wiki&lt;/a>&lt;/p>
&lt;p>The &lt;strong>IOB format&lt;/strong> (short for inside, outside, beginning), a synonym for &lt;strong>BIO format&lt;/strong>, is a common tagging format for tagging &lt;a href="https://en.wikipedia.org/wiki/Lexical_token">tokens&lt;/a> in a &lt;a href="https://en.wikipedia.org/wiki/Chunking_(computational_linguistics)">chunking&lt;/a> task in &lt;a href="https://en.wikipedia.org/wiki/Computational_linguistics">computational linguistics&lt;/a>.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>B&lt;/strong>-prefix before a tag: indicates that the tag is the &lt;strong>beginning&lt;/strong> of a chunk&lt;/li>
&lt;li>&lt;strong>I&lt;/strong>-prefix before a tag: indicates that the tag is the &lt;strong>inside&lt;/strong> of a chunk&lt;/li>
&lt;li>&lt;strong>O&lt;/strong> tag: indicates that a token belongs to NO chunk&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>BIOES&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>E&lt;/strong>: End (last token of a multi-token chunk)&lt;/li>
&lt;li>&lt;strong>S&lt;/strong>: Single (chunk consisting of a single token)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>BILOU&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>L&lt;/strong>: Last token of a multi-token chunk&lt;/li>
&lt;li>&lt;strong>U&lt;/strong>: Unit-length chunk (single token)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Example:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-tex" data-lang="tex">&lt;span class="line">&lt;span class="cl">Fred showed Sue Mengqiu Huang&amp;#39;s new painting
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2018.19.51.png" alt="截屏2020-05-26 18.19.51" style="zoom:50%;" />
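&lt;p>To make the difference between the schemes concrete, here is a small sketch that encodes the example sentence. The tokenization, the span positions and the type name &lt;code>PER&lt;/code> are assumptions for illustration, as are the helper names &lt;code>spans_to_bio&lt;/code> and &lt;code>bio_to_io&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def spans_to_bio(n_tokens, spans):
    """Convert (start, end, type) spans (end exclusive) to BIO tags."""
    tags = ["O"] * n_tokens
    for start, end, ent_type in spans:
        tags[start] = "B-" + ent_type
        for i in range(start + 1, end):
            tags[i] = "I-" + ent_type
    return tags

def bio_to_io(tags):
    """Drop the B/I distinction, as in the simpler IO scheme."""
    return [t if t == "O" else t[2:] for t in tags]

tokens = ["Fred", "showed", "Sue", "Mengqiu", "Huang", "'s", "new", "painting"]
# Assumed annotation: three person names, two of them directly adjacent.
spans = [(0, 1, "PER"), (2, 3, "PER"), (3, 5, "PER")]
bio = spans_to_bio(len(tokens), spans)
io = bio_to_io(bio)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>In the IO output, &lt;em>Sue Mengqiu Huang&lt;/em> is an unbroken run of &lt;code>PER&lt;/code> tags, so the boundary between the two people is lost. The BIO output keeps it because &lt;em>Mengqiu&lt;/em> starts with a new &lt;code>B-PER&lt;/code>.&lt;/p>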
&lt;h3 id="data">Data&lt;/h3>
&lt;ul>
&lt;li>CoNLL03 shared task data&lt;/li>
&lt;li>MUC7 dataset&lt;/li>
&lt;li>Guideline examples for special cases:
&lt;ul>
&lt;li>Tokenization&lt;/li>
&lt;li>Elision&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="evaluation">Evaluation&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2022.46.58.png" alt="截屏2020-05-26 22.46.58" style="zoom:50%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Precision and Recall&lt;/strong>
&lt;/p>
$$
\text{Precision} = \frac{\\# \text { correct labels }}{\\# \text { hypothesized labels }} = \frac{TP}{TP + FP}
$$
$$
\text{Recall} = \frac{\\# \text { correct labels }}{\\# \text { reference labels }} = \frac{TP}{TP + FN}
$$
&lt;ul>
&lt;li>
&lt;p>Phrase-level counting&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2022.47.59.png" alt="截屏2020-05-26 22.47.59" style="zoom:40%;" />
&lt;blockquote>
&lt;ul>
&lt;li>
&lt;p>System 1:&lt;/p>
&lt;ul>
&lt;li>&amp;ldquo;$\text{\\$200,000,000}$&amp;rdquo; is correctly recognized as NE $\Rightarrow$ TP = 1&lt;/li>
&lt;li>&amp;ldquo;First Bank of Chicago&amp;rdquo; is incorrectly recognized as non-NE (i.e., O) $\Rightarrow$ FN = 1&lt;/li>
&lt;/ul>
&lt;p>Therefore:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\text{Precision} = \frac{1}{1 + 0} = 1$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$\text{Recall} = \frac{1}{1 + 1} = \frac{1}{2}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>System 2:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&amp;ldquo;$\text{\\$200,000,000}$&amp;rdquo; is correctly recognized as NE $\Rightarrow$ TP = 1&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For &amp;ldquo;First Bank of Chicago&amp;rdquo;&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Word&lt;/th>
&lt;th>Actual label&lt;/th>
&lt;th>Predicted label&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>First&lt;/td>
&lt;td>ORG&lt;/td>
&lt;td>O&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Bank of Chicago&lt;/td>
&lt;td>ORG&lt;/td>
&lt;td>ORG&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
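&lt;p>There&amp;rsquo;s a boundary error. Since we count whole phrases, the predicted &amp;ldquo;Bank of Chicago&amp;rdquo; matches no reference phrase, and the reference phrase &amp;ldquo;First Bank of Chicago&amp;rdquo; is missed:&lt;/p>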
&lt;ul>
&lt;li>FN = 1&lt;/li>
&lt;li>FP = 1&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>Therefore:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\text{Precision} = \frac{1}{1 + 1} = \frac{1}{2}$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$\text{Recall} = \frac{1}{1 + 1} = \frac{1}{2}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;ul>
&lt;li>Problems
&lt;ul>
&lt;li>Partial overlaps are punished (a boundary error counts as both an FP and an FN)&lt;/li>
&lt;li>Ignore true negatives&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Token-level&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-26%2023.51.18.png" alt="截屏2020-05-26 23.51.18" style="zoom: 67%;" />
&lt;blockquote>
&lt;p>At the token level, we consider these tokens: &amp;ldquo;First&amp;rdquo;, &amp;ldquo;Bank&amp;rdquo;, &amp;ldquo;of&amp;rdquo;, &amp;ldquo;Chicago&amp;rdquo;, and &amp;ldquo;$\text{\\$200,000,000}$&amp;rdquo;&lt;/p>
&lt;ul>
&lt;li>
&lt;p>System 1&lt;/p>
&lt;ul>
&lt;li>&amp;ldquo;$\text{\\$200,000,000}$&amp;rdquo; is correctly recognized as NE $\Rightarrow$ TP = 1&lt;/li>
&lt;li>&amp;ldquo;First&amp;rdquo;, &amp;ldquo;Bank&amp;rdquo;, &amp;ldquo;of&amp;rdquo;, &amp;ldquo;Chicago&amp;rdquo; are incorrectly recognized as non-NE (i.e., O) $\Rightarrow$ FN = 4&lt;/li>
&lt;/ul>
&lt;p>Therefore:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$\text{Precision} = \frac{1}{1 + 0} = 1$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$\text{Recall} = \frac{1}{1 + 4} = \frac{1}{5}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;ul>
&lt;li>Partial overlaps rewarded!&lt;/li>
&lt;li>But
&lt;ul>
&lt;li>
&lt;p>longer entities weighted more strongly&lt;/p>
&lt;/li>
&lt;li>
&lt;p>True negatives still ignored 🤪&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>$F\_1$ score&lt;/strong> (harmonic mean of precision and recall)
&lt;/p>
$$
F\_1 = \frac{2 \times \text { precision } \times \text { recall }}{\text { precision }+\text { recall }}
$$
&lt;/li>
&lt;/ul>
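&lt;p>The phrase-level numbers from the example can be reproduced with a short sketch. Spans are represented as &lt;code>(start, end, type)&lt;/code> tuples; the token positions and the type names &lt;code>ORG&lt;/code> and &lt;code>MONEY&lt;/code> are assumptions, as is the helper name &lt;code>prf&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def prf(reference, hypothesis):
    """Precision, recall and F1 over two sets of labeled items."""
    tp = len(reference.intersection(hypothesis))
    fp = len(hypothesis - reference)
    fn = len(reference - hypothesis)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Phrase-level items are (start, end, type) spans; positions are assumed.
reference = {(0, 4, "ORG"), (6, 7, "MONEY")}
system1 = {(6, 7, "MONEY")}                  # misses the organization entirely
system2 = {(1, 4, "ORG"), (6, 7, "MONEY")}   # boundary error on the organization
p1, r1, f1_1 = prf(reference, system1)
p2, r2, f1_2 = prf(reference, system2)
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>System 1 gets precision 1 and recall 1/2, System 2 gets 1/2 for both, matching the worked example. Note that System 2 ends up with the lower $F\_1$ although it found most of the organization name, which is the &amp;ldquo;partial overlaps are punished&amp;rdquo; problem.&lt;/p>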
&lt;h2 id="text-representation">Text Representation&lt;/h2>
&lt;h3 id="local-features">Local features&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Previous two predictions (tri-gram feature)&lt;/p>
&lt;ul>
&lt;li>$y\_{i-1}$ and $y\_{i-2}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Current word $x\_i$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Word type of $x\_i$&lt;/p>
&lt;ul>
&lt;li>all-capitalized, is-capitalized, all-digits, alphanumeric, &amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Word shape&lt;/p>
&lt;ul>
&lt;li>
&lt;p>lower case - &amp;lsquo;x&amp;rsquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>upper case - &amp;lsquo;X&amp;rsquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>numbers - &amp;lsquo;d&amp;rsquo;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>retain punctuation&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-05-27%2010.18.48-20200915221013059.png" alt="截屏2020-05-27 10.18.48">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Word substrings&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Tokens in window&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Word shapes in window&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
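&lt;p>The word shape feature is easy to compute. A minimal sketch following the mapping above (the function name &lt;code>word_shape&lt;/code> and the example tokens are illustrative choices):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def word_shape(token):
    """Map lower case to 'x', upper case to 'X', digits to 'd'; keep everything else."""
    shape = []
    for ch in token:
        if ch.islower():
            shape.append("x")
        elif ch.isupper():
            shape.append("X")
        elif ch.isdigit():
            shape.append("d")
        else:
            shape.append(ch)  # punctuation is retained
    return "".join(shape)

shapes = [word_shape(t) for t in ["Chicago", "CoNLL03", "U.S."]]
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Many implementations additionally collapse repeated shape characters (e.g. &lt;code>Xxxxxxx&lt;/code> to &lt;code>Xx&lt;/code>) so that words of different lengths share a feature; that variant is not shown here.&lt;/p>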
&lt;h3 id="non-local-features">Non-local features&lt;/h3>
&lt;p>&lt;strong>Identify tokens that should have same labels&lt;/strong>&lt;/p>
&lt;p>Types:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Context aggregation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Derived from all words in the document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No dependencies on predictions, usable with any inference algorithm&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Prediction aggregation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Derived from predictions of the whole document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Global dependencies; Inference:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>first apply baseline without non-local features&lt;/p>
&lt;/li>
&lt;li>
&lt;p>then apply second system conditioned on output of first system&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Extended prediction history&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Condition only on past predictions &amp;ndash;&amp;gt; greedy / beam search&lt;/p>
&lt;/li>
&lt;li>
&lt;p>💡 Intuition: Beginning of document often easier, later in document terms often get abbreviated&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="sequence-model">Sequence Model&lt;/h2>
&lt;h3 id="hmms">HMMs&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Generative model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Generative story:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Choose document length $N$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For each word $t = 1, \dots, N$:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Draw NE label $\sim P(y\_t | y\_{t-1})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Draw word $\sim P\left(x\_{t} | y\_{t}\right)$
&lt;/p>
$$
P(\mathbf{y}, \mathbf{x})=\prod\_{t=1}^{N} P\left(y\_{t} | y\_{t-1}\right) P\left(x\_{t} | y\_{t}\right)
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-15%2022.19.02.png" alt="截屏2020-09-15 22.19.02">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👍 Pros&lt;/p>
&lt;ul>
&lt;li>intuitive model&lt;/li>
&lt;li>Works with unknown label sequences&lt;/li>
&lt;li>Fast inference&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Cons&lt;/p>
&lt;ul>
&lt;li>Strong limitation on textual features (conditional independence)&lt;/li>
&lt;li>Model overly simplistic (can improve the generative story but would lose fast inference)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="max-entropy">Max. Entropy&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Discriminative model $P(y\_t|x)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Don’t care about generation process or input distribution&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Only model conditional output probabilities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👍 Pros: Flexible feature design&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👎 Cons: local classifier -&amp;gt; disregards sequence information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="crf">CRF&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Discriminative model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👍 Pros:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Flexible feature design&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Condition on local sequence context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Training as easy as MaxEnt&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Cons: Still no long-range dependencies possible&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="modelling">Modelling&lt;/h2>
&lt;p>Differences from POS tagging&lt;/p>
&lt;ul>
&lt;li>Long-range dependencies&lt;/li>
&lt;li>Alternative resources can be very helpful&lt;/li>
&lt;li>Many named entities are more than one word long&lt;/li>
&lt;/ul>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-27%2010.36.36.png" alt="截屏2020-05-27 10.36.36" />
&lt;h2 id="inference">Inference&lt;/h2>
&lt;h3 id="viterbi">Viterbi&lt;/h3>
&lt;ul>
&lt;li>Finds exact solution&lt;/li>
&lt;li>Efficient algorithmic solution using dynamic programming&lt;/li>
&lt;li>Complexity exponential in order of Markov model
&lt;ul>
&lt;li>Only feasible for small order&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="greedy">Greedy&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>At each timestep, choose locally best label&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Fast, supports conditioning on the global history (not the future)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No possibility for “label revision”&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="beam">Beam&lt;/h3>
&lt;ul>
&lt;li>Keep a beam of the $n$ best greedy solutions, expand and prune&lt;/li>
&lt;li>Limited room for label revisions&lt;/li>
&lt;/ul>
&lt;h3 id="gibbs-sampling">Gibbs Sampling&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Stochastic method&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Easy way to sample from multivariate distribution&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Normally used to approximate joint distributions or integrals
&lt;/p>
$$
P\left(y^{(t)} | y^{(t-1)}\right)=P\left(y\_{i}^{(t)} | y\_{-i}^{(t-1)}, x\right)
$$
&lt;ul>
&lt;li>$-i$ means all states except $i$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 Intuitively:&lt;/p>
&lt;ul>
&lt;li>Sample one variable at a time, conditioned on current assignment of all other variables&lt;/li>
&lt;li>Keep checkpoints (e.g. after each sweep through all variables) to approximate distribution&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>In our case:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialize NER tags (e.g. random or via Viterbi baseline model)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Re-sample one tag at a time, conditioned on input and all other tags&lt;/p>
&lt;/li>
&lt;li>
&lt;p>After sampling for a long time, we can estimate the joint distribution over outputs $P(y|x)$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>However, it’s slow, and we may only be interested in the best output 🤪&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Could choose &lt;strong>best instead of sampling&lt;/strong>
&lt;/p>
$$
y^{(t)}=y\_{-i}^{(t-1)} \cup \underset{y\_{i}^{(t)}}{\operatorname{argmax}}\left(P\left(y\_{i}^{(t)} | y\_{-i}^{(t-1)}, x\right)\right)
$$
&lt;ul>
&lt;li>will get stuck in local optima 😭&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Better: &lt;strong>Simulated annealing&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Gradually move from sampling to argmax
$$
P\left(y^{(t)} | y^{(t-1)}\right)=\frac{P\left(y\_{i}^{(t)} | y\_{-i}^{(t-1)}, x\right)^{1 / c\_{t}}}{\displaystyle\sum\_{j} P\left(y\_{j}^{(t)} | y\_{-j}^{(t-1)}, x\right)^{1 / c\_{t}}}
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
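&lt;p>The effect of the exponent $1 / c\_{t}$ can be seen in a tiny sketch. It assumes that the normalization runs over the candidate labels of the position being re-sampled, which is one reading of the denominator above; the function name &lt;code>temper&lt;/code> and the probabilities are made up for illustration.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">def temper(probs, c):
    """Raise each probability to the power 1/c and renormalize."""
    powered = [p ** (1.0 / c) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

conditional = [0.6, 0.3, 0.1]      # hypothetical P(y_i | y_-i, x) over three labels
warm = temper(conditional, 1.0)    # c = 1: unchanged, i.e. plain Gibbs sampling
cold = temper(conditional, 0.1)    # c close to 0: almost all mass on the argmax
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Lowering $c\_{t}$ step by step therefore moves smoothly from sampling towards always picking the best label, while early iterations can still escape local optima.&lt;/p>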
&lt;h2 id="external-knowledge">External Knowledge&lt;/h2>
&lt;h3 id="data-1">Data&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Supervised learning:&lt;/p>
&lt;ul>
&lt;li>Labeled Data:
&lt;ul>
&lt;li>Text&lt;/li>
&lt;li>NE Annotation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Unsupervised learning&lt;/p>
&lt;ul>
&lt;li>Unlabeled Data: Text&lt;/li>
&lt;li>Problem: Hard to directly learn NER&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Semi-supervised: Labeled and Unlabeled Data&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="word-clustering">Word Clustering&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Problem: Data Sparsity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Idea&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Find lower-dimensional representation of words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Real-valued vectors / probabilities have a natural measure of similarity&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Which words are similar?&lt;/p>
&lt;ul>
&lt;li>Distributional notion&lt;/li>
&lt;li>Words are similar if they appear in similar contexts, e.g.
&lt;ul>
&lt;li>“president” and “chairman” are similar&lt;/li>
&lt;li>“cut” and “knife” are not&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Words in same cluster should be similar&lt;/strong>&lt;/p>
&lt;h3 id="brown-clusters">Brown clusters&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Bottom-up&lt;/strong> agglomerative word clustering&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input: Sequence of words $w\_1, \dots, w\_n$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Output&lt;/p>
&lt;ul>
&lt;li>binary tree&lt;/li>
&lt;li>Cluster: subtree (according to desired #clusters)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 Intuition: put syntactically &amp;ldquo;exchangeable&amp;rdquo; words in the same cluster. E.g.:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Similar words: president/chairman, Saturday/Monday&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Not similar: cut/knife&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Algorithm:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialization: Every word is its own cluster&lt;/p>
&lt;/li>
&lt;li>
&lt;p>While there is more than one cluster&lt;/p>
&lt;ul>
&lt;li>Merge the two clusters whose merge maximizes the quality of the clustering&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Result:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Hard&lt;/strong> clustering: each word belongs to &lt;strong>exactly one&lt;/strong> cluster&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Quality of the clustering&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Use class-based bigram language model&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-05-27%2011.09.55-20200916110552353.png" alt="截屏2020-05-27 11.09.55" style="zoom:150%;" />
&lt;/li>
&lt;li>
&lt;p>Quality: logarithm of the probability of the training text normalized by the length of the text
&lt;/p>
$$
\begin{aligned}
\text { Quality }(C) &amp;=\frac{1}{n} \log P\left(w\_{1}, \ldots, w\_{n}\right) \\\\
&amp;=\frac{1}{n} \log P\left(w\_{1}, \ldots, w\_{n}, C\left(w\_{1}\right), \ldots, C\left(w\_{n}\right)\right) \\\\
&amp;=\frac{1}{n} \log \prod\_{i=1}^{n} P\left(C\left(w\_{i}\right) | C\left(w\_{i-1}\right)\right) P\left(w\_{i} | C\left(w\_{i}\right)\right)
\end{aligned}
$$
&lt;/li>
&lt;li>
&lt;p>Parameters: estimated using maximum-likelihood&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Parsing</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/05-parsing/</link><pubDate>Tue, 15 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/05-parsing/</guid><description>&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2023.41.41.png" alt="截屏2020-09-16 23.41.41">&lt;/p>
&lt;h2 id="tldr">TL;DR&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Representing and Analyzing Sentence Structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Phrase structure grammar&lt;/p>
&lt;ul>
&lt;li>Context free grammar&lt;/li>
&lt;li>Problems:
&lt;ul>
&lt;li>Ambiguities: PP Attachment&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Traditional Approaches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Stochastic Parsing&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Probabilistic Context Free Grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>CYK Algorithm&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transition-based parsing&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="grammaticality">Grammaticality&lt;/h2>
&lt;p>Common approach in statistical natural language processing: &lt;strong>n-gram Language Model&lt;/strong>&lt;/p>
&lt;p>E.g., tri-gram
&lt;/p>
$$
\begin{array}{l}
P\left(w_{1}, \ldots, w_{n}\right) \\
=P\left(w_{1}\right) * P\left(w_{2} \mid w_{1}\right) * P\left(w_{3} \mid w_{1} w_{2}\right) * \ldots * P\left(w_{n} \mid w_{1} \ldots w_{n-1}\right) \\
\approx \prod_{i=1}^{n} P\left(w_{i} \mid w_{i-2} w_{i-1}\right)
\end{array}
$$
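&lt;p>A minimal sketch of how the tri-gram probabilities could be estimated by relative frequencies. The function name &lt;code>train_trigram_lm&lt;/code>, the padding symbols &lt;code>BOS&lt;/code> and &lt;code>EOS&lt;/code> and the two training sentences are assumptions; real language models add smoothing for unseen n-grams.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">from collections import Counter

def train_trigram_lm(sentences):
    """Maximum-likelihood tri-gram estimates from tokenized sentences."""
    trigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        padded = ["BOS", "BOS"] + sentence + ["EOS"]
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigrams[context + (padded[i],)] += 1
            bigrams[context] += 1

    def prob(word, prev2, prev1):
        context = (prev2, prev1)
        if bigrams[context] == 0:
            return 0.0  # unseen context: no smoothing in this sketch
        return trigrams[context + (word,)] / bigrams[context]

    return prob

prob = train_trigram_lm([["I", "saw", "you"], ["I", "saw", "him"]])
p_you = prob("you", "I", "saw")
p_saw = prob("saw", "BOS", "I")
&lt;/code>&lt;/pre>&lt;/div>
&lt;p>Each prediction only sees the two preceding words, so any dependency that spans a longer distance is invisible to the model.&lt;/p>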
&lt;p>
&lt;span style="color:red">Problems of Language Models&lt;/span>&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color:red">Generalization: even with a very long context, there are sentences you cannot model with an n-gram language model&lt;/span>&lt;/li>
&lt;li>&lt;span style="color:red">Overall sentence structure&lt;/span>&lt;/li>
&lt;/ul>
&lt;p>How can we model what a grammatically correct sentence is?&lt;/p>
&lt;ul>
&lt;li>Need arbitrary context&lt;/li>
&lt;li>Use grammar describing generation of the sentence&lt;/li>
&lt;/ul>
&lt;h2 id="phrase-structure-grammar">Phrase structure grammar&lt;/h2>
&lt;p>&lt;strong>Describe sentence structure by grammar&lt;/strong> (Constituency relation)&lt;/p>
&lt;p>Phrase structure organizes words into nested constituents (can represent the grammar with &lt;a href="#context-free-grammar">CFG&lt;/a> rules)&lt;/p>
&lt;p>Units in the grammar: &lt;strong>Constituency&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Can be moved around
&lt;ul>
&lt;li>&lt;em>I saw you &lt;u>today&lt;/u>&lt;/em>&lt;/li>
&lt;li>&lt;em>&lt;u>Today&lt;/u>, I saw you&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>expand/contract
&lt;ul>
&lt;li>&lt;em>I saw &lt;u>the boy&lt;/u>&lt;/em>&lt;/li>
&lt;li>&lt;em>I saw &lt;u>him&lt;/u>&lt;/em>&lt;/li>
&lt;li>&lt;em>I saw &lt;u>the old boy&lt;/u>&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Constituent_(linguistics)">Wiki&lt;/a>&lt;/p>
&lt;p>In syntactic analysis, a &lt;strong>constituent&lt;/strong> is a word or a group of words that function &lt;strong>as a single unit&lt;/strong> within a hierarchical structure. The constituent structure of sentences is identified using tests for constituents.&lt;/p>
&lt;p>A phrase is a sequence of one or more words (in some theories two or more) built around a head lexical item and working as a unit within a sentence. A word sequence is shown to be a phrase/constituent if it exhibits one or more of the behaviors discussed below.&lt;/p>
&lt;/blockquote>
&lt;h3 id="phrase-structure-rules">Phrase structure rules&lt;/h3>
&lt;ul>
&lt;li>Describe syntax of language&lt;/li>
&lt;li>Example
&lt;ul>
&lt;li>&lt;code>S&lt;/code> &amp;ndash;&amp;gt; &lt;code>NP&lt;/code> &lt;code>VP&lt;/code> (A sentence consists of a noun phrase and a verb phrase)&lt;/li>
&lt;li>&lt;code>NP&lt;/code> &amp;ndash;&amp;gt; &lt;code>Det&lt;/code> &lt;code>N&lt;/code> (A noun phrase consists of a determiner and a noun)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Only looking at the syntax&lt;/strong>&lt;/li>
&lt;li>&lt;strong>No semantics&lt;/strong>&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Phrase_structure_grammar">Wiki&lt;/a>:&lt;/p>
&lt;p>In linguistics, phrase structure grammars are all those grammars that are based on the &lt;strong>constituency relation&lt;/strong>, as opposed to the dependency relation associated with &lt;strong>dependency grammars&lt;/strong>; hence, phrase structure grammars are also known as &lt;strong>constituency grammars&lt;/strong>&lt;/p>
&lt;p>The fundamental trait that these frameworks all share is that they view sentence structure in terms of the constituency relation.&lt;/p>
&lt;p>Example: Constituency relation Vs. Dependency relation&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Thistreeisillustratingtherelation%28PSG%29.png" alt="Constituency and dependency relations">&lt;/p>
&lt;/blockquote>
&lt;h3 id="context-free-grammar">Context Free Grammar&lt;/h3>
&lt;p>&lt;strong>Constituency = phrase structure grammar = context-free grammars (CFGs)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Introduced by Chomsky&lt;/p>
&lt;/li>
&lt;li>
&lt;p>4-tuple:
&lt;/p>
$$
G = (V, \Sigma, R, S)
$$
&lt;ul>
&lt;li>$V$: finite set of non-terminals
&lt;ul>
&lt;li>variables describing the phrases (NP, VP, &amp;hellip;)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$\Sigma$: finite set of terminals
&lt;ul>
&lt;li>
&lt;p>content of the sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>all words in the grammar&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$R$: finite relation from $V$ to $(V \cup \Sigma)^{\*}$
&lt;ul>
&lt;li>Rules defining how non-terminals can be replaced&lt;/li>
&lt;li>E.g.: &lt;code>S&lt;/code> &amp;ndash;&amp;gt; &lt;code>NP&lt;/code> &lt;code>VP&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$S$: start symbol&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;em>Example&lt;/em>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-27%2016.05.45.png" alt="截屏2020-09-27 16.05.45">&lt;/p>
&lt;h2 id="dependency-structure">Dependency Structure&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Different approach to describe sentence structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Identify semantic relations!&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Idea:&lt;/p>
&lt;ul>
&lt;li>Which words depend on which words&lt;/li>
&lt;li>Which word modifies which word&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-27%2016.20.20.png" alt="截屏2020-09-27 16.20.20" style="zoom: 67%;" />
&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/Dependency_grammar">Wiki&lt;/a>&lt;/p>
&lt;p>The (finite) verb is taken to be the structural center of clause structure. All other syntactic units (words) are either directly or indirectly connected to the verb in terms of the directed links, which are called dependencies.&lt;/p>
&lt;p>A dependency structure is determined by the relation between a word (a head) and its dependents. Dependency structures are flatter than phrase structures in part because they lack a finite verb phrase constituent, and they are thus well suited for the analysis of languages with free word order, such as Czech or Warlpiri.&lt;/p>
&lt;/blockquote>
&lt;h2 id="difficulties">Difficulties&lt;/h2>
&lt;p>&lt;strong>&lt;span style="color:red">Ambiguities!!!&lt;/span>&lt;/strong>&lt;/p>
&lt;p>E.g.: Prepositional phrase attachment ambiguity&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2016.34.55.png" alt="截屏2020-09-16 16.34.55" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2016.35.42.png" alt="截屏2020-09-16 16.35.42" style="zoom:67%;" />
&lt;h2 id="parsing">Parsing&lt;/h2>
&lt;p>&lt;strong>Automatically generate parse tree for sentence&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Given:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sentence&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Find: hidden structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Idea: Search for different parses&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Applications&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Question – Answering&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Named Entity extraction&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sentiment analysis&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sentence Compression&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="traditional-approaches">Traditional approaches&lt;/h3>
&lt;p>&lt;strong>Hand-defined rules&lt;/strong>: restrict the rules by hand so that, ideally, each sentence has only one possible parse tree&lt;/p>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>Many parses for the same sentence&lt;/li>
&lt;li>Coverage Problem (Many sentences could not be parsed)&lt;/li>
&lt;li>Time and cost intensive&lt;/li>
&lt;/ul>
&lt;h3 id="statistical-parsing">Statistical parsing&lt;/h3>
&lt;p>Use &lt;strong>machine learning techniques&lt;/strong> to distinguish probable and less probable trees&lt;/p>
&lt;ul>
&lt;li>Automatically learn rules from training data
&lt;ul>
&lt;li>Hand-annotated text with parse trees&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>still many parse trees for one sentence 🤪
&lt;ul>
&lt;li>But weights define most probable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Tasks
&lt;ul>
&lt;li>&lt;strong>Training&lt;/strong>: learn possible rules and their probabilities&lt;/li>
&lt;li>&lt;strong>Search&lt;/strong>: find most probable parse tree for sentence&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="annotated-data">Annotated Data&lt;/h4>
&lt;p>&lt;strong>Treebank&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Human-annotated sentences with structure&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>POS Tags&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Phrase structure&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Reusable&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High coverage&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Evaluation&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2017.42.49.png" alt="截屏2020-09-16 17.42.49" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="probabilistic-context-free-grammar">&lt;strong>Probabilistic Context Free Grammar&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Extension to Context Free Grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Formal definition: 5-tuple
&lt;/p>
$$
G = (V, \Sigma, R, S, P)
$$
&lt;ul>
&lt;li>$V, \Sigma, R, S$: same as Context Free Grammar&lt;/li>
&lt;li>$P$: set of Probabilities on production rules
&lt;ul>
&lt;li>E.g.: &lt;code>S&lt;/code> &amp;ndash;&amp;gt; &lt;code>NP&lt;/code> &lt;code>VP&lt;/code> with probability 0.5&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Properties&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Probability of derivation is product over all rules
&lt;/p>
$$
P(D)=\prod_{r \in D} P(r)
$$
&lt;/li>
&lt;li>
&lt;p>For each non-terminal $A$, the probabilities of all rules that replace $A$ sum to one
&lt;/p>
$$
\sum_{\alpha} P(A \rightarrow \alpha)=1
$$
&lt;/li>
&lt;li>
&lt;p>Sum over all derivations is one
&lt;/p>
$$
\sum_{D \in S} P(D)=1
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
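&lt;p>The first two properties can be checked numerically. The sketch below assumes a hypothetical rule table; the probabilities and the helper names &lt;code>is_normalized&lt;/code> and &lt;code>derivation_probability&lt;/code> are illustrative only.&lt;/p>

```python
import math
from collections import defaultdict

# Hypothetical rule probabilities P(A -> alpha)
pcfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("she",)): 0.4,
    ("VP", ("V", "NP")): 1.0,
    ("Det", ("a",)): 1.0,
    ("N", ("fish",)): 1.0,
    ("V", ("eats",)): 1.0,
}

def is_normalized(rules):
    """Check that the rules sharing a left-hand side sum to one."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(total, 1.0) for total in totals.values())

def derivation_probability(rules, derivation):
    """P(D) is the product of the probabilities of all rules used in D."""
    return math.prod(rules[rule] for rule in derivation)

# Derivation of "she eats a fish"
derivation = [
    ("S", ("NP", "VP")), ("NP", ("she",)), ("VP", ("V", "NP")),
    ("V", ("eats",)), ("NP", ("Det", "N")), ("Det", ("a",)), ("N", ("fish",)),
]
print(is_normalized(pcfg))                       # True
print(derivation_probability(pcfg, derivation))  # 0.4 * 0.6, about 0.24
```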
&lt;h4 id="training">Training&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Input: Annotated training data (E.g.: Treebank)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Training&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Rule extraction&lt;/strong>: Extract possible rules from the trees of the training data&lt;/li>
&lt;li>&lt;strong>Probability estimation&lt;/strong>
&lt;ul>
&lt;li>Assign probabilities of the rules&lt;/li>
&lt;li>Maximum-likelihood estimation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2018.35.34.png" alt="截屏2020-09-16 18.35.34">&lt;/p>
&lt;/li>
&lt;/ul>
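&lt;p>Rule extraction and maximum-likelihood estimation can be sketched as relative-frequency counting, $P(A \rightarrow \alpha) = \frac{\operatorname{count}(A \rightarrow \alpha)}{\operatorname{count}(A)}$. The nested-tuple tree encoding and the two-tree "treebank" below are assumptions made for illustration, not the data from the figure.&lt;/p>

```python
from collections import Counter

def extract_rules(tree):
    """Yield one (lhs, rhs) rule per internal node of a nested-tuple tree."""
    label, *children = tree
    rhs = tuple(child if isinstance(child, str) else child[0] for child in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):  # recurse into subtrees, skip words
            yield from extract_rules(child)

def estimate_pcfg(treebank):
    """Relative-frequency (maximum-likelihood) estimate of rule probabilities."""
    rule_counts = Counter(rule for tree in treebank for rule in extract_rules(tree))
    lhs_counts = Counter()
    for (lhs, _rhs), count in rule_counts.items():
        lhs_counts[lhs] += count
    return {rule: count / lhs_counts[rule[0]] for rule, count in rule_counts.items()}

# Hypothetical treebank: (label, child, ...) with words as plain strings
treebank = [
    ("S", ("NP", "she"),
          ("VP", ("V", "eats"), ("NP", ("Det", "a"), ("N", "fish")))),
    ("S", ("NP", ("Det", "a"), ("N", "fish")),
          ("VP", "eats")),
]
probabilities = estimate_pcfg(treebank)
for (lhs, rhs), p in sorted(probabilities.items()):
    print(lhs, "->", " ".join(rhs), round(p, 3))
```

&lt;p>Here &lt;code>NP -> Det N&lt;/code> occurs twice among three &lt;code>NP&lt;/code> expansions, so its estimate is 2/3.&lt;/p>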
&lt;h4 id="search">Search&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Find possible parses of the sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Statistical approach: Find all/many possible parse trees&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Return most probable one&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Strategies:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Top-Down&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Bottom up&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Shift reduce&lt;/strong> algorithm&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Shift&lt;/strong>: advances in the input stream by one symbol. That shifted symbol becomes a new single-node parse tree.&lt;/li>
&lt;li>&lt;strong>Reduce&lt;/strong>: applies a completed grammar rule to some of the recent parse trees, joining them together as one tree with a new root symbol.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/56cee123595482cf3edaef089cb9a6a7.jpg"
alt="Shift-reduce algorithm example">&lt;figcaption>
&lt;p>Shift-reduce algorithm example&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Dynamic Programming&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
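&lt;p>The shift and reduce steps can be sketched as a small loop. This version is greedy (it reduces whenever the top of the stack matches a rule and shifts otherwise) and does no backtracking, which is a simplifying assumption: on grammars with shift/reduce conflicts it can fail where a search-based parser would succeed. The toy grammar is hypothetical.&lt;/p>

```python
# Hypothetical toy grammar as (left-hand side, right-hand side) pairs
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("boy",)),
    ("N", ("dog",)),
    ("V", ("saw",)),
]

def label(tree):
    """A shifted word is its own label; a subtree is labelled by its root."""
    return tree if isinstance(tree, str) else tree[0]

def shift_reduce(words, rules=RULES):
    stack, buffer, actions = [], list(words), []
    while True:
        for lhs, rhs in rules:
            k = len(rhs)
            if len(stack) >= k and tuple(label(t) for t in stack[-k:]) == rhs:
                # Reduce: join the k most recent trees under a new root
                stack[-k:] = [(lhs, *stack[-k:])]
                actions.append(f"reduce {lhs}")
                break
        else:
            if not buffer:
                return stack, actions
            # Shift: the next input word becomes a single-node tree
            stack.append(buffer.pop(0))
            actions.append("shift")

stack, actions = shift_reduce("the boy saw the dog".split())
print(actions)
print(stack[0])
```

&lt;p>The parse succeeded if the stack ends with a single tree whose root is the start symbol.&lt;/p>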
&lt;h3 id="cyk-parsing">CYK Parsing&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Avoid repeated work&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Use Dynamic Programming&lt;/p>
&lt;ul>
&lt;li>Transform the grammar into Chomsky normal form&lt;/li>
&lt;li>Store best trees for subphrases&lt;/li>
&lt;li>Combine tree from best trees of subphrases&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>All rules must have the following form&lt;/p>
&lt;ul>
&lt;li>A &amp;ndash;&amp;gt; BC
&lt;ul>
&lt;li>A, B, C non-terminals&lt;/li>
&lt;li>B, C not the start symbol&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>A &amp;ndash;&amp;gt; a
&lt;ul>
&lt;li>A non-terminal&lt;/li>
&lt;li>a terminal&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>S &amp;ndash;&amp;gt; $\epsilon$
&lt;ul>
&lt;li>Create empty string if it is in the grammar&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Every context-free grammar can be transformed into one in Chomsky normal form&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Binarization&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Only rules with two non-terminals&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Idea:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Introduce additional non-terminal&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Replace one rule with three non-terminals by two rules with two non-terminals each&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2021.32.16.png" alt="截屏2020-09-16 21.32.16" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Remove unaries&lt;/p>
&lt;ul>
&lt;li>Remove intermediate rules&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Problems&lt;/p>
&lt;ul>
&lt;li>Very strong independence assumption&lt;/li>
&lt;li>Label is bottleneck&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://en.wikipedia.org/wiki/CYK_algorithm">Example&lt;/a>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Grammar&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-27%2023.40.08.png" alt="截屏2020-09-27 23.40.08">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Analyse the sentence &amp;ldquo;&lt;em>she eats a fish with a fork&lt;/em>&amp;rdquo; with the CYK algorithm:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/440px-CYK_algorithm_animation_showing_every_step_of_a_sentence_parsing.gif" alt="img" style="zoom:80%;" />
&lt;p>result:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-27%2023.41.25.png" alt="截屏2020-09-27 23.41.25">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
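&lt;p>A minimal CYK recognizer for the example sentence might look as follows. The rules are transcribed from the linked Wikipedia article on the assumption that they match the grammar in the figure. The sketch only records which non-terminals can span each substring; a probabilistic CYK parser would additionally store the best score and a backpointer per cell.&lt;/p>

```python
from collections import defaultdict

# Grammar in Chomsky normal form (assumed to match the figure)
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "VP", "PP"),
    ("VP", "V", "NP"),
    ("PP", "P", "NP"),
    ("NP", "Det", "N"),
]
LEXICAL = {
    "she": {"NP"}, "eats": {"V", "VP"}, "with": {"P"},
    "fish": {"N"}, "fork": {"N"}, "a": {"Det"},
}

def cyk(words):
    """Fill table[(i, j)] with the non-terminals that derive words[i:j]."""
    n = len(words)
    table = defaultdict(set)
    for i, word in enumerate(words):
        table[(i, i + 1)] = set(LEXICAL.get(word, ()))
    for length in range(2, n + 1):          # span length
        for i in range(n - length + 1):     # span start
            j = i + length
            for k in range(i + 1, j):       # split point
                for parent, left, right in BINARY:
                    if left in table[(i, k)] and right in table[(k, j)]:
                        table[(i, j)].add(parent)
    return table

words = "she eats a fish with a fork".split()
table = cyk(words)
print("S" in table[(0, len(words))])  # True
```

&lt;p>Each cell is computed once and reused by all longer spans, which is where the saving over naive search comes from.&lt;/p>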
&lt;h3 id="transition-based-dependency-parsing">Transition-based Dependency Parsing&lt;/h3>
&lt;p>Model Dependency structure&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.08.00.png" alt="截屏2020-09-16 22.08.00">&lt;/p>
&lt;p>Predict the transition sequence: transitions between configurations&lt;/p>
&lt;h4 id="arc-standard-system">&lt;strong>Arc-standard System&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Configuration&lt;/p>
&lt;ul>
&lt;li>Stack&lt;/li>
&lt;li>Buffer&lt;/li>
&lt;li>Set of Dependency Arcs&lt;/li>
&lt;li>Initial configuration: [Root], $w_1,\dots, w_n$, {}
&lt;ul>
&lt;li>All words are in the buffer&lt;/li>
&lt;li>The stack contains only the Root node&lt;/li>
&lt;li>The dependency graph is empty&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Terminal configuration
&lt;ul>
&lt;li>The buffer is empty&lt;/li>
&lt;li>The stack contains a single word&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.00.17.png" alt="截屏2020-09-16 22.00.17">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transitions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Left-arc
&lt;/p>
$$
([\sigma|i| j], B, A) \Rightarrow([\sigma \mid j], B, A \cup\{(j, l, i)\})
$$
&lt;ul>
&lt;li>Add an arc with label $l$ from the top element $j$ of the stack (head) to the second-top element $i$ (dependent)&lt;/li>
&lt;li>Remove second top element from the stack&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Right-arc
&lt;/p>
$$
([\sigma|i| j], B, A) \Rightarrow([\sigma \mid i], B, A \cup\{(i, l, j)\})
$$
&lt;ul>
&lt;li>Add an arc with label $l$ from the second-top element $i$ of the stack (head) to the top element $j$ (dependent)&lt;/li>
&lt;li>Remove top element from the stack&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Shift: Move the first element of the buffer to the stack&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>Initial configuration&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2022.14.32.png" alt="截屏2020-09-16 22.14.32" />
&lt;ul>
&lt;li>Shift&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-16%2022.14.35.png" alt="截屏2020-09-16 22.14.35" />
&lt;ul>
&lt;li>Shift&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.14.38.png" alt="截屏2020-09-16 22.14.38">&lt;/p>
&lt;ul>
&lt;li>Left arc&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.14.42.png" alt="截屏2020-09-16 22.14.42">&lt;/p>
&lt;ul>
&lt;li>Shift&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.14.49-20200916222538371.png" alt="截屏2020-09-16 22.14.49">&lt;/p>
&lt;ul>
&lt;li>Shift&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.14.52.png" alt="截屏2020-09-16 22.14.52">&lt;/p>
&lt;ul>
&lt;li>Left arc&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.15.08-20200916221721930.png" alt="截屏2020-09-16 22.15.08">&lt;/p>
&lt;ul>
&lt;li>Shift&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.15.11-20200916221740541.png" alt="截屏2020-09-16 22.15.11">&lt;/p>
&lt;ul>
&lt;li>Right arc&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.15.16.png" alt="截屏2020-09-16 22.15.16">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Right arc&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2022.25.12.png" alt="截屏2020-09-16 22.25.12">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
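&lt;p>The three transitions can be written as functions on a configuration (stack, buffer, arcs). The sentence "I ate fish", the relation labels and the hand-written transition sequence below are assumptions chosen for illustration and are not the example from the figures. In a real parser a classifier would predict each transition.&lt;/p>

```python
def shift(stack, buffer, arcs):
    """Move the first element of the buffer to the stack."""
    return stack + [buffer[0]], buffer[1:], arcs

def left_arc(stack, buffer, arcs, label):
    """Add arc (j, label, i) with head j = top; remove the second-top element i."""
    *rest, i, j = stack
    return rest + [j], buffer, arcs | {(j, label, i)}

def right_arc(stack, buffer, arcs, label):
    """Add arc (i, label, j) with head i = second-top; remove the top element j."""
    *rest, i, j = stack
    return rest + [i], buffer, arcs | {(i, label, j)}

# Initial configuration: ROOT on the stack, all words in the buffer, no arcs
config = (["ROOT"], ["I", "ate", "fish"], frozenset())
transitions = [
    (shift, ()), (shift, ()), (left_arc, ("nsubj",)),
    (shift, ()), (right_arc, ("obj",)), (right_arc, ("root",)),
]
for action, args in transitions:
    config = action(*config, *args)
    print(action.__name__, config[0], config[1])

stack, buffer, arcs = config
print(sorted(arcs))
```

&lt;p>The run ends with an empty buffer and only &lt;code>ROOT&lt;/code> left on the stack. The arc set then contains one head for every word.&lt;/p>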
&lt;h5 id="problems">Problems&lt;/h5>
&lt;ul>
&lt;li>Sparsity&lt;/li>
&lt;li>Incompleteness&lt;/li>
&lt;li>Expensive computation&lt;/li>
&lt;/ul>
&lt;h4 id="neural-network-based-prediction">Neural Network-based prediction&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Feed forward neural network to predict operation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input&lt;/p>
&lt;ul>
&lt;li>Set of words $S^w$, POS tags $S^t$ and labels $S^l$&lt;/li>
&lt;li>Fixed number&lt;/li>
&lt;li>Map to continuous space&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Output&lt;/p>
&lt;ul>
&lt;li>Operation&lt;/li>
&lt;li>$2N_l + 1$ possible operations: left-arc and right-arc for each of the $N_l$ labels, plus shift&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example structure&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2023.33.02.png" alt="截屏2020-09-16 23.33.02">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="evaluation">Evaluation&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Label precision/recall&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Describe each tree as a set of triples (Label, start, end)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Calculate precision/recall/f-score of reference and hypothesis&lt;/p>
&lt;/li>
&lt;/ul>
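&lt;p>With trees encoded as sets of (label, start, end) triples, labeled precision, recall and F-score reduce to set overlap. The two constituent sets below are invented for illustration, and &lt;code>labeled_prf&lt;/code> is a hypothetical helper name.&lt;/p>

```python
def labeled_prf(reference, hypothesis):
    """Labeled precision, recall and F1 over (label, start, end) triples."""
    reference, hypothesis = set(reference), set(hypothesis)
    correct = len(reference.intersection(hypothesis))
    precision = correct / len(hypothesis) if hypothesis else 0.0
    recall = correct / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1

# Hypothetical constituents for a four-word sentence
reference = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)}
hypothesis = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 3), ("NP", 2, 4)}
print(labeled_prf(reference, hypothesis))  # (0.75, 0.75, 0.75)
```

&lt;p>The hypothesis gets three of four constituents right. The &lt;code>VP&lt;/code> with the wrong span counts as both a precision and a recall error.&lt;/p>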
&lt;h2 id="reference">Reference&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Shift reduce algorithm&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://www.bookstack.cn/read/nlp-py-2e-zh/spilt.4.8.md">Shift reduce Parsing&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://en.wikipedia.org/wiki/Shift-reduce_parser">Wiki&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://cl.lingfil.uu.se/~sara/kurser/5LN455-2014/lectures/5LN455-F8.pdf">Transition-based dependency parsing&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://zhuanlan.zhihu.com/p/110532288">[CS224n笔记] L5 Dependency Parsing&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture05-dep-parsing.pdf">CS224n, Linguistic Structure: Dependency Parsing&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul></description></item><item><title>Summarization</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/06-summarization/</link><pubDate>Wed, 16 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/06-summarization/</guid><description>&lt;h2 id="tldr">TL;DR&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#what-is-summarization">Text summarization&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Most important technique&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#extraction">Extraction&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Tasks:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#key-word-extraction">Key word extraction&lt;/a>&lt;/li>
&lt;li>&lt;a href="#sentence-extraction">Sentence extraction&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Algorithms:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#supervised-approaches">Supervised&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#unsupervised-approaches">Unsupervised&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Abstractive summarization is still an open problem&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-16%2023.43.02.png" alt="截屏2020-09-16 23.43.02">&lt;/p>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;h3 id="what-is-summarization">&lt;strong>What is Summarization?&lt;/strong>&lt;/h3>
&lt;ul>
&lt;li>Reduce natural language text document&lt;/li>
&lt;li>Goal: &lt;strong>Compress&lt;/strong> text by extracting the most important/relevant parts &amp;#x1f4aa;&lt;/li>
&lt;/ul>
&lt;h3 id="applications">Applications&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Articles, news: Outlines or abstracts&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Email / Email threads&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Health information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Meeting summarization&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="dimensions">Dimensions&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Single vs. multiple&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Single-document&lt;/strong> summarization
&lt;ul>
&lt;li>Given single document&lt;/li>
&lt;li>Produce abstract, outline, headline, etc.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Multiple-document&lt;/strong> summarization
&lt;ul>
&lt;li>
&lt;p>Given a group of documents&lt;/p>
&lt;/li>
&lt;li>
&lt;p>A series of news stories on the same event&lt;/p>
&lt;/li>
&lt;li>
&lt;p>A set of web pages about some topic or question&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Generic vs. Query-focused summarization&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Generic&lt;/strong> summarization: summarize the content of the document&lt;/li>
&lt;li>&lt;strong>Query-focused&lt;/strong> summarization: kind of &lt;em>complex&lt;/em> question answering 🤪
&lt;ul>
&lt;li>Summarize a document with respect to an information need expressed in a user query&lt;/li>
&lt;li>Longer, descriptive, more informative answers&lt;/li>
&lt;li>Answer a question by summarizing a document that has information to construct the answer&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="techniques">Techniques&lt;/h2>
&lt;h3 id="extraction">Extraction&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Select subset of existing text segments&lt;/strong>&lt;/li>
&lt;li>e.g.:
&lt;ul>
&lt;li>
&lt;p>Sentence extraction&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Key-phrase extraction&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Simpler; the focus of most research&lt;/li>
&lt;/ul>
&lt;h3 id="abstraction">Abstraction&lt;/h3>
&lt;ul>
&lt;li>Use natural language generation to create summary&lt;/li>
&lt;li>More human-like&lt;/li>
&lt;/ul>
&lt;h2 id="extractive-summarization">Extractive summarization&lt;/h2>
&lt;p>Three main components&lt;/p>
&lt;ul>
&lt;li>Content selection (&amp;quot;&lt;em>Which parts are important to be in the summary?&lt;/em>&amp;quot;)&lt;/li>
&lt;li>Information ordering (&amp;quot;&lt;em>How to order summaries?&lt;/em>&amp;quot;)&lt;/li>
&lt;li>Sentence realization (Clean up/Simplify sentences)&lt;/li>
&lt;/ul>
&lt;h3 id="supervised-approaches">Supervised approaches&lt;/h3>
&lt;h4 id="key-word-extraction">Key-word extraction&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Given: Text (e.g. abstract of an article, &amp;hellip;)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Task: Find most important key phrases&lt;/strong>&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Computer&lt;/th>
&lt;th>Human&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Select key phrases from the text&lt;/td>
&lt;td>Abstraction of the text&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>No new wordings&lt;/td>
&lt;td>New words&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;/li>
&lt;/ul>
&lt;h5 id="key-phrase-extraction-using-supervised-approaches">&lt;strong>Key-phrase extraction using Supervised approaches&lt;/strong>&lt;/h5>
&lt;ul>
&lt;li>
&lt;p>Given: Collection of texts annotated with key words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Algorithm&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Extract all uni-grams, bi-grams and tri-grams&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-17%2023.08.38.png" alt="截屏2020-09-17 23.08.38">&lt;/p>
&lt;p>Extraction: Compatibility, Compatibility of, Compatibility of systems, of, of systems, of systems of, systems, systems of, systems of linear, of linear, of linear constraints, linear, linear constraints, linear constraints over, &amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Annotate each example with features&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Annotate training examples with class:&lt;/p>
&lt;ul>
&lt;li>1 if sequence is part of the key words&lt;/li>
&lt;li>0 if sequence is not part of the key words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Train classifier&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Create test examples and classify&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example set&lt;/p>
&lt;ul>
&lt;li>All uni-, bi-, and trigrams (except punctuation)&lt;/li>
&lt;li>restrict to certain POS sets&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 &lt;strong>Problem&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Enough examples to generate all/most key phrases&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Too many examples -&amp;gt; low performance of classifier&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Features&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Term frequency&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>TF-IDF (Term Frequency–Inverse Document Frequency)&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Reflect importance of a word in a document
&lt;/p>
$$
\text{TF-IDF} = tf * idf
$$
&lt;ul>
&lt;li>
&lt;p>$tf(w, D)$:&lt;/p>
&lt;ul>
&lt;li>Term Frequency, measures how frequently a term occurs in a document.&lt;/li>
&lt;li>Number of occurrences of word $w$ in document $D$ divided by the maximum frequency of any word in $D$&lt;/li>
&lt;/ul>
$$
t f(w, D)=\frac{\\#(w, D)}{\max\_{w^{\prime} \in D} \\#\left(w^{\prime}, D\right)}
$$
&lt;blockquote>
&lt;p>Alternative definition:
&lt;/p>
$$
tf(w, D) = \frac{\text{count of } w \text{ in } D}{\text{number of words in } D}
$$
&lt;/blockquote>
&lt;/li>
&lt;li>
&lt;p>$idf(w)$:&lt;/p>
&lt;ul>
&lt;li>Inverse Document Frequency, measures how important a term is&lt;/li>
&lt;li>Idea: Words which occur in fewer documents are more important&lt;/li>
&lt;li>Logarithm of the number of documents divided by the number of documents which contain $w$&lt;/li>
&lt;/ul>
$$
i d f(w)=\log \frac{|D|}{|\\{d \in D: w \in d\\}|}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="http://www.tfidf.com/">Example&lt;/a>:&lt;/p>
&lt;ul>
&lt;li>&lt;em>Consider a document containing 100 words wherein the word cat appears 3 times. The term frequency (i.e., tf) for cat is then (3 / 100) = 0.03. Now, assume we have 10 million documents and the word cat appears in one thousand of these. Then, the inverse document frequency (i.e., idf) is calculated as log(10,000,000 / 1,000) = 4. Thus, the Tf-idf weight is the product of these quantities: 0.03 * 4 = 0.12.&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Length of the example&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Relative position of the first occurrence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Boolean syntactic features&lt;/p>
&lt;ul>
&lt;li>contains all caps&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Learning algorithm&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Decision trees&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Naive Bayes classifier&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Evaluation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Compare results to reference&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Test set: Text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Human generated Key words&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Metrics:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Precision&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Recall&lt;/p>
&lt;/li>
&lt;li>
&lt;p>F-Score&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>Humans do not only extract key words, but also abstract&lt;/li>
&lt;li>Normally not all key words are reachable&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
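&lt;p>Two steps of the pipeline above lend themselves to a short sketch: generating uni-, bi- and tri-gram candidates, and computing the TF-IDF feature. The code uses the alternative tf definition (count divided by document length) and a base-10 logarithm, as in the quoted cat example. The function names are hypothetical.&lt;/p>

```python
import math

def ngram_candidates(tokens, max_n=3):
    """All uni-, bi- and tri-grams, ordered by start position (duplicates kept)."""
    return [" ".join(tokens[i:i + n])
            for i in range(len(tokens))
            for n in range(1, min(max_n, len(tokens) - i) + 1)]

def tf(count, total_words):
    """Term frequency: occurrences of the term divided by document length."""
    return count / total_words

def idf(num_documents, num_containing):
    """Inverse document frequency with a base-10 logarithm."""
    return math.log10(num_documents / num_containing)

tokens = "Compatibility of systems of linear constraints".split()
print(ngram_candidates(tokens)[:6])

# The cat example: 3 occurrences in 100 words, 1,000 of 10,000,000 documents
weight = tf(3, 100) * idf(10_000_000, 1_000)
print(round(weight, 2))  # 0.12
```

&lt;p>Unlike the extraction listed in the example, this sketch keeps repeated candidates such as the second "of". A real feature extractor would probably deduplicate them and filter by POS pattern.&lt;/p>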
&lt;h4 id="sentence-extraction">Sentence extraction&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Use statistical heuristics to select sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Do not change content and meaning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>💡 Idea&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Use measure to determine importance of sentence
&lt;ul>
&lt;li>TF-IDF&lt;/li>
&lt;li>Supervised trained combination of several features&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Rank sentences according to the metric&lt;/li>
&lt;li>Output sentences with highest scores:
&lt;ul>
&lt;li>Fixed number&lt;/li>
&lt;li>All sentences above a threshold&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Limitation: the text itself is NOT changed (e.g. no phrases are added and no parts of the text are deleted)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Evaluation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Idea: Compare automatic summary to abstract of text&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color:red">Problem: Different sentences &amp;ndash;&amp;gt; Nearly no exact match&lt;/span> 😭&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>ROUGE - Recall-Oriented Understudy for Gisting Evaluation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Also use approximate matches&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Compare automatic summary to human generated text&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Given a document D and an automatic summary X&lt;/p>
&lt;ul>
&lt;li>
&lt;p>M humans produce a set of reference summaries of D&lt;/p>
&lt;/li>
&lt;li>
&lt;p>What percentage of the n-grams from the reference summaries appear in X?&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>ROUGE-N: Overlap of N-grams between the system and reference summaries
&lt;/p>
$$
\text{ROUGE-N} = \frac{\sum\_{S \in \\{\text{Reference Summaries}\\}} \sum\_{gram\_n \in S}\operatorname{Count}\_{match}(gram\_n)}{\sum\_{S \in \\{\text{Reference Summaries}\\}} \sum\_{gram\_n \in S}\operatorname{Count}(gram\_n)}
$$
&lt;ul>
&lt;li>
&lt;p>Example:&lt;/p>
&lt;p>Auto-generated summary ($Y$)&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-tex" data-lang="tex">&lt;span class="line">&lt;span class="cl">the cat was found under the bed
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Gold standard (human produced) ($X1$)&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-tex" data-lang="tex">&lt;span class="line">&lt;span class="cl">the cat was under the bed
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>1-gram and 2-gram summary:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>#&lt;/th>
&lt;th>1-gram&lt;/th>
&lt;th>reference 1-gram&lt;/th>
&lt;th>2-gram&lt;/th>
&lt;th>reference 2-gram&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>1&lt;/td>
&lt;td>&lt;span style="color:green">the&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the cat&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the cat&lt;/span>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2&lt;/td>
&lt;td>&lt;span style="color:green">cat&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">cat&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">cat was&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">cat was&lt;/span>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>3&lt;/td>
&lt;td>&lt;span style="color:green">was&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">was&lt;/span>&lt;/td>
&lt;td>was found&lt;/td>
&lt;td>was under&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>4&lt;/td>
&lt;td>found&lt;/td>
&lt;td>&lt;span style="color:green">under&lt;/span>&lt;/td>
&lt;td>found under&lt;/td>
&lt;td>&lt;span style="color:green">under the&lt;/span>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5&lt;/td>
&lt;td>&lt;span style="color:green">under&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">under the&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the bed&lt;/span>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>6&lt;/td>
&lt;td>&lt;span style="color:green">the&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">bed&lt;/span>&lt;/td>
&lt;td>&lt;span style="color:green">the bed&lt;/span>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>7&lt;/td>
&lt;td>&lt;span style="color:green">bed&lt;/span>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>count&lt;/td>
&lt;td>7&lt;/td>
&lt;td>6&lt;/td>
&lt;td>6&lt;/td>
&lt;td>5&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;ul>
&lt;li>$\operatorname{ROUGE}-1(X1, Y) = \frac{6}{6} = 1$&lt;/li>
&lt;li>$\operatorname{ROUGE}-2(X1, Y) = \frac{4}{5}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
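&lt;p>The ROUGE-N example above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes whitespace tokenization and lowercasing, and the helper names &lt;code>ngrams&lt;/code> and &lt;code>rouge_n&lt;/code> are made up for illustration. The matched count is clipped, i.e. a reference n-gram is counted at most as often as it occurs in the automatic summary.&lt;/p>

```python
from collections import Counter


def ngrams(text, n):
    """Return a Counter of the word n-grams in a whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(references, candidate, n):
    """Recall-oriented ROUGE-N of one candidate against a list of references."""
    candidate_counts = ngrams(candidate, n)
    matched = 0
    total = 0
    for reference in references:
        reference_counts = ngrams(reference, n)
        total += sum(reference_counts.values())
        # Clipped match count: a reference n-gram is matched at most as often
        # as it occurs in the candidate.
        matched += sum(min(count, candidate_counts[gram])
                       for gram, count in reference_counts.items())
    return matched / total if total else 0.0


reference = "the cat was under the bed"
candidate = "the cat was found under the bed"
rouge_1 = rouge_n([reference], candidate, 1)  # 6 of 6 reference unigrams matched
rouge_2 = rouge_n([reference], candidate, 2)  # 4 of 5 reference bigrams matched
```

&lt;p>Because the denominator counts reference n-grams only, extra words in the automatic summary (here "found") do not lower the score; ROUGE-N is a recall measure.&lt;/p>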
&lt;h3 id="unsupervised-approaches">Unsupervised approaches&lt;/h3>
&lt;p>Problems of supervised approaches: &lt;span style="color:red">Hard to acquire training data&lt;/span>&lt;/p>
&lt;p>We try to use unsupervised learning to find key phrases / sentences which are most important. But which sentences are most important?&lt;/p>
&lt;p>💡 Idea: Sentences which are most similar to the other sentences in the text&lt;/p>
&lt;h3 id="graph-based-approaches">Graph-based approaches&lt;/h3>
&lt;ul>
&lt;li>Map text into a graph
&lt;ul>
&lt;li>Nodes:
&lt;ul>
&lt;li>Text segments: Words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Edges: Similarity&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Find most important/central vertices&lt;/li>
&lt;li>Algorithms: TextRank / LexRank&lt;/li>
&lt;/ul>
&lt;h4 id="graph-based-approaches--key-phrase-extraction">Graph-based approaches : &lt;strong>Key-phrase extraction&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Graph&lt;/p>
&lt;ul>
&lt;li>Nodes
&lt;ul>
&lt;li>Text segments: Words&lt;/li>
&lt;li>Restriction:
&lt;ul>
&lt;li>Nouns&lt;/li>
&lt;li>Adjectives&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Edges
&lt;ul>
&lt;li>items co-occur in a window of N words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Calculate most important nodes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Build Multi-word expression in &lt;strong>post-processing&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Mark selected items in original text&lt;/li>
&lt;li>If two adjacent words are both marked, collapse them into one multi-word expression&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2000.43.48.png" alt="截屏2020-09-18 00.43.48">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="graph-based-approaches--sentence-extraction">&lt;strong>Graph-based approaches : Sentence extraction&lt;/strong>&lt;/h4>
&lt;p>Graph:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Nodes: Sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Edges: Fully connected with weights&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Weights:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>TextRank: Word overlap normalized to sentence length
&lt;/p>
$$
\text {Similarity}\left(S\_{i}, S\_{j}\right)=\frac{\left|\left\\{w\_{k} \mid w\_{k} \in S\_{i} \land w\_{k} \in S\_{j}\right\\}\right|}{\log \left(\left|S\_{i}\right|\right)+\log \left(\left|S\_{j}\right|\right)}
$$
&lt;/li>
&lt;li>
&lt;p>LexRank: Cosine Similarity of TF-IDF vectors
&lt;/p>
$$
\text { idf-modified-cosine }(x, y)=\frac{\sum\_{w \in x, y} \mathrm{tf}\_{w, x} \mathrm{tf}\_{w, y}\left(\mathrm{idf}\_{w}\right)^{2}}{\sqrt{\sum\_{x\_{i} \in x}\left(\mathrm{tf}\_{x\_{i}, x} \mathrm{idf}\_{x\_{i}}\right)^{2}} \times \sqrt{\sum\_{y\_{i} \in y}\left(\mathrm{tf}\_{y\_{i}, y} \mathrm{idf}\_{y\_{i}}\right)^{2}}}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
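&lt;p>As a rough illustration of graph-based sentence extraction, the sketch below computes the TextRank edge weight (word overlap divided by the sum of the log sentence lengths) and then scores the sentences with a weighted PageRank-style iteration. The tokenization (lowercased whitespace split), the damping factor of 0.85, the fixed number of iterations and all function names are assumptions made for this example.&lt;/p>

```python
import math


def textrank_similarity(sentence_a, sentence_b):
    """Word overlap normalized by the log sentence lengths (TextRank weight)."""
    words_a = sentence_a.lower().split()
    words_b = sentence_b.lower().split()
    if not words_a or not words_b:
        return 0.0
    overlap = len(set(words_a).intersection(words_b))
    denominator = math.log(len(words_a)) + math.log(len(words_b))
    # Two one-word sentences give log(1) + log(1) = 0; treat that as no edge.
    return overlap / denominator if denominator > 0 else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [[textrank_similarity(a, b) if i != j else 0.0
                for j, b in enumerate(sentences)]
               for i, a in enumerate(sentences)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n) if out_sums[j] > 0)
            for i in range(n)
        ]
    return scores


sentences = [
    "the cat sat under the bed",
    "the cat was found under the bed",
    "stock prices fell sharply today",
]
scores = rank_sentences(sentences)
```

&lt;p>The third sentence shares no words with the others, so it has no weighted edges and ends up with the lowest score; the two similar sentences reinforce each other.&lt;/p>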
&lt;h2 id="abstract-summarization">Abstract summarization&lt;/h2>
&lt;ul>
&lt;li>Sequence to Sequence task
&lt;ul>
&lt;li>Input: Document&lt;/li>
&lt;li>Output: Summary&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Several NLP tasks can be modeled like this (ASR, MT,&amp;hellip;)&lt;/li>
&lt;li>Successful deep learning approach: &lt;strong>Encoder-Decoder Model&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h3 id="sequence-to-sequence-model">Sequence-to-Sequence Model&lt;/h3>
&lt;img src="https://miro.medium.com/max/1986/1*1JcHGUU7rFgtXC_mydUA_Q.jpeg" alt="Image for post" style="zoom: 33%;" />
&lt;ul>
&lt;li>Predict words based on
&lt;ul>
&lt;li>previous target words and&lt;/li>
&lt;li>source sentence&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Encoder: Read in source sentence&lt;/li>
&lt;li>Decoder: Generate target sentence word by word&lt;/li>
&lt;/ul>
&lt;h4 id="encoder">Encoder&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2010.11.37.png" alt="截屏2020-09-18 10.11.37">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Read in input: Represent content as hidden vector with fixed dimension&lt;/p>
&lt;/li>
&lt;li>
&lt;p>LSTM-based model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Fixed-size sentence representation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Details:&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2010.12.59.png" alt="截屏2020-09-18 10.12.59">&lt;/p>
&lt;ul>
&lt;li>One–hot encoding&lt;/li>
&lt;li>Word embedding&lt;/li>
&lt;li>RNN layer(s)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="decoder">Decoder&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2010.13.33.png" alt="截屏2020-09-18 10.13.33">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Generate output: Use output of encoder as input&lt;/p>
&lt;/li>
&lt;li>
&lt;p>LSTM-based model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input last target word&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="attention-based-encoder-decoder">Attention-based Encoder-Decoder&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2010.17.42.png" alt="截屏2020-09-18 10.17.42" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2010.16.48.png" alt="截屏2020-09-18 10.16.48" style="zoom:67%;" />
&lt;h4 id="attention-based-encoder--copy-mechanism">Attention-based Encoder : copy mechanism&lt;/h4>
&lt;p>Calculate the &lt;strong>generation probability&lt;/strong>: the probability that it is better to generate a word from the vocabulary than to copy a word from the source sentence
&lt;/p>
$$
p\_{g e n}=\sigma\left(w\_{c}^{T} c\_{t}+w\_{s}^{T} s\_{t}+w\_{x}^{T} x\_{t}+b\_{p t r}\right)
$$
&lt;p>
The word with the &lt;strong>highest probability&lt;/strong> is chosen as the output word
&lt;/p>
$$
P(w)=p\_{g e n} P\_{v o c a b}(w)+\left(1-p\_{g e n}\right) \sum\_{j: w\_{j}=w} \alpha\_{i j}
$$
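&lt;p>The mixture in the second formula can be sketched with plain dictionaries. This is only an illustration under simplifying assumptions: the generation probability is passed in as a number instead of being computed from learned weight vectors, and the vocabulary probabilities, source tokens and attention weights are toy values. The function name &lt;code>final_distribution&lt;/code> is hypothetical.&lt;/p>

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def final_distribution(p_gen, vocab_probs, source_tokens, attention):
    """Mix the vocabulary distribution with the copy (attention) distribution."""
    mixed = {word: p_gen * prob for word, prob in vocab_probs.items()}
    for token, weight in zip(source_tokens, attention):
        # A source word may occur several times; its attention weights add up.
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed


# Toy values, chosen only for illustration.
p_gen = sigmoid(0.0)  # stands in for the learned linear combination; equals 0.5
vocab_probs = {"the": 0.5, "cat": 0.3, "bed": 0.2}
source_tokens = ["the", "tabby", "cat"]
attention = [0.1, 0.7, 0.2]

distribution = final_distribution(p_gen, vocab_probs, source_tokens, attention)
output_word = max(distribution, key=distribution.get)
```

&lt;p>In this toy setting the word "tabby" is not in the vocabulary at all, yet it receives the highest final probability because the attention mass on it is copied over. This is the main benefit of the copy mechanism for rare or unseen words.&lt;/p>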
&lt;h3 id="data">Data&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Training data&lt;/strong>
&lt;ul>
&lt;li>Documents and summary&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>DUC data set&lt;/strong>
&lt;ul>
&lt;li>News articles&lt;/li>
&lt;li>Summaries of around 14 words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Gigaword&lt;/strong>
&lt;ul>
&lt;li>News articles&lt;/li>
&lt;li>Headline generation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>CNN/Daily Mail corpus&lt;/strong>
&lt;ul>
&lt;li>Articles&lt;/li>
&lt;li>Predict bullet points&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="reference">Reference&lt;/h2>
&lt;ul>
&lt;li>TF-IDF:
&lt;ul>
&lt;li>&lt;a href="https://blog.csdn.net/zhaomengszu/article/details/81452907">TF-IDF算法详解&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Question Answering</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/07-question-answering/</link><pubDate>Fri, 18 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/07-question-answering/</guid><description>&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2010.32.49.png" alt="截屏2020-09-18 10.32.49" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="definition">Definition&lt;/h2>
&lt;p>&lt;strong>Question Answering&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Automatically answer questions posed by humans in natural language&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Give user short answer to their question&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Gather and consult necessary information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Related topics&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Information Retrieval&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Reading Comprehension&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Database Access&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dialog&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Text Summarization&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="problem-dimensions">Problem Dimensions&lt;/h2>
&lt;h3 id="questions">Questions&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Question class&lt;/strong>
&lt;ul>
&lt;li>Almost universally factoid questions
E.g.: &lt;em>“What does the Peugeot company manufacture?”&lt;/em>&lt;/li>
&lt;li>More open in dialog context&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Question domain&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Topic of the content&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Open-Domain: Any topic&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Closed-Domain: Specific topic, e.g. movies, sports, etc.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Context&lt;/strong>
&lt;ul>
&lt;li>How much context is provided?&lt;/li>
&lt;li>Is search necessary?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Answer types&lt;/strong>
&lt;ul>
&lt;li>Factual Answers&lt;/li>
&lt;li>Opinion&lt;/li>
&lt;li>Summary&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Kind of questions&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Yes/No&lt;/p>
&lt;/li>
&lt;li>
&lt;p>“wh”-questions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Indirect requests (I would like to&amp;hellip;)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Commands&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="applications">Applications&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Knowledge source types&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Structured data (database)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Semi-structured data (e.g. Wikipedia tables)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Free text (e.g. Wikipedia text)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Knowledge source origins&lt;/strong>
&lt;ul>
&lt;li>Search over the web&lt;/li>
&lt;li>Search of a collection&lt;/li>
&lt;li>Single text&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Domain&lt;/strong>
&lt;ul>
&lt;li>Domain-independent&lt;/li>
&lt;li>Domain-specific system&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="users">Users&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>First time/casual users&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Explain limitations&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Power users&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Emphasize novel information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Omit previously provided information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="answers">Answers&lt;/h3>
&lt;ul>
&lt;li>Long&lt;/li>
&lt;li>Short&lt;/li>
&lt;li>Lists&lt;/li>
&lt;li>Narrative&lt;/li>
&lt;li>Creation
&lt;ul>
&lt;li>Extraction&lt;/li>
&lt;li>Generation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="evaluation">Evaluation&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>What is a good answer?&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Should the answer be short or long?&lt;/strong>
&lt;ul>
&lt;li>Easier to have the answer in longer segments&lt;/li>
&lt;li>Less concise, more comprehensive&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="presentation">Presentation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Underspecified question&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Feedback&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Too many documents&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Text or speech input&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="examples">Examples&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>TREC&lt;/strong>&lt;/li>
&lt;li>&lt;strong>SQuAD&lt;/strong> (Stanford Question Answering Dataset)&lt;/li>
&lt;li>&lt;strong>IBM Watson&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;ul>
&lt;li>Vast amounts of information written by humans for humans&lt;/li>
&lt;li>Computers are good at searching vast amounts of information&lt;/li>
&lt;li>Natural interaction with computers &amp;#x1f4aa;&lt;/li>
&lt;/ul>
&lt;h2 id="system-approaches">System Approaches&lt;/h2>
&lt;h3 id="text-based-system">Text-based system&lt;/h3>
&lt;ul>
&lt;li>Use &lt;strong>information retrieval&lt;/strong> to search for matching documents&lt;/li>
&lt;/ul>
&lt;h3 id="knowledge-based-approaches">Knowledge-based approaches&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Build &lt;strong>semantic representation&lt;/strong> of the query&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Retrieve answer from semantic databases (Ontologies)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="knowledge-rich--hybrid-approaches">Knowledge-rich / hybrid approaches&lt;/h3>
&lt;p>Combine both&lt;/p>
&lt;h2 id="qa-system-overview">QA System Overview&lt;/h2>
&lt;h3 id="components">Components&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Information Retrieval&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Need to find good text segments&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Answer Extraction&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Given some context and the question, produce an answer&lt;/li>
&lt;li>Either part may be supplemented by other NLP tools&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Common Components&lt;/strong>&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2013.12.43.png" alt="截屏2020-09-18 13.12.43" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h3 id="preprocessing">Preprocessing&lt;/h3>
&lt;h3 id="question-analysis">Question Analysis&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2013.02.34.png" alt="截屏2020-09-18 13.02.34" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Input: Natural language question&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Implicit input&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Dialog state&lt;/p>
&lt;/li>
&lt;li>
&lt;p>User information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Derived inputs&lt;/p>
&lt;ul>
&lt;li>POS-tags, NER, dependency graph, syntax tree, etc.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Output: Representation for Information Retrieval and Answer Extraction&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>For &lt;strong>IR&lt;/strong>: Weighted vector or search term collection&lt;/li>
&lt;li>For &lt;strong>answer extraction&lt;/strong>
&lt;ul>
&lt;li>Lexical answer type (person/company/acronym/&amp;hellip;)&lt;/li>
&lt;li>Additional constraints (e.g. relations)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="answer-type-classification">Answer Type Classification&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2013.04.13.png" alt="截屏2020-09-18 13.04.13" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Classical approach&lt;/strong>: Question word (who, what, where,&amp;hellip;)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>When: date&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Who: person&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Where: location&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Regular expressions&lt;/p>
&lt;p>Who {is | was | are | were } – Person&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Question head word (First noun phrase after the question word)&lt;/p>
&lt;ul>
&lt;li>Which &lt;strong>city&lt;/strong> in China has the largest number of foreign financial companies?&lt;/li>
&lt;li>What is the state &lt;strong>flower&lt;/strong> of California?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problems&lt;/p>
&lt;ul>
&lt;li>“Who” questions could refer to e.g. companies
&lt;ul>
&lt;li>E.g. &lt;em>&amp;ldquo;Who makes the Beetle?&amp;rdquo;&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>“Which” / “What” questions do not indicate a clear answer type
&lt;ul>
&lt;li>E.g. &lt;em>&amp;ldquo;What was the Beatles’ first hit single?&amp;rdquo;&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Approaches&lt;/p>
&lt;ul>
&lt;li>Manually created question type hierarchy&lt;/li>
&lt;li>Machine learning classification&lt;/li>
&lt;/ul>
&lt;p>(Current ML systems often do NOT use Answer Type Classification 😂)&lt;/p>
&lt;/li>
&lt;/ul>
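&lt;p>The classical question-word approach can be sketched as an ordered table of regular expressions. The rule table and the type labels below are hypothetical and only cover the examples from this section; the point is to show both how such rules work and where they fall short.&lt;/p>

```python
import re

# Hypothetical rule table: (pattern, answer type), checked in order.
ANSWER_TYPE_RULES = [
    (re.compile(r"^who\s+(is|was|are|were)\b", re.IGNORECASE), "PERSON"),
    (re.compile(r"^when\b", re.IGNORECASE), "DATE"),
    (re.compile(r"^where\b", re.IGNORECASE), "LOCATION"),
]


def classify_answer_type(question):
    """Return the answer type of the first matching rule, else UNKNOWN."""
    text = question.strip()
    for pattern, answer_type in ANSWER_TYPE_RULES:
        if pattern.search(text):
            return answer_type
    return "UNKNOWN"


person = classify_answer_type("Who was the first person on the moon?")
company = classify_answer_type("Who makes the Beetle?")
single = classify_answer_type("What was the Beatles' first hit single?")
```

&lt;p>With these rules the first question is typed as a person, while the two problem cases from the list above fall through to the unknown type. Handling them needs either a richer question type hierarchy or a learned classifier.&lt;/p>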
&lt;h4 id="constraints">Constraints&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Keyword extraction&lt;/p>
&lt;ul>
&lt;li>Expand keywords using synonyms&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Statistical parsing&lt;/p>
&lt;ul>
&lt;li>Identify semantic constraints&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>Represent a question as bag-of-words&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;em>“What was the monetary value of the Nobel Peace Prize in 1989?”&lt;/em>&lt;/p>
&lt;p>&lt;code>monetary, value, Nobel, Peace, Prize, 1989&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>“What does the Peugeot company manufacture?”&lt;/em>&lt;/p>
&lt;p>&lt;code>Peugeot, company, manufacture&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;em>“How much did Mercury spend on advertising in 1993?”&lt;/em>&lt;/p>
&lt;p>&lt;code>Mercury, spend, advertising, 1993&lt;/code>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
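&lt;p>A bag-of-words representation like the ones above can be obtained by tokenizing the question and dropping function words. The stopword list in this sketch is a small hand-picked assumption that happens to cover the example questions; a real system would use a much larger list and could additionally expand the keywords with synonyms.&lt;/p>

```python
import re

# Small illustrative stopword list; a real system would use a larger one.
STOPWORDS = {"what", "was", "the", "of", "in", "does", "how", "much", "did", "on"}


def question_keywords(question):
    """Return the question's tokens in order, without stopwords."""
    tokens = re.findall(r"\w+", question)
    return [token for token in tokens if token.lower() not in STOPWORDS]


peugeot = question_keywords("What does the Peugeot company manufacture?")
mercury = question_keywords("How much did Mercury spend on advertising in 1993?")
```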
&lt;h3 id="retrieval-candidate-document-selection">Retrieval: Candidate Document Selection&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2013.10.29.png" alt="截屏2020-09-18 13.10.29" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Most common approach:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Conventional Information Retrieval search&lt;/strong>
&lt;ul>
&lt;li>
&lt;p>Using search indices&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Lucene&lt;/p>
&lt;/li>
&lt;li>
&lt;p>TF-IDF&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Several stages&lt;/strong>: Coarse-to-fine search&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Result: Small set of documents for detailed analysis&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Decisions: Boolean vs. rank-based engines&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Retrieve only part of the document&lt;/p>
&lt;ul>
&lt;li>Mostly only part of the document is important&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Passage retrieval&lt;/p>
&lt;ul>
&lt;li>Return only &lt;strong>subsets&lt;/strong> of the document&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Segment document into coherent text segments&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Combine results from multiple search engines&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Text-based system&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Use only syntactic information such as n-grams&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example: &lt;strong>TF-IDF&lt;/strong> (Term Frequency, Inverse Document Frequency)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Weighted bag-of-words vector&lt;/p>
&lt;/li>
&lt;li>
&lt;p>One component per word in vocabulary&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Term frequency&lt;/strong>: Number of times term appears in the document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Document frequency&lt;/strong>: Number of documents the term appears in&lt;/p>
&lt;/li>
&lt;/ul>
$$
\begin{array}{l}
T F^{\prime}(d, t)=\log (1+T F(d, t)) \\\\
I D F(t)=\log \frac{n_{d}-D F(t)}{D F(t)} \\\\
T F I D F(d, t)=T F^{\prime}(d, t) I D F(t)
\end{array}
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Knowledge-based / semantic-based system&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Build semantic representation by extracting information from the question&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Construct structured query for semantic database&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Not raw or indexed text corpus&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Examples&lt;/p>
&lt;ul>
&lt;li>
&lt;p>WordNet&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Wikipedia Infoboxes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>FreeBase&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
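&lt;p>The TF-IDF weighting given for the text-based system can be sketched as follows. The code follows the three formulas literally (log-scaled term frequency, and an inverse document frequency based on the ratio of documents without the term to documents with it) and assumes that the symbol for the collection size denotes the total number of documents. Whitespace tokenization and the zero weight for terms that occur in every document are assumptions of this sketch, since the logarithm of zero is undefined.&lt;/p>

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    document_frequency = Counter(
        term for tokens in tokenized for term in set(tokens))

    def idf(term):
        df = document_frequency[term]
        # Assumption: a term found in every document gets weight 0,
        # because log(0) is undefined.
        if df == n_docs:
            return 0.0
        return math.log((n_docs - df) / df)

    vectors = []
    for tokens in tokenized:
        term_frequency = Counter(tokens)
        vectors.append({term: math.log(1 + tf) * idf(term)
                        for term, tf in term_frequency.items()})
    return vectors


documents = [
    "peugeot manufactures cars",
    "the company manufactures bicycles",
    "the weather is nice",
    "the train is late",
]
vectors = tf_idf_vectors(documents)
```

&lt;p>Note that this IDF variant is zero for a term that occurs in exactly half of the documents ("manufactures") and negative for a term that occurs in more than half of them ("the"), so frequent words are actively penalized rather than just down-weighted.&lt;/p>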
&lt;h3 id="candidate-document-analysis">Candidate Document Analysis&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2014.50.32.png" alt="截屏2020-09-18 14.50.32" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>Named entity tagging
&lt;ul>
&lt;li>Often including subclasses (towns, cities, provinces, &amp;hellip;)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Sentence splitting, tagging, chunk parsing&lt;/li>
&lt;li>Identify multi-word terms and their variants&lt;/li>
&lt;li>Represent relation constraints of the text&lt;/li>
&lt;/ul>
&lt;h3 id="answer-extraction">Answer Extraction&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2014.58.16.png" alt="截屏2020-09-18 14.58.16" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Input&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Representations for candidate text segments and question&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Rank set of candidate sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Expected answer type(s)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Find answer strings that match the answer type(s) based on documents&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Extractive: Answers are substrings in the documents&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Generative: Answers are free text (NLG)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Rank the candidate answers&lt;/p>
&lt;ul>
&lt;li>E.g. overlap between answer and question&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Return result(s) with best overall score&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2015.00.35.png" alt="截屏2020-09-18 15.00.35" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="response-generation">Response Generation&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2015.01.12.png" alt="截屏2020-09-18 15.01.12" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Rephrase text segment&lt;/p>
&lt;ul>
&lt;li>E.g. resolve anaphors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Provide longer or shorter answer&lt;/p>
&lt;ul>
&lt;li>Add some part of context into the answer&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>If answer is too complex&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Truncate answer&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Start dialog&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="neural-network-approach">Neural Network Approach&lt;/h2>
&lt;ul>
&lt;li>Neural models struggle with Information Retrieval 🤪&lt;/li>
&lt;li>&lt;strong>Excellent results on answer extraction&lt;/strong> 😍
&lt;ul>
&lt;li>Given: Question and Context (document, paragraph, nugget, etc.)&lt;/li>
&lt;li>Result: Answer as substring from context
&lt;ul>
&lt;li>Predict most likely start and end index as classification task&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Combines:
&lt;ul>
&lt;li>Question Analysis&lt;/li>
&lt;li>Retrieved Document Analysis&lt;/li>
&lt;li>Answer Extraction&lt;/li>
&lt;li>Response Generation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="neural-answer-extraction">Neural Answer Extraction&lt;/h3>
&lt;p>&lt;strong>Encoder-decoder model&lt;/strong>&lt;/p>
&lt;p>Encoder&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2019.06.22.png" alt="截屏2020-09-18 19.06.22" style="zoom:80%;" />
&lt;p>Answer prediction&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2019.07.10.png" alt="截屏2020-09-18 19.07.10" style="zoom:80%;" />
&lt;ul>
&lt;li>Softmax output $i$ is probability that answer starts at token $i$&lt;/li>
&lt;li>Mirrored setup for end probability&lt;/li>
&lt;li>🔴 &lt;span style="color:red">Problem: Relying on single vector for question encoding&lt;/span>
&lt;ul>
&lt;li>Long range dependencies&lt;/li>
&lt;li>Feedback at end of sequence&lt;/li>
&lt;li>Vanishing gradients&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
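&lt;p>Predicting the answer as a start and an end index can be illustrated without any neural network by starting from per-token scores. In the sketch below the scores are made-up numbers standing in for the model outputs; combining the two softmax distributions by their product, requiring the start not to come after the end, and limiting the span length are common choices but are assumptions here, not something stated in the lecture.&lt;/p>

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def best_span(start_scores, end_scores, max_length=10):
    """Pick the most probable (start, end) pair with start not after end."""
    start_probs = softmax(start_scores)
    end_probs = softmax(end_scores)
    best = (0, 0)
    best_prob = -1.0
    for start, p_start in enumerate(start_probs):
        stop = min(start + max_length, len(end_probs))
        for end in range(start, stop):
            prob = p_start * end_probs[end]
            if prob > best_prob:
                best, best_prob = (start, end), prob
    return best


# Toy scores standing in for the outputs of the two prediction layers.
context = ["Peugeot", "manufactures", "cars", "and", "bicycles"]
start_scores = [0.1, 0.2, 2.5, 0.0, 0.3]
end_scores = [0.0, 0.1, 0.4, 0.2, 2.0]

start, end = best_span(start_scores, end_scores)
answer = " ".join(context[start:end + 1])
```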
&lt;p>Solution: Use MORE information from the question&lt;/p>
&lt;p>&amp;ndash;&amp;gt; &lt;strong>Attention mechanism&lt;/strong>&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%e6%88%aa%e5%b1%8f2020-09-18%2019.09.26.png" alt="截屏2020-09-18 19.09.26" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;ul>
&lt;li>Calculates weighted sum of question encodings&lt;/li>
&lt;li>Weight is based on similarity between question encoding and context encoding&lt;/li>
&lt;li>Different similarity metrics&lt;/li>
&lt;/ul>
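&lt;p>The attention step described above can be sketched for a single context token. This example assumes dot-product similarity (one of several possible similarity metrics) and uses tiny hand-written vectors in place of learned encodings; the function name &lt;code>attend&lt;/code> is hypothetical.&lt;/p>

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(context_vector, question_vectors):
    """Weighted sum of question encodings for one context token."""
    scores = [dot(context_vector, q) for q in question_vectors]
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    dim = len(question_vectors[0])
    summary = [sum(w * q[k] for w, q in zip(weights, question_vectors))
               for k in range(dim)]
    return weights, summary


# Toy encodings: three question tokens, one context token.
question_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context_vector = [2.0, 0.0]
weights, summary = attend(context_vector, question_vectors)
```

&lt;p>The first and third question encodings are equally similar to the context encoding and therefore receive the same, larger weight; the second one contributes little to the weighted sum. Each context token gets its own weighted view of the question instead of a single fixed vector.&lt;/p>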
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">Review of models see:&lt;/span>
&lt;/div></description></item><item><title>Natural/Spoken Language Understanding</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/08-natural-language-understanding/</link><pubDate>Fri, 18 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/08-natural-language-understanding/</guid><description>&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2022.29.24.png" alt="截屏2020-09-18 22.29.24">&lt;/p>
&lt;h2 id="definition">Definition&lt;/h2>
&lt;p>&lt;strong>Natural language understanding&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Representing the semantics of natural language&lt;/li>
&lt;li>Possible view: Translation from natural language to representation of meaning&lt;/li>
&lt;/ul>
&lt;p>&lt;span style="color:red">Difficulties&lt;/span>&lt;/p>
&lt;ul>
&lt;li>Ambiguities
&lt;ul>
&lt;li>
&lt;p>Lexical&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Syntax&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Referential&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Vagueness
&lt;ul>
&lt;li>E.g., &lt;em>&amp;ldquo;I had a late lunch.&amp;rdquo;&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Dimensions
&lt;ul>
&lt;li>Depth: Shallow vs Deep&lt;/li>
&lt;li>Domain: Narrow vs Open&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="examples">Examples&lt;/h2>
&lt;h3 id="siri-2011">Siri (2011)&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2012.02.45.png" alt="截屏2020-09-19 12.02.45" style="zoom:50%;" />
&lt;h2 id="dialog-modeling">Dialog Modeling&lt;/h2>
&lt;p>&lt;strong>Dialog system / Conversational agent&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Computer system that converses with a human&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Coherent structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Different modalities:&lt;/p>
&lt;ul>
&lt;li>Text, speech, graphics, haptics, gestures&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="components">Components&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2022.58.28.png" alt="截屏2020-09-18 22.58.28">&lt;/p>
&lt;h4 id="input-recognition">Input recognition&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Different modalities&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Automatic speech recognition (ASR)&lt;/strong>&lt;/li>
&lt;li>Gesture recognition&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Transformation&lt;/p>
&lt;p>Input modality (e.g. speech) &amp;ndash;&amp;gt; text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>May introduce first errors&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High influence on the performance of a dialog system&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="natural-language-understanding-nlu">Natural language understanding (NLU)&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Semantic interpretation of written text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transformation from natural language to semantic representation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Representations:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Deep vs Shallow&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Domain-dependent vs. domain-independent&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="dialog-manager-dm">Dialog manager (DM)&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Manage flow of conversation&lt;/strong>&lt;/li>
&lt;li>Input: Semantic representation of the input&lt;/li>
&lt;li>Output: Semantic representation of the output&lt;/li>
&lt;li>Utilize additional knowledge
&lt;ul>
&lt;li>
&lt;p>User information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dialog History&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Task-specific information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="natural-language-generation-nlg">Natural language generation (NLG)&lt;/h4>
&lt;p>&lt;strong>Generate natural language from semantic representation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Input: Semantic output representation of the dialog manager&lt;/li>
&lt;li>Output: Natural language text for the user&lt;/li>
&lt;/ul>
&lt;h4 id="output-rendering">Output rendering&lt;/h4>
&lt;p>&lt;strong>Generate correct output&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>e.g. &lt;strong>Text-to-Speech (TTS)&lt;/strong> for Spoken Dialog Systems&lt;/li>
&lt;/ul>
&lt;h2 id="natural-language-understanding">Natural Language understanding&lt;/h2>
&lt;h3 id="approaches">Approaches&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-18%2023.09.46.png" alt="截屏2020-09-18 23.09.46">&lt;/p>
&lt;h4 id="output-representation">&lt;strong>Output representation&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Relation instances&lt;/strong>
&lt;ul>
&lt;li>(Larry Page, founder, Google)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Logical forms&lt;/strong>
&lt;ul>
&lt;li>Love(Mary, John)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Scalar&lt;/strong>
&lt;ul>
&lt;li>Positive/Negative 0.9&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Vector&lt;/strong>
&lt;ul>
&lt;li>Hidden representation/ Word embeddings&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="algorithms">Algorithms&lt;/h4>
&lt;ul>
&lt;li>Rule-based / Template&lt;/li>
&lt;li>Machine learning
&lt;ul>
&lt;li>Conditional random fields&lt;/li>
&lt;li>Support Vector Machine&lt;/li>
&lt;li>Neural Networks / Deep learning&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="semantic-parsing">Semantic Parsing&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Parse natural language sentence into semantic representation&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Machine learning approaches most successful &amp;#x1f44f;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Most common approach:&lt;/p>
&lt;ul>
&lt;li>Shallow Semantic Parsing / Semantic Role Labeling&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Most important resources:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#propbank">PropBank&lt;/a>&lt;/li>
&lt;li>&lt;a href="#framenet">FrameNet&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="propbank">PropBank&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Proposition Bank (PropBank)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Labels for all sentences in the English Penn TreeBank&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Defines semantics based on the verbs of the sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Verbs&lt;/strong>: Define different senses of the verbs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Sense&lt;/strong>: Each sense defines the arguments important to it (often identified only by number)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Arg0: Proto-Agent&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Arg1: Proto-Patient&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Arg2: mostly benefactive, instrument, attribute, or end state&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Arg3: start point, benefactive, instrument, or attribute&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h5 id="example-agree">Example: &amp;ldquo;agree&amp;rdquo;&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-18%2023.16.28.png" alt="截屏2020-09-18 23.16.28" style="zoom:67%;" />
&lt;h5 id="example-fall">Example: &amp;ldquo;fall&amp;rdquo;&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2010.52.24.png" alt="截屏2020-09-19 10.52.24" style="zoom: 40%;" />
&lt;h5 id="propbank-argm">PropBank ArgM&lt;/h5>
&lt;ul>
&lt;li>&lt;code>TMP&lt;/code> : when? &lt;em>yesterday evening, now&lt;/em>&lt;/li>
&lt;li>&lt;code>LOC&lt;/code> : where? &lt;em>at the museum, in San Francisco&lt;/em>&lt;/li>
&lt;li>&lt;code>DIR&lt;/code> : where to/from? &lt;em>down, to Bangkok&lt;/em>&lt;/li>
&lt;li>&lt;code>MNR&lt;/code> : how? &lt;em>clearly, with much enthusiasm&lt;/em>&lt;/li>
&lt;li>&lt;code>PRP/CAU&lt;/code> : why? &lt;em>because &amp;hellip; , in response to the ruling&lt;/em>&lt;/li>
&lt;li>&lt;code>REC&lt;/code> : &lt;em>themselves, each other&lt;/em>&lt;/li>
&lt;li>&lt;code>ADV&lt;/code> : &lt;em>miscellaneous&lt;/em>&lt;/li>
&lt;li>&lt;code>PRD&lt;/code> : secondary predication &amp;hellip;&lt;em>ate the meat raw&lt;/em>&lt;/li>
&lt;/ul>
&lt;h5 id="-problem">🔴 Problem&lt;/h5>
&lt;p>The same predicate can be expressed by different words, or by a noun instead of a verb&lt;/p>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2010.59.11.png" alt="截屏2020-09-19 10.59.11" style="zoom: 40%;" />
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">See also: &lt;a href="https://zhuanlan.zhihu.com/p/37254041">SRL数据集(1): Proposition Bank 数据集介绍&lt;/a> (SRL datasets (1): an introduction to the Proposition Bank dataset; in Chinese)&lt;/span>
&lt;/div>
&lt;h4 id="framenet">FrameNet&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Roles based on Frames&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Frame&lt;/strong>: holistic background knowledge that unites these words&lt;/li>
&lt;li>&lt;strong>Frame-elements&lt;/strong>: Frame-specific semantic roles&lt;/li>
&lt;/ul>
&lt;h5 id="example-1">Example 1&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2011.08.49.png" alt="截屏2020-09-19 11.08.49" style="zoom: 40%;" />
&lt;h5 id="example-2">Example 2&lt;/h5>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2011.19.24.png" alt="截屏2020-09-19 11.19.24" style="zoom: 40%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2011.18.40.png" alt="截屏2020-09-19 11.18.40" style="zoom: 33%;" />
&lt;h3 id="semantic-role-labeling">Semantic Role Labeling&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Task: Automatically finding semantic roles for each argument of each predicate&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Approach: Machine learning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>High level algorithm:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Parse sentence (Syntax tree)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Find Predicates&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For every node in tree: Decide semantic role&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;h2 id="spoken-language-understanding">Spoken Language Understanding&lt;/h2>
&lt;p>&lt;strong>Natural language processing for spoken input&lt;/strong>&lt;/p>
&lt;h3 id="difficulties">Difficulties&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Less grammatical speech&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Partial Sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Disfluencies (Self correction, hesitations, repetitions)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Must be robust to noise&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>ASR errors&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Techniques:&lt;/p>
&lt;ul>
&lt;li>Confidence&lt;/li>
&lt;li>Multiple hypothesis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>No structural information&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Punctuation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Text segmentation&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="approach">Approach&lt;/h3>
&lt;ul>
&lt;li>Transform text into task-specific semantic representation of the user’s intention&lt;/li>
&lt;li>Subtasks
&lt;ul>
&lt;li>&lt;a href="#domain-detection">Domain detection&lt;/a>&lt;/li>
&lt;li>&lt;a href="#intention-determination">Intention determination&lt;/a>&lt;/li>
&lt;li>&lt;a href="#slot-filling">Slot filling&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="domain-detection">Domain Detection&lt;/h4>
&lt;ul>
&lt;li>Motivated by &lt;strong>Call Centers&lt;/strong>
&lt;ul>
&lt;li>Many agents with specialization on one topic (&lt;em>Billing inquiries, technical support requests, sales inquiries, etc.&lt;/em>)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>First techniques: Menus to find appropriate agent&lt;/li>
&lt;li>Automatic task:
&lt;ul>
&lt;li>Given the utterance, find the correct agent&lt;/li>
&lt;li>&lt;strong>Utterance classification task&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Input: Utterance&lt;/li>
&lt;li>Output: Topic&lt;/li>
&lt;/ul>
&lt;h4 id="intention-determination">&lt;strong>Intention determination&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>Domain-dependent utterance classes
&lt;ul>
&lt;li>e.g. Find_Flight&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Task: Assign class to Utterance&lt;/li>
&lt;li>Uses techniques similar to domain detection (utterance classification)&lt;/li>
&lt;/ul>
&lt;h4 id="slot-filling">Slot filling&lt;/h4>
&lt;p>&lt;strong>Sequence labeling task&lt;/strong>: Assign a semantic class label to every word, given the word and its history&lt;/p>
&lt;ul>
&lt;li>History: previous words and labels&lt;/li>
&lt;/ul>
&lt;p>Example:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2011.56.02.png" alt="截屏2020-09-19 11.56.02" style="zoom: 50%;" />
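&lt;p>To make the sequence-labeling view concrete, here is a minimal sketch that turns per-word labels into filled slots. It assumes the common BIO encoding (&lt;code>B-x&lt;/code> begins slot &lt;code>x&lt;/code>, &lt;code>I-x&lt;/code> continues it, &lt;code>O&lt;/code> is outside any slot); the sentence, the slot names and the helper &lt;code>slots_from_bio&lt;/code> are illustrative assumptions, not taken from the lecture.&lt;/p>

```python
def slots_from_bio(words, labels):
    """Group BIO-labelled words into (slot_name, text) pairs."""
    slots, current_name, current_words = [], None, []
    for word, label in zip(words, labels):
        if label.startswith("B-"):
            # A new slot starts: close the one that was open, if any.
            if current_name is not None:
                slots.append((current_name, " ".join(current_words)))
            current_name, current_words = label[2:], [word]
        elif label.startswith("I-") and current_name == label[2:]:
            current_words.append(word)
        else:
            # "O" (or an inconsistent I- tag) closes the open slot.
            if current_name is not None:
                slots.append((current_name, " ".join(current_words)))
            current_name, current_words = None, []
    if current_name is not None:
        slots.append((current_name, " ".join(current_words)))
    return slots


# Hypothetical flight query with hand-assigned labels.
words = ["show", "flights", "from", "Boston", "to", "New", "York", "today"]
labels = ["O", "O", "O", "B-dept", "O", "B-arr", "I-arr", "B-date"]
print(slots_from_bio(words, labels))
```

&lt;p>In a real system the labels would come from a trained tagger; only the decoding step is shown here.&lt;/p>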
&lt;p>Following the success of deep learning in other areas:&lt;/p>
&lt;ul>
&lt;li>RNN-based approach&lt;/li>
&lt;li>Find most probable label given word and history&lt;/li>
&lt;/ul></description></item><item><title>Dialog Management</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/10-dialog-management/</link><pubDate>Sun, 20 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/10-dialog-management/</guid><description>&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2011.48.55.png" alt="截屏2020-09-20 11.48.55">&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2011.55.10.png" alt="截屏2020-09-20 11.55.10" style="zoom:80%;" />
&lt;h2 id="dialog-modeling">Dialog Modeling&lt;/h2>
&lt;h3 id="dialog-manager">Dialog manager&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Manage flow of conversation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input: Semantic representation of the input&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Output: Semantic representation of the output&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Utilize additional knowledge&lt;/p>
&lt;ul>
&lt;li>
&lt;p>User information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dialog History&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Task-specific information&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="-challenges">🔴 Challenges&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Consisting of many different components&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Each component has errors&lt;/p>
&lt;/li>
&lt;li>
&lt;p>More components &amp;ndash;&amp;gt; less robust&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Should be modular&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Need to find unambiguous representation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hard to train from data&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="dialog-types">Dialog Types&lt;/h2>
&lt;h3 id="goal-oriented-dialog">Goal-oriented Dialog&lt;/h3>
&lt;ul>
&lt;li>Follows a fixed goal (or set of goals)
&lt;ul>
&lt;li>
&lt;p>Ticket vending machines&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Restaurant reservation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Car SDS&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Aim: Reach goal as fast as possible&lt;/li>
&lt;li>&lt;strong>Main focus of SDS research&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h3 id="social-dialog">Social Dialog&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Social Dialog / Conversational Bots / Chit-Chat Setting&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Most human-like&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Small talk conversation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Aims:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Generate interesting, coherent, meaningful responses&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Carry on the conversation as long as possible&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Be a companion&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="dialog-systems">Dialog Systems&lt;/h2>
&lt;h3 id="initiative">Initiative&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>System Initiative&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Command &amp;amp; control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example (U: User, S: System)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2012.03.55.png" alt="截屏2020-09-20 12.03.55" style="zoom:70%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Mixed Initiative&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Most natural&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2012.05.30.png" alt="截屏2020-09-20 12.05.30" style="zoom:70%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>User Initiative&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>The user has the most control&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Error-prone&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2012.06.52.png" alt="截屏2020-09-20 12.06.52">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="confirmation">Confirmation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Explicit verification&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2012.08.03.png" alt="截屏2020-09-20 12.08.03" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Implicit verification&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2012.08.25.png" alt="截屏2020-09-20 12.08.25" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Alternative verification&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2012.08.41.png" alt="截屏2020-09-20 12.08.41" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;h3 id="development">Development&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="#rule-based-systems">&lt;strong>Rule-based&lt;/strong>&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Build the dialog management from templates/rules&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#statistical-dm">&lt;strong>Statistical&lt;/strong>&lt;/a>&lt;/p>
&lt;ul>
&lt;li>Train model to predict answer given input&lt;/li>
&lt;li>POMDP&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#neural-dialog-models">&lt;strong>End-to-End Neural Models&lt;/strong>&lt;/a>&lt;/p>
&lt;ul>
&lt;li>No separation into NLU/DM/NLG&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="components">Components&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Dialog Model&lt;/strong>: contains information about&lt;/p>
&lt;ul>
&lt;li>whether system, user or mixed initiative?&lt;/li>
&lt;li>whether explicit or implicit confirmation?&lt;/li>
&lt;li>what kind of speech acts needed?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>User Model&lt;/strong>: contains the system’s beliefs about&lt;/p>
&lt;ul>
&lt;li>
&lt;p>what the user knows&lt;/p>
&lt;/li>
&lt;li>
&lt;p>the user’s expertise, experience and ability to understand the system’s utterances&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Knowledge Base&lt;/strong>: contains information about&lt;/p>
&lt;ul>
&lt;li>the world and the domain&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Discourse Context&lt;/strong>: contains information about&lt;/p>
&lt;ul>
&lt;li>the dialog history and the current discourse&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Reference Resolver&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>performs reference resolution and handles ellipsis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Plan Recognizer and Grounding Module&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>interprets the user’s utterance given the current context&lt;/li>
&lt;li>reasons about the user’s goals and beliefs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Domain Reasoner/Planner&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>generates plans to achieve the shared goals&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Discourse Manager&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>manages all information of dialog flow&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Error Handling&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>detects and recovers from errors or misunderstandings&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="rule-based-systems">Rule-based Systems&lt;/h2>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2013.42.02.png" alt="截屏2020-09-20 13.42.02">&lt;/p>
&lt;h3 id="finite-state-based">Finite State-based&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>💡 &lt;strong>Idea: Iterate through states that define actions&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dialog flow:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>specified as a set of dialog states (stages)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>transitions denoting various alternative paths through the dialog graph&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Nodes&lt;/strong> = dialogue states (prompts)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Arcs = actions based on the recognized response&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2012.57.29.png" alt="截屏2020-09-20 12.57.29">&lt;/p>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Simple to construct due to simple dialog control&lt;/li>
&lt;li>The required vocabulary and grammar for each state can be specified in advance
&lt;ul>
&lt;li>Results in more constrained ASR and SLU&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Disadvantages&lt;/p>
&lt;ul>
&lt;li>Restrict the user’s input to predetermined words/phrases&lt;/li>
&lt;li>Makes the correction of misrecognized items difficult&lt;/li>
&lt;li>Inhibits the user’s opportunity to take the initiative and ask questions or introduce new topics&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
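&lt;p>The node/arc view above can be sketched as a small lookup table. This is only an illustration: the two states, their prompts and the wildcard arc &lt;code>"*"&lt;/code> are assumptions made for the example, and the responses are taken to be already recognized and classified.&lt;/p>

```python
# Hypothetical dialog graph: node -> (prompt, {recognised response: next node}).
# The "*" arc accepts any response.
DIALOG_GRAPH = {
    "ask_destination": ("Where do you want to go?", {"*": "confirm"}),
    "confirm": ("Is that correct?", {"yes": "done", "no": "ask_destination"}),
}


def run_dialog(responses, start="ask_destination", end="done"):
    """Walk the graph, consuming one recognised response per visited state."""
    state, visited = start, []
    for response in responses:
        if state == end:
            break
        prompt, arcs = DIALOG_GRAPH[state]
        visited.append((state, prompt, response))
        # Follow the arc for this response, else the wildcard arc, else stay.
        state = arcs.get(response, arcs.get("*", state))
    return state, visited


final_state, trace = run_dialog(["Paris", "no", "Rome", "yes"])
print(final_state)
```

&lt;p>An unexpected answer such as &lt;code>"maybe"&lt;/code> in the &lt;code>confirm&lt;/code> state matches no arc, so the sketch stays in that state. This mirrors the restriction to predetermined inputs listed under the disadvantages.&lt;/p>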
&lt;h3 id="frame-based">Frame-based&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>💡 &lt;strong>Idea: Fill slots in a frame that defines the goal&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Dialog flow:&lt;/p>
&lt;ul>
&lt;li>is NOT predetermined, but depends on
&lt;ul>
&lt;li>
&lt;p>the contents of the user’s input&lt;/p>
&lt;/li>
&lt;li>
&lt;p>the information that the system has to elicit&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Eg1&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2013.12.50.png" alt="截屏2020-09-20 13.12.50" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Eg2&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2013.13.34.png" alt="截屏2020-09-20 13.13.34" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Slot(/Form/Template) filling&lt;/p>
&lt;ul>
&lt;li>
&lt;p>One slot per piece of information&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Takes a particular action based on the current state of affairs&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Questions and other prompts&lt;/p>
&lt;ul>
&lt;li>List of possibilities&lt;/li>
&lt;li>conditions that have to be true for that particular question or prompt&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>User can provide over-informative answers&lt;/li>
&lt;li>Allows more natural dialogues&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Disadvantages&lt;/p>
&lt;ul>
&lt;li>Cannot handle complex dialogues&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
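&lt;p>A minimal sketch of the slot-filling idea: the frame is a dictionary, every user turn may fill any number of slots, and the next prompt depends only on what is still missing. The slot names, prompts and helper functions are hypothetical, and the user input is assumed to be already parsed into slot/value pairs.&lt;/p>

```python
# Hypothetical frame for a flight query: slot name -> prompt used to elicit it.
FRAME_PROMPTS = {
    "origin": "Where do you want to depart from?",
    "destination": "Where do you want to go?",
    "date": "On which day?",
}


def update_frame(frame, parsed_input):
    """Copy every recognised slot value into the frame; unknown slots are ignored."""
    for slot, value in parsed_input.items():
        if slot in FRAME_PROMPTS:
            frame[slot] = value
    return frame


def next_prompt(frame):
    """Return the prompt for the first empty slot, or None when the frame is full."""
    for slot, prompt in FRAME_PROMPTS.items():
        if slot not in frame:
            return prompt
    return None


# An over-informative first answer fills two slots at once.
frame = update_frame({}, {"origin": "Berlin", "date": "Monday"})
print(next_prompt(frame))
```

&lt;p>Because the order of questions is derived from the frame contents rather than from a fixed graph, the dialog flow is not predetermined.&lt;/p>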
&lt;h3 id="agent-based">Agent-based&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>💡 Idea:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Communication viewed as interaction between two agents&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each capable of reasoning about its own actions and beliefs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>also about other’s actions and beliefs&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Use of “contexts”&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2013.20.28.png" alt="截屏2020-09-20 13.20.28" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Allow complex communication between the system, the user and the underlying application to solve some problem/task&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Many variants exist, depending on which aspects of intelligent behavior are included&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Tends to be mixed-initiative&lt;/p>
&lt;ul>
&lt;li>User can control the dialog, introduce new topics, or make contributions&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>Allow natural dialogue in complex domains&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👎 Disadvantages&lt;/p>
&lt;ul>
&lt;li>Such agents are usually very complex&lt;/li>
&lt;li>Hard to build &amp;#x1f622;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="limitations-of-rule-based-dm">Limitations of Rule-based DM&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Expensive to build: requires manual work&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Fragile to ASR errors&lt;/p>
&lt;/li>
&lt;li>
&lt;p>No self-improvement over time&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="statistical-dm">Statistical DM&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Motivation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>User intention can ONLY be imperfectly known&lt;/p>
&lt;ul>
&lt;li>Incompleteness – user may not specify full intention initially&lt;/li>
&lt;li>Noisiness – errors from ASR/SLU&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Automatic learning of dialog strategies&lt;/p>
&lt;ul>
&lt;li>Building rule-based systems is time-consuming&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>👍 Advantages&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Maintain a distribution over multiple hypotheses for the correct dialog state&lt;/p>
&lt;ul>
&lt;li>Not a single hypothesis for the dialog state&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Choose actions through an automatic optimization process&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Technology is not domain dependent&lt;/p>
&lt;ul>
&lt;li>the same technology can be applied to other domains by learning from new domain data&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="markov-decision-process-mdp">Markov Decision Process (MDP)&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>A model for sequential decision making problems&lt;/p>
&lt;ul>
&lt;li>Solved using &lt;strong>dynamic programming&lt;/strong> and &lt;strong>reinforcement learning&lt;/strong>&lt;/li>
&lt;li>MDP based SDM: dialog evolves as a &lt;strong>Markov process&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Specified by a tuple $(S, A, T, R)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$S$: a set of possible world states $s \in S$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$A$: a set of possible actions $a\in A$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$R$: a local real-valued reward function&lt;br>
&lt;/p>
$$
R: S \times A \to \mathbb{R}
$$
&lt;/li>
&lt;li>
&lt;p>$T$: a transition model
&lt;/p>
$$
T(s\_{t-1}, a\_{t-1}, s\_t) = P(s\_t | s\_{t-1}, a\_{t-1})
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🎯 Goal of MDP based SDM: Maximize its expected cumulative (discounted) reward
&lt;/p>
$$
E\left(\sum\_{t=0}^{\infty} \gamma^{t} R\left(s\_{t}, a\_{t}\right)\right)
$$
&lt;/li>
&lt;li>
&lt;p>Requires complete knowledge of the current state $s \in S$ (the state must be fully observable)&lt;/p>
&lt;/li>
&lt;/ul>
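&lt;p>As a small worked illustration of the objective above, the following sketch computes the discounted return $\sum\_t \gamma^t r\_t$ for one finite dialog. The reward values (a penalty per turn and a bonus on success) and the function name &lt;code>discounted_return&lt;/code> are assumptions made for this example.&lt;/p>

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum_t gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    # Iterate backwards so each step is a single multiply-add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical dialog: -1 per turn, +20 when the goal is reached in turn 4.
rewards = [-1.0, -1.0, -1.0, 20.0]
print(discounted_return(rewards, gamma=0.9))
```

&lt;p>With $\gamma$ below 1, the same final bonus is worth less the later it arrives, which is one way to encode "reach the goal as fast as possible".&lt;/p>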
&lt;h3 id="reinforcement-learning">Reinforcement Learning&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>“Learning through trial-and-error” (reward/penalty)&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem&lt;/p>
&lt;ul>
&lt;li>
&lt;p>No direct feedback&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Only feedback at the end of dialog&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🎯 Goal: Learn evaluation function from feedback&lt;/p>
&lt;/li>
&lt;li>
&lt;p>💡 Idea&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initially, all operations have equal probability&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If the dialog was successful &amp;ndash;&amp;gt; all its operations are rated positively&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If the dialog was unsuccessful &amp;ndash;&amp;gt; all its operations are rated negatively&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="how-rl-works">How does RL work?&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>There is an &lt;strong>agent&lt;/strong> with the capacity to &lt;strong>act&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Each &lt;strong>action&lt;/strong> influences the agent’s future &lt;strong>state&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Success is measured by a scalar &lt;strong>reward&lt;/strong> signal&lt;/p>
&lt;/li>
&lt;li>
&lt;p>In a nutshell:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Select actions to maximize future reward&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Ideally, a single agent could learn to solve any task &amp;#x1f4aa;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="sequential-decision-making">Sequential Decision Making&lt;/h4>
&lt;ul>
&lt;li>🎯 &lt;strong>Goal: select actions to maximize total future reward&lt;/strong>&lt;/li>
&lt;li>Actions may have long term consequences&lt;/li>
&lt;li>Reward may be delayed&lt;/li>
&lt;li>It may be better to sacrifice immediate reward to gain more long-term reward 🤔&lt;/li>
&lt;/ul>
&lt;h4 id="agent-and-environment">Agent and Environment&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2015.50.33.png" alt="截屏2020-09-20 15.50.33" style="zoom:50%;" />
&lt;p>At each step $t$&lt;/p>
&lt;ul>
&lt;li>Agent:
&lt;ul>
&lt;li>Receives state $s\_t$&lt;/li>
&lt;li>Receives scalar reward $r\_t$&lt;/li>
&lt;li>Executes action $a\_t$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The environment:
&lt;ul>
&lt;li>Receives action $a\_t$&lt;/li>
&lt;li>Emits state $s\_t$&lt;/li>
&lt;li>Emits scalar reward $r\_t$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>The evolution of this process is called a &lt;strong>Markov Decision Process (MDP)&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h4 id="supervised-learning-vs-reinforcement-learning">Supervised Learning vs. Reinforcement Learning&lt;/h4>
&lt;p>&lt;strong>Supervised Learning&lt;/strong>:&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2016.04.21.png" alt="截屏2020-09-20 16.04.21" style="zoom:100%;" />
&lt;ul>
&lt;li>Label is given: we can compute gradients given label and update our parameters&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Reinforcement Learning&lt;/strong>&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2016.05.11.png" alt="截屏2020-09-20 16.05.11">&lt;/p>
&lt;ul>
&lt;li>NO label given: instead we have feedback from the environment&lt;/li>
&lt;li>Not an absolute label / error. We can compute gradients, but do not yet know if our action choice is good. 🤪&lt;/li>
&lt;/ul>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">See also: &lt;a href="http://karpathy.github.io/2016/05/31/rl/">Deep Reinforcement Learning: Pong from Pixels&lt;/a>&lt;/span>
&lt;/div>
&lt;h4 id="policy-and-value-functions">Policy and Value Functions&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Policy $\pi$&lt;/strong> : a mapping from states to actions (in the stochastic case, a probability distribution over actions given a state)
&lt;/p>
$$
a = \pi(s)
$$
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Value function $Q^\pi(s, a)$&lt;/strong> : the expected total discounted reward from state $s$ and action $a$ under policy $\pi$
&lt;/p>
$$
Q^{\pi}(s, a)=\mathbb{E}\left[r\_{t+1}+\gamma r\_{t+2}+\gamma^{2} r\_{t+3}+\cdots \mid s, a\right]
$$
&lt;ul>
&lt;li>“How good is action $a$ in state $s$?”
&lt;ul>
&lt;li>Same reward for two actions, but different consequences down the road&lt;/li>
&lt;li>Want to update our value function accordingly&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="appoaches-to-rl">Approaches to RL&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Policy-based RL&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Search directly for the &lt;strong>optimal policy $\pi^\*$&lt;/strong>&lt;/p>
&lt;p>(policy achieving maximum future reward)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Value-based RL&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Estimate the &lt;strong>optimal value function $Q^{∗}(s,a)$&lt;/strong>
(maximum value achievable under any policy)&lt;/li>
&lt;li>&lt;strong>Q-Learning&lt;/strong>: Learn Q-Function that approximates $Q^{∗}(s,a)$
&lt;ul>
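&lt;li>Maximum expected future reward when taking action $a$ in state $s$&lt;/li>
&lt;li>Policy: Select action with maximal $Q$ value&lt;/li>
&lt;li>Algorithm:
&lt;ul>
&lt;li>Initialize $Q$ randomly&lt;/li>
&lt;li>After taking action $a\_t$ in state $s\_t$ and observing reward $r\_{t+1}$ and next state $s\_{t+1}$, update with learning rate $\alpha$: $Q(s\_t, a\_t) \leftarrow(1-\alpha) Q(s\_t, a\_t)+\alpha\left(r\_{t+1}+\gamma \cdot \underset{a'}{\max} Q\left(s\_{t+1}, a'\right)\right)$&lt;/li>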
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
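&lt;p>The Q-learning rule above can be sketched in tabular form as follows. This is a toy illustration under stated assumptions: the state and action names and the rewards are hypothetical, $Q$ is initialized to zero rather than randomly, and actions are explored uniformly at random.&lt;/p>

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * target
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Policy: pick the action with the maximal Q value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy task: "confirm" ends the dialog successfully (+10),
# "ask_again" costs one turn (-1) and leaves the state unchanged.
ACTIONS = ["ask_again", "confirm"]
q = defaultdict(float)  # assumption: zero initialisation instead of random
rng = random.Random(0)
for _ in range(200):
    action = rng.choice(ACTIONS)  # pure random exploration
    if action == "confirm":
        q_update(q, "slot_filled", action, 10.0, "end", ACTIONS)
    else:
        q_update(q, "slot_filled", action, -1.0, "slot_filled", ACTIONS)

print(greedy_action(q, "slot_filled", ACTIONS))
```

&lt;p>The terminal state &lt;code>"end"&lt;/code> is never updated, so its values stay at zero and the target for &lt;code>confirm&lt;/code> is just the immediate reward.&lt;/p>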
&lt;h2 id="goal-oriented-dialogs-statistical-pomdp">Goal-oriented Dialogs: Statistical POMDP&lt;/h2>
&lt;h3 id="pomdp--partially-observable-markov-decision-process">&lt;strong>POMDP : Partially Observable Markov Decision Process&lt;/strong>&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>MDP &amp;ndash;&amp;gt; POMDP: the state $s$ cannot be observed directly&lt;/p>
&lt;ul>
&lt;li>
&lt;p>POMDP based SDM &amp;ndash;&amp;gt; reinforcement learning + belief state tracking&lt;/p>
&lt;ul>
&lt;li>
&lt;p>dialog evolves as a Markov process $P(s\_t | s\_{t-1}, a\_{t-1})$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$s\_t$ is NOT directly observable&lt;/p>
&lt;p>&amp;ndash;&amp;gt; belief state $b(s\_t)$: probability distribution over all states&lt;/p>
&lt;/li>
&lt;li>
&lt;p>SLU outputs a noisy observation $o\_t$ of the user input with probability $P(o\_t|s\_t)$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Specified by tuple $(S, A, T, R, O, Z)$&lt;/p>
&lt;ul>
&lt;li>
&lt;p>$S, A, T, R$ constitute an MDP&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$O$: a finite set of observations received from the environment&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$Z$: the observation function s.t.
&lt;/p>
$$
Z(o\_t,s\_t,a\_{t-1}) = P(o\_t|s\_t,a\_{t-1})
$$
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Local reward&lt;/strong> is the expected reward $\rho$ over belief states
&lt;/p>
$$
\rho(b, a)=\sum\_{s \in S} R(s, a) \cdot b(s)
$$
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Goal&lt;/strong>: maximize the expected cumulative reward.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Operation&lt;/strong> (at each time step)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2017.07.48.png" alt="截屏2020-09-20 17.07.48" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>World is in unobserved state $s\_t$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Maintain distribution over all possible states with $b\_t$
&lt;/p>
$$
b\_t(s\_t) = \text{Probability of being in state } s\_t
$$
&lt;/li>
&lt;li>
&lt;p>DM selects action $a\_t$ based on $b\_t$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Receive reward $r\_t$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transition to unobserved state $s\_{t+1}$ ONLY depending on $s\_t$ and $a\_t$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Receive observation $o\_{t+1}$ ONLY depending on $a\_t$ and $s\_{t+1}$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Update of belief state ($\eta$ is a normalization constant)
&lt;/p>
$$
b\_{t+1}\left(s\_{t+1}\right)=\eta P\left(o\_{t+1} \mid s\_{t+1}, a\_{t}\right) \sum\_{s\_{t}} P\left(s\_{t+1} \mid s\_{t}, a\_{t}\right) b\_{t}\left(s\_{t}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Policy $\pi$:
&lt;/p>
$$
\pi(b) \in A
$$
&lt;/li>
&lt;li>
&lt;p>Value function:
&lt;/p>
$$
V^{\pi}\left(b\_{t}\right)=\mathbb{E}\left[r\_{t}+\gamma r\_{t+1}+\gamma^{2} r\_{t+2}+\ldots\right]
$$
&lt;/li>
&lt;/ul>
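&lt;p>The belief update and the expected reward $\rho(b, a)$ from the list above can be sketched for a tiny discrete model. The two user goals, the single system action and all probabilities below are invented for illustration; the dictionaries are just one convenient way to store $T$, $Z$ and $R$.&lt;/p>

```python
def belief_update(belief, action, observation, transition, observation_model):
    """One POMDP belief update.

    belief:            dict state -> probability, b_t(s_t)
    transition:        dict (s, a, s_next) -> P(s_next | s, a)
    observation_model: dict (o, s_next, a) -> P(o | s_next, a)
    """
    unnormalised = {}
    for s_next in belief:
        predicted = sum(
            transition.get((s, action, s_next), 0.0) * p for s, p in belief.items()
        )
        likelihood = observation_model.get((observation, s_next, action), 0.0)
        unnormalised[s_next] = likelihood * predicted
    norm = sum(unnormalised.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    # eta = 1 / norm makes the new belief sum to one.
    return {s: value / norm for s, value in unnormalised.items()}


def expected_reward(belief, action, reward):
    """rho(b, a) = sum_s R(s, a) * b(s)."""
    return sum(reward.get((s, action), 0.0) * p for s, p in belief.items())


# Hypothetical example: the user goal never changes, and the recogniser
# hears "italian" with probability 0.8 or 0.3 depending on the true goal.
STATES = ["wants_italian", "wants_chinese"]
transition = {(s, "ask_cuisine", s): 1.0 for s in STATES}
observation_model = {
    ("italian", "wants_italian", "ask_cuisine"): 0.8,
    ("chinese", "wants_italian", "ask_cuisine"): 0.2,
    ("italian", "wants_chinese", "ask_cuisine"): 0.3,
    ("chinese", "wants_chinese", "ask_cuisine"): 0.7,
}
belief = {s: 0.5 for s in STATES}
new_belief = belief_update(belief, "ask_cuisine", "italian", transition, observation_model)
print(new_belief)
```

&lt;p>A single noisy observation shifts probability mass toward one goal without discarding the other, which is what lets the dialog manager recover from recognition errors later.&lt;/p>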
&lt;h3 id="pomdp-model">POMDP model&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2023.07.52.png" alt="截屏2020-09-20 23.07.52">&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Two stochastic models&lt;/p>
&lt;ul>
&lt;li>Dialogue model $M$
&lt;ul>
&lt;li>Transition and observation probability model&lt;/li>
&lt;li>In what state is the dialogue at the moment&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Policy Model $\mathcal{P}$
&lt;ul>
&lt;li>What is the best next action&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Both models are optimized jointly&lt;/p>
&lt;ul>
&lt;li>Maximize the expected accumulated sum of rewards
&lt;ul>
&lt;li>Online: Interaction with user&lt;/li>
&lt;li>Offline: Training with corpus&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Key ideas&lt;/p>
&lt;ul>
&lt;li>Belief tracking
&lt;ul>
&lt;li>
&lt;p>Represent uncertainty&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Pursuing all possible dialogue paths in parallel&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Reinforcement learning
&lt;ul>
&lt;li>Use machine learning to learn parameters&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Challenges&lt;/p>
&lt;ul>
&lt;li>Belief tracking&lt;/li>
&lt;li>Policy learning&lt;/li>
&lt;li>User simulation&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="belief-state">Belief state&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2023.21.04.png" alt="截屏2020-09-20 23.21.04" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Information encoded in the state
&lt;/p>
$$
\begin{aligned}
b\_{t+1}\left(g\_{t+1}, u\_{t+1}, h\_{t+1}\right)=&amp; \eta P\left(o\_{t+1} \mid u\_{t+1}\right) \\\\
\cdot &amp; P\left(u\_{t+1} \mid g\_{t+1}, a\_{t}\right) \\\\
\cdot &amp; \sum_{g\_{t}} P\left(g\_{t+1} \mid g\_{t}, a\_{t}\right) \\\\
\cdot &amp; \sum_{h\_{t}} P\left(h\_{t+1} \mid g\_{t+1}, u\_{t+1}, h\_{t}, a\_{t}\right) \\\\
\cdot &amp; b\_{t}\left(g\_{t}, h\_{t}\right)
\end{aligned}
$$
&lt;ul>
&lt;li>&lt;strong>User goal $g\_t$&lt;/strong>: Information from the user necessary to fulfill the task&lt;/li>
&lt;li>&lt;strong>User utterance $u\_t$&lt;/strong>
&lt;ul>
&lt;li>What was said&lt;/li>
&lt;li>Not what was recognized&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Dialogue history $h\_t$&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Using independence assumptions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Observation model: Probability of observation $o$ given $u$&lt;/p>
&lt;ul>
&lt;li>Reflect speech understanding errors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>User model: Probability of the utterance given previous output and new state&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Goal transition model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>History model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Model still too complex 🤪&lt;/p>
&lt;ul>
&lt;li>Solution
&lt;ul>
&lt;li>n-best approach&lt;/li>
&lt;li>Factored approach&lt;/li>
&lt;li>Combination is possible&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
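&lt;p>The update above can be sketched for a small discrete state space. This is a minimal sketch, assuming that goal, utterance and history are collapsed into one state variable, which is a simplification of the factored formula. The function name &lt;code>update_belief&lt;/code>, the tables &lt;code>transition&lt;/code> and &lt;code>observation&lt;/code>, and all numbers are hypothetical.&lt;/p>

```python
def update_belief(belief, action, obs, transition, observation):
    """One step of b'(s') = eta * P(o | s') * sum_s P(s' | s, a) * b(s)."""
    unnormalized = {}
    for next_state in belief:
        # Prediction: sum over all previous states, weighted by the old belief.
        predicted = sum(
            transition[(state, action)].get(next_state, 0.0) * prob
            for state, prob in belief.items()
        )
        # Correction: weight by the likelihood of the observation.
        unnormalized[next_state] = observation[next_state].get(obs, 0.0) * predicted
    total = sum(unnormalized.values())
    # eta is the normalisation constant 1 / total.
    return {state: value / total for state, value in unnormalized.items()}


# Toy example with made-up numbers: does the user want Italian or Chinese food?
belief = {"italian": 0.5, "chinese": 0.5}
transition = {
    ("italian", "ask_cuisine"): {"italian": 0.9, "chinese": 0.1},
    ("chinese", "ask_cuisine"): {"italian": 0.1, "chinese": 0.9},
}
observation = {
    "italian": {"heard_italian": 0.8, "heard_chinese": 0.2},
    "chinese": {"heard_italian": 0.3, "heard_chinese": 0.7},
}
new_belief = update_belief(belief, "ask_cuisine", "heard_italian", transition, observation)
```

&lt;p>The belief keeps probability mass on both hypotheses instead of committing to the recognized word, which is the point of belief tracking.&lt;/p>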
&lt;h3 id="policy">Policy&lt;/h3>
&lt;ul>
&lt;li>Mapping between belief states and system actions&lt;/li>
&lt;li>🎯 &lt;strong>Goal&lt;/strong>: Find optimal policy π’&lt;/li>
&lt;li>&lt;strong>Problem&lt;/strong>: State and action space very large&lt;/li>
&lt;li>But:
&lt;ul>
&lt;li>Only a small part of the belief space is actually visited&lt;/li>
&lt;li>The plausible actions at every point are very restricted&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Summary space: Simplified representation&lt;/li>
&lt;/ul>
&lt;h3 id="-disadvantages">🔴 Disadvantages&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The structure of the dialog states must be predefined&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Location&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Price range&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Type of cuisine&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Limited to very narrow domain&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Cannot encode all features/slots that might be useful&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="neural-dialog-models">Neural Dialog Models&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>End-to-End training&lt;/p>
&lt;ul>
&lt;li>Optimize all parameters jointly&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Continuous representations&lt;/p>
&lt;ul>
&lt;li>No early decision&lt;/li>
&lt;li>No propagation of errors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Challenges&lt;/p>
&lt;ul>
&lt;li>Representation of history/context&lt;/li>
&lt;li>Policy learning
&lt;ul>
&lt;li>Interactive learning&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Integration of knowledge sources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="datasets">Datasets&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Goal oriented&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>bAbI task&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Synthetic data – created by templates&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>DSTC&lt;/strong> (Dialog State tracking challenge)&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Restaurant reservation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Collected using 3 dialog managers&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Annotated with dialog states&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Social dialog&lt;/p>
&lt;ul>
&lt;li>Learn from human-human communication&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="architecture">Architecture&lt;/h3>
&lt;h4 id="memory-networks">Memory Networks&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2017.43.00.png" alt="截屏2020-09-20 17.43.00" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Neural network model&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Writing and reading from a memory component&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Store dialog history&lt;/p>
&lt;ul>
&lt;li>Learn to focus on important parts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="sequence-to-sequence-models-encoder-decoder">Sequence-to-Sequence Models: Encoder-Decoder&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.50.42.png" alt="截屏2020-09-20 22.50.42" style="zoom: 67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Encoder&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Read in Input&lt;/li>
&lt;li>Represent content in a fixed-dimension hidden vector&lt;/li>
&lt;li>LSTM-based model&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Decoder&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Generate Output&lt;/li>
&lt;li>Use the fixed-dimension vector as input&lt;/li>
&lt;li>LSTM-based model&lt;/li>
&lt;li>&lt;code>EOS&lt;/code> symbol to start outputting&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="example">Example&lt;/h4>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.52.57.png" alt="截屏2020-09-20 22.52.57" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>Recurrent-based Encoder-Decoder Architecture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Trained end-to-end.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Encoder&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.54.14.png" alt="截屏2020-09-20 22.54.14" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.54.27.png" alt="截屏2020-09-20 22.54.27" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.54.47.png" alt="截屏2020-09-20 22.54.47" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.55.02.png" alt="截屏2020-09-20 22.55.02" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Decoder&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.55.31.png" alt="截屏2020-09-20 22.55.31" style="zoom:67%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.55.54.png" alt="截屏2020-09-20 22.55.54" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;h4 id="dedicated-dialog-architecture">Dedicated Dialog Architecture&lt;/h4>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2022.57.55.png" alt="截屏2020-09-20 22.57.55">&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-20%2022.58.59.png" alt="截屏2020-09-20 22.58.59" style="zoom:67%;" />
&lt;h3 id="training">Training&lt;/h3>
&lt;h4 id="supervised-learning">Supervised learning&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Supervised: Learning from corpus&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Algorithm:&lt;/p>
&lt;ul>
&lt;li>Input user utterance&lt;/li>
&lt;li>Calculate system output&lt;/li>
&lt;li>Measure error&lt;/li>
&lt;li>Backpropagate the error&lt;/li>
&lt;li>Update weights&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Problem:&lt;/p>
&lt;ul>
&lt;li>Errors lead to a different dialogue state&lt;/li>
&lt;li>Compounding errors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="imitation-learning">Imitation learning&lt;/h4>
&lt;ul>
&lt;li>Imitation learning
&lt;ul>
&lt;li>
&lt;p>Interactive learning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Correct mistakes and demonstrate expected actions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Algorithm: same as supervised learning&lt;/li>
&lt;li>Problem: costly&lt;/li>
&lt;/ul>
&lt;h4 id="deep-reinforcement-learning">Deep reinforcement learning&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Reinforcement learning&lt;/p>
&lt;ul>
&lt;li>Interactive learning&lt;/li>
&lt;li>Feedback only at end of the dialogue
&lt;ul>
&lt;li>
&lt;p>Successful/failed task&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Additional reward for fewer steps &amp;#x1f44f;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Challenge:&lt;/p>
&lt;ul>
&lt;li>Sampling of different actions&lt;/li>
&lt;li>Huge action space&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Natural Language Generation</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/09-natural-language-generation/</link><pubDate>Sat, 19 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/09-natural-language-generation/</guid><description>&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-19%2012.16.52.png" alt="截屏2020-09-19 12.16.52">&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2012.17.17.png" alt="截屏2020-09-19 12.17.17" style="zoom:40%;" />
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>🎯 &lt;strong>Goal: generate natural language from semantic representation (or other data)&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2012.18.56.png" alt="截屏2020-09-19 12.18.56" style="zoom:50%;" />
&lt;h3 id="examples">Examples&lt;/h3>
&lt;h4 id="pollen-forecast">Pollen Forecast&lt;/h4>
&lt;p>Pollen Forecast for Scotland&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Taking six numbers as input, a simple NLG system generates a short textual summary of pollen levels&lt;/p>
&lt;p>&lt;em>“Grass pollen levels for Friday have increased from the moderate to high levels of yesterday with values of around 6 to 7 across most parts of the country. However, in Northern areas, pollen levels will be moderate with values of 4.”&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The actual forecast (written by a human meteorologist) from the data&lt;/p>
&lt;p>&lt;em>“Pollen counts are expected to remain high at level 6 over most of Scotland, and even level 7 in the south east. The only relief is in the Northern Isles and far northeast of mainland Scotland with medium levels of pollen count.”&lt;/em>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="weather-forecast">Weather Forecast&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Function: Produces textual weather reports in English and French&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Input: Numerical weather simulation data annotated by human forecaster&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="difficultieschallenges">Difficulties/Challenges&lt;/h2>
&lt;p>&lt;strong>Making choices&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Content to be included/omitted&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Organization of content into coherent structure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Style (formality, opinion, genre, personality&amp;hellip;)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Packaging into sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Syntactic constructions&lt;/p>
&lt;/li>
&lt;li>
&lt;p>How to refer to entities (referring expression generation)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>What words to use (lexical choice)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="rule-based-methods">Rule-based methods&lt;/h2>
&lt;p>Six basic activities in NLG:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;a href="#content-selection">&lt;strong>&lt;span style="color:LightCoral">Content determination&lt;/span>&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Deciding what information to mention in the text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>&lt;span style="color:LightCoral">Discourse planning&lt;/span>&lt;/strong>&lt;/p>
&lt;p>Imposing ordering and structure over the information to convey&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#aggregation">&lt;strong>&lt;span style="color:OliveDrab">Sentence aggregation&lt;/span>&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Merging of similar sentences to improve readability and naturalness&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#lexicalization">&lt;strong>&lt;span style="color:OliveDrab">Lexicalization&lt;/span>&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Deciding the specific words and phrases to express the concepts and relations&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#generating-referring-expressions-gre">&lt;strong>&lt;span style="color:OliveDrab">Referring expression generation&lt;/span>&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Selecting words or phrases to identify domain entities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="#realization">&lt;strong>&lt;span style="color:CornflowerBlue">Linguistic realization&lt;/span>&lt;/strong>&lt;/a>&lt;/p>
&lt;p>Creating the actual text, which is correct according to the grammar rules of syntax, morphology and orthography&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>Three-stage pipelined architecture:&lt;/p>
&lt;ul>
&lt;li>&lt;span style="color:LightCoral">Text planning&lt;/span> (Act 1 and 2)&lt;/li>
&lt;li>&lt;span style="color:OliveDrab">Sentence planning&lt;/span> (Act 3, 4, and 5)&lt;/li>
&lt;li>&lt;span style="color:CornflowerBlue">Linguistic realization&lt;/span> (Act 6)&lt;/li>
&lt;/ul>
&lt;p>Intermediate representations: &lt;strong>Text plans&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Represented as trees whose leaf nodes specify individual messages and internal nodes show how messages are conceptually grouped&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Sentence plans&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Template representation, possibly with some linguistic processing → Represent sentences as boilerplate text and parameters that need to be inserted into the boilerplate text&lt;/li>
&lt;li>abstract sentential representation → Specify the content words (nouns, verbs, adjectives and adverbs) of a sentence, and how they are related&lt;/li>
&lt;/ul>
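&lt;p>A template-style sentence plan can be illustrated with ordinary string formatting. This is a minimal sketch: the boilerplate wording, the slot names and the helper &lt;code>realize&lt;/code> are made up for illustration.&lt;/p>

```python
# Boilerplate text with slots; the wording is hypothetical.
TEMPLATE = "The month was {comparison} than average, with {days} rainy days."


def realize(template, **params):
    """Insert the parameters of a sentence plan into its boilerplate text."""
    return template.format(**params)


sentence = realize(TEMPLATE, comparison="drier", days=4)
print(sentence)
```

&lt;p>An abstract sentential representation would instead store only the content words and their relations and leave word order and inflection to the realizer.&lt;/p>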
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2013.35.01.png" alt="截屏2020-09-19 13.35.01" style="zoom:40%;" />
&lt;h3 id="textdocument-planner">Text/Document planner&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Determine&lt;/p>
&lt;ul>
&lt;li>what information to communicate&lt;/li>
&lt;li>how to structure information into a coherent text&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Common Approaches:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>methods based on observations about common text structures (Schemas)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>methods based on reasoning about the purpose of the text and discourse coherence (Rhetorical Structure Theory, planning)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="content-selection">Content Selection&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Text is sequence of &lt;strong>MESSAGES&lt;/strong>, predefined data structures:&lt;/p>
&lt;ul>
&lt;li>correspond to informational units in the text&lt;/li>
&lt;li>collect together underlying data in ways that are convenient for linguistic expression&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>How to devise MESSAGE types?&lt;/p>
&lt;ul>
&lt;li>&lt;a href="#rhetorical-predicates">Rhetorical predicates&lt;/a>: generalizations made by linguists&lt;/li>
&lt;li>&lt;a href="#corpus-based-content-selection">From corpus analysis, identify agglomerations of informational elements&lt;/a>
&lt;ul>
&lt;li>Application dependent&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="rhetorical-predicates">Rhetorical predicates&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Attribute&lt;/strong>&lt;/p>
&lt;p>E.g. &lt;em>Mary has a pink coat.&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Equivalence&lt;/strong>&lt;/p>
&lt;p>E.g. &lt;em>Wines described as ‘great’ are fine wines from an especially good village.&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Specification&lt;/strong>&lt;/p>
&lt;p>E.g. &lt;em>[The machine is heavy.] It weighs 2 tons.&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Constituency&lt;/strong>&lt;/p>
&lt;p>E.g. &lt;em>[This is an octopus.] There is his eye, these are his legs, and he has these suction cups.&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Evidence&lt;/strong>&lt;/p>
&lt;p>E.g. &lt;em>[The audience recognized the difference.] They started laughing right from the very first frames of that film.&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&amp;hellip;&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="corpus-based-content-selection">Corpus-based content selection&lt;/h4>
&lt;p>(Take weather forecast as example)&lt;/p>
&lt;ul>
&lt;li>Routine messages: always included
&lt;ul>
&lt;li>E.g.
&lt;ul>
&lt;li>&lt;code>MonthlyRainFallMsg&lt;/code>&lt;/li>
&lt;li>&lt;code>MonthlyTemperatureMsg&lt;/code>&lt;/li>
&lt;li>&lt;code>RainSoFarMsg&lt;/code>&lt;/li>
&lt;li>&lt;code>MonthlyRainyDaysMsg&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Significant Event messages: Only constructed if the data warrants it
&lt;ul>
&lt;li>E.g. if rain occurs on more than a specified number of days in a row
&lt;ul>
&lt;li>&lt;code>RainEventMsg&lt;/code>&lt;/li>
&lt;li>&lt;code>RainSpellMsg&lt;/code>&lt;/li>
&lt;li>&lt;code>TemperatureEventMsg&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
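&lt;p>The distinction between routine and significant-event messages can be sketched as follows. The message names come from the list above; the function &lt;code>select_messages&lt;/code>, the payload fields and the threshold &lt;code>min_spell_days&lt;/code> are assumptions made for this example.&lt;/p>

```python
def select_messages(daily_rain_mm, min_spell_days=3):
    """Return (message type, payload) pairs for one month of daily rainfall."""
    rainy = [amount > 0 for amount in daily_rain_mm]
    # Routine messages are always included.
    messages = [
        ("MonthlyRainFallMsg", {"total_mm": sum(daily_rain_mm)}),
        ("MonthlyRainyDaysMsg", {"days": sum(rainy)}),
    ]
    # Significant-event message: only built when a long enough rain spell exists.
    spell_start, spell_length = None, 0
    # The trailing False closes a spell that runs until the last day.
    for day, is_rainy in enumerate(rainy + [False], start=1):
        if is_rainy:
            if spell_length == 0:
                spell_start = day
            spell_length += 1
        else:
            if spell_length >= min_spell_days:
                messages.append(
                    ("RainSpellMsg", {"start": spell_start, "days": spell_length})
                )
            spell_length = 0
    return messages


messages = select_messages([0, 5, 12, 7, 0, 0, 3, 0])
```

&lt;p>Here days 2 to 4 form a rain spell of three days, so a &lt;code>RainSpellMsg&lt;/code> is added; the single rainy day 7 does not trigger one.&lt;/p>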
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2014.19.06.png" alt="截屏2020-09-19 14.19.06" style="zoom:67%;" />
&lt;p>Define Schemas&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2015.33.52.png" alt="截屏2020-09-19 15.33.52" style="zoom: 67%;" />
&lt;p>Produces a text/document plan&lt;/p>
&lt;ul>
&lt;li>a tree structure populated by messages at its leaf nodes&lt;/li>
&lt;/ul>
&lt;h3 id="aggregation">Aggregation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Deciding how messages should be composed together to produce specifications for sentences or other linguistic units&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On the basis of&lt;/p>
&lt;ul>
&lt;li>Information content&lt;/li>
&lt;li>Possible forms of realization&lt;/li>
&lt;li>Semantics&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Some possibilities:&lt;/p>
&lt;ul>
&lt;li>Simple conjunction&lt;/li>
&lt;li>Ellipsis&lt;/li>
&lt;li>Embedding&lt;/li>
&lt;li>Set introduction&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Without aggregation:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-txt" data-lang="txt">&lt;span class="line">&lt;span class="cl">Heavy rain fell on the 27th.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Heavy rain fell on the 28th.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>Aggregation via simple conjunction:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-txt" data-lang="txt">&lt;span class="line">&lt;span class="cl">Heavy rain fell on the 27th and heavy rain fell on the 28th.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>Aggregation via ellipsis:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-txt" data-lang="txt">&lt;span class="line">&lt;span class="cl">Heavy rain fell on the 27th and [] on the 28th.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>Aggregation via set introduction:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-txt" data-lang="txt">&lt;span class="line">&lt;span class="cl">Heavy rain fell on the 27th and 28th.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
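&lt;p>Aggregation via set introduction can be sketched in a few lines. This is only an illustration for messages that share the same event and differ in the day; the helpers &lt;code>ordinal&lt;/code> and &lt;code>aggregate&lt;/code> are hypothetical.&lt;/p>

```python
def ordinal(day):
    """Render a day of the month as an ordinal, e.g. 27 -> '27th'."""
    if day % 100 in (11, 12, 13):
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def aggregate(event, days):
    """Aggregate messages that share an event by introducing a set of days."""
    names = [ordinal(day) for day in sorted(days)]
    if len(names) == 1:
        day_text = names[0]
    else:
        day_text = ", ".join(names[:-1]) + " and " + names[-1]
    return f"{event} on the {day_text}."


print(aggregate("Heavy rain fell", [27, 28]))
```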
&lt;h3 id="lexicalization">Lexicalization&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Choose words and syntactic structures to express content selected&lt;/strong>&lt;/li>
&lt;li>If several lexicalizations are possible, consider:
&lt;ul>
&lt;li>
&lt;p>user knowledge and preferences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>consistency with previous usage&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Pragmatics: emphasis, level of formality, personality, &amp;hellip;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>interaction with other aspects of micro planning&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Example
&lt;ul>
&lt;li>
&lt;p>S: &lt;em>rainfall was very poor&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>NP: &lt;em>a much worse than average rainfall&lt;/em>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>ADJP: &lt;em>much drier than average&lt;/em>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="generating-referring-expressions-gre">Generating Referring Expressions (GRE)&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Identify specific domain objects and entities&lt;/p>
&lt;/li>
&lt;li>
&lt;p>GRE produces description of object or event that allows hearer to distinguish it from distractors&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Issues&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initial introduction of an object&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Subsequent references to an already salient object&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Referring to months:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>June 1999&lt;/p>
&lt;/li>
&lt;li>
&lt;p>June&lt;/p>
&lt;/li>
&lt;li>
&lt;p>the month&lt;/p>
&lt;/li>
&lt;li>
&lt;p>next June&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Referring to temporal intervals&lt;/p>
&lt;ul>
&lt;li>8 days starting from the 11th&lt;/li>
&lt;li>From the 11th to the 18th&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>(Relatively simple, so can be hardcoded in document planning)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="realization">Realization&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>🎯 Goal: to convert text specifications into actual text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Purpose: hide the peculiarities of the target language from the rest of the NLG system&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-19%2017.14.42.png" alt="截屏2020-09-19 17.14.42">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="evaluation">Evaluation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Task-based (extrinsic) evaluation&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>how well the generated text helps to perform a task&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Human ratings&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>quality and usefulness of the text&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Metrics&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>e.g. BLEU (Bilingual Evaluation Understudy)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Quality is considered to be the correspondence between machine’s output and that of a human&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
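&lt;p>The core ingredient of BLEU is a clipped (modified) n-gram precision. The sketch below computes only that ingredient for a single reference; full BLEU additionally combines several n-gram orders and applies a brevity penalty. The function name &lt;code>modified_precision&lt;/code> and the example sentences are chosen for illustration.&lt;/p>

```python
from collections import Counter


def modified_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate against one reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate.split())
    ref_counts = ngrams(reference.split())
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0


score = modified_precision("the the the cat", "the cat sat on the mat")
```

&lt;p>Clipping prevents a candidate from scoring well by repeating a frequent reference word: "the" occurs three times in the candidate but is credited only twice.&lt;/p>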
&lt;h2 id="statistical-methods">Statistical methods&lt;/h2>
&lt;p>Problems of conventional NLG components&lt;/p>
&lt;ul>
&lt;li>expensive to build
&lt;ul>
&lt;li>need lots of handcrafting or a well-labeled dataset to be trained on&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>kind and amount of available data severely limits the development &amp;#x1f622;&lt;/li>
&lt;li>makes cross-domain, multi-lingual SDSs (Spoken Dialogue Systems) intractable &amp;#x1f622;&lt;/li>
&lt;/ul>
&lt;p>Motivation&lt;/p>
&lt;ul>
&lt;li>human languages are context-aware&lt;/li>
&lt;li>natural responses should be learned directly from data rather than depend on predefined syntax or rules&lt;/li>
&lt;/ul>
&lt;h3 id="deep-learning-nlg">Deep Learning NLG&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Significant progress in applying statistical methods for SLU and DM in the past decade&lt;/p>
&lt;ul>
&lt;li>including making them more easily extensible to other application/domains&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Data-driven NLG for SDSs is relatively unexplored due to the aforementioned difficulty of collecting semantically-annotated corpora&lt;/p>
&lt;ul>
&lt;li>rule-based NLG remains the norm for most systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Goal of the NLG component of an SDS:&lt;/p>
&lt;p>map an abstract dialog act consisting of an act type and a set of attribute(slot)-value pairs into an appropriate surface text&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="rnn-based-generation">(RNN-based) Generation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Conditional text generation&lt;/p>
&lt;ul>
&lt;li>Output texts have varying lengths&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Use RNN-based neural network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Decoding&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Initialize RNN with input&lt;/p>
&lt;ul>
&lt;li>Hidden state or first input&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.25.57.png" alt="截屏2020-09-19 17.25.57" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>Generate output probability for first word&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sample first word/Select most probable word&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.26.51.png" alt="截屏2020-09-19 17.26.51" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Insert selected word into RNN&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.27.31.png" alt="截屏2020-09-19 17.27.31" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Continue till &lt;code>&amp;lt;eos&amp;gt;&lt;/code>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.28.13.png" alt="截屏2020-09-19 17.28.13" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
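&lt;p>The decoding loop above can be sketched with greedy selection. Note the loud assumption: a hand-written lookup table &lt;code>NEXT_WORD&lt;/code> stands in for the RNN and conditions only on the previous word, whereas a real RNN conditions on its hidden state. All probabilities and the start symbol &lt;code>BOS&lt;/code> are made up.&lt;/p>

```python
# Hypothetical stand-in for the RNN: next-word probabilities given the previous word.
NEXT_WORD = {
    "BOS": {"the": 0.6, "a": 0.4},
    "the": {"restaurant": 0.7, "bar": 0.3},
    "a": {"restaurant": 1.0},
    "restaurant": {"is": 0.8, "EOS": 0.2},
    "bar": {"is": 1.0},
    "is": {"cheap": 0.5, "expensive": 0.3, "EOS": 0.2},
    "cheap": {"EOS": 1.0},
    "expensive": {"EOS": 1.0},
}


def greedy_decode(next_word_probs, start="BOS", end="EOS", max_len=20):
    """Repeatedly pick the most probable next word until the end symbol."""
    words, previous = [], start
    for _ in range(max_len):
        distribution = next_word_probs[previous]
        # Select the most probable word (sampling would be the alternative).
        previous = max(distribution, key=distribution.get)
        if previous == end:
            break
        # Feed the selected word back in as the next input.
        words.append(previous)
    return words


words = greedy_decode(NEXT_WORD)
print(" ".join(words))
```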
&lt;h4 id="-challenges">🔴 Challenges&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Large vocabulary&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Names of all restaurants&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Delexicalization: Replace slot values by slot names&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.34.19.png" alt="截屏2020-09-19 17.34.19" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Vanishing gradient&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Repeated input&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.30.44.png" alt="截屏2020-09-19 17.30.44" style="zoom:67%;" />
&lt;/li>
&lt;li>
&lt;p>Gating of input vector&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Problem: Output NAME several times&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Remove NAME from S when it has been output&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-19%2017.30.44.png" alt="截屏2020-09-19 17.30.44" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Only backward dependencies&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Rerank output with different models&lt;/p>
&lt;ul>
&lt;li>&lt;strong>N-Best list reranking&lt;/strong>
&lt;ul>
&lt;li>Cannot look at all possible outputs&lt;/li>
&lt;li>But: Generate several good outputs (e.g. top 10; top 100)&lt;/li>
&lt;li>Then we can also use other models to evaluate them&lt;/li>
&lt;li>Possible to select a different output than the model's top choice
&lt;ul>
&lt;li>But if a good output is not in the N-best list, we cannot find it 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>N-Best generation&lt;/strong>
&lt;ul>
&lt;li>Beam search
&lt;ul>
&lt;li>Select top $k$ words at timestep 1&lt;/li>
&lt;li>Independently insert all of them at timestep 2
&lt;ul>
&lt;li>Select top $k$ words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>$k*k$ possible outputs at timestep 2&lt;/li>
&lt;li>Filter top $k$&lt;/li>
&lt;li>Continue with top $k$ at timestep 3&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Right to left&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Rescoring&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2011.31.57.png" alt="截屏2020-09-20 11.31.57">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Inverse direction&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
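&lt;p>The beam search steps listed above can be sketched as follows. As an assumption for illustration, a small lookup table &lt;code>NEXT_WORD&lt;/code> with made-up probabilities replaces the RNN, and finished hypotheses are simply moved out of the beam, which is one of several possible conventions.&lt;/p>

```python
import math

# Hypothetical next-word probabilities; greedy decoding would start with "a".
NEXT_WORD = {
    "BOS": {"a": 0.6, "the": 0.4},
    "a": {"bar": 0.5, "pub": 0.5},
    "the": {"restaurant": 0.9, "bar": 0.1},
    "bar": {"EOS": 1.0},
    "pub": {"EOS": 1.0},
    "restaurant": {"EOS": 1.0},
}


def beam_search(next_word_probs, k=2, start="BOS", end="EOS", max_len=10):
    """Keep the k best partial outputs at every timestep."""
    beams = [(0.0, [start])]  # (log probability, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            # Expand every hypothesis independently with its top k next words.
            distribution = next_word_probs[words[-1]]
            best = sorted(distribution.items(), key=lambda item: -item[1])[:k]
            for word, prob in best:
                candidates.append((score + math.log(prob), words + [word]))
        # Up to k * k candidates: filter down to the top k again.
        candidates.sort(key=lambda item: -item[0])
        beams = []
        for score, words in candidates[:k]:
            if words[-1] == end:
                finished.append((score, words))
            else:
                beams.append((score, words))
        if not beams:
            break
    return sorted(finished, key=lambda item: -item[0])


results = beam_search(NEXT_WORD, k=2)
```

&lt;p>With these numbers greedy decoding commits to "a" (0.6) and ends with a total probability of 0.3, while the beam keeps "the" alive and finds "the restaurant" with probability 0.36.&lt;/p>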
&lt;h4 id="left-to-write-decoding">Left-to-right decoding&lt;/h4>
&lt;ul>
&lt;li>RNN allows generation from left-to-right
&lt;ul>
&lt;li>👍 Advantages
&lt;ul>
&lt;li>Do not need to generate all possible output and then evaluate&lt;/li>
&lt;li>Possible for most tasks&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>👎 Disadvantages
&lt;ul>
&lt;li>
&lt;p>No global view&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Word probability depends only on previous words&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Non-optimal modeling if all slots have already been filled&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="generating-long-sequence">Generating long sequence&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>RNNs prefer short sequences → long sequences are hard to train &amp;#x1f622;. Long outputs tend to be:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Incoherent&lt;/strong>
E.g. &lt;em>The sun is the center of the sun&lt;/em>&lt;/li>
&lt;li>&lt;strong>Redundant&lt;/strong>
E.g. &lt;em>I like cake and cake&lt;/em>&lt;/li>
&lt;li>&lt;strong>Contradictory&lt;/strong>
E.g. &lt;em>I don’t own a gun, but I do own a gun&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡 Idea:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Generate only fixed-length segments&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Condition on input and previous target sequence&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="generating-by-editing">Generating by editing&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Similar sentence should be in the training data&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Edit this sentence instead of generating new sentence&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>💡Idea&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Find similar sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Combine edit vector and input sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Generate output sentence&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Use sequence to sequence model&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Again RNN&lt;/p>
&lt;/li>
&lt;li>
&lt;p>But it is easier to copy than to generate&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2011.40.15.png" alt="截屏2020-09-20 11.40.15">&lt;/p></description></item><item><title>Information Retrieval</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/11-information-retrieval/</link><pubDate>Sun, 20 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/11-information-retrieval/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>&lt;strong>Information Retrieval (IR)&lt;/strong>:&lt;/p>
&lt;p>finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).&lt;/p>
&lt;p>Use case / applications&lt;/p>
&lt;ul>
&lt;li>web search (most common)&lt;/li>
&lt;li>E-mail search&lt;/li>
&lt;li>Searching your laptop&lt;/li>
&lt;li>Corporate knowledge bases&lt;/li>
&lt;li>Legal information retrieval&lt;/li>
&lt;/ul>
&lt;h3 id="basic-idea">Basic idea&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Collection&lt;/strong>: A set of documents&lt;/p>
&lt;ul>
&lt;li>Assume it is a static collection for the moment&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🎯 Goal: Retrieve documents with information that is relevant to the user’s information need and helps the user complete a task&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2023.47.13-20200921115037131.png" alt="截屏2020-09-20 23.47.13">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="main-idea">Main idea&lt;/h3>
&lt;p>Compare document and query to estimate relevance&lt;/p>
&lt;h3 id="components">Components&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Representation&lt;/strong>: How to represent the document and query&lt;/li>
&lt;li>&lt;strong>Metric&lt;/strong>: How to compare document and query&lt;/li>
&lt;/ul>
&lt;h3 id="evaluation-of-retrieved-docs">Evaluation of retrieved docs&lt;/h3>
&lt;p>&amp;ldquo;&lt;strong>How good are the retrieved docs?&lt;/strong>&amp;rdquo;&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Precision&lt;/strong>: Fraction of retrieved docs that are relevant to the user’s information need&lt;/li>
&lt;li>&lt;strong>Recall&lt;/strong>: Fraction of relevant docs in collection that are retrieved&lt;/li>
&lt;/ul>
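&lt;p>Both measures follow directly from the set of retrieved and the set of relevant documents. A minimal sketch, with hypothetical document ids and a hypothetical helper &lt;code>precision_recall&lt;/code>:&lt;/p>

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall from collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved.intersection(relevant))
    # Precision: share of retrieved docs that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant docs that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


precision, recall = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5, 6, 7])
```

&lt;p>Two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).&lt;/p>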
&lt;h2 id="logic-based-ir">Logic-based IR&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Find all text containing words&lt;/strong>
&lt;ul>
&lt;li>Allow boolean operations between words&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Representation: Words occurring in the document&lt;/li>
&lt;li>&lt;strong>Metric&lt;/strong>: Matching (with Boolean operations)&lt;/li>
&lt;li>&lt;strong>Limitations&lt;/strong>
&lt;ul>
&lt;li>Only exact matches&lt;/li>
&lt;li>No relevance metric 🤪&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Primary commercial retrieval tool for 3 decades.
&lt;ul>
&lt;li>Many search systems you still use are Boolean:
&lt;ul>
&lt;li>Email&lt;/li>
&lt;li>library catalog&lt;/li>
&lt;li>Mac OS X Spotlight&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="example">Example&lt;/h3>
&lt;p>&amp;ldquo;Which plays of Shakespeare contain the words &lt;strong>Brutus&lt;/strong> &lt;em>AND&lt;/em> &lt;strong>Caesar&lt;/strong> but &lt;em>NOT&lt;/em> &lt;strong>Calpurnia&lt;/strong>?&amp;rdquo;&lt;/p>
&lt;p>One could &lt;a href="https://en.wikipedia.org/wiki/Grep">&lt;code>grep&lt;/code>&lt;/a> all of Shakespeare’s plays for &lt;strong>Brutus&lt;/strong> and &lt;strong>Caesar&lt;/strong>, then strip out lines containing &lt;strong>Calpurnia&lt;/strong>.&lt;/p>
&lt;p>But this is not the answer &amp;#x1f622;&lt;/p>
&lt;ul>
&lt;li>Slow (for large corpora)&lt;/li>
&lt;li>&lt;em>NOT&lt;/em> &lt;strong>Calpurnia&lt;/strong> is non-trivial&lt;/li>
&lt;li>Other operations (e.g., find the word &lt;strong>Romans&lt;/strong> near &lt;strong>countrymen&lt;/strong>) are not feasible&lt;/li>
&lt;/ul>
&lt;h3 id="incidence-vectors">Incidence vectors&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21 11.53.05.png" alt="截屏2020-09-21 11.53.05" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>0/1 vector for each term&lt;/p>
&lt;/li>
&lt;li>
&lt;p>To answer the query in the example above:&lt;/p>
&lt;p>take the vectors for &lt;strong>Brutus, Caesar&lt;/strong> and &lt;strong>Calpurnia&lt;/strong> (complemented), then bitwise &lt;em>AND&lt;/em>.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Brutus&lt;/strong>: &lt;code>110100&lt;/code> AND&lt;/li>
&lt;li>&lt;strong>Caesar&lt;/strong>: &lt;code>110111&lt;/code> AND&lt;/li>
&lt;li>complemented &lt;strong>Calpurnia&lt;/strong>: &lt;code>101111&lt;/code>&lt;/li>
&lt;li>= &lt;code>100100&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
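&lt;p>The bitwise computation above can be reproduced with a small sketch. The helper names (&lt;code>bits&lt;/code>, &lt;code>complement&lt;/code>, &lt;code>bitwise_and&lt;/code>) are illustrative, and the uncomplemented &lt;strong>Calpurnia&lt;/strong> vector &lt;code>010000&lt;/code> is simply the complement of the vector shown in the list.&lt;/p>

```python
def bits(s):
    """Turn a string such as '110100' into a list of 0/1 integers."""
    return [int(c) for c in s]


def complement(v):
    """Flip every bit: this implements NOT for an incidence vector."""
    return [1 - b for b in v]


def bitwise_and(*vectors):
    """Position-wise AND over any number of equal-length 0/1 vectors."""
    return [int(all(column)) for column in zip(*vectors)]


brutus = bits("110100")
caesar = bits("110111")
calpurnia = bits("010000")  # its complement is 101111

answer = bitwise_and(brutus, caesar, complement(calpurnia))
print("".join(str(b) for b in answer))  # 100100
```

&lt;p>Each position of the result corresponds to one play; a 1 means the play satisfies the whole query.&lt;/p>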
&lt;p>However, this is not feasible for large collections! 😭&lt;/p>
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">More see: &lt;a href="https://nlp.stanford.edu/IR-book/html/htmledition/an-example-information-retrieval-problem-1.html">An example information retrieval problem&lt;/a>&lt;/span>
&lt;/div>
&lt;h3 id="inverted-index">Inverted index&lt;/h3>
&lt;p>For each term $t$, store a list of all documents that contain $t$.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Identify each doc by a &lt;strong>docID&lt;/strong>, a document serial number&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2012.01.42.png" alt="截屏2020-09-21 12.01.42" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;p>Construction&lt;/p>
&lt;ol>
&lt;li>Collect the documents to be indexed&lt;/li>
&lt;li>Tokenize the text, turning each document into a list of tokens&lt;/li>
&lt;li>Do linguistic preprocessing, producing a list of normalized tokens, which are the indexing terms&lt;/li>
&lt;li>Index the documents that each term occurs in by creating an inverted index, consisting of a dictionary and postings&lt;/li>
&lt;/ol>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2012.34.22.png" alt="截屏2020-09-21 12.34.22" style="zoom:80%;" />
&lt;div class="flex px-4 py-3 mb-6 rounded-md bg-primary-100 dark:bg-primary-900">
&lt;span class="pr-3 pt-1 text-primary-600 dark:text-primary-300">
&lt;svg height="24" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">&lt;path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.5" d="m11.25 11.25l.041-.02a.75.75 0 0 1 1.063.852l-.708 2.836a.75.75 0 0 0 1.063.853l.041-.021M21 12a9 9 0 1 1-18 0a9 9 0 0 1 18 0m-9-3.75h.008v.008H12z"/>&lt;/svg>
&lt;/span>
&lt;span class="dark:text-neutral-300">More see: &lt;a href="https://nlp.stanford.edu/IR-book/html/htmledition/a-first-take-at-building-an-inverted-index-1.html">A first take at building an inverted index&lt;/a>&lt;/span>
&lt;/div>
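&lt;p>The four construction steps can be sketched as follows. This is a simplified illustration under stated assumptions: normalization is reduced to lowercasing, there is no stemming, the stop-word list is tiny, and the names &lt;code>tokenize&lt;/code> and &lt;code>build_inverted_index&lt;/code> are hypothetical.&lt;/p>

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "a", "to", "of"}  # tiny illustrative list


def tokenize(text):
    """Steps 2-3: split into tokens, lowercase them, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_inverted_index(docs):
    """Step 4: map each term to a sorted list of docIDs (its postings)."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    # The dictionary is sorted by term, each postings list by docID.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


# Step 1: a toy collection of two short documents.
docs = {
    1: "I did enact Julius Caesar: I was killed in the Capitol; Brutus killed me.",
    2: "So let it be with Caesar. The noble Brutus hath told you Caesar was ambitious.",
}
index = build_inverted_index(docs)
print(index["brutus"], index["killed"], index["ambitious"])  # [1, 2] [1] [2]
```

&lt;p>Storing each docID once per term, in sorted order, is what later makes merging two postings lists cheap.&lt;/p>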
&lt;h3 id="initial-stages-of-text-processing">Initial stages of text processing&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Tokenization&lt;/strong>: Cut character sequence into word tokens&lt;/li>
&lt;li>&lt;strong>Normalization&lt;/strong>: Map text and query term to same form
&lt;ul>
&lt;li>E.g. We want &lt;code>U.S.A&lt;/code> and &lt;code>USA&lt;/code> to match&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Stemming&lt;/strong>: We may want different forms of a root to match
&lt;ul>
&lt;li>E.g. &lt;code>authorize&lt;/code> and &lt;code>authorization&lt;/code> should match&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Stop words&lt;/strong>: we may omit very common words (or not)
&lt;ul>
&lt;li>E.g. &lt;code>the&lt;/code>, &lt;code>a&lt;/code>, &lt;code>to&lt;/code>, &lt;code>of&lt;/code>&amp;hellip;&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="query-processing-and">Query processing: AND&lt;/h3>
&lt;p>For example, consider processing the query: &lt;strong>Brutus&lt;/strong> &lt;em>AND&lt;/em> &lt;strong>Caesar&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Locate &lt;strong>Brutus&lt;/strong> in the Dictionary&lt;/p>
&lt;ul>
&lt;li>Retrieve its postings&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Locate &lt;strong>Caesar&lt;/strong> in the Dictionary&lt;/p>
&lt;ul>
&lt;li>Retrieve its postings&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>“Merge” the two postings (intersect the document sets)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2012.45.19.png" alt="截屏2020-09-21 12.45.19" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>Walk through the two postings simultaneously, in time linear in the total number of postings entries&lt;/p>
&lt;p>(If the list lengths are $x$ and $y$, the merge takes &lt;em>$O(x+y)$&lt;/em> operations.)&lt;/p>
&lt;ul>
&lt;li>‼️Crucial: postings sorted by docID&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
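&lt;p>The merge step can be sketched as a two-pointer walk over both lists. The function name &lt;code>intersect&lt;/code> and the example postings are illustrative assumptions; the sketch only works if both lists are sorted by docID.&lt;/p>

```python
def intersect(p1, p2):
    """Intersect two postings lists that are sorted by docID."""
    answer = []
    i, j = 0, 0
    while i != len(p1) and j != len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])  # docID occurs in both lists
            i += 1
            j += 1
        elif p2[j] > p1[i]:
            i += 1  # p1 is behind, advance it
        else:
            j += 1  # p2 is behind, advance it
    return answer


# Illustrative postings lists.
brutus = [1, 2, 4, 11, 31, 45, 173, 174]
caesar = [1, 2, 4, 5, 6, 16, 57, 132]
print(intersect(brutus, caesar))  # [1, 2, 4]
```

&lt;p>Every iteration advances at least one pointer, so the loop runs at most $x+y$ times.&lt;/p>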
&lt;h3 id="phrase-queries">&lt;strong>Phrase queries&lt;/strong>&lt;/h3>
&lt;p>E.g. We want to be able to answer queries such as &lt;em>&lt;strong>&amp;ldquo;stanford university&amp;rdquo;&lt;/strong>&lt;/em> as a phrase&lt;/p>
&lt;p>&amp;ndash;&amp;gt; The sentence &lt;em>&amp;ldquo;I went to university at stanford&amp;rdquo;&lt;/em> is not a match.&lt;/p>
&lt;p>Implementation:&lt;/p>
&lt;ul>
&lt;li>Multi-words&lt;/li>
&lt;li>Positional index&lt;/li>
&lt;/ul>
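&lt;p>A positional index stores, for each term and document, the positions at which the term occurs, so a phrase matches only if its words appear at consecutive positions. The following is a minimal sketch with whitespace tokenization; the names &lt;code>build_positional_index&lt;/code> and &lt;code>phrase_query&lt;/code> are hypothetical.&lt;/p>

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> {docID: [positions of the term in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def phrase_query(index, phrase):
    """Return docIDs in which the words of `phrase` occur consecutively."""
    terms = phrase.lower().split()
    first, rest = terms[0], terms[1:]
    matches = []
    for doc_id, positions in index.get(first, {}).items():
        for start in positions:
            # word k of the phrase must sit at position start + k
            if all(start + k in index.get(t, {}).get(doc_id, [])
                   for k, t in enumerate(rest, start=1)):
                matches.append(doc_id)
                break
    return sorted(matches)


docs = {
    1: "I went to university at stanford",
    2: "she studies at stanford university",
}
index = build_positional_index(docs)
print(phrase_query(index, "stanford university"))  # [2]
```

&lt;p>Document 1 contains both words but not next to each other in this order, so it is not returned.&lt;/p>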
&lt;h2 id="rank-based-ir">Rank-based IR&lt;/h2>
&lt;h3 id="motivation">Motivation&lt;/h3>
&lt;p>&lt;strong>Boolean queries: Documents either match or don’t.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Good for:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>expert users&lt;/strong> with precise understanding of their needs and the collection.&lt;/li>
&lt;li>&lt;strong>applications&lt;/strong>: Applications can easily consume 1000s of results.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>NOT good for the majority of users&lt;/p>
&lt;ul>
&lt;li>Most users are incapable of writing Boolean queries (or they are capable, but think it’s too much work).&lt;/li>
&lt;li>Most users don’t want to wade through 1000s of results.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>🔴 Problem: feast or famine&lt;/p>
&lt;ul>
&lt;li>Often result in either too few (=0) or too many (1000s) results.&lt;/li>
&lt;li>It takes a lot of skill to come up with a query that produces a manageable number of hits.
&lt;ul>
&lt;li>AND gives too few;&lt;/li>
&lt;li>OR gives too many&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Ranked retrieval models&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Returns an ordering over the (top) documents in the collection for a query&lt;/li>
&lt;li>&lt;strong>Free text queries&lt;/strong>: Rather than a query language of operators and expressions, the user’s query is just one or more words in a human language&lt;/li>
&lt;li>Large result sets are not an issue
&lt;ul>
&lt;li>
&lt;p>Indeed, the size of the result set is not an issue&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We just show the top $k$ (≈10) results&lt;/p>
&lt;/li>
&lt;li>
&lt;p>We don’t overwhelm the user&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Premise: the ranking algorithm works&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Representation&lt;/strong>:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Term weights (TF-IDF)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Word embeddings&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Character embeddings&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Metric&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Cosine similarity&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Classifier trained with supervision on clickthrough logs&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h3 id="document-similarity">Document similarity&lt;/h3>
&lt;h4 id="query-document-matching-scores">Query-document matching scores&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Assigning a score to a query/document pair&lt;/p>
&lt;/li>
&lt;li>
&lt;p>One-term query&lt;/p>
&lt;ul>
&lt;li>If the query term does not occur in the document: score should be 0&lt;/li>
&lt;li>The more frequent the query term in the document, the higher the score (should be)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Binary term-document incidence matrix&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2023.20.08.png" alt="截屏2020-09-21 23.20.08" style="zoom: 67%;" />
&lt;ul>
&lt;li>Each document is represented by a binary vector $\in \\{0, 1\\}^{|V|}$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Term-document count matrices&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Consider the number of occurrences of a term in a document&lt;/p>
&lt;ul>
&lt;li>Each document is a count vector in $\mathbb{N}^{|V|}$: a column below&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2023.21.53.png" alt="截屏2020-09-21 23.21.53" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Term frequency tf&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Term frequency of term $t$ in document $d$
&lt;/p>
$$
\text{tf}\_{t,d}:= \text{number of times that } t \text{ occurs in } d
$$
&lt;/li>
&lt;li>
&lt;p>We want to use tf when computing query-document match scores&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Log-frequency weighting&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Log frequency weight of term $t$ in $d$&lt;/li>
&lt;/ul>
$$
w\_{t,d} = \begin{cases} 1 + \log\_{10}\text{tf}\_{t, d}&amp; \text{if } \text{tf}\_{t,d}>0 \\\\
0 &amp; \text {otherwise }\end{cases}
$$
&lt;ul>
&lt;li>Score for a document-query pair: sum over terms $t$ in both $q$ and $d$
$$
\text{score} = \sum\_{t \in q \cap d}(1 + \log\_{10} \text{tf}\_{t,d})
$$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="document-frequency">&lt;strong>Document frequency&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>💡 &lt;strong>Rare terms are more informative than frequent terms&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>$\text{df}\_t$: Document frequency of $t$&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>The number of documents that contain $t$&lt;/li>
&lt;li>Inverse measure of the informativeness of $t$&lt;/li>
&lt;li>$\text{df}\_t \leq N$&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>$\text{idf}\_t$: inverse document frequency of $t$&lt;/strong>
&lt;/p>
$$
\text{idf}\_t = \log\_{10}(\frac{N}{\text{df}\_t})
$$
&lt;p>
(use $\log (N/\text{df}\_t)$ instead of $N/\text{df}\_t$ to “dampen” the effect of $\text{idf}$)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Collection frequency of $t$&lt;/strong>: the number of occurrences of $t$ in the collection, counting multiple occurrences.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h4 id="tf-idf-weighting">&lt;strong>tf-idf weighting&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>The tf-idf weight of a term is the product of its tf weight and its idf weight
&lt;/p>
$$
\mathrm{w}\_{t, d}=\left(1+\log \_{10} \mathrm{tf}\_{t, d}\right) \times \log \_{10}\left(N / \mathrm{df}\_{t}\right)
$$
&lt;/li>
&lt;li>
&lt;p>Best known weighting scheme in information retrieval&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Increases with the number of occurrences within a document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Increases with the rarity of the term in the collection&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-21%2023.49.26.png"
alt="Each document is now represented by a real-valued vector of tf-idf weights $\in \mathbb{R}^{|V|}$">&lt;figcaption>
&lt;p>Each document is now represented by a real-valued vector of tf-idf weights $\in \mathbb{R}^{|V|}$&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;/ul>
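&lt;p>As a worked sketch, the weights can be computed with the log-frequency tf weight $1 + \log\_{10}\text{tf}\_{t,d}$ and $\text{idf}\_t = \log\_{10}(N/\text{df}\_t)$ defined above. The toy documents and the name &lt;code>tf_idf_vectors&lt;/code> are assumptions for illustration, and tokenization is plain whitespace splitting.&lt;/p>

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return {docID: {term: weight}} with w = (1 + log10 tf) * log10(N / df)."""
    n_docs = len(docs)
    counts = {d: Counter(text.lower().split()) for d, text in docs.items()}
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for c in counts.values() for term in c)
    return {
        d: {t: (1 + math.log10(tf)) * math.log10(n_docs / df[t]) for t, tf in c.items()}
        for d, c in counts.items()
    }


# Toy collection, made up for illustration.
docs = {
    1: "caesar caesar brutus",
    2: "caesar calpurnia",
    3: "caesar brutus brutus brutus",
}
weights = tf_idf_vectors(docs)
for doc_id, vector in weights.items():
    print(doc_id, {t: round(w, 3) for t, w in vector.items()})
```

&lt;p>In this toy collection &lt;code>caesar&lt;/code> occurs in every document, so its idf and therefore its weight is 0 everywhere, while the rare term &lt;code>calpurnia&lt;/code> receives the highest idf.&lt;/p>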
&lt;h3 id="documents-as-vectors">Documents as vectors&lt;/h3>
&lt;ul>
&lt;li>$|V|$-dimensional vector space
&lt;ul>
&lt;li>Terms are axes of the space&lt;/li>
&lt;li>Documents are points or vectors in this space&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Very high-dimensional: tens of millions of dimensions when you apply this to a web search engine! 😱
&lt;ul>
&lt;li>Very sparse vectors (most entries are zero)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Distributional similarity based representations&lt;/strong>
&lt;ul>
&lt;li>Get a lot of value by representing a word by means of its neighbors&lt;/li>
&lt;li>“You shall know a word by the company it keeps”&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Low dimensional vectors&lt;/strong>
&lt;ul>
&lt;li>The number of topics that people talk about is small&lt;/li>
&lt;li>💡Idea: store “most” of the important information in a fixed, small number of dimensions: a dense vector (Usually 25 – 1000 dimensions)&lt;/li>
&lt;li>&lt;strong>Reduce the dimensionality&lt;/strong>: Go from big, sparse co-occurrence count vector to low dimensional “word embedding”
&lt;ul>
&lt;li>Traditional Way: &lt;strong>Latent Semantic Indexing/Analysis&lt;/strong>
&lt;ul>
&lt;li>Use Singular Value Decomposition (SVD)&lt;/li>
&lt;li>Similarity is preserved as much as possible&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="dl-methods">DL methods&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Word representation in neural networks:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>1-hot vector&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sparse representation&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>NN learn continuous dense representation&lt;/p>
&lt;ul>
&lt;li>Word embeddings
&lt;ul>
&lt;li>End-to-End learning&lt;/li>
&lt;li>Pre-training using other task&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="word-embeddings">Word embeddings&lt;/h4>
&lt;ul>
&lt;li>Predict surrounding words
&lt;ul>
&lt;li>E.g. Word2Vec, GloVe&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Document representation:
&lt;ul>
&lt;li>&lt;strong>TF-IDF Vectors&lt;/strong>: Sum of word vectors&lt;/li>
&lt;li>&lt;strong>Word embeddings&lt;/strong>: Sum or average of word vectors&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>🔴 Problems
&lt;ul>
&lt;li>&lt;strong>High dimension&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Unseen words&lt;/strong>: Not possible to represent words not seen in training&lt;/li>
&lt;li>&lt;strong>Morphology&lt;/strong>: No modelling of spelling similarity&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="letter-n-grams">Letter n-grams&lt;/h4>
&lt;ul>
&lt;li>Mark the beginning and end of each word
&lt;ul>
&lt;li>E.g. &lt;code>#good#&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Letter tri-grams
&lt;ul>
&lt;li>E.g. &lt;code>#go&lt;/code>, &lt;code>goo&lt;/code>, &lt;code>ood&lt;/code>, &lt;code>od#&lt;/code>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>🔴 Problem:
&lt;ul>
&lt;li>&lt;strong>Collision&lt;/strong>: Different words may be represented by the same trigrams&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
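&lt;p>Extracting the letter trigrams of a word is a simple sliding window over the boundary-marked string. A minimal sketch, with the hypothetical helper name &lt;code>letter_trigrams&lt;/code>:&lt;/p>

```python
def letter_trigrams(word, n=3):
    """Return the letter n-grams of `word` after adding '#' boundary marks."""
    marked = "#" + word + "#"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


print(letter_trigrams("good"))  # ['#go', 'goo', 'ood', 'od#']
```

&lt;p>Because the trigram inventory is fixed by the alphabet, a word that was never seen in training still gets a representation.&lt;/p>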
&lt;h4 id="measure-similarity">Measure similarity&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Rank documents according to their proximity to the query in this space&lt;/p>
&lt;ul>
&lt;li>proximity = similarity of vectors&lt;/li>
&lt;li>proximity ≈ inverse of distance&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>(Euclidean) Distance is a bad idea!&lt;/p>
&lt;ul>
&lt;li>Euclidean distance is large for vectors of different lengths&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Use &lt;strong>angle&lt;/strong> instead of distance&lt;/p>
&lt;ul>
&lt;li>💡 Key idea: Rank documents according to angle with query.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>From angles to cosines&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Cosine is a monotonically decreasing function on the interval $[0^{\circ}, 180^{\circ}]$.&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/1024px-Cosine.svg.png" alt="File:Cosine.svg - Wikimedia Commons">&lt;/p>
&lt;p>The following two notions are equivalent:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Rank documents in &lt;em>&lt;u>increasing&lt;/u>&lt;/em> order of the angle between query and document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Rank documents in &lt;em>&lt;u>decreasing&lt;/u>&lt;/em> order of $\operatorname{cosine}(\text{query},\text{document})$&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Length normalization&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Dividing a vector by its $L\_2$ norm makes it a unit (length) vector (on the surface of the unit hypersphere)
&lt;/p>
$$
\|\vec{x}\|\_{2}=\sqrt{\sum x\_{i}^{2}}
$$
&lt;p>
&amp;ndash;&amp;gt; Long and short documents now have comparable weights&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>$\operatorname{cosine}(\text{query},\text{document})$
&lt;/p>
$$
\cos (\vec{q}, \vec{d})=\frac{\vec{q} \cdot \vec{d}}{|\vec{q}||\vec{d}|}=\frac{\vec{q}}{|\vec{q}|} \cdot \frac{\vec{d}}{|\vec{d}|}=\frac{\sum_{i=1}^{|V|} q_{i} d_{i}}{\sqrt{\sum_{i=1}^{|V|} q_{i}^{2}} \sqrt{\sum_{i=1}^{|V|} d_{i}^{2}}}
$$
&lt;ul>
&lt;li>
&lt;p>$q\_i$: e.g. the tf-idf weight of term &lt;em>i&lt;/em> in the query&lt;/p>
&lt;/li>
&lt;li>
&lt;p>$d\_i$: e.g. the tf-idf weight of term &lt;em>i&lt;/em> in the document&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Illustration example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2010.57.26.png" alt="截屏2020-09-23 10.57.26" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
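&lt;p>Length normalization and cosine ranking can be sketched as follows. The vectors are made-up three-dimensional examples rather than real tf-idf weights, the helper names are hypothetical, and the most similar document (largest cosine, smallest angle) is listed first.&lt;/p>

```python
import math


def l2_normalize(v):
    """Divide a vector by its L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine(q, d):
    """Cosine similarity = dot product of the length-normalized vectors."""
    return sum(a * b for a, b in zip(l2_normalize(q), l2_normalize(d)))


def rank(query, docs):
    """Return docIDs sorted from most to least similar to the query."""
    return sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]), reverse=True)


query = [1.0, 1.0, 0.0]
docs = {
    "d1": [2.0, 2.0, 0.0],  # same direction as the query, but longer
    "d2": [1.0, 0.0, 1.0],
    "d3": [0.0, 0.0, 3.0],  # orthogonal to the query
}
print(rank(query, docs))  # ['d1', 'd2', 'd3']
```

&lt;p>Although &lt;code>d1&lt;/code> is twice as long as the query, its cosine is 1 because only the direction matters after normalization.&lt;/p>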
&lt;h2 id="link-information">Link information&lt;/h2>
&lt;h3 id="hypertext-and-links">Hypertext and links&lt;/h3>
&lt;ul>
&lt;li>Questions
&lt;ul>
&lt;li>Do the links represent a conferral of authority to some pages? Is this useful for ranking?&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Application
&lt;ul>
&lt;li>
&lt;p>The Web&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Email&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Social networks&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="links">Links&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>The &lt;span style="color:green">Good&lt;/span>, The &lt;span style="color:red">Bad&lt;/span> and The Unknown&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.21.05.png" alt="截屏2020-09-23 11.21.05" style="zoom:67%;" />
&lt;ul>
&lt;li>
&lt;p>&lt;span style="color:green">Good&lt;/span> nodes won’t point to &lt;span style="color:red">Bad&lt;/span> nodes&lt;/p>
&lt;ul>
&lt;li>
&lt;p>All other combinations plausible&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If you point to a &lt;span style="color:red">Bad&lt;/span> node, you’re &lt;span style="color:red">Bad&lt;/span>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If a &lt;span style="color:green">Good&lt;/span> node points to you, you’re Good&lt;/p>
&lt;/li>
&lt;/ul>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.22.57.png" alt="截屏2020-09-23 11.22.57" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="web-as-a-directed-graph">Web as a Directed Graph&lt;/h3>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.28.29.png" alt="截屏2020-09-23 11.28.29" style="zoom:80%;" />
&lt;ul>
&lt;li>&lt;strong>Hypothesis 1:&lt;/strong> A hyperlink between pages denotes a conferral of authority (quality signal)&lt;/li>
&lt;li>&lt;strong>Hypothesis 2:&lt;/strong> The text in the anchor of the hyperlink on page A describes the target page B&lt;/li>
&lt;/ul>
&lt;h3 id="anchor-text">Anchor Text&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Assumptions&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>reputed sites&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.32.10.png" alt="截屏2020-09-23 11.32.10" style="zoom: 67%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>annotation of target&lt;/strong>&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.32.41.png" alt="截屏2020-09-23 11.32.41" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Indexing&lt;/strong>: When indexing a document &lt;em>D&lt;/em>, include (with some weight) anchor text from links pointing to &lt;em>D&lt;/em>.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-23%2011.34.16.png" alt="截屏2020-09-23 11.34.16" style="zoom: 67%;" />
&lt;ul>
&lt;li>Can sometimes have unexpected effects, e.g., spam, &lt;strong>miserable failure&lt;/strong> 🤪&lt;/li>
&lt;li>Solution: score anchor text with weight depending on the &lt;strong>authority&lt;/strong> of the anchor page’s website
&lt;ul>
&lt;li>&lt;em>E.g., if we were to assume that content from cnn.com or yahoo.com is authoritative, then trust (more) the anchor text from them&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="link-analysis-pagerank">Link analysis: Pagerank&lt;/h3>
&lt;h4 id="citation-analysis">&lt;strong>Citation Analysis&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>&lt;strong>Citation frequency&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Bibliographic coupling frequency&lt;/strong>: Articles that co-cite the same articles are related&lt;/li>
&lt;li>&lt;strong>Citation indexing&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h4 id="pagerank-scoring">&lt;strong>Pagerank scoring&lt;/strong>&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Imagine a user doing a random walk on web pages:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Start at a random page&lt;/p>
&lt;/li>
&lt;li>
&lt;p>At each step, leave the current page along one of the links on that page, equiprobably&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>“In the long run” each page has a long-term visit rate - use this as the page’s score.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>But the web is full of &lt;strong>dead ends&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Random walk can get stuck in dead-ends. &amp;#x1f622;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Makes no sense to talk about long-term visit rates.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>At a dead end, jump to a random web page.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>At any non-dead end, with probability 10%, jump to a random web page; with the remaining probability (90%), follow one of the page’s links, chosen equiprobably.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Result of teleporting&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Now cannot get stuck locally.&lt;/li>
&lt;li>There is a long-term rate at which any page is visited&lt;/li>
&lt;/ul>
&lt;/li>
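&lt;li>
&lt;p>The random walk with teleporting can be sketched by repeatedly redistributing visit probability. This is a minimal illustration, not a production implementation: the function name &lt;code>pagerank&lt;/code>, the fixed iteration count and the four-page toy web are assumptions, and every link target must itself be a key of &lt;code>links&lt;/code>.&lt;/p>

```python
def pagerank(links, teleport=0.1, iterations=100):
    """Approximate long-term visit rates of a random walk with teleporting.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start at a random page
    for _ in range(iterations):
        new_rank = {p: 0.0 for p in pages}
        for p in pages:
            out = links[p]
            if not out:
                # dead end: jump to a random page
                for q in pages:
                    new_rank[q] += rank[p] / n
            else:
                # with probability `teleport`, jump to a random page ...
                for q in pages:
                    new_rank[q] += teleport * rank[p] / n
                # ... otherwise follow one of the out-links, equiprobably
                for q in out:
                    new_rank[q] += (1 - teleport) * rank[p] / len(out)
        rank = new_rank
    return rank


# Hypothetical four-page web; "d" is a dead end.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": []}
scores = pagerank(web)
print({page: round(score, 3) for page, score in scores.items()})
```

&lt;p>The scores always sum to 1, and every page gets a non-zero score because teleporting can reach it from anywhere.&lt;/p>
&lt;/li>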
&lt;/ul></description></item><item><title>Language and Vision</title><link>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/12-language-and-vision/</link><pubDate>Sun, 20 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/natural-language-processing/lecture-notes/12-language-and-vision/</guid><description>&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Humans interact with their environment multimodally&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Modalities&lt;/p>
&lt;ul>
&lt;li>Text&lt;/li>
&lt;li>Audio&lt;/li>
&lt;li>Vision&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Other modalities can be used to disambiguate text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Jointly using different modalities&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="image-description">Image description&lt;/h2>
&lt;h3 id="generation">Generation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>Generate description/caption of image&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Verbalize the most salient aspects of the image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Typically one sentence&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-20%2023.57.18.png" alt="截屏2020-09-20 23.57.18">&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Joint use of&lt;/p>
&lt;ul>
&lt;li>Computer vision&lt;/li>
&lt;li>Natural language processing&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="-challenges">🔴 Challenges&lt;/h4>
&lt;ul>
&lt;li>Cover any visual aspect of the image:
&lt;ul>
&lt;li>
&lt;p>Objects and their attributes&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Features of the scene&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Interaction of objects&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Reference to objects not in the image:
&lt;ul>
&lt;li>E.g. &lt;em>people waiting for a train&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Background knowledge necessary
&lt;ul>
&lt;li>E.g. &lt;em>Picture of Mona Lisa&lt;/em>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="task">Task&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Input: Image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Generate representation&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Output: Text&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Related to Natural language generation&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Content selection&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Organizing of content&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Surface realization&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h4 id="generation-from-visual-input">Generation from Visual Input&lt;/h4>
&lt;ul>
&lt;li>
&lt;p>Standard pipeline:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Computer vision&lt;/strong>: Recognize
&lt;ul>
&lt;li>
&lt;p>Scene&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Objects&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Spatial relationship&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Actions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;strong>Natural language generation&lt;/strong>
&lt;ul>
&lt;li>Combine words/phrases from first step using
&lt;ul>
&lt;li>Templates&lt;/li>
&lt;li>N-grams&lt;/li>
&lt;li>Grammar rules&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2000.02.47.png" alt="截屏2020-09-21 00.02.47" style="zoom:80%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2000.03.09.png" alt="截屏2020-09-21 00.03.09" style="zoom:80%;" />
&lt;/li>
&lt;li>
&lt;p>&lt;strong>End-to-End approaches&lt;/strong> (&lt;a href="https://arxiv.org/abs/1502.03044">Show, Attend and Tell&lt;/a>)&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2000.05.48.png" alt="截屏2020-09-21 00.05.48" style="zoom:80%;" />
&lt;ul>
&lt;li>
&lt;p>CNN Encoder of the image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>LSTM-based Decoder generating the sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Attention mechanism to attend to different parts of the image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Examples&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2000.06.17.png" alt="截屏2020-09-21 00.06.17" style="zoom:80%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="retrieval">Retrieval&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>💡 Idea: Use description of similar image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Algorithm:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Extract visual feature&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Retrieve most similar images using similarity function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Re-rank images&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Combine retrieved descriptions&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;figure>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-21%2000.08.07.png"
alt="Description retrieval">&lt;figcaption>
&lt;p>Description retrieval&lt;/p>
&lt;/figcaption>
&lt;/figure>
&lt;/li>
&lt;/ul>
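&lt;p>The first two steps of the algorithm (features in, most similar images out) can be sketched as a nearest-neighbour lookup. Everything concrete here is an assumption for illustration: the three-dimensional "visual features", the captions, the use of cosine similarity as the similarity function, and the name &lt;code>retrieve_descriptions&lt;/code>. Re-ranking and combining the retrieved descriptions are left out.&lt;/p>

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_descriptions(query_features, database, k=2):
    """Return the captions of the k database images most similar to the query.

    database: list of (feature_vector, caption) pairs.
    """
    ranked = sorted(database, key=lambda item: cosine(query_features, item[0]), reverse=True)
    return [caption for _, caption in ranked[:k]]


# Made-up features and captions, for illustration only.
database = [
    ([0.9, 0.1, 0.0], "a dog running on the beach"),
    ([0.8, 0.2, 0.1], "a dog playing in the sand"),
    ([0.0, 0.1, 0.9], "a train arriving at a station"),
]
print(retrieve_descriptions([1.0, 0.0, 0.0], database))
```

&lt;p>In practice the feature vectors would come from a pretrained vision model rather than being written by hand.&lt;/p>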
&lt;h2 id="visual-question-answering">Visual question answering&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Given:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Question related to the image&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Example&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/截屏2020-09-21%2000.14.39.png" alt="截屏2020-09-21 00.14.39" style="zoom:67%;" />
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Output: Answer&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Most common model: Joint neural network&lt;/p>
&lt;/li>
&lt;li>
&lt;p>🔴 Challenges: Multi-step reasoning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Steps&lt;/p>
&lt;ol>
&lt;li>Locate objects (&lt;em>bike, window, street, basket and dogs&lt;/em>)&lt;/li>
&lt;li>Identify concept (&lt;em>sitting&lt;/em>)&lt;/li>
&lt;li>Rule out irrelevant objects&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ul>
&lt;h3 id="image-model">Image model&lt;/h3>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-21%2000.12.48.png" alt="截屏2020-09-21 00.12.48">&lt;/p>
&lt;p>CNN:&lt;/p>
&lt;ul>
&lt;li>Pretrained models are often used&lt;/li>
&lt;li>Global features: Fixed size representation of the whole image&lt;/li>
&lt;li>Local features: Representation of different regions of the image&lt;/li>
&lt;/ul>
&lt;h3 id="text-model">Text model&lt;/h3>
&lt;p>Read question word by word&lt;/p>
&lt;p>&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/%E6%88%AA%E5%B1%8F2020-09-21%2000.13.55.png" alt="截屏2020-09-21 00.13.55">&lt;/p>
&lt;h3 id="answer-generation">Answer generation&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>One word or free text&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Input: Image features and text features&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Output: Most probable word&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Models:&lt;/p>
&lt;ul>
&lt;li>Fully connected NN&lt;/li>
&lt;li>Attention mechanism&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item></channel></rss>