Reapproximation of Densities

Four cases arise, depending on whether the given density and its reapproximation are each continuous or discrete.

Examples for reapproximation

  • Continuous → continuous: Gaussian mixture reduction

  • Continuous → discrete: deterministic sampling, i.e., replacing a continuous density with a Dirac mixture

  • Discrete → continuous: density estimation, i.e., finding a continuous density representing a set of given samples

  • Discrete → discrete: Dirac mixture reduction

Challenge: Three cases involving discrete densities

  • Continuous → continuous case: Use standard distance measures, e.g. integral squared distance (ISD)

  • Discrete densities prohibit use of standard distance measures

Here we focus on continuous → discrete reapproximation

  • Given: Continuous density $\tilde{f}(\underline{x})$

  • Deterministic sampling, i.e., approximation with Dirac mixture

  • Definition of Dirac mixture with $L$ components

    $$ f(\underline{x})=\sum_{i=1}^{L} w_{i} \cdot \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right) $$
    • Weights $w_{i}>0, \displaystyle \sum_{i=1}^{L} w_{i}=1$
    • $\underline{\hat{x}}_i$: locations / samples
  • 🎯 Goal: Systematic approximation of given continuous density

  • Application examples

    • Mapping of random variables through nonlinear functions

    • Sample-based fusion and estimation (UKF)

Univariate Case (1D)

Synthesis

Instead of comparing densities $\tilde{f}(x), f(x)$, we compare cumulative distribution functions (CDFs) $\tilde{F}(x), F(x)$

CDF of $f(x)$:

$$ F(x)=P(\boldsymbol{x} \leq x)=\int_{-\infty}^{x} f(u) \mathrm{d} u $$
  • This definition suffices: the alternative definition $\bar{F}(x)=P(\boldsymbol{x}>x)$ carries no additional information, as it is fully determined by $F(x)$

    $$ \begin{aligned} \bar{F}(x)=&P(\boldsymbol{x}>x) \\ &=\int_{x}^{\infty} f(u) d u \\ &=1-\int_{-\infty}^{x} f(u) d u \\ &=1-P(\boldsymbol{x} \leq x) \\ &=1-F(x) \end{aligned} $$
  • Example

    Dirac mixture density

    $$ f(x, \underline{\hat{x}})=\sum_{i=1}^{L} w_{i} \delta\left(x-\hat{x}_{i}\right) $$

    Dirac mixture cumulative distribution

    $$ F(x, \underline{\hat{x}})=\sum_{i=1}^{L} w_{i} \mathrm{H}\left(x-\hat{x}_{i}\right) \text { with } \mathrm{H}(x)=\int_{-\infty}^{x} \delta(t) \mathrm{d} t= \begin{cases}0 & x<0 \\ \frac{1}{2} & x=0 \\ 1 & x>0\end{cases} $$

    with the Dirac position

    $$ \underline{\hat{x}}=\left[\hat{x}_{1}, \hat{x}_{2}, \ldots, \hat{x}_{L}\right]^{\top} $$
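The Dirac mixture CDF above is straightforward to evaluate; a minimal sketch (the function name is illustrative):

```python
def dirac_cdf(x, weights, locations):
    """Evaluate F(x, x̂) = Σ_i w_i · H(x - x̂_i) with H(0) = 1/2."""
    def H(t):
        return 0.0 if t < 0 else (0.5 if t == 0 else 1.0)
    return sum(w * H(x - xi) for w, xi in zip(weights, locations))

# Two equally weighted components at ±1:
dirac_cdf(0.0, [0.5, 0.5], [-1.0, 1.0])   # 0.5
```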

CDF of $\tilde{f}(x)$ follows analogously:

$$ \tilde{F}(x)=\int_{-\infty}^{x} \tilde{f}(u) \mathrm{d} u $$
  • Example

    Gaussian density:

    $$ \tilde{f}(x)=\frac{1}{\sqrt{2 \pi}} \exp \left(-\frac{1}{2} x^{2}\right) $$

    $\Rightarrow$ Gaussian cumulative distribution:

    $$ \tilde{F}(x)=\frac{1}{2}\left(1+\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right) $$

We compare $\tilde{F}(x)$ and $F(x)$ using the Cramér–von Mises distance:

$$ D(\underline{\hat{x}})=\int_{\mathbb{R}}\left(\tilde{F}(x)-F(x, \underline{\hat{x}})\right)^{2} \mathrm{~d} x $$

Minimization of Cramér–von Mises distance

Gradient of the distance measure $D(\underline{\hat{x}})$:

$$ \underline{G}(\underline{\hat{x}})=\nabla D(\underline{\hat{x}})=\frac{\partial D(\underline{\hat{x}})}{\partial \underline{\hat{x}}}=\left[\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{1}}, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{2}}, \ldots, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{L}}\right]^{\top} $$

with

$$ \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i} \int_{-\infty}^{\infty}[\tilde{F}(t)-F(t, \underline{\hat{x}})] \delta\left(t-\hat{x}_{i}\right) \mathrm{d} t $$

or

$$ \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i}\left[\tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)\right] \text { with } F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=\sum_{j=1}^{L} w_{j} \mathrm{H}\left(\hat{x}_{i}-\hat{x}_{j}\right) $$

The Hessian matrix is

$$ \mathbf{H}(\underline{\hat{x}})=\operatorname{diag}\left(\left[\frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{1}^{2}}, \frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{2}^{2}}, \ldots, \frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{L}^{2}}\right]\right) $$

with

$$ \frac{\partial^{2} D(\underline{\hat{x}})}{\partial \hat{x}_{i}^{2}}=2 w_{i} \tilde{f}\left(\hat{x}_{i}\right) $$

Sorted Locations & Equal Weights

When location vector $\underline{\hat{x}}$ is sorted, i.e., $\hat{x}_{1}<\hat{x}_{2}<\ldots<\hat{x}_{L}$ , we obtain

$$ H\left(\hat{x}_{i}-\hat{x}_{j}\right)= \begin{cases} 0 & i < j \\ \frac{1}{2} & i=j \\ 1 & i > j \end{cases} $$

The cumulative distribution then simplifies to

$$ F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=\frac{w_{i}}{2}+\sum_{j=1}^{i-1} w_{j} $$

When samples are equally weighted (i.e. $w_i = \frac{1}{L}$), we get

$$ F(\hat{x}_{i}, \underline{\hat{x}}) = \frac{1}{2L} + \frac{i-1}{L} = \frac{2i - 1}{2L} \qquad i = 1, \dots, L $$

Analytic solutions (possible in some special cases)

$$ \tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)=0 \Rightarrow \hat{x}_{i}=\tilde{F}^{-1}(\frac{2 i-1}{2 L}) \qquad i=1, \ldots, L $$
  • E.g. standard Gaussian distribution:

    $$ \hat{x}_{i}=\tilde{F}^{-1}\left(\frac{2 i-1}{2 L}\right)=\sqrt{2} \operatorname{erfinv}\left(\frac{2 i-1}{L}-1\right) $$
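For the standard normal, this analytic placement can be computed directly with Python's stdlib quantile function (the function name below is illustrative):

```python
from statistics import NormalDist

def dirac_locations_std_normal(L):
    """Equally weighted Dirac locations x̂_i = F̃⁻¹((2i − 1) / (2L))."""
    std = NormalDist()  # standard normal; inv_cdf is F̃⁻¹
    return [std.inv_cdf((2 * i - 1) / (2 * L)) for i in range(1, L + 1)]

locs = dirac_locations_std_normal(5)
# symmetric about 0: approximately [-1.2816, -0.5244, 0.0, 0.5244, 1.2816]
```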

Example: Dirac Mixture Approximation of the Standard Normal Distribution

With increasing number of Dirac functions, the CDF can be well approximated.



General Optimization

In general: Minimum of $D(\underline{\hat{x}})$ is obtained iteratively using Newton’s method

$$ \Delta \underline{\hat{x}}=-\mathbf{H}^{-1}(\underline{\hat{x}}) \underline{G}(\underline{\hat{x}}) $$

with

$$ \underline{G}(\underline{\hat{x}})=\left[\frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{1}}, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{2}}, \ldots, \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{L}}\right]^{\top} $$

and

$$ \frac{\partial D(\underline{\hat{x}})}{\partial \hat{x}_{i}}=2 w_{i}\left[\tilde{F}\left(\hat{x}_{i}\right)-F\left(\hat{x}_{i}, \underline{\hat{x}}\right)\right] $$

The Hessian $\mathbf{H}(\underline{\hat{x}})$ is given by

$$ \mathbf{H}(\underline{\hat{x}})=2 \operatorname{diag}\left(\left[w_{1} \tilde{f}\left(\hat{x}_{1}\right), w_{2} \tilde{f}\left(\hat{x}_{2}\right), \ldots, w_{L} \tilde{f}\left(\hat{x}_{L}\right)\right]\right) $$

The resulting Newton step:

$$ \Delta \underline{\hat{x}}=-\left[\frac{\tilde{F}\left(\hat{x}_{1}\right)-F\left(\hat{x}_{1}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{1}\right)}, \frac{\tilde{F}\left(\hat{x}_{2}\right)-F\left(\hat{x}_{2}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{2}\right)}, \ldots, \frac{\tilde{F}\left(\hat{x}_{L}\right)-F\left(\hat{x}_{L}, \underline{\hat{x}}\right)}{\tilde{f}\left(\hat{x}_{L}\right)}\right]^{\top} $$
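For a general $\tilde{F}$ without an invertible closed form, this Newton step can be iterated numerically. A minimal sketch for equally weighted, sorted locations, where $F(\hat{x}_{i}, \underline{\hat{x}}) = \frac{2i-1}{2L}$; the standard normal serves here purely as an example target, and the function name is illustrative:

```python
from statistics import NormalDist

def newton_dirac_1d(F_tilde, f_tilde, x_init, iterations=50):
    """Iterate x̂_i ← x̂_i − (F̃(x̂_i) − (2i−1)/(2L)) / f̃(x̂_i)
    for L equally weighted Dirac locations, re-sorting each sweep."""
    L = len(x_init)
    x = sorted(x_init)
    for _ in range(iterations):
        x = sorted(xi - (F_tilde(xi) - (2 * i + 1) / (2 * L)) / f_tilde(xi)
                   for i, xi in enumerate(x))
    return x

std = NormalDist()
x = newton_dirac_1d(std.cdf, std.pdf, [-1.0, 0.5, 2.0])
# converges to the quantiles F̃⁻¹(1/6), F̃⁻¹(1/2), F̃⁻¹(5/6)
```

For the standard normal this reproduces the analytic solution; the iteration is useful when $\tilde{F}^{-1}$ is unavailable.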

Extension to Multivariate Distributions

  • Extension to multivariate case is not trivial

    • Multivariate cumulative distributions have less convenient properties 🤪
  • Distinguish several classes of multivariate methods

    • Methods that generalize concept of CDF
    • Methods that perform reduction to univariate case
    • Kernel-based methods
    • Continuous flow between density approximations

Challenge of Multivariate Cumulative Distributions

Definition for 2D:

$$ F(u, v)=\int_{-\infty}^{u} \int_{-\infty}^{v} f(x, y)\, \mathrm{d} y\, \mathrm{d} x $$

However, $F(u, v)$ is asymmetric and definition is not unique.


Alternative definitions:

$$ \begin{aligned} &F(u, v)=\int_{-\infty}^{u} \int_{v}^{\infty} f(x, y)\, \mathrm{d} y\, \mathrm{d} x \\ &F(u, v)=\int_{u}^{\infty} \int_{v}^{\infty} f(x, y)\, \mathrm{d} y\, \mathrm{d} x \\ &F(u, v)=\int_{u}^{\infty} \int_{-\infty}^{v} f(x, y)\, \mathrm{d} y\, \mathrm{d} x \end{aligned} $$
  • Any three of these CDFs are independent; the fourth is determined by them.

For general $N$-dimensional random vectors there are $2^N$ different variants, of which $2^N - 1$ are independent

$\rightarrow$ exponentially complex!

  • Use in statistical tests difficult
  • Results differ depending on variant

Thus, we require generalization of concept of CDF. 💪

Localized Cumulative Distributions (LCDs)

Univariate case (1D)

💡 Key idea

  • Compare local probability masses of $\tilde{f}(x)$ and $f(x)$
  • Integrate over intervals at all positions $m$ and all widths $b$

Compare $\tilde{A}(m, b)$ and $A(m,b), \forall m, b$

Symmetric, unique, but redundant…

Multivariate case (2D)


Different kernels possible

  • Rectangular kernels
  • Gaussian kernels
  • Anisotropic vs. isotropic kernels
  • Separable vs. inseparable kernels

Cumulative Transformation of Densities

Given

  • Random vector $\underline{x} \in \mathbb{R}^N$
  • Probability density function $f(\underline{x}): \mathbb{R}^N \to \mathbb{R}_+$

Localized Cumulative Distribution (LCD):

$$ F(\underline{m}, b)=\int_{\mathbb{R}^{N}} f(\underline{x}) K(\underline{x}-\underline{m}, b) \mathrm{d} \underline{x} $$
  • $K(\cdot, \cdot)$: Kernel
  • $\underline{m}$: Kernel location
  • $b$: Kernel width

Specific kernel employed:

$$ K(\underline{x}-\underline{m}, b)=\prod_{k=1}^{N} \exp \left(-\frac{1}{2} \frac{\left(x_{k}-m_{k}\right)^{2}}{b^{2}}\right) $$
  • Separable (i.e., in the form of a product)
  • Isotropic (i.e., same width in each direction)
  • Gaussian
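For a Dirac mixture, the integral defining the LCD collapses by the sifting property of the Dirac delta: each component contributes the kernel value at its location. A sketch with the separable isotropic Gaussian kernel above (the function name is illustrative):

```python
import math

def lcd_dirac_mixture(weights, locations, m, b):
    """F(m, b) = Σ_i w_i · Π_k exp(−(x̂_{i,k} − m_k)² / (2 b²))."""
    return sum(
        w * math.prod(math.exp(-0.5 * ((xk - mk) / b) ** 2)
                      for xk, mk in zip(x, m))
        for w, x in zip(weights, locations)
    )

# Single Dirac at the kernel centre: kernel value is 1
lcd_dirac_mixture([1.0], [[0.0, 0.0]], [0.0, 0.0], 1.0)   # 1.0
```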

Properties of LCD:

  • Symmetric
  • Unique
  • Multivariate

Examples

  • LCD (Rectangular Kernel)

  • LCD (Gaussian Kernel)


Generalized Cramér–von Mises Distance (GCvD)

Given:

  • LCD of given continuous density $\tilde{F}(\underline{m}, b)$
  • LCD of Dirac mixture $F(\underline{m}, b)$

Definition:

$$ D=\int_{\mathbb{R}_{+}} w(b) \int_{\mathbb{R}^{N}}(\tilde{F}(\underline{m}, b)-F(\underline{m}, b))^{2} \mathrm{~d} \underline{m} \mathrm{~d} b $$

Minimization of GCvD:

  • For many Dirac components → high-dimensional optimization problem
  • Gradient available, Hessian more difficult
  • Use Quasi-Newton method: L-BFGS

Projected Cumulative Distributions (PCD)

Options for Reduction to Univariate Case

Reapproximation methods for the univariate case are readily available. How can we use univariate methods in the multivariate case?

Solution

  • Approximation on principal axis of PDF

    • Limited to densities where principal axis can be defined
    • Examples: Gaussian PDF, Bingham PDF on sphere
    • Does not cover the entire density 🤪
  • Cartesian product of 1D approximation 👎

    • Curse of dimensionality (as very similar to grid)
    • Only for product densities (or rotations thereof)
    • Inefficient coverage
  • Representing PDFs by all one-dimensional projections (a.k.a. the Radon transform) 👍

    1. Represent the two densities $\tilde{f}(\underline{x})$ and $f(\underline{x})$ by infinite set of one-dimensional projections

      • Projections onto all unit vectors $\underline{u} \in \mathbb{S}^{N-1}$

      • We obtain sets of projections $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$, where $r$ is the coordinate along the unit vector

      • These are the Radon transforms of $\tilde{f}(\underline{x})$ and $f(\underline{x})$

    2. We compare the sets of projections $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ for every $\underline{u} \in \mathbb{S}^{N-1}$

      • For comparison, we use the univariate cumulative distribution functions $\tilde{F}(r \mid \underline{u})$ and $F(r \mid \underline{u})$

      • These are unique, well defined, and easy to calculate 👏

      • Resulting distance measures

        $$ D_{1}(\underline{u})=D(\tilde{f}(r \mid \underline{u}), f(r \mid \underline{u})) $$

        depend on the projection vector $\underline{u}$

    3. We integrate these one-dimensional distance measures $D_1(\underline{u})$ over all unit vectors $\underline{u} \in \mathbb{S}^{N-1}$

      • This gives multivariate distance measure $D(\tilde{f}(\underline{x}), f(\underline{x}))$
      • Typically a discretized subset of $\underline{u} \in \mathbb{S}^{N-1}$ is used
      • Distance measure minimized via univariate Newton updates

Radon Transform

  • Represent general $N$-dimensional probability density functions via the set of all one-dimensional projections

  • Linear projection of random vector $\underline{\boldsymbol{x}} \in \mathbb{R}^{N}$ to scalar random variable $\boldsymbol{r} \in \mathbb{R}$ onto line described by unit vector $\underline{u} \in \mathbb{S}^{N-1}$

    $$ \boldsymbol{r} = \underline{u}^\top \underline{\boldsymbol{x}} $$
  • Given probability density function $f(\underline{x})$ of random vector $\underline{\boldsymbol{x}}$, density $f_r(r \mid \underline{u})$ of $\boldsymbol{r}$ is given by

    $$ f_{r}(r \mid \underline{u})=\int_{\mathbb{R}^{N}} f(\underline{t}) \delta\left(r-\underline{u}^{\top} \underline{t}\right) \mathrm{d} \underline{t} $$
  • $f_r(r \mid \underline{u})$ is Radon transform of $f(\underline{x})$ for all $\underline{u} \in \mathbb{S}^{N-1}$

Visualization: projecting the density onto the line spanned by the unit vector $\underline{u}$ (i.e., integrating over all directions orthogonal to $\underline{u}$) yields the one-dimensional projected density $f_r(r \mid \underline{u})$.

Dirac Mixture Densities

$$ f(\underline{x} \mid \hat{\mathbf{X}})=\sum_{i=1}^{L} w_{i} \delta\left(\underline{x}-\underline{\hat{x}}_{i}\right) $$

with

$$ \hat{\mathbf{X}}=\left[\underline{\hat{x}}_{1}, \underline{\hat{x}}_{2}, \ldots, \underline{\hat{x}}_{L}\right] $$

Radon transform is given by

$$ f_{r}(r \mid \underline{\hat{r}}, \underline{u})=\sum_{i=1}^{L} w_{i} \delta\left(r-\underline{u}^{\top} \underline{\hat{x}}_{i}\right)=\sum_{i=1}^{L} w_{i} \delta\left(r-\hat{r}_{i}(\underline{u})\right) $$
  • $\hat{r}_{i}(\underline{u})=\underline{u}^{\top} \underline{\hat{x}}_{i},\; i=1, \ldots, L$ are the projected Dirac locations

Collect projected Dirac locations $\hat{r}_{i}(\underline{u})$ in vector

$$ \underline{\hat{r}}=\left[\hat{r}_{1}(\underline{u}), \hat{r}_{2}(\underline{u}), \ldots, \hat{r}_{L}(\underline{u})\right]^{\top} $$
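Computing the projected Dirac locations $\hat{r}_i(\underline{u})$ is one dot product per component; a minimal sketch (the function name is illustrative):

```python
def project_locations(locations, u):
    """Projected Dirac locations r̂_i(u) = uᵀ x̂_i."""
    return [sum(uk * xk for uk, xk in zip(u, x)) for x in locations]

project_locations([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 0.0])
# → [1.0, 0.0, 1.0]
```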

Gaussian Densities

  • For a Gaussian density $f(\underline{x})$ with mean vector $\underline{\hat{x}}$ and covariance matrix $\mathbf{C}_x$, the density $f_r(r \mid \underline{u})$ resulting from the projection is also Gaussian

  • Its mean $\hat{r}(\underline{u})$ can simply be calculated by taking the expected value

    $$ \hat{r}(\underline{u})=\mathrm{E}\{\boldsymbol{r}(\underline{u})\}=\mathrm{E}\left\{\underline{u}^{\top} \underline{\boldsymbol{x}}\right\}=\underline{u}^{\top} \underline{\hat{x}} $$
  • Its variance $\sigma_{r}^{2}(\underline{u})$ is given by

    $$ \sigma_{r}^{2}(\underline{u})=\mathrm{E}\left\{(\boldsymbol{r}(\underline{u})-\hat{r}(\underline{u}))^{2}\right\}=\mathrm{E}\left\{\left(\underline{u}^{\top} \boldsymbol{x}-\underline{u}^{\top} \underline{\hat{x}}\right)^{2}\right\}=\mathrm{E}\left\{\underline{u}^{\top}(\boldsymbol{x}-\underline{\hat{x}})(\boldsymbol{x}-\underline{\hat{x}})^{\top} \underline{u}\right\}=\underline{u}^{\top} \mathbf{C}_{x} \underline{u} $$

Gaussian Mixture Densities

  • For $N$-dimensional Gaussian mixture densities $f(\underline{x})$ of the form

    $$ f(\underline{x})=\sum_{i=1}^{M} w_{i} \frac{1}{\sqrt{(2 \pi)^{N}\left|\mathbf{C}_{x, i}\right|}} \exp \left(-\frac{1}{2}\left(\underline{x}-\underline{\hat{x}}_{i}\right)^{\top} \mathbf{C}_{x, i}^{-1}\left(\underline{x}-\underline{\hat{x}}_{i}\right)\right) $$

    the density $f_r(r \mid \underline{u})$ is also a Gaussian mixture

  • Due to the linearity of the projection operator, it is given by

    $$ f_{r}(r \mid \underline{u})=\sum_{i=1}^{M} w_{i} \frac{1}{\sqrt{2 \pi} \sigma_{r, i}(\underline{u})} \exp \left(-\frac{1}{2} \frac{\left(r-\hat{r}_{i}(\underline{u})\right)^{2}}{\sigma_{r, i}^{2}(\underline{u})}\right) $$

    with

    $$ \hat{r}_{i}(\underline{u})=\underline{u}^{\top} \underline{\hat{x}}_{i} $$

    and

    $$ \sigma_{r, i}(\underline{u})=\sqrt{\underline{u}^{\top} \mathbf{C}_{x, i} \underline{u}} $$
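Computing the projected mixture parameters is a direct transcription of these formulas; a sketch (the function name is illustrative), with covariance matrices given as nested lists:

```python
import math

def project_gaussian_mixture(weights, means, covs, u):
    """Map each component (w_i, x̂_i, C_i) to its 1D projection
    (w_i, r̂_i = uᵀ x̂_i, σ_{r,i} = sqrt(uᵀ C_i u))."""
    N = len(u)
    out = []
    for w, mean, C in zip(weights, means, covs):
        r_hat = sum(u[k] * mean[k] for k in range(N))
        var = sum(u[j] * C[j][k] * u[k] for j in range(N) for k in range(N))
        out.append((w, r_hat, math.sqrt(var)))
    return out

# Single 2D component, identity covariance, projected onto the x-axis:
project_gaussian_mixture([1.0], [[2.0, 3.0]], [[[1.0, 0.0], [0.0, 1.0]]],
                         [1.0, 0.0])
# → [(1.0, 2.0, 1.0)]
```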

Multivariate Cramér-von Mises Distance

Multivariate distance measure between two continuous and/or discrete probability density functions

  1. One-dimensional Projections via Radon Transform

    Given density $\tilde{f}(\underline{x})$ and its approximation $f(\underline{x})$, represented by their Radon transforms $\tilde{f}(r \mid \underline{u})$ and $f(r \mid \underline{u})$ (i.e., by their 1D projections onto unit vectors $\underline{u} \in \mathbb{S}^{N-1}$)

  2. One-dimensional Cumulative Distributions

    Based on Radon transform $\tilde{f}(r \mid \underline{u})$, calculate one-dimensional cumulative distributions of the projected densities as

    $$ \tilde{F}(r \mid \underline{u})=\int_{-\infty}^{r} \tilde{f}(t \mid \underline{u}) \mathrm{d} t $$

    and similarly for $F(r \mid \underline{u})$

    • Example: For Dirac mixture approximation, cumulative distribution function of its Radon transform is given by

      $$ F(r \mid \underline{\hat{r}}, \underline{u})=\sum_{i=1}^{L} w_{i} \mathrm{H}\left(r-\hat{r}_{i}(\underline{u})\right) $$
  3. One-dimensional Distance

    • For comparing the one-dimensional projections, we compare their cumulative distributions $\tilde{F}(r \mid \underline{u})$ and $F(r \mid \underline{\hat{r}}, \underline{u})$ for all $\underline{u} \in \mathbb{S}^{N-1}$

    • As distance measure use integral squared distance

      $$ D_{1}(\underline{\hat{r}}, \underline{u})=\int_{\mathbb{R}}[\tilde{F}(r \mid \underline{u})-F(r \mid \underline{\hat{r}}, \underline{u})]^{2} \mathrm{~d} r $$
    • Gives distance between the projected densities in the direction of the unit vector $\underline{u}$ for all $\underline{u}$

  4. One-dimensional Newton Step

    • Newton step can now be written as
    $$ \Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})=-\mathbf{H}^{-1}(\underline{\hat{r}}, \underline{u}) \underline{G}(\underline{\hat{r}}, \underline{u}) $$

    with

    $$ \begin{aligned} \underline{G}(\underline{\hat{r}}, \underline{u})&=\left[\frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{1}}, \frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{2}}, \ldots, \frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{L}}\right]^{\top} \\ \frac{\partial D_{1}(\underline{\hat{r}}, \underline{u})}{\partial \hat{r}_{i}}&=2 w_{i}\left[\tilde{F}\left(\hat{r}_{i} \mid \underline{u}\right)-F\left(\hat{r}_{i} \mid \underline{\hat{r}}, \underline{u}\right)\right] \end{aligned} $$
    • Hessian $\mathbf{H}(\underline{\hat{r}}, \underline{u})$ is given by

      $$ \mathbf{H}(\underline{\hat{r}}, \underline{u})=2 \operatorname{diag}\left(\left[w_{1} \tilde{f}\left(\hat{r}_{1} \mid \underline{u}\right), w_{2} \tilde{f}\left(\hat{r}_{2} \mid \underline{u}\right), \ldots, w_{L} \tilde{f}\left(\hat{r}_{L} \mid \underline{u}\right)\right]\right) $$
    • $\Rightarrow$ Resulting Newton step

      $$ \Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})=-\left[\frac{\tilde{F}\left(\hat{r}_{1} \mid \underline{u}\right)-F\left(\hat{r}_{1} \mid \underline{\hat{r}}, \underline{u}\right)}{\tilde{f}\left(\hat{r}_{1} \mid \underline{u}\right)}, \ldots, \frac{\tilde{F}\left(\hat{r}_{L} \mid \underline{u}\right)-F\left(\hat{r}_{L} \mid \underline{\hat{r}}, \underline{u}\right)}{\tilde{f}\left(\hat{r}_{L} \mid \underline{u}\right)}\right]^{\top} $$
  5. Backprojection to $N$-dimensional space

    • For specific projection vector $\underline{u}$: Newton update $\Delta \underline{\hat{r}}(\underline{\hat{r}}, \underline{u})$

    • Backprojection into original $N$-dimensional space: Update can be used to modify original Dirac locations in direction along the vector $\underline{u}$

    • For every location vector $\underline{\hat{x}}_i$ we obtain

      $$ \Delta \underline{\hat{x}}_{i}(\underline{u})=\Delta \hat{r}_{i}(\underline{\hat{r}}, \underline{u}) \cdot \underline{u} $$
  6. Assemble Multivariate Distance

    • Individual 1D distances $D_1(\underline{\hat{r}}, \underline{u})$ can be assembled to form a multivariate distance measure

    • Performed by integrating over all 1D distances depending on unit vector $\underline{u}$

      $$ D_{N}(\hat{\mathbf{X}})=\int_{\mathbb{S}^{N-1}} D_{1}(\underline{\hat{r}}, \underline{u}) \mathrm{d} \underline{u} $$
    • Plugging in $D_1(\underline{\hat{r}}, \underline{u})$:

      $$ D_{N}(\hat{\mathbf{X}})=\int_{\mathbb{S}^{N-1}} \int_{\mathbb{R}}[\tilde{F}(r \mid \underline{u})-F(r \mid \underline{\hat{r}}, \underline{u})]^{2} \mathrm{~d} r \mathrm{~d} \underline{u} \quad \text { with } r=\underline{u}^{\top} \underline{x} $$
  7. Perform Full Newton Update

    Full Newton update by integrating over all partial updates along projection vectors $\underline{u}$

    $$ \Delta \underline{\hat{x}}_{i}=\int_{\mathbb{S}^{N-1}} \Delta \underline{\hat{x}}_{i}(\underline{u}) \mathrm{d} \underline{u} $$

Discretization of Space of Unit Vectors

  • In practice, the space $\mathbb{S}^{N-1}$ containing the unit vectors $\underline{u}$ has to be discretized

  • Two options are available for performing the discretization:

    • Deterministic discretization, e.g., by calculating a grid
    • Random discretization by drawing uniform samples from the hypersphere
  • For both cases: Consider $K$ samples $\underline{u}_k$

  • Integration reduces to summation

    $$ \Delta \underline{\hat{x}}_{i} \approx \frac{1}{K} \sum_{k=1}^{K} \Delta \underline{\hat{x}}_{i}\left(\underline{u}_{k}\right) \quad \text { for } i=1,2, \ldots, L $$
  • Stopping criterion

    • Given initial locations of the Dirac components

    • Full Newton updates are performed until the maximum change over all location vectors

      $$ \max _{i}\left\|\Delta \underline{\hat{x}}_{i}\right\| $$

      falls below a given threshold

Complete Algorithm (Randomized Variant)

The complete randomized algorithm draws $K$ random unit vectors per sweep, performs the univariate Newton update on the projected Dirac locations for each, backprojects and averages the updates, and repeats until the maximum location change falls below the threshold.
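The randomized variant can be sketched end to end. The following is a simplified illustration, not the exact algorithm from the slides: it assumes equal weights, takes a standard normal target (so every projection is again $\mathcal{N}(0,1)$), uses random unit vectors for the discretization, runs a fixed number of sweeps instead of the stopping criterion, and clips the Newton step for numerical robustness (an added safeguard, not part of the derivation); the function name is illustrative:

```python
import math
import random
from statistics import NormalDist

def pcd_approximate_std_normal(L, dim=2, K=30, sweeps=150, seed=0):
    """Dirac mixture approximation of a standard normal N(0, I) in
    `dim` dimensions via projected cumulative distributions (PCD)."""
    rng = random.Random(seed)
    std = NormalDist()  # every projection of N(0, I) is again N(0, 1)
    # Initialize locations with random draws from the target
    X = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(L)]
    for _ in range(sweeps):
        delta = [[0.0] * dim for _ in range(L)]
        for _ in range(K):
            # Random unit vector u, uniform on the hypersphere S^{dim-1}
            u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            norm = math.sqrt(sum(c * c for c in u))
            u = [c / norm for c in u]
            # Projected Dirac locations r̂_i = uᵀ x̂_i
            r = [sum(uk * xk for uk, xk in zip(u, x)) for x in X]
            # Sorted, equal weights: F(r̂_i | r̂, u) = (2·(rank+1) − 1)/(2L),
            # with rank 0-based
            for rank, i in enumerate(sorted(range(L), key=lambda j: r[j])):
                dr = -(std.cdf(r[i]) - (2 * rank + 1) / (2 * L)) / std.pdf(r[i])
                dr = max(-0.5, min(0.5, dr))  # damping: added safeguard
                # Backprojection along u, averaged over the K directions
                for k in range(dim):
                    delta[i][k] += dr * u[k] / K
        X = [[xk + dk for xk, dk in zip(x, d)] for x, d in zip(X, delta)]
    return X
```

Larger `L` and `K` tighten the approximation; in practice the loop would terminate via the stopping criterion above instead of a fixed sweep count.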