# Math Basics

## Linear Algebra

### Vectors

Vector: multi-dimensional quantity

- Each dimension contains different information (e.g. age, weight, height, …)
- Vectors are represented as bold symbols
- A vector $\boldsymbol{x}$ is always a column vector:

$$
\boldsymbol{x}=\left[\begin{array}{l}
1 \\
2 \\
4
\end{array}\right]
$$
- A transposed vector $\boldsymbol{x}^T$ is a row vector:

$$
\boldsymbol{x}^{T}=\left[\begin{array}{lll}
1 & 2 & 4
\end{array}\right]
$$
#### Vector Operations

Inner product, e.g. for $\boldsymbol{v}=[1, 2, 4]^T$ and $\boldsymbol{w}=[2, 4, 8]^T$:

$$
\langle \boldsymbol{v}, \boldsymbol{w}\rangle = 1 \cdot 2+2 \cdot 4+4 \cdot 8=42
$$

Length of a vector: square root of the inner product with itself

$$
\|\boldsymbol{v}\|=\langle\boldsymbol{v}, \boldsymbol{v}\rangle^{\frac{1}{2}}=\left(1^{2}+2^{2}+4^{2}\right)^{\frac{1}{2}}=\sqrt{21}
$$
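The inner product and the length can be checked with a few lines of plain Python. This is only a minimal sketch: the helper names `inner` and `norm` are illustrative, and the vectors $\boldsymbol{v}=[1,2,4]^T$ and $\boldsymbol{w}=[2,4,8]^T$ are the values implied by the arithmetic above.

```python
import math


def inner(v, w):
    """Inner product <v, w> of two vectors of equal dimensionality."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimensionality")
    return sum(vi * wi for vi, wi in zip(v, w))


def norm(v):
    """Length of v: square root of the inner product of v with itself."""
    return math.sqrt(inner(v, v))


# Values assumed from the worked example in the text.
v = [1, 2, 4]
w = [2, 4, 8]

print(inner(v, w))  # 1*2 + 2*4 + 4*8 = 42
print(norm(v))      # sqrt(21), roughly 4.58
```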
### Matrices

Matrix: rectangular array of numbers arranged in rows and columns

- Matrices are denoted with bold upper-case letters

$$
\boldsymbol{X}=\left[\begin{array}{ll}1 & 3 \\ 2 & 3 \\ 4 & 7\end{array}\right]
$$
- Dimension: $\#\text{rows} \times \#\text{columns}$ (e.g. $\boldsymbol{X} \in \mathbb{R}^{3 \times 2}$)
- Vectors are special cases of matrices:

$$
\boldsymbol{x}^{T}=\underbrace{\left[\begin{array}{ccc}1 & 2 & 4\end{array}\right]}_{1 \times 3 \text{ matrix}}
$$
#### Matrices in ML

#### Matrix Operations

- Multiplication with a scalar:

$$
3 \boldsymbol{M}=3\left[\begin{array}{lll}3 & 4 & 5 \\ 1 & 0 & 1\end{array}\right]=\left[\begin{array}{ccc}9 & 12 & 15 \\ 3 & 0 & 3\end{array}\right]
$$
- Addition of matrices:

$$
\boldsymbol{M}+\boldsymbol{N}=\left[\begin{array}{lll}3 & 4 & 5 \\ 1 & 0 & 1\end{array}\right]+\left[\begin{array}{lll}1 & 2 & 1 \\ 3 & 1 & 1\end{array}\right]=\left[\begin{array}{lll}4 & 6 & 6 \\ 4 & 1 & 2\end{array}\right]
$$
- Transpose:

$$
\boldsymbol{M}=\left[\begin{array}{lll}3 & 4 & 5 \\ 1 & 0 & 1\end{array}\right], \quad \boldsymbol{M}^{T}=\left[\begin{array}{ll}3 & 1 \\ 4 & 0 \\ 5 & 1\end{array}\right]
$$
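The three operations above can be reproduced in plain Python with a matrix stored as a list of rows. This is a rough sketch rather than a reference implementation, and the helper names `scale`, `add` and `transpose` are illustrative choices.

```python
def scale(c, M):
    """Multiply every entry of the matrix M by the scalar c."""
    return [[c * entry for entry in row] for row in M]


def add(M, N):
    """Entry-wise sum of two matrices of the same dimension."""
    if len(M) != len(N) or any(len(r) != len(s) for r, s in zip(M, N)):
        raise ValueError("matrices must have the same dimension")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(M, N)]


def transpose(M):
    """Swap rows and columns: entry (i, j) moves to (j, i)."""
    return [list(column) for column in zip(*M)]


# The matrices from the worked examples in the text.
M = [[3, 4, 5], [1, 0, 1]]
N = [[1, 2, 1], [3, 1, 1]]

print(scale(3, M))   # [[9, 12, 15], [3, 0, 3]]
print(add(M, N))     # [[4, 6, 6], [4, 1, 2]]
print(transpose(M))  # [[3, 1], [4, 0], [5, 1]]
```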
- Matrix-vector product (the vector needs to have the same dimensionality as the number of columns of the matrix):

$$
\underbrace{\left[\boldsymbol{w}_{1}, \ldots, \boldsymbol{w}_{n}\right]}_{\boldsymbol{W}} \underbrace{\left[\begin{array}{c}v_{1} \\ \vdots \\ v_{n}\end{array}\right]}_{\boldsymbol{v}}=\underbrace{\left[\begin{array}{c}v_{1} \boldsymbol{w}_{1}+\cdots+v_{n} \boldsymbol{w}_{n}\end{array}\right]}_{\boldsymbol{u}}
$$
E.g.:

$$
\boldsymbol{u}=\boldsymbol{W} \boldsymbol{v}=\left[\begin{array}{ccc}3 & 4 & 5 \\ 1 & 0 & 1\end{array}\right]\left[\begin{array}{l}1 \\ 0 \\ 2\end{array}\right]=\left[\begin{array}{l}3 \cdot 1+4 \cdot 0+5 \cdot 2 \\ 1 \cdot 1+0 \cdot 0+1 \cdot 2\end{array}\right]=\left[\begin{array}{c}13 \\ 3\end{array}\right]
$$
💡 Think of it as: we sum over the columns $\boldsymbol{w}_i$ of $\boldsymbol{W}$, weighted by $v_i$

$$
\boldsymbol{u}=v_{1} \boldsymbol{w}_{1}+\cdots+v_{n} \boldsymbol{w}_{n}=1\left[\begin{array}{l}3 \\ 1\end{array}\right]+0\left[\begin{array}{l}4 \\ 0\end{array}\right]+2\left[\begin{array}{l}5 \\ 1\end{array}\right]=\left[\begin{array}{c}13 \\ 3\end{array}\right]
$$
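Both readings of the matrix-vector product, row by row and as a weighted sum of columns, give the same result. The sketch below illustrates this in plain Python; `matvec_rows` and `matvec_columns` are hypothetical helper names, and the matrix is assumed to be stored as a list of rows.

```python
def matvec_rows(W, v):
    """Row view: entry i of u is the inner product of row i of W with v."""
    if any(len(row) != len(v) for row in W):
        raise ValueError("v must have as many entries as W has columns")
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]


def matvec_columns(W, v):
    """Column view: u is the sum of the columns w_i of W, weighted by v_i."""
    if any(len(row) != len(v) for row in W):
        raise ValueError("v must have as many entries as W has columns")
    u = [0] * len(W)
    for i, vi in enumerate(v):
        column = [row[i] for row in W]  # column w_i of W
        u = [uk + vi * ck for uk, ck in zip(u, column)]
    return u


# The worked example from the text.
W = [[3, 4, 5], [1, 0, 1]]
v = [1, 0, 2]

print(matvec_rows(W, v))     # [13, 3]
print(matvec_columns(W, v))  # [13, 3]
```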
#### Important Special Cases

## Calculus

“The derivative of a function of a real variable measures the sensitivity to change of a quantity (a function value or dependent variable) which is determined by another quantity (the independent variable)”
| | Scalar | Vector |
| --- | --- | --- |
| Function | $f(x)$ | $f(\boldsymbol{x})$ |
| Derivative | $\frac{\partial f(x)}{\partial x}=g$ | $\frac{\partial f(\boldsymbol{x})}{\partial \boldsymbol{x}}=\left[\frac{\partial f(\boldsymbol{x})}{\partial x_{1}}, \ldots, \frac{\partial f(\boldsymbol{x})}{\partial x_{d}}\right]^{T} =: \nabla f(\boldsymbol{x})$ (gradient of function $f$ at $\boldsymbol{x}$) |
| Min/Max | $\frac{\partial f(x)}{\partial x}=0$ | $\frac{\partial f(\boldsymbol{x})}{\partial \boldsymbol{x}}=[0, \ldots, 0]^{T}=\mathbf{0}$ |
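To make the gradient and the min/max condition concrete, the sketch below approximates each partial derivative with a central difference. The example function `f` and the step size `h` are assumptions chosen for illustration; they are not part of the notes.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        # Partial derivative of f with respect to x_i.
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function with its minimum at x = [1, -2].
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2


print(numerical_gradient(f, [3.0, 0.0]))   # close to [4, 4]
print(numerical_gradient(f, [1.0, -2.0]))  # close to [0, 0] at the minimum
```

The second call shows the min/max condition from the table: at the minimum, every entry of the gradient is (numerically) zero.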
### Matrix Calculus

| | Scalar | Vector |
| --- | --- | --- |
| Linear | $\frac{\partial a x}{\partial x}=a$ | $\nabla_{\boldsymbol{x}} \boldsymbol{A} \boldsymbol{x}=\boldsymbol{A}^{T}$ |
| Quadratic | $\frac{\partial x^{2}}{\partial x}=2 x$ | $\nabla_{\boldsymbol{x}} \boldsymbol{x}^{T} \boldsymbol{x}=2 \boldsymbol{x}$, and $\nabla_{\boldsymbol{x}} \boldsymbol{x}^{T} \boldsymbol{A} \boldsymbol{x}=2 \boldsymbol{A} \boldsymbol{x}$ for symmetric $\boldsymbol{A}$ (in general $(\boldsymbol{A}+\boldsymbol{A}^{T}) \boldsymbol{x}$) |
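The quadratic rule can be checked numerically. The sketch below compares a finite-difference gradient of $\boldsymbol{x}^{T} \boldsymbol{A} \boldsymbol{x}$ with $(\boldsymbol{A}+\boldsymbol{A}^{T}) \boldsymbol{x}$ and with $2 \boldsymbol{A} \boldsymbol{x}$. The matrix `A` and the point `x` are made-up values, with `A` chosen to be non-symmetric on purpose, so that the two formulas disagree; for a symmetric matrix they coincide.

```python
def quadratic_form(A, x):
    """Evaluate x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x with central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Assumed example values; A is deliberately not symmetric.
A = [[1.0, 2.0], [0.0, 3.0]]
x = [1.0, 2.0]
A_plus_At = [[A[i][j] + A[j][i] for j in range(2)] for i in range(2)]

print(numerical_gradient(lambda z: quadratic_form(A, z), x))  # close to [6, 14]
print(matvec(A_plus_At, x))           # [6.0, 14.0]
print([2 * u for u in matvec(A, x)])  # [10.0, 12.0], differs since A is not symmetric
```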