Deep Learning

59_Generative Classifiers(2)

elif 2024. 1. 28. 09:11

This post continues the discussion of generative classifiers from the previous post.


Suppose we have a dataset of observations together with their class labels. Once a parametric form for the class-conditional densities has been specified, the values of its parameters, along with the prior class probabilities, can be determined using maximum likelihood.

First, let's assume there are two classes. Each class ${C_k}$ has a Gaussian class-conditional density with its own mean ${\mu _k}$ and a shared covariance matrix ${\Bbb C}$, and the dataset is $\{ {{\text{x}}_n},{t_n}\} $ with $n = 1, \ldots ,N$. Here, ${t_n}=1$ represents ${C_1}$, and ${t_n}=0$ represents ${C_2}$. The prior class probabilities are $p({C_1}) = \pi $ and $p({C_2}) = 1 - \pi $. For a data point ${{\text{x}}_n}$ from class ${C_1}$ we have ${t_n}=1$, so its joint density with the class is as follows.

$$p({{\text{x}}_n},{C_1}) = p({C_1})\,p({{\text{x}}_n}|{C_1}) = \pi \,{\mathcal N}({{\text{x}}_n}|{\mu _1},{\Bbb C})$$

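Similarly, for a data point from class ${C_2}$ we have ${t_n}=0$, and with ${\mu _2}$ denoting the mean of that class,

$$p({{\text{x}}_n},{C_2}) = p({C_2})\,p({{\text{x}}_n}|{C_2}) = (1 - \pi )\,{\mathcal N}({{\text{x}}_n}|{\mu _2},{\Bbb C})$$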

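Therefore, writing ${\text{X}}$ for the matrix whose rows are the inputs ${\text{x}}_n^T$, the likelihood function in terms of the prior $\pi$, the class means ${\mu _1}$ and ${\mu _2}$, and the shared covariance ${\Bbb C}$ is as follows.

$$p({\text{t}},{\text{X}}|\pi ,{\mu _1},{\mu _2},{\Bbb C}) = \prod\limits_{n = 1}^N {\left[ {\pi \,{\mathcal N}({{\text{x}}_n}|{\mu _1},{\Bbb C})} \right]^{{t_n}}}{\left[ {(1 - \pi )\,{\mathcal N}({{\text{x}}_n}|{\mu _2},{\Bbb C})} \right]^{1 - {t_n}}}$$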

Here ${\text{t}} = {({t_1}, \ldots ,{t_N})^T}$. As previously explained, it is more convenient to maximize the log of the likelihood function. The terms of the log likelihood that depend on $\pi$ are as follows.

$$\sum\limits_{n = 1}^N {\left\{ {{t_n}\ln \pi + (1 - {t_n})\ln (1 - \pi )} \right\}} $$

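Setting the derivative with respect to $\pi$ to 0 and rearranging gives the following.

$$\pi = \frac{1}{N}\sum\limits_{n = 1}^N {{t_n}} = \frac{{{N_1}}}{N} = \frac{{{N_1}}}{{{N_1} + {N_2}}}$$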

Here ${N_1}$ and ${N_2}$ denote the number of data points belonging to classes ${C_1}$ and ${C_2}$, respectively. The maximum likelihood estimate for $\pi$ is therefore simply the fraction of data points in ${C_1}$. The same reasoning extends to more than two classes: the maximum likelihood estimate for the prior probability of class ${C_k}$ is the fraction of training set points assigned to that class.
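As a quick numerical check of this result, the following is a minimal sketch, assuming NumPy and a small made-up label vector `t` (the labels and the helper `log_lik_pi` are illustrative, not from the original post). It compares the closed-form estimate ${N_1}/N$ with a brute-force search over the $\pi$-dependent part of the log likelihood.

```python
import numpy as np

# Hypothetical toy labels: t_n = 1 for class C1, t_n = 0 for class C2.
t = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])

N = t.size
N1 = int(t.sum())   # number of points in C1
N2 = N - N1         # number of points in C2

# Closed-form maximum likelihood estimate of the prior p(C1).
pi_ml = N1 / N

def log_lik_pi(pi):
    """Terms of the log likelihood that depend on pi."""
    return N1 * np.log(pi) + N2 * np.log(1.0 - pi)

# Brute-force check: evaluate on a grid and pick the maximizer.
grid = np.linspace(0.01, 0.99, 99)
pi_grid = grid[np.argmax(log_lik_pi(grid))]

print("closed form:", pi_ml)
print("grid search:", pi_grid)
```

Both values should agree up to the grid resolution, which illustrates that the class fraction is indeed the maximizer.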


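Next, collecting the terms of the log likelihood that depend on ${\mu _1}$ yields the following.

$$\sum\limits_{n = 1}^N {{t_n}\ln {\mathcal N}({{\text{x}}_n}|{\mu _1},{\Bbb C})} = - \frac{1}{2}\sum\limits_{n = 1}^N {{t_n}{{({{\text{x}}_n} - {\mu _1})}^T}{{\Bbb C}^{ - 1}}({{\text{x}}_n} - {\mu _1})} + {\text{const}}$$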

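Setting the derivative with respect to ${\mu _1}$ to 0, and repeating the same steps for ${\mu _2}$, results in the following. Each estimate is simply the mean of the input vectors assigned to the corresponding class.

$${\mu _1} = \frac{1}{{{N_1}}}\sum\limits_{n = 1}^N {{t_n}{{\text{x}}_n}} ,\qquad {\mu _2} = \frac{1}{{{N_2}}}\sum\limits_{n = 1}^N {(1 - {t_n}){{\text{x}}_n}} $$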

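The maximum likelihood means are just per-class averages, which can be sketched as follows. This is a minimal illustration assuming NumPy and made-up toy data (`X` and `t` are hypothetical); it computes the means with the label-weighted sums and cross-checks them against a plain per-class mean.

```python
import numpy as np

# Hypothetical toy data: 5 points in 2 dimensions with binary labels.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0],
              [9.0, 10.0]])
t = np.array([1, 0, 1, 0, 0])   # t_n = 1 -> C1, t_n = 0 -> C2

N1 = t.sum()
N2 = (1 - t).sum()

# Label-weighted sums: t_n selects C1 points, (1 - t_n) selects C2 points.
mu1 = (t[:, None] * X).sum(axis=0) / N1
mu2 = ((1 - t)[:, None] * X).sum(axis=0) / N2

# Cross-check against boolean-mask averaging.
assert np.allclose(mu1, X[t == 1].mean(axis=0))
assert np.allclose(mu2, X[t == 0].mean(axis=0))

print("mu1 =", mu1)
print("mu2 =", mu2)
```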
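Finally, consider the maximum likelihood solution for the shared covariance matrix ${\Bbb C}$. Collecting the terms of the log likelihood that depend on ${\Bbb C}$ results in the following.

$$ - \frac{1}{2}\sum\limits_{n = 1}^N {{t_n}\ln |{\Bbb C}|} - \frac{1}{2}\sum\limits_{n = 1}^N {{t_n}{{({{\text{x}}_n} - {\mu _1})}^T}{{\Bbb C}^{ - 1}}({{\text{x}}_n} - {\mu _1})} - \frac{1}{2}\sum\limits_{n = 1}^N {(1 - {t_n})\ln |{\Bbb C}|} - \frac{1}{2}\sum\limits_{n = 1}^N {(1 - {t_n}){{({{\text{x}}_n} - {\mu _2})}^T}{{\Bbb C}^{ - 1}}({{\text{x}}_n} - {\mu _2})} = - \frac{N}{2}\ln |{\Bbb C}| - \frac{N}{2}{\text{Tr}}\left\{ {{{\Bbb C}^{ - 1}}{\text{S}}} \right\}$$

$${\text{S}} = \frac{{{N_1}}}{N}{{\text{S}}_1} + \frac{{{N_2}}}{N}{{\text{S}}_2}$$

$${{\text{S}}_1} = \frac{1}{{{N_1}}}\sum\limits_{n \in {C_1}} {({{\text{x}}_n} - {\mu _1}){{({{\text{x}}_n} - {\mu _1})}^T}} ,\qquad {{\text{S}}_2} = \frac{1}{{{N_2}}}\sum\limits_{n \in {C_2}} {({{\text{x}}_n} - {\mu _2}){{({{\text{x}}_n} - {\mu _2})}^T}} $$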

Using the standard result for the maximum likelihood solution of a Gaussian distribution, we see that ${\Bbb C} = {\text{S}}$, which is a weighted average of the covariance matrices associated with each of the two classes, with each class weighted by its fraction of the data points.
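To make the weighted-average interpretation concrete, here is a minimal sketch, assuming NumPy and synthetic data drawn from two Gaussians that share one covariance (the sample sizes, means, `true_cov`, and the helper `ml_cov` are all illustrative assumptions). It builds the shared covariance from the per-class covariances and cross-checks it against pooling all class-centered points directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: two classes with a common covariance.
true_cov = np.array([[1.0, 0.3],
                     [0.3, 0.5]])
X1 = rng.multivariate_normal([0.0, 0.0], true_cov, size=60)  # class C1
X2 = rng.multivariate_normal([2.0, 1.0], true_cov, size=40)  # class C2

def ml_cov(Xk):
    """Maximum likelihood covariance of one class (divides by N_k, not N_k - 1)."""
    d = Xk - Xk.mean(axis=0)
    return d.T @ d / len(Xk)

N1, N2 = len(X1), len(X2)
N = N1 + N2

S1, S2 = ml_cov(X1), ml_cov(X2)

# Shared covariance: per-class covariances weighted by class fractions.
S = (N1 / N) * S1 + (N2 / N) * S2

# Cross-check: center each class by its own mean, pool, and divide by N.
centered = np.vstack([X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)])
S_pooled = centered.T @ centered / N

print(S)
print(np.allclose(S, S_pooled))
```

Note that the per-class estimates divide by ${N_k}$ rather than ${N_k} - 1$, since these are maximum likelihood estimates rather than unbiased ones.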


