Tag: prml
-
Converting joint distributions to Bayesian networks
In these notes we discuss how to convert a joint distribution into a graph called a Bayesian network, and how the structure of the graph suggests ways to reduce the parameters required to specify the joint.
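To make the parameter-counting point concrete, here is a minimal sketch of my own (not from the post): counting the parameters needed to specify a joint over K binary variables, first as a full table and then under the conditional independences of a simple chain-structured Bayesian network.

```python
# Full joint table p(x_1, ..., x_K) over K binary variables: 2^K - 1 free
# parameters (one per cell, minus one because the table sums to one).
K = 10
full_joint = 2**K - 1

# If the graph is a Markov chain, the joint factorizes as
# p(x_1) * prod_k p(x_k | x_{k-1}); p(x_1) needs 1 parameter and each
# conditional table p(x_k | x_{k-1}) needs 2 (one per parent state).
chain = 1 + 2 * (K - 1)

print(full_joint, chain)  # 1023 vs. 19
```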
-
Notes on Kernel PCA
Following Bishop, I show how to express the eigenvectors of the feature-space covariance matrix in terms of the eigenvectors of the kernel matrix, and how to compute the kernel matrix of the centered features from the uncentered one.
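As a companion to the centering result, here is a quick NumPy sketch of my own (with a linear kernel standing in for a general one) of the identity $\tilde{K} = K - \mathbf{1}_N K - K \mathbf{1}_N + \mathbf{1}_N K \mathbf{1}_N$, where $\mathbf{1}_N$ is the $N \times N$ matrix with every entry equal to $1/N$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = X @ X.T                        # uncentered (linear) kernel matrix

N = K.shape[0]
one_N = np.full((N, N), 1.0 / N)   # the matrix 1_N from the identity
K_centered = K - one_N @ K - K @ one_N + one_N @ K @ one_N

# Sanity check: for a linear kernel this must match centering the
# features directly and recomputing the Gram matrix.
Xc = X - X.mean(axis=0)
assert np.allclose(K_centered, Xc @ Xc.T)
```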
-
An iterative reweighted least squares miracle
I show what’s really happening in the iterative reweighted least squares updates for logistic regression described in PRML 4.3.3.
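For reference, the update in question is $\mathbf{w} \leftarrow \mathbf{w} - (\Phi^T R \Phi)^{-1} \Phi^T (\mathbf{y} - \mathbf{t})$ with $R = \mathrm{diag}(y_n(1 - y_n))$. Here is a minimal NumPy sketch of it (variable names and toy data are my own):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls(Phi, t, n_iters=10):
    """Iterative reweighted least squares for logistic regression."""
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        y = sigmoid(Phi @ w)                # predicted probabilities
        R = y * (1 - y)                     # diagonal of the weight matrix
        H = Phi.T @ (R[:, None] * Phi)      # Hessian Phi^T R Phi
        w = w - np.linalg.solve(H, Phi.T @ (y - t))
    return w

# Toy usage: labels sampled from a logistic model, so the data is
# non-separable and the iteration converges.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
t = (rng.uniform(size=200) < sigmoid(2.0 * X[:, 0] - X[:, 1])).astype(float)
Phi = np.column_stack([np.ones(200), X])    # bias feature
print(irls(Phi, t))   # estimates should land near the generating [0, 2, -1]
```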
-
EM for Factor Analysis
In this note I work out the EM updates for factor analysis, following the presentation in PRML 12.2.4.
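Here is a compact NumPy sketch of those updates as I read them from PRML 12.2.4 (names and the positivity guard are mine):

```python
import numpy as np

def em_factor_analysis(X, M, n_iters=100):
    """EM for factor analysis: X is N x D, W is D x M, Psi is diagonal."""
    N, D = X.shape
    Xc = X - X.mean(axis=0)                  # center the data
    S = (Xc.T @ Xc) / N                      # sample covariance
    W = np.random.default_rng(0).normal(size=(D, M))
    psi = np.ones(D)                         # diagonal of Psi

    for _ in range(n_iters):
        # E-step: posterior moments of the latents,
        # with G = (I + W^T Psi^{-1} W)^{-1}.
        PinvW = W / psi[:, None]             # Psi^{-1} W
        G = np.linalg.inv(np.eye(M) + W.T @ PinvW)
        Ez = Xc @ PinvW @ G                  # row n is E[z_n]
        sum_Ezz = N * G + Ez.T @ Ez          # sum_n E[z_n z_n^T]

        # M-step: re-estimate W and the diagonal noise covariance,
        # guarding the variances to stay positive.
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        psi = np.maximum(np.diag(S - W @ (Ez.T @ Xc) / N), 1e-6)
    return W, psi
```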
-
Automatic Relevance Determination for PPCA
In this note I flesh out the computations for Section 12.2.3 of Bishop’s Pattern Recognition and Machine Learning, where he uses automatic relevance determination to infer the dimensionality of the principal subspace in probabilistic PCA.
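The key re-estimation there is $\alpha_i = D / \|\mathbf{w}_i\|^2$ for each column of $W$, interleaved with EM for probabilistic PCA whose M-step for $W$ gains a ridge term $\sigma^2 A$, $A = \mathrm{diag}(\alpha_i)$. A NumPy sketch of my own reading of those updates:

```python
import numpy as np

def ard_ppca(X, M, n_iters=200):
    """EM for probabilistic PCA with an ARD prior on the columns of W."""
    N, D = X.shape
    Xc = X - X.mean(axis=0)
    W = np.random.default_rng(0).normal(size=(D, M))
    sigma2, alpha = 1.0, np.ones(M)

    for _ in range(n_iters):
        # E-step: standard PPCA posterior moments of the latents.
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
        Ez = Xc @ W @ Minv                   # row n is E[z_n]
        sum_Ezz = N * sigma2 * Minv + Ez.T @ Ez

        # M-step: W gains the ridge term sigma^2 * diag(alpha).
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz + sigma2 * np.diag(alpha))
        sigma2 = (np.sum(Xc**2) - 2 * np.sum(Ez * (Xc @ W))
                  + np.trace(sum_Ezz @ W.T @ W)) / (N * D)

        # ARD re-estimation: a large alpha_i switches off column i,
        # which is how the effective dimensionality gets determined.
        alpha = D / np.maximum(np.sum(W**2, axis=0), 1e-12)
    return W, alpha, sigma2
```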
-
The equivalent kernel for non-zero prior mean
This note is a brief addendum to Section 3.3 of Bishop on Bayesian linear regression. Some of the derivations in that section assume, for simplicity, that the prior mean on the weights is zero. Here we’ll relax this assumption and see what happens to the equivalent kernel.
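If I have the setup right, the punchline can be checked numerically: with a Gaussian prior $\mathcal{N}(\mathbf{m}_0, S_0)$, the predictive mean splits into a prior-mean term plus the usual smoother $\sum_n k(x, x_n)\, t_n$, with the equivalent kernel $k(x, x') = \beta\, \boldsymbol\phi(x)^T S_N \boldsymbol\phi(x')$ taking the same form as in the zero-mean case. A sketch (basis functions, data, and names all mine):

```python
import numpy as np

rng = np.random.default_rng(2)
N, beta = 30, 25.0
x_train = np.sort(rng.uniform(0, 1, N))
t = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=N)

def phi(x):
    """Gaussian basis functions on [0, 1]."""
    centers = np.linspace(0, 1, 9)
    return np.exp(-0.5 * ((np.atleast_1d(x)[:, None] - centers) / 0.1) ** 2)

Phi = phi(x_train)                           # N x M design matrix
M = Phi.shape[1]
m0 = 0.3 * np.ones(M)                        # non-zero prior mean
S0 = np.eye(M)                               # prior covariance
SN = np.linalg.inv(np.linalg.inv(S0) + beta * Phi.T @ Phi)
mN = SN @ (np.linalg.solve(S0, m0) + beta * Phi.T @ t)

x_star = np.array([0.5])
k = beta * phi(x_star) @ SN @ Phi.T          # equivalent kernel at x_star
prior_term = phi(x_star) @ SN @ np.linalg.solve(S0, m0)

# Predictive mean = prior-mean term + smoother over the targets.
assert np.allclose(phi(x_star) @ mN, prior_term + k @ t)
```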
-
Notes on the Geometry of Least Squares
In this post I expand on the details of section 3.1.2 in Pattern Recognition and Machine Learning. We found that maximum likelihood estimation requires minimizing $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^N \left(t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n)\right)^2.$$ Here the vector $\boldsymbol{\phi}(\mathbf{x}_n)$ contains each of our features evaluated on the single input datapoint $\mathbf{x}_n$, $$\boldsymbol{\phi}(\mathbf{x}_n) = [\phi_0(\mathbf{x}_n),…
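The geometric statement can be verified in a few lines: the least-squares fit $\mathbf{y} = \Phi \mathbf{w}_{\mathrm{ML}}$ is the orthogonal projection of $\mathbf{t}$ onto the column space of the design matrix, so the residual is orthogonal to every feature column. A NumPy sketch with synthetic data of my own:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 20, 4
Phi = rng.normal(size=(N, M))                    # design matrix
t = rng.normal(size=N)                           # targets

w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)   # normal equations
residual = t - Phi @ w_ml                        # t minus its projection

# The residual is orthogonal to the column space of Phi.
assert np.allclose(Phi.T @ residual, 0.0, atol=1e-8)
```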