Author: Sina
-
Linearizing Covariance for the Free Model, Part II
I extend the linearization to include non-linear diagonal terms, but find that even the simplest approximation doesn’t capture the large values we’re after.
-
Linearizing the Covariance Loss for the Free Model
I linearize the covariance for the Free model, and find that I need to include an additional diagonal component.
-
Deep Linear Networks Learn Hierarchical Structure
Running notes on Saxe et al.’s “A mathematical theory of semantic development in deep neural networks.”
-
Linearizing the Covariance Loss
We’re after insight, not an exact solution. ChatGPT had a good suggestion to linearize the loss around $\zz = \bone$. In this post we do that.
-
The Diagonal Model with Centering
We’re going to try to make sense of the solutions to minimizing the following loss: $$L(\zz) = {1 \over 2} \|\XX^T \ZZ^T \JJ \ZZ \XX - \SS\|_F^2 + {\lambda \over 2}\|\zz - \bone\|_2^2$$
-
Plume Marginalization
I work out the expression for the likelihood after marginalizing out flow changes.
-
When to Smell in Stereo?
We show that stereo-olfaction beats mono-olfaction when searching surfaces for olfactory edges.
-
Unary vs. Binary Expressions of Independence
I discuss the unary and binary expressions of independence and how their meanings are slightly different.
-
Markov Blankets in Bayesian Networks
We determine how to find the Markov boundary of a node by first looking at some examples, then using a formal derivation.
-
Synchronization with Bimodal Spines
I update the activity-dependent synchrony model to make spines bimodal, increasing their inhibition when their parent GC is active.
-
Activity-Dependent Synchronization of Linear Integrate and Fire Units
We’re interested in activity-dependent synchronization: the inhibition a mitral cell receives from a granule cell spine requires that the spine was previously depolarized enough to activate its NMDA channels, which are required (via both the additional depolarization of the spine and the increased Ca2+ influx they provide) for vesicle release.…
-
Eigenvalue Density via the Stieltjes Transform
Notes on how to compute the eigenvalue density of a random matrix using the Stieltjes transform, based on Chapter 2 of “A First Course in Random Matrix Theory,” and conversations with ChatGPT.
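As a concrete instance of the inversion step these notes cover, here is a minimal numpy sketch (not code from the notes) recovering the Wigner semicircle density from its Stieltjes transform:

```python
import numpy as np

# Stieltjes transform of the Wigner semicircle law (unit variance):
#   g(z) = (z - sqrt(z - 2) * sqrt(z + 2)) / 2,
# written with principal square roots so that g(z) ~ 1/z at infinity.
def g_semicircle(z):
    return (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

# Sokhotski-Plemelj inversion: rho(x) = -(1/pi) * lim_{eta->0+} Im g(x + i*eta)
eta = 1e-9
x = np.linspace(-1.9, 1.9, 7)
rho = -g_semicircle(x + 1j * eta).imag / np.pi

exact = np.sqrt(4 - x**2) / (2 * np.pi)  # semicircle density on [-2, 2]
print(np.max(np.abs(rho - exact)))       # small: numerical and exact agree
```

The branch choice matters: writing the square root as `sqrt(z - 2) * sqrt(z + 2)` keeps the imaginary part of $g$ negative just above the real axis, which is what the inversion formula assumes.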
-
Linking Representational Geometry and Neural Function
These are my notes on Harvey et al. “What representational similarity measures imply about decodable information.”
-
Memory erasure by dopamine-gated retrospective learning
Hastily written notes immediately after the Gatsby TNJC where this preprint was presented.
-
Multivariate Gaussians from Bayesian Networks
We show in detail how to compute the mean and covariance of the multivariate Gaussian produced by a linear-Gaussian Bayesian network.
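For the standard linear-Gaussian parameterization (an assumption here, not necessarily the one the post uses), the mean and covariance come out of two matrix identities; a minimal sketch:

```python
import numpy as np

# Linear-Gaussian BN: x_i = sum_j W[i, j] x_j + b_i + eps_i, eps_i ~ N(0, v_i),
# with W strictly lower triangular (nodes in topological order).
# Solving x = W x + b + eps gives x = (I - W)^{-1} (b + eps), hence
#   mean = (I - W)^{-1} b,   cov = (I - W)^{-1} diag(v) (I - W)^{-T}.
def bn_to_gaussian(W, b, v):
    A = np.linalg.inv(np.eye(len(b)) - W)
    return A @ b, A @ np.diag(v) @ A.T

# Toy chain x0 -> x1 -> x2 with unit edge weights and unit noise variances.
W = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 1., 0.]])
mean, cov = bn_to_gaussian(W, np.zeros(3), np.ones(3))
print(cov)  # variances 1, 2, 3 down the chain, as noise accumulates
```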
-
Notes on Toy Models of Superposition
On the Discord we’ve been discussing “Toy Models of Superposition” from Anthropic. It’s a long blog post, so these are my running notes to get people (and myself) up to speed if they’ve missed a week or two of the discussion. Problem Setup The authors’ basic aim is to demonstrate “superposition”: neurons representing multiple input…
-
Converting joint distributions to Bayesian networks
In these notes we discuss how to convert a joint distribution into a graph called a Bayesian network, and how the structure of the graph suggests ways to reduce the parameters required to specify the joint.
-
Jensen-Shannon Divergence
I discuss how the Jensen-Shannon divergence is a smoothed symmetrization of the KL divergence comparing two distributions, and connect it to the performance of their optimal binary discriminator.
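The smoothed symmetrization reads directly as code; a minimal numpy sketch (not from the post) for discrete distributions:

```python
import numpy as np

def kl(p, q):
    # KL divergence D(p || q) for discrete distributions, in nats;
    # terms with p[i] == 0 contribute zero.
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    # Jensen-Shannon divergence: average KL of p and q to their mixture m.
    # Comparing to m (never to the other distribution directly) is the
    # "smoothing" that keeps the result finite and symmetric.
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(js(p, q))  # finite despite the supports only partially overlapping
```

Unlike the raw KL divergence, this stays bounded (by $\ln 2$ in nats) even when the supports differ.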
-
Graph Spectra and Clustering
I describe how the spectrum of the graph built from a dataset can indicate how strongly the data are clustered.
-
Matching Pearson Correlations
In this post I switch to matching Pearson correlations, rather than covariances, and then generalize to the scalar product of an arbitrary function of the outputs.