{"id":3411,"date":"2025-02-28T19:17:02","date_gmt":"2025-02-28T19:17:02","guid":{"rendered":"https:\/\/sinatootoonian.com\/?p=3411"},"modified":"2025-12-28T15:33:23","modified_gmt":"2025-12-28T15:33:23","slug":"reaction-rate-inference","status":"publish","type":"post","link":"https:\/\/sinatootoonian.com\/index.php\/2025\/02\/28\/reaction-rate-inference\/","title":{"rendered":"Reaction rate inference"},"content":{"rendered":"\n<p>In this post we show that, given a set of first order reactions with unknown rates, inferring the reaction rates from a dataset of instantaneous species concentrations is a linear regression problem. We solve it for some toy examples.<\/p>\n\n\n\n<p>Consider the following toy set of reactions:<br>\\begin{align*}<br>S_0 &amp;\\xrightarrow{k_1} S_1 + S_2\\\\<br>S_2 &amp;\\xrightarrow{k_2}S_3 + S_4\\\\<br>S_1 + S_3 &amp;\\xrightarrow{k_3} S_5<br>\\end{align*}<br>We have (noisy) data on the concentrations of the species as a function of time. We want to infer the rates $k_1$ to $k_3$.<\/p>\n\n\n\n<p>Let&#8217;s write the derivatives:<br>\\begin{align*}<br>\\dot S_0 &amp;= -k_1 S_0\\\\<br>\\dot S_1 &amp;= k_1 S_0 - k_3 S_1 S_3\\\\<br>\\dot S_2 &amp;= k_1 S_0 - k_2 S_2\\\\<br>\\dot S_3 &amp;= k_2 S_2 - k_3 S_1 S_3\\\\<br>\\dot S_4 &amp;= k_2 S_2\\\\<br>\\dot S_5 &amp;= k_3 S_1 S_3<br>\\end{align*}<\/p>\n\n\n\n<p>In the general case, we will have a sequence of equations $$ L_i \\xrightarrow{k_i} R_i.$$<\/p>\n\n\n\n<p>A particular species is consumed whenever it&#8217;s on the left-hand side of an equation, at a rate proportional to the product of the concentrations of the species on that side. Let&#8217;s define $$q_j = \\prod_{i \\in L_j} S_i.$$ If we let $A_{ij}$ be a binary variable that indicates whether species $i$ is in the left-hand side of equation $j$, we get a contribution of $-A_{ij} q_j k_j.$<\/p>\n\n\n\n<p>Similarly, the species is produced whenever it&#8217;s on the right-hand side, at a rate proportional to the product of the concentrations of the species on the left-hand side. 
If we let $B_{ij}$ be a binary variable that indicates whether species $i$ is in the right-hand side of equation $j$, we get a contribution of $B_{ij} q_j k_j$. If we combine these, we get $$ \\dot S_i = \\sum_j (B_{ij} - A_{ij}) q_j k_j.$$ This is a linear system of equations in the unknown rates. If we define $ \\mathbf Q(\\mathbf s) = \\diag{q_j}$, we can write the above as one matrix equation $$ \\dot{\\mathbf s} = (\\mathbf B - \\mathbf A) \\mathbf Q(\\mathbf s) \\mathbf k = \\mathbf H(\\mathbf s) \\mathbf k.$$<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Inferring the rates<\/h2>\n\n\n\n<p>We have (noisy) measurements of the species concentrations at each point in time, so we have both $\\dot{\\mathbf s}$ and $\\mathbf H(\\mathbf s)$ at each time point. Therefore we can define a loss by summing the squared error over time points and adding a regularizer on $\\mathbf k$:<br>$$ L(\\mathbf k) = {1 \\over 2} \\sum_t \\|\\dot{\\mathbf s}_t - \\mathbf{H}_t \\mathbf{k} \\|_2^2 + {1 \\over 2} \\alpha \\|\\mathbf k\\|_2^2,$$ where $\\mathbf H_t$ is shorthand for $\\mathbf H(\\mathbf s_t).$<\/p>\n\n\n\n<p>The gradient is<br>$$ \\nabla_{\\mathbf k} L = \\alpha \\mathbf k - \\sum_t \\mathbf H_t^T(\\dot{\\mathbf s}_t - \\mathbf H_t \\mathbf k).$$ Setting this to zero, we get<br>$$ (\\alpha \\mathbf I + \\sum_t \\mathbf{H}_t^T \\mathbf{H}_t) \\mathbf k = \\sum_t \\mathbf H_t^T \\dot{\\mathbf s}_t.$$ Defining $$ \\mathbf G = (\\alpha \\mathbf I + \\sum_t \\mathbf{H}_t^T \\mathbf{H}_t), \\quad \\mathbf y = \\sum_t \\mathbf H_t^T \\dot{\\mathbf s}_t,$$ we have $$ \\mathbf { G k} = \\mathbf y.$$ Solving for $\\mathbf k$ we get<br>$$ \\mathbf k = \\mathbf G^{-1} \\mathbf y.$$<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Trying it out<\/h2>\n\n\n\n<p>We try this out in <a href=\"https:\/\/github.com\/stootoon\/inferring-reaction-rates\/blob\/main\/demo.ipynb\">https:\/\/github.com\/stootoon\/inferring-reaction-rates\/blob\/main\/demo.ipynb<\/a>, 
demonstrating that we can infer the correct rates in the noise-free setting, and can get close under moderate observation noise if we use regularization.<\/p>\n\n\n\n<p>$$\\blacksquare$$<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post we show that, given a set of first order reactions with unknown rates, inferring the reaction rates from a dataset of instantaneous species concentrations is a linear regression problem. We solve it for some toy examples.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,152],"tags":[95,83,104,103],"class_list":["post-3411","post","type-post","status-publish","format-standard","hentry","category-blog","category-post","tag-inference","tag-linear-regression","tag-reaction","tag-regression"],"acf":[],"_links":{"self":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3411","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/comments?post=3411"}],"version-history":[{"count":52,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3411\/revisions"}],"predecessor-version":[{"id":5723,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3411\/revisions\/5723"}],"wp:attachment":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/media?parent=3411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/categories?post=3411"},{"taxonomy":"post_tag","embeddable":true,"h
ref":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/tags?post=3411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}