{"id":476,"date":"2024-01-15T18:29:48","date_gmt":"2024-01-15T18:29:48","guid":{"rendered":"https:\/\/sinatootoonian.com\/?p=476"},"modified":"2025-12-28T15:34:29","modified_gmt":"2025-12-28T15:34:29","slug":"wrangling-quartics","status":"publish","type":"post","link":"https:\/\/sinatootoonian.com\/index.php\/2024\/01\/15\/wrangling-quartics\/","title":{"rendered":"Wrangling quartics"},"content":{"rendered":"\n<p>The problem we have is to minimize $${1\\over 2} \\|A ^T Z^T Z A &#8211; B \\|_F^2 + {\\lambda \\over 2} \\|Z &#8211; I\\|_F^2,$$ where $B$ is symmetric. This is quartic over $Z$ so its solution will be the roots of some cubic function of $Z$, which is likely intractable.  However, this problem might have some special structure that will make it soluble. <\/p>\n\n\n\n<p>We&#8217;ll try to alleviate the quartic-ness, which comes from the first term, by moving $Z^TZ$ into a new variable $W$. Defining $$L(Z, W) = {1\\over 2} \\|A ^T W A &#8211; B \\|_F^2 + {\\lambda \\over 2} \\|Z &#8211; I\\|_F^2,$$ we solve the constrained optimization problem  $$\\argmin_{Z,W} \\\\; L(Z, W) \\quad \\text{such that} \\quad Z^TZ = W.$$<\/p>\n\n\n\n<p>The Lagrangian is $$ \\begin{align}L(Z, W , \\La) &amp;= {1\\over 2} \\tr((A ^T W A &#8211; B)(A^T W A &#8211; B)^T)\\\\ &amp;+ {\\la \\over 2}\\tr((Z &#8211; I)(Z-I)^T) +\\tr(\\Lambda^T (W &#8211; Z^T Z)).\\end{align}$$<\/p>\n\n\n\n<p>The differential for $Z$ is $$dL_Z = \\la \\tr(dZ(Z-I)^T) &#8211; \\tr(\\La^T dZ^T Z &#8211; \\La^T Z^T dZ),$$ so the gradient is $$\\nabla_Z L= \\la (Z &#8211; I) &#8211; Z (\\La^T + \\La).$$ Setting it to zero we get $$ \\la (Z -I) = Z(\\La^T + \\La).$$ Left multiplying by $Z^T$ and applying the constraint we get, after rearranging, $$ \\begin{align}W (\\la I- \\La^T &#8211; \\La) =\\la Z^T.\\tag{1}\\label{eq1} \\end{align}$$ <s>Left<\/s> Right multiplying both sides by their transpose, we get $$ W (\\la I- \\La^T &#8211; \\La)^2 W =\\la^2 Z^T Z = \\la^2 W.$$ And since $W$ is invertible,<br>$$ W (\\la I- \\La^T &#8211; \\La)^2  = \\la^2 I  \\implies \\la I- \\La^T &#8211; \\La = \\la W^{-1\/2}.$$ Plugging this into $\\Eqn{eq1}$ we get $$Z = W^{1\/2}.$$ This isn&#8217;t entirely trivial &#8211; the constraint could have been satisfied for e.g. any $ Q W^{1\/2}$ with orthonormal $Q$. But presumably the regularization makes $Q = I$. <\/p>\n\n\n\n<p>The differential for $W$ is $$dL_W = \\tr(A^T dW A (A^T W A &#8211; B)^T ) + \\tr(\\La^T dW)$$ so the gradient is $$ \\nabla_W L = A(A^T W A &#8211; B) A^T+ \\La.$$ Adding this to its transpose, adding $\\la I$ to both sides, setting the gradient contributions to zero and rearranging, we get $$ 2 A(A^T W A &#8211; B) A^T + \\la I = \\la I- \\La &#8211; \\La^T = \\la W^{-1\/2}.$$ In terms of $Z$,<br>$$ 2 A A^T Z^2 AA^T &#8211; 2 A BA^T + \\la I = \\la Z^{-1}.$$<\/p>\n\n\n\n<p>So we&#8217;ve got a slightly simplified cubic. Any chance we can do something with this? The best bet is probably to change basis. Applying SVD,<br>$$ A = USV^T \\implies 2 U S^2 U^T Z^2 U S^2 U^T &#8211; 2 U S V^T B V S U^T + \\la I = \\la Z^{-1}.$$ Using the notation $\\wt{Z}_{UV} = U^T Z V$, we can clean this up a bit to $$ 2 U S^2 \\wt{Z}_{UU}^2 S^2 U^T &#8211; 2 U S \\wt{B}_{VV} S U^T + \\la UU^T = \\la U\\wt{Z}_{UU}^{-1}U^T, $$ and left multiply by $U^T$ and right by $U$ to get $$ 2 S^2 \\wt{Z}_{UU}^2 S^2  &#8211; 2  S \\wt{B}_{VV} S  + \\la I = \\la \\wt{Z}_{UU}^{-1}.$$ <\/p>\n\n\n\n<p>We&#8217;d normally be hooped at this point, but for the special case I&#8217;m interested in, we have two further special properties: $V = I$, and $B$ is diagonal. $\\wt{B}_{VV}$ then simplifies to just $B$, and we get $$ 2 S^2 \\wt{Z}_{UU}^2 S^2  &#8211; 2  S^2 B  + \\la I = \\la \\wt{Z}_{UU}^{-1}.$$ This then implies that $\\wt{Z}_{UU}$ is diagonal, so we get a system of independent cubic equations<br>$$ 2 S^4 \\wt{Z}_{UU}^3  &#8211; (2  S^2 B &#8211; \\la I) \\wt{Z}_{UU}  &#8211; \\la I = 0.$$ There are closed-form formulas for the solutions of such equations, but they&#8217;re quite complicated and are unlikely to yield any insight. <\/p>\n\n\n\n<p>Instead, we resign ourselves to the cases of very small and very large $\\la$. When the regularizer dominates,  $$\\la \\to \\infty \\implies \\wt{Z}_{UU} \\to I \\implies Z \\to I,$$ as we&#8217;d expect. When there&#8217;s little regularization, and using that $Z$ is full rank,<br>$$ \\la \\to 0 \\implies 2 S^4 \\wt{Z}_{UU}^3 \\to 2 S^2 B \\wt{Z}_{UU} \\implies \\wt{Z}_{UU} \\to \\sqrt{B}\/S.$$<\/p>\n\n\n\n<p>In summary, we managed to simplify the problem significantly but still ended up with a system of independent cubics that are too complicated to solve to yield any insight, so we just examined the solutions at the extremes of the regularization path.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The problem we have is to minimize $${1\\over 2} \\|A ^T Z^T Z A &#8211; B \\|_F^2 + {\\lambda \\over 2} \\|Z &#8211; I\\|_F^2,$$ where $B$ is symmetric. This is quartic over $Z$ so its solution will be the roots of some cubic function of $Z$, which is likely intractable. However, this problem might have [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,152],"tags":[17,3,16],"class_list":["post-476","post","type-post","status-publish","format-standard","hentry","category-blog","category-post","tag-calculations","tag-optimization","tag-quartics"],"acf":[],"_links":{"self":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/476","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/comments?post=476"}],"version-history":[{"count":74,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/476\/revisions"}],"predecessor-version":[{"id":957,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/476\/revisions\/957"}],"wp:attachment":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/media?parent=476"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/categories?post=476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/tags?post=476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}