{"id":3657,"date":"2025-03-18T11:27:01","date_gmt":"2025-03-18T11:27:01","guid":{"rendered":"https:\/\/sinatootoonian.com\/?p=3657"},"modified":"2025-12-28T15:33:09","modified_gmt":"2025-12-28T15:33:09","slug":"a-simple-property-of-sparse-vectors","status":"publish","type":"post","link":"https:\/\/sinatootoonian.com\/index.php\/2025\/03\/18\/a-simple-property-of-sparse-vectors\/","title":{"rendered":"A simple property of sparse vectors"},"content":{"rendered":"\n<p>This came up in Chapter 7 of Wainwright&#8217;s &#8220;High-dimensional Statistics&#8221;. In that Chapter we&#8217;re interested in determining how close solutions $\\hat \\theta$ to different flavours of the Lasso problem come to the true, $S$-sparse vector $\\theta^*$. <\/p>\n\n\n\n<p>A useful notion is the set of $S$-dominant vectors (my terminology): <br>$$ C(S) = \\{x: \\|x_{S^c}\\|_1 \\le \\|x_S\\|_1\\},$$ i.e. those for which at least half of their $\\ell_1$ norm lives on the indices $S$.<\/p>\n\n\n\n<p>It turns out for any vector $\\theta^*$ that is $S$-sparse i.e. whose non-zero elements are in the set $S$ alone, if another vector $\\hat \\theta$ satisfies $$\\|\\hat \\theta\\|_1 \\le \\|\\theta^*\\|_1,$$ then their difference is $S$-dominant,<br>$$\\hat \\Delta = \\hat \\theta &#8211; \\theta^* \\in C(S).$$<\/p>\n\n\n\n<p>We&#8217;ll show this algebraically first, and then try to give some intuition for why it&#8217;s the case. The key is decomposition into the parts on $S$ and $S^c$, and noting that $\\theta^*_{S^c} = 0$:<br>\\begin{align}<br>\\|\\hat \\theta\\|_1 \\le \\|\\theta^*\\|_1 &amp;\\implies \\|\\hat \\theta_{S^c}\\|_1 + \\|\\hat \\theta_S\\|_1 \\le \\|\\theta^*_S\\|_1 \\\\ &amp;\\implies \\|\\hat \\theta_{S^c}\\|_1 \\le \\|\\theta^*_S\\|_1  &#8211;  \\|\\hat \\theta_S\\|_1 \\\\&amp;\\implies \\|\\hat \\theta_{S^c}\\|_1 \\le \\|(\\theta^* &#8211; \\hat \\theta)_S\\|_1\\\\ &amp;\\implies \\|(\\theta^* &#8211; \\hat \\theta)_{S^c}\\|_1 \\le \\|(\\theta^* &#8211; \\hat \\theta)_S\\|_1.\\end{align}<\/p>\n\n\n\n<p>While I can follow the derivation above, it&#8217;s not intuitive to me why it&#8217;s true in general. However, it is intuitive in the simple case where $\\hat \\theta$ lives entirely on $S^c$, in which case the result follows immediately, since <br>$$ (\\hat \\theta &#8211; \\theta^*)_{S^c} = \\hat \\theta, \\quad \\text{and} \\quad (\\hat \\theta &#8211; \\theta^*)_S = -\\theta^*.$$<\/p>\n\n\n\n<p>The main result says that this holds generally, even when $\\hat \\theta$ has a component that lives on $S$. To make the connection to the simple case above, we  decompose $\\hat \\theta$ into parts that live on and off of $S$:<br>$$\\hat \\theta = \\hat \\theta_S + \\hat \\theta_{S^c}. $$ Because the $\\ell_1$ norm distributes over elements, <br>$$ \\|\\hat \\theta\\|_1 =  \\|\\hat \\theta_S\\|_1 + \\|\\hat \\theta_{S^c}\\|_1 \\le \\|\\theta^*_S\\|_1.$$ Rearranging we get a bound on the excess norm of $\\hat \\theta$,$$ \\|\\hat \\theta_{S^c}\\|_1 \\le \\|\\theta^*_S\\|_1 &#8211; \\|\\hat \\theta_S\\|_1, $$ which we then relate to the error norm using the triangle inequality $$ \\dots \\le \\|\\theta^* &#8211; \\hat \\theta\\|_1.$$<\/p>\n\n\n\n<p>Not sure if this was much more intuitive!<\/p>\n\n\n\n<p>$$ \\blacksquare$$<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We show that the difference of any vector that is $S$-sparse and any other vector with the same or lesser $\\ell_1$ norm is $S$-dominant.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,30],"tags":[110,15,111],"class_list":["post-3657","post","type-post","status-publish","format-standard","hentry","category-blog","category-notes-blog","tag-lasso","tag-sparsity","tag-triangle-inequality"],"acf":[],"_links":{"self":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3657","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/comments?post=3657"}],"version-history":[{"count":20,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3657\/revisions"}],"predecessor-version":[{"id":5718,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/posts\/3657\/revisions\/5718"}],"wp:attachment":[{"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/media?parent=3657"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/categories?post=3657"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sinatootoonian.com\/index.php\/wp-json\/wp\/v2\/tags?post=3657"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}