hexops
diff --git a/2021/feed.xml b/2021/feed.xml
Lines changed: 11 additions & 11 deletions
diff --git a/2021/i-write-code-100-hours-a-week/index.html b/2021/i-write-code-100-hours-a-week/index.html
Lines changed: 2 additions & 2 deletions
diff --git a/2021/increasing-my-contribution-to-zig-to-200-a-month/index.html b/2021/increasing-my-contribution-to-zig-to-200-a-month/index.html
Lines changed: 2 additions & 2 deletions
diff --git a/2021/index.html b/2021/index.html
Lines changed: 1 addition & 1 deletion
diff --git a/2021/mach-engine-the-future-of-graphics-with-zig/index.html b/2021/mach-engine-the-future-of-graphics-with-zig/index.html
Lines changed: 2 additions & 2 deletions
diff --git a/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/index.html b/2021/perfecting-glfw-for-zig-and-finding-undefined-behavior/index.html
Lines changed: 2 additions & 2 deletions
diff --git a/2021/postgres-regex-search-over-10000-github-repositories/index.html b/2021/postgres-regex-search-over-10000-github-repositories/index.html
Lines changed: 1 addition & 1 deletion
@@ -14,5 +14,5 @@
 This is a follow-up to &ldquo;Postgres Trigram search learnings&rdquo;, in which we shared several learnings and beliefs about trying to use Postgres Trigram indexes as an alternative to Google&rsquo;s Zoekt (&ldquo;Fast trigram based code search&rdquo;).
 We share our results, as well as the exact steps we performed, scripts, and lists of the top 20,000 repositories by stars/language on GitHub so you can reproduce the results yourself should you desire.</p></a></p><p><h3><a class=title href=/2021/postgres-trigram-search-learnings/>Postgres Trigram search learnings</a></h3><a class=summary href=/2021/postgres-trigram-search-learnings/><p>In this article I talk about learnings I have from trying to use pg_trgm as the backend for a search engine, Tridex, which aims to be a competitor to Google&rsquo;s Zoekt (&ldquo;Fast trigram based code search&rdquo;)
 Background I work @ Sourcegraph, which provides code search, code intelligence, and other developer tooling. (If you&rsquo;re one of my Sourcegraph co-workers, hey! Hexops is the name of my GitHub organization for after-hours experiments and what I hope will one day in the distant future become a successful game development company.</p></a></p></div><div class=footer><div class=row-1><a href=https://hexops.com/privacy>Privacy matters</a>
-<a href=https://github.com/sponsors/slimsag>Sponsor on GitHub</a>
+<a href=https://github.com/sponsors/emidoots>Sponsor on GitHub</a>
 <a href=https://machengine.org>machengine.org</a></div><div class=row-2><a href=/feed.xml><img alt="RSS feed" src="https://shields.io/badge/RSS-follow-green?logo=RSS"></a></div><div class=row-3><a href=https://hexops.com><img class="logo color-auto" alt="Hexops logo" src=https://raw.githubusercontent.com/hexops/media/234e15f265b19743c580a078b2d68660c92675d4/logo.svg height=50px></a></div></div></body></html>
@@ -92,6 +92,6 @@
             <span class=p>}</span>
         <span class=p>}</span>
 <span class=p>...</span>
-</code></pre></div><p>What is happening here is that:</p><ul><li><code>images[i].pixels[j * 4 + 0]</code> is returning an <code>unsigned char</code> (8 bits)</li><li><del>It is then being shifted left by <code>&lt;&lt; 16</code> bits. !!! That&rsquo;s further than an 8-bit number can be shifted left by, so that&rsquo;s UB</del><ul><li>EDIT: Actually, it turns out that&rsquo;s not exactly right, it&rsquo;s the <code>&lt;&lt; 24</code> that&rsquo;s the cause of the UB, thanks <a href=https://github.com/Maato>@Maato</a> for <a href=https://github.com/glfw/glfw/pull/1986#issuecomment-955784179>pointing this out and explaining in better detail than I could</a>.</li></ul></li></ul><p>Suddenly, it all makes sense. And <a href=https://godbolt.org/z/ddq75WsYK>if we load an equal snippet of code into Godbolt</a> we can see what is happening when we compile without UBSan / the <code>-fsanitize=undefined</code> flag:</p><p><a href=https://user-images.githubusercontent.com/3173176/139594650-eff35347-3f32-42e5-bc60-da2a1dceb1e1.png><img alt="Compilation with godbolt with UBSan turned off shows movement into 32-bit EAX register" class=color src=https://user-images.githubusercontent.com/3173176/139594650-eff35347-3f32-42e5-bc60-da2a1dceb1e1.png></a></p><p>Without UBsan, clang merely uses the 32-bit EAX register as an optimization. It loads the 8-bit number into the 32-bit register, and then performs the left shift. Although the shift exceeds 8 bits, it <em>does not get truncated to zero</em> - instead it is effectively as if the number was converted to a <code>long</code> (32 bits) prior to the left-shift operation.</p><p>This explains why nobody has caught this UB in GLFW yet, too: it works by accident! 
Just because the compiler likes to use 32-bit registers in this context.</p><p>And this change benefits all the languages out there using GLFW: <a href=https://github.com/glfw/glfw/pull/1986>glfw/glfw#1986</a></p><h2 id=defaults-are-_critical_>Defaults are <em>critical</em></h2><p>This code, and undefined behavior, has been in GLFW for over 6 years according to <code>git blame</code>.</p><p>Anybody using GLFW <em>could have</em> enabled UBSan in their C compiler. Anybody <em>could have</em> run into this same crash and debugged it in the last 6 years. But they didn&rsquo;t.</p><p>In mach-glfw, we compile all of GLFW&rsquo;s C code with Zig (which is also a fully functional C and C++ compiler), with UBSan enabled by default.</p><p>Only because Zig has good defaults, because it places so much emphasis on things being right <em>out of the box</em>, and because there is such an emphasis on having safety checks for undefined behavior - were we able to catch this undefined behavior that went unnoticed in GLFW for the last 6 years.</p><h2 id=thanks-for-reading>Thanks for reading</h2><p>All key Mach engine developments will be posted here, with incremental updates on Twitter <a href=https://twitter.com/machengine>@machengine</a>.</p><p>Follow <a href=https://github.com/hexops/mach>Mach engine on GitHub</a>, and if you like what I&rsquo;m doing please consider <a href=https://github.com/sponsors/slimsag>sponsoring my work</a>.</p></div></main><script>function addAnchor(a){a.insertAdjacentHTML('afterbegin',`<a href="#${a.id}" class="hanchor" ariaLabel="Anchor">#</a> `)}document.addEventListener('DOMContentLoaded',function(){var a=document.querySelectorAll('h1[id], h2[id], h3[id], h4[id]');a&&a.forEach(addAnchor)})</script></div><div class=footer><div class=row-1><a href=https://hexops.com/privacy>Privacy matters</a>
-<a href=https://github.com/sponsors/slimsag>Sponsor on GitHub</a>
+</code></pre></div><p>What is happening here is that:</p><ul><li><code>images[i].pixels[j * 4 + 0]</code> is returning an <code>unsigned char</code> (8 bits)</li><li><del>It is then being shifted left by <code>&lt;&lt; 16</code> bits. !!! That&rsquo;s further than an 8-bit number can be shifted left by, so that&rsquo;s UB</del><ul><li>EDIT: Actually, it turns out that&rsquo;s not exactly right, it&rsquo;s the <code>&lt;&lt; 24</code> that&rsquo;s the cause of the UB, thanks <a href=https://github.com/Maato>@Maato</a> for <a href=https://github.com/glfw/glfw/pull/1986#issuecomment-955784179>pointing this out and explaining in better detail than I could</a>.</li></ul></li></ul><p>Suddenly, it all makes sense. And <a href=https://godbolt.org/z/ddq75WsYK>if we load an equal snippet of code into Godbolt</a> we can see what is happening when we compile without UBSan / the <code>-fsanitize=undefined</code> flag:</p><p><a href=https://user-images.githubusercontent.com/3173176/139594650-eff35347-3f32-42e5-bc60-da2a1dceb1e1.png><img alt="Compilation with godbolt with UBSan turned off shows movement into 32-bit EAX register" class=color src=https://user-images.githubusercontent.com/3173176/139594650-eff35347-3f32-42e5-bc60-da2a1dceb1e1.png></a></p><p>Without UBsan, clang merely uses the 32-bit EAX register as an optimization. It loads the 8-bit number into the 32-bit register, and then performs the left shift. Although the shift exceeds 8 bits, it <em>does not get truncated to zero</em> - instead it is effectively as if the number was converted to a <code>long</code> (32 bits) prior to the left-shift operation.</p><p>This explains why nobody has caught this UB in GLFW yet, too: it works by accident! 
Just because the compiler likes to use 32-bit registers in this context.</p><p>And this change benefits all the languages out there using GLFW: <a href=https://github.com/glfw/glfw/pull/1986>glfw/glfw#1986</a></p><h2 id=defaults-are-_critical_>Defaults are <em>critical</em></h2><p>This code, and undefined behavior, has been in GLFW for over 6 years according to <code>git blame</code>.</p><p>Anybody using GLFW <em>could have</em> enabled UBSan in their C compiler. Anybody <em>could have</em> run into this same crash and debugged it in the last 6 years. But they didn&rsquo;t.</p><p>In mach-glfw, we compile all of GLFW&rsquo;s C code with Zig (which is also a fully functional C and C++ compiler), with UBSan enabled by default.</p><p>Only because Zig has good defaults, because it places so much emphasis on things being right <em>out of the box</em>, and because there is such an emphasis on having safety checks for undefined behavior - were we able to catch this undefined behavior that went unnoticed in GLFW for the last 6 years.</p><h2 id=thanks-for-reading>Thanks for reading</h2><p>All key Mach engine developments will be posted here, with incremental updates on Twitter <a href=https://twitter.com/machengine>@machengine</a>.</p><p>Follow <a href=https://github.com/hexops/mach>Mach engine on GitHub</a>, and if you like what I&rsquo;m doing please consider <a href=https://github.com/sponsors/emidoots>sponsoring my work</a>.</p></div></main><script>function addAnchor(a){a.insertAdjacentHTML('afterbegin',`<a href="#${a.id}" class="hanchor" ariaLabel="Anchor">#</a> `)}document.addEventListener('DOMContentLoaded',function(){var a=document.querySelectorAll('h1[id], h2[id], h3[id], h4[id]');a&&a.forEach(addAnchor)})</script></div><div class=footer><div class=row-1><a href=https://hexops.com/privacy>Privacy matters</a>
+<a href=https://github.com/sponsors/emidoots>Sponsor on GitHub</a>
 <a href=https://machengine.org>machengine.org</a></div><div class=row-2><a href=/feed.xml><img alt="RSS feed" src="https://shields.io/badge/RSS-follow-green?logo=RSS"></a></div><div class=row-3><a href=https://hexops.com><img class="logo color-auto" alt="Hexops logo" src=https://raw.githubusercontent.com/hexops/media/234e15f265b19743c580a078b2d68660c92675d4/logo.svg height=50px></a></div></div></body></html>
@@ -39,5 +39,5 @@
 </span><span class=w></span><span class=p>...</span><span class=w>
 </span></code></pre></div><p>Was much faster in native Postgres, taking about 2-8s for each table instead of 20-40s previously, and taking only 15m in total instead of 2h before.</p><p>Parallel creation of the Trigram indexes using e.g.:</p><div class=highlight><pre class=chroma><code class=language-sql data-lang=sql><span class=k>CREATE</span><span class=w> </span><span class=k>INDEX</span><span class=w> </span><span class=k>IF</span><span class=w> </span><span class=k>NOT</span><span class=w> </span><span class=k>EXISTS</span><span class=w> </span><span class=n>files_000_contents_trgm_idx</span><span class=w> </span><span class=k>ON</span><span class=w> </span><span class=n>files</span><span class=w> </span><span class=k>USING</span><span class=w> </span><span class=n>GIN</span><span class=w> </span><span class=p>(</span><span class=n>contents</span><span class=w> </span><span class=n>gin_trgm_ops</span><span class=p>);</span><span class=w>
 </span></code></pre></div><p>Was also much faster, taking only 23m compared to ~3h with Docker.</p><h3 id=query-performance-is-12-99-faster-depending-on-query>Query performance is 12-99% faster, depending on query</h3><p>We re-ran the same 350 queries as in our earlier table-splitting benchmark, and found the following substantial improvements:</p><ol><li>Queries that were previously very slow noticed a ~12% improvement. This is likely due to IO operations needed when interfacing with the 200 separate tables.</li><li>Queries that were previously in the middle-ground noticed meager ~5% improvements.</li><li>Queries that were previously fairly fast (likely searching only over one or two tables before returning) noticed substantial 16-99% improvements.</li></ol><details><summary>Exhaustive comparison details (negative change is good)</summary><div markdown=1><table><thead><tr><th>Change</th><th>Time bucket</th><th>Queries under bucket <strong>before</strong></th><th>Queries under bucket <strong>after</strong></th></tr></thead><tbody><tr><td>0%</td><td>500s</td><td>350 of 350</td><td>350 of 350</td></tr><tr><td>-12%</td><td>100s</td><td>309 of 350</td><td>350 of 350</td></tr><tr><td>-12%</td><td>50s</td><td>309 of 350</td><td>350 of 350</td></tr><tr><td>-12%</td><td>40s</td><td>308 of 350</td><td>350 of 350</td></tr><tr><td>-12%</td><td>30s</td><td>308 of 350</td><td>349 of 350</td></tr><tr><td>-7%</td><td>25s</td><td>307 of 350</td><td>330 of 350</td></tr><tr><td>-7%</td><td>25s</td><td>307 of 350</td><td>330 of 350</td></tr><tr><td>-8%</td><td>20s</td><td>302 of 350</td><td>330 of 350</td></tr><tr><td>-8%</td><td>20s</td><td>302 of 350</td><td>330 of 350</td></tr><tr><td>-5%</td><td>10s</td><td>297 of 350</td><td>311 of 350</td></tr><tr><td>-26%</td><td>5s</td><td>237 of 350</td><td>319 of 350</td></tr><tr><td>-7%</td><td>2500ms</td><td>224 of 350</td><td>240 of 350</td></tr><tr><td>-9%</td><td>2000ms</td><td>219 of 350</td><td>240 of 
350</td></tr><tr><td>-9%</td><td>1500ms</td><td>219 of 350</td><td>240 of 350</td></tr><tr><td>-16%</td><td>1000ms</td><td>200 of 350</td><td>237 of 350</td></tr><tr><td>-14%</td><td>750ms</td><td>190 of 350</td><td>221 of 350</td></tr><tr><td>-23%</td><td>500ms</td><td>170 of 350</td><td>220 of 350</td></tr><tr><td>-59%</td><td>250ms</td><td>88 of 350</td><td>217 of 350</td></tr><tr><td>-99%</td><td>100ms</td><td>1 of 350</td><td>168 of 350</td></tr><tr><td>-99%</td><td>50ms</td><td>1 of 350</td><td>168 of 350</td></tr></tbody></table></div></details><h2 id=conclusions>Conclusions</h2><p>We think the following learnings are most important:</p><ul><li><code>.git</code> directories, even with <code>--depth=1</code> clones, account for 30% of a repository&rsquo;s size on disk (at least in top 10,000 GitHub repositories.)</li><li>Files > 1 MiB (often binaries) account for another 51% of the data size on disk of repositories.</li><li>On only a MacBook Pro, it is possible to get Postgres Trigram regex search over 10,000 repositories to run most reasonable queries in under 5s - and certainly much faster with more hardware.</li><li><code>pg_trgm</code> performs single-threaded indexing and querying, unless you split your data up into multiple tables.</li><li>By default, a Postgres <code>text</code> column will be compressed by Postgres on disk out of the box - resulting in a 23% reduction in size (with the files we inserted.)</li><li><code>pg_trgm</code> GIN indexes take around 26% the size of your data on disk. So if indexing 1 GiB of raw text, expect Postgres to store that text in around ~827 MiB plus 279 MiB for the GIN trigram index.</li><li>Splitting your data into multiple tables if using <code>pg_trgm</code> is an obvious win, as it allows for parallel indexing which can be the difference between 4h vs 22h. It also reduces the risk of an indexing failure after 22h due to e.g. 
lack of memory and uses much less peak memory overall.</li><li>Docker bind mounts (not volumes) are quite slow outside of Linux host environments (there are many other articles on this subject.)</li></ul><p>If you are looking for fast regexp or code search today, consider:</p><ul><li><a href=https://sourcegraph.com>Sourcegraph</a> (disclaimer: the author works here, but this article is not endorsed or affiliated in any way)</li><li><a href=https://github.com/google/zoekt>Zoekt</a></li><li><a href=https://github.com/BurntSushi/ripgrep>Ripgrep</a></li></ul><p>Follow this devlog for updates as we continue investigating faster ways to do regexp & ngram search at large scales.</p></div></main><script>function addAnchor(a){a.insertAdjacentHTML('afterbegin',`<a href="#${a.id}" class="hanchor" ariaLabel="Anchor">#</a> `)}document.addEventListener('DOMContentLoaded',function(){var a=document.querySelectorAll('h1[id], h2[id], h3[id], h4[id]');a&&a.forEach(addAnchor)})</script></div><div class=footer><div class=row-1><a href=https://hexops.com/privacy>Privacy matters</a>
-<a href=https://github.com/sponsors/slimsag>Sponsor on GitHub</a>
+<a href=https://github.com/sponsors/emidoots>Sponsor on GitHub</a>
 <a href=https://machengine.org>machengine.org</a></div><div class=row-2><a href=/feed.xml><img alt="RSS feed" src="https://shields.io/badge/RSS-follow-green?logo=RSS"></a></div><div class=row-3><a href=https://hexops.com><img class="logo color-auto" alt="Hexops logo" src=https://raw.githubusercontent.com/hexops/media/234e15f265b19743c580a078b2d68660c92675d4/logo.svg height=50px></a></div></div></body></html>