
Commit d51b0e0

Author: github-actions
Commit message: [2025-06-17 11:40:31 UTC] New release [ci skip]
1 parent 17b0e97 commit d51b0e0

File tree

1 file changed (+7, -7)

podcast/50/index.html

Lines changed: 7 additions & 7 deletions
@@ -318,7 +318,7 @@ <h2 class="text-2xl font-normal">Transcript</h2>
<div class="border-l-3 border-gray-300 ml-5 md:ml-8 lg:ml-10 pl-4 md:pl-8 lg:pl-14 pr-4">
<p><em>Matthías Páll Gissurarson (0:00:15)</em>: Hi, and welcome to this episode of the Haskell Interlude. My name is Matti, and I’m here with my co-host, Niki.</p>
<p><em>Niki Vazou (0:00:23)</em>: Hello.</p>
- <p><em>MPG (0:00:24)</em>: Today, we’re joined by Tom Sydney, author of many tools like sydtest, decking, and nix-ci. He’ll tell us about the rules for sustainable Haskell, how Haskell lets one man do the job of 50, and the secret sauce for open source.</p>
+ <p><em>MPG (0:00:24)</em>: Today, we’re joined by Tom Sydney, author of many tools like sydtest, Dekking, and nix-ci. He’ll tell us about the rules for sustainable Haskell, how Haskell lets one man do the job of 50, and the secret sauce for open source.</p>
<p>Welcome, Syd. So, how did you get into Haskell?</p>
<p><em>Tom Sydney Kerckhove (0:00:44)</em>: I think I remember at university, before we even started learning about declarative languages, I was looking into which programming language I might use to build my own things, and I tried a bunch of them. And eventually, I stumbled onto Haskell because someone said, “You look like the kind of person that would enjoy this.” And then there we started. And then it turned out that that fell right into place very well, because now I’m still convinced that it’s an excellent way to build something maintainable over a long period of time. And you can tell because many of my projects are almost 10 years old now, which is about when I started.</p>
<p><em>NV (0:01:19)</em>: So, you learned Haskell completely by yourself?</p>
@@ -340,7 +340,7 @@ <h2 class="text-2xl font-normal">Transcript</h2>
<p><em>NV (0:03:51)</em>: And you didn’t try to push Haskell there?</p>
<p><em>TSK (0:03:53)</em>: Well, we did, and very successfully so. But then all sorts of things happened that I can’t, in good conscience, talk about for certain legal reasons that made me not work there anymore. And so, I don’t think they’re using Haskell anymore. And I remember that various people who worked there agreed that that was not a great decision, but I don’t know what the current reality is there.</p>
<p><em>MPG (0:04:14)</em>: But you did a bunch of work on your testing framework, sydtest, right? </p>
- <p><em>TSK (0:04:22)</em>: Right. I had been using hspec at the time, and there were some things missing, some very small things missing all over the place. And then I thought, ‘I have some extra time.’ I think it was between jobs or something. ‘Let me just see what I can do if I try to do it myself.’ And then I wrote something, and it turns out that by choosing the right defaults or different defaults, let’s say, you can find a lot of extra problems with the same amount of testing effort. So, for example, if you run tests in parallel by default, you’ll find threat safety issues just by doing that. And if you run them in random order by default, then you’ll find problems where some – it’s called test pollution or, let’s say, I’ve called it test pollution. I don’t know what it’s actually called. It’s where two tests can either pass or fail, depending on whether they’re being run at the same time or before or after other tests. And you’ll find those problems by running tests in parallel, for example. And there are all sorts of defaults like that that I could choose because I was making the thing. And then I ended up choosing some things that are controversial, but then found a whole bunch of extra problems for me. </p>
+ <p><em>TSK (0:04:22)</em>: Right. I had been using hspec at the time, and there were some things missing, some very small things missing all over the place. And then I thought, ‘I have some extra time.’ I think it was between jobs or something. ‘Let me just see what I can do if I try to do it myself.’ And then I wrote something, and it turns out that by choosing the right defaults or different defaults, let’s say, you can find a lot of extra problems with the same amount of testing effort. So, for example, if you run tests in parallel by default, you’ll find thread safety issues just by doing that. And if you run them in random order by default, then you’ll find problems where some – it’s called test pollution or, let’s say, I’ve called it test pollution. I don’t know what it’s actually called. It’s where two tests can either pass or fail, depending on whether they’re being run at the same time or before or after other tests. And you’ll find those problems by running tests in parallel, for example. And there are all sorts of defaults like that that I could choose because I was making the thing. And then I ended up choosing some things that are controversial, but then found a whole bunch of extra problems for me. </p>
<p>So, another is that I wanted this thing to be good for running in CI because that’s where I run my tests for the most part, which means that all of the randomness that was being generated when executing tests needed to be reproducible. So, there’s a set seed for all randomness, all controllable randomness, let’s say, when you run a test suite so that it’s easy to reproduce.</p>
<p><em>NV (0:05:39)</em>: Wait, but as a Haskell programmer, I’m confused. Why is the ordering important? </p>
<p><em>TSK (0:05:44)</em>: So, for example, if you have two tests, A and B, where A writes a file and B reads a file, expecting to exist, and then if you run B first, then it will read the file that doesn’t exist. If you run B second, then it will read a file and it will be there. If you have this kind of dependencies between tests—I call it test pollution—then you cannot run tests in parallel. And it turns out that being able to run tests in parallel a goal lets a lot of puzzle pieces fall into place automatically. If all tests can be run in parallel and in any order, then running test suites and tests in parallel, it turns them embarrassingly parallelizable. So, that’s the first puzzle piece, right? You can parallelize tests. </p>
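The scenario just described is easy to reproduce. Below is a minimal sketch, assuming sydtest’s hspec-like Test.Syd API (the file name and test names are made up): because sydtest runs tests in random order and in parallel by default, the second test fails whenever it happens to run before, or at the same time as, the first.

import System.Directory (doesFileExist)
import Test.Syd

main :: IO ()
main = sydTest $ do
  -- Test A: writes the shared file.
  it "writes the shared file" $
    writeFile "shared.txt" "hello"
  -- Test B: silently depends on A having run first (test pollution).
  it "expects the shared file to exist" $ do
    exists <- doesFileExist "shared.txt"
    exists `shouldBe` True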
@@ -393,16 +393,16 @@ <h2 class="text-2xl font-normal">Transcript</h2>
<p><em>MPG (0:20:14)</em>: Right. And even, I think we test a lot, right? So, you discovered some calendar bug, right? </p>
<p><em>TSK (0:20:19)</em>: Right. </p>
<p><em>MPG (0:20:19)</em>: So, that was just a sydtest finding something or –</p>
- <p><em>TSK (0:20:22)</em>: So actually, no, someone found that in production because it turns out that there was a threat safety issue with the way that if you run the same SQL query at the same time, then you might have segfaults, something like that. And there was no way to find that unless you actually tried the exact same query at the same time for various threats. And so, that’s one thing we found there. I built this project called Decking, which is next-generation coverage reports for Haskell, which doesn’t work in some cases for complicated type safety reasons, but it produces some nice reports for me. And now I can see that the coverage that I get from my repositories, and I hope you’d be surprised to find out that the coverage is usually around 60%. So, it’s not astronomically higher, but I’m starting to measure it, which is already something weird, apparently.</p>
+ <p><em>TSK (0:20:22)</em>: So actually, no, someone found that in production because it turns out that there was a thread safety issue with the way that if you run the same SQL query at the same time, then you might have segfaults, something like that. And there was no way to find that unless you actually tried the exact same query at the same time for various threads. And so, that’s one thing we found there. I built this project called Dekking, which is next-generation coverage reports for Haskell, which doesn’t work in some cases for complicated type safety reasons, but it produces some nice reports for me. And now I can see that the coverage that I get from my repositories, and I hope you’d be surprised to find out that the coverage is usually around 60%. So, it’s not astronomically higher, but I’m starting to measure it, which is already something weird, apparently.</p>
<p><em>NV (0:21:07)</em>: The coverage of the code that is run, this is not about testing, right?</p>
<p><em>TSK (0:21:11)</em>: Oh, I’m sorry. I’m now talking about testing, but you can use this tool to get the coverage of anything that’s run, yes.</p>
<p><em>NV (0:21:17)</em>: But 60% is the coverage that your unit test provides you on your code. And how do you measure that?</p>
- <p><em>TSK (0:21:23)</em>: So, there’s this project called Decking. It replaces every expression in your code by unsafePerformIO, mark this thing as covered, and then return the value. But it turns out that that’s actually not something that preserves. It compiles according to GHC, which is really weird. But that way, you can see which parts of your tests were covered and which weren’t. And then you can have some really nice, colorful reports for them as well.</p>
+ <p><em>TSK (0:21:23)</em>: So, there’s this project called Dekking. It replaces every expression in your code by unsafePerformIO, mark this thing as covered, and then return the value. But it turns out that that’s actually not something that preserves. It compiles according to GHC, which is really weird. But that way, you can see which parts of your tests were covered and which weren’t. And then you can have some really nice, colorful reports for them as well.</p>
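A rough sketch of the transformation being described, not Dekking’s actual output or API (the covered helper and the log file are illustrative stand-ins): every expression is rewritten so that evaluating it first records its source location as a side effect and then yields the original value unchanged.

import System.IO.Unsafe (unsafePerformIO)

-- Illustrative coverage-marking helper: log the location, then hand back
-- the original value.
covered :: String -> a -> a
covered loc e = unsafePerformIO (appendFile "coverage.log" (loc ++ "\n")) `seq` e

-- An expression like
--   greet name = "Hello, " ++ name
-- would, roughly, be rewritten by the plugin to:
greet :: String -> String
greet name =
  covered "Greet.hs:3:3"
    (covered "Greet.hs:3:12" "Hello, " ++ covered "Greet.hs:3:26" name)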
<p><em>MPG (0:21:46)</em>: How does this compare to the Haskell program coverage thing?</p>
<p><em>TSK (0:21:50)</em>: Have you ever looked into the source code for that? It’s not built into GHC. That’s one of the differences. Sorry. HPC is checking this.</p>
<p><em>MPG (0:21:58)</em>: Right. Yeah.</p>
<p><em>TSK (0:21:59)</em>: That’s the big difference. And I wanted to do it that way so that the GHC maintainers didn’t have to maintain my coverage report.</p>
- <p><em>MPG (0:22:06)</em>: So decking is like, it does a whole program transformation of the program when you’re compiling it?</p>
+ <p><em>MPG (0:22:06)</em>: So Dekking is like, it does a whole program transformation of the program when you’re compiling it?</p>
<p><em>TSK (0:22:11)</em>: I think so. To be honest, I’m not clear with the term, but it sounds like that would be what it is. Yes. So, there’s a GHC plugin that just goes and replaces the source code. And that was actually part of the issue because I couldn’t pinky promise to GHC that this was correct type-wise. It turns out that if you replace an expression of type A by identity function of A, sometimes that doesn’t compile anymore. That was part of the issue. So, I had to build in something where you could say, “Don’t try to cover this because it’s not going to compile if you do.” That’s part of what makes it different from HPC, is that you sometimes have to do stuff like that.</p>
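One concrete instance of the kind of wrapping failure being described (an illustration, not necessarily the exact cases the plugin runs into): expressions of unlifted, unboxed types cannot be passed through an ordinary polymorphic wrapper such as id, so rewriting them breaks compilation and they have to be left uninstrumented.

{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int#, (+#))

-- Fine: an expression of the unlifted type Int#.
bump :: Int# -> Int#
bump x = x +# 1#

-- Rejected: 'id :: a -> a' only works at lifted types, so wrapping the same
-- expression no longer compiles.
-- bump' :: Int# -> Int#
-- bump' x = id (x +# 1#)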
<p><em>MPG (0:22:43)</em>: Right.</p>
<p><em>TSK (0:22:43)</em>: So, you can mark certain sections of your code as not coverable. And that might actually be useful as well if you’re trying to – let’s say you have an admin panel in your website that maybe you don’t really care about testing that part and you can mark it as not coverable and it doesn’t show up in your test reports as not covered.</p>
@@ -413,7 +413,7 @@ <h2 class="text-2xl font-normal">Transcript</h2>
<p><em>NV (0:24:25)</em>: And this works only with your testing suite, the sydtest, right? </p>
<p><em>TSK (0:24:30)</em>: No, no, because it just does expression replacements, so you can – it doesn’t even have to be a test suite. You can do this in production, I guess, but might be a bit slower. </p>
<p><em>MPG (0:24:37)</em>: Yeah. Because I think, I mean, HPC does something similar, but like you said, it’s not a – it’s called the STG level, right? So, down after the expressions. But I think you could combine the two really. I think you could just run the HPC and then not to use their libraries to produce nice output, right? Because the output library is a bit of a mess, right? Like there’s a C parser that parses these files. It’s a bit of a mess. Yeah.</p>
- <p><em>TSK (0:25:02)</em>: It’s also terribly difficult to get the HPC files that are output from the right packages into the right places when you’re building stuff across packages. So, cross package code coverage reports are really difficult to do with HPC to the point that I never figured it out. And it took me less time to build decking to figure it out.</p>
+ <p><em>TSK (0:25:02)</em>: It’s also terribly difficult to get the HPC files that are output from the right packages into the right places when you’re building stuff across packages. So, cross package code coverage reports are really difficult to do with HPC to the point that I never figured it out. And it took me less time to build Dekking to figure it out.</p>
<p><em>MPG (0:25:21)</em>: Because we use this for program repair, right? Because you know which tests failed and you know what parts of the code that test touched, right? So then you can say, “Ah, if you want to fix something, you should probably look here first.” Right?</p>
<p><em>TSK (0:25:36)</em>: I’ve actually got an open proposal for a master’s thesis at the university, which is about making something where you could say, “This is my test suite. Can you try and figure out across the history of my code base which tests have been useful and how useful so every time a piece of code was run by a test, that test has been useful?” You could argue it that way. And then you can see across time which tests have been most useful, for example. No one’s ever started on it, but that would have been pretty interesting.</p>
<p><em>MPG (0:26:06)</em>: That’s cool. So, you mentioned some other projects you were looking into, right?</p>
@@ -458,7 +458,7 @@ <h2 class="text-2xl font-normal">Transcript</h2>
<p><em>MPG (0:35:20)</em>: But this is all in the NixCI system, or where does this –</p>
<p><em>TSK (0:35:24)</em>: No, this is just the various little tools I use for any project. So, there is pre-commit hooks, which makes sure that my code is always formatted with the way I do that. It doesn’t really matter which one you use as long as – then there’s all the GHC warnings. There’s something called TagRef, which is amazing, and the people at the GHC team will love this if they haven’t heard about it. It’s based on the GHC node system, where pieces of code can refer to each other in a way that needs to stay consistent. And so, in TagRef, you say, “Here is a little explanation of something,” and now I can refer to that. And TagRef will just go and check that all those links are not there. And that way, you have consistent developer to itself main communication.</p>
<p><em>NV (0:36:01)</em>: But this is for the comments?</p>
- <p><em>TSK (0:36:04)</em>: Yes, that’s right. That’s for any texts, really. So, for example, if you look at the decking source code, you’ll see all sorts of comments about linker issues that are figured out. And then if you refactor a lot of code, these get moved around, and sometimes these links breaks and then links break and then it doesn’t work anymore. So, that’s the second. And then there is HLint, which lets me catch a whole bunch of things, and it’s quite customizable. That’s nice. So, I usually put my entire Dangerous Functions list into HLint. So, that warns me when I use any of them. And I need to go and explicitly opt into using the dangerous function in order to do that. And then now there’s Weeder as well, which requires me to remove all the code that I’m not using, for example. </p>
+ <p><em>TSK (0:36:04)</em>: Yes, that’s right. That’s for any texts, really. So, for example, if you look at the Dekking source code, you’ll see all sorts of comments about linker issues that are figured out. And then if you refactor a lot of code, these get moved around, and sometimes these links breaks and then links break and then it doesn’t work anymore. So, that’s the second. And then there is HLint, which lets me catch a whole bunch of things, and it’s quite customizable. That’s nice. So, I usually put my entire Dangerous Functions list into HLint. So, that warns me when I use any of them. And I need to go and explicitly opt into using the dangerous function in order to do that. And then now there’s Weeder as well, which requires me to remove all the code that I’m not using, for example. </p>
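A small illustration of the TagRef idea (assuming its [tag:...] / [ref:...] comment convention; the label and code below are made up): a reference is only valid while the tag it points at still exists, so the explanation and the code that relies on it cannot silently drift apart.

import System.Directory (createDirectoryIfMissing)

-- [tag:scratch_dir] Every test writes into "scratch/", so the directory has
-- to exist before the suite starts.
ensureScratchDir :: IO ()
ensureScratchDir = createDirectoryIfMissing True "scratch"

-- Elsewhere in the codebase:
-- Writing into "scratch/" here is safe, see [ref:scratch_dir].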
<p><em>MPG (0:36:43)</em>: We need a book because everyone’s complaining about Haskell tooling, but it seems like all this stuff is out there. It’s just, nobody’s using it, right? So, you need something like real world Haskell 2.0, all the tools that you need or something like that, because people are complaining and they’re like just running GHC and looking at a lot of output. And you’re like, well, if you run the Java compiler and look at the output, you’re not going to do anything, right?</p>
<p><em>NV (0:37:07)</em>: I think once in one project I had HLint in my commits, and when HLint broke, I was so pissed that I couldn’t push. And then I removed this.</p>
<p><em>TSK (0:37:21)</em>: Exactly. And so, you see this very often. Whenever there’s a method that you rely on in CI that has false positives, we end up just removing it instead of listening to it because the false positives are something you need to be able to deal with. And so Weeder, for example, has a bunch of them, and you need to be able to turn the warning for it off. And it’s possible, so that’s nice. But then now you get weeds in that config file. And so, any system where you can ignore something like that needs to also be able to tell you when it’s no longer ignoring the thing you told it to ignore. So, the way I look about that is on a large-ish timescale, code seems to behave as a random walk through all possible existences of that code that would pass CI. That means that you need to be able to deal with what looks like a monkey typing on a keyboard. If CI doesn’t catch it, it will end up in your codebase at some point.</p>
