Add Haskell implementation to benchmark #144
* Add Haskell dataframe benchmark entry --------- Co-authored-by: Claude <noreply@anthropic.com>
Tmonster
left a comment
Thank you for this! Looks pretty good to me. I tried to test it on my own c6id.4xlarge instance and got errors at the build step. Has the dataframe library perhaps been updated recently?
This is the error I saw:
Downloading the latest package list from hackage.haskell.org
Package list of hackage.haskell.org is up to date.
The index-state is set to 2026-01-02T18:38:39Z.
Build profile: -w ghc-9.4.7 -O2
In order, the following will be built (use -v for more details):
- haskell-benchmark-0.1.0.0 (exe:groupby-haskell) (first run)
- haskell-benchmark-0.1.0.0 (exe:join-haskell) (first run)
Preprocessing executable 'join-haskell' for haskell-benchmark-0.1.0.0...
Preprocessing executable 'groupby-haskell' for haskell-benchmark-0.1.0.0...
Building executable 'join-haskell' for haskell-benchmark-0.1.0.0...
Building executable 'groupby-haskell' for haskell-benchmark-0.1.0.0...
[1 of 1] Compiling Main ( groupby-haskell.hs, /var/lib/mount/db-benchmark-metal/haskell/dist-newstyle/build/x86_64-linux/ghc-9.4.7/haskell-benchmark-0.1.0.0/x/groupby-haskell/opt/build/groupby-haskell/groupby-haskell-tmp/Main.o )
[1 of 1] Compiling Main ( join-haskell.hs, /var/lib/mount/db-benchmark-metal/haskell/dist-newstyle/build/x86_64-linux/ghc-9.4.7/haskell-benchmark-0.1.0.0/x/join-haskell/opt/build/join-haskell/join-haskell-tmp/Main.o )
join-haskell.hs:126:34: error:
• Couldn't match expected type ‘D.Expr Double’
with actual type ‘T.Text’
• In the first argument of ‘D.columnAsDoubleVector’, namely
‘(T.pack name)’
In the expression: D.columnAsDoubleVector (T.pack name) df
In the expression:
case D.columnAsDoubleVector (T.pack name) df of
Right vec -> VU.sum vec
Left _ -> 0.0
|
126 | case D.columnAsDoubleVector (T.pack name) df of
| ^^^^^^^^^^^
groupby-haskell.hs:175:31: error:
• Couldn't match expected type ‘D.Expr Int’ with actual type ‘Text’
• In the first argument of ‘D.columnAsIntVector’, namely
‘(T.pack col)’
In the expression: D.columnAsIntVector (T.pack col) df
In the expression:
case D.columnAsIntVector (T.pack col) df of
Right vec -> fromIntegral $ VU.sum vec
Left _ -> 0.0
|
175 | case D.columnAsIntVector (T.pack col) df of
| ^^^^^^^^^^
groupby-haskell.hs:181:34: error:
• Couldn't match expected type ‘D.Expr Double’
with actual type ‘Text’
• In the first argument of ‘D.columnAsDoubleVector’, namely
‘(T.pack col)’
In the expression: D.columnAsDoubleVector (T.pack col) df
In the expression:
case D.columnAsDoubleVector (T.pack col) df of
Right vec -> VU.sum vec
Left _ -> 0.0
|
181 | case D.columnAsDoubleVector (T.pack col) df of
| ^^^^^^^^^^
Error: [Cabal-7125]
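
All three errors in the log share one cause: `D.columnAsDoubleVector` and `D.columnAsIntVector` are called with a `Text` column name, while the installed library version expects a typed expression (`D.Expr Double` or `D.Expr Int`). The following is a minimal sketch of that kind of signature change and of one plausible way to adapt a call site. It uses only `base`, and every name in it (`Expr`, `Col`, `col`, `Frame`, `columnAsDoubles`, `sumColumn`) is a hypothetical stand-in, not the dataframe library's actual API.

```haskell
module Main where

-- Hypothetical stand-ins: none of these names come from the dataframe library.

-- A column reference tagged (via a phantom type) with the element type
-- the column is expected to hold.
newtype Expr a = Col String

-- Toy data frame: column name paired with its values (Doubles only, for brevity).
type Frame = [(String, [Double])]

-- Turn a bare column name into a typed column reference.
col :: String -> Expr a
col = Col

-- Accessor that takes a typed expression rather than a bare name,
-- analogous to the expected type `D.Expr Double` in the log.
columnAsDoubles :: Expr Double -> Frame -> Either String [Double]
columnAsDoubles (Col name) df =
  maybe (Left ("missing column: " ++ name)) Right (lookup name df)

-- Mirrors the shape of the failing call sites: sum a column, defaulting to 0.0.
-- Passing `name` directly instead of `col name` would be rejected by the
-- type checker, much like the errors above.
sumColumn :: String -> Frame -> Double
sumColumn name df =
  case columnAsDoubles (col name) df of
    Right vec -> sum vec
    Left _    -> 0.0

main :: IO ()
main = print (sumColumn "v1" [("v1", [1.0, 2.5]), ("v2", [4.0])])
```

Whether the real fix wraps the name in an expression like this or pins an older library version is a separate choice; the sketch only shows why the compiler rejects the old call sites.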

@Tmonster I updated the implementation. I also pinned the dependency to a major version so that future version updates don't break the build.

Hi @mchav, it seems some other package was updated, which caused the regression tests to start failing. I'm going to fix that first, then I'll go ahead and merge this. Also, the DuckDB release was pushed back a week, so the results will be about a week later as well.

@Tmonster alright. I noticed that the failures in the last CI check were caused by trailing commas I had left in some R files. I have fixed those as well.

Hi @mchav, thanks. I was going to mention an issue with … It seems like something is wrong with how the join data file names are read/parsed?

@Tmonster that was a small bug in inferring how to replace the NA. It should be fixed now.
I also tested that this works end to end on a c6id.4xlarge instance.