
Conversation

@benjaminweb (Collaborator) commented Jul 31, 2025

Provided the streaming package is installed, the new utility can be demonstrated with the following example, adapted from the tutorial:

{-# LANGUAGE DeriveGeneric, OverloadedStrings, OverloadedLabels #-}

module Lib where

import Database.Selda
import Database.Selda.SQLite
import qualified Streaming.Prelude as S

data Pet = Dog | Horse | Dragon
  deriving (Show, Read, Bounded, Enum)
instance SqlType Pet

data Person = Person
  { name :: Text
  , age  :: Int
  , pet  :: Maybe Pet
  } deriving (Generic, Show)
instance SqlRow Person

people :: Table Person
people = table "people" [#name :- primary]

-- set up the example table and build the query we stream from; the
-- explicit signature avoids a monomorphism-restriction ambiguity
prep :: SeldaM b (Query b (Col b Text :*: Col b (Maybe Pet)))
prep = do
    tryDropTable people  -- so the example can be re-run on an existing db
    createTable people
    insert_
      people
      [ Person "Velvet" 19 (Just Dog)
      , Person "Kobayashi" 23 (Just Dragon)
      , Person "Miyu" 10 Nothing
      ]
    let q = do
          person <- select people
          restrict (person ! #age .>= 18)
          return (person ! #name :*: person ! #pet)
    return q

-- `()` is a Monoid, so a callback that just prints each row (returning `()`) works
main :: IO ()
main =
  withSQLite "people.sqlite" $ do
    q <- prep
    forQuery q $ liftIO . print

-- instead of printing, we use the streaming library's singleton constructor;
-- for the streaming package that is `S.yield` (streams whose return value is
-- a Monoid are themselves Monoids, so the chunks concatenate)
main2 :: IO ()
main2 =
  withSQLite "people.sqlite" $ do
    q <- prep
    a <- forQuery q $ pure . S.yield
    S.print a

-- lists are Monoids too, so collecting each row into a singleton list
-- recovers the usual list of results
main3 :: IO ()
main3 =
  withSQLite "people.sqlite" $ do
    q <- prep
    a <- forQuery q $ pure . (:[])
    liftIO $ print a
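
For reference, with the rows inserted above, main should print something like the following (the exact rendering depends on the `Show` instance for `:*:`, and the order is whatever SQLite returns for this unordered query; here, insertion order):

    "Velvet" :*: Just Dog
    "Kobayashi" :*: Just Dragon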

@exaexa (Collaborator) commented Oct 24, 2025

@benjaminweb can you pls enable maintainer pushes to the branch?

(or merge https://github.com/exaexa/selda/tree/streaming :) )

-- | Like `runStmt` but instead of collecting all results at once in a
-- list, collects chunks of results and passes them one by one to a given
-- "callback" action. The callbacks may collect and return any `Monoid`.
, runStmtStreaming

@valderman (Owner) commented:
How about we add a batch size hint of some sort here? Even if libpq doesn't support it yet, I think we should make sure the API is prepared to use it when it lands. Also, I guess we could implement it in the SQLite backend beforehand, to test it out?
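
For illustration, one possible shape of such a hint, building on the backend-level callback described in the Haddock above. This is a sketch only: the function name, the hint argument, and the exact parameter list are invented here, not taken from the PR; Text, Param, and SqlValue are selda's existing backend-facing types.

    import Data.Text (Text)
    import Database.Selda.Backend (Param, SqlValue)

    -- Hypothetical, not the PR's API: the backend may fetch up to the hinted
    -- number of rows before invoking the callback, and combines the
    -- callback's Monoid results across chunks.
    runStmtStreamingWithHint
      :: Monoid r
      => Int                     -- batch size hint; backends may ignore it
      -> Text                    -- the SQL statement
      -> [Param]                 -- statement parameters
      -> ([[SqlValue]] -> IO r)  -- callback, invoked once per chunk of rows
      -> IO r
    runStmtStreamingWithHint = error "sketch only"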

@exaexa (Collaborator) commented Dec 24, 2025:

OK, so I finally got back to cleaning this up.

The SQLite API is streaming but doesn't support batches (it always steps one row at a time, which I guess is fine because SQLite is local; see the sketch below). For Postgres batching makes more sense, but the libraries don't support it yet (see haskellari/postgresql-libpq#79).
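
For a concrete picture of that one-row stepping, here is a minimal sketch over direct-sqlite, which selda's SQLite backend builds on; the chunking helper and its name are invented for illustration:

    import Control.Monad (unless, when)
    import Database.SQLite3 (SQLData, Statement, StepResult (..), columns, step)

    -- Hypothetical helper: emulate chunks of up to n rows (n > 0) on top of
    -- sqlite3_step, which only ever advances one row at a time.
    fetchChunks :: Int -> Statement -> ([[SQLData]] -> IO ()) -> IO ()
    fetchChunks n stmt callback = go
      where
        go = do
          rows <- takeRows n
          unless (null rows) (callback rows)
          when (length rows == n) go  -- a short chunk means we hit Done
        takeRows :: Int -> IO [[SQLData]]
        takeRows 0 = pure []
        takeRows k = do
          res <- step stmt
          case res of
            Done -> pure []
            Row  -> (:) <$> columns stmt <*> takeRows (k - 1)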

Other notes:

  • A streaming API for inserts doesn't make much sense (all backends solve that with prepared, batched inserts; the best we could do there is autobatching, which is a different topic, I guess).
  • I might also revise the API a little; I was recently playing with the streaming package and it's mildly addictive. :D
  • I'm not sure whether the batch size is better as a call argument (which decorates all user code with a potentially unwanted and hard-to-configure integer constant) or as a backend parameter (which would also let us dodge the batch question entirely in backends that can't support it, i.e. SQLite). Both shapes are sketched after this list.
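
For concreteness, the two placements could look roughly like this (a sketch with invented names; Query, Result, Res, and Backend are selda's existing frontend types):

    import Database.Selda
    import Database.Selda.Backend

    -- (a) batch size as a per-call argument: explicit, but every call site
    --     now carries an integer constant most users won't want to tune
    forQueryWithBatch
      :: (MonadSelda m, Result a, Monoid r)
      => Int -> Query (Backend m) a -> (Res a -> m r) -> m r
    forQueryWithBatch = error "sketch only"

    -- (b) batch size as a backend/connection parameter: configured once; a
    --     backend that cannot batch (SQLite) simply ignores it
    newtype BatchSize = BatchSize Int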

@valderman (Owner) commented Oct 29, 2025

If we make sure the default batch size is somewhat sane, perhaps we could remove the old backend API (i.e. the one that fetches all results at once) and have the frontend wrap the streaming API? I'm not convinced it makes sense to maintain both. That would also give us some testing for free.
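
Concretely, assuming the PR's forQuery is in scope with the type its uses in the example above suggest, the old collect-everything query is a thin wrapper over the streaming one (the wrapper name is invented here):

    import Database.Selda
    import Database.Selda.Backend

    -- Sketch: each row's callback returns a singleton list, and the Monoid
    -- instance for lists concatenates them into the usual result list
    -- (cf. main3 above).
    queryAll :: (MonadSelda m, Result a) => Query (Backend m) a -> m [Res a]
    queryAll q = forQuery q (pure . (: []))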

@exaexa (Collaborator) commented Oct 29, 2025

> If we make sure the default batch size is somewhat sane, perhaps we could remove the old backend API (i.e. the one that fetches all results at once) and have the frontend wrap the streaming API?

For SQLite I think this could work; even now the results are fetched one by one. For libpq it actually makes a difference at runtime (the calls are different, and the database may hold resources for longer). I'll ask whether anything can break by always chunking, and will report if I find anything.
