Implementing streaming support #200
mtl stopped (re)exporting some functions, causing a build error.
…traint from constructor level
@benjaminweb can you pls enable maintainer pushes to the branch? (or merge https://github.com/exaexa/selda/tree/streaming :) )
    -- | Like `runStmt` but instead of collecting all results at once in a
    -- list, collects chunks of results and passes them one by one to a given
    -- "callback" action. The callbacks may collect and return any `Monoid`.
    , runStmtStreaming
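To make the doc comment above concrete, here is a minimal sketch of the "chunks plus `Monoid` callback" idea using only `base`. It is a toy model over an in-memory list, not Selda's implementation: `chunksOf` and `streamChunks` are hypothetical names, and the real `runStmtStreaming` signature may well differ.

```haskell
import Data.Monoid (Sum (..))

-- | Hypothetical stand-in for a backend cursor: split the rows into
-- chunks of at most @n@ rows (a non-positive @n@ is treated as 1).
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | null xs = []
  | otherwise =
      let (h, t) = splitAt (max 1 n) xs
       in h : chunksOf n t

-- | Toy streaming runner: pass each chunk to the callback in order and
-- combine the callback results with '<>'.
streamChunks :: (Monad m, Monoid r) => Int -> [a] -> ([a] -> m r) -> m r
streamChunks n rows callback = go mempty (chunksOf n rows)
  where
    go acc [] = pure acc
    go acc (c : cs) = do
      r <- callback c
      go (acc <> r) cs

-- | Count rows without keeping them: the callback returns a 'Sum'.
countRows :: Monad m => Int -> [a] -> m Int
countRows n rows = getSum <$> streamChunks n rows (pure . Sum . length)
```

Because the result is any `Monoid`, the same runner can count rows, collect them into a list, or return `()` when the callback only performs side effects.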
How about we add a batch size hint of some sort here? Even if libpq isn't supported yet, I think we should make sure the API is prepared to use it when it lands. Also, I guess we could implement it in the SQLite backend beforehand, to test it out?
ok so I finally got back to cleaning this up.
The API for sqlite is streaming but doesn't support batches (it always steps 1 row at a time, which I guess is no issue because sqlite is local). For postgres batching makes more sense, but the libraries don't support it yet (see haskellari/postgresql-libpq#79).
Other notes here:
- streaming API for inserts doesn't make much sense (all backends solve that with prepared batched inserts; the best we can do there is to autobatch, which is a different topic I guess).
- I might review the API a little bit too; I was recently playing with the `streaming` pkg and it's mildly addictive. :D
- not sure if the batch size would be better as a call argument (but that decorates all user code with a potentially unwanted and badly configurable integer constant) or a backend parameter (that would also allow us to completely dodge the batch questions in backends that don't support it, i.e. sqlite)
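The two placements for the batch size discussed in the last bullet can be sketched side by side. Everything here is hypothetical and only illustrates the trade-off: the `Backend` record, its field names, `fetchPlanA`, `fetchPlanB`, and the value 256 are assumptions, not part of Selda or of this PR.

```haskell
-- | Hypothetical backend description; names are illustrative only.
data Backend = Backend
  { backendName :: String
  , batchHint   :: Maybe Int -- ^ Nothing: the backend steps row by row
  }

-- | Option A: every call site passes its own batch size, even for
-- backends that cannot make use of it.
fetchPlanA :: Int -> Backend -> Int
fetchPlanA callSiteBatch _backend = max 1 callSiteBatch

-- | Option B: the batch size lives in the backend; callers never see
-- it, and backends without batching simply report 'Nothing'.
fetchPlanB :: Backend -> Int
fetchPlanB = maybe 1 (max 1) . batchHint

-- Two example backends; 256 is an arbitrary illustrative value.
sqliteLike, postgresLike :: Backend
sqliteLike   = Backend "sqlite-like" Nothing
postgresLike = Backend "postgres-like" (Just 256)
```

With option B, user code stays free of integer constants and a row-by-row backend needs no special casing; with option A, each query can tune its own batch size at the cost of carrying that argument everywhere.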
If we make sure the default batch size is somewhat sane, perhaps we could remove the old backend API (i.e. the one that fetches all results at once) and have the frontend wrap the streaming API? I'm not convinced it makes sense to maintain both? That would also give us some testing for free.
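As a rough illustration of wrapping the streaming API to recover the fetch-everything behaviour: if the callback returns each chunk unchanged, the list `Monoid` concatenates the chunks into the full result. `collectAll` and `toyRunner` are hypothetical names for this sketch, and the runner shape is an assumption rather than the actual backend interface.

```haskell
-- | Hypothetical streaming runner over pre-chunked rows: run the
-- callback on every chunk and combine the results monoidally.
toyRunner :: (Applicative m, Monoid r) => [[a]] -> ([a] -> m r) -> m r
toyRunner chunks callback = mconcat <$> traverse callback chunks

-- | Fetch-all expressed through a streaming runner: the callback is
-- 'pure', so each chunk is returned as-is and '<>' on lists appends them.
collectAll :: Applicative m => (([a] -> m [a]) -> m [a]) -> m [a]
collectAll runStreaming = runStreaming pure
```

For example, `collectAll (toyRunner [[1, 2], [3], [4, 5]])` yields the rows in their original order as a single list, so existing list-returning frontend functions could be kept while the backend exposes only the streaming entry point.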
For sqlite I think this might be the case; even normally the results are fetched one by one. For libpq this actually makes a difference at runtime (the calls are different and the database may hold stuff for longer). I'll try to ask if there's anything that can get broken by chunking every time, and will report if I find anything.
Provided the `streaming` package is installed, the utility is demonstrated with the following example, adapted from the tutorial:

    {-# LANGUAGE DeriveGeneric, OverloadedStrings, OverloadedLabels #-}
    module Lib where

    import Database.Selda
    import Database.Selda.SQLite
    import qualified Streaming.Prelude as S

    data Pet = Dog | Horse | Dragon
      deriving (Show, Read, Bounded, Enum)
    instance SqlType Pet

    data Person = Person
      { name :: Text
      , age :: Int
      , pet :: Maybe Pet
      } deriving (Generic, Show)
    instance SqlRow Person

    people :: Table Person
    people = table "people" [#name :- primary]

    prep = do
      createTable people
      insert_ people
        [ Person "Velvet" 19 (Just Dog)
        , Person "Kobayashi" 23 (Just Dragon)
        , Person "Miyu" 10 Nothing
        ]
      let q = do
            person <- select people
            restrict (person ! #age .>= 18)
            return (person ! #name :*: person ! #pet)
      return q

    -- since our stream is Monoid of (), we can use print
    main :: IO ()
    main = withSQLite "people.sqlite" $ do
      q <- prep
      forQuery q $ liftIO . print

    -- instead of print, now, we use the singleton function of our streaming library,
    -- in our example streaming it is `S.yield`
    main2 :: IO ()
    main2 = withSQLite "people.sqlite" $ do
      q <- prep
      a <- forQuery q $ pure . S.yield
      S.print a

    -- since we can use Monoids to construct a list, this is how we get back to the "usual" list
    main3 :: IO ()
    main3 = withSQLite "people.sqlite" $ do
      q <- prep
      a <- forQuery q $ pure . (:[])
      liftIO $ print a