[ENG-6284] render tsv/csv #834
2503bce to 47d2150
move SKIPPABLE_COLUMNS into osfmap
47d2150 to a81abf7
b75d4e0 to ce20fc4
mfraezz left a comment
A couple nits or questions, but nothing blocking. Tests look sufficient, but behavior should still be confirmed manually on staging.
Pass complete
trove/trovesearch/page_cursor.py
MAX_OFFSET = 9997

DEFAULT_PAGE_SIZE = 13
MAX_PAGE_SIZE = 10000
Minor: Is this maximum reasonable? Looks like it was previously 101.
Edit: I see the commit message called it "absurd," but I'm guessing it's also "justified for the sake of rendering files"?
yeah the need here is downloading all results in one response, but i hesitated to make that behavior automagic by mediatype... considered making withFileName obviate pagination whenever present, but overall i opted for consistent query param behavior, putting the onus on the client to string together all the params needed for the desired result (e.g. acceptMediatype=text/csv&page[size]=10000&withFileName=my-file-name for a full csv download with up to 10000 rows)
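A minimal sketch of how a client might string those params together. The base url here is hypothetical (the real host and path are not given in this thread); only the three query param names and values come from the comment above.

```python
from urllib.parse import urlencode

# hypothetical base url, for illustration only
BASE_URL = "https://trove.example/trove/index-card-search"


def build_csv_download_url(file_name: str, page_size: int = 10000) -> str:
    """Build a search url asking for a full csv download in one response."""
    params = {
        "acceptMediatype": "text/csv",   # render results as comma-separated values
        "page[size]": str(page_size),    # ask for up to `page_size` rows at once
        "withFileName": file_name,       # ask for a Content-Disposition: attachment response
    }
    # urlencode percent-encodes reserved characters such as "[" and "/"
    return f"{BASE_URL}?{urlencode(params)}"


url = build_csv_download_url("my-file-name")
```

The brackets in page[size] and the slash in text/csv come out percent-encoded, which a query-string parser on the server side decodes back to the original names and values.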
if 10000 at once turns out to be unreasonable in practice... a more complicated (but less costly all-at-once) alternative might be view logic that queries/renders smaller pages one at a time and streams the results
update: now streams, loading only one page (~100 rows) at a time, but streaming more than ~4000 items total still times out -- can further optimize or we can talk about increasing those timeouts for responses that are actively sending data...
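For illustration, the page-at-a-time streaming idea could look roughly like the sketch below. This is not the code in this PR: fetch_page is a hypothetical callable standing in for whatever queries one page of results, and the function names are made up. Only the stdlib csv module is used.

```python
import csv
import io
from typing import Callable, Iterator, List, Sequence

Row = Sequence[str]
# hypothetical: fetch_page(offset, limit) returns up to `limit` rows starting at `offset`
FetchPage = Callable[[int, int], List[Row]]


def _render_rows(rows: List[Row]) -> str:
    """Render a batch of rows as a chunk of csv text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def stream_csv(
    header: Row,
    fetch_page: FetchPage,
    page_size: int = 100,
    max_rows: int = 10000,
) -> Iterator[str]:
    """Yield csv text one page at a time, holding only one page in memory."""
    yield _render_rows([header])
    offset = 0
    while offset < max_rows:
        limit = min(page_size, max_rows - offset)
        page = fetch_page(offset, limit)
        if not page:
            break  # no results left
        yield _render_rows(page)
        if len(page) < limit:
            break  # a short page means this was the last one
        offset += len(page)
```

An iterator like this can be handed to django's StreamingHttpResponse, which sends each chunk as it is produced rather than building the whole body first. Whether that avoids the timeouts mentioned above depends on server configuration, which this sketch does not address.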
We might not run into those same timeouts for ~4k items with production resourcing (or configuration -- unsure where you got that figure, but by default most nginx timeouts are between successive operations rather than the whole response), but I suspect it's fine for now and we can reevaluate if encountering that issue later.
CardsearchResponse => CardsearchHandle ValuesearchResponse => ValuesearchHandle
88e566a to f3def1e

allow rendering search responses as lines of tab-separated or comma-separated values
main point:
- simple_tsv and simple_csv renderers in trove.render
- acceptMediatype=text/tab-separated-values or acceptMediatype=text/csv
- DEFAULT_TABULAR_SEARCH_COLUMN_PATHS in trove.vocab.osfmap
- withFileName=foo query param to get a response with Content-Disposition: attachment and a filename based on "foo"

changes made along the way:
- ProtoRendering as renderer output type, to better decouple rendering from view logic
- StreamableRendering for responses that might could be streamed, like csv/tsv (tho it's not currently handled any differently from SimpleRendering)
- BaseRenderer (and each existing renderer) to have a consistent call signature (and return ProtoRendering)
- trove.render.get_renderer with trove.render.get_renderer_type -- instantiate the renderer with response data
- trove.views._responder with common logic for building a django HttpResponse for a ProtoRendering
- withFileName/Content-Disposition
- trove.vocab.osfmap for easier reuse
- trove.render.simple_json into trove.render._simple_trovesearch (for renderers that include only the list of search results)
- tests.trove.derive._base into tests.trove._input_output_tests (for tests following the same simple input/output pattern as deriver and renderer tests)
- tests.trove.render to cover the new renderers simple_tsv and simple_csv, as well as the existing renderers jsonapi, simple_json, jsonld, and turtle
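
As a rough illustration of the withFileName behavior described above, a Content-Disposition header value could be built along these lines. This is a sketch under assumptions, not the logic in trove.views._responder: the function name, the extension mapping, and the sanitizing policy are all made up for the example.

```python
import re

# assumed mapping from mediatype to file extension, for illustration only
_EXTENSIONS = {
    "text/csv": "csv",
    "text/tab-separated-values": "tsv",
}


def content_disposition(with_file_name: str, mediatype: str) -> str:
    """Build an attachment header value with a filename based on the given name."""
    # assumed policy: collapse anything outside a conservative character set to "-"
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "-", with_file_name).strip("-") or "download"
    extension = _EXTENSIONS.get(mediatype)
    file_name = f"{safe_name}.{extension}" if extension else safe_name
    return f'attachment; filename="{file_name}"'
```

With this sketch, withFileName=foo and acceptMediatype=text/csv would give a header value of attachment; filename="foo.csv".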