Skip to content

Streaming mode (preferably in a binary format) for rest apiΒ #8831

@marcintustin

Description

@marcintustin

Is your feature request related to a problem? Please describe.
I use cube to accelerate an api and related web application. The rest api is a very convenient api to program against, and I like keeping my sql in cube, rather than generating sql for cube to consume. However, there is no streaming mode, and I need to parse json to extract my results and re-serve them to my own clients, both of which add latency. Web applications and apis seem better to users when their latency to first result is lower.

Describe the solution you'd like
I would like the rest api to support streaming data back, with totalcount (if requested) returned as a header, and preferably in a format that minimizes the effort spent on parsing, for example Arrow IPC. https://arrow.apache.org/docs/format/Columnar.html#format-ipc . I specifically mention Arrow IPC because I believe that is the native format inside cube before conversion to json.

Describe alternatives you've considered

  • An official library that has the same interface as the rest api that can generate the sql to drive the streaming mode of the sql api could also work, I think.
  • A pseudo-sql api that takes in the json query format of the rest api, and returns streaming data back in postgres format would also work.

Additional context
An example of how I structure cube definitions:

cube(`AlphaOneBase`, {
    sql: `SELECT
    bb.FoxtrotSix,
    bb.TangoFortySix,
    bb.XrayTwentyFour,
    bb.GolfThirtyThree,
    bb.MikeThirtyNine,
    bb.OscarFortyOne,
    bb.PapaFortyTwo,
    bb.PapaSixteen,
    bb.OscarFifteen,
    bb.WhiskeyTwentyThree,
    bb.DeltaThirty,
    bb.VictorTwentyTwo,
    bb.ZuluTwentySix,
    bb.SierraNineteen,
    bb.HotelEight,
    bb.UniformFortySeven,
    bb.KiloThirtySeven,
    bb.SierraFortyFive,
    em.EchoFive,
    p.name,
    p.LimaTwelve
    ,current_timestamp as aggupdatedts
    FROM public.CharlieThree AS bb
    JOIN IndiaThirtyFive AS pcd ON pcd.bv_entity_id = bb.FoxtrotSix AND pcd.project_id = bb.TangoFortySix
    JOIN JulietThirtySix AS p ON p.id = pcd.GolfThirtyThreefolio_id AND p.GolfThirtyThreefolio_type = 'normal'
    JOIN QuebecSeventeen em ON em.FoxtrotSix = bb.FoxtrotSix`,
    measures:{
        JulietTen: {
            type: `countDistinct`,
            sql: `XrayTwentyFour`,
        }
    },
    dimensions: {
        DeltaFour: {
            sql: `FoxtrotSix`,
            type: `string`
        },
        LimaThirtyEight: {
            sql: `TangoFortySix`,
            type: `number`
        },
        XrayTwentyFour: {
            sql: `XrayTwentyFour`,
            type: `string`
        },
        GolfThirtyThree: {
            sql: `GolfThirtyThree`,
            type: `number`
        },
        MikeThirtyNine: {
            sql: `MikeThirtyNine`,
            type: `string`
        },
        OscarFortyOne: {
            sql: `OscarFortyOne`,
            type: `string`
        },
        NovemberForty: {
            sql: `PapaFortyTwo`,
            type: `string`
        },
        NovemberFourteen: {
            sql: `PapaSixteen`,
            type: `string`
        },
        deviceName: {
            sql: `OscarFifteen`,
            type: `string`
        },
        UniformTwentyOne: {
            sql: `WhiskeyTwentyThree`,
            type: `string`
        },
        CharlieTwentyNine: {
            sql: `DeltaThirty`,
            type: `string`
        },
        TangoTwenty: {
            sql: `VictorTwentyTwo`,
            type: `string`
        },
        YankeeTwentyFive: {
            sql: `ZuluTwentySix`,
            type: `time`
        },
        RomeoEighteen: {
            sql: `SierraNineteen`,
            type: `number`
        },
        HotelEight: {
            sql: `HotelEight`,
            type: `string`
        },
        UniformFortySeven: {
            sql: `UniformFortySeven`,
            type: `string`
        },
        KiloThirtySeven: {
            sql: `KiloThirtySeven`,
            type: `string`
        },
        RomeoFortyFour: {
            sql: `SierraFortyFive`,
            type: `time`
        },
        IndiaNine: {
            sql: `EchoFive`,
            type: `string`
        },
        HotelThirtyFour: {
            sql: `name`,
            type: `string`
        },
        KiloEleven: {
            sql: `LimaTwelve`,
            type: `string`
        }
    },
  segments: {
    AlphaTwentySeven: {
      sql: `HotelEight = 'Other' and UniformFortySeven = 'Generic' and KiloThirtySeven = 'HamSlice'`,
    }
  },
  sqlAlias: `bb`,
});

cube(`AlphaOne`, {
  extends: AlphaOneBase,
  dataSource: "postgresdb",
  preAggregations: {
    orig: {
      type: "original_sql",
      refreshKey: {
                sql: `SELECT MAX(max_ts)
                      FROM (
                         SELECT MAX(SierraFortyFive) AS max_ts FROM public.CharlieThree
                         UNION
                         SELECT MAX(IndiaThirtyFive.SierraFortyFive) AS max_ts FROM IndiaThirtyFive
                         UNION
                         SELECT MAX(SierraFortyFive) AS max_ts FROM JulietThirtySix
                      ) AS t`,
                every: `10 minute`
            }
    },
    GolfSeven: {
      time_dimension:  CUBE.YankeeTwentyFive,
      granularity: `day`,
      partition_granularity: `day`,
      use_original_sql_pre_aggregations: true,
      dimensions: [
                CUBE.DeltaFour,
                CUBE.LimaThirtyEight, 
                CUBE.XrayTwentyFour, 
                CUBE.GolfThirtyThree,  
                CUBE.MikeThirtyNine, 
                CUBE.OscarFortyOne, 
                CUBE.NovemberForty, 
                CUBE.NovemberFourteen, 
                CUBE.deviceName, 
                CUBE.UniformTwentyOne, 
                CUBE.CharlieTwentyNine,  
                CUBE.TangoTwenty, 
                CUBE.RomeoEighteen, 
                CUBE.HotelEight, 
                CUBE.UniformFortySeven, 
                CUBE.KiloThirtySeven,
                CUBE.IndiaNine,
                CUBE.HotelThirtyFour,
              CUBE.KiloEleven,
              CUBE.RomeoFortyFour],
      refreshKey: {
                sql: `SELECT aggupdatedts from postgres_pre_aggregations.bb_orig limit 1`,
                every: `10 minute`
            }
      },
      BravoTwentyEight: {
      use_original_sql_pre_aggregations: true,
        dimensions: [CUBE.LimaThirtyEight],
        measures: [CUBE.JulietTen],
	segments: [CUBE.AlphaTwentySeven],
        refreshKey: {
            sql: `SELECT aggupdatedts from postgres_pre_aggregations.bb_orig limit 1`,
            every: `10 minute`,
        }
      }
    }
});

The cube file contains two cubes, AlphaOneBase, and AlphaOne. AlphaOneBase has no preaggregations. It is laid out this way because when a query matches against a preaggregation that has not yet been built, cube will return continue wait until the preaggregation has been built. This cube is part of a web portal that is expected to be available continuously; accordingly we maintain the ability to query the database via cube without any preaggs getting in the way by having the base version. Obviously, that path is slower when preaggs are built, but faster while preaggs are building.

Metadata

Metadata

Assignees

No one assigned

    Labels

    api:restIssues related to REST API

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions