Commit 073db5f

Merge branch 'master' into autostart-engine
2 parents 3238d23 + 4b85612 commit 073db5f

88 files changed: +2959 -1449 lines changed

docs/pages/product/apis-integrations/ai-api.mdx

Lines changed: 80 additions & 0 deletions
@@ -138,3 +138,83 @@ One way of handling this is to pass the error message back into the AI API; it m
 #### 3. Continue wait
 
 When using `"runQuery": true`, you might sometimes receive a query result containing `{ "error": "Continue wait" }`. If this happens, you should use `/load` ([described above](#2-load)) instead of `runQuery` to run the query, and handle retries as described in the [REST API documentation](/product/apis-integrations/rest-api#continue-wait).
+
+## Advanced Usage
+
+<InfoBox>
+The advanced features discussed here are available on Cube version 1.1.7 and above.
+</InfoBox>
+
+### Custom prompts
+
+You can prompt the AI API with custom instructions. For example, you may want it to always
+respond in a particular language, or to refer to itself by a name matching your brand.
+Custom prompts also allow you to give the model more context on your company and data model,
+for example if it should usually prefer a particular view.
+
+To use a custom prompt, set the `CUBE_CLOUD_AI_API_PROMPT` environment variable in your deployment.
+
+<InfoBox>
+Custom prompts add to, rather than overwrite, the AI API's existing prompting, so you
+do not need to rewrite instructions around how to generate the query itself.
+</InfoBox>
+
+### Meta tags
+
+The AI API can read [meta tags](/reference/data-model/view#meta) on your dimensions, measures,
+segments, and views.
+
+Use the `ai` meta tag to give context that is specific to AI and goes beyond what is
+included in the description. This can have any keys that you want. For example, you can use it
+to give the AI context on possible values in a categorical dimension:
+```yaml
+- name: status
+  sql: status
+  type: string
+  meta:
+    ai:
+      values:
+        - shipped
+        - processing
+        - completed
+```
+
+### Other LLM providers
+
+<InfoBox>
+These environment variables also apply to the [AI Assistant](/product/workspace/ai-assistant),
+if it is enabled on your deployment.
+</InfoBox>
+
+If desired, you may "bring your own" LLM by providing a model and API credentials
+for a supported model provider. Do this by setting environment variables in your Cube
+deployment. The variables for each provider are listed below (all are required unless noted):
+
+#### AWS Bedrock
+
+<WarningBox>
+The AI API currently supports only Anthropic Claude models on AWS Bedrock. Other
+models may work but are not fully supported.
+</WarningBox>
+
+- `CUBE_BEDROCK_MODEL_ID` - A supported [AWS Bedrock chat model](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html), for example `anthropic.claude-3-5-sonnet-20241022-v2:0`
+- `CUBE_BEDROCK_ACCESS_KEY` - An access key for an IAM user with `InvokeModelWithResponseStream` permissions on the desired region/model.
+- `CUBE_BEDROCK_ACCESS_SECRET` - The corresponding access secret
+- `CUBE_BEDROCK_REGION_ID` - A supported AWS Bedrock region, for example `us-west-2`
+
+#### GCP Vertex
+
+<WarningBox>
+The AI API currently supports only Anthropic Claude models on GCP Vertex. Other
+models may work but are not fully supported.
+</WarningBox>
+
+- `CUBE_VERTEX_MODEL_ID` - A supported GCP Vertex chat model, for example `claude-3-5-sonnet@20240620`
+- `CUBE_VERTEX_PROJECT_ID` - The GCP project the model is deployed in
+- `CUBE_VERTEX_REGION` - The GCP region the model is deployed in, for example `us-east5`
+- `CUBE_VERTEX_CREDENTIALS` - The private key for a service account with permissions to run the chosen model
+
+#### OpenAI
+
+- `OPENAI_MODEL` - An OpenAI chat model ID, for example `gpt-4o`
+- `OPENAI_API_KEY` - An OpenAI API key (we recommend creating a service account for the AI API)
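
Note on the provider variables above: the following is a minimal sketch of a pre-deployment check that reports which of the documented variables are still unset for a chosen provider. The `Provider` type, `REQUIRED_VARS` table, and `missingVars` helper are hypothetical names introduced for illustration; they are not part of Cube. Only the variable names come from the documentation in this diff.

```typescript
// Hypothetical helper: the variable names come from the docs above,
// everything else (types, function name) is an assumption for illustration.
type Provider = "bedrock" | "vertex" | "openai";

const REQUIRED_VARS: Record<Provider, string[]> = {
  bedrock: [
    "CUBE_BEDROCK_MODEL_ID",
    "CUBE_BEDROCK_ACCESS_KEY",
    "CUBE_BEDROCK_ACCESS_SECRET",
    "CUBE_BEDROCK_REGION_ID",
  ],
  vertex: [
    "CUBE_VERTEX_MODEL_ID",
    "CUBE_VERTEX_PROJECT_ID",
    "CUBE_VERTEX_REGION",
    "CUBE_VERTEX_CREDENTIALS",
  ],
  openai: ["OPENAI_MODEL", "OPENAI_API_KEY"],
};

// Returns the names of variables that are absent or empty in `env`.
function missingVars(
  provider: Provider,
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_VARS[provider].filter((name) => !env[name]);
}

const missingForOpenAi = missingVars("openai", { OPENAI_MODEL: "gpt-4o" });
```

In a Node.js script the second argument could be `process.env`; an empty result would mean every documented variable for that provider is present.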

docs/pages/product/caching/running-in-production.mdx

Lines changed: 22 additions & 0 deletions
@@ -325,6 +325,28 @@ Refresh worker should be able to finish pre-aggregation refresh before
 garbage collection starts. It means that all pre-aggregation partitions
 should be built before any tables are removed.
 
+#### Supported file systems
+
+The garbage collection mechanism relies on the ability of the underlying file
+system to report the creation time of a file.
+
+If the file system does not support getting the creation time, you will see the
+following error message in Cube Store logs:
+
+```
+ERROR [cubestore::remotefs::cleanup] <pid:1>
+error while getting created time for file "<name>.chunk.parquet":
+creation time is not available for the filesystem
+```
+
+<ReferenceBox>
+
+XFS is known to not support getting the creation time of a file.
+Please see [this issue](https://github.com/cube-js/cube/issues/7905#issuecomment-2504212623)
+for possible workarounds.
+
+</ReferenceBox>
+
 ## Security
 
 ### Authentication

docs/pages/product/workspace/ai-assistant.mdx

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,49 @@ The query will automatically run in the sidebar and can be opened in the [Playgr
5858

5959
<Screenshot src="https://ucarecdn.com/4249ff1e-fae1-42c8-ad3a-b9e406ea2022/Screenshot20240624at34327PM.png" />
6060

61+
## Advanced Usage
62+
63+
<InfoBox>
64+
The advanced features discussed here are available on Cube version 1.1.7 and above.
65+
</InfoBox>
66+
67+
### Custom prompts
68+
69+
You can prompt the AI Assistant with custom instructions. For example, you may want it to always
70+
respond in a particular language, or to refer to itself by a name matching your brand.
71+
Custom prompts also allow you to give the model more context on your company and data model,
72+
for example if it should usually prefer a particular view.
73+
74+
To use a custom prompt, set the `CUBE_CLOUD_AI_ASSISTANT_PROMPT` environment variable in your deployment.
75+
76+
<InfoBox>
77+
Custom prompts add to, rather than overwrite, the AI Assistant's existing prompting.
78+
</InfoBox>
79+
80+
### Meta tags
81+
82+
The AI Assistant can read [meta tags](/reference/data-model/view#meta) on your dimensions, measures,
83+
segments, and views.
84+
85+
Use the `ai` meta tag to give context that is specific to AI and goes beyond what is
86+
included in the description. This can have any keys that you want. For example, you can use it
87+
to give the AI context on possible values in a categorical dimension:
88+
```yaml
89+
- name: status
90+
sql: status
91+
type: string
92+
meta:
93+
ai:
94+
values:
95+
- shipped
96+
- processing
97+
- completed
98+
```
99+
100+
### Other LLM providers
101+
102+
See the [AI API's documentation][ref-ai-api-providers] for information on how to "bring your own" LLM.
103+
61104
## FAQ and limitations
62105
63106
### 1. What language model(s) does the AI Assistant use?
@@ -83,3 +126,4 @@ The query will automatically run in the sidebar and can be opened in the [Playgr
83126
[ref-catalog]: /product/workspace/semantic-catalog
84127
[ref-playground]: /product/workspace/playground
85128
[ref-catalog-downstream]: /product/workspace/semantic-catalog#connecting-downstream-tools
129+
[ref-ai-api-providers]: /product/apis-integrations/ai-api#other-llm-providers

docs/pages/reference/ai-api.mdx

Lines changed: 11 additions & 1 deletion
@@ -11,11 +11,21 @@ Generate a Cube query that can be used to answer a user's question, and (optiona
 | `messages` | ✅ Yes | An array of messages in the format: `{ "role": "user" \| "assistant", "content": "string" }` |
 | `views` | | An array of view names (used to limit the views that the AI API can use to generate its answer) |
 | `runQuery` | | Boolean (true or false) whether to run the query and return its results |
+| `options` | | An object in the format `{ "chart": true \| false }` |
 
 Response
 
 - `message` - A message from the AI assistant describing the query, how it was chosen, why it could not generate the requested query, etc.
 - `cube_query` - A Cube [Query](/product/apis-integrations/rest-api/query-format) that could be used to answer the given question
+- `chart` - If the `chart` option is set to `true`, an object containing a chart spec for the generated query in the following format:
+```json
+{
+  "type": "bar" | "line" | "pie" | "table" | "area" | "scatter",
+  "x": string,
+  "y": string[],
+  "pivot": string // optional; the field to pivot by, if any
+}
+```
 
 ### Examples
 
@@ -28,7 +38,7 @@ curl \
   -X POST \
   -H "Content-Type: application/json" \
   -H "Authorization: EXAMPLE-API-TOKEN" \
-  --data '{ "messages": [{ "role": "user", "content": "What cities have the highest aov this year?" }]}' \
+  --data '{ "messages": [{ "role": "user", "content": "What cities have the highest aov this year?" }], "views": ["orders_view"] }' \
   https://YOUR_CUBE_API/cubejs-api/v1/ai/query/completions
 ```
 
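
Note on the request and response shapes above: the following is a minimal sketch, assuming the parameter table and chart spec in this file are complete, of how a client might type the request body and validate a returned `chart` object. The interface names and the `isChartSpec` guard are hypothetical and not part of any Cube SDK. `views` and `options` are placed at the top level of the body, next to `messages`, as the parameter table lists them.

```typescript
// Hypothetical client-side types derived from the parameter table and chart spec above.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface QueryRequest {
  messages: Message[];
  views?: string[];
  runQuery?: boolean;
  options?: { chart: boolean };
}

type ChartType = "bar" | "line" | "pie" | "table" | "area" | "scatter";

interface ChartSpec {
  type: ChartType;
  x: string;
  y: string[];
  pivot?: string; // optional; the field to pivot by, if any
}

const CHART_TYPES: string[] = ["bar", "line", "pie", "table", "area", "scatter"];

// Runtime check that an unknown value matches the documented chart spec.
function isChartSpec(value: unknown): value is ChartSpec {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const type = v["type"];
  const y = v["y"];
  const pivot = v["pivot"];
  return (
    typeof type === "string" &&
    CHART_TYPES.indexOf(type) !== -1 &&
    typeof v["x"] === "string" &&
    Array.isArray(y) &&
    y.every((item: unknown) => typeof item === "string") &&
    (pivot === undefined || typeof pivot === "string")
  );
}

const requestBody: QueryRequest = {
  messages: [{ role: "user", content: "What cities have the highest aov this year?" }],
  views: ["orders_view"],
  options: { chart: true },
};

// String suitable for the `--data` argument or a fetch() body.
const payload = JSON.stringify(requestBody);
```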

docs/pages/reference/data-model/pre-aggregations.mdx

Lines changed: 7 additions & 0 deletions
@@ -1635,6 +1635,13 @@ cubes:
 
 </CodeTabs>
 
+<ReferenceBox>
+
+In some cases, indexes might not work with `original_sql` pre-aggregations.
+Please [track this issue](https://github.com/cube-js/cube/issues/7420).
+
+</ReferenceBox>
+
 #### `type`
 
 This option is used to define [aggregating indexes][ref-aggregating-indexes]

packages/cubejs-databricks-jdbc-driver/src/DatabricksQuery.ts

Lines changed: 2 additions & 0 deletions
@@ -186,6 +186,7 @@ export class DatabricksQuery extends BaseQuery {
 
   public sqlTemplates() {
     const templates = super.sqlTemplates();
+    templates.functions.CURRENTDATE = 'CURRENT_DATE';
     templates.functions.DATETRUNC = 'DATE_TRUNC({{ args_concat }})';
     templates.functions.DATEPART = 'DATE_PART({{ args_concat }})';
     templates.functions.BTRIM = 'TRIM({% if args[1] is defined %}{{ args[1] }} FROM {% endif %}{{ args[0] }})';
@@ -197,6 +198,7 @@ export class DatabricksQuery extends BaseQuery {
     templates.functions.TRUNC = 'CASE WHEN ({{ args[0] }}) >= 0 THEN FLOOR({{ args_concat }}) ELSE CEIL({{ args_concat }}) END';
     templates.expressions.timestamp_literal = 'from_utc_timestamp(\'{{ value }}\', \'UTC\')';
     templates.expressions.extract = 'EXTRACT({{ date_part }} FROM {{ expr }})';
+    templates.expressions.interval_single_date_part = 'INTERVAL \'{{ num }}\' {{ date_part }}';
     templates.quotes.identifiers = '`';
     templates.quotes.escape = '``';
     // TODO: Databricks has `TIMESTAMP_NTZ` with logic similar to Pg's `TIMESTAMP`
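
Note on the `interval_single_date_part` template above: the following is a rough sketch of what the template produces once its two variables are filled in. The `renderPlaceholders` function is a hypothetical stand-in that only substitutes `{{ name }}` placeholders; it is not the template engine Cube uses and does not handle `{% if %}` blocks or filters.

```typescript
// Hypothetical stand-in for a template engine: replaces {{ name }} placeholders only.
function renderPlaceholders(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    const value = vars[name];
    if (value === undefined) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return value;
  });
}

// Same template string as the line added to DatabricksQuery.ts.
const intervalTemplate = "INTERVAL '{{ num }}' {{ date_part }}";

const renderedInterval = renderPlaceholders(intervalTemplate, {
  num: "3",
  date_part: "DAY",
});
```

With `num` set to `3` and `date_part` set to `DAY`, the result is the string `INTERVAL '3' DAY`.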

packages/cubejs-ksql-driver/src/KsqlDriver.ts

Lines changed: 3 additions & 1 deletion
@@ -131,7 +131,9 @@ export class KsqlDriver extends BaseDriver implements DriverInterface {
     if (this.config.kafkaHost) {
       this.kafkaClient = new Kafka({
         clientId: 'Cube',
-        brokers: [this.config.kafkaHost],
+        brokers: this.config.kafkaHost
+          .split(',')
+          .map(h => h.trim()),
         // authenticationTimeout: 10000,
         // reauthenticationThreshold: 10000,
         ssl: this.config.kafkaUseSsl,
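
Note on the `brokers` change above: the following is a minimal sketch of the same split-and-trim logic as a standalone function, showing how a comma-separated `kafkaHost` value becomes a broker list. The function name `parseBrokerList` and the host names are hypothetical. The final `filter` step is an extra guard against empty entries (for example from a trailing comma) and is not part of the diff.

```typescript
// Splits a comma-separated host string into individual broker addresses.
function parseBrokerList(kafkaHost: string): string[] {
  return kafkaHost
    .split(",")
    .map((h) => h.trim())
    // Extra guard, NOT in the diff: drop empty entries such as a trailing comma.
    .filter((h) => h.length > 0);
}

const brokerList = parseBrokerList("broker-1:9092, broker-2:9092 ,broker-3:9092");
```

A single host without commas still yields a one-element array, so existing single-broker configurations keep working.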

packages/cubejs-schema-compiler/package.json

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@
     "uuid": "^8.3.2"
   },
   "devDependencies": {
-    "@cubejs-backend/apla-clickhouse": "^1.7.0",
+    "@clickhouse/client": "^1.7.0",
     "@cubejs-backend/linter": "^1.0.0",
     "@cubejs-backend/query-orchestrator": "1.1.7",
     "@types/babel__code-frame": "^7.0.6",

packages/cubejs-schema-compiler/src/adapter/BaseQuery.js

Lines changed: 23 additions & 7 deletions
@@ -577,9 +577,9 @@ export class BaseQuery {
   }
 
   /**
-   * Returns an array of SQL query strings for the query.
+   * Returns a pair of SQL query string and parameter values for the query.
    * @param {boolean} [exportAnnotatedSql] - returns annotated sql with not rendered params if true
-   * @returns {Array<string>}
+   * @returns {[string, Array<unknown>]}
    */
   buildSqlAndParams(exportAnnotatedSql) {
     if (getEnv('nativeSqlPlanner')) {
@@ -1521,13 +1521,13 @@ export class BaseQuery {
         this.queryCache
       );
       if (m.expressionName && !collectedMeasures.length && !m.isMemberExpression) {
-        throw new UserError(`Subquery dimension ${m.expressionName} should reference at least one measure`);
+        throw new UserError(`Subquery measure ${m.expressionName} should reference at least one member`);
       }
       if (!collectedMeasures.length && m.isMemberExpression && m.query.allCubeNames.length > 1 && m.measureSql() === 'COUNT(*)') {
         const cubeName = m.expressionCubeName ? `\`${m.expressionCubeName}\` ` : '';
         throw new UserError(`The query contains \`COUNT(*)\` expression but cube/view ${cubeName}is missing \`count\` measure`);
       }
-      return [m.measure, collectedMeasures];
+      return [typeof m.measure === 'string' ? m.measure : `${m.measure.cubeName}.${m.measure.name}`, collectedMeasures];
     }));
   }
 
@@ -3214,24 +3214,35 @@ export class BaseQuery {
         DATE: 'DATE({{ args_concat }})',
       },
       statements: {
-        select: 'SELECT {% if distinct %}DISTINCT {% endif %}' +
+        select: '{% if ctes %} WITH \n' +
+          '{{ ctes | join(\',\n\') }}\n' +
+          '{% endif %}' +
+          'SELECT {% if distinct %}DISTINCT {% endif %}' +
           '{{ select_concat | map(attribute=\'aliased\') | join(\', \') }} {% if from %}\n' +
           'FROM (\n' +
           '{{ from | indent(2, true) }}\n' +
-          ') AS {{ from_alias }}{% endif %}' +
+          ') AS {{ from_alias }}{% elif from_prepared %}\n' +
+          'FROM {{ from_prepared }}' +
+          '{% endif %}' +
           '{% if filter %}\nWHERE {{ filter }}{% endif %}' +
           '{% if group_by %}\nGROUP BY {{ group_by }}{% endif %}' +
+          '{% if having %}\nHAVING {{ having }}{% endif %}' +
           '{% if order_by %}\nORDER BY {{ order_by | map(attribute=\'expr\') | join(\', \') }}{% endif %}' +
           '{% if limit is not none %}\nLIMIT {{ limit }}{% endif %}' +
           '{% if offset is not none %}\nOFFSET {{ offset }}{% endif %}',
         group_by_exprs: '{{ group_by | map(attribute=\'index\') | join(\', \') }}',
+        join: '{{ join_type }} JOIN {{ source }} ON {{ condition }}',
+        cte: '{{ alias }} AS ({{ query | indent(2, true) }})'
       },
       expressions: {
+        column_reference: '{% if table_name %}{{ table_name }}.{% endif %}{{ name }}',
         column_aliased: '{{expr}} {{quoted_alias}}',
+        query_aliased: '{{ query }} AS {{ quoted_alias }}',
         case: 'CASE{% if expr %} {{ expr }}{% endif %}{% for when, then in when_then %} WHEN {{ when }} THEN {{ then }}{% endfor %}{% if else_expr %} ELSE {{ else_expr }}{% endif %} END',
         is_null: '{{ expr }} IS {% if negate %}NOT {% endif %}NULL',
         binary: '({{ left }} {{ op }} {{ right }})',
         sort: '{{ expr }} {% if asc %}ASC{% else %}DESC{% endif %} NULLS {% if nulls_first %}FIRST{% else %}LAST{% endif %}',
+        order_by: '{% if index %} {{ index }} {% else %} {{ expr }} {% endif %} {% if asc %}ASC{% else %}DESC{% endif %}{% if nulls_first %} NULLS FIRST{% endif %}',
         cast: 'CAST({{ expr }} AS {{ data_type }})',
         window_function: '{{ fun_call }} OVER ({% if partition_by_concat %}PARTITION BY {{ partition_by_concat }}{% if order_by_concat or window_frame %} {% endif %}{% endif %}{% if order_by_concat %}ORDER BY {{ order_by_concat }}{% if window_frame %} {% endif %}{% endif %}{% if window_frame %}{{ window_frame }}{% endif %})',
         window_frame_bounds: '{{ frame_type }} BETWEEN {{ frame_start }} AND {{ frame_end }}',
@@ -3260,7 +3271,8 @@ export class BaseQuery {
         gt: '{{ column }} > {{ param }}',
         gte: '{{ column }} >= {{ param }}',
         lt: '{{ column }} < {{ param }}',
-        lte: '{{ column }} <= {{ param }}'
+        lte: '{{ column }} <= {{ param }}',
+        always_true: '1 == 1'
 
       },
       quotes: {
@@ -3270,6 +3282,10 @@ export class BaseQuery {
       params: {
         param: '?'
       },
+      join_types: {
+        inner: 'INNER',
+        left: 'LEFT'
+      },
       window_frame_types: {
         rows: 'ROWS',
         range: 'RANGE',
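
Note on the extended `select` template above: the following is a simplified sketch of the clause order the template now emits (optional `WITH` block, `SELECT`, `FROM`, `WHERE`, `GROUP BY`, the new `HAVING`, `ORDER BY`, `LIMIT`). The `SelectParts` interface and both functions are hypothetical and written only to illustrate the ordering; Cube renders the real template strings with its own template engine, and exact whitespace differs. Only the `from_prepared` branch is modeled here, not the subquery `from` branch.

```typescript
// Hypothetical input shape; field names loosely follow the template variables.
interface SelectParts {
  ctes?: string[];
  columns: string[];
  fromPrepared?: string;
  filter?: string;
  groupBy?: string;
  having?: string;
  orderBy?: string[];
  limit?: number;
}

// Mirrors the shape of the `cte` statement template: alias AS (query).
function renderCteSketch(alias: string, query: string): string {
  return `${alias} AS (${query})`;
}

// Assembles clauses in the same order as the `select` statement template.
function renderSelectSketch(p: SelectParts): string {
  const lines: string[] = [];
  if (p.ctes && p.ctes.length > 0) {
    lines.push("WITH");
    lines.push(p.ctes.join(",\n"));
  }
  lines.push(`SELECT ${p.columns.join(", ")}`);
  if (p.fromPrepared) lines.push(`FROM ${p.fromPrepared}`);
  if (p.filter) lines.push(`WHERE ${p.filter}`);
  if (p.groupBy) lines.push(`GROUP BY ${p.groupBy}`);
  if (p.having) lines.push(`HAVING ${p.having}`);
  if (p.orderBy && p.orderBy.length > 0) lines.push(`ORDER BY ${p.orderBy.join(", ")}`);
  if (p.limit !== undefined) lines.push(`LIMIT ${p.limit}`);
  return lines.join("\n");
}

const sketchSql = renderSelectSketch({
  ctes: [renderCteSketch("recent", "SELECT * FROM orders WHERE created_at > '2024-01-01'")],
  columns: ["status", "COUNT(*) AS cnt"],
  fromPrepared: "recent",
  groupBy: "1",
  having: "COUNT(*) > 10",
  orderBy: ["2 DESC"],
  limit: 5,
});
```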
