You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Update root README.md and other documentation with latest changes (#1113)
* Update root `README.md` adding example how to use ...
and cleanup some of the context.
Relates to #1105
* add new architectural diagram
* address code review comments
* address review comments
`SessionConfig::new_with_ballista()` will setup `SessionConfig` for use with ballista. This is not required, `SessionConfig::new` could be used, but it's advised as it will set up some sensible configuration defaults .
44
+
45
+
`SessionConfigExt` expose set of `SessionConfigExt::with_ballista_` and `SessionConfigExt::ballista_` methods which can tune retrieve ballista specific options.
46
+
47
+
Notable `SessionConfigExt` configuration methods would be:
| ballista.repartition.joins | Boolean | true | When set to true, Ballista will repartition data using the join keys to execute joins in parallel using the provided `ballista.shuffle.partitions` level. |
45
-
| ballista.repartition.aggregations | Boolean | true | When set to true, Ballista will repartition data using the aggregate keys to execute aggregates in parallel using the provided `ballista.shuffle.partitions` level. |
46
-
| ballista.repartition.windows | Boolean | true | When set to true, Ballista will repartition data using the partition keys to execute window functions in parallel using the provided `ballista.shuffle.partitions` level. |
47
-
| ballista.parquet.pruning | Boolean | true | Determines whether Parquet pruning should be enabled or not. |
48
-
| ballista.with_information_schema | Boolean | true | Determines whether the `information_schema` should be created in the context. This is necessary for supporting DDL commands such as `SHOW TABLES`. |
49
-
50
-
### DataFusion Configuration Settings
51
-
52
-
In addition to Ballista-specific configuration settings, the following DataFusion settings can also be specified.
| datafusion.execution.coalesce_batches | Boolean | true | When set to true, record batches will be examined between each operator and small batches will be coalesced into larger batches. This is helpful when there are highly selective filters or joins that could produce tiny output batches. The target batch size is determined by the configuration setting 'datafusion.execution.coalesce_target_batch_size'. |
57
-
| datafusion.execution.coalesce_target_batch_size | UInt64 | 4096 | Target batch size when coalescing batches. Uses in conjunction with the configuration setting 'datafusion.execution.coalesce_batches'. |
58
-
| datafusion.explain.logical_plan_only | Boolean | false | When set to true, the explain statement will only print logical plans. |
59
-
| datafusion.explain.physical_plan_only | Boolean | false | When set to true, the explain statement will only print physical plans. |
60
-
| datafusion.optimizer.filter_null_join_keys | Boolean | false | When set to true, the optimizer will insert filters before a join between a nullable and non-nullable column to filter out nulls on the nullable side. This filter can add additional overhead when the file format does not fully support predicate push down. |
61
-
| datafusion.optimizer.skip_failed_rules | Boolean | true | When set to true, the logical plan optimizer will produce warning messages if any optimization rules produce errors and then proceed to the next rule. When set to false, any rules that produce errors will cause the query to fail. |
69
+
which could be used to change default ballista behavior.
70
+
71
+
If information schema is enabled all configuration parameters could be retrieved or set using SQL;
Copy file name to clipboardExpand all lines: docs/source/user-guide/deployment/quick-start.md
+34-26Lines changed: 34 additions & 26 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -19,16 +19,14 @@
19
19
20
20
# Ballista Quickstart
21
21
22
-
A simple way to start a local cluster for testing purposes is to use cargo to build the project and then run the scheduler and executor binaries directly along with the Ballista UI.
22
+
A simple way to start a local cluster for testing purposes is to use cargo to build the project and then run the scheduler and executor binaries directly.
0 commit comments