Skip to content

Commit 2dd9193

Browse files
not-napoleonBogdan Pintea
andauthored
ESQL Javadoc for creating new data types (#117520) (#118096)
This adds some java doc to the DataType enum, listing out the steps I followed for adding DateNanos. Hopefully it's helpful to future folks adding data types. --------- Co-authored-by: Bogdan Pintea <[email protected]>
1 parent d86e6a0 commit 2dd9193

File tree

1 file changed

+107
-0
lines changed
  • x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/type

1 file changed

+107
-0
lines changed

x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/type/DataType.java

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,113 @@
3232
import static org.elasticsearch.xpack.esql.core.util.PlanStreamInput.readCachedStringWithVersionCheck;
3333
import static org.elasticsearch.xpack.esql.core.util.PlanStreamOutput.writeCachedStringWithVersionCheck;
3434

35+
/**
36+
* This enum represents data types the ES|QL query processing layer is able to
37+
* interact with in some way. This includes fully representable types (e.g.
38+
* {@link DataType#LONG}, numeric types which we promote (e.g. {@link DataType#SHORT})
39+
* or fold into other types (e.g. {@link DataType#DATE_PERIOD}) early in the
40+
* processing pipeline, types for internal use
41+
* cases (e.g. {@link DataType#PARTIAL_AGG}), and types which the language
42+
* doesn't support, but require special handling anyway (e.g.
43+
* {@link DataType#OBJECT})
44+
*
45+
* <h2>Process for adding a new data type</h2>
46+
* Note: it is not expected that all the following steps be done in a single PR.
47+
* Use capabilities to gate tests as you go, and use as many PRs as you think
48+
* appropriate. New data types are complex, and smaller PRs will make reviews
49+
* easier.
50+
* <ul>
51+
* <li>
52+
* Create a new feature flag for the type in {@link EsqlCorePlugin}. We
53+
* recommend developing the data type over a series of smaller PRs behind
54+
* a feature flag; even for relatively simple data types.</li>
55+
* <li>
56+
* Add a capability to EsqlCapabilities related to the new type, and
57+
* gated by the feature flag you just created. Again, using the feature
58+
* flag is preferred over snapshot-only. As development progresses, you may
59+
* need to add more capabilities related to the new type, e.g. for
60+
* supporting specific functions. This is fine, and expected.</li>
61+
* <li>
62+
* Create a new CSV test file for the new type. You'll either need to
63+
* create a new data file as well, or add values of the new type to
64+
* and existing data file. See CsvTestDataLoader for creating a new data
65+
* set.</li>
66+
* <li>
67+
* In the new CSV test file, start adding basic functionality tests.
68+
* These should include reading and returning values, both from indexed data
69+
* and from the ROW command. It should also include functions that support
70+
* "every" type, such as Case or MvFirst.</li>
71+
* <li>
72+
* Add the new type to the CsvTestUtils#Type enum, if it isn't already
73+
* there. You also need to modify CsvAssert to support reading values
74+
* of the new type.</li>
75+
* <li>
76+
* At this point, the CSV tests should fail with a sensible ES|QL error
77+
* message. Make sure they're failing in ES|QL, not in the test
78+
* framework.</li>
79+
* <li>
80+
* Add the new data type to this enum. This will cause a bunch of
81+
* compile errors for switch statements throughout the code. Resolve those
82+
* as appropriate. That is the main way in which the new type will be tied
83+
* into the framework.</li>
84+
* <li>
85+
* Add the new type to the {@link DataType#UNDER_CONSTRUCTION}
86+
* collection. This is used by the test framework to disable some checks
87+
* around how functions report their supported types, which would otherwise
88+
* generate a lot of noise while the type is still in development.</li>
89+
* <li>
90+
* Add typed data generators to TestCaseSupplier, and make sure all
91+
* functions that support the new type have tests for it.</li>
92+
* <li>
93+
* Work to support things all types should do. Equality and the
94+
* "typeless" MV functions (MvFirst, MvLast, and MvCount) should work for
95+
* most types. Case and Coalesce should also support all types.
96+
* If the type has a natural ordering, make sure to test
97+
* sorting and the other binary comparisons. Make sure these functions all
98+
* have CSV tests that run against indexed data.</li>
99+
* <li>
100+
* Add conversion functions as appropriate. Almost all types should
101+
* support ToString, and should have a "ToType" function that accepts a
102+
* string. There may be other logical conversions depending on the nature
103+
* of the type. Make sure to add the conversion function to the
104+
* TYPE_TO_CONVERSION_FUNCTION map in EsqlDataTypeConverter. Make sure the
105+
* conversion functions have CSV tests that run against indexed data.</li>
106+
* <li>
107+
* Support the new type in aggregations that are type independent.
108+
* This includes Values, Count, and Count Distinct. Make sure there are
109+
* CSV tests against indexed data for these.</li>
110+
* <li>
111+
* Support other functions and aggregations as appropriate, making sure
112+
* to included CSV tests.</li>
113+
* <li>
114+
* Consider how the type will interact with other types. For example,
115+
* if the new type is numeric, it may be good for it to be comparable with
116+
* other numbers. Supporting this may require new logic in
117+
* EsqlDataTypeConverter#commonType, individual function type checking, the
118+
* verifier rules, or other places. We suggest starting with CSV tests and
119+
* seeing where they fail.</li>
120+
* </ul>
121+
* There are some additional steps that should be taken when removing the
122+
* feature flag and getting ready for a release:
123+
* <ul>
124+
* <li>
125+
* Ensure the capabilities for this type are always enabled
126+
* </li>
127+
* <li>
128+
* Remove the type from the {@link DataType#UNDER_CONSTRUCTION}
129+
* collection</li>
130+
* <li>
131+
* Fix new test failures related to declared function types
132+
* </li>
133+
* <li>
134+
* Make sure to run the full test suite locally via gradle to generate
135+
* the function type tables and helper files with the new type. Ensure all
136+
* the functions that support the type have appropriate docs for it.</li>
137+
* <li>
138+
* If appropriate, remove the type from the ESQL limitations list of
139+
* unsupported types.</li>
140+
* </ul>
141+
*/
35142
public enum DataType {
36143
/**
37144
* Fields of this type are unsupported by any functions and are always

0 commit comments

Comments
 (0)