|
32 | 32 | import static org.elasticsearch.xpack.esql.core.util.PlanStreamInput.readCachedStringWithVersionCheck; |
33 | 33 | import static org.elasticsearch.xpack.esql.core.util.PlanStreamOutput.writeCachedStringWithVersionCheck; |
34 | 34 |
|
| 35 | +/** |
| 36 | + * This enum represents data types the ES|QL query processing layer is able to |
| 37 | + * interact with in some way. This includes fully representable types (e.g. |
| 38 | + * {@link DataType#LONG}, numeric types which we promote (e.g. {@link DataType#SHORT}) |
| 39 | + * or fold into other types (e.g. {@link DataType#DATE_PERIOD}) early in the |
| 40 | + * processing pipeline, types for internal use |
| 41 | + * cases (e.g. {@link DataType#PARTIAL_AGG}), and types which the language |
| 42 | + * doesn't support, but require special handling anyway (e.g. |
| 43 | + * {@link DataType#OBJECT}) |
| 44 | + * |
| 45 | + * <h2>Process for adding a new data type</h2> |
| 46 | + * Note: it is not expected that all the following steps be done in a single PR. |
| 47 | + * Use capabilities to gate tests as you go, and use as many PRs as you think |
| 48 | + * appropriate. New data types are complex, and smaller PRs will make reviews |
| 49 | + * easier. |
| 50 | + * <ul> |
| 51 | + * <li> |
| 52 | + * Create a new feature flag for the type in {@link EsqlCorePlugin}. We |
| 53 | + * recommend developing the data type over a series of smaller PRs behind |
| 54 | + * a feature flag; even for relatively simple data types.</li> |
| 55 | + * <li> |
| 56 | + * Add a capability to EsqlCapabilities related to the new type, and |
| 57 | + * gated by the feature flag you just created. Again, using the feature |
| 58 | + * flag is preferred over snapshot-only. As development progresses, you may |
| 59 | + * need to add more capabilities related to the new type, e.g. for |
| 60 | + * supporting specific functions. This is fine, and expected.</li> |
| 61 | + * <li> |
| 62 | + * Create a new CSV test file for the new type. You'll either need to |
| 63 | + * create a new data file as well, or add values of the new type to |
| 64 | + * and existing data file. See CsvTestDataLoader for creating a new data |
| 65 | + * set.</li> |
| 66 | + * <li> |
| 67 | + * In the new CSV test file, start adding basic functionality tests. |
| 68 | + * These should include reading and returning values, both from indexed data |
| 69 | + * and from the ROW command. It should also include functions that support |
| 70 | + * "every" type, such as Case or MvFirst.</li> |
| 71 | + * <li> |
| 72 | + * Add the new type to the CsvTestUtils#Type enum, if it isn't already |
| 73 | + * there. You also need to modify CsvAssert to support reading values |
| 74 | + * of the new type.</li> |
| 75 | + * <li> |
| 76 | + * At this point, the CSV tests should fail with a sensible ES|QL error |
| 77 | + * message. Make sure they're failing in ES|QL, not in the test |
| 78 | + * framework.</li> |
| 79 | + * <li> |
| 80 | + * Add the new data type to this enum. This will cause a bunch of |
| 81 | + * compile errors for switch statements throughout the code. Resolve those |
| 82 | + * as appropriate. That is the main way in which the new type will be tied |
| 83 | + * into the framework.</li> |
| 84 | + * <li> |
| 85 | + * Add the new type to the {@link DataType#UNDER_CONSTRUCTION} |
| 86 | + * collection. This is used by the test framework to disable some checks |
| 87 | + * around how functions report their supported types, which would otherwise |
| 88 | + * generate a lot of noise while the type is still in development.</li> |
| 89 | + * <li> |
| 90 | + * Add typed data generators to TestCaseSupplier, and make sure all |
| 91 | + * functions that support the new type have tests for it.</li> |
| 92 | + * <li> |
| 93 | + * Work to support things all types should do. Equality and the |
| 94 | + * "typeless" MV functions (MvFirst, MvLast, and MvCount) should work for |
| 95 | + * most types. Case and Coalesce should also support all types. |
| 96 | + * If the type has a natural ordering, make sure to test |
| 97 | + * sorting and the other binary comparisons. Make sure these functions all |
| 98 | + * have CSV tests that run against indexed data.</li> |
| 99 | + * <li> |
| 100 | + * Add conversion functions as appropriate. Almost all types should |
| 101 | + * support ToString, and should have a "ToType" function that accepts a |
| 102 | + * string. There may be other logical conversions depending on the nature |
| 103 | + * of the type. Make sure to add the conversion function to the |
| 104 | + * TYPE_TO_CONVERSION_FUNCTION map in EsqlDataTypeConverter. Make sure the |
| 105 | + * conversion functions have CSV tests that run against indexed data.</li> |
| 106 | + * <li> |
| 107 | + * Support the new type in aggregations that are type independent. |
| 108 | + * This includes Values, Count, and Count Distinct. Make sure there are |
| 109 | + * CSV tests against indexed data for these.</li> |
| 110 | + * <li> |
| 111 | + * Support other functions and aggregations as appropriate, making sure |
| 112 | + * to included CSV tests.</li> |
| 113 | + * <li> |
| 114 | + * Consider how the type will interact with other types. For example, |
| 115 | + * if the new type is numeric, it may be good for it to be comparable with |
| 116 | + * other numbers. Supporting this may require new logic in |
| 117 | + * EsqlDataTypeConverter#commonType, individual function type checking, the |
| 118 | + * verifier rules, or other places. We suggest starting with CSV tests and |
| 119 | + * seeing where they fail.</li> |
| 120 | + * </ul> |
| 121 | + * There are some additional steps that should be taken when removing the |
| 122 | + * feature flag and getting ready for a release: |
| 123 | + * <ul> |
| 124 | + * <li> |
| 125 | + * Ensure the capabilities for this type are always enabled |
| 126 | + * </li> |
| 127 | + * <li> |
| 128 | + * Remove the type from the {@link DataType#UNDER_CONSTRUCTION} |
| 129 | + * collection</li> |
| 130 | + * <li> |
| 131 | + * Fix new test failures related to declared function types |
| 132 | + * </li> |
| 133 | + * <li> |
| 134 | + * Make sure to run the full test suite locally via gradle to generate |
| 135 | + * the function type tables and helper files with the new type. Ensure all |
| 136 | + * the functions that support the type have appropriate docs for it.</li> |
| 137 | + * <li> |
| 138 | + * If appropriate, remove the type from the ESQL limitations list of |
| 139 | + * unsupported types.</li> |
| 140 | + * </ul> |
| 141 | + */ |
35 | 142 | public enum DataType { |
36 | 143 | /** |
37 | 144 | * Fields of this type are unsupported by any functions and are always |
|
0 commit comments