|
1 | | -# Daffodil 'Format' Reader |
| 1 | +# Daffodil Format Reader |
2 | 2 | This plugin enables Drill to read DFDL-described data from files by way of the Apache Daffodil DFDL implementation. |
3 | 3 |
|
4 | | -## Validation |
| 4 | +## Configuration: |
| 5 | +To use Daffodil schemata, simply add the following to the `formats` section of a file-based storage plugin: |
5 | 6 |
|
6 | | -Data read by Daffodil is always validated using Daffodil's Limited Validation mode. |
| 7 | +```json |
| 8 | +"daffodil": { |
| 9 | + "type": "daffodil", |
| 10 | + "extensions": [ |
| 11 | + "dat" |
| 12 | + ] |
| 13 | + } |
| 14 | +``` |
| 15 | +There are four other optional parameters which you can specify: |
| 16 | +* `schemaURI`: Pre-compiled DFDL schema (.bin extension) or DFDL schema source (.xsd extension) |
| 17 | +* `validationMode`: Use `true` to request Daffodil built-in limited validation. Use `false` for no validation. |
| 18 | +* `rootName`: Local name of the root element of the message. Can be `null` to use the first element declaration of the primary schema file. Ignored if reloading a pre-compiled schema. |
| 19 | +* `rootNameSpace`: Namespace URI of the root element, as a string. Can be `null` to use the target namespace of the primary schema file, or when `rootName` alone identifies the root element unambiguously. Ignored if reloading a pre-compiled schema. |
7 | 20 |
|
8 | | -TBD: do we need an option to control escalating validation errors to fatal? Currently this is not provided. |
| 21 | +## Usage: |
9 | 22 |
|
10 | | -## Limitations: TBD |
11 | 23 |
|
12 | | -At the moment, the DFDL schema is found on the local file system, which won't support Drill's distributed architecture. |
13 | 24 |
|
14 | | -There are restrictions on the DFDL schemas that this can handle. |
| 25 | +## Limitations: |
| 26 | +At the moment, the DFDL schema is found on the local file system, which won't support Drill's distributed architecture. |
15 | 27 |
|
16 | | -In particular, all element children must have distinct element names, including across choice branches. |
17 | | -(This rules out a number of large DFDL schemas.) |
| 28 | +There are restrictions on the DFDL schemas that this can handle. In particular, all element children must have distinct element names, including across choice branches. Unfortunately, this rules out a number of large DFDL schemas. |
18 | 29 |
|
19 | 30 | TBD: Auto renaming as part of the Daffodil-to-Drill metadata mapping? |
20 | 31 |
|
21 | | -The data is parsed fully from its native form into a Drill data structure held in memory. |
22 | | -No attempt is made to avoid access to parts of the DFDL-described data that are not needed to answer the query. |
| 32 | +The data is parsed fully from its native form into a Drill data structure held in memory. No attempt is made to avoid access to parts of the DFDL-described data that are not needed to answer the query. |
23 | 33 |
|
24 | 34 | If the data is not well-formed, an error occurs and the query fails. |
25 | 35 |
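
As a sketch of the four optional parameters listed under Configuration in this diff, a format entry that sets all of them might look like the following. The schema path, root element name, and namespace URI are hypothetical placeholders. The value forms are also assumptions that the README text does not confirm: for example, whether `schemaURI` needs a `file:` scheme, and whether `validationMode` is a JSON boolean rather than a string.

```json
"daffodil": {
  "type": "daffodil",
  "extensions": [
    "dat"
  ],
  "schemaURI": "file:///path/to/schema.dfdl.xsd",
  "validationMode": true,
  "rootName": "record",
  "rootNameSpace": "http://example.com/ns"
}
```

Per the parameter descriptions, leaving `rootName` and `rootNameSpace` as `null` should fall back to the first element declaration and the target namespace of the primary schema file. Both are ignored when `schemaURI` points at a pre-compiled `.bin` schema.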
|
|