This document provides a detailed API reference for the SchemaFlow library.
The library is built around a few core concepts:
- Parsers: Functions that take unknown data and return a typed value if the data conforms to a schema. If the data is invalid, they throw a ParseError.
- Transformers: Functions that take data in one shape and transform it into another shape.
- Pipelines: A sequence of operations that typically includes parsing, transforming, and then parsing the output.
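The relationship between the three concepts can be sketched in TypeScript. This is a minimal illustration, not SchemaFlow's actual API: the names `Parser`, `Transformer`, `pipeline`, `parsePerson`, `toNamed`, and `parseNamed` are assumptions, the checks are hand-written instead of schema-driven, and `ParseError` is simplified to a plain message.

```typescript
// Illustrative only: these names are assumptions, not SchemaFlow exports.
class ParseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ParseError";
    Object.setPrototypeOf(this, ParseError.prototype);
  }
}

// A parser accepts unknown data and returns a typed value, or throws.
type Parser<T> = (input: unknown) => T;
// A transformer maps data in one shape to another shape.
type Transformer<A, B> = (input: A) => B;

interface Person { firstName: string; lastName: string }
interface Named { fullName: string }

const parsePerson: Parser<Person> = (input) => {
  if (typeof input !== "object" || input === null) {
    throw new ParseError("expected an object");
  }
  const { firstName, lastName } = input as Record<string, unknown>;
  if (typeof firstName !== "string" || typeof lastName !== "string") {
    throw new ParseError("expected string firstName and lastName");
  }
  return { firstName, lastName };
};

const parseNamed: Parser<Named> = (input) => {
  if (typeof input !== "object" || input === null) {
    throw new ParseError("expected an object");
  }
  const { fullName } = input as Record<string, unknown>;
  if (typeof fullName !== "string") {
    throw new ParseError("expected string fullName");
  }
  return { fullName };
};

const toNamed: Transformer<Person, Named> = (p) => ({
  fullName: `${p.firstName} ${p.lastName}`,
});

// A pipeline parses the input, transforms it, then parses the output.
function pipeline<A, B>(
  parseIn: Parser<A>,
  transform: Transformer<A, B>,
  parseOut: Parser<B>,
): (input: unknown) => B {
  return (input) => parseOut(transform(parseIn(input)));
}

const run = pipeline(parsePerson, toNamed, parseNamed);
const out = run({ firstName: "Ada", lastName: "Lovelace" });
// out is { fullName: "Ada Lovelace" }
```

Parsing the output as well as the input means a faulty transformer is caught at the pipeline boundary and does not leak malformed data downstream.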
A Data Contract Packet is a JSON object that contains everything needed to perform a bi-directional transformation:
- inputSchema: The JSON Schema for the input data.
- outputSchema: The JSON Schema for the output data.
- decoderRules: The JSON Logic rules for decoding the data (input -> output).
- encoderRules: The JSON Logic rules for encoding the data (output -> input).
- data: An array of sample data that can be used to verify the packet.
This packet is a self-contained unit that can be easily shared and tested.
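As a rough sketch, a packet could be typed and written out as below. The field names come from the list above; the `DataContractPacket` interface shape, the `JsonObject` alias, the encoder rules, and the choice to store samples in the input shape are assumptions made for illustration.

```typescript
// Assumed shape, derived from the field list; the library's real type may differ.
type JsonObject = { [key: string]: unknown };

interface DataContractPacket {
  inputSchema: JsonObject;  // JSON Schema for the input data
  outputSchema: JsonObject; // JSON Schema for the output data
  decoderRules: JsonObject; // JSON Logic rules, input -> output
  encoderRules: JsonObject; // JSON Logic rules, output -> input
  data: unknown[];          // sample data used to verify the packet
}

const packet: DataContractPacket = {
  inputSchema: {
    type: "object",
    properties: { firstName: { type: "string" }, lastName: { type: "string" } },
    required: ["firstName", "lastName"],
  },
  outputSchema: {
    type: "object",
    properties: { fullName: { type: "string" } },
    required: ["fullName"],
  },
  decoderRules: {
    fullName: { cat: [{ var: "firstName" }, " ", { var: "lastName" }] },
  },
  // Hypothetical encoder: split fullName at the first space.
  encoderRules: {
    firstName: {
      substr: [{ var: "fullName" }, 0, { indexOf: [{ var: "fullName" }, " "] }],
    },
    lastName: {
      substr: [
        { var: "fullName" },
        { "+": [{ indexOf: [{ var: "fullName" }, " "] }, 1] },
      ],
    },
  },
  // Assumed here to be samples in the input shape.
  data: [{ firstName: "Ada", lastName: "Lovelace" }],
};
```

Because every field is plain JSON, the whole packet survives `JSON.stringify` and `JSON.parse` unchanged, which is what makes it easy to share.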
This function takes a DataContractPacket and returns a bi-directional transformer with decode and encode methods.
This function takes a DataContractPacket and returns true if the sample data in the packet can be successfully transformed in both directions.
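The two behaviours described above can be illustrated with a hand-written transformer. The source does not name these functions or give their signatures, so `BiTransformer` and `verifyRoundTrip` are hypothetical, the decode and encode bodies are plain TypeScript instead of JSON Logic, and "successfully transformed in both directions" is read here as "decode followed by encode reproduces the sample".

```typescript
// Hypothetical names; the real function signatures are not given in this document.
interface BiTransformer<I, O> {
  decode(input: I): O;  // input -> output
  encode(output: O): I; // output -> input
}

interface PersonIn { firstName: string; lastName: string }
interface PersonOut { fullName: string }

const transformer: BiTransformer<PersonIn, PersonOut> = {
  decode: (i) => ({ fullName: `${i.firstName} ${i.lastName}` }),
  encode: (o) => {
    // Split at the first space.
    const space = o.fullName.indexOf(" ");
    return {
      firstName: o.fullName.substring(0, space),
      lastName: o.fullName.substring(space + 1),
    };
  },
};

// One reading of the check: decode, encode, and compare with the original.
function verifyRoundTrip<I, O>(t: BiTransformer<I, O>, samples: I[]): boolean {
  return samples.every((sample) => {
    try {
      const back = t.encode(t.decode(sample));
      return JSON.stringify(back) === JSON.stringify(sample);
    } catch {
      return false;
    }
  });
}

const ok = verifyRoundTrip(transformer, [
  { firstName: "Ada", lastName: "Lovelace" },
]); // true

const lossy = verifyRoundTrip(transformer, [
  { firstName: "Mary Ann", lastName: "Evans" },
]); // false: the first name contains a space, so encode splits it wrongly
```

The second call shows why sample data matters: a decoder that is not cleanly invertible only reveals itself when a sample exercises the ambiguous case.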
We use JSON Schema to define the shape of our data. A JSON Schema is a JSON object that defines the structure of your JSON data.
Here are some of the most common properties used in our JSON Schemas:
- type: Defines the data type of a property. Common types are object, array, string, number, boolean, and null.
- properties: An object that defines the properties of an object.
- required: An array of strings that lists the required properties of an object.
- items: Defines the schema for the items in an array.
- format: Defines a specific format for a string, such as email or date-time.
For a complete list of JSON Schema properties, please refer to the JSON Schema documentation.
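To make the keywords concrete, here is a deliberately tiny checker covering only type, properties, required, and items. It is a teaching sketch and not a JSON Schema validator: `MiniSchema` and `conforms` are made-up names, format is ignored, and real validation should be left to a proper validator.

```typescript
// Toy checker for four keywords only; not a JSON Schema implementation.
interface MiniSchema {
  type: "object" | "array" | "string" | "number" | "boolean" | "null";
  properties?: { [key: string]: MiniSchema };
  required?: string[];
  items?: MiniSchema;
}

// JSON Schema distinguishes null and array from object; typeof does not.
function jsonTypeOf(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function conforms(schema: MiniSchema, value: unknown): boolean {
  if (jsonTypeOf(value) !== schema.type) return false;
  if (schema.type === "array" && schema.items) {
    const itemSchema = schema.items;
    return (value as unknown[]).every((item) => conforms(itemSchema, item));
  }
  if (schema.type === "object") {
    const obj = value as { [key: string]: unknown };
    // Every required key must be present.
    if ((schema.required ?? []).some((key) => !(key in obj))) return false;
    // Every declared property that is present must match its sub-schema.
    const props = schema.properties ?? {};
    return Object.keys(props).every(
      (key) => !(key in obj) || conforms(props[key], obj[key]),
    );
  }
  return true;
}

const userSchema: MiniSchema = {
  type: "object",
  properties: {
    email: { type: "string" },
    tags: { type: "array", items: { type: "string" } },
  },
  required: ["email"],
};

const valid = conforms(userSchema, { email: "a@example.com", tags: ["x"] }); // true
const invalid = conforms(userSchema, { tags: ["x", 1] }); // false: email is missing
```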
We use JSON Logic to define our transformation rules. JSON Logic is a simple, safe, and portable way to express complex logic in a JSON format.
JSON Logic comes with a set of standard operations. Here are some of the most common ones:
- var: Retrieves a value from the input data.
- cat: Concatenates strings.
- +, -, *, /: Mathematical operations.
- if: Conditional logic.
- substr: Extracts a substring from a string.
For a complete list of standard operations, please refer to the JSON Logic documentation.
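The following toy evaluator shows how rules built from these operations are interpreted. It is an illustration under simplifying assumptions, not the JSON Logic reference implementation: `evaluate` is a made-up name, only var, cat, +, if, and substr are handled, var reads top-level keys only, if evaluates both branches eagerly and uses JavaScript truthiness, and substr accepts non-negative arguments only.

```typescript
// Toy evaluator for a few operations; not the JSON Logic reference implementation.
type Data = { [key: string]: unknown };

function evaluate(rule: unknown, data: Data): unknown {
  // Primitives evaluate to themselves.
  if (typeof rule !== "object" || rule === null) return rule;
  if (Array.isArray(rule)) return rule.map((r) => evaluate(r, data));

  // A rule is an object with a single key: the operation name.
  const op = Object.keys(rule)[0];
  const raw = (rule as Data)[op];
  const args = (Array.isArray(raw) ? raw : [raw]).map((a) => evaluate(a, data));

  switch (op) {
    case "var":
      return data[String(args[0])];
    case "cat":
      return args.map(String).join("");
    case "+":
      return args.reduce<number>((sum, a) => sum + Number(a), 0);
    case "if":
      return args[0] ? args[1] : args[2];
    case "substr": {
      // Non-negative start and length only in this sketch.
      const s = String(args[0]);
      const start = Number(args[1]);
      return args.length > 2 ? s.slice(start, start + Number(args[2])) : s.slice(start);
    }
    default:
      throw new Error(`unsupported operation: ${op}`);
  }
}

const rule = { cat: [{ var: "firstName" }, " ", { var: "lastName" }] };
const fullName = evaluate(rule, { firstName: "Ada", lastName: "Lovelace" });
// fullName is "Ada Lovelace"

const tier = evaluate({ if: [{ var: "vip" }, "gold", "standard"] }, { vip: false });
// tier is "standard"
```

Since a rule is plain data with no executable code, it can be stored, shipped, and evaluated without running arbitrary user code, which is where the "safe and portable" description comes from.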
We have extended JSON Logic with the following custom operations:
- indexOf: Returns the index of a substring within a string. Takes two arguments: the string to search in, and the substring to search for. Returns -1 if the substring is not found.
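The described behaviour matches JavaScript's `String.prototype.indexOf`, so a plausible implementation is a thin wrapper around it. The function name `indexOfOp` and the sample rule are assumptions; how SchemaFlow actually registers the operation is not stated here.

```typescript
// Assumed behaviour, based on the description above.
function indexOfOp(haystack: string, needle: string): number {
  return haystack.indexOf(needle);
}

const found = indexOfOp("Ada Lovelace", " "); // 3
const missing = indexOfOp("Ada", "x");        // -1

// In a rule, the operation would appear as ordinary JSON, for example:
const firstSpaceRule = { indexOf: [{ var: "fullName" }, " "] };
```

A typical use is in encoder rules, where the index of a separator feeds substr to split a combined field back into its parts.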
The library uses two custom error classes to provide detailed error information:
- ParseError: Thrown by parsers when data does not conform to a schema. It contains an errors property with an array of ErrorObject from the AJV validator.
- EtlError: A wrapper error that is thrown during the ETL process. It contains the following properties:
  - stage: The stage of the ETL process where the error occurred (input, transformation, or output).
  - originalError: The original error that was thrown.
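A sketch of how these two errors might be shaped and handled is shown below. The class and property names follow the description above, but the constructor signatures, the loosely typed `errors` array (AJV `ErrorObject` values in the real library), and the helpers `runStage` and `summarize` are assumptions for illustration.

```typescript
// Sketch of the two error shapes; constructor signatures are assumptions.
type Stage = "input" | "transformation" | "output";

class ParseError extends Error {
  // In the real library these are AJV ErrorObject values; typed loosely here.
  constructor(public readonly errors: unknown[]) {
    super(`data does not conform to schema (${errors.length} error(s))`);
    this.name = "ParseError";
    Object.setPrototypeOf(this, ParseError.prototype);
  }
}

class EtlError extends Error {
  constructor(
    public readonly stage: Stage,
    public readonly originalError: unknown,
  ) {
    super(`ETL failed at stage "${stage}"`);
    this.name = "EtlError";
    Object.setPrototypeOf(this, EtlError.prototype);
  }
}

// Hypothetical helper: tag any error thrown by a step with its stage.
function runStage<T>(stage: Stage, step: () => T): T {
  try {
    return step();
  } catch (err) {
    throw new EtlError(stage, err);
  }
}

// Unwrap the EtlError to reach the validation details, if any.
function summarize(err: unknown): string {
  if (err instanceof EtlError && err.originalError instanceof ParseError) {
    return `${err.stage}: ${err.originalError.errors.length} validation error(s)`;
  }
  return String(err);
}

let summary = "";
try {
  runStage("input", () => {
    throw new ParseError([{ message: "must have required property 'lastName'" }]);
  });
} catch (err) {
  summary = summarize(err); // "input: 1 validation error(s)"
}
```

Checking `stage` first tells the caller whether the problem lies in the incoming data, the rules, or the produced output before digging into `originalError`.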
Here are some examples of how to use the library.
This example shows a simple transformation from an object with firstName and lastName properties to an object with a fullName property.
Input Schema:
{
  "type": "object",
  "properties": {
    "firstName": { "type": "string" },
    "lastName": { "type": "string" }
  },
  "required": ["firstName", "lastName"]
}
Output Schema:
{
  "type": "object",
  "properties": {
    "fullName": { "type": "string" }
  },
  "required": ["fullName"]
}
Decoder Rules:
{
  "fullName": { "cat": [{ "var": "firstName" }, " ", { "var": "lastName" }] }
}
This example is taken from complex-packet.json and shows a more complex transformation with nested objects and different property names.
Input Schema:
{
  "type": "object",
  "properties": {
    "id": { "type": "number" },
    "firstName": { "type": "string" },
    "lastName": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "street": { "type": "string" },
    "city": { "type": "string" },
    "zipCode": { "type": "string" }
  },
  "required": ["id", "firstName", "lastName", "email", "street", "city", "zipCode"]
}
Output Schema:
{
  "type": "object",
  "properties": {
    "personId": { "type": "number" },
    "name": {
      "type": "object",
      "properties": {
        "first": { "type": "string" },
        "last": { "type": "string" }
      },
      "required": ["first", "last"]
    },
    "contact": {
      "type": "object",
      "properties": {
        "email": { "type": "string", "format": "email" }
      },
      "required": ["email"]
    },
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "zip": { "type": "string" }
      },
      "required": ["street", "city", "zip"]
    }
  },
  "required": ["personId", "name", "contact", "address"]
}
Decoder Rules:
{
  "personId": { "var": "id" },
  "name.first": { "var": "firstName" },
  "name.last": { "var": "lastName" },
  "contact.email": { "var": "email" },
  "address.street": { "var": "street" },
  "address.city": { "var": "city" },
  "address.zip": { "var": "zipCode" }
}
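The decoder rules above use dotted keys such as name.first to address nested output properties. The sketch below shows one way such keys could be expanded into a nested object. It assumes that a dot in a rule key always denotes nesting, handles only var rules, and uses the made-up helper names `setPath` and `decode`; the sample values are invented.

```typescript
// Assumption: a dotted key such as "name.first" addresses a nested output property.
type Obj = { [key: string]: unknown };

// Walk (and create) intermediate objects, then assign the leaf value.
function setPath(target: Obj, path: string, value: unknown): void {
  const keys = path.split(".");
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key] as Obj;
  }
  node[keys[keys.length - 1]] = value;
}

// Only { "var": ... } rules are handled in this sketch.
function decode(rules: { [outPath: string]: { var: string } }, input: Obj): Obj {
  const output: Obj = {};
  for (const path of Object.keys(rules)) {
    setPath(output, path, input[rules[path].var]);
  }
  return output;
}

const rules = {
  "personId": { var: "id" },
  "name.first": { var: "firstName" },
  "name.last": { var: "lastName" },
  "contact.email": { var: "email" },
  "address.street": { var: "street" },
  "address.city": { var: "city" },
  "address.zip": { var: "zipCode" },
};

const person = decode(rules, {
  id: 7,
  firstName: "Ada",
  lastName: "Lovelace",
  email: "ada@example.com",
  street: "12 Main St",
  city: "London",
  zipCode: "N1 9GU",
});
// person is:
// {
//   personId: 7,
//   name: { first: "Ada", last: "Lovelace" },
//   contact: { email: "ada@example.com" },
//   address: { street: "12 Main St", city: "London", zip: "N1 9GU" }
// }
```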