API Reference

This document provides a detailed API reference for the SchemaFlow library.

Core Concepts

The library is built around a few core concepts:

  • Parsers: Functions that take unknown data and return a typed value if the data conforms to a schema. If the data is invalid, they throw a ParseError.
  • Transformers: Functions that take data in one shape and transform it into another shape.
  • Pipelines: A sequence of operations that typically parses the input, transforms it, and then parses the output.
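
The three concepts can be sketched in a few lines of TypeScript. This is only an illustration: the names ParseFn, TransformFn, pipeline, and the hand-written parsers are assumptions made for this sketch, not the library's actual exports, and a plain Error stands in for ParseError.

```typescript
// Hypothetical shapes for illustration; the library's real types may differ.
type ParseFn<T> = (data: unknown) => T; // throws when the data is invalid
type TransformFn<A, B> = (input: A) => B;

interface Person { firstName: string; lastName: string }
interface Display { fullName: string }

// View unknown data as a record so individual fields can be inspected.
function asRecord(data: unknown): { [key: string]: unknown } {
  return (typeof data === "object" && data !== null ? data : {}) as { [key: string]: unknown };
}

// Hand-written stand-ins for schema-backed parsers.
const parsePerson: ParseFn<Person> = (data) => {
  const first = asRecord(data)["firstName"];
  const last = asRecord(data)["lastName"];
  if (typeof first !== "string" || typeof last !== "string") {
    throw new Error("expected firstName and lastName to be strings");
  }
  return { firstName: first, lastName: last };
};

const parseDisplay: ParseFn<Display> = (data) => {
  const full = asRecord(data)["fullName"];
  if (typeof full !== "string") {
    throw new Error("expected fullName to be a string");
  }
  return { fullName: full };
};

// A transformer reshapes already-validated data.
const toDisplay: TransformFn<Person, Display> = (p) => ({
  fullName: p.firstName + " " + p.lastName,
});

// A pipeline parses the input, transforms it, then parses the output.
function pipeline<A, B>(
  parseIn: ParseFn<A>,
  transform: TransformFn<A, B>,
  parseOut: ParseFn<B>
): (data: unknown) => B {
  return (data) => parseOut(transform(parseIn(data)));
}

const personToDisplay = pipeline(parsePerson, toDisplay, parseDisplay);
```

Parsing the output as well as the input means a faulty transformer is caught at the pipeline boundary instead of leaking malformed data downstream.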

Data Contract Packets

A Data Contract Packet is a JSON object that contains everything needed to perform a bi-directional transformation:

  • inputSchema: The JSON Schema for the input data.
  • outputSchema: The JSON Schema for the output data.
  • decoderRules: The JSON Logic rules for decoding the data (input -> output).
  • encoderRules: The JSON Logic rules for encoding the data (output -> input).
  • data: An array of sample data that can be used to verify the packet.

This packet is a self-contained unit that can be easily shared and tested.
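
As a rough sketch, the packet could be typed as follows. The field names come from the list above; the JsonObject alias, the renamePacket example, and the assumption that the samples in data are in the input shape are illustrative guesses, not confirmed by the library.

```typescript
// Loose stand-in type for "any JSON object"; the library may use stricter types.
type JsonObject = { [key: string]: unknown };

// Field names follow the list above; the exact types are assumptions.
interface DataContractPacket {
  inputSchema: JsonObject;
  outputSchema: JsonObject;
  decoderRules: JsonObject;
  encoderRules: JsonObject;
  data: unknown[];
}

// Hypothetical packet that renames a single field in both directions.
const renamePacket: DataContractPacket = {
  inputSchema: {
    type: "object",
    properties: { id: { type: "number" } },
    required: ["id"],
  },
  outputSchema: {
    type: "object",
    properties: { personId: { type: "number" } },
    required: ["personId"],
  },
  decoderRules: { personId: { var: "id" } }, // input -> output
  encoderRules: { id: { var: "personId" } }, // output -> input
  data: [{ id: 7 }], // assumed here to be samples in the input shape
};

// The packet is plain JSON, so it survives serialization unchanged.
const packetCopy: DataContractPacket = JSON.parse(JSON.stringify(renamePacket));
```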

processPacket(packet)

This function takes a DataContractPacket and returns a bi-directional transformer with decode and encode methods.

verifyPacket(packet)

This function takes a DataContractPacket and returns true if the sample data in the packet can be successfully transformed in both directions.
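
One plausible reading of "successfully transformed in both directions" is a round-trip check: decode each sample, encode the result, and compare it with the original. The sketch below shows that idea with a hand-written transformer; BiTransformer and roundTrips are hypothetical names, and the real verifyPacket may apply different or additional checks.

```typescript
// Hypothetical shape of the object returned by processPacket.
interface BiTransformer<I, O> {
  decode(input: I): O;
  encode(output: O): I;
}

// Returns true when every sample survives decode followed by encode.
// Comparison is by JSON text, so it is sensitive to key order.
function roundTrips<I, O>(t: BiTransformer<I, O>, samples: I[]): boolean {
  for (const sample of samples) {
    try {
      const back = t.encode(t.decode(sample));
      if (JSON.stringify(back) !== JSON.stringify(sample)) {
        return false;
      }
    } catch (err) {
      return false; // any thrown error counts as a failed sample
    }
  }
  return true;
}

// Lossless example: rename id <-> personId.
const idTransformer: BiTransformer<{ id: number }, { personId: number }> = {
  decode: (input) => ({ personId: input.id }),
  encode: (output) => ({ id: output.personId }),
};

// Lossy example: rounding on decode cannot be undone on encode.
const lossyTransformer: BiTransformer<{ id: number }, { personId: number }> = {
  decode: (input) => ({ personId: Math.round(input.id) }),
  encode: (output) => ({ id: output.personId }),
};
```

Here roundTrips(idTransformer, [{ id: 7 }]) succeeds, while the lossy transformer fails for a sample such as { id: 7.4 }.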

JSON Schema

We use JSON Schema to define the shape of our data. A JSON Schema is itself a JSON document that describes the structure a piece of JSON data must have.

Common Properties

Here are some of the most common properties used in our JSON Schemas:

  • type: Defines the data type of a property. Common types are object, array, string, number, boolean, and null.
  • properties: Maps each property name of an object to the schema that property's value must satisfy.
  • required: An array of strings that lists the required properties of an object.
  • items: Defines the schema for the items in an array.
  • format: Defines a specific format for a string, such as email or date-time.

For a complete list of JSON Schema properties, please refer to the JSON Schema documentation.
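
To make the interplay of type, properties, required, and items concrete, here is a deliberately tiny checker. It is a teaching sketch only: MiniSchema and conforms are hypothetical names, format is ignored, integer and array-valued type are not handled, and the library itself relies on a full validator rather than anything like this.

```typescript
// Subset of JSON Schema covered by this sketch.
type MiniSchema = {
  type?: string;
  properties?: { [key: string]: MiniSchema };
  required?: string[];
  items?: MiniSchema;
};

// Map a JavaScript value to the JSON Schema type name used in this sketch.
function typeOfJson(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "object", "string", "number", "boolean"
}

function conforms(value: unknown, schema: MiniSchema): boolean {
  // type: the value's JSON type must match when a type is given.
  if (schema.type !== undefined && typeOfJson(value) !== schema.type) {
    return false;
  }
  // items: every array element must satisfy the item schema.
  if (Array.isArray(value)) {
    const itemSchema = schema.items;
    if (itemSchema !== undefined) {
      for (const item of value) {
        if (!conforms(item, itemSchema)) return false;
      }
    }
    return true;
  }
  if (typeof value === "object" && value !== null) {
    const record = value as { [key: string]: unknown };
    // required: each listed property must be present.
    for (const key of schema.required || []) {
      if (!(key in record)) return false;
    }
    // properties: each present property must satisfy its own schema.
    const props = schema.properties || {};
    for (const key of Object.keys(props)) {
      if (key in record && !conforms(record[key], props[key])) return false;
    }
  }
  return true;
}

const personSchema: MiniSchema = {
  type: "object",
  properties: { firstName: { type: "string" }, lastName: { type: "string" } },
  required: ["firstName", "lastName"],
};
```

Note that properties alone does not make a property mandatory; only required does.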

JSON Logic Transformers

We use JSON Logic to define our transformation rules. JSON Logic is a simple, safe, and portable way to express complex logic in a JSON format.

Standard Operations

JSON Logic comes with a set of standard operations. Here are some of the most common ones:

  • var: Retrieves a value from the input data.
  • cat: Concatenates strings.
  • +, -, *, /: Mathematical operations.
  • if: Conditional logic.
  • substr: Extracts a substring from a string.

For a complete list of standard operations, please refer to the JSON Logic documentation.
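
The following toy evaluator shows how a rule is interpreted for four of the operations above. It is a simplified sketch, not the engine the library uses: evalRule is a hypothetical name, var only looks up top-level keys (no dot paths or defaults), if takes exactly three arguments and evaluates both branches eagerly, and JavaScript truthiness is used in place of JSON Logic's own rules.

```typescript
type LogicData = { [key: string]: unknown };

function evalRule(rule: unknown, data: LogicData): unknown {
  // Arrays are evaluated element by element; primitives evaluate to themselves.
  if (Array.isArray(rule)) {
    return rule.map((r) => evalRule(r, data));
  }
  if (typeof rule !== "object" || rule === null) {
    return rule;
  }
  // A rule is an object with a single key: the operation name.
  const op = Object.keys(rule)[0];
  if (op === undefined) {
    throw new Error("Empty rule object");
  }
  const raw = (rule as LogicData)[op];
  // A single non-array argument is shorthand for a one-element argument list.
  const args = (Array.isArray(raw) ? raw : [raw]).map((a) => evalRule(a, data));
  switch (op) {
    case "var":
      return data[String(args[0])];
    case "cat":
      return args.map((a) => String(a)).join("");
    case "+":
      return args.reduce((sum: number, n) => sum + Number(n), 0);
    case "if":
      return args[0] ? args[1] : args[2];
    default:
      throw new Error("Unsupported operation: " + op);
  }
}

const fullNameRule = { cat: [{ var: "firstName" }, " ", { var: "lastName" }] };
const fullName = evalRule(fullNameRule, { firstName: "Ada", lastName: "Lovelace" });
```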

Custom Operations

We have extended JSON Logic with the following custom operations:

  • indexOf: Returns the index of a substring within a string. Takes two arguments: the string to search in, and the substring to search for. Returns -1 if the substring is not found.
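
The documented behaviour matches JavaScript's String.prototype.indexOf, so a plausible implementation is a thin wrapper around it. The function name indexOfOp and the coercion of both arguments to strings are assumptions of this sketch. If the underlying engine is json-logic-js (also an assumption), such a function would typically be registered through its add_operation hook.

```typescript
// Sketch of the custom operation: position of needle in haystack, or -1.
function indexOfOp(haystack: unknown, needle: unknown): number {
  return String(haystack).indexOf(String(needle));
}

// In a rule it would be written as:
//   { "indexOf": [{ "var": "email" }, "@"] }
// and is handy together with substr, for example to take the part of an
// email address before the "@".
const sampleEmail = "ada@example.com";
const atPosition = indexOfOp(sampleEmail, "@");
const localPart = sampleEmail.substring(0, atPosition);
```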

Error Handling

The library uses two custom error classes to provide detailed error information:

  • ParseError: Thrown by parsers when data does not conform to a schema. Its errors property holds the array of ErrorObject values reported by the Ajv validator.
  • EtlError: A wrapper error that is thrown during the ETL process. It contains the following properties:
    • stage: The stage of the ETL process where the error occurred (input, transformation, or output).
    • originalError: The original error that was thrown.
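
A caller can use stage and originalError to report failures precisely. The classes below are local stand-ins that mirror the documented properties so the sketch is self-contained; the constructor signatures, the error messages, and the describeFailure helper are assumptions, and real code would import the classes from the library instead.

```typescript
type EtlStage = "input" | "transformation" | "output";

// Stand-in mirroring the documented ParseError (errors: validator error objects).
class ParseError extends Error {
  errors: unknown[];
  constructor(errors: unknown[]) {
    super("Data does not conform to schema");
    // Keeps instanceof working even when compiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "ParseError";
    this.errors = errors;
  }
}

// Stand-in mirroring the documented EtlError (stage, originalError).
class EtlError extends Error {
  stage: EtlStage;
  originalError: unknown;
  constructor(stage: EtlStage, originalError: unknown) {
    super("ETL failed at stage: " + stage);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "EtlError";
    this.stage = stage;
    this.originalError = originalError;
  }
}

// Turn any thrown value into a short, stage-aware message.
function describeFailure(err: unknown): string {
  if (err instanceof EtlError) {
    const cause = err.originalError;
    if (cause instanceof ParseError) {
      return err.stage + ": " + cause.errors.length + " validation error(s)";
    }
    return err.stage + ": " + String(cause);
  }
  return "unexpected: " + String(err);
}
```

In application code, describeFailure would be called from the catch block wrapped around a decode or encode call.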

Examples

Here are some examples of how to use the library.

Simple Transformation

This example shows a simple transformation from an object with firstName and lastName properties to an object with a fullName property.

Input Schema:

{
  "type": "object",
  "properties": {
    "firstName": { "type": "string" },
    "lastName": { "type": "string" }
  },
  "required": ["firstName", "lastName"]
}

Output Schema:

{
  "type": "object",
  "properties": {
    "fullName": { "type": "string" }
  },
  "required": ["fullName"]
}

Decoder Rules:

{
  "fullName": { "cat": [{ "var": "firstName" }, " ", { "var": "lastName" }] }
}

Complex Transformation

This example is taken from complex-packet.json and shows a more complex transformation with nested objects and different property names.

Input Schema:

{
  "type": "object",
  "properties": {
    "id": { "type": "number" },
    "firstName": { "type": "string" },
    "lastName": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "street": { "type": "string" },
    "city": { "type": "string" },
    "zipCode": { "type": "string" }
  },
  "required": ["id", "firstName", "lastName", "email", "street", "city", "zipCode"]
}

Output Schema:

{
  "type": "object",
  "properties": {
    "personId": { "type": "number" },
    "name": {
      "type": "object",
      "properties": {
        "first": { "type": "string" },
        "last": { "type": "string" }
      },
      "required": ["first", "last"]
    },
    "contact": {
      "type": "object",
      "properties": {
        "email": { "type": "string", "format": "email" }
      },
      "required": ["email"]
    },
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "zip": { "type": "string" }
      },
      "required": ["street", "city", "zip"]
    }
  },
  "required": ["personId", "name", "contact", "address"]
}

Decoder Rules:

{
  "personId": { "var": "id" },
  "name.first": { "var": "firstName" },
  "name.last": { "var": "lastName" },
  "contact.email": { "var": "email" },
  "address.street": { "var": "street" },
  "address.city": { "var": "city" },
  "address.zip": { "var": "zipCode" }
}
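
The rules above use dotted keys such as name.first, while the output schema expects nested objects. Assuming the library expands dotted keys into nested objects (the schemas imply this, but it is not stated here), the expansion step could look like the sketch below. expandDottedKeys is a hypothetical helper, applied to the already-evaluated rule results.

```typescript
type Nested = { [key: string]: unknown };

// Turn { "a.b": 1 } into { a: { b: 1 } }; keys without dots are copied as-is.
function expandDottedKeys(flat: Nested): Nested {
  const result: Nested = {};
  for (const key of Object.keys(flat)) {
    const parts = key.split(".");
    const leaf = parts.pop() as string; // split always yields at least one part
    let cursor = result;
    for (const part of parts) {
      const existing = cursor[part];
      // Create the intermediate object the first time a prefix is seen.
      if (typeof existing !== "object" || existing === null) {
        cursor[part] = {};
      }
      cursor = cursor[part] as Nested;
    }
    cursor[leaf] = flat[key];
  }
  return result;
}

// Values as they might look after the decoder rules have been evaluated.
const expanded = expandDottedKeys({
  personId: 1,
  "name.first": "Ada",
  "name.last": "Lovelace",
  "address.zip": "12345",
});
```

Here expanded.name is { first: "Ada", last: "Lovelace" } and expanded.address is { zip: "12345" }, which matches the nesting the output schema requires.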