Hello! This guide is for you, the Gemini Coding Assistant. This document outlines the architecture, core principles, and future direction of SchemaFlow, a powerful, simple, and robust data transformation library.
- Unix Philosophy: Each part of this library is a small, sharp tool that does one thing well. `createParser` makes parsers, `createJsonLogicTransformer` makes transformers, and `createEtlPipeline` composes them into a pipeline. We compose these simple, predictable tools to build complex, reliable behavior.
- Parse, Don't Validate: We never trust data. We force it through a parser. Success means we get back a value that the TypeScript compiler guarantees is correctly typed. Failure means we throw a detailed, structured error. This eliminates an entire class of bugs at the source.
- Simplicity over Ease: We prioritize correctness and predictability. Logic is treated as data (JSON Schema, JSONLogic rules), which makes the system transparent and configurable.
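The "Parse, Don't Validate" principle can be sketched in a few lines of TypeScript. The `User` type, `parseUser` function, and the shape of `ParseError` below are illustrative stand-ins, not SchemaFlow's actual API:

```typescript
// A hypothetical parsed type. Once a value has this type, the compiler
// guarantees its shape everywhere downstream.
type User = { name: string; age: number };

// A structured error carrying the reasons the data failed to parse.
class ParseError extends Error {
  constructor(public readonly issues: string[]) {
    super(`Parse failed: ${issues.join("; ")}`);
  }
}

// The parser is the single trust boundary: `unknown` in, `User` out, or throw.
function parseUser(input: unknown): User {
  const issues: string[] = [];
  const obj = input as Record<string, unknown>;
  if (typeof input !== "object" || input === null) {
    issues.push("not an object");
  } else {
    if (typeof obj.name !== "string") issues.push("name must be a string");
    if (typeof obj.age !== "number") issues.push("age must be a number");
  }
  if (issues.length > 0) throw new ParseError(issues);
  return { name: obj.name as string, age: obj.age as number };
}

const user = parseUser({ name: "Ada", age: 36 }); // statically typed as User
```

After this point, no downstream code needs to re-check the shape: the type system carries the guarantee.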
The architecture of SchemaFlow is designed to be modular and composable. The data flow is a pipeline of functions, where each function has a single responsibility.
- Parsing: The first step in any pipeline is to parse the input data against a JSON Schema. The `createParser` function creates a parser that takes unknown data and returns a typed value. If the data is invalid, it throws a `ParseError`.
- Transformation: The core of the library is the transformation engine, which uses JSON Logic. The `createJsonLogicTransformer` function takes a set of JSON Logic rules and returns a transformation function.
- ETL Pipeline: The `createEtlPipeline` function composes a parser, a transformer, and an output parser into a single, robust pipeline with unified error handling.
- Bi-Directional Transformation: The `createBiDirectionalTransformer` function uses two ETL pipelines to create a transformer that can encode and decode data between two different schemas.
- Packet Processing: A `DataContractPacket` is a self-contained unit that includes schemas, rules, and sample data. The `processPacket` and `verifyPacket` functions provide a high-level interface for working with these packets.
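The composition pattern behind `createEtlPipeline` can be sketched with minimal stand-in implementations. The real SchemaFlow signatures may differ; this only illustrates parse → transform → parse with unified error handling, and `parseName` is a hypothetical example parser:

```typescript
// Minimal stand-ins for the building blocks described above.
type Parser<T> = (input: unknown) => T;
type Transformer<A, B> = (input: A) => B;

// A unified error that records which stage of the pipeline failed.
class EtlError extends Error {
  constructor(public readonly stage: "input" | "transform" | "output", cause: unknown) {
    super(`ETL failed at ${stage} stage: ${String(cause)}`);
  }
}

// Compose input parser, transformer, and output parser into one function.
function createEtlPipeline<A, B>(
  parseInput: Parser<A>,
  transform: Transformer<A, B>,
  parseOutput: Parser<B>,
): (input: unknown) => B {
  return (input) => {
    let parsed: A;
    try { parsed = parseInput(input); } catch (e) { throw new EtlError("input", e); }
    let transformed: B;
    try { transformed = transform(parsed); } catch (e) { throw new EtlError("transform", e); }
    try { return parseOutput(transformed); } catch (e) { throw new EtlError("output", e); }
  };
}

// Usage: a pipeline that uppercases a name, re-checking the shape on the way out.
const parseName: Parser<{ name: string }> = (x) => {
  if (typeof x === "object" && x !== null && typeof (x as any).name === "string") {
    return x as { name: string };
  }
  throw new Error("expected { name: string }");
};
const pipeline = createEtlPipeline(parseName, (v) => ({ name: v.name.toUpperCase() }), parseName);
```

Because every failure is wrapped in a stage-tagged `EtlError`, callers get one error type to handle regardless of where the pipeline broke.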
- TypeScript: For static typing and improved developer experience.
- Bun: For fast and efficient development, testing, and dependency management.
- JSON Schema: For defining data contracts and validating data shapes.
- AJV: As the underlying engine for our `createParser` factory. It is fast, standards-compliant, and provides detailed error reporting.
- JSONLogic: For defining transformation rules as portable, serializable data.
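The "logic as data" idea behind JSONLogic can be illustrated with a tiny self-contained evaluator. This is not `json-logic-js` itself, just a sketch of the rule-as-JSON shape, supporting only `var` and `>`:

```typescript
// A rule is plain JSON: it can be stored, transmitted, diffed, and versioned.
type Rule =
  | { var: string }
  | { ">": [Rule, Rule] }
  | number;

// A toy evaluator for the two operations above, in the spirit of json-logic-js.
function apply(rule: Rule, data: Record<string, number>): number | boolean {
  if (typeof rule === "number") return rule;       // literal value
  if ("var" in rule) return data[rule.var];        // data lookup
  const [a, b] = rule[">"];                        // comparison
  return (apply(a, data) as number) > (apply(b, data) as number);
}

// The rule itself is serializable data, e.g. it could ship inside a packet.
const isAdult: Rule = { ">": [{ var: "age" }, 17] };
```

Because rules are data rather than code, the same configuration can drive transformations on any runtime that ships an evaluator.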
- Install Dependencies: `bun install`
- Run Tests: `bun test`
- Run the Demo: `bun run demo.ts`
- `src/types.ts`: Central, shared TypeScript type definitions.
- `src/parser.ts`: Contains the `createParser` factory.
- `src/etl.ts`: Contains the building blocks for creating ETL pipelines.
- `src/json-logic-extensions.ts`: Extends `json-logic-js` with custom operations.
- `src/packet-processor.ts`: Contains functions for working with `DataContractPacket` objects.
- `tests/`: Contains tests for all the core components.
- `demo.ts`: A script that demonstrates how to use the library.
- Add the operation to `src/json-logic-extensions.ts`: Follow the existing pattern to add your new function using `jsonLogic.add_operation`.
- Write a test for the new operation: Add a new test case to `tests/etl.test.ts` to ensure the operation works as expected.
- Update `API.md`: Add the new operation to the "Custom Operations" section of the API documentation.
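The first step follows `json-logic-js`'s `add_operation` registry pattern. The sketch below mimics that registry in a self-contained way; the registry itself and the `slugify` operation are illustrative, not part of SchemaFlow:

```typescript
// A minimal operation registry in the style of jsonLogic.add_operation.
const operations: Record<string, (...args: any[]) => unknown> = {};

function addOperation(name: string, fn: (...args: any[]) => unknown): void {
  operations[name] = fn;
}

// A hypothetical custom operation: turn a string into a URL slug.
addOperation("slugify", (s: string) =>
  s.toLowerCase().trim().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, ""),
);

// Apply an operation referenced by name, as a JSON Logic rule would.
function applyOp(name: string, ...args: unknown[]): unknown {
  const fn = operations[name];
  if (!fn) throw new Error(`Unknown operation: ${name}`);
  return fn(...args);
}
```

In the real library the registration call would live in `src/json-logic-extensions.ts`, so every transformer created afterwards can reference the operation by name in its rules.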
- Enhanced CLI: Create a CLI tool that uses `demo.ts` as a foundation to allow users to process packets from the command line.
- More JSON Logic Extensions: Expand `src/json-logic-extensions.ts` with more custom operations as needed for more complex transformations.
- Error Reporting: Improve the error messages in `EtlError` to provide more context and make debugging easier.
- Code Generation: Explore the possibility of generating `DataContractPacket` files from other sources, such as TypeScript types or database schemas.
- Always write tests first for any new functionality.
- Embrace immutability. Transformation functions should be pure.
- When a user asks to add a feature, think "How can I build this by composing the existing simple tools?"
- Keep the error handling robust. `ParseError` and `EtlError` are the standard way to signal that data does not conform to a contract.
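The immutability guideline in practice: a transform builds a new object and never mutates its input, so the same input always yields the same output. The `Person` type and `addFullName` transform here are just an example, not library code:

```typescript
type Person = { first: string; last: string };

// Pure transform: builds a new object via spread instead of mutating `p`.
function addFullName(p: Person): Person & { full: string } {
  return { ...p, full: `${p.first} ${p.last}` };
}

const input: Person = { first: "Grace", last: "Hopper" };
const output = addFullName(input);
// `input` is untouched; only `output` carries the derived `full` field.
```

Pure transforms like this are trivial to test in isolation and safe to compose into pipelines, since no step can corrupt another step's data.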
This foundation is solid. Let's build upon it with care and precision.