
Gemini Coding Assistant Guide: SchemaFlow

Hello! This guide is for you, the Gemini Coding Assistant. It outlines the architecture, core principles, and future direction of SchemaFlow, a data transformation library built around correctness, composability, and predictable error handling.

Core Philosophy

  1. Unix Philosophy: Each part of this library is a small, sharp tool that does one thing well. createParser makes parsers. createJsonLogicTransformer makes transformers. createEtlPipeline composes them into a pipeline. We compose these simple, predictable tools to build complex and reliable behavior.
  2. Parse, Don't Validate: We never trust data. We force it through a parser. Success means we get back a value that the TypeScript compiler guarantees is correctly typed. Failure means we throw a detailed, structured error. This eliminates an entire class of bugs at the source.
  3. Simplicity over Ease: We prioritize correctness and predictability. Logic is treated as data (JSON Schema, JSONLogic rules), which makes the system transparent and configurable.
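The "parse, don't validate" contract can be sketched as follows. Note that SchemaFlow's real createParser is AJV-backed and driven by a JSON Schema; this hand-rolled stand-in (with a hypothetical User type) only illustrates the shape of the contract: unknown in, typed value out, structured ParseError on failure.

```typescript
// Hand-rolled sketch of the parser contract. The real createParser compiles a
// JSON Schema with AJV; the error shape here is illustrative, not exact.
interface User {
  name: string;
  age: number;
}

class ParseError extends Error {
  constructor(message: string, public readonly issues: string[]) {
    super(message);
    this.name = "ParseError";
  }
}

function parseUser(input: unknown): User {
  const issues: string[] = [];
  const obj = input as Record<string, unknown>;
  if (typeof input !== "object" || input === null) {
    issues.push("expected an object");
  } else {
    if (typeof obj.name !== "string") issues.push("name must be a string");
    if (typeof obj.age !== "number") issues.push("age must be a number");
  }
  if (issues.length > 0) throw new ParseError("invalid User", issues);
  // From this point on, the compiler knows the value is a User.
  return { name: obj.name as string, age: obj.age as number };
}

const user = parseUser({ name: "Ada", age: 36 }); // typed as User
console.log(user.name);
```

The payoff is that downstream code never re-checks the shape: once a value has passed the parser, the type system carries the guarantee.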

Architecture Overview

The architecture of SchemaFlow is designed to be modular and composable. The data flow is a pipeline of functions, where each function has a single responsibility.

  1. Parsing: The first step in any pipeline is to parse the input data against a JSON Schema. The createParser function creates a parser that takes unknown data and returns a typed value. If the data is invalid, it throws a ParseError.
  2. Transformation: The core of the library is the transformation engine, which uses JSON Logic. The createJsonLogicTransformer function takes a set of JSON Logic rules and returns a transformation function.
  3. ETL Pipeline: The createEtlPipeline function composes a parser, a transformer, and an output parser into a single, robust pipeline with unified error handling.
  4. Bi-Directional Transformation: The createBiDirectionalTransformer function uses two ETL pipelines to create a transformer that can encode and decode data between two different schemas.
  5. Packet Processing: The DataContractPacket is a self-contained unit that includes schemas, rules, and sample data. The processPacket and verifyPacket functions provide a high-level interface for working with these packets.
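The composition in steps 1–3 can be sketched as below. The function name mirrors createEtlPipeline from the architecture above, but the stage bodies are stand-ins (a hand-written parser and a trivial transform), not SchemaFlow's real AJV- and JSON Logic-backed implementations.

```typescript
// Sketch of pipeline composition: parse input, transform, parse output.
// Each stage has a single responsibility; the pipeline just wires them up.
type Parser<T> = (input: unknown) => T;
type Transformer<A, B> = (input: A) => B;

function createEtlPipeline<A, B>(
  inputParser: Parser<A>,
  transform: Transformer<A, B>,
  outputParser: Parser<B>,
): (input: unknown) => B {
  return (input) => outputParser(transform(inputParser(input)));
}

// Stand-in stages: parse a { celsius } record, transform to { fahrenheit }.
const parseInput: Parser<{ celsius: number }> = (input) => {
  const obj = input as { celsius?: unknown };
  if (typeof obj?.celsius !== "number") throw new Error("expected { celsius: number }");
  return { celsius: obj.celsius };
};
const toFahrenheit = (v: { celsius: number }) => ({ fahrenheit: (v.celsius * 9) / 5 + 32 });
const parseOutput: Parser<{ fahrenheit: number }> = (v) => v as { fahrenheit: number };

const pipeline = createEtlPipeline(parseInput, toFahrenheit, parseOutput);
console.log(pipeline({ celsius: 100 })); // { fahrenheit: 212 }
```

A bi-directional transformer, as in step 4, is then just two such pipelines pointed in opposite directions.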

Core Technologies

  • TypeScript: For static typing and improved developer experience.
  • Bun: For fast and efficient development, testing, and dependency management.
  • JSON Schema: For defining data contracts and validating data shapes.
  • AJV: As the underlying engine for our createParser factory. It is fast, standards-compliant, and provides detailed error reporting.
  • JSONLogic: For defining transformation rules as portable, serializable data.

Development Workflow

  1. Install Dependencies: bun install
  2. Run Tests: bun test

Running the Demo

bun run demo.ts

Project Structure

  • src/types.ts: Central, shared TypeScript type definitions.
  • src/parser.ts: Contains the createParser factory.
  • src/etl.ts: Contains the building blocks for creating ETL pipelines.
  • src/json-logic-extensions.ts: Extends json-logic-js with custom operations.
  • src/packet-processor.ts: Contains functions for working with DataContractPacket objects.
  • tests/: Contains tests for all the core components.
  • demo.ts: A script that demonstrates how to use the library.

How to Approach Common Tasks

Adding a New JSON Logic Operation

  1. Add the operation to src/json-logic-extensions.ts: Follow the existing pattern to add your new function using jsonLogic.add_operation.
  2. Write a test for the new operation: Add a new test case to tests/etl.test.ts to ensure the operation works as expected.
  3. Update API.md: Add the new operation to the "Custom Operations" section of the API documentation.
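The add_operation pattern from step 1 looks roughly like this. To keep the example self-contained, a tiny stand-in registry replaces the real json-logic-js module; in src/json-logic-extensions.ts you would call the real jsonLogic.add_operation, and the real engine also evaluates arguments recursively, which this stand-in skips.

```typescript
// Minimal stand-in for the json-logic-js operation registry, used only to
// demonstrate the extension pattern.
type Operation = (...args: any[]) => unknown;
const operations = new Map<string, Operation>();

function add_operation(name: string, fn: Operation): void {
  operations.set(name, fn);
}

function apply(rule: Record<string, unknown[]>, data: unknown): unknown {
  const [name] = Object.keys(rule);
  const fn = operations.get(name);
  if (!fn) throw new Error(`unknown operation: ${name}`);
  // The real engine recursively evaluates each argument as a rule;
  // this sketch treats arguments as literals for brevity.
  return fn(...(rule[name] as unknown[]));
}

// A hypothetical custom operation, registered the same way as in step 1.
add_operation("toUpperCase", (s: string) => s.toUpperCase());
console.log(apply({ toUpperCase: ["hello"] }, {})); // "HELLO"
```

Because rules stay plain JSON, the new operation is immediately usable from any serialized DataContractPacket.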

Future Directions

  1. Enhanced CLI: Create a CLI tool that uses demo.ts as a foundation to allow users to process packets from the command line.
  2. More JSON Logic Extensions: Expand src/json-logic-extensions.ts with more custom operations as needed for more complex transformations.
  3. Error Reporting: Improve the error messages in EtlError to provide more context and make debugging easier.
  4. Code Generation: Explore the possibility of generating DataContractPacket files from other sources, such as TypeScript types or database schemas.

General Guidance

  • Always write tests first for any new functionality.
  • Embrace Immutability. Transformation functions should be pure.
  • When a user asks to add a feature, think "How can I build this by composing the existing simple tools?"
  • Keep the error handling robust. ParseError and EtlError are the standard way to signal that data does not conform to a contract.
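The error-handling convention above can be sketched like this. The class fields and the runStage helper are illustrative assumptions, not SchemaFlow's exact shapes; the point is the pattern of structured, discriminable errors with low-level failures wrapped so callers can see which pipeline stage failed.

```typescript
// Hypothetical sketch of the ParseError/EtlError convention.
class ParseError extends Error {
  constructor(message: string, public readonly issues: string[]) {
    super(message);
    this.name = "ParseError";
  }
}

class EtlError extends Error {
  constructor(
    message: string,
    public readonly stage: "input" | "transform" | "output",
    public readonly cause?: unknown,
  ) {
    super(message);
    this.name = "EtlError";
  }
}

function runStage(input: unknown): number {
  try {
    if (typeof input !== "number") {
      throw new ParseError("expected a number", ["input is not a number"]);
    }
    return input * 2; // pure transformation: no mutation, no side effects
  } catch (err) {
    // Wrap the underlying failure so callers see which stage broke.
    throw new EtlError("input stage failed", "input", err);
  }
}
```

Callers can then branch on the error class and stage rather than parsing message strings.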

This foundation is solid. Let's build upon it with care and precision.