@@ -8,6 +8,8 @@ JQ-Synth automatically generates [jq](https://stedolan.github.io/jq/) filter exp
 [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
+![Demo](demo.gif)
+
 ## Overview
 
 JQ-Synth solves a common developer problem: you know what JSON transformation you want, but writing the correct jq filter is tricky. Simply provide example input/output pairs, and JQ-Synth will synthesize the filter for you.
@@ -48,64 +50,6 @@ source .venv/bin/activate # On Windows: .venv\Scripts\activate
 pip install -e .
 ```
 
-## Supported Providers
-
-| Provider | Status | Note |
-|----------|--------|------|
-| OpenAI | Stable ✅ | Default, tested |
-| Anthropic | Beta ⚠️ | May have edge cases |
-| OpenRouter | Beta ⚠️ | Via OpenAI-compatible endpoint |
-| Ollama | Alpha 🧪 | Local only, requires setup |
-
-> Note: OpenAI is default and most tested. Others should work but report issues if found.
-
-### Provider Setup
-
-**OpenAI (Default)**
-
-```bash
-export OPENAI_API_KEY='sk-...'
-# Optional: specify model (default: gpt-4o)
-export LLM_MODEL='gpt-4o'
-```
-
-**Anthropic**
-
-```bash
-export LLM_PROVIDER='anthropic'
-export ANTHROPIC_API_KEY='sk-ant-...'
-# Optional: specify model (default: claude-sonnet-4-20250514)
-export LLM_MODEL='claude-sonnet-4-20250514'
-```
-
-**OpenRouter**
-
-```bash
-export LLM_BASE_URL='https://openrouter.ai/api/v1'
-export OPENAI_API_KEY='sk-or-...'
-export LLM_MODEL='anthropic/claude-3.5-sonnet'
-```
-
-**Local (Ollama)**
-
-```bash
-export LLM_BASE_URL='http://localhost:11434/v1'
-export LLM_MODEL='llama3'
-export OPENAI_API_KEY='dummy' # Ollama doesn't require a real key
-```
-
-**Together AI / Groq**
-
-```bash
-# Together AI
-export LLM_BASE_URL='https://api.together.xyz/v1'
-export OPENAI_API_KEY='...'
-
-# Groq
-export LLM_BASE_URL='https://api.groq.com/openai/v1'
-export OPENAI_API_KEY='gsk_...'
-```
-
 ## Quick Start
 
 ### Interactive Mode
@@ -245,6 +189,19 @@ jq-synth --base-url https://openrouter.ai/api/v1 --model anthropic/claude-3.5-so
 jq-synth --base-url http://localhost:11434/v1 --model llama3 --task nested-field
 ```
 
+## How It Works
+
+JQ-Synth uses a **deterministic oracle** approach:
+
+1. **Generation**: An LLM (GPT-4, Claude, or compatible model) generates candidate jq filters based on your examples and description
+2. **Verification**: Each filter is executed against the real jq binary with your input examples
+3. **Scoring**: A deterministic algorithm compares actual vs expected outputs, computing similarity scores (0.0 to 1.0)
+4. **Feedback**: The algorithm classifies errors (syntax, shape, missing/extra elements, order) and generates actionable feedback
+5. **Refinement**: The LLM receives the feedback and generates an improved filter
+6. **Iteration**: Steps 2-5 repeat until a perfect match is found or limits are reached
+
+This hybrid approach combines LLM creativity with deterministic verification, ensuring correctness while leveraging AI for filter synthesis.
+
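The six steps in "How It Works" can be sketched as a small loop. This is an illustrative sketch, not JQ-Synth's actual code: `synthesize`, `run_jq`, and the `propose` callable (standing in for the LLM call) are hypothetical names, the scoring is reduced to an exact-match check, and error classification is collapsed into a plain text note.

```python
import json
import subprocess
from typing import Any, Callable, Optional

Example = tuple[Any, Any]  # (input JSON value, expected output value)
Runner = Callable[[str, Any], tuple[Any, str]]
Proposer = Callable[[list[Example], str], str]


def run_jq(filter_expr: str, data: Any, timeout: float = 5.0) -> tuple[Any, str]:
    """Run the real jq binary and return (output, error_message)."""
    try:
        proc = subprocess.run(
            ["jq", "-c", filter_expr],
            input=json.dumps(data),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, str(exc)
    if proc.returncode != 0:
        return None, proc.stderr.strip()
    # `jq -c` prints one JSON value per line.
    values = [json.loads(line) for line in proc.stdout.splitlines() if line]
    # Simplification: unwrap a single output, keep multiple outputs as a list.
    return (values[0] if len(values) == 1 else values), ""


def synthesize(
    examples: list[Example],
    propose: Proposer,
    run: Runner = run_jq,
    max_iters: int = 5,
) -> Optional[str]:
    """Generate, verify, score, and refine until every example matches."""
    if not examples:
        raise ValueError("at least one example is required")
    feedback = ""
    for _ in range(max_iters):
        candidate = propose(examples, feedback)  # generation / refinement
        scores: list[float] = []
        notes: list[str] = []
        for data, expected in examples:  # verification
            actual, error = run(candidate, data)
            if error:
                scores.append(0.0)
                notes.append(f"jq error: {error}")
            elif actual == expected:
                scores.append(1.0)
            else:
                scores.append(0.0)  # placeholder for a real similarity score
                notes.append(
                    f"expected {json.dumps(expected)}, got {json.dumps(actual)}"
                )
        if sum(scores) / len(scores) == 1.0:  # perfect match on all examples
            return candidate
        feedback = "; ".join(notes)  # fed into the next proposal
    return None  # iteration limit reached
```

Passing the runner and proposer as callables keeps the deterministic part testable without an API key or a jq install: a test can substitute a stub for either one.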
 ## Architecture
 
 JQ-Synth follows a modular architecture with clear separation of concerns:
@@ -349,6 +306,104 @@ The reviewer classifies errors by priority (highest to lowest):
 - **Scalars**: Binary (1.0 for exact match, 0.0 for mismatch)
 - **Multiple examples**: Arithmetic mean of scores
 
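The two rules visible in this hunk (binary scalar scoring and averaging across examples) could look roughly like the following. The function names are hypothetical, and the special handling of booleans is an assumption made here so that JSON `true` does not compare equal to `1` in Python.

```python
from statistics import fmean
from typing import Any


def score_scalar(actual: Any, expected: Any) -> float:
    """Binary score: 1.0 for an exact match, 0.0 for a mismatch."""
    # Assumption: bool is a subclass of int in Python, so compare booleans
    # by identity to keep JSON true/false distinct from 1/0.
    if isinstance(actual, bool) or isinstance(expected, bool):
        return 1.0 if actual is expected else 0.0
    return 1.0 if actual == expected else 0.0


def score_task(pairs: list[tuple[Any, Any]]) -> float:
    """Arithmetic mean of per-example scores for (actual, expected) pairs."""
    if not pairs:
        raise ValueError("at least one example is required")
    return fmean([score_scalar(actual, expected) for actual, expected in pairs])
```

For instance, one matching example and one mismatching example average to `0.5`, which is what the refinement loop would see as a partial score.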
+## Supported jq Patterns
+
+JQ-Synth works well with these common jq operations:
+
+- **Field extraction**: `.foo`, `.user.name`, `.data.items[0]`
+- **Array operations**: `.[]`, `.[0]`, `.[1:3]`, `.[-1]`
+- **Filtering**: `select(.active == true)`, `select(.age > 18)`
+- **Mapping**: `map(.name)`, `[.[] | .id]`
+- **Array construction**: `[.items[].name]`
+- **Object construction**: `{name: .user.name, email: .user.email}`
+- **Conditionals**: `if .status == "active" then .name else null end`
+- **Null handling**: `select(. != null)`, `.field // "default"`
+- **String operations**: String interpolation, concatenation
+- **Arithmetic**: Addition, subtraction, comparison operators
+- **Type checking**: `type`, `length`
+
+## Known Limitations
+
+JQ-Synth may struggle with these advanced jq features:
+
+- **Aggregations**: `group_by()`, `reduce`, `min_by()`, `max_by()`
+- **Complex recursion**: `recurse()`, `walk()`
+- **Variable bindings**: Complex `as $var` patterns
+- **Custom functions**: `def` statements (blocked for security)
+- **Advanced array operations**: `combinations()`, `transpose()`
+- **Path manipulation**: `getpath()`, `setpath()`, `delpaths()`
+- **Format strings**: `@csv`, `@json`, `@base64`
+
+For these cases, you may need to write the filter manually or break down the task into simpler steps.
+
+## Model recommendations
+
+| Task complexity | Recommended model | Speed |
+|-----------------|-------------------|-------|
+| Simple filters (extract, select) | GPT-4o-mini, Claude Haiku | Fast |
+| Medium (grouping, aggregation, recursion) | Claude Sonnet, GPT-4o | Fast |
+| Complex algorithms (graph traversal, sorting) | DeepSeek R1 | Slow (minutes) |
+
+> Note: DeepSeek R1 solved topological sort and Dijkstra's shortest path in jq. Most users won't need this: standard models handle 95%+ of real-world tasks.
+
+## Supported Providers
+
+| Provider | Status | Note |
+|----------|--------|------|
+| OpenAI | Stable ✅ | Default provider |
+| Anthropic | Beta ⚠️ | Different API format |
+| OpenRouter | Tested ✅ | OpenAI-compatible |
+| Ollama | Alpha 🧪 | Local only, requires setup |
+
+> Note: OpenAI is default and most tested. Others should work but report issues if found.
+
+### Provider Setup
+
+**OpenAI (Default)**
+
+```bash
+export OPENAI_API_KEY='sk-...'
+# Optional: specify model (default: gpt-4o)
+export LLM_MODEL='gpt-4o'
+```
+
+**Anthropic**
+
+```bash
+export LLM_PROVIDER='anthropic'
+export ANTHROPIC_API_KEY='sk-ant-...'
+# Optional: specify model (default: claude-sonnet-4-20250514)
+export LLM_MODEL='claude-sonnet-4-20250514'
+```
+
+**OpenRouter**
+
+```bash
+export LLM_BASE_URL='https://openrouter.ai/api/v1'
+export OPENAI_API_KEY='sk-or-...'
+export LLM_MODEL='anthropic/claude-3.5-sonnet'
+```
+
+**Local (Ollama)**
+
+```bash
+export LLM_BASE_URL='http://localhost:11434/v1'
+export LLM_MODEL='llama3'
+export OPENAI_API_KEY='dummy' # Ollama doesn't require a real key
+```
+
+**Together AI / Groq**
+
+```bash
+# Together AI
+export LLM_BASE_URL='https://api.together.xyz/v1'
+export OPENAI_API_KEY='...'
+
+# Groq
+export LLM_BASE_URL='https://api.groq.com/openai/v1'
+export OPENAI_API_KEY='gsk_...'
+```
+
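To make the interaction between these environment variables concrete, here is one way they might be resolved into a single configuration object. This is a sketch under assumptions: `ProviderConfig` and `resolve_config` are hypothetical names, not JQ-Synth's API, and the fallback to `gpt-4o` for providers without a documented default is a guess. Only the variable names and the two default models come from the setup blocks above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default models as documented in the setup blocks above.
DEFAULT_MODELS = {
    "openai": "gpt-4o",
    "anthropic": "claude-sonnet-4-20250514",
}


@dataclass(frozen=True)
class ProviderConfig:
    provider: str
    model: str
    api_key: Optional[str]
    base_url: Optional[str]


def resolve_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Resolve provider settings from environment variables (hypothetical)."""
    provider = env.get("LLM_PROVIDER", "openai")
    # OpenAI-compatible endpoints (OpenRouter, Ollama, Together AI, Groq)
    # reuse OPENAI_API_KEY together with LLM_BASE_URL.
    key_var = "ANTHROPIC_API_KEY" if provider == "anthropic" else "OPENAI_API_KEY"
    return ProviderConfig(
        provider=provider,
        # Assumption: fall back to gpt-4o when no default is documented.
        model=env.get("LLM_MODEL", DEFAULT_MODELS.get(provider, "gpt-4o")),
        api_key=env.get(key_var),
        base_url=env.get("LLM_BASE_URL"),
    )
```

Taking the environment as a `Mapping` argument makes the resolution order easy to check with a plain dictionary instead of mutating `os.environ`.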
 ## Task File Format
 
 Tasks are defined in JSON format:
@@ -604,7 +659,7 @@ mypy src && \
 pytest -m "not e2e"
 ```
 
-### Project Structure
+## Project Structure
 
 ```
 jq-synth/
@@ -650,7 +705,7 @@ Contributions are welcome! Please follow these steps:
 6. **Push** to your fork: `git push origin feature/my-feature`
 7. **Open** a Pull Request
 
-### Code Style
+## Code Style
 
 - Type hints required for all public functions
 - Docstrings required for all public functions and classes (Google style)
@@ -669,49 +724,6 @@ MIT License - see [LICENSE](LICENSE) for details.
 - [OpenAI](https://openai.com) - GPT models and API
 - [Anthropic](https://anthropic.com) - Claude models and API
 
-## Supported jq Patterns
-
-JQ-Synth works well with these common jq operations:
-
-- **Field extraction**: `.foo`, `.user.name`, `.data.items[0]`
-- **Array operations**: `.[]`, `.[0]`, `.[1:3]`, `.[-1]`
-- **Filtering**: `select(.active == true)`, `select(.age > 18)`
-- **Mapping**: `map(.name)`, `[.[] | .id]`
-- **Array construction**: `[.items[].name]`
-- **Object construction**: `{name: .user.name, email: .user.email}`
-- **Conditionals**: `if .status == "active" then .name else null end`
-- **Null handling**: `select(. != null)`, `.field // "default"`
-- **String operations**: String interpolation, concatenation
-- **Arithmetic**: Addition, subtraction, comparison operators
-- **Type checking**: `type`, `length`
-
-## Known Limitations
-
-JQ-Synth may struggle with these advanced jq features:
-
-- **Aggregations**: `group_by()`, `reduce`, `min_by()`, `max_by()`
-- **Complex recursion**: `recurse()`, `walk()`
-- **Variable bindings**: Complex `as $var` patterns
-- **Custom functions**: `def` statements (blocked for security)
-- **Advanced array operations**: `combinations()`, `transpose()`
-- **Path manipulation**: `getpath()`, `setpath()`, `delpaths()`
-- **Format strings**: `@csv`, `@json`, `@base64`
-
-For these cases, you may need to write the filter manually or break down the task into simpler steps.
-
-## How It Works
-
-JQ-Synth uses a **deterministic oracle** approach:
-
-1. **Generation**: An LLM (GPT-4, Claude, or compatible model) generates candidate jq filters based on your examples and description
-2. **Verification**: Each filter is executed against the real jq binary with your input examples
-3. **Scoring**: A deterministic algorithm compares actual vs expected outputs, computing similarity scores (0.0 to 1.0)
-4. **Feedback**: The algorithm classifies errors (syntax, shape, missing/extra elements, order) and generates actionable feedback
-5. **Refinement**: The LLM receives the feedback and generates an improved filter
-6. **Iteration**: Steps 2-5 repeat until a perfect match is found or limits are reached
-
-This hybrid approach combines LLM creativity with deterministic verification, ensuring correctness while leveraging AI for filter synthesis.
-
 ---
 
 **JQ-Synth** - Because life's too short to debug jq filters manually.