Card similarity, search, and deck operations for Magic: The Gathering, Pokemon TCG, and Yu-Gi-Oh!.
DeckSage combines tournament co-occurrence embeddings (PecanPy + Word2Vec on 184K deck lists), card attribute fusion, text search (MeiliSearch + Qdrant), visual embeddings (SigLIP2), and Jaccard co-occurrence to find similar cards, complete partial decks, and surface synergies / substitutes / upgrades.
Requires Python 3.11+ and Docker (for MeiliSearch + Qdrant). uv is recommended.

```bash
uv sync --extra embeddings
```

For development (adds ruff, pytest, playwright):

```bash
uv sync --extra dev --extra embeddings
```

Embedding files, graph databases, and processed CSVs are not stored in git (they are too large). To set up a new machine:
- Obtain the data archive (tarball, ~2 GB compressed / ~6 GB uncompressed)
- Extract into the repo root:
  ```bash
  tar xzf decksage-demo-data.tar.gz
  ```
- This creates the required files under `data/embeddings/`, `data/graphs/`, `data/processed/`, and `data/cache/`
Required assets (referenced in `.env.example`):

- `data/embeddings/magic_v7_spectral_mu35.wv` (14 MB)
- `data/embeddings/pokemon_v7_fused.wv` (2.3 MB)
- `data/embeddings/yugioh_v7_spectral_mu3.wv` (7.4 MB)
- `data/graphs/{magic,pokemon,yugioh}_unified.db` (5.3 GB total -- card metadata, co-occurrence)
- `data/processed/pairs_*.csv` (456 MB -- Jaccard signal)
- `data/processed/card_attributes_*_enriched.csv` (46 MB)
- `data/cache/text_embeddings/` (205 MB -- optional, rebuilds on startup)
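Before starting the API, it can help to sanity-check the extraction. A minimal sketch that verifies the three embedding files listed above (the path list is not exhaustive; extend it with the other assets as needed):

```python
from pathlib import Path

# Paths copied from the required-assets list; add the graph DBs and CSVs
# from that list if you want a full check.
REQUIRED_ASSETS = [
    "data/embeddings/magic_v7_spectral_mu35.wv",
    "data/embeddings/pokemon_v7_fused.wv",
    "data/embeddings/yugioh_v7_spectral_mu3.wv",
]

def missing_assets(root: str = ".") -> list[str]:
    """Return the subset of REQUIRED_ASSETS not present under root."""
    return [p for p in REQUIRED_ASSETS if not (Path(root) / p).exists()]
```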
```bash
# 1. Start search backends (MeiliSearch + Qdrant)
docker compose up -d meilisearch qdrant

# 2. Copy and edit .env (defaults in .env.example work if data assets are in place)
cp .env.example .env

# 3. Start the API (loads all 3 games, ~40s startup)
uv run uvicorn src.ml.api.api:app --host 127.0.0.1 --port 8001
```

Open http://localhost:8001 for the web UI, or http://localhost:8001/docs for interactive API docs.
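Since startup takes roughly 40 seconds while embeddings load, a client can poll the `/ready` probe before issuing queries. A minimal sketch using only the standard library:

```python
import urllib.request

def api_ready(base: str = "http://127.0.0.1:8001", timeout: float = 2.0) -> bool:
    """Return True once the /ready probe answers 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base}/ready", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused or timed out: server not up yet.
        return False
```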
```bash
DECKSAGE="uv run src/ml/cli/main.py"
$DECKSAGE --game magic similar "Lightning Bolt" --k 5 --output table
$DECKSAGE --game magic search "destroy all creatures" --limit 5
$DECKSAGE --game pokemon similar "Ultra Ball" --k 5
$DECKSAGE health --output json
```

| Endpoint | Method | Description |
|---|---|---|
| `/live`, `/ready` | GET | Liveness / readiness probes |
| `/v1/games` | GET | List loaded games and card counts |
| `/v1/health?game=magic` | GET | Per-game health (card count, embedding dim) |
| `/v1/similar` | POST | Card similarity (all 6 modes, custom weights) |
| `/v1/cards/{name}/similar` | GET | Card similarity (convenience GET) |
| `/v1/cards/{name}/contextual` | GET | Contextual suggestions (synergies, alternatives, upgrades, downgrades) |
| `/v1/cards?prefix=Light` | GET | Card name autocomplete |
| `/v1/search` | GET/POST | Hybrid text + vector search |
| `/v1/deck/complete` | POST | Deck completion (greedy fill to target size) |
| `/v1/deck/suggest_actions` | POST | Deck improvement suggestions |
| `/v1/deck/apply_patch` | POST | Apply add/remove operations to a deck |
| `/v1/feedback` | POST | Submit user feedback |
Interactive docs at /docs when the server is running.
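As an illustration of calling the similarity endpoint programmatically: the JSON field names below are assumptions about the `/v1/similar` request schema, not the authoritative shape -- consult `/docs` (OpenAPI) for the real contract.

```python
import json
import urllib.request

API = "http://127.0.0.1:8001"

def similar_payload(game: str, card: str, k: int = 10, mode: str = "fusion") -> dict:
    # Hypothetical request body for POST /v1/similar; verify against /docs.
    return {"game": game, "card": card, "k": k, "mode": mode}

def post_similar(payload: dict) -> dict:
    """Send the payload to the (assumed) /v1/similar route and decode JSON."""
    req = urllib.request.Request(
        f"{API}/v1/similar",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = similar_payload("magic", "Lightning Bolt", k=5, mode="substitute")
```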
| Mode | Method | Use case |
|---|---|---|
| `substitute` | Embedding cosine | Functional replacements (same role/effect) |
| `synergy` | Jaccard co-occurrence | Cards that go in the same deck |
| `meta` | Meta fusion | Competitive metagame pairings |
| `fusion` | Weighted late fusion | Blended signal (all methods) |
| `embedding` | Raw embedding cosine | Direct embedding similarity |
| `jaccard` | Jaccard index | Direct co-occurrence overlap |
184K tournament deck lists (Magic 83K, Yu-Gi-Oh 77K, Pokemon 24K) from MTGGoldfish, MTGTop8, Limitless TCG, MasterDuelMeta, YGOProDeck, Archidekt. Six similarity signals: co-occurrence embeddings (PecanPy + Word2Vec, 128D), text embeddings (E5-base-v2), visual embeddings (SigLIP2), Jaccard co-occurrence, functional tag matching, and archetype similarity. See scripts/training/ for the full pipeline and docs/experimental_narrative.md for a walkthrough of 63 experiments.
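The Jaccard co-occurrence signal reduces to set overlap over deck lists; a minimal sketch:

```python
def jaccard(decks_a: set[int], decks_b: set[int]) -> float:
    """Jaccard index over the sets of deck IDs each card appears in."""
    union = decks_a | decks_b
    return len(decks_a & decks_b) / len(union) if union else 0.0
```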
100K LLM-generated annotations across 3 games ($25 total via Groq 70B + Cerebras 235B cascade). See data/experiments/SUMMARY.md for all 63 experiments.
Co-occurrence embeddings (v7, substitute nDCG@10, saturated):
| Game | nDCG@10 | Condensed nDCG@10 | Gap |
|---|---|---|---|
| Magic | 0.525 | 0.527 | 0.002 |
| Pokemon | 0.437 | 0.438 | 0.001 |
| Yu-Gi-Oh | 0.478 | 0.482 | 0.004 |
Text embeddings (E5-base-v2, condensed substitute nDCG@10, 14-25% better):
| Game | Condensed nDCG | vs Co-occurrence |
|---|---|---|
| Magic | 0.613 | +22% |
| Pokemon | 0.518 | +25% |
| Yu-Gi-Oh | 0.532 | +14% |
Eval scripts in scripts/evaluation/: eval_per_mode.py, eval_search_relevance.py, eval_deck_completion.py, intrinsic_eval.py.
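The nDCG@10 numbers above follow the standard definition: discounted cumulative gain of the returned ranking, divided by the DCG of the ideal ordering. A self-contained sketch (the relevance scale is whatever the annotation set uses):

```python
import math

def ndcg_at_k(ranked: list[str], relevance: dict[str, float], k: int = 10) -> float:
    """nDCG@k for one query: DCG of the ranking over DCG of the ideal order."""
    def dcg(gains: list[float]) -> float:
        # Log-discount by rank position (rank 1 -> log2(2), rank 2 -> log2(3), ...).
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    got = dcg([relevance.get(card, 0.0) for card in ranked[:k]])
    ideal = dcg(sorted(relevance.values(), reverse=True)[:k])
    return got / ideal if ideal else 0.0
```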
```
src/ml/        Python ML code (similarity, deck building, search, API, CLI)
src/ml/tests/  Test suite (818 tests)
src/backend/   Go backend (data extraction for 6 games)
frontend/      Web frontend (unified light theme)
scripts/       Training, annotation, evaluation, data processing
tests/e2e/     Playwright E2E tests (45 tests)
data/          Embeddings, pairs, enriched CSVs, annotations, test sets
```
```bash
just test            # full Python test suite (818 tests)
just lint            # ruff check + format
npx playwright test  # E2E tests (45 tests, requires running server + search backends)
```

Search backends via Docker:

```bash
just qa-deps-up    # start MeiliSearch + Qdrant
just qa-deps-down  # stop
```

Further documentation:

- `docs/experimental_narrative.md` -- walkthrough of 63 experiments (12 phases), intended for ML audiences
- `data/experiments/SUMMARY.md` -- experiment index with metrics
- `docs/failure_taxonomy.md` -- categorized failure modes from manual analysis
- `docs/figures/experiment_progression.png` -- nDCG progression across experiments
MIT; see LICENSE.