Before you start building, complete the setup below. Once setup is finished, you'll have access to Cognee's search interface, backed by a prebuilt knowledge graph generated from synthetic invoice and transaction data. Your job is to build anything that uses this QA capability in a meaningful way.
- Query Cognee using natural-language questions (see how the `completion` is generated in `solution_q_and_a.py`; a minimal sketch follows this list).
- Receive structured or free-text answers.
- Use those answers however you like in your project.
- Qdrant must be the vector store, whether local or hosted.
- The local model must remain functional; online LLM use is optional.
- The raw data included in the `data` folder is there for reference and should not be used directly.
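For a first feel of the QA capability, the sketch below shows one plausible way to ask a question programmatically. It assumes the pipeline answers through `cognee.search` with a completion-style `SearchType`; the exact search type and wiring used in `solution_q_and_a.py` may differ, so treat that script as the authoritative version.

```python
import asyncio

import cognee
from cognee.api.v1.search import SearchType


async def ask(question: str):
    # Assumption: a completion-style SearchType (e.g. GRAPH_COMPLETION) answers
    # natural-language questions over the prebuilt invoice/transaction graph.
    return await cognee.search(
        query_text=question,
        query_type=SearchType.GRAPH_COMPLETION,
    )


if __name__ == "__main__":
    answers = asyncio.run(ask("Which vendor issued the largest invoice?"))
    print(answers)
```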
Any tool, workflow, interface, or feature that benefits from QA over vendor, product, payment, or order information.
- Create a folder named `submission` on your USB stick and place your entire project inside it. Alternatively, you can share your GitHub repo with [email protected].
- Your project must include code that demonstrates successful queries to Cognee.
- Be ready to give a short demo.
- You do not have to add new files or modify or enrich the graph. If you want to, there is some additional data in the `optional_data_for_enrichment` folder.
We will set up:
- Ollama with two local models (embedding and LLM)
- A Python virtual environment with pinned dependencies
- A Cognee knowledge graph imported from prebuilt data
- A local Qdrant vector store loaded with snapshot data
- The question-answering script (`solution_q_and_a.py`)
This will allow you to:
- Access ingested data from invoice and transaction documents
- Retrieve structured context from a knowledge graph for LLM queries
- Ask natural-language questions about the data using a local language model
- Build tools, agents, or workflows on top of the Q&A pipeline
Before installation:
- copy `models/` from the USB to your working directory
- do the same for `cognee_export/`
- verify that the three subdirectories each contain a Modelfile and a *.gguf file
Project installation:
# Ollama installation
brew install ollama # Mac OS
ollama serve &
# Ollama model registration
cd models
ollama create nomic-embed-text -f nomic-embed-text/Modelfile
ollama create cognee-distillabs-model-gguf-quantized -f cognee-distillabs-model-gguf-quantized/Modelfile
cd ..
# Initialize python environment, install dependencies
uv venv
source .venv/bin/activate
uv sync
# Graph setup
python setup.py
# Qdrant (local Docker)
docker run -d --name qdrant -p 6333:6333 -p 6334:6334 -v qdrant_storage:/qdrant/storage qdrant/qdrant
# Configure for use locally, retrieve data, restore to database
cp .env.example.local .env
uv run python download-from-spaces.py
uv run python restore-snapshots.py
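# (Optional) sanity checks -- standard Ollama/Qdrant commands, not project scripts;
# they assume the default local ports used above
ollama list                              # both registered models should appear
curl http://localhost:6333/collections   # should list the restored Qdrant collections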
# Run Q and A example
python solution_q_and_a.py
Pitfalls to avoid:
- failing to copy both `models/` and `cognee_export/` from the USB
- building the venv in `models/` instead of the project root
- having a stale venv activated
- Ollama not running
- a new Qdrant container conflicting with an old one in Docker
Next steps:
- look around the code
- play with the queries
- check out the databases
- build something
Skip this reference if setup went smoothly.
Turn off and remove Qdrant from Docker
If you need to recreate it:
docker stop qdrant && docker rm qdrant
docker volume rm qdrant_storage
Mac start/stop ollama
brew services start ollama
brew services stop ollama
brew services info ollama
Linux start/stop ollama
sudo systemctl start ollama
sudo systemctl stop ollama
sudo systemctl status ollama
Alternate Ollama Installation
# Alternate direct option
curl -fsSL https://ollama.com/install.sh | sh
After restore, your cluster contains 14,837 vectors across 6 collections:
| Collection | Records | Content |
|---|---|---|
| DocumentChunk_text | 2,000 | Invoice and transaction chunks |
| Entity_name | 8,816 | Products, vendors, SKUs |
| EntityType_name | 8 | Entity type definitions |
| EdgeType_relationship_name | 13 | Relationship types |
| TextDocument_name | 2,000 | Document references |
| TextSummary_text | 2,000 | Document summaries |
These items are also connected by semantic relationships in your graph database.
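If you want to inspect the restored collections outside of Cognee, a short sketch with the `qdrant-client` Python package is below. The URL assumes the local Docker Qdrant from the setup steps; the collection name comes from the table above.

```python
from qdrant_client import QdrantClient

# Local Docker Qdrant started earlier (REST port 6333)
client = QdrantClient(url="http://localhost:6333")

# List the restored collections
print([c.name for c in client.get_collections().collections])

# Peek at a few invoice/transaction chunks and their payloads
points, _next = client.scroll(collection_name="DocumentChunk_text", limit=3, with_payload=True)
for point in points:
    print(point.id, point.payload)
```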
The models included in the models/ directory:
- nomic-embed-text -- 768-dim embeddings, local inference
- Distil Labs SLM -- fine-tuned reasoning model, GGUF quantized
- Qwen3-4B -- fallback LLM, optional
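If you want to call the embedding model directly (outside of cognee), one option is Ollama's native embeddings endpoint. The sketch below assumes the `requests` package is available and that Ollama is serving on its default port with `nomic-embed-text` registered as above.

```python
import requests

# Ollama's built-in embeddings endpoint on the default port
response = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "Invoice 1042 from Acme Corp for 12 laptops"},
)
response.raise_for_status()
vector = response.json()["embedding"]
print(len(vector))  # expect 768 dimensions per the model description above
```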
Several example projects are provided that you can build on (totally optional). They are three ready-to-run FastAPI projects: semantic search, spend analytics, and anomaly detection on procurement data.
Stack: cognee (knowledge graph memory) + Qdrant Cloud (vector search) + Distil Labs (LLM reasoning) + DigitalOcean (deployment)
Raw documents
|
v
cognee.add() + cognee.cognify() <-- cognee extracts entities, relationships, summaries
|
v
Qdrant Cloud (6 collections) <-- vectors + knowledge graph stored here
|
v
FastAPI apps <-- search, analytics, anomaly detection
|
v
Distil Labs SLM <-- LLM reasoning (local GGUF or hosted API)
|
v
DigitalOcean App Platform <-- deployed and shareable
Hackathon participants should feel free to build on these or to do something totally different. The three example projects in the project1, project2, and project3 directories are each self-contained, with their own `pyproject.toml` and dependencies:
cd project1-procurement-search # or project2 or project3
uv sync
uv run python app.py
Project 1: Procurement Semantic Search (port 7777) -- semantic search across all procurement data with an interactive UI.
Qdrant features: Query API, Prefetch + RRF Fusion, Group API, Discovery API, Recommend API, payload indexing, filtered search
Endpoints: /search, /search/grouped, /discover, /recommend, /filter, /ask (RAG Q&A), /cognee-search, /add-knowledge, /collections
Project 2: Spend Analytics Dashboard (port 5553) -- interactive analytics dashboard with Chart.js visualizations and semantic search.
Qdrant features: Scroll API (bulk extraction), Query API, Group API, payload indexing
Endpoints: /api/analytics, /api/search, /api/search/grouped, /api/insights (LLM analysis)
Project 3: Anomaly Detective (port 6971) -- automated anomaly detection using vector analysis and Qdrant's Batch Query API. Detection methods include amount outliers (z-score), embedding outliers (centroid distance), near-duplicates (similarity > 0.99), and vendor variance.
Qdrant features: Batch Query API (50 recommend queries/request), Recommend API, Scroll API with vectors, payload indexing
Endpoints: /api/anomalies, /api/search, /api/investigate/{point_id}, /api/explain/{point_id} (LLM explanation)
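As a quick smoke test once one of the apps is running, you can hit its HTTP API directly. The sketch below targets Project 1's /search endpoint with the `requests` package; the request body (a `query` field plus a `limit`) is an assumption, so check the FastAPI request models in project1 for the actual schema.

```python
import requests

# Hypothetical request body -- verify field names against project1's request models
response = requests.post(
    "http://localhost:7777/search",
    json={"query": "office chairs from European vendors", "limit": 5},
)
response.raise_for_status()
print(response.json())
```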
If you prefer hosted Qdrant over local Docker, set up a free cluster at cloud.qdrant.io and use .env.example instead of .env.example.local:
cp .env.example .env
# Edit .env -- fill in QDRANT_URL and QDRANT_API_KEY with your Cloud values
uv run python download-from-spaces.py
uv run python restore-snapshots.py
Example results comparing LLM and SLM outputs can be found in responses.txt.
The starter data was built using cognee's ECL (Extract, Cognify, Load) pipeline:
cd cognee-pipeline
cp .env.example .env
# Edit .env: add Qdrant credentials + LLM provider
uv sync
uv run python ingest.py
Programmatic usage:
import asyncio
import cognee
from cognee.api.v1.search import SearchType

async def main():
    # add() ingests the text, cognify() builds the knowledge graph from it
    await cognee.add("Your document text here...")
    await cognee.cognify()
    results = await cognee.search(
        query_text="What vendors supply IT equipment?",
        query_type=SearchType.CHUNKS,
    )
    print(results)

asyncio.run(main())

Supported input types: plain text strings, PDF, DOCX, TXT, and CSV files, URLs, and directories of files.
To reset and re-ingest from scratch:
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
See the cognee docs for full pipeline options.
Register the Qwen3 model with Ollama:
cd models
ollama create Qwen3-4B-Q4_K_M -f Qwen3-4B-Q4_K_M/Modelfile
cd ..
Access it via the standard OpenAI-compatible interface at http://localhost:11434/v1 with model name Qwen3-4B-Q4_K_M.
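For example, with the `openai` Python package (an extra dependency, not necessarily part of the project's pinned requirements), a chat call against the local Ollama server could look like this:

```python
from openai import OpenAI

# Ollama ignores the API key, but the client requires a non-empty value
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

reply = client.chat.completions.create(
    model="Qwen3-4B-Q4_K_M",
    messages=[{"role": "user", "content": "Summarize what a purchase order is in one sentence."}],
)
print(reply.choices[0].message.content)
```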
Two modes are available: local (GGUF models, default) and remote (API-based inference).
Local dev runs the Distil Labs SLM via llama-cpp-python (requires 4-8GB RAM):
# .env: LLM_MODE=local, EMBED_MODE=local (defaults)
uv run python app.py
Remote deployment to DigitalOcean App Platform:
uv run python upload-to-spaces.py
# Set LLM_MODE=remote and EMBED_MODE=remote in .env
doctl apps create --spec .do/app.yaml
Or run remotely via Docker:
docker compose up
Environment variables:
| Variable | Default | Description |
|---|---|---|
| `QDRANT_URL` | - | Qdrant Cloud cluster URL |
| `QDRANT_API_KEY` | - | Qdrant Cloud API key |
| `LLM_MODE` | `local` | `local` (GGUF) or `remote` (API) |
| `LLM_API_URL` | - | OpenAI-compatible chat completions endpoint |
| `LLM_API_KEY` | - | API key for remote LLM |
| `LLM_MODEL_NAME` | `distil-labs-slm` | Model name for remote LLM |
| `EMBED_MODE` | `local` | `local` (GGUF) or `remote` (API) |
| `EMBED_API_URL` | - | OpenAI-compatible embeddings endpoint |
| `EMBED_API_KEY` | - | API key for remote embeddings |
| `SPACES_ENDPOINT` | - | DO Spaces endpoint |
| `SPACES_BUCKET` | - | DO Spaces bucket name |
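For remote (API-based) inference, a .env assembled from these variables might look roughly like the sketch below; every URL and key here is a placeholder, not a real endpoint:

```
QDRANT_URL=https://your-cluster.cloud.qdrant.io
QDRANT_API_KEY=your-qdrant-key
LLM_MODE=remote
LLM_API_URL=https://your-llm-host/v1/chat/completions
LLM_API_KEY=your-llm-key
LLM_MODEL_NAME=distil-labs-slm
EMBED_MODE=remote
EMBED_API_URL=https://your-embed-host/v1/embeddings
EMBED_API_KEY=your-embed-key
```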