Commit 87a8cc4

Merge pull request #24 from ContextLab/generate-astrophysics-questions
Joint UMAP refit, density flattening, and welcome screen redesign
2 parents 095bf47 + 8bf223e commit 87a8cc4

File tree

11,030 files changed

+135589
-1849875
lines changed


.claude/skills/generate-questions.md renamed to .claude/skills/generate-questions/SKILL.md

Lines changed: 79 additions & 15 deletions

@@ -1,7 +1,18 @@
+---
+name: generate-questions
+description: "Generate high-quality multiple-choice questions for a Knowledge Mapper domain. Use when asked to generate or regenerate questions for a domain (e.g., 'generate questions for biology', 'regenerate the physics question set'). Accepts a domain name as $ARGUMENTS (e.g., /generate-questions quantum-physics). Runs a 5-step iterative pipeline: generate Q+A → review Q+A → generate distractors → review distractors → compile JSON."
+---
+
 # Skill: Generate Domain Questions
 
 Generate high-quality multiple-choice questions for the Knowledge Mapper application using an iterative multi-agent pipeline.
 
+## Arguments
+
+This skill accepts a **domain ID** as `$ARGUMENTS` (e.g., `quantum-physics`, `astrophysics`, `biology`).
+
+If no argument is provided, ask the user which domain to generate questions for.
+
 ## When to Use
 
 Use this skill when asked to generate or regenerate questions for a domain (e.g., "generate questions for biology", "regenerate the physics question set").

@@ -10,7 +21,7 @@ Use this skill when asked to generate or regenerate questions for a domain (e.g.
 
 Knowledge Mapper is a GP-based knowledge estimation app. Users answer multiple-choice questions positioned on a 2D map of Wikipedia articles. Question quality directly impacts the usefulness of knowledge estimation.
 
-### Output Format (per question)
+### Working Output Format (per question during generation)
 
 ```json
 {

@@ -24,7 +35,7 @@ Knowledge Mapper is a GP-based knowledge estimation app. Users answer multiple-c
 }
 ```
 
-**Do NOT include**: `id`, `x`, `y`, `z`, `options`, or `correct_answer` slot letter. IDs and coordinates are assigned programmatically after generation. Option slot assignment (A/B/C/D) and randomization happen at display time.
+**Do NOT include during generation**: `id`, `x`, `y`, `z`, `options`, or `correct_answer` slot letter. These are assigned during Final Assembly.
 
 ### Formatting Rules
 - Questions: **50 words or fewer**

@@ -50,11 +61,11 @@ This enables resuming from working files if context runs out.
 
 ### Prerequisites
 
-The orchestrator provides each question generation with:
+The domain ID comes from `$ARGUMENTS`. The orchestrator provides each question generation with:
 - A **CONCEPT** (e.g., "photosynthesis")
 - A **WIKIPEDIA ARTICLE** (full text, fetched via WebFetch)
 - A **DIFFICULTY LEVEL** (integer 1-4)
-- A **DOMAIN ID** (e.g., "biology")
+- A **DOMAIN ID** (from `$ARGUMENTS`, e.g., "biology")
 
 ### Step 1: Generate Question + Correct Answer

@@ -181,7 +192,7 @@ Revise any distractors that fail checks.
 
 **Instructions to agent**:
 
-Compile the final question JSON. Do NOT include `id`, `x`, `y`, `z`, `options`, or `correct_answer` slot letter — these are assigned programmatically.
+Compile the final question JSON for the working file. Do NOT include `id`, `x`, `y`, `z`, `options`, or `correct_answer` slot letter — these are assigned during Final Assembly.
 
 **Agent output** (JSON):
 ```json

@@ -202,7 +213,7 @@ Compile the final question JSON. Do NOT include `id`, `x`, `y`, `z`, `options`,
 
 ```
 TodoWrite([
-  { content: "Generate concepts for {domain}", status: "in_progress", activeForm: "Generating concepts" },
+  { content: "Generate concepts for $ARGUMENTS", status: "in_progress", activeForm: "Generating concepts" },
   { content: "Questions: 0/50 complete", status: "pending", activeForm: "Generating questions" },
   { content: "Assemble final domain JSON", status: "pending", activeForm: "Assembling domain JSON" },
 ])

@@ -214,7 +225,7 @@ For each question, update the master todo AND maintain per-question detail:
 
 ```
 TodoWrite([
-  { content: "Generate concepts for {domain}", status: "completed", activeForm: "Generating concepts" },
+  { content: "Generate concepts for $ARGUMENTS", status: "completed", activeForm: "Generating concepts" },
   { content: "Questions: 12/50 complete", status: "in_progress", activeForm: "Generating questions" },
   { content: "Q13 '{concept}': Step 1 generate", status: "in_progress", activeForm: "Generating Q13 question+answer" },
   { content: "Q13 '{concept}': Step 2 review Q+A", status: "pending", activeForm: "Reviewing Q13" },

@@ -227,21 +238,74 @@ TodoWrite([
 
 ## Checkpointing
 
-Write completed questions to `data/domains/.working/{domain-id}-questions.json` after EVERY question completes Step 5. This file is an array of completed question JSONs. If context runs out, the next agent reads this file to know which questions are done and resumes from where it left off.
+Write completed questions to `data/domains/.working/$ARGUMENTS-questions.json` after EVERY question completes Step 5. This file is an array of completed question JSONs. If context runs out, the next agent reads this file to know which questions are done and resumes from where it left off.
+
+## Final Assembly (after all 50 questions complete)
+
+After all questions are generated, assemble the final domain JSON file:
+
+### Assembly Steps
 
-## Assembly (after all 50 questions complete)
+1. **Read working file**: `data/domains/.working/$ARGUMENTS-questions.json`
+2. **Read existing domain file**: `data/domains/$ARGUMENTS.json` to get the existing `domain`, `labels`, and `articles` arrays
+3. **For each question**, assign:
+   - **ID**: First 16 hex characters of SHA-256 hash of `question_text`
+   - **Option slots**: Randomly assign correct answer and distractors to A/B/C/D slots:
+     - Pick a random slot (A, B, C, or D) for the correct answer
+     - Fill remaining slots with the 3 distractors in random order
+     - Record which slot letter contains the correct answer
+4. **Write final domain JSON** to `data/domains/$ARGUMENTS.json`
+
+### Final Domain JSON Structure
+
+```json
+{
+  "domain": {
+    "id": "astrophysics",
+    "name": "Astrophysics",
+    "parent_id": "physics",
+    "level": "sub",
+    "region": {
+      "x_min": 0.042179,
+      "x_max": 0.295656,
+      "y_min": 0.413276,
+      "y_max": 0.67439
+    },
+    "grid_size": 70
+  },
+  "questions": [
+    {
+      "id": "04a772bcef67e50f",
+      "question_text": "What is stellar parallax?",
+      "options": {
+        "A": "The gravitational bending of light from distant stars...",
+        "B": "The redshift observed in a star's light spectrum...",
+        "C": "The apparent shift in a nearby star's position against distant background stars...",
+        "D": "The dimming of a star's brightness as it passes behind another celestial body..."
+      },
+      "correct_answer": "C",
+      "difficulty": 1,
+      "source_article": "Stellar parallax",
+      "domain_ids": ["astrophysics"],
+      "concepts_tested": ["stellar parallax"]
+    }
+  ],
+  "labels": [...],
+  "articles": [...]
+}
+```
 
-After all questions are generated:
+### Assembly Notes
 
-1. Read working file: `data/domains/.working/{domain-id}-questions.json`
-2. Read existing domain file: `data/domains/{domain-id}.json` (preserve `articles` array)
-3. Assign IDs, coordinates, and option slots programmatically (handled by caller, NOT this skill)
-4. Write completed questions to working file for the caller to assemble
+- **x, y, z coordinates** are NOT assigned by this skill — they come from the embedding pipeline
+- **Preserve existing data**: Keep the `domain`, `labels`, and `articles` arrays from the existing domain file
+- **Replace questions**: The `questions` array is fully replaced with the newly generated questions
+- **Randomization**: Option slot assignment must be truly random to prevent position bias in answers
 
 ## Important Notes
 
 - **Model**: Use Claude Opus (claude-opus-4-6) for all 5 steps. Question quality is paramount.
 - **One domain at a time**: The caller invokes this skill per domain and can parallelize across domains.
 - **Factual accuracy is non-negotiable**: Steps 1 and 2 MUST verify facts via the Wikipedia article and web searches. Any ambiguity must be resolved before proceeding.
 - **TodoWrite is mandatory**: Every step transition and every completed question MUST be reflected in TodoWrite.
-- **No coordinates or IDs**: This skill produces question content only. Spatial embedding and ID assignment happen in a separate post-processing step.
+- **No coordinates**: This skill produces question content only. Spatial embedding (x, y, z coordinates) happens in a separate post-processing step.
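The assembly rules this commit adds to SKILL.md (question IDs from the first 16 hex characters of a SHA-256 hash, plus random A/B/C/D slot assignment) can be sketched as follows. This is a minimal illustration, not code from the repository; the function names `question_id` and `assign_slots` are invented for the example:

```python
import hashlib
import random

def question_id(question_text: str) -> str:
    # First 16 hex characters of the SHA-256 hash of the question text,
    # as described in the Assembly Steps.
    return hashlib.sha256(question_text.encode("utf-8")).hexdigest()[:16]

def assign_slots(correct: str, distractors: list[str],
                 rng: random.Random) -> tuple[dict, str]:
    # Pick a random slot for the correct answer, fill the remaining
    # slots with the three distractors in random order, and record
    # which slot letter holds the correct answer.
    slots = ["A", "B", "C", "D"]
    correct_slot = rng.choice(slots)
    pool = distractors[:]
    rng.shuffle(pool)
    options = {}
    for slot in slots:
        options[slot] = correct if slot == correct_slot else pool.pop()
    return options, correct_slot

# Usage: deterministic hashing keeps IDs stable across regeneration runs,
# while a fresh RNG per assembly keeps option positions unbiased.
qid = question_id("What is stellar parallax?")
options, correct_slot = assign_slots(
    "The apparent shift in a nearby star's position...",
    ["Gravitational bending of light...",
     "Redshift in the star's spectrum...",
     "Dimming behind another body..."],
    random.Random(),
)
```

Hashing the question text (rather than assigning sequential IDs) means an unchanged question keeps its ID even when the rest of the `questions` array is regenerated and replaced.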

.gitignore

Lines changed: 24 additions & 2 deletions

@@ -141,6 +141,7 @@ celerybeat.pid
 .env
 .envrc
 .venv
+.venv-*/
 env/
 venv/
 ENV/

@@ -218,9 +219,21 @@ dist/
 # Project-specific
 CLAUDE.md
 
-# UMAP model files
+# UMAP model files (too large for git)
 umap_reducer.pkl
 umap_bounds.pkl
+embeddings/umap_reducer.pkl
+embeddings/umap_bounds.pkl
+
+# Large embedding pkl files (too large for git)
+embeddings/question_embeddings_2500.pkl
+embeddings/transcript_embeddings.pkl
+embeddings/article_coords.pkl
+embeddings/article_coords_flat.pkl
+embeddings/question_coords.pkl
+embeddings/question_coords_flat.pkl
+embeddings/transcript_coords.pkl
+embeddings/article_registry.pkl
 *.credentials
 .credentials/
 

@@ -286,7 +299,16 @@ checkpoints.zip
 wikipedia_articles_level_0.json.zip
 level_0_concepts.json.zip
 data.zip
-.gitignore
 backups/embeddings.zip
 backups/large_checkpoints/level_0_final.json
 backups/large_checkpoints/level_1_after_download.json
+
+# Video pipeline working files
+data/videos/.working/coordinates/
+data/videos/.working/embeddings/
+data/videos/.working/audio_cache/
+data/videos/transcripts_raw/
+data/videos/transcripts/
+
+# Python virtual environments
+.venv/
