The most systematic computational analysis of the Voynich Manuscript ever conducted.
290 files | 146,071 lines | 33 hypotheses tested | 18 eliminated | 10 validated
An exhaustive AI-driven analysis of the Voynich Manuscript — a mysterious 15th-century document that has resisted decipherment for over 500 years. This repository contains the complete record of the analysis: every hypothesis tested, every failure documented, every validated discovery reported with full statistical evidence.
| Discovery | Method | Statistical Evidence |
|---|---|---|
| Cross-word sandhi (phonological alternation between words) | 10-fold cross-validation | 42.9% accuracy vs 34.7% baseline, p<0.001 |
| 6-layer morphological architecture (PREFIX+CLASSIFIER+ROOT+GRADE+TERMINAL+SUFFIX) | Mutual information analysis | NMI(prefix,suffix) = 0.032 |
| Vowel grade = specificity (more vowels = more specific/derived concept) | Pearson correlation | r = -0.954 |
| Within-page template (description mode → prescription mode transition) | Chi-squared test | p<0.001, crossover at line 7-8 |
| 93% unique first words on herbal pages (plant-specific identifiers) | Direct count | 53/57 recto folios |
| qo- measurement system (systematic quantity notation) | Doubling, position, context | Standard unit doubled 16x |
| ty = thin/linear leaves | Fisher's exact test | p=0.008, OR=18.0 |
| Headache-treatment plants share vocabulary (Nigella + Vitis) | Jaccard + permutation test | 99.8th percentile, p<0.01 |
| 14/15 morphemes have non-random page distributions | Bonferroni correction | p<0.001 |
| Compound classifier compositionality | Distribution prediction | 8/8 pass |
Standard Italian, Turkish, Arabic, Hebrew, constructed language, Cardan grille, tensor writing, Galenic degree encoding, simple substitution cipher, verbose homophonic cipher, syllable cipher, standard nomenclator, abugida script, Semitic root-pattern, sh=root (blind test 25%), cthy=fruit (full-scale p=0.66), text-illustration word-level correlation (0/168 after Bonferroni)
Structural morphology: ████████████████████ 95% (validated)
Grammatical skeleton: ████████████████░░░░ 80%
Within-page template: █████████████████░░░ 85%
Measurement system: █████████████████░░░ 85%
Specific word meanings: ██░░░░░░░░░░░░░░░░░░ ~10%
Complete translation: ░░░░░░░░░░░░░░░░░░░░ 0%
The manuscript remains undeciphered. We mapped its structural architecture but could not crack the ~200 content morphemes that encode specific pharmaceutical knowledge. These require either physical discovery of a codebook in Padua/Venice archives, or identification of a parallel medieval pharmaceutical text.
voynich-manuscript-analysis/
├── README.md
├── FINAL_PAPER_v3.md # English academic paper (Nature format)
├── 最終報告書v3.md # Japanese full report (complete translation)
├── FULL_PROCESS_DOCUMENTATION.md # Every hypothesis, failure, and discovery
├── VALIDATED_TRANSLATIONS.md # Best translations of 3 pages with confidence markers
├── DEFINITIVE_DICTIONARY_AND_TRANSLATION.md # Complete morpheme dictionary + translations
│
├── analysis/ # All analysis reports, organized by topic
│ ├── sandhi/ # Cross-word sandhi discovery & validation
│ ├── morphology/ # Morphological architecture & morpheme decoding
│ ├── language-tests/ # Tests against known languages (all negative)
│ ├── cipher-tests/ # Cipher/notation system hypothesis tests
│ ├── illustration/ # Plant identification & text-image correlation
│ ├── recipe/ # Recipe section decoding & Antidotarium Nicolai
│ ├── historical/ # Historical research (Padua, Venice, Fontana, Llull)
│ ├── validation/ # Blind prediction tests & full-scale validation
│ └── red-team/ # Adversarial reviews of all claims
│
├── submissions/ # Ready-to-use submission materials
│ ├── QUICK_START_GUIDE.md # How to submit (step-by-step, Japanese)
│ ├── ARXIV_PREPRINT.md # arXiv-ready preprint
│ ├── COVER_LETTER_*.md # Cover letters for Cryptologia, DSH, Nature
│ ├── REDDIT_READY.txt # Copy-paste Reddit post
│ ├── EMAIL_READY.txt # Copy-paste emails to researchers
│ └── PRESS_RELEASE*.md # Press releases (EN/JA)
│
└── archive/ # Raw materials
├── scripts/ # 75 Python analysis scripts
├── data/ # EVA transcriptions, JSON data files
└── early_reports/ # Superseded intermediate reports
- Overview → Read this README
- Full paper →
FINAL_PAPER_v3.md(English) or最終報告書v3.md(Japanese) - Translations →
VALIDATED_TRANSLATIONS.md - All failures →
FULL_PROCESS_DOCUMENTATION.md - Deep dive →
analysis/subdirectories by topic
Multi-agent AI analysis using Claude (Anthropic), with:
- Parallel specialist agents (statistics, linguistics, history, art, pharmacology)
- Hypothesis-destruction cycles (propose → test → red team → revise)
- Blind prediction testing (first in Voynich studies)
- Adversarial red team review of all claims
- Full statistical rigor (Bonferroni correction, Fisher's exact test, cross-validation)
"A 15th-century person could not create an information-theoretically perfect cipher. Therefore the encoding must have generative rules, not arbitrary assignments."
This insight — contributed by the human collaborator — broke through the "arbitrary code" assumption and led to the discovery of the hierarchical stem structure (CLASSIFIER + ROOT + GRADE + TERMINAL), reducing the problem from 200 arbitrary codes to ~50 compositional morphemes.
The remaining 20% requires:
- Botanists: More plant identifications from illustrations (currently 5/130)
- Archivists: Search Padua/Venice archives for 15th-century pharmaceutical notation
- Codicologists: Lisa Fagin Davis's folio reordering reconstruction
- Multispectral imaging: Full analysis of Lazarus Project 2024 data
This research is released into the public domain. Use it freely.
If you use this work, please cite:
Anonymous (2026). "Multi-Agent AI Analysis of the Voynich Manuscript:
Validated Structural Constraints and the Limits of Computational Decipherment."
GitHub: https://github.com/Workwrite-Niidome/voynich-manuscript-analysis
This analysis was conducted by a human-AI collaboration. The AI (Claude, Anthropic) performed the computational analysis; the human provided critical insights that broke through key analytical barriers. The manuscript remains one of the great unsolved mysteries of history.