Skip to content

Workwrite-Niidome/voynich-manuscript-analysis

Repository files navigation

Voynich Manuscript: Multi-Agent AI Structural Analysis (2026)

The most systematic computational analysis of the Voynich Manuscript ever conducted.

290 files | 146,071 lines | 33 hypotheses tested | 18 eliminated | 10 validated

What is this?

An exhaustive AI-driven analysis of the Voynich Manuscript — a mysterious 15th-century document that has resisted decipherment for over 500 years. This repository contains the complete record of the analysis: every hypothesis tested, every failure documented, every validated discovery reported with full statistical evidence.

Key Validated Discoveries

Discovery Method Statistical Evidence
Cross-word sandhi (phonological alternation between words) 10-fold cross-validation 42.9% accuracy vs 34.7% baseline, p<0.001
6-layer morphological architecture (PREFIX+CLASSIFIER+ROOT+GRADE+TERMINAL+SUFFIX) Mutual information analysis NMI(prefix,suffix) = 0.032
Vowel grade = specificity (more vowels = more specific/derived concept) Pearson correlation r = -0.954
Within-page template (description mode → prescription mode transition) Chi-squared test p<0.001, crossover at line 7-8
93% unique first words on herbal pages (plant-specific identifiers) Direct count 53/57 recto folios
qo- measurement system (systematic quantity notation) Doubling, position, context Standard unit doubled 16x
ty = thin/linear leaves Fisher's exact test p=0.008, OR=18.0
Headache-treatment plants share vocabulary (Nigella + Vitis) Jaccard + permutation test 99.8th percentile, p<0.01
14/15 morphemes have non-random page distributions Bonferroni correction p<0.001
Compound classifier compositionality Distribution prediction 8/8 pass

Key Refuted Hypotheses

Standard Italian, Turkish, Arabic, Hebrew, constructed language, Cardan grille, tensor writing, Galenic degree encoding, simple substitution cipher, verbose homophonic cipher, syllable cipher, standard nomenclator, abugida script, Semitic root-pattern, sh=root (blind test 25%), cthy=fruit (full-scale p=0.66), text-illustration word-level correlation (0/168 after Bonferroni)

How far did we get?

Structural morphology:    ████████████████████  95% (validated)
Grammatical skeleton:     ████████████████░░░░  80%
Within-page template:     █████████████████░░░  85%
Measurement system:       █████████████████░░░  85%
Specific word meanings:   ██░░░░░░░░░░░░░░░░░░  ~10%
Complete translation:     ░░░░░░░░░░░░░░░░░░░░  0%

The manuscript remains undeciphered. We mapped its structural architecture but could not crack the ~200 content morphemes that encode specific pharmaceutical knowledge. These require either physical discovery of a codebook in Padua/Venice archives, or identification of a parallel medieval pharmaceutical text.

Repository Structure

voynich-manuscript-analysis/
├── README.md
├── FINAL_PAPER_v3.md              # English academic paper (Nature format)
├── 最終報告書v3.md                  # Japanese full report (complete translation)
├── FULL_PROCESS_DOCUMENTATION.md   # Every hypothesis, failure, and discovery
├── VALIDATED_TRANSLATIONS.md       # Best translations of 3 pages with confidence markers
├── DEFINITIVE_DICTIONARY_AND_TRANSLATION.md  # Complete morpheme dictionary + translations
│
├── analysis/                       # All analysis reports, organized by topic
│   ├── sandhi/                     # Cross-word sandhi discovery & validation
│   ├── morphology/                 # Morphological architecture & morpheme decoding
│   ├── language-tests/             # Tests against known languages (all negative)
│   ├── cipher-tests/               # Cipher/notation system hypothesis tests
│   ├── illustration/               # Plant identification & text-image correlation
│   ├── recipe/                     # Recipe section decoding & Antidotarium Nicolai
│   ├── historical/                 # Historical research (Padua, Venice, Fontana, Llull)
│   ├── validation/                 # Blind prediction tests & full-scale validation
│   └── red-team/                   # Adversarial reviews of all claims
│
├── submissions/                    # Ready-to-use submission materials
│   ├── QUICK_START_GUIDE.md        # How to submit (step-by-step, Japanese)
│   ├── ARXIV_PREPRINT.md           # arXiv-ready preprint
│   ├── COVER_LETTER_*.md           # Cover letters for Cryptologia, DSH, Nature
│   ├── REDDIT_READY.txt            # Copy-paste Reddit post
│   ├── EMAIL_READY.txt             # Copy-paste emails to researchers
│   └── PRESS_RELEASE*.md           # Press releases (EN/JA)
│
└── archive/                        # Raw materials
    ├── scripts/                    # 75 Python analysis scripts
    ├── data/                       # EVA transcriptions, JSON data files
    └── early_reports/              # Superseded intermediate reports

Start Here

  1. Overview → Read this README
  2. Full paperFINAL_PAPER_v3.md (English) or 最終報告書v3.md (Japanese)
  3. TranslationsVALIDATED_TRANSLATIONS.md
  4. All failuresFULL_PROCESS_DOCUMENTATION.md
  5. Deep diveanalysis/ subdirectories by topic

Methodology

Multi-agent AI analysis using Claude (Anthropic), with:

  • Parallel specialist agents (statistics, linguistics, history, art, pharmacology)
  • Hypothesis-destruction cycles (propose → test → red team → revise)
  • Blind prediction testing (first in Voynich studies)
  • Adversarial red team review of all claims
  • Full statistical rigor (Bonferroni correction, Fisher's exact test, cross-validation)

The Key Insight

"A 15th-century person could not create an information-theoretically perfect cipher. Therefore the encoding must have generative rules, not arbitrary assignments."

This insight — contributed by the human collaborator — broke through the "arbitrary code" assumption and led to the discovery of the hierarchical stem structure (CLASSIFIER + ROOT + GRADE + TERMINAL), reducing the problem from 200 arbitrary codes to ~50 compositional morphemes.

How to Continue This Work

The remaining 20% requires:

  1. Botanists: More plant identifications from illustrations (currently 5/130)
  2. Archivists: Search Padua/Venice archives for 15th-century pharmaceutical notation
  3. Codicologists: Lisa Fagin Davis's folio reordering reconstruction
  4. Multispectral imaging: Full analysis of Lazarus Project 2024 data

License

This research is released into the public domain. Use it freely.

Citation

If you use this work, please cite:

Anonymous (2026). "Multi-Agent AI Analysis of the Voynich Manuscript: 
Validated Structural Constraints and the Limits of Computational Decipherment."
GitHub: https://github.com/Workwrite-Niidome/voynich-manuscript-analysis

This analysis was conducted by a human-AI collaboration. The AI (Claude, Anthropic) performed the computational analysis; the human provided critical insights that broke through key analytical barriers. The manuscript remains one of the great unsolved mysteries of history.

About

Multi-Agent AI Structural Analysis of the Voynich Manuscript (2026). 33 hypotheses tested, 18 eliminated, 10 validated. 290 files, 146K lines. The manuscript remains undeciphered but its structural architecture is now mapped in unprecedented detail.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages