Optical Character Recognition for Manchu script using multiple model architectures.
- CPU: Intel Core i9-13900KS (24 cores, 32 threads)
- GPU: NVIDIA RTX 6000 Ada Generation (49GB VRAM)
- RAM: 188GB

Train the Llama-3.2-11B model on the Manchu OCR datasets and run inference with it:

    uv sync
    python scripts/download_data.py
    python scripts/finetune_llama32_11b.py
    python scripts/infer_llama32_11b.py

The scripts/ folder contains the main entry points:

    python scripts/train.py

Trains VLM and CRNN models on Manchu OCR datasets.

    python scripts/evaluate.py

Evaluates trained models on validation and test datasets.

    python scripts/generate_figures.py

Creates performance comparison charts and analysis figures.
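
The three entry points can also be chained from a single driver. The following is a minimal sketch, not part of the repository: `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical names, and it assumes the scripts need no command-line arguments and are run from the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Entry points named in this README, in the order they would normally run.
# Assumption: each script works without extra command-line arguments.
PIPELINE = ("train.py", "evaluate.py", "generate_figures.py")


def build_commands(scripts_dir="scripts", python=sys.executable):
    """Return one argv list per stage, e.g. [python, 'scripts/train.py']."""
    return [[python, (Path(scripts_dir) / name).as_posix()] for name in PIPELINE]


def run_pipeline(scripts_dir="scripts", runner=subprocess.run):
    """Run each stage in order; check=True makes a failing script raise."""
    for argv in build_commands(scripts_dir):
        print("running:", " ".join(argv))
        runner(argv, check=True)
```

Calling `run_pipeline()` would run training, evaluation, and figure generation in sequence and stop at the first script that exits with a non-zero status. Passing a different `runner` (for example, a function that only records the commands) gives a dry run.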
Models:
- qwen-25-3b/7b: Qwen2.5-VL-3B/7B
- llama-32-11b: Llama-3.2-11B
- crnn-base-3m: Convolutional Recurrent Neural Network
- openai-41: OpenAI GPT-4.1-2025-04-14
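
The identifiers above can be kept in a small lookup table when comparing runs. This is an illustrative sketch only: `MODELS` and `resolve_model` are hypothetical names, splitting `qwen-25-3b/7b` into two separate keys is an assumption, and the README does not state whether the scripts accept these identifiers as arguments.

```python
# Identifier -> full model name, copied from the list in this README.
# Assumption: "qwen-25-3b/7b" stands for two identifiers, one per model size.
MODELS = {
    "qwen-25-3b": "Qwen2.5-VL-3B",
    "qwen-25-7b": "Qwen2.5-VL-7B",
    "llama-32-11b": "Llama-3.2-11B",
    "crnn-base-3m": "Convolutional Recurrent Neural Network",
    "openai-41": "OpenAI GPT-4.1-2025-04-14",
}


def resolve_model(identifier):
    """Return the full model name, or raise ValueError listing known identifiers."""
    try:
        return MODELS[identifier]
    except KeyError:
        known = ", ".join(sorted(MODELS))
        raise ValueError(
            f"unknown model {identifier!r}; expected one of: {known}"
        ) from None
```

A lookup that fails loudly on unknown identifiers helps catch typos such as `qwen-2.5-7b` before a long training or evaluation run starts.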
Results are saved in the results/ directory.