This repository contains the source code and data used to train the text detection and recognition models described in the paper Illarionov et al., Digitized Dataset of Solar Disk H-alpha Observations At Sacramento Peak Observatory.
Content:
- notebooks - model training and inference pipelines;
- src - source files used in the notebooks;
- sample_data - sample images from the catalog of observations;
- train_datasets - labeled datasets for model training;
- trained_models - weights of the trained models;
- viewer.html - web application for searching images in the dataset by date and time.
Tip
Use this link to open viewer.html directly. Alternatively, you can download the file viewer.html and open it in a browser.
Clone the repository including the submodule
git clone --recursive https://github.com/observethesun/halpha.git
Install the dependencies of the helio framework listed in helio/requirements.txt
pip install -r helio/requirements.txt
Install ultralytics for the YOLO model
pip install ultralytics
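After installation, a quick check that the package can be imported may save a failed notebook run later. The snippet below is a minimal sketch, not part of the repository; the helper name `missing_modules` is hypothetical, and only `ultralytics` is checked because it is the one package named explicitly above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every listed module is importable.
print(missing_modules(["ultralytics"]))
```

Other module names, for example those listed in helio/requirements.txt, can be added to the list in the same way.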
The full archive of observational data is hosted at https://nispdata.nso.edu/ftp/flare_patrol_h_alpha_sp/, and the digitized metadata is available in CSV format at https://doi.org/10.5281/zenodo.18392561. The notebooks in this repository can be used to digitize the dataset independently or as a basis for extending or improving the digitization process.
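Once the CSV metadata is downloaded, it could be filtered by observation time with the standard library alone. The following is a rough sketch under stated assumptions: the column names `datetime` and `filename`, the ISO timestamp format, and the sample rows are all made up for illustration and may not match the actual file, so adjust them to the real header.

```python
import csv
import io
from datetime import datetime


def filter_by_time(rows, start, end, time_column="datetime"):
    """Return rows whose timestamp lies in the half-open interval [start, end).

    `time_column` is a hypothetical column name holding ISO-formatted timestamps.
    """
    selected = []
    for row in rows:
        stamp = datetime.fromisoformat(row[time_column])
        if start <= stamp < end:
            selected.append(row)
    return selected


# Made-up in-memory stand-in for the real CSV file (illustrative values only).
sample_csv = io.StringIO(
    "datetime,filename\n"
    "1970-01-01T12:00:00,a.jpg\n"
    "1970-01-01T23:59:59,b.jpg\n"
    "1970-01-02T00:00:00,c.jpg\n"
)
rows = list(csv.DictReader(sample_csv))

# Select all records from a single day.
hits = filter_by_time(rows, datetime(1970, 1, 1), datetime(1970, 1, 2))
print([row["filename"] for row in hits])
```

To use it with the real file, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to the downloaded CSV.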
E. Illarionov, W. Yukui, A. Pevtsov. Digitized Dataset of Solar Disk H-alpha Observations
At Sacramento Peak Observatory. 2025.
