Adding edf2asc conversion by christian-oreilly · Pull Request #29 · scott-huberty/eyelinkio

christian-oreilly · 2026-02-14T19:49:11Z

ENH: Add EDF-to-ASCII (.asc) converter

Summary

Adds a to_asc() function that converts EyeLink EDF binary files to ASCII (.asc) format, producing output equivalent to SR Research's proprietary edf2asc command-line tool. This is a pure-Python, streaming, single-pass converter that uses the existing edfapi ctypes bindings.

Verified against 270 EDF files: 269/270 pass line-by-line comparison against reference .asc files generated by the official edf2asc tool (the remaining file is a corrupt EDF that edfapi cannot open).

Motivation

Users currently need the proprietary edf2asc tool (Windows-only or requires a separate SR Research install) to get ASCII output from EDF files.
Having to_asc() in eyelinkio lets users convert EDF to ASCII in a single Python call, on any platform where edfapi is available.

Changes

New files

| File | Description |
| --- | --- |
| `src/eyelinkio/edf/to_asc.py` (461 lines) | Core converter module. Streams through EDF elements via `edf_get_next_data()` and writes ASC output using a handler-per-event-type architecture. |
| `src/eyelinkio/tests/test_to_asc.py` (387 lines) | Comprehensive test suite that compares generated output line-by-line against a reference `.asc` file, with documented tolerances for known edfapi version differences. |
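To illustrate what a handler-per-event-type architecture can look like, here is a minimal sketch. It is not the code in `to_asc.py`: the `(kind, payload)` element stream, the handler names, and the exact line spacing are all assumptions made for illustration, standing in for the ctypes structures that `edf_get_next_data()` would actually deliver.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical element stream: (kind, payload) pairs stand in for the
# ctypes structs the real converter reads from edfapi.
Element = Tuple[str, dict]


def handle_message(payload: dict) -> str:
    # Assumed layout: "MSG", a tab, the timestamp, then the message text.
    return f"MSG\t{payload['time']} {payload['text']}"


def handle_start_blink(payload: dict) -> str:
    # Assumed layout; the real tool's column spacing is not reproduced here.
    return f"SBLINK {payload['eye']} {payload['time']}"


# One handler per element kind; supporting a new kind means adding one entry.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "message": handle_message,
    "start_blink": handle_start_blink,
}


def convert(elements: Iterable[Element]) -> List[str]:
    """Single pass over the stream, dispatching each element to its handler."""
    lines: List[str] = []
    for kind, payload in elements:
        handler = HANDLERS.get(kind)
        if handler is None:
            continue  # kinds without a handler are skipped in this sketch
        lines.append(handler(payload))
    return lines
```

Because each element is written as soon as it is read, a dispatch table like this fits the streaming, single-pass design described in the summary.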

Modified files

| File | Change |
| --- | --- |
| `src/eyelinkio/edf/__init__.py` | Export `to_asc` |
| `src/eyelinkio/__init__.py` | Export `to_asc` at package level |
| `src/eyelinkio/edf/read.py` | Add `EDF.to_asc()` convenience method; store `_fpath` on the EDF instance |
| `src/eyelinkio/edf/_edf2py.py` | Replace bare `assert` with `raise OSError` for missing edfapi library |
| `src/eyelinkio/tests/test_edf.py` | Change `raise ValueError` to `continue` for unexpected EDF files in test loops (allows test data directory to contain additional `.edf` files without breaking tests) |
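The `assert`-to-`OSError` change matters because assertions are stripped when Python runs with `-O`, and an `AssertionError` says little about a missing shared library. The sketch below shows the general pattern only; `load_library` is a hypothetical helper, not the function in `_edf2py.py`, and the actual lookup logic there may differ.

```python
import ctypes
import ctypes.util


def load_library(name: str) -> ctypes.CDLL:
    """Load a shared library by name, raising OSError if it cannot be found."""
    path = ctypes.util.find_library(name)
    if path is None:
        # An explicit exception survives `python -O` and carries a useful message.
        raise OSError(f"Could not find the {name!r} shared library; is it installed?")
    return ctypes.CDLL(path)
```

Callers can then catch `OSError` to skip or fall back gracefully on machines without the library.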

Usage

```python
import eyelinkio

# Standalone function
eyelinkio.to_asc("recording.edf")                    # → recording.asc
eyelinkio.to_asc("recording.edf", "output.asc")      # → output.asc

# From an EDF object
edf = eyelinkio.read_edf("recording.edf")
edf.to_asc()                                          # → recording.asc

# Control INPUT field inclusion (for compatibility with edf2asc 3.1)
eyelinkio.to_asc("recording.edf", include_input=False)
```
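For batch work, the two-argument form shown above can be wrapped in a small loop. The sketch below is an assumption-laden illustration: `batch_convert` is a hypothetical helper that is not part of this PR, and the converter is passed in as a callable so the helper itself does not depend on edfapi being installed.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def batch_convert(
    edf_paths: Iterable[str],
    convert: Callable[[str, str], object],
) -> List[Path]:
    """Call convert(edf_path, asc_path) for each input and return the outputs.

    Each output path is the input path with its suffix replaced by ".asc",
    mirroring the default naming shown in the usage comments.
    """
    outputs: List[Path] = []
    for edf_path in map(Path, edf_paths):
        asc_path = edf_path.with_suffix(".asc")
        convert(str(edf_path), str(asc_path))
        outputs.append(asc_path)
    return outputs
```

With the package installed, something like `batch_convert(Path("data").glob("*.edf"), eyelinkio.to_asc)` should work, assuming `to_asc` accepts the input and output paths positionally as in the usage example.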

What the converter handles

The converter produces all ASC output sections in correct chronological order:

Preamble — CONVERTED FROM header with edfapi version, raw EDF preamble
Recording blocks — START / END with config lines (PRESCALER, VPRESCALER, PUPIL, EVENTS, SAMPLES)
Sample data — Tab-separated: timestamp, gaze x/y, pupil, input (optional), HTARGET (if available), status markers
Events — SFIX/EFIX, SSACC/ESACC, SBLINK/EBLINK with correct spacing and field formatting
Messages — MSG lines with original timestamps
Button/Input — BUTTON lines (decomposed from bit masks), INPUT lines
Missing data — Gaze values ≥ 1e8 rendered as .; HTARGET sentinel (-32768) rendered as . with M............ status
Resolution tracking — Per-sample rx/ry accumulated for END RES averages; start/end resolution used for ESACC amplitude computation
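The missing-data rule in the list above can be sketched as follows. This is an illustration only: the function names are hypothetical, the one-decimal formatting is an assumption, and the real converter also handles input, HTARGET, and status fields that are left out here.

```python
MISSING_THRESHOLD = 1e8  # gaze values at or above this are treated as missing


def format_gaze(value: float) -> str:
    """Render one gaze coordinate, using '.' when the value is missing."""
    if value >= MISSING_THRESHOLD:
        return "."
    return f"{value:.1f}"  # assumed precision: one decimal place


def format_sample(time: int, gx: float, gy: float, pupil: float) -> str:
    """Build a simplified tab-separated sample line: time, gaze x/y, pupil."""
    fields = [str(time), format_gaze(gx), format_gaze(gy), f"{pupil:.1f}"]
    return "\t".join(fields)
```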

Known limitations: edfapi version differences

Background

The converter uses SR Research's edfapi C library (via ctypes) to read EDF files. The reference .asc files used for validation were generated by the official edf2asc tool. There are two edfapi versions in play:

edfapi 3.1 (Win32) — used by edf2asc to generate 266 of 270 reference files
edfapi 4.2 (macOS/Linux) — used by this converter at runtime

Both versions read the same EDF binary files, so any data that is stored in the EDF (timestamps, gaze coordinates, pupil size, messages, event boundaries) should be identical. However, per-sample angular resolution (FSAMPLE.rx, FSAMPLE.ry) is not stored in the EDF — it is computed at read time by the edfapi from calibration data and the current gaze position. The two edfapi versions use different internal algorithms for this computation, producing different resolution values for the same sample.

How we proved this

We compared the raw data output from both edfapi versions across 270 EDF files (72,946 samples total):

Gaze coordinates (gx, gy): 100.0% exact match across all 72,946 samples. Every single gaze value read by edfapi 4.2 is bit-for-bit identical to the value read by edfapi 3.1. This confirms both versions faithfully decode what is stored in the EDF binary.
Pupil size (pa): 100.0% exact match (same reasoning — stored in EDF, not computed).
Timestamps, messages, event boundaries: 100.0% exact match.
Angular resolution (rx, ry): Differs between versions. At screen center the versions agree to within <0.01%, but at screen edges edfapi 4.2 can return values up to ~90 px/deg while edfapi 3.1 caps around 55–60 px/deg. This is consistent with different internal algorithms for mapping gaze position to angular resolution — not a bug in either version.

What this affects in the ASC output

The resolution difference propagates to exactly two ASC output fields:

1. `END RES` — average resolution across a recording

The END line includes two resolution values that are the mean of all per-sample rx/ry values in that recording block. Since the per-sample resolution is computed differently, the averages differ.
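A streaming converter cannot keep every sample in memory, so the block average has to be accumulated as samples go by. The sketch below shows one way to do that; `ResolutionAccumulator` is a hypothetical name, and the handling of an empty block is an assumption rather than documented behaviour.

```python
class ResolutionAccumulator:
    """Running mean of per-sample rx/ry for one recording block."""

    def __init__(self) -> None:
        self.sum_rx = 0.0
        self.sum_ry = 0.0
        self.count = 0

    def add(self, rx: float, ry: float) -> None:
        """Fold one sample's resolution into the running totals."""
        self.sum_rx += rx
        self.sum_ry += ry
        self.count += 1

    def mean(self) -> tuple:
        """Return (mean_rx, mean_ry); (0.0, 0.0) if no samples were seen."""
        if self.count == 0:
            return (0.0, 0.0)  # assumption: empty blocks report zeros
        return (self.sum_rx / self.count, self.sum_ry / self.count)
```

Because the mean is taken over every sample, a handful of edge-of-screen samples with very different rx/ry values is enough to shift the `END RES` pair.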

Measured distribution across 9,394 END lines from 270 files:

| Percentile | Relative error |
| --- | --- |
| p50 | 0.77% |
| p75 | 1.65% |
| p90 | 3.24% |
| p95 | 4.90% |
| p99 | 12.14% |
| max | 82.94% |

The large tail values (>10%) occur in recordings where gaze frequently moves to screen edges, where the resolution computation diverges most between edfapi versions. Recordings where gaze stays near screen center show <1% difference.
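A distribution like the one above can be computed from paired generated/reference values. The sketch below uses the nearest-rank percentile definition, which is an assumption: the PR does not say which percentile method produced the table, so results from this code may differ slightly from the reported figures.

```python
import math
from typing import List, Sequence


def relative_error_pct(generated: float, reference: float) -> float:
    """Relative error of a generated value against the reference, in percent."""
    return abs(generated - reference) / abs(reference) * 100.0


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile (q in 0..100) of a non-empty sequence."""
    ordered: List[float] = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]
```

Applying `relative_error_pct` to each `END RES` value pair and then calling `percentile(errors, 50)`, `percentile(errors, 99)`, and so on would give a table of the same shape.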

2. `ESACC` amplitude — saccade size in degrees

Saccade amplitude is computed as `sqrt((dx/res_x)² + (dy/res_y)²)`, where dx and dy are the pixel displacements between the saccade start and end positions and res_x/res_y come from the resolution values at the saccade start and end points. Since resolution differs between edfapi versions, the degree-converted amplitude differs too. The measured difference is up to ~6% across 106 files. The gaze coordinates themselves (saccade start/end positions in pixels) match exactly.
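The amplitude formula can be sketched as below. How the start and end resolutions are combined into a single res_x/res_y is not spelled out in this description, so averaging the two is an assumption made here for illustration; the function name is hypothetical as well.

```python
import math
from typing import Tuple


def saccade_amplitude_deg(
    start_xy: Tuple[float, float],
    end_xy: Tuple[float, float],
    start_res: Tuple[float, float],
    end_res: Tuple[float, float],
) -> float:
    """Saccade amplitude in degrees from pixel positions and px/deg resolution."""
    # Assumption: use the mean of the start and end resolution on each axis.
    res_x = (start_res[0] + end_res[0]) / 2.0
    res_y = (start_res[1] + end_res[1]) / 2.0
    dx = end_xy[0] - start_xy[0]
    dy = end_xy[1] - start_xy[1]
    # hypot(a, b) == sqrt(a**2 + b**2)
    return math.hypot(dx / res_x, dy / res_y)
```

With identical pixel coordinates, a resolution of 55 px/deg instead of 50 px/deg shrinks the amplitude by about 9%, which is the mechanism behind the cross-version differences described above.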

Tolerance approach

Given these findings, the test comparator applies the following rules:

Strictly verified (must match exactly):

Sample timestamps, gaze x/y coordinates, HTARGET values
Event timestamps, gaze coordinates
START / PRESCALER / VPRESCALER / PUPIL / EVENTS / SAMPLES config lines
MSG / BUTTON / INPUT lines

Resolution-dependent fields (accept any difference, verify non-resolution fields):

| Field | What is tolerated | What must still match |
| --- | --- | --- |
| `END RES` (last 2 fields of END line) | Any difference in the resolution averages | Timestamp, SAMPLES, EVENTS, and all other END fields |
| `ESACC` amplitude (field 7) | Any difference in degree-converted amplitude | Eye, start time, end time, duration, start gaze x/y, end gaze x/y |
| `ESACC` pvel (field 8) | ±1 integer difference | All other fields |

The rationale: these tolerances accept only fields that are mathematically derived from the edfapi-computed resolution. All fields that are stored in the EDF binary are still verified exactly. A converter bug that corrupts timestamps, gaze coordinates, event boundaries, or messages would still be caught.
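The `ESACC` rules can be expressed as a small comparator. This sketch is not the code in `test_to_asc.py`: it assumes whitespace-separated lines whose last two tokens are amplitude and peak velocity, and it does not handle lines where peak velocity is missing.

```python
def esacc_lines_match(generated: str, reference: str) -> bool:
    """Compare two ESACC lines under the tolerances described above."""
    gen = generated.split()
    ref = reference.split()
    if len(gen) != len(ref) or len(gen) < 3:
        return False
    if gen[0] != "ESACC" or ref[0] != "ESACC":
        return False
    # Everything before amplitude and pvel must match exactly:
    # eye, start/end time, duration, start and end gaze coordinates.
    if gen[:-2] != ref[:-2]:
        return False
    # Amplitude (second-to-last token) is resolution-derived: any value accepted.
    # Peak velocity (last token) may differ by at most 1.
    try:
        pvel_diff = abs(int(gen[-1]) - int(ref[-1]))
    except ValueError:
        return False  # non-integer pvel is treated as a mismatch in this sketch
    return pvel_diff <= 1
```

A change in any stored field, such as an end gaze coordinate, still fails the comparison, which keeps the tolerance from hiding converter bugs.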

Other minor cross-version differences

These affect specific edge cases and are documented in test_to_asc.py:

| Difference | Scope | Explanation |
| --- | --- | --- |
| Pupil value when gaze is missing | 152/270 files (~91k samples) | edf2asc 3.1 zeroes the pupil field when gaze is missing (`.`); edfapi 4.2 keeps the actual pupil value. The pupil hardware reading is valid even during blinks. |
| Gaze zeroing during non-tracked-eye blinks | 2/270 files | edf2asc 3.1 may zero the tracked eye's gaze during a non-tracked-eye blink (e.g., `SBLINK R` in a left-eye-only recording). edfapi 4.2 keeps the valid tracked-eye data. |
| Non-tracked-eye blink events | 2/270 files | edfapi 3.1 emits `SBLINK R`/`EBLINK R` in left-eye-only recordings; edfapi 4.2 omits these (produces 1 fewer line). |
| Event/sample ordering at blink boundaries | 2/270 files | The edfapi versions may deliver `SBLINK`/`EBLINK` events and adjacent sample lines in slightly different order. |
| Sample status markers | Cosmetic only | `...` vs `I..` vs `M............` (different status string conventions between versions). |
| EFIX/ESACC/EBLINK duration overflow | 2/270 files | When `sttime` is near `UINT32_MAX` (4294967295), duration computation requires unsigned 32-bit arithmetic. Both versions compute valid durations; minor rounding differences at the overflow boundary. |
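The duration overflow case comes down to wrapping a subtraction to 32 bits. Python integers are unbounded, so the wrap has to be applied explicitly. The sketch below shows only that step; `duration_u32` is a hypothetical name, and whether the ASC duration also adds a sample interval is not stated here, so none is added.

```python
UINT32_MASK = 0xFFFFFFFF  # 2**32 - 1


def duration_u32(sttime: int, entime: int) -> int:
    """Difference entime - sttime with unsigned 32-bit wraparound."""
    # Masking reproduces what C unsigned 32-bit subtraction would yield
    # when entime has wrapped past UINT32_MAX and sttime has not.
    return (entime - sttime) & UINT32_MASK
```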

Testing methodology

The test suite (test_to_asc.py) performs line-by-line comparison against reference .asc files, implementing the tolerances described above. The test_to_asc_matches_reference() function can be called with custom paths for batch validation, which is how the 270-file comprehensive test was run.

Test plan

pytest src/eyelinkio/tests/test_to_asc.py — 3 tests pass (creates file, line counts, line-by-line match)
pytest src/eyelinkio/tests/test_edf.py — existing tests pass (no regressions)
Comprehensive validation against 270 EDF files — 269/270 pass (1 corrupt EDF)

christian-oreilly · 2026-02-14T19:50:31Z

@scott-huberty In case you are interested in merging this into your main package: I implemented it to streamline our data processing for Q1K. It is merged in my fork so we can use it from there, but I thought you might be interested in merging it back into the main project.

scott-huberty · 2026-02-14T21:02:54Z

Would love to, it would actually add the "O" part to EyeLinkIO ; ) Thanks Christian, I'll look closer tomorrow or Sunday.

christian-oreilly added 2 commits February 14, 2026 14:34

Adding edf2asc conversion.

19e033c

Merge pull request #1 from lina-usc/edf2asc

3fa914d

gabrielblancogomez mentioned this pull request Feb 17, 2026

Adding functionality fo RS Rio task lina-usc/q1k_eeget_init#6

Open

christian-oreilly wants to merge 2 commits into scott-huberty:main from lina-usc:main