Adding edf2asc conversion #29
Open
christian-oreilly wants to merge 2 commits into scott-huberty:main from
Adding edf2asc conversion.
Author
@scott-huberty In case you are interested in merging this into your main package: I implemented it to streamline our data processing for Q1K. It is merged in my fork so we can use it from there, but I thought you might be interested in merging it back into the main project.
Owner
Would love to, it would actually add the "O" part to EyeLinkIO ; ) Thanks Christian, I'll look closer tomorrow or Sunday.
ENH: Add EDF-to-ASCII (.asc) converter
Summary
Adds a `to_asc()` function that converts EyeLink EDF binary files to ASCII (.asc) format, producing output equivalent to SR Research's proprietary `edf2asc` command-line tool. This is a pure-Python, streaming, single-pass converter that uses the existing edfapi ctypes bindings.

Verified against 270 EDF files: 269/270 pass line-by-line comparison against reference `.asc` files generated by the official `edf2asc` tool (1 file has a corrupt EDF that edfapi cannot open).

Motivation
Users currently need the `edf2asc` tool (Windows-only or requires a separate SR Research install) to get ASCII output from EDF files. `to_asc()` in eyelinkio lets users convert EDF to ASCII in a single Python call, on any platform where edfapi is available.

Changes
New files
- `src/eyelinkio/edf/to_asc.py` (461 lines): reads the EDF with `edf_get_next_data()` and writes ASC output using a handler-per-event-type architecture.
- `src/eyelinkio/tests/test_to_asc.py` (387 lines): compares the converter output against a reference `.asc` file, with documented tolerances for known edfapi version differences.

Modified files
- `src/eyelinkio/edf/__init__.py`: export `to_asc`
- `src/eyelinkio/__init__.py`: export `to_asc` at package level
- `src/eyelinkio/edf/read.py`: add `EDF.to_asc()` convenience method; store `_fpath` on the EDF instance
- `src/eyelinkio/edf/_edf2py.py`: replace `assert` with `raise OSError` for missing edfapi library
- `src/eyelinkio/tests/test_edf.py`: change `raise ValueError` to `continue` for unexpected EDF files in test loops (allows the test data directory to contain additional `.edf` files without breaking tests)

Usage
What the converter handles
The converter produces all ASC output sections in correct chronological order:
- `CONVERTED FROM` header with edfapi version, raw EDF preamble
- `START`/`END` with config lines (`PRESCALER`, `VPRESCALER`, `PUPIL`, `EVENTS`, `SAMPLES`)
- `SFIX`/`EFIX`, `SSACC`/`ESACC`, `SBLINK`/`EBLINK` with correct spacing and field formatting
- `MSG` lines with original timestamps
- `BUTTON` lines (decomposed from bit masks), `INPUT` lines
- Missing sample values rendered as `.`; HTARGET sentinel (-32768) rendered as `.` with `M............` status
- `rx`/`ry` accumulated for `END RES` averages; start/end resolution used for `ESACC` amplitude computation

Known limitations: edfapi version differences
Background
The converter uses SR Research's `edfapi` C library (via ctypes) to read EDF files. The reference `.asc` files used for validation were generated by the official `edf2asc` tool. There are two edfapi versions in play (edfapi 4.2 and edfapi 3.1 in the comparisons below); one of them was used by `edf2asc` to generate 266 of 270 reference files.

Both versions read the same EDF binary files, so any data that is stored in the EDF (timestamps, gaze coordinates, pupil size, messages, event boundaries) should be identical. However, per-sample angular resolution (`FSAMPLE.rx`, `FSAMPLE.ry`) is not stored in the EDF; it is computed at read time by the edfapi from calibration data and the current gaze position. The two edfapi versions use different internal algorithms for this computation, producing different resolution values for the same sample.

How we proved this
We compared the raw data output from both edfapi versions across 270 EDF files (72,946 samples total):
- Gaze coordinates (`gx`, `gy`): 100.0% exact match across all 72,946 samples. Every gaze value read by edfapi 4.2 is bit-for-bit identical to the value read by edfapi 3.1. This confirms that both versions faithfully decode what is stored in the EDF binary.
- Pupil size (`pa`): 100.0% exact match (same reasoning: stored in the EDF, not computed).
- Timestamps, messages, event boundaries: 100.0% exact match.
- Angular resolution (`rx`, `ry`): differs between versions. At screen center the versions differ by <0.01%, but at screen edges edfapi 4.2 can return values up to ~90 px/deg while edfapi 3.1 caps around 55–60 px/deg. This is consistent with different internal algorithms for mapping gaze position to angular resolution, not with a bug in either version.

What this affects in the ASC output
The resolution difference propagates to exactly two ASC output fields:
1. `END RES`: average resolution across a recording

The `END` line includes two resolution values that are the mean of all per-sample `rx`/`ry` values in that recording block. Since the per-sample resolution is computed differently, the averages differ.

The difference was measured across 9,394 `END` lines from 270 files. The large tail values (>10%) occur in recordings where gaze frequently moves to screen edges, where the resolution computation diverges most between edfapi versions. Recordings where gaze stays near screen center show <1% difference.
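The averaging described above can be sketched as a running accumulator. This is a minimal illustration, not the code in `to_asc.py`: the names `ResolutionAccumulator` and `block_mean` are hypothetical, and keeping one accumulator per recording block is an assumption based on the description of `END RES`.

```python
class ResolutionAccumulator:
    """Running sums of per-sample rx/ry for one recording block (hypothetical helper)."""

    def __init__(self):
        self.sum_rx = 0.0
        self.sum_ry = 0.0
        self.count = 0

    def add(self, rx, ry):
        """Accumulate one sample's resolution values (px/deg)."""
        self.sum_rx += rx
        self.sum_ry += ry
        self.count += 1

    def mean(self):
        """Return (mean_rx, mean_ry), or None if no samples were seen."""
        if self.count == 0:
            return None
        return self.sum_rx / self.count, self.sum_ry / self.count


def block_mean(samples):
    """Average an iterable of (rx, ry) pairs in a single pass."""
    acc = ResolutionAccumulator()
    for rx, ry in samples:
        acc.add(rx, ry)
    return acc.mean()
```

Because only two sums and a count are kept, this fits a single-pass streaming converter. It also shows why the averages diverge: any per-sample difference in `rx`/`ry` between edfapi versions carries straight into the mean.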
2. `ESACC` amplitude: saccade size in degrees

Saccade amplitude is computed as `sqrt((dx/res_x)² + (dy/res_y)²)`, where `res_x`/`res_y` are the resolution values at the saccade start and end points. Since resolution differs between edfapi versions, the degree-converted amplitude differs too. The measured difference is up to ~6% across 106 files. The gaze coordinates themselves (saccade start/end positions in pixels) match exactly.

Tolerance approach
Given these findings, the test comparator applies the following rules:
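Strictly verified (must match exactly): all fields stored in the EDF binary, such as timestamps, gaze coordinates, event boundaries, and messages.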
Resolution-dependent fields (accept any difference, verify non-resolution fields):
- `END RES` (last 2 fields of `END` line)
- `ESACC` amplitude (field 7)
- `ESACC` pvel (field 8)

The rationale: these tolerances accept only fields that are mathematically derived from the edfapi-computed resolution. All fields that are stored in the EDF binary are still verified exactly. A converter bug that corrupts timestamps, gaze coordinates, event boundaries, or messages would still be caught.
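To make the tolerance rules concrete, here is a minimal sketch. It is not the comparator in `test_to_asc.py`. The first function evaluates the amplitude formula quoted earlier, taking the resolution values as plain arguments, so it makes no claim about how start and end resolution are combined. The comparator assumes that the resolution-dependent values are the last two whitespace-separated tokens of `END` and `ESACC` lines. All function names and sample lines are hypothetical.

```python
import math

# ASSUMPTION: the resolution-dependent values are the last two
# whitespace-separated tokens of END and ESACC lines.
TRAILING_TOLERATED = {"END": 2, "ESACC": 2}


def saccade_amplitude_deg(dx, dy, res_x, res_y):
    """sqrt((dx/res_x)^2 + (dy/res_y)^2): pixels in, degrees out (res in px/deg)."""
    return math.hypot(dx / res_x, dy / res_y)


def strict_tokens(line):
    """Return the tokens of an ASC line that must match exactly."""
    tokens = line.split()
    n_skip = TRAILING_TOLERATED.get(tokens[0], 0) if tokens else 0
    return tokens[: len(tokens) - n_skip]


def lines_match(produced, reference):
    """Compare two ASC lines, ignoring only the tolerated trailing fields."""
    # Require equal token counts so a dropped or extra field is still caught.
    same_length = len(produced.split()) == len(reference.split())
    return same_length and strict_tokens(produced) == strict_tokens(reference)


# Same pixel displacement, two different resolutions: the amplitudes differ,
# which is why the amplitude field cannot be compared exactly.
amp_a = saccade_amplitude_deg(88.0, 16.0, 55.0, 55.0)
amp_b = saccade_amplitude_deg(88.0, 16.0, 58.0, 58.0)
```

The token-count check keeps the tolerance narrow: the skipped fields may hold any value, but they must still be present, and every other token is compared verbatim.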
Other minor cross-version differences
These affect specific edge cases and are documented in `test_to_asc.py`:
- [...] `.`; edfapi 4.2 keeps the actual pupil value. The pupil hardware reading is valid even during blinks.
- [...] `SBLINK R` in a left-eye-only recording [...]. edfapi 4.2 keeps the valid tracked-eye data.
- edfapi 3.1 emits `SBLINK R`/`EBLINK R` in left-eye-only recordings; edfapi 4.2 omits these (produces 1 fewer line).
- The two versions emit `SBLINK`/`EBLINK` events and adjacent sample lines in slightly different order.
- `...` vs `I..` vs `M............`: different status string conventions between versions.
- When `sttime` is near `UINT32_MAX` (4294967295), duration computation requires unsigned 32-bit arithmetic. Both versions compute valid durations, with minor rounding differences at the overflow boundary.

Testing methodology
The test suite (`test_to_asc.py`) performs line-by-line comparison against reference `.asc` files, implementing the tolerances described above. The `test_to_asc_matches_reference()` function can be called with custom paths for batch validation, which is how the 270-file comprehensive test was run.

Test plan
- `pytest src/eyelinkio/tests/test_to_asc.py`: 3 tests pass (creates file, line counts, line-by-line match)
- `pytest src/eyelinkio/tests/test_edf.py`: existing tests pass (no regressions)
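
One of the cross-version notes above concerns durations when `sttime` is near `UINT32_MAX`. Here is a minimal sketch of the unsigned 32-bit arithmetic involved. The function name `elapsed_u32` is hypothetical, and the sketch covers only the wrapped difference between two timestamps, not the converter's full duration formula.

```python
UINT32_MODULUS = 1 << 32  # 4294967296, one more than UINT32_MAX


def elapsed_u32(sttime, entime):
    """Elapsed ticks between two unsigned 32-bit timestamps.

    Assumes at most one wraparound between sttime and entime.
    Python's % with a positive modulus always yields a non-negative result,
    which mirrors unsigned subtraction in C.
    """
    return (entime - sttime) % UINT32_MODULUS
```

With plain signed subtraction, a start time just below the overflow boundary and an end time just above it would give a large negative number. Reducing modulo 2^32 recovers the small positive difference.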