BUG: Fix infinite loops and data corruption in data layer #317

Open
pandashark wants to merge 1 commit into stefan-jansen:main from pandashark:bugfix/data-layer-integrity
@pandashark

Summary

Fixes five data layer bugs: two infinite loops, a data corruption issue, a timezone inconsistency, and a security vulnerability.

  • Exclude mask operator: convert_cols used &= instead of |= for the exclude mask, meaning rows were never excluded unless every column overflowed simultaneously.
  • Infinite loop in spot value: DataPortal._get_daily_spot_value looped infinitely searching backwards for a non-NaN close price with no bounds check against _first_trading_day.
  • Infinite loop in last traded: BcolzDailyBarReader.get_last_traded_dt looped infinitely when prev_day_ix reached -1.
  • Merger tz mismatch: Merger adjustment dates were not tz-localized, unlike splits and dividends, causing comparison failures.
  • Path traversal (CVE-2007-4559): tar.extractall() in the Quandl bundle had no path validation.
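
To illustrate the first item, here is a minimal sketch of why the choice of operator matters. It is not the actual convert_cols code: the column names and values are made up, and the per-column masks are combined with NumPy reductions rather than in-place operators, which has the same effect on the result.

```python
import numpy as np

UINT32_MAX = np.iinfo(np.uint32).max  # 4294967295

# Hypothetical columns (illustrative only), already scaled to integers.
columns = {
    "open": np.array([100, 5_000_000_000, 300], dtype=np.int64),
    "close": np.array([110, 120, 6_000_000_000], dtype=np.int64),
}

# One boolean mask per column: True where the value does not fit in uint32.
per_column = [values > UINT32_MAX for values in columns.values()]

# AND: a row is flagged only if every column overflows in that row.
exclude_and = np.logical_and.reduce(per_column)
# OR: a row is flagged if any column overflows in that row.
exclude_or = np.logical_or.reduce(per_column)

print(exclude_and.tolist())  # [False, False, False]
print(exclude_or.tolist())   # [False, True, True]
```

With AND, rows 1 and 2 each overflow in only one column and so are kept; with OR, both are excluded.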

Closes #314

Test plan

  • Full test suite passes apart from 11 pre-existing failures under pandas 2.3 (3142 tests)
  • flake8 clean on all modified files

Fix convert_cols exclude mask to use |= instead of &= so rows with
uint32 overflow are actually filtered. Add bounds check in
DataPortal._get_daily_spot_value to prevent infinite loop when no
valid close price exists. Fix BcolzDailyBarReader.get_last_traded_dt
to return pd.NaT when prev_day_ix reaches -1. Apply tz_localize to
merger adjustment dates consistent with splits and dividends. Add
path traversal validation to tar.extractall in Quandl bundle
(CVE-2007-4559).
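
The bounds check on the backwards search can be sketched roughly as follows. This is a simplified stand-in, not the DataPortal or BcolzDailyBarReader code: last_valid_close is a hypothetical helper, a plain list replaces the bar reader, index 0 plays the role of the first trading day, and None is returned where the real fix returns pd.NaT.

```python
import math


def last_valid_close(closes, start_ix):
    """Walk backwards from start_ix to the most recent non-NaN close.

    Returns (index, value), or None if no valid close exists at or
    before start_ix.
    """
    ix = start_ix
    # The `ix >= 0` bound guarantees termination; without it the search
    # would step past the first available day and never stop cleanly.
    while ix >= 0:
        value = closes[ix]
        if not math.isnan(value):
            return ix, value
        ix -= 1
    return None


nan = float("nan")
print(last_valid_close([nan, 10.5, nan, nan], 3))  # (1, 10.5)
print(last_valid_close([nan, nan, nan], 2))        # None
```

Returning a sentinel when the search runs out of data lets the caller decide how to handle an asset with no valid close, rather than hanging.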

Closes stefan-jansen#314

Development

Successfully merging this pull request may close these issues.

BUG: Infinite loops and data integrity issues in data portal and bar readers
