Releases: OpenSourceAP/CrossSection

Signals Translated to Python + Annual Improvements

22 Oct 07:29
b4e911e

What's Changed

  • Signals are now constructed in Python instead of Stata. All translated scripts satisfy very strict numerical replication tests of the Stata outputs, or were manually approved with documented explanations for overriding the numerical tests. While this change has little direct effect on user experience, it makes long-term maintenance much easier and will provide higher quality data and more timely updates in the future. It also allows high quality AI access to the code and dataset. (#174)
  • SignalDoc.csv got two upgrades: (1) Google Scholar citation data from Ivo Welch and (2) an interactive HTML browser. (#191)
  • Market cap now aggregates across all permnos within a permco before computing valuation ratios. Multi-class firms (e.g., GOOG/GOOGL) were previously treated at the permno level; we now sum equity across a company’s permnos and use that sum in B/M, earnings yield, etc. This addresses mismeasurement that affected ~4% of total market cap and improves replication quality and performance for MS and PS. (#167)
  • Analyst-linked signals now use WRDS’s CRSP–IBES link file directly, rather than the iclink macro from /wrds/lib/utility/wrdslib.sas on the WRDS file system. This affects signals that rely on IBES analyst data (e.g., changes in analyst coverage, forecast dispersion), improves matching quality, and streamlines implementation. (#150)
  • Option-based signals (e.g., O/S from Johnson & So, 2012) now (i) use OptionMetrics’ CRSP link file directly and (ii) filter option volume by time-to-expiration as in the original paper. This removes volume due to automatic rebalancing, which has become very high in the current era of 0DTEs. (#172) (#166)
  • Short-interest signals: fixed a fill-forward bug in Recomm_ShortInterest.do that used asrol ... stat(first) but produced many missings instead of “last non-missing in past 12 months.” We now carry forward within a 12-month window by permno, restoring intended coverage; also refreshed the short-interest input data. (#178) (#165)
  • Price-delay t-stat signal: corrected typo in formula and removed unnecessary winsorization. (#177)
  • Analyst-coverage change (ChNAnalyst) now applies the size filter by date instead of over the whole sample. This improves both replication quality and performance. (#182)
  • Trend-factor construction: fixed an asreg/deduplication issue that could double-count or drop rows before regressions. This stabilizes sample sizes and estimates in the trend signal. (#179)
  • For a complete list of closed issues see: https://github.com/OpenSourceAP/CrossSection/issues?q=is%3Aissue%20closed%3A2024-10-22..2025-10-22%20sort%3Aupdated-desc
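The permco-level market-cap aggregation described above can be sketched in pandas roughly as follows. This is a minimal illustration on toy data, not the repository's code: the column names (permno, permco, date, me, be) and the numbers are assumptions made for the example.

```python
import pandas as pd

# Toy CRSP-style monthly panel. Column names and values are hypothetical,
# chosen only to illustrate the aggregation; they are not the repo's schema.
crsp = pd.DataFrame(
    {
        "permno": [1, 2, 3],
        "permco": [10, 10, 20],  # permnos 1 and 2 are two share classes of one firm
        "date": pd.to_datetime(["2024-01-31"] * 3),
        "me": [100.0, 50.0, 30.0],  # market equity of each share class
        "be": [60.0, 60.0, 15.0],   # book equity is a firm-level number
    }
)

# Sum market equity across all permnos of the same permco in the same month.
crsp["me_permco"] = crsp.groupby(["permco", "date"])["me"].transform("sum")

# Book-to-market at the security level (old treatment) versus the
# company level (new treatment), for comparison.
crsp["bm_permno_level"] = crsp["be"] / crsp["me"]
crsp["bm_permco_level"] = crsp["be"] / crsp["me_permco"]

print(crsp[["permno", "permco", "me", "me_permco", "bm_permco_level"]])
```

Using `transform("sum")` keeps one row per permno, so each share class still gets its own signal value while sharing the company-level denominator.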

AnnouncementReturn lookahead bias patch

08 Oct 15:09

Addresses the lookahead bias in AnnouncementReturn from end-of-month announcements described here: #158

Used to generate the October 2024 release (technically, the code was run at the end of September).

Annual improvements

22 Aug 09:21
f9b4823

  • Two signals needed to be switched to new datasources after the original ones were discontinued:
    • betaVIX was originally based on VXO (the volatility index based on the S&P 100), which was discontinued in September 2021. We use VIX from then on.
    • Mom6mJunk is based on S&P ratings data but WRDS S&P credit ratings end in Feb 2017. We switch to Capital IQ S&P ratings data from 2016. In addition, we only assign a stock to "Junk" if it has a proper credit rating and the credit rating is low (previously, we interpreted missing credit ratings as “Junk” as well).
  • Fixed typos in the signal documentation (SignalDoc.csv) for some signals (DivYieldST, dCPVolSpread, AgeIPO).
  • Fixed low number of observations in some months or years in two signals (FirmAgeMom, ForecastDispersion) that were due to filters that set some observations to missing.
  • New code for the “zerotrade” signals that more closely follows Liu (2006). Also rationalized naming.
    zerotrade1M, zerotrade6M, and zerotrade12M are the 1-, 6-, and 12-month versions of the signal (as opposed to zerotradeAlt1, zerotrade, and zerotradeAlt12 in earlier versions).
  • FailureProbability requires book value of equity and its construction now follows Cohen, Polk, and Vuolteenaho (2003) (instead of just using ceqq) as referenced by Campbell, Hilscher and Szilagyi (2008).
  • We verified that there is no look-ahead bias in signals that use cfacshr or cfacpr.
    • In the process, we lagged ShareIss5Y by an additional 5 months. We also included alternative code (as a comment) for ShareIss1Y, which closely follows Pontiff and Woodgate (2008) and gives very similar results to our implemented version.
    • We still need to check signals that are based on 13F data.
  • For a complete list of closed issues see: https://github.com/OpenSourceAP/CrossSection/issues?q=is%3Aissue+closed%3A2023-08-16..2024-08-22+sort%3Aupdated-desc

Five new predictors, annual improvements

15 Aug 11:45

Major updates:

  • Three new predictors using option prices from An, Ang, Bali and Cakici (2014)
  • Two new predictors from Bali and Hovakimian (2009)

Fixes and minor updates:

  • We use an improved OptionMetrics-CRSP link.
  • Fixed typos in NetDebtFinance, NetEquityFinance, NetExternalFinance (XFIN), KZ and KZ_q.
  • Fixed gaps in a number of signals that were due to unbalanced panel issues (ChNAnalyst, Rev6, DivYieldST). Fixed gap in CoSkewACX that was due to assuming all 12-month samples have 252 trading days.
  • Improved code to compute Ang, Hodrick, Xing, and Zhang’s idiosyncratic volatility. Also rationalized naming.
    • RealizedVol is the volatility of returns over the past month
    • IdioVol3F is the volatility of FF3 residuals (previously also called IdioRisk)
    • IdioRisk is deleted
  • New code for sinAlgo that more closely follows Hong and Kacperczyk (2009).
  • BM now follows the original paper, Stattman (1980).
  • For a complete list of closed issues see: https://github.com/OpenSourceAP/CrossSection/issues?page=1&q=is%3Aissue+is%3Aclosed
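The distinction between RealizedVol and IdioVol3F noted above can be illustrated with a short sketch. It assumes one stock-month of daily excess returns and the three Fama-French factors are already aligned as arrays. The function names, the use of the sample standard deviation (ddof=1), and the simulated inputs are assumptions for illustration, not the repository's implementation.

```python
import numpy as np


def realized_vol(ret):
    """Standard deviation of daily returns within the window."""
    return float(np.std(ret, ddof=1))


def idio_vol_3f(ret, factors):
    """Standard deviation of residuals from an OLS regression of returns
    on a constant and the three factors (columns of `factors`)."""
    X = np.column_stack([np.ones(len(ret)), factors])
    beta, *_ = np.linalg.lstsq(X, ret, rcond=None)
    resid = ret - X @ beta
    return float(np.std(resid, ddof=1))


# Simulated month of 21 trading days (purely illustrative numbers).
rng = np.random.default_rng(0)
ff3 = rng.normal(scale=0.01, size=(21, 3))  # stand-ins for MktRF, SMB, HML
excess_ret = ff3 @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.01, size=21)

print("RealizedVol:", realized_vol(excess_ret))
print("IdioVol3F:  ", idio_vol_3f(excess_ret, ff3))
```

Because the regression includes an intercept, the residual volatility can never exceed the total volatility computed on the same window.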

Two new predictors, FF1993 2x3 implementations, annual improvements

29 Mar 13:11

Major Updates

  • Two new predictors:
    1. TrendFactor from Han, Zhou and Zhu (2016)
    2. Recomm_ShortInterest from Drake, Rees, and Swanson (2011)
  • Fama-French 1993 style 2x3 implementations for all signals
  • SignalDocumentation.xlsx BasicInfo and AddInfo are now in SignalDoc.csv
    • The lit comparisons tabs are in other csvs
    • This change allows for clean versioning in git

Minor Updates

  • Coskewness and CoskewACX: now use Ken French’s market return and risk-free rates
    • Old version used CRSP’s NYSE/AMEX or NYSE-only index and CRSP’s risk-free rates.
  • Accruals: now more closely match Sloan 1996 by including depreciation. #51
  • Delisting return adjustments now computed with compounding #49
  • Quarterly Compustat lagging deals with subtle issues with rdq #50
  • Various bug fixes
  • For a complete list of closed issues see here
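As a rough illustration of the compounded delisting adjustment mentioned above, the monthly return and the delisting return can be combined multiplicatively rather than added. The sketch below is an assumption-laden example, not the repository's code: the function name is hypothetical, and treating a missing leg as zero when the other leg is present is a choice made for the example.

```python
import numpy as np
import pandas as pd


def compound_delisting(ret, dlret):
    """Combine monthly return and delisting return by compounding:
    (1 + ret) * (1 + dlret) - 1.

    Assumption for this sketch: if only one of the two is missing, the
    missing one is treated as zero; if both are missing, the result is NaN.
    """
    combined = (1.0 + ret.fillna(0.0)) * (1.0 + dlret.fillna(0.0)) - 1.0
    return combined.where(ret.notna() | dlret.notna())


# Toy inputs (hypothetical values).
ret = pd.Series([0.10, np.nan, 0.05, np.nan])
dlret = pd.Series([-0.30, -0.50, np.nan, np.nan])

print(compound_delisting(ret, dlret))
```

In the first row, simple addition would give -0.20, while compounding gives 1.10 × 0.70 − 1 = −0.23.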

Full Changelog: v1.1.0...v1.2.0

Daily portfolio returns, more monthly implementations, completeness checks

22 Apr 15:33
c9dedef

Major Updates:

  • Fixed missing FirmAge signal. Signal was missing in a couple of the data files.
  • Made daily portfolios ready for sharing. There was code before, but it wasn't ready to share.
  • Added more implementations and ways to access the data
  • Added code to prepare data for sharing and to check for completeness. Data should be more reliable now.
  • Improved README, overall more polished product

Minor updates:

  • Many usability issues resolved: removed unnecessary iclink.csv checks, sort data before posting, font download, package downloads, smaller daily crsp downloads for reliability, default view of SignalDocumentation
  • Fixed a few rebalancing frequencies and detailed descriptions in SignalDocumentation.xlsx

Major modularization update

19 Mar 15:10
d1e4e4d

  • Code is entirely rewritten
    • Now each data download is in its own file, and each signal construction is in its own file.
    • Most of the signal files can be run in parallel.
    • If a signal file errors out, the main program moves on to the next file (try-catch)
    • Huge improvement to usability
      • Easy to find which signal file you’re looking for
      • Easy to improve any individual signal file due to lack of dependencies
  • Modular signals data structure
    • Each signal has its own firm-month csv (E.g. STreversal.csv has a short-term reversal signal for each permno-month in CRSP)
    • Some improvement to usability:
      • Allows for modular updates to the full dataset
      • Also allows users interested in specific signals to simply retrieve the signal
  • Simplified portfolio code
    • Replaced complicated balanced matrix portfolio tracking with simple conditional weighted mean construction
    • Inputs are now just individual single signal csvs instead of the full dataset
    • Constructs all quantile-portfolios and long-shorts together
    • Accommodates daily portfolios (though signals must still be monthly)
    • Accommodates discrete signals as a generalization of binary
  • Some improvement in transparency and error checking
    • Ensures consistency between all-quantile portfolios and long-shorts
  • Improved classification and documentation of signals in SignalDocumentation.xlsx
    • For each signal, we hand collect the number of the table with predictability evidence, test in the table (port sort, regression), sign, t-stat, mean monthly return, quantile, portfolio assignment period, and filters.
    • We use the above to categorize signals by predictability in the original paper and signal replication quality. As a result, these categories are more true to the original papers.
  • Improved signals and new signals
    • Improved/Fixed:
      • EarningsStreak (Loh and Warachka, formerly EarnIncrease)
      • DivSeason (Hartzmark and Salomon, formerly DivInd)
      • UpRecomm / DownRecomm (Barber et al, formerly UpForecast / DownForecast)
      • MomSeason* and MomOffSeason* (Heston and Sadka, formerly MomSeas*)
      • DivYieldST (Litzenberger and Ramaswamy, formerly DivYield_q)
      • Coskewness (Harvey and Siddique)
      • EquityDuration (Dechow, Sloan, and Soliman)
      • Governance (Gompers, Ishii, Metrick)
      • DivInit and DivOmit (Michaely, Thaler, and Womack)
      • MS (Mohanram)
      • ZScore, OScore (Dichev)
    • New signals
      • CoskewACX (Ang, Chen, and Xing)
      • AnalystRevision (Hawkins, Chamberlain, and Daniel)
      • FEPS (Cen, Wei, and Zhang)
      • OrderBacklogChg (Baik and Ahn)
    • Removed a couple of redundant signals

v0.1.2

23 Jul 17:16
6d01121

  • Switched to many-to-one matching to monthly CRSP for ticker-based signals
  • Thanks again to Yang Liu (Tsinghua Finance) for helping us with v0.1.1.

v0.1.1

21 Jul 10:06

  • Fixed timing for the availability of quarterly Compustat data (HT: Yang Liu (Tsinghua Finance))
  • Adjustment to EarnIncrease
  • Updated holding period for Cash

v0.1.0

21 Jul 10:05

Initial commit. Replication code.