Guy Yollin edited this page Mar 9, 2015 · 7 revisions

WRDS: A package to perform ECTL (Extract-Clean-Transform-Load) of Compustat and CRSP data from WRDS

Summary: Create a new R package for working with data from the CRSP/Compustat Merged (CCM) database from Wharton Research Data Services (WRDS).

Description: Wharton Research Data Services (WRDS) is a web-based business data research service from The Wharton School at the University of Pennsylvania.

Related work: Other projects facilitate downloading data from WRDS via MATLAB and Python; however, the authors are not aware of an R package that supports this process.

Potential tasks:

  • Function(s) for downloading CCM data from WRDS
  • Function(s) for inserting downloaded CCM data into a local SQLite database
  • Function(s) for extracting CCM data from the local SQLite database based on asset class, index constituent, etc.
  • Function(s) for aggregating/disaggregating and aligning data to different frequencies (e.g. quarterly, monthly, weekly)
  • Function(s) for interpolating missing data
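
As a rough illustration of the last two tasks, the following base-R sketch fills gaps in a toy monthly series by linear interpolation and then aggregates it to quarterly means. The data, the column names, and the helper `fill_linear` are hypothetical and are not part of any existing package. Interpolating on observation index rather than calendar time, and carrying the last observation forward for trailing gaps, are assumptions made for simplicity.

```r
# Toy monthly series with gaps (made-up values, for illustration only).
monthly <- data.frame(
  date  = seq(as.Date("2014-01-01"), by = "month", length.out = 6),
  value = c(10, NA, 14, 20, NA, NA)
)

# Hypothetical helper: linear interpolation over the observation index.
# rule = 2 repeats the nearest observed value for leading/trailing gaps.
fill_linear <- function(x) {
  idx <- seq_along(x)
  ok  <- !is.na(x)
  approx(idx[ok], x[ok], xout = idx, rule = 2)$y
}

monthly$filled <- fill_linear(monthly$value)

# Label each observation with its calendar quarter, e.g. "2014-Q1".
monthly$quarter <- paste0(format(monthly$date, "%Y"), "-", quarters(monthly$date))

# Aggregate the filled monthly values to quarterly means.
quarterly <- aggregate(filled ~ quarter, data = monthly, FUN = mean)
print(quarterly)
```

Whether gaps should be interpolated before or after a change of frequency, and whether a mean, sum, or end-of-period value is the right aggregate, depends on the data item (for example, flows versus balances), so a real implementation would likely expose these as options.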

Skills required:

  • Knowledge of R and R package development
  • Ability to document R functions and data via roxygen2
  • Ability to work with version control systems (R-Forge/GitHub)
  • Ability to write clear vignettes demonstrating function usage

Test: TBD

Mentor: Guy Yollin (gyollin {at} uw {dot} edu)
