Replication package for 'Regulatory Leakage Among Financial Advisors: Evidence From FINRA Regulation of Bad Brokers'

edwinhu/bad_brokers

Replication Package for "Regulatory Leakage Among Financial Advisors: Evidence From FINRA Regulation of 'Bad' Brokers"

DOI

Overview

This replication package contains all code and documentation needed to reproduce the analysis and figures in the above-named study. The project examines how financial advisors with misconduct records migrate between different regulatory regimes (FINRA, SEC investment advisors, state investment advisors, and insurance producers) in response to increased regulatory scrutiny.

The code constructs the analysis files from multiple regulatory and public data sources primarily using SAS and R. The main notebooks run all code to generate the data for the figures and tables in the paper. Replication requires a large merged dataset and several hours of compute time (see below).

Repository Structure

bad_brokers/
├── README.md                          # This file
├── 01_merge.ipynb                     # Data processing and merging
├── 02_analysis.ipynb                  # Main regression analyses and figures
├── pixi.toml                          # Package management configuration
└── data/                              # Symlinked to shared data directory
    ├── input/                         # Raw input data files
    │   ├── NAIC_state_year_clean.csv  # NAIC state insurance department data
    │   ├── NPN_LOA_anon.csv           # Insurance lines of authority (anonymized)
    │   ├── TX_compl.csv               # Texas complaint data
    │   ├── bls_advisor_salary.xlsx    # BLS wage data
    │   ├── drp_history_anon.csv       # Disciplinary actions (anonymized)
    │   ├── exam_history_anon.csv      # Professional exams (anonymized)
    │   ├── firm_crds.csv              # Firm identifiers
    │   ├── firm_summary.csv           # Firm characteristics
    │   ├── ia_bd_reg3_anon.csv        # Investment advisor registrations (anonymized)
    │   ├── ins_assets.csv             # Insurance industry assets data
    │   ├── ins_reg_history_anon.csv   # Insurance registrations (anonymized)
    │   ├── new_registrations.csv      # New registration statistics
    │   ├── producer_matches_anon.csv  # Name matching results (anonymized)
    │   ├── reg_history.csv            # Registration history
    │   ├── state_history_anon.csv     # State registrations (anonymized)
    │   └── wgnd_2_0_name-gender-code.csv  # Gender name dictionary
    └── output/                        # SAS output files created by 01_merge.ipynb
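
Before starting the multi-hour merge step, it may help to confirm that every input file in the listing above is present. The following is a minimal sketch, not part of the replication package: the helper name `missing_inputs` is hypothetical, and the file names are copied from the tree above.

```python
from pathlib import Path

# File names copied from the data/input/ listing above.
EXPECTED_INPUTS = [
    "NAIC_state_year_clean.csv",
    "NPN_LOA_anon.csv",
    "TX_compl.csv",
    "bls_advisor_salary.xlsx",
    "drp_history_anon.csv",
    "exam_history_anon.csv",
    "firm_crds.csv",
    "firm_summary.csv",
    "ia_bd_reg3_anon.csv",
    "ins_assets.csv",
    "ins_reg_history_anon.csv",
    "new_registrations.csv",
    "producer_matches_anon.csv",
    "reg_history.csv",
    "state_history_anon.csv",
    "wgnd_2_0_name-gender-code.csv",
]


def missing_inputs(input_dir):
    """Return the expected input files that are absent from input_dir."""
    root = Path(input_dir)
    return [name for name in EXPECTED_INPUTS if not (root / name).is_file()]
```

Calling `missing_inputs("data/input")` from the repository root returns an empty list when all sixteen files are in place.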

Data and Code Availability

| Data Name | File(s) | Provided? | Source/Access |
|---|---|---|---|
| FINRA BrokerCheck | data/input/drp_history_anon.csv, reg_history.csv, exam_history_anon.csv, firm_crds.csv, firm_summary.csv | Yes | FINRA, see https://brokercheck.finra.org/ |
| SEC IAPD | data/input/state_history_anon.csv, ia_bd_reg3_anon.csv | Yes | SEC IAPD, https://adviserinfo.sec.gov/ |
| State Insurance Producer | data/input/ins_reg_history_anon.csv, NPN_LOA_anon.csv, producer_matches_anon.csv | Yes | State insurance departments, NAIC |
| NAIC State Data | data/input/NAIC_state_year_clean.csv | Yes | National Association of Insurance Commissioners |
| Insurance Assets | data/input/ins_assets.csv | Yes | Industry reports |
| New Registrations | data/input/new_registrations.csv | Yes | Regulatory filings |
| Texas Complaints | data/input/TX_compl.csv | Yes | Texas Department of Insurance |
| BLS Wage Data | data/input/bls_advisor_salary.xlsx | Yes | U.S. Bureau of Labor Statistics |
| Gender Data | data/input/wgnd_2_0_name-gender-code.csv | Yes | Gender API, see file |
| Merged Analysis Data | data/output/all_reg.sas7bdat, last.sas7bdat, tx_all.sas7bdat, tx_recidivism.sas7bdat, cma_map.sas7bdat, cma_drop_last.sas7bdat, sankey2.sas7bdat | No | Derived |

All data files are described above. The main analysis dataset (all_reg.sas7bdat) is approximately 8GB and contains ~7 million advisor-year observations. The files in the data/output/ directory are generated by 01_merge.ipynb and are not included in the archive.

See FOIA Log below for details on our public records requests for insurance producer registration data.

Computational Requirements

Software Requirements

  • R (v4.4+ recommended)
    • data.table (v1.16.4+)
    • haven (v2.5+)
    • lubridate (v1.9+)
    • lfe (v3.1.1+)
    • fixest (v0.12+)
    • stargazer (v5.2.3+)
    • ggplot2 (v3.4+)
    • scales (v1.2+)
    • maps (v3.4+)
    • shadowtext (v0.1+)
    • ggthemes (v5.0+)
    • IRdisplay (v1.1+)
    • plotly (v4.10+)
    • tidyverse (v2.0+)
    • binsreg
    • mapproj (v1.2.12+)
    • marginaleffects (v0.10+, for ggfixest)
    • dreamerr (for ggfixest)
    • legendry (v0.2+, for ggfixest)
    • ggfixest (v0.3+, installed from CRAN tarballs with pak)
  • Python 3.12+ (for working with R/SAS kernels)
  • SAS (for data prep)
  • pixi (recommended: for environment management)
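
For a manual installation, the minimum versions listed above can be checked with a small comparison helper. This is an illustrative sketch only: `MIN_VERSIONS` holds a subset of the list for brevity, the function names are hypothetical, and the installed versions are assumed to have been collected separately (for example from R's `packageVersion()`).

```python
# Minimum versions copied from the R package list above (a subset, for brevity).
MIN_VERSIONS = {
    "data.table": "1.16.4",
    "haven": "2.5",
    "lfe": "3.1.1",
    "fixest": "0.12",
    "ggfixest": "0.3",
}


def parse_version(text):
    """Turn '1.16.4' (or R-style '3.1-1') into a tuple of integers."""
    return tuple(int(part) for part in text.replace("-", ".").split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings component by component, padding with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def outdated(installed_versions):
    """Return {package: (installed_or_None, minimum)} for missing or too-old packages."""
    problems = {}
    for package, minimum in MIN_VERSIONS.items():
        installed = installed_versions.get(package)
        if installed is None or not meets_minimum(installed, minimum):
            problems[package] = (installed, minimum)
    return problems
```

Comparing integer tuples rather than raw strings avoids ranking "1.9" above "1.16.4".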

Hardware Requirements

  • OS: Linux (recommended), macOS, or Windows
  • RAM: 32GB+ recommended for full data processing
  • Disk: 20GB+ free space (main dataset is ~8GB)
  • Runtime: Data processing 2-4 hours; analysis 30-60 minutes
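
The disk requirement can be checked up front with the Python standard library. A minimal sketch, assuming the check runs from the directory that will hold data/output/; the helper names are hypothetical, and the threshold is the 20GB figure above interpreted as GiB.

```python
import shutil

REQUIRED_FREE_GB = 20  # from the hardware requirements above


def free_gb(path="."):
    """Free space on the filesystem containing path, in GiB (1024**3 bytes)."""
    return shutil.disk_usage(path).free / 1024**3


def has_enough_disk(path=".", required_gb=REQUIRED_FREE_GB):
    """True if the filesystem containing path has at least required_gb free."""
    return free_gb(path) >= required_gb
```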

Instructions to Replicators

Using Pixi Environment Management (Recommended)

This project uses pixi for environment management. To set up the environment:

# Install pixi (if not already installed)
# See https://pixi.sh/latest/

# Install all dependencies
pixi install

# Activate the environment and start Jupyter
pixi shell
jupyter lab

Alternative: Using the Portable Environment from Zenodo

A portable environment archive environment.tar (~586MB) and the required data files data.tar.gz (~548MB) are available through the Zenodo DOI above. Download both archives from Zenodo, then to use the environment:

# Install pixi-pack for unpacking (if not already installed)
# See https://github.com/Quantco/pixi-pack

# Extract the data files
tar -xzf data.tar.gz

# Unpack the environment  
pixi-unpack environment.tar

# Install the custom ggfixest package from the included tarball
pixi shell
Rscript -e "pak::pak('local::r_packages/src/contrib/ggfixest_0.3.0.tar.gz')"

# Start Jupyter
jupyter lab

The environment contains all conda packages plus an offline CRAN repository with ggfixest and 83 dependency tarballs (~65MB), ready for installation with pak.

Running the Analysis

  1. Set up the environment with pixi (recommended), with the portable environment from Zenodo, or by installing the packages manually (see Software Requirements).
  2. If using Zenodo archives, extract data.tar.gz to get all required data files in the data/input directory.
  3. Run 01_merge.ipynb to process and merge data (runtime: 2-4 hours).
  4. Run 02_analysis.ipynb to generate all tables and figures (runtime: 30-60 minutes).
  5. Outputs (tables and figures) are generated within the notebooks.
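
The steps above assume the notebooks are run interactively in Jupyter Lab. For an unattended run, one possible approach is to execute them in order with `jupyter nbconvert`. This is a sketch under assumptions rather than the authors' workflow: it presumes the SAS and R kernels are installed and configured, and it disables the per-cell execution timeout because the merge step runs for hours. The function names are hypothetical.

```python
import subprocess

NOTEBOOKS = ["01_merge.ipynb", "02_analysis.ipynb"]  # must run in this order


def nbconvert_command(notebook):
    """Build a command that executes a notebook and saves the outputs in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        "--ExecutePreprocessor.timeout=-1",  # no per-cell timeout
        notebook,
    ]


def run_pipeline(notebooks=NOTEBOOKS, runner=subprocess.run):
    """Execute each notebook in order, stopping at the first failure."""
    for notebook in notebooks:
        runner(nbconvert_command(notebook), check=True)
```

Because `check=True` raises on a non-zero exit code, the analysis notebook does not start if the merge notebook fails.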

Data Sources

The analysis uses data from multiple regulatory sources:

  1. FINRA BrokerCheck: Individual broker records, misconduct, and employment history
  2. SEC Investment Adviser Public Disclosure (IAPD): SEC investment advisor records
  3. State Insurance Producer Databases: State-level insurance producer registrations
  4. NAIC Data: State insurance department resources and enforcement statistics
  5. Bureau of Labor Statistics: Wage data by occupation and state
  6. Lines of Authority (LOA) Data: Insurance product classifications

Key Variables

Individual-Level Variables

  • indv_crd: Individual Central Registration Depository number (FINRA)
  • has_srs_ever: Ever had serious misconduct (Specified Risk Event)
  • form_cma: High-risk broker (2+ SREs or 1+ criminal matter)
  • bc, ia, ins: Current FINRA, investment advisor, insurance registrations
  • sec_ia, state_ia: SEC vs. state investment advisor registration
  • add_ia, drop_bc: Indicators for adding IA or dropping BC registration
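
The form_cma rule stated above (2+ SREs or 1+ criminal matter) can be written as a one-line predicate. This is only an illustration of the stated definition, not the package's SAS code; the argument names `n_sre` and `n_criminal` are hypothetical.

```python
def is_high_risk(n_sre, n_criminal):
    """Apply the stated rule: two or more SREs, or at least one criminal matter."""
    return n_sre >= 2 or n_criminal >= 1
```

For example, an advisor with one SRE and no criminal matters is not flagged, while an advisor with no SREs and one criminal matter is.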

Misconduct Variables

  • has_mis_ever: Ever had misconduct disclosure
  • has_srs_ever: Ever had serious misconduct
  • has_drp_ever: Ever had disciplinary action
  • count_srs: Count of serious misconduct events
  • fine_*: Dollar amounts of fines by type

Demographic and Professional Variables

  • female: Gender indicator
  • years_exp: Years of experience in financial services
  • n_exams: Number of professional exams passed
  • retail_broker: Serves retail clients
  • qual_va, qual_nasaa: Qualifications for variable annuities and state advisory work

State-Level Variables

  • department_budget: State insurance department budget
  • insurance_staff: Number of insurance department staff
  • dollar_fines, n_fines: Dollar amount and count of fines issued
  • n_complaints, n_inquiries: Consumer complaints and inquiries
  • broker_p50, insurance_p50: Median wages by occupation

Main Analysis Steps

1. Data Processing (01_merge.ipynb)

The data merging process involves several key steps:

  1. Load FINRA BrokerCheck Data: Individual advisor records with employment history and misconduct
  2. Process IAPD Data: SEC investment advisor registrations and firm information
  3. Merge Insurance Data: State insurance producer records linked by name matching
  4. Add State-Level Data: NAIC insurance department characteristics
  5. Create Panel Dataset: Individual-year observations from 2012-2022
  6. Generate Analysis Variables: Migration indicators, misconduct measures, controls
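
Step 5 can be illustrated with a small sketch that expands registration spells into individual-year rows clipped to 2012-2022. The spell layout `(indv_crd, start_year, end_year)` and the function name are assumptions made for illustration; the actual construction happens in SAS inside 01_merge.ipynb and may differ.

```python
PANEL_START, PANEL_END = 2012, 2022  # panel window stated in step 5


def expand_to_panel(spells):
    """Expand registration spells into sorted, de-duplicated (indv_crd, year) rows.

    spells: iterable of (indv_crd, start_year, end_year), where end_year is
    None for a registration that is still active (hypothetical layout).
    """
    rows = set()
    for indv_crd, start, end in spells:
        first = max(start, PANEL_START)
        last = PANEL_END if end is None else min(end, PANEL_END)
        for year in range(first, last + 1):
            rows.add((indv_crd, year))
    return sorted(rows)
```

Overlapping spells for the same advisor collapse to one row per year, and spells that end before 2012 contribute nothing.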

2. Regression Analysis (02_analysis.ipynb)

The main analyses include:

Registration Status Regressions

Examine how misconduct relates to current regulatory registrations:

felm(bc ~ has_srs_ever + controls | firm_county_year | 0 | firm, data)
felm(ins ~ has_srs_ever + controls | firm_county_year | 0 | firm, data)

Migration Regressions

Analyze likelihood of adding/dropping registrations:

felm(add_ia ~ has_srs + controls | firm_county_year | 0 | firm, data[bc==1 & ia==0])
felm(drop_bc ~ has_srs + controls | firm_county_year | 0 | firm, data[bc==1])
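
The sample restrictions above (for example `bc==1 & ia==0` for add_ia) suggest that the migration indicators compare an advisor's registrations in one year with those in the next. The sketch below encodes that reading as an explicit assumption; the timing convention, the dictionary layout, and the function name are all hypothetical and may not match the package's definitions.

```python
def migration_flags(panel):
    """Compute forward-looking add_ia and drop_bc flags (assumed timing).

    panel: dict mapping (indv_crd, year) -> {"bc": 0 or 1, "ia": 0 or 1}.
    Rows without a next-year observation are skipped, which is itself an
    assumption about how exits from the data are handled.
    """
    flags = {}
    for (crd, year), now in panel.items():
        nxt = panel.get((crd, year + 1))
        if nxt is None:
            continue
        flags[(crd, year)] = {
            "add_ia": int(now["ia"] == 0 and nxt["ia"] == 1),
            "drop_bc": int(now["bc"] == 1 and nxt["bc"] == 0),
        }
    return flags
```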

Policy Analysis

Examine impact of 2019 regulatory changes on high-risk brokers:

felm(drop_bc ~ form_cma * post_2018 * insurance + controls | firm_county_year | 0 | firm, data)
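
Because R expands `a * b * c` into all main effects and interactions, the call above corresponds to a specification of the following form. The notation is an assumption introduced here for readability: $i$ indexes advisors, $t$ years, $g(i,t)$ the firm_county_year group, $X_{it}$ the controls, and the subscripts on each regressor are a guess at how it varies.

```latex
\begin{aligned}
\text{drop\_bc}_{it} ={}& \beta_1\,\text{CMA}_{it} + \beta_2\,\text{Post}_{t} + \beta_3\,\text{Ins}_{it} \\
&+ \beta_4\,(\text{CMA}_{it} \times \text{Post}_{t})
 + \beta_5\,(\text{CMA}_{it} \times \text{Ins}_{it})
 + \beta_6\,(\text{Post}_{t} \times \text{Ins}_{it}) \\
&+ \beta_7\,(\text{CMA}_{it} \times \text{Post}_{t} \times \text{Ins}_{it})
 + \gamma' X_{it} + \alpha_{g(i,t)} + \varepsilon_{it}
\end{aligned}
```

Here $\beta_7$ is the triple-interaction coefficient. Terms that do not vary within a fixed-effect group (plausibly the $\text{Post}_t$ main effect, if the groups nest years) are absorbed by $\alpha_{g(i,t)}$ and not separately estimated. The fourth part of the felm formula clusters standard errors by firm.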

FOIA Log

Below is the list of states to which we submitted public records (FOIA) requests for insurance producer registration data. The Data Received column shows the date we received the data, 'Y' if we obtained the data but did not log the exact date, or 'N' if we did not receive any response. For some states we were able to download the data ourselves; in those cases the column gives the link to the data.

| State | FOIA Submission Date | Data Received |
|---|---|---|
| Alabama | 2022-04-25 | N |
| Alaska | 2022-04-25 | 2022-05-05 |
| Arizona | 2022-04-25 | N |
| Arkansas | 2022-04-25 | N |
| California | 2022-04-25 | N |
| Colorado | 2022-04-25 | 2022-04-26 |
| Connecticut | 2022-04-27 | 2022-04-27 |
| Delaware | 2022-04-25 | N |
| Florida | 2022-04-25 | https://licenseesearch.fldfs.com/BulkDownload |
| Georgia | 2022-04-25 | N |
| Hawaii | 2022-04-25 | N |
| Idaho | 2022-04-25 | Y |
| Illinois | 2022-04-27 | 2022-06-13 |
| Indiana | 2022-04-27 | 2022-07-08 |
| Iowa | 2022-04-27 | 2022-05-02 |
| Kansas | 2022-04-25 | N |
| Kentucky | 2022-04-25 | 2022-04-29 |
| Louisiana | 2022-04-25 | https://www.ldi.la.gov/industry/producer-adjuster/search-for-producers-and-adjusters/producer-adjuster-licensee-report |
| Maine | 2022-04-25 | Y |
| Maryland | 2022-04-25 | Y |
| Massachusetts | 2022-04-27 | 2022-05-21 |
| Michigan | 2022-04-25 | 2022-05-12 |
| Minnesota | 2022-04-25 | N |
| Mississippi | 2022-04-27 | 2022-04-27 |
| Missouri | 2022-04-27 | 2022-05-20 |
| Montana | 2022-04-27 | N |
| Nebraska | 2022-04-27 | 2022-04-27 |
| Nevada | 2022-04-26 | N |
| New Hampshire | 2022-04-26 | N |
| New Jersey | 2022-04-26 | 2022-04-27 |
| New Mexico | 2022-04-26 | N |
| New York | 2022-04-26 | 2022-06-24 |
| North Carolina | 2022-04-26 | 2022-07-07 |
| North Dakota | 2022-04-26 | 2022-04-29 |
| Ohio | 2022-04-26 | 2022-04-26 |
| Oklahoma | 2022-04-26 | 2022-05-03 |
| Oregon | 2022-04-26 | N |
| Pennsylvania | 2022-04-26 | N |
| Rhode Island | 2022-04-26 | 2022-05-03 |
| South Carolina | 2022-04-26 | N |
| South Dakota | 2022-04-26 | Y |
| Tennessee | 2022-04-26 | N |
| Texas | 2022-04-26 | 2022-04-27; https://data.texas.gov/dataset/Insurance-complaints-All-data/ubdr-4uff/about_data |
| Utah | 2022-04-26 | 2022-05-02 |
| Vermont | 2022-04-26 | https://dfr.vermont.gov/industry/insurance/producer-and-individual-licensing |
| Virginia | 2022-04-26 | N |
| Washington | 2022-04-26 | 2022-04-27 |
| West Virginia | 2022-04-26 | N |
| Wisconsin | 2022-04-26 | 2022-07-13 |
| Wyoming | 2022-04-26 | 2022-07-15 |
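
The Data Received column mixes dates, 'Y'/'N' codes, and download links, so summarizing the log takes a small classifier. The sketch below is illustrative only: the category labels and function names are hypothetical, the sample rows are copied from the log above, and an entry that begins with a date (as for Texas) is treated as a dated response.

```python
import re
from collections import Counter
from datetime import date

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def classify_received(value):
    """Map a Data Received entry to a category label (labels are hypothetical)."""
    first = value.split(";")[0].strip()  # Texas lists a date, then a link
    if first == "N":
        return "no_response"
    if first == "Y":
        return "received_undated"
    if DATE_RE.match(first):
        return "received_dated"
    if first.startswith("http"):
        return "public_download"
    return "unparsed"


def days_to_response(submitted, received):
    """Days between two ISO dates, e.g. submission and receipt."""
    return (date.fromisoformat(received) - date.fromisoformat(submitted)).days


# A few rows copied from the log above: (state, submitted, received).
SAMPLE_ROWS = [
    ("Alabama", "2022-04-25", "N"),
    ("Alaska", "2022-04-25", "2022-05-05"),
    ("Florida", "2022-04-25", "https://licenseesearch.fldfs.com/BulkDownload"),
    ("Idaho", "2022-04-25", "Y"),
    ("Texas", "2022-04-26", "2022-04-27; https://data.texas.gov/dataset/Insurance-complaints-All-data/ubdr-4uff/about_data"),
]

summary = Counter(classify_received(received) for _, _, received in SAMPLE_ROWS)
```

Entries that match none of the expected forms fall into "unparsed", which makes malformed dates easy to spot.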
