RAISE-25 RESEARCH ARTIFACTS
- data/ - Contains research_data.json, the master dataset holding all the raw numbers used in the visualizations.
- figures/ - Contains the high-resolution PNG images generated by the scripts. These are the exact graphs used in the paper.
- scripts/ - Contains the Python source code used to generate each figure. Each script is standalone and reproducible.
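To get oriented in the dataset before running any script, a quick inspection along the following lines may help. This is a minimal sketch: the internal layout of research_data.json is not described here, so the code assumes nothing about its keys and only reports what it finds at the top level. The function name summarize_dataset is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def summarize_dataset(path):
    """Return a mapping of top-level key -> type name for a JSON file.

    Hypothetical helper: it makes no assumption about the schema of
    research_data.json and only reports the top-level structure.
    """
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # A top-level list or scalar has no keys to report.
    return {"<root>": type(data).__name__}
```

From the repository root, calling summarize_dataset("data/research_data.json") would list each top-level entry and its type.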
VISUALIZATION DETAILS
- Section 3: Attention Mechanism (section_3_attention.py). Visualizes the internal self-attention weights of a Transformer model. It demonstrates how the model resolves pronoun ambiguity (e.g., linking "it" to "animal") using an attention heatmap.
- Section 5: Empirical Scaling Laws (section_5_scaling.py). Plots the relationship between compute (FLOPs) and loss. It illustrates the power-law trend, indicating that performance improvements are a predictable function of increased compute.
- Section 5B: Emergent Capabilities (section_5b_emergence.py). Contrasts the smooth scaling laws with "phase transitions." It shows how specific capabilities (like arithmetic) remain near zero until a critical model size is reached, then appear suddenly.
- Section 6: Forecasting (section_6_forecast.py). Projects future benchmark performance (MMLU) using historical trend lines. It establishes a "Cone of Uncertainty" to model best-case vs. stagnation scenarios for policy analysis.
- Section 7: The Trust Paradox (section_7_trust.py). A dual-axis chart illustrating the inverse relationship between model accuracy and human verification. It quantifies the risk of over-reliance as models become more capable.
- Section 8: Economic Impact (section_8_economic.py). Visualizes the "Skill-Leveling" effect. The data shows that LLM tools give low-skill workers a significantly larger performance boost than high-skill workers, reducing the performance gap.
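As an illustration of the power-law relationship plotted in Section 5, the sketch below fits loss = a * compute^(-b) by linear regression in log-log space. It is not the code in section_5_scaling.py. The function name fit_power_law is hypothetical, and the data points are synthetic values chosen for the example, not numbers from research_data.json.

```python
import numpy as np


def fit_power_law(compute, loss):
    """Fit loss ~= a * compute**(-b) by least squares in log-log space.

    Taking logs gives log(loss) = log(a) - b * log(compute), a straight
    line whose slope is -b and whose intercept is log(a).
    """
    slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
    return np.exp(intercept), -slope


# Synthetic points for illustration only (assumed values, not repository data).
compute = np.logspace(18, 24, num=7)   # compute budgets in FLOPs
loss = 50.0 * compute ** -0.05         # exact power law with a=50, b=0.05

a, b = fit_power_law(compute, loss)
print(f"a = {a:.3f}, b = {b:.4f}")
```

On a log-log plot a power law appears as a straight line, which is why the fit reduces to an ordinary linear regression. An abrupt capability jump of the kind described in Section 5B would not follow such a line.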
HOW TO RUN
- Install the required Python libraries:
pip install matplotlib seaborn pandas numpy
- Run the master script to generate all figures:
python run_all.py
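The contents of run_all.py are not shown here. A master script of this kind could plausibly work along the following lines, assuming the figure scripts live in scripts/ and follow the section_*.py naming used above. The function names find_figure_scripts and run_all are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_figure_scripts(scripts_dir="scripts"):
    """Return the section_*.py scripts in a directory, sorted by file name."""
    return sorted(Path(scripts_dir).glob("section_*.py"))


def run_all(scripts_dir="scripts"):
    """Run each figure script in turn with the current Python interpreter.

    check=True raises CalledProcessError at the first script that exits
    with a non-zero status, so later scripts are not run after a failure.
    """
    for script in find_figure_scripts(scripts_dir):
        print(f"Running {script.name} ...")
        subprocess.run([sys.executable, str(script)], check=True)
```

Using sys.executable keeps every script on the same interpreter (and virtual environment) that launched the master script.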
REQUIREMENTS
- Python 3.8+
- matplotlib
- seaborn
- pandas
- numpy