mycarta/README.md

Ciao! I'm Matteo Niccoli (he/him)

Geophysicist | Data Scientist | Building Tools for Rigorous Thinking

I'm a geoscientist who solves problems with Python — from subsurface characterization to statistical detection of flawed research. My work sits at the intersection of domain expertise, data science, and AI-assisted reasoning. I build tools that make quantitative skepticism systematic: checking assumptions, questioning numbers, and catching the patterns that don't survive scrutiny.

Currently developing agentic AI applications for document extraction in geotechnical engineering.

Board member at Software Underground — open-source geoscience community.

Contact: matteo@mycarta.ca | Blog: mycarta | Twitter: @my_carta


Featured Work

Bullshit Detector — Statistical Screening for Research Papers (open source)

A Python package for systematically screening published research for statistical red flags and methodological issues. Four-tier detection system, from quick API lookups to deep data analysis.

What it catches: p-value inconsistencies, impossible descriptive statistics (GRIMMER), spurious correlations in high-dimensional data, underpowered studies masquerading as significant findings, causal claims without evidence, and language escalation from "association" to "effect."

What makes it different: The package includes 12 structured skill files (~2,500 lines) that teach an LLM when to reach for which tool and how to interpret what it finds. The reasoning frameworks — Mill's Methods inverted as audit checks, Fermi sanity estimation, logical fallacy detection — are as important as the code.

>>> from bullshit_detector.spurious import P_spurious, r_crit
>>> P_spurious(r=0.50, n=21, k=100)
0.884   # 88.4% chance this correlation is spurious
>>> r_crit(n=21)
0.433   # minimum r to even consider at n=21
>>> from bullshit_detector.power import achieved_power
>>> achieved_power(effect_size=0.5, n_per_group=16)
{'power': 0.293, 'adequate': False}
# 29% power — less than a coin flip. If significant, probably a false positive.
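The numbers above can be roughly reproduced with textbook formulas. The block below is a standard-library sketch, not the package's implementation: the function names are hypothetical, the spurious-correlation probability assumes `k` independent candidate predictors, and the power calculation uses a normal approximation to the two-sample t-test. The package may use different formulas, so the last digit need not match.

```python
import math
from statistics import NormalDist


def p_value_for_r(r: float, n: int, steps: int = 2000) -> float:
    """Two-sided p-value for a Pearson r when the true correlation is zero.

    Integrates the exact null density of r, proportional to
    (1 - x^2)^((n - 4) / 2), with Simpson's rule (steps must be even).
    """
    if n < 4:
        raise ValueError("this sketch needs n >= 4")
    a = abs(r)
    expo = (n - 4) / 2
    h = (1.0 - a) / steps
    total = 0.0
    for i in range(steps + 1):
        x = a + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * max(0.0, 1 - x * x) ** expo
    tail = total * h / 3
    # Normalising constant of the density: Beta(1/2, (n - 2) / 2)
    log_beta = (math.lgamma(0.5) + math.lgamma((n - 2) / 2)
                - math.lgamma((n - 1) / 2))
    return min(1.0, 2 * tail / math.exp(log_beta))


def prob_any_spurious(r: float, n: int, k: int) -> float:
    """Chance that at least one of k independent null predictors reaches |r|."""
    return 1 - (1 - p_value_for_r(r, n)) ** k


def critical_r(n: int, alpha: float = 0.05) -> float:
    """Smallest |r| that is significant at level alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if p_value_for_r(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def approx_power(effect_size: float, n_per_group: int,
                 alpha: float = 0.05) -> float:
    """Normal approximation to the power of a two-sided two-sample t-test."""
    z = NormalDist()
    shift = effect_size * math.sqrt(n_per_group / 2)
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


print(round(prob_any_spurious(0.50, 21, 100), 2))  # close to 0.88
print(round(critical_r(21), 3))
print(round(approx_power(0.5, 16), 3))
```

Under these assumptions, the critical r and the power land on the figures quoted above. The spurious-correlation probability comes out near 0.88, slightly below the package's 0.884.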

9 modules, 104 tests, 7 example scripts, 12 skill files. Published on PyPI.

pip install bullshit-detector

Links: PyPI | GitHub | LinkedIn launch post


Perceptual Colormaps: A Decade of Knowledge Sharing (open source, professionally motivated)

~10 years of research, community contribution, and accessible visualization

Poor colormap choices create false patterns in scientific data. I've spent a decade developing methods to evaluate colormaps perceptually, sharing the results through papers, blog posts, and interactive tools.

[Figure: Colormap distortion demo]

Try the interactive app — see how bad colormaps distort geophysical data.

Impact:

Links: Live app | Source | Paper | SEG Wiki


Significant Projects

Fermi Estimation Framework (personal project)

Teaching LLMs quantitative reasoning through structured laws and worked examples

A framework of 17 Laws (mechanical + estimation) that teaches AI models to decompose problems, bound unknowns, proceed with imperfect information, and know when to ask for help. Developed over three years, tested on 11 problems with a 6-criteria scoring rubric. Codifies methodology from Weinstein's Guesstimation books into explicit rules for human-AI collaboration.

Connected to bullshit-detector as the "Fermi sanity" tier — order-of-magnitude plausibility checks on reported claims.
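A minimal sketch of what an order-of-magnitude plausibility check could look like. It is not the framework's code: the function names, the clinic example, and every number in it are made up for illustration. It takes the geometric mean of a lower and an upper bound as the point estimate and flags a claim that sits more than one order of magnitude away.

```python
import math


def geometric_mean_estimate(lower: float, upper: float) -> float:
    """Point estimate from a lower and an upper bound (both positive)."""
    if lower <= 0 or upper < lower:
        raise ValueError("need 0 < lower <= upper")
    return math.sqrt(lower * upper)


def fermi_estimate(factors: dict[str, tuple[float, float]]) -> float:
    """Multiply the geometric-mean estimates of each bounded factor."""
    estimate = 1.0
    for lower, upper in factors.values():
        estimate *= geometric_mean_estimate(lower, upper)
    return estimate


def is_plausible(claimed: float, estimate: float,
                 tolerance_orders: float = 1.0) -> bool:
    """True when the claim is within tolerance_orders powers of ten."""
    return abs(math.log10(claimed / estimate)) <= tolerance_orders


# Hypothetical claim: "our clinic sees 2,000,000 patient visits per year".
# All bounds below are invented for the example.
visits_per_year = fermi_estimate({
    "clinicians": (5, 20),
    "visits per clinician per day": (10, 40),
    "working days per year": (200, 300),
})

print(round(visits_per_year))                    # a few tens of thousands
print(is_plausible(2_000_000, visits_per_year))  # False: more than 10x off
```

Bounding each factor separately keeps the guesswork visible: a reader who disagrees with the estimate can point at the specific bound they would change.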

Blog series: Part 1: The Problem That Wouldn't Compute | Part 2: Permission to Guess


LLM Discipline (personal project)

Anti-sycophancy guardrails and structured prompting for rigorous AI work

Working practices for getting honest, useful behaviour from language models — learned the hard way, across real projects where sycophancy, silent data corruption, and fabrication under questioning cost real time. Two blog posts document the failure modes and the systems I built around them.

The core principle: AI tools that agree with everything you say are worse than useless for analytical work. Configure them to challenge assumptions, not confirm them.

Blog: Operational Discipline for LLM Projects | Standing in the Middle of Intelligence | GitHub


Be a Geoscience and Data Science Detective (open source, professionally motivated)

The intellectual ancestor of bullshit-detector

A methodology for combining domain expertise with statistical rigour when evaluating published results. Don't just accept visual/qualitative claims — back them up with custom error flags, bootstrap confidence intervals, influence plots, and distance correlation. This detective approach evolved directly into the bullshit-detector package.
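As an illustration of one of those checks, here is a standard-library sketch of a percentile bootstrap confidence interval for a correlation coefficient. It is not taken from the notebooks: the function names and the small dataset are hypothetical, and the percentile method is only one of several bootstrap variants.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(xs, ys, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for r, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([xs[i] for i in idx], [ys[i] for i in idx])
        if r is not None:  # skip degenerate resamples with no variance
            stats.append(r)
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Made-up data: ten samples of two loosely related measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 1.8, 3.9, 3.2, 5.5, 4.9, 6.1, 8.4, 7.0, 9.3]

low, high = bootstrap_ci(x, y)
print(f"r = {pearson_r(x, y):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With only ten samples the interval is typically wide, which is the point: a single r value reported without its uncertainty says little about how stable the relationship is.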

Links: Blog post | Notebook 1 | Notebook 2


Mill's Methods, Machine Learning, and Drilling Risk (personal project, professionally motivated)

When 19th-century philosophy and neural networks agree

A drilling problem from a CSEG talk: seven wells, five seismic attributes, four with mud loss problems. Three approaches converged — Mill's Methods of Induction (1843), a simple neural network, and domain expert Lee Hunt — all pointing to the same attributes. The pragmatic insight: the goal isn't finding "the cause," it's building a defensible decision rule under uncertainty.
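Two of Mill's Methods can be written as set operations. The sketch below is not the notebook's code and does not use the real well data: the well names, attribute labels, and presence flags are invented, and only the case counts (seven wells, five attributes, four with mud loss) follow the description above.

```python
def method_of_agreement(cases):
    """Attributes present in every case where the outcome occurred."""
    positives = [attrs for attrs, outcome in cases.values() if outcome]
    return set.intersection(*positives) if positives else set()


def joint_method(cases):
    """Agreement candidates that are also absent from every negative case."""
    negatives = [attrs for attrs, outcome in cases.values() if not outcome]
    seen_without_outcome = set().union(*negatives)
    return method_of_agreement(cases) - seen_without_outcome


# Hypothetical wells: (attributes flagged as anomalous, mud loss observed?)
wells = {
    "W1": ({"a1", "a2", "a4"}, True),
    "W2": ({"a2", "a3", "a4"}, True),
    "W3": ({"a2", "a4", "a5"}, True),
    "W4": ({"a1", "a2", "a4", "a5"}, True),
    "W5": ({"a1", "a3"}, False),
    "W6": ({"a4", "a5"}, False),
    "W7": ({"a3"}, False),
}

print(sorted(method_of_agreement(wells)))  # ['a2', 'a4']
print(sorted(joint_method(wells)))         # ['a2']
```

In this toy data, agreement alone keeps two candidate attributes, and the negative wells eliminate one of them. With seven cases, the result is better read as a decision rule to test than as an identified cause.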

Links: Blog post | Notebook


Fun Projects

Picobot Optimizer (personal project)

How does a nearly blind robot cover every cell in a room? Picobot can only sense its four immediate neighbours — no map, no memory. In 2015, working through Harvey Mudd's CS materials on my own, I optimized the empty room solution from 7 to 6 rules and the maze from 16 to 12. The key insight: the X move (stay put) enables state transitions that let you reuse rules instead of duplicating logic.
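For readers who have not met Picobot, the sketch below shows how rule matching works and what the X move does. It assumes the usual Picobot rule syntax (`state surroundings -> move newstate`, surroundings in N, E, W, S order, `x` for an open cell, `*` as a wildcard). The three rules are illustrative only and are not the optimized solutions, and the helper names are hypothetical rather than taken from the repository's simulator.

```python
def parse_rule(text):
    """Parse 'state pattern -> move new_state' into a tuple."""
    left, right = text.split("->")
    state, pattern = left.split()
    move, new_state = right.split()
    return int(state), pattern, move, int(new_state)


def matches(pattern, surroundings):
    """A pattern matches when each position is '*' or equals the sensor."""
    return all(p == "*" or p == s for p, s in zip(pattern, surroundings))


def step(rules, state, surroundings):
    """Return (move, new_state) from the first rule that applies."""
    for rule_state, pattern, move, new_state in rules:
        if rule_state == state and matches(pattern, surroundings):
            return move, new_state
    raise LookupError(f"no rule for state {state} with {surroundings}")


rules = [parse_rule(r) for r in (
    "0 x*** -> N 0",  # nothing to the north: keep going north
    "0 N*** -> X 1",  # wall to the north: stay put, switch to state 1
    "1 ***x -> S 1",  # in state 1, head south while the way is open
)]

print(step(rules, 0, "xxxx"))  # ('N', 0)
print(step(rules, 0, "Nxxx"))  # ('X', 1)
print(step(rules, 1, "Nxxx"))  # ('S', 1)
```

The second rule changes state without moving, so the next step is handled by the state-1 rules from the same cell. A state's rules can then be reused from several situations instead of being duplicated.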

Revisited in 2025 with a Python simulator, exhaustive verification, and proper documentation.

[Figure: Picobot maze solving]

Links: Blog post | GitHub


Dandelions (personal project)

A free, offline, no-ads mobile game based on Math Games with Bad Drawings by Ben Orlin. Plant flowers, spread seeds, try to cover every square before the Wind leaves you with gaps. Built with the author's knowledge and encouragement.

Single player vs computer Wind (Easy/Hard) • Interactive tutorial • Installable as offline PWA • No accounts, no tracking

Includes a JavaScript tutorial for Python developers walking through the game engine.

Play it — works on phone, tablet, or desktop.

[Figure: Dandelions screenshot]

Links: Play | GitHub


Published Work

How to evaluate and compare color maps. The Leading Edge 34(8), 2015. Top 30 most-downloaded SEG papers (2010-2020).

Mapping and validating lineaments. The Leading Edge 34(8), 2015.

Introduction to Classification with SVMs. The Recorder (CSEG), expert-reviewed.

Keep on improving your geocomputing projects — Chapter in 52 Things You Should Know About Geocomputing, 2019.


Community

Software Underground — Board Member (2024-present)

mycarta blog — 65+ technical posts on geoscience, visualization, data science, and AI.

Conference talks: Transform 2020 (colormap tool), Transform 2021 (FRIDA hackathon)

Stack Exchange: Land percentage in Northern Hemisphere — combining map projections with Python to answer a deceptively simple question.


Previous Work & Additional Projects

Integrated Geophysical Workflows (professional work)

Multi-scale characterization for unconventional resources: regional lineament analysis, seismic attribute extraction, 3D morphological segmentation, multivariate analysis. Workflows refined to inform drilling prioritization. Methods transfer to geothermal, groundwater, CO₂ sequestration, and infrastructure stability.

Links: Lineament mapping | Fault proximity | Seismic tutorial

FRIDA — Acquisition Footprint Removal (personal project, professionally applied)

10+ year project: interactive removal of acquisition footprint noise from 3D seismic data using FFT-based methods. Evolved from MATLAB (2010) through Python port (2014-2018) to Transform 2021 hackathon prototype.

Links: Transform 2021 demo | Tutorial notebooks | 52 Things chapter

Hackathon Projects

Sketch2model — 2015 Calgary Geoconvention, Honorary Mention. Hand-drawn sketches → geological models → synthetic seismic.

SEG 2016 ML Contest — Top-performing facies classification using shallow ML.

FORCE 2020 — GMM clustering for lithology prediction.

Wind Calculator (professional work)

Swept area method calculator adapted for East Coast North Atlantic offshore conditions.

Links: Repository

Additional Repositories

Earth Observation & Remote Sensing Training

InSAR SAR Applications Professional Certificate (edX) • Radar Backscattering (EO College) • NASA ARSET: Fundamentals of Remote Sensing • NASA ARSET: SAR for Disaster & Humanitarian Applications • Climate Geospatial Analysis with Xarray (Coursera) • AI For Good Specialization (Coursera/DeepLearning.AI)

About Me

Hobbies: Drawing and art projects, juggling, longboarding

Current fixation: Card tricks and Rubik's cube


Last updated: March 2026

Pinned repositories

  1. bullshit-detector (Python): Statistical detection tools for screening published research, including spurious correlations, GRIMMER, p-value recomputation, power analysis, and more

  2. Colormap-distorsions-Panel-app (Jupyter Notebook): A Panel app to demonstrate distortions created by non-perceptual colormaps on geophysical data

  3. llm-operational-discipline

  4. picobot-optimizer (Python): Optimizing Picobot solutions from Harvey Mudd's CS for All course (2015 → 2026)

  5. Dandelions (JavaScript)

  6. Be-a-geoscience-detective (Jupyter Notebook): Why choose between Data Science and Geoscience? Do both!