Commit e352a48

Update 3_interpretable.md
1 parent: 981e293

File tree: 1 file changed

_projects/3_interpretable.md: 11 additions (+), 5 deletions (-)

@@ -1,7 +1,7 @@
 ---
 layout: page
-title: Intelligible ML
-description: Building glass-box models that are both predictive and interpretable
+title: Foundations of Interpretability
+description: Developing formal understanding of interpretability through structured models, GAMs, and interaction effects.
 img: assets/img/glass.jpeg
 importance: 3
 category: methods
@@ -10,10 +10,16 @@ related_publications: true
 
 {% include figure.liquid loading="eager" path="assets/img/glass.jpeg" title="glass" class="img-fluid rounded z-depth-1" %}
 
-For many reasons (e.g. scientific inquiry, high-stakes decision making), we need AI systems that are both _accurate_ and _intelligible_.
+In scientific and high-stakes domains, we need AI systems that are not only accurate but also intelligible. Our research investigates the theoretical underpinnings of interpretability, focusing on how model structure affects human understanding.
 
-We find that _interaction effects_ are often a useful lens through which to view intelligibility. Interaction effects {% cite lengerich2020purifying %} are effects which require two input components to know anything about the output (one component alone tells you nothing). Since humans reason by chunking and hierarchical logic, we struggle to understand interactions of multiple variables. If we can instead represent effects as additive (non-interactive) combinations of components, we can understand the components independently and reason about even very complex concepts.
+We explore **interaction effects** as a key barrier to interpretability. Interaction effects arise when two or more input components must be considered jointly to affect the output—meaning that no single component is informative on its own {% cite lengerich2020purifying %}. Because humans tend to reason compositionally and hierarchically, we find that **additive representations**—which isolate effects into separable components—make complex models more understandable.
 
-Toward this end, we have designed new architectures including deep additive models {% cite agarwal2022neural %} and contextualized additive models {% cite lengerich2022automated %}, studied deep learning theory through the lens of interaction effects {% cite lengerich2022dropout %}, studied additive models and identifiability {% cite chang2021how %}, {% cite lengerich2020purifying %}, and applied intelligible models to real-world evidence {% cite lengerich2024interpretable %}.
+To advance this perspective, we have:
+- Developed new architectures such as deep additive models {% cite agarwal2022neural %} and contextualized additive models {% cite lengerich2022automated %}.
+- Analyzed interaction effects in deep learning theory, including dropout behavior {% cite lengerich2022dropout %}.
+- Studied identifiability and disentanglement in additive models {% cite chang2021how lengerich2020purifying %}.
+- Applied these models to real-world clinical data to support intelligible medical decisions {% cite lengerich2024interpretable %}.
 
+Our goal is to formalize what makes models understandable—and build models that are easy to reason about without sacrificing performance.
+
 <br/><br/>
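As a worked illustration of the distinction drawn in the updated text: an additive model exposes each component on its own, while interaction terms force inputs to be read jointly. The notation below is our own sketch, not taken from the page or the cited papers:

```latex
% Additive (GAM-style) form: each shape function f_i(x_i) can be
% inspected and reasoned about independently.
f(x) = \beta_0 + \sum_{i} f_i(x_i)

% With pairwise interaction terms (a GA2M-style model), some effects
% can only be read by considering pairs of inputs jointly:
f(x) = \beta_0 + \sum_{i} f_i(x_i) + \sum_{i < j} f_{ij}(x_i, x_j)

% A pure interaction, e.g. f_{12}(x_1, x_2) = x_1 x_2 with x_1, x_2
% drawn uniformly from {-1, +1}, satisfies
% E[f_{12} | x_1] = E[f_{12} | x_2] = 0:
% neither input alone is informative about the output, which is the
% situation the updated paragraph describes.
```

In this framing, the cited purification work concerns making such decompositions unique (identifiable) by moving any variance that lower-order terms can express out of the interaction terms.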
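The "deep additive models" bullet is in the spirit of neural additive models: one small subnetwork per feature, with the outputs summed so that each learned shape function stays separately inspectable. Below is a minimal, hypothetical PyTorch sketch of that structure; the class names and sizes are ours, and it is not the implementation from the cited work:

```python
# Minimal sketch of a deep additive model (illustrative only).
import torch
import torch.nn as nn


class FeatureNet(nn.Module):
    """Small MLP mapping one scalar feature to its shape function f_i(x_i)."""

    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):  # x: (batch, 1)
        return self.net(x)


class NeuralAdditiveModel(nn.Module):
    """Glass-box predictor: f(x) = bias + sum_i f_i(x_i)."""

    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.feature_nets = nn.ModuleList(
            [FeatureNet(hidden) for _ in range(n_features)]
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (batch, n_features)
        # Each subnetwork sees only its own feature column, so no
        # interaction effects can be learned by construction.
        contributions = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        # Summing (not mixing) the per-feature outputs keeps the model additive.
        return self.bias + torch.stack(contributions, dim=-1).sum(dim=-1)


# Example usage with random data.
model = NeuralAdditiveModel(n_features=5)
y_hat = model(torch.randn(8, 5))  # shape (8, 1)
```

Because the prediction is a plain sum of per-feature terms, each `feature_nets[i]` can be evaluated on a grid and plotted directly, which is what makes this family of models easy to inspect.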
