9.66 Josh Tenenbaum comp. cog. sci. class + 9.S Vikash<>Josh #250
Replies: 19 comments 11 replies
-
I'm skeptical about "common sense". I think it's a catch-all term that is (over)used to invoke certainty or tradition where the speaker has none. I'm sure there's some underlying quantity that exists, but it's some kind of naive view of causality that applies only to simple situations. Real systems don't always suit common sense (example: https://metasd.com/2019/07/complexity-default-assumption/ ).
-
W2: Foundations of Inductive Learning, Bayesian Inference, Bayesian Concept Learning
Eight concepts: hypothesis; observation/example of a concept; prior over hypotheses; likelihood of observations given a hypothesis; posterior; the size principle for the likelihood (smaller hypotheses receive greater likelihood, more so as n increases); the choice principle for hypotheses (natural over logical); and hypothesis averaging for a new observation (averaging predictions over all hypotheses, weighted by their posteriors).
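These W2 concepts fit together in a toy "number game"-style calculation. A minimal sketch in plain JavaScript, where the hypothesis space, priors, and names are my own illustrative assumptions (not the course's code):

```javascript
// Toy Bayesian concept learning (hypothetical hypothesis space and priors).
// Likelihood uses the size principle: P(examples | h) = (1/|h|)^n for consistent h.
const hypotheses = [
  { name: "even numbers (1-20)", members: [2, 4, 6, 8, 10, 12, 14, 16, 18, 20], prior: 0.5 },
  { name: "powers of two (1-20)", members: [2, 4, 8, 16], prior: 0.5 }
];

// Posterior over hypotheses given observed examples of the concept
function posterior(examples) {
  const scores = hypotheses.map(h => {
    const consistent = examples.every(x => h.members.includes(x));
    const likelihood = consistent ? Math.pow(1 / h.members.length, examples.length) : 0;
    return h.prior * likelihood;
  });
  const z = scores.reduce((a, b) => a + b, 0);
  return scores.map(s => s / z);
}

// Hypothesis averaging: P(y in concept | examples), weighted by the posterior
function predict(examples, y) {
  return posterior(examples).reduce(
    (p, postH, i) => p + postH * (hypotheses[i].members.includes(y) ? 1 : 0), 0);
}

// With examples {2, 4, 8}, the smaller "powers of two" hypothesis dominates.
const post = posterior([2, 4, 8]);
const probSix = predict([2, 4, 8], 6); // 6 fits only "even numbers"
```

The size principle is visible here: both hypotheses are consistent with {2, 4, 8}, but the 4-member hypothesis gets likelihood (1/4)^3 versus (1/10)^3, so its posterior climbs toward 1 as examples accumulate.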
-
pset1
Toolkits:
-
Week 6: Graphical Models and Probabilistic Programming
Best lecture so far, as I unpacked my attraction to probabilistic programs: using how Josh argues that probabilistic programming is better (finer-grained) than graphical models, I can share suggestions here on Value Lab's graphical representation of belief on the nodes and arcs of a value hypothesis. Lecture slides and lecture transcript 🗣️. Personal highlight is the Angie-Josh Q&A!
Size principle: smaller hypotheses receive greater likelihood, exponentially more so as the number of examples n increases. The question for today was
-
W9: Probabilistic Language of Thought (meaning function) and MCMC as an intuitive learning mechanism
The meaning function allows purposeful projection of knowledge onto the hyperplane that matters and that is implementable (i.e., we have the capability to express any point in the hyperplane with available resources). Todo: make a collage of the MCMC slides (TAC; the three-proposal comparison, indistinguishable after crossing) using an optimal fit through a sequential Monte Carlo CLD. I drew analogies between the proposal in the inference algorithm and how it might apply in the equity valuation negotiation situation in the context of #249. The table shows how each inference technique maps to specific aspects of the equity valuation negotiation process, with particular attention to the objective function x*(i,c) and the various states and constraints specified in the problem setup.
The remaining techniques, while useful, play supporting roles in the negotiation process:
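To keep the MCMC mechanics concrete alongside the negotiation analogy, here is a minimal Metropolis-Hastings sketch in plain JavaScript. It is my own illustration, not the lecture's code; the target density, step size, and iteration count are assumptions:

```javascript
// Minimal Metropolis-Hastings: sample from a target given only its log density,
// using a symmetric random-walk proposal (so no proposal-ratio correction needed).
function metropolisHastings(logDensity, steps, stepSize) {
  let x = 0;
  const samples = [];
  for (let i = 0; i < steps; i++) {
    // Propose a local move around the current state
    const proposal = x + stepSize * (Math.random() * 2 - 1);
    // Accept with probability min(1, p(proposal)/p(x))
    const logAccept = logDensity(proposal) - logDensity(x);
    if (Math.log(Math.random()) < logAccept) x = proposal;
    samples.push(x); // rejected proposals repeat the current state
  }
  return samples;
}

// Target: standard normal, log density known only up to a constant.
const samples = metropolisHastings(x => -0.5 * x * x, 20000, 1.0);
const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
```

The proposal/accept split is the part the negotiation analogy leans on: the proposal generates a candidate move, and the acceptance rule decides whether the candidate improves on (or is plausible relative to) the current state.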
-
W10
-
W11
-
W12 hierarchical bayes
-
I'm preparing a mail to Josh after reading Andrew's "Why forecast an election that's too close to call? Predictive models don't make the news, but they have a crucial role in democracy.", regarding Josh's comment in the "one and done" paper that if p is close to .5 (a close call), sampling is a waste (just decide). My argument is the absence of the modeler in the one-and-done model: modeling capability stock, which I believe is the key spirit of the Bayesian workflow (centering the modeler (process) instead of the model (product)). Andrew's line "This predictability affects how politicians and journalists think about elections, the economy and the balance between parties" seems very related to Josh's action understanding.
-
Summary of the thirteen special topics to follow below:
📐 Shallow to deep: structured ignorance priors enable robust inference by starting deliberately broad and refining incrementally as evidence accumulates. This approach maintains flexibility against unexpected cases while leveraging structure when appropriate, yielding systems that scale better than those with premature specificity.
👁️ See flowing mass: probabilistic programming visualization reveals how probability distributes across hypothesis space rather than fixating on point estimates. This perceptual skill transforms inference from opaque guesswork to transparent reasoning, allowing rational uncertainty quantification that drives adaptive computation and robust decision-making.
🪒 Auto Occam's razor: a hierarchical model encodes uncertain beliefs, and sampling navigates that uncertainty representation. Together, they form a consistent algorithm for probabilistic inference that behaves rationally, compared to algorithms violating consistency and hence making predictably irrational decisions.
🧩 Compose to simplify: higher-order probabilistic operations automate mathematical reasoning by transforming sampling-based programs into expectation-operator form. This systematic approach allows gradient estimators and density evaluations to emerge from primitive transformations, replacing pages of derivations with composable building blocks that maintain unbiasedness guarantees while enabling exploration of variance-performance tradeoffs across inference algorithms.
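The 🧩 compose-to-simplify point can be made concrete with the simplest such transformation I know: deriving a score-function (REINFORCE) gradient estimator mechanically from a sampler plus its log-density. This is my own minimal sketch of the general idea; the Bernoulli example and function names are assumptions, not the lecture's code:

```javascript
// Score-function gradient estimator: d/dθ E[f(x)] = E[f(x) · d/dθ log p(x; θ)],
// composed from two primitives: a sampler and the gradient of its log density.
function bernoulliSample(theta) {
  return Math.random() < theta ? 1 : 0;
}
function bernoulliLogGrad(x, theta) {
  // d/dθ log p(x; θ) = x/θ - (1 - x)/(1 - θ)
  return x / theta - (1 - x) / (1 - theta);
}

// The estimator emerges from composing the two primitives; it is unbiased.
function scoreFunctionGradient(f, theta, n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    const x = bernoulliSample(theta);
    total += f(x) * bernoulliLogGrad(x, theta); // single-sample unbiased estimate
  }
  return total / n;
}

// For f(x) = x, E[f(x)] = θ, so the true gradient is exactly 1 at any θ.
const g = scoreFunctionGradient(x => x, 0.3, 100000);
```

No calculus was done by hand at the call site: the gradient estimator is assembled from the sampler and its log-density gradient, which is the "expectation-operator form" idea in miniature. The tradeoff is variance, which is why variance-performance comparisons across estimators matter.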
-
1. Scaling behavior of intelligence vs. machine learning
Rational AI principle 1, 📐 shallow to deep: structured ignorance priors enable robust inference by starting deliberately broad and refining incrementally as evidence accumulates. This approach maintains flexibility against unexpected cases while leveraging structure when appropriate, yielding systems that scale better than those with premature specificity.
Applying principle 1
-
2. Perception and navigation
Rational AI principle 2, 👁️ see flowing mass: probabilistic programming visualization reveals how probability distributes across hypothesis space rather than fixating on point estimates. This perceptual skill transforms inference from opaque guesswork to transparent reasoning, allowing rational uncertainty quantification that drives adaptive computation and robust decision-making.
Applying principles 1, 2
-
3. Foundations of modeling and inference
Rational AI principle 3, 🪒 auto Occam's razor: a hierarchical model encodes uncertain beliefs, and sampling navigates that uncertainty representation. Together, they form a consistent algorithm for probabilistic inference that behaves rationally, compared to algorithms violating consistency and hence making predictably irrational decisions.
Applying principles 1, 2, 3
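The automatic Occam's razor can be shown in a few lines with the standard fair-coin vs. biased-coin marginal-likelihood comparison. This is my own illustration of the textbook example, not the lecture's code:

```javascript
// Bayesian Occam's razor via marginal likelihood:
// M1 (simple): fair coin, θ fixed at 0.5.
// M2 (flexible): biased coin, θ ~ Uniform(0, 1).
function logFactorial(n) {
  let s = 0;
  for (let i = 2; i <= n; i++) s += Math.log(i);
  return s;
}

// log P(data | M1) for n flips: each flip has probability 0.5
function logMarginalFair(n) {
  return n * Math.log(0.5);
}

// log P(data | M2) for k heads in n flips:
// ∫ θ^k (1-θ)^(n-k) dθ = Beta(k+1, n-k+1) = k!(n-k)!/(n+1)!
function logMarginalBiased(n, k) {
  return logFactorial(k) + logFactorial(n - k) - logFactorial(n + 1);
}

// Near-balanced data: the simpler model wins, because the flexible model
// spreads its probability mass over many datasets it could have explained...
const balancedFavorsSimple = logMarginalFair(20) > logMarginalBiased(20, 10);
// ...but strongly skewed data overwhelms the razor and the flexible model wins.
const skewedFavorsFlexible = logMarginalFair(20) < logMarginalBiased(20, 18);
```

No complexity penalty is added by hand: averaging the likelihood over the prior (the hierarchical layer) penalizes the flexible model automatically, which is the "consistent algorithm" behavior the principle describes.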
-
I applied four principles in automating the paper-writing project Pose Estimation with Sensors.
Basic principle: a robot's position is determined by combining noisy sensor measurements with a motion model to estimate its most likely location.
Particle filtering vs. simple importance sampling: particle filtering sequentially updates and resamples candidate solutions at each step, while importance sampling tries to estimate the entire distribution at once.
Benefits of rejuvenation: rejuvenation perturbs resampled particles to maintain diversity, preventing collapse to a single solution in high-dimensional problems.
Integrated path vs. probabilistic inference: simple path integration accumulates errors, while probabilistic inference maintains uncertainty and can correct itself with new information.
Differential Visualization Implementation
Here's a simple implementation of differential visualization that shows how changes to inputs propagate through the paper generation process:
// The simplest implementation of Differential Visualization for paper generation
function createDifferentialVisualization() {
  // Track current state
  let state = {
    inputs: {
      phenomena: "",
      theory: "",
      application: ""
    },
    activeInput: "phenomena",
    paperSections: {
      introduction: { content: "", impactScores: {} },
      literature: { content: "", impactScores: {} },
      methodology: { content: "", impactScores: {} },
      results: { content: "", impactScores: {} },
      conclusion: { content: "", impactScores: {} }
    }
  };

  // Define impact relationships (how inputs affect sections).
  // These would be learned or defined based on model analysis.
  const impactMatrix = {
    phenomena: {
      introduction: 0.8,
      literature: 0.4,
      methodology: 0.2,
      results: 0.3,
      conclusion: 0.2
    },
    theory: {
      introduction: 0.3,
      literature: 0.7,
      methodology: 0.9,
      results: 0.5,
      conclusion: 0.3
    },
    application: {
      introduction: 0.2,
      literature: 0.1,
      methodology: 0.4,
      results: 0.7,
      conclusion: 0.8
    }
  };

  // Calculate differential impact when an input changes
  function updateDifferentialImpact(inputKey, oldText, newText) {
    // Calculate change magnitude (simplified).
    // In a real implementation, this would use embeddings comparison.
    const changeRatio = newText.length > 0 ?
      1 - (oldText.length / newText.length) : 1;

    // Update impact scores based on change magnitude
    Object.keys(impactMatrix[inputKey]).forEach(section => {
      const baseImpact = impactMatrix[inputKey][section];
      state.paperSections[section].impactScores[inputKey] = baseImpact * changeRatio;
    });

    // Update visualization
    updateHeatmap();
  }

  // Update the heatmap visualization
  function updateHeatmap() {
    // Compute total impact for each section
    Object.keys(state.paperSections).forEach(section => {
      const sectionImpacts = state.paperSections[section].impactScores;
      const totalImpact = Object.values(sectionImpacts).reduce((sum, val) => sum + val, 0);

      // Update the visual heatmap for this section (DOM operation in a real implementation)
      console.log(`Section "${section}" impact: ${totalImpact.toFixed(2)}`);
      // In a real implementation, we would update DOM elements:
      // document.getElementById(`${section}-heatmap`).style.backgroundColor =
      //   `rgba(255, 102, 0, ${totalImpact})`;
    });
  }

  // Input change handler
  function handleInputChange(inputKey, newValue) {
    const oldValue = state.inputs[inputKey];
    state.inputs[inputKey] = newValue;
    state.activeInput = inputKey;

    // Calculate and visualize impact
    updateDifferentialImpact(inputKey, oldValue, newValue);

    // In a real system, we'd regenerate affected sections:
    // regenerateAffectedSections();
  }

  // Public API
  return {
    updateInput: handleInputChange,
    getState: () => state
  };
}

// Usage example
const paperVisualizer = createDifferentialVisualization();
paperVisualizer.updateInput("phenomena", "Experimental choice differences between entrepreneurs");
paperVisualizer.updateInput("theory", "Hierarchical Bayesian model of decision making");
Key Logic for Differential Visualization
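The particle-filtering, rejuvenation, and probabilistic-inference principles stated at the top of this post can be sketched directly. A 1-D toy tracker of my own; all noise levels, values, and names are illustrative assumptions, not the project's code:

```javascript
// One step of a particle filter for 1-D position tracking with a noisy sensor.
function gaussianNoise(sigma) {
  // Box-Muller transform for a normal sample
  const u = 1 - Math.random(), v = Math.random();
  return sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function particleFilterStep(particles, control, observation, motionNoise, sensorNoise) {
  // 1. Propagate each particle through the motion model.
  const moved = particles.map(p => p + control + gaussianNoise(motionNoise));
  // 2. Weight each particle by the likelihood of the observation.
  const weights = moved.map(p =>
    Math.exp(-0.5 * ((observation - p) / sensorNoise) ** 2));
  const z = weights.reduce((a, b) => a + b, 0);
  // 3. Resample proportionally to weight (multinomial resampling).
  const resampled = moved.map(() => {
    let r = Math.random() * z;
    for (let i = 0; i < moved.length; i++) {
      r -= weights[i];
      if (r <= 0) return moved[i];
    }
    return moved[moved.length - 1];
  });
  // 4. Rejuvenate: small perturbation keeps particle diversity after resampling.
  return resampled.map(p => p + gaussianNoise(0.05));
}

// Track a robot moving +1 per step, observed with noise, from a broad prior.
let particles = Array.from({ length: 500 }, () => gaussianNoise(5));
let truePos = 0;
for (let t = 0; t < 20; t++) {
  truePos += 1;
  const obs = truePos + gaussianNoise(0.5);
  particles = particleFilterStep(particles, 1, obs, 0.2, 0.5);
}
const estimate = particles.reduce((a, b) => a + b, 0) / particles.length;
```

Unlike pure path integration (which would accumulate the motion noise unboundedly), each observation pulls the particle cloud back toward the true position, and the rejuvenation perturbation prevents the resampled set from collapsing to a single particle.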
-
4. automating math
lew25_auto(math(integ, diff)).pdf
-
I prompted the below to Katie, as we have a shared interest in model synthesis & entrepreneurship, and Angie would like to get advice from Katie on building a domain-specific language for entrepreneurial operations.
Q1. Does it make sense if I say the nail-scale-sail comic book is an attempt at defining semantically rich but syntactically restrictive entrepreneurial operations? Full paper below, where ten case studies are compactly used to illustrate ten tools.
Q2. If Q1 is yes: if you were to design a domain-specific language for entrepreneurial operations (structure below), where would you spend most time developing?
Q3. What do you think about Vikash's separation between generative programs and causality?
-
My favorite seminar so far! Seminar transcript summary: Challenges in Interdisciplinary Research. Key topics covered:
The seminar encouraged participants to be intentional about research communication, to understand field-specific evaluation standards, and to balance technical excellence with strong presentation and positioning.
-
3 Course Description
An introduction to computational theories of human cognition, and the computational frameworks that could support human-like artificial intelligence (AI). Our central questions are: What is the form and content of people's knowledge of the world across different domains, and what are the principles that guide people in learning new knowledge and reasoning to reach decisions based on sparse, noisy data? We survey recent approaches to cognitive science and AI built on these principles:
World knowledge can be described using probabilistic generative models; perceiving, learning, reasoning and other cognitive processes can be understood as Bayesian inferences over these generative models.
To capture the flexibility and productivity of human cognition, generative models can be defined over richly structured symbolic systems such as graphs, grammars, predicate logics, and most generally probabilistic programs.
Inference in hierarchical models can explain how knowledge at multiple levels of abstraction is acquired.
Learning with adaptive data structures allows models to grow in complexity or change form in response to the observed data.
Approximate inference schemes based on sampling (Monte Carlo) and deep neural networks allow rich models to scale up efficiently, and may also explain some of the algorithmic and neural underpinnings of human thought.
We will introduce a range of modeling tools, including core methods from contemporary AI and Bayesian machine learning, as well as new approaches based on probabilistic programming languages. We will show how these methods can be applied to many aspects of cognition, including perception, concept learning and categorization, language understanding and acquisition, common-sense reasoning, decision-making and planning, theory of mind and social cognition. Lectures will focus on the intuitions behind these models and their applications to cognitive phenomena, rather than detailed mathematics. Recitations will fill in mathematical background and give hands-on modeling guidance in several probabilistic programming environments, including WebPPL and Gen.
4 Prerequisites
(1) Basic probability and statistical inference as you would acquire in 9.014, 9.40, 18.05, 18.600, 6.008, 6.036, 6.041, or 6.042, or an equivalent class. If you have not taken one of these classes, please talk to the instructor after the first day. (2) Previous experience with programming, especially in Matlab, Python, Scheme or Javascript, which form the basis of the probabilistic programming environments we use. Also helpful would be previous exposure to core problems and methods in artificial intelligence, machine learning, or cognitive science.
5 Requirements and Grading (to be updated)
Participation: Attendance at lectures is required. Recitations are optional. Students must attend at least 80% of lectures, and not miss more than 2 in a row. Students failing to meet the attendance requirement will be penalized by one full letter grade (for example, the grade will go from an A- to a B-). Attendance will be taken using a Google form with a Question of the Day during each class. The only exception to this attendance requirement is by specific exemption from the S^3 Deans or DAS (Disability and Access Services).
Problem Sets [60%]: There will be 3 problem sets, along with an optional 4th problem set. The release and due dates of the problem sets are still being finalized, and this section will be updated in the future.
Final Project [40%]: this is a project-based course. You will submit a project proposal and a paper-style write up for a final course project. Projects can be done individually or in groups of 2-3, but proportionately more work is expected for a group project.
Late Policy: A late penalty of 5% per day will be applied to late problem sets, up to 1 week past the deadline. We can't accept work later than 1 week after the deadline except in extraordinary circumstances because doing so would hinder our ability to discuss solutions with students in a timely manner.
Collaboration Policy: Students are allowed to talk and work with others on problem sets, but the work they write up and hand in must be their own. Students should also indicate the names of anyone that they worked with or collaborated with.
The Institute obliges us to remind you of its policy on integrity. It can be found at the website http://web.mit.edu/academicintegrity/. Please read it if you have not already done so.
Consistent with MIT policy, there is no curve or pre-set grade distribution for this class.
Please let us know on an individual basis if you have a learning disability or other special concern you would like us to be aware of.
6 Topics
Specific techniques and topics in cognitive modeling to be covered this year will include some or all of the following:
Foundations of inductive learning: philosophical challenges and theories of how learning is possible.
Introduction to Bayesian inference and Bayesian concept learning.
Modeling human cognition as rational statistical inference: Bayes meets Marr’s levels of analysis. Case studies in modeling surface perception and predicting the future.
Graphical models and Bayesian networks. Modeling how people learn and reason about simple causal structures.
Probabilistic programming languages: Generalizations of graphical models that can capture the core of common-sense reasoning. Modeling social evaluation and attribution, visual scene understanding and common-sense physical reasoning.
Sampling-based methods for approximate probabilistic inference: Markov chain Monte Carlo (MCMC) - Gibbs sampling, Metropolis-Hastings; Sequential Monte Carlo (particle filtering). Modeling the dynamics of binocular rivalry, online sentence processing, change detection and multiple object tracking.
Learning as Bayesian inference: parameter estimation and model selection; the Bayesian Occam’s razor. Modeling visual learning and classical conditioning in animals.
Hierarchical Bayesian models: a framework for learning to learn, transfer learning, and multitask learning. Modeling how children learn the meanings of words, and learn the basis for rapid (’one-shot’) learning. Building a machine that learns words like children do.
Probabilistic models for unsupervised clustering: Modeling human categorization and category discovery; prototype and exemplar categories; categorizing objects by relations and causal properties.
Nonparametric Bayesian models - capturing the long tail of an infinitely complex world: Dirichlet processes, adaptor grammars, fragment grammars. Models for morphology in language.
Planning with Markov Decision Processes (MDPs): Modeling single- and multi-agent decision-making. Modeling human ’theory of mind’ as inverse planning.
Modeling human cognitive development - how we get to be so smart: infants’ probabilistic reasoning and curiosity; how children learn about causality and number; the growth of intuitive theories.
W1
Sep.06
Q1. Does common sense evolve with the sensory system? I haven't seen Tameren for the last five years, and this would affect my answer to his question on it.