|
38 | 38 | "\n", |
39 | 39 | "However, ML is not a panacea. It can perform wonders under very strict boundaries and still fail miserably if the data it's using deviates a little from what the model is accustomed to. To give another example from Prediction Machines, \"in many industries, low prices are associated with low sales. For example, in the hotel industry, prices are low outside the tourist season, and prices are high when demand is highest and hotels are full. Given that data, a naive prediction might suggest that increasing the price would lead to more rooms sold.\" \n", |
40 | 40 | "\n", |
41 | | - "ML is notoriously bad at this inverse causality type of problems. They require us to answer \"what if\" questions, what Economists call counterfactuals. What would happen if instead of this price I'm currently asking for my merchandise, I use another price? What would happen if instead of this low fat diet I'm in, I do a low sugar one? If you work in a bank, giving credit, you will have to figure out how changing the customer line changes you'r revenue. Or if you work at the local government, you might be asked to figure out how to make the schooling system better. Should you give tablets to every kid because the era of digital knowledge tells you to? Or should you build an old fashioned library? \n", |
| 41 | + "ML is notoriously bad at this inverse causality type of problem. Such problems require us to answer \"what if\" questions, what economists call counterfactuals. What would happen if, instead of the price I'm currently charging for my merchandise, I used another one? What would happen if, instead of the low fat diet I'm on, I went on a low sugar one? If you work in a bank, giving credit, you will have to figure out how changing the customer line changes your revenue. Or if you work at the local government, you might be asked to figure out how to make the schooling system better. Should you give tablets to every kid because the era of digital knowledge tells you to? Or should you build an old fashioned library? \n", |
42 | 42 | "\n", |
43 | 43 | "At the heart of these questions, there is a causal inquiry whose answer we wish to know. Causal questions permeate everyday problems, like figuring out how to make sales go up, but they also play an important role in dilemmas that are very personal and dear to us: do I have to go to an expensive school to be successful in life (does education cause earnings)? Does immigration lower my chances of getting a job (does immigration cause unemployment to go up)? Do money transfers to the poor lower the crime rate? Whatever field you are in, it is very likely you have had, or will have, to answer some type of causal question. Unfortunately for ML, we can't rely on correlation type predictions to tackle them.\n", |
44 | 44 | "\n", |
|
53 | 53 | "cell_type": "code", |
54 | 54 | "execution_count": 1, |
55 | 55 | "metadata": { |
| 56 | + "collapsed": true, |
56 | 57 | "jupyter": { |
57 | | - "outputs_hidden": true, |
58 | | - "source_hidden": true |
| 58 | + "outputs_hidden": true |
59 | 59 | } |
60 | 60 | }, |
61 | 61 | "outputs": [], |
|
83 | 83 | { |
84 | 84 | "cell_type": "code", |
85 | 85 | "execution_count": 2, |
86 | | - "metadata": { |
87 | | - "jupyter": { |
88 | | - "source_hidden": true |
89 | | - } |
90 | | - }, |
| 86 | + "metadata": {}, |
91 | 87 | "outputs": [ |
92 | 88 | { |
93 | 89 | "data": { |
|
390 | 386 | "\n", |
391 | 387 | "## Bias\n", |
392 | 388 | "\n", |
393 | | - "Bias is what makes association different from causation. Fortunately, it too can be easily understood with our intuition. Let's recap our tablets in the classroom example. When confronted with the claim that schools that give tablets to their kids do better on tests score, we can rebut it by saying those schools will probably do better anyway, even without the tables. That is because they probably have more money than the other schools, and hence can pay better teachers, afford better classrooms, and so on. In other words, it is the case that treated schools (with tablets) are not comparable with untreated schools. \n", |
| 389 | + "Bias is what makes association different from causation. Fortunately, it too can be easily understood with our intuition. Let's recap our tablets in the classroom example. When confronted with the claim that schools that give tablets to their kids do better on test scores, we can rebut it by saying those schools would probably do better anyway, even without the tablets. That is because they probably have more money than the other schools; hence they can pay better teachers, afford better classrooms, and so on. In other words, it is the case that treated schools (with tablets) are not comparable with untreated schools. \n", |
394 | 390 | "\n", |
395 | 391 | "To say this in potential outcome notation is to say that \\\\(Y_0\\\\) of the treated is different from the \\\\(Y_0\\\\) of the untreated. Remember that the \\\\(Y_0\\\\) of the treated **is counterfactual**. We can't observe it, but we can reason about it. In this particular case, we can even leverage our understanding of how the world works to go further. We can say that, probably, \\\\(Y_0\\\\) of the treated is bigger than \\\\(Y_0\\\\) of the untreated schools. That is because schools that can afford to give tablets to their kids can also afford other factors that contribute to better test scores. Let this sink in for a moment. It takes some time to get used to talking about potential outcomes. Read this paragraph again and make sure you understand it.\n", |
396 | 392 | "\n", |
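This bias can be made concrete with a small simulation (a hypothetical sketch, assuming NumPy; the wealth and score variables are made up for illustration). Even when tablets have zero true effect, the naive comparison of treated and untreated schools shows a positive gap, and that gap is exactly the difference in \\(Y_0\\) between the two groups:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical setup: wealthier schools score higher even without tablets,
# and wealthier schools are also more likely to adopt tablets.
wealth = rng.normal(size=n)
y0 = 70 + 5 * wealth + rng.normal(size=n)  # potential score without tablets
y1 = y0.copy()                             # with tablets: true effect is zero
tablets = (wealth + rng.normal(size=n) > 0).astype(int)

y = np.where(tablets == 1, y1, y0)         # observed outcome

# Naive association: E[Y|T=1] - E[Y|T=0]
naive_diff = y[tablets == 1].mean() - y[tablets == 0].mean()

# Bias: E[Y0|T=1] - E[Y0|T=0], visible here only because we simulated Y0
bias = y0[tablets == 1].mean() - y0[tablets == 0].mean()

print(naive_diff, bias)  # both positive, even though the true effect is 0
```

Because the simulated treatment effect is zero, the naive difference equals the bias term exactly: all of the observed gap comes from the treated schools' higher \\(Y_0\\), not from the tablets.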
|