|
554 | 554 | "\n", |
555 | 555 | "* To estimate **quantiles**, the following is a strictly proper scoring rule:\n", |
556 | 556 | "$$L(\\hat \\theta, \\theta; \\tau) = (\\hat \\theta - \\theta)(\\mathbf{1}_{\\hat \\theta - \\theta > 0} - \\tau)$$\n", |
557 | | - "Here we write an indicator function as $\\mathbf{1}_{\\hat \\theta - \\theta > 0}$ to evaluate to 1 for overestimation (positive $\\hat \\theta - \\theta$) and $0$ otherwise.\n", |
558 | 557 | "\n", |
559 | | - " For $\\tau=\\frac 1 2$, over- or underestimating a true posterior sample $\\theta$ is weighted equally. In fact, the quantile loss with $\\tau=\\frac 1 2$ is identical to the median loss (up to a scaling of $\\frac 1 2$). For the same reasons, both estimate the median of the posterior.\n", |
| 558 | + "    Here we write an indicator function $\\mathbf{1}_{\\hat \\theta - \\theta > 0}$, which evaluates to $1$ for overestimation (positive $\\hat \\theta - \\theta$) and to $0$ otherwise.\n", |
560 | 559 | "\n", |
| 560 | + "    For $\\tau=\\frac 1 2$, over- and underestimating a true posterior sample $\\theta$ are weighted equally. In fact, the quantile loss with $\\tau=\\frac 1 2$ equals the median loss up to a factor of $\\frac 1 2$, so both estimate the median of the posterior (see the numerical check after this list).\n", |
561 | 561 | "\n", |
562 | 562 | "    More generally, $\\tau \\in (0,1)$ is the quantile level, that is, the point at which to evaluate the [quantile function](https://en.wikipedia.org/wiki/Quantile_function).\n", |
563 | 563 | "\n", |
564 | 564 | "\n", |
565 | | - "\n", |
566 | | - "\n", |
567 | 565 | "* Note that, when approximating the full distribution in BayesFlow, we score a **probability estimate** $\\hat p(\\theta|x)$ with the log-score,\n", |
568 | 566 | "$$L(\\hat p(\\theta|x), \\theta) = \\log (\\hat p(\\theta|x)) $$\n", |
569 | 567 | "which is also a strictly proper scoring rule.\n", |
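
A quick numerical check of the claims in the list above (a minimal numpy sketch, not part of the notebook itself): minimizing the average quantile loss over posterior samples recovers the empirical $\tau$-quantile, and at $\tau = \frac 1 2$ the loss is exactly half the absolute error.

```python
import numpy as np

def quantile_loss(theta_hat, theta, tau):
    # Pinball loss: (theta_hat - theta) * (1[theta_hat - theta > 0] - tau)
    diff = theta_hat - theta
    return diff * ((diff > 0).astype(float) - tau)

rng = np.random.default_rng(1)
theta = rng.normal(size=10_000)       # stand-in for posterior samples
grid = np.linspace(-3.0, 3.0, 601)    # candidate point estimates

for tau in (0.1, 0.5, 0.9):
    avg_loss = np.array([quantile_loss(t, theta, tau).mean() for t in grid])
    best = grid[avg_loss.argmin()]
    print(f"tau={tau}: loss minimizer {best:+.2f} "
          f"vs empirical quantile {np.quantile(theta, tau):+.2f}")

# At tau = 1/2, the quantile loss is exactly half the absolute error.
assert np.allclose(quantile_loss(0.3, theta, 0.5), 0.5 * np.abs(0.3 - theta))
```

Up to the grid resolution, the minimizer matches the empirical quantile at each level $\tau$, which is why optimizing this loss yields quantile estimates.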
|
791 | 789 | "Just for fun and because we can, let us save the trained point approximator to disk." |
792 | 790 | ] |
793 | 791 | }, |
794 | | - { |
795 | | - "cell_type": "code", |
796 | | - "execution_count": 20, |
797 | | - "id": "0de263ba-b9a9-4aca-bf8d-0b01b18ef4e8", |
798 | | - "metadata": {}, |
799 | | - "outputs": [], |
800 | | - "source": [ |
801 | | - "point_inference_workflow.approximator.build_from_data(adapter(training_data))" |
802 | | - ] |
803 | | - }, |
804 | 792 | { |
805 | 793 | "cell_type": "code", |
806 | 794 | "execution_count": 21, |
|
854 | 842 | "Since one point estimate already summarizes many posterior samples, we only need a single forward pass with a point inference network, whereas a generative, full posterior approximator would require ~100 passes (see the toy sketch below)." |
855 | 843 | ] |
856 | 844 | }, |
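
To make the pass-counting argument concrete, here is a toy sketch (not BayesFlow code; `forward_pass` is a hypothetical stand-in for an inference network): a point estimator returns its summary in one call, while a sampling-based approximator pays one evaluation per draw. In practice a generative approximator can batch its draws, but the total number of network evaluations still scales with the number of samples.

```python
import numpy as np

n_calls = 0

def forward_pass(x):
    """Stand-in for one pass through an inference network (purely illustrative)."""
    global n_calls
    n_calls += 1
    return float(np.tanh(x).mean())  # dummy computation

x_obs = np.random.default_rng(0).normal(size=32)

# Point estimation: the network emits the summary directly -> a single pass.
n_calls = 0
point_estimate = forward_pass(x_obs)
print("passes for one point estimate:", n_calls)   # 1

# Generative approximation: one pass per posterior draw -> ~100 passes.
n_calls = 0
draws = [forward_pass(x_obs) for _ in range(100)]
print("passes for 100 posterior draws:", n_calls)  # 100
```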
857 | | - { |
858 | | - "cell_type": "code", |
859 | | - "execution_count": 23, |
860 | | - "id": "2f3833f9-a155-49aa-9e0a-d1b264c72fda", |
861 | | - "metadata": {}, |
862 | | - "outputs": [ |
863 | | - { |
864 | | - "data": { |
865 | | - "text/plain": [ |
866 | | - "False" |
867 | | - ] |
868 | | - }, |
869 | | - "execution_count": 23, |
870 | | - "metadata": {}, |
871 | | - "output_type": "execute_result" |
872 | | - } |
873 | | - ], |
874 | | - "source": [ |
875 | | - "point_inference_workflow.approximator.built" |
876 | | - ] |
877 | | - }, |
878 | 845 | { |
879 | 846 | "cell_type": "code", |
880 | 847 | "execution_count": 24, |
|