|
2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "floating-button", |
| 5 | + "id": "rapid-beaver", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | 8 | "A common challenge in data analytics is: easily plotting multiple curves\n", |
|
19 | 19 | { |
20 | 20 | "cell_type": "code", |
21 | 21 | "execution_count": 1, |
22 | | - "id": "russian-perspective", |
| 22 | + "id": "younger-break", |
23 | 23 | "metadata": {}, |
24 | 24 | "outputs": [], |
25 | 25 | "source": [ |
|
31 | 31 | { |
32 | 32 | "cell_type": "code", |
33 | 33 | "execution_count": 2, |
34 | | - "id": "ongoing-chart", |
| 34 | + "id": "asian-perception", |
35 | 35 | "metadata": {}, |
36 | 36 | "outputs": [ |
37 | 37 | { |
|
197 | 197 | }, |
198 | 198 | { |
199 | 199 | "cell_type": "markdown", |
200 | | - "id": "developmental-exhibition", |
| 200 | + "id": "consistent-spine", |
201 | 201 | "metadata": {}, |
202 | 202 | "source": [ |
203 | 203 | "The above is a data frame of an A/B test plan as a function of the test size `n`. \n", |
|
225 | 225 | { |
226 | 226 | "cell_type": "code", |
227 | 227 | "execution_count": 3, |
228 | | - "id": "express-requirement", |
| 228 | + "id": "accepting-hungarian", |
229 | 229 | "metadata": {}, |
230 | 230 | "outputs": [ |
231 | 231 | { |
|
289 | 289 | }, |
290 | 290 | { |
291 | 291 | "cell_type": "markdown", |
292 | | - "id": "grateful-bachelor", |
| 292 | + "id": "interracial-reducing", |
293 | 293 | "metadata": {}, |
294 | 294 | "source": [ |
295 | 295 | "For plotting we want this record to look like this:" |
|
298 | 298 | { |
299 | 299 | "cell_type": "code", |
300 | 300 | "execution_count": 4, |
301 | | - "id": "fourth-champion", |
| 301 | + "id": "operating-divorce", |
302 | 302 | "metadata": {}, |
303 | 303 | "outputs": [ |
304 | 304 | { |
|
368 | 368 | }, |
369 | 369 | { |
370 | 370 | "cell_type": "markdown", |
371 | | - "id": "floral-serbia", |
| 371 | + "id": "celtic-wings", |
372 | 372 | "metadata": {}, |
373 | 373 | "source": [ |
374 | 374 | "That is, we want a record that spans multiple rows that picks up a set of the columns of the original record.\n", |
|
401 | 401 | { |
402 | 402 | "cell_type": "code", |
403 | 403 | "execution_count": 5, |
404 | | - "id": "freelance-rebel", |
| 404 | + "id": "technical-clearing", |
405 | 405 | "metadata": {}, |
406 | 406 | "outputs": [], |
407 | 407 | "source": [ |
|
414 | 414 | }, |
415 | 415 | { |
416 | 416 | "cell_type": "markdown", |
417 | | - "id": "intermediate-complexity", |
| 417 | + "id": "multiple-zimbabwe", |
418 | 418 | "metadata": {}, |
419 | 419 | "source": [ |
420 | 420 | "Now that we have our pivot object we can use it to realize the transform." |
|
423 | 423 | { |
424 | 424 | "cell_type": "code", |
425 | 425 | "execution_count": 6, |
426 | | - "id": "gorgeous-customer", |
| 426 | + "id": "allied-nursery", |
427 | 427 | "metadata": {}, |
428 | 428 | "outputs": [ |
429 | 429 | { |
|
488 | 488 | }, |
489 | 489 | { |
490 | 490 | "cell_type": "markdown", |
491 | | - "id": "parental-caution", |
| 491 | + "id": "fleet-level", |
492 | 492 | "metadata": {}, |
493 | 493 | "source": [ |
494 | 494 | "Notice the transform exactly realizes our desired record format. The point is: this transform is being applied to all\n", |
|
500 | 500 | { |
501 | 501 | "cell_type": "code", |
502 | 502 | "execution_count": 7, |
503 | | - "id": "laden-daily", |
| 503 | + "id": "worth-tumor", |
504 | 504 | "metadata": {}, |
505 | 505 | "outputs": [ |
506 | 506 | { |
|
525 | 525 | }, |
526 | 526 | { |
527 | 527 | "cell_type": "markdown", |
528 | | - "id": "foster-feelings", |
| 528 | + "id": "imperial-glasgow", |
529 | 529 | "metadata": {}, |
530 | 530 | "source": [ |
531 | 531 | "And we are done." |
532 | 532 | ] |
533 | 533 | }, |
534 | 534 | { |
535 | 535 | "cell_type": "markdown", |
536 | | - "id": "following-board", |
| 536 | + "id": "split-today", |
537 | 537 | "metadata": {}, |
538 | 538 | "source": [ |
539 | 539 | "A neat side-feature of the transform is: it is pretty much invertible. For fun let's look at that." |
|
542 | 542 | { |
543 | 543 | "cell_type": "code", |
544 | 544 | "execution_count": 8, |
545 | | - "id": "organizational-investigation", |
| 545 | + "id": "racial-queue", |
546 | 546 | "metadata": {}, |
547 | 547 | "outputs": [ |
548 | 548 | { |
|
571 | 571 | { |
572 | 572 | "cell_type": "code", |
573 | 573 | "execution_count": 9, |
574 | | - "id": "suitable-archive", |
| 574 | + "id": "parental-musician", |
575 | 575 | "metadata": {}, |
576 | 576 | "outputs": [ |
577 | 577 | { |
|
601 | 601 | }, |
602 | 602 | { |
603 | 603 | "cell_type": "markdown", |
604 | | - "id": "olive-strike", |
| 604 | + "id": "european-connection", |
605 | 605 | "metadata": {}, |
606 | 606 | "source": [ |
607 | 607 | "Notice the `blocks_in` and `blocks_out` reversed roles when we took the inverse. Let's see what the inverse does." |
|
610 | 610 | { |
611 | 611 | "cell_type": "code", |
612 | 612 | "execution_count": 10, |
613 | | - "id": "transsexual-right", |
| 613 | + "id": "phantom-working", |
614 | 614 | "metadata": {}, |
615 | 615 | "outputs": [ |
616 | 616 | { |
|
674 | 674 | { |
675 | 675 | "cell_type": "code", |
676 | 676 | "execution_count": 11, |
677 | | - "id": "muslim-ceramic", |
| 677 | + "id": "configured-deviation", |
678 | 678 | "metadata": {}, |
679 | 679 | "outputs": [ |
680 | 680 | { |
|
730 | 730 | }, |
731 | 731 | { |
732 | 732 | "cell_type": "markdown", |
733 | | - "id": "opened-parker", |
| 733 | + "id": "careful-exhibition", |
734 | 734 | "metadata": {}, |
735 | 735 | "source": [ |
736 | 736 | "The inverse has re-assembled the columns that were present into essentially the original format.\n", |
|
746 | 746 | "Some follow-up links:\n", |
747 | 747 | "\n", |
748 | 748 | " * The `Python` `data_algebra` package that supplies the `cdata` implementation: [https://github.com/WinVector/data_algebra](https://github.com/WinVector/data_algebra).\n", |
749 | | - " * Some examples: [https://github.com/WinVector/data_algebra/tree/main/Examples/cdata](https://github.com/WinVector/data_algebra/tree/main/Examples/cdata).\n", |
| 749 | + " * Some examples: [https://github.com/WinVector/data_algebra/tree/main/Examples/cdata](https://github.com/WinVector/data_algebra/tree/main/Examples/cdata) (including this example [here](https://github.com/WinVector/data_algebra/blob/main/Examples/cdata/plotting_multiple_curves_in_seaborn.ipynb)).\n", |
750 | 750 | " * The `R` version of the package: [https://github.com/WinVector/cdata](https://github.com/WinVector/cdata) (where\n", |
751 | 751 | " we did a lot of the original research).\n", |
752 | 752 | " * A tutorial video on the \"coordinatized data\" methodology (in `R`, but the concepts carry over): [https://youtu.be/4cYbP3kbc0k](https://youtu.be/4cYbP3kbc0k).\n" |
|
755 | 755 | { |
756 | 756 | "cell_type": "code", |
757 | 757 | "execution_count": null, |
758 | | - "id": "color-stuart", |
| 758 | + "id": "necessary-buddy", |
759 | 759 | "metadata": {}, |
760 | 760 | "outputs": [], |
761 | 761 | "source": [] |
|