WinVector
diff --git a/Examples/ValuesAsColumns/ValuesAsColumns.ipynb b/Examples/ValuesAsColumns/ValuesAsColumns.ipynb
Lines changed: 147 additions & 9 deletions
diff --git a/coverage.txt b/coverage.txt
Lines changed: 1 addition & 1 deletion
diff --git a/dist/data_algebra-1.2.1-py3-none-any.whl b/dist/data_algebra-1.2.1-py3-none-any.whl
0 Bytes
diff --git a/dist/data_algebra-1.2.1.tar.gz b/dist/data_algebra-1.2.1.tar.gz
-1 Bytes
@@ -1,23 +1,73 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# Values as Columns\n",
+    "\n",
+    "A [SQL](https://en.wikipedia.org/wiki/SQL) feature I really like is the interchangeability of values and columns. It is a small convenience, but a nice one.\n",
+    "\n",
+    "Let's work an example to illustrate the point. Our task will be to count how many rows are in each group of a data frame.\n",
+    "\n",
+    "In the [data algebra](https://github.com/WinVector/data_algebra) over [Pandas](https://pandas.pydata.org) this looks like the following.\n",
+    "\n",
+    "First we import our packages and set up our example Pandas data frame."
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   }
+  },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {
     "collapsed": true
    },
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": "  group  one\n0     a    1\n1     a    1\n2     b    1\n3     b    1\n4     b    1",
+      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>group</th>\n      <th>one</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>a</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>a</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>b</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>b</td>\n      <td>1</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>b</td>\n      <td>1</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "import pandas as pd\n",
-    "from data_algebra.data_ops import data, descr, ex\n",
-    "from data_algebra.BigQuery import BigQueryModel, BigQuery_DBHandle\n",
+    "from data_algebra.data_ops import descr\n",
+    "from data_algebra.BigQuery import BigQueryModel\n",
     "\n",
     "\n",
     "d = pd.DataFrame({\n",
     "    'group': ['a', 'a', 'b', 'b', 'b'],\n",
     "    'one': [1, 1, 1, 1, 1],\n",
     "})\n",
     "\n",
+    "d\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "Now we specify our grouped counting operations, using a data algebra project step."
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "outputs": [],
+   "source": [
     "ops = (\n",
     "    descr(d=d)\n",
     "        .project(\n",
@@ -27,26 +77,114 @@
     "            },\n",
     "            group_by=['group']\n",
     "        )\n",
-    ")\n",
+    ")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "The point is, we have the freedom to count using a value in a column (such as the column `one`) *or* just by summing a value directly (such as `1`). The parentheses in `(1).sum()` are there so that the dot is parsed as an attribute lookup, and not as a decimal point.\n",
     "\n",
+    "As desired, both calculations return the same result."
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "outputs": [
+    {
+     "data": {
+      "text/plain": "  group  sum_one  sum_1\n0     a        2      2\n1     b        3      3",
+      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>group</th>\n      <th>sum_one</th>\n      <th>sum_1</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>a</td>\n      <td>2</td>\n      <td>2</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>b</td>\n      <td>3</td>\n      <td>3</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
     "ops.transform(d)"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "And the equivalent SQL is given as follows."
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   }
   },
   {
    "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
+   "execution_count": 4,
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "-- data_algebra SQL https://github.com/WinVector/data_algebra\n",
+      "--  dialect: BigQueryModel\n",
+      "--       string quote: \"\n",
+      "--   identifier quote: `\n",
+      "SELECT  -- .project({ 'sum_one': 'one.sum()', 'sum_1': '(1).sum()'}, group_by=['group'])\n",
+      " SUM(`one`) AS `sum_one` ,\n",
+      " SUM(1) AS `sum_1` ,\n",
+      " `group`\n",
+      "FROM\n",
+      " `d`\n",
+      "GROUP BY\n",
+      " `group`\n",
+      "\n"
+     ]
+    }
+   ],
    "source": [
     "db_model = BigQueryModel()\n",
     "\n",
-    "print(db_model.to_sql(ops))\n"
+    "sql_str = db_model.to_sql(ops)\n",
+    "\n",
+    "print(sql_str)"
    ],
    "metadata": {
     "collapsed": false,
     "pycharm": {
      "name": "#%%\n"
     }
    }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "SQL is where this equivalence of values and columns is borrowed from."
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   }
   }
  ],
  "metadata": {
 
@@ -135,4 +135,4 @@ data_algebra/util.py                     140     29    79%
 TOTAL                                   5263    909    83%
 
 
-============================= 250 passed in 27.40s =============================
+============================= 250 passed in 33.25s =============================