This repository was archived by the owner on Jul 27, 2024. It is now read-only.

Commit 71f63d2

add reference to Facets Overview spark project
1 parent 8f80597 commit 71f63d2

1 file changed

+5 -0 lines changed
facets_overview/README.md

Lines changed: 5 additions & 0 deletions
@@ -39,6 +39,11 @@ import pandas as pd
 df = pd.DataFrame({'num' : [1, 2, 3, 4], 'str' : ['a', 'a', 'b', None]})
 proto = GenericFeatureStatisticsGenerator().ProtoFromDataFrames([{'name': 'test', 'table': df}])
 ```
+
+## Large Datasets
+
+The python code in this repository for generating feature stats only works on datasets that are small enough to fit into memory on your local machine. For distributed generation of feature stats for large datasets, check out the independently-developed [Facets Overview Spark project](https://github.com/gopro/facets-overview-spark).
+
 # Visualization
 
 A proto can easily be visualized in a Jupyter notebook using the installed nbextension.
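
Note (not part of the diff): the last context line mentions visualizing the proto in a Jupyter notebook. The following is a minimal sketch of how that might look, not the README's exact code. It assumes that the nbextension is served at `/nbextensions/facets-dist/facets-jupyter.html`, that the element is named `facets-overview`, and that it accepts a base64-encoded serialized proto through a `protoInput` property. The helper name `facets_overview_html` is hypothetical.

```python
import base64

# Assumed HTML snippet: the import path, element name and `protoInput`
# property are assumptions about the installed nbextension.
HTML_TEMPLATE = """<link rel="import" href="/nbextensions/facets-dist/facets-jupyter.html">
<facets-overview id="elem"></facets-overview>
<script>
  document.querySelector("#elem").protoInput = "{protostr}";
</script>"""


def facets_overview_html(serialized_proto: bytes) -> str:
    """Return HTML that embeds a serialized stats proto (hypothetical helper)."""
    # Base64-encode the bytes so they can sit safely inside a JS string literal.
    protostr = base64.b64encode(serialized_proto).decode("utf-8")
    return HTML_TEMPLATE.format(protostr=protostr)
```

In a notebook, with `proto` being the object built in the snippet above the hunk, this could be used as `display(HTML(facets_overview_html(proto.SerializeToString())))` after `from IPython.display import display, HTML`.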
