Commit 0979dd0

machow and juliasilge authored
docs: cleanup wip sections (#169)
Co-authored-by: Julia Silge <[email protected]>
1 parent c36de26 commit 0979dd0

4 files changed: +36 -84 lines changed

docs/api/index.rst

Lines changed: 10 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -9,19 +9,19 @@ Board Pin Methods
99

1010
=========================================== =======================================================
1111
:meth:`.pin_read`, :meth:`.pin_write` |read-write|
12-
:meth:`.pin_meta` |arrange|
13-
:meth:`.pin_download`, :meth:`.pin_upload` |select|
14-
:meth:`.pin_versions`, TODO complete |mutate|
15-
:meth:`.pin_list` |summarize|
16-
:meth:`.pin_search` |group_by|
12+
:meth:`.pin_meta` |meta|
13+
:meth:`.pin_download`, :meth:`.pin_upload` |download|
14+
:meth:`.pin_versions` |versions|
15+
:meth:`.pin_list` |list|
16+
:meth:`.pin_search` |search|
1717
=========================================== =======================================================
1818

1919
.. |read-write| replace:: Read and write objects to and from a board
20-
.. |arrange| replace:: Retrieve metadata for a pin
21-
.. |select| replace:: Upload and download files to and from a board
22-
.. |mutate| replace:: List, delete, and prune pin versions
23-
.. |summarize| replace:: List all pins
24-
.. |group_by| replace:: Search for pins
20+
.. |meta| replace:: Retrieve metadata for a pin
21+
.. |download| replace:: Upload and download files to and from a board
22+
.. |versions| replace:: List, delete, and prune pin versions
23+
.. |list| replace:: List all pins
24+
.. |search| replace:: Search for pins
2525

2626

2727
Board Constructors

docs/getting_started.Rmd

Lines changed: 20 additions & 69 deletions
@@ -71,10 +71,7 @@ But you can choose another option depending on your goals:
7171
- `type = "csv"` uses `to_csv()` from pandas to create a `.csv` file. CSVs can be read by any application, but only support simple columns (e.g. numbers, strings, dates), can take up a lot of disk space, and can be slow to read.
7272
- `type = "joblib"` uses `joblib.dump()` to create a binary python data file. See the [joblib docs](https://joblib.readthedocs.io/en/latest/) for more information.
7373
- `type = "arrow"` uses `pyarrow` to create an arrow/feather file. [Arrow](https://arrow.apache.org) is a modern, language-independent, high-performance file format designed for data science. Not every tool can read arrow files, but support is growing rapidly.
74-
75-
🚧 Data formats TODO 🚧
76-
77-
- `type = "json"` uses `jsonlite::write_json()` to create a `.json` file. Pretty much every programming language can read json files, but they only work well for nested lists.
74+
- `type = "json"` uses `json.dump()` to create a `.json` file. Pretty much every programming language can read json files, but they only work well for nested lists.
7875

7976
After you've pinned an object, you can read it back with `pin_read()`:
8077

@@ -135,41 +132,15 @@ board.pin_meta("mtcars")
135132
While we’ll do our best to keep the automatically generated metadata consistent over time, I’d recommend manually capturing anything you really care about in metadata.
136133

137134

138-
## 🚧 Versioning
139-
140-
141-
> ⚠️: Warning the examples in this section use joblib to read and write data. Joblib uses the pickle format, and **pickle files are not secure**. Only read pickle files you trust. In order to read pickle files, set the `allow_pickle_read=True` argument. See: https://docs.python.org/3/library/pickle.html.
142-
143-
144-
> ⚠️: Turning off versioning is not yet implemented; all Python pins are versioned. These docs are copied from R pins.
145-
146-
In many situations it's useful to version pins, so that writing to an existing pin does not replace the existing data, but instead adds a new copy.
147-
There are two ways to turn versioning on:
135+
## Versioning
148136

149-
- When you create a board you can turn versioning on for every pin in that board:
150-
151-
```python
152-
board2 = board_temp(versioned = TRUE)
153-
```
154-
155-
- When you write a pin, you can specifically request that versioning be turned on for that pin:
156-
157-
```python
158-
board2 = board_temp()
159-
board2.pin_write(mtcars, versioned = TRUE)
160-
```
161-
162-
Most boards have versioning on by default.
163-
The primary exception is `board_folder()` since that stores data on your computer, and there's no automated way to clean up the data your saving.
164-
165-
Once you have turned versioning on, every `pin_write()` will create a new version:
137+
Every `pin_write()` will create a new version:
166138

167139
```{python}
168-
board2 = board_temp(versioned = True, allow_pickle_read=True)
169-
170-
board2.pin_write([1,2,3,4,5], name = "x", type = "joblib", title="TODO")
171-
board2.pin_write([1,2,3], name = "x", type = "joblib", title="TODO")
172-
board2.pin_write([1,2], name = "x", type = "joblib", title="TODO")
140+
board2 = board_temp()
141+
board2.pin_write([1,2,3,4,5], name = "x", type = "json")
142+
board2.pin_write([1,2,3], name = "x", type = "json")
143+
board2.pin_write([1,2], name = "x", type = "json")
173144
board2.pin_versions("x")
174145
```
175146

@@ -186,55 +157,35 @@ version = board2.pin_versions("x").version[1]
186157
board2.pin_read("x", version = version)
187158
```
188159

189-
## 🚧 Reading and writing files
190-
191-
> ⚠️: `pin_upload()` and `pin_download()` are not yet implemented in Python. These docs are copied from R pins.
160+
## Storing models
192161

193-
So far we've focussed on `pin_write()` and `pin_read()` which work with R objects.
194-
pins also provides the lower-level `pin_upload()` and `pin_download()` which work with files on disk.
195-
You can use them to share types of data that are otherwise unsupported by pins.
162+
> ⚠️: Warning the examples in this section use joblib to read and write data. Joblib uses the pickle format, and **pickle files are not secure**. Only read pickle files you trust. In order to read pickle files, set the `allow_pickle_read=True` argument. See: https://docs.python.org/3/library/pickle.html.
196163
197-
`pin_upload()` works like `pin_write()` but instead of an R object you give it a vector of paths.
198-
I'll start by creating a few files in the temp directory:
164+
You can write a pin with `type="joblib"` to store arbitrary python objects, including fitted models from packages like [scikit-learn](https://scikit-learn.org/).
199165

166+
For example, suppose you wanted to store a custom `namedtuple` object.
200167

201168
```{python}
202-
# paths = file.path(tempdir(), c("mtcars.csv", "alphabet.txt"))
203-
# write.csv(mtcars, paths[[1]])
204-
# writeLines(letters, paths[[2]])
205-
```
169+
from collections import namedtuple
206170
207-
Now I can upload those to the board:
171+
board3 = board_temp(allow_pickle_read=True)
208172
209-
```{python}
210-
# board.pin_upload(paths, "example")
211-
```
173+
Coords = namedtuple("Coords", ["x", "y"])
174+
coords = Coords(1, 2)
212175
213-
`pin_download()` returns a vector of paths:
214-
215-
```{python}
216-
# board.pin_download("example")
176+
coords
217177
```
218178

219-
220-
You should treat these paths as internal implementation details --- never modify them and never save them for use outside of pins.
221-
222-
Note that you can't `pin_read()` something you pinned with `pin_upload()`:
223-
179+
Using `type="joblib"` lets you store and read back the custom `coords` object.
224180

225181
```{python}
226-
# board.pin_read("example")
227-
```
228-
229-
But you can `pin_download()` something that you've pinned with `pin_write()`:
182+
board3.pin_write(coords, "my_coords", type="joblib")
230183
231-
```{python}
232-
# board.pin_download("mtcars")
184+
board3.pin_read("my_coords")
233185
```
234186

235-
## Caching
236187

237-
> ⚠️: `board_url` is not yet implemented in Python. These docs are copied from R pins.
188+
## Caching
238189

239190
The primary purpose of pins is to make it easy to share data.
240191
But pins is also designed to help you spend as little time as possible downloading data.
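
Reviewer note: the revised docs use `type = "json"` for plain lists and `type = "joblib"` for the custom `namedtuple`. The sketch below illustrates why, using only the standard-library `json` module rather than pins itself. It assumes the `"json"` pin type behaves like `json.dump()`, as the updated bullet states. It is an illustration of JSON's limits, not a description of how pins stores data.

```python
import json
from collections import namedtuple

# Same illustrative object as in the "Storing models" example.
Coords = namedtuple("Coords", ["x", "y"])
coords = Coords(1, 2)

# A plain list survives a JSON round trip unchanged.
values = [1, 2, 3, 4, 5]
assert json.loads(json.dumps(values)) == values

# A namedtuple is serialized as a JSON array, so it comes back as a
# plain list and the field names are lost.
restored = json.loads(json.dumps(coords))
print(type(restored).__name__, restored)
```

Because the round trip returns a list rather than a `Coords`, a pickle-based format such as joblib is the safer choice when the original Python type has to be recovered, subject to the pickle security warning in the docs.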

docs/intro.md

Lines changed: 5 additions & 4 deletions
@@ -24,6 +24,8 @@ The pins package publishes data, models, and other Python objects, making it eas
2424
You can pin objects to a variety of pin *boards*, including folders (to share on a networked drive or with services like DropBox), RStudio Connect, Amazon S3, Google Cloud Storage, and Azure Datalake.
2525
Pins can be automatically versioned, making it straightforward to track changes, re-run analyses on historical data, and undo mistakes.
2626

27+
You can use pins from R as well as Python. For example, you can use one language to read a pin created with the other. Learn more about [pins for R](https://pins.rstudio.com).
28+
2729
## Installation
2830

2931
To install the released version from PyPI:
@@ -67,18 +69,17 @@ board.pin_read("mtcars")
6769
A board on your computer is a good place to start, but the real power of pins comes when you use a board that's shared with multiple people.
6870
To get started, you can use `board_folder()` with a directory on a shared drive or in DropBox, or if you use [RStudio Connect](https://www.rstudio.com/products/connect/) you can use `board_rsconnect()`:
6971

70-
🚧 TODO: add informational messages shown in display below
71-
7272
+++
7373

7474
```python
7575
from pins import board_rsconnect
7676

7777
board = board_rsconnect()
78-
#> Connecting to RSC 1.9.0.1 at <https://connect.rstudioservices.com>
7978

8079
board.pin_write(tidy_sales_data, "hadley/sales-summary", type = "csv")
81-
#> Writing to pin 'hadley/sales-summary'
80+
#> Writing pin:
81+
#> Name: 'hadley/sales-summary'
82+
#> Version: ...
8283
```
8384

8485
+++

pins/boards.py

Lines changed: 1 addition & 1 deletion
@@ -241,7 +241,7 @@ def pin_write(
241241
Pin name.
242242
type:
243243
File type used to save ``x`` to disk. May be "csv", "arrow", "parquet",
244-
"joblib", or "file".
244+
"joblib", "json", or "file".
245245
title:
246246
A title for the pin; most important for shared boards so that others
247247
can understand what the pin contains. If omitted, a brief description
