docs/getting_started.Rmd
+20 -69: 20 additions & 69 deletions
@@ -71,10 +71,7 @@ But you can choose another option depending on your goals:
71
71
-`type = "csv"` uses `to_csv()` from pandas to create a `.csv` file. CSVs can be read by any application, but they only support simple columns (e.g. numbers, strings, dates), can take up a lot of disk space, and can be slow to read.
72
72
-`type = "joblib"` uses `joblib.dump()` to create a binary python data file. See the [joblib docs](https://joblib.readthedocs.io/en/latest/) for more information.
73
73
-`type = "arrow"` uses `pyarrow` to create an arrow/feather file. [Arrow](https://arrow.apache.org) is a modern, language-independent, high-performance file format designed for data science. Not every tool can read arrow files, but support is growing rapidly.
74
-
75
-
🚧 Data formats TODO 🚧
76
-
77
-
-`type = "json"` uses `jsonlite::write_json()` to create a `.json` file. Pretty much every programming language can read json files, but they only work well for nested lists.
74
+
-`type = "json"` uses `json.dump()` to create a `.json` file. Pretty much every programming language can read json files, but they only work well for nested lists.
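
As a rough illustration of the `json` trade-off, the sketch below uses only the standard library's `json` module (not pins itself) to round-trip a nested list through a file, then shows a value that is not a nested list, a `datetime.date`, being rejected. The file name `x.json` and the variable names are assumptions made for the example.

```python
import datetime
import json
import os
import tempfile

nested = [1, 2, [3, 4, [5]], ["a", "b"]]

# Write the nested list to a temporary .json file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x.json")  # illustrative file name
    with open(path, "w") as f:
        json.dump(nested, f)
    with open(path) as f:
        round_tripped = json.load(f)

# Nested lists of numbers and strings survive unchanged.
lists_survive = round_tripped == nested

# Other Python types are not JSON-serializable out of the box.
try:
    json.dumps(datetime.date(2022, 1, 1))
    date_rejected = False
except TypeError:
    date_rejected = True
```

For data with dates or mixed column types, one of the other formats listed above is likely a better fit.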
78
75
79
76
After you've pinned an object, you can read it back with `pin_read()`:
80
77
@@ -135,41 +132,15 @@ board.pin_meta("mtcars")
135
132
While we’ll do our best to keep the automatically generated metadata consistent over time, I’d recommend manually capturing anything you really care about in metadata.
136
133
137
134
138
-
## 🚧 Versioning
139
-
140
-
141
-
> ⚠️: Warning the examples in this section use joblib to read and write data. Joblib uses the pickle format, and **pickle files are not secure**. Only read pickle files you trust. In order to read pickle files, set the `allow_pickle_read=True` argument. See: https://docs.python.org/3/library/pickle.html.
142
-
143
-
144
-
> ⚠️: Turning off versioning is not yet implemented; all Python pins are versioned. These docs are copied from R pins.
145
-
146
-
In many situations it's useful to version pins, so that writing to an existing pin does not replace the existing data, but instead adds a new copy.
147
-
There are two ways to turn versioning on:
135
+
## Versioning
148
136
149
-
- When you create a board you can turn versioning on for every pin in that board:
150
-
151
-
```python
152
-
board2 = board_temp(versioned=TRUE)
153
-
```
154
-
155
-
- When you write a pin, you can specifically request that versioning be turned on for that pin:
156
-
157
-
```python
158
-
board2 = board_temp()
159
-
board2.pin_write(mtcars, versioned=TRUE)
160
-
```
161
-
162
-
Most boards have versioning on by default.
163
-
The primary exception is`board_folder()` since that stores data on your computer, and there's no automated way to clean up the data your saving.
164
-
165
-
Once you have turned versioning on, every `pin_write()` will create a new version:
board2.pin_write([1,2,3,4,5], name = "x", type = "json")
142
+
board2.pin_write([1,2,3], name = "x", type = "json")
143
+
board2.pin_write([1,2], name = "x", type = "json")
173
144
board2.pin_versions("x")
174
145
```
175
146
@@ -186,55 +157,35 @@ version = board2.pin_versions("x").version[1]
186
157
board2.pin_read("x", version = version)
187
158
```
188
159
189
-
## 🚧 Reading and writing files
190
-
191
-
> ⚠️: `pin_upload()`and`pin_download()` are not yet implemented in Python. These docs are copied from R pins.
160
+
## Storing models
192
161
193
-
So far we've focussed on `pin_write()` and `pin_read()` which work with R objects.
194
-
pins also provides the lower-level `pin_upload()`and`pin_download()` which work with files on disk.
195
-
You can use them to share types of data that are otherwise unsupported by pins.
162
+
> ⚠️: Warning: the examples in this section use joblib to read and write data. Joblib uses the pickle format, and **pickle files are not secure**. Only read pickle files you trust. To read pickle files, set the `allow_pickle_read=True` argument. See: https://docs.python.org/3/library/pickle.html.
196
163
197
-
`pin_upload()` works like `pin_write()` but instead of an R object you give it a vector of paths.
198
-
I'll start by creating a few files in the temp directory:
164
+
You can write a pin with `type="joblib"` to store arbitrary Python objects, including fitted models from packages like [scikit-learn](https://scikit-learn.org/).
199
165
166
+
For example, suppose you wanted to store a custom `namedtuple` object.
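
To see why a `namedtuple` calls for `type="joblib"` rather than `type="json"`, the sketch below pushes one through the standard library's `json` module, without pins. The values come back, but as a plain list with the field names and class gone. `Coords` and its fields are hypothetical names chosen for illustration.

```python
import json
from collections import namedtuple

# Hypothetical record type, used only for illustration.
Coords = namedtuple("Coords", ["x", "y"])
coords = Coords(x=1, y=2)

# JSON has no notion of Python classes, so the namedtuple is encoded as an array.
encoded = json.dumps(coords)
decoded = json.loads(encoded)

# The decoded value is a plain list, not a Coords instance.
type_preserved = isinstance(decoded, Coords)
```

A pickle-based format keeps the class information, which is the reason for the security warning above.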
docs/intro.md
+5 -4: 5 additions & 4 deletions
@@ -24,6 +24,8 @@ The pins package publishes data, models, and other Python objects, making it eas
24
24
You can pin objects to a variety of pin *boards*, including folders (to share on a networked drive or with services like DropBox), RStudio Connect, Amazon S3, Google Cloud Storage, and Azure Datalake.
25
25
Pins can be automatically versioned, making it straightforward to track changes, re-run analyses on historical data, and undo mistakes.
26
26
27
+
You can use pins from R as well as Python. For example, you can use one language to read a pin created with the other. Learn more about [pins for R](https://pins.rstudio.com).
28
+
27
29
## Installation
28
30
29
31
To install the released version from PyPI:
@@ -67,18 +69,17 @@ board.pin_read("mtcars")
67
69
A board on your computer is a good place to start, but the real power of pins comes when you use a board that's shared with multiple people.
68
70
To get started, you can use `board_folder()` with a directory on a shared drive or in DropBox, or if you use [RStudio Connect](https://www.rstudio.com/products/connect/) you can use `board_rsconnect()`:
69
71
70
-
🚧 TODO: add informational messages shown in display below
71
-
72
72
+++
73
73
74
74
```python
75
75
from pins import board_rsconnect
76
76
77
77
board = board_rsconnect()
78
-
#> Connecting to RSC 1.9.0.1 at <https://connect.rstudioservices.com>