---
output: github_document
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
# Use a temporary directory as working directory and add the yaml example
tmp <- withr::local_tempdir()
system.file("examples/readme.yml", package = "connector") |>
  file.copy(to = file.path(tmp, "_connector.yml"))
knitr::opts_knit$set(root.dir = tmp)
```
# connector <a href="https://novonordisk-opensource.github.io/connector/"><img src="man/figures/logo.png" align="right" height="138" alt="connector website" /></a>
<!-- badges: start -->
[GitHub Actions checks](https://github.com/NovoNordisk-OpenSource/connector/actions/workflows/check_and_co.yaml)
[Codecov test coverage](https://app.codecov.io/gh/NovoNordisk-OpenSource/connector)
[CRAN status](https://CRAN.R-project.org/package=connector)
<!-- badges: end -->
## Overview
`connector` provides a consistent interface for connecting to different data sources,
such as simple file storage systems and databases.
It also gives the option to use a central configuration file to manage your connections in your project,
which ensures a consistent reference to the same data source across different scripts in your project,
and enables you to easily switch between different data sources.
The connector package can create connections to file system folders using `connector_fs()` and to general databases using `connector_dbi()`, which is built on top of the `{DBI}` package.
connector also has a series of expansion packages that allow you to easily connect to more specific data sources:
* [`{connector.databricks}`](https://novonordisk-opensource.github.io/connector.databricks/): Connect to Databricks
* [`{connector.sharepoint}`](https://novonordisk-opensource.github.io/connector.sharepoint/): Connect to SharePoint sites
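
The two built-in connector types can also be created directly, without a configuration file. The chunk below is a minimal sketch rather than the canonical workflow: it assumes that `connector_fs()` takes the folder as `path`, that `connector_dbi()` takes a DBI driver as `drv` and forwards further arguments such as `dbname` to `DBI::dbConnect()`, and that `{RSQLite}` is installed. The object names `folder` and `database` are chosen for illustration only.

```r
library(connector)

# Create a fresh, empty folder so the example does not collide with existing files
demo_dir <- tempfile("connector-demo-")
dir.create(demo_dir)

# File system connector pointing at that folder (argument name `path` is assumed)
folder <- connector_fs(path = demo_dir)

# Database connector to an in-memory SQLite database
# (`drv` and `dbname` are assumed to be passed on to DBI::dbConnect())
database <- connector_dbi(drv = RSQLite::SQLite(), dbname = ":memory:")

# The same generic functions work on both connector types
folder |>
  write_cnt(x = mtcars, name = "mtcars.csv")
database |>
  write_cnt(x = mtcars, name = "mtcars")

folder |>
  list_content_cnt()
database |>
  list_content_cnt()
```

Creating connectors this way is convenient for quick experiments; the configuration file described under Usage keeps the same definitions in one place for a whole project.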
## Installation
```{r, eval=FALSE}
# Install the released version from CRAN:
install.packages("connector")
# Install the development version from GitHub:
pak::pak("NovoNordisk-OpenSource/connector")
```
## Usage
The recommended way of using connector is to specify a common yaml configuration file in your project
that contains the connection details to all your data sources.
A simple example creating connectors to both a folder and a database is shown below:
`_connector.yml:`
```yaml
`r paste(readLines("_connector.yml"), collapse = "\n")`
```
It is easy to initialize this file with:
```{r init-connector, eval=FALSE}
connector::use_connector()
```
First we specify common metadata for the connectors, which here is a temporary folder
that we want to use. Afterwards we specify the data sources needed in the project, and their specifications.
The first we name "folder", specify the type to be `connector_fs()`, and the path to the folder.
The second is a database connector to an in-memory SQLite database, which we specify using the `connector_dbi()` type.
It uses `DBI::dbConnect()` to initialize the connection, so we also give the DBI driver to use and the arguments to pass to it.
To connect and create the connectors we use `connect()` with the configuration file as input:
```{r connect}
library(connector)
db <- connect("_connector.yml")
print(db)
```
This creates a `connectors` object that contains each `connector`. Printing an individual `connector`
shows some general information on its methods and specifications.
```{r print-connector}
print(db$database)
```
We are now ready to use the `connectors`, so we can start by writing some data to the `folder` one:
```{r example-fs}
# Initially it is empty
db$folder |>
  list_content_cnt()
# Create some data
cars <- mtcars |>
  tibble::as_tibble(rownames = "car")
# Write to folder as a parquet file
db$folder |>
  write_cnt(x = cars, name = "cars.parquet")
# Now the folder contains the file
db$folder |>
  list_content_cnt()
# And we can read it back in
db$folder |>
  read_cnt(name = "cars.parquet")
```
Here the parquet format has been used, but when using a `connector_fs` it is possible to read and write several different file types.
See `read_file()` and `write_file()` for more information.
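
As a rough illustration of using other file types, the sketch below writes the same data with two different file extensions. It assumes that CSV and RDS are among the supported formats and that the format is chosen from the file extension in `name`; check `read_file()` and `write_file()` for the formats actually available in your installed version. The folder and object names are made up for the example.

```r
library(connector)

# Fresh, empty folder for the example
formats_dir <- tempfile("connector-formats-")
dir.create(formats_dir)
folder <- connector_fs(path = formats_dir)

# Write the same data in two formats (CSV and RDS support is assumed here)
folder |>
  write_cnt(x = mtcars, name = "mtcars.csv")
folder |>
  write_cnt(x = mtcars, name = "mtcars.rds")

# Both files are now listed in the folder
folder |>
  list_content_cnt()

# Read one of them back in
mtcars_rds <- folder |>
  read_cnt(name = "mtcars.rds")
```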
For the `database` connector it works in the same way:
```{r example-dbi}
# Initially no tables exist
db$database |>
  list_content_cnt()
# Write cars to the database as a table
db$database |>
  write_cnt(x = cars, name = "cars")
# Now the cars table exists
db$database |>
  list_content_cnt()
# And we can read it back in
db$database |>
  read_cnt(name = "cars") |>
  dplyr::as_tibble()
```
## Useful links
For more information on how to use the package, see the following links:
* `connect()` for more documentation and how to specify the configuration file
* `vignette("connector")` for more examples and how to use the package
* `vignette("customize")` on how to create your own connector and customize behavior
* `help("connector-options")` for all the options available to customize the behavior of `connector`
* [NovoNordisk-OpenSource/R-packages](https://novonordisk-opensource.github.io/R-packages/) for an overview of connector and other R packages published by Novo Nordisk