-A Julia interface to the Sqlite library and support for operations on DataFrames
+A Julia interface to the SQLite library and support for operations on DataFrames

 Installation through the Julia package manager:
 ```julia
 julia> Pkg.init() # Creates the Julia package repository (only runs once for all packages)
-julia> Pkg.add("Sqlite") # Creates the Sqlite repo folder and downloads the Sqlite package + dependency (if needed)
-julia> using Sqlite # Loads the Sqlite module for use (needs to be run with each new Julia instance)
+julia> Pkg.add("SQLite") # Creates the SQLite repo folder and downloads the SQLite package + dependency (if needed)
+julia> using SQLite # Loads the SQLite module for use (needs to be run with each new Julia instance)
 ```
 ## Package Documentation

 #### Functions
-* `Sqlite.connect(file::String)`
+* `connect(file::String)`

-`connect` takes the `file` string argument, the name of an existing Sqlite database to open; if the database doesn't exist, it is created.
+`connect` takes the `file` string argument, the name of an existing SQLite database to open; if the database doesn't exist, it is created.

-`connect` returns a `SqliteDB` type which contains basic information
-about the connection and Sqlite handle pointers.
+`connect` returns a `SQLiteDB` type which contains basic information
+about the connection and SQLite handle pointers.

 `connect` can be used by storing the `Connection` type in
 a variable to be able to close or facilitate handling multiple
 databases like so:
 ```julia
-co = Sqlite.connect("mydatasource")
+co = connect("mydatasource")
 ```
-But it's unnecessary to store the `SqliteDB`, as an exported
-`sqlitedb` variable holds the most recently created `SqliteDB` type and other
-Sqlite functions (e.g. `query`) will use it by default in the absence of a specified connection.
+But it's unnecessary to store the `SQLiteDB`, as an exported
+`sqlitedb` variable holds the most recently created `SQLiteDB` type and other
+SQLite functions (e.g. `query`) will use it by default in the absence of a specified connection.
 If a connection type isn't specified as the first positional argument, the query will be executed against
 the default connection (stored in the exported variable `sqlitedb` if you'd like to
@@ -38,38 +38,38 @@ inspect).
 Once the query is executed, the resultset is stored in a
 `DataFrame` by default.

-For the general user, a simple `Sqlite.query(querystring)` is enough to return a single resultset in a DataFrame. Results are stored in the passed SqliteDB type's resultset field (i.e. `sqlitedb.resultset`). Results are stored by default to avoid immediate garbage collection and provide access for the user even if the resultset returned by query isn't stored in a variable.
+For the general user, a simple `query(querystring)` is enough to return a single resultset in a DataFrame. Results are stored in the passed SQLiteDB type's resultset field (i.e. `sqlitedb.resultset`). Results are stored by default to avoid immediate garbage collection and provide access for the user even if the resultset returned by query isn't stored in a variable.
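
The default-connection behaviour described above can be pictured with a small stand-alone sketch. It does not use the package: `MockDB`, `DEFAULT_DB`, `mock_connect`, and `mock_query` are hypothetical names invented here, and the real `SQLiteDB` also carries a handle pointer that is left out.

```julia
# Hypothetical stand-in for SQLiteDB (the real type also holds a handle pointer).
mutable struct MockDB
    file::String
    resultset::Any
end

# Module-level default, playing the role of the exported `sqlitedb` variable.
const DEFAULT_DB = Ref(MockDB("", nothing))

# Opening a connection also makes it the default one.
function mock_connect(file::String)
    db = MockDB(file, nothing)
    DEFAULT_DB[] = db
    return db
end

# Run a "query" against an explicit connection and keep the result on it,
# so the result stays reachable even if the caller discards the return value.
function mock_query(db::MockDB, q::String)
    result = "rows for: " * q   # placeholder for a real result set
    db.resultset = result
    return result
end

# With no connection given, fall back to the most recent one.
mock_query(q::String) = mock_query(DEFAULT_DB[], q)

first_db = mock_connect("a.sqlite")
second_db = mock_connect("b.sqlite")
mock_query("SELECT 1")             # uses second_db, the most recent connection
mock_query(first_db, "SELECT 2")   # explicit connection as first positional argument
```

Keeping the last result on the connection object is what lets a user recover it later through the default variable, at the cost of holding that memory until the next query replaces it.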
-`createtable` takes its `DataFrame` argument and converts it to an Sqlite table in the specified `SqliteDB`. By default, the resulting table will have the same name as the DataFrame variable, unless a name is passed explicitly with the `name` keyword argument.
+`createtable` takes either a `DataFrame` argument or a file name string. The DataFrame or file is converted to an SQLite table in the specified `SQLiteDB`. By default, the resulting table will have the same name as the DataFrame variable or file name, unless a name is passed explicitly with the `name` keyword argument. The `delim`, `header`, `types`, and `infer` keyword arguments are for use with files. `delim` specifies the file delimiter (comma ',', tab '\t', etc.). `header` specifies whether the file has a header and generates column names if needed. `types` allows the user to specify the column types to be read in, while `infer` allows an algorithm to figure out each column's type before committing to the SQLite table. Note that if the `types` argument is empty and `infer=false`, then all values will be passed as Strings/text, which ends up being very fast, but without any resulting type information.
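
The README does not describe the inference algorithm behind `infer`, so the following is only a rough sketch of one way per-column type inference over a delimited file could work. `infer_column_type` and `split_fields` are hypothetical helpers, not part of the package, and quoting or missing values are ignored.

```julia
# Hypothetical helper: pick the narrowest type that parses every value in a column.
function infer_column_type(values::Vector{String})
    all(v -> tryparse(Int, v) !== nothing, values) && return Int
    all(v -> tryparse(Float64, v) !== nothing, values) && return Float64
    return String   # fallback, comparable to passing everything as text
end

# Split one delimited line into fields (no quoting support in this sketch).
split_fields(line::String, delim::Char) = String.(split(line, delim))

lines = ["id,price,name", "1,2.5,apple", "2,3,pear"]
header = split_fields(lines[1], ',')                  # first line treated as the header
rows = [split_fields(l, ',') for l in lines[2:end]]
columns = [[row[i] for row in rows] for i in eachindex(header)]
types = map(infer_column_type, columns)
```

Skipping this pass and treating every field as text avoids parsing each value, which is consistent with the note that leaving `types` empty with `infer=false` is fast but loses type information.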
 `droptable` is pretty self-explanatory. It's really just a convenience wrapper around `query` to execute a DROP TABLE command.

 * `sqldf(q::String)`

-`sqldf` mirrors the function of the same name in R, allowing common SQL operations on Julia DataFrames. The passed query string is parsed and the DataFrames named in the FROM and JOIN statements are first converted to Sqlite tables and then the SELECT statement is run on them. The tables are dropped after the query is run and the result is returned as a DataFrame.
+`sqldf` mirrors the function of the same name in R, allowing common SQL operations on Julia DataFrames. The passed query string is parsed and the DataFrames named in the FROM and JOIN statements are first converted to SQLite tables and then the SELECT statement is run on them. The tables are dropped after the query is run and the result is returned as a DataFrame.

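
As an illustration of the parsing step that `sqldf` is described as performing, here is a minimal sketch that pulls the names following FROM and JOIN out of a query string. `referenced_tables` is a hypothetical helper written for this example; how the package actually parses the query is not shown in this README, and this version does not handle subqueries, quoted identifiers, or comma-separated FROM lists.

```julia
# Hypothetical helper: collect the identifiers that follow FROM or JOIN.
function referenced_tables(q::String)
    pattern = r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)"i
    tables = String[]
    for m in eachmatch(pattern, q)
        name = String(m.captures[1])
        name in tables || push!(tables, name)   # keep each name once, in order
    end
    return tables
end

q = "SELECT a.x, b.y FROM sales a JOIN regions b ON a.id = b.id"
referenced_tables(q)
```

Each name found this way would correspond to a DataFrame to load as a temporary table before the SELECT runs, and to drop afterwards.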
-* Planned Functions

-`createtable` specifying a delimited (CSV, TSV, etc.) file for the table to be created from. `readdlmsql` will then be possible, allowing a raw file to be read and a DataFrame to be returned according to a given SQL statement.
+
+`createtable` specifying a delimited (csv, tsv, etc.) file for the table to be created from. `readdlmsql` will then be possible, allowing a raw file to be read and a DataFrame to be returned according to a given SQL statement.

 #### Types
-* `SqliteDB`
+* `SQLiteDB`

-Stores information about an Sqlite database connection. Names include `file` for the Sqlite database filename, `handle` as the internal connection handle pointer, and `resultset` which
-stores the last resultset returned from a `Sqlite.query` call.
+Stores information about an SQLite database connection. Names include `file` for the SQLite database filename, `handle` as the internal connection handle pointer, and `resultset` which
+stores the last resultset returned from a `query` call.

 * `typealias TableInput Union(DataFrame,String)`

 #### Variables
 * `sqlitedb`
-Global, exported variable that initially holds a null `SqliteDB` type until a connection is successfully made by `Sqlite.connect`. It is used by `query` as the default `SqliteDB` if none is explicitly specified.
+Global, exported variable that initially holds a null `SQLiteDB` type until a connection is successfully made by `connect`. It is used by `query` as the default `SQLiteDB` if none is explicitly specified.

 ### Known Issues
-* We've had limited Sqlite testing across platforms, so it may happen that `Sqlite.jl` doesn't recognize your Sqlite shared library. The current approach, since Sqlite doesn't come standard on many platforms, is to provide the shared library in the `Sqlite.jl/lib` folder. If this doesn't work on your machine, you'll need to manually locate your Sqlite shared library (searching for something along the lines of
+* We've had limited SQLite testing across platforms, so it may happen that `SQLite.jl` doesn't recognize your SQLite shared library. The current approach, since SQLite doesn't come standard on many platforms, is to provide the shared library in the `SQLite.jl/lib` folder. If this doesn't work on your machine, you'll need to manually locate your SQLite shared library (searching for something along the lines of
 `libsqlite3` or `sqlite3`, or compiling/installing it yourself) and then run the following:
 ```julia
 const sqlite3_lib = "path/to/library/sqlite3.so" # or .dylib on OSX
@@ -78,6 +78,4 @@ stores the last resultset returned from a `Sqlite.query` call.
 That said, if you end up doing this, open an issue on GitHub to let me know if the library is on your platform by default and I can add it as one of the defaults to check for.

 ### TODO
-* Overload `createtable` to take a delimited filename
-* Function `readdlmsql` similar to `read.csv.sql` in R
 * Additional benchmarking: I've only tested `createtable` so far, as I was initially having performance issues with it, but now we're even with the RSQLite package in R (whose functions are all implemented in C).