Commit bfb8939

Update README.md
Add a short tutorial on regexp and UDFs.
1 parent 6906127 commit bfb8939

1 file changed

+159
-0
lines changed

README.md

Lines changed: 159 additions & 0 deletions
@@ -73,3 +73,162 @@ A Julia interface to the SQLite library and support for operations on DataFrames
* `drop(db::SQLiteDB,table::String)`

`drop` is pretty self-explanatory. It's really just a convenience wrapper around `query` to execute a DROP TABLE command, while also calling "VACUUM" to clean out freed memory from the database.

#### User Defined Functions

##### SQLite Regular Expressions

SQLite provides syntax for calling the [`regexp` function](http://sqlite.org/lang_expr.html#regexp) from inside `WHERE` clauses. However, SQLite does not provide a default implementation of the `regexp` function, so SQLite.jl creates one automatically when you open a database. The function can be called in the following ways (the examples use the [Chinook Database](http://chinookdatabase.codeplex.com/)):

```julia
julia> using SQLite

julia> db = SQLiteDB("Chinook_Sqlite.sqlite")

julia> # using SQLite's in-built syntax

julia> query(db, "SELECT FirstName, LastName FROM Employee WHERE LastName REGEXP 'e(?=a)'")
1x2 ResultSet
| Row | "FirstName" | "LastName" |
|-----|-------------|------------|
| 1   | "Jane"      | "Peacock"  |

julia> # explicitly calling the regexp() function

julia> query(db, "SELECT * FROM Genre WHERE regexp('e[trs]', Name)")
6x2 ResultSet
| Row | "GenreId" | "Name"               |
|-----|-----------|----------------------|
| 1   | 3         | "Metal"              |
| 2   | 4         | "Alternative & Punk" |
| 3   | 6         | "Blues"              |
| 4   | 13        | "Heavy Metal"        |
| 5   | 23        | "Alternative"        |
| 6   | 25        | "Opera"              |

julia> # you can even do strange things like this if you really want

julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
2x2 ResultSet
| Row | "GenreId" | "Name" |
|-----|-----------|--------|
| 1   | 1         | "Rock" |
| 2   | 2         | "Jazz" |

julia> query(db, "INSERT INTO Genre VALUES (regexp('^word', 'this is a string'), 'My Genre')")
1x1 ResultSet
| Row | "Rows Affected" |
|-----|-----------------|
| 1   | 0               |

julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
2x2 ResultSet
| Row | "GenreId" | "Name"     |
|-----|-----------|------------|
| 1   | 0         | "My Genre" |
| 2   | 1         | "Rock"     |
```

Because regular expressions make heavy use of escape characters, you may run into problems where Julia parses out some backslashes in your query; for example, `"\y"` simply becomes `"y"`. As a result, the following two queries are identical:

```julia
julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
1x1 ResultSet
| Row | "Rows Affected" |
|-----|-----------------|
| 1   | 0               |

julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-d'")
1x1 ResultSet
| Row | "Rows Affected" |
|-----|-----------------|
| 1   | 0               |
```

This can be avoided in two ways: either escape each backslash yourself, or use the `sr"..."` string literal that SQLite.jl exports. The previous query can then be run successfully like so:

```julia
julia> # manually escaping backslashes

julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\\d'")
1x2 ResultSet
| Row | "MediaTypeId" | "Name"                        |
|-----|---------------|-------------------------------|
| 1   | 3             | "Protected MPEG-4 video file" |

julia> # using sr"..."

julia> query(db, sr"SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
1x2 ResultSet
| Row | "MediaTypeId" | "Name"                        |
|-----|---------------|-------------------------------|
| 1   | 3             | "Protected MPEG-4 video file" |
```

The `sr"..."` literal currently escapes all special characters in a string, but it may be changed in the future to escape only characters which are part of a regex.

##### Custom Scalar Functions

SQLite.jl also lets you implement your own [Scalar Functions](https://www.sqlite.org/lang_corefunc.html) (though [Aggregate Functions](https://www.sqlite.org/lang_aggfunc.html) are not currently supported). This is done using the `registerfunc` function and the `@scalarfunc` macro.

`@scalarfunc` takes an optional function name and a function, and defines a new function which can be passed to `registerfunc`. It can be used with block function syntax

```julia
julia> @scalarfunc function add3(x)
           x + 3
       end
add3 (generic function with 1 method)

julia> @scalarfunc add5 function irrelevantfuncname(x)
           x + 5
       end
add5 (generic function with 1 method)
```

inline function syntax

```julia
julia> @scalarfunc mult3(x) = 3 * x
mult3 (generic function with 1 method)

julia> @scalarfunc mult5 anotherirrelevantname(x) = 5 * x
mult5 (generic function with 1 method)
```

and previously defined functions (note that name inference does not work with this method)

```julia
julia> @scalarfunc sin sin
sin (generic function with 1 method)

julia> @scalarfunc subtract -
subtract (generic function with 1 method)
```

The function that is defined can then be passed to `registerfunc`. `registerfunc` takes three arguments: the database to which the function should be registered, the number of arguments that the function takes, and the function itself. The function is registered to the database connection rather than to the database itself, so it must be registered each time the database is opened. Your function cannot take more than 127 arguments unless it takes a variable number of arguments; in that case you must pass -1 as the second argument to `registerfunc`.
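
As a rough illustration of the steps above, the following sketch defines a one-argument function, registers it, and calls it from a query. It assumes the Chinook database file and `Genre` table used in the earlier examples, and that the name given to `@scalarfunc` is also the name SQLite sees. The output is omitted because it depends on the contents of your database.

```julia
using SQLite

# Open the connection. Functions are registered per connection,
# so the registration below must be repeated each time the database is opened.
db = SQLiteDB("Chinook_Sqlite.sqlite")

# Define a scalar function that can be handed to registerfunc.
@scalarfunc add3(x) = x + 3

# Register it: connection, number of arguments, function.
registerfunc(db, 1, add3)

# The function should now be callable by name from SQL on this connection.
query(db, "SELECT add3(GenreId) FROM Genre LIMIT 3")
```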

The `@scalarfunc` macro uses the `sqlreturn` function to return your function's return value to SQLite. By default, `sqlreturn` maps the returned value to a [native SQLite type]() or, failing that, serializes the Julia value and stores it as a `BLOB`. To change this behaviour, define a new method for `sqlreturn` which then calls a previously defined method for `sqlreturn`. Methods which map to native SQLite types are

```julia
sqlreturn(context, ::NullType)
sqlreturn(context, val::Int32)
sqlreturn(context, val::Int64)
sqlreturn(context, val::Float64)
sqlreturn(context, val::UTF16String)
sqlreturn(context, val::String)
sqlreturn(context, val::Any)
```

As an example, say you would like `BigInt`s to be stored as `TEXT` rather than as a `BLOB`. You would simply need to define the following method

```julia
sqlreturn(context, val::BigInt) = sqlreturn(context, string(val))
```

Another example is the `sqlreturn` used by the `regexp` function. For `regexp` to work correctly it must return an `Int` (more specifically a `0` or `1`), but `ismatch` (used by `regexp`) returns a `Bool`. For this reason the following method was defined

```julia
sqlreturn(context, val::Bool) = sqlreturn(context, int(val))
```

Any new method defined for `sqlreturn` must take two arguments and must pass its first argument straight through as the first argument of the `sqlreturn` method it calls.
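
To make that rule concrete, here is a minimal sketch of one more `sqlreturn` method. It is an illustration, not part of SQLite.jl: it returns `Rational` values as floating-point numbers instead of serialized `BLOB`s. The explicit `import` is an assumption based on Julia's general rule that a function from another module must be imported (or qualified) before new methods can be added to it.

```julia
# Assumption: sqlreturn lives in the SQLite module and has to be
# imported explicitly before it can be extended with new methods.
import SQLite: sqlreturn

# Two arguments, with `context` passed through untouched as the first one.
# float(val) turns a Rational{Int} into a Float64, which is then handled
# by the existing sqlreturn(context, val::Float64) method.
sqlreturn(context, val::Rational) = sqlreturn(context, float(val))
```

Rationals built from `BigInt` would convert to a different floating-point type and so would not reach the `Float64` method; they would need their own conversion.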
