Commit 070a304

Merge branch 'UDFs' of git://github.com/quinnj/SQLite.jl
2 parents 8d7b67b + 8b89a33 commit 070a304

6 files changed: +447 -2 lines changed

README.md

Lines changed: 176 additions & 0 deletions
@@ -73,3 +73,179 @@ A Julia interface to the SQLite library and support for operations on DataFrames
 * `drop(db::SQLiteDB,table::String)`
 
 `drop` is pretty self-explanatory. It's really just a convenience wrapper around `query` to execute a DROP TABLE command, while also calling "VACUUM" to clean out freed memory from the database.
+
+* `registerfunc(db::SQLiteDB, nargs::Int, func::Function, isdeterm::Bool=true; name="")`
+
+Register a function `func` (which takes `nargs` arguments) with the SQLite database connection `db`. If the keyword argument `name` is given, the function is registered under that name; otherwise it is registered under the name of `func`. If the function is stochastic (e.g. uses a random number), `isdeterm` should be set to `false`; see SQLite's [function creation documentation](http://sqlite.org/c3ref/create_function.html) for more information.
+
+* `@scalarfunc function`
+`@scalarfunc name function`
+
+Define a function which can then be passed to `registerfunc`. In the first usage the function name is inferred from the function definition; in the second it is given explicitly as the first parameter. The second form is only recommended when its use is absolutely necessary, see below.
+
+* `sr"..."`
+
+This string literal escapes all special characters in the string (backslashes are passed through verbatim), which is useful when a query contains a regex.
+
+* `sqlreturn(context, val)`
+
+This function should never be called explicitly. Instead it is exported so that it can be overloaded when necessary, see below.
+
+#### User Defined Functions
+
+##### SQLite Regular Expressions
+
+SQLite provides syntax for calling the [`regexp` function](http://sqlite.org/lang_expr.html#regexp) from inside `WHERE` clauses. SQLite does not, however, provide a default implementation of the `regexp` function, so SQLite.jl registers one automatically when you open a database. The function can be called in the following ways (examples using the [Chinook Database](http://chinookdatabase.codeplex.com/)):
+
+```julia
+julia> using SQLite
+
+julia> db = SQLiteDB("Chinook_Sqlite.sqlite")
+
+julia> # using SQLite's in-built syntax
+
+julia> query(db, "SELECT FirstName, LastName FROM Employee WHERE LastName REGEXP 'e(?=a)'")
+1x2 ResultSet
+| Row | "FirstName" | "LastName" |
+|-----|-------------|------------|
+| 1   | "Jane"      | "Peacock"  |
+
+julia> # explicitly calling the regexp() function
+
+julia> query(db, "SELECT * FROM Genre WHERE regexp('e[trs]', Name)")
+6x2 ResultSet
+| Row | "GenreId" | "Name"               |
+|-----|-----------|----------------------|
+| 1   | 3         | "Metal"              |
+| 2   | 4         | "Alternative & Punk" |
+| 3   | 6         | "Blues"              |
+| 4   | 13        | "Heavy Metal"        |
+| 5   | 23        | "Alternative"        |
+| 6   | 25        | "Opera"              |
+
+julia> # you can even do strange things like this if you really want
+
+julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
+2x2 ResultSet
+| Row | "GenreId" | "Name" |
+|-----|-----------|--------|
+| 1   | 1         | "Rock" |
+| 2   | 2         | "Jazz" |
+
+julia> query(db, "INSERT INTO Genre VALUES (regexp('^word', 'this is a string'), 'My Genre')")
+1x1 ResultSet
+| Row | "Rows Affected" |
+|-----|-----------------|
+| 1   | 0               |
+
+julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
+2x2 ResultSet
+| Row | "GenreId" | "Name"     |
+|-----|-----------|------------|
+| 1   | 0         | "My Genre" |
+| 2   | 1         | "Rock"     |
+```
+
+Because regular expressions make heavy use of escape characters, you may run into problems where Julia strips some backslashes from your query; for example, `"\y"` simply becomes `"y"`. The following two queries are therefore identical:
+
+```julia
+julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
+1x1 ResultSet
+| Row | "Rows Affected" |
+|-----|-----------------|
+| 1   | 0               |
+
+julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-d'")
+1x1 ResultSet
+| Row | "Rows Affected" |
+|-----|-----------------|
+| 1   | 0               |
+```
+
+This can be avoided in two ways. You can either escape each backslash yourself or use the `sr"..."` string literal that SQLite.jl exports. The previous query can then be run successfully like so:
+
+```julia
+julia> # manually escaping backslashes
+
+julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\\d'")
+1x2 ResultSet
+| Row | "MediaTypeId" | "Name"                        |
+|-----|---------------|-------------------------------|
+| 1   | 3             | "Protected MPEG-4 video file" |
+
+julia> # using sr"..."
+
+julia> query(db, sr"SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
+1x2 ResultSet
+| Row | "MediaTypeId" | "Name"                        |
+|-----|---------------|-------------------------------|
+| 1   | 3             | "Protected MPEG-4 video file" |
+```
+
+The `sr"..."` literal currently escapes all special characters in a string, but it may be changed in the future to escape only characters which are part of a regex.
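
As a small illustration of what the `sr"..."` literal does, the sketch below repeats the one-line `sr_str` definition from `src/UDF.jl` in this commit and compares its result with the manually escaped string. It is written to run on its own, without SQLite.jl loaded, and is only meant to show the string handling, not the query itself.

```julia
# Same one-line definition as the `sr_str` macro added in src/UDF.jl:
# the macro receives the literal's raw text and returns it unchanged.
macro sr_str(s)
    s
end

raw_query = sr"SELECT * FROM MediaType WHERE Name REGEXP '-\d'"
escaped_query = "SELECT * FROM MediaType WHERE Name REGEXP '-\\d'"

# Both spellings produce the same string, backslash included.
println(raw_query == escaped_query)  # prints true
```
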
+
+##### Custom Scalar Functions
+
+SQLite.jl also lets you implement your own [Scalar Functions](https://www.sqlite.org/lang_corefunc.html) (though [Aggregate Functions](https://www.sqlite.org/lang_aggfunc.html) are not currently supported). This is done using the `registerfunc` function and the `@scalarfunc` macro.
+
+`@scalarfunc` takes an optional function name and a function, and defines a new function which can be passed to `registerfunc`. It can be used with block function syntax:
+
+```julia
+julia> @scalarfunc function add3(x)
+           x + 3
+       end
+add3 (generic function with 1 method)
+
+julia> @scalarfunc add5 function irrelevantfuncname(x)
+           x + 5
+       end
+add5 (generic function with 1 method)
+```
+
+inline function syntax:
+
+```julia
+julia> @scalarfunc mult3(x) = 3 * x
+mult3 (generic function with 1 method)
+
+julia> @scalarfunc mult5 anotherirrelevantname(x) = 5 * x
+mult5 (generic function with 1 method)
+```
+
+and previously defined functions (note that name inference does not work with this method):
+
+```julia
+julia> @scalarfunc sin sin
+sin (generic function with 1 method)
+
+julia> @scalarfunc subtract -
+subtract (generic function with 1 method)
+```
+
+The function that is defined can then be passed to `registerfunc`. `registerfunc` takes three required arguments: the database to which the function should be registered, the number of arguments that the function takes, and the function itself. Because the function is registered to the database connection rather than to the database itself, it must be registered again each time the database is opened. Your function cannot take more than 127 arguments unless it takes a variable number of arguments, in which case you must pass `-1` as the second argument to `registerfunc`.
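
The argument-count rule and the `isdeterm` flag can be sketched in isolation. The helper `functionflags` below is hypothetical (it is not part of SQLite.jl); it only mirrors the range check and flag combination that `registerfunc` performs in `src/UDF.jl` before calling `sqlite3_create_function_v2`. The two constants use the values from SQLite's C header, and the sketch throws an `ArgumentError` where the commit uses `@assert`. It is written for current Julia and does not need SQLite.jl.

```julia
# Flag values as defined in SQLite's C header (sqlite3.h).
const SQLITE_UTF8 = 0x00000001
const SQLITE_DETERMINISTIC = 0x00000800

# Hypothetical helper: validate `nargs` and build the encoding/flags word.
# nargs == -1 means "variable number of arguments"; 127 is the upper limit.
function functionflags(nargs::Integer, isdeterm::Bool=true)
    -1 <= nargs <= 127 || throw(ArgumentError("nargs must satisfy -1 <= nargs <= 127"))
    return isdeterm ? SQLITE_UTF8 | SQLITE_DETERMINISTIC : SQLITE_UTF8
end

fixed_flags = functionflags(2)            # fixed arity, deterministic
variadic_flags = functionflags(-1, false) # variadic, non-deterministic
```
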
+
+The `@scalarfunc` macro uses the `sqlreturn` function to return your function's return value to SQLite. By default, `sqlreturn` maps the returned value to a [native SQLite type](http://sqlite.org/c3ref/result_blob.html) or, failing that, serializes the Julia value and stores it as a `BLOB`. To change this behaviour, simply define a new method for `sqlreturn` which then calls a previously defined method for `sqlreturn`. Methods which map to native SQLite types are:
+
+```julia
+sqlreturn(context, ::NullType)
+sqlreturn(context, val::Int32)
+sqlreturn(context, val::Int64)
+sqlreturn(context, val::Float64)
+sqlreturn(context, val::UTF16String)
+sqlreturn(context, val::String)
+sqlreturn(context, val::Any)
+```
+
+As an example, say you would like `BigInt`s to be stored as `TEXT` rather than as a `BLOB`. You would simply need to define the following method:
+
+```julia
+sqlreturn(context, val::BigInt) = sqlreturn(context, string(val))
+```
+
+Another example is the `sqlreturn` used by the `regexp` function. For `regexp` to work correctly it must return an `Int` (more specifically a `0` or `1`), but `ismatch` (used by `regexp`) returns a `Bool`. For this reason the following method was defined:
+
+```julia
+sqlreturn(context, val::Bool) = sqlreturn(context, int(val))
+```
+
+Any new method defined for `sqlreturn` must take two arguments and must pass the first argument straight through as the first argument of the `sqlreturn` method it calls.
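
The forwarding rule can be tried out without a database. The sketch below is a stand-in, not SQLite.jl's implementation: the "context" is a plain vector that records what would be handed to SQLite, whereas the real methods receive a pointer and call the `sqlite3_result_*` functions. It is written for current Julia (so it uses `AbstractString` and `Int64(val)` rather than the `String` and `int(val)` of this commit).

```julia
# Stand-in base methods: record the SQLite storage class and the value.
sqlreturn(context::Vector{Any}, val::Int64) = push!(context, (:INTEGER, val))
sqlreturn(context::Vector{Any}, val::Float64) = push!(context, (:REAL, val))
sqlreturn(context::Vector{Any}, val::AbstractString) = push!(context, (:TEXT, val))

# Forwarding methods: two arguments each, with `context` passed straight
# through as the first argument of the method they call.
sqlreturn(context::Vector{Any}, val::Bool) = sqlreturn(context, Int64(val))
sqlreturn(context::Vector{Any}, val::BigInt) = sqlreturn(context, string(val))

context = Any[]
sqlreturn(context, true)       # forwarded to the Int64 method
sqlreturn(context, big(2)^70)  # forwarded to the string method
```

After these two calls, `context` holds one `:INTEGER` entry and one `:TEXT` entry, which is the effect the `Bool` and `BigInt` methods in the README are after.
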

src/SQLite.jl

Lines changed: 5 additions & 0 deletions
@@ -30,6 +30,10 @@ type SQLiteDB{T<:String}
 end
 SQLiteDB(file,handle) = SQLiteDB(file,handle,0)
 
+include("UDF.jl")
+export registerfunc, sqlreturn, @scalarfunc, @sr_str
+
+
 function changes(db::SQLiteDB)
     new_tot = sqlite3_total_changes(db.handle)
     diff = new_tot - db.changes
@@ -50,6 +54,7 @@ function SQLiteDB(file::String="";UTF16::Bool=false)
     file = isempty(file) ? file : expanduser(file)
     if @OK sqliteopen(utf(file),handle)
         db = SQLiteDB(utf(file),handle[1])
+        registerfunc(db, 2, regexp)
         finalizer(db,close)
         return db
     else # error

src/UDF.jl

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+# scalar functions
+function registerfunc(db::SQLiteDB, nargs::Integer, func::Function, isdeterm::Bool=true; name="")
+    @assert (-1 <= nargs <= 127) "nargs must follow the inequality -1 <= nargs <= 127"
+
+    name = isempty(name) ? string(func) : name::String
+    cfunc = cfunction(func, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
+
+    # TODO: allow the other encodings
+    enc = SQLITE_UTF8
+    enc = isdeterm ? enc | SQLITE_DETERMINISTIC : enc
+
+    @CHECK db sqlite3_create_function_v2(
+        db.handle, name, nargs, enc, C_NULL, cfunc, C_NULL, C_NULL, C_NULL
+    )
+end
+
+# aggregate functions
+function registerfunc(db::SQLiteDB, nargs::Integer, step::Function, final::Function, isdeterm::Bool=true; name="")
+    @assert (-1 <= nargs <= 127) "nargs must follow the inequality -1 <= nargs <= 127"
+
+    name = isempty(name) ? string(step) : name::String
+    cstep = cfunction(step, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
+    cfinal = cfunction(final, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
+
+    # TODO: allow the other encodings
+    enc = SQLITE_UTF8
+    enc = isdeterm ? enc | SQLITE_DETERMINISTIC : enc
+
+    @CHECK db sqlite3_create_function_v2(
+        db.handle, name, nargs, enc, C_NULL, C_NULL, cstep, cfinal, C_NULL
+    )
+end
+
+function sqlvalue(values, i)
+    temp_val_ptr = unsafe_load(values, i)
+    valuetype = sqlite3_value_type(temp_val_ptr)
+
+    if valuetype == SQLITE_INTEGER
+        if WORD_SIZE == 64
+            return sqlite3_value_int64(temp_val_ptr)
+        else
+            return sqlite3_value_int(temp_val_ptr)
+        end
+    elseif valuetype == SQLITE_FLOAT
+        return sqlite3_value_double(temp_val_ptr)
+    elseif valuetype == SQLITE_TEXT
+        # TODO: have a way to return UTF16
+        return bytestring(sqlite3_value_text(temp_val_ptr))
+    elseif valuetype == SQLITE_BLOB
+        nbytes = sqlite3_value_bytes(temp_val_ptr)
+        blob = sqlite3_value_blob(temp_val_ptr)
+        buf = zeros(Uint8, nbytes)
+        unsafe_copy!(pointer(buf), convert(Ptr{Uint8}, blob), nbytes)
+        return sqldeserialize(buf)
+    else
+        return NULL
+    end
+end
+
+sqlreturn(context, ::NullType) = sqlite3_result_null(context)
+sqlreturn(context, val::Int32) = sqlite3_result_int(context, val)
+sqlreturn(context, val::Int64) = sqlite3_result_int64(context, val)
+sqlreturn(context, val::Float64) = sqlite3_result_double(context, val)
+sqlreturn(context, val::String) = sqlite3_result_text(context, val)
+sqlreturn(context, val::UTF16String) = sqlite3_result_text16(context, val)
+sqlreturn(context, val) = sqlite3_result_blob(context, sqlserialize(val))
+
+sqlreturn(context, val::Bool) = sqlreturn(context, int(val))
+
+sqludferror(context, msg::String) = sqlite3_result_error(context, msg)
+sqludferror(context, msg::UTF16String) = sqlite3_result_error16(context, msg)
+
+function funcname(expr)
+    if length(expr) == 2
+        func = expr[2]
+        name = expr[1]
+    else
+        func = expr[1]
+        name = func.args[1].args[1]
+    end
+    name, func
+end
+
+macro scalarfunc(args...)
+    name, func = funcname(args)
+    return quote
+        function $(esc(name))(context::Ptr{Void}, nargs::Cint, values::Ptr{Ptr{Void}})
+            args = [sqlvalue(values, i) for i in 1:nargs]
+            ret = $(func)(args...)
+            sqlreturn(context, ret)
+            nothing
+        end
+    end
+end
+
+
+# annotate types because the MethodError makes more sense that way
+@scalarfunc regexp(r::String, s::String) = ismatch(Regex(r), s)
+# macro for preserving the special characters in a string
+macro sr_str(s) s end
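
The wrapping done by `@scalarfunc` above can be illustrated with a simplified stand-in. The macro `@wrapscalar` below is hypothetical and is not part of this commit: it always takes an explicit name, and instead of SQLite pointers the generated wrapper takes a vector that collects results and a vector of argument values. The shape is the same, though: splat the collected arguments into the user's function, hand the result on, and return `nothing`. It is written for current Julia rather than the 0.3-era syntax of the diff.

```julia
# Simplified stand-in for @scalarfunc (hypothetical, for illustration only).
macro wrapscalar(name, func)
    quote
        function $(esc(name))(results::Vector{Any}, argvals::Vector{Any})
            ret = $(esc(func))(argvals...)  # call the user's function
            push!(results, ret)             # stands in for sqlreturn(context, ret)
            nothing                         # the wrapper itself returns nothing
        end
    end
end

# An anonymous function and a previously defined function, both with explicit names.
@wrapscalar add3 x -> x + 3
@wrapscalar root sqrt

results = Any[]
add3(results, Any[4])
root(results, Any[9.0])
```
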
