Commit 13f823d

Merge branch 'master' into userbind
Conflicts: test/runtests.jl
2 parents ca6f50d + 885c69f commit 13f823d

6 files changed: 451 additions & 1 deletion

README.md

Lines changed: 176 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -75,3 +75,179 @@ A Julia interface to the SQLite library and support for operations on DataFrames
7575
* `drop(db::SQLiteDB,table::String)`
7676

7777
`drop` is pretty self-explanatory. It's really just a convenience wrapper around `query` to execute a DROP TABLE command, while also calling "VACUUM" to clean out freed memory from the database.
78+
79+
* `registerfunc(db::SQLiteDB, nargs::Int, func::Function, isdeterm::Bool=true; name="")`
80+
81+
Register a function `func` (which takes `nargs` arguments) with the SQLite database connection `db`. If the keyword argument `name` is given, the function is registered with that name; otherwise it is registered with the name of `func`. If the function is stochastic (e.g. uses a random number), `isdeterm` should be set to `false`. See SQLite's [function creation documentation](http://sqlite.org/c3ref/create_function.html) for more information.
82+
83+
* `@scalarfunc function`
84+
`@scalarfunc name function`
85+
86+
Define a function which can then be passed to `registerfunc`. In the first usage the function name is inferred from the function definition; in the second it is given explicitly as the first parameter. The second form is only recommended when its use is absolutely necessary (see below).
87+
88+
* `sr"..."`
89+
90+
This string literal preserves all special characters in the string (backslashes are not interpreted by Julia), which is useful when using a regex in a query.
91+
92+
* `sqlreturn(context, val)`
93+
94+
This function should never be called explicitly. Instead, it is exported so that it can be overloaded when necessary (see below).
95+
96+
#### User Defined Functions
97+
98+
##### SQLite Regular Expressions
99+
100+
SQLite provides syntax for calling the [`regexp` function](http://sqlite.org/lang_expr.html#regexp) from inside `WHERE` clauses. However, SQLite does not provide a default implementation of `regexp`, so SQLite.jl registers one automatically whenever you open a database. `X REGEXP Y` is shorthand for `regexp(Y, X)`, so the pattern comes first when the function is called explicitly. The function can be called in the following ways (examples use the [Chinook Database](http://chinookdatabase.codeplex.com/)):
101+
102+
```julia
103+
julia> using SQLite
104+
105+
julia> db = SQLiteDB("Chinook_Sqlite.sqlite")
106+
107+
julia> # using SQLite's in-built syntax
108+
109+
julia> query(db, "SELECT FirstName, LastName FROM Employee WHERE LastName REGEXP 'e(?=a)'")
110+
1x2 ResultSet
111+
| Row | "FirstName" | "LastName" |
112+
|-----|-------------|------------|
113+
| 1 | "Jane" | "Peacock" |
114+
115+
julia> # explicitly calling the regexp() function
116+
117+
julia> query(db, "SELECT * FROM Genre WHERE regexp('e[trs]', Name)")
118+
6x2 ResultSet
119+
| Row | "GenreId" | "Name" |
120+
|-----|-----------|----------------------|
121+
| 1 | 3 | "Metal" |
122+
| 2 | 4 | "Alternative & Punk" |
123+
| 3 | 6 | "Blues" |
124+
| 4 | 13 | "Heavy Metal" |
125+
| 5 | 23 | "Alternative" |
126+
| 6 | 25 | "Opera" |
127+
128+
julia> # you can even do strange things like this if you really want
129+
130+
julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
131+
2x2 ResultSet
132+
| Row | "GenreId" | "Name" |
133+
|-----|-----------|--------|
134+
| 1 | 1 | "Rock" |
135+
| 2 | 2 | "Jazz" |
136+
137+
julia> query(db, "INSERT INTO Genre VALUES (regexp('^word', 'this is a string'), 'My Genre')")
138+
1x1 ResultSet
139+
| Row | "Rows Affected" |
140+
|-----|-----------------|
141+
| 1 | 0 |
142+
143+
julia> query(db, "SELECT * FROM Genre ORDER BY GenreId LIMIT 2")
144+
2x2 ResultSet
145+
| Row | "GenreId" | "Name" |
146+
|-----|-----------|------------|
147+
| 1 | 0 | "My Genre" |
148+
| 2 | 1 | "Rock" |
149+
```
150+
151+
Because regular expressions make heavy use of escape characters, you may find that Julia strips some backslashes when it parses your query string; for example, `"\y"` simply becomes `"y"`. As a result, the following two queries are identical:
152+
153+
```julia
154+
julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
155+
1x1 ResultSet
156+
| Row | "Rows Affected" |
157+
|-----|-----------------|
158+
| 1 | 0 |
159+
160+
julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-d'")
161+
1x1 ResultSet
162+
| Row | "Rows Affected" |
163+
|-----|-----------------|
164+
| 1 | 0 |
165+
```
166+
167+
This can be avoided in two ways: either escape each backslash yourself, or use the `sr"..."` string literal that SQLite.jl exports. The previous query can then be run successfully like so:
168+
169+
```julia
170+
julia> # manually escaping backslashes
171+
172+
julia> query(db, "SELECT * FROM MediaType WHERE Name REGEXP '-\\d'")
173+
1x2 ResultSet
174+
| Row | "MediaTypeId" | "Name" |
175+
|-----|---------------|-------------------------------|
176+
| 1 | 3 | "Protected MPEG-4 video file" |
177+
178+
julia> # using sr"..."
179+
180+
julia> query(db, sr"SELECT * FROM MediaType WHERE Name REGEXP '-\d'")
181+
1x2 ResultSet
182+
| Row | "MediaTypeId" | "Name" |
183+
|-----|---------------|-------------------------------|
184+
| 1 | 3 | "Protected MPEG-4 video file" |
185+
```
186+
187+
The `sr"..."` literal currently escapes all special characters in a string, but it may change in the future to escape only characters that are part of a regex.
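To make the behaviour of the string literal concrete, here is a minimal standalone sketch. It re-declares the one-line `@sr_str` macro from `src/UDF.jl` so that it runs without SQLite.jl loaded; the variable names `pattern` and `escaped` are only illustrative.

```julia
# Standalone copy of the macro defined in src/UDF.jl (illustration only;
# when SQLite.jl is loaded, use the exported sr"..." literal instead).
macro sr_str(s) s end

pattern = sr"-\d"   # the backslash is kept verbatim
escaped = "-\\d"    # the manually escaped equivalent

println(pattern == escaped)   # both forms denote the same three characters
println(length(pattern))
```

Both spellings yield the same string, so the choice between them is purely a matter of readability in the query.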
188+
189+
##### Custom Scalar Functions
190+
191+
SQLite.jl also provides a way that you can implement your own [Scalar Functions](https://www.sqlite.org/lang_corefunc.html) (though [Aggregate Functions](https://www.sqlite.org/lang_aggfunc.html) are not currently supported). This is done using the `registerfunc` function and `@scalarfunc` macro.
192+
193+
`@scalarfunc` takes an optional function name and a function and defines a new function which can be passed to `registerfunc`. It can be used with block function syntax
194+
195+
```julia
196+
julia> @scalarfunc function add3(x)
197+
x + 3
198+
end
199+
add3 (generic function with 1 method)
200+
201+
julia> @scalarfunc add5 function irrelevantfuncname(x)
202+
x + 5
203+
end
204+
add5 (generic function with 1 method)
205+
```
206+
207+
inline function syntax
208+
209+
```julia
210+
julia> @scalarfunc mult3(x) = 3 * x
211+
mult3 (generic function with 1 method)
212+
213+
julia> @scalarfunc mult5 anotherirrelevantname(x) = 5 * x
214+
mult5 (generic function with 1 method)
215+
```
216+
217+
and previously defined functions (note that name inference does not work with this method)
218+
219+
```julia
220+
julia> @scalarfunc sin sin
221+
sin (generic function with 1 method)
222+
223+
julia> @scalarfunc subtract -
224+
subtract (generic function with 1 method)
225+
```
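Why name inference fails for previously defined functions can be seen from the expressions the macro receives. The sketch below uses a hypothetical helper, `inferred_name`, that mirrors the `func.args[1].args[1]` lookup in `funcname` (in `src/UDF.jl`); it is not part of the package.

```julia
# Hypothetical helper mirroring the lookup performed by funcname in src/UDF.jl.
inferred_name(ex::Expr) = ex.args[1].args[1]

block_form  = :(function add3(x) x + 3 end)   # block function syntax
inline_form = :(mult3(x) = 3 * x)             # inline function syntax

println(inferred_name(block_form))
println(inferred_name(inline_form))

# A previously defined function reaches the macro as a bare Symbol, which has
# no definition to inspect, so the name has to be supplied explicitly.
println(isa(:sin, Expr))
```

In both definition forms the first argument of the expression is the call signature, whose first argument is the function name.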
226+
227+
The function that is defined can then be passed to `registerfunc`. `registerfunc` takes three required arguments: the database to which the function should be registered, the number of arguments that the function takes, and the function itself. The function is registered to the database connection rather than the database itself, so it must be registered each time the database is opened. Your function cannot take more than 127 arguments unless it takes a variable number of arguments, in which case you must pass -1 as the second argument to `registerfunc`.
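The argument-count rule can be restated as a small standalone check. This is a sketch only: `normalize_nargs` is a hypothetical helper that paraphrases the checks at the top of `registerfunc` in `src/UDF.jl`, where any negative count is treated as "variable number of arguments".

```julia
# Hypothetical helper restating the nargs checks made by registerfunc.
function normalize_nargs(nargs::Integer)
    # Fixed-arity functions are limited to 127 arguments.
    nargs <= 127 || error("only varargs functions can have more than 127 arguments")
    # Any negative count is taken to mean a varargs function, signalled by -1.
    return nargs < 0 ? -1 : nargs
end

println(normalize_nargs(2))
println(normalize_nargs(-5))
```

Passing -1 directly remains the clearest way to signal a varargs function.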
228+
229+
The `@scalarfunc` macro uses the `sqlreturn` function to return your function's return value to SQLite. By default, `sqlreturn` maps the returned value to a [native SQLite type](http://sqlite.org/c3ref/result_blob.html) or, failing that, serializes the Julia value and stores it as a `BLOB`. To change this behaviour, define a new method for `sqlreturn` which then calls a previously defined method for `sqlreturn`. The methods which map to native SQLite types are:
230+
231+
```julia
232+
sqlreturn(context, ::NullType)
233+
sqlreturn(context, val::Int32)
234+
sqlreturn(context, val::Int64)
235+
sqlreturn(context, val::Float64)
236+
sqlreturn(context, val::UTF16String)
237+
sqlreturn(context, val::String)
238+
sqlreturn(context, val::Any)
239+
```
240+
241+
As an example, say you would like `BigInt`s to be stored as `TEXT` rather than a `BLOB`. You would simply need to define the following method
242+
243+
```julia
244+
sqlreturn(context, val::BigInt) = sqlreturn(context, string(val))
245+
```
246+
247+
Another example is the `sqlreturn` method used by the `regexp` function. For `regexp` to work correctly it must return an `Int` (more specifically a `0` or `1`), but `ismatch` (used by `regexp`) returns a `Bool`. For this reason the following method was defined:
248+
249+
```julia
250+
sqlreturn(context, val::Bool) = sqlreturn(context, int(val))
251+
```
252+
253+
Any new method defined for `sqlreturn` must take two arguments and must pass the first argument straight through as the first argument.
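The forwarding pattern described above can be tried without a database. In this sketch, `storage` is a hypothetical stand-in for `sqlreturn` that simply reports which SQLite storage class a value would end up in; `nothing` stands in for the context pointer.

```julia
# Hypothetical stand-in for sqlreturn: report the storage class that would be used.
storage(context, val::Int64)   = "INTEGER"
storage(context, val::Float64) = "REAL"
storage(context, val)          = "BLOB"   # fallback, like sqlreturn(context, val)

# A new method converts the value, then forwards to an existing method,
# passing the context straight through as the first argument.
storage(context, val::Bool) = storage(context, convert(Int64, val))

println(storage(nothing, true))     # forwarded to the Int64 method
println(storage(nothing, big(2)))   # no BigInt method, so the fallback applies
```

Because dispatch picks the most specific method, adding a method for one type leaves the handling of every other type unchanged.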

src/SQLite.jl

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -30,6 +30,10 @@ type SQLiteDB{T<:String}
3030
end
3131
SQLiteDB(file,handle) = SQLiteDB(file,handle,0)
3232

33+
include("UDF.jl")
34+
export registerfunc, sqlreturn, @scalarfunc, @sr_str
35+
36+
3337
function changes(db::SQLiteDB)
3438
new_tot = sqlite3_total_changes(db.handle)
3539
diff = new_tot - db.changes
@@ -50,6 +54,7 @@ function SQLiteDB(file::String="";UTF16::Bool=false)
5054
file = isempty(file) ? file : expanduser(file)
5155
if @OK sqliteopen(utf(file),handle)
5256
db = SQLiteDB(utf(file),handle[1])
57+
registerfunc(db, 2, regexp)
5358
finalizer(db,close)
5459
return db
5560
else # error

src/UDF.jl

Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
# scalar functions
2+
function registerfunc(db::SQLiteDB, nargs::Integer, func::Function, isdeterm::Bool=true; name="")
3+
@assert nargs <= 127 "only varargs functions can have more than 127 arguments"
4+
# assume any negative number means a varargs function
5+
nargs < -1 && (nargs = -1)
6+
7+
name = isempty(name) ? string(func) : name::String
8+
@assert sizeof(name) <= 255 "size of function name must be <= 255"
9+
10+
cfunc = cfunction(func, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
11+
12+
# TODO: allow the other encodings
13+
enc = SQLITE_UTF8
14+
enc = isdeterm ? enc | SQLITE_DETERMINISTIC : enc
15+
16+
@CHECK db sqlite3_create_function_v2(
17+
db.handle, name, nargs, enc, C_NULL, cfunc, C_NULL, C_NULL, C_NULL
18+
)
19+
end
20+
21+
# aggregate functions
22+
function registerfunc(db::SQLiteDB, nargs::Integer, step::Function, final::Function, isdeterm::Bool=true; name="")
23+
@assert nargs <= 127 "only varargs functions can have more than 127 arguments"
24+
# assume any negative number means a varargs function
25+
nargs < -1 && (nargs = -1)
26+
27+
name = isempty(name) ? string(step) : name::String
28+
cstep = cfunction(step, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
29+
cfinal = cfunction(final, Nothing, (Ptr{Void}, Cint, Ptr{Ptr{Void}}))
30+
31+
# TODO: allow the other encodings
32+
enc = SQLITE_UTF8
33+
enc = isdeterm ? enc | SQLITE_DETERMINISTIC : enc
34+
35+
@CHECK db sqlite3_create_function_v2(
36+
db.handle, name, nargs, enc, C_NULL, C_NULL, cstep, cfinal, C_NULL
37+
)
38+
end
39+
40+
function sqlvalue(values, i)
41+
temp_val_ptr = unsafe_load(values, i)
42+
valuetype = sqlite3_value_type(temp_val_ptr)
43+
44+
if valuetype == SQLITE_INTEGER
45+
if WORD_SIZE == 64
46+
return sqlite3_value_int64(temp_val_ptr)
47+
else
48+
return sqlite3_value_int(temp_val_ptr)
49+
end
50+
elseif valuetype == SQLITE_FLOAT
51+
return sqlite3_value_double(temp_val_ptr)
52+
elseif valuetype == SQLITE_TEXT
53+
# TODO: have a way to return UTF16
54+
return bytestring(sqlite3_value_text(temp_val_ptr))
55+
elseif valuetype == SQLITE_BLOB
56+
nbytes = sqlite3_value_bytes(temp_val_ptr)
57+
blob = sqlite3_value_blob(temp_val_ptr)
58+
buf = zeros(Uint8, nbytes)
59+
unsafe_copy!(pointer(buf), convert(Ptr{Uint8}, blob), nbytes)
60+
return sqldeserialize(buf)
61+
else
62+
return NULL
63+
end
64+
end
65+
66+
sqlreturn(context, ::NullType) = sqlite3_result_null(context)
67+
sqlreturn(context, val::Int32) = sqlite3_result_int(context, val)
68+
sqlreturn(context, val::Int64) = sqlite3_result_int64(context, val)
69+
sqlreturn(context, val::Float64) = sqlite3_result_double(context, val)
70+
sqlreturn(context, val::String) = sqlite3_result_text(context, val)
71+
sqlreturn(context, val::UTF16String) = sqlite3_result_text16(context, val)
72+
sqlreturn(context, val) = sqlite3_result_blob(context, sqlserialize(val))
73+
74+
sqlreturn(context, val::Bool) = sqlreturn(context, int(val))
75+
76+
sqludferror(context, msg::String) = sqlite3_result_error(context, msg)
77+
sqludferror(context, msg::UTF16String) = sqlite3_result_error16(context, msg)
78+
79+
function funcname(expr)
80+
if length(expr) == 2
81+
func = expr[2]
82+
name = expr[1]
83+
else
84+
func = expr[1]
85+
name = func.args[1].args[1]
86+
end
87+
name, func
88+
end
89+
90+
macro scalarfunc(args...)
91+
name, func = funcname(args)
92+
return quote
93+
function $(esc(name))(context::Ptr{Void}, nargs::Cint, values::Ptr{Ptr{Void}})
94+
args = [sqlvalue(values, i) for i in 1:nargs]
95+
ret = $(func)(args...)
96+
sqlreturn(context, ret)
97+
nothing
98+
end
99+
end
100+
end
101+
102+
# annotate types because the MethodError makes more sense that way
103+
@scalarfunc regexp(r::String, s::String) = ismatch(Regex(r), s)
104+
# macro for preserving the special characters in a string
105+
macro sr_str(s) s end
