Commit 0d93031

initial WIP draft to demonstrate general idea
1 parent fd7bf2f commit 0d93031

3 files changed, with 172 additions and 33 deletions

README.md

Lines changed: 74 additions & 0 deletions
@@ -37,6 +37,40 @@ s = query(io) # io is a stream
3737
will return a `File` or `Stream` object that also encodes the detected
3838
file format.
3939

40+
Sometimes you want to read or write files that are larger than your available
41+
memory, or that have an unknown or unbounded length (e.g. reading an audio or
42+
video stream from a socket). In these cases it might not make sense to process
43+
the whole file at once; instead, you can process it a chunk at a time. For these situations FileIO provides the `loadstreaming` and `savestreaming` functions, which return an object that you can `read` from or `write` to, rather than the file data itself.
44+
45+
This would look something like:
46+
47+
```jl
48+
using FileIO
49+
audio = loadstreaming("bigfile.wav")
50+
try
51+
    while !eof(audio)
52+
        chunk = read(audio, 4096) # read 4096 frames
53+
        # process the chunk
54+
    end
55+
finally
56+
    close(audio)
57+
end
58+
```
59+
60+
or use `do` syntax to auto-close the stream:
61+
62+
```jl
63+
using FileIO
64+
loadstreaming("bigfile.wav") do audio
65+
    while !eof(audio)
66+
        chunk = read(audio, 4096) # read 4096 frames
67+
        # process the chunk
68+
    end
69+
end
70+
```
71+
72+
Note that in these cases you may want to use `read!` with a pre-allocated buffer for maximum efficiency.
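As a rough illustration of the buffer-reuse idea, the sketch below reads from a plain `IO` in fixed-size chunks into one pre-allocated buffer. It does not use FileIO at all: `sum_bytes_chunked` is a hypothetical helper, the code targets current Julia 1.x, and it uses Base's `readbytes!` rather than `read!` because `readbytes!` reports how many bytes were actually read, which makes a short final chunk easy to handle. A streaming reader returned by `loadstreaming` would need to provide its own `read!` method for the same pattern to apply.

```julia
# Hypothetical helper, not part of FileIO: sums all bytes of `io`,
# reading at most `chunk_size` bytes at a time into one reused buffer.
function sum_bytes_chunked(io::IO; chunk_size::Integer=4096)
    buf = Vector{UInt8}(undef, chunk_size)  # allocated once, reused for every chunk
    total = 0
    while !eof(io)
        n = readbytes!(io, buf, chunk_size)  # number of bytes actually read
        total += sum(Int, view(buf, 1:n))    # process only the valid prefix
    end
    return total
end

io = IOBuffer(fill(0x01, 10_000))  # stand-in for a large file or a socket
total = sum_bytes_chunked(io)
```

The only allocation inside the loop is the small `view` wrapper; the chunk data itself always lands in `buf`.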
73+
4074
## Adding new formats
4175

4276
You register a new format by adding `add_format(fmt, magic,
@@ -139,6 +173,46 @@ automatically even if the code inside the `do` scope throws an error.)
139173
Conversely, `load(::Stream)` and `save(::Stream)` should not close the
140174
input stream.
141175

176+
`loadstreaming` and `savestreaming` use the same query mechanism, but return a decoded stream that users can `read` from or `write` to. You should also implement a `close` method on your reader or writer type. Just like with `load` and `save`, if the user provided a filename, your `close` method is responsible for closing any streams you opened in order to read or write the file. If you are given a `Stream`, your `close` method should only clean up your reader or writer type and must not close the underlying stream.
177+
178+
```julia
179+
struct WAVReader
180+
    io::IO
181+
    ownstream::Bool # true if this reader opened `io` and must close it
182+
end
183+
184+
function Base.read(reader::WAVReader, frames::Int)
185+
    # read and decode audio samples from reader.io
186+
end
187+
188+
function Base.close(reader::WAVReader)
189+
    # do whatever cleanup the reader needs
190+
    if reader.ownstream
191+
        close(reader.io)
192+
    end
193+
end
194+
loadstreaming(f::File{format"WAV"}) = WAVReader(open(f), true)  # we opened the stream, so we own it
195+
loadstreaming(s::Stream{format"WAV"}) = WAVReader(s, false)    # the caller owns the stream
196+
# FileIO has fallback functions that make these work using `do` syntax as well.
197+
```
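The ownership rule described above can be tried out without FileIO or a real WAV decoder. The following is a minimal sketch in current Julia 1.x, under the assumption that a reader simply wraps an `IO`: `ByteChunkReader` and `with_reader` are hypothetical names invented for this example, and `with_reader` only imitates the `try`/`finally` shape of the `do`-syntax fallback that this commit adds in `src/loadsave.jl`.

```julia
# Hypothetical reader type, not part of FileIO.
struct ByteChunkReader{T<:IO}
    io::T
    ownstream::Bool  # true when the reader opened `io` itself
end

Base.eof(r::ByteChunkReader) = eof(r.io)
Base.read(r::ByteChunkReader, n::Integer) = read(r.io, n)  # reads at most n bytes

function Base.close(r::ByteChunkReader)
    # Only close the wrapped stream if this reader owns it.
    r.ownstream && close(r.io)
    return nothing
end

# Hypothetical do-block helper: always closes the reader, even if `f` throws.
function with_reader(f::Function, io::IO; ownstream::Bool=false)
    reader = ByteChunkReader(io, ownstream)
    try
        return f(reader)
    finally
        close(reader)
    end
end

io = IOBuffer(collect(UInt8, 1:10))  # caller-owned stream with 10 bytes
sizes = with_reader(io) do reader
    out = Int[]
    while !eof(reader)
        push!(out, length(read(reader, 4)))  # chunk sizes, last one may be short
    end
    out
end
```

Because `io` was supplied by the caller, `ownstream` stays `false` and the buffer is still open after the `do` block returns; passing `ownstream=true` would make `close(reader)` close the wrapped stream as well.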
198+
199+
If you choose to implement `loadstreaming` and `savestreaming` in your package,
200+
you can easily add `save` and `load` methods in the form of:
201+
202+
```julia
203+
function save(q::Formatted{format"WAV"}, data, args...; kwargs...)
204+
    savestreaming(q, args...; kwargs...) do stream
205+
        write(stream, data)
206+
    end
207+
end
208+
209+
function load(q::Formatted{format"WAV"}, args...; kwargs...)
210+
    loadstreaming(q, args...; kwargs...) do stream
211+
        read(stream) # read everything that remains in the stream
212+
    end
213+
end
214+
```
215+
142216
## Help
143217

144218
You can get an API overview by typing `?FileIO` at the REPL prompt.

src/FileIO.jl

Lines changed: 4 additions & 0 deletions
@@ -17,9 +17,11 @@ export DataFormat,
1717
file_extension,
1818
info,
1919
load,
20+
loadstreaming,
2021
magic,
2122
query,
2223
save,
24+
savestreaming,
2325
skipmagic,
2426
stream,
2527
unknown
@@ -40,7 +42,9 @@ include("registry.jl")
4042
4143
- `load([filename|stream])`: read data in formatted file, inferring the format
4244
- `load(File(format"PNG",filename))`: specify the format manually
45+
- `loadstreaming(f)`: similar to `load`, except that it returns an object that can be read from
4346
- `save(filename, data...)` for similar operations involving saving data
47+
- `savestreaming(f)`: similar to `save`, except that it returns an object that can be written to
4448
4549
- `io = open(f::File, args...)` opens a file
4650
- `io = stream(s::Stream)` returns the IOStream from the query object `s`

src/loadsave.jl

Lines changed: 94 additions & 33 deletions
@@ -49,17 +49,50 @@ the magic bytes are essential.
4949
- `load(File(format"PNG",filename))` specifies the format directly, and bypasses inference.
5050
- `load(f; options...)` passes keyword arguments on to the loader.
5151
"""
52-
load(s::Union{AbstractString,IO}, args...; options...) =
53-
load(query(s), args...; options...)
52+
load
53+
54+
"""
55+
Some packages may implement a streaming API, where the contents of the file can
56+
be read in chunks and processed, rather than all at once. Reading from these
57+
higher-level streams should return a formatted object, like an image or chunk of
58+
video or audio.
59+
60+
- `loadstreaming(filename)` loads the contents of a formatted file, trying to infer
61+
the format from `filename` and/or magic bytes in the file. It returns a streaming
62+
type that can be read from in chunks, rather than loading the whole contents all
63+
at once.
64+
- `loadstreaming(strm)` loads the stream from an `IOStream` or similar object. In this case,
65+
the magic bytes are essential.
66+
- `loadstreaming(File(format"PNG",filename))` specifies the format directly, and bypasses inference.
67+
- `loadstreaming(f; options...)` passes keyword arguments on to the loader.
68+
"""
69+
loadstreaming
5470

5571
"""
5672
- `save(filename, data...)` saves the contents of a formatted file,
5773
trying to infer the format from `filename`.
5874
- `save(Stream(format"PNG",io), data...)` specifies the format directly, and bypasses inference.
5975
- `save(f, data...; options...)` passes keyword arguments on to the saver.
6076
"""
61-
save(s::Union{AbstractString,IO}, data...; options...) =
62-
save(query(s), data...; options...)
77+
save
78+
79+
"""
80+
Some packages may implement a streaming API, where the contents of the file can
81+
be written in chunks, rather than all at once. These higher-level streams should
82+
accept formatted objects, like an image or chunk of video or audio.
83+
84+
- `savestreaming(filename, data...)` opens a formatted file for writing, trying to infer
85+
the format from `filename`, and returns a streaming type that can be written to in chunks.
86+
- `savestreaming(Stream(format"PNG",io), data...)` specifies the format directly, and bypasses inference.
87+
- `savestreaming(f, data...; options...)` passes keyword arguments on to the saver.
88+
"""
89+
savestreaming
90+
91+
92+
for fn in (:load, :loadstreaming, :save, :savestreaming)
93+
@eval $fn(s::Union{AbstractString,IO}, data...; options...) =
94+
$fn(query(s), data...; options...)
95+
end
6396

6497
function save(s::Union{AbstractString,IO}; options...)
6598
data -> save(s, data; options...)
@@ -73,51 +106,79 @@ function save{sym}(df::Type{DataFormat{sym}}, f::AbstractString, data...; option
73106
$data...; $options...)))
74107
end
75108

109+
function savestreaming{sym}(df::Type{DataFormat{sym}}, s::IO, data...; options...)
110+
libraries = applicable_savers(df)
111+
checked_import(libraries[1])
112+
eval(Main, :($savestreaming($Stream($(DataFormat{sym}), $s),
113+
$data...; $options...)))
114+
end
76115
function save{sym}(df::Type{DataFormat{sym}}, s::IO, data...; options...)
77116
libraries = applicable_savers(df)
78117
checked_import(libraries[1])
79118
eval(Main, :($save($Stream($(DataFormat{sym}), $s),
80119
$data...; $options...)))
120+
end
121+
function savestreaming{sym}(df::Type{DataFormat{sym}}, f::AbstractString, data...; options...)
122+
libraries = applicable_savers(df)
123+
checked_import(libraries[1])
124+
eval(Main, :($savestreaming($File($(DataFormat{sym}), $f),
125+
$data...; $options...)))
126+
end
127+
128+
# do-syntax for streaming IO
129+
for fn in (:loadstreaming, :savestreaming)
130+
@eval function $fn(f::Function, args...; kwargs...)
131+
str = $fn(args...; kwargs...)
132+
try
133+
f(str)
134+
finally
135+
close(str)
136+
end
137+
end
81138
end
82139

83140

84141
# Fallbacks
85-
function load{F}(q::Formatted{F}, args...; options...)
86-
if unknown(q)
87-
isfile(filename(q)) || open(filename(q)) # force systemerror
88-
throw(UnknownFormat(q))
89-
end
90-
libraries = applicable_loaders(q)
91-
failures = Any[]
92-
for library in libraries
93-
try
94-
Library = checked_import(library)
95-
if !has_method_from(methods(Library.load), Library)
96-
throw(LoaderError(string(library), "load not defined"))
142+
for fn in (:load, :loadstreaming)
143+
@eval function $fn{F}(q::Formatted{F}, args...; options...)
144+
if unknown(q)
145+
isfile(filename(q)) || open(filename(q)) # force systemerror
146+
throw(UnknownFormat(q))
147+
end
148+
libraries = applicable_loaders(q)
149+
failures = Any[]
150+
for library in libraries
151+
try
152+
Library = checked_import(library)
153+
if !has_method_from(methods(Library.$fn), Library)
154+
throw(LoaderError(string(library), $(string(fn, " not defined"))))
155+
end
156+
return eval(Main, :($(Library.$fn)($q, $args...; $options...)))
157+
catch e
158+
push!(failures, (e, q))
97159
end
98-
return eval(Main, :($(Library.load)($q, $args...; $options...)))
99-
catch e
100-
push!(failures, (e, q))
101160
end
161+
handle_exceptions(failures, "loading \"$(filename(q))\"")
102162
end
103-
handle_exceptions(failures, "loading \"$(filename(q))\"")
104163
end
105-
function save{F}(q::Formatted{F}, data...; options...)
106-
unknown(q) && throw(UnknownFormat(q))
107-
libraries = applicable_savers(q)
108-
failures = Any[]
109-
for library in libraries
110-
try
111-
Library = checked_import(library)
112-
if !has_method_from(methods(Library.save), Library)
113-
throw(WriterError(string(library), "save not defined"))
164+
for fn in (:save, :savestreaming)
165+
@eval function $fn{F}(q::Formatted{F}, data...; options...)
166+
unknown(q) && throw(UnknownFormat(q))
167+
libraries = applicable_savers(q)
168+
failures = Any[]
169+
for library in libraries
170+
try
171+
Library = checked_import(library)
172+
if !has_method_from(methods(Library.$fn), Library)
173+
throw(WriterError(string(library), $(string(fn, " not defined"))))
174+
end
175+
return eval(Main, :($(Library.$fn)($q, $data...; $options...)))
176+
catch e
177+
push!(failures, (e, q))
114178
end
115-
return eval(Main, :($(Library.save)($q, $data...; $options...)))
116-
catch e
117-
push!(failures, (e, q))
118179
end
180+
handle_exceptions(failures, "saving \"$(filename(q))\"")
119181
end
120-
handle_exceptions(failures, "saving \"$(filename(q))\"")
121182
end
122183

123184
function has_method_from(mt, Library)
