|
1 | 1 | TranscodingStreams.jl
|
2 | 2 | =====================
|
3 | 3 |
|
| 4 | +Overview |
| 5 | +-------- |
| 6 | + |
4 | 7 | TranscodingStreams.jl is a package for transcoding (e.g. compression) data
|
5 |
| -streams. This package exports a type `TranscodingStream`, which |
6 |
| -is a subtype of `IO` and supports various I/O operations like other usual I/O |
7 |
| -streams in the standard library. |
| 8 | +streams. It exports a type `TranscodingStream`, which is a subtype of `IO` and |
| 9 | +supports various I/O operations like other usual I/O streams in the standard |
| 10 | +library. Operations are quick, simple, and consistent. |
| 11 | + |
| 12 | +On this page, we introduce the basic concepts of TranscodingStreams.jl and |
| 13 | +available packages. The [Examples](@ref) page demonstrates common usage. The |
| 14 | +[References](@ref) page offers a comprehensive API document. |
8 | 15 |
|
9 | 16 |
|
10 | 17 | Introduction
|
@@ -119,146 +126,14 @@ feasible like these packages. TranscodingStreams.jl requests a codec to
|
119 | 126 | implement some interface functions which will be described later.
|
120 | 127 |
|
121 | 128 |
|
122 |
| -Examples |
123 |
| --------- |
124 |
| - |
125 |
| -### Read lines from a gzip-compressed file |
126 |
| - |
127 |
| -The following snippet is an example of using CodecZlib.jl, which exports |
128 |
| -`GzipDecompressionStream{S}` as an alias of |
129 |
| -`TranscodingStream{GzipDecompression,S} where S<:IO`: |
130 |
| -```julia |
131 |
| -using CodecZlib |
132 |
| -stream = GzipDecompressionStream(open("data.txt.gz")) |
133 |
| -for line in eachline(stream) |
134 |
| - # do something... |
135 |
| -end |
136 |
| -close(stream) |
137 |
| -``` |
138 |
| - |
139 |
| -Note that the last `close` call will close the file as well. Alternatively, |
140 |
| -`open(<stream type>, <filepath>) do ... end` syntax will close the file at the |
141 |
| -end: |
142 |
| -```julia |
143 |
| -using CodecZlib |
144 |
| -open(GzipDecompressionStream, "data.txt.gz") do stream |
145 |
| - for line in eachline(stream) |
146 |
| - # do something... |
147 |
| - end |
148 |
| -end |
149 |
| -``` |
150 |
| - |
151 |
| -### Save a data matrix with Zstd compression |
152 |
| - |
153 |
| -Writing compressed data is easy. One thing you need to keep in mind is to call |
154 |
| -`close` after writing data; otherwise, the output file will be incomplete: |
155 |
| -```julia |
156 |
| -using CodecZstd |
157 |
| -mat = randn(100, 100) |
158 |
| -stream = ZstdCompressionStream(open("data.mat.zst", "w")) |
159 |
| -writedlm(stream, mat) |
160 |
| -close(stream) |
161 |
| -``` |
162 |
| - |
163 |
| -Of course, `open(<stream type>, ...) do ... end` works well: |
164 |
| -```julia |
165 |
| -using CodecZstd |
166 |
| -mat = randn(100, 100) |
167 |
| -open(ZstdCompressionStream, "data.mat.zst", "w") do stream |
168 |
| - writedlm(stream, mat) |
169 |
| -end |
170 |
| -``` |
171 |
| - |
172 |
| -### Explicitly finish transcoding by writing `TOKEN_END` |
173 |
| - |
174 |
| -When writing data, the end of a data stream is indicated by calling `close`, |
175 |
| -which may write an epilogue if necessary and flush all buffered data to the |
176 |
| -underlying I/O stream. If you want to explicitly specify the end position of a |
177 |
| -stream for some reason, you can write `TranscodingStreams.TOKEN_END` to the |
178 |
| -transcoding stream as follows: |
179 |
| -```julia |
180 |
| -using CodecZstd |
181 |
| -using TranscodingStreams |
182 |
| -buf = IOBuffer() |
183 |
| -stream = ZstdCompressionStream(buf) |
184 |
| -write(stream, "foobarbaz"^100, TranscodingStreams.TOKEN_END) |
185 |
| -flush(stream) |
186 |
| -compressed = take!(buf) |
187 |
| -close(stream) |
188 |
| -``` |
189 |
| - |
190 |
| -### Use a noop codec |
191 |
| - |
192 |
| -Sometimes, the `Noop` codec, which does nothing, may be useful. The following |
193 |
| -example creates a decompression stream based on the extension of a filepath: |
194 |
| -```julia |
195 |
| -using CodecZlib |
196 |
| -using CodecBzip2 |
197 |
| -using TranscodingStreams |
198 |
| - |
199 |
| -function makestream(filepath) |
200 |
| - if endswith(filepath, ".gz") |
201 |
| - codec = GzipDecompression() |
202 |
| - elseif endswith(filepath, ".bz2") |
203 |
| - codec = Bzip2Decompression() |
204 |
| - else |
205 |
| - codec = Noop() |
206 |
| - end |
207 |
| - return TranscodingStream(codec, open(filepath)) |
208 |
| -end |
| 129 | +Error handling |
| 130 | +-------------- |
209 | 131 |
|
210 |
| -makestream("data.txt.gz") |
211 |
| -makestream("data.txt.bz2") |
212 |
| -makestream("data.txt") |
213 |
| -``` |
214 |
| - |
215 |
| -### Transcode data in one shot |
216 |
| - |
217 |
| -TranscodingStreams.jl extends the `transcode` function to transcode data |
218 |
| -in one shot. `transcode` takes a codec object as its first argument and a data |
219 |
| -vector as its second argument: |
220 |
| -```julia |
221 |
| -using CodecZlib |
222 |
| -decompressed = transcode(ZlibDecompression(), b"x\x9cKL*JLNLI\x04R\x00\x19\xf2\x04U") |
223 |
| -String(decompressed) |
224 |
| -``` |
225 |
| - |
226 |
| - |
227 |
| -API |
228 |
| ---- |
229 |
| - |
230 |
| -```@meta |
231 |
| -CurrentModule = TranscodingStreams |
232 |
| -``` |
233 |
| - |
234 |
| -```@docs |
235 |
| -TranscodingStream(codec::Codec, stream::IO) |
236 |
| -transcode(codec::Codec, data::Vector{UInt8}) |
237 |
| -TranscodingStreams.TOKEN_END |
238 |
| -``` |
239 |
| - |
240 |
| -```@docs |
241 |
| -TranscodingStreams.Noop |
242 |
| -TranscodingStreams.NoopStream |
243 |
| -``` |
244 |
| - |
245 |
| -```@docs |
246 |
| -TranscodingStreams.CodecIdentity.Identity |
247 |
| -TranscodingStreams.CodecIdentity.IdentityStream |
248 |
| -``` |
249 |
| - |
250 |
| - |
251 |
| -Defining a new codec |
252 |
| --------------------- |
253 |
| - |
254 |
| -```@docs |
255 |
| -TranscodingStreams.Codec |
256 |
| -TranscodingStreams.initialize |
257 |
| -TranscodingStreams.finalize |
258 |
| -TranscodingStreams.startproc |
259 |
| -TranscodingStreams.process |
260 |
| -``` |
261 |
| - |
262 |
| -```@docs |
263 |
| -TranscodingStreams.Memory |
264 |
| -``` |
| 132 | +You may encounter an error while processing data with this package. For example, |
| 133 | +your compressed data may be corrupted or truncated and the decompression codec |
| 134 | +cannot handle it properly. In this case, the codec informs the stream of the |
| 135 | +error and the stream goes to an unrecoverable state. In this state, the only |
| 136 | +possible operations are `isopen` and `close`. Other operations, such as `read` |
| 137 | +or `write`, will result in an argument error exception. Resources allocated in |
| 138 | +the codec will be released by the stream and hence you must not call the |
| 139 | +finalizer of a codec once it has been passed to a transcoding stream object.
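
To make the error-handling behaviour described above concrete, here is a minimal sketch, not taken from the commit itself. It assumes CodecZlib.jl is installed and uses the name `GzipDecompressorStream` from recent CodecZlib releases (the docs in this diff use the older `GzipDecompressionStream`). The helper `try_read` and the sample bytes are hypothetical, introduced only for illustration.

```julia
using CodecZlib  # assumption: a recent release that exports GzipDecompressorStream

# Hypothetical helper: return the data on success, or the caught exception.
function try_read(stream)
    try
        return read(stream)
    catch err
        return err
    end
end

# Bytes that are not valid gzip data, standing in for a corrupted file.
corrupted = Vector{UInt8}("this is not gzip data")
stream = GzipDecompressorStream(IOBuffer(corrupted))

# The first read fails because the codec rejects the input.
first_result = try_read(stream)
println("first read failed:  ", first_result isa Exception)

# The stream is now in the unrecoverable state, so a second read also fails.
second_result = try_read(stream)
println("second read failed: ", second_result isa Exception)

# `close` is still allowed; the stream releases the codec's resources itself,
# so no codec finalizer is called by hand here.
close(stream)
```

The exact exception types depend on the codec and on the package version, so the sketch only checks that an exception was raised rather than matching a specific type.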