Commit 2158bc5

author
michael-conway
committed
#187 adding reading
1 parent 917efdf commit 2158bc5

1 file changed

+44
-0
lines changed

user-guide/reading.md

Lines changed: 44 additions & 0 deletions
@@ -155,9 +155,53 @@ Java provides an i/o library that defines standard input streams and random file
of Java i/o that communicates with an iRODS grid under the covers. These implementations are in the Jargon [core.pub.io](https://github.com/DICE-UNC/jargon/tree/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io)
package. In this package are several key classes and interfaces that allow reading of data from iRODS. The various i/o objects are created using the IRODSFileFactory, which is described [here](irodsfilefactory.md).

The basic facility to stream files from iRODS is the Jargon version of the standard java.io.FileInputStream: the [IRODSFileInputStream](https://github.com/DICE-UNC/jargon/blob/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io/IRODSFileInputStream.java).
Once created by the IRODSFileFactory, it can be used in the same fashion as the standard java.io.FileInputStream. This example is based on the IRODSFileInputStream [unit test](https://github.com/DICE-UNC/jargon/blob/master/jargon-core/src/test/java/org/irods/jargon/core/pub/io/IRODSFileInputStreamTest.java):

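```
import java.io.ByteArrayOutputStream;
import org.irods.jargon.core.pub.io.IRODSFile;
import org.irods.jargon.core.pub.io.IRODSFileFactory;
import org.irods.jargon.core.pub.io.IRODSFileInputStream;

// Obtain the file factory for this account, then the file and its input stream
IRODSFileFactory irodsFileFactory = accessObjectFactory
        .getIRODSFileFactory(irodsAccount);
IRODSFile irodsFile = irodsFileFactory.instanceIRODSFile(
        targetIrodsCollection, testFileName);
IRODSFileInputStream fis = irodsFileFactory
        .instanceIRODSFileInputStream(irodsFile);

ByteArrayOutputStream actualFileContents = new ByteArrayOutputStream();

int bytesRead = 0;
int readLength = 0;
byte[] readBytesBuffer = new byte[1024];
// read() returns the number of bytes placed in the buffer, or -1 at end of stream
while ((readLength = fis.read(readBytesBuffer)) > -1) {
    // Write only the bytes actually read; the final read is usually a partial buffer
    actualFileContents.write(readBytesBuffer, 0, readLength);
    bytesRead += readLength;
}
fis.close();
```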

All of the standard i/o contracts are honored by the Jargon i/o classes, and standard buffering techniques and stream wrapping operations
are supported (e.g. wrapping a stream with a buffer or reader). Use them just like the standard java.io classes. There are also
several variants on the basic stream in the [core.pub.io](https://github.com/DICE-UNC/jargon/tree/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io) package.
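
As a rough illustration of the stream wrapping mentioned above, the sketch below wraps an arbitrary `InputStream` in an `InputStreamReader` and a `BufferedReader` using only standard java.io classes. The class name `StreamWrappingSketch` and the method `readLines` are hypothetical names made up for this example, and a `ByteArrayInputStream` stands in for the iRODS stream so the snippet runs without a grid. The assumption is that an IRODSFileInputStream obtained from the IRODSFileFactory could be passed in the same way, since it honors the standard stream contract.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StreamWrappingSketch {

    // Reads all lines from any InputStream by wrapping it in a reader (for
    // character decoding) and a buffer (for efficient line-oriented reads).
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader, which also closes the wrapped stream
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in source (assumption): with Jargon this would be the
        // IRODSFileInputStream created by the IRODSFileFactory.
        InputStream in = new ByteArrayInputStream(
                "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(in)) {
            System.out.println(line);
        }
    }
}
```

Because the wrapper only depends on `java.io.InputStream`, the same code path can be exercised in a unit test with an in-memory stream and then used unchanged against a real grid.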

Here are a few highlights of that i/o library worth noting:

* [IRODSFileReader](https://github.com/DICE-UNC/jargon/blob/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io/IRODSFileReader.java) reads character data and handles character encoding
* [IRODSRandomAccessFile](https://github.com/DICE-UNC/jargon/blob/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io/IRODSRandomAccessFile.java) for random i/o operations
* [PackingIrodsInputStream](https://github.com/DICE-UNC/jargon/blob/master/jargon-core/src/main/java/org/irods/jargon/core/pub/io/PackingIrodsInputStream.java) is an enhanced IRODSFileInputStream that does read-ahead buffering (its output stream counterpart does write-behind buffering), and is highly recommended. It's used in REST, WebDAV, and the Cloud Browser. See the section below on stream i/o performance.

### Stream i/o performance

Standard 'put/get' transfer operations typically outperform streaming operations. There are several reasons, but in a nutshell the protocol for a transfer is
one message that says "here comes the data", after which the raw bytes are sent down the pipe. For streaming i/o, each individual read of a buffer is
a complete protocol request, and this ends up being much slower. This is especially true when the buffer
used for the read operation is small. If, for example, a program reads from a stream in 8K increments,
the protocol overhead of each request is amortized over only 8K of data. The PackingIrodsInputStream and its output stream counterpart allow a
program to read and write in smaller buffer sizes, but under the covers accumulate a much larger byte buffer before
making a call to iRODS. This amortizes the protocol overhead over a much bigger payload.
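
The amortization argument can be made concrete with a small simulation. This is only a sketch under stated assumptions: it does not use Jargon, `CountingInputStream` is a hypothetical stand-in that treats every bulk read as one "protocol request", and the standard `java.io.BufferedInputStream` plays the role of a packing stream with a large internal buffer. It is not the implementation of PackingIrodsInputStream, only an illustration of why a larger hidden buffer reduces the number of round trips.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadAmortizationSketch {

    // Hypothetical stand-in for a remote stream: each bulk read counts as one request.
    static class CountingInputStream extends FilterInputStream {
        int requests = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            requests++;
            return super.read(b, off, len);
        }
    }

    // Reads the stream to the end using a caller buffer of the given size;
    // returns the total number of bytes read.
    static int drain(InputStream in, int callerBufferSize) {
        byte[] buffer = new byte[callerBufferSize];
        int total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Requests issued when the caller reads the source directly in small chunks.
    static int requestsUnbuffered(int dataSize, int callerBufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        drain(source, callerBufferSize);
        return source.requests;
    }

    // Requests issued when the same small reads go through a large intermediate buffer.
    static int requestsBuffered(int dataSize, int callerBufferSize, int packingBufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        drain(new BufferedInputStream(source, packingBufferSize), callerBufferSize);
        return source.requests;
    }

    public static void main(String[] args) {
        int dataSize = 1 << 20; // 1 MiB of data, read by the caller 1 KiB at a time
        System.out.println("direct requests: " + requestsUnbuffered(dataSize, 1024));
        System.out.println("packed requests: " + requestsBuffered(dataSize, 1024, 256 * 1024));
    }
}
```

The caller-side code is identical in both cases; only the size of the buffer sitting between the caller and the source changes, which is the same trade the packing streams make on behalf of the program.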