Commit 3949a69

tpietzsch, bogovicj, and cmhulbert authored
ReadData and DataBlockCodec (#137)
* Add N5ReadBenchmark
* remove redundant modifiers
* WIP: remove @Override annotations for ByteBuffer methods
  to be able to turn those methods on/off in the DataBlock interface without causing compile errors in the implementations.
* Fix benchmark image URL
* Add byte[] DataType.createSerializeArray(numElements)
* WIP Compression / Codec API
* WIP encode/decode implementations
* bugfix
* cleanup
* wip
* feat: DataBlock methods to read/write directly from DataInput/Output
* WIP: SplittableReadData
* WIP DataBlock readData variants
* WIP SplittableReadData
* WIP clean up Compression interface
* WIP speed up DataBlock.readData(InputStream), and clean up
* WIP DataBlock.writeData(OutputStream)
* WIP use Splittable.ReadData in DefaultBlockReader
* WIP revise StringDataBlock (doesn't work yet)
* WIP revise StringDataBlock
  revert to holding serialized and actual data. otherwise we can't be compatible with existing N5 datasets.
* Add ByteOrder to DataBlock.writeData()
* Clean up
* Clean up
* fix javadoc error
* switch from byte[] to ByteBuffer for DataBlock.de/serialize()
* Add explicit commons-io dependency
* Lean more on ReadData
* Move ReadData etc to separate classes
* Add ReadData.writeTo(OutputStream)
* Add EncodedReadData that wraps a ReadData and an OutputStreamEncoder
* Compression (BytesCodec) can encode ReadData.
  (This might happen immediately or later when the ReadData is written to OutputStream).
* WIP ReadData.decode(Codec)
* Add decode(ReadData readData) variant that knows the length of the decoded data
* Hide ReadData implementations, use static ReadData factory methods instead
* Remove deprecated BlockWriter/BlockReader interfaces
* Clean up
* javadoc
* Remove old Compression methods.
  Everything goes through ReadData
* Remove DataBlock de/serialize() methods
  Maybe we'll rename readData()/writeData() later, the point is to remove the ByteBuffer methods from the API
* fix javadoc error
* Add ReadData method toByteBuffer()
* clean up
* Use OutputStream wrapper to intercept close() in EncodedReadData
  We still need a custom interface "OutputStreamOperator" because we want to throw IOException and UnaryOperator::apply doesn't.
* ReadData.encode/decode methods forwarding to BytesCodec
* remove BytesCodec interface and put encode/decode into Compression for now
* idea
* Add LazyReadData and OutputStreamWriter
  When data is requested from the LazyReadData, the LazyReadData will ask its OutputStreamWriter to write the data to a ByteArrayOutputStream. When the LazyReadData itself is written to an OutputStream, it will pass that OutputStream to its OutputStreamWriter (without loading the data into a byte[] array first).
* WIP Codecs
* WIP Codecs
* WIP Codecs
* WIP use Codecs, remove serialization methods from DataBlock
* Use ProxyOutputStream.
  FilterOutputStream is slow
* use only array type in DataBlockCodec generics
* refactor
* refactor
* convenience DataCodec methods to get codec with appropriate endianness
* typo
* refactor
* refactor
* Let DatasetAttributes provide the DataBlockCodec
  avoids the Compression argument to encode/decode methods
* refactor, add ReadData.materialize()
  This is in preparation for moving SplittableReadData into a separate PR
* Remove SplittableReadData interface
* fix javadoc error
* rename ChunkHeader to BlockHeader
* Remove unnecessary flush()s, instead close OutputStream where it is constructed
* Refactor for compatibility with upcoming Codecs Release (#2)
* refactor: don't pass `decodedLength` to ReadData
* refactor: move DataBlockFactory/DataBlockCodecFactory to N5Codecs
* feat: Add StringDataCodec, ObjectDataCodec, StringDataBlockCodec, ObjectDataBlockCodec
  DataCodecs now create the access object and return it during `deserialize`
* refactor: add AbstractDataBlock to extract shared logic between Default/String/Object blocks
* refactor: DatasetAttributes responsible for DataBlockCodec creation
  N5BlockCodec uses dataType (and potentially other DatasetAttributes) to wrap the desired DataBlockCodec
* revert: add back createDataBlock logic in DataType
* doc: retain javadoc from before refactor
* revert: keep protected constructor with DataBlockCodec parameter
  refactor: inline createDataBlockCodec from constructor params
* refactor: remove currently unused N5BlockCodec.
  Something like this may be needed when multiple codecs are supported
* refactor: don't expose N5Codecs internals
* refactor: rename encodeBlockHeader -> createBlockHeader
* feat: add ZarrStringDataCodec support
* ZarrStringDataCodec always LITTLE_ENDIAN
* fix javadoc
* clean up visibility and formatting
* bugfix
* improve readability

---------

Co-authored-by: John Bogovic <bogovicj@janelia.hhmi.org>
Co-authored-by: Caleb Hulbert <cmhulbert@users.noreply.github.com>
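
The LazyReadData behaviour described in the commit message (materialize through a ByteArrayOutputStream when data is requested, stream directly when written to an OutputStream) can be sketched roughly as follows. This is only an illustration under assumptions: `StreamWriterSketch` and `LazyReadDataSketch` are hypothetical names, not the classes added by this commit, and caching the materialized bytes is a choice made for the sketch rather than something the commit message states.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the OutputStreamWriter callback named in the commit message.
interface StreamWriterSketch {
    void writeTo(OutputStream out) throws IOException;
}

// Hypothetical sketch of a lazily materialized data holder; not the N5 class.
final class LazyReadDataSketch {

    private final StreamWriterSketch writer;
    private byte[] bytes; // filled on the first call to allBytes() (assumption: cache the result)

    LazyReadDataSketch(final StreamWriterSketch writer) {
        this.writer = writer;
    }

    // Data requested: let the writer fill a ByteArrayOutputStream, then keep the bytes.
    byte[] allBytes() throws IOException {
        if (bytes == null) {
            final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writer.writeTo(buffer);
            bytes = buffer.toByteArray();
        }
        return bytes;
    }

    // Written to a stream: reuse materialized bytes if present, otherwise let the
    // writer stream straight into the target without building a byte[] first.
    void writeTo(final OutputStream out) throws IOException {
        if (bytes != null)
            out.write(bytes);
        else
            writer.writeTo(out);
    }

    public static void main(final String[] args) throws IOException {
        final LazyReadDataSketch data = new LazyReadDataSketch(
                out -> out.write("block".getBytes(StandardCharsets.UTF_8)));
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        data.writeTo(sink); // streamed directly, nothing materialized yet
        System.out.println(sink.size() + " bytes written");
    }
}
```

The point of the two paths is that a block which is only ever forwarded to storage never has to exist as a full in-memory array.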
1 parent 7724ced commit 3949a69

35 files changed: +1542 -601 lines changed

pom.xml

Lines changed: 16 additions & 0 deletions
@@ -202,6 +202,22 @@
             <groupId>org.apache.commons</groupId>
             <artifactId>commons-compress</artifactId>
         </dependency>
+        <dependency>
+            <groupId>commons-io</groupId>
+            <artifactId>commons-io</artifactId>
+        </dependency>
+
+        <!-- JMH -->
+        <dependency>
+            <groupId>org.openjdk.jmh</groupId>
+            <artifactId>jmh-core</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.openjdk.jmh</groupId>
+            <artifactId>jmh-generator-annprocess</artifactId>
+            <scope>test</scope>
+        </dependency>
     </dependencies>

     <repositories>

src/main/java/org/janelia/saalfeldlab/n5/AbstractDataBlock.java

Lines changed: 19 additions & 5 deletions
@@ -25,6 +25,8 @@
  */
 package org.janelia.saalfeldlab.n5;

+import java.util.function.ToIntFunction;
+
 /**
  * Abstract base class for {@link DataBlock} implementations.
  *
@@ -35,15 +37,21 @@
  */
 public abstract class AbstractDataBlock<T> implements DataBlock<T> {

-    protected final int[] size;
-    protected final long[] gridPosition;
-    protected final T data;
+    private final int[] size;
+    private final long[] gridPosition;
+    private final T data;
+    private final ToIntFunction<T> numElements;

-    public AbstractDataBlock(final int[] size, final long[] gridPosition, final T data) {
+    public AbstractDataBlock(
+            final int[] size,
+            final long[] gridPosition,
+            final T data,
+            final ToIntFunction<T> numElements) {

         this.size = size;
         this.gridPosition = gridPosition;
         this.data = data;
+        this.numElements = numElements;
     }

     @Override
@@ -63,4 +71,10 @@ public T getData() {

         return data;
     }
-}
+
+    @Override
+    public int getNumElements() {
+
+        return numElements.applyAsInt(data);
+    }
+}
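
The change above moves `getNumElements()` into the base class by passing a `ToIntFunction<T>` to the constructor. A minimal, self-contained sketch of that pattern is shown here; `BlockSketch`, `ByteBlockSketch`, and `StringBlockSketch` are hypothetical names used only for illustration and omit `size` and `gridPosition`.

```java
import java.util.function.ToIntFunction;

// Hypothetical, simplified block holder; it only mirrors the numElements idea from the diff.
abstract class BlockSketch<T> {

    private final T data;
    private final ToIntFunction<T> numElements;

    BlockSketch(final T data, final ToIntFunction<T> numElements) {
        this.data = data;
        this.numElements = numElements;
    }

    T getData() {
        return data;
    }

    // The base class does not need to know the concrete array type of T.
    int getNumElements() {
        return numElements.applyAsInt(data);
    }
}

// Subclasses supply the length function once, in the super(...) call.
final class ByteBlockSketch extends BlockSketch<byte[]> {
    ByteBlockSketch(final byte[] data) {
        super(data, a -> a.length);
    }
}

final class StringBlockSketch extends BlockSketch<String[]> {
    StringBlockSketch(final String[] data) {
        super(data, a -> a.length);
    }
}
```

Because Java generics cannot express "any array type" with a common `length` accessor, handing in the length function is one way to avoid a per-subclass override, as `ByteArrayDataBlock` does with `a -> a.length`.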

src/main/java/org/janelia/saalfeldlab/n5/BlockReader.java

Lines changed: 0 additions & 54 deletions
This file was deleted.

src/main/java/org/janelia/saalfeldlab/n5/BlockWriter.java

Lines changed: 0 additions & 52 deletions
This file was deleted.

src/main/java/org/janelia/saalfeldlab/n5/ByteArrayDataBlock.java

Lines changed: 1 addition & 22 deletions
@@ -25,31 +25,10 @@
  */
 package org.janelia.saalfeldlab.n5;

-import java.nio.ByteBuffer;
-
 public class ByteArrayDataBlock extends AbstractDataBlock<byte[]> {

     public ByteArrayDataBlock(final int[] size, final long[] gridPosition, final byte[] data) {

-        super(size, gridPosition, data);
-    }
-
-    @Override
-    public ByteBuffer toByteBuffer() {
-
-        return ByteBuffer.wrap(getData());
-    }
-
-    @Override
-    public void readData(final ByteBuffer buffer) {
-
-        if (buffer.array() != getData())
-            buffer.get(getData());
-    }
-
-    @Override
-    public int getNumElements() {
-
-        return data.length;
+        super(size, gridPosition, data, a -> a.length);
     }
 }

src/main/java/org/janelia/saalfeldlab/n5/Bzip2Compression.java

Lines changed: 11 additions & 26 deletions
@@ -27,14 +27,13 @@

 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
-
 import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
 import org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream;
 import org.janelia.saalfeldlab.n5.Compression.CompressionType;
+import org.janelia.saalfeldlab.n5.readdata.ReadData;

 @CompressionType("bzip2")
-public class Bzip2Compression implements DefaultBlockReader, DefaultBlockWriter, Compression {
+public class Bzip2Compression implements Compression {

     private static final long serialVersionUID = -4873117458390529118L;

@@ -52,36 +51,22 @@ public Bzip2Compression() {
     }

     @Override
-    public InputStream getInputStream(final InputStream in) throws IOException {
-
-        return new BZip2CompressorInputStream(in);
-    }
-
-    @Override
-    public OutputStream getOutputStream(final OutputStream out) throws IOException {
-
-        return new BZip2CompressorOutputStream(out, blockSize);
-    }
-
-    @Override
-    public Bzip2Compression getReader() {
+    public boolean equals(final Object other) {

-        return this;
+        if (other == null || other.getClass() != Bzip2Compression.class)
+            return false;
+        else
+            return blockSize == ((Bzip2Compression)other).blockSize;
     }

     @Override
-    public Bzip2Compression getWriter() {
+    public ReadData decode(final ReadData readData) throws IOException {

-        return this;
+        return ReadData.from(new BZip2CompressorInputStream(readData.inputStream()));
     }

     @Override
-    public boolean equals(final Object other) {
-
-        if (other == null || other.getClass() != Bzip2Compression.class)
-            return false;
-        else
-            return blockSize == ((Bzip2Compression)other).blockSize;
+    public ReadData encode(final ReadData readData) {
+        return readData.encode(out -> new BZip2CompressorOutputStream(out, blockSize));
     }
-
 }
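
The new `encode` passes a lambda that wraps an OutputStream in a compressing stream, and the commit message notes that a custom "OutputStreamOperator" interface is needed because `UnaryOperator::apply` cannot throw IOException. The sketch below illustrates that idea in isolation. It is not the N5 implementation: `StreamOperatorSketch` and `CodecSketch` are hypothetical names, plain byte arrays stand in for ReadData, and `java.util.zip` GZIP streams are swapped in for the bzip2 streams so that the example needs no external dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical counterpart of the OutputStreamOperator mentioned in the commit message:
// shaped like UnaryOperator<OutputStream>, but allowed to throw IOException.
@FunctionalInterface
interface StreamOperatorSketch {
    OutputStream apply(OutputStream out) throws IOException;
}

final class CodecSketch {

    // Encode: wrap the sink with the operator, write the raw bytes, and close the
    // wrapper so the compressor can finish its output.
    static byte[] encode(final byte[] raw, final StreamOperatorSketch operator) throws IOException {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream encoded = operator.apply(sink)) {
            encoded.write(raw);
        }
        return sink.toByteArray();
    }

    // Decode: wrap the source in a decompressing stream and drain it.
    // GZIP is an assumption of this sketch; the commit uses BZip2CompressorInputStream.
    static byte[] decode(final byte[] compressed) throws IOException {
        final ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            final byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1)
                result.write(buffer, 0, n);
        }
        return result.toByteArray();
    }

    public static void main(final String[] args) throws IOException {
        final byte[] raw = new byte[1024]; // all zeros, compresses well
        final byte[] encoded = encode(raw, out -> new GZIPOutputStream(out));
        System.out.println(raw.length + " bytes in, " + decode(encoded).length + " bytes after round trip");
    }
}
```

A lambda such as `out -> new GZIPOutputStream(out)` would not compile against `UnaryOperator<OutputStream>`, since the constructor declares IOException; that is the gap the custom functional interface closes.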

src/main/java/org/janelia/saalfeldlab/n5/Compression.java

Lines changed: 41 additions & 7 deletions
@@ -25,13 +25,14 @@
  */
 package org.janelia.saalfeldlab.n5;

+import java.io.IOException;
 import java.io.Serializable;
 import java.lang.annotation.ElementType;
 import java.lang.annotation.Inherited;
 import java.lang.annotation.Retention;
 import java.lang.annotation.RetentionPolicy;
 import java.lang.annotation.Target;
-
+import org.janelia.saalfeldlab.n5.readdata.ReadData;
 import org.scijava.annotations.Indexable;

 /**
@@ -49,7 +50,7 @@ public interface Compression extends Serializable {
     @Inherited
     @Target(ElementType.TYPE)
     @Indexable
-    public static @interface CompressionType {
+    @interface CompressionType {

         String value();
     }
@@ -61,9 +62,9 @@ public interface Compression extends Serializable {
     @Retention(RetentionPolicy.RUNTIME)
     @Inherited
     @Target(ElementType.FIELD)
-    public static @interface CompressionParameter {}
+    @interface CompressionParameter {}

-    public default String getType() {
+    default String getType() {

         final CompressionType compressionType = getClass().getAnnotation(CompressionType.class);
         if (compressionType == null)
@@ -72,7 +73,40 @@ public default String getType() {
             return compressionType.value();
     }

-    public BlockReader getReader();
+    // --------------------------------------------------
+    //
+
+    /**
+     * Decode the given {@code readData}.
+     * <p>
+     * The returned decoded {@code ReadData} reports {@link ReadData#length()
+     * length()}{@code == decodedLength}. Decoding may be lazy or eager,
+     * depending on the {@code BytesCodec} implementation.
+     *
+     * @param readData
+     *            data to decode
+     *
+     * @return decoded ReadData
+     *
+     * @throws IOException
+     *             if any I/O error occurs
+     */
+    ReadData decode(ReadData readData) throws IOException;
+
+    /**
+     * Encode the given {@code readData}.
+     * <p>
+     * Encoding may be lazy or eager, depending on the {@code BytesCodec}
+     * implementation.
+     *
+     * @param readData
+     *            data to encode
+     *
+     * @return encoded ReadData
+     *
+     * @throws IOException
+     *             if any I/O error occurs
+     */
+    ReadData encode(ReadData readData) throws IOException;

-    public BlockWriter getWriter();
-}
+}
