@kleinschmidt commented Oct 31, 2025

This implementation is my (i.e. a TypeScript non-knower's) attempt to pattern match the
other codecs here along with @manzt's suggestion in
manzt/numcodecs.js#49 (comment).

One thing I noticed is that the ArrayArrayCodec type has a single type
parameter (I think?). Does that imply that all array-array codecs must output
the same type that they accept as input? If so, I think that's probably too
restrictive: the Python fixedscaleoffset codec explicitly supports encoding to a
different type than the input uses, and my intended use case is to decode
int16-quantized floats.
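
For readers unfamiliar with the codec, here is a minimal sketch of the arithmetic involved, assuming the same convention as Python numcodecs (encode as `round((x - offset) * scale)`, decode as `x / scale + offset`). The function and interface names are hypothetical and this is not the code in this PR; it only illustrates why the input and output element types differ.

```typescript
// Sketch only: mirrors the arithmetic of Python numcodecs' FixedScaleOffset.
// All names here are hypothetical, not taken from this PR.
interface ScaleOffset {
  offset: number;
  scale: number;
}

// Encode float32 -> int16 as round((x - offset) * scale).
// Note: Math.round rounds halves toward +Infinity, which can differ from
// NumPy's round-half-to-even, and out-of-range values wrap in an Int16Array.
function encodeFixedScaleOffset(
  data: Float32Array,
  { offset, scale }: ScaleOffset,
): Int16Array {
  const out = new Int16Array(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = Math.round((data[i] - offset) * scale);
  }
  return out;
}

// Decode int16 -> float32 as x / scale + offset.
function decodeFixedScaleOffset(
  data: Int16Array,
  { offset, scale }: ScaleOffset,
): Float32Array {
  const out = new Float32Array(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = data[i] / scale + offset;
  }
  return out;
}
```

The encoder takes a `Float32Array` and returns an `Int16Array`, so a codec interface with a single element-type parameter cannot describe both sides of this transform.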


changeset-bot bot commented Oct 31, 2025

⚠️ No Changeset found

Latest commit: 6e0129b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Comment on lines +7 to +8
dtype: string;
astype?: string;

Zarr v3 data types can be either a string or a JSON object of the form {name: string, configuration: object}. See an example here
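
One way the two shapes described above could be typed is sketched below. `DataTypeMetadata` and `dataTypeName` are hypothetical names introduced for illustration, and the object-form example uses a made-up data type name.

```typescript
// Hypothetical type: a Zarr v3 data type is either a plain string
// or an object with a name and a configuration.
type DataTypeMetadata =
  | string
  | { name: string; configuration: Record<string, unknown> };

// Returns the identifying name regardless of which form was used.
function dataTypeName(dtype: DataTypeMetadata): string {
  return typeof dtype === "string" ? dtype : dtype.name;
}

// Usage: both forms resolve to a name.
const simple: DataTypeMetadata = "int16";
const extended: DataTypeMetadata = {
  name: "example.dtype", // made-up extension name
  configuration: {},
};
console.log(dataTypeName(simple), dataTypeName(extended));
```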

Author

this one i think needs to be restricted to number types; can numeric types also have that format? is that how endianness is stored?

Author

ah I see in the spec:

Each data type is associated with an identifier, which can be used in metadata documents to refer to the data type. For the data types defined in this specification, the identifier is a simple ASCII string. However, extensions may use any JSON value to identify a data type.

Author

and endianness is specified as a bytes codec, I'm inferring.

yeah the bytes codec sets the endianness of the encoded data, but for decoded data, endianness is up to the implementation
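
To illustrate the point about encoded byte order, here is a small sketch that reads int16 values from raw bytes with an explicit endianness, assuming a bytes-codec setting of `"little"` or `"big"`. `decodeInt16` is a hypothetical helper, not a zarrita function; the resulting `Int16Array` uses whatever in-memory layout the platform prefers.

```typescript
// Hypothetical helper: interpret encoded bytes as int16 values using an
// explicit byte order, independent of the host platform's endianness.
function decodeInt16(bytes: Uint8Array, endian: "little" | "big"): Int16Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Int16Array(Math.floor(bytes.byteLength / 2));
  for (let i = 0; i < out.length; i++) {
    // The second argument of getInt16 is true for little-endian input.
    out[i] = view.getInt16(i * 2, endian === "little");
  }
  return out;
}
```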

@kleinschmidt changed the title from "Implement decode-only fixedscaleoffset codec" to "Implement fixedscaleoffset codec" on Nov 1, 2025
  #TypedArrayOut: TypedArrayConstructor<A>

  constructor(configuration: FixedScaleOffsetConfig) {
    const { data_type } = coerce_dtype(configuration.dtype);
Author

testing on v3 has made me realize that coerce_dtype is really only meant for v2 data type strings, so this will need to be adjusted.
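
One possible adjustment, sketched under the assumption that v2 metadata uses NumPy-style strings such as `<i2` while v3 uses names such as `int16`, is to normalize both to a single form before choosing a typed array. `normalizeDtype` and `V2_TO_V3` are hypothetical and cover only a few numeric types.

```typescript
// Hypothetical lookup from NumPy-style type codes (v2) to v3-style names.
const V2_TO_V3: Record<string, string> = {
  i1: "int8", i2: "int16", i4: "int32",
  u1: "uint8", u2: "uint16", u4: "uint32",
  f4: "float32", f8: "float64",
};

// Hypothetical helper: accept either a v2 string ("<i2") or a v3 name ("int16")
// and return the v3-style name.
function normalizeDtype(dtype: string): string {
  // v2 strings start with a byte-order marker: "<", ">" or "|".
  const code = /^[<>|]([iuf][1248])$/.exec(dtype)?.[1];
  if (code !== undefined) {
    const name = V2_TO_V3[code];
    if (name === undefined) {
      throw new Error(`Unsupported v2 dtype: ${dtype}`);
    }
    return name;
  }
  // Anything else is assumed to already be a v3 name.
  return dtype;
}
```

The byte-order marker is dropped here on purpose, since the decoded in-memory representation does not need it.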

@kleinschmidt
Author

I think resolving the v2 vs. v3 issues may be trickier than I'm prepared to take on right now. Would you be willing to accept v2-only support @manzt ?

The numcodecs.zarr3 python functions generate a different ID (numcodecs.fixedscaleoffset vs. fixedscaleoffset in v2) so leaving this as-is will simply error.
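
If v3 support were added later, one way to accept both IDs would be to strip the `numcodecs.` prefix before looking up the codec. This is only a sketch of that idea; `normalizeCodecId` is a hypothetical helper and not something this PR or the library provides.

```typescript
// Hypothetical helper: map "numcodecs.fixedscaleoffset" (v3 ID) and
// "fixedscaleoffset" (v2 ID) to the same lookup key.
function normalizeCodecId(id: string): string {
  const prefix = "numcodecs.";
  return id.startsWith(prefix) ? id.slice(prefix.length) : id;
}
```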
