Commit d87e9b0

improvements
1 parent eff0eeb commit d87e9b0

16 files changed: +144 -57 lines changed

packages/README.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

packages/blake3-wasm/README.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+JS and WASM implementations of https://github.com/BLAKE3-team/BLAKE3
+
+Using [AssemblyScript](https://www.assemblyscript.org/) to generate a lean WASM.
+
+## Usage
+
+```javascript
+import { blake3, blake3Hex, createHasher, update, finalize } from '@huggingface/blake3-wasm';
+
+// Create a Uint8Array of data to hash
+const data = new Uint8Array(1_000_000); // Example: 1MB of data
+// ... fill data with your content ...
+
+const hashUint8 = blake3(data);
+const hashHex = blake3Hex(data);
+
+// Or in streaming fashion
+const hasher = createHasher();
+
+for (const chunk of dataSource) {
+  update(hasher, chunk);
+}
+
+const hash = finalize(hasher);
+```
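`blake3Hex` covers the one-shot case, but the streaming path ends with raw bytes. A minimal sketch of hex-encoding such a digest follows; `toHex` is a hypothetical helper that the package does not export, and the `digest` array is only a stand-in for the 32 bytes that `finalize` returns.

```javascript
// Hypothetical helper (not part of the package): hex-encode a byte array.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Stand-in for a 32-byte digest; a real one would come from finalize(hasher).
const digest = new Uint8Array(32);
digest[0] = 0xab;
digest[31] = 0x0f;

const hex = toHex(digest);
console.log(hex.length); // 64 hex characters for 32 bytes
```

Each byte maps to exactly two hex characters, so a 32-byte digest always yields a 64-character string.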

packages/blake3-wasm/assembly/blake3.ts

Lines changed: 14 additions & 0 deletions
@@ -371,3 +371,17 @@ export function blake3Hex(input: Uint8Array): string {
   }
   return hex.join("");
 }
+
+export function createHasher(): Blake3Hasher {
+  return new Blake3Hasher();
+}
+
+export function update(hasher: Blake3Hasher, input: Uint8Array): void {
+  hasher.update(input);
+}
+
+export function finalize(hasher: Blake3Hasher): Uint8Array {
+  const output = new Uint8Array(32);
+  hasher.finalize(output);
+  return output;
+}
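The three new exports wrap the class in free functions: the caller holds a hasher handle and passes it back into `update` and `finalize`, and `finalize` allocates the 32-byte output itself. The sketch below mirrors that shape in plain JavaScript. `FakeHasher` and its byte-sum "digest" are illustrative assumptions only: it is not BLAKE3 and not the real `Blake3Hasher`.

```javascript
// Stand-in for Blake3Hasher. NOT a real hash: it only mirrors the method shapes.
class FakeHasher {
  constructor() {
    this.sum = 0;
  }
  update(input) {
    for (const b of input) this.sum = (this.sum + b) & 0xff;
  }
  // Writes into a caller-supplied buffer, like the method the wrapper calls.
  finalize(output) {
    output.fill(this.sum);
  }
}

// Free-function wrappers with the same signatures as the new exports.
function createHasher() {
  return new FakeHasher();
}
function update(hasher, input) {
  hasher.update(input);
}
function finalize(hasher) {
  const output = new Uint8Array(32); // fixed 32-byte output, as in the diff
  hasher.finalize(output);
  return output;
}

const h = createHasher();
update(h, new Uint8Array([1, 2]));
update(h, new Uint8Array([3]));
const out = finalize(h);
console.log(out.length, out[0]);
```

With this shape the caller never has to allocate or size the output buffer.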
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 {
-  "extends": "../node_modules/.pnpm/assemblyscript@0.27.37/node_modules/assemblyscript/std/assembly.json",
+  "extends": "../node_modules/.pnpm/assemblyscript@0.27.36/node_modules/assemblyscript/std/assembly.json",
   "include": ["./**/*.ts"]
 }

packages/blake3-wasm/package.json

Lines changed: 1 addition & 1 deletion
@@ -28,6 +28,6 @@
     }
   },
   "devDependencies": {
-    "assemblyscript": "^0.27.36"
+    "assemblyscript": "0.27.36"
   }
 }

packages/blake3-wasm/pnpm-lock.yaml

Lines changed: 5 additions & 5 deletions
Some generated files are not rendered by default.

packages/gearhash-wasm/README.md

Lines changed: 2 additions & 3 deletions
@@ -11,15 +11,14 @@ import { nextMatch } from '@huggingface/gearhash-wasm';
 const data = new Uint8Array(1000000); // Example: 1MB of data
 // ... fill data with your content ...

-// Search for a pattern with a specific mask
-const mask = 0x0000d90003530000n; // Example mask as a BigInt
+const mask = 0x0000d90003530000n; // Example mask as a BigInt, more set bits => bigger chunks
 const match = nextMatch(data, mask);
 const allMatches = nextMatches(data, mask).matches;
 ```

 The `nextMatch` function takes two parameters:
 - `data`: A Uint8Array containing the data to search through
-- `mask`: A BigInt representing the pattern mask to search for
+- `mask`: A BigInt; the more bits are set, the bigger the chunks are on average

 The function returns an object with the `position` (i32) and `hash` (u64) properties.
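How a mask relates to chunk size can be estimated from its number of set bits. The sketch below assumes a boundary is reported where `(hash & mask) === 0n`, which is the usual gear-hash convention but is not confirmed by this diff. Under that assumption each set bit halves the match probability, so the expected distance between matches is about `2 ** popcount(mask)` bytes. `popcount64` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: count the set bits of a 64-bit BigInt mask.
function popcount64(mask) {
  let m = BigInt.asUintN(64, mask);
  let n = 0;
  while (m > 0n) {
    n += Number(m & 1n);
    m >>= 1n;
  }
  return n;
}

const mask = 0x0000d90003530000n; // the example mask from the README
const bits = popcount64(mask);

// Assumption: a match requires all masked bits of the hash to be zero,
// so a match is expected roughly once every 2^bits bytes.
const expectedAverage = 2 ** bits;
console.log(bits, expectedAverage); // 11 2048
```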

packages/gearhash-wasm/package.json

Lines changed: 1 addition & 1 deletion
@@ -28,6 +28,6 @@
     }
   },
   "devDependencies": {
-    "assemblyscript": "^0.27.36"
+    "assemblyscript": "0.27.36"
   }
 }

packages/gearhash-wasm/pnpm-lock.yaml

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

packages/xetchunk-wasm/README.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+JS and WASM implementations of https://github.com/huggingface/xet-core/blob/main/deduplication/src/chunking.rs
+
+Using [AssemblyScript](https://www.assemblyscript.org/) to generate a lean WASM.
+
+## Usage
+
+```javascript
+import { createChunker, getChunks, nextBlock, finalize } from '@huggingface/xetchunk-wasm';
+
+const TARGET_CHUNK_SIZE = Math.pow(2, 12);
+
+// Create a Uint8Array of data to split into chunks
+const data = new Uint8Array(1000000); // Example: 1MB of data
+// ... fill data with your content ...
+
+const chunks = getChunks(data, TARGET_CHUNK_SIZE);
+
+// Alternatively, if your data is streaming
+const chunker = createChunker(TARGET_CHUNK_SIZE);
+
+for await (const data of source) {
+  const chunks = nextBlock(chunker, data);
+  console.log(chunks);
+}
+
+console.log("last chunk", finalize(chunker));
+```
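The streaming loop iterates over `source`, which the README leaves undefined. As a minimal sketch, one way to get such a source for testing is to slice an in-memory buffer into fixed-size blocks. `blocksOf` and the 64 KiB block size are hypothetical choices, not part of the package.

```javascript
// Hypothetical helper: yield successive views over `data`, each at most
// `blockSize` bytes long. subarray() creates views, so no bytes are copied.
function* blocksOf(data, blockSize) {
  for (let offset = 0; offset < data.length; offset += blockSize) {
    yield data.subarray(offset, offset + blockSize);
  }
}

const data = new Uint8Array(1000000); // Example: 1MB of data
const blocks = [...blocksOf(data, 64 * 1024)];

console.log(blocks.length); // 16 blocks: 15 full ones plus a shorter tail
console.log(blocks[blocks.length - 1].length); // 16960
```

`for await` also accepts synchronous iterables, so `blocksOf(data, 64 * 1024)` could stand in for `source` in the loop.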
