286 changes: 262 additions & 24 deletions README.md
Expand Up @@ -10,7 +10,7 @@

# WHAT IS IT?

This library compares two arrays or objects and returns a full diff of their differences.
This library compares arrays, objects and texts and returns a full diff.

ℹ️ The documentation is also available on our [website](https://superdiff.gitbook.io/donedeal0-superdiff)!

Expand All @@ -20,7 +20,7 @@ This library compares two arrays or objects and returns a full diff of their dif

Most existing solutions return a confusing diff format that often requires extra parsing. They are also limited to object comparison.

**Superdiff** provides a complete and readable diff for both arrays **and** objects. Plus, it supports stream and file inputs for handling large datasets efficiently, is battle-tested, has zero dependencies, and is super fast.
**Superdiff** provides a complete and readable diff for **arrays**, **texts** and **objects**. Plus, it supports **stream** and file inputs for handling large datasets efficiently, is battle-tested, has zero dependencies, and is super fast.

Import. Enjoy. 👍

Expand All @@ -40,29 +40,20 @@ I am grateful to the generous donors of **Superdiff**!

<hr/>

## FEATURES
## API

**Superdiff** exports 5 functions:
**Superdiff** exports 6 functions:

```ts
// Returns a complete diff of two objects
getObjectDiff(prevObject, nextObject)

// Returns a complete diff of two arrays
getListDiff(prevList, nextList)

// Streams the diff of two object lists, ideal for large lists and maximum performance
streamListDiff(prevList, nextList, referenceProperty)

// Checks whether two values are equal
isEqual(dataA, dataB)
- [getObjectDiff](#getobjectdiff)
- [getListDiff](#getlistdiff)
- [streamListDiff](#streamlistdiff)
- [getTextDiff](#gettextdiff)
- [isEqual](#isequal)
- [isObject](#isobject)

// Checks whether a value is an object
isObject(data)
```
<hr/>

### getObjectDiff()
### getObjectDiff

```js
import { getObjectDiff } from "@donedeal0/superdiff";
Expand Down Expand Up @@ -202,7 +193,7 @@ getObjectDiff(
```
<hr/>

### getListDiff()
### getListDiff

```js
import { getListDiff } from "@donedeal0/superdiff";
Expand Down Expand Up @@ -305,7 +296,7 @@ getListDiff(
```
<hr/>

### streamListDiff()
### streamListDiff

```js
// If you are in a server environment
Expand Down Expand Up @@ -498,7 +489,217 @@ diff.on("error", (err) => console.log(err))

<hr/>

### isEqual()
### getTextDiff

```js
import { getTextDiff } from "@donedeal0/superdiff";
```

Compares two texts and returns a diff for each character, word or sentence, depending on your preference.

The output is optimized by default to produce a readable, visual diff (like GitHub or Git). A strict mode that tracks exact token moves and updates is also available.

All language subtleties (Unicode, CJK scripts, locale-aware sentence segmentation, etc.) are handled.

#### FORMAT

**Input**

```ts
previousText: string | null | undefined,
currentText: string | null | undefined,
options?: {
showOnly?: ("added" | "deleted" | "moved" | "updated" | "equal")[], // [] by default.
separation?: "character" | "word" | "sentence", // "word" by default
mode?: "visual" | "strict", // "visual" by default
ignoreCase?: boolean, // false by default
ignorePunctuation?: boolean, // false by default
locale?: Intl.Locale | string // English by default
}
```
- `previousText`: the original text.
- `currentText`: the new text.
- `options`
  - `showOnly`: returns only the values whose status you are interested in (e.g. `["added", "equal"]`).
    - `moved` and `updated` are only available in `strict` mode.
  - `separation`: whether you want a `character`, `word` or `sentence` based diff.
  - `mode`:
    - `visual` (default): optimized for readability. Token moves are ignored so insertions don’t cascade and break equality (recommended for UI diffing).
    - `strict`: tracks token moves exactly. Semantically precise, but noisier (a simple addition moves all the following tokens, breaking equality).
  - `ignoreCase`: if set to `true`, `hello` and `HELLO` will be considered equal.
  - `ignorePunctuation`: if set to `true`, `hello!` and `hello` will be considered equal.
  - `locale`: the locale of your text.
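
To make the effect of `ignoreCase` and `ignorePunctuation` concrete, here is a minimal sketch of how tokens could be normalized before comparison. It is an illustration only: `normalizeToken`, `tokensMatch` and the punctuation set are hypothetical and are not taken from Superdiff's source, which may handle far more cases (Unicode punctuation, locale-aware casing).

```ts
// Hypothetical options type mirroring two of the documented flags.
interface NormalizeOptions {
  ignoreCase?: boolean;
  ignorePunctuation?: boolean;
}

// Hypothetical helper: returns the form of a token used for comparison.
function normalizeToken(token: string, options: NormalizeOptions = {}): string {
  let normalized = token;
  if (options.ignoreCase) {
    normalized = normalized.toLowerCase();
  }
  if (options.ignorePunctuation) {
    // Assumption: a small ASCII punctuation set, for illustration only.
    normalized = normalized.replace(/[.,!?;:'"()]/g, "");
  }
  return normalized;
}

// Two tokens are considered equal when their normalized forms match.
function tokensMatch(a: string, b: string, options: NormalizeOptions = {}): boolean {
  return normalizeToken(a, options) === normalizeToken(b, options);
}
```

With both flags enabled, `tokensMatch("hello!", "HELLO", { ignoreCase: true, ignorePunctuation: true })` returns `true`, while the same call without options returns `false`.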

**Output**

```ts
type TextDiff = {
type: "text";
status: "added" | "deleted" | "equal" | "updated";
diff: {
value: string;
previousValue?: string;
status: "added" | "deleted" | "equal" | "moved" | "updated";
currentIndex: number | null;
previousIndex: number | null;
}[];
};
```
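
As an illustration of how this output shape can be consumed, the sketch below turns a `TextDiff` into a single annotated string. The `renderTextDiff` helper and its `[+…]` / `[-…]` / `[~…]` markers are hypothetical and not part of Superdiff; the type is copied from the output format above.

```ts
type TokenStatus = "added" | "deleted" | "equal" | "moved" | "updated";

type TextDiff = {
  type: "text";
  status: "added" | "deleted" | "equal" | "updated";
  diff: {
    value: string;
    previousValue?: string;
    status: TokenStatus;
    currentIndex: number | null;
    previousIndex: number | null;
  }[];
};

// Hypothetical helper (not exported by Superdiff): renders each token with a marker.
function renderTextDiff(result: TextDiff): string {
  return result.diff
    .map((token) => {
      switch (token.status) {
        case "added":
          return `[+${token.value}]`;
        case "deleted":
          return `[-${token.value}]`;
        case "updated":
          // An update carries both the old and the new token.
          return `[-${token.previousValue ?? ""}][+${token.value}]`;
        case "moved":
          return `[~${token.value}]`;
        default:
          return token.value;
      }
    })
    .join(" ");
}
```

A UI layer could apply the same switch to emit styled elements instead of bracketed text.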

#### USAGE

**VISUAL MODE**

`visual` is optimized for readability. Token moves are ignored so insertions don’t cascade and break equality (recommended for UI diffing). A token update is rendered as a pair of `deleted` and `added` entries.

This mode is based on a [longest common subsequence (LCS) computation](https://en.wikipedia.org/wiki/Longest_common_subsequence), similar to Git and GitHub diffs.

**Input**

```diff
getTextDiff(
- "The brown fox jumped high",
+ "The orange cat has jumped",
{ mode: "visual", separation: "word" }
);
```

**Output**

```diff
{
type: "text",
+ status: "updated",
diff: [
{
value: 'The',
status: 'equal',
currentIndex: 0,
previousIndex: 0
},
- {
- value: "brown",
- status: "deleted",
- currentIndex: null,
- previousIndex: 1,
- },
- {
- value: "fox",
- status: "deleted",
- currentIndex: null,
- previousIndex: 2,
- },
+ {
+ value: "orange",
+ status: "added",
+ currentIndex: 1,
+ previousIndex: null,
+ },
+ {
+ value: "cat",
+ status: "added",
+ currentIndex: 2,
+ previousIndex: null,
+ },
+ {
+ value: "has",
+ status: "added",
+ currentIndex: 3,
+ previousIndex: null,
+ },
{
value: "jumped",
status: "equal",
currentIndex: 4,
previousIndex: 3,
},
- {
- value: "high",
- status: "deleted",
- currentIndex: null,
- previousIndex: 4,
- }
],
}
```
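
The LCS idea behind the visual mode can be sketched in a few lines. This is not Superdiff's implementation: `lcsWordDiff` is a hypothetical helper that splits on whitespace only and ignores every option, but on the example above it yields the same sequence of statuses and indexes.

```ts
type VisualStatus = "added" | "deleted" | "equal";

interface VisualToken {
  value: string;
  status: VisualStatus;
  currentIndex: number | null;
  previousIndex: number | null;
}

// Hypothetical helper: word-level diff based on a longest common subsequence table.
function lcsWordDiff(previousText: string, currentText: string): VisualToken[] {
  const prev = previousText.split(/\s+/).filter(Boolean);
  const curr = currentText.split(/\s+/).filter(Boolean);

  // table[i][j] holds the LCS length of prev[i..] and curr[j..].
  const table: number[][] = [];
  for (let i = 0; i <= prev.length; i++) {
    table.push(new Array<number>(curr.length + 1).fill(0));
  }
  for (let i = prev.length - 1; i >= 0; i--) {
    for (let j = curr.length - 1; j >= 0; j--) {
      table[i][j] =
        prev[i] === curr[j]
          ? table[i + 1][j + 1] + 1
          : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }

  // Walk both token lists, emitting deletions before additions inside each gap.
  const diff: VisualToken[] = [];
  let i = 0;
  let j = 0;
  while (i < prev.length && j < curr.length) {
    if (prev[i] === curr[j]) {
      diff.push({ value: curr[j], status: "equal", currentIndex: j, previousIndex: i });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      diff.push({ value: prev[i], status: "deleted", currentIndex: null, previousIndex: i });
      i++;
    } else {
      diff.push({ value: curr[j], status: "added", currentIndex: j, previousIndex: null });
      j++;
    }
  }
  for (; i < prev.length; i++) {
    diff.push({ value: prev[i], status: "deleted", currentIndex: null, previousIndex: i });
  }
  for (; j < curr.length; j++) {
    diff.push({ value: curr[j], status: "added", currentIndex: j, previousIndex: null });
  }
  return diff;
}
```

Because tokens outside the common subsequence are only ever marked `added` or `deleted`, an insertion never changes the status of the tokens that follow it, which is the property that keeps this mode readable.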

**STRICT MODE**

`strict` tracks token moves exactly. Semantically precise, but noisier (a simple addition will move all the next tokens, breaking equality). It also considers direct token swaps as `updated`.

**Input**

```diff
getTextDiff(
- "The brown fox jumped high",
+ "The orange cat has jumped",
{ mode: "strict", separation: "word" }
);
```

**Output**

```diff
{
type: "text",
+ status: "updated",
diff: [
{
value: 'The',
status: 'equal',
currentIndex: 0,
previousIndex: 0
},
+ {
+ value: "orange",
+ previousValue: "brown",
+ status: "updated",
+ currentIndex: 1,
+ previousIndex: null,
+ },
+ {
+ value: "cat",
+ previousValue: "fox",
+ status: "updated",
+ currentIndex: 2,
+ previousIndex: null,
+ },
+ {
+ value: "has",
+ status: "added",
+ currentIndex: 3,
+ previousIndex: null,
+ },
+ {
+ value: "jumped",
+ status: "moved",
+ currentIndex: 4,
+ previousIndex: 3,
+ },
- {
- value: "high",
- status: "deleted",
- currentIndex: null,
- previousIndex: 4,
- }
],
}
```

#### TOKEN STATUSES

| Status | Represents | Index meaning |
| ------- | ------------- | --------------------------------------- |
| **equal** | same token | both indexes valid |
| **added** | new token | `previousIndex = null` |
| **deleted** | removed token | `currentIndex = null` |
| **moved** | same token (only in `strict` mode) | both indexes valid |
| **updated** | replacement (only in `strict` mode) | no shared identity, one index only |


<hr/>

### isEqual

```js
import { isEqual } from "@donedeal0/superdiff";
Expand Down Expand Up @@ -544,7 +745,7 @@ false;
```
<hr/>

### isObject()
### isObject

```js
import { isObject } from "@donedeal0/superdiff";
Expand Down Expand Up @@ -600,3 +801,40 @@ If you or your company uses **Superdiff**, please show your support by becoming
## CONTRIBUTING

Issues and pull requests are welcome!

## COMPETITORS

| Feature | Superdiff | deep-object-diff | deep-diff | diff |
| ------------------------------ | --------- | ------ | --------- | --------- |
| Unified API (object/list/text) | ✅ | ❌ | ❌ | ❌ |
| Text diff | ✅ | ✅ | ❌ | ✅ |
| Object diff | ✅ | ✅ | ✅ | ❌ |
| List diff | ✅ | ❌ | ❌ | ❌ |
| Streaming for huge datasets | ✅ | ❌ | ❌ | ❌ |
| Strict move detection | ✅ | ❌ | ❌ | ❌ |
| Zero dependencies | ✅ | ❌ | ❌ | ❌ |

## BENCHMARK

Environment: Node.js 24.12.0 (LTS) • macOS Sequoia 15.1 • MacBook Pro M2 (2023) • 16GB RAM.

Method: after warm-up runs, each script runs 20 times and we keep the median time.
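
The method can be sketched as a tiny harness. This is an assumption about its shape, not the code in the `benchmark` folder: `median` and `measureMedian` are hypothetical names, the warm-up count of 5 is arbitrary, and `Date.now` is used only as a dependency-free default clock (a higher-resolution timer would suit sub-millisecond work better).

```ts
// Median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("Cannot compute the median of an empty list");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Hypothetical harness: warm up, time `runs` executions, keep the median duration.
function measureMedian(
  task: () => void,
  runs: number = 20,
  warmups: number = 5, // assumption: the source does not state the warm-up count
  now: () => number = Date.now,
): number {
  for (let i = 0; i < warmups; i++) {
    task(); // warm-up runs are not timed
  }
  const timings: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    task();
    timings.push(now() - start);
  }
  return median(timings);
}
```

Keeping the median rather than the mean limits the influence of occasional slow runs caused by garbage collection or other background activity.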

### List diff

| Scenario | Superdiff | arr-diff | deep-diff |
| ------------------------- | --------- | -------- | --------- |
| 10k items array | **2.59 ms** | 47.52 ms | 5.49 ms |
| 100k items array | **36.44 ms** | 4836.25 ms | 62.30 ms |

Despite providing a more complex diff, Superdiff outperforms the competition on lists and scales linearly.

### Object diff

| Scenario | Superdiff | deep-object-diff | deep-diff |
| ------------------------------ | --------- | ---------------- | --------- |
| 10k flat object keys | **2.31 ms** | 2.46 ms | 39.56 ms |
| 100k flat object keys | **30.23 ms** | 31.86 ms | 3784.50 ms|
| 100k nested nodes | **4.25 ms** | 9.67 ms | 16.51 ms |

Despite providing a full structural diff with a richer output, Superdiff is the fastest. It also scales linearly, even with deeply nested data.
35 changes: 35 additions & 0 deletions benchmark/index.ts
@@ -0,0 +1,35 @@
// import { runListBench10K } from "./list-benchmark";
// import { runTextBench10K } from "./text-benchmark";
import {
  runObjectBench10K,
  // runObjectBench100K,
  // runNestedObjectBench,
} from "./object-benchmark";

console.log("Running Superdiff benchmarks");

async function main() {
  console.log("=== SUPERDIFF BENCHMARKS ===");

  // Objects
  runObjectBench10K();
  // runObjectBench100K();
  // runNestedObjectBench();

  // Lists
  // runListBench10K();
  // runListBench100K();
  // Lists (streaming)
  // await runListStreamBench();

  // Text
  // runTextBench10K();
  // runTextBench100K();

  console.log("\n=== BENCHMARK COMPLETE ===");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});