TagDataTranslation is a C# library implementing GS1's Tag Data Translation (TDT) specification for encoding and decoding EPC (Electronic Product Code) identifiers used in RFID tags.
- TDT 2.2 - Full support for all standard EPC schemes
- TDS 2.3 - Support for '+' and '++' schemes with hostname encoding
- Standards in markdown are in the parent Mimasu repo: `docs/standards/md/gs1/tdt/` and `docs/standards/md/gs1/tds/`
- .NET 8.0 / 9.0 / 10.0 (multi-targeting)
- xUnit for testing
- JSON-based scheme definitions (in the `src/TagDataTranslation/Schemes2/` folder)
The main translation engine. Key methods:
- `Translate(input, inputFormat, parameterList)` - Main translation method
- `ProcessInput()` - Parses input and extracts fields
- `ProcessOutput()` - Formats output to requested level
JSON definitions for each EPC scheme containing:
- Level definitions (BINARY, BARE_IDENTIFIER, GS1_DIGITAL_LINK, etc.)
- Field patterns and extraction rules
- Encoding/decoding rules
Helper classes for specific encoding methods:
- `HostnameEncoder.cs` - TDS 2.3 hostname encoding with optimizations
Single-plus schemes like SGTIN+, SSCC+, etc. that:
- Use variable-length serial encoding
- Support GS1 Digital Link URIs with id.gs1.org hostname
- Do NOT encode custom hostnames
Note: The '++' scheme JSON files are custom implementations created by Claude, not official GS1 scheme definitions. They are based on the TDS 2.3 specification but the JSON schema files themselves are not from GS1.
Double-plus schemes like SGTIN++, SSCC++, etc. that:
- Include all features of '+' schemes
- Additionally encode a custom hostname in binary
- Support branded GS1 Digital Link URIs (e.g., https://coca-cola.com/01/...)
Two methods supported:
- Code 40 (indicator bit 0): For uppercase-only hostnames
  - 16 bits per 3 characters
  - Character set: 0-9, A-Z, -, .
- 7-bit ASCII with optimizations (indicator bit 1): For mixed-case hostnames
  - Uses optimization tables for common TLDs and subdomains
  - `.com`, `.org`, `.net` etc. encoded as a single 7-bit sequence
  - `id.`, `www.`, `qr.` encoded as a single 7-bit sequence
  - Country TLDs encoded as 14-bit sequences
Important: The hostname length field indicates number of 7-bit sequences, NOT number of output characters.
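To make the difference between sequences and characters concrete, here is a minimal sketch that counts 7-bit sequences for a hostname. It is not the logic in `HostnameEncoder.cs`: the token lists contain only the examples named above, and the matching rules (prefix tokens only at the start, suffix tokens only at the end, a country TLD being a final dot plus two letters) are assumptions made for illustration. `CountSequences` is a hypothetical name.

```csharp
using System;
using System.Linq;

// Hypothetical token lists; the real optimization tables live in HostnameEncoder.cs.
string[] prefixes = { "id.", "www.", "qr." };
string[] suffixes = { ".com", ".org", ".net" };

// Counts 7-bit sequences for a hostname under the assumptions stated in the lead-in.
int CountSequences(string host)
{
    int count = 0;
    string rest = host;

    // One sequence for a recognized leading subdomain.
    var prefix = prefixes.FirstOrDefault(p => rest.StartsWith(p, StringComparison.Ordinal));
    if (prefix != null) { count += 1; rest = rest.Substring(prefix.Length); }

    // One sequence for a common TLD, or two (14 bits) for an assumed country TLD.
    var suffix = suffixes.FirstOrDefault(s => rest.EndsWith(s, StringComparison.Ordinal));
    if (suffix != null)
    {
        count += 1;
        rest = rest.Substring(0, rest.Length - suffix.Length);
    }
    else if (rest.Length >= 3 && rest[^3] == '.' && char.IsLetter(rest[^2]) && char.IsLetter(rest[^1]))
    {
        count += 2;
        rest = rest.Substring(0, rest.Length - 3);
    }

    // Every remaining character costs one 7-bit sequence.
    return count + rest.Length;
}

Console.WriteLine(CountSequences("id.example.com")); // 14 characters, 9 sequences
```

Under these assumptions `id.example.com` has 14 characters but needs only 9 sequences, so the length field would hold 9.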
Format: 3-bit encoding indicator + 5-bit length + variable data
| Indicator | Method | Bits per char |
|---|---|---|
| 0 | Numeric | ~3.32 bits/digit |
| 1 | Upper hex | 4 bits |
| 2 | Lower hex | 4 bits |
| 3 | Base64 URI-safe | 6 bits |
| 4 | 7-bit ASCII | 7 bits |
| 5 | URN Code 40 | ~5.33 bits |
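
The table can be turned into a rough bit-cost estimate for a serial of a given length. This sketch assumes the numeric method rounds `length * log2(10)` up to whole bits and that URN Code 40 pads to complete 3-character groups of 16 bits each. Both rounding rules are assumptions, with TDS 2.3 as the authority. `DataBits` and `TotalBits` are hypothetical helper names, not library APIs.

```csharp
using System;

// Estimated data bits for a serial of `length` characters under each indicator.
// Rounding rules are assumptions; consult TDS 2.3 for the normative definitions.
int DataBits(int indicator, int length) => indicator switch
{
    0 => (int)Math.Ceiling(length * Math.Log2(10)), // numeric, ~3.32 bits/digit
    1 or 2 => length * 4,                            // upper or lower hex
    3 => length * 6,                                 // Base64 URI-safe
    4 => length * 7,                                 // 7-bit ASCII
    5 => (length + 2) / 3 * 16,                      // URN Code 40, 16 bits per 3 chars
    _ => throw new ArgumentOutOfRangeException(nameof(indicator))
};

// The 3-bit indicator and 5-bit length precede the variable data.
int TotalBits(int indicator, int length) => 3 + 5 + DataBits(indicator, length);

Console.WriteLine(TotalBits(0, 6)); // 6-digit numeric serial
Console.WriteLine(TotalBits(4, 6)); // 6-character 7-bit ASCII serial
```

For a 6-character serial this gives 28 bits as numeric and 50 bits as 7-bit ASCII, which is why choosing the narrowest applicable method matters.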
All '++' schemes require trailing ([01]*) in their BINARY pattern to capture variable-length serial and hostname data after the fixed fields.
Example for SGTIN++:
"pattern": "^11111101([01])([01]{3})([01]{56})([01]*)"
Most '++' schemes follow this structure:
- Header (8 bits) - unique per scheme
- DataToggle (1 bit) - +AIDC data indicator
- Filter (3 bits)
- Fixed fields (scheme-specific, BCD encoded)
- Serial (variable-length alphanumeric)
- Hostname (1-bit encoding + 6-bit length + data)
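As an illustration of how the capture groups line up with this structure, the sketch below applies the SGTIN++ pattern quoted above to a hand-built bit string. It assumes the 56-bit group is a 14-digit GTIN in BCD (4 bits per digit). The tail bits are a placeholder, not a valid serial or hostname. `EncodeBcd` and `DecodeBcd` are hypothetical helpers that perform no validation.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The SGTIN++ BINARY pattern from the scheme definition.
var pattern = new Regex("^11111101([01])([01]{3})([01]{56})([01]*)");

// Decodes a BCD bit string (4 bits per digit) into decimal digits; no validation.
string DecodeBcd(string bits) =>
    string.Concat(Enumerable.Range(0, bits.Length / 4)
        .Select(i => Convert.ToInt32(bits.Substring(i * 4, 4), 2).ToString()));

// Encodes decimal digits as BCD; used here only to build the sample input.
string EncodeBcd(string digits) =>
    string.Concat(digits.Select(d => Convert.ToString(d - '0', 2).PadLeft(4, '0')));

// Sample input: header, DataToggle = 0, Filter = 3, a GTIN-14, then placeholder tail bits.
string gtin = "09521141123454";
string input = "11111101" + "0" + "011" + EncodeBcd(gtin) + "1010";

Match m = pattern.Match(input);
Console.WriteLine(m.Groups[1].Value);                     // DataToggle bit
Console.WriteLine(Convert.ToInt32(m.Groups[2].Value, 2)); // Filter value
Console.WriteLine(DecodeBcd(m.Groups[3].Value));          // 14-digit GTIN
Console.WriteLine(m.Groups[4].Value.Length);              // bits left for serial + hostname
```

The fourth group is what the trailing `([01]*)` exists for: everything after the fixed fields is handed on for variable-length decoding.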
See docs/TDS-2.3-Errata.md for documented errors in the TDS 2.3 specification, including:
- SGTIN++/DSGTIN++ hostname errors in E.3
- SSCC++/ITIP++ header errors in E.3
Note: Errors in '++' scheme JSON files are not "errata" since the JSON files are custom implementations, not from GS1. Only errors in the official TDS 2.3 specification document should be documented in the errata file.
Tests are in a private submodule. After cloning, initialize:
```
git submodule update --init --recursive
```
All tests must pass. Tests are never allowed to fail. Before committing any changes, ensure all tests pass by running:
```
dotnet test test/TagDataTranslation.Tests/TagDataTranslation.Tests.csproj
```
Use the coverage script to generate an HTML coverage report:
```
./scripts/coverage.sh
```
This runs all tests with coverlet, generates a Cobertura XML file, and produces an HTML report at `coveragereport/index.html` (opens automatically on macOS).
Coverage-driven test improvement process:
- Run `./scripts/coverage.sh` to generate the report
- Read the Cobertura XML (`test/TagDataTranslation.Tests/TestResults/*/coverage.cobertura.xml`) to identify uncovered lines/branches per class
- Prioritize gaps by impact: internal classes with `InternalsVisibleTo` access can be tested directly; otherwise test through `TDTEngine.Translate()`
- Write tests targeting the uncovered lines, run the suite, re-run coverage to verify improvement
Key coverage notes:
- `InternalsVisibleTo("TagDataTranslation.Tests")` is set in `AssemblyInfo.cs`, so internal classes like `EncodedAICodec`, `VariableLengthFieldCodec`, `PlusPlusFieldConverter` can be tested directly
- Generated JSON serializer code (`TdtJsonContext`, `TableJsonContext`) will always have partial coverage; this is expected
- Table lookup classes (TableB, TableK, TableE) are exercised indirectly through scheme translations; their query methods may show low coverage if only used by specific scheme paths
- `coveragereport/` and `TestResults/` are gitignored
```
# Build all targets
dotnet build src/TagDataTranslation/TagDataTranslation.csproj
# Run tests
dotnet test test/TagDataTranslation.Tests/TagDataTranslation.Tests.csproj
# Run specific test categories
dotnet test --filter "FullyQualifiedName~TDS23"
dotnet test --filter "FullyQualifiedName~TDT22"
# Run benchmarks
dotnet run -c Release --project test/TagDataTranslation.Benchmarks
# Build npm WASM package
cd npm && npm run build
# Run npm smoke test
cd npm && node test/smoke.js
```
The npm package wraps the .NET library via WebAssembly. The build pipeline:
- `dotnet publish` compiles the WASM project (`sdk/wasm/`) targeting `browser-wasm`
- `npm/scripts/build.js` copies the `_framework/` output to `npm/dist/wasm/`
- `npm/dist/index.js` loads the .NET WASM runtime and exposes `createEngine()`
| Path | Description |
|---|---|
| `sdk/wasm/TagDataTranslation.Wasm.csproj` | WASM project (net10.0, browser-wasm) |
| `sdk/wasm/JsInterop.cs` | JSExport methods callable from JavaScript |
| `sdk/wasm/Program.cs` | Minimal entry point required by runtime |
| `sdk/wasm/main.js` | WASM module entry point |
| `npm/package.json` | npm package metadata |
| `npm/dist/index.js` | CJS wrapper with createEngine() |
| `npm/dist/index.mjs` | ESM re-export |
| `npm/dist/index.d.ts` | TypeScript type definitions |
| `npm/scripts/build.js` | Build script (WASM compile + license copy) |
| `npm/test/smoke.js` | Smoke test for encode/decode/tryTranslate |
| `examples/NodeApp/` | Example Node.js app using the published package |
- SDK: Use `Microsoft.NET.Sdk` (not `Microsoft.NET.Sdk.BlazorWebAssembly`) for library-style WASM
- AllowUnsafeBlocks: Required, because the JSExport source generator emits unsafe code in .NET 10
- JsonSerializerIsReflectionEnabledByDefault: Must be `true`. Trimmed WASM disables reflection-based JSON by default, but TDTEngine uses `System.Text.Json` with reflection to load scheme files
- TrimmerRootAssembly: Must include `TagDataTranslation`. Without this, the IL trimmer strips model constructors, causing `DeserializeNoConstructor` errors at runtime
- Entry point: .NET 10 requires a `Program.cs` with `Main` (even for library-style WASM)
- Output path: .NET 10 outputs to `AppBundle/_framework/` (not Blazor's `wwwroot/_framework/`)
- getAssemblyExports: Returns a Promise in .NET 10; you must `await` it
- Initialization order: Call `dotnet.create()`, then `getAssemblyExports()`, then `runMain()`
```
# Build WASM + copy license
cd npm && npm run build
# Set version
npm version 3.x.x --no-git-tag-version
# Publish (opens browser for auth challenge)
npm publish --tag beta --access public # prerelease
npm publish --access public # stable release
```
The build script auto-copies `LICENSING.md` from the repo root to `npm/LICENSE.md` (gitignored) so the license ships with every publish.
- Scope: `@mimasu` (public)
- License: BSL-1.1
- Size: ~2.6 MB compressed, ~15.7 MB unpacked (includes .NET WASM runtime)
- Engine requirement: Node.js >= 18.0.0
| Path | Description |
|---|---|
| `src/TagDataTranslation/` | Main library |
| `src/TagDataTranslation/Schemes2/` | JSON scheme definitions |
| `src/TagDataTranslation/Encoding/` | Encoding helper classes |
| `src/TagDataTranslation/Tables/` | Lookup tables (Table F, K, E, B) |
| `test/TagDataTranslation.Tests/` | Unit tests |
| `docs/` | Errata and plans |
| `docs/TDS-2.3-Errata.md` | Known errors in TDS 2.3 specification |
| `docs/Scheme-Conversion-Errata.md` | Errors found in XML to JSON conversion |
- Create JSON scheme file in `src/TagDataTranslation/Schemes2/`
- Define all levels (BINARY, BARE_IDENTIFIER, etc.)
- Add BINARY pattern with appropriate capture groups
- For '++' schemes, add `variableLengthField` and `hostnameField` definitions
- Add tests in appropriate test file
The `Translate()` hot path uses several caches to avoid repeated work:
- Regex cache: `ConcurrentDictionary<string, Regex>` in TDTEngine; compiled regex patterns shared across all engine instances
- Character set regex cache: `ConcurrentDictionary<string, Regex?>` in RuleExecutor; caches ValidateCharacterset patterns (null = invalid pattern)
- Grammar token cache: `ConcurrentDictionary<string, GrammarToken[]>` in TDTEngine; parsed grammar strings cached as token arrays
- Pre-sorted fields/rules: Option.Field sorted by Seq at load time; Level.ExtractRules/FormatRules pre-split and sorted at load time
- BinaryConverter lookup tables: Static arrays for hex↔binary conversion (no per-character Convert calls)
- Static grammar regex: Single compiled Regex instance for grammar parsing
All caches are static and thread-safe. They grow monotonically (no eviction) which is fine because the set of patterns/grammars is bounded by the scheme definitions.
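The regex cache pattern can be sketched as follows. This is a stand-in, not the engine's code: the real `GetCachedRegex` lives in TDTEngine and its cache is a static field, whereas this sketch uses a local variable so it runs as a standalone program. The use of `RegexOptions.Compiled` is also an assumption.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

// Stand-in for the engine's static cache, keyed by the pattern string.
var cache = new ConcurrentDictionary<string, Regex>();

// Returns the cached Regex for a pattern, constructing it on first use.
// GetOrAdd may run the factory more than once under contention,
// but only one instance is ever stored and returned.
Regex GetCachedRegex(string pattern) =>
    cache.GetOrAdd(pattern, p => new Regex(p, RegexOptions.Compiled));

Regex first = GetCachedRegex("^11111101([01])([01]{3})([01]{56})([01]*)");
Regex second = GetCachedRegex("^11111101([01])([01]{3})([01]{56})([01]*)");
Console.WriteLine(object.ReferenceEquals(first, second)); // the same instance is reused
Console.WriteLine(cache.Count);                           // one entry per distinct pattern
```

Because the key is the pattern string itself, the cache cannot grow beyond the number of distinct patterns in the loaded schemes, which is why the lack of eviction is acceptable.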
Run with dotnet run -c Release --project test/TagDataTranslation.Benchmarks. Results on Apple M1 Pro, .NET 8.0:
| Benchmark | Mean | Allocated |
|---|---|---|
| SGTIN-96 encode | 7.82 us | 9.9 KB |
| SGTIN-96 decode | 7.65 us | 9.2 KB |
| SGTIN++ encode | 24.31 us | 75.3 KB |
| SGTIN++ decode | 5.02 us | 7.8 KB |
| HexToBinary (96-bit) | 99 ns | 480 B |
| BinaryToHex (96-bit) | 54 ns | 192 B |
| Failure (random hex) | 17.82 us | 504 B |
| Failure (random binary) | 18.27 us | 784 B |
- Do NOT create `new Regex(pattern)` in the hot path; use `GetCachedRegex(pattern)` in TDTEngine or the RuleExecutor charset cache
- Do NOT use `.OrderBy().ToList()` on fields/rules; they are pre-sorted at load time
- Pre-size StringBuilder when output length is predictable (e.g., `hex.Length * 4` for HexToBinary)
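A minimal sketch of the last guideline, combined with a static lookup table instead of per-character `Convert` calls. It illustrates the technique only and is not the library's `BinaryConverter`. The `nibbles` table and this `HexToBinary` are hypothetical stand-ins.

```csharp
using System;
using System.Text;

// Nibble lookup table, built once; the index is the hex digit value 0..15.
string[] nibbles =
{
    "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};

string HexToBinary(string hex)
{
    // The output length is known up front: 4 bits per hex digit.
    var sb = new StringBuilder(hex.Length * 4);
    foreach (char c in hex)
    {
        int value = c >= '0' && c <= '9' ? c - '0'
                  : c >= 'A' && c <= 'F' ? c - 'A' + 10
                  : c >= 'a' && c <= 'f' ? c - 'a' + 10
                  : throw new FormatException($"Invalid hex digit '{c}'.");
        sb.Append(nibbles[value]);
    }
    return sb.ToString();
}

Console.WriteLine(HexToBinary("FD")); // 11111101, the SGTIN++ header
```

Pre-sizing avoids the buffer regrowth that a default-capacity `StringBuilder` would incur on a 96-bit or longer EPC.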
- Use `TryTranslateDetails()` for detailed translation information
- Binary patterns must match exactly; check bit counts
- For '++' schemes, hostname length is in sequences, not characters
- GS1 standards in markdown are in the parent Mimasu repo: `docs/standards/md/gs1/`
The '+' scheme JSON files (SGTIN+.json, SSCC+.json, etc.) are from the GS1 standard and should NOT be modified. They support GS1 Digital Link URIs with ANY hostname, not just id.gs1.org.
The '++' scheme JSON files are custom implementations and CAN be modified as needed to match the TDS 2.3 specification.
When GS1_DIGITAL_LINK input is provided, both '+' and '++' schemes may match the URL pattern:
- '+' schemes match URLs with any hostname (e.g., `https://id.gs1.org/01/...`)
- '++' schemes also match URLs with any hostname and capture it for encoding
The engine may select the '++' scheme due to more specific pattern matching. For '+' scheme tests:
- Test GS1_DIGITAL_LINK as OUTPUT only (translate from BINARY/BARE_IDENTIFIER to GS1_DIGITAL_LINK)
- Do NOT test GS1_DIGITAL_LINK as INPUT (ambiguous which scheme will be selected)
Use ExecuteTestsWithOutputOnly() helper for '+' scheme tests with GS1_DIGITAL_LINK.
Field names MUST match across all levels of a scheme:
- BAD: `itipBinary` in BINARY level, `itip` in BARE_IDENTIFIER level (no conversion rule)
- GOOD: `itip` in both BINARY and BARE_IDENTIFIER levels
DSGTIN++: Requires multiple options for different date types (like DSGTIN+):
- Option 0: prodDate (date type indicator 0000)
- Option 4: expDate (date type indicator 0100)
- etc.
GRAI++: GS1_DIGITAL_LINK should capture 14-digit grai field (not 13 digits + hardcoded 0):
- Pattern: `\\/8003\\/([0-9]{14})...`
- Grammar: `'/8003/' grai urlEscapedSerial`
GDTI++: BARE_IDENTIFIER should use `;serial=` separator:
- Pattern: `^gdti=([0-9]{13});serial=...`
- Grammar: `'gdti=' gdti ';serial=' serial ';hostname=' hostname`
ITIP++: Use combined itip field (18 digits = gtin + piece + total):
- BARE_IDENTIFIER: `itip=095211411234540102;serial=rif981;hostname=...`
- GS1_DIGITAL_LINK: `/8006/095211411234540102/21/rif981`
BSL 1.1 - See LICENSING.md