Commit 768fc98

Add `tundra.tn.document.duplicate:key` service to calculate a unique key for duplicate detection for a given document.

Change `tundra.tn.document.duplicate:content` to return the calculated key used for duplicate detection, and to store the key on the given document in the `Unique Key` document attribute if it is defined in Trading Networks.

Change `tundra.tn.document.duplicate:identity` to return the calculated key used for duplicate detection, and to store the key on the given document in the `Unique Key` document attribute if it is defined in Trading Networks.
1 parent 758b193 commit 768fc98

25 files changed: +1292 -305 lines changed

SERVICES.md

Lines changed: 103 additions & 32 deletions
@@ -2862,45 +2862,55 @@ Checks if the given document is a duplicate by checking if there
 are other documents with the same document type, sender, receiver,
 and SHA-512 message digest of the document content.
 
-This service is designed to be called as a custom document duplicate
-check service from Trading Networks, and is compatible with the the
-`WmTN/wm.tn.rec:DupCheckService` specification.
+If the given document is unique (not a duplicate), its unique key
+will be inserted into the `BizDocUniqueKeys` database table, to be
+used by future duplicate document checks.
 
-This service can also be called directly, and it will perform
-the duplicate check when called and store the calculated unique key
-against the document to be used for future duplicate checks.
+The calculated unique key will also be added as an attribute to the
+given document if a document attribute named `Unique Key` exists in
+Trading Networks, whether or not this document is considered a
+duplicate.
+
+This service can be called as a custom document duplicate check
+service from Trading Networks document types, as it is silently
+compatible with the `WmTN/wm.tn.rec:DupCheckService` specification.
+
+This service can also be called directly, when the result of using a
+document type duplicate check is not desirable.
 
 #### Inputs:
 
-* `bizdoc` is the Trading Networks document to be checked. Only
+* `$bizdoc` is the Trading Networks document to be checked. Only
   the `InternalID` of the bizdoc must be specified, with the
   remainder of the `WmTN/wm.tn.rec:BizDocEnvelope` structure purely
   optional.
-* `bizdoc.content` is an optional string, byte array, input stream, or
-  `IData` document used to calculate the SHA-512 message digest. If an
-  `IData` document is provided, it is first canonicalized by
+* `$bizdoc.content` is an optional string, byte array, input stream,
+  or `IData` document used to calculate the SHA-512 message digest. If
+  an `IData` document is provided, it is first canonicalized by
   recursively sorting the keys in ascending lexicographic order and
   then serialized as minified JSON before calculating the message
-  digest. If not specified, the bizdoc's default content part will be
+  digest. If not specified, the `$bizdoc` default content part will be
   used.
 
 #### Outputs:
 
-* `duplicate` is a boolean flag which when `true` indicates that the
-  given document was detected as a duplicate of another pre-existing
-  document.
-* `message` is a message describing the result of the duplicate check
-  suitable for logging.
-* `bizdoc` is the Trading Networks document that was checked.
-* `bizdoc.content` is the content used to calculate the SHA-512
+* `$bizdoc` is the Trading Networks document that was checked.
+* `$bizdoc.content` is the content used to calculate the SHA-512
   message digest. The content is returned in the same format as
   provided. This is important as when the content was provided as an
   input stream, it is necessarily consumed by the SHA-512 message
   digest calculation, and therefore a new input stream containing the
   same input content is returned for subsequent use by the caller.
-* `bizdoc.duplicate` is returned when `duplicate` is `true`, and is
+* `$bizdoc.duplicate?` is a boolean flag which when `true` indicates
+  that the given document was detected as a duplicate of another
+  pre-existing document.
+* `$bizdoc.duplicate.key` is the unique key calculated for the given
+  `$bizdoc` that was used to check for duplicate documents.
+* `$bizdoc.duplicate.message` is a message describing the result of
+  the duplicate check suitable for logging.
+* `$bizdoc.duplicate` is returned when `$bizdoc.duplicate?` is `true`, and is
   the Trading Networks document that is the pre-existing duplicate of
-  `bizdoc`.
+  `$bizdoc`.
 
 ---
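
The content handling described for `$bizdoc.content` above (canonicalize a structured document by sorting keys recursively, serialize it as minified JSON, compute a SHA-512 digest, and hand back a fresh stream when a stream was consumed) can be sketched in Python as follows. This is only an illustration and not the TundraTN implementation: the function names `canonical_json` and `digest_content` are hypothetical, a Python `dict` stands in for an `IData` document, and UTF-8 encoding of strings is an assumption.

```python
import hashlib
import io
import json


def canonical_json(document):
    """Serialize a dict as minified JSON with keys sorted at every level."""
    # sort_keys=True orders keys in nested dicts too; the compact
    # separators remove all optional whitespace.
    return json.dumps(document, sort_keys=True, separators=(",", ":"))


def digest_content(content):
    """Return (hex SHA-512 digest, content the caller can still use)."""
    if isinstance(content, dict):
        # Stand-in for an IData document: canonicalize before hashing.
        data = canonical_json(content).encode("utf-8")
        reusable = content
    elif isinstance(content, str):
        # Assumption: strings are hashed as their UTF-8 bytes.
        data = content.encode("utf-8")
        reusable = content
    elif isinstance(content, (bytes, bytearray)):
        data = bytes(content)
        reusable = content
    else:
        # Assumed to be a binary stream: reading consumes it, so return
        # a new stream over the same bytes for subsequent use.
        data = content.read()
        reusable = io.BytesIO(data)
    return hashlib.sha512(data).hexdigest(), reusable
```

Because the keys are sorted before serialization, two documents that differ only in key order produce the same digest, which is what makes the digest usable for duplicate detection.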
@@ -2910,27 +2920,88 @@ Checks if the given document is a duplicate by checking if there
 are other documents with the same document type, sender, receiver,
 and document ID.
 
-This service is designed to be called as a custom document duplicate
-check service from Trading Networks, and is compatible with the the
-`WmTN/wm.tn.rec:DupCheckService` specification.
+If the given document is unique (not a duplicate), its unique key
+will be inserted into the `BizDocUniqueKeys` database table, to be
+used by future duplicate document checks.
+
+The calculated unique key will also be added as an attribute to the
+given document if a document attribute named `Unique Key` exists in
+Trading Networks, whether or not this document is considered a
+duplicate.
+
+This service can be called as a custom document duplicate check
+service from Trading Networks document types, as it is silently
+compatible with the `WmTN/wm.tn.rec:DupCheckService` specification.
+
+This service can also be called directly, when the result of using a
+document type duplicate check is not desirable.
 
 #### Inputs:
 
-* `bizdoc` is the Trading Networks document to be checked. Only
-  the internal ID of the bizdoc must be specified, with the
+* `$bizdoc` is the Trading Networks document to be checked. Only
+  the `InternalID` of the bizdoc must be specified, with the
   remainder of the `WmTN/wm.tn.rec:BizDocEnvelope` structure purely
   optional.
 
 #### Outputs:
 
-* `duplicate` is a boolean flag which when `true` indicates that the
-  given document was considered a duplicate.
-* `message` is a message describing the result of the duplicate check
-  suitable for logging.
-* `bizdoc` is the Trading Networks document that was checked.
-* `bizdoc.duplicate` is returned when `duplicate` is `true`, and is
+* `$bizdoc` is the Trading Networks document that was checked.
+* `$bizdoc.duplicate?` is a boolean flag which when `true` indicates
+  that the given document was detected as a duplicate of another
+  pre-existing document.
+* `$bizdoc.duplicate.key` is the unique key calculated for the given
+  `$bizdoc` that was used to check for duplicate documents.
+* `$bizdoc.duplicate.message` is a message describing the result of
+  the duplicate check suitable for logging.
+* `$bizdoc.duplicate` is returned when `$bizdoc.duplicate?` is `true`, and is
   the Trading Networks document that is the pre-existing duplicate of
-  `bizdoc`.
+  `$bizdoc`.
+
+---
+
+### tundra.tn.document.duplicate:key
+
+Calculates the unique key for the given document that can be used for
+duplicate detection.
+
+#### Inputs:
+
+* `$bizdoc` is the Trading Networks document to be checked. Only
+  the `InternalID` of the bizdoc must be specified, with the
+  remainder of the `WmTN/wm.tn.rec:BizDocEnvelope` structure purely
+  optional.
+* `$bizdoc.content` is an optional string, byte array, input stream,
+  or `IData` document used to calculate the SHA-512 message digest. If
+  an `IData` document is provided, it is first canonicalized by
+  recursively sorting the keys in ascending lexicographic order and
+  then serialized as minified JSON before calculating the message
+  digest. If not specified, the `$bizdoc` default content part will be
+  used. This input parameter is only used when
+  `$bizdoc.duplicate.key.type` is specified as `content`.
+* `$bizdoc.duplicate.key.type` specifies the strategy for calculating
+  the unique key:
+  * `content` will calculate the unique key as a concatenation of the
+    document type, sender, receiver, and a SHA-512 message digest of
+    the specified `$bizdoc.content`, or `$bizdoc` default content part
+    if `$bizdoc.content` is not specified. This is the default
+    strategy, and is also the strategy used by the service:
+    `TundraTN/tundra.tn.document.duplicate:content`.
+  * `identity` will calculate the unique key as a concatenation of the
+    document type, sender, receiver, and document identity. This is
+    the strategy used by the service:
+    `TundraTN/tundra.tn.document.duplicate:identity`.
+
+#### Outputs:
+
+* `$bizdoc` is the Trading Networks document that was checked.
+* `$bizdoc.content` is the content used to calculate the SHA-512
+  message digest. The content is returned in the same format as
+  provided. This is important as when the content was provided as an
+  input stream, it is necessarily consumed by the SHA-512 message
+  digest calculation, and therefore a new input stream containing the
+  same input content is returned for subsequent use by the caller.
+* `$bizdoc.duplicate.key` is the unique key calculated for the given
+  `$bizdoc` that can be used to check for duplicate documents.
 
 ---
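
The two key strategies and the check-then-record behaviour documented above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the TundraTN implementation: the names `unique_key` and `check_duplicate` are hypothetical, the `|` delimiter is an assumption (the documentation says only that the parts are concatenated), content is assumed to be bytes already, and an in-memory `set` stands in for the `BizDocUniqueKeys` table.

```python
import hashlib


def unique_key(doc_type, sender, receiver, *, content=None,
               document_id=None, key_type="content"):
    """Build a unique key using the 'content' or 'identity' strategy."""
    if key_type == "content":
        # Content strategy: the last part is a SHA-512 digest of the bytes.
        last_part = hashlib.sha512(content).hexdigest()
    elif key_type == "identity":
        # Identity strategy: the last part is the document ID itself.
        last_part = document_id
    else:
        raise ValueError(f"unknown key type: {key_type}")
    # Assumption: parts are joined with '|'; the real format is not documented here.
    return "|".join([doc_type, sender, receiver, last_part])


def check_duplicate(key, seen_keys):
    """Return True if the key was seen before; otherwise record it."""
    if key in seen_keys:
        return True
    # A unique document's key is stored for future duplicate checks.
    seen_keys.add(key)
    return False
```

A caller would compute the key once, then pass it to `check_duplicate` with the shared key store: the first call for a given key returns `False` and records it, and any later call with the same key returns `True`.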
0 Bytes
Binary file not shown.
