 logger.info("Number of match ids: {}", metrics.matchIDs());
 logger.info("Number of records not processed: {}", metrics.recordsNotProcessed());
 logger.info("Number of total records processed: {}", metrics.totalRecordsProcessed());
+logger.info("The following represents the actual output data generated by the Entity Resolution workflow based on the JSON and CSV input data. The output data is stored in the {} bucket.", glueBucketName);
 logger.info("""
-
-The output of the machinelearning-based matching job is a CSV file in the S3 bucket. The following is a sample of the output:
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 7 Jane E. Doe [email protected] 111-222-3333 7 036298535ed6471ebfc358fc76e1f51200006472446402560 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 0.90523 2 Bob Smith Jr. [email protected] 987-654-3210 2 6ae2d360d6594089837eafc31b20f31600003506806140928 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/jsongluetable 0.90523 2 Bob Smith [email protected] 2 6ae2d360d6594089837eafc31b20f31600003506806140928 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 0.89398956 1 Alice B. Johnson [email protected] 746-876-9846 1 34a5075b289247efa1847ab292ed677400009137438953472 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/jsongluetable 0.89398956 1 Alice Johnson [email protected] 1 34a5075b289247efa1847ab292ed677400009137438953472 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 0.605295 3 Charlie Black [email protected] 345-567-1234 3 92c8ef3f68b34948a3af998d700ed02700002146028888064 \s
-arn:aws:glue:region:xxxxxxxxxxxx:table/entity_resolution_db/jsongluetable 0.605295 3 Charlie Black [email protected] 3 92c8ef3f68b34948a3af998d700ed02700002146028888064 \s
-
-Note that each of the last 3 pairs of records are considered a match even though the 'name' or 'email' differ between the records;
-For example 'Bob Smith Jr.' compared to 'Bob Smith'.
-The confidence level is a value between 0 and 1, where 1 indicates a perfect match. In the last pair of matched records,
-the confidence level is lower for the differing email addresses.
-
+
+arn:aws:glue:us-east-1:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable Mary Major [email protected], 555-222-3333 4 ec05e7a55a0d4319b86da0a65286118f000040 \s
+arn:aws:glue:us-east-1:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 0.605295 3 María García marí[email protected] 555-567-1234 3 201ed8241ec04f9aa7fcfd962220580500001369367187456 \s
+arn:aws:glue:us-east-1:xxxxxxxxxxxx:table/entity_resolution_db/jsongluetable 1 Jane Doe [email protected] 1 895c3a439dc44a298663d52c08635e1a0000434359738368 \s
+arn:aws:glue:us-east-1:xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 1 Jane B.Doe [email protected] 1 69c2b2190c60427c8f5a2daa7ce5d45b00001463856467968 \s
+arn:aws:glue:us-east-1:xxxxxxxxxxxx:table/entity_resolution_db/jsongluetable 0.8914204 2 John Doe [email protected] 2 fbeda81b4c72429382c064b20cd592ff00001386547056640 \s
+arn:aws:glue:us-east-1::xxxxxxxxxxxx:table/entity_resolution_db/csvgluetable 0.8914204 2 John Doe Jr. [email protected] 555-654-3210 2 fbeda81b4c72429382c064b20cd592ff00001386547056640 \s
+
+Note that each of the last 2 records are considered a match even though the 'name' differs between the records;
+For example 'John Doe Jr.' compared to 'John Doe'.
+The confidence level is a value between 0 and 1, where 1 indicates a perfect match.
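
The note in the new log text says two rows count as a match even when their names differ. In the sample, the matched rows share the same long trailing identifier. A minimal Java sketch of that grouping idea, assuming the trailing identifier is the match ID and that the rows have already been parsed. The `OutputRow` record, its field names, and `groupMatches` are hypothetical names for illustration and are not part of this commit or of the AWS SDK:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MatchGroups {
    // Hypothetical, simplified view of one output row (assumed fields only).
    record OutputRow(String sourceTable, String name, String matchId) {}

    // Groups rows by match ID and keeps only IDs shared by two or more rows.
    static Map<String, List<OutputRow>> groupMatches(List<OutputRow> rows) {
        Map<String, List<OutputRow>> byId = rows.stream()
                .collect(Collectors.groupingBy(
                        OutputRow::matchId, LinkedHashMap::new, Collectors.toList()));
        // Rows whose match ID appears only once have no matching partner.
        byId.values().removeIf(group -> group.size() < 2);
        return byId;
    }

    public static void main(String[] args) {
        // Names and match IDs copied from the sample rows in the added log text.
        List<OutputRow> rows = List.of(
                new OutputRow("csvgluetable", "Mary Major", "ec05e7a55a0d4319b86da0a65286118f000040"),
                new OutputRow("jsongluetable", "Jane Doe", "895c3a439dc44a298663d52c08635e1a0000434359738368"),
                new OutputRow("csvgluetable", "Jane B.Doe", "69c2b2190c60427c8f5a2daa7ce5d45b00001463856467968"),
                new OutputRow("jsongluetable", "John Doe", "fbeda81b4c72429382c064b20cd592ff00001386547056640"),
                new OutputRow("csvgluetable", "John Doe Jr.", "fbeda81b4c72429382c064b20cd592ff00001386547056640"));

        groupMatches(rows).forEach((id, group) -> System.out.println(
                id + " -> " + group.stream().map(OutputRow::name).collect(Collectors.toList())));
    }
}
```

With the sample rows above, only the two John Doe rows share an identifier, so only that group is printed. The Jane Doe rows carry different identifiers and are not grouped.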