docs/clean-prep/deduplicate.ipynb
5 additions & 5 deletions
@@ -1093,7 +1093,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "[prepare_training](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.Dedupe.prepare_training) initialises active learning with our data and, optionally, with existing training data.\n",
+ "`prepare_training` initialises active learning with our data and, optionally, with existing training data.\n",
  "\n",
  "`T` mirrors the DataFrame across its diagonal by writing rows as columns and vice versa. For this, [pandas.DataFrame.transpose](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.transpose.html) is used."
  ]
@@ -1104,7 +1104,7 @@
  "source": [
  "## 5. Active learning\n",
  "\n",
- "Use [dedupe.console_label](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.console_label) to train your dedupe instance. When Dedupe finds a record pair, you will be asked to label it as a duplicate. You can use the `y`, `n` and `u` keys to label duplicates. Press `f` when you are finished."
+ "Use `dedupe.console_label` to train your dedupe instance. When Dedupe finds a record pair, you will be asked to label it as a duplicate. You can use the `y`, `n` and `u` keys to label duplicates. Press `f` when you are finished."
  ]
  },
  {
@@ -1300,11 +1300,11 @@
  "source": [
  "The last training dataset compared make it clear that we did not delete this duplicate with our `drop_duplicates` example above - `marquesseastie` and `marquessebastien` were recognised as different.\n",
  "\n",
- "[Dedupe.train](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.Dedupe.train) adds the record pairs you marked to the training data and updates the matching model.\n",
+ "`Dedupe.train` adds the record pairs you marked to the training data and updates the matching model.\n",
  "\n",
  "With `index_predicates=True`, deduplication also takes into account predicates based on the indexing of the data.\n",
  "\n",
- "When you are done, save your training data with [Dedupe.write_settings](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.Dedupe.write_settings)."
+ "When you are done, save your training data with `Dedupe.write_settings`."
  ]
  },
  {
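For orientation, the steps named in the markdown cells of this diff (`prepare_training`, `dedupe.console_label`, `Dedupe.train` with `index_predicates=True`, `Dedupe.write_settings`) could be chained roughly as in the sketch below. The helper name `train_and_save`, its parameters, and the `label` hook are hypothetical and do not come from the notebook; the function only assumes it is handed an already constructed `dedupe.Dedupe` instance and a dictionary of records keyed by record ID. Note that in dedupe, `write_settings` stores the learned settings, while the labelled pairs themselves are written with the separate `write_training` method.

```python
from typing import Any, Callable, Mapping, Optional


def train_and_save(
    deduper: Any,
    records: Mapping[Any, Mapping[str, Any]],
    settings_path: str,
    label: Optional[Callable[[Any], None]] = None,
) -> None:
    """Hypothetical helper: label, train and save settings in one go."""
    # Initialise active learning with the records to deduplicate.
    deduper.prepare_training(records)

    if label is None:
        # Imported lazily so the sketch can be read without dedupe installed.
        import dedupe

        label = dedupe.console_label

    # Interactive step: answer y / n / u for each pair, f to finish.
    label(deduper)

    # Add the labelled pairs to the training data and update the model.
    deduper.train(index_predicates=True)

    # The settings file is binary, so it is opened in "wb" mode.
    with open(settings_path, "wb") as settings_file:
        deduper.write_settings(settings_file)
```

Passing the labelling function as a parameter is a design choice of this sketch only: it keeps the interactive prompt replaceable, for example by a function that feeds in pairs labelled earlier.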
@@ -1328,7 +1328,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "With [dedupe.Dedupe.partition](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.Dedupe.partition), records that all refer to the same entity are identified and returned as tuples that are a sequence of record IDs and confidence values. For more details on the confidence value, see [dedupe.Dedupe.cluster](https://docs.dedupe.io/en/latest/API-documentation.html#dedupe.Dedupe.cluster)."
+ "With `dedupe.Dedupe.partition`, records that all refer to the same entity are identified and returned as tuples that are a sequence of record IDs and confidence values. For more details on the confidence value, see `dedupe.Dedupe.cluster`."
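To make the described return shape concrete, here is a minimal sketch that flattens such tuples into a per-record lookup. It does not call dedupe at all: the `sample` list is made-up data shaped like the output described above (one tuple of record IDs paired with one tuple of confidence values per cluster), and `cluster_membership` is a hypothetical helper name.

```python
from typing import Any, Dict, Hashable, List, Sequence, Tuple

# One cluster: (record IDs, confidence value for each of those records).
Cluster = Tuple[Sequence[Hashable], Sequence[float]]


def cluster_membership(clusters: List[Cluster]) -> Dict[Hashable, Dict[str, Any]]:
    """Map each record ID to its cluster number and confidence value."""
    membership: Dict[Hashable, Dict[str, Any]] = {}
    for cluster_id, (record_ids, scores) in enumerate(clusters):
        # IDs and scores are parallel sequences, so pair them up.
        for record_id, score in zip(record_ids, scores):
            membership[record_id] = {
                "cluster_id": cluster_id,
                "confidence": float(score),
            }
    return membership


# Made-up sample: records 0 and 3 form one cluster, record 1 stands alone.
sample = [((0, 3), (0.92, 0.92)), ((1,), (1.0,))]
result = cluster_membership(sample)
```

A lookup of this kind can then be joined back onto the original table by record ID, so that each row carries the cluster it was assigned to.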