|
19 | 19 | "\n", |
20 | 20 | "In this notebook we load satellite data from the MAFAT Challenge (https://mafatchallenge.mod.gov.il/), which consists of 16-bit grayscale images with rotated bounding boxes.\n", |
21 | 21 | "\n", |
| 22 | + "The dataset is also available on Kaggle [here](https://www.kaggle.com/datasets/dragonzhang/mafat-train-dataset).\n", |
| 23 | + "\n", |
22 | 24 | "We show how to work with this dataset using fastdup. It takes 140 seconds to process 18,000 bounding boxes and find all similarities.\n", |
23 | 25 | "\n", |
24 | 26 | "We use the components gallery to highlight suspected wrong bounding boxes as well as correct ones.\n" |
|
165 | 167 | } |
166 | 168 | ], |
167 | 169 | "source": [ |
168 | | - "# install latst fastdup (required 0.904 or up)\n", |
169 | | - "%pip install fastdup -U --force-reinstall" |
| 170 | + "!pip install fastdup -Uq" |
170 | 171 | ] |
171 | 172 | }, |
172 | 173 | { |
173 | | - "cell_type": "code", |
174 | | - "execution_count": 1, |
175 | | - "id": "62c0ac2e-cd8d-428e-b5ff-1b75c917f9e3", |
| 174 | + "cell_type": "markdown", |
| 175 | + "id": "547f2a35", |
176 | 176 | "metadata": {}, |
177 | | - "outputs": [], |
178 | 177 | "source": [ |
179 | | - "#download mafat traing data, extract the zip file and put the notebook one level below images/ folder" |
| 178 | + "Download the MAFAT training data, extract the zip file, and place this notebook one level below the images/ folder." |
180 | 179 | ] |
181 | 180 | }, |
182 | 181 | { |
183 | 182 | "cell_type": "markdown", |
184 | 183 | "id": "538d2699-4678-4f0b-a570-412d4a97c7ae", |
185 | 184 | "metadata": {}, |
186 | 185 | "source": [ |
187 | | - "# Prepare annotation for fastdup format" |
188 | | - ] |
189 | | - }, |
190 | | - { |
191 | | - "cell_type": "code", |
192 | | - "execution_count": 18, |
193 | | - "id": "f2fa9853-0765-4d0a-a474-1eb703ea0a66", |
194 | | - "metadata": {}, |
195 | | - "outputs": [], |
196 | | - "source": [ |
197 | | - "# Here we read the data as given in the competition, one annotation file per each image. We combine all files into a single flat table" |
| 186 | + "## Prepare annotations in fastdup format\n", |
| 187 | + "\n", |
| 188 | + "\n", |
| 189 | + "Here we read the data as given in the competition: one annotation file per image. We then combine all files into a single flat table." |
198 | 190 | ] |
199 | 191 | }, |
200 | 192 | { |
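
The flat-table step described above can be sketched with pandas. This is a minimal illustration, not the competition's exact schema: the `labels_demo/` directory, the column names, and the `.tiff` extension are all assumptions.

```python
import glob
import os

import pandas as pd

def combine_annotations(label_dir: str) -> pd.DataFrame:
    """Combine per-image annotation files into one flat table."""
    frames = []
    for path in sorted(glob.glob(os.path.join(label_dir, "*.csv"))):
        df = pd.read_csv(path)
        # Remember which image each row of annotations came from.
        df["filename"] = os.path.splitext(os.path.basename(path))[0] + ".tiff"
        frames.append(df)
    return pd.concat(frames, ignore_index=True)

# Tiny synthetic example so the sketch runs end to end.
os.makedirs("labels_demo", exist_ok=True)
pd.DataFrame({"x": [10, 20], "y": [30, 40]}).to_csv("labels_demo/img_001.csv", index=False)
pd.DataFrame({"x": [50], "y": [60]}).to_csv("labels_demo/img_002.csv", index=False)

flat = combine_annotations("labels_demo")
```

One flat table with a `filename` column is enough for fastdup to map each bounding box back to its source image.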
|
448 | 440 | "id": "620799ea-3318-4a74-8dd0-d74ec3f42849", |
449 | 441 | "metadata": {}, |
450 | 442 | "source": [ |
451 | | - "# Run fastdup to crop and build a model for the crops" |
| 443 | + "## Run fastdup to crop and build a model for the crops" |
452 | 444 | ] |
453 | 445 | }, |
454 | 446 | { |
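
The crop step has to turn each rotated box into a pixel window. The numpy sketch below shows the underlying geometry (rotating a `(cx, cy, w, h, angle)` box into corner points and taking its axis-aligned envelope); it is an illustration of the idea, not fastdup's internal cropping code.

```python
import numpy as np

def rotated_box_corners(cx, cy, w, h, angle_rad):
    """Return the 4 corners of a rotated box as a (4, 2) array."""
    # Half-extents of the box before rotation, corners counter-clockwise.
    dx, dy = w / 2.0, h / 2.0
    corners = np.array([[-dx, -dy], [dx, -dy], [dx, dy], [-dx, dy]])
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    return corners @ rot.T + np.array([cx, cy])

# Axis-aligned window that encloses the rotated box (what a cropper needs).
corners = rotated_box_corners(100, 100, 40, 20, np.pi / 2)
x0, y0 = corners.min(axis=0)
x1, y1 = corners.max(axis=0)
```

At a 90-degree rotation the 40x20 box yields a 20x40 envelope, which is an easy sanity check for the corner math.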
|
531 | 523 | "id": "a834aaaa-a76c-49bc-b293-c3c3e114d7aa", |
532 | 524 | "metadata": {}, |
533 | 525 | "source": [ |
534 | | - "# Find suspected wrong bounding boxes\n", |
| 526 | + "## Find suspected wrong bounding boxes\n", |
535 | 527 | "\n", |
536 | 528 | "From - crop image name\n", |
537 | 529 | "To - similar images\n", |
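
The wrong-label search boils down to scanning a table of similar crop pairs for pairs whose labels disagree. A pandas sketch of that join follows; the column names (`from`, `to`, `distance`, `label`) and the toy values are assumptions for illustration, not fastdup's exact output schema.

```python
import pandas as pd

# Hypothetical similarity pairs: crops that were found near-identical.
sim = pd.DataFrame({
    "from": ["a.png", "b.png", "c.png"],
    "to":   ["b.png", "c.png", "d.png"],
    "distance": [0.98, 0.97, 0.95],
})
# Hypothetical crop -> class-label lookup.
labels = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png", "d.png"],
    "label":    ["tank", "tank", "truck", "truck"],
})

# Attach a label to each side of the pair, then keep disagreements:
# very similar crops with different labels are suspected wrong boxes.
pairs = (sim
         .merge(labels, left_on="from", right_on="filename")
         .merge(labels, left_on="to", right_on="filename",
                suffixes=("_from", "_to")))
suspects = pairs[pairs["label_from"] != pairs["label_to"]]
```

In the toy data only the `b.png`/`c.png` pair disagrees, so it is the single suspect surfaced.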
|
1980 | 1972 | ] |
1981 | 1973 | }, |
1982 | 1974 | { |
1983 | | - "cell_type": "code", |
1984 | | - "execution_count": 9, |
1985 | | - "id": "44174ffd-72f0-4a63-8849-6989bf982fa2", |
| 1975 | + "cell_type": "markdown", |
| 1976 | + "id": "ffa5de31", |
1986 | 1977 | "metadata": {}, |
1987 | | - "outputs": [], |
1988 | 1978 | "source": [ |
1989 | | - "# Looking at the raw cluster to link back cluster name to to file" |
| 1979 | + "Looking at the raw clusters to link each cluster name back to its file" |
1990 | 1980 | ] |
1991 | 1981 | }, |
1992 | 1982 | { |
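
Linking cluster names back to files is a groupby over the components table. A sketch under the assumption that the table has `component_id` and `filename` columns (the actual fastdup column names may differ):

```python
import pandas as pd

# Hypothetical components table: each crop assigned a cluster id.
components = pd.DataFrame({
    "component_id": [0, 0, 0, 1, 1],
    "filename": ["a.png", "b.png", "c.png", "d.png", "e.png"],
})

# Link each cluster back to its member files, largest clusters first.
clusters = (components.groupby("component_id")["filename"]
            .apply(list)
            .sort_values(key=lambda s: s.str.len(), ascending=False))
```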
|
2124 | 2114 | ] |
2125 | 2115 | }, |
2126 | 2116 | { |
2127 | | - "cell_type": "code", |
2128 | | - "execution_count": 15, |
2129 | | - "id": "bcb6a063-698c-480b-88e4-8ec3c9bfdb27", |
| 2117 | + "cell_type": "markdown", |
| 2118 | + "id": "bcc93d2e", |
2130 | 2119 | "metadata": {}, |
2131 | | - "outputs": [], |
2132 | 2120 | "source": [ |
2133 | | - "# Looking at good labels" |
| 2121 | + "Looking at good labels" |
2134 | 2122 | ] |
2135 | 2123 | }, |
2136 | 2124 | { |
|
3491 | 3479 | ] |
3492 | 3480 | }, |
3493 | 3481 | { |
3494 | | - "cell_type": "code", |
3495 | | - "execution_count": null, |
3496 | | - "id": "5b86b38f-2f3e-4ab5-911b-f43079f82e93", |
| 3482 | + "cell_type": "markdown", |
| 3483 | + "id": "b4d06ad8", |
3497 | 3484 | "metadata": {}, |
3498 | | - "outputs": [], |
3499 | 3485 | "source": [ |
3500 | | - "# Let's look on outliers on the satellite image level" |
| 3486 | + "## Outliers\n", |
| 3487 | + "\n", |
| 3488 | + "Let's look at outliers at the satellite-image level" |
3501 | 3489 | ] |
3502 | 3490 | }, |
3503 | 3491 | { |
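
Outlier selection here amounts to ranking images by how far they sit from their nearest neighbour in embedding space. A self-contained numpy sketch of that ranking, on synthetic embeddings rather than fastdup's real feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic embeddings: a tight cluster of 9 points plus one far-away point.
emb = np.vstack([rng.normal(0, 0.1, size=(9, 4)),
                 np.full((1, 4), 5.0)])

# Pairwise Euclidean distances; mask the self-distance on the diagonal.
d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)

# An outlier is a point whose nearest neighbour is unusually far.
nn_dist = d.min(axis=1)
outlier_idx = int(nn_dist.argmax())
```

The far-away tenth point (index 9) has by far the largest nearest-neighbour distance, so it is the one flagged.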
|
4542 | 4530 | ] |
4543 | 4531 | }, |
4544 | 4532 | { |
4545 | | - "cell_type": "code", |
4546 | | - "execution_count": 22, |
4547 | | - "id": "f7998fe4-db21-4c06-aca6-3287119b74d2", |
| 4533 | + "cell_type": "markdown", |
| 4534 | + "id": "60ee12c8", |
4548 | 4535 | "metadata": {}, |
4549 | | - "outputs": [], |
4550 | 4536 | "source": [ |
4551 | | - "# Now we look at outliers at the crop level" |
| 4537 | + "Now we look at outliers at the crop level" |
4552 | 4538 | ] |
4553 | 4539 | }, |
4554 | 4540 | { |
|
5600 | 5586 | ] |
5601 | 5587 | }, |
5602 | 5588 | { |
5603 | | - "cell_type": "code", |
5604 | | - "execution_count": null, |
5605 | | - "id": "47cfd1cc-7db6-4256-9550-62ab7fe3e81e", |
| 5589 | + "cell_type": "markdown", |
| 5590 | + "id": "aa35647a", |
5606 | 5591 | "metadata": {}, |
5607 | | - "outputs": [], |
5608 | 5592 | "source": [ |
5609 | | - "# We look for the brightest satellite images" |
| 5593 | + "## Brightest Images\n", |
| 5594 | + "\n", |
| 5595 | + "We look for the brightest satellite images" |
5610 | 5596 | ] |
5611 | 5597 | }, |
5612 | 5598 | { |
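
Ranking by brightness is a per-image mean over pixel values. A numpy sketch on synthetic 16-bit tiles (the statistic fastdup reports may be normalized differently):

```python
import numpy as np

rng = np.random.default_rng(1)
# Three synthetic 16-bit grayscale tiles with different intensity ranges.
images = {
    "dark.tiff":   rng.integers(0, 1000, size=(8, 8), dtype=np.uint16),
    "mid.tiff":    rng.integers(20000, 30000, size=(8, 8), dtype=np.uint16),
    "bright.tiff": rng.integers(60000, 65535, size=(8, 8), dtype=np.uint16),
}

# Rank by mean pixel value; cast to float first so 16-bit sums cannot overflow.
brightness = {name: float(img.astype(np.float64).mean())
              for name, img in images.items()}
brightest = max(brightness, key=brightness.get)
```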
|
6652 | 6638 | ] |
6653 | 6639 | }, |
6654 | 6640 | { |
6655 | | - "cell_type": "code", |
6656 | | - "execution_count": null, |
6657 | | - "id": "9711f363-9d0f-4d42-b4cd-66f5f9ab1b00", |
| 6641 | + "cell_type": "markdown", |
| 6642 | + "id": "73a82a89", |
6658 | 6643 | "metadata": {}, |
6659 | | - "outputs": [], |
6660 | 6644 | "source": [ |
6661 | | - "# Now we look for the most blurry images" |
| 6645 | + "## Blurry Images\n", |
| 6646 | + "Now we look for the blurriest images" |
6662 | 6647 | ] |
6663 | 6648 | }, |
6664 | 6649 | { |
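
A common blur proxy is the variance of the image's Laplacian: blurry images have little high-frequency content, so the Laplacian response is small. The sketch below implements that proxy directly in numpy; it is a standard technique, not necessarily the exact statistic fastdup computes.

```python
import numpy as np

def laplacian_variance(img: np.ndarray) -> float:
    """Variance of the discrete 4-neighbour Laplacian: low values suggest blur."""
    img = img.astype(np.float64)
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

rng = np.random.default_rng(2)
sharp = rng.integers(0, 65535, size=(32, 32)).astype(np.float64)
# "Blur" the sharp image with a crude 3x3 box filter (wrap-around borders).
blurry = sum(np.roll(np.roll(sharp, i, 0), j, 1)
             for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9.0
```

Sorting images by this score ascending puts the blurriest ones first.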
|