|
62 | 62 | "source": [ |
63 | 63 | "## Installation\n", |
64 | 64 | "\n", |
65 | | - "First, let's install the neccessary packages:\n", |
| 65 | + "First, let's install the necessary packages:\n", |
66 | 66 | "\n", |
67 | 67 | "- [fastdup](https://github.com/visual-layer/fastdup) - To analyze issues in the dataset.\n", |
68 | 68 | "- [Recognize Anything](https://github.com/xinyu1205/recognize-anything) - To use the RAM and Tag2Text model.\n", |
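For context, the two packages named above would typically be installed as follows. This is a sketch: the Recognize Anything repo is installed straight from GitHub (it may also have a PyPI release), and no version pins are shown in this hunk.

```shell
# Install fastdup from PyPI
pip install fastdup

# Install the Recognize Anything repo (RAM / Tag2Text) from GitHub
pip install git+https://github.com/xinyu1205/recognize-anything.git
```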
|
118 | 118 | "metadata": {}, |
119 | 119 | "source": [ |
120 | 120 | "## Download Dataset\n", |
121 | | - "Download the [coco-minitrain](https://github.com/giddyyupp/coco-minitrain) dataset - a curated mini training set consisting of 20% of COCO 2017 training dataset. The coco-minitrain consists of 25,000 images and annoatations." |
| 121 | + "Download the [coco-minitrain](https://github.com/giddyyupp/coco-minitrain) dataset - a curated mini training set consisting of 20% of the COCO 2017 training dataset. The coco-minitrain consists of 25,000 images and annotations." |
122 | 122 | ] |
123 | 123 | }, |
124 | 124 | { |
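As a quick sanity check after the download, one can count the image files and compare against the stated 25,000. A minimal sketch; the path `coco_minitrain_25k/images` is a hypothetical directory name, not taken from the notebook:

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def count_images(folder: str) -> int:
    """Recursively count files with common image extensions under `folder`."""
    return sum(1 for p in Path(folder).rglob("*") if p.suffix.lower() in IMAGE_EXTS)

# Expected after a full download (hypothetical path):
# count_images("coco_minitrain_25k/images") should equal 25_000
```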
|
271 | 271 | "source": [ |
272 | 272 | "## Zero-Shot Classification with RAM and Tag2Text\n", |
273 | 273 | "\n", |
274 | | - "Within fastdup you can readily use the zero-shot classifier models such as [Recognize Anything Model (RAM)](https://github.com/xinyu1205/recognize-anything) and [Tag2Text](https://github.com/xinyu1205/recognize-anything). Both Tag2Text and RAM exihibit strong recognition ability.\n", |
| 274 | + "Within fastdup you can readily use zero-shot classification models such as [Recognize Anything Model (RAM)](https://github.com/xinyu1205/recognize-anything) and [Tag2Text](https://github.com/xinyu1205/recognize-anything). Both Tag2Text and RAM exhibit strong recognition ability.\n", |
275 | 275 | "\n", |
276 | 276 | "+ RAM is an image tagging model, which can recognize any common category with high accuracy. Outperforms CLIP and BLIP.\n", |
277 | 277 | "+ Tag2Text is a vision-language model guided by tagging, which can support caption, retrieval and tagging." |
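RAM's reference inference code returns its predicted tags as a single string joined with `' | '`; a small helper like the following (an illustrative sketch, not part of fastdup or the RAM package) turns that into a Python list:

```python
def parse_ram_tags(raw: str) -> list[str]:
    """Split a RAM-style tag string such as 'dog | grass | park' into a list."""
    return [tag.strip() for tag in raw.split("|") if tag.strip()]

print(parse_ram_tags("dog | grass | park"))  # ['dog', 'grass', 'park']
```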
|
1182 | 1182 | "id": "59c8e8d0-1c00-403b-84d9-226458b9268a", |
1183 | 1183 | "metadata": {}, |
1184 | 1184 | "source": [ |
1185 | | - "Once, done you'll notice that 3 new columns are appened into the DataFrame namely - `grounding_dino_bboxes`, `grounding_dino_scores`, and `grounding_dino_labels`. " |
| 1185 | + "Once done, you'll notice that 3 new columns are appended to the DataFrame, namely `grounding_dino_bboxes`, `grounding_dino_scores`, and `grounding_dino_labels`. " |
1186 | 1186 | ] |
1187 | 1187 | }, |
1188 | 1188 | { |
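With those three columns in place, rows without detections can be dropped by checking the length of each label list. A sketch with made-up values; only the `grounding_dino_*` column names come from the notebook:

```python
import pandas as pd

# Toy stand-in for the enriched DataFrame; filenames and scores are made up.
df = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg", "c.jpg"],
    "grounding_dino_labels": [["face", "hair"], [], ["eye"]],
    "grounding_dino_scores": [[0.91, 0.87], [], [0.78]],
})

# Keep only rows where Grounding DINO returned at least one detection
detected = df[df["grounding_dino_labels"].str.len() > 0]
print(len(detected))  # 2
```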
|
1897 | 1897 | "id": "7a979b19-eaef-422b-944b-0285115e24d6", |
1898 | 1898 | "metadata": {}, |
1899 | 1899 | "source": [ |
1900 | | - "Not all images contain \"face\", \"eye\" and \"hair\", let's remove the columns with no detections and plot the colums with detections." |
| 1900 | + "Not all images contain \"face\", \"eye\", and \"hair\"; let's remove the columns with no detections and plot the columns with detections." |
1901 | 1901 | ] |
1902 | 1902 | }, |
1903 | 1903 | { |
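One way to express that filter (an illustrative helper, not a fastdup API): keep an image only if its detected labels intersect the prompted set.

```python
PROMPTED = {"face", "eye", "hair"}

def has_detection(labels: list[str], wanted: set[str] = PROMPTED) -> bool:
    """True if any detected label matches one of the prompted classes."""
    return bool(wanted.intersection(labels))

print(has_detection(["face", "hair"]))  # True
print(has_detection([]))                # False
```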
|