Commit ef37608

Merge pull request #8 from Imageomics/json-payload
Improve performance and prediction results
2 parents e001ed1 + 0b182fe commit ef37608

7 files changed: +379 -207 lines changed
README.md

Lines changed: 120 additions & 91 deletions
@@ -12,9 +12,8 @@ Command line tool and python package to simplify using [BioCLIP](https://imageom
 **Table of Contents**
 
 - [Installation](#installation)
-- [Command Line Usage](#command-line-usage)
 - [Python Package Usage](#python-package-usage)
-- [License](#license)
+- [Command Line Usage](#command-line-usage)
 - [Acknowledgments](#acknowledgments)
 - [License](#license)
 
@@ -29,132 +28,162 @@ pip install git+https://github.com/Imageomics/pybioclip
 
 If you have any issues with installation, please first upgrade pip by running `pip install --upgrade pip`.
 
-## Command Line Usage
+## Python Package Usage
+### Predict species classification
 
-### Predict classification
+```python
+from bioclip import TreeOfLifeClassifier, Rank
 
-#### Example: Predict species for an image
-The example image used below is [`Ursus-arctos.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Ursus-arctos.jpeg) from the [bioclip-demo](https://huggingface.co/spaces/imageomics/bioclip-demo).
+classifier = TreeOfLifeClassifier()
+predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)
 
-Predict species for an `Ursus-arctos.jpeg` file:
-```console
-bioclip predict Ursus-arctos.jpeg
-```
-Output:
+for prediction in predictions:
+    print(prediction["species"], "-", prediction["score"])
 ```
-+----------------------------------------------------------------------------------------+-----------------------+
-| Taxon | Probability |
-+----------------------------------------------------------------------------------------+-----------------------+
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos (Kodiak bear) | 0.9356034994125366 |
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos syriacus (syrian brown bear) | 0.05616999790072441 |
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos bruinosus | 0.004126196261495352 |
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus arctus | 0.0024959812872111797 |
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus americanus (Louisiana black bear) | 0.0005009894957765937 |
-+----------------------------------------------------------------------------------------+-----------------------+
-```
-
----- 
 
-To save as a CSV or JSON file you can use the `--format <file type>` and `--output <filename>` arguments with `csv` or `json`, respectively.
-
-To save the JSON output to `ursus.json` run:
+Output:
 ```console
-bioclip predict --format json --output ursus.json Ursus-arctos.jpeg
-```
+Ursus arctos - 0.9356034994125366
+Ursus arctos syriacus - 0.05616999790072441
+Ursus arctos bruinosus - 0.004126196261495352
+Ursus arctus - 0.0024959812872111797
+Ursus americanus - 0.0005009894957765937
+```
+
+Output from the `predict()` method showing the dictionary structure:
+```
+[{
+    'kingdom': 'Animalia',
+    'phylum': 'Chordata',
+    'class': 'Mammalia',
+    'order': 'Carnivora',
+    'family': 'Ursidae',
+    'genus': 'Ursus',
+    'species_epithet': 'arctos',
+    'species': 'Ursus arctos',
+    'common_name': 'Kodiak bear'
+    'score': 0.9356034994125366
+}]
+```
+
+The output from the predict function can be converted into a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) like so:
+```python
+import pandas as pd
+from bioclip import TreeOfLifeClassifier, Rank
 
-To save the CSV output to `ursus.csv` run:
-```console
-bioclip predict --format csv --output ursus.csv Ursus-arctos.jpeg
+classifier = TreeOfLifeClassifier()
+predictions = classifier.predict("Ursus-arctos.jpeg", Rank.SPECIES)
+df = pd.DataFrame(predictions)
 ```
 
-#### Predict genus for an image
+### Predict from a list of classes
+```python
+from bioclip import CustomLabelsClassifier
 
-Predict genus for image `Ursus-arctos.jpeg`, restricted to the top 3 predictions:
-```console
-bioclip predict --rank genus --k 3 Ursus-arctos.jpeg
+classifier = CustomLabelsClassifier()
+predictions = classifier.predict("Ursus-arctos.jpeg", ["duck","fish","bear"])
+for prediction in predictions:
+    print(prediction["classification"], prediction["score"])
 ```
 Output:
+```console
+duck 1.0306726583309e-09
+fish 2.932403668845507e-12
+bear 1.0
 ```
-+---------------------------------------------------------+------------------------+
-| Taxon | Probability |
-+---------------------------------------------------------+------------------------+
-| Animalia Chordata Mammalia Carnivora Ursidae Ursus | 0.9994320273399353 |
-| Animalia Chordata Mammalia Artiodactyla Cervidae Cervus | 0.00032594642834737897 |
-| Animalia Chordata Mammalia Artiodactyla Cervidae Alces | 7.803700282238424e-05 |
-+---------------------------------------------------------+------------------------+
+
+## Command Line Usage
 ```
+bioclip predict [options] [IMAGE_FILE...]
 
-#### Optional arguments for predicting classifications:
-- `--rank RANK` - rank of the classification (kingdom, phylum, class, order, family, genus, species) [default: species]
-- `--k K` - number of top predictions to show [default: 5]
-- `--format FORMAT` - format of the output (table, json, or csv) [default: table]
-- `--output OUTPUT` - save output to a filename instead of printing it [default: stdout]
+Arguments:
+  IMAGE_FILE input image file
 
+Options:
+  -h --help
+  --format=FORMAT format of the output (table or csv) [default: csv]
+  --rank=RANK rank of the classification (kingdom, phylum, class, order, family, genus, species)
+  [default: species]
+  --k=K number of top predictions to show [default: 5]
+  --cls=CLS comma separated list of classes to predict, when specified the --rank and
+  --k arguments are ignored [default: all]
+  --output=OUTFILE print output to file OUTFILE [default: stdout]
+```
 
-### Predict from a list of classes
+### Predict classification
 
-Create predictions for 3 classes (cat, bird, and bear) for image `Ursus-arctos.jpeg`:
+#### Predict species for an image
+The example images used below are [`Ursus-arctos.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Ursus-arctos.jpeg)
+and [`Felis-catus.jpeg`](https://huggingface.co/spaces/imageomics/bioclip-demo/blob/ef075807a55687b320427196ac1662b9383f988f/examples/Felis-catus.jpeg) both from the [bioclip-demo](https://huggingface.co/spaces/imageomics/bioclip-demo).
+
+Predict species for an `Ursus-arctos.jpeg` file:
 ```console
-bioclip predict --cls cat,bird,bear Ursus-arctos.jpeg
+bioclip predict Ursus-arctos.jpeg
 ```
 Output:
 ```
-+-------+-----------------------+
-| Taxon | Probability |
-+-------+-----------------------+
-| cat | 4.581644930112816e-08 |
-| bird | 3.051998476166773e-08 |
-| bear | 0.9999998807907104 |
-+-------+-----------------------+%
+bioclip predict Ursus-arctos.jpeg
+file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
 ```
 
-#### Optional arguments for predicting from a list of classes:
-- `--format FORMAT` - format of the output (table, json, or csv) [default: table]
-- `--output OUTPUT` - save output to a filename instead of printing it [default: stdout]
-- `--cls CLS` - comma separated list of classes to predict, when specified the `--rank` and `--k` arguments are ignored [default: all]
+#### Predict species for multiple images saving to a file
 
-
-### View command line help
+To make predictions for files `Ursus-arctos.jpeg` and `Felis-catus.jpeg` saving the output to a file named `predictions.csv`:
 ```console
-bioclip --help
+bioclip predict --output predictions.csv Ursus-arctos.jpeg Felis-catus.jpeg
+```
+The contents of `predictions.csv` will look like this:
+```
+file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos bruinosus,Ursus arctos bruinosus,,0.004126196261495352
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctus,Ursus arctus,,0.0024959812872111797
+Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,americanus,Ursus americanus,Louisiana black bear,0.0005009894957765937
+Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,silvestris,Felis silvestris,European Wildcat,0.7221033573150635
+Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,catus,Felis catus,Domestic Cat,0.19810837507247925
+Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,margarita,Felis margarita,Sand Cat,0.02798456884920597
+Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Lynx,felis,Lynx felis,,0.021829601377248764
+Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,bieti,Felis bieti,Chinese desert cat,0.010979168117046356
 ```
 
-## Python Package Usage
-### Predict species classification
-
-```python
-from bioclip import predict_classification, Rank
-
-predictions = predict_classification("Ursus-arctos.jpeg", Rank.SPECIES)
-
-for species_name, probability in predictions.items():
-    print(species_name, probability)
+#### Predict top 3 genera for an image and display output as a table
+```console
+bioclip predict --format table --k 3 --rank=genus Ursus-arctos.jpeg
 ```
 
 Output:
-```console
-Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos (Kodiak bear) 0.9356034994125366
-Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos syriacus (syrian brown bear) 0.05616999790072441
-Animalia Chordata Mammalia Carnivora Ursidae Ursus arctos bruinosus 0.004126196261495352
-Animalia Chordata Mammalia Carnivora Ursidae Ursus arctus 0.0024959812872111797
-Animalia Chordata Mammalia Carnivora Ursidae Ursus americanus (Louisiana black bear) 0.0005009894957765937
+```
++-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
+| file_name | kingdom | phylum | class | order | family | genus | score |
++-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
+| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Carnivora | Ursidae | Ursus | 0.9994320273399353 |
+| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Cervus | 0.00032594642834737897 |
+| Ursus-arctos.jpeg | Animalia | Chordata | Mammalia | Artiodactyla | Cervidae | Alces | 7.803700282238424e-05 |
++-------------------+----------+----------+----------+--------------+----------+--------+------------------------+
 ```
 
 ### Predict from a list of classes
-```python
-from bioclip import predict_classifications_from_list, Rank
-
-predictions = predict_classifications_from_list("Ursus-arctos.jpeg",
-                                                ["duck","fish","bear"])
-
-for cls, probability in predictions.items():
-    print(cls, probability)
+Create predictions for 3 classes (cat, bird, and bear) for image `Ursus-arctos.jpeg`:
+```console
+bioclip predict --cls cat,bird,bear Ursus-arctos.jpeg
 ```
 Output:
+```
+file_name,classification,score
+Ursus-arctos.jpeg,cat,4.581644930112816e-08
+Ursus-arctos.jpeg,bird,3.051998476166773e-08
+Ursus-arctos.jpeg,bear,0.9999998807907104
+```
+
+### View command line help
 ```console
-duck 1.0306726583309e-09
-fish 2.932403668845507e-12
-bear 1.0
+bioclip --help
 ```
 
 ## License
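
Review note: the new CSV output carries one row per prediction, with `file_name` repeated for every row belonging to the same image. A minimal sketch of how a consumer might reduce that file to the top-scoring row per image, using only the standard library. The helper name `top_prediction_per_file` is hypothetical and not part of `bioclip`; the sample rows are copied from the README example above and abridged.

```python
import csv
import io

# Rows copied (abridged) from the README's predictions.csv example.
SAMPLE_CSV = """file_name,kingdom,phylum,class,order,family,genus,species_epithet,species,common_name,score
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos,Ursus arctos,Kodiak bear,0.9356034994125366
Ursus-arctos.jpeg,Animalia,Chordata,Mammalia,Carnivora,Ursidae,Ursus,arctos syriacus,Ursus arctos syriacus,syrian brown bear,0.05616999790072441
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,silvestris,Felis silvestris,European Wildcat,0.7221033573150635
Felis-catus.jpeg,Animalia,Chordata,Mammalia,Carnivora,Felidae,Felis,catus,Felis catus,Domestic Cat,0.19810837507247925
"""


def top_prediction_per_file(csv_text):
    """Return {file_name: row}, keeping the highest-scoring row per image.

    Hypothetical helper for illustration; not part of the bioclip package.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["score"] = float(row["score"])  # csv yields strings
        current = best.get(row["file_name"])
        if current is None or row["score"] > current["score"]:
            best[row["file_name"]] = row
    return best


top = top_prediction_per_file(SAMPLE_CSV)
for file_name, row in top.items():
    print(file_name, row["species"], row["score"])
```

The same reduction could be done with pandas, which this commit adds as a dependency; the standard-library version is shown only to keep the sketch dependency-free.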

pyproject.toml

Lines changed: 6 additions & 0 deletions
@@ -32,6 +32,7 @@ dependencies = [
 'torch',
 'docopt-ng',
 'prettytable',
+'pandas',
 ]
 
 [project.urls]
@@ -90,3 +91,8 @@ exclude_lines = [
 "if __name__ == .__main__.:",
 "if TYPE_CHECKING:",
 ]
+
+[tool.pytest.ini_options]
+pythonpath = [
+"src"
+]
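
Review note: as far as I understand it, pytest's `pythonpath` ini option (available since pytest 7) prepends the listed directories to `sys.path` for the test session, which is what lets tests import the package from `src` without installing it. The sketch below imitates that effect by hand with a throwaway package; `demo_pkg_for_pythonpath` is a made-up name used only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway "src" layout containing one hypothetical package.
    src = Path(tmp) / "src"
    pkg = src / "demo_pkg_for_pythonpath"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("VALUE = 42\n")

    # Prepending "src" to sys.path makes the package importable,
    # which is roughly what the pytest option arranges for a test run.
    sys.path.insert(0, str(src))
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_pkg_for_pythonpath")
        value = module.VALUE
    finally:
        # Undo the path change and forget the throwaway module.
        sys.path.remove(str(src))
        sys.modules.pop("demo_pkg_for_pythonpath", None)

print(value)
```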

src/bioclip/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # SPDX-FileCopyrightText: 2024-present John Bradley <[email protected]>
 #
 # SPDX-License-Identifier: MIT
-from bioclip.predict import predict_classification, Rank, predict_classifications_from_list
+from bioclip.predict import TreeOfLifeClassifier, Rank, CustomLabelsClassifier
 
-__all__ = ["predict_classification", "Rank", "predict_classifications_from_list"]
+__all__ = ["TreeOfLifeClassifier", "Rank", "CustomLabelsClassifier"]
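
Review note: this rename is a breaking change for callers of the old functions, and the return shape changes too. The old README shows a mapping from a taxon string to a probability; the new one shows a list of per-rank dictionaries. A minimal sketch of an adapter that rebuilds the old mapping from the new records, assuming the key names shown in the README's `predict()` output. The function `to_legacy_mapping` is hypothetical and not part of `bioclip`.

```python
# Rank keys in the order the old taxon string concatenated them
# (inferred from the old README output; an assumption, not a documented contract).
RANK_KEYS = ["kingdom", "phylum", "class", "order", "family", "genus", "species_epithet"]


def to_legacy_mapping(predictions):
    """Convert new-style records into the old {taxon string: score} mapping.

    Hypothetical helper for illustration only.
    """
    legacy = {}
    for record in predictions:
        # Join whichever ranks are present; genus-level records lack species_epithet.
        taxon = " ".join(record[key] for key in RANK_KEYS if record.get(key))
        common_name = record.get("common_name")
        if common_name:
            taxon = f"{taxon} ({common_name})"
        legacy[taxon] = record["score"]
    return legacy


# Sample record taken from the README's predict() output.
sample = [{
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Mammalia",
    "order": "Carnivora",
    "family": "Ursidae",
    "genus": "Ursus",
    "species_epithet": "arctos",
    "species": "Ursus arctos",
    "common_name": "Kodiak bear",
    "score": 0.9356034994125366,
}]

legacy = to_legacy_mapping(sample)
print(legacy)
```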

src/bioclip/__main__.py

Lines changed: 30 additions & 26 deletions
@@ -1,4 +1,4 @@
-"""Usage: bioclip predict [options] IMAGE_FILE
+"""Usage: bioclip predict [options] [IMAGE_FILE...]
 
 Use BioCLIP to generate predictions for an IMAGE_FILE.
 
@@ -7,36 +7,41 @@
 
 Options:
   -h --help
-  --format=FORMAT format of the output (table, json, or csv) [default: table]
+  --format=FORMAT format of the output (table or csv) [default: csv]
   --rank=RANK rank of the classification (kingdom, phylum, class, order, family, genus, species) [default: species]
   --k=K number of top predictions to show [default: 5]
   --cls=CLS comma separated list of classes to predict, when specified the --rank and --k arguments are ignored [default: all]
-  --output=OUTFILE save output to a filename instead of printing it [default: stdout]
+  --output=OUTFILE print output to file OUTFILE [default: stdout]
 
 """
 from docopt import docopt
-from bioclip import predict_classification, predict_classifications_from_list, Rank
+from bioclip import TreeOfLifeClassifier, Rank, CustomLabelsClassifier
 import json
 import sys
 import prettytable as pt
 import csv
+import pandas as pd
 
 
-def write_results(result, format, outfile):
+def write_results(data, format, output):
+    df = pd.DataFrame(data)
+    if output == 'stdout':
+        write_results_to_file(df, format, sys.stdout)
+    else:
+        with open(output, 'w') as outfile:
+            write_results_to_file(df, format, outfile)
+
+
+def write_results_to_file(df, format, outfile):
     if format == 'table':
         table = pt.PrettyTable()
-        table.field_names = ['Taxon', 'Probability']
-        for taxon, prob in result.items():
-            table.add_row([taxon, prob])
+        table.field_names = df.columns
+        for index, row in df.iterrows():
+            table.add_row(row)
         outfile.write(str(table))
         outfile.write('\n')
-    elif format == 'json':
-        json.dump(result, outfile, indent=2)
     elif format == 'csv':
-        writer = csv.writer(outfile)
-        writer.writerow(['Taxon', 'Probability'])
-        for taxon, prob in result.items():
-            writer.writerow([taxon, prob])
+        df.to_csv(outfile, index=False)
     else:
         raise ValueError(f"Invalid format: {format}")
 
@@ -48,22 +53,21 @@ def main():
     output = x['--output']
     image_file = x['IMAGE_FILE']
     cls = x['--cls']
-    if not format in ['table', 'json', 'csv']:
+    if not format in ['table', 'csv']:
         raise ValueError(f"Invalid format: {format}")
     rank = Rank[x['--rank'].upper()]
     if cls == 'all':
-        result = predict_classification(img=image_file,
-                                        rank=rank,
-                                        k=int(x['--k']))
-    else:
-        result = predict_classifications_from_list(img=image_file,
-                                                   cls_ary=cls.split(','))
-    outfile = sys.stdout
-    if output == 'stdout':
-        write_results(result, format, sys.stdout)
+        classifier = TreeOfLifeClassifier()
+        data = []
+        for image_path in image_file:
+            data.extend(classifier.predict(image_path=image_path, rank=rank, k=int(x['--k'])))
+        write_results(data, format, output)
     else:
-        with open(output, 'w') as outfile:
-            write_results(result, format, outfile)
+        classifier = CustomLabelsClassifier()
+        data = []
+        for image_path in image_file:
+            data.extend(classifier.predict(image_path=image_path, cls_ary=cls.split(',')))
+        write_results(data, format, output)
 
 
 if __name__ == '__main__':
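
Review note: `write_results` now owns the choice between `sys.stdout` and a named file. One alternative way to express that choice is a small context manager, sketched below with the standard library only. This is an illustration rather than the code in this commit: `open_output` and `write_csv` are hypothetical names, and the sample records are taken from the README's `--cls` example.

```python
import contextlib
import csv
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def open_output(output):
    """Yield sys.stdout for 'stdout', otherwise a file opened for writing."""
    if output == "stdout":
        yield sys.stdout  # never closed by this helper
    else:
        # newline="" lets the csv module control line endings.
        with open(output, "w", newline="") as outfile:
            yield outfile


def write_csv(records, output):
    """Write a list of prediction dicts as CSV to stdout or to a file path."""
    fieldnames = list(records[0].keys()) if records else []
    with open_output(output) as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)


# Records shaped like the README's `--cls cat,bird,bear` output.
records = [
    {"file_name": "Ursus-arctos.jpeg", "classification": "cat", "score": 4.581644930112816e-08},
    {"file_name": "Ursus-arctos.jpeg", "classification": "bird", "score": 3.051998476166773e-08},
    {"file_name": "Ursus-arctos.jpeg", "classification": "bear", "score": 0.9999998807907104},
]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "predictions.csv"
    write_csv(records, str(target))
    lines = target.read_text().splitlines()

print(lines[0])
```

Using the literal string `'stdout'` as a sentinel, as the commit does, means a file actually named `stdout` cannot be written; that trade-off is inherited from the docopt default and is kept here unchanged.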
