Changes from all commits (35 commits)
- c56e39a: Update README.md (paul-covert, Jun 21, 2025)
- e86a431: Update README.md (paul-covert, Jun 21, 2025)
- e188510: Update README.md (paul-covert, Jun 21, 2025)
- 0167b9d: Update README.md (paul-covert, Jun 21, 2025)
- ca2eecd: Update README.md (paul-covert, Jun 21, 2025)
- 022e42f: Delete slurm-logs directory (paul-covert, Jun 21, 2025)
- 4d85c5c: Update .gitignore (paul-covert, Jun 21, 2025)
- f64af1a: Removed SLURM and batch processing functionality (Jun 21, 2025)
- 51f29cf: Update README.md (paul-covert, Jun 21, 2025)
- af3d9c1: Update README.md (paul-covert, Jun 21, 2025)
- e3d5559: Removed requirements directory (Jun 21, 2025)
- 1df2de0: Delete neuston_sbatch.py (Jun 23, 2025)
- 78d3260: Create environment.yml (Jun 23, 2025)
- 1fec295: Update environment.yml (paul-covert, Jun 23, 2025)
- c7071be: Update environment.yml (paul-covert, Jun 23, 2025)
- 8a14976: individual mac and win environment files (Jun 23, 2025)
- 6ab9ae6: Update README.md (paul-covert, Jun 25, 2025)
- 0c65bc9: Update README.md (paul-covert, Jun 25, 2025)
- 902f124: Update README.md (paul-covert, Jun 25, 2025)
- 40eff11: Changed version numbering convention (paul-covert, Jun 25, 2025)
- 09f8cb6: Update README.md (paul-covert, Jun 25, 2025)
- 21fd5b4: Updated to PyTorch 2.x api (paul-covert, Jun 25, 2025)
- 853dbc4: Update README.md (paul-covert, Jun 25, 2025)
- 2b8257f: v1.01 (htleblond, Jun 25, 2025)
- 41e0098: Merge branch 'pytorch_2.x' of (htleblond, Jun 25, 2025)
- fb1f347: v2025.07a2 (apparently) (htleblond, Jun 26, 2025)
- 259bd73: Update README.md (htleblond, Jun 27, 2025)
- 57b82eb: 1.02 (htleblond, Aug 5, 2025)
- 452297e: Update channels to conda-forge only (paul-covert, Nov 1, 2025)
- 7916f7b: Revise README for clarity on repository details (paul-covert, Nov 2, 2025)
- 786eb2c: Fix formatting in README.md (paul-covert, Nov 2, 2025)
- 0dd8ef6: Fix formatting issues in README.md (paul-covert, Nov 2, 2025)
- 53be0fa: Fix formatting and clarify repository description (paul-covert, Nov 2, 2025)
- 3172b48: Updated environment files (Nov 2, 2025)
- a16c6ff: Remove Pillow dependency from environment file (paul-covert, Nov 2, 2025)
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
# Ignore MacOS-specific files
.DS_Store

# Ignore PyCharm IDE files
.idea

20 changes: 18 additions & 2 deletions README.md
@@ -1,6 +1,22 @@
# IFCB Classifier

This repo host an image classifying program designed to be trained on plankton images from an IFCB datasource.
This repository is a fork of the original ```WHOIGit/ifcb_classifier``` repo, an image-classifying program designed to be trained on plankton images from an IFCB datasource. The ```legacy``` branch is identical to ```WHOIGit/ifcb_classifier v0.3.1```, except that the dependencies have been updated to rely on conda-forge channels only. The ```main``` branch contains the code that has been migrated to the modern PyTorch packages (```pytorch>=2.5.0```, etc.). In addition, the SLURM functionality has been removed.

For details on usage, please see [the WHOIGit repository wiki](https://github.com/WHOIGit/ifcb_classifier/wiki)

## Changes to original code

- "ptl.callbacks.base.Callback" replaced with "ptl.callbacks.Callback".
- Loaders are now passed as parameters to the NeustonModel object, as they can no longer be passed in when the trainer fits the model.
- 'input_classes', 'output_classes', 'input_srcs' and 'outputs' are now placed in a separate dictionary ('unloggable_dict') instead of being logged, as lists can no longer be logged.
- 'training_epoch_end' renamed to 'on_train_epoch_end', since the old hook name has been removed (ditto for validation and testing).
- 'steps' now stored in separate lists and cleared after their respective 'on_*_epoch_end' hook is called, as those hooks no longer accept 'steps' as a parameter.
- 'gpus' and 'checkpoint_callback' removed as parameters from the Trainers, as they are no longer valid Trainer arguments in PyTorch Lightning 2.x.
- Added parameters to Trainers: accelerator='gpu', devices=1
- Added parameter to dataloaders: persistent_workers=True (see the sketch below for how these changes fit together)
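
The following is a minimal, self-contained sketch of how these changes fit together under PyTorch Lightning 2.x. It is illustrative only: the toy module, dataset and hyperparameters are stand-ins rather than code from this repository, although the attribute names (the step buffers, ```unloggable_dict```, the loader arguments) mirror the ones used in ```NeustonModel```.

```python
import pytorch_lightning as ptl
import torch
from torch import nn
from torch.optim import Adam
from torch.utils.data import DataLoader, TensorDataset


class MinimalModule(ptl.LightningModule):
    """Toy module mirroring the migration patterns described above (not NeustonModel itself)."""

    def __init__(self, train_loader=None, val_loader=None):
        super().__init__()
        self.net = nn.Linear(4, 2)
        self.loss = nn.CrossEntropyLoss()
        # Loaders are handed to the module instead of being passed to trainer.fit().
        self.train_loader = train_loader
        self.val_loader = val_loader
        # Per-step outputs are buffered here, since the epoch-end hooks no longer receive them.
        self.train_steps = []
        self.val_steps = []
        # Non-scalar results go into a plain dict instead of self.log().
        self.unloggable_dict = {}

    def forward(self, x):
        return self.net(x)

    def configure_optimizers(self):
        return Adam(self.parameters(), lr=0.001)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = self.loss(self(x), y)
        self.train_steps.append(loss.detach())
        return loss

    def on_train_epoch_end(self):  # was training_epoch_end(self, steps)
        self.log('train_loss', torch.stack(self.train_steps).sum())  # scalars log as before
        self.train_steps.clear()  # step buffers must be cleared by hand

    def validation_step(self, batch, batch_idx):
        x, y = batch
        outputs = self(x)
        self.val_steps.append(dict(loss=self.loss(outputs, y), outputs=outputs))

    def on_validation_epoch_end(self):  # was validation_epoch_end(self, steps)
        outputs = torch.cat([s['outputs'] for s in self.val_steps]).cpu().numpy()
        self.unloggable_dict['outputs'] = outputs  # arrays/lists can no longer be self.log()'d
        self.log('val_loss', torch.stack([s['loss'] for s in self.val_steps]).sum())
        self.val_steps.clear()

    # The module now serves its own dataloaders.
    def train_dataloader(self):
        return self.train_loader

    def val_dataloader(self):
        return self.val_loader


if __name__ == '__main__':
    data = TensorDataset(torch.randn(64, 4), torch.randint(0, 2, (64,)))
    # persistent_workers requires num_workers > 0
    loader = DataLoader(data, batch_size=16, num_workers=2, persistent_workers=True)
    model = MinimalModule(train_loader=loader, val_loader=loader)
    # 'gpus' and 'checkpoint_callback' are gone; accelerator/devices replace them (assumes a CUDA GPU).
    trainer = ptl.Trainer(max_epochs=1, accelerator='gpu', devices=1)
    trainer.fit(model)
```

In the actual code, ```NeustonModel``` additionally stashes ```input_classes```, ```input_srcs``` and ```RunResults``` in the same dictionary so that the result-saving callbacks in ```neuston_callbacks.py``` can read them from ```pl_module``` rather than from ```trainer.callback_metrics```.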

## Comparison of v2025.07a2 with v0.3.1

To confirm successful migration to PyTorch 2.x, a comparison was made between a model trained with the updated code and one trained with the ```legacy_pytorch_1.7.1``` code. (TBD)

For details on usage and installation, please see [this repository's wiki](https://github.com/WHOIGit/ifcb_classifier/wiki)

32 changes: 0 additions & 32 deletions batches/templates/example.RUN.sbatch

This file was deleted.

33 changes: 0 additions & 33 deletions batches/templates/example.TRAINING.sbatch

This file was deleted.

22 changes: 22 additions & 0 deletions environment-linux-64.yml
@@ -0,0 +1,22 @@
name: ifcbnn
channels:
- conda-forge
dependencies:
- python=3.12
- pytorch=2.8.0=cuda129_mkl_py312_hda324a2_301
- torchvision=0.24.0=cuda129_py312_h9f56bbf_2
- pytorch-lightning=2.5.5
- scikit-learn
- scipy=1.13.1
- pandas=2.2.3
- h5py=3.12.1
- requests=2.32.4
- Pillow=11.1.0
- rectpack=0.2.2
- scikit-image=0.24.0
- pysmb=1.2.10
- smbprotocol=1.15.0
- pyyaml=6.0.2
- pip
- pip:
- git+https://github.com/joefutrelle/[email protected]
23 changes: 23 additions & 0 deletions environment-win-64.yml
@@ -0,0 +1,23 @@
name: ifcbnn
channels:
- conda-forge
dependencies:
- python=3.11
- pytorch
- torchvision
- pytorch-lightning
- Pillow
- scikit-learn
- scikit-image
- pandas
- h5py
- scipy
- numpy
- pyyaml
- requests
- rectpack
- pysmb
- smbprotocol
- pip
- pip:
- git+https://github.com/joefutrelle/[email protected]
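
Assuming a standard conda installation, either file can be used to build the runtime environment for its platform, for example `conda env create -f environment-linux-64.yml` (or the win-64 variant) followed by `conda activate ifcbnn`; the environment name comes from the `name:` field in the file.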
19 changes: 13 additions & 6 deletions neuston_callbacks.py
@@ -17,7 +17,8 @@

## Training ##

class SaveValidationResults(ptl.callbacks.base.Callback):
#class SaveValidationResults(ptl.callbacks.base.Callback): # deprecated in PyTorch 2.x
class SaveValidationResults(ptl.callbacks.Callback):

def __init__(self, outdir, outfile, series, best_only=True):
self.outdir = outdir
@@ -27,6 +28,7 @@ def __init__(self, outdir, outfile, series, best_only=True):

def on_validation_end(self, trainer, pl_module):
log = trainer.callback_metrics # flattened dict
unlog = pl_module.unloggable_dict
#log: val_loss input_classes output_classes input_srcs outputs epoch best train_loss f1_macro f1_weighted

if not(log['best'] or not self.best_only):
@@ -45,11 +47,14 @@ def on_validation_end(self, trainer, pl_module):
training_image_basenames = [os.path.splitext(os.path.basename(img))[0] for img in training_image_fullpaths]
training_classes = train_dataset.targets

output_scores = log['outputs']
#output_scores = log['outputs']
output_scores = unlog['outputs']
output_winscores = np.max(output_scores, axis=1)
output_classes = np.argmax(output_scores, axis=1)
input_classes = log['input_classes']
image_fullpaths = log['input_srcs']
#input_classes = log['input_classes']
input_classes = unlog['input_classes']
#image_fullpaths = log['input_srcs']
image_fullpaths = unlog['input_srcs']
image_basenames = [os.path.splitext(os.path.basename(img))[0] for img in image_fullpaths]

assert output_scores.shape[0] == len(input_classes), 'wrong number inputs-to-outputs'
@@ -272,7 +277,8 @@ def _save_run_results_hdf(outfile, results):
if outfile.endswith('.h5'): _save_run_results_hdf(outfile, results)


class SaveTestResults(ptl.callbacks.base.Callback):
#class SaveTestResults(ptl.callbacks.base.Callback): # deprecated in PyTorch 2.x
class SaveTestResults(ptl.callbacks.Callback):

def __init__(self, outdir, outfile, timestamp):
self.outdir = outdir
@@ -281,7 +287,8 @@ def __init__(self, outdir, outfile, timestamp):

def on_test_end(self, trainer, pl_module):

RRs = trainer.callback_metrics['RunResults']
#RRs = trainer.callback_metrics['RunResults']
RRs = pl_module.unloggable_dict['RunResults']
# RunResult rr: inputs, outputs, bin_id
if not isinstance(RRs,list):
RRs = [RRs]
81 changes: 64 additions & 17 deletions neuston_models.py
@@ -46,7 +46,8 @@ def get_namebrand_model(model_name, num_o_classes, pretrained=False):


class NeustonModel(ptl.LightningModule):
def __init__(self, hparams):
#def __init__(self, hparams):
def __init__(self, hparams, training_loader=None, validation_loader=None, testing_loader=None):
super().__init__()

if isinstance(hparams,dict):
@@ -59,6 +60,15 @@ def __init__(self, hparams):
self.best_val_loss = np.inf
self.best_epoch = 0
self.agg_train_loss = 0.0

# Holly's additions
self.train_steps = []
self.validation_steps = []
self.test_steps = []
self.training_loader = training_loader
self.validation_loader = validation_loader
self.testing_loader = testing_loader
self.unloggable_dict = {}

def configure_optimizers(self):
return Adam(self.parameters(), lr=0.001)
@@ -83,10 +93,14 @@ def training_step(self, batch, batch_nb):
outputs = self.forward(input_data)
batch_loss = self.loss(input_classes, outputs)
self.agg_train_loss += batch_loss.item()
self.train_steps.append(dict(loss=batch_loss))
return dict(loss=batch_loss)

def training_epoch_end(self, steps):
train_loss = torch.stack([batch['loss'] for batch in steps]).sum().item()
#def training_epoch_end(self, steps):
def on_train_epoch_end(self):
#train_loss = torch.stack([batch['loss'] for batch in steps]).sum().item()
train_loss = torch.stack([batch['loss'] for batch in self.train_steps]).sum().item()
self.train_steps = []
#print('training_epoch_end: self.agg_train_loss={:.5f}, train_loss={:.5f}, DIFF={:.9f}'.format(self.agg_train_loss, train_loss, self.agg_train_loss-train_loss), end='\n\n')
#return dict(train_loss=train_loss)

@@ -97,16 +111,24 @@ def validation_step(self, batch, batch_idx):
val_batch_loss = self.loss(input_classes, outputs)
outputs = outputs.logits if isinstance(outputs,InceptionOutputs) else outputs
outputs = softmax(outputs,dim=1)
return dict(val_batch_loss=val_batch_loss,
#return dict(val_batch_loss=val_batch_loss,
# val_outputs=outputs,
# val_input_classes=input_classes,
# val_input_srcs=input_src)
outp = dict(val_batch_loss=val_batch_loss,
val_outputs=outputs,
val_input_classes=input_classes,
val_input_srcs=input_src)
self.validation_steps.append(outp)
return outp

def validation_epoch_end(self, steps):
#def validation_epoch_end(self, steps):
def on_validation_epoch_end(self):
print(end='\n\n') # give space for progress bar
if self.current_epoch==0: self.best_val_loss = np.inf # takes care of any lingering val_loss from sanity checks

validation_loss = torch.stack([batch['val_batch_loss'] for batch in steps]).sum()
#validation_loss = torch.stack([batch['val_batch_loss'] for batch in steps]).sum()
validation_loss = torch.stack([batch['val_batch_loss'] for batch in self.validation_steps]).sum()
#eoe0 = 'validation_epoch_end: best_val_loss={}, curr_val_loss={}, curr<best={}, curr-best (neg is good)={}'
#eoe0 = eoe0.format(self.best_val_loss, validation_loss.item(), validation_loss.item()<self.best_val_loss, validation_loss.item()-self.best_val_loss)
#print(eoe0)
@@ -115,10 +137,13 @@ def validation_epoch_end(self, steps):
self.best_val_loss = validation_loss.item()
self.best_epoch = self.current_epoch

outputs = torch.cat([batch['val_outputs'] for batch in steps],dim=0).detach().cpu().numpy()
#outputs = torch.cat([batch['val_outputs'] for batch in steps],dim=0).detach().cpu().numpy()
outputs = torch.cat([batch['val_outputs'] for batch in self.validation_steps],dim=0).detach().cpu().numpy()
output_classes = np.argmax(outputs, axis=1)
input_classes = torch.cat([batch['val_input_classes'] for batch in steps],dim=0).detach().cpu().numpy()
input_srcs = [item for sublist in [batch['val_input_srcs'] for batch in steps] for item in sublist]
#input_classes = torch.cat([batch['val_input_classes'] for batch in steps],dim=0).detach().cpu().numpy()
input_classes = torch.cat([batch['val_input_classes'] for batch in self.validation_steps],dim=0).detach().cpu().numpy()
#input_srcs = [item for sublist in [batch['val_input_srcs'] for batch in steps] for item in sublist]
input_srcs = [item for sublist in [batch['val_input_srcs'] for batch in self.validation_steps] for item in sublist]

f1_weighted = metrics.f1_score(input_classes, output_classes, average='weighted')
f1_macro = metrics.f1_score(input_classes, output_classes, average='macro')
@@ -134,17 +159,23 @@ def validation_epoch_end(self, steps):
self.log('val_loss', validation_loss, on_epoch=True)

# csv_logger logger hacked to not include these in epochs.csv output
self.log('input_classes', input_classes, on_epoch=True)
self.log('output_classes', output_classes, on_epoch=True)
self.log('input_srcs', input_srcs, on_epoch=True)
self.log('outputs', outputs, on_epoch=True)
#self.log('input_classes', input_classes, on_epoch=True)
#self.log('output_classes', output_classes, on_epoch=True)
#self.log('input_srcs', input_srcs, on_epoch=True)
#self.log('outputs', outputs, on_epoch=True)
self.unloggable_dict['input_classes'] = input_classes
self.unloggable_dict['output_classes'] = output_classes
self.unloggable_dict['input_srcs'] = input_srcs
self.unloggable_dict['outputs'] = outputs

# these will apppear in epochs.csv, but are not used by callbacks
self.log('f1_macro',f1_macro, on_epoch=True)
self.log('f1_weighted',f1_weighted, on_epoch=True)

# Cleanup
self.agg_train_loss = 0.0

self.validation_steps = []

return dict(hiddens=dict(outputs=outputs))

@@ -154,14 +185,19 @@ def test_step(self, batch, batch_idx, dataloader_idx=None):
outputs = self.forward(input_data)
outputs = outputs.logits if isinstance(outputs,InceptionOutputs) else outputs
outputs = softmax(outputs, dim=1)
return dict(test_outputs=outputs, test_srcs=input_srcs)
#return dict(test_outputs=outputs, test_srcs=input_srcs)
outp = dict(test_outputs=outputs, test_srcs=input_srcs)
self.test_steps.append(outp)
return outp

def test_epoch_end(self, steps):
#def test_epoch_end(self, steps):
def on_test_epoch_end(self):

# handle single and multiple test dataloaders
datasets = self.test_dataloader()
if isinstance(datasets, list): datasets = [ds.dataset for ds in datasets]
else: datasets = [datasets.dataset]
steps = self.test_steps
if isinstance(steps[0],dict):
steps = [steps]

@@ -176,8 +212,19 @@ def test_epoch_end(self, steps):
input_obj = dataset.input_src # a path string
rr = self.RunResults(inputs=images, outputs=outputs, input_obj=input_obj)
RRs.append(rr)
self.log('RunResults',RRs)
#return dict(RunResults=RRs)
#self.log('RunResults',RRs)
self.unloggable_dict['RunResults'] = RRs
self.test_steps = []
#return dict(RunResults=RRs) # Note from Holly: I did not comment this out, it came like this.

def train_dataloader(self):
return self.training_loader

def val_dataloader(self):
return self.validation_loader

def test_dataloader(self):
return self.testing_loader

class RunResults:
def __init__(self, inputs, outputs, input_obj):
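
With the `train_dataloader`, `val_dataloader` and `test_dataloader` hooks now defined on the module, the loaders no longer need to be handed to the trainer directly: `trainer.fit(model)` and `trainer.test(model)` pick them up from the module itself, which is the standard PyTorch Lightning 2.x pattern that the constructor change above relies on.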