A lightweight template for PyTorch-based deep learning projects whose main features are configuration management (Hydra), logging (Loguru + TensorBoard), and hardware-agnostic training (Lightning Fabric). Designed for rapid experimentation while enforcing best practices.
- Why This Template?
- How to use
- Configuration Management
- Logging and Visualization
- Toy Example
- Extending the Template
- License
Note: This section provides detailed background and motivation. If you're just looking to get started quickly, you can skip to the Quick Start section. If you're interested in why this template exists and what problems it solves, expand below.
🤔Click to expand the motivation and background🤔
There are plenty of deep learning templates out there—so why this one?
As a research engineer in computer vision for over four years, my workflow has consistently involved:
- Reviewing SOTA papers
- Implementing papers or adapting their existing codebases to specific datasets and problems
However, existing implementations often come with excessive complexity. Each codebase has a different structure, making it time-consuming to adapt. In reality, I only need the essentials:
- Dataset processing
- Model architecture
- Loss functions
- Training logic
- ...
Everything else should be familiar and easy to modify for experiments. I often end up stripping down implementations to the bare minimum and rewriting them for:
- Configurable experiment management
- Effective logging (preferably free)
- Seamless multi-GPU support
Yes, there are other well-structured templates, but:
- They are over-engineered → Hard to modify, too much boilerplate
- They impose strict frameworks → Require learning Lightning or other abstractions
- They add unnecessary complexity → I just need a simple, adaptable structure
What I need is simple:
✅ Run multiple experiments with different models & settings
✅ Quickly switch configurations
✅ Track experiments efficiently
✅ Easy inference with saved settings
✅ Monitor training behavior with intuitive logging
✅ No CUDA/CPU/hardware headaches
- Configuration Management → Hierarchical configs with Hydra
- Experiment Tracking → Auto-save & load experiment settings
- Logging → Console & file logging with Loguru
- Visualization → TensorBoard support for metrics, images, and models
- Hardware Agnostic → Lightning Fabric (better flexibility than PyTorch Lightning)
- Lean & Adaptable → No unnecessary overhead, quick to modify
This template is designed to keep things simple, flexible, and experiment-focused—without unnecessary complexity.
Use this mainly for the config management and logging features. A toy example, a reconstruction autoencoder trained on a random image, shows that everything works and where the dataset, model, loss, optimizers, training/validation/inference logic, visualization, I/O, and other utilities could go. You're in control.
├── configs/ # Configuration files
│ ├── config.yaml # Main configuration
│ ├── data/ # Dataset configurations
│ ├── experiment/ # Experiment configurations
│ ├── model/ # Model configurations
│ └── training/ # Training configurations
├── model_save/ # Saved models and experiment data
├── src/ # Source code
│ ├── data/ # Dataset implementations
│ ├── model/ # Model implementations
│ ├── utils/ # Utility functions
│ │ ├── logging/ # Logging utilities
│ │ │ ├── msg_logger.py # Message logging with Loguru
│ │ │ └── tb_logger.py # TensorBoard logging
│ │ ├── io.py # I/O utilities
│ │ ├── utils.py # General utilities
│ │ └── visualization.py # Visualization utilities
│ ├── infer.py # Inference script
│ └── train.py # Training script
└── requirements.txt # Dependencies
Set up your own environment, but you need at least the packages listed in the provided requirements.txt to use the features here and run the toy example (adapt the CUDA version, or use your own installation procedure):

```
conda create -n minimal python=3.10
conda activate minimal
pip install --force-reinstall -r requirements.txt
```

To train a model with the default configuration:

```
python src/train.py
```

This will:
- Create an experiment directory in `model_save/` with the experiment name `<exp_name>`
- Save the configuration used for training in `model_save/<exp_name>/.hydra` (for reference, and used for resuming the experiment or for inference)
- Log messages to both the console and a log file `model_save/<exp_name>/train.log` in the experiment directory
- Log metrics and visualizations to TensorBoard in `model_save/<exp_name>/tb`
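This layout is driven by Hydra's run directory. A hypothetical snippet of how the main config might pin it (the exact keys used in this repo may differ; the interpolation and directory names are illustrative):

```yaml
# configs/config.yaml (illustrative fragment, not necessarily verbatim)
hydra:
  run:
    dir: model_save/${experiment.exp_name}  # becomes hydra_cfg.run.dir
  job:
    name: train                             # becomes hydra_cfg.job.name
```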
Key CLI Overrides: You can override any configuration parameter from the command line:
```
# Change run params
python src/train.py experiment.exp_name=my_experiment training.epochs=100 training.batch_size=64

# Change model architecture
python src/train.py model=complex

# Multi-GPU training
python src/train.py training.devices=2 training.accelerator="gpu"

# Mixed precision
python src/train.py training.precision="16-mixed"
```

Run inference with:

```
python src/infer.py --experiment <exp_name>
```

This:
- Loads the config from the original training run
- Saves predictions to `model_save/<exp_name>/preds`

Optionally, you can also override params from the CLI during inference (needed sometimes):

```
python src/infer.py --experiment <exp_name> data.image_type=tif
```
**Hierarchical Configs**: compose configurations from multiple files:

```yaml
# configs/config.yaml
defaults:
  - experiment: default
  - training: default
  - model: base  # or complex (see configs/model)
  - data: default
```
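Under the hood, Hydra builds the final config by merging the chosen option from each config group into one tree. A rough, stdlib-only sketch of that composition idea (the dicts and values below are illustrative stand-ins for the YAML files, not Hydra's actual implementation):

```python
def compose(defaults, groups):
    """Toy illustration of Hydra-style composition: for each entry in
    `defaults` (e.g. {"model": "base"}), pull the chosen option out of
    its config group and nest it under that group's key."""
    cfg = {}
    for group, choice in defaults.items():
        cfg[group] = dict(groups[group][choice])  # copy the chosen option
    return cfg

# Illustrative config groups, mirroring configs/{model,training}/*.yaml
groups = {
    "model": {
        "base": {"_target_": "src.model.base.UNet", "initial_features": 64},
        "complex": {"_target_": "src.model.complex.UNet", "initial_features": 128},
    },
    "training": {
        "default": {"epochs": 500, "lr": 0.0001},
    },
}

cfg = compose({"model": "base", "training": "default"}, groups)
# A CLI override like `model=complex` simply changes the choice:
cfg_complex = compose({"model": "complex", "training": "default"}, groups)
```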
**Experiment-Specific Settings**:

```yaml
# configs/experiment/default.yaml
exp_name: "unet_baseline"
resume: false
```
**Model Zoo**: switch architectures via config:

```yaml
# configs/model/base.yaml
_target_: src.model.base.UNet
in_channels: 1
out_channels: 1
initial_features: 64
```
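The `_target_` key is what lets Hydra construct the model object straight from config (via `hydra.utils.instantiate`). A simplified, stdlib-only sketch of the mechanism, using a standard-library class as a stand-in target so it runs anywhere:

```python
import importlib

def instantiate(cfg):
    """Simplified version of hydra.utils.instantiate: import the class
    named by `_target_` and call it with the remaining keys as kwargs."""
    cfg = dict(cfg)  # don't mutate the caller's config
    module_path, _, class_name = cfg.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**cfg)

# Stand-in for {"_target_": "src.model.base.UNet", "in_channels": 1, ...}
obj = instantiate({"_target_": "collections.OrderedDict", "a": 1, "b": 2})
# obj == OrderedDict(a=1, b=2)
```

Hydra's real `instantiate` adds recursion, partials, and validation on top, but this is the core idea.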
**Training Settings**:

```yaml
# configs/training/default.yaml
epochs: 500
lr: 0.0001
ckpt_frequency: 50       # Save a checkpoint every 50 epochs
resume_from_last: false  # Resume from the last checkpoint if true
accelerator: "auto"      # Lightning Fabric: "cpu", "gpu", "tpu", or "auto"
devices: "auto"          # Number of devices (e.g., 2 for 2 GPUs)
precision: "32-true"     # Mixed precision: "16-mixed", "bf16-mixed", or "32-true"
logging:
  epoch_frequency: 1     # Log metrics every epoch
  image_frequency: 50    # Log images every 50 epochs
```
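The frequency fields typically translate into simple modulo checks inside the training loop. A hypothetical sketch of how they might be consumed (the helper name is illustrative, not from this repo):

```python
def due(epoch, every):
    """True when a periodic action (checkpoint, image log, ...) is due.

    Epochs are assumed 0-indexed, so with every=50 the action fires at
    epochs 49, 99, 149, ... (i.e. on every 50th completed epoch).
    """
    return (epoch + 1) % every == 0

# With ckpt_frequency: 50 -> checkpoint at epochs 49, 99, ...
assert due(49, 50) and not due(50, 50)
# With logging.epoch_frequency: 1 -> metrics logged every epoch
assert all(due(e, 1) for e in range(5))
```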
To create a new configuration, add a YAML file to the appropriate directory:
- For a new model: `configs/model/my_model.yaml`
- For a new dataset: `configs/data/my_dataset.yaml`
- For a new training setup: `configs/training/my_training.yaml`
Then update the defaults in the main config:

```yaml
# configs/config.yaml
defaults:
  - experiment: my_experiment
  - training: my_training
  - model: my_model
  - data: my_dataset
```

Or select them from the CLI with:

```
python src/train.py model=my_model data=my_dataset training=my_training experiment=my_experiment
```

The template uses Loguru for message logging. The flow is simple; look at src/train.py. Here is a snippet for your entry point:
```python
import hydra
from src.utils.logging.msg_logger import setup_logging

# Set up msg logger
hydra_cfg = hydra.core.hydra_config.HydraConfig.get()
setup_logging(exp_dir=hydra_cfg.run.dir, log_filename=hydra_cfg.job.name)
```

Then in any file/module/function/class, do:
```python
from loguru import logger

epoch = 0
loss = 1
logger.info(f"Epoch {epoch} loss: {loss:.4f}")
logger.info("Starting training process")
logger.warning("GPU memory is running low")
logger.error("Failed to load dataset")

something = "something"
logger.opt(colors=True).info("<blue>Using color to highlight(s):</blue> <green>{}</green>", something)
```

Logs are saved to both the console and a log file in the experiment directory.
The template includes a TensorBoard logger for visualizing metrics and images:
```python
from src.utils.logging.tb_logger import TensorBoardLogger

tb_logger = TensorBoardLogger(tb_dir)

# Log a scalar value
tb_logger.log_scalar("training/loss", loss_value, step)

# Log images
tb_logger.log_images("training/generated_images", sample_images, step)

# Log model graph
tb_logger.log_model_graph(model, dummy_input)
```

To view TensorBoard logs:

```
tensorboard --logdir model_save/my_experiment/tb
```

Implemented Components:
- UNet architecture with configurable depth/features
- Random image dataset (template for easy replacement)
- MSE loss autoencoder training
- Reconstruction visualization utilities
This template is designed to be extended for your specific needs:
- Implement your model in `src/model/`
- Create a config in `configs/model/`
- Update the main config:

  ```yaml
  # configs/config.yaml
  defaults:
    - model: your_model
  ```

- Implement your dataset class in `src/data/`
- Update the data config:

  ```yaml
  # configs/data/default.yaml
  data:
    name: "your_dataset"
    image_size: 256
  ```
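For the dataset, anything that implements `__len__` and `__getitem__` works with a PyTorch `DataLoader`. A framework-free sketch of the shape such a class might take (names and fields are illustrative, loosely mirroring the toy random-image dataset; a real implementation would return tensors or arrays loaded from disk):

```python
import random

class RandomImageDataset:
    """Hypothetical stand-in for a class in src/data/: serves random
    'images' as nested lists of shape (channels, image_size, image_size)."""

    def __init__(self, num_samples=100, image_size=256, channels=1, seed=0):
        self.num_samples = num_samples
        self.image_size = image_size
        self.channels = channels
        self.seed = seed

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        if not 0 <= idx < self.num_samples:
            raise IndexError(idx)
        # Deterministic per-sample RNG so idx always yields the same image
        rng = random.Random(self.seed * 100003 + idx)
        return [[[rng.random() for _ in range(self.image_size)]
                 for _ in range(self.image_size)]
                for _ in range(self.channels)]
```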
- Edit the core loop in `src/train.py`
- Add new metrics/visualizations
- Extend logging as needed
- Make use of the core features of config management and logging for deep learning experiments
- Look at the lightweight toy example for what the workflow could be
- It should be relatively easy to open it up and adapt for your use case (hopefully)
MIT License
Copyright (c) 2021 ashleve
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
