Releases: Phhofm/models
2xParagonSR_Nano_gan
This is a perceptual model trained on my previously released ParagonSR network. It is the nano variant, and therefore the fastest.
This model was trained on downsampled content only (using varied downsampling methods so as not to overfit to a specific one like bicubic) on a photographic CC0 dataset, meaning it is intended for good-quality inputs, as the visual examples show.
Being the fastest option, it is also the least capable; this one is built for speed. I attach some photo visuals at the bottom to showcase its capability and, at the same time, its limits. I tuned my losses so that the output is (mostly) artifact-free while still aiming for perceptual results; sharper losses can lead to ringing quite quickly with such a small network (I attached my training config). While other variants should give better quality, this one trades some of it for speed. You should be able to run it with TensorRT or DirectML, though so far I have only tested it with ONNX Runtime.
The models released below are the release models as envisioned for my network. The op18_fp16 version is the recommended one. For completeness I also provide the fp32 model, and for compatibility, in case your ONNX runtime cannot handle Mish (Mish is the reason opset 18 was chosen), I also provide a no_mish fp16 checkpoint at opset 17.
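For inference with the op18_fp16 model via ONNX Runtime, the input usually needs to be a normalized NCHW fp16 tensor. A minimal preprocessing sketch, assuming an RGB uint8 input normalized to [0, 1] (the model filename and input convention are assumptions; check the release files):

```python
import numpy as np

def to_nchw_fp16(img_hwc_uint8: np.ndarray) -> np.ndarray:
    """HWC uint8 RGB -> 1xCxHxW float16 in [0, 1]."""
    x = img_hwc_uint8.astype(np.float32) / 255.0
    x = np.transpose(x, (2, 0, 1))[None, ...]  # HWC -> NCHW with batch dim
    return x.astype(np.float16)

# Running the session (standard ONNX Runtime calls; model filename assumed):
# import onnxruntime as ort
# sess = ort.InferenceSession("2xParagonSR_Nano_gan_op18_fp16.onnx",
#                             providers=["CPUExecutionProvider"])
# inp = {sess.get_inputs()[0].name: to_nchw_fp16(img)}
# sr = sess.run(None, inp)[0]  # 1x3x(2H)x(2W) for this 2x model
```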
I will probably also provide the training config and the fused checkpoint, just for completeness, and maybe the unfused "raw" training checkpoint as well.
On a related note, I am excited to eventually show what I am working on right now: a model based on a different version of this network, with its own discriminator, R3GAN instead of a vanilla GAN for more stability, an additional slight CLIP-based contrastive loss, and the legacy ImageNet-VGG perceptual loss replaced with a ConvNeXt-Tiny one (more modern; I basically disliked the watery, painterly GAN look as well as GAN artifacts and try to counteract them). I am currently optimizing several aspects of my SISR training for future models, but that is a slightly different topic and model than this release.
2xBHI_small_esrgan_pretrain
Scale: 2x
Network type: ESRGAN
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: 2x ESRGAN pretrain model trained with L1 and MS-SSIM losses only.
Training iterations: 300'000
Description: A 2x ESRGAN pretrain model.
Slowpic Example
https://slow.pics/s/92cYa0w3?image-fit=cover
2xBHI_small_drct-xl_pretrain
Scale: 2x
Network type: DRCT-XL
Author: Philip Hofmann
License: CC-BY-4.0
Purpose: 2x DRCT-XL pretrain model trained with L1 and MS-SSIM losses only.
Training iterations: 180'000
Description: A 2x DRCT-XL pretrain model.
Slowpic Example
https://slow.pics/s/PkqzlPH1?image-fit=cover

1xgaterv3_r_sharpen
Scale: 1×
Network Type: GaterV3
Author: Philip Hofmann
Iterations: 90,000
License: CC BY 4.0
Purpose: A 1× enhancement model designed to subtly sharpen image outputs.
It can be used as a standalone model or chained directly after 1xgaterv3_r_restore for improved sharpness and clarity.
Usage
This model was trained on CC0 images using the GaterV3 architecture (MIT).
It is released under the CC BY 4.0 license, permitting personal, commercial, and open use with proper attribution or prior agreement.
ONNX conversions are included for easy inference.
🖼️ Visual Results
Full Changelog: 2xPublic_realplksr_dysample_layernorm_gan...1xgaterv3_r_sharpen
1xgaterv3_r_restore
Scale: 1×
Network Type: GaterV3
Author: Philip Hofmann
Iterations: 180,000
License: CC BY 4.0
Purpose: A 1× restoration model for improving image quality — handles noise, resizing, JPEG compression, and mild blur.
If outputs appear too soft, you can chain 1xgaterv3_r_sharpen afterward to slightly enhance sharpness.
Usage
This model was trained on CC0 images using the GaterV3 architecture (MIT).
It is released under the CC BY 4.0 license, permitting personal, commercial, and open use with proper attribution or prior agreement.
ONNX conversions are included for easy inference.
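Chaining the two 1× models simply means feeding one model's output into the next model's input; since both are 1×, the image shape is preserved throughout. A minimal sketch of that pattern with generic callables (the ONNX filenames in the comments are assumptions based on the release names):

```python
import numpy as np

def chain(models, img: np.ndarray) -> np.ndarray:
    """Run a sequence of 1x models; each maps an image tensor to a
    same-shaped image tensor (restore first, then sharpen)."""
    for model in models:
        out = model(img)
        assert out.shape == img.shape, "1x models must preserve shape"
        img = out
    return img

# With ONNX Runtime sessions this could look like (filenames assumed):
# import onnxruntime as ort
# restore = ort.InferenceSession("1xgaterv3_r_restore.onnx")
# sharpen = ort.InferenceSession("1xgaterv3_r_sharpen.onnx")
# run = lambda s: (lambda x: s.run(None, {s.get_inputs()[0].name: x})[0])
# result = chain([run(restore), run(sharpen)], input_tensor)
```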
🖼️ Visual Results

🚀 2xPublic_realplksr_dysample_layernorm_real_nn
Scale: 2x
Network Type: RealPLKSR Dysample LayerNorm
Author: Philip Hofmann
Iterations: 120'000
License: Apache 2.0
Purpose: 2× upscaling of images with some degradation handling (blur, JPEG and sharpening artifacts), but no denoising, which can help preserve subtle textures
🔍 Overview
2× super-resolution model combining:
- RealPLKSR backbone (GAN-optimized architecture)
- Dysample dynamic upsampling
- LayerNorm stabilization
Trained on rigorously curated public domain imagery for robust real-world performance. Specifically, no noise was added to the low-resolution counterparts of the training data. This may help retain more detail, particularly subtle textures, since denoising can lead to smooth outputs that wash out subtle high-frequency textures.
🖼️ Visual Results
🧠 Model Details
🏋️ Training Dataset
- Input: 56K+ public domain images (CC0/public domain marked)
- Processing: On average 6 hours/10K images (processing alone)
- Final Training Dataset: 4,297 512×512 tiles after:
- Multi-scale strategy
- Visual de-duplication
- Strict quality thresholds of 10 different IQA metric scores
⚙️ LR Degradation Pipeline
Probabilistic degradation stack pre-applied on the training set:
- Optical Blur
  📌 Gaussian kernels (3×3 to 7×7)
  📌 σ = 0.8-2.5 (randomized per sample)
- Resampling Cycles
  📌 Downsampling methods: INTER_AREA, INTER_LINEAR, INTER_CUBIC
  📌 Upsampling methods: INTER_CUBIC, INTER_LINEAR, INTER_NEAREST, INTER_LANCZOS4
  📌 Asymmetric x/y scaling factors
- JPEG Compression
  📌 1-3 compression cycles
  📌 Chroma subsampling (4:2:0/4:2:2)
  📌 Quality: 30-90 (variable)
- Sharpening Artifacts
  📌 Laplacian edge detection
  📌 Intensity difference masking
- Final Downscaling
  📌 DPID (Detail-Preserving Image Downscaling)
  📌 Kernel width: 0.4-0.6 (randomized)
🧩 Architecture
| Component | Origin | License | Contribution |
|---|---|---|---|
| PLKSR | dslisleedh/PLKSR | MIT | Base architecture |
| RealPLKSR | neosr-project | MIT | GAN stabilization |
| Dysample | tiny-smart/dysample | MIT | Dynamic upsampling |
| LayerNorm | traiNNer-redux | Apache 2.0 | Training stability |
🔮 Future Plans
I plan to enlarge the training dataset from the current 4,297 512x512 tiles. This expansion should significantly improve model quality by:
- Increasing the amount of information and contextual understanding available during training
- Enhancing the distribution coverage of degradation strengths
- Improving generalization capabilities across diverse real-world scenarios
- Reducing potential overfitting to specific artifact patterns
I also plan to adjust the degradations and their strengths applied during LR creation, and possibly the losses, based on what I learned from inspecting the validation images.
📜 License (Apache 2.0)
✅ Permitted Usage
- Commercial applications (SaaS, apps, APIs)
- Non-commercial/research projects
- Model modification & fine-tuning
- Redistribution of original/derived versions
- Patent implementation of included techniques
🚫 Restrictions
- Removing original attribution
- Implying creator endorsement
- Patent litigation against included techniques
📎 Attribution Requirements
Required when:
- Public product/service usage
- Research publications
- Model redistribution
Attribution Template:
Image enhancement powered by 2xPublic_realplksr_dysample_layernorm_real_nn
by Philip Hofmann. [GitHub Release](https://github.com/Phhofm/models)
⚠️ Legal Note: Full license terms govern all usage. This summary doesn't constitute legal advice.
📄 Complete license: Apache 2.0 TEXT
🔗 References
🚀 2xPublic_realplksr_dysample_layernorm_real
Scale: 2x
Network Type: RealPLKSR Dysample LayerNorm
Author: Philip Hofmann
Iterations: 250'000
License: Apache 2.0
Purpose: 2× upscaling of images with some degradation handling (blur, noise, JPEG and sharpening artifacts)
🔍 Overview
2× super-resolution model combining:
- RealPLKSR backbone (GAN-optimized architecture)
- Dysample dynamic upsampling
- LayerNorm stabilization
Trained on rigorously curated public domain imagery for robust real-world performance.
🖼️ Visual Results
🧠 Model Details
🏋️ Training Dataset
- Input: 56K+ public domain images (CC0/public domain marked)
- Processing: On average 6 hours/10K images (processing alone)
- Final Training Dataset: 4,297 512×512 tiles after:
- Multi-scale strategy
- Visual de-duplication
- Strict quality thresholds of 10 different IQA metric scores
⚙️ LR Degradation Pipeline
Probabilistic degradation stack pre-applied on the training set:
- Optical Blur
  📌 Gaussian kernels (3×3 to 7×7)
  📌 σ = 0.8-2.5 (randomized per sample)
- Sensor Noise
  📌 Physics-based Poisson-Gaussian model
  📌 Channel-dependent sensitivity parameters
- Resampling Cycles
  📌 Downsampling methods: INTER_AREA, INTER_LINEAR, INTER_CUBIC
  📌 Upsampling methods: INTER_CUBIC, INTER_LINEAR, INTER_NEAREST, INTER_LANCZOS4
  📌 Asymmetric x/y scaling factors
- JPEG Compression
  📌 1-3 compression cycles
  📌 Chroma subsampling (4:2:0/4:2:2)
  📌 Quality: 30-90 (variable)
- Sharpening Artifacts
  📌 Laplacian edge detection
  📌 Intensity difference masking
- Final Downscaling
  📌 DPID (Detail-Preserving Image Downscaling)
  📌 Kernel width: 0.4-0.6 (randomized)
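The Poisson-Gaussian sensor noise step can be sketched in a few lines: shot noise scales with the signal (Poisson on the photon count), while read noise is signal-independent (Gaussian). The per-channel photon counts and read-noise sigma below are illustrative assumptions, not the release's actual parameters:

```python
import numpy as np

def sensor_noise(img: np.ndarray, photons=(900.0, 1000.0, 800.0),
                 read_sigma=2.0, rng=None) -> np.ndarray:
    """Poisson-Gaussian sensor-noise sketch.

    img:        float array in [0, 1], shape (H, W, 3).
    photons:    per-channel photon counts at full scale, modeling
                channel-dependent sensitivity (illustrative values).
    read_sigma: read-noise standard deviation in 8-bit units.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty_like(img)
    for c, p in enumerate(photons):
        # Signal-dependent shot noise: Poisson on the photon count
        out[..., c] = rng.poisson(img[..., c] * p) / p
    # Signal-independent Gaussian read noise
    out += rng.normal(0.0, read_sigma / 255.0, img.shape)
    return np.clip(out, 0.0, 1.0)
```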
🧩 Architecture
| Component | Origin | License | Contribution |
|---|---|---|---|
| PLKSR | dslisleedh/PLKSR | MIT | Base architecture |
| RealPLKSR | neosr-project | MIT | GAN stabilization |
| Dysample | tiny-smart/dysample | MIT | Dynamic upsampling |
| LayerNorm | traiNNer-redux | Apache 2.0 | Training stability |
🔮 Future Plans
I plan to enlarge the training dataset from the current 4,297 512x512 tiles. This expansion should significantly improve model quality by:
- Increasing the amount of information and contextual understanding available during training
- Enhancing the distribution coverage of degradation strengths
- Improving generalization capabilities across diverse real-world scenarios
- Reducing potential overfitting to specific artifact patterns
I also plan to adjust the degradations and their strengths applied during LR creation, and possibly the losses, based on what I learned from inspecting the validation images.
📜 License (Apache 2.0)
✅ Permitted Usage
- Commercial applications (SaaS, apps, APIs)
- Non-commercial/research projects
- Model modification & fine-tuning
- Redistribution of original/derived versions
- Patent implementation of included techniques
🚫 Restrictions
- Removing original attribution
- Implying creator endorsement
- Patent litigation against included techniques
📎 Attribution Requirements
Required when:
- Public product/service usage
- Research publications
- Model redistribution
Attribution Template:
Image enhancement powered by 2xPublic_realplksr_dysample_layernorm_real
by Philip Hofmann. [GitHub Release](https://github.com/Phhofm/models/releases/tag/2xPublic_realplksr_dysample_layernorm_real)
⚠️ Legal Note: Full license terms govern all usage. This summary doesn't constitute legal advice.
📄 Complete license: Apache 2.0 TEXT
🔗 References
🚀 2xPublic_realplksr_dysample_layernorm_pretrain
Scale: 2x
Network Type: RealPLKSR Dysample LayerNorm
Author: Philip Hofmann
Iterations: 100'000
License: Apache 2.0
Purpose: 2× pretrain
🔍 Overview
2× super-resolution pretrain model combining:
- RealPLKSR backbone (GAN-optimized architecture)
- Dysample dynamic upsampling
- LayerNorm stabilization
Trained on rigorously curated public domain imagery; this one was trained on DPID-downscaled content only.
🖼️ Visual Results
🧠 Model Details
🏋️ Training Dataset
- Input: 56K+ public domain images (CC0/public domain marked)
- Processing: On average 6 hours/10K images (processing alone)
- Final Training Dataset: 4,297 512×512 tiles after:
- Multi-scale strategy
- Visual de-duplication
- Strict quality thresholds of 10 different IQA metric scores
⚙️ LR Degradation Pipeline
DPID downscaled only
🧩 Architecture
| Component | Origin | License | Contribution |
|---|---|---|---|
| PLKSR | dslisleedh/PLKSR | MIT | Base architecture |
| RealPLKSR | neosr-project | MIT | GAN stabilization |
| Dysample | tiny-smart/dysample | MIT | Dynamic upsampling |
| LayerNorm | traiNNer-redux | Apache 2.0 | Training stability |
🔮 Future Plans
I plan to enlarge the training dataset from the current 4,297 512x512 tiles. This expansion should significantly improve model quality by:
- Increasing the amount of information and contextual understanding available during training
- Enhancing the distribution coverage of degradation strengths
- Improving generalization capabilities across diverse real-world scenarios
- Reducing potential overfitting to specific artifact patterns
📜 License (Apache 2.0)
✅ Permitted Usage
- Commercial applications (SaaS, apps, APIs)
- Non-commercial/research projects
- Model modification & fine-tuning
- Redistribution of original/derived versions
- Patent implementation of included techniques
🚫 Restrictions
- Removing original attribution
- Implying creator endorsement
- Patent litigation against included techniques
📎 Attribution Requirements
Required when:
- Public product/service usage
- Research publications
- Model redistribution
Attribution Template:
Image enhancement powered by 2xPublic_realplksr_dysample_layernorm_pretrain
by Philip Hofmann. [GitHub Release](https://github.com/Phhofm/models)
⚠️ Legal Note: Full license terms govern all usage. This summary doesn't constitute legal advice.
📄 Complete license: Apache 2.0 TEXT
🔗 References
🚀 2xPublic_realplksr_dysample_layernorm_gan
Scale: 2x
Network Type: RealPLKSR Dysample LayerNorm
Author: Philip Hofmann
Iterations: 125'000
License: Apache 2.0
Purpose: 2× upscaling of images, no degradation handling
🔍 Overview
2× super-resolution model combining:
- RealPLKSR backbone (GAN-optimized architecture)
- Dysample dynamic upsampling
- LayerNorm stabilization
Trained on rigorously curated public domain imagery; this one was trained on DPID-downscaled content only.
🖼️ Visual Results
🧠 Model Details
🏋️ Training Dataset
- Input: 56K+ public domain images (CC0/public domain marked)
- Processing: On average 6 hours/10K images (processing alone)
- Final Training Dataset: 4,297 512×512 tiles after:
- Multi-scale strategy
- Visual de-duplication
- Strict quality thresholds of 10 different IQA metric scores
⚙️ LR Degradation Pipeline
DPID downscaled only
🧩 Architecture
| Component | Origin | License | Contribution |
|---|---|---|---|
| PLKSR | dslisleedh/PLKSR | MIT | Base architecture |
| RealPLKSR | neosr-project | MIT | GAN stabilization |
| Dysample | tiny-smart/dysample | MIT | Dynamic upsampling |
| LayerNorm | traiNNer-redux | Apache 2.0 | Training stability |
🔮 Future Plans
I plan to enlarge the training dataset from the current 4,297 512x512 tiles. This expansion should significantly improve model quality by:
- Increasing the amount of information and contextual understanding available during training
- Enhancing the distribution coverage of degradation strengths
- Improving generalization capabilities across diverse real-world scenarios
- Reducing potential overfitting to specific artifact patterns
I also plan to adjust the degradations and their strengths applied during LR creation, and possibly the losses, based on what I learned from inspecting the validation images.
📜 License (Apache 2.0)
✅ Permitted Usage
- Commercial applications (SaaS, apps, APIs)
- Non-commercial/research projects
- Model modification & fine-tuning
- Redistribution of original/derived versions
- Patent implementation of included techniques
🚫 Restrictions
- Removing original attribution
- Implying creator endorsement
- Patent litigation against included techniques
📎 Attribution Requirements
Required when:
- Public product/service usage
- Research publications
- Model redistribution
Attribution Template:
Image enhancement powered by 2xPublic_realplksr_dysample_layernorm_gan
by Philip Hofmann. [GitHub Release](https://github.com/Phhofm/models)
⚠️ Legal Note: Full license terms govern all usage. This summary doesn't constitute legal advice.
📄 Complete license: Apache 2.0 TEXT
🔗 References
2xBHI_small_realplksr_small_pretrain
Scale: 2x
Network type: realplksr_small
Author: Philip Hofmann
License: CC-BY-4.0
Release: 21.05.2025
Purpose: 2x realplksr_small pretrain model trained with L1 and MS-SSIM losses only.
Training iterations: 100'000
Description: A 2x realplksr_small pretrain model.