Configuration

ESA OpenSR relies on YAML files to control every aspect of the training pipeline. This page documents the available keys and how they influence the underlying code. Use opensr_srgan/configs/config_20m.yaml and opensr_srgan/configs/config_10m.yaml as starting points.

File structure

A typical configuration contains the following top-level sections:

Data:
Model:
Training:
Generator:
Discriminator:
Optimizers:
Schedulers:
Logging:

Each section maps directly to parameters consumed inside opensr_srgan/model/SRGAN.py, the dataset factory, or the training script.
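
For orientation, the nesting of those sections (including the Training sub-blocks referenced later on this page) looks roughly like the sketch below. The comments are placeholders only; consult the bundled config_20m.yaml / config_10m.yaml for the exact layout.

Data:
  # dataloader settings and normalisation policy (see "Data")
Model:
  # channel count and checkpoint handling (see "Model")
Training:
  # warm-up and adversarial scheduling keys (see "Training")
  EMA:
    # optional generator EMA tracking (see "Generator EMA")
  Losses:
    # loss weights and metric settings (see "Generator content loss")
Generator:
  # generator architecture (see "Generator")
Discriminator:
  # discriminator architecture (see "Discriminator")
Optimizers:
  # Adam settings for both networks (see "Optimisers")
Schedulers:
  # plateau schedulers and generator LR warmup (see "Schedulers")
Logging:
  # validation image logging (see "Logging")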

Data

| Key | Default | Description |
| --- | --- | --- |
| train_batch_size | 12 | Mini-batch size for the training dataloader. Falls back to batch_size if that key is set. |
| val_batch_size | 8 | Batch size for validation. |
| num_workers | 6 | Number of worker processes for both dataloaders. |
| prefetch_factor | 2 | Additional batches prefetched by each worker. Ignored when num_workers == 0. |
| dataset_type | ExampleDataset | Dataset selector consumed by opensr_srgan.data.dataset_selector.select_dataset. |
| normalization | 'sen2_stretch' | Normalisation strategy applied to input tensors. Accepts a string alias or a mapping (see below). |
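
Put together, a Data block that simply mirrors the defaults above might look like this sketch (ExampleDataset is the bundled placeholder dataset; swap in whatever your dataset selector expects):

Data:
  train_batch_size: 12
  val_batch_size: 8
  num_workers: 6
  prefetch_factor: 2
  dataset_type: ExampleDataset
  normalization: sen2_stretch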

Normalization policies

The Normalizer class (opensr_srgan.data.utils.normalizer.Normalizer) centralises all normalisation logic. Pick one of the built-in aliases that matches your data:

| Method | Description |
| --- | --- |
| sen2_stretch | Multiply by 10/3 for a light Sentinel-2 contrast stretch. |
| normalise_10k / reflectance | Scale Sentinel-2-style 0–10000 reflectance values to [0, 1]. |
| normalise_10k_signed / reflectance_signed | Scale 0–10000 reflectance to [-1, 1] (value / 5000 - 1). |
| normalise_s2 | Symmetric Sentinel-2 stretch used during training (maps to [-1, 1] and back). |
| zero_one | Clamp incoming values to [0, 1] without otherwise changing them. |
| zero_one_signed | Convert [0, 1] inputs to [-1, 1] via the common tensor * 2 - 1 rule. |
| identity / none | Leave tensors unchanged (use when data is already normalised). |

Aliases such as reflectance, sentinel2, or zero_to_one map to the canonical entries above. Call opensr_srgan.data.utils.normalizer.Normalizer.available_methods to inspect the current list programmatically.
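
In the simplest case, normalization is just one of those string aliases, for example:

Data:
  normalization: normalise_10k   # any alias listed above works here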

Using custom callables

When you need a bespoke policy, provide a mapping instead of a string. The normaliser will import and wrap your functions:

Data:
  normalization:
    name: custom
    normalize: my_package.normalization:scale_to_unit
    denormalize: my_package.normalization:unit_to_scale
    # Optional keyword arguments applied when calling the functions
    normalize_kwargs:
      clip: true

The callables receive a single torch.Tensor argument and must return a tensor. If you need to reuse the same function for both directions (for example opensr_srgan.utils.radiometrics.normalise_10k), add normalize_kwargs / denormalize_kwargs with the appropriate stage parameter.

Model

| Key | Default | Description |
| --- | --- | --- |
| in_bands | 6 | Number of input channels expected by the generator and discriminator. |
| continue_training | False | Path to a Lightning checkpoint for resuming training (Trainer.fit(resume_from_checkpoint=...)). |
| load_checkpoint | False | Path to a checkpoint used solely for weight initialisation (no training state restored). |
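
As a sketch, a Model block for a fresh six-band run keeps both checkpoint keys disabled; either can instead hold a path to a checkpoint as described above:

Model:
  in_bands: 6
  continue_training: False   # or a path to a Lightning checkpoint to resume training from
  load_checkpoint: False     # or a path used only to initialise weights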

Training

Warm-up and adversarial scheduling

| Key | Default | Description |
| --- | --- | --- |
| pretrain_g_only | True | Enable generator-only warm-up before adversarial updates. |
| g_pretrain_steps | 10000 | Number of optimiser steps spent in the warm-up phase. |
| adv_loss_ramp_steps | 5000 | Duration of the adversarial weight ramp after the warm-up. |
| label_smoothing | True | Replaces target value 1.0 with 0.9 for real examples to stabilise discriminator training. |
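
These keys appear to sit directly under Training; as a sketch using the defaults above (check the bundled configs for the exact placement):

Training:
  pretrain_g_only: True       # generator-only warm-up first
  g_pretrain_steps: 10000     # optimiser steps spent in the warm-up
  adv_loss_ramp_steps: 5000   # ramp of the adversarial weight afterwards
  label_smoothing: True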

Generator EMA (Training.EMA)

Maintaining an exponential moving average (EMA) of the generator smooths out abrupt weight updates and usually yields sharper yet more stable validation imagery. The EMA is fully optional and controlled through the Training.EMA block:

| Key | Default | Description |
| --- | --- | --- |
| enabled | False | Turns EMA tracking on/off. When enabled, the EMA weights automatically replace the live generator during evaluation/inference. |
| decay | 0.999 | Smoothing factor applied at every update. Values closer to 1.0 retain longer history. |
| update_after_step | 0 | Defers EMA updates until the given optimiser step. Useful when you want the generator to warm up before tracking. |
| device | null | Stores EMA weights on a dedicated device ("cpu", "cuda:1", …). null keeps the weights on the same device as the generator. |
| use_num_updates | True | Enables PyTorch's bias correction so the EMA ramps in smoothly during the first few updates. |
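
For example, a hedged Training.EMA block that starts tracking only after the default 10000-step generator warm-up (adjust update_after_step to your own schedule):

Training:
  EMA:
    enabled: True
    decay: 0.999
    update_after_step: 10000   # begin EMA tracking once the warm-up phase ends
    device: null               # keep EMA weights on the generator's device
    use_num_updates: True      # bias-corrected ramp-in during the first updates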

Generator content loss (Training.Losses)

| Key | Default | Description |
| --- | --- | --- |
| adv_loss_beta | 1e-3 | Target weight applied to the adversarial term after ramp-up. |
| adv_loss_schedule | cosine | Ramp shape (linear or cosine). |
| adv_loss_type | bce | Adversarial objective (bce for classic SRGAN logits, wasserstein for a non-saturating critic-style loss). |
| r1_gamma | 0.0 | Strength of the R1 gradient penalty applied to real images (useful with Wasserstein critics). |
| l1_weight | 1.0 | Weight of the pixelwise L1 loss. |
| sam_weight | 0.05 | Weight of the spectral angle mapper loss. |
| perceptual_weight | 0.1 | Weight of the perceptual feature loss. |
| perceptual_metric | vgg | Backbone used for perceptual features (vgg or lpips). |
| tv_weight | 0.0 | Total variation regularisation strength. |
| max_val | 1.0 | Peak value assumed by PSNR/SSIM computations. |
| ssim_win | 11 | Window size for SSIM metrics. Must be an odd integer. |
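
Spelled out as YAML, a Training.Losses block that mirrors the defaults above (the weights are starting points, not tuned values):

Training:
  Losses:
    adv_loss_beta: 1e-3
    adv_loss_schedule: cosine
    adv_loss_type: bce
    r1_gamma: 0.0
    l1_weight: 1.0
    sam_weight: 0.05
    perceptual_weight: 0.1
    perceptual_metric: vgg
    tv_weight: 0.0
    max_val: 1.0
    ssim_win: 11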

Generator

| Key | Default | Description |
| --- | --- | --- |
| model_type | SRResNet | Generator family (SRResNet, stochastic_gan, or esrgan). |
| block_type | standard | SRResNet variant (standard, res, rcab, rrdb, lka). Ignored for stochastic_gan/esrgan. |
| large_kernel_size | 9 | Kernel size for input/output convolution layers. |
| small_kernel_size | 3 | Kernel size for residual/attention blocks. |
| n_channels | 96 | Base number of feature channels (RRDB/ESRGAN trunk width). |
| n_blocks | 32 | Number of residual/attention blocks (RRDB count when model_type: esrgan). |
| scaling_factor | 8 | Super-resolution scale factor (2, 4, 8, ...). |
| growth_channels | 32 | ESRGAN-only: growth channels inside each RRDB block. |
| res_scale | 0.2 | Residual scaling used by stochastic/ESRGAN variants. |
| out_channels | Model.in_bands | ESRGAN-only: override the number of output bands. |
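
For instance, an 8× SRResNet generator with RCAB attention blocks, using the defaults above, could be configured as follows (a sketch, not a tuned preset):

Generator:
  model_type: SRResNet
  block_type: rcab
  large_kernel_size: 9
  small_kernel_size: 3
  n_channels: 96
  n_blocks: 32
  scaling_factor: 8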

Discriminator

| Key | Default | Description |
| --- | --- | --- |
| model_type | standard | Discriminator architecture (standard, patchgan, or esrgan). |
| n_blocks | 8 | Number of convolutional blocks. PatchGAN defaults to 3 when unspecified (ignored by esrgan). |
| base_channels | 64 | ESRGAN-only: base number of feature maps. |
| linear_size | 1024 | ESRGAN-only: hidden dimension of the fully connected head. |
| use_spectral_norm | False | Apply spectral normalization to the SRGAN discriminator layers for improved Lipschitz control. |
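
A matching Discriminator block for the standard SRGAN-style critic, again just restating the defaults above:

Discriminator:
  model_type: standard
  n_blocks: 8
  use_spectral_norm: False   # enable for extra Lipschitz control if training destabilises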

Suggested settings

Generator presets

The defaults in the YAML configs intentionally balance stability and fidelity for Sentinel-2 data. Start here before performing sweeps:

  • Keep n_channels around 96 for residual-style backbones so feature widths match the initial convolution used by the flexible generator factory.
  • Depth drives detail. Begin with n_blocks = 32 for flexible variants and reduce to 16 when training budgets are tight or when using the conditional generator, which already injects stochasticity via latent noise.
  • Set scaling_factor according to your target resolution (2×/4×/8×); all bundled generators support those values out of the box.
| Generator type | Recommended n_channels | Recommended n_blocks | Typical scaling_factor | Notes |
| --- | --- | --- | --- | --- |
| SRResNet (block_type: standard) | 64 | 16 | – | Canonical baseline with batch-norm residual blocks; scale can be 2×/4×/8× as needed. |
| SRResNet (block_type: res) | 96 | 32 | 4×–8× | Lightweight residual blocks without batch norm; works well for high-scale (8×) Sentinel data. |
| SRResNet (block_type: rcab) | 96 | 32 | 4×–8× | Attention-enhanced residual blocks; keep depth high to exploit channel attention. |
| SRResNet (block_type: rrdb) | 96 | 32 | 4×–8× | Dense residual blocks expand the receptive field; expect higher VRAM use at 32 blocks. |
| SRResNet (block_type: lka) | 96 | 24–32 | 4×–8× | Large-kernel attention blocks stabilise at moderate depth; drop to 24 blocks if memory bound. |
| stochastic_gan | 96 | 16 | – | Latent-modulated residual stack; pair with noise_dim ≈ 128 and res_scale ≈ 0.2 defaults. |
| esrgan | 64 | 23 | – | ESRGAN-style RRDB trunk; tune growth_channels (typically 32) and keep res_scale ≈ 0.2 for stability. |

Discriminator presets

Tune discriminator depth to match the generator capacity—too shallow and adversarial loss underfits, too deep and the training loop destabilises. These starting points mirror the architectures bundled with the repo:

| Discriminator type | Recommended depth parameter | Additional notes |
| --- | --- | --- |
| standard | n_blocks = 8 | Mirrors the original SRGAN CNN with alternating stride-1/stride-2 blocks before the dense head. |
| patchgan | n_blocks = 3 | Maps to the 3-layer PatchGAN (a.k.a. n_layers); increase to 4–5 for larger crops or when the generator is particularly sharp. |
| esrgan | base_channels = 64, linear_size = 1024 | Deep VGG-style discriminator from ESRGAN; keep the base width aligned with the generator feature count. |

When adjusting these presets, scale generator and discriminator together and monitor adversarial loss ramps defined in Training.Losses to keep training stable.
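
To illustrate how the two preset tables combine, an ESRGAN-style pairing might look like the sketch below; the exact key placement should be verified against the bundled configs, and scaling_factor is set here purely as an example:

Generator:
  model_type: esrgan
  n_channels: 64
  n_blocks: 23          # RRDB count
  growth_channels: 32
  res_scale: 0.2
  scaling_factor: 4     # pick 2, 4, or 8 to match your target resolution

Discriminator:
  model_type: esrgan
  base_channels: 64
  linear_size: 1024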

Note

When you pick model_type: esrgan or stochastic_gan, SRResNet-only keys such as block_type, large_kernel_size, or small_kernel_size are automatically ignored. The model factory prints a console notice so you know which settings were overridden.

Optimisers

The trainer instantiates independent Adam optimisers for the generator and discriminator and enables a Two-Time-Scale Update Rule (TTUR) setup by default. The discriminator learning rate automatically defaults to a slower schedule than the generator, which keeps adversarial updates balanced without extra configuration.

| Key | Default | Description |
| --- | --- | --- |
| optim_g_lr | 1e-4 | Learning rate for the generator Adam optimiser. |
| optim_d_lr | 0.5 * optim_g_lr | Learning rate for the discriminator. Falls back to half of the generator LR (TTUR) when not explicitly set. |
| betas | (0.0, 0.99) | GAN-friendly Adam momentum pair: β1 = 0.0 drops first-moment momentum while β2 = 0.99 keeps the second-moment estimate responsive. |
| eps | 1e-7 | Lower epsilon that matches common GAN recipes and prevents plateau-induced numerical noise. |
| weight_decay_g | 0.0 | Weight decay applied to generator parameters that are not normalisation affine/bias terms. |
| weight_decay_d | 0.0 | Weight decay applied to discriminator parameters that are not normalisation affine/bias terms. |
| gradient_clip_val | 0.0 | Global gradient-norm clipping threshold applied to both optimisers (set to 0 to disable). |

Weight decay exclusions are handled automatically: batch/instance/group-norm layers and bias parameters are filtered into a no-decay group so regularisation only touches convolutional kernels and dense weights. This mirrors best practices for GAN training and keeps normalisation statistics stable.
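
In YAML, the TTUR defaults translate to something like the following sketch (leave optim_d_lr out to keep the automatic half-of-generator fallback; betas is written as a YAML list here, which may differ from the bundled configs):

Optimizers:
  optim_g_lr: 1e-4
  # optim_d_lr: 5e-5        # optional explicit override; defaults to 0.5 * optim_g_lr
  betas: [0.0, 0.99]
  eps: 1e-7
  weight_decay_g: 0.0
  weight_decay_d: 0.0
  gradient_clip_val: 0.0    # 0 disables gradient-norm clipping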

Schedulers

Both optimisers share the same configuration keys because they use torch.optim.lr_scheduler.ReduceLROnPlateau.

| Key | Default | Description |
| --- | --- | --- |
| metric | val_metrics/l1 | Validation metric monitored for plateau detection. |
| metric_g | – | Optional override for the generator scheduler monitor. |
| metric_d | – | Optional override for the discriminator scheduler monitor. |
| patience_g | 100 | Epochs with no improvement before reducing the generator LR. |
| patience_d | 100 | Epochs with no improvement before reducing the discriminator LR. |
| factor_g | 0.5 | Multiplicative factor applied to the generator LR upon plateau. |
| factor_d | 0.5 | Multiplicative factor applied to the discriminator LR upon plateau. |
| cooldown | 0 | Number of epochs to wait after an LR drop before resuming plateau checks. |
| min_lr | 1e-7 | Minimum learning rate allowed for both schedulers. |
| verbose | True | Enables scheduler logging messages. |
| g_warmup_steps | 2000 | Number of optimiser steps used for generator LR warmup. Set to 0 to disable. |
| g_warmup_type | cosine | Warmup curve for the generator LR (cosine or linear). |

g_warmup_steps applies a step-wise warmup through torch.optim.lr_scheduler.LambdaLR before resuming the standard ReduceLROnPlateau schedule. Cosine warmup is smoother for most runs, but a linear ramp (especially for 1–5k steps) remains available for experiments that prefer a steady rise. Both generator and discriminator schedulers expose Plateau parameters, including a shared cooldown period (epochs to wait before resuming plateau checks) and a min_lr floor so the learning rate never collapses to zero. Separate monitor keys (metric_g, metric_d) can be provided when generator and discriminator use different validation metrics.
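
A Schedulers block reflecting the defaults above, with a short cosine warmup for the generator and the shared plateau monitor left in place:

Schedulers:
  metric: val_metrics/l1
  patience_g: 100
  patience_d: 100
  factor_g: 0.5
  factor_d: 0.5
  cooldown: 0
  min_lr: 1e-7
  verbose: True
  g_warmup_steps: 2000
  g_warmup_type: cosine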

Logging

| Key | Default | Description |
| --- | --- | --- |
| num_val_images | 5 | Number of validation batches visualised and logged to Weights & Biases each epoch. |

Tips for managing configurations

  • Version control your YAML files. Tracking them alongside experiment logs makes it easy to reproduce results.
  • Leverage OmegaConf interpolation. You can reference other fields (e.g., reuse a base path) to avoid duplication.
  • Use descriptive filenames. Include dataset, scale, and generator type in the config name to keep experiments organised.
  • Override selectively. When launching through scripts or notebooks, you can load a base config and override specific fields at runtime using OmegaConf.merge.

With a clear understanding of these fields, you can rapidly iterate on architectures, datasets, and training strategies without modifying the underlying code.