Normalizers Plugin Documentation

The Normalizers Plugin provides data normalization filters for the VIPR inference workflow.

Overview

Normalizers are implemented as discovery filters on INFERENCE_PREPROCESS_PRE_FILTER, the pre-filter stage of the preprocess step. They typically run early in that stage, with weight=-10.

  • Filter-based architecture: no dedicated handler step required

  • DataSet-aware: works with DataSet transfer objects

  • Immutable transformations: returns updated data via copy_with_updates()

  • Discovery pattern: auto-registration via @discover_filter

Current uncertainty behavior:

  • dy transformation is implemented

  • dx is currently passed through unchanged

Available Normalizers

1. vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer

Scales y to [0, 1].

Formula:

y' = (y - min(y)) / (max(y) - min(y))

Uncertainty handling:

dy' = dy / (max(y) - min(y))   (if range > 0, else dy unchanged)
dx' = dx                        (unchanged)
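
The formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the plugin's implementation: the function name minmax_normalize is hypothetical, lists stand in for the plugin's array type, and mapping a constant y to 0.0 is an assumption, since the source only specifies what happens to dy when the range is zero.

```python
def minmax_normalize(y, dy=None):
    """Scale y to [0, 1] and rescale dy by the same factor (illustrative sketch)."""
    y_min, y_max = min(y), max(y)
    value_range = y_max - y_min

    if value_range > 0:
        y_scaled = [(v - y_min) / value_range for v in y]
        dy_scaled = None if dy is None else [e / value_range for e in dy]
    else:
        # Assumption: constant input maps to 0.0; the source only states
        # that dy is left unchanged when the range is zero.
        y_scaled = [0.0 for _ in y]
        dy_scaled = None if dy is None else list(dy)

    return y_scaled, dy_scaled


y_scaled, dy_scaled = minmax_normalize([2.0, 4.0, 6.0], dy=[0.2, 0.4, 0.2])
```

Because the shift by min(y) is a constant, only the division by the range affects the uncertainty, which is why dy is scaled but not shifted.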

2. vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer

Standardizes y to mean 0 and std 1.

Formula:

y' = (y - mean(y)) / std(y)

Uncertainty handling:

dy' = dy / std(y)   (if std > 0, else dy unchanged)
dx' = dx            (unchanged)
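
A small sketch of the same idea in plain Python. It assumes the population standard deviation (the equivalent of ddof=0); the source does not say which estimator the plugin uses. The function name zscore_normalize and the handling of y for constant input are also assumptions made for illustration.

```python
import statistics


def zscore_normalize(y, dy=None):
    """Standardize y to mean 0 and std 1; divide dy by the same std (sketch)."""
    mean = statistics.fmean(y)
    std = statistics.pstdev(y)  # assumption: population std, not sample std

    if std > 0:
        y_std = [(v - mean) / std for v in y]
        dy_std = None if dy is None else [e / std for e in dy]
    else:
        # Assumption: constant input maps to 0.0; dy stays unchanged as documented.
        y_std = [0.0 for _ in y]
        dy_std = None if dy is None else list(dy)

    return y_std, dy_std


y_std, dy_std = zscore_normalize([1.0, 2.0, 3.0, 4.0], dy=[0.1, 0.1, 0.1, 0.1])
```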

3. vipr.plugins.normalizers.log_normalizer.LogNormalizer

Applies a logarithmic transformation to y, shifting the data by an offset first when it contains non-positive values.

Formula:

y' = log(y + offset),  offset = 0 if min(y) > 0 else abs(min(y)) + 1e-10

Uncertainty handling:

dy' = dy / (y + offset)
dx' = dx   (unchanged)
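
The offset rule and the uncertainty propagation can be sketched as below. The sketch assumes the natural logarithm, which is consistent with dy' = dy / (y + offset) being the first-order propagation of d/dy log(y + offset); the function name log_normalize and the eps parameter name are hypothetical.

```python
import math


def log_normalize(y, dy=None, eps=1e-10):
    """Log-transform y with an automatic offset for non-positive data (sketch)."""
    y_min = min(y)
    # No shift for strictly positive data; otherwise shift so the minimum
    # lands just above zero.
    offset = 0.0 if y_min > 0 else abs(y_min) + eps

    shifted = [v + offset for v in y]
    y_log = [math.log(s) for s in shifted]  # assumption: natural log
    dy_log = None if dy is None else [e / s for e, s in zip(dy, shifted)]

    return y_log, dy_log, offset


y_log, dy_log, offset = log_normalize([1.0, 10.0, 100.0], dy=[0.1, 1.0, 10.0])
```

In this example every point carries a 10% relative error, so the transformed uncertainties come out equal: a constant relative error in y becomes a constant absolute error in log(y).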

Use case: data spanning several orders of magnitude (for example, reflectivity curves).

Architecture

Integration with Inference Workflow

PreprocessStep (Step 3):
|- PRE_PRE_FILTER_HOOK
|- PRE_FILTER (INFERENCE_PREPROCESS_PRE_FILTER) <- normalizers register here
|- POST_PRE_FILTER_HOOK
|- execute() (optional handler, often passthrough)
|- PRE_POST_FILTER_HOOK
|- POST_FILTER
`- POST_POST_FILTER_HOOK

Normalizers are filter transformations; they are not a separate workflow step.

Implementation Pattern

from vipr.plugins.discovery.decorators import discover_filter
from .interfaces.normalizer import NormalizerInterface
from vipr.plugins.inference.dataset import DataSet

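class CustomNormalizer(NormalizerInterface):
    def __init__(self, app):
        self.app = app

    # Disabled by default in generated configs; weight=-10 places it early in the pre-filter chain.
    @discover_filter('INFERENCE_PREPROCESS_PRE_FILTER', enabled_in_config=False, weight=-10)
    def normalize_filter(self, data: DataSet, **kwargs) -> DataSet:
        # Nothing to normalize: pass the DataSet through untouched.
        if data.y is None:
            return data

        # _normalize and _transform_dy are the normalizer's own helpers (not shown in this snippet).
        normalized_y = self._normalize(data.y)
        normalized_dy = self._transform_dy(data.dy, data.y) if data.dy is not None else None

        # dx is not passed, so it is carried over unchanged; a new DataSet is returned.
        return data.copy_with_updates(y=normalized_y, dy=normalized_dy)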
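
To make the immutable-update idea concrete without a VIPR installation, the following self-contained sketch uses a stand-in class. DataSetSketch and HalvingNormalizer are hypothetical names invented for this example; the real DataSet may have different fields, and its copy_with_updates() may be implemented differently. Only the behaviour described above (return a new object, leave the input untouched) is mirrored.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataSetSketch:
    """Hypothetical stand-in for vipr's DataSet; only the fields used here."""
    x: Tuple[float, ...]
    y: Optional[Tuple[float, ...]] = None
    dx: Optional[Tuple[float, ...]] = None
    dy: Optional[Tuple[float, ...]] = None

    def copy_with_updates(self, **updates):
        # Build a new object with the given fields replaced; self is not modified.
        return replace(self, **updates)


class HalvingNormalizer:
    """Toy normalizer that divides y and dy by 2 (for illustration only)."""

    def normalize_filter(self, data, **kwargs):
        if data.y is None:
            return data
        new_y = tuple(v / 2 for v in data.y)
        new_dy = tuple(e / 2 for e in data.dy) if data.dy is not None else None
        return data.copy_with_updates(y=new_y, dy=new_dy)


original = DataSetSketch(x=(0.1, 0.2), y=(4.0, 8.0), dx=(0.01, 0.01), dy=(0.4, 0.8))
result = HalvingNormalizer().normalize_filter(original)
```

After the call, result holds the scaled y and dy, dx is carried over as-is, and original still holds its initial values.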

Key Components

Plugin registration (__init__.py):

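from vipr.plugins.normalizers.minmax_normalizer import MinMaxNormalizer
from vipr.plugins.normalizers.zscore_normalizer import ZScoreNormalizer
from vipr.plugins.normalizers.log_normalizer import LogNormalizer

def load(app):
    app.extend('normalizer_minmax', MinMaxNormalizer(app))
    app.extend('normalizer_zscore', ZScoreNormalizer(app))
    app.extend('normalizer_log', LogNormalizer(app))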

Discovery decorator:

  • @discover_filter('INFERENCE_PREPROCESS_PRE_FILTER', enabled_in_config=False, weight=-10)

  • Auto-registers filter for discovery/config tooling

  • enabled_in_config=False keeps it disabled by default in generated configs

Configuration

Enable normalizers via YAML under vipr.inference.filters.

MinMax

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: -10

Z-Score

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer
        enabled: true
        method: normalize_filter
        weight: -10

Log

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: -10

Programmatic Registration

def load(app):
    app.filter.register(
        'INFERENCE_PREPROCESS_PRE_FILTER',
        app.normalizer_minmax.normalize_filter,
        weight=-10,
    )

Usage Example

vipr:
  inference:
    load_data:
      handler: csv_spectrareader
      parameters:
        data_path: data/reflectivity_data.txt
        column_mapping:
          q: 0
          I: 1

    load_model:
      handler: reflectorch
      parameters:
        config_name: b_mc_point_xray_conv_standard_L2_InputQ

    preprocess:
      handler: ''
      parameters: {}

    prediction:
      handler: reflectorch_predictor
      parameters:
        calc_pred_curve: true
        clip_prediction: true

    postprocess:
      handler: ''
      parameters: {}

    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: -10
      - class: vipr_reflectometry.reflectorch.reflectorch_extension.Reflectorch
        enabled: true
        method: _preprocess_interpolate
        weight: 0
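
The two filter entries above differ only in weight. Assuming filters on the same hook run in ascending weight order (the overview's "typically early with weight=-10" suggests this, but the source does not spell it out), the log normalizer would run before the interpolation filter. The toy runner below illustrates that ordering; run_filters and both step functions are hypothetical and are not part of VIPR.

```python
def run_filters(filters, data):
    """Apply (weight, callable) pairs in ascending weight order (assumed ordering)."""
    for _, func in sorted(filters, key=lambda item: item[0]):
        data = func(data)
    return data


trace = []


def log_normalize_step(data):
    # Stands in for LogNormalizer.normalize_filter (weight -10).
    trace.append("log_normalize")
    return data


def interpolate_step(data):
    # Stands in for Reflectorch._preprocess_interpolate (weight 0).
    trace.append("interpolate")
    return data


# Registration order is deliberately reversed to show that weight decides.
filters = [(0, interpolate_step), (-10, log_normalize_step)]
output = run_filters(filters, {"y": [1.0, 10.0]})
```

If the real ordering differs, the weights in the YAML would need to be adjusted so that normalization still happens before interpolation.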

Choosing a Normalizer

Normalizer   Data Range        Outlier Sensitivity              Typical Use Case
----------   ---------------   ------------------------------   ---------------------------------
MinMax       [0, 1]            High                             Bounded model inputs
Z-Score      mean=0, std=1     Medium                           Standardized statistical inputs
Log          log-transformed   Lower for large dynamic ranges   Multi-magnitude reflectivity data

See Also