Normalizers Plugin Documentation

The Normalizers Plugin provides data normalization methods for the VIPR inference workflow (Step 3: NormalizeStep).

Overview

The plugin implements three normalizers using the filter-based pattern:

  • Filter-based architecture: No handlers, pure filter transformations

  • DataSet-aware: Works with DataSet transfer objects

  • Immutable transformations: Uses copy_with_updates() for functional updates

  • Discovery pattern: Auto-registration via @discover_filter decorator

Note: Error propagation for uncertainties (dy, dx) is not yet implemented.

Available Normalizers

1. vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer

Scales data to [0, 1] range.

Formula:

y' = (y - min(y)) / (max(y) - min(y))

Error propagation (TODO):

dy' = dy / (max(y) - min(y))

Use case: When you need bounded values in a fixed range.

2. vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer

Standardizes data to mean=0, standard deviation=1.

Formula:

y' = (y - mean(y)) / std(y)

Error propagation (TODO):

dy' = dy / std(y)

Use case: When you need standardized data for models sensitive to scale.

3. vipr.plugins.normalizers.log_normalizer.LogNormalizer

Applies logarithmic transformation.

Formula:

y' = log(y + offset)  where offset ensures y + offset > 0

Error propagation (TODO):

dy' = dy / (y + offset)

Use case: Data spanning multiple orders of magnitude (e.g., reflectivity curves).
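As a quick reference, the three formulas above can be sketched directly with NumPy (operating on a bare array here for illustration; the actual normalizers work on DataSet objects):

```python
import numpy as np

y = np.array([1e-8, 1e-6, 1e-4, 1e-2, 1.0])

# MinMax: scale to [0, 1]
minmax = (y - y.min()) / (y.max() - y.min())

# Z-Score: mean 0, standard deviation 1
zscore = (y - y.mean()) / y.std()

# Log: offset guarantees y + offset > 0 before taking the log
offset = 0.0 if y.min() > 0 else -y.min() + 1e-12
log_y = np.log(y + offset)
```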

Architecture

Integration with Inference Workflow

NormalizeStep (Step 3):
├── PRE_PRE_FILTER_HOOK
├── PRE_FILTER (INFERENCE_NORMALIZE_PRE_FILTER) ◄── Normalizers register here
├── POST_PRE_FILTER_HOOK
├── execute() (no-op, normalization via filters)
├── PRE_POST_FILTER_HOOK
├── POST_FILTER
└── POST_POST_FILTER_HOOK

Normalizers are pure filter transformations - no handler interface required.
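The hook framework itself is provided by the application core; conceptually, a filter hook applies the registered callables in weight order, threading the data through each one. A minimal hypothetical sketch (not VIPR's actual implementation):

```python
from collections import defaultdict

class FilterHooks:
    """Hypothetical dispatcher illustrating weight-ordered filter chaining."""

    def __init__(self):
        self._filters = defaultdict(list)  # hook name -> [(weight, callable)]

    def register(self, hook, fn, weight=0):
        self._filters[hook].append((weight, fn))

    def run(self, hook, data, **kwargs):
        # Apply filters in ascending weight order; each filter receives the
        # previous filter's output, mirroring the DataSet-in/DataSet-out pattern.
        for _, fn in sorted(self._filters[hook], key=lambda t: t[0]):
            data = fn(data, **kwargs)
        return data
```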

Implementation Pattern

import numpy as np

from vipr.plugins.discovery.decorators import discover_filter
from vipr.plugins.inference.dataset import DataSet
from .interfaces.normalizer import NormalizerInterface

class CustomNormalizer(NormalizerInterface):
    def __init__(self, app):
        self.app = app

    @discover_filter('INFERENCE_NORMALIZE_PRE_FILTER', enabled_in_config=False)
    def normalize_filter(self, data: DataSet, **kwargs) -> DataSet:
        """Return a new DataSet with normalized y values."""
        # Normalize y
        normalized_y = self._normalize(data.y)

        # TODO: transform dy (error propagation is not yet implemented)
        normalized_dy = None
        if data.dy is not None:
            normalized_dy = self._transform_errors(data.dy, data.y)

        # Return an updated DataSet (immutable transformation)
        return data.copy_with_updates(y=normalized_y, dy=normalized_dy)

    def _normalize(self, y: np.ndarray) -> np.ndarray:
        """Apply the normalization formula, e.g. min-max scaling."""
        raise NotImplementedError

    def _transform_errors(self, dy: np.ndarray, y: np.ndarray) -> np.ndarray:
        """Apply the matching error-propagation formula."""
        raise NotImplementedError
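The exact DataSet fields are defined by the inference plugin; as an illustration of the immutable copy_with_updates() pattern it relies on, here is a hypothetical frozen-dataclass sketch (field names assumed from the formulas above, not the actual DataSet class):

```python
from dataclasses import dataclass, replace
from typing import Optional
import numpy as np

@dataclass(frozen=True)
class DataSetSketch:
    """Hypothetical immutable transfer object; illustrative only."""
    x: np.ndarray
    y: np.ndarray
    dy: Optional[np.ndarray] = None
    dx: Optional[np.ndarray] = None

    def copy_with_updates(self, **updates):
        # dataclasses.replace returns a new frozen instance; the original
        # object (and any fields not listed in updates) is left untouched.
        return replace(self, **updates)
```

Because the instance is frozen, a filter can never mutate upstream data in place; every transformation yields a fresh object.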

Key Components

Plugin Registration (__init__.py):

from .log_normalizer import LogNormalizer
from .minmax_normalizer import MinMaxNormalizer
from .zscore_normalizer import ZScoreNormalizer

def load(app):
    app.extend('normalizer_minmax', MinMaxNormalizer(app))
    app.extend('normalizer_zscore', ZScoreNormalizer(app))
    app.extend('normalizer_log', LogNormalizer(app))

Discovery Decorator:

  • @discover_filter('INFERENCE_NORMALIZE_PRE_FILTER', enabled_in_config=False)

  • Auto-registers filter for discovery by config generation tools

  • enabled_in_config=False: Disabled by default in auto-generated configurations
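A discovery decorator like this typically just attaches metadata to the method so config-generation tools can find it via introspection. A hypothetical sketch of the mechanism (not the actual vipr.plugins.discovery implementation):

```python
def discover_filter(hook, enabled_in_config=True):
    """Hypothetical sketch: tag a method with discovery metadata."""
    def decorator(fn):
        # Config-generation tools can scan classes for this attribute
        # and emit a filter entry under the named hook.
        fn._vipr_filter = {
            'hook': hook,
            'enabled_in_config': enabled_in_config,
        }
        return fn
    return decorator
```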

Configuration

Filter Configuration

Enable normalizers via YAML configuration under vipr.inference.filters:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Normalizer Selection

MinMax Normalization:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Z-Score Normalization:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Log Normalization:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Programmatic Registration

def load(app):
    # Register filter directly
    app.filter.register(
        'INFERENCE_NORMALIZE_PRE_FILTER',
        app.normalizer_minmax.normalize_filter
    )

Usage Example

Complete Configuration

Realistic example using reflectometry components:

vipr:
  inference:
    load_data:
      handler: csv_spectrareader
      parameters:
        column_mapping:
          I: 1
          q: 0
        data_path: data/reflectivity_data.txt
    
    load_model:
      handler: reflectorch
      parameters:
        config_name: b_mc_point_xray_conv_standard_L2_InputQ
    
    normalize:
      handler: ''
      parameters: {}
    
    preprocess:
      handler: ''
      parameters: {}
    
    prediction:
      handler: reflectorch_predictor
      parameters:
        calc_pred_curve: true
        clip_prediction: true
    
    postprocess:
      handler: ''
      parameters: {}
    
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Choosing a Normalizer

Normalizer | Data Range                | Outlier Sensitivity | Use Case
-----------|---------------------------|---------------------|-----------------------------------------------
MinMax     | [0, 1]                    | High                | Bounded inputs, CNNs
Z-Score    | Unbounded (mean=0, std=1) | Medium              | Statistical models, normally distributed data
Log        | Transformed scale         | Low                 | Multi-magnitude data (e.g., 10^-8 to 10^0)

TODO

  • ⚠️ Uncertainty transformation: implement the dy and dx error-propagation formulas listed above for each normalizer

See Also