# Normalizers Plugin Documentation

The Normalizers Plugin provides data normalization methods for the VIPR inference workflow (Step 3: NormalizeStep).

## Overview

The plugin implements three normalizers using the filter-based pattern:

- **Filter-based architecture**: No handlers, pure filter transformations
- **DataSet-aware**: Works with DataSet transfer objects
- **Immutable transformations**: Uses `copy_with_updates()` for functional updates
- **Discovery pattern**: Auto-registration via `@discover_filter` decorator

**Note**: Error propagation for uncertainties (dy, dx) is not yet implemented.

## Available Normalizers

### 1. {py:class}`vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer`

Scales data to [0, 1] range.

**Formula:**
```
y' = (y - min(y)) / (max(y) - min(y))
```

**Error propagation (TODO):**
```
dy' = dy / (max(y) - min(y))
```

**Use case**: When you need bounded values in a fixed range.

### 2. {py:class}`vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer`

Standardizes data to mean=0, standard deviation=1.

**Formula:**
```
y' = (y - mean(y)) / std(y)
```

**Error propagation (TODO):**
```
dy' = dy / std(y)
```

**Use case**: When you need standardized data for models sensitive to scale.

### 3. {py:class}`vipr.plugins.normalizers.log_normalizer.LogNormalizer`

Applies logarithmic transformation.

**Formula:**
```
y' = log(y + offset)
where offset ensures y + offset > 0
```

**Error propagation (TODO):**
```
dy' = dy / (y + offset)
```

**Use case**: Data spanning multiple orders of magnitude (e.g., reflectivity curves).

## Architecture

### Integration with Inference Workflow

```
NormalizeStep (Step 3):
├── PRE_PRE_FILTER_HOOK
├── PRE_FILTER (INFERENCE_NORMALIZE_PRE_FILTER)  ◄── Normalizers register here
├── POST_PRE_FILTER_HOOK
├── execute() (no-op, normalization via filters)
├── PRE_POST_FILTER_HOOK
├── POST_FILTER
└── POST_POST_FILTER_HOOK
```

Normalizers are pure filter transformations - no handler interface required.
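As a standalone illustration of the formulas listed under Available Normalizers, the following NumPy sketch applies each transformation together with its proposed error propagation. It does not use the plugin's classes: the function names are hypothetical, the zero-range and zero-variance guards are assumptions, `std` is taken as the population standard deviation (`ddof=0`), and `log` is taken as the natural logarithm, which is the base for which `dy' = dy / (y + offset)` holds exactly. The dy formulas are marked TODO in the plugin, so this shows what they would compute, not current plugin behavior.

```python
import numpy as np


def minmax_normalize(y, dy=None):
    """y' = (y - min(y)) / (max(y) - min(y)); dy' = dy / (max(y) - min(y))."""
    y = np.asarray(y, dtype=float)
    span = y.max() - y.min()
    if span == 0:
        # Assumption: constant data is rejected rather than mapped to zeros.
        raise ValueError("MinMax is undefined when max(y) == min(y)")
    y_norm = (y - y.min()) / span
    dy_norm = None if dy is None else np.asarray(dy, dtype=float) / span
    return y_norm, dy_norm


def zscore_normalize(y, dy=None):
    """y' = (y - mean(y)) / std(y); dy' = dy / std(y)."""
    y = np.asarray(y, dtype=float)
    std = y.std()  # population std (ddof=0), an assumption
    if std == 0:
        raise ValueError("Z-score is undefined when std(y) == 0")
    y_norm = (y - y.mean()) / std
    dy_norm = None if dy is None else np.asarray(dy, dtype=float) / std
    return y_norm, dy_norm


def log_normalize(y, dy=None, offset=0.0):
    """y' = log(y + offset); dy' = dy / (y + offset)."""
    y = np.asarray(y, dtype=float)
    shifted = y + offset
    if np.any(shifted <= 0):
        raise ValueError("offset must ensure y + offset > 0")
    y_norm = np.log(shifted)  # natural log, an assumption
    dy_norm = None if dy is None else np.asarray(dy, dtype=float) / shifted
    return y_norm, dy_norm


# Usage: a curve spanning several orders of magnitude with 10% relative errors.
y = np.array([1.0, 1e-2, 1e-4])
dy = 0.1 * y
y_log, dy_log = log_normalize(y, dy)
```

In the usage lines, the 10% relative uncertainty on `y` becomes a constant absolute uncertainty of about 0.1 in log space, since `dy / y` is the same for every point. All three dy formulas are first-order propagation: each divides dy by the local slope scale of the transformation and treats `min`, `max`, `mean`, and `std` as exact constants.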
### Implementation Pattern

```python
from vipr.plugins.discovery.decorators import discover_filter
from .interfaces.normalizer import NormalizerInterface
from vipr.plugins.inference.dataset import DataSet

class CustomNormalizer(NormalizerInterface):
    def __init__(self, app):
        self.app = app

    @discover_filter('INFERENCE_NORMALIZE_PRE_FILTER', enabled_in_config=False)
    def normalize_filter(self, data: DataSet, **kwargs) -> DataSet:
        """Transform DataSet with normalized y values."""
        # Normalize y
        normalized_y = self._normalize(data.y)

        # TODO: Transform dy (error propagation)
        normalized_dy = None
        if data.dy is not None:
            # Apply error propagation formula
            normalized_dy = self._transform_errors(data.dy, data.y)

        # Return updated DataSet (immutable)
        return data.copy_with_updates(y=normalized_y, dy=normalized_dy)
```

### Key Components

**Plugin Registration** (`__init__.py`):
```python
def load(app):
    app.extend('normalizer_minmax', MinMaxNormalizer(app))
    app.extend('normalizer_zscore', ZScoreNormalizer(app))
    app.extend('normalizer_log', LogNormalizer(app))
```

**Discovery Decorator**:
- `@discover_filter('INFERENCE_NORMALIZE_PRE_FILTER', enabled_in_config=False)`
- Auto-registers filter for discovery by config generation tools
- `enabled_in_config=False`: Disabled by default in auto-generated configurations

## Configuration

### Filter Configuration

Enable normalizers via YAML configuration under `vipr.inference.filters`:

```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: 0
```

### Normalizer Selection

**MinMax Normalization:**
```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: 0
```

**Z-Score Normalization:**
```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer
        enabled: true
        method: normalize_filter
        weight: 0
```

**Log Normalization:**
```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: 0
```

### Programmatic Registration

```python
def load(app):
    # Register filter directly
    app.filter.register(
        'INFERENCE_NORMALIZE_PRE_FILTER',
        app.normalizer_minmax.normalize_filter
    )
```

## Usage Example

### Complete Configuration

Realistic example using reflectometry components:

```yaml
vipr:
  inference:
    load_data:
      handler: csv_spectrareader
      parameters:
        column_mapping:
          I: 1
          q: 0
        data_path: data/reflectivity_data.txt
    load_model:
      handler: reflectorch
      parameters:
        config_name: b_mc_point_xray_conv_standard_L2_InputQ
    normalize:
      handler: ''
      parameters: {}
    preprocess:
      handler: ''
      parameters: {}
    prediction:
      handler: reflectorch_predictor
      parameters:
        calc_pred_curve: true
        clip_prediction: true
    postprocess:
      handler: ''
      parameters: {}
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: 0
```

## Choosing a Normalizer

| Normalizer | Data Range | Outlier Sensitivity | Use Case |
|-----------|------------|---------------------|----------|
| **MinMax** | [0, 1] | High | Bounded inputs, CNNs |
| **Z-Score** | Unbounded (mean=0, std=1) | Medium | Statistical models, normally distributed data |
| **Log** | Transformed scale | Low | Multi-magnitude data (e.g., 10^-8 to 10^0) |

## TODO

- ⚠️ **Uncertainty transformation**: dy and dx transformation formulas

## See Also

- [Inference Plugin Documentation](./inference.md) - Full workflow context
- [Dynamic Hooks and Filters Extension](../extensions/dynamic_hooks_filters.md) - Configuration system
- `vipr-core/vipr/plugins/normalizers/` - Source code