Filter Documentation

Filters transform data during the inference pipeline. They are activated in YAML configurations and can be chained together.

Quick Start for Reflectometry

For most users: No action needed!

  • ✅ Standard workflow uses Reflectorch Interpolation (enabled by default)

  • ✅ No normalization required (Reflectorch has built-in scaling)

  • ✅ Just run: vipr --config @vipr_reflectometry/reflectorch/examples/configs/Ni500.yaml inference run

Optional: Clean noisy data

Add NeutronDataCleaner to remove high-error points:

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
        enabled: true
        method: clean_experimental_data
        weight: -10
        parameters:
          error_threshold: 0.5

Discover available filters:

vipr discovery filters

📖 For detailed information, see the sections below.


Available Filters

Normalization Filters (INFERENCE_NORMALIZE_PRE_FILTER)

Execution Order: After data loading, before preprocessing

Note: Normalization is typically not required for reflectometry data. The example configs pass raw reflectivity values to the models, which apply their own built-in scaling (see Best Practices). These filters are provided for other use cases or custom workflows.

MinMaxNormalizer

Scales intensity values to [0,1] range.

Formula:

y_norm = (y - min(y)) / (max(y) - min(y))
dy_norm = dy / (max(y) - min(y))
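
The formula above can be sketched in plain Python as follows. This is a minimal illustration, not the MinMaxNormalizer source: the function name `minmax_normalize` is hypothetical, and rejecting a constant curve (where max(y) equals min(y)) is an assumption made here to avoid division by zero.

```python
def minmax_normalize(y, dy):
    """Scale y to [0, 1] and rescale the uncertainties dy by the same factor."""
    lo, hi = min(y), max(y)
    span = hi - lo
    if span == 0:
        # Assumption of this sketch: a constant curve cannot be scaled.
        raise ValueError("cannot min-max normalize a constant curve")
    y_norm = [(v - lo) / span for v in y]
    # Subtracting min(y) shifts the values but does not change their errors,
    # so only the division by the span affects dy.
    dy_norm = [e / span for e in dy]
    return y_norm, dy_norm


y_norm, dy_norm = minmax_normalize([2.0, 4.0, 6.0], [0.2, 0.4, 0.2])
print(y_norm)   # [0.0, 0.5, 1.0]
print(dy_norm)  # [0.05, 0.1, 0.05]
```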

Default: Disabled (enabled_in_config=False)

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

ZScoreNormalizer

Standardizes data to mean=0, standard deviation=1.

Formula:

y_norm = (y - mean(y)) / std(y)
dy_norm = dy / std(y)
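
A short Python sketch of this formula is given below. It is an illustration only: `zscore_normalize` is a hypothetical name, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether ZScoreNormalizer uses the population or the sample estimate.

```python
import statistics


def zscore_normalize(y, dy):
    """Standardize y to mean 0 and standard deviation 1; rescale dy by 1/std."""
    mean = statistics.fmean(y)
    std = statistics.pstdev(y)  # assumption: population standard deviation
    if std == 0:
        # Assumption of this sketch: a constant curve cannot be standardized.
        raise ValueError("cannot z-score a constant curve")
    y_norm = [(v - mean) / std for v in y]
    dy_norm = [e / std for e in dy]
    return y_norm, dy_norm


y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, population std 2
y_norm, dy_norm = zscore_normalize(y, [0.2] * len(y))
print(y_norm)   # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(dy_norm)  # every entry is 0.1
```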

Default: Disabled (enabled_in_config=False)

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

LogNormalizer

Logarithmic transformation of intensity values.

Formula:

y_norm = log(y + offset)  # offset keeps the argument positive where y ≤ 0
dy_norm = dy / (y + offset)
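
The sketch below illustrates this formula in plain Python. It rests on two assumptions not confirmed by the text: `log` is taken to be the natural logarithm (which matches dy_norm = dy / (y + offset)), and the offset is passed in explicitly, whereas the filter may choose it on its own. The name `log_normalize` is hypothetical.

```python
import math


def log_normalize(y, dy, offset=0.0):
    """Natural-log transform of y with first-order error propagation for dy."""
    shifted = [v + offset for v in y]
    if min(shifted) <= 0:
        raise ValueError("y + offset must be positive; choose a larger offset")
    y_norm = [math.log(s) for s in shifted]
    # d/dy ln(y + offset) = 1 / (y + offset)
    dy_norm = [e / s for e, s in zip(dy, shifted)]
    return y_norm, dy_norm


# A constant 10% relative error becomes a constant absolute error of 0.1.
y_norm, dy_norm = log_normalize([1.0, 10.0, 100.0], [0.1, 1.0, 10.0])
print(dy_norm)  # [0.1, 0.1, 0.1]

# A zero intensity needs a positive offset.
y_norm, dy_norm = log_normalize([0.0, 1.0], [0.1, 0.1], offset=1.0)
print(y_norm)
```

After the transform, the uncertainty is the relative error of the shifted intensity, which is why a logarithmic scale suits data spanning several orders of magnitude.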

Default: Disabled (enabled_in_config=False)

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
      - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
        enabled: true
        method: normalize_filter
        weight: 0

Preprocessing Filters (INFERENCE_PREPROCESS_PRE_FILTER)

Execution Order: After normalization, before prediction

NeutronDataCleaner

Cleans experimental neutron reflectometry data.

Functions:

  1. Removes points with negative intensity (R < 0)

  2. Filters or truncates curves at consecutive high-error points, i.e. points whose relative error dR/R exceeds error_threshold

Default: Disabled (enabled_in_config=False)

Parameters:

  • error_threshold (float, default=0.5): Relative error threshold dR/R (range: 0.0-1.0)

  • consecutive_errors (int, default=3): Number of consecutive high-error points to trigger truncation (minimum: 1)

  • remove_single_errors (bool, default=false): Remove isolated high-error points before truncation

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
        enabled: true
        method: clean_experimental_data
        weight: -10
        parameters:
          error_threshold: 0.5
          consecutive_errors: 3
          remove_single_errors: false

Note: weight: -10 ensures execution before interpolation, which uses weight: 0 (lower weights run first).
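
To make the three parameters concrete, here is one possible reading of the cleaning logic in plain Python. It is a sketch under stated assumptions, not the NeutronDataCleaner implementation: the function name `clean_curve` is hypothetical, the curve is cut at the start of the first run of `consecutive_errors` high-error points, "isolated" means a high-error point that lies before that cut, and a point with R = 0 is treated as high-error because its relative error is undefined.

```python
def clean_curve(q, r, dr, error_threshold=0.5, consecutive_errors=3,
                remove_single_errors=False):
    """Drop negative intensities, then truncate at the first high-error run."""
    # Step 1: remove points with negative intensity (R < 0).
    points = [(qi, ri, di) for qi, ri, di in zip(q, r, dr) if ri >= 0]

    # Flag points whose relative error dR/R exceeds the threshold.
    # Assumption: R == 0 counts as high-error (relative error undefined).
    bad = [ri == 0 or di / ri > error_threshold for _, ri, di in points]

    # Step 2: find the first run of `consecutive_errors` flagged points
    # and cut the curve where that run starts.
    cut, run = len(points), 0
    for i, flag in enumerate(bad):
        run = run + 1 if flag else 0
        if run >= consecutive_errors:
            cut = i - run + 1
            break
    kept = list(zip(points[:cut], bad[:cut]))

    # Step 3 (optional): drop the remaining, shorter runs of flagged points.
    if remove_single_errors:
        kept = [(p, flag) for p, flag in kept if not flag]

    q_out = [p[0] for p, _ in kept]
    r_out = [p[1] for p, _ in kept]
    dr_out = [p[2] for p, _ in kept]
    return q_out, r_out, dr_out


q = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
r = [1.0, 0.5, -0.1, 0.2, 0.1, 0.01, 0.01, 0.01]
dr = [0.01, 0.3, 0.01, 0.02, 0.01, 0.02, 0.02, 0.02]

print(clean_curve(q, r, dr)[0])
# [0.01, 0.02, 0.04, 0.05]
print(clean_curve(q, r, dr, remove_single_errors=True)[0])
# [0.01, 0.04, 0.05]
```

In the example, the point at Q = 0.03 is dropped for its negative intensity, the last three points form the run that triggers truncation, and the point at Q = 0.02 (60% relative error) survives unless remove_single_errors is set.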

Reflectorch Interpolation

Interpolates experimental curves to the model Q-grid.

Functions:

  • Q-grid interpolation (logarithmic for reflectivity)

  • Propagates Q-resolution (dQ) and intensity errors (dR)

  • Batch processing

Default: Enabled (enabled_in_config=True) - Standard for Reflectorch workflows

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr_reflectometry.reflectorch.reflectorch_extension.Reflectorch
        enabled: true
        method: _preprocess_interpolate
        weight: 0
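
As an illustration of what logarithmic Q-grid interpolation can look like, the sketch below interpolates log10(R) linearly in Q and converts back. This is an assumed interpretation, not the Reflectorch code: the function name `interpolate_log` is hypothetical, only R is handled (the real filter also propagates dQ and dR), and grid points outside the measured Q-range are rejected rather than extrapolated.

```python
import math
from bisect import bisect_right


def interpolate_log(q, r, q_model):
    """Interpolate log10(R) linearly onto q_model; q ascending, all r > 0."""
    log_r = [math.log10(v) for v in r]
    out = []
    for qm in q_model:
        if not q[0] <= qm <= q[-1]:
            raise ValueError("model Q-grid point outside the measured Q-range")
        # Index of the right-hand neighbour, clamped to a valid segment.
        i = max(1, min(bisect_right(q, qm), len(q) - 1))
        t = (qm - q[i - 1]) / (q[i] - q[i - 1])
        out.append(10 ** (log_r[i - 1] + t * (log_r[i] - log_r[i - 1])))
    return out


# Halfway between R = 1 and R = 0.01 the log-space result is about 0.1,
# whereas linear interpolation of R itself would give 0.505.
print(interpolate_log([0.01, 0.02, 0.03], [1.0, 0.01, 0.0001], [0.015, 0.02]))
```

Interpolating in log space keeps the result sensible for reflectivity curves, which fall by several orders of magnitude over the measured Q-range.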

FlowPreprocessor

Preprocessing for flow models (CINN, NSF, MAF).

Functions:

  • Q-grid interpolation

  • Flow-specific curve scaling

  • Tensor formatting for inverse sampling

Default: Disabled (enabled_in_config=False)

YAML Configuration:

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      - class: vipr_reflectometry.flow_models.flow_preprocessor.FlowPreprocessor
        enabled: true
        method: _preprocess_flow
        weight: 0

Filter Chaining

Filters are executed in order by weight (lower values first):

vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
      # 1. First: Data cleaning (weight: -10)
      - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
        enabled: true
        method: clean_experimental_data
        weight: -10
        parameters:
          error_threshold: 0.5
          consecutive_errors: 3
      
      # 2. Then: Interpolation (weight: 0)
      - class: vipr_reflectometry.reflectorch.reflectorch_extension.Reflectorch
        enabled: true
        method: _preprocess_interpolate
        weight: 0
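
The ordering rule can be sketched in a few lines of Python. This is a simplified model, not VIPR's filter registry: `FilterEntry` and `run_chain` are hypothetical names, and keeping the listed order for filters of equal weight is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FilterEntry:
    name: str
    weight: int
    func: Callable
    enabled: bool = True


def run_chain(entries, data):
    """Apply enabled filters in ascending order of weight."""
    # sorted() is stable, so entries with equal weight keep their listed order.
    for entry in sorted(entries, key=lambda e: e.weight):
        if entry.enabled:
            data = entry.func(data)
    return data


# Each toy filter appends its name so the execution order is visible.
chain = [
    FilterEntry("interpolate", 0, lambda log: log + ["interpolate"]),
    FilterEntry("clean", -10, lambda log: log + ["clean"]),
    FilterEntry("normalize", 5, lambda log: log + ["normalize"], enabled=False),
]
print(run_chain(chain, []))  # ['clean', 'interpolate']
```

The order in which filters are listed does not matter here: the cleaner runs first because of its lower weight, and the disabled filter is skipped.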

Practical Examples

Standard Workflow (interpolation only)

vipr --config @vipr_reflectometry/reflectorch/examples/configs/Ni500.yaml inference run

Config uses only Reflectorch Interpolation (default).

Quality Filtering + Interpolation

vipr --config @vipr_reflectometry/reflectorch/examples/configs/D17_SiO.yaml inference run

Config uses:

  1. NeutronDataCleaner (removes problematic points)

  2. Reflectorch Interpolation (interpolates cleaned data)

Filter Discovery

Show all available filters:

vipr discovery filters

Best Practices

When to use NeutronDataCleaner?

  • For noisy experimental data

  • When curves have large error bars toward the end of the Q-range

  • When the data contain negative intensity values

Parameter Tuning for NeutronDataCleaner

  • error_threshold=0.5: Standard for neutron reflectometry (50% relative error)

  • consecutive_errors=3: Balance between noise tolerance and data loss

  • remove_single_errors=false: Preserves curve structure, removes only consecutive issues

Normalization

  • Not required for reflectometry data with Reflectorch models

  • Why no normalization for Reflectorch?

    • Reflectorch has built-in scaling (LogAffineCurvesScaler) that applies log10(R + eps) * weight + bias

    • Models expect raw reflectivity values as measured experimentally (typically from ~1 down to the measurement sensitivity limit)

    • The absolute scale contains physical information needed for accurate predictions

    • External normalization would interfere with the model’s trained scaling transformation

  • Use cases for normalization filters:

    • LogNormalizer: For custom models or other domains with data spanning multiple orders of magnitude

    • MinMax/ZScore: When adapting VIPR for other domains or training custom ML models
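
The built-in scaling expression quoted above, log10(R + eps) * weight + bias, can be tried out with the standalone sketch below. It does not call Reflectorch: the function names are hypothetical, and the values of `weight`, `bias` and `eps` are illustrative assumptions rather than the library's defaults.

```python
import math


def log_affine_scale(r, weight, bias, eps=1e-10):
    """Apply log10(R + eps) * weight + bias to each reflectivity value."""
    return [math.log10(v + eps) * weight + bias for v in r]


def log_affine_restore(scaled, weight, bias, eps=1e-10):
    """Invert the scaling to recover the reflectivity values."""
    return [10 ** ((s - bias) / weight) - eps for s in scaled]


raw = [1.0, 1e-2, 1e-5]
scaled = log_affine_scale(raw, weight=0.2, bias=1.0)
print(scaled)  # roughly 1.0, 0.6 and 0.0 for these illustrative parameters

restored = log_affine_restore(scaled, weight=0.2, bias=1.0)
print(restored)
```

Because the mapping is fixed, feeding it values that were already min-max or z-score normalized would land them in a different part of the scaled range than the raw reflectivities the models were trained on.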

Uncertainty Propagation

Filters that transform data values also transform measurement uncertainties (dQ for Q-values, dR for reflectivity) using standard error propagation formulas.
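
As a worked illustration, the dy formulas listed for the three normalization filters all follow from first-order error propagation for an element-wise transform y' = f(y). The derivation below assumes that the scale constants (minimum, maximum, mean, standard deviation, offset) are treated as exact and that log denotes the natural logarithm; here σ_y stands for dy, s_y for std(y) and c for the offset.

```latex
% First-order propagation for y' = f(y), constants treated as exact
\[
  \sigma_{y'} = \left| \frac{\partial f}{\partial y} \right| \sigma_y
\]
\begin{align*}
  f(y) &= \frac{y - y_{\min}}{y_{\max} - y_{\min}} &
  \sigma_{y'} &= \frac{\sigma_y}{y_{\max} - y_{\min}} \\
  f(y) &= \frac{y - \bar{y}}{s_y} &
  \sigma_{y'} &= \frac{\sigma_y}{s_y} \\
  f(y) &= \ln(y + c), \quad y + c > 0 &
  \sigma_{y'} &= \frac{\sigma_y}{y + c}
\end{align*}
```

For a base-10 logarithm, the last line would carry an additional factor of 1 / ln 10.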


Notes

  • enabled_in_config: Determines whether the filter is enabled by default in generated standard configs

  • weight: Determines execution order (lower values = earlier execution)

  • DataSet: Filters work with immutable DataSet objects (Pydantic)

  • Batch Processing: All filters support batch processing of multiple spectra