# Filter Documentation

Filters transform data during the inference pipeline. They are activated in YAML configurations and can be chained together.
## Quick Start for Reflectometry

**For most users: No action needed!**

- ✅ Standard workflow uses Reflectorch Interpolation (enabled by default)
- ✅ No normalization required (Reflectorch has built-in scaling)
- ✅ Just run:

```bash
vipr --config @vipr_reflectometry/reflectorch/examples/configs/Ni500.yaml inference run
```
**Optional: Clean noisy data**

Add NeutronDataCleaner to remove high-error points:

```yaml
filters:
  INFERENCE_PREPROCESS_PRE_FILTER:
    - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
      enabled: true
      weight: -10
      parameters:
        error_threshold: 0.5
```
**Discover available filters:**

```bash
vipr discovery filters
```

📖 For detailed information, see the sections below.
## Available Filters

### Normalization Filters (INFERENCE_NORMALIZE_PRE_FILTER)

**Execution Order:** After data loading, before preprocessing

**Note:** Normalization is typically not required for reflectometry data in the example configs, as the data is already in the format the models expect. These filters are provided for other use cases and custom workflows.
#### MinMaxNormalizer

Scales intensity values to the [0, 1] range.

**Formula:**

```
y_norm = (y - min(y)) / (max(y) - min(y))
dy_norm = dy / (max(y) - min(y))
```

**Default:** Disabled (`enabled_in_config=False`)

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
        - class: vipr.plugins.normalizers.minmax_normalizer.MinMaxNormalizer
          enabled: true
          method: normalize_filter
          weight: 0
```
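As a minimal, illustrative sketch (not the library's implementation), the min-max formula with the matching error scaling can be written as:

```python
def minmax_normalize(y, dy):
    # Hypothetical sketch of the MinMaxNormalizer formula:
    # y_norm = (y - min) / (max - min); errors scale by the same span.
    lo, hi = min(y), max(y)
    span = hi - lo
    y_norm = [(v - lo) / span for v in y]
    dy_norm = [e / span for e in dy]
    return y_norm, dy_norm
```

Note that the errors are divided by the same span but are not shifted, so relative error structure is preserved.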
#### ZScoreNormalizer

Standardizes data to zero mean and unit standard deviation.

**Formula:**

```
y_norm = (y - mean(y)) / std(y)
dy_norm = dy / std(y)
```

**Default:** Disabled (`enabled_in_config=False`)

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
        - class: vipr.plugins.normalizers.zscore_normalizer.ZScoreNormalizer
          enabled: true
          method: normalize_filter
          weight: 0
```
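A minimal sketch of the z-score formula, assuming the population standard deviation (the actual normalizer may use a different estimator):

```python
import statistics

def zscore_normalize(y, dy):
    # Hypothetical sketch of the ZScoreNormalizer formula:
    # y_norm = (y - mean) / std; errors scale by the same std.
    mu = statistics.fmean(y)
    sigma = statistics.pstdev(y)  # population std, so the output has std exactly 1
    y_norm = [(v - mu) / sigma for v in y]
    dy_norm = [e / sigma for e in dy]
    return y_norm, dy_norm
```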
#### LogNormalizer

Applies a logarithmic transformation to intensity values.

**Formula:**

```
y_norm = log(y + offset)    # offset applied when y <= 0
dy_norm = dy / (y + offset)
```

**Default:** Disabled (`enabled_in_config=False`)

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_NORMALIZE_PRE_FILTER:
        - class: vipr.plugins.normalizers.log_normalizer.LogNormalizer
          enabled: true
          method: normalize_filter
          weight: 0
```
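An illustrative sketch of the log transform with first-order error propagation (for simplicity this version applies a small offset unconditionally, whereas the filter offsets only where y ≤ 0; the `offset` default here is hypothetical):

```python
import math

def log_normalize(y, dy, offset=1e-10):
    # Hypothetical sketch: y_norm = log(y + offset); since d/dy log(y) = 1/y,
    # the propagated error is dy / (y + offset).
    y_norm = [math.log(v + offset) for v in y]
    dy_norm = [e / (v + offset) for v, e in zip(y, dy)]
    return y_norm, dy_norm
```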
### Preprocessing Filters (INFERENCE_PREPROCESS_PRE_FILTER)

**Execution Order:** After normalization, before prediction
#### NeutronDataCleaner

Cleans experimental neutron reflectometry data.

**Functions:**

- Removes points with negative intensity (R < 0)
- Truncates curves at runs of consecutive high-error points

**Default:** Disabled (`enabled_in_config=False`)

**Parameters:**

- `error_threshold` (float, default=0.5): Relative error threshold dR/R (range: 0.0-1.0)
- `consecutive_errors` (int, default=3): Number of consecutive high-error points that triggers truncation (minimum: 1)
- `remove_single_errors` (bool, default=false): Remove isolated high-error points before truncation

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
        - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
          enabled: true
          method: clean_experimental_data
          weight: -10
          parameters:
            error_threshold: 0.5
            consecutive_errors: 3
            remove_single_errors: false
```

**Note:** `weight: -10` ensures execution before interpolation.
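A rough, hypothetical sketch of this cleaning logic; `clean_curve` and its exact truncation rule are illustrative, not the library's implementation:

```python
def clean_curve(q, r, dr, error_threshold=0.5, consecutive_errors=3):
    """Sketch: drop points with R < 0, then truncate the curve at the first
    run of `consecutive_errors` points whose relative error dR/R exceeds
    the threshold."""
    # 1. Remove negative intensities
    pts = [(qi, ri, di) for qi, ri, di in zip(q, r, dr) if ri >= 0]
    # 2. Truncate at the first run of consecutive high-error points
    run = 0
    for i, (_, ri, di) in enumerate(pts):
        if ri > 0 and di / ri > error_threshold:
            run += 1
            if run >= consecutive_errors:
                pts = pts[: i - consecutive_errors + 1]  # cut before the run
                break
        else:
            run = 0
    qs, rs, ds = zip(*pts) if pts else ((), (), ())
    return list(qs), list(rs), list(ds)
```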
#### Reflectorch Interpolation

Interpolates experimental curves onto the model's Q-grid.

**Functions:**

- Q-grid interpolation (logarithmic for reflectivity)
- Propagates Q-resolution (dQ) and intensity errors (dR)
- Batch processing

**Default:** Enabled (`enabled_in_config=True`) - standard for Reflectorch workflows

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
        - class: vipr_reflectometry.reflectorch.reflectorch_extension.Reflectorch
          enabled: true
          method: _preprocess_interpolate
          weight: 0
```
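To illustrate why the interpolation is logarithmic for reflectivity: reflectivity spans many orders of magnitude, so interpolating log10(R) linearly and transforming back behaves much better than interpolating R directly. The sketch below is purely illustrative (`interp_to_grid` and its helper are hypothetical names, not Reflectorch's API) and assumes a sorted experimental Q-grid that covers the model grid:

```python
import math

def interp_to_grid(q_model, q_exp, r_exp):
    """Sketch: linearly interpolate log10(R) onto the model Q-grid,
    then transform back to reflectivity."""
    def lerp(x, xs, ys):
        for i in range(len(xs) - 1):
            if xs[i] <= x <= xs[i + 1]:
                t = (x - xs[i]) / (xs[i + 1] - xs[i])
                return ys[i] + t * (ys[i + 1] - ys[i])
        return ys[-1]  # clamp beyond the last point
    log_r = [math.log10(r) for r in r_exp]
    return [10 ** lerp(q, q_exp, log_r) for q in q_model]
```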
#### FlowPreprocessor

Preprocessing for flow models (CINN, NSF, MAF).

**Functions:**

- Q-grid interpolation
- Flow-specific curve scaling
- Tensor formatting for inverse sampling

**Default:** Disabled (`enabled_in_config=False`)

**YAML Configuration:**

```yaml
vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
        - class: vipr_reflectometry.flow_models.flow_preprocessor.FlowPreprocessor
          enabled: true
          method: _preprocess_flow
          weight: 0
```
## Filter Chaining

Filters are executed in ascending order of weight (lower values first):

```yaml
vipr:
  inference:
    filters:
      INFERENCE_PREPROCESS_PRE_FILTER:
        # 1. First: Data cleaning (weight: -10)
        - class: vipr_reflectometry.shared.preprocessing.neutron_data_cleaner.NeutronDataCleaner
          enabled: true
          method: clean_experimental_data
          weight: -10
          parameters:
            error_threshold: 0.5
            consecutive_errors: 3
        # 2. Then: Interpolation (weight: 0)
        - class: vipr_reflectometry.reflectorch.reflectorch_extension.Reflectorch
          enabled: true
          method: _preprocess_interpolate
          weight: 0
```
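The chaining mechanism amounts to filtering out disabled entries, sorting by weight, and threading the data through each filter in turn. A hypothetical sketch (the real pipeline instantiates classes from the YAML `class` paths; dicts stand in for that here):

```python
def run_filter_chain(filters, data):
    """Sketch of filter chaining: enabled filters run in ascending weight
    order, each receiving the previous filter's output."""
    enabled = (f for f in filters if f["enabled"])
    for f in sorted(enabled, key=lambda f: f["weight"]):
        data = f["func"](data)
    return data
```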
## Practical Examples

### Standard Workflow (interpolation only)

```bash
vipr --config @vipr_reflectometry/reflectorch/examples/configs/Ni500.yaml inference run
```

This config uses only Reflectorch Interpolation (the default).

### Quality Filtering + Interpolation

```bash
vipr --config @vipr_reflectometry/reflectorch/examples/configs/D17_SiO.yaml inference run
```

This config uses:

- NeutronDataCleaner (removes problematic points)
- Reflectorch Interpolation (interpolates the cleaned data)
## Filter Discovery

Show all available filters:

```bash
vipr discovery filters
```
## Best Practices

### When to use NeutronDataCleaner?

- For noisy experimental data
- When curves have high error bars toward the end
- When data contains negative intensity values

### Parameter Tuning for NeutronDataCleaner

- `error_threshold=0.5`: Standard for neutron reflectometry (50% relative error)
- `consecutive_errors=3`: Balances noise tolerance against data loss
- `remove_single_errors=false`: Preserves curve structure by acting only on runs of consecutive issues
### Normalization

**Not required** for reflectometry data with Reflectorch models.

**Why no normalization for Reflectorch?**

- Reflectorch has built-in scaling (`LogAffineCurvesScaler`) that applies `log10(R + eps) * weight + bias`
- Models expect raw reflectivity values as measured experimentally (typically from ~1 down to the measurement sensitivity limit)
- The absolute scale contains physical information needed for accurate predictions
- External normalization would interfere with the model's trained scaling transformation

**Use cases for normalization filters:**

- `LogNormalizer`: For custom models or other domains with data spanning multiple orders of magnitude
- `MinMax`/`ZScore`: When adapting VIPR to other domains or training custom ML models
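The built-in log-affine scaling can be sketched as follows; the `eps`, `weight`, and `bias` values here are illustrative placeholders, not Reflectorch's trained constants:

```python
import math

def log_affine_scale(r, eps=1e-10, weight=0.5, bias=1.0):
    # Sketch of a log-affine transform in the style of LogAffineCurvesScaler:
    # scaled = log10(R + eps) * weight + bias.
    return [math.log10(v + eps) * weight + bias for v in r]
```

Because this transform is baked into the trained model, applying an external normalizer first would feed it values on a scale it was never trained on.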
## Uncertainty Propagation

Filters that transform data values also transform the measurement uncertainties (dQ for Q-values, dR for reflectivity) using standard error propagation formulas.
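For a single-variable transform z = f(y), the standard first-order (delta-method) rule is dz = |f'(y)| · dy; this is the pattern behind the dy formulas in the normalizer sections above. A generic sketch (the `propagate` helper is illustrative, not a VIPR API):

```python
def propagate(y, dy, f, dfdy):
    """First-order error propagation: z = f(y), dz = |f'(y)| * dy."""
    z = [f(v) for v in y]
    dz = [abs(dfdy(v)) * e for v, e in zip(y, dy)]
    return z, dz
```

For example, with f(y) = y² the uncertainty scales by |2y|.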
## Notes

- `enabled_in_config`: Determines whether the filter is enabled by default in generated standard configs
- `weight`: Determines execution order (lower values run earlier)
- **DataSet:** Filters operate on immutable `DataSet` objects (Pydantic)
- **Batch Processing:** All filters support batch processing of multiple spectra