vipr_reflectometry.shared.load_data.readers package¶
Submodules¶
vipr_reflectometry.shared.load_data.readers.experimental_data_manager module¶
- class vipr_reflectometry.shared.load_data.readers.experimental_data_manager.ExperimentalDataManager(file_path, dataset_name, num_params=8, _preloaded_full_file_dict=None)¶
Bases:
object
- dataset: AbstractExperimentalDataset | None¶
- extract_fit_parameters()¶
Extracts and processes the fit parameters.
- get_q_model(q_generator)¶
Generates or retrieves the q_model based on the trainer.
- Parameters:
q_generator (QGenerator) – Generator for model q values.
- Returns:
Array of q_model values.
- Return type:
torch.Tensor
- get_reshaped_experiment_data()¶
Retrieves and reshapes experimental q values and curves.
- Returns:
(q_exp, curve_exp, curve_errors, q_errors), where:
q_exp is as stored in the dataset.
curve_exp is reshaped to 2D if necessary.
curve_errors is reshaped to 2D if available, None otherwise.
q_errors is reshaped to 2D if available, None otherwise.
- Return type:
tuple
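The reshaping rule above can be sketched as follows. This is a minimal illustration assuming NumPy arrays as input; the helper name `to_2d` is hypothetical and is not part of the package.

```python
import numpy as np


def to_2d(values):
    """Return a 2D (batch, n_points) array, or None if the input is missing."""
    if values is None:
        return None
    arr = np.asarray(values)
    # A single 1D curve becomes a batch of one; 2D input passes through unchanged.
    return arr.reshape(1, -1) if arr.ndim == 1 else arr


curve_2d = to_2d(np.array([1.0, 0.5, 0.1]))   # 1D curve -> shape (1, 3)
batch_2d = to_2d(np.ones((2, 3)))             # already 2D -> unchanged
errors_2d = to_2d(None)                       # missing errors stay None
```

Treating a 1D curve as a batch of one lets downstream code assume a leading batch dimension without special cases.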
- get_scaled_data(q_generator, prior_sampler, curves_scaler)¶
Retrieves scaled experimental parameters and curves.
- Parameters:
q_generator (QGenerator) – Object for generating q values.
prior_sampler (PriorSampler) – Handles parameter scaling and sampling.
curves_scaler (CurvesScaler) – Scales experimental curves for model training.
- Returns:
(scaled_params, scaled_curves, q_values, unscaled_params, unscaled_curves).
- Return type:
tuple
- interpolate_experimental_curves(q_generator, curves_scaler, device)¶
Interpolates experimental reflectivity curves.
- Parameters:
q_generator (QGenerator) – Object for generating q values.
curves_scaler (CurvesScaler) – Object to scale interpolated curves.
device (torch.device) – Device where the tensors are stored (e.g., ‘cpu’ or ‘cuda’).
- Returns:
- (unscaled_curve, scaled_curve)
unscaled_curve (torch.Tensor): The interpolated experimental curve (raw, unscaled).
scaled_curve (torch.Tensor): The experimental curve after scaling.
- Return type:
tuple
- static interpolate_reflectivity(q_model, q_exp, curve_exp)¶
Interpolates reflectivity curves between experimental and model q values.
- Parameters:
q_model (np.ndarray) – Model q values.
q_exp (np.ndarray) – Experimental q values.
curve_exp (np.ndarray) – Experimental curve data.
- Returns:
Interpolated curve.
- Return type:
np.ndarray
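A minimal sketch of mapping an experimental curve onto the model q grid, assuming plain linear interpolation with `numpy.interp`. The function name `interpolate_curve` is hypothetical, and the actual `interpolate_reflectivity` may use a different scheme (for example, interpolation in log space).

```python
import numpy as np


def interpolate_curve(q_model, q_exp, curve_exp):
    """Linearly interpolate an experimental curve onto the model q grid."""
    q_exp = np.asarray(q_exp, dtype=float)
    curve_exp = np.asarray(curve_exp, dtype=float)
    # np.interp requires increasing sample points, so sort by q first.
    order = np.argsort(q_exp)
    return np.interp(q_model, q_exp[order], curve_exp[order])


q_exp = np.array([0.01, 0.02, 0.04])
curve_exp = np.array([1.0, 0.5, 0.1])
q_model = np.array([0.015, 0.03])
curve_on_model_grid = interpolate_curve(q_model, q_exp, curve_exp)
```

Note that `numpy.interp` clamps to the end values for model q points outside the experimental range, so a caller may want to restrict `q_model` to the measured range.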
- prepare_unscaled_params(device)¶
Prepares the unscaled fit parameters.
- Parameters:
device (torch.device) – Device to store the unscaled parameters.
- Returns:
Tensor of unscaled parameters.
- Return type:
torch.Tensor
- split_dataset(q_generator, prior_sampler, curves_scaler, test_size=0.2, random_seed=None, allow_single_sample=False)¶
Splits the experimental dataset into training and testing subsets.
- Parameters:
q_generator (QGenerator) – Generates q values for reflectivity calculations.
prior_sampler (PriorSampler) – Handles parameter scaling and sampling.
curves_scaler (CurvesScaler) – Scales experimental curves for model training.
test_size (float) – Proportion of the dataset to include in the test split (e.g., 0.2 for 20% test).
random_seed (int or None) – Random seed for reproducibility. If None, no seed is set.
allow_single_sample (bool) – If True and the dataset contains only one sample, return self as the test manager and None as the train manager.
- Returns:
- (train_manager, test_manager) where each manager’s processed_data contains:
scaled_params, scaled_curves, q_values,
unscaled_params, unscaled_curves.
- Return type:
tuple
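The splitting behaviour described above can be illustrated on indices alone. This is a sketch under stated assumptions: `split_indices` is a hypothetical helper, the rounding of the test count is a guess, and raising `ValueError` for a single sample when `allow_single_sample` is False is an assumption rather than documented behaviour.

```python
import random


def split_indices(n_samples, test_size=0.2, random_seed=None, allow_single_sample=False):
    """Return (train_indices, test_indices) for a dataset of n_samples curves."""
    if n_samples == 1:
        if allow_single_sample:
            # Mirrors the documented case: no training data, one test sample.
            return None, [0]
        raise ValueError("Cannot split a single-sample dataset.")  # assumed behaviour
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split reproducible without touching global state.
    random.Random(random_seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_size))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


train_idx, test_idx = split_indices(10, test_size=0.2, random_seed=0)
repeat_split = split_indices(10, test_size=0.2, random_seed=0)
single_train, single_test = split_indices(1, allow_single_sample=True)
```

Passing the same `random_seed` yields the same partition, which is what makes a split reproducible across runs.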
vipr_reflectometry.shared.load_data.readers.experimental_datasets module¶
- class vipr_reflectometry.shared.load_data.readers.experimental_datasets.AbstractExperimentalDataset(file_path: str, dataset_name: str, preloaded_raw_data_for_this_dataset: Dict[str, Any] | None = None)¶
Bases:
ABC
Defines the interface for experimental dataset access.
- ensure_batch_shape(value, batch_size)¶
Ensures that a value has the correct batch shape.
- class vipr_reflectometry.shared.load_data.readers.experimental_datasets.MariaExperimentalDataset(file_path: str, dataset_name: str, preloaded_raw_data_for_this_dataset: Dict[str, Any] | None = None)¶
Bases:
AbstractExperimentalDataset
- extract_fit_parameters()¶
Extracts and returns the fit parameters.
- get_experiment_data()¶
Returns the experiment data (containing ‘q’ and ‘data’).
- class vipr_reflectometry.shared.load_data.readers.experimental_datasets.XrrExperimentalDataset(file_path: str, dataset_name: str, preloaded_raw_data_for_this_dataset: Dict[str, Any] | None = None)¶
Bases:
AbstractExperimentalDataset
- extract_fit_parameters()¶
Extracts and returns the fit parameters.
- get_experiment_data()¶
Returns the experiment data (containing ‘q’ and ‘data’).
- vipr_reflectometry.shared.load_data.readers.experimental_datasets.detect_format(file_path, dataset_name, preloaded_full_file_dict=None)¶
- vipr_reflectometry.shared.load_data.readers.experimental_datasets.discover_experimental_datasets(file_path)¶
Scans an HDF5 file and identifies groups that likely represent experimental datasets compatible with ExperimentalDataManager.
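One way such a scan could work is sketched below on a nested dict standing in for the loaded file. The criteria are an assumption: the sketch treats any group that directly contains both `q` and `data` (the keys mentioned for `get_experiment_data()`) as a candidate. The name `find_dataset_groups` and the sample tree are hypothetical; the real function may apply different checks.

```python
def find_dataset_groups(tree, required_keys=("q", "data"), prefix=""):
    """Return paths of nested groups that directly contain all required keys."""
    found = []
    for name, child in tree.items():
        if not isinstance(child, dict):
            continue  # leaf values (arrays, scalars) are not groups
        path = f"{prefix}/{name}" if prefix else name
        if all(key in child for key in required_keys):
            found.append(path)
        else:
            # Not a dataset itself, so look one level deeper.
            found.extend(find_dataset_groups(child, required_keys, path))
    return found


file_tree = {
    "sample_a": {"experiment": {"q": [0.01, 0.02], "data": [1.0, 0.5]}},
    "metadata": {"instrument": "unknown"},
    "sample_b": {"q": [0.01], "data": [0.9], "fit": {"thickness": 100.0}},
}
groups = find_dataset_groups(file_tree)
```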
vipr_reflectometry.shared.load_data.readers.hdf5_cache module¶
HDF5 Caching for Reflectometry Data
Process-persistent caching functionality for HDF5 files using VIPR’s cache infrastructure. This module is co-located with HDF5-related logic in the flow_models plugin, following the same pattern as streaming_handler.py.
- vipr_reflectometry.shared.load_data.readers.hdf5_cache.clear_hdf5_cache(file_path: str | None = None)¶
Clear HDF5 cache entries.
- Parameters:
file_path – Optional specific file to clear. If None, clears all HDF5 cache entries.
- vipr_reflectometry.shared.load_data.readers.hdf5_cache.get_hdf5_cache_info()¶
Get information about current HDF5 cache entries.
- Returns:
Dict with cache statistics
- vipr_reflectometry.shared.load_data.readers.hdf5_cache.get_or_load_hdf5_data(file_path: str)¶
Load HDF5 file data with process-persistent caching.
This function provides intelligent caching for HDF5 files by:
- Using file modification time for automatic cache invalidation
- Leveraging VIPR’s existing process cache infrastructure
- Working across both VIPR-Core and FastAPI contexts
- Parameters:
file_path – Path to the HDF5 file
- Returns:
Loaded HDF5 data (dict-like structure from nxtodict)
- Raises:
FileNotFoundError – When the file doesn’t exist
ImportError – When silx is not available
Exception – For other loading errors
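Cache invalidation based on modification time can be sketched with the standard library alone. This is an illustration, not VIPR's cache infrastructure: the module-level `_CACHE` dict, the `get_or_load` function, and the injected `loader` callable are all hypothetical stand-ins.

```python
import os
import tempfile

_CACHE = {}  # maps file path -> (mtime at load time, loaded data)


def get_or_load(file_path, loader):
    """Return cached data for file_path, reloading if the file changed on disk."""
    if not os.path.exists(file_path):
        raise FileNotFoundError(file_path)
    mtime = os.path.getmtime(file_path)
    cached = _CACHE.get(file_path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: file unchanged since the last load
    data = loader(file_path)
    _CACHE[file_path] = (mtime, data)
    return data


loader_calls = []


def fake_loader(path):
    """Stand-in for an expensive HDF5 read; records each call."""
    loader_calls.append(path)
    with open(path) as handle:
        return handle.read()


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("v1")
first_load = get_or_load(tmp.name, fake_loader)
second_load = get_or_load(tmp.name, fake_loader)  # served from the cache
# Simulate a later modification: rewrite the file and bump its mtime.
with open(tmp.name, "w") as handle:
    handle.write("v2")
os.utime(tmp.name, (os.path.getatime(tmp.name), os.path.getmtime(tmp.name) + 10))
third_load = get_or_load(tmp.name, fake_loader)  # mtime changed, so reload
os.remove(tmp.name)
```

Storing the modification time next to the data means a stale entry is detected on the next access, with no explicit invalidation call needed.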
vipr_reflectometry.shared.load_data.readers.spectra_reader module¶
Spectra Reader Adapter for Reflectometry Data
Provides a unified API for different reflectometry data formats using the adapter pattern:
- HDF5 files (using ExperimentalDataManager)
- CSV/DAT files (Maria format)
Architecture: SpectraReader (manager) delegates to format-specific adapters (HDF5Adapter/CSVAdapter). Adapters create lightweight SpectrumProxy objects that load data explicitly via resolve() method.
- Usage:
reader = SpectraReader("path/to/data")
proxies = reader.list()  # Get lightweight proxies
proxy = proxies[0]  # Select first spectrum
data = proxy.resolve()  # Explicit data loading with caching
q = data.q  # Direct attribute access
I = data.I  # Direct attribute access
dI = data.dI  # Direct attribute access (can be None)
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.BaseDataHandle(*, format: Literal['hdf5', 'csv'])¶
Bases:
BaseModel
Base model shared by all data handles.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.CSVAdapter(file_path: str, column_mapping: dict | None = None)¶
Bases:
SpectraAdapter
Adapter for single CSV/DAT/TXT files with configurable column mapping.
- datasets()¶
CSV has no datasets/groups - return empty list.
- fetch(data_handle: CSVDataHandle) → SpectrumData¶
Optimized fetch that reads the file once (and caches it), then extracts all required data columns using a clean helper method.
- list(dataset=None)¶
Create SpectrumProxy object for the single spectrum.
- size(dataset=None)¶
Number of spectra - always 1 for single file.
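The idea of a configurable column mapping can be sketched as below. The shape of `column_mapping` is not documented here, so the name-to-column-index form used in this example is purely an assumption, as are the helper name `read_columns` and the sample text.

```python
def read_columns(text, column_mapping):
    """Parse whitespace-separated numeric rows into one list per mapped name."""
    columns = {name: [] for name in column_mapping}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split()
        for name, index in column_mapping.items():
            columns[name].append(float(fields[index]))
    return columns


sample = """# q I dI
0.01 1.00 0.05
0.02 0.50 0.03
"""
# Hypothetical mapping: field name -> zero-based column index.
spectrum = read_columns(sample, {"q": 0, "I": 1, "dI": 2})
```

Reading the file once and extracting every mapped column in the same pass matches the single-read behaviour described for `fetch`.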
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.CSVDataHandle(*, format: Literal['csv'] = 'csv')¶
Bases:
BaseDataHandle
Specific model for CSV data handles (no additional fields needed).
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.HDF5Adapter(file_path: str)¶
Bases:
SpectraAdapter
- datasets()¶
List of all samples/datasets.
- fetch(data_handle: HDF5DataHandle) → SpectrumData¶
Optimized fetch that reads the entire dataset batch once (and caches it), then extracts the data for the requested spectrum index.
- list(dataset=None)¶
Create SpectrumProxy objects for available spectra.
- size(dataset=None)¶
Total count or count within a sample.
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.HDF5DataHandle(*, format: Literal['hdf5'] = 'hdf5', dataset_name: str, spectrum_index: int)¶
Bases:
BaseDataHandle
Specific model for HDF5 data handles.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.SpectraAdapter¶
Bases:
ABC
Abstract base class for all spectra adapters.
- abstract fetch(data_handle: HDF5DataHandle | CSVDataHandle) → SpectrumData¶
Fetch all spectrum data (q, I, dI, dQ) in a single optimized call.
This method replaces the individual get_q/get_I/get_dI/get_dQ methods and performs all data loading in one operation to minimize I/O overhead.
- Parameters:
data_handle – Data handle containing spectrum reference information
- Returns:
Container with all spectrum data (q, I, dI, dQ)
- Return type:
SpectrumData
- abstract list(dataset: str | None = None) → list[SpectrumProxy]¶
Return all spectra as SpectrumProxy objects.
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.SpectraReader(data_path: str, column_mapping: dict | None = None)¶
Bases:
object
Adapter for various reflectometry data formats.
- get(dataset: str, index: int) → SpectrumProxy¶
Direct access to spectrum proxy.
- list(dataset: str | None = None) → list[SpectrumProxy]¶
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.SpectrumData(q: ndarray, I: ndarray, dI: ndarray | None = None, dQ: ndarray | None = None)¶
Bases:
object
Pure data container for spectrum data.
This dataclass holds the actual loaded spectrum data and provides direct attribute access to Q values, intensities, and errors.
- I: ndarray¶
- q: ndarray¶
- class vipr_reflectometry.shared.load_data.readers.spectra_reader.SpectrumProxy(adapter: SpectraAdapter, data_handle: HDF5DataHandle | CSVDataHandle)¶
Bases:
object
Lightweight proxy for a single spectrum with explicit loading.
This proxy doesn’t contain actual spectrum data, but knows how to fetch it explicitly when resolve() is called. The loaded data is cached for performance. This design makes expensive I/O operations explicit and avoids hidden costs.
- resolve() → SpectrumData¶
Load the actual spectrum data from the source.
This is the primary method to explicitly load data. The result is cached so subsequent calls return the same SpectrumData object without reloading.
- Returns:
Container with q, I, dI, dQ arrays
- Return type:
SpectrumData
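The explicit-load-then-cache behaviour of `resolve()` can be sketched with a small stand-in class. `LazyProxy` and `load_spectrum` are hypothetical names used for illustration; the real `SpectrumProxy` delegates to an adapter and a data handle rather than a bare callable, and returns a `SpectrumData` rather than a dict.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LazyProxy:
    """Holds a loader instead of data; loads once on the first resolve()."""
    loader: Callable[[], dict]
    _cached: Optional[dict] = field(default=None, init=False, repr=False)

    def resolve(self) -> dict:
        if self._cached is None:
            self._cached = self.loader()  # the only place I/O would happen
        return self._cached


load_log = []


def load_spectrum():
    """Stand-in for an expensive read; records each invocation."""
    load_log.append("load")
    return {"q": [0.01, 0.02], "I": [1.0, 0.5], "dI": None}


proxy = LazyProxy(load_spectrum)
loads_before_resolve = len(load_log)  # creating the proxy loads nothing
data_a = proxy.resolve()
data_b = proxy.resolve()              # returns the cached object
```

Because construction is cheap, a reader can list many proxies up front and pay the I/O cost only for the spectra that are actually resolved.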