pyphotomol package#

PyPhotoMol: A Python package for mass photometry data analysis.

PyPhotoMol provides tools for analyzing mass photometry data: data import, histogram analysis, peak detection, Gaussian fitting, and operation logging.

Main Classes#

PyPhotoMol : Main class for single-dataset analysis

MPAnalyzer : Class for batch processing multiple datasets

Key Features#

  • Import from HDF5 and CSV files

  • Histogram creation

  • Peak detection

  • Multi-Gaussian fitting

  • Mass-contrast calibration

Examples

Basic single-file analysis:

>>> from pyphotomol import PyPhotoMol
>>> model = PyPhotoMol()
>>> model.import_file('data.h5')
>>> model.create_histogram(use_masses=True, window=[0, 1000], bin_width=20)
>>> model.guess_peaks(min_height=5)
>>> model.fit_histogram(
...     peaks_guess=model.peaks_guess,
...     mean_tolerance=200,
...     std_tolerance=300
... )
>>> model.create_fit_table()

Batch processing:

>>> from pyphotomol import MPAnalyzer
>>> batch = MPAnalyzer()
>>> batch.import_files(['file1.h5', 'file2.h5', 'file3.h5'])
>>> batch.apply_to_all('count_binding_events')
>>> batch.apply_to_all('create_histogram', use_masses=True, window=[0, 1000], bin_width=20)
class pyphotomol.PyPhotoMol[source]#

Bases: object

Main class for analyzing mass photometry data.

The PyPhotoMol class provides a comprehensive suite of tools for importing, analyzing, and visualizing mass photometry data. It supports data import from HDF5 and CSV files, histogram creation and analysis, peak detection, Gaussian fitting, and mass-contrast calibration.

All operations are automatically logged to a comprehensive logbook that tracks parameters, results, and any errors encountered during analysis.

contrasts#

Array of contrast values from imported data

Type:

np.ndarray or None

masses#

Array of mass values (in kDa) from imported data or converted from contrasts

Type:

np.ndarray or None

histogram_centers#

Bin centers for created histograms

Type:

np.ndarray or None

hist_counts#

Count values for histogram bins

Type:

np.ndarray or None

hist_nbins#

Number of bins in the histogram

Type:

int or None

hist_window#

[min, max] window used for histogram creation

Type:

list or None

bin_width#

Width of histogram bins

Type:

float or None

hist_data_type#

Type of data used for histogram (‘masses’ or ‘contrasts’)

Type:

str or None

peaks_guess#

Positions of detected peaks in the histogram

Type:

np.ndarray or None

nbinding#

Number of binding events detected

Type:

int or None

nunbinding#

Number of unbinding events detected

Type:

int or None

fitted_params#

Parameters from Multi-Gaussian fitting

Type:

np.ndarray or None

fitted_data#

Fitted curve data points

Type:

np.ndarray or None

fitted_params_errors#

Error estimates for fitted parameters

Type:

np.ndarray or None

masses_fitted#

Mass values corresponding to fitted peaks

Type:

np.ndarray or None

baseline#

Baseline value used for fitting operations (default: 0)

Type:

float

fit_table#

Summary table of fitting results

Type:

pd.DataFrame or None

calibration_dic#

Dictionary containing mass-contrast calibration parameters

Type:

dict or None

logbook#

List of all operations performed, with timestamps and parameters

Type:

list

Examples

Basic workflow for mass photometry analysis:

>>> model = PyPhotoMol()
>>> model.import_file('data.h5')
>>> model.count_binding_events()
>>> model.create_histogram(use_masses=True, window=[0, 1000], bin_width=10)
>>> model.guess_peaks()
>>> model.fit_histogram(peaks_guess=model.peaks_guess, mean_tolerance=200, std_tolerance=300)
>>> model.print_logbook_summary()
__init__()[source]#

Initialize a new PyPhotoMol instance.

Creates an empty instance with all data properties set to None and initializes an empty logbook for operation tracking.

calibrate(calibration_standards)[source]#

Obtain a linear calibration of the form contrast = slope * mass + intercept.

Parameters:

calibration_standards (list) – List of the known masses of the calibration standards.

clear_logbook()[source]#

Clear all logbook entries.

contrasts_to_masses(slope=1.0, intercept=0.0)[source]#

Convert contrasts to masses using a linear calibration. The calibration is assumed to follow contrast = slope * mass + intercept.

Parameters:
  • slope (float, default 1.0) – Slope of the linear transformation.

  • intercept (float, default 0.0) – Intercept of the linear transformation.

count_binding_events()[source]#

Count binding and unbinding events from the mass data.

create_fit_table()[source]#

Create a fit table from the fitted parameters.

create_histogram(use_masses=True, window=[0, 2000], bin_width=10)[source]#

Create a histogram from imported contrast or mass data.

This method generates a histogram from the imported data, which is essential for subsequent peak detection and fitting operations. The histogram parameters can be customized for different types of analysis.

Parameters:
  • use_masses (bool, default True) – If True, create histogram from mass data (requires masses to be available). If False, create histogram from contrast data.

  • window (list of two floats, default [0, 2000]) – Range for the histogram as [min, max]. Units depend on the data type: for masses, typically [0, 2000] kDa; for contrasts, typically within [-1, 0] (e.g., [-0.8, -0.2]).

  • bin_width (float, default 10) – Width of histogram bins. Units depend on the data type: for masses, typically 10-50 kDa; for contrasts, typically 0.0004-0.001.

Raises:
  • AttributeError – If no data has been imported yet

  • ValueError – If use_masses=True but no mass data is available

Notes

After histogram creation, the following attributes are populated:

  • self.histogram_centers : Bin center positions

  • self.hist_counts : Count values for each bin

  • self.hist_nbins : Number of bins created

  • self.hist_window : Window used for histogram

  • self.bin_width : Bin width used

  • self.hist_data_type : Data type used (‘masses’ or ‘contrasts’)

Examples

Create mass histogram for protein analysis:

>>> model.create_histogram(use_masses=True, window=[0, 1000], bin_width=20)

Create contrast histogram for calibration:

>>> model.create_histogram(use_masses=False, window=[-0.8, -0.2], bin_width=0.0004)

High-resolution histogram for detailed analysis:

>>> model.create_histogram(use_masses=True, window=[50, 200], bin_width=5)
fit_histogram(peaks_guess, mean_tolerance=None, std_tolerance=None, threshold=None, baseline=0.0, fit_baseline=False)[source]#

Fit a multi-Gaussian model to the histogram data, starting from the guessed peaks.

The data type (masses or contrasts) is automatically detected from the histogram that was previously created using create_histogram().

Parameters:
  • peaks_guess (list) – List of guessed peaks.

  • mean_tolerance (float, optional) – Tolerance for the mean of each Gaussian. If None, it is inferred from the peak guesses.

  • std_tolerance (float, optional) – Tolerance for the standard deviation of each Gaussian. If None, it is inferred from the peak guesses.

  • threshold (float, optional) – For masses: minimum value that can be observed (in kDa units). Default is 40. For contrasts: maximum value that can be observed (should be negative). Default is -0.0024. If None, defaults are applied based on detected data type.

  • baseline (float, default 0.0) – Baseline value to be subtracted from the fit.

  • fit_baseline (bool, default False) – Whether to fit a baseline to the histogram. If True, a baseline will be included in the fit and the ‘baseline’ argument will be ignored.

Examples

Fit histogram after creating it:

>>> model.create_histogram(use_masses=True, window=[0, 2000], bin_width=10)
>>> model.guess_peaks()
>>> model.fit_histogram(model.peaks_guess, mean_tolerance=100, std_tolerance=200)
get_logbook(as_dataframe=True, save_to_file=None)[source]#

Retrieve the logbook of all operations performed on this instance.

The logbook contains a complete history of all method calls, including parameters used, results obtained, timestamps, and any errors encountered. This provides full traceability of the analysis workflow.

Parameters:
  • as_dataframe (bool, default True) – If True, return logbook as a pandas DataFrame for easy analysis. If False, return as a list of dictionaries.

  • save_to_file (str, optional) – If provided, save the logbook to this file path as JSON format.

Returns:

Logbook entries containing operation history. DataFrame columns include:

  • timestamp: ISO format timestamp of when the operation was performed

  • method: Name of the method that was called

  • parameters: Dictionary of parameters passed to the method

  • result_summary: Summary of results produced (for successful operations)

  • notes: Additional notes about the operation

  • success: Boolean indicating whether the operation completed successfully

  • error: Error message (only present for failed operations)

Return type:

pandas.DataFrame or list

Examples

Get logbook as DataFrame for analysis:

>>> model = PyPhotoMol()
>>> model.import_file('data.h5')
>>> logbook_df = model.get_logbook()
>>> print(logbook_df[['timestamp', 'method', 'success']])

Save logbook to file:

>>> model.get_logbook(save_to_file='analysis_log.json')

Get raw logbook data:

>>> raw_logbook = model.get_logbook(as_dataframe=False)
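
When the logbook is returned as a list of dictionaries, it can be inspected with plain Python. The sketch below does not call pyphotomol: the entries are written by hand to mimic the documented fields, and the helper name `failed_operations` is hypothetical.

```python
import json

# Hand-written entries that mimic the documented logbook fields.
sample_logbook = [
    {"timestamp": "2024-01-01T10:00:00", "method": "import_file",
     "parameters": {"file_path": "data.h5"}, "success": True},
    {"timestamp": "2024-01-01T10:00:05", "method": "create_histogram",
     "parameters": {"use_masses": True}, "success": False,
     "error": "No mass data available"},
]


def failed_operations(logbook):
    """Return (method, error) pairs for entries whose 'success' flag is not True."""
    return [(entry["method"], entry.get("error", ""))
            for entry in logbook
            if not entry.get("success", False)]


failures = failed_operations(sample_logbook)

# The entries are JSON-serializable, like the file written via save_to_file.
serialized = json.dumps(sample_logbook, indent=2)
```
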
guess_peaks(min_height=10, min_distance=4, prominence=4)[source]#

Guess peaks in the histogram data.

The arguments are adjusted according to the region of the histogram. For example, for mass data the given distance is used as-is between 0 and 650 kDa, multiplied by 3 between 650 and 1500 kDa, and multiplied by 8 above 1500 kDa. See the guess_peaks function in utils.helpers for more details.

Example of min_height, min_distance and prominence for contrasts:

min_height=10, min_distance=4, prominence=4

Parameters:
  • min_height (int, default 10) – Minimum height of the peaks.

  • min_distance (int, default 4) – Minimum distance between peaks.

  • prominence (int, default 4) – Minimum prominence of the peaks.
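
The region-dependent scaling of the distance described above can be sketched as a small lookup. This is an illustration only, not the code in utils.helpers: the function name `scaled_min_distance` is hypothetical, and how the boundaries at exactly 650 and 1500 kDa are handled is an assumption.

```python
def scaled_min_distance(mass_kda, min_distance=4):
    """Return the minimum peak distance to use at a given mass (in kDa).

    Assumption: a boundary value (650 or 1500 kDa) belongs to the higher region.
    """
    if mass_kda < 650:
        return min_distance          # 0-650 kDa: distance used as given
    if mass_kda < 1500:
        return min_distance * 3      # 650-1500 kDa: distance times 3
    return min_distance * 8          # above 1500 kDa: distance times 8
```

With the default min_distance of 4, the three regions use distances of 4, 12, and 32.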

import_file(file_path)[source]#

Import mass photometry data from HDF5 or CSV files.

This method loads contrast and mass data from supported file formats. NaN values are automatically removed from the imported data. The import operation is automatically logged with file information and data statistics.

Parameters:

file_path (str) – Path to the data file. Supported formats are:

  • ‘.h5’ : HDF5 files with standard mass photometry structure

  • ‘.csv’ : CSV files with contrast and mass columns

Raises:
  • ValueError – If the file format is not supported (not .h5 or .csv)

  • FileNotFoundError – If the specified file does not exist

  • KeyError – If required data columns are missing from the file

Notes

After import, the following attributes are populated:

  • self.contrasts : Array of contrast values with NaN removed

  • self.masses : Array of mass values with NaN removed (if available)

The logbook will record:

  • File path and type

  • Number of data points imported

  • Range of contrast and mass values

Examples

Import HDF5 data:

>>> model = PyPhotoMol()
>>> model.import_file('experiment_data.h5')
>>> print(f"Imported {len(model.contrasts)} contrast measurements")

Import CSV data:

>>> model.import_file('processed_data.csv')
>>> print(f"Mass range: {model.masses.min():.1f} - {model.masses.max():.1f} kDa")
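
The two checks described for import (rejecting unsupported extensions and dropping NaN values) can be sketched in plain Python. The helper names `check_extension` and `drop_nan` are hypothetical, and lower-casing the extension is an assumption; the actual pyphotomol implementation may differ.

```python
import math
import os


def check_extension(file_path):
    """Return the file extension, or raise ValueError if it is not .h5 or .csv."""
    ext = os.path.splitext(file_path)[1].lower()  # assumption: case-insensitive match
    if ext not in (".h5", ".csv"):
        raise ValueError(f"Unsupported file format: {ext!r}")
    return ext


def drop_nan(values):
    """Return the values with NaN entries removed, preserving order."""
    return [v for v in values if not math.isnan(v)]
```
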
print_logbook_summary()[source]#

Print a summary of the logbook.

class pyphotomol.MPAnalyzer[source]#

Bases: object

A class to handle multiple PyPhotoMol instances. This is useful for batch processing of multiple files.

__init__()[source]#
apply_to_all(method_name, *args, names=None, **kwargs)[source]#

Apply a method to all or selected PyPhotoMol instances.

Parameters:
  • method_name (str) – Name of the method to apply to instances

  • *args (tuple) – Positional arguments to pass to the method

  • names (list or str, optional) – Names of specific models to apply method to. If None (default), applies to all models.

  • **kwargs (dict) – Keyword arguments to pass to the method

Examples

Count binding events for all models:

>>> pms.apply_to_all('count_binding_events')

Create histograms for specific models only:

>>> pms.apply_to_all('create_histogram', names=['model1', 'model2'],
...                  use_masses=True, window=[0, 2000], bin_width=10)

Guess peaks for a single model:

>>> pms.apply_to_all('guess_peaks', names='model1',
...                  min_height=10, min_distance=4, prominence=4)
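
Dispatching a method by name to all or selected models can be sketched with getattr. This is a minimal illustration, assuming the models are kept in a dictionary keyed by name; `ToyModel` and `apply_to_all_sketch` are hypothetical stand-ins, not pyphotomol classes or functions.

```python
class ToyModel:
    """Stand-in for a PyPhotoMol instance that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def create_histogram(self, use_masses=True, bin_width=10):
        self.calls.append(("create_histogram", use_masses, bin_width))


def apply_to_all_sketch(models, method_name, *args, names=None, **kwargs):
    """Call method_name on every model, or only on the models listed in names."""
    if names is None:
        selected = list(models)      # all model names
    elif isinstance(names, str):
        selected = [names]           # a single name given as a string
    else:
        selected = list(names)
    for name in selected:
        getattr(models[name], method_name)(*args, **kwargs)
    return selected


toy_models = {"model1": ToyModel(), "model2": ToyModel()}
apply_to_all_sketch(toy_models, "create_histogram", names="model1", bin_width=20)
```
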
create_plotting_config(repeat_colors=True)[source]#

Create configuration dataframes for plotting multiple PhotoMol models.

Parameters:

repeat_colors (bool, default True) – If True, repeat the same color scheme for each model’s peaks. If False, use sequential colors across all peaks from all models.

Returns:

tuple

  • legends_df: DataFrame with legends, colors, and selection flags for Gaussian traces

  • colors_hist_df: DataFrame with histogram colors for each model

Return type:

(legends_df, colors_hist_df)
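
The effect of repeat_colors can be illustrated with a small sketch. It assumes colors are drawn cyclically from a palette; the function name `assign_peak_colors` and the palette are hypothetical and do not reflect the colors pyphotomol actually uses.

```python
def assign_peak_colors(peaks_per_model, palette, repeat_colors=True):
    """Return one list of colors per model.

    repeat_colors=True restarts the palette for each model;
    repeat_colors=False continues through the palette across all peaks.
    """
    colors = []
    offset = 0
    for n_peaks in peaks_per_model:
        start = 0 if repeat_colors else offset
        colors.append([palette[(start + i) % len(palette)] for i in range(n_peaks)])
        offset += n_peaks
    return colors


example_palette = ["red", "green", "blue", "orange"]
repeated = assign_peak_colors([2, 2], example_palette, repeat_colors=True)
sequential = assign_peak_colors([2, 2], example_palette, repeat_colors=False)
```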

get_all_logbooks(save_to_file=None)[source]#

Get combined logbooks from all models plus batch operations.

Parameters:

save_to_file (str, optional) – Optional file path to save combined logbook as JSON

Returns:

Combined logbooks with batch and individual model logs

Return type:

dict

get_batch_logbook(as_dataframe=True, save_to_file=None)[source]#

Retrieve the batch logbook of all operations performed.

Parameters:
  • as_dataframe (bool, default True) – If True, return as pandas DataFrame, else as list of dicts

  • save_to_file (str, optional) – Optional file path to save logbook as JSON

Returns:

Batch logbook entries

Return type:

pandas.DataFrame or list

get_properties(variable)[source]#

Get properties from all PyPhotoMol instances.

Parameters:

variable (str) – The property to get from each instance.

Returns:

List of the specified property from each instance.

Return type:

list

Examples

Get masses from all models:

>>> masses_list = pms.get_properties('masses')

Get fit tables from all models:

>>> fit_tables = pms.get_properties('fit_table')
import_files(files, names=None)[source]#

Load multiple files into PyPhotoMol instances.

Parameters:
  • files (list) – List of file paths to load.

  • names (list, optional) – List of names for the PyPhotoMol instances.

master_calibration(calibration_standards)[source]#

Perform master calibration using known masses. It uses information from the fit table of each model to create a master calibration.

Parameters:

calibration_standards (list) – List of known masses for calibration.

pyphotomol.contrasts_to_mass(contrasts, slope, intercept)[source]#

Convert contrasts to masses using known calibration parameters.

Caution: slope and intercept describe contrast as a function of mass, that is, contrast = slope * mass + intercept.

Parameters:
  • contrasts (np.ndarray) – Contrasts to convert.

  • slope (float) – Slope of the calibration line.

  • intercept (float) – Intercept of the calibration line.

Returns:

Converted masses in kDa.

Return type:

np.ndarray
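
Since the calibration is expressed as contrast = slope * mass + intercept, recovering a mass from a contrast means inverting that line. A minimal sketch in plain Python (no NumPy); the helper name `invert_calibration` and the numeric values are hypothetical and not taken from pyphotomol or from a real instrument.

```python
def invert_calibration(contrasts, slope, intercept):
    """Invert contrast = slope * mass + intercept for each contrast value."""
    if slope == 0:
        raise ValueError("slope must be non-zero to invert the calibration")
    return [(c - intercept) / slope for c in contrasts]


# Illustrative calibration parameters only.
example_slope = -0.001
example_intercept = 0.0
example_masses = invert_calibration([-0.066, -0.146, -0.48],
                                    example_slope, example_intercept)
```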

pyphotomol.create_histogram(vector, window=[0, 2000], bin_width=10)[source]#

Create a histogram of the provided vector within a specified window, using a fixed bin width.

Parameters:
  • vector (np.ndarray) – The data to create the histogram from.

  • window (list, default [0, 2000]) – The range of values to include in the histogram [min, max].

  • bin_width (float, default 10) – The width of each bin in the histogram.

Returns:

  • histogram_centers (np.ndarray) – The x-coordinates of the histogram bins.

  • hist_counts (np.ndarray) – The counts of values in each bin (Y-axis of the histogram).

  • hist_nbins (int) – The number of bins in the histogram.

Examples

For contrast data:

>>> centers, counts, nbins = create_histogram(contrasts, window=[-1, 0], bin_width=0.0004)

For mass data:

>>> centers, counts, nbins = create_histogram(masses, window=[0, 2000], bin_width=10)
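
For readers who want to see what fixed-width binning involves, here is a plain-Python sketch that returns the same three kinds of values (bin centers, counts, number of bins). It is not the pyphotomol implementation: the name `histogram_sketch` is hypothetical, and treating bins as half-open and dropping values outside the window are assumptions.

```python
def histogram_sketch(values, window=(0, 2000), bin_width=10):
    """Bin the values that fall inside window into fixed-width bins."""
    low, high = window
    n_bins = int(round((high - low) / bin_width))
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:  # assumption: values outside the window are ignored
            idx = min(int((v - low) / bin_width), n_bins - 1)
            counts[idx] += 1
    centers = [low + (i + 0.5) * bin_width for i in range(n_bins)]
    return centers, counts, n_bins


demo_centers, demo_counts, demo_nbins = histogram_sketch(
    [5, 15, 17, 95, 250], window=(0, 100), bin_width=10)
```
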
pyphotomol.guess_peaks(x, histogram_centers, height=14, distance=4, prominence=8, masses=True)[source]#

Try to find peaks in the histogram data.

Automatically finds peaks in histogram data with adaptive parameters based on the data range. For mass data, different distance thresholds are used for different mass ranges to account for peak spacing variations.

Parameters:
  • x (np.ndarray) – The histogram counts data to find peaks in.

  • histogram_centers (np.ndarray) – The centers of the histogram bins corresponding to x values.

  • height (int, default 14) – Minimum height of peaks.

  • distance (int, default 4) – Minimum distance between peaks (will be scaled for different mass ranges).

  • prominence (int, default 8) – Minimum prominence of peaks.

  • masses (bool, default True) – If True, find peaks in mass data; if False, find peaks in contrast data.

Returns:

Histogram centers of the found peaks.

Return type:

np.ndarray

Examples

For contrast data:

>>> peaks = guess_peaks(hist_counts, hist_centers, height=10, distance=4, prominence=4, masses=False)

For mass data:

>>> peaks = guess_peaks(hist_counts, hist_centers, height=14, distance=4, prominence=8, masses=True)
pyphotomol.fit_histogram(hist_counts, hist_centers, guess_positions=[66, 148, 480], mean_tolerance=None, std_tolerance=None, threshold=40, baseline=0, masses=True, fit_baseline=False)[source]#

Fit a histogram with multiple truncated gaussians.

Parameters:
  • hist_counts (np.ndarray) – The counts of values in each bin of the histogram.

  • hist_centers (np.ndarray) – The centers of the histogram bins.

  • guess_positions (list, default [66,148,480]) – Initial guesses for the positions of the peaks.

  • mean_tolerance (int, optional) – Tolerance for the peak positions. If None (the default in the signature), it is derived from guess_positions.

  • std_tolerance (int, optional) – Maximum standard deviation for the peaks. If None (the default in the signature), it is derived from guess_positions.

  • threshold (int, default 40 for masses in kDa units) – For masses, the minimum value that can be observed. For contrasts, the maximum value that can be observed, which should be negative.

  • baseline (float, default 0) – Baseline value to be added to the fit.

  • masses (bool, default True) – If True, the fit is for mass data; if False, it is for contrast data.

  • fit_baseline (bool, default False) – If True, the fit will include a baseline parameter. The ‘baseline’ argument will be ignored.

Returns:

  • popt (np.ndarray) – Optimized parameters for the fit.

  • fit (np.ndarray) – Fitted values for the histogram. The first column is the x-coordinates, followed by the individual Gaussian fits and the total fit.

  • fit_error (np.ndarray) – Errors of the fitted parameters.

Examples

For contrast data:

>>> popt, fit, errors = fit_histogram(counts, centers, mean_tolerance=0.05, std_tolerance=0.1)
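
To make the idea of a sum of truncated Gaussians concrete, the sketch below evaluates such a model at a single x value for mass data. It rests on several assumptions that the documentation does not confirm: parameters are grouped as (position, amplitude, sigma) triplets, truncation means each Gaussian contributes nothing below the threshold, and the baseline is added to the sum. The function names are hypothetical.

```python
import math


def truncated_gaussian(x, position, amplitude, sigma, threshold):
    """Gaussian that is set to zero below threshold (assumed truncation rule)."""
    if x < threshold:
        return 0.0
    return amplitude * math.exp(-0.5 * ((x - position) / sigma) ** 2)


def multi_gaussian(x, params, threshold=40, baseline=0.0):
    """Sum truncated Gaussians given as flat (position, amplitude, sigma) triplets."""
    total = baseline
    for i in range(0, len(params), 3):
        position, amplitude, sigma = params[i:i + 3]
        total += truncated_gaussian(x, position, amplitude, sigma, threshold)
    return total
```

A fitting routine would adjust the triplets so that this model matches the histogram counts; the sketch covers only the model evaluation, not the optimization.
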
pyphotomol.create_fit_table(popt, popt_error, fit, n_binding, n_unbinding, hist_centers, masses=True, include_errors=True)[source]#

Generate a pandas DataFrame that summarizes the fit results.

Parameters:
  • popt (np.ndarray) – Optimized parameters from the fit.

  • popt_error (np.ndarray) – Errors of the fitted parameters.

  • fit (np.ndarray) – Fitted values for the histogram.

  • n_binding (int) – Number of binding events.

  • n_unbinding (int) – Number of unbinding events.

  • hist_centers (np.ndarray) – The centers of the histogram bins.

  • masses (bool, default True) – If True, the fit is for mass data; if False, it is for contrast data.

  • include_errors (bool, default True) – If True, include errors in the fit table.

Returns:

fit_table – DataFrame containing the fit results.

Return type:

pd.DataFrame

pyphotomol.calibrate(calib_floats, fit_table)[source]#

Calibration based on contrasts histogram

Parameters:
  • calib_floats (list) – List of calibration standards in kDa (e.g. [66, 146, 480]).

  • fit_table (pd.DataFrame) – DataFrame containing the fit results. Created by create_fit_table.

Returns:

calibration_dic – Dictionary containing the calibration results:

  • ‘standards’: Calibration standards used.

  • ‘exp_points’: Expected points from the fit.

  • ‘fit_params’: Parameters of the fit.

  • ‘fit_r2’: R-squared value of the fit.

Return type:

dict
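
A linear calibration with an R-squared value can be obtained by ordinary least squares. The documentation does not state which fitting method pyphotomol uses, so the sketch below is only an assumption of how contrast = slope * mass + intercept might be fitted; the function name `linear_calibration` and the contrast values are hypothetical.

```python
def linear_calibration(masses, contrasts):
    """Least-squares fit of contrast = slope * mass + intercept, with R-squared."""
    n = len(masses)
    mean_m = sum(masses) / n
    mean_c = sum(contrasts) / n
    sxx = sum((m - mean_m) ** 2 for m in masses)
    sxy = sum((m - mean_m) * (c - mean_c) for m, c in zip(masses, contrasts))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_m
    ss_res = sum((c - (slope * m + intercept)) ** 2 for m, c in zip(masses, contrasts))
    ss_tot = sum((c - mean_c) ** 2 for c in contrasts)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Standards in kDa paired with made-up, perfectly linear contrasts.
fit_slope, fit_intercept, fit_r2 = linear_calibration(
    [66, 146, 480], [-0.066, -0.146, -0.480])
```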

pyphotomol.import_file_h5(filename)[source]#

Import mass photometry data from HDF5 files generated by Refeyn instruments.

This function reads contrast and mass data from standard Refeyn HDF5 file formats. It automatically handles different data structures and performs calibration conversions when necessary. NaN values are filtered out automatically.

Parameters:

filename (str) – Path to the HDF5 file to import

Returns:

  • contrasts (np.ndarray) – Array of contrast values with NaN values removed

  • masses_kDa (np.ndarray or None) – Array of mass values in kDa units, or None if not available. If masses are not directly available but calibration data exists, they will be computed from contrasts using the calibration parameters.

Notes

The function searches for data in the following order:

  1. Direct ‘contrasts’ and ‘masses_kDa’ datasets

  2. ‘calibrated_values’ with calibration parameters

  3. ‘per_movie_events’ for movie-based data

Raises:
  • FileNotFoundError – If the specified file does not exist

  • KeyError – If required datasets are not found in the HDF5 file

  • ValueError – If the file format is not recognized or data is corrupted

Examples

Import standard Refeyn data:

>>> contrasts, masses = import_file_h5('experiment.h5')
>>> print(f"Loaded {len(contrasts)} events")
>>> if masses is not None:
...     print(f"Mass range: {masses.min():.1f} - {masses.max():.1f} kDa")
pyphotomol.import_csv(filename)[source]#

Import data from a CSV file generated by a Refeyn instrument.

The CSV file should contain columns ‘contrasts’ and optionally ‘masses_kDa’. NaN values are automatically filtered out from the imported data.

Parameters:

filename (str) – Path to the CSV file to import

Returns:

  • contrasts (np.ndarray) – Array of contrast values with NaN values removed

  • masses_kDa (np.ndarray or None) – Array of mass values in kDa, or None if not available in the CSV file

Raises:
  • FileNotFoundError – If the specified file does not exist

  • KeyError – If the required ‘contrasts’ column is not found in the CSV file

  • ValueError – If the file format is not recognized or data is corrupted

Examples

Import CSV data with contrasts only:

>>> contrasts, masses = import_csv('contrasts_only.csv')
>>> print(f"Loaded {len(contrasts)} contrast measurements")
>>> print(f"Masses available: {masses is not None}")

Import CSV data with both contrasts and masses:

>>> contrasts, masses = import_csv('full_data.csv')
>>> if masses is not None:
...     print(f"Mass range: {masses.min():.1f} - {masses.max():.1f} kDa")
pyphotomol.plot_histograms_and_fits(analyzer, legends_df=None, colors_hist=None, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None, legend_config: LegendConfig = None)[source]#

Create a comprehensive plot of PhotoMol fit data with histograms and Gaussian traces.

Parameters:
  • analyzer (pyphotomol.MPAnalyzer or pyphotomol.PyPhotoMol) – MPAnalyzer instance containing multiple PyPhotoMol models, or a single PyPhotoMol instance

  • legends_df (pd.DataFrame, optional) – DataFrame containing legends, colors, and selections with columns [‘legends’, ‘color’, ‘select’, ‘show_legend’]. This dataframe affects the fitted curves only, not the histograms.

  • colors_hist (list, str, or pd.DataFrame, optional) – List of colors for histograms (one per model). If a string, it will be used for all histograms. If a DataFrame, it should have a column ‘color’ with hex color codes.

  • plot_config (PlotConfig, optional) – General plot configuration (dimensions, format, contrasts, etc.)

  • axis_config (AxisConfig, optional) – Axis styling configuration (grid, line widths, etc.)

  • layout_config (LayoutConfig, optional) – Layout configuration (stacked, spacing, etc.)

  • legend_config (LegendConfig, optional) – Legend and labeling configuration

Returns:

Configured plotly figure object

Return type:

go.Figure

Examples

Simple plot with default settings:

>>> fig = plot_histograms_and_fits(analyzer, colors_hist=['blue', 'red'])
>>> fig.show()

Customized plot with configuration objects:

>>> plot_config = PlotConfig(plot_width=800, contrasts=True, x_range=[0, 500])
>>> layout_config = LayoutConfig(stacked=True, vertical_spacing=0.05)
>>> fig = plot_histograms_and_fits(analyzer, plot_config=plot_config,
...                               layout_config=layout_config)
>>> fig.show()

Plot with custom x-axis limits:

>>> plot_config = PlotConfig(x_range=[100, 800])  # Zoom to 100-800 kDa range
>>> fig = plot_histograms_and_fits(analyzer, plot_config=plot_config)
>>> fig.show()
pyphotomol.plot_histogram(analyzer, colors_hist=None, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None)[source]#

Create a plot with only histograms from PhotoMol data (wrapper around plot_histograms_and_fits).

This function is a simplified wrapper that creates histogram-only plots without requiring fitted data or legend configuration.

Parameters:
  • analyzer (pyphotomol.MPAnalyzer or pyphotomol.PyPhotoMol) – MPAnalyzer instance containing multiple PyPhotoMol models or a single PyPhotoMol instance

  • colors_hist (list, optional) – List of colors for histograms (one per model)

  • plot_config (PlotConfig, optional) – General plot configuration (dimensions, format, contrasts, etc.)

  • axis_config (AxisConfig, optional) – Axis styling configuration (grid, line widths, etc.)

  • layout_config (LayoutConfig, optional) – Layout configuration (stacked, spacing, etc.)

Returns:

Configured plotly figure object with histograms only

Return type:

go.Figure

Examples

Create a simple histogram plot:

>>> fig = plot_histogram(analyzer, ['#FF5733', '#33C3FF'])
>>> fig.show()

Create stacked normalized histograms:

>>> plot_config = PlotConfig(normalize=True)
>>> layout_config = LayoutConfig(stacked=True)
>>> fig = plot_histogram(analyzer, ['blue', 'red'],
...                      plot_config=plot_config, layout_config=layout_config)
>>> fig.show()
pyphotomol.config_fig(fig, plot_width=800, plot_height=600, plot_type='png', plot_title_for_download='plot')[source]#

Configure plotly figure with download options and toolbar settings.

Parameters:
  • fig (go.Figure) – Plotly figure object

  • plot_width (int, default 800) – Width of the plot in pixels

  • plot_height (int, default 600) – Height of the plot in pixels

  • plot_type (str, default "png") – Format for downloading the plot (e.g., “png”, “jpeg”)

  • plot_title_for_download (str, default "plot") – Title for the downloaded plot file

Returns:

Configured plotly figure

Return type:

go.Figure

pyphotomol.plot_calibration(mass, contrast, slope, intercept, plot_config: PlotConfig = None, axis_config: AxisConfig = None)[source]#

Create a scatter plot of mass vs contrast with calibration line.

This function creates a visualization showing the relationship between mass and ratiometric contrast, with a fitted calibration line overlaid. This is useful for visualizing calibration quality and outliers.

Parameters:
  • mass (array-like) – Array of mass values in kDa

  • contrast (array-like) – Array of corresponding ratiometric contrast values

  • slope (float) – Slope of the calibration line (contrast = slope * mass + intercept)

  • intercept (float) – Intercept of the calibration line

  • plot_config (PlotConfig, optional) – General plot configuration (dimensions, format, axis size, etc.)

  • axis_config (AxisConfig, optional) – Axis styling configuration (grid, line widths, etc.)

Returns:

Plotly figure object containing the mass vs contrast calibration plot

Return type:

go.Figure

Examples

Plot mass vs contrast calibration:

>>> import numpy as np
>>> from pyphotomol.utils.plotting import plot_calibration, PlotConfig, AxisConfig
>>>
>>> # Simulated calibration data
>>> mass = np.array([66, 146, 480])
>>> contrast = np.array([-0.1, -0.2, -0.5])
>>> slope = -0.001
>>> intercept = 0.02
>>>
>>> # Simple plot with defaults
>>> fig = plot_calibration(mass, contrast, slope, intercept)
>>> fig.show()
>>>
>>> # Customized plot
>>> plot_config = PlotConfig(plot_width=600, plot_height=400, font_size=12)
>>> axis_config = AxisConfig(showgrid_x=False, n_y_axis_ticks=6)
>>> fig = plot_calibration(mass, contrast, slope, intercept,
...                       plot_config=plot_config, axis_config=axis_config)
>>> fig.show()
class pyphotomol.PlotConfig(plot_width: int = 1000, plot_height: int = 400, plot_type: str = 'png', font_size: int = 14, normalize: bool = False, contrasts: bool = False, cst_factor_for_contrast: float = 1, x_range: List[float] | None = None)[source]#

Bases: object

General plot configuration

__init__(plot_width: int = 1000, plot_height: int = 400, plot_type: str = 'png', font_size: int = 14, normalize: bool = False, contrasts: bool = False, cst_factor_for_contrast: float = 1, x_range: List[float] | None = None) None#
contrasts: bool = False#
cst_factor_for_contrast: float = 1#
font_size: int = 14#
normalize: bool = False#
plot_height: int = 400#
plot_type: str = 'png'#
plot_width: int = 1000#
x_range: List[float] | None = None#
class pyphotomol.AxisConfig(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 3, axis_linewidth: int = 1, axis_tickwidth: int = 1, axis_gridwidth: int = 1)[source]#

Bases: object

Axis styling configuration

__init__(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 3, axis_linewidth: int = 1, axis_tickwidth: int = 1, axis_gridwidth: int = 1) None#
axis_gridwidth: int = 1#
axis_linewidth: int = 1#
axis_tickwidth: int = 1#
n_y_axis_ticks: int = 3#
showgrid_x: bool = True#
showgrid_y: bool = True#
class pyphotomol.LayoutConfig(stacked: bool = False, show_subplot_titles: bool = False, vertical_spacing: float = 0.1, shared_yaxes: bool = True, extra_padding_y_label: float = 0)[source]#

Bases: object

Layout and spacing configuration

__init__(stacked: bool = False, show_subplot_titles: bool = False, vertical_spacing: float = 0.1, shared_yaxes: bool = True, extra_padding_y_label: float = 0) None#
extra_padding_y_label: float = 0#
shared_yaxes: bool = True#
show_subplot_titles: bool = False#
stacked: bool = False#
vertical_spacing: float = 0.1#
class pyphotomol.LegendConfig(add_masses_to_legend: bool = True, add_percentage_to_legend: bool = False, add_labels: bool = True, add_percentages: bool = True, line_width: int = 3)[source]#

Bases: object

Legend and labeling configuration

__init__(add_masses_to_legend: bool = True, add_percentage_to_legend: bool = False, add_labels: bool = True, add_percentages: bool = True, line_width: int = 3) None#
add_labels: bool = True#
add_masses_to_legend: bool = True#
add_percentage_to_legend: bool = False#
add_percentages: bool = True#
line_width: int = 3#

Subpackages#

Submodules#

pyphotomol.main module#

pyphotomol.main.log_method(func)[source]#

Decorator to automatically handle errors and logging.

This decorator will:

  1. Handle exceptions by logging them to the logbook

  2. Re-raise exceptions after logging

Note: The actual success logging is handled by each method individually.
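
The log-then-re-raise pattern described above can be sketched with functools.wraps. This is an illustration of the pattern, not the source of log_method: the names `log_errors` and `ToyAnalysis` are hypothetical, and the fields written to the logbook are assumed from the documented logbook columns.

```python
import functools
from datetime import datetime


def log_errors(func):
    """Record any exception in self.logbook, then re-raise it."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            return func(self, *args, **kwargs)
        except Exception as exc:
            self.logbook.append({
                "timestamp": datetime.now().isoformat(),
                "method": func.__name__,
                "success": False,
                "error": str(exc),
            })
            raise  # the caller still sees the original exception
    return wrapper


class ToyAnalysis:
    """Stand-in class with a logbook list, used only to exercise the decorator."""

    def __init__(self):
        self.logbook = []

    @log_errors
    def fail(self):
        raise ValueError("no data imported")
```

Success entries are deliberately not written by the wrapper, matching the note that each method logs its own success.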

pyphotomol.main.log_batch_method(func)[source]#

Decorator to automatically handle errors and logging for MPAnalyzer methods.

class pyphotomol.main.PyPhotoMol[source]#

Bases: object

Main class for analyzing mass photometry data.

The PyPhotoMol class provides a comprehensive suite of tools for importing, analyzing, and visualizing mass photometry data. It supports data import from HDF5 and CSV files, histogram creation and analysis, peak detection, Gaussian fitting, and mass-contrast calibration.

All operations are automatically logged to a comprehensive logbook that tracks parameters, results, and any errors encountered during analysis.

contrasts#

Array of contrast values from imported data

Type:

np.ndarray or None

masses#

Array of mass values (in kDa) from imported data or converted from contrasts

Type:

np.ndarray or None

histogram_centers#

Bin centers for created histograms

Type:

np.ndarray or None

hist_counts#

Count values for histogram bins

Type:

np.ndarray or None

hist_nbins#

Number of bins in the histogram

Type:

int or None

hist_window#

[min, max] window used for histogram creation

Type:

list or None

bin_width#

Width of histogram bins

Type:

float or None

hist_data_type#

Type of data used for histogram (‘masses’ or ‘contrasts’)

Type:

str or None

peaks_guess#

Positions of detected peaks in the histogram

Type:

np.ndarray or None

nbinding#

Number of binding events detected

Type:

int or None

nunbinding#

Number of unbinding events detected

Type:

int or None

fitted_params#

Parameters from Multi-Gaussian fitting

Type:

np.ndarray or None

fitted_data#

Fitted curve data points

Type:

np.ndarray or None

fitted_params_errors#

Error estimates for fitted parameters

Type:

np.ndarray or None

masses_fitted#

Mass values corresponding to fitted peaks

Type:

np.ndarray or None

baseline#

Baseline value used for fitting operations (default: 0)

Type:

float

fit_table#

Summary table of fitting results

Type:

pd.DataFrame or None

calibration_dic#

Dictionary containing mass-contrast calibration parameters

Type:

dict or None

logbook#

List of all operations performed, with timestamps and parameters

Type:

list

Examples

Basic workflow for mass photometry analysis:

>>> from pyphotomol import PyPhotoMol
>>> model = PyPhotoMol()
>>> model.import_file('data.h5')
>>> model.count_binding_events()
>>> model.create_histogram(use_masses=True, window=[0, 1000], bin_width=10)
>>> model.guess_peaks()
>>> model.fit_histogram(peaks_guess=model.peaks_guess, mean_tolerance=200, std_tolerance=300)
>>> model.print_logbook_summary()
__init__()[source]#

Initialize a new PyPhotoMol instance.

Creates an empty instance with all data properties set to None and initializes an empty logbook for operation tracking.

get_logbook(as_dataframe=True, save_to_file=None)[source]#

Retrieve the logbook of all operations performed on this instance.

The logbook contains a complete history of all method calls, including parameters used, results obtained, timestamps, and any errors encountered. This provides full traceability of the analysis workflow.

Parameters:
  • as_dataframe (bool, default True) – If True, return logbook as a pandas DataFrame for easy analysis. If False, return as a list of dictionaries.

  • save_to_file (str, optional) – If provided, save the logbook to this file path as JSON format.

Returns:

Logbook entries containing operation history. DataFrame columns include:
  • timestamp: ISO format timestamp of when operation was performed
  • method: Name of the method that was called
  • parameters: Dictionary of parameters passed to the method
  • result_summary: Summary of results produced (for successful operations)
  • notes: Additional notes about the operation
  • success: Boolean indicating if operation completed successfully
  • error: Error message (only present for failed operations)

Return type:

pandas.DataFrame or list
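To make the entry layout concrete, the sketch below builds two entries with the documented field names and serializes them to JSON. The helper make_entry and all values shown (file name, counts, error text) are made up for illustration; the real entries may carry different content.

```python
import json
from datetime import datetime


def make_entry(method, parameters, success=True, result_summary=None,
               notes="", error=None):
    """Build one logbook entry using the documented field names."""
    entry = {
        "timestamp": datetime.now().isoformat(),
        "method": method,
        "parameters": parameters,
        "result_summary": result_summary,
        "notes": notes,
        "success": success,
    }
    if error is not None:
        # 'error' is only present for failed operations.
        entry["error"] = error
    return entry


logbook = [
    make_entry("import_file", {"file_path": "data.h5"},
               result_summary={"n_points": 1200}),
    make_entry("create_histogram", {"use_masses": True},
               success=False, error="no mass data"),
]

# Names of the operations that failed.
failed = [e["method"] for e in logbook if not e["success"]]

# A list of plain dictionaries serializes directly to JSON.
serialized = json.dumps(logbook, indent=2)
```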

Examples

Get logbook as DataFrame for analysis:

>>> model = PyPhotoMol()
>>> model.import_file('data.h5')
>>> logbook_df = model.get_logbook()
>>> print(logbook_df[['timestamp', 'method', 'success']])

Save logbook to file:

>>> model.get_logbook(save_to_file='analysis_log.json')

Get raw logbook data:

>>> raw_logbook = model.get_logbook(as_dataframe=False)
clear_logbook()[source]#

Clear all logbook entries.

print_logbook_summary()[source]#

Print a summary of the logbook.

import_file(file_path)[source]#

Import mass photometry data from HDF5 or CSV files.

This method loads contrast and mass data from supported file formats. NaN values are automatically removed from the imported data. The import operation is automatically logged with file information and data statistics.

Parameters:

file_path (str) – Path to the data file. Supported formats are:
  • ‘.h5’: HDF5 files with standard mass photometry structure
  • ‘.csv’: CSV files with contrast and mass columns

Raises:
  • ValueError – If the file format is not supported (not .h5 or .csv)

  • FileNotFoundError – If the specified file does not exist

  • KeyError – If required data columns are missing from the file

Notes

After import, the following attributes are populated:
  • self.contrasts: Array of contrast values with NaN removed
  • self.masses: Array of mass values with NaN removed (if available)

The logbook will record:
  • File path and type
  • Number of data points imported
  • Range of contrast and mass values
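The two checks described here, rejecting unsupported extensions and dropping NaN values, might look roughly like this. The helper names check_format and drop_nan are hypothetical, and the real method reads the file contents, which this sketch does not attempt.

```python
import math
from pathlib import Path

SUPPORTED_SUFFIXES = {".h5", ".csv"}


def check_format(file_path):
    """Return the file suffix, or raise ValueError for unsupported formats."""
    suffix = Path(file_path).suffix
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported file format: {suffix!r}")
    return suffix


def drop_nan(values):
    """Return the values with NaN entries removed."""
    return [v for v in values if not math.isnan(v)]
```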

Examples

Import HDF5 data:

>>> model = PyPhotoMol()
>>> model.import_file('experiment_data.h5')
>>> print(f"Imported {len(model.contrasts)} contrast measurements")

Import CSV data:

>>> model.import_file('processed_data.csv')
>>> print(f"Mass range: {model.masses.min():.1f} - {model.masses.max():.1f} kDa")
count_binding_events()[source]#

Count binding and unbinding events from the mass data.

create_histogram(use_masses=True, window=[0, 2000], bin_width=10)[source]#

Create a histogram from imported contrast or mass data.

This method generates a histogram from the imported data, which is essential for subsequent peak detection and fitting operations. The histogram parameters can be customized for different types of analysis.

Parameters:
  • use_masses (bool, default True) – If True, create histogram from mass data (requires masses to be available). If False, create histogram from contrast data.

  • window (list of two floats, default [0, 2000]) – Range for the histogram as [min, max]. Units depend on data type:
      - For masses: typically [0, 2000] kDa
      - For contrasts: typically [-1, 0] (e.g., [-0.8, -0.2])

  • bin_width (float, default 10) – Width of histogram bins. Units depend on data type:
      - For masses: typically 10-50 kDa
      - For contrasts: typically 0.0004-0.001

Raises:
  • AttributeError – If no data has been imported yet

  • ValueError – If use_masses=True but no mass data is available

Notes

After histogram creation, the following attributes are populated:
  • self.histogram_centers: Bin center positions
  • self.hist_counts: Count values for each bin
  • self.hist_nbins: Number of bins created
  • self.hist_window: Window used for histogram
  • self.bin_width: Bin width used
  • self.hist_data_type: Data type used (‘masses’ or ‘contrasts’)
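How window and bin_width translate into bin centers and counts can be illustrated with a small pure-Python sketch. The function histogram_sketch is hypothetical, and its edge handling (half-open bins, values outside the window ignored) is an assumption rather than the documented behaviour.

```python
import math


def histogram_sketch(values, window, bin_width):
    """Bin `values` inside `window` into bins of width `bin_width`."""
    lo, hi = window
    # Rounding guards against float noise such as 1500.0000000000002.
    nbins = math.ceil(round((hi - lo) / bin_width, 9))
    centers = [lo + (i + 0.5) * bin_width for i in range(nbins)]
    counts = [0] * nbins
    for v in values:
        if lo <= v < hi:
            # Clamp the index so rounding can never step past the last bin.
            idx = min(int((v - lo) // bin_width), nbins - 1)
            counts[idx] += 1
    return centers, counts


# Made-up masses in kDa; 1200 falls outside the window and is ignored.
centers, counts = histogram_sketch([10, 25, 30, 55, 70, 1200],
                                   window=[0, 100], bin_width=20)
```

Narrower bins resolve closely spaced species better but leave fewer counts per bin, which is why the typical bin widths differ so much between mass and contrast data.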

Examples

Create mass histogram for protein analysis:

>>> model.create_histogram(use_masses=True, window=[0, 1000], bin_width=20)

Create contrast histogram for calibration:

>>> model.create_histogram(use_masses=False, window=[-0.8, -0.2], bin_width=0.0004)

High-resolution histogram for detailed analysis:

>>> model.create_histogram(use_masses=True, window=[50, 200], bin_width=5)
guess_peaks(min_height=10, min_distance=4, prominence=4)[source]#

Guess peaks in the histogram data.

The arguments are adjusted according to the region of the histogram. For example, the given min_distance is used unchanged for mass data between 0 and 650 kDa, multiplied by a factor of 3 between 650 and 1500 kDa, and multiplied by a factor of 8 above 1500 kDa. See the guess_peaks function in utils.helpers for more details.

Example of min_height, min_distance and prominence for contrasts:

min_height=10, min_distance=4, prominence=4

Parameters:
  • min_height (int, default 10) – Minimum height of the peaks.

  • min_distance (int, default 4) – Minimum distance between peaks.

  • prominence (int, default 4) – Minimum prominence of the peaks.
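The region-dependent scaling of min_distance described above can be written out as a small helper. The function name scaled_min_distance is hypothetical, and which side of the 650 and 1500 kDa boundaries is inclusive is an assumption; only the factors 1, 3 and 8 come from the description.

```python
def scaled_min_distance(mass_kda, min_distance=4):
    """Return the effective minimum peak distance for a given mass region."""
    if mass_kda < 650:
        factor = 1   # 0 to 650 kDa: distance used as given
    elif mass_kda < 1500:
        factor = 3   # 650 to 1500 kDa: distance multiplied by 3
    else:
        factor = 8   # above 1500 kDa: distance multiplied by 8
    return min_distance * factor
```

With the default min_distance of 4, peaks must therefore be at least 4, 12 and 32 units apart in the three regions respectively.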

contrasts_to_masses(slope=1.0, intercept=0.0)[source]#

Convert contrasts to masses using a linear transformation. We assume a calibration was done using f(mass) = slope * contrast + intercept.

Parameters:
  • slope (float, default 1.0) – Slope of the linear transformation.

  • intercept (float, default 0.0) – Intercept of the linear transformation.

fit_histogram(peaks_guess, mean_tolerance=None, std_tolerance=None, threshold=None, baseline=0.0, fit_baseline=False)[source]#

Fit the histogram data to the guessed peaks. We use a multi-Gaussian fit to the histogram data.

The data type (masses or contrasts) is automatically detected from the histogram that was previously created using create_histogram().

Parameters:
  • peaks_guess (list) – List of guessed peaks.

  • mean_tolerance (float, optional) – Tolerance for the mean of the Gaussian fit. If None, it will be inferred from the peak guesses.

  • std_tolerance (float, optional) – Tolerance for the standard deviation of the Gaussian fit. If None, it will be inferred from the peak guesses.

  • threshold (float, optional) – Observation limit. If None, defaults are applied based on detected data type:
      - For masses: minimum value that can be observed (in kDa units). Default is 40.
      - For contrasts: maximum value that can be observed (should be negative). Default is -0.0024.

  • baseline (float, default 0.0) – Baseline value to be subtracted from the fit.

  • fit_baseline (bool, default False) – Whether to fit a baseline to the histogram. If True, a baseline will be included in the fit and the ‘baseline’ argument will be ignored.
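For orientation, a multi-Gaussian model is a sum of Gaussian components, optionally on top of a constant baseline. The sketch below only evaluates such a model; it does not perform the fit. The (amplitude, mean, std) parameterization, the treatment of the baseline as an additive constant, and the function name multi_gaussian are assumptions and may differ from the package's internals.

```python
import math


def multi_gaussian(x, params, baseline=0.0):
    """Evaluate a sum of Gaussians plus a constant baseline at position x.

    params is a list of (amplitude, mean, std) tuples, one per peak.
    """
    total = baseline
    for amplitude, mean, std in params:
        total += amplitude * math.exp(-0.5 * ((x - mean) / std) ** 2)
    return total


# Two made-up peaks, at 66 kDa and 146 kDa.
peaks = [(100.0, 66.0, 10.0), (40.0, 146.0, 15.0)]
value_at_first_peak = multi_gaussian(66.0, peaks)
```

In a fit, mean_tolerance and std_tolerance would bound how far each mean and standard deviation may move away from its initial guess.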

Examples

Fit histogram after creating it:

>>> model.create_histogram(use_masses=True, window=[0, 2000], bin_width=10)
>>> model.guess_peaks()
>>> model.fit_histogram(model.peaks_guess, mean_tolerance=100, std_tolerance=200)
create_fit_table()[source]#

Create a fit table from the fitted parameters.

calibrate(calibration_standards)[source]#

Obtain a calibration of the type f(mass) = slope * contrast + intercept

Parameters:

calibration_standards (list) – List with the known masses

class pyphotomol.main.MPAnalyzer[source]#

Bases: object

A class to handle multiple PyPhotoMol instances. This is useful for batch processing of multiple files.

__init__()[source]#
get_batch_logbook(as_dataframe=True, save_to_file=None)[source]#

Retrieve the batch logbook of all operations performed.

Parameters:
  • as_dataframe (bool, default True) – If True, return as pandas DataFrame, else as list of dicts

  • save_to_file (str, optional) – Optional file path to save logbook as JSON

Returns:

Batch logbook entries

Return type:

pandas.DataFrame or list

get_all_logbooks(save_to_file=None)[source]#

Get combined logbooks from all models plus batch operations.

Parameters:

save_to_file (str, optional) – Optional file path to save combined logbook as JSON

Returns:

Combined logbooks with batch and individual model logs

Return type:

dict

import_files(files, names=None)[source]#

Load multiple files into PyPhotoMol instances.

Parameters:
  • files (list) – List of file paths to load.

  • names (list, optional) – List of names for the PyPhotoMol instances.

apply_to_all(method_name, *args, names=None, **kwargs)[source]#

Apply a method to all or selected PyPhotoMol instances.

Parameters:
  • method_name (str) – Name of the method to apply to instances

  • *args (tuple) – Positional arguments to pass to the method

  • names (list or str, optional) – Names of specific models to apply method to. If None (default), applies to all models.

  • **kwargs (dict) – Keyword arguments to pass to the method

Examples

Count binding events for all models:

>>> pms.apply_to_all('count_binding_events')

Create histograms for specific models only:

>>> pms.apply_to_all('create_histogram', names=['model1', 'model2'],
...                  use_masses=True, window=[0, 2000], bin_width=10)

Guess peaks for a single model:

>>> pms.apply_to_all('guess_peaks', names='model1',
...                  min_height=10, min_distance=4, prominence=4)
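Dispatching a method by name to all or selected instances can be sketched with getattr. BatchSketch and DummyModel are hypothetical stand-ins, not the MPAnalyzer implementation; storing the models in a name-keyed dictionary is an assumption.

```python
class DummyModel:
    """Hypothetical stand-in for a single-dataset model."""

    def __init__(self):
        self.peaks_guess = None

    def guess_peaks(self, min_height=10):
        self.peaks_guess = [min_height]


class BatchSketch:
    """Hypothetical container mapping names to model objects."""

    def __init__(self, models):
        self.models = dict(models)

    def apply_to_all(self, method_name, *args, names=None, **kwargs):
        # names may be None (all models), a single name, or a list of names.
        if names is None:
            selected = list(self.models)
        elif isinstance(names, str):
            selected = [names]
        else:
            selected = list(names)
        for name in selected:
            getattr(self.models[name], method_name)(*args, **kwargs)

    def get_properties(self, variable):
        # Collect one attribute from every model, in insertion order.
        return [getattr(m, variable) for m in self.models.values()]


batch = BatchSketch({"model1": DummyModel(), "model2": DummyModel()})
batch.apply_to_all("guess_peaks", names="model1", min_height=5)
after_single = batch.get_properties("peaks_guess")
```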
get_properties(variable)[source]#

Get properties from all PyPhotoMol instances.

Parameters:

variable (str) – The property to get from each instance.

Returns:

List of the specified property from each instance.

Return type:

list

Examples

Get masses from all models:

>>> masses_list = pms.get_properties('masses')

Get fit tables from all models:

>>> fit_tables = pms.get_properties('fit_table')
create_plotting_config(repeat_colors=True)[source]#

Create configuration dataframes for plotting multiple PyPhotoMol models.

Parameters:

repeat_colors (bool, default True) – If True, repeat the same color scheme for each model’s peaks. If False, use sequential colors across all peaks from all models.

Returns:

tuple

  • legends_df: DataFrame with legends, colors, and selection flags for Gaussian traces

  • colors_hist_df: DataFrame with histogram colors for each model

Return type:

(legends_df, colors_hist_df)
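The difference between the two repeat_colors modes can be illustrated with plain lists. The palette, the function assign_peak_colors, and the list-of-lists output are assumptions made for this sketch; the actual method returns DataFrames.

```python
from itertools import cycle, islice

# Hypothetical palette; the package may use different colors.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def assign_peak_colors(peaks_per_model, repeat_colors=True):
    """Return one list of colors per model, given each model's peak count."""
    if repeat_colors:
        # Every model restarts from the first palette color.
        return [list(islice(cycle(PALETTE), n)) for n in peaks_per_model]
    # One shared iterator: colors continue across models.
    colors = cycle(PALETTE)
    return [[next(colors) for _ in range(n)] for n in peaks_per_model]


repeated = assign_peak_colors([2, 2], repeat_colors=True)
sequential = assign_peak_colors([2, 2], repeat_colors=False)
```

Repeating colors keeps corresponding peaks comparable across models, while sequential colors keep every trace distinguishable when all models share one plot.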

master_calibration(calibration_standards)[source]#

Perform master calibration using known masses. It uses information from the fit table of each model to create a master calibration.

Parameters:

calibration_standards (list) – List of known masses for calibration.