pychemelt package#

PyChemelt package for the analysis of chemical and thermal denaturation data

Subpackages#

Submodules#

pychemelt.main module#

Main class to handle thermal and chemical denaturation data. The current model assumes that the unfolding is reversible.

class pychemelt.main.Sample(name='Test')[source]#

Bases: object

Class to hold, process, and fit thermal and chemical denaturation data.

This class manages multiple signal types (e.g., 350nm, 330nm, Ratio) and concentrations, providing an interface for global thermodynamic analysis under the assumption of reversible unfolding.

Parameters:

name (str, optional) – Identifier for the sample. Default is ‘Test’.

signal_dic#

Raw signal data mapped by signal name.

Type:

dict

temp_dic#

Temperature data mapped by signal name.

Type:

dict

conditions#

Processed numeric values for experimental conditions (e.g., [Denaturant]).

Type:

list of float

labels#

Original string labels for each condition.

Type:

list of str

signals#

Names of all available signal types in the loaded files.

Type:

list of str

nr_signals#

Number of distinct signal types selected for analysis.

Type:

int

single_fit_done#

Flag indicating if individual dataset fits have been completed.

Type:

bool

global_fit_done#

Flag for global thermodynamic fitting with local baselines.

Type:

bool

global_global_fit_done#

Flag for global thermodynamics and global baseline slopes.

Type:

bool

global_global_global_fit_done#

Flag for global thermodynamics, slopes, and intercepts.

Type:

bool

__init__(name='Test')[source]#
read_file(file)[source]#

Read the file and load the data into the sample object

Parameters:

file (str) – Path to the file

Returns:

True if the file was read and loaded into the sample object

Return type:

bool

read_multiple_files(files)[source]#

Read multiple files and load the data into the sample object

Parameters:

files (list or str) – List of paths to the files (or a single path)

Returns:

True if the files were read and loaded into the sample object

Return type:

bool

set_signal(signal_names)[source]#

Set one or more signals to be used for the analysis. This allows several signals, such as 350nm and 330nm, to be fitted globally at the same time.

Parameters:

signal_names (list or str) – List of names of the signals to be used, e.g., [‘350nm’, ‘330nm’], or a single name

Notes

This method creates/updates the following attributes on the instance:
  • signal_lst_pre_multiple, temp_lst_pre_multiple : lists of lists

  • signal_names : list of signal name strings

  • nr_signals : int, number of signal types

set_temperature_range(min_temp=0, max_temp=100)[source]#

Set the temperature range for the sample

Parameters:
  • min_temp (float, optional) – Minimum temperature

  • max_temp (float, optional) – Maximum temperature

set_signal_id()[source]#

Create a list with the same length as the total number of signal datasets. Each element of the list indicates the ID of the signal type, e.g., all 350nm datasets are mapped to 0, all 330nm datasets to 1, etc.
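As an illustration of the mapping described above, the following sketch builds such an ID list in plain Python. The helper name build_signal_ids and its curves_per_signal argument are hypothetical and are not part of pychemelt; the library's own implementation may differ.

```python
def build_signal_ids(curves_per_signal):
    """Return one integer ID per dataset, grouped by signal type.

    curves_per_signal lists how many datasets belong to each signal type,
    in the same order as the selected signal names (hypothetical input).
    """
    ids = []
    for signal_id, n_curves in enumerate(curves_per_signal):
        ids.extend([signal_id] * n_curves)
    return ids


# Two signal types (e.g. 350nm and 330nm) with three datasets each
signal_ids = build_signal_ids([3, 3])
print(signal_ids)  # [0, 0, 0, 1, 1, 1]
```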

estimate_derivative(window_length=8)[source]#

Estimate the derivative of the signal using Savitzky-Golay filter

Parameters:

window_length (int, optional) – Length of the filter window in degrees

Notes

Creates/updates attributes:
  • temp_deriv_lst_multiple, deriv_lst_multiple, deriv_lst_expanded : lists storing estimated derivatives and corresponding temperatures

  • predicted_deriv_lst_multiple : list storing estimated derivatives of predicted values
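A Savitzky-Golay filter fits a low-order polynomial by least squares inside a sliding window and evaluates it, or its derivative, at the window centre. For evenly spaced temperatures and a symmetric window, a first- or second-order polynomial yields the same first derivative, which reduces to a weighted sum. The sketch below shows that special case in plain Python. The function name and the half_width argument (in points, not degrees) are assumptions for illustration; the polynomial order, edge handling and degree-to-points conversion used by pychemelt are not documented here.

```python
def local_slope_derivative(temps, signal, half_width=2):
    """First derivative from a least-squares line fitted in a sliding window.

    Assumes evenly spaced temperatures. Only points with a full window
    around them are returned, so the output is shorter than the input.
    """
    step = temps[1] - temps[0]
    offsets = range(-half_width, half_width + 1)
    # Normalisation of the least-squares slope for a symmetric window
    norm = sum(k * k for k in offsets) * step
    centres, deriv = [], []
    for i in range(half_width, len(temps) - half_width):
        slope = sum(k * signal[i + k] for k in offsets) / norm
        centres.append(temps[i])
        deriv.append(slope)
    return centres, deriv


temps = [20 + 0.5 * i for i in range(11)]
signal = [2 * t + 1 for t in temps]  # straight line with slope 2
centres, deriv = local_slope_derivative(temps, signal)
```

For real data, scipy.signal.savgol_filter offers the general filter with a deriv argument.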

guess_Tm(x1=6, x2=11)[source]#

Guess the Tm of the sample using the derivative of the signal

Parameters:
  • x1 (float, optional) – Shift from the minimum and maximum temperature to estimate the median of the initial and final baselines

  • x2 (float, optional) – Shift from the minimum and maximum temperature to estimate the median of the initial and final baselines

Notes

x2 must be greater than x1.

This method creates/updates attributes:
  • t_melting_init_multiple : list of initial Tm guesses per signal

  • t_melting_df_multiple : list of pandas.DataFrame objects with Tm vs Denaturant
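One plausible reading of the x1 and x2 parameters is sketched below: the derivative is summarised by its median in a window near the start of the curve and in a window near the end, and the Tm guess is the temperature where the derivative deviates most from the mean of those two medians. This is an assumption made for illustration, not a description of the pychemelt source; guess_tm is a hypothetical standalone function.

```python
import math
from statistics import median


def guess_tm(temps, deriv, x1=6, x2=11):
    """Guess Tm as the largest deviation of the derivative from its baseline."""
    t_min, t_max = min(temps), max(temps)
    # Derivative values inside the initial and final baseline windows
    low = [d for t, d in zip(temps, deriv) if t_min + x1 <= t <= t_min + x2]
    high = [d for t, d in zip(temps, deriv) if t_max - x2 <= t <= t_max - x1]
    baseline = (median(low) + median(high)) / 2
    # Search for the transition only between the two baseline windows
    candidates = [
        (abs(d - baseline), t)
        for t, d in zip(temps, deriv)
        if t_min + x2 < t < t_max - x2
    ]
    return max(candidates)[1]


# Synthetic derivative: a peak centred at 55 degrees on a flat background
temps = [20 + i for i in range(71)]
deriv = [math.exp(-((t - 55) / 4) ** 2) + 0.01 for t in temps]
tm_guess = guess_tm(temps, deriv)
```

Taking the absolute deviation lets the same code handle signals that rise or fall on unfolding.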

estimate_baseline_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)[source]#

Estimate the baseline parameters for multiple signals

Parameters:
  • native_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’

  • unfolded_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’

  • window_range_native (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the native state

  • window_range_unfolded (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the unfolded state

Notes

This method sets or updates these attributes:
  • bNs_per_signal, bUs_per_signal, kNs_per_signal, kUs_per_signal, qNs_per_signal, qUs_per_signal

  • poly_order_native, poly_order_unfolded
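For the ‘linear’ baseline type, the idea can be sketched as a straight-line fit to the first window_range_native degrees and the last window_range_unfolded degrees of a curve. The code below is a minimal plain-Python illustration. The function names are hypothetical, and reading bN/bU as intercepts and kN/kU as slopes is an inference from the attribute names; the reference temperature used for the intercepts in pychemelt may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_baselines(temps, signal, window_native=12, window_unfolded=12):
    """Fit lines to the low- and high-temperature ends of one curve."""
    t_min, t_max = min(temps), max(temps)
    native = [(t, s) for t, s in zip(temps, signal) if t <= t_min + window_native]
    unfolded = [(t, s) for t, s in zip(temps, signal) if t >= t_max - window_unfolded]
    k_n, b_n = fit_line(*zip(*native))
    k_u, b_u = fit_line(*zip(*unfolded))
    return {"bN": b_n, "kN": k_n, "bU": b_u, "kU": k_u}
```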

reset_fittings_results()[source]#

Deletes the results of previous fittings from the object

expand_multiple_signal()[source]#

Create a single list with all the signals and a single list with all the temperatures.

Notes

Creates/updates attributes:
  • signal_lst_expanded, temp_lst_expanded

  • signal_lst_expanded_subset, temp_lst_expanded_subset

create_params_df()[source]#

Create a dataframe of the parameters

pychemelt.monomer module#

Main class to handle thermal and chemical denaturation data. The current model assumes that the protein is a monomer and that the unfolding is reversible.

class pychemelt.monomer.Monomer(name='Test')[source]#

Bases: Sample

Class to hold the data of a single sample and fit it

__init__(name='Test')[source]#
set_denaturant_concentrations(concentrations=None)[source]#

Set the denaturant concentrations for the sample

Parameters:

concentrations (list, optional) – List of denaturant concentrations. If None, use the sample conditions

Notes

Creates/updates attribute denaturant_concentrations_pre (numpy.ndarray)

select_conditions(boolean_lst=None, normalise_to_global_max=True)[source]#

For each signal, select the conditions to be used for the analysis

Parameters:
  • boolean_lst (list of bool, optional) – List of booleans selecting which conditions to keep. If None, keep all.

  • normalise_to_global_max (bool, optional) – If True, normalise the signal to the global maximum - per signal type

Notes

Creates/updates several attributes used by downstream fitting:
  • signal_lst_multiple, temp_lst_multiple : lists of lists with selected data

  • denaturant_concentrations : list of selected denaturant concentrations

  • denaturant_concentrations_expanded : flattened numpy array matching expanded signals

  • boolean_lst, normalise_to_global_max, nr_den : control flags/values
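The selection and normalisation steps can be pictured with the sketch below, which filters curves with a boolean list and divides every kept curve by the largest value among them. It is a plain-Python illustration for a single signal type. The function name is hypothetical, and whether pychemelt takes the maximum over kept or all curves, or rescales to a value other than 1, is not stated in this reference.

```python
def select_and_normalise(signals, concentrations, keep=None,
                         normalise_to_global_max=True):
    """Keep the flagged curves and optionally scale them by their global maximum."""
    if keep is None:
        keep = [True] * len(signals)  # None keeps every condition
    kept_signals = [curve for curve, flag in zip(signals, keep) if flag]
    kept_conc = [c for c, flag in zip(concentrations, keep) if flag]
    if normalise_to_global_max:
        global_max = max(max(curve) for curve in kept_signals)
        kept_signals = [[v / global_max for v in curve] for curve in kept_signals]
    return kept_signals, kept_conc


signals = [[1.0, 2.0], [4.0, 8.0], [3.0, 6.0]]
kept, conc = select_and_normalise(signals, [0.0, 1.0, 2.0], keep=[True, False, True])
```

A single scale factor per signal type preserves the relative amplitudes between conditions, which a per-curve normalisation would discard.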

fit_thermal_unfolding_local()[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. Each curve is fitted separately, with individual parameters.

guess_Cp()[source]#

Guess the Cp of the sample by fitting a line to the Tm and dH values

Notes

This method creates/updates attributes used later in fitting:
  • Tms, dHs, slope_dh_tm, intercept_dh_tm

  • Cp0 (assigned to self.Cp0)

guess_initial_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)[source]#

Estimate starting thermodynamic and baseline parameters for global fitting.

Parameters:
  • native_baseline_type ({'constant', 'linear', 'quadratic', 'exponential'}) – The model type for the native state baseline.

  • unfolded_baseline_type ({'constant', 'linear', 'quadratic', 'exponential'}) – The model type for the unfolded state baseline.

  • window_range_native (float, optional) – Temperature range at the start of the curve (in degrees) used for native baseline estimation. Default is 12.

  • window_range_unfolded (float, optional) – Temperature range at the end of the curve used for unfolded baseline estimation. Default is 12.

create_dg_df()[source]#

Create a dataframe of the dg values versus temperature
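For orientation, a common way to compute the unfolding free energy as a function of temperature and denaturant combines the Gibbs-Helmholtz equation with a linear denaturant dependence: dG(T, [D]) = dH (1 - T/Tm) + Cp (T - Tm - T ln(T/Tm)) - m [D], with T and Tm in kelvin. The sketch below evaluates that textbook expression. The exact parametrisation used by pychemelt is not given in this reference, so the function and its argument names are assumptions.

```python
import math


def delta_g(temp_k, tm_k, dh_m, dcp, m_value=0.0, denaturant=0.0):
    """Unfolding free energy from the Gibbs-Helmholtz equation.

    temp_k and tm_k are in kelvin; dh_m is the enthalpy at Tm; dcp is the
    heat capacity change; m_value * denaturant is the linear denaturant term.
    """
    thermal = dh_m * (1 - temp_k / tm_k) + dcp * (
        temp_k - tm_k - temp_k * math.log(temp_k / tm_k)
    )
    return thermal - m_value * denaturant


# Stability at 25 degrees C for a protein with Tm = 60 degrees C (synthetic values)
dg_25 = delta_g(298.15, 333.15, dh_m=100.0, dcp=1.5)
```

By construction the function returns zero at T = Tm when no denaturant is present.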

fit_thermal_unfolding_global(fit_m_dep=False, cp_limits=None, dh_limits=None, tm_limits=None, cp_value=None)[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters but local slopes and local baselines. Multiple signals can be fitted at the same time, such as 350nm and 330nm.

Parameters:
  • fit_m_dep (bool, optional) – If True, fit the temperature dependence of the m-value

  • cp_limits (list, optional) – List of two values, the lower and upper bounds for the Cp value. If None, bounds set automatically

  • dh_limits (list, optional) – List of two values, the lower and upper bounds for the dH value. If None, bounds set automatically

  • tm_limits (list, optional) – List of two values, the lower and upper bounds for the Tm value. If None, bounds set automatically

  • cp_value (float, optional) – If provided, the Cp value is fixed to this value, the bounds are ignored

Notes

This is a heavy routine that creates/updates many fitting-related attributes, including:
  • bNs_expanded, bUs_expanded, kNs_expanded, kUs_expanded, qNs_expanded, qUs_expanded

  • p0, low_bounds, high_bounds, global_fit_params, rel_errors

  • predicted_lst_multiple, params_names, params_df, dg_df

  • flags: global_fit_done, fit_m_dep, limited_tm, limited_dh, limited_cp, fixed_cp
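To make the split between global and local parameters concrete, the sketch below evaluates a generic two-state signal with linear baselines. The free energy dg would come from shared thermodynamic parameters, while the intercepts and slopes (b_n, k_n, b_u, k_u) belong to one curve. This is a textbook two-state model written under assumed units (kcal/mol, kelvin); it is not taken from the pychemelt source, and all names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K); the unit choice is an assumption


def fraction_unfolded(dg, temp_k):
    """Two-state fraction unfolded, K / (1 + K) with K = exp(-dg / RT)."""
    x = dg / (R_KCAL * temp_k)
    # Branch on the sign so that exp() never receives a large positive argument
    if x >= 0:
        e = math.exp(-x)
        return e / (1 + e)
    return 1 / (1 + math.exp(x))


def two_state_signal(temp_k, dg, b_n, k_n, b_u, k_u):
    """Population-weighted sum of the native and unfolded linear baselines."""
    f_u = fraction_unfolded(dg, temp_k)
    native = b_n + k_n * temp_k
    unfolded = b_u + k_u * temp_k
    return native * (1 - f_u) + unfolded * f_u
```

At dg = 0 both states are equally populated, so the signal lies halfway between the two baselines.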

fit_thermal_unfolding_global_global()[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters and global slopes (but local baselines). Several signals, such as 350nm and 330nm, can be fitted at the same time. Must be run after fit_thermal_unfolding_global.

Notes

Updates global fitting attributes and sets global_global_fit_done when complete.

fit_thermal_unfolding_global_global_global(model_scale_factor=True)[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters, global slopes and global baselines. Must be run after fit_thermal_unfolding_global_global.

Parameters:

model_scale_factor (bool, optional) – If True, model a scale factor for each denaturant concentration

Notes

Updates many global fitting attributes and sets global_global_global_fit_done when complete. If model_scale_factor is True, the method also creates scaled signal attributes:
  • signal_lst_multiple_scaled, predicted_lst_multiple_scaled
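Because each global fit must follow the previous one, it can help to wrap the three calls in a small helper that enforces the order. The sketch below is such a wrapper. run_global_fits is a hypothetical function, not part of pychemelt, and it assumes the sample passed in (for example a Monomer instance) has already been loaded and given initial parameter guesses.

```python
def run_global_fits(sample, stages=3, model_scale_factor=True, **global_kwargs):
    """Run the global fits in their documented order, up to `stages`.

    stages=1 stops after global thermodynamics with local baselines,
    stages=2 adds global slopes, stages=3 adds global baselines.
    Extra keyword arguments go to fit_thermal_unfolding_global.
    """
    if stages not in (1, 2, 3):
        raise ValueError("stages must be 1, 2 or 3")
    sample.fit_thermal_unfolding_global(**global_kwargs)
    if stages >= 2:
        sample.fit_thermal_unfolding_global_global()
    if stages == 3:
        sample.fit_thermal_unfolding_global_global_global(
            model_scale_factor=model_scale_factor
        )
    return sample
```

A call such as run_global_fits(sample, stages=2, cp_value=1.5) would fix Cp in the first stage and stop after the global-slope fit.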

signal_to_df(signal_type='raw', scaled=False)[source]#

Create a dataframe with three columns: Temperature, Signal, and Denaturant. Optimized for speed by avoiding per-curve DataFrame creation.

Parameters:
  • signal_type ({'raw', 'fitted', 'derivative'}, optional) – Which signal to include in the dataframe. ‘raw’ uses experimental data, ‘fitted’ uses model predictions, ‘derivative’ uses the estimated derivative signal.

  • scaled (bool, optional) – If True and signal_type is ‘fitted’ or ‘raw’, use the scaled versions if available.
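The remark about avoiding per-curve DataFrame creation can be illustrated as follows: rather than building one small table per curve and concatenating them, the columns are flattened first and a single table is built at the end. The sketch below prepares such flat columns in plain Python. The function name and inputs are hypothetical, and the resulting dictionary could then be handed to pandas.DataFrame in one call.

```python
from itertools import chain


def curves_to_columns(temps, signals, denaturant):
    """Flatten per-curve lists into three equally long columns.

    temps and signals are lists of lists (one inner list per curve);
    denaturant holds one concentration per curve and is repeated to match.
    """
    lengths = [len(curve) for curve in signals]
    return {
        "Temperature": list(chain.from_iterable(temps)),
        "Signal": list(chain.from_iterable(signals)),
        "Denaturant": list(
            chain.from_iterable([c] * n for c, n in zip(denaturant, lengths))
        ),
    }


columns = curves_to_columns(
    temps=[[20, 21], [20, 21, 22]],
    signals=[[1.0, 1.1], [2.0, 2.1, 2.2]],
    denaturant=[0.0, 1.5],
)
```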

pychemelt.thermal_oligomer module#

Main class to handle thermal denaturation data of monomers and oligomers up to tetramers. The current model assumes that the unfolding of the protein is reversible.

class pychemelt.thermal_oligomer.ThermalOligomer(name='Test')[source]#

Bases: Sample

Class to hold the data of a DSF experiment of thermal unfolding with different concentrations of an oligomer.

__init__(name='Test')[source]#
set_model(model_name)[source]#

Set the number of subunits of the oligomer used for the analysis. Currently supported are two-state models of monomers, dimers, trimers and tetramers.

Parameters:

model_name (str) – Name of the model to use. Can be: “Monomer”, “Dimer”, “Trimer”, “Tetramer”. Case insensitive.

Raises:

ValueError – If the provided model name is not in the supported list.

Notes

This method creates/updates the following attribute on the instance:
  • self.model : oligomeric model used for analysis
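The case-insensitive lookup and the ValueError described above follow a common validation pattern, sketched here in isolation. The SUBUNITS table and the subunit_count function are hypothetical names chosen for illustration; how pychemelt stores the model internally is not specified in this reference.

```python
SUBUNITS = {"monomer": 1, "dimer": 2, "trimer": 3, "tetramer": 4}


def subunit_count(model_name):
    """Map a model name to its number of subunits, ignoring case."""
    key = model_name.lower()
    if key not in SUBUNITS:
        raise ValueError(
            f"Unsupported model {model_name!r}; choose one of {sorted(SUBUNITS)}"
        )
    return SUBUNITS[key]


n_subunits = subunit_count("Dimer")
```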

set_concentrations(concentrations=None)[source]#

Set the oligomeric concentrations for the sample

Parameters:

concentrations (list, optional) – List of oligomer concentrations. If None, use the sample conditions

Notes

Creates/updates attribute oligomer_concentrations_pre (numpy.ndarray)

select_conditions(boolean_lst=None, normalise_to_global_max=True)[source]#

For each signal, select the conditions to be used for the analysis

Parameters:
  • boolean_lst (list of bool, optional) – List of booleans selecting which conditions to keep. If None, keep all.

  • normalise_to_global_max (bool, optional) – If True, normalise the signal to the global maximum - per signal type

Notes

Creates/updates several attributes used by downstream fitting:
  • signal_lst_multiple, temp_lst_multiple : lists of lists with selected data

  • oligomer_concentrations : list of selected oligomer concentrations

  • oligomer_concentrations_expanded : flattened numpy array matching expanded signals

  • boolean_lst, normalise_to_global_max, nr_olig : control flags/values

guess_Cp()[source]#

Guess the Cp of the assembled oligomer by the number of residues.

Raises:

ValueError – If self.n_residues is not set.

Notes

The number of residues represents the total number of residues in the oligomer.

This method creates/updates an attribute used later in fitting: Cp0, assigned to self.Cp0.

estimate_baseline_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)[source]#

Estimate the baseline parameters for multiple signals of the oligomer. The native baseline represents the curve for the assembled oligomer, while the unfolded baseline represents the curve for the unfolded and disassembled oligomer.

Parameters:
  • native_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’

  • unfolded_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’

  • window_range_native (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the native state

  • window_range_unfolded (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the unfolded state

Notes

This method sets or updates these attributes:
  • bNs_per_signal, bUs_per_signal, kNs_per_signal, kUs_per_signal, qNs_per_signal, qUs_per_signal

  • poly_order_native, poly_order_unfolded

create_dg_df()[source]#

Create a dataframe of the dg values versus temperature

fit_thermal_unfolding_global(cp_limits=None, dh_limits=None, tm_limits=None, cp_value=None)[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters but local slopes and local baselines. Multiple signals can be fitted at the same time, such as 350nm and 330nm.

Parameters:
  • cp_limits (list, optional) – List of two values, the lower and upper bounds for the Cp value. If None, bounds set automatically

  • dh_limits (list, optional) – List of two values, the lower and upper bounds for the dH value. If None, bounds set automatically

  • tm_limits (list, optional) – List of two values, the lower and upper bounds for the Tm value. If None, bounds set automatically

  • cp_value (float, optional) – If provided, the Cp value is fixed to this value, the bounds are ignored

Notes

This is a heavy routine that creates/updates many fitting-related attributes, including:
  • bNs_expanded, bUs_expanded, kNs_expanded, kUs_expanded, qNs_expanded, qUs_expanded

  • p0, low_bounds, high_bounds, global_fit_params, rel_errors

  • predicted_lst_multiple, params_names, params_df, dg_df

  • flags: global_fit_done, limited_tm, limited_dh, limited_cp, fixed_cp
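For an oligomer that unfolds and dissociates in one step (N_n ⇌ n U), the fraction unfolded depends on concentration as well as on the equilibrium constant, which is why curves at several oligomer concentrations constrain a global fit. Writing C for the total oligomer concentration and f for the fraction unfolded gives K = n^n C^(n-1) f^n / (1 - f). The sketch below solves that equation by bisection. The concentration convention (molar oligomer units) and the function name are assumptions; pychemelt may define concentrations or the equilibrium constant differently.

```python
def fraction_unfolded_oligomer(k_eq, conc, n):
    """Solve n**n * conc**(n - 1) * f**n = k_eq * (1 - f) for f in [0, 1].

    k_eq is the unfolding/dissociation constant, conc the total oligomer
    concentration and n the number of subunits.
    """
    def balance(f):
        return n ** n * conc ** (n - 1) * f ** n - k_eq * (1 - f)

    # balance() increases monotonically from -k_eq at f=0 to a positive value at f=1
    low, high = 0.0, 1.0
    for _ in range(200):
        mid = (low + high) / 2
        if balance(mid) > 0:
            high = mid
        else:
            low = mid
    return (low + high) / 2


# Same equilibrium constant, ten-fold different dimer concentration
f_dilute = fraction_unfolded_oligomer(1e-6, 1e-6, n=2)
f_concentrated = fraction_unfolded_oligomer(1e-6, 1e-5, n=2)
```

For n = 1 the expression reduces to the familiar K / (1 + K), and for n > 1 a higher concentration favours the assembled state.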

fit_thermal_unfolding_global_global()[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters and global slopes (but local baselines). Several signals, such as 350nm and 330nm, can be fitted at the same time. Must be run after fit_thermal_unfolding_global.

Notes

Updates global fitting attributes and sets global_global_fit_done when complete.

fit_thermal_unfolding_global_global_global(model_scale_factor=True)[source]#

Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters, global slopes and global baselines. Must be run after fit_thermal_unfolding_global_global.

Parameters:

model_scale_factor (bool, optional) – If True, model a scale factor for each oligomer concentration

Notes

Updates many global fitting attributes and sets global_global_global_fit_done when complete. If model_scale_factor is True, the method also creates scaled signal attributes:
  • signal_lst_multiple_scaled, predicted_lst_multiple_scaled

signal_to_df(signal_type='raw', scaled=False)[source]#

Create a dataframe with the columns Temperature, Signal, Oligomer, and ID. Optimized for speed by avoiding per-curve DataFrame creation.

Parameters:
  • signal_type ({'raw', 'fitted', 'derivative'}, optional) – Which signal to include in the dataframe. ‘raw’ uses experimental data, ‘fitted’ uses model predictions, ‘derivative’ uses the estimated derivative signal.

  • scaled (bool, optional) – If True and signal_type is ‘fitted’ or ‘raw’, use the scaled versions if available.

Returns:

A DataFrame with columns: [‘Temperature’, ‘Signal’, ‘Oligomer’, ‘ID’].

Return type:

pd.DataFrame