pychemelt.utils package#

Submodules#

pychemelt.utils.constants module#

pychemelt.utils.files module#

This module contains helper functions to parse Differential Scanning Fluorimetry (DSF) files from different instrument providers.

Author: Osvaldo Burastero

All functions that import files should return:

- signal_data_dic: dictionary with the signal data, one entry per signal
- temp_data_dic: dictionary with the temperature data, one entry per signal
- conditions: list with the names of the samples
- signals: list with the names of the signals

A signal can be “350nm”, “330nm”, “Scattering”, “Ratio”, “Turbidity”, “Ratio 350nm/330nm”, etc. Each list in signal_data_dic and temp_data_dic should have the same length as conditions.
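The return contract above can be pictured with a small hand-built example. This is a minimal sketch with invented values: `check_loader_output` is a hypothetical helper, not part of pychemelt, and plain lists stand in for the numpy arrays that the real loaders return. The sketch also assumes that each signal array has the same length as its temperature array, which the text above does not state explicitly.

```python
# Hand-built stand-in for what a loader is documented to return.
# All values are invented for illustration.
signals = ["350nm", "330nm"]
conditions = ["A1", "A2"]

signal_data_dic = {
    "350nm": [[1.00, 1.10, 1.35], [0.90, 1.00, 1.20]],
    "330nm": [[2.00, 1.90, 1.60], [2.10, 2.00, 1.70]],
}
temp_data_dic = {
    "350nm": [[25.0, 26.0, 27.0], [25.0, 26.0, 27.0]],
    "330nm": [[25.0, 26.0, 27.0], [25.0, 26.0, 27.0]],
}


def check_loader_output(signal_data_dic, temp_data_dic, conditions, signals):
    """Hypothetical helper: check the documented shape contract."""
    for name in signals:
        curves = signal_data_dic[name]
        temps = temp_data_dic[name]
        # One curve and one temperature array per condition.
        if len(curves) != len(conditions) or len(temps) != len(conditions):
            return False
        # Assumption: each curve is sampled at its own temperature array.
        for curve, temp in zip(curves, temps):
            if len(curve) != len(temp):
                return False
    return True


is_valid = check_loader_output(signal_data_dic, temp_data_dic, conditions, signals)
```

A check like this could be run on the output of any loader before passing the data on to the fitting functions.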

pychemelt.utils.files.load_csv_file(file)[source]#

Load a CSV file containing temperature and signal columns and return structured data.

Parameters:

file (str) – Path to the csv file

Returns:

signal_data_dic (dict) – Dictionary mapping signal names to lists of 1D numpy arrays (one array per condition)
temp_data_dic (dict) – Dictionary mapping signal names to lists of temperature arrays corresponding to the signals
conditions (list) – List of condition names
signals (numpy.ndarray) – Array of signal name strings

pychemelt.utils.files.load_aunty_xlsx(file_path)[source]#

Load AUNTY-format multi-sheet Excel file where each sheet is a condition.

Parameters:

file_path (str) – Path to the AUNTY xlsx file

pychemelt.utils.files.load_quantstudio_txt(QSfile)[source]#

Load QuantStudio TXT files (.txt) exported from QuantStudio instruments.

Parameters:

QSfile (str) – Path to the QuantStudio txt file

Returns:

signal_data_dic (dict) – Dictionary with signal data (key: ‘Fluorescence’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition
conditions (list) – List of condition names (well identifiers)
signals (numpy.ndarray) – Array with signal name(s)

pychemelt.utils.files.load_thermofluor_xlsx(thermofluor_file)[source]#

Load DSF Thermofluor xls file and extract data.

Parameters:

thermofluor_file (str) – Path to the xls file

Returns:

signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions

pychemelt.utils.files.load_nanoDSF_xlsx(processed_dsf_file)[source]#

Load nanotemper processed xlsx file and extract relevant data.

Parameters:

processed_dsf_file (str) – Path to the processed xlsx file

Returns:

signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions
signals (numpy.ndarray) – Array of signal names

pychemelt.utils.files.load_panta_xlsx(pantaFile)[source]#

Load the xlsx file generated by a Prometheus Panta instrument.

Parameters:

pantaFile (str) – Path to the xlsx file

Returns:

signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions
signals (numpy.ndarray) – Array of signal names, such as 330nm and 350nm

pychemelt.utils.files.load_uncle_multi_channel(uncle_file)[source]#

Function to load the data from the UNCLE instrument.

Parameters:

uncle_file (str) – Path to the xlsx file

Returns:

signal_data_dic (dict) – Dictionary with signal data (keys: wavelength strings like ‘350 nm’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition
conditions (list) – List of sample names
signals (list) – List of wavelength strings

pychemelt.utils.files.load_mx3005p_txt(filename)[source]#

Load an Agilent MX3005P qPCR txt file and extract the data.

Parameters:

filename (str) – Path to the MX3005P txt file. The second column has the fluorescence data, and the third column the temperature. Wells are separated by rows containing a sentence like this one: ‘Segment 2 Plateau 1 Well 1’

Returns:

signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions (well numbers)
signals (numpy.ndarray) – Array of signal names

pychemelt.utils.files.detect_file_type(file)[source]#

Detect the type of file based on its extension and content.

Parameters:

file (str) – Path to the file

Returns:

Type of file (e.g., ‘supr’, ‘csv’, ‘prometheus’, ‘panta’, ‘uncle’, ‘mx3005p’, ‘quantstudio’, etc.) or None if unknown

Return type:

str or None

pychemelt.utils.files.detect_encoding(file_path)[source]#

Detect the encoding of a file by trying common encodings.

Parameters:

file_path (str) – Path to the file

Returns:

Detected encoding or the string ‘Unknown encoding’

Return type:

str
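Trying common encodings in turn can be sketched as follows. This is an illustration under stated assumptions rather than the pychemelt implementation: the function name `detect_encoding_sketch` and the candidate list are hypothetical, and only the fallback string ‘Unknown encoding’ comes from the description above.

```python
import os
import tempfile


def detect_encoding_sketch(file_path, encodings=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes the whole file."""
    for encoding in encodings:
        try:
            with open(file_path, "r", encoding=encoding) as handle:
                handle.read()
            return encoding
        except UnicodeError:
            # Decoding failed, so try the next candidate.
            continue
    return "Unknown encoding"


# Usage: a file written as latin-1 is not valid UTF-8, so the second
# candidate is reported.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "wb") as handle:
        handle.write("Température".encode("latin-1"))
    detected = detect_encoding_sketch(path)
```

Order matters in such a list: latin-1 can decode any byte sequence, so it only makes sense as a late candidate.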

pychemelt.utils.files.read_jasco_thermal_ramp(file)[source]#

Read the data from a JASCO file that contains a thermal ramp.

The data is given in chunks:

Channel 1: 4.94 14.93 25.08 35.02 45.03 55.07 64.99 75.04 85.08 95.03

250 -0.310564 -0.112003 0.0199744 -0.217282 -0.238716 -0.173046 0.00129784 -0.394731 -0.687165 -1.40543

Parameters:

file (str) – Path to the JASCO thermal ramp file

Returns:

signal_data_dic (dict) – Dictionary with signal data (key: wavelength string like ‘250 nm’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition (only one condition in this case)
conditions (list) – File name as condition
wavelength_data (list) – List of wavelength strings (e.g., ‘250 nm’, ‘255 nm’, etc.)
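The chunk layout shown above (a header line with the temperatures, followed by rows that start with a wavelength) could be parsed roughly as below. This is a sketch based only on the two example lines: `parse_jasco_chunk` is a hypothetical name, whitespace-separated values are an assumption, and real JASCO exports may contain extra metadata that this code does not handle.

```python
def parse_jasco_chunk(lines):
    """Parse one chunk: a 'Channel N: t1 t2 ...' header plus wavelength rows."""
    header, *rows = [line.strip() for line in lines if line.strip()]
    label, _, temps_text = header.partition(":")
    temperatures = [float(value) for value in temps_text.split()]

    signal_by_wavelength = {}
    for row in rows:
        wavelength, *values = row.split()
        values = [float(value) for value in values]
        if len(values) != len(temperatures):
            raise ValueError(f"row for {wavelength} does not match the header")
        # Keys follow the documented style, e.g. '250 nm'.
        signal_by_wavelength[f"{wavelength} nm"] = values
    return label.strip(), temperatures, signal_by_wavelength


chunk = [
    "Channel 1: 4.94 14.93 25.08 35.02 45.03 55.07 64.99 75.04 85.08 95.03",
    "250 -0.310564 -0.112003 0.0199744 -0.217282 -0.238716 "
    "-0.173046 0.00129784 -0.394731 -0.687165 -1.40543",
]
label, temperatures, signal_by_wavelength = parse_jasco_chunk(chunk)
```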

pychemelt.utils.fitting module#

This module contains helper functions to fit unfolding data.

Author: Osvaldo Burastero

pychemelt.utils.fitting.fit_line_robust(x, y)[source]#

Fit a line to the data using robust fitting

Parameters:

x (array-like) – x data
y (array-like) – y data

Returns:

m (float) – Slope of the fitted line
b (float) – Intercept of the fitted line

pychemelt.utils.fitting.fit_quadratic_robust(x, y)[source]#

Fit a quadratic equation to the data using robust fitting

Parameters:

x (array-like) – x data
y (array-like) – y data

Returns:

a (float) – Quadratic coefficient of the fitted polynomial
b (float) – Linear coefficient of the fitted polynomial
c (float) – Constant coefficient of the fitted polynomial

pychemelt.utils.fitting.fit_exponential_robust(x, y)[source]#

Fit an exponential function to the data using robust fitting.

Notes

Temperatures should be shifted to the reference (Tref) before calling this function.

Parameters:

x (array-like) – x data
y (array-like) – y data

Returns:

a (float) – Baseline
c (float) – Pre-exponential factor
alpha (float) – Exponential factor

pychemelt.utils.fitting.fit_thermal_unfolding(list_of_temperatures, list_of_signals, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, Cp, list_of_oligomer_conc=None)[source]#

Fit the thermal unfolding profile of many curves at the same time.

This performs global fitting of shared thermodynamic parameters with per-curve baselines.

Parameters:

list_of_temperatures (list of array-like) – List of temperature arrays for each dataset
list_of_signals (list of array-like) – List of signal arrays for each dataset
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Function to calculate the signal based on the parameters
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
Cp (float) – Heat capacity change (passed to signal_fx)
list_of_oligomer_conc (list, optional) – List of oligomer concentrations for each dataset (if applicable)

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix of the fitted parameters
predicted_lst (list of numpy.ndarray) – Predicted signals for each dataset based on the fitted parameters

pychemelt.utils.fitting.fit_tc_unfolding_single_slopes(list_of_temperatures, list_of_signals, denaturant_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, fit_m1=False, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#

Vectorized and optimized version of global thermal unfolding fitting.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
denaturant_concentrations (list) – Denaturant concentrations (one per dataset)
initial_parameters (array-like) – Initial guess for parameters
low_bounds (array-like) – Lower bounds for parameters
high_bounds (array-like) – Upper bounds for parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
fit_m1 (bool, optional) – Whether to fit temperature dependence of m-value
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
method (str, optional) – Optimization method (‘least_sq’ or ‘curve_fit’); the signature default is ‘least_squares’

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset

pychemelt.utils.fitting.fit_tc_unfolding_shared_slopes_many_signals(list_of_temperatures, list_of_signals, signal_ids, denaturant_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, fit_m1=False, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#

Vectorized fitting of thermochemical unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
denaturant_concentrations (list) – Denaturant concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
fit_m1 (bool, optional) – Whether to fit temperature dependence of m-value
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
method (str, optional) – Optimization method for lmfit minimizer. Defaults to ‘least_squares’.

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object

pychemelt.utils.fitting.fit_tc_unfolding_many_signals(list_of_temperatures, list_of_signals, signal_ids, denaturant_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, fit_m1=False, model_scale_factor=False, scale_factor_exclude_ids=[], cp_value=None, method='least_squares', fit_native_den_slope=True, fit_unfolded_den_slope=True)[source]#

Fit thermochemical unfolding curves for many signals using lmfit.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
denaturant_concentrations (list) – Denaturant concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
fit_m1 (bool, optional) – Whether to include and fit temperature dependence of the m-value (m1)
model_scale_factor (bool, optional) – If True, include a per-denaturant concentration scale factor to account for intensity differences
scale_factor_exclude_ids (list, optional) – IDs of scale factors to exclude / fix to 1
cp_value (float or None, optional) – If provided, Cp is fixed to this value and not fitted
method (str, optional) – Optimization method for lmfit minimizer. Defaults to ‘least_squares’.
fit_native_den_slope (bool, optional) – Whether to fit the denaturant dependence of the native-state baseline.
fit_unfolded_den_slope (bool, optional) – Whether to fit the denaturant dependence of the unfolded-state baseline.

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object

pychemelt.utils.fitting.fit_oligomer_unfolding_single_slopes(list_of_temperatures, list_of_signals, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#

Vectorized and optimized version of global thermal unfolding fitting of oligomers.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
oligomer_concentrations (list) – sample concentrations of the oligomeric complex (one per dataset)
initial_parameters (array-like) – Initial guess for parameters
low_bounds (array-like) – Lower bounds for parameters
high_bounds (array-like) – Upper bounds for parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object

pychemelt.utils.fitting.fit_oligomer_unfolding_shared_slopes_many_signals(list_of_temperatures, list_of_signals, signal_ids, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#

Vectorized fitting of oligomer thermal unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – sample concentrations of the oligomeric complex for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object

pychemelt.utils.fitting.fit_oligomer_unfolding_many_signals(list_of_temperatures, list_of_signals, signal_ids, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, model_scale_factor=False, scale_factor_exclude_ids=[], cp_value=None, method='least_squares')[source]#

Fit thermal unfolding curves of oligomers for many signals (optimized variant).

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – sample concentrations of the oligomeric complex for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
model_scale_factor (bool, optional) – If True, include a per-oligomeric concentration scale factor to account for intensity differences
scale_factor_exclude_ids (list, optional) – IDs of scale factors to exclude / fix to 1
cp_value (float or None, optional) – If provided, Cp is fixed to this value and not fitted

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object

pychemelt.utils.fitting.fit_oligomer_unfolding_three_states_single_slopes(list_of_temperatures, list_of_signals, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, t1=None, t2=None, dh1=None, dh2=None, CpTh_value=None, method='least_squares', max_nfev=None)[source]#

Vectorized and optimized version of global thermal unfolding fitting of oligomers.

Returns:

global_fit_params (numpy.ndarray)
cov (numpy.ndarray)
predicted_lst (list of numpy.ndarray)
result (lmfit.minimizer.MinimizerResult)
minimizer (lmfit.minimizer.Minimizer)

Note

Dear dev/user. Fitting Cp1 will probably not work in the case of monomers, given that changing Cp does not change the shape of the unfolding curve.

pychemelt.utils.fitting.fit_oligomer_unfolding_three_states_shared_slopes_many_signals(list_of_temperatures, list_of_signals, signal_ids, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, t1=None, t2=None, dh1=None, dh2=None, CpTh_value=None, method='least_squares')[source]#

Vectorized fitting of oligomer thermal unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.

Parameters:

list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – Oligomer concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
t1 (float, optional) – Value for the first unfolding temperature
t2 (float, optional) – Value for the second unfolding temperature
dh1 (float, optional) – Value for the first unfolding enthalpy
dh2 (float, optional) – Value for the second unfolding enthalpy
CpTh_value (float, optional) – Value for the total Cp of the system, enabling fitting of Cp1

Returns:

global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult)
minimizer (lmfit.minimizer.Minimizer)

pychemelt.utils.fractions module#

This module contains helper functions to obtain the amount of folded/intermediate/unfolded (etc.) protein.

Author: Osvaldo Burastero

pychemelt.utils.fractions.fn_two_state_monomer(K)[source]#

Given the equilibrium constant K of N <-> U, return the fraction of folded protein.

Parameters:

K (float) – Equilibrium constant of the reaction N <-> U

Returns:

Fraction of folded protein

Return type:

float
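For a monomeric N <-> U equilibrium with K = [U]/[N], mass balance gives a folded fraction of 1 / (1 + K). The sketch below restates that identity; the name `fn_two_state_monomer_sketch` is hypothetical and the code is not taken from the pychemelt source.

```python
def fn_two_state_monomer_sketch(K):
    """Folded fraction for N <-> U, assuming K = [U] / [N]."""
    return 1.0 / (1.0 + K)


# K = 1 means equal populations; K = 3 means three unfolded per folded molecule.
fraction_at_k1 = fn_two_state_monomer_sketch(1.0)
fraction_at_k3 = fn_two_state_monomer_sketch(3.0)
```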

pychemelt.utils.fractions.fu_two_state_dimer(K, C)[source]#

Given the equilibrium constant K, of N2 <-> 2U, and the concentration of dimer equivalent C, return the fraction of unfolded protein

Parameters:

K (float) – Equilibrium constant of the reaction N2 <-> 2U
C (float) – Total concentration of the protein in dimer equivalents

Returns:

Fraction of unfolded protein

Return type:

float
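One way to derive such a fraction is sketched below. It assumes the convention K = [U]^2 / [N2], with [U] = 2·C·fu and [N2] = C·(1 − fu) when C is expressed in dimer equivalents; pychemelt may use a different convention, and the function name is hypothetical. Substituting into the mass-action law gives 4·C·fu^2 + K·fu − K = 0, whose positive root is the unfolded fraction.

```python
import math


def fu_two_state_dimer_sketch(K, C):
    """Unfolded fraction for N2 <-> 2U under the stated convention.

    Positive root of 4*C*fu**2 + K*fu - K = 0.
    """
    return (-K + math.sqrt(K * K + 16.0 * K * C)) / (8.0 * C)


# With C = 1 and K = 2 the positive root is exactly one half.
fu = fu_two_state_dimer_sketch(K=2.0, C=1.0)

# Consistency check against the mass-action law K = [U]^2 / [N2].
unfolded = 2.0 * 1.0 * fu
native = 1.0 * (1.0 - fu)
k_back = unfolded ** 2 / native
```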

pychemelt.utils.fractions.fu_two_state_trimer(K, C)[source]#

Given the equilibrium constant K, of N3 <-> 3U, and the concentration of trimer equivalent C, return the fraction of unfolded protein

Parameters:

K (float) – Equilibrium constant of the reaction N3 <-> 3U
C (float) – Total concentration of the protein in trimer equivalents

Returns:

Fraction of unfolded protein

Return type:

float

pychemelt.utils.fractions.fu_two_state_tetramer(K, C)[source]#

Given the equilibrium constant K, of N4 <-> 4U, and the concentration of tetramer equivalent C, return the fraction of unfolded protein

Parameters:

K (float) – Equilibrium constant of the reaction N4 <-> 4U
C (float) – Total concentration of the protein in tetramer equivalents

Returns:

Fraction of unfolded protein

Return type:

float

pychemelt.utils.fractions.fi_three_state_tetramer_monomeric_intermediate(K1, K2, Ct)[source]#

Given the equilibrium constants K1, of N4 <-> 4I, and K2, of I <-> U, and the concentration of tetramer equivalent Ct, return the fraction of intermediate.

pychemelt.utils.fractions.fi_three_state_dimer_monomeric_intermediate(K1, K2, C)[source]#

Given the equilibrium constants K1, of N2 <-> 2I, and K2, of 2I <-> 2U, and the concentration of dimer equivalent C, return the fraction of intermediate.

pychemelt.utils.fractions.fu_three_state_dimer_dimeric_intermediate(K1, K2, C)[source]#

Given the equilibrium constants K1, of N2 <-> I2, and K2, of I2 <-> 2U, and the concentration of dimer equivalent C, return the fraction of unfolded protein.

pychemelt.utils.fractions.fi_three_state_dimer_dimeric_intermediate(fu, K2, C)[source]#

Given the fraction of unfolded protein fu, the equilibrium constant K2, of I2 <-> 2U, and the concentration of dimer equivalent C, return the fraction of intermediate.

pychemelt.utils.fractions.fi_three_state_trimer_monomeric_intermediate(K1, K2, C)[source]#

Given the equilibrium constants K1, of N3 <-> 3I, and K2, of 3I <-> 3U, and the concentration of trimer equivalent C, return the fraction of intermediate.

pychemelt.utils.fractions.fu_three_state_trimer_trimeric_intermediate(K1, K2, C)[source]#

Given the equilibrium constants K1, of N3 <-> I3, and K2, of I3 <-> 3U, and the concentration of trimer equivalent C, return the fraction of unfolded protein.

pychemelt.utils.fractions.fi_three_state_trimer_trimeric_intermediate(fu, K2, C)[source]#

Given the fraction of unfolded protein fu, the equilibrium constant K2, of I3 <-> 3U, and the concentration of trimer equivalent C, return the fraction of intermediate.

pychemelt.utils.math module#

This module contains helper functions for mathematical operations.

Author: Osvaldo Burastero

pychemelt.utils.math.temperature_to_kelvin(T)[source]#

Convert temperature from Celsius to Kelvin if necessary.

Parameters:

T (array-like) – Temperature values

Returns:

Temperature values in Kelvin

Return type:

array-like

pychemelt.utils.math.temperature_to_celsius(T)[source]#

Convert temperature from Kelvin to Celsius if necessary.

Parameters:

T (array-like) – Temperature values

Returns:

Temperature values in Celsius

Return type:

array-like

pychemelt.utils.math.shift_temperature(T)[source]#

Shift temperature to be relative to Tref_cst in Kelvin.

Parameters:

T (array-like) – Temperature values

Returns:

Shifted temperature values

Return type:

array-like

pychemelt.utils.math.constant_baseline(dt, d, den_slope, a, *args)[source]#

Baseline function with no dependence on temperature and dependence on denaturant concentration

Parameters:

dt (float) – delta temperature, not used here but required for compatibility with other baseline functions
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.linear_baseline(dt, d, den_slope, a, b, *args)[source]#

Baseline function with linear dependence on temperature and linear dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.quadratic_baseline(dt, d, den_slope, a, b, c)[source]#

Baseline function with quadratic dependence on temperature and linear dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
c (float) – quadratic dependence of signal on temperature

Returns:

Baseline signal

Return type:

float
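A baseline with these parameters might combine the terms additively, as sketched below. The exact functional form is an assumption (the description above only states a quadratic dependence on temperature and a linear dependence on denaturant), and `quadratic_baseline_sketch` is a hypothetical name.

```python
def quadratic_baseline_sketch(dt, d, den_slope, a, b, c):
    """Assumed form: intercept + denaturant term + linear and quadratic temperature terms."""
    return a + den_slope * d + b * dt + c * dt ** 2


# 10 + 0.5 * 1.0 + 0.1 * 2.0 + 0.01 * 2.0**2
baseline_value = quadratic_baseline_sketch(
    dt=2.0, d=1.0, den_slope=0.5, a=10.0, b=0.1, c=0.01
)
```

Under this assumed form, setting c to zero recovers a linear baseline, and setting both b and c to zero recovers a constant one.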

pychemelt.utils.math.exponential_baseline(dt, d, den_slope, a, c, alpha)[source]#

Baseline function with exponential dependence on temperature and linear dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
c (float) – pre-exponential factor for the dependence on temperature
alpha (float) – exponential coefficient for the dependence on temperature

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.constant_baseline_only_temp(dt, a, *args)[source]#

Baseline function with no dependence on temperature or denaturant concentration

Parameters:

dt (float) – delta temperature, not used here but required for compatibility with other baseline functions
a (float) – intercept of the baseline

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.linear_baseline_only_temp(dt, a, b, *args)[source]#

Baseline function with linear dependence on temperature and no dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.quadratic_baseline_only_temp(dt, a, b, c)[source]#

Baseline function with quadratic dependence on temperature and no dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
c (float) – quadratic dependence of signal on temperature

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.exponential_baseline_only_temp(dt, a, c, alpha)[source]#

Baseline function with exponential dependence on temperature and no dependence on denaturant concentration

Parameters:

dt (float) – delta temperature
a (float) – intercept of the baseline
c (float) – pre-exponential factor for the dependence on temperature
alpha (float) – exponential coefficient for the dependence on temperature

Returns:

Baseline signal

Return type:

float

pychemelt.utils.math.is_evenly_spaced(x, tol=0.0001)[source]#

Check if x is evenly spaced within a given tolerance.

Parameters:

x (array-like) – x data
tol (float, optional) – Tolerance for considering spacing equal (default: 1e-4)

Returns:

True if x is evenly spaced, False otherwise

Return type:

bool
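A spacing check of this kind can be sketched by comparing each consecutive difference with the first one. The comparison rule and the name `is_evenly_spaced_sketch` are assumptions; only the default tolerance of 1e-4 comes from the signature above.

```python
def is_evenly_spaced_sketch(x, tol=1e-4):
    """Return True if all consecutive differences agree within tol."""
    steps = [later - earlier for earlier, later in zip(x, x[1:])]
    if not steps:
        # Zero or one point: nothing to compare.
        return True
    return all(abs(step - steps[0]) <= tol for step in steps)


regular = is_evenly_spaced_sketch([25.0, 25.5, 26.0, 26.5])
irregular = is_evenly_spaced_sketch([25.0, 25.5, 26.5])
```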

pychemelt.utils.math.first_derivative_savgol(x, y, window_length=5, polyorder=4)[source]#

Estimate the first derivative using Savitzky-Golay filtering.

Parameters:

x (array-like) – x data (must be evenly spaced)
y (array-like) – y data
window_length (int, optional) – Length of the filter window, in temperature units (default: 5)
polyorder (int, optional) – Order of the polynomial used to fit the samples (default: 4)

Returns:

First derivative of y with respect to x

Return type:

numpy.ndarray

Notes

This function will raise a ValueError if x is not evenly spaced.

pychemelt.utils.math.relative_errors(params, cov)[source]#

Calculate the relative errors of the fitted parameters.

Parameters:

params (numpy.ndarray) – Fitted parameters
cov (numpy.ndarray) – Covariance matrix of the fitted parameters

Returns:

Relative errors of the fitted parameters (in percent)

Return type:

numpy.ndarray
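A common definition of the relative error is the standard error (square root of the covariance diagonal) divided by the absolute parameter value, expressed in percent. The sketch below uses that definition with plain lists; whether pychemelt computes it in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def relative_errors_sketch(params, cov):
    """Percent relative error per parameter from the covariance diagonal."""
    return [
        100.0 * math.sqrt(cov[i][i]) / abs(value)
        for i, value in enumerate(params)
    ]


# Standard errors are 0.2 and 1.0, so the relative errors are about 10 % and 2 %.
errors = relative_errors_sketch([2.0, 50.0], [[0.04, 0.0], [0.0, 1.0]])
```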

pychemelt.utils.math.find_line_outliers(m, b, x, y, sigma=2.5)[source]#

Find outliers in a linear fit using the sigma rule.

Parameters:

m (float) – Slope of the line
b (float) – Intercept of the line
x (array-like) – x data
y (array-like) – y data
sigma (float, optional) – Number of standard deviations to use for outlier detection (default: 2.5)

Returns:

Indices of the outliers

Return type:

numpy.ndarray
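The sigma rule can be sketched as follows: compute the residuals from the line, then flag points whose residual deviates from the mean residual by more than sigma standard deviations. The use of the population standard deviation and of the mean-centered residual are assumptions, and `find_line_outliers_sketch` is a hypothetical name that returns a plain list rather than a numpy array.

```python
import statistics


def find_line_outliers_sketch(m, b, x, y, sigma=2.5):
    """Indices whose residual lies more than sigma standard deviations from the mean."""
    residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    mean = statistics.fmean(residuals)
    spread = statistics.pstdev(residuals)
    return [
        index
        for index, residual in enumerate(residuals)
        if abs(residual - mean) > sigma * spread
    ]


# Ten points on y = 2x + 1, with the point at index 5 displaced by +10.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[5] += 10
outliers = find_line_outliers_sketch(2.0, 1.0, x, y)
```

With few data points a single outlier inflates the standard deviation considerably, so a strict threshold may miss it; this is a general property of the sigma rule.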

pychemelt.utils.math.get_rss(y, y_fit)[source]#

Compute the residual sum of squares.

Parameters:

y (array-like) – Observed values
y_fit (array-like) – Fitted values

Returns:

Residual sum of squares

Return type:

float
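The residual sum of squares is the sum of squared differences between observed and fitted values. A minimal sketch with plain lists (the name `get_rss_sketch` is hypothetical):

```python
def get_rss_sketch(y, y_fit):
    """Sum of squared residuals between observed and fitted values."""
    return sum((observed - fitted) ** 2 for observed, fitted in zip(y, y_fit))


# Residuals are -0.5, 0.0 and 1.0, so the sum of squares is 0.25 + 0 + 1.
rss = get_rss_sketch([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```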

pychemelt.utils.math.solve_one_root_quadratic(a, b, c)[source]#

Solution to one root quadratic: a * X**2 + b * X + c = 0

Parameters:

a (number type) – parameter a
b (number type) – parameter b
c (number type) – parameter c

Returns:

Solution of the formula

Return type:

float

pychemelt.utils.math.solve_one_root_depressed_cubic(p, q)[source]#

Solution to one root depressed cubic: X**3 + p * X + q = 0

Parameters:

p (number type) – parameter p
q (number type) – parameter q

Returns:

Solution of the formula

Return type:

float
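When a depressed cubic has a single real root, that is, when (q/2)^2 + (p/3)^3 is not negative, Cardano's formula gives the root in closed form. The sketch below implements that textbook formula; it is not taken from the pychemelt source, and the function names are hypothetical.

```python
import math


def real_cube_root(value):
    """Cube root that also works for negative inputs."""
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


def one_root_depressed_cubic_sketch(p, q):
    """Real root of X**3 + p*X + q = 0 via Cardano's formula."""
    discriminant = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if discriminant < 0:
        raise ValueError("three real roots: the single-root formula does not apply")
    root_term = math.sqrt(discriminant)
    return real_cube_root(-q / 2.0 + root_term) + real_cube_root(-q / 2.0 - root_term)


# X**3 + 3*X - 4 = 0 has the real root X = 1.
root = one_root_depressed_cubic_sketch(3.0, -4.0)
residual = root ** 3 + 3.0 * root - 4.0
```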

pychemelt.utils.palette module#

Viridis color palette.

A perceptually uniform color map that is readable by those with colorblindness. Contains hex color values transitioning from dark purple to yellow.

pychemelt.utils.plotting module#

class pychemelt.utils.plotting.PlotConfig(width: int = 1000, height: int = 800, type: str = 'png', font_size: int = 16, marker_size: int = 8, line_width: int = 3)[source]#

Bases: object

General plot configuration

width: int = 1000#

height: int = 800#

type: str = 'png'#

font_size: int = 16#

marker_size: int = 8#

line_width: int = 3#

__init__(width: int = 1000, height: int = 800, type: str = 'png', font_size: int = 16, marker_size: int = 8, line_width: int = 3) → None#

class pychemelt.utils.plotting.AxisConfig(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 5, linewidth: int = 1, tickwidth: int = 1, ticklen: int = 5, gridwidth: int = 1)[source]#

Bases: object

Axis styling configuration

showgrid_x: bool = True#

showgrid_y: bool = True#

n_y_axis_ticks: int = 5#

linewidth: int = 1#

tickwidth: int = 1#

ticklen: int = 5#

gridwidth: int = 1#

__init__(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 5, linewidth: int = 1, tickwidth: int = 1, ticklen: int = 5, gridwidth: int = 1) → None#

class pychemelt.utils.plotting.LayoutConfig(show_subplot_titles: bool = False, vertical_spacing: float = 0.1)[source]#

Bases: object

Layout and spacing configuration

show_subplot_titles: bool = False#

vertical_spacing: float = 0.1#

__init__(show_subplot_titles: bool = False, vertical_spacing: float = 0.1) → None#

class pychemelt.utils.plotting.LegendConfig[source]#

Bases: object

Legend and labeling configuration

color_bar_length = 0.4#

color_bar_orientation = 'v'#

color_bar_x_pos = 1.05#

color_bar_y_pos = 0.5#

__init__() → None#

pychemelt.utils.plotting.config_fig(fig, plot_width=800, plot_height=600, plot_type='png', plot_title_for_download='plot')[source]#

Configure plotly figure with download options and toolbar settings.

Parameters:

fig (go.Figure) – Plotly figure object
plot_width (int, default 800) – Width of the plot in pixels
plot_height (int, default 600) – Height of the plot in pixels
plot_type (str, default "png") – Format for downloading the plot (e.g., “png”, “jpeg”)
plot_title_for_download (str, default "plot") – Title for the downloaded plot file

Returns:

Configured plotly figure

Return type:

go.Figure

pychemelt.utils.plotting.plot_unfolding(pychemelt_sample, plot_derivative=False, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None, legend_config: LegendConfig = None)[source]#

Plot the unfolding curves, including the signal and the predicted curves

Parameters:

pychemelt_sample – pychemelt.Sample object
plot_derivative (bool) – Whether to plot the derivative of the signal
plot_config (PlotConfig, optional) – Configuration for the overall plot
axis_config (AxisConfig, optional) – Configuration for the axes
layout_config (LayoutConfig, optional) – Configuration for the layout
legend_config (LegendConfig, optional) – configuration for the legend

pychemelt.utils.plotting.plot_baselines(pychemelt_sample, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None, legend_config: LegendConfig = None)[source]#

Plot the fitted native and unfolded baseline curves on the data

Parameters:

pychemelt_sample – pychemelt.Sample object
plot_config (PlotConfig, optional) – Configuration for the overall plot
axis_config (AxisConfig, optional) – Configuration for the axes
layout_config (LayoutConfig, optional) – Configuration for the layout
legend_config (LegendConfig, optional) – configuration for the legend

pychemelt.utils.processing module#

This module contains helper functions to process data.

Author: Osvaldo Burastero

pychemelt.utils.processing.set_param_bounds(p0, param_names)[source]#

Generate heuristic lower and upper bounds for fitting parameters based on initial guesses.

Parameters:

p0 (array-like) – Initial parameter guesses.
param_names (list of str) – Names of the parameters to apply specific logic (e.g., non-negative constraints).

Returns:

(low_bounds, high_bounds) as lists of numeric values.

Return type:

tuple

pychemelt.utils.processing.expand_temperature_list(temp_lst, signal_lst)[source]#

Expand the temperature list to match the length of the signal list.

Parameters:

temp_lst (list) – List of temperatures
signal_lst (list) – List of signals

Returns:

Expanded temperature list

Return type:

list

pychemelt.utils.processing.clean_conditions_labels(conditions)[source]#

Clean the conditions labels by removing unwanted characters and patterns.

Parameters:: conditions (list) – List of condition strings.
Returns:: List of cleaned condition strings.
Return type:: list

pychemelt.utils.processing.subset_signal_by_temperature(signal_lst, temp_lst, min_temp, max_temp)[source]#

Subset the signal and temperature lists based on the specified temperature range.

Parameters:

signal_lst (list) – List of signal arrays.
temp_lst (list) – List of temperature arrays.
min_temp (float) – Minimum temperature for subsetting.
max_temp (float) – Maximum temperature for subsetting.

Returns:

Tuple containing the subsetted signal and temperature lists.

Return type:

tuple
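
The subsetting can be pictured as a boolean mask applied to each pair of arrays. The sketch below assumes inclusive bounds and uses a hypothetical function name; the library may treat the edges differently.

```python
import numpy as np

def subset_by_temperature_sketch(signal_lst, temp_lst, min_temp, max_temp):
    """Keep only the points whose temperature lies in [min_temp, max_temp]."""
    sub_signals, sub_temps = [], []
    for signal, temp in zip(signal_lst, temp_lst):
        signal = np.asarray(signal)
        temp = np.asarray(temp)
        # Inclusive bounds are an assumption of this sketch
        mask = (temp >= min_temp) & (temp <= max_temp)
        sub_signals.append(signal[mask])
        sub_temps.append(temp[mask])
    return sub_signals, sub_temps

temps = [np.array([20.0, 30.0, 40.0, 50.0, 60.0])]
signals = [np.array([1.0, 1.1, 1.5, 1.9, 2.0])]
sub_signals, sub_temps = subset_by_temperature_sketch(signals, temps, 30.0, 50.0)
```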

pychemelt.utils.processing.guess_Tm_from_derivative(temp_lst, deriv_lst, x1, x2)[source]#

Estimate the melting temperature (Tm) by finding the extremum of the first derivative.

Parameters:

temp_lst (list of np.ndarray) – Temperature arrays for each dataset.
deriv_lst (list of np.ndarray) – First derivative of the signal for each dataset.
x1 (float) – Lower buffer from the temperature edges to exclude noise/artifacts.
x2 (float) – Upper buffer from the temperature edges to define the baseline median window.

Returns:

Estimated Tm values for each dataset.

Return type:

list of float

pychemelt.utils.processing.estimate_signal_baseline_params(signal_lst, temp_lst, native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12, oligomer_number=1)[source]#

Estimate the baseline parameters for the sample

Parameters:

signal_lst (list of np.ndarray) – List of signal arrays
temp_lst (list of np.ndarray) – List of temperature arrays
window_range_native (float) – Range of the temperature window to estimate the native state baseline
window_range_unfolded (float) – Range of the temperature window to estimate the unfolded state baseline
native_baseline_type (str) – options: ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
unfolded_baseline_type (str) – options: ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
oligomer_number (int) – number of subunits in the oligomer

Returns:

Lists of estimated parameters (p1Ns, p1Us, p2Ns, p2Us, p3Ns, p3Us).

Return type:

tuple

pychemelt.utils.processing.fit_local_thermal_unfolding_to_signal_lst(signal_lst, temp_lst, t_melting_init, p1_Ns, p1_Us, p2_Ns, p2_Us, p3_Ns, p3_Us, baseline_native_fx, baseline_unfolded_fx)[source]#

Perform individual (local) fits for each signal curve in a list.

Parameters:

signal_lst (list of np.ndarray) – List of signals.
temp_lst (list of np.ndarray) – List of temperatures.
t_melting_init (list of float) – Initial Tm guesses.
p1_Ns (list of float) – Estimated baseline parameters for each curve.
p1_Us (list of float) – Estimated baseline parameters for each curve.
p2_Ns (list of float) – Estimated baseline parameters for each curve.
p2_Us (list of float) – Estimated baseline parameters for each curve.
p3_Ns (list of float) – Estimated baseline parameters for each curve.
p3_Us (list of float) – Estimated baseline parameters for each curve.
baseline_native_fx (callable) – Function to calculate the native baseline.
baseline_unfolded_fx (callable) – Function to calculate the unfolded baseline.

Returns:

(Tms, dHs, predicted_lst) containing fitted parameters and signal arrays.

Return type:

tuple

pychemelt.utils.processing.re_arrange_predictions(predicted_lst, n_signals, n_denaturants)[source]#

Re-arrange the flattened predictions to match the original signal list with sublists.

Parameters:

predicted_lst (list) – Flattened list of predicted signals of length n_signals * n_denaturants.
n_signals (int) – Number of signal types (e.g., different wavelengths).
n_denaturants (int) – Number of denaturant concentrations or conditions per signal.

Returns:

Re-arranged list of predicted signals of length n_signals, where each element is a sublist of length n_denaturants.

Return type:

list
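
The re-arrangement can be sketched as slicing the flat list into consecutive chunks. The ordering assumed here (all conditions of the first signal, then all conditions of the second, and so on) is a guess based on the description, and the function name is hypothetical.

```python
def re_arrange_predictions_sketch(predicted_lst, n_signals, n_denaturants):
    """Split a flat list into n_signals sublists of length n_denaturants."""
    if len(predicted_lst) != n_signals * n_denaturants:
        raise ValueError("length must equal n_signals * n_denaturants")
    return [
        predicted_lst[i * n_denaturants:(i + 1) * n_denaturants]
        for i in range(n_signals)
    ]

flat = ["s0_d0", "s0_d1", "s0_d2", "s1_d0", "s1_d1", "s1_d2"]
nested = re_arrange_predictions_sketch(flat, n_signals=2, n_denaturants=3)
```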

pychemelt.utils.processing.re_arrange_params(params, n_signals)[source]#

Re-arrange flattened parameters into a list of sublists grouped by signal.

Parameters:

params (list or np.ndarray) – Flattened list of parameters.
n_signals (int) – Number of signal types to group parameters by.

Returns:

Re-arranged list of parameters of length n_signals containing parameter arrays for each signal.

Return type:

list of np.ndarray

pychemelt.utils.processing.subset_data(data, max_points)[source]#

Reduces the number of data points by repeated striding until the size is below a threshold.

Parameters:

data (np.ndarray) – Input data array to be subsetted.
max_points (int) – The maximum number of points allowed in the resulting array.

Returns:

Subsetted data array containing every $2^n$-th point of the original.

Return type:

np.ndarray
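
Repeated striding can be illustrated as follows: every second point is kept until the length no longer exceeds the limit. Whether the library stops at "less than" or "less than or equal to" `max_points` is not stated, so the comparison below is an assumption, as is the function name.

```python
def subset_data_sketch(data, max_points):
    """Halve the data by striding until at most max_points remain."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    while len(data) > max_points:
        # Keep every second point; after n passes this is every 2**n-th point
        data = data[::2]
    return data

# 100 points -> 50 -> 25, i.e. every 4th point of the original
reduced = subset_data_sketch(list(range(100)), max_points=30)
```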

pychemelt.utils.processing.get_colors_from_numeric_values(values, min_val, max_val, use_log_scale=False)[source]#

Map numeric values to colors in the VIRIDIS palette based on a specified range.

Parameters:

values (list or np.ndarray) – Numeric values to map to colors.
min_val (float) – Minimum value of the range.
max_val (float) – Maximum value of the range.
use_log_scale (bool, optional) – Whether to use logarithmic scaling for the values, default is False.

Returns:

List of hex color codes corresponding to the input values.

Return type:

list

pychemelt.utils.processing.combine_sequences(seq1, seq2)[source]#

Combine two sequences to generate all possible combinations of their elements.

Parameters:

seq1 (list) – First sequence of elements.
seq2 (list) – Second sequence of elements.

Returns:

A list of tuples, where each tuple contains one element from seq1 and one from seq2.

Return type:

list
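
Generating all pairs of elements from two sequences is a Cartesian product. A minimal sketch using the standard library is shown below; the function name is hypothetical and the ordering of the output (first sequence varying slowest) is an assumption.

```python
from itertools import product

def combine_sequences_sketch(seq1, seq2):
    """All (a, b) pairs with a taken from seq1 and b from seq2."""
    return list(product(seq1, seq2))

pairs = combine_sequences_sketch(["350nm", "330nm"], [0.0, 1.0])
```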

pychemelt.utils.processing.adjust_value_to_interval(value, lower_bound, upper_bound, shift)[source]#

Verify that a value is within the specified bounds. If the value is outside the bounds, adjust it to the nearest bound.

Parameters:

value (float) – The value to be adjusted.
lower_bound (float) – The lower bound of the interval.
upper_bound (float) – The upper bound of the interval.
shift (float) – How much to shift the value if it is outside the bounds.

pychemelt.utils.processing.oligomer_number(model)[source]#

Get the number of subunits in the oligomer based on the model.

Returns:: The number of subunits (2 for ‘Dimer’, 3 for ‘Trimer’, 4 for ‘Tetramer’, 1 otherwise).
Return type:: int

pychemelt.utils.processing.parse_number(s)[source]#

Parse a string as a float, handling:

- European decimal (comma)
- Optional thousands separators
- Standard decimal point

Parameters:: s (str) – The string to parse
Returns:: The parsed number
Return type:: float
Raises:: ValueError – If the string cannot be parsed as a float
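
One way such parsing could work is sketched below. The rules are assumptions chosen for illustration: when both separators appear, the last one is the decimal mark, and a lone comma is read as a decimal mark. The function name is hypothetical and the library may resolve ambiguous inputs such as "1,234" differently.

```python
def parse_number_sketch(s):
    """Parse a numeric string with either '.' or ',' as the decimal mark."""
    s = s.strip()
    if "," in s and "." in s:
        if s.rfind(",") > s.rfind("."):
            # European style: '.' groups thousands, ',' is the decimal mark
            s = s.replace(".", "").replace(",", ".")
        else:
            # Standard style: ',' groups thousands, '.' is the decimal mark
            s = s.replace(",", "")
    elif "," in s:
        # Lone comma is treated as a decimal mark (assumption)
        s = s.replace(",", ".")
    return float(s)  # raises ValueError for non-numeric input

examples = [parse_number_sketch(t) for t in ["1.234,5", "1,234.5", "3,14", "42"]]
```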

pychemelt.utils.processing.are_all_strings_numeric(lst)[source]#

Parameters:: lst (list of str) – List of strings to check
Returns:: True if all strings in the list are numeric (can contain digits, ‘.’, ‘-’, ‘,’), False otherwise
Return type:: bool

pychemelt.utils.processing.is_float(element)[source]#

pychemelt.utils.processing.transform_to_list(element_or_list)[source]#

Parameters:: element_or_list (bool, str, int, float, list, or numpy array) – The input element or list to be transformed into a list.
Returns:: A list containing the input element if it is not already a list, or the input itself if it is None, a numpy array, or a list.
Return type:: list or None
Raises:: ValueError – If the input is not a boolean, string, integer, float, list, or numpy array

pychemelt.utils.processing.ci_dict_to_summary_df(ci_dict, percentage=0.95)[source]#

Convert lmfit confidence interval dictionary into a summary DataFrame.

Parameters:

ci_dict (dict) – Dictionary containing confidence intervals for fitted parameters, typically in the format returned by lmfit.

Returns:

DataFrame summarizing the confidence intervals for each parameter, with columns:

- Parameter: Name of the fitted parameter
- Lower_CI: Lower bound of the confidence interval
- Value: Best-fit value of the parameter
- Upper_CI: Upper bound of the confidence interval

Return type:

pd.DataFrame

pychemelt.utils.rates module#

This module contains helper functions to obtain equilibrium constants.

Author: Osvaldo Burastero

Useful references for unfolding models:

Rumfeldt, Jessica AO, et al. “Conformational stability and folding mechanisms of dimeric proteins.” Progress in biophysics and molecular biology 98.1 (2008): 61-84.
Bedouelle, Hugues. “Principles and equations for measuring and interpreting protein stability: From monomer to tetramer.” Biochimie 121 (2016): 29-37.
Mazurenko, Stanislav, et al. “Exploration of protein unfolding by modelling calorimetry data from reheating.” Scientific reports 7.1 (2017): 16321.

All thermodynamic parameters are expressed in kcal/mol-based units (for example, kcal/mol for enthalpies and kcal/mol/K for heat capacities).

Unfolding functions for monomers have an argument called ‘extra_arg’ that is not used. This is because unfolding functions for oligomers require the protein concentration in that position

pychemelt.utils.rates.eq_constant_thermo(T, DH1, T1, Cp)[source]#

T1 is the temperature at which ΔG(T) = 0, ΔH1 is the variation of enthalpy between the two considered states at T1, and Cp is the variation of heat capacity between the two states.

Parameters:

T (array-like) – Temperature (Kelvin)
DH1 (float) – Variation of enthalpy between the two considered states at T1 (kcal/mol)
T1 (float) – Temperature at which the equilibrium constant equals one (Kelvin)
Cp (float) – Variation of heat capacity between the two states (kcal/mol/K)

Returns:

Equilibrium constant at the given temperature

Return type:

numpy.ndarray
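
The parameters above match the standard Gibbs-Helmholtz description of a two-state equilibrium, which the sketch below evaluates. The exact expression and the gas constant value used by the library are assumptions here, and `eq_constant_sketch` and `R_GAS` are hypothetical names.

```python
import numpy as np

R_GAS = 1.987e-3  # gas constant in kcal/(mol K), assumed value

def eq_constant_sketch(T, DH1, T1, Cp):
    """Equilibrium constant from the Gibbs-Helmholtz free energy."""
    T = np.asarray(T, dtype=float)
    # dG(T) = DH1 * (1 - T/T1) + Cp * (T - T1 - T * ln(T/T1))
    dG = DH1 * (1.0 - T / T1) + Cp * (T - T1 - T * np.log(T / T1))
    return np.exp(-dG / (R_GAS * T))

T = np.array([320.0, 330.0, 340.0])  # Kelvin
K = eq_constant_sketch(T, DH1=100.0, T1=330.0, Cp=0.0)
```

By construction the constant equals one at T1, is below one at lower temperatures, and above one at higher temperatures when DH1 is positive.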

pychemelt.utils.rates.eq_constant_termochem(T, D, DHm, Tm, Cp0, m0, m1)[source]#

Ref: Louise Hamborg et al., 2020. Global analysis of protein stability by temperature and chemical denaturation

Parameters:

T (array-like) – Temperature (Kelvin only!)
D (float) – Denaturant concentration (M)
DHm (float) – Enthalpy change at Tm (kcal/mol)
Tm (float) – Melting temperature where ΔG = 0 (Kelvin only!)
Cp0 (float) – Heat capacity change (kcal/mol/K)
m0 (float) – m-value at the reference temperature
m1 (float) – Temperature dependence of the m-value

Returns:

Equilibrium constant at a certain temperature and denaturant agent concentration

Return type:

numpy.ndarray

pychemelt.utils.signals module#

This module contains helper functions to obtain the signal, given certain parameters.

Author: Osvaldo Burastero

Note: One could move dT outside the signal functions but the speedup was not significant.

pychemelt.utils.signals.signal_two_state_tc_unfolding(T, D, DHm, Tm, Cp0, m0, m1, p1_N, p2_N, p3_N, p4_N, p1_U, p2_U, p3_U, p4_U, baseline_N_fx, baseline_U_fx, extra_arg=None)[source]#

Ref: Louise Hamborg et al., 2020. Global analysis of protein stability by temperature and chemical denaturation

Parameters:

T (array-like) – Temperature in Kelvin units
D (array-like) – Denaturant agent concentration
DHm (float) – Variation of enthalpy between the two considered states at Tm
Tm (float) – Temperature at which the equilibrium constant equals one, in Kelvin units
Cp0 (float) – Variation of calorific capacity between the two states
m0 (float) – m-value at the reference temperature (Tref)
m1 (float) – Variation of m-value with temperature
p1_N (float) – parameters describing the native-state baseline
p2_N (float) – parameters describing the native-state baseline
p3_N (float) – parameters describing the native-state baseline
p4_N (float) – parameters describing the native-state baseline
p1_U (float) – parameters describing the unfolded-state baseline
p2_U (float) – parameters describing the unfolded-state baseline
p3_U (float) – parameters describing the unfolded-state baseline
p4_U (float) – parameters describing the unfolded-state baseline
baseline_N_fx (function) – for the native-state baseline
baseline_U_fx (function) – for the unfolded-state baseline
extra_arg (None, optional) – Not used but present for API compatibility with oligomeric models

Returns:

Signal at the given temperatures and denaturant agent concentration, given the parameters

Return type:

numpy.ndarray

pychemelt.utils.signals.signal_two_state_t_unfolding(T, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0, extra_arg=None)[source]#

Two-state temperature unfolding (monomer).

Parameters:

T (array-like) – Temperature
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
extra_arg (None, optional) – Not used but present for compatibility

Returns:

Signal at the given temperatures, given the parameters

Return type:

numpy.ndarray
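
A two-state signal is usually modelled as the population-weighted sum of the native and unfolded baselines. The sketch below illustrates that idea with temperature in Kelvin and baselines passed as plain callables of T; the baseline call signature, the gas constant value, and all names are assumptions made for illustration and do not reproduce the library's interface.

```python
import numpy as np

R_GAS = 1.987e-3  # gas constant in kcal/(mol K), assumed value

def two_state_signal_sketch(T, Tm, dHm, baseline_N, baseline_U, Cp=0.0):
    """Signal = fN * native baseline + fU * unfolded baseline."""
    T = np.asarray(T, dtype=float)
    dG = dHm * (1.0 - T / Tm) + Cp * (T - Tm - T * np.log(T / Tm))
    K = np.exp(-dG / (R_GAS * T))
    fU = K / (1.0 + K)  # unfolded fraction for N <-> U
    return (1.0 - fU) * baseline_N(T) + fU * baseline_U(T)

T = np.linspace(300.0, 360.0, 61)
signal = two_state_signal_sketch(
    T, Tm=330.0, dHm=100.0,
    baseline_N=lambda t: np.full_like(t, 1.0),  # flat native baseline
    baseline_U=lambda t: np.full_like(t, 2.0),  # flat unfolded baseline
)
```

At Tm both states are equally populated, so the signal sits halfway between the two baselines.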

pychemelt.utils.signals.two_state_thermal_unfold_curve(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#

Two-state temperature unfolding (monomer). N ⇔ U

Parameters:

T (array-like) – Temperature
C (array-like) – Oligomer sample concentration - only for compatibility with oligomeric models, not used in the monomeric model
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)

Returns:

Signal at the given temperatures, given the parameters

Return type:

numpy.ndarray

pychemelt.utils.signals.two_state_thermal_unfold_curve_dimer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#

Two-state temperature unfolding (dimer). N2 ⇔ 2U

Parameters:

T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)

Returns:

Signal at the given temperatures, given the parameters

Return type:

numpy.ndarray

Notes

C is the total concentration (M) of the protein in dimer equivalent.
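
For N2 ⇔ 2U with C expressed in dimer equivalents, [N2] = C(1 - fU) and [U] = 2·C·fU, so K = [U]²/[N2] leads to the quadratic 4·C·fU² + K·fU - K = 0. The sketch below solves it for the physically meaningful root. Defining K as a dissociation constant in molar units is an assumption, and the function name is hypothetical.

```python
import math

def unfolded_fraction_dimer(K, C):
    """Positive root of 4*C*fU**2 + K*fU - K = 0 for N2 <-> 2U."""
    a, b, c = 4.0 * C, K, -K
    # Only the '+' root lies between 0 and 1
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

C = 10e-6  # total concentration in dimer equivalents (M)
K = 1e-6   # assumed dissociation constant (M)
fU = unfolded_fraction_dimer(K, C)
```

Unlike the monomeric case, the unfolded fraction depends on the protein concentration, which is why the oligomeric models take C as an argument.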

pychemelt.utils.signals.two_state_thermal_unfold_curve_trimer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#

Two-state temperature unfolding (trimer). N3 ⇔ 3U

Parameters:

T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)

Returns:

Signal at the given temperatures, given the parameters

Return type:

numpy.ndarray

Notes

C is the total concentration (M) of the protein in trimer equivalent.

pychemelt.utils.signals.two_state_thermal_unfold_curve_tetramer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0, extra_arg=None)[source]#

Two-state temperature unfolding (tetramer). N4 ⇔ 4U

Parameters:

T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)

Returns:

Signal at the given temperatures, given the parameters

Return type:

numpy.ndarray

Notes

C is the total concentration (M) of the protein in tetramer equivalent.

pychemelt.utils.signals.unfolding_curve_monomer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state reversible unfolding. N ⇔ I ⇔ U

pychemelt.utils.signals.unfolding_curve_dimer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state unfolding with a monomeric intermediate. N2 ⇔ 2I ⇔ 2U

C is the concentration in dimer equivalent. CpTotal = Cp1 + 2*Cp2

pychemelt.utils.signals.unfolding_curve_trimer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state unfolding with a monomeric intermediate. N3 ⇔ 3I ⇔ 3U

C is the concentration in trimer equivalent.

pychemelt.utils.signals.unfolding_curve_tetramer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state unfolding with a monomeric intermediate. N4 ⇔ 4I ⇔ 4U

C is the concentration in tetramer equivalent.

pychemelt.utils.signals.unfolding_curve_trimer_trimeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state unfolding with a trimeric intermediate. N3 ⇔ I3 ⇔ 3U

C is the concentration in trimer equivalent.

pychemelt.utils.signals.unfolding_curve_dimer_dimeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#

Three-state unfolding with a dimeric intermediate. N2 ⇔ I2 ⇔ 2U

C is the molar concentration in dimer equivalent. CpTotal = Cp1 + Cp2

pychemelt.utils.signals.map_two_state_model_to_signal_fx(model)[source]#

Map the model string to the corresponding signal function.

Parameters:: model (str) – String representation of the model type.
Returns:: signal function corresponding to the string
Return type:: function

pychemelt.utils.signals.map_three_state_model_to_signal_fx(model)[source]#

pychemelt.utils.svd module#

Module containing functions to perform Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) on spectral data, along with utilities for manipulating basis spectra and coefficients.

Author: Osvaldo Burastero

pychemelt.utils.svd.apply_svd(X)[source]#

Perform Singular Value Decomposition (SVD) on the input data matrix X.

Parameters:

X (numpy array of shape (n_wavelengths, n_measurements)) – The input data matrix to decompose.

Returns:

explained_variance (numpy array) – The cumulative explained variance for each component.
basis_spectra (numpy array) – The left singular vectors (U matrix) representing the basis spectra.
coefficients (numpy array) – The coefficients associated with each basis spectrum.
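
The three return values can be related to a standard SVD as in the sketch below. Expressing the cumulative explained variance in percent and defining the coefficients as the singular values times the right singular vectors are assumptions consistent with the descriptions above, not confirmed details; `apply_svd_sketch` is a hypothetical name.

```python
import numpy as np

def apply_svd_sketch(X):
    """SVD of X with cumulative explained variance in percent."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Variance captured by each component is proportional to s**2
    explained_variance = 100.0 * np.cumsum(s ** 2) / np.sum(s ** 2)
    # Coefficients such that X is recovered as U @ coefficients
    coefficients = np.diag(s) @ Vt
    return explained_variance, U, coefficients

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))  # (n_wavelengths, n_measurements)
explained_variance, basis_spectra, coefficients = apply_svd_sketch(X)
```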

pychemelt.utils.svd.filter_basis_spectra(explained_variance, basis_spectra_all, coefficients_all, explained_variance_threshold=99)[source]#

Filter the basis spectra and coefficients based on the explained variance threshold.

Parameters:

explained_variance (numpy array) – The cumulative explained variance for each component.
basis_spectra_all (numpy array) – The left singular vectors (U matrix) representing the basis spectra.
coefficients_all (numpy array) – The coefficients associated with each basis spectrum.
explained_variance_threshold (float, optional) – The threshold for explained variance to filter components. Default is 99.

Returns:

basis_spectra (numpy array) – The filtered basis spectra.
coefficients (numpy array) – The filtered coefficients.
k (int) – The number of components that meet the explained variance threshold.
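
The filtering step can be sketched as keeping the smallest number of leading components whose cumulative explained variance reaches the threshold. The array orientation (basis spectra in columns, coefficients in rows) and the function name are assumptions for illustration.

```python
import numpy as np

def filter_basis_spectra_sketch(explained_variance, basis_spectra_all,
                                coefficients_all, threshold=99):
    """Keep the first k components that reach the variance threshold."""
    # Index of the first cumulative value >= threshold, converted to a count
    k = int(np.searchsorted(explained_variance, threshold)) + 1
    k = min(k, len(explained_variance))
    return basis_spectra_all[:, :k], coefficients_all[:k, :], k

explained = np.array([80.0, 95.0, 99.5, 100.0])
basis_all = np.eye(4)
coeffs_all = np.arange(12.0).reshape(4, 3)
basis, coeffs, k = filter_basis_spectra_sketch(explained, basis_all, coeffs_all)
```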

pychemelt.utils.svd.align_basis_spectra_and_coefficients(X, basis_spectra, coefficients)[source]#

Align the basis spectra peaks to the original data.

Parameters:

X (numpy array of shape (n_samples, n_features)) – The input data matrix.
basis_spectra (numpy array) – The basis spectra obtained from SVD.
coefficients (numpy array) – The coefficients associated with each basis spectrum.

Returns:

basis_spectra (numpy array) – The aligned basis spectra.
coefficients (numpy array) – The adjusted coefficients.

pychemelt.utils.svd.angle_from_cathets(adjacent_leg, opposite_leg)[source]#

Calculate the angle between the hypotenuse and the adjacent leg of a right triangle.

Parameters:

adjacent_leg (float) – Length of the adjacent leg.
opposite_leg (float) – Length of the opposite leg.

Returns:

angle_in_radians – Angle in radians between the hypotenuse and the adjacent leg.

Return type:

float

pychemelt.utils.svd.get_2d_counterclockwise_rot_matrix(angle_in_radians)[source]#

Obtain the counterclockwise rotation matrix for a 2D coordinate system.

Parameters:: angle_in_radians (float) – Angle in radians for the rotation.
Returns:: rotM – 2x2 rotation matrix.
Return type:: numpy array
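
For reference, the standard 2D counterclockwise rotation matrix is [[cos θ, -sin θ], [sin θ, cos θ]]. The sketch below builds it with a hypothetical helper name and checks it on a unit vector.

```python
import numpy as np

def rot2d_ccw(angle_in_radians):
    """Standard 2x2 counterclockwise rotation matrix."""
    c, s = np.cos(angle_in_radians), np.sin(angle_in_radians)
    return np.array([[c, -s], [s, c]])

# Rotating the x unit vector by 90 degrees gives the y unit vector
rotated = rot2d_ccw(np.pi / 2) @ np.array([1.0, 0.0])
```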

pychemelt.utils.svd.get_3d_counterclockwise_rot_matrix_around_z_axis(angle_in_radians)[source]#

Obtain the counterclockwise rotation matrix around the z axis for a 3D coordinate system.

Parameters:: angle_in_radians (float) – Angle in radians for the rotation.
Returns:: rotM – 3x3 rotation matrix.
Return type:: numpy array

pychemelt.utils.svd.get_3d_clockwise_rot_matrix_around_y_axis(angle_in_radians)[source]#

Obtain the clockwise rotation matrix around the y axis for a 3D coordinate system.

Parameters:: angle_in_radians (float) – Angle in radians for the rotation.
Returns:: rotM – 3x3 rotation matrix.
Return type:: numpy array

pychemelt.utils.svd.rotate_two_basis_spectra(X, basis_spectra, pca_based=False)[source]#

Create a new set of basis spectra using linear combinations of the first and second basis spectra

Parameters:

X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
basis_spectra (numpy array) – The matrix containing the set of basis spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.

Returns:

basis_spectra_new (numpy array) – The new set of basis spectra.
coefficients (numpy array) – The new set of associated coefficients.

pychemelt.utils.svd.rotate_three_basis_spectra(X, basis_spectra, pca_based=False)[source]#

Create a new set of basis spectra using linear combinations of the first, second and third basis spectra

Parameters:

X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
basis_spectra (numpy array) – The matrix containing the set of basis spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.

Returns:

basis_spectra_new (numpy array) – The new set of basis spectra.
coefficients_subset (numpy array) – The new set of associated coefficients.

pychemelt.utils.svd.reconstruct_spectra(basis_spectra, coefficients, X=None, pca_based=False)[source]#

Reconstruct the original spectra based on the set of basis spectra and the associated coefficients

Parameters:

basis_spectra (numpy array) – The matrix containing the set of basis spectra.
coefficients (numpy array) – The associated coefficients of each basis spectrum.
X (numpy array, optional) – Only used if pca_based is True. X is the raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
pca_based (bool, optional) – Boolean to decide if we need to extract the mean from the raw data matrix X. Default is False.

Returns:

fitted (numpy array) – The reconstructed matrix, which should be close to the original raw data.
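
Reconstruction amounts to multiplying the basis spectra by their coefficients and, for PCA, adding back the mean that was removed before decomposition. The sketch below assumes the mean is taken per wavelength across measurements (axis 1); that choice and the function name are assumptions, not confirmed behaviour.

```python
import numpy as np

def reconstruct_spectra_sketch(basis_spectra, coefficients, X=None, pca_based=False):
    """Rebuild the data matrix from basis spectra and coefficients."""
    fitted = basis_spectra @ coefficients
    if pca_based:
        if X is None:
            raise ValueError("X is required when pca_based is True")
        # Add back the per-wavelength mean (assumed centering convention)
        fitted = fitted + X.mean(axis=1, keepdims=True)
    return fitted

# A rank-2 matrix is recovered exactly from its first two SVD components
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 6))
U, s, Vt = np.linalg.svd(X, full_matrices=False)
fitted = reconstruct_spectra_sketch(U[:, :2], np.diag(s[:2]) @ Vt[:2, :])
```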

pychemelt.utils.svd.explained_variance_from_orthogonal_vectors(vectors, coefficients, total_variance)[source]#

Useful to get the percentage of variance, not in the coordinate space provided by PCA/SVD, but against a different set of (rotated) vectors.

Parameters:

vectors (numpy array) – The set of orthogonal vectors.
coefficients (numpy array) – The associated coefficients of each orthogonal vector.
total_variance (float) – The total variance of the original data (mean subtracted if we performed PCA…).

Returns:

explained_variance – The amount of explained variance by each orthogonal vector.

Return type:

list

pychemelt.utils.svd.apply_pca(X)[source]#

Perform Principal Component Analysis (PCA) on the input data matrix X.

Parameters:

X (numpy array of shape (n_wavelengths, n_measurements)) – The input data matrix to decompose.

Returns:

cum_sum_eigenvalues (numpy array) – The cumulative explained variance for each principal component.
principal_components (numpy array) – The principal components (eigenvectors) representing the basis spectra.
coefficients (numpy array) – The coefficients associated with each principal component.

pychemelt.utils.svd.recalc_explained_variance(basis_spectra, coefficients, X, pca_based=False)[source]#

Recalculate the explained variance of a set of basis spectra and associated coefficients.

Parameters:

basis_spectra (numpy array) – The basis spectra.
coefficients (numpy array) – The associated coefficients of each basis spectrum.
X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.

Returns:

explained_variance – The cumulative explained variance for each component.

Return type:

numpy array
