pychemelt.utils package#
Submodules#
pychemelt.utils.constants module#
pychemelt.utils.files module#
This module contains helper functions to parse Differential Scanning Fluorimetry (DSF) files from different instrument providers.
Author: Osvaldo Burastero
All functions that import files should return:
- signal_data_dic: dictionary with the signal data, one entry per signal
- temp_data_dic: dictionary with the temperature data, one entry per signal
- conditions: list with the names of the samples
- signals: list with the names of the signals
A signal can be “350nm”, “330nm”, “Scattering”, “Ratio”, “Turbidity”, “Ratio 350nm/330nm”, etc. The lists stored in signal_data_dic and temp_data_dic should have the same length as conditions.
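The return contract described above can be pictured with a small hand-built example. This is only a sketch: the sample names and values are made up, `check_loader_output` is a hypothetical helper that is not part of pychemelt, and the check that each signal array matches its temperature array in length is an assumption.

```python
import numpy as np

# Made-up data: two conditions measured for two signals
conditions = ["Sample A", "Sample B"]
signals = ["350nm", "330nm"]
temps = np.linspace(25.0, 95.0, 8)

# One list per signal, one array per condition
signal_data_dic = {s: [np.zeros(temps.size) for _ in conditions] for s in signals}
temp_data_dic = {s: [temps.copy() for _ in conditions] for s in signals}


def check_loader_output(signal_data_dic, temp_data_dic, conditions, signals):
    """Hypothetical helper: verify the shape contract of a loader's output."""
    for s in signals:
        if len(signal_data_dic[s]) != len(conditions):
            return False
        if len(temp_data_dic[s]) != len(conditions):
            return False
        # Assumption: every signal array pairs with a temperature array of equal length
        for y, t in zip(signal_data_dic[s], temp_data_dic[s]):
            if len(y) != len(t):
                return False
    return True
```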
- pychemelt.utils.files.load_csv_file(file)[source]#
Load a CSV file containing temperature and signal columns and return structured data.
- Parameters:
file (str) – Path to the csv file
- Returns:
signal_data_dic (dict) – Dictionary mapping signal names to lists of 1D numpy arrays (one array per condition)
temp_data_dic (dict) – Dictionary mapping signal names to lists of temperature arrays corresponding to the signals
conditions (list) – List of condition names
signals (numpy.ndarray) – Array of signal name strings
- pychemelt.utils.files.load_aunty_xlsx(file_path)[source]#
Load AUNTY-format multi-sheet Excel file where each sheet is a condition.
- Parameters:
file_path (str) – Path to the AUNTY xlsx file
- pychemelt.utils.files.load_quantstudio_txt(QSfile)[source]#
Load QuantStudio TXT files (.txt) exported from QuantStudio instruments.
- Parameters:
QSfile (str) – Path to the QuantStudio txt file
- Returns:
signal_data_dic (dict) – Dictionary with signal data (key: ‘Fluorescence’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition
conditions (list) – List of condition names (well identifiers)
signals (numpy.ndarray) – Array with signal name(s)
- pychemelt.utils.files.load_thermofluor_xlsx(thermofluor_file)[source]#
Load DSF Thermofluor xls file and extract data.
- Parameters:
thermofluor_file (str) – Path to the xls file
- Returns:
signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions
- pychemelt.utils.files.load_nanoDSF_xlsx(processed_dsf_file)[source]#
Load nanotemper processed xlsx file and extract relevant data.
- Parameters:
processed_dsf_file (str) – Path to the processed xlsx file
- Returns:
signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions
signals (numpy.ndarray) – Array of signal names
- pychemelt.utils.files.load_panta_xlsx(pantaFile)[source]#
Load the xlsx file generated by a Prometheus Panta instrument.
- Parameters:
pantaFile (str) – Path to the xlsx file
- Returns:
signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions
signals (numpy.ndarray) – List of signal names, such as 330nm and 350nm
- pychemelt.utils.files.load_uncle_multi_channel(uncle_file)[source]#
Function to load the data from the UNCLE instrument.
- Parameters:
uncle_file (str) – Path to the xlsx file
- Returns:
signal_data_dic (dict) – Dictionary with signal data (keys: wavelength strings like ‘350 nm’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition
conditions (list) – List of sample names
signals (list) – List of wavelength strings
- pychemelt.utils.files.load_mx3005p_txt(filename)[source]#
Load Agilent MX3005P qPCR txt file and extract data
- Parameters:
filename (str) – Path to the MX3005P txt file. The second column has the fluorescence data, and the third column the temperature. Wells are separated by rows containing a sentence like this one: ‘Segment 2 Plateau 1 Well 1’
- Returns:
signal_data_dic (dict) – Dictionary with signal data
temp_data_dic (dict) – Dictionary with temperature data
conditions (list) – List of conditions (well numbers)
signals (numpy.ndarray) – List of signal names
- pychemelt.utils.files.detect_file_type(file)[source]#
Detect the type of file based on its extension and content.
- Parameters:
file (str) – Path to the file
- Returns:
Type of file (e.g., ‘supr’, ‘csv’, ‘prometheus’, ‘panta’, ‘uncle’, ‘mx3005p’, ‘quantstudio’, etc.) or None if unknown
- Return type:
str or None
- pychemelt.utils.files.detect_encoding(file_path)[source]#
Detect the encoding of a file by trying common encodings.
- Parameters:
file_path (str) – Path to the file
- Returns:
Detected encoding or the string ‘Unknown encoding’
- Return type:
str
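Detecting an encoding by trial can be sketched as follows. The candidate list and the function name `detect_encoding_sketch` are assumptions made for illustration; the encodings actually tried by `detect_encoding` are not listed here.

```python
def detect_encoding_sketch(file_path, candidates=("utf-8", "utf-16", "latin-1")):
    """Return the first candidate encoding that decodes the whole file."""
    for enc in candidates:
        try:
            with open(file_path, "r", encoding=enc) as fh:
                fh.read()  # decoding errors surface while reading
            return enc
        except UnicodeError:
            continue  # try the next candidate
    return "Unknown encoding"
```

Permissive encodings such as latin-1 accept any byte sequence, so in a scheme like this they belong at the end of the candidate list.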
- pychemelt.utils.files.read_jasco_thermal_ramp(file)[source]#
Read the data from a JASCO file containing a thermal ramp.
The data is given in chunks:
- Channel 1
4.94 14.93 25.08 35.02 45.03 55.07 64.99 75.04 85.08 95.03
250 -0.310564 -0.112003 0.0199744 -0.217282 -0.238716 -0.173046 0.00129784 -0.394731 -0.687165 -1.40543
- Parameters:
file (str) – Path to the JASCO thermal ramp file
- Returns:
signal_data_dic (dict) – Dictionary with signal data (key: wavelength string like ‘250 nm’)
temp_data_dic (dict) – Dictionary with temperature arrays per condition (only one condition in this case)
conditions (list) – File name as condition
wavelength_data (list) – List of wavelength strings (e.g., ‘250 nm’, ‘255 nm’, etc.)
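The chunk layout shown above (a row of temperatures followed by rows that start with a wavelength) could be parsed roughly as below. This is a sketch under assumptions: fields are taken to be whitespace-separated, `parse_jasco_chunk` is a hypothetical name, and the values are returned as plain arrays rather than the per-condition lists that the real function returns.

```python
import numpy as np


def parse_jasco_chunk(lines):
    """Parse one chunk: a temperature header row followed by wavelength rows."""
    temperatures = np.array([float(v) for v in lines[0].split()])
    signal_data_dic = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != temperatures.size + 1:
            continue  # skip rows that do not match the expected layout
        wavelength = f"{fields[0]} nm"  # e.g. '250 nm'
        signal_data_dic[wavelength] = np.array([float(v) for v in fields[1:]])
    return temperatures, signal_data_dic


# The example rows from the description above
chunk = [
    "4.94 14.93 25.08 35.02 45.03 55.07 64.99 75.04 85.08 95.03",
    "250 -0.310564 -0.112003 0.0199744 -0.217282 -0.238716 "
    "-0.173046 0.00129784 -0.394731 -0.687165 -1.40543",
]
temperatures, signal_data_dic = parse_jasco_chunk(chunk)
```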
pychemelt.utils.fitting module#
This module contains helper functions to fit unfolding data.
Author: Osvaldo Burastero
- pychemelt.utils.fitting.fit_line_robust(x, y)[source]#
Fit a line to the data using robust fitting
- Parameters:
x (array-like) – x data
y (array-like) – y data
- Returns:
m (float) – Slope of the fitted line
b (float) – Intercept of the fitted line
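The robust method used by `fit_line_robust` is not stated here. As an illustration of what robust line fitting buys, the sketch below uses the Theil-Sen estimator (median of pairwise slopes) as a stand-in technique; `theil_sen_line` is a hypothetical name, not the library's implementation.

```python
import numpy as np


def theil_sen_line(x, y):
    """Robust line fit: median of all pairwise slopes, then median intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(x.size, k=1)  # all index pairs with i < j
    dx = x[j] - x[i]
    keep = dx != 0  # ignore pairs with identical x
    slopes = (y[j] - y[i])[keep] / dx[keep]
    m = np.median(slopes)
    b = np.median(y - m * x)
    return m, b


# A perfect line y = 2x + 1 with one gross outlier
x = np.arange(10.0)
y = 2.0 * x + 1.0
y[4] += 50.0
m, b = theil_sen_line(x, y)
```

Because medians are used, the single outlier does not pull the slope or intercept away from the values of the underlying line, which an ordinary least-squares fit would not guarantee.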
- pychemelt.utils.fitting.fit_quadratic_robust(x, y)[source]#
Fit a quadratic equation to the data using robust fitting
- Parameters:
x (array-like) – x data
y (array-like) – y data
- Returns:
a (float) – Quadratic coefficient of the fitted polynomial
b (float) – Linear coefficient of the fitted polynomial
c (float) – Constant coefficient of the fitted polynomial
- pychemelt.utils.fitting.fit_exponential_robust(x, y)[source]#
Fit an exponential function to the data using robust fitting.
Notes
Temperatures should be shifted to the reference (Tref) before calling this function.
- Parameters:
x (array-like) – x data
y (array-like) – y data
- Returns:
a (float) – Baseline
c (float) – Pre-exponential factor
alpha (float) – Exponential factor
- pychemelt.utils.fitting.fit_thermal_unfolding(list_of_temperatures, list_of_signals, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, Cp, list_of_oligomer_conc=None)[source]#
Fit the thermal unfolding profile of many curves at the same time.
This performs global fitting of shared thermodynamic parameters with per-curve baselines.
- Parameters:
list_of_temperatures (list of array-like) – List of temperature arrays for each dataset
list_of_signals (list of array-like) – List of signal arrays for each dataset
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Function to calculate the signal based on the parameters
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
Cp (float) – Heat capacity change (passed to signal_fx)
list_of_oligomer_conc (list, optional) – List of oligomer concentrations for each dataset (if applicable)
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix of the fitted parameters
predicted_lst (list of numpy.ndarray) – Predicted signals for each dataset based on the fitted parameters
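The idea of global fitting with shared thermodynamic parameters and per-curve baselines can be sketched with a two-state monomer model. Everything here is an assumption made for illustration: the Gibbs-Helmholtz form of the equilibrium constant, the energy units (kcal/mol), constant baselines, and the parameter layout `[Tm, dH, aN_1, aU_1, aN_2, aU_2, ...]`. It does not reproduce the signatures expected for `signal_fx` or the baseline callables.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); the unit choice is an assumption


def equilibrium_constant(T, Tm, dH, Cp):
    """K of N <-> U from the Gibbs-Helmholtz equation (T and Tm in Kelvin)."""
    dG = dH * (1.0 - T / Tm) + Cp * (T - Tm - T * np.log(T / Tm))
    return np.exp(-dG / (R_KCAL * T))


def global_residuals(params, temperatures, signals, Cp):
    """Residuals of all curves; Tm and dH are shared, baselines are per curve."""
    Tm, dH = params[0], params[1]
    residuals = []
    for i, (T, y) in enumerate(zip(temperatures, signals)):
        a_native, a_unfolded = params[2 + 2 * i], params[3 + 2 * i]
        K = equilibrium_constant(T, Tm, dH, Cp)
        fn = 1.0 / (1.0 + K)  # fraction folded
        predicted = fn * a_native + (1.0 - fn) * a_unfolded
        residuals.append(predicted - y)
    return np.concatenate(residuals)
```

A least-squares optimizer would minimize the concatenated residual vector, so that every curve constrains the shared Tm and dH while only its own baseline parameters absorb curve-specific offsets.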
- pychemelt.utils.fitting.fit_tc_unfolding_single_slopes(list_of_temperatures, list_of_signals, denaturant_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, fit_m1=False, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#
Vectorized and optimized version of global thermal unfolding fitting.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
denaturant_concentrations (list) – Denaturant concentrations (one per dataset)
initial_parameters (array-like) – Initial guess for parameters
low_bounds (array-like) – Lower bounds for parameters
high_bounds (array-like) – Upper bounds for parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
fit_m1 (bool, optional) – Whether to fit temperature dependence of m-value
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
method (str, optional) – Optimization method (‘least_sq’ or ‘curve_fit’). Note that the default in the signature is ‘least_squares’.
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
Vectorized fitting of thermochemical unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
denaturant_concentrations (list) – Denaturant concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
fit_m1 (bool, optional) – Whether to fit temperature dependence of m-value
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
method (str, optional) – Optimization method for lmfit minimizer. Defaults to ‘least_squares’.
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object
- pychemelt.utils.fitting.fit_tc_unfolding_many_signals(list_of_temperatures, list_of_signals, signal_ids, denaturant_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, fit_m1=False, model_scale_factor=False, scale_factor_exclude_ids=[], cp_value=None, method='least_squares', fit_native_den_slope=True, fit_unfolded_den_slope=True)[source]#
Fit thermochemical unfolding curves for many signals using lmfit.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
denaturant_concentrations (list) – Denaturant concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
fit_m1 (bool, optional) – Whether to include and fit temperature dependence of the m-value (m1)
model_scale_factor (bool, optional) – If True, include a per-denaturant concentration scale factor to account for intensity differences
scale_factor_exclude_ids (list, optional) – IDs of scale factors to exclude / fix to 1
cp_value (float or None, optional) – If provided, Cp is fixed to this value and not fitted
method (str, optional) – Optimization method for lmfit minimizer. Defaults to ‘least_squares’.
fit_native_den_slope (bool, optional) – Whether to fit denaturant dependence of baselines.
fit_unfolded_den_slope (bool, optional) – Whether to fit denaturant dependence of baselines.
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object
- pychemelt.utils.fitting.fit_oligomer_unfolding_single_slopes(list_of_temperatures, list_of_signals, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, cp_value=None, tm_value=None, dh_value=None, method='least_squares')[source]#
Vectorized and optimized version of global thermal unfolding fitting of oligomers.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
oligomer_concentrations (list) – sample concentrations of the oligomeric complex (one per dataset)
initial_parameters (array-like) – Initial guess for parameters
low_bounds (array-like) – Lower bounds for parameters
high_bounds (array-like) – Upper bounds for parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object
Vectorized fitting of oligomer thermal unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – sample concentrations of the oligomeric complex for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
cp_value (float or None, optional) – Optional fixed thermodynamic parameters
tm_value (float or None, optional) – Optional fixed thermodynamic parameters
dh_value (float or None, optional) – Optional fixed thermodynamic parameters
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object
- pychemelt.utils.fitting.fit_oligomer_unfolding_many_signals(list_of_temperatures, list_of_signals, signal_ids, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, model_scale_factor=False, scale_factor_exclude_ids=[], cp_value=None, method='least_squares')[source]#
Fit thermal unfolding curves of oligomers for many signals (optimized variant).
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset
list_of_signals (list of array-like) – Signal arrays for each dataset
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – sample concentrations of the oligomeric complex for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the native state baseline
baseline_unfolded_fx (callable) – function to calculate the unfolded state baseline
model_scale_factor (bool, optional) – If True, include a per-oligomeric concentration scale factor to account for intensity differences
scale_factor_exclude_ids (list, optional) – IDs of scale factors to exclude / fix to 1
cp_value (float or None, optional) – If provided, Cp is fixed to this value and not fitted
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult) – lmfit minimization result object
minimizer (lmfit.minimizer.Minimizer) – lmfit minimizer object
- pychemelt.utils.fitting.fit_oligomer_unfolding_three_states_single_slopes(list_of_temperatures, list_of_signals, oligomer_concentrations, initial_parameters, low_bounds, high_bounds, signal_fx, baseline_native_fx, baseline_unfolded_fx, t1=None, t2=None, dh1=None, dh2=None, CpTh_value=None, method='least_squares', max_nfev=None)[source]#
Vectorized and optimized version of global thermal unfolding fitting of oligomers.
- Returns:
global_fit_params (numpy.ndarray)
cov (numpy.ndarray)
predicted_lst (list of numpy.ndarray)
result (lmfit.minimizer.MinimizerResult)
minimizer (lmfit.minimizer.Minimizer)
Note
Dear dev/user. Fitting Cp1 will probably not work in the case of monomers, given that changing Cp does not change the shape of the unfolding curve.
Vectorized fitting of oligomer thermal unfolding curves for multiple signal types sharing thermodynamic parameters and slopes, using lmfit.
- Parameters:
list_of_temperatures (list of array-like) – Temperature arrays for each dataset.
list_of_signals (list of array-like) – Signal arrays for each dataset.
signal_ids (list of int) – Signal-type id for each dataset (0..n_signals-1)
oligomer_concentrations (list) – Oligomer concentrations for each dataset (flattened across signals)
initial_parameters (array-like) – Initial guess for the parameters
low_bounds (array-like) – Lower bounds for the parameters
high_bounds (array-like) – Upper bounds for the parameters
signal_fx (callable) – Signal model function
baseline_native_fx (callable) – function to calculate the baseline for the native state
baseline_unfolded_fx (callable) – function to calculate the baseline for the unfolded state
t1 (float, optional) – Values for the unfolding temperatures one and two
t2 (float, optional) – Values for the unfolding temperatures one and two
dh1 (float, optional) – Values for the unfolding enthalpy one and two
dh2 (float, optional) – Values for the unfolding enthalpy one and two
CpTh_value (float, optional) – Value for the total Cp of the system, enabling fitting of Cp1
- Returns:
global_fit_params (numpy.ndarray) – Fitted global parameters
cov (numpy.ndarray) – Covariance matrix
predicted_lst (list of numpy.ndarray) – Predicted signals per dataset
result (lmfit.minimizer.MinimizerResult)
minimizer (lmfit.minimizer.Minimizer)
pychemelt.utils.fractions module#
This module contains helper functions to obtain the amount of folded/intermediate/unfolded (etc.) protein.
Author: Osvaldo Burastero
- pychemelt.utils.fractions.fn_two_state_monomer(K)[source]#
Given the equilibrium constant K of N <-> U, return the fraction of folded protein.
- Parameters:
K (float) – Equilibrium constant of the reaction N <-> U
- Returns:
Fraction of folded protein
- Return type:
float
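For a two-state monomer with K = [U]/[N], mass balance gives a folded fraction of 1 / (1 + K). A minimal sketch of that relation follows; the function names are illustrative and are not the library's.

```python
def fraction_folded(K):
    """Fraction of folded protein for N <-> U with K = [U]/[N]."""
    return 1.0 / (1.0 + K)


def fraction_unfolded(K):
    """Complementary fraction of unfolded protein."""
    return K / (1.0 + K)
```

At K = 1 both states are equally populated, which is the condition that defines the melting temperature in a two-state model.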
- pychemelt.utils.fractions.fu_two_state_dimer(K, C)[source]#
Given the equilibrium constant K, of N2 <-> 2U, and the concentration of dimer equivalent C, return the fraction of unfolded protein
- Parameters:
K (float) – Equilibrium constant of the reaction N2 <-> 2U
C (float) – Total concentration of the protein in dimer equivalents
- Returns:
Fraction of unfolded protein
- Return type:
float
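For the dimer case a closed form follows from mass balance. The sketch assumes K = [U]^2/[N2], with K and C in the same concentration units, and defines fu = [U]/(2C). Substituting [U] = 2·C·fu and [N2] = C·(1 − fu) gives 4·C·fu² + K·fu − K = 0, whose positive root is used below. The function name is hypothetical, and the library may use a different but equivalent expression.

```python
import math


def fu_dimer_sketch(K, C):
    """Unfolded fraction for N2 <-> 2U, with C in dimer equivalents."""
    # Positive root of 4*C*fu**2 + K*fu - K = 0
    return (-K + math.sqrt(K * K + 16.0 * C * K)) / (8.0 * C)
```

Unlike the monomer case, the result depends on the total concentration C, which is why the concentration is a required argument for the oligomer functions.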
- pychemelt.utils.fractions.fu_two_state_trimer(K, C)[source]#
Given the equilibrium constant K, of N3 <-> 3U, and the concentration of trimer equivalent C, return the fraction of unfolded protein
- Parameters:
K (float) – Equilibrium constant of the reaction N3 <-> 3U
C (float) – Total concentration of the protein in trimer equivalents
- Returns:
Fraction of unfolded protein
- Return type:
float
- pychemelt.utils.fractions.fu_two_state_tetramer(K, C)[source]#
Given the equilibrium constant K, of N4 <-> 4U, and the concentration of tetramer equivalent C, return the fraction of unfolded protein
- Parameters:
K (float) – Equilibrium constant of the reaction N4 <-> 4U
C (float) – Total concentration of the protein in tetramer equivalents
- Returns:
Fraction of unfolded protein
- Return type:
float
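The trimer and tetramer cases lead to cubic and quartic equations. Under the same assumptions as for the dimer (K = [U]^n/[N_n], C in n-mer equivalents, fu = [U]/(n·C)), mass balance gives n^n · C^(n−1) · fu^n = K · (1 − fu). The sketch below solves this numerically by bisection; it is an illustration with a hypothetical function name, not necessarily how the library computes these fractions.

```python
def fraction_unfolded_nmer(K, C, n, iterations=200):
    """Solve n**n * C**(n-1) * fu**n = K * (1 - fu) for fu in [0, 1]."""

    def balance(fu):
        # Negative at fu = 0 and positive at fu = 1 when K > 0 and C > 0
        return (n ** n) * (C ** (n - 1)) * fu ** n - K * (1.0 - fu)

    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if balance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

The balance function increases monotonically on [0, 1], so the root is unique and bisection always converges to it.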
- pychemelt.utils.fractions.fi_three_state_tetramer_monomeric_intermediate(K1, K2, Ct)[source]#
Given the equilibrium constant K1, of N4 <-> 4I, K2, of I <-> U, and the concentration of tetramer equivalent Ct, return the fraction of intermediate
- pychemelt.utils.fractions.fi_three_state_dimer_monomeric_intermediate(K1, K2, C)[source]#
Given the equilibrium constant K1, of N2 <-> 2I, K2, of 2I <-> 2U and the concentration of dimer equivalent C, return the fraction of intermediate
- pychemelt.utils.fractions.fu_three_state_dimer_dimeric_intermediate(K1, K2, C)[source]#
Given the equilibrium constant K1, of N2 <-> I2, K2, of I2 <-> 2U and the concentration of dimer equivalent C, return the fraction of unfolded protein
- pychemelt.utils.fractions.fi_three_state_dimer_dimeric_intermediate(fu, K2, C)[source]#
Given the fraction of unfolded protein fu, the equilibrium constant K2, of I2 <-> 2U, and the concentration of dimer equivalent C, return the fraction of intermediate
- pychemelt.utils.fractions.fi_three_state_trimer_monomeric_intermediate(K1, K2, C)[source]#
Given the equilibrium constant K1, of N3 <-> 3I, K2, of 3I <-> 3U and the concentration of trimer equivalent C, return the fraction of intermediate
pychemelt.utils.math module#
This module contains helper functions for mathematical operations.
Author: Osvaldo Burastero
- pychemelt.utils.math.temperature_to_kelvin(T)[source]#
Convert temperature from Celsius to Kelvin if necessary.
- Parameters:
T (array-like) – Temperature values
- Returns:
Temperature values in Kelvin
- Return type:
array-like
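How the function decides whether a conversion is "necessary" is not specified here. One plausible heuristic, shown purely as an assumption, is to treat values below a threshold as Celsius. The threshold of 200 and the name `to_kelvin_sketch` are hypothetical.

```python
import numpy as np


def to_kelvin_sketch(T, threshold=200.0):
    """Add 273.15 only when the values look like Celsius (assumed heuristic)."""
    T = np.asarray(T, dtype=float)
    if np.max(T) < threshold:
        return T + 273.15  # values were in Celsius
    return T  # values were already in Kelvin
```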
- pychemelt.utils.math.temperature_to_celsius(T)[source]#
Convert temperature from Kelvin to Celsius if necessary.
- Parameters:
T (array-like) – Temperature values
- Returns:
Temperature values in Celsius
- Return type:
array-like
- pychemelt.utils.math.shift_temperature(T)[source]#
Shift temperature to be relative to Tref_cst in Kelvin.
- Parameters:
T (array-like) – Temperature values
- Returns:
Shifted temperature values
- Return type:
array-like
- pychemelt.utils.math.constant_baseline(dt, d, den_slope, a, *args)[source]#
Baseline function with no dependence on temperature and dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature, not used here but required for compatibility with other baseline functions
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.linear_baseline(dt, d, den_slope, a, b, *args)[source]#
Baseline function with linear dependence on temperature and linear dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.quadratic_baseline(dt, d, den_slope, a, b, c)[source]#
Baseline function with quadratic dependence on temperature and linear dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
c (float) – quadratic dependence of signal on temperature
- Returns:
Baseline signal
- Return type:
float
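The constant, linear and quadratic baselines share a call pattern: the leading arguments are identical and, in the constant and linear variants, `*args` absorbs coefficients that are not needed. The sketch below assumes an additive form (intercept plus temperature terms plus `den_slope * d`). That form and the `_sketch` names are assumptions made for illustration and may differ from the library's implementation.

```python
def constant_baseline_sketch(dt, d, den_slope, a, *args):
    # dt is accepted for a uniform call signature but not used
    return a + den_slope * d


def linear_baseline_sketch(dt, d, den_slope, a, b, *args):
    return a + b * dt + den_slope * d


def quadratic_baseline_sketch(dt, d, den_slope, a, b, c):
    return a + b * dt + c * dt ** 2 + den_slope * d


# The same argument tuple can be passed to any of the three shapes
args = (2.0, 1.0, 0.5, 2.0, 3.0, 1.0)  # dt, d, den_slope, a, b, c
values = [f(*args) for f in (constant_baseline_sketch,
                             linear_baseline_sketch,
                             quadratic_baseline_sketch)]
```

Keeping the signatures compatible lets a fitting routine swap baseline models without changing how it builds the argument list.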
- pychemelt.utils.math.exponential_baseline(dt, d, den_slope, a, c, alpha)[source]#
Baseline function with exponential dependence on temperature and linear dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
d (float) – denaturant concentration
den_slope (float) – linear dependence of signal on denaturant concentration
a (float) – intercept of the baseline
c (float) – pre-exponential factor for the dependence on temperature
alpha (float) – exponential coefficient for the dependence on temperature
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.constant_baseline_only_temp(dt, a, *args)[source]#
Baseline function with no dependence on temperature or denaturant concentration
- Parameters:
dt (float) – delta temperature, not used here but required for compatibility with other baseline functions
a (float) – intercept of the baseline
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.linear_baseline_only_temp(dt, a, b, *args)[source]#
Baseline function with linear dependence on temperature and no dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.quadratic_baseline_only_temp(dt, a, b, c)[source]#
Baseline function with quadratic dependence on temperature and no dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
a (float) – intercept of the baseline
b (float) – linear dependence of signal on temperature
c (float) – quadratic dependence of signal on temperature
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.exponential_baseline_only_temp(dt, a, c, alpha)[source]#
Baseline function with exponential dependence on temperature and no dependence on denaturant concentration
- Parameters:
dt (float) – delta temperature
a (float) – intercept of the baseline
c (float) – pre-exponential factor for the dependence on temperature
alpha (float) – exponential coefficient for the dependence on temperature
- Returns:
Baseline signal
- Return type:
float
- pychemelt.utils.math.is_evenly_spaced(x, tol=0.0001)[source]#
Check if x is evenly spaced within a given tolerance.
- Parameters:
x (array-like) – x data
tol (float, optional) – Tolerance for considering spacing equal (default: 1e-4)
- Returns:
True if x is evenly spaced, False otherwise
- Return type:
bool
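A check of this kind can be sketched with `numpy.diff`. Comparing every step against the first step is an assumption about how the tolerance is applied, and the function name is hypothetical.

```python
import numpy as np


def is_evenly_spaced_sketch(x, tol=1e-4):
    """True if all consecutive differences agree with the first one within tol."""
    steps = np.diff(np.asarray(x, dtype=float))
    return bool(np.all(np.abs(steps - steps[0]) <= tol))
```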
- pychemelt.utils.math.first_derivative_savgol(x, y, window_length=5, polyorder=4)[source]#
Estimate the first derivative using Savitzky-Golay filtering.
- Parameters:
x (array-like) – x data (must be evenly spaced)
y (array-like) – y data
window_length (int, optional) – Length of the filter window, in temperature units (default: 5)
polyorder (int, optional) – Order of the polynomial used to fit the samples (default: 4)
- Returns:
First derivative of y with respect to x
- Return type:
numpy.ndarray
Notes
This function will raise a ValueError if x is not evenly spaced.
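A Savitzky-Golay derivative can be sketched with `scipy.signal.savgol_filter`. Converting the window from temperature units to an odd number of points, and enlarging it so that it exceeds the polynomial order, are assumptions made here; the library's exact handling may differ.

```python
import numpy as np
from scipy.signal import savgol_filter


def first_derivative_sketch(x, y, window_length=5, polyorder=4):
    """First derivative dy/dx by Savitzky-Golay filtering on an even grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = np.diff(x)
    if not np.allclose(dx, dx[0], atol=1e-4):
        raise ValueError("x must be evenly spaced")
    step = dx[0]
    # Convert the window from temperature units to an odd number of points
    n_points = int(np.ceil(window_length / step))
    if n_points % 2 == 0:
        n_points += 1
    # The window must contain more points than the polynomial order
    min_points = polyorder + 1 if polyorder % 2 == 0 else polyorder + 2
    n_points = max(n_points, min_points)
    return savgol_filter(y, n_points, polyorder, deriv=1, delta=step)
```

Passing `delta=step` makes the filter return the derivative with respect to x rather than with respect to the sample index.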
- pychemelt.utils.math.relative_errors(params, cov)[source]#
Calculate the relative errors of the fitted parameters.
- Parameters:
params (numpy.ndarray) – Fitted parameters
cov (numpy.ndarray) – Covariance matrix of the fitted parameters
- Returns:
Relative errors of the fitted parameters (in percent)
- Return type:
numpy.ndarray
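A common way to obtain relative errors from a covariance matrix is to take the square root of its diagonal (the standard errors) and divide by the parameter values. The sketch assumes that definition and uses the absolute value of each parameter, which is also an assumption.

```python
import numpy as np


def relative_errors_sketch(params, cov):
    """Standard errors as a percentage of the absolute parameter values."""
    params = np.asarray(params, dtype=float)
    std_errors = np.sqrt(np.diag(cov))
    return 100.0 * std_errors / np.abs(params)
```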
- pychemelt.utils.math.find_line_outliers(m, b, x, y, sigma=2.5)[source]#
Find outliers in a linear fit using the sigma rule.
- Parameters:
m (float) – Slope of the line
b (float) – Intercept of the line
x (array-like) – x data
y (array-like) – y data
sigma (float, optional) – Number of standard deviations to use for outlier detection (default: 2.5)
- Returns:
Indices of the outliers
- Return type:
numpy.ndarray
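The sigma rule can be sketched as flagging points whose residual lies more than `sigma` standard deviations from the mean residual. Whether the library centres the residuals on their mean is an assumption, and the function name is hypothetical.

```python
import numpy as np


def find_line_outliers_sketch(m, b, x, y, sigma=2.5):
    """Indices of points whose residual deviates by more than sigma * std."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (m * x + b)
    threshold = sigma * np.std(residuals)
    return np.where(np.abs(residuals - np.mean(residuals)) > threshold)[0]
```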
- pychemelt.utils.math.get_rss(y, y_fit)[source]#
Compute the residual sum of squares.
- Parameters:
y (array-like) – Observed values
y_fit (array-like) – Fitted values
- Returns:
Residual sum of squares
- Return type:
float
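The residual sum of squares is the sum of squared differences between observed and fitted values. A minimal sketch with a hypothetical function name:

```python
import numpy as np


def rss_sketch(y, y_fit):
    """Residual sum of squares: sum over (y - y_fit)**2."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    return float(np.sum((y - y_fit) ** 2))
```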
pychemelt.utils.palette module#
Viridis color palette.
A perceptually uniform color map that is readable by those with colorblindness. Contains hex color values transitioning from dark purple to yellow.
pychemelt.utils.plotting module#
- class pychemelt.utils.plotting.PlotConfig(width: int = 1000, height: int = 800, type: str = 'png', font_size: int = 16, marker_size: int = 8, line_width: int = 3)[source]#
Bases: object
General plot configuration
- width: int = 1000#
- height: int = 800#
- type: str = 'png'#
- font_size: int = 16#
- marker_size: int = 8#
- line_width: int = 3#
- __init__(width: int = 1000, height: int = 800, type: str = 'png', font_size: int = 16, marker_size: int = 8, line_width: int = 3) None#
- class pychemelt.utils.plotting.AxisConfig(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 5, linewidth: int = 1, tickwidth: int = 1, ticklen: int = 5, gridwidth: int = 1)[source]#
Bases: object
Axis styling configuration
- showgrid_x: bool = True#
- showgrid_y: bool = True#
- n_y_axis_ticks: int = 5#
- linewidth: int = 1#
- tickwidth: int = 1#
- ticklen: int = 5#
- gridwidth: int = 1#
- __init__(showgrid_x: bool = True, showgrid_y: bool = True, n_y_axis_ticks: int = 5, linewidth: int = 1, tickwidth: int = 1, ticklen: int = 5, gridwidth: int = 1) None#
- class pychemelt.utils.plotting.LayoutConfig(show_subplot_titles: bool = False, vertical_spacing: float = 0.1)[source]#
Bases: object
Layout and spacing configuration
- show_subplot_titles: bool = False#
- vertical_spacing: float = 0.1#
- __init__(show_subplot_titles: bool = False, vertical_spacing: float = 0.1) None#
- class pychemelt.utils.plotting.LegendConfig[source]#
Bases: object
Legend and labeling configuration
- color_bar_length = 0.4#
- color_bar_orientation = 'v'#
- color_bar_x_pos = 1.05#
- color_bar_y_pos = 0.5#
- __init__() None#
- pychemelt.utils.plotting.config_fig(fig, plot_width=800, plot_height=600, plot_type='png', plot_title_for_download='plot')[source]#
Configure plotly figure with download options and toolbar settings.
- Parameters:
fig (go.Figure) – Plotly figure object
plot_width (int, default 800) – Width of the plot in pixels
plot_height (int, default 600) – Height of the plot in pixels
plot_type (str, default "png") – Format for downloading the plot (e.g., “png”, “jpeg”)
plot_title_for_download (str, default "plot") – Title for the downloaded plot file
- Returns:
Configured plotly figure
- Return type:
go.Figure
- pychemelt.utils.plotting.plot_unfolding(pychemelt_sample, plot_derivative=False, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None, legend_config: LegendConfig = None)[source]#
Plot the unfolding curves, including the signal and the predicted curves
- Parameters:
pychemelt_sample – pychemelt.Sample object
plot_derivative (bool) – Whether to plot the derivative of the signal
plot_config (PlotConfig, optional) – Configuration for the overall plot
axis_config (AxisConfig, optional) – Configuration for the axes
layout_config (LayoutConfig, optional) – Configuration for the layout
legend_config (LegendConfig, optional) – Configuration for the legend
- pychemelt.utils.plotting.plot_baselines(pychemelt_sample, plot_config: PlotConfig = None, axis_config: AxisConfig = None, layout_config: LayoutConfig = None, legend_config: LegendConfig = None)[source]#
Plot the fitted native and unfolded baseline curves on the data
- Parameters:
pychemelt_sample – pychemelt.Sample object
plot_config (PlotConfig, optional) – Configuration for the overall plot
axis_config (AxisConfig, optional) – Configuration for the axes
layout_config (LayoutConfig, optional) – Configuration for the layout
legend_config (LegendConfig, optional) – Configuration for the legend
pychemelt.utils.processing module#
This module contains helper functions to process data.
Author: Osvaldo Burastero
- pychemelt.utils.processing.set_param_bounds(p0, param_names)[source]#
Generate heuristic lower and upper bounds for fitting parameters based on initial guesses.
- Parameters:
p0 (array-like) – Initial parameter guesses.
param_names (list of str) – Names of the parameters to apply specific logic (e.g., non-negative constraints).
- Returns:
(low_bounds, high_bounds) as lists of numeric values.
- Return type:
tuple
- pychemelt.utils.processing.expand_temperature_list(temp_lst, signal_lst)[source]#
Expand the temperature list to match the length of the signal list.
- Parameters:
temp_lst (list) – List of temperatures
signal_lst (list) – List of signals
- Returns:
Expanded temperature list
- Return type:
list
- pychemelt.utils.processing.clean_conditions_labels(conditions)[source]#
Clean the conditions labels by removing unwanted characters and patterns.
- Parameters:
conditions (list) – List of condition strings.
- Returns:
List of cleaned condition strings.
- Return type:
list
- pychemelt.utils.processing.subset_signal_by_temperature(signal_lst, temp_lst, min_temp, max_temp)[source]#
Subset the signal and temperature lists based on the specified temperature range.
- Parameters:
signal_lst (list) – List of signal arrays.
temp_lst (list) – List of temperature arrays.
min_temp (float) – Minimum temperature for subsetting.
max_temp (float) – Maximum temperature for subsetting.
- Returns:
Tuple containing the subsetted signal and temperature lists.
- Return type:
tuple
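The subsetting described above can be sketched with a boolean mask per curve. This is a minimal illustration, not the library's implementation: the function name `subset_by_temperature_sketch` is hypothetical, and treating both bounds as inclusive is an assumption.

```python
import numpy as np

def subset_by_temperature_sketch(signal_lst, temp_lst, min_temp, max_temp):
    """Keep only the points whose temperature lies in [min_temp, max_temp]."""
    sub_signals, sub_temps = [], []
    for signal, temp in zip(signal_lst, temp_lst):
        signal = np.asarray(signal)
        temp = np.asarray(temp)
        # Inclusive bounds are an assumption of this sketch
        mask = (temp >= min_temp) & (temp <= max_temp)
        sub_signals.append(signal[mask])
        sub_temps.append(temp[mask])
    return sub_signals, sub_temps

temps = [np.array([20.0, 30.0, 40.0, 50.0])]
signals = [np.array([1.0, 2.0, 3.0, 4.0])]
sub_signals, sub_temps = subset_by_temperature_sketch(signals, temps, 25, 45)
```

Each signal array is filtered with the mask of its own temperature array, so the two returned lists stay aligned point by point.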
- pychemelt.utils.processing.guess_Tm_from_derivative(temp_lst, deriv_lst, x1, x2)[source]#
Estimate the melting temperature (Tm) by finding the extremum of the first derivative.
- Parameters:
temp_lst (list of np.ndarray) – Temperature arrays for each dataset.
deriv_lst (list of np.ndarray) – First derivative of the signal for each dataset.
x1 (float) – Lower buffer from the temperature edges to exclude noise/artifacts.
x2 (float) – Upper buffer from the temperature edges to define the baseline median window.
- Returns:
Estimated Tm values for each dataset.
- Return type:
list of float
- pychemelt.utils.processing.estimate_signal_baseline_params(signal_lst, temp_lst, native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12, oligomer_number=1)[source]#
Estimate the baseline parameters for the sample
- Parameters:
signal_lst (list of np.ndarray) – List of signal arrays
temp_lst (list of np.ndarray) – List of temperature arrays
window_range_native (float) – Range of the temperature window to estimate the native state baseline
window_range_unfolded (float) – Range of the temperature window to estimate the unfolded state baseline
native_baseline_type (str) – options: ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
unfolded_baseline_type (str) – options: ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
oligomer_number (int) – number of subunits in the oligomer
- Returns:
Lists of estimated parameters (p1Ns, p1Us, p2Ns, p2Us, p3Ns, p3Us).
- Return type:
tuple
- pychemelt.utils.processing.fit_local_thermal_unfolding_to_signal_lst(signal_lst, temp_lst, t_melting_init, p1_Ns, p1_Us, p2_Ns, p2_Us, p3_Ns, p3_Us, baseline_native_fx, baseline_unfolded_fx)[source]#
Perform individual (local) fits for each signal curve in a list.
- Parameters:
signal_lst (list of np.ndarray) – List of signals.
temp_lst (list of np.ndarray) – List of temperatures.
t_melting_init (list of float) – Initial Tm guesses.
p1_Ns (list of float) – Estimated baseline parameters for each curve.
p1_Us (list of float) – Estimated baseline parameters for each curve.
p2_Ns (list of float) – Estimated baseline parameters for each curve.
p2_Us (list of float) – Estimated baseline parameters for each curve.
p3_Ns (list of float) – Estimated baseline parameters for each curve.
p3_Us (list of float) – Estimated baseline parameters for each curve.
baseline_native_fx (callable) – Function to calculate the native baseline.
baseline_unfolded_fx (callable) – Function to calculate the unfolded baseline.
- Returns:
(Tms, dHs, predicted_lst) containing fitted parameters and signal arrays.
- Return type:
tuple
- pychemelt.utils.processing.re_arrange_predictions(predicted_lst, n_signals, n_denaturants)[source]#
Re-arrange the flattened predictions to match the original signal list with sublists.
- Parameters:
predicted_lst (list) – Flattened list of predicted signals of length n_signals * n_denaturants.
n_signals (int) – Number of signal types (e.g., different wavelengths).
n_denaturants (int) – Number of denaturant concentrations or conditions per signal.
- Returns:
Re-arranged list of predicted signals of length n_signals, where each element is a sublist of length n_denaturants.
- Return type:
list
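As an illustration of the re-arrangement described above, the sketch below slices a flat list into consecutive chunks. The name `re_arrange_sketch` is hypothetical, and the signal-major ordering (all conditions of the first signal, then all conditions of the second) is an assumption about how the flattened list is laid out.

```python
def re_arrange_sketch(predicted_lst, n_signals, n_denaturants):
    """Split a flat list into n_signals sublists of length n_denaturants."""
    if len(predicted_lst) != n_signals * n_denaturants:
        raise ValueError("Length must equal n_signals * n_denaturants")
    # Assumed layout: consecutive blocks of n_denaturants entries per signal
    return [
        predicted_lst[i * n_denaturants:(i + 1) * n_denaturants]
        for i in range(n_signals)
    ]

flat = ["s0_d0", "s0_d1", "s0_d2", "s1_d0", "s1_d1", "s1_d2"]
nested = re_arrange_sketch(flat, n_signals=2, n_denaturants=3)
```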
- pychemelt.utils.processing.re_arrange_params(params, n_signals)[source]#
Re-arrange flattened parameters into a list of sublists grouped by signal.
- Parameters:
params (list or np.ndarray) – Flattened list of parameters.
n_signals (int) – Number of signal types to group parameters by.
- Returns:
Re-arranged list of parameters of length n_signals containing parameter arrays for each signal.
- Return type:
list of np.ndarray
- pychemelt.utils.processing.subset_data(data, max_points)[source]#
Reduces the number of data points by repeated striding until the size is below a threshold.
- Parameters:
data (np.ndarray) – Input data array to be subsetted.
max_points (int) – The maximum number of points allowed in the resulting array.
- Returns:
Subsetted data array containing every $2^n$-th point of the original.
- Return type:
np.ndarray
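Repeated striding, as described above, can be sketched as a loop that keeps every second point until the array is small enough. The name `subset_data_sketch` is hypothetical, and stopping once the length is at or below `max_points` is an assumption about the exact threshold rule.

```python
import numpy as np

def subset_data_sketch(data, max_points):
    """Halve the number of points (stride 2) until at most max_points remain."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    data = np.asarray(data)
    while data.shape[0] > max_points:
        data = data[::2]  # after n passes, every 2**n-th original point is left
    return data

thinned = subset_data_sketch(np.arange(100), max_points=30)
```

With 100 points and a limit of 30, two passes are needed (100, then 50, then 25), so the result holds every fourth point of the input.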
- pychemelt.utils.processing.get_colors_from_numeric_values(values, min_val, max_val, use_log_scale=False)[source]#
Map numeric values to colors in the VIRIDIS palette based on a specified range.
- Parameters:
values (list or np.ndarray) – Numeric values to map to colors.
min_val (float) – Minimum value of the range.
max_val (float) – Maximum value of the range.
use_log_scale (bool, optional) – Whether to use logarithmic scaling for the values, default is False.
- Returns:
List of hex color codes corresponding to the input values.
- Return type:
list
- pychemelt.utils.processing.combine_sequences(seq1, seq2)[source]#
Combine two sequences to generate all possible combinations of their elements.
- Parameters:
seq1 (list) – First sequence of elements.
seq2 (list) – Second sequence of elements.
- Returns:
A list of tuples, where each tuple contains one element from seq1 and one from seq2.
- Return type:
list
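The output described above matches a Cartesian product, which the standard library provides directly. This is a sketch under that assumption: the name `combine_sequences_sketch` is hypothetical, and the ordering of the returned tuples is that of `itertools.product`, which may differ from the library's.

```python
from itertools import product

def combine_sequences_sketch(seq1, seq2):
    """Return every (a, b) pair with a taken from seq1 and b from seq2."""
    return list(product(seq1, seq2))

pairs = combine_sequences_sketch(["350nm", "330nm"], [0.0, 1.0])
```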
- pychemelt.utils.processing.adjust_value_to_interval(value, lower_bound, upper_bound, shift)[source]#
Verify that a value is within the specified bounds. If the value is outside the bounds, adjust it to the nearest bound.
- Parameters:
value (float) – The value to be adjusted.
lower_bound (float) – The lower bound of the interval.
upper_bound (float) – The upper bound of the interval.
shift (float) – How much to shift the value if it is outside the bounds.
- pychemelt.utils.processing.oligomer_number(model)[source]#
Get the number of subunits in the oligomer based on the model.
- Returns:
The number of subunits (2 for ‘Dimer’, 3 for ‘Trimer’, 4 for ‘Tetramer’, 1 otherwise).
- Return type:
int
- pychemelt.utils.processing.parse_number(s)[source]#
Parse a string as a float, handling European decimal commas, optional thousands separators, and the standard decimal point.
- Parameters:
s (str) – The string to parse
- Returns:
The parsed number
- Return type:
float
- Raises:
ValueError – If the string cannot be parsed as a float
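One way to handle the three cases listed above is to decide which separator is the decimal mark before calling `float`. The sketch below is an illustration only: `parse_number_sketch` is a hypothetical name, and the rule "when both separators appear, the right-most one is the decimal mark" is an assumption, not the documented behaviour. A lone comma is read as a decimal comma, so an input such as "1,234" is ambiguous and is read here as 1.234.

```python
def parse_number_sketch(s):
    """Parse '12,5', '1.234,5', '1,234.5' or '3.14' as a float."""
    text = s.strip().replace(" ", "")
    if "," in text and "." in text:
        # Assumption: the right-most separator is the decimal mark
        if text.rfind(",") > text.rfind("."):
            text = text.replace(".", "").replace(",", ".")  # 1.234,5 -> 1234.5
        else:
            text = text.replace(",", "")                    # 1,234.5 -> 1234.5
    elif "," in text:
        text = text.replace(",", ".")                       # 12,5 -> 12.5
    return float(text)  # raises ValueError for non-numeric input

parsed = [parse_number_sketch(x) for x in ["12,5", "1.234,5", "1,234.5", "3.14"]]
```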
- pychemelt.utils.processing.are_all_strings_numeric(lst)[source]#
- Parameters:
lst (list of str) – List of strings to check
- Returns:
True if all strings in the list are numeric (can contain digits, ‘.’, ‘-’, ‘,’), False otherwise
- Return type:
bool
- pychemelt.utils.processing.transform_to_list(element_or_list)[source]#
- Parameters:
element_or_list (bool, str, int, float, list, or numpy array) – The input element or list to be transformed into a list.
- Returns:
A list containing the input element if it is not already a list, or the input itself if it is None, a numpy array, or a list.
- Return type:
list or None
- Raises:
ValueError – If the input is not a boolean, string, integer, float, list, or numpy array
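The behaviour described above can be sketched with a few type checks. The name `transform_to_list_sketch` is hypothetical and the error message is illustrative. The branches follow the description: scalars are wrapped, while lists, numpy arrays, and None are returned unchanged.

```python
import numpy as np

def transform_to_list_sketch(element_or_list):
    """Wrap a scalar in a list; pass lists, arrays and None through unchanged."""
    if element_or_list is None or isinstance(element_or_list, (list, np.ndarray)):
        return element_or_list
    if isinstance(element_or_list, (bool, str, int, float)):
        return [element_or_list]
    raise ValueError("Unsupported input type: " + type(element_or_list).__name__)

wrapped = transform_to_list_sketch("350nm")
untouched = transform_to_list_sketch([1, 2])
```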
- pychemelt.utils.processing.ci_dict_to_summary_df(ci_dict, percentage=0.95)[source]#
Convert lmfit confidence interval dictionary into a summary DataFrame.
- Parameters:
ci_dict (dict) – Dictionary containing confidence intervals for fitted parameters, typically in the format returned by lmfit.
- Returns:
DataFrame summarizing the confidence intervals for each parameter, with columns:
Parameter – Name of the fitted parameter
Lower_CI – Lower bound of the confidence interval
Value – Best-fit value of the parameter
Upper_CI – Upper bound of the confidence interval
- Return type:
pd.DataFrame
pychemelt.utils.rates module#
This module contains helper functions to obtain equilibrium constants.
Author: Osvaldo Burastero
- Useful references for unfolding models:
Rumfeldt, Jessica AO, et al. “Conformational stability and folding mechanisms of dimeric proteins.” Progress in biophysics and molecular biology 98.1 (2008): 61-84.
Bedouelle, Hugues. “Principles and equations for measuring and interpreting protein stability: From monomer to tetramer.” Biochimie 121 (2016): 29-37.
Mazurenko, Stanislav, et al. “Exploration of protein unfolding by modelling calorimetry data from reheating.” Scientific reports 7.1 (2017): 16321.
All thermodynamic parameters are expressed in kcal/mol-based units (for example, enthalpies in kcal/mol and heat capacities in kcal/mol/K).
Unfolding functions for monomers have an argument called ‘extra_arg’ that is not used. It is present because unfolding functions for oligomers require the protein concentration in that position.
- pychemelt.utils.rates.eq_constant_thermo(T, DH1, T1, Cp)[source]#
T1 is the temperature at which ΔG(T) = 0, DH1 is the variation of enthalpy between the two considered states at T1, and Cp is the variation of heat capacity between the two states.
- Parameters:
T (array-like) – Temperature (Kelvin)
DH1 (float) – Variation of enthalpy between the two considered states at T1 (kcal/mol)
T1 (float) – Temperature at which the equilibrium constant equals one (Kelvin)
Cp (float) – Variation of heat capacity between the two states (kcal/mol/K)
- Returns:
Equilibrium constant at the given temperature
- Return type:
numpy.ndarray
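For readers who want to see how such a constant can be computed, the sketch below uses the standard Gibbs-Helmholtz expression with a temperature-independent heat capacity change, ΔG(T) = ΔH1·(1 − T/T1) + Cp·(T − T1 − T·ln(T/T1)), and K = exp(−ΔG/(R·T)). It is an assumption that the library uses exactly this form. The names `eq_constant_sketch` and `R_KCAL` are hypothetical.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def eq_constant_sketch(T, DH1, T1, Cp):
    """Equilibrium constant from the Gibbs-Helmholtz equation (T and T1 in Kelvin)."""
    T = np.asarray(T, dtype=float)
    # Free energy difference between the two states at temperature T (kcal/mol)
    dG = DH1 * (1 - T / T1) + Cp * (T - T1 - T * np.log(T / T1))
    return np.exp(-dG / (R_KCAL * T))

K = eq_constant_sketch(np.array([320.0, 330.0, 340.0]), DH1=100.0, T1=330.0, Cp=0.0)
```

By construction the constant equals one at T1, and for a positive DH1 it is below one at lower temperatures and above one at higher temperatures.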
- pychemelt.utils.rates.eq_constant_termochem(T, D, DHm, Tm, Cp0, m0, m1)[source]#
Ref: Louise Hamborg et al., 2020. Global analysis of protein stability by temperature and chemical denaturation
- Parameters:
T (array-like) – Temperature (Kelvin only!)
D (float) – Denaturant concentration (M)
DHm (float) – Enthalpy change at Tm (kcal/mol)
Tm (float) – Melting temperature where ΔG = 0 (Kelvin only!)
Cp0 (float) – Heat capacity change (kcal/mol/K)
m0 (float) – m-value at the reference temperature
m1 (float) – Temperature dependence of the m-value
- Returns:
Equilibrium constant at a certain temperature and denaturant agent concentration
- Return type:
numpy.ndarray
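The parameter list above suggests a free energy with a thermal term plus a denaturant term whose m-value changes linearly with temperature. The sketch below writes that out as ΔG(T, D) = ΔHm·(1 − T/Tm) + Cp0·(T − Tm − T·ln(T/Tm)) − (m0 + m1·(T − Tref))·D. This functional form, the `T_ref` argument and its default of 298.15 K, and the name `eq_constant_tc_sketch` are all assumptions made for illustration. Consult the cited reference and the source code for the expression actually used.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def eq_constant_tc_sketch(T, D, DHm, Tm, Cp0, m0, m1, T_ref=298.15):
    """Equilibrium constant for combined thermal and chemical denaturation."""
    T = np.asarray(T, dtype=float)
    dG_thermal = DHm * (1 - T / Tm) + Cp0 * (T - Tm - T * np.log(T / Tm))
    m_value = m0 + m1 * (T - T_ref)  # assumed linear temperature dependence
    dG = dG_thermal - m_value * D    # denaturant lowers the unfolding free energy
    return np.exp(-dG / (R_KCAL * T))

T = np.array([333.15])
K_buffer = eq_constant_tc_sketch(T, 0.0, DHm=100.0, Tm=333.15, Cp0=1.5, m0=2.0, m1=0.0)
K_denaturant = eq_constant_tc_sketch(T, 2.0, DHm=100.0, Tm=333.15, Cp0=1.5, m0=2.0, m1=0.0)
```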
pychemelt.utils.signals module#
This module contains helper functions to obtain the signal, given certain parameters.
Author: Osvaldo Burastero
Note: One could move dT outside the signal functions but the speedup was not significant.
- pychemelt.utils.signals.signal_two_state_tc_unfolding(T, D, DHm, Tm, Cp0, m0, m1, p1_N, p2_N, p3_N, p4_N, p1_U, p2_U, p3_U, p4_U, baseline_N_fx, baseline_U_fx, extra_arg=None)[source]#
Ref: Louise Hamborg et al., 2020. Global analysis of protein stability by temperature and chemical denaturation
- Parameters:
T (array-like) – Temperature in Kelvin units
D (array-like) – Denaturant agent concentration
DHm (float) – Variation of enthalpy between the two considered states at Tm
Tm (float) – Temperature at which the equilibrium constant equals one, in Kelvin units
Cp0 (float) – Variation of heat capacity between the two states
m0 (float) – m-value at the reference temperature (Tref)
m1 (float) – Variation of m-value with temperature
p1_N (float) – parameters describing the native-state baseline
p2_N (float) – parameters describing the native-state baseline
p3_N (float) – parameters describing the native-state baseline
p4_N (float) – parameters describing the native-state baseline
p1_U (float) – parameters describing the unfolded-state baseline
p2_U (float) – parameters describing the unfolded-state baseline
p3_U (float) – parameters describing the unfolded-state baseline
p4_U (float) – parameters describing the unfolded-state baseline
baseline_N_fx (function) – for the native-state baseline
baseline_U_fx (function) – for the unfolded-state baseline
extra_arg (None, optional) – Not used but present for API compatibility with oligomeric models
- Returns:
Signal at the given temperatures and denaturant agent concentration, given the parameters
- Return type:
numpy.ndarray
- pychemelt.utils.signals.signal_two_state_t_unfolding(T, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0, extra_arg=None)[source]#
Two-state temperature unfolding (monomer).
- Parameters:
T (array-like) – Temperature
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
extra_arg (None, optional) – Not used but present for compatibility
- Returns:
Signal at the given temperatures, given the parameters
- Return type:
numpy.ndarray
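A two-state signal of this kind is commonly built as a population-weighted sum of the two baselines. The sketch below illustrates that idea under stated assumptions: the baseline callables take `(T, p1, p2, p3)`, which may not match the library's baseline signature; the equilibrium constant follows the Gibbs-Helmholtz form; and the names `linear_baseline` and `two_state_signal_sketch` are hypothetical.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def linear_baseline(T, p1, p2, p3):
    # Hypothetical baseline: intercept p1 plus slope p2 * T (p3 unused here)
    return p1 + p2 * T

def two_state_signal_sketch(T, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U,
                            baseline_N_fx, baseline_U_fx, Cp=0):
    """Signal = fN * native baseline + fU * unfolded baseline (T in Kelvin)."""
    T = np.asarray(T, dtype=float)
    dG = dHm * (1 - T / Tm) + Cp * (T - Tm - T * np.log(T / Tm))
    K = np.exp(-dG / (R_KCAL * T))
    fU = K / (1 + K)  # fraction unfolded for N <=> U
    native = baseline_N_fx(T, p1_N, p2_N, p3_N)
    unfolded = baseline_U_fx(T, p1_U, p2_U, p3_U)
    return (1 - fU) * native + fU * unfolded

T = np.array([308.15, 328.15, 348.15])
signal = two_state_signal_sketch(T, 328.15, 100.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0,
                                 linear_baseline, linear_baseline)
```

With flat baselines at 1 and 2, the curve sits near 1 well below Tm, is exactly halfway at Tm, and approaches 2 well above it.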
- pychemelt.utils.signals.two_state_thermal_unfold_curve(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#
Two-state temperature unfolding (monomer). N ⇔ U
- Parameters:
T (array-like) – Temperature
C (array-like) – Oligomer sample concentration - only for compatibility with oligomeric models, not used in the monomeric model
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
- Returns:
Signal at the given temperatures, given the parameters
- Return type:
numpy.ndarray
- pychemelt.utils.signals.two_state_thermal_unfold_curve_dimer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#
Two-state temperature unfolding (dimer). N2 ⇔ 2U
- Parameters:
T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
- Returns:
Signal at the given temperatures, given the parameters
- Return type:
numpy.ndarray
Notes
C is the total concentration (M) of the protein in dimer equivalent.
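For the dimer scheme N2 ⇔ 2U, the fraction unfolded depends on concentration. With C in dimer equivalents, [N2] = C·(1 − fU) and [U] = 2·C·fU, so K = 4·C·fU²/(1 − fU), which is a quadratic in fU. The sketch below solves it for the physical root. It is a textbook derivation offered for illustration, not a transcription of the library code, and `fraction_unfolded_dimer_sketch` is a hypothetical name.

```python
import numpy as np

def fraction_unfolded_dimer_sketch(K, C):
    """Fraction unfolded for N2 <=> 2U, with C the dimer-equivalent concentration (M)."""
    if C <= 0:
        raise ValueError("C must be positive")
    K = np.asarray(K, dtype=float)
    # Positive root of 4*C*fU**2 + K*fU - K = 0
    return (-K + np.sqrt(K**2 + 16 * C * K)) / (8 * C)

C = 1e-5
K = np.array([1e-7, 1e-5, 1e-3])
fu = fraction_unfolded_dimer_sketch(K, C)
```

Unlike the monomer case, the same K gives a different fraction unfolded at a different protein concentration, which is why these functions take C as an argument.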
- pychemelt.utils.signals.two_state_thermal_unfold_curve_trimer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0)[source]#
Two-state temperature unfolding (trimer). N3 ⇔ 3U
- Parameters:
T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
- Returns:
Signal at the given temperatures, given the parameters
- Return type:
numpy.ndarray
Notes
C is the total concentration (M) of the protein in trimer equivalent.
- pychemelt.utils.signals.two_state_thermal_unfold_curve_tetramer(T, C, Tm, dHm, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, Cp=0, extra_arg=None)[source]#
Two-state temperature unfolding (tetramer). N4 ⇔ 4U
- Parameters:
T (array-like) – Temperature
C (array-like) – Oligomer sample concentration
Tm (float) – Temperature at which the equilibrium constant equals one
dHm (float) – Variation of enthalpy between the two considered states at Tm
p1_N (float) – baseline parameters for the native-state baseline
p2_N (float) – baseline parameters for the native-state baseline
p3_N (float) – baseline parameters for the native-state baseline
p1_U (float) – baseline parameters for the unfolded-state baseline
p2_U (float) – baseline parameters for the unfolded-state baseline
p3_U (float) – baseline parameters for the unfolded-state baseline
baseline_N_fx (callable) – function to calculate the baseline for the native state
baseline_U_fx (callable) – function to calculate the baseline for the unfolded state
Cp (float, optional) – Variation of heat capacity between the two states (default: 0)
- Returns:
Signal at the given temperatures, given the parameters
- Return type:
numpy.ndarray
Notes
C is the total concentration (M) of the protein in tetramer equivalent.
- pychemelt.utils.signals.unfolding_curve_monomer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state reversible unfolding (monomer). N ⇔ I ⇔ U
- pychemelt.utils.signals.unfolding_curve_dimer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state unfolding with a monomeric intermediate. N2 ⇔ 2I ⇔ 2U
C is the concentration in dimer equivalent. CpTotal = Cp1 + 2*Cp2
- pychemelt.utils.signals.unfolding_curve_trimer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state unfolding with a monomeric intermediate. N3 ⇔ 3I ⇔ 3U
C is the concentration in trimer equivalent.
- pychemelt.utils.signals.unfolding_curve_tetramer_monomeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state unfolding with a monomeric intermediate. N4 ⇔ 4I ⇔ 4U
C is the concentration in tetramer equivalent.
- pychemelt.utils.signals.unfolding_curve_trimer_trimeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state unfolding with a trimeric intermediate. N3 ⇔ I3 ⇔ 3U
C is the concentration in trimer equivalent.
- pychemelt.utils.signals.unfolding_curve_dimer_dimeric_intermediate(T, C, T1, DH1, T2, DH2, p1_N, p2_N, p3_N, p1_U, p2_U, p3_U, baseline_N_fx, baseline_U_fx, bI, Cp1=0, CpTh=0)[source]#
Three-state unfolding with a dimeric intermediate. N2 ⇔ I2 ⇔ 2U
C is the molar concentration in dimer equivalent. CpTotal = Cp1 + Cp2
pychemelt.utils.svd module#
Module containing functions to perform Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) on spectral data, along with utilities for manipulating basis spectra and coefficients.
Author: Osvaldo Burastero
- pychemelt.utils.svd.apply_svd(X)[source]#
Perform Singular Value Decomposition (SVD) on the input data matrix X.
- Parameters:
X (numpy array of shape (n_wavelengths, n_measurements)) – The input data matrix to decompose.
- Returns:
explained_variance (numpy array) – The cumulative explained variance for each component.
basis_spectra (numpy array) – The left singular vectors (U matrix) representing the basis spectra.
coefficients (numpy array) – The coefficients associated with each basis spectrum.
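The decomposition described above can be sketched with `numpy.linalg.svd`. This sketch makes three assumptions: the explained variance is reported as a cumulative percentage (suggested by the default threshold of 99 used by `filter_basis_spectra`), the coefficients are the singular values multiplied into the right singular vectors, and no centering is applied. The name `apply_svd_sketch` is hypothetical.

```python
import numpy as np

def apply_svd_sketch(X):
    """Return cumulative explained variance (%), basis spectra (U) and coefficients."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained_variance = 100 * np.cumsum(s**2) / np.sum(s**2)
    coefficients = np.diag(s) @ Vt  # so that U @ coefficients reproduces X
    return explained_variance, U, coefficients

# Rank-one test matrix: 6 wavelengths x 4 measurements
wavelengths = np.linspace(0.1, 1.0, 6)
X = np.outer(np.sin(3 * wavelengths), np.array([1.0, 2.0, 3.0, 4.0]))
explained_variance, basis_spectra, coefficients = apply_svd_sketch(X)
```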
- pychemelt.utils.svd.filter_basis_spectra(explained_variance, basis_spectra_all, coefficients_all, explained_variance_threshold=99)[source]#
Filter the basis spectra and coefficients based on the explained variance threshold
- Parameters:
explained_variance (numpy array) – The cumulative explained variance for each component.
basis_spectra_all (numpy array) – The left singular vectors (U matrix) representing the basis spectra.
coefficients_all (numpy array) – The coefficients associated with each basis spectrum.
explained_variance_threshold (float, optional) – The threshold for explained variance to filter components. Default is 99.
- Returns:
basis_spectra (numpy array) – The filtered basis spectra.
coefficients (numpy array) – The filtered coefficients.
k (int) – The number of components that meet the explained variance threshold.
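A possible reading of the filtering step is to keep the smallest number of leading components whose cumulative explained variance reaches the threshold. The sketch below implements that reading. The name `filter_basis_spectra_sketch`, the use of `>=`, and the layout (basis spectra as columns, coefficients as rows) are assumptions.

```python
import numpy as np

def filter_basis_spectra_sketch(explained_variance, basis_spectra_all,
                                coefficients_all, explained_variance_threshold=99):
    """Keep the first k components needed to reach the variance threshold."""
    explained_variance = np.asarray(explained_variance)
    reached = explained_variance >= explained_variance_threshold
    # Index of the first component reaching the threshold, else keep everything
    k = int(np.argmax(reached)) + 1 if reached.any() else len(explained_variance)
    return basis_spectra_all[:, :k], coefficients_all[:k, :], k

cumulative = np.array([90.0, 99.5, 100.0])
basis_all = np.ones((4, 3))
coeffs_all = np.ones((3, 5))
basis, coeffs, k = filter_basis_spectra_sketch(cumulative, basis_all, coeffs_all)
```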
- pychemelt.utils.svd.align_basis_spectra_and_coefficients(X, basis_spectra, coefficients)[source]#
Align the basis spectra peaks to the original data
- Parameters:
X (numpy array of shape (n_samples, n_features)) – The input data matrix.
basis_spectra (numpy array) – The basis spectra obtained from SVD.
coefficients (numpy array) – The coefficients associated with each basis spectrum.
- Returns:
basis_spectra (numpy array) – The aligned basis spectra.
coefficients (numpy array) – The adjusted coefficients.
- pychemelt.utils.svd.angle_from_cathets(adjacent_leg, opposite_leg)[source]#
Calculate the angle between the hypotenuse and the adjacent leg of a right triangle.
- Parameters:
adjacent_leg (float) – Length of the adjacent leg.
opposite_leg (float) – Length of the opposite leg.
- Returns:
angle_in_radians – Angle in radians between the hypotenuse and the adjacent leg.
- Return type:
float
- pychemelt.utils.svd.get_2d_counterclockwise_rot_matrix(angle_in_radians)[source]#
Obtain the rotation matrix for a 2d coordinates system using a counterclockwise direction
- Parameters:
angle_in_radians (float) – Angle in radians for the rotation.
- Returns:
rotM – 2x2 rotation matrix.
- Return type:
numpy array
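The two helpers above correspond to standard trigonometry, sketched below with hypothetical names (`angle_from_legs_sketch`, `rot2d_ccw_sketch`). Using `arctan2` for the angle is an assumption. The usage shows one plausible way to combine them: measure the angle of a vector, then rotate by its negative to align the vector with the x axis.

```python
import numpy as np

def angle_from_legs_sketch(adjacent_leg, opposite_leg):
    """Angle (radians) between the hypotenuse and the adjacent leg."""
    return np.arctan2(opposite_leg, adjacent_leg)

def rot2d_ccw_sketch(angle_in_radians):
    """2x2 matrix rotating column vectors counterclockwise by the given angle."""
    c, s = np.cos(angle_in_radians), np.sin(angle_in_radians)
    return np.array([[c, -s], [s, c]])

quarter_turn = rot2d_ccw_sketch(np.pi / 2) @ np.array([1.0, 0.0])
angle = angle_from_legs_sketch(3.0, 3.0)
aligned = rot2d_ccw_sketch(-angle) @ np.array([3.0, 3.0])
```

A counterclockwise quarter turn maps the x unit vector onto the y unit vector, and rotating by the negative of the measured angle leaves the second vector with a zero y component.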
- pychemelt.utils.svd.get_3d_counterclockwise_rot_matrix_around_z_axis(angle_in_radians)[source]#
Obtain the rotation matrix for a 3d coordinates system around the z axis using a counterclockwise direction
- Parameters:
angle_in_radians (float) – Angle in radians for the rotation.
- Returns:
rotM – 3x3 rotation matrix.
- Return type:
numpy array
- pychemelt.utils.svd.get_3d_clockwise_rot_matrix_around_y_axis(angle_in_radians)[source]#
Obtain the rotation matrix for a 3d coordinates system around the y axis using a clockwise direction
- Parameters:
angle_in_radians (float) – Angle in radians for the rotation.
- Returns:
rotM – 3x3 rotation matrix.
- Return type:
numpy array
- pychemelt.utils.svd.rotate_two_basis_spectra(X, basis_spectra, pca_based=False)[source]#
Create a new set of basis spectra using a linear combination of the first and second basis spectra
- Parameters:
X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
basis_spectra (numpy array) – The matrix containing the set of basis spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.
- Returns:
basis_spectra_new (numpy array) – The new set of basis spectra.
coefficients (numpy array) – The new set of associated coefficients.
- pychemelt.utils.svd.rotate_three_basis_spectra(X, basis_spectra, pca_based=False)[source]#
Create a new set of basis spectra using a linear combination of the first, second and third basis spectra
- Parameters:
X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
basis_spectra (numpy array) – The matrix containing the set of basis spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.
- Returns:
basis_spectra_new (numpy array) – The new set of basis spectra.
coefficients_subset (numpy array) – The new set of associated coefficients.
- pychemelt.utils.svd.reconstruct_spectra(basis_spectra, coefficients, X=None, pca_based=False)[source]#
Reconstruct the original spectra based on the set of basis spectra and the associated coefficients
- Parameters:
basis_spectra (numpy array) – The matrix containing the set of basis spectra.
coefficients (numpy array) – The associated coefficients of each basis spectrum.
X (numpy array, optional) – Only used if pca_based is True. X is the raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
pca_based (bool, optional) – Boolean to decide if we need to extract the mean from the X raw data matrix. Default is False.
- Returns:
fitted (numpy array) – The reconstructed matrix, which should be close to the original raw data.
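The reconstruction described above amounts to a matrix product, plus the mean spectrum when the decomposition was done on centered data. The sketch below assumes that basis spectra are stored as columns, coefficients as rows, and that the mean is taken per wavelength across measurements (axis 1). The name `reconstruct_spectra_sketch` is hypothetical.

```python
import numpy as np

def reconstruct_spectra_sketch(basis_spectra, coefficients, X=None, pca_based=False):
    """Rebuild the data matrix from basis spectra and coefficients."""
    fitted = basis_spectra @ coefficients
    if pca_based:
        if X is None:
            raise ValueError("X is required when pca_based is True")
        # Add back the per-wavelength mean removed before the decomposition
        fitted = fitted + np.mean(X, axis=1, keepdims=True)
    return fitted

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
X_centered = X - np.mean(X, axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
fitted = reconstruct_spectra_sketch(U, np.diag(s) @ Vt, X=X, pca_based=True)
```

Keeping all components reproduces the raw matrix up to rounding. Keeping only the first few gives a low-rank approximation instead.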
- pychemelt.utils.svd.explained_variance_from_orthogonal_vectors(vectors, coefficients, total_variance)[source]#
Useful to get the percentage of variance, not in the coordinate space provided by PCA/SVD, but against a different set of (rotated) vectors.
- Parameters:
vectors (numpy array) – The set of orthogonal vectors.
coefficients (numpy array) – The associated coefficients of each orthogonal vector.
total_variance (float) – The total variance of the original data (mean subtracted if we performed PCA…).
- Returns:
explained_variance – The amount of explained variance by each orthogonal vector.
- Return type:
list
- pychemelt.utils.svd.apply_pca(X)[source]#
Perform Principal Component Analysis (PCA) on the input data matrix X.
- Parameters:
X (numpy array of shape (n_wavelengths, n_measurements)) – The input data matrix to decompose.
- Returns:
cum_sum_eigenvalues (numpy array) – The cumulative explained variance for each principal component.
principal_components (numpy array) – The principal components (eigenvectors) representing the basis spectra.
coefficients (numpy array) – The coefficients associated with each principal component.
- pychemelt.utils.svd.recalc_explained_variance(basis_spectra, coefficients, X, pca_based=False)[source]#
Recalculate the explained variance of a set of basis spectra and associated coefficients
- Parameters:
basis_spectra (numpy array) – The basis spectra.
coefficients (numpy array) – The associated coefficients of each basis spectrum.
X (numpy array) – The raw data matrix of size n*m, where ‘n’ is the number of measured wavelengths and ‘m’ is the number of acquired spectra.
pca_based (bool, optional) – Boolean to decide if we need to center the matrix X. Default is False.
- Returns:
explained_variance – The cumulative explained variance for each component.
- Return type:
numpy array