pychemelt package
PyChemelt package for the analysis of chemical and thermal denaturation data.
Subpackages
- pychemelt.utils package
- Submodules
- pychemelt.utils.constants module
- pychemelt.utils.files module
- pychemelt.utils.fitting module
fit_line_robust(), fit_quadratic_robust(), fit_exponential_robust(), fit_thermal_unfolding(), fit_tc_unfolding_single_slopes(), fit_tc_unfolding_shared_slopes_many_signals(), fit_tc_unfolding_many_signals(), fit_oligomer_unfolding_single_slopes(), fit_oligomer_unfolding_shared_slopes_many_signals(), fit_oligomer_unfolding_many_signals()
- pychemelt.utils.fractions module
- pychemelt.utils.math module
temperature_to_kelvin(), temperature_to_celsius(), shift_temperature(), constant_baseline(), linear_baseline(), quadratic_baseline(), exponential_baseline(), is_evenly_spaced(), first_derivative_savgol(), relative_errors(), find_line_outliers(), get_rss(), solve_one_root_quadratic(), solve_one_root_depressed_cubic()
- pychemelt.utils.palette module
- pychemelt.utils.plotting module
- pychemelt.utils.processing module
set_param_bounds(), expand_temperature_list(), clean_conditions_labels(), subset_signal_by_temperature(), guess_Tm_from_derivative(), estimate_signal_baseline_params(), fit_local_thermal_unfolding_to_signal_lst(), re_arrange_predictions(), re_arrange_params(), subset_data(), get_colors_from_numeric_values(), combine_sequences(), adjust_value_to_interval(), oligomer_number(), parse_number(), are_all_strings_numeric(), is_float(), transform_to_list()
- pychemelt.utils.rates module
- pychemelt.utils.signals module
- pychemelt.utils.svd module
apply_svd(), filter_basis_spectra(), align_basis_spectra_and_coefficients(), angle_from_cathets(), get_2d_counterclockwise_rot_matrix(), get_3d_counterclockwise_rot_matrix_around_z_axis(), get_3d_clockwise_rot_matrix_around_y_axis(), rotate_two_basis_spectra(), rotate_three_basis_spectra(), reconstruct_spectra(), explained_variance_from_orthogonal_vectors(), apply_pca(), recalc_explained_variance()
Submodules
pychemelt.main module
Main class to handle thermal and chemical denaturation data. The current model assumes that the unfolding is reversible.
- class pychemelt.main.Sample(name='Test')
Bases: object
Class to hold, process, and fit thermal and chemical denaturation data.
This class manages multiple signal types (e.g., 350nm, 330nm, Ratio) and concentrations, providing an interface for global thermodynamic analysis under the assumption of reversible unfolding.
- Parameters:
name (str, optional) – Identifier for the sample. Default is ‘Test’.
- signal_dic
Raw signal data mapped by signal name.
- Type:
dict
- temp_dic
Temperature data mapped by signal name.
- Type:
dict
- conditions
Processed numeric values for experimental conditions (e.g., [Denaturant]).
- Type:
list of float
- labels
Original string labels for each condition.
- Type:
list of str
- signals
Names of all available signal types in the loaded files.
- Type:
list of str
- nr_signals
Number of distinct signal types selected for analysis.
- Type:
int
- single_fit_done
Flag indicating if individual dataset fits have been completed.
- Type:
bool
- global_fit_done
Flag for global thermodynamic fitting with local baselines.
- Type:
bool
- global_global_fit_done
Flag for global thermodynamics and global baseline slopes.
- Type:
bool
- global_global_global_fit_done
Flag for global thermodynamics, slopes, and intercepts.
- Type:
bool
- read_file(file)
Read the file and load the data into the sample object.
- Parameters:
file (str) – Path to the file
- Returns:
True if the file was read and loaded into the sample object
- Return type:
bool
- read_multiple_files(files)
Read multiple files and load the data into the sample object.
- Parameters:
files (list or str) – List of paths to the files (or a single path)
- Returns:
True if the files were read and loaded into the sample object
- Return type:
bool
- set_signal(signal_names)
Set the signals to be used for the analysis. Selecting several signals, such as 350nm and 330nm, allows them to be fitted globally at the same time.
- Parameters:
signal_names (list or str) – List of names of the signals to be used, e.g., ['350nm', '330nm'], or a single name
Notes
This method creates/updates the following attributes on the instance:
  - signal_lst_pre_multiple, temp_lst_pre_multiple : lists of lists
  - signal_names : list of signal name strings
  - nr_signals : int, number of signal types
- set_temperature_range(min_temp=0, max_temp=100)
Set the temperature range for the sample.
- Parameters:
min_temp (float, optional) – Minimum temperature
max_temp (float, optional) – Maximum temperature
- set_signal_id()
Create a list with the same length as the total number of signals. The elements of the list indicate the ID of the signal, e.g., all 350nm datasets are mapped to 0, all 330nm datasets to 1, etc.
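The ID mapping described above can be illustrated with a minimal sketch. The helper name `build_signal_ids` and the assumption that every signal type has the same number of datasets are hypothetical and are not taken from the pychemelt source.

```python
def build_signal_ids(signal_names, n_conditions):
    """Return one integer ID per dataset, grouped by signal type.

    Hypothetical helper: assumes each signal type contributes
    `n_conditions` datasets, stored one signal type after another.
    """
    ids = []
    for signal_index, _name in enumerate(signal_names):
        # Every dataset of this signal type gets the same ID
        ids.extend([signal_index] * n_conditions)
    return ids


signal_ids = build_signal_ids(["350nm", "330nm"], n_conditions=3)
print(signal_ids)  # [0, 0, 0, 1, 1, 1]
```

A list like this lets a global fit look up which per-signal parameters belong to each dataset.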
- estimate_derivative(window_length=8)
Estimate the derivative of the signal using a Savitzky-Golay filter.
- Parameters:
window_length (int, optional) – Length of the filter window in degrees
Notes
Creates/updates attributes:
  - temp_deriv_lst_multiple, deriv_lst_multiple, deriv_lst_expanded : lists storing estimated derivatives and corresponding temperatures
  - predicted_deriv_lst_multiple : list storing estimated derivatives of predicted values
- guess_Tm(x1=6, x2=11)
Guess the Tm of the sample using the derivative of the signal.
- Parameters:
x1 (float, optional) – Shift from the minimum and maximum temperature to estimate the median of the initial and final baselines
x2 (float, optional) – Shift from the minimum and maximum temperature to estimate the median of the initial and final baselines
Notes
x2 must be greater than x1.
This method creates/updates attributes:
  - t_melting_init_multiple : list of initial Tm guesses per signal
  - t_melting_df_multiple : list of pandas.DataFrame objects with Tm vs Denaturant
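One plausible reading of a derivative-based Tm guess is sketched below. It is not the library's implementation: the function name `tm_from_max_derivative` is hypothetical, central finite differences are swapped in for the Savitzky-Golay derivative, and the use of `x1` and `x2` to delimit the baseline windows is an assumption based only on the parameter descriptions.

```python
import math
from statistics import median


def tm_from_max_derivative(temps, signal, x1=6, x2=11):
    """Guess Tm as the temperature of the steepest signal change.

    Hypothetical sketch. The baseline windows [t_min + x1, t_min + x2] and
    [t_max - x2, t_max - x1] are assumed; their medians only decide whether
    the transition goes up or down.
    """
    t_min, t_max = temps[0], temps[-1]
    native = median(s for t, s in zip(temps, signal)
                    if t_min + x1 <= t <= t_min + x2)
    unfolded = median(s for t, s in zip(temps, signal)
                      if t_max - x2 <= t <= t_max - x1)
    sign = 1.0 if unfolded >= native else -1.0

    best_temp, best_deriv = None, -math.inf
    for i in range(1, len(temps) - 1):
        # Central finite difference, oriented so the transition is positive
        deriv = sign * (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if deriv > best_deriv:
            best_temp, best_deriv = temps[i], deriv
    return best_temp


# Synthetic sigmoidal curve with its midpoint at 60 degrees
temps = [25 + 0.5 * i for i in range(141)]
signal = [1 / (1 + math.exp(-(t - 60) / 2)) for t in temps]
tm_guess = tm_from_max_derivative(temps, signal)
print(tm_guess)  # 60.0 for this synthetic curve
```

On noisy data a smoothed derivative, such as the Savitzky-Golay estimate from `estimate_derivative`, would be a more robust input than raw finite differences.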
- estimate_baseline_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)
Estimate the baseline parameters for multiple signals.
- Parameters:
native_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
unfolded_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
window_range_native (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the native state
window_range_unfolded (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the unfolded state
Notes
This method sets or updates these attributes:
  - bNs_per_signal, bUs_per_signal, kNs_per_signal, kUs_per_signal, qNs_per_signal, qUs_per_signal
  - poly_order_native, poly_order_unfolded
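For the 'linear' baseline type, the idea of estimating an intercept and slope from a temperature window at one end of the curve can be sketched as follows. This is an illustration under assumptions: the helper `fit_linear_baseline` is hypothetical, and ordinary least squares is used here, whereas the package lists a `fit_line_robust()` utility whose behaviour is not documented in this section.

```python
def fit_linear_baseline(temps, signal, window_range=12, native=True):
    """Least-squares line through one end of a melting curve.

    Hypothetical helper. native=True uses the first `window_range` degrees,
    native=False uses the last `window_range` degrees.
    Returns (intercept, slope).
    """
    if native:
        limit = temps[0] + window_range
        points = [(t, s) for t, s in zip(temps, signal) if t <= limit]
    else:
        limit = temps[-1] - window_range
        points = [(t, s) for t, s in zip(temps, signal) if t >= limit]

    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_s = sum(s for _, s in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in points)
    slope = sxy / sxx
    intercept = mean_s - slope * mean_t
    return intercept, slope


# Synthetic data that is exactly linear, so the fit recovers the inputs
temps = [20 + i for i in range(71)]
signal = [2.0 + 0.01 * t for t in temps]
intercept_native, slope_native = fit_linear_baseline(temps, signal, window_range=12)
intercept_unfolded, slope_unfolded = fit_linear_baseline(temps, signal, native=False)
```

The resulting values would serve only as starting guesses for the baseline terms of a later fit.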
pychemelt.monomer module
Main class to handle thermal and chemical denaturation data. The current model assumes the protein is a monomer and that the unfolding is reversible.
- class pychemelt.monomer.Monomer(name='Test')
Bases: Sample
Class to hold the data of a single sample and fit it
- set_denaturant_concentrations(concentrations=None)
Set the denaturant concentrations for the sample.
- Parameters:
concentrations (list, optional) – List of denaturant concentrations. If None, use the sample conditions
Notes
Creates/updates attribute denaturant_concentrations_pre (numpy.ndarray)
- select_conditions(boolean_lst=None, normalise_to_global_max=True)
For each signal, select the conditions to be used for the analysis.
- Parameters:
boolean_lst (list of bool, optional) – List of booleans selecting which conditions to keep. If None, keep all.
normalise_to_global_max (bool, optional) – If True, normalise the signal to the global maximum - per signal type
Notes
Creates/updates several attributes used by downstream fitting:
  - signal_lst_multiple, temp_lst_multiple : lists of lists with selected data
  - denaturant_concentrations : list of selected denaturant concentrations
  - denaturant_concentrations_expanded : flattened numpy array matching expanded signals
  - boolean_lst, normalise_to_global_max, nr_den : control flags/values
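The selection and normalisation steps can be pictured with a small sketch for a single signal type. The helper `select_and_normalise` is hypothetical, and computing the global maximum over the selected curves only (rather than over all curves) is an assumption.

```python
def select_and_normalise(curves, boolean_lst=None, normalise_to_global_max=True):
    """Keep the curves flagged True and optionally scale them by one shared maximum.

    Hypothetical helper for one signal type; `curves` is a list of lists.
    """
    if boolean_lst is None:
        boolean_lst = [True] * len(curves)  # keep every condition

    selected = [curve for curve, keep in zip(curves, boolean_lst) if keep]

    if normalise_to_global_max:
        # One maximum across all kept curves preserves their relative amplitudes
        global_max = max(max(curve) for curve in selected)
        selected = [[value / global_max for value in curve] for curve in selected]
    return selected


curves = [[1.0, 2.0], [4.0, 8.0], [3.0, 6.0]]
kept = select_and_normalise(curves, boolean_lst=[True, False, True])
```

Dividing by a single shared maximum, instead of normalising each curve separately, keeps amplitude differences between conditions visible to a global fit.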
- fit_thermal_unfolding_local()
Fit the thermal unfolding of the sample using the signal and temperature data. Curves are fitted one at a time, each with its own parameters.
- guess_Cp()
Guess the Cp of the sample by fitting a line to the Tm and dH values.
Notes
This method creates/updates attributes used later in fitting: Tms, dHs, slope_dh_tm, intercept_dh_tm, and Cp0 (assigned to self.Cp0).
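The reasoning behind a line fit of dH against Tm is Kirchhoff's relation: if Cp is constant, dH(Tm) = dH(Tref) + Cp * (Tm - Tref), so the slope of dH versus Tm estimates Cp. The sketch below shows that calculation with a hypothetical helper `guess_cp_from_tm_dh` and invented numbers; the units and any post-processing the library applies are not documented here.

```python
def guess_cp_from_tm_dh(tms, dhs):
    """Least-squares line dH = intercept + slope * Tm.

    Hypothetical helper. Under Kirchhoff's relation with a constant heat
    capacity change, the slope is an estimate of Cp.
    Returns (slope, intercept).
    """
    n = len(tms)
    mean_tm = sum(tms) / n
    mean_dh = sum(dhs) / n
    sxx = sum((tm - mean_tm) ** 2 for tm in tms)
    sxy = sum((tm - mean_tm) * (dh - mean_dh) for tm, dh in zip(tms, dhs))
    slope = sxy / sxx
    intercept = mean_dh - slope * mean_tm
    return slope, intercept


# Invented per-curve values, e.g. one (Tm, dH) pair per denaturant concentration
tms = [50.0, 55.0, 60.0, 65.0]
dhs = [80.0 + 1.5 * (tm - 50.0) for tm in tms]
cp0, intercept = guess_cp_from_tm_dh(tms, dhs)
```

With these synthetic inputs the slope recovers the value 1.5 used to generate them.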
- guess_initial_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)
Estimate starting thermodynamic and baseline parameters for global fitting.
- Parameters:
native_baseline_type ({'constant', 'linear', 'quadratic', 'exponential'}) – The model type for the native state baseline.
unfolded_baseline_type ({'constant', 'linear', 'quadratic', 'exponential'}) – The model type for the unfolded state baseline.
window_range_native (float, optional) – Temperature range at the start of the curve (in degrees) used for native baseline estimation. Default is 12.
window_range_unfolded (float, optional) – Temperature range at the end of the curve used for unfolded baseline estimation. Default is 12.
- fit_thermal_unfolding_global(fit_m_dep=False, cp_limits=None, dh_limits=None, tm_limits=None, cp_value=None)
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters but local slopes and local baselines. Multiple signals, such as 350nm and 330nm, can be fitted at the same time.
- Parameters:
fit_m_dep (bool, optional) – If True, fit the temperature dependence of the m-value
cp_limits (list, optional) – List of two values, the lower and upper bounds for the Cp value. If None, bounds set automatically
dh_limits (list, optional) – List of two values, the lower and upper bounds for the dH value. If None, bounds set automatically
tm_limits (list, optional) – List of two values, the lower and upper bounds for the Tm value. If None, bounds set automatically
cp_value (float, optional) – If provided, the Cp value is fixed to this value, the bounds are ignored
Notes
This is a heavy routine that creates/updates many fitting-related attributes, including:
  - bNs_expanded, bUs_expanded, kNs_expanded, kUs_expanded, qNs_expanded, qUs_expanded
  - p0, low_bounds, high_bounds, global_fit_params, rel_errors
  - predicted_lst_multiple, params_names, params_df, dg_df
  - flags: global_fit_done, fit_m_dep, limited_tm, limited_dh, limited_cp, fixed_cp
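To show how global parameters (Tm, dH, Cp, m-value) and local linear baselines combine into a predicted signal, here is a sketch of the textbook two-state reversible model with a linear denaturant dependence. It is an assumption that pychemelt uses this exact parameterisation; the function names, units (kcal/mol, Kelvin, molar), and baseline form are illustrative only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def delta_g(temp_k, conc, tm_k, dh, cp, m):
    """Unfolding free energy from the Gibbs-Helmholtz equation minus m * [denaturant]."""
    thermal = dh * (1 - temp_k / tm_k) + cp * (
        temp_k - tm_k - temp_k * math.log(temp_k / tm_k)
    )
    return thermal - m * conc


def fraction_unfolded(temp_k, conc, tm_k, dh, cp, m):
    """Two-state fraction unfolded, K / (1 + K) with K = exp(-dG / RT)."""
    k_eq = math.exp(-delta_g(temp_k, conc, tm_k, dh, cp, m) / (R_KCAL * temp_k))
    return k_eq / (1 + k_eq)


def two_state_signal(temp_k, conc, tm_k, dh, cp, m, b_n, k_n, b_u, k_u):
    """Population-weighted sum of linear native and unfolded baselines."""
    temp_c = temp_k - 273.15
    fu = fraction_unfolded(temp_k, conc, tm_k, dh, cp, m)
    return (1 - fu) * (b_n + k_n * temp_c) + fu * (b_u + k_u * temp_c)


# Illustrative parameters: global thermodynamics plus one curve's baselines
params = dict(tm_k=333.15, dh=100.0, cp=1.5, m=2.0)
fu_mid = fraction_unfolded(333.15, 0.0, **params)
fu_den = fraction_unfolded(333.15, 1.0, **params)
cold = two_state_signal(298.15, 0.0, b_n=1.0, k_n=0.0, b_u=2.0, k_u=0.0, **params)
```

In a global fit of this kind, `tm_k`, `dh`, `cp`, and `m` would be shared by every curve while the baseline terms vary per curve, which matches the description of global thermodynamic parameters with local slopes and baselines.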
- fit_thermal_unfolding_global_global()
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters and global slopes (but local baselines). Multiple signals, such as 350nm and 330nm, can be fitted at the same time. Must be run after fit_thermal_unfolding_global.
Notes
Updates global fitting attributes and sets global_global_fit_done when complete.
- fit_thermal_unfolding_global_global_global(model_scale_factor=True)
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters, global slopes, and global baselines. Must be run after fit_thermal_unfolding_global_global.
- Parameters:
model_scale_factor (bool, optional) – If True, model a scale factor for each denaturant concentration
Notes
Updates many global fitting attributes and sets global_global_global_fit_done when complete. If model_scale_factor is True, the method also creates scaled signal attributes:
  - signal_lst_multiple_scaled, predicted_lst_multiple_scaled
- signal_to_df(signal_type='raw', scaled=False)
Create a dataframe with three columns: Temperature, Signal, and Denaturant. Optimized for speed by avoiding per-curve DataFrame creation.
- Parameters:
signal_type ({'raw', 'fitted', 'derivative'}, optional) – Which signal to include in the dataframe. ‘raw’ uses experimental data, ‘fitted’ uses model predictions, ‘derivative’ uses the estimated derivative signal.
scaled (bool, optional) – If True and signal_type is 'fitted' or 'raw', use the scaled versions if available.
pychemelt.thermal_oligomer module
Main class to handle thermal denaturation data of monomers and oligomers up to tetramers. The current model assumes the proteins' unfolding is reversible.
- class pychemelt.thermal_oligomer.ThermalOligomer(name='Test')
Bases: Sample
Class to hold the data of a DSF experiment of thermal unfolding with different concentrations of an oligomer.
- set_model(model_name)
Set the subunit number of the oligomer used for the analysis. Two-state models of monomers, dimers, trimers, and tetramers are currently supported.
- Parameters:
model_name (str) – Name of the model to use. One of “Monomer”, “Dimer”, “Trimer”, “Tetramer”. Case-insensitive
- Raises:
ValueError – If the provided model name is not in the supported list.
Notes
This method creates/updates the following attributes on the instance:
  - self.model: oligomeric model used for analysis
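The case-insensitive lookup and the ValueError for unsupported names can be sketched as below. The helper `subunits_for_model` and the mapping `SUPPORTED_MODELS` are hypothetical; the package lists an `oligomer_number()` utility, but its signature is not documented in this section, so it is not used here.

```python
# Hypothetical mapping from model name to number of subunits
SUPPORTED_MODELS = {"monomer": 1, "dimer": 2, "trimer": 3, "tetramer": 4}


def subunits_for_model(model_name):
    """Return the subunit count for a model name, ignoring case.

    Raises ValueError for names outside the supported list.
    """
    key = model_name.strip().lower()
    if key not in SUPPORTED_MODELS:
        supported = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"Unknown model '{model_name}'. Supported: {supported}")
    return SUPPORTED_MODELS[key]


n_subunits = subunits_for_model("Dimer")
```

Validating the name up front means a typo fails immediately instead of surfacing later inside a fitting routine.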
- set_concentrations(concentrations=None)
Set the oligomer concentrations for the sample.
- Parameters:
concentrations (list, optional) – List of oligomer concentrations. If None, use the sample conditions
Notes
Creates/updates attribute oligomer_concentrations_pre (numpy.ndarray)
- select_conditions(boolean_lst=None, normalise_to_global_max=True)
For each signal, select the conditions to be used for the analysis.
- Parameters:
boolean_lst (list of bool, optional) – List of booleans selecting which conditions to keep. If None, keep all.
normalise_to_global_max (bool, optional) – If True, normalise the signal to the global maximum - per signal type
Notes
Creates/updates several attributes used by downstream fitting:
  - signal_lst_multiple, temp_lst_multiple : lists of lists with selected data
  - oligomer_concentrations : list of selected oligomer concentrations
  - oligomer_concentrations_expanded : flattened numpy array matching expanded signals
  - boolean_lst, normalise_to_global_max, nr_olig : control flags/values
- guess_Cp()
Guess the Cp of the assembled oligomer from its number of residues.
- Raises:
ValueError – If self.n_residues is not set.
Notes
The number of residues represents the total number of residues in the oligomer.
This method creates/updates the attribute Cp0 (assigned to self.Cp0), which is used later in fitting.
- estimate_baseline_parameters(native_baseline_type, unfolded_baseline_type, window_range_native=12, window_range_unfolded=12)
Estimate the baseline parameters for multiple signals of the oligomer. The native baseline represents the curve for the assembled oligomer, while the unfolded baseline represents the curve for the unfolded and disassembled oligomer.
- Parameters:
native_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
unfolded_baseline_type (str) – one of ‘constant’, ‘linear’, ‘quadratic’, ‘exponential’
window_range_native (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the native state
window_range_unfolded (int, optional) – Range of the window (in degrees) to estimate the baselines and slopes of the unfolded state
Notes
This method sets or updates these attributes:
  - bNs_per_signal, bUs_per_signal, kNs_per_signal, kUs_per_signal, qNs_per_signal, qUs_per_signal
  - poly_order_native, poly_order_unfolded
- fit_thermal_unfolding_global(cp_limits=None, dh_limits=None, tm_limits=None, cp_value=None)
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters but local slopes and local baselines. Multiple signals, such as 350nm and 330nm, can be fitted at the same time.
- Parameters:
cp_limits (list, optional) – List of two values, the lower and upper bounds for the Cp value. If None, bounds set automatically
dh_limits (list, optional) – List of two values, the lower and upper bounds for the dH value. If None, bounds set automatically
tm_limits (list, optional) – List of two values, the lower and upper bounds for the Tm value. If None, bounds set automatically
cp_value (float, optional) – If provided, the Cp value is fixed to this value, the bounds are ignored
Notes
This is a heavy routine that creates/updates many fitting-related attributes, including:
  - bNs_expanded, bUs_expanded, kNs_expanded, kUs_expanded, qNs_expanded, qUs_expanded
  - p0, low_bounds, high_bounds, global_fit_params, rel_errors
  - predicted_lst_multiple, params_names, params_df, dg_df
  - flags: global_fit_done, limited_tm, limited_dh, limited_cp, fixed_cp
- fit_thermal_unfolding_global_global()
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters and global slopes (but local baselines). Multiple signals, such as 350nm and 330nm, can be fitted at the same time. Must be run after fit_thermal_unfolding_global.
Notes
Updates global fitting attributes and sets global_global_fit_done when complete.
- fit_thermal_unfolding_global_global_global(model_scale_factor=True)
Fit the thermal unfolding of the sample using the signal and temperature data. All curves are fitted at once, with global thermodynamic parameters, global slopes, and global baselines. Must be run after fit_thermal_unfolding_global_global.
- Parameters:
model_scale_factor (bool, optional) – If True, model a scale factor for each oligomer concentration
Notes
Updates many global fitting attributes and sets global_global_global_fit_done when complete. If model_scale_factor is True, the method also creates scaled signal attributes:
  - signal_lst_multiple_scaled, predicted_lst_multiple_scaled
- signal_to_df(signal_type='raw', scaled=False)
Create a dataframe with the columns Temperature, Signal, Oligomer, and ID. Optimized for speed by avoiding per-curve DataFrame creation.
- Parameters:
signal_type ({'raw', 'fitted', 'derivative'}, optional) – Which signal to include in the dataframe. ‘raw’ uses experimental data, ‘fitted’ uses model predictions, ‘derivative’ uses the estimated derivative signal.
scaled (bool, optional) – If True and signal_type is 'fitted' or 'raw', use the scaled versions if available.
- Returns:
A DataFrame with columns: [‘Temperature’, ‘Signal’, ‘Oligomer’, ‘ID’].
- Return type:
pd.DataFrame
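The idea of avoiding per-curve DataFrame creation can be sketched by accumulating whole columns first and building the table once. The helper `curves_to_columns` is hypothetical, and treating the ID column as the index of the curve is an assumption; the column names follow the return value listed above.

```python
def curves_to_columns(temps_per_curve, signals_per_curve, concentrations):
    """Flatten many curves into long-format columns in one pass.

    Hypothetical helper; returns a dict of equal-length lists keyed by
    'Temperature', 'Signal', 'Oligomer', and 'ID'.
    """
    columns = {"Temperature": [], "Signal": [], "Oligomer": [], "ID": []}
    curves = zip(temps_per_curve, signals_per_curve, concentrations)
    for curve_id, (temps, signals, conc) in enumerate(curves):
        columns["Temperature"].extend(temps)
        columns["Signal"].extend(signals)
        # Repeat the per-curve values so every row is self-describing
        columns["Oligomer"].extend([conc] * len(temps))
        columns["ID"].extend([curve_id] * len(temps))
    return columns


columns = curves_to_columns(
    temps_per_curve=[[25.0, 26.0], [25.0, 26.0, 27.0]],
    signals_per_curve=[[0.1, 0.2], [0.3, 0.4, 0.5]],
    concentrations=[1.0, 2.0],
)
```

A dictionary of equal-length lists like this can be passed to `pandas.DataFrame(columns)` in a single call, which is typically cheaper than concatenating one small DataFrame per curve.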