DataStructures - Alignment¶

`Run` Module¶

Run¶

class msproteomicstoolslib.data_structures.Run.Run(header, header_dict, runid, orig_input_filename=None, filename=None, aligned_filename=None)¶

A run contains references to identified precursor groups and precursors.

The run stores references to the precursor groups (heavy/light pairs) identified in it. It has a unique id and stores the headers from the csv file.

A run has the following attributes:

an identifier that is unique to this run
a filename where it originally came from
a dictionary of precursor groups, accessible through the following functions: getPrecursorGroup, hasPrecursor, getPrecursor, addPrecursor
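As a rough illustration of the dictionary-of-precursor-groups idea, the sketch below uses stand-in classes (`SketchRun` and `SketchPrecursor` are hypothetical names, not part of msproteomicstoolslib). It assumes that precursor groups are keyed by the peptide group label and that precursors within a group are keyed by the transition group id; the library's actual storage may differ.

```python
class SketchPrecursor:
    """Stand-in for a precursor; it only carries its transition group id."""

    def __init__(self, trgr_id):
        self.trgr_id = trgr_id


class SketchRun:
    """Illustrative container only; not the library's Run class."""

    def __init__(self, runid):
        self.runid = runid
        # Assumed layout: peptide_group_label -> {trgr_id -> precursor}
        self._groups = {}

    def addPrecursor(self, precursor, peptide_group_label):
        # Create the group on first use, then register the precursor in it.
        group = self._groups.setdefault(peptide_group_label, {})
        group[precursor.trgr_id] = precursor

    def hasPrecursor(self, peptide_group_label, trgr_id):
        return trgr_id in self._groups.get(peptide_group_label, {})

    def getPrecursor(self, peptide_group_label, trgr_id):
        # Returns None when either the group or the precursor is missing.
        return self._groups.get(peptide_group_label, {}).get(trgr_id)

    def getPrecursorGroup(self, peptide_group_label):
        return self._groups[peptide_group_label]


run = SketchRun("run_0")
run.addPrecursor(SketchPrecursor("PEPTIDEK_light"), "PEPTIDEK")
run.addPrecursor(SketchPrecursor("PEPTIDEK_heavy"), "PEPTIDEK")
```

Keying by label first keeps heavy and light forms of one peptide together, so a lookup for the pair is a single dictionary access.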

addPrecursor(precursor, peptide_group_label)¶

getPrecursor(peptide_group_label, trgr_id)¶: Return the precursor corresponding to the given peptide group label and transition group id

getPrecursorGroup(curr_id)¶

get_aligned_filename()¶

get_best_peaks()¶

get_best_peaks_with_cutoff(cutoff)¶

get_id()¶

get_openswath_filename()¶

get_original_filename()¶

hasPrecursor(peptide_group_label, trgr_id)¶

`PrecursorGroup` Module¶

PrecursorGroup¶

class msproteomicstoolslib.data_structures.PrecursorGroup.PrecursorGroup(peptide_group_label, run)¶

A set of precursors that are isotopically modified versions of each other.

A collection of precursors that are isotopically modified versions of the same underlying peptide sequence. Generally these are heavy/light forms.

addPrecursor(self, precursor)¶: Add precursor to peptide group

getAllPeakgroups(self)¶: Generator of all peakgroups attached to the precursors in this group

getAllPrecursors(self)¶: Return a list of all precursors in this precursor group

getOverallBestPeakgroup(self)¶: Get the best peakgroup (by fdr score) of all precursors contained in this precursor group

getPeptideGroupLabel(self)¶: Get peptide group label

getPrecursor(self, curr_id)¶: Get the precursor for the given transition group id
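The idea behind getOverallBestPeakgroup can be sketched as a minimum over all peakgroups of all precursors in the group. The sketch assumes that a lower fdr score is better and uses a hypothetical `PeakGroup` record and a plain dictionary in place of the library's classes.

```python
from collections import namedtuple

# Hypothetical minimal peakgroup record, for illustration only.
PeakGroup = namedtuple("PeakGroup", ["feature_id", "fdr_score"])


def overall_best_peakgroup(precursors):
    """Return the peakgroup with the lowest fdr_score across all precursors.

    `precursors` maps a precursor id to a list of PeakGroup records.
    Returns None when no peakgroups are present.
    """
    all_peakgroups = (pg for pgs in precursors.values() for pg in pgs)
    return min(all_peakgroups, key=lambda pg: pg.fdr_score, default=None)


group = {
    "light": [PeakGroup("f1", 0.20), PeakGroup("f2", 0.01)],
    "heavy": [PeakGroup("f3", 0.05)],
}
best = overall_best_peakgroup(group)
```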

`Precursor` Module¶

PrecursorBase¶

class msproteomicstoolslib.data_structures.Precursor.PrecursorBase(this_id, run)¶

Bases: object

find_closest_in_iRT(delta_assay_rt)¶

get_all_peakgroups()¶

get_best_peakgroup()¶

get_decoy()¶

get_id()¶

get_selected_peakgroup()¶

select_pg(this_id)¶

set_decoy(decoy)¶

unselect_pg(id)¶

GeneralPrecursor¶

class msproteomicstoolslib.data_structures.Precursor.GeneralPrecursor(this_id, run)¶

Bases: msproteomicstoolslib.data_structures.Precursor.PrecursorBase

A set of peakgroups that belong to the same precursor in a single run.

== Implementation details ==

This is a plain implementation in which all peakgroup objects are stored in a simple list. It is not very efficient, because many objects need to be created, which takes a lot of memory in Python.

add_peakgroup(peakgroup)¶

append(transitiongroup)¶

find_closest_in_iRT(delta_assay_rt)¶

get_all_peakgroups()¶

get_best_peakgroup()¶: Return the best peakgroup according to fdr score

get_run_id()¶

get_selected_peakgroup()¶

id¶

peakgroups¶

precursor_group¶

protein_name¶

run¶

sequence¶

Precursor¶

class msproteomicstoolslib.data_structures.Precursor.Precursor(this_id, run)¶

Bases: msproteomicstoolslib.data_structures.Precursor.PrecursorBase

A set of peakgroups that belong to the same precursor in a single run.

Each precursor has a back-reference to the precursor group (heavy/light pair) and to the run it belongs to, and it stores its amino acid sequence. Furthermore, a unique id for the precursor and the protein name are stored.

A precursor can return its best transition group, the selected peakgroup, or the transition group that is closest to a given iRT time. Its id is the transition_group_id (e.g. the id of the chromatogram).

The “selected” peakgroup is represented by the peakgroup that belongs to cluster number 1 (cluster_id == 1) which in this case is “special”.
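Finding the peakgroup closest to a given iRT time amounts to minimizing the absolute retention time difference. The following is a minimal sketch, assuming peakgroups are available as (feature_id, fdr_score, normalized_rt) tuples and that the argument is a target time in the same normalized space; `find_closest_in_irt` is an illustrative helper, not the library method.

```python
def find_closest_in_irt(peakgroups, target_rt):
    """Return the peakgroup tuple whose normalized RT is nearest to target_rt.

    peakgroups: non-empty list of (feature_id, fdr_score, normalized_rt) tuples.
    """
    return min(peakgroups, key=lambda pg: abs(pg[2] - target_rt))


peakgroups = [
    ("f1", 0.01, 1200.0),
    ("f2", 0.15, 1260.0),
    ("f3", 0.40, 1500.0),
]
closest = find_closest_in_irt(peakgroups, 1250.0)
```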

== Implementation details ==

For memory reasons, we store all information about the peakgroup in a tuple (immutable). This tuple contains a unique feature id, a score and a retention time. Additionally, we also store which cluster the peakgroup belongs to (if the user sets this).

A precursor has the following attributes:

an identifier that is unique among all other precursors
a set of peakgroups
a back-reference to the run it belongs to

add_peakgroup_tpl(pg_tuple, tpl_id, cluster_id=-1)¶

Adds a peakgroup to this precursor.

The peakgroup should be a tuple of length 4 with the following components:

0. id
1. quality score (FDR)
2. retention time (normalized)
3. intensity
(4. d_score optional)
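The tuple-based storage described above can be sketched with two parallel lists, one holding the peakgroup tuples and one holding their cluster ids. `TuplePrecursorSketch` is a hypothetical stand-in; it omits the `tpl_id` argument of the documented signature because its meaning is not described here, and it treats cluster id 1 as "selected", as stated in the class description.

```python
class TuplePrecursorSketch:
    """Illustrative only; not the library's Precursor class."""

    def __init__(self, this_id):
        self.id = this_id
        self.peakgroups_ = []   # one tuple per peakgroup
        self.cluster_ids_ = []  # parallel list; -1 means no cluster assigned

    def add_peakgroup_tpl(self, pg_tuple, cluster_id=-1):
        # Expect (id, fdr_score, normalized_rt, intensity[, d_score]).
        if len(pg_tuple) not in (4, 5):
            raise ValueError("expected a tuple of length 4 (or 5 with d_score)")
        self.peakgroups_.append(tuple(pg_tuple))
        self.cluster_ids_.append(cluster_id)

    def select_pg(self, feature_id):
        # Mark the matching peakgroup as selected by assigning cluster 1.
        for index, pg in enumerate(self.peakgroups_):
            if pg[0] == feature_id:
                self.cluster_ids_[index] = 1

    def get_selected_peakgroup(self):
        # Return the first peakgroup in cluster 1, or None if none is selected.
        for pg, cluster_id in zip(self.peakgroups_, self.cluster_ids_):
            if cluster_id == 1:
                return pg
        return None


precursor = TuplePrecursorSketch("tg_1")
precursor.add_peakgroup_tpl(("f1", 0.02, 1200.5, 35000.0))
precursor.add_peakgroup_tpl(("f2", 0.30, 1350.0, 8000.0, 1.7))
precursor.select_pg("f1")
```

Keeping the cluster ids outside the tuples lets the selection change without rebuilding the immutable peakgroup records.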

cluster_ids_¶

find_closest_in_iRT(delta_assay_rt)¶

getAllPeakgroups()¶

getClusteredPeakgroups()¶

getPrecursorGroup()¶

get_all_peakgroups()¶

get_best_peakgroup()¶

get_id()¶

get_run_id()¶

get_selected_peakgroup()¶

id¶

peakgroups_¶

precursor_group¶

protein_name¶

run¶

select_pg(this_id)¶

sequence¶

setClusterID(this_id, cl_id)¶

unselect_all()¶

unselect_pg(this_id)¶

`PeakGroup` Module¶

PeakGroupBase¶

class msproteomicstoolslib.data_structures.PeakGroup.PeakGroupBase¶

Bases: object

cluster_id_¶

fdr_score¶

get_cluster_id()¶

get_fdr_score()¶

get_feature_id()¶

get_intensity()¶

get_normalized_retentiontime()¶

get_value(value)¶

id_¶

intensity_¶

is_selected()¶

normalized_retentiontime¶

select_this_peakgroup()¶

set_fdr_score(fdr_score)¶

set_feature_id(id_)¶

set_intensity(intensity)¶

set_normalized_retentiontime(normalized_retentiontime)¶

set_value(key, value)¶

MinimalPeakGroup¶

class msproteomicstoolslib.data_structures.PeakGroup.MinimalPeakGroup(unique_id, fdr_score, assay_rt, selected, cluster_id, peptide, intensity=None, dscore=None)¶

Bases: msproteomicstoolslib.data_structures.PeakGroup.PeakGroupBase

A single peakgroup that is defined by a retention time in a chromatogram of multiple transitions. Additionally it has an fdr_score and it has an aligned RT (e.g. retention time in normalized space). A peakgroup can be selected for quantification or not (this is stored as having cluster_id == 1).

Note that for performance reasons, the peakgroups are created on-the-fly and not stored as objects but rather as tuples in “Peptide”.

Each peak group has a unique id, a score (fdr score usually), a retention time as well as a back-reference to the precursor that generated the peakgroup. In this case, the peak group can also be assigned a cluster id (where the cluster 1 is special as the one we will use for quantification).

get_cluster_id()¶

get_dscore()¶

print_out()¶

select_this_peakgroup()¶

setClusterID(id_)¶

set_fdr_score(fdr_score)¶

set_feature_id(id_)¶

set_intensity(intensity)¶

set_normalized_retentiontime(normalized_retentiontime)¶

GuiPeakGroup¶

class msproteomicstoolslib.data_structures.PeakGroup.GuiPeakGroup(fdr_score, intensity, leftWidth, rightWidth, peptide)¶

Bases: msproteomicstoolslib.data_structures.PeakGroup.PeakGroupBase

A single peakgroup that is defined by a retention time in a chromatogram of multiple transitions.

get_value(value)¶

GeneralPeakGroup¶

class msproteomicstoolslib.data_structures.PeakGroup.GeneralPeakGroup(row, run, peptide)¶

Bases: msproteomicstoolslib.data_structures.PeakGroup.PeakGroupBase

get_dscore()¶

get_value(value)¶

peptide¶

print_out()¶

row¶

run¶

setClusterID(clid)¶

set_value(key, value)¶

DataStructures - Basic¶

`Aminoacides` Module¶

Aminoacid¶

class msproteomicstoolslib.data_structures.aminoacides.Aminoacid(name, code, code3, composition)¶

Class to hold information about a single Amino Acid (AA)

code = None¶: One letter code

code3 = None¶: Three letter code

composition = None¶: Elemental composition

elementsLib = None¶: Library of elements

name = None¶: Full name of the AA

Aminoacides¶

class msproteomicstoolslib.data_structures.aminoacides.Aminoacides¶

addAminoacid(aminoacid)¶

getAminoacid(code)¶

initAminoacides()¶

`Modifications` Module¶

Modification¶

class msproteomicstoolslib.data_structures.modifications.Modification(aminoacid, tpp_Mod, unimodAccession, peakViewAccession, is_labeling, composition)¶

A modification on an Aminoacid

codes = ['TPP', 'unimod', 'ProteinPilot']¶: Available modification formats

getcode(code)¶

Modifications¶

class msproteomicstoolslib.data_structures.modifications.Modifications(default_mod_file=None)¶

A collection of modifications

appendModification(modification)¶

is_bool(expression)¶

printModifications()¶

readModificationsFile(modificationsfile)¶: It reads a tsv file with additional modifications. Modifications will be appended to the default modifications of this class. Tsv file headers & an example:

modified-AA TPP-nomenclature Unimod-Accession ProteinPilot-nomenclature is_a_labeling composition-dictionary
S S[167] 21 [Pho] False {'H' : 1,'O' : 3, 'P' : 1}
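A file in this layout can be parsed with the standard library alone. The sketch below is not the library's reader: it assumes tab-separated columns with exactly the header names shown above, and the dictionary keys it returns (`aminoacid`, `tpp`, and so on) are made up for illustration.

```python
import ast
import csv
import io

# The header line and example row from the description, tab-separated.
EXAMPLE_TSV = (
    "modified-AA\tTPP-nomenclature\tUnimod-Accession\t"
    "ProteinPilot-nomenclature\tis_a_labeling\tcomposition-dictionary\n"
    "S\tS[167]\t21\t[Pho]\tFalse\t{'H' : 1,'O' : 3, 'P' : 1}\n"
)


def read_modifications(handle):
    """Parse a modifications tsv into a list of plain dictionaries."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "aminoacid": row["modified-AA"],
            "tpp": row["TPP-nomenclature"],
            "unimod": int(row["Unimod-Accession"]),
            "protein_pilot": row["ProteinPilot-nomenclature"],
            # The flag is stored as the text "True" or "False".
            "is_labeling": row["is_a_labeling"].strip().lower() == "true",
            # literal_eval safely turns the dictionary text into a dict.
            "composition": ast.literal_eval(row["composition-dictionary"]),
        })
    return rows


mods = read_modifications(io.StringIO(EXAMPLE_TSV))
```

`ast.literal_eval` is used instead of `eval` so that only Python literals in the composition column are accepted.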

translateModificationsFromSequence(sequence, code, aaLib=None)¶: Returns a Peptide object, given a sequence with modifications in any of the available codes. The code (TPP, Unimod,…) to be translated must be given.

`Peak` Module¶

Peak¶

class msproteomicstoolslib.data_structures.peak.Peak(str=None, spectraST=False)¶

Represents one peak of a spectrum.

init_with_self(peak)¶

initialize(peak, intensity, peak_annotation, statistics)¶

parse_str(peak)¶

to_write_string()¶

`Peptide` Module¶

Peptide¶

class msproteomicstoolslib.data_structures.peptide.Peptide(sequence, modifications={}, protein='', aminoacidLib=None)¶

addSpectrum(spectrum)¶: Deprecated definition

all_ions(ionseries=None, frg_z_list=[1, 2], fragmentlossgains=[0], mass_limits=None, label='')¶: Returns all the fragment ions of the peptide in a tuple of two objects: (annotated, ionmasses_only).

annotated is a list of tuples of the form (ion_type, ion_number, ion_charge, lossgain, fragment_mz).
ionmasses_only is a list of fragment masses.

When ionseries is not provided, all existing ion series (see: Peptide.iontypes) will be calculated. When frg_z_list is not provided, fragment ion charge states +1 and +2 will be used.
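To make the shape of the return value concrete, here is a simplified sketch that computes only b and y ions for an unmodified peptide, without losses, gains or mass limits. It is not the library's implementation: `all_ions_sketch` and the small mass table are assumptions, and only a handful of residues are included.

```python
# Monoisotopic residue masses in Da for a few amino acids only.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "K": 128.09496,
}
PROTON = 1.007276
WATER = 18.010565


def all_ions_sketch(sequence, frg_z_list=(1, 2)):
    """Return (annotated, ionmasses_only) for the b and y series."""
    annotated = []
    n = len(sequence)
    for z in frg_z_list:
        for i in range(1, n):
            # b ion: the first i residues; y ion: the last i residues plus water.
            b_mass = sum(RESIDUE_MASS[aa] for aa in sequence[:i])
            y_mass = sum(RESIDUE_MASS[aa] for aa in sequence[n - i:]) + WATER
            annotated.append(("b", i, z, 0, (b_mass + z * PROTON) / z))
            annotated.append(("y", i, z, 0, (y_mass + z * PROTON) / z))
    ionmasses_only = [ion[-1] for ion in annotated]
    return annotated, ionmasses_only


annotated, masses = all_ions_sketch("GASK", frg_z_list=[1])
```

Each annotated tuple follows the (ion_type, ion_number, ion_charge, lossgain, fragment_mz) layout described above, with the loss/gain fixed at 0.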

calIsoforms(switchingModification, modLibrary)¶: This returns the full list of peptide species of the same peptide family (isobaric, same composition, different modification site). The list is given as a list of Peptide objects. switchingModification must be given as a Modification object.

cal_UIS(otherPeptidesList, UISorder=2, ionseries=None, fragmentlossgains=[0], precision=1e-08, frg_z_list=[1, 2], mass_limits=None)¶: It calculates the UIS for a given peptide relative to a given list of other peptides. It returns a tuple of two objects: all_UIS and all_UIS_annotated. all_UIS contains only a mass list.

comparePeptideFragments(otherPeptidesList, ionseries=None, fragmentlossgains=[0], precision=1e-08, frg_z_list=[1, 2])¶: This returns a tuple of lists: (CommonFragments, differentialFragments). The differentialFragmentMasses are the masses of the __self__ peptide that are not shared with any of the peptides listed in otherPeptidesList. otherPeptidesList must be a list of Peptide objects. The fragments are reported as tuples: (ionserie, ion_number, ion_charge, fragmentlossgain, mass)
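The common/differential split can be illustrated on bare mass lists. This sketch assumes that two fragments count as shared when their masses differ by at most `precision`; `compare_fragment_masses` is a hypothetical helper that works on floats rather than on Peptide objects or annotated tuples.

```python
def compare_fragment_masses(own_masses, other_masses, precision=1e-8):
    """Split own_masses into (common, differential) relative to other_masses."""
    common, differential = [], []
    for mass in own_masses:
        # A mass is shared if any other mass lies within the tolerance.
        shared = any(abs(mass - other) <= precision for other in other_masses)
        (common if shared else differential).append(mass)
    return common, differential


common, differential = compare_fragment_masses(
    [100.0, 200.0, 300.5], [200.0, 300.0]
)
```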

fragmentSequence(ion_type, frg_number)¶

getDeltaMassFromSequence(sequence)¶

getMZ(charge, label='')¶

getMZfragment(ion_type, ion_number, ion_charge, label='', fragmentlossgain=0.0)¶

getSequenceWithMods(code)¶

get_decoy_Q3(frg_serie, frg_nr, frg_z, blackList=[], max_tries=1000)¶

pseudoreverse(sequence='None')¶

shuffle_sequence()¶

`Residues` Module¶

Residues¶

class msproteomicstoolslib.data_structures.Residues.Residues(type='mono')¶

A class that contains information about elements, amino acids and modifications. It mainly stores the masses of these, but also chemical formulas.

The most commonly used properties are:

Residues.average_elments : average element weights

Residues.monoisotopic_elments : monoisotopic element weights

Residues.aa_codes : Three and One letter amino acid codes

Residues.aa_names : English names of the amino acids

Residues.aa_sum_formulas_text : Chemical formulas of all amino acids

Residues.aa_sum_formulas: Chemical formulas of all amino acids as hash

Residues.mass_xxx: monoisotopic masses of different compounds (NH3, H2O, CO, HPO4 etc)

Residues.average_data: average weight of amino acids

Residues.monoisotopic_data: monoisotopic weight of amino acids

Residues.monoisotopic_mod: monoisotopic modification data

Residues.mod_mapping: mapping of + notation to absolute weight notation (K[+8] to K[136])

Residues.Hydropathy: Hydropathy of amino acids (gravy scores)

TODO hydrophobicity of amino acids

TODO basicity of amino acids

TODO helicity of amino acids

Residues.pI: pI of amino acids
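The mapping from + notation to absolute weight notation mentioned for Residues.mod_mapping (K[+8] to K[136]) can be reproduced by adding the mass shift to the residue mass and rounding to an integer. The sketch below is an assumption about how such a mapping could be derived, not the library's table; `plus_to_absolute` and the three-entry mass dictionary are illustrative only.

```python
import re

# Monoisotopic residue masses in Da for a few amino acids only.
MONO_RESIDUE = {"K": 128.09496, "R": 156.10111, "S": 87.03203}


def plus_to_absolute(sequence):
    """Rewrite X[+n] as X[m], with m the rounded residue mass plus n."""
    def repl(match):
        aa, delta = match.group(1), int(match.group(2))
        return "%s[%d]" % (aa, round(MONO_RESIDUE[aa] + delta))

    # Only residues that carry a [+n] annotation are looked up.
    return re.sub(r"([A-Z])\[\+(\d+)\]", repl, sequence)


converted = plus_to_absolute("PEPTIDEK[+8]")
```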

`DDB` Module¶

DDB¶

Abstraction layer to the 2DDB software framework.
