pyXLMS.data package#
Module contents#
Core data structures and data type validation functions.
Examples
>>> from pyXLMS.data import CrosslinkSpectrumMatch as CSM
>>> csm = CSM(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... spectrum_file="dsso.mzML",
... scan_nr=1,
... )
>>> from pyXLMS.data import Crosslink
>>> xl = Crosslink(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... )
>>> from pyXLMS.data import Crosslink
>>> from pyXLMS.data import ParserResult
>>> xl = Crosslink(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... )
>>> pr = ParserResult(search_engine="My Search Engine", crosslinks=[xl])
- class pyXLMS.data.Crosslink(
- *,
- alpha_peptide: str,
- alpha_peptide_crosslink_position: int,
- beta_peptide: str,
- beta_peptide_crosslink_position: int,
- alpha_proteins: List[str] | None = None,
- alpha_proteins_crosslink_positions: List[int] | None = None,
- alpha_decoy: bool | None = None,
- beta_proteins: List[str] | None = None,
- beta_proteins_crosslink_positions: List[int] | None = None,
- beta_decoy: bool | None = None,
- score: float | None = None,
- additional_information: Dict[str, Any] | None = None,
Bases: BaseModel
Core data structure representing a single crosslink.
Crosslinks represent two crosslinked peptides. Crosslinks can be unique peptide pairs or unique residue pairs, depending on their grouping.
Attributes Summary#
Here is a short summary of the crosslink attributes. For more details on the specific Pydantic validation requirements, please refer to the corresponding attributes themselves.
Required#
The following attributes are required:
- alpha_peptide : str
The unmodified amino acid sequence of the first peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- alpha_peptide_crosslink_position : int
The position of the crosslinker in the sequence of the first peptide (1-based).
- beta_peptide : str
The unmodified amino acid sequence of the second peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- beta_peptide_crosslink_position : int
The position of the crosslinker in the sequence of the second peptide (1-based).
Optional#
The following attributes are optional:
- alpha_proteins : list of str, or None, default = None
The accessions of proteins that the first peptide is associated with.
- alpha_proteins_crosslink_positions : list of int, or None, default = None
Positions of the crosslink in the proteins of the first peptide (1-based). If given, the list should be of the same length as alpha_proteins, and the crosslink position at list index i should correspond to the protein at list index i in alpha_proteins.
- alpha_decoy : bool, or None, default = None
Whether the first peptide is from the decoy database (True) or not (False).
- beta_proteins : list of str, or None, default = None
The accessions of proteins that the second peptide is associated with.
- beta_proteins_crosslink_positions : list of int, or None, default = None
Positions of the crosslink in the proteins of the second peptide (1-based). If given, the list should be of the same length as beta_proteins, and the crosslink position at list index i should correspond to the protein at list index i in beta_proteins.
- beta_decoy : bool, or None, default = None
Whether the second peptide is from the decoy database (True) or not (False).
- score : float, or None, default = None
Score of the crosslink.
- additional_information : dict of str, any, or None, default = None
A dictionary with additional information associated with the crosslink.
Notes
Alpha and beta assignment is decided internally by whichever peptide’s sequence comes first alphabetically. If the beta_peptide’s sequence comes first alphabetically, it is assigned to alpha_peptide and the original alpha_peptide is assigned to beta_peptide (the same happens for all other corresponding alpha and beta values).
Examples
>>> from pyXLMS.data import Crosslink
>>> xl = Crosslink(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... )
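The alpha/beta assignment described in the Notes can be illustrated with a minimal, standalone sketch. It does not use pyXLMS; the helper name order_alpha_beta is hypothetical, and it assumes a plain string comparison in which ties keep the given order (how pyXLMS handles identical sequences is not stated here).

```python
from typing import Tuple

# (sequence, 1-based crosslink position); a hypothetical stand-in for the
# alpha_* / beta_* attribute groups of a crosslink.
Peptide = Tuple[str, int]


def order_alpha_beta(first: Peptide, second: Peptide) -> Tuple[Peptide, Peptide]:
    """Return (alpha, beta) so that alpha's sequence sorts first alphabetically."""
    # Assumption: plain string comparison; equal sequences keep the given order.
    if second[0] < first[0]:
        return second, first
    return first, second


alpha, beta = order_alpha_beta(("TKIDE", 2), ("PEKP", 3))
print(alpha, beta)  # ('PEKP', 3) ('TKIDE', 2)
```

Note that the crosslink position travels with its sequence, which is the point of swapping "all other corresponding alpha and beta values".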
- additional_information: Annotated[Dict[str, Any] | None, Field(frozen=False, description='A dictionary with additional information associated with the crosslink.')]#
A dictionary with additional information associated with the crosslink.
- alpha_decoy: Annotated[bool | None, Field(frozen=True, description='Whether the alpha peptide is from the decoy database or not.')]#
Whether the first peptide is from the decoy database (True) or not (False).
- alpha_peptide: Annotated[str, Field(frozen=True, description='The unmodified amino acid sequence of the first peptide.')]#
The unmodified amino acid sequence of the first peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- alpha_peptide_crosslink_position: Annotated[int, Field(frozen=True, description='The position of the crosslinker in the sequence of the first peptide (1-based).')]#
The position of the crosslinker in the sequence of the first peptide (1-based).
- alpha_proteins: Annotated[List[str] | None, Field(frozen=True, description='The accessions of proteins that the first peptide is associated with.')]#
The accessions of proteins that the first peptide is associated with.
- alpha_proteins_crosslink_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the crosslink in the proteins of the first peptide (1-based).')]#
Positions of the crosslink in the proteins of the first peptide (1-based). If given, the list should be of the same length as alpha_proteins, and the crosslink position at list index i should correspond to the protein at list index i in alpha_proteins.
- beta_decoy: Annotated[bool | None, Field(frozen=True, description='Whether the beta peptide is from the decoy database or not.')]#
Whether the second peptide is from the decoy database (True) or not (False).
- beta_peptide: Annotated[str, Field(frozen=True, description='The unmodified amino acid sequence of the second peptide.')]#
The unmodified amino acid sequence of the second peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- beta_peptide_crosslink_position: Annotated[int, Field(frozen=True, description='The position of the crosslinker in the sequence of the second peptide (1-based).')]#
The position of the crosslinker in the sequence of the second peptide (1-based).
- beta_proteins: Annotated[List[str] | None, Field(frozen=True, description='The accessions of proteins that the second peptide is associated with.')]#
The accessions of proteins that the second peptide is associated with.
- beta_proteins_crosslink_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the crosslink in the proteins of the second peptide (1-based).')]#
Positions of the crosslink in the proteins of the second peptide (1-based). If given, the list should be of the same length as beta_proteins, and the crosslink position at list index i should correspond to the protein at list index i in beta_proteins.
- property completeness: Literal['full', 'partial']#
Completeness of the crosslink: "full" if none of the attributes is None, otherwise "partial".
- copy_with_update(
- update: Dict[str, Any] = {},
Creates a deep copy of the crosslink with optional attribute updates.
- Parameters:
update (dict of str, any, default = empty dict) – Dictionary mapping attribute names (str) to their updated values. The default (empty dict) will create a deep copy with the original attribute values.
- Returns:
New crosslink with optionally updated attributes.
- Return type:
Crosslink
Examples
>>> from pyXLMS.data import Crosslink
>>> xl = Crosslink(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... alpha_proteins=["PROT"],
... beta_peptide="PEKP",
... beta_peptide_crosslink_position=3,
... beta_proteins=["PROT"],
... )
>>> xl_copy = xl.copy_with_update(
... update={"additional_information": {"homomeric": True}}
... )
- property crosslink_type: Literal['intra', 'inter']#
Link type of the crosslink: "intra" if the proteins in alpha_proteins and beta_proteins overlap, otherwise "inter".
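The overlap rule can be sketched in a few lines of plain Python. This is an illustration only, not the pyXLMS implementation: the function name is hypothetical, "overlap" is assumed to mean at least one shared accession, and the case where either list is None is deliberately left out because its behaviour is not described here.

```python
from typing import List


def crosslink_type_sketch(alpha_proteins: List[str], beta_proteins: List[str]) -> str:
    """Classify a crosslink as "intra" or "inter" from its protein accessions."""
    # Assumption: one shared accession is enough to count as an overlap.
    shared = set(alpha_proteins) & set(beta_proteins)
    return "intra" if shared else "inter"


print(crosslink_type_sketch(["Cas9"], ["Cas9"]))  # intra
print(crosslink_type_sketch(["Cas9"], ["OtherProtein"]))  # inter
```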
- property data_type: Literal['crosslink']#
Data type of the object.
- display(
- show_additional_information: bool = False,
- return_str: bool = False,
Pretty prints the crosslink.
- Parameters:
show_additional_information (bool, default = False) – Also display data in the additional_information.
return_str (bool, default = False) – If the display string should be returned.
- Returns:
The display string of the crosslink if return_str = True, otherwise None.
- Return type:
None, or str
Examples
>>> from pyXLMS import parser
>>> pr = parser.read(
... "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1.pdResult",
... engine="MS Annika",
... crosslinker="DSS",
... )
>>> xls = pr["crosslinks"]
>>> xls[0].display()
Data Type: crosslink
Completeness: full
Alpha Peptide: GQKNSR
Alpha Peptide Crosslink Position: 3
Alpha Proteins: ['Cas9']
Alpha Proteins Crosslink Positions: [779]
Alpha Decoy: False
Beta Peptide: GQKNSR
Beta Peptide Crosslink Position: 3
Beta Proteins: ['Cas9']
Beta Proteins Crosslink Positions: [779]
Beta Decoy: False
Crosslink Type: intra
Crosslink Score: 119.82547820493929
- items() → List[Tuple[str, Any]]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of tuples of attribute name, attribute value.
- Return type:
list of tuple of str, any
Notes
This internally just calls self.model_dump(mode="python").items(). See model_dump.
- keys() → List[str]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute names.
- Return type:
list of str
Notes
This internally just calls self.model_dump(mode="python").keys(). See model_dump.
- model_config = {'str_strip_whitespace': True, 'strict': True, 'validate_assignment': True}#
Pydantic configuration for the underlying validation model.
- model_post_init(context: Any = None) → None#
Performs extra validation and post init functions.
Notes
Alpha and beta assignment is decided internally by whichever peptide’s sequence comes first alphabetically. If the beta_peptide’s sequence comes first alphabetically, it is assigned to alpha_peptide and the original alpha_peptide is assigned to beta_peptide (the same happens for all other corresponding alpha and beta values).
Warning
This method should not be called manually!
- score: Annotated[float | None, Field(frozen=True, description='Score of the crosslink.')]#
Score of the crosslink.
- to_proforma(crosslinker: str | float | None = None) → str#
Returns the Proforma string for the crosslink.
- Parameters:
crosslinker (str, or float, or None, default = None) – Optional name or mass of the crosslink reagent. If the name is given, it should be a valid name from XLMOD.
- Returns:
The Proforma string of the crosslink.
- Return type:
str
Notes
If no crosslinker is given, the unmodified peptide Proforma will be returned.
Examples
>>> from pyXLMS.data import create_crosslink_min
>>> xl = create_crosslink_min("PEPKTIDE", 4, "KPEPTIDE", 1)
>>> xl.to_proforma()
'KPEPTIDE//PEPKTIDE'
>>> from pyXLMS.data import create_crosslink_min
>>> xl = create_crosslink_min("PEPKTIDE", 4, "KPEPTIDE", 1)
>>> xl.to_proforma(crosslinker="Xlink:DSSO")
'K[Xlink:DSSO]PEPTIDE//PEPK[Xlink:DSSO]TIDE'
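To make the string layout in the two examples above easier to follow, here is a standalone sketch that reproduces them without pyXLMS. The helper names are hypothetical, the peptides are assumed to be in alpha/beta order already, and only a crosslinker name (str) is handled; how a numeric crosslinker mass is rendered is not shown in this section, so it is left out.

```python
from typing import Optional


def annotate(peptide: str, position: int, tag: str) -> str:
    """Insert '[tag]' directly after the residue at the 1-based position."""
    return f"{peptide[:position]}[{tag}]{peptide[position:]}"


def crosslink_proforma_sketch(
    alpha: str,
    alpha_position: int,
    beta: str,
    beta_position: int,
    crosslinker: Optional[str] = None,
) -> str:
    """Join both peptides with '//' and tag the crosslinked residues."""
    if crosslinker is None:
        # Without a crosslinker the unmodified peptides are returned.
        return f"{alpha}//{beta}"
    return (
        f"{annotate(alpha, alpha_position, crosslinker)}"
        f"//{annotate(beta, beta_position, crosslinker)}"
    )


print(crosslink_proforma_sketch("KPEPTIDE", 1, "PEPKTIDE", 4))
print(crosslink_proforma_sketch("KPEPTIDE", 1, "PEPKTIDE", 4, "Xlink:DSSO"))
```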
- values() → List[Any]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute values.
- Return type:
list of any
Notes
This internally just calls self.model_dump(mode="python").values(). See model_dump.
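The dict-like read access offered by keys(), values() and items() follows a simple pattern: dump the object to a dictionary and delegate. The sketch below shows that pattern with a standard-library dataclass so that it runs without Pydantic; dataclasses.asdict stands in for model_dump(mode="python"), and the class DictLikeRecord is hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DictLikeRecord:
    """Hypothetical record with dict-like read access."""

    alpha_peptide: str
    beta_peptide: str
    score: Optional[float] = None

    def _dump(self) -> Dict[str, Any]:
        # Stand-in for Pydantic's model_dump(mode="python").
        return asdict(self)

    def keys(self) -> List[str]:
        return list(self._dump().keys())

    def values(self) -> List[Any]:
        return list(self._dump().values())

    def items(self) -> List[Tuple[str, Any]]:
        return list(self._dump().items())


record = DictLikeRecord(alpha_peptide="PEKP", beta_peptide="TKIDE")
print(record.keys())  # ['alpha_peptide', 'beta_peptide', 'score']
```

Returning lists rather than dictionary views matches the return types documented above.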
- class pyXLMS.data.CrosslinkSpectrumMatch(
- *,
- alpha_peptide: str,
- alpha_peptide_crosslink_position: int,
- beta_peptide: str,
- beta_peptide_crosslink_position: int,
- spectrum_file: str,
- scan_nr: int,
- alpha_modifications: Dict[int, Tuple[str, float]] | None = None,
- alpha_proteins: List[str] | None = None,
- alpha_proteins_crosslink_positions: List[int] | None = None,
- alpha_proteins_peptide_positions: List[int] | None = None,
- alpha_score: float | None = None,
- alpha_decoy: bool | None = None,
- beta_modifications: Dict[int, Tuple[str, float]] | None = None,
- beta_proteins: List[str] | None = None,
- beta_proteins_crosslink_positions: List[int] | None = None,
- beta_proteins_peptide_positions: List[int] | None = None,
- beta_score: float | None = None,
- beta_decoy: bool | None = None,
- score: float | None = None,
- charge: int | None = None,
- retention_time: float | None = None,
- ion_mobility: float | None = None,
- additional_information: Dict[str, Any] | None = None,
Bases: BaseModel
Core data structure representing a single crosslink-spectrum-match.
Crosslink-spectrum-matches associate two crosslinked peptides with a specific mass spectrum. They contain spectrum level information additionally to crosslink information.
Attributes Summary#
Here is a short summary of the crosslink-spectrum-match attributes. For more details on the specific Pydantic validation requirements, please refer to the corresponding attributes themselves.
Required#
The following attributes are required:
- alpha_peptide : str
The unmodified amino acid sequence of the first peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- alpha_peptide_crosslink_position : int
The position of the crosslinker in the sequence of the first peptide (1-based).
- beta_peptide : str
The unmodified amino acid sequence of the second peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- beta_peptide_crosslink_position : int
The position of the crosslinker in the sequence of the second peptide (1-based).
- spectrum_file : str
Name of the spectrum file the crosslink-spectrum-match was identified in.
- scan_nr : int
The corresponding scan number of the crosslink-spectrum-match. If the scan number is not available, the spectrum index should be provided.
Optional#
The following attributes are optional:
- alpha_modifications : dict of int, tuple of str, float, or None, default = None
The modifications of the first peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified, an empty dictionary should be given.
- alpha_proteins : list of str, or None, default = None
The accessions of proteins that the first peptide is associated with.
- alpha_proteins_crosslink_positions : list of int, or None, default = None
Positions of the crosslink in the proteins of the first peptide (1-based). If given, the list should be of the same length as alpha_proteins, and the crosslink position at list index i should correspond to the protein at list index i in alpha_proteins.
- alpha_proteins_peptide_positions : list of int, or None, default = None
Positions of the first peptide in the corresponding proteins (1-based). If given, the list should be of the same length as alpha_proteins, and the peptide position at list index i should correspond to the protein at list index i in alpha_proteins.
- alpha_score : float, or None, default = None
Identification score of the first peptide.
- alpha_decoy : bool, or None, default = None
Whether the first peptide is from the decoy database (True) or not (False).
- beta_modifications : dict of int, tuple of str, float, or None, default = None
The modifications of the second peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified, an empty dictionary should be given.
- beta_proteins : list of str, or None, default = None
The accessions of proteins that the second peptide is associated with.
- beta_proteins_crosslink_positions : list of int, or None, default = None
Positions of the crosslink in the proteins of the second peptide (1-based). If given, the list should be of the same length as beta_proteins, and the crosslink position at list index i should correspond to the protein at list index i in beta_proteins.
- beta_proteins_peptide_positions : list of int, or None, default = None
Positions of the second peptide in the corresponding proteins (1-based). If given, the list should be of the same length as beta_proteins, and the peptide position at list index i should correspond to the protein at list index i in beta_proteins.
- beta_score : float, or None, default = None
Identification score of the second peptide.
- beta_decoy : bool, or None, default = None
Whether the second peptide is from the decoy database (True) or not (False).
- score : float, or None, default = None
Score of the crosslink-spectrum-match.
- charge : int, or None, default = None
The precursor charge of the corresponding mass spectrum of the crosslink-spectrum-match.
- retention_time : float, or None, default = None
The retention time of the corresponding mass spectrum of the crosslink-spectrum-match in seconds.
- ion_mobility : float, or None, default = None
The ion mobility or compensation voltage of the corresponding mass spectrum of the crosslink-spectrum-match.
- additional_information : dict of str, any, or None, default = None
A dictionary with additional information associated with the crosslink-spectrum-match.
Notes
Alpha and beta assignment is decided internally by whichever peptide’s sequence comes first alphabetically. If the beta_peptide’s sequence comes first alphabetically, it is assigned to alpha_peptide and the original alpha_peptide is assigned to beta_peptide (the same happens for all other corresponding alpha and beta values).
Examples
>>> from pyXLMS.data import CrosslinkSpectrumMatch as CSM
>>> csm = CSM(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... spectrum_file="dsso.mzML",
... scan_nr=1,
... )
- additional_information: Annotated[Dict[str, Any] | None, Field(frozen=False, description='A dictionary with additional information associated with the crosslink-spectrum-match.')]#
A dictionary with additional information associated with the crosslink-spectrum-match.
- alpha_decoy: Annotated[bool | None, Field(frozen=True, description='Whether the first peptide is from the decoy database or not.')]#
Whether the first peptide is from the decoy database (True) or not (False).
- alpha_modifications: Annotated[Dict[int, Tuple[str, float]] | None, Field(frozen=True, description='The modifications of the first peptide.')]#
The modifications of the first peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified, an empty dictionary should be given.
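The position convention for modification dictionaries (0 for the N-terminus, 1 to len(peptide) for residues, len(peptide) + 1 for the C-terminus) can be checked with a short standalone sketch. The function name and the example modifications are illustrative assumptions; this is not part of pyXLMS and does not reproduce its validation.

```python
from typing import Dict, List, Tuple

Modifications = Dict[int, Tuple[str, float]]


def locate_modifications(
    peptide: str, modifications: Modifications
) -> List[Tuple[str, str, float]]:
    """Translate each position into a site label: N-term, C-term or residue + index."""
    located = []
    for position in sorted(modifications):
        name, delta_mass = modifications[position]
        if position == 0:
            site = "N-term"
        elif position == len(peptide) + 1:
            site = "C-term"
        elif 1 <= position <= len(peptide):
            # Positions are 1-based, Python indexing is 0-based.
            site = f"{peptide[position - 1]}{position}"
        else:
            raise ValueError(f"Position {position} is outside of peptide {peptide}!")
        located.append((site, name, delta_mass))
    return located


example = {0: ("Acetyl", 42.010565), 3: ("Oxidation", 15.994915)}
print(locate_modifications("KPMEPTIDE", example))
```

An unmodified peptide is passed as an empty dictionary and yields an empty list.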
- alpha_peptide: Annotated[str, Field(frozen=True, description='The unmodified amino acid sequence of the first peptide.')]#
The unmodified amino acid sequence of the first peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- alpha_peptide_crosslink_position: Annotated[int, Field(frozen=True, description='The position of the crosslinker in the sequence of the first peptide (1-based).')]#
The position of the crosslinker in the sequence of the first peptide (1-based).
- alpha_proteins: Annotated[List[str] | None, Field(frozen=True, description='The accessions of proteins that the first peptide is associated with.')]#
The accessions of proteins that the first peptide is associated with.
- alpha_proteins_crosslink_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the crosslink in the proteins of the first peptide (1-based).')]#
Positions of the crosslink in the proteins of the first peptide (1-based). If given, the list should be of the same length as alpha_proteins, and the crosslink position at list index i should correspond to the protein at list index i in alpha_proteins.
- alpha_proteins_peptide_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the first peptide in the corresponding proteins (1-based).')]#
Positions of the first peptide in the corresponding proteins (1-based). If given, the list should be of the same length as alpha_proteins, and the peptide position at list index i should correspond to the protein at list index i in alpha_proteins.
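Because all positions are 1-based and the three lists are index-aligned, the crosslink position in a protein can be derived from the peptide position in that protein and the crosslink position in the peptide. The sketch below assumes exactly that relationship (peptide position + crosslink position - 1); the function name is hypothetical, and the numbers are taken from the MS Annika display example further down (peptide position 777, crosslink position 3, protein crosslink position 779).

```python
from typing import Dict, List


def protein_crosslink_positions(
    proteins: List[str],
    peptide_positions: List[int],
    peptide_crosslink_position: int,
) -> Dict[str, int]:
    """Map each protein accession to the 1-based crosslink position in that protein."""
    if len(proteins) != len(peptide_positions):
        raise ValueError("proteins and peptide_positions must have the same length!")
    # Entry i of peptide_positions belongs to entry i of proteins.
    return {
        protein: peptide_position + peptide_crosslink_position - 1
        for protein, peptide_position in zip(proteins, peptide_positions)
    }


print(protein_crosslink_positions(["Cas9"], [777], 3))  # {'Cas9': 779}
```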
- alpha_score: Annotated[float | None, Field(frozen=True, description='Identification score of the first peptide.')]#
Identification score of the first peptide.
- beta_decoy: Annotated[bool | None, Field(frozen=True, description='Whether the beta peptide is from the decoy database or not.')]#
Whether the second peptide is from the decoy database (True) or not (False).
- beta_modifications: Annotated[Dict[int, Tuple[str, float]] | None, Field(frozen=True, description='The modifications of the second peptide.')]#
The modifications of the second peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified, an empty dictionary should be given.
- beta_peptide: Annotated[str, Field(frozen=True, description='The unmodified amino acid sequence of the second peptide.')]#
The unmodified amino acid sequence of the second peptide. Amino acids should be in upper case. Modifications should not be included in the sequence.
- beta_peptide_crosslink_position: Annotated[int, Field(frozen=True, description='The position of the crosslinker in the sequence of the second peptide (1-based).')]#
The position of the crosslinker in the sequence of the second peptide (1-based).
- beta_proteins: Annotated[List[str] | None, Field(frozen=True, description='The accessions of proteins that the second peptide is associated with.')]#
The accessions of proteins that the second peptide is associated with.
- beta_proteins_crosslink_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the crosslink in the proteins of the second peptide (1-based).')]#
Positions of the crosslink in the proteins of the second peptide (1-based). If given, the list should be of the same length as beta_proteins, and the crosslink position at list index i should correspond to the protein at list index i in beta_proteins.
- beta_proteins_peptide_positions: Annotated[List[int] | None, Field(frozen=True, description='Positions of the second peptide in the corresponding proteins (1-based).')]#
Positions of the second peptide in the corresponding proteins (1-based). If given, the list should be of the same length as beta_proteins, and the peptide position at list index i should correspond to the protein at list index i in beta_proteins.
- beta_score: Annotated[float | None, Field(frozen=True, description='Identification score of the second peptide.')]#
Identification score of the second peptide.
- charge: Annotated[int | None, Field(frozen=True, description='The precursor charge of the corresponding mass spectrum of the crosslink-spectrum-match.')]#
The precursor charge of the corresponding mass spectrum of the crosslink-spectrum-match.
- property completeness: Literal['full', 'partial']#
Completeness of the crosslink-spectrum-match: "full" if none of the attributes is None, otherwise "partial".
- copy_with_update(
- update: Dict[str, Any] = {},
Creates a deep copy of the crosslink-spectrum-match with optional attribute updates.
- Parameters:
update (dict of str, any, default = empty dict) – Dictionary mapping attribute names (str) to their updated values. The default (empty dict) will create a deep copy with the original attribute values.
- Returns:
New crosslink-spectrum-match with optionally updated attributes.
- Return type:
CrosslinkSpectrumMatch
Examples
>>> from pyXLMS.data import CrosslinkSpectrumMatch as CSM
>>> csm = CSM(
... alpha_peptide="PEKP",
... alpha_peptide_crosslink_position=3,
... beta_peptide="TKIDE",
... beta_peptide_crosslink_position=2,
... spectrum_file="dsso.mzML",
... scan_nr=1,
... )
>>> csm_copy = csm.copy_with_update(update={"scan_nr": 2})
- property crosslink_type: Literal['intra', 'inter']#
Link type of the crosslink-spectrum-match: "intra" if the proteins in alpha_proteins and beta_proteins overlap, otherwise "inter".
- property data_type: Literal['crosslink-spectrum-match']#
Data type of the object.
- display(
- show_additional_information: bool = False,
- return_str: bool = False,
Pretty prints the crosslink-spectrum-match.
- Parameters:
show_additional_information (bool, default = False) – Also display data in the additional_information.
return_str (bool, default = False) – If the display string should be returned.
- Returns:
The display string of the crosslink-spectrum-match if return_str = True, otherwise None.
- Return type:
None, or str
Examples
>>> from pyXLMS import parser
>>> pr = parser.read(
... "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1.pdResult",
... engine="MS Annika",
... crosslinker="DSS",
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> csms[0].display()
Data Type: crosslink-spectrum-match
Completeness: full
Alpha Peptide: GQKNSR
Alpha Modifications: {3: ('DSS', 138.06808)}
Alpha Peptide Crosslink Position: 3
Alpha Proteins: ['Cas9']
Alpha Proteins Crosslink Positions: [779]
Alpha Proteins Peptide Positions: [777]
Alpha Peptide Score: 119.82548987540834
Alpha Decoy: False
Beta Peptide: GQKNSR
Beta Modifications: {3: ('DSS', 138.06808)}
Beta Peptide Crosslink Position: 3
Beta Proteins: ['Cas9']
Beta Proteins Crosslink Positions: [779]
Beta Proteins Peptide Positions: [777]
Beta Peptide Score: 119.82547820493929
Beta Decoy: False
Crosslink Type: intra
CSM Score: 119.82547820493929
Spectrum File: XLpeplib_Beveridge_QEx-HFX_DSS_R1.raw
Scan Number: 2257
Precursor Charge: 3
Retention Time: 733.1895599999999
Ion Mobility/FAIMS CV: 0.0
- ion_mobility: Annotated[float | None, Field(frozen=True, description='The ion mobility or compensation voltage of the corresponding mass spectrum of the crosslink-spectrum-match.')]#
The ion mobility or compensation voltage of the corresponding mass spectrum of the crosslink-spectrum-match.
- items() → List[Tuple[str, Any]]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of tuples of attribute name, attribute value.
- Return type:
list of tuple of str, any
Notes
This internally just calls self.model_dump(mode="python").items(). See model_dump.
- keys() → List[str]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute names.
- Return type:
list of str
Notes
This internally just calls self.model_dump(mode="python").keys(). See model_dump.
- model_config = {'str_strip_whitespace': True, 'strict': True, 'validate_assignment': True}#
Pydantic configuration for the underlying validation model.
- model_post_init(context: Any = None) → None#
Performs extra validation and post init functions.
Notes
Alpha and beta assignment is decided internally by whichever peptide’s sequence comes first alphabetically. If the beta_peptide’s sequence comes first alphabetically, it is assigned to alpha_peptide and the original alpha_peptide is assigned to beta_peptide (the same happens for all other corresponding alpha and beta values).
Warning
This method should not be called manually!
- retention_time: Annotated[float | None, Field(frozen=True, description='The retention time of the corresponding mass spectrum of the crosslink-spectrum-match in seconds.')]#
The retention time of the corresponding mass spectrum of the crosslink-spectrum-match in seconds.
- scan_nr: Annotated[int, Field(frozen=True, description='The corresponding scan number of the crosslink-spectrum-match.')]#
The corresponding scan number of the crosslink-spectrum-match. If the scan number is not available the spectrum index should be provided.
- score: Annotated[float | None, Field(frozen=True, description='Score of the crosslink-spectrum-match.')]#
Score of the crosslink-spectrum-match.
- spectrum_file: Annotated[str, Field(frozen=True, description='Name of the spectrum file the crosslink-spectrum-match was identified in.')]#
Name of the spectrum file the crosslink-spectrum-match was identified in.
- to_crosslink() → Crosslink#
Creates a crosslink from the crosslink-spectrum-match.
- Returns:
The corresponding crosslink created from the crosslink-spectrum-match.
- Return type:
Crosslink
- to_proforma(crosslinker: str | float | None = None) → str#
Returns the Proforma string for the crosslink-spectrum-match.
- Parameters:
crosslinker (str, or float, or None, default = None) – Optional name or mass of the crosslink reagent. If the name is given, it should be a valid name from XLMOD.
- Returns:
The Proforma string of the crosslink-spectrum-match.
- Return type:
str
Notes
Modifications with unknown mass are skipped.
If no modifications are given, only the crosslink modification will be encoded in the Proforma.
If no modifications are given and no crosslinker is given, the unmodified peptide Proforma will be returned.
Examples
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min("PEPKTIDE", 4, "KPEPTIDE", 1, "RUN_1", 1)
>>> csm.to_proforma()
'KPEPTIDE//PEPKTIDE'
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min("PEPKTIDE", 4, "KPEPTIDE", 1, "RUN_1", 1)
>>> csm.to_proforma(crosslinker="Xlink:DSSO")
'K[Xlink:DSSO]PEPTIDE//PEPK[Xlink:DSSO]TIDE'
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min(
... "PEPKTIDE",
... 4,
... "KPMEPTIDE",
... 1,
... "RUN_1",
... 1,
... modifications_b={3: ("Oxidation", 15.994915)},
... )
>>> csm.to_proforma(crosslinker="Xlink:DSSO")
'K[Xlink:DSSO]PM[+15.994915]EPTIDE//PEPK[Xlink:DSSO]TIDE'
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min(
... "PEPKTIDE",
... 4,
... "KPMEPTIDE",
... 1,
... "RUN_1",
... 1,
... modifications_b={3: ("Oxidation", 15.994915)},
... charge=3,
... )
>>> csm.to_proforma(crosslinker="Xlink:DSSO")
'K[Xlink:DSSO]PM[+15.994915]EPTIDE//PEPK[Xlink:DSSO]TIDE/3'
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min(
... "PEPKTIDE",
... 4,
... "KPMEPTIDE",
... 1,
... "RUN_1",
... 1,
... modifications_a={4: ("DSSO", 158.00376)},
... modifications_b={1: ("DSSO", 158.00376), 3: ("Oxidation", 15.994915)},
... charge=3,
... )
>>> csm.to_proforma()
'K[+158.00376]PM[+15.994915]EPTIDE//PEPK[+158.00376]TIDE/3'
>>> from pyXLMS.data import create_csm_min
>>> csm = create_csm_min(
... "PEPKTIDE",
... 4,
... "KPMEPTIDE",
... 1,
... "RUN_1",
... 1,
... modifications_a={4: ("DSSO", 158.00376)},
... modifications_b={1: ("DSSO", 158.00376), 3: ("Oxidation", 15.994915)},
... charge=3,
... )
>>> csm.to_proforma(crosslinker="Xlink:DSSO")
'K[+158.00376]PM[+15.994915]EPTIDE//PEPK[+158.00376]TIDE/3'
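The examples above suggest three rules: modifications are written as signed delta masses, a modification given at the crosslink position takes the place of the crosslinker name, and the precursor charge is appended as "/charge". The following standalone sketch reproduces those outputs under these assumptions. It is not the pyXLMS implementation: the function names are hypothetical, the peptides are assumed to be in alpha/beta order already, and terminal modifications (positions 0 and len(peptide) + 1) as well as modifications with unknown mass are not handled.

```python
from typing import Dict, Optional, Tuple

Modifications = Dict[int, Tuple[str, float]]


def annotate_peptide(
    peptide: str,
    crosslink_position: int,
    modifications: Optional[Modifications] = None,
    crosslinker: Optional[str] = None,
) -> str:
    """Append a '[tag]' after every residue that carries a modification or crosslink."""
    tags: Dict[int, str] = {}
    if crosslinker is not None:
        tags[crosslink_position] = crosslinker
    for position, (_name, delta_mass) in (modifications or {}).items():
        # Assumption: a listed modification overrides the crosslinker name.
        tags[position] = f"{delta_mass:+}"
    parts = []
    for index, residue in enumerate(peptide, start=1):
        parts.append(residue)
        if index in tags:
            parts.append(f"[{tags[index]}]")
    return "".join(parts)


def csm_proforma_sketch(
    alpha: str,
    alpha_position: int,
    beta: str,
    beta_position: int,
    alpha_modifications: Optional[Modifications] = None,
    beta_modifications: Optional[Modifications] = None,
    crosslinker: Optional[str] = None,
    charge: Optional[int] = None,
) -> str:
    """Join both annotated peptides with '//' and append the charge if known."""
    proforma = (
        annotate_peptide(alpha, alpha_position, alpha_modifications, crosslinker)
        + "//"
        + annotate_peptide(beta, beta_position, beta_modifications, crosslinker)
    )
    return proforma if charge is None else f"{proforma}/{charge}"


print(
    csm_proforma_sketch(
        "KPMEPTIDE",
        1,
        "PEPKTIDE",
        4,
        alpha_modifications={3: ("Oxidation", 15.994915)},
        crosslinker="Xlink:DSSO",
        charge=3,
    )
)
```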
- values() → List[Any]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute values.
- Return type:
list of any
Notes
This internally just calls self.model_dump(mode="python").values(). See model_dump.
- class pyXLMS.data.ParserResult(
- *,
- search_engine: str,
- crosslink_spectrum_matches: List[CrosslinkSpectrumMatch] | None = None,
- crosslinks: List[Crosslink] | None = None,
Bases: BaseModel
Core data structure for parser results.
Data structure returned by any (parser) function that reads crosslink-spectrum-matches and/or crosslinks.
Attributes Summary#
Here is a short summary of the parser result attributes. For more details on the specific Pydantic validation requirements, please refer to the corresponding attributes themselves.
Required#
The following attributes are required:
- search_engine: str
The name of the identifying crosslink search engine.
Optional#
The following attributes are optional:
- crosslink_spectrum_matches: list of CrosslinkSpectrumMatch, or None, default = None
List of parsed crosslink-spectrum-matches.
- crosslinks: list of Crosslink, or None, default = None
List of parsed crosslinks.
Examples
>>> from pyXLMS.data import Crosslink
>>> from pyXLMS.data import ParserResult
>>> xl = Crosslink(
...     alpha_peptide="PEKP",
...     alpha_peptide_crosslink_position=3,
...     beta_peptide="TKIDE",
...     beta_peptide_crosslink_position=2,
... )
>>> pr = ParserResult(search_engine="My Search Engine", crosslinks=[xl])
- property completeness: Literal['full', 'partial', 'empty']#
Completeness of the parser result: "full" if all attributes are not None, "empty" if both crosslink-spectrum-matches and crosslinks are None, and "partial" otherwise.
- copy_with_update(
- update: Dict[str, Any] = {},
Creates a deep copy of the parser result with optional attribute updates.
- Parameters:
update (dict of str, any, default = empty dict) – Dictionary mapping attribute names (str) to their updated values. The default (empty dict) will create a deep copy with the original attribute values.
- Returns:
New parser result with optionally updated attributes.
- Return type:
Examples
>>> from pyXLMS.data import Crosslink
>>> from pyXLMS.data import ParserResult
>>> pr = ParserResult(search_engine="My Search Engine")
>>> xl = Crosslink(
...     alpha_peptide="PEKP",
...     alpha_peptide_crosslink_position=3,
...     beta_peptide="TKIDE",
...     beta_peptide_crosslink_position=2,
... )
>>> pr_copy = pr.copy_with_update(update={"crosslinks": [xl]})
- crosslink_spectrum_matches: Annotated[List[CrosslinkSpectrumMatch] | None, Field(frozen=True, description='List of parsed crosslink-spectrum-matches.')]#
List of parsed crosslink-spectrum-matches.
- crosslinks: Annotated[List[Crosslink] | None, Field(frozen=True, description='List of parsed crosslinks.')]#
List of parsed crosslinks.
- csms(
- create_copy: bool = True,
Shorthand function to retrieve crosslink-spectrum-matches.
- Parameters:
create_copy (bool, default = True) – Whether a deep copy of the crosslink-spectrum-matches should be returned (default) or self.crosslink_spectrum_matches directly.
- Returns:
Returns (a deep copy of) self.crosslink_spectrum_matches.
- Return type:
list of CrosslinkSpectrumMatch, or None
Notes
Please be aware that by default this explicitly creates a deep copy of the underlying data!
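The practical difference between the two modes can be illustrated without pyXLMS. The following is a minimal sketch using only the standard library; `ResultSketch` is a hypothetical stand-in for a parser result and plain dictionaries stand in for crosslink-spectrum-matches.

```python
import copy
from typing import Any, Dict, List, Optional


class ResultSketch:
    """Hypothetical stand-in that mimics the create_copy switch."""

    def __init__(self, csms: Optional[List[Dict[str, Any]]]) -> None:
        self._csms = csms

    def csms(self, create_copy: bool = True) -> Optional[List[Dict[str, Any]]]:
        # A deep copy duplicates the list and every element in it,
        # so changes to the returned data never reach the stored data.
        return copy.deepcopy(self._csms) if create_copy else self._csms


result = ResultSketch([{"scan_nr": 1}])

copied = result.csms()  # independent deep copy
copied[0]["scan_nr"] = 99  # does not affect the stored data

direct = result.csms(create_copy=False)  # the stored list itself
print(direct[0]["scan_nr"])  # still the original value
```

Deep copies are the safer default but cost time and memory for large results, which is the trade-off the `create_copy` parameter exposes.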
- property data_type: Literal['parser_result']#
Data type of the object.
- display(
- show_additional_information: bool = False,
- return_str: bool = False,
Pretty prints the parser result.
- Parameters:
show_additional_information (bool, default = False) – Also display data in the additional_information.
return_str (bool, default = False) – If the display string should be returned.
- Returns:
The display string of the parser result if return_str = True, otherwise None.
- Return type:
None, or str
Examples
>>> from pyXLMS import parser
>>> pr = parser.read(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1.pdResult",
...     engine="MS Annika",
...     crosslinker="DSS",
... )
>>> pr.display()
Data Type: parser_result
Completeness: full
Identifying Search Engine: MS Annika
Number of Crosslink-Spectrum-Matches: 826
Number of Crosslinks: 300
- items() List[Tuple[str, Any]][source]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of tuples of attribute name, attribute value.
- Return type:
list of tuple of str, any
Notes
This internally just calls self.model_dump(mode="python").items(). See model_dump.
- keys() List[str][source]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute names.
- Return type:
list of str
Notes
This internally just calls self.model_dump(mode="python").keys(). See model_dump.
- model_config = {'str_strip_whitespace': True, 'strict': True, 'validate_assignment': True}#
Pydantic configuration for the underlying validation model.
- search_engine: Annotated[str, Field(frozen=True, description='The name of the identifying crosslink search engine.')]#
The name of the identifying crosslink search engine.
- values() List[Any][source]#
Support for dict-like read access for backward compatibility.
- Returns:
Returns a list of attribute values.
- Return type:
list of any
Notes
This internally just calls self.model_dump(mode="python").values(). See model_dump.
- xls(
- create_copy: bool = True,
Shorthand function to retrieve crosslinks.
- Parameters:
create_copy (bool, default = True) – Whether a deep copy of the crosslinks should be returned (default) or self.crosslinks directly.
- Returns:
Returns (a deep copy of) self.crosslinks.
- Return type:
list of Crosslink, or None
Notes
Please be aware that by default this explicitly creates a deep copy of the underlying data!
- pyXLMS.data.check_indexing(value: int | List[int]) bool[source]#
Checks that the given value is not 0-based.
- Parameters:
value (int, or list of int) – The value(s) to check.
- Returns:
If the given value(s) is/are okay.
- Return type:
bool
- Raises:
ValueError – If any of the values are smaller than one.
Examples
>>> from pyXLMS.data import check_indexing
>>> check_indexing([1, 2, 3])
True
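The documented contract (return a bool, raise ValueError for any value smaller than one) could look roughly like the sketch below. `check_indexing_sketch` is a hypothetical re-implementation for illustration only; the error message and internal structure of the real function are not known from this page.

```python
from typing import List, Union


def check_indexing_sketch(value: Union[int, List[int]]) -> bool:
    """Return True if all positions are 1-based, otherwise raise ValueError."""
    values = value if isinstance(value, list) else [value]
    for position in values:
        if position < 1:
            # Assumption: the wording of the message is illustrative.
            raise ValueError(f"Positions must be 1-based, but got {position}.")
    return True


print(check_indexing_sketch([1, 2, 3]))

try:
    check_indexing_sketch([1, 0])
except ValueError as error:
    print(f"Rejected: {error}")
```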
- pyXLMS.data.check_input(
- parameter: Any,
- parameter_name: str,
- supported_class: Any,
- supported_subclass: Any | None = None,
Checks if the given parameter is of the specified type.
Function that checks if a given parameter is of the specified type and if iterable, all elements are of the specified element type. This is mostly an input check function to catch any errors arising from not supported inputs early.
- Parameters:
parameter (any) – Parameter to check class of.
parameter_name (str) – Name of the parameter.
supported_class (any) – Class the parameter has to be of.
supported_subclass (any, or None, default = None) – Class of the values in case the parameter is a list or dict.
- Returns:
If the given input is okay.
- Return type:
bool
- Raises:
TypeError – If the parameter is not of the given class.
Examples
>>> from pyXLMS.data import check_input
>>> check_input("PEPTIDE", "peptide_a", str)
True
>>> from pyXLMS.data import check_input
>>> check_input([1, 2], "xl_position_proteins_a", list, int)
True
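A type check of this kind, including the element check for lists and dictionaries, might be sketched as follows. `check_input_sketch` is a hypothetical stand-in built on `isinstance`; whether the library uses `isinstance` or exact type comparison, and whether dictionary keys are also checked, are assumptions.

```python
from typing import Any, Optional


def check_input_sketch(
    parameter: Any,
    parameter_name: str,
    supported_class: Any,
    supported_subclass: Optional[Any] = None,
) -> bool:
    """Return True if the parameter (and its elements) have the expected types."""
    if not isinstance(parameter, supported_class):
        raise TypeError(
            f"{parameter_name} must be of type {supported_class.__name__}, "
            f"got {type(parameter).__name__}."
        )
    if supported_subclass is not None and isinstance(parameter, (list, dict)):
        # Assumption: for dictionaries only the values are checked.
        elements = parameter.values() if isinstance(parameter, dict) else parameter
        for element in elements:
            if not isinstance(element, supported_subclass):
                raise TypeError(
                    f"Elements of {parameter_name} must be of type "
                    f"{supported_subclass.__name__}, got {type(element).__name__}."
                )
    return True


print(check_input_sketch("PEPTIDE", "peptide_a", str))
print(check_input_sketch([1, 2], "xl_position_proteins_a", list, int))
```

Failing early with the parameter name in the message makes it easier to trace an unsupported input back to the call that supplied it.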
- pyXLMS.data.check_input_multi(
- parameter: Any,
- parameter_name: str,
- supported_classes: List[Any],
- supported_subclass: Any | None = None,
Checks if the given parameter is of one of the specified types.
Function that checks if a given parameter is of one of the specified types and if iterable, all elements are of the specified element type. This is mostly an input check function to catch any errors arising from not supported inputs early.
- Parameters:
parameter (any) – Parameter to check class of.
parameter_name (str) – Name of the parameter.
supported_classes (list of any) – Classes the parameter has to be of.
supported_subclass (any, or None, default = None) – Class of the values in case the parameter is a list or dict.
- Returns:
If the given input is okay.
- Return type:
bool
- Raises:
TypeError – If the parameter is not of one of the given classes.
Examples
>>> from pyXLMS.data import check_input_multi
>>> check_input_multi("PEPTIDE", "peptide_a", [str, list])
True
- pyXLMS.data.create_crosslink(
- peptide_a: str,
- xl_position_peptide_a: int,
- proteins_a: List[str] | None,
- xl_position_proteins_a: List[int] | None,
- decoy_a: bool | None,
- peptide_b: str,
- xl_position_peptide_b: int,
- proteins_b: List[str] | None,
- xl_position_proteins_b: List[int] | None,
- decoy_b: bool | None,
- score: float | None,
- additional_information: Dict[str, Any] | None = None,
Creates a crosslink data structure.
Contains minimal data necessary for representing a single crosslink. The returned crosslink data structure is a dictionary with keys as detailed in the return section.
- Parameters:
peptide_a (str) – The unmodified amino acid sequence of the first peptide.
xl_position_peptide_a (int) – The position of the crosslinker in the sequence of the first peptide (1-based).
proteins_a (list of str, or None) – The accessions of proteins that the first peptide is associated with.
xl_position_proteins_a (list of int, or None) – Positions of the crosslink in the proteins of the first peptide (1-based).
decoy_a (bool, or None) – Whether the alpha peptide is from the decoy database or not.
peptide_b (str) – The unmodified amino acid sequence of the second peptide.
xl_position_peptide_b (int) – The position of the crosslinker in the sequence of the second peptide (1-based).
proteins_b (list of str, or None) – The accessions of proteins that the second peptide is associated with.
xl_position_proteins_b (list of int, or None) – Positions of the crosslink in the proteins of the second peptide (1-based).
decoy_b (bool, or None) – Whether the beta peptide is from the decoy database or not.
score (float, or None) – Score of the crosslink.
additional_information (dict with str keys, or None, default = None) – A dictionary with additional information associated with the crosslink.
- Returns:
The dictionary representing the crosslink with keys data_type, completeness, alpha_peptide, alpha_peptide_crosslink_position, alpha_proteins, alpha_proteins_crosslink_positions, alpha_decoy, beta_peptide, beta_peptide_crosslink_position, beta_proteins, beta_proteins_crosslink_positions, beta_decoy, crosslink_type, score, and additional_information. Alpha and beta are assigned based on peptide sequence: the peptide that comes first alphabetically is assigned to alpha.
- Return type:
dict
- Raises:
TypeError – If the parameter is not of the given class.
ValueError – If the length of crosslink positions is not equal to the length of proteins.
Notes
The minimum required data for creating a crosslink is:
- peptide_a: The unmodified amino acid sequence of the first peptide.
- peptide_b: The unmodified amino acid sequence of the second peptide.
- xl_position_peptide_a: The position of the crosslinker in the sequence of the first peptide (1-based).
- xl_position_peptide_b: The position of the crosslinker in the sequence of the second peptide (1-based).
Examples
>>> from pyXLMS.data import create_crosslink
>>> minimal_crosslink = create_crosslink(
...     peptide_a="PEPTIDEA",
...     xl_position_peptide_a=1,
...     proteins_a=None,
...     xl_position_proteins_a=None,
...     decoy_a=None,
...     peptide_b="PEPTIDEB",
...     xl_position_peptide_b=5,
...     proteins_b=None,
...     xl_position_proteins_b=None,
...     decoy_b=None,
...     score=None,
... )
>>> from pyXLMS.data import create_crosslink
>>> crosslink = create_crosslink(
...     peptide_a="PEPTIDEA",
...     xl_position_peptide_a=1,
...     proteins_a=["PROTEINA"],
...     xl_position_proteins_a=[1],
...     decoy_a=False,
...     peptide_b="PEPTIDEB",
...     xl_position_peptide_b=5,
...     proteins_b=["PROTEINB"],
...     xl_position_proteins_b=[3],
...     decoy_b=False,
...     score=34.5,
... )
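The alpha/beta rule stated in the return description can be sketched in a few lines. `assign_alpha_beta` is a hypothetical helper, not a pyXLMS function; it assumes plain string comparison of the sequences and, for identical sequences, keeps the input order, neither of which is confirmed by this page.

```python
from typing import Tuple

Peptide = Tuple[str, int]  # (sequence, 1-based crosslink position)


def assign_alpha_beta(
    peptide_a: str,
    xl_position_peptide_a: int,
    peptide_b: str,
    xl_position_peptide_b: int,
) -> Tuple[Peptide, Peptide]:
    """Return (alpha, beta), where alpha has the alphabetically first sequence."""
    candidates = [
        (peptide_a, xl_position_peptide_a),
        (peptide_b, xl_position_peptide_b),
    ]
    # sorted() is stable, so identical sequences keep their input order
    # (an assumption about tie handling).
    alpha, beta = sorted(candidates, key=lambda peptide: peptide[0])
    return alpha, beta


alpha, beta = assign_alpha_beta("PEPKTIDE", 4, "KPMEPTIDE", 1)
print(alpha, beta)
```

Sorting the peptides gives the same pair the same representation regardless of the order in which a search engine reports them, which matters when crosslinks are later grouped or compared.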
- pyXLMS.data.create_crosslink_from_csm(
- csm: CrosslinkSpectrumMatch,
Creates a crosslink data structure from a crosslink-spectrum-match.
The returned crosslink data structure is a dictionary with keys as detailed in the return section.
- Parameters:
csm (dict of str) – The crosslink-spectrum-match item to be converted to a crosslink item.
- Returns:
The dictionary representing the crosslink with keys data_type, completeness, alpha_peptide, alpha_peptide_crosslink_position, alpha_proteins, alpha_proteins_crosslink_positions, alpha_decoy, beta_peptide, beta_peptide_crosslink_position, beta_proteins, beta_proteins_crosslink_positions, beta_decoy, crosslink_type, score, and additional_information. Alpha and beta are assigned based on peptide sequence: the peptide that comes first alphabetically is assigned to alpha.
- Return type:
dict
- Raises:
TypeError – If parameter csm is not a valid crosslink-spectrum-match.
Notes
See also data.create_crosslink().
Examples
>>> from pyXLMS.data import create_csm_min, create_crosslink_from_csm
>>> csm = create_csm_min("PEPTIDEA", 1, "PEPTIDEB", 5, "RUN_1", 1)
>>> crosslink = create_crosslink_from_csm(csm)
- pyXLMS.data.create_crosslink_min(
- peptide_a: str,
- xl_position_peptide_a: int,
- peptide_b: str,
- xl_position_peptide_b: int,
- **kwargs,
Creates a crosslink data structure from minimal input.
Contains minimal data necessary for representing a single crosslink. This is an alias for data.create_crosslink() that sets all optional parameters to None for convenience. The returned crosslink data structure is a dictionary with keys as detailed in the return section.
- Parameters:
peptide_a (str) – The unmodified amino acid sequence of the first peptide.
xl_position_peptide_a (int) – The position of the crosslinker in the sequence of the first peptide (1-based).
peptide_b (str) – The unmodified amino acid sequence of the second peptide.
xl_position_peptide_b (int) – The position of the crosslinker in the sequence of the second peptide (1-based).
**kwargs – Any additional parameters will be passed to data.create_crosslink().
- Returns:
The dictionary representing the crosslink with keys data_type, completeness, alpha_peptide, alpha_peptide_crosslink_position, alpha_proteins, alpha_proteins_crosslink_positions, alpha_decoy, beta_peptide, beta_peptide_crosslink_position, beta_proteins, beta_proteins_crosslink_positions, beta_decoy, crosslink_type, score, and additional_information. Alpha and beta are assigned based on peptide sequence: the peptide that comes first alphabetically is assigned to alpha.
- Return type:
dict
Notes
See also data.create_crosslink().
Examples
>>> from pyXLMS.data import create_crosslink_min
>>> minimal_crosslink = create_crosslink_min("PEPTIDEA", 1, "PEPTIDEB", 5)
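The "minimal alias" pattern described for this function, where optional parameters default to None and anything in `**kwargs` is forwarded, can be illustrated with two hypothetical functions. `create_full_sketch` and `create_min_sketch` are illustrative names with a reduced parameter set; they only show the forwarding mechanism and make no claim about how pyXLMS implements it.

```python
from typing import Any, Dict, List, Optional


def create_full_sketch(
    peptide_a: str,
    peptide_b: str,
    proteins_a: Optional[List[str]],
    score: Optional[float],
) -> Dict[str, Any]:
    """Hypothetical 'full' constructor where every argument is required."""
    return {
        "peptide_a": peptide_a,
        "peptide_b": peptide_b,
        "proteins_a": proteins_a,
        "score": score,
    }


def create_min_sketch(peptide_a: str, peptide_b: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical 'minimal' wrapper: optional values default to None."""
    optional: Dict[str, Any] = {"proteins_a": None, "score": None}
    optional.update(kwargs)  # explicitly passed values override the defaults
    return create_full_sketch(peptide_a, peptide_b, **optional)


print(create_min_sketch("PEPTIDEA", "PEPTIDEB"))
print(create_min_sketch("PEPTIDEA", "PEPTIDEB", score=34.5))
```

A wrapper like this keeps the common case short while still allowing any single optional value to be supplied by keyword.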
- pyXLMS.data.create_csm(
- peptide_a: str,
- modifications_a: Dict[int, Tuple[str, float]] | None,
- xl_position_peptide_a: int,
- proteins_a: List[str] | None,
- xl_position_proteins_a: List[int] | None,
- pep_position_proteins_a: List[int] | None,
- score_a: float | None,
- decoy_a: bool | None,
- peptide_b: str,
- modifications_b: Dict[int, Tuple[str, float]] | None,
- xl_position_peptide_b: int,
- proteins_b: List[str] | None,
- xl_position_proteins_b: List[int] | None,
- pep_position_proteins_b: List[int] | None,
- score_b: float | None,
- decoy_b: bool | None,
- score: float | None,
- spectrum_file: str,
- scan_nr: int,
- charge: int | None,
- rt: float | None,
- im_cv: float | None,
- additional_information: Dict[str, Any] | None = None,
Creates a crosslink-spectrum-match data structure.
Contains minimal data necessary for representing a single crosslink-spectrum-match. The returned crosslink-spectrum-match data structure is a dictionary with keys as detailed in the return section.
- Parameters:
peptide_a (str) – The unmodified amino acid sequence of the first peptide.
modifications_a (dict of [int, tuple], or None) – The modifications of the first peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified an empty dictionary should be given.
xl_position_peptide_a (int) – The position of the crosslinker in the sequence of the first peptide (1-based).
proteins_a (list of str, or None) – The accessions of proteins that the first peptide is associated with.
xl_position_proteins_a (list of int, or None) – Positions of the crosslink in the proteins of the first peptide (1-based).
pep_position_proteins_a (list of int, or None) – Positions of the first peptide in the corresponding proteins (1-based).
score_a (float, or None) – Identification score of the first peptide.
decoy_a (bool, or None) – Whether the alpha peptide is from the decoy database or not.
peptide_b (str) – The unmodified amino acid sequence of the second peptide.
modifications_b (dict of [int, tuple], or None) – The modifications of the second peptide given as a dictionary that maps peptide position (1-based) to modification given as a tuple of modification name and modification delta mass. N-terminal modifications should be denoted with position 0. C-terminal modifications should be denoted with position len(peptide) + 1. If the peptide is not modified an empty dictionary should be given.
xl_position_peptide_b (int) – The position of the crosslinker in the sequence of the second peptide (1-based).
proteins_b (list of str, or None) – The accessions of proteins that the second peptide is associated with.
xl_position_proteins_b (list of int, or None) – Positions of the crosslink in the proteins of the second peptide (1-based).
pep_position_proteins_b (list of int, or None) – Positions of the second peptide in the corresponding proteins (1-based).
score_b (float, or None) – Identification score of the second peptide.
decoy_b (bool, or None) – Whether the beta peptide is from the decoy database or not.
score (float, or None) – Score of the crosslink-spectrum-match.
spectrum_file (str) – Name of the spectrum file the crosslink-spectrum-match was identified in.
scan_nr (int) – The corresponding scan number of the crosslink-spectrum-match.
charge (int, or None) – The precursor charge of the corresponding mass spectrum of the crosslink-spectrum-match.
rt (float, or None) – The retention time of the corresponding mass spectrum of the crosslink-spectrum-match in seconds.
im_cv (float, or None) – The ion mobility or compensation voltage of the corresponding mass spectrum of the crosslink-spectrum-match.
additional_information (dict with str keys, or None, default = None) – A dictionary with additional information associated with the crosslink-spectrum-match.
- Returns:
The dictionary representing the crosslink-spectrum-match with keys data_type, completeness, alpha_peptide, alpha_modifications, alpha_peptide_crosslink_position, alpha_proteins, alpha_proteins_crosslink_positions, alpha_proteins_peptide_positions, alpha_score, alpha_decoy, beta_peptide, beta_modifications, beta_peptide_crosslink_position, beta_proteins, beta_proteins_crosslink_positions, beta_proteins_peptide_positions, beta_score, beta_decoy, crosslink_type, score, spectrum_file, scan_nr, retention_time, ion_mobility, and additional_information. Alpha and beta are assigned based on peptide sequence: the peptide that comes first alphabetically is assigned to alpha.
- Return type:
dict
- Raises:
TypeError – If the parameter is not of the given class.
ValueError – If the length of crosslink positions or peptide positions is not equal to the length of proteins.
Notes
The minimum required data for creating a crosslink-spectrum-match is:
- peptide_a: The unmodified amino acid sequence of the first peptide.
- peptide_b: The unmodified amino acid sequence of the second peptide.
- xl_position_peptide_a: The position of the crosslinker in the sequence of the first peptide (1-based).
- xl_position_peptide_b: The position of the crosslinker in the sequence of the second peptide (1-based).
- spectrum_file: Name of the spectrum file the crosslink-spectrum-match was identified in.
- scan_nr: The corresponding scan number of the crosslink-spectrum-match.
Examples
>>> from pyXLMS.data import create_csm
>>> minimal_csm = create_csm(
...     peptide_a="PEPTIDEA",
...     modifications_a={},
...     xl_position_peptide_a=1,
...     proteins_a=None,
...     xl_position_proteins_a=None,
...     pep_position_proteins_a=None,
...     score_a=None,
...     decoy_a=None,
...     peptide_b="PEPTIDEB",
...     modifications_b={},
...     xl_position_peptide_b=5,
...     proteins_b=None,
...     xl_position_proteins_b=None,
...     pep_position_proteins_b=None,
...     score_b=None,
...     decoy_b=None,
...     score=None,
...     spectrum_file="MS_EXP1",
...     scan_nr=1,
...     charge=None,
...     rt=None,
...     im_cv=None,
... )
>>> from pyXLMS.data import create_csm
>>> csm = create_csm(
...     peptide_a="PEPTIDEA",
...     modifications_a={1: ("Oxidation", 15.994915)},
...     xl_position_peptide_a=1,
...     proteins_a=["PROTEINA"],
...     xl_position_proteins_a=[1],
...     pep_position_proteins_a=[1],
...     score_a=20.1,
...     decoy_a=False,
...     peptide_b="PEPTIDEB",
...     modifications_b={},
...     xl_position_peptide_b=5,
...     proteins_b=["PROTEINB"],
...     xl_position_proteins_b=[3],
...     pep_position_proteins_b=[1],
...     score_b=33.7,
...     decoy_b=False,
...     score=20.1,
...     spectrum_file="MS_EXP1",
...     scan_nr=1,
...     charge=3,
...     rt=13.5,
...     im_cv=-50,
... )
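The position convention described for modifications_a and modifications_b (residues are 1-based, position 0 is the N-terminus, position len(peptide) + 1 is the C-terminus) can be checked with a short helper. `check_modification_positions` is hypothetical and not part of pyXLMS; the modification names and masses below are only illustrative values.

```python
from typing import Dict, Tuple

Modifications = Dict[int, Tuple[str, float]]


def check_modification_positions(peptide: str, modifications: Modifications) -> bool:
    """Return True if every position lies between 0 and len(peptide) + 1."""
    c_terminal_position = len(peptide) + 1
    for position, (name, _delta_mass) in modifications.items():
        if not 0 <= position <= c_terminal_position:
            raise ValueError(
                f"Modification {name!r} at position {position} is outside "
                f"the valid range 0 to {c_terminal_position}."
            )
    return True


peptide = "PEPTIDEA"  # eight residues, so the C-terminus is position 9
modifications = {
    0: ("Acetyl", 42.010565),  # N-terminal
    1: ("Oxidation", 15.994915),  # first residue
    9: ("Amidated", -0.984016),  # C-terminal
}
print(check_modification_positions(peptide, modifications))
```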
- pyXLMS.data.create_csm_min(
- peptide_a: str,
- xl_position_peptide_a: int,
- peptide_b: str,
- xl_position_peptide_b: int,
- spectrum_file: str,
- scan_nr: int,
- **kwargs,
Creates a crosslink-spectrum-match data structure from minimal input.
Contains minimal data necessary for representing a single crosslink-spectrum-match. This is an alias for data.create_csm() that sets all optional parameters to None for convenience. The returned crosslink-spectrum-match data structure is a dictionary with keys as detailed in the return section.
- Parameters:
peptide_a (str) – The unmodified amino acid sequence of the first peptide.
xl_position_peptide_a (int) – The position of the crosslinker in the sequence of the first peptide (1-based).
peptide_b (str) – The unmodified amino acid sequence of the second peptide.
xl_position_peptide_b (int) – The position of the crosslinker in the sequence of the second peptide (1-based).
spectrum_file (str) – Name of the spectrum file the crosslink-spectrum-match was identified in.
scan_nr (int) – The corresponding scan number of the crosslink-spectrum-match.
**kwargs – Any additional parameters will be passed to data.create_csm().
- Returns:
The dictionary representing the crosslink-spectrum-match with keys data_type, completeness, alpha_peptide, alpha_modifications, alpha_peptide_crosslink_position, alpha_proteins, alpha_proteins_crosslink_positions, alpha_proteins_peptide_positions, alpha_score, alpha_decoy, beta_peptide, beta_modifications, beta_peptide_crosslink_position, beta_proteins, beta_proteins_crosslink_positions, beta_proteins_peptide_positions, beta_score, beta_decoy, crosslink_type, score, spectrum_file, scan_nr, retention_time, ion_mobility, and additional_information. Alpha and beta are assigned based on peptide sequence: the peptide that comes first alphabetically is assigned to alpha.
- Return type:
dict
Notes
See also data.create_csm().
Examples
>>> from pyXLMS.data import create_csm_min
>>> minimal_csm = create_csm_min("PEPTIDEA", 1, "PEPTIDEB", 5, "MS_EXP1", 1)
- pyXLMS.data.create_parser_result(
- search_engine: str,
- csms: List[CrosslinkSpectrumMatch] | None = None,
- crosslinks: List[Crosslink] | None = None,
Creates a parser result data structure.
Contains all necessary data elements that should be contained in a result returned by a crosslink search engine result parser.
- Parameters:
search_engine (str) – Name of the identifying crosslink search engine.
csms (list of dict, or None, default = None) – List of crosslink-spectrum-matches as created by data.create_csm().
crosslinks (list of dict, or None, default = None) – List of crosslinks as created by data.create_crosslink().
- Returns:
The parser result data structure which is a dictionary with keys data_type, completeness, search_engine, crosslink-spectrum-matches and crosslinks.
- Return type:
dict
Examples
>>> from pyXLMS.data import create_parser_result
>>> result = create_parser_result("MS Annika", None, None)
>>> result["data_type"]
'parser_result'
>>> result["completeness"]
'empty'
>>> result["search_engine"]
'MS Annika'
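The 'empty' value in the example above matches the rule documented for the completeness property of ParserResult. A minimal stand-alone sketch of that rule, using a hypothetical `completeness_sketch` function rather than the library code, might look like this:

```python
from typing import List, Literal, Optional


def completeness_sketch(
    csms: Optional[List[object]],
    crosslinks: Optional[List[object]],
) -> Literal["full", "partial", "empty"]:
    """Classify a parser result by which of its two lists are present."""
    if csms is not None and crosslinks is not None:
        return "full"
    if csms is None and crosslinks is None:
        return "empty"
    return "partial"


print(completeness_sketch(None, None))
print(completeness_sketch([], None))
```

In this sketch an empty list counts as present, because the documented rule only distinguishes None from not None; whether pyXLMS treats empty lists the same way is an assumption.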