pyXLMS.plotting package

Submodules

pyXLMS.plotting.plot_peptide_pair_distribution module

pyXLMS.plotting.plot_peptide_pair_distribution.plot_peptide_pair_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
color: str = '#6d4bff',
title: str = 'Peptide Pair Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the peptide pair distribution for a set of crosslink-spectrum-matches.

Plot the peptide pair distribution as a barplot for a set of crosslink-spectrum-matches.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches.

  • top_n (int, default = 25) – Number of peptide pairs to plot. Peptide pairs are sorted by number of crosslink-spectrum-matches.

  • color (str, default = "#6d4bff") – Color of the bars.

  • title (str, default = "Peptide Pair Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_peptide_pair_distribution(csms)
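
The ranking behind top_n can be pictured with a small standard-library sketch. This is not the pyXLMS implementation: the record keys "alpha_peptide" and "beta_peptide", the made-up peptide strings, the helper name top_peptide_pairs, and the choice to treat a pair as unordered are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical, heavily simplified records; real crosslink-spectrum-matches
# carry many more fields. The key names are assumptions.
csms = [
    {"alpha_peptide": "PEPTIDEK", "beta_peptide": "KPEPTIDE"},
    {"alpha_peptide": "KPEPTIDE", "beta_peptide": "PEPTIDEK"},
    {"alpha_peptide": "PEPTIDEK", "beta_peptide": "KPEPTIDE"},
    {"alpha_peptide": "ACDK", "beta_peptide": "ACDK"},
    {"alpha_peptide": "ACDK", "beta_peptide": "PEPTIDEK"},
]


def top_peptide_pairs(records, top_n=25):
    """Count records per unordered peptide pair and keep the top_n pairs."""
    counts = Counter(
        # Sorting makes (A, B) and (B, A) count as the same pair (assumption).
        tuple(sorted((r["alpha_peptide"], r["beta_peptide"])))
        for r in records
    )
    return counts.most_common(top_n)


ranked = top_peptide_pairs(csms, top_n=2)
print(ranked[0])  # (('KPEPTIDE', 'PEPTIDEK'), 3)
```

Each entry of the returned list corresponds to one bar: the pair is the label and the count is the bar height.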

pyXLMS.plotting.plot_protein_distribution module

pyXLMS.plotting.plot_protein_distribution.plot_protein_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
colors: List[str] = ['#6d4bff', '#ac99ff'],
title: str = 'Protein Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the protein distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the protein distribution as a barplot for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • top_n (int, default = 25) – Number of proteins to plot. Proteins are sorted by number of crosslinks or crosslink-spectrum-matches.

  • colors (list of str, default = ["#6d4bff", "#ac99ff"]) – Colors of the bar-types (intra-link and inter-link).

  • title (str, default = "Protein Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_proteins’ or ‘beta_proteins’ is not available for any of the data.

  • IndexError – If not enough colors were specified.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_protein_distribution(csms)
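
As a rough illustration of the two bar types, the sketch below counts intra-links and inter-links per protein using only the standard library. The attribute names alpha_proteins and beta_proteins come from the Raises section above; the rule that a link is an intra-link when both sides share a protein, the accession strings, and the helper name protein_link_counts are assumptions, and the library's own counting may differ.

```python
from collections import Counter

# Hypothetical records with only the two attributes used here.
data = [
    {"alpha_proteins": ["P1"], "beta_proteins": ["P1"]},
    {"alpha_proteins": ["P1"], "beta_proteins": ["P2"]},
    {"alpha_proteins": ["P1"], "beta_proteins": ["P3"]},
]


def protein_link_counts(records, top_n=25):
    """Return (protein, intra_count, inter_count) for the top_n proteins."""
    intra, inter = Counter(), Counter()
    for record in records:
        alpha = set(record["alpha_proteins"])
        beta = set(record["beta_proteins"])
        # Assumption: a shared protein on both sides marks an intra-link.
        target = intra if alpha & beta else inter
        for protein in alpha | beta:
            target[protein] += 1
    totals = intra + inter  # proteins are ranked by their total link count
    return [(p, intra[p], inter[p]) for p, _ in totals.most_common(top_n)]


rows = protein_link_counts(data, top_n=3)
print(rows[0])  # ('P1', 1, 2)
```

The two counts per protein map onto the two entries of colors (intra-link and inter-link).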

pyXLMS.plotting.plot_residue_pair_distribution module

pyXLMS.plotting.plot_residue_pair_distribution.plot_residue_pair_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
color: str = '#6d4bff',
title: str = 'Residue Pair Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the residue pair distribution for a set of crosslink-spectrum-matches.

Plot the residue pair distribution as a barplot for a set of crosslink-spectrum-matches. Requires that alpha_proteins, beta_proteins, alpha_proteins_crosslink_positions, and beta_proteins_crosslink_positions fields are set for all crosslink-spectrum-matches.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches.

  • top_n (int, default = 25) – Number of residue pairs to plot. Residue pairs are sorted by number of crosslink-spectrum-matches.

  • color (str, default = "#6d4bff") – Color of the bars.

  • title (str, default = "Residue Pair Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_residue_pair_distribution(csms)

pyXLMS.plotting.plot_score_distribution module

pyXLMS.plotting.plot_score_distribution.plot_score_distribution(
data: List[Dict[str, Any]],
bins: int = 25,
density: bool = False,
colors: List[str] = ['#00a087', '#3c5488', '#e64b35'],
title: str = 'Target and Decoy Score Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the score distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the target-target, target-decoy, and decoy-decoy score distribution as a histogram for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • bins (int, default = 25) – The number of equal-width bins in the histogram.

  • density (bool, default = False) – If True, draw and return a probability density: each bin will display the bin’s raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1.

  • colors (list of str, default = ["#00a087", "#3c5488", "#e64b35"]) – Colors of the histogram lines.

  • title (str, default = "Target and Decoy Score Distribution") – The title of the histogram.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘score’, ‘alpha_decoy’, or ‘beta_decoy’ is not available for any of the data.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_score_distribution(csms)
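
The normalization described for density = True can be checked by hand. The following sketch is independent of pyXLMS and matplotlib; the helper name density_histogram and the score values are made up for illustration. It divides each bin count by the total count times the bin width and confirms that the bar areas sum to 1.

```python
def density_histogram(values, bins=25):
    """Return per-bin densities and the bin width for equal-width bins."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("values must span a non-zero range")
    width = (highest - lowest) / bins
    counts = [0] * bins
    for value in values:
        # The maximum value is assigned to the last bin (closed right edge).
        index = min(int((value - lowest) / width), bins - 1)
        counts[index] += 1
    total = len(values)
    # Density = raw count / (total number of counts * bin width).
    return [count / (total * width) for count in counts], width


scores = [1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 9.0]
densities, width = density_histogram(scores, bins=4)
area = sum(density * width for density in densities)
print(densities)  # [0.1875, 0.1875, 0.0625, 0.0625]
print(area)       # 1.0
```

With density = False the same bins would simply show the raw counts, here 3, 3, 1, and 1.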

pyXLMS.plotting.plot_string_score_distribution module

pyXLMS.plotting.plot_string_score_distribution.plot_string_score_distribution(
data: List[Dict[str, Any]],
organism: str | int | None = None,
plot_type: Literal['bar', 'hist'] = 'bar',
bins: int = 25,
density: bool = False,
zero_impute_nan: bool = True,
colors: List[str] | None = None,
title: str = 'STRING Score Distribution for Inter-Links',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
verbose: Literal[0, 1, 2] = 1,
) -> Tuple[Figure, Any]

Plot the STRING score distribution for a set of inter-links.

Plot the STRING score distribution as a barplot or histogram for inter-links of a set of crosslink-spectrum-matches or crosslinks. STRING is accessible via string-db.org.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • organism (str, or int, or None, default = None) – Organism name (e.g. Homo sapiens) or taxon identifier (e.g. 9606). Taxon identifiers are preferred. See also string-db.org/cgi/organisms. If None, the input data is assumed to be already annotated with STRING scores, and an error is raised if that is not the case.

  • plot_type ("bar", or "hist", default = "bar") – Whether STRING scores should be plotted as a bar plot or as a histogram.

  • bins (int, default = 25) – The number of equal-width bins in the histogram. Only applies to plot_type = "hist".

  • density (bool, default = False) – If True, draw and return a probability density: each bin will display the bin’s raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1. Only applies to plot_type = "hist".

  • zero_impute_nan (bool, default = True) – Whether NaN values should be imputed with zeros. Only applies to plot_type = "hist".

  • colors (list of str, or None, default = None) – Colors of the bars. For plot_type = "bar" exactly 6 colors must be given; for plot_type = "hist" exactly 3 colors must be given. If None (default), the internal defaults are used.

  • title (str, default = "STRING Score Distribution for Inter-Links") – The title of the plot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

  • verbose (0, 1, or 2, default = 1) –

    • 0: All warnings are ignored.

    • 1: Warnings are printed to stdout.

    • 2: Warnings are treated as errors.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If parameter data does not contain any inter-links.

  • ValueError – If the number of given colors does not match the number of required colors for the plot type.

  • ValueError – If organism is None and data is not yet annotated with STRING scores.

  • TypeError – If parameter plot_type was not set correctly.

  • TypeError – If parameter verbose was not set correctly.

Notes

It is generally recommended to call transform.annotate_string_scores() before using this function to preemptively catch errors during annotation.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_custom("data/ms_annika/Nucleus_Rep1_Crosslinks.parquet")
>>> xls = pr["crosslinks"]
>>> fig, ax = plotting.plot_string_score_distribution(xls, organism="Homo sapiens")
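
The interplay between plot_type and colors documented above can be summarized in a few lines. This is a sketch of the documented rules only, not the library's code; the names REQUIRED_COLORS and check_colors are hypothetical.

```python
# Number of colors the documentation requires per plot type.
REQUIRED_COLORS = {"bar": 6, "hist": 3}


def check_colors(plot_type, colors=None):
    """Validate colors against plot_type following the documented rules."""
    if plot_type not in REQUIRED_COLORS:
        # The documentation lists TypeError for an invalid plot_type.
        raise TypeError('plot_type must be "bar" or "hist"')
    if colors is None:
        # None means the function falls back to its internal defaults.
        return None
    if len(colors) != REQUIRED_COLORS[plot_type]:
        raise ValueError(
            f"{plot_type!r} needs {REQUIRED_COLORS[plot_type]} colors, "
            f"got {len(colors)}"
        )
    return list(colors)


hist_colors = check_colors("hist", ["#00a087", "#3c5488", "#e64b35"])
print(len(hist_colors))  # 3
```

Passing three colors together with plot_type = "bar" would therefore be rejected with a ValueError, since the bar plot expects six.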

pyXLMS.plotting.plot_target_decoy_distribution module

pyXLMS.plotting.plot_target_decoy_distribution.plot_target_decoy_distribution(
data: List[Dict[str, Any]],
colors: List[str] = ['#00a087', '#3c5488', '#e64b35'],
title: str = 'Target and Decoy Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the target-decoy distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the target-target, target-decoy, and decoy-decoy distribution as a barplot for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • colors (list of str, default = ["#00a087", "#3c5488", "#e64b35"]) – Colors of the bars.

  • title (str, default = "Target and Decoy Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_decoy’ or ‘beta_decoy’ is not available for any of the data.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_target_decoy_distribution(csms)
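
The three bars follow from the two attributes named in the Raises section, alpha_decoy and beta_decoy. A minimal sketch of that grouping is shown below; the records are made up, and the short labels "TT", "TD", and "DD" as well as the helper name target_decoy_label are shorthand chosen here, not names taken from pyXLMS.

```python
from collections import Counter

# Hypothetical records with only the two attributes used here.
data = [
    {"alpha_decoy": False, "beta_decoy": False},
    {"alpha_decoy": False, "beta_decoy": False},
    {"alpha_decoy": True, "beta_decoy": False},
    {"alpha_decoy": False, "beta_decoy": True},
    {"alpha_decoy": True, "beta_decoy": True},
]


def target_decoy_label(record):
    """Classify a record by how many of its two peptides are decoys."""
    decoys = int(bool(record["alpha_decoy"])) + int(bool(record["beta_decoy"]))
    # 0 decoys: target-target, 1: target-decoy, 2: decoy-decoy.
    return ("TT", "TD", "DD")[decoys]


counts = Counter(target_decoy_label(record) for record in data)
print(counts["TT"], counts["TD"], counts["DD"])  # 2 2 1
```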

pyXLMS.plotting.plot_venn_diagram module

pyXLMS.plotting.plot_venn_diagram.plot_venn_diagram(
data_1: List[Dict[str, Any]],
data_2: List[Dict[str, Any]],
data_3: List[Dict[str, Any]] | None = None,
by: Literal['peptide', 'protein'] = 'peptide',
labels: List[str] = ['Set 1', 'Set 2', 'Set 3'],
colors: List[str] = ['#4361EE', '#4CC9F0', '#F72585'],
alpha: float = 0.6,
contour: bool = False,
linewidth: float = 0.5,
title: str = 'Venn Diagram',
figsize: Tuple[float, float] = (10.0, 10.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the venn diagram for two or three sets of crosslink-spectrum-matches or crosslinks.

Plot the venn diagram for two or three sets of crosslink-spectrum-matches or crosslinks. Overlaps are calculated either from peptide sequence and crosslink position in the peptide (by = "peptide") or from protein crosslink position (by = "protein").

Parameters:
  • data_1 (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • data_2 (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • data_3 (list of dict of str, any, or None, default = None) – Optionally, a third list of crosslink-spectrum-matches or crosslinks.

  • by (str, one of "peptide" or "protein", default = "peptide") – Whether peptide or protein crosslink position is used to determine if a crosslink-spectrum-match or crosslink is unique.

  • labels (List[str], default = ["Set 1", "Set 2", "Set 3"]) – List of labels for the sets.

  • colors (List[str], default = ["#4361EE", "#4CC9F0", "#F72585"]) – List of valid colors to use for the venn circles.

  • alpha (float, default = 0.6) – Color opacity.

  • contour (bool, default = False) – Whether a contour should be drawn around venn circles.

  • linewidth (float, default = 0.5) – Linewidth of the contour.

  • title (str, default = "Venn Diagram") – Title of the venn diagram.

  • figsize (tuple of float, float, default = (10.0, 10.0)) – Width, height in inches.

  • filename_prefix (str, or None, default = None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • TypeError – If parameter by is not one of ‘peptide’ or ‘protein’.

  • ValueError – If one of the data parameters does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_proteins’, ‘alpha_proteins_crosslink_positions’, ‘beta_proteins’, or ‘beta_proteins_crosslink_positions’ is not available for any of the data and parameter ‘by’ was set to ‘protein’.

Notes

Please note that crosslink-spectrum-matches are automatically aggregated to crosslinks, and scan numbers do not influence the creation of the venn diagram. For more nuanced control over intersecting crosslink-spectrum-matches with scan numbers please refer to transform.intersection().

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> a = parser.read(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.txt",
...     engine="MS Annika",
...     crosslinker="DSS",
... )
>>> a = a["crosslink-spectrum-matches"]
>>> b = parser.read(
...     "data/maxquant/run1/crosslinkMsms.txt", engine="MaxQuant", crosslinker="DSS"
... )
>>> b = b["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_venn_diagram(
...     a, b, labels=["MS Annika", "MaxQuant"], colors=["orange", "blue"]
... )
>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> a = parser.read(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.txt",
...     engine="MS Annika",
...     crosslinker="DSS",
... )
>>> a = a["crosslink-spectrum-matches"]
>>> b = parser.read(
...     "data/maxquant/run1/crosslinkMsms.txt", engine="MaxQuant", crosslinker="DSS"
... )
>>> b = b["crosslink-spectrum-matches"]
>>> c = parser.read(
...     "data/plink2/Cas9_plus10_2024.06.20.filtered_cross-linked_spectra.csv",
...     engine="pLink",
...     crosslinker="DSS",
... )
>>> c = c["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_venn_diagram(
...     a, b, c, labels=["MS Annika", "MaxQuant", "pLink"], contour=True
... )
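
The aggregation mentioned in the Notes (scan numbers do not influence the diagram) can be illustrated for by = "peptide" with plain sets. This is only a sketch: the key names alpha_peptide, alpha_peptide_crosslink_position, beta_peptide, beta_peptide_crosslink_position, and scan_nr, the peptide strings, and the helper name peptide_key are assumptions, and pyXLMS may build its keys differently.

```python
def peptide_key(record):
    """Build an order-independent key from peptide sequence and position."""
    alpha = (record["alpha_peptide"], record["alpha_peptide_crosslink_position"])
    beta = (record["beta_peptide"], record["beta_peptide_crosslink_position"])
    # Sorting makes alpha/beta assignment irrelevant; scan numbers are ignored.
    return tuple(sorted((alpha, beta)))


def make(alpha, alpha_pos, beta, beta_pos, scan_nr):
    """Create a hypothetical, minimal crosslink-spectrum-match record."""
    return {
        "alpha_peptide": alpha,
        "alpha_peptide_crosslink_position": alpha_pos,
        "beta_peptide": beta,
        "beta_peptide_crosslink_position": beta_pos,
        "scan_nr": scan_nr,
    }


run_a = [
    make("PEPK", 4, "KPEP", 1, 101),
    make("PEPK", 4, "KPEP", 1, 202),  # same crosslink, different scan
    make("ACK", 3, "KDE", 1, 303),
]
run_b = [
    make("KPEP", 1, "PEPK", 4, 17),  # same crosslink, alpha/beta swapped
    make("GHK", 3, "KLM", 1, 18),
]

set_a = {peptide_key(record) for record in run_a}
set_b = {peptide_key(record) for record in run_b}
print(len(set_a - set_b), len(set_a & set_b), len(set_b - set_a))  # 1 1 1
```

The three printed numbers are the sizes of the left-only, shared, and right-only regions of a two-set diagram.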
pyXLMS.plotting.plot_venn_diagram.venn(
set_1: Set[Any],
set_2: Set[Any],
set_3: Set[Any] | None = None,
labels: List[str] = ['Set 1', 'Set 2', 'Set 3'],
colors: List[str] = ['#4361EE', '#4CC9F0', '#F72585'],
alpha: float = 0.6,
contour: bool = False,
linewidth: float = 0.5,
title: str = 'Venn Diagram',
figsize: Tuple[float, float] = (10.0, 10.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Wrapper with pre-set defaults for creating venn diagrams with the matplotlib-venn package.

Wrapper with pre-set defaults for creating venn diagrams with the matplotlib-venn package (github.com/konstantint/matplotlib-venn).

Parameters:
  • set_1 (set) – First set of the venn diagram.

  • set_2 (set) – Second set of the venn diagram.

  • set_3 (set, or None, default = None) – If not None, a three-set venn diagram is drawn. If None, the two-set venn diagram of set_1 and set_2 is drawn.

  • labels (List[str], default = ["Set 1", "Set 2", "Set 3"]) – List of labels for the sets.

  • colors (List[str], default = ["#4361EE", "#4CC9F0", "#F72585"]) – List of valid colors to use for the venn circles.

  • alpha (float, default = 0.6) – Color opacity.

  • contour (bool, default = False) – Whether a contour should be drawn around venn circles.

  • linewidth (float, default = 0.5) – Linewidth of the contour.

  • title (str, default = "Venn Diagram") – Title of the venn diagram.

  • figsize (tuple of float, float, default = (10.0, 10.0)) – Width, height in inches.

  • filename_prefix (str, or None, default = None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Warns:

RuntimeWarning – If more labels or colors than sets are supplied.

Raises:

IndexError – If fewer labels or colors than sets are supplied.

Examples

>>> from pyXLMS.plotting import venn
>>> fig, ax = venn(
...     {"A", "B", "C"},
...     {"B", "C", "D", "E", "F"},
...     labels=["A", "F"],
...     colors=["orange", "blue"],
... )

>>> from pyXLMS.plotting import venn
>>> fig, ax = venn({"A", "B", "C"}, {"B", "C", "D", "E", "F"}, {"F", "G"})
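
To see which numbers a three-set diagram of the sets from the last example would contain, the region sizes can be worked out with plain set operations. The sketch below does not call matplotlib-venn or pyXLMS; the helper name venn_regions and the region labels are chosen here for illustration.

```python
def venn_regions(set_1, set_2, set_3):
    """Return the size of each of the seven regions of a three-set diagram."""
    return {
        "only_1": len(set_1 - set_2 - set_3),
        "only_2": len(set_2 - set_1 - set_3),
        "only_3": len(set_3 - set_1 - set_2),
        "1_and_2": len((set_1 & set_2) - set_3),
        "1_and_3": len((set_1 & set_3) - set_2),
        "2_and_3": len((set_2 & set_3) - set_1),
        "all": len(set_1 & set_2 & set_3),
    }


regions = venn_regions({"A", "B", "C"}, {"B", "C", "D", "E", "F"}, {"F", "G"})
print(regions["only_1"], regions["1_and_2"], regions["2_and_3"])  # 1 2 1
```

The region sizes add up to the seven distinct elements A to G, which is a quick sanity check when comparing real result sets.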

Module contents

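pyXLMS.plotting.plot_crosslink_type_distribution(data, plot_type='bar', colors=['#6d4bff', '#ac99ff'], title='Crosslink Type Distribution', figsize=(16.0, 9.0), filename_prefix=None)

Plot the crosslink type distribution for a set of crosslink-spectrum-matches or crosslinks.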

Plot the crosslink type distribution (intra- and inter-links) as a bar or pie chart for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • plot_type (str, one of "bar" or "pie", default = "bar") – Plot type, whether to plot as a bar or pie chart.

  • colors (list of str, default = ["#6d4bff", "#ac99ff"]) – Colors of the bars/pie slices (intra-link and inter-link).

  • title (str, default = "Crosslink Type Distribution") – The title of the plot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If parameter plot_type was set incorrectly.

  • IndexError – If not enough colors were specified.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_crosslink_type_distribution(csms)

pyXLMS.plotting.plot_peptide_pair_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
color: str = '#6d4bff',
title: str = 'Peptide Pair Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the peptide pair distribution for a set of crosslink-spectrum-matches.

Plot the peptide pair distribution as a barplot for a set of crosslink-spectrum-matches.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches.

  • top_n (int, default = 25) – Number of peptide pairs to plot. Peptide pairs are sorted by number of crosslink-spectrum-matches.

  • color (str, default = "#6d4bff") – Color of the bars.

  • title (str, default = "Peptide Pair Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_peptide_pair_distribution(csms)

pyXLMS.plotting.plot_protein_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
colors: List[str] = ['#6d4bff', '#ac99ff'],
title: str = 'Protein Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the protein distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the protein distribution as a barplot for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • top_n (int, default = 25) – Number of proteins to plot. Proteins are sorted by number of crosslinks or crosslink-spectrum-matches.

  • colors (list of str, default = ["#6d4bff", "#ac99ff"]) – Colors of the bar-types (intra-link and inter-link).

  • title (str, default = "Protein Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_proteins’ or ‘beta_proteins’ is not available for any of the data.

  • IndexError – If not enough colors were specified.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_protein_distribution(csms)

pyXLMS.plotting.plot_residue_pair_distribution(
data: List[Dict[str, Any]],
top_n: int = 25,
color: str = '#6d4bff',
title: str = 'Residue Pair Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the residue pair distribution for a set of crosslink-spectrum-matches.

Plot the residue pair distribution as a barplot for a set of crosslink-spectrum-matches. Requires that alpha_proteins, beta_proteins, alpha_proteins_crosslink_positions, and beta_proteins_crosslink_positions fields are set for all crosslink-spectrum-matches.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches.

  • top_n (int, default = 25) – Number of residue pairs to plot. Residue pairs are sorted by number of crosslink-spectrum-matches.

  • color (str, default = "#6d4bff") – Color of the bars.

  • title (str, default = "Residue Pair Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_residue_pair_distribution(csms)

pyXLMS.plotting.plot_score_distribution(
data: List[Dict[str, Any]],
bins: int = 25,
density: bool = False,
colors: List[str] = ['#00a087', '#3c5488', '#e64b35'],
title: str = 'Target and Decoy Score Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) -> Tuple[Figure, Any]

Plot the score distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the target-target, target-decoy, and decoy-decoy score distribution as a histogram for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • bins (int, default = 25) – The number of equal-width bins in the histogram.

  • density (bool, default = False) – If True, draw and return a probability density: each bin will display the bin’s raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1.

  • colors (list of str, default = ["#00a087", "#3c5488", "#e64b35"]) – Colors of the histogram lines.

  • title (str, default = "Target and Decoy Score Distribution") – The title of the histogram.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘score’, ‘alpha_decoy’, or ‘beta_decoy’ is not available for any of the data.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_score_distribution(csms)

pyXLMS.plotting.plot_string_score_distribution(
data: List[Dict[str, Any]],
organism: str | int | None = None,
plot_type: Literal['bar', 'hist'] = 'bar',
bins: int = 25,
density: bool = False,
zero_impute_nan: bool = True,
colors: List[str] | None = None,
title: str = 'STRING Score Distribution for Inter-Links',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
verbose: Literal[0, 1, 2] = 1,
) -> Tuple[Figure, Any]

Plot the STRING score distribution for a set of inter-links.

Plot the STRING score distribution as a barplot or histogram for inter-links of a set of crosslink-spectrum-matches or crosslinks. STRING is accessible via string-db.org.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • organism (str, or int, or None, default = None) – Organism name (e.g. Homo sapiens) or taxon identifier (e.g. 9606). Taxon identifiers are preferred. See also string-db.org/cgi/organisms. If None, the input data is assumed to be already annotated with STRING scores, and an error is raised if that is not the case.

  • plot_type ("bar", or "hist", default = "bar") – Whether STRING scores should be plotted as a bar plot or as a histogram.

  • bins (int, default = 25) – The number of equal-width bins in the histogram. Only applies to plot_type = "hist".

  • density (bool, default = False) – If True, draw and return a probability density: each bin will display the bin’s raw count divided by the total number of counts and the bin width, so that the area under the histogram integrates to 1. Only applies to plot_type = "hist".

  • zero_impute_nan (bool, default = True) – Whether NaN values should be imputed with zeros. Only applies to plot_type = "hist".

  • colors (list of str, or None, default = None) – Colors of the bars. For plot_type = "bar" exactly 6 colors must be given; for plot_type = "hist" exactly 3 colors must be given. If None (default), the internal defaults are used.

  • title (str, default = "STRING Score Distribution for Inter-Links") – The title of the plot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

  • verbose (0, 1, or 2, default = 1) –

    • 0: All warnings are ignored.

    • 1: Warnings are printed to stdout.

    • 2: Warnings are treated as errors.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If parameter data does not contain any inter-links.

  • ValueError – If the number of given colors does not match the number of required colors for the plot type.

  • ValueError – If organism is None and data is not yet annotated with STRING scores.

  • TypeError – If parameter plot_type was not set correctly.

  • TypeError – If parameter verbose was not set correctly.

Notes

It is generally recommended to call transform.annotate_string_scores() before using this function to preemptively catch errors during annotation.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_custom("data/ms_annika/Nucleus_Rep1_Crosslinks.parquet")
>>> xls = pr["crosslinks"]
>>> fig, ax = plotting.plot_string_score_distribution(xls, organism="Homo sapiens")
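
The density and zero_impute_nan parameters described above can be illustrated with a small standalone sketch. It is not the pyXLMS implementation: the helper name density_histogram is hypothetical, and dropping NaN values when zero_impute_nan is False is an assumption made here for illustration. It only shows the normalization stated in the parameter description, where each bin count is divided by the total count times the bin width.

```python
import math


def density_histogram(scores, bins=25, zero_impute_nan=True):
    """Return (bin_edges, densities) for equal-width bins (illustrative only)."""
    if zero_impute_nan:
        # Replace NaN scores by 0.0 so they are counted in the lowest bin.
        values = [0.0 if math.isnan(s) else s for s in scores]
    else:
        # Assumption for this sketch: NaN scores are dropped instead.
        values = [s for s in scores if not math.isnan(s)]
    if not values:
        raise ValueError("No scores to bin.")
    lo, hi = min(values), max(values)
    if lo == hi:
        hi = lo + 1.0  # avoid a zero-width range
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the last edge and goes into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    # Density: raw count divided by (total count * bin width).
    densities = [c / (total * width) for c in counts]
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, densities


edges, densities = density_histogram([0.1, 0.4, 0.4, 0.9, float("nan")], bins=3)
area = sum(d * (edges[1] - edges[0]) for d in densities)
print(area)  # close to 1.0 up to floating-point rounding
```

With this normalization the bar heights are no longer counts, so they are comparable between data sets of different size.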

pyXLMS.plotting.plot_target_decoy_distribution(
data: List[Dict[str, Any]],
colors: List[str] = ['#00a087', '#3c5488', '#e64b35'],
title: str = 'Target and Decoy Distribution',
figsize: Tuple[float, float] = (16.0, 9.0),
filename_prefix: str | None = None,
) Tuple[Figure, Any][source]#

Plot the target-decoy distribution for a set of crosslink-spectrum-matches or crosslinks.

Plot the target-target, target-decoy, and decoy-decoy distribution as a barplot for a set of crosslink-spectrum-matches or crosslinks.

Parameters:
  • data (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • colors (list of str, default = ["#00a087", "#3c5488", "#e64b35"]) – Colors of the bars.

  • title (str, default = "Target and Decoy Distribution") – The title of the barplot.

  • figsize (tuple of float, float, default = (16.0, 9.0)) – Width, height in inches.

  • filename_prefix (str, or None, default = None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • ValueError – If parameter data does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_decoy’ or ‘beta_decoy’ is not available for any of the data.

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> pr = parser.read_msannika(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.xlsx"
... )
>>> csms = pr["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_target_decoy_distribution(csms)
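
The three bars of this plot correspond to a simple tally over the ‘alpha_decoy’ and ‘beta_decoy’ attributes. The following is a minimal sketch of such a tally, not the pyXLMS implementation: the helper name count_target_decoy and the stripped-down input dictionaries are hypothetical, and it assumes that a match counts as target-decoy when exactly one of the two peptides is a decoy.

```python
def count_target_decoy(data):
    """Tally target-target, target-decoy, and decoy-decoy matches."""
    labels = ("Target-Target", "Target-Decoy", "Decoy-Decoy")
    counts = {label: 0 for label in labels}
    for item in data:
        alpha = item.get("alpha_decoy")
        beta = item.get("beta_decoy")
        if alpha is None or beta is None:
            # Mirrors the documented ValueError for missing decoy attributes.
            raise ValueError("'alpha_decoy' and 'beta_decoy' must be available.")
        # 0, 1, or 2 decoy peptides select the matching label.
        counts[labels[int(bool(alpha)) + int(bool(beta))]] += 1
    return counts


# Hypothetical minimal records; real pyXLMS objects carry many more keys.
matches = [
    {"alpha_decoy": False, "beta_decoy": False},
    {"alpha_decoy": False, "beta_decoy": True},
    {"alpha_decoy": True, "beta_decoy": True},
]
print(count_target_decoy(matches))
```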

pyXLMS.plotting.plot_venn_diagram(
data_1: List[Dict[str, Any]],
data_2: List[Dict[str, Any]],
data_3: List[Dict[str, Any]] | None = None,
by: Literal['peptide', 'protein'] = 'peptide',
labels: List[str] = ['Set 1', 'Set 2', 'Set 3'],
colors: List[str] = ['#4361EE', '#4CC9F0', '#F72585'],
alpha: float = 0.6,
contour: bool = False,
linewidth: float = 0.5,
title: str = 'Venn Diagram',
figsize: Tuple[float, float] = (10.0, 10.0),
filename_prefix: str | None = None,
) Tuple[Figure, Any][source]#

Plot the venn diagram for two or three sets of crosslink-spectrum-matches or crosslinks.

Plot the venn diagram for two or three sets of crosslink-spectrum-matches or crosslinks. Overlaps are calculated either from the peptide sequence and the crosslink position in the peptide (by = "peptide") or from the protein crosslink position (by = "protein").

Parameters:
  • data_1 (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • data_2 (list of dict of str, any) – A list of crosslink-spectrum-matches or crosslinks.

  • data_3 (list of dict of str, any, or None, default = None) – Optionally, a third list of crosslink-spectrum-matches or crosslinks.

  • by (str, one of "peptide" or "protein", default = "peptide") – Whether the peptide or the protein crosslink position should be used to determine if a crosslink-spectrum-match or crosslink is unique.

  • labels (List[str], default = ["Set 1", "Set 2", "Set 3"]) – List of labels for the sets.

  • colors (List[str], default = ["#4361EE", "#4CC9F0", "#F72585"]) – List of valid colors to use for the venn circles.

  • alpha (float, default = 0.6) – Color opacity.

  • contour (bool, default = False) – Whether a contour should be drawn around the venn circles.

  • linewidth (float, default = 0.5) – Linewidth of the contour.

  • title (str, default = "Venn Diagram") – Title of the venn diagram.

  • figsize (tuple of float, float, default = (10.0, 10.0)) – Width, height in inches.

  • filename_prefix (str, or None, default = None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Raises:
  • TypeError – If a wrong data type is provided.

  • TypeError – If parameter by is not one of ‘peptide’ or ‘protein’.

  • ValueError – If one of the data parameters does not contain any crosslink-spectrum-matches or crosslinks.

  • ValueError – If attribute ‘alpha_proteins’, ‘alpha_proteins_crosslink_positions’, ‘beta_proteins’, or ‘beta_proteins_crosslink_positions’ is not available for any of the data and parameter ‘by’ was set to ‘protein’.

Notes

Please note that crosslink-spectrum-matches are automatically aggregated to crosslinks, and scan numbers do not influence the creation of the venn diagram. For more nuanced control over intersecting crosslink-spectrum-matches with scan numbers please refer to transform.intersection().

Examples

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> a = parser.read(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.txt",
...     engine="MS Annika",
...     crosslinker="DSS",
... )
>>> a = a["crosslink-spectrum-matches"]
>>> b = parser.read(
...     "data/maxquant/run1/crosslinkMsms.txt", engine="MaxQuant", crosslinker="DSS"
... )
>>> b = b["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_venn_diagram(
...     a, b, labels=["MS Annika", "MaxQuant"], colors=["orange", "blue"]
... )

>>> from pyXLMS import parser
>>> from pyXLMS import plotting
>>> a = parser.read(
...     "data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.txt",
...     engine="MS Annika",
...     crosslinker="DSS",
... )
>>> a = a["crosslink-spectrum-matches"]
>>> b = parser.read(
...     "data/maxquant/run1/crosslinkMsms.txt", engine="MaxQuant", crosslinker="DSS"
... )
>>> b = b["crosslink-spectrum-matches"]
>>> c = parser.read(
...     "data/plink2/Cas9_plus10_2024.06.20.filtered_cross-linked_spectra.csv",
...     engine="pLink",
...     crosslinker="DSS",
... )
>>> c = c["crosslink-spectrum-matches"]
>>> fig, ax = plotting.plot_venn_diagram(
...     a, b, c, labels=["MS Annika", "MaxQuant", "pLink"], contour=True
... )
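
The aggregation described in the Notes, where scan numbers do not influence the overlap, can be sketched with plain sets. This is an illustration under stated assumptions, not the pyXLMS implementation: the helper names are hypothetical, the dictionary keys ‘alpha_peptide’, ‘alpha_peptide_crosslink_position’, ‘beta_peptide’, ‘beta_peptide_crosslink_position’, and ‘scan_nr’ are assumed here for the by = "peptide" case, and treating alpha and beta as interchangeable is likewise an assumption.

```python
def crosslink_key(item):
    """Build an identity key from peptide sequence and crosslink position."""
    alpha = (item["alpha_peptide"], item["alpha_peptide_crosslink_position"])
    beta = (item["beta_peptide"], item["beta_peptide_crosslink_position"])
    # Assumption: the order of alpha and beta does not matter.
    return tuple(sorted([alpha, beta]))


def venn_region_sizes(data_1, data_2):
    """Return the sizes of the three regions of a two-set venn diagram."""
    set_1 = {crosslink_key(item) for item in data_1}
    set_2 = {crosslink_key(item) for item in data_2}
    return {
        "only_1": len(set_1 - set_2),
        "both": len(set_1 & set_2),
        "only_2": len(set_2 - set_1),
    }


def csm(alpha, alpha_pos, beta, beta_pos, scan_nr):
    """Create a hypothetical, stripped-down crosslink-spectrum-match."""
    return {
        "alpha_peptide": alpha,
        "alpha_peptide_crosslink_position": alpha_pos,
        "beta_peptide": beta,
        "beta_peptide_crosslink_position": beta_pos,
        "scan_nr": scan_nr,
    }


# Two matches of the same peptide pair from different scans collapse into one key.
a = [csm("KPEPTIDE", 1, "PEKTIDE", 3, 100), csm("KPEPTIDE", 1, "PEKTIDE", 3, 101),
     csm("AKC", 2, "DKE", 2, 102)]
b = [csm("PEKTIDE", 3, "KPEPTIDE", 1, 7)]
print(venn_region_sizes(a, b))
```

Because the scan number is not part of the key, the two matches in `a` that share a peptide pair contribute a single element to the set.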

pyXLMS.plotting.venn(
set_1: Set[Any],
set_2: Set[Any],
set_3: Set[Any] | None = None,
labels: List[str] = ['Set 1', 'Set 2', 'Set 3'],
colors: List[str] = ['#4361EE', '#4CC9F0', '#F72585'],
alpha: float = 0.6,
contour: bool = False,
linewidth: float = 0.5,
title: str = 'Venn Diagram',
figsize: Tuple[float, float] = (10.0, 10.0),
filename_prefix: str | None = None,
) Tuple[Figure, Any][source]#

Wrapper with pre-set defaults for creating venn diagrams with the matplotlib-venn package.

Wrapper with pre-set defaults for creating venn diagrams with the matplotlib-venn package (github.com/konstantint/matplotlib-venn).

Parameters:
  • set_1 (set) – First set of the venn diagram.

  • set_2 (set) – Second set of the venn diagram.

  • set_3 (set, or None, default = None) – If not None, a three-set venn diagram is drawn; if None, the two-set venn diagram of set_1 and set_2 is drawn.

  • labels (List[str], default = ["Set 1", "Set 2", "Set 3"]) – List of labels for the sets.

  • colors (List[str], default = ["#4361EE", "#4CC9F0", "#F72585"]) – List of valid colors to use for the venn circles.

  • alpha (float, default = 0.6) – Color opacity.

  • contour (bool, default = False) – Whether a contour should be drawn around the venn circles.

  • linewidth (float, default = 0.5) – Linewidth of the contour.

  • title (str, default = "Venn Diagram") – Title of the venn diagram.

  • figsize (tuple of float, float, default = (10.0, 10.0)) – Width, height in inches.

  • filename_prefix (str, or None, default = None) – If given, plot will be saved with and without title in .png and .svg format with the given prefix.

Returns:

The created figure and axis from matplotlib.pyplot.subplots().

Return type:

tuple of matplotlib.figure.Figure, any

Warns:

RuntimeWarning – If more labels or colors than sets are supplied.

Raises:

IndexError – If fewer labels or colors than sets are supplied.

Examples

>>> from pyXLMS.plotting import venn
>>> fig, ax = venn(
...     {"A", "B", "C"},
...     {"B", "C", "D", "E", "F"},
...     labels=["A", "F"],
...     colors=["orange", "blue"],
... )

>>> from pyXLMS.plotting import venn
>>> fig, ax = venn({"A", "B", "C"}, {"B", "C", "D", "E", "F"}, {"F", "G"})
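
The documented Warns and Raises behavior for labels and colors can be pictured with a small standalone check. This is a sketch only and makes no claim about how pyXLMS performs the check internally: the helper name pick_for_sets is hypothetical, and ignoring surplus entries after warning is an assumption.

```python
import warnings


def pick_for_sets(values, n_sets, what="labels"):
    """Return the first n_sets entries of values, checking the count."""
    if len(values) < n_sets:
        # Too few entries: at least one set could not be labeled or colored.
        raise IndexError(f"Got {len(values)} {what} for {n_sets} sets.")
    if len(values) > n_sets:
        # Too many entries: warn, then ignore the surplus (assumption).
        warnings.warn(
            f"Got {len(values)} {what} for {n_sets} sets.",
            RuntimeWarning,
        )
    return list(values[:n_sets])


# Two sets with exactly two labels: no warning and no error.
print(pick_for_sets(["A", "F"], 2))
```

Running Python with -W error::RuntimeWarning turns the surplus case into an exception, which can help catch mismatched arguments in scripts.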