Reference

feature_plot(adata, feature[, gridsize, …])

Plot expression of gene or feature in hexbin

plot_composition(adata, group_by, color[, …])

Plot composition of clusters or other metadata

get_markers(adata, groupby[, key, …])

Extract markers from adata into Seurat-like table

merge_gene_info(adata)

Merges gene information from different batches

write_mtx(adata, output_dir)

Save scanpy object in mtx cellranger v3 format.

sc_utils.expr_colormap()

Gray-to-blue colormap for expression data

sc_utils.feature_plot(adata: anndata._core.anndata.AnnData, feature: str, gridsize: tuple = (180, 70), linewidths: float = 0.15, figsize: Optional[float] = None) → matplotlib.figure.Figure

Plot expression of gene or feature in hexbin

Plots a numeric feature value, commonly gene expression, on UMAP coordinates using hexbin. The feature is taken from adata.obs if it is found there, otherwise from adata.raw.

Parameters
  • adata – Annotated data matrix

  • feature – Name of the feature to plot

  • gridsize – Tuple of hexbin dimensions; larger numbers produce smaller hexbins

  • linewidths – Width of the lines to draw around each hexbin

  • figsize – Optional, make figure of this size

Returns

Matplotlib figure with colorbar added.
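
The hexbin approach described above can be sketched with plain matplotlib. This is not the sc_utils implementation: the helper name hexbin_feature, the random coordinates standing in for UMAP coordinates, the "Blues" colormap, and the choice to average values within each bin are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def hexbin_feature(coords, values, gridsize=(180, 70), linewidths=0.15, figsize=None):
    """Hypothetical helper: draw per-cell values on 2-D coordinates as hexbins."""
    fig, ax = plt.subplots(figsize=figsize)
    hb = ax.hexbin(
        coords[:, 0],
        coords[:, 1],
        C=values,
        reduce_C_function=np.mean,  # assumption: average the feature within each bin
        gridsize=gridsize,
        linewidths=linewidths,
        cmap="Blues",               # assumption: any sequential colormap would do
    )
    fig.colorbar(hb, ax=ax)         # adds a second axes holding the colorbar
    return fig


rng = np.random.default_rng(0)
coords = rng.normal(size=(500, 2))   # stand-in for UMAP coordinates
values = rng.poisson(2.0, size=500)  # stand-in for expression of one gene
fig = hexbin_feature(coords, values, gridsize=(30, 12))
```

With only 500 points a coarser grid such as (30, 12) keeps the bins visible; the default (180, 70) is better suited to datasets with many thousands of cells.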

sc_utils.get_markers(adata, groupby, key='rank_genes_groups', p_val_cutoff=0.05, logfc_cutoff=0.5)

Extract markers from adata into Seurat-like table

Extracts markers after they have been computed by scanpy and produces a Seurat-like table with the fields "p_val", "avg_logFC", "pct.1", "pct.2", "p_val_adj", "cluster" and "gene".

Calculates the percentage of cells that express a given gene in the target cluster (pct.1 field) and outside the cluster (pct.2 field) from adata.raw matrix.

Parameters
  • adata – Annotated data matrix.

  • groupby – adata.obs field used for marker calculation

  • key – adata.uns key that has computed markers

  • p_val_cutoff – Drop all genes with adjusted p-value greater than or equal to this

  • logfc_cutoff – Drop all genes with average logFC less than or equal to this

Returns

A pandas dataframe with the columns listed above, optionally subset to the genes that pass the cutoffs. The p_val field is a copy of the adjusted p-value field.

Example

>>> sc.tl.rank_genes_groups(adata, "leiden", method="wilcoxon", n_genes=200)
>>> markers = sc_utils.get_markers(adata, "leiden")
>>> markers.to_csv("markers.csv")
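
The pct.1 and pct.2 fields described above can be illustrated with a small NumPy sketch. The helper name pct_expressing is hypothetical, and treating a gene as "expressed" when its count is greater than zero is an assumption; the actual sc_utils code may differ.

```python
import numpy as np


def pct_expressing(counts, in_cluster):
    """Hypothetical helper: fraction of cells with a non-zero count per gene.

    counts is a cells x genes array and in_cluster a boolean mask over cells.
    Returns (pct_in, pct_out), analogous to the pct.1 and pct.2 fields.
    """
    expressed = counts > 0  # assumption: any non-zero count means "expressed"
    pct_in = expressed[in_cluster].mean(axis=0)
    pct_out = expressed[~in_cluster].mean(axis=0)
    return pct_in, pct_out


# Four cells, two genes; the first two cells belong to the target cluster.
counts = np.array([
    [3, 0],
    [1, 0],
    [0, 2],
    [0, 0],
])
in_cluster = np.array([True, True, False, False])
pct_in, pct_out = pct_expressing(counts, in_cluster)
```

Here the first gene is detected in every cell of the cluster and in none outside it, which is the pattern a good marker would show.
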

sc_utils.merge_gene_info(adata: anndata._core.anndata.AnnData)

Merges gene information from different batches

After concatenating several datasets, the gene information dataframe adata.var can have a lot of duplicate columns from all the batches.

This function merges gene_ids, feature_types and genome information from batches, inserts them in the table and removes the batch-associated columns.

Parameters

  • adata – Annotated data matrix.

Example

>>> datasets = [sc.read_h5ad(path) for path in paths]
>>> adata = datasets[0].concatenate(datasets[1:], join="outer")
>>> sc_utils.merge_gene_info(adata)
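
The kind of merge described above can be sketched on a plain pandas dataframe. The helper name merge_batch_columns is hypothetical, and the sketch assumes that batch-specific columns are named "<field>-<batch>" (for example gene_ids-0, gene_ids-1) and that the first non-missing value across batches should win; neither detail is confirmed by this reference.

```python
import pandas as pd


def merge_batch_columns(var, fields=("gene_ids", "feature_types", "genome")):
    """Hypothetical helper: collapse '<field>-<batch>' columns into '<field>'."""
    var = var.copy()
    for field in fields:
        batch_cols = [c for c in var.columns if c.startswith(field + "-")]
        if not batch_cols:
            continue
        # Take the first non-missing value across batches for each gene.
        merged = var[batch_cols[0]]
        for col in batch_cols[1:]:
            merged = merged.combine_first(var[col])
        var[field] = merged
        var = var.drop(columns=batch_cols)
    return var


var = pd.DataFrame(
    {
        "gene_ids-0": ["ENSG1", None],  # gene B missing from batch 0
        "gene_ids-1": [None, "ENSG2"],  # gene A missing from batch 1
        "n_cells": [10, 5],
    },
    index=["A", "B"],
)
merged_var = merge_batch_columns(var)
```

Unlike the sketch, which returns a new dataframe, the example above calls sc_utils.merge_gene_info without using a return value.
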

sc_utils.plot_composition(adata: anndata._core.anndata.AnnData, group_by: str, color: str, relative: bool = False, palette: Optional[Collection] = None, plot_numbers: bool = False) → matplotlib.axes._axes.Axes

Plot composition of clusters or other metadata

Groups cells by one metadata field and draws a horizontal stacked bar plot colored by another metadata field. A common use case is to see which samples contribute to which clusters.

Parameters
  • adata – Annotated data matrix

  • group_by – Name of the field to group by on y axis

  • color – Name of the field to color by

  • relative – Plot percentage for each cluster if True or absolute counts if False

  • palette – Optional, pass your own palette

  • plot_numbers – If True, plot number of cells next to the bars

Returns

Matplotlib axes with the plot.
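
The table behind such a plot can be sketched with pandas. This is an illustration under assumptions rather than the sc_utils code: the obs dataframe and its "leiden" and "sample" columns are made-up stand-ins for adata.obs, and pandas.crosstab is one convenient way to get the counts.

```python
import pandas as pd

# Stand-in for adata.obs: one row per cell.
obs = pd.DataFrame({
    "leiden": ["0", "0", "0", "1", "1"],
    "sample": ["a", "b", "b", "a", "a"],
})

# Absolute counts: rows are groups (group_by), columns are colors (color).
counts = pd.crosstab(obs["leiden"], obs["sample"])

# Relative composition: each row is rescaled to sum to 100 percent.
percent = pd.crosstab(obs["leiden"], obs["sample"], normalize="index") * 100
```

The counts table corresponds to relative=False and the percent table to relative=True. Either one can be drawn as a horizontal stacked bar chart with, for example, counts.plot.barh(stacked=True).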

sc_utils.write_mtx(adata, output_dir)

Save scanpy object in mtx cellranger v3 format.

Saves basic information from the adata object as a cellranger v3 mtx folder. Only the adata.var_names, adata.obs_names and adata.X fields are saved. Creates the directory output_dir if it does not exist and writes 3 files: features.tsv.gz, barcodes.tsv.gz and matrix.mtx.gz. Existing files in the output directory are overwritten.

Parameters
  • adata – Annotated data matrix.

  • output_dir – Directory where to save results
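
The folder layout can be sketched with the standard library alone. The helper name write_mtx_sketch is hypothetical and this is not the sc_utils implementation. Cell Ranger v3 names the matrix file matrix.mtx.gz, stores it as features x barcodes, and expects three columns in features.tsv.gz; filling the ID column with the gene name and the type column with "Gene Expression" is an assumption made here because only names are available.

```python
import gzip
import os
import tempfile


def write_mtx_sketch(matrix, gene_names, barcodes, output_dir):
    """Hypothetical helper; matrix is a dense cells x genes list of integer rows."""
    os.makedirs(output_dir, exist_ok=True)
    with gzip.open(os.path.join(output_dir, "barcodes.tsv.gz"), "wt") as fh:
        fh.writelines(f"{bc}\n" for bc in barcodes)
    with gzip.open(os.path.join(output_dir, "features.tsv.gz"), "wt") as fh:
        # Assumption: reuse the name as the ID and label every row "Gene Expression".
        fh.writelines(f"{g}\t{g}\tGene Expression\n" for g in gene_names)
    # The file is features x barcodes, so the cells x genes input is transposed.
    # Matrix Market coordinates are 1-based and only non-zero entries are listed.
    entries = [
        (g + 1, c + 1, v)
        for c, row in enumerate(matrix)
        for g, v in enumerate(row)
        if v != 0
    ]
    with gzip.open(os.path.join(output_dir, "matrix.mtx.gz"), "wt") as fh:
        fh.write("%%MatrixMarket matrix coordinate integer general\n")
        fh.write(f"{len(gene_names)} {len(barcodes)} {len(entries)}\n")
        fh.writelines(f"{g} {c} {v}\n" for g, c, v in entries)


out = os.path.join(tempfile.mkdtemp(), "mtx")
write_mtx_sketch(
    [[1, 0], [0, 2], [3, 0]],        # three cells, two genes
    ["GeneA", "GeneB"],
    ["AAAC-1", "AAAG-1", "AAAT-1"],
    out,
)
written = sorted(os.listdir(out))

# Read the matrix back to inspect the header and entries.
with gzip.open(os.path.join(out, "matrix.mtx.gz"), "rt") as fh:
    mtx_lines = fh.read().splitlines()
```

A real dataset would keep the matrix sparse throughout rather than iterate over a dense list; the dense input is used here only to keep the sketch dependency-free.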