Reference

`feature_plot`(adata, feature[, gridsize, …])	Plot expression of gene or feature in hexbin
`plot_composition`(adata, group_by, color[, …])	Plot composition of clusters or other metadata
`get_markers`(adata, groupby[, key, …])	Extract markers from adata into Seurat-like table
`merge_gene_info`(adata)	Merges gene information from different batches
`write_mtx`(adata, output_dir)	Save scanpy object in mtx cellranger v3 format.

sc_utils.expr_colormap(): Gray-to-blue colormap for expression data

sc_utils.feature_plot(adata: anndata._core.anndata.AnnData, feature: str, gridsize: tuple = (180, 70), linewidths: float = 0.15, figsize: Optional[float] = None) → matplotlib.figure.Figure

Plot expression of gene or feature in hexbin

Plots a numeric feature, commonly gene expression, on UMAP coordinates using hexbin. The feature is taken from adata.obs if it is found there, otherwise from adata.raw.

Parameters

adata – Annotated data matrix
feature – Name of the feature to plot
gridsize – Tuple of hexbin dimensions, larger numbers produce smaller hexbins
linewidths – Width of the lines to draw around each hexbin
figsize – Optional, make figure of this size

Returns

Matplotlib figure with colorbar added.
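
The kind of plot described above can be approximated directly with matplotlib. The following is a minimal sketch on random toy data, not the library's implementation: the coordinates, the expression values and the `gray_to_blue` colormap are made up for illustration, and averaging the values that fall into each hexbin is matplotlib's default rather than something this reference states.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

# Toy stand-ins for UMAP coordinates and one gene's expression (assumptions).
rng = np.random.default_rng(0)
umap = rng.normal(size=(2000, 2))
expression = rng.poisson(1.0, size=2000).astype(float)

# Illustrative gray-to-blue colormap, similar in spirit to expr_colormap().
cmap = LinearSegmentedColormap.from_list("gray_to_blue", ["lightgray", "blue"])

fig, ax = plt.subplots(figsize=(6, 5))
# C= supplies the per-cell values; each hexbin shows their mean by default.
hb = ax.hexbin(
    umap[:, 0],
    umap[:, 1],
    C=expression,
    gridsize=(180, 70),
    linewidths=0.15,
    cmap=cmap,
)
fig.colorbar(hb, ax=ax, label="expression")
ax.set_xlabel("UMAP1")
ax.set_ylabel("UMAP2")
```

Raising either number in `gridsize` splits the same area into more, and therefore smaller, hexbins.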

sc_utils.get_markers(adata, groupby, key='rank_genes_groups', p_val_cutoff=0.05, logfc_cutoff=0.5)

Extract markers from adata into Seurat-like table

Extracts markers after they are computed by scanpy. Produces a Seurat-like table with the fields "p_val", "avg_logFC", "pct.1", "pct.2", "p_val_adj", "cluster" and "gene".

Calculates the percentage of cells that express a given gene in the target cluster (pct.1 field) and outside the cluster (pct.2 field) from adata.raw matrix.

Parameters

adata – Annotated data matrix.
groupby – adata.obs field used for marker calculation
key – adata.uns key that has computed markers
p_val_cutoff – Drop all genes with adjusted p-value greater than or equal to this
logfc_cutoff – Drop all genes with average logFC less than or equal to this

Returns

A pandas DataFrame with the columns listed above, optionally subset to the genes that pass the cutoffs. The p_val field is a copy of the adjusted p-value field.

Example

>>> sc.tl.rank_genes_groups(adata, "leiden", method="wilcoxon", n_genes=200)
>>> markers = sc_utils.get_markers(adata, "leiden")
>>> markers.to_csv("markers.csv")
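
The pct.1 and pct.2 fields and the two cutoffs can be illustrated on toy data. This is a minimal sketch in plain Python, not the library's code. The helpers `pct_expressing` and `passes_cutoffs` are hypothetical, and treating a cell as expressing a gene when its raw value is greater than zero is an assumption. The sketch returns fractions between 0 and 1; whether the library reports fractions or percentages is not stated here.

```python
def pct_expressing(values, labels, cluster):
    """Return the fraction of cells with value > 0 inside `cluster`
    and among all other cells, as a (inside, outside) pair."""
    inside = [v for v, lab in zip(values, labels) if lab == cluster]
    outside = [v for v, lab in zip(values, labels) if lab != cluster]

    def frac(vals):
        # Assumption: "expressed" means a raw value strictly above zero.
        return sum(v > 0 for v in vals) / len(vals) if vals else 0.0

    return frac(inside), frac(outside)


def passes_cutoffs(p_val_adj, avg_logfc, p_val_cutoff=0.05, logfc_cutoff=0.5):
    """Keep a gene only if it is strictly below the p-value cutoff and
    strictly above the logFC cutoff, mirroring the parameter descriptions."""
    return p_val_adj < p_val_cutoff and avg_logfc > logfc_cutoff


# Toy raw expression of one gene across six cells, with cluster labels.
expression = [3, 0, 5, 0, 0, 1]
clusters = ["0", "0", "0", "1", "1", "1"]

pct_1, pct_2 = pct_expressing(expression, clusters, "0")
```

Because both cutoffs are exclusive, a gene sitting exactly on either threshold is dropped.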

sc_utils.merge_gene_info(adata: anndata._core.anndata.AnnData)

Merges gene information from different batches

After concatenating several datasets, the gene information dataframe adata.var can have a lot of duplicate columns from all the batches.

This function merges gene_ids, feature_types and genome information from batches, inserts them in the table and removes the batch-associated columns.

Parameters: adata – Annotated data matrix.

Example

>>> datasets = [sc.read_h5ad(path) for path in paths]
>>> adata = datasets[0].concatenate(datasets[1:], join="outer")
>>> sc_utils.merge_gene_info(adata)
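
To make the effect concrete, here is a rough sketch of the kind of merge described above, written with pandas on a toy `var` table. It is not the library's code. It assumes the batch columns carry a `-<batch>` suffix (for example `gene_ids-0`, `gene_ids-1`) and that taking the first non-missing value per gene is an acceptable merge rule. The helper `merge_batch_columns` and the gene identifiers are hypothetical.

```python
import pandas as pd


def merge_batch_columns(var, fields=("gene_ids", "feature_types", "genome")):
    """Collapse columns like 'gene_ids-0', 'gene_ids-1' into one 'gene_ids'."""
    var = var.copy()
    for field in fields:
        batch_cols = [c for c in var.columns if c.startswith(field + "-")]
        if not batch_cols:
            continue
        # First non-missing value across the batch columns, row by row.
        var[field] = var[batch_cols].bfill(axis=1).iloc[:, 0]
        var = var.drop(columns=batch_cols)
    return var


# Toy gene table: GeneB is missing from batch 0 (made-up identifiers).
var = pd.DataFrame(
    {
        "gene_ids-0": ["ENSG01", None],
        "gene_ids-1": ["ENSG01", "ENSG02"],
        "feature_types-0": ["Gene Expression", None],
        "feature_types-1": ["Gene Expression", "Gene Expression"],
    },
    index=["GeneA", "GeneB"],
)
merged = merge_batch_columns(var)
```

After the merge, `merged` has one `gene_ids` and one `feature_types` column, and GeneB takes its values from the only batch that contains it.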

sc_utils.plot_composition(adata: anndata._core.anndata.AnnData, group_by: str, color: str, relative: bool = False, palette: Optional[Collection] = None, plot_numbers: bool = False) → matplotlib.axes._axes.Axes

Plot composition of clusters or other metadata

Groups cells by one metadata field and plots stacked barplot colored by another metadata field. Common use case is to see which samples contribute to which clusters. Plots horizontally.

Parameters

adata – Annotated data matrix
group_by – Name of the field to group by on y axis
color – Name of the field to color by
relative – Plot percentage for each cluster if True or absolute counts if False
palette – Optional, pass your own palette
plot_numbers – If True, plot number of cells next to the bars

Returns

Matplotlib axes with the plot.
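
The numbers behind such a plot are a cross-tabulation of two metadata fields. The sketch below shows the difference between absolute and relative mode in plain Python. The helper `composition` is hypothetical, the toy labels are made up, and normalizing within each group so that its bars sum to 100 is an assumption about what "percentage for each cluster" means.

```python
from collections import Counter, defaultdict


def composition(groups, colors, relative=False):
    """Count cells per (group, color) pair; with relative=True, convert
    each group's counts to percentages that sum to 100."""
    counts = defaultdict(Counter)
    for group, color in zip(groups, colors):
        counts[group][color] += 1
    if not relative:
        return {g: dict(c) for g, c in counts.items()}
    return {
        g: {k: 100 * n / sum(c.values()) for k, n in c.items()}
        for g, c in counts.items()
    }


# Toy metadata: cluster label and sample of origin for five cells.
cluster = ["0", "0", "0", "1", "1"]
sample = ["A", "A", "B", "A", "B"]

absolute = composition(cluster, sample)
percent = composition(cluster, sample, relative=True)
```

With pandas, the same table can be built with `pd.crosstab` (using `normalize="index"` for the relative case) and drawn as horizontal stacked bars with `DataFrame.plot.barh(stacked=True)`.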

sc_utils.write_mtx(adata, output_dir)

Save scanpy object in mtx cellranger v3 format.

Saves basic information from the adata object as a cellranger v3 mtx folder. Saves only the adata.var_names, adata.obs_names and adata.X fields. Creates the directory output_dir if it does not exist. Creates 3 files: features.tsv.gz, barcodes.tsv.gz and matrix.mtx.gz. Will overwrite files in the output directory.

Parameters

adata – Annotated data matrix.
output_dir – Directory where to save results
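
For orientation, the sketch below writes the same three-file layout using only the Python standard library. It is an illustration under stated assumptions, not the library's implementation. The helper `write_mtx_sketch` is hypothetical, it takes a small dense list of lists instead of an AnnData object, and it reuses the gene name as the feature id and writes a fixed "Gene Expression" feature type because only names are available.

```python
import gzip
import os
import tempfile


def write_mtx_sketch(matrix, gene_names, barcodes, output_dir):
    """Write a dense cells x genes list of lists as a Cell Ranger v3 style folder."""
    os.makedirs(output_dir, exist_ok=True)

    with gzip.open(os.path.join(output_dir, "barcodes.tsv.gz"), "wt") as fh:
        for barcode in barcodes:
            fh.write(barcode + "\n")

    with gzip.open(os.path.join(output_dir, "features.tsv.gz"), "wt") as fh:
        for gene in gene_names:
            # Assumption: gene name reused as the id, fixed feature type.
            fh.write(f"{gene}\t{gene}\tGene Expression\n")

    # Matrix Market stores features as rows and barcodes as columns, 1-indexed.
    entries = [
        (g + 1, c + 1, value)
        for c, row in enumerate(matrix)
        for g, value in enumerate(row)
        if value != 0
    ]
    with gzip.open(os.path.join(output_dir, "matrix.mtx.gz"), "wt") as fh:
        # "integer" assumes count data; use "real" for non-integer values.
        fh.write("%%MatrixMarket matrix coordinate integer general\n")
        fh.write(f"{len(gene_names)} {len(barcodes)} {len(entries)}\n")
        for g, c, value in entries:
            fh.write(f"{g} {c} {value}\n")


counts = [[1, 0, 2], [0, 3, 0]]  # toy data: 2 cells x 3 genes
out_dir = os.path.join(tempfile.mkdtemp(), "mtx_out")
write_mtx_sketch(counts, ["GeneA", "GeneB", "GeneC"], ["AAAC-1", "AAAG-1"], out_dir)
written = sorted(os.listdir(out_dir))
```

Note that the matrix file is transposed relative to adata.X: AnnData keeps cells in rows, while the mtx file lists features in rows and barcodes in columns.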