Reference
sc_utils.feature_plot | Plot expression of gene or feature in hexbin
sc_utils.plot_composition | Plot composition of clusters or other metadata
sc_utils.get_markers | Extract markers from adata into Seurat-like table
sc_utils.merge_gene_info | Merges gene information from different batches
sc_utils.write_mtx | Save scanpy object in mtx cellranger v3 format.
- sc_utils.feature_plot(adata: anndata._core.anndata.AnnData, feature: str, gridsize: tuple = (180, 70), linewidths: float = 0.15, figsize: Optional[float] = None) → matplotlib.figure.Figure
Plot expression of gene or feature in hexbin
Plots a numeric feature value, commonly gene expression, on UMAP coordinates using hexbin. The feature is taken from adata.obs if it is found there, otherwise from adata.raw.
- Parameters
adata – Annotated data matrix
feature – Name of the feature to plot
gridsize – Tuple of hexbin dimensions, larger numbers produce smaller hexbins
linewidths – Width of the lines to draw around each hexbin
figsize – Optional, make figure of this size
- Returns
Matplotlib figure with colorbar added.
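The hexbin approach described above can be sketched with plain matplotlib. This is an illustration only, not the sc_utils implementation: the helper name hexbin_feature is hypothetical, the coordinates and values are random stand-ins for UMAP coordinates and expression, and averaging the values within each hexbin is an assumption (the reference does not say how values are aggregated).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def hexbin_feature(coords, values, gridsize=(180, 70), linewidths=0.15, figsize=None):
    """Hypothetical stand-in for feature_plot: one aggregated value per hexbin."""
    fig, ax = plt.subplots(figsize=figsize)
    hb = ax.hexbin(
        coords[:, 0],
        coords[:, 1],
        C=values,                   # feature value of each cell
        reduce_C_function=np.mean,  # assumption: average the cells in a hexbin
        gridsize=gridsize,          # larger numbers produce smaller hexbins
        linewidths=linewidths,      # outline drawn around each hexbin
    )
    fig.colorbar(hb, ax=ax)
    return fig


# Random data standing in for UMAP coordinates and gene expression
rng = np.random.default_rng(0)
umap = rng.normal(size=(500, 2))
expr = rng.poisson(2.0, size=500).astype(float)
fig = hexbin_feature(umap, expr)
```

With real data, the equivalent call would presumably be sc_utils.feature_plot(adata, "<gene or obs field>"), with the returned figure saved via fig.savefig.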
- sc_utils.get_markers(adata, groupby, key='rank_genes_groups', p_val_cutoff=0.05, logfc_cutoff=0.5)
Extract markers from adata into Seurat-like table
Extracts markers after they are computed by scanpy. Produces a Seurat-like table with fields "p_val", "avg_logFC", "pct.1", "pct.2", "p_val_adj", "cluster", "gene".
Calculates the percentage of cells that express a given gene in the target cluster (pct.1 field) and outside the cluster (pct.2 field) from the adata.raw matrix.
- Parameters
adata – Annotated data matrix.
groupby – adata.obs field used for marker calculation
key – adata.uns key that has computed markers
p_val_cutoff – Drop all genes with adjusted p-value greater than or equal to this
logfc_cutoff – Drop all genes with average logFC less than or equal to this
- Returns
A pandas DataFrame with the columns listed above, optionally subset to the genes that pass the cutoffs. The p_val field is a copy of the adjusted p-value field.
Example
>>> sc.tl.rank_genes_groups(adata, "leiden", method="wilcoxon", n_genes=200)
>>> markers = sc_utils.get_markers(adata, "leiden")
>>> markers.to_csv("markers.csv")
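To make the pct.1 / pct.2 columns and the two cutoffs concrete, here is a small pure-Python sketch on toy data. It is not the library code: the helper names are hypothetical, "expresses" is assumed to mean a value greater than zero, and the values are returned as fractions between 0 and 1 (as Seurat reports them), which the real function may or may not do.

```python
def pct_expressing(expr, labels, cluster):
    """Return (pct.1, pct.2) for one gene: the share of cells with expression > 0
    inside `cluster` and in all other cells (assumption: fractions in [0, 1])."""
    inside = [v for v, lab in zip(expr, labels) if lab == cluster]
    outside = [v for v, lab in zip(expr, labels) if lab != cluster]
    pct_1 = sum(v > 0 for v in inside) / len(inside)
    pct_2 = sum(v > 0 for v in outside) / len(outside)
    return pct_1, pct_2


def passes_cutoffs(row, p_val_cutoff=0.05, logfc_cutoff=0.5):
    """Keep a marker only if it is strictly below the p-value cutoff and strictly
    above the logFC cutoff, mirroring the parameter descriptions above."""
    return row["p_val_adj"] < p_val_cutoff and row["avg_logFC"] > logfc_cutoff


# Toy expression of one gene across six cells in two clusters
expr = [0, 3, 1, 0, 0, 2]
labels = ["0", "0", "0", "1", "1", "1"]
pct_1, pct_2 = pct_expressing(expr, labels, "0")

# Toy marker rows: B sits exactly on the p-value cutoff, C exactly on the logFC cutoff
rows = [
    {"gene": "A", "p_val_adj": 0.01, "avg_logFC": 1.2},
    {"gene": "B", "p_val_adj": 0.05, "avg_logFC": 2.0},
    {"gene": "C", "p_val_adj": 0.001, "avg_logFC": 0.5},
]
kept = [r["gene"] for r in rows if passes_cutoffs(r)]
```

Here genes B and C are dropped because a value equal to a cutoff does not pass it.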
- sc_utils.merge_gene_info(adata: anndata._core.anndata.AnnData)
Merges gene information from different batches
After concatenating several datasets, the gene information dataframe adata.var can have a lot of duplicate columns from all the batches. This function merges gene_ids, feature_types and genome information from batches, inserts them in the table and removes the batch-associated columns.
- Parameters
adata – Annotated data matrix.
Example
>>> datasets = [sc.read_h5ad(path) for path in paths]
>>> adata = datasets[0].concatenate(datasets[1:], join="outer")
>>> sc_utils.merge_gene_info(adata)
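The merging step can be illustrated on plain dictionaries, one per gene. This is a sketch under assumptions, not the function's actual code: it assumes the batch-associated columns are named like gene_ids-0, gene_ids-1 (field name, dash, batch label), that None marks a value missing in a batch, and that the first non-missing value wins. The helper name merge_batch_columns is hypothetical.

```python
def merge_batch_columns(var_rows, fields=("gene_ids", "feature_types", "genome")):
    """Collapse '<field>-<batch>' columns into a single '<field>' column per gene."""
    merged = []
    for row in var_rows:
        # Keep every column that is not a batch-associated copy of a merged field
        out = {
            key: value
            for key, value in row.items()
            if not any(key.startswith(field + "-") for field in fields)
        }
        for field in fields:
            # Assumption: take the first batch that has a value for this gene
            candidates = [
                value
                for key, value in row.items()
                if key.startswith(field + "-") and value is not None
            ]
            out[field] = candidates[0] if candidates else None
        merged.append(out)
    return merged


# Two genes after an outer join: the second gene is missing from batch 0
var_rows = [
    {"name": "CD3E", "gene_ids-0": "ENSG00000198851", "gene_ids-1": "ENSG00000198851",
     "feature_types-0": "Gene Expression", "feature_types-1": "Gene Expression",
     "genome-0": "GRCh38", "genome-1": "GRCh38"},
    {"name": "MS4A1", "gene_ids-0": None, "gene_ids-1": "ENSG00000156738",
     "feature_types-0": None, "feature_types-1": "Gene Expression",
     "genome-0": None, "genome-1": "GRCh38"},
]
merged = merge_batch_columns(var_rows)
```

After the call, each row holds only name, gene_ids, feature_types and genome.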
- sc_utils.plot_composition(adata: anndata._core.anndata.AnnData, group_by: str, color: str, relative: bool = False, palette: Optional[Collection] = None, plot_numbers: bool = False) → matplotlib.axes._axes.Axes
Plot composition of clusters or other metadata
Groups cells by one metadata field and plots stacked barplot colored by another metadata field. Common use case is to see which samples contribute to which clusters. Plots horizontally.
- Parameters
adata – Annotated data matrix
group_by – Name of the field to group by on y axis
color – Name of the field to color by
relative – Plot percentage for each cluster if True or absolute counts if False
palette – Optional, pass your own palette
plot_numbers – If True, plot number of cells next to the bars
- Returns
Matplotlib axes with the plot.
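The numbers behind such a stacked barplot are a cross-tabulation of the two metadata fields. A minimal sketch of that tabulation, using only the standard library and a hypothetical helper named composition (the actual function presumably works on adata.obs and draws the bars as well):

```python
from collections import Counter, defaultdict


def composition(groups, colors, relative=False):
    """Count cells per (group, color) pair; with relative=True, convert the counts
    of each group to percentages that sum to 100."""
    counts = defaultdict(Counter)
    for group, color in zip(groups, colors):
        counts[group][color] += 1
    if not relative:
        return {group: dict(cnt) for group, cnt in counts.items()}
    return {
        group: {color: 100 * n / sum(cnt.values()) for color, n in cnt.items()}
        for group, cnt in counts.items()
    }


# Toy metadata: cluster and sample of origin for four cells
clusters = ["0", "0", "0", "1"]
samples = ["s1", "s1", "s2", "s2"]
absolute = composition(clusters, samples)
percent = composition(clusters, samples, relative=True)
```

Each outer key corresponds to one horizontal bar and each inner key to one colored segment of it.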
- sc_utils.write_mtx(adata, output_dir)
Save scanpy object in mtx cellranger v3 format.
Saves basic information from the adata object as a cellranger v3 mtx folder. Saves only the adata.var_names, adata.obs_names and adata.X fields. Creates the directory output_dir if it does not exist. Creates 3 files: features.tsv.gz, barcodes.tsv.gz and matrix.mtx.gz. Will overwrite files in the output directory.
- Parameters
adata – Annotated data matrix.
output_dir – Directory where to save results
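For readers unfamiliar with the folder layout, the following standard-library sketch writes the same three files for a tiny dense matrix. It is an illustration under assumptions, not the sc_utils code: the helper name write_mtx_sketch is hypothetical, the matrix is a list of per-cell rows of integer counts, and features.tsv.gz reuses the gene name as both id and name with the type "Gene Expression". The matrix is written genes by cells, the orientation Cell Ranger uses, which is the transpose of adata.X.

```python
import gzip
import os
import tempfile


def write_mtx_sketch(matrix, obs_names, var_names, output_dir):
    """Write barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz into output_dir.
    `matrix` is cells x genes (like adata.X), given as a list of rows of ints."""
    os.makedirs(output_dir, exist_ok=True)  # create the directory if missing
    with gzip.open(os.path.join(output_dir, "barcodes.tsv.gz"), "wt") as fh:
        fh.writelines(f"{barcode}\n" for barcode in obs_names)
    with gzip.open(os.path.join(output_dir, "features.tsv.gz"), "wt") as fh:
        # Assumption: id and name are both the var name
        fh.writelines(f"{gene}\t{gene}\tGene Expression\n" for gene in var_names)
    # Non-zero entries as (gene index, cell index, value), 1-based
    entries = [
        (gene_idx + 1, cell_idx + 1, value)
        for cell_idx, row in enumerate(matrix)
        for gene_idx, value in enumerate(row)
        if value != 0
    ]
    with gzip.open(os.path.join(output_dir, "matrix.mtx.gz"), "wt") as fh:
        fh.write("%%MatrixMarket matrix coordinate integer general\n")
        fh.write(f"{len(var_names)} {len(obs_names)} {len(entries)}\n")
        fh.writelines(f"{g} {c} {v}\n" for g, c, v in entries)


# Two cells, three genes; write to a temporary folder and read the matrix back
with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "mtx")
    write_mtx_sketch([[0, 2, 0], [1, 0, 3]], ["c1", "c2"], ["g1", "g2", "g3"], out_dir)
    files = sorted(os.listdir(out_dir))
    with gzip.open(os.path.join(out_dir, "matrix.mtx.gz"), "rt") as fh:
        lines = fh.read().splitlines()
```

The second line of matrix.mtx.gz gives the number of genes, cells and non-zero entries, in that order.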