alphagenome.data.gene_annotation.filter_protein_coding
- alphagenome.data.gene_annotation.filter_protein_coding(gtf, include_gene_entries=False)
Filter GTF entries to only protein-coding genes.
- Parameters:
gtf (DataFrame) – pd.DataFrame of GENCODE GTF entries. This data frame must contain a column named ‘transcript_type’ or ‘transcript_biotype’.
include_gene_entries (bool, default: False) – Whether to include gene entries in addition to transcript entries.
- Return type:
DataFrame
- Returns:
pd.DataFrame of GENCODE GTF entries subset to rows with protein-coding genes.
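
To illustrate the documented behavior, here is a minimal pandas-only sketch. It is not the library's implementation: `filter_protein_coding_sketch` and the toy data are hypothetical. The `Feature` and `gene_type` columns used for the `include_gene_entries` branch are assumptions about how gene rows are labeled; only `transcript_type` and `transcript_biotype` are named in the description above.

```python
import pandas as pd


def filter_protein_coding_sketch(
    gtf: pd.DataFrame, include_gene_entries: bool = False
) -> pd.DataFrame:
    """Approximates the documented filtering (hypothetical helper)."""
    # The documentation requires one of these two columns to be present.
    for column in ('transcript_type', 'transcript_biotype'):
        if column in gtf.columns:
            biotype_column = column
            break
    else:
        raise ValueError(
            "gtf must contain 'transcript_type' or 'transcript_biotype'."
        )

    # isin() yields a plain boolean mask and treats missing values as False.
    keep = gtf[biotype_column].isin(['protein_coding'])

    if include_gene_entries:
        # Assumption: gene rows have Feature == 'gene' and store their
        # biotype in 'gene_type'. The real column names may differ.
        is_gene_row = gtf['Feature'].isin(['gene'])
        is_coding_gene = gtf['gene_type'].isin(['protein_coding'])
        keep = keep | (is_gene_row & is_coding_gene)

    return gtf[keep]


# Toy table: gene A is protein-coding, gene B is a lncRNA.
toy_gtf = pd.DataFrame({
    'Feature': ['gene', 'transcript', 'transcript', 'gene'],
    'gene_name': ['A', 'A', 'B', 'B'],
    'gene_type': ['protein_coding', 'protein_coding', 'lncRNA', 'lncRNA'],
    'transcript_type': [None, 'protein_coding', 'lncRNA', None],
})

transcripts_only = filter_protein_coding_sketch(toy_gtf)
with_genes = filter_protein_coding_sketch(toy_gtf, include_gene_entries=True)
```

With the default `include_gene_entries=False`, only the transcript row of gene A is kept. Setting it to `True` also keeps the gene-level row for gene A, under the column assumptions stated above.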