alphagenome.data.gene_annotation.filter_transcript_type
- alphagenome.data.gene_annotation.filter_transcript_type(gtf, transcript_types=None)
Filter GTF entries by transcript types.
This function takes a GTF DataFrame and a collection of transcript types and returns a new DataFrame containing only the transcripts of the specified types.
The GTF DataFrame must contain a column named ‘transcript_type’ or ‘transcript_biotype’. The function will raise a ValueError if neither of these columns is present.
- Parameters:
gtf (DataFrame) – pd.DataFrame or pyranges.PyRanges.
transcript_types (Optional[tuple[TranscriptType, ...]] (default: None)) – List of valid transcript types to use for filtering.
- Return type:
DataFrame
- Returns:
pd.DataFrame of GENCODE GTF entries subset to rows with the requested transcript types.
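
The filtering behaviour described above can be illustrated with a minimal pandas sketch. This is not the library's implementation: `filter_by_transcript_type_sketch` is a hypothetical stand-in, the toy table and its `transcript_id` values are made up, and plain strings are used in place of `TranscriptType` members. How the real function treats `transcript_types=None` is not covered here.

```python
import pandas as pd


def filter_by_transcript_type_sketch(gtf, transcript_types):
    """Hypothetical stand-in that mirrors the documented behaviour."""
    # Accept either of the two column names named in the description.
    for column in ('transcript_type', 'transcript_biotype'):
        if column in gtf.columns:
            break
    else:
        # Neither column is present, so raise as the documentation states.
        raise ValueError(
            'GTF must contain a "transcript_type" or "transcript_biotype" column.'
        )
    # Keep only rows whose transcript type is among the requested ones.
    return gtf[gtf[column].isin(transcript_types)]


# Toy GTF-like table (assumed values, for illustration only).
gtf = pd.DataFrame({
    'transcript_id': ['T1', 'T2', 'T3'],
    'transcript_type': ['protein_coding', 'lncRNA', 'protein_coding'],
})

filtered = filter_by_transcript_type_sketch(gtf, ('protein_coding',))
print(filtered['transcript_id'].tolist())  # ['T1', 'T3']
```

Passing a table that has neither `transcript_type` nor `transcript_biotype` to the sketch raises a `ValueError`, matching the error condition documented above.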