alphagenome.data.gene_annotation.filter_transcript_type


alphagenome.data.gene_annotation.filter_transcript_type(gtf, transcript_types=None)

Filter GTF entries by transcript types.

This function takes a GTF DataFrame and a tuple of transcript types and returns a new DataFrame containing only the entries whose transcript type is among those specified.

The GTF DataFrame must contain a column named ‘transcript_type’ or ‘transcript_biotype’. The function will raise a ValueError if neither of these columns is present.

Parameters:
  • gtf (DataFrame) – pd.DataFrame or pyranges.PyRanges.

  • transcript_types (Optional[tuple[TranscriptType, ...]] (default: None)) – Tuple of transcript types to keep when filtering.

Return type:

DataFrame

Returns:

pd.DataFrame of GENCODE GTF entries subset to rows with the requested transcript types.
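
Example:

The behavior described above can be illustrated with a minimal pandas sketch. This is not the library's implementation: `filter_transcript_type_sketch` is a hypothetical stand-in, transcript types are passed as plain strings rather than `TranscriptType` values, and the toy table is invented for illustration. The sketch requires an explicit tuple of types, since this page does not describe what happens when `transcript_types` is left as `None`.

```python
import pandas as pd


def filter_transcript_type_sketch(
    gtf: pd.DataFrame, transcript_types: tuple[str, ...]
) -> pd.DataFrame:
    """Hypothetical stand-in that mimics the documented filtering behavior."""
    # Use whichever of the two documented column names is present.
    for column in ("transcript_type", "transcript_biotype"):
        if column in gtf.columns:
            break
    else:
        # Neither column exists, so there is nothing to filter on.
        raise ValueError(
            "gtf needs a 'transcript_type' or 'transcript_biotype' column."
        )
    # Keep only rows whose transcript type is one of the requested values.
    return gtf[gtf[column].isin(list(transcript_types))].copy()


# Toy GTF-like table; the IDs and types are made up for illustration.
toy_gtf = pd.DataFrame({
    "transcript_id": ["tx1", "tx2", "tx3"],
    "transcript_type": ["protein_coding", "lncRNA", "protein_coding"],
})
coding = filter_transcript_type_sketch(toy_gtf, ("protein_coding",))
```

Here `coding` holds only the `tx1` and `tx3` rows. With the real function, the call would presumably look like `filter_transcript_type(gtf, transcript_types=(...))`, with `TranscriptType` members in the tuple.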