alphagenome.data.gene_annotation.get_gene_interval
- alphagenome.data.gene_annotation.get_gene_interval(gtf, gene_symbol=None, gene_id=None)
Returns a stranded genome.Interval given a gene identifier.
Either gene_symbol or gene_id must be set, but not both.
- Parameters:
  - gtf (DataFrame) – pd.DataFrame of GENCODE GTF entries. Must contain columns ‘Feature’, ‘gene_name’, ‘gene_id’, ‘Chromosome’, ‘Start’, ‘End’, and ‘Strand’.
  - gene_symbol (Optional[str] (default: None)) – A gene name or gene symbol (e.g., ‘EGFR’, ‘TNF’, ‘TP53’).
  - gene_id (Optional[str] (default: None)) – An Ensembl gene ID, which can be patched (e.g., ‘ENSG00000141510.17’) or unpatched (e.g., ‘ENSG00000141510’).
- Return type:
  genome.Interval
- Returns:
  A genome.Interval for the given gene identifier.
- Raises:
  ValueError – If neither or both of gene_symbol and gene_id are set, or if no interval or multiple intervals are found for the given gene identifier.
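
The contract described above (exactly one identifier, exactly one matching gene row) can be illustrated with a small plain-Python sketch. This is not the library's implementation. The names `Interval` and `gene_interval_sketch` are hypothetical stand-ins, rows are plain dicts rather than a pd.DataFrame, and the coordinates are made up. Two details are assumptions: an unpatched ID is taken to match any version suffix, and symbols are compared exactly.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical stand-in for genome.Interval, used only in this sketch."""
    chromosome: str
    start: int
    end: int
    strand: str


def gene_interval_sketch(rows, gene_symbol: Optional[str] = None,
                         gene_id: Optional[str] = None) -> Interval:
    """Mimic the documented contract on a list of dict rows."""
    # Exactly one of the two identifiers must be provided.
    if (gene_symbol is None) == (gene_id is None):
        raise ValueError('Set exactly one of gene_symbol or gene_id.')

    def matches(row):
        # Only gene-level entries are candidates.
        if row['Feature'] != 'gene':
            return False
        if gene_symbol is not None:
            return row['gene_name'] == gene_symbol
        # Assumption: an ID with a version suffix must match exactly,
        # while an ID without one matches any version.
        if '.' in gene_id:
            return row['gene_id'] == gene_id
        return row['gene_id'].split('.', 1)[0] == gene_id

    hits = [row for row in rows if matches(row)]
    # Zero or several matches are both errors, as in the Raises section.
    if len(hits) != 1:
        raise ValueError(f'Expected one interval, found {len(hits)}.')
    hit = hits[0]
    return Interval(hit['Chromosome'], hit['Start'], hit['End'], hit['Strand'])


# Toy rows with made-up coordinates, for illustration only.
rows = [
    {'Feature': 'gene', 'gene_name': 'TP53',
     'gene_id': 'ENSG00000141510.17',
     'Chromosome': 'chr17', 'Start': 100, 'End': 200, 'Strand': '-'},
    {'Feature': 'transcript', 'gene_name': 'TP53',
     'gene_id': 'ENSG00000141510.17',
     'Chromosome': 'chr17', 'Start': 120, 'End': 180, 'Strand': '-'},
]

by_symbol = gene_interval_sketch(rows, gene_symbol='TP53')
by_id = gene_interval_sketch(rows, gene_id='ENSG00000141510')
```

In this sketch both lookups resolve to the same gene-level row, while passing no identifier, both identifiers, or an unknown gene raises ValueError.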