alphagenome.data.genome.Junction#

class alphagenome.data.genome.Junction(chromosome, start, end, strand='.', name='', info=<factory>, k=None)[source]#

Represents a splice junction.

A splice junction is a point in a pre-mRNA transcript where an intron is removed and exons are joined during RNA splicing. This class inherits from Interval and adds properties and methods specific to splice junctions.

chromosome#

The chromosome name.

start#

The 0-based start position of the junction.

end#

The 0-based end position of the junction.

strand#

The strand of the junction (‘+’ or ‘-’).

name#

An optional name for the junction.

info#

An optional dictionary to store additional information.

k#

An optional integer representing the number of reads supporting the splice junction.

Raises:

ValueError – If the junction is unstranded, that is, if strand is ‘.’.

Attributes#

acceptor

Returns the acceptor site position.

donor

Returns the donor site position.

k

chromosome

start

end

info

Junction.acceptor#

Returns the acceptor site position.

Junction.donor#

Returns the donor site position.
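
As a rough illustration of how donor and acceptor relate to strand, the sketch below uses a hypothetical stand-in function rather than the library class. It assumes the usual convention that the donor is the 5' end of the intron in transcript orientation, so it maps to start on ‘+’ and to end on ‘-’. The exact coordinates that Junction returns are not stated on this page and may differ by an offset.

```python
def donor_acceptor(start: int, end: int, strand: str) -> tuple[int, int]:
    """Return (donor, acceptor) for an intron spanning start..end.

    Hypothetical helper, not part of alphagenome. Assumes the donor is the
    5' end of the intron in transcript orientation.
    """
    if strand == '+':
        return start, end
    if strand == '-':
        return end, start
    # Mirrors the documented behaviour that unstranded junctions are rejected.
    raise ValueError(f'Junction must be stranded, got {strand!r}')


print(donor_acceptor(100, 200, '+'))  # (100, 200)
print(donor_acceptor(100, 200, '-'))  # (200, 100)
```

Swapping the two values on the negative strand keeps "donor" meaning "upstream end of the intron" regardless of orientation.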

Junction.k: int | None = None#
Junction.name: str = ''#
Junction.negative_strand#

Returns True if interval is on the negative strand, False otherwise.

Junction.strand: str = '.'#
Junction.width#

Returns the width of the interval.

Junction.chromosome: str#
Junction.start: int#
Junction.end: int#
Junction.info: dict[str, Any]#

Methods#

acceptor_region([overhang])

Returns the region around the acceptor site with overhang.

as_unstranded()

Returns an unstranded copy of the interval.

binary_mask(intervals[, bin_size])

Returns a boolean mask that is True for every bin overlapped by at least one interval (coverage > 0).

binary_mask_stranded(intervals[, bin_size])

Returns a boolean mask that is True for every bin overlapped by at least one interval (coverage > 0).

boundary_shift([start_offset, end_offset, ...])

Extends or shrinks the interval by adjusting the positions with padding.

center([use_strand])

Computes the center of the interval.

contains(interval)

Checks if this interval completely contains another interval.

copy()

Returns a deep copy of the interval.

coverage(intervals, *[, bin_size])

Computes a coverage track from a sequence of intervals overlapping this interval.

coverage_stranded(intervals, *[, bin_size])

Computes a coverage track from intervals overlapping this interval.

dinucleotide_region()

Returns the dinucleotide regions around acceptor and donor sites.

donor_region([overhang])

Returns the region around the donor site with overhang.

from_interval_dict(interval)

Creates an Interval from a dictionary.

from_proto(proto)

Creates an Interval from a protobuf message.

from_pyranges_dict(row[, ignore_info])

Creates an Interval from a pyranges-like dictionary.

from_str(string)

Creates an Interval from a string (e.g., 'chr1:100-200:+').

intersect(interval)

Returns the intersection of this interval with another interval.

overlap_ranges(intervals)

Returns overlapping ranges from intervals overlapping this interval.

overlaps(interval)

Checks if this interval overlaps with another interval.

pad(start_pad, end_pad, *[, use_strand])

Pads the interval by adding the specified padding to the start and end.

pad_inplace(start_pad, end_pad, *[, use_strand])

Pads the interval in place by adding padding to the start and end.

resize(width[, use_strand])

Resizes the interval to a new width, centered around the original center.

resize_inplace(width[, use_strand])

Resizes the interval in place, centered around the original center.

shift(offset[, use_strand])

Shifts the interval by the given offset.

swap_strand()

Swaps the strand of the interval.

to_interval_dict()

Converts the interval to a dictionary.

to_proto()

Converts the interval to a protobuf message.

to_pyranges_dict()

Converts the interval to a pyranges-like dictionary.

truncate([reference_length])

Truncates the interval to fit within the valid reference range.

within_reference([reference_length])

Checks if the interval is within the valid reference range.

Junction.acceptor_region(overhang=(250, 250))[source]#

Returns the region around the acceptor site with overhang.

Return type:

Interval

Junction.as_unstranded()[source]#

Returns an unstranded copy of the interval.

Return type:

Self

Junction.binary_mask(intervals, bin_size=1)[source]#

Returns a boolean mask that is True for every bin overlapped by at least one interval (coverage > 0).

Return type:

ndarray

Junction.binary_mask_stranded(intervals, bin_size=1)[source]#

Returns a boolean mask that is True for every bin overlapped by at least one interval (coverage > 0).

Return type:

ndarray

Junction.boundary_shift(start_offset=0, end_offset=0, use_strand=True)[source]#

Extends or shrinks the interval by adjusting the positions with padding.

Parameters:
  • start_offset (int (default: 0)) – The amount to shift the start position.

  • end_offset (int (default: 0)) – The amount to shift the end position.

  • use_strand (bool (default: True)) – If True, the offsets are applied in reverse for negative strand intervals.

Return type:

Self

Returns:

A new interval with adjusted boundaries.

Junction.center(use_strand=True)[source]#

Computes the center of the interval.

For intervals with an odd width, the center is rounded down to the nearest integer.

If use_strand is True and the interval is on the negative strand, the center is calculated differently to maintain consistency when stacking sequences from different intervals oriented in the forward strand direction. This ensures that the relative distance between the interval’s upstream boundary and its center is preserved.

Parameters:

use_strand (bool (default: True)) – If True, the strand of the interval is considered when calculating the center.

Return type:

int

Returns:

The integer representing the center position of the interval.

Examples

>>> Interval('1', 1, 3, '+').center()
2
>>> Interval('1', 1, 3, '-').center()  # Strand doesn't matter.
2
>>> Interval('1', 1, 4, '+').center()
3
>>> Interval('1', 1, 4, '-').center()
2
>>> Interval('1', 1, 4, '+').center(use_strand=False)
3
>>> Interval('1', 1, 2, '-').center()
1
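
The doctest outputs above can be reproduced with a small stand-alone function. This is only a sketch fitted to those listed outputs; it is not taken from the library source, and the function name and tuple-style arguments are hypothetical.

```python
def center(start: int, end: int, strand: str = '.', use_strand: bool = True) -> int:
    """Hypothetical re-derivation of the center from the documented examples."""
    if use_strand and strand == '-':
        # Negative strand: floor of (start + end) / 2.
        return (start + end) // 2
    # Otherwise: ceiling of (start + end) / 2.
    return (start + end + 1) // 2


# Each call mirrors one of the Interval examples listed above.
print(center(1, 3, '+'))                    # 2
print(center(1, 3, '-'))                    # 2
print(center(1, 4, '+'))                    # 3
print(center(1, 4, '-'))                    # 2
print(center(1, 4, '+', use_strand=False))  # 3
print(center(1, 2, '-'))                    # 1
```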

Junction.contains(interval)[source]#

Checks if this interval completely contains another interval.

Return type:

bool

Junction.copy()[source]#

Returns a deep copy of the interval.

Return type:

Self

Junction.coverage(intervals, *, bin_size=1)[source]#

Computes a coverage track from a sequence of intervals overlapping this interval.

This method calculates the coverage of this interval by a set of other intervals. The coverage is defined as the number of intervals that overlap each position within this interval.

The bin_size parameter allows you to bin the coverage into equal-sized windows. This can be useful for summarizing coverage over larger regions. If bin_size is 1, the coverage is calculated at single-base resolution.

Parameters:
  • intervals (Sequence[Self]) – A sequence of Interval objects that may overlap this interval.

  • bin_size (int (default: 1)) – The size of the bins used to calculate coverage. Must be a positive integer that divides the width of the interval.

Return type:

ndarray

Returns:

A 1D numpy array representing the coverage track. The length of the array is self.width // bin_size. Each element in the array represents the summed coverage within the corresponding bin.

Raises:

ValueError – If bin_size is not a positive integer or if the interval width is not divisible by bin_size.
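
The counting and binning described above can be sketched in plain Python. The function below is a hypothetical stand-in, not the library method: it takes (start, end) pairs instead of Interval objects, assumes half-open [start, end) coordinates, and returns a list rather than a numpy array.

```python
def coverage(start, end, intervals, bin_size=1):
    """Count overlapping intervals per position, then sum within bins.

    Hypothetical pure-Python stand-in; intervals are (start, end) pairs and
    are assumed to be half-open, i.e. [start, end).
    """
    width = end - start
    if not isinstance(bin_size, int) or bin_size <= 0 or width % bin_size:
        raise ValueError('bin_size must be a positive integer dividing the width')
    per_base = [0] * width
    for other_start, other_end in intervals:
        # Clip each interval to the query window before counting.
        for pos in range(max(start, other_start), min(end, other_end)):
            per_base[pos - start] += 1
    # Sum the per-base counts inside each bin of bin_size positions.
    return [sum(per_base[i:i + bin_size]) for i in range(0, width, bin_size)]


print(coverage(0, 8, [(2, 6), (4, 10)]))              # [0, 0, 1, 1, 2, 2, 1, 1]
print(coverage(0, 8, [(2, 6), (4, 10)], bin_size=4))  # [2, 6]
```

With bin_size=4 the output has width // bin_size = 2 entries, and each entry is the sum of the four per-base counts in that bin.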

Junction.coverage_stranded(intervals, *, bin_size=1)[source]#

Computes a coverage track from intervals overlapping this interval.

This method considers the strand information of both self and intervals.

Parameters:
  • intervals (Sequence[Self]) – Sequence of intervals possibly overlapping self.

  • bin_size (int (default: 1)) – Resolution at which to bin the output coverage track. Coverage within each bin (if larger than 1) will be summarized using sum().

Return type:

ndarray

Returns:

Numpy array of shape (self.width // bin_size, 2) where output[:, 0] represents coverage for intervals on the same strand as self and output[:, 1] represents coverage of intervals on the opposite strand.

Junction.dinucleotide_region()[source]#

Returns the dinucleotide regions around acceptor and donor sites.

Return type:

tuple[Interval, Interval]

Junction.donor_region(overhang=(250, 250))[source]#

Returns the region around the donor site with overhang.

Return type:

Interval

classmethod Junction.from_interval_dict(interval)[source]#

Creates an Interval from a dictionary.

Return type:

Self

classmethod Junction.from_proto(proto)[source]#

Creates an Interval from a protobuf message.

Return type:

Self

classmethod Junction.from_pyranges_dict(row, ignore_info=False)[source]#

Creates an Interval from a pyranges-like dictionary.

This method constructs an Interval object from a dictionary that follows the pyranges format, such as a row from a pandas.DataFrame converted to a dict.

The dictionary should have the following keys:

  • ‘Chromosome’: The chromosome name.

  • ‘Start’: The start position.

  • ‘End’: The end position.

  • ‘Strand’: The strand (optional, defaults to unstranded).

  • ‘Name’: The interval name (optional).

Any other keys in the dictionary will be added to the info attribute of the Interval object, unless ignore_info is set to True.

Parameters:
  • row (Mapping[str, Any]) – A dictionary containing interval data.

  • ignore_info (bool (default: False)) – If True, any keys in the dictionary that are not part of the standard pyranges columns (‘Chromosome’, ‘Start’, ‘End’, ‘Strand’, ‘Name’) will not be added to the info attribute.

Return type:

Interval

Returns:

An Interval object created from the input dictionary.
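
The key handling described above can be sketched without the library. The helper below is hypothetical: it only shows how a pyranges-style row might be split into constructor fields and info, using the defaults from the class signature (strand ‘.’, empty name).

```python
from typing import Any, Mapping

_STANDARD_COLUMNS = ('Chromosome', 'Start', 'End', 'Strand', 'Name')


def split_pyranges_row(row: Mapping[str, Any], ignore_info: bool = False) -> dict[str, Any]:
    """Hypothetical helper: map a pyranges-style row onto constructor fields."""
    fields = {
        'chromosome': row['Chromosome'],
        'start': row['Start'],
        'end': row['End'],
        'strand': row.get('Strand', '.'),  # '.' marks an unstranded interval.
        'name': row.get('Name', ''),
    }
    # Every non-standard column is carried over into info unless suppressed.
    fields['info'] = (
        {} if ignore_info
        else {k: v for k, v in row.items() if k not in _STANDARD_COLUMNS}
    )
    return fields


row = {'Chromosome': 'chr1', 'Start': 100, 'End': 200, 'Strand': '+', 'gene_id': 'G1'}
print(split_pyranges_row(row)['info'])                    # {'gene_id': 'G1'}
print(split_pyranges_row(row, ignore_info=True)['info'])  # {}
```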

classmethod Junction.from_str(string)[source]#

Creates an Interval from a string (e.g., ‘chr1:100-200:+’).

Return type:

Self
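
For orientation, a string of the form shown above could be parsed roughly as follows. This is a hypothetical parser, not the library's: the accepted grammar is an assumption based only on the single example ‘chr1:100-200:+’, and treating a missing strand as ‘.’ is also an assumption.

```python
import re

_PATTERN = re.compile(r'^([^:]+):(\d+)-(\d+)(?::([+\-.]))?$')


def parse_interval(string: str) -> tuple[str, int, int, str]:
    """Hypothetical parser for strings such as 'chr1:100-200:+'."""
    match = _PATTERN.match(string)
    if match is None:
        raise ValueError(f'Cannot parse interval string: {string!r}')
    chromosome, start, end, strand = match.groups()
    # Treating a missing strand as '.' (unstranded) is an assumption.
    return chromosome, int(start), int(end), strand or '.'


print(parse_interval('chr1:100-200:+'))  # ('chr1', 100, 200, '+')
```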

Junction.intersect(interval)[source]#

Returns the intersection of this interval with another interval.

Return type:

Optional[Self]
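
The Optional return type suggests that no overlap yields None. A minimal sketch of that idea, using a hypothetical function on (start, end) pairs and assuming half-open [start, end) coordinates on the same chromosome:

```python
from typing import Optional


def intersect(a: tuple[int, int], b: tuple[int, int]) -> Optional[tuple[int, int]]:
    """Hypothetical stand-in: intersection of two half-open [start, end) ranges."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    # An empty or negative-width result means the ranges do not overlap.
    return (start, end) if start < end else None


print(intersect((100, 200), (150, 300)))  # (150, 200)
print(intersect((100, 200), (200, 300)))  # None
```

Under the half-open assumption, ranges that merely touch at a boundary share no positions, which is why the second call returns None.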

Junction.overlap_ranges(intervals)[source]#

Returns overlapping ranges from intervals overlapping this interval.

Parameters:

intervals (Sequence[Self]) – Sequence of candidate intervals to test for overlap.

Return type:

ndarray

Returns:

2D numpy array indicating the start and end of the overlapping ranges.

Junction.overlaps(interval)[source]#

Checks if this interval overlaps with another interval.

Return type:

bool

Junction.pad(start_pad, end_pad, *, use_strand=True)[source]#

Pads the interval by adding the specified padding to the start and end.

Parameters:
  • start_pad (int) – The amount of padding to add to the start.

  • end_pad (int) – The amount of padding to add to the end.

  • use_strand (bool (default: True)) – If True, padding is applied in reverse for negative strand intervals.

Return type:

Self

Returns:

A new padded interval.
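
One plausible reading of "applied in reverse" is that start_pad and end_pad trade places on the negative strand, so that start_pad always extends the upstream side. The sketch below illustrates that reading with a hypothetical function on plain coordinates; it is an assumption, not the library's implementation.

```python
def pad(start, end, strand, start_pad, end_pad, use_strand=True):
    """Hypothetical stand-in for strand-aware padding of a (start, end) range."""
    if use_strand and strand == '-':
        # On the negative strand the upstream side is the larger coordinate,
        # so the two paddings are swapped before being applied.
        start_pad, end_pad = end_pad, start_pad
    return start - start_pad, end + end_pad


print(pad(100, 200, '+', 10, 20))                    # (90, 220)
print(pad(100, 200, '-', 10, 20))                    # (80, 210)
print(pad(100, 200, '-', 10, 20, use_strand=False))  # (90, 220)
```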

Junction.pad_inplace(start_pad, end_pad, *, use_strand=True)[source]#

Pads the interval in place by adding padding to the start and end.

Parameters:
  • start_pad (int) – The amount of padding to add to the start.

  • end_pad (int) – The amount of padding to add to the end.

  • use_strand (bool (default: True)) – If True, padding is applied in reverse for negative strand intervals.

Junction.resize(width, use_strand=True)[source]#

Resizes the interval to a new width, centered around the original center.

Parameters:
  • width (int) – The new width of the interval.

  • use_strand (bool (default: True)) – If True, resizing considers the strand orientation.

Return type:

Self

Returns:

A new resized interval.

Junction.resize_inplace(width, use_strand=True)[source]#

Resizes the interval in place, centered around the original center.

Parameters:
  • width (int) – The new width of the interval.

  • use_strand (bool (default: True)) – If True, resizing considers the strand orientation.

Return type:

None

Junction.shift(offset, use_strand=True)[source]#

Shifts the interval by the given offset.

Parameters:
  • offset (int) – The amount to shift the interval.

  • use_strand (bool (default: True)) – If True, the shift direction is reversed for negative strand intervals.

Return type:

Self

Returns:

A new shifted interval.

Junction.swap_strand()[source]#

Swaps the strand of the interval.

Return type:

Self

Junction.to_interval_dict()[source]#

Converts the interval to a dictionary.

Return type:

dict[str, str | int]

Junction.to_proto()[source]#

Converts the interval to a protobuf message.

Return type:

Interval

Junction.to_pyranges_dict()[source]#

Converts the interval to a pyranges-like dictionary.

Return type:

dict[str, int | str]

Junction.truncate(reference_length=9223372036854775807)[source]#

Truncates the interval to fit within the valid reference range.

Return type:

Self

Junction.within_reference(reference_length=9223372036854775807)[source]#

Checks if the interval is within the valid reference range.

Return type:

bool
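
As a sketch of what "valid reference range" might mean for truncate and within_reference, the hypothetical functions below treat it as the span from 0 to reference_length. That interpretation is an assumption. The default shown in the signatures, 9223372036854775807, equals 2**63 - 1.

```python
DEFAULT_REFERENCE_LENGTH = 2**63 - 1  # Same value as the documented default.


def within_reference(start, end, reference_length=DEFAULT_REFERENCE_LENGTH):
    """Hypothetical check: does the range lie inside [0, reference_length]?"""
    return start >= 0 and end <= reference_length


def truncate(start, end, reference_length=DEFAULT_REFERENCE_LENGTH):
    """Hypothetical clamp of a range to [0, reference_length]."""
    return max(start, 0), min(end, reference_length)


print(within_reference(-5, 50, reference_length=40))  # False
print(truncate(-5, 50, reference_length=40))          # (0, 40)
```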