get_bound_overlaps

Functions:

| Name | Description |
| --- | --- |
| get_bound_overlaps | Calculates the overlaps, overlap ratios, and distances between input bounds and output bounds. |

Functions

get_bound_overlaps

get_bound_overlaps(
    input_bounds: ndarray, output_bounds: ndarray
) -> tuple[
    dict[int, list[int]],
    dict[int, list[float]],
    dict[int, list[float]],
]

Calculates the overlaps, overlap ratios, and distances between input bounds and output bounds.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_bounds | ndarray | An array of shape (n, 2) representing the input bounds, where each row is [start, end]. | required |
| output_bounds | ndarray | An array of shape (m, 2) representing the output bounds, where each row is [start, end]. | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[dict[int, list[int]], dict[int, list[float]], dict[int, list[float]]] | A tuple containing three dictionaries, each keyed by output bin index: (1) the indices of the input bins that overlap the output bin; (2) the overlap ratios, i.e. the fraction of each input bin that is covered by the output bin; (3) the distances from the output bin center to the center of each input bin that is at least 50% covered by the output bin. |
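The shape of the three returned dictionaries may be easier to see on a small numeric case. The function below, `sketch_bound_overlaps`, is a hypothetical stand-in written for illustration only; it is not tsdat's private `_get_bound_overlaps` helper, whose implementation is not shown on this page. It assumes that the distance is the absolute center-to-center distance and that the distance list only contains entries for input bins that are at least 50% covered.

```python
import numpy as np


def sketch_bound_overlaps(input_bounds, output_bounds):
    """Hypothetical illustration of the documented return structure."""
    in_left, in_right = input_bounds[:, 0], input_bounds[:, 1]
    in_width = in_right - in_left
    in_center = (in_left + in_right) / 2

    bin_idxs, bin_ratios, bin_distances = {}, {}, {}
    for j, (out_left, out_right) in enumerate(output_bounds):
        # Length of the intersection between each input bin and output bin j
        # (zero or negative when the two intervals do not overlap).
        overlap = np.minimum(in_right, out_right) - np.maximum(in_left, out_left)
        hits = np.flatnonzero(overlap > 0)
        # Fraction of each overlapping input bin covered by output bin j.
        ratios = overlap[hits] / in_width[hits]
        out_center = (out_left + out_right) / 2

        bin_idxs[j] = hits.tolist()
        bin_ratios[j] = ratios.tolist()
        # Assumption: absolute distance between centers, reported only for
        # input bins that are at least 50% covered by the output bin.
        covered = hits[ratios >= 0.5]
        bin_distances[j] = np.abs(in_center[covered] - out_center).tolist()
    return bin_idxs, bin_ratios, bin_distances


# Four unit-width input bins and two wider output bins.
input_bounds = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
output_bounds = np.array([[0.0, 2.0], [1.75, 3.5]])
idxs, ratios, distances = sketch_bound_overlaps(input_bounds, output_bounds)

print(idxs)       # {0: [0, 1], 1: [1, 2, 3]}
print(ratios)     # {0: [1.0, 1.0], 1: [0.25, 1.0, 0.5]}
print(distances)  # {0: [0.5, 0.5], 1: [0.125, 0.875]}
```

In this sketch, input bin 1 overlaps output bin 1 by only 25%, so it appears in the index and ratio lists but contributes no distance. Whether the real function aligns its lists this way should be checked against the tsdat source.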

Source code in tsdat/transform_v2/utils/get_bound_overlaps.py
import numpy as np
import pandas as pd


def get_bound_overlaps(
    input_bounds: np.ndarray, output_bounds: np.ndarray
) -> tuple[dict[int, list[int]], dict[int, list[float]], dict[int, list[float]]]:
    """
    Calculates the overlaps, overlap ratios, and distances between input bounds and
    output bounds.

    Args:
        input_bounds (np.ndarray): An array of shape (n, 2) representing the input bounds,
                                   where each row is [start, end].
        output_bounds (np.ndarray): An array of shape (m, 2) representing the output bounds,
                                    where each row is [start, end].

    Returns:
        tuple[dict[int, list[int]], dict[int, list[float]], dict[int, list[float]]]:
            A tuple containing three dictionaries:
            - The first dictionary maps each output bin index to a list of input bin indices
              that overlap with it.
            - The second dictionary maps each output bin index to a list of ratios,
              representing the fraction of the input bin that is covered by the output
              bin.
            - The third dictionary maps each output bin index to a list of distances,
              representing the distance from the output bin center to each input bin
              center at least 50% covered by the output bin.
    """
    # Convert to numerical arrays to calculate bound overlaps. First, to timedelta64.
    # Then break apart into 1D left and right bounds for input and output. Then use
    # pandas to get seconds from the timedeltas. Finally re-combine into bounds. Clunky
    # because pd.to_timedelta() can't handle a 2D array as input.
    if np.issubdtype(input_bounds.dtype, np.datetime64):
        start_time = input_bounds[0, 0]
        _input_deltas = input_bounds - start_time
        _output_deltas = output_bounds - start_time
        input_l_sec = pd.to_timedelta(_input_deltas[:, 0]).total_seconds()
        input_r_sec = pd.to_timedelta(_input_deltas[:, 1]).total_seconds()
        output_l_sec = pd.to_timedelta(_output_deltas[:, 0]).total_seconds()
        output_r_sec = pd.to_timedelta(_output_deltas[:, 1]).total_seconds()
        input_bounds = np.column_stack((input_l_sec, input_r_sec))
        output_bounds = np.column_stack((output_l_sec, output_r_sec))

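    # Delegate the overlap calculation to the private helper, which is defined
    # outside this listing and operates on plain numeric bounds.
    bin_idxs, bin_overlaps, bin_distances = _get_bound_overlaps(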
        input_bounds=input_bounds,
        output_bounds=output_bounds,
    )
    return bin_idxs, bin_overlaps, bin_distances
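
The datetime branch above can be tried in isolation. The following is a minimal sketch of that conversion step, using the same pandas calls as the listing; the helper name `to_seconds` and the sample timestamps are made up for illustration. Both sets of bounds are measured in seconds from the first input start time, so they share one numeric axis.

```python
import numpy as np
import pandas as pd


def to_seconds(bounds, start_time):
    """Convert (k, 2) datetime64 bounds to float seconds since start_time."""
    deltas = bounds - start_time  # timedelta64 array, still shape (k, 2)
    # pd.to_timedelta expects 1D input, so convert each column separately.
    left = pd.to_timedelta(deltas[:, 0]).total_seconds()
    right = pd.to_timedelta(deltas[:, 1]).total_seconds()
    return np.column_stack((left, right))


input_bounds = np.array(
    [
        ["2024-01-01T00:00:00", "2024-01-01T00:01:00"],
        ["2024-01-01T00:01:00", "2024-01-01T00:02:00"],
    ],
    dtype="datetime64[ns]",
)
output_bounds = np.array(
    [["2024-01-01T00:00:30", "2024-01-01T00:01:30"]],
    dtype="datetime64[ns]",
)

# Every offset is relative to the first input bound, as in the listing.
start_time = input_bounds[0, 0]
input_sec = to_seconds(input_bounds, start_time)
output_sec = to_seconds(output_bounds, start_time)

print(input_sec.tolist())   # [[0.0, 60.0], [60.0, 120.0]]
print(output_sec.tolist())  # [[30.0, 90.0]]

# A possible NumPy-only alternative (not what the listing does): dividing a
# timedelta64 array by a one-second timedelta works on 2D input directly.
alt_input_sec = (input_bounds - start_time) / np.timedelta64(1, "s")
```

The NumPy-only division is offered as an option to consider, not as a description of the library's behavior.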