adi

Attributes#

CDSGroup module-attribute #

CDSGroup = Group

CDSObject module-attribute #

CDSObject = Object

CDSVar module-attribute #

CDSVar = Var

COORDINATE_SYSTEM module-attribute #

COORDINATE_SYSTEM = 'coord_sys'

INPUT_DATASTREAM module-attribute #

INPUT_DATASTREAM = 'input_ds'

OUTPUT_DATASTREAM module-attribute #

OUTPUT_DATASTREAM = 'output_ds'

adi_qc_atts module-attribute #

adi_qc_atts = {
    "bit_1_description": "QC_BAD:  Transformation could not finish, value set to missing_value.",
    "bit_1_assessment": "Bad",
    "bit_2_description": "QC_INDETERMINATE:  Some, or all, of the input values used to create this output value had a QC assessment of Indeterminate.",
    "bit_2_assessment": "Indeterminate",
    "bit_3_description": "QC_INTERPOLATE:  Indicates a non-standard interpolation using points other than the two that bracket the target index was applied.",
    "bit_3_assessment": "Indeterminate",
    "bit_4_description": "QC_EXTRAPOLATE:  Indicates extrapolation is performed out from two points on the same side of the target index.",
    "bit_4_assessment": "Indeterminate",
    "bit_5_description": "QC_NOT_USING_CLOSEST:  Nearest good point is not the nearest actual point.",
    "bit_5_assessment": "Indeterminate",
    "bit_6_description": "QC_SOME_BAD_INPUTS:  Some, but not all, of the inputs in the averaging window were flagged as bad and excluded from the transform.",
    "bit_6_assessment": "Indeterminate",
    "bit_7_description": "QC_ZERO_WEIGHT:  The weights for all the input points to be averaged for this output bin were set to zero.",
    "bit_7_assessment": "Indeterminate",
    "bit_8_description": "QC_OUTSIDE_RANGE:  No input samples exist in the transformation region, value set to missing_value.",
    "bit_8_assessment": "Bad",
    "bit_9_description": "QC_ALL_BAD_INPUTS:  All the input values in the transformation region are bad, value set to missing_value.",
    "bit_9_assessment": "Bad",
    "bit_10_description": "QC_BAD_STD:  Standard deviation over averaging interval is greater than limit set by transform parameter std_bad_max.",
    "bit_10_assessment": "Bad",
    "bit_11_description": "QC_INDETERMINATE_STD:  Standard deviation over averaging interval is greater than limit set by transform parameter std_ind_max.",
    "bit_11_assessment": "Indeterminate",
    "bit_12_description": "QC_BAD_GOODFRAC:  Fraction of good and indeterminate points over averaging interval are less than limit set by transform parameter goodfrac_bad_min.",
    "bit_12_assessment": "Bad",
    "bit_13_description": "QC_INDETERMINATE_GOODFRAC:  Fraction of good and indeterminate points over averaging interval is less than limit set by transform parameter goodfrac_ind_min.",
    "bit_13_assessment": "Indeterminate",
}
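
As a rough illustration of how these attributes can be read, the sketch below decodes an integer QC value into the names of the flags that are set. It assumes that bit_N corresponds to the mask 1 << (N - 1), which the listing above does not state. `decode_qc` is a hypothetical helper, and `QC_ATTS` is a shortened copy of the dictionary so that the snippet runs on its own.

```python
import re
from typing import Dict, List, Tuple

# Shortened copy of adi_qc_atts for illustration (only bits 1-3 are reproduced).
QC_ATTS: Dict[str, str] = {
    "bit_1_description": "QC_BAD:  Transformation could not finish, value set to missing_value.",
    "bit_1_assessment": "Bad",
    "bit_2_description": "QC_INDETERMINATE:  Some, or all, of the input values used to create this output value had a QC assessment of Indeterminate.",
    "bit_2_assessment": "Indeterminate",
    "bit_3_description": "QC_INTERPOLATE:  Indicates a non-standard interpolation using points other than the two that bracket the target index was applied.",
    "bit_3_assessment": "Indeterminate",
}


def decode_qc(value: int, atts: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (flag_name, assessment) for every bit set in `value`.

    Assumption: bit_N maps to the mask 1 << (N - 1).
    """
    flags: List[Tuple[str, str]] = []
    for key, description in atts.items():
        match = re.fullmatch(r"bit_(\d+)_description", key)
        if match is None:
            continue  # skip the *_assessment entries
        bit = int(match.group(1))
        if value & (1 << (bit - 1)):
            # The flag name is the text before the first colon.
            name = description.split(":", 1)[0]
            flags.append((name, atts[f"bit_{bit}_assessment"]))
    return flags


# Bits 1 and 3 set under the assumed numbering.
flags = decode_qc(0b101, QC_ATTS)
```

Under that assumption, a value with no bits set yields an empty list, meaning no transform QC flag was raised.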

Classes#

ADIAlignments #

Attributes#

CENTER class-attribute instance-attribute #
CENTER = 'CENTER'
LEFT class-attribute instance-attribute #
LEFT = 'LEFT'
RIGHT class-attribute instance-attribute #
RIGHT = 'RIGHT'
label_to_int class-attribute instance-attribute #
label_to_int = {LEFT: 0, CENTER: 0.5, RIGHT: 1}

Functions#

get_adi_value staticmethod #
get_adi_value(parameter_value: str)
Source code in tsdat/transform/adi.py
@staticmethod
def get_adi_value(parameter_value: str):
    return ADIAlignments.label_to_int.get(parameter_value)
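
A small usage sketch may help here. It re-declares the class as listed above so that the snippet runs without tsdat installed. The one point it adds is that `dict.get` returns `None` for a label that is not in `label_to_int`, so a misspelled or lowercase label is not rejected by this method.

```python
class ADIAlignments:
    # Stand-in copy of the class listed above, for illustration only.
    LEFT = "LEFT"
    CENTER = "CENTER"
    RIGHT = "RIGHT"
    label_to_int = {LEFT: 0, CENTER: 0.5, RIGHT: 1}

    @staticmethod
    def get_adi_value(parameter_value: str):
        return ADIAlignments.label_to_int.get(parameter_value)


left = ADIAlignments.get_adi_value("LEFT")
center = ADIAlignments.get_adi_value("CENTER")
# Lookup is case-sensitive; an unknown label gives None rather than an error.
unknown = ADIAlignments.get_adi_value("center")
```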

ADITransformationTypes #

Attributes#

TRANS_AUTO class-attribute instance-attribute #
TRANS_AUTO = 'TRANS_AUTO'
TRANS_BIN_AVERAGE class-attribute instance-attribute #
TRANS_BIN_AVERAGE = 'TRANS_BIN_AVERAGE'
TRANS_INTERPOLATE class-attribute instance-attribute #
TRANS_INTERPOLATE = 'TRANS_INTERPOLATE'
TRANS_PASSTHROUGH class-attribute instance-attribute #
TRANS_PASSTHROUGH = 'TRANS_PASSTHROUGH'
TRANS_SUBSAMPLE class-attribute instance-attribute #
TRANS_SUBSAMPLE = 'TRANS_SUBSAMPLE'

AdiTransformer #

Functions#

transform #
transform(
    variable_name: str,
    input_dataset: xr.Dataset,
    output_dataset: xr.Dataset,
    transform_parameters: Dict[str, Any],
)

This function will use ADI libraries to transform one data variable to the shape defined for the output. This function will also fill out the output qc_ variable with the appropriate qc status from the transform algorithm.

The output variable and output qc_ variables' data will be written in place. Any variable attribute values that were added by ADI will be copied back to the output variables.

The caller does not need to call this transform method for any 1-d variables where TRANS_PASSTHROUGH would apply. However, if there are two dimensions (say time and height) and the user only wants to transform one of them (for example, time data is mapped to the input, but height data needs to be averaged), then the caller must call this transform with TRANS_PASSTHROUGH for every mapped dimension and a different transformation algorithm for each non-mapped dimension.

If all dimensions are mapped and the caller does not call this method, then all input values and input qc values must be copied over to the output by the caller. In this case the caller should also add a 'source' attribute on the variable to explain which datastream the value came from.

Parameters#

variable_name : str

The name of the variable being transformed. It should have the same name in both the input and output datasets.

input_dataset : xarray.Dataset

An xarray dataset containing:

1) A data variable to be transformed.
2) Zero or one qc_ variable that contains qc flags for the data variable. The qc_ variable must have the exact same base name as the data variable. For example, if the data variable is named 'temperature', then the qc variable must be named qc_temperature. The qc_ variable must not have any qc attributes set. They will all be set by the transformer to specific bits that cannot be changed.
3) One or more coordinate variables matching the coordinates on the data variable.
4) Zero or more bounds variables, one for each coordinate variable. Bounds variables specify the front edge and back edge of the bins used to compute the input data values. If no bounds variables are provided, ADI will assume each data point is a single, instantaneous value. If bounds variables are not present in the input data files but the user knows the bin widths and alignments of the input datastreams, those values can be specified via the width and alignment transformation parameters (note that these parameters are for the input datastreams, not coordinate system defaults).

    Bounds values must be the same units as their corresponding coordinate variable.  Exact values should
    be used instead of offsets.

* Note that if range transform parameters were set on any datastreams, the xarray data must have at least
 'range' amount of extra data retrieved before and after the day being processed (if it exists).

* Note that the input variables should have been renamed to use the name from the output dataset.  So
    we should rename input variables to match their output names BEFORE we pass them to this method.

* Note that variable dimensions should have been renamed to match their names in the output variable.

output_dataset : xarray.Dataset

An xarray dataset where the transformed data will go. The output dataset must contain:

1) One or more coordinate variables with the shape of the defined output.
2) One empty data variable with the same shape as its coordinate variables. The transformed values will be filled in by ADI.
3) One empty qc variable with the same shape as its coordinate variables. The qc flags and bit metadata attributes will be filled in by this function.
4) One or more bounds variables, one for each coordinate variable. The bounds variables will contain the front edge and back edge of each bin for each output coord data point. The bounds variable values can be computed from the coordinate data points and the width and alignment transformation parameters.

If the user does not specify bin width or alignment, then CENTER alignment is used by default and the
bin width is computed as the median of all the deltas between coordinate points.
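
The default described above can be sketched as follows. This is not the ADI implementation: `default_bounds` is a hypothetical helper, and the way the alignment fraction (0 for LEFT, 0.5 for CENTER, 1 for RIGHT, as in ADIAlignments.label_to_int) splits the bin around each point is an assumption.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple

# Same numeric values as ADIAlignments.label_to_int.
ALIGNMENT_FRACTION = {"LEFT": 0.0, "CENTER": 0.5, "RIGHT": 1.0}


def default_bounds(
    coords: Sequence[float],
    width: Optional[float] = None,
    alignment: str = "CENTER",
) -> List[Tuple[float, float]]:
    """Return (front_edge, back_edge) for each coordinate point.

    Assumption: the alignment fraction is the share of the bin that lies
    before the point, so LEFT puts the point on the front edge.
    """
    if width is None:
        # Median of the deltas between consecutive points; needs >= 2 points.
        deltas = [b - a for a, b in zip(coords, coords[1:])]
        width = median(deltas)
    frac = ALIGNMENT_FRACTION[alignment]
    return [(c - frac * width, c + (1.0 - frac) * width) for c in coords]


# Ten-minute time steps in seconds, no width or alignment given.
centered = default_bounds([0, 600, 1200, 1800])
# Explicit width with LEFT alignment.
left_aligned = default_bounds([0, 600, 1200], width=600, alignment="LEFT")
```

Bounds are returned as exact values in the units of the coordinate rather than as offsets, matching the requirement stated for input bounds variables.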

transform_parameters : Dict

A compressed set of transformation parameters that apply just to the specific data variable being
transformed.  The following is the minimal set used for our initial release (more ADI parameters can be
added later, as they will be supported by the back end).

transform_parameters = {

    # Transformation_type defines the algorithm to use. This parameter should be defined by the converter.
    # Valid values are:
    #     TRANS_AUTO (This will average if there are more input points than output points, otherwise, interpolate)
    #     TRANS_INTERPOLATE
    #     TRANS_SUBSAMPLE   (i.e., nearest neighbor)
    #     TRANS_BIN_AVERAGE
    #     TRANS_PASSTHROUGH (all values passed directly through from the input, no transform takes place)

    "transformation_type": {
        "time": "TRANS_AUTO"
    },

    # Range specifies how far the transformer should look for the next good value when performing
    # subsample or interpolate transforms.
    # Range is always in same units as coord (e.g., seconds in this case).
    #  * Note that if range transform parameters are set for any datastreams, the xarray data must have at
    #  least 'range' amount of extra data retrieved before and after the day being processed (if it exists)!
    "range": {
        "time": 1800
    },

    # Width applies only when using bin averaging, and it specifies the width of the bin that was used to
    # determine a specific point.  Width is always in same units as coord.
    # Only use width if user wants to make the bin width different than the delta between points (e.g., for
    # smoothing data)
    # NOTE: You do not have to set width if you provide bounds variables on the output dataset!
    "width": {
        "time": 600
    },

    # Alignment applies only when using bin averaging, and it specifies where in the bin the data point is
    # located.  Valid values are:
    #    LEFT
    #    CENTER
    #    RIGHT
    # Default is CENTER
    "alignment": {
        "time": "LEFT"
    }
}
Returns#
None. Transforms are done in place on output_dataset.
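
Because the parameter dictionary is plain data, it can be checked before it is handed to transform. The sketch below is one way to do that. `check_transform_parameters` is a hypothetical helper and not part of tsdat; it covers only the four parameters documented above, using the values listed for ADITransformationTypes and ADIAlignments. Treating range and width as positive numbers is an assumption.

```python
from typing import Any, Dict, List

VALID_TYPES = {
    "TRANS_AUTO",
    "TRANS_INTERPOLATE",
    "TRANS_SUBSAMPLE",
    "TRANS_BIN_AVERAGE",
    "TRANS_PASSTHROUGH",
}
VALID_ALIGNMENTS = {"LEFT", "CENTER", "RIGHT"}


def check_transform_parameters(params: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; empty means nothing was found."""
    problems: List[str] = []
    for dim, value in params.get("transformation_type", {}).items():
        if value not in VALID_TYPES:
            problems.append(f"transformation_type[{dim!r}]: unknown type {value!r}")
    for dim, value in params.get("alignment", {}).items():
        if value not in VALID_ALIGNMENTS:
            problems.append(f"alignment[{dim!r}]: unknown alignment {value!r}")
    for name in ("range", "width"):
        for dim, value in params.get(name, {}).items():
            # Assumption: both are positive numbers in the units of the coordinate.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or value <= 0:
                problems.append(f"{name}[{dim!r}]: expected a positive number, got {value!r}")
    return problems


transform_parameters = {
    "transformation_type": {"time": "TRANS_AUTO"},
    "range": {"time": 1800},
    "width": {"time": 600},
    "alignment": {"time": "LEFT"},
}
problems = check_transform_parameters(transform_parameters)
```

Returning a list of messages instead of raising on the first problem lets a caller report every bad setting at once.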
Source code in tsdat/transform/adi.py
def transform(
    self,
    variable_name: str,
    input_dataset: xr.Dataset,
    output_dataset: xr.Dataset,
    transform_parameters: Dict[str, Any],
):
    """-------------------------------------------------------------------------------------------------------------
    This function will use ADI libraries to transform one data variable to the shape defined for the output.
    This function will also fill out the output qc_ variable with the appropriate qc status from the transform
    algorithm.

    The output variable and output qc_ variables' data will be written in place.  Any variable attribute values
    that were added by adi will be copied back to the output variables.

    Caller does not need to call this transform method for any 1-d variables where TRANS_PASSTHROUGH would apply.
    However, if there are two dimensions (say time and height), and the user only wants to transform one dimension
    (for example, time data is mapped to the input, but height data needs to be averaged), then you would need to
    call this transform and use TRANS_PASSTHROUGH for all the mapped dimension and a different transformation
    algorithm for any non-mapped dimensions.

    If all dimensions are mapped and caller does not call this method, then all input values and input qc values
    must be copied over to the output by the caller.  Also in this case the caller should add a 'source' attribute
    on the variable to explain what datastream the value came from.

    Parameters
    ----------
    variable_name: str
        The name of the variable being transformed.  It should have the same name in both the input and output
        datasets.
    input_dataset : xarray.Dataset
        An xarray dataset containing:
        1) A data variable to be transformed
        2) Zero or one qc_variable that contains qc flags for the data variable.  The qc_ variable must have the
            exact same base name as the data variable.  For example, if the data variable is named 'temperature',
            then the qc variable must be named qc_temperature.
            The qc_variable must not have any qc attributes set.  They will all be set by the transformer to
            specific bits that cannot be changed.
        3) One or more coordinate variables matching the coordinates on the data variable
        4) Zero or more bounds variables, one for each coordinate variable.  Bounds variables specify the front
            edge and back edge of the bins used to compute the input data values.  If no bounds variables are
            provided, ADI will assume each data point is a single, instantaneous value.  If bounds variables
            are not present in the input data files, if the user knows what the bin widths and alignments were for
            the input datastreams, they can specify these values via the width and alignment transformation
            parameters (note that these parameters are for the input datastreams, not coordinate system defaults).

            Bounds values must be the same units as their corresponding coordinate variable.  Exact values should
            be used instead of offsets.

        * Note that if range transform parameters were set on any datastreams, the xarray data must have at least
         'range' amount of extra data retrieved before and after the day being processed (if it exists).

        * Note that the input variables should have been renamed to use the name from the output dataset.  So
            we should rename input variables to match their output names BEFORE we pass them to this method.

        * Note that variable dimensions should have been renamed to match their names in the output variable.

    output_dataset : xarray.Dataset
        An xarray dataset where the transformed data will go.  The output dataset must contain:
        1) One or more coordinate variables with the shape of the defined output
        2) One empty data variable with the same shape as its coordinate variables.  The transformed values will be
            filled in by ADI.
        3) One empty qc variable with the same shape as its coordinate variables.  The qc flags and bit metadata
            attributes will be filled in by this function
        4) One or more bounds variables, one for each coordinate variable.  The bounds variables will contain the
            front edge and back edge of each bin for each output coord data point.  The bounds variable values can
            be computed from the coordinate data points and the width and alignment transformation parameters.

            If the user does not specify bin width or alignment, then we use CENTER alignment by default and we
            compute the bin width as the median of all the deltas between coordinate points.

    transform_parameters : Dict

        A compressed set of transformation parameters that apply just to the specific data variable being
        transformed.  The following is the minimal set used for our initial release (more ADI parameters can be
        added later, as they will be supported by the back end).

        transform_parameters = {

            # Transformation_type defines the algorithm to use. This parameter should be defined by the converter.
            # Valid values are:
            #     TRANS_AUTO (This will average if there are more input points than output points, otherwise, interpolate)
            #     TRANS_INTERPOLATE
            #     TRANS_SUBSAMPLE   (i.e., nearest neighbor)
            #     TRANS_BIN_AVERAGE
            #     TRANS_PASSTHROUGH (all values passed directly through from the input, no transform takes place)

            "transformation_type": {
                "time": "TRANS_AUTO"
            },

            # Range specifies how far the transformer should look for the next good value when performing
            # subsample or interpolate transforms.
            # Range is always in same units as coord (e.g., seconds in this case).
            #  * Note that if range transform parameters are set for any datastreams, the xarray data must have at
            #  least 'range' amount of extra data retrieved before and after the day being processed (if it exists)!
            "range": {
                "time": 1800
            },

            # Width applies only when using bin averaging, and it specifies the width of the bin that was used to
            # determine a specific point.  Width is always in same units as coord.
            # Only use width if user wants to make the bin width different than the delta between points (e.g., for
            # smoothing data)
            # NOTE: You do not have to set width if you provide bounds variables on the output dataset!
            "width": {
                "time": 600
            },

            # Alignment applies only when using bin averaging, and it specifies where in the bin the data point is
            # located.  Valid values are:
            #    LEFT
            #    CENTER
            #    RIGHT
            # Default is CENTER
            "alignment": {
                "time": "LEFT"
            }
        }

    Returns
    -------
    Void - transforms are done in-place on output_dataset
    -------------------------------------------------------------------------------------------------------------
    """
    # ADI can only handle time if it is the very first dim
    input_variable_dims = input_dataset[variable_name].dims
    if "time" in input_variable_dims and "time" != input_variable_dims[0]:
        raise BadTransformationSettingsError(
            f"Expected 'time' as the first dimension for variable {variable_name},"
            f" got: {input_variable_dims}"
        )

    # First convert the input and output variables into ADI format
    retrieved_dataset: cds3.Group = self._create_adi_retrieved_dataset(
        variable_name, input_dataset
    )
    transformed_dataset: cds3.Group = self._create_adi_transformed_dataset(
        variable_name, output_dataset
    )
    qc_variable_name = f"qc_{variable_name}"

    # Now convert the transform parameters into ADI format
    adi_transform_parameters = TransformParameterConverter().convert_to_adi_format(
        transform_parameters
    )

    # Now apply the coordinate system transform parameters to the coordinate system group
    if COORDINATE_SYSTEM in adi_transform_parameters:
        params = adi_transform_parameters.get(COORDINATE_SYSTEM)
        cs_group = transformed_dataset.get_groups()[0]
        cds3.parse_transform_params(cs_group, params)

    # now apply the input datastream transform parameters to the obs group
    if INPUT_DATASTREAM in adi_transform_parameters:
        params = adi_transform_parameters.get(INPUT_DATASTREAM)
        obs_group = retrieved_dataset.get_groups()[0].get_groups()[0]
        cds3.parse_transform_params(obs_group, params)

    # Now run the transform
    adi_input_var = (
        retrieved_dataset.get_groups()[0].get_groups()[0].get_var(variable_name)
    )
    adi_input_qc_var = (
        retrieved_dataset.get_groups()[0].get_groups()[0].get_var(qc_variable_name)
    )
    adi_output_var = (
        transformed_dataset.get_groups()[0].get_groups()[0].get_var(variable_name)
    )
    adi_output_qc_var = (
        transformed_dataset.get_groups()[0]
        .get_groups()[0]
        .get_var(qc_variable_name)
    )

    trans.transform_driver(
        adi_input_var, adi_input_qc_var, adi_output_var, adi_output_qc_var
    )

    # Now copy any changed variable attributes back to the xr out variables.
    self._update_xr_attrs(variable_name, output_dataset, transformed_dataset)

    # Now free up memory from created adi data structures
    self._free_memory(retrieved_dataset)
    self._free_memory(transformed_dataset)

BadTransformationSettingsError #

Bases: ValueError

Exception class used when bad combinations of parameters are passed to transformation code.
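
Since the class derives from ValueError, callers can catch either type. The sketch below shows this with a stand-in copy of the exception and a hypothetical helper, `check_time_first`, that mirrors the dimension-order check at the top of transform while using a plain tuple of dimension names instead of an xarray variable.

```python
from typing import Tuple


class BadTransformationSettingsError(ValueError):
    """Stand-in copy of the exception documented above."""


def check_time_first(variable_name: str, dims: Tuple[str, ...]) -> None:
    # Same condition as in transform: 'time', if present, must be the first dimension.
    if "time" in dims and dims[0] != "time":
        raise BadTransformationSettingsError(
            f"Expected 'time' as the first dimension for variable {variable_name},"
            f" got: {dims}"
        )


# Passes silently: time is the first dimension.
check_time_first("temperature", ("time", "height"))

# Raises, and the error is also catchable as a plain ValueError.
try:
    check_time_first("temperature", ("height", "time"))
except ValueError as err:
    caught = err
```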

TransformParameterConverter #

Attributes#

transform_param_type class-attribute instance-attribute #
transform_param_type = {
    "transformation_type": COORDINATE_SYSTEM,
    "width": COORDINATE_SYSTEM,
    "alignment": COORDINATE_SYSTEM,
    "input_datastream_alignment": INPUT_DATASTREAM,
    "input_datastream_width": INPUT_DATASTREAM,
    "range": INPUT_DATASTREAM,
    "qc_mask": INPUT_DATASTREAM,
    "missing_value": INPUT_DATASTREAM,
    "qc_bad": INPUT_DATASTREAM,
    "std_ind_max": COORDINATE_SYSTEM,
    "std_bad_max": COORDINATE_SYSTEM,
    "goodfrac_ind_min": COORDINATE_SYSTEM,
    "goodfrac_bad_min": COORDINATE_SYSTEM,
}

Functions#

convert_to_adi_format #
convert_to_adi_format(
    transform_parameters: Dict[Any, Any]
) -> Dict[str, str]
Source code in tsdat/transform/adi.py
def convert_to_adi_format(
    self, transform_parameters: Dict[Any, Any]
) -> Dict[str, str]:
    transforms: Dict[Any, Any] = {}
    """ 
    Example of input dictionary structure:

    transform_parameters = {
            "transformation_type": {
                "time": "TRANS_AUTO"
            },
            "range": {
                "time": 1800
            },
            "alignment": {
                "time": "LEFT"
            }
    }
    """

    for parameter_name, transform_parameter in transform_parameters.items():
        parameter_type = self.transform_param_type.get(parameter_name)
        transform_parameter_name = self._get_adi_transform_parameter_name(
            parameter_name, parameter_type
        )

        # TODO: for now we are not supporting variable overrides or datastream-specific overrides.
        #   When we do, we will need to revise this syntax.  For now, the keys are the dimensions and the
        #   values are the defaults
        for dim_name, value in transform_parameter.items():
            if parameter_type == COORDINATE_SYSTEM:
                file_name = COORDINATE_SYSTEM
                self._write_transform_parameter_row(
                    transforms,
                    file_name,
                    None,
                    dim_name,
                    transform_parameter_name,
                    value,
                )
            else:  # INPUT_DATASTREAM
                file_name = INPUT_DATASTREAM
                self._write_transform_parameter_row(
                    transforms,
                    file_name,
                    None,
                    dim_name,
                    transform_parameter_name,
                    value,
                )

    return transforms
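
The routing step in the loop above can be illustrated without the private helpers, whose output format is not shown in this listing. The sketch below only groups parameter names by the target that transform_param_type assigns them. `group_by_target` is a hypothetical helper, `TRANSFORM_PARAM_TYPE` is a subset of the real table, and the nested dictionary it returns is not the ADI format produced by convert_to_adi_format.

```python
from typing import Any, Dict

COORDINATE_SYSTEM = "coord_sys"
INPUT_DATASTREAM = "input_ds"

# Subset of transform_param_type, copied from the listing above.
TRANSFORM_PARAM_TYPE = {
    "transformation_type": COORDINATE_SYSTEM,
    "width": COORDINATE_SYSTEM,
    "alignment": COORDINATE_SYSTEM,
    "range": INPUT_DATASTREAM,
}


def group_by_target(
    params: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Dict[str, Any]]]:
    """Group parameters as {target: {dimension: {parameter_name: value}}}."""
    grouped: Dict[str, Dict[str, Dict[str, Any]]] = {}
    for name, per_dim in params.items():
        # Mirrors the if/else above: anything not mapped to COORDINATE_SYSTEM,
        # including names missing from the table, falls through to INPUT_DATASTREAM.
        target = TRANSFORM_PARAM_TYPE.get(name)
        if target != COORDINATE_SYSTEM:
            target = INPUT_DATASTREAM
        for dim, value in per_dim.items():
            grouped.setdefault(target, {}).setdefault(dim, {})[name] = value
    return grouped


grouped = group_by_target(
    {
        "transformation_type": {"time": "TRANS_AUTO"},
        "range": {"time": 1800},
        "alignment": {"time": "LEFT"},
    }
)
```

One consequence of the fall-through is that a misspelled parameter name is treated as an input datastream parameter rather than reported, so validating names up front may be worthwhile.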