tsdat.qc.handlers

Classes

FailPipeline

Raises a DataQualityError, halting the pipeline, if the data quality is sufficiently bad.

RecordQualityResults

Records the results of the quality check in an ancillary qc variable. Creates the ancillary qc variable if one does not already exist.

RemoveFailedValues

Replaces all failed values with the variable's _FillValue. If the variable does not have a _FillValue attribute then nan is used instead.

SortDatasetByCoordinate

Sorts the dataset by the failed variable, if there are any failures.

exception tsdat.qc.handlers.DataQualityError[source]

Bases: ValueError

Raised when the quality of a variable indicates a fatal error has occurred. Manual review of the data in question is often recommended in this case.


class tsdat.qc.handlers.FailPipeline[source]

Bases: tsdat.qc.base.QualityHandler

Raises a DataQualityError, halting the pipeline, if the data quality is sufficiently bad. This usually indicates that a manual inspection of the data is recommended.

Raises

DataQualityError – Raised when the ratio of failed values to values checked exceeds the configured tolerance.

class Parameters[source]

Bases: pydantic.BaseModel

context :str = ''[source]

Additional context set by users that ends up in the traceback message.

display_limit :int = 5[source]

tolerance :float = 0[source]

Tolerance for the number of allowable failures as the ratio of allowable failures to the total number of values checked. Defaults to 0, meaning that any failed checks will result in a DataQualityError being raised.
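The tolerance rule described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not tsdat's actual implementation: the helper name `exceeds_tolerance` is hypothetical, a plain list of booleans stands in for the failures array, and the strict `>` comparison is an assumption chosen so that the default of 0 flags any failure.

```python
from typing import Sequence


def exceeds_tolerance(failures: Sequence[bool], tolerance: float = 0.0) -> bool:
    """Return True when the share of failed values is above `tolerance`.

    Hypothetical helper for illustration; not part of tsdat's public API.
    """
    if len(failures) == 0:
        return False  # nothing was checked, so nothing can fail
    failure_ratio = sum(failures) / len(failures)  # True counts as 1
    return failure_ratio > tolerance


failures = [False, True, False, False]  # 1 of 4 values failed (ratio 0.25)
strict = exceeds_tolerance(failures)  # default tolerance of 0
lenient = exceeds_tolerance(failures, tolerance=0.5)
clean = exceeds_tolerance([False, False, False])
```

Under these assumptions, a single failure is enough to trip the default tolerance, while a tolerance of 0.5 lets the same data through.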

parameters :FailPipeline.Parameters[source]

Class Methods

run

Takes some action on data that has had quality issues identified.

Method Descriptions

run(self, dataset: xarray.Dataset, variable_name: str, failures: numpy.typing.NDArray[numpy.bool8])[source]

Takes some action on data that has had quality issues identified.

Handles the quality of a variable in the dataset and returns the dataset after any corrections have been applied.

Parameters
  • dataset (xr.Dataset) – The dataset containing the variable to handle.

  • variable_name (str) – The name of the variable whose quality should be handled.

  • failures (NDArray[np.bool8]) – The results of the QualityChecker for the provided variable, where True values indicate a quality problem.

Returns

xr.Dataset – The dataset after the QualityHandler has been run.

class tsdat.qc.handlers.RecordQualityResults[source]

Bases: tsdat.qc.base.QualityHandler

Records the results of the quality check in an ancillary qc variable. Creates the ancillary qc variable if one does not already exist.

class Parameters[source]

Bases: pydantic.BaseModel

assessment :Literal['bad', 'indeterminate'][source]

Indicates the quality of the data if the test results indicate a failure.

bit :int[source]

The bit number (e.g., 1, 2, 3, …) used to indicate if the check passed. The quality results are bitpacked into an integer array to save space. For example, if ‘check #0’ uses bit 0 and fails, and ‘check #1’ uses bit 1 and fails, then the resulting value on the qc variable would be 2^(0) + 2^(1) = 3. If a third check using bit 2 also failed, the value would be 2^(0) + 2^(1) + 2^(2) = 7.
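The bitpacking arithmetic in the example above can be reproduced with plain integers. The sketch below follows the zero-based convention of that worked example; the function names are hypothetical, and how tsdat maps the `bit` parameter onto the stored value is not shown here.

```python
from typing import Iterable, List


def pack_failed_bits(failed_bits: Iterable[int]) -> int:
    """Combine the bit positions of failed checks into one integer flag."""
    value = 0
    for bit in failed_bits:
        value |= 1 << bit  # sets the bit, i.e. contributes 2**bit once
    return value


def failed_bits_from(value: int) -> List[int]:
    """Recover which bit positions are set in a packed qc value."""
    return [bit for bit in range(value.bit_length()) if value & (1 << bit)]


two_checks = pack_failed_bits([0, 1])       # 2**0 + 2**1
three_checks = pack_failed_bits([0, 1, 2])  # 2**0 + 2**1 + 2**2
decoded = failed_bits_from(three_checks)    # bit positions that are set
```

Because each check owns its own bit, a single integer per data point can record any combination of failed checks, and the individual results can be read back with a bitwise AND.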

meaning :str[source]

A string that describes the test applied.

Class Methods

to_lower

Method Descriptions

to_lower(cls, assessment: Any) -> str[source]

parameters :RecordQualityResults.Parameters[source]

Class Methods

run

Takes some action on data that has had quality issues identified.

Method Descriptions

run(self, dataset: xarray.Dataset, variable_name: str, failures: numpy.typing.NDArray[numpy.bool8]) -> xarray.Dataset[source]

Takes some action on data that has had quality issues identified.

Handles the quality of a variable in the dataset and returns the dataset after any corrections have been applied.

Parameters
  • dataset (xr.Dataset) – The dataset containing the variable to handle.

  • variable_name (str) – The name of the variable whose quality should be handled.

  • failures (NDArray[np.bool8]) – The results of the QualityChecker for the provided variable, where True values indicate a quality problem.

Returns

xr.Dataset – The dataset after the QualityHandler has been run.

class tsdat.qc.handlers.RemoveFailedValues[source]

Bases: tsdat.qc.base.QualityHandler

Replaces all failed values with the variable’s _FillValue. If the variable does not have a _FillValue attribute then nan is used instead.
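The replacement rule can be illustrated without any array libraries. In this sketch, plain lists stand in for the variable's data and the failures mask, a dict stands in for the variable's attributes, and the helper name `replace_failed_values` is hypothetical; the real handler operates on an xarray.Dataset.

```python
import math
from typing import Dict, List, Sequence


def replace_failed_values(
    values: Sequence[float],
    failures: Sequence[bool],
    attrs: Dict[str, float],
) -> List[float]:
    """Swap each failed value for the fill value, leaving the rest untouched."""
    # Fall back to nan when no _FillValue attribute is present.
    fill_value = attrs.get("_FillValue", math.nan)
    return [fill_value if failed else value for value, failed in zip(values, failures)]


values = [1.0, 2.0, 3.0]
failures = [False, True, False]  # True marks a quality problem

with_fill = replace_failed_values(values, failures, {"_FillValue": -9999.0})
without_fill = replace_failed_values(values, failures, {})
```

Here `with_fill` carries -9999.0 in the failed position, while `without_fill` carries nan there because no _FillValue was supplied.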

Class Methods

run

Takes some action on data that has had quality issues identified.

Method Descriptions

run(self, dataset: xarray.Dataset, variable_name: str, failures: numpy.typing.NDArray[numpy.bool8]) -> xarray.Dataset[source]

Takes some action on data that has had quality issues identified.

Handles the quality of a variable in the dataset and returns the dataset after any corrections have been applied.

Parameters
  • dataset (xr.Dataset) – The dataset containing the variable to handle.

  • variable_name (str) – The name of the variable whose quality should be handled.

  • failures (NDArray[np.bool8]) – The results of the QualityChecker for the provided variable, where True values indicate a quality problem.

Returns

xr.Dataset – The dataset after the QualityHandler has been run.

class tsdat.qc.handlers.SortDatasetByCoordinate[source]

Bases: tsdat.qc.base.QualityHandler

Sorts the dataset by the failed variable, if there are any failures.
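A rough sketch of this behavior, using a dict of equal-length lists as a stand-in for a one-dimensional dataset. The helper name `sort_by_coordinate` is hypothetical and the single shared dimension is an assumption. xarray's `Dataset.sortby` performs a comparable reordering on real datasets, though whether the handler calls it is not stated here.

```python
from typing import Dict, List, Sequence


def sort_by_coordinate(
    data: Dict[str, List[float]],
    coordinate_name: str,
    failures: Sequence[bool],
    ascending: bool = True,
) -> Dict[str, List[float]]:
    """Reorder every variable by the coordinate, but only if a check failed."""
    if not any(failures):
        return data  # no failures reported, so the data is left as-is
    coordinate = data[coordinate_name]
    # Indices that would put the coordinate in the requested order.
    order = sorted(
        range(len(coordinate)),
        key=coordinate.__getitem__,
        reverse=not ascending,
    )
    return {name: [values[i] for i in order] for name, values in data.items()}


data = {"time": [3, 1, 2], "temp": [30.0, 10.0, 20.0]}
failures = [False, True, False]  # e.g. a monotonicity check flagged index 1

sorted_up = sort_by_coordinate(data, "time", failures)
sorted_down = sort_by_coordinate(data, "time", failures, ascending=False)
untouched = sort_by_coordinate(data, "time", [False, False, False])
```

Every variable is reordered with the same index permutation, so each value stays paired with its original coordinate.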

class Parameters[source]

Bases: pydantic.BaseModel

ascending :bool = True[source]

Whether to sort the dataset in ascending order. Defaults to True.

correction :str = 'Coordinate data was sorted in order to ensure monotonicity.'[source]

parameters :SortDatasetByCoordinate.Parameters[source]

Class Methods

run

Takes some action on data that has had quality issues identified.

Method Descriptions

run(self, dataset: xarray.Dataset, variable_name: str, failures: numpy.typing.NDArray[numpy.bool8]) -> xarray.Dataset[source]

Takes some action on data that has had quality issues identified.

Handles the quality of a variable in the dataset and returns the dataset after any corrections have been applied.

Parameters
  • dataset (xr.Dataset) – The dataset containing the variable to handle.

  • variable_name (str) – The name of the variable whose quality should be handled.

  • failures (NDArray[np.bool8]) – The results of the QualityChecker for the provided variable, where True values indicate a quality problem.

Returns

xr.Dataset – The dataset after the QualityHandler has been run.