file_system
Classes:
Name | Description |
---|---|
FileSystem | Handles data storage and retrieval for file-based data formats. |
Classes#
FileSystem #
Bases: Storage
Handles data storage and retrieval for file-based data formats.
Formats that write to directories (such as zarr) are not supported by the FileSystem storage class.
Classes:
Name | Description |
---|---|
Parameters | |
Methods:
Name | Description |
---|---|
fetch_data | Fetches data for a given datastream between a specified time range. |
last_modified | Find the last modified time for any data in that datastream. |
modified_since | Find the list of data dates that have been modified since the passed last modified date. |
save_ancillary_file | Saves an ancillary filepath to the datastream's ancillary storage area. |
save_data | Saves a dataset to the storage area. |
Attributes:
Name | Type | Description |
---|---|---|
data_filepath_template | Template | |
handler | FileHandler | The FileHandler class that should be used to handle data I/O within the storage API. |
parameters | Parameters | File-system specific parameters, such as the root path to where files should be saved, or additional keyword arguments to specific functions used by the storage API. |
Attributes#
handler class-attribute instance-attribute #
The FileHandler class that should be used to handle data I/O within the storage API.
parameters class-attribute instance-attribute #
File-system specific parameters, such as the root path to where files should be saved, or additional keyword arguments to specific functions used by the storage API. See the FileSystem.Parameters class for more details.
Classes#
Parameters #
Bases: Parameters
Attributes:
Name | Type | Description |
---|---|---|
data_filename_template | str | Template string to use for data filenames. |
data_storage_path | Path | The directory structure under storage_root where data files are saved. |
Attributes#
data_filename_template class-attribute instance-attribute #
Template string to use for data filenames.
Allows substitution of the following parameters using curly braces '{}':
- ext: the file extension from the storage data handler
- datastream from the dataset's global attributes
- location_id from the dataset's global attributes
- data_level from the dataset's global attributes
- date_time: the first timestamp in the file formatted as "YYYYMMDD.hhmmss"
- Any other global attribute that has a string or integer data type.
At a minimum the template must include {date_time}.
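The substitution described above can be illustrated with Python's built-in `str.format`. This is a minimal sketch only: tsdat resolves templates with its own machinery, and the helper `build_filename`, the template string, and the attribute values below are assumptions made for the example.

```python
from datetime import datetime


def build_filename(template: str, attrs: dict, first_timestamp: datetime, ext: str) -> str:
    """Fill a filename template from global attributes (hypothetical helper, not tsdat API)."""
    if "{date_time}" not in template:
        raise ValueError("template must include {date_time}")
    # Only string or integer global attributes are eligible for substitution.
    fields = {k: v for k, v in attrs.items() if isinstance(v, (str, int))}
    # date_time is the first timestamp, formatted as YYYYMMDD.hhmmss.
    fields["date_time"] = first_timestamp.strftime("%Y%m%d.%H%M%S")
    fields["ext"] = ext
    return template.format(**fields)


# Example values are invented for illustration.
attrs = {"datastream": "sgp.example.b1", "location_id": "sgp", "data_level": "b1"}
filename = build_filename(
    "{datastream}.{date_time}.{ext}", attrs, datetime(2022, 4, 5, 12, 30, 0), "nc"
)
print(filename)  # sgp.example.b1.20220405.123000.nc
```

A template without `{date_time}` is rejected up front, mirroring the minimum requirement stated above.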
data_storage_path class-attribute instance-attribute #
The directory structure under storage_root where data files are saved.
Allows substitution of the following parameters using curly braces '{}':
- storage_root: the value from the storage_root parameter.
- datastream: the datastream as defined in the dataset config file.
- location_id: the location_id as defined in the dataset config file.
- data_level: the data_level as defined in the dataset config file.
- year: the year of the first timestamp in the file.
- month: the month of the first timestamp in the file.
- day: the day of the first timestamp in the file.
- extension: the file extension used by the output file writer.
Defaults to data/{location_id}/{datastream}.
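To make the path substitution concrete, the sketch below resolves the default template into a directory under a storage root. It is not tsdat's implementation: the helper `resolve_storage_path`, the zero-padded month and day, and all example values are assumptions, and `PurePosixPath` is used only so the printed result is the same on every platform.

```python
from datetime import datetime
from pathlib import PurePosixPath


def resolve_storage_path(
    template: str,
    storage_root: str,
    attrs: dict,
    first_timestamp: datetime,
    extension: str,
) -> PurePosixPath:
    """Resolve a storage path template under storage_root (hypothetical helper)."""
    fields = {
        "storage_root": storage_root,
        "datastream": attrs["datastream"],
        "location_id": attrs["location_id"],
        "data_level": attrs["data_level"],
        # Zero-padding of year/month/day is an assumption of this sketch.
        "year": first_timestamp.strftime("%Y"),
        "month": first_timestamp.strftime("%m"),
        "day": first_timestamp.strftime("%d"),
        "extension": extension,
    }
    # The resolved directory structure is placed under the storage root.
    return PurePosixPath(storage_root) / template.format(**fields)


# Example values are invented for illustration.
attrs = {"datastream": "sgp.example.b1", "location_id": "sgp", "data_level": "b1"}
resolved = resolve_storage_path(
    "data/{location_id}/{datastream}", "storage/root", attrs, datetime(2022, 4, 5), "nc"
)
print(resolved)  # storage/root/data/sgp/sgp.example.b1
```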
Functions#
fetch_data #
fetch_data(
start: datetime,
end: datetime,
datastream: str,
metadata_kwargs: Union[Dict[str, str], None] = None,
**kwargs: Any
) -> xr.Dataset
Fetches data for a given datastream between a specified time range.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | datetime | The minimum datetime to fetch. | required |
end | datetime | The maximum datetime to fetch. | required |
datastream | str | The datastream id to search for. | required |
metadata_kwargs | dict[str, str] | Metadata substitutions to help resolve the data storage path. This is only required if the template data storage path includes any properties other than datastream or fields contained in the datastream. Defaults to None. | None |
Returns:
Type | Description |
---|---|
Dataset | A dataset containing all the data in the storage area that spans the specified datetimes. |
Source code in tsdat/io/storage/file_system.py
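The time-range selection can be pictured as a filter over stored filenames. The following is a simplified sketch, not the logic in tsdat: the helper `files_in_range`, the filename layout `<datastream>.<YYYYMMDD>.<hhmmss>.<ext>`, and the half-open interval `[start, end)` are all assumptions. A real fetch would also need to open the selected files and handle files that begin before `start` but contain data inside the range.

```python
from datetime import datetime
from typing import List


def files_in_range(filenames: List[str], start: datetime, end: datetime) -> List[str]:
    """Return filenames whose embedded timestamp lies in [start, end) (hypothetical helper)."""
    selected = []
    for name in filenames:
        # Assumes names look like "<datastream>.<YYYYMMDD>.<hhmmss>.<ext>".
        parts = name.split(".")
        stamp = datetime.strptime(f"{parts[-3]}.{parts[-2]}", "%Y%m%d.%H%M%S")
        if start <= stamp < end:
            selected.append(name)
    return sorted(selected)


# Example filenames are invented for illustration.
names = [
    "sgp.example.b1.20220404.000000.nc",
    "sgp.example.b1.20220405.000000.nc",
    "sgp.example.b1.20220406.000000.nc",
]
picked = files_in_range(names, datetime(2022, 4, 5), datetime(2022, 4, 6))
print(picked)  # ['sgp.example.b1.20220405.000000.nc']
```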
last_modified #
Find the last modified time for any data in that datastream.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datastream | str | The datastream. | required |
Returns:
Name | Type | Description |
---|---|---|
datetime | Union[datetime, None] | The datetime of the last modification. |
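One way to picture this lookup is to take the newest modification time among the files stored for a datastream. The sketch below uses only the standard library. The helper `last_modified_in`, the use of UTC-aware datetimes, and the placeholder files are assumptions, not tsdat's implementation.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def last_modified_in(directory: Path) -> Optional[datetime]:
    """Return the newest modification time among files under directory, or None."""
    mtimes = [p.stat().st_mtime for p in directory.rglob("*") if p.is_file()]
    if not mtimes:
        return None  # nothing stored yet for this datastream
    return datetime.fromtimestamp(max(mtimes), tz=timezone.utc)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    empty_result = last_modified_in(root)  # no files yet, so None
    for name, mtime in [("a.nc", 1_650_000_000), ("b.nc", 1_650_003_600)]:
        path = root / name
        path.write_text("placeholder")
        os.utime(path, (mtime, mtime))  # set access and modification times
    newest = last_modified_in(root)

print(empty_result, newest)
```

Returning `None` for an empty storage area matches the `Union[datetime, None]` return type listed above.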
modified_since #
Find the list of data dates that have been modified since the passed last modified date.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
datastream | str | The datastream to check for modified data. | required |
last_modified | datetime | Should be equivalent to run date (the last time data were changed) | required |
Returns:
Type | Description |
---|---|
List[datetime] | The data dates of files that were changed since the last modified date |
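The distinction between a file's modification time and its data date can be sketched as follows. This is an illustration under stated assumptions, not tsdat's code: the helper `data_dates_modified_since`, the filename layout `<datastream>.<YYYYMMDD>.<hhmmss>.<ext>`, and the UTC comparison are all hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def data_dates_modified_since(directory: Path, last_modified: datetime) -> List[datetime]:
    """Return data dates of files changed after last_modified (hypothetical helper)."""
    dates = set()
    for path in directory.glob("*"):
        if not path.is_file():
            continue
        changed = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if changed > last_modified:
            # The data date comes from the filename, not from the file's mtime.
            day = path.name.split(".")[-3]
            dates.add(datetime.strptime(day, "%Y%m%d"))
    return sorted(dates)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    files = {
        "sgp.example.b1.20220404.000000.nc": 1_650_000_000,  # changed before the cutoff
        "sgp.example.b1.20220405.000000.nc": 1_650_100_000,  # changed after the cutoff
    }
    for name, mtime in files.items():
        path = root / name
        path.write_text("placeholder")
        os.utime(path, (mtime, mtime))
    cutoff = datetime.fromtimestamp(1_650_050_000, tz=timezone.utc)
    changed_dates = data_dates_modified_since(root, cutoff)

print(changed_dates)  # [datetime.datetime(2022, 4, 5, 0, 0)]
```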
save_ancillary_file #
Saves an ancillary filepath to the datastream's ancillary storage area.
NOTE: In most cases this function should not be used directly. Instead, prefer using the self.uploadable_dir(*args, **kwargs) method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | Path | The path to the ancillary file. This is expected to have a standardized filename and should be saved under the ancillary storage path. | required |
target_path | str | The path to where the data should be saved. | None |
save_data #
Saves a dataset to the storage area.
At a minimum, the dataset must have a 'datastream' global attribute and must have a 'time' variable with a np.datetime64-like data type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The dataset to save. | required |