ParquetStore
Defined in dccd.storage.parquet
- class ParquetStore(data_path)
Bases: object
Read/write interface for a single DatasetId.
All timestamps (TS) are nanoseconds UTC (int64).
- Parameters:
- data_path (str or Path)
Root directory for all data files.
Examples
>>> import pathlib, tempfile
>>> from dccd.domain.dataset import DatasetId
>>> from dccd.domain.symbol import Symbol
>>> from dccd.domain.types import DataType
>>> from dccd.storage.parquet import ParquetStore
>>> store = ParquetStore('/tmp/data')
- directory(ds)
Return the directory for ds, creating it if needed.
- inventory()
Return list of dataset info dicts for all stored data.
Each entry includes min_ts/max_ts (nanoseconds UTC) and rows so the UI can display the actual data time range.
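As an illustration of how a caller might display the time range from an inventory entry, the sketch below formats min_ts, max_ts and rows into a readable string. The helper name format_range and the sample entry are hypothetical; only the three keys come from the description above, and real entries may carry additional fields.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def format_range(entry):
    """Render an inventory-style dict as 'start to end UTC (N rows)'.

    Hypothetical helper: assumes the entry has integer 'min_ts' and 'max_ts'
    in nanoseconds UTC and an integer 'rows' count.
    """
    def iso(ns):
        # Drop sub-second precision; datetime only resolves microseconds anyway.
        seconds = ns // NS_PER_SECOND
        return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(
            "%Y-%m-%d %H:%M:%S"
        )

    return f"{iso(entry['min_ts'])} to {iso(entry['max_ts'])} UTC ({entry['rows']} rows)"


# Made-up sample entry, for illustration only.
sample_entry = {
    "min_ts": 1_704_067_200_000_000_000,
    "max_ts": 1_704_153_600_000_000_000,
    "rows": 1440,
}
summary = format_range(sample_entry)
```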
- last_timestamp(ds)
Return last TS in ns, or None if no data.
- load(ds, start_ns=None, end_ns=None)
Load data for ds in the given nanosecond range.
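Because start_ns and end_ns are integer nanoseconds UTC, callers that work with datetime objects need a conversion step. The following is a minimal sketch of one way to do that; to_ns is a hypothetical helper and not part of ParquetStore.

```python
from datetime import datetime, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ns(dt):
    """Convert a timezone-aware datetime to integer nanoseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime; attach a timezone before converting")
    delta = dt - _EPOCH
    # Pure integer arithmetic avoids float rounding at nanosecond scale.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


start_ns = to_ns(datetime(2024, 1, 1, tzinfo=timezone.utc))
end_ns = to_ns(datetime(2024, 1, 2, tzinfo=timezone.utc))
```

The resulting values could then be passed as store.load(ds, start_ns=start_ns, end_ns=end_ns) for some DatasetId ds. Whether the range bounds are inclusive is not stated here, so check the source before relying on boundary rows.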
- missing_intervals(ds, start_ns, end_ns)
Return gaps as (start_ns, end_ns) pairs within [start_ns, end_ns].
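How the store decides which spans are covered is not described here. As a sketch of the general idea only, the function below computes the uncovered (start_ns, end_ns) pairs inside a requested range, assuming the covered spans are already known as a list of intervals. The name find_gaps and its covered argument are hypothetical.

```python
def find_gaps(covered, start_ns, end_ns):
    """Return uncovered (start, end) pairs within [start_ns, end_ns].

    covered: iterable of (start, end) integer pairs, in any order; overlapping
    segments are allowed. This is an illustrative sweep, not the library's
    implementation.
    """
    gaps = []
    cursor = start_ns  # everything before cursor is known to be covered
    for seg_start, seg_end in sorted(covered):
        if seg_end <= cursor:
            continue  # segment ends before the uncovered region begins
        if seg_start >= end_ns:
            break  # remaining segments lie beyond the requested range
        if seg_start > cursor:
            gaps.append((cursor, seg_start))
        cursor = max(cursor, seg_end)
    if cursor < end_ns:
        gaps.append((cursor, end_ns))
    return gaps


gaps = find_gaps([(10, 20), (30, 40)], 0, 50)
```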
- static read_provenance(file_path)
Return the Provenance stored in a Parquet file, if any.
- save(ds, records, provenance=None)
Write records to Parquet, merging with existing data.
- Parameters:
- ds (DatasetId)
- records (list)
OHLCBar, Trade, or OrderBookSnapshot objects.
- provenance (Provenance or None)
- Returns:
- int
Number of rows written.
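The merge rule used by save is not spelled out above. One common approach, shown here purely as an assumption, is to key rows by timestamp, let incoming rows replace existing ones with the same timestamp, and keep the result sorted. The function merge_rows and the "ts" key are hypothetical, and plain dicts stand in for the Parquet rows.

```python
def merge_rows(existing, incoming, key="ts"):
    """Merge two lists of row dicts by timestamp.

    Assumed policy (not confirmed by the documentation): an incoming row
    replaces an existing row with the same key, and output is sorted by key.
    """
    merged = {row[key]: row for row in existing}
    merged.update({row[key]: row for row in incoming})
    return [merged[k] for k in sorted(merged)]


existing = [{"ts": 1, "close": 10.0}, {"ts": 2, "close": 11.0}]
incoming = [{"ts": 2, "close": 11.5}, {"ts": 3, "close": 12.0}]
result = merge_rows(existing, incoming)
```

Under this policy the merged set holds three rows. Whether the int returned by save counts only the new records or the full merged file is not stated, so treat that as something to verify.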