DataStore
Defined in dccd.storage
- class DataStore(data_path, exchange, pair, span, data_type='ohlc')
Bases: object
Unified read/write interface for a single (exchange, pair, data_type).
- Parameters:
- data_path : str
Root directory for all local data files (e.g. '/data/crypto').
- exchange : str
Exchange name, lowercase (e.g. 'binance').
- pair : str
Trading pair in 'CRYPTO/FIAT' format (e.g. 'BTC/USDT'). The slash is converted to a hyphen for the file-system path.
- span : int or None
Candle interval in seconds. Required for data_type='ohlc'; pass None for trades and orderbook.
- data_type : {'ohlc', 'trades', 'orderbook'}
Kind of data stored in this instance.
- Attributes:
- directory : pathlib.Path
Absolute directory for this store (created if absent).
- property directory
Absolute directory for this store (created if absent).
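The constructor arguments map onto a directory on disk. The sketch below illustrates only the documented slash-to-hyphen conversion and the rule that span is required for OHLC. The helper name store_directory and the nesting order (exchange / data_type / pair / span) are assumptions made for illustration, not the layout dccd actually uses.

```python
from pathlib import PurePosixPath


def store_directory(data_path, exchange, pair, span, data_type="ohlc"):
    """Hypothetical helper: build a per-store directory path.

    Only the '/' -> '-' conversion of the pair comes from the documentation;
    the directory nesting chosen here is an assumption.
    """
    if data_type == "ohlc" and span is None:
        # The docs state that span is required for OHLC data.
        raise ValueError("span is required when data_type='ohlc'")

    safe_pair = pair.replace("/", "-")  # 'BTC/USDT' -> 'BTC-USDT'
    parts = [exchange, data_type, safe_pair]
    if span is not None:
        parts.append(str(span))
    return PurePosixPath(data_path).joinpath(*parts)
```

PurePosixPath is used so the sketch never touches the file system; the real directory property also creates the directory if it is absent.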
- existing_periods()
List period labels for all available files.
- Returns:
- list of str
Sorted list of year strings (['2024', '2025']) for OHLC, or date strings (['2026-05-20', '2026-05-21']) for trades/orderbook.
- is_period_complete(year)
Return True if the parquet file for year contains all expected rows.
- Parameters:
- year : int
Calendar year to check (e.g. 2024).
- Returns:
- bool
False for non-OHLC stores, missing files, or when the row count is below the expected number of candles for that year.
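The "expected number of candles" can be pictured as the number of span-second intervals that fit in the calendar year. The sketch below assumes UTC years and one candle per interval; the function names are hypothetical and the library may compute the threshold differently.

```python
import calendar

SECONDS_PER_DAY = 86_400


def expected_candles(year, span):
    """Number of span-second candles in a calendar year (assumed UTC)."""
    days = 366 if calendar.isleap(year) else 365
    return days * SECONDS_PER_DAY // span


def looks_complete(row_count, year, span):
    """True when the stored row count reaches the expected candle count."""
    return row_count >= expected_candles(year, span)
```

With hourly candles (span=3600), a leap year such as 2024 yields 8784 expected rows and a regular year 8760.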
- last_timestamp()
Return the last TS value in the most recent period file.
- Returns:
- int or None
Unix timestamp of the last row, or None if no data exists.
- load(start=None, end=None)
Load and concatenate all period files covering [start, end].
- Parameters:
- start : int or None, optional
Inclusive lower bound (Unix timestamp). None means no lower bound.
- end : int or None, optional
Inclusive upper bound (Unix timestamp). None means no upper bound.
- Returns:
- pl.DataFrame
Concatenated data, sorted by 'TS', filtered to [start, end]. Empty DataFrame if no files are found.
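The concatenate, sort, and filter semantics of load can be sketched without Polars. Here plain lists of dicts stand in for the per-period DataFrames, and load_rows is a hypothetical name; it illustrates the inclusive bounds, not the library's implementation.

```python
def load_rows(period_files, start=None, end=None):
    """Concatenate rows from several period files, sort by 'TS', and keep
    rows with start <= TS <= end (None disables the respective bound)."""
    rows = [row for file_rows in period_files for row in file_rows]
    rows.sort(key=lambda row: row["TS"])
    return [
        row
        for row in rows
        if (start is None or row["TS"] >= start)
        and (end is None or row["TS"] <= end)
    ]
```

An empty list of period files returns an empty list, mirroring the empty DataFrame described above.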
- missing_intervals(start, end)
Return the list of (start, end) intervals within [start, end] that still need to be downloaded.
For OHLC stores the method inspects existing annual parquet files: complete past years are skipped entirely; incomplete or absent years yield an interval from the last saved timestamp (+ span) to the end of that year. The current calendar year always extends from the last saved row to end.
For trades / orderbook stores (no span) the method falls back to a simple resume: one interval from last_timestamp + span (or start if no data) to end.
- Parameters:
- start : int
Desired start timestamp (Unix seconds).
- end : int
Desired end timestamp (Unix seconds).
- Returns:
- list of (int, int)
Ordered list of (ivl_start, ivl_end) pairs to download. Empty list means all data is already present.
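The OHLC branch described above can be approximated as a loop over calendar years. This is a simplified sketch under several assumptions: years are UTC, the last candle of a year opens span seconds before the next year begins, and the store state is passed in as a set of complete years plus a mapping from year to last saved timestamp. The name ohlc_missing_intervals and both state arguments are hypothetical.

```python
from datetime import datetime, timezone


def year_start_ts(year):
    """Unix timestamp of January 1st, 00:00 UTC, of the given year."""
    return int(datetime(year, 1, 1, tzinfo=timezone.utc).timestamp())


def ohlc_missing_intervals(start, end, span, complete_years, last_ts_by_year):
    """Return (ivl_start, ivl_end) pairs still to download for an OHLC store."""
    first_year = datetime.fromtimestamp(start, tz=timezone.utc).year
    last_year = datetime.fromtimestamp(end, tz=timezone.utc).year

    intervals = []
    for year in range(first_year, last_year + 1):
        if year in complete_years:
            continue  # complete years are skipped entirely

        last_ts = last_ts_by_year.get(year)
        if last_ts is None:
            # Nothing saved for this year: begin at the requested start
            # or at the first second of the year, whichever is later.
            ivl_start = max(start, year_start_ts(year))
        else:
            # Resume one candle after the last saved row.
            ivl_start = last_ts + span

        # Stop at the last candle of the year, or at `end` if that is earlier.
        ivl_end = min(end, year_start_ts(year + 1) - span)
        if ivl_start <= ivl_end:
            intervals.append((ivl_start, ivl_end))
    return intervals
```

In this sketch the final year of the range is clamped to end, which plays the role of the "current calendar year" rule; the real method may treat the current year differently.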
- save(df)
Write df into the appropriate period file(s), merging with existing data.
OHLC data is grouped by year; trades and orderbook by calendar day. Rows are merged on 'TS' (dedup keep='last'), sorted ascending, and written as Parquet.
- Parameters:
- df : pl.DataFrame
Data to persist. Must contain a 'TS' column (Unix timestamps).
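The grouping and merge rules of save can be illustrated with plain Python. The sketch assumes period labels are derived from UTC timestamps and uses lists of dicts in place of Polars DataFrames; period_label and merge_rows are hypothetical names, and no Parquet file is written.

```python
from datetime import datetime, timezone


def period_label(ts, data_type):
    """Year label for OHLC rows, calendar-day label otherwise (assumed UTC)."""
    moment = datetime.fromtimestamp(ts, tz=timezone.utc)
    if data_type == "ohlc":
        return moment.strftime("%Y")
    return moment.strftime("%Y-%m-%d")


def merge_rows(existing, new):
    """Merge two row lists on 'TS': a new row replaces an existing row with
    the same timestamp (keep='last'), and the result is sorted ascending."""
    merged = {row["TS"]: row for row in existing}
    merged.update({row["TS"]: row for row in new})
    return [merged[ts] for ts in sorted(merged)]
```

Because later rows win, re-saving an overlapping range refreshes rows that were already on disk, for example a candle that was still open when it was first downloaded.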