Storage (dccd.storage)

Unified data storage for all dccd data types.

DataStore is the single point of entry for reading and writing crypto data regardless of exchange, data type (OHLC, trades, order book), or collection method (REST or WebSocket).

Directory layout

{data_path}/{exchange}/ohlc/{pair}/{span}/YYYY.parquet
{data_path}/{exchange}/trades/{pair}/YYYY-MM-DD.parquet
{data_path}/{exchange}/orderbook/{pair}/YYYY-MM-DD.parquet

  • exchange: lowercase ('binance', 'kraken'…)

  • pair: BTC-USDT (the slash is replaced by a hyphen because a slash is invalid in paths)

  • span: short label '1m', '1h', '1d'… (OHLC only)

  • granularity: annual files for OHLC, daily files for trades/orderbook
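As an illustration of the file granularity described above, the following sketch lists the file names that would cover a date range. It is only a minimal example built on the standard library; the helper names ohlc_partitions and daily_partitions are hypothetical and are not part of dccd.

```python
from datetime import date, timedelta


def ohlc_partitions(start: date, end: date) -> list[str]:
    """Annual file names (YYYY.parquet) covering [start, end], inclusive.

    Hypothetical helper, not part of dccd.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    return [f"{year}.parquet" for year in range(start.year, end.year + 1)]


def daily_partitions(start: date, end: date) -> list[str]:
    """Daily file names (YYYY-MM-DD.parquet) covering [start, end], inclusive.

    Hypothetical helper, not part of dccd.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    n_days = (end - start).days + 1
    return [
        f"{(start + timedelta(days=i)).isoformat()}.parquet"
        for i in range(n_days)
    ]
```

A four-day range that crosses a year boundary touches two OHLC files but four trades or orderbook files, which is why annual files suit the lower-volume OHLC data.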

DataStore is the unified read/write interface for all dccd data types (OHLC, trades, order book). It replaces the scattered save/load logic previously spread across exchange.py, backfill.py, and stream_manager.py.

Example paths:

~/data/crypto/binance/ohlc/BTC-USDT/1h/2026.parquet
~/data/crypto/kraken/ohlc/BTC-USD/1d/2025.parquet
~/data/crypto/binance/trades/BTC-USDT/2026-05-22.parquet
~/data/crypto/binance/orderbook/BTC-USDT/2026-05-22.parquet
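The example paths above can be reproduced with a few lines of pathlib. This is a sketch under stated assumptions, not the DataStore implementation: normalize_pair, ohlc_path, and daily_path are hypothetical names, and the lowercasing of the exchange name simply mirrors the layout rules given in this section.

```python
from datetime import date
from pathlib import Path


def normalize_pair(pair: str) -> str:
    """Replace the slash in a pair such as 'BTC/USDT' with a hyphen."""
    return pair.replace("/", "-")


def ohlc_path(data_path: str, exchange: str, pair: str, span: str, year: int) -> Path:
    """Annual OHLC file: {data_path}/{exchange}/ohlc/{pair}/{span}/YYYY.parquet."""
    return (
        Path(data_path)
        / exchange.lower()
        / "ohlc"
        / normalize_pair(pair)
        / span
        / f"{year:04d}.parquet"
    )


def daily_path(data_path: str, exchange: str, kind: str, pair: str, day: date) -> Path:
    """Daily trades or orderbook file: {data_path}/{exchange}/{kind}/{pair}/YYYY-MM-DD.parquet."""
    if kind not in ("trades", "orderbook"):
        raise ValueError(f"kind must be 'trades' or 'orderbook', got {kind!r}")
    return (
        Path(data_path)
        / exchange.lower()
        / kind
        / normalize_pair(pair)
        / f"{day.isoformat()}.parquet"
    )
```

Note that pathlib does not expand "~" on its own; call Path.expanduser() on the result before opening the file.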

Output formats

The form parameter on .save() / .save_trades() / .save_orderbook() controls where and how data is written. Parquet is recommended for production use; the other formats are available for interoperability.

form=      Extension  Notes
---------  ---------  -----
'parquet'  .parquet   Recommended. Columnar, compressed, native polars and pandas support. Annual files for OHLC, daily for trades / orderbook.
'csv'      .csv       Universal. Readable without libraries. Largest file size.
'xlsx'     .xlsx      Excel-compatible. Requires openpyxl. Best for small datasets.
'sqlite'   .db        Local SQL access via SQLAlchemy. Good for ad-hoc queries.
'sql'                 Remote database (PostgreSQL, MySQL, …). Pass a SQLAlchemy connection string via the path argument.
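The relationship between form and file extension listed above can be summarised in a small lookup. This is a hedged sketch only: FORM_EXTENSIONS and output_file are hypothetical names, and how dccd resolves file names internally is not documented here.

```python
from pathlib import Path

# File-based forms and their extensions, as listed in the table above.
# 'sql' is deliberately absent: it targets a remote database, not a file.
FORM_EXTENSIONS = {
    "parquet": ".parquet",
    "csv": ".csv",
    "xlsx": ".xlsx",
    "sqlite": ".db",
}


def output_file(stem: str, form: str) -> Path:
    """Return the file name for a file-based form (hypothetical helper)."""
    if form == "sql":
        raise ValueError("'sql' writes to a remote database and has no local file")
    try:
        extension = FORM_EXTENSIONS[form]
    except KeyError:
        raise ValueError(f"unknown form: {form!r}") from None
    # Concatenate rather than use Path.with_suffix, so stems that already
    # contain dots are left untouched.
    return Path(stem + extension)
```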

get_data() returns a polars.DataFrame by default (since v2.3):

df_polars = obj.get_data()                # polars.DataFrame
df_pandas = obj.get_data(format='pandas') # pandas.DataFrame

API

DataStore(data_path, exchange, pair, span[, ...])

Unified read/write interface for a single (exchange, pair, data_type).