Quickstart

See Installation for prerequisites and install instructions.


Historical API

Download OHLCV candles via REST, save to Parquet, and load as a DataFrame:

from dccd.histo_dl import FromBinance

obj = FromBinance('/data/crypto/', 'BTC', span=3600, fiat='USDT')

# Full date range
obj.import_data(start='2024-01-01 00:00:00', end='2024-12-31 00:00:00')
obj.save(form='parquet')

df = obj.get_data()
print(df.head())
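
As a rough sanity check on the download above, the number of candles to expect can be estimated from the date range and span. The sketch below uses only the standard library; expected_candles is a hypothetical helper, not part of dccd, and it assumes a gap-free series and counts whole intervals only (whether the end timestamp itself is included by the exchange or by dccd is not stated here).

```python
from datetime import datetime


def expected_candles(start: str, end: str, span: int) -> int:
    """Return the number of whole `span`-second intervals between two timestamps.

    Hypothetical helper for illustration; not part of dccd.
    """
    fmt = '%Y-%m-%d %H:%M:%S'
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds()) // span


# Same range and span as the FromBinance example: 365 days of hourly candles.
n = expected_candles('2024-01-01 00:00:00', '2024-12-31 00:00:00', 3600)
print(n)  # 8760
```

Comparing this estimate with the row count of the loaded DataFrame is a quick way to spot large gaps.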

All five exchange classes share the same interface: FromBinance, FromKraken, FromCoinbase, FromBybit, FromOKX.

See Historical Downloader (dccd.histo_dl) for the full API reference.


Incremental updates

Pass start='last' to resume from the last saved timestamp, with no duplicate rows:

# On first run downloads everything; on subsequent runs resumes from
# the most recent saved candle.
obj.import_data(start='last', end='now').save(form='parquet')

This is the recommended pattern for scheduled collection: run it daily or hourly via cron or the Daemon (dccd.daemon).
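
To illustrate what resuming without duplicate rows amounts to, here is a minimal sketch in plain Python. It is not dccd's implementation: append_new is a hypothetical function, candles are reduced to (timestamp, close) tuples, and the close values are placeholders rather than market data.

```python
def append_new(saved, fetched):
    """Append only the fetched rows that are newer than the last saved row.

    Both arguments are lists of (timestamp, close) tuples sorted by timestamp.
    Hypothetical helper for illustration; not part of dccd.
    """
    last_ts = saved[-1][0] if saved else None
    fresh = [row for row in fetched if last_ts is None or row[0] > last_ts]
    return saved + fresh


# Timestamps are Unix seconds, one hour apart; close values are placeholders.
saved = [(1704067200, 1.0), (1704070800, 2.0)]
fetched = [(1704070800, 2.0), (1704074400, 3.0)]  # first row overlaps `saved`

merged = append_new(saved, fetched)
print(len(merged))  # 3
```

With an empty saved list the function keeps everything, which mirrors the first-run behaviour described above.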


Trades and Order Book

The same fluent API works for trade history and order book snapshots:

from dccd.histo_dl import FromKraken, FromBinance

# Trade history — full historical pagination (Kraken, OKX)
obj = FromKraken('/data/crypto/', 'BTC', span=3600, fiat='USD')
obj.import_trades(start='2024-01-01 00:00:00', end='2024-12-31 00:00:00')
obj.save_trades(form='parquet')

# Order book snapshot (depth = number of price levels per side)
obj2 = FromBinance('/data/crypto/', 'BTC', span=3600, fiat='USDT')
obj2.import_orderbook(depth=20)
obj2.save_orderbook(form='parquet')

Note

Bybit and Coinbase return only recent trades (at most 1,000 and 100 respectively) because their public REST APIs do not support deep historical pagination. Use FromKraken or FromOKX for full trade history.


Output formats

The form parameter of .save(), .save_trades(), and .save_orderbook() controls the output format. get_data() returns a polars.DataFrame by default.

form=        Extension    When to use
'parquet'    .parquet     Recommended: columnar, compressed, fastest for Polars/Pandas loads.
'csv'        .csv         Universal export, readable without libraries, largest file size.
'xlsx'       .xlsx        Excel-compatible, good for small datasets.
'sqlite'     .db          Local SQL queries via SQLAlchemy.
'sql'        -            Remote database (PostgreSQL, MySQL, …) via SQLAlchemy connection string.

To get a pandas.DataFrame instead of the default Polars output:

df_polars = obj.get_data()                 # polars.DataFrame (default)
df_pandas = obj.get_data(format='pandas')  # pandas.DataFrame
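
The format table above can also be written as a small lookup, which is handy when a script needs to predict an output file name. This is only a sketch: FORM_EXTENSIONS and output_name are hypothetical names, the table lists no extension for 'sql' (treated here as "no file"), and how dccd actually names its files is not covered in this section.

```python
# Extensions copied from the format table; 'sql' targets a remote database,
# so this sketch assumes it has no file extension.
FORM_EXTENSIONS = {
    'parquet': '.parquet',
    'csv': '.csv',
    'xlsx': '.xlsx',
    'sqlite': '.db',
    'sql': None,
}


def output_name(stem: str, form: str) -> str:
    """Return `stem` with the extension for `form` (hypothetical helper)."""
    if form not in FORM_EXTENSIONS:
        raise ValueError(f'unknown form: {form!r}')
    ext = FORM_EXTENSIONS[form]
    return stem if ext is None else stem + ext


print(output_name('BTCUSDT_3600', 'sqlite'))  # BTCUSDT_3600.db
```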

Continuous (WebSocket) API

Stream order book and trades from Binance for one hour, saving a snapshot every 60 seconds:

from dccd.continuous_dl import get_data_binance

get_data_binance(
    path='/data/crypto/',
    pair='BTCUSDT',
    time_step=60,   # snapshot interval in seconds
    until=3600,     # total duration in seconds
    form='parquet',
)
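
The two timing parameters relate in a simple way: with time_step=60 and until=3600, a one-hour run covers 60 snapshot intervals. The sketch below only does that arithmetic; snapshot_offsets is a hypothetical helper, and the exact moments at which dccd writes each snapshot are an assumption, not something this section specifies.

```python
def snapshot_offsets(time_step: int, until: int) -> list:
    """Return the offsets (seconds from start) at which each interval ends.

    Hypothetical helper for illustration; not part of dccd.
    """
    return list(range(time_step, until + 1, time_step))


offsets = snapshot_offsets(60, 3600)
print(len(offsets), offsets[0], offsets[-1])  # 60 60 3600
```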

For fine-grained control use the downloader class directly:

from dccd.continuous_dl import DownloadBinanceData
from dccd.tools.io import IODataBase

dl = DownloadBinanceData(pair='BTCUSDT', time_step=60, until=3600)
dl.set_trades_saver(IODataBase('/data/crypto/trades', method='parquet'))
dl.set_book_saver(IODataBase('/data/crypto/book', method='parquet'))
dl.run()

See Continuous Downloader (dccd.continuous_dl) for all six exchange classes.


CLI Daemon

Create a minimal config file config.yml:

storage:
  local_path: /data/crypto

histo_jobs:
  - exchange: binance
    pairs: [BTC/USDT, ETH/USDT]
    span: 3600

stream_jobs:
  - exchange: binance
    pairs: [BTC/USDT]
    channels: [trades, book]
    time_step: 60
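
Note that the config writes pairs as BTC/USDT, whereas the historical classes take the crypto and fiat separately ('BTC', fiat='USDT') and the continuous downloader takes the joined symbol 'BTCUSDT'. The sketch below shows how the three notations relate; split_pair is a hypothetical helper, and how the daemon translates pairs internally is not documented in this section.

```python
def split_pair(pair: str) -> tuple:
    """Split 'BASE/QUOTE' into (base, quote); hypothetical helper, not part of dccd."""
    base, sep, quote = pair.partition('/')
    if not sep or not base or not quote:
        raise ValueError(f'expected BASE/QUOTE, got {pair!r}')
    return base, quote


base, quote = split_pair('BTC/USDT')
symbol = base + quote  # joined form, as passed to get_data_binance(pair=...)
print(base, quote, symbol)  # BTC USDT BTCUSDT
```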

Validate the config, backfill history, then start the daemon:

dccd validate --config config.yml
dccd backfill --config config.yml
dccd start   --config config.yml

Check the status of running jobs:

dccd status --config config.yml

See Daemon (dccd.daemon) for the full daemon reference including remote sync, health monitoring, and all CLI commands.