Quickstart
See Installation for prerequisites and install instructions.
Historical API
Download OHLCV candles via REST, save to Parquet, and load as a DataFrame:
from dccd.histo_dl import FromBinance
obj = FromBinance('/data/crypto/', 'BTC', span=3600, fiat='USDT')
# Full date range
obj.import_data(start='2024-01-01 00:00:00', end='2024-12-31 00:00:00')
obj.save(form='parquet')
df = obj.get_data()
print(df.head())
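
As a quick sanity check on a download like the one above, the number of candles to expect follows from the date range and `span`. The sketch below is plain Python and independent of dccd; the helper name `expected_candles` is hypothetical, and it counts whole intervals only, since whether an exchange also returns a candle stamped exactly at `end` is not something the example assumes.

```python
from datetime import datetime


def expected_candles(start: str, end: str, span: int) -> int:
    """Number of whole `span`-second intervals between start and end."""
    fmt = '%Y-%m-%d %H:%M:%S'
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds()) // span


# Same range and span as the FromBinance example (hourly candles).
n = expected_candles('2024-01-01 00:00:00', '2024-12-31 00:00:00', 3600)
print(n)  # 8760 hourly intervals (365 days x 24 hours)
```

Comparing this figure with the row count of the returned DataFrame is a cheap way to spot gaps, bearing in mind that exchange downtime can legitimately leave some candles missing.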
All five exchange classes share the same interface: FromBinance, FromKraken, FromCoinbase, FromBybit, and FromOKX.
See Historical Downloader (dccd.histo_dl) for the full API reference.
Incremental updates
Pass start='last' to resume from the last saved timestamp, with no
duplicate rows:
# On first run downloads everything; on subsequent runs resumes from
# the most recent saved candle.
obj.import_data(start='last', end='now').save(form='parquet')
This is the recommended pattern for scheduled collection: run it daily or hourly via cron or the Daemon (dccd.daemon).
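
The idea behind resuming without duplicates can be illustrated with a small standalone sketch. This is not dccd's implementation; the function `merge_incremental` and the candle dictionaries with a `ts` key are assumptions made for illustration. It keeps only fetched candles that are strictly newer than the last saved one.

```python
def merge_incremental(saved, fetched):
    """Append only candles newer than the last saved timestamp."""
    last_ts = saved[-1]['ts'] if saved else None
    fresh = [c for c in fetched if last_ts is None or c['ts'] > last_ts]
    return saved + fresh


# Three hourly candles already on disk (timestamps in seconds).
saved = [{'ts': t, 'close': 100.0} for t in (0, 3600, 7200)]
# A new download that overlaps the last saved candle.
fetched = [{'ts': t, 'close': 101.0} for t in (7200, 10800, 14400)]

merged = merge_incremental(saved, fetched)
print([c['ts'] for c in merged])  # [0, 3600, 7200, 10800, 14400]
```

With an empty `saved` list the same function keeps everything, which mirrors the "first run downloads everything" behaviour described above.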
Trades and Order Book
The same fluent API works for trade history and order book snapshots:
from dccd.histo_dl import FromKraken, FromBinance
# Trade history — full historical pagination (Kraken, OKX)
obj = FromKraken('/data/crypto/', 'BTC', span=3600, fiat='USD')
obj.import_trades(start='2024-01-01 00:00:00', end='2024-12-31 00:00:00')
obj.save_trades(form='parquet')
# Order book snapshot (depth = number of price levels per side)
obj2 = FromBinance('/data/crypto/', 'BTC', span=3600, fiat='USDT')
obj2.import_orderbook(depth=20)
obj2.save_orderbook(form='parquet')
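
To make the meaning of `depth` concrete, the sketch below trims a toy order book to the best N price levels per side. It is a standalone illustration, not dccd code: the function `top_levels` and the `(price, quantity)` tuple layout are assumptions.

```python
def top_levels(bids, asks, depth):
    """Keep the `depth` best price levels on each side of the book.

    bids and asks are lists of (price, quantity) tuples in any order.
    """
    # Best bids are the highest prices, best asks the lowest.
    best_bids = sorted(bids, key=lambda lvl: lvl[0], reverse=True)[:depth]
    best_asks = sorted(asks, key=lambda lvl: lvl[0])[:depth]
    return best_bids, best_asks


bids = [(99.0, 1.5), (100.0, 0.7), (98.5, 3.0)]
asks = [(101.5, 2.0), (100.5, 0.4), (102.0, 1.1)]
best_bids, best_asks = top_levels(bids, asks, depth=2)
print(best_bids)  # [(100.0, 0.7), (99.0, 1.5)]
print(best_asks)  # [(100.5, 0.4), (101.5, 2.0)]
```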
Note
Bybit and Coinbase return only recent trades (at most 1,000 and 100 respectively)
because their public REST APIs do not support deep historical pagination.
Use FromKraken or FromOKX for full trade history.
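
The difference between a recent-only endpoint and one that supports historical pagination can be sketched with a toy in-memory "exchange". Everything here is hypothetical (the `ALL_TRADES` data, the page limit, and the `fetch_recent` / `fetch_since` functions); it only illustrates why a cursor is needed to walk back through the full history.

```python
# Hypothetical exchange holding 2,500 trades with ids 0..2499.
ALL_TRADES = [{'id': i, 'price': 100.0 + i * 0.01} for i in range(2500)]
PAGE_LIMIT = 1000  # assumed maximum rows per request


def fetch_recent(limit=PAGE_LIMIT):
    """Recent-only endpoint: returns just the newest `limit` trades."""
    return ALL_TRADES[-limit:]


def fetch_since(since_id, limit=PAGE_LIMIT):
    """Cursor endpoint: returns up to `limit` trades with id >= since_id."""
    return ALL_TRADES[since_id:since_id + limit]


def fetch_all():
    """Advance the cursor page by page until an empty page comes back."""
    trades, cursor = [], 0
    while True:
        page = fetch_since(cursor)
        if not page:
            break
        trades.extend(page)
        cursor = page[-1]['id'] + 1  # resume right after the last trade seen
    return trades


print(len(fetch_recent()))  # 1000
print(len(fetch_all()))     # 2500
```

Without a cursor parameter, older trades simply cannot be requested, no matter how often the recent-only endpoint is polled.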
Output formats
The form parameter of .save() / .save_trades() controls the file
format. get_data() returns a polars.DataFrame by default.
| | Extension | When to use |
|---|---|---|
| | | Recommended: columnar, compressed, fastest for Polars/Pandas loads. |
| | | Universal export, readable without libraries, largest file size. |
| | | Excel-compatible, good for small datasets. |
| | | Local SQL queries via SQLAlchemy. |
| | (none) | Remote database (PostgreSQL, MySQL, …) via SQLAlchemy connection string. |
To get a pandas.DataFrame instead of the default Polars output:
df_polars = obj.get_data() # polars.DataFrame (default)
df_pandas = obj.get_data(format='pandas') # pandas.DataFrame
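
For the local SQL option listed in the table, the kind of query this enables can be sketched with Python's built-in `sqlite3` module (used here instead of SQLAlchemy to keep the example dependency-free). The table name `ohlcv` and its column names are assumptions for illustration, not the schema dccd writes.

```python
import sqlite3

# In-memory database standing in for a SQLite file produced by a save step.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE ohlcv (ts INTEGER PRIMARY KEY, open REAL, high REAL, '
    'low REAL, close REAL, volume REAL)'
)
rows = [
    (0,    100.0, 105.0,  99.0, 104.0, 12.0),
    (3600, 104.0, 108.0, 103.0, 107.0,  8.5),
    (7200, 107.0, 107.5, 101.0, 102.0, 15.2),
]
conn.executemany('INSERT INTO ohlcv VALUES (?, ?, ?, ?, ?, ?)', rows)

# Highest high and total traded volume over the stored range.
max_high, total_volume = conn.execute(
    'SELECT MAX(high), SUM(volume) FROM ohlcv'
).fetchone()
print(max_high, total_volume)
conn.close()
```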
Continuous (WebSocket) API
Stream order book and trades from Binance for one hour, saving a snapshot every 60 seconds:
from dccd.continuous_dl import get_data_binance
get_data_binance(
    path='/data/crypto/',
    pair='BTCUSDT',
    time_step=60,   # snapshot interval in seconds
    until=3600,     # total duration in seconds
    form='parquet',
)
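
The relationship between `time_step` and `until` can be worked through in a few lines. The sketch below is standalone and does not use dccd; the helper names are hypothetical, and it assumes a snapshot is taken at the end of each interval, which may differ from the library's actual timing.

```python
from collections import defaultdict


def snapshot_offsets(time_step, until):
    """Offsets in seconds from the start at which a snapshot would be taken."""
    return list(range(time_step, until + 1, time_step))


def bucket_by_interval(events, time_step):
    """Group (offset_seconds, payload) events by interval index."""
    buckets = defaultdict(list)
    for offset, payload in events:
        buckets[offset // time_step].append(payload)
    return dict(buckets)


offsets = snapshot_offsets(time_step=60, until=3600)
print(len(offsets), offsets[0], offsets[-1])  # 60 60 3600

# Four toy events, stamped in seconds since the stream started.
events = [(5, 'a'), (59, 'b'), (60, 'c'), (125, 'd')]
print(bucket_by_interval(events, 60))  # {0: ['a', 'b'], 1: ['c'], 2: ['d']}
```

With the values from the example above, a one-hour run at a 60-second step yields 60 snapshots.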
For fine-grained control use the downloader class directly:
from dccd.continuous_dl import DownloadBinanceData
from dccd.tools.io import IODataBase
dl = DownloadBinanceData(pair='BTCUSDT', time_step=60, until=3600)
dl.set_trades_saver(IODataBase('/data/crypto/trades', method='parquet'))
dl.set_book_saver(IODataBase('/data/crypto/book', method='parquet'))
dl.run()
See Continuous Downloader (dccd.continuous_dl) for all six exchange classes.
CLI Daemon
Create a minimal config file config.yml:
storage:
  local_path: /data/crypto
histo_jobs:
  - exchange: binance
    pairs: [BTC/USDT, ETH/USDT]
    span: 3600
stream_jobs:
  - exchange: binance
    pairs: [BTC/USDT]
    channels: [trades, book]
    time_step: 60
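
The intended structure of this config can be made explicit with a small Python sketch. The dictionary below mirrors the nesting a YAML loader would be expected to produce, and `check_config` is a hypothetical helper that checks a few obvious properties. It is not what `dccd validate` does internally, and the rules it applies are assumptions.

```python
# The config written out as a nested Python dict.
config = {
    'storage': {'local_path': '/data/crypto'},
    'histo_jobs': [
        {'exchange': 'binance', 'pairs': ['BTC/USDT', 'ETH/USDT'], 'span': 3600},
    ],
    'stream_jobs': [
        {'exchange': 'binance', 'pairs': ['BTC/USDT'],
         'channels': ['trades', 'book'], 'time_step': 60},
    ],
}


def check_config(cfg):
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    if 'local_path' not in cfg.get('storage', {}):
        problems.append('storage.local_path is missing')
    sections = (('histo_jobs', 'span'), ('stream_jobs', 'time_step'))
    for section, interval_key in sections:
        for i, job in enumerate(cfg.get(section, [])):
            where = f'{section}[{i}]'
            if 'exchange' not in job:
                problems.append(f'{where}: exchange is missing')
            value = job.get(interval_key)
            if not isinstance(value, int) or value <= 0:
                problems.append(f'{where}: {interval_key} must be a positive integer')
            for pair in job.get('pairs', []):
                if pair.count('/') != 1:
                    problems.append(f'{where}: pair {pair!r} is not BASE/QUOTE')
    return problems


print(check_config(config))  # []
```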
Validate the config, backfill history, then start the daemon:
dccd validate --config config.yml
dccd backfill --config config.yml
dccd start --config config.yml
Check the status of running jobs:
dccd status --config config.yml
See Daemon (dccd.daemon) for the full daemon reference including remote sync, health monitoring, and all CLI commands.