Lake offers the following data types:
- trades – trades occurring on the given exchange aggregated by a single taker order and price.
- book – market depth (order book) snapshots at high frequency, at least once per 100 ms where the exchange supports it. Contains 20 price levels for each order book side.
- level_1 – a practical derivative of book containing just the first price level on each order book side. book rows containing only updates on deeper price levels are skipped, which makes this data type much faster to load and work with.
- book_delta_v2 – market depth (order book) updates at high frequency, supporting unlimited (e.g. 1000+) price levels on each order book side. It has an even higher update frequency than book data, but is more complicated to process because you have to build an order book model from the updates, as shown in this interactive notebook.
- candles – 1-minute OHLCV candles.
- book_1m – 1-minute order book snapshots containing up to one thousand price levels on each order book side.
- funding – funding and predicted funding rate plus mark_price for futures. On Binance, for example, this is updated every 3 seconds.
- open_interest – open interest for futures. On Binance, for example, this is updated every 20 seconds.
- liquidations – liquidation metrics such as side, price and quantity for futures. On Binance, for example, this is updated at most once per second.
All data are partitioned by exchange, symbol and day, and you can access any combination of those. Note that level_1 and candles data are just derivatives of book and trades data. Perpetual/futures symbols have a ‘-PERP’ suffix in their name.
You can find more details about exchange coverage and available history in the Coverage section.
Access API
The data are available through an easy-to-use Python API. It offers high-performance parallelized data downloads and supports caching, so only new and unseen data are downloaded when you extend the time range or add more token pairs to your query. This enables, for example, complex distributed machine-learning workloads.
- API source code: https://github.com/crypto-lake/lake-api
- API reference documentation: https://lake-api.readthedocs.io
- API package on pypi: https://pypi.org/project/lakeapi/
For advanced users, it’s also possible to access the data directly through AWS S3. The data storage consists of a directory structure, e.g. trades/exchange=BINANCE/symbol=BTC-USDT/dt=2022-01-01/random_name.parquet, and contains compressed parquet table files that are easy to load from any programming language. If you use this interface, please implement some kind of caching or avoid downloading the same data repeatedly to stay compliant with our Terms. Also be warned that the column names might differ slightly from those in the Python API and the Schemata part of this page.
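The partition layout above can be turned into key prefixes programmatically. The following is a minimal sketch that only builds prefix strings from the pattern in the example path. The helper name `partition_prefixes` is hypothetical, and the bucket name and the file names inside each partition are not covered here.

```python
import datetime


def partition_prefixes(table, exchange, symbol, start, end):
    """Yield one key prefix per day in [start, end], following the
    table/exchange=.../symbol=.../dt=YYYY-MM-DD/ layout shown above."""
    day = start
    while day <= end:
        yield f"{table}/exchange={exchange}/symbol={symbol}/dt={day.isoformat()}/"
        day += datetime.timedelta(days=1)


prefixes = list(
    partition_prefixes(
        "trades", "BINANCE", "BTC-USDT",
        datetime.date(2022, 1, 1), datetime.date(2022, 1, 3),
    )
)
for prefix in prefixes:
    print(prefix)
```

Keeping a local record of the prefixes you have already fetched is one simple way to avoid downloading the same partition twice.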
Access and the amount of downloaded data are unlimited, but we reserve the right to contact customers who scrape all data or misconfigure their cache and cause unnecessary traffic on the scale of ~100 GB. More details can be found in the Terms of Service.
Schemata
trades
name | pandas type | example value |
---|---|---|
side | category | ‘buy’ |
quantity | float64 | 0.00342 |
price | float64 | 19549.73 |
trade_id | Int64 | 1704373229 |
origin_time | datetime64[ns] | 1666051199989000192 |
received_time | datetime64[ns] | 1666051200016254720 |
exchange | category | ‘BINANCE’ |
symbol | category | ‘BTC-USDT’ |
Note that on the SERUM exchange the trades also contain MPID identification of the buying and selling party. This variant of the trades schema is called trades_mpid.
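The example values for origin_time and received_time are nanosecond Unix timestamps. As a minimal sketch, they can be converted with the standard library and compared to estimate the delay between the two. `ns_to_datetime` is a hypothetical helper, and Python's datetime keeps only microseconds, so the last three digits are dropped.

```python
import datetime

NS_PER_SECOND = 1_000_000_000


def ns_to_datetime(ns):
    """Convert a nanosecond Unix timestamp to a timezone-aware UTC datetime.

    Integer arithmetic avoids float rounding; datetime only keeps
    microseconds, so the sub-microsecond part is truncated.
    """
    seconds, remainder_ns = divmod(ns, NS_PER_SECOND)
    base = datetime.datetime.fromtimestamp(seconds, tz=datetime.timezone.utc)
    return base + datetime.timedelta(microseconds=remainder_ns // 1000)


# Example values taken from the trades table above.
origin_time = 1666051199989000192
received_time = 1666051200016254720

origin_dt = ns_to_datetime(origin_time)
latency_ms = (received_time - origin_time) / 1_000_000

print(origin_dt.isoformat())
print(f"{latency_ms:.3f} ms")
```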
book
name | pandas type | example value |
---|---|---|
received_time | datetime64[ns] | 1666051200016254720 |
sequence_number | Int64 | 548631456 |
bid_0_price | float64 | 19549.73 |
bid_0_size | float64 | 0.00342 |
bid_1_price | float64 | … |
bid_1_size | float64 | … |
… | … | … |
bid_19_price | float64 | … |
bid_19_size | float64 | … |
ask_0_price | float64 | … |
ask_0_size | float64 | … |
… | … | … |
ask_19_price | float64 | … |
ask_19_size | float64 | … |
exchange | category | ‘BINANCE’ |
symbol | category | ‘BTC-USDT’ |
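Because each price level is stored in its own pair of columns, it is often convenient to gather one side of the book into a list. The sketch below assumes a row is available as a plain dict keyed by the column names above. `side_levels` is a hypothetical helper, and the deeper-level values are made up for illustration.

```python
BOOK_DEPTH = 20  # price levels per side in the book schema


def side_levels(row, side, depth=BOOK_DEPTH):
    """Return [(price, size), ...] for 'bid' or 'ask', best level first.

    Levels missing from the row are skipped.
    """
    levels = []
    for i in range(depth):
        price = row.get(f"{side}_{i}_price")
        size = row.get(f"{side}_{i}_size")
        if price is not None and size is not None:
            levels.append((price, size))
    return levels


# Illustrative two-level row; real rows carry levels 0 through 19.
row = {
    "bid_0_price": 19549.73, "bid_0_size": 0.00342,
    "bid_1_price": 19549.70, "bid_1_size": 0.5,
    "ask_0_price": 19549.75, "ask_0_size": 0.00634,
    "ask_1_price": 19549.80, "ask_1_size": 0.25,
}

bids = side_levels(row, "bid")
asks = side_levels(row, "ask")
mid_price = (bids[0][0] + asks[0][0]) / 2
spread = asks[0][0] - bids[0][0]
print(mid_price, spread)
```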
level_1
name | pandas type | example value |
---|---|---|
received_time | datetime64[ns] | 1666051200016254720 |
bid_0_price | float64 | 19549.73 |
bid_0_size | float64 | 0.00342 |
ask_0_price | float64 | 19549.75 |
ask_0_size | float64 | 0.00634 |
exchange | category | ‘BINANCE’ |
symbol | category | ‘BTC-USDT’ |
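To illustrate how level_1 relates to book, the sketch below keeps a book row only when the best bid or ask changed. This is an assumption based on the description of level_1 above, not the actual pipeline used to produce the data. `to_level_1` is a hypothetical helper and the rows are made up.

```python
TOP_COLUMNS = ("bid_0_price", "bid_0_size", "ask_0_price", "ask_0_size")


def to_level_1(book_rows):
    """Keep only rows where the best bid or ask differs from the previous row."""
    result = []
    previous_top = None
    for row in book_rows:
        top = tuple(row[col] for col in TOP_COLUMNS)
        if top != previous_top:
            kept = {"received_time": row["received_time"]}
            kept.update(zip(TOP_COLUMNS, top))
            result.append(kept)
            previous_top = top
    return result


book_rows = [
    {"received_time": 1, "bid_0_price": 100.0, "bid_0_size": 1.0,
     "ask_0_price": 100.5, "ask_0_size": 2.0, "bid_1_size": 3.0},
    # Only a deeper level changed, so this row is dropped.
    {"received_time": 2, "bid_0_price": 100.0, "bid_0_size": 1.0,
     "ask_0_price": 100.5, "ask_0_size": 2.0, "bid_1_size": 4.0},
    # The best bid size changed, so this row is kept.
    {"received_time": 3, "bid_0_price": 100.0, "bid_0_size": 0.5,
     "ask_0_price": 100.5, "ask_0_size": 2.0, "bid_1_size": 4.0},
]

level_1_rows = to_level_1(book_rows)
print([r["received_time"] for r in level_1_rows])
```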
book_delta_v2
name | pandas type | example value |
---|---|---|
origin_time | datetime64[ns] | 1666051200014000000 |
received_time | datetime64[ns] | 1666051200016254720 |
sequence_number | Int64 | 5667521204 |
side_is_bid | boolean | False |
price | float64 | 33.97 |
size | float64 | 393.38 |
exchange | category | ‘BINANCE’ |
symbol | category | ‘AVAX-USDT’ |
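Building an order book model from these updates might look like the sketch below. It assumes that each row sets the total size at a price level and that a size of 0 removes the level. That convention is an assumption, so check it against the interactive notebook. It also assumes rows are applied in order and does not handle initial snapshots or sequence gaps. `apply_delta` and `best_levels` are hypothetical helpers.

```python
def apply_delta(book, delta):
    """Apply one book_delta_v2 row to an in-memory book.

    `book` is {"bids": {price: size}, "asks": {price: size}}.
    Assumption: size == 0 removes the price level.
    """
    side = book["bids"] if delta["side_is_bid"] else book["asks"]
    if delta["size"] == 0:
        side.pop(delta["price"], None)
    else:
        side[delta["price"]] = delta["size"]


def best_levels(book):
    """Return (best_bid, best_ask) prices, or None for an empty side."""
    best_bid = max(book["bids"]) if book["bids"] else None
    best_ask = min(book["asks"]) if book["asks"] else None
    return best_bid, best_ask


book = {"bids": {}, "asks": {}}
deltas = [
    {"side_is_bid": False, "price": 33.97, "size": 393.38},
    {"side_is_bid": True, "price": 33.95, "size": 120.0},
    {"side_is_bid": False, "price": 33.98, "size": 50.0},
    {"side_is_bid": False, "price": 33.97, "size": 0.0},  # level removed
]
for delta in deltas:
    apply_delta(book, delta)

print(best_levels(book))
```

Plain dicts keep the example short. For deep books, a sorted container would make best-price lookups cheaper.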
deep_book_1m
name | pandas type | example value |
---|---|---|
received_time | datetime64[ns] | 1666051200016254720 |
sequence_number | Int64 | 5667521204 |
bid_prices | list[float64] | [36.6, 36.2, 36.77, …] |
bid_sizes | list[float64] | 33.97 |
ask_prices | list[float64] | 33.97 |
ask_sizes | list[float64] | 33.97 |
exchange | category | ‘BINANCE’ |
symbol | category | ‘AVAX-USDT’ |
candles
name | pandas type | example value |
---|---|---|
origin_time | datetime64[ns] | 1666051140000000000 |
open | float64 | 19549.73 |
high | float64 | 19549.87 |
low | float64 | 19548.48 |
close | float64 | 19549.86 |
volume | float64 | 0.03388 |
trades | Int64 | 24 |
received_time | datetime64[ns] | 1666051140004562561 |
start | float64 | 1666051140000000000 |
stop | float64 | 1666051200000000000 |
exchange | category | ‘BINANCE’ |
symbol | category | ‘BTC-USDT’ |
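Since candles are derived from trades, the relationship between the two schemas can be sketched as a simple aggregation. This is an illustration only, not the pipeline that produces the candles data. `trades_to_candles` is a hypothetical helper and the trade values are made up.

```python
NS_PER_MINUTE = 60 * 1_000_000_000


def trades_to_candles(trades):
    """Aggregate (origin_time_ns, price, quantity) tuples into 1-minute candles.

    Trades must be sorted by time. Minutes without trades produce no candle.
    """
    candles = {}
    for origin_time, price, quantity in trades:
        start = origin_time - origin_time % NS_PER_MINUTE
        candle = candles.get(start)
        if candle is None:
            candles[start] = {
                "start": start, "stop": start + NS_PER_MINUTE,
                "open": price, "high": price, "low": price, "close": price,
                "volume": quantity, "trades": 1,
            }
        else:
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price
            candle["volume"] += quantity
            candle["trades"] += 1
    return list(candles.values())


trades = [
    (1666051140_000_000_000, 19549.73, 0.01),
    (1666051150_000_000_000, 19549.87, 0.02),
    (1666051199_989_000_192, 19548.48, 0.003),
    (1666051200_016_000_000, 19549.86, 0.00088),  # falls into the next minute
]
candles = trades_to_candles(trades)
print(candles[0])
```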
funding
name | pandas type | example value |
---|---|---|
origin_time | datetime64[ns] | 1682899248008000000 |
mark_price | float64 | 1867.939062 |
index_price | float64 | 0.0 |
rate | float64 | -0.000026 |
next_funding_time | datetime64[ns] | 1682928000000000000 |
received_time | datetime64[ns] | 1682899273386438400 |
exchange | category | ‘BINANCE_FUTURES’ |
symbol | category | ‘ETH-USDT-PERP’ |
Note that index price or mark price may be zero, depending on the exchange or data origin.
open_interest
name | pandas type | example value |
---|---|---|
origin_time | datetime64[ns] | 1682899248008000000 |
open_interest | float64 | 92098.846 |
received_time | datetime64[ns] | 1682899273386438400 |
exchange | category | ‘BINANCE_FUTURES’ |
symbol | category | ‘ETH-BUSD-PERP’ |
liquidations
name | pandas type | example value |
---|---|---|
origin_time | datetime64[ns] | 1682899248008000000 |
side | string | ‘buy’ |
quantity | float64 | 0.457 |
price | float64 | 1942.6 |
id | float64 | -1 |
status | string | ‘filled’ |
received_time | datetime64[ns] | 1682899273386438400 |
exchange | category | ‘BINANCE_FUTURES’ |
symbol | category | ‘ETH-BUSD-PERP’ |
Note that some exchanges only publish the last liquidation in a given second; see the Binance documentation for details.
Notes
- All quantities, sizes and volumes are in the base asset (e.g. BTC in the case of the BTC-USDT pair).
- All times are in nanosecond Unix integer timestamp format.
- All decimal numbers are stored as floats. This is to ensure good storage and computational performance. You may wish to round them to tick size and convert them to Python Decimal or string to get the precise value, e.g. for a pretty human-readable representation.
- Order book data usually don’t contain origin_time. Dataframes with them may contain an empty origin_time column with 0 or -1.
- Data for the past day are uploaded every day between 00:00 UTC and 3:00 UTC.
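The rounding to tick size mentioned in the notes above can be sketched as follows. The tick size is not part of the data and depends on the exchange and symbol, so the value used here is an assumption. `to_decimal` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def to_decimal(value, tick_size):
    """Round a float to the nearest multiple of tick_size and return a Decimal.

    tick_size is passed as a string (e.g. "0.01") so that it is exact.
    """
    tick = Decimal(tick_size)
    # str(value) gives the shortest repr of the float, avoiding the long
    # binary expansion that Decimal(value) would produce.
    ticks = (Decimal(str(value)) / tick).quantize(
        Decimal("1"), rounding=ROUND_HALF_EVEN
    )
    return ticks * tick


price = 0.1 + 0.2  # 0.30000000000000004 as a float
exact = to_decimal(price, "0.01")
print(exact)  # 0.30
```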