Here is the high-level picture of today’s system. We will start a MarketStore instance in a Docker container and run a background worker that calls the GDAX price API, so that we can pull historical bitcoin prices from their endpoint quickly and make them available for backtest clients to query over HTTP.
We will start another container for the client, using a Python 3 image from Anaconda. We use the official client package, pymarketstore. Query results come back from MarketStore as a DataFrame.
Set Up the MarketStore Server
An official build of the MarketStore Docker image is publicly available on DockerHub, but first, let’s write a config file for the server.
The GitHub repository has an example config file in YAML format (https://github.com/alpacahq/marketstore/blob/master/mkts.yml), but here is the one we will use.
root_directory: /project/data/mktsdb
listen_port: 5993
log_level: info
queryable: true
stop_grace_period: 0
wal_rotate_interval: 5
enable_add: true
enable_remove: false
enable_last_known: false
bgworkers:
  - module: gdaxfeeder.so
    name: GdaxFetcher
    config:
      symbols:
        - BTC
      base_timeframe: "1D"
      query_start: "2018-01-01 00:00"
This configures the server to fetch 1-day bars from the GDAX historical price API, starting from 2018-01-01. Save this config as $PWD/mkts.yml. The server listens on port 5993, as set by listen_port. Now let’s bring up the server.
$ docker run -v $PWD/mktsdb:/project/data/mktsdb -v $PWD/mkts.yml:/tmp/mkts.yml --net host alpacamarkets/marketstore:v2.1.1 marketstore -config /tmp/mkts.yml
Docker should automatically download the image from DockerHub if you don’t have it yet, and then start the server process with the config. Hopefully, you will see something like this.
I0430 05:54:56.091770 1 log.go:14] Disabling "enable_last_known" feature until it is fixed...
I0430 05:54:56.092200 1 log.go:14] Initializing MarketStore...
I0430 05:54:56.092236 1 log.go:14] WAL Setup: initCatalog true, initWALCache true, backgroundSync true, WALBypass false:
I0430 05:54:56.092340 1 log.go:14] Root Directory: /project/data/mktsdb
I0430 05:54:56.097066 1 log.go:14] My WALFILE: WALFile.1525067696092950500.walfile
I0430 05:54:56.097104 1 log.go:14] Found a WALFILE: WALFile.1525067686432055600.walfile, entering replay...
I0430 05:54:56.100352 1 log.go:14] Beginning WAL Replay
I0430 05:54:56.100725 1 log.go:14] Partial Read
I0430 05:54:56.100746 1 log.go:14] Entering replay of TGData
I0430 05:54:56.100762 1 log.go:14] Replay of WAL file /project/data/mktsdb/WALFile.1525067686432055600.walfile finished
I0430 05:54:56.101506 1 log.go:14] Finished replay of TGData
I0430 05:54:56.109380 1 plugins.go:14] InitializeTriggers
I0430 05:54:56.110664 1 plugins.go:42] InitializeBgWorkers
I0430 05:54:56.110742 1 log.go:14] Launching rpc data server...
I0430 05:54:56.110800 1 log.go:14] Launching heartbeat service...
I0430 05:54:56.110822 1 log.go:14] Enabling Query Access...
I0430 05:54:56.110844 1 log.go:14] Launching tcp listener for all services...
If you see something like “Response error: Rate limit exceeded”, that’s a good sign, not a bad one: it means the background worker successfully fetched price data and then hit the API rate limit. The fetch worker will pause for a while and then resume, catching up to the current price automatically. You just need to keep it running.
MarketStore implements JSON-RPC and MessagePack-RPC for queries. MessagePack-RPC is particularly important for query performance on large datasets. Thankfully, there are already Python and Go client libraries, so you don’t have to implement the protocol yourself. In this article, we use Python. From another terminal, we start a container from the miniconda3 image.
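Before moving on to the client, it can help to confirm that something is actually listening on port 5993. The snippet below is a minimal sketch using only the Python standard library; the helper name is_port_open is made up for this example, and it only checks that a TCP connection succeeds, not that MarketStore itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for this article; it does not speak the
    MarketStore protocol, it only checks that the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5993 is the listen_port from the mkts.yml used in this article.
    print("MarketStore port reachable:", is_port_open("localhost", 5993))
```

Because the server container runs with --net host, localhost:5993 on the host machine should be the right address to check.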
$ docker run -it --rm -v $PWD/client.py:/tmp/client.py --net host continuumio/miniconda3 bash
# pip install ipython pymarketstore
We have installed ipython and pymarketstore, including their dependencies. From this terminal, let’s start an ipython shell and query MarketStore data.
# ipython
(base) root@hq-dev-01:/# ipython
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.3.1 -- An enhanced Interactive Python. Type '?' for help.
In : import pymarketstore as pymkts
In : param = pymkts.Params('BTC', '1D', 'OHLCV', limit=100)
In : df = pymkts.Client('http://localhost:5993/rpc').query(param).first().df()
In : df[-10:]
Out:
                              Open     High      Low    Close        Volume
Epoch
2018-04-14 00:00:00+00:00  7893.19  8150.00  7830.00  8003.11   9209.196953
2018-04-15 00:00:00+00:00  8003.12  8392.56  8003.11  8355.25   9739.103514
2018-04-16 00:00:00+00:00  8355.24  8398.98  7905.99  8048.93  13137.432715
2018-04-17 00:00:00+00:00  8048.92  8162.50  7822.00  7892.10  10537.460361
2018-04-18 00:00:00+00:00  7892.11  8243.99  7879.80  8152.05  10673.642535
2018-04-19 00:00:00+00:00  8152.05  8300.00  8101.47  8274.00  11788.032811
2018-04-20 00:00:00+00:00  8274.00  8932.57  8216.21  8866.27  16076.648797
2018-04-21 00:00:00+00:00  8866.27  9038.87  8610.70  8915.42  11944.464063
2018-04-22 00:00:00+00:00  8915.42  9015.00  8754.01  8795.01   7684.827002
2018-04-23 00:00:00+00:00  8795.00  8991.00  8775.10  8940.00   3685.109169
Voila! You now have the daily bitcoin price in hand as a DataFrame. Note that the second line (param = …) determines which symbol and timeframe to query, along with query predicates such as the number of rows or the date range. From here, you can do a number of things, including calculating indicators such as moving averages and Bollinger Bands, or looking for statistical anomalies in volume with a package such as scipy.
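As an illustration of the indicator idea, here is a minimal sketch of a simple moving average with Bollinger Bands in pandas. The Close values are copied from the query output shown earlier so the example runs without a server; with a live MarketStore you would pass df["Close"] instead. The function name bollinger, the 5-day window, and the use of the population standard deviation (ddof=0) are choices made for this example, not something prescribed by MarketStore.

```python
import pandas as pd

# Close prices copied from the df[-10:] output shown earlier in this article.
close = pd.Series(
    [8003.11, 8355.25, 8048.93, 7892.10, 8152.05,
     8274.00, 8866.27, 8915.42, 8795.01, 8940.00],
    index=pd.date_range("2018-04-14", periods=10, freq="D", tz="UTC"),
    name="Close",
)


def bollinger(close: pd.Series, window: int = 5, num_std: float = 2.0) -> pd.DataFrame:
    """Return the close price with its rolling mean and upper/lower bands.

    Hypothetical helper for this article. The first (window - 1) rows are
    NaN because the rolling window is not yet full.
    """
    sma = close.rolling(window).mean()
    # ddof=0 gives the population standard deviation; pandas defaults to ddof=1.
    std = close.rolling(window).std(ddof=0)
    return pd.DataFrame({
        "Close": close,
        "SMA": sma,
        "Upper": sma + num_std * std,
        "Lower": sma - num_std * std,
    })


bands = bollinger(close, window=5)
print(bands.tail())
```

With real data you would typically use a longer window (20 days is a common choice) and a longer history than the ten rows used here.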
I want to emphasize that a performant historical dataset is very important for studying and developing a trading algorithm, and that you can build one quickly with MarketStore, as we have just walked through. This article demonstrated how to work with bitcoin prices from GDAX, but you can hook up other data sources just as easily using pymarketstore’s write method. You can also write your own custom background data fetcher.
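To give a feel for the write path, here is a rough sketch that packs one bar into a NumPy structured array and hands it to a client’s write method. It assumes, based on the pymarketstore README as I recall it, that write takes a record array with an integer Epoch column in seconds plus a "SYMBOL/TIMEFRAME/ATTRGROUP" string; check this against the version you have installed. The helpers build_bars and write_bars, the TEST symbol, and the float64 column types are assumptions made for this example.

```python
from datetime import datetime, timezone

import numpy as np

# Column layout is an assumption for this sketch: Epoch in seconds, then OHLCV.
BAR_DTYPE = [
    ("Epoch", "i8"), ("Open", "f8"), ("High", "f8"),
    ("Low", "f8"), ("Close", "f8"), ("Volume", "f8"),
]


def build_bars(rows):
    """Pack (epoch, open, high, low, close, volume) rows into a structured array."""
    return np.array([tuple(r) for r in rows], dtype=BAR_DTYPE)


def write_bars(client, bars, tbk="TEST/1D/OHLCV"):
    """Send bars to MarketStore via the client's write method.

    Assumes client.write(recarray, tbk) as described in the lead-in;
    TEST is a placeholder symbol, not a real bucket.
    """
    return client.write(bars, tbk)


# One bar reusing the 2018-04-23 row from the query output shown earlier.
epoch = int(datetime(2018, 4, 23, tzinfo=timezone.utc).timestamp())
bars = build_bars([(epoch, 8795.00, 8991.00, 8775.10, 8940.00, 3685.109169)])

# Usage against a running server (not executed here):
#   import pymarketstore as pymkts
#   client = pymkts.Client('http://localhost:5993/rpc')
#   write_bars(client, bars, 'TEST/1D/OHLCV')
```

Keeping the array construction separate from the network call makes it easy to inspect the data before anything is sent to the server.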
Again, query performance is going to be critical when it comes to backtesting, since you want to iterate quickly to get results. Now you may wonder how fast MarketStore can be. I will show its lightning-fast query speed on a huge dataset in the next post.
In the meantime, please leave any questions in the comments or ask @AlpacaHQ regarding this tutorial. You can also check us out at https://alpaca.markets.