Algo Trading for Dummies  -  Implementing an Actual Trading Strategy (Part 4)

Strategy Development and Implementation

While most strategies that are successful long term are based on a mix of technical and fundamental factors, the fundamental behaviors being exploited are often very nuanced and vary hugely, so it's hard to generalize for an article. As such, we'll be focusing more on the tools and methods for making strategies based on technical analysis.

Visual Strategy Creation and Refinement

There are many great financial charting tools available, each with its own specialties; my personal favourite free option is tradingview.com.

One of its most useful features for strategy creation is a simple scripting language for building both trading indicators and back-testable strategies. While the back-testing tool is rather limited in its functionality, it serves as a good first-step sanity check.

Simple creation of trading indicators, which are then overlaid directly onto the chart, allows for rapid testing and debugging of ideas, as it's much quicker to create a script and visually check it against the market than to fully implement and back-test it.

This rapid development process is a good first step to making certain types of strategies, particularly for active trading strategies that act on single symbols at a time. However, it won’t do you any good for portfolio strategies or those which incorporate advanced hedging.

For that, you'll want to create your own tools for visualising full back-tests with multiple trading pairs. This is where the logging features of your back-tester come in, with the end results plotted in your graphing tool of choice, such as matplotlib (for Python).
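
As a minimal sketch (assuming the back-tester logs one row per symbol per time-step to a CSV with hypothetical timestamp, symbol, and equity columns), the results could be plotted along these lines:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical log format: timestamp, symbol, equity (one row per symbol per step)
log = pd.read_csv("backtest_log.csv", parse_dates=["timestamp"])

# Overlay each symbol's equity curve on a shared axis for quick comparison
for symbol, rows in log.groupby("symbol"):
    plt.plot(rows["timestamp"], rows["equity"], label=symbol)

plt.legend()
plt.xlabel("Time")
plt.ylabel("Equity")
plt.title("Multi-symbol back-test results")
plt.show()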

Full Back-tester Framework:

(Simple example of a multi-symbol back-tester based on position handler from previous article — full script at end of this post)

Various plots, such as scatter graphs or hierarchical clustering, can be used to efficiently display and contrast different variations of the back-tested strategy and allow fine-tuning of parameters.
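
For example, a scatter plot of a parameter sweep makes it easy to spot which variations offer the best trade-off of return against drawdown. A sketch, assuming each back-test run has been summarised into hypothetical annual_return, max_drawdown, and lookback fields:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical summary of a parameter sweep: one row per back-tested variation
runs = pd.read_csv("parameter_sweep.csv")

plt.scatter(runs["max_drawdown"], runs["annual_return"], c=runs["lookback"])
plt.colorbar(label="lookback parameter")
plt.xlabel("Max drawdown")
plt.ylabel("Annualised return")
plt.show()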

Implementing and Back-testing

One of the easiest traps to fall into with the design of any predictive system is over-fitting to your data. It's easy to see amazing results in back-tests if a strategy has been trained to completely fit the testing data. However, the strategy will almost certainly fall at the first hurdle when tested against anything out of sample, making it useless.

Meanwhile, at the other end of the spectrum, it is also possible to create a system which is over-generalised. For example, a strategy which is supposed to actively trade the S&P 500 could easily turn a profit long term by always signaling long. But that completely defeats the purpose of trying to create the bot in the first place.

The best practices for back-testing a system are:

  1. Verify against out of sample data. If the strategy has been tuned against one set of data, it is obviously going to perform well against it. All back-tests should be performed against a different set of data, whether that be a different symbol in the same asset class or the same symbol over a different time sample.
  2. Verify all strategies against some kind of benchmark. For a portfolio strategy you'd want to compare risk-adjusted return metrics; for an active trading strategy you can look at risk:reward and win rate (see the sketch after this list).
  3. Sanity check any strategies that pass the back-test. Where possible, look back over the specific set of steps it takes to make any trading signals. Do they make logical sense? If this isn’t possible (for example with Machine Learning), plot a set of its signals for out of sample data. Do they appear consistent and reasonable?
  4. If the strategy has gotten this far, run live tests. Many platforms offer paper-trading accounts for strategy testing. If not, you may be able to adapt your back-testing tool to accept live market data.
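
A minimal sketch of the benchmark checks from tip 2, with stand-in data (in practice the return series and per-trade profits come from the back-test log):

import numpy as np
import pandas as pd

def sharpe(returns, periods=252):
    """Annualised risk-adjusted return metric."""
    return np.sqrt(periods) * returns.mean() / returns.std()

def win_rate(trade_profits):
    """Fraction of trades that closed profitably."""
    return (np.asarray(trade_profits) > 0).mean()

# Stand-in daily returns for the strategy and its benchmark
strategy = pd.Series(np.random.normal(0.0005, 0.01, 252))
benchmark = pd.Series(np.random.normal(0.0003, 0.01, 252))
trades = [120.0, -45.5, 80.2, -10.0, 33.1]

print(f"strategy Sharpe {sharpe(strategy):.2f} vs benchmark {sharpe(benchmark):.2f}")
print(f"win rate {win_rate(trades):.0%}")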

Once you finally have a fully tested and working strategy which you are happy with, you can run it with small amounts of capital on a testing account. While the strategy may be perfect, there is always the possibility of bugs in the trading bot itself.

Final Thoughts

Creating any effective trading strategy is hard, especially when you also have to define it in objective terms that can be converted into code. It can be especially frustrating when nothing seems to produce reliable results. However, sticking to good practices for the data science of back-testing and refining a strategy will pay off compared with learning those same lessons when a strategy under-performs with real money.

By Matthew Tweed

Full back-tester framework:

/

Intro into Machine Learning for Finance (Part 1)

There has been increasing talk in recent years about the application of machine learning for financial modeling and prediction. But is the hype justified? Is machine learning worth investing time and resources into mastering?

Photo by Franck V. on Unsplash

This series will be covering some of the design decisions and challenges to creating and training neural networks for use in finance, from simple predictive models to the use of ML to create specialised trading indicators and statistics — with example code and models along the way.

If you are comfortable with machine learning in general, please feel free to skip ahead and read from the 3rd section, “Where can it be applied in finance?”

What is Machine Learning?

In simple terms, machine learning is about creating software which can be “trained” to automatically adapt its predictive model without the need for hard-coded changes. There is often debate over whether machine learning is a subset of Artificial Intelligence or AI is a subset of ML, but both work towards the same broad goal of pattern recognition and analysis.

While different forms of machine learning and expert systems have been around for decades, only relatively recently have we seen large advances in their learning capabilities, as both training methods and computer hardware have advanced.

With the creation of easy to use open-source libraries, it has now become easier than ever to create, train and deploy models without the need for specialist education.

Neural Networks

Artificial neural networks are, again, a subset of the broad field of machine learning. They are among the most commonly used models and among the easier ones to understand conceptually.

A network is made up of layers of “neurons”, each of which performs a very simple calculation based on its own trained weightings. Individually, they provide very little in terms of processing. However, when combined into a layer, and layers stacked into a full network, the complexity of what the model can learn widens and deepens.

(Simple neural network structure)

Each neuron has a weighting value associated with each input it receives. The output it passes on is the sum of each input multiplied by its respective weighting:

output = w1*x1 + w2*x2 + … + wn*xn

This value is then put through an “activation function” (such as tanh or sigmoid).

The activation function can serve to normalize the output value of the neuron and add non-linearity, so that the network can learn functions more complex than simple linear relationships.
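
As a minimal numpy sketch of a single neuron's forward pass (weights made up for illustration):

import numpy as np

def neuron(inputs, weights):
    """Weighted sum of inputs, passed through a tanh activation."""
    return np.tanh(np.dot(weights, inputs))

x = np.array([0.5, -1.2, 0.3])   # inputs from the previous layer
w = np.array([0.8, 0.1, -0.4])   # trained weightings, one per input
print(neuron(x, w))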

Once the input data has been processed through the whole stack of neurons, you'll be left with your simple prediction or statistic, such as a long/short call for the next time period.

Training a network

It's all well and good looking at the flow of data for a model to make a prediction, but it wouldn't be complete without a brief overview of how the network is actually trained to make these predictions.

During the training process, you run a set of data through the network to compare its predictions against the desired results for each data point. The difference between the output and the target value(s) is then used to update the weightings within the network through “back-propagation”.

Back-propagation starts at the output neurons, looking at the component values they received from the previous layer and the associated weightings. The weightings are given a small adjustment to bring the updated prediction closer in line with the desired output.

This error between the output and target is then fed to the next layer down, where the same updating process is repeated until all the network weightings have been marginally adjusted.

This is repeated multiple times over every data-point in the training dataset, giving the network weightings a small adjustment each time until predictions converge towards the target outputs.
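
A minimal sketch of this loop for a tiny one-layer network on toy data (plain numpy; a real model would use a framework such as TensorFlow or PyTorch):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 data points, 3 features
y = (X.sum(axis=1) > 0).astype(float)    # toy target values

w = rng.normal(size=3)                   # network weightings
lr = 0.1                                 # size of each small adjustment
for epoch in range(200):
    pred = 1 / (1 + np.exp(-X @ w))      # forward pass with sigmoid activation
    error = pred - y                     # difference from the target
    grad = X.T @ (error * pred * (1 - pred)) / len(y)
    w -= lr * grad                       # back-propagation step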

In theory, this learned model will then be able to make accurate predictions on out of sample data. However, it is very easy to over-fit to the training data if the model is too large and simply learns the inputs rather than a generalized representation.

(Example of over-fitting for simple classification — made with tensorflow playground)

Where can it be applied in finance?

Since neural networks can be used to learn complex patterns in a dataset, they can be used to automate some of the processes of technical analysis commonly used by traders.

A moving average cross strategy can be coded with ease, needing only a few lines for a simple trading bot, as sketched below. However, more complex patterns such as indicator divergence, flags and wedges, and support/resistance levels can be harder to identify with simple rules. And, indeed, forming a set of chart patterns into an objective trading strategy is often hard to achieve.
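
A minimal pandas sketch of such a moving average cross signal (stand-in prices):

import numpy as np
import pandas as pd

def ma_cross_signal(close, fast=20, slow=50):
    """+1 (long) when the fast MA is above the slow MA, else -1."""
    fast_ma = close.rolling(fast).mean()
    slow_ma = close.rolling(slow).mean()
    return (fast_ma > slow_ma).astype(int) * 2 - 1

close = pd.Series(np.random.normal(0, 1, 500).cumsum() + 100)  # stand-in closing prices
print(ma_cross_signal(close).tail())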

Machine learning can be applied in several different cases for this one scenario.

  1. Pattern recognition from candle data to identify levels of significance.
  2. Creating specialised indicators to add to a simple rule-based strategy.
  3. A final processing and aggregation layer to make a prediction from your set of indicators.

Machine learning can also be applied in slightly more exotic ways to help refine further information:

  • Denoising and auto-encoding — used to remove some of the random noise of a price feed to help distill the underlying trend or the specifics of market sentiment.
  • Clustering — grouping together different equities and financial instruments to streamline the construction of a portfolio, or to evaluate and reduce its risk.
  • Regression — often used to try to predict the price at the next time step, though it can also be applied to a range of abstracted indicators to help predict trading signals earlier.

Machine Learning vs Traditional Methods

In many of the cases above, it is perfectly possible (and often advisable) to stick to more traditional algorithms. A well-made machine learning framework has the advantage when it comes to easy retraining, but at the cost of complexity, computational overhead, and interpretability.

While there have been advances in the use of relevance heat-maps to help explain the source of a prediction, neural nets still mostly remain black boxes — ruling out certain use cases, such as for fund managers, where decision justification and accountability are of importance to clients.

We may well see attitudes change over time as ML-assisted trading and investing becomes more widespread, but for now this remains a large obstacle to practical application in certain settings.

Furthermore, it is often a lot easier to make a simple rule-based strategy than a full ML model and training structure. But when done right, machine learning can provide cutting-edge accuracy in the adversarial world of financial trading.

Conclusion

Despite the added challenges and complexity brought by machine learning, it provides a new set of tools which can be applied to a range of problems in finance, allowing for greater automation and accuracy.

In the next post we’ll be looking deeper into some of the theory and decision making behind different training methods and tasks for a new model.

By Matthew Tweed

/

I Built a Go Plugin for Alpaca’s MarketStore as a College Intern

Hey all! I'm Ethan and recently started working for Alpaca as a Software Engineering Intern! For my first task, I created a Go plugin for Alpaca's open source MarketStore server that fetches and writes Binance minute-level OHLCV data.

You might be wondering — What is MarketStore? MarketStore is a database server written in Go that helps users handle large amounts of financial data. Inside of MarketStore, there are Go plugins that allow users to gather important financial and crypto data from third party sources.

For this blog post, I'll be going over how I created the plugin from start to finish in four sections: installing MarketStore, understanding MarketStore's plugin structure, creating the Go plugin, and installing the Go plugin.

Experience Installing and Running MarketStore Locally

First, I set up MarketStore locally. I installed the latest version of Go and started going through the installation process outlined in MarketStore’s README. All the installation commands worked swimmingly, but when I tried to run marketstore using

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ marketstore -config mkts.yml

I got this weird error:

/usr/local/go/src/fmt/print.go:597:CreateFile/go/src/github.com/alpacahq/marketstore/executor/wal.go:87open /project/data/mktsdb/WALFile.1529203211246361858.walfile: no such file or directory: Error Creating WAL File

I was super confused and couldn't find any other examples of this error online. After checking and changing permissions in the directory, I realized the root_directory setting in my mkts.yml configuration file was incorrect. To resolve this, I changed mkts.yml from

root_directory: /project/data/mktsdb

To

root_directory: /home/ethanc/go/bin/src/github.com/alpacahq/marketstore/project/data/mktsdb

and reran

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ marketstore -config mkts.yml

This time, everything worked fine and I got this output:

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ marketstore -config mkts.yml
…
I0621 11:37:52.067803 27660 log.go:14] Launching heartbeat service…
I0621 11:37:52.067856 27660 log.go:14] Enabling Query Access…
I0621 11:37:52.067936 27660 log.go:14] Launching tcp listener for all services
…

To enable the gdaxfeeder plugin which grabs data from a specified cryptocurrency, I uncommented these lines in the mkts.yml file:
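
The relevant bgworkers section looks roughly like this (a sketch inferred from the gdaxfeeder log output below; the exact lines in the shipped mkts.yml may differ):

bgworkers:
  - module: gdaxfeeder.so
    name: GdaxFetcher
    config:
      symbols:
        - BTC
        - ETH
        - LTC
        - BCH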

and reran

ethanc@ethanc-Inspiron-5559:~$ marketstore -config mkts.yml

which yielded:

…
I0621 11:44:27.248433 28089 log.go:14] Enabling Query Access…
I0621 11:44:27.248448 28089 log.go:14] Launching tcp listener for all services…
I0621 11:44:27.254118 28089 gdaxfeeder.go:123] lastTimestamp for BTC = 2017-09-01 04:59:00 +0000 UTC
I0621 11:44:27.254189 28089 gdaxfeeder.go:123] lastTimestamp for ETH = 0001-01-01 00:00:00 +0000 UTC
I0621 11:44:27.254242 28089 gdaxfeeder.go:123] lastTimestamp for LTC = 0001-01-01 00:00:00 +0000 UTC
I0621 11:44:27.254266 28089 gdaxfeeder.go:123] lastTimestamp for BCH = 0001-01-01 00:00:00 +0000 UTC
I0621 11:44:27.254283 28089 gdaxfeeder.go:144] Requesting BTC 2017-09-01 04:59:00 +0000 UTC - 2017-09-01 09:59:00 +0000 UTC
…

Now that I got MarketStore running, I used Jupyter notebooks and tested out the commands listed in this Alpaca tutorial and got the same results. You can read more about how to run MarketStore in MarketStore’s README, Alpaca’s tutorial, and this thread.

Understanding how MarketStore Plugins work

After installing, I wanted to understand how the MarketStore repository is organized and how its current Go plugins work. Before working at Alpaca, I didn't have any experience with the Go programming language, so I completed Go's “A Tour of Go” tutorial to get a general feel for the language. Having some experience with C++ and Python, I saw a lot of similarities and found that it wasn't as difficult as I thought it would be.

Creating a MarketStore Plugin

To get started, I read the MarketStore Plugin README. To summarize at a very high level, there are two critical Go features which power plugins: Triggers and BgWorkers. You use Triggers when you want your plugin to respond when certain types of data are written to your MarketStore database. You use BgWorkers when you want your plugin to run in the background.

I only needed to use the BgWorker feature because my plugin’s goal is to collect data outlined by the user in the mkts.yml configuration file.

To get started, I read the code of the gdaxfeeder plugin, which is quite similar to what I wanted to do, except that I'm getting and writing data from the Binance exchange instead of the GDAX exchange.

I noticed that the gdaxfeeder used a GDAX Go wrapper, through which it fetched historical price data from GDAX's public endpoint. Luckily, I found a Go wrapper for Binance created by adshao that has endpoints which retrieve the currently supported symbols as well as Open, High, Low, Close, Volume data for any timespan, duration, or symbol(s) set as parameters.

To get started, I first created a folder called binancefeeder and then created a file called binancefeeder.go inside of it. I then tested the Go wrapper for Binance to see how to create a client and call the Binance API's Kline endpoint to get data:
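
A sketch of that test (assuming the adshao/go-binance wrapper; the exact client API may differ slightly):

package main

import (
    "context"
    "fmt"

    binance "github.com/adshao/go-binance"
)

func main() {
    // Public endpoints need no API key or secret
    client := binance.NewClient("", "")
    klines, err := client.NewKlinesService().
        Symbol("BTCUSDT").
        Interval("1m").
        Do(context.Background())
    if err != nil {
        fmt.Println(err)
        return
    }
    for _, k := range klines {
        fmt.Println(k)
    }
}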

I then ran this command in my root directory:

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ go run binancefeeder.go

and received the following response with Binance data:

&{1529553060000 6769.28000000 6773.91000000 6769.17000000 6771.34000000 32.95342700 1529553119999 223100.99470354 68 20.58056800 139345.00899491}
&{1529553120000 6771.33000000 6774.00000000 6769.66000000 6774.00000000 36.43794400 1529553179999 246732.39415947 93 20.42194600 138288.41850603}
…

So, it turns out that the Go Wrapper worked!

Next, I started brainstorming how I wanted to configure the Binance Go plugin. I ultimately chose symbols, queryStart, queryEnd, and baseTimeframe as my parameters since I wanted the user to query any specific symbol(s), start time, end time, and timespan (ex: 1min). Then, right after my imports, I started creating the necessary configurations and structure for BinanceFetcher for a MarketStore plugin:
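
A sketch of those structures, with field names inferred from the configuration keys shown in the logs later in this post (exact types may differ from the final plugin):

type FetcherConfig struct {
    Symbols       []string `json:"symbols"`
    QueryStart    string   `json:"query_start"`
    QueryEnd      string   `json:"query_end"`
    BaseTimeframe string   `json:"base_timeframe"`
}

type BinanceFetcher struct {
    config        map[string]interface{}
    symbols       []string
    queryStart    time.Time
    queryEnd      time.Time
    baseTimeframe string
}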

The FetcherConfig's members are the settings the user can configure in their configuration file (ex: mkts.yml) to start the plugin. The BinanceFetcher's members are similar to the FetcherConfig's, with the addition of the config member. This will be used in the Run function later.

After creating those structures, I started to write the background worker function. To set it up, I created the necessary variables inside the background worker function and copied the recast function from the gdaxfeeder. The recast function uses Go's Marshal function to encode the config data received as JSON, then sets a variable ret to an empty FetcherConfig struct, stores the parsed JSON config data in ret, and returns it:
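
A sketch of that recast function, following the description above:

// recast re-encodes the raw config map as JSON, then parses it
// back into a typed FetcherConfig struct.
func recast(config map[string]interface{}) *FetcherConfig {
    data, _ := json.Marshal(config)
    ret := FetcherConfig{}
    json.Unmarshal(data, &ret)
    return &ret
}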

Then inside the NewBgWorker function, I started to create a function to determine and return the correct time format as well as set up the symbols, end time, start time, and time duration. If there are no symbols set, by default, the background worker retrieves all the valid cryptocurrencies and sets the symbol member to all those currencies. It also checks the given times and duration and sets them to defaults if empty. At the end, it returns the pointer to BinanceFetcher as the bgworker.BgWorker:
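
A sketch of that setup (NewBgWorker's signature comes from the plugin README; the defaults and helper shown here are illustrative):

func NewBgWorker(conf map[string]interface{}) (bgworker.BgWorker, error) {
    config := recast(conf)
    symbols := config.Symbols
    if len(symbols) == 0 {
        symbols = allSupportedSymbols() // hypothetical helper: query Binance for every listed symbol
    }
    // time.Parse yields a zero-value time.Time when the string is invalid
    start, _ := time.Parse("2006-01-02 15:04", config.QueryStart)
    end, _ := time.Parse("2006-01-02 15:04", config.QueryEnd)
    return &BinanceFetcher{
        config:        conf,
        symbols:       symbols,
        queryStart:    start,
        queryEnd:      end,
        baseTimeframe: config.BaseTimeframe,
    }, nil
}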

Then, I started creating the Run function, which BinanceFetcher implements to satisfy the bgworker.BgWorker interface (see bgworker.go for more details). To get a better sense of how to handle errors and write modular code in Go, I read the code of the gdaxfeeder and polygon plugins. The Run function is defined on the BinanceFetcher (which is dereferenced, since the bgworker.BgWorker returned earlier was a pointer to a BinanceFetcher). The goal of the Run function is to call the Binance API's endpoint with the given parameters for OHLCV, retrieve the data, and write it to your MarketStore database.

I first created a new Binance client with no API key or secret since I’m using their API’s public endpoints.

Then, to make sure that the BinanceFetcher doesn't make any incorrectly formatted API calls, I created a function to check the timestamp format using regex and change it to the correct one. I had to convert the user's given timestamp to maintain consistency with Alpaca's utils.Timeframe, which has a lot of helpful functions but uses different structure members than Binance does (ex: “1Min” vs. “1m”). If the user supplies an unrecognizable timestamp format, it sets the baseTimeframe value to 1 minute:
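
A sketch of such a conversion (illustrative helper name and regex):

// toBinanceInterval converts a timeframe like "1Min" or "4H" into
// Binance's "1m"/"4h" style; unrecognized formats fall back to one minute.
func toBinanceInterval(tf string) string {
    re := regexp.MustCompile(`^([0-9]+)(Min|H|D)$`)
    m := re.FindStringSubmatch(tf)
    if m == nil {
        return "1m"
    }
    suffix := map[string]string{"Min": "m", "H": "h", "D": "d"}
    return m[1] + suffix[m[2]]
}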

The start and end time objects are already checked in the NewBgWorker function, which returns a zero-value time.Time object if they are invalid. So, I only have to check whether the start time is empty and, if so, set it to a default string of the current time. The end time isn't checked, since it will simply be ignored if incorrect, as explained in a later section:

Now that the BinanceFetcher checks the validity of its parameters and falls back to defaults where they aren't valid, I moved on to programming a way to call the Binance API.

To make sure we don't over-call the Binance API and get IP banned, I used a for loop to fetch the data in intervals. I created a timeStart variable, initially set to the given start time, and a timeEnd variable set to timeStart plus 300 times the duration. At the beginning of each loop iteration after the first, timeStart is set to timeEnd, and timeEnd is again set to timeStart plus 300 times the duration:
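
A sketch of that windowing loop (illustrative helper; the fetch-and-write details are elided):

// fetchWindows walks forward through history in 300-bar windows so no
// single request asks Binance for too much data at once.
func fetchWindows(start time.Time, barDuration time.Duration, fetch func(from, to time.Time)) {
    timeStart := start
    for {
        timeEnd := timeStart.Add(300 * barDuration)
        fetch(timeStart, timeEnd) // fetch klines for [timeStart, timeEnd) and write them
        timeStart = timeEnd
    }
}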

When it reaches the end time given by the user, it simply alerts the user through glog and continues onward, since a background worker needs to keep working in the background. It then writes the retrieved data to the MarketStore database. If the data is invalid, the plugin stops, because I don't want to write garbage values to the database.

Installing Go Plugin

To install, I simply changed back to the root directory and ran:

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ make plugins

Then, to configure MarketStore to use my plugin, I changed my config file, mkts.yml, to the following:
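
A sketch of that configuration, reconstructed from the bgWorkerSetting line in the log output below:

bgworkers:
  - module: binancefeeder.so
    name: BinanceFetcher
    config:
      symbols:
        - ETH
      base_timeframe: "1Min"
      query_start: "2018-01-01 00:00"
      query_end: "2018-01-02 00:00"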

Then, I ran MarketStore:

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ marketstore -config mkts.yml

And got the following:

…
I0621 14:48:46.944709 6391 plugins.go:42] InitializeBgWorkers
I0621 14:48:46.944801 6391 plugins.go:45] bgWorkerSetting = &{binancefeeder.so BinanceFetcher map[base_timeframe:1Min query_start:2018-01-01 00:00 query_end:2018-01-02 00:00 symbols:[ETH]]}
I0621 14:48:46.952424 6391 log.go:14] Trying to load module from path: /home/ethanc/go/bin/bin/binancefeeder.so…
I0621 14:48:47.650619 6391 log.go:14] Success loading module /home/ethanc/go/bin/bin/binancefeeder.so.
I0621 14:48:47.651571 6391 plugins.go:51] Start running BgWorker BinanceFetcher…
I0621 14:48:47.651633 6391 log.go:14] Launching heartbeat service…
I0621 14:48:47.651679 6391 log.go:14] Enabling Query Access…
I0621 14:48:47.651749 6391 log.go:14] Launching tcp listener for all services…
I0621 14:48:47.654961 6391 binancefeeder.go:198] Requesting ETH 2018-01-01 00:00:00 +0000 UTC - 2018-01-01 05:00:00 +0000 UTC
…

Testing:

When I was editing my plugin and debugging, I often ran the binancefeeder.go file:

ethanc@ethanc-Inspiron-5559:~/go/bin/src/github.com/alpacahq/marketstore$ go run binancefeeder.go

If I ran into an issue I couldn't resolve, I used Go's equivalent of a print function (the fmt package). If there was an issue while running the plugin as part of MarketStore via the marketstore -config mkts.yml command, I used the glog.Infof() or glog.Errorf() functions to output the corresponding error or incorrect data value.

Lastly, I copied the gdaxfeeder test Go program and simply modified it into a test program for my binancefeeder.

You’ve made it to the end of the blog post! Here’s the link to the Binance plugin if you want to see the complete code. If you want to see all of MarketStore’s plugins, check out this folder.

To summarize, if you want to create a Go extension for any open source repository, I would first read the existing documentation, whether it is a README.md or a dedicated documentation website. Then, I would experiment with the repository's code by changing certain parts and seeing which functions correspond to which actions. Lastly, I would look over previous extensions and refactor an existing one that is close to your plugin idea.

Thanks for reading! I hope you take a look at the MarketStore repository and test it out. If you have any questions, feel free to comment below and I'll try to answer!

Special thanks to Hitoshi, Sho, Chris, and the rest of Alpaca's Engineering team for their code reviews and help, as well as Yoshi and Rao for providing feedback on this post.

By: Ethan Chiu

/

Algo Trading for Dummies  - 3 Useful Tips When Storing Trade Signals (Part 2)

Handling & Storing Trading Signals Are Hard

The calculation of simple trading indicators is made easy by any one of the many technical analysis libraries available. However, the efficient handling and storage of trading signals can be one of the most complex aspects of a live trading system.

Photo by Jeremy Thomas on Unsplash

Calculating Basic Indicators? No Problem

While it's often necessary to create custom indicators and trading signals, there is still significant benefit to using a standard library such as Ta-Lib for the basics. This saves a lot of time compared with reimplementing a set of common indicators in your language of choice. It also has the added bonus of increased processing speed compared with calculations done in native Python, for example.
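
For example, a couple of standard indicators via the Ta-Lib Python bindings (a sketch with stand-in prices):

import numpy as np
import talib

close = np.random.random(100) * 10 + 100  # stand-in closing prices
sma = talib.SMA(close, timeperiod=20)     # simple moving average
rsi = talib.RSI(close, timeperiod=14)     # relative strength index
print(sma[-1], rsi[-1])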

When it comes to moving averages and other simple time-series indicators, the process is fairly self-explanatory: at every time step you calculate the next numerical value, which is then used as the most up-to-date signal to trade against.

(Code Snippet to read data CSV files and process into trading indicators) https://gist.github.com/yoshyoshi/73f130026c25a7dcdb9d6909b1990277

The signals themselves will be stateless in that respect — you aren’t concerned with previous signals that have been made, only the combination of indicators present at that moment. However, you may still wish to store some of the information from the indicators, if only for external analysis at a later point.

Different Story For Advanced Pattern Recognition

Meanwhile, more advanced pattern recognition cannot be handled in such a simple manner. If, for example, your strategy relies on finding divergence between indicators, it's possible to get a significant performance boost by storing some past data-points from which to construct the signal at each new step, rather than having to reprocess the full set of data in the look-back period every time.
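
A sketch of that idea, caching just the look-back window needed to update a signal incrementally (hypothetical structure and field names):

from collections import deque

class SignalCache:
    """Hold only the recent values needed to update a divergence check."""
    def __init__(self, lookback=50):
        self.price_highs = deque(maxlen=lookback)
        self.indicator_highs = deque(maxlen=lookback)

    def update(self, price_high, indicator_high):
        # Append the newest data-point; the oldest falls off automatically,
        # so the full look-back window never needs reprocessing.
        self.price_highs.append(price_high)
        self.indicator_highs.append(indicator_high)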

This is the trade-off between storage/RAM efficiency and processing efficiency, with the latter also requiring greater software complexity to achieve.

How You Should Store Signals Depends On How Fast You Need It To Be

For optimal processing efficiency, you would not only store all the previously calculated signals from past time-stamps, but also the relevant information to calculate the next step in as few steps as possible.

While this would be completely unnecessary for any system with a polling rate above a second, it is exactly the kind of consideration you would have for a higher frequency strategy.

Meanwhile, a portfolio re-balancing system, or even most day-trading strategies, have all the time in the world (relatively). You could easily recalculate all the relevant signals at each time-step, which would cut down on the need for the handling of historical indicator sets.

Depending on the trading period of the system, it may also be worth using a hybrid approach to indicator and signal storage. Rather than permanently saving the data, you could calculate the full set of indicators at start-up and periodically dump and refresh the data to keep only what's going to be used in RAM.

The precise design trade-offs should be considered on an individual basis, as holding more data in RAM may not be an option when running the software from lower power cloud computing instances; nor, at the other end of the spectrum, would you be able to spare the seconds to recalculate everything for a market making bot.

3 Useful Tips When Storing Trade Signals

As mentioned in part 1 of this series, there are a range of different storage solutions that can be used for trading data. However, there are several best practices which apply across all of them:

  1. Keep indicators in a numeric or boolean format where possible for storage. For example, split a more complex signal set into boolean components (see the sketch after this list). This particular problem caused me several issues in projects I've had to work on in the past.
  2. Only store what is complex or time-consuming to recalculate. If a set of signals can be calculated in time in a stateless manner, it's probably easier to do so than to add the design complexity of storing extra information.
  3. Plan out the flow of data through your system before you start programming anything. What market data is going to be pulled for each time-step? What will then be calculated from this and what is necessary to store? A well thought-out design will reduce complexity and hassle down the line.
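
A sketch of tip 1, splitting a composite signal into boolean columns (stand-in data and hypothetical column names):

import numpy as np
import pandas as pd

n = 100
ma_fast = pd.Series(np.random.normal(100, 1, n))   # stand-in indicator values
ma_slow = pd.Series(np.random.normal(100, 1, n))
rsi = pd.Series(np.random.uniform(0, 100, n))

# Store each component as its own boolean column rather than one opaque label
signals = pd.DataFrame({
    "ma_cross_up": ma_fast > ma_slow,
    "rsi_oversold": rsi < 30,
})
signals["entry"] = signals["ma_cross_up"] & signals["rsi_oversold"]
signals.to_csv("signals.csv", index=False)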

Past this, common sense applies. It's probably best to store the indicators and signals in the same time-series format as, and alongside, the underlying symbols they're derived from. More complex signals, or indicators derived from multiple symbols, may even warrant their own calculation and storage process.

You could even go as far as to create a separate indicator feed script which calculates and stores everything separately from the trading bot software itself. The database could then be read by each bot as just another data feed. This not only has the benefit of keeping the system more modular, but also allowing you to create a highly optimized calculation function without the complexity of direct integration into a live system.

Whatever flavour of system you end up using, make sure to plan out the data storage and access first and foremost, before starting the rest of the design and implementation process.

By Matthew Tweed

/

Algo Trading for Dummies  -  Collecting & Storing The Market Data (Part 1)

The lifeblood of any algorithmic trading system is, of course, its data — so that’s what we’ll cover in the first two posts of the mini-series.

Photo by Farzad Nazifi on Unsplash

Always Collect Any Live Data

For the retail trader, most platforms and brokers are broadly the same: you'll be provided with a simple wrapper for a relatively simple REST or WebSocket API. It's usually worth modifying the provided wrapper to suit your purposes, and potentially creating your own custom wrapper — however, that can be done later once you have a better understanding of the structure and requirements of your trading system.

Depending on the nature of the trading strategy, there are various types of data you may need to access and work with — OHLCV data (candlesticks), bid/asks, and fundamental or exotic data. OHLCV is usually the easiest to get historical data for, which will be important later for back-testing of strategies. While there are some sources for tick data and historic bid/ask or order book snapshots, they generally come at high costs.

With this last point in mind, it’s always good to collect any live data which will be difficult or expensive to access at a later date. This can be done by setting up simple polling scripts to periodically pull and save any data that might be relevant for back-testing in the future, such as bid/ask spread. This data can provide helpful insight into the market structure, which you wouldn’t be able to track otherwise.
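
A minimal sketch of such a polling script (fetch_quote is a hypothetical stand-in for whichever quote call your broker's wrapper provides):

import csv
import time
from datetime import datetime, timezone

def fetch_quote(symbol):
    """Hypothetical placeholder: swap in your broker API's quote call."""
    return {"bid": 100.00, "ask": 100.05}

with open("spreads.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while True:
        q = fetch_quote("AAPL")
        writer.writerow([datetime.now(timezone.utc).isoformat(), q["bid"], q["ask"]])
        f.flush()        # persist each row immediately
        time.sleep(60)   # poll once a minute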

Alpaca Python Wrapper Lets You Start Off Quickly

The Alpaca Python wrapper provides a simple API wrapper to begin working with when creating the initial proof-of-concept scripts. It serves well both for downloading bulk historical data and for pulling live data for quick calculations, so it will need little modification to get going.

It's also worth noting that the Alpaca wrapper returns market data in the form of pandas DataFrames, which have slightly different syntax compared with a standard Python array or dictionary — although this is covered thoroughly in the documentation, so it shouldn't be an issue.

Keeping A Local Cache Of Data

While data may be relatively quick and easy to access on the fly via the market API for live trading, even small delays become a serious slowdown when running batches of back-testing across large time periods or multiple trading symbols. As such, it's best to keep a local cache of data to work with. This also allows you to create consistent data samples to design and verify your algorithms against.

There are many different storage solutions available, and in most cases it will come down to what you’re most familiar with. But, we’ll explore some of the options anyway.

No Traditional RDB For Financial Data Please

Financial data is time-series, meaning that each attribute is indexed by its associated time-stamp. Depending on the volume of data-points, traditional relational databases can quickly become impractical, as in many cases it is best to treat each data column as a list rather than the database as a collection of separate records.

On top of this, a database manager can add a lot of unnecessary overhead and complexity to a simple project that will have limited scaling requirements. Sure, if you're planning to make a backend data storage solution which will be constantly queried by dozens of trading bots for large sets of data, you'll probably want a fully specialised time-series database.

However, in most cases you’ll be able to get away with simply storing the data in CSV files — at least initially.

Cutting Down Dev Time By Using CSVs

(Code Snippet to download and store OHLCV data into a CSV) https://gist.github.com/yoshyoshi/5a35a23ac263747eabc70906fd037ff3

The use of CSVs, or another simple format, significantly cuts down on the usage of a key resource — development time. Unless you know that you will absolutely need a higher speed storage solution in the future, it's better to keep the project as simple as possible. You're unlikely to be using enough data to make local storage speed much of an issue.

Even an SQL database can easily handle the storage and querying of hundreds of thousands of lines of data. To put that in perspective, 500k lines is equivalent to the 1 minute bars for a symbol between June 2013 and June 2018 (depending on trading hours). A well optimized system which only pulls and processes the necessary data will have no problem with overheads, meaning that any storage solution should be fine — whether that be an SQL database, NoSQL, or a collection of CSV files in a folder.

Additionally, it isn’t infeasible to store the full working dataset in RAM while in use. The 500k lines of OHLCV data used just over 700MB of RAM when serialized into lists (Tested in Python with data from the Alpaca client mentioned earlier).

When it comes to the building blocks of a piece of software, it's best to keep everything as simple and efficient as possible, while keeping the components suitably modular so they may be adjusted in future if the design specification of the project changes.

By Matthew Tweed

/

So You Want to Trade Crypto - Hedging with Cryptocurrency and correlation structure (Part 6)

As a new asset class with historically low correlation to traditional financial products, many see Cryptocurrencies as a useful hedging tool against global downturns. However, the structure of Crypto volatility and correlation relative to market capitalization may prove somewhat detrimental to this use-case.

Photo by Tyler Milligan on Unsplash

A story of Volatility

(Raw data from coinmarketcap.com. These charts show the mean of the 60 day annualized volatility from 1st Jan 2017 to time of writing.)
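
For reference, a sketch of the statistic behind these charts, computed over a stand-in daily close series (crypto trades 365 days a year):

import numpy as np
import pandas as pd

close = pd.Series(np.random.normal(0, 1, 500).cumsum() + 100)  # stand-in daily closes
returns = close.pct_change()
vol_60d = returns.rolling(60).std() * np.sqrt(365)  # 60-day annualized volatility
print(vol_60d.mean())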

As within equity markets, we see a small decrease in volatility as the market cap of coins increases (albeit with a relatively low correlation). This can be likened to blue-chip stocks vs mid-caps, with the former providing greater stability due to their established dominance in their respective sectors.

Although market cap is a slightly misleading metric when applied to Cryptocurrencies, it at least implies a higher value for a coin - thus requiring more money to shift its direction dramatically. That being said, volatility has been higher across the board over the last couple of years as Crypto shifted from the post-2013 accumulation phase into the major bull run, finally pushing to record highs as we moved into the final phase of the bull run and the subsequent bear market entering 2018.

This structure of volatility allows Crypto portfolios and indexes to be constructed similarly to those of equities: high-cap only selection for reduced risk and volatility; mid-caps for higher risk and reward; or a more diversified index to try to capture a middle ground.

The Trend of Correlation

(Raw data from coinmarketcap.com. These charts show the mean of the 60 day Pearson's Correlation Coefficient against Bitcoin USD from 1st Jan 2017 to time of writing.)
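
A sketch of the rolling correlation measured here (stand-in daily return series):

import numpy as np
import pandas as pd

btc_returns = pd.Series(np.random.normal(0, 0.04, 500))  # stand-in daily returns
alt_returns = 0.8 * btc_returns + np.random.normal(0, 0.02, 500)

corr_60d = alt_returns.rolling(60).corr(btc_returns)  # 60-day Pearson correlation
print(corr_60d.mean(), corr_60d.abs().mean())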

Here we see nearly zero correlation between the market capitalization of a coin and its average correlation to Bitcoin (the historical leader of the Cryptocurrency space).

While this disproves the theory that high-cap Cryptos hold a closer correlation to Bitcoin, it highlights the extremely high levels of correlation present throughout the market. This, as mentioned in previous posts, is likely due to the highly speculative and sentiment-driven nature of the market, along with its relative immaturity compared to more traditional traded assets.

Interestingly, there isn’t much difference between the mean of correlation and the mean of absolute (positive only) correlation, meaning that we rarely see any negative correlation between ALT/USD pairs and BTC/USD.

Cryptocurrency as an Asset Class for hedging

Crypto holds the useful property of historically low correlation to other asset classes, such as equities and commodities, suggesting it could be a good hedge against external global factors. However, there are two main issues with this plan: Cryptocurrency has never weathered a global financial crisis, and there is high internal correlation within the Crypto space.

Since Bitcoin, and the rest of the Cryptocurrency market, has been experiencing its own market cycles due to its rapid growth over the past few years, any fluctuations due to correlation with equity markets have been almost unnoticeable - leading many to speculate that Cryptocurrency would continue this trend and make a good hedging tool against global downturns.

This observation happens to come on the back of a decade of huge growth in both US and global equity markets. Investors have become increasingly complacent in their gains over the past few years, and are happy to take greater and greater risks, betting money on more speculative assets such as Cryptocurrencies. However, such high yield assets are always the first to tumble at the onset of a recession, as investors scramble to claw back their risk while their other positions drop.

Always "Different This Time"

Many will claim that it's somehow “different this time” - it always is, until the inevitable pullback. This was true of the dot-com bubble, and I wouldn't be surprised if the same holds true for Cryptocurrency during a global dip. That's not to say Cryptocurrencies won't be successful long term - the internet didn't exactly disappear after 2000. But they should be approached with the same caution as any other high risk investment.

As alluded to in the first half of the article, the levels of volatility and correlation in Cryptocurrency make it difficult to create a well diversified portfolio - no matter what you pick, you're still at the mercy of Bitcoin and can incur the same volatility spikes and drawdowns.

While it may be possible to hedge a portfolio by shorting Bitcoin itself and creating synthetic ALT/BTC pairs, this won't eliminate the sensitivity of low-to-mid cap coins to shifts in market sentiment, so it would have to be more actively managed.

All-in-all, Cryptocurrencies provide an interesting new opportunity for traders and investors alike - with high risk but much higher reward possibilities. They will not be a miracle financial product, nor a get-rich-quick scheme - but they can provide something truly new and different for those who have the time to understand and appreciate them.

By Matthew Tweed

/