Algo Trading News Headlines 9/10/2018

Is a Slang programming job at Goldman Sachs a technology career-killer?


If you work for Goldman Sachs, Slang is more than just the signifier of an informal word or phrase: it’s the bank’s very own programming language, devised by some of its most brilliant strats over two decades ago. However, depending upon who you speak to, Slang (short for Securities Language) is also a reason why working for the firm is a) incredibly interesting or b) a very quick way of ensuring you will never work anywhere but GS ever again.

Photo by  Jennifer Burk  on  Unsplash

Machine earning: how tech is shaking up bank market-making


This requires human traders to tell quants and technologists how they do their job, which is a tense balancing act. “Clearly, there is going to be the feeling of, ‘Well, you’re just automating everything I’m doing, so what’s in it for me?’” says Ezra Nahum at Goldman Sachs.

You want a machine learning job in finance? They might be less exciting than you think


For starters it isn’t actually clear what machine learning actually is. The term conjures up images of artificially intelligent cyborgs poring over streams of financial data, coming up with novel trading strategies which they then test and modify — all without any human supervision.

What Can The Crypto Robots Offer Us?


Cryptos trading robots utilize our free time and let us think on a passive income that is quite rewarding for a good future. On the contrary, when we are too busy with our own work, it can free up our time and invest on our behalf.

JP Morgan’s top quant warns next crisis to have flash crashes and social unrest not seen in 50 years


The trillion-dollar shift to passive investments, computerized trading strategies and electronic trading desks will exacerbate sudden, severe stock drops, Kolanovic said.


Best Starting Kits for Algo Trading with C#

Today, the world is shifting toward automation in manufacturing, cars, marketing and logistics. Personal investment is no exception. At Alpaca, we are pushing this boundary forward so everyone can enjoy the automated investment world.

Photo by  Nikhil Mitra  on  Unsplash

List of .NET/C# Algo Trading Systems

When it comes to algo trading and automated investment, Python is one of the biggest players in the space, but many experts also use .NET/C# for its high performance and robustness. We did some research on the toolsets you might look at to start algo trading, and wanted to share this list with you.

Overall, the ecosystem has grown a great deal lately, and many open source projects and tools are available to you at low cost, without much equipment.

  • QuantConnect

QuantConnect is one of the most popular online backtesting and live trading services, where you can learn, experiment with your trading strategy, and run it against the real-time market. The platform is engineered mainly in C#, with additional language coverage such as Python.

  • WealthLab

WealthLab is another C# platform where you can get real-time prices and run your algorithm, if you have a Fidelity account.

  • NinjaTrader and MultiCharts

NinjaTrader and MultiCharts are also popular choices for different kinds of assets, with various broker options.

  • OpenSource Projects

In addition to these, StockSharp is an interesting open source project tailored for .NET algo traders and broker integrations.

You should also check out Lean, an open source library developed by QuantConnect, which also uses it for its flagship service and supports multiple assets such as stocks and cryptocurrencies.

List of Data Libraries

  • Deedle

Deedle is probably one of the most useful libraries when it comes to algo trading. You would typically run calculations on its Frame type and compare data to get signals.

  • TALibraryInCSharp

TALibraryInCSharp is a great open source library that bridges TA-Lib and the .NET world, so that you can calculate common indicators such as moving averages and RSI. Combining these libraries gives you a powerful set of trading tools.

  • IEX

The next question is what data to calculate those signals on. If you are talking about US equities, you can leverage IEX’s free data API, and there are libraries like IEXTradingApi that make it easy to get the data instantly.

  • Others

There are also quite a few .NET libraries out there for proprietary data sources (e.g. for Quandl), so you should check them out.

Announcing Alpaca’s Official .NET Client SDK

Don’t forget about Alpaca! We are committed to providing the best experiences for many algo traders, and today we are happy to announce that our official .NET client SDK for Alpaca Trade API has been released.

Following our Python SDK, the .NET SDK takes advantage of the platform’s robustness and high performance, as well as its wide platform coverage. It is an open source project hosted on GitHub, and the prebuilt package is available on NuGet. All the classes and methods are documented for IntelliSense, so you can get the references right in your IDE.

Here is a snippet of how easily you can place a buy order for one share of Apple.

Alpaca Trade API covers not only retrieving account information and submitting orders, but also retrieving price and fundamentals information easily. For more details on the API, please read our online documentation.

Happy algo trading!


Algo Trading News Headlines 8/27/2018

Trading Places — Lines Blurred Between Traders And Programmers


The recent WSJ article focused upon details from Adam Korn, a 16-year veteran at Goldman. He stated that success today depends less on trusting one’s gut, rather much of a trader’s job is embedded in the computer code or algorithms, which do much of the work now.

What is the real story though, what has all this computerized algorithmic trading truly done, how much value has it truly created? One question I would like to ask, is there a correlation between the explosion of our debt levels and this newly digitized financial age?

Photo by  Phil Botha  on  Unsplash


Major Russian Airline Tests Blockchain in Bid to Track Fuel Payments


According to S7, the application shares data about fuel demand on a shared ledger, a copy of which is managed by each of the three parties. Further, payments for the fuel can be conducted on the network, with digital invoices created via smart contract during each transaction.

Python Notebook Research to Replicate ETF Using Free Data


ETF is one of the great investment products in the last decade, and it has allowed so many people to gain the exposure to the wide range of assets easily at low cost. It is easy to buy a share of ETF without knowing what’s in there, but as a tech-savvy guy yourself, you may wonder how it works. By reconstructing the fund yourself, you may even come up with something better.

Trading Lesson: Don’t Touch That Dial. More to Come, Hedge Your Bets


In my 30 years as a trader, I’ve never seen a market like this. If the bots remain faithful to their programs, we are still likely to see higher stock prices over the next few weeks to months.

STEROID Launches New Automated Cryptocurrency Trading Algorithm


Algorithms run our online world, for the most part, a majority of everything done online is associated with an algorithm in one way or another. It only makes sense therefore that they would be used in the financial world as well. That is why STEROID has been developed, to create a functioning opportunity for traders on crypto exchanges.

Here’s how artificial intelligence can be used to beat the market


CNBC’s Bob Pisani is joined by Sam Masucci, ETF Managers Group CEO, to discuss how he’s using an AI program to pick stocks.


Python Notebook Research to Replicate ETF Using Free Data

ETF is one of the great investment products in the last decade, and it has allowed so many people to gain the exposure to the wide range of assets easily at low cost. It is easy to buy a share of ETF without knowing what’s in there, but as a tech-savvy guy yourself, you may wonder how it works. By reconstructing the fund yourself, you may even come up with something better.

In this article, we present a basis for you to start your own research in Python and study the ETF world. You can find the complete notebook on GitHub.

Photo by  Kevin Ku  on  Unsplash


What is an ETF, by the way?

ETF stands for Exchange-Traded Fund. Unlike other types of funds, its shares are traded on exchanges like an individual company’s common stock. The fund is managed by an ETF company, which runs the portfolio according to a stated strategy, often diversifying exposure across the market.

One of the most popular ETFs is SPY, which tracks the performance of the S&P 500 index. Because ETFs make it convenient to manage risk, they are used not only by individual investors but also by robo-advisors to construct portfolios. The convenience doesn’t come for free, of course: there is an associated cost called the expense ratio, which varies from one ETF to another.

An ETF’s return comes from the returns of the underlying assets it holds. ETFs can hold not just individual stocks but also options and swaps, but a market index ETF like SPY holds a simple long-only portfolio.

If the constituents are simply long-only stocks, is it easy to run a simulation, even in Python? And if it’s possible to build your own ETF-like portfolio, could you avoid paying the ETF’s costs altogether? The answer is YES.

Recreating an ETF

Various services provide ETF constituent data through their websites or APIs, some free and some paid. Some even provide historical data. We recommend finding the service that suits you best, but here we automate the process with Selenium to save you from copying and pasting the list of underlying stocks of a particular ETF.

get_etf_holdings() returns the list of constituents as a pandas DataFrame whose columns include each stock’s weight in the portfolio and the actual number of shares held as of today.

Note this does not come with the price data, but you can pull the historical price data from IEX API for free.

get_closes() will take the constituent data from get_etf_holdings() and return the daily closing price history for the last month from IEX API.

Simulate SPY performance

Before doing something unique, let’s check whether our assumption is correct. The task here is to calculate the historical performance of the reconstructed portfolio and compare it with the actual ETF.

Remember that the constituent list we have is as of today. The fund may have rebalanced since, but we assume that’s not the case and build our portfolio as of a month ago. Putting it all together, we get something like this.


Even though we took the constituent data as of today and applied it to simulate the last month, the result isn’t too different. This suggests the ETF hasn’t changed its holdings significantly.
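
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook’s actual code: the tiny holdings and price frames are made-up stand-ins for what get_etf_holdings() and get_closes() return, and the column names (`symbol`, `shares`) and the helpers `portfolio_value` and `normalize` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for get_etf_holdings(): symbol and shares held today.
holdings = pd.DataFrame(
    {"symbol": ["AAA", "BBB"], "shares": [10, 20]}
).set_index("symbol")

# Hypothetical stand-in for get_closes(): daily closes, one column per symbol.
closes = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.0], "BBB": [50.0, 49.0, 51.0]},
    index=pd.to_datetime(["2018-08-01", "2018-08-02", "2018-08-03"]),
)

def portfolio_value(closes, shares):
    """Daily market value of a fixed-share, long-only portfolio."""
    # Align shares to the price columns, multiply, then sum across symbols.
    return closes.mul(shares.reindex(closes.columns), axis=1).sum(axis=1)

def normalize(series):
    """Rebase a series so that its first day equals 1.0."""
    return series / series.iloc[0]

rebuilt = normalize(portfolio_value(closes, holdings["shares"]))
# Made-up ETF closes to compare the rebuilt portfolio against.
etf = normalize(pd.Series([200.0, 200.4, 202.0], index=closes.index))
tracking_gap = (rebuilt - etf).abs().max()
print(rebuilt.round(4).tolist(), round(tracking_gap, 4))
```

Rebasing both series to 1.0 on the first day makes the two curves directly comparable on one chart, regardless of the fund’s size.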

So, I don’t need to buy the ETF, and can just buy these stocks?

It’s natural to ask whether you can replicate the ETF portfolio by buying only the underlying stocks.

Yes, you can, but only if you have more than $260,000,000,000 ($260BN), which is SPY’s market cap today. You probably don’t have that, so let’s see how it changes if you do so with $10K. The resulting portfolio we get after some calculation is as below.

The actual total market value of this portfolio is about $2K. It diverges from the original target because you don’t buy fractional shares: all fractions are truncated, resulting in a much smaller total. On the flip side, we found that we can build something similar to SPY with a smaller amount of money. Running the same historical plot, we get this.


The divergence is much bigger than in the first case, and the volatility has increased, but the return is not too bad. As a study, it is great to see a concrete example showing that a more diversified portfolio has less volatility, as modern portfolio theory teaches.
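
To make the truncation effect concrete, here is a small sketch of sizing a portfolio by weight and then rounding every position down to whole shares. The weights, prices, and the helper `whole_share_portfolio` are hypothetical and only illustrate the mechanism; they are not taken from the notebook.

```python
import numpy as np
import pandas as pd

def whole_share_portfolio(weights, prices, cash):
    """Allocate cash by weight, then truncate each position to whole shares."""
    target_dollars = weights * cash
    shares = np.floor(target_dollars / prices).astype(int)
    invested = float((shares * prices).sum())
    return shares, invested

# Made-up three-stock "fund": weights sum to 1, prices are per share.
weights = pd.Series({"AAA": 0.6, "BBB": 0.3, "CCC": 0.1})
prices = pd.Series({"AAA": 1900.0, "BBB": 400.0, "CCC": 1200.0})

shares, invested = whole_share_portfolio(weights, prices, cash=10_000)
print(shares.to_dict(), invested)
```

In this toy case the low-weight, high-priced stock drops out entirely because its target allocation is less than one share, which is the same mechanism that shrinks and distorts the $10K version of the portfolio.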

Summary, and now what?

We presented some Python research, with an actual notebook, to study how ETFs work, and ran some simple experiments. You can look at the complete notebook here.

You can try it in your own environment! We recommend cloning the notebook and extending the study for your own purposes. Potential questions you may ask are:

  • what if the cash size is bigger, or smaller?
  • how about other index ETFs such as QQQ?
  • how many dollars do you need to hold at least one share of each stock?
  • can you replicate the return more precisely by rebalancing frequently?
  • can you build something similar using another set of stocks?

Research is always fun, and you should continue asking these questions. It is great that this kind of research can be done in a day with only your laptop.

We leave it to readers to decide what to do from here, but please let us know what you find, either in the comments or on Twitter @AlpacaHQ! We hope you will leverage the technology to automate your investments.


Algo Trading News Headlines 8/22/2018

Wall Street Erases the Line Between Its Jocks and Nerds


There used to be a strict hierarchy: Traders made money and won glory while programmers wrote code and stayed out of sight. Those days are over. Meet the straders. Part risk-taking trader and part computer-whiz “strategist,” they are prowling the halls at Goldman Sachs Group Inc., erasing a once-religious line between the jocks and the nerds.

Photo by  FOTOGRAFIA .GES  on  Unsplash

Israel’s Central Bank Wants Increased Regulation on Algorithmic Trading


Over 90% of the activities on the Tel Aviv Stock Exchange are algorithmically performed, according to a new research published by Israel’s central bank last week. While the number of algorithmic automated high-frequency trading (HFT) activities on the exchange is higher than most stock exchanges, the actual rate of transactions they execute, between 23% and 35%, is significantly lower than global standards, the report said.

Global Algorithmic Trading Market 2018–2025 by Business Players: Virtu Financial, KCG, DRW Trading, Optiver


Geographically, the global Algorithmic Trading market is designed for the following regional markets: USA, EU, Europe, China, Japan, Southeast Asia, India. It also studies the revenue market status, analysis of main manufacturers. It deciphers the Sales Price and Gross Margin Analysis and Global Sales Price Growth Rate, Marketing Trader or Distributor Analysis. The role of traders and distributors is emphasized in this research. The complete analysis of Algorithmic Trading Market on the global scale provides key details in form of graphs, statistics and tables which will help the market players in making key business decisions.

Wall Street Finds Limits with Current AI Applications


“A lot of work needs to be done to translate (AI) advancements into benefits for finance,” said Ambika Sukla, executive director of machine learning and AI at Morgan Stanley, at an AI conference Tuesday. “As we work on some of these new models, it’s important to proceed carefully and have a human in the loop.”

Machine Trading: Deploying Computer Algorithms to Conquer the Markets


Algorithmic trading is booming, and the theories, tools, technologies, and the markets themselves are evolving at a rapid pace. This book gets you up to speed, and walks you through the process of developing your own proprietary trading operation using the latest tools.

Comparing 3 Different Types of Neural Network Architectures in Finance


When working on a machine learning task, the network architecture and the training method are the two key factors to turning a set of data-points into a functional model. But where should different training methods be applied? How do they work? And which is “best”? In this post, we list up three types of training methods and make comparisons among Supervised, Unsupervised and Reinforcement Learning.

Coinscious Introduces Crypto Prediction Machine Built to Synergize AI and the Blockchain


Every day, people spend significant amounts of time reading financial news and checking cryptocurrency prices, gathering information that they hope can lead to better decisions. However, the quantity of information available is overwhelming and calls for effective tools and suitable methodologies to enable us to distill data into actionable insights. By processing large data sets quickly, machine learning algorithms can use news sources such as the Financial Times, The Washington Post, or Twitter to provide key insights.


Forecasting Market Movements Using Tensorflow - Intro into Machine Learning for Finance (Part 2)

Multi-Layer Perceptron for Classification

Is it possible to create a neural network for predicting daily market movements from a set of standard trading indicators?

In this post we’ll be looking at a simple model using Tensorflow to create a framework for testing and development, along with some preliminary results and suggested improvements.

Photo by  jesse orrico  on  Unsplash


The ML Task and Input Features

To keep the basic design simple, it’s set up for a binary classification task: predicting whether the next day’s close will be higher or lower than the current one, which corresponds to a prediction to go either long or short for the next time period. In practice, this could be applied to a bot which calculates and executes a set of positions at the start of a trading day to capture the day’s movement.

The model is currently using 4 input features (again, for simplicity): the 15-day and 50-day RSI and the 14-day Stochastic K and D.

These were chosen due to the indicators being normalized between 0 and 100, meaning that the underlying price of the asset is of no concern to the model, allowing for greater generalization.

While it would be possible to train the model against any number of other trading indicators or otherwise, I’d recommend sticking to those that are either normalized by design or could be modified to be price or volatility normalized. Otherwise a single model is unlikely to work on a range of stocks.
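
As a rough illustration of these four features, the sketch below computes them with plain pandas rather than TA-Lib. It uses simplified variants (a simple-moving-average RSI and a fast stochastic with a 3-period %D), so the values will not match TA-Lib’s output exactly. The function names, the column names, and the synthetic price series are all assumptions.

```python
import numpy as np
import pandas as pd

def rsi(close, period):
    """Simple-moving-average RSI, bounded between 0 and 100."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(period).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(period).mean()
    # Equivalent to 100 - 100 / (1 + avg_gain / avg_loss).
    return 100 * avg_gain / (avg_gain + avg_loss)

def stochastic(high, low, close, k_period=14, d_period=3):
    """Fast stochastic %K and its moving average %D, both in [0, 100]."""
    lowest = low.rolling(k_period).min()
    highest = high.rolling(k_period).max()
    k = 100 * (close - lowest) / (highest - lowest)
    return k, k.rolling(d_period).mean()

# Synthetic daily candles, just to exercise the functions.
days = np.arange(80)
close = pd.Series(100 + 5 * np.sin(days / 5.0) + 0.1 * days)
high, low = close + 1.0, close - 1.0

k, d = stochastic(high, low, close)
features = pd.DataFrame(
    {"rsi_15": rsi(close, 15), "rsi_50": rsi(close, 50), "stoch_k": k, "stoch_d": d}
).dropna()  # drop the warm-up rows where the 50-day RSI is undefined
print(features.tail(3).round(2))
```

Because every column stays between 0 and 100 whatever the price level, the same feature frame layout can be reused across stocks.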

Dataset Generation

(Code Snippet of a dataset generation example — full script at end of this post)


The dataset generation and neural network scripts have been split into two distinct modules, both to allow easier modification and so that the full datasets are re-generated only when necessary, as that takes a long time.

Currently the generator script is set up with a list of S&P 500 stocks; it downloads their daily candles since 2015 and processes them into the required trading indicators, which are used as the input features of the model.

Everything is then split into a set of training data (Jan 2015 to June 2017) and evaluation data (June 2017 to June 2018) and written as CSVs to “train” and “eval” folders in the directory the script was run from.

These files can then be read on demand by the ML script to train and evaluate the model without the need to re-download and process any more data.
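
The split-and-save step might look something like the following sketch. The helper `split_and_save`, the one-file-per-symbol layout, and the exact cut-off day of 1 June 2017 are assumptions; the post only states the approximate date ranges and the folder names.

```python
import os
import tempfile

import pandas as pd

def split_and_save(features, symbol, out_dir, split_date="2017-06-01"):
    """Write rows before split_date to train/ and the rest to eval/."""
    cutoff = pd.Timestamp(split_date)
    train = features[features.index < cutoff]
    evaluation = features[features.index >= cutoff]
    for folder, part in (("train", train), ("eval", evaluation)):
        path = os.path.join(out_dir, folder)
        os.makedirs(path, exist_ok=True)
        part.to_csv(os.path.join(path, f"{symbol}.csv"))
    return len(train), len(evaluation)

# Five made-up rows straddling the cut-off date.
index = pd.date_range("2017-05-30", periods=5, freq="D")
frame = pd.DataFrame({"rsi_15": [50.0, 55.0, 60.0, 58.0, 52.0]}, index=index)

with tempfile.TemporaryDirectory() as tmp:
    n_train, n_eval = split_and_save(frame, "AAA", tmp)
    print(n_train, n_eval, sorted(os.listdir(tmp)))
```

Splitting by date rather than at random keeps the evaluation data strictly later than the training data, which is what you want when the model will be used to predict the future.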

Model Training

(Code Snippet of model training — full script at end of this post)


At start-up, the script reads all the CSV files in the “train” and “eval” folders into arrays of data for use throughout the training process. With such a small dataset, the RAM requirements will be low enough not to warrant extra complexity. But, for a significantly larger dataset, this would have to be updated to only read a sample of the full data at a time, rotating the data held in memory every few thousand training steps. This would, however, come at the cost of greater disk IO, slowing down training.

The neural network itself is also extremely small, as testing showed that with larger networks, evaluation accuracies tended to diverge quickly.


The network’s “long” and “short” outputs are used as a binary predictor, with the output that has the highest confidence value being used as the model’s prediction for the coming day.

The “dense” layers within the architecture mean that each neuron is connected to the outputs of all the neurons in the layer below. These neurons are the same as described in “Intro into Machine Learning for Finance (Part 1)”, and use tanh as the activation function, which is a common choice for a small neural network.
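
To show what such a network computes, here is a forward pass written in plain NumPy rather than TensorFlow. It is only a sketch with random, untrained weights: the layer sizes follow the (40,30,20,10) layout of Model 1, while the softmax on the two outputs, the scaling of the inputs to the 0 to 1 range, and the helper names are assumptions.

```python
import numpy as np

def init_layers(sizes, seed=0):
    """Random weights and zero biases for each pair of adjacent layer sizes."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(x, layers):
    """Dense tanh hidden layers followed by a softmax over (long, short)."""
    for weights, bias in layers[:-1]:
        # "Dense": every neuron sees every output of the previous layer.
        x = np.tanh(x @ weights + bias)
    weights, bias = layers[-1]
    logits = x @ weights + bias
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

# 4 inputs -> hidden layers of 40, 30, 20, 10 neurons -> 2 outputs.
layers = init_layers([4, 40, 30, 20, 10, 2])
# One sample: RSI 15, RSI 50, Stochastic K, Stochastic D, scaled to 0..1.
batch = np.array([[55.0, 48.0, 80.0, 75.0]]) / 100.0
probs = forward(batch, layers)
prediction = ["long", "short"][int(probs.argmax(axis=1)[0])]
print(probs.round(3), prediction)
```

Training would adjust the weights so that the larger of the two output values matches the label more often than not; the prediction rule itself is just the argmax shown on the last lines.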

Some types of data and networks can work better with different activation functions, such as ReLU or ELU for deeper networks. ReLU (Rectified Linear Unit) attempts to solve the vanishing gradient problem in deeper architectures, and ELU is a variation on this intended to make training yet more efficient.
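
For reference, the three activation functions mentioned can be written in a few lines of NumPy. These are the standard textbook definitions (with ELU’s alpha parameter set to 1.0 as an assumed default), not code from the ML script.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: passes positives through, zeroes out negatives."""
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    """Exponential Linear Unit: like ReLU, but smooth and negative below zero."""
    # np.minimum keeps exp() from overflowing on large positive inputs.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

x = np.array([-2.0, 0.0, 2.0])
print("tanh:", np.tanh(x).round(3))
print("relu:", relu(x))
print("elu: ", elu(x).round(3))
```

tanh flattens out for large inputs in either direction, so its gradient shrinks toward zero there, whereas ReLU and ELU keep a gradient of 1 for any positive input.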


As well as displaying prediction accuracy stats in the terminal every 1000 training steps, the ML script is also set up to record summaries for use with TensorBoard, which makes graphing the training process much easier.

While I haven’t included anything other than scalar summaries, it’s possible to record everything from histograms of the node weightings to sample images or audio from the training data.

To use TensorBoard with the saved summaries, simply set the --logdir flag to the directory you’re running the ML script in. You then open the browser of your choice and enter “localhost:6006” into the address bar. All being well, you now have a set of auto-updating charts.

Training results

Node layouts: Model 1 (40,30,20,10), Model 2 (80,60,40,20), Model 3 (160,120,80,40)


The results were, as expected, less than spectacular due to the simplicity of the example design and its input features.

We can see clear overfitting, as the loss/error increases against the evaluation dataset for all tests, especially so on the larger networks. This means that the network is only learning the pattern of the specific training samples, rather than a more generalized model. On top of this, the training accuracies aren’t amazingly high, only achieving a few percent above completely random guesses.

Suggestions for Modification and Improvement

The example code provides a nice model that can be played around with to help understand how everything works, but it serves more as a starting framework than a working model for prediction. As such, here are a few suggestions for improvements that you might want to make and ideas you could test.

Input features

In its current state, the dataset is generated with only 4 input features and the model only looks at one point in time. This severely limits what you can expect it to be able to learn — would you be able to trade only looking at a few indicator values for one day in isolation?

First, you could modify the dataset generation script to calculate more trading indicators and save them to the CSV. TA-Lib has a wide range of functions which can be found here.

I recommend sticking to normalized indicators, similar to Stoch and RSI, as this takes the relative price of the asset out of the equation, so that the model can be generalized across a range of stocks rather than needing a different model for each.

Next, you could modify the ML script to read the last 10 data periods as the input at each time step, rather than just the one. This allows it to start learning more complex convergence and divergence patterns in the oscillators over time.
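
One way to build such multi-period inputs is to flatten a sliding window of rows into a single input vector per time step. The helper `make_windows` below is a hypothetical sketch of that idea and is not part of the original script.

```python
import numpy as np

def make_windows(features, window=10):
    """Flatten the last `window` rows of features into one input per step."""
    n_rows = features.shape[0]
    windows = [
        features[end - window:end].ravel()
        for end in range(window, n_rows + 1)
    ]
    return np.array(windows)

# 12 days x 4 indicators of placeholder data (values 0..47 for easy checking).
features = np.arange(48, dtype=float).reshape(12, 4)
inputs = make_windows(features, window=10)
print(inputs.shape)
```

Row i of the result covers days i to i + 9, so it should be paired with the label of day i + 9, and the network’s input layer grows from 4 values to 40.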

Network Architecture

As mentioned earlier, the network is tiny due to the lack of data and feature complexity of the example task. This will have to be altered to accommodate the extra data being fed by the added indicators.

The easiest way to do this would be to change the node layout variable to add extra layers or greater numbers of neurons per layer. You may also wish to experiment with types of layer other than fully connected. Convolutional layers are often used for pattern recognition tasks with images, so they could be interesting to test out on financial chart data.

Dataset labels

The dataset is labeled as “long” if the price difference is >= 0, otherwise “short”. However, you may wish to change the threshold to be equal to the median price change over the length of the data, to give a more balanced set of training data.

You may even wish to add a third category of “neutral” for days where the price stays within a limited range.

On top of this, the script also has the ability to vary the look ahead period for the increase or decrease in price. So it could be tested with a longer term prediction.
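
These labeling options can be sketched together in one function. The name `make_labels`, its parameters, and the rule that a day counts as “neutral” when the absolute price change is below a fixed band are assumptions made for illustration; the original script may implement them differently.

```python
import numpy as np
import pandas as pd

def make_labels(close, look_ahead=1, use_median=False, neutral_band=0.0):
    """Label each day by the price change over the next `look_ahead` days."""
    # The last `look_ahead` rows have no future price, so they are dropped.
    change = (close.shift(-look_ahead) - close).dropna()
    threshold = change.median() if use_median else 0.0
    labels = pd.Series(
        np.where(change >= threshold, "long", "short"),
        index=change.index,
        dtype=object,
    )
    # Optional third class for days where the price barely moves.
    labels[change.abs() < neutral_band] = "neutral"
    return labels

close = pd.Series([10.0, 11.0, 10.5, 10.5, 12.0])
print(make_labels(close).tolist())
print(make_labels(close, use_median=True).tolist())
print(make_labels(close, neutral_band=0.25).tolist())
```

With the zero threshold this toy series gives three “long” labels and one “short”; switching to the median threshold balances it to two of each.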


With the implementation of the suggested improvements, it is certainly possible to improve on the model to the point where it could be used as a complementary trading indicator to a standard rule-based strategy.

However, expectations should be tempered when it comes to such a simple architecture and training task. Machine learning can really set itself apart with a more refined network structure and prediction task.

As such, in the next article we’ll be looking at Supervised, Unsupervised and Reinforcement Learning, and how they can be used to create a time series predictor and to analyze relationships in data to help refine strategies.

Full Script

By Matthew Tweed


Easily Build a Stock Trading Bot Using Broker API

Visual Strategy Development

Visual strategy creation is an important part of quick and efficient development, as it allows you to easily debug and adjust ideas by looking at how signals develop and change with shifts in the market.

I find Python to be a good language for this type of data science, as the syntax is easy to understand and there is a wide range of tools and libraries to help you in your development. On top of this, the Alpaca Python API gives us an easy way to integrate market data without having to implement a new API wrapper.

*Disclaimer: As of today (July 27th 2018), Alpaca Trading API can be used only by invited beta users who opened accounts with Alpaca Securities.

For data processing and plotting, I recommend using TA-Lib and Matplotlib. TA-Lib provides a nice library to calculate common market indicators, so that you don’t have to reimplement them yourself, while Matplotlib is a simple yet powerful plotting tool which will serve you well for all types of data visualization.

Here’s a code snippet of an example framework script I put together (full scripts at the end of this section).

(Code Snippet of an example trade visualizer script I put together— full script at end of this section)


The script adds a simple moving average cross strategy against a few different trading symbols to give a small sample of how it might fare in live trading. This allows for a first sanity check for a new strategy’s signals. Once a strategy has passed visual inspection you can run it through a backtesting tool, such as the one discussed in the “Algo Trading for Dummies” series.
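
For readers without the full script to hand, the core of a moving average cross signal can be sketched with pandas alone. The window lengths, the column names, and the helper `ma_cross_signals` are illustrative assumptions, and the prices are made up rather than pulled from the Alpaca API.

```python
import pandas as pd

def ma_cross_signals(close, fast=2, slow=4):
    """Flag the days on which the fast average crosses the slow one."""
    fast_ma = close.rolling(fast).mean()
    slow_ma = close.rolling(slow).mean()
    above = fast_ma > slow_ma
    was_above = above.shift(1, fill_value=False)
    # Only count a cross once both today's and yesterday's averages exist.
    valid = slow_ma.notna() & slow_ma.shift(1).notna()
    return pd.DataFrame({
        "fast": fast_ma,
        "slow": slow_ma,
        "buy": above & ~was_above & valid,    # fast crosses above slow
        "sell": ~above & was_above & valid,   # fast crosses below slow
    })

close = pd.Series(
    [10, 10, 10, 10, 10, 9, 8, 9, 11, 13, 14, 12, 9, 7], dtype=float
)
signals = ma_cross_signals(close)
print(signals.index[signals["buy"]].tolist(), signals.index[signals["sell"]].tolist())
```

The rows where `buy` or `sell` is true are the points you would mark on the price chart when inspecting the strategy visually.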

You may even wish to add visual markers to each simulated trade and, for a more advanced strategy, the indicators the signal was derived from. This can make it even easier to analyze the weaknesses of a signal set so that you can adjust its parameters.

Simple Trading Bot

Once you’ve moved past the backtesting stage, you’ll need a simple trading framework to integrate your strategies for live testing. This can then be run on a paper trading account to test the signals against a live data feed.

This is an important step in development, as it tests whether the strategy has been over-fit to its dataset. For example, a strategy could easily be tuned to perfectly trade a specific symbol over a backtesting period. However, this is unlikely to generalize well to other markets or different time periods — leading to ineffective signals and losses.

As such, you’ll want a simple way to test your strategies in a staging environment, before committing any money to them with a real trading account. This is both for testing the strategy and the implementation, as a small bug in your code could be enough to wipe out an account, if left unchecked.

Here’s another example snippet of a trading bot which implements the moving average cross strategy (full script at end of this section).

(Code Snippet of a trading bot which implements the moving average cross strategy — full script at end of this section)


To make this into a full trading bot you could choose to either add a timed loop to the code itself or have the whole script run on a periodic schedule. The latter is often a better choice, as an exception causing an unexpected crash would completely stop the trading bot if it were a self-contained loop, whereas a scheduled task would have no such issue, as each polling step is a separate instance of the script.

On top of this, you’ll probably want to implement a logging system, so that you can easily monitor the bot and identify any bugs as it runs. This could be achieved by adding a function to write a text file with any relevant information at the end of each process.
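
A logging function of the kind described can be very small. The sketch below appends one timestamped line per run to a text file; the function name `log_run`, the file name, and the message format are assumptions.

```python
import datetime
import os
import tempfile

def log_run(path, message):
    """Append one timestamped line describing this run of the bot."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{stamp} {message}\n")

# Example: each scheduled run of the script adds its own line to the file.
log_path = os.path.join(tempfile.gettempdir(), "trading_bot_example.log")
log_run(log_path, "checked AAPL: no signal")
log_run(log_path, "checked MSFT: buy signal, order submitted")
```

Opening the file in append mode means separate scheduled runs of the script build up a single history, which fits the run-on-a-schedule design better than a log held in memory.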

Once you have a working strategy, the Alpaca API should make it easy to expand your trading bot into a full production system, allowing you to start trading quickly.

By Matthew Tweed


Algo Trading News Headlines 7/27/2018

INSIDE SCOOP One Market-Beating Quant Firm Was Buying Facebook Last Quarter


By their nature, quant firms — known for their rapid trading — may hold a stock only for as long as it takes to eat a candy bar. Yet a look at how Acadian’s positions have changed over the course of a quarter offers insights into the longer-term strategy of the market-beating money manager.


The Problem With Algo Trading


Everywhere you look there are robots taking over. Even in markets! With the rise of quant trading, AI, algos, and everything else, is there even room for us human traders?

Algo Trading for Dummies — Implementing an Actual Trading Strategy


While most strategies that are successful long term are based on a mix of technical and fundamental factors, the fundamental behaviors which are exploited are often very nuanced and vary hugely, so it’s hard to generalize for an article. As such, we’ll be focusing more on the tools and methods for making strategies based on technical analysis.

Automated Trading Industry — Leading Key Companies


Some of the key players influencing the market are Citadel LLC, KCG Holdings, Virtu Financial., Trading Technologies International, Inc., InfoReach, Inc., Tethys Technology, Inc., Lime Brokerage LLC, FlexTrade Systems, Inc., Tower Research Capital LLC and Hudson River Trading LLC among others.

This company changes the DNA of investing — through machine learning


Now that it is possible to comb through the behavior patterns of millions of traders and use algorithms to understand how they think, social trading networks, such as eToro, add another dimension of information to this process.


Algo Trading for Dummies  -  Implementing an Actual Trading Strategy (Part 4)

Strategy Development and Implementation

While most strategies that are successful long term are based on a mix of technical and fundamental factors, the fundamental behaviors which are exploited are often very nuanced and vary hugely, so it’s hard to generalize for an article. As such, we’ll be focusing more on the tools and methods for making strategies based on technical analysis.


Visual Strategy Creation and Refinement

There are many great financial charting tools available, with various different specialties, my personal favourite free option being


One of the most useful features for strategy creation is its simple scripting language to create both trading indicators and back-testable strategies. While the back-testing tool is rather limited in its functionality, it serves as a good first step sanity check.

Simple creation of trading indicators which are then overlaid directly onto the chart allows for rapid testing and debugging of ideas, as it's much quicker to create a script and visually check it against the market than to fully implement and back-test it.

This rapid development process is a good first step to making certain types of strategies, particularly for active trading strategies that act on single symbols at a time. However, it won’t do you any good for portfolio strategies or those which incorporate advanced hedging.

For that, you’ll want to create your own tools for visualising full back-tests with multiple trading pairs. This is where the logging features of your back-tester come in, with the end results plotted in your graphing tool of choice, such as matplotlib (for Python).


Full Back-tester Framework:

(Simple example of a multi-symbol back-tester based on position handler from  previous article  — full script at end of this post)

Various plots, such as scatter graphs or hierarchical clustering, can be used to efficiently display and contrast different variations of the back-tested strategy and allow fine tuning of parameters.

Implementing and Back-testing

One of the easiest traps to fall into with the design of any predictive system is over-fitting to your data. It’s easy to see amazing results in back-tests if a strategy has been trained to completely fit the testing data. However, the strategy will almost certainly fall at the first hurdle when tested against anything out of sample, so is useless.

Meanwhile, at the other end of the spectrum, it is also possible to create a system which is overgeneralised. For example, a strategy which is supposed to actively trade the S&P 500 could easily turn a profit long term by always signaling long. But that completely defeats the purpose of trying to create the bot in the first place.

The best practices for back-testing a system:

  1. Verify against out-of-sample data. If the strategy has been tuned against one set of data, it is obviously going to perform well against it. All back-tests should be performed against a different set of data, whether that be a different symbol in the same asset class or the same symbol over a different time sample.
  2. Verify all strategies against some kind of benchmark. For a portfolio strategy you’d want to compare risk-adjusted returns metrics. For an active trading strategy you can look at risk:reward and win rate.
  3. Sanity check any strategies that pass the back-test. Where possible, look back over the specific set of steps it takes to make any trading signals. Do they make logical sense? If this isn’t possible (for example with Machine Learning), plot a set of its signals for out of sample data. Do they appear consistent and reasonable?
  4. If the strategy has gotten this far, run live tests. Many platforms offer paper-trading accounts for strategy testing. If not, you may be able to adapt your back-testing tool to accept live market data.
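
Points 1 and 2 above can be sketched roughly as follows. This is only an illustration, assuming closing prices held in a plain list; the function names and the price series are hypothetical and not taken from the full script. It splits the data chronologically, then compares a strategy against a buy-and-hold benchmark on the out-of-sample half. Note how an "always long" strategy earns exactly the benchmark return, so it adds nothing.

```python
def split_in_out_of_sample(prices, in_sample_fraction=0.5):
    """Split a price series chronologically (time series must not be shuffled)."""
    cut = int(len(prices) * in_sample_fraction)
    return prices[:cut], prices[cut:]


def buy_and_hold_return(prices):
    """Return from buying at the first price and holding to the last."""
    return prices[-1] / prices[0] - 1.0


def strategy_return(prices, signals):
    """Compound return when signals[i] (1 = long, 0 = flat) is held
    over the move from prices[i] to prices[i + 1]."""
    equity = 1.0
    for i in range(len(prices) - 1):
        equity *= 1.0 + signals[i] * (prices[i + 1] / prices[i] - 1.0)
    return equity - 1.0


# Hypothetical closing prices, purely for illustration.
closes = [100.0, 102.0, 101.0, 105.0, 107.0, 104.0, 108.0, 110.0, 109.0, 112.0]

# Tune on the first half only; judge the strategy on the second half.
tune_set, test_set = split_in_out_of_sample(closes, in_sample_fraction=0.5)

always_long = [1] * (len(test_set) - 1)
excess = strategy_return(test_set, always_long) - buy_and_hold_return(test_set)
print(f"Excess return over benchmark: {excess:.4f}")
```

For a portfolio strategy you would swap the raw return for a risk-adjusted metric, but the shape of the comparison stays the same.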

Once you finally have a fully tested and working strategy which you are happy with, you can run it with small amounts of capital on a testing account. While the strategy may be perfect, there is always the possibility of bugs in the trading bot itself.

Final Thoughts

Creating any effective trading strategy is hard, especially so when you also have to deal with defining it in objective terms that can be converted into code. It can be especially frustrating when nothing seems to produce reliable results. However, sticking to good practices when it comes to the data science of back-testing and refining a strategy will pay off vs learning those same lessons when a strategy under-performs with real money.

By Matthew Tweed

Full back-tester framework:


Algo Trading for Dummies -  Building a Custom Back-tester (Part 3)

While there are many simple backtesting libraries available, they can be quite complex to use effectively — requiring a lot of extra processing of data sets. It is sometimes worth coding a custom back-tester to suit your needs.

Building a back-tester is a fantastic conceptual exercise. Not only does this give you a deeper insight into orders and their interaction with the market, but it can also provide the framework for the order handling module of your trading bot.

Order Handling

One of the key pieces to an active trading strategy is the handling of more advanced order types, such as trailing stops, automatically hedged positions or conditional orders.

For this you’ll want a separate module to manage the order logic before submitting to an exchange. You may even need a dedicated thread to actively manage orders once submitted, in case the platform itself doesn’t offer the necessary types.

It's best for the module to keep an internal representation of each position and its associated orders, which is then verified and amended as the orders are filled. This means you can run calculations against your positions without the need to constantly be querying the broker. It also allows you to easily convert the code for use in your back-tester, by simply altering the order fill checks to reference the historical data at each time step.
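
As a rough sketch of that idea (not the position handler from the full script), the classes below keep a local record of a position and its limit orders. The names `Order`, `Position`, `check_fills` and `apply_fill` are hypothetical. Here the fill check runs against a bar's low and high, as it would in a back-tester; in a live bot that one method would instead confirm fills with the broker.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    side: str            # "buy" or "sell"
    quantity: float
    limit_price: float
    filled: bool = False


@dataclass
class Position:
    symbol: str
    quantity: float = 0.0    # signed: positive = long, negative = short
    avg_price: float = 0.0   # average entry price of the open quantity
    orders: list = field(default_factory=list)

    def apply_fill(self, order, fill_price):
        """Amend the local position once an order is known to be filled."""
        signed = order.quantity if order.side == "buy" else -order.quantity
        new_qty = self.quantity + signed
        if new_qty == 0:
            self.avg_price = 0.0
        elif self.quantity == 0 or (self.quantity > 0) == (signed > 0):
            # Opening or adding: re-weight the average entry price.
            cost = self.avg_price * abs(self.quantity) + fill_price * abs(signed)
            self.avg_price = cost / abs(new_qty)
        elif (new_qty > 0) != (self.quantity > 0):
            # Flipped sides: the remainder was entered at the fill price.
            self.avg_price = fill_price
        # A partial reduction leaves the average entry price unchanged.
        self.quantity = new_qty
        order.filled = True

    def check_fills(self, low, high):
        """Back-test fill check: a limit order fills if the bar reached it."""
        for order in self.orders:
            if order.filled:
                continue
            if order.side == "buy" and low <= order.limit_price:
                self.apply_fill(order, order.limit_price)
            elif order.side == "sell" and high >= order.limit_price:
                self.apply_fill(order, order.limit_price)

    def unrealised_pnl(self, last_price):
        """Computed locally, with no need to query the broker."""
        return (last_price - self.avg_price) * self.quantity


pos = Position("XYZ")  # hypothetical symbol
pos.orders.append(Order("buy", 10, 100.0))
pos.orders.append(Order("buy", 10, 98.0))
pos.check_fills(low=99.0, high=101.0)  # only the 100.0 order is reached
pos.check_fills(low=97.0, high=99.0)   # now the 98.0 order fills too
print(pos.quantity, pos.avg_price, pos.unrealised_pnl(101.0))
```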

(Code Snippet of an order handling function as part of a position handler — full script at end of article)

It may also be worth implementing order aggregation and splitting algorithms. For example, you may want a function to split a larger limit order across multiple price levels to hedge your bets on the optimal fill. Or, indeed, you might need a system to net together the orders of multiple simultaneous strategies.
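
Both ideas fit in a few lines. The sketch below is an assumption-laden illustration rather than a production algorithm: `split_limit_order` spreads a whole-share quantity evenly over evenly spaced price levels, and `net_orders` sums signed quantities per symbol. Both function names and the even-spacing rule are hypothetical choices made for the example.

```python
def split_limit_order(total_qty, start_price, end_price, levels):
    """Split an integer quantity across `levels` evenly spaced limit prices.

    Returns a list of (price, quantity) tuples. Any remainder shares go to
    the levels closest to start_price.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    if levels == 1:
        return [(start_price, total_qty)]
    step = (end_price - start_price) / (levels - 1)
    base, remainder = divmod(total_qty, levels)
    orders = []
    for i in range(levels):
        qty = base + (1 if i < remainder else 0)
        orders.append((round(start_price + i * step, 2), qty))
    return orders


def net_orders(signed_orders):
    """Net (symbol, signed_qty) pairs from several strategies.

    Positive quantities are buys, negative are sells. Symbols that net
    to zero are dropped, since nothing needs to be sent for them.
    """
    net = {}
    for symbol, qty in signed_orders:
        net[symbol] = net.get(symbol, 0) + qty
    return {symbol: qty for symbol, qty in net.items() if qty != 0}


# Buy 100 shares in three slices between 50.00 and 49.00.
slices = split_limit_order(100, 50.00, 49.00, 3)
print(slices)

# Two strategies disagree on AAPL and cancel out entirely on MSFT.
netted = net_orders([("AAPL", 100), ("AAPL", -40), ("MSFT", 25), ("MSFT", -25)])
print(netted)
```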

Assumptions and Issues of Back-testing

Unless you’re using tick data and bid/ask snapshots to back-test against, there will always be a level of uncertainty in a simulated trade as to whether it would fill fully, at what price, and at what time. The period of each data point can also cause issues if it's above the desired polling rate of the trading bot.

These uncertainties lessen as the average holding period of each trade increases relative to the resolution of your data, but they are never fully eliminated. It is advised to always assume the worst-case scenario in your simulation, as it's better for a strategy to be over-prepared than under-prepared.

(Back-testing order processing logic implemented into position handler — full script at end of article)

For example, if a stop-loss order would have been triggered during the span of a bar, then you’d want to add some slippage to its trigger price and/or use the bar’s closing price. In reality, you are unlikely to get filled so unfavorably, but it’s impossible to tell without higher granularity data.
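
One way to encode that pessimism for a long position's stop-loss is sketched below. The function name, the bar layout (a dict with open, high, low and close) and the 0.1% default slippage are all assumptions made for the example. It takes the worse of the slipped trigger price and the bar's close, but never reports a price below the bar's low, since the market did not trade there.

```python
def simulate_stop_loss_fill(bar, stop_price, slippage=0.001):
    """Pessimistic fill price for a sell stop protecting a long position.

    Returns None if the bar never traded down to the stop.
    """
    if bar["low"] > stop_price:
        return None  # stop not touched during this bar
    # If the bar gapped down through the stop, the open is the best case.
    trigger = min(stop_price, bar["open"])
    slipped = trigger * (1.0 - slippage)
    # Worst of slipped trigger and close, kept inside the bar's range.
    return max(bar["low"], min(slipped, bar["close"]))


# Hypothetical bars, purely for illustration.
weak_close = {"open": 100.0, "high": 101.0, "low": 97.0, "close": 98.0}
recovered = {"open": 100.0, "high": 101.0, "low": 98.0, "close": 100.5}
untouched = {"open": 100.0, "high": 101.0, "low": 99.5, "close": 100.8}

fill_weak = simulate_stop_loss_fill(weak_close, stop_price=99.0)
fill_recovered = simulate_stop_loss_fill(recovered, stop_price=99.0)
fill_untouched = simulate_stop_loss_fill(untouched, stop_price=99.0)
print(fill_weak, fill_recovered, fill_untouched)
```

In the first bar the close is worse than the slipped trigger, so the close is used. In the second the price recovered, so only the slippage applies.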

On top of this, it is impossible to simulate the effect of your order on the market movement itself. While this would be unlikely to have a noticeable effect on most strategies, if you’re using extremely short holding times on each trade or larger amounts of capital, it could certainly be a factor.

Designing an Efficient Back-tester

When calculating the next time step for an indicator, unless you’ve stored all relevant variables you will be recalculating a lot of information from the look-back period. This is unavoidable in a live system and, indeed, less of an issue, as you won’t be able to process data faster than it arrives. But you really don’t want to wait around longer than you have to for a simulation to complete.

The easiest and most efficient workaround is to calculate the full set of indicators over the whole dataset at start-up. These can then be indexed against their respective symbols and time stamps and saved for later. Even better, you could run a batch of back-tests in the same session without needing to recalculate the basic indicators between runs.
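
A minimal sketch of this precomputation, assuming the history is a dict of `(timestamp, close)` rows per symbol and using a simple moving average as the stand-in indicator. The function names and the data layout are hypothetical. The resulting table is keyed by `(symbol, timestamp)`, so the back-test loop only performs look-ups.

```python
def simple_moving_average(values, window):
    """Rolling mean via a running sum; None until the window is full."""
    out, running = [], 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def precompute_indicators(history, window):
    """Build {(symbol, timestamp): sma} once, before any back-test runs."""
    table = {}
    for symbol, rows in history.items():
        closes = [close for _, close in rows]
        smas = simple_moving_average(closes, window)
        for (timestamp, _), sma in zip(rows, smas):
            table[(symbol, timestamp)] = sma
    return table


# Hypothetical history with integer timestamps, purely for illustration.
history = {
    "ABC": [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0)],
    "XYZ": [(1, 50.0), (2, 48.0), (3, 49.0), (4, 51.0)],
}
indicators = precompute_indicators(history, window=2)

# Inside the back-test loop each step is now a dictionary look-up.
print(indicators[("ABC", 3)], indicators[("XYZ", 4)])
```

Because the table does not depend on strategy parameters other than the window, the same table can be reused across a batch of back-tests.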

At each time step you will then simply query the set of indexed indicators, construct the trading signals and push the orders to the order handling module, where the simulated positions are calculated along with their profit/loss. You’ll also want to store the position and order fill information, either as a subscript to the back-tester or integrated directly into the position handling module.

Even Improving Your Back-tester

Back-testing is only as useful as the insight its statistics provide. Common review metrics include win/loss ratio, average profit/loss, average trade time, etc. However, you may want to generate more insightful reports, such as position risk:reward ratios or an aggregate of price movement before and after each traded signal, which allows you to fine-tune the algorithm.
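
The common metrics are quick to compute from a list of closed-trade results. The sketch below assumes each trade has already been reduced to a single profit/loss number; the function name, the dictionary keys and the sample figures are hypothetical.

```python
def summarise_trades(pnls):
    """Basic review metrics from a list of per-trade profit/loss values."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return {
        "trades": len(pnls),
        "win_rate": len(wins) / len(pnls) if pnls else 0.0,
        "avg_win": avg_win,
        "avg_loss": avg_loss,
        "avg_pnl": sum(pnls) / len(pnls) if pnls else 0.0,
        # Average win per unit of average loss; None if there were no losses.
        "reward_to_risk": avg_win / abs(avg_loss) if losses else None,
    }


# Hypothetical trade results, purely for illustration.
report = summarise_trades([50.0, -20.0, 30.0, -10.0, 40.0])
for name, value in report.items():
    print(name, value)
```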

Only once the full framework has been designed, implemented and debugged should you start looking for ways to speed up and upgrade the inner loop of the back-tester (the order handling module). It is a lot easier to take a working program and make it faster than it is to take an overly optimized program and make it work.

By Matthew Tweed

Full position handling class framework: