Bahman
Published on Feb 19, 2021
We started with “Collecting Data”.
We learned what OHLCV data is, and why both historical and online data are essential.
Then we continued with “Data Analysis”.
We saw how critical data cleaning and feature engineering are. To build a stable ML model, we must prepare data correctly and use visualization to better reach our goals.
Then we moved to pattern recognition:
We noted how easy it is to fall into the trap of seeing patterns where none exist, like an astrologer, so you must follow scientific methods, like an astronomer. We identified a simple pattern, “SMA20,” and discussed labeling each row as 0 or 1.
Now, it’s time to build a model. As always, remember: in this season we design; in the next season, we will develop.
You should always design your pipeline on paper or in your mind before development. That’s the method I follow—and it works for me ;)
Let’s look at the data format we have at this moment:
We have OHLCV plus SMA20 and a target column, like this:
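As a rough illustration of that layout, here is a minimal sketch in plain Python. The candles are synthetic, the helper name `add_sma_and_target` is hypothetical, and the labeling rule (1 when the close is above SMA20, otherwise 0) is only an assumption for the example, not necessarily the rule used earlier in this series.

```python
from statistics import mean


def add_sma_and_target(rows, window=20):
    """Return copies of the OHLCV rows with 'sma20' and 'target' added."""
    out = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        if i + 1 >= window:
            # Average of the last `window` closes, including the current one.
            closes = [r["close"] for r in rows[i + 1 - window : i + 1]]
            new_row["sma20"] = mean(closes)
            # Assumed labeling rule: 1 if close is above SMA20, else 0.
            new_row["target"] = 1 if row["close"] > new_row["sma20"] else 0
        else:
            # Not enough history yet to compute the average.
            new_row["sma20"] = None
            new_row["target"] = None
        out.append(new_row)
    return out


# Synthetic candles, only to show the shape of the table.
rows = [
    {"open": 100 + i, "high": 102 + i, "low": 99 + i,
     "close": 101 + i, "volume": 1000 + 10 * i}
    for i in range(25)
]
table = add_sma_and_target(rows)
print(table[-1])
```

The first 19 rows have no SMA20 yet, so in practice they would be dropped before training.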
At this step, machine learning helps us build a model. The main question is: how? Let’s quickly review how ML classification techniques work:
First, the machine receives labeled data (which we already prepared).
Then we split the data into training and testing sets, and the model learns from the training set.
Finally, we evaluate the model on the test set to see how well it performs on data it has not seen.
This is a simplified version—the real process is a bit more complex.
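The three steps above can be sketched in a few lines. This is a toy under stated assumptions: the data is synthetic, the split is chronological (a common choice for time series, assumed here), and the "model" is a one-feature threshold rule standing in for whatever classifier we end up choosing. The names `chronological_split`, `fit_stump` and `accuracy` are made up for this example.

```python
def chronological_split(samples, train_fraction=0.8):
    """Keep time order: the earliest rows train, the latest rows test."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def fit_stump(train):
    """Pick the threshold on the single feature with the best training accuracy."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in train}):
        correct = sum((x >= threshold) == bool(y) for x, y in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, samples):
    """Share of samples where 'x >= threshold' matches the label."""
    return sum((x >= threshold) == bool(y) for x, y in samples) / len(samples)


# Synthetic labeled data: (feature, label) pairs, perfectly separable on purpose.
samples = [(i % 10 - 4.5, 1 if i % 10 >= 5 else 0) for i in range(50)]

train, test = chronological_split(samples)
threshold = fit_stump(train)
test_accuracy = accuracy(threshold, test)
print(threshold, test_accuracy)
```

Real market data will never separate this cleanly; the point is only the order of operations: label, split, fit on train, score on test.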
In many fields, reaching a working model would be the end. But in finance and trading, it isn’t. We still need to perform backtesting, that is, run the strategy on historical data to see how it would have traded.
At this point, we define a Buy/Sell strategy. For example, with SMA20:
If this sounds abstract, look at the following data. It shows when we opened and closed long positions:
A trading model is a package of machine learning methods plus backtesting. Sometimes, adding extra rules to predictions improves performance.
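To make the backtesting half less abstract, here is a minimal long-only sketch. Everything in it is an assumption for illustration: the prices and signals are invented, `backtest_long_only` is a hypothetical helper, trades execute at the close of the bar where the signal changes, and fees and slippage are ignored.

```python
def backtest_long_only(closes, signals):
    """Open a long when the signal is 1, close it when the signal is 0.

    Returns a list of (entry_index, exit_index, trade_return) tuples.
    """
    trades = []
    entry = None
    for i, (price, signal) in enumerate(zip(closes, signals)):
        if signal == 1 and entry is None:
            entry = i  # open a long position at this close
        elif signal == 0 and entry is not None:
            trades.append((entry, i, price / closes[entry] - 1))
            entry = None  # position closed
    # Close any position that is still open at the last bar.
    if entry is not None:
        last = len(closes) - 1
        trades.append((entry, last, closes[last] / closes[entry] - 1))
    return trades


# Invented prices and model predictions (1 = be long, 0 = stay out).
closes = [100, 100, 110, 120, 90, 99]
signals = [0, 1, 1, 0, 1, 1]

trades = backtest_long_only(closes, signals)

# Compound the individual trade returns into one total return.
total_return = 1.0
for _, _, trade_return in trades:
    total_return *= 1 + trade_return
total_return -= 1

print(trades, total_return)
```

An extra rule, such as skipping trades when volume is too low, would go inside the loop as one more condition before opening a position.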
From my perspective, a complete model includes both Long and Short strategies. That usually means having two ML models: one for Long positions, one for Short. Here, creativity helps you design a comprehensive model.
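One way the two models could be combined is sketched below. The function `combine_signals` and the tie-break rule (stay flat when both models fire at once) are assumptions for this example; other designs are just as valid.

```python
def combine_signals(long_signal, short_signal):
    """Merge two binary predictions into one position: 1 long, -1 short, 0 flat."""
    if long_signal and not short_signal:
        return 1
    if short_signal and not long_signal:
        return -1
    # Both models agree to stay out, or they contradict each other.
    return 0


# Invented predictions from a hypothetical Long model and Short model.
long_predictions = [1, 0, 1, 0]
short_predictions = [0, 1, 1, 0]

positions = [
    combine_signals(l, s) for l, s in zip(long_predictions, short_predictions)
]
print(positions)
```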