How to Make a Short Term Bitcoin Price Prediction with ML?


Table of Contents


Data Discovery and Visualization

Model Description

Data Preprocessing


Here are some tips for predicting the Bitcoin price with Deep Learning.


With the recent hype around cryptocurrencies, and many of them posting all-time highs (ATH), I found it interesting to try to predict the price of Bitcoin. The purpose of this project is to predict the price of Bitcoin with Deep Learning. More precisely, I will show a stacked Neural Network model with Long Short-Term Memory (LSTM) cells. Additionally, I will add two techniques, Early Stopping and Dropout, to avoid overfitting. Finally, the model will be validated and used to predict the BTC price for the next 10 days.




The dataset used for this project is BTC-USD data from Yahoo Finance, including BTC Open, High, Low, Close, Adj Close, and Volume for the requested timeframe. In this case, I downloaded all the available data and used only the latest 1200 observations in the script, on the assumption that Bitcoin's early years may have behaved quite differently from what we see today.


Data Discovery and Visualization


I performed a series of checks to explore the dataset:

I compared the 'Close' and 'Adj Close' prices to decide which one to use as a reference, but they matched on all values. After some research, I found that this is expected for cryptocurrencies: for traditional stocks the two can differ because the adjusted close accounts for corporate actions decided by a company's executives, such as dividends and stock splits.


I searched for null values and found 4 days with missing data. I checked different related sources but found no reliable information, so I decided to fill in the missing values with the 'ffill' method from Pandas, which replaces each null value with the previous observation (the last available observation is propagated forward). Another approach would be to drop the missing values.
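As a minimal sketch of this step, the snippet below finds the rows with missing data and compares forward filling with dropping them. The small DataFrame is synthetic and only stands in for the Yahoo Finance data; it is not the real BTC-USD series.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data (values are made up for illustration).
dates = pd.date_range("2021-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {
        "Close": [100.0, 101.0, np.nan, 103.0, np.nan, 105.0],
        "Volume": [10.0, 11.0, np.nan, 13.0, np.nan, 15.0],
    },
    index=dates,
)

# Days on which at least one column is missing.
missing_days = df[df.isnull().any(axis=1)].index
print("Days with missing data:", len(missing_days))

# Option 1: propagate the last available observation forward.
filled = df.ffill()

# Option 2: drop the incomplete rows instead.
dropped = df.dropna()

print(filled)
print(dropped)
```

Forward filling keeps the daily index unbroken, which matters later when the series is cut into fixed-length windows; dropping rows would leave gaps in the calendar.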


I plotted the 'Close' price and 'Volume' to look for any visual anomalies. Indeed, there was a phenomenal increase in Volume on February 26, 2021, to about 3 times its normal value. For this data point, I checked different data sources and found that they were aligned, so I took no further action.
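Besides looking at the plot, a spike like this can also be flagged numerically. The sketch below is one possible approach, not the check used in the notebook: it compares each day's volume with the median of the preceding 30 days. The series is synthetic, and the 30-day window and the 2.5 threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Synthetic daily volume: flat at 50 (arbitrary units) with a single spike.
dates = pd.date_range("2021-01-01", periods=40, freq="D")
volume = pd.Series(50.0, index=dates)
volume.iloc[25] = 150.0

# Baseline: median of the previous days (up to 30), shifted by one day
# so that a day is never compared with itself.
baseline = volume.rolling(window=30, min_periods=10).median().shift(1)
ratio = volume / baseline

THRESHOLD = 2.5  # assumed cut-off for "abnormally high" volume
spikes = ratio[ratio > THRESHOLD]
print(spikes)
```

Any day flagged this way can then be cross-checked against other data sources, as described above.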


Finally, by calculating the Pearson correlation between the Close price and Volume, I got 0.799, so I decided to include Volume as a secondary feature used to predict the future prices of Bitcoin.
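The correlation check can be sketched as follows. The data here is randomly generated, so the printed coefficient will not match the 0.799 obtained on the real dataset; only the method is the same.

```python
import numpy as np
import pandas as pd

# Synthetic Close and Volume series that are loosely related (illustration only).
rng = np.random.default_rng(0)
close = np.linspace(10_000, 60_000, 200) + rng.normal(0, 2_000, 200)
volume = 1e6 * close + rng.normal(0, 5e9, 200)
df = pd.DataFrame({"Close": close, "Volume": volume})

# Series.corr uses the Pearson method by default.
r = df["Close"].corr(df["Volume"])

# Cross-check with NumPy's correlation matrix.
r_manual = np.corrcoef(df["Close"], df["Volume"])[0, 1]

print(f"Pearson correlation: {r:.3f}")
```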



Model Description


The algorithm chosen for this analysis is the Long Short-Term Memory (LSTM) Neural Network. Details on this type of Recurrent Neural Network are beyond the scope of this article, but a brief summary can clarify the main points:

Recurrent Neural Networks (RNN) differ from a standard feed-forward approach by also using information from previous inputs in the computation.


One problem that can arise while training an RNN is that gradients explode (or vanish) as they are propagated back through many time steps, which leads to unstable training. LSTM mitigates this problem with gates that control which information is memorized, updated with new information, or forgotten. This is possible thanks to a more complex cell architecture.

In this case, the model was developed with Keras, a Python library that uses TensorFlow in the backend. Keras' API simplifies the implementation of the Neural Network.


Data Preprocessing


The training data consists of the Closing Price and Volume for past observations, and the output is the Closing Price predictions for the next n days. The model is called multivariate since it uses 2 variables as input.

The first step in preparing the data is to scale all observations. In this case, I used MinMaxScaler, but StandardScaler can also be used.
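The sketch below illustrates the scaling step and the usual follow-up of cutting the series into input/output windows. To stay dependency-light it computes the per-column min-max transform directly in NumPy, which is the same formula scikit-learn's MinMaxScaler applies with its default (0, 1) range. The helper names, the synthetic data, and n_past = 60 are assumptions for illustration; n_future = 10 matches the 10-day forecast mentioned earlier.

```python
import numpy as np

def min_max_scale(data):
    """Scale each column to [0, 1]; also return the min and range to invert it."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid dividing by zero for constant columns
    return (data - col_min) / col_range, col_min, col_range

def make_windows(scaled, n_past, n_future, target_col=0):
    """Build (samples, n_past, features) inputs and (samples, n_future) targets."""
    X, y = [], []
    last_start = len(scaled) - n_past - n_future
    for start in range(last_start + 1):
        end = start + n_past
        X.append(scaled[start:end, :])                    # past Close and Volume
        y.append(scaled[end:end + n_future, target_col])  # future Close only
    return np.array(X), np.array(y)

# Synthetic stand-in for 1200 daily observations of Close and Volume.
rng = np.random.default_rng(42)
close = 30_000 + np.cumsum(rng.normal(0, 500, 1200))
volume = rng.uniform(2e10, 8e10, 1200)
features = np.column_stack([close, volume])

scaled, col_min, col_range = min_max_scale(features)
X, y = make_windows(scaled, n_past=60, n_future=10)
print(X.shape, y.shape)  # (1131, 60, 2) (1131, 10)
```

Scaled predictions can be turned back into prices with `pred * col_range[0] + col_min[0]`. A common refinement is to compute the minimum and range on the training portion only, so that no information from the validation period leaks into the scaling.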


Creating and Training The Model


The model was built with Keras using the Sequential class and stacking the different LSTM layers.

For the input layer, I set the number of nodes/neurons to n_past, the number of past observations fed to the model for each variable. For the output layer, since we are building a Multi-Step model, the number of neurons must equal the number of desired future predictions, n_future. Several hidden layers can be added between the input layer and the output layer. There is no rule for determining the optimal number of hidden layers or neurons. In this case, I used 8 hidden layers, each containing 20 nodes.


To prevent overfitting, dropout and early stopping are included. As a quick summary: Dropout is a layer that randomly ignores a fraction of the nodes feeding the next layer during training; if the dropout rate is 0.2, 20% of those nodes are ignored at each training step.

Early stopping halts training when a tracked metric stops improving. In this case, the metric we want to minimize is 'val_loss', with a patience of 25, i.e. the number of epochs with no improvement after which training is stopped.
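Putting the pieces of this section together, a model along these lines could be sketched in Keras as shown below. This is an illustration under stated assumptions, not the exact code from the notebook: the helper name build_model, n_past = 60, the placement of a Dropout layer after each hidden LSTM layer, the Adam optimizer, the mean squared error loss, and restore_best_weights=True are all my own choices. The 8 hidden layers of 20 nodes, the 0.2 dropout rate, the 'val_loss' metric, and the patience of 25 come from the description above.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_past, n_features, n_future, n_hidden=8, units=20, dropout=0.2):
    """Stacked LSTM that maps n_past time steps to n_future predictions."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_past, n_features)))

    # Input LSTM layer with n_past units; it returns the full sequence
    # so that the next LSTM layer receives one vector per time step.
    model.add(layers.LSTM(n_past, return_sequences=True))

    # Hidden LSTM layers, each followed by dropout. Only the last one
    # collapses the sequence into a single vector.
    for i in range(n_hidden):
        is_last = i == n_hidden - 1
        model.add(layers.LSTM(units, return_sequences=not is_last))
        model.add(layers.Dropout(dropout))

    # One output neuron per future day to predict.
    model.add(layers.Dense(n_future))
    model.compile(optimizer="adam", loss="mse")  # assumed optimizer and loss
    return model

# Stop when val_loss has not improved for 25 consecutive epochs.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=25,
    restore_best_weights=True,  # assumption: keep the best epoch's weights
)

model = build_model(n_past=60, n_features=2, n_future=10)
model.summary()
```

The callback is passed to `model.fit` through its `callbacks` argument. Because it monitors 'val_loss', the fit call also needs validation data, for example through `validation_split` or `validation_data`.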


More Improvements


Due to time constraints, I was not able to complete the project as I wanted, but here are a few ideas that could be developed further:

  • Include other models such as ARIMA, SARIMA, or Facebook Prophet to compare results. For RNNs, we can also try GRU cells instead of LSTM.
  • Add more features that affect the BTC price, e.g. prices from other exchanges, Twitter data, whale data, altcoin data, active BTC addresses, etc.
  • Increase the number of data points by reducing the time interval. I used daily prices for this notebook, but we could try hourly prices, for example.


Documentation and Code Repository


Please see my GitHub repository for the complete code described in a Jupyter Notebook.

Feel free to comment or contact me if you have any doubts or suggestions to improve the project.

If you want more documentation on LSTM, Bitcoin, or other related topics, I've included a list of links that I found useful while doing my project. Check it out!




I found this project particularly interesting because it combines two big topics that I am really passionate about learning: Data Science and Cryptocurrencies. I really enjoyed researching both topics to come up with a good enough model, but I was a bit overwhelmed by the amount of information on each. Machine Learning and Cryptocurrencies are two vast fields of knowledge, and trying to build a solid and reliable model is certainly not an easy task.


The popularity of cryptocurrencies skyrocketed in 2017 as their market capitalization grew exponentially for several months in a row, exceeding $800 billion in January 2018. Although machine learning has been successful in predicting stock market prices through a number of different time series models, its application to predicting cryptocurrency prices has been quite limited. The reason is that cryptocurrency prices depend on many factors, such as technological progress, internal competition, pressure on the markets to deliver, economic issues, security issues, and political factors. This also means there is potential for high profit if smart investment strategies are adopted. Unfortunately, due to the lack of indexes, cryptocurrencies are relatively unpredictable compared to traditional financial forecasts such as stock market forecasting.


In this blog, we go through a four-step process for predicting cryptocurrency prices:

  • Get real-time cryptocurrency data.
  • Prepare the data for training and testing.
  • Predict the price of the cryptocurrency using the LSTM neural network.
  • Visualize the prediction results.


In financial markets, the high volatility of an asset is often seen as a negative factor. However, short-term traders can earn high profits if they open and close the right positions. The high volatility of cryptocurrencies, and of Bitcoin in particular, is what has made cryptocurrency trading so profitable in recent years. The main purpose of this study is to compare several frameworks for predicting the daily closing Bitcoin price, after a rigorous model selection with the k-fold cross-validation method, and to identify the ones that provide the best performance.


We evaluated the performance of single-stage frameworks based on only one machine learning technique, such as Bayesian Neural Networks, Feed-Forward Neural Networks, and Long Short-Term Memory Neural Networks, and of two-stage frameworks formed by cascading the neural networks just mentioned with Support Vector Regression. The results highlight the higher performance of the two-stage frameworks over the corresponding single-stage frameworks, except for the Bayesian Neural Network. The single-stage framework based on the Bayesian Neural Network has the highest performance, and the order of magnitude of the mean absolute percentage error calculated on the prices estimated by this framework is in line with those reported in recent literature.

You can get your real-time and historical cryptocurrency data with a free Finage Crypto Data API key.
