Daniel-Ro-Santiago/Dogecoin-Time-Series-Analysis-and-Forecasting

Dogecoin-Time-Series-Analysis-and-Forecasting

Project overview

This project presents a time series analysis of the Dogecoin (DOGE) cryptocurrency’s closing price. The analysis includes series decomposition to identify its main components, statistical tests to examine relevant properties of the series, and ARIMA modeling based on the obtained results. Finally, a 30-day price forecast is produced.

Data

The dataset used, "dogecoin.csv", contains financial information for the Dogecoin (DOGE) cryptocurrency covering 2980 dates from 2017/11/09 to 2026/01/05. The data were obtained from Kaggle. This dataset includes the following variables:

| Variable | Definition |
|----------|------------|
| Date | Trading date (UTC) |
| Close | Closing price of the day (USD) |
| High | Highest price during the day (USD) |
| Low | Lowest price during the day (USD) |
| Open | Opening price of the day (USD) |
| Volume | Daily trading volume |

Technologies

  • Python
  • Jupyter Notebook

Features

Here is what this project does:

  • Time series analysis: Decomposition of the time series, computation of standard and smoothed 30-day Moving Averages (MA), 30-day Moving Standard Deviation (MSTD), monthly box plots, Augmented Dickey-Fuller (ADF) stationarity test, Autocorrelation Function (ACF), and Partial Autocorrelation Function (PACF) using the Yule–Walker method.
  • Initial ARIMA model: Fitting of an ARIMA(1, 0, 0) model selected based on the analysis results, evaluation of predictive performance, residual diagnostics, frequency histogram visualization, Shapiro–Wilk normality test on residuals, and ACF analysis of the residuals.
  • Optimized ARIMA model: Optimization of ARIMA model parameters using the Akaike Information Criterion (AIC), model refitting with the optimal parameters, and visualization of the improved fit.
  • Future closing price prediction: A 30-day closing price forecast for Dogecoin with confidence intervals, generated for both the initial and optimized models.
  • Comparison between the full and 2021-truncated series: Comparison of analysis results and model performance using both the complete time series and a version of the data truncated to start in 2021.

Limitations

The main limitations of this project are:

  • Dogecoin (DOGE) is inherently highly volatile.
  • The analysis is based exclusively on the closing price time series.
  • External factors that may influence the cryptocurrency price were not considered.
  • Only ARIMA models were used.

Process

First, the dataset "dogecoin.csv" was loaded and its dimensions (2981 rows and 6 columns) were verified. Data types were reviewed, missing values were checked, the Date column was set as the index, and basic descriptive statistics were computed. The Close variable, representing Dogecoin’s closing price, was selected as the target variable to analyze the stability and behavior of the asset over time.
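The loading and inspection steps above could look roughly like the following sketch. It uses pandas and a two-row in-memory sample in place of the real file; the sample values, the column order, and the date format are assumptions made for illustration, not taken from the notebook.

```python
import io

import pandas as pd

# Two made-up rows standing in for dogecoin.csv (values are illustrative only).
sample_csv = io.StringIO(
    "Date,Close,High,Low,Open,Volume\n"
    "2021-01-01,0.0050,0.0060,0.0040,0.0045,1000\n"
    "2021-01-02,0.0100,0.0120,0.0050,0.0050,5000\n"
)

# With the real file this would be pd.read_csv("dogecoin.csv", ...).
df = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date")

print(df.shape)         # rows and columns (Date is now the index, not a column)
print(df.dtypes)        # data type of each column
print(df.isna().sum())  # missing values per column
print(df.describe())    # basic descriptive statistics

close = df["Close"]     # target variable for the rest of the analysis
```

Because `Date` becomes the index here, `df.shape` reports one column fewer than the raw file has.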

Next, the full closing price time series was visualized, along with a truncated series starting in 2021 to observe recent patterns and high volatility.

[Figure: Dogecoin time series]

The time series was decomposed into trend, seasonality, and residual components using a multiplicative model with a 30-day period, based on the observed price behavior. Moving Averages (MA) and Moving Standard Deviations (MSTD) were analyzed to assess the stability of the mean and variance. Monthly box plots then revealed significant volatility in closing prices.
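As a minimal sketch of the moving-window statistics mentioned above, the 30-day MA and MSTD can be computed with pandas rolling windows. The series below is a synthetic random walk, not the real closing prices, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Close series (random walk, not real prices).
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=120, freq="D")
close = pd.Series(
    1.0 + rng.normal(0, 0.01, size=120).cumsum(), index=dates, name="Close"
)

window = 30
ma = close.rolling(window=window).mean()   # 30-day moving average
mstd = close.rolling(window=window).std()  # 30-day moving standard deviation

# The first window - 1 values are NaN because the window is not yet full.
print(ma.dropna().head())
print(mstd.dropna().head())
```

A roughly flat MA and MSTD over time would point toward a stable mean and variance; drifting or spiking curves point the other way.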

[Figure: Dogecoin decomposition]

Subsequently, the Augmented Dickey-Fuller (ADF) test indicated that the series could be considered stationary at the $5\%$ significance level, leading to the selection of a differencing order of $d = 0$. Analysis of the ACF and PACF suggested an autoregressive component of order 1, motivating the proposal of an initial ARIMA(1, 0, 0) model.
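For reference, an ARIMA(1, 0, 0) model is a first-order autoregression on the undifferenced series. The symbols below ($y_t$ for the closing price, $c$ for the constant, $\phi_1$ for the autoregressive coefficient, $\varepsilon_t$ for the error term) are notation assumed here, not taken from the notebook:

$$
y_t = c + \phi_1 \, y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{white noise with variance } \sigma^2, \qquad |\phi_1| < 1 .
$$

The condition $|\phi_1| < 1$ is what keeps the process stationary, which is consistent with choosing $d = 0$.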

[Figure: Dogecoin ADF]

The ARIMA(1, 0, 0) model was fitted, and although the visual fit appeared good, residual diagnostics revealed issues:

  • Persistent autocorrelation, according to the Ljung–Box test.
  • Lack of normality, according to the Jarque–Bera test.
  • Heteroskedasticity.

These results indicated that the model did not fully capture the complexity of the time series. Error metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE) showed moderate predictive performance. This was reflected in the 30-day forecasts, which exhibited wide 95% confidence intervals.
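The three error metrics named above can be written out in a few lines. This is a plain-Python sketch on made-up numbers; the function names and toy values are illustrative, and the notebook may well compute them with a library instead.

```python
def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: average squared error, penalizes large misses."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy values, not real Dogecoin prices.
actual = [0.10, 0.20, 0.40]
predicted = [0.11, 0.18, 0.44]

print(mae(actual, predicted))
print(mse(actual, predicted))
print(mape(actual, predicted))
```

In this toy data every prediction is off by 10% of the true value, so the MAPE comes out at about 10 even though the absolute errors differ. That scale-independence is why MAPE is useful for a price series whose level changes by orders of magnitude.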

Given the limitations of the initial model identified above, a more robust model was sought. A systematic search over the ARIMA parameters ($p$, $d$, $q$) was performed using the Akaike Information Criterion (AIC), yielding the following configuration:

  • An autoregressive component $p$ of order 3.
  • No differencing component $d$.
  • A moving average component $q$ of order 3.

The choice of the differencing component $d = 0$ was consistent with the ADF test results.

The ARIMA(3, 0, 3) model achieved the lowest AIC among the evaluated candidates and showed improved residual diagnostics, particularly in eliminating autocorrelation. Its visual fit to the historical series was also more accurate.

Finally, the 30-day forecast generated by the optimized model displayed a more detailed continuation of the recent trend. Although confidence intervals remained wide due to the inherent volatility of cryptocurrencies, short-term predictions exhibited increased reliability.
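The widening of the intervals with the forecast horizon follows from a standard result for stationary ARMA models. Writing the model in its moving-average form $y_t = \mu + \sum_{j \ge 0} \psi_j \varepsilon_{t-j}$ with $\psi_0 = 1$ (notation assumed here, not taken from the notebook), the $h$-step-ahead forecast error variance and the approximate 95% interval are:

$$
\operatorname{Var}\!\left(y_{T+h} - \hat{y}_{T+h \mid T}\right) = \sigma^2 \sum_{j=0}^{h-1} \psi_j^2,
\qquad
\hat{y}_{T+h \mid T} \;\pm\; 1.96 \, \sigma \sqrt{\sum_{j=0}^{h-1} \psi_j^2} .
$$

The sum grows with $h$, so the interval can only widen as the horizon approaches 30 days. The factor 1.96 assumes Gaussian errors; since the residuals here failed the normality tests, the stated 95% coverage should be read as approximate.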

[Figure: Dogecoin optimized model]

Results

The main result of this project is that Dogecoin’s closing price constitutes a complex, highly volatile, and heteroskedastic time series. While a simple model such as ARIMA(1, 0, 0) can capture dominant persistence, a more complex model like ARIMA(3, 0, 3) selected via AIC provides a superior fit and improved modeling of temporal dependence.

Final recommendation

As a final recommendation for this project, the lack of normality and presence of heteroskedasticity in the residuals of both models suggest that alternative approaches, such as GARCH models, may offer improved performance when modeling volatility.

What I learned

The most important thing I learned from this project is that, when analyzing a volatile time series such as Dogecoin’s closing prices, ARIMA model parameters can be selected from statistical analysis, but the resulting model’s performance must still be checked to confirm the choice. External factors, such as news events and the behavior of other assets, also need to be considered to explain volatility without misinterpreting the statistical results. Finally, alternative models are worth exploring, such as the AIC-selected model proposed in this project, or GARCH and SARIMA models, to better understand the volatility and stationarity of the series.

How it can be improved

  • Explore GARCH and SARIMA models.
  • Implement machine learning techniques.
  • Analyze events that may affect the time series.
  • Incorporate external factors to explain volatility.
  • Compare results with other cryptocurrencies of similar characteristics.

Running the project

To run the project, open the Jupyter Notebook Dogecoin_Time_Series_Analysis_and_Forecasting, make sure the CSV file dogecoin.csv is available to it, and run all cells.
