- Project overview
- Data
- Technologies
- Features
- Limitations
- Process
- Results
- What I learned
- How can it be improved
- Running the project
This project presents a time series analysis of the Dogecoin (DOGE) cryptocurrency’s closing price. The analysis includes series decomposition to identify its main components, statistical tests to examine relevant properties of the series, and ARIMA modeling based on the obtained results. Finally, a 30-day price forecast is produced.
The dataset used, "dogecoin.csv", contains financial information for the Dogecoin (DOGE) cryptocurrency covering 2980 dates from 2017/11/09 to 2026/01/05. The data were obtained from Kaggle. This dataset includes the following variables:
| Variable | Definition |
|---|---|
| Date | Trading date (UTC) |
| Close | Closing price of the day (USD) |
| High | Highest price during the day (USD) |
| Low | Lowest price during the day (USD) |
| Open | Opening price of the day (USD) |
| Volume | Daily trading volume |
- Python
- Jupyter Notebook
Here is what this project does:
- Time series analysis: Decomposition of the time series, computation of standard and smoothed 30-day Moving Averages (MA), 30-day Moving Standard Deviation (MSTD), monthly box plots, Augmented Dickey-Fuller (ADF) stationarity test, Autocorrelation Function (ACF), and Partial Autocorrelation Function (PACF) using the Yule–Walker method.
- Initial ARIMA model: Fitting of an ARIMA(1, 0, 0) model selected based on the analysis results, evaluation of predictive performance, residual diagnostics, frequency histogram visualization, Shapiro–Wilk normality test on residuals, and ACF analysis of the residuals.
- Optimized ARIMA model: Optimization of ARIMA model parameters using the Akaike Information Criterion (AIC), model refitting with the optimal parameters, and visualization of the improved fit.
- Future closing price prediction: A 30-day closing price forecast for Dogecoin with confidence intervals, generated for both the initial and optimized models.
- Comparison between the full and 2021-truncated series: Comparison of analysis results and model performance using both the complete time series and a version of the data truncated to start in 2021.
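As an illustration of the ACF step listed above, here is a minimal, library-free sketch of the sample autocorrelation function. The function name `sample_acf` and the toy data are assumptions made for this example; the notebook may well rely on a statistics library instead, and the PACF (Yule–Walker) step is not covered here.

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0..r_max_lag of a numeric sequence."""
    n = len(series)
    if n < 2 or max_lag >= n:
        raise ValueError("need at least two points and max_lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # r_k = sum_t (x_t - mean)(x_{t+k} - mean) / sum_t (x_t - mean)^2
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Toy data (not Dogecoin prices): a slowly drifting series.
toy = [1.0, 1.1, 1.3, 1.2, 1.4, 1.6, 1.5, 1.7, 1.9, 1.8]
acf = sample_acf(toy, max_lag=3)
print([round(r, 3) for r in acf])
```

A slowly decaying ACF is the kind of pattern that points to strong persistence, which is consistent with choosing an autoregressive term.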
The main limitations of this project are:
- Dogecoin (DOGE) is inherently highly volatile.
- The analysis is based exclusively on the closing price time series.
- External factors that may influence the cryptocurrency price were not considered.
- Only ARIMA models were used.
First, the dataset "dogecoin.csv" was loaded and its dimensions (2981 rows and 6 columns) were verified. Data types were reviewed, missing values were checked, the Date column was set as the index, and basic descriptive statistics were computed. The Close variable, representing Dogecoin’s closing price, was selected as the target variable to analyze the stability and behavior of the asset over time.
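The loading step can be sketched with the standard library alone. This is only an assumption about how the file could be read: the ISO date format (`%Y-%m-%d`), the helper name `load_close_series`, and the sample rows below are all hypothetical, and the values are made up rather than taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Made-up rows that mimic the documented column names; not real prices.
SAMPLE = """Date,Close,High,Low,Open,Volume
2021-01-02,0.0105,0.0110,0.0100,0.0102,1000
2021-01-01,0.0100,0.0104,0.0098,0.0099,900
2021-01-03,,0.0111,0.0101,0.0105,1100
"""


def load_close_series(file_obj):
    """Return (date-sorted list of (date, close) pairs, count of rows with a missing Close)."""
    closes, missing = [], 0
    for record in csv.DictReader(file_obj):
        raw = (record.get("Close") or "").strip()
        if not raw:
            missing += 1  # count missing values instead of failing on them
            continue
        day = datetime.strptime(record["Date"], "%Y-%m-%d").date()
        closes.append((day, float(raw)))
    closes.sort(key=lambda pair: pair[0])  # keep the series in chronological order
    return closes, missing


closes, missing = load_close_series(io.StringIO(SAMPLE))
print(len(closes), "usable rows,", missing, "missing Close value(s)")
```

For the real file, the same function could be called on `open("dogecoin.csv", newline="")` in place of the in-memory sample.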
Next, the full closing price time series was visualized, along with a truncated series starting in 2021 to observe recent patterns and high volatility.
The time series was decomposed into trend, seasonality, and residual components using a multiplicative model with a 30-day period, based on the observed price behavior. Moving Averages (MA) and Moving Standard Deviations (MSTD) were analyzed to assess mean and variance stability, and monthly box plots revealed significant volatility in closing prices.
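The moving average and moving standard deviation check can be sketched as follows. This is a minimal version using only the standard library; `rolling_stats` is a hypothetical helper, and the short toy series stands in for the closing prices.

```python
from statistics import mean, stdev


def rolling_stats(values, window=30):
    """Return (moving averages, moving standard deviations), one entry per full window."""
    if window < 2 or window > len(values):
        raise ValueError("window must be between 2 and len(values)")
    averages, deviations = [], []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        averages.append(mean(chunk))
        deviations.append(stdev(chunk))  # sample standard deviation (n - 1 denominator)
    return averages, deviations


# Toy series with a 3-point window; the project uses a 30-day window.
ma, mstd = rolling_stats([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
print(ma, mstd)
```

If the moving average drifts or the moving standard deviation changes markedly over time, the mean or variance of the series is unlikely to be stable.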
Subsequently, the Augmented Dickey-Fuller (ADF) test indicated that the series could be considered stationary at the
The ARIMA(1, 0, 0) model was fitted, and although the visual fit appeared good, residual diagnostics revealed issues:
- Persistent autocorrelation, according to the Ljung–Box test.
- Lack of normality, according to the Jarque–Bera test.
- Heteroskedasticity.
These results indicated that the model did not fully capture the complexity of the time series. Error metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE) showed moderate predictive performance. This was reflected in the 30-day forecasts, which exhibited wide 95% confidence intervals.
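For reference, the three error metrics mentioned above can be computed as in this small sketch. The function name `error_metrics` and the numbers are illustrative assumptions, not the project's results.

```python
def error_metrics(actual, predicted):
    """Return MAE, MSE, and MAPE (in percent) for two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Illustrative values only.
metrics = error_metrics([0.10, 0.20, 0.40], [0.11, 0.18, 0.44])
print({name: round(value, 6) for name, value in metrics.items()})
```

MAPE is scale-free, which makes it convenient for an asset whose price level changed a lot over the sample, while MAE and MSE are expressed in USD and squared USD respectively.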
Taking into consideration the identified limitations of the proposed model based on the conducted analyses, a more robust model was sought. Therefore, a systematic search over the ARIMA parameters (
- An autoregressive component $p$ of order 3.
- No differencing component $d$.
- A moving average component $q$ of order 3.
The choice of the differencing component
The ARIMA(3, 0, 3) model achieved the lowest AIC among the evaluated candidates and showed improved residual diagnostics, particularly in eliminating autocorrelation. Its visual fit to the historical series was also more accurate.
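The AIC-based selection can be sketched as a simple grid search. Several things here are assumptions: the search bounds `max_p` and `max_q`, the fixed `d=0`, the parameter count `p + q + 2` (constant plus innovation variance), and the `log_likelihood_fn` callback, which stands in for whatever fitting routine the notebook uses. The toy likelihood below is invented so the example runs without a modeling library.

```python
from itertools import product


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_order_by_aic(log_likelihood_fn, max_p=3, max_q=3, d=0):
    """Try every (p, d, q) in the grid and return the order with the lowest AIC."""
    best_order, best_aic = None, float("inf")
    for p, q in product(range(max_p + 1), range(max_q + 1)):
        order = (p, d, q)
        try:
            loglik = log_likelihood_fn(order)
        except (ValueError, ArithmeticError):
            continue  # skip candidates whose fit fails
        n_params = p + q + 2  # AR terms + MA terms + constant + variance (assumed)
        score = aic(loglik, n_params)
        if score < best_aic:
            best_order, best_aic = order, score
    return best_order, best_aic


def toy_loglik(order):
    """Invented log-likelihood that peaks at p=2, q=1; not a real model fit."""
    p, _, q = order
    return -5.0 * ((p - 2) ** 2 + (q - 1) ** 2)


best, score = select_order_by_aic(toy_loglik)
print(best, score)
```

If the notebook uses statsmodels, the callback could return the fitted model's `llf` attribute; that library choice is an assumption, since the README does not name it. Because AIC penalizes each extra parameter, a larger model such as ARIMA(3, 0, 3) is only preferred when its likelihood gain outweighs the added complexity.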
Finally, the 30-day forecast generated by the optimized model displayed a more detailed continuation of the recent trend. Although confidence intervals remained wide due to the inherent volatility of cryptocurrencies, short-term predictions exhibited increased reliability.
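To see why the intervals widen with the horizon, the sketch below works through the simpler AR(1) case, not the ARIMA(3, 0, 3) model itself. The parameter values are illustrative rather than fitted, `ar1_forecast` is a hypothetical helper, and parameter estimation uncertainty is ignored.

```python
import math


def ar1_forecast(last_value, mean, phi, sigma2, steps=30, z=1.96):
    """Point forecasts and approximate 95% intervals for a stationary AR(1) process."""
    if not -1 < phi < 1:
        raise ValueError("phi must lie in (-1, 1) for a stationary AR(1)")
    results = []
    variance = 0.0
    for h in range(1, steps + 1):
        # The forecast decays geometrically from the last value toward the mean.
        point = mean + phi ** h * (last_value - mean)
        # Forecast error variance: sigma2 * sum_{j=0}^{h-1} phi^(2j).
        variance += sigma2 * phi ** (2 * (h - 1))
        half_width = z * math.sqrt(variance)
        results.append((point, point - half_width, point + half_width))
    return results


# Illustrative parameters only.
forecast = ar1_forecast(last_value=0.2, mean=0.1, phi=0.9, sigma2=0.0001)
for h in (1, 10, 30):
    point, low, high = forecast[h - 1]
    print(h, round(point, 4), round(low, 4), round(high, 4))
```

The closer `phi` is to 1, the more slowly the forecast returns to the mean and the faster the interval grows, which is the behavior expected for a highly persistent price series.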
The main result of this project is that Dogecoin’s closing price constitutes a complex, highly volatile, and heteroskedastic time series. While a simple model such as ARIMA(1, 0, 0) can capture dominant persistence, a more complex model like ARIMA(3, 0, 3) selected via AIC provides a superior fit and improved modeling of temporal dependence.
Final recommendation
As a final recommendation for this project, the lack of normality and presence of heteroskedasticity in the residuals of both models suggest that alternative approaches, such as GARCH models, may offer improved performance when modeling volatility.
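As a pointer to what a GARCH model adds, the sketch below shows only the GARCH(1,1) conditional variance recursion, $\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$, with parameters supplied by hand. Nothing is estimated here; the function name, the parameter values, and the residuals are assumptions made for illustration.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Return conditional variances from the GARCH(1,1) recursion, one per residual."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    # Start at the unconditional (long-run) variance.
    variances = [omega / (1 - alpha - beta)]
    for eps in residuals[:-1]:
        variances.append(omega + alpha * eps ** 2 + beta * variances[-1])
    return variances


# Illustrative residuals with one large shock at the second position.
variances = garch11_variance([0.0, 3.0, 0.0, 0.0], omega=0.1, alpha=0.2, beta=0.7)
print([round(v, 3) for v in variances])
```

The variance jumps after the large shock and then decays gradually, which is the volatility clustering that a constant-variance ARIMA model cannot represent.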
The most important thing I learned from this project is that, when analyzing a volatile time series such as Dogecoin’s closing prices, ARIMA parameters can be selected from the statistical analysis, but the resulting model’s performance must still be checked to confirm the choice. External factors, such as news events and the behavior of other assets, also need to be considered so that volatility is explained without misinterpreting the statistical results. Alternative models are worth exploring as well: the AIC-selected model proposed in this project, or GARCH and SARIMA models, can improve our understanding of the volatility and stationarity of the series.
- Explore GARCH and SARIMA models.
- Implement machine learning techniques.
- Analyze events that may affect the time series.
- Incorporate external factors to explain volatility.
- Compare results with other cryptocurrencies of similar characteristics.
To run the project, open the Jupyter Notebook Dogecoin_Time_Series_Analysis_and_Forecasting, load the CSV file dogecoin.csv, and run all cells.




