

Predicting the daily counts of COVID-19 infection using temporal convolutional networks

Michael Li1,2, Fatemeh Esfahani1, Li Xing3*, Xuekui Zhang1*

1 University of Victoria, Victoria, British Columbia, Canada
2 Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
3 University of Saskatchewan, Saskatoon, Saskatchewan, Canada
* Joint senior authorship.

DOI: 10.7189/jogh.13.03029



The coronavirus disease 2019 (COVID-19) pandemic has significantly impacted the global economy and society. One of the key challenges in combating it was predicting its spread so that appropriate measures, such as lockdowns and social distancing, could be taken. These measures have now been lifted, and many countries are entering the final stages of the COVID-19 pandemic.

It is essential to continue studying the data collected during the COVID-19 pandemic, even as the focus shifts to recovery and rebuilding, to improve our ability to respond to future pandemics and protect public health. The COVID-19 pandemic has provided a wealth of data that can be used to enhance our understanding of the virus and how it spreads. We used data from 3112 counties in the USA obtained from multiple sources, including the daily infection rates from the COVID-19 Data Repository of the Center for Systems Science and Engineering (CSSE) at the Johns Hopkins University [1], interventions used to control the spread of the virus [2], and demographics from the US Census [3], to train monitoring systems that detect and track future outbreaks or pandemics, allowing us to better prepare for or even mitigate them in advance.

Artificial intelligence (AI) models have been used to forecast the cumulative daily number of COVID-19 cases. These models can analyse large amounts of data and make predictions quickly, which is critical in fast-moving pandemics. We built a forecasting model based on the temporal convolutional network (TCN) [4] and implemented a web application [5] that displays 28-day forecasts for every county in the United States. In our evaluation study, we found that our TCN-based model outperformed its extension (an ensemble model) and other state-of-the-art forecasting models.


TCNs are a deep learning method proposed by Lea et al. [4] in 2017. They are commonly used for tasks involving time-series data and can be trained in parallel, which results in shorter training times and efficient graphics processing unit (GPU) usage. TCNs do not exhibit the vanishing gradient problem observed in recurrent neural networks [4] and can thus capture long-term dependencies in data, which is vital for accurately predicting the spread of the virus. TCNs have been used in various applications, such as flood forecasting and lip-reading recognition [6,7].
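To make the idea of capturing long-term dependencies more concrete, the following is a minimal sketch of the dilated causal convolution that TCNs are commonly built from. It is an illustration only, not our implementation: the kernel weights, the dilation schedule (1, 2, 4), and the function names are assumptions chosen for readability, and real TCN blocks typically add residual connections and more than one convolution per block.

```python
def causal_dilated_conv(series, kernel, dilation):
    """Apply a 1-D causal convolution with the given dilation.

    The output at time t depends only on series[t], series[t - dilation],
    series[t - 2 * dilation], ... and never on future values. Positions
    before the start of the series are treated as zero (left padding).
    """
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                total += weight * series[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Past time steps visible at the top of a stack with one convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Toy input: a single spike on day 0, so the output shows which later days "see" day 0.
impulse = [1.0] + [0.0] * 7
kernel = [1.0, 1.0]  # hypothetical weights; in a trained network these are learned

hidden = impulse
for d in (1, 2, 4):  # assumed dilation schedule, doubling at each layer
    hidden = causal_dilated_conv(hidden, kernel, d)

print(hidden)                                               # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(receptive_field(kernel_size=2, dilations=(1, 2, 4)))  # 8
```

With three layers of kernel size 2, the top layer already covers eight days of history, and each output position within a layer is computed independently of the others, which is what allows the computation to be parallelised.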

Photo: Our web application implementing methods discussed in this viewpoint, displaying 28-day forecasts for every county in the United States. Source: c0vidcather website, no permission needed for use. Available:

Our model takes a seven-day window of COVID-19 cases, which is processed through a TCN layer of size 64. The output is then passed through a 20% dropout layer to the dense output layer, which predicts the daily cumulative COVID-19 cases on the eighth day (Appendix S1 in the Online Supplementary Document).
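The seven-day input and eighth-day target described above can be sketched as a sliding window over one county's cumulative case series. This is a minimal illustration under stated assumptions: the function name `make_windows` and the example counts are hypothetical, and the actual preprocessing is described in Appendix S1.

```python
def make_windows(cumulative_cases, window=7):
    """Split a series into (input window, next-day target) pairs.

    Each input holds `window` consecutive days; the target is the value
    on the following day (the eighth day when window=7).
    """
    inputs, targets = [], []
    for start in range(len(cumulative_cases) - window):
        inputs.append(cumulative_cases[start:start + window])
        targets.append(cumulative_cases[start + window])
    return inputs, targets


# Hypothetical cumulative counts for one county over ten days.
county = [3, 5, 8, 13, 21, 30, 42, 55, 70, 88]
X, y = make_windows(county)

print(len(X))      # 3
print(X[0], y[0])  # [3, 5, 8, 13, 21, 30, 42] 55
```

Ten days of data yield three training pairs; each pair would then be fed to the TCN layer, the dropout layer, and the dense output layer in turn.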


The TCN model cannot handle non-time-series data, which motivated us to extend the TCN model to an ensemble model. Our ensemble model combines multiple data sources to make predictions and uses those of different models to make a final prediction. This is advantageous, as it incorporates a broader range of variables, providing a more comprehensive overview of the situation.

Our model combines time-series data and tabular data. The time-series data consist of a seven-day window of COVID-19 cases, and the tabular data contain 24 variables from the US Census used in other COVID-19 studies [3,8]. The tabular data are the input for a feedforward artificial neural network (ANN), while the time-series data are processed through the TCN model. The output predictions of the ANN and TCN are passed through a concatenate layer and then a dense output layer that produces the eighth day’s predicted daily cumulative cases. The details of our ensemble model are presented in Appendix S2 in the Online Supplementary Document.
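The final step of the ensemble, concatenating the two branch outputs and passing them through a dense output layer, can be sketched as follows. This is a simplified illustration, not the trained model: the branch outputs, weights, and bias are hypothetical numbers, and a linear activation is assumed for the output layer.

```python
def dense(inputs, weights, bias):
    """One dense unit with linear activation: weighted sum of inputs plus bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def ensemble_predict(tcn_output, ann_output, weights, bias):
    """Concatenate both branch outputs, then apply the dense output layer."""
    merged = list(tcn_output) + list(ann_output)  # the concatenate layer
    return dense(merged, weights, bias)


# Hypothetical branch outputs for one county and hypothetical learned parameters.
tcn_output = [120.0]  # prediction from the time-series branch
ann_output = [100.0]  # prediction from the tabular branch
prediction = ensemble_predict(tcn_output, ann_output, weights=[0.75, 0.25], bias=0.0)

print(prediction)  # 115.0
```

In this form the output layer learns how much weight to give each branch; if the tabular branch adds no information, training can in principle drive its weight towards zero.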


We compared the performance of our proposed models with several state-of-the-art approaches presented in the literature, including the statistical autoregressive integrated moving average (ARIMA) model [9], long short-term memory (LSTM) networks [10], convolutional neural networks (CNN), and artificial neural networks (ANN) [11].

In this evaluation study, we randomly split our data into two subsets for model training and testing. We repeated this experiment ten times to obtain confidence intervals for our comparisons and reduce the effect of the random split on our results. In each repetition, all models were trained and evaluated on the same train-test split to ensure a fair comparison. As the evaluation metric, we used the mean absolute error (MAE), a widely used error measure for forecasting continuous values. The MAE is defined as the average absolute difference between predicted and actual cases in the test data.
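The evaluation procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than our evaluation code: the 20% test fraction, the fixed seed, the helper names, and the simple persistence baseline used in the example are all hypothetical.

```python
import random


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_split(n_samples, test_fraction, rng):
    """Return (train indices, test indices) from one random shuffle."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def repeated_evaluation(inputs, targets, fit, n_repeats=10, test_fraction=0.2, seed=0):
    """Train and score a model on several random splits; return one MAE per split."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train_idx, test_idx = random_split(len(inputs), test_fraction, rng)
        model = fit([inputs[i] for i in train_idx], [targets[i] for i in train_idx])
        predictions = [model(inputs[i]) for i in test_idx]
        scores.append(mean_absolute_error([targets[i] for i in test_idx], predictions))
    return scores


def fit_persistence(train_inputs, train_targets):
    """Hypothetical baseline: predict that tomorrow equals the last observed day."""
    return lambda window: window[-1]


# Toy data in which every target is exactly 2 above the last value of its window.
inputs = [[d, d + 1] for d in range(50)]
targets = [window[-1] + 2 for window in inputs]
scores = repeated_evaluation(inputs, targets, fit_persistence)

print(mean_absolute_error([10, 20, 30], [12, 18, 33]))  # (2 + 2 + 3) / 3
print(scores)
```

To compare several models fairly, the same sequence of splits (here fixed by `seed`) would be reused for each model, so that differences in MAE reflect the models rather than the split.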

The MAEs of the six forecasting methods over the ten random experiments are visualised as side-by-side boxplots (Figure 1). Smaller MAEs, or lower box positions, indicate better forecasting performance. We found that the TCN model outperformed all other models (mean MAE = 19.71); the ensemble model was the second best (mean MAE = 26.38), indicating that the added non-temporal information did not improve the TCN’s performance. The other models had much larger mean MAEs: 51.25 for ANN, 73.51 for CNN, 90.61 for LSTM, and 683.86 for ARIMA. Notably, all ten MAE values of the TCN and ensemble models were smaller than the MAEs of the four other models. Furthermore, ARIMA had a notably higher mean MAE than the other models, which we believe is due to the way it is trained. Since it uses a moving average estimate, any errors it makes accumulate over time. Thus, it diverges quickly over longer forecasts, an issue the TCN does not have.
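The way errors can accumulate over a longer forecast can be illustrated with a toy recursive forecast, in which each prediction is fed back as input for the next day. This sketch does not reproduce ARIMA or any model in our comparison: the recursive scheme, the `biased_model` function, and the numbers are assumptions chosen only to show the mechanism.

```python
def recursive_forecast(history, one_step_model, horizon, window=7):
    """Forecast `horizon` days ahead by feeding each prediction back as input."""
    series = list(history)
    for _ in range(horizon):
        series.append(one_step_model(series[-window:]))
    return series[len(history):]


def biased_model(window):
    """Hypothetical one-step model that overestimates daily growth by one case."""
    return window[-1] + 11


# Toy truth: cumulative cases grow by exactly 10 per day.
history = [10 * day for day in range(7)]    # days 0 to 6
truth = [10 * day for day in range(7, 35)]  # days 7 to 34

forecast = recursive_forecast(history, biased_model, horizon=28)
errors = [f - t for f, t in zip(forecast, truth)]

print(errors[0], errors[6], errors[27])  # 1 7 28
```

A one-case error on the first day grows to 28 cases by day 28, because every step inherits the error of the step before it.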

Figure 1. Boxplots of the MAEs of six forecasting models: ARIMA, ANN, LSTM, CNN, TCN, and the ensemble model. Each box is constructed using the ten MAEs of one method, shown as the points inside each box. The lower the box’s position, the smaller the MAE values, indicating that the method has better prediction performance.


We presented the application of the TCN model for disease forecasting and demonstrated that it outperforms state-of-the-art approaches. Based on these findings, we believe that the TCN is an excellent model for forecasting during the development of pandemic monitoring systems.

Despite being the top candidate for our forecasting tool, the TCN model has limitations, so we suggest using it with caution. First, it requires a large amount of data to make accurate predictions, so it would not be helpful during the early stages of a pandemic or the emergence of new variants. This limitation is not unique to the TCN model and is well known for other deep learning methods; it could potentially be addressed through transfer learning, using models built for other diseases or previous variants of the same virus. Second, the TCN model can only handle time-series data. It therefore does not have a complete picture of the situation and cannot consider variables such as public transportation use and demographic characteristics. We tried using ensemble machine learning to combine the TCN model with an ANN model built from non-time-series data, but the ensemble model did not outperform the TCN model. There are two possible reasons for this: the observed time-series data might already have carried all trends encoded in the demographic variables we added, so combining them gave no new information; or a better ensemble method may be needed to use the information in the two data sources more efficiently. Third, there are limitations in the quality of the input data, since data collection may be less accurate when positive cases increase rapidly beyond capacity. AI models and their results should be used cautiously in decision-making, and comprehensive validation is always recommended.

Additional material

Online Supplementary Document


This research was enabled in part by computing resources support provided by WestGrid and Compute Canada.

Data availability: The data used in this study are from the Johns Hopkins Coronavirus Resource Center, the American Community Survey (ACS), the Oxford COVID-19 Government Response Tracker (OxCGRT), and NBC News.

Funding: Xuekui Zhang is funded by the Canada Research Chair program (CRC-2021-00232) and the Michael Smith Health Research BC Scholar Program (SCH-2022-2553). Li Xing is funded by the Natural Sciences and Engineering Research Council of Canada Discovery Grants (RGPIN-2021-03530).

Authorship contributions: Xuekui Zhang and Li Xing contributed to the study conceptualisation and design and supervised the project. Michael Li conducted the data analysis and prepared the first draft of the manuscript. All authors contributed to the interpretation of the analysis results and to manuscript preparation and revision, and approved the final version of the manuscript.

Disclosure of interest: The authors completed the ICMJE Disclosure of Interest Form (available upon request from the corresponding author) and disclosed no relevant interests.


[1] Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20:533-4. DOI: 10.1016/S1473-3099(20)30120-1. [PMID:32087114]

[2] Hale T, Angrist N, Goldszmidt R, Kira B, Petherick A, Phillips T. A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat Hum Behav. 2021;5:529-38. DOI: 10.1038/s41562-021-01079-8. [PMID:33686204]

[3] US Census Bureau. American community survey (ACS). The United States Census Bureau; 2021. Available: Accessed: October 2022.

[4] Lea C, Flynn MD, Vidal R, Reiter A, Hager GD. Temporal Convolutional Networks for Action Segmentation and Detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017 July 21-26; Honolulu, HI, USA. New Jersey: IEEE Computer Society; 2017. pp. 1003-1012.

[5] c0vidcather development team. c0vidcather. Available: Accessed: 17 May 2023.

[6] Xu Y, Hu C, Wu Q, Li Z, Jian S, Chen Y. Application of temporal convolutional network for flood forecasting. Hydrol Res. 2021;52:1455-68. DOI: 10.2166/nh.2021.021.

[7] Martinez B, Ma P, Petridis S, Pantic MB. Lipreading Using Temporal Convolutional Networks. In: ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2020 May 4-8; Barcelona, Spain. New Jersey: IEEE; 2020. pp. 6319-6323.

[8] Huang X, Shao X, Xing L, Hu Y, Sin DD, Zhang X. The impact of lockdown timing on COVID-19 transmission across US counties. EClinicalMedicine. 2021;38:101035. DOI: 10.1016/j.eclinm.2021.101035. [PMID:34308301]

[9] Awan TM, Aslam M-F. Prediction of daily COVID-19 cases in European countries using automatic ARIMA model. J Public Health Res. 2020;9:1765. DOI: 10.4081/jphr.2020.1765. [PMID:32874964]

[10] Ghany KKA, Zawbaa HM, Sabri HM. COVID-19 prediction using LSTM algorithm: GCC case study. Inform Med Unlocked. 2021;23:100566. DOI: 10.1016/j.imu.2021.100566. [PMID:33842686]

[11] Istaiteh O, Owais T, Al-Madi N, Abu-Soud S. Machine Learning Approaches for COVID-19 Forecasting. In: 2020 International Conference on Intelligent Data Science Technologies and Applications (IDSTA); 2020 October 19-22; Valencia, Spain. New Jersey: IEEE; 2020. pp. 50-7.

Correspondence to:
Xuekui Zhang
Department of Mathematics and Statistics, University of Victoria
BC, Canada, V8P 5C2
[email protected]
Li Xing
Department of Mathematics and Statistics, University of Saskatchewan
Saskatoon, SK, Canada, S7N 5E6
[email protected]