Please note that the analyses and interpretations presented here reflect my personal opinion and do not necessarily reflect the official view of AGES, BASG, EMA, or any associated working party or committee.


Some new analyses of cumulative incidence of death and recovery using random attribution of events to cases


Here are some plots of COVID-19 data that I have prepared and update regularly.

I import the data from the Coronavirus COVID-19 Global Cases by Johns Hopkins CSSE Dashboard, which provides raw data at https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/. The data are updated daily, but may not be as recent as those shown on the online dashboard.

I perform a little bit of pre-processing. Mainly, I collapse the data for the US and China. These data are provided at the provincial level, which makes them difficult to visualise with mapping software. Having the data at the country level lets me easily use different mapping tools that provide map data at the country level.
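The collapsing step can be illustrated with a minimal Python sketch. It is not the code behind these figures; it only assumes the column layout of the CSSE time-series files (`Province/State`, `Country/Region`, `Lat`, `Long`, followed by one column per date). The sample rows and the function name `collapse_by_country` are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample in the layout of the CSSE time-series files.
# The numbers are illustrative only, not real data.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
Hubei,China,30.9,112.2,10,20
Beijing,China,40.1,116.4,1,3
,Austria,47.5,14.5,0,2
"""

def collapse_by_country(csv_text):
    """Sum the provincial rows of each country into one series per country."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]  # the first four columns are metadata
    totals = defaultdict(lambda: [0] * len(dates))
    for row in reader:
        country = row[1]
        for i, value in enumerate(row[4:]):
            # Treat empty cells as zero (an assumption of this sketch).
            totals[country][i] += int(float(value)) if value else 0
    return dates, dict(totals)

dates, totals = collapse_by_country(SAMPLE)
print(totals)
```

Countries that already report a single row pass through unchanged, so the same function can be applied to the whole file.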

I spend most of my time designing new figures, so please excuse the sparse annotation and documentation. Occasionally, data artefacts (missing numbers) or things I didn’t anticipate (declining active cases) result in botched figures, which I may not fix immediately.

For some figures I compute relative numbers in terms of cases per 100 000 population. These use the population numbers provided in the spData package. Interestingly, these numbers are not available for all countries in this package. For some of the missing countries (Norway, France, …) I have entered numbers as found on Wikipedia. This is probably not entirely accurate, as they may not refer to comparable census dates.
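The normalisation itself is simple arithmetic. The Python sketch below shows it together with a fallback for countries whose population is missing. The function name `per_100k` and the example numbers are hypothetical and not taken from spData or Wikipedia.

```python
def per_100k(cases, population):
    """Return cases per 100 000 population, or None if the population is unknown."""
    if population is None or population <= 0:
        return None
    return cases * 100_000 / population

# Made-up example values for illustration only.
print(per_100k(50, 1_000_000))
print(per_100k(50, None))
```

Returning `None` for a missing population keeps such countries visible as gaps in a figure rather than silently plotting them as zero.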

Matching of country names between the disease data and the mapping data is not complete. Some countries report numbers for different regions (Canada, US, China, …). I have collapsed the numbers from those countries. Other countries, e.g. North Macedonia, have different names in the two datasets. For some of the affected countries I have attempted to unify the names; others I may have missed. Currently, most countries outside Africa appear to be matched correctly.
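One way to handle the name matching is an explicit alias table plus a list of names that still fail to match. The Python sketch below shows the idea. The alias entries and the name sets are hypothetical examples, not the actual spellings used in either dataset.

```python
def match_names(disease_names, map_names, aliases):
    """Map disease-data names to map-data names and collect the leftovers."""
    matched = {}
    unmatched = []
    for name in disease_names:
        # Use the alias if one is defined, otherwise try the name as-is.
        candidate = aliases.get(name, name)
        if candidate in map_names:
            matched[name] = candidate
        else:
            unmatched.append(name)
    return matched, unmatched

# Hypothetical spellings for illustration only.
aliases = {"US": "United States", "North Macedonia": "Macedonia"}
disease_names = ["US", "North Macedonia", "Austria", "Atlantis"]
map_names = {"United States", "Macedonia", "Austria"}

matched, unmatched = match_names(disease_names, map_names, aliases)
print(matched)
print(unmatched)
```

Printing the unmatched list after each data update makes it easier to spot countries that have dropped off the map.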

I also estimate a couple of statistics. First, I estimate the average doubling time over a period of 7 days, i.e. \(ADT_t = \frac{7}{\log_2(x_t / x_{t-7})}\), where \(x_t\) is the case count on day \(t\). The choice of a 7-day window is purely heuristic, but it appears to provide a good balance between variability and responsiveness to policy changes (as I have gauged from looking at Italy).
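A minimal Python sketch of a 7-day doubling time follows. It assumes the usual ratio form, 7 divided by the base-2 logarithm of the count on day t over the count 7 days earlier. The function name and the example counts are made up, and days without growth are returned as `None` because the doubling time is undefined there.

```python
import math

def average_doubling_time(counts, window=7):
    """Doubling time in days, computed over a trailing window of `window` days."""
    result = []
    for t, current in enumerate(counts):
        if t < window:
            result.append(None)  # not enough history yet
            continue
        previous = counts[t - window]
        if previous <= 0 or current <= previous:
            result.append(None)  # no growth: doubling time undefined
        else:
            result.append(window / math.log2(current / previous))
    return result

# Made-up counts: 100 -> 200 over 7 days, and 110 -> 440 over 7 days.
counts = [100, 110, 122, 135, 149, 164, 181, 200, 440]
adt = average_doubling_time(counts)
print(adt[7], adt[8])
```

A count that doubles in exactly 7 days gives a doubling time of 7 days, and one that quadruples gives 3.5 days, which is a quick sanity check on the formula.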

Second, I estimate the case fatality rate using a simple estimator that is recommended here and here (TODO: add links to papers). It divides the number of fatalities by the number of resolved cases, i.e. deaths plus recoveries, which implies the assumption that unresolved cases will recover or die at the same rate as resolved cases.
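In code, the estimator amounts to a single division. The Python sketch below assumes that resolved cases are the sum of deaths and recoveries. The function name and the example numbers are hypothetical.

```python
def resolved_case_fatality_rate(deaths, recovered):
    """Deaths divided by resolved cases (deaths + recoveries)."""
    resolved = deaths + recovered
    if resolved == 0:
        return None  # no resolved cases yet, so the estimate is undefined
    return deaths / resolved

# Made-up numbers for illustration only.
print(resolved_case_fatality_rate(10, 90))
```

Early in an outbreak the number of resolved cases is small, so this estimate can be very unstable.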

Finally, the largest issue is certainly that all analyses can only be as good as the underlying numbers. I have no detailed insight into the testing and reporting practices of different countries, but from what one can glean from the media, some dynamics in the data of certain countries likely reflect the process of data gathering more than the disease itself. This may be less of an issue with mortality figures; however, I would suppose that there are issues with the attribution of disease-related mortality as well.