How did South Africa’s genomic surveillance miss the Omicron variant until it was too late?

Neil Thomas Stacey
Nov 26, 2021

On the 23rd of November, South Africa sequenced several samples of a new COVID-19 variant, labeled B.1.1.529. Around the same time, Hong Kong sequenced a sample of the same variant from a traveler who had been in South Africa. The fact that a foreign government picked it up from one of our infections at the same time as we did is already an indication that our detection was a bit late, but the picture is bleaker than that.

In the days since that first sequencing, we have been testing for B.1.1.529 using a type of PCR testing that can identify which samples lack a particular gene sequence and, in so doing, distinguish certain variants. B.1.1.529 has almost 100% prevalence in samples collected between 12 November and 20 November, meaning that by the time we detected it, it had already taken over completely. Moreover, cases began to rise rapidly on the 19th of November, further indicating that the horse had already bolted before we’d checked at the gate.

Our genomic surveillance program has been internationally praised as being among the world’s best-equipped, with superbly qualified personnel. Scientists had long since forecast that a 4th wave would be starting around now, and our genomics experts had also previously identified prolonged infection in an immuno-compromised patient as a major risk factor for substantial mutation. This was a crucial time to be vigilant for new variants emerging, and we knew it.

How, then, did one of the world’s most sophisticated genomic surveillance programs let this go undetected for so long, when forewarned to be particularly vigilant?*

The short answer is that the best surveillance technology imaginable does nothing if it’s pointed in the wrong direction. That would be a fair description of how we were set up.

Our genomic surveillance pulls from the PCR testing pool. Most South Africans can’t afford PCR tests, each of which costs roughly a quarter of the median household monthly income. So our genomic surveillance has essentially just been looking at the richest 10% of the country. That 10% also doesn’t use public transport, can mostly work from home when needed, and is highly vaccinated. HIV prevalence, a colossal factor for mutation rate, has a very strong inverse correlation with income.

The portion of the population most likely to be exposed to COVID, most likely to transmit it onwards, and most likely to have mutations occur, has been almost entirely omitted from our genomic surveillance. So that’s how we missed this one; we weren’t looking.
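To put rough shape on that argument, here is a minimal sketch of how a biased testing pool cuts the odds of catching a new variant early. Every number in it is a hypothetical placeholder, not South African data, and it assumes each case inside the tested pool has the same independent chance of being sequenced.

```python
def detection_probability(variant_cases_in_pool: int, sequencing_fraction: float) -> float:
    """Chance that at least one variant case gets sequenced, assuming every case
    in the tested pool is sequenced independently with the same probability."""
    return 1.0 - (1.0 - sequencing_fraction) ** variant_cases_in_pool


# All values below are hypothetical, chosen only to illustrate the effect.
variant_cases = 1000          # early variant infections in the whole country
share_in_tested_pool = 0.02   # assumed share of those infections among people who get PCR tests
sequencing_fraction = 0.05    # assumed share of PCR positives that get sequenced

# If sequencing drew from everyone, all 1000 variant cases would be candidates.
unbiased = detection_probability(variant_cases, sequencing_fraction)

# If sequencing only sees the tested pool, just 20 of them are candidates.
biased = detection_probability(round(variant_cases * share_in_tested_pool), sequencing_fraction)

print(f"Sampling everyone:        {unbiased:.1%}")
print(f"Sampling tested pool only: {biased:.1%}")
```

The point is not the specific percentages but the mechanism: the variant cases that fall outside the tested pool contribute nothing to detection, however good the sequencing is.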

I wrote about this in late September (link to belated ResearchGate post), but no one wanted to hear it. A veritable parade of international public health journals rejected an article I wrote to that effect, and our genomics experts, enamoured with their technical prowess and technological resources, remained oblivious to the prosaic matter of a systemic sampling bias distorting all of their work.

Some of you might feel that this post comes across as bitter. You’re right. The rest of you aren’t reading carefully enough.

*The glib answer would be that in the 30 days prior to B.1.1.529’s emergence, we only sequenced 44 samples: just 0.417% of recorded cases, and quite possibly less than 0.2% of actual infections. Early detection is all but impossible at that sampling rate.
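The footnote's arithmetic can be worked through in a few lines. This is a rough sketch that takes only the 44 samples and the 0.417% figure from above; it assumes sequenced samples are drawn uniformly at random from recorded cases, which is more generous than the biased sampling described in this post, so the results are best read as an optimistic upper bound.

```python
import math

samples_sequenced = 44
fraction_of_recorded = 0.00417  # 0.417% of recorded cases, from the footnote

# Recorded cases implied by those two figures over the 30-day window.
implied_recorded_cases = samples_sequenced / fraction_of_recorded


def detection_probability(variant_cases: int, fraction: float) -> float:
    """Chance that at least one of `variant_cases` recorded variant cases is sequenced,
    assuming each recorded case is sequenced independently with probability `fraction`."""
    return 1.0 - (1.0 - fraction) ** variant_cases


p_10 = detection_probability(10, fraction_of_recorded)
p_100 = detection_probability(100, fraction_of_recorded)

# Recorded variant cases needed before detection becomes 95% likely.
cases_for_95 = math.ceil(math.log(0.05) / math.log(1.0 - fraction_of_recorded))

print(f"Implied recorded cases: {implied_recorded_cases:.0f}")
print(f"P(detect) with 10 recorded variant cases:  {p_10:.1%}")
print(f"P(detect) with 100 recorded variant cases: {p_100:.1%}")
print(f"Recorded variant cases needed for 95% detection: {cases_for_95}")
```

Under these assumptions, a variant would need hundreds of recorded cases before a single sequenced sample became likely, by which point it is no longer early.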

