An Overview of Pooled Testing Procedures with Application to the Covid-19 Pandemic
Introduction
Pooled testing is a procedure used to reduce the cost of screening a large number of samples and is commonly used for screening infectious diseases. The application of pooled testing in a wide variety of infectious disease screening settings has been studied in [1–5]. A recent example is the research at the German Red Cross Blood Donor Service in Frankfurt, headed by Professor Erhard Seifried, and the Institute for Medical Virology at the University Hospital Frankfurt at Goethe University, headed by Professor Sandra Ciesek, in which a procedure for increasing worldwide testing capacities for detecting SARS-CoV-2 has been developed [6]. Pooled testing, or group testing, is a procedure in which individual specimens such as blood or urine samples are combined into pools that are tested for a binary response, for example Covid positive or Covid negative. If a pool tests negative, then all individuals within it are diagnosed as negative. If it tests positive, then retesting within the pool is done to identify the positive individuals.
This article gives an overview of pooled testing procedures which could be used in the control of the Covid-19 pandemic.

Pooled Testing Methods
There are different pooled testing methods [7]. These can be classified into two types: (1) non-informative methods, which assume the population to be homogeneous, and (2) informative methods, which assume the population to be heterogeneous.
(1) Non-Informative Methods
One of the earliest methods of pooled testing is Dorfman testing. In this method, if a pool tests negative, then all individuals within that pool are confirmed as negative. If the pool tests positive, then at least one individual within that pool is positive, and each specimen in the pool is retested individually to find the positives. This is the easiest method to apply, but retesting every member of each positive pool can lead to a large number of tests.
For example, consider a study of 100 random patient samples for an infectious disease, in which the samples were also tested individually. The samples were combined into 20 pools of 5 samples each. Individual testing found 8 positive samples, spread across 6 pools. Accordingly, six pools tested positive and the remaining fourteen pools tested negative. All the samples in these 6 pools were then retested individually, so 30 patient samples were retested. Even so, the entire population of 100 patients was covered by 50 tests (20 pool tests followed by 30 individual tests).
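The arithmetic in this example can be checked with a short sketch. The function name dorfman_test_count and the particular placement of the 8 positives are assumptions made for illustration, and the sketch assumes a perfectly accurate assay.

```python
def dorfman_test_count(statuses, pool_size):
    """Count the tests used by two-stage Dorfman testing.

    statuses: list of booleans, True meaning the sample is positive.
    Assumes a perfectly accurate assay (an assumption of this sketch).
    """
    tests = 0
    for start in range(0, len(statuses), pool_size):
        pool = statuses[start:start + pool_size]
        tests += 1                      # one test on the pooled specimen
        if any(pool) and len(pool) > 1:
            tests += len(pool)          # retest every member of a positive pool
    return tests


# Hypothetical layout matching the example: 100 samples, 20 pools of 5,
# 8 positives spread across 6 pools (two pools hold two positives each).
statuses = [False] * 100
for idx in (0, 1, 5, 6, 10, 15, 20, 25):
    statuses[idx] = True

total = dorfman_test_count(statuses, pool_size=5)
print(total)  # 50
```

The count is 20 pool tests plus 5 retests for each of the 6 positive pools, in line with the 50 tests quoted in the example.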
Alternatively, instead of testing all members of a positive pool individually, a positive pool can be split into two or more sub-pools. If any sub-pool is found positive, then further splitting or individual testing is performed on it. One specific form is binary splitting, as in Litvak et al. [8], in which each split creates two new sub-pools of equal or nearly equal size.
For example, the 6 positive pools above contain 5 samples each. Instead of testing all 30 samples individually, each positive pool of 5 could be split into 2 sub-pools of 2 and 3 samples, and the same procedure repeated on any sub-pool that tests positive. With this binary splitting, a further reduction in testing can be expected when positives are sparse; the size of the saving depends on the pool size and on where the positives fall.
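A minimal sketch of binary splitting on a single pool is given below. The function name halving_tests is hypothetical, a perfect assay is assumed, and every sub-pool is actually tested (no result is inferred from its sibling), which is a simplification rather than the exact procedure of [8].

```python
def halving_tests(pool):
    """Tests needed to resolve one pool by repeated binary splitting.

    pool: list of booleans (True = positive sample). Perfect assay assumed.
    Every sub-pool is tested; nothing is inferred from a sibling's result.
    """
    tests = 1                            # test the (sub-)pool itself
    if not any(pool) or len(pool) == 1:
        return tests                     # negative pool, or a single resolved sample
    mid = len(pool) // 2                 # e.g. 5 samples -> sub-pools of 2 and 3
    return tests + halving_tests(pool[:mid]) + halving_tests(pool[mid:])


# One pool of 5 with a single positive in the first position.
print(halving_tests([True, False, False, False, False]))  # 5
```

For this pool, Dorfman testing would use 6 tests (1 pool test and 5 individual tests). The count changes with the position of the positive, so the saving is not guaranteed for every pool.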
Yet another alternative to individual testing within a positive pool, given by Sterrett [9], relies on the fact that a positive pool is likely to contain few positives (often only one). For an initial pool that tests positive, this method retests individuals one by one in random order until the first positive sample is detected. Once the first positive is found, the specimens that have not yet been retested are pooled again and tested. Retesting stops if this new pool tests negative; otherwise the same process continues until all specimens are resolved.
For example, 6 pools were found to be positive above. One of them is picked and its samples are retested one after the other in random order. Suppose the 3rd sample chosen is found to be positive; the remaining samples are then re-pooled and tested. This process continues until all the samples have been resolved. The resulting reduction in the overall number of tests varies, since it depends on the random order in which the samples are retested.
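The retesting loop can be sketched as follows. The function name sterrett_retests is hypothetical, the input is assumed to be listed in the order in which the samples happen to be retested, and a perfect assay is assumed.

```python
def sterrett_retests(pool):
    """Count retests for one pool that has already tested positive.

    pool: booleans (True = positive) in the order the samples are retested;
    shuffle the list beforehand to mimic random retesting.
    Perfect assay assumed (an assumption of this sketch).
    """
    if not any(pool):
        raise ValueError("pool must contain at least one positive")
    remaining = list(pool)
    tests = 0
    while remaining:
        first_pos = remaining.index(True)
        tests += first_pos + 1           # individual tests up to the first positive
        remaining = remaining[first_pos + 1:]
        if not remaining:
            break                        # nobody left to test
        tests += 1                       # re-pool and test the untested samples
        if not any(remaining) or len(remaining) == 1:
            break                        # negative re-pool, or a single settled sample
    return tests


# The 3rd sample retested is the only positive in a pool of 5.
print(sterrett_retests([False, False, True, False, False]))  # 4
```

Here three individual tests and one re-pooled test replace the five individual retests that Dorfman testing would need for the same pool.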
(2) Informative Methods
The methods seen so far do not consider the heterogeneity of the population. Methods which account for heterogeneity are more realistic and use covariates that determine each individual's risk of being positive. These risks can be calculated in several ways. Usually, a training data set of individual characteristics and related risk factors is used to fit a binary regression model. The fitted model can then be used while screening individuals to estimate their probability of having the disease. These probabilities are used for selecting pool sizes, for organizing the initial testing so as to minimize the number of positive pools, and for deciding the order in which individuals are retested within a positive pool.
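As an illustration, one common choice of binary regression model is logistic regression. The logit link and the notation below (covariate vector x_i, disease status Y_i, fitted coefficients) are assumptions made for this sketch and are not specified in the sources cited here.

```latex
\hat{p}_i \;=\; \Pr(Y_i = 1 \mid \mathbf{x}_i)
\;=\; \frac{\exp\!\left(\hat{\beta}_0 + \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}\right)}
           {1 + \exp\!\left(\hat{\beta}_0 + \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}\right)}
```

Here \hat{p}_i is the estimated risk probability of individual i that the informative methods below take as input.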
For example, the risk factors for the current Covid-19 pandemic could include the age structure of the population, comorbidities such as diabetes and hypertension, and the travel history of the individual, to name a few. These factors could be used to classify individuals as high risk or low risk.
McMahan et al. [10] have proposed two procedures which are extensions of the Dorfman method. The first is threshold optimal Dorfman (TOD), which uses the estimated probabilities to classify individuals as high or low risk. For example, a threshold of 0.3 could be fixed, so that individuals with estimated probabilities greater than the threshold are classed as high risk and the rest as low risk.
High risk persons are tested individually, and low risk people are arranged as per their risk probabilities. Then, they are tested using Dorfman testing with equal or almost equal pool size. The pool size chosen for the low risk individuals determines the expected number of tests.
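The effect of the pool size on the expected number of tests can be sketched as below. This is an illustration under stated assumptions, not the procedure of [10]: the threshold is taken as fixed, the assay is assumed perfect, statuses are assumed independent, and all function names and the max_size cap are hypothetical.

```python
from math import prod


def expected_pool_tests(probs):
    """Expected Dorfman tests for one pool (perfect assay, independent statuses)."""
    if len(probs) == 1:
        return 1.0                                   # a pool of one is an individual test
    p_positive_pool = 1.0 - prod(1.0 - p for p in probs)
    return 1.0 + len(probs) * p_positive_pool        # pool test + expected retests


def tod_expected_tests(probs, threshold, pool_size):
    """Expected tests when high-risk people are tested one by one."""
    high = [p for p in probs if p > threshold]
    low = sorted(p for p in probs if p <= threshold)  # low risk, ordered by risk
    total = float(len(high))                          # one test per high-risk person
    for start in range(0, len(low), pool_size):
        total += expected_pool_tests(low[start:start + pool_size])
    return total


def best_pool_size(probs, threshold, max_size=10):
    """Pool size (1..max_size) with the smallest expected number of tests."""
    return min(range(1, max_size + 1),
               key=lambda c: tod_expected_tests(probs, threshold, c))


# Hypothetical risks: eight low-risk people and two above the 0.3 threshold.
risks = [0.02] * 8 + [0.4, 0.5]
print(best_pool_size(risks, threshold=0.3))  # 8
```

With these made-up risks, placing all eight low-risk individuals in one pool gives the smallest expected number of tests among the sizes tried.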
The second method is the pool specific optimal Dorfman (PSOD) method. In this method, individuals are ordered by their estimated risk probabilities. The ordered individuals are then assigned to successive pools, and the optimal pool sizes are obtained by minimizing the expected number of tests using a greedy optimization algorithm [10]. As in the earlier methods, the individuals within positive pools are retested individually. In actual application, both the TOD and PSOD methods are applied to groups of individuals.
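One way to picture such a greedy search is sketched below. It is not the exact algorithm of [10]: the function names, the max_size cap, and the choice of expected tests per individual as the criterion are assumptions, as are a perfect assay and independent statuses.

```python
from math import prod


def expected_pool_tests(probs):
    """Expected Dorfman tests for one pool (perfect assay, independent statuses)."""
    if len(probs) == 1:
        return 1.0
    return 1.0 + len(probs) * (1.0 - prod(1.0 - p for p in probs))


def greedy_pools(probs, max_size=20):
    """Assign risk-ordered individuals to successive pools, one pool at a time.

    Each pool takes the next c individuals, where c minimizes the expected
    number of tests per individual for that pool.
    """
    ordered = sorted(probs)                  # lowest risk first
    pools, start = [], 0
    while start < len(ordered):
        limit = min(max_size, len(ordered) - start)
        best = min(
            range(1, limit + 1),
            key=lambda c: expected_pool_tests(ordered[start:start + c]) / c,
        )
        pools.append(ordered[start:start + best])
        start += best
    return pools


# Hypothetical risks: four low-risk people and one high-risk person.
print(greedy_pools([0.9, 0.01, 0.01, 0.01, 0.01]))
# [[0.01, 0.01, 0.01, 0.01], [0.9]]
```

In this toy case the low-risk individuals share one pool and the high-risk individual ends up alone, which amounts to an individual test.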
Black et al. [11] proposed a method based on binary splitting of the pools. The sub-pools are formed in such a manner that one sub-pool has a higher probability of being positive and the other a higher probability of being negative. This helps keep the positives grouped together throughout the retesting process, enabling the other sub-pools to test negative in earlier iterations. Sub-pool configurations are determined by ordering the individuals by their risk probabilities. Individuals below the median are allotted to one sub-pool and individuals above the median to the other.
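The benefit of ordering by risk before splitting can be illustrated with a small sketch. It is a simplified illustration rather than the procedure of [11]: the function names and the example pool are hypothetical, a perfect assay is assumed, and every sub-pool is tested.

```python
def halve(individuals):
    """Tests used by binary splitting on (risk, is_positive) pairs, in the given order."""
    tests = 1                                        # test this (sub-)pool
    if len(individuals) == 1 or not any(status for _, status in individuals):
        return tests
    mid = len(individuals) // 2                      # split at the middle position
    return tests + halve(individuals[:mid]) + halve(individuals[mid:])


def informative_halving_tests(individuals):
    """Order by estimated risk first, so each split separates low from high risk."""
    ordered = sorted(individuals, key=lambda ind: ind[0])
    return halve(ordered)


# Hypothetical pool of six: (estimated risk, true status).
pool = [(0.30, True), (0.02, False), (0.05, False),
        (0.01, False), (0.25, True), (0.03, False)]

print(halve(pool))                      # 9 tests in the arbitrary listed order
print(informative_halving_tests(pool))  # 7 tests after ordering by risk
```

In this made-up pool the two positives also carry the highest estimated risks, so ordering places them in the same sub-pool and the low-risk half is cleared with a single test.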
However, for practical applications, 3-step and 4-step implementations are recommended for both the non-informative and informative procedures.
Bilder et al. [12] proposed an informative adaptation of Sterrett's procedure. In this method, instead of being retested in random order, individuals are retested in descending order of their estimated risk probabilities until the first positive individual is found. After every positive individual is found, the process is repeated on the individuals that have not yet been retested. This method is called full informative Sterrett (FIS).
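A minimal sketch of this ordering idea follows. The function name fis_retests and the example pool are hypothetical, a perfect assay is assumed, and the sketch is an illustration rather than the implementation of [12].

```python
def fis_retests(individuals):
    """Count retests for one positive pool, retesting in descending risk order.

    individuals: (estimated risk, is_positive) pairs from a pool that has
    already tested positive. Perfect assay assumed.
    """
    ordered = sorted(individuals, key=lambda ind: ind[0], reverse=True)
    remaining = [status for _, status in ordered]
    if not any(remaining):
        raise ValueError("pool must contain at least one positive")
    tests = 0
    while remaining:
        first_pos = remaining.index(True)
        tests += first_pos + 1           # individual tests, highest risk first
        remaining = remaining[first_pos + 1:]
        if not remaining:
            break
        tests += 1                       # re-pool and test those not yet retested
        if not any(remaining) or len(remaining) == 1:
            break                        # negative re-pool, or a single settled sample
    return tests


# Hypothetical pool of five in which the highest-risk person is the only positive.
pool = [(0.02, False), (0.40, True), (0.05, False), (0.01, False), (0.10, False)]
print(fis_retests(pool))  # 2
```

Because the highest-risk individual is retested first, one individual test and one re-pooled test resolve this pool.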
Conclusion
In the current global pandemic scenario, timely identification of positives is extremely important for controlling the spread of the disease. The optimal use of available resources such as testing kits and manpower therefore plays a crucial part. The pooled testing procedures discussed here will be useful in testing a larger population with limited resources.
References
1. Blood testing. URL http://www.redcrossblood.org/learn-about-blood/what-happens-donated-blood/blood-testing, retrieved January 7, 2012
2. Dodd R, Notari E, Stramer S. Current prevalence and incidence of infectious disease markers and estimated window-period risk in the American Red Cross donor population. Transfusion. 2002; 42:975–979. [PubMed: 12385406]
3. Gaydos C. Nucleic acid amplification tests for gonorrhea and chlamydia: practice and applications. Infectious Disease Clinics of North America. 2005; 19:367–386. [PubMed: 15963877]
4. Hourfar M, Themann A, Eickmann M, Puthavathana P, Laue T, Seifried E, Schmidt M. Blood screening for influenza. Emerging Infectious Diseases. 2007; 13:1081–1083. [PubMed: 18214186]
5. White D, Kramer L, Backenson P, Lukacik G, Johnson G, Oliver J, Howard J, Means R, Eidson M, Gotham I, et al. Mosquito surveillance and polymerase chain reaction detection of West Nile Virus, New York state. Emerging Infectious Diseases. 2001; 7:643–649. [PubMed: 11585526]
6. Pool testing of SARS-CoV-02 samples increases worldwide test capacities many times over. EurekAlert! Science News. URL https://eurekalert.org/pub_releases/2020-03/guf-pto033020.php, 2020
7. Dorfman R. The detection of defective members of large populations. Annals of Mathematical Statistics. 1943; 14:436–440.
8. Litvak E, Tu X, Pagano M. Screening for the presence of a disease by pooling sera samples. Journal of the American Statistical Association. 1994; 89:424–434.
9. Sterrett A. On the detection of defective members of large populations. Annals of Mathematical Statistics. 1957; 28:1033–1036.
10. McMahan C, Tebbs J, Bilder C. Informative Dorfman screening. Biometrics. 2012. doi: 10.1111/j.1541-0420.2011.01644.x
11. Black M, Bilder C, Tebbs J. Group testing in heterogeneous populations using halving algorithms. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2012. doi: 10.1111/j.1467-9876.2011.01008.x
12. Bilder C, Tebbs J, Chen P. Informative retesting. Journal of the American Statistical Association. 2010; 105:942–955. [PubMed: 21113353]