Neonatal hypoxic ischemic encephalopathy is a significant cause of mortality and morbidity. It is also associated with adverse outcomes such as cerebral palsy, cognitive dysfunction, and epilepsy, among others, well beyond the neonatal period. These have a cascading impact on the community and society through increased health care utilization, need for special services, economic burden, and diminished workforce productivity. Several interventions have been explored to manage neonatal encephalopathy (NE). Among these, therapeutic hypothermia (TH) is ranked highest, with several studies and systematic reviews [1,2] reporting reduction in mortality and adverse neurological and/or neurodevelopmental outcomes during infancy [3,4]. TH involves controlled cooling of the body (or at least of the head) during the first 2-4 days of life, followed by gradual rewarming to a euthermic state [1,5]. Currently, it is implemented globally, including in many low-resource health care settings [6–8], although the International Liaison Committee on Resuscitation advised its use only in institutions with adequate monitoring and intensive care facilities.
A recent multi-country trial (HELIX) reported that TH was associated with an alarming increase in both immediate and late mortality, prompting the authors to emphatically recommend its immediate discontinuation in resource-constrained settings. This created considerable consternation, especially in some developing countries, with arguments about the trial methods, generalizability, and other issues [11–17]. However, critical appraisal of the trial confirmed its validity, while noting some plausible explanations for the stark differences in key outcomes. Additionally, a systematic review restricted to trials from developing countries reported limited benefit of TH in such settings.
These developments necessitate a detailed review of the available evidence. The Cochrane review published in 2013 is outdated, and also contained some data analysis errors, such as combining short-term and long-term outcomes in the same meta-analysis. A more recent review, updated as of mid-2020, contained several errors such as duplication of data from some trials, presenting data from non-existent trials, missing relevant trials, combining short-term and long-term mortality together, and expressing relative risk with negative integers. Therefore, we conducted an up-to-date systematic review of randomized controlled trials (RCTs) to evaluate the effects of therapeutic hypothermia (Intervention) vs normothermia or no hypothermia (Comparison), in neonates with hypoxic encephalopathy (Population), on mortality and neurological and/or neurodevelopmental outcomes (Outcomes). The review question was: What are the effects of therapeutic hypothermia in newborns with hypoxic encephalopathy?
This review was registered in PROSPERO (Registration number CRD42021279682, dated 20 October 2021) and conducted in accordance with the Cochrane Handbook for systematic reviews. The review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses-Protocols (PRISMA-P) 2020 statement.
Criteria for considering studies for this review
Types of studies: We included RCTs comparing the use of therapeutic hypothermia vs normothermia, or no hypothermia. We excluded non-randomized trials, cohort studies, trials with historic controls, case series, trials in animals, in vitro experiments, and ex vivo human studies.
Types of participants: We included RCTs enrolling newborn infants with a gestational age ≥35 weeks, having evidence of perinatal asphyxia and encephalopathy. Perinatal asphyxia was defined by one or more of the following: a) Apgar score ≤5 at 5 minutes of life; b) need for ongoing resuscitation or respiratory support at 10 minutes; or c) cord blood/arterial blood pH<7.1, or base deficit ≥12 within one hour of birth. Evidence of encephalopathy was based on Sarnat staging system or any other recognized staging/classification system.
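As a minimal illustration of how the participant criteria above combine, the sketch below encodes them as a Python predicate for a single hypothetical infant record. The field names and the handling of missing values (treated as "criterion not met") are assumptions made for illustration; the review applied these criteria to the inclusion criteria of trials, not to individual records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InfantRecord:
    """Hypothetical record; field names are illustrative, not from the review."""
    gestational_age_weeks: float
    apgar_5min: Optional[int] = None
    support_needed_at_10min: Optional[bool] = None
    blood_ph_first_hour: Optional[float] = None      # cord or arterial blood
    base_deficit_first_hour: Optional[float] = None
    encephalopathy_staged: bool = False               # Sarnat or another recognized system


def has_perinatal_asphyxia(r: InfantRecord) -> bool:
    """True if one or more of criteria a), b), c) is met; missing values count as not met."""
    low_apgar = r.apgar_5min is not None and r.apgar_5min <= 5
    ongoing_support = bool(r.support_needed_at_10min)
    acidosis = (
        (r.blood_ph_first_hour is not None and r.blood_ph_first_hour < 7.1)
        or (r.base_deficit_first_hour is not None and r.base_deficit_first_hour >= 12)
    )
    return low_apgar or ongoing_support or acidosis


def meets_participant_criteria(r: InfantRecord) -> bool:
    """Gestational age >= 35 weeks, plus evidence of asphyxia and of encephalopathy."""
    return (
        r.gestational_age_weeks >= 35
        and has_perinatal_asphyxia(r)
        and r.encephalopathy_staged
    )
```

Because the asphyxia criteria are joined by "one or more of", a single positive criterion is sufficient, whereas gestational age and encephalopathy are both required.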
Types of intervention: We included RCTs delivering TH (whole-body cooling [WBC] or selective head cooling [SHC]) by any device/equipment, initiated within 6 hours of birth, with documented reduction in core temperature (to ≤34°C in case of WBC) or middle ear temperature (to ≤34°C in case of SHC). We excluded trials where TH was initiated later than six hours after birth (in all or the majority of infants), or cooling was conducted without documentation of core temperature (as specified above), or was done for <48 hours.
Types of comparison: The comparator was normothermia, or no therapeutic cooling, or no intervention. We excluded studies without a comparison group, those in which the comparison group had received any cooling for any duration, or a historic comparison group.
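The intervention and comparison criteria can be read as a checklist of exclusion reasons applied to each candidate trial. The following sketch shows one way to express that checklist; the `TrialDescription` fields are hypothetical names chosen for illustration, and the wording of the reasons is ours rather than the review's.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrialDescription:
    """Hypothetical summary of one candidate trial; field names are illustrative."""
    cooling_started_within_6h: bool     # in the majority of infants
    temperature_documented: bool        # core (WBC) or middle ear (SHC)
    lowest_documented_temp_c: float
    cooling_duration_h: float
    has_concurrent_comparison: bool     # False for none or historic controls
    comparison_group_cooled: bool


def exclusion_reasons(t: TrialDescription) -> List[str]:
    """Return every exclusion reason that applies; an empty list means eligible."""
    reasons = []
    if not t.cooling_started_within_6h:
        reasons.append("cooling initiated later than 6 h after birth")
    if not t.temperature_documented:
        reasons.append("core/middle ear temperature not documented")
    elif t.lowest_documented_temp_c > 34.0:
        reasons.append("temperature not reduced to <=34 C")
    if t.cooling_duration_h < 48:
        reasons.append("cooling for <48 h")
    if not t.has_concurrent_comparison:
        reasons.append("no concurrent comparison group")
    if t.comparison_group_cooled:
        reasons.append("comparison group received cooling")
    return reasons
```

Returning all applicable reasons, rather than stopping at the first, matches the practice of recording reasons for exclusion during full-text screening.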
Types of outcome measures: We considered the following outcomes: mortality, neurological impairment or disability (defined by any standard criteria), the composite outcome of mortality or disability, and cerebral palsy. We assessed these at four time points after randomization: a) Neonatal, ie, from randomization to discharge or death during the initial hospitalization; b) Infancy, ie, at the age of 18-24 months; c) Childhood, ie, at the age of 5-10 years; and d) Long-term, ie, beyond the age of 10 years. Other outcomes were seizures, electroencephalogram (EEG) or amplitude-integrated EEG (aEEG) abnormalities, MRI findings suggesting neuronal damage during the initial hospitalization, duration of hospitalization, and quality of life. For this analysis, the primary outcome was listed as “mortality or neurological disability” at ≥18 months of age.
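The four assessment windows can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name, the use of age in months, and the decision to return `None` for ages outside the defined windows (for example, 3 years) are ours and are not specified in the review.

```python
from typing import Optional


def classify_time_point(age_months: float,
                        during_initial_hospitalization: bool = False) -> Optional[str]:
    """Map an assessment to one of the review's four time points, or None if it fits none."""
    if during_initial_hospitalization:
        return "neonatal"          # randomization to discharge or death
    if 18 <= age_months <= 24:
        return "infancy"           # 18-24 months
    if 60 <= age_months <= 120:
        return "childhood"         # 5-10 years
    if age_months > 120:
        return "long-term"         # beyond 10 years
    return None
```

Keeping the windows separate in this way is what prevents short-term and long-term outcomes from being combined in the same meta-analysis.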
Information sources: Two authors independently searched the following databases: Medline, Embase, Cochrane Library, LIVIVO, Web of Science, Scopus, and CINAHL. We searched the following clinical trial registries: World Health Organization International Clinical Trials Registry Platform, ClinicalTrials.gov, and Clinical Trials Registry – India. We also hand-searched reference lists of included trials, as well as previous (narrative and systematic) reviews. In addition, we conducted a grey literature search using OpenGrey (www.opengrey.eu/), ProQuest, and Google Scholar. Each database was searched from its date of inception to October 31, 2021, without restrictions based on language or geography.
Search strategy: We used combinations of MeSH terms and synonyms of the following keywords, and their variations: neonate, newborn, perinatal, infant, hypothermia, therapeutic hypothermia, cool, cooling, therapeutic cooling, asphyxia, hypoxia, hypoxic-ischemic, encephalopathy, neonatal encephalopathy. The searches were pilot-tested before finalizing the strategy. The search strategy in representative databases is summarized in Table S1 in the Online Supplementary Document.
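To illustrate how such keywords are typically combined, the sketch below builds a generic Boolean query by joining synonyms with OR within each concept and joining concepts with AND. The grouping of keywords into three concepts and the generic syntax are assumptions for illustration; the database-specific strategies actually used are those in Table S1, and MeSH terms are not reproduced here.

```python
from typing import Dict, List

# Keywords are taken from the text; the grouping into concepts is an assumption.
CONCEPTS: Dict[str, List[str]] = {
    "population": ["neonate", "newborn", "perinatal", "infant"],
    "intervention": ["hypothermia", "therapeutic hypothermia", "cool",
                     "cooling", "therapeutic cooling"],
    "condition": ["asphyxia", "hypoxia", "hypoxic-ischemic",
                  "encephalopathy", "neonatal encephalopathy"],
}


def or_block(terms: List[str]) -> str:
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts: Dict[str, List[str]]) -> str:
    """Join the concept blocks with AND."""
    return " AND ".join(or_block(terms) for terms in concepts.values())


query = build_query(CONCEPTS)
```

Each database has its own field tags and truncation symbols, so a string like this would need adapting before use.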
Selection of studies: Two review authors independently screened citation titles, followed by the abstracts of short-listed citations, followed by full-text of potentially eligible studies (and those without abstracts). Thereafter, two authors independently examined the full text versions of short-listed studies, to confirm eligibility for inclusion, and recorded reasons for exclusion of ineligible studies. Disagreements were discussed and resolved by consensus. After eliminating duplicate publications, a final list of studies was prepared. A PRISMA flow diagram was created, summarizing the search results and process of including studies.
Translation of languages other than English: Non-English publication abstracts were translated using open-source software; if eligible, the full text was translated as well.
Data extraction: Two review authors independently extracted the following information from the included studies.
- Trial characteristics: design, study duration, setting, date of publication.
- Participant characteristics: inclusion criteria, exclusion criteria, gestational age, birth weight, definition of perinatal asphyxia, definition and severity of encephalopathy, sample size.
- Intervention characteristics: WBC or SHC, method of cooling, temperature targeted, method of determining target temperature, cooling duration, cooling cessation criteria.
- Comparison characteristics: Temperature targeted, method of determining target temperature, and standard of care.
- Outcomes: Data on the outcomes listed above were extracted along with notes/remarks.
Dealing with missing data: We attempted to contact the corresponding authors of studies with missing or unclear data.
Data synthesis and statistical analysis: We presented data on baseline characteristics with descriptive statistics. We pooled data on the outcomes of interest and performed meta-analysis using Cochrane Review Manager version 5.4. For dichotomous outcomes, we calculated risk ratios (RR) with 95% confidence intervals (CI) using the fixed-effect model. For continuous outcomes, we calculated the weighted mean difference with 95% CI (fixed-effect model). We opted for the fixed-effect model, as the alternative (random-effects model) tends to assign disproportionately greater weight to studies with smaller sample sizes. However, wherever the heterogeneity statistic (I2) exceeded 50%, we also re-examined the pooled effect with the random-effects model. For data that could not be pooled by meta-analysis, we provided a description summarizing the key results.
Assessment of methodological quality of included studies: Two authors independently assessed methodological quality, using version 2 of the Cochrane Risk-of-Bias (RoB) tool. We assessed RoB for each reported outcome of each trial, and the overall RoB of each trial.
Assessment of heterogeneity: We assessed heterogeneity among trials by visual inspection of the forest plots, and the Higgins-Thompson I2 statistic. We interpreted heterogeneity as outlined in the Cochrane Handbook: 75%-100% = considerable heterogeneity, 50%-90% = may represent substantial heterogeneity, 30%-60% = may represent moderate heterogeneity, and 0%-40% = might not be important. Where I2 exceeded 50%, we tried to identify explanations.
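The difference between the two models can be made concrete with a small sketch. It pools log risk ratios by inverse-variance weighting and shows how adding a between-study variance (tau squared) shifts relative weight toward smaller trials. Several things here are assumptions: the text does not state which fixed-effect method was selected in Review Manager (which also offers the Mantel-Haenszel method), the trial counts are invented, zero-event cells are not handled, and the DerSimonian-Laird estimator is shown only as one common choice.

```python
import math
from typing import List, Sequence, Tuple

# (events_TH, n_TH, events_control, n_control); hypothetical counts, not from any trial
Counts = Tuple[int, int, int, int]


def log_rr(counts: Counts) -> Tuple[float, float]:
    """Log risk ratio and its approximate variance (assumes no zero cells)."""
    a, n1, c, n2 = counts
    y = math.log((a / n1) / (c / n2))
    v = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return y, v


def pool(trials: Sequence[Counts], tau2: float = 0.0):
    """Inverse-variance pooling; tau2 = 0 gives the fixed-effect result."""
    effects = [log_rr(t) for t in trials]
    weights = [1.0 / (v + tau2) for _, v in effects]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    relative_weights = [w / total for w in weights]
    return math.exp(pooled), ci, relative_weights


def dl_tau2(trials: Sequence[Counts]) -> float:
    """DerSimonian-Laird estimate of between-study variance (needs >= 2 trials)."""
    effects = [log_rr(t) for t in trials]
    w = [1.0 / v for _, v in effects]
    sw = sum(w)
    mean = sum(wi * y for wi, (y, _) in zip(w, effects)) / sw
    q = sum(wi * (y - mean) ** 2 for wi, (y, _) in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(w) - 1)) / c)


trials: List[Counts] = [(10, 50, 15, 50), (30, 200, 40, 200), (5, 20, 4, 20)]
fixed_rr, fixed_ci, fixed_w = pool(trials)
# tau2 = 0.1 is an arbitrary value chosen only to show the shift in weights
random_rr, random_ci, random_w = pool(trials, tau2=0.1)
```

Comparing `fixed_w` with `random_w` shows the smallest trial (the third) receiving a larger share of the weight once tau squared is added, which is the behaviour described in the text.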
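For readers unfamiliar with the statistic, I2 is derived from Cochran's Q as the percentage of variability beyond that expected by chance, (Q - df)/Q, truncated at zero. The sketch below computes it from study effects and variances and applies the overlapping interpretation bands quoted above; function names are illustrative, and a value may fall into two bands because the Handbook ranges overlap.

```python
from typing import List, Sequence

# Overlapping bands as quoted from the Cochrane Handbook in the text
BANDS = [
    (0, 40, "might not be important"),
    (30, 60, "may represent moderate heterogeneity"),
    (50, 90, "may represent substantial heterogeneity"),
    (75, 100, "considerable heterogeneity"),
]


def i_squared(effects: Sequence[float], variances: Sequence[float]) -> float:
    """I2 (%) from Cochran's Q using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def interpret_i2(i2: float) -> List[str]:
    """Return every band whose range contains the value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]


def exceeds_threshold(i2: float) -> bool:
    """The review re-examined results and sought explanations when I2 exceeded 50%."""
    return i2 > 50
```

For example, an I2 of 38% falls in both the "might not be important" and "moderate" bands, so the interpretation also depends on the size and direction of the effects.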
Subgroup analysis: We conducted a subgroup analysis based on the following criteria: a) Study setting (defined by the World Bank Classification of the country where the trial was conducted): high-income country (HIC), upper middle-income country (UMIC), lower middle-income country (LMIC), low-income country (LIC); and b) Type of cooling: WBC vs SHC. We planned subgroup analysis based on cooling method (formal devices vs informal methods), but there were insufficient studies.
Sensitivity analysis: We assessed the impact of low(er) quality studies, by excluding trials with moderate/high RoB.
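The subgroup and sensitivity analyses both amount to re-pooling a filtered or grouped set of trials. The sketch below shows this under stated assumptions: the trial records, their settings, and their RoB labels are invented, and inverse-variance fixed-effect pooling of log risk ratios is used for simplicity (the text does not specify the fixed-effect method).

```python
import math
from collections import defaultdict
from typing import Dict, List

# Hypothetical records: counts are (events_TH, n_TH, events_control, n_control)
trials = [
    {"name": "A", "setting": "HIC", "rob": "low", "counts": (20, 100, 30, 100)},
    {"name": "B", "setting": "HIC", "rob": "high", "counts": (8, 40, 16, 40)},
    {"name": "C", "setting": "LMIC", "rob": "low", "counts": (25, 100, 20, 100)},
    {"name": "D", "setting": "UMIC", "rob": "moderate", "counts": (6, 50, 12, 50)},
]


def pooled_rr(records: List[dict]) -> float:
    """Inverse-variance fixed-effect pooled risk ratio (assumes no zero cells)."""
    num = den = 0.0
    for r in records:
        a, n1, c, n2 = r["counts"]
        y = math.log((a / n1) / (c / n2))
        w = 1.0 / (1 / a - 1 / n1 + 1 / c - 1 / n2)
        num += w * y
        den += w
    return math.exp(num / den)


def by_subgroup(records: List[dict], key: str) -> Dict[str, float]:
    """Pool separately within each level of `key`, e.g. 'setting'."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return {level: pooled_rr(rs) for level, rs in groups.items()}


rr_by_setting = by_subgroup(trials, "setting")

# Sensitivity analysis: drop trials with moderate/high RoB and pool again
low_rob_only = [r for r in trials if r["rob"] == "low"]
rr_all = pooled_rr(trials)
rr_low_rob = pooled_rr(low_rob_only)
```

Comparing `rr_all` with `rr_low_rob` is the essence of the sensitivity analysis: a pooled estimate that moves toward 1 after exclusion suggests the apparent effect was partly driven by lower-quality trials.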
We identified 36 863 citations, of which 85 citations were short-listed, and 39 publications [10,26–63] reporting 29 trials with 2926 participants [10,26–32,34–36,38,39,41–47,49,50,53,57–62] were included (Figure 1). Characteristics of the included studies are presented in Table 1, and their detailed description in Table S2 in the Online Supplementary Document. The reasons for excluding 46 studies [64–109] are presented in Table S3 in the Online Supplementary Document. Two authors independently categorized 13 studies each as having overall high, moderate, and low RoB (Table S4 in the Online Supplementary Document).
Figure 1. Flowchart highlighting screening and selection of studies.
Table 1. Characteristics of the included studies
EAC – external auditory canal, ECMO – extracorporeal membrane oxygenation, HELIX – hypothermia for moderate or severe neonatal encephalopathy in low-income and middle-income countries, HIE – hypoxic ischemic encephalopathy, HT – hypothermia, ICE – The Infant Cooling Evaluation, NA – not reported, NE – neonatal encephalopathy, NEST – neonatal ECMO Study of Temperature, NICHD – National Institute of Child Health and Human Development, NP – nasopharyngeal, SHC – selective head cooling, WBC – Whole body cooling, THIN – Therapeutic hypothermia for neonatal hypoxic-ischemic encephalopathy in India, TOBY – Total Body Hypothermia for Neonatal Encephalopathy Trial, wk – weeks
*In this trial, the TH group had 4 subgroups, cooled to 36.5-36°C (n = 6), 35.9-35.5°C (n = 6), 35 ± 0.5°C (n = 6), and 34.5 ± 0.5°C (n = 7). Only those cooled to 34.5 ± 0.5°C were eligible for inclusion in this systematic review. The median time of initiation of intervention was within 4 hours of birth in the TH group, and 4.5 hours after birth in the control group.
†This study is the same as the Battin 2001 trial; however, in this study, data for the TH group included participants cooled to 35.0 ± 0.5°C (n = 6) or 34.5 ± 0.5°C (n = 7). Although the latter conformed to the inclusion criteria of this review, outcome data could not be extracted separately for this group. Therefore, data from this study were unusable for meta-analysis.
Two of the 29 RCTs were multi-country trials [10,26]. Nine trials were conducted in India [28,31,35,36,38,44,45,49,60], six in China [39,46,47,59,61,62], four in the USA [26,29,41,53], two in the UK [30,42], and one each in Australia, Egypt, Germany, New Zealand, Turkey, and Uganda. The sample sizes in the 29 trials ranged from 19 to 408, with a median (IQR) of 93 (40, 158). Only 14 trials [10,26,27,29,30,35,38,42,44,45,49,58,60,62] enrolled >100 participants each.
Nine RCTs [10,26,30,35,43,45,46,57,58] enrolled infants with moderate or severe encephalopathy, whereas another nine trials [27,29,41,44,47,49,59,60,62] also included some infants with mild encephalopathy. The proportion of such infants ranged from 0.2% to 23.3%. Eleven trials [28,31,32,34,36,38,39,42,50,53,61] did not describe the severity of encephalopathy. One study presented data from a sub-group of participants reported in another study, hence data were extracted from the main publication.
Twenty-two studies with 2434 participants reported neonatal mortality during the initial hospitalization. The pooled RR was 0.87 (95% CI = 0.75, 1.00), I2 = 38% (Figure 2). The absolute risk difference was -0.03 (95% CI = -0.06, 0.00), I2 = 47%. One trial reported mortality only during the intervention period, but not the entire hospitalization, hence its data were not pooled. Among the 22 trials, 21 showed an uncertain effect; only the HELIX trial showed increased mortality. Excluding its data yielded a pooled RR of 0.74 (95% CI = 0.62, 0.87), I2 = 0%.
Figure 2. Meta-analysis of data on neonatal mortality (during the initial hospitalization).
Eleven trials with 2042 participants reported mortality at 18-24 months [10,26,27,29,30,34,38,42,46,58,62]; the pooled RR was 0.88 (95% CI = 0.78, 1.01), I2 = 51% (Figure 3). The absolute risk difference was -0.04 (95% CI = -0.08, 0.00), I2 = 74%. Only one trial showed a statistically significant reduction, with an RR of 0.65 (95% CI = 0.43, 0.97); nine [26,29,30,34,38,42,46,58,62] showed no statistically significant difference, and the HELIX trial reported increased mortality, with an RR of 1.35 (95% CI = 1.04, 1.76). Excluding the HELIX trial data yielded a pooled RR of 0.77 (95% CI = 0.66, 0.90), I2 = 0%. Two trials [58,62] had data missing for >10% of enrolled participants. In the Simbruner 2010 trial, 17.2% and 10.8% in the intervention and comparison arms had missing data. In the Zhou 2010 trial, the respective proportions were 27.5% and 20.3%. Exclusion of these two trials did not remarkably change the pooled effect; the RR was 0.93 (95% CI = 0.81, 1.07), I2 = 53%.
Figure 3. Meta-analysis of data on mortality at the age of 18-24 months.
Only two studies [33,56] with 515 survivors reported mortality during childhood. Although one trial reported a statistically significant reduction, the pooled RR was 0.81 (95% CI = 0.62, 1.04), I2 = 59% (Figure 4). The random-effects model yielded an RR of 0.79 (95% CI = 0.53, 1.18). The absolute risk difference was -0.07 (95% CI = -0.15, 0.01), I2 = 67%. No trial reported data on children older than ten years.
Figure 4. Meta-analysis of data on mortality between 5-10 years of age.
Unfavourable neurological and/or neurodevelopmental outcomes (ie, disability)
Eleven trials [10,27,29,30,34,38,39,42,46,54,62] with 1440 participants reported this outcome at 18-24 months of age. Among these, nine trials used the Bayley Scales of Infant Development (second or third edition) [10,27,29,30,34,39,42,46,54], and one each used the Gesell Child Development Age Scale with the Gross Motor Function Classification System (GMFCS), and the Developmental Assessment Scale for Indian Infants. Although only three trials [38,39,62] showed a statistically significant reduction and the other eight were inconclusive, the pooled RR was 0.62 (95% CI = 0.52, 0.75), I2 = 26% (Figure 5). The absolute risk difference was -0.11 (95% CI = -0.15, -0.07), I2 = 46%. Four trials [27,29,46,54] had missing data in >10% of survivors in at least one of the trial arms. Additionally, two trials had >10% difference in inter-group attrition. In the Jacobs 2011 trial, data were missing in 3.6% and 14.5% of survivors in the intervention and comparison groups. The respective proportions in the Li 2009 trial were 17.8% and 6.8%.
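The two attrition checks described above (missing data in >10% of one arm, and a >10 percentage-point difference between arms) can be sketched as follows. The percentages for Jacobs 2011 and Li 2009 are those quoted in the text; the function name and dictionary layout are illustrative.

```python
from typing import Dict, Tuple

THRESHOLD_PCT = 10.0

# (missing % in intervention arm, missing % in comparison arm), as quoted in the text
missing_pct: Dict[str, Tuple[float, float]] = {
    "Jacobs 2011": (3.6, 14.5),
    "Li 2009": (17.8, 6.8),
}


def attrition_flags(intervention_pct: float, comparison_pct: float,
                    threshold: float = THRESHOLD_PCT) -> Dict[str, bool]:
    """Flag high attrition in any arm and differential attrition between arms."""
    return {
        "high_in_any_arm": max(intervention_pct, comparison_pct) > threshold,
        "differential": abs(intervention_pct - comparison_pct) > threshold,
    }


flags = {name: attrition_flags(*pcts) for name, pcts in missing_pct.items()}
```

Both trials are flagged on both checks: the inter-arm differences are about 10.9 and 11.0 percentage points, respectively.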
Figure 5. Meta-analysis of data on participants with neurologic disability at the age of 18-24 months.
Three publications [33,44,56] presented the proportion with neurological disability during childhood, among 442 survivors; the pooled RR was 0.68 (95% CI = 0.52, 0.90), I2 = 3% (Figure 6). The absolute risk difference was -0.12 (95% CI = -0.21, -0.04), I2 = 0%. The denominators in two of these [33,56] were less than the number of survivors, suggesting missing data. In the third publication, the originally randomized number was unavailable. No studies reported this outcome beyond 10 years of age.
Figure 6. Meta-analysis of data on participants with neurologic disability between 5-10 years of age.
Mortality or disability
Ten trials with 1914 participants reported the composite outcome of death or disability at 18-24 months of age [10,26,27,29,30,34,38,46,58,62]. The pooled RR was 0.78 (95% CI = 0.72, 0.86), I2 = 54% (Figure 7). The random-effects model yielded an RR of 0.75 (95% CI = 0.66, 0.87). The absolute risk difference was -0.12 (95% CI = -0.17, -0.08), I2 = 59%. Unlike when the two outcomes were analysed separately, TH showed statistically significant improvement in the composite outcome in six of ten trials [26,27,38,46,58,62], and none, including the HELIX trial, showed increased risk. Three trials [47,58,62] had data missing in >10% of participants in at least one arm. Excluding these trials yielded an RR of 0.84 (95% CI = 0.76, 0.92), I2 = 43%. In addition, the difference in attrition between the trial arms was >10% in the Li 2009 trial. In the Zhou 2010 trial, the proportions with missing data were 27.5% in the intervention arm and 20.3% in the comparison arm. Exclusion of these trials did not significantly alter the pooled effect; the RR was 0.60 (95% CI = 0.46, 0.78), I2 = 41%.
Figure 7. Meta-analysis of data on participants with death or neurologic disability at the age of 18-24 months.
Eight trials (1136 participants) reported the proportion of infants with cerebral palsy (CP) at 18-24 months of age [10,26,27,30,42,46,58,62]. Although only four [10,30,58,62] independently showed statistically significant reduction, the pooled RR was 0.63 (95% CI = 0.50, 0.78), I2 = 39% (Figure 8). Two trials [46,62] had data missing from >10% survivors in at least one arm, but their exclusion did not change the pooled effect; the RR was 0.68 (95% CI = 0.54, 0.86), I2 = 43%. The absolute risk difference across the 8 trials was -0.10 (95% CI = -0.15, -0.06), I2 = 55%.
Figure 8. Meta-analysis of data on participants with cerebral palsy at the age of 18-24 months.
Three studies [33,44,56] (449 survivors) reported the proportions with cerebral palsy during childhood; the pooled RR was 0.63 (95% CI = 0.46, 0.85), I2 = 0% (Figure 9). The denominators in two [33,56] of these were less than the number of survivors, suggesting missing data. In the third publication, the number originally randomized was unavailable. The absolute risk difference across the 3 studies was -0.13 (95% CI = -0.21, -0.04), I2 = 0%. No studies reported cerebral palsy beyond 10 years of age.
Figure 9. Meta-analysis of data on participants with cerebral palsy between 5-10 years of age.
Ten trials [10,26,28,29,31,32,41,43,50,53] with 1094 participants reported neonatal seizures. The pooled RR was 1.02 (95% CI = 0.95, 1.09), I2 = 17% (Figure 10). The absolute risk difference was 0.01 (95% CI = -0.03, 0.06), I2 = 42%.
Figure 10. Meta-analysis of data on participants with neonatal seizures (during the initial hospitalization).
Only four trials (710 participants) reported the proportion of infants with seizures at 18-24 months, ie, infantile epilepsy as a sequel to neonatal encephalopathy [10,29,30,42]. The pooled RR was 0.87 (95% CI = 0.55, 1.37), I2 = 36% (Figure 11). The absolute risk difference was -0.01 (95% CI = -0.06, 0.03), I2 = 60%. One trial had data missing from >10% of survivors; however, its exclusion did not change the pooled effect: RR 0.84 (95% CI = 0.48, 1.48), I2 = 56%.
Figure 11. Meta-analysis of data on participants with seizures at the age of 18-24 months (ie, infantile epilepsy).
Only one publication with 117 children presented data on seizures during childhood (ie, childhood epilepsy); there was no statistically significant impact, and the RR was 0.65 (95% CI = 0.25, 1.68). The absolute risk difference was -0.06 (95% CI = -0.18, 0.07), N = 1, n = 117.
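When an outcome comes from a single study, the risk ratio and absolute risk difference are computed directly from its 2×2 counts, and the result is judged non-significant when the confidence interval contains the null value (1 for RR, 0 for risk difference). The sketch below illustrates this with hypothetical counts that total 117 children; they are not the counts from the cited publication, and the standard large-sample formulas shown may differ from those applied by Review Manager.

```python
import math
from typing import Tuple


def risk_ratio_ci(a: int, n1: int, c: int, n2: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Risk ratio with a CI computed on the log scale (assumes no zero cells)."""
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


def risk_difference_ci(a: int, n1: int, c: int, n2: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Absolute risk difference (intervention minus comparison) with a Wald CI."""
    p1, p2 = a / n1, c / n2
    rd = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return rd, rd - z * se, rd + z * se


def excludes_null(lower: float, upper: float, null_value: float) -> bool:
    """True when the interval lies entirely on one side of the null value."""
    return lower > null_value or upper < null_value


# Hypothetical counts: 6/60 with seizures after TH vs 9/57 in the comparison arm
rr, rr_lo, rr_hi = risk_ratio_ci(6, 60, 9, 57)
rd, rd_lo, rd_hi = risk_difference_ci(6, 60, 9, 57)
rr_significant = excludes_null(rr_lo, rr_hi, 1.0)
rd_significant = excludes_null(rd_lo, rd_hi, 0.0)
```

With counts of this size both intervals span their null values, which is the typical pattern for a rare outcome reported by one small study.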
Length of hospital stay
Nine trials reported length of hospital stay during the initial hospitalization; five [26,32,35,53,58] yielded a pooled mean difference (95% CI) of -0.82 days (95% CI = -1.65, 0.02). The other four presented data as median (IQR) [10,30,38,42]. Although their hospitalization durations varied widely, they were comparable in both arms.
Only two publications [32,51] with 45 participants reported the proportion with EEG abnormalities during the initial hospitalization. One trial performed EEG 4-10 days after birth, whereas the other performed aEEG during the first 72 hours and calculated the proportion with persisting abnormalities. The pooled RR was 0.34 (95% CI = 0.14, 0.83), I2 = 22% (Figure 12). The absolute risk difference was -0.36 (95% CI = -0.62, -0.10), I2 = 0%.
Figure 12. Meta-analysis of data on participants with EEG abnormalities during the neonatal period.
Abnormalities on MRI
Eight trials reported MRI abnormalities during the initial hospitalization [10,31,40,43,51–53,55]. The timing of MRI varied as follows: 7-14 days after birth, on the 5th day after birth, within the first 10 days of birth, during the first 7 days of life, between days 5-14 of life, within the first 4 weeks of birth, and by 44 weeks of post-menstrual age [53,55].
The pooled RR for the number of infants with “any MRI abnormality” was 0.68 (95% CI = 0.56, 0.83), I2 = 50%, 6 trials, 377 participants (Figure 13). The random-effects model yielded an RR of 0.73 (95% CI = 0.54, 0.98). The absolute risk difference was -0.19 (95% CI = -0.29, -0.10), I2 = 28%. Three trials [31,51,55] showed a lower proportion with TH, whereas the others [43,52,53] reported an uncertain effect.
MRI abnormalities in the basal ganglia region, or thalamic injury, were reported in five trials [10,40,43,52,55] (680 participants); the pooled RR was 0.82 (95% CI = 0.68, 0.98), I2 = 37% (Figure 14). The absolute risk difference was -0.08 (95% CI = -0.14, -0.01), I2 = 64%. Two of these trials [52,55] showed a statistically significant reduction.
Four trials [10,40,52,55] with 659 participants reported those with lesions in the posterior limb of the internal capsule (PLIC). Although only one showed a statistically significant reduction with TH, the pooled RR was 0.66 (95% CI = 0.52, 0.84), I2 = 0% (Figure 15). The absolute risk difference was -0.11 (95% CI = -0.18, -0.05), I2 = 0%.
White matter injury was reported in various ways in five trials [10,40,43,52,55] (686 participants). Although a statistically significant reduction was seen in only two trials [40,52], the pooled RR was 0.88 (95% CI = 0.78, 0.98), I2 = 76% (Figure 16). The random-effects model yielded an RR of 0.76 (95% CI = 0.54, 1.09). The absolute risk difference was -0.07 (95% CI = -0.13, -0.01), I2 = 62%.
Figure 13. Meta-analysis of data on participants with ‘any MRI lesions’ during the neonatal period.
Figure 14. Meta-analysis of data on participants with basal ganglia lesions or thalamic injury on MRI, during the neonatal period.
Figure 15. Meta-analysis of data on participants with PLIC lesions on MRI during the neonatal period.
Figure 16. Meta-analysis of data on participants with white matter injury on MRI during the neonatal period.
Quality of life
A single trial presented information on quality of life during childhood using various scoring systems. The proportion with Health Utilities Index (HUI3) score did not differ significantly between the two arms (RR = 0.76; 95% CI = 0.55, 1.04), nor did the mean score (mean difference = 0.09; 95% CI = -0.06, 0.23).
We examined the outcomes by study setting (Table 2). Neonatal mortality and neonatal seizures did not show statistically significant inter-group differences, in any of the four types of countries/settings. TH significantly reduced mortality at 18-24 months in HIC but did not show statistically significant differences in UMIC or LMIC. Similarly, the composite outcome of death or disability at 18-24 months was significantly lowered in HIC and UMIC, but not LMIC. However, neurological disability and cerebral palsy at 18-24 months showed statistically significant reduction across settings.
Table 2. Analysis of outcomes by country/setting of the trials*
HIC – high-income countries, UMIC – upper middle-income countries, LMIC – lower middle-income countries, LIC – low-income countries, mo – months, y – years
*All data are presented as risk ratios (RR) with 95% confidence interval. ‘N’ represents the number of trials, and ‘n’ represents the number of participants.
Subgroup analysis by type of cooling (Table 3) showed no statistically significant differences between WBC and SHC for mortality (neonatal and at 18-24 months) or seizures at any age. Other outcomes at 18-24 months, namely neurological disability, the composite of mortality or disability, and cerebral palsy, were all improved with TH, irrespective of whether the whole body or only the head was cooled.
Table 3. Analysis of outcomes by type of cooling*
mo – months, y – years
*All data are presented as risk ratios (RR) with 95% CI. ‘N’ represents the number of trials, and ‘n’ represents the number of participants.
Sensitivity analysis excluding trials with moderate/high RoB did not change the overall result for major clinical outcomes, although the magnitude of effect diminished for some outcomes (Table 4). However, the exclusion changed three statistically significant differences in MRI outcomes to non-significant differences (Table 4). Comparison of pooled risk ratios among trials with low RoB against those with moderate or high RoB showed that TH reduced neonatal mortality and mortality at 18-24 months in trials with moderate/high RoB, but not in trials with low RoB (Table 4). However, neurological disability, cerebral palsy, and the composite outcome of disability or mortality at 18-24 months showed benefit with TH in both types of trials, although the magnitude was smaller in low RoB trials.
Table 4. Analysis of outcomes by risk of bias within the trials*
mo – months, y – years
*All data are presented as risk ratios (RR) with 95% CI. ‘N’ represents the number of trials, and ‘n’ represents the number of participants.
This up-to-date systematic review showed that therapeutic hypothermia implemented for neonatal encephalopathy did not result in statistically significant reductions in mortality during the neonatal period, infancy, or later childhood. However, it reduced neurologic disability and cerebral palsy in infancy and childhood, resulting in a reduction in the composite outcome of mortality or disability, despite the absence of conclusive benefit on mortality alone. EEG abnormalities and multiple MRI outcomes were better in neonates who received TH. However, there was no statistically significant impact on seizures during the neonatal period, infantile epilepsy, or childhood epilepsy.
While the type of cooling (ie, WBC or SHC) did not affect the results, the setting where TH was implemented was relevant. TH reduced mortality at 18-24 months in high income countries, but not in other settings. While neonatal mortality and seizures were not reduced in any setting, disability and cerebral palsy in infancy were reduced in all settings.
More importantly, the reduction in mortality reported in previous systematic reviews [1,2] was influenced by trials with higher risk of bias.
Thus, this systematic review uncovered several novel findings that contradict previous reviews [1,2]. This is partly because of the availability of new trials, notably the HELIX trial, but also due to methodological errors in the previous reviews. The Cochrane review combined immediate and later mortality in the same meta-analysis. The later review failed to include some eligible trials, duplicated data from some trials, presented data from non-existent trials, combined immediate and later mortality, and even expressed relative risk with negative integers.
The HELIX trial reported increased mortality (neonatal and infancy) with TH, in stark contrast to previous trials. This RCT was one of the best conducted trials, with multiple methodological refinements, strict definitions, the largest sample size, an extremely low attrition rate, and low risk of bias. Extensive critical appraisal did not identify any major limitations, although some concerns were raised about the inclusion of out-born infants, slightly delayed initiation of cooling (though within the accepted limit of 6 hours), and possibly diverse causes of hypoxic encephalopathy in low-resource settings.
This systematic review had several strengths, notably an exhaustive literature search across published and grey literature, inclusion of the largest cohort of trials to date, searching and data extraction in duplicate, careful extraction of data meeting the review criteria (rather than including data as reported by trials), and multiple subgroup and sensitivity analyses. There were no deviations from the protocol. In fact, several additional outcomes were also presented. This fosters high confidence in the review findings.
We acknowledge several limitations in our review. We could not search Chinese language databases or conference proceedings. We could not obtain individual participant data, or missing data for intention-to-treat analyses. In the protocol, we mentioned that randomized controlled trials would be included, but did not specify how quasi-randomized or pseudo-randomized studies would be handled. Analysis of the randomization method identified that 18 trials used an appropriate method of randomization, 1 trial used a quasi-randomization method, and 10 trials had an unclear method. Thus, the included trials may have contained some quasi-randomized or pseudo-randomized studies. The impact of this is evident from the differences in some outcomes among trials with low vs higher RoB.
The effect of therapeutic hypothermia may also be influenced by several factors such as the proportion of outborn neonates in studies, proportion with severe encephalopathy, method of cooling (servo vs non-servo), and severity of asphyxia. For example, 4 trials excluded outborn neonates, 14 trials included them (but only 8 of them reported the proportion of outborn babies), and 11 trials did not provide any information. Similarly, 15 studies reported data of participants with only severe or moderate neonatal encephalopathy, 9 studies also included those with mild encephalopathy (but the proportion was <25% of the total), and 15 studies did not report details of severity. Among these 15, data on Apgar score and/or cord blood parameters suggested severe disease in some (Table S2 in the Online Supplementary Document). Thirteen studies did not report any data on Apgar scores or cord blood parameters, whereas 26 studies reported either or both (Table S2 in the Online Supplementary Document). In the absence of individual patient data, it is not possible to account for these factors.
Before initiating this review, we listed the primary outcome as mortality or neurologic disability at the age of 18-24 months, in alignment with previous systematic reviews [1,2], and major trials [1,2,10,26,29,30]. Although the composite outcome provides useful information, we believe that it is skewed by the beneficial effects of TH on neurologic outcomes, masking the lack of statistically significant impact on mortality.
Adverse effects of therapeutic hypothermia were reported in various ways, and at various time points, in several trials. Although these are very important to consider, for making informed decisions (at the practice as well as policy levels), in this systematic review, we focused on evidence of efficacy, and did not examine adverse events.
We expected attrition in trials would bias the results in favor of the intervention, but did not observe this for most outcomes.
Finally, is more research required on therapeutic hypothermia for neonatal encephalopathy? Some experts would argue that more trials should be conducted until an optimal information size is achieved, following which further research can be discontinued. This would be very expensive in terms of time and resources. Instead, we suggest that research in local health care systems in resource-constrained settings could focus on resolving issues such as which neonates are most likely to benefit from TH, predictors of failure, and of course primary prevention.
This up-to-date systematic review of randomized controlled trials confirmed that therapeutic hypothermia implemented for neonatal encephalopathy reduces neurologic disability and cerebral palsy in diverse settings. However, it has an unclear effect on neonatal, infantile, and childhood mortality. It also does not impact neonatal seizures, or epilepsy during infancy and childhood. The previously reported reduction in mortality was associated with trials of lower methodological quality, but not substantiated by trials with high(er) quality.
Online Supplementary Document