Interobserver Reliability of an MR Imaging Scoring System in Infants with Hypoxic-Ischemic Encephalopathy

Abstract

BACKGROUND AND PURPOSE: MR imaging has a key role in predicting neurodevelopmental outcomes following neonatal hypoxic-ischemic encephalopathy (HIE). A novel MR imaging scoring system for hypoxic-ischemic brain injury was used in our patient population with the aim of assessing interobserver variability and developing subcategories for the severity of brain injury.

MATERIALS AND METHODS: We evaluated brain MR images of 252 infants who underwent hypothermia for HIE between 2014 and 2019. First, 40 infants were selected randomly to test interobserver variability. Discrepancies were identified during the assessment of the first 20 MR images. The remaining 20 MR images were scored after adjusting the scoring system. Second, we determined cutoff values for the severity of injury that were based on the percentiles of the total scores in the full cohort.

RESULTS: The interobserver reliability showed excellent agreement for the total score both before (intraclass correlation coefficient = 0.96; 95% CI, 0.89–0.99) and after the adjustment (intraclass correlation coefficient = 0.96; 95% CI, 0.89–0.98). The average of the differences and the agreement interval between the 2 readers decreased after the adjustment. Subcategories of brain injury were defined by percentiles of the total score: a total score of ≤4 (≤75th percentile) was considered normal; 5–10 (76th–90th percentile), mild; 11–15 (91st–95th percentile), moderate; and >15 (>95th percentile), severe brain injury. The agreement on the classification of brain injury improved from the first to the second epoch (weighted κ = 0.723 versus 0.887).

CONCLUSIONS: The adjusted scoring system may lead to a higher degree of interrater agreement. The presented cutoff values may be used to determine the severity of brain injury in future clinical studies including infants with mild hypoxia-ischemia.

ABBREVIATIONS:

HIE: hypoxic-ischemic encephalopathy
ICC: intraclass correlation coefficient
κ_W: weighted κ
PLIC: posterior limb of internal capsule
TH: therapeutic hypothermia

Hypoxic-ischemic encephalopathy (HIE) occurs in 2–3 per 1000 live term births in developed countries.¹ To date, therapeutic hypothermia (TH) initiated within the first 6 hours of life and continued for 72 hours with a target central temperature of 33.5°C is the only available treatment to reduce the risk of death and neurodevelopmental impairment.^2,3

The ability to predict neurodevelopmental outcomes following HIE allows parents and caregivers to optimize care beyond the neonatal period. MR imaging has a key role in predicting neurologic outcomes.^4,5 Although many previously reported MR imaging scoring systems have been related to outcome,^6-8 they were usually performed with conventional sequences. The widely used scoring system of Barkovich et al,⁶ published before the hypothermic era, did not originally incorporate diffusion-weighted images, even though DWI has been recognized as the most reliable MR imaging sequence to assess injury during the first week after a hypoxic-ischemic event.^4,9 Recently, Weeke et al¹⁰ described a novel and more detailed MR imaging scoring system for term infants with HIE, incorporating DWI and ¹H-MR spectroscopy sequences as well as patterns of injury to the gray matter, white matter, and cerebellum to improve the predictive value of MR imaging studies in infants with HIE. The gray matter subscore was an independent predictor of adverse outcome at 2 years of age and at school age.¹⁰

Given that our inclusion criteria for infants to undergo TH at Brigham and Women's Hospital had been broadened, offering cooling to milder cases and infants born at >34 weeks of gestation, we wished to explore the application of this new scoring system in our TH cohort.

We applied the new scoring system to our diverse patient population with the aim of assessing the observer variability between 2 experienced readers. We identified discrepancies during the evaluation of the first 20 MR imaging scans and adjusted the scoring system of Weeke et al¹⁰ accordingly.

Second, we aimed to develop subcategories of severity from the total scores: normal brain and mild, moderate, and severe brain injury. Our hypothesis was that the adjusted scoring system can improve interobserver reliability and increase the ease and reliability of the application of this scoring system as a new standard in the documentation of cerebral injury in the setting of hypoxic-ischemic encephalopathy.

MATERIALS AND METHODS

Patients

We have collected data, including imaging data, on 252 infants who underwent TH for neonatal encephalopathy between January 2014 and May 2019. We randomly selected 40 infants to assess the observer variability of the new MR imaging scoring system.¹⁰ This retrospective observational study was conducted at Brigham and Women's Hospital, Department of Pediatric Newborn Medicine, a single, tertiary-level neonatal intensive care unit. Institutional review board approval was obtained with a waiver of consent. The criteria for TH in our center are modified regional center–based criteria in which variables have been broadened from those used in the randomized clinical trials.^11,12 The adaptations have included the following: 1) decreasing the gestational age criteria to >34 weeks; 2) increasing the inclusion pH from ≤7.0 to ≤7.1; 3) reducing the base excess for inclusion from ≥16 mEq/L to ≥10 mEq/L; and 4) providing therapeutic hypothermia to infants with mild hypoxic-ischemic encephalopathy on clinical examination, in addition to those with moderate or severe HIE. The stage of HIE was assigned on the basis of the modified Sarnat staging system after combined assessment by clinicians before the initiation of TH.¹³

MR Imaging

All infants underwent at least 1 cerebral MR imaging performed after TH within the first week of life. The second MR imaging was based on the decision of the clinical team caring for the infant. Only the first MR images obtained within the first week of life were analyzed in this study. All scans were performed on a 3T Siemens scanner (Siemens, Erlangen, Germany). The standard clinical imaging protocol included sagittal motion-corrected magnetization-prepared rapid acquisition of gradient echo T1-weighted images (TR = 2800 ms; TE = 2.75, 4.68, 6.54, and 8.4 ms; flip angle = 7°; voxel size = 1 × 1 × 1 mm), axial turbo spin-echo T1-weighted images (TR = 574 ms, TE = 13 ms, flip angle = 140°, voxel size = 0.5 × 0.5 × 3 mm, echo-train length = 2), axial turbo spin-echo T2-weighted images (TR = 9000 ms, TE = 150 ms, flip angle = 120°, voxel size = 0.5 × 0.5 × 3 mm, echo-train length = 19), and coronal turbo spin-echo T2-weighted images (TR = 9210 ms, TE = 187 ms, flip angle = 130°, voxel size = 0.4 × 0.4 × 3 mm, echo-train length = 19). Diffusion-weighted imaging included multidirectional diffusion-weighted measurements (TR = 6200 ms, TE = 92 ms, bandwidth = 1984 Hz/Px, FOV = 140 mm, voxels = 2 × 2 × 2 mm, 30 b-directions with amplitudes ranging from 0 to 1000 s/mm²). ¹H-MR spectroscopy measurements were acquired at TE = 44 and 288 ms in the left thalamus and basal ganglia.

For noncritically ill neonates, we used a “feed and wrap” protocol, which is based on the timing of feeds, induction of natural sleep, and immobilization with wrapping to avoid the need for anesthetic agents.¹⁴ MR imaging scans with motion artifacts were excluded from the interrater analysis.

The pattern of brain injury was evaluated according to the novel grading system.¹⁰

The maximum total score of the grading system is 57, the sum of the maximum subscores for gray matter (maximum GM subscore = 25), white matter (maximum WM subscore = 21), cerebellum (maximum cerebellum subscore = 8), and an additional subscore (maximum additional subscore = 3). The additional subscore describes the presence of intraventricular or subdural hemorrhage and sinovenous thrombosis. The score of ¹H-MR spectroscopy was included in the gray matter subscore.¹⁰
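The composition of the total score can be sketched in code. The following is a minimal illustration, not the authors' software: the class name `InjuryScore` and its field names are hypothetical, and only the subscore maxima (25, 21, 8, and 3) come from the text.

```python
from dataclasses import dataclass

# Subscore maxima as stated in the text (GM 25, WM 21, cerebellum 8, additional 3).
MAX_SUBSCORES = {"gm": 25, "wm": 21, "cerebellum": 8, "additional": 3}


@dataclass(frozen=True)
class InjuryScore:
    """Hypothetical container for one reader's subscores on one scan."""
    gm: int
    wm: int
    cerebellum: int
    additional: int

    def __post_init__(self):
        # Reject values outside the range each subscore can take.
        for name, maximum in MAX_SUBSCORES.items():
            value = getattr(self, name)
            if not 0 <= value <= maximum:
                raise ValueError(f"{name} subscore {value} outside 0-{maximum}")

    @property
    def total(self) -> int:
        # The total score is the plain sum of the four subscores.
        return self.gm + self.wm + self.cerebellum + self.additional


# The maxima add up to the stated maximum total score.
print(sum(MAX_SUBSCORES.values()))  # 57

# Illustrative (made-up) subscores for a single scan.
print(InjuryScore(gm=3, wm=2, cerebellum=0, additional=1).total)  # 6
```

Validating each subscore against its maximum at construction time is one simple way to catch data-entry errors before any agreement statistics are computed.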

The MR images of the first 20 neonates were evaluated on the basis of the description of the original article. We identified discrepancies during the evaluation of the first 20 MR imaging scans and adjusted the scoring system accordingly. A further series of 20 MR images was scored after adjustment of the scoring system.

The adjustments were the following: 1) The gestational age of the infants was taken into consideration when evaluating the myelination of the posterior limb of the internal capsule (PLIC) and the peak of the NAA (Fig 1A, -B); 2) a lesion that had involvement of both WM and the cortex was scored only individually for the principal area injured (Fig 1C); and 3) the extension of signal abnormality (involving 1 lobe or >1 lobe) was scored on the basis of the primary area of injury (Fig 1D). The images were analyzed by a pediatric neuroradiologist (E.Y.) and a dual-board-certified neonatologist and child neurologist (T.E.I.), who were blinded to the stage of neonatal encephalopathy.

FIG 1.

A, The gestational age of the infants was considered when evaluating the myelination of the PLIC. In the coronal T1-weighted image, the myelination of the PLIC was considered as age-appropriate for a near-term infant (35 weeks of gestation) and was scored as normal. B, The gestational age of the infants was considered when evaluating the peak of the NAA. In ¹H-MR spectroscopy (TE = 30 ms), the peak NAA was considered as age-appropriate for a near-term infant (35 weeks of gestation) and was scored as normal. C, The lesion that had involvement of both the WM and cortex was scored individually only for the principal area. In axial DWI, the diffusion restriction in the cortex and its location were scored as focal (1 lobe) and unilateral (score of 2). The WM involvement was scored individually as focal and unilateral (score of 2). D, The extension of signal abnormality (involving 1 lobe or >1 lobe) was scored on the basis of the primary area of injury. In axial ADC mapping, the diffusion restriction in the WM was scored as focal (score of 1) because only the frontal lobe was involved and the location was scored as bilateral (score of 2).

Interrater agreement was assessed by total score, subscores, and the severity of brain injury (normal, mild, moderate, and severe) before (n = 20) and after the adjustment of the scoring system (n = 20).

Statistical Analysis

Interrater reliability was evaluated by calculation of the intraclass correlation coefficient (ICC) with a 2-way random-effects model for total score and WM, GM, cerebellum, and additional subscores. In addition, Bland-Altman plots were constructed to assess the absolute limits of interobserver agreement for continuous variables.
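For readers who wish to reproduce this type of analysis outside SPSS, the sketch below computes one common form of the ICC for a 2-way random-effects model. It assumes the single-rater, absolute-agreement form, ICC(2,1) in the Shrout and Fleiss notation; the text does not state which form was used, so this choice, the function name `icc_2_1`, and the example scores are all assumptions for illustration.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, n_raters) array-like of scores.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Made-up total scores from two readers for six scans (not study data).
reader_1 = [0, 2, 5, 11, 20, 1]
reader_2 = [0, 3, 4, 12, 18, 1]
print(round(icc_2_1(np.column_stack([reader_1, reader_2])), 3))
```

Because the absolute-agreement form penalizes a systematic offset between readers, it is generally more conservative than the consistency form when one reader tends to score higher than the other.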

The percentiles of the total score in the full cohort (n = 252) were calculated to determine the cutoff values for normal brain and mild, moderate, and severe brain injury. Weighted κ (κ_W) tests were used to determine the agreement between the readers for the severity of brain injury as a categorical variable. The McNemar test was run to determine whether the severity of brain injury, as a categorical variable, differed between the readers. We used SPSS, Version 22 (IBM) and GraphPad Prism, Version 8.1.2 for macOS (GraphPad Software) to analyze and plot the data.
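The weighted κ statistic can be illustrated with a short sketch. The weighting scheme is not specified in the text, so the linear weights used by default here are an assumption (quadratic weights are offered as an alternative); the function name `weighted_kappa` and the example ratings are hypothetical and are not the study data.

```python
import numpy as np

# Ordered severity categories used in this article.
CATEGORIES = ["normal", "mild", "moderate", "severe"]


def weighted_kappa(rater_a, rater_b, categories=CATEGORIES, weights="linear"):
    """Weighted kappa for two raters on an ordinal scale."""
    index = {c: i for i, c in enumerate(categories)}
    m = len(categories)

    # Observed joint proportions (rows: rater A, columns: rater B).
    observed = np.zeros((m, m))
    for a, b in zip(rater_a, rater_b):
        observed[index[a], index[b]] += 1
    observed /= observed.sum()

    # Proportions expected by chance from the two raters' marginals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    # Disagreement weights grow with the distance between categories.
    i, j = np.indices((m, m))
    distance = np.abs(i - j) / (m - 1)
    disagreement = distance if weights == "linear" else distance ** 2

    return 1.0 - (disagreement * observed).sum() / (disagreement * expected).sum()


# Made-up classifications of four scans by two readers.
a = ["normal", "normal", "mild", "severe"]
b = ["normal", "mild", "mild", "severe"]
print(round(weighted_kappa(a, b), 3))
```

Unlike unweighted κ, this statistic gives partial credit when the readers differ by one category (for example, mild versus moderate) and penalizes larger gaps (for example, normal versus severe) more heavily.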

RESULTS

The demographic and prenatal data of the total cohort are presented in Table 1. Fifty-three percent of the infants had mild HIE based on the modified Sarnat staging system, reflecting our institutional policy offering cooling to milder cases. The brain MR imaging scans of the 252 infants were performed at a median of 4.0 (interquartile range = 3.0–4.0) days of life. The randomly selected 40 MR images were evaluated by 2 experienced readers, resulting in a total of 80 reads.

Table 1: Demographics and prenatal data of the full cohort^a

In the first epoch of the study, 20 MR images were scored by 2 readers on the basis of the description of the original article of Weeke et al.¹⁰ There was strong interrater agreement for the total score with an ICC of 0.96 (95% CI, 0.89–0.99). The ICC for subscores also showed an excellent agreement between the raters (Table 2).

Table 2: Interrater reliability for subscores and for total score^a

In the Bland-Altman analysis, the average of the differences [SD] was 1.80 [3.7] for the total score with −5.5 to 9.1 limits of agreement. Regarding the subscores, there were no mean differences of >2 points (Online Supplemental Data).
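The limits of agreement follow from the bias and the SD of the differences as bias ± 1.96 × SD. The sketch below, with hypothetical function names, shows the calculation from paired scores and checks it against the summary values reported above (1.80 and 3.7); the paired scores in the final example are made up.

```python
import numpy as np


def limits_from_summary(bias, sd):
    """95% limits of agreement from the mean difference and its SD."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman(reader_a, reader_b):
    """Bias and 95% limits of agreement from paired scores."""
    diff = np.asarray(reader_a, dtype=float) - np.asarray(reader_b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample SD of the paired differences
    lower, upper = limits_from_summary(bias, sd)
    return bias, lower, upper


# Reported first-epoch summary values for the total score.
lower, upper = limits_from_summary(1.80, 3.7)
print(round(lower, 1), round(upper, 1))  # -5.5 9.1

# Made-up paired total scores for three scans.
print(bland_altman([3, 5, 8], [1, 4, 4]))
```

The reported interval of −5.5 to 9.1 is consistent with this formula, which suggests the limits were computed in the conventional way.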

The severity of brain injury was classified on the basis of the distribution of total scores in the full cohort of 252 infants. Figure 2 shows the frequency distribution of the total score in the full cohort. The median total score was 2, with a range of 0 to 41 points. Subcategories of brain injury were defined by percentiles of the total score: a total score of ≤4 (≤75th percentile) was considered normal; 5–10 (76th–90th percentile), mild; 11–15 (91st–95th percentile), moderate; and >15 (>95th percentile), severe brain injury.
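These cutoff values translate directly into a simple classification rule. The sketch below is illustrative only: the function name `classify_severity` and the example scores are hypothetical, while the thresholds (4, 10, and 15) and the maximum total score (57) are those given in the text.

```python
from collections import Counter

MAX_TOTAL_SCORE = 57  # maximum total score of the grading system


def classify_severity(total_score: int) -> str:
    """Map a total score to the severity category proposed in this article."""
    if not 0 <= total_score <= MAX_TOTAL_SCORE:
        raise ValueError(f"total score {total_score} outside 0-{MAX_TOTAL_SCORE}")
    if total_score <= 4:
        return "normal"
    if total_score <= 10:
        return "mild"
    if total_score <= 15:
        return "moderate"
    return "severe"


# Made-up total scores (not study data), tallied by category.
example_scores = [0, 2, 4, 5, 10, 11, 15, 16, 41]
print(Counter(classify_severity(s) for s in example_scores))
```

Because a difference of a single point near a threshold (for example, 10 versus 11) changes the category, small scoring discrepancies between readers can alter the assigned severity.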

FIG 2.

The frequency distribution of the total score in the full cohort.

The 2 readers agreed that the findings of 10/20 (50%) MR imaging scans were within normal limits, those of 2/20 (10%) scans were moderate, and those of 3/20 (15%) scans were severe brain injury. However, the severity of brain injury in 5 infants (5/20, 25%) was graded differently by the 2 observers. Reader 1 classified the findings of 3 scans (patients 1, 12, and 19) as moderate, whereas reader 2 rated them as mild (patients 1 and 19) or normal (patient 12). The severity of brain injury was also classified differently by the 2 readers in patient 2 (mild versus normal) and in patient 6 (severe versus mild). The κ_W, calculated to assess the agreement between the 2 observers on the severity of brain injury, showed substantial agreement (κ_W = 0.723) (Fig 3A and Table 3). Figure 3 shows the severity of brain injury based on the total score for each of the subjects.

FIG 3.

The severity of brain injury based on the total score for each subject in the first (A) and second epochs (B).

Table 3: Severity of injury based on the total scores^a

In the second epoch of the study, 20 MR imaging scans were evaluated by the same readers after the adjustment of the grading system. The adjustments were based on the main discrepancies between the 2 readers in the first epoch, including the assessment of myelination in the PLIC, the peak of the NAA level, cortical involvement, and the extent of the WM injury.

In the second epoch, the ICC for the total score and subscores also indicated excellent reliability between the 2 readers, similar to that of the first epoch, with the exception of the additional subscore (Table 2). Overall, both the average of the differences (bias) and the limits of agreement improved for the total score and the subscores (Online Supplemental Data).

In addition, only 3 MR images (15%) were classified differently by the 2 observers (Fig 3B). In line with this difference, the κ_W showed very good agreement between the 2 readers' classifications of the severity of brain injury (κ_W = 0.887) compared with the substantial agreement in the first epoch (κ_W = 0.723). The McNemar test showed no statistically significant difference between the readers in the proportion of scans in each category, similar to the finding in the first epoch (Table 3).

DISCUSSION

This study has demonstrated the utility of a novel MR imaging scoring system in a cohort of neonates with a wide range of HIE severity. It also showed the potential advantage of adjusting some of its subscores. The interrater reliability showed an excellent level of agreement for the total score between the 2 experienced readers both before and after the adjustment of the scoring system. The Bland-Altman plot revealed, overall, a decreasing bias between the 2 readers and a narrower agreement interval for the subscores after the adjustment. In addition, the agreement between the 2 readers' classifications on the severity of brain injury greatly improved in the second epoch.

The presented cutoff values may be used to determine the severity of brain injury in future clinical studies. However, the cutoff values derived from the percentiles of the total scores in the full cohort may reflect our diverse patient population, including infants with mild HIE. Hence, these cutoff values may not be applicable to centers that provide TH to infants with only moderate and severe HIE.

In recent years, the inclusion criteria for hypothermia have been broadened, and TH has been offered increasingly to near-term infants.^15,16

In line with these criteria, the first adjustment related to the gestational age of infants with HIE. Both the metabolic profile and the myelination change as the brain matures. The rate of increase in the NAA peak is related to the maturation process.¹⁷ Likewise, an increase in myelinated WM can be detected between 35 and 41 weeks of gestation.¹⁸ Therefore, in the assessment of the NAA peak and the absence of myelination in the PLIC, the gestational age must be considered. The consistent evaluation of these 2 items is also important because both abnormal signal in the PLIC and NAA concentration have a good predictive value for the neurodevelopmental outcomes.^19,20

The second and third adjustments concerned the involvement of WM and the cortical area. The retrospective study of Rao et al¹⁶ found that WM injury was the most frequent pattern among near-term infants, followed by GM injury and cortex involvement. Furthermore, isolated WM and cortical abnormalities were associated with communication and behavioral problems, visual impairment, and seizures.²¹ Hence, the consistent evaluation of WM and cortex involvement has a major role in the prediction of long-term outcomes. Moreover, the inconsistent scoring of brain injury can change the category of severity.

The study has several limitations that should be taken into consideration. First, statistical analysis should be interpreted with caution within the context of the small sample size. Second, because of the small sample size, the 95% limits of agreement in the Bland-Altman plot may be unreliable for estimating those of larger populations. Another limitation of the study is that we did not validate our cutoff values against long-term neurodevelopmental outcome data.

CONCLUSIONS

The novel grading system developed by Weeke et al¹⁰ provides a detailed evaluation of the neonatal brain with hypoxic-ischemic injury using DWI and ¹H-MR spectroscopy sequences. The modification of the scoring system may help with the correct interpretation of the selected items and can lead to a higher degree of interrater agreement. The presented cutoff values may be used to determine the severity of brain injury in future clinical studies, including those of infants with mild HIE. Further studies are needed to determine the cutoff values of this novel grading system for the severity of brain injury in relation to neurodevelopmental sequelae.

Footnotes

Disclosures: Edward Yang—UNRELATED: Consultancy: CorticoMetrics, Comments: reviewed brain MRIs for a company developing software to recognize cortical dysplasia, last time in 2017. Terrie E. Inder—UNRELATED: Consultancy: Aspect Imaging, Comments: I am on the Scientific Advisory Board for this novel MR imaging company manufacturing a neonatal MRI system; Expert Testimony: occasional legal services, Comments: occasional medicolegal opinion; Grants/Grants Pending: federal and foundations.* *Money paid to the institution.

References

1. Kurinczuk JJ, White-Koning M, Badawi N. Epidemiology of neonatal encephalopathy and hypoxic-ischaemic encephalopathy. Early Hum Dev 2010;86:329–38 doi:10.1016/j.earlhumdev.2010.05.010 pmid:20554402
2. Azzopardi D, Strohm B, Marlow N, et al. TOBY Study Group. Effects of hypothermia for perinatal asphyxia on childhood outcomes. N Engl J Med 2014;371:140–49 doi:10.1056/NEJMoa1315788 pmid:25006720
3. Jacobs SE, Hunt R, Tarnow-Mordi W, et al. Cooling for newborns with hypoxic ischaemic encephalopathy. Cochrane Database Syst Rev 2007;CD003311 doi:10.1002/14651858.CD003311.pub2 pmid:17943788
4. Alderliesten T, de Vries LS, Staats L, et al. MRI and spectroscopy in (near) term neonates with perinatal asphyxia and therapeutic hypothermia. Arch Dis Child Fetal Neonatal Ed 2017;102:F147–52 doi:10.1136/archdischild-2016-310514 pmid:27553589
5. de Vries LS, Groenendaal F. Patterns of neonatal hypoxic-ischaemic brain injury. Neuroradiology 2010;52:555–66 doi:10.1007/s00234-010-0674-9 pmid:20390260
6. Barkovich AJ, Hajnal BL, Vigneron D, et al. Prediction of neuromotor outcome in perinatal asphyxia: evaluation of MR scoring systems. AJNR Am J Neuroradiol 1998;19:143–49 pmid:9432172
7. Shankaran S, McDonald SA, Laptook AR, et al. Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network. Neonatal magnetic resonance imaging pattern of brain injury as a biomarker of childhood outcomes following a trial of hypothermia for neonatal hypoxic-ischemic encephalopathy. J Pediatr 2015;167:987–93 doi:10.1016/j.jpeds.2015.08.013 pmid:26387012
8. Rutherford M, Ramenghi LA, Edwards AD, et al. Assessment of brain tissue injury after moderate hypothermia in neonates with hypoxic-ischaemic encephalopathy: a nested substudy of a randomised controlled trial. Lancet Neurol 2010;9:39–45 doi:10.1016/S1474-4422(09)70295-9 pmid:19896902
9. Goergen SK, Ang H, Wong F, et al. Early MRI in term infants with perinatal hypoxic-ischaemic brain injury: interobserver agreement and MRI predictors of outcome at 2 years. Clin Radiol 2014;69:72–81 doi:10.1016/j.crad.2013.09.001 pmid:24210250
10. Weeke LC, Groenendaal F, Mudigonda K, et al. A novel magnetic resonance imaging score predicts neurodevelopmental outcome after perinatal asphyxia and therapeutic hypothermia. J Pediatr 2018;192:33–40 doi:10.1016/j.jpeds.2017.09.043 pmid:29246356
11. Azzopardi DV, Strohm B, Edwards AD, et al. TOBY Study Group. Moderate hypothermia to treat perinatal asphyxial encephalopathy. N Engl J Med 2009;361:1349–58 doi:10.1056/NEJMoa0900854 pmid:19797281
12. Gluckman PD, Wyatt JS, Azzopardi D, et al. Selective head cooling with mild systemic hypothermia after neonatal encephalopathy: multicentre randomised trial. Lancet 2005;365:663–70 doi:10.1016/S0140-6736(05)17946-X pmid:15721471
13. Shankaran S, Laptook AR, Ehrenkranz RA, et al. National Institute of Child Health and Human Development Neonatal Research Network. Whole-body hypothermia for neonates with hypoxic-ischemic encephalopathy. N Engl J Med 2005;353:1574–84 doi:10.1056/NEJMcps050929 pmid:16221780
14. Mathur AM, Neil JJ, McKinstry RC, et al. Transport, monitoring, and successful brain MR imaging in unsedated neonates. Pediatr Radiol 2008;38:260–64 doi:10.1007/s00247-007-0705-9 pmid:18175110
15. Jacobs SE, Morley CJ, Inder TE, et al. Infant Cooling Evaluation Collaboration. Whole-body hypothermia for term and near-term newborns with hypoxic-ischemic encephalopathy: a randomized controlled trial. Arch Pediatr Adolesc Med 2011;165:692–700 doi:10.1001/archpediatrics.2011.43 pmid:21464374
16. Rao R, Trivedi S, Vesoulis Z, et al. Safety and short-term outcomes of therapeutic hypothermia in preterm neonates 34–35 weeks gestational age with hypoxic-ischemic encephalopathy. J Pediatr 2017;183:37–42 doi:10.1016/j.jpeds.2016.11.019 pmid:27979578
17. Xu D, Vigneron D. Magnetic resonance spectroscopy imaging of the newborn brain: a technical review. Semin Perinatol 2010;34:20–27 doi:10.1053/j.semperi.2009.10.003 pmid:20109969
18. Dubois J, Dehaene-Lambertz G, Kulikova S, et al. The early development of brain white matter: a review of imaging studies in fetuses, newborns and infants. Neuroscience 2014;276:48–71 doi:10.1016/j.neuroscience.2013.12.044 pmid:24378955
19. Rutherford MA, Pennock JM, Counsell SJ, et al. Abnormal magnetic resonance signal in the internal capsule predicts poor neurodevelopmental outcome in infants with hypoxic-ischemic encephalopathy. Pediatrics 1998;102:323–28 doi:10.1542/peds.102.2.323 pmid:9685433
20. Lally PJ, Montaldo P, Oliveira V, et al. MARBLE consortium. Magnetic resonance spectroscopy assessment of brain injury after moderate hypothermia in neonatal encephalopathy: a prospective multicentre cohort study. Lancet Neurol 2019;18:35–45 doi:10.1016/S1474-4422(18)30325-9 pmid:30447969
21. Martinez-Biarge M, Bregant T, Wusthoff CJ, et al. White matter and cortical injury in hypoxic-ischemic encephalopathy: antecedent factors and 2-year outcome. J Pediatr 2012;161:799–807 doi:10.1016/j.jpeds.2012.04.054 pmid:22682614

Received May 11, 2020.
Accepted after revision November 23, 2020.

American Journal of Neuroradiology
