Subcortical processing of speech regularities underlies reading and music aptitude in children
© Strait et al; licensee BioMed Central Ltd. 2011
Received: 12 May 2011
Accepted: 17 October 2011
Published: 17 October 2011
Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities.
We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams.
Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as to auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music.
These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input. Definition of common biological underpinnings for music and reading supports the usefulness of music for promoting child literacy, with the potential to improve reading remediation.
The human nervous system makes use of sensory regularities to drive accurate perception, especially when confronted with challenging perceptual environments. It is thought that the brain shapes perception according to predictions that are made based on regularities; this shaping is accomplished by comparing higher-level predictions with lower-level sensory encoding of an incoming stimulus via the corticofugal (i.e., top down) system. This is a common neural feature that spans sensory modalities and can be observed in neural responses to regularly-occurring, as opposed to unpredictably-occurring, stimuli [3–5]. The brain's ability to use sensory regularities is a fundamental feature of auditory processing, promoting even the most basic of auditory experiences such as language processing during infancy [6, 7] and speech comprehension amidst a competing conversational background. Failure of the brain to utilize sensory regularities has been associated with neural dysfunction, such as schizophrenia and language impairment (e.g., dyslexia) [5, 9–11].
The impact of stimulus regularity on auditory processing has been well established in the auditory cortex [1, 3] and was recently documented at and below the level of the brainstem [12–15]. Specifically, neural potentials to frequently-occurring sounds exhibit enhanced frequency tuning in both the primary auditory cortex and in the auditory brainstem [5, 17]. This sensory fine-tuning occurs rapidly, does not require overt attention and may enable enhanced object discrimination [14, 18]. Although reference to the neural enhancement of a repeated speech sound might seem contradictory to the well-known repetition suppression of cortical evoked response magnitudes, the neural mechanisms underlying this effect remain debated. While some have proposed that stimulus repetition leads to overall decreased neuronal activity, others have suggested that repetition facilitates precision in neural representation by enhancing certain aspects of the neural response while inhibiting others (e.g., more precise inhibitory sidebands surrounding a facilitated response to the physical dimensions of a repeated stimulus).
Human auditory brainstem responses (ABRs) to the pitch of predictably presented speech are enhanced relative to ABRs to speech presented in a variable context. The extent of this subcortical enhancement of regularly-occurring speech relates to better performance on language-related tasks, such as reading and hearing speech in noise. This fine-tuning is thought to be driven by top-down cortical modulation of subcortical response properties and its absence in poor readers is consistent with proposals that child reading impairment stems from the brain's inability to benefit from repetition in the sensory stream. Specifically, children with dyslexia fail to form perceptual anchors (a type of perceptual memory) based on repeating sounds [9, 11].
Although we have made gains in understanding the auditory processing of speech regularities in children with reading impairment (or lack thereof), we do not know how auditory expertise shapes these mechanisms. The auditory expertise engendered by musical training during childhood and into adulthood promotes the subcortical encoding of speech [20, 21] and may strengthen neural mechanisms that undergird child literacy [22–24]. Although the integrative nature of music and language abilities continues to be debated [25–27], a growing body of work supports shared abilities for music and reading, with music aptitude accounting for a substantial amount of the variance in child reading ability [28–30] even after controlling for nonverbal IQ and phonological awareness. It is thought that strengthened top-down control, which is important for modulating lower-level neural responses, unfolds with expertise and, more specifically, with musical training [33, 34].
In order to define relationships between musical skill and literacy-related aspects of auditory brainstem function, we assessed subcortical processing of speech regularities, music aptitude and reading abilities in school-aged children. Our overarching goal was to define common biological underpinnings for music and reading abilities. We anticipated that music aptitude and literacy abilities would positively correlate with subcortical spectral enhancement of repetitive speech cues. We also explored relationships between musical skill and literacy-related aspects of auditory cognitive function through working memory assessments [35, 36], which included an auditory attention component. We anticipated that music aptitude and literacy abilities would positively correlate with auditory working memory and attention performance. In order to delineate and quantify relationships among variables, we applied Structural Equation Modeling (SEM) to the data. SEM relies on a variety of simultaneous statistical methods (e.g., factor analysis, multiple regressions and path analysis combined with structural equation relations) to evaluate a hypothesized model. Although more traditional regression analyses are useful for delineating causal relationships among variables, SEM enables more efficient characterization of complex, real-world processes than can be achieved using correlation-based analyses. Specific benefits of SEM include the simultaneous analysis of multiple interrelated variables, consideration of measurement error, and inherent control for multiple comparisons. We expected SEM to substantiate our hypothesis that music aptitude predicts much of the variance in literacy abilities by way of shared cognitive and neural mechanisms.
Participants were 42 normal-hearing children between the ages of 8 and 13 years (M = 10.4, SD = 1.6; 26 males). Participants and their legal guardians provided informed assent and consent according to Northwestern University's Institutional Review Board. Because we aimed to evaluate neural function and music aptitude across a spectrum of readers, no literacy restrictions were applied, but all participants demonstrated normal audiometric thresholds (≤20 dB HL pure tone thresholds at octave frequencies from 125 to 8000 Hz) and IQ (≥85 score on the Wechsler Abbreviated Scale of Intelligence). Participants also had clinically normal ABRs to 80 dB SPL 100 μs click stimuli that were presented at 31.1 Hz.
Extent of extracurricular activity was assessed by a parent questionnaire (the Child Behavior Checklist). Parents rated their child's current extracurricular activities according to the frequency of the child's involvement (less than average, average, or more than average); these scores were summed to produce a single extracurricular activity score.
Good (n = 8) and poor readers (n = 21) were differentiated based on reading ability (Test of Word Reading Efficiency; see Reading and working memory, below). Children with scores ≤90 were included in the poor reading group, while good readers had scores ≥110. Thirteen subjects did not meet the criteria for either group and were excluded from group analyses. Good and poor readers did not differ in age (Mann-Whitney U test; z = -0.223, p = 0.83), sex (Pearson Chi-Square χ2 = 0.12, p = 0.73), socioeconomic status as inferred by maternal education (Pearson Chi-Square χ2 = 1.10, p = 0.59), years of musical training (Mann-Whitney U test; z = -0.231, p = 0.82), extent of extracurricular activity (Mann-Whitney U test; z = -1.202, p = 0.23) or nonverbal IQ (Mann-Whitney U test; z = -1.834, p = 0.07). With regard to musical training histories, 36 of the 42 children had undergone no to only a few months of musical training and were not currently involved in music activities. The other six children had participated in at least one year of musical training. One of these children was categorized as a poor reader, two were categorized as good readers and three were considered average readers (as such, these three were not included in either reading group).
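The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs (≤90 and ≥110 on the TOWRE); the function name and the example scores are hypothetical and are not study data.

```python
from typing import List, Optional

POOR_MAX = 90    # TOWRE scores at or below this value: poor reader
GOOD_MIN = 110   # TOWRE scores at or above this value: good reader


def reading_group(towre_score: float) -> Optional[str]:
    """Return 'poor', 'good', or None for scores between the two cutoffs."""
    if towre_score <= POOR_MAX:
        return "poor"
    if towre_score >= GOOD_MIN:
        return "good"
    return None  # meets neither criterion; excluded from group analyses


# Hypothetical scores, for illustration only.
scores: List[float] = [85, 90, 100, 110, 123]
groups = [reading_group(s) for s in scores]
```

Children who fall between the cutoffs are left ungrouped, which mirrors the exclusion of the thirteen average readers from group comparisons while keeping them available for correlation analyses.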
Standardized literacy measures assessed oral (Test of Word Reading Efficiency, TOWRE) and silent (Test of Silent Word Reading Fluency, TOSWRF) reading speed. The TOWRE requires children to read aloud lists of real words (Sight subtest) and nonsense words (Phonemic Decoding subtest) while being timed. The two subscores are combined to form a composite score (here referred to as the TOWRE). The TOSWRF requires participants to quickly identify printed words by demarcating lines of letters into individual words while being timed. Participants are presented with rows of words that gradually increase in reading difficulty and they are asked to separate them (e.g., dimhowfigblue → dim/how/fig/blue). TOWRE ("reading efficiency") and TOSWRF ("reading fluency") age-normed scores were averaged in order to create a composite Reading variable for correlation analyses.
Auditory working memory was assessed using the Memory for Digits Forward subtest of the Comprehensive Test of Phonological Processing and the Memory for Digits Reversed subtest of the Woodcock Johnson Test of Cognitive Abilities. Digits forward and digits reversed age-normed scores were averaged in order to create a composite score for correlation analyses. In light of auditory attention's contribution to memory for digits forward, composite performance on both digits forward and reversed subtests is referred to as Auditory Working Memory and Attention (AWM/Attn).
Music aptitude was assessed using Edwin E. Gordon's Intermediate Measures of Music Audiation (IMMA), which measures children's abilities to internalize musical sound and compare two sequentially presented sound patterns. Tonal aptitude was assessed by the Tonal subtest, in which participants are presented with 40 pairs of musical excerpts that do not differ rhythmically but may differ melodically. Rhythm aptitude was assessed by the Rhythm subtest, in which participants are presented with 40 pairs of short excerpts that do not differ melodically but may differ rhythmically. For both subtests, participants indicate whether the two excerpts in each pair are the same or different. The subtest scores are combined to generate a composite music aptitude score. The rhythm, tonal and composite scores are normed by academic grade in order to produce percentile rankings.
The stimulus was presented to the right ear via insert earphones (ER-3; Etymotic Research, Elk Grove Village, IL) at 80 dB SPL and at a rate of 4.35 Hz. This fast presentation rate limits the contribution of cortical neurons, which are unable to phase-lock at such fast rates. Furthermore, the stimulus was presented in alternating polarities and average responses to each polarity were subsequently summed in order to limit contamination of the neural recording by the cochlear microphonic. During recording sessions, participants watched videos of their choice in order to maintain a still yet wakeful state with the soundtrack quietly playing from a speaker, audible through the nontest ear. Because auditory input from the soundtrack was not stimulus-locked and stimuli were presented directly to the right ear at a +40 dB signal-to-noise ratio, the soundtrack had no significant impact on the recorded responses.
Responses were digitally sampled at 20,000 Hz, offline filtered from 70 to 2000 Hz with a 12 dB roll-off and epoched from -40 to 190 ms (stimulus onset at time zero). Events with amplitudes beyond ± 35 μV were rejected as artifacts. Responses to 100 μs clicks were collected before and after each recording session in order to ensure consistency of wave V latencies, confirming no differences in recording parameters or subject variables.
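As a rough illustration of the epoching and artifact-rejection steps just described, the following Python sketch keeps only epochs whose samples stay within ±35 μV and then averages them. It is not the authors' processing pipeline: the band-pass filter is omitted, the function names are hypothetical, and the toy epochs are invented values used only to show the logic.

```python
from typing import List

FS_HZ = 20000          # sampling rate reported in the text
EPOCH_MS = (-40, 190)  # epoch window relative to stimulus onset
REJECT_UV = 35.0       # artifact criterion reported in the text (microvolts)

# Number of samples spanning the 230 ms epoch window at 20 kHz.
SAMPLES_PER_EPOCH = (EPOCH_MS[1] - EPOCH_MS[0]) * FS_HZ // 1000


def reject_artifacts(epochs: List[List[float]],
                     limit: float = REJECT_UV) -> List[List[float]]:
    """Keep only epochs in which no sample exceeds +/- limit."""
    return [ep for ep in epochs if max(abs(v) for v in ep) <= limit]


def average_epochs(epochs: List[List[float]]) -> List[float]:
    """Point-by-point mean across epochs of equal length."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


# Toy three-sample epochs (invented values); the second exceeds the criterion.
toy_epochs = [[1.0, -2.0, 3.0], [0.5, 40.0, -1.0], [3.0, 2.0, 1.0]]
clean = reject_artifacts(toy_epochs)
avg = average_epochs(clean)
```

Rejecting whole epochs rather than clipping individual samples keeps the averaged waveform free of partial artifacts, at the cost of discarding some trials.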
As in Chandrasekaran et al., we compared the brainstem responses to /da/ recorded in the variable condition to trial-matched responses recorded to /da/ in the predictable condition (Figure 1). Specifically, neural responses in the predictable condition were averaged according to their occurrence relative to the order of presentation in the variable condition, resulting in 700 artifact-free responses for each condition.
In accordance with Chandrasekaran et al., we examined the strength of the spectral encoding of the second and fourth harmonics (H2 and H4) in average responses for each participant over the formant transition of the stimulus (7-60 ms in the neural response) via fast Fourier transforms executed in Matlab 7.5.0 (The Mathworks, Natick, MA). Spectral magnitudes were calculated for 10 Hz-wide bins surrounding H2 and H4. The differences in the spectral amplitudes of H2 and H4 between the two conditions (predictable minus variable) were calculated for each participant and normalized through conversion to a z-score based on the group mean.
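The spectral measure described above can be sketched as follows. The study used fast Fourier transforms in Matlab; this Python version instead evaluates a direct DFT sum at a handful of frequencies inside a 10 Hz-wide bin, which is a swapped-in technique chosen to keep the example dependency-free. The harmonic frequencies are left as parameters because the stimulus fundamental is not stated in this section, the 1 Hz step within the bin and the use of the sample standard deviation for z-scoring are assumptions, and all function names are hypothetical.

```python
import cmath
import math
from statistics import mean, stdev
from typing import List, Sequence

FS = 20000  # sampling rate reported in the text (Hz)


def dft_magnitude(signal: Sequence[float], freq_hz: float, fs: int = FS) -> float:
    """Amplitude of a single frequency component, from a direct DFT sum."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def band_magnitude(signal: Sequence[float], center_hz: float,
                   width_hz: float = 10.0, step_hz: float = 1.0,
                   fs: int = FS) -> float:
    """Mean magnitude across a width_hz-wide bin centered on center_hz."""
    n_steps = int(round(width_hz / step_hz))
    freqs = [center_hz - width_hz / 2 + i * step_hz for i in range(n_steps + 1)]
    return mean(dft_magnitude(signal, f, fs) for f in freqs)


def enhancement(predictable: Sequence[float], variable: Sequence[float],
                harmonic_hz: float) -> float:
    """Predictable-minus-variable spectral magnitude at one harmonic."""
    return band_magnitude(predictable, harmonic_hz) - band_magnitude(variable, harmonic_hz)


def z_scores(values: Sequence[float]) -> List[float]:
    """Normalize against the group mean (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Synthetic check signal: unit-amplitude 200 Hz tone, 1000 samples (10 full cycles).
tone = [math.sin(2 * math.pi * 200 * k / FS) for k in range(1000)]
tone_magnitude = dft_magnitude(tone, 200.0)
```

In practice each participant's averaged response would be cropped to the 7-60 ms window before `enhancement` is computed for H2 and H4, and `z_scores` would then be applied across participants.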
The brainstem response z-scores were compared across conditions and groups using a Repeated Measures ANOVA and correlated with the reading and music aptitude measures using Pearson's correlations (SPSS Inc., Chicago, IL). RMANOVA outcomes were further defined in a post-hoc analysis using Mann-Whitney U-tests. All results reflect two-tailed values and normality for all data was confirmed using the Kolmogorov-Smirnov test for equality.
We normalized all data through conversion to z-scores based on group means. Analysis of covariance matrix structures was conducted with Lisrel 8.8 (Scientific Software International Inc., Lincolnwood, IL) and solutions were generated based on maximum-likelihood estimation. We defined the model's directions of causality in accordance with our aims, namely to define common biological and cognitive factors to account for the covariance in child reading and music abilities. We selected the Root Mean Square Error of Approximation (RMSEA) in order to evaluate the model's goodness of fit, with measurements below 0.08 indicative of good model fit. Lisrel 8.8 also calculates the likelihood ratio (χ2), its degrees of freedom and probability whenever maximum likelihood ratios are computed. The χ2 test functions as a statistical method for evaluating structural models, describing and evaluating the residuals that result from fitting a model to the observed data. A χ2 probability value greater than 0.05 indicates a good model fit.
The extent of subcortical enhancement of repetitive speech cues correlated with music aptitude and literacy abilities. Common variance among subcortical enhancement of repetitive speech cues, music aptitude and reading abilities was not accounted for by overarching factors such as socioeconomic status, extracurricular involvement or IQ.
SEM indicates that, by way of common neural (auditory brainstem) and cognitive (auditory working memory/attention) functions, music skill accounts for 38% of the variance in reading performance. The resulting statistical model delineates and quantifies relationships among auditory brainstem function, music aptitude, memory/attention and literacy.
Music aptitude correlated with reading performance. These relationships were largely driven by performance on the Rhythm music aptitude subtest (Rhythm-TOWRE: r = 0.41, p < 0.01; Rhythm-TOSWRF: r = 0.31, p < 0.05; Tonal-TOWRE: r = 0.16, p = 0.32; Tonal-TOSWRF: r = 0.26, p = 0.09), although the relationships between music aptitude and reading performance were strongest when considering the composite music aptitude score, which considers both Tonal and Rhythm performance (Composite-TOWRE: r = 0.45, p < 0.005; Composite-TOSWRF: r = 0.39, p < 0.01).
The amount of enhancement observed in ABRs recorded in the predictable compared to the variable condition positively correlated with reading and music aptitude performance across all subjects. The reading composite score (produced by combining TOWRE and TOSWRF z-scores) correlated with the amount of brainstem enhancement for both H2 and H4 (H2: r = 0.44, p < 0.005; H4: r = 0.40, p < 0.01; Figure 2b). The music composite score also correlated with the amount of brainstem enhancement to both harmonics (H2: r = 0.33, p < 0.05; H4: r = 0.37, p < 0.01; Figure 2b).
Although AWM/Attn correlated with the amount of brainstem enhancement to both harmonics (r = 0.35, p < 0.05), the covariance between these measures could be accounted for by their relationships with music aptitude. Whereas partialing for AWM/Attn did not eliminate the common variance observed between music aptitude and repetitive harmonic enhancement (r = 0.32, p = 0.04), AWM/Attn and repetitive harmonic enhancement no longer covaried when partialing for music aptitude (r = 0.20, p = 0.20). This suggests that most of the covariance between AWM/Attn and repetitive harmonic enhancement can be explained by their shared variance with music aptitude.
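The partialing logic used here can be illustrated with the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is a generic Python illustration, not a reproduction of the SPSS analysis; the function name is hypothetical and the input correlations are invented, because the pairwise correlation between AWM/Attn and music aptitude is not reported in this passage.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented correlations, for illustration only (not study data).
r_partial = partial_corr(r_xy=0.5, r_xz=0.5, r_yz=0.5)
```

When the third variable is unrelated to both of the others (r_xz = r_yz = 0), the partial correlation equals the original correlation; the more variance the third variable shares with both, the more the partial correlation shrinks.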
Subjects' socioeconomic status (SES) and extracurricular activity involvement did not correlate with the test variables of music aptitude, auditory brainstem enhancement of repetitive speech cues, reading, or auditory memory/attention.
By means of subcortical enhancement of predictable speech harmonics and AWM, music aptitude accounted for 38% of the variability in reading ability (p < 0.01). The model demonstrated an excellent fit (χ2(18) = 17.64, p > 0.35; RMSEA = 0.05). All path coefficients were significant except for the path between Tonal Aptitude and Composite Music Aptitude (r2 = 0.03, p = 0.31). This model emphasizes the combined strength of relationships among rhythm aptitude, subcortical enhancement of predictable speech harmonics and AWM/Attn in predicting child reading ability.
We observed correlations of music and literacy abilities with the extent of subcortical enhancement of predictable speech cues. As such, our data reveal common, objective neural markers for music aptitude and reading ability and suggest a model for the relationships that have been documented between music and literacy performance [28–31, 53].
Our data also reveal common cognitive markers for music aptitude and reading ability. Auditory working memory and attention are driving components of child literacy [35, 36], and relationships between auditory working memory and attention and musical skill have already been established [33, 54]. Not only do musicians demonstrate better verbal memory than nonmusicians, but this advantage can be seen with as little as one year of musical training. Our results demonstrate a similar relationship between auditory working memory and attention and music aptitude in children, although this relationship is observed regardless of musical training backgrounds.
As in Chandrasekaran et al., we observed subcortical enhancement of a predictable, contrasted with a variable, speech presentation. This enhancement was specific for frequencies integral to the perception of pitch (H2 and H4). Similar repetition-induced frequency enhancement has been observed in the primary auditory cortex, where neurons exhibit sharpened acuity to stimulus frequency. This tuning occurs without overt attention, is stimulus specific and develops rapidly [3, 56]. Not surprisingly, enhanced neural tuning with stimulus repetition has been proposed to relate to improved object discrimination [16, 18].
The ability of the sensory system to automatically modify neural response properties according to expectations in a dynamic and context-sensitive manner is thought to have evolved to infer and represent the causes of change in our environment [1, 57]. This modification may occur in a descending fashion, beginning in extra-sensory cortices where predictions are developed based on prior experience (such as with repetition) and sequentially tuning lower level response properties to heighten sensory acuity [2, 32, 57, 58]. The descending nature of this neural tuning is supported by observations from cortical work showing decreased onset latencies from 120 ms (after two repetitions) to 50 ms (after 30 repetitions) and is thought to represent the strengthening of the stimulus-specific memory trace at earlier and earlier processing stages. The correlations reported here of music aptitude and reading ability with subcortical fine-tuning to predictable speech sounds may indicate stronger top-down modulatory systems in individuals with better music aptitude and reading performance.
Our data demonstrate that diminished subcortical enhancement of predictable speech sounds relates to reading impairment. Similar observations have been made in poor readers, in addition to children with poor perception of speech presented in background noise; we extend these findings to the domain of music. This relationship is not surprising given the importance of sound repetition and sequencing for music perception. Specifically, repetition and regularity contribute to the perception of tonality, rhythm and meter [60, 61] and the structural use of musical themes. Deviations from predicted patterns result in impaired music production and perception [62–64] and can be flagged by the auditory cortex in both musically trained and untrained individuals, as measured by auditory evoked potentials [65–67]. Increased sensitivity to deviations from patterns in musical sound is thought to reflect enhanced sensory memory and discrimination abilities as well as more firmly established categorical boundaries.
It is not surprising that we observed correlations between music aptitude and subcortical spectral enhancement of predictable speech sounds given that musical expertise increases one's sensitivity to sound patterns not only in music, but also in speech [34, 69]. Although the argument can be made for a genetic contributor to musicians' enhanced sound processing, this increased sensitivity can be modulated, at least in part, by one's method of musical practice and training. Furthermore, diverse methodological approaches consistently reveal correlations between the extent of structural and functional neural enhancement observed in musicians and their years of musical practice or age of practice onset [71–74]. Such observations suggest the substantial contribution of experience-induced neuroplasticity to musicians' enhanced sound processing and may be attributed to the strength of top-down contributors to auditory processing [33, 69].
Due to its multisensory nature, attentional demands and reliance on rapid audio-motor feedback, music is a powerful tool for engendering neural plasticity, particularly for auditory processing [34, 75–78]. This plasticity is not constrained to the brain's music networks but applies more generally to auditory functions [27, 69, 72, 79–82]. Clinicians and researchers involved in the treatment and assessment of reading dysfunction have long held interest in the potential for musical training to strengthen neural networks for reading. Wisbey was one of the first to formally propose that music, by facilitating the development of multisensory awareness and auditory acuity, could promote reading in impaired children. This proposal has been verified by a number of experiments [84, 85] (cf. Morais et al., 2010), with relationships between music and reading abilities observed in many more [28–30, 53, 86].
Definition and characterization of common neural mechanisms for music and reading skills may enable the development of a biological assessment of reading impairment and improve the efficacy of remedial attempts. Reading performance is known to rely on a chorus of multifaceted and complex processes that have proven difficult to disentangle; here, we find that subcortical function serves as a significant and accessible factor in reading impairment, accounting for 44% of the variance in child reading ability. The use of auditory brainstem measurements to assess learning and reading impairment has emerged in recent years [21, 87, 88], is being adapted for the clinic and can provide an objective index of the success of auditory [89, 90] and music training. In light of the high test-retest reliability of the speech-evoked ABR, individual responses are highly replicable and can be meaningfully compared to group means or established norms. Identification of common neural markers for music and reading skill, such as those reported here, may lead to the biological assessment of music-associated learning abilities in children and encourage the employment of music as a technique for literacy remediation.
Musical training during early childhood may be particularly important for the advancement of music and reading aptitude. Although the music test employed here is thought to measure music aptitude, being one's inherent ability for music, the creator of this measure, Edwin E. Gordon, has long emphasized the impact of music education during early childhood on music aptitude scores. Gordon makes this claim in light of his extensive longitudinal work showing that music aptitude can improve with musical training, particularly during early childhood. The importance of an early onset of music activities is more directly supported by outcomes from neuroscientific research, in which many of the neuroplastic changes associated with musical training are more extensive in individuals who began training earlier in their lifetimes [71, 72, 93–96]. With regard to auditory brainstem processing, we found that ABRs in young adult musicians who began musical training prior to age 7 were distinct from those in musicians who began training between the ages of 7-13 [72, 93]. Whereas musicians who began training prior to age 7 demonstrated enhanced ABRs to the spectral components of communication sounds compared to nonmusicians, those who began later in life did not. Observations such as this reflect a critical period for musical training-associated neural plasticity and may speak to the importance of initiating musical training during early childhood for bringing about the greatest impact on music aptitude or, we propose, reading ability.
It remains undetermined whether reading abilities are impacted alongside music aptitude with musical training during childhood or whether the neural mechanism reported here is affected by musical training. Also undetermined is whether relationships between music and reading work in reverse, with language-based literacy remediation leading to improved music aptitude. More work (notably, longitudinal work) is necessary in order to define relationships between music aptitude, literacy and the auditory brainstem response to speech as well as to determine the impact of formal training, the efficacy of specific training approaches and/or literacy remediation programs.
Reading relies on a complex and multifaceted combination of processes that have proven difficult to disentangle. In light of correlational and structural modeling analyses, we conclude that subcortical function serves as a significant and accessible factor underlying reading ability and impairment, predicting 44% of the variance in reading ability. Further outcomes reveal direct relationships between musical skill and literacy-related aspects of auditory brainstem and memory/attention function, revealing common neural and cognitive mechanisms for reading and music abilities that may operate, at least in part, via corticofugal shaping of sensory function. By way of auditory brainstem spectral enhancement of predictable speech and auditory working memory/attention, music skill predicts approximately 40% of the variance in reading performance. Definition of common neural and cognitive mechanisms for music and reading skills may support the usefulness of music for promoting child literacy, with the potential to improve the efficacy of remedial attempts.
The extent of brainstem enhancement of predictable speech in subjects with high (IMMA ≥70th percentile; n = 18) and low (IMMA ≤30th percentile; n = 9) music aptitude patterned with the results observed when subjects were divided into good and poor readers. A 2 (condition) × 2 (music group) × 2 (harmonic) RMANOVA demonstrated an interaction between condition and music group (F = 6.17, p < 0.02). Post-hoc Mann-Whitney U-tests demonstrated that subjects with high music aptitude have a greater enhancement of the second harmonic of speech presented in the predictable condition compared to the variable condition than subjects with low music aptitude (H2: z = -1.96, p < 0.05; H4: z = -1.29, p = 0.19).
This work is supported by the National Science Foundation grant 0921275 to NK and the National Institutes of Health grant F31DC011457 to DS.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.