Distinct reinforcement learning profiles distinguish between language and attentional neurodevelopmental disorders
Behavioral and Brain Functions volume 19, Article number: 6 (2023)
Theoretical models posit abnormalities in cortico-striatal pathways in two of the most common neurodevelopmental disorders (developmental dyslexia, DD, and attention-deficit/hyperactivity disorder, ADHD), but it is still unclear what distinct cortico-striatal dysfunction might distinguish language disorders from others that exhibit very different symptomatology. Although impairments in tasks that depend on the cortico-striatal network, including reinforcement learning (RL), have been implicated in both disorders, there has been little attempt to dissociate between different types of RL or to compare learning processes in these two types of disorders. The present study builds upon prior research indicating the existence of two learning manifestations of RL and evaluates whether these processes can be differentiated in language and attention deficit disorders. We used a two-step RL task shown to dissociate model-based from model-free learning in human learners.
Our results show that, relative to neurotypicals, DD individuals showed an impairment in model-free but not in model-based learning, whereas in ADHD the ability to use both model-free and model-based learning strategies was significantly compromised.
Thus, learning impairments in DD may be linked to a selective deficit in the ability to form action-outcome associations based on previous history, whereas in ADHD some learning deficits may be related to an incapacity to pursue rewards based on the task's structure. Our results indicate how different patterns of learning deficits may underlie different disorders, and how computation-minded experimental approaches can differentiate between them.
Developmental dyslexia (DD) and Attention-deficit/hyperactivity disorder (ADHD) are two of the most common neurodevelopmental disorders. Dyslexia is characterized by difficulties in acquiring reading, writing, and spelling skills, whereas ADHD is characterized by inattention, impulsivity, and hyperactivity symptoms. Traditionally, DD has been suggested to arise from phonological impairments  but domain-general accounts postulate sensory  or procedural learning impairments [65, 98, 99] in its etiology, thus providing a mechanistic account for the diverse range of linguistic and nonlinguistic symptoms observed in this disorder. ADHD has been associated with an executive function deficit , but a growing body of evidence points to key deficits in motivational/reward-related processes as well [7, 36, 37, 59, 69, 73, 77, 80, 89]. There is a high comorbidity between these two childhood neurodevelopmental disorders , including shared symptoms such as temporal processing impairments [22, 94], executive function deficits , and procedural learning deficiencies [1, 110, 34, 54, 57].
Despite decades of research, the neurocognitive basis of these two disorders is still highly debated and the reason for the overlap is not yet fully understood. Recent advances in the research of comorbidity prompt a change from single deficit models to multiple models of developmental neuropsychology. According to the multiple deficit model , there are multiple probabilistic predictors of neurodevelopmental disorders across levels of analyses and comorbidity arises due to shared risk factors.
Interestingly, theoretical and empirical findings in the research of DD and ADHD implicate abnormalities in cortico-striatal pathways in both disorders [64, 99]. In DD, cortico-striatal disruption [10, 51, 76, 103] is presumed to affect the ability to acquire skills, procedures and stimulus–response associations acquired incrementally [24, 65, 97, 98]. Since language learning critically depends upon these domain-general abilities [21, 97], impaired striatal-based learning is presumed to disrupt the typical course of reading, writing, and spelling skills in those with DD. In ADHD, anatomical and functional abnormalities within the striatum have been suggested to give rise to impulsive behaviors, and neurobiological models of ADHD posit that the deficit in striatal-based learning and memory is likely to arise from dopamine dysfunction within the neostriatum. Recent evidence points to the right caudate as a shared neural substrate that is likely to be affected in both disorders.
The cortico-striatal network is responsible for reinforcement learning (RL), the process in which individuals learn by trial and error to make choices that exploit the likelihood of rewards and minimize the occurrence of penalties. Therefore, based on the notion of cortico-striatal abnormalities in both disorders, RL is likely to be affected as well. Consistent with this assumption, RL deficits have been documented in DD [38, 42, 63, 72, 88] as well as in ADHD [35, 39, 49, 61, 95]. Impairments have been observed across RL tasks involving probabilistic feedback such as the Probabilistic Selection Task [35, 63] and the Weather Prediction Task [39, 42]. Furthermore, both DD and ADHD individuals are impaired in learning information integration categories [49, 88], which are believed to be acquired via striatal-based RL mechanisms. Finally, both DD and ADHD individuals are impaired in probabilistic RL tasks when task conditions favor striatal-based memory engagement rather than hippocampal-based memory engagement, similar to a pattern observed among patients with striatal dysfunction [30, 33]. Notably, some studies revealed intact RL in ADHD, but such findings are mostly found in tasks in which feedback is deterministic [47, 61] or in studies using relatively simple tasks with a low number of stimuli [14, 48, 58].
Model-free vs. model-based RL
Nevertheless, we still do not have a clear understanding of RL phenomena in both DD and ADHD or whether they are characterized by distinct/shared RL mechanisms. Recent advances in the field of neurocomputational models of cognition suggest that RL cannot be considered a unitary phenomenon. Rather, people employ different computational strategies when solving RL problems. One of these involves learning stimulus–response contingencies which, after formation, are less sensitive to outcome and reward (Yin & Knowlton, 2006). A more prevalent account of learning describes goal-oriented learning by focusing on learning outcome-action contingencies. Here, outcome-action contingencies can be based solely on recent history and presumed to arise computationally from model-free (MF) learning. The MF system learns the expected value of actions through prediction errors, which quantify the difference between the worth of actual and expected outcomes. In addition, action-outcome contingencies can be updated through model-based (MB) RL, which operates by learning a predictive model of multiple world states and action-outcome probabilities, and updating action-outcome contingencies by incorporating this information and planning an action course by using this model to evaluate the different outcomes prospectively over multiple future world states [13, 15, 20, 107]. Here MB is likely to involve learning state values based on planning processes .
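As a minimal illustration of the prediction-error update attributed to the MF system above, the following sketch applies a simple delta rule. The function name, the learning rate `alpha`, the starting value, and the reward sequence are all illustrative assumptions; they are not parameters estimated or reported in this study.

```python
def mf_update(value, reward, alpha=0.1):
    """Delta-rule update: move the value estimate toward the obtained reward.

    `alpha` is a hypothetical learning rate chosen only for illustration.
    """
    prediction_error = reward - value      # actual minus expected outcome
    return value + alpha * prediction_error


# Hypothetical outcome sequence for one repeatedly chosen action (1 = reward).
rewards = [1, 1, 0, 1]
value = 0.5                                # arbitrary starting estimate
for r in rewards:
    value = mf_update(value, r)
```

In this toy form the estimate depends only on the recent reward history of the chosen action, which is the sense in which MF learning ignores the transition structure that an MB learner would exploit.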
It has been shown that animals and humans use a mixture of RL processes [13, 15, 20, 107, 111]. Limiting computational resources by a concurrent task [66, 67] or inducing stress [66, 67] hinders MB but not MF learning, somewhat in line with observations that learning based on stimulus–response associations is resistant to distraction [32, 109]. The ability to use MB strategies follows a developmental trajectory, as in children MF learning is more dominant than MB learning. Furthermore, MB learning has been shown to be sensitive to core components of executive functions, such as working memory and cognitive control [66–68]. Finally, in psychiatric disorders there is an imbalance between the ability to use MF vs. MB learning, such that those who have disorders associated with compulsivity and impulsivity tend to be impaired in their ability to use MB learning strategies [43, 102]. Neurobiologically, these two types of learning strategies are presumed to rely upon partially distinct neural substrates within the basal ganglia. It has been suggested that the dorsal lateral striatum subserves MF learning whereas the dorsal medial striatum underlies MB learning. Despite this evidence, however, hippocampal damage in humans hampers MB learning but not MF learning. Furthermore, although basal ganglia dopamine levels affect stimulus–response learning and hence are likely to affect MF learning, recent evidence points to the possibility that basal ganglia dopamine levels influence the ability to use MB but not MF learning strategies. Notably, however, computational simulations reveal that tonic dopamine levels influence the exploitation-exploration trade-off when value learning is based on previous reinforcement history.
The present study
The purpose of the present study was to examine RL behavior in two of the most common yet very different neurodevelopmental disorders. The theoretical and empirical body of research points to cortico-striatal abnormalities in both disorders (for a review see ), which may lead to RL difficulties. RL has been studied in both ADHD and DD, but there has been no attempt to dissociate between different types of RL processes. Although a previous study revealed that methylphenidate increased risk taking in people with ADHD, we are aware of no studies that directly examined MB vs. MF RL in ADHD or DD. Likewise, there has been little attempt to compare RL in these neurodevelopmental disorders. The two-step task (TST) represents a recently popular approach to creating a task that differentiates between MF learning and MB processes and has been tested in a substantial number of studies in humans (e.g., [18, 66, 67, 82, 102, 106, 107]). In this task, a participant is required to make two decisions, each taking them closer to the outcome stage where a reward is revealed. The TST allows a differentiation between two types of computations that may lead to impairments in reward-oriented behavior. The first is the MF effect of outcome on decisions, by which actions that were rewarded may not be sufficiently enhanced or associated with reward, leading to a weak association between actions and rewards. The second is the MB effect in which the likelihood that a path will lead to a reward is learned. Here, participants may not incorporate the probabilities of moving from one state to the next into their planning and decisions in the first step. Such computations, MF association and MB planning, may be uniquely disturbed in DD and ADHD.
Krishnan et al. argued that cortico-striatal dysfunctions have been noted in both language and psychiatric disorders (such as ADHD) and raised the possibility that different computational models may explain the behavioral learning profile in each disorder. They specifically speculated that in developmental language disorders compared to psychiatric disorders (including ADHD) learning impairments will be less apparent when learning state values (the overall reward that one expects when choosing the state as the starting point). However, as learning state values is common to MF and MB learning, learning state values based on planning processes may distinguish between language and attentional disorders. This notion is consistent with ample evidence showing that those with ADHD, but not those with DD, exhibit planning deficits and prefer immediate small rewards to delayed larger rewards [5, 16, 79, 85, 95]. Therefore, one could predict that MB learning will be selectively disrupted in ADHD. On the other hand, the MF association is likely to be impaired in both disorders, as shown by evidence pointing to an impaired ability to learn reinforcement contingencies based on recent history in DD [42, 63, 88] and ADHD [35, 39, 49, 61, 95].
To determine whether the current study was adequately powered, we performed an a priori power analysis. Based on prior research, we computed an effect size of d = 0.65 for the key group difference in model-based learning . Using the software package G*Power  with power (1 − β) set at 0.80 and α = 0.05, one-tailed, we determined that a sample size of 30 per group was required. Thus, the current study was adequately powered.
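The reported sample size can be roughly checked with a normal approximation to the one-tailed two-sample comparison. This is only a sketch: G*Power works with the noncentral t distribution, so its result may differ slightly from this approximation, and the function name `n_per_group` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a one-tailed, two-sample comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d) ** 2,
    not the exact noncentral-t computation performed by G*Power.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-tailed critical value
    z_beta = NormalDist().inv_cdf(power)        # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


required_n = n_per_group(0.65)                  # effect size reported in the text
```

Under these assumptions the approximation rounds up to the same 30 participants per group stated above.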
We excluded individuals who stayed with the same response key for more than 95% of the trials (0 were excluded) or had more than 25% implausibly quick reaction times in either the first or second stage (< 150 ms; 1 DD and 4 ADHD participants were omitted). For the remaining respondents we omitted from analysis trials with implausible reaction times (< 150 ms), and the first trial in the task (2.48%).
Model-based vs. model-free learning
Each clinical population group (DD/ADHD) was tested against its own control group (neurotypicals matched to the DD group and neurotypicals matched to the ADHD group, respectively), and each clinical group and its control group were matched on age, gender, and non-verbal intelligence. Analyses were performed using R (R Core Team, 2020). Mixed-effects logistic regression models were conducted using the lme4 package. For both experiments we used the following analyses:
To assess whether the groups differed in their ability to use MF vs. MB strategies, we evaluated the effect of events on each trial (trial n) on the first-step decision in the subsequent trial (trial n + 1). The two key predictors in trial n were whether or not a reward was received and whether this occurred after a common or rare transition to the second stage. We evaluated the impact of these events on the chance of repeating the same first-stage choice in trial n + 1. A pure model-free agent is likely to repeat a first-stage choice that results in reward regardless of the previous transition type, predicting a positive main effect of reward on first-stage stay probabilities. A pure model-based agent, on the other hand, evaluates first-stage actions in terms of the second-stage alternatives they tend to lead to, predicting a previous outcome × previous transition interaction on first-stage stay probabilities. To examine the contribution of these two systems (i.e., MF vs. MB) we calculated a mixed-effects logistic regression, where previous outcome (rewarded vs. unrewarded), previous transition (rare vs. common), group (DD/ADHD vs. control), and all related interactions were entered as fixed effects predicting the probability that the participant would repeat the same choice (stay probability). We further included in this analysis (and in all further mixed-effects regression analyses) a random effect of participants on the intercept parameter.
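The exclusion rules above can be sketched as two small filters. This is an illustration under stated assumptions, not the authors' preprocessing code: the trial representation (dictionaries with hypothetical fields `key`, `rt1`, and `rt2` in milliseconds), the use of the first-stage response key for the 95% rule, and dropping a trial when either stage is faster than 150 ms are all assumptions.

```python
from collections import Counter

RT_FLOOR_MS = 150  # responses faster than this are treated as implausible


def exclude_participant(trials):
    """Return True if a participant meets either participant-level rule."""
    n = len(trials)
    # Rule 1: the same response key on more than 95% of trials.
    top_count = Counter(t["key"] for t in trials).most_common(1)[0][1]
    same_key = top_count / n > 0.95
    # Rule 2: more than 25% implausibly fast responses in either stage.
    fast1 = sum(t["rt1"] < RT_FLOOR_MS for t in trials) / n > 0.25
    fast2 = sum(t["rt2"] < RT_FLOOR_MS for t in trials) / n > 0.25
    return same_key or fast1 or fast2


def clean_trials(trials):
    """Drop the first trial and any trial with an implausibly fast response."""
    return [t for t in trials[1:]
            if t["rt1"] >= RT_FLOOR_MS and t["rt2"] >= RT_FLOOR_MS]
```

A typical use would be to call `exclude_participant` once per participant and then apply `clean_trials` to those who are retained.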
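The logic of the stay-probability analysis can be illustrated descriptively, without the mixed-effects model. The sketch below computes raw stay proportions by previous outcome and previous transition, then two simple contrasts that mirror the reward main effect (MF signature) and the reward × transition interaction (MB signature). The field names `choice`, `rewarded`, and `common` and both function names are hypothetical; the reported inferences rest on the regression fitted in R, not on these raw contrasts.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Proportion of repeated first-stage choices, split by the previous
    trial's outcome (rewarded or not) and transition (common or rare)."""
    counts = defaultdict(lambda: [0, 0])          # condition -> [stays, total]
    for prev, curr in zip(trials, trials[1:]):
        cond = (prev["rewarded"], prev["common"])
        counts[cond][0] += curr["choice"] == prev["choice"]
        counts[cond][1] += 1
    return {cond: stays / total for cond, (stays, total) in counts.items()}


def mf_mb_indices(p):
    """Descriptive contrasts on the four stay probabilities.

    Keys of `p` are (rewarded, common) tuples of booleans.
    """
    # MF signature: rewarded minus unrewarded, averaged over transition type.
    mf = ((p[(True, True)] + p[(True, False)]) / 2
          - (p[(False, True)] + p[(False, False)]) / 2)
    # MB signature: the reward effect reverses after rare transitions.
    mb = ((p[(True, True)] - p[(True, False)])
          - (p[(False, True)] - p[(False, False)]))
    return mf, mb
```

For a purely MF pattern the second index is zero, and for a purely MB pattern the first index is zero, which is why the two effects can be separated within the same task.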
As an additional measure of model-based abilities, we analyzed second-stage reaction times (RTs) as a function of transition (rare vs. common). A previous study showed that greater deployment of model-based strategies in the first stage led to shorter RTs after common vs. rare transitions . Thus, the effect of transition on second-stage RTs can serve as an additional estimate for model-based involvement [12, 17]. We calculated a mixed effect linear regression, where transition (rare vs. common) and group (DD/ADHD vs. control) were entered as fixed effects predicting second-stage RTs. The regression included an additional random effect of participants on the intercept parameter.
Experiment 1: DD vs. controls
First stage MF vs. MB effects
Table 1 shows the results of this model and Fig. 1A illustrates the effects. We observed a significant main effect of previous outcome [χ2 (1) = 114.61, p < 0.001] on participants’ choices, showing that participants were more likely to stay with their first-stage choice when the previous trial was rewarded vs. unrewarded, across groups. This effect is indicative of model-free learning across groups. We further found that group modulated this effect, as evident by a significant previous outcome × group interaction [χ2 (1) = 8.08, p = 0.004], such that the DD group showed a smaller influence of previous outcome on first-choice stay probability. We also observed a significant previous outcome × previous transition interaction, [χ2 (1) = 13.424, p < 0.001], indicative of model-based learning. The three-way interaction of reward × transition × group was not significant [χ2 (1) = 1.52, p = 0.21], suggesting that people with DD tended to evaluate first-stage actions in terms of the second-stage alternatives associated with them, similar to how neurotypicals evaluated them.
Second-stage MB effect
Table 2 shows the results of this model and Fig. 1C illustrates the effects. We found a significant main effect of transition [χ2 (1) = 611.35, p < 0.001], where choices following a rare transition were slower than those following common transitions. None of the remaining effects with group were significant. This observation is consistent with the finding that those with DD did not differ from matched neurotypicals in their ability to use MB strategies.
Experiment 2: ADHD vs. controls
First-stage MF vs. MB effects
Table 3 shows the results of this model and Fig. 1B illustrates the effects. We observed a significant main effect of previous outcome [χ2 (1) = 92.603, p < 0.001], indicative of model-free learning across groups. However, group modulated this effect, as evidenced by a significant previous outcome × group interaction [χ2 (1) = 8.077, p = 0.01], such that the ADHD group showed a smaller influence of previous outcome on first-choice stay probability. We also observed a significant previous outcome × previous transition interaction [χ2 (1) = 15.967, p < 0.001], indicative of model-based learning. The three-way interaction of reward × transition × group was significant [χ2 (1) = 4.755, p = 0.029], such that ADHD participants exhibited reduced MB behavior (i.e., a smaller previous outcome × previous transition interaction) compared to neurotypicals.
Second-stage MB effect
Table 4 shows the results of this model and Fig. 1D illustrates the effects. We found a significant main effect of transition [χ2 (1) = 340.94, p < 0.001], where choices following a rare transition were slower than those following common transitions. Importantly, there was a significant transition × group interaction [χ2 (1) = 29.551, p < 0.001], such that the transition effect (slower responses after rare compared to common transitions) was larger in the control group than in the ADHD group, consistent with lower deployment of model-based strategies in the first stage for the ADHD compared to the control group. To test whether both groups exhibited a transition effect despite the difference in its magnitude indicated by the interaction, pairwise contrasts were calculated using the emmeans function from the emmeans package. Two pairwise contrasts for the levels of transition (rare vs. common) were calculated, one for each group, using the output of emmeans as input for the contrast function together with the Bonferroni correction for multiple comparisons. The effect of transition (slower responses after rare compared to common transitions) was significant for both groups (ADHD: estimate = 88.8, SE = 27.9, z ratio = 3.18, p = 0.0015; TD: estimate = 173, SE = 27, z ratio = 6.410, p < 0.001).
RL impairments have been implicated in both DD and ADHD [35, 39, 42, 49, 61, 63, 88, 95]. Here, we aimed to determine how different RL types (MF vs. MB) are affected in these two common yet very different neurodevelopmental disorders, and whether shared and distinct learning profiles could be observed across the two disorders. Consistent with previous studies, neurotypical participants in both Experiments 1 and 2 exhibited a typical mixture of MF and MB strategies in the two-step task. However, the performance of young adults with DD and ADHD differed relative to matched neurotypicals.
Our results show that, compared to matched neurotypicals, individuals with DD and individuals with ADHD were less likely to repeat a choice that was rewarded. However, those with ADHD but not those with DD were less affected by MB considerations in their decisions compared to neurotypicals. Supporting this observation, those with ADHD but not those with DD exhibited reduced expectation violation effects, as reflected by a reduced RT difference between common and rare transitions, another indication of lower MB learning.
The observation of impaired model-based RL in ADHD is consistent with previous findings showing that the ability to use MB strategies is disrupted in disorders characterized by striatal dopamine dysfunction, such as Parkinson’s disease  and broadens it to populations that are also associated with striatal dopamine alterations and impulsive tendencies, such as ADHD. The results are especially consistent with previous findings showing temporal discounting in those with ADHD [5, 16, 79, 85, 95]. The impaired ability of people with ADHD to use MB strategies could arise from several reasons: First, ADHD participants can have difficulties/are slower at generating complex internal models of task environments. Another possibility is that they are able to generate internal models but fail to exert the cognitive effort required to follow these mental models. Finally, it can be the case that MB learning is overwhelmed by the absence of automatic control routines that are normally provided by the MF system, rendering MB learning less effective in ADHD. The latter possibility, however, is inconsistent with the results of the DD group that demonstrated preserved MB learning despite impaired reward effect relative to neurotypicals. Future studies are undoubtedly needed in order to understand the reduced model-based behavior we observed in those with ADHD. The observation of impaired MF and MB learning in ADHD is consistent with neurobiological models of ADHD positing impaired RL mechanisms [35, 78, 96]. Although these models differ in their level of explanation  all assume that RL processes are likely to be impaired in ADHD. The present findings add to this theoretical body of research by pointing to the possibility that RL deficits in ADHD cannot be conceived as a unitary phenomenon but that two distinct types of RL processes are likely to be affected in this disorder. 
Despite differences in the ability to use MB strategies in the ADHD and DD groups, a similar previous-outcome main effect impairment was observed in both groups compared to neurotypicals. There are several possibilities for explaining the reduced previous-outcome main effect we observed in the two groups. First, such an effect could be explained by noise or an increased tendency to explore the environment, which could reasonably be associated with decreased use of MF strategies. This possibility is consistent with recent findings showing that ADHD symptoms are negatively correlated with win-stay scores. Indeed, computational simulations reveal an effect of altered dopamine levels on the exploration-exploitation trade-off. As such, altered dopamine levels in ADHD could shift this trade-off, consistent with neurobiological models of ADHD [35, 78, 96]. Notably, increased exploration in DD is less consistent with recent findings showing similar win-stay and lose-shift scores in DD compared to neurotypicals in a probabilistic reinforcement learning task. Another possibility is that the ability to learn reinforcement contingencies based on the recent outcome history is more disrupted in neurodevelopmental disorders compared to typical populations [35, 39, 42, 49, 61, 63, 88, 95]. In this regard, some have speculated that MF learning has notable parallels with procedural learning and that hippocampal-based learning is more equivalent to model-based behavior. Considering this, the present results resonate with theoretical models positing a procedural learning dysfunction in DD alongside intact hippocampal-based learning abilities [65, 98, 99]. Furthermore, at first glance the observation of impaired MF and MB learning in ADHD is inconsistent with theoretical and empirical research positing impaired striatal-based learning in ADHD alongside spared hippocampal-based learning [6, 41, 45, 99].
However, MB learning is also likely to involve additional neural substrates, in particular the dorsolateral prefrontal cortex, which has been shown to be affected in ADHD. Therefore, it may be the case that RL processes that also rely on the dorsolateral prefrontal cortex are more likely to be affected in ADHD than RL processes that are mostly associated with greater activation in hippocampal-based structures. Further studies are required to explore this possibility.
A further major contribution of the present study to previous literature is the examination of types of strategies employed by participants with DD during learning. The results of the present study suggest that learning deficits observed in DD might arise from impaired efficiency in using MF-based strategies. Our study therefore highlights the importance of studying not only learning deficits in DD but also use of strategies that might have a role in them. Since rule-based learning may be analogous to MB RL and procedural-based strategy may be analogous to model-free RL , the ability to use procedural-based strategies should be selectively disrupted in DD consistent with recent observations (Gabay, Roark & Holt, ). Procedural learning plays an important role in language acquisition  including the ability to form sound categories [26, 55]. Impaired category learning via procedural learning mechanisms could therefore influence the ability of people with DD to form precise phonological representations with negative effects on reading and phonological skills .
Taken together, the present findings reveal an interesting dissociation between attentional and language developmental disorders. A common deficit in MF association may lead to learning impairments in both disorders. Such impairments may be related to an attenuated effect or detection of outcome valence, or to problems in associating the reward with its preceding actions, especially linking it to actions that are twice removed from the outcome (first-stage decisions). However, the two disorders show different effects of MB mechanisms. While the DD group showed an intact MB representation of the path leading to outcome and the ability to dynamically use this information when making planning decisions, i.e., thinking ahead, ADHD participants did not incorporate this information. This may be due to an inappropriate representation of transition probability (i.e., of the path) or to a failure to incorporate this information in decisions. This distinction between planning ahead and updating backwards may be a characteristic of other differences between these two disorders, to be explored in future studies, and may call for different interventions. Such findings could be interpreted in light of the multiple deficit model of developmental disorders, according to which every developmental disorder involves multiple cognitive risk factors. Based on this notion, it may be the case that impairments in model-free RL are one of the key risk factors for DD and ADHD but that the MB learning deficit is related to the defining neuropsychological features of ADHD but not of DD.
The two-step task is one of the most common paradigms that has been suggested to differentiate between MF learning and MB processes and has been tested in a substantial number of typical and impaired populations. Nevertheless, caution is warranted in interpreting behavioral performance in this task, as several modifications to this paradigm could affect the relative contribution of each system to behavior. For example, it has been shown that MF RL can produce behavioral patterns in the two-step task that could be interpreted as MB RL . Furthermore, providing explicit instructions led participants to make primarily model-based choices with little model-free influence . However, in the current study, we found that ADHD and DD showed distinctive deviation from the behavior of control participants in the same task. This suggests that, to some extent, the two-step task used here can differentiate between learning processes and provide an informative insight into how such learning processes are impaired in different neurodevelopmental disorders. It will be important to direct future investigations to examining variants of the two-step task in ADHD/DD in order to more precisely understand the nature of MF/MB processes in these neurodevelopmental conditions.
To conclude, in the present study we compared different types of RL across DD and ADHD participants and their matched controls. Our results show a shared cognitive deficit in MF learning across participants with DD and ADHD relative to neurotypicals, alongside a deficit in MB learning that was selectively disrupted only in the ADHD group. These results suggest that distinct RL profiles can distinguish between language and attentional disorders.
Experiment 1: Participants with DD and neurotypical participants
Sixty-six university students (35 with DD, 15F and 31 controls, 18F) took part in the study. All participants were university students in Israel, from families with middle to high socioeconomic status. All participants were screened for being native Hebrew speakers, had no history of neurological disorders and/or psychiatric disorders, had normal or corrected-to-normal vision and normal hearing. The inclusion criteria for the DD group was (1) a formal diagnosis by a licensed clinician; (2) the absence of a formal diagnosis of attention deficit hyperactivity disorder (ADHD) or a specific language impairment; (3) a score below the clinical cutoff on the adult ADHD self-report scale (ASRS); (4) a score below a 1SD local norm cut-off for phonological decoding ; (5) a cognitive ability score within the normal range > 10th percentile Raven score . Based on these criteria, three participants with DD were excluded from the final sample. The control group was composed of individuals with no history of learning disabilities who exhibited no difficulties in reading (e.g., were above the reading cutoff (non-word reading) and was matched in age, gender, and nonverbal intelligence (assessed by the Raven test) to the DD group. The Institutional Review Board of the University of Haifa approved the study (no. 18/099), which was conducted in accordance with the Declaration of Helsinki, with written informed consent provided by all participants. Participants received a compensation of NIS 120 (approximately $37) for participating in the study.
Participants underwent a series of cognitive tests (Table 5) to evaluate basic cognitive ability, assessed by the Raven test  as well as tests of verbal short-term memory (Digit span test; Wechsler, 1997 ), rapid automatized naming skills (RAN tests;, phonological processing (phoneme segmentation, phoneme deletion, and Spoonerism), reading skills [83, 84], and attentional functions (ASRS; .
These tests were used to assert group differences in reading and phonological abilities. The results, shown in Table 6, indicate that the groups did not differ in age, cognitive abilities, or attentional skills, but compared to the control group the DD group displayed a profile of reading disability compatible with the symptomatology of developmental dyslexia. This group differed significantly from the control group on both rate and accuracy measures of word reading and decoding skills. The DD group demonstrated deficits also in the three key phonological domains: phonological awareness (Spoonerism, phoneme segmentation, phoneme deletion), verbal short-term memory (digit span), and rapid naming (rapid automatized naming).
Experimenet 2: Participants with ADHD and neurotypical participants
Sixty-five university students (35 with ADHD; 23F and 30 controls; 22F) took part in the study. All participants were university students in Israel, from families with middle to high socioeconomic status. All participants were screened for being native Hebrew speakers, had no history of neurological disorders and/or psychiatric disorders, had normal or corrected-to-normal vision and normal hearing. The inclusion criteria for the ADHD group included (1) a formal diagnosis of ADHD by an authorized clinician; (2) positive screening for ADHD based on the adult ADHD self-report scale (ASRS; , namely a score > = 51; (3) the lack of a formal diagnosis of a comorbid developmental disorder such as developmental dyslexia; (4) a cognitive ability score within the normal range > 10th percentile Raven score. The control group was composed of individuals with no history of learning disabilities who exhibited no difficulties in attentional skills (e.g., did not receive a positive score of ADHD based on the ASRS) and was matched in age, gender, and nonverbal intelligence (assessed by the Raven test) to the DD group. The Institutional Review Board of the University of Haifa approved the study (no. 18/099), which was conducted in accordance with the Declaration of Helsinki, with written informed consent provided by all participants. Participants received a compensation of NIS 120 (approximately $37) for participating in the study.
All participants underwent a series of cognitive tests to evaluate general intelligence, as measured by Raven’s SPM test, as well as attentional skills (ASRS) and reading skills. Details of the tests are presented in Table 5, and the results are shown in Table 7. The groups did not differ significantly in age, intelligence, or reading skills. As expected, the ADHD group differed significantly from the control group in the ADHD measures derived from the ASRS questionnaire.
The task was similar to the two-step task used in prior work. Each trial was divided into two stages, each of which required a decision (see Fig. 2). In the first stage, participants chose between two spaceships. They were told that these spaceships could fly to one of two planets. Each spaceship landed more often on a specific planet (common transition; 70% chance) but could also land on the alternative planet in a minority of trials (rare transition; 30% chance). In the second stage, participants chose between two aliens. Each alien delivered a reward with a probability that followed an independently drifting Gaussian random walk [standard deviation (SD) = 0.025] with a lower boundary of 0.25 and an upper boundary of 0.75, such that the probability of reward from any particular second-stage option changed very slowly from trial to trial. Because the transition from the first-stage choice to the second-stage planet was stochastic, first-stage choices made it possible to dissociate two learning strategies, model-free (MF) and model-based (MB).
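The trial structure described above can be sketched as a small Python simulation. This is an illustrative sketch only, not the software used in the study. The transition probability, random-walk SD, and probability bounds come from the text; the reflecting rule at the boundaries, the uniform starting probabilities, the number of trials, the random-choice agent, and all function names are assumptions introduced for illustration.

```python
import random

COMMON_PROB = 0.7          # probability of the common transition (from the text)
DRIFT_SD = 0.025           # SD of the Gaussian random walk (from the text)
LOWER, UPPER = 0.25, 0.75  # bounds on reward probability (from the text)


def reflect(p, lower=LOWER, upper=UPPER):
    """Reflect a value back into [lower, upper] (assumed boundary rule)."""
    while p < lower or p > upper:
        if p < lower:
            p = 2 * lower - p
        else:
            p = 2 * upper - p
    return p


def drift(reward_probs, rng):
    """Advance every alien's reward probability by one random-walk step."""
    return [[reflect(p + rng.gauss(0.0, DRIFT_SD)) for p in planet]
            for planet in reward_probs]


def run_trial(spaceship, choose_alien, reward_probs, rng):
    """Simulate one two-stage trial.

    spaceship: 0 or 1 (first-stage choice); spaceship i commonly reaches planet i.
    choose_alien: function mapping a planet index to an alien index (0 or 1).
    Returns (planet, alien, rewarded, transition_type).
    """
    common = rng.random() < COMMON_PROB
    planet = spaceship if common else 1 - spaceship
    alien = choose_alien(planet)
    rewarded = rng.random() < reward_probs[planet][alien]
    return planet, alien, rewarded, "common" if common else "rare"


def simulate(n_trials=200, seed=0):
    """Run a random-choice agent through the task (hypothetical trial count)."""
    rng = random.Random(seed)
    # Starting probabilities drawn uniformly within the bounds (assumption).
    reward_probs = [[rng.uniform(LOWER, UPPER) for _ in range(2)]
                    for _ in range(2)]
    log = []
    for _ in range(n_trials):
        spaceship = rng.randrange(2)  # random agent, for illustration only
        trial = run_trial(spaceship, lambda planet: rng.randrange(2),
                          reward_probs, rng)
        log.append((spaceship, *trial))
        reward_probs = drift(reward_probs, rng)
    return log, reward_probs
```

Each logged row records the first-stage choice, the planet reached, the alien chosen, whether a reward was delivered, and whether the transition was common or rare. These are the quantities needed to ask whether a first-stage choice is repeated after a rewarded common versus a rewarded rare transition.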
The experiment consisted of two sessions. Participants completed a background questionnaire at home and were then invited to the first session, in which they completed the cognitive test battery. In the second session, participants completed the two-step task. Sessions were conducted in a sound-attenuated booth in front of a 14-inch laptop monitor.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Adi-Japha E, Fox O, Karni A. Atypical acquisition and atypical expression of memory consolidation gains in a motor skill in young female adults with ADHD. Res Dev Disabil. 2011;32(3):1011–20.
Akam T, Costa R, Dayan P. Simple plans or sophisticated habits? State, transition and learning interactions in the two-step task. PLoS Comput Biol. 2015;11(12): e1004648.
Ashby FG, Paul EJ, Maddox WT. 4 COVIS. In: Formal approaches in categorization. Pothos EM, Wills AJ (eds), Cambridge University Press, 65–88. 2011.
Barkley RA. Behavioral inhibition, sustained attention, and executive functions: constructing a unifying theory of ADHD. Psychol Bull. 1997;121(1):65.
Barkley RA, Edwards G, Laneri M, Fletcher K, Metevia L. Executive functioning, temporal discounting, and sense of time in adolescents with attention deficit hyperactivity disorder (ADHD) and oppositional defiant disorder (ODD). J Abnorm Child Psychol. 2001;29(6):541–56.
Barnes KA, Howard JH Jr, Howard DV, Kenealy L, Vaidya CJ. Two forms of implicit learning in childhood ADHD. Dev Neuropsychol. 2010;35(5):494–505.
Baroni A, Castellanos FX. Neuroanatomic and cognitive abnormalities in attention-deficit/hyperactivity disorder in the era of ‘high definition’ neuroimaging. Curr Opin Neurobiol. 2015;30:1–8.
Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823. 2014.
Breznitz Z, Misra M. Speed of processing of the visual–orthographic and auditory–phonological systems in adult dyslexics: the contribution of “asynchrony” to word recognition deficits. Brain Lang. 2003;85(3):486–502.
Brunswick N, McCrory E, Price CJ, Frith CD, Frith U. Explicit and implicit processing of words and pseudowords by adult developmental dyslexics: a search for Wernicke's Wortschatz? Brain. 1999;122(10):1901–17.
Cubillo A, Halari R, Smith A, Taylor E, Rubia K. A review of fronto-striatal and fronto-cortical brain abnormalities in children and adults with attention deficit hyperactivity disorder (ADHD) and new evidence for dysfunction in adults with ADHD during motivation and attention. Cortex. 2012;48(2):194–215.
Culbreth AJ, Westbrook A, Daw ND, Botvinick M, Barch DM. Reduced model-based decision-making in schizophrenia. J Abnorm Psychol. 2016;125(6):777.
Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ. Model-based influences on humans’ choices and striatal prediction errors. Neuron. 2011;69(6):1204–15.
De Meyer H, Beckers T, Tripp G, Van der Oord S. Reinforcement contingency learning in children with ADHD: back to the basics of behavior therapy. J Abnorm Child Psychol. 2019;47(12):1889–902.
Decker JH, Otto AR, Daw ND, Hartley CA. From creatures of habit to goal-directed learners: tracking the developmental emergence of model-based reinforcement learning. Psychol Sci. 2016;27(6):848–58.
Demurie E, Roeyers H, Baeyens D, Sonuga-Barke E. Temporal discounting of monetary rewards in children and adolescents with ADHD and autism spectrum disorders. Dev Sci. 2012;15(6):791–800.
Deserno L, Huys QJ, Boehme R, Buchert R, Heinze H-J, Grace AA, Schlagenhauf F. Ventral striatal dopamine reflects behavioral and neural signatures of model-based control during sequential decision making. Proc Natl Acad Sci. 2015;112(5):1595–600.
Dezfouli A, Balleine BW. Actions, action sequences and habits: evidence that goal-directed and habitual action control are hierarchically organized. PLoS Comput Biol. 2013;9(12): e1003364.
Doll BB, Shohamy D, Daw ND. Multiple memory systems as substrates for multiple decision systems. Neurobiol Learn Mem. 2015;117:4–13.
Drummond N, Niv Y. Model-based decision making and model-free learning. Curr Biol. 2020;30(15):R860–5.
Earle FS, Del Tufo SN, Evans TM, Lum JA, Cutting LE, Ullman MT. Domain-general learning and memory substrates of reading acquisition. Mind Brain Educ. 2020;14(2):176–86.
Farmer ME, Klein RM. The evidence for a temporal processing deficit linked to dyslexia: a review. Psychon Bull Rev. 1995;2(4):460–93.
Faul F, Erdfelder E, Buchner A, Lang A-G. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav Res Methods. 2009;41(4):1149–60.
Fawcett AJ, Nicolson RI. Development of dyslexia: the delayed neural commitment framework. Front Behav Neurosci. 2019;13:112.
Feher da Silva C, Hare TA. Humans primarily use model-based inference in the two-stage task. Nat Hum Behav. 2020;4(10):1053–66.
Feng G, Yi HG, Chandrasekaran B. The role of the human auditory corticostriatal network in speech learning. Cereb Cortex. 2019;29(10):4077–89.
Fernandez-Ruiz J, Hakvoort Schwerdtfeger RM, Alahyane N, Brien DC, Coe BC, Munoz DP. Dorsolateral prefrontal cortex hyperactivity during inhibitory control in children with ADHD in the antisaccade task. Brain Imaging Behav. 2020;14(6):2450–63.
Findling C, Skvortsova V, Dromnelle R, Palminteri S, Wyart V. Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nat Neurosci. 2019;22(12):2066–77.
Foerde K. What are habits and do they depend on the striatum? A view from the study of neuropsychological populations. Curr Opin Behav Sci. 2018;20:17–24.
Foerde K, Braun EK, Shohamy D. A trade-off between feedback-based learning and episodic memory for feedback events: evidence from Parkinson’s disease. Neurodegener Dis. 2013;11(2):93–101.
Foerde K, Daw ND, Rufin T, Walsh BT, Shohamy D, Steinglass JE. Deficient goal-directed control in a population characterized by extreme goal pursuit. J Cogn Neurosci. 2021;33(3):463–81.
Foerde K, Knowlton BJ, Poldrack RA. Modulation of competing memory systems by distraction. Proc Natl Acad Sci. 2006;103(31):11778–83.
Foerde K, Shohamy D. Feedback timing modulates brain systems for learning in humans. J Neurosci. 2011;31(37):13157–67.
Fox O, Karni A, Adi-Japha E. The consolidation of a motor skill in young adults with ADHD: shorter practice can be better. Res Dev Disabil. 2016;51:135–44.
Frank MJ, Santamaria A, O’Reilly RC, Willcutt E. Testing computational models of dopamine and noradrenaline dysfunction in attention deficit/hyperactivity disorder. Neuropsychopharmacology. 2007;32(7):1583–99.
Furukawa E, Bado P, da Costa RQM, Melo B, Erthal P, de Oliveira IP, Mattos P. Reward modality modulates striatal responses to reward anticipation in ADHD: effects of affiliative and food stimuli. Psychiatry Res: Neuroimaging. 2022. https://doi.org/10.1016/j.pscychresns.2022.111561.
Furukawa E, Bado P, Tripp G, Mattos P, Wickens JR, Bramati IE, Tovar-Moll F. Abnormal striatal BOLD responses to reward anticipation and reward delivery in ADHD. PloS ONE. 2014;9(2):e89129.
Gabay Y. Delaying feedback compensates for impaired reinforcement learning in developmental dyslexia. Neurobiol Learn Mem. 2021;185: 107518.
Gabay Y, Goldfarb L. Feedback-based probabilistic category learning is selectively impaired in attention/hyperactivity deficit disorder. Neurobiol Learn Mem. 2017;142:200–8.
Gabay Y, Holt LL. Incidental learning of sound categories is impaired in developmental dyslexia. Cortex. 2015;73:131–43.
Gabay Y, Shahbari-Khateb E, Mendelsohn A. Feedback timing modulates probabilistic learning in adults with ADHD. Sci Rep. 2018;8(1):15524.
Gabay Y, Vakil E, Schiff R, Holt LL. Probabilistic category learning in developmental dyslexia: evidence from feedback and paired-associate weather prediction tasks. Neuropsychology. 2015;29(6):844.
Gillan CM, Otto AR, Phelps EA, Daw ND. Model-based learning protects against forming habits. Cogn Affect Behav Neurosci. 2015;15(3):523–36.
Gläscher J, Daw N, Dayan P, O’Doherty JP. States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron. 2010;66(4):585–95.
Goodman J, Marsh R, Peterson BS, Packard MG. Annual research review: the neurobehavioral development of multiple memory systems–implications for childhood and adolescent psychiatric disorders. J Child Psychol Psychiatry. 2014;55(6):582–610.
Goswami U. A temporal sampling framework for developmental dyslexia. Trends Cogn Sci. 2011;15(1):3–10.
Groen Y, Wijers AA, Mulder LJ, Waggeveld B, Minderaa RB, Althaus M. Error and feedback processing in children with ADHD and children with autistic spectrum disorder: an EEG event-related potential study. Clin Neurophysiol. 2008;119(11):2476–93.
Hauser TU, Iannaccone R, Ball J, Mathys C, Brandeis D, Walitza S, Brem S. Role of the medial prefrontal cortex in impaired decision making in juvenile attention-deficit/hyperactivity disorder. JAMA Psychiat. 2014;71(10):1165–73.
Huang-Pollock CL, Maddox WT, Tam H. Rule-based and information-integration perceptual category learning in children with attention-deficit/hyperactivity disorder. Neuropsychology. 2014;28(4):594.
Humphries MD, Khamassi M, Gurney K. Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia. Front Neurosci. 2012. https://doi.org/10.3389/fnins.2012.00009.
Kita Y, Yamamoto H, Oba K, Terasawa Y, Moriguchi Y, Uchiyama H, Inagaki M. Altered brain activity for phonological manipulation in dyslexic Japanese children. Brain. 2013;136(12):3696–708.
Konfortes H. Diagnosing ADHD in Israeli adults: the psychometric properties of the adult ADHD self report scale (ASRS) in Hebrew. Isr J Psychiatry. 2010;47(4):308.
Krishnan S, Watkins KE, Bishop DV. Neurobiological basis of language learning difficulties. Trends Cogn Sci. 2016;20(9):701–14.
Laasonen M, Väre J, Oksanen-Hennah H, Leppämäki S, Tani P, Harno H, Cleeremans A. Project DyAdd: Implicit learning in adult dyslexia and ADHD. Ann Dyslexia. 2014;64(1):1–33.
Lim S-J, Fiez JA, Holt LL. Role of the striatum in incidental learning of sound categories. Proc Natl Acad Sci. 2019;116(10):4671–80.
Lonergan A, Doyle C, Cassidy C, MacSweeney Mahon S, Roche RA, Boran L, Bramham J. A meta-analysis of executive functioning in dyslexia with consideration of the impact of comorbid ADHD. J Cogn Psychol. 2019;31(7):725–49.
Lum JA, Ullman MT, Conti-Ramsden G. Procedural learning is impaired in dyslexia: evidence from a meta-analysis of serial reaction time studies. Res Dev Disabil. 2013;34(10):3460–76.
Luman M, Goos V, Oosterlaan J. Instrumental learning in ADHD in a context of reward: intact learning curves and performance improvement with methylphenidate. J Abnorm Child Psychol. 2015;43(4):681–91.
Luman M, Oosterlaan J, Sergeant JA. The impact of reinforcement contingencies on AD/HD: a review and theoretical appraisal. Clin Psychol Rev. 2005;25(2):183–213.
Luman M, Tripp G, Scheres A. Identifying the neurobiology of altered reinforcement sensitivity in ADHD: a review and research agenda. Neurosci Biobehav Rev. 2010;34(5):744–54.
Luman M, Van Meel CS, Oosterlaan J, Sergeant JA, Geurts HM. Does reward frequency or magnitude drive reinforcement-learning in attention-deficit/hyperactivity disorder? Psychiatry Res. 2009;168(3):222–9.
Mandali A, Sethi A, Cercignani M, Harrison NA, Voon V. Shifting uncertainty intolerance: methylphenidate and attention-deficit hyperactivity disorder. Transl Psychiatry. 2021;11(1):1–9.
Massarwe AO, Nissan N, Gabay Y. Atypical reinforcement learning in developmental dyslexia. J Int Neuropsychol Soc. 2021. https://doi.org/10.1017/S1355617721000266.
McGrath LM, Stoodley CJ. Are there shared neural correlates between dyslexia and ADHD? A meta-analysis of voxel-based morphometry studies. J Neurodev Disord. 2019;11(1):1–20.
Nicolson RI, Fawcett AJ. Dyslexia, dysgraphia, procedural learning and the cerebellum. Cortex: A Journal Devoted to the Study of the Nervous System and Behavior. 2011. https://doi.org/10.1016/j.cortex.2009.08.016.
Otto AR, Gershman SJ, Markman AB, Daw ND. The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol Sci. 2013;24(5):751–61.
Otto AR, Raio CM, Chiang A, Phelps EA, Daw ND. Working-memory capacity protects model-based learning from stress. Proc Natl Acad Sci. 2013;110(52):20941–6.
Otto AR, Skatova A, Madlon-Kay S, Daw ND. Cognitive control predicts use of model-based reinforcement learning. J Cogn Neurosci. 2014;27(2):319–33.
Paloyelis Y, Mehta MA, Faraone SV, Asherson P, Kuntsi J. Striatal sensitivity during reward processing in attention-deficit/hyperactivity disorder. J Am Acad Child Adolesc Psychiatry. 2012;51(7):722–32.
Pennington BF. From single to multiple deficit models of developmental disorders. Cognition. 2006;101(2):385–413.
Pennington BF, Bishop DV. Relations among speech, language, and reading disorders. Annu Rev Psychol. 2009;60:283–306.
Pereira CLW, Zhou R, Pitt MA, Myung JI, Rossi PJ, Caverzasi E, Meyer M. Probabilistic decision-making in children with dyslexia. Front Neurosci. 2022. https://doi.org/10.3389/fnins.2022.782306.
Plichta MM, Scheres A. Ventral–striatal responsiveness during reward anticipation in ADHD and its relation to trait impulsivity in the healthy population: a meta-analytic review of the fMRI literature. Neurosci Biobehav Rev. 2014;38:125–34.
Portengen CM, Sprooten E, Zwiers MP, Hoekstra PJ, Dietrich A, Holz NE, Saam MC. Reward and punishment sensitivity are associated with cross-disorder traits. Psychiatry Res. 2021;298:113795.
Raven JC, Court JH. Raven’s progressive matrices and vocabulary scales. New Delhi: Oxford Psychologists Press; 1998.
Richlan F, Kronbichler M, Wimmer H. Meta-analyzing brain dysfunctions in dyslexic children and adults. Neuroimage. 2011;56(3):1735–42.
Sagvolden T, Aase H, Zeiner P, Berger D. Altered reinforcement mechanisms in attention-deficit/hyperactivity disorder. Behav Brain Res. 1998;94(1):61–71.
Sagvolden T, Johansen EB, Aase H, Russell VA. A dynamic developmental theory of attention-deficit/hyperactivity disorder (ADHD) predominantly hyperactive/impulsive and combined subtypes. Behav Brain Sci. 2005;28(3):397–418.
Scheres A, Dijkstra M, Ainslie E, Balkan J, Reynolds B, Sonuga-Barke E, Castellanos FX. Temporal and probabilistic discounting of rewards in children and adolescents: effects of age and ADHD symptoms. Neuropsychologia. 2006;44(11):2092–103.
Scheres A, Milham MP, Knutson B, Castellanos FX. Ventral striatal hyporesponsiveness during reward anticipation in attention-deficit/hyperactivity disorder. Biol Psychiat. 2007;61(5):720–4.
Shahar N, Hauser TU, Moutoussis M, Moran R, Keramati M, Consortium N, Dolan RJ. Improving the reliability of model-based decision-making estimates in the two-stage decision task with reaction-times and drift-diffusion modeling. PLoS Comput Biol. 2019;15(2):e1006803.
Sharp ME, Foerde K, Daw ND, Shohamy D. Dopamine selectively remediates ‘model-based’ reward learning: a computational approach. Brain. 2016;139(2):355–64.
Shatil E. One-minute test for pseudowords. Unpublished test. Haifa: University of Haifa; 1995.
Shatil E. One-minute test for words-unpublished test. Haifa: University of Haifa; 1997.
Shiels K, Hawk LW Jr, Reynolds B, Mazzullo RJ, Rhodes JD, Pelham WE Jr, Gangloff BP. Effects of methylphenidate on discounting of delayed rewards in attention deficit/hyperactivity disorder. Exp Clin Psychopharmacol. 2009;17(5):291.
Smittenaar P, FitzGerald TH, Romei V, Wright ND, Dolan RJ. Disruption of dorsolateral prefrontal cortex decreases model-based in favor of model-free control in humans. Neuron. 2013;80(4):914–9.
Snowling MJ. From language to reading and dyslexia. Dyslexia. 2001;7(1):37–46.
Sperling AJ, Lu Z-L, Manis FR. Slower implicit categorical learning in adult poor readers. Ann Dyslexia. 2004;54(2):281–303.
Ströhle A, Stoy M, Wrase J, Schwarzer S, Schlagenhauf F, Huss M, Gregor A. Reward anticipation and outcomes in adult males with attention-deficit/hyperactivity disorder. NeuroImage. 2008;39(3):966–72.
Sutton RS, Barto AG. Introduction to reinforcement learning. 1998.
Sutton R, Barto A. Reinforcement learning: an introduction. MIT Press; 1998.
Taylor H, Vestergaard MD. Developmental dyslexia: disorder or specialization in exploration? Front Psychol. 2022;13:889245.
R Core Team. R: the R project for statistical computing. 2019. Accessed 28 Feb 2020.
Toplak ME, Dockstader C, Tannock R. Temporal information processing in ADHD: findings to date and new methods. J Neurosci Methods. 2006;151(1):15–29.
Tripp G, Alsop B. Sensitivity to reward frequency in boys with attention deficit hyperactivity disorder. J Clin Child Psychol. 1999;28(3):366–75.
Tripp G, Wickens JR. Research review: dopamine transfer deficit: a neurobiological theory of altered reinforcement mechanisms in ADHD. J Child Psychol Psychiatry. 2008;49(7):691–704.
Ullman MT. Contributions of memory circuits to language: the declarative/procedural model. Cognition. 2004;92(1–2):231–70.
Ullman MT, Earle FS, Walenski M, Janacsek K. The neurocognition of developmental disorders of language. Annu Rev Psychol. 2020;71:389–417.
Ullman MT, Pullman MY. A compensatory role for declarative memory in neurodevelopmental disorders. Neurosci Biobehav Rev. 2015;51:205–22.
Vanseijen H, Sutton R. A deeper look at planning as learning from replay. In: International conference on machine learning. 2015.
Vikbladh OM, Meager MR, King J, Blackmon K, Devinsky O, Shohamy D, Daw ND. Hippocampal contributions to model-based planning and spatial memory. Neuron. 2019;102(3):683–93.
Voon V, Derbyshire K, Rück C, Irvine MA, Worbe Y, Enander J, Sahakian BJ. Disorders of compulsivity: a common bias towards learning habits. Mol Psychiatry. 2015;20(3):345–52.
Wang Z, Yan X, Liu Y, Spray GJ, Deng Y, Cao F. Structural and functional abnormality of the putamen in children with developmental dyslexia. Neuropsychologia. 2019;130:26–37.
Wechsler D. Wechsler adult intelligence scale (WAIS-3). San Antonio, TX: Harcourt Assessment; 1997.
Willcutt EG, Pennington BF. Comorbidity of reading disability and attention-deficit/hyperactivity disorder: differences by gender and subtype. J Learn Disabil. 2000;33(2):179–91.
Worbe Y, Palminteri S, Savulich G, Daw N, Fernandez-Egea E, Robbins T, Voon V. Valence-dependent influence of serotonin depletion on model-based choice strategy. Mol Psychiatry. 2016;21(5):624–9.
Wunderlich K, Smittenaar P, Dolan RJ. Dopamine enhances model-based over model-free choice behavior. Neuron. 2012;75(3):418–24.
Yael W, Tami K, Tali B. The effects of orthographic transparency and familiarity on reading Hebrew words in adults with and without dyslexia. Ann Dyslexia. 2015;65(2):84–102.
Zeithamova D, Maddox WT. Dual-task interference in perceptual category learning. Mem Cognit. 2006;34(2):387–98.
Ballan R, Durrant SJ, Stickgold R, Morgan A, Manoach DS, Gabay Y. A failure of sleep-dependent consolidation of visuoperceptual procedural learning in young adults with ADHD. Translational Psychiatry. 2022;12(1):499.
Mas-Herrero E, Sescousse G, Cools R, Marco-Pallares J. The contribution of striatal pseudo-reward prediction errors to value-based decision-making. NeuroImage. 2019;193:67–74.
Gabay Y, Roark CL, Holt LL. Impaired and Spared Auditory Category Learning in Developmental Dyslexia. Psychological Science. 2022:09567976231151581.
Ms. Noyli Nissan conducted the study under the supervision of Dr. Yafit Gabay.
This research was supported by grants from the Israel Science Foundation (grant No. 734/22) and the National Institute of Psychobiology in Israel awarded to Yafit Gabay and by Joy Ventures (2020 cycle) awarded to Yafit Gabay and Uri Hertz.
Ethics approval and consent to participate
The Institutional Review Board of the University of Haifa approved the study, which was conducted in accordance with the Declaration of Helsinki, with written informed consent provided by all participants.
Consent for publication
All authors consent for publication of the manuscript in its present form.
All authors declare that they have no competing interests.
Cite this article
Nissan, N., Hertz, U., Shahar, N. et al. Distinct reinforcement learning profiles distinguish between language and attentional neurodevelopmental disorders. Behav Brain Funct 19, 6 (2023). https://doi.org/10.1186/s12993-023-00207-w