1. Introduction

Research into language learning aptitude has experienced a revival over the past few years, largely due to impactful new developments in both theory and measurement. In terms of theory, researchers have proposed a distinction between aptitude for explicit and aptitude for implicit learning (, ; ; ; ) which is currently undergoing empirical examination and validation. Although most measures of aptitude capture primarily explicit aptitude in the sense of associative memory, phonetic coding and language-analytic ability, it is argued that tests should ideally also include measures of implicit aptitude, conceptualised as sensitivity to the recency, frequency and conditional probability of input stimuli. With regard to measurement, the LLAMA aptitude test battery (; ), which has become one of the most widely used aptitude measures in the field of second language (L2) learning research, has been undergoing continued improvement, resulting in a new version being made available to the L2 research community in late 2019 and further updates being published throughout 2020 and 2021.

Empirical studies of aptitude have relied almost exclusively on data from young adults and/or adolescents (, ), with older adults still very much under-represented. This sits uneasily with rising interest in older adults’ language learning (, ; ; ; ), as researchers increasingly acknowledge the importance of understanding older learners’ abilities and needs in the face of an ageing population and the potential benefits of cognitive activity, including language learning, for healthy ageing. The study reported in this paper constitutes a first attempt at assessing the suitability of existing aptitude measures for use with older adults aged 60+. Furthermore, it aimed to establish the relationship between components of aptitude for explicit and implicit learning in such a population. In addition, we examined to what extent participants’ occupational status, chronological age, level of multilingualism, emotional state, self-concept and leisure activities were associated with their performance on the aptitude measures. In the subsequent sections, we provide an overview of the theoretical and empirical background to the study, followed by a detailed description of the methodology, results and discussion of the key findings.

2. Background

2.1 Language learning aptitude

Language learning aptitude refers to a set of cognitive and perceptual abilities that facilitates the language learning process, allowing individuals with high levels of aptitude to learn additional languages quickly, successfully and with ease. Classic conceptualisations of aptitude are based on J. B. Carroll’s (; ) four-component model comprising phonetic coding ability (the ability to code unfamiliar sounds for further processing and retention), grammatical sensitivity (the ability to identify the function of words in sentences), inductive language learning ability (the ability to infer linguistic patterns from exemplars) and associative memory (the ability to create and retain form-meaning associations). This model of aptitude was derived empirically from large-scale factor-analytic studies which resulted in the development of the Modern Language Aptitude Test (MLAT; , ). Subsequently, grammatical sensitivity and inductive language learning ability have been subsumed under the label of language-analytic ability in the sense of inferring linguistic patterns and making generalisations (, ; for recent overviews, see, e.g., ; ).

The LLAMA aptitude test battery (; ) is considered a current instantiation of the classic model of aptitude. Unlike the MLAT, the LLAMA is suitable for speakers of a variety of L1s and is freely available to the research community. The computer-administered LLAMA comprises four subtests labelled LLAMA B, D, E and F which, in conjunction, measure the aptitude components of associative memory, phonetic coding ability and language-analytic ability. Efforts to ascertain the validity and reliability of the test have led to both promising findings (; ; ) and justified critique (; ), resulting in the most recently published improved version.

A complementary strand of recent research has focused on the hypothesised distinction between aptitude for explicit learning and aptitude for implicit learning (, ; ; ). Aptitude for implicit learning has been defined as “a cluster of cognitive abilities that […] enables learners to conduct unconscious computation of the distributional and transitional probabilities of linguistic input” (). Thus, high levels of this kind of aptitude would facilitate implicit language learning in particular, that is, learning through a non-conscious process of induction that takes place without online awareness and results in intuitive knowledge (; ; ). Implicit learning can be contrasted with explicit learning, which is characterised by online awareness of the learning target and hypothesis testing, e.g., when looking for regularities in the linguistic input ().

Implicit aptitude is conceptualised as a componential construct that is distinct from cognitive abilities in the explicit domain, that is, memory, language-analytic ability and (aspects of) phonetic coding ability. Like explicit aptitude, implicit aptitude is hypothesised to comprise both domain-general cognitive abilities (e.g., processing non-linguistic input such as sequences of symbols, shapes or images) and domain-specific linguistic abilities (processing language input such as syllables, words or sentences).

It has been proposed that implicit aptitude in the sense of sensitivity to input frequency and conditional probability can be measured by means of tasks that require the learning of regularities governing the sequencing of stimuli. Suitable candidates would be artificial grammar learning tasks (in the linguistic domain) and the probabilistic serial reaction time (SRT) task, which relies on non-linguistic stimuli (). Furthermore, the LLAMA D, which involves the recognition of syllables, may also tap into implicit aptitude because it is essentially semantically vacuous. Initial evidence supporting this distinction has been found ().

Over the past decades, a large number of empirical studies has investigated the role of aptitude in L2 learning. Two meta-analyses of this research (, ) provide a concise overview of the cumulative results. The first meta-analysis () focused on the relationship between aptitude and morphosyntactic attainment. A total of 33 primary studies completed between 1959 and 2013 were included, contributing 309 effect sizes from 3,106 learners. The results showed that aptitude explained about 10% of the variance in morphosyntactic learning. The second meta-analysis () focused on the relationship between aptitude and other individual difference variables, as well as on the relationship between aptitude and L2 learning in terms of both general proficiency and specific skills. The data set consisted of 66 primary studies reporting on results from 109 unique samples comprising a population of 13,035 participants. The results revealed a moderate correlation between aptitude and executive working memory, and a weak correlation between aptitude and phonological short-term memory. The correlation coefficient of .49 identified for the criterion variable of general L2 proficiency indicated that aptitude was a strong predictor, accounting for about 25% of the variance. For general L2 proficiency, phonetic coding ability was a stronger predictor than both language-analytic ability and associative memory, with the latter the weakest predictor. With regard to specific L2 knowledge domains and skills, mostly moderate correlations with aptitude were found. Unsurprisingly, language-analytic ability was the strongest predictor for grammar knowledge ().
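The step from a correlation coefficient to a share of variance, as in the figures quoted above, is simply the square of the coefficient. A minimal sketch follows; the function name is illustrative and not taken from either meta-analysis.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables that correlate at r."""
    return r ** 2


# Coefficient reported for general L2 proficiency in the second meta-analysis.
r_general_proficiency = 0.49
share = variance_explained(r_general_proficiency)
print(f"r = {r_general_proficiency:.2f}, r squared = {share:.4f}")
```

Squaring .49 gives a value just under a quarter, which is the "about 25%" referred to in the text.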

In sum, cumulative findings offer strong evidence for a significant role of aptitude in L2 learning across a range of ages and learning contexts, with measures of aptitude predicting L2 attainment in terms of general proficiency, specific language skills and knowledge of morphosyntax. Despite the large number of studies, the age range captured is relatively narrow: participants were typically young adults or adolescents, rather than more mature adults. This predominant focus on younger participants is in marked contrast with the recent increased interest in language learning by older adults aged 60+ (e.g., , ; ; ; ). Such work on third-age language learning highlights the importance of understanding older L2 learners’ abilities and needs, especially in view of ageing populations in many countries across the world.

2.2 Language learning in older adults

It is well known that as we get older, some of our cognitive functions decline, including working memory capacity, executive control, inductive reasoning and overall processing speed (; ; ). In general, online capacities relating to the speed of processing as well as sensory acuity, especially in the auditory domain, are negatively affected by advancing age, whereas crystallised abilities, i.e., abilities reflecting verbal and scholastic knowledge, general information and problems for which individuals have prior concepts and specific solution strategies, tend to remain stable (; ).

Moreover, as we go through life, discourse and strategic competences continue to be built up (), offering facilitative potential. The more communicative practice we accrue and the more situations we experience, the better we can develop our discourse abilities and the better we can apply strategies of language use (and language learning) in a task-appropriate way. It is therefore unsurprising that studies investigating older adults’ L2 learning have demonstrated that successful additional language learning is certainly possible late in life (; ; ).

Given the stability of crystallised abilities as well as the relative strategic expertise of older learners, one might hypothesise that explicit knowledge and learning will be particularly important for this age group. Three studies including adults aged 60+ investigated participants’ learning of Latin subject and object case marking through computer-administered instruction (; ; ). The researchers were interested in whether learners would benefit from explicit information about the target structure in addition to input-based practice. Overall, the findings showed that participants improved between pre-test and post-test, i.e., they learned the target structure successfully. However, explicit information had no additional facilitative effect when it was provided prior to practice (; ), and it actually had a detrimental effect when it was provided as part of corrective feedback during practice ().

While these results appear to suggest that explicit knowledge and learning may not convey particular benefits for older adults, it is too early to generalise beyond the specific research designs at hand. All three studies were laboratory-based, had very short treatments and thus offer limited ecological validity. Furthermore, experimental variables such as the length of time available for reading metalinguistic comments may have impacted on the usefulness of explicit information for the older participants in at least one of the studies. Specifically, a brief presentation of metalinguistic explanations may have worked against the participants, not because of the nature of the material provided, but because of the speed of processing required. In other words, it is possible that explicit information would have proved beneficial if it had been accessible to participants for longer (for a detailed critique, see ).

Beyond acknowledging the differential contributions of explicit and implicit knowledge and processes in third-age L2 learning, it is worth noting that individual differences can be expected to play a role (). Indeed, as older adults have more extensive life experience than younger adults, and as they additionally continue to develop and diversify in terms of their experiences (), inter-individual variation can be expected in third-age participants – a claim that has recently received powerful empirical support ().

2.3 The present study

Taking into account the above points as well as the fact that aptitude research has worked almost exclusively with younger participants and that no study to date has examined both explicit and implicit aptitude or the relationship between these constructs and other individual difference variables in older adults aged 60+, we posed the following research questions:

  1. How do older adult participants perform on measures of aptitude for explicit and implicit learning?
  2. What is the relationship between the hypothesised components of aptitude for explicit learning and aptitude for implicit learning?
  3. How do occupational status, age, level of multilingualism, emotional state, self-concept and leisure activities relate to performance on the aptitude measures?

3. Methodology

3.1 Research design, participants and procedure

In order to address the research questions, we conducted a correlational study with 64 healthy older adults aged between 61 and 79. In terms of gender, 41 participants identified as female, 22 as male and 1 as non-binary. Educational level ranged from completion of secondary school to PhD level, with a bachelor’s degree the norm. Participants were also asked about their occupation, whether current (N = 12) or pre-retirement (N = 52). A range of professional fields was represented, though all fields were indicative of higher socioeconomic status, in line with the relatively high levels of education in the sample: education/social work (N = 17), administration (N = 15), accountancy/finance/IT (N = 12), publishing/journalism/research (N = 8), engineering (N = 5), health (N = 4) and business management (N = 3). The vast majority of the participants were L1 speakers of English (N = 60); other L1s were French (N = 2), German (N = 1) and bilingual English/Polish (N = 1).

Participants were asked to report all languages they had ever learned or tried to learn at any point in their lives and to self-assess their level of proficiency in each language on a 4-point scale: (1) ‘I know a few words and phrases, that’s all’, (2) ‘I can have a simple conversation and/or understand basic information and/or read and write short and simple texts’, (3) ‘I am a fairly confident user of the language, i.e., I can speak, listen, read and/or write it quite well’, (4) ‘I am a proficient user of the language’. This resulted in between 0 and 6 other languages beyond the L1 being reported (mean = 3.3), with 29 different languages represented across the sample.

We calculated a cumulative level of languages and a mean level based on self-assessed proficiency. Unsurprisingly, all language background variables were intercorrelated: number of languages and cumulative level of languages are overlapping concepts and correlated very strongly (rho = 0.89, p < 0.001); number of languages and mean level of languages share some variance but are not overlapping to the same extent, and their correlation was moderate (rho = 0.35, p = 0.005). For the analyses reported below, we calculated a multilingualism score by adding up the z-scores for number of languages and mean level of languages. This provided us with a combined measure of quantity (languages learned/encountered) and quality (reported proficiency level) of prior language learning.
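The composite described above can be sketched as follows. This is an illustration only: the data are invented, the function names are hypothetical, and the use of the sample standard deviation (n - 1) for standardisation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values using the sample mean and sample SD (assumed n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def multilingualism_scores(n_languages, mean_levels):
    """Sum of the z-scores for number of languages and mean self-assessed level."""
    z_number = z_scores(n_languages)
    z_level = z_scores(mean_levels)
    return [a + b for a, b in zip(z_number, z_level)]


# Hypothetical participants: languages reported and mean level on the 4-point scale.
n_languages = [1, 2, 3, 4, 6]
mean_levels = [1.0, 1.5, 2.0, 2.5, 3.0]
scores = multilingualism_scores(n_languages, mean_levels)
for n, level, score in zip(n_languages, mean_levels, scores):
    print(f"{n} languages, mean level {level:.1f}: score {score:+.2f}")
```

Because both components are standardised before being added, quantity and quality of prior language learning contribute on the same scale, and the composite is centred on zero for the sample.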

For the purpose of the study, the participants completed the LLAMA test battery, a probabilistic SRT task, and a background questionnaire on biographical factors, leisure activities, self-concept and emotional state, as described below. Data collection took place online via participants’ own laptop or desktop computers. After providing their informed written consent, participants first completed the SRT, followed by the LLAMA and finally the background questionnaire.

3.2 Instruments

3.2.1 LLAMA

A slightly modified version of the LLAMA that was available prior to v.3 was programmed into PsychoPy and made available to participants via the Pavlovia platform. We reprogrammed the test in order to access item data for reliability analyses and to correct a faulty item in LLAMA F. We also adjusted the scoring, so all subtests would be scored out of 100, including LLAMA D. Finally, we used five spare sound files for LLAMA D, thus increasing the number of test items to 35 to improve reliability of that subtest. The subtests were presented to participants in letter order, i.e., B, D, E and F. LLAMA B, E and F consisted of a timed study phase followed by an untimed test phase; LLAMA D consisted of an exposure phase with fixed timings followed by an untimed test phase. LLAMA D, E and F had a two-way multiple-choice response format; in order to correct for guessing, incorrect responses were penalised. Due to a technical error, LLAMA E did not work as intended, and we subsequently discarded the data. We therefore do not discuss this subtest any further.
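The text does not specify how large the penalty for incorrect responses was. The sketch below therefore assumes the conventional correction for two-choice items (one point deducted per wrong answer), rescaling to 100 and a floor at zero; all three are assumptions for illustration rather than the scoring rule actually used.

```python
def corrected_score(n_correct: int, n_incorrect: int, n_items: int) -> float:
    """Score out of 100 with one point deducted per incorrect response.

    Assumed rule: (correct - incorrect) / items * 100, floored at 0.
    """
    if n_correct < 0 or n_incorrect < 0 or n_correct + n_incorrect > n_items:
        raise ValueError("response counts are inconsistent with the number of items")
    raw = n_correct - n_incorrect
    return max(0.0, 100.0 * raw / n_items)


# Hypothetical LLAMA D results on the 35-item version of the subtest.
print(round(corrected_score(30, 5, 35), 2))   # mostly correct
print(round(corrected_score(17, 18, 35), 2))  # close to chance performance
```

Under this rule, a participant who guesses on every item is expected to score around zero rather than around 50, which is the purpose of the correction.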

LLAMA B is a vocabulary learning task assessing associative memory in the sense of creating and retaining form-meaning associations. During a two-minute study phase, participants tried to learn associations between picture stimuli and novel words. In the test phase, participants were presented with a word and had to click on the matching picture from the entire array of 20 images.

LLAMA D is a test of phonetic coding ability and assesses auditory pattern recognition. Participants listened to 10 computer-generated sound sequences. In the test phase, they had to decide whether they had heard a given sound sequence before or not.

LLAMA F is a test of grammatical inferencing assessing language-analytic ability. During a five-minute study phase during which note-taking was allowed, participants attempted to work out the grammar of a new (mini-)language by making inferences from short sentences (form) shown in association with pictures they describe (meaning). The test phase required recognition of grammatically correct versus grammatically incorrect sentences describing associated pictures.

Figures 1 and 2a/b show the interfaces of LLAMA B and F, respectively. As LLAMA D used auditory stimuli, the visual interface only showed two buttons, √ for ‘yes’ and X for ‘no’. Reliability of the test overall was reasonable: Cronbach’s alpha = 0.77. Individual sub-tests varied from good to poor in terms of reliability: LLAMA B = 0.85, LLAMA D = 0.62, LLAMA F = 0.40.

Figure 1 

LLAMA B interface.

Figure 2a 

LLAMA F interface - learning phase.

Figure 2b 

LLAMA F interface - testing phase.
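The reliability coefficients reported for the LLAMA are Cronbach's alpha values computed from item-level data. A minimal sketch of the standard formula is given below; the response matrix is invented and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a participants-by-items matrix of scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical right/wrong responses: 5 participants by 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, so a low value such as the one reported for LLAMA F indicates that the items did not rank participants consistently.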

3.2.2 SRT

The probabilistic SRT task () was programmed into PsychoPy and accessed by participants via Pavlovia. In this task, participants were presented with squares appearing at one of four locations on the screen at one-second intervals. As soon as a square appeared, participants had to press a corresponding keyboard button as quickly as possible. Unbeknown to participants, the order of presentation followed a specific pre-determined sequence 85% of the time (training trials) and another sequence 15% of the time (control trials), with pseudo-random switches between the two types of trial. Any difference in response times on training vs. control trials after 8 blocks comprising a total of 960 trials is considered a measure of implicit learning. In other words, the size of the learning effect is hypothesised to represent an individual’s aptitude for implicit learning, as discussed in Section 2.1 above. Reliability was reasonable: Guttman split-half = 0.595. Note that numerically lower reliability indices are expected on implicit measures such as the SRT (; ). Figure 3 shows the interface of the test.

Figure 3 

SRT interface.
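The learning effect on the SRT task can be sketched as the difference between a participant's mean response time on control trials and on training trials. The `Trial` record and its field names below are hypothetical, the data are invented, and restricting the calculation to correct responses is an assumption that the text does not confirm. The default block range follows the Blocks 2–8 window used in the analyses.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    block: int       # block number, 1 to 8
    kind: str        # "training" (85% sequence) or "control" (15% sequence)
    rt_ms: float     # response time in milliseconds
    correct: bool    # whether the matching key was pressed


def learning_effect(trials, first_block=2, last_block=8):
    """Mean control RT minus mean training RT (ms); positive values indicate learning."""
    def mean_rt(kind):
        return mean(
            t.rt_ms
            for t in trials
            if t.kind == kind
            and first_block <= t.block <= last_block
            and t.correct  # assumption: incorrect responses are excluded
        )
    return mean_rt("control") - mean_rt("training")


# Hypothetical trials for one participant.
trials = [
    Trial(1, "training", 600.0, True),   # Block 1 is outside the analysis window
    Trial(2, "training", 500.0, True),
    Trial(2, "control", 520.0, True),
    Trial(3, "training", 480.0, True),
    Trial(3, "control", 510.0, True),
    Trial(3, "control", 900.0, False),   # dropped as an incorrect response
]
print(learning_effect(trials))
```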

3.2.3 Background questionnaire

The background questionnaire was made available to participants via Qualtrics. It comprised four sections aimed at gathering biographical information, participants’ reported leisure activities, their self-concept and their emotional state. The section on biographical information asked about gender, date of birth, occupation and occupational status as well as the number and levels of languages learned.

The section on leisure activities drew on Gribbin et al. () and comprised 15 items describing common activities such as reading the newspaper, going for a walk, attending classes or meeting with friends. For each listed activity, participants responded on a 5-point frequency scale ranging from ‘never or hardly ever’ to ‘every day’. Reliability was acceptable: Cronbach’s alpha = 0.68.

The section on self-concept assessed participants’ perceptions of their own physical and mental state, abilities and skills by presenting 10 statements such as ‘I am fit’, ‘I have a good memory’ or ‘I am good at learning languages’. Participants responded on a 5-point agreement scale ranging from ‘strongly disagree’ to ‘strongly agree’. Reliability was good: Cronbach’s alpha = 0.76.

Emotional state was assessed by means of the Positive and Negative Affect Schedule (PANAS; ) which lists 20 adjectives describing different kinds of emotions, e.g., ‘enthusiastic’, ‘nervous’, ‘alert’ and ‘upset’. Participants indicated to what extent they had felt this way over the past week on a 5-point strength-of-feeling scale ranging from ‘not at all or only very slightly’ to ‘extremely’. Reliability was good: Cronbach’s alpha = 0.85.

The leisure activities and self-concept scales were reduced by means of factor analyses (varimax rotation). The analysis of the leisure activities scale (KMO = 0.532; Bartlett’s test of sphericity: p < 0.001) yielded a 7-factor solution (eigenvalues > 1) explaining 73% of the variance:

Factor 1: Light reading/going for a walk

Factor 2: Physical exercise

Factor 3: Socialising/talking with others

Factor 4: Entertainment/activities with others

Factor 5: Solitary entertainment

Factor 6: Intellectually stimulating activities

Factor 7: Solitary study/activities

The analysis of the self-concept scale (KMO = 0.648; Bartlett’s test of sphericity: p < 0.001) yielded a 3-factor solution (eigenvalues > 1) explaining 62% of the variance:

Factor 1: Memory/cognition

Factor 2: General health and language learning

Factor 3: Manual abilities

Subsequent analyses were based on the above factors. The alpha level was set at 0.05.

4. Results

The first research question asked how older adult participants performed on the measures of aptitude for explicit and implicit learning. The descriptive statistics are shown in Table 1.

Table 1

Descriptive statistics for the aptitude measures.


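| Measure | N | Range | Minimum | Maximum | Mean | SD |
|---|---|---|---|---|---|---|
| LLAMA total (B + D + F) | 63 | 219 | 0 | 219 | 90.92 | 43.77 |
| LLAMA B | 63 | 100 | 0 | 100 | 31.98 | 23.55 |
| LLAMA D | 63 | 71 | 0 | 71 | 30.21 | 17.77 |
| LLAMA F | 63 | 80 | 0 | 80 | 28.73 | 22.75 |
| SRT Mean RT Training Blocks 2–8 | 40 | 343.21 | 318.43 | 661.64 | 523.17 | 75.67 |
| SRT Mean RT Control Blocks 2–8 | 41 | 357.03 | 330.97 | 688.00 | 533.39 | 74.34 |
| SRT Mean RT Difference Blocks 2–8 | 40 | 62.70 | –21.52 | 41.18 | 10.24 | 13.56 |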

Note: RT = Response time. Mean response time differences were calculated over Blocks 2–8 as there was no difference in response times yet at the start of the test (i.e. on Block 1), in keeping with expectations.

The mean scores participants achieved on the LLAMA sub-tests suggest that the measure was challenging for the participants, with LLAMA F the most difficult component. Recall that the maximum score for each sub-test was 100; all means are well below 50%. Nevertheless, there were some high scorers, as evidenced by the maximum scores. At the same time, the minimum score was at floor level.

While the LLAMA was completed by 63 out of the 64 participants, with one participant reporting technical problems, only 40 participants successfully completed the SRT (Blocks 2–8). For the purpose of the study, we considered the test to have been completed successfully if a participant (a) produced data throughout and (b) achieved at least 50% accuracy on each block, so a meaningful data set was available for potential response time differences to emerge. Even though only about two thirds of the sample met these criteria, the mean learning effect of 10.24 ms was significant with a medium effect size (following ): t(39) = –4.779, p < 0.001, d = 0.758. Although the minimum mean response time difference indicates that not all participants exhibited learning, the significant mean difference clearly shows that many participants could and did learn implicitly in the course of the task.
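The completion criteria and the test of the mean learning effect can be sketched as follows. The function names are illustrative, the block accuracies are invented, and the statistics are recomputed from the rounded values in Table 1 using the textbook formulas for a one-sample t statistic and Cohen's d on difference scores. They therefore only approximate the reported values, and the sign of t depends on the direction of subtraction.

```python
from math import sqrt


def completed_successfully(block_accuracies, n_blocks=8, threshold=0.5):
    """True if data exist for every block and accuracy is at least 50% on each."""
    return len(block_accuracies) == n_blocks and all(
        acc >= threshold for acc in block_accuracies
    )


def t_and_d_from_summary(mean_diff, sd_diff, n):
    """One-sample t statistic and Cohen's d for a set of difference scores."""
    standard_error = sd_diff / sqrt(n)
    t = mean_diff / standard_error
    d = mean_diff / sd_diff
    return t, d


# Hypothetical per-block accuracies for two participants.
print(completed_successfully([0.9, 0.8, 0.85, 0.7, 0.9, 0.95, 0.8, 0.75]))
print(completed_successfully([0.9, 0.8, 0.45, 0.7, 0.9, 0.95, 0.8, 0.75]))

# Summary statistics for the SRT response time difference (Table 1).
t, d = t_and_d_from_summary(mean_diff=10.24, sd_diff=13.56, n=40)
print(f"t(39) = {t:.2f}, d = {d:.2f}")
```

The recomputed magnitudes are close to the reported ones, with the remaining differences presumably reflecting rounding in the published summary statistics.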

The second research question concerned the relationship between the hypothesised components of aptitude for explicit and implicit learning. Table 2 shows the correlations between the aptitude measures.

Table 2

Correlations between the aptitude measures.


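| | | LLAMA total (B + D + F) | LLAMA B | LLAMA D | LLAMA F | SRT Mean RT Difference Blocks 2–8 |
|---|---|---|---|---|---|---|
| LLAMA total (B + D + F) | r | 1 | 0.733** | 0.562** | 0.727** | 0.099 |
| | p | | 0.000 | 0.000 | 0.000 | 0.544 |
| LLAMA B | r | | 1 | 0.139 | 0.267* | –0.214 |
| | p | | | 0.278 | 0.035 | 0.184 |
| LLAMA D | r | | | 1 | 0.156 | 0.443** |
| | p | | | | 0.221 | 0.004 |
| LLAMA F | r | | | | 1 | 0.100 |
| | p | | | | | 0.541 |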

Note: ** = correlation significant at the 0.01 level, 2-tailed; * = correlation significant at the 0.05 level, 2-tailed.

The results show expected correlations between the LLAMA total and the three sub-tests. Moreover, LLAMA B and LLAMA F correlate at a moderate level. LLAMA D and the SRT are also associated at a moderate level, while LLAMA D does not correlate with either of the other LLAMA sub-tests. A factor analysis (varimax rotation, KMO = 0.403; Bartlett’s test of sphericity: p = 0.027) yielded a 2-factor solution (eigenvalues > 1) explaining 68% of the variance. The aptitude tests load on the two factors in the expected way, supporting the hypothesised distinction between measures of implicit and explicit aptitude, respectively. The rotated component matrix is shown in Table 3.

Table 3

Rotated component matrix for explicit and implicit aptitude.


| Measure | Implicit aptitude | Explicit aptitude |
|---|---|---|
| LLAMA B | –0.172 | 0.818 |
| LLAMA D | 0.809 | 0.126 |
| LLAMA F | 0.171 | 0.742 |
| SRT Mean RT Difference Blocks 2–8 | 0.871 | –0.122 |

The third research question asked how occupational status, age, level of multilingualism, emotional state, self-concept and leisure activities related to performance on the aptitude measures. In terms of occupational status, 52 participants reported that they were retired and 12 reported that they were still working. The descriptive statistics showed a substantial numerical difference in LLAMA D scores between retired participants (mean = 30.3, SD = 15.87) and participants who were still working (mean = 40.6, SD = 19.43). By the same token, the response time differences on the SRT differed numerically between retired participants (mean = 8.12, SD = 14.43) and those who were still working (mean = 16.6, SD = 8.08), indicating a larger learning effect for the latter group. In order to assess the influence of occupational status inferentially, we ran a MANCOVA with LLAMA B + F, LLAMA D and SRT scores as the dependent variables and chronological age as the covariate. There was no significant overall effect of occupational status (Wilks’ Lambda, p = 0.149), but a significant between-subjects effect for LLAMA D and a between-subjects effect approaching significance for the SRT were in evidence. Pairwise comparisons (Bonferroni-adjusted) were significant for LLAMA D (p = 0.033) and approached significance for the SRT (p = 0.077). In other words, when age was controlled for, retired participants achieved significantly lower scores on the LLAMA D and showed a trend towards less learning on the SRT task compared with participants who were still working.

Table 4 shows the descriptive statistics for participants’ reported leisure activities and their self-concept.

Table 4

Descriptive statistics for leisure activities and self-concept.


| Factor | Range | Minimum | Maximum | Mean | SD |
|---|---|---|---|---|---|
| Light reading/going for a walk | 3.67 | 1.33 | 5.00 | 3.31 | 0.96 |
| Physical exercise | 4.00 | 1.00 | 5.00 | 2.74 | 1.15 |
| Socialising/talking with others | 3.50 | 1.50 | 5.00 | 3.92 | 0.78 |
| Entertainment/activities with others | 3.00 | 1.00 | 4.00 | 2.38 | 0.60 |
| Solitary entertainment | 3.50 | 1.00 | 4.50 | 2.41 | 0.95 |
| Intellectually stimulating activities | 3.50 | 1.00 | 4.50 | 2.71 | 0.98 |
| Solitary study/activities | 3.00 | 1.00 | 4.00 | 2.77 | 0.72 |
| Memory/cognition | 2.75 | 2.25 | 5.00 | 3.82 | 0.57 |
| General health/language learning | 2.50 | 2.50 | 5.00 | 3.90 | 0.51 |
| Manual abilities | 4.00 | 1.00 | 5.00 | 3.58 | 1.00 |

Note: Higher scores indicate more frequent activity and a more positive self-concept, with 5 the maximum.

Overall, the sample can be considered quite active in leisure terms. A correlation analysis yielded no significant relationships between reported frequency of leisure activities and LLAMA B + F, LLAMA D or the SRT (all coefficients < 0.195, all p-values > 0.126). In terms of self-concept, participants were noticeably confident, with all means on the positive side of the response scale. With regard to their emotional state over the preceding week, responses were again overwhelmingly positive (maximum score = 5), as shown in Figure 4, with the mean well above the neutral point of the scale, though there were two outliers on the negative side.

Figure 4 

Participants’ emotional state.

A correlation analysis including age, level of multilingualism, emotional state and self-concept resulted in three significant relationships. LLAMA B + F scores were weakly but significantly correlated with age (date of birth), r = 0.284, p = 0.024, indicating an advantage for younger participants on the measure of aptitude for explicit learning. LLAMA D scores were weakly but significantly correlated with self-perceived memory/cognition, r = 0.256, p = 0.043, suggesting higher scores for implicit phonetic coding for participants with a positive view of their own memory/cognition. Finally, SRT task performance was moderately associated with level of multilingualism, r = 0.346, p = 0.029, indicating better implicit sequence learning by participants with more extensive prior language learning experience. All other correlations were low and non-significant (all coefficients < 0.227, all p-values > 0.093).

5. Discussion

It is now possible to draw together the threads of the study and consider the results in conjunction. We will begin by focusing on the construct of language learning aptitude and its measurement before widening the scope to the role of other individual difference variables.

5.1 Explicit and implicit aptitude in older adults

The first finding of note is that the results obtained from a sample of participants aged 60+ align with the hypothesised distinction between aptitude for explicit and aptitude for implicit learning (, ; ; ). A factor analysis based on scores on the LLAMA B, D and F subtests as well as the SRT task resulted in two factors with unequivocal loadings, clearly separating LLAMA B + F, which are expected to measure explicit aptitude, and LLAMA D and the SRT, which are thought to measure implicit aptitude. The fact that a test relying on linguistic stimuli (LLAMA D) and a test using non-linguistic visual stimuli (SRT) loaded on the same factor suggests that processing mode was more important than stimulus domain. Our finding adds empirical support from a hitherto under-researched participant group to previous findings with younger participants ().

The second finding of note is the challenging nature of all aptitude measures used in the present study when administered to participants aged 60+. On the LLAMA, mean scores were well below 50% throughout. For LLAMA B (32%) and LLAMA F (29%), this is not in line with results from recent research with younger participants. For instance, mean scores achieved by L1 Russian adolescents (N = 115) were 61% on LLAMA B and 56% on LLAMA F (). Mean scores from a sample of young adults with mixed L1 backgrounds (N = 350) based in Sweden, South Africa and Senegal were 50% on LLAMA B and 51% on LLAMA F (). Another, though much smaller sample of university-level learners with mixed L1 backgrounds (N = 41) based in the UK attained mean scores of 56% on LLAMA B and 61% on LLAMA F (). Two possible explanations for the lower scores attained by participants in the current study suggest themselves: we may be faced with an age effect and/or a task effect.

With regard to a task effect, it is certainly the case that, compared with the samples in the studies mentioned above as well as in other previous studies with university-level learners in particular, our third-age participants had had no recent practice in taking tests, least of all online, given that they had completed their formal education many years ago. With regard to an age effect, LLAMA B + F scores were significantly associated with age, that is, we established an age effect within the 19-year range of our sample which points towards an advantage for younger participants on the measures of aptitude for explicit learning. The descriptive differences in mean scores between our sample and the younger samples in previous studies referred to above further reinforce this interpretation, suggesting that explicit tasks in the linguistic domain which draw on fluid abilities in terms of memorisation of novel form-meaning associations between concepts and their labels (LLAMA B) and processing power in the sense of inductive analysis of structural patterns (LLAMA F) become more challenging as we get older (; ; ). At the same time, it must be acknowledged that though explicit in nature, both LLAMA B and LLAMA F are paced, with study time limited to two and five minutes respectively, so processing speed was not irrelevant.

Fast, implicit processing of linguistic and non-linguistic stimuli was tapped by the implicit aptitude measures in the present study, that is, LLAMA D and the SRT task, respectively. Unlike the LLAMA, the SRT had a low completion rate, with about a third of our sample failing to produce usable data on this test. As participants had a maximum of one second to respond to each stimulus, the test may have outpaced some individuals, resulting in high error rates. Alternatively, participants may have given up altogether and no longer even tried to keep pace. In order to establish whether there was an age effect within the 19-year range of our sample, we compared the dates of birth of participants who completed the SRT task successfully with those of participants who did not. The difference was not statistically significant, but a trend pointing towards an advantage for younger participants within the sample was evident: t(62) = –1.741, p = 0.087.
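As an illustration of how the reported test statistic relates to its p-value, the following minimal sketch computes a two-tailed p-value from a t statistic and its degrees of freedom by numerically integrating the Student t density. It uses only the Python standard library; the function names (t_pdf, two_tailed_p) and the choice of Simpson's rule are our own assumptions for illustration and do not represent the software used in the study.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-tailed p-value for a t statistic (hypothetical helper).

    Integrates the density from 0 to |t| with Simpson's rule (steps must
    be even) and subtracts twice that area from 1.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


# Values taken from the comparison reported above: t(62) = -1.741
p = two_tailed_p(-1.741, 62)
print(f"two-tailed p = {p:.3f}")
```

With 62 degrees of freedom, the statistic falls between the conventional two-tailed thresholds for p = 0.10 and p = 0.05, which is why the result is described as a trend rather than a significant difference.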

In view of the relatively high failure rate observed in the present study, one could argue that the SRT task may have limited use with older adults. However, we did find a significant learning effect in those participants who successfully completed the test. Put differently, a number of the participants who were able to keep pace in terms of motor responses demonstrated implicit sequence learning ability for non-linguistic stimuli. In addition, mean performance on LLAMA D as a measure of implicit phonetic coding was 30%. This is in line with one of the younger samples referred to above, which likewise achieved a mean of 30% on this subtest (), though the other younger samples we have chosen for comparison achieved higher means of 49% () and 56% (). Taken together, it seems that although processing speed as measured by motor responses was an issue for some participants, the results on the implicit aptitude measures are in keeping with the argument that linguistic processing abilities actually remain intact in later life and that task effects may be responsible for contrasting results in previous research (). In our case, it may have been exactly such a task effect that prevented a greater number of participants from completing the SRT. The fact that LLAMA D was completed by all participants, and that it is thus not implicit aptitude per se which causes potential issues, supports this interpretation.

5.2 Individual differences in occupational status, level of multilingualism and self-concept

In the present study, we asked participants to report on leisure activities, their self-concept and their emotional state in the preceding week. Overall, we observed frequent engagement in common leisure activities, positive self-concepts and positive affective states, and it is likely that the clustered responses we obtained on the various scales can explain the absence of (many) significant correlations. It would appear that we are faced with a selection bias, that is, older adults who were less active and less positive about themselves were unlikely to volunteer for participation in the present study, which required participants not only to complete unfamiliar tests but also to do so independently online. Anecdotally, we encountered strong initial interest from a considerable number of potential volunteers who were keen to occupy themselves during the Covid-19 lockdown at the time, but enthusiasm quickly waned among many in view of our insistence on laptop or desktop computers for task completion, which seemed to be seen as portending ‘serious’ and thus perhaps more threatening work than potentially more familiar and ‘fun’ touchscreen-enabled devices such as tablets or smartphones. Again anecdotally, it was mostly not the availability of suitable equipment that seemed to be the issue, but rather our request for its use.

Nevertheless, and despite the clustered nature of the educated and confident sample that participated in the study, we found some significant effects and relationships with regard to implicit aptitude measures and other individual difference variables. First and foremost, occupational status had a significant effect on performance on LLAMA D and showed a trend for the SRT, indicating that participants who were still working did better on these measures than participants who were retired. Importantly, chronological age was controlled for in this analysis, so there was no confound in this regard. Second, level of multilingualism in the sense of quantity and quality of prior language learning correlated with SRT task performance, pointing towards better implicit sequence learning (of non-verbal stimuli) by participants with more extensive language learning experience. Third, participants’ self-concept in terms of their own memory/cognition was correlated with LLAMA D performance, i.e., older adults with a more positive view of their own memory/cognition achieved higher scores on implicit phonetic coding – a finding which suggests a certain level of metacognitive awareness on the part of the participants.
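The idea of controlling for chronological age can be illustrated with a first-order partial correlation, which removes the linear contribution of a third variable from the association between two others. This is a minimal sketch of one common way of partialling out a covariate and is not necessarily the analysis used in the present study; the function names and the input coefficients below are hypothetical and serve purely as an illustration.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with the covariate z partialled out.

    r_xy: correlation between predictor and outcome
    r_xz: correlation between predictor and covariate (e.g. age)
    r_yz: correlation between outcome and covariate
    """
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("covariate is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical coefficients, NOT values from the study:
# x = occupational status (coded 0/1), y = test score, z = age.
adjusted = partial_corr(r_xy=0.40, r_xz=-0.30, r_yz=-0.20)
print(f"age-adjusted correlation = {adjusted:.3f}")
```

If the covariate is uncorrelated with both variables, the adjusted coefficient equals the raw one; the further the covariate correlations depart from zero, the more the adjustment matters.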

Given the research design of the present study, we cannot draw definitive conclusions about cause and effect, although it is possible to theorise about the directionality of the observed effects and relationships. With regard to age and performance on the explicit aptitude measures (see Section 5.1 above), it is plausible to claim that chronological age is the predictor and test performance the outcome. With regard to level of multilingualism and performance on the SRT task, we are probably dealing with a cyclical relationship: participants with better implicit sequence learning ability went on to learn more languages in their lives and achieved higher levels of proficiency in these languages. At the same time, the use of more languages, and at a higher level of proficiency, over time may have further enhanced implicit sequence learning ability. With regard to occupational status and implicit aptitude, a bidirectional relationship is likewise possible. Continuing to work may keep individuals mentally agile, leading to significantly better performance on an implicit linguistic task and marginally better performance on an implicit non-linguistic task. Alternatively, or simultaneously, individuals with better implicit learning abilities may be more likely to keep working for longer as they are better equipped for continued employment and/or enjoy the challenge to a greater extent.

6. Conclusion and suggestions for future research

Future research should investigate the predictive validity of the explicit and implicit aptitude measures that were used in the present study for older adults’ language learning. This would require a research design in which the same participant sample is tested on the LLAMA battery and the SRT task and also engages in a language learning course. The presentation of stimuli on the SRT task could be slowed down somewhat to ensure successful completion of the test by a greater number of participants. As LLAMA D was completed by all participants in the present study and resulted in mean scores that were comparable to those in previous studies, performance on this measure could be used to establish convergent validity of an SRT task with slightly longer presentation times.