1. Introduction

In L2 vocabulary learning, the ultimate goal for learners is to be able to use the new vocabulary productively. Producing L2 words involves storing them and retrieving them from the mental lexicon. In order to facilitate this process, research has shown that it is helpful for learners to pay attention to the form of the word, i.e. to engage in structural elaboration (Barcroft, 2002). One method for directing a learner’s attention to word-form is by requiring the learner to write the target word down. Previous research has demonstrated that doing so results in superior word learning compared to prompting the learner to focus on word meaning (Candry et al., 2017; Elgort et al., 2016). However, the question remains whether writing words down by hand will also result in better retention of L2 vocabulary if the method is compared to another method for structural elaboration.

The present study compares two structural elaboration techniques in order to determine whether they contribute to word-form learning to an equal extent: written repetition (i.e. writing a word down by hand repeatedly) and oral repetition (i.e. saying a word out loud repeatedly). Oral repetition was selected as a comparison method since, like written repetition, it is an ecologically valid L2 vocabulary learning method which requires the learner to produce the target word-form. We also added a control condition in which the learners were not prompted to engage in structural elaboration. Research suggests that more proficient language learners are more likely to use oral repetition as an L2 vocabulary learning strategy (Gu & Johnson, 1996). With the aim of accounting for this individual learner variable, the influence of learner style on the participants’ test performance as well as on the efficiency of the L2 vocabulary learning techniques was investigated.

2. Literature

When learners encounter an unknown L2 word, they often engage in semantic elaboration, i.e. they focus on word-meaning (Barcroft, 2002). If processing demands are high, however, concentrating on word-meaning will have a negative effect on the retention of word-form since the effort of engaging in semantic elaboration may usurp the processing resources required for encoding word-form. Explicitly encouraging learners to focus on word-form, i.e. prompting them to engage in structural elaboration, should increase the chances of them remembering this word-form (see Nation’s (2007) language-focused instruction and Laufer’s (2010) word-focused instruction). The more a learner engages in both semantic and structural elaboration, the better this learner’s chances of retaining the new word are (Laufer & Hulstijn, 2001).

In the present study, two methods which induce structural elaboration in L2 vocabulary learning are compared: written repetition and oral repetition. Studies comparing these vocabulary learning methods are scarce. In an L1 vocabulary study, Gathercole & Conway (1988) found an advantage for oral repetition on a word recognition test. In the only L2 vocabulary study we are aware of, Thomas & Dieter (1987) compared the merits of writing words down and saying words out loud. They concluded that written repetition resulted in better retention of word-form than oral repetition.

Research comparing either written repetition or oral repetition to other vocabulary learning strategies is more readily available. Several studies have found oral repetition to improve word memory compared to learning treatments during which words are not repeated out loud (e.g. Ellis & Beaton, 1993; Gathercole & Conway, 1988; MacLeod, Gopie, Hourihan, Neary, & Ozubko, 2010; Seibert, 1927). It must be noted, however, that barring Seibert (1927) none of these studies were conducted in the context of L2 vocabulary learning. According to Ellis (1995b, 1997), oral repetition of a word ensures that the word is retained in the phonological loop, which increases the odds of the word being transferred to long-term memory. Furthermore, learning to pronounce a word is a matter of sensorimotor learning, a type of learning which is fostered by repetition (Seibert, 1927). Saying a word out loud is presumed to create a sensorimotor representation of the word in the learner’s mind which should allow the learner to remember this word better (Krishnan, Watkins & Bishop, 2017; Mathias, Palmer, Perrin, & Tillmann, 2015). Moreover, besides the motor component and the auditory component involved in the method, a third component also appears to contribute to the efficiency of oral repetition: a self-referential component, i.e. hearing your own voice (Forrin & MacLeod, 2017). Notwithstanding these findings, some studies comparing oral repetition to learning activities that do not require learners to say the word out loud assert that the oral aspect is not critical for vocabulary learning (Abbs, Gupta, & Khetarpal, 2008; Kang, Gollan, & Pashler, 2013; Krishnan et al., 2017). Hence, the jury is out on the extent to which oral repetition benefits L2 vocabulary learning.

For written repetition, a similar disparity can be observed, at least in the context of L2 learning. A number of studies have endorsed the benefits of writing a word down for L2 word-form learning (Candry et al., 2017; Elgort et al., 2016; Eyckmans et al., 2017; Hummel, 2010). Moreover, the results of lexical decision tasks have indicated that word writing also resulted in better lexicalization, i.e. better integration of the words in the learner’s mental lexicon (Elgort et al., 2016). These studies suggest that the positive effect of the technique for L2 word-form learning is generated not only by the increased attention to word-form, but also by the motor memory that is created through this particular activity. Nonetheless, not all studies investigating the effects of writing L2 vocabulary down argue in favour of the technique; some studies have found the method to result in poorer performance on a form recall test than a control condition in which no explicit instructions were given to the learners as to how they were expected to learn the target vocabulary (Barcroft, 2006, 2007). Apparently, the writing task as operated in these studies consumed all of the learners’ processing resources, as a result of which the learners were not able to encode word-form and engage in form-meaning mapping (Barcroft, 2006, 2007).

Although written and oral repetition both focus the learner’s attention on word-form, they address different modalities to do so. Whereas written repetition accesses the visual aspect of the word (i.e. orthography), oral repetition focuses on its auditory aspect (i.e. phonology). For subsequent word recognition or production, the congruence between the modality in which a word was learned and the modality in which it has to be recognized or recalled might impact how well the learner is able to perform the task. Nelson, Balass and Perfetti (2005) established that learners were more capable of recognizing words if these had been learned in the same modality as the one in which they were tested. Similarly, Bosse, Chaves, & Valdois (2014) found learners to be better able to recall words productively in a modality congruent to the one in which they were learned, a phenomenon they termed the encoding-retrieval match effect. Both findings are in line with the assumptions of Transfer Appropriate Processing (TAP) Theory, which posits that the value of a particular learning activity is contingent on the goal of this activity (Morris, Bransford, & Franks, 1977).

Both written and oral repetition are strategies which L2 learners perceive as beneficial for the L2 vocabulary learning process (Chen, 1998; Schmitt, 1997). Gu and Johnson (1996) showed that learners seemed to prefer oral repetition over written repetition. Moreover, the learners’ self-reported use of written repetition as an L2 vocabulary learning strategy was found to be a negative predictor of their general level of L2 proficiency, whereas the use of oral repetition for L2 vocabulary learning was shown to be a positive predictor. This suggests that more proficient L2 learners are more likely to engage in oral than written repetition for L2 word learning. It also indicates that L2 learners may have a personal preference for certain L2 vocabulary learning strategies, which could have implications for the efficiency of these strategies. In a previous study, Candry et al. (2017) compared the efficacy of written repetition with meaning inferencing for L2 vocabulary learning, and investigated whether the learners’ preference with regard to method had an influence on the effectiveness of both techniques. Overall, the written repetition technique was found to be superior to meaning inferencing, regardless of the learners’ preference. Nevertheless, the advantage for written repetition compared to meaning inferencing was more pronounced among learners who preferred written repetition than among learners who preferred meaning inferencing. Learner style, which we consider to be a preference for vocabulary learning strategies of a particular type, may also impact the efficiency of written or oral repetition. For instance, the VARK learner style questionnaire (Leite, Svinicki, & Shi, 2009) allows a learner to determine whether he or she has a preference for visual, aural, read/write or kinaesthetic learning strategies. However, the effect of learner styles on the effectiveness of these two vocabulary learning strategies has not yet been investigated. 

The efficiency of a particular vocabulary learning strategy may also be influenced by a learner’s L2 vocabulary size. Research has indicated that the larger a learner’s L2 vocabulary size is, the better this learner will acquire new L2 vocabulary, a finding which has been termed the Matthew effect (Horst, Cobb, & Meara, 1998; Stanovich, 1986). Indeed, Candry et al. (2017) found that the larger a participant’s L2 vocabulary size, the more target items he or she knew after undergoing the learning treatment. However, there was no interaction between L2 vocabulary size and learning treatment.

3. Research questions

This paper will address the following main research question:

  1. Which of the three proposed learning conditions (written repetition, oral repetition, control condition) results in better L2 form recall, meaning recall, and lexicalization?

In addition, the following research questions will be investigated:

  2. Does learner style have an influence on the efficiency of the three learning conditions?
  3. Does L2 vocabulary size have an influence on the efficiency of the three learning conditions?
  4. To what extent does congruence of the learning and testing condition have an influence on vocabulary recall?

Following Perfetti & Hart’s (2002) Lexical Quality Hypothesis, postulating that the lexical representation of a word will be highest in quality if the learner had access to orthography, phonology, and semantics during the learning process, we hypothesize that written repetition will lead to superior results on all three measures of word knowledge, since learners had access to meaning and phonology in this condition and experienced an increased focus on the orthography of the target items. In both other conditions, the learners’ attention was not explicitly directed to the orthography of the target items. Furthermore, we expect oral repetition to yield better form recall scores than the control condition, owing to the motor component and the self-referential component inherent in saying words out loud. Based on Candry et al.’s (2017) results, we also anticipate that learner style will have an influence on the efficiency of the three learning conditions operationalized in the present study, and that L2 vocabulary size will not influence the efficiency of the three learning conditions. Hence, we expect that the learning conditions will be equally efficient for all learners, regardless of their L2 vocabulary size.

In keeping with TAP-theory (Morris et al., 1977) and several other studies which have argued in favour of congruence between treatment and test modality (Bosse et al., 2014; Nelson et al., 2005), we predict that words learned in the oral repetition condition will be recalled better in the spoken post-test and that written words will be recalled better in the written post-test.

4. Methodology

4.1. Design

The present study used a within-subjects design in which the participants learned 24 target items in three blocks of eight words. Hence, each participant learned eight words in each of the three learning conditions. All blocks were preceded by a practice block containing non-target items from the 2000 level frequency band, so that the learners understood how the learning procedure functioned prior to acquiring the target items. The procedure was conducted on a computer and programmed with PsychoPy (Peirce, 2008). All target items were presented in sentence contexts. Two native speakers of German and one near-native speaker of German, all of whom were German instructors at the university where the experiment took place, checked the idiomaticity and language level of the sentence contexts in order to make sure that the participants would understand the non-target vocabulary in the sentences.

4.2. Target items

The participants learned 24 low-frequency German words. The frequency of the target items was checked by means of the Leipzig Corpora Collection Corpus for German developed by Goldhahn, Eckart & Quasthoff (2012). All words were between 5 and 9 letters long (see appendix 1).

4.3. Participants

The participant group consisted of 71 Dutch-speaking learners of German in their first Bachelor year of Applied Linguistics at a Flemish university. Their estimated level of German proficiency ranged between CEFR level A2 and B1 and their average age was 18. Four participants were excluded from the study: one participant had to end the learning treatment prematurely due to illness; three other participants did not complete one of the learning conditions in the correct manner. One week after the learning treatment, 52 of the participants took part in the delayed post-tests.

4.4. Procedure

The participants were invited to take part in an experiment which required them to learn 24 new German words. A pre-test conducted prior to learning the target vocabulary allowed us to exclude target items already known to the learners. Four items had to be excluded from the analysis. Next, the learning procedure was initiated. All instructions, both oral and written, were provided in Dutch. For each block of eight target items, the participants completed three steps. The third step differed according to condition (see Table 1).

Table 1

Learning procedure.

Step | Presentation of target item | Instruction | Duration
1 | First sentence context, e.g. Das {Konterfei} des neuen Präsidenten ist überall zu sehen; jeder weiß, wie er aussieht. – The President’s portrait can be seen everywhere; everyone knows what he looks like. | Read the entire sentence and carefully look at the word between brackets. | 15 seconds
2 | Target word, Dutch translation and audio recording of target item played twice, e.g. Konterfei – portret | Read the target item and its translation. You will hear an audio recording of the target item twice. | 10 seconds
3 | Second sentence context, e.g. An der Wand hängt ein {Konterfei} von meiner Großmutter, das mein Großvater gezeichnet hat. – On the wall, there is a portrait of my grandmother which was drawn by my grandfather. | Instruction depended on the learning condition: written repetition, oral repetition or control condition (see instructions to the participants in the text below). | 20 seconds

In the written repetition condition, the participants received the following instructions: “Read each sentence in its entirety and write down the word in brackets repeatedly until you hear a beep.” After the beep they had to direct their attention back to the screen to read the sentence context containing the next target item. In the oral repetition condition, the participants were told: “Read the sentence in its entirety and repeat the word in brackets out loud until you hear a beep.” Their repetitions were recorded. In the control condition, the participants were given the following instruction: “Read the sentence completely and then look at the word in brackets until you hear a beep.”

These three steps were repeated twice for the remaining target items, but step three was conducted in a different experimental condition each time. Table 2 demonstrates how the order of the words was counterbalanced across conditions.

Table 2

Order of target words across conditions.

Group | Written repetition | Oral repetition | Control condition
Group 1 | Words 1–8 | Words 9–16 | Words 17–24
Group 2 | Words 9–16 | Words 17–24 | Words 1–8
Group 3 | Words 17–24 | Words 1–8 | Words 9–16
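The rotation in Table 2 follows a Latin-square pattern: each block of eight words is paired with each condition exactly once across the three groups. The following Python sketch reproduces that assignment. The function and variable names are illustrative assumptions and are not taken from the PsychoPy script used in the experiment.

```python
# Illustrative sketch of the counterbalancing scheme in Table 2.
# All names here are hypothetical; only the word ranges and the
# condition labels come from the table.
CONDITIONS = ["written repetition", "oral repetition", "control"]
BLOCKS = [range(1, 9), range(9, 17), range(17, 25)]  # words 1-8, 9-16, 17-24


def assignment(group):
    """Map each condition to its word numbers for participant group 1, 2 or 3."""
    shift = group - 1  # each group starts one block further along
    return {
        condition: list(BLOCKS[(index + shift) % 3])
        for index, condition in enumerate(CONDITIONS)
    }


TABLE = {group: assignment(group) for group in (1, 2, 3)}
```

Because the shift is taken modulo three, every word block meets every condition once, so item difficulty is not confounded with learning condition at the group level.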

After the learning procedure, the participants first completed two form recall tests which were administered by computer: a written and a spoken form recall test. The first twelve words were tested by means of the written form recall test: the participants saw the Dutch translation of one of the target words on screen and had to write down the corresponding German target word by hand on their answer sheet. The next twelve words were tested through the spoken form recall test: the participants again saw the Dutch translation of a target word on screen and had to say the corresponding German target word out loud. Their spoken answers were recorded by the computer. In each of these two sets of 12 words, one third had been learned in the writing condition, one third in the oral repetition condition, and one third in the control condition. Hence, one third of the words was tested in a mode congruent with the learning treatment. The order of the words was counterbalanced across post-test modes. In both modalities, participants were given 15 seconds to recall each word.
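The proportions in this paragraph can be checked with a small Python sketch. It assumes that the eight words of each condition were split evenly over the two post-tests, which is what the description implies; the labels and the function name are hypothetical.

```python
from collections import Counter

# Illustrative check of the post-test allocation described above.
# Assumption: each condition's eight words are split 4/4 over the two tests.
CONDITIONS = ("written", "oral", "control")
CONGRUENT = {("written", "written test"), ("oral", "spoken test")}


def allocate(words_per_condition=8):
    """Return one (condition, test mode) pair per target word."""
    half = words_per_condition // 2
    pairs = []
    for condition in CONDITIONS:
        pairs += [(condition, "written test")] * half
        pairs += [(condition, "spoken test")] * half
    return pairs


pairs = allocate()
per_test = Counter(mode for _, mode in pairs)
# Share of words whose test mode matches the mode of the learning condition.
congruent_share = sum(pair in CONGRUENT for pair in pairs) / len(pairs)
```

Under this assumption, four written-repetition words meet the written test and four oral-repetition words meet the spoken test, giving 8 congruent words out of 24. Control words have no congruent test mode.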

Next, participants completed a meaning recall test. They were presented with the 24 target words and had to write down the corresponding Dutch translations of the words. Finally, a lexical decision test was administered to measure implicit knowledge of the target words. If one aims to detect a degree of word knowledge that is too shallow or too unstable to be detected in an explicit form recall test, a more fine-grained, implicit measure may be required. The lexical decision task contained the 24 target words, 24 high-frequency German words and 48 German nonwords. The participants had to indicate whether the word appearing on screen was an existing German word or not. In order to become familiar with the task, participants completed 16 trials prior to the start of the task.
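The composition of the lexical decision task gives 48 existing words (24 targets plus 24 high-frequency words) against 48 nonwords, so "yes" and "no" responses are equally likely. A minimal Python sketch of such a trial list follows. The placeholder item names, the fixed seed and the randomized presentation order are assumptions for illustration, as the text does not specify how trials were ordered.

```python
import random

# Illustrative construction of a lexical decision trial list.
# Item labels are placeholders, not the actual stimuli.


def build_trials(targets, fillers, nonwords, seed=0):
    """Return shuffled (item, item type, is existing word) tuples."""
    trials = (
        [(item, "target", True) for item in targets]
        + [(item, "high-frequency", True) for item in fillers]
        + [(item, "nonword", False) for item in nonwords]
    )
    random.Random(seed).shuffle(trials)  # assumed: randomized presentation order
    return trials


trials = build_trials(
    targets=[f"target_{i:02d}" for i in range(1, 25)],
    fillers=[f"filler_{i:02d}" for i in range(1, 25)],
    nonwords=[f"nonword_{i:02d}" for i in range(1, 49)],
)
```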

One week later, the participants completed the same form and meaning recall tests and lexical decision task. The lexical decision task contained different high-frequency German words and German nonwords than the week before in order to avoid the participants responding faster to these items due to a familiarity effect. Participants also completed two German vocabulary size tests so that we could determine whether their vocabulary size informed their post-test performance. For receptive German vocabulary size, they completed the LexTALE for German (Lemhöfer & Broersma, 2012). A productive German vocabulary size test, which was developed by the Institut für Testforschung und Testentwicklung (2016), was also administered. In addition, participants filled in the VARK learner style questionnaire (Leite et al., 2009) so that we could verify whether learner style had an influence on the efficiency of the learning conditions.

4.5. Scoring and analysis

The spoken responses were transcribed phonetically and compared to a phonetic transcription of the audio recording of the target word in order to assign an appropriate score. Responses in both test modes were scored twice: once with a strict scoring protocol, according to which the response was accorded either 0 or 1, and once according to Barcroft’s (2002) Lexical Production Scoring Protocol, which awards a score of 0, 0.25, 0.5, 0.75 or 1, depending on the percentage of the word that was produced. The strict form and meaning recall data and the accuracy data of the lexical decision task were analysed by means of a generalized linear mixed effects model constructed with the glmer function of the lme4-package (Bates, Maechler, Bolker, & Walker, 2015). Partial form recall scores were analysed with a cumulative link mixed model fitted by means of the clmm-function of the ordinal-package (Christensen, 2015). The reaction time data were analysed with a linear mixed effects model, for which the function lmer from the lme4-package was employed (Bates et al., 2015). Cohen’s d for the mixed effects models was calculated as in Candry et al. (2017): Participant and item effect sizes were calculated by means of the orddom-package (Rogmann, 2013) and then combined with the ESCI software for Meta-Analysis (Cumming, 2012).
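The difference between the two scoring schemes can be illustrated with a short Python sketch. The strict score follows the all-or-nothing rule described above. The partial score is only a simplified stand-in: it maps the share of target characters matched in order onto the five score levels, using thresholds that are assumptions chosen for illustration and not the criteria of the Lexical Production Scoring Protocol.

```python
from difflib import SequenceMatcher

# Illustrative scoring sketch. The quarter-step thresholds below are
# assumptions, NOT the actual criteria of Barcroft's protocol.


def strict_score(response, target):
    """Return 1 for an exact match (ignoring case and outer spaces), else 0."""
    return 1 if response.strip().lower() == target.lower() else 0


def partial_score(response, target):
    """Return 0, 0.25, 0.5, 0.75 or 1 from the share of matched target characters."""
    response, target = response.strip().lower(), target.lower()
    if response == target:
        return 1.0
    blocks = SequenceMatcher(None, response, target).get_matching_blocks()
    share = sum(block.size for block in blocks) / len(target)
    # Round down to the nearest quarter; an imperfect response never earns 1.
    return min(0.75, int(share * 4) / 4)
```

For example, the response "Konterfai" for the target "Konterfei" scores 0 under strict scoring but 0.75 under this sketch of partial scoring, since eight of the nine target letters are matched in order. The same functions could be applied to phonetic transcriptions of the spoken responses.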

5. Results

5.1. Learning effects of the three conditions

We observe that the writing condition yields the highest immediate form recall percentages, both for strict and partial form recall, followed by oral repetition and then the control condition, although the difference between the latter two conditions is negligible (see Table 3). The differences between written repetition and oral repetition (Estimate = –0.6221, SE = 0.1462, z = –4.255, p = 0.0001, d = 0.42 for strict scoring; Estimate = –0.6749, SE = 0.1375, z = –4.909, p < 0.0001, d = 0.40 for partial scoring) and between the writing condition and the control condition (Estimate = –0.7273, SE = 0.1461, z = –4.979, p < 0.0001, d = 0.54 for strict scoring; Estimate = –0.6617, SE = 0.1359, z = –4.867, p < 0.0001, d = 0.49 for partial scoring) are significant with medium effect sizes. The difference between oral repetition and the control condition is not significant, and a very small effect size is noted (Estimate = –0.1052, SE = 0.1436, z = –0.733, p = 0.7440, d = 0.10 for strict scoring; Estimate = 0.0133, SE = 0.1321, z = 0.100, p = 0.9945, d = 0.06 for partial scoring). One week later, however, written repetition no longer results in superior form recall percentages. The differences between the three conditions have levelled out and learning condition is no longer a significant predictor of the participants’ performance on the delayed form recall test, either for the strict (p = 0.8785) or for the partial form recall scores (p = 0.853).

Table 3

Immediate and delayed form recall percentages per condition.

Condition | Strict, immediate (n = 67) | Strict, delayed (n = 52) | Partial, immediate (n = 67) | Partial, delayed (n = 52)
Written repetition | 63% | 41% | 71% | 48%
Oral repetition | 52% | 43% | 62% | 48%
Control | 50% | 42% | 61% | 48%

Immediate meaning recall scores are virtually equal in all three conditions (see Table 4). The likelihood ratio test indicates that the variable condition does not improve the model fit (p = 0.3405 for immediate meaning recall and p = 0.2054 for delayed meaning recall).

Table 4

Immediate and delayed meaning recall percentages per condition.

Condition | Immediate (n = 67) | Delayed (n = 52)
Written repetition | 81% | 69%
Oral repetition | 80% | 66%
Control | 79% | 66%
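The likelihood ratio test mentioned above compares a model with the predictor condition to a model without it. As a minimal Python sketch, and assuming that the three-level factor adds two parameters to the model, the p-value can be obtained in closed form, because the chi-square survival function with two degrees of freedom is exp(-x/2). The deviance values in the example are hypothetical and are not taken from the study.

```python
import math

# Illustrative likelihood ratio test for a predictor that adds two parameters
# (assumed here for a three-level factor such as condition).


def lrt_p_value_df2(deviance_without, deviance_with):
    """Return the p-value for the drop in deviance, using chi-square with 2 df."""
    statistic = deviance_without - deviance_with
    if statistic < 0:
        raise ValueError("the fuller model cannot have a higher deviance")
    return math.exp(-statistic / 2)  # exact survival function for 2 df


# Hypothetical deviances: a drop of 2.2 gives a p-value of roughly 0.33.
p_example = lrt_p_value_df2(1500.0, 1497.8)
```

A small drop in deviance relative to the two added parameters, as in this example, means that adding condition does not improve the fit.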

Although condition was not a significant predictor of performance on the immediate lexical decision task, either for reaction times (p = 0.4002) or for accuracy (p = 0.373), average reaction times were lowest for words learned through written repetition and highest for words learned in the control condition (see Table 5). Accuracy was virtually equal in all three conditions. After one week, reaction times were highest in the control condition and lowest in the oral repetition condition, but condition was again not a significant predictor of reaction times on the lexical decision task (p = 0.2563). The participants responded equally accurately to words learned through written repetition and oral repetition, but less accurately to words learned in the control condition. The difference between written repetition and the control condition just falls short of significance with a small effect size (Estimate = 0.4480, SE = 0.2547, z = 1.759, p = 0.0786, d = 0.28); the difference between oral repetition and the control condition is significant, again with a small effect size (Estimate = 0.5314, SE = 0.2572, z = 2.066, p = 0.0388, d = 0.26).

Table 5

Results of the immediate and delayed lexical decision task.

Condition | Reaction time, immediate (n = 67) | Reaction time, delayed (n = 52) | Accuracy, immediate (n = 67) | Accuracy, delayed (n = 52)
Written repetition | 683.97 ms | 735.02 ms | 94% | 90%
Oral repetition | 685.55 ms | 732.12 ms | 95% | 90%
Control | 698.41 ms | 753.39 ms | 95% | 86%

5.2. Influence of learner style, L2 vocabulary size and test-treatment congruence

We investigated the effect of learner style, L2 vocabulary size and test-treatment congruence on the participants’ learning gains and on the efficiency of the three learning conditions. According to the VARK learner style questionnaire, six participants had a preference for the visual modality, 21 participants preferred the aural/auditory modality, nine participants had a profile which fitted the read/write modality, and 10 participants favoured the kinaesthetic modality. The remaining nine participants had a multimodal profile, combining two or even three of the four VARK-modalities. Learner style was not a significant predictor of performance on the delayed form recall test (p = 0.9001 for strict scoring; p = 0.8333 for partial scoring), the delayed meaning recall test (p = 0.4972) or the delayed lexical decision task (p = 0.573 for reaction times; p = 0.3236 for accuracy).

On the LexTALE, which measured receptive L2 vocabulary size, the participants obtained an average score of 61.3%. Their average scores on the productive German vocabulary size test were 11.7 (= 65%, SD = 3.14) at the 1000-word frequency level, 8.8 (= 48.9%, SD = 2.71) at the 2000-word frequency level, 5.1 (= 28.3%, SD = 2.18) at the 3000-word frequency level, 3.8 (= 21.1%, SD = 2.42) at the 4000-word frequency level and 2.4 (= 13.3%, SD = 1.51) at the 5000-word frequency level. Productive L2 vocabulary size was a significant predictor of the scores obtained on the delayed form recall test (Estimate = 0.2928, SE = 0.0774, z = 3.781, p = 0.0002 for strict form recall; Estimate = 0.2475, SE = 0.0778, z = 3.181, p = 0.001 for partial form recall) and the delayed meaning recall test (Estimate = 0.2250, SE = 0.0818, z = 2.747, p = 0.006): the higher a learner’s productive L2 vocabulary size, the more words this learner was able to recall. However, the interaction between condition and productive L2 vocabulary size did not improve the model fit for delayed form recall (p = 0.1273 for strict scoring and p = 0.1827 for partial scoring) or delayed meaning recall (p = 0.1804). Receptive L2 vocabulary size was not a significant predictor of the scores obtained on the delayed form recall test (p = 0.1376 for strict scoring; p = 0.0697 for partial scoring) or the delayed meaning recall test (p = 0.4798). Lastly, neither receptive (p = 0.1863 for reaction times; p = 0.4982 for accuracy) nor productive L2 vocabulary size (p = 0.7684 for reaction times; p = 0.8311 for accuracy) predicted the results of the delayed lexical decision task.
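The percentages reported for the productive vocabulary size test are consistent with a maximum score of 18 per frequency level. That maximum is not stated in the text; it is an assumption inferred from 11.7 corresponding to 65%. The Python sketch below recomputes the percentages under that assumption.

```python
# Illustrative check of the reported percentages.
# ITEMS_PER_LEVEL = 18 is an assumption inferred from "11.7 (= 65%)".
ITEMS_PER_LEVEL = 18
MEAN_SCORES = {1000: 11.7, 2000: 8.8, 3000: 5.1, 4000: 3.8, 5000: 2.4}


def as_percent(mean_score, max_score=ITEMS_PER_LEVEL):
    """Convert a mean raw score to a percentage, rounded to one decimal."""
    return round(100 * mean_score / max_score, 1)


percentages = {level: as_percent(score) for level, score in MEAN_SCORES.items()}
```

With 18 items per level, all five recomputed values match the percentages given in the text, and the steady decline across levels reflects the expected drop in productive knowledge as word frequency decreases.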

Finally, test-treatment congruence was not a significant predictor of post-test performance, either for the immediate (p = 0.5006 for strict scoring; p = 0.6183 for partial scoring) or for the delayed form recall results (p = 0.5317 for strict scoring; p = 0.625 for partial scoring). Hence, words which were learned through written repetition were not recalled better in the written post-test than words which were learned in the oral repetition condition, and vice versa (for percentages: see Table 6).

Table 6

Form recall percentages per combination of learning condition and type of post-test.

Learning condition + post-test mode | Immediate (n = 67) | Delayed (n = 52)
Written repetition + Written post-test | 68% | 46%
Oral repetition + Written post-test | 57% | 45%
Oral repetition + Spoken post-test | 47% | 40%
Written repetition + Spoken post-test | 58% | 36%

However, we did observe that scores were overall higher on the written form recall test than on the spoken form recall test (see Table 7). This difference was significant, both for immediate form recall (Estimate = 0.5418, SE = 0.1187, z = 4.564, p < 0.0001 for strict scoring; Estimate = 0.6434, SE = 0.1109, z = 5.802, p < 0.0001 for partial scoring) and delayed form recall (Estimate = 0.5897, SE = 0.1456, z = 4.050, p < 0.0001 for strict scoring; Estimate = 0.6147, SE = 0.1315, z = 4.674, p < 0.0001 for partial scoring).

Table 7

Form recall percentages per post-test mode.

Post-test mode | Immediate (n = 67) | Delayed (n = 52)
Written post-test | 59% | 46%
Spoken post-test | 51% | 38%

We also established that response rates, i.e. the number of instances where a participant provided an answer on the form recall test, were higher for the written post-test than for the spoken post-test (see Table 8), a difference which is again found to be significant (Estimate = 0.7590, SE = 0.1293, z = 5.870, p < 0.0001 for immediate form recall; Estimate = 0.6359, SE = 0.1408, z = 4.516, p < 0.0001 for delayed form recall).

Table 8

Response rates per post-test mode.

Post-test mode | Immediate (n = 67) | Delayed (n = 52)
Written post-test | 38.9% | 30.6%
Spoken post-test | 33.1% | 25.2%

6. Discussion

In the case of form recall, the results of the experiment point to a slight advantage of the writing condition over the oral repetition and control conditions. Moreover, although the written repetition technique resulted in the same accuracy on the delayed lexical decision task as oral repetition, it led to higher accuracy on this task than the control condition. As such, our findings seem consistent with the Lexical Quality Hypothesis (Perfetti & Hart, 2002): learners had access to orthography, phonology and semantics in the written repetition condition and, as a result, were able to create more complete lexical representations of the new vocabulary than in the two other conditions. In addition, the results seem to be consistent with previous research establishing that immediate form recall was better for words which had been written down (Candry et al., 2017; Elgort et al., 2016; Thomas & Dieter, 1987). The effect observed in the present study was slightly smaller than the effect observed in Candry et al. (2017). For immediate form recall, the differences between written repetition and oral repetition, and between written repetition and the control condition were significant with a medium effect size, whereas in Candry et al. (2017), the writing condition significantly outperformed the semantically elaborative condition with a medium to high effect size.

For the most part, however, the advantage of the writing condition was short-lived. It should be noted that previous studies on the effects of word writing either did not include a delayed form recall test (Elgort et al., 2016; Thomas & Dieter, 1987), or delayed this test by only one day (Candry et al., 2017). In our study, the superiority of word writing had disappeared after a one-week interval. Nevertheless, contrary to Barcroft (2006, 2007), we did not find that writing a word down resulted in lower delayed form recall scores than the control condition. In view of its marginally better results on the immediate form recall test, written repetition seems to have benefited vocabulary learning more than the other structurally elaborative condition that was employed (i.e. oral repetition).

Another explanation for the benefit of written repetition observed in the immediate form recall test may be that writing a word down entails a greater focus on phonology than anticipated. According to the phonological mediation hypothesis, access to the orthographical knowledge of a word presupposes the retrieval of its phonology (Geschwind, 1969; Luria, 1970). This would mean that the visual presentation of a word activates phonological information as well as orthographic information (Nelson et al., 2005). Although the results of several studies (e.g. Bub & Kertesz, 1982; Shelton & Weinrich, 1997; Rapp & Caramazza, 1997) have challenged the obligatory nature of phonological mediation, other studies have found evidence for phonology contributing to the representation of orthographic codes (Damian, Dorjee, & Stadthagen-Gonzalez, 2011; Damian & Qu, 2013; Miceli & Capasso, 1997). As such, simply reading a word may not only allow the learner to process how the word is written, but also how the word is pronounced. Moreover, the participants in the written repetition condition may have repeated the word subvocally whilst writing it down. Although there is some debate as to whether subvocalization occurs consistently during silent reading, it is a commonly observed phenomenon (e.g. Cleland & Davies, 1963; Reisberg, Smith, Baxter, & Sonenshine, 1989; Smith, Wilson, & Reisberg, 1995). Should the learners indeed have engaged in subvocalization during the written repetition condition, their attention would have focused on both the orthography and phonology of a word, engaging in both orthographic and phonological processing as a result. This two-fold processing may then have resulted in the superiority of written repetition compared to oral repetition and the control condition. Furthermore, if learners engage in two types of processing simultaneously, they are also likely to create two types of motor memory concurrently. Several studies have detected movements in the vocal tract during silent reading, implying that even silent reading entails a motor aspect for speech production (e.g. McGuigan, 1970; McGuigan & Bailey, 1969; McGuigan, Keller, & Stanton, 1964; Sokolov, 1969).

Oral repetition generated lower explicit word knowledge than written repetition in the immediate form recall test, but resulted in somewhat better implicit word knowledge than the control condition in the lexical decision task. The delayed scores observed for oral and written repetition were virtually equal, suggesting they may yield similar long-term effects. We had expected written repetition to yield superior results compared to oral repetition on both the crude tests of explicit knowledge (i.e. form and meaning recall) and the finer-grained test of implicit knowledge (i.e. the lexical decision task). It is possible, however, that looking at the written form of the word and saying the word out loud still entailed a focus on orthography, which would contribute to the quality of the lexical representation of the item. The self-teaching hypothesis (Share, 1995) states that through phonological recoding (i.e. the translation of printed words into their spoken equivalents), a certain amount of orthographic knowledge of the word is built up.

Overall, the control condition yielded the lowest scores. It is remarkable, though, that, contrary to what we expected, the control condition resulted in scores on the form recall tests that were as high as those for oral repetition. This finding is not consistent with TAP-theory: although learners had to produce the target items on the form recall test, recall was not better for words learned through oral repetition – which entailed production of the target items – than for words learned in the control condition. The self-teaching hypothesis could again contribute to our understanding of why our control condition did not underperform on the form recall test. As a generalization of the self-teaching hypothesis, De Jong and Share (2007) investigated whether orthographic learning was better for words read out loud (i.e. a condition similar to our oral repetition condition) than for words read in silence (i.e. a condition similar to our control condition). Contrary to expectations, orthographic learning appeared to be similar across both conditions. As such, the processes of reading out loud and reading in silence may be more similar than anticipated, and learners may have engaged in structural elaboration in the control condition after all, which would account for the similar results obtained in the oral repetition and control conditions. However, the delayed lexical decision task demonstrates that oral repetition yielded better implicit word knowledge than the control condition, which may be due to the self-referential auditory input learners obtained by hearing themselves say the words out loud. Hence, not only the establishment of a motor memory, but also this self-referential input would have benefited word learning during oral repetition. Arguably, the self-referential component may be even more conducive to word learning than the motor component (Forrin & MacLeod, 2017).

Since we did not ask the learners what they did during the control condition, we cannot know for certain what went on in their minds while they were completing it. Another possibility is that a form of transfer took place from the structural elaboration conditions to the control condition. Potentially, learners who first completed one or both of the structural elaboration conditions and then experienced the control condition carried over the type of focus on form they had adopted in the structural elaboration conditions. Therefore, we checked whether an effect of condition order was at play. Analysis demonstrated that order of condition was not a significant predictor of post-test performance (immediate form recall: p = 0.3571 for strict scores and p = 0.2863 for partial scores; delayed form recall: p = 0.6915 for strict scores and p = 0.6783 for partial scores). Hence, transfer from the structural elaboration conditions to the control condition does not seem to have occurred.

Unlike in Thomas & Dieter (1987), time on task in this study was equal for written repetition and oral repetition. We documented the number of repetitions in both conditions so as to be able to determine whether repetition had an influence on post-test performance. In the written repetition condition, participants wrote the word down 4.8 times on average; during oral repetition, the word was produced 7.75 times on average. Number of repetitions was not a statistically significant predictor of post-test performance. Therefore, it seems to be more important for learners to engage with the word for an equal period of time than for them to write the word down or say it out loud an equal number of times.

With regard to learner style, we did not find that the participants’ results on the VARK influenced the efficiency of the learning conditions. We had expected that learners would perform better in the learning condition which suited their learner style profile best. However, it appeared that learner style as assessed by the VARK did not influence how well the participants performed in any of the three learning conditions. Our analysis also demonstrated that German vocabulary size did not interact with the effect of learning condition. We did establish, however, that the larger a learner’s productive German vocabulary was, the more target vocabulary this learner acquired, regardless of the learning condition in which these words were acquired. Hence, we found support for the Matthew effect, which posits that the rich get richer (e.g. Horst et al., 1998; Stanovich, 1986). Finally, we established that words were not recalled better on a post-test that was similar to the learning condition, i.e. words learned in the written repetition condition were not recalled better on the written post-test and words learned through oral repetition were not recalled better on the spoken post-test. Hence, the prediction we made based on TAP-theory (Morris et al., 1977) was not corroborated by our findings. Rather, words were recalled significantly better on the written post-test than on the spoken post-test. This finding is in agreement with Nairne (2002), who characterizes the encoding-retrieval match effect as a myth.

In addition, our analysis indicated that the participants gave significantly more responses on the written post-test than on the spoken post-test. This could be due to the learners experiencing a degree of embarrassment when having to produce newly learned words out loud and potentially being unsure that their answers were correct. The fact that several participants were completing the learning procedure in the same room, as well as their awareness that their answers would be recorded and replayed in order to be awarded a score, could also have contributed to this element of self-consciousness.

7. Conclusion

While written repetition has previously been shown to result in superior L2 word learning compared to a condition in which semantic elaboration was prompted (Candry et al., 2017), the results of this study suggest that written repetition also results in marginally superior L2 vocabulary learning, at least in the short run, compared to another condition that motivates learners to engage in structural elaboration, namely oral repetition. With regard to implicit word knowledge, moreover, we found a small advantage for both structural elaboration techniques compared to a control condition in which the participants were instructed to simply look at the target item. Therefore, we propose that language teachers encourage their learners to engage in structural elaboration during L2 vocabulary learning. Producing the target item, be it in the written or spoken form, appears to contribute to word-form learning. In particular, we advise learners to write words down during the learning process. We found no interaction between learning condition and either the participants’ learner style or their L2 vocabulary size, indicating that written repetition is an efficient L2 vocabulary learning method, regardless of a learner’s learner style or L2 proficiency.

The effect found here for written repetition is only an immediate one; a delayed effect was not observed. Research has demonstrated that spaced presentations of new vocabulary are more effective for word learning than massed presentations, a phenomenon known as the spacing effect (Ellis, 1995a). The immediate effect of the writing condition might be maintained over time if the same treatment were repeated after a short delay, which would ensure spaced presentation of the target vocabulary. Therefore, a longitudinal study is warranted in which the two structural elaboration activities operationalized in the present study are repeated in consecutive treatments over the course of several days or weeks. Such a study could allow us to ascertain whether a long-term positive effect can be observed for either written repetition or oral repetition.

In addition, future research should aim to determine whether adding the spoken mode during the act of writing a word down adds to the benefits of written repetition. We suggested that one of the reasons why written repetition was more beneficial for L2 vocabulary learning in this study could be that the learners subvocalized the words whilst writing them down and, consequently, engaged in a combination of orthographic and phonological processing. In a future study, three conditions should be compared: a condition in which learners write the target item down repeatedly whilst also repeating the target item out loud; a condition in which the target item is written down repeatedly whilst the learners subvocalize the item; and a condition in which the target item is written down repeatedly and subvocalization is suppressed, for instance by requiring the learners to continuously say something else. Such a study would further help us to delineate the benefits of written repetition as an L2 vocabulary learning technique. Finally, since we posited that learners may have experienced a degree of embarrassment when giving answers on the spoken form recall test and may therefore have given fewer answers, the study should be conducted again with the participants undergoing the learning treatment in an individual setting rather than with several participants in the same room.

Additional File

The Additional file for this article can be found as follows:

Appendix

Target Items. DOI: https://doi.org/10.22599/jesla.44.s1