Exploring the evolution in oral fluency and productive vocabulary knowledge during a stay abroad

Vladyslav Gilyuk; Amanda Edmonds; Elisa Sneed German

1. Introduction

In a 2020 publication, Foster reviewed second language (L2) research on oral fluency, and set out a research agenda on this topic comprised of five tasks. In the current article, we contribute to L2 oral fluency research by offering evidence relevant to Foster’s fourth task, which targets the relationship between productive vocabulary knowledge and oral fluency. According to Foster, “it can be argued that the more a person can draw on free productive knowledge, the more fluently that person is able to speak” (p. 453). This is presumably because a mental lexicon that is relatively sparsely populated can cause problems at the level of formulation (), insofar as “the demands of the pre-verbal message cannot be met so easily, or even at all” (), which may result in hesitation phenomena and slower delivery. To date, there exists relatively little empirical research that demonstrates this relationship (cf. ; ; ), and those studies that have explored this question in a stay-abroad context report contradictory results (; ; ). Foster pairs her call for more research on this issue with several methodological suggestions: (a) fluency analyses should be carried out on longer extracts of L2 speech, (b) fluency data should be collected via interactive tasks (to balance the dominance of laboratory-based monologue tasks), and (c) it may be pertinent to assess productive vocabulary knowledge using both controlled and free tasks. In the current study, we report on a project in which we followed each of these methodological suggestions in order to explore the potential parallel development of productive vocabulary knowledge and oral fluency using detailed case studies from five Francophone users of L2 English. In addition, our longitudinal corpus, which was collected before and after a nine-month immersion experience in a target-language environment, allows us to offer insights into potential change in oral fluency and productive vocabulary knowledge as a result of a stay abroad (SA).

2. Literature review

2.1. The development of fluency in a L2

According to Freed (), fluency is one of the most studied topics in first and second language acquisition (for a recent treatment of the topic, see ). In this section, we first provide a brief overview of common fluency measures before focusing on research on the development of oral fluency during a SA.

2.1.1. Measuring fluency

According to Segalowitz (), fluency consists of three dimensions, namely cognitive fluency (speech planning), utterance fluency (speech production), and perceived fluency (speech perception). Only utterance fluency may be measured objectively (), and it is this dimension that will be the focus of the current analysis. Skehan () and Tavakoli and Skehan () proposed three crucial aspects for the measurement of utterance fluency: speed fluency (speed, flow and density), breakdown fluency (pauses and hesitation phenomena) and repair fluency (reformulations, corrections and repetitions). In what follows, we will review measures commonly used to assess speed fluency and breakdown fluency; repair fluency was not evaluated in our study, given that previous research suggests that it strongly reflects individual style preferences and does not necessarily differentiate native and non-native speakers (e.g., , ).

Oral productions from native and non-native speakers often differ in terms of speech speed, and this difference has been quantified in various manners (see ), with three measures being particularly common: speech rate (SR), articulation rate (AR) and mean length of runs (MLoR). SR is generally measured as the number of syllables uttered per minute/second, including pause time. Despite being widely used to assess speed fluency, authors such as De Jong et al. () have argued that SR also reflects breakdown fluency, given that the calculation includes pause time. The second temporal measure – AR – is calculated in the same way as SR, the only difference being that pause time is excluded. This measure thus indicates the number of syllables articulated within a given period of phonation. The final temporal measure, MLoR, corresponds to the average length (in syllables) of a run, where runs are strings of speech between pauses. Breakdown fluency focuses on pauses in speech production. Although pauses are a normal feature of all speech, L2 production sometimes differs from that of native speakers in that L2 speakers produce overall longer pauses and their speech is characterized by a greater number of pauses within (as opposed to between) speech units (e.g., ; ; ). Numerous authors have suggested that pause position can reflect different aspects of oral speech planning and production: unit-internal pausing may reflect word-searches during message formulation, whereas pauses that occur at unit boundaries may indicate macro-planning (see ).

2.1.2. The development of fluency during a SA

Numerous studies have documented fluency development after a SA for both speed and breakdown fluency. Beginning with speed fluency, research has consistently shown that learners produce faster speech after a SA, as measured using SR (see, among others, ; ; ; ; ; ; ; ), AR (e.g., ; ; ; ; ), and/or MLoR (see ; ; ; ; ; ). These results have been obtained for several L2s (English, Spanish, French) and for immersion periods of various lengths, ranging from 3–4 weeks () up to nine months (). Overall, this body of research offers strong evidence of gains in speed and suggests that change may occur quickly, explaining the significant results for short stays and studies that found significant development at the beginning of longer stays (see ; ). We note that most of this research has assessed speed fluency change using monologue tasks (see for an exception), based on relatively short extracts (e.g., 30 seconds, in the case of ).

Turning to breakdown fluency, Towell et al. () found no significant change in the average length of all pauses produced by a group of 12 learners of French after a six-month stay in France. However, this finding may reflect the fact that pause position was not taken into account in the analysis. Indeed, some recent research has demonstrated differences in pause length and pause frequency depending on whether the pause occurs within a speech unit or between two units (e.g., ). Within the SA literature, Huensch and Tracy-Ventura () found that unit-internal pauses quickly shortened in length for their group of L2 learners of Spanish after their arrival abroad, and that this change was maintained over the course of the nine-month stay. Between-unit pauses, on the other hand, showed no clear change in duration as a result of the SA (see p. 286). Additional evidence for change in breakdown fluency as a result of a SA comes from Leonard and Shea (), who focused on the rate of pausing. These authors found that the number of unit-internal pauses decreased after a three-month stay in Argentina, whereas no significant change was found in the rate of pausing at unit boundaries (see ). However, in an analysis of dialogue and monologue data collected after 6 weeks and again after 10 weeks spent abroad, Tavakoli () reported no change in the number of pauses at either clause-external or at clause-internal positions. Taken together, most evidence suggests that a SA may lead to a reduction in both the number and the length of clause-internal pauses, but not necessarily to changes in inter-clause pausing behavior.

2.2. The development of productive vocabulary knowledge in a L2

Vocabulary knowledge covers knowledge about a word’s form, meaning, and use, and this knowledge can be either receptive or productive (see ). Receptive – or passive – knowledge covers such abilities as recognizing a word, understanding it in the input, and knowing in what situations one can expect to encounter it. Productive – or active – vocabulary knowledge involves, among other things, being able to pronounce a word, to call it up when the need arises, and to select appropriate collocates. Receptive knowledge precedes productive knowledge and, as a result, a speaker’s receptive knowledge store is larger than their productive one (). Although researchers recognize a difference in receptive and productive knowledge, most research into vocabulary knowledge assessment has focused on the former (). However, and as pointed out by Foster (), when researching a potential connection between vocabulary knowledge and oral fluency, it is logical – and indeed important – to focus on productive vocabulary knowledge, as this is the vocabulary that can be mobilized by the speaker in production. In what follows, we briefly present how productive vocabulary knowledge has been measured, and then we report findings on the development of productive vocabulary knowledge during a SA.

2.2.1. Measuring productive vocabulary knowledge

In the assessment of productive vocabulary knowledge, a distinction can be made between controlled and free knowledge (). On the one hand, controlled refers to when a speaker demonstrates productive vocabulary knowledge by responding to a direct elicitation (e.g., providing a L2 translation in response to a L1 cue). Free productive vocabulary knowledge, on the other hand, is demonstrated when a speaker produces a word spontaneously. As suggested by Foster (), it may be useful to include both types of approaches in studies of the relationship between productive vocabulary knowledge and oral fluency. Indeed, although free productive vocabulary may be most readily available in production, previous research has demonstrated a positive relationship between oral fluency and controlled productive vocabulary knowledge (; ). In the current study, we followed Foster’s suggestion and included three measures of productive vocabulary knowledge: one controlled (Lex30) and two free measures (one targeting lexical diversity, the other lexical sophistication). In what follows, these measures will be presented.

Lex30 assesses productive vocabulary knowledge using a word association format (see , for additional details on the design and scoring of this task). Participants are presented with a list of 30 stimulus words and asked to provide the first four words that come to their minds upon seeing each stimulus. The responses to the stimuli (120 max = 30 cues × 4 responses) are lemmatized and subjected to a frequency analysis. One point is given for each response outside the first 1,000 most frequent words (except for proper nouns, which receive no points). The final score can be represented as the percentage of infrequent responses (e.g., ; ), with higher percentages interpreted as reflecting a larger productive vocabulary. While research continues to explore the reliability and the validity of Lex30 (see ), authors such as Fitzpatrick and Clenton () have argued that this task represents an improvement over other controlled measures of productive vocabulary for two major reasons: (a) the use of 30 stimuli allows participants to showcase lexical knowledge in a variety of semantic areas and (b) the word association format allows participants to provide responses for which they have only partial lexical knowledge.
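By way of illustration, the scoring rule just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the scoring script used for Lex30: the responses are assumed to be lemmatized already, and the function name `lex30_score`, the frequency list and the proper-noun list are hypothetical placeholders invented for the example.

```python
def lex30_score(responses, first_1k_lemmas, proper_nouns=frozenset()):
    """Score lemmatized Lex30 responses.

    Returns (points, percentage of infrequent responses).
    One point is given for each response that is neither among the
    1,000 most frequent words nor a proper noun.
    """
    # Normalize case and drop empty answers (assumption: blanks are not responses).
    cleaned = [r.strip().lower() for r in responses if r.strip()]
    points = sum(
        1
        for r in cleaned
        if r not in first_1k_lemmas and r not in proper_nouns
    )
    percentage = 100 * points / len(cleaned) if cleaned else 0.0
    return points, percentage


# Toy data, invented for illustration only (not a real frequency list).
FIRST_1K = {"dog", "water", "house"}
PROPER = {"paris"}
answers = ["dog", "leash", "kennel", "Paris", "water"]
points, percentage = lex30_score(answers, FIRST_1K, PROPER)
```

In this toy case, only "leash" and "kennel" earn a point, so two of the five responses (40%) count as infrequent.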

Measures of free productive vocabulary knowledge are based on spontaneous productions, which are analyzed with respect to different variables, such as lexical diversity and lexical sophistication. Measures of lexical diversity assume that greater variety (and, as such, less repetition) indicates greater productive vocabulary knowledge. Many researchers have discussed the merits of different lexical diversity measures, criticizing measures such as the type/token ratio for their sensitivity to text length (e.g., ). These discussions have led to the development of other lexical diversity measures, including the widely used D score (). This measure was designed to neutralize the sensitivity to production length by calculating the type/token ratio for random samples from a production. The final D score corresponds to a random-sampling type/token ratio for the production, with higher scores indicating greater lexical diversity. Like lexical diversity, lexical sophistication is also evaluated by examining spontaneous productions. Measures of lexical sophistication focus on the distribution of words in a text as a function of their frequency in the target language. A production containing more infrequent words or word families is considered more lexically sophisticated. Assessments of lexical sophistication thus generally take the form of lexical frequency profiling (; ).
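The random-sampling logic behind D can be made concrete with a rough sketch. To our understanding, the published vocd procedure averages the type/token ratio (TTR) over random samples of 35 to 50 tokens and then finds the value of D for which the curve TTR = (D/N)[(1 + 2N/D)^(1/2) − 1] best fits those averages. The code below approximates this with a simple grid search; the function names, the grid, the number of trials and the seed are all assumptions made for the example, and the output should not be expected to match the D scores returned by CLAN.

```python
import random


def mean_ttr(tokens, n, trials, rng):
    """Average type/token ratio over random samples of n tokens."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(tokens, n)  # sampling without replacement
        total += len(set(sample)) / n
    return total / trials


def model_ttr(d, n):
    """TTR predicted for a sample of n tokens given diversity parameter d."""
    return (d / n) * ((1 + 2 * n / d) ** 0.5 - 1)


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Grid-search the d that best fits the observed mean TTR curve."""
    if len(tokens) < max(sizes):
        raise ValueError("text is shorter than the largest sample size")
    rng = random.Random(seed)
    observed = {n: mean_ttr(tokens, n, trials, rng) for n in sizes}

    def squared_error(d):
        return sum((observed[n] - model_ttr(d, n)) ** 2 for n in sizes)

    candidates = [x / 10 for x in range(10, 2001)]  # 1.0, 1.1, ..., 200.0
    return min(candidates, key=squared_error)


# Two invented 200-token "texts": one repetitive, one varied.
repetitive = [f"w{i % 10}" for i in range(200)]   # 10 types
varied = [f"w{i % 100}" for i in range(200)]      # 100 types
d_repetitive = estimate_d(repetitive)
d_varied = estimate_d(varied)
```

Because the fitted curve rises with D, the more varied text receives the higher estimate, which is the property that the measure relies on.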

2.2.2. The development of productive vocabulary knowledge during a SA

Looking first at controlled productive vocabulary knowledge, Fitzpatrick () conducted a case study of a participant who completed Lex30 six times during an eight-month SA experience. Fitzpatrick modified the Lex30 scoring procedure in order to track the micro-development of vocabulary, and reported (among other things) that the increase in number of responses, percentage of native speaker-like responses, and number of collocational responses signaled a positive impact of the SA context. Leonard and Shea () assessed controlled productive vocabulary knowledge of a group of L2 learners of Spanish using a 30-item vocabulary test. The results indicated significant development in vocabulary knowledge after a three-month stay in Argentina (see ).

Results with respect to changes in free productive vocabulary knowledge are mixed, both for lexical diversity and for lexical sophistication. Whereas Foster (), Serrano et al. (), Lara (), Tavakoli (), and McManus et al. () reported that participants showed greater lexical diversity after a SA, Lara (), Mora and Valls-Ferrer (), and Leonard and Shea () found no significant evolution. Turning to lexical sophistication, Laufer and Paribakht () used the Lexical Frequency Profile to assess written productions from Israeli students studying English either in Israel or in Canada. Results showed an advantage in lexical sophistication for the students who remained in Israel. However, significant gains in lexical sophistication for learners participating in a SA were reported by Tracy-Ventura () using frequency profiling and by Leonard and Shea using Guiraud’s advanced index. Leonard and Shea interpreted their finding as indicating that the SA may provide learners with “an opportunity to incorporate a greater number of low-frequency words into their active vocabulary” (p. 188).

2.3. Exploring fluency and productive vocabulary development together in a SA context

A review of the literature reveals three studies that have explored potential concomitant development of oral fluency and productive vocabulary over a SA: Leonard and Shea (), McManus et al. (), and Mora and Valls-Ferrer (). However, these three studies arrived at contradictory conclusions, highlighting the need for additional research. In Leonard and Shea’s () study, none of the regression analyses identified a significant relationship between fluency and lexis, and Mora and Valls-Ferrer () reported a significant evolution in fluency but not in lexical diversity. In contrast, McManus et al. () reported that the SA contributed to the increase in both, leading them to conclude that their results “showed significant and long-lasting relationships between fluency and lexis” (p. 158).

Thus, although much research has been devoted to either fluency or lexical development during a SA, little attention has been given to the relationship between these two components of linguistic competence, a gap that was identified by Foster () for second language acquisition research in general. Foster went further, identifying several ways in which future research may profitably build on the existing body of knowledge. With respect to fluency, she highlighted the importance of analyzing longer stretches of oral production and the need to diversify task types. Measurements based on longer extracts increase our confidence in the representativeness of the analyzed sample (see also ). As for task types, previous research is dominated by laboratory-based monologue tasks that tend to allow time for pre-task macro-level planning, which may influence fluency. Interactive tasks, which elicit more spontaneous speech, are relatively rare, leaving open the question whether previous findings also apply to spontaneous speech. As concerns productive vocabulary knowledge, Foster underscored the complexity of this construct, and suggested that researchers may do well to consider measures of both controlled and free productive vocabulary knowledge. In the current study, we heed Foster’s call for more research and implement these three recommendations. The research questions that guided this study were as follows:

To what extent does oral fluency change for five Francophone learners of L2 English after an academic year in a SA context?
To what extent does productive vocabulary change for five Francophone learners of L2 English after an academic year in a SA context?

After responding to these two questions, we will reflect on the potential parallel development of the two competencies under study.

3. Method

3.1. Participants

The participants were five 18-/19-year-old students (A, C, M, N, Y) in their first or second year of a degree course in Applied Foreign Languages in France. They were native speakers of French (Y was also a native speaker of Turkish, but French was his dominant language) who studied English in addition to a second L2 (Arabic, Chinese, Italian or Spanish). The participants’ English proficiency, measured by means of the Oxford Quick Placement Test before the SA experience, was found to be either lower intermediate (A, C) or advanced (M, N, Y). Among the five participants, three were female (A, C, M), and two were male (N, Y). The participants had mainly studied English in a formal context and, at the outset of the project, had never spent a long period in an English-speaking country. They spent an academic year (i.e., nine months) in Ireland (A, C, M, N) or England (Y) surrounded by native and non-native speakers of English in both formal (i.e., lectures and seminars at the host university) and informal (e.g., transport, job, supermarket) contexts, as part of the Erasmus+ programme.

3.2. Data collection

Data were collected before participants’ departure (June 2018), three times during their SA, and after their return to France (June 2019) as part of the PROLINGSA corpus. Only pre-stay and post-stay data will be analyzed in the present article, thus spanning a period of one year. At these two sessions, participants took part in a semi-guided interview and completed the Lex30 test. For each interview, a different set of questions was used, and discussions revolved around the participants’ expectations, experiences, and observations in connection with their SA. Using a different set of questions for each interview ensured that discussions represented spontaneous speech. The interviews were conducted by one of two researchers and transcribed using CHAT conventions (). Interviews varied in length: The shortest interview was 18 minutes 34 seconds (M, pre-stay interview) while the longest was 53 minutes 47 seconds (Y, post-stay interview). The total speaking time for each participant (including pauses) also varied widely (see Table 1 for details).

Table 1

Length of pre-stay and post-stay interviews (minutes:seconds).


PARTICIPANT	PRE-STAY		POST-STAY

	TOTAL DURATION	PARTICIPANT SPEAKING TIME	TOTAL DURATION	PARTICIPANT SPEAKING TIME

A	27:00	11:07	34:16	23:32

C	26:49	11:05	23:27	10:04

M	18:34	8:05	20:51	11:26

N	23:55	11:06	37:21	24:47

Y	44:24	30:20	53:47	40:33

The administration of the Lex30 test involved a PowerPoint presentation during which the 30 stimuli were shown for 30 seconds each. The participant was instructed to write down up to four English words that came to mind upon seeing each stimulus (see the Appendix for Y’s pre-stay Lex30 test).

3.3. Data coding

3.3.1. Fluency

In order to maximize the length of extracts analyzed, we aligned our analysis on the shortest speaking time for any participant in the corpus under study. As seen in Table 1, the shortest speaking time occurred in M’s pre-stay interview, where this learner spoke for 8 minutes 05 seconds. We thus analyzed a total of eight minutes for each participant from each interview. The eight minutes were counted starting from the response provided to the first open-ended question.

Phonetic coding was carried out in PRAAT (). The eight-minute extracts from each interview were saved as separate audio files, which PRAAT first annotated automatically, producing TextGrids with silent and sounding intervals. These TextGrids were then manually coded in order to identify all silent pauses of 250 ms or longer, following the threshold identified by De Jong and Bosker () for L2 fluency research. Finally, each transcription was divided into units for the analysis of pause location. Following De Jong et al. () and Huensch and Tracy-Ventura (), we segmented all transcripts into analysis of speech units (ASUs), defined as “a single speaker’s utterance that consists of either an independent clause, or sub-clausal unit, with any subordinate clause” (). For this analysis, we followed the coding conventions developed by Foster et al.

For the purposes of this study, we used three speed fluency and four breakdown fluency measurements (see Table 2). The breakdown fluency measures focused on silent pauses, which have been reported to be strongly correlated with fluency ().

Table 2

Utterance fluency measures.


SPEED FLUENCY	DEFINITION

SR	number of syllables divided by number of seconds (including pauses)

AR	number of syllables divided by number of seconds (excluding pauses)

MLoR	total number of syllables divided by total number of runs

BREAKDOWN FLUENCY

Within ASU

Mean length of silent pauses	average length of silent pauses longer than 250 ms

Number of silent pauses	number of silent pauses longer than 250 ms

Between ASU

Mean length of silent pauses	average length of silent pauses longer than 250 ms

Number of silent pauses	number of silent pauses longer than 250 ms
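To make the definitions in Table 2 concrete, the following minimal sketch computes the seven measures from a list of runs and a list of silent pauses. It is an illustration under stated assumptions rather than the procedure used in the study: the `Run` and `Pause` structures and the function name are hypothetical, syllable counts and pause positions are assumed to have been coded by hand, and every pause passed in is assumed to have already met the 250 ms threshold (shorter silences being treated as part of the surrounding run).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    duration_s: float  # phonation time of the run, in seconds
    syllables: int     # syllables uttered in the run


@dataclass
class Pause:
    duration_ms: int   # silent pause length, in milliseconds
    position: str      # "within" or "between" (relative to ASUs)


def fluency_measures(runs, pauses):
    """Return SR, AR, MLoR and pause statistics by position."""
    syllables = sum(r.syllables for r in runs)
    phonation_s = sum(r.duration_s for r in runs)
    pause_s = sum(p.duration_ms for p in pauses) / 1000
    measures = {
        "SR": syllables / (phonation_s + pause_s),  # pauses included
        "AR": syllables / phonation_s,              # pauses excluded
        "MLoR": syllables / len(runs),              # syllables per run
    }
    for position in ("within", "between"):
        lengths = [p.duration_ms for p in pauses if p.position == position]
        measures[f"n_{position}"] = len(lengths)
        measures[f"mean_ms_{position}"] = mean(lengths) if lengths else None
    return measures


# Invented toy extract: three runs separated by two pauses.
runs = [Run(2.0, 8), Run(1.5, 6), Run(2.5, 10)]
pauses = [Pause(500, "within"), Pause(1500, "between")]
result = fluency_measures(runs, pauses)
```

For this toy extract, 24 syllables over 6 seconds of phonation and 2 seconds of pausing give an SR of 3.0, an AR of 4.0 and an MLoR of 8.0 syllables.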

3.3.2. Productive vocabulary

Productive vocabulary knowledge was assessed in three ways. Beginning with controlled productive knowledge, Lex30 responses were lemmatized and then subjected to a frequency analysis using the Corpus of Contemporary American English (Corpus, n.d.). Responses which were located within the first 1,000 most frequent words or that corresponded to proper nouns were given 0 points; all other responses received one point. The number of infrequent responses was then divided by the total number of responses, yielding the percentage of infrequent responses. Free productive vocabulary knowledge was examined with respect to lexical diversity and lexical sophistication. Lexical diversity was assessed using the VocD command in CLAN, which returns a D score. An indication of lexical sophistication was obtained by determining the percentage of word families in a given interview that were infrequent. In this analysis, any word that lies beyond the first 1,000 most frequent word families of the language (as determined using the COCA) was considered infrequent. To carry out this analysis, we used the Compleat Vocabprofiler (https://www.lextutor.ca/vp/comp/).
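The lexical sophistication index described above amounts to a simple proportion, which can be sketched as follows. This is a hedged illustration and not the algorithm implemented in the Compleat Vocabprofiler: the function name, the word-to-family mapping and the list of first-1,000 families are hypothetical stand-ins, and words absent from the mapping are simply treated as their own family.

```python
def infrequent_family_profile(tokens, family_of, first_1k_families):
    """Return (# infrequent families, # families, % infrequent families)."""
    # Map each token to its word family; unknown words head their own family.
    families = {family_of.get(t.lower(), t.lower()) for t in tokens}
    if not families:
        return 0, 0, 0.0
    infrequent = families - first_1k_families
    percentage = 100 * len(infrequent) / len(families)
    return len(infrequent), len(families), percentage


# Toy data, invented for illustration only.
FAMILY_OF = {"went": "go", "going": "go", "lectures": "lecture"}
FIRST_1K_FAMILIES = {"i", "go", "to", "the"}
tokens = ["I", "went", "going", "to", "the", "campus", "lectures", "lecture"]
n_infrequent, n_families, pct = infrequent_family_profile(
    tokens, FAMILY_OF, FIRST_1K_FAMILIES
)
```

Here the eight tokens reduce to six word families, two of which ("campus" and "lecture") lie beyond the toy 1K list, giving 2/6 or roughly 33.3%.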

3.4. Data analysis

The present study is based on five case studies, which precludes the use of inferential statistics. For our analysis, we present detailed descriptive results for each of the participants, and draw attention to important trends in the data.

4. Results

4.1. Fluency results

Beginning with speed fluency, we note that all five participants show gains between pre-stay and post-stay (Table 3). In order to facilitate comparisons among the participants, we have included percentage change for each measure. Although all participants show change in the direction of faster speech, different individual profiles are visible. On all three speed fluency measures, C shows the lowest percentage gain. N also shows relatively moderate change, especially as measured using AR and SR. More striking changes are visible in the data from A, M and Y. For example, their average MLoR more than doubled from pre-stay to post-stay interviews.

Table 3

Speed fluency results.


PARTICIPANT	SR^a			AR^a			MLoR

	PRE-STAY	POST-STAY	% CHANGE	PRE-STAY	POST-STAY	% CHANGE	PRE-STAY	POST-STAY	% CHANGE

A	2.24	3.28	46.6	2.75	3.70	34.6	6.39	13.92	117.5

C	2.05	2.49	21.9	2.76	2.98	7.9	5.33	7.02	31.6

M	2.75	3.64	32.3	3.49	4.10	17.4	6.70	14.80	120.7

N	2.28	3.04	33.3	3.39	3.69	8.8	5.59	8.63	54.4

Y	2.94	4	36.5	3.86	4.52	17	6.53	15.23	133.2

^a Calculated as syllables per second.

Breakdown fluency was assessed by determining the number of pauses (> 250 ms) and their average length, either within an ASU or between ASUs (Table 4). Considering first the results for pause length, we note that all participants used shorter pauses at post-stay than at pre-stay, and that this was true for both pause positions. Results for the number of pauses showed differences both as a function of pause position and across individuals. First, we note that all participants reduced the number of > 250 ms pauses within ASUs at post-stay, whereas the number of between ASU pauses either remained approximately the same or increased. Looking at the individual trends, these longitudinal data reveal three different pause profiles. The first profile concerns three participants (A, M, and N), who showed clear reduction in the number of within ASU pauses and a concomitant increase in between ASU pauses after a year abroad. Whereas at pre-stay, these participants’ productions showed between 3.5 (N) and 5.7 (M) times as many within ASU pauses as between ASU pauses, this ratio had decreased to between 1.2 (N) and 1.5 (A) at post-stay. The second profile is exemplified by Y, who overall showed reduction in the number of pauses; this reduction was dramatic for within ASU pauses and slight for between ASU pauses, leading to a use of more between than within ASU pauses at post-stay, a finding which is unique in this dataset. Finally, C’s data reveal only a slight decrease in the number of within ASU pauses, and no change in the number of between ASU pauses over time.

Table 4

Breakdown fluency results: Silent pauses.


PARTICIPANT	MEAN LENGTH WITHIN ASU^a			MEAN LENGTH BETWEEN ASU^a			NUMBER WITHIN ASU			NUMBER BETWEEN ASU

	PRE-STAY	POST-STAY	% CHANGE	PRE-STAY	POST-STAY	% CHANGE	PRE-STAY	POST-STAY	% CHANGE	PRE-STAY	POST-STAY	% CHANGE

A	667.49	563.27	–15.6	789.06	730.46	–7.4	77	43	–44.1	15	28	86.6

C	867.31	538.71	–37.8	1267.76	767.80	–39.4	83	74	–10.8	21	21	0

M	630.36	549.46	–12.8	855.73	618.23	–27.7	109	45	–58.7	19	34	78.9

N	783.07	572.14	–26.9	1349.35	698.56	–48.2	108	62	–42.5	31	53	70.9

Y	647.3	535.06	–17.3	737.97	585.61	–20.6	104	29	–72.1	44	42	–4.5

^a Expressed in milliseconds.
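The within-to-between ratios cited for the three pause profiles can be recomputed directly from the pause counts in Table 4. The short sketch below does so; the counts are those reported in the table, whereas the variable and function names are ours and purely illustrative.

```python
# Pause counts from Table 4: (within ASU, between ASU) at each session.
PAUSE_COUNTS = {
    "A": {"pre": (77, 15), "post": (43, 28)},
    "C": {"pre": (83, 21), "post": (74, 21)},
    "M": {"pre": (109, 19), "post": (45, 34)},
    "N": {"pre": (108, 31), "post": (62, 53)},
    "Y": {"pre": (104, 44), "post": (29, 42)},
}


def within_between_ratio(within, between):
    """Number of within-ASU pauses per between-ASU pause."""
    return within / between


# Ratio per participant and session, rounded to one decimal place.
ratios = {
    participant: {
        session: round(within_between_ratio(*counts), 1)
        for session, counts in sessions.items()
    }
    for participant, sessions in PAUSE_COUNTS.items()
}
```

A ratio below 1, as obtained for Y at post-stay, indicates more between ASU than within ASU pauses.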

4.2. Productive vocabulary results

We begin with the results from Lex30, the measure of controlled productive vocabulary knowledge (Table 5). At pre-stay, the five participants provided between 33.3% (A) and 60% (N) infrequent responses when asked to react to the 30 stimuli. One year later, these percentages had increased for four of the participants (A, C, M, and Y), whereas the percentage of infrequent responses provided by N (the highest scorer at pre-stay) had clearly decreased (to 46.5%).

Table 5

Controlled productive vocabulary measure: Percentage infrequent responses (Lex30).


PARTICIPANT	PRE-STAY	POST-STAY

A	33.3	38.3

C	36.2	43.8

M	49.57	56.6

N	60	46.5

Y	41.5	44.5

Free productive vocabulary knowledge was evaluated by assessing the lexical diversity and lexical sophistication of participants’ oral productions at pre-stay and post-stay (Table 6). Lexical diversity (as measured with D) registered little change for four of the participants (A, C, N, and Y), with pre-stay and post-stay scores being within three points of each other. The only participant who showed clear evolution over time – M – went in the direction of less lexical diversity at post-stay as compared to pre-stay. The measure of lexical sophistication also showed little overall change for most of the participants (A, C, M, and Y), for whom the percentage of infrequent word families differed by between 1.3 (for M) and 2.8 (Y) percentage points between the two interviews. The results for N, on the other hand, showed a clearly higher percentage of infrequent word families at post-stay (22.1%) as compared to pre-stay (14.7%), indicating greater lexical sophistication after the SA.

Table 6

Free productive vocabulary measures.


PARTICIPANT	PRE-STAY			POST-STAY

	Word families beyond 1K band			Word families beyond 1K band

	#	%	D	#	%	D

A	47/222	21.2	52.21	75/394	19	52.99

C	39/241	16.2	61	40/268	14.9	63.86

M	49/247	19.8	71.68	58/314	18.5	57.98

N	38/258	14.7	66.97	93/421	22.1	65.05

Y	127/433	29.3	63.51	196/610	32.1	60.61

5. Discussion

The goal of the present study was to investigate the extent to which oral fluency and productive vocabulary potentially evolve in parallel as a result of an academic year in a SA context. Our study makes a number of important contributions to the literature, for both fluency and lexis, by taking into account longer (i.e., eight-minute) extracts of L2 speech produced in an interactive task, and using both controlled and free tasks to measure productive vocabulary. Moreover, our study stands out insofar as we detail individual trajectories for five speakers. The decision to focus on case studies may seem to go against the tide, as it is currently common for researchers to call for larger-scale studies and to highlight the need for large sample sizes for appropriate generalization. However, we agree with Duff (), who argued that case studies have much to contribute to the field of applied linguistics, not only with respect to hypothesis generation, but also in the building and critiquing of theory. The present approach provides a relevant complement to group-based analyses presented in previous research and, importantly, allows us to explore individual patterns. The importance of individual patterns will become clear as we answer our research questions.

Our first research question examined the development of fluency, specifically speed fluency (SR, AR and MLoR) and breakdown fluency (number and duration of within- and between-ASU silent pauses). Our findings provide further support for the extensive body of literature demonstrating speed fluency changes as a result of SA. To different degrees, all of our participants show changes in the expected direction on all three measures; however, the changes observed for AR are rather modest compared to SR and MLoR. This is perhaps not surprising given that AR has been argued to be a pure measure of speed fluency, whereas SR and MLoR may represent composite measures of general utterance fluency (see ), as they incorporate other potentially additive aspects of fluency, such as breakdown fluency. Indeed, our results also show changes in breakdown fluency. Contra previous research, which either found no change in pause length over time () or significant shortening only for within-ASU pauses (), all of our participants show decreases in the average length of pauses both within and between ASUs. The fact that our findings depart from previous research may be due to the differences in our methodology. Whereas participants in both Towell et al.’s and Huensch and Tracy-Ventura’s studies completed a story retelling task, our participants were engaged in a semi-guided interview. Relevant evidence for a task effect comes from Tavakoli (), who compared productions by the same speakers engaged in monologic and dialogic tasks. The author found that the mean length of pauses was statistically shorter in the dialogues, but that there was little difference concerning number or location of the pauses. Tavakoli suggests that the shorter pauses in dialogues may reflect the fact that participants engaged in such tasks (which would include semi-guided interviews of the type used in our study) benefit from “listening time” (p. 146), which may facilitate both macro-planning and formulation of the message. Our breakdown fluency results may then reflect an evolution in the five learners’ ability to effectively make use of this built-in listening time after their SA, an evolution that would not be expected in monologic tasks.
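The contrast drawn above between AR as a pure speed measure and SR and MLoR as composite measures can be illustrated with a minimal sketch. It assumes commonly used operationalizations, which may differ in detail from those applied in the present study: SR as syllables per minute of total time, AR as syllables per minute of phonation time (total time minus silent pauses of at least 250 ms), and MLoR as the mean number of syllables between such pauses. The `Segment` class, the function name and the data are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass

PAUSE_THRESHOLD_S = 0.25  # silences shorter than 250 ms are not counted as pauses


@dataclass
class Segment:
    """One stretch of a recording: speech (syllables > 0) or silence (syllables == 0)."""
    duration_s: float
    syllables: int = 0


def fluency_measures(segments):
    """Return SR, AR (syllables per minute) and MLoR (syllables per run)."""
    total_time = 0.0
    pause_time = 0.0
    total_syllables = 0
    runs = []          # syllable count of each run between counted pauses
    current_run = 0
    for seg in segments:
        total_time += seg.duration_s
        is_pause = seg.syllables == 0 and seg.duration_s >= PAUSE_THRESHOLD_S
        if is_pause:
            pause_time += seg.duration_s
            if current_run > 0:      # a counted pause closes the current run
                runs.append(current_run)
                current_run = 0
        else:
            # Speech, or a sub-threshold silence that stays inside the run
            current_run += seg.syllables
            total_syllables += seg.syllables
    if current_run > 0:
        runs.append(current_run)
    if not runs:
        raise ValueError("no speech found in segments")
    phonation_time = total_time - pause_time
    return {
        "SR": total_syllables / total_time * 60,
        "AR": total_syllables / phonation_time * 60,
        "MLoR": total_syllables / len(runs),
    }


# Invented data: the same 24 syllables articulated at the same speed,
# first with two 1-second pauses, then with a single 0.5-second pause.
pre = [Segment(2.0, 8), Segment(1.0), Segment(2.0, 8), Segment(1.0), Segment(2.0, 8)]
post = [Segment(4.0, 16), Segment(0.5), Segment(2.0, 8)]

pre_scores = fluency_measures(pre)
post_scores = fluency_measures(post)
```

In this invented example, AR is identical in both samples because articulation speed has not changed, whereas SR and MLoR both increase solely because pauses are fewer and shorter. This is the sense in which the two latter measures also absorb changes in breakdown fluency.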

Our findings related to the number of pauses are more difficult to interpret because our participants exhibited different behaviors. In this respect, our results mirror previous research, which has reported contradictory results (; ; ). In our study, three participants showed a robust decrease in the number of within-ASU pauses and a parallel increase in the number of between-ASU pauses; one participant showed a slight decrease in within-ASU pauses, but no change in the number of pauses between ASUs; and one participant showed an overall reduction in the number of pauses, dramatic for within-ASU and slight for between-ASU. Although all three profiles demonstrate a decrease in within-ASU pauses, four of the learners still used more within- than between-ASU pauses post-SA; only Y displayed the typical profile of native speakers with more between- than within-ASU pauses (see ; ).

Our second research question examined the development of productive vocabulary as a result of SA, using both controlled and free measures. Our controlled productive vocabulary measure – Lex30 – revealed gains in productive vocabulary for four out of five participants, providing support for previous research that (albeit using different measures) generally reports positive change in controlled productive vocabulary knowledge after SA (see ; ; ). However, one of our five participants showed the opposite pattern: N, the participant with the highest pre-SA score, actually showed regression on his post-SA test. Whether this result truly indicates loss of productive vocabulary knowledge remains unclear, given the findings for the same participant on lexical sophistication. Unlike the four other participants, who exhibited little change on this measure, N demonstrated a substantial evolution in lexical sophistication, producing a considerably higher percentage of infrequent word families post-SA (14.7% → 22.1%). Concerning lexical diversity, we again found little change for four out of five participants; the exception being M, who demonstrated considerably less lexical diversity post-SA. Taken together, the findings from the three productive vocabulary measures show various trajectories after a SA: Three speakers (A, C, Y) showed gains in controlled but not free productive knowledge, M increased controlled knowledge but showed reduced lexical diversity, and finally N obtained a lower controlled vocabulary score all the while demonstrating increased lexical sophistication.

Our vocabulary results are thus somewhat puzzling, mainly because they generally go against the findings of previous research on vocabulary development and SA. We hypothesize that this pattern of results may reflect the interview task itself, and/or perhaps more general features of spoken language. First, although the interview task always followed the same general format and gave our learners the freedom to exploit whatever vocabulary they found most appropriate, the questions were not the same each time. Perhaps if they had had to address exactly the same questions, we would have seen development since the semantic fields mobilized in order to respond would have been the same. A second potential artifact of the task concerns the increasing degree of familiarity between the students and the interviewers. At the time of the last interview, the participants were being interviewed for the fifth time. The students and interviewers had had a great deal of contact over the SA period and, consequently, the nature of the task had become more informal. It seems unlikely that the same incentive to showcase diverse and infrequent words would be present in later interviews. Additionally, we observed the marked development of common features of spoken language over the course of SA, such as the frequent use of discourse markers (see ). Discourse markers tend to involve highly frequent words and their use introduces repetition, which would necessarily result in lower scores for lexical diversity and sophistication. Finally, it also seems plausible that an important part of our participants’ vocabulary development took place in the university context, and may thus represent more specialized lexis relevant to their domains of study. This is not the type of vocabulary that would necessarily be elicited in completing Lex30, or in interviews that ask the students to reflect upon their SA experience, so our measures may underestimate our participants’ progress.

Turning now to the parallel development of fluency and productive vocabulary, our results suggest that these two components of linguistic competence followed rather different developmental trajectories as a result of SA for our five participants. We generally see strong evidence for fluency development, but more mixed indicators of growth in productive vocabulary. As discussed previously, several researchers consider that breakdown fluency measures reflect lexical organization and access (within-unit pauses) and macro-planning (between-unit pauses). And, according to Foster (), “it can be argued that the more a person can draw on free productive knowledge, the more fluently that person is able to speak” (p. 453). If this is the case, it is remarkable that we see clear change in pausing (both in length and in number), but relatively little evidence of an increase in productive vocabulary knowledge. We see two explanations for this combination of findings. The first interpretation that is consistent with our findings would consider that speed and breakdown fluency show gains independent of the predicted concomitant changes in productive vocabulary knowledge. Under this scenario, the reduction in length and number of pauses may above all reflect the greater automatization and efficiency of lexical access with respect to the learners’ existing lexicons. In other words, the SA may provide opportunities for these relatively advanced speakers to practice using their L2, leading to increased automatization (see ; ). For the second interpretation, we assume that despite having employed three measures of productive vocabulary, we were not successful in fully capturing the lexical advances made by our participants during their year abroad.
In this interpretation, the productive vocabularies of the five learners have in fact increased, and the detected changes in pausing would reflect this change, insofar as a larger vocabulary can facilitate planning and formulation of a speaker’s message, resulting in fewer and shorter pauses. Teasing apart these two interpretations will require research projects in which productive vocabulary knowledge is tested more extensively. However, results from one of our participants – Y – provide tentative support for a connection between productive vocabulary knowledge and fluency. Y was the only participant to have more between- than within-ASU pauses at post-SA. Although we do not see a dramatic increase in his productive vocabulary knowledge from pre- to post-SA, it is worth mentioning that he produced the highest percentage of infrequent word families (by far), both before and after SA. The low number of within-ASU pauses at post-stay suggests that Y does not have problems with lexical access, or if he does, he is able to resolve them faster than our pause threshold of 250 ms. It may be that his greater free vocabulary knowledge – visible in his higher lexical sophistication – contributes to this difference relative to the other participants.

6. Conclusion

Our findings show clear evidence of changes in fluency, but more mixed evidence of change in productive vocabulary for five Francophone learners of English after a nine-month stay in an Anglophone environment. On the basis of seven fluency measures and three productive vocabulary measures, we see some instances of consensus (i.e., the speed fluency results), but most of the time, we see individual differences. These individual differences notably call into question a clear parallel patterning of the development of oral fluency and productive vocabulary over the course of a SA. Although it may be the case that our productive vocabulary measures were unsuccessful in detecting actual change (an issue we leave to future research), these results also underscore the importance of looking past aggregate group results in order to verify generalizations against individual profiles.


CUE	RESPONSES
1. attack	bomb	terrorist	nightmare	awful
2. board	class	pupil	hotel	writing
3. close	open	door	mouth	shop
4. cloth	wear	pants	textile	protection
5. dig	soil	ground	agriculture	earth
6. dirty	clean	rubbish	insalubrious	messy
7. disease	illness	sick	sore throat	medecine
8. experience	novelty	personal	travel	awesome
9. fruit	passionfruit	banana	vegetable	pomegranate
10. furniture	rug	desk	room	cupboard
11. habit	custom	conservatory	traditional	time
12. hold	paper	secret	pencil	tight
13. hope	magic	persevere	stars	life
14. kick	mean	rugby	football	ball
15. map	geography	travel	freedom	Turkey
16. obey	order	submission	respectful	good
17. pot	melting	cultures	difference	–
18. potato	tomato	vegetable	yellow	heavy
19. real	fake	realistic	hard	suffer
20. rest	vacation	good	breathe	lay
21. rice	india	China	Japan	–
22. science	respect	awesome	philosophy	brain
23. seat	chair	down	movie	–
24. spell	word	sorcerer	witch	potion
25. substance	drug	toxic	unknown	colour
26. stupid	intelligent	poor	meannes	inferiority
27. television	stupid	games	politics	waste
28. tooth	toothpaste	fragile	tongue	white
29. trade	commerce	economics	exchange	world
30. window	door	room	house	wind

Journal of the European Second Language Association

Research