Compound processing in second language acquisition of English

Serkan Uygun; Ayşe Gürel

1. Introduction

The processing of monomorphemic and multimorphemic words (both inflected and derived) has been at the forefront of psycholinguistic research over the past four decades (see , for a review). As a cross-linguistically widespread method of complex word formation, compounding has also taken a central place in this line of research (e.g. ). Compounds have unique characteristics that have enabled researchers to look into subtleties of the mental lexicon. For example, unlike derived and inflected forms, which contain bound morphemes in predictable positions, compounds involve the combination of two or more (free) morphemes. This makes it possible to examine complex word processing without affix-related confounds. Also, as Libben () notes, affixes comprise a closed set, and thus they can be more easily isolated than compounds, which are typically composed of multiple open-class roots. Additionally, the position of constituents in compounds is not always predictable (e.g., bookworm versus yearbook). This provides an opportunity to test the contribution of headedness in compound processing. Furthermore, when two lexemes are combined to create a new word, the lexeme(s) can either retain or lose the original meaning(s), leading to compounds with varying degrees of transparency. These inherent characteristics of compounds have made it possible to investigate a range of issues, such as the role of constituency/headedness and semantic transparency in processing complex words ().

Decades of research on the mental lexicon have produced many different models of morphologically complex word processing (see , for a review). The models differed from one another in terms of the extent of morphological decomposition they assumed to take place, and the factors playing a role in this decompositional process. Going back to pioneering research in this field, Taft and Forster () proposed a processing pattern whereby individual morphemes in a complex word are activated prior to lexical access. For compound processing, this model predicts prelexical activation of compound constituents. For example, when processing the compound blueberry, each constituent (i.e., blue and berry) is retrieved from the lexicon. Therefore, the whole-word meaning is accessed by combining the meanings of its constituents. Consequently, compound processing might be slower than monomorphemic word processing. As a competing view, the Full-listing Model () posited that multimorphemic words are not decomposed into their constituents but stored as whole units. For instance, the compound blueberry has a separate entry as a whole unit in the mental lexicon. This model predicts no difference between the representation of monomorphemic and compound words.

In contrast to these two extreme views, subsequent research on inflected and derived forms has revealed various factors such as (constituent/morpheme) frequency, word familiarity, and (semantic) transparency as important determinants in the decompositional processing pattern, suggesting that a dual route is also possible in morphological processing (e.g., ; ). For example, Caramazza et al. () proposed full-listing for familiar words but decomposition for novel words. Similarly, Schreuder & Baayen () observed full-listing for frequent and semantically opaque forms but decomposition for novel, less frequent and semantically transparent words. With respect to compound processing, the dual-route view predicts decomposition for transparent compounds but full-listing for opaque compounds. Libben’s (; ) Automatic Progressive Parsing and Lexical Excitation (APPLE) model is also relevant in this context as it directly relates to morphological parsing of compounds. This model does not make predictions as to when a complex form may undergo morphological decomposition, but rather specifies the processes involved in the decomposition process when it does occur (). The model assumes three levels of representation in compound word recognition: the stimulus, lexical and conceptual levels. For example, blueberry is a transparent-transparent compound as both constituents contribute to the meaning of the whole word. However, strawberry is partially opaque because the constituent straw does not contribute to the meaning of the whole compound while berry does. Both types of compounds are represented as whole units at the stimulus level (). At the lexical level, compounds are decomposed into their constituents (e.g. blue-berry, straw-berry), but each constituent representation is linked to the whole word. Thus, at the stimulus and lexical levels, both transparent and partially opaque compounds are represented in the same way (). At the conceptual level, only the semantically transparent constituent is represented. For example, both constituents of the transparent-transparent compound blueberry (blue and berry) are linked to the compound’s conceptual representation, while in the partially opaque compound strawberry, only the transparent constituent (berry) has the link to the conceptual representation. Consequently, only the meanings of the transparent constituents and transparent whole compounds are activated. This representational difference at the conceptual level results in slower reaction times (RTs) for fully transparent compounds (). In addition, the APPLE Model assumes that the facilitative links between a compound and its constituents, which do not exist in monomorphemic words, result in a processing advantage for compounds. Therefore, the prediction is that it should be easier and faster to process compounds than monomorphemic words. Thus, when accessing blueberry, both constituents, blue and berry, establish facilitative links that enable compound constituents to serve as significant primes. In contrast, when processing a monomorphemic word such as crocodile, the primes croco and dile cannot establish these facilitative links, leading to slower RTs.

In compound processing research, several fundamental questions have been examined so far. For example, does a compound such as blueberry form a single entry (i.e. whole-word representation) in the mental lexicon or is it decomposed into individual constituents during recognition? Does the morpheme-based decompositional route or whole-word representation depend on how transparent a compound is? For example, is there a processing difference between a fully transparent compound, blueberry, and a partially opaque compound, strawberry ()? Does transparency interact with headedness in compound processing? In other words, does the semantic transparency of the morphological head (e.g. berry in blueberry) play a more significant role than the semantic transparency of a non-head (e.g. blue in blueberry) in overall lexical decision latencies (e.g. )?

These questions are also revealing in the context of L2 acquisition as current psycholinguistic research with L2 learners is particularly interested in identifying whether late L2 learners are sensitive to the internal morphological structure of inflected words (e.g. ; ), derived words (e.g. ; ), and compounds (e.g. ; ).

In light of this background, the present study examines potential native-non-native speaker differences in processing noun-noun compounds in L2 English by first language (L1) Turkish-speaking participants. The aim is to uncover whether late L2 learners differ from native speakers in the extent of decomposition they employ and in their reliance on semantic transparency in processing compounds. In the rest of the paper, we first present an overview of L1 and L2 studies on compound processing. This is followed by a brief note on compounds in English and Turkish and the methodology. The paper concludes with findings and discussion.

2. Previous Research on Compound Processing

2.1. L1 studies

There is relatively little work on compound processing compared to inflectional and derivational processing. Previous L1 compound studies have mainly focused on the questions of whether one of the constituents (i.e. head or non-head) has a more significant impact on the processing route and whether the semantic transparency of constituents affects the parsing route.

Researchers have explored the role of constituency by manipulating the frequency and/or word status (i.e. word vs. nonword) of constituents. The activation of either one or both constituents during lexical access of a compound is interpreted as decomposition. Several studies have revealed the role of the first constituent in compound recognition. For example, in one of the earliest lexical decision experiments in English, Taft and Forster () found that compound-looking nonwords in which the first constituent is a word (e.g. footmilge) took longer to reject in comparison to compound-looking nonwords where the second constituent is a word (e.g. thernlow). This suggests that nonword classification time is affected by the lexical status of the first constituent but not the second. Furthermore, Taft and Forster () found that compounds whose first constituent is of low frequency (e.g. loincloth) were recognized as a word significantly more slowly than compounds with a high-frequency first constituent (e.g. headstand), revealing a frequency-based facilitative role for the first constituent. Similarly, in an eye-tracking experiment, Juhasz () showed that compounds with a high-frequency first constituent had shorter first fixation times (i.e. duration of the first fixation on the target word) and gaze durations (i.e. total duration of all fixations on the target word) than did frequency- and length-matched simple words. Compounds with a low-frequency first constituent did not differ from simple words. In contrast, Juhasz et al. () found a frequency-dependent facilitative effect of the second constituent rather than the first constituent in English compounds, not only in lexical decision and naming tasks but also in the eye movement paradigm. Finally, there are studies revealing the impact of the frequency of both constituents. For example, in an eye-tracking study, Andrews et al. () found clear effects of frequency of both head and non-head in English compound processing. In a more recent study, Janssen et al. () found, in a lexical decision task, that first and second constituent frequency together with the compound’s surface frequency and family size affected the RTs for compounds.

Another major question explored in compound processing studies pertains to the role of semantic transparency in the parsing route. In such studies, compounds are usually divided into four groups according to the semantic transparency level of their constituents, as measured in relation to the meaning of a whole word: transparent-transparent (TT) (e.g. bedroom), opaque-transparent (OT) (e.g. nickname), transparent-opaque (TO) (e.g. shoehorn) and opaque-opaque (OO) (e.g. deadline) (). The RTs obtained for these compound types are compared with one another to identify whether the semantic transparency of constituents influences the compound processing pattern. In a masked priming study, Shoolman and Andrews () found that both first and second constituents primed target compounds regardless of semantic transparency. In addition, the priming effects observed in compounds were significantly greater than those found in pseudocompounds and monomorphemic words. Likewise, Libben et al. () observed comparable priming effects for all four types of compound words, suggesting that semantic transparency had no significant effect in parsing. Their results still revealed an interesting RT difference between the compounds with a transparent head (i.e. TT and OT) and those with an opaque head (i.e. TO and OO); the former type was processed more rapidly than the latter. Nevertheless, this difference did not result in significantly decreased priming effects for the latter group (i.e. TO and OO compounds). Some studies demonstrated clearer evidence for the role of semantic transparency in compound processing. For example, in a semantic priming experiment in Dutch, Sandra () used, as primes, semantic associates of the first and second constituents of fully transparent (e.g. woman-MILKMAN) and opaque (e.g. bread-BUTTERFLY) compounds. Priming effects were found only for transparent compounds. In other words, only in fully transparent compounds did both constituents serve as primes. In another study on Dutch, Zwitserlood () employed a semantic priming task with fully transparent (e.g. kerkorgel ‘church organ’, kerk ‘church’, orgel ‘organ’), fully opaque (klokhuis ‘core of an apple’, klok ‘clock’, huis ‘house’), and partially opaque compounds in which the second constituent was the same as its fully transparent pair but semantically opaque (drankorgel ‘drunkard’, drank ‘drink’, orgel ‘organ’). The results revealed priming effects for fully transparent and partially opaque compounds but not for fully opaque compounds. Finally, Stathis () examined English compound processing via a lexical decision task and found that compounds were decomposed only when both constituents were transparent.

In addition to previous studies on compound processing in L1 English, it is also important to look at how native speakers process compounds in Turkish as it is the L1 in the present study. In one such study, Özer () investigated, via a morphological priming task involving picture naming, three types of compounds in Turkish: i) bare juxtaposed compounds (neither constituent is inflected, e.g., çelik kapı ‘steel door’, çelik ‘steel’, kapı ‘door’); ii) indefinite compounds (only the head (second) constituent is inflected with the possessive suffix –s(I), e.g., devlet kapı-sı ‘government service’, devlet ‘government’, kapı-sı ‘door’+possessive suffix); iii) definite compounds (both constituents are inflected, the modifier (first constituent) with genitive suffix –(n)In and the head with possessive suffix –s(I), e.g. bahçenin kapısı ‘gate of the garden’, bahçe-nin ‘garden’+genitive suffix, kapı-sı ‘door’+possessive suffix). The target pictures (e.g., kapı ‘door’) were each paired with three noun-noun compounds, used as primes. In other words, the primes were matched with the target picture either on the basis of the first constituent (e.g., target: ana ‘mother/main’; bare juxtaposed compound: ana fikir ‘main idea’, ana ‘main’, fikir ‘idea’; indefinite compound: ana kucağı ‘mother’s bosom’, ana ‘mother’, kucağ-ı ‘bosom’+possessive suffix; definite compound: ananın emeği ‘mother’s effort’, ana-nın ‘mother’+genitive suffix, emeğ-i ‘effort’+possessive suffix) or second constituent (e.g. target: balık ‘fish’; bare juxtaposed compound: akbalık ‘dace’, ak ‘white’, balık ‘fish’; indefinite compound: dilbalığı ‘flounder’, dil ‘tongue’, balığ-ı ‘fish’+possessive suffix; definite compound: gölün balığı ‘fish of the lake’, göl-ün ‘lake’+genitive suffix, balığ-ı ‘fish’+possessive suffix) or they were completely unrelated (e.g. arka teker ‘rear wheel’, arka ‘back’, teker ‘wheel’). The results revealed that morphologically related compound primes led to shorter naming latencies compared to unrelated distractors, suggesting a decompositional processing pattern. Özer also obtained an RT advantage, albeit not a statistically significant one, for the second constituent (i.e. the head) of the compound. In a more recent study, Uygun () tested the processing of Turkish compounds in a masked priming experiment with four types of stimuli: transparent-transparent compounds (e.g. kuzeydoğu ‘northeast’, kuzey ‘north’, doğu ‘east’), partially opaque compounds (e.g. büyükelçi ‘ambassador’, büyük ‘big’, elçi ‘delegate’), pseudocompounds and monomorphemic items. His results revealed that while the second constituent served as a significant prime, the first constituent showed only a marginal facilitation in compound recognition. Semantic transparency was also found to be important in the sense that while the second constituent was activated in transparent-transparent compounds, both constituents were activated in partially opaque compounds. This suggests that Turkish native speakers activate both constituents only when transparency decreases in compounds.

2.2. L2 compound studies

Compared to L1 studies, the number of L2 studies exploring how adult learners process compounds is limited. The central question examined in the context of L2 acquisition pertains to potential differences between native and non-native speakers in the way they access compound words. More specifically, whether or not L2 learners can make use of morphological information in compound processing independent of constituency and semantic transparency was explored in a series of studies. For example, Goral et al. () used a primed lexical decision task to test Hebrew–English bilinguals living in Israel, who had learnt English at school and had not lived in an English-speaking country before. Although this study did not have native speaker controls, the findings were still revealing as no significant constituent-priming effects were found for English compounds. In another study without a control group, Ko () conducted a masked priming experiment to identify whether morphological information played a role independent of orthography and whether the first or the second constituent makes a contribution in English compound processing by Korean–English bilinguals. The stimuli involved four types of prime-target pairs: morphologically decomposable, semantically transparent and orthographically overlapped (+M+S+O, e.g. key-KEYHOLE; hole-KEYHOLE), morphologically decomposable, semantically opaque and orthographically overlapped (+M–S+O, e.g. dead-DEADLINE; line-DEADLINE), only orthographically overlapped (–M–S+O, e.g. pump-PUMPKIN; kin-PUMPKIN) and only semantically related (–M+S–O, e.g. frigid-COLD). The results overall showed no significant priming effects, implying that L2 learners do not employ decomposition. Nevertheless, RT results showed more facilitation of the first constituent, as evidenced by shorter RTs for the first constituent primes compared to the second. Additional evidence was presented via a recent unmasked lexical decision experiment by González Alonso et al. () that compared English native speakers, L1 Spanish–L2 English sequential bilinguals, L1 Spanish–L2 Basque–L3 English, and L1 Basque–L2 Spanish–L3 English trilinguals in terms of their responses to English noun-verb-er compounds (e.g. taxi driver). The results revealed that, overall, native English speakers were the fastest group for all conditions, followed by Spanish–English bilinguals, while both trilingual groups were the slowest with similar RTs, implying that the number of languages spoken and the morphological properties of the most frequently active language may impact the processing of other languages. The researchers also suggested that the morphological structure of compounds is likely to develop at later stages in non-native speakers.

There are, however, several studies that found native-like decomposition in L2 compound processing. In a lexical decision experiment with Chinese–English bilinguals, Wang () observed the constituent frequency effect in English compound processing. More specifically, L2 learners demonstrated faster lexical decisions when the second constituent was a high-frequency word, suggesting that a frequency-based decompositional access pattern is also available to L2 learners. Further evidence for the decompositional pattern came from the unmasked priming data of late-arrival US-resident Hebrew–English bilinguals in Goral et al.’s () study. Similarly, in an unmasked lexical decision study involving L2 English compounds, Ko et al. () found significantly shorter RTs for compounds with high-frequency second constituents than low-frequency second constituents in Korean–English bilinguals. This was taken as evidence for decomposition in L2 learners. In another study, González Alonso et al. () compared the processing of English noun-verb-er compounds (e.g. cheerleader) by native and (L1 Spanish) non-native speakers of English via a masked priming task including five conditions: first constituent (e.g., fund-FUNDRAISER), second constituent (e.g. raiser-FUNDRAISER), first orthographic condition (e.g. funk-FUNDRAISER), second orthographic condition (e.g. raisin-FUNDRAISER) and unrelated condition (e.g. cool-FUNDRAISER). The results provided strong evidence for constituent priming both for native and non-native speakers. Additionally, no priming effect was observed for the orthographic condition in either group. Native and non-native differences emerged only in the form of lower total accuracy and longer mean RTs on the part of the non-native group. In a recent masked priming study testing the processing of transparent and opaque English compounds by English native speakers and Chinese–English bilinguals, Li et al. () used compounds as primes and constituents as targets in transparent-transparent compounds (e.g. toothbrush-TOOTH; toothbrush-BRUSH), opaque-opaque compounds (e.g. honeymoon-HONEY; honeymoon-MOON) and an orthographic overlap condition (e.g. restaurant-REST; tomorrow-ROW). Decomposition was observed in both transparent and opaque compounds, indicating semantic transparency-independent decomposition both in L1 and L2 compound recognition. As for the group differences, native speakers responded significantly faster than bilinguals. They also differed in the orthographic overlap condition; while no priming effect was obtained for native speakers, a clear orthographic priming effect was reported for the word-initial position (e.g., restaurant-REST) in bilinguals. Therefore, the researchers concluded that L2 compound processing may not be solely morphological but also orthographic. There are also studies providing support for the dual-route model. For example, Mayila () investigated, via a masked priming experiment, how Chinese–English bilinguals processed transparent and opaque compounds in English. The results showed that the transparent condition produced significant priming effects (i.e. decomposition) but the opaque condition did not, suggesting that, as a function of transparency, the decompositional and whole-word routes are both possible in L2 English compound processing.

3. The Study

3.1. Research questions and hypotheses

Within this background, the present study examines the lexical access of compounds in L2 English by L1-Turkish-speaking learners in comparison to English native speakers. It is important to note that English and Turkish compounds share a myriad of linguistic similarities. In both languages, compounding is a highly productive word formation process and compounds are mostly right-headed (e.g. strawberry and kuzeydoğu ‘northeast’, kuzey ‘north’, doğu ‘east’) (; ). In addition, both languages have compounds consisting of two roots usually made up of two nouns (e.g. hairnet and babaanne ‘paternal grandmother’, baba ‘father’, anne ‘mother’) or an adjective plus a noun (e.g. blackboard and büyükbaba ‘grandfather’, büyük ‘big’, baba ‘father’), which are classified as nominal compounds (; ). Also, compounds in both languages can be grouped on the basis of the degree of semantic transparency as described in section 3.2.3, below.

In light of this background, the current study explores, via a masked priming lexical decision task, how English compound words are processed by native and non-native speakers. The masked priming paradigm enables us to compare the groups in terms of their RTs in three types of prime-target pairs. Potential RT differences among the groups and across the prime conditions will reveal whether i) compounds are recognized as unanalyzed units or parsed into constituent morphemes; ii) semantic transparency and headedness influence the processing route; iii) L1 and L2 groups demonstrate differential processing patterns; iv) L2 proficiency impacts native-like processing.

We hypothesize that compounds will be subject to morphological parsing as proposed by decompositional views (e.g. ) and processed faster than their frequency-matched monomorphemic counterparts as predicted by Libben’s () APPLE Model. Accordingly, only morphemic constituents are expected to serve as primes and to accelerate the compound’s overall processing speed. This decompositional access route is not predicted to be influenced by semantic transparency and headedness. More specifically, following Libben et al. (), morphological constituency-based parsing is predicted both in fully transparent and partially opaque compounds. With respect to L1–L2 differences, similarities between English and Turkish compounding are predicted to work in favor of the L1 Turkish–L2 English participants. Therefore, we only predict quantitative differences among the groups. Nevertheless, a proficiency-based approximation to the native-like processing route might be more clearly observed in the advanced proficiency group.

3.2. The methodology

3.2.1. Participants

A total of 165 participants (102 L2 learners and 63 English native speakers) were tested in the study. Demographic and linguistic information gathered from participants via a questionnaire is presented in Table 1.

Table 1

Participants.

Groups	Mean Age (range)	Age of first English exposure (range)	Length (years) of English exposure (range)

English native speakers (N = 63, female: 38; male: 25)	24.66 (20–53)	At birth	From birth
Intermediate-level L2 learners (N = 51, female: 32; male: 19)	19.56 (18–24)	9.41 (5–18)	10.49 (2–17)
Advanced-level L2 learners (N = 51, female: 31; male: 20)	21.13 (18–27)	8.88 (4–14)	12.19 (4–18)

Half of the L2 participants had intermediate and the other half had advanced L2 proficiency. All of them were students of a private English-medium university in Istanbul. L2 proficiency-based grouping was made on the basis of the results of an in-house English proficiency test developed by the participants’ university (Table 2). All L2 participants had learnt English in a school setting and none of them had spent time in an English-speaking country.

Table 2

English proficiency scores.

Groups	Mean scores (out of 100)	Range	Standard Deviation

Intermediate-level L2 learners (N = 51)	52.71	38–58	4.56
Advanced-level L2 learners (N = 51)	75.65	69–88	5.08

An independent-samples t-test revealed a significant difference between the proficiency scores of the two groups (t(100) = 23.988, p < 0.001). The advanced learners received significantly higher scores than the intermediate group.
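As a rough plausibility check, the reported statistic can be approximated from the rounded summary values in Table 2. The sketch below assumes a pooled-variance (Student's) t-test, which the text does not specify; the function name is ours and purely illustrative. Because the inputs are rounded, the result only approximates the reported value.

```python
import math

def independent_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom from summary statistics.

    Assumption: equal variances (Student's t); the paper does not say which
    variant of the independent-samples t-test was run.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Means, SDs and group sizes as given in Table 2 (advanced vs. intermediate).
t, df = independent_t_from_summary(75.65, 5.08, 51, 52.71, 4.56, 51)
print(f"t({df}) = {t:.2f}")
```

The small gap between this approximation and the reported t(100) = 23.988 is what one would expect from working with means and standard deviations rounded to two decimals.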

3.2.2. The experimental task

A masked priming task run in E-Prime 2.0 () was conducted to measure RTs and accuracy in compound processing. The masked priming lexical decision task is also referred to as a ‘sandwich’ technique because the prime is sandwiched between a forward pattern mask (#####) and the target stimulus. It is commonly used in mental lexicon research because the very short prime duration does not allow for any kind of explicit processing strategy that may arise from conscious identification of a prime (). The task is believed to tap early stages of processing. Thus, any potential priming effects indicate unconscious linguistic computations during lexical access of complex words ().

3.2.3. Materials

In this constituent priming task, the English compound words were divided into two categories following the design of Shoolman & Andrews (): 1) transparent-transparent compounds in which the meanings of two constituents are related to the meaning of the whole word (e.g. headache); 2) partially opaque compounds in which the meaning of one of the constituents (either the first or the second) is not related to the whole meaning (e.g. grapefruit, nightmare). A total of 20 right-headed noun-noun compounds (10 for each type) were used as targets. The test also included 10 pseudocompounds (e.g. mandate), which consist of two constituents that can potentially stand alone as free morphemes (i.e. man and date) but do not serve as real compounds. Finally, 60 monomorphemic words (e.g. crocodile) that cannot be morphologically decomposed were included in the study. Pseudocompound and monomorphemic items served as control items. Pseudocompound items were included to see to what extent meaningful constituents were activated in lexical access. Monomorphemic items enabled us to make comparisons with compound items in terms of priming effects they trigger. The items were chosen after examining the course books and interviewing the English teachers of L2 participants to ensure that the items are known to them. All compounds, pseudocompounds and monomorphemic items, selected from the SUBTLEX-US Corpus () were matched, as much as possible, on the following measures: whole-word length, whole-word frequency, first constituent length, first constituent frequency, second constituent length, and second constituent frequency (see Table 3 for the properties of test items). A total of 90 nonwords were also used. Sixty monomorphemic plausible nonwords were created by changing two to three letters of existing English words. Thirty compound nonwords were created by combining two words, two nonwords or a word and a nonword.

Table 3

Examples from the stimuli list.

Condition	WW		C1		C2

	Frequency	Length	Frequency	Length	Frequency	Length

TT (headache)	5.37	8.7	143.47	4.4	82.31	4.3
PO (grapefruit)	5.86	8.8	139.08	4.4	65.78	4.4
PSC (mandate)	5.75	7.9	199.02	3.8	69.36	4.1
Monomorphemic (crocodile)	5.60	8.35	–	4.23	–	4.12

Note. WW: Whole Word; C1: Constituent 1; C2: Constituent 2; TT: Transparent-Transparent Compounds; PO: Partially-Opaque Compounds; PSC: Pseudocompounds. All mean frequencies presented here are given per million.

An analysis comparing compounds, pseudocompounds and monomorphemic items in terms of frequency revealed no significant differences for the whole-word frequency (F(3, 86) = 0.013; p = 0.998), first constituent frequency (F(2, 27) = 0.079; p = 0.924) and second constituent frequency (F(2, 27) = 0.060; p = 0.942). With respect to length, no significant differences were obtained for the whole-word length (F(3, 86) = 1.413; p = 0.244), first constituent length (F(3, 86) = 0.976; p = 0.408) and second constituent length (F(3, 86) = 0.441; p = 0.724).
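The degrees of freedom reported above follow from the item counts given in Section 3.2.3. The sketch below assumes the comparisons were one-way between-items ANOVAs (the text says only "an analysis"); the helper name is hypothetical.

```python
def one_way_anova_df(group_sizes):
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    k = len(group_sizes)   # number of item conditions
    n = sum(group_sizes)   # total number of items
    return k - 1, n - k

# Whole-word and length measures: TT, PO, PSC (10 items each) and 60 monomorphemic items.
print(one_way_anova_df([10, 10, 10, 60]))  # (3, 86)

# Constituent frequency: only TT, PO and PSC items have constituent frequencies (see Table 3).
print(one_way_anova_df([10, 10, 10]))      # (2, 27)
```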

3.2.4. Procedure

Participants responded to a set of words on the computer screen by pressing either a ‘Yes’ or ‘No’ button on the keyboard as quickly and accurately as possible to decide if the word on the screen was a real word in English. For each trial, first a forward mask (#####) was presented at the center of the screen for 500 milliseconds; this was followed by the prime, which was presented for 50 milliseconds, followed immediately by the target. The target item remained on the screen until the participant pressed the ‘Yes’ or ‘No’ button. Participants were tested individually and a practice session of 10 stimuli was given prior to the actual test to familiarize participants with the procedure. The prime-target pairs were presented in three conditions: i) Constituent 1 (head–HEADACHE), ii) Constituent 2 (ache–HEADACHE), iii) Unrelated (barn–HEADACHE) for compounds; and i) Constituent 1 (man–MANDATE), ii) Constituent 2 (date–MANDATE), iii) Unrelated (box–MANDATE) for pseudocompounds; and i) Constituent 1 (croco–CROCODILE), ii) Constituent 2 (dile–CROCODILE), iii) Unrelated (year–CROCODILE) for monomorphemic items. There were three versions of the test so that no participant saw the same target more than once.
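The trial structure and the three test versions can be sketched as follows. This is an illustration only: the experiment itself was run in E-Prime, the Latin-square rotation shown here is one common way to build such versions and is an assumption (the text states only that there were three versions), and the on-screen letter case of primes and targets is inferred from the paper's notation. All function and constant names are hypothetical.

```python
MASK = "#####"
MASK_MS = 500    # forward mask duration stated in the procedure
PRIME_MS = 50    # prime duration stated in the procedure
PRIME_CONDITIONS = ("constituent1", "constituent2", "unrelated")

def trial_events(prime, target):
    """Display sequence for one trial as (stimulus, duration_ms) pairs.

    A duration of None means the target stays on screen until the response.
    Lowercase primes and uppercase targets are an assumption based on the
    notation used in the text (e.g. head-HEADACHE).
    """
    return [(MASK, MASK_MS), (prime.lower(), PRIME_MS), (target.upper(), None)]

def build_versions(targets, conditions=PRIME_CONDITIONS):
    """Rotate prime conditions across versions (assumed Latin-square scheme).

    Each version contains every target exactly once, and across the versions
    each target appears once in each prime condition.
    """
    versions = [[] for _ in conditions]
    for i, target in enumerate(targets):
        for v in range(len(conditions)):
            versions[v].append((target, conditions[(i + v) % len(conditions)]))
    return versions

targets = ["HEADACHE", "MANDATE", "CROCODILE"]  # example targets from the text
versions = build_versions(targets)
for number, version in enumerate(versions, start=1):
    print(number, version)
```

Under this scheme a participant assigned to one version sees HEADACHE with only one of its three primes, which is what prevents target repetition.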

3.2.5. Data coding and analysis

All incorrect responses and outliers were excluded from the analysis. A ‘No’ (i.e. nonword) response to a real word and a ‘Yes’ (i.e. real word) response to a nonword were labeled as incorrect responses. RTs more than three standard deviations above or below a participant’s mean RT per word type were deemed outliers. The motivation for using three standard deviations as a cutoff point was related to the low frequency of English compounds and their frequency-matched monomorphemic words in the stimuli. Participants with an error rate exceeding 33% were excluded from the study. One intermediate-level participant was excluded from the analysis because of her high error rate. Thus, the analysis was based on 50 participants in this group.
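The two exclusion rules described above can be sketched as follows, applied to the correct-response RTs of one participant for one word type. The use of the sample standard deviation is an assumption (the text does not say which estimator was used), the function names are hypothetical, and the RT values are invented purely for illustration.

```python
from statistics import mean, stdev

def trim_outliers(rts, n_sd=3.0):
    """Keep RTs within n_sd standard deviations of the mean of this cell.

    `rts` holds one participant's RTs for one word type. The sample standard
    deviation is used here, which is an assumption.
    """
    if len(rts) < 2:
        return list(rts)
    centre, spread = mean(rts), stdev(rts)
    low, high = centre - n_sd * spread, centre + n_sd * spread
    return [rt for rt in rts if low <= rt <= high]

def exceeds_error_cutoff(correct_flags, cutoff=0.33):
    """True if the participant's error rate is above the 33% exclusion cutoff."""
    error_rate = 1 - sum(correct_flags) / len(correct_flags)
    return error_rate > cutoff

# Invented RTs (ms): nineteen ordinary responses and one extreme value.
cell = [600] * 19 + [5000]
print(len(trim_outliers(cell)))                        # the extreme RT is dropped
print(exceeds_error_cutoff([True] * 6 + [False] * 4))  # 40% errors -> excluded
```

Note that with very few observations per cell a single extreme value inflates the standard deviation enough that it can never exceed three standard deviations, so a rule of this kind only trims effectively when each cell contains a reasonable number of trials.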

Descriptive statistics were computed and repeated-measures ANOVAs were conducted on the mean RTs of correctly answered items. Following Shoolman and Andrew’s () study, three RT analyses were conducted. In the first analysis, to determine the effect of morphological structure, the mean RTs of two sets of compound words (i.e. transparent and partially opaque items) were compared with those of noncompound words (i.e. pseudocompound and monomorphemic items). In the second analysis, transparent-transparent compounds were compared with partially opaque compounds to evaluate semantic contribution. In the last analysis, pseudocompounds were compared with monomorphemic words to assess the lexical status of constituents. To identify priming effects, the mean RTs for target items preceded by first and second constituent primes were compared with those for targets preceded by unrelated primes; that is, the comparison tested whether compounds preceded by either their first or their second constituent were recognized faster than those preceded by an unrelated prime word. In addition, the priming effects from the first and second constituent primes were compared with each other to identify any differential facilitation from the two constituents.

4. Results

4.1. Analysis 1

To investigate whether compounds were processed differently from noncompounds, the mean RTs to two types of compound words (transparent and partially opaque items) were compared with the mean RTs to noncompounds (pseudocompound and monomorphemic items) (Table 4). A 2 (word types) × 3 (prime types) × 3 (groups) mixed-model ANOVA was conducted. Across all three analyses, word types and prime types were within-subject variables, group was the between-subject variable, and the Bonferroni test was used for post-hoc comparisons. Mauchly’s test indicated that the assumption of sphericity had been violated for the interaction between word types and prime types, χ²(2) = 13.66, p < 0.002. Therefore, a Greenhouse-Geisser correction was used.
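For readers unfamiliar with the correction, the Greenhouse-Geisser epsilon can be computed from the covariance matrix of the repeated-measures conditions, and the degrees of freedom of the affected F test are then multiplied by it. The sketch below is a generic illustration of that standard formula, not the authors' analysis script; the function name and the toy covariance matrices are hypothetical, since the raw data are not reported.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon for a k x k covariance matrix of the
    repeated-measures conditions (list of lists).

    epsilon = trace(C)^2 / ((k - 1) * sum of squared entries of C),
    where C is the double-centered covariance matrix. It ranges from
    1 / (k - 1) (maximal violation) to 1 (sphericity holds).
    """
    k = len(cov)
    grand = sum(sum(row) for row in cov) / (k * k)
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    centered = [
        [cov[i][j] - row_means[i] - col_means[j] + grand for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(x * x for row in centered for x in row)
    return trace ** 2 / ((k - 1) * sum_sq)


# Toy matrices (not from the study): equal variances and covariances
# satisfy sphericity; variance confined to one condition does not.
spherical = [[4.0, 2.0, 2.0], [2.0, 4.0, 2.0], [2.0, 2.0, 4.0]]
degenerate = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

eps_spherical = greenhouse_geisser_epsilon(spherical)
eps_degenerate = greenhouse_geisser_epsilon(degenerate)
print(eps_spherical, eps_degenerate)
```

With three prime types, epsilon can therefore fall anywhere between 0.5 and 1, and a corrected test uses epsilon times the nominal numerator and denominator degrees of freedom.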

Table 4

Mean RTs and standard deviations in three prime conditions for compounds and noncompounds.

Group	Compounds			Noncompounds
	C1	C2	UR	C1	C2	UR
	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
English native speakers	610.45 (83.57)	608.42 (81.80)	649.11 (96.33)	648.30 (99.30)	643.44 (89.72)	662.26 (91.23)
Intermediate-level L2 learners	722.34 (121.43)	743.94 (150.30)	772.62 (147.12)	800.42 (168.05)	774.20 (156.25)	823.96 (139.05)
Advanced-level L2 learners	666.60 (104.75)	677.62 (90.44)	725.87 (126.76)	710.61 (129.48)	735.19 (132.95)	720.60 (113.20)

Note. C1: Constituent 1 as prime; C2: Constituent 2 as prime; UR: Unrelated prime.
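As a reading aid, the descriptive priming effect in each cell of Table 4 can be obtained by subtracting the mean RT after a constituent prime from the mean RT after an unrelated prime. The short sketch below does this with the table's means; the names `TABLE_4_MEANS` and `priming_effects` are hypothetical, and the differences are descriptive only (statistical significance comes from the ANOVA reported in the text).

```python
# Mean RTs copied from Table 4 (C1, C2: constituent primes; UR: unrelated).
TABLE_4_MEANS = {
    "English native speakers": {
        "compounds": {"C1": 610.45, "C2": 608.42, "UR": 649.11},
        "noncompounds": {"C1": 648.30, "C2": 643.44, "UR": 662.26},
    },
    "Intermediate-level L2 learners": {
        "compounds": {"C1": 722.34, "C2": 743.94, "UR": 772.62},
        "noncompounds": {"C1": 800.42, "C2": 774.20, "UR": 823.96},
    },
    "Advanced-level L2 learners": {
        "compounds": {"C1": 666.60, "C2": 677.62, "UR": 725.87},
        "noncompounds": {"C1": 710.61, "C2": 735.19, "UR": 720.60},
    },
}


def priming_effects(means):
    """Unrelated minus constituent-prime mean RT per group and word type.

    Positive values indicate facilitation, negative values inhibition.
    """
    return {
        group: {
            word_type: {
                prime: round(cell["UR"] - cell[prime], 2) for prime in ("C1", "C2")
            }
            for word_type, cell in word_types.items()
        }
        for group, word_types in means.items()
    }


effects = priming_effects(TABLE_4_MEANS)
for group, word_types in effects.items():
    print(group, word_types)
```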

The main effect of group was significant (F(2, 161) = 26.903; p < 0.001; η²_p = 0.25): overall, intermediate-level L2 learners were significantly slower than English native speakers (p < 0.001) and advanced-level learners (p < 0.003), and advanced learners were significantly slower than native speakers (p < 0.002). There was also a significant main effect of word types (F(1, 161) = 31.156; p < 0.001; η²_p = 0.16), since compound words were processed significantly faster than noncompound words (p < 0.001). The main effect of prime types was also significant (F(2, 161) = 25.376; p < 0.001; η²_p = 0.14). Responses in the unrelated prime condition were significantly slower than those in the constituent 1 condition (p < 0.001; Cohen’s d = 0.28) and the constituent 2 condition (p < 0.001; Cohen’s d = 0.24).
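The partial eta squared values can be related to the F statistics through the standard identity η²_p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the group and word-type effects reported above; the function name is hypothetical, and the identity only holds for the degrees of freedom actually used in the test, which may differ from the printed ones where a sphericity correction was applied.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its
    degrees of freedom: F * df1 / (F * df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effects reported for Analysis 1.
group_effect = partial_eta_squared(26.903, 2, 161)
word_type_effect = partial_eta_squared(31.156, 1, 161)

print(round(group_effect, 2), round(word_type_effect, 2))
```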

There was a significant interaction between word types and prime types (F(2, 161) = 6.392; p < 0.004; η²_p = 0.04). In compound words, both constituent 1 primes (p < 0.001; Cohen’s d = 0.43) and constituent 2 primes (p < 0.001; Cohen’s d = 0.34) differed significantly from unrelated primes, suggesting that compounds were accessed in a decomposed fashion. In noncompound words, significant differences were also obtained between constituent 1 and unrelated primes (p < 0.03; Cohen’s d = 0.13) and between constituent 2 and unrelated primes (p < 0.02; Cohen’s d = 0.15), indicating decomposition for noncompounds as well.

There was also a significant three-way interaction among word types, prime types and groups (F(4, 161) = 3.421; p < 0.01; η²_p = 0.04). In the native group’s compound data, the unrelated prime condition was significantly slower than the constituent 1 (p < 0.004; Cohen’s d = 0.43) and constituent 2 prime conditions (p < 0.002; Cohen’s d = 0.46). However, in noncompound items, no significant differences were found among prime conditions. These results suggest that native speakers process compounds in a decomposed fashion but access noncompound items as unanalyzed units. The same pattern was observed in advanced-level L2 learners: in compounds, unlike the unrelated prime, both constituent 1 (p < 0.001; Cohen’s d = 0.51) and constituent 2 (p < 0.002; Cohen’s d = 0.44) facilitated target word recognition, and no such effects were found in noncompounds.

In contrast, intermediate learners showed a different processing pattern: there was a significant difference between constituent 1 and unrelated primes (p < 0.001; Cohen’s d = 0.37) in compounds, and a significant difference between constituent 2 and unrelated primes (p < 0.001; Cohen’s d = 0.37) in noncompounds. This suggests that intermediate-level participants accessed both compounds and noncompounds via decomposition, relying on constituent 1 and constituent 2, respectively.
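The text does not state how Cohen’s d was computed. One common variant divides the mean difference by the root mean square of the two conditions’ standard deviations; as the sketch below shows, applying this assumed formula to the Table 4 means and SDs gives values that round to the effect sizes reported above for the native and advanced groups’ compound data. The function name is hypothetical.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with the two SDs pooled as the root mean square
    (an assumed formula; the paper does not specify its variant)."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Compound data from Table 4: unrelated prime vs. constituent primes.
native_c1 = cohens_d(649.11, 96.33, 610.45, 83.57)
native_c2 = cohens_d(649.11, 96.33, 608.42, 81.80)
advanced_c1 = cohens_d(725.87, 126.76, 666.60, 104.75)
advanced_c2 = cohens_d(725.87, 126.76, 677.62, 90.44)

print(round(native_c1, 2), round(native_c2, 2))
print(round(advanced_c1, 2), round(advanced_c2, 2))
```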

In sum, the results revealed that all groups processed English compounds significantly faster than noncompounds. In addition, all groups showed a tendency to decompose compounds; while English native speakers and advanced-level learners could access both constituents, intermediate-level learners accessed only constituent 1. Unlike the other groups, intermediate-level learners accessed constituent 2 in noncompound items, indicating a tendency to segment units irrespective of their morphological status.

4.2. Analysis 2

In the second analysis, the mean RTs of transparent compounds were compared with those of partially opaque compounds to evaluate the extent of semantic contribution in compound processing. A 2 (word types) × 3 (prime types) × 3 (groups) mixed-model ANOVA was conducted. As Table 5 demonstrates, the main effect of group was significant (F(2, 161) = 21.302; p < 0.001; η²_p = 0.21): intermediate-level L2 learners were significantly slower than native speakers (p < 0.001) and advanced-level L2 learners (p < 0.02), and advanced learners were significantly slower than native speakers (p < 0.003). There was also a significant main effect of word types (F(1, 161) = 17.233; p < 0.001; η²_p = 0.10), since partially opaque compound words were processed significantly faster than transparent-transparent compound words (p < 0.001). The main effect of prime types was also significant (F(2, 161) = 26.268; p < 0.001; η²_p = 0.14). Responses in the unrelated prime condition were significantly slower than those in the constituent 1 condition (p < 0.001; Cohen’s d = 0.37) and the constituent 2 condition (p < 0.001; Cohen’s d = 0.32), indicating decomposition in compound processing.

Table 5

Mean RTs and standard deviations in three prime conditions for compound words.

Group	Partially-Opaque Compounds			Transparent-transparent Compounds
	C1	C2	UR	C1	C2	UR
	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
English native speakers	607.15 (104.53)	596.72 (90.28)	634.31 (101.76)	615.70 (95.62)	620.86 (98.68)	666.72 (113.15)
Intermediate-level L2 learners	709.45 (150.72)	729.23 (150.21)	764.11 (215.19)	737.78 (161.75)	747.98 (163.16)	790.31 (167.04)
Advanced-level L2 learners	654.55 (117.10)	667.19 (114.06)	700.25 (103.05)	680.35 (131.64)	688.82 (102.10)	757.17 (194.62)

Note. C1: Constituent 1 as prime; C2: Constituent 2 as prime; UR: Unrelated prime.

To sum up, the results revealed that all groups processed partially opaque compounds significantly faster than transparent-transparent compounds. All groups processed compounds via decomposition regardless of semantic transparency, since no significant interaction between word types and prime types was obtained. These findings also suggest that the facilitation associated with semantic transparency did not differ by constituent or headedness.

4.3. Analysis 3

In the final analysis, the mean RTs of pseudocompounds and of monomorphemic words were compared. A 2 (word types) × 3 (prime types) × 3 (groups) mixed-model ANOVA was conducted. Mauchly’s test indicated that the assumption of sphericity had been violated for the main effect of prime types, χ²(2) = 25.049, p < 0.001. Therefore a Greenhouse-Geisser correction was used.

As Table 6 shows, the main effect of group was significant (F(2, 161) = 21.911; p < 0.001; η²_p = 0.21) because overall, intermediate-level L2 learners were significantly slower than native speakers (p < 0.001) and advanced-level L2 learners (p < 0.004). Also, advanced learners were significantly slower than native speakers (p < 0.007). Another significant main effect was for prime types (F(2, 161) = 3.472; p < 0.04; η²_p = 0.02). Significant differences were found between constituent 1 and unrelated primes (p < 0.03; Cohen’s d = 0.12) and constituent 2 and unrelated primes (p < 0.05; Cohen’s d = 0.11). Also, the condition with unrelated primes was significantly slower than the other two conditions, indicating decomposition for pseudocompounds and monomorphemic words.

Table 6

Mean RTs and standard deviations in three prime conditions for pseudocompounds and monomorphemic words.

Group	Pseudocompounds			Monomorphemic Words
	C1	C2	UR	C1	C2	UR
	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)	Mean (SD)
English native speakers	650.80 (136.71)	643.46 (109.80)	662.61 (106.64)	648.72 (89.93)	648.18 (88.18)	662.82 (92.64)
Intermediate-level L2 learners	816.29 (208.89)	779.81 (210.87)	820.75 (171.07)	793.20 (170.23)	784.94 (155.51)	837.01 (174.14)
Advanced-level L2 learners	713.15 (151.77)	752.79 (190.80)	710.15 (120.51)	709.11 (119.42)	725.81 (122.25)	735.35 (138.30)

Note. C1: Constituent 1 as prime; C2: Constituent 2 as prime; UR: Unrelated prime.

There was also a significant interaction between prime types and group (F(4, 161) = 3.621; p < 0.008; η²_p = 0.04). In intermediate learners, there was a significant difference between constituent 2 and unrelated primes (p < 0.001; Cohen’s d = 0.26), suggesting that these noncompound items were also accessed in a decomposed fashion. However, no significant differences among the primes were obtained for native speakers or advanced-level L2 learners.

These results suggest that while native speakers and advanced-level L2 learners process pseudocompounds and monomorphemic words without morphological parsing, intermediate-level learners apply a constituent 2-based decomposition.

5. Discussion

The findings demonstrated that English native speakers, as predicted, were significantly faster than L2 learners in all word categories. The intermediate group was the slowest as in González Alonso et al. (, ) and Li et al. (). Native speakers recognized compounds significantly faster than noncompounds, and both constituents were activated in compound processing. This finding was also reported in earlier studies conducted with native English speakers (e.g. ; ; ). Although constituent-based decomposition is normally expected to be costly, as predicted by Libben’s () APPLE Model, the facilitative links between the compound and its roots might have assisted word recognition, leading to shorter RTs for compounds than noncompounds, which lack morphemic constituents. The advanced L2 group displayed native-like processing as they accessed compounds significantly faster than noncompounds while demonstrating decomposition only in the former category. While intermediate-level learners also recognized compounds significantly faster than noncompounds, they activated only constituent 1 in compounds (constituent 2 facilitation did not reach a statistically significant level, p = 0.079).

As for the role of semantic transparency in compound processing, all groups showed similar patterns. As proposed by the APPLE Model, partially opaque compounds were processed significantly faster than transparent-transparent compounds. This is probably because, unlike in transparent-transparent compounds, in partially opaque compounds only the meaning of the transparent constituent and that of the transparent whole compound are activated, which results in faster RTs. Crucially, both constituents served as primes in both compound types in all groups, indicating semantic transparency-independent decomposition in both L1 and L2 compound recognition, a finding similar to what was reported in Li et al. (). This also suggests that constituency/headedness and semantic transparency do not interact. In other words, the two constituents served equally as primes irrespective of the semantic transparency of a compound.

With respect to pseudocompounds and monomorphemic items, both native speakers and advanced-level participants accessed these forms as unanalyzed units. However, in intermediate-level learners, constituent 2 served as a significant prime and constituent 1 approached significance (p = 0.093), suggesting a tendency toward decomposition irrespective of the morphemic status of segments. Pseudocompounds and monomorphemic items were included in the test to identify whether processing was influenced by orthographic similarities between the primes and targets. Recall that in Li et al. () advanced L2 learners decomposed English compounds, but this was not solely morphological because orthographic priming effects were also found for control items in the word-initial overlap position (e.g. restaurant-REST) but not in the word-final position (e.g. tomorrow-ROW). The present study did not indicate such an effect for advanced participants. Only intermediate-level participants demonstrated constituent-based segmentation in noncompounds. This suggests that lower-proficiency L2 learners are less sensitive to the morphological status of constituents and tend to decompose noncompound words during lexical access, but that learners become more native-like as their proficiency increases, as revealed by the results of the advanced L2 learners.

6. Conclusion

This study examined compound processing in L2 English by L1-Turkish-speaking late learners. Both native and non-native groups accessed English compounds faster than noncompounds, as predicted by Libben’s () APPLE Model. In native speakers and advanced-level learners, both constituents served as primes, whereas in intermediate learners only constituent 1 facilitated lexical access, suggesting that decomposition is more evident at a higher L2 proficiency level. Crucially, since intermediate-level learners also demonstrated decomposition in noncompounds, the facilitation effect observed in their compound processing may be orthographic rather than morphological in nature. As for semantic transparency, partially opaque compounds were processed significantly faster than transparent-transparent compounds by all groups, as predicted by the APPLE Model. Constituents 1 and 2 served equally as primes in both compound types, suggesting that decomposition occurs regardless of semantic transparency and headedness/constituency, as evidenced by the absence of a significant interaction between word types and prime types.

Overall, advanced-level learners’ data suggest that native-like processing is possible even in late L2 learners as a function of increased proficiency. Similarities between Turkish and English compounds may also have played a role in native-like processing. The present findings imply that L2 learners at lower proficiency levels may not rely on morphological structure in processing compounds. Nevertheless, studies with a separate orthographic prime condition may be more revealing as to whether L2 compound processing is solely morphological or whether orthography also plays a significant role.

Journal of the European Second Language Association
