Sentence repetition in Farsi-English bilingual children

The current study aimed to create an assessment that can be used in the future to measure the language abilities of Farsi-speaking children in a clinical setting. A Farsi sentence-repetition task was created that included structures organised into three levels of complexity from least to most complex. Twenty typically developing Farsi-English bilingual children between the ages of 6;3 and 11;6 were recruited from Farsi schools in Toronto, Canada. Significant differences in the participants’ performance were found across the three levels, with the lowest performance on the most complex sentences and the highest performance on the least complex ones. Specific structures appeared to be more challenging than others within each level of complexity. The children’s decreasing performance with increasing complexity and the evidence that specific structures are challenging within each level make the Farsi sentence-repetition task a promising tool for assessing the language skills of Farsi-English speaking children.


Introduction
Bilingualism is the norm in many countries around the world. Consecutive waves of migration in the 20th and 21st centuries have led to an increase in bilingualism in many countries in Europe and in the Americas. However, these demographic changes are often not reflected in the school curriculum, which tends to be monolingual. The children's first language (L1) or mother tongue is not used in the classroom to support their learning, and there is a lack of resources available in the children's home languages. Having resources, such as language assessments, in the children's home languages is crucial if a child is suspected to have a Specific Language Impairment/Developmental Language Disorder (from this point onwards abbreviated as DLD).
The majority of language assessments are normed on monolingual children; therefore, their norms should not be used with bilingual children, because the language trajectory of bilingual children differs significantly from that of monolingual children (Armon-Lotem, 2012). In addition, a bilingual child's language development is shaped by factors that do not influence the language development of monolingual children (e.g., age of onset (AoO), length of exposure (LoE), quality and quantity of language input, language dominance and the status of the language (minority/majority)). As a result, bilingual children are a more heterogeneous group than monolingual children. Misdiagnosis can underestimate the language potential of children who have English as a second language (L2) (Paradis, 2005). Bilingual children have varying degrees of exposure to their L1 and their L2, which influences their performance on L2 tasks. More effective results would be obtained by comparing bilinguals to bilinguals rather than bilinguals to monolinguals (Paradis et al., 2013), because if a bilingual child shows low language abilities in both their L1 and L2, the problem is likely due to internal factors (Leonard, 2014). However, if the bilingual child shows low language abilities only in their L2, then the difficulties are likely due to language-exposure factors (Leonard, 2014).
There continues to be a shortage of standardized language assessments for the heritage languages of children who have English as their L2 (Marinis et al., 2017), and there is a lack of language assessments for Farsi. The number of Farsi-English bilingual children in the United States, Australia, the United Kingdom (UK) and Canada is considerable. In 2018, the Public Affairs Alliance of Iranian Americans stated that there are about 1 million Iranians living in the Americas, and in 2017 the Office for National Statistics in the UK estimated that around 70,000 Iranian-born individuals live in the UK. Finally, according to Statistics Canada (2017) there are around 150,000-200,000 Iranians living in Canada, the majority of whom are in the Greater Toronto Area. A large number of bilingual Farsi-English children in schools worldwide do not have access to bilingual tasks for language assessment. In the current study, we developed a Farsi sentence-repetition (SR) task to assess the morphosyntactic skills of a group of school-aged Farsi-English bilingual children.

SR tasks
SR tasks assess morphosyntax and verbal working memory (Conti-Ramsden et al., 2001; Stokes et al., 2006) and provide information on the syntactic aspects of sentence processing (Polišenská, 2011). In SR tasks, individuals are required to repeat verbatim sentences that have been spoken to them. Sentences should be relatively long to avoid passive echoing (parroting) of the sentence, as this would not provide significant information about the participant's language abilities. Repeating sentences of varying length requires children to use their grammatical system, which allows researchers to get a glimpse into their implicit knowledge (Polišenská et al., 2015). If children have not acquired the specific grammatical structure being elicited, they will be unable to repeat the sentences. As a result, grammatical competence is at the heart of what SR is measuring. In comparison to other assessments, SR tests are quick and easy to perform, allow for greater control of administration and analysis, and can be used with children as young as preschool age (Seeff-Gabriel et al., 2010).

Developing sentence repetition tasks for bilingual children
In order to address the risk of misdiagnosis in bilingual children, various LITMUS (Language Impairment Testing in Multilingual Settings) tasks were developed within the COST (European Cooperation in the field of Scientific and Technical Research) Action IS0804, Language Impairment in a Multilingual Society: Linguistic Patterns and the Road to Assessment. The LITMUS-SR tasks were designed to tap into processes that are particularly difficult for children with DLD. They were designed based on past theoretical knowledge of how DLD is manifested across languages. To create parallel versions of SR tasks across languages, two principles were used (Marinis & Armon-Lotem, 2015):
• In addition to a set of syntactically simple structures as control structures (language-independent structures), include in all SR tasks a set of syntactically complex structures that have been shown to be difficult for children with DLD across languages and that involve embedding and/or syntactic movement;
• Include a set of structures for each language that have been shown to be difficult for children with DLD in the specific language (language-specific structures).
The language-independent structures are those that have been found to be challenging for children with DLD across languages: structures that require movement, such as wh-questions and relative clauses, as well as structures that are syntactically complex (i.e., conditionals and subordination). On the other hand, language-specific structures should be chosen based on research that indicates that a structure is vulnerable in a particular language in a child with DLD but not in a typically developing (TD) bilingual child. The LITMUS-SR tasks also control for sentence length, vocabulary and memory effects. Length is controlled in terms of number of syllables, and vocabulary is controlled by selecting high-frequency, early-acquired words (Marinis & Armon-Lotem, 2015).

Using SR tasks with bilingual children
Several versions of the LITMUS-SR task have been created in many languages, including Lebanese Arabic, French, Russian, German and Turkish (Marinis & Armon-Lotem, 2015). These tasks have been used to differentiate between TD bilingual children and bilingual children with DLD or to address language development in TD bilingual children. Two studies using the French LITMUS-SR task examined how accurately the task differentiates between TD bilingual children (Bi-TD) and bilingual children with DLD (Bi-DLD). Fleckstein et al. (2018) tested Bi-TD and Bi-DLD Arabic/French or English/French children in their majority language (French). They used an SR task which had five sentence types divided into two subtypes of varying complexity. Overall, the Bi-TD children outperformed the Bi-DLD children, and clausal embedding was particularly difficult for the children with DLD. However, there was a discrepancy in the sample sizes between the Bi-TD and Bi-DLD groups that could have affected the results. A study by Tuller et al. (2018) also addressed the effectiveness of the French and German LITMUS-SR tasks but investigated Bi-TD and Bi-DLD children with Arabic, Turkish or Portuguese as a heritage language and French or German as a majority language. The participants were tested in both their heritage and their majority language. Their heritage language was tested using various standardized assessments, while French or German was tested using the respective LITMUS-SR task. These tasks were effective in distinguishing between the Bi-TD and the Bi-DLD groups. Another study addressing the effectiveness of a LITMUS-SR task in both the heritage and the majority language was conducted by Meir (2018), who tested Bi-TD and Bi-DLD children with Russian as a heritage language and Hebrew as a majority language in both their languages.
Unlike the French LITMUS-SR task, the Russian and Hebrew SR tasks grouped the sentence types into three levels of complexity from least to most complex on the basis of Marinis and Armon-Lotem's (2015) principles. Meir's (2018) study found that overall the Bi-TD children outperformed the Bi-DLD children on both SR tasks. In addition, the error patterns of these two samples of children differed significantly. The errors of the former were more minor and kept the complexity of the sentence intact, while the errors made by the DLD sample generally involved simplifying the sentence structure. Meir and Armon-Lotem (2015) also studied Bi-TD and Bi-DLD children with Russian as a heritage language and Hebrew as L2. The participants were all tested using the Russian LITMUS-SR task, which was divided into three levels of complexity. The children were tested on their ability to use case in simple and complex sentences. The results showed that the Bi-TD group outperformed the Bi-DLD group on all measures. Finally, Gavarró (2017) developed a LITMUS-SR task for Catalan, also using sentences grouped into three levels of complexity, in order to address differences between school-aged Bi-TD children and children with Bi-DLD. Although the children with DLD were older than the TD children (DLD children: 10 years old; TD children: 6-7 years old), the children with DLD produced a higher number of ungrammatical sentences than the TD children.
Another set of studies using LITMUS-SR tasks investigated TD bilingual children with the goal of addressing both the differences between the children's L1 and L2 and the effects of internal and external factors on the children's performance. Antonijevic et al. (2017) designed an Irish LITMUS-SR task and used it with bilingual children with English as L1 and Irish as L2. The English and the Irish tasks included sentences that were grouped into three levels of complexity and measured different sentence structures and types of errors. Their results showed that the children did significantly better on the L1 (English SR) task than on the L2 (Irish SR) task. The researchers also conducted analyses across different sentence structures for both the English and Irish SR tasks and found that performance on relative clauses was particularly poor in Irish. In addition, Meir et al. (2017) conducted a study with TD bilingual children with Russian as a heritage language and Hebrew as L2. The study investigated cross-linguistic influences between the L1 and the L2. The researchers also examined the extent to which the age of L2 onset is associated with the acquisition of morphosyntactic properties in both Russian and Hebrew. The participants were tested in both languages using the Russian and Hebrew LITMUS-SR tasks. The researchers found that cross-linguistic influences were bi-directional.
Several studies using SR tasks with bilingual children have addressed the effects of internal factors (age at testing, AoO, LoE) and external factors (parental education, language use and language richness) on the children's performance in the L1 and/or the L2. Previous research has found that internal factors are more highly related to vocabulary measures, while external factors are more highly related to morphosyntactic measures. To address the relationship between the children's performance on SR tasks and internal and external factors, Armon-Lotem et al. (2011) recruited bilingual Russian-German and Russian-Hebrew children and looked at what factors contributed to the children's performance on SR tasks. The findings revealed that AoO correlated negatively with performance on the L2 vocabulary and SR tasks, while LoE correlated positively with performance on both of those tasks. The study also showed that parents' education/occupation correlated positively with both the L1 and the L2 only for the vocabulary tasks. In addition, the study by Fleckstein et al. (2018) indicated that high performance on SR tasks was significantly and positively correlated with LoE for the majority language, while Tuller et al. (2018) indicated that the most significant predictor of accurate performance on SR tasks was positive early development. Finally, Thordardottir and Brandeker (2013) looked at monolingual English, monolingual French, and bilingual French-English children (five years of age). The bilingual participants had varying degrees of exposure to English and French. The children were tested using both SR and non-word repetition tasks, and it was found that LoE was more highly positively correlated with the former than with the latter.
Despite the large number of parallel versions of LITMUS-SR tasks, there is currently no LITMUS-SR task for Farsi. Farsi, as it is called by native speakers in Iran, is the native language and the lingua franca of Iran (Kazemi, 2013) and one of the three major varieties of Persian, which belongs to the Indo-Iranian branch of the Indo-European language family. The other two varieties are Dari, spoken by those from Afghanistan, and Tajik, which is a variant of Persian spoken in Tajikistan. For the most part, Farsi, Dari and Tajik are mutually intelligible. In the present investigation, we refer to the Farsi variety spoken by Iranians/Persians from Iran, as the participants were all immigrants from Iran. However, the findings of the study could be relevant to all Farsi speakers. Farsi is spoken by many individuals around the world: in the Middle East, Australia, Canada, the UK and the Americas. In Canada, Farsi is one of the top 10 languages spoken in homes by immigrant families (Statistics Canada, 2017). This makes it important to develop a Farsi LITMUS-SR task that can be used with Farsi-speaking children worldwide.

The present study
The aim of the present study was to create a Farsi LITMUS-SR task using the principles of the COST Action IS0804. Therefore, we reviewed previous research on the language development of Farsi-speaking children with DLD in order to identify structures that are known to be difficult for this population. This section presents the language-specific structures used in the Farsi LITMUS-SR task. In addition, we identified syntactically complex structures that have been shown to be difficult for children with DLD across languages and that involve embedding and/or syntactic movement, and a set of syntactically simple structures as control structures (language-independent structures).
Two major studies which looked at the language skills of Farsi-speaking children with DLD were instrumental in informing the development of the Farsi LITMUS-SR task. The first was by Foroodi-Nejad (2011), who investigated the morphosyntactic skills of monolingual Farsi-speaking children with DLD. Foroodi-Nejad (2011) compared nine Farsi-speaking children with DLD (ages 4;4-7;6) to 16 Farsi-speaking TD age-matched children in Iran. The children with DLD were all diagnosed by Speech and Language Pathologists (SLPs). The children's morphosyntactic skills were assessed through narrative tasks and sentence-completion tasks. In the narrative tasks, children were shown pictures and were asked to tell a story. They were scored on the microstructure and macrostructure of the narrative task. The sentence-completion task was designed to assess the children's use of object clitics. Overall, the children with DLD made a higher number of grammatical errors on both tasks in comparison to the TD children.
The second study that informed the current study investigated pre-school monolingual Farsi-speaking children with DLD. Kazemi (2013) compared 27 TD monolingual Farsi-speaking children to 24 monolingual Farsi-speaking children with DLD. The children with DLD had all been previously diagnosed by SLPs in Iran. Kazemi (2013) elicited language samples from the participants by engaging them in free play with their parents. The study's main objective was to identify a set of Farsi-specific syntactic structures that could significantly and reliably differentiate between TD children and children with DLD. Kazemi (2013) identified a set of Farsi-specific measures and errors that differentiated between children with DLD and those who were TD. The language-specific structures identified were the plural marker /ha, the direct object marker (DOM) /ra, the progressive marker /mi, the possessive clitic, the direct object clitic and the ezafe.
The structures identified by Foroodi-Nejad (2011) and Kazemi (2013) were taken into consideration in the construction of the current Farsi LITMUS-SR task. The next two subsections illustrate the sets of sentence types that were included in the Farsi LITMUS-SR task.

Plurals
In Farsi, plural nouns can be marked by either /ha or /an, as in (1). The former is a general plural marker for all nouns, while the latter is a specific plural marker for animate nouns (Kazemi, 2013).

DOMs
The DOM /ra is always postnominal. Specificity and definiteness are the two key elements that determine if a noun will be marked by /ra. It is obligatory when it is associated with a demonstrative and/or proper noun (Gavarró & Heshmati, 2014), but it does not occur with unidentifiable and nonspecific nouns (Safari & Mahrpour, 2015). In the colloquial spoken variety, /ra can turn into /ro or /o, as in (2).
(2) Man maʃin/ro aez aeli xaeridaem
I car-DOM from Ali bought
'I bought the car from Ali.'

Present progressive marker
The present progressive tense in Farsi requires the verb "to have" dɑʃtaen, the present stem of the main verb with the correct person ending (singular: -aem, -i, -ɛ; plural: -im, -in, -an) and the prefix /mi, as in (3).
(3) ɛmruz dɑri maedʒaelɛ mi/xɔni
today you are magazine PRE-PRO/read-2nd-SG
'Today you are reading a magazine.'

Possessive and object clitics
In Farsi the possessive clitic is enclitic and is attached to the possessor, as shown in (4) (Rasekh, 2017). The object clitic is also enclitic and is attached to the verb, as in (5a) (Samvelian & Tsang, 2010). (5b) shows example (5a) with a noun phrase in the object position rather than an object clitic for comparison. The ezafe in Farsi is an unstressed vowel that occurs when a noun is modified and follows the noun (Ghomeshi, 1997).
In addition to these language-specific structures, we included two further sets of structures: syntactically complex structures that are problematic for children with DLD across languages (wh-questions, short and long actional and non-actional passives, adjunct temporal subordinate clauses, conditionals, and subject and object relative clauses) and syntactically simple structures serving as control structures (complement clauses and coordinate clauses). These sets of structures have been used in other LITMUS-SR tasks across languages.

Wh-questions
Farsi who and which object wh-questions, as in (7a) and (7b) below, are similar to English in that they involve syntactic movement (Adli, 2010).

Passives
Passives in Farsi are formed by taking the following steps: "(i) the demotion of the subject, (ii) the promotion of the object to subject position, and (iii) the morphological change in the verb, from an active form to a past participle, and the merge of šodaen, inflected for person and tense." (Gavarró & Heshmati, 2015, p. 85). Example (8a) illustrates a short passive and (8b) a long passive.
(8a) sɑrɑ bɛ bimarestan bɔrdɛ ʃɔd
Sara to hospital taken was
'Sara was taken to the hospital.'
(8b) naeqɑʃiɑ taevɑsɔtɛ maerdɔm dide ʃɔdaen
paintings by people seen were
'The paintings were seen by the people.'

Subordinate clauses
In Farsi all subordinate clauses are finite, are either in the indicative or subjunctive, and are generally preceded by the optional relative pronoun 'ke' (that/which) (Mahootian, 1997). In the present study, we included adjunct temporal subordinate clauses, as in (9).
(9) baetʃɛ kɛ budaem, raeftaem mɔsɑfɛraet
child when was-I went-I vacation
'When I was a kid, I went on vacation.'

Conditionals
Conditionals in Farsi are constructed in a very similar way to English. They consist of a main clause followed by an "if" = "aegaer" (short form "aegɛ") subordinate clause (Nilsson, 2007), as in (10).
(10) baestaeni migiri aegɛ dɔxtaerɛ xubi bɑʃi
ice cream will-get-you if-COND girl good be
'You will get ice cream if you are a good girl.'

Relative clauses
Relative clauses in Farsi are formed using the invariant complementizer ke. Ke does not agree with the noun phrase it modifies and is not marked for animacy, gender or number of the noun it modifies (Taghvaipour, 2004). Subject relative clauses follow the subject, as shown in (11), while object relative clauses follow the object, as shown in the right-branching object relative clause in (12a) and in the center embedded object relative clause in (12b).
(11) pɛesaeri kɛ bulizɛ ɑbi puʃidɛbud raeft
the boy who shirt blue wearing-was left
'The boy, who was wearing the blue shirt, left.'
(12a) mɔaelɛmɛ xɑnumi kɛ daevaet kaerdim rɔ did
the teacher the lady that we invited DOM saw
'The teacher saw the lady we invited.'
(12b) dɔxtaeri kɛ tɔ dust dɑri xɑhaerɛ maenɛ
the girl who you love is sister my
'The girl who you love is my sister.'

Complement Clauses
Farsi complement clauses, as shown in (13), can be complements of verbs, adjectives or nouns and can be finite or non-finite. The complement clauses used in this study were all non-finite to be matched with the complement clauses used in the English LITMUS-SR task.

Coordinates
Farsi coordinate clauses, as shown in (14), are constructed in a very similar way to English (i.e., two main clauses are joined by one of the coordinating conjunctions, such as vali/amma 'but' and o/va 'and').
(14) gɔʃnaemɛ vali qaezɑ naedɑrim
Hungry-I am but food no-have-we
'I am hungry but we don't have food.'
The structures in the Farsi LITMUS-SR task were organised into three levels of complexity from simplest to most complex, as in other LITMUS-SR tasks.

Research questions
The current study addressed the following research questions:
1) Are there differences in terms of accuracy in the children's performance between the three levels of complexity?
2) Are there differences in terms of accuracy between the structures at each level?
3) Is there a relationship between the children's performance on the Farsi LITMUS-SR task and their age and language history (AoO, LoE, Total use)?
The first two research questions were used to validate our assumptions about allocating structures in different levels of complexity. The third question addressed the relationship between age, language history and use with the children's performance on the Farsi LITMUS-SR task.

Participants
Twenty-five typically developing Farsi-English bilingual children between the ages of 6;3 and 11;6 were randomly recruited from Farsi schools in Toronto. Three participants dropped out halfway through the study and two had recordings that were corrupt; these five participants were excluded from the study. Thus, we report on 20 participants. All but one parent completed a modified version of the Questionnaire for Parents of Bilingual Children (PABIQ) (Tuller, 2015). The questions on the PABIQ were from the following seven sections: general information, early milestones, current language skills, languages used at home, languages used outside the home, information about parents, and current or past speech/language difficulties of parents/siblings. The PABIQ took about 15 minutes to complete. None of the children had a history of speech and/or language delay or impairment, and there were no concerns about the children's language development. Table 1 presents information on the children's language profiles. AoO is defined as the age at which the child is first exposed to a language. Two children were simultaneous bilinguals and the remaining 18 were sequential bilinguals. The total use of language is based on adding the language use of children at home for each language separately. This is why adding language use in the two languages does not add up to one (see Table 1 below). Seven of the 19 children were exposed to a third (French) or fourth (Turkish or Arabic) language, but their language proficiency in these languages was low based on parental report.
For participant recruitment, a school information sheet was sent to Farsi schools around the Greater Toronto Area. Most schools were happy to take part in the study and circulated an information sheet and consent form to parents of children who fit the inclusion criteria. The first author, who is a native speaker of Farsi, also went to the schools and talked to parents of potential participants to provide additional information.

Test design
In total, 18 language-independent and language-specific structures were included. Three levels of complexity were established for the Farsi SR task. The language-independent structures were allocated to different complexity levels largely in line with the English LITMUS-SR task for consistency (Level 1: who object, short passive; Level 2: coordination, complement clauses, which object, long passive; Level 3: subordinate clauses, relative clauses, conditionals). Language-specific structures were allocated to different complexity levels on the basis of morphosyntactic complexity (e.g., long ezafe was allocated to Level 2 whereas simple ezafe was allocated to Level 1), previous literature on the acquisition of Farsi and our intuition about morphosyntactic complexity in Farsi. Table 2 illustrates the allocation of sentence structures into the three levels of complexity.
Four sentences were created for each one of the structures, with the exception of the coordinates and complement clauses, which each had only two sentences because these were control sentences. This led to 68 sentences in total. These were given to a panel of experts consisting of researchers, Farsi teachers, and clinicians who were asked to judge the appropriateness of the Farsi lexical and syntactic items presented in the task in terms of age of acquisition. All sentences were judged to be age appropriate.
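The item count above can be cross-checked with a short calculation. The following is a minimal sketch, assuming the per-level structure counts given in the results (seven, six and five structures for Levels 1 to 3) and the allocation of both control structures to Level 2; the variable and function names are illustrative only.

```python
# Structures per level as reported in the results (7, 6 and 5 structures).
STRUCTURES_PER_LEVEL = {1: 7, 2: 6, 3: 5}
# Coordination and complement clauses (the control structures) are in Level 2.
CONTROLS_PER_LEVEL = {1: 0, 2: 2, 3: 0}
SENTENCES_PER_STRUCTURE = 4  # regular structures
SENTENCES_PER_CONTROL = 2    # control structures


def sentences_in_level(level: int) -> int:
    """Number of test sentences in a level, given the design described above."""
    controls = CONTROLS_PER_LEVEL[level]
    regular = STRUCTURES_PER_LEVEL[level] - controls
    return regular * SENTENCES_PER_STRUCTURE + controls * SENTENCES_PER_CONTROL


per_level = {level: sentences_in_level(level) for level in STRUCTURES_PER_LEVEL}
total = sum(per_level.values())
print(per_level, total)  # {1: 28, 2: 20, 3: 20} 68
```

Under these assumptions the three levels contain 28, 20 and 20 sentences, which matches the reported total of 68.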
Sentences in Level 1 ranged from 7-13 syllables (M = 9.82, SD = 1.21), those in Level 2 from 8-17 syllables (M = 11.55, SD = 2.40), and those in Level 3 from 11-16 syllables (M = 13.80, SD = 1.47). A significant difference was identified between the levels (F(2,65) = 31.55, p < 0.001). Pairwise comparisons indicated that the differences were between Level 1 and Level 2 (p = 0.003), Level 1 and Level 3 (p < 0.001), and Level 2 and Level 3 (p < 0.001). The sentence length increased with the level of complexity. A significant difference in syllable length was also found within Levels 2 and 3: F(5,14) = 5.22, p = 0.006 for Level 2 and F(3,16) = 4.38, p = 0.020 for Level 3. The longest structure in Level 2 was the complex ezafe (M = 14.75, SD = 2.87) and the shortest structure was the which object (M = 9.50, SD = 0.29). Pairwise comparisons between all structures in Level 2 indicate that the only significant difference in length was between the complex ezafe and the which object structures (p = 0.006). For Level 3 the longest structure was the conditional (M = 14.75, SD = 1.26) and the shortest structure was the subordinate clause (M = 12.00, SD = 1.47). Once again, pairwise comparisons between all the structures in Level 3 indicate that the only significant difference in length was between these two structures (conditional and subordinate clause, p = 0.029).
Note (Table 1): Age in months; total use of language in the home is on a 0-1 scale, with 1 equal to total use and 0 equal to no use.

Test administration
The Farsi LITMUS-SR task involved asking children to repeat sentences in Farsi. The sentences were presented randomly within each level. Informed consent was obtained from both the parents and the children prior to administration, and children were told they could stop the testing at any time. The SR task took about 10-20 minutes to complete and the children's responses were audio-recorded during testing. At the start of the SR task two practice sentences were used to help the children understand how the task worked. The researcher read out the sentences to the children and they were required to repeat the sentences verbatim after they heard them. Reading the sentences to each participant allows for greater rapport with clients. Presenting the sentences through headphones takes away the personal attention that researchers and clinicians can give to clients and participants. In clinical settings, speech and language therapists (SLTs) are more likely to use an oral version than a computerized version for practical reasons; the Farsi version was therefore made to match. It is envisaged that when the test is finalized and used for clinical purposes, both a computer and a paper version will be available for clinical use. The paper version can be used by clinicians who speak Farsi for quick and easy administration, while the computerized version can be used by those SLTs who do not speak Farsi but have Farsi-speaking clients and would like to obtain information on their Farsi language use.

Transcription and scoring
The sentences were first transcribed and then scored. Transcription and scoring were conducted by the first author. The data were scored in three ways, using the scoring scheme of the Test of Language Development-Primary (TOLD-P-4) (Newcomer & Hammill, 2008), the scoring scheme of the Clinical Evaluation of Language Fundamentals 3 (CELF-3) (Semel et al., 1995) and the structural scoring from Marinis and Armon-Lotem (2015). For our study, we chose to use the TOLD-P-4 scoring scheme, in which each sentence is given a score of 1 if it was repeated verbatim and a score of 0 if one or more changes were made. This scoring scheme leads to a smaller number of errors and is a simpler and faster coding method for clinicians.
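The binary scoring rule described above can be sketched in a few lines. This is only an illustration, not the procedure used in the study, where scoring was done by hand on transcriptions. The function names, the whitespace and case normalisation, and the English placeholder sentences are all assumptions introduced for the example.

```python
def normalise(sentence: str) -> list[str]:
    """Lower-case and split on whitespace (an assumed, illustrative normalisation)."""
    return sentence.lower().split()


def score_verbatim(target: str, response: str) -> int:
    """Return 1 for a verbatim repetition, 0 if one or more changes were made."""
    return int(normalise(target) == normalise(response))


def percent_correct(pairs: list[tuple[str, str]]) -> float:
    """Percentage of (target, response) pairs that receive a score of 1."""
    if not pairs:
        raise ValueError("no items to score")
    return 100 * sum(score_verbatim(t, r) for t, r in pairs) / len(pairs)


# Hypothetical placeholder items (not taken from the Farsi task).
pairs = [
    ("the boy left", "the boy left"),                     # verbatim -> 1
    ("the boy left", "The boy  left"),                    # case/spacing only -> 1
    ("the girl saw the teacher", "the girl saw teacher"), # omission -> 0
    ("the paintings were seen", "the painting was seen"), # substitution -> 0
]
print(percent_correct(pairs))  # 50.0
```

Because any single change yields a score of 0, the scheme needs no judgement about error type, which is what makes it fast to apply.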

Data analyses
First, the scores were analyzed using repeated measures ANOVAs to determine whether there were statistically significant differences between the three levels of complexity. Subsequently, repeated measures ANOVAs were used to ascertain differences between the syntactic structures within each level. Correlation analyses were used to address whether the results from the SR task correlated with age and the participants' language history from the PABIQ (AoO, LoE, language use, etc.).

Reliability ratings
In order to measure the internal consistency of the task, we conducted a split-half reliability analysis comparing the items in the first half of the test to the items in the second half. The Spearman-Brown coefficient was approximately 86%. We also examined interrater reliability between two independent raters. The second rater was a Farsi-speaking SLP graduate student from Isfahan University in Iran. The interrater reliability for 20% of the data was 93.8%.
Figure 1 shows the mean scores of the three levels in the SR task. A one-way repeated measures ANOVA with the factor Level showed a statistically significant main effect of Level, F(1.485, 28.214) = 54.284, p < 0.001, η² = 0.741. Bonferroni-corrected pairwise t-tests indicated that Level 1 (M = 85.71, SD = 13.76) had significantly higher accuracy than Level 2 (M = 75.25, SD = 18.81; p < 0.001) and Level 3 (M = 58.50, SD = 21.77; p < 0.001), and that Level 2 also had significantly higher accuracy than Level 3 (p < 0.001).
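The split-half analysis reported under Reliability ratings can be illustrated as follows. This is a minimal sketch assuming the standard Spearman-Brown correction, 2r / (1 + r), applied to the Pearson correlation between first-half and second-half totals. The function names and the toy data are hypothetical, and the half-test correlation obtained in the study is not reported.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Step the half-test correlation up to full test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores: list[list[int]]) -> float:
    """item_scores holds one list of 0/1 item scores per child, in test order."""
    half = len(item_scores[0]) // 2
    first = [sum(child[:half]) for child in item_scores]
    second = [sum(child[half:]) for child in item_scores]
    return spearman_brown(pearson_r(first, second))


# Purely illustrative: a half-test correlation of 0.75 would correspond to
# a corrected coefficient of about 0.86.
print(round(spearman_brown(0.75), 2))  # 0.86
```

A first-half versus second-half split is sensitive to fatigue and to the ordering of items by difficulty, so an odd-even split is a common alternative. The sketch follows the split described in the text.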

Differences between the structures in each level
The next set of analyses was conducted on the sentence structures within each level to test for statistically significant differences between the structures within each level. Figure 2 shows the mean scores for the seven structures in Level 1. A one-way repeated measures ANOVA with the factor Structure did not show a statistically significant main effect of Structure (F(4.01, 76.20) = 0.87, p = 0.484, η² = 0.04).
To determine whether there were any differences between the six structures in Level 2, a one-way within-samples ANOVA with the factor Structure was conducted. Figure 3 shows the descriptive information for the Level 2 scores. A significant main effect of Structure was found (F(3.48, 66.09) = 12.02, p < 0.001, η² = 0.39). Pairwise comparisons using Bonferroni correction identified that complement clauses (M = 100, SD = 0) were more accurate
Figure 4 shows the mean scores for the five structures in Level 3. A one-way repeated measures ANOVA with the factor Structure showed a significant main effect of Structure (F(4, 76) = 5.89, p < 0.001, η² = 0.24). Pairwise t-tests using Bonferroni correction showed that object relative clauses with right branching (M = 41.25, SD = 26) were less accurate than subject relative clauses (M = 65, SD = 28.562; p = 0.015) and subordinate clauses (M = 70, SD = 27.63; p = 0.001).
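The Bonferroni correction applied to the pairwise comparisons can be sketched as below. The helper name and the raw p-values are hypothetical. The text does not state whether the reported p-values are raw or adjusted, so the numbers here are for illustration only.

```python
from math import comb


def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_comparisons)


# Five structures in Level 3 give 5 choose 2 = 10 possible pairwise comparisons.
n_structures = 5
n_comparisons = comb(n_structures, 2)

# Hypothetical raw p-values for three of those comparisons.
raw_p = [0.0015, 0.03, 0.2]
adjusted = [bonferroni_adjust(p, n_comparisons) for p in raw_p]
for raw, adj in zip(raw_p, adjusted):
    print(f"raw p = {raw:.4f} -> adjusted p = {adj:.3f}")
```

With ten comparisons, a raw p-value must fall below 0.005 to remain significant at the 0.05 level after adjustment, which is why only the largest differences between structures survive.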

Correlations between the performance on the SR task, age, and language history
Pearson correlation coefficients were computed to assess the relationship between the participants' SR total scores and their age. A strong correlation was found (r = 0.59, p < 0.01). Pearson correlations were also conducted to investigate the relationship between SR total scores and total use of Farsi and English, as well as AoO of Farsi and English. These results can be found in Table 3. Table 3 indicates that the total use of Farsi is positively correlated with the children's performance on the SR task.
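As a quick consistency check on the reported correlation with age, a Pearson r can be converted into a t statistic with n - 2 degrees of freedom using t = r * sqrt((n - 2) / (1 - r^2)). The sketch below assumes that all 20 participants entered the correlation; the function name is hypothetical and the critical value is taken from standard t tables.

```python
from math import sqrt


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r_reported = 0.59        # correlation between SR total score and age (from the text)
n_children = 20          # assumption: all 20 participants were included
T_CRIT_01_DF18 = 2.878   # two-tailed critical t for alpha = .01 with df = 18

t_value = t_from_r(r_reported, n_children)
print(round(t_value, 2), t_value > T_CRIT_01_DF18)
```

Under this assumption the t statistic exceeds the critical value, which agrees with the reported p < 0.01.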

Discussion
The aim of this study was to develop a Farsi LITMUS-SR task based on the principles laid out by the Bi-SLI COST Action as well as on previous research on the language development and language impairment of Farsi-speaking children. We included language-specific structures and language-independent structures. The language-specific structures were chosen based on research by Kazemi (2013), who indicated that a set of structures significantly differentiated monolingual TD Farsi children from those with DLD. The language-independent structures were both syntactically complex and syntactically simple. The structures were divided into three levels from least to most complex. Several analyses were conducted to determine whether the structures in the Farsi LITMUS-SR task were able to reveal significant differences in the participants' language skills in Farsi.

Effects of syntactic complexity and length
The first analysis compared the three levels of the task. The results showed that participants performed better on the structures in Level 1 than on the structures in Level 2 and Level 3. In addition, the children performed better on the structures in Level 2 than on those in Level 3. The structures in Level 1 were syntactically simpler and shorter than those in Level 2 and Level 3. Complexity and length rose in Level 2 and then again in Level 3, whose structures were the most complex and the longest. The gradual decrease in performance with increasing syntactic complexity and length can be explained by Riches (2012), who discusses the importance of sentence complexity and length. He showed that while syntactic complexity affected error rates, this was irrespective of sentence length. The drop in performance from Level 1 to Level 3 is therefore likely to be due to the increase in complexity rather than length. More complex structures may not yet be fully established in our participants' grammatical repertoire.
The second set of analyses helped determine differences in the participants' performance on the different structures within each level. The findings for Level 1 indicate that the children did not make significantly more errors on one structure than on another. This indicates that the structures chosen in this level are of similar complexity and justifies their inclusion in the same level. In contrast, in Level 2, the children made the most errors on the long passive structures and the fewest errors on the complement clauses. The complement clauses used in the study were non-finite and quite simple; this could have been the reason why the children did so well on these structures. Complement clauses could either be eliminated from the task due to the ceiling effect or be regrouped to Level 1. On the other hand, long passives are a very rare structure in Farsi, and the word used for 'by' in Farsi (tavasote) is a very infrequent word that young children could be unfamiliar with. Our findings are supported by Vahidiyan-Kamyar (2003), who indicated that the passive structure is not often used in Farsi and is often acquired at a later age. Due to the complexity of this structure, long passives could be regrouped to Level 3. At Level 3, children made the most errors on the object relative right-branching sentence type. This is consistent with previous literature, as both TD bilingual children and DLD monolinguals have the most difficulty with object relative clauses (Fleckstein et al., 2016; Friedmann & Novogrodsky, 2004). The significant differences in performance within the levels cannot be due to variations in length. In Level 2 there was a difference in length only between the complex ezafe and the which object, with complex ezafe being the longest sentence type. However, the most difficult construction within Level 2 was long passives and not complex ezafe.
Similarly, in Level 3 pairwise comparisons showed that conditionals were significantly longer than subordinate clauses, but the structure with the lowest accuracy within Level 3 was object relative clauses.

Effects of age and language history
The final aspect of this study concerned the effects of age and language history on SR. We found that a child's total use of Farsi had a strong positive correlation with their overall score on the Farsi SR task. This is in line with Chondrogianni and Marinis (2011), who showed that external factors in Turkish-English bilinguals, such as use of English in the home, were more highly correlated with morphosyntactic levels than internal factors. However, the current findings are somewhat inconsistent with Armon-Lotem et al.'s (2011) results, which showed that bilingual children's performance on SR tasks was influenced by both external and internal factors, with the latter having a stronger effect on performance. In the present study, AoO had no effect on performance. However, it is important to note that there was very little variance in the AoO of Farsi and the sample size was small. These are the likely reasons for the lack of a correlation with AoO and the discrepancy with Armon-Lotem et al. (2011).

Conclusion
The main objective of the current study was to develop a Farsi assessment that can be used to measure the language abilities of heritage Farsi-speaking children in clinical settings. The aim is for this tool ultimately to be sensitive enough to distinguish between Bi-TD and Bi-DLD Farsi-English bilingual children. The Farsi LITMUS-SR task was created on the basis of the principles laid out by Marinis and Armon-Lotem (2015). The present task had both language-specific and language-independent structures that were organised into three levels from least to most complex. There were three main findings. First, complexity and length affected the children's performance, with performance decreasing as complexity and length increased. Second, differences were found between the structures in Levels 2 and 3, but no differences were found in Level 1. Third, the children's age and total use of Farsi were significantly correlated with total performance on the SR task, indicating the importance of internal and external factors for language development. These findings demonstrate that the Farsi SR task can be a promising diagnostic tool for children with Farsi as a heritage language worldwide.

Notes
1 The back and forward slashes on /ha, /ra and mi\ denote the position in which these affixes attach to a word. The forward slash indicates that it is a suffix, which attaches to the end of a word, while the back slash indicates that it is a prefix, which attaches to the beginning of a word.
2 To address one of the reviewers' questions as to whether the effect of level is due to age effects, we conducted a second repeated measures ANOVA with the factor Level and Age as a covariate. The results showed that there was a significant effect of Level (F(1.35, 24.34) = 5.491, p < 0.05, η² = 0.234) but no significant interaction between Level and Age.