Evidence-based reading interventions for English language learners: A multilevel meta-analysis

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Associated Data


Data included in article/supplementary material/referenced in article.

Abstract

The number of English Language Learners (ELLs) has been growing worldwide. ELLs are at risk for reading disabilities due to dual difficulties with linguistic and cultural factors. This raises the need for finding practical and efficient reading interventions for ELLs to improve their literacy development and English reading skills. The purpose of this study is to examine the evidence-based reading interventions for English Language Learners to identify the components that create the most effective and efficient interventions. This article reviewed literature published between January 2008 and March 2018 that examined the effectiveness of reading interventions for ELLs. We analyzed the effect sizes of reading intervention programs for ELLs and explored the variables that affect reading interventions using a multilevel meta-analysis. We examined moderator variables such as student-related variables (grades, exceptionality, SES), measurement-related variables (standardization, reliability), intervention-related variables (contents of interventions, intervention types), and implementation-related variables (instructor, group size). The results showed medium effect sizes for interventions targeting basic reading skills for ELLs. Medium-size group interventions and strategy-embedded interventions were more important for ELLs who were at risk for reading disabilities. These findings suggested that we should consider the reading problems of ELLs and apply the Tier 2 approach for ELLs with reading problems.

Keywords: English language learners, Evidence-based intervention, Meta-analysis, Reading


1. Introduction

There is a growing body of literature that recognizes the importance of quality education for learners who study in a language other than their native language (Estrella et al., 2018; Ludwig et al., 2019). As cultural, racial, ethnic, and linguistic diversification takes place globally, the number of students studying a second language different from their native language is also increasing worldwide. In the United States, nearly 5 million learners who are not native speakers of English are currently attending public schools, and this figure has increased significantly over the past decade (NCES, 2016). As the number of children whose native language is not English increased, the need for educational support also increased. Furthermore, the implementation of NCLB policy emphasizes the need for quality education for all students included in all schools. Accordingly, NCLB has emerged as a critical policy for learners to study in their second language. In other words, there is an urgent need to ensure that non-native English speakers receive appropriate education due to NCLB, which has not only increased the demand for education but also led to the practice of enhanced education for learners whose English is not their native language.

ELLs (English language learners) are learners in English-speaking countries whose native language is not English (National Center for Education Statistics, 2021). The education provided to these ELLs is called ESL (English as a second language), ESOL (English to speakers of other languages), EFL (English as a foreign language), and so on. Each term is adopted differently depending on the policy, purpose, and status of operation of the state and/or school district. While a variety of terms have been suggested, this paper uses the term ‘ELLs’ to refer to learners who are not native speakers of English and uses the terms ‘the English education program’ and the ‘ELL program’ to refer to the English education program provided to ELLs.

To ensure quality education, students identified as ELLs can participate in supportive programs to improve their English skills. These ELL programs can be broadly divided into two methods: “pull-out” and “push-in” (Honigsfeld, 2009). In the pull-out program, students are taken to a specific space other than the classroom at regular class time and are separately taught English. In the push-in program, the ELL teacher joins the mainstream ELLs’ classroom and assists them during class time. Through these educational supports, ELLs are required to achieve not only English language improvements addressed in Title III of NCLB but also language art achievements appropriate to their grade level addressed in Title I of NCLB. ELLs are expected to achieve the same level of academic achievement as students of the same grade level, as well as comparable language skills.

A considerable amount of literature has been published on the achievement and learning status of ELLs (Ludwig, 2017; Soland and Sandilos, 2020). These studies revealed that despite intensive, high-quality educational support, ELLs encounter difficulties in learning and academic achievement. Results from the National Assessment of Educational Progress (NAEP) show that the achievement gap between non-ELLs and ELLs is steadily expanding in both mathematics and reading (Polat et al., 2016). Ultimately, ELLs are reported to have the highest risk of dropping out of school (Sheng et al., 2011). These difficulties are not limited to early school age. Fry (2007) reported that, on a national standardized test of 8th-grade students, ELLs performed lower than white students in both reading and math. Callahan and Shifrer (2016) analyzed data from a nationally representative educational longitudinal study in 2002 and found that, even after taking into account language, socio-demographic, and academic factors, ELLs still have a large gap in high school academic achievement. Additionally, research has suggested that ELLs are less likely to participate in higher education institutions compared to their non-ELL counterparts (Cook, 2015; Kanno and Cromley, 2015).

Factors found to influence the difficulties of ELLs in learning have been explored in several studies (Dussling, 2018; Thompson and von Gillern, 2020; Yousefi and Bria, 2018). There are two main reasons for these difficulties. First, ELLs face many challenges in learning a new language by following the academic content required in the school year (American Youth Policy Forum, 2009). Moreover, language is an area that is influenced by sociocultural factors, and learning academic contents such as English language art and math are also influenced by sociocultural elements and different cultural backgrounds, which affects the achievement of ELLs in school (Chen et al., 2012; Orosco, 2010). Second, it is reported that the heterogeneity of ELLs makes it challenging to formulate instructional strategies and provide adequate education for them. Due to the heterogeneous traits in the linguistic and cultural aspects of the ELL group, there are limitations in specifying and guiding traits. Therefore, properly reflecting their characteristics is difficult.

The difficulties of ELLs in academic achievement raise the need to find practical and efficient reading interventions that improve their English language and academic achievement, including their achievement in English language arts. These needs and demands led to various studies that analyze the difficulties of ELLs. Over the past decade, these studies have provided important information on education for ELLs. The main themes of the studies are difficulties in academic achievement and interventions for ELLs, including reading (Kirnan et al., 2018; Liu and Wang, 2015; Roth, 2015; Shamir et al., 2018; Tam and Heng, 2016), writing (Daugherty, 2015; Hong, 2018; Lin, 2015), or both reading and math (Dearing et al., 2016; Shamir et al., 2016). The influences of teachers on children's guidance (Kim, 2017; Daniel and Pray, 2017; Téllez and Manthey, 2015; Wassell, Hawrylak, and Scantlebury, 2017) and the influences of family members (Johnson and Johnson, 2016; Walker, 2017) have also been examined.

Reading is known to function as an important predictor of success not only in English language art itself but also in overall school life (Guo et al., 2015). This is because reading is conducted throughout the school years, as most of the activities students perform in school are related to reading. Furthermore, reading is considered one of the major fundamental skills in modern society because it has a strong relationship with academic and vocational success beyond school-based learning (Lesnick et al., 2010). In particular, for ELLs, language is one of the innate barriers; thereafter, reading is one of the most common and prominent difficulties in that it is not done in their native language (Rawian and Mokhtar, 2017; Snyder et al., 2017). In this respect, several studies have investigated reading for ELLs. These studies explore effective interventions and strategies (Kirnan et al., 2018; Mendoza, 2016; Meredith, 2017; Reid and Heck, 2017) and suggest reading development models or predictors for reading success (Boyer, 2017; Liu and Wang, 2015; Rubin, 2016). For these individual studies to provide appropriate guidance to field practitioners and desirable suggestions for future research, aggregation of the overall related studies, not only of the individual study, and research reflections based on them are required. Specifically, meta-analysis can be an appropriate research method. Through meta-analysis, we can derive conclusions from previous studies and review them comprehensively. Furthermore, meta-analysis can ultimately contribute to policymakers and decision-makers making appropriate decisions for rational strategies and policymaking.

Although extensive research has been carried out on the difficulties of ELLs and how to support them, a sufficiently comprehensive meta-analysis of these studies has not been carried out. Some studies have focused on specific interventions, such as morphological interventions (Goodwin and Ahn, 2013), peer-mediated learning (Cole, 2014), and video game-based instruction (Thompson and von Gillern, 2020). Ludwig, Guo, and Georgiou (2019) demonstrated the effectiveness of reading interventions for ELLs. However, they divided reading-related variables into “reading accuracy”, “reading fluency”, and “reading comprehension” and examined the effectiveness of the reading-related attributes only within each of these variables. Therefore, that study is limited in exploring the various aspects of reading and their contribution to the effectiveness of reading interventions.

Individual studies have their characteristics and significance. However, for individual studies to be more widely adopted in the field and to be a powerful source for future research, it is necessary to analyze these individual studies more comprehensively. Meta-analysis reviews past studies related to the topic by 'integrating' previous studies, analyzes and evaluates them through 'critical analysis', provides implications to the field, and gives rise to intellectual stimulation to future studies by ‘identifying issues’ (Cooper et al., 2019). Through this, meta-analysis can be a useful tool for diagnosing the past where relevant research has been conducted, taking appropriate treatment for the present, and providing intellectual stimulation for future studies.

Therefore, the purposes of this study are to examine evidence-based reading interventions for ELLs presented in the literature to analyze their effects and to identify the actual and specific components for creating the most effective and efficient intervention for ELLs. The findings of this study make a major contribution to research on ELLs by demonstrating the implications for the field and future study.

2. Method

2.1. Selection of studies

A meta-analysis of peer-reviewed articles on ELL reading interventions published between January 2008 and March 2018 was conducted. Following the general steps of a meta-analysis, data related to reading interventions for English language learners were collected as follows. First, educational and psychological publication databases, such as Google Scholar (https://scholar.google.co.kr), ERIC (https://eric.ed.gov/), ELSEVIER (http://www.elsevier.com), and Springer (https://www.springer.com/gp), were searched using the terms “ELLs,” “ESL,” “Reading,” “Second language education,” “Effectiveness,” and “Intervention,” separately and in combination with each other. We reviewed the results of the web-based search and included all relevant articles on the preliminary list. We then selected the final list of articles to be analyzed by applying inclusion and exclusion criteria to the preliminary list. Studies were included in the final list based on four primary criteria. First, each study had to evaluate the effectiveness of a school-based reading intervention using an experimental or quasi-experimental group design. Accordingly, single-case, qualitative, and/or descriptive studies of ELLs were excluded from the analysis. Second, we included all types of reading-related interventions (i.e., phonological awareness, word recognition, reading fluency, vocabulary, and reading comprehension). Third, each study needed to report data in a statistical format that allowed an effect size to be calculated. Fourth, we only included studies whose subjects were in grades K-12. The preliminary list had 75 articles, but since some of these studies did not meet the inclusion criteria, we excluded them from the final list for analysis. In total, this meta-analysis included 28 studies with 234 effect sizes (see Figure 1).

Figure 1

PRISMA flow diagram.

2.2. Data analysis

2.2.1. Coding procedure

To identify the relevant components of the evidence-based reading interventions for ELLs, we developed an extensive coding document. Our interest was in synthesizing the effect sizes and finding the variables that affect the effectiveness of reading interventions for ELLs. The code sheet was made based on a code sheet used in Vaughn et al. (2003) and Wanzek et al. (2010). All studies were coded for the following: (a) study characteristics, including general information about the study, (b) student-related variables, (c) intervention-related variables, (d) implementation-related variables, (e) measurement-related variables, and (f) quantitative data for the calculation of effect sizes.

Within the study characteristics category, we coded the researchers’ names, publication year, and title from each study to identify the general information about each study. For the student-related variables, mean age, grade level(s), number of participants, number of males, number of females, sampling method, exceptionality type (reading ability level), identification criteria in case of learning disabilities, race/ethnicity, and SES were coded. We divided grade level(s) into lower elementary (K-2), upper elementary (3–5), and secondary (6–12). When students with learning disabilities participated in the study, we coded the identification criteria reported in the study. For race/ethnicity, we coded white, Hispanic, black, Asian, and others. Within intervention-related variables, we coded the title of the intervention, the key instructional components of the intervention, the type of intervention, and the reading components of the intervention. The reading components coded were phonemic awareness, phonics, fluency, vocabulary, reading comprehension, listening comprehension, and others. If an intervention contained multiple reading components, all reading components included in the intervention were coded. Within implementation-related variables, we coded group size, duration of the intervention (weeks), the total number of sessions, frequency of sessions per week, length of each session (minutes), personnel who provided the intervention (i.e., teacher, researchers, other), and the setting. Within measurement-related variables, we coded the title of the measurement, reliability coefficient, validity coefficient, type of measurement, type of reliability, and type of validity. We also coded quantitative data such as the pre- and posttest means, the pre- and posttest standard deviations, and the number of participants in the pre- and posttests for both the treatment and control groups. These coding variables are defined in Table 1.
The research background and sample information are in Appendix 1.

Table 1

| Study Component | Code | Details |
| --- | --- | --- |
| General Information | Title | |
| | Names of researchers | |
| | Publication year | |
| Participant | Mean age | |
| | Age and Grade levels | Preschool, Lower elementary (K-2), Upper elementary (3–5), Secondary (6–12) |
| | Number of participants | Total number of participants, Number of girls, Number of boys |
| | Exceptionality | General, Learning difficulties, Learning disabilities, Others |
| | Race/Ethnicity | European-American, Hispanic, African-American, Asian/Pacific Islander, Others |
| | SES | Lower, Middle, Upper |
| Intervention | Title of intervention | |
| | Key instructional components | |
| | Type of reading intervention | Strategy instruction, Peer tutoring, Computer-based learning, and Others |
| | Reading components | Phonemic awareness, Phonics, Fluency, Vocabulary, Reading comprehension, Listening comprehension and Others |
| Implementation | Group size | Small group (1 or more and 5 or less), Middle group (6 or more and 15 or less), and Large group or class size (16 or more) |
| | Duration of intervention (weeks) | |
| | Total number of sessions | |
| | Frequency per week | |
| | Length of each session (minutes) | |
| | Instructor | Teachers, Graduate students, Researchers, Others |
| | Setting | Classroom, Resource room, Afternoon school, and Others |
| Measurement | Title of measurement methods | |
| | Type of measurement | Standardized measurement and Researcher-developed measurement |
| | Reliability coefficient | Reported and Unreported |
| | Validity coefficient | Reported and Unreported |
| | Type of reliability | Test-retest reliability, Cronbach α, and Others |
| | Type of validity | Criterion validity, Construct validity, Content validity and Others |

2.2.2. Coding reliability

The included articles were coded according to the coding procedure described above. Two researchers coded each study separately and reached 91% agreement. Afterward, the researchers reviewed and discussed the differences to resolve the initial disagreements.
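As an illustration of how an agreement figure of this kind can be obtained, the following minimal Python sketch computes simple percent agreement between two coders. It assumes that agreement was calculated as the share of identically coded items, which the text above does not specify; the function name and the example codes are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same items")
    if not coder_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codes for one variable (group size) across five studies.
coder_a = ["small", "middle", "large", "small", "mixed"]
coder_b = ["small", "middle", "large", "middle", "mixed"]

# The coders disagree on one of the five items.
agreement = percent_agreement(coder_a, coder_b)
```

Items on which the coders differ (the fourth study in this hypothetical case) would then be discussed until consensus is reached.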

2.2.3. Data analysis

First, we calculated 234 effect sizes from the interventions included in the 28 studies. Effect sizes were calculated using Cohen's d formula. In addition, we conducted a two-level meta-analysis through multilevel hierarchical linear modeling (HLM) using the HLM 6.0 interactive mode statistical program to analyze the computed effect sizes and find the predictors that affect the effect sizes of reading interventions. HLM is appropriate for obtaining both overall summary statistics and a quantification of the variability in the effectiveness of interventions across studies, as a means of assessing the generalizability of findings. Moreover, HLM readily estimates the overall mean effect size in the unconditional model, and it is useful for explaining variability in the effectiveness of interventions between studies in the conditional model. The aim of the current study is to provide a broad overview of interventions for ELLs. To achieve this aim, we fitted an unconditional model to estimate the overall mean effect size and a conditional model to identify factors that have an impact on the strength of effect sizes. With regard to variables related to the effectiveness of interventions, we fitted conditional models with student-related, measurement-related, intervention-related, and implementation-related variables. In quantitative meta-analyses, it is assumed that observations are independent of one another (Hox and de Leeuw, 2003). However, this assumption usually does not hold in social science research if observations are clustered within larger groups (Bowman, 2003), because the effect sizes within a study might not be homogeneous (Beretvas and Pastor, 2003). Thus, a two-level multilevel meta-analysis using a mixed-effect model was employed because multiple effect sizes are provided within a single education study. To calculate effect size (ES) estimates using Cohen's d, we use the following equation [1]:

$$ES = \frac{M_t - M_c}{SD_{pooled}} \tag{1}$$

The pooled standard deviation, $SD_{pooled}$, is defined as

$$SD_{pooled} = \sqrt{\frac{SD_1^2 (n_1 - 1) + SD_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \tag{2}$$
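The effect size calculation above can be sketched in a few lines of Python. This is an illustrative sketch rather than the software used in the study; it assumes that groups 1 and 2 in the pooled standard deviation correspond to the treatment and control groups, and the function names and input values are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    return math.sqrt((sd1**2 * (n1 - 1) + sd2**2 * (n2 - 1)) / (n1 + n2 - 2))


def cohens_d(mean_t, mean_c, sd_t, n_t, sd_c, n_c):
    """Cohen's d: treatment-control mean difference in pooled-SD units."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Hypothetical posttest scores: treatment mean 52, control mean 47,
# both groups with SD 10 and 30 students.
d = cohens_d(mean_t=52.0, mean_c=47.0, sd_t=10.0, n_t=30, sd_c=10.0, n_c=30)
```

With equal standard deviations of 10, the pooled standard deviation is also 10, so a 5-point difference in means corresponds to d = 0.5.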

In HLM, the unconditional model can be implemented to identify the overall effect size across all estimates and to test for homogeneity. If the assumption of homogeneity is rejected by a significant chi-square statistic in the unconditional model, this means that there are differences within and/or between studies. In that case, the analysis proceeds to the next step to find moderators that influence effect sizes. This step is called a level two model or a conditional model. A conditional model is conducted to investigate the extent of the influence of the included variables.

The level one model (unconditional model) was expressed as [3], and the level two model (the conditional model) was expressed as [4].

$$d_j = \delta_j + e_j, \quad e_j \sim N(0, V_j) \tag{3}$$

$$\delta_j = \gamma_0 + u_j, \quad u_j \sim N(0, \tau) \tag{4}$$

In equation [3], $\delta_j$ represents the mean effect size value for study j, and $e_j$ is the within-study error term, assumed to be normally distributed with a mean of 0 and a variance of $V_j$. In the level two model, equation [4], $\gamma_0$ represents the overall mean effect size for the population, and $u_j$ represents the sampling variability between studies, presumed to be normally distributed with a mean of 0 and a variance of $\tau$.
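The two-level structure in equations [3] and [4] can be illustrated with a minimal Python sketch. It does not reproduce the HLM 6.0 restricted maximum likelihood procedure used in this study: it swaps in the simpler DerSimonian-Laird method-of-moments estimator for τ and assumes one effect size per study. The function name and the input values are hypothetical.

```python
def random_effects_mean(d, v):
    """Estimate gamma_0 and tau for d_j = gamma_0 + u_j + e_j.

    d: observed effect sizes d_j; v: their within-study variances V_j.
    tau is estimated with the DerSimonian-Laird method of moments
    (an assumption of this sketch, not the REML estimator of HLM).
    """
    k = len(d)
    if k < 2 or k != len(v):
        raise ValueError("need at least two effect sizes with matching variances")
    w = [1.0 / vj for vj in v]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wj * dj for wj, dj in zip(w, d)) / sw
    # Homogeneity statistic Q: weighted squared deviations from the fixed mean.
    q = sum(wj * (dj - fixed) ** 2 for wj, dj in zip(w, d))
    c = sw - sum(wj**2 for wj in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, floored at 0
    w_star = [1.0 / (vj + tau2) for vj in v]  # random-effects weights
    gamma0 = sum(wj * dj for wj, dj in zip(w_star, d)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return gamma0, se, tau2, q


# Hypothetical effect sizes and within-study variances for four studies.
d_values = [0.2, 0.5, 0.9, 1.3]
v_values = [0.04, 0.05, 0.04, 0.06]
gamma0, se, tau2, q = random_effects_mean(d_values, v_values)
```

When Q is large relative to its degrees of freedom (k − 1), the estimated τ is positive, which corresponds to rejecting homogeneity and motivates the moderator analysis in the conditional model.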

Regarding publication bias, we inspected the funnel plot with the 'funnel()' command of the metafor R package (Viechtbauer, 2010), and to verify this statistically, we used the dmetar R package (Harrer et al., 2019). Egger's regression test (Egger et al., 1997) was conducted using the 'eggers.test()' command to review publication bias. Egger's regression analysis showed that there was significant publication bias (t = 3.977, 95% CI [0.89–2.54], p < .001). To correct for this, a trim-and-fill technique (Duval and Tweedie, 2000) was used. As a result, the total effect size corrected for publication bias was also calculated. The funnel plot is shown in Figure 2.
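The logic of Egger's regression test can be sketched in plain Python. This is not the dmetar implementation used in the study; it is a minimal version of the classical formulation, in which each standardized effect (effect size divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and the data are hypothetical.

```python
import math


def eggers_intercept(effects, std_errors):
    """Intercept and its standard error from Egger's regression.

    Regresses effect / SE on 1 / SE by ordinary least squares.
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three effects with matching standard errors")
    y = [d / se for d, se in zip(effects, std_errors)]
    x = [1.0 / se for se in std_errors]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance with n - 2 degrees of freedom.
    s2 = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mx**2 / sxx))
    return intercept, se_intercept


# Hypothetical data in which less precise studies report larger effects.
effects = [1.2, 0.9, 0.6, 0.4, 0.3]
std_errors = [0.5, 0.4, 0.3, 0.2, 0.1]
intercept, se_intercept = eggers_intercept(effects, std_errors)
t_stat = intercept / se_intercept  # compared against a t distribution with n - 2 df
```

In this hypothetical data set the intercept is positive, the pattern expected when small studies with small effects are missing from the literature.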

Figure 2

3. Results

We analyzed 28 studies to identify influential variables that account for the effects of reading interventions for ELLs. Before performing the multilevel meta-analysis, the effect sizes of the 28 studies were analyzed by traditional meta-analysis. The forest plots for the individual effect sizes of the 28 studies are shown in Appendix 2. We present our findings with our research questions as an organizational framework. First, we present an unconditional model for finding the overall mean effect size. Then, we describe the variables that influenced the effect size of reading interventions for ELLs using a conditional model.

3.1. Unconditional model

An unconditional model of the meta-analysis was tested first. In the analysis, restricted maximum likelihood estimation was used. This analysis was conducted to confirm the overall mean effect size and to examine the variability among all samples. The results are shown in Table 2 .

Table 2

Results of the unconditional model analysis.

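| Fixed Effect | Coefficient | Standard Error | t Ratio (df) | 95% CI Lower | 95% CI Upper |
| --- | --- | --- | --- | --- | --- |
| Intercept | 0.653 | 0.063 | 10.173∗∗ (233) | 0.530 | 0.776 |

| Random Effect | Variance Component | Standard Deviation | Chi-square |
| --- | --- | --- | --- |
| Intercept | 0.589 | 0.767 | 1245.90∗∗∗ |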

The intercept coefficient in the fixed model is the overall mean effect size from 234 effect sizes. This means that the effect of reading intervention for English language learners is medium based on Cohen's d. Cohen's d is generally interpreted as small d = 0.2, medium d = 0.5 and large d = 0.8. The variance component indicates the variability among samples. The estimate was 0.589 and remained significant (χ 2 = 1245.90, p < .001). This statistical significance means that moderator analysis with dominant predictors in a model is required to explore the source of variability.
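As a check on the values in Table 2, the confidence interval and the standard deviation of the random effect can be reproduced from the reported coefficient, standard error, and variance component. The calculation below assumes the normal critical value of 1.96, which is not stated in the table.

```latex
\begin{align*}
\text{95\% CI} &= 0.653 \pm 1.96 \times 0.063 \approx 0.653 \pm 0.123 = [0.530,\ 0.776] \\
\text{Standard deviation} &= \sqrt{0.589} \approx 0.767
\end{align*}
```

Because the interval excludes zero and its lower bound exceeds d = 0.5, the overall effect falls in the medium range under Cohen's benchmarks.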

3.2. Conditional model

Moderator analysis using the conditional model was expected to identify factors that have an impact on the strength of effect sizes. In this study, the moderator analysis was conducted for nine critical variable categories: students’ grade, exceptionality, SES, reading area, standardized test, test reliability, intervention type, instructor, and group size. Variables in each category were dummy coded. Dummy coding was used to identify the difference in dependent variables between the categories of independent variables. For example, we used four dummy variables to capture a variable with five categories. The parameter estimates capture the differences in effect sizes between the groups that are coded 1 and a reference group that is coded 0. From a mathematical perspective, it does not matter which category is used as the reference group (Frey, 2018). We labeled one category in each variable as a reference group to make the interpretation of the results easier. We used an asterisk to denote the reference group for each category; if a word has an asterisk next to it, this indicates that it is the reference group for that category.
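The dummy coding described above can be sketched as follows. This is an illustrative Python example, not the coding script used in the study; the function name is hypothetical, and the group size labels are used only as sample data.

```python
def dummy_code(values, reference):
    """Dummy-code a categorical moderator against a reference category.

    Returns the non-reference levels (sorted) and one row of 0/1 indicators
    per observation. The reference category is coded 0 on every indicator.
    """
    if reference not in values:
        raise ValueError("reference category does not occur in the data")
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Hypothetical group size codes for five effect sizes; "mixed" is the reference.
group_size = ["mixed", "small", "middle", "large", "small"]
levels, rows = dummy_code(group_size, reference="mixed")
# levels -> ["large", "middle", "small"]
# rows[0] -> [0, 0, 0]  (mixed group, the reference)
# rows[1] -> [0, 0, 1]  (small group)
```

A variable with four categories therefore needs three indicators, and each estimated coefficient is the difference between that category's mean effect size and the reference group's mean effect size.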

Student-related variables

The results of the conditional meta-analysis for students' grade variables are presented in Table 3. In Table 3, a significant coefficient means that the mean effect size for that category differs significantly from the mean effect size of the reference group. For student grades, upper elementary students showed significantly larger mean effect sizes than secondary students (2.720, p = 0.000), but preschool students showed significantly lower mean effect sizes than secondary students (-0.103, p = 0.019). The Q statistic was significant for students’ grades (Q = 27.20, p < 0.001) (see Table 4).

Table 3

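Results of the moderator analysis for student grade.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Secondary∗ | 20 | 0.482 | 0.066 | 7.261 | 230 | 0.000 | 27.70 |
| Preschool | 110 | -0.103 | 0.043 | -2.370 | 230 | 0.019 | |
| Lower Elementary | 87 | 0.068 | 0.084 | 0.810 | 230 | 0.419 | |
| Upper Elementary | 17 | 2.720 | 0.169 | 16.076 | 230 | 0.000 | |
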
df: degree of freedom.

Table 4

Results of the moderator analysis for exceptionality.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Low achievement∗ | 6 | 0.707 | 0.198 | 3.581 | 232 | 0.001 | 0.0278 |
| General | 228 | -0.080 | 0.208 | -0.385 | 232 | 0.700 | |

df: degree of freedom.

Table 5 shows that the effect sizes for students with low and low-middle SES were not significantly different from those for students with no SES information (0.055, p = 0.666). Likewise, students with middle and upper SES did not have significantly smaller effect sizes than students with no SES information (-0.379, p = 0.444). The Q statistic was significant for students’ SES (Q = 68.50, p < 0.001).

Table 5

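Results of the moderator analysis for SES.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Nonresponse∗ | 88 | 0.613 | 0.092 | 6.656 | 231 | 0.000 | 68.50 |
| Low-Middle | 124 | 0.055 | 0.127 | 0.432 | 231 | 0.666 | |
| Middle-Upper | 22 | -0.379 | 0.494 | -0.767 | 231 | 0.444 | |

df: degree of freedom.

Measurement-related variables

Standardization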

Table 6 shows the results of the moderator analysis for measurement types. The coefficient for the standardized measurement-related variable was not significant. The Q statistic was significant for the standardization of measurement tools (Q = 5.28, p < 0.001).

Table 6

Results of the moderator analysis for standardization of measurement tools.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Researcher developed∗ | 61 | 0.721 | 0.107 | 6.727 | 232 | 0.000 | 5.28 |
| Standardized | 173 | -0.129 | 0.131 | -0.983 | 232 | 0.327 | |

df: degree of freedom.

Reliability

Table 7 shows the results of the moderator analysis for the reliability of the measurement tools. The coefficient for the measurement reliability-related variable was significant (0.409, p = 0.003), which means that the effect sizes of measurements that reported reliability (ES = 0.770) were significantly larger than the effect sizes of measurements that had no information about reliability (ES = 0.361). The Q statistic was significant for the reliability of the measurement tools (Q = 5.82, p < 0.001) (see Table 8).

Table 7

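Results of the moderator analysis for reliability.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Nonresponse about reliability∗ | 81 | 0.361 | 0.108 | 3.338 | 232 | 0.001 | 5.82 |
| Reliability | 153 | 0.409 | 0.132 | 3.093 | 232 | 0.003 | |
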
df: degree of freedom.
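The relationship between the coefficients in Table 7 and the group means can be written out as a short worked calculation. Under dummy coding, the coefficient of the reference group is its mean effect size, and the coefficient of the other group is the difference from that reference.

```latex
\begin{align*}
ES_{\text{no reliability information}} &= 0.361 \\
ES_{\text{reliability reported}} &= 0.361 + 0.409 = 0.770
\end{align*}
```

The same reading applies to the other moderator tables: the mean effect size implied for a non-reference category is the reference coefficient plus that category's coefficient.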

Table 8

Results of the moderator analysis for content of the intervention.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Other area∗ | 21 | 0.096 | 0.150 | 0.642 | 228 | 0.521 | 24.005 |
| Phonological awareness | 58 | 0.528 | 0.209 | 2.521 | 228 | 0.013 | |
| Reading fluency | 13 | 1.150 | 0.324 | 3.549 | 228 | 0.001 | |
| Vocabulary | 93 | 0.442 | 0.179 | 2.464 | 228 | 0.000 | |
| Reading comprehension | 32 | 0.971 | 0.209 | 4.651 | 228 | 0.000 | |
| Listening Comprehension | 17 | 0.834 | 0.257 | 3.244 | 228 | 0.002 | |

df: degree of freedom.

Intervention-related variables

Intervention types

For intervention types, strategy instruction, peer tutoring, and computer-based learning were compared to other methods, which were fixed as the reference group. Table 9 shows that strategy instruction had a significantly larger mean effect size than other methods (0.523, p = 0.001). Studies that applied peer tutoring and computer-based learning showed lower mean effect sizes than other methods, but these differences were not statistically significant (-0.113, p = 0.736; -0.114, p = 0.743). The Q statistic was significant for intervention types (Q = 73.343, p < 0.001).

Table 9

Results of the moderator analysis for intervention types.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Other method∗ | 34 | 0.269 | 0.135 | 1.986 | 230 | 0.048 | 73.343 |
| Strategy instruction | 154 | 0.523 | 0.154 | 3.405 | 230 | 0.001 | |
| Peer tutoring | 18 | -0.113 | 0.337 | -0.337 | 230 | 0.736 | |
| Computer based learning | 28 | -0.114 | 0.348 | -0.328 | 230 | 0.743 | |

df: degree of freedom.

Implementation-related variables

Instructor

For instructor-related variables, instruction delivered by other instructors was assigned as the reference group. Table 10 shows that the teacher and researcher groups showed significantly larger mean effect sizes than the other instructors (0.909, p = 0.000; 0.894, p = 0.002). Moreover, the coefficient for the teacher group was larger than that for the researcher group. The Q statistic was significant for instructor-related variables (Q = 14.024, p < 0.001).

Table 10

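Results of the moderator analysis for instructor.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Other instructor∗ | 6 | -0.197 | 0.225 | -0.873 | 230 | 0.384 | 14.024 |
| Teacher | 182 | 0.909 | 0.237 | 3.837 | 230 | 0.000 | |
| Graduate students | 4 | 0.691 | 0.469 | 1.476 | 230 | 0.141 | |
| Researcher | 42 | 0.894 | 0.273 | 3.273 | 230 | 0.002 | |

df: degree of freedom.

Group size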

For group size, mixed groups were fixed as the reference group. Group size variables were divided into a small group (1 or more and 5 or less), a middle group (6 or more and 15 or less), and a large group or class size (16 or more). Table 11 shows that the middle group (6 or more and 15 or less) and the small group (1 or more and 5 or less) showed significantly larger mean effect sizes than the mixed group (0.881, p = 0.000; 0.451, p = 0.006). However, the difference between the large group and the mixed group was not significant (0.120, p = 0.434). The Q statistic was significant for group size variables (Q = 17.756, p < 0.001).

Table 11

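Results of the moderator analysis for group size.

| Fixed Effect | k | Coefficient (d) | Standard Error | t Ratio | df | p-value | Q |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mixed group∗ | 62 | 0.391 | 0.111 | 3.528 | 230 | 0.001 | 17.756 |
| Small group | 61 | 0.451 | 0.160 | 2.824 | 230 | 0.006 | |
| Middle group | 18 | 0.881 | 0.231 | 3.808 | 230 | 0.000 | |
| Large group | 93 | 0.120 | 0.153 | 0.783 | 230 | 0.434 | |
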
df: degree of freedom.

4. Discussion

The purpose of this meta-analysis was to explore the effects of reading interventions for ELLs and to identify research-based characteristics of effective reading interventions for enhancing their reading ability. To achieve this goal, this study addressed two research questions: (1) What is the estimated mean effect size of reading interventions for ELLs in K-12? (2) To what extent do student-, intervention-, implementation-, and measurement-related variables have effects on improving the reading ability of ELLs in K-12? Our study was limited to recent K-12 intervention studies published between January 2008 and March 2018 that included phonological awareness, fluency, vocabulary, reading comprehension, and listening comprehension as intervention components and outcome measures. A total of 28 studies were identified and analyzed. To investigate the two research questions, a two-level meta-analysis was employed. For the first research question, the unconditional model of HLM was used to estimate the mean effect size of reading interventions for ELLs. For the second, the conditional model of HLM was used to determine which variables have significant effects on reading interventions for ELLs. Below, we briefly summarize the results of this study and describe the significant factors that seem to influence intervention effectiveness. These findings could provide a better understanding of ELLs and offer implications for the development of reading interventions for ELLs.

4.1. Effectiveness of reading interventions for ELLs

The first primary finding from this meta-analysis is that ELLs can improve their reading ability when provided with appropriate reading interventions. The overall mean effect size of reading interventions for ELLs was 0.653, which indicates a medium effect. From this result, we can conclude that appropriate reading interventions generally have a positive impact on reading outcomes for ELLs in K-12. This is consistent with prior syntheses reporting positive effects of reading interventions for ELLs (Vaughn et al., 2006; Abraham, 2008).

Effect size information is important for understanding the real effects of an intervention. This finding therefore indicates that supplementary reading interventions for ELLs should be developed and implemented. It also suggests that states need to develop a set of high-quality reading interventions for ELLs. Language interventions for ELLs have become one of the most important issues in the U.S., where an increasing number of schoolchildren come from homes in which English is not the primary language spoken. According to NCES (2016), 4.9 million students, or 9.6% of public school students, were identified as ELLs, up from 3.8 million students, or 8.1%, in 2000. While many students from immigrant families succeed academically, too many do not. Some ELLs lag far behind native English speakers in school because of the strong effect of language factors on instruction and assessment. Although English is not their native language, ELLs must learn educational content in English. This leads to considerable inequity in public schools. Thus, improving the English language and literacy skills of ELLs is a major concern for educational policymakers. This finding can support practitioners' efforts and investments in developing appropriate language interventions for ELLs.

4.2. The effects of moderating variables

The second primary finding of this meta-analysis relates to four variable categories: student-, intervention-, implementation-, and measurement-related variables. Effective instruction cannot be designed by considering only one factor. The quality of instruction is the product of many factors, including class size, the type of instruction, and other resources. This finding shows which factors affected the effectiveness of reading interventions. Specifically, the variables that had significant effects on the reading outcomes of ELLs were as follows: upper elementary students, reliable measurement tools, reading and listening comprehension-related interventions, strategy instruction, and middle-sized groups of 6 to 15 students. Teachers and practitioners in the field may choose to adopt these findings in their practice. For example, ELL teachers may design their instruction as strategy-embedded instruction in middle-sized groups.

We found that grade level accounted for significant variability in an intervention's effectiveness. Specifically, reading interventions were substantially more effective when used with upper elementary students than with secondary students. This means that the magnitude of an intervention's effectiveness changed depending on when ELLs received reading interventions. The larger effect sizes for upper elementary students than for secondary students show the importance of early interventions for improving ELLs' language abilities. Students who experience early reading difficulty often continue to experience failure in later grades. ELLs, that is, students whose primary language is not English and who are learning English as a second language, often experience particular challenges in developing reading skills in the early grades. According to Kieffer (2010), substantial proportions of ELLs and native English speakers showed reading difficulties that emerged in the upper elementary and middle school grades even though they had succeeded in learning to read in the primary grades.

Regarding students' English proficiency and academic achievement, there was no statistically significant difference between students with low achievement and general students. Given the heterogeneity of the English language learner population, interventions that are effective for one group of English language learners may not be effective for others (August and Shanahan, 2006). This result is similar to that of Lovett et al. (2008), who showed that there were no differences between ELLs and their peers who spoke English as a first language in reading intervention outcomes or in growth during the intervention. This finding suggests that systematic and explicit reading interventions are effective for readers regardless of their primary language.

For students' socioeconomic status (SES), there was no significant difference between the low-middle group and the nonresponse group. Thus, we could not establish that students' SES is critical for implementing reading interventions. Low SES is known to increase the risk of reading difficulties because of limited access to the variety of resources that support reading development and academic achievement (Kieffer, 2010). Many ELLs attend schools with high percentages of students living in poverty (Vaughn et al., 2009). These schools are less likely to have adequate funds and resources and to provide appropriate support for academic achievement (Donovan and Cross, 2002). Snow, Burns and Griffin (1998) highlighted multiple and complex factors that contribute to poor reading outcomes in school, including a lack of qualified teachers and students who come from poverty. Because this study could not determine the relationship between the effectiveness of reading interventions and students' SES, more studies are needed. In addition, these results related to students' characteristics suggest that practitioners and teachers should consider for whom a given intervention is implemented. Researchers should provide a more detailed specification of their student samples because this information is particularly critical for English language learners.

Although many of the studies measured a variety of outcomes across all areas of reading, interventions that focused on improving reading comprehension and listening comprehension yielded larger effects than those targeting other reading outcomes. This result is similar to previous findings (Wanzek and Roberts, 2012; Carrier, 2003).

With regard to effective intervention types, the findings indicated that strategy instruction had a statistically significant effect on improving the reading skills of ELLs. However, computer-based interventions, which have been frequently used for reading instruction for ELLs in recent years, showed lower effect sizes than mixed interventions. Strategy instruction is known as one of the effective reading interventions for ELLs (Proctor et al., 2007; Begeny et al., 2012; Olson and Land, 2007; Vaughn et al., 2006). These strategies included activating background knowledge, clarifying vocabulary meaning, and using visuals and gestures to express understanding after reading. Some studies have shown that computer-based interventions are effective for ELLs (White and Gillard, 2011; Macaruso and Rodman, 2011), but this study did not find such an effect. Thus, there is still little agreement in the research literature on how to effectively teach reading to ELLs (Gersten and Baker, 2000). Continued research efforts must specify how best to provide intervention for ELLs.

With respect to the implementation of the intervention, teachers and researchers as instructors would produce stronger effects than other instructors. Across the studies included here, ELLs were taught by various instructors, including teachers, graduate students, and researchers. The professional development of instructors appears to be more important than who teaches ELLs. This finding is consistent with Richards-Tutor et al. (2016), who also did not find differences between researcher-delivered interventions and school personnel-delivered interventions. Continuing professional development should build on the preservice education of teachers, strengthen teaching skills, increase teacher knowledge of the reading process, and facilitate the integration of newer research on reading into the teaching practices of classroom teachers (Snow et al., 1998). Overall, professional development is the key factor in strengthening the reading skills of ELLs.

This study showed that medium-sized groups of 6 to 15 students had larger effect sizes than the mixed groups. In addition, the medium-sized group showed a larger effect size than the small group of 5 or fewer students. This finding suggests that a multi-tiered reading system is needed in the general classroom. It is also linked to the fact that the response to intervention (RTI) approach is more effective for ELLs. Linan-Thompson et al. (2007) pointed out that RTI offers a promising alternative for reducing the disproportionate representation of culturally and linguistically diverse students in special education by identifying students at risk early and providing preventive instruction to accelerate progress. Regarding interventions for ELLs who are struggling with or at risk for reading difficulties, Ross and Begeny (2011) compared the effectiveness of a small group intervention with that of the same intervention implemented in a one-to-one context for ELLs. They showed that nearly all students benefitted from the one-to-one intervention, and some students benefitted from the small group intervention. This finding is commensurate with a previous study investigating the comparative differences between group sizes and suggests research-based support for the introduction of the RTI approach.

However, most implementation-related variables, including duration of intervention, total number of sessions, frequency per week, length of each session, setting, and instructor, did not have any significant effect on the reading ability of ELLs. That is, ELLs can improve their reading regardless of the duration of the intervention, where they received it, and who taught them. This finding is similar to that of Snyder et al. (2017), who also synthesized related interventions for ELLs and showed that the length of intervention did not seem to be directly associated with overall effect sizes for reading outcomes. It is also consistent with recent research on intervention duration with native English speakers (Wanzek et al., 2013). Wanzek and colleagues examined the relationship between student outcomes and hours of intervention in their meta-analysis and found no significant differences in student outcomes based on the number of intervention hours. Elbaum et al. (2000) stated that the intensity of an intervention is most important for its effectiveness. Our results somewhat support these researchers' views, but we cannot be certain that a brief intervention would have the same overall effect on reading outcomes as a year-long intervention. Thus, intervention intensity, such as student attendance at the sessions, should be considered together with the duration of the intervention.

4.3. Implications for practice and for research

The most effective and efficient education is education that is designed in the right way, includes proper content, and is delivered at the right time so that students can benefit the most. To achieve this, research that identifies a framework by synthesizing research results through meta-analysis, such as this study, must be conducted. Furthermore, the implications of the results must be carefully considered. In this respect, the important implications of this study for practitioners, researchers, and policymakers seeking to enhance the reading competence of ELLs are as follows.

First, reading interventions for ELLs are expected to be most efficient when conducted with a medium-sized group of 6 to 15 students. This indicates that implementing reading interventions for ELLs requires a specially designed group configuration rather than simply a class-wide or one-to-one configuration. Second, reading interventions for ELLs are most effective when conducted with upper elementary school students. This is in contrast to Morgan and Sideridis (2006), who examined students with learning disabilities using a multilevel meta-analysis and showed that age group was unrelated to the effect size of reading interventions for these students. This contrast suggests that ELLs, unlike students with learning disabilities, whose reading difficulties stem from their disabilities, are developing typically but have reading difficulties due to linguistic differences. Accordingly, the upper elementary grades, by which time a student has been exposed to the academic environment for a sufficiently long time and language is sufficiently developed, appear to be the appropriate time for ELLs to learn English. Third, effective reading interventions for ELLs should be delivered through a strategy-embedded instruction program. This is based on the finding that strategy instruction is effective for learning vocabulary or concepts in unfamiliar languages (Carlo et al., 2005; Chaaya and Ghosn, 2010).

Taken together, these implications call for implementing Tier 2 reading interventions for ELLs in practice. In Tier 2 interventions, students can participate in more intensive learning through specially designed interventions based on their personal needs (Ortiz et al., 2011). In other words, policymaking and administrative decision-making should provide intensive education programs for ELLs who have been exposed to the academic environment for a certain period but still have reading difficulties, including achievement that falls short of the expected level.

Considering further applications, these findings could guide practitioners and policymakers to develop effective evidence-based reading programs or policies. The significant variables in this study can be considered to develop new programs for ELLs.

Declarations

Author contribution statement

All authors listed have significantly contributed to the development and the writing of this article.