Screening for Reading Problems in Grades 1 Through 3: An Overview of Select Measures
by Evelyn S. Johnson, Ed.D., Juli Pool, Ph.D., and Deborah R. Carter, Ph.D., Boise State University
Learning to read is arguably the most important work of students in the early elementary grades. Learning to read lays the foundation for future learning and understanding across all areas of the curriculum. Without this foundation, students will struggle to achieve academically in not only reading and writing, but also in areas such as math, science, and social studies. Decades of reading research have provided a good blueprint for understanding how children learn to read. As reported by the National Reading Panel (National Institute of Child Health and Human Development, 2000), although the path may differ slightly based on individual experiences, children generally require the following to learn to read well:
- Strong receptive and expressive language
- Well-developed phonological and print awareness
- Knowledge of letter–sound relationships (decoding)
- Large vocabularies
- An ability to comprehend what they read
- The ability to read naturally and effortlessly (fluency).
Instructional research on reading has indicated that children develop these abilities best when provided with systematic and explicit instruction, when exposed to rich language and literary environments, and when exposed to appropriate developmental opportunities and environments at the earliest ages.
Many students, however, fail to develop the prerequisite skills and knowledge that enable them to become good readers (Spear-Swerling & Sternberg, 1994). Even with a strong instructional program, some students will require additional support to become good readers. A large body of evidence supports early intervention for struggling readers (Snow, Burns, & Griffin, 1998). Struggling readers who do not receive early intervention tend to fall further behind their peers, in some cases to the point that their reading difficulties become intractable (Stanovich, 1986). Given the importance of early intervention, as well as the importance of reading across the curriculum and grade levels, students who require intervention need to be identified as soon as possible. Universal screening within a Response-to-Intervention (RTI) framework is an important tool in the process of identifying students who require early reading intervention.
What Screening Processes Have Been Used to Identify Students at Risk for Reading Problems?
Screening measures, by definition, are typically very brief assessments of a particular skill or ability that is highly predictive of a later outcome. Screening measures are designed to quickly sort students into one of two groups: 1) those who require intervention and 2) those who do not. To sort students into these two categories, a screening measure does not need to be very comprehensive. It merely needs to focus on a specific skill that is highly correlated with a broader measure of reading achievement and that results in a highly accurate sorting of students.
As practiced in many schools, screening is a high-stakes assessment (Davis, Lindo, & Compton, 2007). Important decisions such as the allocation of precious intervention resources and the designation of a percentage of students as being at risk are made based on the results of a screening process (Davis et al., 2007). Therefore, many researchers recommend that for an RTI model to be effective, screening processes need to identify all, or nearly all, of the students who are at risk for poor learning outcomes (true positives) while managing the number of students who are falsely identified (false positives). In other words, sensitivity and specificity levels of 90% and higher are desirable, but with the exception of a very small number of research studies limited to predicting reading outcomes for 1st graders (Compton, Fuchs, Fuchs, & Bryant, 2006; O'Connor & Jenkins, 1999) these ideals have yet to be reached.
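To make the two accuracy terms concrete, the following minimal sketch computes sensitivity (the share of truly at-risk students the screen flags) and specificity (the share of not-at-risk students the screen correctly leaves unflagged). The function name and the ten-student data set are hypothetical illustrations, not values from any of the studies cited here.

```python
def classification_accuracy(flagged_by_screen, poor_outcome):
    """Return (sensitivity, specificity) for paired lists of booleans.

    flagged_by_screen[i] is True if the screen flagged student i as at risk.
    poor_outcome[i] is True if student i later read poorly on the outcome measure.
    """
    tp = fn = fp = tn = 0
    for flagged, poor in zip(flagged_by_screen, poor_outcome):
        if poor and flagged:
            tp += 1      # true positive: at risk and flagged
        elif poor:
            fn += 1      # false negative: at risk but missed
        elif flagged:
            fp += 1      # false positive: flagged but not at risk
        else:
            tn += 1      # true negative: not at risk, not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical data for ten students (made up for illustration).
flagged = [True, True, True, False, True, False, False, False, False, False]
poor = [True, True, True, True, False, False, False, False, False, False]

sensitivity, specificity = classification_accuracy(flagged, poor)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

In this made-up sample the screen misses one of the four students who later read poorly and flags one of the six who did not, so both values fall short of the 90% level described above.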
Researchers have proposed a variety of methods for identifying students at risk for reading problems. Within the context of an RTI framework, the most commonly described screening processes include a) direct route models, b) progress-monitoring models (Compton et al., 2006; Fuchs & Fuchs, 2006), and c) risk index models (Catts, Fey, Zhang, & Tomblin, 2001; Johnson, Jenkins, Petscher, & Catts, 2009; Schatschneider, 2008). Each of these methods is briefly described below.
Direct Route Models - In the direct route model (Jenkins, 2003), students identified by the screening as at risk are immediately placed into intervention. Typically, the screening measure assesses a single skill (e.g., letter identification) that is highly predictive of later reading performance. Direct route models require high degrees of accuracy from a screening measure, because no further confirmation of assessment results is conducted to correct screening errors (more information on direct route models can be found in "Universal Screening for Reading Problems: Why and How Should We Do This?" by Jenkins & Johnson, 2008).
Progress-Monitoring Models - In the progress-monitoring model, students initially identified as at risk are monitored for a number of weeks to see if they "self-correct." Because many schools conduct screening at the beginning of the school year, screening may overidentify students whose reading performance decreased during the summer months. Students who initially performed poorly on the screening measure may respond positively to the first few weeks of reading instruction and no longer be considered at risk. Progress-monitoring models have resulted in high levels of accuracy in studies examining their use with 1st graders (Compton et al., 2006).
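One way to picture the progress-monitoring check is as a growth rate computed over the weekly scores of a student flagged at screening. The sketch below is only an illustration: the decision rule, the threshold values (`min_slope`, `min_final_level`), and the scores are hypothetical assumptions, not cut-points drawn from the cited research.

```python
def weekly_slope(scores):
    """Least-squares slope of scores against week number (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two weekly scores")
    mean_week = (n - 1) / 2
    mean_score = sum(scores) / n
    numerator = sum((week - mean_week) * (score - mean_score)
                    for week, score in enumerate(scores))
    denominator = sum((week - mean_week) ** 2 for week in range(n))
    return numerator / denominator


def still_at_risk(scores, min_slope=1.5, min_final_level=30):
    """Hypothetical rule: a flagged student stays at risk only if both
    weekly growth and the most recent score are below the thresholds."""
    return weekly_slope(scores) < min_slope and scores[-1] < min_final_level


# Hypothetical weekly scores (words read correctly) for two flagged students.
student_a = [12, 15, 19, 24, 31]   # growing quickly: may have "self-corrected"
student_b = [10, 11, 11, 12, 13]   # nearly flat: remains a concern

for name, scores in [("A", student_a), ("B", student_b)]:
    print(name, round(weekly_slope(scores), 2), still_at_risk(scores))
```

A school adopting this kind of rule would need to set its thresholds from local or published data; the point of the sketch is only that the second look uses growth over several weeks rather than a single score.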
Risk Index Models - In the risk index model, a probability for risk is computed based on a number of variables collected on a student, including not only assessment results but also other related factors such as parent’s education level and English language learner (ELL) status. The probability is reported as a percentage of students with similar profiles who later performed poorly on an outcome measure. For example, a student with a risk index of 85% would be considered very at risk for poor reading outcomes, whereas a student with a risk index of 10% would not be considered to be at risk. Risk index models can improve on the accuracy levels of screening processes that rely on a single measure and, unlike single measures, can consider the impact of numerous variables. Catts et al. (2001) and Johnson et al. (2009) found that computing a risk index resulted in greater classification accuracy of a screening process compared with the classification accuracy of single screening measures.
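The idea of reporting risk as "the percentage of students with similar profiles who later performed poorly" can be sketched as a simple tally over records from earlier cohorts. Everything in this example is a hypothetical illustration: the profile labels, the record counts, and the function name are assumptions, and the cited studies may estimate the probability with a statistical model rather than by direct counting.

```python
from collections import defaultdict


def build_risk_index(history):
    """Map each profile to the percentage of past students with that
    profile who later performed poorly on the outcome measure.

    history is an iterable of (profile, poor_outcome) pairs, where profile
    is any hashable description of a student and poor_outcome is a boolean.
    """
    counts = defaultdict(lambda: [0, 0])   # profile -> [poor, total]
    for profile, poor_outcome in history:
        counts[profile][0] += int(poor_outcome)
        counts[profile][1] += 1
    return {profile: 100.0 * poor / total
            for profile, (poor, total) in counts.items()}


# Hypothetical records: (screening result, ELL status) and later outcome.
history = (
    [(("low screen score", "ELL"), True)] * 17
    + [(("low screen score", "ELL"), False)] * 3
    + [(("adequate screen score", "non-ELL"), True)] * 2
    + [(("adequate screen score", "non-ELL"), False)] * 18
)

risk_index = build_risk_index(history)
for profile, percent in risk_index.items():
    print(profile, f"{percent:.0f}%")
```

With these made-up counts the first profile yields a risk index of 85% and the second 10%, mirroring the two examples in the paragraph above.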
The processes listed above have all been described for use within an RTI framework. An additional approach commonly used to identify students at risk for reading problems is diagnostic assessment of reading ability, in which students are assessed on a wide variety of component skills and processes related to reading. Because the direct route model to screening has been extensively employed in RTI implementation, the use of diagnostic assessments within the context of RTI has not been discussed in great detail, but should be considered as an important part of the RTI process. Diagnostic assessments differ from screening measures. Screening measures are designed to be brief and efficient tools for quickly identifying who might be at risk. Diagnostic assessments are then used to confirm the initial screening results, and to inform intervention by determining a student’s particular difficulties.
The use of screening followed by more in-depth diagnostic assessment is similar to the process used when screening for vision problems. A brief screening quickly identifies who may have difficulty seeing without corrective lenses, but those lenses are not prescribed until an in-depth assessment a) confirms that the person does require corrective lenses and b) determines the particular nature of the individual's problem. In some cases, it is determined that no corrective lenses are needed (i.e., a false positive is corrected). This is an important point that we will revisit later in this article.
What Screening Measures Work Well in Identifying Students in Grades 1 Through 3 Who Are At Risk for Reading Problems?
Regardless of the process chosen to conduct screening, all include the use of measures that are highly predictive of later reading ability. In a recent review of reading screening tools, Jenkins, Hudson, and Johnson (2007) found that screening processes for the various early grade levels tended to focus on the components of reading outlined below.
Grade 1: Word Identification Fluency (WIF), Phonological Awareness, and Letter Knowledge - Of these, WIF has been demonstrated to be one of the strongest predictors of reading ability (Compton, Fuchs, Fuchs, & Bryant, 2006; Fuchs, Fuchs, & Compton, 2004).
Grades 2 and 3: Oral Reading Fluency (ORF) and Word Identification Fluency - Not many studies have examined screening measures of 2nd and 3rd grade reading ability, beyond the use of ORF measures. At 2nd grade, WIF remains a strong predictor (Foorman, Francis, Fletcher, Schatschneider, & Mehta, 1998).
The Jenkins et al. (2007) review also reported, however, that with the exception of WIF in 1st grade, current measures and approaches to screening, especially those used in Grades 2 and 3, tend not to result in high levels of classification accuracy. This is especially the case in studies in which ORF measures served as the sole criterion for classification decisions.
The Jenkins et al. (2007) review has important implications for practice, especially in schools in which the direct route model of screening is in place. If schools overidentify in assessing intervention needs, their intervention resources may be overtaxed and, subsequently, less effective for those students who truly need them. There are other repercussions of the reliance on single skill screening measures. Sufficient evidence exists to show that teachers will teach to the test, especially when those tests are used to make high-stakes decisions (Pearson, 2006).
Teachers and schools may mistakenly conclude that intense instruction on the skill measured by the screening tool (e.g., ORF) will result in improved overall reading ability. While that may be the case for students with deficits in the specific skill, studies have demonstrated that while performance on the single skill can be improved, this doesn’t necessarily translate into gains in overall reading ability (Pressley, Hilden, & Shankland, 2005; Samuels, 2007).
These findings have led some to call for more comprehensive screening batteries that include other constructs that are related to reading, such as language and vocabulary. The research on effective screening procedures and measures is growing. In the meantime, how can schools proceed with screening processes that both accurately identify who is at risk and provide helpful insight regarding appropriate intervention?
Toward an Improved Process of Screening and Intervention Planning
Reading is a complex construct that requires the synthesis of many skills, abilities, experiences, and types of knowledge. For example, although learning to read typically begins with a focus on decoding, students who do not also develop large vocabularies and the ability to comprehend what they have read will not become good readers. Additionally, several studies have found that poor readers demonstrate different profiles of abilities (Pierce, Katzir, Wolf, & Noam, 2007; Valencia & Riddle Buly, 2004; Wise et al., 2007). For example, in a study that examined at-risk readers in Grades 2 and 3, different profiles of reading skills were found (Pierce et al., 2007). Some students had average word reading skills but deficits in vocabulary, whereas some students had low sight word efficiency but average passage reading. This suggests the importance of using a variety of measures to determine who is at risk for poor reading outcomes (Jenkins & O’Connor, 2002; Pierce et al., 2007). Additionally, students respond to the instructional program at different rates, and students who have been progressing well may experience difficulties when more challenging material is presented (Compton, Fuchs, Fuchs, Elleman, & Gilbert, 2008). This implies that screening needs to occur more than once during the school year.
Research examining the use of probability indices and other approaches to improving screening is underway, especially for the early elementary grades. Until those processes are developed and ready for implementation, however, schools should consider the following approaches:
- Grade 1: WIF has been found to be one of the strongest predictors of reading outcomes for 1st grade students. Therefore, we suggest at a minimum that a universal screen for 1st graders include measures of WIF. To enhance the accuracy of the screening results, students initially identified by the screen should have their progress monitored for several weeks (the research-based recommendation is 5 weeks) following the initial screen (Fuchs & Fuchs, 2006). Once a pool of students is identified as at risk, continued progress monitoring in WIF can improve the accuracy of the initial screening results.
- Grade 2: In the beginning of the year, assessments of ORF and WIF should be used as screening tools. As with Grade 1, a system for progress monitoring should be in place to help "catch" students who respond adequately to instruction and do not require more intense intervention.
- Grade 3: ORF measures are one of the only screening tools currently described in the literature for this grade level. However, as with Grade 2, classification accuracy is not adequate to warrant its use as a sole criterion for intervention decisions. Additionally, schools will need to examine decision rules for a variety of subpopulations, as research has indicated that higher levels of accuracy can be reached when cut-scores are adjusted for various populations, such as ELLs.
- For All Grades: Screening is step one of the process and does not provide a comprehensive assessment of a student’s specific problems. Similarly, focusing on improving the skill targeted by a screening tool (e.g., WIF measures or reading rate) is not by itself an effective intervention. Once the pool of at-risk students is identified, more comprehensive assessments of their reading ability should be conducted to inform appropriate intervention placements. A student whose performance on a screening instrument is extremely low may require a different type and/or intensity of intervention than a student whose screening score is close to the cut-score.
Existing Resources for Further Information
To assist schools in moving forward with this type of process, we have constructed two tables to provide details on select assessments—both screening measures and more comprehensive, diagnostic assessments of reading. These tables are meant as a starting point to identify potential screening measures. Additionally, many of the screening instruments reviewed in "Screening for Reading Problems in Preschool and Kindergarten: An Overview of Select Measures" are also appropriate for use with students in Grades 1–3. A comprehensive database of reading assessments has also been developed by SEDL and is available free of charge. Finally, the National Center on Response to Intervention’s Technical Assistance Center provides a more comprehensive technical review of reading screening tools for use in the elementary grades.
Table 1: Screening Measures for Grades 1 through 3
| Name | Skill | Administration Time | Reliability/Validity |
| --- | --- | --- | --- |
| Test of Word Reading Efficiency (TOWRE) | Word Identification Fluency | 5-10 minutes | |
| Woodcock-Johnson Diagnostic Reading Battery | Word Identification Fluency | 5-10 minutes per test | Reliability coefficients exceed .90; norms established with a population > 8800 |
| Dynamic Indicators of Basic Early Literacy Skills (DIBELS) | Oral Reading Fluency | 2 minutes per probe | |
| AIMSweb | Oral Reading Fluency | 1-3 minutes | Alternate form: .83 (2nd grade), .86 (3rd grade) |
| EdCheckup | Oral Reading Fluency | 3 minutes of administration time and 1-10 minutes of scoring time | Test-retest coefficients range from .92 to .94; alternate forms reliability ranges from .89 to .94 |
| System to Enhance Educational Performance (iSTEEP) | Oral Reading Fluency | 1 minute per form | |
| Test of Silent Word Reading Fluency (TOSWRF) | Silent Word Reading Fluency | 10 minutes per protocol | Validity coefficients range from .59 to .85; test-retest reliability ranges from .70 to .96; alternate form reliability ranges from .70 to .97; validity coefficients (with other word identification tests) greater than .70 |
Table 2: Multiple-Skill Diagnostic Assessments for Grades 1 through 3
| Name | Skills Assessed | Administration Time/Type | Reliability | Validity |
| --- | --- | --- | --- | --- |
| Stanford Diagnostic Reading Test-4th ed. (SDRT4) | Sounds (phonemic awareness); letters (phonics); words (vocabulary); pictures (fluency); stories (comprehension) | 85-105 minutes/group administration | Reliability coefficients: .79-.94 for four major components and .95-.98 for total scores; alternate form: .62-.82 for components and .86-.88 for total scores | Validity data not reported but adequate evidence presented that subtests measure what they purport to measure |
| Diagnostic Assessment of Reading (DAR) | Print awareness; rhyming; segmenting words; identifying initial/final consonant sounds; blending; naming and matching letters; matching words | 20-30 minutes | Not reported | |
| Group Reading Assessment and Diagnostic Evaluation (GRADE) | Prereading; reading readiness; phonological awareness; vocabulary; reading comprehension; listening comprehension | 45-90 minutes | Internal consistency: .89-.99; alternate form: .81-.94; test-retest: .77-.98 | |
| Early Reading Diagnostic Assessment-2nd ed. (ERDA-2) | Letter recognition; concepts of print; story retell/listening comprehension; rhyming; phonemes; syllables; receptive language; expressive vocabulary; word reading; pseudoword decoding; rimes; comprehension; passage fluency; rapid automatic naming; synonyms; word meanings | 60-110 minutes (15-20 minutes for each subtest) | Split-half reliability: .63-.95; test-retest: .64-.98; interrater: .88-.96 | Validity data not reported but adequate evidence presented that subtests measure what they purport to measure |
| STAR Early Literacy | General readiness; graphophonemic knowledge; phonemic awareness; phonics; comprehension; structural analysis; vocabulary | 15 minutes per student; 30 minutes for entire classroom | Test-retest: .87; average reliability coefficient across grades: .92 | Overall average concurrent validity (with STAR Reading): .68 for 1st grade, .52 for 2nd grade, .57 for 3rd grade; predictive validity: .62 for 1st grade, .67 for 2nd grade, .77 for 3rd grade |
Note. CAT = California Achievement Test; GMRT = Gates-MacGinitie Reading Tests; PIAT-R = Peabody Individual Achievement Test-Revised
Conclusion
Screening within an RTI framework is a high-stakes decision. Sufficient evidence demonstrates that direct route approaches to screening do not result in high levels of classification accuracy. Progress monitoring and further assessment of students initially identified as at risk on screening measures help improve identification and, subsequently, intervention planning. Numerous screening and diagnostic measures of reading are available, and screening processes will continue to be refined to reach the high levels of accuracy needed to implement an effective RTI process.
References
Catts, H. W., Fey, M. E., Zhang, X., & Tomblin, J. B. (2001). Estimating the risk of future reading difficulties in kindergarten children. Language, Speech, and Hearing Services in Schools, 32, 38–50.
Compton, D. L., Fuchs, D., Fuchs, L. S., & Bryant, J. D. (2006). Selecting at-risk readers in first grade for early intervention: A two-year longitudinal study of decision rules and procedures. Journal of Educational Psychology, 98, 394–409.
Compton, D. L., Fuchs, D., Fuchs, L. S., Elleman, A. M., & Gilbert, J. K. (2008). Tracking children who fly below the radar: Latent transition modeling of students with late-emerging reading disability. Learning and Individual Differences, 18, 329–337.
Davis, G. N., Lindo, E. J., & Compton, D. (2007). Children at-risk for reading failure: Constructing an early screening measure. Teaching Exceptional Children, 39(5), 32–39.
Foorman, B. R., Francis, D. J., Fletcher, J. M., Schatschneider, C., & Mehta, P. (1998). The role of instruction in learning to read: Preventing reading failure in at-risk children. Journal of Educational Psychology, 90, 37–55.
Fuchs, L. S., & Fuchs, D. (2006). Implementing responsiveness-to-intervention to identify learning disabilities. Perspectives on Dyslexia, 32(1), 39–43.
Fuchs, L. S., Fuchs, D., & Compton, D. L. (2004). Monitoring early reading development in first grade: Word identification fluency versus nonsense word fluency. Exceptional Children, 71, 7–21.
Jenkins, J. R. (2003, December). Candidate measures for screening at-risk students. Paper presented at the National Research Center on Learning Disabilities Responsiveness-to-Intervention Symposium, Kansas City, MO. Retrieved April 3, 2006.
Jenkins, J. R., Hudson, R. F., & Johnson, E. S. (2007). Screening for service delivery in an RTI framework: Candidate measures. School Psychology Review, 36, 582–599.
Jenkins, J. R., & Johnson, E. S. (2008). Universal screening for reading problems: Why and how should we do this? Retrieved April 16, 2008.
Jenkins, J. R., & O’Connor, R. E. (2002). Early identification and intervention for young children with reading/learning disabilities. In R. Bradley, L. Danielson, & D. Hallahan (Eds.), Identification of learning disabilities: Research to practice (pp. 99–149). Mahwah, NJ: Erlbaum.
Johnson, E. S., Jenkins, J. R., Petscher, Y., & Catts, H. W. (2009). How can we improve the accuracy of screening instruments? Learning Disabilities Research & Practice.
National Institute of Child Health and Human Development. (2000). Report of the National Reading Panel: Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction. Reports of the subgroups. Washington, DC: U.S. Government Printing Office.
O'Connor, R. E., & Jenkins, J. R. (1999). Prediction of reading disabilities in kindergarten and first grade. Scientific Studies of Reading, 3(2), 159–197.
Pearson, P. D. (2006). Foreword. In K. S. Goodman (Ed.), The truth about DIBELS: What it is, what it does. Portsmouth, NH: Heinemann.
Pierce, M. E., Katzir, T., Wolf, M., & Noam, G. G. (2007). Clusters of second and third grade dysfluent urban readers. Reading and Writing: An Interdisciplinary Journal, 20, 885–907.
Pressley, M., Hilden, K., & Shankland, R. (2005). An evaluation of end-grade-3 Dynamic Indicators of Basic Early Literacy Skills (DIBELS): Speed reading without comprehension, predicting little (Technical Report). East Lansing, MI: Literacy Achievement Research Center.
Samuels, S. J. (2007). The DIBELS tests: Is speed of barking at print what we mean by reading fluency? Reading Research Quarterly, 42, 563–566.
Schatschneider, C. (2008, November). Classification in context. Paper presented at the International Dyslexia Association, Seattle, WA.
Snow, C., Burns, M. S., & Griffin, P. (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press.
Spear-Swerling, L., & Sternberg, R. J. (1994). The road not taken: An integrative theoretical model of reading disability. Journal of Learning Disabilities, 27, 91–103.
Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360–407.
Valencia, S. W., & Riddle Buly, M. R. (2004). What struggling readers really need. The Reading Teacher, 57, 520–533.
Wise, J. C., Sevcik, R. A., Morris, R. D., Lovett, M. W., & Wolf, M. (2007). The relationship among receptive and expressive vocabulary, listening comprehension, pre-reading skills, word identification skills, and reading comprehension by children with reading disabilities. Journal of Speech, Language and Hearing Research, 50, 1093–1109.