Background/Context: In 2010, Reading Recovery was awarded a $55 million Investing in Innovation (i3) grant from the U.S. Department of Education. This five-year grant supported the expansion of Reading Recovery in more than 1,400 schools in over 30 states, and provided intervention to over 80,000 students. Although a multi-site randomized experiment during the i3 Scale-Up of Reading Recovery produced rigorous evidence of large impacts, the study provided no information about whether these impacts were sustained beyond first grade. This symposium will present results from an IES-funded efficacy follow-up study, which utilizes a regression discontinuity (RD) design to evaluate long-term outcomes through 3rd and 4th grades for more than 9,000 study participants. In this study, we seek to determine whether and under what circumstances the initial impacts of Reading Recovery on student literacy are maintained (or not) through 3rd and 4th grade. Long-term achievement outcomes for the RD study were measured by collecting scale scores on state achievement tests in reading or English language arts administered at the end of third and fourth grades. All data collection was conducted from fall 2017 through January 2020 (prior to the pandemic).
Research Questions:
1. Does the RD study replicate the short-term impacts seen in the RCT?
2. What are the long-term impacts of Reading Recovery on state reading achievement test scores?
3. What interventions did RR and Control Group students receive in 2nd, 3rd, and 4th grades?
Setting/Participants: The study presented in this paper includes four i3 cohorts of students (N>9,000) who were in 1st grade in i3 schools during the 2011-12 through 2014-15 school years, as well as one additional cohort of students who were in 1st grade in any Reading Recovery school implementing the RD design during the 2016-17 school year.
Intervention: Reading Recovery is a short-term early intervention designed to prevent reading failure in the lowest-achieving readers in first grade. Children identified to receive Reading Recovery meet one-to-one with a specially trained Reading Recovery teacher every day for 30-minute lessons, over a period of 12 to 20 weeks.
Research Design: To estimate the long-term effects of Reading Recovery, we implemented a regression discontinuity design (RDD) in a randomly selected sample of Reading Recovery i3 schools during each year of the i3 Scale-Up external evaluation (2011-2015) and also in one additional cohort during the 2016-17 school year. The RDD study assigned students by a cutoff on pre-intervention test scores: students with the lowest scores on the Observation Survey of Early Literacy (OS), relative to other students in their school, were assigned to Reading Recovery, while those who scored above the cut score did not receive Reading Recovery.
Data Collection and Analysis: Long-term outcomes were measured by collecting scale scores on state achievement tests in reading or English language arts administered to students in third and fourth grades. We were able to record and link 3rd grade state test score data in reading or ELA for 9,906 students, and 4th grade state test data for 6,371 students. Scores on each state test were standardized to a common z-score scale by subtracting the statewide mean and dividing by the statewide standard deviation separately for each state (or by the national mean and standard deviation for nationally normed tests).
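The standardization step can be illustrated with a minimal sketch. It is not the study's actual code: the state labels, norm values, and the names `HYPOTHETICAL_NORMS` and `standardize_scores` are assumptions introduced only for illustration, and the scores are made up.

```python
# Minimal sketch of per-state z-score standardization (illustrative only).
# The norms below are hypothetical placeholders, not real statewide statistics.
HYPOTHETICAL_NORMS = {
    "STATE_A": (1000.0, 100.0),  # (statewide mean, statewide SD)
    "STATE_B": (700.0, 50.0),
}


def standardize_scores(records, norms):
    """Convert scale scores to z-scores using each state's own mean and SD.

    records: iterable of (student_id, state, scale_score) tuples
    norms:   dict mapping state -> (mean, standard deviation)
    Returns a dict mapping student_id -> z-score.
    """
    z_scores = {}
    for student_id, state, scale_score in records:
        mean, sd = norms[state]
        # Subtract the state mean, then divide by the state SD.
        z_scores[student_id] = (scale_score - mean) / sd
    return z_scores


example = [("s1", "STATE_A", 1100.0), ("s2", "STATE_B", 675.0)]
z = standardize_scores(example, HYPOTHETICAL_NORMS)
print(z)
```

Under these assumed norms, a score one standard deviation above the mean in one state and a score half a standard deviation below the mean in another land on the same scale, which is what allows outcomes from different state tests to be pooled.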
In addition to collecting state test scores, we also administered an online survey through which teachers, teacher leaders, or school administrators were able to document the experience of individual Reading Recovery and RDD control group students in terms of (a) continued performance monitoring, (b) participation in supplemental reading programs and interventions, and (c) instructional programs and curricula used in first, second, and third grade.
Findings: The study yields three major findings. The first is that the RD was able to replicate the large positive short-term impacts seen under the RCT (i.e., ES greater than +0.60 across all years in both the RD and RCT). This suggests that the RD design used in this study provides valid estimates of the causal effects of Reading Recovery, and offers an opportunity to generate rigorous evidence of long-term impacts. The second is that the long-term impact estimates from the RD study are statistically significant and negative (ES between -0.10 and -0.30). The third is that nearly half of the students participating in Reading Recovery during first grade received no additional reading intervention during subsequent grades. Potential explanations for the positive short-term findings and negative long-term findings include (1) that Reading Recovery produces large impacts on early literacy measures, but these gains do not translate to skills needed for continued success in later grades, or (2) that Reading Recovery produces large impacts on early literacy measures, but these gains are lost when students do not receive sufficient intervention in later grades. Additional findings in the presentation will highlight several methodological complexities in the RD study that are likely to be particularly interesting to a SREE audience.