Author: "Liu, Ou Lydia" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

29 results on '"Liu, Ou Lydia"'

1. Comparing the Effect of Contextualized versus Generic Automated Feedback on Students' Scientific Argumentation. Research Report. ETS RR-22-03

Author: Olivera-Aguilar, Margarita, Lee, Hee-Sun, Pallant, Amy, Belur, Vinetha, Mulholland, Matthew, and Liu, Ou Lydia
Abstract: This study uses a computerized formative assessment system that provides automated scoring and feedback to help students write scientific arguments in a climate change curriculum. We compared the effect of contextualized versus generic automated feedback on students' explanations of scientific claims and attributions of uncertainty to those claims. Classes were randomly assigned to the contextualized feedback condition (227 students from 11 classes) or to the generic feedback condition (138 students from 9 classes). The results indicate that the formative assessment helped students improve their scores in both explanation and uncertainty scores, but larger score gains were found in the uncertainty attribution scores. Although the contextualized feedback was associated with higher final scores, this effect was moderated by the number of revisions made, the initial score, and gender. We discuss how the results might be related to students' familiarity with writing scientific explanations versus uncertainty attributions at school.
Published: 2022

2. Computerized Text Analysis: Assessment and Research Potentials for Promoting Learning

Author: Lee, Hee-Sun, McNamara, Danielle, Bracey, Zoë Buck, Wilson, Christopher, Osborne, Jonathan, Haudek, Kevin C., Liu, Ou Lydia, Pallant, Amy, Gerard, Libby, Linn, Marcia C., and Sherin, Bruce
Abstract: Rapid advancements in computing have enabled automatic analyses of written texts created in educational settings. The purpose of this symposium is to survey several applications of computerized text analyses used in the research and development of productive learning environments. Four featured research projects have developed or been working on: (1) equitable automated scoring models for scientific argumentation for English Language Learners; (2) a real-time, adjustable formative assessment system to promote student revision of uncertainty-infused scientific arguments; (3) a web-based annotation tool to support student revision of scientific essays; and (4) a new research methodology that analyzes teacher-produced text in online professional development courses. These projects will provide unique insights towards assessment and research opportunities associated with a variety of computerized text analysis approaches. [This paper was published in: "13th International Conference on Computer Supported Collaborative Learning Proceedings," 2019, pp. 743-750.]
Published: 2019

3. Assessing Civic and Intercultural Competency in Higher Education: The ETS 'HEIghten®' Approach. Research Report. ETS RR-18-23

Author: Liu, Ou Lydia, Roohr, Katrina Crotts, and Rios, Joseph A.
Abstract: Economic globalization and interdisciplinary advancements have increased the demand for college graduates to possess transferable skills that would allow them to contribute effectively to the modern workforce. In particular, transferable competencies such as civic competency and intercultural competency are critical for colleges to prepare responsible citizens and productive workers. Despite the recognized importance, the choice and quality of assessment for such competencies have been fairly limited due to the challenges in defining such complex, multidimensional constructs and identifying item types that can adequately assess them. In this report, we describe the principles we followed to operationalize definitions for civic competency and intercultural competency and the process we followed to design assessments for these 2 competencies. Findings from a large-scale pilot test are reported. Results showed that these multidimensional constructs can be adequately assessed and that there is room for students to improve in these areas. Implications for higher education institutions on how to promote these critical competencies are discussed.
Published: 2018

4. Development and Validation of the Written Communication Assessment of the 'HEIghten'® Outcomes Assessment Suite. Research Report. ETS RR-17-53

Author: Rios, Joseph A., Sparks, Jesse R., Zhang, Mo, and Liu, Ou Lydia
Abstract: Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack the ability to balance authenticity (i.e., requiring students to produce a sample of writing) with psychometric quality. To this end, we discuss the development of a newly developed measure, the WC assessment of the "HEIghten"® outcomes assessment suite, and present pilot test results based on a sample of 985 test takers from 33 higher education institutions. Overall, we found that the measure includes well-functioning items (i.e., highly discriminating and lacking gender-differential item functioning), an essay task that can be reliably scored by combining human scores with scores provided by an automated algorithm, evidence to support reporting separate selected-response and essay scores to individuals and institutions, and adequate convergent validity evidence. Such results suggest that the HEIghten WC assessment demonstrates promise in providing institutions with a time- and cost-efficient measure of WC that may allow for actionable data to drive decision-making and improve teaching and student learning.
Published: 2017

5. Investigating Validity Evidence for the 'ETS'® Proficiency Profile. Research Report. ETS RR-17-01

Author: Roohr, Katrina Crotts, Liu, Ou Lydia, and Liu, Huili
Abstract: The "ETS"® Proficiency Profile (EPP), a college-level assessment, has been widely used to evaluate general education student learning outcomes (SLOs) in college. The purpose of this study was to investigate validity evidence for the EPP by evaluating the relationship with outcomes such as student retention, cumulative grade point average (GPA), and degree attainment, and by investigating differential validity across subgroups and cross-sectional learning gains. Three main conclusions were drawn from this study: (a) Students made significant learning gains from freshman to senior year using EPP scores; (b) freshman scores showed modest relationships with cumulative GPA at various points in college and senior scores showed strong relations with final-year cumulative GPA; and (c) differential validity was found across gender, race, and college major when looking at the relationship between EPP scores and first-year and sophomore GPA. Implications of these results are discussed.
Published: 2017

6. Assessing Intercultural Competence in Higher Education: Existing Research and Future Directions. Research Report. ETS RR-16-25

Author: Griffith, Richard L., Wolfeld, Leah, Armon, Brigitte K., Rios, Joseph, and Liu, Ou Lydia
Abstract: The modern wave of globalization has created a demand for increased intercultural competence (ICC) in college graduates who will soon enter the 21st-century workforce. Despite the wide attention to the concepts and assessment of ICC, few assessments meet the standards for a next-generation assessment in areas of construct clarity, innovative item types, response processes, and validity evidence. The objectives of this report are to identify current conceptualizations of ICC, review existing assessments and their validity evidence, propose a new framework for a next-generation ICC assessment, and discuss key assessment considerations. To summarize, we found the current state of the literature to be murky in terms of the clarity of the ICC construct. Definitions of the construct vary considerably as to whether it is a trait, skill, or performance outcome. In addition, current measurements of ICC overly rely on self-report methods, which have a number of flaws that result in less than optimal assessment. In this paper, we propose a new framework based on a model of the social thinking process developed by Grossman and colleagues that describes the knowledge, skills, and abilities that promote success in complex social situations. From this social process model, as well as Earley and Peterson's definition of ICC (a person's capability to gather, interpret, and act upon these radically different cues to function effectively across cultural settings or in a multicultural situation), three stages are developed: approach, analyze, and act. Guided by this framework, we discuss assessment considerations such as innovative task types and multiple response formats to help translate the framework to an assessment of ICC.
Published: 2016

7. Pilot Testing the Chinese Version of the ETS® Proficiency Profile Critical Thinking Test. Research Report. ETS RR-16-37

Author: Liu, Ou Lydia, Mao, Liyang, Zhao, Tingting, Yang, Yi, Xu, Jun, and Wang, Zhen
Abstract: Chinese higher education is experiencing rapid development and growth. With tremendous resources invested in higher education, policy makers have requested more direct evidence of student learning. However, assessment tools that can be used to measure college-level learning are scarce in China. To mitigate this situation, we translated the critical thinking test from the ETS® Proficiency Profile (EPP) into Chinese. EPP has been widely used in the United States to assess general college learning outcomes. We pilot tested the EPP--Chinese test with students from a university in China. Results suggest that (a) the test is unidimensional and therefore is sufficient to report a total score from a practical standpoint; (b) the total score reliability is satisfactory; (c) most items showed moderate correlations with the total score, but the translation of one item needs additional revision; (d) the test is correlated with related constructs (e.g., the Chinese college entrance examination and a national English test); and (e) no item showed differential item functioning or was found to be biased toward any subgroup. In summary, the Chinese version of the critical thinking test showed potential as a suitable assessment tool for Chinese college students.
Published: 2016

8. An Investigation of the Use and Predictive Validity of Scores from the 'GRE'® revised General Test in a Singaporean University. ETS GRE® Board Research Report. ETS GRE®-16-01. ETS Research Report. RR-16-05

Author: Liu, Ou Lydia, Klieger, David M., Bochenek, Jennifer L., Holtzman, Steven L., and Xu, Jun
Abstract: International institutions have been increasingly using the "GRE"® revised General Test to admit students to graduate programs. However, little is known about how scores from the GRE revised General Test are used in the admission process outside of the United States and their validity in predicting graduate students' performance (e.g., their graduate school grade point averages [GGPAs]). As the GRE revised General Test was launched in August 2011, there is a compelling need to investigate its predictive validity, particularly in an international context. A large percentage of examinees who take the GRE revised General Test from outside of the United States are citizens of Asian countries. Consequently, we examined how scores from the GRE revised General Test predict a range of graduate student performance outcomes at a Singaporean institution that represents the highest caliber of academic excellence in Asian countries. We also interviewed key members of the admissions committees to understand how the GRE revised General Test and its individual sections are used in the admission process. Our analyses revealed that scores from the GRE revised General Test predicted GGPA and program standing. In particular, these scores showed incremental value beyond undergraduate GPA (UGPA) for predicting GGPA. Furthermore, among enrolled students, those who submitted scores from the GRE revised General Test in application had significantly higher GGPAs than those who did not. These findings largely apply to both doctoral and master's students.
Published: 2016

9. Investigating the Relationship between Test Preparation and 'TOEFL iBT'® Performance. Research Report. ETS RR-14-15

Author: Liu, Ou Lydia
Abstract: This study investigates the relationship between test preparation and test performance on the "TOEFL iBT"® exam. Information on background variables and test preparation strategies was gathered from 14,593 respondents in China through an online survey. A Chinese standardized English test was used as a control for prior English ability. Multiple regression analyses were used to investigate the relationship of coaching school attendance and test preparation strategies with TOEFL iBT total scores. Coaching school attendance had little or no relationship with TOEFL test scores across language domains. Confirmatory factor analyses revealed that general English learning strategies and test-specific strategies represent two distinct factors of test preparation. Implications of the findings for test developers and test sponsors are discussed.
Published: 2014

10. Assessing Quantitative Literacy in Higher Education: An Overview of Existing Research and Assessments with Recommendations for Next-Generation Assessment. Research Report. ETS RR-14-22

Author: Roohr, Katrina Crotts, Graf, Edith Aurora, and Liu, Ou Lydia
Abstract: Quantitative literacy has been recognized as an important skill in the higher education and workforce communities, focusing on problem solving, reasoning, and real-world application. As a result, there is a need by various stakeholders in higher education and workforce communities to evaluate whether college students receive sufficient training on quantitative skills throughout their postsecondary education. To determine the key aspects of quantitative literacy, the first part of this report provides a comprehensive review of the existing frameworks and definitions by national and international organizations, higher education institutions, and other key stakeholders. It also examines existing assessments and discusses challenges in assessing quantitative literacy. The second part of this report proposes an approach for developing a next-generation quantitative literacy assessment in higher education with an operational definition and key assessment considerations. This report has important implications for higher education institutions currently using or planning to develop or adopt assessments of quantitative literacy.
Published: 2014

11. Assessing Critical Thinking in Higher Education: Current State and Directions for Next-Generation Assessment. Research Report. ETS RR-14-10

Author: Liu, Ou Lydia, Frankel, Lois, and Roohr, Katrina Crotts
Abstract: Critical thinking is one of the most important skills deemed necessary for college graduates to become effective contributors in the global workforce. The first part of this article provides a comprehensive review of its definitions by major frameworks in higher education and the workforce, existing assessments and their psychometric qualities, and challenges surrounding the design, implementation, and use of critical thinking assessment. In the second part, we offer an operational definition that is aligned with the dimensions of critical thinking identified from the reviewed frameworks and discuss the key assessment considerations when designing a next-generation critical thinking assessment. This article has important implications for institutions that are currently using, planning to adopt, or designing an assessment of critical thinking.
Published: 2014

12. Automated Guidance for Student Inquiry

Author: Gerard, Libby F, Ryoo, Kihyun, McElhaney, Kevin W, Liu, Ou Lydia, Rafferty, Anna N, and Linn, Marcia C
Subjects: technology, assessment, science inquiry, automated scoring and guidance, Specialist Studies in Education, Psychology, Cognitive Sciences, Education
Abstract: In 4 classroom experiments we investigated uses for technologies that automatically score student generated essays, concept diagrams, and drawings in inquiry curricula. We used the automatic scores to assign typical and research-based guidance and studied the impact of the guidance on student progress. Seven teachers and their 897 students participated. We documented the impact of guidance using pretests, embedded assessments, posttests, logged computer interaction data, and student and teacher interviews. We compared guidance designed to promote knowledge integration to 3 alternatives typically used in middle school classrooms. The knowledge integration guidance was more effective than generic guidance and specific guidance, and as effective as guidance designed by experienced teachers who also participated in professional development that emphasized knowledge integration. Results suggest that using automatic scores to assign knowledge integration guidance can provide an inquiry teaching partner: this guidance helps students use evidence to sort out ideas and can free teachers to support students who need extra help.
Published: 2016

13. Computer science skills across China, India, Russia, and the United States

Author: Loyalka, Prashant, Liu, Ou Lydia, Li, Guirong, Chirikov, Igor, Kardanova, Elena, Gu, Lin, Ling, Guangming, Yu, Ningning, Guo, Fei, Ma, Liping, Hu, Shangfeng, Johnson, Angela Sun, Bhuradia, Ashutosh, Khanna, Saurabh, Froumin, Isak, Shi, Jinghuan, Choudhury, Pradeep Kumar, Beteille, Tara, Marmolejo, Francisco, and Tognatta, Namrata
Published: 2019

14. Investigating 10-Year Trends of Learning Outcomes at Community Colleges. Research Report. ETS RR-13-34

Author: Liu, Ou Lydia and Roohr, Katrina Crotts
Abstract: Community colleges currently enroll about 44% of the undergraduate students in the United States and are rapidly expanding. It is of critical importance to obtain direct evidence of student learning to see if students receive adequate training at community colleges. This study investigated the 10-year trends of community college students' (n = 46,403) performance in reading, writing, mathematics, and critical thinking, as assessed by the ETS™ Proficiency Profile (EPP), an assessment of college-level learning outcomes. Results showed that community college students caught up with and significantly outperformed students from liberal arts colleges by the end of the 10-year period and made significant improvement in critical-thinking skills. An increasing gender gap was observed in mathematics at community colleges. Prevalent ethnic minority and English as a second language (ESL) gaps were noted but gaps between ESL and non-ESL students and between Hispanic and White students were decreasing. Additionally, Asian students at community colleges showed an overall decline in performance. Findings from this study provide significant implications for community college leaders, researchers, and policymakers.
Published: 2013

15. Is There Any Interaction between Background Knowledge and Language Proficiency That Affects 'TOEFL iBT'® Reading Performance? TOEFL iBT® Research Report. TOEFL iBT-18. ETS Research Report RR-12-22

Author: Hill, Yao Zhang and Liu, Ou Lydia
Abstract: This study investigated the effect of the interaction between test takers' background knowledge and language proficiency on their performance on the "TOEFL iBT"® reading section. Test takers with the target content background knowledge (the focal groups) and those without (the reference groups) were identified for each of the 5 selected passages based on their self-identified academic and cultural backgrounds. The test takers were further classified into high and low proficiency groups based on their TOEFL iBT scores. Differential functioning was investigated at the item, item bundle, and passage levels. The results suggested that background knowledge interacted with language proficiency on certain items, which could be attributed to idiosyncratic passage and item characteristics (i.e., characteristics that were specific to a particular passage or item). Only 1 of the 5 passages investigated showed intermediate differential bundle functioning, favoring the focal group for both the high and low proficiency groups. There was no differential functioning at the passage level. This research sheds new light on our understanding of the effects of background knowledge and its interaction with language proficiency in the context of second language reading comprehension. It also has significant practical implications for test developers in advancing fair assessments.
Published: 2012

16. Examining American Post-Secondary Education. Research Report. ETS RR-11-22

Author: Educational Testing Service and Liu, Ou Lydia
Abstract: The purpose of this report is to identify the most prominent issues in U.S. higher education and to develop strategic research plans to address the issues that are most relevant to ETS's capabilities in measurement and assessment through the ETS's higher education research initiative. In the United States, issues related to higher education such as improved performance and effective accountability have received unprecedented attention from stakeholders at many levels. At the national level, President Obama has set forth an ambitious agenda for American postsecondary education such that by 2020, the United States should once again have the largest concentration of citizens with a postsecondary degree. At the corporate level, ETS, as the world's largest educational research and testing organization, is ready to move beyond testing program-based research in higher education and has the capability to deal with some of the most thorny issues in higher education. By strategically expanding post-secondary research, ETS will establish itself as a pioneer and thought leader in the field of higher education. The first part of the research report identifies four key issues existing in American higher education: "enrollment and performance", "retention and degree attainment", "student learning and experience", and "learning outcomes and accountability". The second part of the research report develops an ETS research agenda with short-term and long-term plans to address these issues. The agenda specifies short-term and long-term research goals that are specific, attainable, and measurable. Research findings from the studies proposed in this agenda have a potential for advancing understanding of the current situation and future needs of American higher education and also contributing to enhanced student learning at postsecondary institutions. Reaffirming and strengthening American higher education is critical to this country's success in the 21st century. (Contains 17 figures, 2 tables and 2 notes.)
Published: 2011

17. Does Content Knowledge Affect TOEFL iBT™ Reading Performance? A Confirmatory Approach to Differential Item Functioning. TOEFL iBT Research Report. RR-09-29

Author: Educational Testing Service, Liu, Ou Lydia, Schedl, Mary, Malloy, Jeanne, and Kong, Nan
Abstract: The TOEFL iBT™ has increased the length of the reading passages in the reading section compared to the passages on the TOEFL® computer-based test (CBT) to better approximate academic reading in North American universities, resulting in a reduced number of passages in the reading test. A concern arising from this change is whether the decrease in topic variety increases the likelihood that an examinee's familiarity with the particular content of a given passage will influence the examinee's reading performance. This study investigated differential item functioning and differential bundle functioning for six TOEFL iBT reading passages, three involving physical science and three involving cultural topics. The majority of items displayed little or no differential item functioning (DIF). When all of the items in a passage were examined, none of the passages showed differential functioning at the passage level. Hypotheses are provided for the DIF occurrences. Implications for fairness issues in test development are also discussed. Appendices include: (1) E-Mail to the Test Takers; and (2) Background Survey for TOEFL iBT. (Contains 3 tables.)
Published: 2009

18. Measuring Learning Outcomes in Higher Education. ETS R&D Connections. Number 10

Author: Educational Testing Service and Liu, Ou Lydia
Abstract: As college tuitions and fees continue to grow, students, parents and public policymakers are interested in understanding how public universities operate and whether their investments are well-utilized. Accountability in public higher education has come into focus following the attention accountability has received in K-12 education. Against this backdrop, the Voluntary System of Accountability (VSA) was developed in 2007 by the American Association of State Colleges and Universities (AASCU) and the National Association of State Universities and Land-Grant Colleges (NASULGC). As of April 2009, 321 institutions from all 50 U.S. states have signed up for the VSA program, which evaluates core educational outcomes in public colleges and universities. The VSA uses the term value-added to refer to the learning progress college and university students make from freshman to senior year, measured by looking at the difference between freshmen and senior performance on one of three standardized tests. Because a good education has become a pathway to opportunities and success, all stakeholders, including students, parents, faculty members, institutional administrators, testing organizations, and public policymakers, deserve to know whether institutions have done their best to maximize student learning and have effectively utilized public resources. These stakeholders need to reach a scientific common ground as to how institutions should be evaluated and what constituencies should be involved. This common understanding is crucial to the fruitfulness of programs such as the VSA that aim to evaluate institutional effectiveness. (Contains 1 footnote.)
Published: 2009

19. Measuring Learning Outcomes in Higher Education Using the Measure of Academic Proficiency and Progress (MAPP). Research Report. ETS RR-08-47

Author: Liu, Ou Lydia
Abstract: The Secretary of Education's Commission on the Future of Higher Education emphasizes accountability in higher education as one of the key areas of interest. The Voluntary System of Accountability (VSA) was developed to evaluate the effectiveness of general public college education. This study examines how student progress in college, indicated by the performance difference between freshmen and seniors after controlling for admission scores, can be measured using the Measure of Academic Proficiency and Progress (MAPP) test. A total of 6,196 students from 23 institutions were included in this study. Results indicated that MAPP was able to differentiate the performance between freshmen and seniors after controlling for SAT®/ACT scores. The institutions were classified into 10 groups on the basis of the difference in the actual vs. expected MAPP performance. This study provides an example of how MAPP can be used to evaluate value-added performance in college education. Issues such as student sampling and test-taking motivation are discussed.
Published: 2008

20. An Initial Field Trial of an Instrument for Measuring Learning Strategies of Middle School Students. Research Report. ETS RR-08-03

Author: Liu, Ou Lydia, Jackson, Teresa, and Ling, Guangming
Abstract: Learning strategies have been increasingly recognized as a useful tool to promote effective learning. In response to the lack of available learning strategies measures for middle school students, this study designed an instrument for these students, assessing behavioral, cognitive, and metacognitive strategies. This instrument, the Middle School Learning Strategies (MSLS) scale, is examined in terms of factorial structure, reliability, and correlates. Three factors emerge from the analyses: "effective strategies," "help seeking," and "bad habits." The subscales displayed a reasonable reliability, ranging from 0.70 to 0.87. Student grades in language arts, social studies, math, and science were collected as criterion variables. As expected, grades in these four subjects correlated positively with both effective strategies and help seeking, yet negatively with bad habits. As a pilot measure, this instrument has demonstrated promising features as a useful tool for students to evaluate and enhance their learning strategies.
Published: 2008

21. An Initial Investigation of a Modified Procedure for Parallel Analysis. Research Report. ETS RR-07-41

Author: Liu, Ou Lydia, Rijmen, Frank, and Kong, Nan
Abstract: Parallel analysis has been well documented to be an effective and accurate method for determining the number of factors to retain in exploratory factor analysis. Despite its theoretical and empirical advantages, the popularity of parallel analysis has been thwarted by its limited access in statistical software such as SPSS and SAS, especially in software that analyzes ordinal data. Among the few commonly used procedures, the Hayton, Allen, and Scarpello (2004) procedure requires manually computing the mean of eigenvalues from at least 50 replications. The O'Connor (2000) procedure overcomes that limitation, yet it has difficulties dealing with random missing data. To address these technical issues of parallel analysis for ordinal variables, we adapted and modified the O'Connor procedure to provide an alternative that best approximates the ordinal data by factoring in the frequency distributions of the variables (e.g., the number of response categories and the frequency of each response category per variable). Our procedure has a slightly different theoretical rationale from O'Connor's as well as a practical advantage in dealing with missing data.
Published: 2007

22. The Standardized Letter of Recommendation: Implications for Selection. Research Report. ETS RR-07-38

Author: Liu, Ou Lydia, Minsky, Jennifer, and Ling, Guangming
Abstract: In an effort to standardize academic application procedures, the Standardized Letter of Recommendation (SLR) was developed to capture important cognitive and noncognitive qualities of graduate school candidates. The SLR consists of seven scales ("knowledge," "analytical skills," "communication skills," "motivation," "self-organization," "professionalism and maturity," and "teamwork") and was applied to an intern-selection scenario. Both professor ratings (N = 414) during the application process and mentor ratings of the selected students (N = 51) after the internship was completed were collected using the SLR. A multidimensional Rasch investigation suggests that the seven scales of the SLR displayed satisfactory psychometric properties in terms of reliability, model fit, item fit statistics, and discrimination. The two cognitive scales, "knowledge" and "analytical skills," were found to be the best predictors for intern selection. The professor ratings and mentor ratings had moderate to high correlations, with the professor ratings being systematically higher than the mentor ratings. Possible reasons for the rating discrepancies are discussed. Also, implications for how the SLR can be used and improved in other selection situations are suggested.
Published: 2007

23. Automated Scoring of Constructed‐Response Science Items: Prospects and Obstacles

Author: Liu, Ou Lydia, Brew, Chris, Blackmore, John, Gerard, Libby, Madhok, Jacquie, and Linn, Marcia C
Subjects: automated scoring, constructed-response items, c-rater, science assessment, Specialist Studies in Education, Education
Abstract: Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics aiming to differentiate among multiple levels of understanding. The items showed moderate to good agreement with human scores. The findings suggest that automated scoring has the potential to score constructed-response items with complex scoring rubrics, but in its current design cannot replace human raters. This article discusses sources of disagreement and factors that could potentially improve the accuracy of concept-based automated scoring. © 2014 by the National Council on Measurement in Education.
Published: 2014

24. Investigation of Response Changes in the GRE Revised General Test

Author: Liu, Ou Lydia, Bridgeman, Brent, Gu, Lixiong, Xu, Jun, and Kong, Nan
Abstract: Research on examinees' response changes on multiple-choice tests over the past 80 years has yielded some consistent findings, including that most examinees make score gains by changing answers. This study expands the research on response changes by focusing on a high-stakes admissions test--the Verbal Reasoning and Quantitative Reasoning measures of the GRE revised General Test. We analyzed data from 8,538 examinees for Quantitative and 9,140 for Verbal sections who took the GRE revised General Test in 12 countries. The analyses yielded findings consistent with prior research. In addition, as examinees' ability increases, the benefit of response changing increases. The study yielded significant implications for both test agencies and test takers. Computer adaptive tests often do not allow the test takers to review and revise. Findings from this study confirm the benefit of such features.
Published: 2015
Full Text: View/download PDF

25. Measuring Learning Outcomes in Higher Education: Motivation Matters

Author: Liu, Ou Lydia, Bridgeman, Brent, and Adler, Rachel M.
Abstract: With the pressing need for accountability in higher education, standardized outcomes assessments have been widely used to evaluate learning and inform policy. However, the critical question on how scores are influenced by students' motivation has been insufficiently addressed. Using random assignment, we administered a multiple-choice test and an essay across three motivational conditions. Students' self-report motivation was also collected. Motivation significantly predicted test scores. A substantial performance gap emerged between students in different motivational conditions (effect size as large as 0.68). Depending on the test format and condition, conclusions about college learning gain (i.e., value added) varied dramatically from substantial gain (d = 0.72) to negative gain (d = -0.23). The findings have significant implications for higher education stakeholders at many levels. (Contains 5 tables, 2 figures and 1 note.)
Published: 2012
Full Text: View/download PDF

26. Student Evaluation of Instruction: In the New Paradigm of Distance Education

Author: Liu, Ou Lydia
Abstract: Distance education has experienced soaring development over the last decade. With millions of students in higher education enrolling in distance education, it becomes critically important to understand student learning and experiences with online education. Based on a large sample of 11,351 students taught by 1,522 instructors from 29 colleges and universities, this study investigates the factors that impact student evaluation of instruction in distance education, using a two-level hierarchical model. Key findings reveal that in a distance education setting, gender and class size are no longer significant predictors of quality of instruction. However, factors such as reasons for taking the course, student class status and instructor's academic rank have a significant impact on student evaluation of learning and instruction. Findings from this study offer important implications for institutional administrators on utilizing the evaluation results and on developing strategies to help faculty become effective online instructors.
Published: 2012
Full Text: View/download PDF

27. Value-Added Assessment in Higher Education: A Comparison of Two Methods

Author: Liu, Ou Lydia
Abstract: Evaluation of the effectiveness of higher education has received unprecedented attention from stakeholders at many levels. The Voluntary System of Accountability (VSA) is one of the initiatives to evaluate institutional core educational outcomes (e.g., critical thinking, written communication) using standardized tests. As promising as the VSA method is for calculating a value-added score and allowing results to be comparable across institutions, it has a few potential methodological limitations. This study proposed an alternative way of value-added computation which takes advantage of multilevel models and considers important institution-level variables. The institutional value-added ranking was significantly different for some of the institutions (i.e., from being ranked at the bottom to performing better than 50% of the institutions) between these two methods, which may lead to substantially different consequences for those institutions, should the results be considered for accountability purposes.
Published: 2011
Full Text: View/download PDF

28. Student Evaluation of Instruction: In the New Paradigm of Distance Education

Author: Liu, Ou Lydia, primary
Published: 2011
Full Text: View/download PDF

29. Value-added assessment in higher education: a comparison of two methods

Author: Liu, Ou Lydia, primary
Published: 2010
Full Text: View/download PDF
