286 results on '"Liu, Ou Lydia"'
Search Results
2. An Exploration of Admissions-Related Practices from Institutions' Admissions Web Pages and Implications for Equity
- Author
-
Sotelo, Jose, Gooch, Reginald M., Cho-Baker, Sugene, Haviland, Sara B., Kell, Harrison J., Ling, Guangming, and Liu, Ou Lydia
- Abstract
This study investigates current practices in how admissions policies are communicated through student-facing web pages. One hundred fifty web pages (30 institutions, 5 admissions web pages each, stratified by degree-level and major) were scraped for information about holistic admissions policies and required application materials. Overall, more holistic language was used in undergraduate web pages than graduate web pages, particularly among nonminority serving institutions (MSIs). Graduate web pages required more application materials than undergraduate web pages, with more materials required for graduate web pages that used holistic language. Additionally, undergraduate web pages that used holistic language identified more qualitative cutoffs (e.g., identifying relative importance of application components, in-depth discussion of how factors are balanced) than the web pages that did not use holistic language. The paper concludes with a discussion of directions for future research and of best practices and practical takeaways for admissions officers to help increase equity in admissions.
- Published
- 2023
3. Assessing student learning outcomes of higher education
- Author
-
Liu, Ou Lydia, primary
- Published
- 2023
4. Computerized Text Analysis: Assessment and Research Potentials for Promoting Learning
- Author
-
Lee, Hee-Sun, McNamara, Danielle, Bracey, Zoë Buck, Wilson, Christopher, Osborne, Jonathan, Haudek, Kevin C., Liu, Ou Lydia, Pallant, Amy, Gerard, Libby, Linn, Marcia C., and Sherin, Bruce
- Abstract
Rapid advancements in computing have enabled automatic analyses of written texts created in educational settings. The purpose of this symposium is to survey several applications of computerized text analyses used in the research and development of productive learning environments. Four featured research projects have developed or been working on: (1) equitable automated scoring models for scientific argumentation for English Language Learners; (2) a real-time, adjustable formative assessment system to promote student revision of uncertainty-infused scientific arguments; (3) a web-based annotation tool to support student revision of scientific essays; and (4) a new research methodology that analyzes teacher-produced text in online professional development courses. These projects will provide unique insights towards assessment and research opportunities associated with a variety of computerized text analysis approaches. [This paper was published in: "13th International Conference on Computer Supported Collaborative Learning Proceedings," 2019, pp. 743-750.]
- Published
- 2019
5. The Effect of Faculty Research on Student Learning in College
- Author
-
Loyalka, Prashant, Shi, Zhaolei, Li, Guirong, Kardanova, Elena, Chirikov, Igor, Yu, Ningning, Hu, Shangfeng, Wang, Huan, Ma, Liping, Guo, Fei, Liu, Ou Lydia, Bhuradia, Ashutosh, Khanna, Saurabh, Li, Yanyan, and Murray, Adam
- Abstract
Whether faculty research affects college student learning has long been the subject of debate. Previous studies use subjective measures of student learning; focus on correlation rather than causation; and typically focus on one college, thus lacking generalizability. Using unique, large-scale survey and assessment data that we collected from nationally representative samples of STEM undergraduates in China, India, and Russia, as well as a causal identification strategy that accounts for differential sorting of students to faculty, we present generalizable estimates of the effect of faculty research on objective, standardized measures of student learning. Results show that faculty research has a negative effect on student learning, suggesting direct trade-offs between the university's dual mission of producing research and learning.
- Published
- 2022
6. Are Fourth-Year College Students Better Critical Thinkers than Their First-Year Peers? Not so Much, and College Major and Ethnicity Matter
- Author
-
Liu, Ou Lydia, Roohr, Katrina Crotts, Seybert, Jacob M., and Fishtein, Daniel
- Abstract
Critical thinking has become an essential skill required by both higher education and the workforce. Research to date has reported moderate cross-sectional learning differences in critical thinking as students progress through a 4-year college. However, retention, differential participation, and students' low test-taking motivation possibly confounded conclusions from prior studies. Controlling for such factors, we found a cross-sectional difference of 0.24 SDs after 4 years in college (n = 2,381 students, 46 institutions), considerably smaller than what's reported in prior studies. Natural science majors performed the highest and business majors performed the lowest. Minority students achieved only half of the cross-sectional difference (0.18 SDs) of white students (0.39 SDs), and only 6% of them scored at the Advanced level, compared to 24% of white students.
- Published
- 2022
7. Assessing Civic and Intercultural Competency in Higher Education: The ETS 'HEIghten®' Approach. Research Report. ETS RR-18-23
- Author
-
Liu, Ou Lydia, Roohr, Katrina Crotts, and Rios, Joseph A.
- Abstract
Economic globalization and interdisciplinary advancements have increased the demand for college graduates to possess transferable skills that would allow them to contribute effectively to the modern workforce. In particular, transferable competencies such as civic competency and intercultural competency are critical for colleges to prepare responsible citizens and productive workers. Despite the recognized importance, the choice and quality of assessment for such competencies have been fairly limited due to the challenges in defining such complex, multidimensional constructs and identifying item types that can adequately assess them. In this report, we describe the principles we followed to operationalize definitions for civic competency and intercultural competency and the process we followed to design assessments for these 2 competencies. Findings from a large-scale pilot test are reported. Results showed that these multidimensional constructs can be adequately assessed and that there is room for students to improve in these areas. Implications for higher education institutions on how to promote these critical competencies are discussed.
- Published
- 2018
8. Development and Validation of the Written Communication Assessment of the 'HEIghten'® Outcomes Assessment Suite. Research Report. ETS RR-17-53
- Author
-
Rios, Joseph A., Sparks, Jesse R., Zhang, Mo, and Liu, Ou Lydia
- Abstract
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack the ability to balance authenticity (i.e., requiring students to produce a sample of writing) with psychometric quality. To this end, we discuss the development of a newly developed measure, the WC assessment of the "HEIghten"® outcomes assessment suite, and present pilot test results based on a sample of 985 test takers from 33 higher education institutions. Overall, we found that the measure includes well-functioning items (i.e., highly discriminating and lacking gender-differential item functioning), an essay task that can be reliably scored by combining human scores with scores provided by an automated algorithm, evidence to support reporting separate selected-response and essay scores to individuals and institutions, and adequate convergent validity evidence. Such results suggest that the HEIghten WC assessment demonstrates promise in providing institutions with a time- and cost-efficient measure of WC that may allow for actionable data to drive decision-making and improve teaching and student learning.
- Published
- 2017
9. Investigating Validity Evidence for the 'ETS'® Proficiency Profile. Research Report. ETS RR-17-01
- Author
-
Roohr, Katrina Crotts, Liu, Ou Lydia, and Liu, Huili
- Abstract
The "ETS"® Proficiency Profile (EPP), a college-level assessment, has been widely used to evaluate general education student learning outcomes (SLOs) in college. The purpose of this study was to investigate validity evidence for the EPP by evaluating the relationship with outcomes such as student retention, cumulative grade point average (GPA), and degree attainment, and by investigating differential validity across subgroups and cross-sectional learning gains. Three main conclusions were drawn from this study: (a) Students made significant learning gains from freshman to senior year using EPP scores; (b) freshman scores showed modest relationships with cumulative GPA at various points in college and senior scores showed strong relations with final-year cumulative GPA; and (c) differential validity was found across gender, race, and college major when looking at the relationship between EPP scores and first-year and sophomore GPA. Implications of these results are discussed.
- Published
- 2017
10. Examining Mode Effects for an Adapted Chinese Critical Thinking Assessment
- Author
-
Gu, Lin, Ling, Guangming, Liu, Ou Lydia, Yang, Zhitong, Li, Guirong, Kardanova, Elena, and Loyalka, Prashant
- Abstract
We examine the effects of computer-based versus paper-based assessment of critical thinking skills, adapted from English (in the U.S.) to Chinese. Using data collected based on a random assignment between the two modes in multiple Chinese colleges, we investigate mode effects from multiple perspectives: mean scores, measurement precision, item functioning (i.e. item difficulty and discrimination), response behavior (i.e. test completion and item omission), and user perceptions. Our findings shed light on assessment and item properties that could be the sources of mode effects. At the test level, we find that the computer-based test is more difficult and more speeded than the paper-based test. We speculate that these differences are attributable to the test's structure, its high demands on reading, and test-taking flexibility afforded under the paper testing mode. Item-level evaluation allows us to identify item characteristics that are prone to mode effects, including targeted cognitive skill, response type, and the amount of adaptation between modes. Implications for test design are discussed, and actionable design suggestions are offered with the goal of minimizing mode effect.
- Published
- 2021
11. Assessing Intercultural Competence in Higher Education: Existing Research and Future Directions. Research Report. ETS RR-16-25
- Author
-
Griffith, Richard L., Wolfeld, Leah, Armon, Brigitte K., Rios, Joseph, and Liu, Ou Lydia
- Abstract
The modern wave of globalization has created a demand for increased intercultural competence (ICC) in college graduates who will soon enter the 21st-century workforce. Despite the wide attention to the concepts and assessment of ICC, few assessments meet the standards for a next-generation assessment in areas of construct clarity, innovative item types, response processes, and validity evidence. The objectives of this report are to identify current conceptualizations of ICC, review existing assessments and their validity evidence, propose a new framework for a next-generation ICC assessment, and discuss key assessment considerations. To summarize, we found the current state of the literature to be murky in terms of the clarity of the ICC construct. Definitions of the construct vary considerably as to whether it is a trait, skill, or performance outcome. In addition, current measurements of ICC overly rely on self-report methods, which have a number of flaws that result in less than optimal assessment. In this paper, we propose a new framework based on a model of the social thinking process developed by Grossman and colleagues that describes the knowledge, skills, and abilities that promote success in complex social situations. From this social process model, as well as Earley and Peterson's definition of ICC (a person's capability to gather, interpret, and act upon these radically different cues to function effectively across cultural settings or in a multicultural situation), three stages are developed: approach, analyze, and act. Guided by this framework, we discuss assessment considerations such as innovative task types and multiple response formats to help translate the framework to an assessment of ICC.
- Published
- 2016
12. Pilot Testing the Chinese Version of the ETS® Proficiency Profile Critical Thinking Test. Research Report. ETS RR-16-37
- Author
-
Liu, Ou Lydia, Mao, Liyang, Zhao, Tingting, Yang, Yi, Xu, Jun, and Wang, Zhen
- Abstract
Chinese higher education is experiencing rapid development and growth. With tremendous resources invested in higher education, policy makers have requested more direct evidence of student learning. However, assessment tools that can be used to measure college-level learning are scarce in China. To mitigate this situation, we translated the critical thinking test from the ETS® Proficiency Profile (EPP) into Chinese. EPP has been widely used in the United States to assess general college learning outcomes. We pilot tested the EPP--Chinese test with students from a university in China. Results suggest that (a) the test is unidimensional and therefore is sufficient to report a total score from a practical standpoint; (b) the total score reliability is satisfactory; (c) most items showed moderate correlations with the total score, but the translation of one item needs additional revision; (d) the test is correlated with related constructs (e.g., the Chinese college entrance examination and a national English test); and (e) no item showed differential item functioning or was found to be biased toward any subgroup. In summary, the Chinese version of the critical thinking test showed potential as a suitable assessment tool for Chinese college students.
- Published
- 2016
13. An Investigation of the Use and Predictive Validity of Scores from the 'GRE'® revised General Test in a Singaporean University. ETS GRE® Board Research Report. ETS GRE®-16-01. ETS Research Report. RR-16-05
- Author
-
Liu, Ou Lydia, Klieger, David M., Bochenek, Jennifer L., Holtzman, Steven L., and Xu, Jun
- Abstract
International institutions have been increasingly using the "GRE"® revised General Test to admit students to graduate programs. However, little is known about how scores from the GRE revised General Test are used in the admission process outside of the United States and their validity in predicting graduate students' performance (e.g., their graduate school grade point averages [GGPAs]). As the GRE revised General Test was launched in August 2011, there is a compelling need to investigate its predictive validity, particularly in an international context. A large percentage of examinees who take the GRE revised General Test from outside of the United States are citizens of Asian countries. Consequently, we examined how scores from the GRE revised General Test predict a range of graduate student performance outcomes at a Singaporean institution that represents the highest caliber of academic excellence in Asian countries. We also interviewed key members of the admissions committees to understand how the GRE revised General Test and its individual sections are used in the admission process. Our analyses revealed that scores from the GRE revised General Test predicted GGPA and program standing. In particular, these scores showed incremental value beyond undergraduate GPA (UGPA) for predicting GGPA. Furthermore, among enrolled students, those who submitted scores from the GRE revised General Test in application had significantly higher GGPAs than those who did not. These findings largely apply to both doctoral and master's students.
- Published
- 2016
14. Are College Students Gaining Critical Thinking Skills? Not so Much and Major and Ethnicity Matter
- Author
-
Liu, Ou Lydia, Roohr, Katrina Crotts, Seybert, Jacob, and Fishtein, Daniel
- Abstract
Critical thinking has become an essential skill required by both higher education and the workforce. Prior research reported moderate learning gains in critical thinking through college. However, retention, differential participation, and students' low test-taking motivation have possibly confounded the prior conclusions. Controlling for such factors, we found a gain of 0.24 SDs after four years in college (n = 2,381 students, 46 institutions), considerably smaller than previously reported results. Natural science majors performed the highest and business majors performed the lowest. Minority students achieved only half of the gain of white students. Fifty percent of minority students scored below the Proficient level, compared to 27% for white students, and only 5% performed at the Advanced level, compared to 20% for white students.
- Published
- 2020
15. Thinking Critically about Critical Thinking: Validating the Russian HEIghten® Critical Thinking Assessment
- Author
-
Shaw, Amy, Liu, Ou Lydia, Gu, Lin, Kardanova, Elena, Chirikov, Igor, Li, Guirong, Hu, Shangfeng, Yu, Ningning, Ma, Liping, Guo, Fei, Su, Qi, Shi, Jinghuan, Shi, Henry, and Loyalka, Prashant
- Abstract
Critical thinking has been identified as a crucial general skill contributing to academic and career success in the twenty-first century. With the increasing demands of the modern workplace and a global trend of accountability in higher education, educators and employers pay great attention to the development of students' critical thinking skills throughout their training. Therefore, there is an urgent need worldwide for an updated and comprehensive assessment tool of college-level critical thinking. This paper reports on the preliminary validation for the Russian version of the "HEIghten"® Critical Thinking assessment developed by Educational Testing Service (ETS). Based on a large Russian college student sample (N = 1060), we evaluated the psychometric quality of the items, individual and institution-level reliability, external validity, and student perceptions. Overall, the results suggested good psychometric quality, except that a few items showed low discriminating power and should be further examined with a second wave of data collection. IRT analyses revealed testlet effects and supported the essentially unidimensional structure of the measure. Appropriate correlations with external criteria provided support for the measure's convergent validity. Implications of the preliminary validation study results and the future research agenda, especially the need to collect longitudinal data, are discussed.
- Published
- 2020
16. Skill levels and gains in university STEM education in China, India, Russia and the United States
- Author
-
Loyalka, Prashant, Liu, Ou Lydia, Li, Guirong, Kardanova, Elena, Chirikov, Igor, Hu, Shangfeng, Yu, Ningning, Ma, Liping, Guo, Fei, Beteille, Tara, Tognatta, Namrata, Gu, Lin, Ling, Guangming, Federiakin, Denis, Wang, Huan, Khanna, Saurabh, Bhuradia, Ashutosh, Shi, Zhaolei, and Li, Yanyan
- Published
- 2021
17. Investigating the Relationship between Test Preparation and 'TOEFL iBT'® Performance. Research Report. ETS RR-14-15
- Author
-
Liu, Ou Lydia
- Abstract
This study investigates the relationship between test preparation and test performance on the "TOEFL iBT"® exam. Information on background variables and test preparation strategies was gathered from 14,593 respondents in China through an online survey. A Chinese standardized English test was used as a control for prior English ability. Multiple regression analyses were used to investigate the relationship of coaching school attendance and test preparation strategies with TOEFL iBT total scores. Coaching school attendance had little or no relationship with TOEFL test scores across language domains. Confirmatory factor analyses revealed that general English learning strategies and test-specific strategies represent two distinct factors of test preparation. Implications of the findings for test developers and test sponsors are discussed.
- Published
- 2014
18. Assessing Quantitative Literacy in Higher Education: An Overview of Existing Research and Assessments with Recommendations for Next-Generation Assessment. Research Report. ETS RR-14-22
- Author
-
Roohr, Katrina Crotts, Graf, Edith Aurora, and Liu, Ou Lydia
- Abstract
Quantitative literacy has been recognized as an important skill in the higher education and workforce communities, focusing on problem solving, reasoning, and real-world application. As a result, there is a need by various stakeholders in higher education and workforce communities to evaluate whether college students receive sufficient training on quantitative skills throughout their postsecondary education. To determine the key aspects of quantitative literacy, the first part of this report provides a comprehensive review of the existing frameworks and definitions by national and international organizations, higher education institutions, and other key stakeholders. It also examines existing assessments and discusses challenges in assessing quantitative literacy. The second part of this report proposes an approach for developing a next-generation quantitative literacy assessment in higher education with an operational definition and key assessment considerations. This report has important implications for higher education institutions currently using or planning to develop or adopt assessments of quantitative literacy.
- Published
- 2014
19. Assessing Critical Thinking in Higher Education: Current State and Directions for Next-Generation Assessment. Research Report. ETS RR-14-10
- Author
-
Liu, Ou Lydia, Frankel, Lois, and Roohr, Katrina Crotts
- Abstract
Critical thinking is one of the most important skills deemed necessary for college graduates to become effective contributors in the global workforce. The first part of this article provides a comprehensive review of its definitions by major frameworks in higher education and the workforce, existing assessments and their psychometric qualities, and challenges surrounding the design, implementation, and use of critical thinking assessment. In the second part, we offer an operational definition that is aligned with the dimensions of critical thinking identified from the reviewed frameworks and discuss the key assessment considerations when designing a next-generation critical thinking assessment. This article has important implications for institutions that are currently using, planning to adopt, or designing an assessment of critical thinking.
- Published
- 2014
20. Automated Text Scoring and Real-Time Adjustable Feedback: Supporting Revision of Scientific Arguments Involving Uncertainty
- Author
-
Lee, Hee-Sun, Pallant, Amy, Pryputniewicz, Sarah, Lord, Trudi, Mulholland, Matthew, and Liu, Ou Lydia
- Abstract
This paper describes HASbot, an automated text scoring and real-time feedback system designed to support student revision of scientific arguments. Students submit open-ended text responses to explain how their data support claims and how the limitations of their data affect the uncertainty of their explanations. HASbot automatically scores these text responses and returns the scores with feedback to students. Data were collected from 343 middle- and high-school students taught by nine teachers across seven states in the United States. A mixed methods design was applied to investigate (a) how students' utilization of HASbot impacted their development of uncertainty-infused scientific arguments; (b) how students used feedback to revise their arguments, and (c) how the current design of HASbot supported or hindered students' revisions. Paired sample t tests indicate that students made significant gains from pretest to posttest in uncertainty-infused scientific argumentation, ES = 1.52 SD, p < 0.001. Linear regression analysis results indicate that students' HASbot use significantly contributed to their posttest performance on uncertainty-infused scientific argumentation while gender, English language learner status, and prior computer experience did not. From the analysis of videos, we identified several affordances and limitations of HASbot.
- Published
- 2019
21. Automated Guidance for Student Inquiry
- Author
-
Gerard, Libby F, Ryoo, Kihyun, McElhaney, Kevin W, Liu, Ou Lydia, Rafferty, Anna N, and Linn, Marcia C
- Subjects
technology, assessment, science inquiry, automated scoring and guidance, Specialist Studies in Education, Psychology, Cognitive Sciences, Education
- Abstract
In 4 classroom experiments we investigated uses for technologies that automatically score student generated essays, concept diagrams, and drawings in inquiry curricula. We used the automatic scores to assign typical and research-based guidance and studied the impact of the guidance on student progress. Seven teachers and their 897 students participated. We documented the impact of guidance using pretests, embedded assessments, posttests, logged computer interaction data, and student and teacher interviews. We compared guidance designed to promote knowledge integration to 3 alternatives typically used in middle school classrooms. The knowledge integration guidance was more effective than generic guidance and specific guidance, and as effective as guidance designed by experienced teachers who also participated in professional development that emphasized knowledge integration. Results suggest that using automatic scores to assign knowledge integration guidance can provide an inquiry teaching partner: this guidance helps students use evidence to sort out ideas and can free teachers to support students who need extra help.
- Published
- 2016
22. Computer science skills across China, India, Russia, and the United States
- Author
-
Loyalka, Prashant, Liu, Ou Lydia, Li, Guirong, Chirikov, Igor, Kardanova, Elena, Gu, Lin, Ling, Guangming, Yu, Ningning, Guo, Fei, Ma, Liping, Hu, Shangfeng, Johnson, Angela Sun, Bhuradia, Ashutosh, Khanna, Saurabh, Froumin, Isak, Shi, Jinghuan, Choudhury, Pradeep Kumar, Beteille, Tara, Marmolejo, Francisco, and Tognatta, Namrata
- Published
- 2019
23. Investigating 10-Year Trends of Learning Outcomes at Community Colleges. Research Report. ETS RR-13-34
- Author
-
Liu, Ou Lydia and Roohr, Katrina Crotts
- Abstract
Community colleges currently enroll about 44% of the undergraduate students in the United States and are rapidly expanding. It is of critical importance to obtain direct evidence of student learning to see if students receive adequate training at community colleges. This study investigated the 10-year trends of community college students' (n = 46,403) performance in reading, writing, mathematics, and critical thinking, as assessed by the ETS™ Proficiency Profile (EPP), an assessment of college-level learning outcomes. Results showed that community college students caught up with and significantly outperformed students from liberal arts colleges by the end of the 10-year period and made significant improvement in critical-thinking skills. An increasing gender gap was observed in mathematics at community colleges. Prevalent ethnic minority and English as a second language (ESL) gaps were noted but gaps between ESL and non-ESL students and between Hispanic and White students were decreasing. Additionally, Asian students at community colleges showed an overall decline in performance. Findings from this study provide significant implications for community college leaders, researchers, and policymakers.
- Published
- 2013
24. Is There Any Interaction between Background Knowledge and Language Proficiency That Affects 'TOEFL iBT'® Reading Performance? TOEFL iBT® Research Report. TOEFL iBT-18. ETS Research Report RR-12-22
- Author
-
Hill, Yao Zhang and Liu, Ou Lydia
- Abstract
This study investigated the effect of the interaction between test takers' background knowledge and language proficiency on their performance on the "TOEFL iBT"® reading section. Test takers with the target content background knowledge (the focal groups) and those without (the reference groups) were identified for each of the 5 selected passages based on their self-identified academic and cultural backgrounds. The test takers were further classified into high and low proficiency groups based on their TOEFL iBT scores. Differential functioning was investigated at the item, item bundle, and passage levels. The results suggested that background knowledge interacted with language proficiency on certain items, which could be attributed to idiosyncratic passage and item characteristics (i.e., characteristics that were specific to a particular passage or item). Only 1 of the 5 passages investigated showed intermediate differential bundle functioning, favoring the focal group for both the high and low proficiency groups. There was no differential functioning at the passage level. This research sheds new light on our understanding of the effects of background knowledge and its interaction with language proficiency in the context of second language reading comprehension. It also has significant practical implications for test developers in advancing fair assessments.
- Published
- 2012
25. Examining American Post-Secondary Education. Research Report. ETS RR-11-22
- Author
-
Educational Testing Service and Liu, Ou Lydia
- Abstract
The purpose of this report is to identify the most prominent issues in U.S. higher education and to develop strategic research plans to address the issues that are most relevant to ETS's capabilities in measurement and assessment through the ETS's higher education research initiative. In the United States, issues related to higher education such as improved performance and effective accountability have received unprecedented attention from stakeholders at many levels. At the national level, President Obama has set forth an ambitious agenda for American postsecondary education such that by 2020, the United States should once again have the largest concentration of citizens with a postsecondary degree. At the corporate level, ETS, as the world's largest educational research and testing organization, is ready to move beyond testing program-based research in higher education and has the capability to deal with some of the most thorny issues in higher education. By strategically expanding post-secondary research, ETS will establish itself as a pioneer and thought leader in the field of higher education. The first part of the research report identifies four key issues existing in American higher education: "enrollment and performance", "retention and degree attainment", "student learning and experience", and "learning outcomes and accountability". The second part of the research report develops an ETS research agenda with short-term and long-term plans to address these issues. The agenda specifies short-term and long-term research goals that are specific, attainable, and measurable. Research findings from the studies proposed in this agenda have a potential for advancing understanding of the current situation and future needs of American higher education and also contributing to enhanced student learning at postsecondary institutions. Reaffirming and strengthening American higher education is critical to this country's success in the 21st century. 
(Contains 17 figures, 2 tables and 2 notes.)
- Published
- 2011
26. Does Content Knowledge Affect TOEFL iBT[TM] Reading Performance? A Confirmatory Approach to Differential Item Functioning. TOEFL iBT Research Report. RR-09-29
- Author
-
Educational Testing Service, Liu, Ou Lydia, Schedl, Mary, Malloy, Jeanne, and Kong, Nan
- Abstract
The TOEFL iBT[TM] has increased the length of the reading passages in the reading section compared to the passages on the TOEFL[R] computer-based test (CBT) to better approximate academic reading in North American universities, resulting in a reduced number of passages in the reading test. A concern arising from this change is whether the decrease in topic variety increases the likelihood that an examinee's familiarity with the particular content of a given passage will influence the examinee's reading performance. This study investigated differential item functioning and differential bundle functioning for six TOEFL iBT reading passages, three involving physical science and three involving cultural topics. The majority of items displayed little or no differential item functioning (DIF). When all of the items in a passage were examined, none of the passages showed differential functioning at the passage level. Hypotheses are provided for the DIF occurrences. Implications for fairness issues in test development are also discussed. Appendices include: (1) E-Mail to the Test Takers; and (2) Background Survey for TOEFL iBT. (Contains 3 tables.)
- Published
- 2009
27. Measuring Learning Outcomes in Higher Education. ETS R&D Connections. Number 10
- Author
-
Educational Testing Service and Liu, Ou Lydia
- Abstract
As college tuitions and fees continue to grow, students, parents and public policymakers are interested in understanding how public universities operate and whether their investments are well-utilized. Accountability in public higher education has come into focus following the attention accountability has received in K-12 education. Against this backdrop, the Voluntary System of Accountability (VSA) was developed in 2007 by the American Association of State Colleges and Universities (AASCU) and the National Association of State Universities and Land-Grant Colleges (NASULGC). As of April 2009, 321 institutions from all 50 U.S. states have signed up for the VSA program, which evaluates core educational outcomes in public colleges and universities. The VSA uses the term value-added to refer to the learning progress college and university students make from freshman to senior year, measured by looking at the difference between freshman and senior performance on one of three standardized tests. Because a good education has become a pathway to opportunities and success, all stakeholders, including students, parents, faculty members, institutional administrators, testing organizations, and public policymakers, deserve to know whether institutions have done their best to maximize student learning and have effectively utilized public resources. These stakeholders need to reach a scientific common ground as to how institutions should be evaluated and what constituencies should be involved. This common understanding is crucial to the fruitfulness of programs such as the VSA that aim to evaluate institutional effectiveness. (Contains 1 footnote.)
- Published
- 2009
28. Validating the Use of Translated and Adapted HEIghten ® Quantitative Literacy Test in Russia
- Author
-
Gu, Lin, Liu, Ou Lydia, Xu, Jun, Kardonova, Elena, Chirikov, Igor, Li, Guirong, Hu, Shangfeng, Yu, Ningning, Ma, Liping, Guo, Fei, Su, Qi, Shi, Jinghuan, Shi, Henry, Loyalka, Prashant, Veldkamp, Bernard, Series Editor, von Davier, Matthias, Series Editor, Zlatkin-Troitschanskaia, Olga, editor, Toepper, Miriam, editor, Pant, Hans Anand, editor, Lautenbach, Corinna, editor, and Kuhn, Christiane, editor
- Published
- 2018
- Full Text
- View/download PDF
29. Validation of Automated Scoring for a Formative Assessment That Employs Scientific Argumentation
- Author
-
Mao, Liyang, Liu, Ou Lydia, Roohr, Katrina, Belur, Vinetha, Mulholland, Matthew, Lee, Hee-Sun, and Pallant, Amy
- Abstract
Scientific argumentation is one of the core practices for teachers to implement in science classrooms. We developed a computer-based formative assessment to support students' construction and revision of scientific arguments. The assessment is built upon automated scoring of students' arguments and provides feedback to students and teachers. Preliminary validity evidence was collected in this study to support the use of automated scoring in this formative assessment. The results showed satisfactory psychometric properties related to this formative assessment. The automated scores showed satisfactory agreement with human scores, but small discrepancies still existed. Automated scores and feedback encouraged students to revise their answers. Students' scientific argumentation skills improved during the revision process. These findings provided preliminary evidence to support the use of automated scoring in the formative assessment to diagnose and enhance students' argumentation skills in the context of climate change in secondary school science classrooms.
- Published
- 2018
- Full Text
- View/download PDF
30. Assessing College Critical Thinking: Preliminary Results from the Chinese HEIghten® Critical Thinking Assessment
- Author
-
Liu, Ou Lydia, Shaw, Amy, Gu, Lin, Li, Guirong, Hu, Shangfeng, Yu, Ningning, Ma, Liping, Xu, Changqing, Guo, Fei, Su, Qi, Kardanova, Elena, Chirikov, Igor, Shi, Jinghuan, Shi, Zhaolei, Wang, Huan, and Loyalka, Prashant
- Abstract
Assessing student learning outcomes has become a global trend in higher education. In this paper, we report on the validation of the Chinese HEIghten® Critical Thinking assessment with a nationally representative sample of Electrical Engineering and Computer Science students from 35 institutions in China. Key findings suggest that there was a test delivery mode effect favoring the paper tests over the online tests. In general, the psychometric quality of the items was satisfactory for low-stakes, group-level uses, but there were a few items with low discrimination that await further investigation. The relationships between test scores and various external variables such as college entrance examination scores, university elite status and student perceptions of the test were as expected. We conclude with speculations on the key findings and discussion of directions for future research.
- Published
- 2018
- Full Text
- View/download PDF
31. Measuring Learning Outcomes in Higher Education Using the Measure of Academic Proficiency and Progress (MAPP). Research Report. ETS RR-08-47
- Author
-
Liu, Ou Lydia
- Abstract
The Secretary of Education's Commission on the Future of Higher Education emphasizes accountability in higher education as one of the key areas of interest. The Voluntary System of Accountability (VSA) was developed to evaluate the effectiveness of general public college education. This study examines how student progress in college, indicated by the performance difference between freshmen and seniors after controlling for admission scores, can be measured using the Measure of Academic Proficiency and Progress (MAPP) test. A total of 6,196 students from 23 institutions were included in this study. Results indicated that MAPP was able to differentiate the performance between freshmen and seniors after controlling for SAT®/ACT scores. The institutions were classified into 10 groups on the basis of the difference in the actual vs. expected MAPP performance. This study provides an example of how MAPP can be used to evaluate value-added performance in college education. Issues such as student sampling and test-taking motivation are discussed.
- Published
- 2008
32. An Initial Field Trial of an Instrument for Measuring Learning Strategies of Middle School Students. Research Report. ETS RR-08-03
- Author
-
Liu, Ou Lydia, Jackson, Teresa, and Ling, Guangming
- Abstract
Learning strategies have been increasingly recognized as a useful tool to promote effective learning. In response to the lack of available learning strategies measures for middle school students, this study designed an instrument for these students, assessing behavioral, cognitive, and metacognitive strategies. This instrument, the Middle School Learning Strategies (MSLS) scale, is examined in terms of factorial structure, reliability, and correlates. Three factors emerge from the analyses: "effective strategies," "help seeking," and "bad habits." The subscales displayed a reasonable reliability, ranging from 0.70 to 0.87. Student grades in language arts, social studies, math, and science were collected as criterion variables. As expected, grades in these four subjects correlated positively with both effective strategies and help seeking, yet negatively with bad habits. As a pilot measure, this instrument has demonstrated promising features as a useful tool for students to evaluate and enhance their learning strategies.
- Published
- 2008
33. Value Added in Higher Education: Brief History, Measurement, Challenges, and Future Directions
- Author
-
Roohr, Katrina Crotts, primary, Olivera-Aguilar, Margarita, additional, and Liu, Ou Lydia, additional
- Published
- 2021
- Full Text
- View/download PDF
34. An Initial Investigation of a Modified Procedure for Parallel Analysis. Research Report. ETS RR-07-41
- Author
-
Liu, Ou Lydia, Rijmen, Frank, and Kong, Nan
- Abstract
Parallel analysis has been well documented to be an effective and accurate method for determining the number of factors to retain in exploratory factor analysis. Despite its theoretical and empirical advantages, the popularity of parallel analysis has been thwarted by its limited access in statistical software such as SPSS and SAS, especially in software that analyzes ordinal data. Among the few commonly used procedures, the Hayton, Allen, and Scarpello (2004) procedure requires manually computing the mean of eigenvalues from at least 50 replications. The O'Connor (2000) procedure overcomes that limitation, yet it has difficulties dealing with random missing data. To address these technical issues of parallel analysis for ordinal variables, we adapted and modified the O'Connor procedure to provide an alternative that best approximates the ordinal data by factoring in the frequency distributions of the variables (e.g., the number of response categories and the frequency of each response category per variable). Our procedure has a slightly different theoretical rationale from O'Connor's as well as a practical advantage in dealing with missing data.
- Published
- 2007
35. The Standardized Letter of Recommendation: Implications for Selection. Research Report. ETS RR-07-38
- Author
-
Liu, Ou Lydia, Minsky, Jennifer, and Ling, Guangming
- Abstract
In an effort to standardize academic application procedures, the Standardized Letter of Recommendation (SLR) was developed to capture important cognitive and noncognitive qualities of graduate school candidates. The SLR consists of seven scales ("knowledge," "analytical skills," "communication skills," "motivation," "self-organization," "professionalism and maturity," and "teamwork") and was applied to an intern-selection scenario. Both professor ratings (N = 414) during the application process and mentor ratings of the selected students (N = 51) after the internship was completed were collected using the SLR. A multidimensional Rasch investigation suggests that the seven scales of the SLR displayed satisfactory psychometric properties in terms of reliability, model fit, item fit statistics, and discrimination. The two cognitive scales, "knowledge" and "analytical skills," were found to be the best predictors for intern selection. The professor ratings and mentor ratings had moderate to high correlations, with the professor ratings being systematically higher than the mentor ratings. Possible reasons for the rating discrepancies are discussed. Also, implications for how the SLR can be used and improved in other selection situations are suggested.
- Published
- 2007
36. Automated Scoring of Constructed‐Response Science Items: Prospects and Obstacles
- Author
-
Liu, Ou Lydia, Brew, Chris, Blackmore, John, Gerard, Libby, Madhok, Jacquie, and Linn, Marcia C
- Subjects
automated scoring, constructed-response items, c-rater, science assessment, Specialist Studies in Education, Education
- Abstract
Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics aiming to differentiate among multiple levels of understanding. The items showed moderate to good agreement with human scores. The findings suggest that automated scoring has the potential to score constructed-response items with complex scoring rubrics, but in its current design cannot replace human raters. This article discusses sources of disagreement and factors that could potentially improve the accuracy of concept-based automated scoring. © 2014 by the National Council on Measurement in Education.
- Published
- 2014
37. ETS 2025 Skills Taxonomy [in Chinese]
- Author
-
Liu, Ou Lydia, primary, Kell, Harrison, additional, Williams, Kevin, additional, Ling, Guangming, additional, and Sanders, Micah, additional
- Published
- 2023
- Full Text
- View/download PDF
38. ETS Skills Taxonomy
- Author
-
Liu, Ou Lydia, primary, Kell, Harrison, additional, Williams, Kevin, additional, Ling, Guangming, additional, and Sanders, Micah, additional
- Published
- 2023
- Full Text
- View/download PDF
39. Evaluating the Impact of Careless Responding on Aggregated-Scores: To Filter Unmotivated Examinees or Not?
- Author
-
Rios, Joseph A., Guo, Hongwen, Mao, Liyang, and Liu, Ou Lydia
- Abstract
When examinees' test-taking motivation is questionable, practitioners must determine whether careless responding is of practical concern and if so, decide on the best approach to filter such responses. As there has been insufficient research on these topics, the objectives of this study were to: a) evaluate the degree of underestimation in the true mean when careless responses are present, and b) compare the effectiveness of two filtering procedures in purifying biased aggregated-scores. Results demonstrated that: a) the true mean was underestimated by around 0.20 "SDs" if the total amount of careless responses exceeded 6.25%, 12.5%, and 12.5% for easy, moderately difficult, and difficult tests, respectively, and b) listwise deleting data from unmotivated examinees artificially inflated the true mean by as much as 0.42 "SDs" when ability was related to careless responding. Findings from this study have implications for when and how practitioners should handle careless responses for group-based low-stakes assessments.
- Published
- 2017
- Full Text
- View/download PDF
40. Investigating Student Learning Gains in College: A Longitudinal Study
- Author
-
Roohr, Katrina Crotts, Liu, Huili, and Liu, Ou Lydia
- Abstract
This study examines learning gains of college students' performance in critical thinking, reading, writing, and mathematics as assessed by the ETS Proficiency Profile (EPP). In this study, students' college learning gain was estimated by calculating the score differences between their first and last test administrations. Results revealed that (a) after being in college for one or two years, students did not demonstrate significant learning gains, (b) after three or more years, students made small to moderate gains on the EPP total score, and reading and mathematics subscales, (c) after four or five years, students made small to moderate learning gains on EPP total score and all four subscales, and (d) among various demographic and college-level variables, college experience was the largest significant predictor of students' learning gain, followed by first-year GPA. Implications of these results are discussed.
- Published
- 2017
- Full Text
- View/download PDF
41. Investigating the Impact of Automated Feedback on Students' Scientific Argumentation
- Author
-
Zhu, Mengxiao, Lee, Hee-Sun, Wang, Ting, Liu, Ou Lydia, Belur, Vinetha, and Pallant, Amy
- Abstract
This study investigates the role of automated scoring and feedback in supporting students' construction of written scientific arguments while learning about factors that affect climate change in the classroom. The automated scoring and feedback technology was integrated into an online module. Students' written scientific argumentation occurred when they responded to structured argumentation prompts. After submitting the open-ended responses, students received scores generated by a scoring engine and written feedback associated with the scores in real time. Using the log data that recorded argumentation scores as well as argument submission and revision activities, we answer three research questions: first, how students behaved after receiving the feedback; second, whether and how students' revisions improved their argumentation scores; and third, whether item difficulties shifted with the availability of the automated feedback. Results showed that the majority of students (77%) made revisions after receiving the feedback, and students with higher initial scores were more likely to revise their responses. Students who revised had significantly higher final scores than those who did not, and each revision was associated with an average increase of 0.55 on the final scores. Analysis of item difficulty shifts showed that written scientific argumentation became easier after students used the automated feedback.
- Published
- 2017
- Full Text
- View/download PDF
42. Assessing Students' Deep Conceptual Understanding in Physical Sciences: An Example on Sinking and Floating
- Author
-
Shen, Ji, Liu, Ou Lydia, and Chang, Hsin-Yi
- Abstract
This paper presents a transformative modeling framework that guides the development of assessment to measure students' deep understanding in physical sciences. The framework emphasizes 3 types of connections that students need to make when learning physical sciences: (1) linking physical states, processes, and explanatory models, (2) integrating multiple explanatory models, and (3) connecting scientific models to concrete experiences. We carried out a 2-phase exploratory study that helped further develop and refine the framework. In the first phase, we developed 3 items on sinking and floating and pilot tested them with 18 undergraduate students. Analysis of student responses revealed various student misconceptions and the different connections students made among science ideas. Based on the findings, we revised the assessment, modified the instruction, and collected data from another cohort of 26 students. The second cohort of students showed significant improvement of understanding of sinking and floating after instruction. Implications and limitations of how our assessment framework can be used to improve students' conceptual understanding in science are discussed.
- Published
- 2017
- Full Text
- View/download PDF
43. Online Proctored versus Unproctored Low-Stakes Internet Test Administration: Is There Differential Test-Taking Behavior and Performance?
- Author
-
Rios, Joseph A. and Liu, Ou Lydia
- Abstract
Online higher education institutions are presented with the concern of how to obtain valid results when administering student learning outcomes (SLO) assessments remotely. Traditionally, there has been a great reliance on unproctored Internet test administration (UIT) due to increased flexibility and reduced costs; however, a number of validity concerns have led some researchers to question its implementation and results. To mitigate the limitations of UIT, a relatively new approach, referred to as online proctoring, has been developed to mirror in-person proctoring remotely by capitalizing on technology to create verifiable and secure testing conditions. This study evaluated the comparability of online proctored and unproctored test administration in a low-stakes testing context on user-friendliness, examinee behavior, and mean scores. Results demonstrated improved user-friendliness (e.g., ease of logging in); however, no significant differences were observed in terms of keystroke information, rapid guessing, or aggregated scores between proctoring conditions. Overall, these results suggest that online institutions can implement UIT, which is a cost-effective approach to test administration, and obtain valid group-level inferences from SLO assessments.
- Published
- 2017
- Full Text
- View/download PDF
44. Ten Years after the Spellings Commission: From Accountability to Internal Improvement
- Author
-
Liu, Ou Lydia
- Abstract
Student learning outcomes assessment has been increasingly used in U.S. higher education institutions over the last 10 years, partly fueled by the recommendation from the Spellings Commission that institutions need to demonstrate more direct evidence of student learning. To respond to the Commission's call, various accountability initiatives have been launched, profoundly reshaping how assessment has been viewed, implemented, and used in higher education. This article reviews the conceptual and methodological challenges of the assessment agenda for one of the landmark accountability initiatives, the Voluntary System of Accountability, and also documents the notable shift from a strong focus on accountability to an increasing emphasis on internal improvement. This article then discusses the most recent developments in assessment approaches and tools, and proposes a four-element, one-enabler assessment cycle for institutions to maximally benefit from their assessment efforts.
- Published
- 2017
- Full Text
- View/download PDF
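45. Perspectives on 21st-Century Skills: The Next Direction for Closing the Skills Gap Between Employers and Higher Education Graduates [in Chinese]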
- Author
-
Orona, Gabe, primary, Liu, Ou Lydia, additional, and Arum, Richard, additional
- Published
- 2023
- Full Text
- View/download PDF
46. Validation of Automated Scoring of Science Assessments
- Author
-
Liu, Ou Lydia, Rios, Joseph A., and Heilman, Michael
- Abstract
Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment.
- Published
- 2016
- Full Text
- View/download PDF
47. Investigating College Learning Gain: Exploring a Propensity Score Weighting Approach
- Author
-
Liu, Ou Lydia, Liu, Huili, Roohr, Katrina Crotts, and McCaffrey, Daniel F.
- Abstract
Learning outcomes assessment has been widely used by higher education institutions both nationally and internationally. One of its popular uses is to document learning gains of students. Prior studies have recognized the potential imbalance between freshmen and seniors in terms of their background characteristics and their prior academic performance and have used linear regression adjustments for these differences, which some researchers have argued are not fully adequate. We explored an alternative adjustment via propensity score weighting to balance the samples on background variables including SAT score, gender, and ethnicity. Results involving a cross-sectional sample of freshmen and seniors from seven groups of majors within a large research university showed that students in most of the majors demonstrated significant learning gain. Additionally, there was a slight difference in learning gain rankings across major groupings when compared to multiple regression results.
- Published
- 2016
- Full Text
- View/download PDF
48. Assessing Critical Thinking in Higher Education: The HEIghten™ Approach and Preliminary Validity Evidence
- Author
-
Liu, Ou Lydia, Mao, Liyang, Frankel, Lois, and Xu, Jun
- Abstract
Critical thinking is a learning outcome highly valued by higher education institutions and the workforce. The Educational Testing Service (ETS) has designed a next generation assessment, the HEIghten™ critical thinking assessment, to measure students' critical thinking skills in analytical and synthetic dimensions. This paper introduces the theoretical framework that guided the assessment design, and also reports on the preliminary validity evidence of the pilot data from over 3,000 students from 35 two- and four-year institutions. The critical thinking scores demonstrated satisfactory total and subscale reliabilities, were reasonably correlated with SAT scores, high school grade point average (GPA), and college GPA, and were able to detect cross-sectional performance difference between freshmen and seniors. In addition, most examinees reported having tried their best when taking the test. Results show that test-taking motivation has a significant impact on performance. We encourage institutions to pay attention to motivational issues in implementing low-stakes learning outcomes assessment such as the HEIghten™ critical thinking assessment.
- Published
- 2016
- Full Text
- View/download PDF
49. Reconceptualizing a College Science Learning Experience in the New Digital Era: A Review of Literature
- Author
-
Shen, Ji, Jiang, Shiyan, Liu, Ou Lydia, Spector, J. Michael, Series editor, Bishop, M. J., Series editor, Ifenthaler, Dirk, Series editor, and Ge, Xun, editor
- Published
- 2015
- Full Text
- View/download PDF
50. Author Correction: Skill levels and gains in university STEM education in China, India, Russia and the United States
- Author
-
Loyalka, Prashant, Liu, Ou Lydia, Li, Guirong, Kardanova, Elena, Chirikov, Igor, Hu, Shangfeng, Yu, Ningning, Ma, Liping, Guo, Fei, Beteille, Tara, Tognatta, Namrata, Gu, Lin, Ling, Guangming, Federiakin, Denis, Wang, Huan, Khanna, Saurabh, Bhuradia, Ashutosh, Shi, Zhaolei, and Li, Yanyan
- Published
- 2021
- Full Text
- View/download PDF