93 results for "Kadriye Ercikan"
Search Results
2. Digital divide: A critical context for digitally based assessments
- Author
-
Kadriye Ercikan, Mustafa Asil, and Raman Grover
- Subjects
digital divide, digital assessments, ICILS, Education - Abstract
Student learning increasingly takes place in digital environments both within and outside schooling contexts. Educational assessments are following suit, both to take advantage of the conveniences and opportunities that digital environments provide and to reflect the media through which learning increasingly occurs in societies around the world. A social context relevant to learning and assessment in the digital age is the large difference in access to and competence with technology among students from different segments of society. Access to and competency with technology therefore become critical contexts for evaluations that rely on digitally based assessments. This chapter examines the digital divide between students from different segments of society and discusses strategies for minimizing the effects of the digital divide on assessments of student learning. The research focuses on two types of demographic groups—gender and socioeconomic status (SES) groups—that have been highlighted in research on the digital divide. It utilizes data from IEA’s International Computer and Information Literacy Study (ICILS) 2013, administered to Grade 8 students in 21 jurisdictions around the world. It thus provides an international perspective on the digital divide as an important context both for international assessments and for assessments within jurisdictions, such as Mexico, that are conducting assessments in digitally based environments.
- Published
- 2018
- Full Text
- View/download PDF
3. Student Perceptions About Their General Learning Outcomes
- Author
-
Stephanie Barclay McKeown and Kadriye Ercikan
- Subjects
Education - Abstract
Aggregate survey responses collected from students are commonly used by universities to compare effective educational practices across program majors and to make high-stakes decisions about the effectiveness of programs. Yet if there is too much heterogeneity among student responses within programs, the program-level averages may not appropriately represent student-level outcomes, and any decisions based on these averages may be erroneous. Findings revealed that survey items regarding students’ perceived general learning outcomes could be appropriately aggregated to the program level for 4th-year students in the study but not for 1st-year students. Survey items concerning the learning environment were not valid for either group when aggregated to the program level. This study demonstrates the importance of considering the multilevel nature of survey results and determining the multilevel validity of program-level interpretations before drawing any conclusions based on aggregate student responses. Implications for institutional effectiveness research are discussed.
- Published
- 2017
- Full Text
- View/download PDF
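A note on method: the program-level aggregation question in the abstract above is commonly checked with intraclass correlations before group means are interpreted. The following is a minimal Python sketch under assumed data (a pandas DataFrame df with hypothetical columns "program" and "item_score"); ICC is one standard criterion, not necessarily the authors' exact procedure.

import pandas as pd

def aggregation_indices(df, group_col, item_col):
    # One-way ANOVA mean squares, then ICC(1) and ICC(2).
    g = df.groupby(group_col)[item_col]
    sizes = g.size()
    k = sizes.mean()                                  # average group size
    grand = df[item_col].mean()
    ss_between = (sizes * (g.mean() - grand) ** 2).sum()
    ss_within = ((df[item_col] - g.transform("mean")) ** 2).sum()
    ms_between = ss_between / (g.ngroups - 1)
    ms_within = ss_within / (len(df) - g.ngroups)
    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between      # reliability of group means
    return icc1, icc2

# usage: icc1, icc2 = aggregation_indices(df, "program", "item_score")

A common rule of thumb is to trust program-level means only when ICC(2), the reliability of the group means, is roughly .70 or higher.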
4. Canadians and Their Pasts
- Author
-
Margaret Conrad, Kadriye Ercikan, Gerald Friesen, Jocelyn Létourneau, D.A. Muise, David Northrup, Peter Seixas
- Published
- 2019
5. Optimizing Implementation of Artificial‐Intelligence‐Based Automated Scoring: An Evidence Centered Design Approach for Designing Assessments for AI‐based Scoring
- Author
-
Kadriye Ercikan and Daniel F. McCaffrey
- Subjects
Developmental and Educational Psychology, Psychology (miscellaneous), Applied Psychology, Education - Published
- 2022
6. Comparing Test-Taking Behaviors of English Language Learners (ELLs) to Non-ELL Students: Use of Response Time in Measurement Comparability Research
- Author
-
Kadriye Ercikan and Hongwen Guo
- Subjects
Social Psychology, Comparability, Mathematics education, Response time, ELL, English language, Statistics, Probability and Uncertainty, Psychology, Applied Psychology, Education - Published
- 2021
7. Differential rapid responding across language and cultural groups
- Author
-
Kadriye Ercikan and Hongwen Guo
- Subjects
Disengagement theory, Rapid response, Psychology, Education, Cognitive psychology - Abstract
Rapid response behaviour, a type of test disengagement, cannot be interpreted as a true indicator of the targeted constructs and may compromise score accuracy as well as score validity for interpre...
- Published
- 2020
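Rapid-response (disengagement) flagging of the kind studied above is typically operationalized with response-time thresholds. A minimal sketch, assuming the common 10%-of-median-time convention rather than the authors' specific operationalization:

import numpy as np

def flag_rapid_responses(rt, frac=0.10, floor=1.0):
    # rt: response times in seconds, shape (n_examinees, n_items).
    # A response is 'rapid' when faster than frac of the item's
    # median response time, with an absolute floor in seconds.
    rt = np.asarray(rt, dtype=float)
    thresholds = np.maximum(frac * np.nanmedian(rt, axis=0), floor)
    return rt < thresholds

# Per-examinee share of rapid responses on a toy matrix:
rt = np.array([[12.0, 30.0, 25.0],
               [ 0.8,  1.5, 24.0],
               [11.0, 28.0, 26.0]])
print(flag_rapid_responses(rt).mean(axis=1))   # second examinee looks disengaged

Flag rates can then be compared across language and cultural groups, as in the study above.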
8. Use of Response Process Data to Inform Group Comparisons and Fairness Research
- Author
-
Kadriye Ercikan, Hongwen Guo, and Qiwei He
- Subjects
Response process, Applied psychology, Comparability, Group comparison, Test bias, Education, Achievement test, Psychology - Abstract
Comparing groups is one of the key uses of large-scale assessment results, which are used to gain insights to inform policy and practice and to examine the comparability of scores and score meaning....
- Published
- 2020
9. Implementing ILSAs
- Author
-
Juliette Lyons-Thomas, Kadriye Ercikan, Eugene Gonzalez, and Irwin Kirsch
- Published
- 2022
10. COVID-19 and U.S. Schools: Using Data to Understand and Mitigate Inequities in Instruction and Learning
- Author
-
Kadriye Ercikan and Laura S. Hamilton
- Subjects
Coronavirus disease 2019 (COVID-19), Pandemic, Social emotional learning, Sociology - Abstract
Shortly after the COVID-19 pandemic arrived in the United States, schools across the country had to enact significant, rapid changes to their instructional models, and schools varied widely in their access to the resources needed to support these efforts. Researchers across the U.S. quickly launched surveys, website reviews, and other data-collection methods to document these shifts. In this chapter, we draw on this research to describe the U.S. K-12 educational context, the policies states adopted, the practices and resources schools offered, and the potential effects on students’ academic, social, and emotional learning. In these discussions we draw particular attention to inequities in educational opportunities across schools serving different student populations. We then discuss how different sources of data will be needed to help identify educational needs and mitigate disparities in instruction and learning post-pandemic.
- Published
- 2021
11. Measurement Comparability of Reading in the English and French Canadian Populations: Special Case of the 2011 Progress in International Reading Literacy Study
- Author
-
Kadriye Ercikan and Shawna Goodrich
- Subjects
score comparability, test equivalence, Comparability, international assessment, English language, item equivalence, Differential item functioning, Education, Reading literacy, Mathematics education, French Canadian, reading literacy achievement, Psychology - Abstract
The purpose of this study is to examine item equivalence and score comparability of the Progress in International Reading Literacy Study (PIRLS) 2011 for the Canadian French and English language groups. Two methods of differential item functioning were used to examine item equivalence across 13 test booklets designed to assess reading literacy in the early years of schooling. Four bilingual reviewers with expertise in reading literacy conducted independent linguistic and cultural reviews to identify both the degree of item equivalence and potential sources of differences between language versions of released items. Results indicate that an average of 25% of items per booklet function differentially at the item level. Reviews by experts indicate differences between the two language versions on some items flagged as displaying differential item functioning (DIF). Some of these were identified as having linguistic differences pointing to differential difficulty levels in the two language versions.
- Published
- 2019
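The abstract above does not name the two DIF methods used; the Mantel-Haenszel procedure is one standard choice and shows the general logic of matching examinees on ability before comparing groups. A sketch (numpy; the data layout is assumed):

import numpy as np

def mantel_haenszel(item, group, match):
    # item: 0/1 responses to one item; group: 0 = reference, 1 = focal;
    # match: stratifying ability score (e.g., rest score).
    item, group, match = map(np.asarray, (item, group, match))
    num = den = 0.0
    for s in np.unique(match):
        m = match == s
        a = np.sum((group[m] == 0) & (item[m] == 1))   # reference correct
        b = np.sum((group[m] == 0) & (item[m] == 0))
        c = np.sum((group[m] == 1) & (item[m] == 1))   # focal correct
        d = np.sum((group[m] == 1) & (item[m] == 0))
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    alpha = num / den                    # common odds ratio
    return alpha, -2.35 * np.log(alpha)  # ETS delta scale

In the ETS classification, items with an absolute delta of about 1.5 or more (and statistically significant) are treated as large, C-level DIF.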
12. For Which Boys and Which Girls Are Reading Assessment Items Biased Against? Detection of Differential Item Functioning in Heterogeneous Gender Populations
- Author
-
Kadriye Ercikan and Raman Grover
- Subjects
Multivariate analysis, Gender differential, Differential item functioning, Education, Student assessment, Developmental psychology, Reading assessment, Developmental and Educational Psychology, Achievement test, Psychology, Socioeconomic status - Abstract
In gender differential item functioning (DIF) research, it is assumed that all members of a gender group have similar item response patterns and that generalizations from the group level to subgroup and individual levels can therefore be made accurately. However, DIF items do not necessarily disadvantage every member of a gender group to the same degree, indicating heterogeneity of response patterns within gender groups. In this article, the impact of heterogeneity within gender groups on DIF investigations was examined. Specifically, we examined whether DIF results varied when comparing males versus females, gender × socioeconomic status subgroups, and latent classes of gender. DIF analyses were conducted on reading achievement data from the Canadian sample of the Programme for International Student Assessment 2009. Results indicated considerable heterogeneity within males and females, and DIF results were found to vary when heterogeneity was taken into account versus when it was not.
- Published
- 2017
13. In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills
- Author
-
Kadriye Ercikan and Maria Elena Oliveri
- Subjects
21st century skills, Information literacy, Construct validity, Cognition, Test validity, Education, Developmental and Educational Psychology, Psychology, Cognitive psychology - Abstract
Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the design, scaling, and interpretation aspects; (b) cognitive validity evidence that goes beyond traditional psychometric analyses of response patterns, and (c) cross-cultural validity. We analyze the four articles in this special issue with respect to these three considerations and discuss the kinds of evidence needed to support interpretation of scores from 21st century constructs.
- Published
- 2016
14. The assessment of mathematical literacy of linguistic minority students: Results of a multi-method investigation
- Author
-
Romeo Fola, Kadriye Ercikan, Wolff-Michael Roth, and Marielle Simon
- Subjects
Mathematical literacy, Applied Mathematics, Numerical cognition, Multiple methods, Differential item functioning, Linguistics, Education, Language assessment, Home language, Mathematics education, Psychology, Applied Psychology - Abstract
Assessing the mathematical literacy of students who have limited proficiency in the language of the test is a critical challenge in mathematics education. Previous research indicates that the knowledge and competencies of such students are underestimated. This presents a major validity and fairness problem for assessment. Most efforts addressing fairness and validity issues in the assessment of linguistic minority students focus on the test language only. To overcome the limitations of single approaches, this study examines the interaction between the test language and students' language backgrounds by means of multiple methods. We investigate possible linguistic bias of items flagged as functioning differentially (the result of DIF analyses) by means of (a) two levels of expert analyses and (b) student think-aloud protocols, in order to examine language effects in published mathematics items from the 2000 and 2003 Programme for International Student Assessment (PISA) administrations for students attending French schools in Canada and speaking either French or other languages at home. DIF analyses were conducted to identify items on which students from different home language backgrounds attending French schools achieve differently. The expert panels tended to identify surface characteristics of language that may be responsible for group differences but not for the differential effects detected by differential item functioning (DIF). Student think-aloud protocols in part confirm and in part contradict the DIF results, providing insights into the sources of the differences. Suggestions are provided for further study.
- Published
- 2015
15. Analyzing Fairness Among Linguistic Minority Populations Using a Latent Class Differential Item Functioning Approach
- Author
-
Kadriye Ercikan, Juliette Lyons-Thomas, Maria Elena Oliveri, and Steven Holtzman
- Subjects
Multivariate analysis, Linguistic group, Ethnic group, Differential item functioning, Test bias, Linguistics, Education, Statistics, Developmental and Educational Psychology, Psychology - Abstract
Differential item functioning (DIF) analyses have been used as the primary method in large-scale assessments to examine fairness for subgroups. Currently, DIF analyses are conducted using manifest methods that rely on observed characteristics (gender and race/ethnicity) for grouping examinees. Homogeneity of item responses is assumed, denoting that all examinees respond to test items using a similar approach. This assumption may not hold for all groups. In this study, we demonstrate the first application of the latent class (LC) approach to investigate DIF and its sources with heterogeneous (linguistic minority) groups. We found at least three LCs within each linguistic group, suggesting the need to empirically evaluate this assumption in DIF analysis. We obtained larger proportions of DIF items with larger effect sizes when LCs within language groups versus the overall (majority/minority) language groups were examined. The illustrated approach could be used to improve the ways in which DIF analyses ...
- Published
- 2015
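The latent class step described above can be approximated with a Bernoulli-mixture EM over the 0/1 item-response matrix; examinees are assigned to their most likely class, and DIF is then examined within each class. A rough sketch (numpy; three classes echo the finding above, all other details are assumptions, and operational latent class DIF models are more elaborate):

import numpy as np

def latent_classes(X, n_classes=3, n_iter=200, seed=0):
    # X: 0/1 responses, shape (n_examinees, n_items).
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    pi = np.full(n_classes, 1.0 / n_classes)                 # class proportions
    p = rng.uniform(0.3, 0.7, size=(n_classes, X.shape[1]))  # item probs per class
    for _ in range(n_iter):
        # E-step: posterior class membership, computed in log space
        log_post = X @ np.log(p).T + (1 - X) @ np.log(1 - p).T + np.log(pi)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update proportions and item probabilities
        nk = post.sum(axis=0)
        pi = nk / n
        p = np.clip(post.T @ X / nk[:, None], 1e-4, 1 - 1e-4)
    return post.argmax(axis=1)

A manifest DIF procedure (for example, the Mantel-Haenszel sketch earlier in this list) can then be re-run within each recovered class.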
16. This Issue
- Author
-
Kadriye Ercikan
- Subjects
Education - Published
- 2015
17. A Framework for Developing Comparable Multilingual Assessments for Minority Populations: Why Context Matters
- Author
-
Kadriye Ercikan, Maria Elena Oliveri, and Marielle Simon
- Subjects
Social Psychology, Context effect, Modeling and Simulation, Pedagogy, Comparability, Multiple language versions, Applied psychology, Psychology, Education - Abstract
The assessment of linguistic minorities often involves using multiple language versions of assessments. In these assessments, comparability of scores across language groups is central to valid comparative interpretations. Various frameworks and guidelines describe factors that need to be considered when developing comparable assessments. These frameworks provide limited information in relation to the development of multiple language versions of assessments for assessing linguistic minorities within countries. To this end, we make various suggestions for the types of factors that should be considered when assessing linguistic minorities. Our recommendations are tailored to the particular constraints potentially faced by various jurisdictions tasked with developing multiple language versions of assessments for linguistic minorities. These challenges include having limited financial and staffing resources to develop comparable assessments and having insufficient sample sizes to perform psychometric analyses ...
- Published
- 2015
18. Moving beyond Country Rankings in International Assessments: The Case of PISA
- Author
-
Nancy E. Perry and Kadriye Ercikan
- Subjects
Equity, Standardized test, Academic achievement, Education, Educational research, International education, Political science, Mathematics education, Achievement test, Comparative education - Abstract
The Programme for International Student Assessment (PISA) was designed by the Organisation for Economic Cooperation and Development (OECD) to evaluate the quality, equity, and efficiency of school systems around the world. Specifically, PISA has assessed 15-year-old students’ reading, mathematics, and science literacy on a 3-year cycle since 2000. PISA also collects information about how those outcomes are related to key demographic, social, economic, and educational variables. However, the preponderance of reports involving PISA data focuses on achievement variables and cross-national comparisons of achievement variables. Challenges in evaluating the achievement of students from different cultural and educational settings, and data concerning students’ approaches to learning, motivation for learning, and opportunities for learning, are rarely reported. A main goal of this themed issue of Teachers College Record (TCR) is to move the conversation about PISA data beyond achievement to include factors that affect achievement (e.g., SES, home environment, strategy use). We also asked authors to consider how international assessment data can be used to improve learning and education and what appropriate versus inappropriate inferences can be made from the data. In this introduction, we synthesize the six articles in this issue and the themes that cut across them. We also examine challenges associated with using data from international assessments, like PISA, to inform education policy and practice within and across countries. We conclude with recommendations for collecting and using data from international assessments to inform research, policy, and teaching and learning.
- Published
- 2015
19. Reading Proficiency and Comparability of Mathematics and Science Scores for Students From English and Non-English Backgrounds: An International Perspective
- Author
-
Juliette Lyons-Thomas, Wolff-Michael Roth, Kadriye Ercikan, Michelle Y. Chen, Debra Sandilands, Marielle Simon, and Shawna Goodrich
- Subjects
Social Psychology, International comparisons, Comparability, English language, Mathematics assessment, Education, Modeling and Simulation, Reading, Mathematics education - Abstract
The purpose of this research is to examine the comparability of mathematics and science scores for students from English language backgrounds (ELB) and non-English language backgrounds (NELB). We examine the relationship between English reading proficiency and performance on mathematics and science assessments in Australia, Canada, the United Kingdom, and the United States. The findings indicate a strong relationship, with reading proficiency accounting for up to 43% of the variance in mathematics and up to 79% in science. In all comparisons, ELB students either outperformed NELB students or performed at the same level. However, when statistical adjustments were made for reading proficiency, in both mathematics and science, the score gap between the groups became statistically non-significant in three out of the four countries. These findings point to differences in score meaning in mathematics and science assessments and to limitations in comparing the performances of ELB and NELB students.
- Published
- 2014
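The statistical adjustment described above is essentially a covariate-adjusted group comparison. A synthetic illustration (statsmodels; all variable names and effect sizes are invented):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
nelb = rng.integers(0, 2, size=n)                      # 1 = non-English background
reading = rng.normal(size=n) - 0.6 * nelb              # NELB read less well on average
math = 60 + 8 * reading + rng.normal(scale=5, size=n)  # no direct NELB effect built in
df = pd.DataFrame({"math": math, "reading": reading, "nelb": nelb})

raw = smf.ols("math ~ nelb", data=df).fit()
adj = smf.ols("math ~ nelb + reading", data=df).fit()
print(raw.params["nelb"], adj.params["nelb"])  # gap shrinks once reading is controlled

This mirrors the pattern the paper reports: a raw score gap that becomes non-significant after reading proficiency enters the model.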
20. Validation of Score Meaning for the Next Generation of Assessments: The Use of Response Processes
- Author
-
Kadriye Ercikan and James W. Pellegrino
- Subjects
- Examinations--Interpretation, Educational tests and measurements--Standards--United States, Psychological tests--Standards--United States
- Abstract
Despite developments in research and practice on using examinee response process data in assessment design, the use of such data in test validation is rare. Validation of Score Meaning in the Next Generation of Assessments Using Response Processes highlights the importance of validity evidence based on response processes and provides guidance to measurement researchers and practitioners in creating and using such evidence as a regular part of the assessment validation process. Response processes refer to approaches and behaviors of examinees when they interpret assessment situations and formulate and generate solutions as revealed through verbalizations, eye movements, response times, or computer clicks. Such response process data can provide information about the extent to which items and tasks engage examinees in the intended ways. With contributions from the top researchers in the field of assessment, this volume includes chapters that focus on methodological issues and on applications across multiple contexts of assessment interpretation and use. In Part I of this book, contributors discuss the framing of validity as an evidence-based argument for the interpretation of the meaning of test scores, the specifics of different methods of response process data collection and analysis, and the use of response process data relative to issues of validation as highlighted in the joint standards on testing. In Part II, chapter authors offer examples that illustrate the use of response process data in assessment validation. These cases are provided specifically to address issues related to the analysis and interpretation of performance on assessments of complex cognition, assessments designed to inform classroom learning and instruction, and assessments intended for students with varying cultural and linguistic backgrounds. The Open Access version of this book, available at http://www.taylorfrancis.com, has been made available under a Creative Commons Attribution-Non Commercial-No Derivatives 4.0 license.
- Published
- 2017
21. Validation of Score Meaning Using Examinee Response Processes for the Next Generation of Assessments
- Author
-
Kadriye Ercikan and James W. Pellegrino
- Subjects
Psychology, Cognitive psychology - Published
- 2017
22. Validation of Score Meaning for the Next Generation of Assessments
- Author
-
James W. Pellegrino and Kadriye Ercikan
- Subjects
Psychology, Linguistics - Published
- 2017
23. Effects of Population Heterogeneity on Accuracy of DIF Detection
- Author
-
Kadriye Ercikan, Bruno D. Zumbo, and Maria Elena Oliveri
- Subjects
ELL, Regression analysis, English language, Differential item functioning, Test bias, Education, Item response theory, Statistics, Developmental and Educational Psychology, Population heterogeneity, Detection rate, Psychology - Abstract
Heterogeneity within English language learner (ELL) groups has been documented. Previous research on differential item functioning (DIF) analyses suggests that accurate DIF detection rates are reduced greatly when groups are heterogeneous. In this simulation study, we investigated the effects of heterogeneity within linguistic (ELL) groups on the accuracy of DIF detection. Heterogeneity within such groups may occur for a myriad of reasons, including differential lengths of time residing in English-speaking countries, degrees of exposure to English-speaking environments, and amounts of English instruction. Our findings revealed that at high levels of within-group heterogeneity, DIF detection is at the level of chance, implying that a large proportion of DIF items might remain undetected when assessing heterogeneous populations, potentially leading to biased tests. Based on our findings, we urge test development organizations to consider heterogeneity within ELL and other heterogeneous focus grou...
- Published
- 2014
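The simulation logic above can be miniaturized: generate a focal group in which only a fraction of members actually experience DIF, and watch the detection rate of a logistic-regression DIF test fall as that fraction shrinks. A toy version (numpy/statsmodels; using true ability as the matching variable is a simplification of real DIF practice):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def flagged(mix, n=1000, dif=0.8):
    # Only a proportion `mix` of the focal group experiences DIF.
    theta = rng.normal(size=2 * n)
    group = np.repeat([0, 1], n)
    hit = group * (rng.random(2 * n) < mix)        # affected focal members
    p = 1 / (1 + np.exp(-(theta - dif * hit)))     # Rasch-type item
    y = (rng.random(2 * n) < p).astype(int)
    X = sm.add_constant(np.column_stack([theta, group]))
    fit = sm.Logit(y, X).fit(disp=0)
    return fit.pvalues[2] < 0.05                   # is the group effect flagged?

for mix in (1.0, 0.5, 0.2):
    rate = np.mean([flagged(mix) for _ in range(100)])
    print(mix, rate)   # detection drops as the focal group grows more heterogeneous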
24. Inconsistencies in DIF Detection for Sub-Groups in Heterogeneous Language Groups
- Author
-
Kadriye Ercikan, Wolff-Michael Roth, Marielle Simon, Juliette Lyons-Thomas, and Debra Sandilands
- Subjects
Comparability, ELL, French, Differential item functioning, Education, Developmental and Educational Psychology, Psychology, Social psychology, Diversity - Abstract
Diversity and heterogeneity among language groups have been well documented. Yet most fairness research that focuses on measurement comparability treats linguistic minority students, such as English language learners (ELLs) or Francophone students living in minority contexts in Canada, as a single group. Our focus in this research is to examine the degree to which measurement comparability, as indicated by differential item functioning (DIF), is consistent for sub-groups among linguistic minority Francophone students in Canada. The findings suggest that linguistic minority Francophone students who speak French at home and those who do not should not be grouped together for investigating measurement comparability or for examining performance gaps. We identified large differences in DIF identification, with only 7–10% consistency between the separate analyses for the two groups. The findings highlight methodological problems with investigating fairness f...
- Published
- 2014
25. Uncovering Substantive Patterns in Student Responses in International Large-Scale Assessments—Comparing a Latent Class to a Manifest DIF Approach
- Author
-
René Lawless, Kadriye Ercikan, Bruno D. Zumbo, and Maria Elena Oliveri
- Subjects
Social Psychology, Cognitive complexity, Differential item functioning, Latent class model, Education, Developmental psychology, Reading comprehension, Modeling and Simulation, Item response theory, Generalizability theory, Psychology - Abstract
In this study, we contrast results from two differential item functioning (DIF) approaches (manifest and latent class) by the number of items and sources of items identified as DIF using data from an international reading assessment. The latter approach yielded three latent classes, presenting evidence of heterogeneity in examinee response patterns. It also yielded more DIF items with larger effect sizes and more consistent item response patterns by substantive aspects (e.g., reading comprehension processes and cognitive complexity of items). Based on our findings, we suggest empirically evaluating the homogeneity assumption in international assessments because international populations cannot be assumed to have homogeneous item response patterns. Otherwise, differences in response patterns within these populations may be under-detected when conducting manifest DIF analyses. Detecting differences in item responses across international examinee populations has implications on the generalizability and meani...
- Published
- 2014
26. An Investigation of School-Level Factors Associated With Science Performance for Minority and Majority Francophone Students in Canada
- Author
-
Kadriye Ercikan, Debra Sandilands, Stephanie Barclay McKeown, and Juliette Lyons-Thomas
- Subjects
First language, Multilevel model, French, Science education, Education, Developmental psychology, Mathematics education, Achievement test, School level, Comparative education, Psychology, Socioeconomic status - Abstract
Minority Francophone students in predominantly English-speaking Canadian provinces tend to perform lower on large-scale assessments of achievement than their Anglophone peers and majority Francophone students in Quebec. This study is the first to apply multilevel modeling methods to examine the extent to which school-level factors may be differentially associated with achievement for minority Francophone students compared to majority Francophone students. Forty-four percent of the variance in science achievement among Francophone schools was explained by the model. After taking into consideration gender as well as student- and school-level socioeconomic status, French-language schools in Ontario and New Brunswick had, on average, lower PISA 2006 science achievement scores compared to schools in Quebec. Findings indicate that differences in performance between linguistic minority and majority Francophone students could not be reduced to differences in socioeconomic status or to the school resource ...
- Published
- 2014
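The analysis described above corresponds to a two-level random-intercept model. A sketch with synthetic data (statsmodels MixedLM; every variable name and coefficient is invented):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_schools, per = 40, 25
school = np.repeat(np.arange(n_schools), per)
school_ses = rng.normal(size=n_schools)[school]
u = rng.normal(scale=20, size=n_schools)[school]       # school random intercept
student_ses = rng.normal(size=school.size)
female = rng.integers(0, 2, size=school.size)
science = (500 + 15 * school_ses + 10 * student_ses
           + 3 * female + u + rng.normal(scale=40, size=school.size))
df = pd.DataFrame(dict(science=science, female=female, student_ses=student_ses,
                       school_ses=school_ses, school=school))

fit = smf.mixedlm("science ~ female + student_ses + school_ses",
                  data=df, groups=df["school"]).fit()
print(fit.summary())   # fixed effects plus school-level intercept variance

A figure like the 44% reported above would come from comparing the school-intercept variance of a null model with that of the full model, a common pseudo-R-squared for level-2 variance explained.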
27. Limits of Generalizing in Education Research: Why Criteria for Research Generalization Should Include Population Heterogeneity and Uses of Knowledge Claims
- Author
-
Wolff-Michael Roth and Kadriye Ercikan
- Subjects
Generalization, Population, Probabilistic logic, Sample (statistics), Causality, Education, Educational research, Cognitive psychology - Abstract
Context: Generalization is a critical concept in all research designed to generate knowledge that applies to all elements of a unit (population) while studying only a subset of these elements (sample). Commonly applied criteria for generalizing focus on experimental design or representativeness of samples of the population of units. The criteria tend to neglect population diversity and targeted uses of knowledge generated from the generalization. Objectives: This article has two connected purposes: (a) to articulate the structure and discuss limitations of different forms of generalizations across the spectrum of quantitative and qualitative research and (b) to argue for considering population heterogeneity and future uses of knowledge claims when judging the appropriateness of generalizations. Research Design: In the first part of the paper, we present two forms of generalization that rely on statistical analysis of between-group variation: analytic and probabilistic generalization. We then describe a third form of generalization: essentialist generalization. Essentialist generalization moves from the particular to the general in small sample studies. We discuss limitations of each kind of generalization. In the second part of the paper, we propose two additional criteria for evaluating the validity of evidence based on generalizations from education research: population heterogeneity and future use of knowledge claims. Conclusions/Recommendations: The proposed criticisms of research generalizations have implications for how research is conducted and how research findings are summarized. The main limitation of analytic generalization is that it does not provide evidence of a causal link for subgroups or individuals. In addition to making explicit the uses that the knowledge claims may be targeting, there is a need for some changes in how research is conducted. This includes a need for demonstrating the mechanisms of causality; descriptions of intervention outcomes as positive, negative, or neutral; and latent class analysis accompanied by discriminant analysis. The main criticism of probabilistic generalization is that it may not apply to subgroups and may have limited value for guiding policy and practice. This highlights a need for defining grouping variables by intended uses of knowledge claims. With respect to essentialist generalization, there are currently too few qualitative studies attempting to identify invariants that hold across the range of relevant situations. There is a need to study the ways in which a kind of phenomenon is produced, which would allow researchers to understand the various ways in which a phenomenon manifests itself.
- Published
- 2014
28. Analysis of Sources of Latent Class Differential Item Functioning in International Assessments
- Author
-
Kadriye Ercikan, Bruno D. Zumbo, and Maria Elena Oliveri
- Subjects
Multivariate analysis, Social Psychology, Comparability, Regression analysis, Linear discriminant analysis, Differential item functioning, Education, Discriminant function analysis, Modeling and Simulation, Statistics, Item response theory, Econometrics, Psychology, Multinomial logistic regression - Abstract
In this study, we investigated differential item functioning (DIF) and its sources using a latent class (LC) modeling approach. Potential sources of LC DIF related to instruction and teacher-related variables were investigated using substantive and three statistical approaches: descriptive discriminant function, multinomial logistic regression, and multilevel multinomial logistic regression analyses. Results revealed that differential response patterns, as indicated by identification of LCs, were most strongly associated with student achievement levels and teacher-related variables rather than manifest characteristics such as gender, test language, and country, which are the focus of typical measurement comparability research. Findings from this study have important implications for measurement comparability and validity research. Evidence of within-group heterogeneity in the test data structure suggests that the identification of DIF and its sources may not apply to all examinees in the group and that me...
- Published
- 2013
29. Investigating Sources of Differential Item Functioning in International Large-Scale Assessments Using a Confirmatory Approach
- Author
-
Kadriye Ercikan, Debra Sandilands, Bruno D. Zumbo, and Maria Elena Oliveri
- Subjects
Social Psychology, International studies, Applied psychology, Cognition, Test validity, Differential item functioning, Education, Developmental psychology, Modeling and Simulation, Cultural diversity, Cross-cultural, Pairwise comparison, Psychology - Abstract
International large-scale assessments of achievement often have a large degree of differential item functioning (DIF) between countries, which can threaten score equivalence and reduce the validity of inferences based on comparisons of group performances. It is important to understand potential sources of DIF to improve the validity of future assessments; however, previous attempts to identify sources of DIF have had variable results. This study had two purposes. The first was to apply a confirmatory approach (Poly-SIBTEST) to investigate sources of DIF typically found in international large-scale assessments: adaptation effects and cognitive loadings of items. We conducted three pairwise DIF analyses on Spanish and English versions of the Progress in International Reading Literacy Study 2001 Reader booklet. Results confirmed that item cognitive loadings were a source of differential functioning favoring both England and the United States when compared against Colombia; however, adaptation effects did not...
- Published
- 2013
30. Investigating Linguistic Sources of Differential Item Functioning Using Expert Think-Aloud Protocols in Science Achievement Tests
- Author
-
Wolff-Michael Roth, Juliette Lyons-Thomas, Kadriye Ercikan, Debra Sandilands, and Maria Elena Oliveri
- Subjects
Comparability, Applied psychology, Achievement test, Protocol analysis, Think aloud protocol, Psychology, Science education, Differential item functioning, Curriculum, Social psychology, Education - Abstract
Even if national and international assessments are designed to be comparable, subsequent psychometric analyses often reveal differential item functioning (DIF). Central to achieving comparability is to examine the presence of DIF, and if DIF is found, to investigate its sources to ensure differentially functioning items that do not lead to bias. In this study, sources of DIF were examined using think-aloud protocols. The think-aloud protocols of expert reviewers were conducted for comparing the English and French versions of 40 items previously identified as DIF (N = 20) and non-DIF (N = 20). Three highly trained and experienced experts in verifying and accepting/rejecting multi-lingual versions of curriculum and testing materials for government purposes participated in this study. Although there is a considerable amount of agreement in the identification of differentially functioning items, experts do not consistently identify and distinguish DIF and non-DIF items. Our analyses of the think-aloud protoco...
- Published
- 2013
31. How Is Testing Supposed to Improve Schooling If We Do Not Evaluate to What Extent It Improves Schooling?
- Author
-
Kadriye Ercikan
- Subjects
Statistics and Probability, Educational testing, Applied Mathematics, Mathematics education, Test validity, Psychology, Test use, Education - Published
- 2013
32. New Directions in Assessing Historical Thinking
- Author
-
Kadriye Ercikan and Peter Seixas
- Subjects
- D16.25
- Abstract
New technologies have radically transformed our relationship to information in general and to little bits of information in particular. The assessment of history learning, which for a century has valued those little bits as the centerpiece of its practice, now faces not only an unprecedented glut but a disconnect with what is valued in history education. More complex processes—historical thinking, historical consciousness or historical sense making—demand more complex assessments. At the same time, advances in scholarship on assessment open up new possibilities. For this volume, Kadriye Ercikan and Peter Seixas have assembled an international array of experts who have, collectively, moved the fields of history education and assessment forward. Their various approaches negotiate the sometimes-conflicting demands of theoretical sophistication, empirically demonstrated validity and practical efficiency. Key issues include articulating the cognitive goals of history education, the relationship between content and procedural knowledge, the impact of students' language literacy on history assessments, and methods of validation in both large-scale and classroom assessments. New Directions in Assessing Historical Thinking is a critical, research-oriented resource that will advance the conceptualization, design and validation of the next generation of history assessments.
- Published
- 2015
33. Assessment Design for Accuracy of Scores, Meaningfulness of Interpretations, and Fairness of Decision-Making in High-Stakes Educational Testing
- Author
-
Kadriye Ercikan and Avi Allalouf
- Subjects
Education, Computers and education - Abstract
Testing is used in decision-making in a variety of contexts worldwide. These include selection and screening of individuals for education programs (including higher education admissions), certification and licensing of professionals, and policymaking in schools and higher education. The consequences of testing (for example, being denied admission to education programs or authorization to practice specific professions) affect individuals, organizations, and jurisdictions. In every testing context, questions arise about the degree to which a test provides accurate and meaningful information for fair decision-making. The use of assessment has also been the target of specific criticism, including claims that the high-stakes nature of tests leads to "teaching to the test," thereby reducing schools' and districts' control over K-12 education curricula, and that higher education admission tests lack a focus on subject matter knowledge. This chapter discusses the main areas of criticism and recommends several assessment design requirements that address them.
- Published
- 2016
34. Use of Evidence-Centered Design in Assessment of History Learning
- Author
-
Kadriye Ercikan, Peter Seixas, Pamela Kaliski, and Kristen Huff
- Published
- 2016
35. Qualitative and Quantitative Evidence in Health: The Critics’ View
- Author
-
Kadriye Ercikan and Wolff-Michael Roth
- Subjects
Essentialism, Generalization, Population, Sample (statistics), Epistemology, Political science, Qualitative research - Abstract
In this chapter, we articulate the structure and discuss limitations of different forms of generalizations across the spectrum of quantitative and qualitative research. Three main categories of generalization are identified and discussed: analytic, probabilistic (sample to population), and essentialist. We argue for a set of criteria for evaluating research generalization and evidence. Underlying our review is the conviction that it is more important for health researchers to worry about the quality of research evidence than about whether the research is of the quantitative, qualitative, or mixed-method type. Independent of whether the research is qualitative or quantitative, we are as critical of research that overgeneralizes as we are of research that fails to offer generalizations beyond the actual case(s) studied.
- Published
- 2016
36. Methodologies for Investigating Item- and Test-Level Measurement Equivalence in International Large-Scale Assessments
- Author
-
Kadriye Ercikan, Brent F. Olson, Bruno D. Zumbo, and Maria Elena Oliveri
- Subjects
Social Psychology, Comparability, Regression analysis, Differential item functioning, Education, Modeling and Simulation, Item response theory, Statistics, Econometrics, Ordered logit, Psychology, Parametric statistics - Abstract
In this study, the Canadian English and French versions of the Problem-Solving Measure of the Programme for International Student Assessment 2003 were examined to investigate their degree of measurement comparability at the item- and test-levels. Three methods of differential item functioning (DIF) were compared: parametric and nonparametric item response theory and ordinal logistic regression. Corresponding derivations of these three DIF methods were investigated at the test-level to examine both differential test functioning (DTF) and the correspondence between findings at the item-level with those at the test-level. Item-level findings suggested consistency in DIF detection across methods; however, differences in effect sizes of DIF were found by each method. Test-level results revealed a high degree of consistency across DTF methods. Discrepancies were found between item- and test-level comparability analyses. Item-level analyses suggested moderate to low degrees of comparability, whereas test-level f...
- Published
- 2012
37. Do Different Approaches to Examining Construct Comparability in Multilanguage Assessments Lead to Similar Conclusions?
- Author
-
Kadriye Ercikan and Maria Elena Oliveri
- Subjects
Item analysis, Statistics, Comparability, Developmental and Educational Psychology, Psychology, Differential item functioning, Reliability (statistics), Education, Test data, Factor analysis - Abstract
In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of test data structure, reliability comparisons and test characteristic curves) and item-levels (differential item functioning, item parameter correlations, and linguistic comparisons). Results from the test-level analyses indicate that the two language versions of PISA are highly similar as shown by similarity of internal consistency coefficients, test data structure (same number of factors and item factor loadings) and test characteristic curves for the two language versions of the tests. However, results of item-level analyses reveal several differences between the two language versions as shown by large proportions of items displaying differential item functioning,...
- Published
- 2011
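One quick test-level check of the construct comparability examined above is Tucker's congruence coefficient between factor loadings estimated separately in each language version. A sketch (numpy; the loadings are made up):

import numpy as np

def tucker_phi(a, b):
    # Congruence between two loading vectors; values near 1 are usually
    # read as evidence of a similar factor structure.
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

english = [0.62, 0.55, 0.70, 0.48]   # hypothetical loadings, one per item
french = [0.60, 0.58, 0.67, 0.51]
print(round(tucker_phi(english, french), 3))

Values of roughly .95 and above are conventionally taken to indicate factorial similarity; this complements, but does not replace, the item-level DIF analyses the study reports.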
38. Introduction to the Special Issue: Levels of Analysis in the Assessment of Linguistic Minority Students
- Author
-
Kadriye Ercikan and Guillermo Solano-Flores
- Subjects
Developmental and Educational Psychology, Psychology, Linguistics, Education - Published
- 2014
39. Application of Think Aloud Protocols for Examining and Confirming Sources of Differential Item Functioning Identified by Expert Reviews
- Author
-
Kadriye Ercikan, Rubab Arim, Danielle Law, Jose Domene, France Gagnon, and Serge Lacroix
- Subjects
Protocol analysis, Cognition, Think aloud protocol, Psychology, Empirical evidence, Differential item functioning, Test bias, Thinking processes, Social psychology, Education, Cognitive psychology - Abstract
This paper demonstrates and discusses the use of think aloud protocols (TAPs) as an approach for examining and confirming sources of differential item functioning (DIF). The TAPs are used to investigate to what extent surface characteristics of the items that are identified by expert reviews as sources of DIF are supported by empirical evidence from examinee thinking processes in the English and French versions of a Canadian national assessment. In this research, the TAPs confirmed sources of DIF identified by expert reviews for 10 out of 20 DIF items. The moderate agreement between TAPs and expert reviews indicates that evidence from expert reviews cannot be considered sufficient in deciding whether DIF items are biased and such judgments need to include evidence from examinee thinking processes.
- Published
- 2010
40. Chronic illnesses in Canadian children: what is the effect of illness on academic achievement, and anxiety and emotional disorders?
- Author
-
Kadriye Ercikan and Y. J. Martinez
- Subjects
Male, Canada, Adolescent, National Longitudinal Survey of Children and Youth, Childhood chronic illness, Population, MEDLINE, Academic achievement, Young Adult, Quality of life (healthcare), Developmental and Educational Psychology, Humans, Child, Psychiatry, Mood Disorders, Public Health, Environmental and Occupational Health, Anxiety Disorders, Chronic Disease, Pediatrics, Perinatology and Child Health, Quality of Life, Educational Status, Anxiety, Female, Epidemiologic Methods, Psychology, Clinical psychology - Abstract
Background: Survival rates of children with a chronic illness are at an all-time high. Up to 98% of children suffering from a chronic illness that may have been considered fatal in the past now reach early adulthood. It is estimated that as many as 30% of school-aged children are affected by a chronic illness. For this population of children, the prevalence of educational and psychological problems is nearly double that of the general population. Methods: This study investigated the educational and psychological effects of childhood chronic illness among 1512 Canadian children (ages 10–15 years). This was a retrospective analysis using data from the National Longitudinal Survey of Children and Youth, taking a cross-sectional look at the relationships between childhood chronic illnesses, performance on a Mathematics Computation Exercise (MCE) and ratings on an Anxiety and Emotional Disorder (AED) scale. Results: When AED ratings and educational handicaps were controlled for, children identified with chronic illnesses still had weaker performance on the MCE. Chronic illness did not appear to have a relationship with children's AED ratings. The regression analysis indicated that community type and illness were the strongest predictors of MCE scores. Conclusions: The core research implications of this study concern measurement issues that need to be addressed in future large-scale studies. Clinical implications concern the need for co-ordinated services between the home, hospital and school settings so that services and programmes focus on the ecology of the child who is ill.
- Published
- 2009
41. Adaptation of instructional materials: a commentary on the research on adaptations of Who Polluted the Potomac
- Author
-
Kadriye Ercikan and Naim Alper
- Subjects
Cultural Studies, Comparability, Mathematics education, Sociology of Education, Psychology, Adaptation, Science education - Abstract
This commentary first summarizes and discusses the analysis of the two translation processes described in the Oliveira, Colak, and Akerson article and the inferences these researchers make based on their research. In the second part of the commentary, we describe procedures and criteria used in adapting tests into different languages and how they may apply to the adaptation of instructional materials. The authors provide a good theoretical analysis of what took place in two translation instances and make an important contribution by taking the first step in providing a systematic discussion of the adaptation of instructional materials. Our discussion proposes procedures for examining the equivalence of source and target versions of adapted instructional materials. We highlight that many of the procedures and criteria used in examining the comparability of educational tests are missing in this emerging area of research.
- Published
- 2008
42. Design and Development Issues in Provincial Large-Scale Assessments: Designing Assessments to Inform Policy and Practice
- Author
-
Kadriye Ercikan and Stephanie Barclay-McKeown
- Subjects
Library and Information Sciences - Abstract
Over the past four decades, there has been much debate on key sources of data in evaluating education, determining school effectiveness, and providing evidence to inform accountability and education planning. Entangled in this debate has been the extent to which large-scale assessments of learning provide valid evidence about the quality of schooling and education in Canada and how they can be used to inform education practice and policy. This article discusses five issues in large-scale assessments that are key to their usefulness and to making valid inferences. Based on recent research on assessment design and validity, the authors offer recommendations for large-scale assessments to better serve the multiple purposes they are intended to serve.
- Published
- 2008
43. New Directions in Assessing Historical Thinking
- Author
-
Kadriye Ercikan and Peter Seixas
- Subjects
History, History education, Construct validity, Historical thinking, Narrative, Consciousness, Advanced Placement, Education - Abstract
Preface. Acknowledgements. Contributor Biographies. Introduction.
Part 1: Goals of History Education: Models of Historical Cognition and Learning. 1. Historical Consciousness in Germany: Concept, Implementation, Assessment (Carlos Kölbl & Lisa Konrad). 2. The Difficulty of Assessing Disciplinary Historical Reading (Abby Reisman). 3. Heritage as a Resource for Enhancing and Assessing Historical Thinking: Reflections from the Netherlands (Carla van Boxtel, Maria Grever & Stephan Klein). 4. Relating Historical Consciousness to Historical Thinking through Assessment (Catherine Duquette). Commentary 1: Into the Swampy Lowlands of Important Problems (Robert B. Bain).
Part 2: Issues in Designing Assessments of Historical Thinking. 5. Assessing for Learning in the History Classroom (Bruce VanSledright). 6. Historical Thinking, Competencies and their Measurement: Challenges and Approaches (Andreas Körber & Johannes Meyer-Hamme). 7. A Design Process for Assessing Historical Thinking: The Case of a One-Hour Test (Peter Seixas, Lindsay Gibson & Kadriye Ercikan). 8. Material-based and Open-ended Writing Tasks for Assessing Narrative Competence among Students (Monika Waldis, Jan Hodel, Holger Thünemann, Meik Zülsdorf-Kersting & Beatrice Ziegler). Commentary 2: Historical Thinking: In Search of Conceptual and Practical Guidance for the Design and Use of Assessments of Student Competence (Josh Radinsky, Susan R. Goldman & James W. Pellegrino).
Part 3: Large-scale Assessment of Historical Thinking. 9. A Large-Scale Assessment of Historical Knowledge and Reasoning: NAEP U.S. History (Stephen Lazer). 10. Assessing Historical Thinking in the Redesigned Advanced Placement United States History Course and Exam (Lawrence G. Charap). 11. Historical Consciousness and Historical Thinking Reflected in Large-scale Assessment in Sweden (Per Eliasson, Fredrik Alvén, Cecilia Axelsson Yngveus & David Rosenlund). Commentary 3: Assessment of Historical Thinking in Practice (Susan M. Brookhart).
Part 4: Validity of Score Interpretations. 12. The Importance of Construct Validity Evidence in History Assessment: What is Often Overlooked or Misunderstood? (Pamela Kaliski, Kara Smith & Kristen Huff). 13. Cognitive Validity Evidence for Validating Assessments of Historical Thinking (Kadriye Ercikan, Peter Seixas, Juliette Lyons-Thomas & Lindsay Gibson). 14. Measuring Up?: Multiple-Choice Questions (Gabriel A. Reich). 15. History Assessments of Thinking: An Investigation of Cognitive Validity (Mark Smith & Joel Breakstone). Commentary 4: The Validity of Historical Thinking Assessments: A Commentary (Denis Shemilt).
- Published
- 2015
44. Developments in Assessment of Student Learning
- Author
-
Kadriye Ercikan
- Subjects
Formative assessment, Student learning, Psychology - Published
- 2015
45. Examining Sources of Gender DIF in Mathematics Assessments Using a Confirmatory Multidimensional Model Approach
- Author
-
Sharon Mendes-Barnett and Kadriye Ercikan
- Subjects
Content area, Gender differential, Contrast (statistics), Thinking skills, Test bias, Education, Developmental psychology, Multidimensional model, Cognitive level, Developmental and Educational Psychology, Multidimensional scaling, Psychology - Abstract
This study contributes to understanding sources of gender differential item functioning (DIF) on mathematics tests. This study focused on identifying sources of DIF and differential bundle functioning for boys and girls on the British Columbia Principles of Mathematics Exam (Grade 12) using a confirmatory SIBTEST approach based on a multidimensional model. Problem solving as a content area was confirmed as a source of gender DIF in favor of boys when the item is presented in the form of a story problem or when the problems are noncontext specific. Patterns and relations content areas produced a mixture of confirmed sources of DIF, with some subtopics favoring the girls and some favoring the boys. In contrast to what might be expected given the findings of previous gender DIF research, this study did not find geometry to be a source of gender DIF. All of the higher cognitive level items favored boys. High levels of DIF were detected in favor of girls on the bundle of computation items in which no equations...
- Published
- 2006
46. Using Multiple-Variable Matching to Identify Cultural Sources of Differential Item Functioning
- Author
-
Kadriye Ercikan and Amery D. Wu
- Subjects
Matching (statistics), Content area, Social Psychology, Regression analysis, Logistic regression, Differential item functioning, Education, Modeling and Simulation, Psychology, Social psychology, Cognitive psychology - Abstract
Identifying the sources of differential item functioning (DIF) in international assessments is very challenging, because such sources are often nebulous and intertwined. Even though researchers frequently focus on test translation and content area, few actually go beyond these factors to investigate other cultural sources of DIF. This article introduces the multiple-variable matching method using logistic regression analysis to identify sources of DIF. A case study demonstrates how this methodology identified Extra Lesson Hours After School (ELHAS) as a potential source of DIF between Taiwan and the United States in the Third International Mathematics and Science Study (TIMSS) 1999. DIF is not a fixed characteristic of any test item, nor is a cultural factor an inherent source of DIF. The legitimacy of a source of DIF relies on the specific context and purpose for the cross-country comparison.
- Published
- 2006
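The multiple-variable matching idea above can be sketched as two nested logistic DIF models: if adding a cultural covariate to the matching set absorbs the group effect, that covariate is a candidate source of the DIF. A synthetic illustration (statsmodels; extra_lessons is a hypothetical stand-in for the ELHAS variable):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n = 4000
country = rng.integers(0, 2, size=n)               # two jurisdictions
extra = rng.poisson(1 + 2 * country)               # more extra lessons in group 1
total = rng.normal(size=n) + 0.2 * extra           # matching test score
p = 1 / (1 + np.exp(-(total + 0.25 * extra - 2)))  # item driven by extra lessons
item = (rng.random(n) < p).astype(int)
df = pd.DataFrame(dict(item=item, total=total, extra_lessons=extra,
                       country=country))

base = smf.logit("item ~ total + country", data=df).fit(disp=0)
matched = smf.logit("item ~ total + extra_lessons + country", data=df).fit(disp=0)
print(base.params["country"], matched.params["country"])
# The apparent country DIF largely disappears once extra lessons
# join the matching variables.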
47. What Good Is Polarizing Research Into Qualitative and Quantitative?
- Author
-
Kadriye Ercikan and Wolff-Michael Roth
- Subjects
Research design, Data collection, Polarization, Education, Epistemology, Educational research, Generalizability theory, Psychology, Attribution, Social psychology, Qualitative research - Abstract
In education research, a polar distinction is frequently made to describe and produce different kinds of research: quantitative versus qualitative. In this article, the authors argue against that polarization and the associated polarization of the “subjective” and the “objective,” and they question the attribution of generalizability to only one of the poles. The purpose of the article is twofold: (a) to demonstrate that this polarization is not meaningful or productive for education research, and (b) to propose an integrated approach to education research inquiry. The authors sketch how such integration might occur by adopting a continuum instead of a dichotomy of generalizability. They then consider how that continuum might be related to the types of research questions asked, and they argue that the questions asked should determine the modes of inquiry that are used to answer them.
- Published
- 2006
48. Book Review
- Author
-
Kadriye Ercikan
- Subjects
Educational measurement, Social Psychology, Modeling and Simulation, Applied psychology, Cross-cultural, Psychological testing, Multilingualism, Psychology, Education - Published
- 2006
49. Scoring Examinee Responses for Multiple Inferences: Multiple Scoring in Assessments
- Author
-
Kadriye Ercikan
- Subjects
Statistical inference, Education - Abstract
Multiple scoring is widely used in large-scale assessments. The use of a single response for making multiple inferences, as is done in multiple scoring, has implications for the validity of these inferences and of interpretations based on assessment results. The purpose of this article is to review two types of multiple scoring practices and discuss how multiple scoring affects inferences.
- Published
- 2005
50. Examining the Construct Comparability of the English and French Versions of TIMSS
- Author
-
Kadriye Ercikan and Kim Koh
- Subjects
Social Psychology, Modeling and Simulation, Comparability, Mathematics education, Achievement test, Overall performance, Psychology, Empirical evidence, Education - Abstract
The objective of this research was to examine the comparability of constructs assessed by English and French versions of the Third International Mathematics and Science Study (TIMSS). The differences in constructs observed in our analyses indicate serious limitations of using TIMSS results for making comparisons that use overall performance in mathematics and science. In particular, large differences between constructs were observed between the U.S. and French scales. Such limitations include rank ordering of performance of countries such as the United States and France, as well as conducting research using TIMSS data to compare factors associated with performance. The results from this study point to differences in constructs assessed by TIMSS in different countries and the importance of empirical evidence to support construct comparability before TIMSS results can be meaningfully used for research.
- Published
- 2005