18 results for "Computer scoring"
Search Results
2. Development and validation of interactive creativity task platform.
- Author
- Ching-Lin Wu, Yu-Der Su, Eason Chen, Pei-Zhen Chen, Yu-Lin Chang, and Hsueh-Chih Chen
- Subjects
- DIVERGENT thinking, CREATIVE ability, STANDARDIZED tests, CONFORMANCE testing, TASKS
- Abstract
Co-creativity focuses on how individuals produce innovative ideas together. As few studies have explored co-creativity using standardized tests, it is difficult to effectively assess the individual's creativity performance within a group. Therefore, this study aims to develop a platform that allows two individuals to answer creativity tests simultaneously. This platform includes two divergent thinking tasks, the Straw Alternative Uses Test and Bottle Alternative Uses Test, and Chinese Radical Remote Associates Test A and B, which were used to evaluate their open- and closed-ended creative problem-solving performance. This platform has two modes: single-player mode and paired-player mode. Responses from 497 adults were collected, based on which the fluency, flexibility, and originality of divergent thinking were measured. This study also developed a computer scoring technique that can automatically calculate the scores on these creativity tests. The results showed that divergent thinking scores from computer-based calculation and manual scoring were highly positively correlated, suggesting that the scores on a divergent thinking task can be calculated through a system that avoids time-consuming, uneconomical manual scoring. Overall, the two types of tests on this platform showed considerable internal consistency reliability and criterion-related validity. This advanced application facilitates the collection of empirical evidence about co-creativity. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
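For readers wanting a concrete picture of what automated divergent-thinking scoring involves, here is a minimal, illustrative Python sketch. It is not the platform's actual scoring technique: the category map, the sample frequencies, and the under-5% originality cut-off are assumptions introduced purely for illustration.

```python
# Illustrative scoring of divergent-thinking responses for fluency, flexibility,
# and originality (hypothetical category map and sample frequencies; not the
# platform's actual scoring rules).

# Hypothetical mapping of alternative uses for a straw to semantic categories.
category = {"stirrer": "tool", "bracelet": "jewelry", "pea shooter": "toy",
            "paint dripper": "art", "ring": "jewelry"}

# Hypothetical response frequencies in the whole sample (lower = rarer idea).
sample_freq = {"stirrer": 0.40, "bracelet": 0.10, "pea shooter": 0.03,
               "paint dripper": 0.02, "ring": 0.08}

responses = ["stirrer", "bracelet", "pea shooter"]     # one participant's answers

fluency = len(responses)                               # number of valid ideas
flexibility = len({category[r] for r in responses})    # number of distinct categories
# One common originality rule: score 1 for responses given by <5% of the sample.
originality = sum(sample_freq[r] < 0.05 for r in responses)

print(fluency, flexibility, originality)               # 3 3 1
```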
3. Promoting Effects of Computer Scoring on English Learning of College Students.
- Author
- Siwei Liu
- Subjects
- COLLEGE students, ECONOMIC globalization, FOREIGN students, FORMATIVE tests, COMPUTERS
- Abstract
In the age of economic globalization, it is important for college students to master an international language such as English. Computer scoring is an effective tool for enhancing their English learning. Drawing on theories of formative assessment and structural learning, this paper verifies the promoting effect of computer scoring on English learning among college students. The data were collected through a questionnaire survey, and a case study was carried out on a scoring website for English writing. The results show that: formative assessment and structural learning lay the theoretical basis for computer scoring; college students generally recognize that a computer scoring system greatly enhances their ability and enthusiasm for English learning; and the target computer scoring system (www.pigai.org) facilitates autonomous learning under teacher supervision through functions on both the student and teacher interfaces. These findings support the further development of computer scoring and English learning among college students. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
4. The computerized scoring algorithm for the autobiographical memory test: updates and extensions for analyzing memories of English-speaking adults.
- Author
- Takano, Keisuke, Hallford, David J., Vanderveren, Elien, Austin, David W., and Raes, Filip
- Subjects
- *AUTOBIOGRAPHICAL memory, *MEMORY testing, *MEMORY, *SUPPORT vector machines
- Abstract
The Autobiographical Memory Test (AMT) has been central in psychopathological studies of memory dysfunctions, as reduced memory specificity or overgeneralised autobiographical memory has been recognised as a hallmark vulnerability for depression. In the AMT, participants are asked to generate specific memories in response to emotional cue words, and their responses are scored by human experts. Because the manual coding takes some time, particularly when analysing a large dataset, recent studies have proposed computerised scoring algorithms. These algorithms have been shown to reliably discriminate between specific and non-specific memories of English-speaking children and Dutch- and Japanese-speaking adults. The key limitation is that the algorithm is not developed for English-speaking adult memories, which may cover a wider range of vocabulary that the existing algorithm for English-speaking child memories cannot process correctly. In the present study, we trained a new support vector machine to score memories of English-speaking adults. In a performance test (predicting memory specificity against human expert coding), the adult-memory algorithm outperformed the child-memory variant. In another independent performance test, the adult-memory algorithm showed robust performance in scoring memories that were generated in response to a different set of cues. These results suggest that the adult-memory algorithm reliably scores memory specificity. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
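A rough sense of how a support vector machine can be trained to score memory specificity is given below. This is a generic scikit-learn text-classification sketch with toy data, not the authors' published AMT algorithm, feature set, or training corpus.

```python
# Minimal sketch of an SVM classifier for memory specificity (illustrative only;
# not the published AMT scoring algorithm or its features).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: 1 = specific memory, 0 = non-specific memory.
memories = [
    "The day I graduated last June, my father hugged me outside the hall",
    "I always feel nervous before exams",
    "Last Tuesday I missed the train to work and waited an hour",
    "People often tell me I am a good listener",
]
labels = [1, 0, 1, 0]

# TF-IDF features over word unigrams and bigrams feeding a linear SVM.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LinearSVC(),
)
model.fit(memories, labels)

# Score a new response: the predicted class approximates human specificity coding.
print(model.predict(["Yesterday evening we watched the sunset at the beach"]))
```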
5. The Impact of Misspelled Words on Automated Computer Scoring: A Case Study of Scientific Explanations.
- Author
- Ha, Minsu and Nehm, Ross
- Subjects
- *ORTHOGRAPHY & spelling, *SPELLING errors, *ACCURACY, *ELLA (Computer hardware description language), *ENGLISH language
- Abstract
Automated computerized scoring systems (ACSSs) are being increasingly used to analyze text in many educational settings. Nevertheless, the impact of misspelled words (MSW) on scoring accuracy remains to be investigated in many domains, particularly jargon-rich disciplines such as the life sciences. Empirical studies confirm that MSW are a pervasive feature of human-generated text and that despite improvements, spell-check and auto-replace programs continue to be characterized by significant errors. Our study explored four research questions relating to MSW and text-based computer assessments: (1) Do English language learners (ELLs) produce equivalent magnitudes and types of spelling errors as non-ELLs? (2) To what degree do MSW impact concept-specific computer scoring rules? (3) What impact do MSW have on computer scoring accuracy? and (4) Are MSW more likely to impact false-positive or false-negative feedback to students? We found that although ELLs produced twice as many MSW as non-ELLs, MSW were relatively uncommon in our corpora. The MSW in the corpora were found to be important features of the computer scoring models. Although MSW did not significantly or meaningfully impact computer scoring efficacy across nine different computer scoring models, MSW had a greater impact on the scoring algorithms for naïve ideas than key concepts. Linguistic and concept redundancy in student responses explains the weak connection between MSW and scoring accuracy. Lastly, we found that MSW tend to have a greater impact on false-positive feedback. We discuss the implications of these findings for the development of next-generation science assessments. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
6. Automated Integrative Complexity: Current Challenges and Future Directions.
- Author
- Houck, Shannon C., Conway, Lucian Gideon, and Gornick, Laura Janelle
- Subjects
- *LINGUISTIC complexity, *AUTOMATION, *LINGUISTIC analysis, *DIALECTIC, *HUMAN-machine systems
- Abstract
Automating integrative complexity is fraught with many challenges. To address these challenges, we discuss the tension between a specificity approach and a more flexible multiple-pass approach, the multifaceted nature of the complexity construct, the gold standard for complexity measurement, the difficulty of human scoring and its consequences for automation, and some ways forward for creating the best complexity measurements. In so doing, we present new data demonstrating (1) initial evidence for the validity of a new automated system for measuring two different forms of complexity (elaborative and dialectical), (2) the danger of constructing measurements in a purely ad hoc fashion that ignores prospective testing, (3) human-to-computer correspondence is in part a function of human-to-human correspondence, (4) human-to-computer correspondence increases systematically as one uses tests with larger units of analysis, and (5) the lack of correspondence of different systems (both human and automated) may occur in part because they were designed for different units of analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
7. The performance, application and integration of various seabed classification systems suitable for mapping Posidonia oceanica (L.) Delile meadows.
- Author
- Puhr, Kristian, Schultz, Stewart, Pikelj, Kristina, Petricioli, Donat, and Bakran-Petricioli, Tatjana
- Subjects
- *POSIDONIA oceanica, *OCEAN bottom, *GLOBAL environmental change, *SEAGRASSES, *MEADOWS, *COASTAL ecology
- Abstract
In the context of current global environmental changes, mapping and monitoring seagrass meadows have become highly important for management and preservation of coastal zone ecosystems. The purpose of this research was to determine the numerical precision of various cost-effective benthic habitat mapping techniques and their suitability for mapping and monitoring of Posidonia oceanica meadows in the Croatian Adriatic. We selected ultra-high resolution aerial imagery, a single-beam echo sounder (SBES) seabed classification system from Quester Tangent Co. (QTC), and surface-based underwater videography as affordable, non-destructive and simple-to-use systems for data acquisition. The ultra-high resolution digital imagery was capable of detecting P. oceanica meadows up to 4 m depth with 94% accuracy; from 4 m to 12.5 m depth the accuracy dropped to approximately 76%; and from 12.5 to 20 m the system was only capable of distinguishing seabed biota from substrata, though with 97% accuracy. The results of the QTC system showed over 90% detection accuracy for Cymodocea nodosa covered seabed, excellent separation capabilities (>92%) of different sediment types (slightly gravelly sand, gravelly muddy sand and slightly gravelly muddy sand) and reasonable accuracy for mapping underwater vegetation regardless of the bathymetric span. The system proved incapable of separating P. oceanica from dense macroalgae on the same type of substratum. Surface-based underwater videography demonstrated great potential for estimating P. oceanica cover in a sampled region using either a single human rater or a computer estimate. The consistency between two human scorers in evaluating P. oceanica bottom coverage was near perfect (>98%) and high between digital and human scorers (80%). The results indicate that although the selected systems are suitable for mapping seagrasses, they all display limitations in either detection accuracy or spatial coverage, which leads to the conclusion that suitable system integration is essential for producing high-quality seagrass spatial distribution maps. [Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
8. Accuracy of computer algorithms and the human eye in scoring actigraphy.
- Author
- Boyne, Kathleen, Sherry, David, Gallagher, Paul, Olsen, Margaret, and Brooks, Lee
- Abstract
Purpose: The purpose of this study is to determine the optimal scoring method and parameter settings of actigraphy by comparison to simultaneous polysomnography (PSG). Methods: Fifteen studies of simultaneous PSG and actigraphy were completed in adolescents (mean age = 16.3 years) and analyzed. Scoring actigraphy by the human eye was compared to a commercial computerized algorithm using various parameters. The PSG was considered the reference standard. Results: There was a better correlation between actigraphy and PSG sleep start/end, total sleep time, wake after sleep onset, and sleep efficiency when the rest period was determined by the human (mean r = 0.640) rather than auto-set by the software (r = 0.406). The best results came when the rest intervals were set based on the PSG (r = 0.694). Scoring the printed actogram by the human eye was superior to the auto analyses as well (r = 0.575). Higher correlations and lower biases were obtained from lower wake threshold settings (low and medium) and higher immobility times (10 and 15 min). Conclusions: Visual scoring by simple inspection of the actigraphy tracing had a reasonable correlation with the gold standard PSG. Accurate determination of the rest interval is important in scoring actigraphy. Scoring actigraphy by the human eye is superior to this computer algorithm when the algorithm auto-sets major rest periods. A low wake threshold and 10-15 min of immobility for sleep onset and sleep end yield the most accurate computerized results. Auto-setting should be avoided when defining the start/end of major rest intervals; adjustments for artifacts and/or comparison against a sleep diary are helpful. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
9. Point and Interval Estimates of Percentile Ranks for Scores on the Texas Functional Living Scale.
- Author
- Crawford, John R., Cullum, C. Munro, Garthwaite, Paul H., Lycett, Emma, and Allsopp, Kate J.
- Subjects
- *PERCENTILES, *BAYESIAN analysis, *ACTIVITIES of daily living scales, *PSYCHOMETRICS
- Abstract
Point and interval estimates of percentile ranks are useful tools in assisting with the interpretation of neurocognitive test results. We provide percentile ranks for raw subscale scores on the Texas Functional Living Scale (TFLS; Cullum, Weiner, & Saine, 2009) using the TFLS standardization sample data (N = 800). Percentile ranks with interval estimates are also provided for the overall TFLS T score. Conversion tables are provided along with the option of obtaining the point and interval estimates using a computer program written to accompany this paper (TFLS_PRs.exe). The percentile ranks for the subscales offer an alternative to using the cumulative percentage tables in the test manual and provide a useful and quick way for neuropsychologists to assimilate information on the case's profile of scores on the TFLS subscales. The provision of interval estimates for the percentile ranks is in keeping with the contemporary emphasis on the use of confidence intervals in psychological statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
10. Performance of the frequency domain indices with respect to sleep staging
- Author
- Kuo, Terry B.J., Chen, C.Y., Hsu, Ya-Chuan, and Yang, Cheryl C.H.
- Subjects
- *SLEEP stages, *ELECTROPHYSIOLOGY, *ALGORITHMS, *HEART beat, *ELECTROENCEPHALOGRAPHY, *SLOW wave sleep
- Abstract
Objective: To compare computerized staging using spectral analyses of various electrophysiological signals with manual sleep staging. Methods: Sleep recordings from 21 normal subjects were scored by an experienced rater and by a dichotomous algorithm. The performance of the spectral indices was assessed by the largest kappa value (LKV). Results: The theta/beta power ratio of the electroencephalogram, high-frequency power (8–58 Hz) of the electromyogram (PEMG), mean R–R interval, and total power (0–16 Hz) of the body acceleration (PACCE) had high (>0.5) LKVs when differentiating between waking and sleep. To differentiate sleep with (stage 2 and slow wave sleep) and without (rapid eye movement and stage 1 sleep) spindles, the sigma/beta power ratio had high LKVs. PEMG had a medium (>0.25) LKV to separate rapid eye movement from stage 1 sleep, whereas the delta/beta power ratio had a high LKV to separate stage 2 and slow wave sleep. Conclusion: The frequency components of the electroencephalogram perform well in identifying sleep, sleep with spindles, and slow wave sleep. Electromyogram, heart rate, and body acceleration offer high agreement only when differentiating between wakefulness and sleep. Significance: The human–machine agreement is acceptable with spectral parameters, but heart rate and body acceleration still cannot substitute for the electroencephalogram. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
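The spectral indices used for computerized staging, such as the EEG theta/beta power ratio, can be computed from a Welch power spectral density estimate. The sketch below is illustrative only: it uses a synthetic signal and commonly cited band limits (theta 4–8 Hz, beta 13–30 Hz), which are assumptions rather than the exact definitions of the cited study.

```python
# Illustrative theta/beta power ratio for one EEG epoch (synthetic signal;
# band limits are common conventions, not necessarily those of the cited study).
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

fs = 256                      # sampling rate in Hz (assumed)
t = np.arange(0, 30, 1 / fs)  # one 30-s epoch
rng = np.random.default_rng(0)
eeg = (np.sin(2 * np.pi * 6 * t)            # theta-band component
       + 0.5 * np.sin(2 * np.pi * 20 * t)   # beta-band component
       + 0.2 * rng.standard_normal(t.size)) # noise

def band_power(freqs, psd, lo, hi):
    """Integrate the PSD over the [lo, hi] Hz band."""
    mask = (freqs >= lo) & (freqs <= hi)
    return trapezoid(psd[mask], freqs[mask])

freqs, psd = welch(eeg, fs=fs, nperseg=fs * 4)  # 4-s Welch segments
theta = band_power(freqs, psd, 4, 8)
beta = band_power(freqs, psd, 13, 30)
print("theta/beta ratio:", theta / beta)
```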
11. Human vs. Computer Diagnosis of Students' Natural Selection Knowledge: Testing the Efficacy of Text Analytic Software.
- Author
- Nehm, Ross and Haertig, Hendrik
- Subjects
- *NATURAL selection, *BIOLOGICAL invasions, *LIFE sciences, *DIVERSITY in education, *BIOLOGICAL evolution
- Abstract
Our study examines the efficacy of Computer Assisted Scoring (CAS) of open-response text relative to expert human scoring within the complex domain of evolutionary biology. Specifically, we explored whether CAS can diagnose the explanatory elements (or Key Concepts) that comprise undergraduate students' explanatory models of natural selection with equal fidelity as expert human scorers in a sample of >1,000 essays. We used SPSS Text Analysis 3.0 to perform our CAS and measure Kappa values (inter-rater reliability) of KC detection (i.e., computer-human rating correspondence). Our first analysis indicated that the text analysis functions (or extraction rules) developed and deployed in SPSSTA to extract individual Key Concepts (KCs) from three different items differing in several surface features (e.g., taxon, trait, type of evolutionary change) produced 'substantial' (Kappa 0.61-0.80) or 'almost perfect' (0.81-1.00) agreement. The second analysis explored the measurement of human-computer correspondence for KC diversity (the number of different accurate knowledge elements) in the combined sample of all 827 essays. Here we found outstanding correspondence; extraction rules generated using one prompt type are broadly applicable to other evolutionary scenarios (e.g., bacterial resistance, cheetah running speed, etc.). This result is encouraging, as it suggests that the development of new item sets may not necessitate the development of new text analysis rules. Overall, our findings suggest that CAS tools such as SPSS Text Analysis may compensate for some of the intrinsic limitations of currently used multiple-choice Concept Inventories designed to measure student knowledge of natural selection. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
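Human-computer rating correspondence of the kind reported above is typically summarized with Cohen's kappa. A minimal, self-contained sketch follows, using made-up binary ratings rather than data from the study.

```python
# Cohen's kappa for agreement between a human rater and a computer scorer
# on the same items (toy ratings; 1 = key concept present, 0 = absent).
from collections import Counter

human    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
computer = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

n = len(human)
observed = sum(h == c for h, c in zip(human, computer)) / n

# Expected chance agreement from the two raters' marginal distributions.
ph, pc = Counter(human), Counter(computer)
expected = sum((ph[k] / n) * (pc[k] / n) for k in set(human) | set(computer))

kappa = (observed - expected) / (1 - expected)
print(f"observed = {observed:.2f}, expected = {expected:.2f}, kappa = {kappa:.2f}")
```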
12. Percentile Norms and Accompanying Interval Estimates from an Australian General Adult Population Sample for Self-Report Mood Scales (BAI, BDI, CRSD, CES-D, DASS, DASS-21, STAI-X, STAI-Y, SRDS, and SRAS).
- Author
- Crawford, John, Cayley, Carol, Lovibond, Peter F., Wilson, Peter H., and Hartley, Caroline
- Subjects
- *SELF-evaluation, *REFERENCE values, *STATISTICAL models, *PEARSON correlation (Statistics), *STATISTICAL correlation, *CRONBACH'S alpha, *COMPUTER software, *RESEARCH funding, *CENTER for Epidemiologic Studies Depression Scale, *PROBABILITY theory, *QUESTIONNAIRES, *SEX distribution, *CHI-squared test, *DESCRIPTIVE statistics, *EXPERIMENTAL design, *RESEARCH, *AFFECT (Psychology), *PSYCHOLOGICAL tests, *CONFIDENCE intervals, *EDUCATIONAL attainment, *RELIABILITY (Personality trait)
- Abstract
Despite their widespread use, many self-report mood scales have very limited normative data. To rectify this, Crawford et al. have recently provided percentile norms for a series of self-report scales. The present study aimed to extend the work of Crawford et al. by providing percentile norms for additional mood scales based on samples drawn from the general Australian adult population. Participants completed a series of self-report mood scales. The resultant normative data were incorporated into a computer programme that provides point and interval estimates of the percentile ranks corresponding to raw scores for each of the scales. The programme can be used to obtain point and interval estimates of the percentile ranks of an individual's raw scores on the Beck Anxiety Inventory, the Beck Depression Inventory, the Carroll Rating Scale for Depression, the Centre for Epidemiological Studies Rating Scale for Depression, the Depression, Anxiety, and Stress Scales (DASS), the short-form version of the DASS (DASS-21), the Self-rating Scale for Anxiety, the Self-rating Scale for Depression, the State-Trait Anxiety Inventory (STAI), form X, and the STAI, form Y, based on normative sample sizes ranging from 497 to 769. The interval estimates can be obtained using either classical or Bayesian methods as preferred. The programme (which can be downloaded at ) provides a convenient and reliable means of obtaining the percentile ranks of individuals' raw scores on self-report mood scales. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
13. Teaching and Learning with Individually Unique Exercises.
- Author
- Joerding, Wayne
- Subjects
- HOMEWORK, ONLINE education, PEDAGOGICAL content knowledge, COMPUTER software development, OPEN source software, COMPUTER assisted instruction, STUDENT assignments, EDUCATIONAL technology, RESEARCH papers (Students)
- Abstract
In this article, the author describes the pedagogical benefits of giving students individually unique homework exercises from an exercise template. Evidence from a test of this approach shows statistically significant improvements in subsequent exam performance by students receiving unique problems compared with students who received traditional paper assignments that were identical across students. The author also describes the software developed by himself and his students to implement this approach to homework problems. The software generates unique computer-graded assignments for each student from an assignment template and scores the resulting exercises. Instructors can create questions that require students to interact with diagrams or provide solutions to symbolic equations. The software is freely available to educators under an open-source license, to use, edit, and improve as they choose. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
14. On percentile norms in neuropsychology: Proposed reporting standards and methods for quantifying the uncertainty over the percentile ranks of test scores.
- Author
- Crawford, John R., Garthwaite, Paul H., and Slick, Daniel J.
- Subjects
- *NEUROPSYCHOLOGICAL tests, *NEUROPSYCHOLOGY, *COMPUTER software, *UNCERTAINTY, *PSYCHOPHYSIOLOGY
- Abstract
Normative data for neuropsychological tests are often presented in the form of percentiles. One problem when using percentile norms stems from uncertainty over the definitional formula for a percentile. (There are three co-existing definitions and these can produce substantially different results.) A second uncertainty stems from the use of a normative sample to estimate the standing of a raw score in the normative population. This uncertainty is unavoidable but its extent can be captured using methods developed in the present paper. A set of reporting standards for the presentation of percentile norms in neuropsychology is proposed. An accompanying computer program (available to download) implements these standards and generates tables of point and interval estimates of percentile ranks for new or existing normative data. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
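The definitional ambiguity noted above, that three co-existing formulas for a percentile rank can give substantially different results, is easy to see in a short sketch. The formulas below are the three standard textbook definitions applied to made-up normative scores; this is not the accompanying computer program described in the paper.

```python
# Three common definitions of the percentile rank of a raw score within a
# normative sample (toy data; not the paper's accompanying program).
norms = [8, 10, 11, 11, 12, 13, 13, 13, 15, 17]   # hypothetical normative scores
score = 13
n = len(norms)

below = sum(x < score for x in norms)        # scores strictly below the raw score
at_or_below = sum(x <= score for x in norms) # scores at or below the raw score
equal = at_or_below - below                  # ties at the raw score

pr_below = 100 * below / n                      # definition 1: % strictly below
pr_at_or_below = 100 * at_or_below / n          # definition 2: % at or below
pr_midpoint = 100 * (below + 0.5 * equal) / n   # definition 3: % below + half the ties

print(pr_below, pr_at_or_below, pr_midpoint)    # 50.0 80.0 65.0 for this sample
```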
15. Experimenting with a computer essay-scoring program based on ESL student writing scripts.
- Author
- Coniam, David
- Subjects
- *ESSAYS, *AUTHORSHIP, *ENGLISH as a foreign language, *NATIVE language & education, *NATIVE language instruction
- Abstract
This paper describes a study of the computer essay-scoring program BETSY. While the use of computers in rating written scripts has been criticised in some quarters for lacking transparency or for fitting poorly with how human raters rate written scripts, a number of essay-rating programs are available commercially, many of which claim to offer reliability comparable with human raters. Much of the validation of such programs has focused on native-speaking tertiary-level students writing in subject content areas. Instead of content areas with native speakers, the data for this study are drawn from a representative sample of scripts from an English as a second language (ESL) Year 11 public examination in Hong Kong. The scripts (900 in total) are taken from a writing test consisting of three topics (300 scripts per topic), each representing a different genre. Results in the study show good correlations between human raters' scores and the program BETSY. A rater discrepancy rate, where scripts need to be re-marked because of disagreement between two raters, emerged at levels broadly comparable with those derived from discrepancies between paired human raters. Little difference was apparent in the ratings of test takers on the three genres. The paper concludes that while computer essay-scoring programs may appear to rate inside a 'black box', with a concomitant lack of transparency, they do have potential to act as a third rater and a time-saving assessment tool. As technology develops and rating becomes more transparent, their acceptability will grow. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
16. Computer Scoring of Micronuclei in Human Lymphocytes
- Author
- Callisen, H., Norman, A., Pincu, M., Eisert, Wolfgang G. (editor), and Mendelsohn, Mortimer L. (editor)
- Published
- 1984
- Full Text
- View/download PDF
17. The Complexity Construct in Political Psychology: Personological and Cognitive Approaches
- Author
- Suedfeld, Peter (British Columbia Univ Vancouver, Dept of Psychology)
- Abstract
Measures of the cognitive complexity of leaders have been used to infer the flexibility, open-endedness, and information-orientation of their decision making in international and nonstate confrontations. At present, there are two major methods of "assessment at a distance" used in this context. One uses computer scoring to develop personality profiles of leaders; the other uses a more labor- and time-intensive human scoring system to track changes in the target's thinking to predict the outcome of a particular confrontation. If computer scoring were able to make event-specific predictions, the savings in time and work would be substantial. This study compared the two systems to establish the following: (1) whether the computer-scored system could replace human scoring; and (2) using the example of the South Ossetia War between Georgia and Russia, which method was a better predictor of rising and falling tension. The data confirmed the relevance of integrative complexity measurement in a new context, that of an ongoing confrontation with changing levels of tension, up to and including war, between a major and a minor national power. The correlation between scores from the two methods was low; at high levels of cognitive complexity, it was essentially zero. The human scoring of integrative complexity, which tracks changes in complexity over the duration of a particular event, was closely tied to the course of the confrontation. The computer scoring of cognitive complexity, which profiles complexity as a stable personality characteristic, was not. Thus, although computer scoring has significant advantages in cost and time, it does not accomplish the same goals. The data indicate that the negative trade-off between speed and accuracy is serious enough to opt for the more laborious human tasking if the goal is the prediction of international crisis outcome. Text in English; abstract and executive summary in English and French.
- Published
- 2010
18. Computers to Try Grading N.J. Essays.
- Author
- Brody, Leslie
- Subjects
- *HUMAN-computer interaction, *EXAMINATIONS, *ARTIFICIAL intelligence
- Published
- 2015