557 results for "Formatted text"
Search Results
2. Pictorial warning labels as deterrents of alcohol abuse
- Author
-
Antonio Mileti, M. Irene Prete, Luigi Piper, and Gianluigi Guido
- Subjects
Alcohol abuse, Alcoholic drinking, Advertising, Warning label, Formatted text, Psychology, Business, Management and Accounting (miscellaneous), Food Science - Abstract
Purpose: The purpose of this research is to demonstrate the effectiveness of pictorial warning labels that leverage the risk of obesity as a deterrent against alcohol abuse. It evaluates the impact of three kinds of warning labels that can potentially discourage alcohol drinking: (1) a claim, in text format, that cautions consumers about the product (i.e. a responsibility warning statement); (2) a textual warning label, i.e. text-format information on the content of the product or the consequences of excessive consumption (i.e. a synthetic nutritional table); and (3) a pictorial warning label, an image depicting a food product with a caloric content equivalent to that of the alcoholic beverage. Design/methodology/approach: In Study 1, a 2 × 2 × 2 factorial design was used to evaluate the intention to buy different alcoholic cocktails. The stimuli comprised two cocktails that are similar in alcoholic volume but different in caloric content. The images of the products were presented across eight warning-label conditions and shown to 480 randomly selected Italian respondents, who quantified their intention to buy the product. In Study 2, a different sample of 34 Italian respondents was presented with the same stimuli as in Study 1, and neuropsychological measurements were recorded through electroencephalography (EEG). A post hoc least significant difference (LSD) test was used to analyse the data. Findings: The results show that only the presence of an image representing an alcoholic beverage's caloric content causes a significant reduction in consumers' purchase intentions. This effect is due to the increase in negative emotions caused by pictorial warning labels. Originality/value: The findings provide interesting insights on pictorial warning labels, which can influence the intention to purchase alcoholic beverages. They confirm that the use of images in warning labels has a greater impact than text, and that the risk of obesity is an effective deterrent in encouraging consumers to make healthier choices.
- Published
- 2021
- Full Text
- View/download PDF
3. E-textbook technology: Are instructors using it and what is the impact on student learning?
- Author
-
Kim Roberts, Jamie Mills, and Angela Benson
- Subjects
Learning environment, Survey research, Student achievement, Student learning, Community college, Formatted text, Psychology - Abstract
Purpose: Today's digital and mobile learning environment has contributed to the increased availability of and interest in e-textbooks, and many school systems are conducting trials to evaluate their effectiveness. The purpose of this paper is to identify and analyze instructors' levels of use (LoU) of e-textbook features and innovations at a southeastern US community college. This study also evaluated the effectiveness of e-textbooks compared to paper textbooks on student achievement during a pilot period of e-textbook implementation. Design/methodology/approach: Using a survey research design, the Levels of Use of an Innovation framework was applied to identify and analyze instructors' LoU rankings for eight e-textbook features. The study also used historical data on student demographics and final course grades to evaluate student achievement between text formats. Descriptive and inferential statistics were used to answer the research questions. Findings: Results showed that e-textbook features were used at a low to non-existent level by instructors and that there was no significant difference in grade average between text formats among students. However, interactions between text format, age and gender were found. Originality/value: This study adds to the body of knowledge regarding e-textbook efficacy. While many studies stop at the conclusion that there is no difference in student outcomes between text formats, this study addresses a gap in the literature on how to improve student performance with e-textbook technology by using the LoU of an Innovation framework.
- Published
- 2021
- Full Text
- View/download PDF
4. Detection and prevention of Phishing Attacks
- Author
-
Lavkush Gupta, Madhuri Gedam, Rucha Desai, and Abu Saad Choudhary
- Subjects
Computer science, Computer security, Phishing, Phishing attack, Phishing detection, HTML element, Similarity, Feature set, Formatted text, Web site, Software, Information Systems and Management - Abstract
Phishing is one of the main issues faced by the cyber-world and leads to monetary losses for both industries and individuals. Detecting phishing attacks with high accuracy has always been a challenging problem. At present, visual-similarity-based techniques are very useful for detecting phishing websites efficiently. A phishing website looks very similar in appearance to its corresponding legitimate website in order to deceive users into believing that they are browsing the correct site. Visual-similarity-based phishing detection techniques use a feature set such as text content, text format, HTML tags, Cascading Style Sheets (CSS), images, and so forth, to make the decision. These approaches compare the suspicious website with the corresponding legitimate website using various features, and if the similarity is greater than a predefined threshold value, the site is declared phishing [2].
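To make the threshold decision concrete, here is a minimal sketch of such a visual-similarity check, not the paper's implementation: the feature names, the equal weighting, and the 0.8 threshold are illustrative assumptions.

```python
# Sketch of a visual-similarity threshold check: compare a suspicious page's
# extracted features with the legitimate page's, flag phishing above a threshold.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how similar two strings are."""
    return SequenceMatcher(None, a, b).ratio()

def is_phishing(suspicious: dict, legitimate: dict, threshold: float = 0.8) -> bool:
    # Feature names and equal weighting are illustrative assumptions.
    scores = [
        similarity(suspicious["text"], legitimate["text"]),
        similarity(suspicious["html_tags"], legitimate["html_tags"]),
        similarity(suspicious["css"], legitimate["css"]),
    ]
    return sum(scores) / len(scores) >= threshold

suspect = {"text": "Log in to PayPa1 to verify", "html_tags": "html body form input", "css": "body{font:14px}"}
real = {"text": "Log in to PayPal to verify", "html_tags": "html body form input", "css": "body{font:14px}"}
print(is_phishing(suspect, real))  # True: near-identical look, so likely a clone
```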
- Published
- 2021
- Full Text
- View/download PDF
5. THE EFFECTS OF CULTURAL BASED TEXT TYPES IN READING COMPREHENSION
- Author
-
Raja Nor Safinas Raja Harun and Indhira Thirunavukarasu
- Subjects
Ethnic group, English proficiency, Nonprobability sampling, Reading comprehension, Narrative, Text types, Formatted text, Psychology, Malay - Abstract
Background and Purpose: This research aims to investigate the effects of Malay, Chinese and Indian culture-based text types on reading comprehension. Methodology: An exploratory case study design was employed to explore the use of cultural schemata in the reading comprehension of three cultural text formats: narrative, descriptive and infographic. Purposive sampling was used to select participants based on their ethnicities, English proficiency level, and their scores in a prior knowledge assessment and a retelling assessment. A total of 15 students, between 11 and 12 years old, were selected. A written retelling technique, a comprehension test and an interview protocol were used as instruments to gather data, and the data were analysed both quantitatively and qualitatively. Findings: Overall, comprehension test scores were better for infographic texts, whereas retelling assessment scores were better for narrative texts. The participants' main reason for text incomprehensibility was unfamiliar cultural content, regardless of the type of text format. Contributions: This study aids teachers in their pedagogical decisions when selecting and adapting cultural texts for reading comprehension. Keywords: Comprehension test, cultural schemata, reading comprehension, retelling assessment, text types. Cite as: Thirunavukarasu, I., & Raja Harun, R. N. S. (2021). The effects of cultural based text types in reading comprehension. Journal of Nusantara Studies, 6(1), 1-23. http://dx.doi.org/10.24200/jonus.vol6iss1pp1-23
- Published
- 2021
- Full Text
- View/download PDF
6. Quote examiner: verifying quoted images using web-based text similarity
- Author
-
Sawinder Kaur, Sneha Banerjee, and Parteek Kumar
- Subjects
Information retrieval, Computer science, Computer Networks and Communications, Cloud computing, Optical character recognition, Similarity, Web application, Social media, Formatted text, Software - Abstract
Over the last few years, there has been rapid growth in digital data. Images with quotes spread virally through online social media platforms, and misquotes found online often spread like a forest fire, which highlights the lack of responsibility of web users when circulating poorly cited quotes. It is therefore important to authenticate the content of images being circulated online and to retrieve the information within such textual images, so that quotes can be verified before use and a fake or misquote can be differentiated from an authentic one. Optical Character Recognition (OCR) is used in this paper to convert textual images into readable text format, but no OCR tool is perfect at extracting information from images accurately. This paper proposes a post-processing method applied to the retrieved text to improve the accuracy of text detected in images. Google Cloud Vision has been used for recognizing text from images, and post-processing the extracted text was observed to improve the accuracy of text recognition by approximately 3.5%. A web-based text similarity approach (URLs and domain names) has been used to examine the authenticity of the content of the quoted images, achieving approximately 96.26% accuracy in classifying quoted images as verified or misquoted. A ground-truth dataset of authentic site names has also been created. In this research, images with quotes by famous celebrities and global leaders have been used, and a comparative analysis has been performed to show the effectiveness of the proposed algorithm.
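As an illustration of the verification step, here is a minimal sketch; the paper uses Google Cloud Vision for OCR and web-based similarity over URLs and domain names, whereas this toy substitutes a local list of authentic quotes and difflib string similarity.

```python
# Sketch of the quote-verification step only: normalize OCR output, then match
# it against known authentic quotes and their attributed authors.
import re
from difflib import SequenceMatcher

AUTHENTIC = {
    "be the change that you wish to see in the world": "Mahatma Gandhi",
}

def normalize(s: str) -> str:
    # Lowercase and strip punctuation/artifacts left by imperfect OCR.
    return re.sub(r"[^a-z0-9 ]+", "", s.lower()).strip()

def verify(ocr_text: str, attributed_to: str, threshold: float = 0.9) -> str:
    text = normalize(ocr_text)
    for quote, author in AUTHENTIC.items():
        if SequenceMatcher(None, text, quote).ratio() >= threshold:
            return "verified" if author == attributed_to else "misquoted"
    return "misquoted"

# OCR noise ('y0u') survives thanks to fuzzy matching.
print(verify("Be the change that y0u wish to see in the world.", "Mahatma Gandhi"))
```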
- Published
- 2021
- Full Text
- View/download PDF
7. Improving Efficiency of Customer Requirements Classification on Autonomous Vehicle by Natural Language Processing
- Author
-
Fengrong Han, Asrul Adam, and Hao Wang
- Subjects
Computer science, Customer perception, Customer requirements, Quality management system, Artificial intelligence, Natural language processing, Human-Computer Interaction, Formatted text, Information Systems - Abstract
Safety is critical for autonomous vehicles, so quality management system methods are crucial for handling risks that may impact human beings. A quality management system helps identify customer requirements and ultimately meet them. Customer requirements also include other aspects that customers or stakeholders are most concerned about. Although much research on customer perception has been done, it does not cover all aspects of the requirements toward autonomous vehicles. Furthermore, requirements are mostly in text format, or are transferred to text format, which is convenient to store and read; faced with this large amount of text data, classifying it manually becomes time- and cost-consuming. In this paper, customer requirements on autonomous vehicles are summarized and allocated to different categories, and a natural language processing method is applied. This method shows its efficiency in dealing with customer requirements. The results provide a valuable reference for autonomous vehicle developers and top management.
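A minimal sketch of what such an NLP-based requirement classifier could look like; the paper does not specify its exact model, and the categories, training sentences, and TF-IDF/Naive Bayes pipeline here are illustrative assumptions.

```python
# Toy requirement classifier: vectorize free-text requirements with TF-IDF and
# route them to categories with a Naive Bayes model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented training set: requirement sentences with their categories.
train_texts = [
    "the car must detect obstacles and brake to stop safely",
    "an emergency stop must engage when a pedestrian appears",
    "seats should recline for long trips",
    "cabin noise should stay low on the highway",
]
train_labels = ["safety", "safety", "comfort", "comfort"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(train_texts, train_labels)

# New requirements arriving as free text are assigned a category.
print(clf.predict(["the vehicle must detect obstacles and stop safely"]))  # ['safety']
```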
- Published
- 2020
- Full Text
- View/download PDF
8. A Meeting Notification Application at STMIK Dumai Based on an SMS Gateway
- Author
-
Putri Yunita, Fitri Pratiwi, and Masrizal Masrizal
- Subjects
Computer science, Short Message Service, Information technology, GSM, Schedule, Protocol, Formatted text, Computer network - Abstract
Short Message Service (SMS) is a technology that provides a service for sending and receiving messages between cell phones. Individuals and government, private, and educational institutions alike use SMS as an information medium; one example is meeting notification information. SMS technology, however, can only carry limited data. An SMS gateway is a system that bridges cell phones with a server system, using SMS as the information carrier. In the SMS gateway work system, the user's cell phone sends an SMS containing a prescribed text format to access the information needed through the GSM network. The SMS is received by the SMS gateway cell phone and then retrieved by the PC using the mfbus protocol via a data cable. On the PC, the text format is processed by the SMS gateway application program to produce information that is sent back to the SMS gateway cell phone using the mfbus protocol via a data cable. After that, the information is sent by the SMS gateway cell phone to the user's cell phone. With this SMS gateway-based meeting notification application, detailed and simultaneous notification of meeting information can be provided, so that each meeting member can attend according to their schedule.
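A minimal sketch of the gateway-side parsing step; the actual text format used by the STMIK Dumai application is not given in the abstract, so the "MEETING#..." layout below is hypothetical.

```python
# Sketch of parsing a fixed-format SMS command and composing the reply that
# the gateway would send back out through the GSM modem.
def parse_sms(body: str):
    parts = body.strip().split("#")
    if len(parts) != 4 or parts[0].upper() != "MEETING":
        return None  # not a recognized command; ignore or reply with usage help
    _, date, time, room = parts
    return {"date": date, "time": time, "room": room}

def build_notification(info: dict) -> str:
    return f"Meeting reminder: {info['date']} at {info['time']}, room {info['room']}."

incoming = "MEETING#2020-03-12#09:30#B204"  # hypothetical message layout
info = parse_sms(incoming)
if info:
    print(build_notification(info))
```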
- Published
- 2020
- Full Text
- View/download PDF
9. Spectrum Management Operations Tools
- Author
-
George F. Elmasry
- Subjects
Computer science, Spectrum management, Jamming, EMI, Information repository, Data access, Software, Formatted text - Abstract
This chapter provides a technical description of several tools used to facilitate spectrum management operations (SMO). It includes hardware and infrastructure requirements, software used, and capabilities of spectrum management tools. Tools that promote the flow of information between spectrum stakeholders shorten the planning cycle, leading to quicker decisions, and spectrum managers are able to perform the core SMO functions much more efficiently when tools comply with the net-centric environment. The Standard Frequency Action Format is a line-oriented text format used by the DOD and by U.S. allies and unified action partners who use Spectrum XXI. The Joint Spectrum Interference Resolution online collaboration portal is the preferred tool for reporting EMI occurrences; it is a Web-based, centralized application containing data and correspondence for reported EMI, intrusion, and jamming incidents dating back to 1970. The Defense Spectrum Organization provides direct online data access to the joint spectrum data repository and provides customized reports.
- Published
- 2020
- Full Text
- View/download PDF
10. TRIZ Theory and the Method of Cancer Document Selection for Chemical Complexes and Innovation Schemes of Meta-Analysis with Lymphomas as an Example
- Author
-
Wang Ling, Yan Huiquan, Yu Zhiming, and Lyu Penghui
- Subjects
TRIZ, Cancer, Meta-analysis, Informatics, Information retrieval, Chemistry, Software, Formatted text - Abstract
In the face of the growing incidence of malignant tumors (about 3.929 million cases, data issued in January 2019), the associated death toll (about 2.338 million, data issued in January 2019), and the limited application of informatics in cancer treatment, this paper uses TRIZ theory to derive new ideas about cancer treatments, performs literature analysis on treatment schemes, and develops a retrieval strategy for meta-analyses on cancer therapy. Using TRIZ theory and informatics to analyze cancer fields, research schemes for selecting documents on cancer therapy were presented. After retrieving the documents, we exported all articles in text format. We further analyzed the state of research with the software CiteSpace and the Bibliographic Information Mining System (BICOMS), across keywords, regions, countries, schools, authors, geography, institutes, etc. We also performed cluster analysis using Statistical Package for the Social Sciences (SPSS) software and two-way cluster analysis using Gluto software. The hot areas of research and their tendencies and distribution were analyzed, the search strategy was established, and the retrieval results were tested.
- Published
- 2020
- Full Text
- View/download PDF
11. Are Videos or Text Better for Describing Attributes in Stated-Preference Surveys?
- Author
-
Shelby D. Reed, Jessie Ehrisman, Laura J. Havrilesky, Jui-Chen Yang, and Stephanie Lim
- Subjects
Time Factors, Decision Making, Video Recording, Risk Assessment, Health administration, Patient Education as Topic, Surveys and Questionnaires, Patient Preference, Ovarian Neoplasms, Comprehension, Presentation, Socioeconomic Factors, Humans, Female, Middle Aged, Aged, Formatted text, Psychology - Abstract
In stated-preference research, the conventional approach to describing study attributes is through text, often with easy-to-understand graphics. More recently, researchers have begun to present attribute descriptions and content in videos. Some experts have expressed concern regarding internalization and retention of information conveyed via video. Our study aimed to compare respondents’ understanding of attribute information provided via text versus video. Potential respondents were randomized to receive a text or video version of the survey. In the text version, all content was provided in text format along with still graphics. In the video version, text content was interspersed with four video clips, providing the same information as the text version. In both versions, 10 questions were embedded to assess respondents’ understanding of the information presented relating to ovarian cancer treatments. Half of the questions were on treatment benefits and the other half were on treatment-related risks. Some questions asked about the decision context and definitions of treatment features, and others asked about the graphic presentation of treatment features. Preferences for ovarian cancer treatments were also compared between respondents receiving text versus video versions. Overall, 150 respondents were recruited. Of the 95 who were eligible and completed the survey, 54 respondents received the text version and 41 received the video version. Median times to completion were 24 and 30 min in the video and text arms, respectively (p
- Published
- 2020
- Full Text
- View/download PDF
12. A Study on Multi-media Disaster Information Contents to provide Disaster Alert Service to the Public
- Author
-
Hyun Chul Kim, Beom-Jun Cho, Sungboo Kang, and Ki Bong Kwon
- Subjects
Warning system, Computer science, Information and Communications Technology, Multimedia information, Information system, Computer security, Common Alerting Protocol, Formatted text - Abstract
Most disaster warning systems currently in operation provide disaster information only through text or voice. It is therefore quite difficult for many elderly people, foreigners, and people with disabilities to recognize the information and respond to a disaster in time. In addition, because of the limited information that a text format can carry, there are limits to conveying the disaster situation accurately. To solve these problems, research is being carried out on technologies that provide richer disaster information through various multimedia contents, and on technologies that can generate such multimedia contents automatically. Korea has a high-level infrastructure that can deliver multimedia disaster information by utilizing the latest ICT technologies such as 5G and UHD along with digital signage and bus information systems. Using this infrastructure, location-customized information and multimedia information can be provided. In particular, by using a profile of the standard CAP (Common Alerting Protocol) suitable for the Korean environment, disaster alerts can be transmitted immediately, including localized multimedia information. This study aims to find a way to deliver more disaster information than the current system and to contribute to reducing damage to people's lives and property in case of disaster.
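For concreteness, here is a sketch of a minimal CAP 1.2 alert carrying a multimedia resource link. The element names follow the OASIS CAP 1.2 standard; the values and the Korean-profile details are illustrative assumptions.

```python
# Build a minimal CAP 1.2 alert; a multimedia attachment (e.g., a UHD clip for
# digital signage) rides in a <resource> block instead of plain text.
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:emergency:cap:1.2"
alert = ET.Element(f"{{{NS}}}alert")

def sub(parent, tag, text):
    e = ET.SubElement(parent, f"{{{NS}}}{tag}")
    e.text = text
    return e

sub(alert, "identifier", "KR-2022-000123")        # illustrative values
sub(alert, "sender", "alerts@example.go.kr")
sub(alert, "sent", "2022-06-01T09:00:00+09:00")
sub(alert, "status", "Actual")
sub(alert, "msgType", "Alert")
sub(alert, "scope", "Public")

info = ET.SubElement(alert, f"{{{NS}}}info")
sub(info, "category", "Met")
sub(info, "event", "Heavy rain warning")
sub(info, "urgency", "Immediate")
sub(info, "severity", "Severe")
sub(info, "certainty", "Observed")

res = ET.SubElement(info, f"{{{NS}}}resource")
sub(res, "resourceDesc", "Evacuation route video")
sub(res, "mimeType", "video/mp4")
sub(res, "uri", "https://example.go.kr/alerts/route.mp4")

print(ET.tostring(alert, encoding="unicode"))
```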
- Published
- 2022
- Full Text
- View/download PDF
13. WebAssembly Text Toolkit and Other Utilities
- Author
-
Shashank Mohan Jain
- Subjects
Computer science, Programming language, Formatted text - Abstract
The WebAssembly text toolkit is a toolkit for peeking into a Wasm module in textual form. Although a Wasm module is binary, the toolkit lets you convert it into a human-readable text format.
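A small sketch of the binary-to-text round trip, assuming the wat2wasm/wasm2wat command-line tools from the WABT toolkit are installed; the chapter's own toolkit may differ.

```python
# Write a tiny module in WebAssembly text format, compile it to binary, then
# convert the binary back to human-readable text.
import pathlib
import subprocess

wat = pathlib.Path("add.wat")
wat.write_text(
    '(module\n'
    '  (func $add (param i32 i32) (result i32)\n'
    '    local.get 0\n'
    '    local.get 1\n'
    '    i32.add)\n'
    '  (export "add" (func $add)))\n'
)

subprocess.run(["wat2wasm", "add.wat", "-o", "add.wasm"], check=True)        # text -> binary
subprocess.run(["wasm2wat", "add.wasm", "-o", "roundtrip.wat"], check=True)  # binary -> text
print(pathlib.Path("roundtrip.wat").read_text())
```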
- Published
- 2021
- Full Text
- View/download PDF
14. Accessibility in Online User Testing
- Author
-
Estella Oncins
- Subjects
User testing, COVID-19, Usability, User experience design, World Wide Web, Qualitative research, Formatted text - Abstract
The current COVID-19 crisis has revealed the crucial role of online communication technologies in providing unique opportunities to carry out qualitative research through online user-based testing. The ability to provide a shared common space for participants living in different parts of the world, and to record discursive data accurately in text format, makes these tools crucial for gathering qualitative data in research studies (Turney & Pocknee, 2005). Although the accessibility of online communication platforms is improving, they still present significant challenges for all users, especially when running synchronous meeting sessions with participants in remote settings (Dodds & Hess, 2020).
- Published
- 2021
- Full Text
- View/download PDF
15. Data modelling and data processing generated by human eye movements
- Author
-
Velin Kralev, Radoslava Kraleva, and Petia Koprinkova-Hristova
- Subjects
Computer science, Data modelling, Data processing, Human eye movements, Relational model, XML, Information retrieval, Formatted text - Abstract
Data modelling and data processing are important activities in any scientific research. This research focuses on the modelling and processing of data generated by a saccadometer. The approach used is based on the relational data model, but the processing and storage of the data is done with client datasets. The experiments were performed with 26 randomly selected files from a total of 264 experimental sessions. The data from each experimental session was stored in three different formats: text, binary, and XML (extensible markup language) based. The results showed that the text and binary formats were the most compact. Several data processing actions were analyzed. Based on the results obtained, the two fastest actions were loading data from a binary file and storing data into a binary file; the two slowest were storing the data in XML format and loading the data from a text file. One of the more time-consuming operations turned out to be the conversion of data from text format to binary format; moreover, the time required for this action does not grow in proportion to the number of records processed.
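A minimal sketch comparing the three storage formats discussed above on synthetic gaze samples; the field layout is an assumption, not the saccadometer's actual schema.

```python
# Serialize the same records as text (CSV), binary (packed doubles), and XML,
# then report size and serialization time for each format.
import struct
import time
import xml.etree.ElementTree as ET

samples = [(i * 0.004, float(i % 30), float(i % 20)) for i in range(10_000)]  # t, x, y

def to_text(rows):
    return "\n".join(f"{t:.3f},{x:.1f},{y:.1f}" for t, x, y in rows).encode()

def to_binary(rows):
    return b"".join(struct.pack("<ddd", *row) for row in rows)  # 24 bytes/record

def to_xml(rows):
    root = ET.Element("session")
    for t, x, y in rows:
        ET.SubElement(root, "s", t=f"{t:.3f}", x=f"{x:.1f}", y=f"{y:.1f}")
    return ET.tostring(root)

for name, fn in [("text", to_text), ("binary", to_binary), ("xml", to_xml)]:
    start = time.perf_counter()
    blob = fn(samples)
    print(f"{name:>6}: {len(blob):>7} bytes, {time.perf_counter() - start:.4f}s")
```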
- Published
- 2021
16. Intellectual Analysis of Text Data for Solving the Problem of Information Categorization
- Author
-
Daria M. Loseva
- Subjects
Computer science, Text mining, Semantic analysis, Semantics, Big data, Categorization, Information system, Information retrieval, Formatted text - Abstract
The report discusses the use of Text Mining algorithms, such as semantic text analysis and keyword search, to solve the problem of categorizing data that enters an information system in the form of short messages in text format. An example of the application of such algorithms in an information system for processing user messages is given.
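A minimal sketch of the keyword-search side of such a categorizer; category names and keyword lists are invented for illustration.

```python
# Toy keyword-based categorizer for short text messages: the category with the
# most keyword hits wins, with a fallback category when nothing matches.
CATEGORIES = {
    "billing": {"invoice", "payment", "refund", "charge"},
    "technical": {"error", "crash", "login", "password"},
}

def categorize(message: str) -> str:
    words = set(message.lower().split())
    best, best_hits = "general", 0  # 'general' is the fallback
    for category, keywords in CATEGORIES.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = category, hits
    return best

print(categorize("I was charged twice, please refund the payment"))  # billing
print(categorize("App shows an error after login"))                  # technical
```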
- Published
- 2021
- Full Text
- View/download PDF
17. ML-based classification of eye movement patterns during reading using eye tracking data from an Apple iPad device: Perspective machine learning algorithm needed for reading quality analytics app on an iPad with built-in eye tracking
- Author
-
D. Zhigulskaya, V. Anisimov, K. Shedenko, Arsen Revazov, K. Chernozatonsky, and A. Pikunov
- Subjects
Computer science, Eye movement, Eye tracking, Reading, Analytics, Digital learning, Cognitive load, Algorithm, Formatted text - Abstract
Eye movements in reading can provide important information about readers' perception of texts and, with appropriate algorithms, about the level of their understanding of what is written. Both digital learning and professional training require processing large volumes of information, mostly in text format. Management and control of students' or trainees' levels of perception in reading can be accomplished with technology that assesses readers' attention, engagement, understanding, cognitive load and tiredness. These metrics can potentially be inferred from eye tracking data recorded during reading. For this task it is important to utilize the features of mass-market consumer devices: for example, the latest iPad Pro with the standard iOS operating system has an embedded eye tracker, which creates opportunities for mass adoption in educational and training settings. The authors of this work built a stable algorithm for detecting saccades and fixations in noisy eye tracking data recorded by an iPad Pro 11″ and made progress in applying a machine-learning algorithm for classifying eye movement patterns in reading. The results could be used to create an interactive reader's assistant in the form of an iOS application.
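For illustration, here is a simple velocity-threshold (I-VT) classifier of the kind commonly used to separate fixations from saccades; the 30 deg/s threshold and the data layout are assumptions, not the authors' algorithm.

```python
# I-VT sketch: label each gaze sample as 'saccade' when the point-to-point
# velocity exceeds a threshold, otherwise as 'fixation'.
import math

def classify_ivt(samples, velocity_threshold=30.0):
    """samples: list of (t_seconds, x_deg, y_deg). Returns per-sample labels."""
    labels = ["fixation"]  # first sample has no velocity estimate
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        velocity = math.hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0
        labels.append("saccade" if velocity > velocity_threshold else "fixation")
    return labels

gaze = [(0.00, 1.0, 1.0), (0.01, 1.1, 1.0), (0.02, 4.0, 1.2), (0.03, 4.1, 1.2)]
print(classify_ivt(gaze))  # ['fixation', 'fixation', 'saccade', 'fixation']
```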
- Published
- 2021
- Full Text
- View/download PDF
18. Analysis of Semantic Units of NITF Text Format
- Author
-
O Balalaieva
- Subjects
Computer science, Artificial intelligence, Natural language processing, Formatted text - Published
- 2019
- Full Text
- View/download PDF
19. Evaluating the reading comprehension strategies of pre-service teachers: The influence of training stage and text format on the summary product
- Author
-
María José Rodríguez Conde, Gabriel Herrada Valverde, and Azucena Hernández Martín
- Subjects
Initial training, Knowledge building, Knowledge creation, Reading comprehension, Pedagogy, Primary education, Education, Formatted text, Psychology - Abstract
Abstract: Knowledge construction is linked to the reading comprehension tasks involved in writing summaries. However, summarizing effectively fosters knowledge creation only under an adequate task model: expressing the macrostructure of the text briefly and in one's own words. This research is a quasi-experimental study with one group of students who were beginning the degree in Primary Education teaching and another who had completed it and were studying for the Psychopedagogy degree. The objective was to test the influence of text type, printed versus hyperlinked, on the use of the reading comprehension strategies needed to produce a summary. The results suggest that readers, regardless of their stage of training and of the format of the text to be summarized, do not start from an adequate task model that allows them to obtain a satisfactory product. The importance of this study lies in the type of students examined, future Primary Education teachers, and in its consequences for the design of initial teacher training.
- Published
- 2019
- Full Text
- View/download PDF
20. Increased Word Spacing Improves Performance for Reading Scrolling Text with Central Vision Loss
- Author
-
Robin Walker, Stephen J. Anderson, and Hannah Harvey
- Subjects
Adult, Young Adult, Male, Female, Humans, Reading, Scotoma, Crowding, Peripheral vision, Space Perception, Visual Pattern Recognition, Scrolling, Words per minute, Language, Recall, Ophthalmology, Optometry, Formatted text - Abstract
Significance Scrolling text can be an effective reading aid for those with central vision loss. Our results suggest that increased interword spacing with scrolling text may further improve the reading experience of this population. This conclusion may be of particular interest to low-vision aid developers and visual rehabilitation practitioners. Purpose The dynamic, horizontally scrolling text format has been shown to improve reading performance in individuals with central visual loss. Here, we sought to determine whether reading performance with scrolling text can be further improved by modulating interword spacing to reduce the effects of visual crowding, a factor known to impact negatively on reading with peripheral vision. Methods The effects of interword spacing on reading performance (accuracy, memory recall, and speed) were assessed for eccentrically viewed single sentences of scrolling text. Separate experiments were used to determine whether performance measures were affected by any confound between interword spacing and text presentation rate in words per minute. Normally sighted participants were included, with a central vision loss implemented using a gaze-contingent scotoma of 8° diameter. In both experiments, participants read sentences that were presented with an interword spacing of one, two, or three characters. Results Reading accuracy and memory recall were significantly enhanced with triple-character interword spacing (both measures, P ≤ .01). These basic findings were independent of the text presentation rate (in words per minute). Conclusions We attribute the improvements in reading performance with increased interword spacing to a reduction in the deleterious effects of visual crowding. We conclude that increased interword spacing may enhance reading experience and ability when using horizontally scrolling text with a central vision loss.
- Published
- 2019
- Full Text
- View/download PDF
21. Compression and Decompression of Audio Files Using the Arithmetic Coding Method
- Author
-
Irene Sri Morina and Parasian D.P. Silitonga
- Subjects
Computer science, Information technology, Arithmetic coding, Data compression, Audio file, Wave file, File format, File size, Floating point, Pulse-code modulation, Formatted text - Abstract
Audio files are relatively large compared with text-format files. Large files cause various obstacles: large storage space requirements and long transmission times. File compression is one solution to the problem of large file sizes, and arithmetic coding is one algorithm that can be used to compress audio files. The arithmetic coding algorithm encodes the audio file by replacing a row of input symbols with a single floating-point number: a value greater than 0 and smaller than 1. In this study, compression and decompression were performed on several wave files, the standard audio file format developed by Microsoft and IBM and stored using PCM (Pulse Code Modulation) coding. The wave file compression ratio obtained in this study was 16.12 percent, with an average compression time of 45.89 seconds and an average decompression time of 0.32 seconds.
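A toy sketch of the interval-narrowing idea behind arithmetic coding; real codecs use integer renormalization and operate on audio sample streams rather than characters.

```python
# Encode a message by narrowing [0, 1) according to symbol probabilities;
# any number in the final interval identifies the whole message.
def build_intervals(probabilities):
    intervals, low = {}, 0.0
    for symbol, p in probabilities.items():
        intervals[symbol] = (low, low + p)
        low += p
    return intervals

def encode(message, probabilities):
    intervals = build_intervals(probabilities)
    low, high = 0.0, 1.0
    for symbol in message:
        s_low, s_high = intervals[symbol]
        span = high - low
        low, high = low + span * s_low, low + span * s_high
    return (low + high) / 2

def decode(code, length, probabilities):
    intervals = build_intervals(probabilities)
    out = []
    for _ in range(length):
        for symbol, (s_low, s_high) in intervals.items():
            if s_low <= code < s_high:
                out.append(symbol)
                code = (code - s_low) / (s_high - s_low)  # rescale and repeat
                break
    return "".join(out)

probs = {"a": 0.6, "b": 0.3, "c": 0.1}
code = encode("abac", probs)
print(code, decode(code, 4, probs))  # round-trips to 'abac'
```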
- Published
- 2019
- Full Text
- View/download PDF
22. Recognition of Tamil handwritten character using modified neural network with aid of elephant herding optimization
- Author
-
P. S. Periasamy and S. Kowsalya
- Subjects
Computer science, Artificial neural network, Pattern recognition, Optical character recognition, Feature extraction, Segmentation, Tamil, Dravidian languages, Artificial intelligence, Formatted text, Software - Abstract
Nowadays, recognition of machine-printed or handwritten documents is an essential part of many applications. Optical character recognition (OCR) is one technique used to convert a printed or handwritten file into its corresponding text format. Tamil is a south Indian language spoken widely in Tamil Nadu, with the longest unbroken literary tradition amongst the Dravidian languages. Tamil character recognition (TCR) is one of the challenging tasks in optical character recognition: it recognizes characters from a scanned input image and converts them into machine-editable form. Recognition of handwritten Tamil characters is very difficult due to variations in size, style and orientation angle, and manual editing and reprinting of text documents printed on paper is time-consuming and inaccurate. To overcome these problems, the proposed technique performs effective Tamil character recognition in four main stages: preprocessing, segmentation, feature extraction and recognition. For preprocessing, the input image is passed through a Gaussian filter, a binarization process and a skew detection technique. Then segmentation is carried out at line and character level, and features are extracted from the segmented output. After feature extraction, the Tamil character is recognized by an optimized artificial neural network, in which the weights are optimized by Elephant Herding Optimization. The performance of the proposed method is assessed with the metrics of sensitivity, specificity and accuracy. The approach is implemented in MATLAB, and its results are analyzed to visualize the performance.
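A sketch of the preprocessing stage only (Gaussian filtering, binarization, skew correction) using OpenCV; the kernel size and the deskew recipe are illustrative, and minAreaRect's angle convention varies across OpenCV versions.

```python
# Preprocess a scanned page: denoise, binarize with Otsu's threshold, then
# deskew using the dominant angle of the ink pixels.
import cv2
import numpy as np

def preprocess(path: str) -> np.ndarray:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)            # suppress scan noise
    _, binary = cv2.threshold(blurred, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    coords = cv2.findNonZero(binary)                       # ink pixel coordinates
    angle = cv2.minAreaRect(coords)[-1]                    # dominant skew angle
    if angle < -45:                                        # classic deskew fix-up
        angle += 90
    h, w = binary.shape
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(binary, rotation, (w, h))        # deskewed binary page

page = preprocess("tamil_page.png")
cv2.imwrite("tamil_page_clean.png", page)
```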
- Published
- 2019
- Full Text
- View/download PDF
23. The Use of Natural Language Processing Approach for Converting Pseudo Code to C# Code
- Author
-
Ayad Tareq Imam and Ayman Jameel Alnsour
- Subjects
Artificial intelligence, Natural language processing, Semantic role labeling, Verb classification, Thematic roles, Pseudocode, Automatic code generation, I-CASE, Thesaurus, Formatted text, Software, Information Systems - Abstract
Although current computer-aided software engineering tools support developers in composing a program, there is no doubt that more flexible supportive tools are needed to address the increasing complexity of programs. This need can be met by automating the intellectual activities that humans carry out when composing a program. This paper aims to automate the composition of programming language code from pseudocode, which is viewed here as a translation process for natural language text, since pseudocode is formatted text in natural English. Based on this view, a new automatic code generator (ACG), called CodeComposer, is developed that can convert pseudocode to C# programming language code. CodeComposer uses natural language processing (NLP) techniques such as verb classification, thematic roles, and semantic role labeling (SRL) to analyze the pseudocode. The resulting linguistic information is used by a semantic rule-based mapping machine to perform the composition process. CodeComposer can be viewed as an intelligent computer-aided software engineering (I-CASE) tool. An evaluation of the accuracy of CodeComposer using a binomial technique shows that it has a precision of 88%, a recall of 91%, and an F-measure of 89%.
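A quick check of the reported evaluation numbers: the F-measure is the harmonic mean of precision and recall.

```python
# F-measure = harmonic mean of precision and recall.
precision, recall = 0.88, 0.91
f_measure = 2 * precision * recall / (precision + recall)
print(f"{f_measure:.3f}")  # 0.895, which rounds to the reported 89%
```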
- Published
- 2019
- Full Text
- View/download PDF
24. Analyzing drivers’ preferences and choices for the content and format of variable message signs (VMS)
- Author
-
Helai Huang, Zhuanglin Ma, Mohammed A. Quddus, Jaeyoung Lee, and Wenjing Zhao
- Subjects
Transport engineering, Transportation, Discrete choice, Revealed preference, Distraction, Visibility, Service system, Automotive Engineering, Civil and Structural Engineering, Formatted text - Abstract
Background: Recent advances in variable message sign (VMS) technology have made it viable to provide spatio-temporal information on traffic and network conditions to drivers. There is a debate over whether VMS diverts drivers' attention away from the road and may cause unnecessary distraction in their driving tasks due to inconsistent VMS contents and formats. Other external factors, such as weather conditions, visibility and time of day, may also affect the integrity and reliability of a VMS. In China, only about 23% of drivers were persuaded by VMS to follow a route diversion. Objective: In order to capture the full benefits of VMS, the aim of this paper is to identify the factors affecting VMS by examining which kinds of VMS contents, formats and their interactions are most preferable to drivers, specifically in China. Methods: A revealed preference (RP) questionnaire and a stated preference (SP) survey comprising 1154 samples from private and taxi drivers were conducted and analyzed using a discrete choice model. Results: The results revealed that information shown as amber-on-black text, as a white-on-blue graph, or as suggested route diversion information shown on a single line is preferred by drivers in foggy weather. Highly educated drivers and drivers with no occupation are more drawn to qualitative delay-time information in a text-graph format in foggy weather. In normal weather, drivers on work trips mostly prefer to receive information on a congested traffic condition together with a reason in a text-only format, whereas a congested traffic condition with its apparent causes shown in red-on-black or green-on-black text-only format was least preferred. Regarding current and adjacent road traffic information, drivers prefer to receive the suggested route diversion in a graph-only format in foggy weather and the qualitative delay time in a text-graph format in normal weather. Irrespective of weather conditions, male drivers incline toward the qualitative delay time in a text-graph format. Conclusions: The findings of this study could assist traffic authorities in designing the most acceptable VMS for displaying traffic information, improving road traffic efficiency, and providing a theoretical basis for the design of in-vehicle personalized information service systems.
- Published
- 2019
- Full Text
- View/download PDF
25. Classical Chinese Poetry Generation based on Transformer-XL
- Author
-
Hyo Jong Lee and Jianli Zhao
- Subjects
Computer science, Deep learning, Natural language processing, Text processing, Classical Chinese poetry, Poetry, Transformer, Artificial intelligence, Formatted text - Abstract
Classical Chinese poetry is a kind of formatted text with phonological patterns, and generating it automatically has been a big challenge. Because deep learning technology has made great progress in image and text processing, we present a classical Chinese poetry generation model based on Transformer-XL, which can capture longer-term dependencies than the original Transformer. Our model creates the first character according to a title specified by the user, then concatenates each generated character onto the input to predict the next character, until it reaches the predefined number of characters. Experimental results show that our model can generate poems with good format, consistency and rhythm.
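A sketch of the character-by-character decoding loop described above, using the legacy Transformer-XL implementation in HuggingFace transformers; the public English transfo-xl-wt103 checkpoint stands in for the authors' model trained on a classical Chinese poetry corpus, and greedy decoding is an assumption.

```python
# Autoregressive loop: predict the next token, append it to the input, repeat.
import torch
from transformers import TransfoXLLMHeadModel, TransfoXLTokenizer

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")
model.eval()

ids = tokenizer.encode("moonlight over the river", return_tensors="pt")  # user-given title
for _ in range(20):                                   # predefined output length
    with torch.no_grad():
        scores = model(ids).prediction_scores         # next-token scores per position
    next_id = scores[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
    ids = torch.cat([ids, next_id], dim=-1)           # feed the prediction back in

print(tokenizer.decode(ids[0]))
```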
- Published
- 2021
- Full Text
- View/download PDF
26. Automated Proctoring System using Computer Vision Techniques
- Author
-
Sarthak Maniar, Krish Sukhani, Sudhir N. Dhage, and Krushna Shah
- Subjects
Computer science, Computer vision, Cheating, Distance education, Eye tracking, Identification, Artificial intelligence, Curriculum, Formatted text - Abstract
The arrival of COVID-19 has ushered in a new era of distance learning. Since schools and universities closed, learning has moved to apps such as Google Meet, Microsoft Teams and Zoom, and almost all colleges have changed their curricula to reflect the new reality. Students' practical knowledge deteriorated as a result of the virtual form of learning, and many began attending lectures only for attendance's sake. Under these conditions grades might be expected to decline, yet surprisingly many students outperformed their average scores. This is because there has been no way to conduct an organized evaluation in the online mode of education that prevents the use of unfair means, so a system that can help detect unfair tactics used by students is required. The deployment of proctoring procedures for online examinations is a major challenge for the research community. In this paper, we present how to create a complete multi-modal system that uses computer vision to remove the need for human invigilators throughout the examination. The proposed system includes a variety of checks on behaviours students may exhibit during the test, such as eye-gaze tracking, mouth open/close detection, object identification, and head-pose estimation using facial landmarks and face detection. Our system can also transform the student's voice into text format, which is useful for keeping track of the words spoken by the student and might aid the examiner in determining whether the student is speaking with someone nearby. In summary, this research shows how to prevent cheating in online tests using semi-automated proctoring based on vision and audio capabilities, while monitoring multiple students at a time.
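A sketch of the voice-to-text check only, using the SpeechRecognition package's Google Web Speech backend (microphone capture requires PyAudio); the paper's actual speech pipeline is not specified, and the flagged keywords are invented.

```python
# Capture a short audio window, transcribe it, and flag suspicious words for
# the examiner to review.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    audio = recognizer.listen(source, phrase_time_limit=10)

try:
    transcript = recognizer.recognize_google(audio)
except sr.UnknownValueError:
    transcript = ""  # nothing intelligible in this window

SUSPICIOUS = {"answer", "question", "tell", "send"}  # invented watch list
flagged = sorted(SUSPICIOUS & set(transcript.lower().split()))
print(transcript)
print("flag for examiner review:", flagged if flagged else "none")
```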
- Published
- 2021
- Full Text
- View/download PDF
27. Companion: Easy Navigation App for Visually Impaired Persons
- Author
-
Kinnary Uday Panchal, Taher Juzer Gari, Vaishali Chavan, and Dhruvisha Chetan Khara
- Subjects
Computer science, Computer vision, Object detection, Depth map, Bounding box, Android (operating system), Artificial intelligence, Formatted text - Abstract
Life without vision is filled with hardships. We propose a system to aid visually impaired persons in their day-to-day navigation by providing information about their surroundings as audio output (voice feedback) in real time. The system executes in three steps. The first is object detection using the You Only Look Once version 3 (YOLOv3) algorithm. Although multiple objects are detected in each frame, it is not practical to relay every object to the (visually impaired) user, so the closest and most relevant object must be found. This is done in the second stage by estimating depth with the help of a depth map and comparing the sizes of bounding boxes in real time to find the closest object. The output of the first two steps is generated in text format. The last step converts this text into audio output using a Text-To-Speech (TTS) API. As the system is intended to be inexpensive and readily available, it is implemented on Android smartphones and has no additional hardware requirements.
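A sketch of the closest-object heuristic and the audio step; detections are assumed to arrive as (label, x, y, w, h) boxes from YOLOv3, and pyttsx3 stands in for the Android TTS API mentioned in the abstract.

```python
# Pick the nearest detection by bounding-box size and announce it aloud.
import pyttsx3

def closest_object(detections):
    # Larger box area ~ nearer object: the size comparison described above.
    return max(detections, key=lambda d: d[3] * d[4])

detections = [
    ("person", 120, 80, 60, 150),
    ("car",     10, 90, 220, 180),   # biggest box -> assumed closest
    ("dog",    300, 200, 40, 30),
]

label, *_ = closest_object(detections)
engine = pyttsx3.init()
engine.say(f"{label} ahead")  # voice feedback for the user
engine.runAndWait()
```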
- Published
- 2021
- Full Text
- View/download PDF
28. Development and multicenter validation of chest X-ray radiography interpretations based on natural language processing
- Author
-
Chen Xu, Yaping Zhang, Xueqian Xie, Beibei Jiang, Geertruida H. de Bock, Jun Lan, Rozemarijn Vliegenthart, Yao Shen, Liu Mingqian, and Shundong Hu
- Subjects
Computer science, Radiography, Convolutional neural network, Automatic image annotation, Natural language processing, Triage, Medicine, Artificial intelligence, Formatted text - Abstract
Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports based on natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of the CNN using CXR data from multiple centers. We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. The linguistic entities and relationships were extracted from the unstructured radiology reports by a bidirectional encoder representations from transformers (BERT) model, and a knowledge graph was constructed to represent the associations between image labels of abnormal signs and the report text of the CXR. A 25-label classification system was then built to train and test the CNN models with weakly supervised labeling. In three external test cohorts of 5,996 symptomatic patients, 2,130 screening examinees, and 1,804 community clinic patients, the mean AUC for identifying the 25 abnormal signs by CNN reached 0.866 ± 0.110, 0.891 ± 0.147, and 0.796 ± 0.157, respectively. In symptomatic patients, the CNN showed no significant difference from local radiologists in identifying 21 signs (p > 0.05) but was poorer for 4 signs (p < 0.05). In screening examinees, it showed no significant difference for most signs (p > 0.05) but was poorer at classifying nodules (p = 0.013). In community clinic patients, it showed no significant difference for 12 signs (p > 0.05) and performed better for 6 signs (p < 0.05).
- Published
- 2021
29. Application of Deep Learning for Weapons Detection in Surveillance Videos
- Author
-
Nazeef Ul Haq, Muhammad Moazam Fraz, Muhammad Shahzad, and Tufail Sajjad Shah Hashmi
- Subjects
Computer science, Deep learning, Convolutional neural network, Object detection, Annotation, Data set, XML, Artificial intelligence, Formatted text - Abstract
Weapon detection is a serious issue for the security and safety of the general public, and no doubt a hard and difficult task, especially when it must be done automatically with an AI model. Different object detection models are available, but for weapons it is difficult to detect objects of widely varying sizes and shapes against backgrounds of different colors. Currently, a great number of Convolutional Neural Network (CNN)-based deep learning approaches have been proposed for real-time recognition and classification. In this paper, we present a comparative analysis of two state-of-the-art models, YOLOv3 and YOLOv4, for weapons detection. For training purposes, we created a weapons dataset with images collected from Google Images along with a number of other sources. We annotated the images manually, one by one, in different formats, since YOLO needs annotation files in text format while some other models need annotation files in XML format. We trained both versions on this large dataset of weapons and then tested their results for comparative analysis. We show that YOLOv4 performs clearly better than YOLOv3 in terms of processing time and sensitivity, while the two are comparable on the precision metric. The implementation details and trained models are made public at this link: https://cutt.ly/5kBEPhM.
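A sketch of the annotation conversion the paper alludes to: one Pascal VOC XML box becomes one line of YOLO text format, "<class_id> <x_center> <y_center> <width> <height>" normalized to [0, 1]; the class list is illustrative.

```python
# Convert a Pascal VOC XML annotation file into YOLO-format text lines.
import xml.etree.ElementTree as ET

CLASSES = ["pistol", "rifle"]  # illustrative class list

def voc_to_yolo(xml_path: str) -> str:
    root = ET.parse(xml_path).getroot()
    img_w = int(root.find("size/width").text)
    img_h = int(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (float(box.find(k).text)
                                  for k in ("xmin", "ymin", "xmax", "ymax"))
        x_c = (xmin + xmax) / 2 / img_w   # box center, normalized
        y_c = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w         # box size, normalized
        h = (ymax - ymin) / img_h
        lines.append(f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return "\n".join(lines)

print(voc_to_yolo("image_0001.xml"))
```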
- Published
- 2021
- Full Text
- View/download PDF
30. An Introduction to the Digital Analysis of Stationary Signals
- Author
-
I P Castro
- Subjects
Computer science, Digital analysis, Microcomputer, Software, Formatted text - Abstract
An Introduction to the Digital Analysis of Stationary Signals: A Computer Illustrated Text directly illustrates the various techniques required to make accurate measurements of the properties of fluctuating signals. Emphasis is on qualitative ideas rather than detailed mathematical analysis, a purpose for which the computer-illustrated text format is ideally suited: the author reinforces the usual figures and diagrams with computer-generated graphical displays produced dynamically by the student. The package of text and accompanying software is not specific to any particular microcomputer.
- Published
- 2021
- Full Text
- View/download PDF
31. Text mining with sentiment analysis on seafarers’ medical documents
- Author
-
Marzio Di Canio, Nalini Chintalapudi, Gopi Battineni, Francesco Amenta, and Getu Gamo Sagaro
- Subjects
Text mining, Sentiment analysis, Naive Bayes classifier, Machine learning, Word clouds, Seafarers, Medical record, Digital health, Health care, Data science, Information technology, Formatted text - Abstract
Digital health systems contain large amounts of patient records, doctor notes, and prescriptions in text format. Summarizing this electronic clinical information can lead to improved quality of healthcare, fewer medical errors, and lower costs. Moreover, seafarers are more vulnerable to accidents and prone to health hazards because of work culture, climatic changes, and personal habits; applying text mining to seafarers' medical documents can therefore generate better knowledge of the medical issues that often occur on board. Medical records were collected from the digital health systems of Centro Internazionale Radio Medico (C.I.R.M.), the Italian Telemedical Maritime Assistance System (TMAS). Three years (2018-2020) of patient data were used for the analysis. Both lexicon-based and Naive Bayes algorithms were adopted for sentiment analysis, and experiments were conducted with the R statistical tool. Symptomatic information was visualized through word clouds, and a 96% correlation between medical problems and diagnosis outcomes was achieved. We validated the sentiment analysis with more than 80% accuracy and precision.
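A toy sketch of the lexicon-based scoring idea; the paper's experiments are in R, and the lexicon below is invented for illustration.

```python
# Score a clinical note by summing per-word lexicon weights, then map the
# total to a sentiment class.
LEXICON = {"improved": 1, "recovered": 1, "stable": 1,
           "pain": -1, "severe": -2, "injury": -1}

def sentiment(note: str) -> str:
    score = sum(LEXICON.get(w.strip(".,"), 0) for w in note.lower().split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("Severe chest pain reported after deck injury."))  # negative
print(sentiment("Patient recovered, condition stable."))           # positive
```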
- Published
- 2021
32. Audiovisual content for Radiology Fellowship selection process, A pilot study using online questionnaires with smartphones in the time of the COVID-19 pandemic. (Preprint)
- Author
-
Tomás De Andrade Lourenço Freddi, Abdalla Skaf, Luís Pecci Neto, Ivan R. B. Godoy, Dany Jasinowodolinski, Hilton Muniz Leão-Filho, and André Fukunishi Yamada
- Subjects
Radiology, Neuroradiology, Subspecialty, Education, Computer-assisted web interviewing, Social distance, Presentation, Formatted text, Psychology - Abstract
BACKGROUND: Traditional radiology fellowships are usually one- or two-year clinical training programs in a specific area, taken after completion of a four-year residency. OBJECTIVE: Our purpose was to investigate the experience of fellowship applicants in answering radiology questions in an audiovisual format on their own smartphones, after answering radiology questions in a traditional printed text format, as part of the application process during the Coronavirus Disease 2019 (COVID-19) pandemic. Our hypothesis was that fellowship applicants would find that recorded audiovisual radiology content adds value to the conventional selection process, may increase engagement through the use of their own smartphone device, and facilitates understanding of the imaging findings in radiology-based questions, while maintaining social distancing. METHODS: One senior staff radiologist from each subspecialty prepared four audiovisual radiology questions for that subspecialty. We conducted a survey using online questionnaires for 123 fellowship applicants to musculoskeletal (39), internal medicine (61) and neuroradiology (23) programs to evaluate the experience of using audiovisual radiology content as a substitute for the conventional text evaluation. RESULTS: The great majority of applicants (99%) agreed or strongly agreed that images in digital form have superior quality to those printed on paper. Eighty-two percent agreed that presenting the cases in audiovisual format facilitates understanding of the findings, and most candidates agreed or strongly agreed that answering digital forms is more practical than conventional paper forms (65%). CONCLUSIONS: Using audiovisual content in the selection process for radiology fellowships is a new approach with the potential to enhance the applicant's experience, and it allows candidates to be evaluated without in-person interactions. Further studies could streamline these methods to minimize redundancy with traditional text tests, or evaluate the acceptance of using audiovisual content on smartphones alone.
- Published
- 2021
- Full Text
- View/download PDF
33. Novel Technique for Script Translation using NLP: Performance Evaluation
- Author
- Darshana Patil, S.B. Chaudhari, and Sharmila Shinde
- Subjects
Machine translation ,Computer science ,business.industry ,Regional language ,Rule-based system ,computer.file_format ,Hybrid machine translation ,computer.software_genre ,language.human_language ,Comprehension ,language ,Formatted text ,Artificial intelligence ,Marathi ,business ,computer ,Natural language ,Natural language processing - Abstract
Data can be represented in various languages, such as Hindi, Marathi, and English, on social media and in various knowledge resources, and it is open to users in different formats. Comprehending this information is very difficult for uneducated individuals, and people often struggle to read or comprehend meaningful data available on social media. Thus, researchers have used NLP to develop different machine translation techniques, such as rule-based, statistical, and hybrid machine translation. Natural language is incredibly rich in form and composition, and highly ambiguous; natural language understanding is much harder than natural language generation. In this research manuscript we present the MVET (Marathi Video to English Text) technique, which translates Marathi video content into English text format. This is very useful for people who do not speak Marathi, as people often lose valuable information because of regional language barriers. We have developed the MVET machine translation technique to solve this problem, and it achieves higher precision than existing statistical, rule-based, and hybrid methods.
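The paper does not publish its implementation, so the sketch below is a hypothetical MVET-style pipeline assembled from common off-the-shelf components: moviepy to pull the audio track from the video, the speech_recognition package to transcribe Marathi speech, and a placeholder function standing in for the translation model.

```python
# Hypothetical sketch of an MVET-style pipeline: video -> audio ->
# Marathi transcript -> English text. Library choices are assumptions.
import speech_recognition as sr
from moviepy.editor import VideoFileClip

def marathi_video_to_english_text(video_path: str) -> str:
    # Step 1: pull the audio track out of the video file.
    VideoFileClip(video_path).audio.write_audiofile("speech.wav")

    # Step 2: transcribe the Marathi speech to Marathi text.
    recognizer = sr.Recognizer()
    with sr.AudioFile("speech.wav") as source:
        audio = recognizer.record(source)
    marathi_text = recognizer.recognize_google(audio, language="mr-IN")

    # Step 3: translate Marathi text to English. Placeholder for the MT
    # backend (statistical, rule-based or hybrid) the authors compare.
    return translate_mr_to_en(marathi_text)

def translate_mr_to_en(text: str) -> str:
    # Hypothetical stand-in for the paper's translation model.
    raise NotImplementedError("plug in a Marathi-to-English MT model")
```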
- Published
- 2021
- Full Text
- View/download PDF
34. RCytoGPS: an R package for reading and visualizing cytogenetics data
- Author
- Zachary B. Abrams, Kevin R. Coombes, Dwayne G. Tally, and Lynne V. Abruzzo
- Subjects
Statistics and Probability ,medicine.medical_specialty ,Computer science ,media_common.quotation_subject ,Biochemistry ,Field (computer science) ,Software ,Reading (process) ,medicine ,Formatted text ,Molecular Biology ,media_common ,computer.programming_language ,Supplementary data ,Information retrieval ,business.industry ,Cytogenetics ,Genetic data ,computer.file_format ,Applications Notes ,JSON ,Computer Science Applications ,Computational Mathematics ,R package ,Computational Theory and Mathematics ,business ,computer - Abstract
Summary: Cytogenetics data, or karyotypes, are among the most common clinically used forms of genetic data. Karyotypes are stored as standardized text strings using the International System for Human Cytogenomic Nomenclature (ISCN). Historically, these data have not been used in large-scale computational analyses due to limitations in the ISCN text format and structure. Recently developed computational tools such as CytoGPS have enabled large-scale computational analyses of karyotypes. To further enable such analyses, we have now developed RCytoGPS, an R package that takes JSON files generated from CytoGPS.org and converts them into objects in R. This conversion facilitates the analysis and visualization of karyotype data. In effect, this tool streamlines the process of performing large-scale karyotype analyses, thus advancing the field of computational cytogenetic pathology. Availability and Implementation: Freely available at https://CRAN.R-project.org/package=RCytoGPS. Supplementary information: Supplementary data are available at Bioinformatics online.
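The package itself is written in R; as a language-neutral illustration of the conversion step it performs, the sketch below reads a CytoGPS-style JSON export into plain records. The field names here are assumptions for illustration, not the documented CytoGPS.org schema.

```python
# Illustrative sketch (not the RCytoGPS implementation, which is in R)
# of turning a CytoGPS-style JSON file into flat records for analysis.
import json

def load_karyotypes(path: str) -> list[dict]:
    """Read a CytoGPS-style JSON export into plain Python records."""
    with open(path) as fh:
        payload = json.load(fh)
    records = []
    for entry in payload:  # assumed layout: one object per karyotype
        records.append({
            "karyotype": entry.get("karyotype"),  # original ISCN string
            "status": entry.get("status"),        # parse success/failure
            "vectors": entry.get("vectors"),      # assumed binary encoding
        })
    return records
```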
- Published
- 2021
- Full Text
- View/download PDF
35. Bengali Sentiment Analysis of E-commerce Product Reviews using K-Nearest Neighbors
- Author
- Mst. Tuhin Akter, Rashed Mustafa, and Manoara Begum
- Subjects
business.industry ,Computer science ,Sentiment analysis ,06 humanities and the arts ,02 engineering and technology ,computer.file_format ,E-commerce ,0603 philosophy, ethics and religion ,computer.software_genre ,language.human_language ,Random forest ,Support vector machine ,Statistical classification ,Bengali ,0202 electrical engineering, electronic engineering, information engineering ,language ,020201 artificial intelligence & image processing ,Formatted text ,The Internet ,060301 applied ethics ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
Sentiment analysis of the Bengali language is becoming a popular research topic. Sentiment analysis is a useful technique in opinion mining, emotion extraction, and trend prediction; through it, the actual sentiment of a text review can be extracted. Every day, people use the internet for different purposes and leave their opinions or views in various places on the internet in text format. An opinion or review on the internet can contain the author's positive, negative, or neutral view of a statement. This study proposes a machine learning-based model to predict a user's sentiment (positive, neutral, or negative) from a Bangla text review. We applied five machine learning algorithms to our dataset, which we gathered manually from a Bangladeshi e-commerce site called “Daraz”: Random Forest classifier, Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost. KNN performs best among these five algorithms on all the performance measures of accuracy, precision, recall, and F1-score, showing 96.25% accuracy and 0.96 in each of precision, recall, and F1-score.
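A minimal sketch of the winning KNN pipeline, assuming TF-IDF features (the abstract does not state the feature representation); the two toy Bangla reviews are invented stand-ins for the Daraz dataset.

```python
# Minimal KNN sentiment sketch on toy Bangla reviews (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

reviews = ["খুব ভালো পণ্য", "খারাপ মানের, টাকা নষ্ট"]  # toy positive/negative
labels = ["positive", "negative"]

# TF-IDF turns each review into a weighted bag of words; KNN then labels
# a new review by its nearest labelled neighbour in that space.
model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(reviews, labels)
print(model.predict(["পণ্যটি ভালো"]))  # expect: positive
```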
- Published
- 2021
- Full Text
- View/download PDF
36. Text classification to streamline online wildlife trade analyses
- Author
- Lewis Mitchell, Oliver C. Stringham, Stephanie Moncayo, Joshua V. Ross, Adam Toomes, Phillip Cassey, and Katherine G. W. Hill
- Subjects
0106 biological sciences ,010504 meteorology & atmospheric sciences ,Computer science ,Social Sciences ,Wildlife ,01 natural sciences ,Poultry ,Machine Learning ,Sociology ,Advertising ,sort ,Formatted text ,Computer Networks ,internet ,Classifieds ,Marketing ,Multidisciplinary ,Commerce ,Eukaryota ,computer.file_format ,internet.website_category ,Wildlife trade ,Identification (information) ,Area Under Curve ,Vertebrates ,Medicine ,Pigeons ,The Internet ,F1 score ,Research Article ,Computer and Information Sciences ,Science ,Context (language use) ,Animals, Wild ,010603 evolutionary biology ,Birds ,Artificial Intelligence ,Animals ,Domestic Animals ,0105 earth and related environmental sciences ,Text Messaging ,Internet ,Information retrieval ,business.industry ,Organisms ,Biology and Life Sciences ,Models, Theoretical ,Communications ,ROC Curve ,Sample Size ,Amniotes ,business ,computer ,Zoology - Abstract
Automated monitoring of websites that trade wildlife is increasingly necessary to inform conservation and biosecurity efforts. However, e-commerce and wildlife trading websites can contain a vast number of advertisements, an unknown proportion of which may be irrelevant to researchers and practitioners. Given that many wildlife-trade advertisements have an unstructured text format, automated identification of relevant listings has not traditionally been possible, nor attempted. Other scientific disciplines have solved similar problems using machine learning and natural language processing models, such as text classifiers. Here, we test the ability of a suite of text classifiers to extract relevant advertisements from wildlife trade occurring on the Internet. We collected data from an Australian classifieds website where people can post advertisements of their pet birds (n = 16.5k advertisements). We found that text classifiers can predict, with a high degree of accuracy, which listings are relevant (ROC AUC ≥ 0.98, F1 score ≥ 0.77). Furthermore, in an attempt to answer the question ‘how much data is required to have an adequately performing model?’, we conducted a sensitivity analysis by simulating decreases in sample sizes to measure the subsequent change in model performance. From our sensitivity analysis, we found that text classifiers required a minimum sample size of 33% (c. 5.5k listings) to accurately identify relevant listings (for our dataset), providing a reference point for future applications of this sort. Our results suggest that text classification is a viable tool that can be applied to the online trade of wildlife to reduce time dedicated to data cleaning. However, the success of text classifiers will vary depending on the advertisements and websites, and will therefore be context dependent. Further work to integrate other machine learning tools, such as image classification, may provide better predictive abilities in the context of streamlining data processing for wildlife trade related online data.
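The two steps described above, fitting a text classifier to predict listing relevance and then shrinking the training set to measure how performance degrades, can be sketched as follows. The toy listings, the logistic-regression model, and the simulated fractions are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: classify advertisement relevance, then simulate smaller
# training sets to probe the sample-size sensitivity described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

ads = ["quaker parrot hand raised", "bird cage for sale",
       "pair of zebra finches", "seed and feeders, used"] * 25
relevant = [1, 0, 1, 0] * 25   # 1 = live bird listing, 0 = irrelevant

X_train, X_test, y_train, y_test = train_test_split(
    ads, relevant, test_size=0.3, random_state=0)

for fraction in (1.0, 0.66, 0.33):   # simulated training-set reductions
    n = max(2, int(len(X_train) * fraction))
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(X_train[:n], y_train[:n])
    score = f1_score(y_test, model.predict(X_test))
    print(f"train fraction {fraction:.2f}: F1 = {score:.2f}")
```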
- Published
- 2021
37. CNN based Optical Character Recognition and Applications
- Author
- Muni Sekhar Velpuru, Naragudem Sarika, and Nageswararao Sirisala
- Subjects
business.industry ,Computer science ,Deep learning ,Search engine indexing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Character encoding ,Optical character recognition ,computer.file_format ,ASCII ,computer.software_genre ,Convolutional neural network ,Character (mathematics) ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Formatted text ,Artificial intelligence ,business ,computer - Abstract
The procedure of translating images of handwritten, typewritten, or typed text into a format recognized by computers is called Optical Character Recognition (OCR). Editing, indexing, searching, and storage-space reduction are among the uses of OCR. It works by first scanning the picture of the text character by character, then processing the scanned image, and eventually converting each character image into character codes, such as ASCII. The OCR system is used to translate text in an image into text format. There are four key aspects of the OCR approach: pre-processing, character segmentation, character recognition, and presentation of data. A Convolutional Neural Network (CNN) is a deep learning method used for character recognition. In this paper, CNN layers, architecture, and the implementation of a CNN architecture are discussed. Here a CNN (VGG-16) model is trained on a Telugu character dataset covering up to 1,600 characters, and its accuracy is measured.
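A compact VGG-style sketch of such a character classifier is shown below. The paper uses full VGG-16; the layer sizes and the 64x64 input shape here are reduced, assumed values for illustration, with 1,600 output units matching the reported character set size.

```python
# Reduced VGG-style CNN sketch for character classification
# (illustrative; not the paper's full VGG-16 configuration).
from tensorflow.keras import layers, models

NUM_CLASSES = 1600   # size of the Telugu character set reported above

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),          # grayscale character image
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),                    # downsample, as in VGG blocks
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # one unit per character
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```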
- Published
- 2021
- Full Text
- View/download PDF
38. Resume Data Extraction Using NLP
- Author
- Aman Adhikari, Umang Goyal, Anirudh Negi, Subhash Chand Gupta, and Tanupriya Choudhury
- Subjects
Parsing ,Computer science ,business.industry ,Download ,Process (computing) ,computer.file_format ,computer.software_genre ,Upload ,Data extraction ,Relational model ,Formatted text ,Artificial intelligence ,business ,Relevant information ,computer ,Natural language processing - Abstract
This work extracts valuable and relevant information from a potential employee's CV to ease the hiring process for employers, automating data extraction, parsing documents of multiple formats, and storing data in a standardized relational database model. The user uploads one or more resumes into the program; the program accepts multiple formats (.pdf, .doc, .rtf, etc.), converts each into a standard text format, parses it for the required information, and organizes the extracted data in a standard defined format. The user can then download the extracted information in .CSV format.
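Once a resume has been converted to plain text, the extraction step can be sketched with regular expressions, as below. The patterns, the chosen fields, and the sample resume are illustrative assumptions; a real system would add per-format converters and many more fields.

```python
# Sketch of the parse-and-export step: pull common fields from resume
# text with regexes and write the rows to CSV (illustrative patterns).
import csv
import re

def parse_resume(text: str) -> dict:
    email = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
    phone = re.search(r"\+?\d[\d\s-]{8,}\d", text)
    return {"email": email.group() if email else "",
            "phone": phone.group() if phone else ""}

resumes = ["Jane Doe\njane.doe@example.com\n+91 98765 43210\nPython, SQL"]
rows = [parse_resume(r) for r in resumes]

with open("extracted.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["email", "phone"])
    writer.writeheader()
    writer.writerows(rows)
```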
- Published
- 2021
- Full Text
- View/download PDF
39. Automatic Audio Replacement of Objectionable Content for Sri Lankan Locale
- Author
- Janarthan Jeyachandran, N. S. Weerakoon, Tharshvini Pathmaseelan, M. S. M. Siriwardane, R. K. N. D. Jayawardhane, and Gobiga Rajalingam
- Subjects
business.industry ,Computer science ,Offensive ,computer.file_format ,Filter (signal processing) ,Term (logic) ,computer.software_genre ,language.human_language ,Binary classification ,Tamil ,language ,Formatted text ,Artificial intelligence ,business ,computer ,Word (computer architecture) ,Natural language processing ,Sentence - Abstract
Fake news, hate speech, crude language, and ethnic and racial slurs spread widely every day, yet in Sri Lanka there is no definite solution to protect society from such profanities. The proposed method detects racist, sexist, and cursing objectionable content in the Sinhala, Tamil, and English languages. To selectively filter out potentially objectionable audio content, the input audio is first preprocessed and converted into text format, and then objectionable content is detected with a machine learning filtering mechanism. To validate its offensive nature, a preliminary filtering model takes the converted sentences as input and classifies them through binary classification. When a text is classified as offensive, secondary filtering is carried out with a separate multi-class text classification model, which classifies each word in the sentence into sexist, racist, cursing, or non-offensive categories. The preliminary filtering models combine the Term Frequency–Inverse Document Frequency (TF-IDF) vectorizer and the Support Vector Machine algorithm with varying hyperparameters. For the multi-class classification model, the combination of Logistic Regression (LR) and CountVectorizer was used for Sinhala, Multinomial Naive Bayes with the TF-IDF vectorizer was found suitable for Tamil, and LR with CountVectorizer was chosen for English. The system has detection accuracies of 89% for Sinhala and 77% for Tamil. Finally, the detected objectionable content is replaced in the audio with a predetermined audio input.
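A minimal sketch of the two-stage filter described above: a TF-IDF plus SVM model flags offensive sentences, and a CountVectorizer plus Logistic Regression model labels individual words. All training examples are invented placeholders (with slurs replaced by tokens) for the real corpora, and the character n-gram setting in stage 2 is an assumption.

```python
# Two-stage objectionable-content filter sketch (toy placeholder data).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stage 1: sentence-level binary classifier (offensive vs not).
sentences = ["have a nice day", "BADWORD you", "see you soon", "BADWORD idiot"]
offensive = [0, 1, 0, 1]
stage1 = make_pipeline(TfidfVectorizer(), SVC())
stage1.fit(sentences, offensive)

# Stage 2: word-level multi-class classifier over offence categories.
words = ["BADWORD", "idiot", "RACIALSLUR", "SEXISTSLUR", "day", "soon"]
classes = ["cursing", "cursing", "racist", "sexist", "none", "none"]
stage2 = make_pipeline(CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                       LogisticRegression())
stage2.fit(words, classes)

text = "BADWORD you"
if stage1.predict([text])[0] == 1:          # sentence flagged offensive
    for word in text.split():
        print(word, "->", stage2.predict([word])[0])  # words to replace in audio
```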
- Published
- 2021
- Full Text
- View/download PDF
40. SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
- Author
- Patrick K. O'Neill, Vitaly Lavrukhin, Shinji Watanabe, Keenan Freyberg, Yuekai Zhang, Jagadeesh Balam, Boris Ginsburg, Somshubra Majumdar, Yuliya Dovzhenko, Oleksii Kuchaiev, Michael D. Shulman, Vahid Noroozi, and Georg Kucsko
- Subjects
FOS: Computer and information sciences ,Computer Science - Computation and Language ,Computer science ,Speech recognition ,media_common.quotation_subject ,SIGNAL (programming language) ,computer.file_format ,Punctuation ,Denormalization ,Task (project management) ,Disk formatting ,Audio and Speech Processing (eess.AS) ,FOS: Electrical engineering, electronic engineering, information engineering ,Formatted text ,Transcription (software) ,computer ,Computation and Language (cs.CL) ,Orthography ,media_common ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models. This adds complexity and limits performance, as many formatting tasks benefit from semantic information present in the acoustic signal but absent in transcription. Here we propose a new STT task: end-to-end neural transcription with fully formatted text for target labels. We present baseline Conformer-based models trained on a corpus of 5,000 hours of professionally transcribed earnings calls, achieving a CER of 1.7. As a contribution to the STT research community, we release the corpus free for non-commercial use at https://datasets.kensho.com/datasets/scribe.
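The CER figure above is character error rate: the character-level edit distance between hypothesis and reference, divided by the reference length. A minimal reference implementation (illustrative, not the authors' evaluation code) makes clear why a fully formatted target is harder: casing and punctuation mistakes count as errors.

```python
# Minimal character error rate (CER): edit distance / reference length.
def cer(reference: str, hypothesis: str) -> float:
    r, h = reference, hypothesis
    # Dynamic-programming Levenshtein distance over characters.
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(r, 1):
        curr = [i]
        for j, hc in enumerate(h, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (rc != hc)))   # substitution
        prev = curr
    return prev[-1] / len(r)

# Casing and punctuation count against a formatted-text model:
print(cer("Revenue grew 5%.", "revenue grew 5%"))  # 2 edits / 16 chars
```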
- Published
- 2021
- Full Text
- View/download PDF
41. Offline Odia Handwritten Characters Recognition Using WEKA Environment
- Author
- Sarojananda Mishra and Anupama Sahu
- Subjects
Character (computing) ,Computer science ,business.industry ,computer.file_format ,Optical character recognition ,computer.software_genre ,language.human_language ,Field (computer science) ,Telugu ,Naive Bayes classifier ,ComputingMethodologies_PATTERNRECOGNITION ,Bengali ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,language ,Formatted text ,Artificial intelligence ,Decision table ,business ,computer ,Natural language processing - Abstract
Optical character recognition (OCR) is a document image analysis technique in which scanned digital images containing handwritten or machine-printed script are used as input to a system that converts them into an editable, machine-readable text format. In the current era, OCR development for regional scripts such as Odia, Telugu, and Bengali is an active field of cutting-edge research. The Odia language in particular poses a great challenge to OCR developers due to the various categories of characters in its alphabet, the need to combine different letters, and the many roundish, loop-like characters that resemble one another. In this paper, the WEKA software is used to build classification models for offline character recognition. This research work develops a novel approach to classifying offline handwritten Odia characters using Naive Bayes and decision-table classifiers in the WEKA environment.
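The paper works inside the WEKA environment; as an illustrative analogue outside WEKA, the same Naive Bayes classification of character images can be sketched with scikit-learn on flattened pixel features. The random arrays below are stand-ins for real scanned Odia characters, so the reported score is only chance level.

```python
# Illustrative Naive Bayes analogue of the WEKA workflow (toy data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Stand-in for scanned Odia characters: 200 fake 16x16 images, 10 classes.
X = rng.random((200, 16 * 16))
y = rng.integers(0, 10, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
print("accuracy on random toy data:", clf.score(X_te, y_te))  # ~ chance level
```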
- Published
- 2021
- Full Text
- View/download PDF
42. THIRD EYE: A System to Help the Visually Impaired Students in Academics
- Author
- Salam Vaheetha, V. Vinayakrishnan, S. Nadera Beevi, and J. Midhun Chandran
- Subjects
Multimedia ,Computer science ,Headset ,Speech synthesis ,Formatted text ,computer.file_format ,Voice command device ,Android (operating system) ,Digital library ,Braille ,computer.software_genre ,computer ,Task (project management) - Abstract
Education has always been a strenuous task for visually impaired students, with Braille's system serving as the primary technique for blind education since 1824. In this paper, we propose a technology-aided system named THIRD EYE: A System to Help the Visually Impaired Students in Academics, which functions on voice commands especially designed to help visually impaired students. The system converts the text in textbooks into audio files using a text-to-speech conversion technique; it also converts English-language textbooks to audio in the Malayalam language and sets up a digital library to store these converted audio files, which will help the students in their studies. This reduces human effort and proves handy when syllabi change. The system also helps students write their examinations without the help of a scribe by automatically recording the dictated answer and converting the speech into the corresponding text format; a provision to replay the answer is provided so corrections can be made if required. This is an Android and web-based application, which requires a headset with a microphone as an external device. The proposed system marks the beginning of new hope and an emancipation from the requisites described above.
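The paper does not name its software stack, so the sketch below uses two common off-the-shelf libraries as assumptions: gTTS for the textbook-to-audio conversion and speech_recognition for transcribing dictated answers. The translation of English lesson text into Malayalam is omitted here; only the two speech conversions are shown.

```python
# Sketch of the two conversions the system performs (assumed libraries).
import speech_recognition as sr
from gtts import gTTS

# Textbook text -> audio file for the digital library
# ('ml' selects a Malayalam voice; translation step not shown).
lesson = "Photosynthesis is the process by which plants make food."
gTTS(text=lesson, lang="ml").save("lesson_ml.mp3")

# Dictated exam answer -> text, with the headset microphone as input.
recognizer = sr.Recognizer()
with sr.Microphone() as source:
    audio = recognizer.listen(source)
answer_text = recognizer.recognize_google(audio, language="en-IN")
print(answer_text)   # could be replayed/edited before submission
```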
- Published
- 2021
- Full Text
- View/download PDF
43. SOT: An Application-based Research for Translate Natural Language from Image
- Author
- Ali Md Musfick Jamil, Ishrak Islam Zarif, Masud Rabbani, and Thaharim Khan
- Subjects
Hindi ,Copying ,business.industry ,Computer science ,computer.file_format ,Optical character recognition ,computer.software_genre ,language.human_language ,Image (mathematics) ,Bengali ,language ,Formatted text ,Artificial intelligence ,business ,computer ,Natural language processing ,Natural language - Abstract
This chapter proposes a new and innovative method of smart optical character recognition for translation (SOT), which translates an English-text image into the Bangla and Hindi languages. For translation, this OCR system can take any picture from any document that contains printed English sentences, and SOT can translate the image into text in the two languages. The translated text from SOT can also be copied and kept for further use. SOT can convert the full image into text format and can also translate selected portions of an image by cropping. Google Translate is used here for the OCR; this API can translate the image into printed, machine-encoded text, but its translated text cannot be kept for further use, which is an immense disparity between SOT and Google Translate. Moreover, many deployed APIs can translate an image into text format but do not allow the text obtained from the image to be copied. SOT stands out among OCR systems because it removes this restriction on copying text.
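A sketch of the SOT flow under stated assumptions: Tesseract (via pytesseract) stands in for the OCR step, since the chapter's Google-based OCR is accessed through a proprietary API, and the translation call is a placeholder for any English-to-Bangla/Hindi backend.

```python
# Hypothetical SOT-style flow: OCR the image, then translate the text.
from PIL import Image
import pytesseract

def image_to_translations(image_path: str) -> dict:
    english_text = pytesseract.image_to_string(Image.open(image_path))
    return {
        "english": english_text,                    # copyable for later use
        "bangla": translate(english_text, target="bn"),
        "hindi": translate(english_text, target="hi"),
    }

def translate(text: str, target: str) -> str:
    # Placeholder: plug in any English->Bangla/Hindi translation service.
    raise NotImplementedError
```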
- Published
- 2021
- Full Text
- View/download PDF
44. Automatic Data Collecting and Application of the Touch Probing System on the CNC Machine Tool
- Author
- Die Zhang, Xin Shen, Xinyu Jiang, Wenliang Zi, and Zhi Tang
- Subjects
0209 industrial biotechnology ,Engineering drawing ,Article Subject ,Computer science ,General Mathematics ,Controller (computing) ,Big data ,02 engineering and technology ,020901 industrial engineering & automation ,Software ,0202 electrical engineering, electronic engineering, information engineering ,QA1-939 ,Formatted text ,business.industry ,General Engineering ,Process (computing) ,computer.file_format ,Engineering (General). Civil engineering (General) ,Numerical control ,020201 artificial intelligence & image processing ,TA1-2040 ,business ,computer ,Host (network) ,Injection molding machine ,Mathematics - Abstract
To realize automatic data collection from the touch probing system on a CNC machine tool and to apply big-data technology to the CNC machining process, a special NC program was developed on the Siemens 840D SL controller to record data in a defined text format and upload them automatically to the host computer. With the help of DB management software on the host PC, the data obtained were sent into the MES database regularly, realizing automatic collection of manufacturing process information. Three applications based on big-data technology are presented: dual active error detection on the probing system, geometrical accuracy monitoring, and management of cutting parameters and tool life. Cutting tests on the platen of an injection molding machine with a PAMA SR3000 floor-type CNC boring-milling machine proved that the new technology achieves its design objectives.
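The record layout and database are not published; the sketch below assumes a simple key=value text format written by the NC program and uses SQLite as a stand-in for the MES database on the host PC, to show the parse-and-store step in miniature.

```python
# Host-side sketch: parse a probe record in an assumed text format and
# store it (SQLite stands in for the MES database; fields are invented).
import sqlite3

sample_record = "PART=platen_07;PROBE_X=120.004;PROBE_Y=64.998;DEV=0.006"

def parse_record(line: str) -> dict:
    return dict(field.split("=") for field in line.strip().split(";"))

conn = sqlite3.connect("mes_standin.db")
conn.execute("""CREATE TABLE IF NOT EXISTS probe_data
                (part TEXT, probe_x REAL, probe_y REAL, deviation REAL)""")
rec = parse_record(sample_record)
conn.execute("INSERT INTO probe_data VALUES (?, ?, ?, ?)",
             (rec["PART"], float(rec["PROBE_X"]),
              float(rec["PROBE_Y"]), float(rec["DEV"])))
conn.commit()
conn.close()
```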
- Published
- 2021
- Full Text
- View/download PDF
45. Hashtag as modern text format in linguistics
- Author
- Svetlana A. Burikova and Ekaterina Ovchinnikova
- Subjects
lcsh:LC8-6691 ,Continuous sampling ,Social network ,lcsh:Special aspects of education ,Computer science ,business.industry ,hashtag ,Linguistics ,computer.file_format ,Quantitative analysis (finance) ,Instagram ,Formatted text ,The Internet ,hashtag formation and text ,internet ,business ,lcsh:L ,computer ,lcsh:Education - Abstract
Nowadays, modern Internet technologies play an important role in our lives. The present article deals with the peculiarities of hashtags as a new text format used in social networks. The article presents the characteristic features of hashtag text, a classification of hashtag functions, types of hashtags according to their construction, and the position of hashtags in the post. Different types of hashtags were subjected to analysis. The continuous sampling method and descriptive qualitative and quantitative analysis allowed the authors to draw conclusions about hashtags as linguistic tools. As a result, five hashtag functions and six hashtag types were identified. These findings help in understanding modern online discourse and support the idea that hashtags are meaningful elements of social network communication.
- Published
- 2021
46. Soil Analysis and its Type Prediction with Speech Enabled Output using IoT and AWS
- Author
- Aniket Patil, Suman Mohapatra, and Alivarani Mohapatra
- Subjects
Support vector machine ,Mean squared error ,Soil test ,Linear regression ,Value (computer science) ,Formatted text ,computer.file_format ,Data mining ,Soil type ,computer.software_genre ,computer ,Mathematics ,Random forest - Abstract
This paper proposes a new technique that predicts soil type and gives correct information to farmers in audio format for improved cultivation. It collects different soil parameters, such as soil temperature, moisture, and the nitrogen, phosphorous, and potassium (NPK) values present in the soil, with the help of different sensors, and predicts the soil type using Random Forest classifier, Support Vector Machine, and Linear Regression algorithms. From a comparison of these machine learning algorithms, the Random Forest classifier is found to give the best soil type prediction, with the least Root Mean Square Error (RMSE) value. Using AWS, the predicted soil type information, which is produced in text format, is converted to an audio format that is easily understood by farmers. A serverless application is utilized to convey the soil type information to farmers for better cultivation.
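A sketch of the predict-then-speak flow under stated assumptions: the sensor readings and soil labels below are invented sample values, and Amazon Polly is one plausible AWS text-to-speech service (the abstract says "AWS" without naming the component).

```python
# Sketch: Random Forest soil-type prediction, spoken via AWS Polly.
import boto3
from sklearn.ensemble import RandomForestClassifier

# Features: temperature, moisture, N, P, K (invented sample values).
X = [[24, 38, 90, 42, 43], [31, 12, 20, 10, 12], [22, 55, 60, 60, 50]]
y = ["loamy", "sandy", "clay"]
model = RandomForestClassifier(random_state=0).fit(X, y)

reading = [[25, 40, 85, 40, 45]]        # fresh sensor reading
soil_type = model.predict(reading)[0]
message = f"The predicted soil type is {soil_type}."

polly = boto3.client("polly")           # requires AWS credentials
speech = polly.synthesize_speech(Text=message, OutputFormat="mp3",
                                 VoiceId="Joanna")
with open("soil_type.mp3", "wb") as fh:
    fh.write(speech["AudioStream"].read())   # audio message for the farmer
```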
- Published
- 2020
- Full Text
- View/download PDF
47. Research on performance evaluation benchmark of formatted text watermarking.
- Author
- CHEN Qing and XING Xiao-xi
- Abstract
Since no benchmark was available to evaluate and compare the robustness of formatted text watermarking algorithms, this paper proposed a benchmark for formatted text watermarking. It reviewed the general framework of a watermarking system, then analyzed and determined the performance parameters of the watermarking system and the rating criteria for text visual quality. With reference to the attack classifications of digital image watermarking performance evaluation benchmarks, it proposed a new classification of text watermarking attacks: removal attacks, geometrical attacks, cryptographic attacks, and protocol attacks. It also described the particular attack modes of Microsoft Word. Finally, it tested two different text watermarking algorithms with respect to attack intensity versus robustness, attack intensity versus visual quality, and amount of embedded information versus robustness. The experiments show the system is effective and practical for the evaluation and design of text watermarking algorithms. [ABSTRACT FROM AUTHOR]
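The benchmark loop the paper describes can be sketched schematically: embed a mark, apply each attack class at increasing intensity, extract, and record how much of the mark survives. Every function below is a hypothetical placeholder; only the loop structure and the four attack classes come from the abstract.

```python
# Schematic benchmark loop (all callables are hypothetical placeholders).
ATTACKS = ["removal", "geometrical", "cryptographic", "protocol"]

def benchmark(document, watermark, embed, attack, extract, similarity):
    results = {}
    marked = embed(document, watermark)
    for name in ATTACKS:
        for intensity in (0.2, 0.5, 0.8):   # assumed intensity grid
            attacked = attack(marked, kind=name, intensity=intensity)
            recovered = extract(attacked)
            # Robustness = how much of the watermark survives the attack.
            results[(name, intensity)] = similarity(watermark, recovered)
    return results
```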
- Published
- 2014
- Full Text
- View/download PDF
48. Speech to Text Translation enabling Multilingualism
- Author
- Shahana Bano, Yalavarthi Sikhi, Pavuluri Jithendra, and Gorsa Lakshmi Niharika
- Subjects
Process (engineering) ,business.industry ,Computer science ,Speech recognition ,Regional language ,computer.file_format ,Speech processing ,Formatted text ,The Internet ,Multilingualism ,business ,Hidden Markov model ,computer ,Spoken language - Abstract
Speech acts as a medium of communication between two individuals and helps them express their feelings, thoughts, emotions, and ideologies to each other. The process of establishing communicational interaction between machines and mankind is known as Natural Language Processing. Speech recognition aids in translating spoken language into text. We have come up with a speech recognition model that converts the speech data given by the user as input into text format in the user's desired language. This model is developed by adding multilingual features to the existing Google Speech Recognition model, based on some natural language processing principles. The goal of this research is to build a speech recognition model that enables even an illiterate person to communicate easily with a computer system in their regional language.
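The multilingual idea reduces to parameterizing the recognizer call by the user's desired language tag. A minimal sketch using the speech_recognition package's Google backend (an assumption about the stack, since the abstract names only "Google Speech Recognition"):

```python
# Minimal multilingual speech-to-text sketch: same recognizer call,
# different language tag per user (e.g. "hi-IN", "te-IN", "ta-IN").
import speech_recognition as sr

def transcribe(wav_path: str, language: str = "en-IN") -> str:
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio, language=language)

print(transcribe("sample.wav", language="hi-IN"))  # Hindi output text
```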
- Published
- 2020
- Full Text
- View/download PDF
49. RESTful Web Service for Madurese and Indonesian Language Translator Applications on Android Devices
- Author
- Yoga Dwitya Pramudita, Sigit Susanto Putro, Rizal Nurman Wahyudi, Ika Oktavia Suzanti, and Firdaus Solihin
- Subjects
Computer science ,computer.file_format ,computer.software_genre ,JSON ,language.human_language ,Indonesian ,World Wide Web ,Data exchange ,language ,Formatted text ,Android (operating system) ,Web service ,Architecture ,computer ,Mobile device ,computer.programming_language - Abstract
There is no doubt that an Indonesian and Madurese bilingual translator system that can run on all platforms is highly needed. A web-based Indonesian and Madurese bilingual translator is available at Madura.web.id; however, there is no translator application for Android devices that can be widely used by smartphone users. This study aims to create a mobile application for Indonesian-Madurese translation using a RESTful API with the JSON data format. In order to build a translator system that can be used by all platforms, including Android, a web service must be created. A web service is a standard and a programming method for sharing data between several applications. The architecture used in this study is the RESTful web service because it is lighter and faster, and as for the text format used for data exchange, the JSON format is easier for mobile devices to encode and decode. Based on response-time tests of the Madurese and Indonesian bilingual translator on Android devices, it was faster than the translator system on the existing website: the average web service response time was 31% faster than the website on 2G networks, 77% faster on 3G networks, and 73% faster on 4G networks.
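A minimal sketch of such a RESTful translation endpoint returning JSON is shown below. Flask, the route name, and the tiny in-memory dictionary entry are illustrative assumptions, not the authors' actual service.

```python
# Minimal RESTful translation endpoint returning JSON (illustrative).
from flask import Flask, jsonify, request

app = Flask(__name__)
LEXICON = {"makan": "ngakan"}   # toy Indonesian -> Madurese entry

@app.route("/translate", methods=["GET"])
def translate():
    word = request.args.get("word", "").lower()
    return jsonify({
        "source": word,
        "target": LEXICON.get(word),   # null when the word is unknown
        "direction": "id-mad",
    })

if __name__ == "__main__":
    app.run()
```

An Android client would then issue, for example, GET /translate?word=makan and decode the small JSON payload, which is the lightweight exchange the abstract credits for the faster response times.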
- Published
- 2020
- Full Text
- View/download PDF
50. The international glycan repository GlyTouCan version 3.0
- Author
- Tamiko Ono, Masaaki Shiota, Daisuke Shinmachi, Nobuyuki P. Aoki, Masaaki Matsubara, Shinichiro Tsuchiya, Akihiro Fujita, Issaku Yamada, and Kiyoko F. Aoki-Kinoshita
- Subjects
Glycan ,Internet ,Databases, Factual ,Extramural ,AcademicSubjects/SCI00010 ,International Cooperation ,Computational Biology ,computer.file_format ,Computational biology ,Biology ,carbohydrates (lipids) ,Polysaccharides ,Terminology as Topic ,Genetics ,biology.protein ,Humans ,Database Issue ,Formatted text ,computer ,Software - Abstract
Glycans serve important roles in signaling events and cell-cell communication, and they are recognized by lectins, viruses and bacteria, playing a variety of roles in many biological processes. However, there was no system to organize the plethora of glycan-related data in the literature. Thus GlyTouCan (https://glytoucan.org) was developed as the international glycan repository, allowing researchers to assign accession numbers to glycans. This also aided in the integration of glycan data across various databases. GlyTouCan assigns accession numbers to glycans which are defined as sets of monosaccharides, which may or may not be characterized with linkage information. GlyTouCan was developed to be able to recognize any level of ambiguity in glycans and uniquely assign accession numbers to each of them, regardless of the input text format. In this manuscript, we describe the latest update to GlyTouCan in version 3.0, its usage, and plans for future development.
- Published
- 2020