Descriptor: "Machine translation" / Topic: computer - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Machine translation"' showing total 11,912 results

Start Over Descriptor "Machine translation" Topic computer

11,912 results on '"Machine translation"'

1. Hierarchical self attention based sequential labelling model for Bhojpuri, Maithili and Magahi languages

Author: Anil Kumar Singh, Rajesh Kumar Mundotiya, and Swasti Mishra
Subjects: General Computer Science, Machine translation, Structured support vector machine, Computer science, Bhojpuri, business.industry, computer.software_genre, language.human_language, Maithili, Information extraction, Language technology, language, Artificial intelligence, Language family, business, computer, Natural language processing, Chunking (computing)
Abstract: Sequential labelling plays a vital role in solving numerous Natural Language Processing (NLP) applications such as Machine Translation and Information Extraction etc. One of these is Part-of-Speech (POS) tagging, which assigns a sequence of grammatical categories to the given sentence, and Chunking which groups them into ‘chunks’ or what can be called minimal phrases. Bhojpuri, Maithili and Magahi are low resource languages and widely spoken in central north-eastern India, belonging to the Indo-Aryan language family. The creation of an annotated corpus for POS tagging and Chunking, and then building an initial automatic tool for these problems is the first attempt towards building language technology tools for these languages. The annotated corpus used to develop POS Taggers and Chunkers, based on various machine learning algorithms (TnT, CRF, MEMM and Structured SVM) and state-of-the-art LSTM-CNN-CRF model, and then these compared with the obtained results on two new proposed deep learning-based models, Self-Attention Hierarchical Bi-LSTM CRF (SAHBiLC) and a fine-tuned version of it, Fine-SAHBiLC. The SAHBiLC and Fine-SAHBiLC models outperform on Bhojpuri (Accuracy for POS and Chunking is 0.86% and 0.94%, respectively) and Maithili (Accuracy for POS and Chunking is 0.86% and 0.95%, respectively) and Magahi (Accuracy for POS is 0.86%).
Published: 2022
Full Text: View/download PDF

2. Moroccan Arabic vocabulary generation using a rule-based approach

Author: Ridouane Tachicart and Karim Bouzoubaa
Subjects: Vocabulary, General Computer Science, Machine translation, Computer science, business.industry, media_common.quotation_subject, Concatenation, Spell, 020206 networking & telecommunications, Rule-based system, 02 engineering and technology, Lexicon, computer.software_genre, Set (abstract data type), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, State (computer science), Artificial intelligence, business, computer, Natural language processing, media_common
Abstract: NLP resources play a crucial role in the building of many NLP applications. The importance of these resources depends not only on their size and coverage but also on the richness and the precision of the annotated information they provide. In the case of resource-scarce languages such as Moroccan Arabic, the building of NLP applications is limited due to the lack of these resources. To overcome this problem, we follow a rule-based approach to generate a Moroccan morphological vocabulary (MORV) which constitutes the first step addressing the problem of Moroccan morphological generation. MORV is designed and implemented based on two main components: On one hand, an MA lexicon and a list of fully annotated affixes and clitics that we have created specifically to ensure the generation process. On the other hand, a set of rules covering the concatenation and the orthographic adjustments of the generated words. Moreover, given a base form, MORV outputs more than 4.5 M Moroccan words with rich morphological features such as tense, gender, number, state, etc. We tested the coverage of MORV on texts collected from Moroccan social media and realized that it reaches a vocabulary coverage of 84% and a precision of 94%. This system is a benefit for building other NLP applications such as spell checking, morphological analysis, and machine translation.
Published: 2022
Full Text: View/download PDF

3. Progress in Machine Translation

Author: Liang Huang, Zhongjun He, Hua Wu, Kenneth Church, and Haifeng Wang
Subjects: Environmental Engineering, General Computer Science, Machine translation, Computer science, business.industry, Materials Science (miscellaneous), General Chemical Engineering, media_common.quotation_subject, General Engineering, Energy Engineering and Power Technology, Translation (geometry), computer.software_genre, Field (computer science), Quality (business), Artificial intelligence, business, computer, Natural language processing, Transformer (machine learning model), media_common
Abstract: After more than 70 years of evolution, great achievements have been made in machine translation. Especially in recent years, translation quality has been greatly improved with the emergence of neural machine translation (NMT). In this article, we first review the history of machine translation from rule-based machine translation to example-based machine translation and statistical machine translation. We then introduce NMT in more detail, including the basic framework and the current dominant framework, Transformer, as well as multilingual translation models to deal with the data sparseness problem. In addition, we introduce cutting-edge simultaneous translation methods that achieve a balance between translation quality and latency. We then describe various products and applications of machine translation. At the end of this article, we briefly discuss challenges and future research directions in this field.
Published: 2022
Full Text: View/download PDF

4. Human vs. AI

Author: Haiqing Chen and Hanji Li
Subjects: Machine translation, Computer science, business.industry, media_common.quotation_subject, Quality (business), General Medicine, Artificial intelligence, computer.software_genre, business, Translation (geometry), computer, Natural language processing, media_common
Abstract: As one of the most important applications of AI, machine translation has always been the hot topic among scholars in linguistics, computer science, cognitive science and other areas. This article made an assessment of translations of 4 selected major online machine translation platforms from perspectives of efficiency, operating mode and condition. The outputs of machine and human were compared by employing new “6-4” table and comprehensive error rate. The assessment shows that although the quality of machine translation is improving, the gap still exists between the quality of machine translation and human translation. Based on the research findings, the author predicts that machine translation cannot possibly replace human translation and the two will continue to coexist in the foreseeable future.
Published: 2022
Full Text: View/download PDF

5. Ignorance is Bliss: Exploring Defenses Against Invariance-Based Attacks on Neural Machine Translation Systems

Author: Akshay Chaturvedi, Abhisek Chakrabarty, Utpal Garain, Eiichiro Sumita, and Masao Utiyama
Subjects: Cognitive science, BLISS, Machine translation, Computer science, media_common.quotation_subject, Ignorance, computer.software_genre, computer, media_common, computer.programming_language
Published: 2022
Full Text: View/download PDF

6. SG-Net: Syntax Guided Transformer for Language Representation

Author: Yuwei Wu, Rui Wang, Junru Zhou, Zhuosheng Zhang, Sufeng Duan, and Hai Zhao
Subjects: FOS: Computer and information sciences, Dependency (UML), Machine translation, Computer Science - Artificial Intelligence, Computer science, computer.software_genre, Computer Science - Information Retrieval, Artificial Intelligence, Humans, Representation (mathematics), Language, Transformer (machine learning model), Computer Science - Computation and Language, business.industry, Applied Mathematics, Linguistics, DUAL (cognitive architecture), Syntax, Artificial Intelligence (cs.AI), Computational Theory and Mathematics, Computer Vision and Pattern Recognition, Artificial intelligence, business, Computation and Language (cs.CL), computer, Encoder, Algorithms, Information Retrieval (cs.IR), Software, Word (computer architecture)
Abstract: Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy texts and getting rid of the noises is essential to improve its performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations. In detail, for self-attention network (SAN) sponsored Transformer-based encoder, we introduce syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. Syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics inspired representation. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation show the effectiveness of the proposed SG-Net design., The early version accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Journal extension of arXiv:1908.05147 (AAAI 2020)
Published: 2022
Full Text: View/download PDF

7. PhraseAttn: Dynamic Slot Capsule Networks for phrase representation in Neural Machine Translation

Author: Binh Nguyen, Long H.B. Nguyen, Dien Dinh, and Binh Van Le
Subjects: Statistics and Probability, Phrase, Machine translation, Artificial Intelligence, business.industry, Computer science, General Engineering, Representation (systemics), Artificial intelligence, computer.software_genre, business, computer, Natural language processing
Abstract: Word representation plays a vital role in most Natural Language Processing systems, especially for Neural Machine Translation. It tends to capture semantic and similarity between individual words well, but struggle to represent the meaning of phrases or multi-word expressions. In this paper, we investigate a method to generate and use phrase information in a translation model. To generate phrase representations, a Primary Phrase Capsule network is first employed, then iteratively enhancing with a Slot Attention mechanism. Experiments on the IWSLT English to Vietnamese, French, and German datasets show that our proposed method consistently outperforms the baseline Transformer, and attains competitive results over the scaled Transformer with two times lower parameters.
Published: 2022
Full Text: View/download PDF

8. Fuzzy-Match Repair Guided by Quality Estimation

Author: Mikel L. Forcada, Felipe Sánchez-Martínez, John Ortega, Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos, and Transducens
Subjects: Machine translation, Computer science, media_common.quotation_subject, 02 engineering and technology, computer.software_genre, Translation (geometry), Quality estimation, Set (abstract data type), Translation memories, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Translations, Quality (business), Fuzzy-match repair, Language, media_common, business.industry, Applied Mathematics, Translating, Approximate string matching, Computer-aided translation, Computational Theory and Mathematics, Lenguajes y Sistemas Informáticos, 020201 artificial intelligence & image processing, Translation memory, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Algorithms, Software, Natural language processing
Abstract: Computer-aided translation tools based on translation memories are widely used to assist professional translators. A translation memory (TM) consists of a set of translation units (TU) made up of source- and target-language segment pairs. For the translation of a new source segment s', these tools search the TM and retrieve the TUs (s,t) whose source segments are more similar to s'. The translator then chooses a TU and edit the target segment t to turn it into an adequate translation of s'. Fuzzy-match repair (FMR) techniques can be used to automatically modify the parts of t that need to be edited. We describe a language-independent FMR method that first uses machine translation to generate, given s' and (s,t), a set of candidate fuzzy-match repaired segments, and then chooses the best one by estimating their quality. An evaluation on three different language pairs shows that the selected candidate is a good approximation to the best (oracle) candidate produced and is closer to reference translations than machine-translated segments and unrepaired fuzzy matches (t). In addition, a single quality estimation model trained on a mix of data from all the languages performs well on any of the languages used. This work was supported by the Spanish Government through the EFFORTUNE project [TIN-2015-69632-R].
Published: 2022
Full Text: View/download PDF

9. Routledge Encyclopedia of Translation Technology

Author: Sin-Wai Chan
Subjects: Wright, History, Machine translation, Speech translation, Speech recognition, Encyclopedia, Translation studies, Art history, Translation memory, Pragmatics, computer.software_genre, computer, Terminology
Abstract: Introduction Chan Sin-wai Acknowledgement Part 1: General Issues of Translation Technology * The Development of Translation Technology: 1967-2013 Chan Sin-wai * Computer-aided Translation: Major Concepts Chan Sin-wai * Computer-aided Translation Systems Ignacio Garcia * Computer-Aided Translation: Translator Training Lynne Bowker * Machine Translation: General Liu Qun and Zhang Xiaojun * Machine Translation: History of Research and Applications W. John Hutchins * Example-based Machine Translation Billy Wong Tak-ming and Jonathan Webster * Open-Source Machine Translation Technology Mikel L. Forcada * Pragmatics-based Machine Translation David Farwell and Stephen Helmreich * Rule-based Machine Translation Yu Shiwen and Bai Xiaojing * Statistical Machine Translation Liu Yang and Zhang Min * Evaluation in Machine Translation and Computer-aided Translaton Kit Chunyu and Wong Tak-ming * The Teaching of Machine Translation: The Chinese University of Hong Kong as a Case Study Cecilia Wong Suk Man Part 2: The National / Regional Developments of Translation Technology * Translation Technology in China Qian Duoxiu * Translation Technology in Canada Elliott Macklovitch * Translation Technology in France Sylviane Cardey * Translation Technology in Hong Kong Chan Sin-wai, Ian Chow and Wong Tak-ming * Translation Technology in Japan Hitoshi Isahara * Translation Technology in South Africa Gerhard van Huyssteen and Marissa Griesel * Translation Technology in Taiwan: Track and Trend Shih Chung-ling * Translation Technology in the Netherlands and Belgium Leonoor van der Beek and Antal van den Bosch * Translation Technology in the United Kingdom Christophe Declercq * A History of Translation Technology in the United States of America Jost Zetzsche and Jennifer DeCamp Part 3: Specific Topics in Translation Technology * Alignment Lars Ahrenberg * Bitext Alan K. Melby, Arle Lommel, and Lucia Morado Vazquez * Computational Lexicography Zhang Yihua * Concordancing Federico Zanettin * Controlled Language Rolf Schwitter * Corpus Li Lan * Editing in Translation Technology Christophe Declercq * Information Retrieval and Text Mining Kit Chunyu and Nie Jian-Yun * Language Codes and Language Tags Sue Ellen Wright * Localization Keiran Dunne * Natural Language Processing Olivia Kwong * Online Translation Federico Gaspari * Part of Speech Tagging Felipe Sanchez-Martinez * Segmentation Freddy Y. Y. Choi * Speech Translation Tan Lee * Subtitling and Technology Jorge Dias-Cintas * Terminology Management Kara Warburton * Translation Memory Alan K. Melby and Sue Ellen Wright * Translation Management Systems Mark Shuttlewort
Published: 2023
Full Text: View/download PDF

10. Enriching the transfer learning with pre-trained lexicon embedding for low-resource neural machine translation

Author: Mieradilijiang Maimaiti, Maosong Sun, Yang Liu, and Huanbo Luan
Subjects: Multidisciplinary, Artificial neural network, Machine translation, Computer science, business.industry, Translation (geometry), computer.software_genre, Lexicon, Feature (machine learning), Embedding, Artificial intelligence, business, Transfer of learning, computer, Word (computer architecture), Natural language processing
Abstract: Most State-Of-The-Art (SOTA) Neural Machine Translation (NMT) systems today achieve outstanding results based only on large parallel corpora. The large-scale parallel corpora for high-resource languages is easily obtainable. However, the translation quality of NMT for morphologically rich languages is still unsatisfactory, mainly because of the data sparsity problem encountered in Low-Resource Languages (LRLs). In the low-resource NMT paradigm, Transfer Learning (TL) has been developed into one of the most efficient methods. It is difficult to train the model on high-resource languages to include the information in both parent and child models, as well as the initially trained model that only contains the lexicon features and word embeddings of the parent model instead of the child languages feature. In this work, we aim to address this issue by proposing the language-independent Hybrid Transfer Learning (HTL) method for LRLs by sharing lexicon embedding between parent and child languages without leveraging back translation or manually injecting noises. First, we train the High-Resource Languages (HRLs) as the parent model with its vocabularies. Then, we combine the parent and child language pairs using the oversampling method to train the hybrid model initialized by the previously parent model. Finally, we fine-tune the morphologically rich child model using a hybrid model. Besides, we explore some exciting discoveries on the original TL approach. Experimental results show that our model consistently outperforms five SOTA methods in two languages Azerbaijani (Az) and Uzbek (Uz). Meanwhile, our approach is practical and significantly better, achieving improvements of up to 4.94 and 4.84 BLEU points for low-resource child languages Az ! Zh and Uz ! Zh, respectively.
Published: 2022
Full Text: View/download PDF

11. Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation

Author: Bin Wang, Jinsong Su, Jianwei Cui, Li Xiang, Chulun Zhou, Yang Liu, Ziyao Lu, and Min Zhang
Subjects: Acoustics and Ultrasonics, Machine translation, Exploit, Computer science, business.industry, computer.software_genre, Multi stage, Computational Mathematics, Encoding (memory), Computer Science (miscellaneous), Artificial intelligence, Electrical and Electronic Engineering, business, Encoder, computer, Sentence, Natural language processing, Word (computer architecture), Multi-source
Abstract: Existing studies for multi-source neural machine translation (NMT) either separately model different source sentences or resort to the conventional single-source NMT by simply concatenating all source sentences. However, there exist two drawbacks in these approaches. First, they ignore the explicit word-level semantic interactions between source sentences, which have been shown effective in the embeddings of multilingual texts. Second, multiple source sentences are simultaneously encoded by an NMT model, which is unable to fully exploit the semantic information of each source sentence. In this paper, we explore multi-stage information interactions for multi-source NMT. Specifically, we first propose a multi-source NMT model that performs information interactions at the encoding stage. Its encoder contains multiple semantic interaction layers, each of which sequentially consists of (1) monolingual semantic interaction sub-layer, which is based on the self-attention mechanism and used to learn word-level monolingual contextual representations of source sentences, and (2) cross-lingual semantic interaction sub-layer, which leverages word alignments to perform fine-grained semantic transitions among hidden states of different source sentences. Furthermore, at the training stage, we introduce a mutual distillation based training framework, where single-source models and ours perform information interactions. Such framework can fully exploit the semantic information of each source sentence to enhance our model. Extensive experimental results on the WMT14 English-German-French dataset show our model exhibits significant improvements upon competitive baselines.
Published: 2022
Full Text: View/download PDF

12. Hierarchical multimodal transformer to summarize videos

Author: Maoguo Gong, Xuelong Li, and Bin Zhao
Subjects: Scheme (programming language), Closed captioning, Machine translation, Computer science, Cognitive Neuroscience, Frame (networking), computer.software_genre, Automatic summarization, Computer Science Applications, Task (project management), Recurrent neural network, Artificial Intelligence, Data mining, computer, computer.programming_language, Transformer (machine learning model)
Abstract: Although video summarization has achieved tremendous success benefiting from Recurrent Neural Networks (RNN), RNN-based methods neglect the global dependencies and multi-hop relationships among video frames, which limits the performance. Transformer is an effective model to deal with this problem, and surpasses RNN-based methods in several sequence modeling tasks, such as machine translation, video captioning, etc. Motivated by the great success of transformer and the natural structure of video (frame-shot-video), a hierarchical transformer is developed for video summarization, which can capture the dependencies among frame and shots, and summarize the video by exploiting the scene information formed by shots. Furthermore, we argue that both the audio and visual information are essential for the video summarization task. To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer. In this paper, the proposed method is denoted as Hierarchical Multimodal Transformer (HMT). Practically, extensive experiments show that HMT achieves (F-measure: 0.441, Kendall’s τ : 0.079, Spearman’s ρ : 0.080) and (F-measure: 0.601, Kendall’s τ : 0.096, Spearman’s ρ : 0.107) on SumMe and TVsum, respectively. It surpasses most of the traditional, RNN-based and attention-based video summarization methods.
Published: 2022
Full Text: View/download PDF

13. Integrating Deep Learning and Machine Translation for Understanding Unrefined Languages

Author: Jina Kim, Soyoung Oh, HongGeun Ji, Eunil Park, and Seong Jae Choi
Subjects: Machine translation, Computer science, business.industry, Deep learning, computer.software_genre, Computer Science Applications, Biomaterials, Mechanics of Materials, Modeling and Simulation, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Natural language processing
Published: 2022
Full Text: View/download PDF

14. Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles

Author: Haojie Yu, Masao Utiyama, Zhuosheng Zhang, and Hai Zhao
Subjects: Word embedding, Acoustics and Ultrasonics, Machine translation, Computer science, Generalization, business.industry, Natural language understanding, Context (language use), computer.software_genre, Multimodality, Computational Mathematics, Computer Science (miscellaneous), Language model, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Natural language processing, Word (computer architecture)
Abstract: Recent pre-trained language models (PrLMs) offer a new performant method of contextualized word representations by leveraging the sequence-level context for modeling. Although the PrLMs generally provide more effective contextualized word representations than non-contextualized models, they are still subject to a sequence of text contexts without diverse hints from multimodality. This paper thus proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance. In detail, we build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images. Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach. Analysis shows that our method with visual guidance pays more attention to content words, improves the representation diversity, and is potentially beneficial for enhancing the accuracy of disambiguation.
Published: 2022
Full Text: View/download PDF

15. A machine learning application for raising WASH awareness in the times of COVID-19 pandemic

Author: Rohan Pandey, Vaibhav Gautam, Ridam Pal, Harsh Bandhey, Lovedeep Singh Dhingra, Vihaan Misra, Himanshu Sharma, Chirag Jain, Kanav Bhagat, Arushi Arushi, Lajjaben Patel, Mudit Agarwal, Samprati Agrawal, Rishabh Jalan, Akshat Wadhwa, Ayush Garg, Yashwin Agrawal, Bhavika Rana, Ponnurangam Kumaraguru, and Tavpritesh Sethi
Subjects: FOS: Computer and information sciences, Male, Computer Science - Machine Learning, Machine translation, Computer science, media_common.quotation_subject, Science, 02 engineering and technology, computer.software_genre, Machine learning, Global Health, Chatbot, Literacy, Article, Machine Learning (cs.LG), Machine Learning, Computer Science - Computers and Society, 03 medical and health sciences, 0302 clinical medicine, Intervention (counseling), Computers and Society (cs.CY), 0202 electrical engineering, electronic engineering, information engineering, Humans, Relevance (information retrieval), 030212 general & internal medicine, Misinformation, Disinformation, mHealth, Pandemics, media_common, Natural Language Processing, Public health, Computer Science - Computation and Language, Multidisciplinary, business.industry, Computational science, COVID-19, 3. Good health, 030220 oncology & carcinogenesis, Medicine, 020201 artificial intelligence & image processing, Female, Artificial intelligence, business, Computation and Language (cs.CL), computer
Abstract: Background: The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this Infodemic requires strong health messaging systems that are engaging, vernacular, scalable, effective and continuously learn the new patterns of misinformation. Objective: We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information matched against WHO guidelines through AI, and delivers it in the right format in local languages. Methods: We theorize (i) an NLP based AI engine that could continuously incorporate user feedback to improve relevance of information, (ii) bite sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational but interactive AI engagement with users towards an increased health awareness in the community. Results: A total of 5026 people who downloaded the app during the study window, among those 1545 were active users. Our study shows that 3.4 times more females engaged with the App in Hindi as compared to males, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of integrated AI chatbot Satya increased thus proving the usefulness of an mHealth platform to mitigate health misinformation. Conclusion: We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audios and conversational AI is an effective approach to mitigate health misinformation., Comment: 14 pages, 7 figures
Published: 2022

16. Enhancing Neural Machine Translation With Dual-Side Multimodal Awareness

Author: Yuqing Song, Shizhe Chen, Jun Xie, Qin Jin, Fei Huang, and Wei Luo
Subjects: Machine translation, Computer science, Human–computer interaction, Signal Processing, Media Technology, Electrical and Electronic Engineering, DUAL (cognitive architecture), computer.software_genre, computer, Computer Science Applications
Published: 2022
Full Text: View/download PDF

17. Challenges and Countermeasures of Machine Translation to Translation and Interpretation under the Background of Artificial Intelligence

Author: Tao Tao
Subjects: Machine translation, Computer science, business.industry, Interpretation (philosophy), Artificial intelligence, computer.software_genre, business, Translation (geometry), computer
Published: 2021
Full Text: View/download PDF

18. Non-autoregressive neural machine translation with auxiliary representation fusion

Author: Chen Xu, Tong Xiao, Jingbo Zhu, Kai Feng, and Quan Du
Subjects: Statistics and Probability, Fusion, Machine translation, Computer science, business.industry, General Engineering, Representation (systemics), Pattern recognition, computer.software_genre, Autoregressive model, Artificial Intelligence, Artificial intelligence, business, computer
Abstract: Recently, many efforts have been devoted to speeding up neural machine translation models. Among them, the non-autoregressive translation (NAT) model is promising because it removes the sequential dependence on the previously generated tokens and parallelizes the generation process of the entire sequence. On the other hand, the autoregressive translation (AT) model in general achieves a higher translation accuracy than the NAT counterpart. Therefore, a natural idea is to fuse the AT and NAT models to seek a trade-off between inference speed and translation quality. This paper proposes an ARF-NAT model (NAT with auxiliary representation fusion) to introduce the merit of a shallow AT model to an NAT model. Three functions are designed to fuse the auxiliary representation into the decoder of the NAT model. Experimental results show that ARF-NAT outperforms the NAT baseline by 5.26 BLEU scores on the WMT’14 German-English task with a significant speedup (7.58 times) over several strong AT baselines.
Published: 2021
Full Text: View/download PDF

19. Modeling hypotactic structure for Chinese-English neural machine translation of complex sentences

Author: Yufeng Chen, Guoyi Miao, Mingtong Liu, Jinan Xu, Wenhe Feng, and Jian Liu
Subjects: Statistics and Probability, Structure (mathematical logic), Machine translation, Artificial Intelligence, business.industry, Computer science, General Engineering, Artificial intelligence, business, computer.software_genre, computer, Natural language processing
Abstract: The hypotactic structural relation between clauses plays an important role in improving the discourse coherence of document-level translation. However, the standard neural machine translation (NMT) models do not explicitly model the hypotactic relationship between clauses, which usually leads to structurally incorrect translations of long and complex sentences. This problem is particularly noticeable on Chinese-to-English translation task of complex sentences due to the grammatical form distinction between English and Chinese. English is rich in grammatical form (e.g. verb morphological changes and subordinating conjunctions) while Chinese is poor in grammatical form. These linguistic phenomena make it a challenge for NMT to learn the hypotactic structure knowledge from Chinese as well as the structure alignment between Chinese and English. To address these issues, we propose to model the hypotactic structure for Chinese-to-English complex sentence translation by introducing hypotactic structure knowledge. Specifically, we annotate and build a hypotactic structure aligned parallel corpus that provides rich hypotactic structure knowledge for NMT. Moreover, we further propose a structure-infused neural framework to combine the hypotactic structure knowledge with the NMT model through two integrating strategies. In particular, we introduce a specific structure-aware loss to encourage the NMT model to better learn the structure knowledge. Experimental results on WMT17, WMT18 and WMT19 Chinese-to-English translation tasks demonstrate the effectiveness of the proposed methods.
Published: 2021
Full Text: View/download PDF

20. Human-Computer Interaction in Translation Activity: Fluency of Machine Translation

Author: Roman Králik, Barbara Jakubickova, and Katarina Welnitzova
Subjects: Machine translation, Computer science, media_common.quotation_subject, slovak, computer.software_genre, machine translation, Education, Synthetic language, Fluency, human-computer interaction, Psychology, Slovak, Dimension (data warehouse), fluency, media_common, business.industry, General Medicine, Ambiguity, language.human_language, BF1-990, Analytic language, language, translation quality, Source text, Artificial intelligence, business, english, computer, Natural language processing
Abstract: Digitalization is one of the key distinctive features of modern environment and social life. Nowadays more and more functions are transferred to the artificial mind. How effective is the replacement of human activity with computer activity? In the given article, this problem is solved by an example of integration of digital technologies into translation activities. It this paper, emphasis is placed on the quality of machine translation (MT) output of legal texts in the language pair English - Slovak. It studies a Criminal Code formulated in the Slovak language which was translated by a human translator into English and consequently via machine translation system Google Translate (GT) back into Slovak. The back-translation - translation of a translated text back into its original language - as a quality assessment tool to detect discrepancies, mistranslations and inevitable differences between the source text and the target text was used. The quality of MT output was evaluated according to Multidimensional Quality Metrics (MQM) standards with the focus on the dimension of Fluency. The multiple comparisons were applied to determine which issues (errors) in Fluency dimension differ from the others. A statistically significant difference is noticed between Agreement and other issues, as well as between Ambiguity and other issues. The errors in Agreement are related to the differences between the languages: English is considered mostly an analytic language, Slovak represents a synthetic language. The issues in the Ambiguity dimension correlate with the type of the text being examined, since legal texts are characterized by relatively complicated wording and numerous terms; moreover, accuracy and unambiguity need to be preserved. Generally, the MT output is able to provide users with basic information about the text. On the other hand, most of the segments need revision and/or correction; in such cases, human intervention and post-editing is necessary.
Published: 2021
Full Text: View/download PDF

21. Are Ellipses Important for Machine Translation?

Author: Payal Khullar
Subjects: Linguistics and Language, Machine translation, Artificial Intelligence, Computer science, business.industry, Artificial intelligence, Ellipse, business, computer.software_genre, computer, Language and Linguistics, Natural language processing, Computer Science Applications
Abstract: This article describes an experiment to evaluate the impact of different types of ellipses discussed in theoretical linguistics on Neural Machine Translation (NMT), using English to Hindi/Telugu as source and target languages. Evaluation with manual methods shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, the frequency of such errors is slightly more in Telugu than Hindi, and the translation adequacy shows improvement when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipses and their resolution for MT, but also hint toward a possible correlation between the translation of discourse devices like ellipses with the morphological incongruity of the source and target. We also observe that not all ellipses are translated poorly and benefit from reconstruction, advocating for a disparate treatment of different ellipses in MT research.
Published: 2021
Full Text: View/download PDF

22. Preventing translation quality deterioration caused by beam search decoding in neural machine translation using statistical machine translation

Author: Hasan Bulut and Emre Satir
Subjects: Beam diameter, Statistical machine translation, Information Systems and Management, Similarity (geometry), Machine translation, Neural machine translation, Computer science, Decoding, Beam search, Translation (geometry), computer.software_genre, Computer Science Applications, Theoretical Computer Science, Artificial Intelligence, Control and Systems Engineering, Greedy algorithm, Algorithm, computer, Software, Word (computer architecture), Decoding methods
Abstract: Decoding is an important part of machine translation systems, and the most popular inference algorithm used here is beam search. Beam search algorithm improves translation by allowing a larger search space to be traversed than greedy search. However, as the beam width increases, the translation performance declines after a certain point in neural machine translation (NMT). This problem is usually not observed in statistical machine translation (SMT) due to the decoding method. This paper proposes a hybrid system based method that uses SMT predictions to prevent quality deterioration in the beam search algorithm used in NMT decoding. Our approach is based on the reranking n-best list of NMT according to the SMT system translation sentence. We propose two different algorithms for reranking NMT n-best lists. The first algorithm uses the length information of the SMT outputs. In contrast, the second uses a word-based similarity approach with the Jaccard Index, the Dice's Coefficient, and the Overlap Coefficient. Experiments on three different language pairs show that the method we propose prevents the decrease in translation quality and produces a gain of 1.3 BLEU and 1.6 METEOR for different beam sizes and 1.8 BLEU and 2.1 METEOR average scores compared to the baseline results. (c) 2021 Published by Elsevier Inc.
Published: 2021
Full Text: View/download PDF

23. Machine translation post-editing – Current situation and the future of translator training in Bulgaria

Author: Irina Stoyanova-Georgieva
Subjects: Machine translation, business.industry, Computer science, Training (meteorology), Artificial intelligence, Current (fluid), computer.software_genre, business, computer, Natural language processing
Abstract: The current paper is an attempt to analyse the situation on the market for specialised translation services, and more precisely for Machine Translation in Bulgaria. It provides an overview of some of the generic MT systems and analyses the results coming from the translation of two types of text. The aim of the paper is to raise awareness about the results of Neural Machine Translation and to reveal the need for MT post-editing courses.
Published: 2021
Full Text: View/download PDF

24. Research on Intelligent English Translation Method Based on the Improved Attention Mechanism Model

Author: Rong Wang
Subjects: Translation system, Article Subject, Artificial neural network, Machine translation, business.industry, Computer science, Mechanism (biology), Translation (geometry), computer.software_genre, Machine learning, Computer Science Applications, QA76.75-76.765, Recurrent neural network, Computer software, Artificial intelligence, business, computer, Software, Decoding methods, Coding (social sciences)
Abstract: The use of neural machine algorithms for English translation is a hot topic in the current research. English translation using the traditional sequential neural framework, which is too poor at capturing long-distance information, has its own major limitations. However, the current improved frameworks, such as recurrent neural network translation, are not satisfactory either. In this paper, we establish an attention coding and decoding model to address the shortcomings of traditional machine translation algorithms, combine the attention mechanism with a neural network framework, and implement the whole English translation system based on TensorFlow, thus improving the translation accuracy. The experimental test results show that the BLUE values of the algorithm model built in this paper are improved to different degrees compared with the traditional machine learning algorithms, which proves that the performance of the proposed algorithm model is significantly improved compared with the traditional model.
Published: 2021
Full Text: View/download PDF

25. Investigating post-editing effort

Author: Maria Stasimioti, Konstantinos Chatzitheodorou, and Vilelmini Sosoni
Subjects: Machine translation, Computer science, Mechanical Engineering, media_common.quotation_subject, First language, Energy Engineering and Power Technology, Cognition, Context (language use), Management Science and Operations Research, computer.software_genre, Fluency, Quality (business), computer, Cognitive load, Axiom, Cognitive psychology, media_common
Abstract: The working environment of translators has changed significantly in recent decades, with post-editing (PE) emerging as a new trend in the human translation workflow, particularly following the advent of neural machine translation (NMT) and the improvement of the quality of the machine translation (MT) raw output especially at the level of fluency. In addition, the directionality axiom is increasingly being questioned with translators working from and into their first language both in the context of translation (Buchweitz and Alves 2006; Pavlović and Jensen 2009; Fonseca and Barbosa 2015; Hunziker Heeb 2015; Ferreira 2013, 2014; Ferreira et al. 2016; Feng 2017) and in the context of PE (Garcia 2011; Sánchez-Gijón and Torres-Hostench 2014; da Silva et al. 2017; Toledo Báez 2018). In this study we employ product- and process-oriented approaches to investigate directionality in PE in the English-Greek language pair. In particular, we compare the cognitive, temporal, and technical effort expended by translators for the full PE of NMT output in L1 (Greek) with the effort required for the full PE of NMT output in L2 (English), while we also analyze the quality of the final translation product. Our findings reveal that PE in L2, i.e., inverse PE, is less demanding than PE in L1, i.e., direct PE, in terms of the time and keystrokes required, and the cognitive load exerted on translators. Finally, our research shows that directionality does not imply differences in quality.
Published: 2021
Full Text: View/download PDF

26. Artificial Intelligence Machine Translation Based on Fuzzy Algorithm

Author: Zhimin Li
Subjects: Article Subject, Artificial neural network, Machine translation, Computer Networks and Communications, business.industry, Computer science, media_common.quotation_subject, Analytic hierarchy process, TK5101-6720, Ambiguity, Translation (geometry), Semantics, computer.software_genre, Fuzzy logic, Computer Science Applications, Telecommunication, Selection (linguistics), Artificial intelligence, business, Algorithm, computer, media_common
Abstract: In order to study machine translation more in-depth, it is particularly important for the research of artificial intelligence with fuzzy algorithms to convert an unfamiliar language into a mature language. The neural network translation model has been developed in recent years and has achieved rich research results. Aiming at the current lack of accuracy of neural machine translation (NMT), which may cause ambiguity, this paper takes English machine translation as an example and proposes an artificial intelligence machine translation optimization model based on fuzzy theory. On the basis of NMT model translation, first the semantics of English machine translation is classified, a semantic selection model is built, then the analytic hierarchy process is used to determine the semantic order of English machine translation, and the corresponding fault-tolerant operation is carried out to the error-prone errors, weight the semantics, and introduce the fuzzy theory to arrange the English semantics of English machine translation. Finally, the performance of the model is analyzed through specific application experiments. The results show that the accuracy of the machine translation selection permutation model is improved by nearly 4.5% and can reach more than 90% compared with other models, and the timeliness is better than other models, which is improved by nearly 15%, which has obvious advantages.
Published: 2021
Full Text: View/download PDF

27. A Study on the Intelligent Translation Model for English Incorporating Neural Network Migration Learning

Author: Yanbo Zhang
Subjects: Technology, Article Subject, Machine translation, Artificial neural network, Computer Networks and Communications, Computer science, Generalization, business.industry, TK5101-6720, Overfitting, computer.software_genre, Translation (geometry), External Data Representation, Boom, Telecommunication, Artificial intelligence, Electrical and Electronic Engineering, Language translation, business, computer, Natural language processing, Information Systems
Abstract: Under the current artificial intelligence boom, machine translation is a research direction of natural language processing, which has important scientific research value and practical value. In practical applications, the variability of language, the limited capability of representing semantic information, and the scarcity of parallel corpus resources all constrain machine translation towards practicality and popularization. In this paper, we conduct deep mining of source language text data to express complex, high-level, and abstract semantic information using an appropriate text data representation model; then, for machine translation tasks with a large amount of parallel corpus, I use the capability of annotated datasets to build a more effective migration learning-based end-to-end neural network machine translation model on a supervised algorithm; then, for machine translation tasks with parallel corpus data resource-poor language machine translation tasks, migration learning techniques are used to prevent the overfitting problem of neural networks during training and to improve the generalization ability of end-to-end neural network machine translation models under low-resource conditions. Finally, for language translation tasks where the parallel corpus is extremely scarce but monolingual corpus is sufficient, the research focuses on unsupervised machine translation techniques, which will be a future research trend.
Published: 2021
Full Text: View/download PDF

28. Enhanced encoder for non-autoregressive machine translation

Author: Shuheng Wang, Heyan Huang, and Shumin Shi
Subjects: Linguistics and Language, Speedup, Machine translation, Computer science, computer.software_genre, Translation (geometry), Language and Linguistics, Tokenization (data security), Autoregressive model, Artificial Intelligence, Encoder, computer, Algorithm, Software, Decoding methods, Transformer (machine learning model)
Abstract: Non-autoregressive machine translation aims to speed up the decoding procedure by discarding the autoregressive model and generating the target words independently. Because non-autoregressive machine translation fails to exploit target-side information, the ability to accurately model source representations is critical. In this paper, we propose an approach to enhance the encoder’s modeling ability by using a pre-trained BERT model as an extra encoder. With a different tokenization method, the BERT encoder and the Raw encoder can model the source input from different aspects. Furthermore, having a gate mechanism, the decoder can dynamically determine which representations contribute to the decoding process. Experimental results on three translation tasks show that our method can significantly improve the performance of non-autoregressive MT, and surpass the baseline non-autoregressive models. On the WMT14 EN $$\rightarrow$$ DE translation task, our method achieves 27.87 BLEU with a single decoding step. This is a comparable result with the baseline autoregressive Transformer model which obtains a score of 27.8 BLEU.
Published: 2021
Full Text: View/download PDF

29. Low Resource Neural Machine Translation: Assamese to/from Other Indo-Aryan (Indic) Languages

Author: BaruahRupjyoti, MundotiyaRajesh Kumar, and SinghAnil Kumar
Subjects: General Computer Science, Machine translation, Indo aryan, Low resource, Computer science, Assamese, language, computer.software_genre, computer, Linguistics, language.human_language, Transformer (machine learning model), Bridging (programming)
Abstract: Machine translation (MT) systems have been built using numerous different techniques for bridging the language barriers. These techniques are broadly categorized into approaches like Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). End-to-end NMT systems significantly outperform SMT in translation quality on many language pairs, especially those with the adequate parallel corpus. We report comparative experiments on baseline MT systems for Assamese to other Indo-Aryan languages (in both translation directions) using the traditional Phrase-Based SMT as well as some more successful NMT architectures, namely basic sequence-to-sequence model with attention, Transformer, and finetuned Transformer. The results are evaluated using the most prominent and popular standard automatic metric BLEU (BiLingual Evaluation Understudy), as well as other well-known metrics for exploring the performance of different baseline MT systems, since this is the first such work involving Assamese. The evaluation scores are compared for SMT and NMT models for the effectiveness of bi-directional language pairs involving Assamese and other Indo-Aryan languages (Bangla, Gujarati, Hindi, Marathi, Odia, Sinhalese, and Urdu). The highest BLEU scores obtained are for Assamese to Sinhalese for SMT (35.63) and the Assamese to Bangla for NMT systems (seq2seq is 50.92, Transformer is 50.01, and finetuned Transformer is 50.19). We also try to relate the results with the language characteristics, distances, family trees, domains, data sizes, and sentence lengths. We find that the effect of the domain is the most important factor affecting the results for the given data domains and sizes. We compare our results with the only existing MT system for Assamese (Bing Translator) and also with pairs involving Hindi.
Published: 2021
Full Text: View/download PDF

30. Parallel Algorithm of Hierarchical Phrase Machine Translation Based on Distributed Network Memory

Author: Guanghua Qiu
Subjects: Phrase, Machine translation, Computer science, Parallel algorithm, Parallel computing, computer.software_genre, computer, Information Systems, Management Information Systems
Abstract: Machine translation has developed rapidly. But there are some problems in machine translation, such as good reading, unable to reflect the mood and context, and even some language machines can not recognize. In order to improve the quality of translation, this paper uses the SSCI method to improve the quality of translation. It is found that the translation quality of hierarchical phrases is significantly improved after using the parallel algorithm of machine translation, which is about 9% higher than before, and the problem of context free grammar is also solved. The research also found that the use of parallel algorithm can effectively reduce the network memory occupation, the original 10 character content, after using the parallel algorithm, only need to occupy 8 characters, the optimization reaches 20%. This means that the parallel algorithm of hierarchical phrase machine translation based on distributed network memory can play a very important role in machine translation.
Published: 2021
Full Text: View/download PDF

31. The challenges of machine translation of academic publications

Author: N. G. Popova and D. A. Rew
Subjects: Machine translation, Computer science, Foreign language, Pharmaceutical Science, Scientific literature, computer.software_genre, Data science, Terminology, Complementary and alternative medicine, Scripting language, Academic writing, Encyclopedia, Pharmacology (medical), Source text, computer
Abstract: Clear translation remains a major challenge to better communication and understanding of the international academic literature, despite advances in Machine Translation (MT). Automatic translation systems which captured the detail and the sense of any manuscript in any language for a reader from any other linguistic background would find global applications.In this article, we discuss the current opportunities and constraints to the wider use of machine translation and computer-assisted human translation (CAT). At the present stage of technology development, these instruments offer a number of advantages to specialists working with scientific texts. These include the facility to skim and scan large amounts of information in foreign languages, and to act as digital dictionaries, thesauri and encyclopedias. Word-to-word and phrase-to-phrase translation between many languages and scripts is now well advanced.The availability of modern machine translation has therefore changed the work of specialist scientific translators, placing greater emphasis on more advanced text and sense editing skills. However, machine translation is still challenged by the nuances of language and culture from one society to another, particularly in the freestyle literature of the arts and humanities. Scientific papers are generally much more structured, but the quality of machine translation still largely depends on the quality of the source text. This varies considerably between different scientific disciplines and from one author to another.The most advanced translation systems are making steady progress. It is timely to revisit traditional training programmes in the field of written translation to focus on the development of higher-level research competencies, such as terminology search, and so to make best use of evolving machine translation technologies.More widely, we consider that there is a challenge across the higher education systems in all countries to develop a simple, clear and consistent “international” writing style to assist fast, reliable and low-cost machine translation and hence to advance mutual understanding across the global scientific literature.
Published: 2021
Full Text: View/download PDF

32. A New Concept of Electronic Text Based on Semantic Coding System for Machine Translation

Author: EddineMeftah Mohammed Charaf
Subjects: General Computer Science, Machine translation, Computer science, business.industry, media_common.quotation_subject, Structural ambiguity, Ambiguity, computer.software_genre, Code (semiotics), Field (computer science), Coding system, Artificial intelligence, business, computer, Natural language processing, Word (computer architecture), media_common
Abstract: In the field of machine translation of texts, the ambiguity in both lexical (dictionary) and structural aspects is still one of the difficult problems. Researchers in this field use different approaches, the most important of which is machine learning in its various types. The goal of the approach that we propose in this article is to define a new concept of electronic text, which makes the electronic text free from any lexical or structural ambiguity. We used a semantic coding system that relies on attaching the original electronic text (via the text editor interface) with the meanings intended by the author. The author defines the meaning desired for each word that can be a source of ambiguity. The proposed approach in this article can be used with any type of electronic text (text processing applications, web pages, email text, etc.). Thanks to the approach that we propose and through the experiments that we have conducted using it, we can obtain a very high accuracy rate. We can say that the problem of lexical and structural ambiguity can be completely solved. With this new concept of electronic text, the text file contains not only the text but also with it the true sense of the exact meaning intended by the writer in the form of symbols. These semantic symbols are used during machine translation to obtain a translated text completely free of any lexical and structural ambiguity.
Published: 2021
Full Text: View/download PDF

33. Improving neural machine translation with latent features feedback

Author: Min Zhang, Junhui Li, and Yachao Li
Subjects: Machine translation, Computer science, Cognitive Neuroscience, Speech recognition, Translation (geometry), ENCODE, computer.software_genre, Computer Science Applications, Artificial Intelligence, Encoding (memory), Feature (machine learning), Layer (object-oriented design), Representation (mathematics), Encoder, computer
Abstract: Most state-of-the-art neural machine translation (NMT) models progressively encode feature representation in a bottom-up feed-forward fashion. This traditional encoding mechanism has no guidance from external signals. In computer vision tasks, the feedback connection plays a crucial role, particularly for understanding tasks. In this paper, we propose a simple but effective approach to learn latent feature representations explicitly from input sentences via a latent feature encoder (LFE), which are fed back to an NMT encoder via a top-down feedback mechanism. Through the feedback mechanism, the representations in one layer are influenced by representations of both lower and higher layers, resulting in a more effective encoding mechanism. Besides, to enhance the capability of the LFE in better capturing latent features from the source sentences, we pre-train the LFE via a Denoising Auto-Encoder (DAE) strategy. Experimentation on the large-scale WMT 2014 English-to-German and WMT 2017 Chinese-to-English translation tasks demonstrates that our proposed LFE, either pre-trained with the DAE or not, significantly outperforms the strong baseline.
Published: 2021
Full Text: View/download PDF

34. Automatic RadLex coding of Chinese structured radiology reports based on text similarity ensemble

Author: Hailing Cai, Shan Nan, Huilong Duan, Xudong Lu, Qi Tian, and Yani Chen
Subjects: medicine.medical_specialty, China, Jaccard index, Machine translation, Computer science, Computer applications to medicine. Medical informatics, WordNet, R858-859.7, Health Informatics, Hybrid translation, computer.software_genre, Text similarity ensemble, Similarity (network science), medicine, Humans, Automatic coding, Word2vec, Language, Natural Language Processing, Health Policy, Research, Levenshtein distance, Standardized radiology reports, Computer Science Applications, Radiology Information Systems, Multilayer perceptron, Radiology, computer, Algorithms, Coding (social sciences)
Abstract: BackgroundStandardized coding of plays an important role in radiology reports’ secondary use such as data analytics, data-driven decision support, and personalized medicine. RadLex, a standard radiological lexicon, can reduce subjective variability and improve clarity in radiology reports. RadLex coding of radiology reports is widely used in many countries, but translation and localization of RadLex in China are far from being established. Although automatic RadLex coding is a common way for non-standard radiology reports, the high-accuracy cross-language RadLex coding is hardly achieved due to the limitation of up-to-date auto-translation and text similarity algorithms and still requires further research.MethodsWe present an effective approach that combines a hybrid translation and a Multilayer Perceptron weighting text similarity ensemble algorithm for automatic RadLex coding of Chinese structured radiology reports. Firstly, a hybrid way to integrate Google neural machine translation and dictionary translation helps to optimize the translation of Chinese radiology phrases to English. The dictionary is made up of 21,863 Chinese–English radiological term pairs extracted from several free medical dictionaries. Secondly, four typical text similarity algorithms are introduced, which are Levenshtein distance, Jaccard similarity coefficient, Word2vec Continuous bag-of-words model, and WordNet Wup similarity algorithms. Lastly, the Multilayer Perceptron model has been used to synthesize the contextual, lexical, character and syntactical information of four text similarity algorithms to promote precision, in which four similarity scores of two terms are taken as input and the output presents whether the two terms are synonyms.ResultsThe results show the effectiveness of the approach with an F1-score of 90.15%, a precision of 91.78% and a recall of 88.59%. The hybrid translation algorithm has no negative effect on the final coding, F1-score has increased by 21.44% and 8.12% compared with the GNMT algorithm and dictionary translation. Compared with the single similarity, the result of the MLP weighting similarity algorithm is satisfactory that has a 4.48% increase compared with the best single similarity algorithm, WordNet Wup.ConclusionsThe paper proposed an innovative automatic cross-language RadLex coding approach to solve the standardization of Chinese structured radiology reports, that can be taken as a reference to automatic cross-language coding.
Published: 2021

35. ATLASLang NMT: Arabic text language into Arabic sign language neural machine translation

Author: Mourad Brour and Abderrahim Benabbou
Subjects: Interlingua, General Computer Science, Machine translation, Exploit, Arabic, Computer science, 02 engineering and technology, computer.software_genre, Simple (abstract algebra), 0202 electrical engineering, electronic engineering, information engineering, Arabic sign language, Machine translation system, Artificial neural network, business.industry, Hearing impaired, 020206 networking & telecommunications, QA75.5-76.95, language.human_language, Natural language processing, Arabic language, Electronic computers. Computer science, language, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Deaf people, Natural language processing, Word (computer architecture)
Abstract: ATLASLang is a machine translation system from Arabic text language into Arabic sign language (ArSL). The first version of the system (Brour and Benabbou, 2019) is based on two approaches: rule-based Interlingua and example-based approaches. It is a classical machine translation system, its limitations are the linguistic knowledge necessary to develop the rules, also many rules and exceptions required. In the last few years, a neural machine translation has achieved notable results, and several well-known companies including Google (Wu et al., 2016) and Systran (Crego et al., 2016) have starting to exploit it. In this paper, we present a new version of ATLASLang that is implemented using a feed-forward back-propagation Artificial Neural Network. In this version, we have started to translate simple sentences composed from a limited number of words since generally communication with deaf person uses short sentences. We have used the same ATLASLang MTS database of signs and we have utilized morphological characteristics to derive the maximum information from each word in the input of the neural translation system. The two versions of the system are compared using n-gram BLEU score (Papineni et al., 2002), and the results demonstrate that neuron network approaches outperform the other classical approaches.
Published: 2021
Full Text: View/download PDF

36. Pengembangan Metode Neural Machine Translation Berdasarkan Hyperparameter Neural Network

Author: Muhammad Yusuf Aristyanto and Robert Kurniawan
Subjects: Hyperparameter, Measure (data warehouse), Artificial neural network, Machine translation, Computer science, business.industry, Activation function, Overfitting, Machine learning, computer.software_genre, Artificial intelligence, business, computer, Dropout (neural networks), Test data
Abstract: Manusia sebagai makhluk sosial yang selalu ingin berhubungan dengan manusia lainnya memaksa manusia untuk saling berkomunikasi. Di sinilah peran bahasa menjadi amat penting, karena dengan adanya bahasa, maka akan dengan mudah mengerti apa yang ingin disampaikan oleh orang lain. Untuk itu, perlu adanya media yang dapat membantu memahami berbagai bahasa di dunia, salah satunya adalah mesin penerjemah. Salah satu metode yang dapat digunakan untuk membuat mesin penerjemah adalah Neural Machine Translation (NMT). NMT yang sekarang sudah ada masih memiliki berbagai kekurangan dan perlu dilakukan pengembangan lebih jauh. Diantaranya pada masalah overfitting yang membuat modelnya kurang bisa melakukan generalisasi pada data lain yang diujikan. Banyak hal yang mempengaruhi performa dari NMT tersebut, salah satunya adalah ukuran hyperparameter yang digunakan dan arsitektur model yang digunakan. Namun belum ada ukuran pasti yang dapat digunakan untuk menghasilkan model dengan performa yang terbaik. Sehingga penelitian ini bertujuan untuk mengembangkan arsitektur model NMT dan melakukan simulasi pada masing-masing hyperparameter Neural Network dan ukuran pada arsitektur modelnya, antara lain batch size, epoch, optimizer, activation function, dan dropout rate. Hasil yang didapatkan adalah model pengembangan dapat mengatasi masalah overfitting dari model sebelumnya dengan akurasi sebesar 72,24% dan skor BLEU sebesar 45,83% yang dilakukan pada data uji lainnya.
Published: 2021
Full Text: View/download PDF

37. Graph transformer networks based text representation

Author: Xiaoyan Cai, Mei Xin, Nanxin Wang, and Libin Yang
Subjects: Sequence, Theoretical computer science, Machine translation, Computer science, Computer Science::Information Retrieval, Cognitive Neuroscience, Node (networking), Text graph, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), computer.software_genre, Computer Science Applications, Artificial Intelligence, Graph (abstract data type), Representation (mathematics), computer, Feature learning, Transformer (machine learning model)
Abstract: Graph Neural Networks (GNN) has been used to exploit global features in text representation learning for natural language processing (NLP) tasks, including text classification, sequence tagging, neural machine translation and relational reasoning. However, GNN based models usually build a graph for the entire corpus, they have high memory consumption, ignoring the order of words and containing test documents in the training graph. Thus, these models are inherently transductive and have difficulties in inductive learning. In order to solve the above problems, we propose a Graph Transformer Networks based Text representation (GTNT) model. It first constructs a degree-centric text graph, which generates a text graph for each document in the corpus. Then it adopts a graph transformer network to model the graph to obtain node embeddings. When we apply our proposed GTNT model to citation recommendation and text classification tasks, the experimental results show that our model outperforms other state-of-the-art models.
Published: 2021
Full Text: View/download PDF

38. Optimizing Machine Translation to Overcome Mechanical Engineering Vocational Education Students Difficulties in Academic Writing

Author: Trisya Avianka and Siti Drivoka Sulistyaningrum
Subjects: Machine translation, Vocational education, Pedagogy, Academic writing, computer.software_genre, Psychology, computer
Abstract: Mesin terjemahan penting untuk pembelajaran dan pengajaran terutama penulisan akademis. Biasanya digunakan untuk menyelesaikan kesulitan menulis akademik siswa yang dianggap bermasalah selama beberapa tahun terakhir. Oleh karena itu, penelitian ini bertujuan untuk mengetahui penggunaan mesin terjemahan untuk mengatasi kesulitan mahasiswa Pendidikan Vokasi Teknik Mesin dalam penulisan akademik. Penelitian ini menggunakan survei kualitatif dengan desain penelitian analisis deskriptif. Sumber data adalah 27 mahasiswa semester 2 Pendidikan Vokasi Teknik Mesin yang sedang menempuh mata kuliah Bahasa Inggris di Universitas Negeri Jakarta. Pengumpulan data dilakukan melalui kuesioner dan dianalisis menggunakan analisis deskriptif. Kuesioner 1 digunakan untuk mengetahui apakah peserta menggunakan terjemahan mesin atau tidak dan untuk mengetahui jenis mesin terjemahan yang biasa mereka gunakan. Angket 2 dibagi menjadi dua bagian. BAGIAN A diadaptasi dari Xiao dan Chen (2015) karena menjelaskan kesulitan menulis akademik siswa. Terdiri dari 12 item yang didistribusikan melalui Google Form kepada 27 siswa. Sedangkan BAGIAN B diadaptasi dari temuan S. M. Lee (2020) sebelumnya. Ditemukan bahwa kesulitan dalam penulisan akademik ada di dalam kelas. Kesulitan terbesar terletak pada kesulitan tata bahasa diikuti oleh kosa kata dan ekspresi. Ternyata penggunaan terjemahan mesin sangat membantu mereka mengatasi kesulitan kosakata, diikuti oleh tata bahasa dan ekspresi. Machine translation is important for learning and teaching especially academic writing. It usually used to overcome students’ academic writing difficulties which have been regarded problematic for recent years. Hence this study aims to discover the employment of machine translation to overcome Mechanical Engineering Vocational Education students’ difficulties in academic writing. This research used qualitative survey with descriptive analysis as the design of the study. The data sources were 27 2nd semester students of Mechanical Engineering Vocational Education which currently taking English as the college course at Universitas Negeri Jakarta. The data was gathered through questionnaires and analyzed using descriptive analysis. Questionnaire 1 is used to know whether the participant used machine translation or not and to discover what kind of machine translation they usually used. Questionnaire 2 was divided into two parts. PART A was adapted from Xiao and Chen (2015) as they described students’ academic writing difficulties. It consists of 12 items which distributed through Google Form to 27 students. Meanwhile, PART B was adapted from S. M. Lee (2020) previous findings. It was found that the difficulties in academic writing exist in the classroom. The greatest difficulty lies in difficulties in grammar followed by vocabulary and expressions. It turned out that employing machine translation can help them the most to overcome vocabulary difficulties, followed by grammar and expression.
Published: 2021
Full Text: View/download PDF

39. Recent advances of low-resource neural machine translation

Author: Chao-Hong Liu, Rejwanul Haque, and Andy Way
Subjects: Linguistics and Language, Training set, Artificial neural network, Machine translation, Low resource, business.industry, Computer science, 02 engineering and technology, Machine learning, computer.software_genre, Language and Linguistics, Artificial Intelligence, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Computational linguistics, business, computer, Protocol (object-oriented programming), Software
Abstract: In recent years, neural network-based machine translation (MT) approaches have steadily superseded the statistical MT (SMT) methods, and represents the current state-of-the-art in MT research. Neural MT (NMT) is a data-driven end-to-end learning protocol whose training routine usually requires a large amount of parallel data in order to build a reasonable-quality MT system. This is particularly problematic for those language pairs that do not have enough parallel text for training. In order to counter the data sparsity problem of the NMT training, MT researchers have proposed various strategies, e.g. augmenting training data, exploiting training data from other languages, alternative learning strategies that use only monolingual data. This paper presents a survey on recent advances of NMT research from the perspective of low-resource scenarios.
Published: 2021
Full Text: View/download PDF

40. Supporting Risk-Aware Use of Online Translation Tools in Delivering Mental Healthcare Services among Spanish-Speaking Populations

Author: Tianyong Hao, Wenxiu Xie, Meng Ji, Kam-Yiu Lam, Xiaobo Qian, Chi-Yin Chow, and Mengdan Zhao
Subjects: Mental Health Services, Article Subject, General Computer Science, Machine translation, Computer science, General Mathematics, Computer applications to medicine. Medical informatics, R858-859.7, Neurosciences. Biological psychiatry. Neuropsychiatry, computer.software_genre, Machine Learning, Relevance vector machine, 03 medical and health sciences, 0302 clinical medicine, Feature (machine learning), Humans, Translations, 030212 general & internal medicine, Interpretability, Learning classifier system, business.industry, General Neuroscience, Bayes Theorem, Usability, General Medicine, Data science, Mental health, 3. Good health, Mental Health, business, computer, Classifier (UML), 030217 neurology & neurosurgery, Research Article, RC321-571
Abstract: Neural machine translation technologies are having increasing applications in clinical and healthcare settings. In multicultural countries, automatic translation tools provide critical support to medical and health professionals in their interaction and exchange of health messages with migrant patients with limited or non-English proficiency. While research has mainly explored the usability and limitations of state-of-the-art machine translation tools in the detection and diagnosis of physical diseases and conditions, there is a persistent lack of evidence-based studies on the applicability of machine translation tools in the delivery of mental healthcare services for vulnerable populations. Our study developed Bayesian machine learning algorithms using relevance vector machine to support frontline health workers and medical professionals to make better informed decisions between risks and convenience of using online translation tools when delivering mental healthcare services to Spanish-speaking minority populations living in English-speaking countries. Major strengths of the machine learning classifier that we developed include scalability, interpretability, and adaptability of the classifier for diverse mental healthcare settings. In this paper, we report on the process of the Bayesian machine learning classifier development through automatic feature optimisation and the interpretation of the classifier-enabled assessment of the suitability of original English mental health information for automatic online translation. We elaborate on the interpretation of the assessment results in clinical settings using statistical tools such as positive likelihood ratios and negative likelihood ratios.
Published: 2021
Full Text: View/download PDF

41. Deep Learning Methods for Sign Language Translation

Author: Shagan Sah, Akash Chintha, Thomastine Sarchet, Akhil Santha, Priyanshu Srivastava, Raymond Ptucha, Nikunj R. Kotecha, Ifeoma Nwogu, Brian P. Landy, Tejaswini Ananthanarayana, Andre Webster, and Joseph Panaro
Subjects: Machine translation, business.industry, Computer science, Deep learning, Sign language, computer.software_genre, Convolutional neural network, Computer Science Applications, Human-Computer Interaction, Reinforcement learning, Artificial intelligence, business, Modality (semiotics), computer, Natural language processing, Natural language, Sign (mathematics)
Abstract: Many sign languages are bona fide natural languages with grammatical rules and lexicons hence can benefit from machine translation methods. Similarly, since sign language is a visual-spatial language, it can also benefit from computer vision methods for encoding it. With the advent of deep learning methods in recent years, significant advances have been made in natural language processing (specifically neural machine translation) and in computer vision methods (specifically image and video captioning). Researchers have therefore begun expanding these learning methods to sign language understanding. Sign language interpretation is especially challenging, because it involves a continuous visual-spatial modality where meaning is often derived based on context. The focus of this article, therefore, is to examine various deep learning–based methods for encoding sign language as inputs, and to analyze the efficacy of several machine translation methods, over three different sign language datasets. The goal is to determine which combinations are sufficiently robust for sign language translation without any gloss-based information. To understand the role of the different input features, we perform ablation studies over the model architectures (input features + neural translation models) for improved continuous sign language translation. These input features include body and finger joints, facial points, as well as vector representations/embeddings from convolutional neural networks. The machine translation models explored include several baseline sequence-to-sequence approaches, more complex and challenging networks using attention, reinforcement learning, and the transformer model. We implement the translation methods over multiple sign languages—German (GSL), American (ASL), and Chinese sign languages (CSL). From our analysis, the transformer model combined with input embeddings from ResNet50 or pose-based landmark features outperformed all the other sequence-to-sequence models by achieving higher BLEU2-BLEU4 scores when applied to the controlled and constrained GSL benchmark dataset. These combinations also showed significant promise on the other less controlled ASL and CSL datasets.
Published: 2021
Full Text: View/download PDF

42. Evaluation of google translate in rendering English COVID-19 texts into Arabic

Author: Yousef Albudairi, Samah Meqdadi, and Zakaryia Almahasees
Subjects: Lexis, Linguistics and Language, Service (systems architecture), Grammar, Machine translation, media_common.quotation_subject, Control (management), Semantics, computer.software_genre, Punctuation, Language and Linguistics, Linguistics, Education, Set (psychology), Psychology, computer, media_common
Abstract: Machine Translation (MT) has the potential to provide instant translation in times of crisis. MT provides real solutions that can remove borders between people and COVID-19 information. The widespread of MT system makes it worthy of scrutinizing the capacity of the most prominent MT system, Google Translate, to deal with COVID-19 texts into Arabic. The study adopted (Costa et al., 2015a) framework in analysing the output of Google Translate output service in terms of orography, grammar, lexis, and semantics. The study’s corpus was extracted from World Health Organization (WHO), United Nations Children’s Emergency Fund (UNICEF), U.S. Food and Drug Administration (FDA), the Foreign, Commonwealth & Development Office (FCDO), and European Centre for Disease Prevention and Control (ECDC). The paper reveals that Google Translate committed a set of errors: semantic, grammatical, lexical, and punctuation. Such errors inhibit the intelligibility of the translated texts. It also indicates that MT might work as an aid to translate general information about COVID-19, but it is still incapable of dealing with critical information about COVID-19. The paper concludes that MT can be an effective tool, but it can never replace human translators.
Published: 2021
Full Text: View/download PDF

43. Analyse des erreurs de traduction automatique pour la combinaison de langues slovène-français et perspectives pour une formation en post-édition

Author: Sonia Vaupot
Subjects: Foreign language learning, Machine translation, First language, Foreign language, Sociology, computer.software_genre, Language industry, computer, Humanities
Abstract: Ces dernières années, la traduction automatique (TA) a considérablement progressé de sorte que des connaissances en post-édition (PE) deviennent nécessaires dans l’industrie de la traduction. Cela nécessite de nouvelles connaissances et la mise en place d’un enseignement dans le domaine de la PE. Dans le présent article, nous vérifions l’usage de la TA dans une formation en traduction de niveau Master (Université de Ljubljana), à partir d’une langue maternelle à moindre diffusion (le slovène) vers une langue étrangère à large diffusion (le français), et nous en dégageons les erreurs dues à la TA. L’analyse des erreurs indique que la qualité de la TA diffère notamment en fonction du type de texte et du niveau de difficulté du texte à traduire. Enfin, nous proposons quelques perspectives pour une formation en PE axée sur la traduction et l’apprentissage d’une langue étrangère.
Published: 2021
Full Text: View/download PDF

44. The German EU Council Presidency Translator

Author: Josef van Genabith, Stephan Busemann, Artūrs Vasiļevskis, and Mārcis Pinnis
Subjects: German, Presidency, Machine translation, Action (philosophy), Artificial Intelligence, Computer science, language, Public administration, computer.software_genre, computer, language.human_language
Abstract: This contribution describes the German EU Council Presidency Translator (EUC PT), a machine translation service created for the German EU Council Presidency in the second half of 2020, which is open to the general public. Following a series of earlier presidency translators, the German version exhibits important extensions and improvements. The German EUC PT is the first to integrate systems from commercial vendors, public services, and a research center, using a mix of custom and generic translation engines, and to introduce a new webpage translation widget. A further important feature is the close collaboration with human translators from the German ministries, who were provided with computer-assisted translation tool plugins integrating machine translation services into their daily work environments. Uptake by the public reflects a huge interest in the service, showing the need for breaking language barriers.
Published: 2021
Full Text: View/download PDF

45. Comparison of Translation Techniques by Google Translate and U-Dictionary: How Differently Does Both Machine Translation Tools Perform in Translating?

Author: Kammer Tuahman Sipayung, I Made Dwipa Arta, Diani Indah, Novdin Manoktong Sianturi, and Yeti Rohayati
Subjects: Machine translation, Computer science, business.industry, Artificial intelligence, computer.software_genre, Translation (geometry), business, computer, Natural language processing
Abstract: Better translation produced by computation linguistics should be evaluated through linguistics theory. This research aims to describe translation techniques between Google Translate and U-Dictionary. The study used a qualitative research method with a descriptive design. This design was used to describe the occurrences of translation techniques in both translation machine, with the researchers serving as an instrument to compare translation techniques which is produced on machine. The data are from expository text entitled “Importance of Good Manners in Every Day Life”. The total data are 122 words/phrases which are pairs of translations, English as source language and Indonesia as target language. The result shows that Google Translate apply five of Molina & Albir’s (2002) eighteen translation techniques, while U-dictionary apply seven techniques. Google Translate dominantly apply literal translation techniques (86,8%) followed by reduction translation techniques (4,9%). U-dictionary also dominantly apply literal translation techniques (75,4%), but follows with the variation translation techniques (13,1%). This study showed that both machines produced different target texts for the same source language due to different applications of techniques, with U-dictionary proven to apply more variety of translation techniques than Google Translate. The researcher hopes this study can be used as an evaluation for improving the performance of machine translations.
Published: 2021
Full Text: View/download PDF

46. The Place of Machine Translation in Teaching Translation

Author: Erdinç Aslan
Subjects: Machine translation, Philosophy, General Medicine, computer.software_genre, computer, Humanities
Abstract: Les développements rapides de la traduction automatique ces dernières années ont considérablement affecté le domaine de la traduction. Les nouvelles méthodes qui sont appliquées ont fourni des améliorations significatives dans la qualité de la traduction. D’une part cette situation a fait de la traduction automatique d’un domaine sur lequel les grandes entreprises se concentrent et réalisent des investissements importants, et par conséquent, elle a augmenté la part de marché de la traduction automatique dans le secteur, et d’autre part elle l'a rendu populaire en permettant aux gens de s'intéresser à la traduction automatique. Malgré cela, la mesure dans laquelle ces développements peuvent être reflétés dans le programme d'enseignement de la traduction est discutée par les enseignants en traduction. Cette étude se concentre sur la place de la traduction automatique dans l'enseignement de la traduction et examine comment ces systèmes peuvent être reflétés dans le programme d'enseignement de la traduction.
Published: 2021
Full Text: View/download PDF

47. Usage of The Field Index in Machine Translation

Author: Yenok Grishkyan
Subjects: Index (economics), Machine translation, Field (physics), computer.software_genre, Algorithm, computer, Mathematics
Abstract: The current article discusses the main problems of human and machine translations, as well as introduces a new lexical description in machine translation for faster and more accurate translation. The new method uses so-called field indicators or the Field Indices to facilitate the MT search engine for words by marking these words with special components creating a semantic field, and allowing the MT devices to search for the word according to its usage in the text. The Field Index system covers the semantic description of the following main spheres: scientific field, public or social field and humanitarian field. These three spheres contain subfields that usually mingle with the parent index through a dash, with the parent index being with the first one. The scientific field includes such aspects as geography, mathematics, chemistry, physics, economy, medicine, etc., with related subfields like diseases and biological terms (for medicine), names of drugs (as a separate filed), finance and accounting (as part of economy), etc. Applied Sciences Index contains miscellaneous words used by other subfields of the same scientific sphere: e.g., computer, telephone, function and many others, and plays a crucial part in distinguishing polysemantic words such as mouse (hardware), root (in mathematics), etc. The public or social field contains subfields that narrow the meaning of words to a specific one and includes aspects as art, agriculture, law, education, religion, housing utilities, time, transportation, people, etc.i+iT Ss the widest semantic field containing a lot of subfields specifying words that belong to such groups as colours, architecture, games, music, sport, etc. (for art index), clothing, beverages, food and production (for agriculture index). Notions of time, people, professions and terms for religion and items used in household are present in this group due to its wide usage within the society. The humanitarian sphere deals mostly with terms used in languages, literature, manuscripts and libraries. These subfields help identify polysemantic words between nations and languages, book titles and ordinary words and phrases, and literary styles (documentaries, fairy tales, dramas, etc.). In turn, these can be further defined as prose or a poem. All formulae proposed in the project consider the presence of the Field Indices and its position at the end of the description of the word. Depending on the target language, the translated version should be identical with the source following this very principle.
Published: 2021
Full Text: View/download PDF

48. Dual contextual module for neural machine translation

Author: Isaac K. E. Ampomah, Glenn I. Hawe, and Sally McClean
Subjects: Linguistics and Language, Machine translation, business.industry, Computer science, Context (language use), DUAL (cognitive architecture), computer.software_genre, Language and Linguistics, Artificial Intelligence, Artificial intelligence, Computational linguistics, Representation (mathematics), business, computer, Encoder, Software, Natural language processing, Sentence, Transformer (machine learning model)
Abstract: Self-attention-based encoder-decoder frameworks have drawn increasing attention in recent years. The self-attention mechanism generates contextual representations by attending to all tokens in the sentence. Despite improvements in performance, recent research argues that the self-attention mechanism tends to concentrate more on the global context with less emphasis on the contextual information available within the local neighbourhood of tokens. This work presents the Dual Contextual (DC) module, an extension of the conventional self-attention unit, to effectively leverage both the local and global contextual information. The goal is to further improve the sentence representation ability of the encoder and decoder subnetworks, thus enhancing the overall performance of the translation model. Experimental results on WMT’14 English-German (En$$\rightarrow $$ → De) and eight IWSLT translation tasks show that the DC module can further improve the translation performance of the Transformer model.
Published: 2021
Full Text: View/download PDF

49. A Review of Machine Translation Tools: The Translation’s Ability

Author: Tira Nur Fitria
Subjects: Machine translation, Computer science, business.industry, Flexibility (personality), Context (language use), computer.software_genre, Translation (geometry), Field (computer science), Documentation, Diction, Artificial intelligence, IBM, business, computer, Natural language processing
Abstract: The objective of the research is to review the ability of online machine translator tools includes Google Translate (GT), Collin Translator (CT), Bing Translator (BT), Yandex Translator (YT), Systran Translate (ST), and IBM Translator (IT). This research applies descriptive qualitative. The documentation was used in this study. The result of the analysis shows that the translation results are different, both from the style of language and the choice of words used by each machine translation tool. Thus, directly or indirectly, whether consciously or not, each translation machine carries its characteristics. Machine translation technology cannot be separated from the active role of humans. In other words, it will always be the best choice for users to rely on expert translation rather than machine translation. But no machine translator can be as accurate as human skills in producing translation products. In particular, the field of translation is also concerned with machine translation to support the performance of translators in analyzing the diction used as an element of language. In this regard, it needs to be underlined that the existence of machine translation is an additional facility in the world of translation, not as the main means of translation because the sophistication of the machine will not be able to match the flexibility of the human brain's cognitive abilities in adjusting the translation results according to the existing context. Accurate translation is sometimes subjective, relatively often temporal. Therefore, it is permissible for translating by more than one machine translator
Published: 2021
Full Text: View/download PDF

50. Los motores de traducción automática y su uso como herramienta lexicográfica en la traducción de unidades léxicas aisladas

Author: María Liébana González and Concepción Maldonado González
Subjects: Linguistics and Language, Machine translation, Computer science, business.industry, media_common.quotation_subject, Context (language use), Lexicographical order, computer.software_genre, Language and Linguistics, Focus (linguistics), Quality (business), Artificial intelligence, business, computer, Natural language processing, media_common
Abstract: Este trabajo ofrece un breve repaso de algunos de los principales sistemas de traducción automática (TA) que conviven actualmente en el mercado, así como de los problemas más comunes a los que nos exponemos al hacer uso de la traducción automática en la traducción de unidades léxicas aisladas, fuera de contexto. El objetivo es señalar, o al menos cuestionar, las limitaciones lingüísticas que presentan los motores de TA y reflexionar sobre la necesidad de incorporar datos lexicográficos a los mismos, con el fin de mejorar su rendimiento. Para ello, evaluaremos, desde un enfoque práctico, el comportamiento de Google Traductor, Bing Microsoft Translator y el traductor de DeepL, en la traducción de unidades léxicas aisladas. Compararemos los resultados que habríamos obtenido si, para las mismas tareas, hubiéramos recurrido a diccionarios bilingües en vez de a motores de TA. Constataremos que la respuesta generada por los motores de TA resulta insuficiente para solucionar con precisión nuestras dudas, y que los datos lexicográficos recogidos en diferentes diccionarios bilingües en línea, por lo general, aportan información lexicográfica más completa y adecuada para satisfacer las necesidades comunicativas del usuario.
Published: 2021
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Journal

Database

Publisher

11,912 results on '"Machine translation"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources