Descriptor: "TTS" / Publication Type: Electronic Resources - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"TTS"' showing total 46 results

1. Synthesising prosody with insufficient context

Author: Hodari, Zack, King, Simon, Watts, Oliver, and Bell, Peter
Subjects: text-to-speech synthesis, TTS, expressivity, prosody, situational information, natural language
Abstract: Prosody is a key component in human spoken communication, signalling emotion, attitude, information structure, intention, and other communicative functions through perceived variation in intonation, loudness, timing, and voice quality. However, the prosody in text-to-speech (TTS) systems is often monotonous and adds no additional meaning to the text. Synthesising prosody is difficult for several reasons: I focus on three challenges. First, prosody is embedded in the speech signal, making it hard to model with machine learning. Second, there is no clear orthography for prosody, meaning it is underspecified in the input text and making it difficult to directly control. Third, and most importantly, prosody is determined by the context of a speech act, which TTS systems do not, and will never, have complete access to. Without the context, we cannot say if prosody is appropriate or inappropriate. Context is wide ranging, but state-of-the-art TTS acoustic models only have access to phonetic information and limited structural information. Unfortunately, most context is either difficult, expensive, or impossible to collect. Thus, fully specified prosodic context will never exist. Given there is insufficient context, prosody synthesis is a one-to-many generative task: it necessitates the ability to produce multiple renditions. To provide this ability, I propose methods for prosody control in TTS, using either explicit prosody features, such as F0 and duration, or learnt prosody representations disentangled from the acoustics. I demonstrate that without control of the prosodic variability in speech, TTS will produce average prosody, i.e. flat and monotonous prosody. This thesis explores different options for operating these control mechanisms. Random sampling of a learnt distribution of prosody produces more varied and realistic prosody. Alternatively, a human-in-the-loop can operate the control mechanism, using their intuition to choose appropriate prosody. To improve the effectiveness of human-driven control, I design two novel approaches to make control mechanisms more human-interpretable. Finally, it is important to take advantage of additional context as it becomes available. I present a novel framework that can incorporate arbitrary additional context, and demonstrate my state-of-the-art context-aware model of prosody using a pre-trained and fine-tuned language model. This thesis demonstrates empirically that appropriate prosody can be synthesised with insufficient context by accounting for unexplained prosodic variation.
Published: 2022

2. Unsupervised learning for text-to-speech synthesis

Author: Watts, Oliver Samuel, King, Simon, Clark, Robert, and Yamagishi, Junichi
Subjects: 414, unsupervised learning, vector space model, speech synthesis, TTS, text-to-speech
Abstract: This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data that are not annotated. The method therefore aids the building of systems for languages in which conventional linguistic resources are scarce, but is not restricted to these languages. The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects. This space is then partitioned during the training of acoustic models for synthesis, so that the models generalise over objects' surface forms in a way that is acoustically relevant. The method is applied to three levels of textual analysis: to the characterisation of sub-syllabic units, word units and utterances. Entire systems for three languages (English, Finnish and Romanian) are built with no reliance on manually labelled data or language-specific expertise. Results of a subjective evaluation are presented.
Published: 2013

3. Learning in spiking neural networks

Author: Davies, Sergio, Garside, James, and Furber, Stephen
Subjects: 006.3, Learning, Spiking neural networks, Neural network simulators, Neuromimetic hardware, Neuromorphic hardware, SpiNNaker, Population-based routing, STDP, TTS, Real-time software, Asynchronous software execution
Abstract: Artificial neural network simulators are a research field which attracts the interest of researchers from various fields, from biology to computer science. The final objectives are the understanding of the mechanisms underlying the human brain, how to reproduce them in an artificial environment, and how drugs interact with them. Multiple neural models have been proposed, each with their peculiarities, from the very complex and biologically realistic Hodgkin-Huxley neuron model to the very simple 'leaky integrate-and-fire' neuron. However, despite numerous attempts to understand the learning behaviour of the synapses, few models have been proposed. Spike-Timing-Dependent Plasticity (STDP) is one of the most relevant and biologically plausible models, and some variants (such as the triplet-based STDP rule) have been proposed to accommodate all biological observations. The research presented in this thesis focuses on a novel learning rule, based on the spike-pair STDP algorithm, which provides a statistical approach with the advantage of being less computationally expensive than the standard STDP rule, and is therefore suitable for its implementation on stand-alone computational units. The environment in which this research work has been carried out is the SpiNNaker project, which aims to provide a massively parallel computational substrate for neural simulation. To support such research, two other topics have been addressed: the first is a way to inject spikes into the SpiNNaker system through a non-real-time channel such as the Ethernet link, synchronising with the timing of the SpiNNaker system. The second research topic is focused on a way to route spikes in the SpiNNaker system based on populations of neurons. The three topics are presented in sequence after a brief introduction to the SpiNNaker project. 
Future work could include structural plasticity (also known as synaptic rewiring); here, during the simulation of neural networks on the SpiNNaker system, axons, dendrites and synapses may be grown or pruned according to biological observations.
Published: 2013

4. So-to-Speak : an exploratory platform for investigating the interplay between style and prosody in TTS

Author: Székely, Éva, Wang, Siyang, and Gustafsson, Joakim
Abstract: In recent years, numerous speech synthesis systems have been proposed that feature multi-dimensional controllability, generating a level of variability that surpasses traditional TTS systems by orders of magnitude. However, it remains challenging for developers to comprehend and demonstrate the potential of these advanced systems. We introduce So-to-Speak, a customisable interface tailored for showcasing the capabilities of different controllable TTS systems. The interface allows for the generation, synthesis, and playback of hundreds of samples simultaneously, displayed on an interactive grid, with variation in both low-level prosodic features and high-level style controls. To offer insights into speech quality, automatic MOS estimates are presented for each sample. So-to-Speak facilitates the audiovisual exploration of the interaction between various speech features, which can be useful in a range of applications in speech technology.
Published: 2023

5. Multi-Speaker and Multi-Lingual Text-to-Speech

Author: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Rodríguez Fonollosa, José Adrián, and Gonzálbez Biosca, Daniel
Abstract: Systems dedicated to speech synthesis, popularly known as Text-to-Speech (TTS) systems, have improved notably in recent years thanks to the development and growth of artificial intelligence. These models have many applications, from voice assistants to the dubbing of film productions. The objectives when training a model of this type are increasingly sophisticated. This project reviews the challenges that must be solved when designing a speech synthesis system, as well as the techniques commonly used in recent years to address them. Its main objective is to develop a bilingual model capable of generating a natural voice from text in both Spanish and Catalan. Finally, different solutions are presented and tested for generating the voice of any speaker not present in the databases used to train the system. The difficulties and problems encountered, given the limitations of the data available to develop the system, are discussed.
Published: 2023

6. Estudio de la conversión texto a voz basada en DNN: modelo base y fine-tuning

Author: Peñas Pérez, Irene, Cardeñoso Payo, Valentín, and Escudero Mancebo, David
Abstract: Speech synthesis is a constantly evolving area of research and is currently an active field for generative DNNs. This work addresses the need to develop a Spanish speech synthesis system in order to overcome the linguistic limitations that exist in this field for Spanish and to improve accessibility, for example in virtual assistants. The aim of the Master's dissertation is to explore the use of state-of-the-art neural techniques to create a Spanish base model from a Spanish dataset. The base model is then optimized and fine-tuned on a new dataset, yielding a series of Spanish models. Finally, the models are evaluated and conclusions are drawn. To achieve this objective, the NeMo tool is used: a Spanish base model is created using FastPitch and HiFiGAN. Three different datasets are available for building the models and running the subsequent experiments. The audio signals generated by the base and fine-tuned models are assessed in two evaluations: an objective one using a set of metrics, and a perceptual one in which a group of listeners is asked about the quality and intelligibility of the audio. In conclusion, this work addresses the pressing need to develop datasets and speech synthesis systems in Spanish to overcome linguistic limitations and improve accessibility in applications such as Spanish-language virtual assistants. Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos), Máster en Ingeniería Informática
Published: 2023

7. Investigating reading comprehension in Reading While Listening and the relevancy of The Voice Effect

Author: Hedenström, Edvin and Barck-Holst, Axel
Abstract: Various forms of multimedia learning have been shown to aid learners time and time again. One form of multimedia learning that has not been thoroughly studied is reading while listening (RWL). This is especially the case when it comes to the immediate impact of practising RWL on reading comprehension. Furthermore, recent advancements in Text-To-Speech (TTS) have started to challenge the established notion that real, human-recorded spoken word is always preferable for learning, also known as The Voice Effect. This study looked at Swedish university students with English as their second language (L2) and examined their L2 reading comprehension in three different groups. The groups were Reading Only (RO), Reading-While-Listening with spoken word (RWL-SW) and Reading-While-Listening with text-to-speech (RWL-TTS). The RO group was then compared to the RWL groups. The two RWL groups were also compared on test scores as well as perceived enjoyment and aid from the narration as reported by the participants. Our results did not exhibit any statistically significant difference in reading comprehension between the RO group and the RWL groups. On the reading comprehension test, the RO and RWL-TTS groups got the exact same number of correct answers. This suggests that RWL did not have any notable impact on reading comprehension. Furthermore, no statistically significant difference was found between the two RWL groups in test scores or in perceived enjoyment and aid from the narration. Notably, RWL-SW performed slightly worse than RWL-TTS on the comprehension test. The reported perceived enjoyment and aid from the narration were also notably similar between the two groups. This suggests that The Voice Effect did not have relevance in this test.
Published: 2023

8. Web-based strategies in the manufacturing industry

Author: Velásquez, Luis Alexis
Subjects: 005, Internet, Architectures, SELTOOL, TTS, DISKOVER
Abstract: The explosive growth of Internet-based architectures is allowing efficient access to information resources over geographically dispersed areas. This fact is exerting a major influence on current manufacturing practices. Business activities involving customers, partners, employees and suppliers are being rapidly and efficiently integrated through networked information management environments. Therefore, efforts are required to take advantage of distributed infrastructures that can satisfy information integration and collaborative work strategies in corporate environments. In this research, Internet-based distributed solutions focused on the manufacturing industry are proposed. Three different systems have been developed for the tooling sector, specifically for the company Seco Tools UK Ltd (industrial collaborator). They are summarised as follows. SELTOOL is a Web-based open tool selection system involving the analysis of technical criteria to establish appropriate selection of inserts, toolholders and cutting data for turning, threading and grooving operations. It has been oriented to world-wide Seco customers. SELTOOL provides an interactive and crossed-way of searching for tooling parameters, rather than conventional representation schemes provided by catalogues. Mechanisms were developed to filter, convert and migrate data from different formats to the database (SQL-based) used by SELTOOL. TTS (Tool Trials System) is a Web-based system developed by the author and two other researchers to support Seco sales engineers and technical staff, who would perform tooling trials in geographically dispersed machining centres and benefit from sharing data and results generated by these tests. Through TTS, tooling engineers (authorised users) can submit and retrieve highly specific technical tooling data for both milling and turning operations. 
Moreover, tooling engineers can avoid running new tool trials when another engineer has already carried out the same trials at a physically distant site and the results are known. The system incorporates encrypted security features suitable for restricted use on the World Wide Web. An urgent need exists for tools to make sense of raw data, extracting useful knowledge from increasingly large collections of data now being constructed and made available from networked information environments. This explosive growth in the availability of information is overwhelming the capabilities of traditional information management systems to provide efficient ways of detecting anomalies and significant patterns in large sets of data. Inexorably, the tooling industry is generating valuable experimental data. It is a potential and unexplored sector regarding the application of knowledge capturing systems. Hence, to address this issue, a knowledge discovery system called DISKOVER was developed. DISKOVER is an integrated Java application consisting of five data mining modules, able to be operated through the Internet. Kluster and Q-Fast are two of these modules, entirely developed by the author. Fuzzy-K has been developed by the author in collaboration with another research student in the group at Durham. The final two modules (R-Set and MQG) have been developed by another member of the Durham group. To develop Kluster, a complete clustering methodology was proposed. Kluster is a clustering application able to combine the analysis of quantitative as well as categorical data (conceptual clustering) to establish data classification processes. This module incorporates two original contributions: consistent indicators to measure the quality of the final classification, and the application of optimisation methods to the final groups obtained. 
Kluster lets users introduce case studies to generate cutting parameters for particular input requirements. Fuzzy-K is an application having the advantages of hierarchical clustering, while applying fuzzy membership functions to support the generation of similarity measures. The implementation of fuzzy membership functions helped to optimise the grouping of categorical data containing missing or imprecise values. As the tooling database is accessed through the Internet, which is a relatively slow access platform, it was decided to rely on faster information retrieval mechanisms. Q-Fast is an SQL-based exploratory data analysis (EDA) application, implemented for this purpose.
Published: 2000

9. Novel dual-action tissue through-the-scope clip for endoscopic closure.

Author: Yang, Dennis, Kadkhodayan, Kambiz, Arain, Mustafa A, and Hasan, Muhammad K
Abstract: Video 1: Use of a novel dual-action tissue through-the-scope clip for endoscopic closure.
Published: 2022

10. Diseño de un sistema de seguimiento para animales en peligro de extinción basado en tecnología LPWAN

Author: Sempere Paya, Víctor Miguel, Todolí Ferrandis, David, Universitat Politècnica de València. Departamento de Comunicaciones - Departament de Comunicacions, Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros de Telecomunicación - Escola Tècnica Superior d'Enginyers de Telecomunicació, and Bartolín Arnau, Luis Miguel
Abstract: This project presents the design of a tracking system for the geolocation of endangered animals based on LPWAN technology. It arises from the need to locate such animals in high mountains and rural areas so that they can coexist with humans. The system is built around a transmitter node that emits the animal's GPS coordinates. The node is encapsulated in a hermetic housing complying with the IP-67 standard, which is attached to a collar worn by the animal without affecting its daily habits. The infrastructure consists of two fixed installations located in high-mountain environments where three types of terrain can be observed, in which tests of the application are carried out. In addition, the penetration depth of the technology and its quality-of-service parameters are investigated, yielding detailed information on the quality of LPWAN technology.
Published: 2021

11. Fatal vaccine-induced immune thrombotic thrombocytopenia (VITT) post Ad26.COV2.S: first documented case outside US.

Author: UCL - SSS/IREC/MONT - Pôle Mont Godinne, UCL - SSS/IREC/CARD - Pôle de recherche cardiovasculaire, UCL - (MGD) Laboratoire de biologie clinique, UCL - (SLuc) Service d'hématologie, Rodriguez, Elsa V C, Bouazza, Fatima-Zohra, Dauby, Nicolas, Mullier, François, d'Otreppe, Stéphanie, Jissendi Tchofo, Patrice, Bartiaux, Magali, Sirjacques, Camille, Roman, Alain, Hermans, Cédric, and Cliquennois, Manuel
Abstract: PURPOSE: We report the first described case of vaccine-induced immune thrombotic thrombocytopenia (VITT) following Ad26.COV2.S (Janssen, Johnson & Johnson) vaccination outside the US. CASE DESCRIPTION: A young woman without any medical history presented with deep vein thrombosis associated with thrombocytopenia at day 10 after vaccine injection. The patient was treated with low-molecular-weight heparin at a first medical institution. Twelve days after Ad26.COV2.S vaccination, the patient was admitted to our hospital for neurological deterioration and right hemiplegia. MRI showed thrombosis of the major anterior part of the superior sagittal sinus with bilateral intraparenchymal hemorrhagic complications. Screening tests for antibodies against platelet factor 4 (PF4)-heparin by rapid lateral flow immunoassay and chemiluminescence techniques were negative. A platelet activation test using heparin-induced multiple electrode aggregometry confirmed the initial clinical hypothesis. Despite immediate treatment with intravenous immunoglobulin, dexamethasone, danaparoid and attempted neurosurgery, the patient evolved toward brain death. CONCLUSION: Even though it is an extremely rare complication of vaccination, physicians should maintain a high index of suspicion of VITT in patients who received an adenovirus-vector-based SARS-CoV-2 vaccine within the last 30 days and who present with persistent complaints compatible with VITT or with a thromboembolic event associated with thrombocytopenia. The diagnosis should not be excluded if the rapid anti-PF4 immunological or chemiluminescence techniques yield negative results. An adapted functional assay should be performed to confirm the diagnosis. Early treatment with intravenous immunoglobulin and non-heparin anticoagulants is essential, as delayed diagnosis and delayed administration of appropriate treatment are associated with poor prognosis.
Published: 2021

12. Grapheme-to-phoneme transcription of English words in Icelandic text

Author: Ármannsson, Bjarki
Abstract: Foreign words, such as names, locations or sometimes entire phrases, are a problem for any system that is meant to convert graphemes to phonemes (g2p; i.e. converting written text into phonetic transcription). In this thesis, we investigate both rule-based and neural methods of phonetically transcribing English words found in Icelandic text, taking into account the rules and constraints of how foreign phonemes can be mapped into Icelandic phonology. We implement a rule-based system by compiling grammars into finite-state transducers. In deciding on which rules to include, and evaluating their coverage, we use a list of the most frequently found English words in a corpus of Icelandic text. The output of the rule-based system is then manually evaluated and corrected (when needed) and subsequently used as data to train a simple bidirectional LSTM g2p model. We train models both with and without length and stress labels included in the gold annotated data. Although the scores for neither model are close to the state of the art for either Icelandic or English, both our rule-based system and LSTM model show promising initial results and improve on the baseline of simply using an Icelandic g2p model, rule-based or neural, on English words. We find that the greater flexibility of the LSTM model seems to give it an advantage over our rule-based system when it comes to modeling certain phenomena. Most notable is the LSTM's ability to more accurately transcribe relations between graphemes and phonemes for English vowel sounds. Given that there is little previous work on g2p transcription specifically handling English words within Icelandic phonological constraints, and that it remains an unsolved task, our findings present a foundation for further research and contribute to improving g2p systems for Icelandic as a whole.
Published: 2021

13. Homograph Disambiguation and Diacritization for Arabic Text-to-Speech Using Neural Networks

Author: Lameris, Harm
Abstract: Pre-processing Arabic text for Text-to-Speech (TTS) systems poses major challenges, as Arabic omits short vowels in writing. This omission leads to a large number of homographs, and means that Arabic text needs to be diacritized to disambiguate these homographs, in order to be matched up with the intended pronunciation. Diacritizing Arabic has generally been achieved by using rule-based, statistical, or hybrid methods that combine rule-based and statistical methods. Recently, diacritization methods involving deep learning have shown promise in reducing error rates. These deep-learning methods are not yet commonly used in TTS engines, however. To examine neural diacritization methods for use in TTS engines, we normalized and pre-processed a version of the Tashkeela corpus, a large diacritized corpus containing largely Classical Arabic texts, for TTS purposes. We then trained and tested three state-of-the-art Recurrent-Neural-Network-based models on this data set. Additionally we tested these models on the Wiki News corpus, a test set that contains Modern Standard Arabic (MSA) news articles and thus more closely resembles most TTS queries. The models were evaluated by comparing the Diacritic Error Rate (DER) and Word Error Rate (WER) achieved for each data set to one another and to the DER and WER reported in the original papers. Moreover, the per-diacritic accuracy was examined, and a manual evaluation was performed. For the Tashkeela corpus, all models achieved a lower DER and WER than reported in the original papers. This was largely the result of using more training data in addition to the TTS pre-processing steps that were performed on the data. For the Wiki News corpus, the error rates were higher, largely due to the domain gap between the data sets. We found that for both data sets the models overfit on common patterns and the most common diacritic. For the Wiki News corpus the models struggled with Named Entities and loanwords. 
Purely neural models generally o
Published: 2021

14. Evaluating Multi-Uav System with Text to Spech for Sitational Awarness and Workload

Author: Lindgren, Viktor
Abstract: With improvements to miniaturization technologies, the number of operators required per UAV has become increasingly smaller, at the cost of increased workload. Workload is an important factor to consider when designing the multi-UAV systems of tomorrow, as too much workload may decrease an operator's performance. This study proposes the use of text to speech combined with an emphasis on a single-screen design as a way of improving situational awareness and perceived workload. A controlled experiment with 18 participants was conducted inside a simulator. Their situational awareness and perceived workload were measured using SAGAT and NASA-TLX, respectively. The results show that the use of text to speech led to a decrease in situational awareness for all elements inside the graphical user interface that were not directly handled by a text-to-speech event. All of the NASA-TLX measurements showed an improvement in perceived workload except for physical demand. Overall, an improvement in perceived workload was observed when text to speech was in use.
Published: 2021

15. Síndrome de Takotsubo tras traumatismo que requiere intervención quirúrgica urgente.

Author: Vicente Orgaz, Marta, Gholamian Ovejero, Soraya, González Velasco, Raquel, and Gómez del Pulgar Vázquez, Blanca
Abstract: Takotsubo syndrome (TTS) is a stress cardiomyopathy that most frequently affects postmenopausal women after a physical or emotional trigger. It presents with symptoms similar to those of acute coronary syndrome, together with elevated cardiac biomarkers and electrocardiogram abnormalities. Diagnosis is based on the ‘InterTAK Diagnostic Criteria’ established in 2018, the fundamental finding being a temporary wall motion abnormality of the left ventricle, typically without coronary pathology (although its presence does not exclude the diagnosis), and early reversibility of the condition. Treatment is not clearly defined, so it consists of life support measures and prevention of life-threatening complications. In recent years, descriptions of this syndrome in healthcare settings have increased, which suggests that it has probably been underdiagnosed. The perioperative period may provide an ideal scenario for its occurrence because of surgical stress, and hence the anesthesiologist is key to its diagnosis and management.
Published: 2021

16. Code-Switching ASR and TTS Using Semisupervised Learning with Machine Speech Chain

Author: NAKAYAMA, Sahoko, TJANDRA, Andros, SAKTI, Sakriani, and NAKAMURA, Satoshi
Abstract: The phenomenon where a speaker mixes two or more languages within the same conversation is called code-switching (CS). Handling CS is challenging for automatic speech recognition (ASR) and text-to-speech (TTS) because it requires coping with multilingual input. Although CS text or speech may be found in social media, the datasets of CS speech and corresponding CS transcriptions are hard to obtain even though they are required for supervised training. This work adopts a deep learning-based machine speech chain to train CS ASR and CS TTS with each other with semisupervised learning. After supervised learning with monolingual data, the machine speech chain is then carried out with unsupervised learning of either the CS text or speech. The results show that the machine speech chain trains ASR and TTS together and improves performance without requiring the pair of CS speech and corresponding CS text. We also integrate language embedding and language identification into the CS machine speech chain in order to handle CS better by giving language information. We demonstrate that our proposed approach can improve the performance on both a single CS language pair and multiple CS language pairs, including the unknown CS excluded from training data.
Published: 2021

17. Viability of the implementation of a semi-autonomous electric tram without rails or overhead lines in a city

Author: Universitat Politècnica de Catalunya. Departament d'Enginyeria Civil i Ambiental, Garola Crespo, Àlvar, Vélez Sabater, Gemma, and Serrahima i Serra, Sergi
Abstract: This study reviews the current state of development of new Trackless Tram Systems (TTS) and finds that the technology is ready and reliable enough for implementation. The study focuses on the latest version of TTS developed by CCRC Locomotive and nicknamed TRAMeBUS. As there is currently a public transport problem on a central avenue (Av. Diagonal) in the city of Barcelona, the project analyses whether it could be solved using the TRAMeBUS vehicle. Currently the central section of the avenue acts as a wall and prevents Barcelona’s two tramway networks from joining and providing a better service to citizens. A review of the alternatives already studied for the Diagonal Avenue yields two scenarios to consider: joining the last stations of each tramway network through an on-surface link (4 km), or joining them by building a tunnel (2 km) and an on-surface link (2 km) covering the same path. Four scenarios are then built from these alternatives, each operated either by TRAMeBUS or by conventional Alstom tramway units. A Cost-Benefit Analysis is made to compare the new system with the conventional light rail one. The results show that, owing to its smaller Capital Expenditure requirements, a TTS compares much more favourably than its conventional counterparts. A purely financial analysis is made to assess the economic profitability of the alternatives. All alternatives deliver good results. Both on-surface alternatives might require a similar grant from public authorities. A Multicriteria Evaluation is also made to assess and compare the non-monetizable characteristics of both systems. Again, the on-surface TRAMeBUS alternative gets the best results. Although the chosen alternative is very robust in the sensitivity analysis, the overly conservative scenario built might indicate that, in a real implementation, it is logical to expect lower costs and, consequently, higher overall profitability.
Published: 2020

18. Protocole de protection de la faune marine et campagnes sismiques

Author: Ducatel, Cecile, Le Gall, Yves, and Lurton, Xavier
Abstract: This document sets out the measures taken by Ifremer to protect marine fauna when using class 1 seismic sources (> 500 in³). Before presenting the protocol to be applied during marine geoscience campaigns, it recalls the principles of acoustic risk assessment.
Published: 2019

19. Protection protocol for marine fauna and seismic campaigns

Author: Ducatel, Cecile, Le Gall, Yves, and Lurton, Xavier
Abstract: This document explains measures taken by Ifremer to protect marine fauna when using class 1 seismic sources (> 500 in³). Before presenting the protocol to be applied during marine geoscience campaigns, the acoustic risk assessment principles are recapped.
Published: 2019

20. Teachers’ beliefs on utilizing TTS as a tool for learning English at Upper Secondary School

Author: Stoker, Jonathon
Abstract: Many students in the class have dyslexia and can struggle with simple tasks such as reading. Therefore, this study set out to investigate, from a teacher’s perspective, how text-to-speech synthesizers can facilitate learning English at upper secondary school, with these students in focus. The study was conducted by means of semi-structured interviews with secondary school teachers. Research supporting the claim that TTS facilitates reading for students with difficulties has been stark. On the other hand, scholars have claimed that it does not always aid struggling readers; this paper therefore explores the discrepancies between these contrasting views. The results showed that the use of TTS in the classroom should be seen as a compensatory tool that can aid struggling students in reading, as opposed to being seen as a solution. The question of whether it can aid students without reading difficulties depended on the intelligibility of the TTS voice. Furthermore, it was maintained that TTS could encourage students in their reading on the basis of academic success.
Published: 2019

21. Språk- och kommunikationsutveckling i förskolan : Med fokus på Tecken som Alternativ och Kompletterande Kommunikation

Author: Steinwall, Lina, and Åkerlund, Elinn
Abstract: To feel included in today's society, every person needs to be able to interact with others. This interaction builds on communication. Some people have difficulty communicating because of a language and/or communication barrier. Supporting signs can be used to overcome this barrier. Swedish preschools sometimes work with signs as an aid. This use of signs is called TAKK (Tecken som Alternativ och Kompletterande Kommunikation, i.e. signs as alternative and augmentative communication). The aim of our report was to find out how preschools work with signs: how often they are used, what challenges and opportunities the educators see, and whether they consider that signs can support children's language and communication development and, if so, in what way. To obtain results for this aim, six educators in northern Sweden were interviewed. When analysing the interviews, we transcribed the collected material and then used a phenomenographic analysis method, which is described in more detail in the method section. The results showed that signs were used only in occasional situations, but that the educators would like to work with them more. The educators saw many opportunities in using TAKK and thought, among other things, that it could benefit children's language development in various ways. There were also challenges with TAKK, for example that the educators had no or insufficient training in TAKK, which we interpret as a possible disadvantage. In the report, TAKK, signs, and signs as support are used as synonyms.
Published: 2018

22. Desarrollo de un módulo de dictados en una plataforma web educativa

Author: Semidynamics Technology Services, Espasa Sans, Roger, Solé Pellisa, Guillem, Sabaté i Garriga, Ferran, and Castillo Malaver, Italo
Published: 2018

23. Nativization of foreign names in TTS for automatic reading of world news in Swahili

Author: Mendelson, Joseph, Oplustil, P., Watts, O., and King, S.
Abstract: When a text-to-speech (TTS) system is required to speak world news, a large fraction of the words to be spoken will be proper names originating in a wide variety of languages. Phonetization of these names based on target language letter-to-sound rules will typically be inadequate. This is detrimental not only during synthesis, when inappropriate phone sequences are produced, but also during training, if the system is trained on data from the same domain. This is because poor phonetization during forced alignment based on hidden Markov models can pollute the whole model set, resulting in degraded alignment even of normal target-language words. This paper presents four techniques designed to address this issue in the context of a Swahili TTS system: automatic transcription of proper names based on a lexicon from a better-resourced language; the addition of a parallel phone set and special part-of-speech tag exclusively dedicated to proper names; a manually-crafted phone mapping which allows substitutions for potentially more accurate phones in proper names during forced alignment; and the addition in proper names of a grapheme-derived frame-level feature, supplementing the standard phonetic inputs to the acoustic model. We present results from objective and subjective evaluations of systems built using these four techniques.
Published: 2017
Full Text: View/download PDF

24. Mayordomo: controla tu smart home a través de la voz

Author: Agerri Gascón, Rodrigo, Sarasola Gabiola, Kepa Mirena, F. INFORMATICA, INFORMATIKA F., Grado en Ingeniería Informática, Informatikaren Ingeniaritzako Gradua, and Iturrioz Rodríguez, Aitor
Abstract: Connectivity and low-cost electronics are making it possible for more and more elements of our homes to become smart and remotely controllable. Until now, the only way to manage these devices was through mobile applications, but voice-based systems have recently been proliferating. This project aims to build an open-source platform that allows simple interaction with the home-automation elements of the household. In addition, it must be modular so that its use can extend to other domains, and it must be able to process commands in different languages, thereby becoming the first platform that can be used entirely in Basque. As a cross-cutting aspect, the project will require working with language technologies and will make it possible to learn and gain deeper knowledge of the techniques used in natural language processing.
Published: 2017

25. Improving an Italian TTS System : Voice Based Rules for Word Boundaries' Phenomena

Author: Rossetti, Laura
Abstract: This thesis project aimed at improving a commercial Italian Unit Selection system by obtaining a higher agreement between the internal target representation produced by the text analysis component and the recordings in the speech database. Specifically, this was done by implementing new transformation rules that are applied to the input text during text normalization and just after the grapheme-to-phoneme (g2p) module. We focused on mismatches that occur over word boundaries and that are caused by both typical Italian phonotactic phenomena and prosodic breaks. In order to write the rules, a deep and careful analysis of the speech database and the voice talent’s specific pronunciation was performed, in addition to a study of the properties of the Italian language from both a phonetic and a syntactic perspective. The evaluation of the new rules added to the system showed an improvement of 4.25% over our baseline. Finally, this study shows the importance of the criteria used for the selection of the voice talent, revealing the significance of consistency in his/her performance of phonetic features for rule-based text analysis components.
Published: 2017

26. Word and paragraph embeddings for expresive speech synthesis

Author: Bonafonte Cávez, Antonio, and Gómez Bajo, Germán
Abstract: Speech synthesis is the task of generating speech using computers. Due to the limitations of classical techniques, these systems are normally not suitable for applications that would benefit from expressiveness in the speech, such as audiobook reading. In this project, we attempt to develop a text-to-speech synthesizer that is capable of reacting to the semantic content of the input text to produce expressive speech. The system is based on the Socrates text-to-speech framework developed in the VEU research lab at UPC and the Keras deep learning library.
Published: 2017

27. La plantation sur monticule, un moyen efficace pour augmenter la productivité de la forêt boréale - Bilan d’une plantation d’épinettes noires de 30 ans située près du lac Chibougamau

Author: Walsh, Denis, and Krause, Cornelia
Abstract: This study compares, after 30 years, the growth and yield of a black spruce plantation established on different microsites created by the passage of either a Bräcke mounder or a TTS disk trencher. The two sites are located on the same physiographic unit in the western spruce-moss forest very near Chibougamau. The design was not originally built to allow statistical comparisons between the two site-preparation treatments, given the absence of true replicates; statistical comparisons between microsites within each site are nevertheless valid. The growth gains of the Bräcke seedlings relative to those of the TTS are therefore given for information only. The height, diameter, and stem volume of trees planted on the mound formed by the Bräcke were significantly greater than those of trees planted on the shoulder or in the hole. We observed no significant differences between seedlings planted in the bottom or on the shoulder formed by the disk trencher. After 30 years in plantation, the volume gain of seedlings on the mound was 107% higher compared with seedlings planted after passage of the TTS disk trencher. This difference is explained by accelerated annual radial growth of the seedlings on the mound from the 4th year in plantation, over a period spanning 19 years. The merchantable volume predicted by the equations of current prediction models would be 53 m³/ha higher at 60 years for a black spruce plantation established on mounds compared with traditional preparation by a disk trencher.
Published: 2016

28. Deep learning applied to speech synthesis

Author: Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Bonafonte Cávez, Antonio, and Pascual de la Puente, Santiago
Abstract: Deep Learning has been applied successfully to speech processing problems. In this work we explore its capabilities, focusing specifically on recurrent neural architectures to build a state-of-the-art Text-To-Speech system from scratch. The different steps to make the full TTS system are shown. Also, a post-filtering method to improve the naturalness of the generated speech is applied and evaluated. The objective results show which architecture better fits our problem, achieving low error rates in terms of cepstral distortion, pitch estimation error and voiced/unvoiced classification error. Also, subjective results suggest that the model achieves state-of-the-art quality in the synthesis, where the post-filtering factor seems to be a key component for a good level of naturalness. A novel architecture called Multi-Output TTS is also proposed to hold multiple speakers inside the same structure. Some hidden layers are shared by all the speakers, while there is a specific output layer for each speaker. Objective and perceptual experiments show that this scheme produces much better results than single-speaker models. Moreover, we also tackle the problem of speaker adaptation by adding a new output branch to the model and successfully training it without the need to modify the base optimized model. This fine-tuning method achieves better results than training the new speaker from scratch with its own model. Finally, we also tackle the problem of speaker interpolation by adding a new output layer (alpha-layer) on top of the Multi-Output branches. An identifying code is injected into the layer together with acoustic features of many speakers. Experiments show that the alpha-layer can effectively learn to interpolate the acoustic features between speakers.
Published: 2016

29. Configuración de una vivienda inteligente por medio de la voz

Author: Fons Cors, Joan Josep, Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica, Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació, and Tomás Planells, Samuel
Abstract: Development of a voice configuration application for interconnecting a smart home server with its users. It has been developed in Java and makes use of the Sphinx software tool to transform the user's voice into a text string (STT). The application recognizes the actions and devices in this text, if any. It then prepares communication with the server using web services if and only if there is enough information to perform the action and the action can be applied to the detected devices. If, for any reason, it cannot be applied, feedback to the user requests the missing information. After the request is sent to the server, the user is informed by voice whether or not the action has been performed (TTS).
Published: 2015

30. A putative role of p53 pathway against impulse noise induced damage as demonstrated by protection with Pifithrin-alpha and a Src Inhibitor

Author: Fetoni, Anna Rita (ORCID:0000-0001-5405-4301), Bielefeld, Ec, Nicotera, T, and Henderson, D.
Abstract: Exposure to high-level noise leads to oxidative stress and triggers apoptosis of the hair cells. This study examined whether p53, a tumor suppressor protein, is activated in the cochlea following impulse noise exposure. Inhibition of p53 with pifithrin alpha, a specific p53 inhibitor, or KX1-004, a Src-protein tyrosine kinase inhibitor, was tested to determine if p53 inhibition could reduce noise-induced hearing loss and cochlear damage. Chinchillas were pre-treated with a local administration of pifithrin alpha or KX1-004 and exposed to impulse noise. The chinchillas were assessed for threshold shift at 1 and 24 hours after the noise. At 4 or 24 hours post noise, the cochleae were removed and organs of Corti were examined to assess the damage to the cells and upregulation of p53 by the noise. Apoptosis was evident in both outer hair cells and supporting cells. Phospho-p53 (Ser 15) was upregulated 4 hours and 24 hours after the noise. KX1-004 and pifithrin alpha both decreased threshold shift and the number of missing outer hair cells. These results indicate that p53 is involved in the early stages of noise-induced cell death and inhibition of this signaling pathway is a potential protective strategy against noise-induced hearing loss.
Published: 2014

31. A Hybrid TTS between Unit Selection and HMM-based TTS under limited data conditions

Author: Phung, Trung-Nghia, Luong, Chi Mai, and Akagi, Masato
Abstract: The intelligibility of HMM-based TTS can reach that of the original speech. However, HMM-based TTS is far from natural. On the contrary, unit selection TTS is the most-natural sounding TTS currently. However, its intelligibility and naturalness on segmental duration and timing are not stable. Additionally, unit selection needs to store a huge amount of data for concatenation. Recently, hybrid approaches between these two TTS, i.e. the HMM trajectory tiling TTS (HTT), have been studied to take advantages of both unit selection and HMM-based TTS. However, such methods still require a huge amount of data for rendering. In this paper, a hybrid TTS among unit selection, HMM-based TTS, and the Modified Restricted Temporal Decomposition (MRTD), named HTD, is proposed motivating to take advantages of both unit selection and HMM-based TTS under limited data conditions. Here, TD is a sparse representation of speech that decomposes a spectral or prosodic sequence into two mutually independent components: static event targets and correspondent dynamic event functions, and MRTD is a compact but efficient version of TD. Previous studies show that the dynamic event functions of MRTD are related to the perception of speech intelligibility, one core linguistic or content information, while the static event targets of MRTD convey non-linguistic or style information. Therefore, by borrowing the concepts of unit selection to render the event targets of the spectral sequence, and directly borrowing the prosodic sequences and the dynamic event functions of the spectral sequence generated by HMM-based TTS, the naturalness and the intelligibility of the proposed HTD can reach the naturalness of unit selection, and the intelligibility of HMM-based TTS, respectively. 
Due to the smoothness of event functions of MRTD, an appropriate smoothness in synthesized speech can still be ensured when being rendering by a small amount of data, resulting in the usability of the proposed HTD under limited, identifier:https://dspace.jaist.ac.jp/dspace/handle/10119/11514
Published: 2013

32. An open source reading system for print disabilities

Author: Nazemi, Azadeh and Murray, Iain
Abstract: According to a World Intellectual Property Organization (WIPO) estimate, only 5% of the world’s one million print titles published every year are accessible to the some 340 million people around the world who are blind, vision impaired, or who live with other print disabilities. Access to information and education is an established human right. Many people with print disabilities struggle to achieve equality in this area because of the lack of accessible books and sources of information. This research describes an approach to designing a comprehensive reading system for vision-impaired people.
Published: 2013

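33. Remise en production des milieux ouverts sur stations sèches dans la pessière à mousses du Saguenay-Lac-Saint-Jean et du nord du Québec : résultats 10 ans après plantation pour l'épinette noire [Restoring production in open dry-site areas of the spruce-moss forest of Saguenay-Lac-Saint-Jean and northern Quebec: results 10 years after planting for black spruce]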

Author: Walsh, Denis, Tremblay, Pascal, Hébert, François, Allaire, Jacques, Côté, Damien, and Lord, Daniel
Abstract: Within the black spruce-moss and balsam fir-white birch bioclimatic domains of Quebec's commercial boreal forest, there are open formations of variable extent whose plant composition is comparable to that found in the spruce-lichen bioclimatic domain. These open areas on dry sites, often called dry barrens (dénudés secs, DS), are characterized by a tree cover of less than 25%, composed mainly of black spruce (Picea mariana [Mill.] B.S.P.), sometimes accompanied by jack pine (Pinus banksiana Lamb.). The low tree cover density of these areas appears to be linked more to the sites' disturbance history than to a carrying capacity judged too low. In fact, it appears to result from a succession of natural disturbances leading to a regeneration deficit. Returning these sites to their initial tree density through various silvicultural approaches could therefore prove worthwhile. The objectives of this project are to test the growth response of black spruce seedlings to different site preparations (taupe and scarification) and two seedling sizes (126-25 and 67-50), using plantations established in harvested and scarified spruce-moss stands (PM) as controls. For the scarified plots, survival rates remain, in all cases, high enough to ensure the success of the plantation. Seedling growth after ten years is significantly greater in the plots established in the PM than in those established in the DS. These differences could be linked to the higher level of disturbance in the scarified PM plots, which were harvested before scarification. This hypothesis is also supported when the data from the different site preparations used in the DS are compared with one another. Scarification yielded the survival and growth rates (height and
Published: 2012

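34. Effet du microsite sur la croissance de l'épinette noire plantée après scarifiage au Bräcke ou au TTS : bilan 22 ans après la plantation [Effect of microsite on the growth of black spruce planted after Bräcke or TTS scarification: assessment 22 years after planting]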

Author: Walsh, Denis and Lord, Daniel
Abstract: This study compares, after 22 years of growth, the yield of a black spruce plantation established on different microsites created by the passage of either a Bräcke mounder or a TTS disk trencher. The two sites are located on the same physiographic unit in the western spruce-moss forest, very close to Chibougamau. The design was not originally built to allow statistical comparisons between the two site preparation treatments, given the absence of true replicates; statistical comparisons between microsites within each site are nevertheless valid. The growth gains of the Bräcke seedlings relative to those of the TTS are therefore given for indicative purposes only. The height, diameter, and stem volume of trees planted on the mound formed by the Bräcke were significantly greater than those of trees planted on the shoulder or in the pit. We did not observe significant differences between seedlings planted in the bottom or on the shoulder formed by the disk trencher. After 22 years in plantation, the volume gain of seedlings on the mound was 165% higher compared with seedlings planted after the passage of the TTS disk trencher. This difference is explained by accelerated annual radial growth of the seedlings on the mound from the 4th year in plantation, over a period extending over 19 years. The merchantable volume predicted by the equations of current prediction models would be 60 m3/ha greater at 60 years for a black spruce plantation established on mounds compared with traditional preparation by a disk trencher.
Published: 2011

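35. Remise en production des milieux ouverts sur stations sèches dans la pessière à mousse du Saguenay-Lac-Saint-Jean et du Nord du Québec : résultats 5 et 10 ans après la plantation pour l'épinette noire [Restoring production in open dry-site areas of the spruce-moss forest of Saguenay-Lac-Saint-Jean and northern Quebec: results 5 and 10 years after planting for black spruce]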

Author: Tremblay, Pascal, Hébert, François, Allaire, Jacques, Walsh, Denis, and Lord, Daniel
Abstract: Within the black spruce-moss bioclimatic domain of Quebec's commercial boreal forest, there are open formations of variable extent whose plant composition is comparable to that found in the spruce-lichen bioclimatic domain. These open areas on dry sites, often called dry barrens (dénudés secs, DS), are characterized by a tree cover of less than 25%, composed mainly of black spruce (Picea mariana (Mill.) B.S.P.), sometimes accompanied by jack pine (Pinus banksiana Lamb.). The low tree cover density of these areas appears to be linked more to the sites' disturbance history than to a carrying capacity judged too low. In fact, it appears to result from a succession of natural disturbances leading to a regeneration deficit. Returning these sites to their initial tree density through various silvicultural approaches could therefore prove worthwhile. The objectives of this project are to test the growth response of black spruce seedlings to different site preparations (taupe and TTS) and two seedling sizes (126-25 and 67-50), using plantations established in harvested and scarified spruce-moss stands as controls. The five-year data are complete, whereas the data collected after ten years cover only part of the initial experimental design, since the seedlings in some plots reach ten years after planting at the end of the 2011 growing season. This report is therefore an interim report for the ten-year data; the final report is to be submitted in 2012. For the scarified plots, survival rates remain, in all cases, high enough to ensure the success of the plantation. Seedling growth after five years is significantly greater in the plots established in the spruce-moss forest (PM) than in those established in the DS. This difference is no longer significant
Published: 2011

36. Feasibility study on a text-to-speech synthesizer for embedded systems

Author: Hammarstedt, Linnea
Abstract: A system that converts textual information into speech is usually called a TTS (text-to-speech) system. The design of such a system varies depending on its purpose and platform requirements. In this thesis, a TTS synthesizer designed for an embedded system operating on an arbitrary vocabulary has been evaluated and partially implemented in Matlab, constituting a base for further development. The focus is on the speech generation part, which involves the conversion from phonetic notation into synthetic speech. The chosen TTS system is the so-called Time Domain-PSOLA, which suits the implementation and platform requirements well. It concatenates segments of recorded speech and changes their prosodic characteristics with the Pitch Synchronous Overlap and Add (PSOLA) technique. Each segment runs from the midpoint of one phone to the midpoint of the next and is referred to as a diphone. The quality of the generated synthesized speech is fairly satisfactory for the test sentences applied. Some disturbances still occur as a consequence of mismatches, such as different spectral properties of the segments and pitch detection errors, but these can be reduced with further development.
Published: 2006

37. Navegador HTML lleuger orientat a persones invidents [Lightweight HTML browser for blind people]

Author: Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics, Catala Roig, Neus, and Sánchez Montaner, Óscar
Abstract: This project covers the design and implementation of a lightweight HTML (HyperText Markup Language) browser for blind or visually impaired people, using a simple interface based on the keyboard and TTS (text-to-speech) synthesis. The browser was implemented in Java.
Published: 2006

38. En fonetisk analys av uppläst fri vers - de tonala utspelen hos Kristina Lugn [A phonetic analysis of free verse read aloud: the tonal gestures of Kristina Lugn]

Author: Svensson, Malin
Abstract: Free verse read aloud occupies very little space in phonetic and speech technology research. Synthetic models have not yet been developed for free verse read aloud, probably because poetic effects are difficult to determine. Low speech rate and particular accent and stress patterns are part of a literary convention for free verse read aloud. These parameters are probably tied to the specific speech situation. Poetic effects in the form of prosodic patterns unique to each poem also belong to the convention for free verse read aloud. Auditory, phonological, and acoustic analyses are carried out on a Swedish poet's free verse read aloud: Kristina Lugn's Om ni hör ett skott? A previous study established this speaker's low speech rate, which accords with the convention for free verse read aloud. In this study, intonation, in the form of tonal gestures, is perceived to be the most prominent prosodic parameter. The analysis shows that the global pitch range varies in size in interplay with the emphasis of the focal accent in the phrase. A poetic effect in the material studied can be assumed to be large variation in emphasis between phrases. This is realized through the presence and absence of tonal gestures. The speaker turns out to use short phrases. Not infrequently, the material contains more than one focal accent within some phrases, as well as an absence of focal accents in other phrases, which runs counter to a general expectation of ONE focal accent per phrase. Deliberately avoiding as well as highlighting focal accents can also be seen as a poetic effect.
Published: 2006

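39. Remise en production des milieux ouverts sur stations sèches dans la pessière à mousses du Saguenay-Lac-Saint-Jean, Chibougamau Chapais : résultats 3 ans après plantation [Restoring production in open dry-site areas of the spruce-moss forest of Saguenay-Lac-Saint-Jean, Chibougamau Chapais: results 3 years after planting]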

Author: Hébert, François, Tremblay, Pascal, Allaire, Jacques, Walsh, Denis, Lord, Daniel, and Côté, Damien
Abstract: Within the closed spruce-moss forest of the commercial boreal forest, there are open forests of variable extent whose plant composition is comparable to that found in the spruce-lichen and forest tundra domains. These open areas on dry sites (often called dry barrens) are characterized by a tree cover of scattered trees, mainly black spruce (Picea mariana (Mill.) B.S.P.), sometimes accompanied by jack pine (Pinus banksiana Lamb.), with a cover density below 40%, and by a shrub layer composed of lichens of the genera Cladina and Cladonia associated with shrubby plants of the heath family (Ericaceae). The openness of these areas appears to be linked to a rapid succession of natural disturbances, which prevents regeneration from becoming established. Given that tree growth in these open areas is comparable to that in closed spruce stands, restoring these areas to production seemed feasible. The objectives of this project were to test the response of plantations (survival and growth) in open areas created naturally within closed forests of this domain. Two factors are also studied simultaneously: the effect of different site preparation methods and the use of different black spruce seedling sizes. Three site preparations were tested in the open dry-site areas (DS): scarification, the taupe, and no site preparation. The response of seedlings in the DS was compared with that of a plantation established in a harvested and scarified spruce-moss stand. Two seedling sizes were tested: those produced in containers with 67 cavities of 50 cm3 and those produced in containers with 126 cavities of 25 cm3. Two morphological measurements were made for each seedling collected. Scarification in the DS is the site preparation that yielded the best survival and growth rates (height,
Published: 2005

40. Conversor texto a voz multilingüe de Telefónica I+D [Telefónica I+D multilingual text-to-speech converter]

Author: Armenta, Ana, Escalada Sardina, José Gregorio, Garrido Almiñana, Juan María, and Rodríguez Crespo, Miguel Ángel
Abstract: Telefónica I+D presents the latest version of its corpus-based (unit selection), multispeaker, and multilingual text-to-speech system.
Published: 2003

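41. Effets de la préparation de terrain sur le type et l'abondance des espèces végétales compétitrices dans le canton d'Hébécourt, Abitibi [Effects of site preparation on the type and abundance of competing plant species in Hébécourt Township, Abitibi]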

Author: Durand, François
Abstract: In the new context of intensive forest management, site preparation for reforestation is increasingly used. It is therefore essential to evaluate its effects on the dynamics of competing plant species. Ecological land classification, by integrating the physical and biological elements of the environment, makes it possible to isolate the abiotic factors from the temporal factors responsible for the distribution of plant species. This stratification made it possible to evaluate the effect of three site preparations on the type and abundance of competing species in Hébécourt Township, Abitibi. Of the three preparations, only "TTS" scarification appears to have a beneficial effect, reducing competition on coarse deposits. On clay deposits, competition from ruderal species increases significantly with winter blading and subsoiler-tine scarification, which promote the elimination of pre-established species and the exposure of the mineral layer. The importance of competition varies with the treatment and the nature of the site, the most acute problems having been observed after subsoiler-tine scarification. Reforestation with taller seedlings immediately after cutting could offer a viable alternative to site preparation. Predicting the development of competing species according to the ecological characteristics of sites and the various site preparations will make it possible to optimize the choice of an appropriate silvicultural treatment.
Published: 1989

42. Design a vývoj plug-in nástrojů [Design and development of plug-in tools]

Author: Kubíková, Zuzana, Dědic, Filip, and Bařák, Šimon
Abstract: A plug-in is software that does not work independently but as an add-on module to another application, thus extending its functionality. It usually uses a ready-made application interface called an API. Many programs offer programmers the option of using their API (application programming interface) to extend the functionality of the program.

43. Design a vývoj plug-in nástrojů [Design and development of plug-in tools]

Author: Kubíková, Zuzana, Dědic, Filip, and Bařák, Šimon
Abstract: A plug-in is software that does not work independently but as an add-on module to another application, thus extending its functionality. It usually uses a ready-made application interface called an API. Many programs offer programmers the option of using their API (application programming interface) to extend the functionality of the program.

44. Reverzibilnost metod pro změnu hlasu [Reversibility of voice-changing methods]

Author: Malinka, Kamil, Firc, Anton, and Lička, Zbyněk
Abstract: State-of-the-art voice-changing methods allow even inexperienced users to create convincing voice recordings of famous individuals from just a few seconds of recorded speech. There are two major approaches to voice generation: voice conversion and text-to-speech. Voice conversion methods require the user to input source speech to be converted to the target voice. A trend in voice conversion methods, especially those requiring only a few seconds of reference speech, has been to restrict the amount of information about the original speaker in the converted speech. This work studies how much information about the original speaker can be extracted from artificial speech, and whether the original speech can potentially be reconstructed. The results shed light on an unstudied property of voice-changing methods.

45. Design a vývoj plug-in nástrojů [Design and development of plug-in tools]

Author: Kubíková, Zuzana and Dědic, Filip
Abstract: A plug-in is software that does not work independently but as an add-on module to another application, thus extending its functionality. It usually uses a ready-made application interface called an API. Many programs offer programmers the option of using their API (application programming interface) to extend the functionality of the program.

46. Design a vývoj plug-in nástrojů [Design and development of plug-in tools]

Author: Kubíková, Zuzana and Dědic, Filip
Abstract: A plug-in is software that does not work independently but as an add-on module to another application, thus extending its functionality. It usually uses a ready-made application interface called an API. Many programs offer programmers the option of using their API (application programming interface) to extend the functionality of the program.
