Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems
- Source :
- International Conference on Statistical Language and Speech Processing, Luis Espinosa-Anke; Carlos Martín-Vide; Irena Spasić. International Conference on Statistical Language and Speech Processing, Springer, pp.23-34, 2020, ⟨10.1007/978-3-030-59430-5_2⟩, Statistical Language and Speech Processing ISBN: 9783030594299, SLSP
- Publication Year :
- 2020
- Publisher :
- HAL CCSD, 2020.
Abstract
- Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. Our contribution is twofold. First, we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then we introduce a new protocol called query transfer that makes it possible to leverage a large unlabelled dataset, possibly containing irrelevant queries, to extract relevant information. Comparison with two different baselines shows that this method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality. We also demonstrate the effectiveness of our generation method as a data augmentation technique for language modelling tasks.
- Subjects :
- Computer science
business.industry
media_common.quotation_subject
Text generation
020206 networking & telecommunications
02 engineering and technology
Dialogue Systems
Machine learning
computer.software_genre
Autoencoder
Domain (software engineering)
[MATH.MATH-PR]Mathematics [math]/Probability [math.PR]
Transfer (computing)
Spoken Language Understanding
0202 electrical engineering, electronic engineering, information engineering
Leverage (statistics)
020201 artificial intelligence & image processing
Quality (business)
Artificial intelligence
Control (linguistics)
business
Protocol (object-oriented programming)
computer
media_common
Details
- Language :
- English
- ISBN :
- 978-3-030-59429-9
- ISBNs :
- 9783030594299
- Database :
- OpenAIRE
- Journal :
- International Conference on Statistical Language and Speech Processing, Luis Espinosa-Anke; Carlos Martín-Vide; Irena Spasić. International Conference on Statistical Language and Speech Processing, Springer, pp.23-34, 2020, ⟨10.1007/978-3-030-59430-5_2⟩, Statistical Language and Speech Processing ISBN: 9783030594299, SLSP
- Accession number :
- edsair.doi.dedup.....fbf7ba0e3600e2f2bc1443356dff058c
- Full Text :
- https://doi.org/10.1007/978-3-030-59430-5_2