Efficient Sentence Embedding using Discrete Cosine Transform
- Source :
- EMNLP 2019
- Publication Year :
- 2019
Abstract
- Vector averaging remains one of the most popular sentence embedding methods in spite of its obvious disregard for syntactic structure. While more complex sequential or convolutional networks potentially yield superior classification performance, the improvements in classification accuracy are typically mediocre compared to the simple vector averaging. As an efficient alternative, we propose the use of discrete cosine transform (DCT) to compress word sequences in an order-preserving manner. The lower order DCT coefficients represent the overall feature patterns in sentences, which results in suitable embeddings for tasks that could benefit from syntactic features. Our results in semantic probing tasks demonstrate that DCT embeddings indeed preserve more syntactic information compared with vector averaging. With practically equivalent complexity, the model yields better overall performance in downstream classification tasks that correlate with syntactic features, which illustrates the capacity of DCT to preserve word order information.
- Comment :
- To appear in EMNLP 2019
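The compression the abstract describes can be sketched in a few lines of NumPy: apply a DCT along the sequence axis of the word-vector matrix and keep only the lowest-order coefficients. This is a hedged illustration of the general idea, not the authors' released code; the orthonormal DCT-II normalization, the zero-padding of short sentences, and the number of retained coefficients `k` are all assumptions made for this sketch.

```python
import numpy as np

def dct_ii(x: np.ndarray) -> np.ndarray:
    """Orthonormal DCT-II along axis 0 of a (seq_len, dim) matrix,
    i.e. one transform per embedding dimension."""
    n = x.shape[0]
    ks = np.arange(n)[:, None]   # coefficient index
    ns = np.arange(n)[None, :]   # position (time) index
    basis = np.cos(np.pi * ks * (2 * ns + 1) / (2 * n))  # (n, n)
    coeffs = 2.0 * basis @ x                              # (n, dim)
    # Orthonormal scaling: sqrt(1/(4n)) for k=0, sqrt(1/(2n)) otherwise.
    scale = np.full(n, np.sqrt(1.0 / (2 * n)))
    scale[0] = np.sqrt(1.0 / (4 * n))
    return scale[:, None] * coeffs

def dct_sentence_embedding(word_vectors: np.ndarray, k: int = 2) -> np.ndarray:
    """Compress a (seq_len, dim) matrix of word vectors into a fixed-size
    sentence embedding by keeping the first k DCT coefficients per dimension
    and concatenating them into a (k * dim,) vector."""
    n, dim = word_vectors.shape
    coeffs = dct_ii(word_vectors)
    if n < k:
        # Zero-pad sentences shorter than k (an assumption of this sketch).
        coeffs = np.vstack([coeffs, np.zeros((k - n, dim))])
    return coeffs[:k].reshape(-1)
```

Note that the zeroth DCT-II coefficient is proportional to the mean of the sequence, so with `k = 1` this construction essentially reduces to vector averaging; higher `k` adds coarse order-sensitive pattern information, which is the property the abstract attributes to the lower-order coefficients.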
- Subjects :
- Computer Science - Computation and Language
Details
- Database :
- arXiv
- Journal :
- EMNLP 2019
- Publication Type :
- Report
- Accession number :
- edsarx.1909.03104
- Document Type :
- Working Paper