1. An Analysis of XML Compression Efficiency
- Author
- Augeri, Christopher James; Mullins, Barry E.; Baird III, Leemon C.; Bulutoglu, Dursun A.; Baldwin, Rusty O.
- Subjects
- Computer Science - Databases, Computer Science - Information Theory, Computer Science - Performance, E.4, H.1.1, H.3.5, I.7.2, D.2.8
- Abstract
- XML simplifies data exchange among heterogeneous computers, but it is notoriously verbose and has spawned the development of many XML-specific compressors and binary formats. We present an XML test corpus and a combined efficiency metric integrating compression ratio and execution speed. We use this corpus and linear regression to assess 14 general-purpose and XML-specific compressors relative to the proposed metric. We also identify key factors when selecting a compressor. Our results show XMill or WBXML may be useful in some instances, but a general-purpose compressor is often the best choice.
  Comment: (1) test data at https://web.archive.org/web/20160805043420/http://chris-augeri.com/wp-content/uploads/docs/xml_compress.htm; (2) one next step is testing newer compressors, e.g., Brotli, along with Zstandard, which leverages the asymmetric numeral system (ANS); (3) citations at https://scholar.google.com/scholar?cluster=8178011010886368797&hl=en&as_sdt=5,33&sciodt=0,33
- Published
- 2024