1. Optimum Benefit Protocol: A fast converging, bandwidth-efficient decentralized similarity overlay
- Author
-
Shanika Karunasekera, I. F. Bukhari, and Aaron Harwood
- Subjects
Computer Networks and Communications ,business.industry ,Computer science ,Distributed computing ,Document classification ,Message passing ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,MovieLens ,Theoretical Computer Science ,Artificial Intelligence ,Hardware and Architecture ,Scalability ,Convergence (routing) ,0202 electrical engineering, electronic engineering, information engineering ,Bandwidth (computing) ,020201 artificial intelligence & image processing ,Gossip protocol ,business ,Communication complexity ,Cluster analysis ,computer ,Software ,Computer network - Abstract
Due to large volumes of data available online, techniques such as document classification and clustering are required for organization, analysis and management of data. Similarity-based Clustering (SBC) is used by many existing systems for filtering information. Decentralized gossip-based overlays offer a simple, robust and scalable solution to SBC clustering. Convergence and communication complexity are the two key areas of concern when SBC is implemented using these overlays. Convergence guarantees accurate clustering but costs bandwidth because these systems rely on message passing to achieve convergence. In this work, we address the long tail problem, experienced by low in-degree nodes in a long tailed similarity distribution, such as power-law distribution and propose a new SBC approach, Optimum Benefit Protocol (OBP), that converges more rapidly than existing approaches and reduces the long tail. The proposed protocol only sends messages that could benefit the receiver, reducing bandwidth to one fifth of the default. Our protocol obtains at least 90% convergence for a 900 node network, starting in a random configuration, in less than 10 cycles, for all observed experiments. We used real world distributions from Yahoo, Movielens, and Epinion datasets, for experimentation.
- Published
- 2017
- Full Text
- View/download PDF