A low latency minimum distance searching unit of the SOM based hardware quantizer.

Authors :
Kurdthongmee, W.
Source :
Microprocessors & Microsystems. Mar 2015, Vol. 39 Issue 2, p135-143. 9p.
Publication Year :
2015

Abstract

Parts of a SOM (Self-Organizing Map) based quantizer can be performed in parallel, namely the distance calculation between an input pixel and a group of codewords or processing elements (PEs), and the weight update of the PEs. To search for the best matching unit (BMU), whose distance is the minimum, all distances must inevitably be compared with each other. Conventionally, the minimum distance searching unit is constructed from a group of comparators connected in a multistage manner in order to produce the final single minimum distance and its index. In this way, the overall latency of the unit is linearly proportional to the number of comparator stages, log2(C), where C is the number of distances. In this paper, we propose a novel hardware-centric algorithm that aims to reduce the latency of the minimum distance searching unit. In its simplest form, the algorithm relies on a memory of K addresses with a 1-bit word size, where K is equal to the maximum distance value. During operation, each distance is used as a memory address whose state is changed from ‘unoccupied’ to ‘occupied’. To efficiently search for the first address whose state is ‘occupied’, which corresponds to the minimum distance, a look-up table is employed. The algorithm is also adapted to make it more feasible to realize on an FPGA platform. Synthesis results show that, compared with the conventional minimum distance searching unit, the algorithm requires twice the FPGA resources in terms of slice and LUT usage. In terms of latency, the implementation takes only 0.62 times that of the conventional one for a PE size of 256. After integrating the unit into the SOM based quantizer, the obtained frame rate was found to be 1.50 times that of the conventional one for a PE size of 256, an image size of 512 × 512 and a clock speed of 66.67 MHz. The latency could be reduced further if the FPGA supported combining all the ‘occupied’ states in a single stage, instead of using a group of internal LUTs with a limited input size. [ABSTRACT FROM AUTHOR]
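
The two search strategies described in the abstract can be contrasted in a small software sketch. This is an illustration only, not the paper's hardware design: the function names are hypothetical, a Python integer stands in for the 1-bit-wide occupancy memory, and the lowest-set-bit trick stands in for the look-up table that finds the first ‘occupied’ address. The abstract does not say how the winning PE index is recovered, so the `list.index` lookup below is an assumption.

```python
def min_distance_tree(distances):
    """Conventional approach: pairwise comparators arranged in stages.

    Returns (min_distance, index, stages); for a power-of-two count C,
    stages equals log2(C).
    """
    if not distances:
        raise ValueError("at least one distance is required")
    entries = list(enumerate(distances))  # (PE index, distance) pairs
    stages = 0
    while len(entries) > 1:
        survivors = []
        for i in range(0, len(entries) - 1, 2):
            left, right = entries[i], entries[i + 1]
            # On a tie the left (lower-index) entry wins.
            survivors.append(left if left[1] <= right[1] else right)
        if len(entries) % 2:
            survivors.append(entries[-1])  # odd entry passes through
        entries = survivors
        stages += 1
    index, distance = entries[0]
    return distance, index, stages


def min_distance_occupancy(distances, max_distance):
    """Occupancy-memory approach: mark each distance, find first mark.

    Bit d of `occupied` models the 1-bit memory word at address d.
    """
    if not distances:
        raise ValueError("at least one distance is required")
    occupied = 0
    for d in distances:
        if not 0 <= d <= max_distance:
            raise ValueError("distance outside the memory address range")
        occupied |= 1 << d  # 'unoccupied' -> 'occupied'
    # Isolate the lowest set bit; its position is the minimum distance.
    lowest = occupied & -occupied
    minimum = lowest.bit_length() - 1
    # Assumption: the index is recovered by a separate lookup.
    return minimum, distances.index(minimum)
```

For example, with the eight distances `[9, 4, 7, 4, 12, 30, 5, 8]` the tree needs three comparator stages, while the occupancy version marks seven distinct addresses and reads off the lowest one; both report distance 4 at PE index 1.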

Details

Language :
English
ISSN :
01419331
Volume :
39
Issue :
2
Database :
Academic Search Index
Journal :
Microprocessors & Microsystems
Publication Type :
Academic Journal
Accession number :
101941191
Full Text :
https://doi.org/10.1016/j.micpro.2015.01.009