Function evaluation is an important arithmetic computation in many signal processing applications, for example in the special function units of modern graphics processing units (GPUs). Hardware implementations of function evaluation usually consist of lookup tables (LUTs) and simple arithmetic units such as multipliers and/or adders. The LUTs usually account for a significant portion of the total area cost, especially when a function evaluator computes several different arithmetic functions with shared arithmetic units, because each function needs its own LUTs. In this paper, we focus on table-addition (TA) function evaluators, which are composed of two types of LUT, a table of initial values (TI) and a table of offset values (TO), followed by a multi-operand adder. It has been shown that the multipartite table method (MP) significantly improves on prior similar designs, such as the symmetric bipartite table method (SBTM) and the symmetric table addition method (STAM), for applications with low-to-medium precision requirements. This paper presents an extension of MP, called hierarchical multipartite (HMP), which further reduces the total table size by applying several levels of table decomposition. Furthermore, we perform bit-width optimization by jointly considering the impact of all error sources during the search for the best table decomposition, leading to a more efficient hardware design. In addition, a new lossless decomposition of TI is presented, which yields additional table-size savings without incurring any extra error. Experimental results show that the proposed design can effectively reduce the total area cost in ASIC and FPGA implementations.
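To make the table-addition principle concrete, the following Python sketch builds a plain two-table evaluator: one TI indexed by the leading input bits and one TO indexed by the most significant and least significant bit fields, with the two outputs summed. It is only an illustration under stated assumptions, not the MP or HMP design of this paper: the function names (`build_tables`, `evaluate`, `max_abs_error`), the three-way bit split, the first-order slope model, and the use of unquantized floating-point table entries are all hypothetical choices made for the example.

```python
import math

def build_tables(f, df, n0, n1, n2):
    """Build TI and TO for inputs in [0, 1) with n0 + n1 + n2 fractional bits.

    Assumption: the input word is split into fields x0 (n0 MSBs),
    x1 (n1 middle bits) and x2 (n2 LSBs). Entries are kept as floats;
    a hardware design would also quantize them.
    """
    ulp = 2.0 ** -(n0 + n1 + n2)
    mid1 = (2 ** n1 - 1) / 2.0  # centre of the x1 field
    mid2 = (2 ** n2 - 1) / 2.0  # centre of the x2 field
    # TI: indexed by (x0, x1); f sampled at the centre of the x2 range.
    ti = [f(((a << n2) + mid2) * ulp) for a in range(2 ** (n0 + n1))]
    # TO: indexed by (x0, x2); one slope is shared by all x1 within a segment.
    to = []
    for x0 in range(2 ** n0):
        x_centre = ((x0 * 2 ** n1 + mid1) * 2 ** n2 + mid2) * ulp
        slope = df(x_centre)
        to.append([slope * (x2 - mid2) * ulp for x2 in range(2 ** n2)])
    return ti, to

def evaluate(ti, to, x_int, n1, n2):
    """Approximate f(x) as TI[x0, x1] + TO[x0, x2] for the integer input word."""
    x2 = x_int & ((1 << n2) - 1)
    a = x_int >> n2   # concatenated (x0, x1) index into TI
    x0 = a >> n1
    return ti[a] + to[x0][x2]

def max_abs_error(f, df, n0, n1, n2):
    """Exhaustively measure the worst-case approximation error."""
    ti, to = build_tables(f, df, n0, n1, n2)
    n = n0 + n1 + n2
    return max(abs(evaluate(ti, to, x, n1, n2) - f(x * 2.0 ** -n))
               for x in range(2 ** n))

if __name__ == "__main__":
    # Example: sin(x) on [0, 1) with a 12-bit input split as 4/4/4.
    print(max_abs_error(math.sin, math.cos, 4, 4, 4))
```

With the 4/4/4 split assumed here, TI and TO hold 256 entries each, compared with 4096 entries for a single direct lookup table over the same 12-bit input. The saving comes from letting TO ignore the middle bit field, at the cost of a small additional approximation error that `max_abs_error` reports.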