Approximate computing
Approximate computing is a computation paradigm that returns a possibly inaccurate result rather than a guaranteed accurate one, and it applies in situations where an approximate result is sufficient for the purpose.[1][2] One example is a search engine, where no exact answer may exist for a given query and many answers may therefore be acceptable. Similarly, occasionally dropping frames in a video application can go undetected because of the perceptual limitations of humans. Approximate computing is based on the observation that in many scenarios exact computation requires a large amount of resources, whereas allowing bounded approximation can provide disproportionate gains in performance and energy while still achieving acceptable accuracy. For example, in the k-means clustering algorithm, allowing only a 5% loss in classification accuracy can provide a 50-fold energy saving compared with fully accurate classification.[1]
The key requirement in approximate computing is that approximation be introduced only in non-critical data, since approximating critical data (e.g., control operations) can lead to disastrous consequences, such as a program crash or erroneous output.
Strategies
Several strategies can be used for performing approximate computing.
- Approximate circuits
- Approximate adders,[3] multipliers, and other logic circuits can reduce hardware overhead.[4][5] For example, an approximate multi-bit adder can ignore the carry chain and thus allow all of its sub-adders to perform their additions in parallel.
- Approximate storage
- Instead of storing data values exactly, they can be stored approximately, for example by truncating the low-order bits of floating-point data. Another method is to accept less reliable memory: in DRAM and eDRAM the refresh rate can be lowered, and in SRAM the supply voltage can be lowered. In general, error detection and correction mechanisms should be disabled for such data.
- Software-level approximation
- There are several ways to approximate at the software level. Memoization can be applied. Some loop iterations can be skipped (termed loop perforation) to reach a result faster. Whole tasks can also be skipped, for example when a run-time condition suggests that they are not going to be useful (task skipping). Monte Carlo algorithms trade guaranteed correctness for a bounded execution time. The computation can also be reformulated according to paradigms that make it easier to accelerate on specialized hardware, e.g. a neural processing unit.[6]
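Three of the strategies listed above can be mimicked in a few lines of software. The following is a minimal Python sketch rather than a reference implementation: the function names (`segmented_add`, `truncate_mantissa`, `perforated_mean`), the 16-bit width, the 4-bit sub-adders, and the choice of a 32-bit float are all assumptions made for illustration and do not come from the cited works.

```python
import struct


def segmented_add(a: int, b: int, width: int = 16, segment: int = 4) -> int:
    """Approximate adder (hypothetical model): each `segment`-bit slice is
    added on its own and its carry-out is discarded, so no carry chain
    links the sub-adders and they could all run in parallel in hardware."""
    mask = (1 << segment) - 1
    result = 0
    for shift in range(0, width, segment):
        part = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (part & mask) << shift  # drop the carry out of this slice
    return result


def truncate_mantissa(x: float, dropped_bits: int) -> float:
    """Approximate storage (hypothetical model): keep x as a 32-bit float
    and zero its `dropped_bits` lowest mantissa bits."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= ~((1 << dropped_bits) - 1) & 0xFFFFFFFF
    (approx,) = struct.unpack("<f", struct.pack("<I", bits))
    return approx


def perforated_mean(values, stride: int = 2) -> float:
    """Loop perforation: visit only every `stride`-th element, skipping
    the remaining iterations of the loop."""
    if not values:
        raise ValueError("values must not be empty")
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


if __name__ == "__main__":
    print(hex(segmented_add(0x1234, 0x1111)))   # no slice overflows: 0x2345
    print(segmented_add(0x000F, 0x0001))        # carry lost: 0 instead of 16
    print(truncate_mantissa(3.14159, 16))       # 3.140625
    print(perforated_mean(list(range(100)), 2)) # 49.0 (exact mean is 49.5)
```

In each case the error depends on the input: the adder is exact whenever no slice overflows, the truncated float loses at most the weight of the dropped mantissa bits, and the perforated mean is close to the exact one only when the skipped elements resemble the visited ones.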
Application areas
Approximate computing has been used in a variety of error-tolerant application domains, such as multimedia processing, machine learning, signal processing, and scientific computing. Google uses this approach in its Tensor Processing Units (TPUs, a custom ASIC).
Derived paradigms
The main issue in approximate computing is identifying the sections of an application that can be approximated. In large-scale applications, the people with expertise in approximate computing techniques often lack sufficient expertise in the application domain, and vice versa. To solve this problem, programming paradigms[7][8] have been proposed. They all share a clear separation of roles between the application programmer and the application domain expert. These approaches help spread the most common optimizations and approximate computing techniques.
References
- Mittal, Sparsh (May 2016). "A Survey of Techniques for Approximate Computing". ACM Comput. Surv. ACM. 48 (4): 62:1–62:33. doi:10.1145/2893356.
- A. Sampson, et al. "EnerJ: Approximate data types for safe and general low-power computation", ACM SIGPLAN Notices, vol. 46, no. 6, 2011.
- J. Echavarria, et al. "FAU: Fast Approximate Adder Units on LUT-Based FPGAs", FPT, 2016.
- S. Venkataramani, et al. "SALSA: systematic logic synthesis of approximate circuits", DAC, 2012.
- R. Hegde, et al. "Energy-efficient signal processing via algorithmic noise-tolerance", ISLPED, 1999.
- H. Esmaeilzadeh, et al. "Neural acceleration for general-purpose approximate programs", MICRO, 2012.
- Nguyen, Donald; Lenharth, Andrew; Pingali, Keshav (2013). "A lightweight infrastructure for graph analytics". Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM: 456–471.
- Silvano, Cristina; Agosta, Giovanni; Cherubin, Stefano; Gadioli, Davide; Palermo, Gianluca; Bartolini, Andrea; Benini, Luca; Martinovič, Jan; Palkovič, Martin; Slaninová, Kateřina; Bispo, João; Cardoso, João M. P.; Abreu, Rui; Pinto, Pedro; Cavazzoni, Carlo; Sanna, Nico; Beccari, Andrea R.; Cmar, Radim; Rohou, Erven (2016). "The ANTAREX approach to autotuning and adaptivity for energy efficient HPC systems". Proceedings of the ACM International Conference on Computing Frontiers. ACM: 288–293.