Instruction prefetch

In computer architecture, instruction prefetch is a technique used in central processing units to speed up the execution of a program by reducing wait states.

Prefetching occurs when a processor requests an instruction or data block from main memory before it is actually needed. Once the block comes back from memory, it is placed in a cache. When the block is actually needed, the processor can access it much more quickly from the cache than if it had to request it from main memory. Prefetching thus hides memory access latency, which makes it a useful technique for addressing the memory wall problem.
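The effect can be illustrated with a toy model. The sketch below is not a description of any real processor: it assumes an unbounded cache modeled as a set, a purely sequential trace of block addresses, and a prefetched block that always arrives before it is needed. The function name `count_demand_misses` and its parameters are hypothetical.

```python
def count_demand_misses(block_addresses, prefetch_next=False):
    """Count demand misses for a trace of block addresses.

    Assumptions: the cache never evicts anything, and a prefetched
    block is always in the cache by the time it is requested.
    """
    cache = set()
    misses = 0
    for block in block_addresses:
        if block not in cache:
            misses += 1           # demand miss: the processor would wait here
            cache.add(block)
        if prefetch_next:
            cache.add(block + 1)  # request the next block before it is needed
    return misses


trace = list(range(8))  # sequential blocks 0..7
print(count_demand_misses(trace))                      # 8
print(count_demand_misses(trace, prefetch_next=True))  # 1
```

With prefetching enabled, only the very first block misses; every later block was requested one access earlier.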

Since programs are generally executed sequentially, performance is likely to be best when instructions are prefetched in program order. Alternatively, the prefetch may be part of a complex branch prediction algorithm, where the processor tries to anticipate the result of a calculation and fetch the right instructions in advance. In the case of dedicated hardware (like a Graphics Processing Unit) the prefetch can take advantage of the spatial coherence usually found in the texture mapping process. In this case, the prefetched data are not instructions, but texture elements (texels) that are candidates to be mapped on a polygon.

The first mainstream microprocessors to use some form of instruction prefetch were the Intel 8086 (six bytes) and the Motorola 68000 (four bytes). Since then, prefetching techniques have been adopted by many high-performance processors.

Types of prefetching

Prefetching can be classified in many ways.[1]

Data or instruction prefetching

As the name implies, prefetching can be performed for either data blocks or instruction blocks. Since data access patterns show less regularity than instruction access patterns, accurate data prefetching is generally more challenging than instruction prefetching.

Hardware or software prefetching

Prefetching can be performed in either hardware or software. Hardware prefetchers use dedicated storage to track recent accesses and detect patterns in them, and issue prefetch requests based on the patterns they detect. In software prefetching, prefetch instructions are inserted into the program based on knowledge of its control flow.
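As an illustration of the hardware approach, the following is a minimal sketch of one common kind of pattern detector, a stride prefetcher. It is an assumed design chosen for illustration, not one described in this article: the class name `StridePrefetcher`, the table indexed by the program counter of the accessing instruction, and the rule "prefetch once the same non-zero stride is seen twice in a row" are all hypothetical choices.

```python
class StridePrefetcher:
    """Toy stride detector: one table entry per accessing instruction."""

    def __init__(self):
        # pc -> (last address seen, last stride seen)
        self.table = {}

    def observe(self, pc, address):
        """Record one access; return an address to prefetch, or None."""
        prediction = None
        if pc in self.table:
            last_address, last_stride = self.table[pc]
            stride = address - last_address
            # Predict only when the same non-zero stride repeats.
            if stride != 0 and stride == last_stride:
                prediction = address + stride
            self.table[pc] = (address, stride)
        else:
            self.table[pc] = (address, 0)
        return prediction


prefetcher = StridePrefetcher()
for addr in (100, 164, 228, 292):
    print(prefetcher.observe(pc=0x400, address=addr))
# prints None, None, 292, 356 (one value per line)
```

The first two accesses only train the table; from the third access on, the detector has seen the stride of 64 twice and starts predicting the next address.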

Prefetching metrics

Many metrics are used for characterizing prefetching operations.[1]

Prefetch degree

Prefetch degree is the number of cache lines prefetched in each prefetching operation.

Prefetch distance

Prefetch distance indicates how far ahead of the demand access stream the data blocks are prefetched.
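Prefetch degree and prefetch distance can be read together as two parameters of a single prefetching operation. The sketch below assumes a simple sequential prefetcher that works in units of cache lines; the function `lines_to_prefetch` and its parameter names are hypothetical.

```python
def lines_to_prefetch(demand_line, distance, degree):
    """Cache lines requested in one prefetching operation.

    distance: how many lines ahead of the demand access to start
    degree:   how many consecutive lines to request
    """
    first = demand_line + distance
    return [first + i for i in range(degree)]


print(lines_to_prefetch(100, distance=4, degree=2))  # [104, 105]
```

Under these assumptions, a larger distance gives each prefetch more time to complete, while a larger degree brings in more lines per operation.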

Usefulness

A prefetch operation is useful if the item it brings in removes a future cache miss, and useless if it does not. A prefetch operation is harmful if the item it brings in replaces a useful block and thus possibly increases the number of cache misses. Harmful prefetches lead to cache pollution. A prefetch operation is redundant if the data block it brings in is already present in the cache.
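These categories can be expressed as a small classifier. The sketch below assumes perfect knowledge of future demand accesses, which is only available in an offline simulation. The function `classify_prefetch`, its arguments, and the order in which the categories are checked are illustrative assumptions; in particular, a prefetch that is both useful and displaces a needed block is reported here as harmful.

```python
def classify_prefetch(block, cache, future_accesses, evicted=None):
    """Label one prefetch, given the blocks that will be demanded later."""
    if block in cache:
        return "redundant"  # the block was already present in the cache
    if evicted is not None and evicted in future_accesses:
        return "harmful"    # displaced a block that is still needed
    if block in future_accesses:
        return "useful"     # removes a future demand miss
    return "useless"        # never demanded later


cache = {1, 2}
future = [2, 3]
print(classify_prefetch(1, cache, future))             # redundant
print(classify_prefetch(3, cache, future))             # useful
print(classify_prefetch(9, cache, future))             # useless
print(classify_prefetch(9, cache, future, evicted=2))  # harmful
```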

References

[1] "A Survey of Recent Prefetching Techniques for Processor Caches", ACM Computing Surveys, 2016.