x86 calling conventions

This article describes the calling conventions used when programming x86 architecture microprocessors.

Calling conventions describe the interface of called code:

This is intimately related with the assignment of sizes and formats to programming-language types. Another closely related topic is name mangling, which determines how symbol names in the code map to symbol names used by the linker. Calling conventions, type representations, and name mangling are all part of what is known as an application binary interface (ABI).

There are often subtle differences in how various compilers implement these conventions, so it is often difficult to interface code which is compiled by different compilers. On the other hand, conventions which are used as an API standard (such as stdcall) are very uniformly implemented.

Historical background

Prior to microcomputers, the machine manufacturer generally provided an operating system and compilers for several programming languages. The calling convention(s) for each platform were those defined by the manufacturer's programming tools.

Early microcomputers before the Commodore Pet and Apple II, generally came without an OS or compilers, as did the IBM PC. The only hardware standard for IBM PC-compatible machines was defined by the Intel processors (8086, 80386) and the literal hardware IBM shipped. Hardware extensions and all software standards (save for a BIOS calling convention) were thrown open to market competition.

A multitude of independent software firms offered operating systems, compilers for many programming languages, and applications. Many different calling schemes were implemented by the firms, often mutually exclusive, based on different requirements, historical practices, and programmer creativity.

After the IBM-compatible market shakeout, Microsoft operating systems and programming tools (with differing conventions) predominated, while second-tier firms like Borland and Novell, and open-source projects like GCC, still maintained their own standards. Provisions for interoperability between vendors and products were eventually adopted, simplifying the problem of choosing a viable convention.[1]

Caller clean-up

In these conventions, the caller cleans the arguments from the stack.

cdecl

The cdecl (which stands for C declaration) is a calling convention that originates from the C programming language and is used by many C compilers for the x86 architecture.[1] In cdecl, subroutine arguments are passed on the stack. Integer values and memory addresses are returned in the EAX register, floating point values in the ST0 x87 register. Registers EAX, ECX, and EDX are caller-saved, and the rest are callee-saved. The x87 floating point registers ST0 to ST7 must be empty (popped or freed) when calling a new function, and ST1 to ST7 must be empty on exiting a function. ST0 must also be empty when not used for returning a value.

In the context of the C programming language, function arguments are pushed on the stack in the reverse order. In Linux, GCC sets the de facto standard for calling conventions. Since GCC version 4.5, the stack must be aligned to a 16-byte boundary when calling a function (previous versions only required a 4-byte alignment.)[2]

Consider the following C source code snippet:

int callee(int, int, int);

int caller(void)
{
	int ret;

	ret = callee(1, 2, 3);
	ret += 5;
	return ret;
}

On x86, it will produce the following assembly code (Intel syntax):

caller:
	; make new call frame
	push    ebp
	mov     ebp, esp
	; push call arguments
	push    3
	push    2
	push    1
	; call subroutine 'callee'
	call    callee
	; remove arguments from frame
	add     esp, 12
	; use subroutine result
	add     eax, 5
	; restore old call frame
	pop     ebp
	; return
	ret

The caller cleans the stack after the function call returns.

There are some variations in the interpretation of cdecl,[3] particularly in how to return values. As a result, x86 programs compiled for different operating system platforms and/or by different compilers can be incompatible, even if they both use the "cdecl" convention and do not call out to the underlying environment. Some compilers return simple data structures with a length of 2 registers or less in the register pair EAX:EDX, and larger structures and class objects requiring special treatment by the exception handler (e.g., a defined constructor, destructor, or assignment) are returned in memory. To pass "in memory", the caller allocates memory and passes a pointer to it as a hidden first parameter; the callee populates the memory and returns the pointer, popping the hidden pointer when returning.

In Linux/GCC, double/floating point values should be pushed on the stack via the x87 pseudo-stack. Like so:

	sub     esp, 8          ; make room for the double
	fld     [ebp + x]       ; load our double onto the floating point stack
	fstp    [esp]           ; push our double onto the stack
	call    funct
	add     esp, 8

Using this method ensures it is pushed on the stack in the correct format.

The cdecl calling convention is usually the default calling convention for x86 C compilers, although many compilers provide options to automatically change the calling conventions used. To manually define a function to be cdecl, some support the following syntax:

void _cdecl funct();

The _cdecl modifier must be included in the function prototype and in the function declaration, to override any other settings that might be in place.

syscall

This is similar to cdecl in that arguments are pushed right-to-left. EAX, ECX, and EDX are not preserved. The size of the parameter list in doublewords is passed in AL.

Syscall is the standard calling convention for 32 bit OS/2 API.

Arguments are pushed right-to-left. The three lexically first (leftmost) arguments are passed in EAX, EDX, and ECX and up to four floating-point arguments are passed in ST(0) through ST(3), although space for them is reserved in the argument list on the stack. Results are returned in EAX or ST(0). Registers EBP, EBX, ESI, and EDI are preserved.

Optlink is used by the IBM VisualAge compilers.

Callee clean-up

In these conventions, the callee cleans up the arguments from the stack. Functions which utilize these conventions are easy to recognize in ASM code because they will unwind the stack prior to returning. The x86 ret instruction allows an optional 16-bit parameter that specifies the number of stack bytes to unwind before returning to the caller. Such code looks like this:

 ret 12

Conventions entitled fastcall or register have not been standardized, and have been implemented differently, depending on the compiler vendor.[1] Typically register based calling conventions pass one or more arguments in registers which reduces the number of memory accesses required for the call and thus make them usually faster.

pascal

Based on the Pascal programming language's calling convention, the parameters are pushed on the stack in left-to-right order (opposite of cdecl), and the callee is responsible for balancing the stack before return.

This calling convention was common in the following 16-bit APIs: OS/2 1.x, Microsoft Windows 3.x, and Borland Delphi version 1.x. Modern versions of the Windows API use stdcall, which still has the callee restoring the stack as in the Pascal convention, but the parameters are now pushed right to left.

stdcall

The stdcall[4] calling convention is a variation on the Pascal calling convention in which the callee is responsible for cleaning up the stack, but the parameters are pushed onto the stack in right-to-left order, as in the _cdecl calling convention. Registers EAX, ECX, and EDX are designated for use within the function. Return values are stored in the EAX register.

stdcall is the standard calling convention for the Microsoft Win32 API and for Open Watcom C++.

Microsoft fastcall

Microsoft[5] or GCC[6] __fastcall convention (aka __msfastcall) passes the first two arguments (evaluated left to right) that fit into ECX and EDX. Remaining arguments are pushed onto the stack from right to left. When MS compiler compiles for IA64 or AMD64, it ignores the __fastcall keyword and uses the one 64-bit calling convention instead.

Microsoft vectorcall

In Visual Studio 2013, Microsoft introduced the __vectorcall calling convention in response to efficiency concerns from game, graphic, video/audio, and codec developers.[7] For IA-32 and x64 code, __vectorcall is similar to __fastcall and the original x64 calling conventions respectively, but extends them to support passing vector arguments using SIMD registers. For x64, when any of the first six arguments are vector types (float, double, __m128, __m256, etc.), they are passed in via the corresponding XMM/YMM registers. Similarly for IA-32, up to six XMM/YMM registers are allocated sequentially for vector type arguments from left to right regardless of position. Additionally, __vectorcall adds support for passing homogeneous vector aggregate (HVA) values, which are composite types consisting solely of up to four identical vector types, using the same six registers. Once the registers have been allocated for vector type arguments, the unused registers are allocated to HVA arguments from left to right regardless of position. Resulting vector type and HVA values are returned using the first four XMM/YMM registers.[8]

Borland register

Evaluating arguments from left to right, it passes three arguments via EAX, EDX, ECX. Remaining arguments are pushed onto the stack, also left to right.[9] It is the default calling convention of the 32-bit compiler of Delphi, where it is known as register. Some versions of Linux kernel use this convention on i386.[10]

Watcom register

Watcom does not support the __fastcall keyword except to alias it to null. The register calling convention may be selected by command line switch. (However, IDA uses __fastcall anyway for uniformity.)

Up to 4 registers are assigned to arguments in the order eax, edx, ebx, ecx. Arguments are assigned to registers from left to right. If any argument cannot be assigned to a register (say it is too large) it, and all subsequent arguments, are assigned to the stack. Arguments assigned to the stack are pushed from right to left. Names are mangled by adding a suffixed underscore.

Variadic functions fall back to the Watcom stack based calling convention.

The Watcom C/C++ compiler also uses the #pragma aux[11] directive that allows the user to specify their own calling convention. As its manual states, "Very few users are likely to need this method, but if it is needed, it can be a lifesaver".

TopSpeed / Clarion / JPI

The first four integer parameters are passed in registers eax, ebx, ecx and edx. Floating point parameters are passed on the floating point stack – registers st0, st1, st2, st3, st4, st5 and st6. Structure parameters are always passed on the stack. Additional parameters are passed on the stack after registers are exhausted. Integer values are returned in eax, pointers in edx and floating point types in st0.

safecall

In Delphi and Free Pascal on Microsoft Windows, the safecall calling convention encapsulates COM (Component Object Model) error handling, thus exceptions aren't leaked out to the caller, but are reported in the HRESULT return value, as required by COM/OLE. When calling a safecall function from Delphi code, Delphi also automatically checks the returned HRESULT and raises an exception if necessary.

The safecall calling convention is the same as the stdcall calling convention, except that exceptions are passed back to the caller in EAX as a HResult (instead of in FS:[0]), while the function result is passed by reference on the stack as though it were a final "out" parameter. When calling a Delphi function from Delphi this calling convention will appear just like any other calling convention, because although exceptions are passed back in EAX, they are automatically converted back to proper exceptions by the caller. When using COM objects created in other languages, the HResults will be automatically raised as exceptions, and the result for Get functions is in the result rather than a parameter. When creating COM objects in Delphi with safecall, there is no need to worry about HResults, as exceptions can be raised as normal but will be seen as HResults in other languages.

function function_name(a: DWORD): DWORD; safecall;

Returns a result and raises exceptions like a normal Delphi function, but it passes values and exceptions as though it was:

function function_name(a: DWORD; out Result: DWORD): HResult; stdcall;

Either caller or callee clean-up

thiscall

This calling convention is used for calling C++ non-static member functions. There are two primary versions of thiscall used depending on the compiler and whether or not the function uses a variable number of arguments.

For the GCC compiler, thiscall is almost identical to cdecl: The caller cleans the stack, and the parameters are passed in right-to-left order. The difference is the addition of the this pointer, which is pushed onto the stack last, as if it were the first parameter in the function prototype.

On the Microsoft Visual C++ compiler, the this pointer is passed in ECX and it is the callee that cleans the stack, mirroring the stdcall convention used in C for this compiler and in Windows API functions. When functions use a variable number of arguments, it is the caller that cleans the stack (cf. cdecl).

The thiscall calling convention can only be explicitly specified on Microsoft Visual C++ 2005 and later. On any other compiler thiscall is not a keyword. (However, disassemblers, such as IDA, must specify it. So IDA uses keyword __thiscall for this.)

Register preservation

Another part of a calling convention is which registers are guaranteed to retain their values after a subroutine call. According to the Intel ABI to which the vast majority of compilers conform, the EAX, EDX, and ECX are to be free for use within a procedure or function, and need not be preserved.

x86-64 calling conventions

x86-64 calling conventions take advantage of the additional register space to pass more arguments in registers. Also, the number of incompatible calling conventions has been reduced. There are two in common use.

Microsoft x64 calling convention

The Microsoft x64 calling convention[12][13] is followed on Windows and pre-boot UEFI (for long mode on x86-64). It uses registers RCX, RDX, R8, R9 for the first four integer or pointer arguments (in that order), and XMM0, XMM1, XMM2, XMM3 are used for floating point arguments. Additional arguments are pushed onto the stack (right to left). Integer return values (similar to x86) are returned in RAX if 64 bits or less. Floating point return values are returned in XMM0. Parameters less than 64 bits long are not zero extended; the high bits are not zeroed.

When compiling for the x64 architecture in a Windows context (whether using Microsoft or non-Microsoft tools), there is only one calling convention — the one described here, so that stdcall, thiscall, cdecl, fastcall, etc., are now all one and the same.

In the Microsoft x64 calling convention, it's the caller's responsibility to allocate 32 bytes of "shadow space" on the stack right before calling the function (regardless of the actual number of parameters used), and to pop the stack after the call. The shadow space is used to spill RCX, RDX, R8, and R9,[14] but must be made available to all functions, even those with fewer than four parameters.

The registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile (caller-saved).[15]

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are considered nonvolatile (callee-saved).[15]

For example, a function taking 5 integer arguments will take the first to fourth in registers, and the fifth will be pushed on the top of the shadow space. So when the called function is entered, the stack will be composed of (in ascending order) the return address, followed by the shadow space (32 bytes) followed by the fifth parameter.

In x86-64, Visual Studio 2008 stores floating point numbers in XMM6 and XMM7 (as well as XMM8 through XMM15); consequently, for x86-64, user-written assembly language routines must preserve XMM6 and XMM7 (as compared to x86 wherein user-written assembly language routines did not need to preserve XMM6 and XMM7). In other words, user-written assembly language routines must be updated to save/restore XMM6 and XMM7 before/after the function when being ported from x86 to x86-64.

Starting with Visual Studio 2013, Microsoft introduced the __vectorcall calling convention which extends the x64 convention.

System V AMD64 ABI

The calling convention of the System V AMD64 ABI is followed on Solaris, Linux, FreeBSD, OS X,[16] and is the de facto standard among Unix and Unix-like operating systems. The first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX (R10 in the Linux kernel interface[17]:124), R8, and R9, while XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6 and XMM7 are used for certain floating point arguments.[17]:20 As in the Microsoft x64 calling convention, additional arguments are passed on the stack[17]:20 and the return value is stored in RAX and RDX.[17]:22

If the callee wishes to use registers RBP, RBX, and R12–R15, it must restore their original values before returning control to the caller. All others must be saved by the caller if it wishes to preserve their values.[17]:15

If the callee is a variadic function, then the number of floating point arguments passed to the function in vector registers must be provided by the caller in the RAX register.[17]:50)

Unlike the Microsoft calling convention, a shadow space is not provided; on function entry, the return address is adjacent to the seventh integer argument on the stack.

List of x86 calling conventions

This is a list of x86 calling conventions.[1] These are conventions primarily intended for C/C++ compilers (especially the 64-bit part below), and thus largely special cases. Other languages may use other formats and conventions in their implementations.

Architecture Calling convention name Operating system, compiler Parameters in registers Parameter order on stack Stack cleanup by Notes
8086 cdecl RTL (C) Caller
Pascal LTR (Pascal) Callee
fastcall Microsoft (non-member) AX, DX, BX LTR (Pascal) Callee Return pointer in BX.
fastcall Microsoft (member function) AX, DX LTR (Pascal) Callee “this” on stack low address. Return pointer in AX.
fastcall Turbo C[18] AX, DX, CX LTR (Pascal) Callee “this” on stack low address. Return pointer on stack high address.
Watcom AX, DX, BX, CX RTL (C) Callee Return pointer in SI.
IA-32 cdecl GCC RTL (C) Caller When returning struct/class, the calling code allocates space and passes a pointer to this space via a hidden parameter on the stack. The called function writes the return value to this address.
cdecl Microsoft RTL (C) Caller When returning struct/class,
  • Plain old data (POD) return values 32 bits or smaller are in the EAX register
  • POD return values 33-64 bits in size are returned via the EAX:EDX registers.
  • Non-POD return values or values larger than 64-bits, the calling code will allocate space and passes a pointer to this space via a hidden parameter on the stack. The called function writes the return value to this address.
stdcall Microsoft RTL (C) Callee
GCC RTL (C) Hybrid Stack aligned on 16 bytes boundary.
fastcall Microsoft ECX, EDX RTL (C) Callee Return pointer on stack if not member function.
fastcall GCC ECX, EDX RTL (C) Callee
register Delphi and Free Pascal EAX, EDX, ECX LTR (Pascal) Callee
thiscall Windows (Microsoft Visual C++) ECX RTL (C) Callee Default for member functions.
vectorcall Windows (Microsoft Visual C++) RTL (C)
Watcom compiler EAX, EDX, EBX, ECX RTL (C) Callee Return pointer in ESI.
x86-64 Microsoft x64 calling convention[12] Windows (Microsoft Visual C++, GCC, Intel C++ Compiler, Delphi), UEFI RCX/XMM0, RDX/XMM1, R8/XMM2, R9/XMM3 RTL (C) [19] Caller Stack aligned on 16 bytes. 32 bytes shadow space on stack. The specified 8 registers can only be used for parameters 1 through 4.
vectorcall Windows (Microsoft Visual C++) RCX/XMM0, RDX/XMM1, R8/XMM2, R9/XMM3 + XMM0-XMM5/YMM0-YMM5 RTL (C) Caller [20]
System V AMD64 ABI[17] Solaris, Linux, BSD, OS X (GCC, Intel C++ Compiler) RDI, RSI, RDX, RCX, R8, R9, XMM0–7 RTL (C) Caller Stack aligned on 16 bytes boundary. 128 bytes red zone below stack.

References

Footnotes

  1. 1 2 3 4 Agner Fog (2010-02-16). "Calling conventions for different C++ compilers and operating systems" (PDF).
  2. "GCC Bugzilla – Bug 40838 - gcc shouldn't assume that the stack is aligned". 2009.
  3. de Boyne Pollard, Jonathan (2010). "The gen on function calling conventions". Frequently Given Answers.
  4. http://msdn2.microsoft.com/en-us/library/zxk0tw93(vs.71).aspx
  5. "__fastcall". MSDN. Retrieved 2013-09-26.
  6. Ohse, Uwe. "gcc attribute overview: function fastcall". ohse.de. Retrieved 2010-09-27.
  7. "Introducing 'Vector Calling Convention'". MSDN. Retrieved 2014-12-31.
  8. "__vectorcall". MSDN. Retrieved 2014-12-31.
  9. "Program Control: Register Convention". docwiki.embarcadero.com. 2010-06-01. Retrieved 2010-09-27.
  10. "Add CONFIG for -mregparm=3".
  11. "Calling_Conventions: Specifying_Calling_Conventions_the_Watcom_Way". openwatcom.org. 2010-04-27. Retrieved 2010-09-27.
  12. 1 2 "x64 Software Conventions: Calling Conventions". msdn.microsoft.com. 2010. Retrieved 2010-09-27.
  13. "x64 Architecture". msdn.microsoft.com.
  14. "x64 Software Conventions - Stack Allocation". Microsoft. Retrieved 2010-03-31.
  15. 1 2 "Caller/Callee Saved Registers". msdn.microsoft.com/en-us/library/6t169e9c.aspx. Microsoft.
  16. "x86-64 Code Model". Mac Developer Library. Apple Inc. Archived from the original on 2016-03-10. Retrieved 2016-04-06. The x86-64 environment in OS X has only one code model for user-space code. It’s most similar to the small PIC model defined by the x86-64 System V ABI.
  17. 1 2 3 4 5 6 7 Michael Matz, Jan Hubička, Andreas Jaeger, Mark Mitchell, eds. (2014-11-17). "System V Application Binary Interface: AMD64 Architecture Processor Supplement" (PDF). 0.99.7.
  18. Borland C/C++ version 3.1 User Guide (PDF). Borland. 1992. pp. 158,189191.
  19. http://msdn.microsoft.com/en-us/library/zthk2dkh(v=vs.80).aspx
  20. https://msdn.microsoft.com/en-us/library/dn375768.aspx

Other sources

Further reading

  • Jonathan de Boyne Pollard (2010). "The gen on function calling conventions". Frequently Given Answers. 
  • Kip R. Irvine (2011). "Advanced Procedures (Chapter 8)". Assembly Language for x86 Processors (6th ed.). Prentice Hall. ISBN 978-0-13-602212-1. 
  • Borland C/C++ version 3.1 User Guide (PDF). Borland. 1992. pp. 158,189191. 
  • Thomas Lauer (1995). "The New __stdcall Calling Sequence". Porting to Win32: A Guide to Making Your Applications Ready for the 32-Bit Future of Windows. Springer. ISBN 978-0-387-94572-9. 
This article is issued from Wikipedia - version of the 10/11/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.