c - What ensures reads/writes of operands occur at the desired times with extended ASM?
According to GCC's Extended Asm documentation and its description of the assembler template, instructions that must stay consecutive have to be in the same asm block. I'm having trouble understanding what governs the scheduling, or timing, of reads and writes of the operands in a block with multiple statements.
As an example, EBX or RBX needs to be preserved when using CPUID because, according to the ABI, the caller owns it. There are some open questions with respect to the use of EBX and RBX, so we want to preserve it unconditionally (it's a requirement). So three instructions need to be encoded in a single asm block to ensure that the instructions stay consecutive (re: the assembler template discussed in the first paragraph):
    unsigned int __func = 1, __subfunc = 0;
    unsigned int __eax, __ebx, __ecx, __edx;

    __asm__ __volatile__ (
        "push %%ebx;"
        "cpuid;"
        "pop %%ebx"
        : "=a"(__eax), "=b"(__ebx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc)
    );
If the expression representing the operands is interpreted at the wrong point in time, then __ebx will be the saved EBX (and not CPUID's EBX), which will likely be a pointer to the Global Offset Table (GOT) if PIC is enabled.
Where, exactly, does the expression specify that the store of CPUID's %ebx into __ebx should happen (1) after the push %ebx; (2) after the cpuid; but (3) before the pop %ebx?
In your question you present code that does a push and pop of EBX. The idea of saving EBX in case you compile with GCC using -fpic (position-independent code) is correct: it is up to our function not to clobber EBX upon return in that situation. Unfortunately, the way you have defined the constraints, you explicitly use EBX. Generally the compiler will complain (error: inconsistent operand constraints in an 'asm') if you are compiling PIC code and you specify =b as an output constraint. Why it doesn't produce that diagnostic for you is unusual.
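To get around this problem you can let the compiler choose a register for the assembler template. Instead of pushing and popping, we simply exchange %ebx with an unused register chosen by the compiler and restore it by exchanging back afterwards. Since we don't want the compiler to hand that register to one of our inputs, which the first exchange would overwrite before CPUID reads them, we specify the early-clobber modifier (&), ending up with a constraint of =&r (instead of =b in the OP's code). More on modifiers can be found in the GCC documentation on constraint modifiers. Your code (for 32-bit) would look something like: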
    unsigned int __func = 1, __subfunc = 0;
    unsigned int __eax, __ebx, __ecx, __edx;

    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1\n\t"
        : "=a"(__eax), "=&r"(__ebx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));
If you intend to compile for x86_64 (64-bit) you'll need to save the entire contents of %rbx. The code above will not quite work. You'd have to use something like:
    uint32_t __func = 1, __subfunc = 0;
    uint32_t __eax, __ecx, __edx;
    uint64_t __bx; /* Big enough to hold a 64-bit value */

    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1\n\t"
        : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));
You could code this up using conditional compilation to deal with both x86_64 and i386:
    uint32_t __func = 1, __subfunc = 0;
    uint32_t __eax, __ecx, __edx;
    uint64_t __bx; /* Big enough to hold a 64-bit value */

    #if defined(__i386__)
        __asm__ __volatile__ (
            "xchgl\t%%ebx, %k1\n\t"
            "cpuid\n\t"
            "xchgl\t%%ebx, %k1\n\t"
            : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
            : "a"(__func), "c"(__subfunc));
    #elif defined(__x86_64__)
        __asm__ __volatile__ (
            "xchgq\t%%rbx, %q1\n\t"
            "cpuid\n\t"
            "xchgq\t%%rbx, %q1\n\t"
            : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
            : "a"(__func), "c"(__subfunc));
    #else
    #error "Unknown architecture."
    #endif
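One way to package the exchange trick is as a small function. The sketch below is only an illustration under stated assumptions: the name cpuid_xchg and the regs[4] output layout are hypothetical and not from the answer above, and the scratch variable is sized per architecture instead of being shared. On non-x86 targets it simply reports failure instead of stopping the build.

```c
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions for illustration).
 * Runs CPUID for the given leaf/subleaf and stores EAX, EBX, ECX, EDX in
 * regs[0..3]. Returns 1 on i386/x86_64 and 0 on any other architecture. */
int cpuid_xchg(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if defined(__i386__)
    uint32_t bx; /* scratch operand: receives CPUID's EBX */

    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"   /* park the caller's EBX in the scratch reg */
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1\n\t"   /* restore EBX, scratch now holds the result */
        : "=a"(regs[0]), "=&r"(bx), "=c"(regs[2]), "=d"(regs[3])
        : "a"(leaf), "c"(subleaf));
    regs[1] = bx;
    return 1;
#elif defined(__x86_64__)
    uint64_t bx; /* 64 bits wide so the whole of RBX is preserved */

    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1\n\t"
        : "=a"(regs[0]), "=&r"(bx), "=c"(regs[2]), "=d"(regs[3])
        : "a"(leaf), "c"(subleaf));
    regs[1] = (uint32_t)bx;   /* CPUID only produces the low 32 bits */
    return 1;
#else
    (void)leaf;
    (void)subleaf;
    (void)regs;
    return 0;                 /* CPUID does not exist on this target */
#endif
}
```

Calling cpuid_xchg(0, 0, regs) should leave the highest supported standard leaf in regs[0] and the vendor string bytes in regs[1], regs[3], regs[2], in that order.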
GCC has a __cpuid macro defined in cpuid.h. The macro is defined so that it saves the EBX or RBX register only when required. The GCC 4.8.1 definition of the macro in cpuid.h gives an idea of how GCC itself handles CPUID.
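For comparison, here is a minimal sketch that relies on cpuid.h instead of hand-written asm. It assumes a GCC-compatible compiler whose cpuid.h provides the __get_cpuid helper alongside the __cpuid macro; the function name cpu_vendor is hypothetical. It reads leaf 0, where the vendor string comes back in EBX, EDX, ECX order.

```c
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC's header: __cpuid, __get_cpuid, ... */
#endif

/* Hypothetical helper (name is an assumption for illustration).
 * Writes the NUL-terminated CPU vendor string into out[13].
 * Returns 1 on success, 0 if CPUID is unavailable on this target. */
int cpu_vendor(char out[13])
{
    out[0] = '\0';
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* Leaf 0 returns the vendor string in EBX, EDX, ECX (in that order). */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    return 0;
#endif
}
```

Because the header takes care of EBX/RBX itself, no constraints or exchange instructions appear in user code, which is usually the simpler choice when the header is available.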
The astute reader may ask: what stops the compiler from choosing EBX or RBX as the scratch register for the exchange? The compiler knows about EBX and RBX in the context of PIC and will not allow either to be used as a scratch register. This is based on my personal observations over the years and on reviewing the assembler (.s) files generated from C code. I can't say how more ancient versions of GCC handled this, so it could be a problem there.