c - What ensures reads/writes of operands occur at the desired time with extended ASM?
According to GCC's Extended Asm documentation on the assembler template, instructions that must stay consecutive have to be in the same asm block. I'm having trouble understanding what provides the scheduling or timing of the reads and writes of the operands in a block with multiple statements.
As an example, EBX or RBX needs to be preserved when using CPUID because, according to the ABI, the caller owns it. There are some open questions with respect to the use of EBX and RBX, so I want to preserve it unconditionally (it's a requirement). So three instructions need to be encoded into a single asm block to ensure the consecutive-ness of the instructions (re: the assembler template discussed in the first paragraph):
    unsigned int __func = 1, __subfunc = 0;
    unsigned int __eax, __ebx, __ecx, __edx;

    __asm__ __volatile__ (
        "push %%ebx;"
        "cpuid;"
        "pop %%ebx"
        : "=a"(__eax), "=b"(__ebx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc)
    );

If the expression representing the operands is interpreted at the wrong point in time, then __ebx will be the saved EBX (and not CPUID's EBX), which will likely be a pointer to the Global Offset Table (GOT) if PIC is enabled.
Where, exactly, does the expression specify that the store of CPUID's %ebx into __ebx should happen (1) after the push %ebx; (2) after the cpuid; but (3) before the pop %ebx?
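One way to probe the timing question is with a tiny experiment. GCC does not parse the instructions in the template: input operands are set up before the first instruction, and output operands are read only after the last one. The sketch below writes the output register twice inside one block; C code only ever sees the final value. The function name is made up for illustration, and the code assumes GCC or Clang targeting x86 or x86-64 with AT&T syntax.

```c
/* Assumption: GCC or Clang, x86 or x86-64 target, AT&T syntax. */
static unsigned int value_at_end_of_template(void)
{
    unsigned int out;

    __asm__ __volatile__ (
        "movl $1, %0\n\t"   /* intermediate value, never visible to C code */
        "movl $2, %0"       /* value left in the register when the block ends */
        : "=r"(out));

    return out;             /* the compiler reads %0 only after the block */
}
```

By the same reasoning, in the push/cpuid/pop sequence a "=b" output would be read after the pop, so it would pick up the restored EBX rather than CPUID's result.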
In your question you present code that pushes and pops EBX. The idea of saving EBX in the event that you compile with GCC using -fPIC (position-independent code) is correct. It is up to our function not to clobber EBX upon return in that situation. Unfortunately, the way you have defined the constraints, you explicitly use EBX. The compiler should warn (error: inconsistent operand constraints in 'asm') if you are using PIC code and you specify the =b output constraint. Why it doesn't produce a warning is unusual.
To get around this problem you can let the assembler template choose a register for you. Instead of pushing and popping, we simply exchange %ebx with an unused register chosen by the compiler and restore it by exchanging again afterwards. Since we don't want the compiler to reuse one of our input registers for the exchange, we specify the early-clobber modifier, ending up with a constraint of =&r (instead of =b in the OP's code). More on modifiers can be found here. Your code (for 32-bit) would look something like:
    unsigned int __func = 1, __subfunc = 0;
    unsigned int __eax, __ebx, __ecx, __edx;

    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"   /* park the caller's EBX in the scratch register */
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1\n\t"   /* restore EBX; CPUID's EBX ends up in %1 */
        : "=a"(__eax), "=&r"(__ebx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));

If you intend to compile for x86_64 (64-bit) you'll need to save the entire contents of %rbx. The code above will not quite work. You'd have to use something like:
    /* uint32_t and uint64_t come from <stdint.h> */
    uint32_t __func = 1, __subfunc = 0;
    uint32_t __eax, __ecx, __edx;
    uint64_t __bx; /* big enough to hold a 64-bit value */

    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"   /* park the caller's RBX in the scratch register */
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1\n\t"   /* restore RBX; CPUID's result ends up in %1 */
        : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));

You could code it up using conditional compilation to deal with both x86_64 and i386:
    /* uint32_t and uint64_t come from <stdint.h> */
    uint32_t __func = 1, __subfunc = 0;
    uint32_t __eax, __ecx, __edx;
    uint64_t __bx; /* big enough to hold a 64-bit value; CPUID's EBX is the low 32 bits */

    #if defined(__i386__)
    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1\n\t"
        : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));
    #elif defined(__x86_64__)
    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1\n\t"
        : "=a"(__eax), "=&r"(__bx), "=c"(__ecx), "=d"(__edx)
        : "a"(__func), "c"(__subfunc));
    #else
    #error "Unknown architecture."
    #endif

GCC has a __cpuid macro defined in cpuid.h. The macro as defined saves the EBX and RBX registers when required. You can find the GCC 4.8.1 macro definition here to get an idea of how cpuid.h handles CPUID.
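If you would rather not maintain the inline asm yourself, the helpers in cpuid.h can be used directly. The following is only a minimal sketch: cpu_vendor is a hypothetical helper name, and it assumes GCC or Clang on an x86 or x86-64 target. It uses __get_cpuid from cpuid.h to query leaf 0 and assembles the vendor string from EBX, EDX and ECX, in that order.

```c
#include <cpuid.h>    /* GCC/Clang header, x86 targets only */
#include <string.h>

/* Hypothetical helper: copies the CPU vendor string (up to 12 characters
 * plus a terminating NUL) into out. Returns 1 on success, 0 if the
 * requested CPUID leaf is not available. */
static int cpu_vendor(char out[13])
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* Leaf 0 returns the vendor string in EBX, EDX, ECX order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
}
```

On typical hardware the buffer ends up holding something like "GenuineIntel" or "AuthenticAMD", and the header takes care of any EBX/RBX handling that the target requires.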
The astute reader may ask the question: what stops the compiler from choosing EBX or RBX as the scratch register to use for the exchange? The compiler knows about EBX and RBX in the context of PIC, and will not allow them to be used as a scratch register. This is based on my personal observations over the years and on reviewing the assembler (.s) files generated from C code. I can't say how more ancient versions of GCC handled the problem.
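On the related point of how the scratch register is picked, the & (early-clobber) part of =&r is what tells the compiler that the output is written before the inputs have been fully consumed, so it must not share a register with any input. Here is a small sketch of that rule, separate from CPUID. The function name is invented for the example, and it assumes GCC or Clang on x86 or x86-64 with AT&T syntax.

```c
/* Assumption: GCC or Clang, x86 or x86-64 target, AT&T syntax. */
static unsigned int one_plus(unsigned int x)
{
    unsigned int out;

    __asm__ (
        "movl $1, %0\n\t"   /* output is written first ...            */
        "addl %1, %0"       /* ... and the input is read only after   */
        : "=&r"(out)        /* early clobber: never the same register as x */
        : "r"(x)
        : "cc");            /* addl modifies the flags */

    return out;
}
```

With plain "=r" the compiler would be free to place out and x in the same register, in which case the first instruction would destroy the input before it is read.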