Using popcnt instruction found on newer processors to get population count which is number of set bits.
Reference : SSE4
Here are couple example inputs with expected output.
0xA = 1010 -- > 2 0x3F2 = 1111110010 -- > 7
#include <iostream> #include <stdio.h> using namespace std; int main() { // 0xA = 1010 -- > 2 // 0x3F2 = 1111110010 -- > 7 unsigned long code = 0x3F2UL; int bitcount = 0; __asm__( "movl %1, %%eax; " //code into eax "popcnt %%eax, %%eax;"// call popcnt instruction "movl %%eax, %0;"// ebx into bitcount :"=r"(bitcount)// output :"r"(code)// input :"%eax","%ebx","%ecx","%edx"// clobbered register ); printf("%d", bitcount); return 0; } |
Update
To answer Phils question why we have "%eax","%ebx","%ecx","%edx"
as our clobbered register list, this is my understanding and it could be completely wrong.
Our instruction movl %%eax, %0;
used GCC syntax for accessing c++ variable bitcount. So ‘%0’ refers to bitcount which will be copied into ebx register, that is why I have ebx in the list, as for the ecxand edx they have been added purely because they are general purpose registers and I did not want to override some previously stored value and my limited understanding on how inline assembly works.
can i ask about speedtest coding?? i really want to know about it.
How can I help you ?
Can you explain why
%ebx
,%ecx
and%edx
are included among the clobbered registers in your example code?Thank for the great question, I have updated the post.