Channel: User Lundin - Code Review Stack Exchange

Answer by Lundin for Unsigned 32-bit integer to binary string function

As already said in other reviews, you should switch to caller allocation. And there's no need to initialize the string at all, just overwrite all of it.

As for the algorithm itself, or in general, go with readability and keep things simple. When we do so, we tend to get efficiency for free too. This seems like a good compromise between readability and performance:

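    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    void uint32_to_binstr (char dst[33], uint32_t n)
    {
      for(size_t i=0; i<32; i++)
      {
        bool one = n & (1u << i);   // is bit i set?
        size_t index = 32-i-1;      // bit 0 ends up last in the string
        dst[index] = '0' + one;
      }
      dst[32] = '\0';
    }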
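To illustrate the caller-allocation point, here is a minimal sketch of how the function might be called. The helper `print_binary` is a made-up name for illustration only; the buffer lives on the caller's stack, is never initialized, and needs no cleanup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same function as above, repeated so the example compiles on its own. */
void uint32_to_binstr (char dst[33], uint32_t n)
{
  for(size_t i=0; i<32; i++)
  {
    bool one = n & (1u << i);
    size_t index = 32-i-1;
    dst[index] = '0' + one;
  }
  dst[32] = '\0';
}

/* Hypothetical helper: the caller owns the buffer. */
void print_binary (uint32_t n)
{
  char str[33];               /* 32 digits + null terminator, left uninitialized */
  uint32_to_binstr(str, n);   /* every element gets overwritten here */
  puts(str);
}
```

Calling `print_binary(5)` should print 29 zeroes followed by `101`.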

There's a rule of thumb to always keep (for) loops as simple as possible, as close to the ideal for(size_t i=0; i<n; i++) as possible. When you do this, the code tends to turn out both more readable and more efficient. (A bit of fiddling around with disassemblers showed me that this gave somewhat better machine code than the down-counting versions or the versions right-shifting a mask.)
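For comparison, a down-counting loop and a mask-shifting loop might look roughly like the sketch below. The names `binstr_down` and `binstr_mask` are made up, and these are not necessarily the exact versions from the other reviews, so don't read any performance conclusion into them. They produce the same string, but the loop headers take a little more thought to verify.

```c
#include <stddef.h>
#include <stdint.h>

/* Variant A (hypothetical): counts down from bit 31 to bit 0. */
void binstr_down (char dst[33], uint32_t n)
{
  for(size_t i=32; i>0; i--)
  {
    dst[32-i] = (n & (1u << (i-1))) ? '1' : '0';
  }
  dst[32] = '\0';
}

/* Variant B (hypothetical): right-shifts a mask starting at the top bit. */
void binstr_mask (char dst[33], uint32_t n)
{
  size_t index = 0;
  for(uint32_t mask = 0x80000000u; mask != 0; mask >>= 1)
  {
    dst[index++] = (n & mask) ? '1' : '0';
  }
  dst[32] = '\0';
}
```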

Since I counted from 0 to 31, I need to compensate for that when determining the index used for storage - the above actually fills the array from the back to the front. The calculation 32-i-1 gives 31, 30, 29...

The bool one = n & (1u << i); is basically the same as if(n & (1u << i)) and will likely result in a branch too; there's probably no way around it without turning the code real ugly.

Note that I used 1u for shifting, since 1 gives a signed int and shifting signed numbers is a very bad idea. In case of a left shift we might shift data into the sign bit, which is an undefined behavior bug. In case of a right shift of a negative value, it is implementation-defined whether we get an arithmetic or a logical shift, which means non-portable code and generally a bad idea.
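A small sketch of the point about 1u, assuming unsigned int is at least 32 bits wide (true on mainstream desktop compilers, not guaranteed by the standard). The function names `top_bit_set` and `shift_down` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* With a 32-bit unsigned int, 1u << 31 is well-defined: the result fits.
   Writing 1 << 31 instead would shift into the sign bit of a 32-bit int,
   which is undefined behavior. */
bool top_bit_set (uint32_t n)
{
  return n & (1u << 31);
}

/* Right-shifting an unsigned value always shifts in zeroes from the left,
   so the result is the same on every implementation. */
uint32_t shift_down (uint32_t n, unsigned int count)
{
  return n >> count;   /* count must be less than 32 */
}
```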

Peeking at the resulting machine code (gcc x86 -O3):

    uint32_to_binstr:
            lea     r9, [rdi-32]
            mov     rax, rdi
            mov     r8d, 1
    .L2:
            mov     ecx, edi
            mov     edx, r8d
            sub     ecx, eax
            sal     edx, cl
            and     edx, esi
            cmp     edx, 1
            mov     edx, 48
            sbb     dl, -1
            sub     rax, 1
            mov     BYTE PTR [rax+32], dl
            cmp     r9, rax
            jne     .L2
            mov     BYTE PTR [rdi+32], 0
            ret

This is reasonably compact and doesn't contain many branches: the only one in this listing is the loop's jne.

Now since we decided to use a trivial, readable loop, we automatically get lots of advantages we hadn't even considered from the start. Suppose we want to modify the code to drop leading zeroes or just print a certain number of digits? We can use pretty much the very same code, just swap out the hard-coded 32 for a parameter and then change the order of parameters, so that digits is declared before the array parameter that uses it:

    // digits must be at most 32; dst needs room for the null terminator too
    void uint32_to_binstr (uint32_t n, size_t digits, char dst[digits+1])
    {
      for(size_t i=0; i<digits; i++)
      {
        bool one = n & (1u << i);
        size_t index = digits-i-1;
        dst[index] = '0' + one;
      }
      dst[digits] = '\0';
    }
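As a rough sketch of how the parameterized version could be used to drop leading zeroes: the helper `significant_digits` below is hypothetical, not something from the original question. The array parameter is declared here as dst[digits+1] so that the terminator has room, and digits is assumed to be at most 32.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Parameterized version, repeated so the example compiles on its own. */
void uint32_to_binstr (uint32_t n, size_t digits, char dst[digits+1])
{
  for(size_t i=0; i<digits; i++)
  {
    bool one = n & (1u << i);
    size_t index = digits-i-1;
    dst[index] = '0' + one;
  }
  dst[digits] = '\0';
}

/* Hypothetical helper: number of digits left once leading zeroes are dropped.
   Returns 1 for n == 0 so that a single '0' still gets written. */
size_t significant_digits (uint32_t n)
{
  size_t digits = 1;
  while((n >>= 1) != 0)
  {
    digits++;
  }
  return digits;
}
```

With a 33-byte buffer `buf`, `uint32_to_binstr(5, 8, buf)` should give "00000101", while `uint32_to_binstr(5, significant_digits(5), buf)` should give "101".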
