After a long discussion in the comments about changing the whole concept, here are the main problems:
- Variadic functions are slow, very type-unsafe and should be avoided in general.
- Caller allocation is preferred whenever possible, partly because it can give faster code and partly because there is then no confusion over who is responsible for cleanup (see getline as the perfect example of a problematic API using heap allocation).
I made a brief draft of how you would be able to do this in a conceptually different way, based on chaining multiple strcat calls to each other, which will be more efficient. That is, basically:

    char buf [n] = "hello ";
    strcat(strcat(buf, "world"), "!!!");

(Though naturally this syntax is a bit icky and a loop with a string array would be preferred.)
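The loop mentioned above could be sketched roughly as follows. The name concat_all and its parameters are made up for this illustration and are not part of the original draft. It assumes that buf already holds a null-terminated string, and it adds a capacity check that plain strcat does not perform:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original draft): append each of the
   `count` strings in `parts` to the string already stored in `buf`.
   `buf_size` is the total capacity of `buf`, including the terminator.
   Returns `buf` on success, or NULL if the result would not fit, in
   which case `buf` is left unchanged. */
char* concat_all (char* buf, size_t buf_size,
                  const char* const parts[], size_t count)
{
  size_t used  = strlen(buf);
  size_t total = used;

  /* First pass: check that everything fits before writing anything. */
  for(size_t i=0; i<count; i++)
  {
    total += strlen(parts[i]);
  }
  if(total + 1 > buf_size)
  {
    return NULL;
  }

  /* Second pass: copy each part to the current end of the string, so
     the destination is never rescanned from the start. */
  for(size_t i=0; i<count; i++)
  {
    size_t len = strlen(parts[i]);
    memcpy(buf + used, parts[i], len + 1); /* +1 to also copy null term */
    used += len;
  }
  return buf;
}
```

Keeping track of the end of the string avoids the repeated scan of the destination that every individual strcat call would otherwise do.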
However, this assumes that the caller ensures that the buffer is always large enough. One can invent a version of strcat which optionally handles allocation, either by always calling malloc internally or by taking an allocator function passed by the caller, to enable caller allocation in resource-constrained systems (or even using alloca for stack allocation). A naive implementation might look like this:
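
    #include <stddef.h>
    #include <string.h>

    typedef void* alloc_t (size_t size);

    // Returns a newly allocated string holding dst followed by src,
    // or NULL if the allocator fails. dst itself is not modified.
    char* strcata (char* restrict dst, const char* restrict src, alloc_t* alloc)
    {
      size_t dst_size = strlen(dst);
      size_t src_size = strlen(src);
      char* new_dst = alloc(dst_size + src_size + 1);
      if(new_dst==NULL)
        return NULL;
      memcpy(new_dst, dst, dst_size);
      memcpy(new_dst+dst_size, src, src_size+1); // +1 to also copy null term
      return new_dst;
    }

    // Outer parentheses and (alloc) keep the macro safe inside larger expressions.
    #define strcat_alloc(dst,src,alloc) \
      ( ((alloc)==NULL) ? strcat( (dst), (src) ) : strcata( (dst), (src), (alloc) ) )
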
This boils down to fairly acceptable and mostly branch-free code, despite the call through the function pointer, which can't be inlined. See the gcc -O3 output for x86_64 Linux:

    strcata:
            push    r14
            mov     r14, rdi
            push    r13
            mov     r13, rsi
            push    r12
            mov     r12, rdx
            push    rbp
            push    rbx
            call    strlen
            mov     rdi, r13
            mov     rbx, rax
            call    strlen
            mov     rbp, rax
            lea     rdi, [rbx+1+rax]
            call    r12
            mov     r12, rax
            test    rax, rax
            je      .L1
            mov     rdx, rbx
            mov     rsi, r14
            mov     rdi, rax
            call    memcpy
            lea     rdi, [r12+rbx]
            lea     rdx, [rbp+1]
            mov     rsi, r13
            call    memcpy
    .L1:
            pop     rbx
            mov     rax, r12
            pop     rbp
            pop     r12
            pop     r13
            pop     r14
            ret

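To illustrate the caller-allocation case for resource-constrained systems, here is a minimal sketch of an allocator with the same signature as alloc_t that hands out memory from a fixed static pool instead of the heap. The names pool_alloc, pool_reset and POOL_SIZE are hypothetical and picked for this example only. It does no alignment, so it is only meant for char data such as strings:

```c
#include <stddef.h>

#define POOL_SIZE 256  /* arbitrary capacity picked for this sketch */

static char   pool[POOL_SIZE];
static size_t pool_used = 0;

/* Hypothetical bump allocator matching the alloc_t signature
   void* (size_t). Hands out consecutive chunks of the static pool and
   returns NULL when the pool is exhausted. No alignment is done, so
   this is only suitable for char data. */
void* pool_alloc (size_t size)
{
  if(size > POOL_SIZE - pool_used)
  {
    return NULL;
  }
  void* result = &pool[pool_used];
  pool_used += size;
  return result;
}

/* Release everything handed out so far in one go. */
void pool_reset (void)
{
  pool_used = 0;
}
```

Such a function could be passed where malloc would otherwise go, for example strcat_alloc(buf1, "!!!", pool_alloc), and the results are released all at once with pool_reset instead of free.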
Example of use would be:
https://godbolt.org/z/8WasPnYWv
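
    #include <stdio.h>
    #include <stdlib.h>

    int main (void)
    {
      // alloc == NULL: plain strcat into the caller's buffer
      char buf1[100] = "hello ";
      strcat_alloc(buf1, "world", NULL);
      puts(buf1);

      // alloc == malloc: the result is a new heap buffer owned by the caller
      char* buf2 = strcat_alloc(buf1, "!!!", malloc);
      puts(buf2);
      free(buf2);

      // again, a loop would be much prettier than chaining calls like this
      // note: the two inner results are never freed, so this chained form leaks them
      char* buf3 = strcat_alloc( strcat_alloc( strcat_alloc("hello ", "world ", malloc),
                                               "how's it ", malloc),
                                 "going?", malloc);
      puts(buf3);
      free(buf3);
    }
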
Output:

    hello world
    hello world!!!
    hello world how's it going?