Giving C a Superpower: custom header file (safe_c.h)


November 09, 2025

The story of how I wrote a leak-free, thread-safe grep in C23 without hurting myself, and how you can do the same!

Let’s be honest: most people have a love-hate relationship with C. We love its raw speed, its direct connection to the metal, and its elegant simplicity. But we hate its footguns, its dragons, its wild beasts: segfaults that appear out of nowhere, memory leaks that slowly drain the life out of our applications, and endless goto cleanup; chains that make our code look like a plate of spaghetti.

It’s the classic C curse: power without guardrails… or at least that’s the fear-inducing mantra repeated again and again. But is it still relevant today, with all the tools available to C developers, like static analyzers and dynamic sanitizers? I have written about that here and here.

What if, with the help of modern tools and a custom header file (~600 lines of code), you could tame those footgun beasts? What if you could keep the power of C but wrap it in a suit of modern armor? That’s what the custom header file safe_c.h is for. It is designed to give C some of the safety and convenience features of C++ and Rust, and I’m using it as my test case to build a high-performance grep clone called cgrep.

I hope that by the end of this article you’ll come away with the idea that C is highly flexible and extensible. “Do whatever you want with it.” That kind of thing. That’s why C (and its close relative, Zig) remains my favorite language for writing programs; it’s the language of freedom!

safe_c.h is a custom C header file that takes features primarily from C++ and Rust and implements them in plain C ~ [write C code, get C++ and Rust features!]

It starts with bridging the gap between old and new C. C23 gives us the [[cleanup]] attribute syntax, but in the real world you need code that also compiles on GCC 11 or Clang 18. safe_c.h detects your compiler and gives you the same RAII semantics everywhere. No more #ifdef soup.

// The magic behind CLEANUP: zero overhead, maximum safety
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#define CLEANUP(func) [[cleanup(func)]]
#else
#define CLEANUP(func) __attribute__((cleanup(func)))
#endif

// Branch prediction that actually matters in hot paths
#ifdef __GNUC__
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

Your cleanup code runs no matter how the enclosing scope is exited, even on an early return. It’s finally, but for C.
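As a tiny illustration of what that looks like in practice (file_cleanup here is a hypothetical handler I wrote for this example, not necessarily one that ships in safe_c.h):

#include <stdio.h>

// Hypothetical handler: receives a pointer to the variable going out of scope
static void file_cleanup(FILE** fp) {
    if (fp && *fp) fclose(*fp);
}

void read_config(const char* path) {
    FILE* f CLEANUP(file_cleanup) = fopen(path, "r");
    if (!f) return;        // handler still runs here, but it is a no-op on NULL
    // ... parse the file ...
}                          // fclose() happens here, on every exit path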

Memory management beast: slain with smart pointers (C++ feature)

The oldest, most fundamental, and most feared beast of all: manual memory management.

Before: a highway to leaks.
Forget even one free() and you have a disaster. In cgrep, parsing command-line options the old-fashioned way is a breeding ground for CVEs and their whole bestiary: you have to remember to free memory on every single exit path, which is hard to get right.

// The Old Way (don't do this)
char* include_pattern = NULL;
if (optarg) {
    include_pattern = strdup(optarg);
}
// ...200 lines later...
if (some_error) {
    free(include_pattern); // easy to miss on any of the exit paths
    return 1;
}

// And remember to free it at *every* return path...

After: memory that cleans itself up.
UniquePtr is a “smart pointer” that owns a resource. When a UniquePtr variable goes out of scope, its resource is automatically freed. It’s impossible to forget.

Here’s the machinery inside safe_c.h:

// The UniquePtr machinery: a struct + automatic cleanup
typedef struct {
    void* ptr;
    void (*deleter)(void*);
} UniquePtr;

#define AUTO_UNIQUE_PTR(name, ptr, deleter) \
    UniquePtr name CLEANUP(unique_ptr_cleanup) = UNIQUE_PTR_INIT(ptr, deleter)

static inline void unique_ptr_cleanup(UniquePtr* uptr) {
    if (uptr && uptr->ptr && uptr->deleter) {
        uptr->deleter(uptr->ptr);
    }
}

And here’s how cgrep uses it. Cleanup is automatic even when errors occur:

// In cgrep, we use this for command-line arguments
AUTO_UNIQUE_PTR(include_pattern_ptr, NULL, options_string_deleter);

// When we get a new pattern, the old one is automatically freed!
unique_ptr_delete(&include_pattern_ptr);
include_pattern_ptr.ptr = strdup(optarg);
// No leaks, even if an error happens later!
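UNIQUE_PTR_INIT and unique_ptr_delete aren’t shown above; here is a minimal sketch of what they plausibly look like (assumed shapes, not verbatim from safe_c.h):

#define UNIQUE_PTR_INIT(p, d) {.ptr = (p), .deleter = (d)}

static inline void unique_ptr_delete(UniquePtr* uptr) {
    if (uptr && uptr->ptr && uptr->deleter) {
        uptr->deleter(uptr->ptr); // free the current resource...
        uptr->ptr = NULL;         // ...and leave the pointer empty, ready for reuse
    }
}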

Sharing securely with SharedPtr

Before: manual, bug-prone reference counting.
You have to implement reference counting by hand, creating a complex and delicate system where a single mistake leads to a leak or a use-after-free bug.

// The old way of manual reference counting
typedef struct {
    MatchStore* store;
    int ref_count;
    pthread_mutex_t mutex;
} SharedStore;

void release_store(SharedStore* s) {
    pthread_mutex_lock(&s->mutex);
    s->ref_count--;
    bool is_last = (s->ref_count == 0);
    pthread_mutex_unlock(&s->mutex);

    if (is_last) {
        match_store_deleter(s->store);
        free(s);
    }
}

After: automatic reference counting.
SharedPtr automates the entire process: the last thread to finish using the object automatically triggers its destruction. The machinery:

// The SharedPtr machinery: reference counting without the boilerplate
typedef struct {
    void* ptr;
    void (*deleter)(void*);
    size_t* ref_count;
} SharedPtr;

#define AUTO_SHARED_PTR(name) \
    SharedPtr name CLEANUP(shared_ptr_cleanup) = {.ptr = NULL, .deleter = NULL, .ref_count = NULL}

static inline void shared_ptr_cleanup(SharedPtr* sptr) {
    shared_ptr_delete(sptr); // Decrement and free if last reference
}

Use is clean and safe. No more manual counting.

// In our thread worker context, multiple threads access the same results store
typedef struct {
    // ...
    SharedPtr store;  // No more worrying about who frees this!
    SharedPtr file_counts;
    // ...
} FileWorkerContext;

// In main(), we create it once and share it safely
// SharedPtr: Reference-counted stores for thread-safe sharing
SharedPtr store_shared = {0};
shared_ptr_init(&store_shared, store_ptr.ptr, match_store_deleter);
// Pass to threads: ctx->store = shared_ptr_copy(&store_shared);
// ref-count increments automatically; last thread out frees it.
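shared_ptr_init, shared_ptr_copy, and shared_ptr_delete are not shown in the excerpt; here is a minimal, single-threaded sketch of their likely shape. (Assumption: the real header would protect the count with C11 atomics or a mutex, which I omit for brevity.)

#include <stdlib.h>

static inline void shared_ptr_init(SharedPtr* sptr, void* ptr, void (*deleter)(void*)) {
    sptr->ptr = ptr;
    sptr->deleter = deleter;
    sptr->ref_count = malloc(sizeof(size_t));
    if (sptr->ref_count) *sptr->ref_count = 1;   // the caller is the first owner
}

static inline SharedPtr shared_ptr_copy(SharedPtr* src) {
    if (src->ref_count) (*src->ref_count)++;     // one more owner
    return *src;
}

static inline void shared_ptr_delete(SharedPtr* sptr) {
    if (!sptr || !sptr->ref_count) return;
    if (--(*sptr->ref_count) == 0) {             // last owner out turns off the lights
        if (sptr->deleter) sptr->deleter(sptr->ptr);
        free(sptr->ref_count);
    }
    sptr->ptr = NULL;
    sptr->ref_count = NULL;
}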

Buffer overflow beast: slain with vectors and spans (C++ feature)

Dynamically growing arrays in C is a horror show.

Before: the realloc dance.
You have to track capacity and size by hand, and every realloc can fail or move your data, so each element you add needs careful error handling.

// The old way: manual realloc is inefficient and complex
MatchEntry** matches = NULL;
size_t matches_count = 0;
size_t matches_capacity = 0;

for (/*...each match...*/) {
    if (matches_count >= matches_capacity) {
        matches_capacity = (matches_capacity == 0) ? 8 : matches_capacity * 2;
        MatchEntry** new_matches = realloc(matches, matches_capacity * sizeof(MatchEntry*));
        if (!new_matches) {
            free(matches); // Don't leak!
            /* handle error */
        }
        matches = new_matches;
    }
    matches[matches_count++] = current_match;
}

After: a type-safe, auto-growing vector.
safe_c.h generates an entire type-safe vector for you. It handles allocation, growth, and cleanup automatically. The macro that generates the vector:

// The magic that generates a complete vector type from a single line
#define DEFINE_VECTOR_TYPE(name, type) \
    typedef struct { \
        Vector base; \
        type* data; \
    } name##Vector; \
    \
    static inline bool name##_vector_push_back(name##Vector* vec, type value) { \
        bool result = vector_push_back(&vec->base, &value); \
        vec->data = (type*)vec->base.data; /* Sync pointer after potential realloc */ \
        return result; \
    } \
    \
    static inline bool name##_vector_reserve(name##Vector* vec, size_t new_capacity) { \
        bool result = vector_reserve(&vec->base, new_capacity); \
        vec->data = (type*)vec->base.data; /* Sync pointer after potential realloc */ \
        return result; \
    }

    /* more helper functions not outlined here */

// And the underlying generic Vector implementation
typedef struct {
    size_t size;
    size_t capacity;
    void* data;
    size_t element_size;
} Vector;
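The typed wrappers above delegate to this generic base. The push-back itself isn’t shown, but given the fields in Vector it plausibly looks like the classic doubling strategy (my sketch, not verbatim from safe_c.h):

#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

static inline bool vector_push_back(Vector* vec, const void* element) {
    if (vec->size >= vec->capacity) {
        size_t new_capacity = vec->capacity ? vec->capacity * 2 : 8;  // grow geometrically
        void* new_data = realloc(vec->data, new_capacity * vec->element_size);
        if (!new_data) return false;   // old buffer is untouched on failure, nothing leaks
        vec->data = new_data;
        vec->capacity = new_capacity;
    }
    memcpy((char*)vec->data + vec->size * vec->element_size, element, vec->element_size);
    vec->size++;
    return true;
}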

It is simple and safe to use in cgrep. When the vector goes out of scope it cleans itself up.

// Type-safe vector for collecting matches
DEFINE_VECTOR_TYPE(MatchEntryPtr, MatchEntry*)

AUTO_TYPED_VECTOR(MatchEntryPtr, all_matches_vec);
MatchEntryPtr_vector_reserve(&all_matches_vec, store->total_matches);

// Pushing elements is safe and simple
for (MatchEntry* entry = store->buckets[i]; entry; entry = entry->next) {
    MatchEntryPtr_vector_push_back(&all_matches_vec, entry);
}

Views: look, don’t touch (or malloc) (C++ feature)

Before: unnecessary allocation.
To handle a substring or a slice of an array, you often malloc a new buffer and copy the data into it, which is painfully slow in a tight loop.

// The old way: allocating a new string just to get a substring
const char* line = "this is a long line of text";
char* pattern = "long line";
// To pass just the pattern to a function, you might do this:
char* sub = malloc(strlen(pattern) + 1);
strncpy(sub, pattern, strlen(pattern) + 1);
// ... use sub ...
free(sub); // And hope you remember this free call

After: zero-cost, non-owning views.
A StringView or a Span is just a pointer and a length: a non-owning view that lets you work with slices of data without any allocation. The definitions are plain and simple:

// The StringView and Span definitions: pure, simple, zero-cost
typedef struct {
    const char* data;
    size_t size;
} StringView;

typedef struct {
    void* data;
    size_t size;
    size_t element_size;
} Span;
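string_view_init, used below, is presumably just as small; a plausible sketch (assumed, not verbatim):

#include <string.h>

// Wrap an existing NUL-terminated string: no copy, no allocation
static inline StringView string_view_init(const char* s) {
    return (StringView){.data = s, .size = s ? strlen(s) : 0};
}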

In cgrep, the search pattern becomes a string view, thereby avoiding allocation altogether.

// Our options struct holds a StringView, not a char*
typedef struct {
    StringView pattern; // Clean, simple, and safe
    // ...
} GrepOptions;

// Initializing it is a piece of cake
options.pattern = string_view_init(argv[optind]);

For safe array access, Span provides a bounds-checked window into existing data.

// safe_c.h
#define DEFINE_SPAN_TYPE(name, type) \
    typedef struct { \
        type* data; \
        size_t size; \
    } name##Span; \
    \
    static inline name##Span name##_span_init(type* data, size_t size) { \
        return (name##Span){.data = data, .size = size}; \
    }

    /* other helper functions not outlined here */

// Span: Type-safe array slices for chunk processing
DEFINE_SPAN_TYPE(LineBuffer, char)
LineBufferSpan input_span = LineBuffer_span_init((char*)start, len);

for (size_t i = 0; i < LineBuffer_span_size(&input_span); i++) {
    char* line = LineBuffer_span_at(&input_span, i); // asserts i < span.size
}
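LineBuffer_span_size and LineBuffer_span_at come out of the same DEFINE_SPAN_TYPE macro; here is what it plausibly generates for LineBuffer (assumed shape):

#include <assert.h>

static inline size_t LineBuffer_span_size(const LineBufferSpan* span) {
    return span->size;
}

static inline char* LineBuffer_span_at(const LineBufferSpan* span, size_t i) {
    assert(i < span->size);   // the bounds check mentioned above
    return &span->data[i];
}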

The error-handling goto beast: slain with Result (Rust feature) and RAII (C++ feature)

C’s error handling is extremely messed up.

Before: goto-cleanup spaghetti carbonara.
Functions return sentinel values like -1 or NULL, and you have to check every call. That leads to deeply nested if statements and a single cleanup: label, reached by goto, that has to handle every possible failure case.

// The old way: goto cleanup
int do_something(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1; // Error
    }

    void* mem = malloc(1024);
    if (!mem) {
        close(fd); // Manual cleanup
        return -1;
    }
    
    // ... do more work ...

    free(mem);
    close(fd);
    return 0; // Success
}

After: clear, type-safe Results.
Inspired by Rust, Result forces you to handle errors explicitly by returning a type that is either a success value or an error value. The Result machinery:

// The Result type machinery: tagged unions for success/failure
typedef enum { RESULT_OK, RESULT_ERROR } ResultStatus;

#define DEFINE_RESULT_TYPE(name, value_type, error_type) \
    typedef struct { \
        ResultStatus status; \
        union { \
            value_type value; \
            error_type error; \
        }; \
    } Result##name;
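The companion macros used below (RESULT_OK, RESULT_ERROR, RESULT_IS_OK, RESULT_UNWRAP_ERROR) aren’t shown in the excerpt; a plausible sketch using compound literals (assumed, not verbatim):

/* Function-like macros can share a name with the enum constants above:
   RESULT_OK without a following '(' still refers to the enumerator. */
#define RESULT_OK(name, val)     ((Result##name){.status = RESULT_OK, .value = (val)})
#define RESULT_ERROR(name, err)  ((Result##name){.status = RESULT_ERROR, .error = (err)})

#define RESULT_IS_OK(res)         ((res).status == RESULT_OK)
#define RESULT_UNWRAP(res)        ((res).value)
#define RESULT_UNWRAP_ERROR(res)  ((res).error)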

It becomes easier to deal with errors. You cannot accidentally use an error as a valid value.

// Define a Result for file operations
DEFINE_RESULT_TYPE(FileOp, i32, const char*)

// Our function now returns a clear Result
static ResultFileOp submit_stat_request_safe(...) {
    // ...
    if (!sqe) {
        return RESULT_ERROR(FileOp, "Could not get SQE for stat");
    }
    return RESULT_OK(FileOp, 0);
}

// And handling it is clean
ResultFileOp result = submit_stat_request_safe(path, &ring, &pending_ops);
if (!RESULT_IS_OK(result)) {
    fprintf(stderr, "Error: %s\n", RESULT_UNWRAP_ERROR(result));
}

All of this is powered by RAII: the CLEANUP attribute ensures that resources are freed no matter how a function exits.

#define AUTO_MEMORY(name, size) \
    void* name CLEANUP(memory_cleanup) = malloc(size)

// DIR pointers are automatically closed, even on an early return.
DIR* dir CLEANUP(dir_cleanup) = opendir(req->path);
if (!dir) {
    return RESULT_ERROR(FileOp, "Failed to open dir"); // dir_cleanup is NOT called
}
if (some_condition) {
    return RESULT_OK(FileOp, 0); // closedir() is called automatically HERE!
}
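The handlers referenced above (memory_cleanup, dir_cleanup) aren’t shown; here are plausible NULL-safe sketches (assumed, not verbatim from safe_c.h):

#include <stdlib.h>
#include <dirent.h>

static inline void memory_cleanup(void** mem) {
    if (mem && *mem) free(*mem);     // free the block, if one was ever allocated
}

static inline void dir_cleanup(DIR** d) {
    if (d && *d) closedir(*d);       // NULL-safe: a failed opendir() is simply skipped
}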

The assumption beast: slain with contracts and safe strings

Before: assert() and pray.
A standard assert(ptr != NULL) is nice, but when it fails the message is generic. You know an assertion failed, but not the context or why it mattered.

After: self-documenting contracts.
requires() and ensures() make function contracts explicit, and their failure messages tell you exactly what went wrong. The contract macros:

#define requires(cond) assert_msg(cond, "Precondition failed")
#define ensures(cond) assert_msg(cond, "Postcondition failed")

#define assert_msg(cond, msg) /* ... full implementation ... */
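The full assert_msg isn’t reproduced above; a minimal sketch of what it could look like, printing the failing expression, file, and line before aborting (assumed):

#include <stdio.h>
#include <stdlib.h>

#define assert_msg(cond, msg)                                         \
    do {                                                              \
        if (!(cond)) {                                                \
            fprintf(stderr, "%s: `%s` failed at %s:%d\n",             \
                    (msg), #cond, __FILE__, __LINE__);                \
            abort();                                                  \
        }                                                             \
    } while (0)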

This turns assertions into executable documentation:

// Preconditions that document and enforce contracts
static inline bool arena_create(Arena* arena, size_t size)
{
    requires(arena != NULL);  // Precondition: arena must not be null
    requires(size > 0);       // Precondition: size must be positive
    
    // ... implementation ...
    
    ensures(arena->buffer != NULL);  // Postcondition: buffer is allocated
    ensures(arena->size == size);    // Postcondition: size is set correctly
    
    return true;
}

strcpy(): a security vulnerability waiting to happen

Before: buffer overflows.
strcpy() does no bounds checking at all and is the source of countless security holes. strncpy() is only slightly better, because it may silently fail to null-terminate the destination string.

// The old, dangerous way
char dest[20];
const char* src = "This is a very long string that will overflow the buffer";
strcpy(dest, src); // Undefined behavior! Stack corruption!

After: safe, bounds-checked operations.
safe_c.h provides alternatives that check bounds and return a success/failure status. No surprises. The implementation:

// The safe string operations: bounds checking that can't be ignored
static inline bool safe_strcpy(char* dest, size_t dest_size, const char* src) {
    if (!dest || dest_size == 0 || !src) return false;
    size_t src_len = strlen(src);
    if (src_len >= dest_size) return false;
    memcpy(dest, src, src_len + 1);
    return true;
}

In cgrep, this cleanly prevents path buffers from overflowing:

// Returns bool, not silent truncation
if (!safe_strcpy(req->path, PATH_MAX, path)) {
    free(req);
    return RESULT_ERROR(FileOp, "Path is too long");
}

Concurrency: mutexes that unlock themselves (Rust feature)

Before: Leaked locks and deadlocks.
Forgetting to unlock the mutex, especially on the error path, is a disastrous bug that causes your program to deadlock.

// The Buggy Way
pthread_mutex_lock(&mutex);
if (some_error) {
    return; // Oops, mutex is still locked! Program will deadlock.
}
pthread_mutex_unlock(&mutex);

After: RAII-style locks.
Using the same CLEANUP attribute, we can ensure the mutex is always unlocked when the scope exits. The bug becomes impossible to write.

// With a cleanup function, unlocking is automatic.
void mutex_unlock_cleanup(pthread_mutex_t** lock) {
    if (lock && *lock) pthread_mutex_unlock(*lock);
}

// RAII lock guard via cleanup attribute
pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t* lock_ptr CLEANUP(mutex_unlock_cleanup) = &my_lock;
pthread_mutex_lock(lock_ptr);

if (some_error) {
    return; // Mutex is automatically unlocked here!
}
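To avoid the two-line dance, the pattern could even be wrapped in a macro of its own. This is my sketch, not necessarily something safe_c.h ships:

// Hypothetical lock guard: declare, lock, and auto-unlock at end of scope
#define AUTO_LOCK(name, mutex_ptr)                                        \
    pthread_mutex_t* name CLEANUP(mutex_unlock_cleanup) = (mutex_ptr);    \
    pthread_mutex_lock(name)

// Usage:
//   AUTO_LOCK(guard, &my_lock);
//   ...critical section, unlocked automatically on any return...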

Simple wrappers also clear up the boilerplate of thread management:

// The concurrency macros: spawn and join without boilerplate
#define SPAWN_THREAD(name, func, arg) \
    thrd_t name; \
    thrd_create(&name, (func), (arg))

#define JOIN_THREAD(name) \
    thrd_join(name, NULL)

And in cgrep:

// Thread pool spawn without boilerplate
SPAWN_THREAD(workers[i], file_processing_worker, &contexts[i]);
JOIN_THREAD(workers[i]); // No manual pthread_join() error handling

Performance: safety at -O2, not at -O0

Safety doesn’t have to mean slow. The LIKELY()/UNLIKELY() macros from earlier tell the compiler which branches are cold, adding zero overhead to the hot paths.

The real win is in the hot paths:

// In hot allocation path: branch prediction
if (UNLIKELY(store->local_buffer_sizes[thread_id] >= LOCAL_BUFFER_CAPACITY)) {
    match_store_flush_buffer(store, thread_id); // Rarely taken
}

// In match checking: likely path first
if (!options->case_insensitive && options->fixed_string) {
    // Most common case: fast path with no branches
    const char* result = strstr(line, options->pattern.data);
    return result != NULL;
}

The effect is similar to PGO (Profile-Guided Optimization), except the branch hints are written by hand instead of gathered from a profiling run.

When you stop fighting the language, this is what main() looks like:

int main(int argc, char* argv[]) {
    initialize_simd();
    output_buffer_init(); // Auto-cleanup on exit
    
    GrepOptions options = {0};
    AUTO_UNIQUE_PTR(include_pattern_ptr, NULL, options_string_deleter);
    
    // ... parse args with getopt_long ...
    
    AUTO_UNIQUE_PTR(store_ptr, NULL, match_store_deleter);
    SharedPtr store_shared = {0};
    if (need_match_store) {
        store_ptr.ptr = malloc(sizeof(ConcurrentMatchStore));
        if (!store_ptr.ptr || !match_store_create(store_ptr.ptr, hash_capacity, 1000)) {
            return 1; // All allocations cleaned up automatically
        }
        shared_ptr_init(&store_shared, store_ptr.ptr, match_store_deleter);
    }
    
    // Process files with thread pool...
    
cleanup: // Single cleanup label needed -- RAII handles the rest
    output_buffer_destroy(); // Flushes and destroys
    return 0;
}

In the end, cgrep is 2,300 lines of C. Without safe_c.h it would need over 50 manual free() calls ~ a recipe for leaks and segfaults. With the custom header, those same 2,300 lines compile down to lean assembly, run just as fast, and are fundamentally safer.

It proves that the best abstraction is the one you don’t pay for and can’t forget to use. It also enables a clear and powerful pattern: validate input at the boundaries, then unleash the raw speed of C in the core logic. You get all the power of C without the infamous self-inflicted footgun wounds.

C simply makes writing programs fun, and there are ways to make it both fun and safe. Just like using a condom, you know?

This post has already run longer than is comfortable, but I have one final question for you, readers: after all these safety layers, how do you think cgrep performs? See the screenshots below:

  • grep bench on recursive directories
    Recursive-Dirs

  • grep bench on single large file
    large file
    Note: Make sure you check the memory usage comparison between cgrep and ripgrep



In the next article, I’ll discuss how I built cgrep, which design I chose for it and why, and how cgrep managed to be a few times faster than ripgrep (over 2x faster in the recursive-directory bench) while staying extremely frugal with resources (a 20x smaller memory footprint in the single-large-file bench).

It’s going to be fun! Stay tuned!

If you enjoyed this post, click the small up-arrow chevron at the bottom left of the page to help it rank in Bear’s Discovery Feed, and if you have any questions or comments, please use the comment section.


