How to write type-safe generics in C

(raphgl.github.io)

36 points | by todsacerdoti 4 hours ago

41 comments

Having used this in production around 2004, I never really liked this approach - abusing the preprocessor this much shouldn't be necessary and the result is almost unreadable.

I think a better way is possible using `_Generic`[1]. Even though it would still use macros, the resulting code is much more readable.

---------------------------

[1] `_Generic` comes with its own problems too, of course.
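
For readers who have not used it, here is a minimal sketch of what `_Generic` dispatch can look like. The names `max_int`, `max_double` and `max_of` are made up for illustration and do not come from the article.

```c
/* Hypothetical per-type implementations; names are invented for this sketch. */
static int max_int(int a, int b) { return a > b ? a : b; }
static double max_double(double a, double b) { return a > b ? a : b; }

/* C11 _Generic selects one function at compile time, based on the
   static type of the first argument only. */
#define max_of(a, b) _Generic((a), \
    int: max_int,                  \
    double: max_double)((a), (b))
```

With these definitions, `max_of(3, 7)` calls `max_int` and `max_of(2.5, 1.5)` calls `max_double`. Because the selection has no `default` branch, an unlisted type such as `char *` is rejected at compile time. Since only the first argument is inspected, `max_of(3, 2.5)` silently picks the `int` version, which is one example of the problems the footnote alludes to.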

naasking an hour ago

> abusing the preprocessor this much shouldn't be necessary and the result is almost unreadable.

The readability of error messages was always my beef with macro-based generics. Maybe 15 years ago I played around with using UTF characters for the brackets and type separators when translating a generic type into a name-mangled type, so at least the generated source and error messages would still be clear and understandable.

Robust UTF support for identifiers and preprocessor was still kind of inconsistent across compilers though, so it only sort of worked. I expect this has improved considerably now though, so maybe I should try again and write something up. You can embed a surprising amount of advanced programming language theory into suitable name resolution.

QuadmasterXLII an hour ago

Instead of invisibly generating the per-type headers and implementations via macros during compilation, I generate them using something like m4 and check them into version control. Then, instead of integrating m4 into the compilation process, I verify in CI that the checked-in generated files match the output of m4. The prettiness of the generated files is now a code quality target. This provides the best possible developer experience to anyone who is not modifying the generic containers, and a survivable one if you are modifying them, at which point you kind of deserve it. Any mistakes from forgetting to regenerate and re-check-in are isolated from other users at the CI step.

This is secretly two-way generation: you can edit the generated file, recompiling until the code works, commit that, and then edit the m4 inputs until CI passes.

As another example of the same principle, whenever I am required to have a requirements.txt and setup.cfg in the same Python library, I now verify that they match in a CI check instead of trying to keep a single source of truth by having setup.cfg import the requirements.txt through some clever hack.

schaefer 3 hours ago

A more accurate title would be:

“How to write type safe generics using the C preprocessor”

pjmlp 3 hours ago

Type safety, generics and C is a bit of an oxymoron.

david2ndaccount 3 hours ago

I wrote a similar article in the past: https://www.davidpriver.com/ctemplates.html

I use this technique in my hobby projects as it has worked out the best for me of the options I’ve explored.

tester756 2 hours ago

Why can't they add generics to C as a first-class citizen, so you don't have to abuse macros, preprocessing and other primitive techniques?

xigoi an hour ago

Because according to C programmers, generics are too “complex” (as opposed to preprocessor abuse, which is “simple”).

codr7 an hour ago

No. You don't mess around with the language like that.

This is how you do it:

https://github.com/codr7/hacktical-c/tree/main/vector

xigoi an hour ago

This is the type erasure approach, whose cons are mentioned in the article.
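
For anyone unfamiliar with the term, a rough sketch of a type-erased vector follows. It is not the linked code; `erased_vec`, `erased_push` and `erased_get` are hypothetical names. The container only knows an element size and copies raw bytes through `void *`, so the compiler cannot check what type goes in or comes out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type-erased vector: elements are raw bytes of a fixed size. */
typedef struct {
    unsigned char *data;
    size_t item_size, len, cap;
} erased_vec;

/* Copies item_size bytes from item; returns 0 on success, -1 if growing fails. */
static int erased_push(erased_vec *v, const void *item) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        void *p = realloc(v->data, new_cap * v->item_size);
        if (!p) return -1;
        v->data = p;
        v->cap = new_cap;
    }
    memcpy(v->data + v->len * v->item_size, item, v->item_size);
    v->len++;
    return 0;
}

/* Returns a pointer to element i, or NULL if i is out of range.
   The caller must cast the result, and nothing verifies that cast. */
static void *erased_get(const erased_vec *v, size_t i) {
    return i < v->len ? v->data + i * v->item_size : NULL;
}
```

A vector created with `.item_size = sizeof(int)` will happily accept a pointer to a `double` in `erased_push`, and `erased_get` can be cast to any pointer type. Both mistakes compile without a diagnostic.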

wffurr 27 minutes ago

Why did this get flagged? Weird.

kevin_thibedeau 2 hours ago

I recommend m4 for doing this sort of thing. It's much easier to manipulate the generated source with its facilities than with the preprocessor.

jesse__ 3 hours ago

I wrote a metaprogramming language that adds another option to the list, for anyone that's interested: https://github.com/scallyw4g/poof

variadix 3 hours ago

You can create macro functions per generic function so something like Vector_New(int)(&v) expands to Vector_New_int(&v). It also looks less foreign (more like templates) than the G macro.
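
A minimal sketch of that idea, assuming a per-type `Vector_int` and `Vector_New_int` already exist (written by hand here; in practice a generator macro would emit them). The `Vector(T)` type macro is an addition for illustration and is not part of the suggestion above.

```c
#include <stddef.h>

/* Hypothetical per-type vector and constructor for int. */
typedef struct { int *data; size_t len, cap; } Vector_int;

static void Vector_New_int(Vector_int *v) {
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}

/* Token pasting turns Vector(int) into Vector_int and
   Vector_New(int) into Vector_New_int. */
#define Vector(T) Vector_##T
#define Vector_New(T) Vector_New_##T
```

With these macros, `Vector(int) v; Vector_New(int)(&v);` expands to `Vector_int v; Vector_New_int(&v);`. One limitation of token pasting is that `T` has to be a single identifier, so types like `unsigned int` or `char *` need a typedef first.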

giantpotato 2 hours ago

`vec_push` doesn't check for realloc failure in `vec_fit`
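
One possible shape for a fix, sketched on a hypothetical int-only vector. The article's actual struct layout and signatures may differ; only the names `vec_push` and `vec_fit` are taken from it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical int-only vector; the article's real layout may differ. */
typedef struct { int *data; size_t len, cap; } int_vec;

/* Grows the buffer so it can hold `needed` items; returns false on failure.
   Size overflow of new_cap is not handled in this sketch. */
static bool vec_fit(int_vec *v, size_t needed) {
    if (needed <= v->cap) return true;
    size_t new_cap = v->cap ? v->cap : 4;
    while (new_cap < needed) new_cap *= 2;
    int *p = realloc(v->data, new_cap * sizeof *p);
    if (!p) return false;   /* the old buffer is still valid and untouched */
    v->data = p;
    v->cap = new_cap;
    return true;
}

/* Appends only if the buffer could be grown; reports failure to the caller. */
static bool vec_push(int_vec *v, int item) {
    if (!vec_fit(v, v->len + 1)) return false;
    v->data[v->len++] = item;
    return true;
}
```

Assigning the result of `realloc` to a temporary first matters: writing it straight into `v->data` would leak the old buffer when `realloc` returns NULL.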

synergy20 3 hours ago

Or use Nim metaprogramming, which will be transpiled to C.

self_awareness 3 hours ago

Well, another option would be to use a C++ compiler, which supports templates, but limit the use of classes through a coding convention standard.

krupan 3 hours ago

Not sure why this is downvoted when the whole point of TFA is to torture the C language into doing something it can't really do. I guess there's an unspoken assumption in TFA that you are stuck using C and absolutely cannot use a different language, not even C++?

lelanthran 3 hours ago

> Well, another option would be to use a C++ compiler, which supports templates, but limit the use of classes through a coding convention standard.

When the other option is "ask the developers to practice discipline", an option that doesn't require that looks awfully attractive.

That being said, I'm not a fan of the described method either. Maybe the article could have shown a few more uses of this from the caller perspective.

krupan 3 hours ago

"ask the developers to practice discipline" is a baseline requirement for coding in C

EPWN3D 2 hours ago

And it hasn't worked in practice. C unfortunately does not have a very big pit of success -- it's way too hard to do the right thing and way too easy to do the wrong thing.

The solution to this doesn't have to be "rewrite everything in Rust", but it does mean that you need to provide safe, easy implementations for commonly-screwed-up patterns. Then you're not asking people to be perfect C programmers; you're just asking them to use tools that are easier than doing the wrong thing.

lelanthran 3 hours ago

> "ask the developers to practice discipline" is a baseline requirement for coding in C

Sure, but since there are 10x more opaque footguns in C++, much less discipline is needed when coding in C than when coding in C++.

The footguns in C are basically signed-integer over/underflows and memory errors. The footguns in C++ include all the footguns in C, and then add a ton more around object construction types, object destruction types, unexpected sharing of values due to silent and unexpected assignments, etc.

Just the bit on file-scope/global-scope initialisation alone can bite even experienced developers who are adding a new nonlocally-scoped instance of a new class.

pjmlp 3 hours ago

Unfortunately the majority has failed to attend the temple classes on such practices.

Loudergood 2 hours ago

If only.

EPWN3D 2 hours ago

It never seems to work out that way though. C++ is just too large a language, and it gets bigger with each revision. The minute you hire a senior/principal engineer who loves C++, they'll make the case to enable "just this one" feature, and before you know it, you've got a sprawling C++ code base, not a "C with a light dusting of C++" code base.

flashgordon 2 hours ago

Actually this was my first instinct too. Just limit what you use C++ for and write C code with templates and be done with it.

The problems I am guessing start when you are tempted into using the rest of the features one by one. You have generics. Well next let's get inheritance in. Now a bit of operator overloading. Then dealing with all kinds of smart pointers...

eptcyka 2 hours ago

What would be the detrimental effect of using smart pointers?

pjmlp 3 hours ago

C folks would rather badly reproduce C++ than acknowledge its TypeScript-like improvements over C.

lelanthran 2 hours ago

> C folks would rather badly reproduce C++ than acknowledge its TypeScript-like improvements over C.

This is a rather crude misrepresentation; most C programmers who need a higher level of abstraction than C reach for Java, C# or Go over C++.

IOW, acknowledging that C++ has improvements over C still does not make the extra C++ footguns worth switching over.

When you gloss over the additional footguns, it looks like you're taking it personally when C programmers don't want to deal with those additional footguns.

After all, we don't choose languages based on which one offers the most freedom to blow your leg off, we tend to choose languages based on which ones have the most restrictions against blowing your leg off.

If your only criteria is "Where can I get the most features", then sure, C++ looks good. If your criteria is "Where are the fewest footguns", then C++ is at the bottom of the list.

pjmlp 2 hours ago

Nah, it is called life experience meeting those kind of persons since the 1990's, starting on BBS forums.

My criteria is being as safe as Modula-2 and Object Pascal, as bare minimum.

C++ offers the tools, whereas WG14 has made it clear they don't even bother, including turning down Dennis Ritchie's proposal for fat pointers.
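
For context, a "fat pointer" carries its length along with the address, which makes bounds checks possible. The sketch below only illustrates the general idea in plain C. It is not the syntax or semantics of the proposal mentioned above, and `int_span` and `span_get` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: the address travels together with its length. */
typedef struct { int *ptr; size_t len; } int_span;

/* Bounds-checked read: stores the element in *out and returns true,
   or returns false without touching *out when i is out of range. */
static bool span_get(int_span s, size_t i, int *out) {
    if (i >= s.len) return false;
    *out = s.ptr[i];
    return true;
}
```

Given `int a[3]` wrapped as `int_span s = { a, 3 }`, a call to `span_get(s, 3, &x)` returns false instead of reading past the end of the array.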

lelanthran 2 hours ago

>> looks like you're taking it personally

> it is called life experience meeting those kind of persons

Looks like you are confirming that you are taking it personally.

I don't understand why, though.

You cannot imagine a programmer that wants fewer footguns?

pjmlp 2 hours ago

Yes, when careless programmers are responsible for critical infrastructure systems, and rather take a YOLO attitude to systems programming.

lelanthran 2 hours ago

> Yes, when careless programmers are responsible for critical infrastructure systems, and rather take a YOLO attitude to systems programming.

Well, that's a novel take: "Opting for fewer footguns is careless". :-)

It's probably not news to you that your view is, to put it kindly, very rare.

pjmlp 2 hours ago

Is it? Ask the governments and respective cyber security agencies.

And to finish this, as I won't reply any further,

"A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980 language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."

-- C.A.R. Hoare, "The 1980 ACM Turing Award Lecture"

pron 2 hours ago

C is sometimes used where C++ can't be. Exotic microcontrollers and other niche computing elements sometimes only have a C compiler. Richer, more expressive languages may also have additional disadvantages, and people using simpler, less expressive languages may want to enjoy some useful features that exist in richer languages without taking on all of their disadvantages, too. Point being, while C++ certainly has some clear benefits over C, it doesn't universally dominate it.

TS, on the other hand, is usable wherever JS is, and its disadvantages are much less pronounced.

pjmlp 2 hours ago

It isn't the 1980s any longer. There isn't a chip in such scenarios other than PIC class; even AVR gets to use C++.

pron an hour ago

8051s are still programmed almost entirely in C. There are C++ compilers available, but they're rarely used. Even on STM32, C is more popular. There's a perception -- and not an unsubstantiated one -- that C++ can more easily sneak in operations that could go unnoticed.

C++ has many advantages over C, but it also brings some clear disadvantages that matter more when you want to be aware of every operation. When comparing language A against language B, it's not enough to consider what A does better than B; you also have to consider what it does worse.

That's why I don't think that the comparison to TS/JS is apt. Some may argue that C++ has even more advantages over C than TS has over JS, but I think it's fairly obvious that its disadvantages compared to C are also bigger. For all its advantages, there are some important things that C++ does worse than C. But aside from adding a build step (which is often needed, anyway), it's hard to think of important things that TS does worse than JS.

hawk_ 2 hours ago

Is there a way to pass compiler switches to disable specific C++ features? Or other static analysis tools that break the build upon using prohibited features?

fweimer 2 hours ago

No two development groups agree on the desired features, so it would have to be a custom compiler plugin.

You could start with a Perl script that looks at the output of “clang++ -Xclang -ast-dump” and verifies that only permitted AST nodes are present in files that are part of the project sources.