Love C, hate C: Web framework memory problems(alew.is)

163 pointsby OneLessThing4 days ago15 comments

bluetomcat3 days ago
Good C code will try to avoid allocations as much as possible in the first place. You absolutely don’t need to copy strings around when handling a request. You can read data from the socket in a fixed-size buffer, do all the processing in-place, and then process the next chunk in-place too. You get predictable performance and the thing will work like precise clockwork. Reading the entire thing just to copy the body of the request in another location makes no sense. Most of the “nice” javaesque XXXParser, XXXBuilder, XXXManager abstractions seen in “easier” languages make little sense in C. They obfuscate what really needs to happen in memory to solve a problem efficiently.
- lelanthran3 days ago
  > Good C code will try to avoid allocations as much as possible in the first place.
  I've upvoted you, but I'm not so sure I agree though.
  Sure, each allocation imposes a new obligation to track that allocation, but on the downside, passing around already-allocated blocks imposes a new burden for each call to ensure that the callees have the correct permissions (modify it, reallocate it, free it, etc).
  If you're doing any sort of concurrency this can be hard to track - sometimes it's easier to simply allocate a new block and give it to the callee, and then the caller can forget all about it (callee then has the obligation to free it).
  - obviouslynotme3 days ago
    The most important pattern to learn in C is to allocate a giant arena upfront and reuse it over and over in a loop. Ideally, there is only one allocation and deallocation in the entire program. As with all things multi-threaded, this becomes trickier. Luckily, web servers are embarrassingly parallel, so you can just have an arena for each worker thread. Unluckily, web servers do a large amount of string processing, so you have to be careful in how you build them to prevent the memory requirements from exploding. As always, tradeoffs can and will be made depending on what you are actually doing.
    Short-run programs are even easier. You just never deallocate and then exit(0).
    adrianN3 days ago
    Arenas are a nice tool, but they don't work for all use cases. In the limit you're reimplementing malloc on top of your big chunk of memory.
    galangalalgol3 days ago
    Most games have to do this for performance reasons at some point and there are plenty of variants to choose from. Rust has libraries for some of them, but in c rolling it yourself is the idiom. One I used in c++ and worked well as a retrofit was to overload new to grab the smallest chunk that would fit the allocation from banks of them. Profiling under load let the sizes of the banks be tuned for efficiency. Nothing had to know it wasn't a real heap allocation, but it was way faster and with zero possibility of memory fragmentation.
    lifthrasiir2 days ago
    Most pre-2010 games had to. As a prior gamedev after that period I can confidently say that it is a relic of the past in most cases now. (Not like that I don't care, but I don't have to be that strict about allocations.)
    card_zero2 days ago
    Because why?
    user____name2 days ago
    Virtual memory gets rid of a lot of fragmentation issues.
    galangalalgol2 days ago
    Yeah. Fragmentation was a niche concern of that embedded use case. It had an mmu, just wasn't used by the rtos. I am surprised that allocations aren't a major hitter anymore. I still have to minimize/eliminate them in linux signal processing code to stay realtime.
    lifthrasiir2 days ago
    Probably because hardwares became powerful enough that you can make a performant game without thinking much about allocations.
    juped2 days ago
    The normal practical version of this advice that isn't a "guy who just read about arenas post" is that you generally kick allocations outward; the caller allocates.
    lelanthran2 days ago
    They don't work for all use-cases, but they most certainly work for this use-case (HTTP server).
    bheadmaster2 days ago
    > Ideally, there is only one allocation and deallocation in the entire program.
    Doesn't this techically happen with most of the modern allocators? They do a lot of work to avoid having to request new memory from the kernel as much as possible.
    Asmod4n2 days ago
    last time i checked, the glibc allocator doesnt ask the OS that often for new heap memory.
    Like, every ~thousand malloc calls invoked (s)brk and that was it.
    card_zero2 days ago
    > there is only one allocation and deallocation in the entire program.
    > Short-run programs are even easier. You just never deallocate and then exit(0).
    What's special about "short-run"? If you deallocate only once, presumably just before you exit, then why do it at all?
    free_bip2 days ago
    Just because there's only one deallocation doesn't mean it's run only once. It would likely be run once every time the thread it belongs to is deallocated, like when it's finished processing a request.
    lelanthran2 days ago
    I agree, which is why I wrote an arena allocator library I use (somewhere on github, probably public and free).
  - 17186274403 days ago
    To reduce the amount of allocation instead of:
    struct parsed_data * = parse (...); struct process_data * = process (..., parsed_data); struct foo_data * = do_foo (..., process_data);
    you can do
    parse (...) { ... process (...); ... } process (...) { ... do_foo (...); ... }
    It sounds like violating separation of concerns at first, but it has the benefit, that you can easily do procession and parsing in parallel, and all the data can become readonly. Also I was impressed when I looked at a call graph of this, since this essentially becomes the documentation of the whole program.
    ambicapter3 days ago
    How testable is this, though?
    17186274402 days ago
    It might be a problem when you can't afford side-effects that you later throw away, but I haven't experienced that yet. The functions still have return codes, so you still can test, whether a correct input results in no error check being followed and that incorrect input results in an error check being triggered.
  - throwawaymaths3 days ago
    is there any system where doing the basics of http (everything up to framework handoff of structured data) are done outside of a single concurrency unit?
    btown2 days ago
    Not exactly what you’re looking for, but https://github.com/simdjson/simdjson absolutely uses micro-parallel techniques for parsing, and those do need to think about concurrency and how processors handle shared memory in pipelined and branch-predicted operations.
- wfn3 hours ago
  Agree re: no need for heap allocation - for others: I recommend reading thru whole masscan source (https://github.com/robertdavidgraham/masscan), it's a pleasure btw - iirc rather few/sparse malloc()s which are part of regular I/O processing flow (there will be malloc()s which depending on config etc. set up additional data structs but as part of setup).
- lock13 days ago
  Why does "good" C have to be zero alloc? Why should "nice" javaesque make little sense in C? Why do you implicitly assume performance is "efficient problem solving"?
  Not sure why many people seem fixated on the idea that using a programming language must follow a particular approach. You can do minimal alloc Java, you can simulate OOP-like in C, etc.
  Unconventional, but why do we need to restrict certain optimizations (space/time perf, "readability", conciseness, etc) to only a particular language?
  - bluetomcat3 days ago
    Because in C, every allocation incurs a responsibility to track its lifetime and to know who will eventually free it. Copying and moving buffers is also prone to overflows, off-by-one errors, etc. The generic memory allocator is a smart but unpredictable complex beast that lives in your address space and can mess your CPU cache, can introduce undesired memory fragmentation, etc.
    In Java, you don't care because the GC cleans after you and you don't usually care about millisecond-grade performance.
    jstimpfle3 days ago
    No. Look up Arenas. In general group allocations to avoid making a mess.
    rictic2 days ago
    If you send a task off to a work queue in another thread, and then do some local processing on it, you can't usually use a single Arena, unless the work queue itself is short lived.
    jenadine2 days ago
    I don't see how arenas solve the problems.
    jstimpfle2 days ago
    You group things from the same context together, so you can free everything in a single call.
    estimator72922 days ago
    No. Arenas are not a general case solution. Look it up
  - cogman103 days ago
    > Why should "nice" javaesque make little sense in C?
    Very importantly, because Java is tracking the memory.
    In java, you could create an item, send it into a queue to be processed concurrently, but then also deal with that item where you created it. That creates a huge problem in C because the question becomes "who frees that item"?
    In java, you don't care. The freeing is done automatically when nobody references the item.
    In C, it's a big headache. The concurrent consumer can't free the memory because the producer might not be done with it. And the producer can't free the memory because the consumer might not have ran yet. In idiomatic java, you just have to make sure your queue is safe to use concurrently. The right thing to do in C would be to restructure things to ensure the item isn't used before it's handed off to the queue or that you send a copy of the item into the queue so the question of "who frees this" is straight forward. You can do both approaches in java, but why would you? If the item is immutable there's no harm in simply sharing the reference with 100 things and moving forward.
    In C++ and Rust, you'd likely wrap that item in some sort of atomic reference counted structure.
  - lelanthran3 days ago
    > Why does "good" C have to be zero alloc?
    GP didn't say "zero-alloc", but "minimal alloc"
    > Why should "nice" javaesque make little sense in C?
    There's little to no indirection in idiomatic C compared with idiomatic Java.
    Of course, in both languages you can write unidiomatically, but that is a great way to ensure that bugs get in and never get out.
    codr72 days ago
    In C, direct memory control is the top feature, which means you can assume anyone who uses your code is going to want to control memory through the process. This means not allocating from wherever and returning blobs of memory, which means designing different APIs, which is part of the reason why learning C well takes so long.
    I started writing sort of a style guide to C a while ago, which attempts to transfer ideas like this one more by example:
    https://github.com/codr7/hacktical-c
    nxobject2 days ago
    Echoing my sibling comment - thanks for sharing this.
    jabits2 days ago
    Thanks for sharing this work.
    lock12 days ago
    > Of course, in both languages you can write unidiomatically, but that is a great way to ensure that bugs get in and never get out.
    Why does "unidiomatic" have to imply "buggy" code? You're basically saying an unidiomatic approach is doomed to introduce bugs and will never reduce them.
    It sounds weird. If I write Python code with minimal side effects like in Haskell, wouldn't it at least reduce the possibility of side-effect bugs even though it wasn't "Pythonic"?
    AFAIK, nothing in the language standard mentions anything about "idiomatic" or "this is the only correct way to use X". The definition of "idiomatic X" is not as clear-cut and well-defined as you might think.
    I agree there's a risk with an unidiomatic approach. Irresponsibly applying "cool new things" is a good way to destroy "readability" while gaining almost nothing.
    Anyway, my point is that there's no single definition of "good" that covers everything, and "idiomatic" is just whatever convention a particular community is used to.
    There's nothing wrong with applying an "unidiomatic" mindset like awareness of stack/heap alloc, CPU cache lines, SIMD, static/dynamic dispatch, etc in languages like Java, Python, or whatever.
    There's nothing wrong either with borrowing ideas like (Haskell) functor, hierarchical namespaces, visibility modifiers, borrow checking, dynamic dispatch, etc in C.
    Whether it's "good" or not is left as an exercise for the reader.
    lelanthran2 days ago
    > Why does "unidiomatic" have to imply "buggy" code?
    Because when you stray from idioms you're going off down unfamiliar paths. All languages have better support for specific idioms. Trying to pound a square peg into a round hole can work, but is unlikely to work well.
    > You're basically saying an unidiomatic approach is doomed to introduce bugs and will never reduce them.
    Well, yes. Who's going to reduce them? Where are you planning to find people who are used to code written in an unusual manner?
    By definition alone, code is written for humans to read. If you're writing it in a way that's difficult for humans to read, then of course the bug level can only go up and not down.
    > It sounds weird. If I write Python code with minimal side effects like in Haskell, wouldn't it at least reduce the possibility of side-effect bugs even though it wasn't "Pythonic"?
    "Pythonic" does not mean the same thing as "Idiomatic code in Python".
  - estimator72922 days ago
    Good C has minimal allocations because you, the human, are the memory allocator. It's up to your own meat brain to correctly track memory allocation and deallocation. Over the last century, C programmers have converged on some best practices to manage this more effectively. We statically allocate, kick allocations up the call chain as far as possible. Anything to get that bit of tracked state out of your head.
    But we use different approaches for different languages because those languages are designed for that approach. You can do OOP in C and you can do manual memory management in C#. Most people don't because it's unnecessarily difficult to use languages in a way they aren't designed for. Plus when you re-invent a wheel like "classes" you will inevitably introduce a bug you wouldn't have if you'd used a language with proper support for that construct. You can use a hammer to pull out a screw, but you'd do a much better job if you used a screwdriver instead.
    Programming languages are not all created equal and are absolutely not interchangeable. A language is much, much more than the text and grammar. The entire reason we have different languages is because we needed different ways to express certain classes of problems and constructs that go way beyond textual representation.
    For example, in a strictly typed OOP language like C#, classes are hideously complex under the hood. Miles and miles of code to handle vtables, inheritance, polymorphism, virtual, abstract functions and fields. To implement this in C would require effort far beyond what any single programmer can produce in a reasonable time. Similarly, I'm sure one could force JavaScript to use a very strict typing and generics system like C#, but again the effort would be enormous and guaranteed to have many bugs.
    We use different languages in different ways because they're different and work differently. You're asking why everyone twists their screwdrivers into screws instead of using the back end to pound a nail. Different tools, different uses.
- riedel2 days ago
  A long time ago I was involved in building compilers. It was common that we solved this problem with obstacks, which are basically stacked heaps. I wonder one could not build more things like this, where freeing is a bit more best effort but you have some checkpoints. (I guess one would rather need tree like stacks) Just have to disallow pointers going the wrong way. Allocation remains ugly in C and I think explicit data structures are are definitely a better way of handling it.
- fulafel2 days ago
  This shared memory and pointer shuffling is of course fraught with requiring correct logic to avoid memory safety bugs. Good C code doesn't get you pwned, I'd argue.
  - jenadine2 days ago
    > Good C code doesn't get you pwned, I'd argue.
    This is not a serious argument because you don't really define good C code and how easy or practical it is to do. The sentence works for every language. "Good <whatever language> code doesn't get you pwned"
    But the question is whether "Average" or "Normal" C code gets you pwned? And the answer is yes, as told in the article.
    fulafel2 days ago
    The comment I was responding to suggested Good C Code employes optimizations that, I opined, are more error prone wrt memory safety - so I was not attempting to define it, but challenging the offered characterisation.
- fsckboy2 days ago
  >Good C code will try to avoid allocations as much as possible in the first place.
  there's a genius to this: if you're going to optimize prematurely, do it right out of the gate!
- 01HNNWZ0MV43FF3 days ago
  Can you do parsing of JSON and XML without allocating?
  - veqq3 days ago
    Of course. You can do it in a single pass/just parse the token stream. There are various implementations like: https://zserge.com/jsmn/
    andrepd3 days ago
    It requires manual allocation of an array of tokens. So it needs a backing "stack vector" of sorts.
    And what about escapes?
    int_19h2 days ago
    For escapes you can mutate the raw buffer with data in place, since a single escape always expands to fewer characters than the escape itself.
  - bluetomcat3 days ago
    Yes, you can do it with minimal allocations - provided that the source buffer is read-only or is mutable but is unused later directly by the caller. If the buffer is mutable, any un-escaping can be done in-place because the un-escaped string will always be shorter. All the substrings you want are already in the source buffer. You just need a growable array of pointer/length pairs to know where tokens start.
  - acidx2 days ago
    Yes! The JSON library I wrote for the Zephyr RTOS does this. Say, for instance, you have the following struct:
    struct SomeStruct { char *some_string; int some_number; };
    You would need to declare a descriptor, linking each field to how it's spelled in the JSON (e.g. the some_string member could be "some-string" in the JSON), the byte offset from the beginning of the struct where the field is (using the offsetof() macro), and the type.
    The parser is then able to go through the JSON, and initialize the struct directly, as if you had reflection in the language. It'll validate the types as well. All this without having to allocate a node type, perform copies, or things like that.
    This approach has its limitations, but it's pretty efficient -- and safe!
    Someone wrote a nice blog post about (and even a video) it a while back: https://blog.golioth.io/how-to-parse-json-data-in-zephyr/
    The opposite is true, too -- you can use the same descriptor to serialize a struct back to JSON.
    I've been maintaining it outside Zephyr for a while, although with different constraints (I'm not using it for an embedded system where memory is golden): https://github.com/lpereira/lwan/blob/master/src/samples/tec...
  - zzo38computer2 days ago
    It depends what you intend to do with the parsed data, and where the input comes from and where the output will be going to. There are situations that allocations can be reduced or avoided, but that is not all of them. (In some cases, you do not need full parsing, e.g. to split an array, you can check if it is a string or not and the nesting level, and then find the commas outside of any arrays other than the first one, to be split.) (If the input is in memory, then you can also consider if you can modify that memory for parsing, which is sometimes suitable but sometimes not.)
    However, for many applications, it will be better to use a binary format (or in some cases, a different text format) rather than JSON or XML.
    (For the PostScript binary format, there is no escaping, and the structure does not need to be parsed and converted ahead of time; items in an array are consecutive and fixed size, and data it references (strings and other arrays) is given by an offset, so you can avoid most of the parsing. However, note that key/value lists in PostScript binary format is nonstandard (even though PostScript does have that type, it does not have a standard representation in the binary object format), and that PostScript has a better string type than JavaScript but a worse numeric type than JavaScript.)
  - megous2 days ago
    Yes, you can first validate the buffer, to know it contains valid JSON, and then you can work with pointers to beginings of individual syntactic parts of JSON, and have functions that decide what type of the current element is, or move to the next element, etc. Even string work (comparisons with other escaped or unescaped strings, etc.) can be done on escaped strings directly without unescaping them to a buffer first.
    Ergonomically, it's pretty much the same as parsing the JSON into some AST first, and then working on the AST. And it can be much faster than dumb parsers that use malloc for individual AST elements.
    You can even do JSON path queries on top of this, without allocations.
    Eg. https://xff.cz/git/megatools/tree/lib/sjson.c
  - gritzko3 days ago
    Yep, no problem. In place parsing only requires a stack. Stack length is the maximum JSON nesting allowed. I have a C dialect exactly like that.
  - Ygg23 days ago
    Theoretically yes. Practically there is character escaping.
    That kills any non-allocation dreams. Moment you have "Hi \uxxxx isn't the UTF nice?" you will probably have to allocate. If source is read-only you have to allocate. If source is mutable you have to waste CPU to rewrite the string.
    deaddodo3 days ago
    I'm confused why this would be a problem. UTF-8 and UTF-16 (the only two common unicode subsets) are a maximum of 4 bytes wide (and, most commonly, 2 in English text). The ASCII representation you gave is 6-bytes wide. I don't know of many ASCII unicode representations that have less bytewidth than their native Unicode representation.
    Same goes for other characters such as \n, \0, \t, \r, etc. All half in native byte representation.
    lelanthran3 days ago
    > Moment you have "Hi \uxxxx isn't the UTF nice?" you will probably have to allocate.
    Depends on what you are doing with it. If you aren't displaying it (and typically you are not in a server application), you don't need to unescape it.
    mpyne3 days ago
    And this is indeed something that the C++ Glaze library supports, to allow for parsing into a string_view pointing into the original input buffer.
    _3u102 days ago
    It’s just two pointers the current place to write and the current place to read, escapes are always more characters than they represent so there’s no danger of overwriting the read pointer. If you support compression this can become somewhat of and issue but you simply support a max block size which is usually defined by the compression algorithm anyway.
    Ygg22 days ago
    If you have a place to write, then it's not zero allocation. You did an allocation.
    And usually if you want maximum performance, buffered read is the way to go, which means you need a write slab allocation.
    lelanthran2 days ago
    > If you have a place to write, then it's not zero allocation. You did an allocation.
    Where did that allocation happen? You can write into the buffer you're reading from, because the replacement data is shorter than the original data.
    Ygg2a day ago
    You have a read buffer and somewhere where you have to write to.
    Even if we pretend that the read buffer is not allocating (plausible), you will have to allocate for the write source for the general case (think GiB or TiB of XML or JSON).
    lelanthrana day ago
    > You have a read buffer and somewhere where you have to write to.
    The "somewhere you have to write to" is the same buffer you are reading from.
    Ygg2a day ago
    Not if you are doing buffered reads, where you replace slow file access with fast memory access. This buffer is cleared every X bytes processed.
    Writing to it would be pointless because clears obliterate anything written; or inefficient because you are somehow offsetting clears, which would sabotage the buffered reading performance gains.
    lelanthrana day ago
    Maybe I missed it, but ITT we were talking about C buffers, not buffered reads.
    Ygg2a day ago
    I thought we were talking about high performance parsing. Of which buffered reads are one. Other is loading entire document into mutable memory, which also has limitations.
    topspin3 days ago
    > Practically there is character escaping
    The voice of experience appears. Upvoted.
    It is conceivable to deal with escaping in-place, and thus remain zero-alloc. It's hideous to think about, but I'll bet someone has done it. Dreams are powerful things.
  - lelanthran3 days ago
    > Can you do parsing of JSON and XML without allocating?
    If the source JSON/XML is in a writeable buffer, with some helper functions you can do it. I've done it for a few small-memory systems.
- self_awareness2 days ago
  That mythical "Good C Code", which is known only to some people who I never met.
- pjmlp2 days ago
  These abstractions were already common in enterprise C code decades before Java came to be, thanks to stuff like Yourdon Structured Method.
  Using fixed size buffers doesn't fix out of bounds errors, and stack corruption caused by such bugs.
  Naturally we all know good C programmers never make them. /s
acidx3 days ago
One thing to note, too, is that `atoi()` should be avoided as much as possible. On error (parse error, overflow, etc), it has an unspecified return value (!), although most libcs will return 0, which can be just as bad in some scenarios.
Also not mentioned, is that atoi() can return a negative number -- which is then passed to malloc(), that takes a size_t, which is unsigned... which will make it become a very large number if a negative number is passed as its argument.
It's better to use strtol(), but even that is a bit tricky to use, because it doesn't touch errno when there's no error but you need to check errno to know if things like overflow happened, so you need to set errno to 0 before calling the function. The man page explains how to use it properly.
I think it would be a very interesting exercise for that web framework author to make its HTTP request parser go through a fuzz-tester; clang comes with one that's quite good and easy to use (https://llvm.org/docs/LibFuzzer.html), especially if used alongside address sanitizer or the undefined behavior sanitizer. Errors like the one I mentioned will most likely be found by a fuzzer really quickly. :)
- MathMonkeyMan3 days ago
  Unspecified, really? cppreference's [C documentation][1] says that it returns zero. The [OpenGroup][2] documentation doesn't specify a return value when the conversion can't be performed. This recent [draft][3] of the ISO standard for C says that if the value cannot be represented (does that mean over/underflow, bad parse, both, neither?), then it's undefined behavior.
  So three references give three different answers.
  You could always use sscanf instead, which tells you how many values were scanned (e.g. zero or one).
  [1]: https://en.cppreference.com/w/c/string/byte/atoi.html
  [2]: https://pubs.opengroup.org/onlinepubs/9799919799/functions/a...
  [3]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf
  - acidx3 days ago
    The Linux man page (https://man7.org/linux/man-pages/man3/atoi.3.html#VERSIONS) says that POSIX.1 leaves it unspecified. As you found out, it's really something that should be avoided as much as possible, because pretty much everywhere disagrees how it should behave, especially if you value portability.
    sscanf() is not a good replacement either! It's better to use strtol() instead. Either do what Lwan does (https://github.com/lpereira/lwan/blob/master/src/lib/lwan-co...), or look (https://cvsweb.openbsd.org/src/lib/libc/stdlib/strtonum.c?re...) at how OpenBSD implemented strtonum(3).
    For instance, if you try to parse a number that's preceded by a lot of spaces, sscanf() will take a long time going through it. I've been hit by that when fuzzing Lwan.
    Even cURL is avoiding sscanf(): https://daniel.haxx.se/blog/2025/04/07/writing-c-for-curl/
    MathMonkeyMan2 days ago
    If your use case can have C++, then [std::from_chars][1] is ideal. Here's gcc's [implementation][2]; a lot of it seems to be handling different bases.
    [1]: https://en.cppreference.com/w/cpp/utility/from_chars.html
    [2]: https://github.com/gcc-mirror/gcc/blob/461fa63908b5bb1a44f12...
kazinator2 days ago
I definitely don't love C that does atoi on a Content-Length value that came from the network and passes that to malloc.
Even before we get to how a malicious would interact with malloc, there is this:
> The functions atof, atoi, atol, and atoll are not required to affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined. [ISO C N3220 draft]
That includes not only out-of-range values by garbage that cannot be converted to a number at all. atoi("foo") can behave in any manner whatsoever and return anything.
Those functions are okay to use on something that has been validated in a way that it cannot cause a problem. If you know you have a nonempty sequence of nothing but digits, possibly with a minus sign, and the number digits is small enough that the value will fit into int, you are okay.
> A malicious user can pass Content-Length of 4294967295
But why would they when it's fewer keystrokes to use -1, which will go to 4294967295 on a 32 bit malloc, while scaling to 18446744073709551615 on 64 bit?
- noobermin11 hours ago
  This seems more like a coding problem than a C problem. Most people would at least validate input before allocating anything.
- 17186274408 hours ago
  > But why would they when it's fewer keystrokes to use -1, which will go to 4294967295 on a 32 bit malloc, while scaling to 18446744073709551615 on 64 bit?
  If that user wants to exploit your application it's better not to pass such a high value, since malloc typically detects size > SIZE_MAX/2. But then this code also doesn't check for malloc to return NULL, so this might also what leads to an exploit.
yipikaya3 days ago
As an aside, it's amusing that it took 25 years for C coders to embrace the C99 named struct designator feature:
```
    HttpParser parser = {
        .isValid = true,
        .requestBuffer = strdup(request),
        .requestLength = strlen(request),
        .position = 0,
    };
```
All the kids are doing it now!
- rurban2 days ago
  It's only Microsoft's fault to have not implemented it for decades in MSVC. They stayed at C89 forever.
  - flykespice2 days ago
    I never understand the reason why Microsoft lagged so much behind on newer c standards adoption. Did their compiler infrastructure made it difficult to adopt newer standards flexible? Or they simply did not care?
    rurban2 days ago
    They focused on C++ only. Management, not their devs.
- 17186274403 days ago
  This is nice for constant data, but strdup can return NULL here, which is again never checked.
  > it took 25 years for C coders to embrace the C99 named struct designator feature
  Not sure if this actually true, but this is kind of the feature of C, 20 years old code or compiler is supposed to work just fine, so you just wait for some time to settle things. For fast and shiny, there is Javascript.
- davemp3 days ago
  I’m still regularly getting on projects and moving C89 variable declarations from the start of functions to where they’re initialized, but I guess it’s not the kids doing it.
  - 17186274402 days ago
    I only declare variables at the begin of a block, not because I would need C89 compatibility, but because I find it clearer to establish the working set of variables upfront. This doesn't restrict me in anyway, because I just start a new block, when I feel the need. I also try to keep the scope of a variable as small as possible.
  - mkfs2 days ago
    > C89 variable declarations from the start of functions
    Technically it's the start of a block.
    davemp2 days ago
    Technically but I don’t think folks ever really bothered.
- varjag2 days ago
  Some of the most famous C codebases (e.g. the Linux kernel) been using them for some time.
lelanthran3 days ago
I can't completely blame the language here: anyone "coding" in a language new to them using an LLM is going to have real problems.
- OneLessThing3 days ago
  It's funny the author says this was 90% written without AI, and that AI was mostly used for the json code. I think they're just new to C.
  Trust me I love C. Probably over 90% of my lifetime code has been written in C. But python newbies don't get their web frameworks stack smashed. That's kind of nice.
  - lelanthran3 days ago
    > But python newbies don't get their web frameworks stack smashed. That's kind of nice.
    Hah! True :-)
    The thing is, smashed stacks are difficult to exploit deterministically or automatically. Even heartbleed, as widespread as it was, was not a guaranteed RCE.
    OTOH, an exploit in a language like Python is almost certainly going to be easier to exploit deterministically. Log4j, for example, was a guaranteed exploit and the skill level required was basically "Create a Java object".
    This is because of the ease with which even very junior programmers can create something that appears to run and work and not crash.
    alfiedotwtf3 days ago
    > The thing is, smashed stacks are difficult to exploit deterministically or automatically. Even heartbleed, as widespread as it was, was not a guaranteed RCE.
    That’s like driving without a seatbelt - it’s not safe, but it would only matter on that very rare chance you have a crash. I would rather just wear a seatbelt!
- uyzstvqs3 days ago
  It's a double-sided coin. LLMs are probably the best way to learn programming languages right now. But if you vibecode in a programming language that you don't understand, it's going to be a disaster sooner or later.
  This is also the reason why AI will not replace any actual jobs with merit.
  - AdieuToLogic3 days ago
    > LLMs are probably the best way to learn programming languages right now.
    Books still exist, be they in print or electronic form.
    estimator72922 days ago
    Examples are the best documentation, and we now have a machine to produce infinite examples tailored specifically to any situation
    AdieuToLogic16 hours ago
    > Examples are the best documentation ...
    No they are not, as examples lack explanation of the concepts underlying a programming language's definition.
    > ... and we now have a machine to produce infinite examples tailored specifically to any situation
    This is like saying, "to learn X language, just read a bunch of source in GitHub repositories that use it."
    What books written by authoritative people provide, such as language designers or recognized luminaries, is conveyance of key linguistic concepts and an explanation of "the why" they are important. This is the sole purvey of people.
    nxobject2 days ago
    Pending on the quality of the examples, of course.
    zweifuss2 days ago
    I would claim that:
    (interactive labs + quizzes) > Learning from books
    Good online documentation > 5yr old tome on bookshelf
    chat/search with ai > CTRL+F in a PDF manual
    AdieuToLogic16 hours ago
    Most of what you claim as being better does not address how people can discover concepts of which they are previously aware. To wit:
    One cannot complete "labs + quizzes" unless they know how to answer same. One cannot "Ctrl-F in a PDF manual" unless they know what to search for.
    As to online docs being better than a printed "5yr old tome on bookshelf", that depends on if the available online documentation subsumes the book. If it does, awesome, but if it doesn't, then there very likely are things to learn within reach of said bookshelf.
    EDIT:
    An exemplar to consider is how the Actor Model[0] can be used to define a FaaS[1]-based system. Without being aware of this paper, it is unrealistic to expect someone to be able to formulate LLM prompts incorporating concepts identified by same.
    Side note: the Actor Model[0] paper is far older than a "5yr old tome" and is very much applicable to this day.
    0 - https://dspace.mit.edu/bitstream/handle/1721.1/41962/AI_WP_1...
    1 - https://en.wikipedia.org/wiki/Function_as_a_service
    skydhash2 days ago
    Interactive labs can do a great job of teaching skills, but they fell short of teaching understanding. And at some point, it’s faster to read a book to learn, because there’s a reduced need for practice.
    Hypertext is better than printed book format, but if you’re just starting with something you need a guide that provides a coherent overview. Also most online documentation are just bad.
    Why ctrl+f? You can still have a table of contents and an index with pdf. And the pdf formats support link. And I’d prefer filtering/querying over generation because the latter is always tainted by my prompt. If I type `man unknown_function`, I will get an error, not a generated manual page.
dang3 days ago
Recent and related:
Show HN: I built a web framework in C - https://news.ycombinator.com/item?id=45526890 - Oct 2025 (208 comments)
AdieuToLogic3 days ago
While the classic "Parse, don’t validate"[0] paper uses Haskell instead of C as its illustrative programming language, the approach detailed is very much applicable in these scenarios.
0 - https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
- lelanthran2 days ago
  > While the classic "Parse, don’t validate"[0] paper uses Haskell instead of C as its illustrative programming language, the approach detailed is very much applicable in these scenarios.
  Good thing someone (i.e. me) took the time to demonstrate PdV in C: https://www.lelanthran.com/chap13/content.html
  - nxobject2 days ago
    I appreciate that link - now I see the parallels between “consolidate allocation in C to the extent that the rest of your code doesn’t have to worry”, and “consolidate validation in C” to the extent that…”.
jqpabc1233 days ago
Reads like an indictment of vibe coding.
LLMs are fundamentally probabilistic --- not deterministic.
This basically means that anything produced this way is highly suspect. And this framework is an example.
erichocean3 days ago
Give Fil-C a try, the speed hit is pretty minimal and you get full memory safety.
https://fil-c.org/
- Karrot_Kream3 days ago
  Wow this is really cool, I'd never seen this before. Thanks!
- adhamsalama2 days ago
  Why isn't this used more?
jacquesm3 days ago
There are many, many more such issues with that code. The person that posted it is new to C and had an AI help them to write the code. That's a recipe for disaster, it means the OP does not actually understand what they wrote. It looks nice but it is full of footguns and even though it is a useful learning exercise it also is a great example of why it is better run battle tested frame works than to inexpertly roll your own.
As a learning exercise it is useful, but it should never see production use. What is interesting is that the apparent cleanliness of the code (it reads very well) is obscuring the fact that the quality is actually quite low.
If anything I think the conclusion should be that AI+novice does not create anything that is useable without expert review and that that probably adds up to a net negative other than that the novice will (hopefully) learn something. It would be great if someone could put in the time to do a full review of the code, I have just read through it casually and already picked up a couple of problems, I'm pretty sure that if you did a thorough job of it there would be many more.
- drnick13 days ago
  > What is interesting is that the apparent cleanliness of the code (it reads very well) is obscuring the fact that the quality is actually quite low.
  I think this is a general feature and one of the greatest advantages of C. It's simple, and it reads well. Modern C++ and Rust are just horrible to look at.
  - messe3 days ago
    I slightly unironically believe that one of the biggest hindrances to rust's growth is that it adopted the :: syntax from C++ rather than just using a single . for namespacing.
    jacquesm3 days ago
    I believe that the fanatics in the rust community were the biggest factor. They turned me off what eventually became a decent language. There are some language particulars that were strange choices, but I get that if you want to start over you will try to get it all right this time around. But where the Go authors tried to make the step easy and kept their ego out of it, it feels as if the rust people aimed at creating a new temple rather than to just make a new tool. This created a massive chicken-and-the-egg problem that did not help adoption at all. Oh, and toolchain speed. For non-trivial projects for the longest time the rust toolchain was terribly slow.
    I don't remember any other language's proponents actively attacking the users of other programming language.
    cyphar3 days ago
    > But where the Go authors tried to make the step easy and kept their ego out of it
    That is very different to my memories of the past decade+ of working on Go.
    Almost every single language decision they eventually caved on that I can think of (internal packages, vendoring, error wrapping, versioning, generics) was preceded by months if not years of arguing that it wasn't necessary, often followed by an implementation attempt that seems to be ever so slightly off just out of spite.
    Let's don't forget that the original Go 1.0 expected every project's main branch to maintain backward compatibility forever or else downstreams would break, and without hacks (which eventually became vendoring) you could not build anything without an internet connection.
    To be clear, I use Go (and C... and Rust) and I do like it on the whole (despite and for its flaws) but I don't think the Go authors are that different to the Rust authors. There are (unfortunately) more fanatics in the Rust community but I think there's also a degree to which some people see anything Rust-related as being an attack on other projects regardless of whether the Rust authors intended it to be that way.
    jacquesm2 days ago
    Fair enough.
    lelanthran3 days ago
    > I believe that the fanatics in the rust community were the biggest factor.
    I second this; for a few years it was impossible to have any sort of discussion on various programming places when the topic was C: the conversation would get quickly derailed with accusations of "dinosaur", etc.
    Things have gone quiet recently (last three years, though) and there have been much fewer derailments.
    LexiMax3 days ago
    As an outsider, I don't really see Rust having done anything different recently than they weren't doing from the start.
    What seems to have changed in recent years is the buy-in from corporations that seemingly see value in its promises of safety. This seems to be paired with a general pulling back of corporate support from the C++ world as well as a general recession of fresh faces, a change that at least from the sidelines seems to be mostly down to a series of standards committee own-goals.
    kcexn3 hours ago
    I'm not sure that there is a recession of corporate support from C++. Just that the proportion of companies that need C++ is smaller than it once was.
    I like the safety promise of Rust. But the complicated interop story with C and C++ hurt it a lot. I mean, in a typical codebase, what proportion of bugs will be memory-safety related vs other reasons? Ideally, we could just wrap the safety-critical bits in a memory-safe wrapper and continue to use C and C++ for everything else.
    LexiMax3 days ago
    Being a C++ developer and trafficking mostly in C++ spaces, there is a phenomenon I've noticed that I've taken to calling Rust Derangement Syndrome. It's where C and C++ developers basically make Rust the butt of every joke, and make fun it it in a way that is completely outsized with how much they interact with Rust developers in the wild.
    It's very strange to witness. Annoying advocacy of languages is nothing new. C++ was at one point one of those languages, then it was Java, then Python, then Node.js. I feel like if anything, Rust was a victim of a period of increased polarization on social media, which blew what might have been previously seen as simple microaggressions completely out of proportion.
    hu33 days ago
    I don't think Rust will ever be as big as C++ because there were fewer options back then.
    These days Go/Zig/Nim/C#/Java/Python/JS and other languages are fast enough for most use cases.
    And Rust learning curve doesn't help either. C++ was basically C with OOP on steroids. Rust is very different.
    I say that because I wouldn't group Rust opposition with any of those languages you cited. It's different for mostly different reasons and magnitudes.
    pjmlpa day ago
    As someone that was there, a few things helped C++ adoption, and even then it wasn't without the C vs C++ flamewars that endure to these days.
    - At the time, with a few minor differences, C++ was Typescript for C, thus very easy to adopt into existing projects
    - Being born on the same birthplace as C and UNIX, meant all C compiler vendors saw as added value to have it as part of their offering, and it was natural that every UNIX SDK also had C++ support available alongside C.
    - Apple, Metrowerks, IBM, Borland and Microsoft helped to push C++ adoption, by making it the official way to use application frameworks. MacApp (originally in Object Pascal), PowerPlant, CSet++, Turbo Vision/OWL/VCL, and MFC respectively.
    This kept C++ as the language to go for performance in enterprise computing, while Delphi and VB got the "easy" development role, until Java and .NET took over all those frameworks.
    Rust doesn't have this kind of industry wide push, even in OSes where it is being embraced like Windows and Android, note that it isn't being pushed as yet another way to write userspace applications, rather low level OS services.
    LexiMaxa day ago
    > Rust doesn't have this kind of industry wide push, even in OSes where it is being embraced like Windows and Android, note that it isn't being pushed as yet another way to write userspace applications, rather low level OS services.
    This seems apropos in a world where C++ has been bleeding userspace buy-in for longer than I've been professionally programming.
    I started learning Rust a few months ago in an attempt to teach an old dog new tricks, and while it's quite pleasant as far as it went, I can think of several classes of programs that I would be reluctant to use the language for. But I wouldn't dream of using C++ for those types of programs either.
    There are rumors floating around that Microsoft is rolling their own rustc-codegen-gcc paired with their C2 codegen backend. Don't know what to make of those rumors, but it helped to reassure me to feel like the time I invested thus far hasn't been wasted.
    pjmlp21 hours ago
    Not sure about the new backend, but they are indeed quite invested.
    "From Blue Screens to Orange Crabs: Microsoft's Rusty Revolution"
    https://www.youtube.com/watch?v=uDtMuS7BExE
    imtringued3 days ago
    Software vulnerabilities are an implicit form of harassment.
    messe3 days ago
    I'm hoping that's meant to satirise the rust community, because it's horseshit like this that makes a sizeable subset of rust evangelists unbearable.
    01HNNWZ0MV43FF3 days ago
    > I don't remember any other language's proponents actively attacking the users of other programming language.
    I just saw someone on Hacker News saying that Rust was a bad language because of its users
    jacquesm3 days ago
    Yawn. Really, if you have nothing to say don't do it here.
    LexiMax3 days ago
    Gotcha hypocrisy might be a really cheap thing to point out, but they're not wrong.
    I have noticed my fair share of Rust Derangement Syndrome in C++ spaces that seems completely outsized from the series of microaggressions that they eventually point out when asked "Why?"
    dgfitz3 days ago
    It’s interesting, over the past 15 years I’ve had occasion to work with other c/c++ devs on various contracts, probably 50ish distinct different companies. Not once has rust even come up in casual conversation.
    LexiMaxa day ago
    I'm probably a little younger than you, so it's likely a generational thing. I also notice it's a lot more pervasive in internet-driven watercoolers than face to face.
  - citbl3 days ago
    The safer the C code, the more horrible it starts looking though... e.g.
    my_func(char msg[static 1])
    moefh3 days ago
    I don't understand why people think this is safer, it's the complete opposite.
    With that `char msg[static 1]` you're telling the compiler that `msg` can't possibly be NULL, which means it will optimize away any NULL check you put in the function. But it will still happily call it with a pointer that could be NULL, with no warnings whatsoever.
    The end result is that with an "unsafe" `char *msg`, you can at least handle the case of `msg` being NULL. With the "safe" `char msg[static 1]` there's nothing you can do -- if you receive NULL, you're screwed, no way of guarding against it.
    For a demonstration, see[1]. Both gcc and clang are passed `-Wall -Wextra`. Note that the NULL check is removed in the "safe" version (check the assembly). See also the gcc warning about the "useless" NULL check ("'nonnull' argument 'p' compared to NULL"), and worse, the lack of warnings in clang. And finally, note that neither gcc or clang warn about the call to the "safe" function with a pointer that could be NULL.
    [1] https://godbolt.org/z/qz6cYPY73
    lelanthran3 days ago
    > I don't understand why people think this is safer, it's the complete opposite.
    Yup, and I don't even need to check your godbolt link - I've had this happen to me once. It's the implicit casting that makes it a problem. You cannot even typedef it away as a new type (the casting still happens).
    The real solution is to create and use opaque types. In this case, wrapping the `char[1]` in a struct would almost certainly generate compilation errors if any caller passed the wrong thing in the `char[1]` field.
    uecker3 days ago
    Compared to other languages, this is still nice.
    jacquesm3 days ago
    It is - like everything else - nice because you, me and lots of others are used to it. But I remember starting out with C and thinking 'holy crap, this is ugly'. After 40+ years looking at a particular language it no longer looks ugly simply because of familiarity. But to a newcomer C would still look quite strange and intimidating.
    And this goes for almost all programming languages. Each and every one of them has warts and issues with syntax and expressiveness. That holds true even for the most advanced languages in the field, Haskell, Erlang, Lisp and more so for languages that were originally designed for 'readability'. Programming is by its very nature more akin to solving a puzzle than to describing something. The puzzle is to how to get the machine to do something, to do it correctly, to do it safely and to do it efficiently, and all of those while satisfying the constraint of how much time you are prepared (or allowed) to spend on it. Picking the 'right' language will always be a compromise on some of these, there is no programming language that is perfect (or even just 'the best' or 'suitable') for all tasks, and there are no programming languages that are better than any other for any subset of all tasks until 'tasks' is a very low number.
    uecker2 days ago
    I agree that the first reaction usually is only about what one is used to. I have seen this many times. Still, of course, not all syntax is equally good.
    For example, the problem with Vec<Vec<T>> for a 2D array is not that one is not used to it, but that the syntax is just badly designed. Not that C would not have problematic syntax, but I still think it is fairly good in comparison.
    jacquesm2 days ago
    C has one massive advantage over many other languages: it is just a slight level above assembler and it is just about as minimal as a language can be. It doesn't force you into an eco-system, plays nice with lots of other tools and languages and gets out of the way. 'modern' languages, such as Java, Rust, Python, Javascript (Node) and so on all require you to buy in to the whole menu, they're not 'just a language' (even if some of them started out like that).
    uecker2 days ago
    Not forcing you into an eco-system is what makes C special, unique and powerful, and this aspect is not well understood by most critics. Stephen Kell wrote a great essay about it.
    pjmlp2 days ago
    Meanwhile, in Modula-2 from 1978, that would be
    PROCEDURE my_func(msg: ARRAY OF CHAR);
    Now you can use LOW() and HIGH() to get the lower and upper bounds, and naturally bounds checked unless you disabled them, locally or globaly.
    jacquesm2 days ago
    This should not be downvoted, it is both factually correct and a perfect illustration of these problems already being solved and ages ago at that.
    It is as if just pointing this out already antagonizes people.
    pjmlp2 days ago
    A certain group of people likes to pretend before C there were no other systems programming languages, other than BCPL.
    Ignoring what happened since 1958 (JOVIAL being a first attempt), and thus all its failings are excused because it was discovering the world.
    jacquesma day ago
    I think the main reason you see this happening over and over again is because we're teaching this whole discipline wrong. By 1960 most of the problems in software development were known and had one or more solutions. Knuth spent decades just cataloging what was mostly already known (and moved the field forward in quite a few occasions as well).
    And yet, you can't go a day without someone declaring that now is the time to do it right, this time it will be different. And then they proceed to do one thing after another for which the outcome is already known, just not to them. I think the best way to teach would be to start off with a fairly detailed history of what had gone before, just to give people a map and some basic awareness of the degree to which things have already been done, rather than to find new and interesting ways to shoot themselves in the foot (again).
    lioeters21 hours ago
    Unlike actual engineering, software "engineering" as a field has decided to reinvent itself every generation - worse, every turn of the trends, even every project and person. Majority of the practitioners are in it for superficial reasons, unaware of its rich history and culture.
    With ignorance comes arrogance of an individualist intellectual, thinking their unique revolutionary contribution will wow the public and move the field forward. Except inevitably they're not only reinventing the wheel but the entire automobile, without knowing basic principles and the work of predecessors. It has a lot in common with modern art.
    > we're teaching this whole discipline wrong
    I sometimes think languages after C, like C++ and Java, were misguided in some ways. Sure they provided business value, brought new ideas, and the software worked - but their popularity came at a cost of leaving countless great thoughts behind in history, and resulted in a poverty of software culture, education and imagination.
    There are optimistic signs of people returning to the roots, re-learning the lessons and re-discovering ideas. I think many are coming to realize the need for a reformation of sorts.
    3 days ago
    undefined
- nurettin3 days ago
  > should never see production use.
  I have an issue with high strung opinions like this. I wrote plenty of crappy delphi code while learning the language that saw production use and made a living from it.
  Sure, it wasn't the best experience for users, it took years to iron out all the bugs and there was plenty of frustration during the support phase (mostly null pointer exceptions and db locks in gui).
  But nobody would be better off now if that code never saw production use. A lot of business was built around it.
  - zdragnar3 days ago
    Buggy code that just crashes or produces incorrect results are a whole different category. In C a bug can compromise a server and your users. See the openssl heart bleed vulnerability as a prime example.
    Once upon a time, you could put up a relatively vulnerable server, and unless you got a ton of traffic, there weren't too many things that would attack it. Nowadays, pretty much anything Internet facing will get a constant stream of probes. Putting up a server requires a stricter mindset than it used to.
  - jacquesm3 days ago
    There are minimum standards for deployment to the open web. I think - and you're of course entirely free to have a different opinion - that those are not met with this code.
    nurettin3 days ago
    Yes, I have lots of opinions!
    I guess the question at spotlight is: At what point would your custom server's buffer overflow when reading a header matter and would that bug even exist at that point?
    Could a determined hacker get to your server without even knowing what weird software you cooked up and how to exploit your binary?
    We have a lot of success stories born from bad code. I mean look at Micro$oft.
    Look at all the big players like discord leaking user credentials. Why would you still call out the little fish?
    Maybe I should create a form for all these ahah.
    frumplestlatz2 days ago
    > Could a determined hacker get to your server without even knowing what weird software you cooked up and how to exploit your binary?
    Yes.
    nurettin2 days ago
    Yes but how? After the overflow they still have to know the address of the next call site and the server would be in a UB state.
    jacquesm2 days ago
    The code is on github. Figure out a way to get a shell through that code and you're hosed if someone recognizes it in active use.
    nurettin2 days ago
    I mean tha hacker won't know what software is running on the server, unless the server announces itself which can be traced to the repo, but then, why ?? Who cares about this guy's vps? This whole thread makes no sense to me and I'm the only one questioning.
    jacquesma day ago
    > This whole thread makes no sense to me and I'm the only one questioning.
    That may well be because this isn't your field?
    nurettina day ago
    Or maybe well thought out, intelligent responses are a rare thing. Occam's razor suggests the latter.
    frumplestlatz2 days ago
    UB state doesn’t mean totally uncontrollable or opaque.
    There are lots of ways the server could leak information about its internal state, and exploits have absolutely been implemented in the past based only on what was visible remotely.
- lifthrasiir3 days ago
  Yeah, I recently wrote a moderate amount of C code [1] entirely with Gemini and while it was much better than what I initially expected I needed a constant steering to avoid inefficient or less safe code. It needed an extensive fuzzing to get the minimal amount of confidence, which caught at least two serious problems---seriously, it's much better than most C programmers, but still.
  [1] https://github.com/lifthrasiir/wah/blob/main/wah.h
  - jacquesm3 days ago
    I've been doing this the better part of a lifetime and I still need to be careful so don't feel bad about it. Just like rust has an 'unsafe' keyword I realize all of my code is potentially unsafe. Guarding against UB, use-after-free, array overruns and so on is a lot of extra work and you only need to slip up once to have a bug, and if you're unlucky something exploitable. You get better at this over the years. But if I know something needs to be bullet proof the C compiler would not be my first tool of choice.
    One good defense is to reduce your scope continuously. The smaller you make your scope the smaller the chances of something escaping your attention. Stay away from globals and global data structures. Make it impossible to inspect the contents of a box without going through a well defined interface. Use assertions liberally. Avoid fault propagation, abort immediately when something is out of the expected range.
    uecker3 days ago
    I strategy that helps me is just not use open-coded pointer arithmetic or string manipulation but encapsulate those behind safe bounds-checked interfaces. Then essentially only life-time issues remain and for those I usually do have a simple policy and clearly document any exception. I also use signed integers and the sanitizer in trapping mode, which turns any such issue I may have missed into a run-time trap.
    OneLessThing3 days ago
    This is why I love C. You can build these guard rails at exactly the right level for you. You can build them all the way up to CPython and do garbage collection and constant bounds checking. Or keep them at just raw pointer math. And everywhere in between. I like your approach. The downside being that there are probably 100,000+ bespoke implementations of similar guard rails where python users for example all get them for free.
    jacquesm3 days ago
    It definitely is a lot of freedom.
    But the lack of a good string library is by itself responsible for a very large number of production issues, as is the lack of foresight regarding de-referencing pointers that are no longer valid. Lack of guardrails seems to translate in 'do what you want' not necessarily 'build guard rails at the right level for you', most projects simply don't bother with guardrails at all.
    Rust tries to address a lot of these issues, but it does so by tossing out a lot of the good stuff as well and introducing a whole pile of new issues and concepts that I'm not sure are an improvement over what was there before. This creates a take-it-or-leave it situation, and a barrier to entry. I would have loved to see that guard rails concept extended to the tooling in the form of compile time flags resulting in either compile time flagging of risky practices (there is some of this now, but I still think it is too little) and runtime errors for clear violations.
    The temptation to 'start over' is always there, I think C with all of its warts and shortcomings is not the best language for a new programmer to start with if they want to do low level work. At the same time, I would - still, maybe that will change - hesitate to advocate for rust, it is a massive learning curve compared to the kind of appeal that C has for a novice. I'd probably recommend Go or Java over both C and rust if you're into imperative code and want to do low level work. For functional programming I'd recommend Erlang (if only because of the very long term view of the people that build it) or Clojure, though the latter seems to be on its retour.
    uecker2 days ago
    I think the C standard should provide some good libraries, e.g. a string library. But in any case the problem with 100000+ bespoke implementations in C is not fixed by designing new programming languages and also adding them to the mix. Entropy is a bitch.
    lelanthran2 days ago
    > I strategy that helps me [...]
    In another comment recently I opined that C projects, initiated in 2025, are likely to be much more secure than the same project written in Python/PHP (etc).
    This is because the only people choosing C in 2025 are those who have been using it already for decades, have internalised the handful of footguns via actual experience and have a set of strategies for minimising those footguns, all shaped with decades of experience working around that tiny handful of footguns.[1]
    Sadly, this project has rendered my opinion wrong - it's a project initiated in 2025, in C, that was obviously done by an LLM, and thus is filled with footguns and over-engineering.
    ============
    [1] I also have a set of strategies for dealing with the footguns; I would gues if we sat down together and compared notes our strategies would have more in common than they would differ.
    uecker2 days ago
    If you want something fool-proof where a statistical code generated will not generate issues, then C is certainly not a good choice. But also for other languages this will cause issues. I think for vibe-coding a network server you might want something sand-boxed with all security boundaries outside, in which case it does not really matter anymore.
  - OneLessThing3 days ago
    This is exactly my problem with LLM C code, lack of confidence. On the other hand, when my projects get big enough to the point where I cannot keep the code base generally loaded into my brains cache they eventually get to the point where my confidence comes from extensive testing regardless. So maybe it's not such a bad approach.
    I do think that LLM C code if made with great testing tooling in concert has great promise.
    jacquesm3 days ago
    That generalizes to anything LLM related.
  - lelanthran3 days ago
    > It needed an extensive fuzzing to get the minimal amount of confidence, which caught at least two serious problems---seriously, it's much better than most C programmers, but still.
    How are you doing your fuzzing? You need either valgrind (or compiler sanitiser flags) in the loop for a decent level of confidence.
    lifthrasiir3 days ago
    The "minimal" amount of confidence, not a decent level of confidence. You are completely right that I need much more to establish anything higher than that.
- OneLessThing3 days ago
  I agree that it reads really well which is why I was also surprised the quality is not high when I looked deeper. The author claims to have only used AI for the json code, so your conclusion may be off, it could just be a novice doing novice things.
  I suppose I was just surprised to find this code promoted in my feed when it's not up to snuff. And I'm not hating, I do in fact love the project idea.
- citbl3 days ago
  The irony is also that AI could have been used to audit the code and find these issues. All the author had to do was to question.
qalmakkaa day ago
Integer operations, the one thing in computers where basically there's no non-annoying way to do them right except being over pedantic with checks
qalmakkaa day ago
OT: using the `strcasecmp` family of functions is basically asking for trouble - unless you've previously set the locale to "C", which is basically the only locale with a defined behaviour. Otherwise you're basically bound to run onto very funny internationalisation issues you'd rather know nothing about (and fail the Turkey Test)
messe3 days ago
> Another interesting choice in this project is to make lengths signed:
There are good reasons for this choice in C (and C++) due to broken integer promotion and casting rules.
See: "Subscripts and sizes should be signed" (Bjarne Stroustrup) https://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0...
As a nice bonus, it means that ubsan traps on overflow (unsigned overflows just wrap).
- uecker3 days ago
  I do not agree that the integer promotion or casting (?) rules are broken in C. That some people make mistakes because they do not know them is a different problem.
  The reason you should make length signed is that you can use the sanitizer to find or mitigate overflow as you correctly observe, while unsigned wraparound leads to bugs which are basically impossible to find. But this has nothing to do with integer promotion and wraparound bugs can also create bugs in - say - Rust.
  - messe3 days ago
    I meant implicit casting, but I guess that really falls under promotion in most cases where it's relevant here (I'm on a train from Aarhus to Copenhagen right now to catch a flight, and I've slept considerably less than usual, so apologies if I'm making some slight mistakes).
    The issues really arise when you mix signed/unsigned arithmetic and end up promoting everything to signed unexpectedly. That's usually "okay", as long as you're not doing arithmetic on anything smaller than an int.
    As an aside, if you like C enough to have opinions on promotion rules then you might enjoy the programming language Zig. It's around the same level as C, but with much nicer ergonomics, and overflow traps by default in Debug/ReleaseSafe optimization modes. If you want explicit two's complement overflow it has +%, *% and -% variants of the usual arithmetic operations, as well as saturating +|, *|, -| variants that clamp to [minInt(T), maxInt(T)].
    EDIT to the aside: it's also true if you hate C enough to have opinions on promotion rules.
    uecker3 days ago
    I prefer C to Zig. IMHO all the successor languages throw out the baby with the bathwater and add unnecessary complexity. But Zig is much better than Rust, but, still, I would never use it for a serious project.
    The "promoting unexpectedly" is something I do not think happens if you know C well. At least, I can't remember ever having a bug because of this. In most cases the promotion prevents you from having a bug, because you do not get unexpected overflow or wraparound because your type is too small.
    Mixing signed and unsigned is problematic, but I see issues mostly in code from people who think they need to use unsigned when they shouldn't because they heard signed integers are dangerous. Recently I saw somebody "upgrading" a C code basis to C++ and also changing all loop variables to size_t. This caused a bug which he blamed on working on the "legacy C code" he is working on, although the original code was just fine. In general, there are compiler warnings that should catch issues with sign for conversions.
    lelanthran3 days ago
    > Recently I saw somebody "upgrading" a C code basis to C++ and also changing all loop variables to size_t. This caused a bug which he blamed on working on the "legacy C code" he is working on, although the original code was just fine.
    I had the same experience about 10 years back when a colleague "upgrade" code from using size_t to `int`; on that platform (ATMEGA or XMEGA, not too sure now) `int` was too small, overflowed and bad stuff happened in the field.
    The only takeaway is "don't needlessly change the size and sign of existing integer variables".
    uecker2 days ago
    I don't think this is the only takeway. My point is that you can reliably identify signed integer overflow using sanitizers and you can also reliably mitigate related attacks by trapping for signed integer overflow (it still may be a DoS, but you can stop more serious harm). Both does not work with unsigned types except in a tightly controlled project where you treat unsigned wraparound as a bug, but this fails the moment you introduce other idiomatic C code that does not follow this.
    jacquesm3 days ago
    Yes, this is one of the more subtle pitfalls of C. What helps is that in most contexts the value of 2 billion is large enough that a wraparound would be noticed almost immediately. But when it isn't then it can lead to very subtle errors that can propagate for a long time before anything goes off the rails that is noticed.
  - OneLessThing3 days ago
    It's interesting to hear these takes. I've never had problems catching unsigned wrap bugs with plain old memory sanitizers, though I must admit to not having a lot of experience with ubsan in particular. Maybe I should use it more.
    uecker3 days ago
    GCC's sanitizer does not catch unsigned wraparound. But the bigger problem is that a lot of code is written where it assumes that unsigned wraps around and this is ok. So you you would use a sanitizer you get a lot of false positives. For signed overflow, one can always consider this a bug in portable C.
    Of course, if you consistently treat unsigned wraparound as a bug in your code, you can also use a sanitizer to screen for it. But in general I find it more practical to use signed integers for everything except for modular arithmetic where I use unsigned (and where wraparound is then expected and not a bug)
    jacquesm3 days ago
    I've had some fun reviewing some very old code I wrote (1980's) to see what it looked like to me after such a long time of gaining experience. It's not unlike what the OP did here, it reads cleanly but I can see many issues that escaped my attention at the time. I always compared C with a very fast car: you can take some corners on two wheels but if you make a habit of that you're going to end up in a wall somewhere. That opinion has not changed.
    uecker3 days ago
    I think the correct comparison is a sharp knife. It is extremely useful and while there is a risk it is fully acceptable. The idea that we should all use plastic knifes because there are often accidents with knifes is wrong and so is the idea that we use should abandon C because of memory safety. I follow computer security issues for several decades, and while I think we should have memory safety IMHO the push and arguments are completely overblown - and they are especially not worth the complexity and issues of Rust. I never was personally impacted by a security exploit caused by memory safety or know anybody in my personal vicinity who was. I know many cases where people where affected by other kinds of security issues. So I think those are what we should focus on first. And having timely security updates is a hell lot more important than memory safety, so I am not happy that Rust now makes this harder.
    jacquesm3 days ago
    That's an interesting point you are making there. The most common exploits are of the human variety. Even so it is probably a good idea to minimize the chances of all kinds of exploits. One other problem - pet peeve of mine - is that instead of giving people just security updates manufacturers will happily include a whole bunch of new and 'exciting' stuff in their updates that in turn will (1) introduce new security issues and (2) will inevitably try to extract more money from the updaters. This is extremely counterproductive.
    int_19h2 days ago
    The real problem with C is that it's not just a sharp knife, it's a knife with poor ergonomics that makes it more prone to cutting yourself.
    The answer to that though is probably more something like Zig than something like Rust.
    DenisDolyaa day ago
    Hi, are you interested in Zig? Then please check out my port of jsmn to Zig. I wanted to know if people will like it and if there are any downsides others might not. https://github.com/Ferki-git-creator/jsmn_zig
    pjmlp2 days ago
    Except any good chef or butcher knows that they should be wearing protective gloves when using sharp knifes.
    > Cut-resistant gloves are an essential piece of safety equipment in any kitchen.
    https://www.restaurantware.com/blogs/smallwares/how-to-choos...
    Where are C's gloves?
    simonask3 days ago
    I’m sorry, but there is an incredible amount of hard data on this, including the number of CVEs directly attributable to memory safety bugs. This is publicly available information, and we as an industry should take it seriously.
    I don’t mean to be disrespectful, but this cavalier attitude towards it reads like vaccine skepticism to me. It is not serious.
    Programming can be inconsequential, but it can also be national security. I know which engineers I would trust with the latter, and they aren’t the kind who believe that discipline is “enough”.
    goalieca3 days ago
    CVE are important but there’s also a lot of theatre there. How many are known exploitable? Most aren’t if you follow threat intel. Most of the Internet infrastructure is running c/c++ and is very safe.
    simonask2 days ago
    It's fine to have a sober view of the severity, but we can hopefully agree in general that writing any program in C or C++ that faces the internet requires extreme caution.
    goalieca2 days ago
    I think anything that faces the internet needs extreme caution. I've done enough pentesting myself to see that mistakes are abound and most of them are logic problems.
    ueckera day ago
    Number of CVEs is completely irrelevant. Also Google's or Microsoft's priorities are completely irrelevant. If you have reliable data from the real world, please show it to me.
    jacquesm3 days ago
    So what do you propose to do?
    simonask3 days ago
    I propose that we start taking the appropriate amount of professional responsibility.
    That includes being honest about the actual costs of software when you don’t YOLO the details. Zero UB is table stakes now - it didn’t use to be, but we don’t live in that world anymore.
    It’s totally fine to use C or whatever language for it, but you are absolutely kidding yourself if you think the cost is less than at least an order of magnitude higher than the equivalent code written in Rust, C#, or any other language that helps you avoid these bugs. Rust even lets you get there at zero performance cost, so we’re down to petty squabbles about syntax or culture - not serious.
    pjmlp2 days ago
    Thankfully the new cybersecurity laws will help here, when companies map production costs to languages, the needle will keep moving away from those that tank security budgets.
    jacquesm2 days ago
    I was actually hoping for far more strict enforcement but so far they're taking it relatively easy.
    pjmlp2 days ago
    Indeed, however better slowly than nothing at all.
    jacquesm3 days ago
    > I propose that we start taking the appropriate amount of professional responsibility.
    I agree. For me that means: software engineering should start taking the same attitude to writing software that structural engineers bring to the table when they talk about bridges, buildings and other structures that will have people's lives depending on them. I'm not sure how we're going to make rings out of bits but we need to realize - continuously - that the price of failure is often paid in blood, or in the best case with financial loss and usually not by us. And in turn we should be enabled to impose that same ethic on management, because more often than not that's the root cause of the problem.
    > That includes being honest about the actual costs of software when you don’t YOLO the details.
    Does that include development cost?
    Maintenance costs?
    Or just secondary costs?
    Why the focus on costs?
    > Zero UB is table stakes now - it didn’t use to be, but we don’t live in that world anymore.
    This is because 'Rust and C# exist'? Or is it because Java, Erlang, Visual Basic, Lisp etc exist?
    > It’s totally fine to use C or whatever language for it, but you are absolutely kidding yourself if you think the cost is less than at least an order of magnitude higher than the equivalent code written in Rust, C#, or any other language that helps you avoid these bugs.
    We were talking about responsibility first, and that goes well beyond just measuring 'cost'. The mistake in bringing cost into it is that cost is a business concept that is used to justify picking a particular technology over another. And just like security is an expense that doesn't show anything on the balance sheet if it works besides that it cost money the same goes for picking a programming language eco-system.
    So I think focusing on cost is a mistake. That just allows the bookkeeper to make the call and that call will often be the wrong one.
    > Rust even lets you get there at zero performance cost, so we’re down to petty squabbles about syntax or culture - not serious.
    The debate goes a lot further than that. You have millions of people that are writing software every day that are not familiar with Rust. To get them to pick a managed language over what they are used to is going to take a lot of convincing.
    It starts of with ethics, and I don't think it should start off with picking a favorite language. You educate, show by example and you deliver at or below the same cost that those other eco-systems do and then you slowly eat the world because your projects are delivered on-time, with provably lower real world defects and hopefully at a lower cost.
    And then I really couldn't care what language was picked, in the rust world that translates into 'anything but C' because that is perceived to be the enemy somehow, which is strange because there are many alternatives to rust that are perfectly suitable, have much higher mind share already.
    C is - even today - at 10x the popularity that rust is, it will take a massive amount of resources to switch those people over, and likely it will take more than one generation. In the meantime all of the C code in the world will have to be maintained, which means there is massive job security for people learning C. For people learning rust to the exclusion of learning C that situation is far worse. This needs to be solved.
    These are not 'petty squabbles' about syntax or culture. They are the harsh reality of the software development world at large, which has seen massive projects deployed at scale developed with those really bad languages full of undefined behavior (well, that's at least one thing that Assembly Language has going for it, as long as the CPU does what it says in the book undefined behavior doesn't exist). People are going to point at that and say 'good enough'. And they see all those memory overflows, CVEs etc as a given, and they realize that in spite of all of those the main vector for security issues is people, and configuration mistakes not so much the software itself.
    This is not ideal, obviously, but C, like any bad habit, is very hard to dislodge if your main argument is 'you should drop this tool because mine is better'. Then you need to show that your tool is better, so much better that it negates the cost to switch. And that's a very tall order, for any programming language, much more so for one that is struggling for adoption in the first place.
    simonask3 days ago
    Cost is a useful metric because it reflects a number of relevant things: Time to develop, effort to maintain - yes, but also people turnover, required expertise levels, satisfaction, and so on. Whether or not you like it, you have to care about cost if you want to make rational decisions. I'm not talking about assigning a Euro/Dollar/Yuan value to each hour spent on a project, but you need a rough idea about the size of the time and energy investment you are making when starting a project.
    > This is because 'Rust and C# exist'? Or is it because Java, Erlang, Visual Basic, Lisp etc exist?
    Things have changed for three important reasons: (1) C/C++ compilers have evolved, and UB is significantly more catastrophic than it was in the 90s and early 00s. (2) As societies digitize, the stakes are higher than even - leaking personal data has huge legal and moral consequences, and system outages can have business-killing financial consequences. (3) There are actual, viable alternatives - GC is no longer a requirement for memory safety.
    > To get them to pick a managed language over what they are used to is going to take a lot of convincing.
    Perhaps you didn't mean to say so, but Rust is not a managed language (that's a .NET term referring to C#, F#, etc.).
    Me and other Rust users are obviously trying to convince even more people to use the language, and that's because we are having a great time over here. It's a very pleasant language with a pleasant community and a high level of technical expertise, and it allows me to get significantly closer to living up to my own ideals. I'm not making a moral argument here, trying to say that you or anyone is a bad person for not using Rust, but I am making a moral argument saying that denying the huge cost and risk associated with developing software in C and C++ is bullshit.
    > And then I really couldn't care what language was picked, in the rust world that translates into 'anything but C' because that is perceived to be the enemy somehow, which is strange because there are many alternatives to rust that are perfectly suitable, have much higher mind share already.
    The point here is that, until Rust came along, you had the choice between wildly risky (but fast) C and C++ code, or completely safe (but slow) garbage collected languages with heavy runtimes and significant deployment challenges.
    C is certainly not "the enemy" - I never said that, and I wouldn't. But that old world is gone. The excuse of picking risky, problem-riddled languages that we know are associated with extreme costs for reasons of performance no longer has any technical merit. There can be other reasons, but this isn't it.
    > C is - even today - at 10x the popularity that rust is, it will take a massive amount of resources to switch those people over [...]
    It's insane to me that anyone would limit themselves to a single language. Every competent programmer I know knows at least a handful. Why are we worried about this? I'm a decent C programmer, and a very good C++ programmer - better at both because I'm also fairly good at Rust.
    > And they see all those memory overflows, CVEs etc as a given, and they realize that in spite of all of those the main vector for security issues is people, and configuration mistakes not so much the software itself.
    "Pobody's nerfect." I'm sorry, I really dislike this attitude. We can't let the fact that security is hard, or that perfection is unattainable, be an excuse to deliver more crap.
    > This is not ideal, obviously, but C, like any bad habit, is very hard to dislodge if your main argument is 'you should drop this tool because mine is better'
    Again, that's not my argument. My argument is that you should be honest about what the actual costs, or alternatively the actual quality.
    pjmlp2 days ago
    > The point here is that, until Rust came along, you had the choice between wildly risky (but fast) C and C++ code, or completely safe (but slow) garbage collected languages with heavy runtimes and significant deployment challenges.
    Not really, I have been mostly coding in managed languages for the last couple of decades, and this has been not really true for quite some time.
    Yes if we go down language benchmark games, they won't win every little micro benchmark, however for like 99% of commercial use cases, what they deliver is fast enough for project requirements in execution time, and hardware resources.
    Now where they fail is in human perception and urban myths, of where they are suitable to be adopted.
    Languages like Rust overcome this, with their type system approach to resource management, the naysayers have run out of excuses.
    simonask2 days ago
    I think you are pointing out that garbage collected languages can be very fast, right? I agree about that, but it does fundamentally comes with some very big caveats.
    There's a huge number of use cases that are perfectly served by GC languages, even where performance matters, but there's also a huge number that benefit from the extra boost and significantly lower memory usage of a compiled language.
    pjmlp2 days ago
    There are plenty of compiled languages with GC, value types and low level programming capabilities, including playing with pointers C style.
    D, C#, Nim, Swift, Go for mainstream examples.
    If we dive into less successful attempts from the past,
    Cedar, Modula-2+, Modula-3, Oberon, Oberon-2, Active Oberon, Component Pascal, Oberon-07, Spec#, System C# among plenty others that are probably listed on ACM SIGPLAN list of papers.
    As for some commercial examples,
    https://www.withsecure.com/en/solutions/innovative-security-...
    https://dlang.org/blog/2018/12/04/interview-liran-zvibel-of-...
    https://www.wildernesslabs.co/
    https://www.astrobe.com/boards.htm
    lelanthrana day ago
    > It's a very pleasant language with a pleasant community
    I'm sure.
    > bullshit.
    > But that old world is gone.
    > Every competent programmer
    > an excuse to deliver more crap.
    Yeah. Very pleasant indeed.
    Some serious cognitive dissonance is in your post. You claim how you're part of a community that so damn pleasant, but you're out throwing shade...
    jacquesm3 days ago
    > Cost is a useful metric because it reflects a number of relevant things: Time to develop, effort to maintain - yes, but also people turnover, required expertise levels, satisfaction, and so on. Whether or not you like it, you have to care about cost if you want to make rational decisions. I'm not talking about assigning a Euro/Dollar/Yuan value to each hour spent on a project, but you need a rough idea about the size of the time and energy investment you are making when starting a project.
    You are missing the cost to switch and that's a massive one and the one that I think most parties are using to decide whether or not to stick with what they know or to try something that is new to them. If you have a team of 50 embedded C++ developers and a deadline 'let's use rust' is a gamble very few managers will make.
    > Things have changed for three important reasons: (1) C/C++ compilers have evolved, and UB is significantly more catastrophic than it was in the 90s and early 00s.
    That depends on what industry you are looking at. For instance, in aviation the cost of undefined behavior, crashing software or wrong calculations was always that high. The difference is that in that industry (and a handful of others) there is enough budget to do it right resulting in far fewer in production issues than what we have come to accept in the 'always online, auto-update' world. That whole attitude is as much or more to blame for this than any particular language.
    > (2) As societies digitize, the stakes are higher than even - leaking personal data has huge legal and moral consequences, and system outages can have business-killing financial consequences.
    Show me the names of the businesses that have died because of data leaks or UB. See, the problem is that for those businesses it usually is just a speedbump. They don't care and no matter what the size of the breach the consequences are usually minor.
    The employee sticking a USB drive found on the street into their laptop causing a cryptolocker incident is a much more concrete problem.
    > (3) There are actual, viable alternatives - GC is no longer a requirement for memory safety.
    GC is a convenience, and if you're going to switch languages you might as well pick one that is is more convenient. Java for instance is suitable now for 90% or so of the use cases where C or C++ would be your only option 15 years ago.
    > Perhaps you didn't mean to say so, but Rust is not a managed language (that's a .NET term referring to C#, F#, etc.).
    I know, but Java, Lisp and so on are managed languages, and they offer both safety and convenience. Rust only offers safety, other than that it is only marginally more convenient than C and some would argue less so.
    > Me and other Rust users are obviously trying to convince even more people to use the language, and that's because we are having a great time over here.
    Show, don't tell.
    > It's a very pleasant language with a pleasant community and a high level of technical expertise, and it allows me to get significantly closer to living up to my own ideals.
    Yes, but those are your ideals, which don't necessarily overlap with mine. I don't particularly care about one programming language or another, I've learned enough of them by now to know that all of them have their limitations, their warts, their good bits and their bad bits. I also know that the size of the eco-system is a large function in whether or not I'll be able to get through the day in a productive way.
    > I'm not making a moral argument here, trying to say that you or anyone is a bad person for not using Rust, but I am making a moral argument saying that denying the huge cost and risk associated with developing software in C and C++ is bullshit.
    See, your use of the word 'bullshit' triggers me in a way that you probably do not intend, but it is exactly that attitude that turns me off the language that you would like me to switch to. I don't particularly see that huge cost and risk as applied to myself because I'm not currently writing code that is going to be part of some network service. If I see an embedded shop doing their work in Rust then I'm happy because I can ignore at least one small aspect of the source of bugs in such software. But there are plenty remaining and Rust - no matter what you think - is not a silver bullet for all of the things that can go wrong with low level software. There are other, better alternatives for most of those applications, I'd be more inclined to use Java or Erlang if those are available, and Go if they are not. The speed at which I can develop software is a massive factor in that whole 'cost' evaluation for me.
    > The point here is that, until Rust came along, you had the choice between wildly risky (but fast) C and C++ code, or completely safe (but slow) garbage collected languages with heavy runtimes and significant deployment challenges.
    That just isn't true. There are more languages besides Rust that allow for low level and fast work. Go for instance is an excellent contender. And for long running processes Java is excellent, it is approaching C levels of throughput and excels at networked services.
    > C is certainly not "the enemy" - I never said that, and I wouldn't. But that old world is gone.
    Sorry, but this is not a realistic stance. That old world is not gone, and it is likely here to stay for many more decades. There is so much inertia here in terms of invested capital that you can't just make declarations like these and expect to be taken serious.
    > The excuse of picking risky, problem-riddled languages that we know are associated with extreme costs for reasons of performance no longer has any technical merit. There can be other reasons, but this isn't it.
    Do you realize that this is just your opinion and not a statement of fact?
    > It's insane to me that anyone would limit themselves to a single language.
    'Insane' is another very loaded word. Is this really the kind of language you want to be using while advocating for Rust? There are many programmers that learn one eco system well enough to carve out a career for themselves, and I'm not going to be the one to judge them for that. I'm not one of them, but I can see how it happens and I would definitely not label everybody that's not a polyglot as not entirely right in the head.
    > Every competent programmer I know knows at least a handful.
    I know some very competent programmers that only know one. But they know that one better than I know any of the ones that I'm familiar with. For instance, I know a guy that decided early on that if nobody wants to work on COBOL projects that that is exactly what he's going to do: become a world class expert in COBOL to help maintain all that old stuff. At a price. He's making very good money with that, far more than he'd have ever made by going with something more popular. I know plenty of Java only programmers and a couple that have decided that python is all they need. That's their right and it isn't up to me to look down on them or call them incompetent because they can do something that I apparently can't: focus, and get really good at one thing.
    > Why are we worried about this? I'm a decent C programmer, and a very good C++ programmer - better at both because I'm also fairly good at Rust.
    I would not label myself as 'very good' in any language, I always hope to get better and in spite of doing this for 4+ decades I have never felt that I was 'good enough'.
    > "P[sic]obody's nerfect." I'm sorry, I really dislike this attitude.
    Again, why the antagonism. We have many different classes of issues, and depending on the context some of them may not be a problem at all. I've built stuff in JavaScript because it was the most suitable for the job. But I stay the hell away from node and anything associated with it because I don't consider myself qualified to audit all of the code that could be pulled in through a dependency. And that's a good chunk of this: just know your limitations, and realize that not just 'nobody's perfect' but also that you yourself are not perfect and more than likely to mess up when you go into territory that is unfamiliar to you.
    > We can't let the fact that security is hard, or that perfection is unattainable, be an excuse to deliver more crap.
    Ok. So now you are labeling what other people produce as 'crap'. This isn't helping.
    > Again, that's not my argument. My argument is that you should be honest about what the actual costs, or alternatively the actual quality.
    So I'm not honest. If you are wondering what I meant when I wrote earlier that it is the attitude of some of the Rust advocates that turns me off then here in this thread you have a very nice example of that. All of this pontification and emotionally laden language serves nobody, least of all Rust.
    If you want to win people over try the following:
    - refrain from insulting your target audience
    - respect the fact that your opinions are just that
    - understand that there may be factors outside of your view that are part of the decision making process
    - understand that you may not have a complete understanding of the problem domain or the restrictions involved (is a variation on the previous one)
    - try to not use emotional language to make your point
    - showing beats telling any day of the week
    simonask2 days ago
    I don't have time to respond to all of this, but let me just say that you seem to be under the impression that it is somehow my responsibility to "win you over" and convince you to use Rust. I have stated very clearly that that's not my point. My point is that we should all stop lying about the actual cost of delivering reliable software written in C or C++, and in particular that we as an industry need to stop downplaying the consequences of things like UB.
    Are you personally doing any of those things? I don't know, and I don't think I have accused you of that.
    I'm not here to swoon you by sweet-talking you into using a different programming language. All this "show don't tell" - what are you talking about? Do you need real-world examples of successful Rust projects? There's a myriad of impressive ones, but you are fully capable of googling that.
    I'm not a representative of Rust the language (how could I be), and I reserve the right to call out moral corruption as I see it. I frankly do not need any "well-meaning" advice about how best to advocate for Rust - that's not my job.
    jacquesm2 days ago
    > I'm not a representative of Rust the language (how could I be), and I reserve the right to call out moral corruption as I see it. I frankly do not need any "well-meaning" advice about how best to advocate for Rust - that's not my job.
    Whether you realize it or not, you are an advocate and you are doing a very, very poor job of it.
    simonaska day ago
    If you are making technical decisions based on how strangers on the internet make you feel, I fear there's not much I could have done anyway.
  - 01HNNWZ0MV43FF3 days ago
    > That some people make mistakes because they do not know them is a different problem.
    We can argue til we're blue in the face that people should just not make any mistakes, but history is against us - People will always make mistakes.
    That's why surgeons are supposed to follow checklists and count their sponges in and out
  - Sukera3 days ago
    Could you expand on how these wraparound bugs happen in Rust? As far as I know, integer overflow panics (i.e. aborts) your code when compiled in debug mode, which I think is often used for testing.
  - bringbart3 days ago
    >while unsigned wraparound leads to bugs which are basically impossible to find.
    What?
    unsigned sizes are way easier to check, you just need one invariant:
    if(x < capacity) // good to go
    Always works, regardless how x is calculated and you never have to worry about undefined behavior when computing x. And the same invariant is used for forward and backward loops - some people bring up i >= 0 as a problem with unsigned, but that's because you should use i < n for backward loops as well, The One True Invariant.
- kstenerud3 days ago
  Yup, unsigned math is just nasty.
  Actually, unchecked math on an integer is going to be bad regardless of whether it's signed or unsigned. The difference is that with signed integers, your sanity check is simple and always the same and requires no thought for edge cases: `if(index < 0 || index > max)`. Plus ubsan, as mentioned above.
  My policy is: Always use signed, unless you have a specific reason to use unsigned (such as memory addresses).
  - bringbart3 days ago
    unsigned is easier: 'if(index >= max)' and has fewer edge cases because you don't need to worry about undefined behavior when computing index.
    int_19h2 days ago
    Just because it's UB doesn't mean it's not a problem, though. If you do unsigned arithmetics but don't account for the possibility of wraparound on overflow, the resulting value is well-defined, but it does you no good if you then try to e.g. index using it and cause a buffer overflow.
  - lelanthran3 days ago
    > The difference is that with signed integers, your sanity check is simple and always the same and requires no thought for edge cases: `if(index < 0 || index > max)`
    Wait, what? How is that easier than `if (index > max)`?
    kstenerud2 days ago
    Because if max is a calculated value, it could silently wrap around and leave index to cause a buffer overflow.
    Or if index is counting down, a calculated index could silently wrap around and cause the same issue.
    And if both are calculated and wrap around, you'll have fun debugging spooky action at a distance!
    If both are signed, that won't happen. You probably do have a bug if max or index is calculated to a negative value, but it's likely not an exploitable one.
    17186274402 days ago
    I have no clue what cases you have in mind, can you give some examples? Surely when you have index as unsigned the maximum would be represented unsigned as well?
- accelbred3 days ago
  If using C23, _BitInt allows for integer types without promotion.
- user____name3 days ago
  I just put assertions to check the ranges of all sizes and indices upon function entry, doubles as documentation, and I mostly don't have to worry about signedness as a result.
pizlonator3 days ago
Just compile it with Fil-C
ge963 days ago
Long as you allocate me, it's alright