Ugly (and not performant if in a hot path) but it works.
In my experience, the worst part of the C standard library is not its existence, but the fact that so many developers insist on slavishly using it directly, instead of safer wrappers.
Edit: https://doc.rust-lang.org/src/core/num/mod.rs.html#1537
Interesting! It boils down to this:
pub const fn from_ascii_radix(src: &[u8], radix: u32) -> Result<u32, ParseIntError> {
    use self::IntErrorKind::*;
    use self::ParseIntError as PIE;
    // guard: radix must be 2..=36
    if 2 > radix || radix > 36 {
        from_ascii_radix_panic(radix);
    }
    if src.is_empty() {
        return Err(PIE { kind: Empty });
    }
    // Strip leading '+' or '-', detect sign
    // (a bare '+' or '-' with nothing after it is an error)
    // accumulate digits, checking for overflow
    Ok(result)
}
But it's not hard at all. It's not even full of so many small issues that you can't handle the load, the way dates are. It's just annoying as hell.
The problem is exclusive to C and C++. It's created by the several rounds of standardization of broken behavior.
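/* needs <limits.h> for UINT_MAX; ERROR is whatever sentinel you pick */
unsigned int ret = 0;
for (size_t i = 0; characters[i] != '\0'; i++)
{
    /* '0' is 48 in ASCII; anything outside '0'..'9' is rejected */
    if (characters[i] >= '0' && characters[i] <= '9')
    {
        unsigned int digit = (unsigned int)(characters[i] - '0');
        /* bail out instead of silently wrapping on overflow */
        if (ret > (UINT_MAX - digit) / 10)
        {
            return ERROR;
        }
        ret = ret * 10 + digit;
    }
    else
    {
        return ERROR;
    }
}
return ret;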
Adjust until it actually works, but you get the picture.
The author admits you can parse signed integers in their second example, but for unsigned, they don't seem to like that unsigned parsing will accept negative numbers and then automatically wrap them to their unsigned equivalents, nor do they like that C number parsing often bails with best effort on non-numeric trailing data rather than flagging it as an error, nor do they like that ULONG_MAX is used as a sentinel value by sscanf.
I'm not sure what they mean by "output raw" vs "output"
$ cat t.c
#include <stdlib.h>
#include <math.h>
#include <stdio.h>
int main(int argc, char **argv){
    char * enda = NULL;
    unsigned long long a = strtoull("-18446744073709551614", &enda, 10);
    printf("in = -18446744073709551614, out = %llu\n", a);
    char * endb = NULL;
    unsigned long long b = strtoull("-18446744073709551615", &endb, 10);
    printf("in = -18446744073709551615, out = %llu\n", b);
    return 0;
}
$ gcc t.c
$ ./a.out
in = -18446744073709551614, out = 2
in = -18446744073709551615, out = 1
$
I get their "output raw" value. I don't know what their "output" value is coming from.
I don't see anywhere they describe what they are representing in the raw vs not columns.
That's right. I don't like asking it to parse the number contained inside a string, and getting a different number as a result.
That's just simply not the right answer.
> I'm not sure what they mean by "output raw" vs "output"
I can see how that's very unclear. Changed now to "Readable".
As you can read at https://en.wikipedia.org/wiki/Errno.h, errno is barely used by the C standard (though it is defined there). It is rather POSIX that uses errno extensively. The WinAPI functions, for example, use a much more sensible way to report errors (and don't make use of errno).
For strtoul and friends, maybe? 7.24.1 is pretty dense, but the key parts are "the expected form of the subject sequence is a sequence of letters and digits representing an integer with the radix specified by base, optionally preceded by a plus or minus sign […] If the correct value is outside the range of representable values […] ULONG_MAX […] is returned".
So the "expected form" allows a minus sign, but then it's clearly "outside the range of representable values" for strtoul to try parsing a negative value. So maybe it should return ULONG_MAX on those.
So arguably a minus sign present could already be treated as an error, and still be standard compliant. Unless I'm misreading.
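For what it's worth, a minimal sketch of a wrapper that treats the minus sign as an error, whatever the standard allows. The name parse_ulong_strict is made up for illustration (it is not a libc function), and it assumes decimal input only. It also rejects leading whitespace and trailing junk, and uses errno to tell real overflow apart from a legitimate ULONG_MAX.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* parse_ulong_strict is a hypothetical helper name, not a libc function.
 * It accepts only a non-empty run of decimal digits and nothing else. */
static bool parse_ulong_strict(const char *s, unsigned long *out)
{
    /* strtoul would skip leading whitespace and accept '+' or '-',
     * so insist that the very first character is a digit. */
    if (s == NULL || !isdigit((unsigned char)s[0])) {
        return false;
    }

    char *end = NULL;
    errno = 0;
    unsigned long value = strtoul(s, &end, 10);

    /* ERANGE distinguishes real overflow from a legitimate ULONG_MAX. */
    if (errno == ERANGE) {
        return false;
    }
    /* Trailing non-digit data is an error rather than being ignored. */
    if (*end != '\0') {
        return false;
    }

    *out = value;
    return true;
}
```

With something like this, "-18446744073709551615" is rejected outright instead of coming back as 1.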
Ok, having a method to do that for you would be nice, but the post reads like it's an issue that the std library doesn't provide you with a method behaving exactly as you want.
That should be opt-in via a flag, if it needs to be supported at all. Unix file permissions are the only deliberate use of octal I've ever seen.
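Assuming this is about the leading-zero octal detection you get with base 0, here is a tiny sketch of the difference. The two helper names are made up; the only real API used is strtoul().

```c
#include <stdlib.h>

/* Hypothetical helpers contrasting auto-detected base with explicit decimal. */

/* Base 0 lets strtoul pick the radix: "0x" prefix means hex, leading "0" means octal. */
static unsigned long parse_auto_base(const char *s)
{
    return strtoul(s, NULL, 0);
}

/* Base 10 treats a leading zero as just another digit. */
static unsigned long parse_decimal(const char *s)
{
    return strtoul(s, NULL, 10);
}
```

So "010" comes back as 8 from the first helper and 10 from the second, which is the surprise being complained about.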
Yes, the standard library is bad. This is by far the worst part of the C legacy. But it is not that hard to write your own.
String functions like this are not difficult at all, and you can use better naming and semantics, write faster code etc.
C is not the C standard library, ffs.
The distinction between a language and its standard library gets blurry even in theory, and in practice they're nearly inseparable. If a language's standard library has four ways of doing almost the same thing, and they're all fundamentally broken, that's a problem.
Complete BS in my opinion.
Bonus points for having bespoke linting rules to point out the use of known “bad” functions.
In one old project we went through and replaced all instances of sprintf() with snprintf() or equivalent. Once we were happy that we’d got every occurrence we could then add lint rules to flag up any new use of sprintf() so that devs didn’t introduce new possible problems into the code.
(Obviously you can still introduce plenty of problems with snprintf() but we learned to give that more scrutiny.)
Similar to how strlcpy() is not a slam dunk fix to the strcpy() problem.
If someone uses sprintf() you have to go faffing around to check whether they've thought about the destination buffer size. The size of the structure may be buried far away through several layers of other APIs/etc.
Using snprintf() doesn't solve this in any way, but checking whether the new use of snprintf() checks the return value is relatively simple. Again, there's still no guarantee that there aren't other problems with snprintf() but, in our experience, once people were forced to use it over sprintf() and had things checked in PR reviews, the number of instances of misuse dropped dramatically.
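Roughly the shape a reviewer would look for, as a sketch rather than anything from that project. The function format_greeting and its format string are invented for the example; the point is only the two checks on the return value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* format_greeting is a hypothetical helper used only for illustration. */
static bool format_greeting(char *dst, size_t dst_size, const char *name)
{
    int n = snprintf(dst, dst_size, "hello, %s", name);

    /* A negative return means an encoding or output error. */
    if (n < 0) {
        return false;
    }
    /* A return >= dst_size means the output did not fit and was truncated. */
    if ((size_t)n >= dst_size) {
        return false;
    }
    return true;
}
```

The buffer is still NUL terminated on truncation (for a non-zero size), so the failure mode is a short string rather than a stack smash, but the caller at least gets told.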
It wasn't the switch of functions that reduced the number of problems we saw, but the outright banning of the known footgun `sprintf()` and the careful auditing and replacement of it with `snprintf()` that served as a whole load of reference copies for how to use it. We spread the work of replacing `sprintf()` around the team so that everyone got to do some of the switches and everyone got to review the changes. And we found a whole load of possible problems (most of which were very unlikely to ever lead to a crash or corruption.)
The same would apply if you picked any other known footgun and did similar refactoring/rewrites/auditing/etc.
Anyway, I haven't done C commercially/professionally for about 5 years now. I do miss it though.
There is a hashmap implementation though: https://man7.org/linux/man-pages/man3/hsearch.3.html
(In fact, looking at it again, I assume I'd purposely purged it from my memory given how terrible it is.)
The non-extensible nature is the biggest one. There are plenty of times when the maximum number of elements needed to be stored will not be known in advance. (See the note about hcreate().)
Secondly, the hsearch() implementation requires the keys to be NUL terminated strings since "the same key" is determined using strcmp(). Good luck if you want to use a number, pointer, arbitrary structure or anything else as a key.
Any reasonable hash table implementation would not have either of these limitations.
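To make both limitations concrete, a small sketch against the POSIX hsearch() API. The wrappers table_put and table_get are hypothetical names; hcreate(), hsearch(), hdestroy() and ENTRY are the real interface from <search.h>. Note that the caller has to have called hcreate() with a fixed capacity first, and that the key can only be a char pointer.

```c
#include <search.h>
#include <stddef.h>

/* Hypothetical helper: store one int under a string key.
 * The single global table must already exist via hcreate(). */
static int table_put(char *key, int *value)
{
    ENTRY item = { .key = key, .data = value };
    /* ENTER inserts if the key is absent; NULL means the table is full. */
    return hsearch(item, ENTER) != NULL;
}

/* Hypothetical helper: look a key up, or return NULL if it is absent. */
static int *table_get(char *key)
{
    ENTRY probe = { .key = key, .data = NULL };
    /* Keys are compared with strcmp(), so only C strings work. */
    ENTRY *found = hsearch(probe, FIND);
    return found != NULL ? found->data : NULL;
}
```

Usage would be hcreate(16), then table_put("one", &x), then hdestroy() at the end. There is one table per process, and no way to grow it after hcreate().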
Maybe I needed to say:
> > like lists/hashmaps/etc which neither C nor the standard libraries provide
... reasonable implementations of.
The criticisms related to UB are not about understanding the target platform and the target compiler's behavior. Undefined Behavior is not the same thing as Implementation-defined Behavior, and lots of folks (including me) would be satisfied with reclassifying chunks of UB as the latter.
The behavior of the target platform isn't really the issue. C23 mandates two's complement for signed integers. Most hardware wraps on overflow, but that literally doesn't matter. The standard says a program exhibiting signed overflow is undefined, period.
In practice, UB rules mean the compiler is free to remove checks for signed overflow/underflow, checks for null pointers, etc. This can and does happen. Man, just a few weeks ago, I had to deal with a crash in a C program that turned out to be due to the compiler removing a null check. That was a painful one.
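A sketch of what that means for overflow checks in particular. The name checked_add is made up for the example. The idea is that the check has to happen before the addition, using only comparisons that cannot themselves overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* checked_add is a hypothetical helper name.
 * The tempting test `a + b < a` (for positive b) only fires after signed
 * overflow has already happened, so an optimizer may assume it is always
 * false and delete it. Comparing against the limits first never overflows. */
static bool checked_add(int a, int b, int *sum)
{
    if (b > 0 && a > INT_MAX - b) {
        return false; /* would overflow upward */
    }
    if (b < 0 && a < INT_MIN - b) {
        return false; /* would overflow downward */
    }
    *sum = a + b;
    return true;
}
```

None of this depends on what the hardware does on overflow, which is the point: the program never executes the overflowing add in the first place.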
Like… edge cases? It's parsing a number! We're not talking about I/O on hard vs soft intr NFS mounts, here. There's a right answer.
strlen(), on valid null terminated strings, doesn't come with caveats like "oh we can't measure strings of length 99".
But sure, C is Turing complete. It is possible to solve any problem a Turing machine can solve.
> understand the target platform and the target compiler’s behavior.
This is neither. This is purely the language.