IMO the main takeaway is that malformed input is not an exceptional state when parsing, and should be treated as a first-class citizen. Everything else is yak shaving over how you want to handle the (status, validObject) tuple coming from the parser.
These days compiler developers implement accepted standard features pretty fast.
if (sscanf(user_input, "%4u-%2u-%2u", &year, &month, &day) != 3) {
// return an error
}
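For context, here is a minimal self-contained sketch of the same check wrapped in a function. The `Date` struct and the `parse_date` name are hypothetical, and only the number of converted fields is checked, not whether the values make sense as a calendar date.

```cpp
#include <cstdio>
#include <optional>

// Hypothetical plain aggregate for the three scanned fields.
struct Date {
    unsigned year;
    unsigned month;
    unsigned day;
};

// Returns a Date only when sscanf converted all three fields.
// Assumption: only the field count is checked; ranges (month 1-12,
// day 1-31, ...) are not validated in this sketch.
std::optional<Date> parse_date(const char* user_input) {
    Date d{};
    if (std::sscanf(user_input, "%4u-%2u-%2u", &d.year, &d.month, &d.day) != 3) {
        return std::nullopt;  // return an error
    }
    return d;
}
```

A caller would test the returned optional before touching any of the fields.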
This still does not catch trailing garbage, but you could check for that as well:
if (sscanf(user_input, "%4u-%2u-%2u%c", &year, &month, &day, &dummy) != 3) {
// return an error
}
The result would be 4 if there was at least one trailing character. Too bad there is still no std::scan() companion to C++23's std::print().
Consider a hypothetical Goose type. We can express any Goose usefully as output and, conveniently, some potential inputs could be read as a Goose successfully, though most arbitrary strings cannot be understood as a Goose.
Providing std::print for Goose is simple: we've got a variable (or maybe a constant) of type Goose, and we just emit the correct sequence of symbols. It's annoying to actually write all the boilerplate in C++23, but that's mechanical; it's not tricky to do, just very boring (and so maybe C++26 makes that easier via reflection).
But how could std::scan for Goose work? We need a Goose variable to potentially store the Goose if we read one, but how can we make a default Goose? No, each Goose is unique and there is no substitute, this can't work.
The std::scan idea seems attractive for simple, almost untyped input (strings, integers, that sort of thing), but the whole point of "Parse, don't validate" is that you probably want to parse email addresses and ISBNs and ISO dates; you don't want a string, another string and a third string.
Rust's FromStr trait is more appropriate. Given a type implements FromStr we can parse any string to (maybe) get an instance of that type, but we don't need an "empty" instance first because we're doing the construction when we call the function.
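A rough C++ analogue of that shape, as a sketch only: a function that takes a string and maybe returns a Goose, so no "empty" Goose ever has to exist. The `Goose` members, the `parse_goose` name and the `goose:<name>` input format are all made up for illustration.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical Goose: there is no default constructor, so there is
// no "empty" Goose to hand to a scanf-style out-parameter.
class Goose {
public:
    Goose() = delete;
    // Public here only to keep the sketch short.
    explicit Goose(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Made-up input format for illustration: "goose:<non-empty name>".
// The Goose is constructed only once parsing has succeeded.
std::optional<Goose> parse_goose(std::string_view input) {
    constexpr std::string_view prefix = "goose:";
    if (input.substr(0, prefix.size()) != prefix) {
        return std::nullopt;
    }
    input.remove_prefix(prefix.size());
    if (input.empty()) {
        return std::nullopt;
    }
    return Goose(std::string(input));
}
```

The construction happens inside the call, which is the property FromStr has and a reference-taking scan function lacks.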
auto [value, text, goose] = std::scan<int, std::string, Goose>(input, "{} {} {}");
A halfway solution would be to have the hypothetical std::scan() take references to std::optional<>s or std::expected<>s:
std::optional<int> value;
std::optional<std::string> text;
std::optional<Goose> goose;
/* auto result = */ std::scan(input, "{} {} {}", value, text, goose);
The latter would be type safe, close to how scanf() works, but less satisfying from a functional programming standpoint.
Orthogonal to that, adding support for scanning a Goose would be just like how you add a formatter for it, and would be quite similar to a Rust trait. One could imagine having to define something like this:
template<>
struct std::scanner<Goose> {
constexpr auto parse(std::format_parse_context& ctx) {…}
auto scan(std::format_context& ctx) const -> std::optional<Goose> {…}
};
// There are a few ways to let API callers bring their own
// memory, as they would in a no-malloc environment and this
// stack-friendly c'tor is a stand-in for that.
There's just something about this comment that doesn't feel right. I've seen these kinds of phrasings in LLM output before but I'm not sure exactly how to describe them.
> Use your language’s type system to parse unstructured inputs.
We don’t use the type system to parse. We use the type system to provide evidence (also called a proof or a witness) that parsing was successful, and we rely on the language’s access control facilities (public/private) and the soundness of its type system to prevent fabrication of false evidence.
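A minimal sketch of that idea in C++, assuming a hypothetical `EmailAddress` type and a deliberately simplistic acceptance rule (exactly one '@' with something on each side, which is not real email syntax). The private constructor is the access-control part: outside code cannot fabricate an `EmailAddress` without going through `parse_email`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

class EmailAddress {
public:
    const std::string& str() const { return value_; }

    // The only way to obtain a fresh EmailAddress from outside the class.
    friend std::optional<EmailAddress> parse_email(std::string_view text);

private:
    // Private: holding an EmailAddress is evidence that parsing succeeded.
    explicit EmailAddress(std::string v) : value_(std::move(v)) {}
    std::string value_;
};

// Simplistic rule, assumed for illustration only: exactly one '@'
// with at least one character on each side.
std::optional<EmailAddress> parse_email(std::string_view text) {
    const auto at = text.find('@');
    if (at == std::string_view::npos || at == 0 || at + 1 == text.size()) {
        return std::nullopt;
    }
    if (text.find('@', at + 1) != std::string_view::npos) {
        return std::nullopt;
    }
    return EmailAddress(std::string(text));
}

// Downstream code takes the type, so it never re-checks the string:
// the '@' is known to be present.
std::size_t domain_length(const EmailAddress& email) {
    return email.str().size() - email.str().find('@') - 1;
}
```

Copies remain allowed here, since a copy of valid evidence is still valid; only construction from a raw string is restricted.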
// There are a few ways to let API callers bring their own
// memory, as they would in a no-malloc environment and this
// stack-friendly c'tor is a stand-in for that.
static Birthdate epoch() { return Birthdate(1900, 1, 1); }
Regardless of how they might have used LLMs, I tend to have an issue with this kind of complaint, given the C++ example code in the book Design Patterns: Elements of Reusable Object-Oriented Software, released in 1994, 2 years before Java was made public.
Or the examples from "Using the Booch Method: A Rational Approach" or "Designing Object Oriented C++ Applications Using The Booch Method".
Additionally, there are enough framework examples, starting with MacApp in 1989, Turbo Vision in 1990, OWL in 1991, MFC in 1992, ...
Somehow a C++ style that was prevalent in the industry between 1990 and 1996, that I bet plenty of devs still have to maintain in 2026, has become "Java in C++".
A class with a passel of static member functions is Java code. It is not in any way idiomatic C++, which has had namespace-level ("free") functions since it was invented as C-with-classes many decades ago. Using classes holding a whole lot of static member functions is strongly frowned on in the professional C++ community.
There's not much mystery about that - Java took that approach and ran with it, and now has much greater mindshare than C++.
Also, the mid-90s were before most software developers working today were born, I suspect. They'd have to go find a graybeard and ask them to tell them tales of yore, to find out about any of this.
What about everyone born before 1900?
But the pattern applies regardless of the validation logic.
It's just a toy example not a production ready birthday validation library.
* Return pointer-or-null.
* Choose "invalid" sentinel values and then use birthdate_is_valid(...) to check validity.
* Add an is_valid bool field (or even an error enum like in the C++23 example).
* Add an out field in the constructor function for the error code (similar to how ObjC does things).
Pointer-or-NULL doesn't work, because all pointers are nullable in C; you can always have a Foo* (NULL) that doesn't actually point to a valid Foo.
Invalid sentinel values are definitionally values of a particular type that are invalid. Same with an is_valid field.
An out field in the constructor means that whatever you actually return in the case of an error is going to be a well-typed Foo that's invalid.
Sure, Optional is more elegant, but the end result is the same: Now none of the other code needs to validate; it's already been verified valid at all points where a parse error could have occurred.
C may not be an easy language, but with the right tooling you can make code safer, and make idioms like parse-don't-validate possible.
All four of your examples are validate.
Know any languages that are worse than C at this?
Being abstracted by code you just wrote is quite a painful experience, yes.
Parse, don't validate was written around Haskell!
This is extremely natural to do in a language like Haskell or Rust. And incredibly unnatural to do in C++ for instance.
Tl;dr: there's nothing extra that functional or OO programming give you here. Both allow you to represent the problem in a properly typed fashion. Why would you represent an email as a string unless you are a) deeply inexperienced or b) have some really good reason to drop all the benefits of a strongly typed language?