If the author's content is all in this 'assume no C background' style, it will be useful to those without previous manual memory management background in contrast to most articles that assume a lot of context.
The thing I really wish C had, and which there is no straightforward workaround for, is postfix casting. When you do `((struct foo*)(COMPLEX_EXPRESSION))->field`, I think it would read a lot better if the `(struct foo*)` cast was on the same side as `->field`. Maybe something like `(COMPLEX_EXPRESSION)@(struct foo*)->field`
some_pointer*.field
It seems like `.` could in theory be made to recursively dereference any number of layers down to the base type. It's not as though `.` has any other possible meaning when used directly against an address ... right?
https://retrocomputing.stackexchange.com/questions/10812/why...
Yes, I know, comptime and such.
What this actually reminds me of is Ada, where you write `value.all` to basically the same effect as Zig's `value.*`. It's as if everything is a struct with itself as a member.
I think what seems weird though is more the "... = .{...}" syntax.
This makes it look as if there was some kind of anonymous object ".{...}" in memory that you're copying from, but there isn't. It's actually just a writing operation and the .{...} itself doesn't represent anything.
Maybe that was also what the author found confusing?
I.e. C's
*ptr
is ptr.*
in Zig.
Edit: Going through this author's website, it seems like a lot of their posts are about rediscovering low-level programming concepts through Zig. Like this article, where they discover you can't compare strings directly, and you have to use memcmp:
https://www.openmymind.net/Switching-On-Strings-In-Zig/
They claim that they blog because they "find that [they] retain things better when I write about them." No problem with that. Just a little odd to see on the hn front page, I suppose.
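For readers who have not hit this yet, the same point can be sketched in C: `==` on two strings compares addresses, so the contents have to go through something like `memcmp`. The helper names `same_bytes`, `same_address`, and `demo_compare` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when both buffers hold the same bytes. */
static bool same_bytes(const char *a, size_t a_len, const char *b, size_t b_len) {
    /* The lengths must match before memcmp says anything useful. */
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

/* Hypothetical helper: compares the pointers themselves, not the contents. */
static bool same_address(const char *a, const char *b) {
    return a == b;
}

void demo_compare(void) {
    char first[] = "zig";   /* two distinct arrays with equal contents */
    char second[] = "zig";
    assert(!same_address(first, second));
    assert(same_bytes(first, strlen(first), second, strlen(second)));
}
```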
I think this is just "discovering". While you (and I) probably discovered these things with assembly or C languages, it's perfectly reasonable or even appropriate for newer generations to have these kinds of experiences with Zig or Rust.
When learning piano you first learn how to play rudiments and then you move up to more complex scores.
Otherwise you end up writing an article about how Ravel's Scarbo is amazing because it involves playing with two hands at the same time.
If learning piano were the example, then everyone would have to, like you said, learn the basics of everything and then build up from there. Modern-day people learn a few different chords and somehow string them together. If you did EE, you would have to learn how the piano works before you started playing it.
Worth remembering most people in programming today start with Javascript / Python or Ruby.
More common with guitar than piano though.
Similarly they are more popular with Javascript than with C
Are there really that many Zig programmers that have never seen C or know what pointers are?
But, to answer your question directly: absolutely. In addition to writing a lot about it, I maintain some popular libraries and lurk in various communities. Let me assure you, beginner memory-related questions come up _all the time_. I'd break them down into three groups:
1 - Young developers who might have a bit of experience in JavaScript or python. Not sure how they're finding their way to Zig. Maybe from HN, maybe to do game development. I think some come to Zig specifically to learn this kind of stuff (I've always believed most programmers should know C. Learning Zig gets you the same fundamentals, and has a lot of QoL stuff).
2 - Hobbyist. Often python developers, often doing embedded stuff. Might be looking to write extensions in Zig (versus having to do it in C).
3 - Old programmers who have been using higher level languages for _decades_ and need a refresher. Hey, that's me!
Also, from learning human languages, it's a well-known lesson that phrasebook-type "this means this" translations (like some here are asking, from Zig to C/Rust) are useful for quick and dirty learning good enough for one trip, but long term learning needs this kind of a direct explanation.
1. It avoids the word (or syntax in this case) getting stuck in a double-indirection state, needing you to mentally translate it from Zig to C to what it actually means every time.
2. It avoids the learner attaching the wrong nuances to the word or syntax feature, based on the translation they're given, when the language they're learning has different nuances. In other words, it helps the learner see it as its own thing, and not be unduly colored by what they already know and find easy to grasp on to (even when it's subtly wrong).
Lots of popular Youtubers such as Primeagen (somebody who easily gets 200/300k+ views per video) have been speaking highly about Zig.
Perhaps some day Zig will have replaced C and beginners will come to Zig having never touched C, and in that context this approach makes sense - after all, you wouldn't litter an introductory article on C with comparisons to Algol. But today, surely the modal Zig beginner already knows enough C that the syntax would better be explained by reference to C.
You aren't always the target.
Sometimes it seems like the change is just to make it different, not better.
The whole point of Zig is to fix C's mistakes. I don't know why they'd repeat this one.
I wish they'd got it right for (de)referencing too though.
It’s not that important though. Many people write two or more languages without mixing the syntax.
p.* and lv.& would be good post-operators, imo. Less going left and parenthesizing.
Ada does the same exact thing, except there you write `p.all` instead.
In any case, while the exact syntax may not be ideal, using a postfix operator for dereferencing is vastly better than prefix in practice due to typical patterns of use. There's a reason why you end up writing () a lot in C code with heavy pointer operations - the things they end up mixed with most often turn out to have the wrong kind of priority and/or precedence more often than not. Things are much simpler when everything is postfix and code reads naturally left-to-right.
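A small C sketch of where those parentheses come from: the postfix operators `[]` and `->` bind tighter than prefix `*`, so a dereference that has to happen first must be wrapped. The struct and function names here are hypothetical.

```c
#include <assert.h>

struct point { int x; int y; };

/* p points to an array of 3 ints; without parentheses, *p[1] would mean *(p[1]). */
static int second_element(int (*p)[3]) {
    return (*p)[1];
}

/* pp points to a pointer to a struct; without parentheses, *pp->y would mean *(pp->y). */
static int y_through_two_levels(struct point **pp) {
    return (*pp)->y;
}

void demo_parens(void) {
    int arr[3] = { 10, 20, 30 };
    struct point pt = { 1, 2 };
    struct point *p = &pt;
    assert(second_element(&arr) == 20);
    assert(y_through_two_levels(&p) == 2);
}
```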
x()->g()->f();
At least one person at Bell Labs historically used that scheme for some graphics program.
All it requires is that the function returns a pointer to a struct which itself contains function pointers.
C also allows the dot form if you really want it:
x().g().f()
Simply by returning structs, and for all compilers that I know of, this ends up using hidden pointers to structs.
Now to make it a bit more usable, one needs a bit of planning so that either of these can be done:
x(&x_out).g(&g_out, x_out).f(&f_out, g_out);
x(&x_out)->g(&g_out, x_out)->f(&f_out, g_out);
(I'd suggest the latter is now the more readable of the two).
Here there is a VFT in each of the xxx_out structures, and the calls simply return the VFT, while the whole abstraction is stored/returned via the out arguments.
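A minimal sketch of the plain `x()->g()->f()` form, assuming a single static object and made-up names (`struct node`, `g_impl`, `f_impl`, `run_chain`). It only shows that the chain compiles and runs in order; it is not a reconstruction of the Bell Labs program or of the out-argument variant.

```c
#include <assert.h>

/* Hypothetical struct: each call returns a pointer to a struct
   that itself holds the function pointers for the next call. */
struct node {
    int calls;                  /* how many links of the chain have run */
    struct node *(*g)(void);
    struct node *(*f)(void);
};

static struct node *g_impl(void);
static struct node *f_impl(void);

static struct node the_node = { 0, g_impl, f_impl };

static struct node *x(void)      { the_node.calls = 1; return &the_node; }
static struct node *g_impl(void) { the_node.calls++;   return &the_node; }
static struct node *f_impl(void) { the_node.calls++;   return &the_node; }

/* Runs the chain left to right and reports how many calls happened. */
int run_chain(void) {
    x()->g()->f();
    return the_node.calls;
}
```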
f(*g(*x)) is just as good as f(g(x*)*)
Hmm, function names go before parentheses, that looks like a good reason for dereferencing to be prefix too, otherwise they end up a long way apart.
But in the back of mind I'm thinking "semantics cause pointless fussy arguments, all code is ugly, let's stop programming in text somehow".
But this is literally not semantics. Semantics is everything of value, this is just syntax.
is it a method call? is it a property/field value? how am i to be indicated that there is an implicit suspend point when you .await?
In the ideal world, we wouldn't have unary prefix operators at all, but unfortunately unary +, -, and NOT are prefix mostly because they were that in math notation and got grandfathered in (bonus points to Smalltalk here for going with consistency here - "not" and "negated" are regular nullary methods there so it's postfix throughout!). However, these are rarely themselves an operand of another unary operator, so you can mostly deal with this by giving postfix higher precedence than prefix in all cases, so at least it's a simple rule. But then for pointer dereference, it is in fact common to have the result of a dereference itself be an operand in the middle of another expression.
So now you have some choices to make. If you make the pointer dereference prefix, then you don't need extra parentheses when applying other prefix operators to it, e.g. -*p or !*p. If you make it postfix, then you don't need extra parentheses when applying other postfix operators to it, e.g. (using Pascal-style ^ for dereferencing) p^[0] or p^.x. Alternatively, you could add special postfix operators that desugar into the combination but avoid those extra parens, which is what C did with -> for field access.
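The trade-off reads roughly like this in C, where dereference is prefix. This is only an illustration with hypothetical names: prefix operators stack on `*p` without parentheses, field access needs them, and `->` is the sugar that hides them.

```c
#include <assert.h>

struct point { int x; };

/* Prefix applied to prefix: no parentheses needed. */
static int negate_target(const int *p)  { return -*p; }
static int target_is_zero(const int *p) { return !*p; }

/* Postfix applied to prefix: the dereference must be parenthesized... */
static int field_long(const struct point *p)  { return (*p).x; }
/* ...unless the combined operator -> is used instead. */
static int field_short(const struct point *p) { return p->x; }

void demo_prefix_postfix(void) {
    int v = 5;
    struct point pt = { 7 };
    assert(negate_target(&v) == -5);
    assert(target_is_zero(&v) == 0);
    assert(field_long(&pt) == field_short(&pt));
}
```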
(Technically, you could also make everything prefix instead, e.g. field access ALGOL 68 style: `month OF birthDate OF person`. But this is very counter-intuitive with indexing, and also makes code completion unusable, so it's not a serious option.)
Although I agree that await could use some precedence like new in javascript:
new types.Date().toJSON()
// fine
await readdir(…).filter(…)
// oops
I’d rather parenthesize what I’m awaiting in complex cases than await just everything that follows.
When a value is dereferenced, having a consistent left-to-right dereferencing makes sense.
for example, in c, without looking up the order of operations how confident are you that you know what is going on in the following:
foo = **bar[10]
(Yes, C programmers will probably hunt you down if you do this, but it does work.)
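For anyone unsure about the `**bar[10]` expression: postfix `[]` binds tighter than prefix `*`, so it groups as `*(*(bar[10]))`. A small sketch, assuming `bar` is an array of pointers to pointers to int (the comment does not say what its type is):

```c
#include <assert.h>

/* Assumption: bar is an array of pointers to pointers to int. */
static int implicit_grouping(int **bar[]) { return **bar[10]; }

/* The same expression with the grouping written out. */
static int explicit_grouping(int **bar[]) { return *(*(bar[10])); }

void demo_grouping(void) {
    int value = 42;
    int *p = &value;
    int **table[11] = { 0 };
    table[10] = &p;             /* only slot 10 is read */
    assert(implicit_grouping(table) == 42);
    assert(explicit_grouping(table) == 42);
}
```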
This syntax is less arbitrary than C's. It draws a syntactic parallel between accessing a single member and accessing "all members". (by using pattern-matching-like syntax)
It makes the language more consistent and one's mental model of it smaller. (Even though I doubt that patterns other than the Kleene star would work)
A parallel with files:
cp dir/a dir/b dir/c /other/dir
In Zig you can cp dir/* /other/dir
In C: cp *dir /other/dir
In Zig, you can pretty much read the types aloud and it makes sense; your brain does not need to peek ahead to parse them. For me it's too late, I got used to the C way, but I still want the better thing to get adopted. No more `int (*(*foo)(int))[5];` nonsense, just T, [N]T, or *T, making intent crystal clear.
I happen to be reading the K&R C book right now and they do in fact tell you this.
I have not read it so that's on me I guess
That said, C's function pointer declaration syntax is indeed awful. But really that's because ANSI took a very hardline "no incompatible changes" tack when adding prototypes to the language, which limited the ways they could express them. That decision is one of the reasons we're still writing C today. Any yahoo can come up with a new language, kids do it all the time. ANSI's job was to add features to the language in which Unix was already written.
This is basically the gist of the article.
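To make the complaint concrete, here is a sketch of a declaration like `int (*(*foo)(int))[5]` next to a typedef spelling of the same type. Reading inside-out, `foo` is a pointer to a function taking an int and returning a pointer to an array of 5 ints. The names `get_table`, `row5`, and `row_getter` are invented for the example.

```c
#include <assert.h>

static int table[5] = { 1, 2, 3, 4, 5 };

/* A function taking int and returning a pointer to an array of 5 int. */
static int (*get_table(int unused))[5] {
    (void)unused;
    return &table;
}

/* Pointer to such a function, written as one declarator. */
static int (*(*foo)(int))[5] = get_table;

/* The same type built up from typedefs. */
typedef int row5[5];
typedef row5 *(*row_getter)(int);
static row_getter foo_typedef = get_table;

int last_via_foo(void)     { return (*foo(0))[4]; }
int last_via_typedef(void) { return (*foo_typedef(0))[4]; }
```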
I’m surprised to see that so many programmers these days don’t know C and basic pointers.
Large parts of the industry don't really work with low(er) level languages.
Even people who went through C/C++ in their formal education/start of their career may have not used it for a long time.
Didn't read, got bored at having pointers explained, even if actively trying to skip it.
I get a mix of: 1- you don't know your audience or are trying to cater to too wide of an audience
2- if someone doesn't know how pointers work, is there any merit in knowing syntax of a novel programming language?
So if you want the veterans to read this, you are going to have to make it less accessible.
Edit: it seems like the whole article is OP discovering pointers but zig is like his first language. Lol.
You aren't always the audience, it's fine.
I would suggest going through C. Otherwise it's like learning C++ without learning C. Zig is similarly a successor to C, (it is phonetically named after it).
That would address your blind spots and let you appreciate Zig's identity
You're making many wrong assumptions at the same time:
- that I've never used C or C++ (both wrong)
- that I don't know pointers (also wrong, nor is it a particularly difficult topic)
- that you need to go through C again to learn Zig. Have you met many people that picked up Zig as their first programming (or first system programming) language to make those statements?
Because I am in both the IRC and discord and there's plenty of people that get proficient in Zig starting from it.
Writing about an advanced concept catering to beginners is a mess. It's like writing about logarithmic calculus and prefacing with an explanation on exponents.
However, thinking long term, are we going to continue to introduce pointers via C to new students? If not, then how? C++ or Zig seem viable options, so this might be a long term proposition.
Inb4 there won't be programmers
First, I don't see a problem with C being a basal part of the curriculum for centuries; we are already at half a century going strong.
Second, if this were to happen, it should happen with a mature language, and decided consciously by seasoned professionals.
Third, I think more than looking at other languages we should look into other types of pointers, like files, hyperlinks, assembly addresses, inodes.
To date C is the standard for pointer-pointer syntax, introducing new standards is fine, but it only makes sense with the standard syntax.
It's the canon of programming. Long live C
Everything else has provenance rules and magic unrepresented metadata around it that makes a pointer more than just an address.
In C you can also do those fun array indexing and pointer arithmetic tricks that require you truly understand the concept of "address plus offset" that basically everything uses.
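A minimal illustration of the "address plus offset" idea, with hypothetical function names: in C, `a[i]` is defined as `*(a + i)`, which is also why the swapped form `i[a]` compiles.

```c
#include <assert.h>
#include <stddef.h>

/* All three spellings read the same element, because a[i] means *(a + i). */
static int by_index(const int *a, size_t i)   { return a[i]; }
static int by_offset(const int *a, size_t i)  { return *(a + i); }
static int by_swapped(const int *a, size_t i) { return i[a]; }

/* Pointer subtraction counts elements, not bytes. */
static ptrdiff_t elements_between(const int *first, const int *last) {
    return last - first;
}

void demo_offsets(void) {
    int a[4] = { 10, 20, 30, 40 };
    assert(by_index(a, 2) == 30);
    assert(by_offset(a, 2) == 30);
    assert(by_swapped(a, 2) == 30);
    assert(elements_between(&a[0], &a[3]) == 3);
}
```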
But even if that's a hallucination, Rust does the hard work of keeping pointers off your mind by a combo of refs, box, etc. and Rust is not a good first language, I don't think, for a variety of reasons.
My ultimate programming curriculum if I had to make one would teach you to program in python, then show you C via Cython or similar. Lots of allocation and free under the hood in Python.
The issue comes when you try casting to pointers. Because of provenance, aliasing rules, and a couple of dragons that built a nest in the C language specification, you could have two pointers to the exact same memory locations, but have the program be undefined if you use the wrong pointer.
Granted, this doesn't stop you from doing things like
foo_t *foo = (foo_t*) 0xDEADBEEF
And in the few occasions where that is something you would reasonably want to do it does more or less what you would expect (unless you forgot that the CPU sticks a transparent cache between you and the memory bus; but not even assembly would save you from that oversight).
In Rust pointer provenance is a well-defined language feature and if you go to the language docs you can read how it interacts with your use of raw pointers and the twin APIs provided for this.
In C the main ISO document just says basically here be dragons. There's an additional TS from 2023 with better explanation, but of course your C compiler even if it implemented all of C23 needn't necessarily implement that TS. Also of course the API is naturally nowhere near as rich as Rust's. C is not a language where pointers have an "addr" method so they also don't need a separate exposure API.
I suspect that in Zig none of this is clearly specified.
Doesn't seem hidden to me