I found that there are not enough good teaching materials on type checkers -- e.g. the second edition of the Dragon Book lacks a type checker, which is a glaring hole IMO - https://news.ycombinator.com/item?id=38270753
Also, teaching material tends to have a bias toward type inference and the Hindley-Milner algorithm, which are NOT used by the most commonly used languages
So I appreciate this, but one thing in this code that I find (arguably) confusing is the use of visitors. e.g. for this part, I had to go look up what this method does in Python:
# Default so every expr returns a Type.
def generic_visit(self, node):
super().generic_visit(node)
if isinstance(node, ast.expr):
return ANY
Also, the main() calls visit(), but the visitor methods ALSO call visit(), which obscures the control flow IMO. Personally, if I need to use a visitor, I like there to just be a single pass---
In contrast, Essentials of Compilation was released 1 or 2 years ago, in Racket and in Python. And the Python version uses the same typed AST module.
https://www.amazon.com/Essentials-Compilation-Incremental-Ap...
But it uses a more traditional functional style, rather than the OO visitor style:
https://github.com/IUCompilerCourse/python-student-support-c...
So one thing I did was to ask an LLM to translate this code from OO to functional style :-) But I didn't get around to testing it
(I looked at this code a week ago when it appeared on lobste.rs [1], and sent a trivial PR [2])
Pierce’s Types and Programming Languages[1] is excellent. It starts with very little (if you understand basic set-theory notation, you’re probably OK), gets you to a pretty reasonable point, and just generally makes for very pleasant reading. You should probably pick something else if you want a hands-on introduction with an immediate payoff, but then you probably wouldn’t pick the Dragon Book, either.
Right now I think Siek's book is better for what I want to do, though admittedly I didn't get that far into it, because my type checking project is way on the back burner
I would like to see any type checkers that people wrote after reading TAPL!
Edit: This project (best fun I’ve had programming in a long while) is what got me sharing Eli Bendersky’s Unification post a couple of weeks back https://news.ycombinator.com/item?id=44938156
Your language will have a number of phases/passes to carry out. Let's say LambdaLifting, TypeChecking and Inlining.
All the code for lambda lifting belongs in one module, all the code for type-checking in another module, etc.
If you instead use visitor pattern, you will be looking at all the code related to Variable, Function, Literal in those files respectively.
So when you're working on Function.typecheck(), it will sit in source code just under Function.lambdalift() and just above Function.inline() - things which you don't want to consider together. Meanwhile, you'll need to switch between source files to work on Variable.typecheck() and Literal.typecheck().
I've never organized visitor pattern code that way. Usually it's something like:
TypeChecker
visitFunction
visitVariable
visitLiteral
PrettyPrinter
visitFunction
visitVariable
visitLiteral
So related functions (across the types you're visiting) are kept together, you're not revisiting the Function module to add a new visitor there. That would almost defeat the purpose of the pattern.https://en.wikipedia.org/wiki/Visitor_pattern - See the UML diagram here.
If you search for pop(), you can see that
self.expected_ret.append(ret)
self.generic_visit(n)
self.expected_ret.pop()
and self.push(narrows_true); [self.visit(s) for s in n.body]; self.pop()
self.push(narrows_false); [self.visit(s) for s in n.orelse]; self.pop()
In the functional style, you just pass a param using the stack, rather than using an explicit stack.It's not so bad here, but with a big enough language, and more complicated algorithms, the mutable member variables basically become "mutable globals".
And if you re-call visit() at arbitrary depths, IMO the algorithm gets obscured.
---
That said, I agreed here that visitors are useful when you need to say traverse all string literals in an AST, at arbitrary depths: https://lobste.rs/s/jdgjjt/visitor_pattern_considered_pointl...
---
A sign that this issue isn't settled is that two of the more complex type checkers make opposite decisions
- MyPy uses visitors extensively - https://github.com/python/mypy/tree/master/mypy
- TypeScript mostly uses switch/case functions - https://github.com/microsoft/TypeScript/blob/main/src/compil...
I'd be interested in analysis of why that is, but I suspect it's mainly style
Python is probably the apex of the "slow + doesn't work without a magic environment" problem
EDIT: escaped censorship