I think more systematic testing would help. The AI seems pretty good at generating tests and validating them, so we could use it to fill out the places where testing is weak. I think we could do a lot more tests for each compiler pass. For example, we could get the AI to generate tons of programs and validate that the types the compiler infers for each variable are what we expect them to be.
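To make the type-inference idea concrete, here is a rough Python sketch. It assumes a toy language of assignments with literals, variable references, and `+`. Everything in it is hypothetical: `infer_types` only stands in for the real compiler pass, and `gen_expr`, `gen_program`, and `check_inference` are illustrative names. The generator picks each variable's type first and then builds an expression of that type, so the expected answer is known by construction.

```python
import random

# Hypothetical toy language: a program is a list of (name, expr) assignments.
# An expr is a literal (bool, int, float), a variable name (str),
# or a tuple ("+", lhs, rhs).

def infer_types(program):
    """Stand-in for the compiler's inference pass: returns {variable: type name}."""
    env = {}

    def infer(expr):
        if isinstance(expr, bool):  # test bool first: bool is a subclass of int
            return "bool"
        if isinstance(expr, int):
            return "int"
        if isinstance(expr, float):
            return "float"
        if isinstance(expr, str):  # variable reference
            return env[expr]
        _, lhs, rhs = expr
        left, right = infer(lhs), infer(rhs)
        if "bool" in (left, right):
            raise TypeError("'+' is not defined on bool in this toy language")
        return "float" if "float" in (left, right) else "int"

    for name, expr in program:
        env[name] = infer(expr)
    return env

def gen_expr(rng, want, env, depth=2):
    """Build an expression whose type is `want` by construction."""
    choice = rng.random()
    if want != "bool" and depth > 0 and choice < 0.4:
        # int + int is int; a float result needs at least one float operand.
        other = "int" if want == "int" else rng.choice(["int", "float"])
        pair = [gen_expr(rng, want, env, depth - 1),
                gen_expr(rng, other, env, depth - 1)]
        rng.shuffle(pair)
        return ("+", pair[0], pair[1])
    candidates = [v for v, t in env.items() if t == want]
    if candidates and choice < 0.7:
        return rng.choice(candidates)  # reuse an earlier variable of that type
    if want == "bool":
        return rng.choice([True, False])
    if want == "int":
        return rng.randint(-100, 100)
    return rng.uniform(-100.0, 100.0)

def gen_program(rng, n_vars=5):
    """Return a random program plus the type each variable should get."""
    program, expected = [], {}
    for i in range(n_vars):
        want = rng.choice(["int", "float", "bool"])
        expr = gen_expr(rng, want, expected)  # may only use earlier variables
        name = f"v{i}"
        program.append((name, expr))
        expected[name] = want
    return program, expected

def check_inference(n_programs=1000, seed=0):
    """Run the pass on many generated programs and collect any mismatches."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n_programs):
        program, expected = gen_program(rng)
        inferred = infer_types(program)
        if inferred != expected:
            failures.append((program, expected, inferred))
    return failures
```

Calling `check_inference()` returns a list of programs where the inferred types disagree with the expected ones, so an empty list means the pass agreed on every generated program. In a real setup, the AI would presumably write the generator and the expected types for the actual source language, and the real pass would replace `infer_types`.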