A valid choice; one might also find re2c[1] or Ragel[2] + Lemon[3] a friendlier, if less well-established, implementation of essentially the same theory. Moving further away, there’s also ANTLR[4] or spending a week on a hand-coded lexer and parser.
> CPU’s parallelism is not utilized in the same way as GPU since different threads are not synchronized at all, there is simply nothing like Warp or Wavefront. [...] SIMD optimization would be a fairly bad fit for shaders since there are no four instances of shader executions at the same time.
See, however, ISPC[5,6] for how attempting to use SIMD (also known as “90% of your CPU’s compute”) does yield wavefronts, etc. (If I’m reading between the lines of the second reference correctly, the work on it paved the way for LLVM-based shader-compiler backends in AMD and Intel GPU drivers.)
[2] https://www.colm.net/open-source/ragel/
But none of it was surprising. The original RenderMan shading language was "vectorized" and it even used SIMD instructions on modern processors to run the "interpreter loops". That is, a single "color add" in RSL might have looked like:
  for (int i = 0; i < grid_points; i++) {
    out_color[i] += foo;
  }
and the inner part there could use vector instructions. That just isn't nearly enough to get useful wins.
The point of ISPC et al. was to give people a CUDA-like easy mode for "trust me, just vectorize the whole thing and deal with the masking for me". It goes beyond a hardcoded shading language (an easier target!), though it didn't reach as far as CUDA with complex structs / C++ capabilities.
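To make "deal with the masking for me" concrete, here is a rough scalar sketch in C of how an SPMD-style compiler might handle a per-lane if/else: both sides are evaluated for the whole gang and a mask picks which result each lane keeps. The names GANG and gang_select are made up for illustration, and the width of 8 is only an assumption (think 8 float lanes of AVX); this is not ISPC's actual code generation.

```c
#define GANG 8  /* hypothetical gang width, e.g. 8 float lanes */

/* Per-lane equivalent of:
 *     if (x < 0) y = -x; else y = 2 * x;
 * There is no per-lane branch: every lane computes both sides,
 * then a mask selects which value is kept (a "blend"). */
static void gang_select(const float x[GANG], float y[GANG])
{
    int   mask[GANG];
    float then_v[GANG];
    float else_v[GANG];

    /* Evaluate the condition for all lanes at once. */
    for (int lane = 0; lane < GANG; lane++)
        mask[lane] = x[lane] < 0.0f;

    /* Run both sides of the branch over the full gang. */
    for (int lane = 0; lane < GANG; lane++) {
        then_v[lane] = -x[lane];
        else_v[lane] = 2.0f * x[lane];
    }

    /* Masked select: each lane keeps the side its condition chose. */
    for (int lane = 0; lane < GANG; lane++)
        y[lane] = mask[lane] ? then_v[lane] : else_v[lane];
}
```

Each of the three loops is trivially vectorizable, which is the whole trick: the divergent control flow has been turned into straight-line data flow plus a mask. The cost is that every lane pays for both sides of the branch, which is the same trade a warp or wavefront makes on a GPU.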