A Basic Just-In-Time Compiler (2015)(nullprogram.com)

105 pointsby ibobeva month ago6 comments

kscarleta month ago
It seems that JIT is overloaded with at least 2 meaning.
The canonical definition of JIT is "compilation during execution of a program". Usually, a program is being interpreted first, then switches to compiled code in the middle of execution. This is not what this article does.
What this article does is sometimes called on-the-fly AOT, or just on-the-fly compilation. I'd prefer not overloading the term "JIT".
- dataflowa month ago
  > Usually, a program is being interpreted first, then switches to compiled code in the middle of execution.
  I agree what they have isn't JIT compilation, but not for that reason. Tiered execution was never a central part of JIT compilation either. It was a fairly new invention in comparison.
  The reason what they describe isn't JIT compilation is IMO fairly boring: it's not compiling the input program in any meaningful way, but simply writing hard-coded logic into executable memory that it already knows the program intended to perform. Sure there's a small degree of freedom based on the particular arithmetic operations being mentioned, but that's... very little. When your compiler already knows the high-level source code logic before it's even read the source code, it's... not a compiler. It's just a dynamic code emitter.
  As to the actual difference between JIT vs. AOT... it may just come down to accounting. That is, on whether you can always exclude the compilation time/cost from the overall program execution time/cost or not. If so, you're compiling ahead of (execution) time. If not, you're compiling during execution time.
  - kscarleta month ago
    > As to the actual difference between JIT vs. AOT... it may just come down to accounting. That is, on whether you can always exclude the compilation time/cost from the overall program execution time/cost or not. If so, you're compiling ahead of (execution) time. If not, you're compiling during execution time.
    Well, this includes what I refer to as "on-the-fly" AOT, like SBCL, CCL, Chez Scheme... Even ECL can be configured to work this way. As I mentioned in another comment, people in those circles do not refer to these as "JIT" at all, instead saying "I wish my implementation was JIT instead of on-the-fly AOT"!
  - ordua month ago
    > it's not compiling the input program in any meaningful way, but simply writing hard-coded logic into executable memory that it already knows the program intended to perform.
    The program reads the logic from stdin and translates it into machine instructions. I can agree that there is not a lot of a freedom in what can be done, but I think it just means that the source language is not Turing complete. I don't believe that compiler needs to deal with a Turing complete language to claim the title "JIT compiler".
    dataflowa month ago
    > The program reads the logic from stdin and translates it into machine instructions. I can agree that there is not a lot of a freedom in what can be done, but I think it just means that the source language is not Turing complete. I don't believe that compiler needs to deal with a Turing complete language to claim the title "JIT compiler".
    "Not Turing-complete" is quite the understatement.
    A "compiler" is software that translates computer code from one programming language into another language. Not just any software that reads input and produces output.
    The input language here is... not even a programming language to begin with. Literally all it can express is linear functions. My fixed-function calculator is more powerful than that! If this is a programming language then I guess everyone who ever typed on a calculator is a programmer too.
    ordua month ago
    > A "compiler" is software that translates computer code from one programming language into another language.
    Yes, of course.
    > Literally all it can express is linear functions.
    And why a language to express linear functions can't be a programming language?
    > My fixed-function calculator is more powerful than that!
    But it isn't compiling, it is interpreting, is it? So your fixed-function calculator is not a compiler, it is an interpreter. It is irrelevant how powerful it is. There are much even more powerful interpreters and less powerful compilers.
    The example we see gets computer code in one language and translates it into machine instructions. Talking about 'understatement' you are adding to this definition more constraints and these are very fuzzy constraints on what counts as a programming language.
    tgva month ago
    A compiler takes some language and translates it into something close(r) to the hardware. And that's what the OP does. And since it compiles in process and executed it too, it's JIT, as opposed to AOT.
    These terms are not related to the complexity of the problem. The first compilers could only translate for formulas, hence FORTRAN.
    dataflowa month ago
    I'm not sure where you got your definition, but I basically copied Wikipedia's.
    a month ago
    undefined
    Brian_K_Whitea month ago
    That's not compilation merely substitution.
- orthoxeroxa month ago
  I think if your program starts to execute machine code that wasn't present in the binary, then it counts as having a JIT compiler.
  Of course, there are edge cases like embedding libtcc, but I think it's a reasonable definition.
- ignoramousa month ago
  > This is not what this article does.
  Discounting books, many other well written articles on JIT have been shared on HN over the years [0][1][2]; the one I particularly liked as it introduces the trinity in a concise way: Compiler, Interpreter, JIT https://nickdesaulniers.github.io/blog/2015/05/25/interprete... / https://archive.vn/HaFlQ (2015).
  [0] How to JIT - an introduction, https://eli.thegreenplace.net/2013/11/05/how-to-jit-an-intro... (2013).
  [1] Bytecode compilers and interpreters, https://bernsteinbear.com/blog/bytecode-interpreters/ (2019).
  [2] Let's build a Simple Interp, https://ruslanspivak.com/lsbasi-part1/ (2015).
- a month ago
  undefined
- MobiusHorizonsa month ago
  I always took the distinction to be the executable memory bit as opposed to writing an executable and then launching it in the usual way. Of course high performance runtimes that contain JITs do typically interpret first in order to get type information and reduce startup latency, but that’s more a function of the language not being efficiently AOT compileable rather than fundamental to the concept of a JIT
  - throwaway17_17a month ago
    Can you explain what mean by a “language not being efficiently AOT compileable”? I am guessing you are referring to languages that emphasize REPL based development or languages that are ‘image’ based (I think Smalltalk is the most common example).
    Like the guesses above, I can understand difficulty with AOT compilation in conjunction with certain use cases; however, I can not think of a language that based on its definition would be less amenable to AOT compilation.
    Nevermarka month ago
    The more context is narrowed down, the more optimizations that can applied during compilation.
    AOT situations where a lot of context is missing:
    • Loosly typed languages. Code can be very general. Much more general than how it is actually used in any given situation, but without knowing what the full situation is, all that generality must be complied.
    • Increment AOT compilation. If modules have been compiled separately, useful context wasn't available during optimization.
    • Code whose structure is very sensitive to data statistics or other conditional runtime information. This is the prime advantage of JIT over AOT. Unless the AOT compiler is working in conjunction with representative data and a profiler.
    Those are all cases where JIT has advantages.
    A language where JIT is optimal, is by definition, less amenable to AOT compilation.
    MobiusHorizonsa month ago
    I should probably have said, not compilable to efficient code. rather than not efficiently compilable. Basically I was referring to dynamic typing. Typically such languages are interpreted, although there is probably a way to compile the same sequence the interpreter would take as an AOT compiler. When types are not known until runtime, the compiler can't emit efficient code, and instead has to emit code that checks the type on each function invocation which is basically exactly what the interpreter was already doing. This is where JITs really shine, since they can produce one or more compilations of the same function specialized to types that have been observed to be used frequently. the interpreter can then call those specialized versions of the functions directly when appropriate.
    MobiusHorizonsa month ago
    What I meant was dynamic typing primarily. Think JavaScript or lua. These are both languages where the choice is between interpreter and JIT rather than JIT or AOT. VM based languages like Java also fall in that category, not because of dynamic typing so much as because of shipping bytecode to the client for portability reasons.
  - kscarleta month ago
    In that sense almost every compiled Lisp/Scheme implementation, GHC, etc. or any other interactive programming system, count as JIT. But virtually nobody in those circles refer to such implementations as JIT. Instead, people says "I wish our implementation was JIT to benefit from all those optimizations it enables"!
    MobiusHorizonsa month ago
    Do they generate machine code in ram and jump to it? Or do they interpret byte code?
    EDIT: at least GHC seems to be a traditional AOT compiler.
    kscarleta month ago
    They generate native machine code.
- a month ago
  undefined
senkoa month ago
Ah, I thought this was going to be about a JITted BASIC. No such luck.
possiblywronga month ago
Following is the closed form solution linked in the article (from a since-deleted Reddit comment):
```
    from functools import reduce

    def recurrence(ops, a0, n):
        def transform(x, op):
            return eval(repr(x) + op)
        ops = ops.split()
        m = reduce(transform, [op for op in ops if op[0] in ('*', '/')], 1)
        b = reduce(transform, ops, 0)
        for k in range(n + 1):
            print('Term 0: ', a0 * m ** k + b * (m ** k - 1) / (m - 1))
```
> This is really only interesting if a particular (potentially really large) term of the sequence is desired, as opposed to all terms up to a point. The key observation is that any sequence of the given set of operations reduces to a linear recurrence which has the given solution.
mananaysiemprea month ago
> On x86-64, pages may be 4kB, 2MB, or 1GB
But I believe sysconf(_SC_PAGESIZE) will always be 4KB, because the “may” is at the user’s discretion, not the system’s. Except on Cosmopolitan where it will always be 64KB, because Windows NT for Alpha (yes, seriously).
MobiusHorizonsa month ago
Is it just me or is the author using static wrong? Someone mentioned it in the thread on the previous post, where it felt more like an oversight. But in this article it seems much more like an actual misunderstanding. Should it actually be the function level static variable? I feel like in the right context the optimizer might be able to leave that in a register if there the calling function is in the compilation unit.
the-smug-onea month ago
Man, I friggin despise how hard (in a very not fun way) generating x86 assembly is :-). Generating Aarch64 is a lot easier.
- mananaysiemprea month ago
  I mean, I can’t do it by heart, but it’s not fundamentally very hard? Thumb-2 is definitely much more annoying.