JVM exceptions are weird: a decompiler perspective (purplesyringa.moe)

128 points by vrnvu 6 days ago | 5 comments

marginalia_nu 11 hours ago
On the subject
```
  void foo() {
    for (;;) {
      try { return; } 
      finally { continue; }
    }
  }
```
is my favorite cursed Java exceptions construct.
- ptx 10 hours ago
  Python has the same construct but is removing it, starting with a warning in version 3.14: https://peps.python.org/pep-0765/
  - maxdamantus 8 hours ago
    Interesting .. from the post above:
    > The projects examined contained a total of 120,964,221 lines of Python code, and among them the script found 203 instances of control flow instructions in a finally block. Most were return, a handful were break, and none were continue.
    I don't really write a lot of Python, but I do write a lot of Java, and `continue` is the main control flow statement that makes sense to me within a finally block.
    I think it makes sense when implementing a generic transaction loop, something along the lines of:
    ```
    <T> T executeTransaction(Function<Transaction, T> fn) {
      for (int tries = 0;; tries++) {
        var tx = newTransaction();
        try {
          return fn.apply(tx);
        } finally {
          if (!tx.commit()) {
            // TODO: potentially log number of tries, maybe include a backoff,
            // maybe fail after a certain number
            continue;
          }
        }
      }
    }
    ```
    In these cases "swallowing" the exception is often intentional, since the exception could be due to some logic failing as a result of inconsistent reads, so the transaction should be retried.
    The alternative ways of writing this seem more awkward to me. Either you need to store the result (returned value or thrown exception) in one or two variables, or you need to duplicate the condition and the `continue;` behaviour. Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.
    If there are particular exceptions that should not be retried, these would need to be caught/rethrown and a boolean set to disable the condition in the `finally` block, though to me this still seems easier to reason about than the alternatives.
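For comparison, here is a rough sketch of the "store the result in variables" alternative mentioned above. The `Transaction` interface and the `newTransaction` supplier are hypothetical stand-ins, not a real API, and this version only retries around `RuntimeException`; an `Error` propagates immediately without a commit attempt, which is a deliberate simplification rather than something the original snippet does.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class TxRetryAlternative {
    // Hypothetical stand-in for the Transaction type used in the thread.
    interface Transaction {
        boolean commit(); // false means the transaction was inconsistent
    }

    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn) {
        for (;;) {
            Transaction tx = newTransaction.get();
            T result = null;
            RuntimeException failure = null;
            try {
                result = fn.apply(tx);
            } catch (RuntimeException e) {
                failure = e; // remember the outcome instead of leaving the loop
            }
            if (!tx.commit()) {
                continue; // inconsistent run: discard result or exception, retry
            }
            if (failure != null) {
                throw failure; // consistent run that failed: propagate
            }
            return result; // consistent run that succeeded
        }
    }
}
```

The outcome has to be carried in two locals (`result` and `failure`) across the commit check, which is the extra bookkeeping the `finally` version avoids.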
    AdieuToLogic 6 hours ago
    > Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.
    Except that is not the documented intent of the `finally` construct:
    > The finally block always executes when the try block exits. This ensures that the finally block is executed even if an unexpected exception occurs. But finally is useful for more than just exception handling — it allows the programmer to avoid having cleanup code accidentally bypassed by a return, continue, or break. Putting cleanup code in a finally block is always a good practice, even when no exceptions are anticipated.[0]
    Using `finally` for implementing retry logic can be done, as you have illustrated, but that does not mean it is "the best way of denoting the intention." One could argue this is a construct specific to Java (the language) and does not make sense outside of this particular language-specific idiom.
    Conceptually, "retries" are not "cleanup code."
    0 - https://docs.oracle.com/javase/tutorial/essential/exceptions...
    maxdamantus 4 hours ago
    Sounds like the right intent to me. To pinpoint your existing quote from the documentation:
    > The finally block always executes when the try block exits. This ensures that the finally block is executed even if an unexpected exception occurs.
    The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
    I'm not sure how else to interpret that to be honest. If you've got a clearer way of expressing this, feel free to explain.
    locknitpicker an hour ago
    > The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
    A transaction failing is the opposite of an unexpected event. Transactions failing is a central use case of any transaction. Therefore it should be handled explicitly instead of using exceptions.
    Exceptions are for unexpected events such as the node running out of memory, or a process failing to write to disk.
    maxdamantus 39 minutes ago
    > A transaction failing is the opposite of an unexpected event.
    That's why it's denoted by a non-exceptional return value from `tx.commit()` in my sample code. When I've talked about exceptions here, I'm talking about exceptions raised within the transaction. If the transaction succeeds, those exceptions should be propagated to the calling code.
    > Exceptions are for unexpected events such as the node running out of memory, or a process failing to write to disk.
    Discussing valid uses of exceptions seems orthogonal to this (should OOM lead to a catchable exception [0], or should it crash the process?). In any case, if the process is still alive and the transaction code determines without error that "yes, this transaction was invalid due to other contending transactions", it should retry the transaction. If something threw due to lack of memory or disk space, chances are it will throw again within a successful transaction and the error will be propagated.
    [0] As alluded to in my first post, you might want to add some special cases for exceptions/errors that you want to immediately propagate instead of retrying. Eg, you might treat `Error` subtypes differently, which includes `OutOfMemoryError` and other cases that suggest the program is in a potentially unusable state, but this still isn't required according to the intent of the transactional logic.
    brabel 3 hours ago
    Doesn't that code ignore errors even if it runs out of retries? Don't you want to log every Exception that happens, even if the transaction will be retried?
    This code is totally rotten.
    maxdamantus 2 hours ago
    A result of an inconsistent transaction should be discarded whether it's a return value or a thrown exception. If it runs out of tries another error should be thrown. This should only happen due to contention (overlapping transactions), not due to a logical exception within the transaction.
    You can add extra logging to show results or exceptions within the transaction if you want (for the exception this would simply be a `catch` just before the `finally` that logs and rethrows).
    I've omitted these extra things because it's orthogonal to the point that the simplest way to express this logic is by having the `continue` control flow unconditional on whether the code was successful .. which is what you use `finally` for.
    If you did this in Rust no one would complain, since the overall result is expressed as a first-class `Result<T, E>` value that can naturally be discarded. This is why Rust doesn't have `finally`.
    Rust is also a lot more permissive about use of control flow, since you can write things like `foo(if x { y } else { continue }, bar)`.
    Personally, I prefer when the language gives a bit more flexibility here. Of course you can write things that are difficult to understand, but my stance is still that my example code above is the simplest way to write the intended logic, until someone demonstrates otherwise.
    I don't think this is a restriction that generally helps with code quality. If anything I've probably seen more bad code due to a lack of finding the simplest way to express control flow of an algorithm.
    I'm sure there's some train of thought that says that continue/break/return from a loop is bad (see proponents of `Array.prototype.forEach` in JS), but I disagree with it.
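A minimal sketch of the extras discussed above (a logging `catch` just before the `finally`, plus a cap on tries), layered on the original `finally`/`continue` loop. The `Transaction` interface, the `newTransaction` supplier, the `maxTries` parameter, and the choice of `IllegalStateException` are all assumptions made for illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class TxRetryWithFinally {
    // Hypothetical stand-in for the Transaction type used in the thread.
    interface Transaction {
        boolean commit(); // false means the transaction was inconsistent
    }

    static <T> T executeTransaction(Supplier<Transaction> newTransaction,
                                    Function<Transaction, T> fn,
                                    int maxTries) {
        for (int tries = 1;; tries++) {
            Transaction tx = newTransaction.get();
            try {
                return fn.apply(tx);
            } catch (RuntimeException e) {
                // Log every failure, even one that is about to be retried.
                System.err.println("attempt " + tries + " threw: " + e);
                throw e;
            } finally {
                if (!tx.commit()) {
                    if (tries >= maxTries) {
                        // Replaces whatever return or throw was pending.
                        throw new IllegalStateException("gave up after " + tries + " tries");
                    }
                    continue; // discard the pending return or throw and retry
                }
            }
        }
    }
}
```

When `commit()` succeeds, the `finally` block completes normally and the pending `return` or rethrown exception reaches the caller unchanged.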
  - teddyh 8 hours ago
    There was a recent talk at PyCon UK about it, by one of the authors of the PEP in question: <https://www.youtube.com/watch?v=vrVXgeD2fts>
- cerved 11 hours ago
  To anyone wondering, I believe it's cursed because the `continue` in the finally block hijacks the `return` in the try block, so the method never returns.
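A bounded variant makes the hijacking observable without hanging. This is only a sketch: the `maxAttempts` parameter and the class name are made up for the demonstration.

```java
class FinallyContinueDemo {
    // Returns the number of times the try block ran before the
    // finally block stopped discarding the pending return.
    static int run(int maxAttempts) {
        int attempts = 0;
        for (;;) {
            try {
                attempts++;
                return attempts; // the return value is pending here...
            } finally {
                if (attempts < maxAttempts) {
                    continue; // ...and is thrown away by jumping to the loop head
                }
                // Falling out of finally normally lets the pending return proceed.
            }
        }
    }
}
```

Calling `FinallyContinueDemo.run(3)` executes the `return` statement three times, but only the third one actually leaves the method, so the result is 3.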
  - chii 5 hours ago
    see, if you only had GOTO's, it would be obvious what is going on!
  - taneq 3 hours ago
    So the function returns, and then during its tidyup, the 'continue' basically comefrom()s the VM back into the loop? That is, indeed, cursed.
    friendzis 3 hours ago
    I would not call this snippet particularly "cursed". There is no "aktshchually this happens, therefore this is activated" hidden magic going on. The try-catch-finally construct is doing exactly what it is designed and documented to do: the finally block is executed regardless of the outcome within try. The purpose of the finally block is to fire whether or not control flow is exceptional.
    Surprising at first? Maybe. Cursed? Wouldn't say so. It is merely unconventional use of the construct.
- ziml77 10 hours ago
  Just tested that in C# and it seems they made the smart decision to not allow shenanigans like that in a finally block:
  CS0157 Control cannot leave the body of a finally clause
  - o11c 5 hours ago
    What about throwing an exception from the finally clause?
    messe an hour ago
    This loops, if that's what you're asking:
    ```
    while (true) {
      try {
        try { return; }
        finally { throw new Exception(); }
      } catch { }
    }
    ```
- brunoborges 5 hours ago
  In JDK 25, you can run this code:
  ```
  $ cat App.java
  void main() {
    for (;;) {
      try { return; }
      finally { continue; }
    }
  }
  $ java App.java
  ```
- scrame 2 hours ago
  This broke my head. I haven't touched Java in a while and kept thinking `continue` should be in a case/switch, so it took a minute to back out of that alleyway before I even got what was wrong with this.
- kfuse 10 hours ago
  That's not just Java and there is nothing really cursed about it: throwing in a finally block is the most common example. Jump statements are no different, you can't just ignore them when they override the return or throw statements.
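As a small illustration of that point in Java, the sketch below shows a `throw` in `finally` overriding a pending `return`, and a `return` in `finally` overriding a pending exception. The class and method names are invented for the example, and `@SuppressWarnings("finally")` only silences javac's lint warning about such blocks.

```java
class FinallyThrowDemo {
    // The try block's return value is pending, then replaced by the exception.
    @SuppressWarnings("finally")
    static String returnThenThrow() {
        try {
            return "from try";
        } finally {
            throw new IllegalStateException("from finally");
        }
    }

    // The try block's exception is pending, then replaced by the return value.
    @SuppressWarnings("finally")
    static String throwThenReturn() {
        try {
            throw new IllegalStateException("from try");
        } finally {
            return "from finally";
        }
    }
}
```

`returnThenThrow()` throws `IllegalStateException("from finally")`, and `throwThenReturn()` returns `"from finally"` with the original exception silently dropped.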
  - Maxatar 8 hours ago
    It is just Java as far as I can tell. Other languages with a finally don't allow for explicitly exiting the finally block.
    maxdamantus 7 hours ago
    And JavaScript .. And Python (though as sibling posts have mentioned it looks like they're intending to make a breaking change to remove it).
    EDIT: actually, the PEP points out that they intend for it to only be a warning in CPython, to avoid the breaking change
    o11c 5 hours ago
    Notably, C++ and similar languages don't support lexical `finally` at all, instead relying on destructors, which are functions and obviously cannot affect the control flow of their caller ...
    except by throwing exceptions, which is a different problem that there's no "good" solution to (during unwinding, that is).
    kmeisthax 4 hours ago
    I thought destructors were all noexcept now... or at the very least if you didn't noexcept, and then threw something, it just killed the process.
    Although, strictly speaking, they could have each exception also hold a reference to the prior exception that caused the excepting object to be destroyed. This forms an intrusive linked list of exceptions. Problem is, in C++ you can throw any value, so there isn't exactly any standard way for you to get the precursor exception, or any standard way for the language to tell the exception what its precursor was. In Python they could just add a field to the BaseException class that all throwables have to inherit from.
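For reference, Java has something along these lines for try-with-resources: when the body throws and `close()` also throws, the second exception is attached to the first via `Throwable.addSuppressed` rather than replacing it. A minimal sketch, with a made-up `Noisy` resource:

```java
class SuppressedDemo {
    // Hypothetical resource whose close() always fails.
    static class Noisy implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("from close");
        }
    }

    // Returns the exception that escapes the try-with-resources statement.
    static RuntimeException run() {
        try (Noisy n = new Noisy()) {
            throw new IllegalArgumentException("from body");
        } catch (RuntimeException e) {
            return e; // the body's exception, carrying the close() failure
        }
    }
}
```

Here `run()` returns the `IllegalArgumentException` from the body, and `getSuppressed()` on it yields the `IllegalStateException` thrown by `close()`.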
    aw1621107 3 hours ago
    > I thought destructors were all noexcept now...
    Destructors are noexcept by default, but that can be overridden with noexcept(false).
    > or at the very least if you didn't noexcept, and then threw something, it just killed the process.
    IIRC throwing out of a destructor that's marked noexcept(false) terminates the process only if you're already unwinding from something else. Otherwise the exception should be thrown "normally".
  - metaltyphoon 9 hours ago
    It is just Java, as C# disallows this
  - MangoToupe 8 hours ago
    > override the return
    How is this not cursed
- bear8642 11 hours ago
  This is exceedingly nasty. Well Done!
pron 12 hours ago
Nice post!
A minor point:
> monitors are incompatible with coroutines
If by coroutines the author meant virtual threads, then monitors have always been compatible with virtual threads (which have always needed to adhere to the Thread specification). Monitors could, for a short while, degrade the scalability of virtual threads (and in some situations even lead to deadlocks), but that has since been resolved in JDK 24 (https://openjdk.org/jeps/491).
- PhilipRoman 11 hours ago
  I think it's coroutines as in other JVM languages like Kotlin, where yielding may be implemented internally as a return (due to the lack of native coroutine support in the JVM).
  Holding a lock/monitor across a yield is a bad idea for other reasons, so it shouldn't be a big deal in practice.
Joker_vD 10 hours ago
Doesn't the JRE have some limited form of decompilation in its JIT, as a pre-pass? IIRC, it reconstructs the basic blocks and CFG from the bytecode and does some minor optimizations before going on to regalloc and codegen.
- monocasa 10 hours ago
  It's hard to call it decompilation as opposed to just regular compilation though.
immibis 10 hours ago
Older versions of Java did try to have only one copy of the finally block code. To implement this, there were "jsr" and "ret" instructions, which allowed a method (a subroutine) to contain subroutines inside it. This even curseder implementation of finally is prohibited starting from version 51 class files (Java 7).