As mathematicians say, optimization is left as an exercise to the reader. You did the hard part.
Still, OP claims to have done the best job to date at creating (via AI) specs, and the non-optimal Rust implementation, so a net gain?
Does it? How are you legally intending to use copyright to license this machine output? How would you know it's not encumbered in any way?
I always found software IP to be absurd, but this is a particularly absurd situation. We're talking here about a small utility tool implemented from scratch and open sourced, with no apparent intent to make any money from it.
Are you concerned about the "encumbrance" of using "unlicensed" tools to manipulate .doc, or .pdf, or .mp3 files?! Well I'm not, and if anyone ever tried to sue me for improper access to their proprietary formats, I'd show them some Old Testament impropriety.
It wasn't even a disasm/pseudocode to formal spec flow, and then a separate human implementation. The same human has been in the loop throughout, and large parts of it were generated directly.
It's basically guaranteed tainted.
Edit: I should have skimmed a bit more patiently, there was in fact no "disasm/pseudocode + the human getting tainted" part to this apparently.
"This is copyright-encumbered and nonfree because it's a derivative work of the legacy RAR binaries" is a different argument (and seems like it depends on details of the setup that were somewhat glossed over in the post).
You can get these LLMs to generate copyrighted outputs both intentionally and accidentally. This is a known fact; therefore, if you're not checking the output to see if this has occurred then you're potentially generating legal risks for yourself and anyone who uses your code.
To not only ignore this for your own use case but to then release the code under a proclaimed license seems legally problematic if not ethically concerning.
If you did get sued for infringement I can't imagine that your defense would be that you find the argument tiresome? Honestly, do you think this would never happen, or how would you go about defending your actions here?
At the very least you could see if it's already been open sourced under a different license. If you take GPL code and just slap MIT on it do you not consider that a violation?
> Or is your claim that using an LLM for coding is always copyright infringement?
I'm claiming you cannot really know.
> I'll personally be
It may be someone who uses or redistributes your code in any fashion.
> derailing the thread
I've made two posts. One with an idea and the second clarifying it. This is not "derailing the thread" under any sane definition. This is simply a complicated and relatively unexplored topic that clearly draws a lot of interest and resulting conversation from the crowd here.
I think using this type of bullying rhetoric damages that conversation and harms the reputation of Hacker News in general and I always regret it when I see it.
Because it’s a boring argument that we’re not going to make progress on until it is actually tested in court.
Also, if/when this is tested, the court’s options seem to be (a) say yeah this is fine, or (b) cause unending havoc that, if followed through on, would destroy the economy (a precedent that any org whose proprietary code made it into AI training data could sue any org that was using code generated by that model? Do the math on how many suits that is.)
How can you shout at Claude when it’s
1) foobaring, bamblabooing and fghrtawing all the time without telling you what’s going on
2) when it finally interacts, it’s asking for a permission you granted 30 seconds ago with "yes, and do not ever ask me again until the heat death of the Universe"
3) and after all of that, it just spits out: "you’re out of tokens, give up your liver or wait until Trump’s next war"
Oh man, now I have to plug my tool[0]... it doesn't hide anything, but by default tries to provide a pleasant interface (ctrl+o to toggle details similar to CC, but less janky?)
Disclaimer: It's way simpler than Claude Code or even pi (on purpose)
For actual correctness verification in the strong sense, you'd need to start from a specification written in a formal language so that it's machine-checkable, which, if I had to guess, not even win.rar GmbH has.
From a philosophical perspective, there's no way to know that any piece of software is truly correct without formal verification.
But in the present, non-philosophical context, it's obvious that what we mean is, colloquially, "how well-tested is this against a variety of edge-case files which the official winrar handles correctly? Is there a test suite, and how robust is it? Plenty of software that claims to be compatible with the rar format, doesn't actually successfully read all rar files."
It's also equally obvious, in the present context, that we would prefer these steps to have been taken by the author of the software before we install it and run it on our own computers and data. The parent commenter wasn't just asking about the software's correctness for the sake of academic curiosity.
I don't know how all these test cases were generated, but at least some of them seem to have been copied (with attribution) from the test suites of earlier FOSS RAR implementations.
The ideal would be to test it against a representative corpus of real-world legacy RAR files, but I'm not sure where you'd find one.
Added, later: hey you changed your comment, added a whole paragraph.
I was immediately proven right once I pressed "update". That said, I have now deleted my snarky response that followed. Not in the game of capitalizing off of the human equivalent of a race condition.
I should make a browser addon to delay posting; this is the 2nd time this has happened in the past few days.
Edit:
Nevermind, it's already a feature built into the site. Turned it on. I wonder if it applies to edits also...
Nope, doesn't seem to. Oh well, should still help.
use std::fs::File;
use std::io::prelude::*;
fn main() -> std::io::Result<()> {
let mut file = File::create("content.txt")?;
file.write_all(b"3!")?;
Ok(())
}

No, it doesn't even need to compile. The mere fact that it's in Rust means it's correct.
You know what I meant: How can we have confidence that this implementation of RAR is functionally identical to what it's based on? What would give me the confidence to use it in a critical piece of infrastructure?
Because it's a defined format there can be binary exact comparisons between the input and output files - we already have an oracle in the form of proper RAR format software, so if they are identical, you don't need to look further for that specific case.
You can see a version of this that I did quite similarly, for postgresql wire format, here: https://github.com/pgdogdev/pgdog/tree/main/integration/sql
It validates that the same SQL, with the same setup and teardown, produces perfectly exact results between raw PostgreSQL as the control and various configurations of PgDog, in both the text format and the binary format, so ultimately a 6-way multivariate test that should always yield binary-exact results.
You also know what I meant, since I spelled it out in more detail a comment later. But even though you're being facetious, yes, that really is the case. If it works it works. That's the bar for the vast, vast majority of software, and has been since forever. Demonstrated practical correctness. If you stumble into a bug, you log it as a defect and then either wait for a fix or fix it yourself depending. That's all that regular people ever have. In the case of this project, this was achieved via fuzz testing.
It's literally no different to e.g. validating the NTFS driver that ships in the Linux kernel, or validating any other (re)implementation of anything. You just do a bunch of empirical testing and hope for the best. It is also why reimplementations always lag behind, which I'm not suggesting is not a real concern (or that defects wouldn't be). It's just not a gotcha.
Hell, I'm 99% sure this is exactly what the actual vendor does too, or at least I sure hope that they do have tests at least. Cause they're sure as shit not using a formally verified compiler toolchain, meaning they definitely don't have a formal proof about whether even the official implementation in itself is correct. Only empirical data at best too.
I get that this is often the case, but it does feel like we should be able to do better. At least when humans write this code you can have the expectation that there was real intent behind making sure the semantics of the code are aligned with the specification. At least with current language models, they tend to just brute-force test suite acceptance until everything passes, in a way no human developer has the capacity for. Of course this is often how it works with humans too (i.e. the classic Oracle story), but it does feel wrong.
Can we be sure that this method has produced a correct artefact without years of extensive usage? Probably not, hence my reluctance to rely on something like this, at least initially.
There's a lot of chatter lately e.g. about using TLA+ for formal modeling, so that anything downstream can be formally proven. That helps, but then the formal model still needs to be crafted somehow, which means a pass of semantic interpretation.
Going from binary to spec mechanistically via formal proofs would be possible, but only if there was a formal spec for the binary structure and the ISA available. In practice, both are just natural language prose too however, meaning another interpretation pass or two. The ISA specs also keep a lot implementation-defined / undefined afaik, for microarchitecture-level optimization freedom.
Netlists, PDK, and the likes then might be public for some RISC-V designs these days, but to get the actual chip behavior requires EM simulation typically on a scale that is not possible for any chip performant enough to be of interest. And RISC-V is not a very broadly adopted platform for proprietary consumer software.
Having the human do the semantic mapping is expensive and legally stricken. Having an LLM do it is more risk, but way, way, cheaper and currently legally grey. And both can and do make mistakes.
This is why I see this so bleakly. That said, I do also think formats like this are delicate enough that even rudimentary empirical testing should provide a surprisingly decent behavioral coverage. There's a reason that "I can't believe anything ever works at all" is such a common sentiment. Practical usage is a surprisingly powerful gate, and fuzzing in particular is basically that on steroids.
I do nevertheless still secretly get the heebie-jeebies from the Linux NTFS implementation though (me bringing that up was no coincidence).
There's much more to the correctness of a piece of software than "produces the same output as the original on x test cases".
I'm not saying it's a bad implementation and, if anything, LLMs are much better at translating/porting existing code (and finding bugs) than at writing things unheard of.
You're basically saying, if I may make a pun: "rust me bro, it's correct".
One thing I've been curious about: is there any way to stop a RAR compression midway and then continue it later?
Like, suppose I'm compressing a large file: would this project make it possible to shut down the computer mid-compression and continue after starting it back up?
I would really love it if you can add this functionality!
Were you flagged for a cybersecurity violation?
You can draw your own conclusions as to what this says about the state of agentic development.
Kudos to the author. A fun read, thank you for sharing.
Maybe just cut the unprompted whining?
HN is better than most in this regard thanks to community flagging, but even then there's a lot of it. Ultimately, it'd seem that the ratio you're describing skews a whole lot more towards the anti-AI sentiment side than towards the anti-anti-AI one (or towards a stalemate). Or rather, that the latter sentiment is not necessarily common enough to thwart such comments. And so you see it reflected verbally instead.
I suppose the question is whether the author had ever entered into a contract limiting reverse engineering...