No, not entirely. The directory at the end of the archive points backwards to local headers, which in turn include all the necessary information: the compressed size inside the archive, the compression method, the filename, and even a checksum.
If the archive isn't some recursive/polyglot nonsense as in the article, it's essentially just a tightly packed list of compressed blobs, each with a neat local header in front (that even includes a magic number!); the directory at the end is really just for quick access.
If your extraction program supports it (or you are sufficiently motivated to cobble together a small C program with zlib....), you can salvage what you have by linearly scanning and extracting the archive, somewhat like a fancy tarball.
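For anyone tempted to do that, here's a rough Python sketch of such a scanner (rather than the C-and-zlib version); it assumes plain stored/deflate entries, no encryption, and skips entries that rely on a data descriptor:

    import struct
    import zlib

    LOCAL_SIG = b"PK\x03\x04"

    def salvage(path):
        # Recover what we can from a (possibly truncated) zip by scanning for
        # local file headers, like a fancy tarball.
        data = open(path, "rb").read()
        recovered, pos = [], 0
        while True:
            pos = data.find(LOCAL_SIG, pos)
            if pos < 0 or pos + 30 > len(data):
                break
            (_, _, flags, method, _, _, crc, csize, _usize,
             nlen, xlen) = struct.unpack("<4sHHHHHIIIHH", data[pos:pos + 30])
            name = data[pos + 30:pos + 30 + nlen].decode("cp437", "replace")
            start = pos + 30 + nlen + xlen
            blob = data[start:start + csize]
            if flags & 0x08 or len(blob) < csize:
                pos += 4          # sizes unknown (data descriptor) or payload truncated: skip
                continue
            try:
                raw = zlib.decompress(blob, wbits=-15) if method == 8 else blob
            except zlib.error:
                pos += 4
                continue
            if zlib.crc32(raw) == crc:
                recovered.append((name, raw))
            pos = start + csize
        return recovered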
This worked great on campus, but when everyone went remote during COVID it no longer did: it went from about three minutes to twenty.
However, most files change only rarely. I don't need all the files, just the ones that are different. So I wrote a scanner that compares each entry's file size and CRC-32 as stored in the zip to those of the local file. If they match, we skip it; otherwise, we decompress that entry out of the zip. This cut the time to get the daily build from 20 minutes to 4 minutes.
Obviously this isn't resilient to an attacker (CRC-32 is not secure), but as an internal tool it's awesome.
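Roughly this idea, sketched with Python's zipfile rather than the original tool (error handling and path sanitization omitted):

    import os
    import zipfile
    import zlib

    def sync_from_zip(archive, dest):
        # Skip entries whose size and CRC-32 already match the file on disk;
        # the central directory stores both, so unchanged files never get decompressed.
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                target = os.path.join(dest, info.filename)
                if os.path.isfile(target) and os.path.getsize(target) == info.file_size:
                    crc = 0
                    with open(target, "rb") as f:
                        for chunk in iter(lambda: f.read(1 << 20), b""):
                            crc = zlib.crc32(chunk, crc)
                    if crc == info.CRC:
                        continue          # unchanged: skip it
                zf.extract(info, dest)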
No, its purpose was to support archives spanning multiple floppy disks. You would insert the last disk, then the other ones, one by one…
If the archive is on a hard disk, the program reads the directory at the end and then seeks to the local header, rather than doing a linear scan. (Or the floppy drive does the seeking, if it is a small archive on a single floppy.)
If you have multiple floppies, you insert the last one, the program reads the directory, and it tells you which floppy to insert, rather than making you go through them one by one, which, you know, would be slower.
In one case the hard disk arm, or the floppy motor, does the seeking; in the other case, your hands do the seeking. But it's still the same algorithm, doing the same thing, for the same reason.
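For the curious, "reads the directory at the end" boils down to something like this hypothetical Python sketch, which finds the End of Central Directory record and reads the entry count, the central-directory offset, and the disk-number fields that made spanning floppies possible:

    import struct

    def read_eocd(path):
        # The End of Central Directory record sits within the last 22..(22+65535)
        # bytes of the file (it may be followed by a comment), so one seek to the
        # tail is enough to learn where everything else is.
        with open(path, "rb") as f:
            f.seek(0, 2)
            size = f.tell()
            tail_len = min(size, 22 + 65535)
            f.seek(size - tail_len)
            tail = f.read(tail_len)
        pos = tail.rfind(b"PK\x05\x06")
        if pos < 0:
            raise ValueError("no end-of-central-directory record found")
        (_sig, disk_no, cd_disk, _n_here, n_total,
         cd_size, cd_offset, _comment_len) = struct.unpack("<IHHHHIIH", tail[pos:pos + 22])
        # disk_no/cd_disk told multi-floppy archives which disk to ask for;
        # for a single-file archive they are both 0.
        return n_total, cd_size, cd_offset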
This redundant information has led to multiple vulnerabilities over the years, because a maliciously crafted zip file with conflicting headers can be given two different interpretations when processed by two different parsers.
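To make the two-interpretations point concrete, here is a toy check (not from any real scanner, and only approximate since filename encodings can differ) that diffs each entry's local-header name against its central-directory name:

    import struct
    import zipfile

    def conflicting_names(path):
        # Compare the filename recorded in each local header with the one in the
        # central directory; tools that trust different copies can disagree.
        hits = []
        with zipfile.ZipFile(path) as zf, open(path, "rb") as f:
            for info in zf.infolist():
                f.seek(info.header_offset)
                header = f.read(30)
                if header[:4] != b"PK\x03\x04":
                    hits.append((info.filename, None))     # no local header at all
                    continue
                nlen, = struct.unpack("<H", header[26:28])
                local_name = f.read(nlen).decode("cp437", "replace")
                if local_name != info.filename:
                    hits.append((info.filename, local_name))
        return hits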
The PKZIP tools came with PKZIPFIX.EXE, which would scan the file from the beginning and rebuild a missing central directory. You could extract any files that came before the point where your download stopped.
[1]: https://forum.videohelp.com/threads/393096-Fixing-Partially-Download-MP4-Files

unzip zbsm.zip
Archive: zbsm.zip
inflating: 0
error: invalid zip file with overlapped components (possible zip bomb)
This seems to have been done in a patch to address https://nvd.nist.gov/vuln/detail/cve-2019-13232
https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
> A final plea
> It's time to put an end to Facebook. Working there is not ethically neutral: every day that you go into work, you are doing something wrong. If you have a Facebook account, delete it. If you work at Facebook, quit.
> And let us not forget that the National Security Agency must be destroyed.
Someone shared a link to that site in a conversation earlier this year on HN. For a long time now, I've had a gzip bomb sitting on my server that I serve to people who make certain categories of malicious requests, such as attempts to log in to WordPress on a site that doesn't use WordPress. That post got me thinking about alternative types of bombs, particularly as newer compression standards have become ubiquitous and are supported in browsers and HTTP clients.
I spent some time experimenting with brotli as a compression bomb to serve to malicious actors: https://paulgraydon.co.uk/posts/2025-07-28-compression-bomb/
Unfortunately, as best I can tell, malicious actors are all using clients that only accept gzip rather than brotli-compressed content, and I'm the only one to have ever triggered the bomb, back when I was doing the initial setup!
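In case anyone wants to reproduce the gzip variant, a minimal sketch of pre-generating such a bomb; the file name and sizes here are arbitrary, and you'd serve the result with Content-Encoding: gzip only to requests you've already classified as abusive:

    import gzip

    # Pre-generate a gzip body that inflates to ~10 GiB of zeros but is only a
    # few MiB on disk.
    chunk = b"\x00" * (1 << 20)              # 1 MiB of zeros
    with gzip.open("bomb.gz", "wb", compresslevel=9) as out:
        for _ in range(10 * 1024):           # 10 GiB uncompressed in total
            out.write(chunk)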
Like bombing the CPU time instead of memory.
That's how self-extracting archives and installers work, and they are also valid zip files. The extractor part is just a regular executable: a zip decompressor that extracts the archive appended to its own file.
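A toy illustration of that layout (the file names are hypothetical):

    import zipfile

    # A self-extracting archive is just an executable stub with a zip appended.
    # Zip readers locate the central directory from the end of the file, so the
    # result is still a valid archive.
    with open("sfx.exe", "wb") as out:
        out.write(open("stub.exe", "rb").read())      # the extractor program
        out.write(open("payload.zip", "rb").read())   # the archive it extracts

    print(zipfile.ZipFile("sfx.exe").namelist())      # ordinary tools still read it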
This is specific to zip files, not the deflate algorithm.
import zlib
# 1000 empty stored blocks, then one final empty block: plenty of input to parse, zero bytes of output.
zlib.decompress(b"\x00\x00\x00\xff\xff" * 1000 + b"\x03\x00", wbits=-15)
If you want to spin more CPU, you'd probably want to define random Huffman trees and then never use them. The minimal version boils down to:
bytes.fromhex("04c001090000008020ffaf96") * 1000000 + b"\x03\x00"
But you probably don't want to be investigated for either.
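For a sense of scale, timing the empty-stored-block stream from above (sizes are arbitrary):

    import time
    import zlib

    # ~25 MB of empty stored blocks: the output is zero bytes, so the whole cost
    # is the CPU time spent walking block headers.
    stream = b"\x00\x00\x00\xff\xff" * 5_000_000 + b"\x03\x00"
    t0 = time.perf_counter()
    out = zlib.decompress(stream, wbits=-15)
    print(len(out), "bytes out,", round(time.perf_counter() - t0, 2), "s")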
https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
The detection maintains a list of covered spans of the zip files
so far, where the central directory to the end of the file and any
bytes preceding the first entry at zip file offset zero are
considered covered initially. Then as each entry is decompressed
or tested, it is considered covered. When a new entry is about to
be processed, its initial offset is checked to see if it is
contained by a covered span. If so, the zip file is rejected as
invalid.
So effectively it seems as though it just keeps track of which parts of the zip file have already been 'used', and if a new entry in the zip file starts in a 'used' section then it fails.
I.e. an advanced compressor could abuse the zip file format to share base data for files which only incrementally change (get appended to, for instance).
And then this patch would disallow such practice.
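A rough approximation of that bookkeeping, using zipfile metadata instead of tracking spans during decompression (it ignores the local extra field, so entry ends are approximate, and the real patch also covers the central directory itself):

    import zipfile

    def has_overlap(path):
        # Each entry claims roughly [header_offset, end of compressed data);
        # if any two claimed spans overlap, treat the archive as suspect.
        with zipfile.ZipFile(path) as zf:
            spans = []
            for info in zf.infolist():
                start = info.header_offset
                end = start + 30 + len(info.filename.encode("utf-8")) + info.compress_size
                spans.append((start, end))
        spans.sort()
        return any(s2 < e1 for (_, e1), (s2, _) in zip(spans, spans[1:]))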
1. A exceeds some unreasonable threshold
2. A/B exceeds some unreasonable threshold
On the other hand, the zip bomb described in this blog post relies on decompressing the same data multiple times, so it wouldn't necessarily trigger your A/B heuristics.
Finally, A just means "you can't compress more than X bytes with my file format", right? Not a desirable property to have. If the deflate authors had had this idea when they designed the algorithm, I bet files larger than an "unreasonable" 16 MB would be forbidden.
Sure, if you expect to decompress files with high compression ratios, then you'll want to adjust your knobs accordingly.
> On the other hand, the zip bomb described in this blog post relies on decompressing the same data multiple times, so it wouldn't necessarily trigger your A/B heuristics.
If you decompress the same data multiple times, then you increment A multiple times. The accounting still works regardless of whether the data is the same or different. Perhaps a better description of A and B in my post would be {number of decompressed bytes written} and {number of compressed bytes read}, respectively.
> Finally, A just means "you can't compress more than X bytes with my file format", right? Not a desirable property to have. If the deflate authors had had this idea when they designed the algorithm, I bet files larger than an "unreasonable" 16 MB would be forbidden.
The limitation is imposed by the application, not by the codec itself. The application doing the decompression is supposed to process the input incrementally (in the case of DEFLATE, reading one block at a time and inflating it), updating A and B on each iteration, and aborting if a threshold is violated.
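For example, a sketch of that incremental accounting for a raw DEFLATE stream, with made-up thresholds:

    import zlib

    CHUNK = 64 * 1024

    def inflate_with_limits(path, max_out=1 << 30, max_ratio=1000):
        # A = decompressed bytes written, B = compressed bytes read.
        # Abort as soon as A or A/B crosses a threshold; the thresholds here
        # are illustrative, not anyone's real numbers.
        d = zlib.decompressobj(wbits=-15)        # raw DEFLATE stream
        a = b = 0
        with open(path, "rb") as f:
            data = f.read(CHUNK)
            while data and not d.eof:
                b += len(data)
                buf = d.decompress(data, CHUNK)  # cap the output produced per call
                while True:
                    a += len(buf)
                    if a > max_out or a > max_ratio * b:
                        raise ValueError("possible decompression bomb")
                    if not d.unconsumed_tail:
                        break
                    buf = d.decompress(d.unconsumed_tail, CHUNK)
                data = f.read(CHUNK)
        return a, b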
A better zip bomb [WOOT '19 Paper] [pdf] - https://news.ycombinator.com/item?id=20685588 - Aug 2019 (2 comments)
A better zip bomb - https://news.ycombinator.com/item?id=20352439 - July 2019 (131 comments)
I use zip bombs to protect my server - https://news.ycombinator.com/item?id=43826798 - April 2025 (452 comments)
How to defend your website with ZIP bombs (2017) - https://news.ycombinator.com/item?id=38937101 - Jan 2024 (75 comments)
The Most Clever 'Zip Bomb' Ever Made Explodes a 46MB File to 4.5 Petabytes - https://news.ycombinator.com/item?id=20410681 - July 2019 (5 comments)
Defending a website with Zip bombs - https://news.ycombinator.com/item?id=14707674 - July 2017 (183 comments)
Zip Bomb - https://news.ycombinator.com/item?id=4616081 - Oct 2012 (108 comments)
It is a much easier problem to solve than you would expect. No need to drag in a data centre when heuristics can get you close enough.
[0] https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...