I have seen people do ceramics where information was stacked in layers and had to be destroyed to extract it. The ultimate form of shifting media to preserve and read information. I guess that could be done with better resolution with 3D-printed zirconia (blobs 0.1 mm on a side), so about 1 Mb/cm³.
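A quick sanity check on that figure, assuming the blobs are 0.1 mm on a side and carry one bit each:

  # Napkin math: bits per cm^3 at a 0.1 mm voxel pitch, one bit per voxel.
  voxel_mm = 0.1                             # assumed voxel edge length
  voxels_per_mm3 = (1 / voxel_mm) ** 3       # 1000 voxels per mm^3
  voxels_per_cm3 = voxels_per_mm3 * 1000     # 1 cm^3 = 1000 mm^3
  print(f"{voxels_per_cm3:,.0f} bits/cm^3")  # 1,000,000, i.e. ~1 Mb/cm^3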
Edit: this idea of cold storage is from Footfall by Niven and Pournelle, where information was stored on monoliths whose layers could be incrementally extracted with tools documented on the layers above, i.e. start at 0.1 bit per m² and go down, with the hand-wavy handling of practical problems typical of science fiction.
[1] https://www.bookandsword.com/2016/10/29/the-information-dens...
[^1]: Can't find the source right now, so take this with a grain of salt.
>The biggest advantage of character-based encodings is that they can be decoded by humans (as opposed to dot-based encodings), which means that you don’t need a camera or a scanner to recover the data.
This is an interesting point. In our post-apocalyptic future, scholars will be using their quills to translate archives of these (in my imagination, anyway). Of course, they would have to translate into binary and then into human characters.
I can imagine they will be sad they cannot listen to the MP3s.
>Adding color allows one to code more information per dot (3x more with three colors).
Is this right? Wouldn't it be base-3 encoding? Three bits of binary can count to 8. Three trits of base three can count to 27. Color has all sorts of disadvantages, but maybe a much greater payoff (unless I'm mistaken).
I am very skeptical of this idea that people will be able to write but unable to produce useful digital computers. Computers are a mathematical discovery, not an electronic invention. Electronics makes them a thousand times faster, but a computer made out of wood, flax threads, and bent copper wire would still be hundreds of times faster than a person at tabulating logarithms, balancing ledgers, calculating ballistic trajectories, casting horoscopes, encrypting messages, forecasting finances, calculating architectural measurements, or calculating compound interest. So I think automatic computation as such is likely to survive if any human intellectual tradition does.
I agree. When I first saw the post and the mention of humans in the reading end of the loop, I thought "maybe there is a scifi story here". Hard to imagine a scenario that leaves humans but not many artifacts except caches of paper (or other "printed" media). Maybe a remote tribe of uncontacted people (or another species altogether) inherits the Earth after a modern-world apocalypse kills off everyone in the technologically more advanced world.
A civilization starting from scratch would still need to develop a fair bit of math and tech/science sophistication before understanding and starting to use the artifacts left behind. In particular, optical/color scanners for paper would have been difficult to build before the 20th century.
Imagine tomes of programming lore, dutifully transcribed by rooms of silent scribes, acolytes carrying freshly finished pages to and fro, each page beautifully illuminated with pictures of the binary saints, to ward off Beelzebug.
In this case they're not directly using the color to store information; they just have three differently colored QR codes overlaid on top of each other. With that method you can use a filter to separate them back out, and you've got three separate QR codes' worth of data in one place. The way they're added ends up using more than just three colors in that example.
If you were truly to use colored dots to store binary information, without worrying about a standard like QR, I think you'd be going from base-2 (white and black) to base-3 (red, blue, green), or more likely base-4 (white, red, blue, green), or even base-8 (if you were willing to add multiple colors on top of each other), in which case, yeah, you'd have way more than just 3x the data density.
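A minimal sketch of the channel-separation idea, assuming the three codes were dropped into the red, green, and blue channels of a scan (the file name here is hypothetical), using Pillow:

  # Recover three QR codes overlaid in the R, G, and B channels of one image.
  from PIL import Image

  img = Image.open("overlay.png").convert("RGB")  # hypothetical scan
  for name, channel in zip("RGB", img.split()):
      # Threshold each channel back to a plain black-and-white code.
      channel.point(lambda v: 255 if v > 128 else 0).save(f"qr_{name}.png")

(With subtractive inks rather than additive light the channels come out inverted, but the idea is the same.)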
That's only true if you can print and read colors at a higher resolution / without destroying information at 3x the density with color; I'm not sure that's generally true.
>If you were truly to use colored dots to store binary information, without worrying about a standard like QR, I think you'd be going from base-2 (white and black) to base-3 (red, blue, green), or more likely base-4 (white, red, blue, green), or even base-8 (if you were willing to add multiple colors on top of each other), in which case, yeah, you'd have way more than just 3x the data density.
Base 8 is exactly 3x the data density (log 8 / log 2 = 3).
2 dots at 2 possibilities each gives 4 combinations (2^2).
They only diverge from there. Or am I doing my math wrong?
log(25)/log(4) is about 2.3. Among other things, this logarithmic definition of capacity has the nice property that two disks/pages/drives/bits together contain the sum of the capacities instead of their product.
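To put the arithmetic in one place: bits per dot is log2 of the number of distinguishable states, so:

  from math import log2

  # Bits per printed dot for k distinguishable states; base-2 is the baseline.
  for k in (2, 3, 4, 5, 8):
      print(f"base {k}: {log2(k):.2f} bits/dot")
  # -> 1.00, 1.58, 2.00, 2.32 (= log(25)/log(4)), 3.00 (exactly 3x base-2)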
This has shown itself to be an issue for including data, such as spreadsheets. Most colleagues just print Excel files to a PDF that gets appended; while that complies with the regulation, it's basically unusable as-is.
For this reason, paper is at best useful as a bootstrapping mechanism, which would allow readers to construct a mechanism to read more densely encoded data. My best guess is that the main storage of information in this case would likely be microfilm, which should be at least 100x denser than the ideal paper data storage. Higher density allows for using less dense encodings to aid readers. And as far as I know, microfilm is no harder to preserve than paper.
Or just go with metal https://rosettaproject.org/
Or try to create a culture for humans and store information in that.
A fiber laser in the 100 W range would do it, maybe $10k?
You could do photochemical etching, but it would be more fuss and wouldn't last as long as laser engraving.
Probably looking at on the order of 1 gig per 1000 kg if using 1 mm 316 plate (napkin math only, naive estimate). Interesting to explore.
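A sketch of that napkin math, with the engraving pitch as the big unknown (0.1 mm dots, one side, no ECC overhead assumed here):

  # Engraved 316 stainless plate as cold storage: naive upper bound.
  density_g_cm3 = 8.0    # roughly 316 stainless
  mass_kg = 1000
  thickness_cm = 0.1     # 1 mm plate
  dot_pitch_mm = 0.1     # assumed pitch, one bit per dot

  volume_cm3 = mass_kg * 1000 / density_g_cm3
  area_mm2 = volume_cm3 / thickness_cm * 100    # plate area, cm^2 -> mm^2
  bits = area_mm2 / dot_pitch_mm ** 2
  print(f"~{bits / 8 / 1e9:.1f} GB per tonne")  # ~1.6 GB, same order as above

Fiducials, kerf, and error correction would all pull the real number down.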
Not that I'm aware of. A DVD writer laser is maybe 200 mW, so it's not going to be able to engrave most materials, or will do it VERY slowly at best. The spot size is ridiculously small, though (this is good), so they are still interesting.
Most people interested in light wood or thin plastic applications have moved on to the small 5-20 W diode laser form factor; these are available for a few hundred dollars if you aren't too worried about safety (e.g. no kids in the house). Something with a proper enclosure, interlocks, and air handling costs more, but still sub-$1k. The spot size is much bigger than a DVD laser's, though; you can't get anything like the same resolution.
Modding a DVD laser has much higher hack value but it seems to have gone out of style as hobby lasers became widely available as a product.
Re: materials, if you are not on the "happy path" (a material supported by the manufacturer or specifically designed for lasers), you have to get samples and test.
There are a few different interactions between laser spot size, wavelength, power, passes, etc. and the material, which means different people (with different systems) tend to get different results. The variability limits the "shareability" of results; probably the biggest sources of material info / laser settings are the forums of the laser manufacturers, because it makes the most sense to share settings with other users of the same system.
As you noted, glue and nonuniformity are a big thing; most materials aren't designed to be burned / vaporised. For glass specifically, I think the most practical way would be a CO2 laser, which is different again.
We have paper books from 500 years ago. Microfiche is already deteriorating.
If you keep paper dry and flat, and use pH-neutral inks and paper, it is extremely stable.
I'd also expect the plastics to go yellow and opaque over long periods, and recovering the document without damage may be difficult or impossible.
If we just have text files, and maybe vector graphics for simple schematics, that's a lot of info.
You could encode data in monolithic structures this way. They'd last longer than paper and give future generations lots of confusion trying to figure out the meaning.
I find it interesting that, if you were to print 4 sheets double-sided, you would have roughly the same amount of information stored as a 720 kB 5 1/4" floppy disk, and if you cut and folded them, they would take up roughly the same size and weight.
https://youtu.be/mIGotStRCkA?si=toG5xeLMZzjIGTxC
It's more like a long, linear barcode, but still. More often, they put the source code in the magazine and you'd just type it into your machine.
I am not sure why, for character-based encodings, they used a general-purpose font (Inconsolata) rather than one that is specifically made for OCR, and how much that would have helped.
Going further, if you only print a limited alphabet (16, 32, or 39 symbols), why not use a specialized font with only those characters? The final step is to use a bitmap "font" that simply shows different character values as different bit patterns.
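A dependency-free sketch of that final step, writing each byte as its own row of pixels in a plain-text PBM bitmap (the format and layout are arbitrary choices here):

  # A "font" where a character's glyph is simply its bit pattern.
  # One byte per row, one pixel per bit, as a plain PBM image.
  def bytes_to_pbm(data: bytes, path: str) -> None:
      with open(path, "w") as f:
          f.write(f"P1\n8 {len(data)}\n")  # PBM header: width height
          for byte in data:
              f.write(" ".join(str(byte >> (7 - i) & 1) for i in range(8)) + "\n")

  bytes_to_pbm(b"hello, paper", "out.pbm")

Decoding is the same loop in reverse: no OCR needed, just a threshold and a known grid.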
https://www.monperrus.net/martin/perfect-ocr-digital-data
From the linked article:
>The optimal font varies very much on the considered engine. Monospaced fonts (aka fixed-width) such as Inconsolata, are more appropriate in general. ocr-a and ocr-b give really poor results.
I noticed that they liked using lower case letters for bases where that is a choice. I would think that the larger upper case letters would be better for OCR. Using lower case for either OCR-A or OCR-B would be a poor idea in any case: the good OCR properties are only provided for the upper case letters; the lower case letters were mostly provided for completeness.
Also, the author might be training on entire blocks of characters rather than individual characters. That isn't really what you want here unless you are using something like words for your representation. OCR-A and OCR-B were designed for character-by-character OCR.
I saw some work a while ago on storing table data extracted from SQL as an image, and always thought that with good compression and a good printer, you could make paper copies.
I will try to remove the dust from my A4 scanner and try to read that MP3 from the printed medium. It seems a bit insane to store multimedia on paper, but who needs storage without a proven ability to read it back? My printers love to mess with ink (especially the ones with pirate-refilled cartridges), so I do not really believe this is practical at maximum resolution.
https://www.monperrus.net/martin/perfect-ocr-digital-data
(Last section before conclusion.)
IIUC this provided the best overall reliable information density (at 4.2kb / A4 page).
I've seen these barcodes scan accurately off dingy plastic cards using webcams.
The information per symbol is not great (about 1 kb), but the error correction and physical layout work really well.
100 errors in an 876 kB file would be about a 0.0014% error rate. You are going to need another level of ECC on top of that.
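For the record, the arithmetic, treating those as bit errors:

  # 100 bit errors in an 876 kB file.
  errors, size_bytes = 100, 876 * 1024
  print(f"{errors / (size_bytes * 8):.4%}")  # ~0.0014%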
It's probably worth mentioning https://github.com/za3k/qr-backup/ which is tested in practice rather than merely theoretical. It doesn't achieve very high density, though.
The theoretical information capacity of an uncoated 600dpi laser-printed page ought to be close to 600×600 bits per square inch, 23.6×23.6 bits per square millimeter in modern units. This is 33.7 megabits per US letter page or 34.8 megabits per A4 page. The bit error rate of a laser printer is quite low, under 1%, and the margins are maybe another 5% at most. So modest ECC ought to be able to deliver most of that channel capacity in practice. QR codes and OCR apparently don't come close.
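As a sketch, taking the 5% margins and 1% bit error rate above as assumptions and treating the page as a binary symmetric channel:

  # Capacity of a 600 dpi 1-bit page, minus margins and the Shannon ECC cost.
  from math import log2

  dpi, ber, margin = 600, 0.01, 0.05
  h = -ber * log2(ber) - (1 - ber) * log2(1 - ber)  # binary entropy H(ber)
  for name, w_in, h_in in (("US letter", 8.5, 11.0), ("A4", 8.27, 11.69)):
      raw = dpi * dpi * w_in * h_in                 # raw bits on the page
      usable = raw * (1 - margin) * (1 - h)         # margins, then channel capacity
      print(f"{name}: {raw / 1e6:.1f} Mbit raw, ~{usable / 8 / 1e6:.1f} MB usable")

That works out to roughly 3.7 MB per page at the Shannon limit under those assumptions.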
As an exercise, 13 years ago, I designed a proportional 1-bit-deep pixel font for printable ASCII, based on Janne Kujala's work, that averages about 3½×6 pixels. This is about 20 bits per character, so a letter-sized page should hold almost a megabyte of human-readable ASCII text. I generated the King James Bible in it at 600dpi. It comes to about four pages. Printed out in a half-assed way at double size (300dpi) on a 600dpi printer, you can read it pretty easily with a good magnifying glass. I have not yet been able to get an even partly readable printout at full resolution. If someone else tries it, I'm interested in hearing your results.
http://canonical.org/~kragen/bible-columns.png (warning, 93+-megapixel image, 4866×19254)
http://canonical.org/~kragen/bible-columns-320x200.png (small excerpt from the above)
http://canonical.org/~kragen/sw/netbook-misc-devel/6-pixel-1... (the font as a 374×7 image)
http://canonical.org/~kragen/sw/netbook-misc-devel/propfontr... (the image generation program I regret having written in Python because it won't run in current Python)
http://canonical.org/~kragen/sw/netbook-misc-devel/bible-pg1... (test input text, public domain everywhere except the UK)