There is zero information on the submitted page besides what's in the title, and this is a six-year-old GitHub repository with no activity in years.
What are we looking at? What is the significance of this repo? Did it invent this idea? Does it have any practical uses? Is there any broader context at all?
Someone posts "Tomasulo's algorithm" or "Newton's method" and then the comments are all just random anecdotes about a time someone encountered the topic. It is like playing a word association game.
Homomorphic encryption is a way of performing computations on encrypted data without having to decrypt it first. A simple example would be multiplying a number field in an encrypted blob by 2.
The repo title refers to JPEG compression (JPEG being a file format for photograph-like images that uses lossy compression to reduce file size).
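As a minimal sketch of that multiply-by-2 example, here's how it looks with the python-paillier (`phe`) library; the library choice is mine, not something from the submission, and Paillier is only additively homomorphic (ciphertext plus ciphertext, ciphertext times plaintext constant), which is enough for this case:

```python
# Minimal sketch of "multiply an encrypted number by 2" using the
# python-paillier (`phe`) library. Paillier supports adding ciphertexts
# and multiplying a ciphertext by a *plaintext* constant.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

ciphertext = public_key.encrypt(21)  # the "number field in an encrypted blob"
doubled = ciphertext * 2             # computed without ever decrypting

assert private_key.decrypt(doubled) == 42
```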
The repo links to another project which converts bitmap images (an array of pixel colors… basically the completely uncompressed representation of an image) to JPEG compressed format.
If that is indeed your comment, I think the problem was that you were overly dismissive and sarcastic. The parent poster asks questions instead, and doesn't make remarks beyond stating the facts; this reads as harsh, but fair. (I also suppose they've accumulated some karma, and this helps with the flagging algorithm somehow.)
Maybe.
"There is zero information on the submitted page besides what's in the title, and this is a six-year-old GitHub repository with no activity in years."
We made the same points, though I'm admittedly more flippant and curt.
Wouldn't you compress the file independently ... regardless of any type of encryption (homomorphic or not)?
I'm confused.
Like, for example, let's say I want to farm out word counts to the cloud. Wouldn't the information required to identify what a "word" is require the running software to be able to see breaks/periods/etc.? Doesn't that leak information about the ciphertext? How does it stop someone from writing software that, for example, maps out the positions of all the a's, then b's, then c's, etc. in a ciphertext, and from MITMing it?
Since the scheme is asymmetric, the untrusted agent can also encrypt new data with the public key, meaning that they can do things like: `return encrypt(x) + encrypted_data_from_user * encrypt(y)`
This alone doesn't let the untrusted agent evaluate a boolean condition or run an algorithm like the one you proposed, but they can at least run encrypted data through a neural network and send the encrypted output to the user.
[1] https://www.cs.cmu.edu/~odonnell/hits09/gentry-homomorphic-e...
Unfortunately the mathematical details escape me as well. Maybe one day I'll set aside some time to look into it.
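To make the "encrypt new data with the public key" pseudocode above concrete, here's a self-contained toy using unpadded ("textbook") RSA, which happens to be multiplicatively homomorphic. This is a stand-in of mine, not the scheme from [1], and it only shows the multiplication half of that pseudocode (the addition half would need an additively or fully homomorphic scheme):

```python
# Unpadded RSA is multiplicatively homomorphic:
#   Enc(a) * Enc(b) mod n == Enc(a * b)
# Toy parameters, completely insecure; for illustration only.

p, q = 61, 53                       # tiny primes
n = p * q                           # public modulus
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def enc(m):  # anyone holding the public key (n, e) can do this
    return pow(m, e, n)

def dec(c):  # only the user, holding d, can do this
    return pow(c, d, n)

# The user encrypts their data and ships the ciphertext off.
encrypted_data_from_user = enc(12)

# The untrusted agent encrypts its own constant y with the public key
# and multiplies it in, never seeing the user's plaintext.
y = 3
result = (enc(y) * encrypted_data_from_user) % n

# Back at the user, decryption reveals 12 * 3.
assert dec(result) == 36
```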
> Wouldn't the information required to identify what a "word" is require the running software to be able to see breaks/periods/etc?
Yes. But that information can be encrypted and still computed on. You might suppose that since encrypted data is essentially gibberish, multiplying or adding different blocks of it together will only generate more gibberish. The insight is that it is not entirely gibberish, or else how would we be able to decrypt it? The information needed to identify a word is still there, and you can write a program that sees breaks/periods/etc., but you won't know when it sees a break/period until decryption of the result.
> Doesn't that leak information about the ciphertext?
This question seems to encode an assumption that the cloud in your example is able to see the result of the search query. The key insight here is that the cloud is only able to compute the encrypted result. It can only return that encrypted result to the requester, who has the private key, can decrypt the result, and can see how many words were counted (see the sketch below).
> How does it stop someone from writing software that, for example, maps out the positions of all the a's, then b's, then c's, etc. in a ciphertext, and from MITMing it?
I'm a bit confused by the attack here. I think it is also assuming that the untrusted computing party is able to read the plaintext result of the operation.
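Concretely, the word-count flow might look like the sketch below. Again the `phe` Paillier library is my stand-in, with one simplification: the user computes the per-character separator bits before encrypting, because additively homomorphic Paillier can't do that comparison on ciphertexts the way a fully homomorphic scheme could. The point it does show is that the cloud only ever handles encrypted values and returns an encrypted count:

```python
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# --- user side ---
text = "the quick brown fox"
# 1 where a word ends (separator, or end of text), 0 elsewhere.
bits = [1 if ch == " " else 0 for ch in text] + [1]
encrypted_bits = [public_key.encrypt(b) for b in bits]

# --- untrusted cloud ---
# Adds ciphertexts without learning any individual bit; the running
# total stays encrypted the whole time.
encrypted_count = encrypted_bits[0]
for c in encrypted_bits[1:]:
    encrypted_count = encrypted_count + c

# --- user side again ---
assert private_key.decrypt(encrypted_count) == 4  # four words
```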
Here's an illustrative example. Suppose my encryption scheme is Enc(key, m) = key*m = c. Suppose my decryption scheme is Dec(key, c) = c/key = m. This scheme is not secure, but pretend that it is, and that separating out key and m from c is difficult.
I want the untrusted cloud to compute m1 * m2.
I can perform Enc(key, m1) = c1, Enc(key, m2) = c2 and send c1 and c2 to the cloud to multiply.
The cloud receives c1 and c2 which are really key*m1 and key*m2, but we are assuming that the cloud can't separate these factors from the products.
The cloud returns c1*c2, which we know equals key*m1*key*m2 = key^2 * (m1*m2).
If we divide c1*c2 by key^2 (note: this is just running our decryption algorithm with a modified key), we will get m1*m2, which is what we wanted!
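Spelled out in code, the whole exchange is just a few lines (a direct transcription of the toy scheme above, not a real cipher):

```python
# Toy scheme: Enc(key, m) = key * m, Dec(key, c) = c / key.
# NOT secure; it only shows the homomorphic bookkeeping.

def enc(key, m):
    return key * m

def dec(key, c):
    return c // key  # exact here, since c is always a multiple of key

key = 7
m1, m2 = 5, 9

# The user encrypts and ships both ciphertexts to the cloud.
c1, c2 = enc(key, m1), enc(key, m2)

# The cloud multiplies ciphertexts without knowing key, m1, or m2.
product = c1 * c2  # == key**2 * (m1 * m2)

# The user decrypts with the key "modified" to key**2 ...
assert dec(key**2, product) == m1 * m2
# ... or, equivalently, divides the ciphertext by key first and then
# decrypts with the original key.
assert dec(key, product // key) == m1 * m2
```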
For a more formal example using ElGamal Encryption (apologies for spelling ElGamal wrong in the paper) I wrote this up: https://github.com/lsnow99/elgamal/blob/main/elgamal.pdf
Credit to https://www.cs.cmu.edu/~goyal/15356/lecture_notes.pdf for definitions
In your example you decrypted by key^2. Is the decryption requiring a different or derivative key a general feature of the algorithm, or just a quirk of your example?
It's also somewhat definitional. We're taking an established encryption scheme in both cases and constructing a new algorithm on top of it. Above, where I say we're performing decryption with the key set to key^2, we could also think of it as first dividing the ciphertext by key and then doing Dec(key, key*(m1*m2)): the key is the same, but we've done some operation on the ciphertext instead.
I was on a mobile phone, so it was too hard to read the handwritten text. I will definitely check it out later.
Anyway thanks again for the additional explanation, much appreciated.
I personally would prefer something like homomorphic encryption for, say, SQL queries on a database that the server can never read.
All I remember is that this didn't end up becoming practical because of some limitations, which were maybe discussed in the paper. Something about leaking row counts.