There is zero information on the submitted page besides what's in the title, and this is a six-year-old GitHub repository with no activity in years.
What are we looking at? What is the significance of this repo? Did it invent this idea? Does it have any practical uses? Is there any broader context at all?
Someone posts "Tomasulo's algorithm" or "Newton's method" and then the comments are all just random anecdotes about a time someone encountered the topic. It is like playing a word association game.
Homomorphic encryption is a way of performing computations on encrypted data without having to decrypt it first. A simple example would be multiplying a number field in an encrypted blob by 2.
The repo title refers to JPEG compression (JPEG being a file format for photograph-like images that uses lossy compression to reduce file size).
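As a minimal sketch of that multiply-by-2 example, here's how it looks with the python-paillier (`phe`) library; the library choice is mine, not something from the submission, and Paillier is only additively homomorphic (ciphertext plus ciphertext, ciphertext times plaintext constant), which is enough for this case:

```python
# Minimal sketch of "multiply an encrypted number by 2" using the
# python-paillier (`phe`) library. Paillier supports adding ciphertexts
# and multiplying a ciphertext by a *plaintext* constant.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

ciphertext = public_key.encrypt(21)  # the "number field in an encrypted blob"
doubled = ciphertext * 2             # computed without ever decrypting

assert private_key.decrypt(doubled) == 42
```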
The repo links to another project which converts bitmap images (an array of pixel colors… basically the completely uncompressed representation of an image) to JPEG compressed format.
If that is indeed your comment, I think the problem was that you were overly dismissive and sarcastic. The parent poster asks questions instead, and doesn't make remarks beyond stating the facts; this reads as harsh, but fair. (I also suppose they've accumulated some karma, and this helps with the flagging algorithm somehow.)
Maybe.
"There is zero information on the submitted page besides what's in the title, and this is a six-year-old GitHub repository with no activity in years."
We made the same points, though I'm admittedly more flippant and curt.
Wouldn't you compress the file independently ... regardless of any type of encryption (homomorphic or not)?
I'm confused.
Like, for example, let's say I want to farm out word counts to the cloud. Wouldn't the information required to identify what a "word" is require the running software to be able to see breaks/periods/etc.? Doesn't that leak information about the ciphertext? How does it stop someone from writing software that, for example, maps out the positions of all the a's, then b's, then c's, etc. in a ciphertext, and from MITMing it?
Since the scheme is asymmetric, the untrusted agent can also encrypt new data with the public key, meaning that they can do things like: `return encrypt(x) + encrypted_data_from_user * encrypt(y)`
This alone doesn't let the untrusted agent evaluate a boolean condition or run an algorithm like the one you proposed, but they can at least run encrypted data through a neural network and send the encrypted output to the user.
[1] https://www.cs.cmu.edu/~odonnell/hits09/gentry-homomorphic-e...
Unfortunately the mathematical details escape me as well. Maybe one day I'll set aside some time to look into it.
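To make the "encrypt new data with the public key" pseudocode above concrete, here's a self-contained toy using unpadded ("textbook") RSA, which happens to be multiplicatively homomorphic. This is a stand-in of mine, not the scheme from [1], and it only shows the multiplication half of that pseudocode (the addition half would need an additively or fully homomorphic scheme):

```python
# Unpadded RSA is multiplicatively homomorphic:
#   Enc(a) * Enc(b) mod n == Enc(a * b)
# Toy parameters, completely insecure; for illustration only.

p, q = 61, 53                       # tiny primes
n = p * q                           # public modulus
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def enc(m):  # anyone holding the public key (n, e) can do this
    return pow(m, e, n)

def dec(c):  # only the user, holding d, can do this
    return pow(c, d, n)

# The user encrypts their data and ships the ciphertext off.
encrypted_data_from_user = enc(12)

# The untrusted agent encrypts its own constant y with the public key
# and multiplies it in, never seeing the user's plaintext.
y = 3
result = (enc(y) * encrypted_data_from_user) % n

# Back at the user, decryption reveals 12 * 3.
assert dec(result) == 36
```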
> Wouldn't the information required to identify what a "word" is require the running software to be able to see breaks/periods/etc?
Yes. But that information can be encrypted and still computed on. You might suppose that since encrypted data is essentially gibberish, multiplying or adding different blocks of it together will only generate more gibberish. The insight is that it is not entirely gibberish, or else how would we be able to decrypt it? The information needed to identify a word is still there, and you can write a program that sees breaks/periods/etc., but you won't know when it sees a break/period until decryption of the result.
> Doesn't that leak information about the ciphertext?
This question seems to encode an assumption that the cloud in your example is able to see the result of the search query. The key insight here is that the cloud is only able to compute the encrypted result. It can only return that encrypted result to the requester, who has the private key, can decrypt the result, and can see how many words were counted (see the sketch below).
> How does it stop someone from writing software that, for example, maps out the positions of all the a's, then b's, then c's, etc. in a ciphertext, and from MITMing it?
I'm a bit confused by the attack here. I think it is also assuming that the untrusted computing party is able to read the plaintext result of the operation.
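Concretely, the word-count flow might look like the sketch below. Again the `phe` Paillier library is my stand-in, with one simplification: the user computes the per-character separator bits before encrypting, because additively homomorphic Paillier can't do that comparison on ciphertexts the way a fully homomorphic scheme could. The point it does show is that the cloud only ever handles encrypted values and returns an encrypted count:

```python
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# --- user side ---
text = "the quick brown fox"
# 1 where a word ends (separator, or end of text), 0 elsewhere.
bits = [1 if ch == " " else 0 for ch in text] + [1]
encrypted_bits = [public_key.encrypt(b) for b in bits]

# --- untrusted cloud ---
# Adds ciphertexts without learning any individual bit; the running
# total stays encrypted the whole time.
encrypted_count = encrypted_bits[0]
for c in encrypted_bits[1:]:
    encrypted_count = encrypted_count + c

# --- user side again ---
assert private_key.decrypt(encrypted_count) == 4  # four words
```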
Here's an illustrative example. Suppose my encryption scheme is Enc(key, m) = key*m = c. Suppose my decryption scheme is Dec(key, c) = c/key = m. This scheme is not secure, but pretend that it is, and that separating out key and m from c is difficult.
I want the untrusted cloud to compute m1 * m2.
I can perform Enc(key, m1) = c1, Enc(key, m2) = c2 and send c1 and c2 to the cloud to multiply.
The cloud receives c1 and c2 which are really key*m1 and key*m2, but we are assuming that the cloud can't separate these factors from the products.
The cloud returns c1*c2, which we know equals key*m1*key*m2 = key^2 * (m1*m2).
If we divide c1*c2 by key^2 (note: this is just running our decryption algorithm with a modified key), we will get m1*m2, which is what we wanted!
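Spelled out in code, the whole exchange is just a few lines (a direct transcription of the toy scheme above, not a real cipher):

```python
# Toy scheme: Enc(key, m) = key * m, Dec(key, c) = c / key.
# NOT secure; it only shows the homomorphic bookkeeping.

def enc(key, m):
    return key * m

def dec(key, c):
    return c // key  # exact here, since c is always a multiple of key

key = 7
m1, m2 = 5, 9

# The user encrypts and ships both ciphertexts to the cloud.
c1, c2 = enc(key, m1), enc(key, m2)

# The cloud multiplies ciphertexts without knowing key, m1, or m2.
product = c1 * c2  # == key**2 * (m1 * m2)

# The user decrypts with the key "modified" to key**2 ...
assert dec(key**2, product) == m1 * m2
# ... or, equivalently, divides the ciphertext by key first and then
# decrypts with the original key.
assert dec(key, product // key) == m1 * m2
```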
For a more formal example using ElGamal Encryption (apologies for spelling ElGamal wrong in the paper) I wrote this up: https://github.com/lsnow99/elgamal/blob/main/elgamal.pdf
Credit to https://www.cs.cmu.edu/~goyal/15356/lecture_notes.pdf for definitions
In your example you decrypted by key^2. Is the decryption requiring a different or derivative key a general feature of the algorithm, or just a quirk of your example?
It's also somewhat definitional. We're taking an established encryption scheme in both cases and constructing a new algorithm on top of it. Above, where I say we're performing decryption with the key set to key^2, we could also think of it as first dividing the ciphertext by key and then doing Dec(key, key*(m1*m2)): the key is the same, but we've done some operation on the ciphertext instead.
I was on a mobile phone, so it was too hard to read the handwritten text. I will definitely check it out later.
Anyway thanks again for the additional explanation, much appreciated.
I personally would prefer something like homomorphic encryption for, say, SQL queries on a database that the server can never read.
All I remember is that this didn't end up becoming practical because of some limitations, which were maybe discussed in the paper. Something about leaking row counts.