What if you're a user who doesn't care about searching for "dog" in your own photos, and might not even use the Photos app, but Apple still scans all your photos and sends data off-device without asking you?
Perhaps this complicated dance works, perhaps they have made no mistakes, perhaps no one hacks or coerces the relay host providers... they could still have just asked for consent the first time you open Photos (if you ever do) before starting the scan.
Google, on the other hand, uploads photos to their server and does the analysis there.
There is the infamous case of the parents who Google tried to have arrested after they used their Android device to seek medical assistance for their child during lockdown. Their doctor asked them to send images of the problem, and Google called the police and reported the parents for kiddie porn.
> “I knew that these companies were watching and that privacy is not what we would hope it to be,” Mark said. “But I haven’t done anything wrong.”
The police agreed. Google did not.
https://www.nytimes.com/2022/08/21/technology/google-surveil...
Google refused to return access to his account even after the police cleared him of wrongdoing.
It's sort of like the crime of "contempt of court", but after the fact: receiving a judge's ruling on how you must interpret a law during a case, then going right back to your own interpretation once you leave the courtroom.
This is why I constantly work to help people reduce their dependence on Google. Screw that. If anyone ever tells you that they rely on Google for anything, show them this article.
But I definitely live in fear of Google fucking up and disabling my account.
> Google, on the other hand, uploads photos to their server and does the analysis there.
The comment you're replying to (and the whole sub-thread, in fact) isn't about whether Apple's way of doing it is the best or worst way, but about the fact that they don't ask for permission before doing it. Regardless of the technical details, not asking beforehand is what's being argued about here.
The parents Google tried to get arrested in the story above do not agree.
> When Mark’s and Cassio’s photos were automatically uploaded from their phones to Google’s servers, this technology flagged them. Jon Callas of the E.F.F. called the scanning intrusive, saying a family photo album on someone’s personal device should be a “private sphere.” (A Google spokeswoman said the company scans only when an “affirmative action” is taken by a user; that includes when the user’s phone backs up photos to the company’s cloud.)
Google not only automatically uploaded their images to their server, it analyzed those images and reported the users to the police for kiddie porn based on a single false positive.
If you care about not sending photos to Google, it's pretty obvious how to not have that happen.
IMO, Google is not the bad guy here, although when it was explained to them that the photos were legitimate, they should definitely have reenabled the account.
I'm OK with Google scanning photos that I send to them that will be stored on their servers. Honestly, how can they not?
I never enable cloud backups, because it means my shit is sent somewhere.
Then they proceed to claim those automatic backups are an "affirmative action" that justifies them scanning the contents of your images as well.
When setting up a new phone (and many times thereafter) they prompt you to enable photos backup. It's not on by default if I remember correctly.
You keep saying that, but it remains false. The parents explicitly opted in to sending their photos to Google.
About weekly it prompts me with a huge popup whether I want to continue without backup, with "enable backup" selected by default. If I deselect this I'm prompted with another popup asking me to back up specific selected photos. If I misclick either of these (which is easy, since they pop up after briefly showing my photos which I'm actively trying to tap on), then Google will start hoovering up all my photos without confirmation.
Their "consent" form is user-hostile and it's disingenuous to hold it as an example of Google protecting privacy.
Pro tip: install Google Gallery which (ironically) is effectively a de-Googled Photos. Unfortunately it's also stripped down in other ways but it suffices for simply viewing photos on your own device.
Sending everything to a server is, however, how Google's service works.
You don't need any encryption to process locally.
Apple handles those tasks on the user's own device(s) for privacy reasons.
Of course maybe that question has its own answer. Apple markets itself as the last personal computing company where you are the customer, not the product, so they are held to a higher standard.
What they should do is do the processing locally while the phone is plugged in, and just tell people they need a better phone for that feature if it's too slow. Or do it on their Mac, if they own one, while that is plugged in.
Perhaps it's not as visible because Google hasn't defaulted to opt-in for most of these? Or because a lot of it is B2B and Google-internal (e.g., a differential-privacy layer on top of SQL for internal metrics)
[edit]: this link was a very vague press release that doesn't say exactly how Google uses it: https://security.googleblog.com/2019/06/helping-organization...
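For what it's worth, the core building block of such a differential-privacy layer is just calibrated noise added to aggregates before anyone sees them. A toy sketch in Python (the epsilon and sensitivity values are assumptions, not anything from Google's actual system):

    import numpy as np

    def dp_count(true_count: int, epsilon: float = 0.5, sensitivity: float = 1.0) -> float:
        # Laplace mechanism: noise scale = sensitivity / epsilon.
        # Smaller epsilon => more noise => stronger privacy guarantee.
        noise = np.random.default_rng().laplace(0.0, sensitivity / epsilon)
        return true_count + noise

    print(dp_count(1234))  # e.g. 1236.8 -- a noisy count safe(ish) to report as a metric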
Most people I know of wouldn't care about such a feature beyond a breathless sort of "Wow, Apple tech!" So they are building something intended to win over privacy-conscious people; kudos to them, everyone stands to benefit. But the most basic custom in that subculture is consent. So they built something really great and then clumsily failed on the easiest detail, because it is so meaningless to everyone except the target audience. That audience doesn't bother criticising Google or Microsoft (again), because it goes without saying that those companies are terrible; it doesn't need to be said again.
Just because it can’t be “appreciated” by all users doesn’t mean it’s only “for” a small sub-group.
It seems to me they’re just trying to minimise the data they have access to — similar to private cloud compute — while keeping up with the features competitors provide in a less privacy-respecting way. Them not asking for permission makes it even more obvious to me that it’s not built for any small super privacy-conscious group of people but the vast majority of their customers instead.
The issue isn't slowness. Uploading photo library data/metadata is likely always slower than on-device processing. Apparently the issue in this case is that the world locations database is too large to be stored locally.
What kind of capacity can ROM chips have these days? And at what cost?
But now that we're talking about these differences, I'd say that Apple users are notoriously complacent and defend Apple and their practices. So, perhaps in some part it is in an attempt to compensate for that? I'm still surprised how we've now accepted that Apple receives information pretty much every time we run a process (or rather, if it ran more than 12 hours ago, or has been changed).
You think Trump is bad? Well, Putin is worse. You think Putin is bad? Kim Jong Un is worse.
I also appreciate Ente's Guest View[0] that lets you select the photos you want to show someone in person on your phone to prevent the issue of them scrolling too far.
[0] https://github.com/ente-io/ente
Likewise, you need to trust your spouse or significant other, but if there are obvious signs of cheating, then you need to be suspicious.
An essential part of trust is not overstepping boundaries. In this case, I believe that Apple did overstep. If someone demands that you trust them blindly and unconditionally, that's actually a sign you shouldn't trust them.
That's certainly a take, and you're entitled to it. I don't disagree with the point you make; this ought to have been opt-in.
What you should do now is acknowledge this in your original post and then explain why they should have been more careful about how they released this feature. Homomorphic encryption of the data reframes what you wrote somewhat. Even though data is being sent back, Apple never knows what the data is.
Do you mean my original blog post? The one that not only mentions homomorphic encryption but also links to Apple's own blog post about it? I don't know how that can "reframe" what I wrote when it already framed it.
My charitable interpretation is that the app allows a greater verification process so the bank trusts it more and it's "to protect me, the user". But then, the website lets me transfer $100,000 using a multitude of other methods if I want (wire, e-check, cashier's check), so... yeah.
Edit: Not to mention that many of the newer banks don't even have web banking; it's app-only. Of course, it's your choice to open an account there, though.
When they start opting me into photo scanning I lose a bit of trust. The homomorphic encryption makes it less bad. The relative quiet around the rollout of the feature makes it worse. Apple's past attempt to start client side scanning makes it worse. Etc...
The net result is I trust them a bit less. Perhaps not enough to set my Apple devices on fire yet, but a bit.
Why would an open source android distro be more trustworthy?
Trust has many meanings but for this discussion we’ll consider privacy and security. As in, I trust my phone to not do something malicious as a result of outside influence, and I trust it to not leak data that I don’t want other people to know.
Open source software is not inherently more secure nor more private. However it can sometimes be more secure (because more people are helping find bugs, because that specific project prioritizes security, etc.) and is usually more private. Why? Because it (usually) isn’t controlled by a single central entity, which means there is (usually) no incentive to collect user data.
In reality it’s all kind of a mess and means nothing. There’s tons of bugs in open source software, and projects like Audacity prove they sometimes violate user privacy. HN-type people consider open source software more secure and private because you can view the source code yourself, but I guarantee you they have not personally reviewed the source of all the software they use.
If you want to use an open-source Android distro I think you would learn a lot. You don’t need to have a CS degree. However unless you made massive lifestyle changes in addition to changing your phone, I’m not confident it would meaningfully make you more secure or private.
I have other reasons, perhaps, to prefer open source stuff, but I am not ready to assume it is inherently more private or secure.
Granted, they require you to opt in for the Photos app to be usable & if you go out of your way to avoid opting in, they make your photo-browsing life miserable with prompts & notifications. But you do still have to opt in.
If you dare turn off Play Protect for example, you will be asked to turn it on every time you install or update anything. Never mind that you said no the last thousand times it asked.
Tech companies love doing this. Apple does the same, so does Microsoft.
If you know some choice isn't right for you (now or forever), and the company is feeling extra generous today, and you're in luck, you'll get a "Do this now, or I'll remind you later" choice. But then sometimes they just decide that "This is how things are now".
I've had this happen in every environment except Linux, where I get to shoot myself in the foot whenever I want, and sometimes a bit more.
I've used Simple Gallery Pro before but it's not very good.
Currently using Immich but that's not really a general photo app - it's got a narrow use case - so I still use the Google Photos app alongside it quite often.
Specific alternative recommendations that aren't malware welcome.
It's rock solid for me. You can browse folders, move, copy, hide, and make small edits. You can't search for 'dog', which is a plus, and it doesn't scan faces.
see: https://github.com/SimpleMobileTools/General-Discussion/issu...
If (like me) you don't need e2e, I can highly recommend Immich for its use case, though.
Unfortunately Google's camera app will only open Google Photos if you tap the image preview after taking a photo. It just doesn't respect the default gallery app setting at all.
This feature is some new "landmark" detection, and it feels like a trial balloon or something, as it simply makes zero sense unless what they are categorizing as landmarks is enormous. The example is always the Eiffel Tower, but the data to identify most of the world's major landmarks is small relative to what the device can already detect. Not to mention that such lookups don't even need photo identification: they could instead (and actually already do, and long have) use simple location data and nearby POIs for such metadata tagging.
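To make that alternative concrete, here's a minimal sketch of the location-based approach (Python; the coordinates, the tiny POI list, and the 1 km threshold are all illustrative assumptions):

    from math import radians, sin, cos, asin, sqrt

    # Tiny illustrative subset; even a full list of major landmarks is small.
    LANDMARKS = {
        "Eiffel Tower": (48.8584, 2.2945),
        "Statue of Liberty": (40.6892, -74.0445),
    }

    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance between two lat/lon points, in kilometers.
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))

    def nearest_landmark(lat, lon, max_km=1.0):
        # Tag from the photo's GPS EXIF alone -- no image analysis needed.
        name, dist = min(((n, haversine_km(lat, lon, *c)) for n, c in LANDMARKS.items()),
                         key=lambda t: t[1])
        return name if dist <= max_km else None

    print(nearest_landmark(48.8583, 2.2944))  # -> "Eiffel Tower"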
The landmarks thing is the beginning, but I feel like they want it to be much more detailed. Like every piece of art, model of car, etc, including as they change with new releases, etc.
That's the problem with all those implementations: no feedback of any kind. No list of recognized tags. No information of what is or is to be processed. No nothing. Just magic that doesn't work.
https://github.com/pythongosssss/ComfyUI-WD14-Tagger
which uses specific models to generate proper booru tags out of any image you pass to it.
More importantly, I know for sure they have this capability in practice, because if you tap the right way in the right app, when the Moon is in just the right phase, both Samsung Gallery and OneDrive Photos do (or, in the case of OneDrive, used to):
- Provide occasional completions and suggestions for predefined categories, like "sunset" or "outerwear" or "people", etc.;
- Auto-tag photos with some subset of those (OneDrive, which also sometimes records it in metadata), or if you use "edit tag" options, suggest best fitting tags (Samsung);
- Have a semi-random list of "Things" to choose from to categorize your photos, such as "Sunsets", "City", "Outdoors", "Room", etc. Google Photos does that one too.
This shows they do maintain a list of correct and recommended classifications. They just choose to keep it hidden.
With regards to face recognition, it's even worse. There are zero controls and zero information, other than the occasionally matched (and often mismatched) face under photo properties, which you can sometimes delete.
There's nothing stopping either Apple or Google from giving users an option to just disable these connected features, globally or per-app. Just allow a "no cloud services" toggle switch in the Photos app, get the warning that $FEATURES will stop working, and be done.
I know why Google isn't doing this, they're definitely monetizing every bit of that analyzed content. Not really sure about Apple though, might be that they consider their setup with HE as being on par with no cloud connectivity privacy wise.
It’s more. It can also create memories: “trip to New York in 2020”, “Cityscapes in New York over the years”, or “Peter over the years” (with Peter being a person added to Photos).
Apple, Google, Microsoft and Samsung, they all seem to be tripping over each other in an effort to make this whole thing just as much ass-backwards as possible. Here is how it, IMHO, should work:
1) It scans stuff, detects faces and features. Locally or in the cloud or not at all, as governed by an explicit opt-in setting.
2) Fuck search. Search is not discoverable. I want to browse stuff. I want a list of objects/tags/concepts it recognized. I want a list of faces it recognized and the ability to manually retag them, and manually mark any that they missed. And not just a list of 10 categories the vendor thinks are most interesting. All of them. Alphabetically.
3) If you insist on search, make it work. I type in a word, I want all photos tagged with it. I click on a face, I want all photos that have matching face on it. Simple as that. Not "eventual consistency", not "keep refreshing, every 5th refresh I may show you a result", or other such breakage that's a staple experience of OneDrive Photos in particular.
Don't know about Apple, but Google, Microsoft and Samsung all refuse #2, and spectacularly fail at #3, and the way it works, I'm convinced it's intentional, as I can't even conceptualize a design that would exhibit such failures naturally.
EDIT:
4) (A cherry on top of making a sane product that works) Recognition data is stored in photo metadata, either directly or in a sidecar file, in one of the formats sane people actually use, and is both exported along with the photos and honored when importing new photos.
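A minimal sketch of what #4 could look like, writing tags to an XMP sidecar via the common dc:subject bag (Python; the sidecar naming convention and the tag set here are assumptions, as tools differ):

    from pathlib import Path

    def write_sidecar(photo: str, tags: list[str]) -> None:
        # dc:subject is the conventional home for keywords/tags in XMP.
        items = "".join(f"<rdf:li>{t}</rdf:li>" for t in tags)
        xmp = (
            '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
            '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
            '<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">'
            f"<dc:subject><rdf:Bag>{items}</rdf:Bag></dc:subject>"
            "</rdf:Description></rdf:RDF></x:xmpmeta>"
        )
        Path(photo).with_suffix(".xmp").write_text(xmp)

    write_sidecar("IMG_1234.jpg", ["dog", "sunset", "Alice"])  # -> IMG_1234.xmp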
This is a completely hypothetical scenario. If users with such requirements actually existed, PinePhones and similar devices would be significantly more popular.
I didn't like their license: it's BSD-3-Clause-Clear, and on top of that they state:
"Zama’s libraries are free to use under the BSD 3-Clause Clear license only for development, research, prototyping, and experimentation purposes. However, for any commercial use of Zama's open source code, companies must purchase Zama’s commercial patent license"
So it's not free: you need to pay for a patent license, and they don't disclose how much.
I recommend OpenFHE as an alternative free, open-source solution. I know it's C++ and not Rust, but there's no patent license, and it can do the same thing the blog post wants to do. It even has more features, like proxy re-encryption, which I think Concrete can't do.
It's like saying: "FREE* candy! (Free to look at, eating is $6.95 / pound)"
> If a company open sources their code under BSD3-clear, they can sell additional licenses to use the patents included in the open source code. In this case, it still isn’t the software that is sold, but rather the usage rights of the patented intellectual property it contains.
Every day I like the Apache licence more.
Just like 3+1 is not 3.
On the other hand, there's nothing explicitly stating that the permission is intended to extend to practicing the patents embodied in the software. That's just an inference any reasonable person would draw from the language of the license. It may be better to state it explicitly, as the Apache license does.
But it may be worse, because longer licenses contain more to argue over, and once you start listing the particular causes of action you're promising not to sue for, you may miss some. Suppose your program is for chemistry and includes a list of solubility parameters for different compounds. If someone copies that, that's a potential cause of action under national laws implementing the sui generis database rights in the EU Database Protection directive: https://www.eumonitor.eu/9353000/1/j4nvk6yhcbpeywk_j9vvik7m1... which postdates the authorship of the BSD license and isn't mentioned in the Apache License 2.0 either. Plausibly the explicit listing of copyright rights and patent rights in the license will incline courts to decide that no license to database rights was intended.
Some future legislation will presumably create new intellectual property restrictions we currently can't imagine, even if it also removes some of the ones we currently suffer from.
(A separate issue is that the patent holder may not be the person who granted the license or indeed have contributed in any way to the creation of the BSD-licensed software, which is just as much of a problem with the Apache license.)
Issues like these require thoughtful deliberation, and unfortunately the Reddit format adopted by HN makes that impossible—in fact, the editing and replying deadlines added for HN make it a medium even less amenable to such discussions.
Concrete's lawyers must believe that BSD doesn't grant patent rights. The Concrete license.txt is straight BSD, but the Readme says it only applies in certain situations. So is it BSD licensed or not? If that statement about patents in the Readme is load-bearing then what's stopping me from forking the project and removing it?
In the linked post, they say, "the original BSD3 license did not mention patents at all, creating an ambiguity that the BSD3-clear version resolves", which has an additional clause beginning, "NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY THIS LICENSE." Presumably if Metacarta's lawyers really did believe that BSD doesn't grant patent rights as they claim, they wouldn't have gone to the trouble to edit it to remove that implicit grant. And if Concrete's lawyers really did believe it, they probably would have gone with the actual open-source license everyone recognizes.
Copied from openfhe.org
Or maybe that's not what you meant...
The OHTTP scheme does not _technically_ prevent this. It increases the number of parties that need to cooperate to extract this information, in the hope that the attempt would be caught somewhere in the pipeline.
There are people doing NNs in HE, but most implementations do indeed require bootstrapping, and for that reason they usually use CKKS.
Is there any concrete info about noise budgets? It seems like that’s the critical concern, and I’d like to understand at what point precisely the security breaks down if you have too little (or too much?) noise.
What typically makes FHE computationally expensive is a “bootstrapping” step that removes the noise accumulated after X operations, which would otherwise threaten correctness. After bootstrapping you can do another X operations. Rinse and repeat until you finish the computation you want to perform.
Craig Gentry's thesis (written in 2009): http://crypto.stanford.edu/craig/craig-thesis.pdf
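A toy back-of-envelope of what "X operations" means (all numbers here are assumptions; real noise growth depends heavily on the scheme and its parameters):

    # Assume a fresh ciphertext carries ~3 bits of noise, each homomorphic
    # multiplication roughly doubles the noise's bit-size, and decryption
    # fails once the noise exceeds the budget. Not any real scheme.
    noise_budget_bits = 60   # assumed parameter
    noise_bits = 3.0

    multiplications = 0
    while noise_bits * 2 <= noise_budget_bits:
        noise_bits *= 2
        multiplications += 1

    print(f"~{multiplications} multiplications, then bootstrap to reset the noise")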
Some newer FHE schemes don't encounter a noise limit or don't use the bootstrapping technique.
Privacy by design is always nice to see.
Let's suppose Apple is evil (or they receive an order from a judge) and they want to know who is calling 5555-1234:
1) Add a new, empty encrypted database of "spam" numbers to the server (so there are now two encrypted databases in the system).
2) Add the encrypted version of 5555-1234 to it.
3) When someone checks, reply with the correct answer from the real database, but also check the second one and send the result to the police.
You can't be forced to hand over customer data after you have designed a system so that your servers don't ever have that information stored in the first place, court order or no.
If an iPhone was not sending any traffic whatsoever to the mothership, at least it would ring alarm bells if it suddenly started doing so.
Whereas if the app never phoned home and then upon upgrade it started to then you could decide to kill it and stop using the app / phone.
Of course, realistically <.00001% of users would even check for unexpected phone home, or abandon the platform over any of this. So in a way you're right.
If you accept that every photo captured will send traffic to the mothership, like the story here, then that is no longer something you can check, either.
In any case, as others have mentioned, no one cares. In fact, I could argue that the scenario I'm forecasting is exactly what has already happened: the photos app suddenly started sending opaque blobs for every photo captured. A paranoid guy noticed this traffic and asked Apple about it. Apple replied with a flimsy justification, but users then go to ridiculous extremes to justify that this is not Apple spying on them, but a new super-secret-magic-sauce that cannot possibly be used to exfiltrate their data, despite the fact that Apple has provided exactly 0 verifiable assurances about it (and in fact has no way to do so). And the paranoid guy will no longer be able to notice extra per-photo traffic in the future.
- iOS Photos -> Vectors
- Search Query "Dog photos" -> Vectors
- Result (Cosine Similarity): Look, some dog photos!
iPhones have plenty of local storage and compute power for doing this kind of thing when the phone is idle. And cosine similarity can work quickly at runtime.
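A minimal sketch of that pipeline (Python; embed_image/embed_text are hypothetical stand-ins for a CLIP-style encoder such as MobileCLIP, stubbed with deterministic random vectors so the sketch runs end to end):

    import numpy as np

    def embed_image(path: str) -> np.ndarray:
        # Stand-in: real embeddings would come from the image encoder.
        return np.random.default_rng(abs(hash(path)) % 2**32).standard_normal(512)

    def embed_text(query: str) -> np.ndarray:
        # Stand-in: real embeddings would come from the text encoder.
        return np.random.default_rng(abs(hash(query)) % 2**32).standard_normal(512)

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Index once while the phone is idle, then rank at query time.
    library = {p: embed_image(p) for p in ["dog_park.jpg", "eiffel.jpg", "receipt.png"]}

    def search(query: str, top_k: int = 2) -> list[str]:
        q = embed_text(query)
        return sorted(library, key=lambda p: cosine_similarity(q, library[p]),
                      reverse=True)[:top_k]

    print(search("Dog photos"))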
> This seems like a lot of data the client is getting anyway. I don’t blame you for questioning if the server is actually needed. The thing is, the stored vectors that are compared against are by far the biggest storage user. Each vector can easily be multiple kilobytes. The paper discusses a database of 35 million entries divided across 8500 clusters.
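Quick arithmetic on those figures (taking 2 KB per vector as an assumed point within "multiple kilobytes"):

    entries = 35_000_000
    bytes_per_vector = 2 * 1024
    print(f"{entries * bytes_per_vector / 1e9:.0f} GB")  # ~72 GB: far too big to ship on-device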
[1]: https://github.com/fguzman82/CLIP-Finder2
[2]: https://github.com/apple/ml-mobileclip
The idea that Apple is going to use this feature to spy on you completely misses the fact that they own the entire OS on your phone and are quite capable of directly spying on you via your phone if they wanted to.
I'm not sure there's a way out of this that doesn't involve open source and repeatable builds (and watch out for Reflections on Trusting Trust).
A huge privacy-bruising feature for nothing in our case.
As a concrete example, someone on my team today asked me “can you send me that photo from the comedy festival a couple years ago that had the nice projections on the proscenium?”. I searched apple photos (on my phone, while hiking through a park) for “sketchfest theater projection”. It used the OCR to find Sketchfest and presumably the vector embeddings of theater and projection. The one photo she was referring to was the top result. It’s pretty impressive.
It can’t always find the exact photo I’m thinking of the first time, but I can generally find any vaguely-remembered photo from years ago without too much effort. It is pretty magical. You should get in the habit of trying it out, you’ll probably be pleasantly surprised.
If the latter, please note that this feature doesn’t actually send a query to a server for a specific landmark — your device does the actual identification work. It’s a rather clever feature in that sense…
From a cursory glance, the computation of centroids done on the client device seems to obviate the need for sending embedded vectors of potentially sensitive photo details — is that incorrect?
I’d be curious to read a report of how on-device-only search (using latest hardware and software) is impacted by disabling the feature and/or network access…
https://machinelearning.apple.com/research/homomorphic-encry...
> The client decrypts the reply to its PNNS query, which may contain multiple candidate landmarks. A specialized, lightweight on-device reranking model then predicts the best candidate…
[please correct me if I missed anything — this used to be my field, but I’ve been disabled for 10 years now, so grain of salt]
I mean, the Wally paper contains enough information to effectively implement homomorphic encryption for similar purposes. The field was almost entirely academic ~12 years ago…
I miss talking shop on HN. Comments like that are why we can’t have nice things.
In the "appeal to cryptographers" section (which I really look forward to being fulfilled by someone, hopefully soon!), HE is equated to post-quantum cryptography. As far as I know, most current post-quantum encryption focuses on the elimination of Diffie-Hellman schemes (both over finite fields and over elliptic curves) since those are vulnerable to Shor's algorithm.
However, it's clear from the code samples later in the post (and not explained in the text, afaict) that a public key gets used to re-encrypt the resultant value of a homomorphic add or multiply.
Is this a case of false equivalence (in the sense that HE != post-quantum), or is it more the case that there's some new asymmetric cryptography scheme that's not vulnerable to Shor's?
The twist in FHE is that the server also has an encryption of the user's secret key, which adds an assumption called "circular security", and that's needed to do some homomorphic operations like key switching.
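For intuition, here's a toy LWE sketch (completely insecure, comically small parameters) showing the lattice-flavored structure underneath, which is why these schemes are conjectured post-quantum, plus homomorphic addition and its noise growth. This is not CKKS/BFV, and it omits key switching and the circular-security machinery entirely:

    import numpy as np

    rng = np.random.default_rng(0)
    q, n = 3329, 8                   # toy modulus and dimension (assumed)
    s = rng.integers(0, 2, n)        # secret key

    def encrypt(bit):
        a = rng.integers(0, q, n)
        e = int(rng.integers(-2, 3))                      # small noise term
        return a, (int(a @ s) + e + bit * (q // 2)) % q   # bit encoded at ~q/2

    def decrypt(a, b):
        v = (b - int(a @ s)) % q     # recovers e + bit*(q/2) while noise is small
        return int(q // 4 < v < 3 * q // 4)

    def add(c1, c2):                 # homomorphic XOR; noise grows to e1 + e2
        return (c1[0] + c2[0]) % q, (c1[1] + c2[1]) % q

    x, y = encrypt(1), encrypt(0)
    assert decrypt(*x) == 1 and decrypt(*add(x, y)) == 1  # 1 XOR 0 = 1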
So what gets called the "public key" in the blog post is just the (self?-)encrypted secret key from the user?
I'll read up on your other points after work -- appreciate the search ledes :)
Searching encrypted stuff is what I wondered about; in the past I had to decrypt everything before I could use a standard SQL LIKE search.
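For the exact-match case there's a standard workaround short of full HE: a "blind index", i.e. an HMAC of each normalized keyword stored beside the ciphertext, so equality lookups work without decrypting (substring LIKE still won't). A sketch (the key and column names are illustrative):

    import hashlib
    import hmac

    INDEX_KEY = b"key-kept-separate-from-the-encryption-key"  # assumption

    def blind_index(term: str) -> str:
        return hmac.new(INDEX_KEY, term.strip().lower().encode(), hashlib.sha256).hexdigest()

    # write:  INSERT INTO notes (ciphertext, name_idx) VALUES (?, ?)
    # search: SELECT ... WHERE name_idx = ?   -- bind blind_index("alice")
    assert blind_index("Alice ") == blind_index("alice")  # normalization makes these match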
Funny post today about cosine similarity
This downplays the issue. Knowing that Alice takes lots of screenshots of Winnie the Pooh memes means that Alice’s family gets put into Xinjiang concentration camps, not just a political disaster.
(This is a contrived example: iCloud Photos is already NOT e2ee and this is already possible now; but the point stands, as this would apply to people who have iCloud turned off, too.)
It's worth noting though that it's now possible to opt in to iCloud Photo e2ee with "Advanced Data Protection". [0]
It’s also not available in China.
Yes, if you bar the "trust me bro" element in your definition, you'll by definition have no such element.
Reality, though, doesn't care about your definition, and in reality this is exactly the "trust me bro" element that exists.
> But we’re already living in a world where all our data is up there, not in our hands.
If that's your real view, then why do you care about all this fancy encryption at all? It doesn't help if everything is already lost
You could also observe all bits leaving the device from the moment you initialize it and determine that only encrypted bits leave and that no private keys leave. That only leaves the gap of some side channel at the factory, but you could perform the calculation to verify that the bits are encrypted only with the key you expect them to be encrypted with.