If you assume competence from Google, they probably have two different watermarks: a sloppy one they offer an online oracle for, and one they keep in reserve for themselves (and law enforcement requests).
Also given that it's Google we are dealing with here, they probably save every single image generated (or at least its neural hash) and tie it to your account in their database.
I also read a lot of comments on HN that start by attacking the source of the information, such as saying it was AI assisted, instead of the actual merits of the work.
The HN community is becoming curmudgeonly and using AI tooling as the justification.
This one is such a gigantic clusterfuck... They're mimicking ASCII tables using Unicode chars of varying length and, at times, there's also an off-by-one error. But the model (not Claude, but the model underneath it) is capable of generating ASCII tables.
P.S.: I saw the future... The year is 2037 and we've got Unicode tables still not properly aligned.
There's also an argument that AI is not special: media has been manipulated deceptively since time immemorial, and we never added or needed invisible watermarks for those other methods. But that's less of a practical argument and more of a philosophical one.
So it's a "no" by default.
[0]: If it does what it claims to do. I didn't verify. Given how much AI writing is in the README, my hunch is that this doesn't work better than simple denoising.
The README itself reads like unedited AI output with several layers of history baked in.
- V1 and V2 appear in tables and diagrams but are never explained. V3 gets a pipeline diagram that hand-waves its fallback path.
- The same information is restated three times across Overview, Architecture, and Technical Deep Dive. ~1600 words padded to feel like a paper without the rigor.
- Five badges, four of them made up, for a project with 88 test images, no CI, and no test suite. "Detection Rate: 90%" has no methodology behind it. "License: Research" links nowhere and isn't a license.
- No before/after images, anywhere, for a project whose core claim is imperceptible modification.
- Code examples use two different import styles. One will throw an ImportError.
- No versioning. If Google changes SynthID tomorrow, nothing tells you the codebook is stale.
The underlying observations about resolution-dependent carriers and cross-image phase consistency are interesting. The packaging undermines them.
There are already ten million AI image generators, the overwhelming majority of which do not watermark their outputs. Google auto-inserting watermarks is nice, but tools to remove them will inevitably become widespread.
One workflow that some artists use is that they draw with ink on paper, scan, and then digitally color. Nothing prevents someone from generating line art using generative AI, printing it, scanning it, and coloring it.
And what if someone just copy-pastes something into Photoshop or imports layers? That's what you'd do for composites that mix multiple images together. Can one copy-paste screenshots into a multi-layer composition, or is that verboten and taints the final image?
And what about multi-program workflows? Let's say I import a photo, denoise it in DxO, retouch it in Affinity Photo, resize it programmatically using ImageMagick, and use pngcrush to optimize it. What metadata is left at the end?
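To make the metadata question concrete, here is a minimal sketch in stdlib-only Python. It builds a 1x1 PNG carrying a provenance-style `tEXt` chunk, then runs a hypothetical optimizer step that drops every ancillary chunk. The `Software` value, `make_demo_png`, and `strip_ancillary` are made up for illustration; this is not a claim about what pngcrush or any of the tools above do by default.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, payload) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        payload = png[pos + 8:pos + 8 + length]
        yield chunk_type, payload
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def chunk_names(png: bytes) -> list:
    return [chunk_type.decode("ascii") for chunk_type, _ in iter_chunks(png)]


def strip_ancillary(png: bytes) -> bytes:
    """Hypothetical optimizer pass: keep only critical chunks.

    In PNG, a chunk is ancillary when the first letter of its type is lowercase.
    """
    out = [PNG_SIGNATURE]
    for chunk_type, payload in iter_chunks(png):
        if chunk_type[:1].isupper():
            out.append(make_chunk(chunk_type, payload))
    return b"".join(out)


def make_demo_png() -> bytes:
    """1x1 8-bit grayscale PNG with an illustrative tEXt metadata chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"tEXt", b"Software\x00hypothetical-generator")  # made-up value
        + make_chunk(b"IDAT", pixels)
        + make_chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    demo = make_demo_png()
    print("before:", chunk_names(demo))
    print("after: ", chunk_names(strip_ancillary(demo)))
```

In this sketch the `tEXt` chunk disappears while the `IDAT` pixel data passes through byte for byte, so any label stored as file metadata is gone after one such step, and only something embedded in the pixels themselves would still be there to detect.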
If only everyone would just agree with me.
This project proves that whatever red teaming was in place wasn't good enough.
Oh hey, neat. I mentioned this specific method of extracting SynthID a while back.[1]
Glad to see someone take it up.
Meta: your comment was marked [dead], like a few other constructive comments I saw in recent days. Not sure why.
I appreciate you pointing it out, but this account is banned. Thank you for vouching though!