It would be more interesting to devise a method that survives all extra data stripping and re-encoding, perhaps taking advantage of deterministic encoders, assuming they don't randomize pixel data on purpose.
In other words: turning the image data stream itself into a polyglot.
Watermarking tries to resist image data manipulation. Smuggling data is concerned with preservation of bytes.
Though if we're executing arbitrary code on the target anyway, ways of embedding data in an image are vast, including watermarking/steganography.
So you have a package that doesn't directly include malicious code or make network calls, yet it can still run malicious code from the network. This is much better than simple obfuscation because you can vary the payload, much as a command-and-control server would.
I probably should have minified it too...
generally it's the JPEG standard that allows the payload; abusing EXIF is how you operate the exploit.
JPEG specifies segments of up to 64 KiB each (the length field is 16 bits), and you can abuse an application segment to hold any "data" you want, as well as spilling into additional segments for more storage.
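A minimal sketch of the segment trick, assuming the payload goes into APP11 (`0xEB`) segments placed right after a leading APP0/JFIF segment. The marker choice and the function names `embed_in_app_segments` and `extract_app_payload` are illustrative assumptions, not anything from a particular tool. The 16-bit length field counts its own two bytes, so each segment carries at most 65533 payload bytes.

```python
import struct

SOI = b"\xff\xd8"          # start-of-image marker
MAX_CHUNK = 0xFFFF - 2     # the length field includes its own 2 bytes


def embed_in_app_segments(jpeg: bytes, payload: bytes, marker: int = 0xEB) -> bytes:
    """Split payload across APPn segments inserted near the start of the file."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG (missing SOI)")
    if not 0xE0 <= marker <= 0xEF:
        raise ValueError("marker must be an APPn marker (0xE0..0xEF)")

    # JFIF expects APP0 directly after SOI, so insert after it if present.
    pos = 2
    if jpeg[2:4] == b"\xff\xe0":
        (app0_len,) = struct.unpack(">H", jpeg[4:6])
        pos = 4 + app0_len

    segments = []
    for i in range(0, len(payload), MAX_CHUNK):
        chunk = payload[i:i + MAX_CHUNK]
        header = b"\xff" + bytes([marker]) + struct.pack(">H", len(chunk) + 2)
        segments.append(header + chunk)
    return jpeg[:pos] + b"".join(segments) + jpeg[pos:]


def extract_app_payload(jpeg: bytes, marker: int = 0xEB) -> bytes:
    """Concatenate the bodies of all matching APPn segments before SOS."""
    out = []
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        found = jpeg[pos + 1]
        if found == 0xDA:  # SOS: compressed image data follows, stop here
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if found == marker:
            out.append(jpeg[pos + 4:pos + 2 + length])
        pos += 2 + length
    return b"".join(out)
```

Decoders skip application segments they don't recognize, so the image still renders normally. Anything that re-encodes the image or strips unknown segments drops the payload, though.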
the raw steganography in its most primitive form is a comparison of two photos, one of which is pixel-shifted to encode the data.
in its advanced form, the pixels hold the encrypted data, while the application segments of the JPEG hold keys and/or matrix values, and you need a reference image. you can move fairly large volumes of ASCII-encoded data like this before it's noticed.
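A rough sketch of the primitive two-photo variant, assuming "pixel-shifted" means nudging a sample value by one step relative to the reference image. Images are modeled as flat lists of 8-bit samples to keep it self-contained; `embed` and `extract` are made-up names, and the encryption and key handling from the advanced form are left out.

```python
def embed(reference: list[int], message: bytes) -> list[int]:
    """Return a copy of reference where each 1-bit shifts one sample by a step."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(reference):
        raise ValueError("message too long for this image")
    stego = list(reference)
    for idx, bit in enumerate(bits):
        if bit:
            # Step up by one, or down at 255 so the value stays in 0..255.
            stego[idx] += 1 if stego[idx] < 255 else -1
    return stego


def extract(reference: list[int], stego: list[int], n_bytes: int) -> bytes:
    """Recover n_bytes by comparing the stego image against the reference."""
    bits = [1 if a != b else 0 for a, b in zip(reference, stego)][:n_bytes * 8]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

This only works if the sample values arrive unchanged, so it assumes lossless storage. A lossy JPEG re-encode would alter the samples and scramble the message.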
you basically write a webpage that locally caches the payload and keys, then abuses EXIF to build and execute an exploit on the target.
You have to be selective, though: some of the EXIF data specifies things like color space and orientation, which browsers use to display the image properly.
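Here is a sketch of selective stripping at the segment level, assuming a keep-list of APP0 (JFIF), APP1 (Exif/XMP), APP2 (usually the ICC profile) and APP14 (Adobe color-transform flag). The keep-list and the name `strip_metadata` are my assumptions; adjust to taste.

```python
import struct

# Assumed keep-list: segments that commonly affect how the image is displayed.
KEEP = frozenset({0xE0, 0xE1, 0xE2, 0xEE})


def strip_metadata(jpeg: bytes, keep: frozenset = KEEP) -> bytes:
    """Drop APPn and COM segments not in keep; copy everything else verbatim."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI)")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("lost sync while walking segments")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # SOS: the rest is compressed image data
            out += jpeg[pos:]
            return bytes(out)
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        is_app_or_comment = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_app_or_comment or marker in keep:
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no SOS marker found")
```

Note that this keeps APP1 whole, so everything inside the Exif block (GPS included) survives. Filtering individual tags while keeping orientation and color space would need an actual Exif/TIFF parser.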
EDIT: my vibe-coding slop agent put my home GPS lat long in the example config in the README lol. Please don't rob my house; I'll go run git-filter-repo later.
[1] https://daniel.lawrence.lu/blog/2023-12-20-trip-to-europe/