There was a discussion on here the other day about the PS6, and honestly, were I still involved in console/games production I'd be looking seriously at how to incorporate assets like this.
It's good for visualizing something by itself, but not for building a scene out of it.
If you want a real cursed problem for Gaussian splats though: global illumination. People have decomposed splat models into separate global and PBR colors, but I have no clue how you'd figure out where that global illumination came from, let alone recompute it for a new lighting situation.
Also, since it's slightly hidden in a comment underneath the abstract and easy to miss, here's the link to the paper's project page: https://stopaimme.github.io/GI-GS-site/
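On the GI decomposition point above: the useful property, if you had it, is that relighting becomes a term swap. A toy sketch (not the paper's method; it assumes some upstream pass has already factored a splat's baked color into albedo and captured irradiance, and every number below is made up):

```python
import numpy as np

# Toy relighting via decomposition. Assumes some upstream pass has already
# split a splat's baked color into an albedo term and a captured-irradiance
# term; relighting then means swapping the irradiance while keeping whatever
# residual the decomposition couldn't explain (specularity, GI error, ...).
def relight_splat(albedo, captured_irradiance, new_irradiance, baked_rgb):
    residual = baked_rgb - albedo * captured_irradiance
    return albedo * new_irradiance + residual

# Hypothetical values for a single splat
albedo   = np.array([0.6, 0.4, 0.3])
captured = np.array([1.0, 0.9, 0.8])        # warm light at capture time
new      = np.array([0.4, 0.5, 0.9])        # cooler light to relight with
baked    = albedo * captured + 0.02         # small unexplained residual
print(relight_splat(albedo, captured, new, baked))
```

The hard part, as noted, is recovering those two factors from the captured splats in the first place.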
I wonder if it's possible to do some kind of blendshape style animation, where you blend between multiple recorded poses.
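Something along those lines might work if the captures could be put into one-to-one splat correspondence, which real captures don't give you for free. A toy sketch that just lerps positions, colors, and scales between two recorded poses (rotations would need slerp on the splat quaternions; all data below is made up):

```python
import numpy as np

# Blendshape-style interpolation between two splat "poses". Assumes the two
# captures are in one-to-one splat correspondence, which is the hard part.
def blend_splats(pose_a, pose_b, w):
    """pose_a/pose_b: dicts of per-splat arrays keyed by 'xyz' (N,3),
    'rgb' (N,3) and 'scale' (N,3); w in [0, 1]."""
    return {k: (1.0 - w) * pose_a[k] + w * pose_b[k] for k in pose_a}

# Two made-up poses with three splats each
pose_a = {"xyz": np.zeros((3, 3)),
          "rgb": np.full((3, 3), 0.2),
          "scale": np.full((3, 3), 0.01)}
pose_b = {"xyz": np.ones((3, 3)),
          "rgb": np.full((3, 3), 0.8),
          "scale": np.full((3, 3), 0.02)}

half = blend_splats(pose_a, pose_b, 0.5)    # halfway between the two poses
print(half["xyz"][0], half["rgb"][0])
```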
This is objectively violating accessibility guidelines for contrast.
The best thing about reader mode is that there’s now always an escape hatch for anyone a site’s styling doesn’t work for.
I'd also like to show my gratitude for you releasing this as a free culture file! (CC BY)
Is it possible to handle SfM out of band? For example, by precisely measuring the location and orientation of the camera?
The paper’s pipeline includes a stage that identifies the in-focus area of an image. Perhaps you could use that to partition the input images: use only the in-focus areas for SfM, perhaps supplemented by out-of-band POV information, then use the whole image for training the splat.
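I haven't checked what the paper's focus-detection stage actually outputs, but a crude stand-in would be a local-sharpness mask (variance of the Laplacian), so that SfM feature matching only sees in-focus pixels while the full frame still goes to splat training. A sketch with OpenCV; the file name, window size, and threshold are hypothetical:

```python
import cv2
import numpy as np

# Crude in-focus mask: local variance of the Laplacian, thresholded.
# SfM feature matching is restricted to the mask; the full frame is still
# available later for splat optimization.
def sharpness_mask(gray, win=32, thresh=50.0):
    lap = cv2.Laplacian(gray.astype(np.float32), cv2.CV_32F)
    mean = cv2.boxFilter(lap, -1, (win, win))           # local mean
    mean_sq = cv2.boxFilter(lap * lap, -1, (win, win))  # local mean of squares
    local_var = mean_sq - mean * mean
    return (local_var > thresh).astype(np.uint8)

img = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame
mask = sharpness_mask(img)

# Detect features only inside the in-focus region
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(img, mask * 255)
```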
Overall this seems like a slow journey to building end-to-end model pipelines. We’ve seen that in a few other domains, such as translation. It’s interesting to see when specialized algorithms are appropriate and when a unified neural pipeline works better. I think the main determinant is how much benefit there is to sharing information between stages.
I would have thought that since that reflection has a different color in different directions, Gaussian splat generation would have a hard time coming to a solution that satisfies all of the rays. Or at the very least, that a reflective surface would turn out muddy rather than properly reflective-looking.
Is there some clever trickery happening here, or am I misunderstanding something about Gaussian splats?
Sometimes it will “go wrong”: you can see in some of the fly models that if you get too close, body parts start looking a bit transparent, because some of the specular highlights are actually splats on the back of an internal surface. This is very evident with mirrors - they are just an inverted projection that you can walk right into.
E.g. if you have a cluster of tiny adjacent volumes whose appearance varies strongly with viewing angle, but the differences between those volumes are small, treat it as a smooth, reflective surface, like chrome.
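Part of the answer to the reflection question above is that stock 3D Gaussian splatting already stores view-dependent color: each splat carries spherical-harmonic coefficients that get evaluated along the viewing direction, so a single splat really can return different colors to different rays. A minimal degree-1 sketch (real implementations typically go up to degree 3; the coefficients below are made up):

```python
import numpy as np

# Degree-1 spherical-harmonic color evaluation, the mechanism that lets a
# single splat return different colors for different viewing directions.
SH_C0 = 0.28209479177387814   # constant (DC) basis
SH_C1 = 0.4886025119029199    # linear basis magnitude

def splat_color(sh_coeffs, view_dir):
    """sh_coeffs: (4, 3) per-splat coefficients (DC + 3 linear terms, RGB),
    view_dir: unit vector from the camera toward the splat."""
    x, y, z = view_dir
    basis = np.array([SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x])
    return np.maximum(basis @ sh_coeffs + 0.5, 0.0)

coeffs = np.array([
    [0.3, 0.3, 0.3],   # DC term: neutral gray
    [0.8, 0.1, 0.1],   # red varies strongly with the y component of the view
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])
print(splat_color(coeffs, np.array([0.0,  1.0, 0.0])))  # seen from one side
print(splat_color(coeffs, np.array([0.0, -1.0, 0.0])))  # seen from the other
```

The heuristic above would then roughly amount to looking for clusters of splats whose higher-order coefficients are large while their base colors agree.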
I presume these would look great on a good VR headset?
Black text on a dark grey background is nearly unreadable - I used Reader Mode.
I wonder if one could capture each angle in a single shot with a Lytro Illum instead of focus-stacking? Or is the output of an Illum not of sufficient resolution?
It should be possible to model the focal depth of the camera directly, but perhaps that isn't done in standard software. You'd still want several images with different focus settings.
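For a sense of what modeling the focal depth directly could look like: with a thin-lens model you can predict the circle of confusion for any point from the focus distance and aperture, and in principle blur each splat by that amount during optimization rather than relying only on focus-stacked inputs. A sketch with illustrative numbers (none taken from the post):

```python
# Thin-lens circle of confusion: the blur-circle diameter on the sensor for a
# point at distance d when the lens is focused at d_focus. A splat renderer
# that modeled this could blur each splat by its predicted CoC instead of
# relying purely on focus-stacked input photos.
def circle_of_confusion_mm(d, d_focus, focal_length_mm=50.0, f_number=2.8):
    f = focal_length_mm
    aperture = f / f_number                       # aperture diameter in mm
    return abs(aperture * f * (d - d_focus) / (d * (d_focus - f)))

# Illustrative close-focus numbers (all distances in mm)
for d in (250.0, 300.0, 350.0, 500.0):
    print(d, round(circle_of_confusion_mm(d, d_focus=300.0), 3), "mm")
```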
I'd love to know the compute hardware he used and the time it took to produce.
Likely triangles are used to render the image in a traditional pipeline.