65 points by ibobev 2 days ago | 6 comments
  • E-Reverance 2 days ago
    See comments as to why this is misleading https://www.reddit.com/r/GraphicsProgramming/comments/1ljmf0...
  • esperent a day ago
    I think the title of this should be changed, at the moment it's click bait. It should be something like:

    CUDA Ray Tracing 2x Faster Than RTX when Rendering Spheres

    As far as I can see, this renderer can't do anything else except spheres (and maybe planes).

    It's no small achievement to beat a general purpose production renderer at one specific thing, but a renderer that can only do spheres is just a hyper-optimized toy, and here it's being presented as far more than that.
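    For context on why a spheres-only path is so cheap: the entire primitive test reduces to solving one quadratic per sphere, which raw CUDA chews through easily. A minimal C++ sketch of that analytic test (illustrative only, not taken from the linked renderer; names are made up):

    ```cpp
    #include <cmath>
    #include <cstdio>

    struct Vec3 { float x, y, z; };

    static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static Vec3 sub(Vec3 a, Vec3 b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }

    // Ray-sphere intersection: solve |o + t*d - c|^2 = r^2 for the nearest
    // t > 0. Returns -1 on a miss. Direction d is assumed normalized.
    float hitSphere(Vec3 o, Vec3 d, Vec3 c, float r) {
        Vec3 oc = sub(o, c);
        float b = dot(oc, d);            // half of the quadratic's b term
        float q = dot(oc, oc) - r * r;
        float disc = b * b - q;
        if (disc < 0.f) return -1.f;     // ray misses the sphere
        float t = -b - std::sqrt(disc);  // nearer of the two roots
        return (t > 0.f) ? t : -1.f;
    }

    int main() {
        // Ray from the origin along +z toward a unit sphere centered at z = 5.
        float t = hitSphere({0,0,0}, {0,0,1}, {0,0,5}, 1.f);
        std::printf("t = %.1f\n", t); // hits the near surface at t = 4
    }
    ```

    There is no BVH traversal, no triangle fetch, no vertex attributes: just a few FMAs and a square root per sphere, which is exactly the regime where hand-tuned CUDA shines.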

  • sheepscreek 2 days ago
    I’m guessing it’s because they’re using all the computing power the GPU has to offer in CUDA mode, as opposed to sharing the GPU with other functions (when in RTX).
    • atq2119 a day ago
      More likely it's because the scene they're using is completely unrepresentative of what people are interested in: almost no triangles, primarily procedural nodes (for spheres), and in general a fairly simple scene.
    • Yup this is an "assume spherical cow" situation where it's not dishonest, but you can't draw any real world conclusions from the experiment unless you happen to be working in a very restricted space.
      • ChocolateGoda day ago
        Wouldn't you, in a real-world scenario, need to make the CUDA cores aware of the game geometry, adding more work on the CPU?
        • touisteur 20 hours ago
          Ideally you don't make the CUDA cores aware, but rather the ray-tracing circuitry. RT cores are designed to perform ray-triangle intersections within a BVH. You get the teraflops and memory bandwidth (or more of it) if you fit the RT-core computing model.

          And in most cases it's OK to spend time on one CPU function (creating and loading the BVH) against the hundreds of thousands of frames you'll be drawing on the GPU.
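          The "RT-core computing model" mentioned above is essentially fixed-function hardware for one inner loop: a ray-triangle test run against triangles reached by walking a BVH. A software sketch of the classic Möller-Trumbore test in plain C++ (purely illustrative; the hardware does this, plus BVH traversal, without touching the CUDA cores):

          ```cpp
          #include <cmath>

          struct V3 { float x, y, z; };
          static V3 sub(V3 a, V3 b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
          static V3 cross(V3 a, V3 b) {
              return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
          }
          static float dot(V3 a, V3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

          // Moller-Trumbore ray-triangle intersection. Returns the distance t
          // along the ray, or -1 on a miss.
          float hitTriangle(V3 o, V3 d, V3 v0, V3 v1, V3 v2) {
              V3 e1 = sub(v1, v0), e2 = sub(v2, v0);
              V3 p = cross(d, e2);
              float det = dot(e1, p);
              if (std::fabs(det) < 1e-8f) return -1.f;  // ray parallel to triangle
              float inv = 1.f / det;
              V3 s = sub(o, v0);
              float u = dot(s, p) * inv;                // barycentric u in [0,1]
              if (u < 0.f || u > 1.f) return -1.f;
              V3 q = cross(s, e1);
              float v = dot(d, q) * inv;                // barycentric v, u+v <= 1
              if (v < 0.f || u + v > 1.f) return -1.f;
              float t = dot(e2, q) * inv;
              return (t > 0.f) ? t : -1.f;
          }
          ```

          A spheres-only renderer never exercises this path, which is why its numbers say little about RT-core performance on real scenes.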

        • colechristensen 10 hours ago
          A whole lot of stuff is going on during gaming and graphics rendering with trick upon trick to squeeze out every last bit of performance. Unless you're an expert in a graphics rendering stack or a game engine it's hard to have these conversations in a meaningful way.
  • kachapopopowa day ago
    wow, bypassing a rendering backend makes things go faster, what a surprise!

    This only runs on Nvidia; Vulkan is designed to be cross-compatible not only across GPUs, but across operating systems as well. Vulkan is pretty direct compared to something like DX11 though, so I guess it is interesting to see a performance improvement nonetheless.

  • kookamamie a day ago
    > __restrict__ Pointers

    Ah, my favorite nitpick: C++ not having sane default aliasing rules spills over into CUDA-land.
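    The nitpick, spelled out: by default the compiler must assume any two pointers may alias, so it re-reads inputs after every store. Marking pointers `__restrict__` (a common C++ extension; in CUDA it also enables the read-only data cache for `const` inputs) promises no overlap. A minimal sketch, compiled as ordinary C++:

    ```cpp
    #include <cassert>

    // Without the qualifiers the compiler must assume `out` may alias `a`
    // or `b`, forcing a[i] and b[i] to be reloaded after each store to
    // out[i]. With __restrict__, loads can stay in registers and the loop
    // vectorizes more aggressively.
    void addNoAlias(const float* __restrict__ a,
                    const float* __restrict__ b,
                    float* __restrict__ out, int n) {
        for (int i = 0; i < n; ++i)
            out[i] = a[i] + b[i];
    }

    int main() {
        float a[4] = {1, 2, 3, 4}, b[4] = {10, 20, 30, 40}, out[4];
        addNoAlias(a, b, out, 4);
        assert(out[0] == 11.f && out[3] == 44.f);
    }
    ```

    Undefined behavior results if the promise is broken, i.e. if the arrays actually overlap, which is why it can't simply be the default.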

    • pjmlp 7 hours ago
      It's hard to have them, when one of the original goals was being mostly copy-paste compatible with C89.
      • kookamamie 6 hours ago
        Yes, though C now has restrict in the language, while C++ still does not.
        • pjmlp 6 hours ago
          Because no one has ever bothered to create a WG21 paper proposal to include it.
  • pixelpoet a day ago
    > FMA performance here is a non-issue, I'm not just flexing—I'm showing off my CUDA prowess. But hey, got to demonstrate I know my hardware!

    This article is pretty embarrassing, and as others have noted, very misleading due to the RTX units hardly being used.