It is also part of the benchmarks game they play against each other.
But in 2001 ATI was caught applying optimizations to Quake 3 when someone realized if you renamed the executable from “quake” to “quack” the score dropped a ton. It was a big scandal.
I know that’s common now but that wasn’t a thing that was done at the time.
1. AAAA Game Studio shits out another unoptimized clunker
2. nvidia considers it a reputational risk if games run at 30 FPS on a 5090
3. They go in, look at the perverse ways the game misuses rendering primitives, and then hack shit in to make whatever bad things the game is doing less bad.
As a gamer, this seems fine to me and I generally blame the AAAA devs for being bad at their jobs or AAAA studio leads for being ok shipping unoptimized messes.
As a software developer, it almost certainly has a bad effect on the ecosystem long term. "Hacks shit in" is the very definition of technical debt, and that has a cost that someone, somewhere is going to have to pay in some form.
> You’re looking as a dev, but the reality is that a consumer cannot see technical debt.
The consumer can't _see_ technical debt, but they sure as heck can be impacted by it.
- Technical debt means the code base is harder to work with later. So fixes/enhancements take longer to make it into the code (and sometimes never can)
- This particular type of technical debt means the code by the game developers sets precedent, and the next developer may use it as an example. So the amount of code incorrectly using the API grows faster over time
You can click the timestamp ("X minutes ago") to view the comment without context, and reply from there.
These hacks are game specific, so another developer wouldn't get them.
There is no reason anyone has to pay each and every iota of technical debt. Plenty of things with technical debt hit end of life and no one ever looks in that code again. I suspect most technical debt goes this way: it sits in a program, the program never updates (or gets only minor updates), then dies.
Your claim would require every piece of technical debt in anything ever (code, buildings, cars, anywhere) to be removed before the thing goes end of life or goes into a mode where it is never changed. That seems ludicrous to me.
Yes. My understanding was it was optimized by reducing precision or something to a visibly apparent degree.
It's different if the driver changes things in ways such that rendered output is the same or at least imperceptibly different. I think there's also a lot more communication between gpu makers and game/engine developers these days; plus a lot more frequent updates.
If only we had that sort of a control over rendering for every game ourselves - since projects like OptiScaler at least let us claw back control over sometimes proprietary upscaling and even framegen, but it's not quite enough: https://github.com/optiscaler/OptiScaler
I'd also mention Lossless Scaling here, though it still only works on upscaling and framegen and with worse methods, but at least works for most games out there: https://store.steampowered.com/app/993090/Lossless_Scaling/
I want to be able to freely toggle between different types of AA and SSAO and reflections and lighting and LOD systems and various shader effects (especially things like chromatic aberration or motion blur) and ray tracing and all that, instead of having to hope that the console port that's offered to me has those abilities in the graphics menu and that whoever is making the decisions hasn't decided that actually "low" graphics (that would at least run smoothly) would look too bad for the game's brand image or something.
“AAAA Game Studio shits out another unoptimized clunker” seems a paradoxical statement to me. I would have thought “AAAA” meant “highly resourced” game company. Does it just mean high revenue? Lots of players?
Like board manufacturers, the game developers also need to please the drivers and do things the way the driver silently dictates to them (regardless of what DirectX, OpenGL or Vulkan says), otherwise all bets are off.
Funny quirk, though: that particular window wouldn't show files named firefox.exe. It would accept that as typed input, if you were at the correct folder, but the file listing omitted that particular file.
Maybe it was mozilla.exe; it was a long time ago. But that was the discovery that pushed me off IE forever.
You saw that again in more modern times when Microsoft removed support for the APIs they provided to set browser defaults, forcing browser makers to write step by step instructions on what to click to set the default browser.
I believe they walked that back, but it left such a bad taste that I switched my installation of Windows from default mode to EU mode in order to avoid it. And come to think of it, I haven’t used my windows machine for much outside of AI in about 6 months.
But Microsoft is not alone in this sort of defaults game: every OS or browser maker (Apple, Google, Firefox) wants to create moats so they can more easily monetize your usage of a product. I never thought I’d prefer the business model of free to play games, where they just outright ask you for money and have to keep finding new ways to entertain instead of relying on hard to change defaults and selling your data.
The driver looks to see if a known old game is calling it, and if it's one known to crash, it returns no more than 256 characters, and likely also puts all the _old_ extensions that the game is likely to know and react to in the string.
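A rough sketch of that workaround, to make it concrete. Everything here is hypothetical (the game list, the function name, the assumption that extensions arrive ordered oldest first); only the 256-character cap comes from the description above, and a real driver does this in native code with its own tables.

```python
# Hypothetical names throughout: the game list and the oldest-first ordering
# are illustrative assumptions, not taken from any real driver.
LEGACY_GAMES = {"oldgame.exe"}
LEGACY_LIMIT = 256  # old games copy the string into a fixed-size buffer


def extension_string(executable: str, extensions: list[str]) -> str:
    """Return the space-separated extension list reported to the caller.

    `extensions` is assumed to be ordered oldest first.
    """
    if executable.lower() not in LEGACY_GAMES:
        return " ".join(extensions)

    # Known-fragile game: keep whole extension names, oldest first,
    # for as long as the result still fits within the limit.
    kept = []
    length = 0
    for ext in extensions:
        extra = len(ext) + (1 if kept else 0)  # +1 for the separating space
        if length + extra > LEGACY_LIMIT:
            break
        kept.append(ext)
        length += extra
    return " ".join(kept)
```

Any other caller gets the full string, which is why the behaviour is invisible unless you happen to be that one game.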
There are also all sorts of games that called APIs in a particular order or set particular options, because they represented a "fast path" at the time, and now they don't, but if you're that program, then yes they do.
Ultimately, this clutter is what led to the development of the Vulkan API, to stop games second-guessing graphics APIs which themselves second-guess the games.
A frequent example I’ve encountered is web frameworks that have to keep checking for escaped text because they didn’t write it in horizontal layers where you know for sure that all inputs have been scrubbed when they reach this function but not that one. So the same functions get called with data that comes from your team and from customers. Reuse is tricky.
- Unescape, sanitize or validate at all entry points.
- Escape all outputs (this includes the database queries).
If you follow those simple rules, you never have to check once you are past a controller. And you should fuzz your controllers to make sure no unexpected data makes it past there.
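A minimal sketch of those two rules, assuming a made-up `users` table and made-up function names, using only Python's standard library (`sqlite3` parameter binding for the query, `html.escape` for the page). It is one way to lay it out, not the only one.

```python
import html
import sqlite3


def parse_username(raw: str) -> str:
    """Entry point: validate once so inner code can trust the value."""
    name = raw.strip()
    if not name or len(name) > 32:
        raise ValueError("username must be 1-32 characters")
    return name


def store_user(db: sqlite3.Connection, name: str) -> None:
    # Output to the database: parameter binding handles the quoting.
    db.execute("INSERT INTO users (name) VALUES (?)", (name,))


def render_greeting(name: str) -> str:
    # Output to HTML: escape at the point of rendering, not earlier.
    return f"<p>Hello, {html.escape(name)}!</p>"
```

The stored value stays raw (`O'Brien` stays `O'Brien` in the table); each output applies the escaping its own format needs.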
Everyone has clever answers for greenfield projects and empty rhetoric for brown.
https://docs.nvidia.com/cutlass/index.html
it presumably makes various assumptions and speedups for NVIDIA's matrix multiplication library... called cutlass
Quake was also the standard for a game that was willing to fully exploit the hardware of the time.
This ticket is a good starting place to see the chain of issues around the ongoing work: https://github.com/NVIDIA/cutlass/pull/2037
I wonder whether, if we search the comments, we can find something referencing this.
If you have hundreds of passes that are complex and rely on various "contracts" like type names or some shit, then really crazy things like this can happen unintentionally and not maliciously
Names can be both informative, and misdirecting, at the same time.
First, nVidia and ATI used executable names for detecting games, then they started to add heuristics.
If you think they stopped the practice, you're very mistaken. Every AMD and nVidia driver has game and app specific fixes and optimizations.
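For anyone who hasn't seen it, name-based detection is about as simple as it sounds. A toy sketch with a made-up profile table and made-up setting names (real drivers ship far larger, vendor-specific lists and, as said above, add heuristics on top):

```python
import ntpath  # splits on both "\\" and "/", on any host OS

# Hypothetical profile table and setting names, for illustration only.
APP_PROFILES = {
    "quake3.exe": {"texture_filtering": "reduced"},
}
DEFAULT_PROFILE = {"texture_filtering": "full"}


def profile_for(executable_path: str) -> dict:
    """Pick driver settings from the executable's file name alone."""
    name = ntpath.basename(executable_path).lower()
    return APP_PROFILES.get(name, DEFAULT_PROFILE)
```

Rename the file and the lookup misses, so the program falls back to the default profile. That is exactly why the rename trick exposed the behaviour.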
nVidia cheated in 3DMark that way, so the benchmark's developers patched it to prevent that. nVidia has also patched its drivers so that some expensive but visually invisible calls, such as scene flushes in a particular game, are batched (e.g. all 50 flushes are done at the 50th call) to keep the game from becoming a slide show on expensive hardware.
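The flush batching can be pictured roughly like this. The class name and the callback are made up for illustration; only the "every 50th call" figure comes from the example above, and a real driver would presumably also force a flush at frame boundaries, which this sketch leaves out.

```python
class BatchedFlush:
    """Coalesce redundant flush requests: do the real work on every Nth call."""

    def __init__(self, real_flush, every: int = 50):
        self._real_flush = real_flush  # the expensive operation
        self._every = every
        self._pending = 0

    def flush(self) -> None:
        # The game believes it flushed; most requests are only counted.
        self._pending += 1
        if self._pending >= self._every:
            self._real_flush()
            self._pending = 0
```

With `every=50`, a game that calls `flush()` 120 times triggers only two real flushes, with 20 requests still pending.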
This is also why AMD's and Intel's open source drivers under Linux are a success: they are vanilla drivers written from scratch per spec, and if your code calls OpenGL/Vulkan to spec, then you're golden.
Even some companies cross compile AMD's Linux drivers for windows on embedded systems since they're free from useless optimizations from them.
Interestingly, most benchmark controversies back in the day are now expected behaviour, i.e. game-specific optimizations with no (well, in this age of upscalers and other lossy optimization techniques, probably even somewhat) visible image degradation. A gaming-specific driver with no game-specific improvements in its changelog would be considered strange, and it very much works with executable detection.
Back in the day, there was still the argument that drivers should not optimize for benchmarks even when visually identical, because it wouldn't show the hardware's real world potential. Kinda cute from today's perspective. :)
But of course there were the obvious cases...
The Quack3 lowering filtering quality as shown above, of course (at least that one was put into the driver as a togglable setting later on).
But the most cheeky one has to be nVidia's 3dmark03 "optimizations", where they blatantly put static clip planes into the scenes so that everything outside the predefined camera path from the benchmark sequence would simply be cut from the scene early (which e.g. fully broke the freelook patched into 3dmark and would generally break any interactive application)
Just kidding, nice to see another person who remembers these things. Want some root beer?
Many people, including me, didn't have an internet connection back in the day. The Sneakernet went into overdrive to get everyone a copy!
When my colleague said that they managed to go faster than intel with icc with some hand tuned parameters, I remember answering "youdidwat?".
Good times.
If the headline was "FP8 is ~7% faster when kernel name has 'cutlass' in it...", it wouldn't seem sensational.
> Rewrite the attention kernel to be persistent. This gives better performance at low-contexts. However, fp16 at large context has suffered a bit due to a ptxas instruction scheduling issue in the softmax partition. fp8 is ~100 tflops faster when the kernel name has "cutlass" in it.
The charitable reading is that, on certain kernels, using fp8 rather than fp16 values gives better performance. (Although I can't even see how the numbers relate to a "~100 tflops faster" claim in any respect, nor does it even list any kernel names or suggest a control kernel!) But this is being presented as if someone has uncovered evidence of cheating on benchmarks.
    # Up to 150 TFLOPS faster for fp8!
    if specialization.constants["dtype"] == gl.float8e5:
        name = "cutlass_" + name
It's literally in the code.
> > fp8 is 100 tflops faster when the kernel name has "cutlass" in it
> kms
In order to get to the part that you're trying to hold me accountable for, I would furthermore have to click onto the commits tab and search through a 93-commit PR.
I thought today I was using a site where trying to think the best of people and propose that someone had taken something out of context, based on the immediately available context having a simpler explanation, would not get me treated like a corporate shill (for a company I don't even care about). Apparently I was wrong.
How did you get "without doing any research whatsoever" out of me demonstrably following the link and reading and quoting what appeared on the facing page?