Good example of how the bottlenecks are often not where you think they are, and why you have to profile & measure (which I assume Viti95 did in order to find that speedup so early on). The status bar percentage?! Maybe there's something about the Doom arch which makes that relatively obvious to experts, but I certainly would've never guessed that was a bottleneck a priori.
Would like to see a write-up on how it's even possible to achieve that when PCs from 20-30 years ago had no issue with such a task.
Electron.
It's one of the most complex pieces of software - perhaps even of human-designed systems - ever to exist, and we're using it to render a few polygons and drop shadows because the C++ committee made a bunch of mistakes decades ago, so now our webdevs are mortally afraid of CMake and Qt/QML. Or GTK. Or whatever. Pretty much the only people who seem to put out native GUI tools in any significant quantity are Apple developers.
The tradeoffs that Blink and V8 engineers have made to support the entirety of the internet and the morass of web development preclude efficient use of resources for simpler purposes, like rendering an animation. After all, there are a billion React hooks and ad tracking scripts to optimize, otherwise bounce rates will increase.
Strong disagree. If it were an animated gif then the browser would be astonishingly efficient, because of crazy good optimisations.
The underlying reason is that developers are limited to the techniques/toolbox they know. The performance costs are unpredictable because of:
(1) the declarative style (using imperative solutions would have other costs),
(2) the difficulty of debugging browser performance regressions (SQL EXPLAIN, by comparison, is more learnable).
Browsers enable developers. I could design that animation in CSS even though I'm a developer, plus I understand the article fully. I couldn't design an animated gif because I am totally unfamiliar with any tools for achieving that.
I think the Blink and V8 teams do an exceptionally good job when choosing compromises. HTML/CSS/SVG/JS and Chromium are winning the UI wars because they deliver practical and efficient enough solutions. Other UI solutions I have experienced all have other downsides. Browsers are magical.
I mostly really agree with your comment.
Note that this isn't even a case of whoever implemented that cursor "doing it wrong"; to quote another comment on that bug from a Chrome dev:
> Powerful* text editors built on the web stack cannot rely on the OS text caret and have to provide their own. In this case, VSCode is probably using the most reasonable approach to blinking a cursor: a step timing-function with a CSS keyframe animation. This tells the browser to only change the opacity every 500ms. Meanwhile, Chrome hasn't yet optimised this completely yet, hence http://crbug.com/361587. So currently, Chrome is doing the full rendering lifecycle (style, paint, layers) every 16ms when it should be only doing that work at a 500ms interval. I'm confident that the engineers working on Chrome's style components can sort this out, but it'll take a little bit of work.
At a guess, by something in the UI framework turning an O(n) task into an O(n^2) task. I've seen that happen in person: the iPhone app was taking 20 minutes(!) to start up in some conditions, and the developer responsible insisted the code couldn't possibly be improved. The next day I'd found the unnecessary O(n^2) op and reduced it to a few hundred milliseconds.
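A minimal sketch of the shape of bug I mean (illustrative C, not the actual app code, which I obviously can't share): an O(n) step hidden inside a loop over n items.

    #include <string.h>

    /* Accidentally O(n^2): strlen() rescans the whole string on every
       iteration, so the loop does roughly n*n/2 character reads. */
    size_t count_spaces_slow(const char *s) {
        size_t count = 0;
        for (size_t i = 0; i < strlen(s); i++)  /* O(n) check, run n times */
            if (s[i] == ' ')
                count++;
        return count;
    }

    /* O(n): compute the length once (or just walk to the terminator). */
    size_t count_spaces_fast(const char *s) {
        size_t count = 0;
        size_t len = strlen(s);
        for (size_t i = 0; i < len; i++)
            if (s[i] == ' ')
                count++;
        return count;
    }

The 20-minute startup was the same idea, just with a much more expensive inner operation.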
The over-abstraction of current programming is, I think, a mistake — I can see why the approach in React and SwiftUI is tempting, why people want to write code that way, but I think it puts too much into magic black-boxes. If you change your thinking just a bit, the older ways of writing UI are not that difficult, and much more performant.
I have a bunch of other related opinions about e.g. why VIPER is slightly worse than an entire pantheon of god classes combined with a thousand lines inside an if-block: https://benwheatley.github.io/blog/2024/04/07-21.31.19.html
I remember using phpBB back in the late 2000s, viewing a page that had at least 100 animated emoticons on it.
It would slow IE6 down to a halt. But then I tried Chrome or Firefox (I forget which one) and it didn't even blink showing the same page. I even remember reading some developer posts about things like that at the time.
Loops run so fast that it's pretty standard to have to put some throttling logic in the code, which I've never had to do in web dev, except perhaps waiting on document.ready in JavaScript to make sure the DOM has been loaded.
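For example, a minimal sketch of the kind of throttling I mean (plain C; clock_ms() and sleep_ms() are hypothetical helpers standing in for whatever timer the platform gives you):

    #include <stdint.h>

    #define FRAME_MS 16  /* target ~60 updates per second */

    extern uint32_t clock_ms(void);      /* hypothetical millisecond clock */
    extern void sleep_ms(uint32_t ms);   /* hypothetical sleep             */
    extern void update_and_render(void);

    void game_loop(void) {
        for (;;) {
            uint32_t start = clock_ms();
            update_and_render();
            uint32_t elapsed = clock_ms() - start;
            if (elapsed < FRAME_MS)              /* throttle: don't run the  */
                sleep_ms(FRAME_MS - elapsed);    /* loop as fast as possible */
        }
    }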
Oh, looks like I'm way late: https://cprimozic.net/blog/building-a-signal-analyzer-with-m...
--
1: https://web.dev/articles/simplify-paint-complexity-and-reduc...
I used to work on a set top box app, and for some of the features, replacing the whole page with a single canvas was the only way to get steady FPS.
The animation's still there — and my PC is better now, so it doesn't stutter — but I'm willing to bet it's still burning waaay too many watts, for something so trivial.
"SMW is incredibly inefficient when it displays the player score in the status bar. In the worst case (playing as Luigi, both players with max score), it can take about a full 1/6 of the entire frame to do so, lowering the threshold for slowdown. Normally, the actual amount of processing time is roughly proportional to the sum of the digits in Mario's score when playing as Mario, and the to the sum of the digits in both players' scores when playing as Luigi. This patch optimizes the way the score is stored and displayed to make it roughly constant, slightly faster than even in the best case without."
https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...
Logging is far more expensive than people assume, and in many cases it is single-threaded. This can make logging the scalability bottleneck!
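A minimal pthreads sketch of the failure mode (hypothetical code, not from any particular codebase): every worker funnels through one mutex and one write per message, so the logger rather than the work sets the throughput ceiling.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

    static void log_line(const char *msg) {
        pthread_mutex_lock(&log_lock);    /* serializes every thread          */
        fprintf(stderr, "%s\n", msg);     /* plus a write per message, since  */
        pthread_mutex_unlock(&log_lock);  /* stderr is typically unbuffered   */
    }

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            /* ...a small amount of real work... */
            log_line("processed item");   /* this line dominates the runtime */
        }
        return NULL;
    }

Adding threads just makes the queue at the mutex longer.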
https://www.reddit.com/r/Doom/comments/8a1m9s/psa_deactivate...
Problem for id is that you have no sane way of profiling the DOS port on NeXT.
I did game engine dev professionally in 1994 (and non-professionally before that). We profiled both time and memory use, using GCC and gprof for function-level profiling, recording memory allocation statistics by type and pool (and measuring allocation time), microbenchmarking the time spent sending commands to blitters and coprocessors, visually measuring the time in different phases of the frame rendering loop to 10-100 microsecond granularity by toggling border or palette registers, measuring time spent sorting display lists, time in collision detection, etc.
You might regard most of those as not what you mean by profiling, but the GCC/gprof stuff certainly counts, as it provides a detailed profile after a run, showing hot functions and call stacks to investigate.
It's true that most of the time, we just changed the code, ran the game again and just looked at the frame rate or even just how fluid it felt, though :-)
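For anyone who hasn't seen the border trick: a minimal DOS-flavored sketch (assuming a CGA-compatible color select register at port 0x3D9 and a Watcom-style outp(); on VGA you'd poke the palette DAC at 0x3C8/0x3C9 instead). The height of the colored band in the border shows, directly on screen, what fraction of the frame the bracketed code takes.

    #include <conio.h>  /* outp() on Watcom-style DOS compilers */

    #define CGA_COLOR_SELECT 0x3D9  /* border/overscan color register */

    static void mark_begin(void) { outp(CGA_COLOR_SELECT, 0x04); /* red   */ }
    static void mark_end(void)   { outp(CGA_COLOR_SELECT, 0x00); /* black */ }

    void render_frame(void) {
        mark_begin();
        /* ...the code being timed, e.g. sorting the display list... */
        mark_end();
    }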
I'm guessing they used gcc on NeXT, but profiling on that platform probably didn't make much sense: different compiler, different CPU, different OS...
The game was written on a NeXTstation TurboColor, which had a Motorola 68040 running at 33MHz, and it supposedly only ran at 15FPS there (probably as slow as a 386DX/33 at the time).
While 486 AGI stalls were trivial to reason about, the Pentium especially changed the game with its U and V pipes. You could no longer trivially eyeball which code was faster. Profiling was a must.
Heck, I even "profiled" my C64 routines in the eighties by changing the border color. You could clearly visually see how long your code takes to execute.
Overall this can mean that in some situations the game doesn't feel as smooth as before due to these variations.
Essentially, when considering real-time rendering, the slowest path is the most critical one to optimize.
I don't think the benchmark accounts for that.
I'm probably not the real target audience here, but that looked interesting; I didn't think there were good storage-over-network options that far back. A little searching turns up https://www.brutman.com/mTCP/mTCP_NetDrive.html - that's really cool:)
> NetDrive is a DOS device driver that allows you to access a remote disk image hosted by another machine as though it was a local device with an assigned drive letter. The remote disk image can be a floppy disk image or a hard drive image.
Back in school in the early 90s we had one computer lab where around 25 Mac Plus machines were daisy chained via AppleTalk to a Mac II. All of the Plus machines mounted their filesystem from the Mac II. It was painfully slow, students lost 5-10 minutes at the start of class trying to get the word processor started. Heck, the Xerox Altos also used network mounts for their drives.
If you have networking the first thing someone wants to do is copy files, and the most ergonomic way is to make it look just like a local filesystem.
DOS was a bit behind the curve because there was no networking built-in, so you had to do a lot of the legwork yourself.
Where I went to school we had AFS. You could sit down at any Unix workstation and log in and it looked like your personal machine. Your entire desktop & file environment was there, and the environment automatically pointed all your paths at the correct binaries for that machine. (While we were there I remember using Sun, IBM, and SGI workstations in this environment.)
When Windows came on campus it felt like the stone ages as none of this stuff worked and SMB was horrible in comparison.
These days it feels like distributed file systems are used less and less, replaced by having to upload everything to various web-based cloud systems.
In some ways it feels like everything has become less and less friendly with the loss of desktop apps in favor of everything in the browser.
I guess I do use OneDrive, but it doesn't seem particularly good, even compared to 1990s options.
I don't recall whether there were any security-minded limits you could put on what was shared, but for a team that was meant to share everything, it was pretty handy.
Later I saw some Citrix setups which would load the applications from the server. That also worked pretty OK.
With Windows you definitely had all the options to make this work in the late 90s.
Anyone remember Novell NetWare?
made my day :)
you did have networking? wow
not here.. that's why Floppy-net was something.. as well as bus-304-net (like, write a floppy, hop on bus 304, go to other campus)
SMB (the protocol Samba implements) is also from the DOS era. Most people only know of it from Windows, though.
There were various other ways to make network "drives" as the DOS drive interface was very simplistic and easy to grab onto.
It was rare to find this stuff until Win95 made network connections "free" (before then, you had to buy the networking hardware, and often the software, separately!).
Back in the day, you could author a web page directly in GruntPage, and publish it straight to your web server provided said server had the FPSE (FrontPage Server Extensions), a proprietary Microsoft add-on, installed. WebDAV was like the open-standards response to that. Eventually in later versions of FrontPage the FPSE was deprecated and support for WebDAV was provided.
Glad to see someone making sure that Doom still gets performance improvements :D
So in a way, I owe my whole career and fortune to KenS. Cool.
Also shout out to anyone who remembers "wackplayer" - Duke's equivalent of the BEEP keyword.
I especially liked the idea of CR2 and CR3 as scratchpad registers when memory access is really slow (386SX and cacheless 386DXs). And the trick of using ESP as a loop counter without disabling interrupts (by making sure it always points to a valid stack location) is just genius.
- IBM MDA text mode: https://www.youtube.com/watch?v=Op2tr2lGK6Y
- EGA & Plantronics ColorPlus: https://www.youtube.com/watch?v=gxx6lJvrITk
- Classic blue & pink CGA: https://youtu.be/rD0UteHi2qM
- CGA, 320x200x16 with 'ANSI from Hell' hack: https://www.youtube.com/watch?v=ut0V1nGcTf8
- Hercules: https://www.youtube.com/watch?v=EEumutuyBBo
Most of these run worse than with VGA, presumably because of all the color remapping etc
Any love for Tandy Graphics Adapter? I'd hate to have to run in CGA :( would need a 286 build for my Tandy 1000 TL/2, if it was still alive.
Wow - by 1992 I was on my fourth homebuilt PC. The KCS computer shows in Marlborough MA were an amazing resource for tinkerers. Buy parts, build PC and use for a while, sell PC, buy more parts - repeat.
By the end of 1992 I was running a 486-DX3 100 with a ULSI 487 math coprocessor.
For a short period of time I arguably had the fastest PC - and maybe computer on campus. It outran several models of Pentium and didn't make math mistakes.
I justified the last build because I was simulating a gas/diesel thermal-electric co-generation plant in a 21 page Excel spreadsheet for my honors thesis. The recalculation times were killing me.
Degree was in environmental science. Career is all computers.
Anyway, there's no such thing as a "DX3". And the first 100MHz 486 (the DX4) came out in March of 1994, so I don't see how you were running one at the end of 1992.
My family's first computer - not counting a hand-me-down XT that was impossibly out-of-date when we got it in 1992 or so - was a 66MHz 486-DX2, purchased in early 1995.
I can't quite explain why, but as a matter of pride it's still upsetting - decades later - to see someone weirdly bragging about an impossible computer that supposedly outran mine despite a three year handicap.
...looked it up, apparently the standard 487 was a full 486DX that disabled and replaced the original 486SX. Was this some sort of other unusually awesome coprocessor I hadn't heard of?
Possibly something software like Maple could take advantage of.
The 486SX was fully 32-bit (unlike the SLC and 386SX), the 486DX had the integrated FPU, and the 487 was a drop-in 486DX which disabled the 486SX.
The 386SX had a 16 bit external bus interface so it could work with 286 chipsets. The DX processors had a full 32 bit bus and correspondingly better throughput. The 386 never included an integrated FPU, you had to add a separate co-processor for that.
https://fabiensanglard.net/fastdoom/#:~:text=one%20commit%20...
I don't get the ibuprofen reference?
edit: although ibuprofen is a brand name.
>DOOM cycles between three display pages. If only two were used, it would have to sync to the VBL to avoid possible display flicker.
How does triple buffering eliminate VBL waits, exactly? There was no VBL interrupt on a standard VGA, was there?
This is not the case with double buffering. If the CPU renders fast enough, it can happen that it has just finished rendering to the current target while the previous target is still being sent to the CRT. In that case the CPU needs to block on the VBL.
I suppose you could poll the CRTC every so often during the game loop or rendering process, though. That must have been how it worked.
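That's how it was usually done: stock VGA has no retrace interrupt, but Input Status Register #1 at port 0x3DA has a vertical-retrace bit you can poll. A minimal sketch (assuming a Watcom-style inp()):

    #include <conio.h>  /* inp() on Watcom-style DOS compilers */

    #define VGA_STATUS_1  0x3DA
    #define VRETRACE_BIT  0x08

    /* Busy-wait until the start of the next vertical retrace. */
    static void wait_for_vretrace(void) {
        while (inp(VGA_STATUS_1) & VRETRACE_BIT)     /* leave current retrace */
            ;
        while (!(inp(VGA_STATUS_1) & VRETRACE_BIT))  /* wait for the next one */
            ;
    }

With three pages there is almost always a free page to render into, so this wait rarely has to happen at all.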
Seeing that this made a difference makes it clear Fabien ran FastDoom in Mode Y.
>One optimization that did not work on my machine was to use video mode 13h instead of mode Y.
13h should work on anything; it's VBD that requires a specific VESA 2.0 feature to be enabled (LFB*). VBR should also work no problem on this IBM.
Both 13h and VBR modes would probably deliver another ~10 fps on 486/66 with VESA CL5428.
* LFB = linear frame buffer, not available on most ISA cards. Somewhat problematic as it required less than 16MB of RAM, or the "15-16MB memory hole" enabled in the BIOS. On ISA Cirrus Logic cards, support depended on how the chip was wired to the bus: some vendors supported it, while others used a lazy copy-and-paste of the reference design and didn't. With VESA (VLB) Cirrus Logic cards, lazy vendors continued to use the same basic reference-design wiring, disabling LFB. No idea about the https://theretroweb.com/motherboards/s/ibm-ps-1-type-2133a,-... motherboard.
You misinterpreted what he wrote. He wasn't saying that mode 13h didn't work; he meant that the optimizations in the mode 13h path of the executable weren't as good as the Mode Y path. It's the optimization that didn't work, not mode 13h itself.
I think this statement is incorrect. These modes require support for VESA 2.0, which this IBM does not have.
> Somewhat problematic as it required less than 16MB ram or "15-16MB memory hole" enabled in bios.
Could be the issue. Do you have any documentation about this?
- Intel FC80486DX4WB-100 (50 x 2 @100)
- Am5x86-P75+ (50 x 3 @150)
- Am5x86-P100 (40 x 4 @160)
> One of my goals for FastDoom is to switch the compiler from OpenWatcom v2 to DJGPP (GCC), which has been shown to produce faster code with the same source. Alternatively, it would be great if someone could improve OpenWatcom v2 to close the performance gap.
>
> - Conversation with Viti95
Out of curiosity, how hard is it to port from OpenWatcom to GCC?
Clearly the solution here is to write a Watcom llvm front end…
I don't think it is that hard but likely very time consuming.
In theory it should only be about writing a new build script (not based on `wmake` but on a real `make`), and then working out the tiny flag/preprocessor/C-compiler discrepancies.
For mostly C code like the original Doom source, yes. But it looks like the FastDoom people have added quite a bit of assembly. That needs to be ported to AT&T syntax, or you need to find out whether Intel syntax works in your version of DJGPP's gas. While this should work nowadays, I have not tried it. Then there are other differences, like Watcom mapping low memory by default into the low part of the address space, while DJGPP needs explicit mapping or access.
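To give a feel for the porting work, here's a hedged, minimal example (the helper name is made up) of the same one-instruction routine written as a Watcom #pragma aux and as GCC extended asm in AT&T syntax; each asm snippet in FastDoom's fast paths would need a similar translation:

    /* OpenWatcom: calling convention and registers described in the pragma. */
    unsigned short swap_bytes(unsigned short x);
    #pragma aux swap_bytes = \
        "xchg al, ah"        \
        parm  [ax]           \
        value [ax];

    /* DJGPP/GCC: extended asm, AT&T operand syntax, register constraints
       instead of explicit register lists. */
    static inline unsigned short swap_bytes_gcc(unsigned short x)
    {
        __asm__("xchgb %%al, %%ah" : "+a"(x));
        return x;
    }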
If something is/has become a standard, then optimization takes over. You want to be the fastest and meet all of the standard's tests. Doom is similarly now a standard game to port to any new CPU, toaster, whatever. Similarly email protocols, or browser standards (WebRTC, QUIC, etc.).
The reason your latest web app / Electron app is not fast is that it is exploratory. It's updated every day to meet new user needs, and fast-enough-to-not-get-in-the-way is all that's needed performance-wise. Hence we see very fast IRC apps, but Slack and Teams will always be slow.
Funny how that is, for me it was a Sony Alpha camera (~~flagship at the time~~) and 10 years later I finally bought it for $50.
I know there are better cameras in the Alpha line but yeah, I had an R3 at one point which was wasted on me as an amateur.
Let's stick with photography. Can someone who knows what they're doing get great results with cheap equipment? Yes, in many situations. Is it WAY easier with the right gear? Absolutely. Finding the right balance is tricky - no, you don't need a flagship body and lens to get started, but having a flagship or pro-grade body/lens from 1-3 generations ago can be huge.
I shoot Nikon, not Sony, but going from a consumer D50 body to a Pro D300 body was huge just in terms of ergonomics - more buttons to allow me to quickly adjust things without having to pull the camera from my eye and fumble through menus.
In the current generation, I finally moved to Mirrorless with a Z6ii which blew my mind and enabled so many more things - no, I wasn't "getting the full use" out of it and yes, I got some great shots with my old DSLR gear, but it made so many things so much easier that it made shooting fun and got me to carry the camera and take photos every day, which has been the biggest factor in improving my skills. Within the last few months I splurged and upgraded to a current-gen flagship (Z8) which amazed me once again - the Z6ii was more camera than I could fully exploit, but the Z8's ergonomics are just incredible - so many buttons, most of them remappable, allowing me to truly develop an instinctive way of shooting and allowing the equipment to get out of my way.
It's important to try to avoid loving gear more than loving the activity, but that doesn't mean that higher-end gear is "wasted on" amateurs.
I'm back to basics now, trying to produce videos with a NEX-5N; it's not 4K but it's very cheap.
I get what you're saying though
I think MPV is a typo for MVP.
Just don't let corporations use MVPR as a metric or it will cease to be a fun challenge.
text-align: justify;
try it in your browser console: document.body.style="text-align: justify";
WTH is Ibuprofen?!
ibuprofen is an anti-inflammatory and anti-coagulant, sold under many different names.
A fairly close relative to Aspirin that's easier on the stomach and has less of an anticoagulant effect.
It's not the same as - for example - Tylenol, which is called Doliprane or Dafalgan in Europe. In that case the active molecule is exactly the same, only the name changes; but you will have a hard time finding a box with Tylenol written on it in France.
I have never seen it called Doliprane or Dafalgan here, only ever paracetamol, which is the generic name.
If such a thing exists!
https://news.ycombinator.com/item?id=42607794 https://news.ycombinator.com/item?id=42566112
Also, reading Masters of Doom, Carmack and I have something in common, ha (we both went to a form of jail).
How is it that simple tools like text editors work the same as they did 20 years ago but take orders of magnitude more RAM?
You can still use Notepad if you want to, but you'll be much less productive with it.
[0] https://fabiensanglard.net/revisiting_the_pathtracer/index.h...
Show me a photo of that beauty.
+1 on Andrew Kensler's awesomeness. After I published these articles, he took the time to send me a package with Pixar goodies. A deeply moving gesture.
These dudes are living their best lives, and having done Quake-style asm texture mapping loops in the 90s (Mike Abrash, fastmap, Chris Hecker, PC Game Programming Encyclopedia, ...), I can definitely appreciate it <3
html { font-family: system-ui; }
Consider https://alanhogan.com/bookmarklets#add_css to add this to the page. Code blocks are still shown in a monospaced font. BTW, monospaced font for prose is an anti-pattern that you hackers need to relinquish, but whatever!

Furthermore, using system-ui for anything that's prose and not a UI is an anti-pattern that UI designers love to commit. It also makes the font dependent on the system language, which makes things worse if the system language doesn't match the language of the page: https://infinnie.github.io/images/blog/bootstrap.png. Even hardcoding it to something classic like Verdana (remember web-safe fonts?) is much better.