Before HD, almost all video used non-square pixels. DVD is 720x480. SD channels on cable TV systems are 528x480.
Correct. This came from the ITU-R BT.601 standard, one of the first digital video standards, whose authors chose to define digital video as a sampled analog signal. Analog video never had a concept of pixels and operated on lines instead. The rate at which you could sample it could be arbitrary, and affected only the horizontal resolution. The rate chosen by BT.601 was 13.5 MHz, which resulted in a 10/11 pixel aspect ratio for 4:3 NTSC video and 59/54 for 4:3 PAL.
>SD channels on cable TV systems are 528x480
I'm not actually sure about America, but here in Europe most digital cable and satellite SDTV is delivered as 720x576i 4:2:0 MPEG-2 Part 2. There are some outliers that use 544x576i, however.
BT.601 is from 1982 and was the first widely adopted standard for digitizing analog component video (sampling it into 3 color components (YUV) at 13.5 MHz). Prior to BT.601, the main standard for digital video was SMPTE 244M, created by the Society of Motion Picture and Television Engineers: a composite video standard which sampled analog video at 14.32 MHz. Of course, a higher sampling rate is, all things being equal, generally better. The reason for BT.601 being lower (13.5 MHz) was a compromise - equal parts technical and political.
Analog television was created in the 1930s as a black-and-white composite standard, and in 1953 color was added by a very clever hack which kept all broadcasts backward compatible with existing B&W TVs. Politicians mandated this because they feared nerfing all the B&W TVs owned by voters. But that hack came with some significant technical compromises which complicated and degraded analog video for over 50 years.

The composite sampling rate (14.32 MHz) is 4x the NTSC color subcarrier frequency from analog television, while the component rate (13.5 MHz) was chosen as a common multiple of the NTSC and PAL line frequencies - which themselves trace back to the same analog standards. Those two frequencies directly dictated all the odd-seeming horizontal pixel resolutions we find in pre-HD digital video (352, 704, 360, 720 and 768) and even the original PC display resolutions (CGA, VGA, XGA, etc).

To be clear, analog television signals were never pixels. Each horizontal scanline was only ever an oscillating electrical voltage, from the moment photons struck an analog tube in a TV camera to the home viewer's cathode ray tube (CRT). Early digital video resolutions were simply based on how many samples an analog-to-digital converter would need to fully recreate the original electrical voltage.
For example, 720 is tied to 13.5 MHz because sampling the active picture area of an analog video scanline at 13.5 MHz generates 1440 samples (double, per Nyquist). Similarly, 768 is tied to 14.32 MHz generating 1536 samples. VGA's horizontal resolution of 640 simply comes from adjusting analog video's rectangular pixel aspect ratio to be square (704 * 10/11 = 640). It's kind of fascinating that all these modern digital resolutions can be traced back to decisions made in the 1930s, based on which affordable analog components were available, which competing commercial interests prevailed (RCA vs Philco), and the political sensitivities present at the time.
I don't think you need to be doubling here. Sampling at 13.5 MHz generates about 720 samples.
13.5e6 Hz * 53.33...e-6 seconds = 720 samples
The sampling theorem just means that with that 13.5 MHz sampling rate (and 720 samples) signals up to 6.75 MHz can be represented without aliasing. There's some history on the standard here: https://tech.ebu.ch/docs/techreview/trev_304-rec601_wood.pdf
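If it helps, here is the same arithmetic as a couple of lines of Python (the ~53.3 us active line duration is the figure used above):

    rate_hz = 13.5e6            # BT.601 luma sampling rate
    active_line_s = 53.333e-6   # approximate active line duration used above

    print(rate_hz * active_line_s)  # ~720 samples per active line
    print(rate_hz / 2 / 1e6)        # Nyquist limit: 6.75 MHz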
This allows the aspect ratio captured on film to stay fixed across the various aspect ratios of the images that are displayed.
This is repeated often and simply isn't true.
From these 4 values one can compute the video sampling frequency that corresponds to square pixels. For the European TV standard, an image with square pixels would have been 576 x 768 pixels, obtained at a video sampling frequency close to 15 MHz.
However, in order to allow more TV channels in the available bands, the maximum video frequency was reduced to a lower frequency than required for square pixels (which would have been close to 7.5 MHz in Europe) and then to an even lower maximum video frequency after the transition to PAL/SECAM, i.e. to lower than 5.5 MHz, typically about 5 MHz. (Before the transition to color, Eastern Europe had used sharper black&white signals, with a lower than 6.5 MHz maximum video frequency, typically around 6 MHz. The 5.5/6.5 MHz limits are caused by the location of the audio signal. France had used an even higher-definition B&W system, but that had completely different parameters than the subsequent SECAM, being an 819-line system, while the East-European system differed only in the higher video bandwidth.)
So sampling to a frequency high enough for square pixels would have been pointless as the TV signal had been already reduced to a lower resolution by the earlier analog processing. Thus the 13.5 MHz sampling frequency chosen for digital TV, corresponding to pixels wider than their height, was still high enough to preserve the information contained in the sampled signal.
https://tech.ebu.ch/docs/techreview/trev_304-rec601_wood.pdf
Another condition that had to be satisfied by the sampling frequency was to be high enough in comparison with the maximum bandwidth of the video signal, but not much higher than necessary.
Among the common multiples of the line frequencies, 13.5 MHz was chosen because it also satisfied the second condition, i.e. the one I have discussed: it was possible to choose 13.5 MHz only because the analog video bandwidth had been standardized to values smaller than needed for square pixels. Otherwise a common multiple of the line frequencies greater than 15 MHz would have been required for the sampling frequency (namely 20.25 MHz).
It still looks surprisingly good, considering.
Notes: 1. https://eye-of-the-gopher.github.io/
Doing my part and sending you some samples of UPC cable from the Czech Republic :)
720x576i 16:9: https://0x0.st/P-QU.ts
720x576i 4:3: https://0x0.st/P-Q0.ts
That one weird 544x576i channel I found: https://0x0.st/P-QG.ts
I also have a few decrypted samples from the Hot Bird 13E, public DVB-T and T2 transmitters and Vectra DVB-C from Poland, but for that I'd have to dig through my backups.
My understanding is that televisions would mostly have square/rectangular pixels, while computer monitors often had circular pixels.
Or are you perhaps referring to pixel aspect ratios instead?
F.ex. in case of a "4:3 720x480" frame… a quick test: 720/4=180 and 480/3=160… 180 vs. 160… different results… which means the pixels for this frame are not square, just rectangular. Alternatively 720/480 vs. 4/3 works too, of course.
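For what it's worth, the same quick test in Python with exact fractions (PAR = display aspect ratio divided by the storage aspect ratio). Note this whole-frame arithmetic gives 8/9 and 16/15, slightly off from the 10/11 and 59/54 figures quoted upthread, which are based on the nominal analog active width (roughly 702-704 of the 720 samples) rather than the full coded frame:

    from fractions import Fraction

    def pixel_aspect_ratio(width, height, dar_w, dar_h):
        # PAR = display aspect ratio / storage aspect ratio
        return Fraction(dar_w, dar_h) / Fraction(width, height)

    print(pixel_aspect_ratio(720, 480, 4, 3))  # 8/9   -> pixels narrower than tall
    print(pixel_aspect_ratio(720, 576, 4, 3))  # 16/15 -> pixels wider than tall
    print(pixel_aspect_ratio(640, 480, 4, 3))  # 1     -> square pixels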
What a CRT actually draws, though, are lines. Analog television is a machine that chops up a 2D plane into a stack of lines, which are captured, broadcast, and drawn to the screen with varying intensity. Digital television - and, for that matter, any sort of computer display - absolutely does need that line to be divided into timesteps, which become our pixels. But when that gets displayed back on a CRT, the "pixels" stop mattering.
In the domain of analog television, the only properties of the video that are actually structural to the signal are the vertical and horizontal blanking frequencies - how many frames and lines are sent per second. The display's shape is implicit[1]: you just have to send 480 lines, and then those lines get stretched to fit the width[2] of the screen. A digital signal being converted to analog can be anything horizontally. A 400x480 and a 720x480 picture will both be 4:3 when you display them on a 4:3 CRT.
Pixel aspect ratio (PAR) is how the digital world accounts for the gap between pixels and lines. The more pixels you send per line, the thinner the pixels get. If you send exactly as many horizontal pixels as the line count times the display's aspect ratio, you get square pixels. For a 4:3 monitor[3], that's 640 pixels, or 640x480. Note that that's neither the DVD nor the SD cable standard - so both had non-square pixels.
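A tiny sketch of that rule (square-pixel width = visible line count x display aspect ratio):

    def square_pixel_width(lines, dar_w, dar_h):
        # Horizontal pixel count that makes pixels square on a dar_w:dar_h display
        return lines * dar_w / dar_h

    print(square_pixel_width(480, 4, 3))  # 640.0 -> 640x480
    print(square_pixel_width(576, 4, 3))  # 768.0 -> 768x576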
Note that there is a limit to how many dots you can send. But this is a maximum - a limitation of the quality of the analog electronics and the amount of bandwidth available to the system. DVD and SD cable are different sizes from each other, but they both will display just fine even on an incredibly low-TVL[4] blurry mess of a 60s CRT.
[0] There were some specialty tubes that could do "penetrative color", i.e. increasing the amplitude of the electron gun beyond a certain voltage value would change to a different color. This did not catch on.
[1] As well as how many lines get discarded during vertical blanking, how big the overscan is, etc.
[2] Nothing physical would stop you from making a CRT that scans the other way, but AFAIK no such thing exists. Even arcade cabinets with portrait (tate) monitors were still scanning by the long side of the display.
[3] There's a standard for analog video transmission from 16:9 security cameras that have 1:1 pixel aspect ratio - i.e. more pixels per line. It's called 960H, because it sends... 960 horizontal pixels per line.
https://videos.cctvcamerapros.com/surveillance-systems/what-...
[4] Television lines - i.e. how many horizontal lines can the CRT display correctly? Yes, this terminology is VERY CONFUSING and I don't like it. Also, it's measured differently from horizontal pixels.
I know you are arguing semantics and I hoped people would see past the "pixels aren't pixels" debate and focus on what I was actually asking, which is how physical dot/pixel/phosphor/mask/whatever patterns have anything to do with frame sizes of a digital video format, and I still assert that they don't, inherently... short of some other explanation I am not aware of.
All I was trying to say was that I thought OP was conflating physical "pixel" geometry with aspect ratios. Perhaps my question was too simple and people were taking it to mean more than I did, or thought I was misunderstanding something else.
In that context, the answer is: They don't really have any relationship at all.
Plenty of TVs of the past (from any continent) also had no physical dots/pixels/patterns at all. These were monochrome, aka black and white. :)
They had an inflexible number of lines that could be displayed per field, but there was no inherent delineation within each line as to what a pixel meant. It was just a very analog electron beam that energized a consistently-coated phosphorescent screen with a continuously-variable intensity as it scanned across for each raster line.
Pixels didn't really happen until digital sources also happened. A video game system (like an Atari or NES, say) definitely seeks to deliver pixels at its video output, and so does a digital format like DVD.
But the TV doesn't know the difference. It doesn't know that it's displaying something that represents pixels instead of a closed-circuit feed from a completely-analog pixel-free tube-based camera.
The "non-square pixel" part is just a practical description: When we have a digital framebuffer (as we do with a DVD player), that framebuffer has defined horizontal and vertical boundaries -- and a grid of pixels within those boundaries -- because that's just how digital things be.
When we smoosh the pixels of a DVD player's 720x480 framebuffer into an analog display with an aspect ratio of 4x3, we wind up with a combined system, with pixels, and those pixels aren't square.
---
And that's perfectly OK, though it does lead to weirdness.
For example: To produce a display of a perfect square that is 100 lines high using a 4x3 NTSC DVD is actually impossible. It'd have to be 100 pixels high and 112.5 pixels wide, which we can't accomplish since the format doesn't have that kind of precision. It's impossible with an LCD, and it's impossible with an analog set from 1955 just the same.
It's a DVD limitation, not a display limit. There's no great trick to producing a perfect square with analog gear, where pixels aren't a thing -- but we can't get there with the DVD's non-square pixels.
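A sketch of that arithmetic, assuming the whole-frame 8/9 PAR for a 720x480 picture shown at 4:3:

    from fractions import Fraction

    par = Fraction(4, 3) / Fraction(720, 480)  # 8/9

    height_lines = 100
    width_px = height_lines / par  # horizontal pixels whose physical width equals 100 lines' height
    print(float(width_px))         # 112.5 -- not an integer, so no exact square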
That weirdness doesn't have anything at all to do with how the display is constructed, though: Again, the TV has no concept of what a pixel even is (and a shadow mask doesn't necessarily exist at all).
CRTs of course had no fixed horizontal resolution.
Edit: I just realized I forgot about PAL DVDs which were 720x576. But the same principle applies.
This would correct the display, but how did it do it? Was it by drawing the same number of scanlines, but reducing the vertical distance between each line?
Was the CRT natively HD, or SD? Was it zooming in on the middle of the frame, or letterboxing?
Back then, there was no concept of SD or HD. PAL has 625 scanlines (~576 visible). No fixed horizontal resolution.
They never quite looked right, and making pixel graphics was a bit of a hassle since your perfect design on graph paper didn't look the same on screen, etc, etc, etc. I mean it wasn't life-threatening, just a tiny source of never-ending annoyances.
My Macintosh 512e (one of the early "toaster Macs") had square pixels and it was so great to finally have them.
Why would you want this? VHS. NTSC has 480-ish visible scanlines, but VHS only has bandwidth for 350 pixels.
We still have to discover necessary information for small tasks, like it was a note buried in a stack of papers, in the spare room that has absorbed a decade's worth of clutter.
Another canary I notice, is the presence of "Please don't hit the back button" on web pages served by major corporations. Something bad might happen if you click/touch/return that button! Hands off your input devices, please!
On the progress front, we know how to topologically layer information, like never before. Huge appearing/disappearing header bars, over popup ads, over content. Such screen space efficiency.
In some ways we have come far. In a truly remarkable number of ways, not so much.
What you're referring to stems from an assumption made a long time ago by Microsoft, later adopted as a de facto standard by most computer software. The assumption was that the pixel density of every display, unless otherwise specified, was 96 pixels per inch [1].
The value stuck and started being taken for granted, while the pixel density of displays started growing much beyond that - a move mostly popularized by Apple's Retina. A solution was needed to allow new software to take advantage of the increased detail provided by high-density displays while still accommodating legacy software written exclusively for 96 PPI. This resulted in the decoupling of "logical" pixels from "physical" pixels, with the logical resolution being most commonly defined as "what the resolution of the display would be given its physical size and a PPI of 96" [2], and the physical resolution representing the real number of pixels. The 100x100 and 200x200 values in your example are respectively the logical and physical resolutions of your screenshot.
Different software vendors refer to these "logical" pixels differently, but the names you're most likely to encounter are points (Apple), density-independent pixels ("DPs", Google), and device-independent pixels ("DIPs", Microsoft). The value of 96, while the most common, is also not a standard per se. Android uses 160 PPI as its base, and Apple for a long time used 72.
[1]: https://learn.microsoft.com/en-us/archive/blogs/fontblog/whe...
[2]: https://developer.mozilla.org/en-US/docs/Web/API/Window/devi...
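A minimal sketch of that decoupling in Python (the 96/160/72 baselines are the ones mentioned above; the 192 PPI display is just a hypothetical 2x example):

    BASE_PPI = 96  # Windows/web baseline; Android uses 160, classic Apple assumed 72

    def logical_size(physical_w, physical_h, display_ppi, base_ppi=BASE_PPI):
        # The scale factor is what the web exposes as window.devicePixelRatio
        scale = display_ppi / base_ppi
        return physical_w / scale, physical_h / scale

    print(logical_size(200, 200, 192))  # (100.0, 100.0) on a 2x display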
From what I recall only Microsoft had problems with this, and specifically on Windows. You might be right about software that was exclusive to desktop Windows. I don't remember having scaling issues even on other Microsoft products such as Windows Mobile.
If my memory serves, it was Apple that popularized high pixel density in displays with the iPhone 4. They weren't the first to use such a display [1], but certainly the ones to start a chain reaction that resulted in phones adopting crazy resolutions all the way up to 4K.
It's the desktop software that mostly had problems scaling. I'm not sure about Windows Mobile. Windows Phone and UWP have adopted an Android-like model.
[1]: https://en.wikipedia.org/wiki/Retina_display#Competitors
Some software, most notably image editors and word processors, still try to match the zoom of 100% with the physical size of a printout.
Some modern films are still shot with anamorphic lenses because the director / DP likes that look, so we in the VFX industry have to deal with plate footage that way. That means dealing with non-square pixels in the software handling the images (to de-squash the image, even though the digital camera sensor pixels that recorded it were square) in order to display correctly, i.e. so that round circular things still look round and are not squashed.
Even to the degree that full CG element renders (i.e. rendered to EXR with a pathtracing renderer) should really use anisotropic pixel filter widths to look correct.
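As a rough sketch of the de-squash step described above (the plate resolution and 2x squeeze factor are made-up example values, not from any particular camera or show):

    # Stored plate: square sensor pixels, but the anamorphic lens squeezed the
    # scene horizontally, so the image carries a pixel aspect ratio tag.
    stored_w, stored_h = 2880, 2160   # hypothetical plate resolution
    pixel_aspect = 2.0                # hypothetical 2x anamorphic squeeze

    display_w = round(stored_w * pixel_aspect)  # de-squash for display
    print(display_w, stored_h)                  # 5760 2160 -> circles look round again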
My guess is this is because encoding hardware can do max 1920x1080, and there is no easy way to make that hardware encode 1080x1920, so you are forced to encode as 1080x1080. Swapping rows and columns in hardware tends to be a big change because caches and readahead totally changes when you process the data in a different order.
And even then, why make it 1080 wide?
I feel like there's more going on. And maybe it's related to shorts supporting any aspect ratio up to 1:1.
But that's all assuming the article is giving an accurate picture of things in the first place. I went and pulled up the max resolutions for three random shorts: 576x1024, 1080x1920, 1080x1920. The latter two also have 608x1080 options.
Furthermore, the raster being referenced can assume any shape or form. It makes some sense that some signals are optimized around hardware restrictions.
Another interesting example are anamorphic lenses used in cinema.
If you used a camera or a GUI to generate your pixels, they are not point samples.
Non-antialiasing is just taking fewer samples and not attenuating the aliasing artifact band with a filter.
A GUI is more complex. Most graphics are collections of blocks, but as soon as you do any effects like filling a Bezier curve or drawing a shadow, you are back to the point-sampling model.
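A toy 1D example of the difference, assuming a simple stripe pattern and a plain box filter for the antialiased case:

    import math

    def stripes(x):
        # A high-frequency stripe pattern as a function of horizontal position (0..1)
        return 1.0 if math.sin(2 * math.pi * 7.3 * x) > 0 else 0.0

    WIDTH = 16  # output pixels

    # Point sampling: one sample at each pixel centre -> prone to aliasing
    point = [stripes((px + 0.5) / WIDTH) for px in range(WIDTH)]

    # Supersampling + box filter: average sub-samples per pixel -> aliasing attenuated
    N = 16
    aa = [sum(stripes((px + (i + 0.5) / N) / WIDTH) for i in range(N)) / N
          for px in range(WIDTH)]

    print(point)  # hard 0/1 values; the pattern depends on where the samples land
    print(aa)     # intermediate values approximating each pixel's average coverage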
If a video file only stores a singular color value for each pixel, why does it care what shape the pixel is in when it's displayed? It would be filled in with the single color value regardless.
I thought I understood the article just fine, but these comments are confusing.