If I have a bin of apples, and I say it's 5 apples wide, and 4 apples tall, then you'd say I have 20 apples, not 20 apples squared.
It's common to specify a length by counting the items passed along that length. E.g., a city block is a roughly square patch of ground bounded by roads, yet if you're traveling in a city you might say "I walked 5 blocks." This is a linguistic shortcut, skipping implied information. If you're trying to talk about both senses in an unclear context, additional words are required to convey the information; that's just how language works.
This idea itself breaks down when we get to triangular subpixel rendering, which spans pixels and divides subpixels. But it's also a minor form of optical illusion, so making sense of it is inherently fraught.
Maybe a pixel is just a pixel.
If I told you parking spots are about two bowling lanes wide... I'm obviously not trying to say they are 120 ft wide.
And pixels are even starting to vary in the third dimension too, with the various curved and bendable and foldable displays.
Edit: To clarify, if someone says "3 blocks", that could vary by a factor of 3, or in extreme cases more, so when used as a unit of length it is a very rough estimate. It is usually used in my country as a way to know when you have reached your destination.
And the article concludes with: "But it does highlight that the common terminology is imperfect and breaks the regularity that scientists come to expect when working with physical units in calculations". Which matches your conclusion.
But it's not true. Counts (like "number of pixels" or "mole of atoms") are dimensionless, which is a precise scientific concept that perfectly matches the common terminology.
Isn't a pixel clearly specified as a picture element? Isn't the usage as a length unit just as colloquial as "It's five cars long", which is just a simplified way of saying "It is as long as the length of a car times five", where "car" and "length of car" are very clearly completely separate things?
> The other awkward approach is to insist that the pixel is a unit of length
Please don't. If you want a unit of length that works well with pixels, you can use Android's "dp" concept instead, which stands for "density-independent pixels" (kind of a bad name if you think about it) and is indeed a unit of length, namely 1 dp = 158.75 micrometers, so that you have 160 dp to the inch. Then you can say "It's 10dp by 5dp, so 50 square dp in area."
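For the curious, the arithmetic is simple enough to sketch in a few lines of Python (the helper names below are mine for illustration, not anything from the Android API):

    # Android defines 160 dp to the inch, so 1 dp = 25.4 mm / 160 = 0.15875 mm.
    DP_PER_INCH = 160
    MM_PER_INCH = 25.4

    def dp_to_mm(dp):
        """Convert a length in dp to millimeters."""
        return dp * MM_PER_INCH / DP_PER_INCH

    def dp_to_px(dp, screen_dpi):
        """Convert dp to physical pixels on a screen of the given density."""
        return dp * screen_dpi / DP_PER_INCH

    # The 10dp x 5dp example: 50 square dp, about 1.59 mm x 0.79 mm.
    print(dp_to_mm(10), dp_to_mm(5))   # 1.5875 0.79375
    print(dp_to_px(10, 320))           # 20.0 px on a 320 dpi screen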
It's nearly identical to North American usage of "block" (as in "city block"). Merriam Webster lists these two definitions (among many others):
> 6 a (1): a usually rectangular space (as in a city) enclosed by streets and occupied by or intended for buildings
> 6 a (2): the distance along one of the sides of such a block
The pixel is a unit of area - we just occasionally use units of area to measure length.
I have never heard someone use the first instance, and I wouldn't understand what it meant. I mean, I could buy that it meant that there is a five-acre plot between that house and where we are now, but it wouldn't give me any useful idea of how far the house is other than "not too close." Perhaps you have in mind that, since the "width" of an acre is a furlong, a house 5 acres away is 5 furlongs away?
I have heard sentences like "the property line is two acres into the woods" and it was understood that he was using acre like you might use "block" in a city - "the property line is two acre widths into the woods". As you say, that's just a furlong, but I doubt either of us knew that at the time.
A Pixel Is Not a Little Square (1995) [pdf] – http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf
Truth is, a pixel is both a sample and a transducer. And in transduction, a pixel is both an integrator and an emitter.
> If you are looking to understand how your operating system will display images, or how your graphics drivers work, or how photoshop will edit them, or what digital cameras aim to produce, then it’s the point sample definition.
Even some VGA modes have non-square pixels, and these were used by many games and... the Windows 9x splash screen.
They also look more like a square when I back away. And the mismatch of the square model doesn't mean the point model is good.
So your intuition for why squares makes sense is wrong, but you’re still holding on to it.
> doesn't mean the point model is good.
What does show it’s a good model is all the theory of image processing, and the implementation of that theory in camera and display systems.
You’re welcome to propose an alternative theory, and if that is consistent, try to get manufacturers to adopt it.
I said subpixels are rectangles. Because they are.
If the point model was all you need, then objects small enough to slip between points would be invisible. Which is not the case.
In particular a shot of the night sky would look pure black.
So if being wrong means we should abandon the model, then we can't use squares or points.
https://upload.wikimedia.org/wikipedia/commons/4/4d/Pixel_ge...
I look forward to your paper about a superior digital image representation.
Or are you arguing that the slightly rounded corners on the rectangles make a significant difference in how the filtering math works out? It doesn't. On a scale between "gaussian" and "perfect rectangles", the filtering for this shape is 95% toward the latter.
They are just bits in a computer. But there is a correct way to interpret them in a particular context. For example, 32 bits can be meaningless - or they can have an interpretation as a two's complement integer, which is well defined.
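To illustrate, here's a minimal Python sketch using the standard struct module (the variable names are just for the example): the same 32 bits, read three different ways.

    import struct

    raw = struct.pack("<I", 0xFFFFFFFF)       # the same 32 bits

    unsigned = struct.unpack("<I", raw)[0]    # 4294967295
    signed   = struct.unpack("<i", raw)[0]    # -1 (two's complement)
    as_float = struct.unpack("<f", raw)[0]    # nan (IEEE 754)

    print(unsigned, signed, as_float)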
If you are looking to understand how an operating system will display images, or how graphics drivers work, or how photoshop will edit them, or what digital cameras produce, then it’s the point sample definition.
And for pixel art, the intent is usually far from points on a smooth color territory.
Multiple interpretations matter in different contexts, even within the computer.
They use a physical process to attempt to determine the light at a single point. That’s the model they try to approximate.
> And for pixel art, the intent is usually far from points on a smooth color territory.
And notice that to display pixel art you need to tell it to interpret the image data differently.
Also, it has a vastly different appearance on a CRT, where it was designed to be displayed, and there it looks less like a rectangle.
According to who?
A naked camera sensor with a lens sure doesn't do that; it collects squares of light, usually in a mosaic of different colors. Any point approximation would have to be in software.
> Any point approximation would have to be in software.
Circuits can process signals too.
And outputs what? Just because the input is an area does not mean the output is an area.
What if it outputs the peak of the distribution across the area?
> that are pretty square.
If we look at a camera sensor and do not see a uniform grid of packed area elements would that convince you?
I notice you haven’t shared any criticism of the point model - widely understood by the field.
> What if it outputs the peak of the distribution across the area?
It outputs a voltage proportional to the (filtered) photon count across the entire area.
> If we look at a camera sensor and do not see a uniform grid of packed area elements would that convince you?
Non-uniformity won't convince me points are a better fit, but if the median camera doesn't use a grid I'll be interested in what you have to show.
> I notice you haven’t shared any criticism of the point model - widely understood by the field.
This whole comment chain is a criticism of the input being modeled as points, and my criticism of the output is implied by my pixel art comment above (because point-like upscaling causes a giant blur) and also exists in other comments like this one: https://news.ycombinator.com/item?id=43777957
This is not true. And it’s even debunked in the original article.
No, it's not. That article does not mention digital cameras anywhere. It briefly says that scanners give a gaussian, and I don't want to do enough research to see how accurate that is, but that's the only input device that gets detailed.
It also gives the impression that computer rendering uses boxes, when usually it's the opposite and rendering uses points.
In signal processing you have a finite number of samples of an infinitely precise continuous signal, but in image processing you have a discrete representation mapped to a discrete output. It's continuous only when you choose to model it that way. Discrete → continuous → discrete conversion is a useful tool in some cases, but it's not the whole story.
There are images designed for very specific hardware, like sprites for CRT monitors, or font glyphs rendered for LCD subpixels. More generally, nearly all bitmap graphics assumes that pixel alignment is meaningful (and that has been true even in the CRT era before the pixel grid could be aligned with the display's subpixels). Boxes and line widths, especially in GUIs, tend to be designed for integer multiples of pixels. Fonts have/had hinting for aligning to the pixel grid.
Lack of grid alignment, an equivalent of a phase shift that wouldn't matter in pure signal processing, is visually quite noticeable at resolutions where the hardware pixels are little squares to the naked eye.
Since the pixels never were a waveform, never were sampled from such a signal (even light in camera sensors isn't sampled along these axes), and don't get displayed as a 2D waveform, the pixels-as-points model from the article at the top of this thread is just an arbitrary abstract model, but it's not an accurate representation of what pixels are.
https://en.wikipedia.org/wiki/Subpixel_rendering
Edit: For the record, I'm on Win 10 with a 1440p monitor and disabling font smoothing makes a very noticeable difference.
People are acting like this is some issue that no longer exists, and you don't have to be concerned with subpixel rendering anymore. That's not true, and highlights a bias that's very prevalent here on HN. Just because I have a fancy retina display doesn't mean the average user does. If you pretend like subpixel rendering is no longer a concern, you can run into situations where fonts look great on your end, but an ugly jagged mess for your average user.
And you can tell who the Apple users are because they believe all this went away years ago.
The reason is mostly that it is too hard to make it work under transformations and compositing, while higher resolution screens are a better solution for anyone who cares enough.
This is a little misleading, as the new versions of Edge and Windows Terminal do use subpixel antialiasing.
What Microsoft did was remove the feature on a system level, and leave implementation up to individual apps.
Laptop screen is 4k with 200% scaling.
Seriously, the font rendering in certain areas (e.g. the Windows notification panel) is actually dogshit. If I turn off the 200% scaling on the laptop screen and then reboot, it looks correct again.
Anti-aliasing can be and is done on squares routinely. It’s called ‘greyscale antialiasing’ to differentiate from LCD subpixel antialiasing, but the name is confusing since it works and is most often used on colors.
The problem Alvy-Ray is talking about is far more subtle. You can do anti-aliasing with little squares, but the result isn’t 100% correct and is not the best result possible no matter how many samples you take. What he’s really referring to is what signal processing people call a box filter, versus something better like a sinc or Gaussian or Mitchell filter.
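To make the filter distinction concrete, here's a rough 1D sketch (assuming numpy; illustrative only, not production resampling code). A box filter weights every sample under the pixel equally, while a Gaussian tapers off smoothly, which is what gives the better-behaved result.

    import numpy as np

    def downsample(signal, factor, kernel):
        """Low-pass with `kernel`, then keep every `factor`-th sample."""
        kernel = np.asarray(kernel, dtype=float)
        kernel /= kernel.sum()
        filtered = np.convolve(signal, kernel, mode="same")
        return filtered[::factor]

    x = np.linspace(0, 1, 400)
    signal = np.sin(2 * np.pi * 60 * x)                        # fine detail

    box      = np.ones(4)                                      # "little square" weights
    gaussian = np.exp(-0.5 * (np.arange(-4, 5) / 1.5) ** 2)    # smooth taper

    low_res_box   = downsample(signal, 4, box)
    low_res_gauss = downsample(signal, 4, gaussian)   # suppresses the fine detail more before it can alias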
Regarding your edit, on a high DPI display there’s very little practical difference between LCD subpixel antialiasing and ‘greyscale’ (color) antialiasing. You don’t need LCD subpixels to get effective antialiasing, and you can get mostly effective antialiasing with square shaped pixels.
And I guess I should have explicitly stated that I'm not talking about high-DPI displays, subpixel rendering obviously doesn't do much good there!
My point is simply this: if you don't treat pixels like discrete little boxes that display a single color, you can use subpixels to effectively increase the resolution on low-DPI monitors. Yes, you can use greyscale antialiasing instead, and you will even get better performance, but the visual quality will suffer on your standard desktop PC monitor.
If you don't treat a logical pixel like a discrete little square, you can take advantage of subpixel geometry to effectively increase the resolution. It's not the same as antialiasing, even though it can be used for antialiasing.
Arguably, instead of treating pixels as squares, you're treating them as three times as many rectangles, but that should still contradict the mental model of pixels as little squares.
No, it doesn’t. You seem to be missing this point. Using subpixel rectangles does not in any way correct the misconception, it perpetuates it.
A 100x100 thumbnail that was reduced from a 1000x1000 image might have pixels which are derived from 100 samples of the original image (e.g. a simple average of a 10x10 pixel block). Or other possibilities.
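Something like this, as a rough numpy sketch of that kind of block-average reduction (assuming dimensions that divide evenly; the function name is made up):

    import numpy as np

    def box_downsample(img, factor):
        """Average each factor x factor block of pixels into one output pixel."""
        h, w = img.shape[:2]
        assert h % factor == 0 and w % factor == 0
        blocks = img.reshape(h // factor, factor, w // factor, factor, -1)
        return blocks.mean(axis=(1, 3))

    big = np.random.rand(1000, 1000, 3)   # stand-in for the 1000x1000 image
    thumb = box_downsample(big, 10)       # each output pixel averages 100 samples
    print(thumb.shape)                    # (100, 100, 3)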
As an abstraction, a pixel definitely doesn't represent a point sample, let alone an infinitely small one. (There could be some special context in which it does but not as a generality.)
And if a downsampling algorithm tries to approximate a point sample, it'll give you a massively increased chance of ugly moire patterns.
The audio equivalent is that you drop 3/4 of your samples and it reflects the higher frequencies down into the lower ones and hurts the quality. You need to do a low-pass filter first. And "point samples from a source where no frequencies exist above X, also you need to change X before doing certain operations" is very different and significantly more complicated than "point samples". Point samples are one leaky abstraction among many leaky abstractions, not the truth. Especially when an image has a hard edge with frequencies approaching infinity.
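A toy version of that failure mode (numpy sketch, illustrative only): a fine pattern that point-sample decimation misreads entirely, while a crude average-first approach at least lands on the right overall level.

    import numpy as np

    x = np.arange(1000)
    signal = np.where(x % 4 < 2, 1.0, 0.0)             # 1,1,0,0,1,1,0,0,... fine detail

    naive = signal[::4]                                 # keep every 4th point sample
    prefiltered = signal.reshape(-1, 4).mean(axis=1)    # low-pass (average), then decimate

    print(naive[:6])         # [1. 1. 1. 1. 1. 1.]  -- the pattern aliases to solid 1s
    print(prefiltered[:6])   # [0.5 0.5 0.5 0.5 0.5 0.5]  -- correct average level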
Someone with just the small version wouldn't know if it's supposed to look like that, but we're not asking them.
Unless they can infer it's a picture of normal objects, in which case they can declare it's moire and incorrect.
Having worked on both game engines and CG films, I think that’s misleading. Point sample is kind of an overloaded term in practice, but I still don’t think your statement is accurate. Many games, especially modern games are modeling pixels with area; explicitly integrating over regions for both visibility and shading calculations. In fact I would say games are generally treating pixels as squares not point samples. That’s what DirectX and Vulkan and OpenGL do. That’s what people typically do with ray tracing APIs in games as well. Even a point sample can still have an associated area, and games always display pixels that have area. The fact that you can’t display a point sample without using area should be reason enough to avoid describing pixels that way.
I'm not sure how raytracing as done by video games can be construed as more area-based than point-sample-based. It seems to me like ray-hit testing is done in a point-sample manner, and games aren't doing multiple rays per pixel. (They're usually doing less than one ray per pixel, and averaging over several frames that sampled from different locations within a square pixel, then running it all through a blurring denoiser filter and upscaling, but the result is so bad for anything other than a static scene that I don't think it should be used to support any argument.)
Ignoring the denoiser, games are quite commonly using Box Filter for pixels. That’s the square in square pixels. The point sampling is in service of an integral whose shape is a square, and that’s the problem Alvy Ray is talking about.
That point sampling is distinctly different from the “point sample” that represents the pixel value itself, so let’s not conflate them. The averaging over an area that is square shaped, whether it’s multiple samples or over time, is the reason that the pixel shape is summarized as square, not as a point.
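Roughly, the accumulation looks like this (a made-up Python sketch, not any particular graphics API; `shade` stands in for whatever the renderer evaluates at a sample point):

    import random

    def shade(x, y):
        """Whatever the renderer computes at one point sample (hypothetical scene)."""
        return 1.0 if x * x + y * y < 0.25 else 0.0   # a disc of radius 0.5

    def render_pixel(px, py, samples=16):
        """Point samples in service of a square integral: jitter samples across
        the pixel's unit-square footprint and average them (a box filter)."""
        total = 0.0
        for _ in range(samples):
            sx = px - 0.5 + random.random()   # somewhere in the pixel's square
            sy = py - 0.5 + random.random()
            total += shade(sx, sy)
        return total / samples

    print(render_pixel(0, 0))   # approaches the disc's coverage of the square (~0.785)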
And the end pixels are "physical things". Like ceramic tiles on a bathroom wall.
Your wall might be however many meters in length and you might need however many square meters of tile in order to cover it. But still, if it's 10 tiles high and 20 tiles wide, you need 200 tiles to cover it. No tension there.
Now you might argue that pixels in a scaled game don't correspond with physical objects in the screen any more. That's ok. A picture of the bathroom wall will look smaller than the wall itself. Or bigger, if you hold it next to your face. It's still 10x20=200 tiles.
When you multiply 3 meter by 4 meter, you do not get 12 meters. You get 12 meter squared. Because "meter" is not a discrete object. It's a measurement.
When you have points A, B, C. And you create 3 new "copies" of those points (by geometric manipulation like translating or rotating vectors to those points), you now have 12 points: A, B, C, A1, B1, C1, A2, B2, C2, A3, B3, C3. You don't get "12 points squared". (What would that even mean?) Because points are discrete objects.
When you have 3 apples in a row and you add 3 more such rows, you get 4 rows of 3 apples each. You now have 12 apples. You don't have "12 apples squared". Because apples are discrete objects.
When you have 3 pixels in a row and you add 3 more such rows of pixels, you get 4 rows of 3 pixels each. You now have 12 pixels. You don't get "12 pixels squared". Because pixels are discrete objects.
Pixels are like points and apples. Pixels are not like metres.
"12 meter(s) squared" sounds like a square that is 12 meters on each side. On the other hand, "12 square meters" avoids this weirdness by sounding like 12 squares that are one meter on each side, which the area you're actually describing.
If you use formal notation, 12 m^2 is very clear. But I have yet to see anyone write 12px^2.
As for the rest, see GGP's argument. px^2 doesn't make logical sense. When people use pixels as length, it's in the same way as "I live 2 houses over" - taking a 2D or 3D object and using one of its dimensions as length/distance.
> This is an issue that strikes right at the root of correct image (sprite) computing and the ability to correctly integrate (converge) the discrete and the continuous. The little square model is simply incorrect. It harms. It gets in the way. If you find yourself thinking that a pixel is a little square, please read this paper.
> A pixel is a point sample. It exists only at a point. For a color picture, a pixel might actually contain three samples, one for each primary color contributing to the picture at the sampling point. We can still think of this as a point sample of a color. But we cannot think of a pixel as a square—or anything other than a point.
Alvy Ray Smith, 1995 http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf
The paper's claim applies at least somewhat sensibly to CRTs, but one mustn't imagine the voltage interpolation and shadow masking a CRT does corresponds meaningfully to how modern displays work... and even for CRTs it was never actually correct to claim that pixels were point samples.
It is pretty reasonable in the modern day to say that an idealized pixel is a little square. A lot of graphics operates under this simplifying assumption, and it works better than most things in practice.
Integrates this information into what? :)
> A modern display does not reconstruct an image the way a DAC reconstructs sounds
Sure, but some software may apply resampling over the original signal for the purposes of upscaling, for example. "Pixels as samples" makes more sense in that context.
> It is pretty reasonable in the modern day to say that an idealized pixel is a little square.
I do agree with this actually. A "pixel" in popular terminology is a rectangular subdivision of an image, leading us right back to TFA. The term "pixel art" makes sense with this definition.
Perhaps we need better names for these things. Is the "pixel" the name for the sample, or is it the name of the square-ish thing that you reconstruct from image data when you're ready to send to a display?
Into electric charge? I don’t understand the question, and it sounds like the question is supposed to lead readers somewhere.
The camera integrates the incoming light over a tiny square into an electric charge and then reads out the charge (at least for a CCD), giving a brightness (and, with the Bayer filter in front of the sensor, a color) for the pixel. So it’s a measurement over the tiny square, not a point sample.
This is where I was trying to go. The pixel, the result at the end of all that, is the single value (which may be a color with multiple components, sure). The physical reality of the sensor having an area and generating a charge is not relevant to the signal processing that happens after that. For Smith, he's saying that this sample is best understood as a point, rather than a rectangle. This makes more sense for Smith, who was working in image processing within software, unrelated to displays and sensors.
And depending on your application, you absolutely need to account for sensor properties like pixel pitch and color filter array. It affects moire pattern behavior and creates some artifacts.
I’m not saying you can’t think of a pixel as a point sample, but correcting other people who say it’s a little square is just wrong.
Yes. The spacing between point samples determines the sampling frequency, a fundamental characteristic of a DSP signal.
Integrated into a single color sample indeed. After all the integration, mosaicking, and filtering, a single sample is calculated. That’s the pixel. I think that’s where the confusion is coming from. To Smith, the “pixel” is the sample that lives in the computer.
The actual realization of the image sensors and their filters is not encoded in a typical image format, nor used in typical high-level image processing pipelines. For abstract representations of images, the “pixel” abstraction is used.
The initial reply to this chain focused on how camera sensors capture information about light, and yes, those sensors take up space and operate over time. But the pixel, the piece of data in the computer, is just a point among many.
> But the pixel, the piece of data in the computer, is just a point among many.
Sure, but saying that this is the pixel, and negating all other forms as not "true" pixels is arbitrary. The real-valued physical pixels (including printer dots) are equally valid forms of pixels. If anything, it would be impossible for humans to sense the pixels without interacting with the real-valued forms.
It turns out that when you view things that way, pixels as points continues to make sense.
Obviously, when a pile of pixels is shown on a screen (or for that matter, collected from a camera's CCD, or blobbed by ink on a piece of paper), it will have some shape: The shape of the LCD matrix, the shape of the physical sensor, the shape of the ink blot. But those aren't pixels, they're the physical shapes of the pixels expressed on some physical medium.
If you're going to manipulate pixels in a computer's memory (like by creating more of them, or fewer), then you'd do best by treating the pixels as sampling points - at this point, the operation is 100% sampling theory, not geometry.
When you're done, and have an XY matrix of pixels again, you'll no doubt have done it so that you can give those pixels _shape_ by displaying them on a screen or sheet of paper or some other medium.
1. there exist other ways to represent an integer
2. An old computer uses a different representation
3. numbers are displayed in base 10 on my monitor
4. when I type in numbers I don’t type binary
5. twos complement is confusing and unintuitive
6. it’s more natural to multiply by 10 when using base 10
7. I’ve used 32 bits to represent other data before.
* A small city might be ten blocks by eight blocks, and we could also say the whole city is eighty blocks.
* A room might be 13 tiles by 15 tiles, or 195 tiles total.
* On graph paper you can draw a rectangle that's three squares by five squares, or 15 squares total.
The dot may be physically small, or physically large, and it may even be non-square (I used to work for a camera company that had non-square pixels in one of its earlier DSLRs, and Bayer-format sensors can be thought of as “non-square”), so saying a pixel is a certain size, as a general measure across implementations, doesn’t really make sense.
In iOS and MacOS, we use “display units,” which can be pixels, or groups of pixels. The ratio usually changes, from device to device.
> That means the pixel is a dimensionless unit that is just another name for 1, kind of like how the radian is length divided by length so it also equals one, and the steradian is area divided by area which also equals one.
But then for some reason decides to ignore it. I don’t understand this article. Yes, pixels are dimensionless units used for counting, not measuring. Their shape and internal structure are irrelevant (even subpixel rendering doesn’t actually deal with fractions - it alters neighbors to produce the effect).
The issue is muddied by the fact that what people mostly care about is either the linear pixel count or pixel pitch, the distance between two neighboring pixels (or perhaps rather its reciprocal, pixels per unit length). Further confounding is that technically, resolution is a measure of angular separation, and to convert pixel pitch to resolution you need to know the viewing distance.
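For a ballpark feel, the conversion goes something like this (a back-of-the-envelope Python sketch; the numbers are just an example):

    import math

    def pixels_per_degree(pixel_pitch_mm, viewing_distance_mm):
        """How many pixels span one degree of visual angle at a given distance."""
        one_degree_mm = 2 * viewing_distance_mm * math.tan(math.radians(0.5))
        return one_degree_mm / pixel_pitch_mm

    # A ~0.27 mm pitch desktop monitor (~94 ppi) viewed from 60 cm:
    print(pixels_per_degree(0.27, 600))   # roughly 39 pixels per degree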
Digital camera manufacturers at some point started using megapixels (around the point that sensor resolutions rose above 1 MP), presumably because big numbers are better marketing. Then there's the fact that camera screen and electronic viewfinder resolutions are given in subpixels, presumably again for marketing reasons.
A chessboard is 8 tiles wide and 8 tiles long, so it consists of 64 tiles covering an area of, well, 64 tiles.
The fact that some cities don't have repeated grids and hence don't use the term is not really a valuable corrective to the post you are replying to.
E.g. Manhattan has mostly rectangular blocks, if you go from 8th Avenue to Madison Avenue along 39th St you traveled 4 blocks (the last of which is shorter than the first 3), if you go from 36th St to 40th St along 8th Avenue you traveled 4 blocks (all of which are shorter than the blocks between the avenues).
The problem in this article is it incorrectly assumes a pixel to be a length and then makes nonsensical statements. The correct way to interpret "1920 pixels wide" is "the same width as 1920 pixels arranged in a 1920 by 1 row".
In the same way that "square feet" means "feet^2" as "square" acts as a square operator on "feet", in "pixels wide" the word "wide" acts as a square root operator on the area and means "pixels^(-2)" (which doesn't otherwise have a name).
If you have a high resolution screen, a CSS pixel is typically 4 actual display pixels (2x2) instead of just 1. And if you change the zoom level, the number of display pixels might actually change in fractional ways. The unit only makes sense in relation to what's around it. If you render vector graphics or fonts, pixels are used as relative units. On a high resolution screen it will actually use those extra display pixels.
If you want to show something that's exactly 5cm on a laptop or phone screen, you need to know the dimensions of the screen and figure out how many pixels you need per cm to scale things correctly. CSS has some absolute units, but they typically only work as expected for print media.
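As a sketch of that arithmetic (assuming you somehow know the panel's physical width and resolution, which the web platform doesn't reliably expose; the function and numbers are hypothetical):

    def css_px_for_length(length_cm, screen_width_px, screen_width_cm,
                          device_pixel_ratio=2.0):
        """CSS pixels needed to span a physical length on a given panel."""
        device_px_per_cm = screen_width_px / screen_width_cm
        css_px_per_cm = device_px_per_cm / device_pixel_ratio
        return length_cm * css_px_per_cm

    # A 13" laptop panel, 2560 device pixels across ~28 cm, at 2x scaling:
    print(css_px_for_length(5, 2560, 28.0))   # ~229 CSS px for 5 cm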
But to be contrarian, the digital camera world always markets how many megapixels a camera has. So in essence, there are situations where pixels are assumed to be an area, rather than a single row of X pixels wide.
Did you meant "pixels^(1/2)"? I'm not sure what kind of units pixels^(-2) would be.
A pixel is a point sample by definition.
You are referring to a physical piece of a display panel. A representation of an image in software is a different thing. Hardware and software transform the DSP signal of an image into voltages to drive the physical pixel. That process takes into account physical characteristics like dimensions.
Oh btw physical pixels aren’t even square and each RGB channel is a separate size and shape.
So, I don't think it's entirely valid to talk about pixels as if they are pure, one-dimensional units.
They're _things_, and you can talk about how many things wide or tall something is, and you can talk about how many things something has. Very much the same way you can with bricks (which are mostly never square) or tiles (which are square; you never talk about how many kilotiles are in your bathroom either, yet you can easily talk about how many tiles wide or tall a wall is).
So, no, the pixel is not a unit in the mathematical sense; it's an item, in the physical sense.
There are also things like scanners, which may have only one row of pixels on the sensor. That row does not have an area of zero, and you don't need to specify that there's one pixel on the other axis, because it's an inherent property of pixels that they have both width and height, and thus area, in and of themselves.
But it does highlight that the common terminology is imperfect and breaks the regularity that scientists come to expect when working with physical units in calculations
Scientists and engineers don't actually expect much; they make a lot of mistakes, are not very rigorous, and are not demanding towards each other. It is common for units to be wrong, context defined, socially dependent, and even sometimes added together when the operator + hasn't been properly defined. It certainly doesn't make sense to mix different meanings in a mathematical sense.
E.g., when referring to a width in pixels, the unit is pixel widths. We shorten it and just say pixels because it's awkward and redundant to say something like "the screen has a width of 1280 pixel widths", and the meaning is clear to the great majority of readers.
Sometimes, it is used as a length or area, omitting a conversion constant, but we do that all the time; the article gives mass vs. force as an example.
Also worth mentioning that pixels are not always square. For example, the once popular 320x200 resolution has pixels taller than they are wide.
It is just word semantics revolving around a synecdoche.
When we say that an image is 1920 pixels wide, the precise meaning is that it is 1920 times the width of a pixel. Similarly 1024 pixels high means 1024 times the height of a pixel. The pixel is not a unit of length; its height or width are (and they are different when the aspect ratio is not 1:1!)
A syntax-abbreviating semantic device in human language where part of something refers to the whole or vice versa is called a synecdoche. Under synecdoche, "pixel" (the whole) can refer to "pixel width" (part or property of the whole).
Just like the synecdoche "New York beats Chicago 4:2" refers to basketball teams in its proper context, not literally the cities.
The critical distinction is the inclusion of a length dimension in the measurement: "1920 pixels wide", "3 Mount Everests tall", "3.5 football fields long", etc.
or
> "I have made this longer than usual because I have not had time to make it shorter."
Pixel is an abbreviation for 'picture element' which describes a unit of electronic image representation. To understand it, consider picture elements in the following context...
(Insert X different ways of thinking about pictures and their elements.)
If there is a need for a jargon of mathematical "dimensionality" for any of these ways of thinking, please discuss it in such context.
Next up:
*A musical note is a unit of...*
And the difference is pure pedantry, because each photosite corresponds to a pixel in the image (unless we’re talking about lens correction?). It’s like making up a new word for monitor pixels because those are little lights (for OLED) while the pixel is just a tuple of numbers. I don’t see how calling the sensor grid items "pixels" could be misunderstood in any way.
Edit: Upon doing some more reading, it sounds like a photosite, or sensel, isn't a group of sensors but a single sensor, which picks up only one of the R, G, B, ... components - "each individual photosite, remember, records only one colour – red, green or blue" - https://www.canon-europe.com/pro/infobank/image-sensors-expl...
I couldn't seem to find a particular name for the RGGB/... pattern that a Bayer filter is an array of.
A "megapixel" is simply defined as 1024 pixels squared ish.
There is no kilopixel. Or exapixel.
Does anyone really not understand this?
So an image could be 1 megapixel, or 1000 by 1000 pixel-lengths.