Should you normalize RGB values by 255 or 256? (30fps.net)

167 points by pplanu 6 hours ago | 21 comments

moefh an hour ago
What exactly a color value means is mostly inconsequential at 8 bits per component: whether the denominator is 255 or 256, the errors are tiny. You would need really good color perception and have to get really close to the screen to see any difference at all, and your monitor/phone screen is probably not calibrated anyway, so who cares.
It becomes a pain in the ass when you're generating a VGA signal with a microcontroller with 8 color output pins (3 red, 3 green, 2 blue). The meaning of a color value is very real in this setup: it corresponds to a voltage level you must send to the VGA monitor, 0V-0.7V.
So the blue channel will map (0->0V, 1->0.23V, 2->0.47V, 3->0.7V), and the red/green will map (0->0V, 1->0.1V, ..., 7->0.7V). Notice how none of the blue voltages match any of the red/green ones (other than the extremes)? That means you don't get to see any pure grays -- the closest ones will have a bit of blue or yellow tint, depending on the direction of the difference.
Not only that, any gradients at all (other than the ones not mixing blue with the other channels) will be noticeably off: for example, the closest colors on the line from pure red to pure white will all be slightly orange or purple.
Code for VGA output in 8-bit color with double-buffered 320x240 framebuffer for the Raspberry Pi Pico 2 here, if anyone cares: https://github.com/moefh/pico-vga-8bit-demo
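To make the mismatch concrete, here is a minimal sketch (not taken from the linked repo) that lists the voltage levels of a 3-bit and a 2-bit channel and finds the ones they share. It assumes an ideal linear DAC with full scale at 0.7 V; the names `channel_levels` and `shared_levels` are made up for illustration.

```python
from fractions import Fraction

FULL_SCALE_V = Fraction(7, 10)  # 0.7 V full scale, as described above


def channel_levels(bits):
    """Voltage for each code of an ideal linear DAC with the given bit depth."""
    top = (1 << bits) - 1
    return [FULL_SCALE_V * code / top for code in range(top + 1)]


def shared_levels(bits_a, bits_b):
    """(code_a, code_b) pairs whose voltages are exactly equal."""
    a, b = channel_levels(bits_a), channel_levels(bits_b)
    return [(i, j) for i, va in enumerate(a) for j, vb in enumerate(b) if va == vb]


red_green = channel_levels(3)  # 0 V, 0.1 V, ..., 0.7 V
blue = channel_levels(2)       # 0 V, ~0.23 V, ~0.47 V, 0.7 V

print([round(float(v), 2) for v in red_green])
print([round(float(v), 2) for v in blue])
print(shared_levels(3, 2))     # only the two extremes line up
```

Exact fractions are used so that "equal voltage" is not blurred by floating-point rounding.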
dpark12 minutes ago
The entire issue arises from the use of truncation, right? It guarantees that only an exact 1.0 could land in the 255 bin so the net effect is a reduction of 256 bins to 255 bins. (Using random numbers as shown also guarantees no 1.0.)
Why not scale to fill the available bins, though? i.e. trunc(result * 255.999)?
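A rough way to compare the options is to push evenly spaced inputs through each encoder and count what lands in each bin. This is only a sketch: the function names are invented, the sample count is arbitrary, and evenly spaced inputs stand in for the random numbers used in the article.

```python
from math import floor


def encode_trunc(f):
    """Plain truncation of f * 255."""
    return min(floor(f * 255), 255)


def encode_round(f):
    """Scale by 255 and add 0.5 before truncating."""
    return min(floor(f * 255 + 0.5), 255)


def encode_fill(f):
    """Scale by 255.999 so that all 256 bins get nearly equal width."""
    return min(floor(f * 255.999), 255)


def histogram(encode, n=25600):
    """Count how many of n + 1 evenly spaced inputs in [0, 1] land in each bin."""
    counts = [0] * 256
    for k in range(n + 1):
        counts[encode(k / n)] += 1
    return counts


for name, enc in [("trunc", encode_trunc), ("round", encode_round), ("fill", encode_fill)]:
    h = histogram(enc)
    print(name, "bin 0:", h[0], "bin 128:", h[128], "bin 255:", h[255])
```

With truncation, only the input 1.0 reaches bin 255; with the +0.5 form the two end bins collect roughly half as many samples as the interior ones; with the 255.999 scale the counts come out nearly uniform.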
kazinator35 minutes ago
No, the "alternative" approach looks strange in the 7 bit example.
1.0 lies on the right side of the bin 7. But 0.0 lies on the left of bin 0.
The standard approach assumes that we have centered samples: zero is dead black, plus (and minus!) some uncertainty, and the same holds for bin 7.
If the sampling of the intensity is distortion-free (no clipping took place due to overexposure) then bin 7 represents a range of possible values centered around 1.0.
It is not a half-sized interval.
> This means that when converting floating-point values in the [0,1] range back to integers, the extreme bins have effectively half the width of other bins.
Under any interpretation whatsoever of the image samples, there is latitude for interpreting the maximum value 255 as being distortion: clipping from an arbitrarily higher value. Shifting things by 0.5 doesn't fix this issue of not knowing whether 255 means that an intensity close to 1.0 is being represented (no distortion), or an outlier intensity of 37.49 (severely clamped). That could go the other way too.
In other words, there is a possible bias in the extreme bin. The signal could be limited such that the bin's full sampling range is not in effect, or the signal could be overwhelming, so that values far outside of the range are clipped and included.
The only way around this is to make the highest value a canary which represents "clipped value". That is to say, 255 means "clipped datum", so that only values of 254 and below are samples of the unclipped signal. Machine-generated images (e.g. 3D renderings) then avoid the 255 value, and camera sensors are calibrated so that it doesn't occur when technical images are being shot.
herf5 hours ago
I'll argue for the +0.5 solution. First, I don't like half-sized intervals at the edges, and second, a 255-based representation is typically a SDR (not HDR) image.
RGB values represent luminances against some adapted state, and a "zero" in a daylit scene is not "zero luminance" - it's just about 0.001x as bright as the brightest point - it's millions of photons, way more than zero. In a sense our eyes experience contrast on a sliding scale, and there is no absolute zero in the system. For example, broadcast systems historically used 16-235 as their luminance range for SDR. I think any argument that says "we must have zero" is going to have a bias, but I don't think zero is needed for most things.
- pixelesque 3 hours ago
  As someone with a lot of experience in this area doing image processing and rendering for VFX (including writing image readers and writers for my own software and commercial VFX software), I think you might be forgetting that colourspace conversion (to sRGB 'linear' rec709 for old-school SDR, but wider gamuts for newer formats) would happen after this, so the 'squish' of the dynamic range would happen after loading.
  Also, a lot of workflows for image processing and compositing do assume that 0 means zero, whether correctly or not (often incorrectly). So there are often assumptions that for 8-bit, 0u maps to 0.0f and 255 maps to 1.0f for things like masking or alpha: as soon as you have 0 values which become just over 0.0, you then have artifacts because some code somewhere is using a hard threshold of 0.0 to mask some other operation, and vice-versa for 1.0 with alpha, where suddenly because the 255 values are no longer 1.0f, you have very slightly see-through objects (often only visible in certain situations or when pixel-peeping) after pre-multiplication.
  (Same thing can happen when 254 becomes 1.0f after +0.5 with masking).
  - herf2 hours ago
    good point - alpha is a notable exception, it is not luminance
- pornel an hour ago
  Although the post focuses on RGB, the same quantization issue exists for any type of signal being mapped between discrete and continuous representations.
  The issue isn't in having a representation for 0 photons, but about maximizing information stored in a byte. Ideally you shouldn't be underutilizing the byte value 0, nor add bias to data that should have been assigned to the 0th bucket, regardless of what it represents (you could have a color space that goes from bright to super bright, and still want to ensure that every byte represents an equal chunk of your brightness range).
  - PaulDavisThe1st8 minutes ago
    Yep, the exact same problem arises in digital audio, mapping between integer sample formats and the floating point representation that is generally used internally.
- kazinator34 minutes ago
  They are not half sized at the edges, unless negative black bothers you.
- amavect4 hours ago
  I agree. Additionally, both 0.0 and 1.0 don't really exist for dithered signals, so a byte should map to [0.5, 255.5] before division by 256. This also solves the signed integer asymmetry, as a signed byte maps to [-127.5, 127.5] before division by 128. I wonder if audio DSP folks have done this already.
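A small sketch of that mapping, under the assumption that encoding is a clamped floor (the comment only specifies the decode side); the function names are hypothetical.

```python
from math import floor


def decode_u8(i):
    """Unsigned byte -> centre of its bin in (0, 1)."""
    return (i + 0.5) / 256


def encode_u8(f):
    """Float in [0, 1] -> unsigned byte (assumed clamped floor)."""
    return max(0, min(255, floor(f * 256)))


def decode_s8(i):
    """Signed byte -> centre of its bin in (-1, 1)."""
    return (i + 0.5) / 128


def encode_s8(f):
    """Float in [-1, 1] -> signed byte (assumed clamped floor)."""
    return max(-128, min(127, floor(f * 128)))


print(decode_u8(0), decode_u8(255))     # neither 0.0 nor 1.0 is produced
print(decode_s8(-128), decode_s8(127))  # equal magnitude, opposite sign
```

Because the divisors are powers of two, every byte survives the round trip exactly, and the signed range becomes symmetric around zero.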
  - amavect2 hours ago
    Thinking about this more, dithering requires negative values to cancel out when adding. Works for audio, but color doesn't have negative numbers.
    somat 43 minutes ago
    It is still a frequency, where it would have negative values, but I doubt any color handling algorithms deal with it as a frequency. Rightfully so, the physical wetware for decoding images is very different than that for decoding audio. Well... not that different if you think of audio as a single pixel monochrome image.
    Now I am imagining a weird alternate history where we treat audio like we treat color. OK take three bytes which encode how loud the sound is, one for lows, one for mids and one for highs where lows mids and high frequencies are picked to match human ear response.
- yxhuvud5 hours ago
  Both solutions add 0.5, the difference is where in the process it happens.
- infinet 3 hours ago
  Interesting idea, but somehow I feel the world is shaking. For the processing program, what used to be black (0.0) and white (1.0) has become very dark gray and very bright gray.
- dylan6042 hours ago
  > broadcast systems historically used 16-235
  For 8-bit, 16 maps to 7.5IRE which is the well understood legal black. Mapping 235 means they mapped peak to 110IRE. This is based on a 0-120IRE scale. This gets weird as the broadcast limit for video was 100IRE allowing for the chroma to reach 110IRE. So if you're trying to limit your white values to 235, that'll be higher than is broadcast safe. Of course, nobody cares about NTSC broadcast limits any more. However, to this day, I still see out of spec tapes marked as "broadcast master" that have been ingested for streaming use. It drives me crazy to this day, and it's only getting worse as people don't even have scopes to adjust the VTR's TBC properly.
  - variaga25 minutes ago
    Ugh. Sudden flashbacks to having to switch analog output between Japanese NTSC (no pedestal) and US NTSC (with pedestal) without getting weird noise in the black regions.
    But IIRC the MPEG-2 standard had luma==235 -> 100IRE for all of the analog formats (pal/ntsc-j/ntsc/secam) so I'm not sure why you say that would violate the broadcast limits?
- themafia4 hours ago
  > In a sense our eyes experience contrast on a sliding scale
  There's a whole visual center to check the amount of incoming light and adjust your pupils for you. It's intentionally reactive.
  > and there is no absolute zero in the system.
  There maybe is. I think we call that "blind."
  > broadcast systems historically used 16-235 as their luminance range for SDR
  Mostly because it was a fully analog system and these all translate down to signal voltage. Jokingly NTSC used to be referred to as "Never Twice the Same Color" due to being a compromise bolted onto the side of an already compromised system.
  - a_conservative2 hours ago
    >> and there is no absolute zero in the system.
    > There maybe is. I think we call that "blind."
    If you go looking into that, you'll see that the reality is far far more complex [0]
    "The number of people with no light perception is unknown, but it is estimated to be less than 10 percent of totally blind individuals."
    [0] https://chicagolighthouse.org/sandys-view/what-blind-people-...
Nuthen 3 hours ago
That was a fun article to read about something I haven't had to think about in a while. It brought to mind moments in game development of having pixel art needing to be drawn at integer positions despite the game logic using floating point math. I tried something similar to the +0.5 in places so that it wouldn't look as bad (especially when there's a moving camera, which also needed to be truncated...).
I also enjoyed the 2002 article by Jonathan Blow [1] that's linked at the bottom. The visualization from the first article helped a lot once this started to go more in-depth.
[1] https://web.archive.org/web/20240706043551/https://number-no...
dudu245 hours ago
If you have a ruler and it goes to 12 inches, you should normalize by the length L and not by 13, the number of points on the ruler.
- Timwi3 hours ago
  I'm confused by that analogy. Is the “ruler” a 255-inch ruler with 256 points labeled 0–255, or is it a 256-inch ruler with 256 1-inch segments, making L = 256×1?
- layer82 hours ago
  But who says that the numbers are representing the points, rather than representing the intervals between the points?
  - wky an hour ago
    It doesn't even need to represent intervals. A 13 inch ruler with 13 markings at 0.5, 1.5, etc inches is still a valid ruler, albeit an odd construction.
- lacedeconstruct5 hours ago
  yes but >> 8 is so much faster
  - xigoi4 hours ago
    You don’t divide a float by 256 by shifting it right eight bits; that would yield complete garbage. You subtract 8 from the exponent, then check if you got an underflow.
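For illustration, here is roughly what that looks like on a 32-bit float, next to what a raw 8-bit right shift of the bit pattern does. This is a sketch only: it assumes the input fits in a float32, and it falls back to ordinary division whenever the exponent trick does not apply. The function names are invented.

```python
import struct


def float_bits(x):
    """Bit pattern of x as an IEEE 754 float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits


def bits_float(bits):
    """float32 value for a 32-bit pattern."""
    (x,) = struct.unpack("<f", struct.pack("<I", bits))
    return x


def div256_via_exponent(x):
    """Divide by 256 by subtracting 8 from the exponent field."""
    bits = float_bits(x)
    exponent = (bits >> 23) & 0xFF
    if exponent == 0xFF or exponent <= 8:
        # Inf/NaN, zero, subnormal input, or a result that would leave
        # the normal range: use ordinary division instead.
        return x / 256.0
    return bits_float(bits - (8 << 23))


def shift_raw_bits(x):
    """What '>> 8' on the raw float bits produces (not a division)."""
    return bits_float(float_bits(x) >> 8)


print(div256_via_exponent(255.0), 255.0 / 256.0)
print(shift_raw_bits(255.0))  # unrelated tiny value
```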
    dheera3 hours ago
    Same point; divide by power of 2 is a fast subtraction operation in float world, while divide by 255 shits all over the whole float
    yongjik 6 minutes ago
    If your input is an arbitrary float, you need to check for denormals (and maybe NaNs). You can do a bitmasking trick to avoid conditional jumps, but I'm skeptical you can do it faster than a SIMD multiply instruction.
  - StilesCrisis4 hours ago
    It's just multiplication. Floating multiply is extraordinarily fast.
    lacedeconstruct4 hours ago
    The difference between 20 cycles and 1 clock cycle in a hot loop is very noticeable
    exyi3 hours ago
    It's 3 cycles for float multiplication (and 1 for shift right):
    https://uops.info/table.html?search=mulss&cb_lat=on&cb_tp=on...
    https://uops.info/table.html?search=shr&cb_lat=on&cb_tp=on&c...
    In throughput it's even less of a difference: 2 per cycle vs 3 per cycle.
    Tuna-Fish3 hours ago
    FP Division by constant is optimized by a compiler into a multiply. Graphics processing typically happens on the GPU these days, and on all recent GPUs FPMUL belongs to the class of lowest-latency operations. That is, there are no other instructions that complete faster.
    pixelesque3 hours ago
    Only with things like -ffast-math enabled will compilers do the reciprocal. It can make a fair difference in some cases, but it's often better to selectively use it in code locations you know are acceptable by doing it manually in the code.
    mgaunard3 hours ago
    That's only valid to do if the reciprocal is representable exactly.
    hansvm2 hours ago
    That's not totally true. It's sufficient to be exactly representable, but you only need the reciprocal rounding error to be small enough to guarantee the multiplication rounding step fixes it across the entire range of numerators. For IEEE754 f16 values, there are 28 such extra values, the positive and negative sides of 1705/x where x is a power of 2 at least as great as 2048.
    Sesse__3 hours ago
    Useful, then, that you can start several vectorized floating-point muls each cycle. (E.g., most modern x86 are 3/0.5 cycles for vmulps. No 20 cycles in sight.)
  - dist-epoch4 hours ago
    Only in micro-benchmarks.
    For real usage, today's CPUs are limited by memory bandwidth.
    lacedeconstruct 4 hours ago
    What are you talking about? In a hot loop in my software renderer this is like 10x faster:
    // color4_t result = {
    //     .r = (src.r * src.a + dst.r * inv_alpha) * INV_255,
    //     .g = (src.g * src.a + dst.g * inv_alpha) * INV_255,
    //     .b = (src.b * src.a + dst.b * inv_alpha) * INV_255,
    //     .a = src.a + (dst.a * inv_alpha) * INV_255
    // };
    // 1/256 but much faster
    color4_t result = {
        .r = (src.r * src.a + dst.r * inv_alpha) >> 8,
        .g = (src.g * src.a + dst.g * inv_alpha) >> 8,
        .b = (src.b * src.a + dst.b * inv_alpha) >> 8,
        .a = src.a + ((dst.a * inv_alpha) >> 8)
    };
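For comparison, there is a well-known integer trick (often attributed to Jim Blinn) that divides by 255 with correct rounding using only adds and shifts. The sketch below, written in Python for brevity and assuming products of two 8-bit values, shows it next to the plain `>> 8`; the function names are made up.

```python
def div255_round(x):
    """round(x / 255) for 0 <= x <= 255 * 255, using only adds and shifts."""
    t = x + 128
    return (t + (t >> 8)) >> 8


def div256_shift(x):
    """The shortcut: divide by 256 instead of 255, rounding down."""
    return x >> 8


full = 255 * 255               # opaque white times full alpha
print(div256_shift(full))      # 254: white comes out slightly dark
print(div255_round(full))      # 255
```

The shift version is cheaper by a couple of integer operations per channel, at the cost of never reproducing 255 from a full-intensity product.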
    Tuna-Fish3 hours ago
    If the latter is 10x faster, the issue is some kind of weird compilation failure for the above version. For one, it only cuts a third of the multiplies.
    dist-epoch4 hours ago
    Because you are working in the cache.
    Also, you should use SIMD.
    lacedeconstruct 4 hours ago
    > Also, you should use SIMD.
    Ironically no, clang is better at auto-vectorizing.
- groundzeros20155 hours ago
  I’m dumb. Doesn’t 0 start at the beginning?
  - dylan6042 hours ago
    It's right up there with the confusion if 2000 was the new year of the 21st century or the last year of the 19th century.
    simonask an hour ago
    For the record, the mathematically correct answer to this question is that the year 2000 was the last year of the 19th century.
    The reason is that year 0 never existed. The year 1 BCE was followed by the year 1 CE.
    Culturally, anthropologically, and psychologically it might be a different matter. But 2000 years had not passed before the end of that year.
    tzot32 minutes ago
    The debate is if 2000 is the first year of the 21st century or the last year of the 20th century. (btw I agree with the latter)
jessetemp3 hours ago
The author is confusing bins with bin edges. In their first plot, the standard approach looks strange because 0-7 should be the bin edges, not the center points as shown in the plot.
You can see this confusion again in the histogram example. There are only 255 bins, not 256. If you fix that mistake and remove the 0.5 offset, then the histogram is distributed correctly at both ends.
- pornel an hour ago
  No, the author understands the problem way deeper than you do.
  You haven't grasped the fact that the choice isn't obvious, and has subtle trade-offs.
  If you don't believe the author, check the other posts he references.
- nomel 2 hours ago
  2^8 = 256. You can represent 256 distinct values, bins, with an 8 bit number. If you stick a 0 in that first one, it takes a bin. If you fill the rest with by-one increasing integers, then the max value will be 255, thus the 2^bits - 1, which is the max value you can store.
- bjourne2 hours ago
  How do you fit 256 distinct values into 255 bins?
  - jessetemp2 hours ago
    By counting the edges
Sesse__ 4 hours ago
You should multiply by 255.0, optionally add a dither (triangular is okay), and then let the FPU round using its default IEEE 754 round-to-nearest, ties-to-even mode. None of this crazy 0.5 stuff. :-)
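A minimal sketch of that recipe, relying on the fact that Python's built-in `round` on a float rounds to nearest with ties to even. The dither amplitude (the difference of two uniform samples, spanning one step either way) and the final clamp are assumptions, not something the comment specifies.

```python
import random


def quantize(f, rng=None):
    """Scale to 0..255 and round to nearest, ties to even.

    If rng is given, triangular (TPDF) dither spanning +/- 1 step is
    added before rounding; that amplitude is an assumption.
    """
    x = f * 255.0
    if rng is not None:
        x += rng.random() - rng.random()  # triangular distribution on (-1, 1)
    return max(0, min(255, round(x)))     # clamp in case dither overshoots


print(quantize(0.0), quantize(0.5), quantize(1.0))

rng = random.Random(1234)
samples = [quantize(0.3, rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # averages out close to 0.3 * 255
```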
Retr0id5 hours ago
Both of these assume a linear transfer function, which is rarely the case.
- leni5363 hours ago
  Basically never for 8-bit color channels.
RobRivera an hour ago
Are we talking 0 or 1 based values? *HONKHONK*
crazygringo5 hours ago
Advice for anyone on mobile: read in landscape mode if you want to be able to see the division by 256 version code example at the start.
The page's HTML/CSS is bad: it lets the code block overflow the right edge of the page instead of wrapping.
I re-read this post three times in total confusion before I figured out the most important piece was off-screen entirely.
theyeenzbeanz5 hours ago
Should always be 0-255 as that fits an unsigned byte.
- crazygringo5 hours ago
  That's not what the article is about.
- Retr0id5 hours ago
  > assume that in both cases the output values are clamped before the final typecast
atilimcetin4 hours ago
Interesting article. I tend to use
- i = min(floor(f * 256), 255) (from float to uint8)
- f = i / 255 (from uint8 to float)
Basically a mix of the 2 approaches mentioned in the article.
For every integer in [0, 255], the uint8 -> float -> uint8 round trip returns the original value.
--
edit: I wondered what's the maximum jitter amount that I can introduce to the float and get the same uint8 value. And also these 0->0.0 and 255->1.0 should map properly.
With my approach at the top, the jitter margin that I can introduce is 1/65280.
But with the article's approach
- i = floor(f * 255 + 0.5)
- f = i / 255
maximum jitter margin is 1/510 (which is better).
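Both margins can be checked with exact fractions. The sketch below is only an illustration: it measures the distance from each decoded value i/255 to the nearest interior bin edge, and assumes the outermost edges are handled by clamping. The helper names are made up.

```python
from fractions import Fraction
from math import floor


def encode_mixed(f):
    return min(floor(f * 256), 255)


def encode_round(f):
    return max(0, min(255, floor(f * 255 + 0.5)))


def decode(i):
    return i / 255


def min_margin(lower_edge, upper_edge):
    """Smallest distance from a decoded value i/255 to an interior bin edge."""
    margins = []
    for i in range(256):
        centre = Fraction(i, 255)
        if i > 0:
            margins.append(centre - lower_edge(i))
        if i < 255:
            margins.append(upper_edge(i) - centre)
    return min(margins)


# Mixed scheme: bin i covers [i/256, (i+1)/256).
mixed = min_margin(lambda i: Fraction(i, 256), lambda i: Fraction(i + 1, 256))
# Rounding scheme: bin i covers [(i-0.5)/255, (i+0.5)/255).
rounded = min_margin(lambda i: Fraction(2 * i - 1, 510), lambda i: Fraction(2 * i + 1, 510))

print(mixed, rounded)  # 1/65280 1/510
```

The tight spots in the mixed scheme are the codes next to the ends (1 and 254), where the decoded value sits almost on a bin edge.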
- AgentME3 hours ago
  It's worth pointing out that the article explicitly calls out your first mixed technique:
  > Finally, one should never mix the encode and decode steps of the two quantizers. That’s just broken code. It’s an easy mistake to make, though.
- vitorsr4 hours ago
  This is what I do for the former:
  floor( nextafter( 256, 255 ) * value )
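In Python this could look like the sketch below, assuming `math.nextafter` (available from Python 3.9) and double-precision inputs in [0, 1]; the names `SCALE` and `quantize` are made up.

```python
from math import floor, nextafter

SCALE = nextafter(256.0, 255.0)  # largest double strictly below 256


def quantize(value):
    """Map a float in [0, 1] to 0..255 without an explicit min()."""
    return floor(SCALE * value)


print(quantize(0.0), quantize(0.3), quantize(1.0))
print(quantize(0.5))  # 127: an input sitting exactly on a bin edge
```

One visible difference from the `min(floor(f * 256), 255)` form: an input that sits exactly on a bin edge, such as 0.5, can land one bin lower (127 here instead of 128).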
  - atilimcetin4 hours ago
    Oh very nice idea to get rid of the min operator.
AlienRobot an hour ago
Case against 255: it looks wrong in the graph :(
Case against 256: no 0 or 1 values :(
Considering how important having a 0 and 1 value is for arithmetic in general, I think 255 is better.
wyager19 minutes ago
You don't need to make this judgement; it's fixed by the colorspace you're working in.
First, figure out what colorspace the processing needs to happen in. Usually this is linear RGB.
Then, figure out what OETF and EOTF your input/output format use. This will be something like PQ or HLG. This will exactly specify the meaning of each integer value.
This fixes the choice of representation and conversion.
dist-epoch4 hours ago
A similar issue exists in the audio world, for example 16-bit integer audio is between [-32768, 32767] (non-symmetric), but floating point audio is [-1.0, 1.0].
- ack_complete35 minutes ago
  There is an analogous situation in graphics with signed normalized formats. The solution there is that the R16_SNORM format maps -1 to +1 as [-32767, 32767] with -32768 being a special value (not normally emitted, and mostly but not always interpreted as -32767). Some audio storage formats seem to use this mapping too.
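A sketch of the two conventions for 16-bit samples. The function names are invented, and the encode side (clamp, then round) is an assumption rather than a quote from any particular spec.

```python
def int16_to_float_div32768(s):
    """Asymmetric mapping: -32768 -> -1.0, but 32767 falls just short of 1.0."""
    return s / 32768.0


def int16_to_float_snorm(s):
    """SNORM-style mapping: [-32767, 32767] -> [-1, 1], with -32768 clamped to -1."""
    return max(s / 32767.0, -1.0)


def float_to_int16_snorm(f):
    """Inverse of the SNORM-style mapping; never emits -32768."""
    f = max(-1.0, min(1.0, f))
    return round(f * 32767.0)


print(int16_to_float_div32768(32767), int16_to_float_div32768(-32768))
print(int16_to_float_snorm(32767), int16_to_float_snorm(-32768))
```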
- adzm4 hours ago
  note that floating point audio very often exceeds [-1.0, 1.0] within the pipeline, just to be tamed at the very end of the mix to fit within those bounds. this is pretty much why every modern DAW uses floating point these days.
ctdinjeu84 hours ago
Both. 255 for each color and the last 1 as the alpha for each channel.
Why not??? Fight me
DigitallyFidget5 hours ago
255 gives 0-255, which gives you a zero value. 256 is 1-256, you lose the option of setting 0.
- crazygringo5 hours ago
  That's not what the article is about.
dgently74 hours ago
"Let’s say you’re writing an image processing program. The program takes in an image, converts it to floating point, does some processing and finally saves the modified pixels to disk as 8-bit colors. "
excuse to argue about the best way aside, if this is the goal you should not be rolling your own image file reading. you should use openimageio. idk what approach it takes in its internal conversion to float, but that library is more likely to have the right answer than you trying to roll it yourself, given it's the library used internally by tons of professional image manipulation software...
- pixelesque3 hours ago
  If you're a beginner, or just want something which works quickly, sure.
  However OIIO is far from perfect in all situations (having had to debug and fix issues with its mip-map generation filtering code in the past), so don't always assume that just because there's a mature open source library out there doing something that it's always perfect.
  - dgently73 hours ago
    sure, of course nothing is perfect, and oiio has a lot of surface area / is still oss. that's good advice.
    i've just seen a lot of "ai researchers" who are getting into professional image processing and are both beginners and want things quickly, and so could do much worse than just starting from what they get out of oiio. especially for a lot of the non-obvious stuff (more of that in color handling than just the io stuff though)
- AgentME3 hours ago
  OpenImageIO uses the standard division by 255 technique: https://openimageio.readthedocs.io/en/latest/imageoutput.htm...