Chrome's hidden X-Browser-Validation header reverse engineered (github.com)

380 points by dsekz 7 months ago | 14 comments

dsekz7 months ago
Dug into chrome.dll and figured out how the x-browser-validation header is generated. Full write up and PoC code here: https://github.com/dsekz/chrome-x-browser-validation-header
Why do you think Chrome bothers with these extra headers? Anti-spoofing, bot detection, integrity, or something else?
- userbinator7 months ago
  Making it easier to reject "unapproved" or "unsupported" browsers and take away user freedom. Trying to make it harder for other browsers to compete.
  - ajross7 months ago
    That can be done already based on User-Agent, though. Other browsers don't spoof their agent strings to look like Chrome, and never have (or, they do, but only in the sense that everyone still claims to be Mozilla). And browsers have always (for obvious reasons) been very happy to identify themselves correctly to backend sites.
    The purpose here is surely to detect sophisticated spoofing by non-user-browser software, like crawlers and robots. Robots are in fact required by the net's Geneva Convention equivalent to identify themselves and respect limitations, but obviously many don't.
    I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
    jml7c57 months ago
    >I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
    The big one is that running a browser other than Chrome (or Safari) could come to mean endless captchas, degrading the experience. "Chrome doesn't have as many captchas" is a pretty good hook.
    Forgeties797 months ago
    Not to mention how often you can get stuck in an infinite loop where it just will not accept your captcha results and keeps making you do it over and over. Especially if you’re using a VPN. It’s maddening sometimes. Can’t even do a basic search
    ajross7 months ago
    So the market isn't allowed to detect robots because some sites have bad captcha implementations? I'm not following. Captchas aren't implemented by the browser.
    motorest7 months ago
    > So the market isn't allowed to detect robots (...)
    I don't know what you mean by "the market".
    What I do know is that if I try to go to a site with my favourite browser and a site blocks me because it's so poorly engineered it thinks I am a bot just because I'm not using Chrome, then it's pretty obvious that it's not detecting bots.
    Also worth noting: it might surprise you that there are browser automation frameworks. Some of them, such as Selenium, support Chrome.
    OhMeadhbh7 months ago
    So the cool thing is we can now add an x-browser-validation header to selenium (and firefox).
    Forgeties797 months ago
    Exactly
    Forgeties797 months ago
    I’m not sure who “the market” is in this case, but reCAPTCHA is owned and implemented by Google and clearly favors their browser. Any attempt to use other browsers or obfuscate your digital footprint in the slightest leads to all kinds of headaches. It’s a very convenient side effect of their “anti-bot” efforts that they have every incentive to steer into.
    aaronmdjones7 months ago
    This isn't Google's doing but Mozilla's. Firefox's strict tracking protection blocks third-party cookies. The site you're trying to visit isn't hosting reCAPTCHA itself; reCAPTCHA was loaded from a third-party origin (Google); so the cookie that Google sets saying you passed the CAPTCHA is blocked by Firefox.
    You can add an exception in Firefox's settings to allow third-party cookies for CAPTCHAs. Google's reCAPTCHA cookie is set by "recaptcha.net". Cloudflare's CAPTCHA, whose domain is "challenges.cloudflare.com", has exactly the same problem.
    If the cookies aren't set and passed back, then they can't know that you've solved it, so you get another one.
    mindslight7 months ago
    You're blaming Mozilla because they fixed a security vulnerability, and then saying that the workaround is to reenable the vulnerability so that Google can continue surveilling.
    Forgeties797 months ago
    Yet for some inexplicable reason all the other bot detection methods I encounter online don’t struggle at all with me and don’t stick me in infinite loops. Cloudflare for instance simply does not bug out with rare exceptions for me.
    Maybe my experience is atypical but it seems to me this is a reCAPTCHA problem, not a Mozilla one. It’s Google’s problem. I imagine they can solve this but simply don’t want to.
    Maybe I’m wrong but again, I encounter more issues with their “anti bot” methods than any other by a massive margin.
    hedora7 months ago
    Concretely: Google Meet blocks all sorts of browsers / private tabs with a vague: “you cannot join this meeting” error. They let mainstream ones in though.
    jherskovic7 months ago
    I use Safari (admittedly, with Private Cloud and a few tracking-blocking extensions) and get bombarded with Cloudflare's 'prove you are human' checkbox several times an hour.
    It's already a pretty degraded experience.
    randomjoe27 months ago
    I mean you're using a VPN, they can't tell the diff between you and a bunch of bots
    Diti7 months ago
    Requests per second?
    JeffMcCune7 months ago
    Harder to scale, stateful.
    fireflash387 months ago
    I think you mean they can't profit from selling data from a bunch of bots.
    Sayrus7 months ago
    > I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
    In the name of robot detection, you can lock down devices, require device attestation, prevent users from running non-standard devices/OS/software, and prevent them from accessing websites (Cloudflare dislikes non-Chrome browsers and hates non-standard browsers; reCAPTCHA locks you out if you're not on Chrome-like/Safari/Firefox). Web Environment Integrity[1] is also a good example of where robot detection ends up affecting the end user.
    [1] https://en.wikipedia.org/wiki/Web_Environment_Integrity
    ajross7 months ago
    Aren't all those solutions even more impactful on the user experience though? Someone who cares about user freedom would think they're even worse, no?
    jsnell7 months ago
    The purpose here isn't to deal with sophisticated spoofing. This is setting a couple of headers to fixed and easily discoverable values. It wouldn't stop a teenager with Curl, let alone a sophisticated adversary. There's no counter-abuse value here at all.
    It's quite hard to figure out what this is for, because the mechanism is so incredibly weak. Either it was implemented by some total idiots who did not bother talking at all to the thousands of people with counter-abuse experience that work at Google, or it is meant for some incredibly specific case where they think the copyright string actually provides a deterrent.
    (If I had to guess, it's about protecting server APIs only meant for use by the Chrome browser, not about protecting any kind of interactive services used directly by end-users.)
    Sophira7 months ago
    I would imagine that this serves the same purpose as the way that early home consoles would check the inserted cartridge to see that it had a specific copyright message in it, because then you can't reproduce that message without violating the copyright.
    In this case, you would need to reproduce a message that explicitly states that it's Google's copyright, and that you don't have the right to copy it ("All rights reserved."). Doing that might then give Google the legal evidence it needs to sue you.
    In other words, a legal deterrence rather than a technical one.
    soulofmischief7 months ago
    It's easy to change the User Agent and we cannot handwave this fact away for the sake of argument.
- Avamander7 months ago
  > Why do you think Chrome bothers with these extra headers? Anti-spoofing, bot detection, integrity, or something else?
  Bot detection. It's a menace to literally everyone. Not to piss anyone off, but if you haven't dealt with it, you don't have anything of value to scrape or get access to.
  - motorest7 months ago
    > Bot detection. It's a menace to literally everyone. Not to piss anyone off, but if you haven't dealt with it, you don't have anything of value to scrape or get access to.
    What leads you to believe that bot developers are unable to set a request header?
    They managed fine to set Chrome's user agent. Why do you think something like X-Browser-Validation is off limits?
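    To make the point concrete, here is a minimal sketch of how little it takes to attach arbitrary headers to a request using only Python's standard library. The URL and both header values are placeholders (not real Chrome constants), and nothing is actually sent over the network:

```python
from urllib.request import Request

# Placeholder values for illustration; none of these are real Chrome constants.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "X-Browser-Validation": "placeholder-value",
}

# Building the Request object does not perform any network I/O.
req = Request("https://example.com/", headers=headers)

# urllib normalizes header names with str.capitalize(), so compare case-insensitively.
sent = {name.lower(): value for name, value in req.header_items()}
print(sent["x-browser-validation"])  # placeholder-value
```

    Whether the server accepts the value is a separate question; the sketch only shows that adding the header itself is no harder than changing the User-Agent.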
    Sophira7 months ago
    Because you would need to reproduce an explicit Google copyright statement which states that you don't have the right to copy it ("All rights reserved.") in order to do it fully.
    That presumably gives Google the legal ammunition it needs to sue you if you do it.
    userbinator7 months ago
    Companies like SEGA have tried doing stuff like that in the past, and lost.
    tomsonj7 months ago
    It seems like the requirement to reproduce this copyright header alone, never mind the validation hash, would be enough to scare off scrapers?
    Sophira7 months ago
    I'm no lawyer, but my take on it is that by reproducing this particular value for the validation header, you are stating that you are the Chrome browser. It's likely that this has been implemented in such a way that other browsers could use it too if they so choose; the expected contents of the copyright header can then change depending on what you have in the validation header.
    To me, it seems likely that the spec is for a legally defensible User-Agent header.
    Avamander7 months ago
    > They managed fine to set Chrome's user agent. Why do you think something like X-Browser-Validation is off limits?
    It's not off-limits technically. But do you think it'll remain this simple going forward? I doubt that.
  - lxgr7 months ago
    Do you mean bot and non-Chrome-using human detection?
  - IshKebab7 months ago
    Bots can easily copy the header though so I don't see how that helps?
    Avamander7 months ago
    Only if they know to implement it, and only while it uses such a trivial approach. I expect it to gradually become more difficult. It's also yet another way to make mistakes and make it entirely obvious that one is forging Chrome.
  - ohdeargodno7 months ago
    Bullshit. You don't have anything of value either. Scrapers will ram through _anything_, and figure out if it's useful later.
- twapi7 months ago
  Seems like they are using these headers only for google.com requests.
  - xnx7 months ago
    Yes I think it is part of their multi-level testing for new version rollouts. In addition to all the internal unit and performance tests, they want an extra level of verification that weird things aren't happening in the wild.
  - AznHisoka7 months ago
    My theory is that they are probably using it to block bots scraping Google results.
- exiguus7 months ago
  I have two questions:
  1. Do I understand it correctly and the validation header is individual for each installation?
  2. Is this header only in Google Chrome or also in Chromium?
  - gruez7 months ago
    >1. Do I understand it correctly and the validation header is individual for each installation?
    I'm not sure how you got that impression. It's generated from fixed constants.
    https://github.com/dsekz/chrome-x-browser-validation-header?...
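    A minimal sketch of what "generated from fixed constants" could look like. It assumes, based on the linked write-up rather than anything confirmed in this thread, that the value is the Base64-encoded SHA-1 digest of a hard-coded API key concatenated with the User-Agent string. The key and User-Agent below are made-up placeholders, and `validation_header` is a hypothetical name:

```python
import base64
import hashlib

# Placeholder only: NOT Chrome's real key. See the linked repository for the
# constants the author actually extracted.
PLACEHOLDER_API_KEY = "EXAMPLE-API-KEY"

def validation_header(api_key: str, user_agent: str) -> str:
    """Assumed scheme: Base64(SHA-1(api_key + user_agent))."""
    digest = hashlib.sha1((api_key + user_agent).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")

ua = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"  # hypothetical UA
value = validation_header(PLACEHOLDER_API_KEY, ua)
print(value)  # 28-character Base64 string
```

    Under this assumption nothing in the input is unique to an installation: two machines with the same key and the same User-Agent produce the same header value.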
    exiguus7 months ago
    It's still not clear to me because it's called the default API key. And for me, default means that this is normally overwritten. And if overwritten, during build or during install? That's what I'm asking myself.
  - dlenski7 months ago
    I had the same question (2). https://news.ycombinator.com/item?id=44560664
    If it's only in the closed-source Chrome, then it seems it's intended to help Google's servers distinguish between Google's own products and others.
    But I've never seen a Google site which worked less-well in Chromium than in Chrome, so I'm somewhat skeptical of this. Perhaps there are exceptions
- wernerb7 months ago
  Is it not likely that it protects against AI bot Llama?
  - wut427 months ago
    I don't see how you can "protect" against a large language model that cannot do browsing.
userbinator7 months ago
This should be somewhat alarming to anyone who already knows about WEI.
I wonder if "x-browser-copyright" is an attempt at trying to use the legal system to stifle competition and further their monopoly. If so, have they not heard of Sega v. Accolade ?
I'm a bit amused that they're using SHA-1. Why not MD5, CRC32, or (as the dumb security scanners would recommend) even SHA256?
- ulrikrasmussen7 months ago
  I am also alarmed. Google has to split off its development of both Chrome and Android now, this crazy vertical integration is akin to a private company building and owning both the roads AND the cars. Sure, you can build other cars, but we just need to verify that your tires are safe before you can drive on OUR roads. It's fine as long as you build your car on our complete frame, you can still choose whatever color you like! Also, the car has ads.
  - nurettin7 months ago
    Ok but The Road is the internet, how much of that does google/alphabet actually own?
    ulrikrasmussen7 months ago
    All of YouTube. The vast majority of email. All sources of revenue for ad-funded sites, basically, except for those ads pushed by Meta in their respective walled gardens. They are also the gatekeepers deciding what parts of the internet the users actually see, and they continuously work towards preventing people from actually visiting other sites by siphoning off information and keeping users on Google (AMP, AI summaries). The whole Play Store ecosystem is a walled garden which pretends to be open by building on an ostensibly open source OS but adding strict integrity checks on top which gives Google the ultimate power to decide what is allowed to run on peoples phones.
    They don't have to own the servers and the pipes if they own all the clients, sources of revenue, distribution platforms and financial transaction systems.
    nolok7 months ago
    The rest of your list is unrealistic, but I had to react to at least this one:
    > The vast majority of email.
    Not even close; less than a third in reality.
    I agree that Google should be cut down, but if that is done then the other tech giants should be too, otherwise we're just trading one master for another.
    CrossVR7 months ago
    Even less than a third is absolutely massive on the scale of a protocol like E-mail.
    nolok7 months ago
    Oh I am not saying they're not a gigantic provider, I'm saying less than a third is very far from "the vast majority", and exaggeration and misinformation help no one's case, be they on purpose or due to lack of knowledge.
    JimDabell7 months ago
    I would shy away from calling them a majority myself, but it’s a fair point.
    Remember that email involves at least two parties. It doesn’t matter if I use a non-Google provider, I still have to follow all of Google’s email rules, or email will be useless to me because I wouldn’t be able to send mail to Gmail or Google Workspace users.
    In a practical sense, Google have very direct control over almost all email.
    rpdillon7 months ago
    They're probably the biggest provider in existence.
    nolok7 months ago
    I myself would bet on microsoft
    rpdillon7 months ago
    I was being a bit generous. I did the research. Google has 1.8 billion active users. They're the biggest.
    gus_massa7 months ago
    In the math department, we had a Moodle for the first-year students of my university in Argentina.
    When we started like 15 years ago, the emails of the students and TAs were split roughly 30% Gmail, 30% Yahoo!, 30% Hotmail and 10% others (very approximate numbers).
    Now the students have like 80% Gmail, 10% Live/Outlook/Hotmail and 10% others/Yahoo. Some of the TAs are much older, so perhaps "only" 50% use Gmail.
    The difference is huge. I blame the mandatory gmail account for the cell phone.
    Anyway, we had weird problems with Live/Outlook/Hotmail and Yahoo because they classified some of our emails as spam. Gmail usually works better.
    Anyway^2, everyone is using WhatsApp, so it doesn't matter.
    lxgr7 months ago
    In what way would you consider WhatsApp a replacement for email? Instant messaging is a completely different use case.
    inemesitaffia7 months ago
    Not for everyone.
    Anyway, I got asked to provide a "real" email address by support at my mobile provider.
    I gave them a yahoo email.
    gus_massa7 months ago
    Here, for a lot of professionals (from medical doctors to plumbers) the only contact is a WhatsApp number: no email, no real phone.
    At work, 70% of the messages are by WhatsApp. We have like 10 buildings distributed in a 3 million person city, like 3 miles away from each other. So there is a lot of global coordination (mostly by WA). Also inside each building each subgroup of TA (like Algebra+Monday-Thursday+Morning) has one WA group, and the students have an unofficial WA per course.
    We even have a WA group for the "HOA" of my home. (It's an apartment.) People can't maintain a mailing list or use CC correctly, but can use WA.
    And there is another WA group for the parents in each course of my children in primary school. Everything is discussed there, in particular invitations to birthday parties. Also, the school has like 3 official methods to send info (which is very confusing), but someone kindly reposts all the info in the WA group.
    Also, WA has a few advantages: [1]
    * If someone sends a message, they get angry if you don't reply in less than 5 minutes.
    * If you realize something a Saturday at 11:30 pm, you can't send a WA about that, because the other person will think you expect them to get out of bed/party to reply.
    * You can't mark a message as unread to reply to it later or in a few days.
    * It's even more centralized than email
    [1] /s
    forty7 months ago
    How much is "the vast majority"? I would say that one third of something global with potentially infinite number of providers, when the second player is probably a fraction of that, is already a pretty big majority.
    calfuris7 months ago
    I don't know exactly where to draw the line on "the vast majority," but surely it must be higher than the bar for a simple majority, which is "more than half." If you want to describe something in the lead but under the 50% mark, the word you're looking for is "plurality."
    forty7 months ago
    In French it's not the case, you can have relative or absolute majority, which might explain my confusion.
    According to this definition https://www.merriam-webster.com/dictionary/majority : "c : the greater quantity or share" that also seems to be a possible meaning in English
    jkaplowitz7 months ago
    Yes indeed, both meanings are possible in most contexts.
    In US English, when speaking with mathematical precision, majority means absolute majority (more than half) and plurality means relative majority (more than anyone else). British English does also have the term relative majority like in French, though I don’t know if this is used in mathematics.
    But like most other dictionaries in both English and French (with some exceptions like l’Académie Française’s dictionary), Merriam-Webster tries to describe how language is actually used in the real world and not some theoretical idea of how it should be used.
    Therefore, since “majority” is often used to mean either absolute or relative majority when speaking in a less precise context than mathematics, a general-purpose dictionary like this one lists both meanings. A mathematical dictionary from the US (again I don’t know about the British equivalent) would list just the absolute meaning.
    degamad7 months ago
    As an Australian English and Indian English speaker and a mathematician, I have never heard the word plurality outside of discussions of the US political system.
    I have seen nitpicking on whether the word majority is the right word for a relative majority, but only seen plurality offered as an alternative by American English speakers who are also students of the American political system.
    I would almost never expect anyone to say "the plurality of cars sold are Toyotas", for example.
    nurettin7 months ago
    > They don't have to own the servers and the pipes if they own all the clients, sources of revenue, distribution platforms and financial transaction systems.
    They don't own all sources of revenue. Even on their major media platform they get siphoned off by companies like patreon. It is all a charade and not everyone is enamoured by that.
    mschuster917 months ago
    > how much of that does google/alphabet actually own?
    A ton. They got shares in a bunch of submarine cables, their properties (YouTube, Maps, Google Search) make up a wide share of Internet traffic, they are via Google Search the chief traffic source for most if not all websites, they own a large CDN as well as one of the three dominant hyperscalers...
- JimDabell7 months ago
  > I wonder if "x-browser-copyright" is an attempt at trying to use the legal system to stifle competition and further their monopoly. If so, have they not heard of Sega v. Accolade ?
  My first thought was the Nintendo logo used for Gameboy game attestation.
  I wonder what a court would make of the copyright header. What original work is copyright being claimed for here? The HTTP request? If I used Chrome to POST this comment, would Google be claiming copyright over the POST request?
  - notpushkin7 months ago
    com.apple.Dont_Steal_Mac_OS_X
- Retr0id7 months ago
  SHA-1 is a head-scratcher for sure.
  I can only assume it's the flawed logic that it's "reasonably secure, but shorter than sha256". Flawed because SHA1 is broken, and SHA256 is faster on most hardware, and you can just truncate your SHA256 output if you really want it to be shorter.
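  The truncation idea, sketched with Python's hashlib. The input bytes are a placeholder, and 20 bytes is chosen only because it matches SHA-1's digest length:

```python
import hashlib

data = b"example input"  # placeholder input

sha1_digest = hashlib.sha1(data).digest()        # 20 bytes
sha256_digest = hashlib.sha256(data).digest()    # 32 bytes
truncated = sha256_digest[:len(sha1_digest)]     # first 20 bytes of SHA-256

print(len(sha1_digest), len(sha256_digest), len(truncated))  # 20 32 20
```

  The truncated value takes the same space as a SHA-1 digest while still being derived from SHA-256.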
  - adrian_b7 months ago
    SHA-1 is broken for being used in digital signature algorithms or for any other application that requires collision resistance.
    There are a lot of applications for which collision resistance is irrelevant and for which the use of SHA-1 is fine, for instance in some random number generators.
    On the CPUs where I have tested this (with hardware instructions for both hashes, e.g. some Ryzen and some Aarch64), SHA-1 is faster than SHA-256, though the difference is not great.
    In this case, collision resistance appears irrelevant. There is no point in finding other strings that will produce the same validation hash. The correct input strings can be obtained by reverse engineering anyway, which has been done by the author. Here the hash was used just for slight obfuscation.
    Retr0id7 months ago
    The perf difference between SHA1 and SHA256 was marginal on the systems I tested (3950x, M1 Pro), which makes SHA256 a no-brainer to me if you're just picking between those two (collision resistance is nice to have even if you "don't need it").
    You're right that collision resistance doesn't really matter here, but there's a fair chance SHA1 will end up deprecated or removed from whatever cryptography library you're using for it, at some point in the future.
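    A rough way to check the relative speed on your own machine, using only the standard library. The buffer size and repeat count are arbitrary choices, `seconds_per_hash` is a hypothetical helper, and the results depend heavily on whether the CPU and the underlying crypto library use hardware SHA instructions:

```python
import hashlib
import timeit

payload = bytes(1024 * 1024)  # 1 MiB of zero bytes; arbitrary test input

def seconds_per_hash(name: str, repeats: int = 20) -> float:
    """Average wall-clock seconds to hash the payload once with the named algorithm."""
    total = timeit.timeit(lambda: hashlib.new(name, payload).digest(), number=repeats)
    return total / repeats

for algo in ("sha1", "sha256"):
    print(f"{algo}: {seconds_per_hash(algo) * 1e3:.3f} ms per MiB")
```

    This measures bulk throughput only; for a short input such as a key plus a User-Agent string, per-call overhead is likely to dominate either way.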
    mjevans7 months ago
    When will CRC32c (also used in https://en.wikipedia.org/wiki/Ethernet_frame#Frame_check_seq... ), MD5, etc get removed? Sure they aren't supported for _security_ use, and should not be used by anything new. However the algorithms will likely continue to exist in libraries of some sort for the foreseeable future. Maybe someday in the distant future they'll just be part of a 'legacy / ancient hash and cryptography' library that isn't standard, but they'll continue to be around.
    SO many things also already standardize on SHA1 (or even weaker hashes) as a (non-security) anti-collision hash for either sharding storage sets (host, folder, etc) or just as already well profiled hash key algos.
    Retr0id7 months ago
    CRC was never a cryptographic hash so there is no need to deprecate it.
    MD5 (and SHA1) is already absent or deprecated in many cryptography libraries, e.g. https://cryptography.io/en/latest/hazmat/primitives/cryptogr...
    Every time someone uses MD5 or SHA1 for something that isn't legacy-backcompat, they further delay their deprecation/removal unnecessarily.
    unscaled7 months ago
    The difference that you've already noted here is that the X-Browser-Validation is new. It doesn't have to keep using SHA1, MD5 or CRC-32 to maintain compatibility with a protocol spec that predates the existence of newer algorithms.
    mjevans7 months ago
    The header is new, but what's it working with on the server side? Were there any other considerations for the selection of the value?
    Though in contrast to that, sometimes the criteria is just that a given number of bits aren't useful, so the output of a different hash is truncated to the desired size.
    Maybe part of the driving criteria is compatibility with e.g. the oldest supported Android version? Or maybe some version of Windows seen in legacy devices in poor countries? There might be good reasons beyond just 'header is new, everything must be state of the art'.
    JimDabell7 months ago
    There’s also the downside of every engineer you onboard spending time raising the same concern, and being trained to ignore it. You want engineers to raise red flags when they see SHA-1!
    Sometimes something that looks wrong is bad even if it’s technically acceptable.
    unscaled7 months ago
    Not just engineers. Many off-the-shelf static analysis tools would happily jump at every mention of a deprecated algorithm such as SHA1 in your code. It's just too much noise, and the performance cost of SHA-256 is negligible on modern computers. If digest size or speed on older machines is a concern, there are other options like Blake2/3.
    There probably(?) isn't any serious vulnerability in using SHA-1 for an integrity identifier that is based on a hard-coded "API key", but I think algorithm hygiene is always a good thing. You don't want to train your engineers to use broken algorithms like SHA-1, "because it might be ok, idk".
    adrian_b7 months ago
    It should be noted that using a parallelizable hash, like Blake2/3, does not provide higher speed by magic.
    Evaluating anything in parallel is a different compromise between the time and the power needed to perform a computation, i.e. with an N-way parallel evaluation you hope to reduce the time by almost N times, while increasing the power by a similar factor and not increasing much the energy required to do the computation.
    The time to compute a hash is not always the most important, especially when the hash computation can be overlapped with other data processing. In mobile and embedded applications the energy can be more important. In that case using the hardware instructions for SHA-256 or SHA-1 can provide energy savings over hashes like Blake2/3.
    So the best choice for a hash function can be affected by many factors, it is preferable not to choose automatically the same function regardless of the circumstances.
    Nowadays SHA-256 is widely supported in hardware and still secure enough for any application with a 128-bit security target, so it is OK as a default choice, but it may not be the best choice in many cases.
- mindslight7 months ago
  > have they not heard of Sega v. Accolade ?
  My mind went here immediately as well, but some details are subtly different. For example being a remote service instead of a locally-executed copy of software, Google could argue that they are materially relying on such representation to provide any service at all. Or that without access to the service's code, someone cannot prove this string is required in order to interoperate. It also wouldn't be the first time the current Supreme Court took advantage of slightly differing details as an excuse to reject longstanding precedent in favor of fascism.
  - wongarsu7 months ago
    And even if it falls under fair use in the US, they could still have a case in some other relevant market. The world is a big place
    userbinator7 months ago
    If anything, the EU is even more likely to consider it fair use for interoperability, which basically leaves Asia --- but Google's services are blocked in the biggest country there, so I'm not sure about that.
    They might be trying to do this in the US given the political climate, but then again, the current administration is decidedly unfriendly towards Big Tech in general.
- PeterStuer7 months ago
  WEI? As in Windows Experience Index? Can you elaborate?
  - runiq7 months ago
    Web Environment Integrity: https://en.wikipedia.org/wiki/Web_Environment_Integrity
- lxgr7 months ago
  Probably any cryptographic hash function would have done.
  My suspicion is that what they're trying to do here is similar to e.g. the "Readium LCP" DRM for ebooks (previously discussed at [1]): A "secret key" and a "proprietary algorithm" might possibly bring this into DMCA scope in a way that using only a copyrighted string might not.
  [1] https://news.ycombinator.com/item?id=43378627
cebert7 months ago
I have to imagine Google added these headers to make it easier for them to identify agentic requests vs human requests. What angers me is that this is yet another signal that can be used to uniquely fingerprint users.
- gruez7 months ago
  It doesn't really meaningfully increase the fingerprinting surface. As the OP mentioned the hash is generated from constants that are the same for all chrome builds. The only thing it really does is help distinguish chrome from other chromium forks (eg. edge or brave), but there's already enough proprietary bits inside chrome that you can easily tell it apart.
  - thayne7 months ago
    > The only thing it really does is help distinguish chrome from other chromium forks (eg. edge or brave)
    You could already do that with the user agent string. What this does is distinguish between chrome and something else pretending to be chrome. Like say a firefox user who is spoofing a chrome user agent on a site that blocks, or reduces functionality for the firefox user agent.
    bobbiechen7 months ago
    Plenty of bots pretend to be Chrome via user agent, but if you look closely are actually running Headless Chromium. This is a very useful signal for fraud and abuse prevention.
    littlestymaar7 months ago
    > This is a very useful signal for fraud and abuse prevention.
    Like people spoofing the Chrome UA in Firefox to avoid artificial performance degradation inflicted by Google on their websites...
    thayne7 months ago
    Let's ignore for the moment that this has been reverse engineered.
    If they only look at this header, then legitimate users using non-chrome browsers will get treated as bots.
    If these headers are only used for chrome user agents, then it would be easy to bypass by using headless chromium with a user agent that spoofs firefox or safari.
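    The two cases can be sketched as hypothetical server-side policies. Both function names and the User-Agent string are made up for illustration, and this is not a description of what Google's servers actually do:

```python
def policy_header_only(headers: dict) -> str:
    """Policy 1: treat any request without the header as a bot."""
    return "ok" if "x-browser-validation" in headers else "bot"

def policy_chrome_ua_only(headers: dict) -> str:
    """Policy 2: require the header only when the User-Agent claims to be Chrome."""
    claims_chrome = "Chrome/" in headers.get("user-agent", "")
    if claims_chrome and "x-browser-validation" not in headers:
        return "bot"
    return "ok"

firefox_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0"

# A real Firefox user and headless Chromium spoofing Firefox send the same headers here.
real_firefox = {"user-agent": firefox_ua}
headless_spoofing_firefox = {"user-agent": firefox_ua}

print(policy_header_only(real_firefox))                  # bot
print(policy_chrome_ua_only(headless_spoofing_firefox))  # ok
```

    Policy 1 misclassifies the legitimate Firefox user, while policy 2 lets the spoofing client through, because on these two headers alone the requests are indistinguishable.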
    TechDebtDevin7 months ago
    This is what I don't get. Anybody scraping at scale is using headful browsers as fallback, this does nothing. I will just find the browser that works, and use it.
    TechDebtDevin7 months ago
    I spoof User Agent, TLS/browser fingerprinting all day. These are the basics. None of this bothers me tbh, I'm constantly running tests on lots of versions of Chrome, Firefox and Brave and haven't really seen any impact in bot detection. I do a lot of browser emulation of other browsers in Chrome. PerimeterX/Human seems to be the only WAF that is really good about catching this.
- thayne7 months ago
  I'm more concerned that whether intentional or not this will probably cause problems for users who use non-chrome browsers. Like say slowing down requests that don't have this header, responding with different content, etc.
  - userbinator7 months ago
    User-agent discrimination has been happening for literally decades at this point, but you're right that this could make things worse.
    snackbroken7 months ago
    User-agent discrimination is tolerable when it's Joe Webmaster doing it out of ignorance. It is not acceptable if it is being used by a company leveraging their dominant position in one market to gain an advantage over its competitors in another market. It's not acceptable even if it's not said company's expressed intent to do so but merely a "happy accident" that is getting "overlooked".
    Indeed, even for those who require a round of mental gymnastics before they concede that monopolies are, like, "bad" or whatever, GP points out precisely how this would constitute "consumer harm".
    mook7 months ago
    Tell that to Google intentionally slowing down Firefox even without ad blocking. (I'm talking about them using the fallback for web components instead, not the slowdowns when ads don't load.)
- qingcharles7 months ago
  How does that work, though? I have a bunch of automated tasks I use to speed up my workflows, but they all run on top of the regular browser that I also use. I don't see how this war is winnable? (not without tracking things like micro-movements of the mouse that might be caused by being a human etc)
jakub_g7 months ago
FYI: Google enterprise workspace admins can enable policies which e.g. restrict login to google.com properties to Chrome browsers only.
I wonder if this header is not connected in some way to that feature.
- cj7 months ago
  Seems unnecessary.
  The same policies also offer the ability to force-install an official Google "Endpoint Verification" chrome extension which validates browser/OS integrity using Enterprise Chrome Extension APIs ("chrome.enterprise") [0] only available in force-installed enterprise extensions.
  FWIW, in my years of managing enterprise chrome deployments, I haven't come across the feature to force people to use Chrome (there are a lot of settings, maybe I've missed this one). But, there definitely is the ability to prevent users from mixing their work and non-work gmail accounts in the same chrome profile.
  [0] https://developer.chrome.com/docs/extensions/reference/api/e...
  Edit: Okay, maybe one hole in my logic is the first-sign in experience. When signing into google for the first time in a new chrome browser, the force-installed extension wouldn't be there yet. Although Google could hypothetically still allow the login initially, but then abort/cancel the sign in process as part of the login flow if the extension doesn't sync and install (indicating non-chrome use).
  - jakub_g7 months ago
    In my current job we do have the force-Chrome setting enabled. I can't log in to Gmail through any other browser, nor can I use SSO login to GitHub via Google.
    cj7 months ago
    This might be their “context aware” security feature. Which can prevent access to certain things based on device, browser, etc.
    I don’t see why any of that can’t rely on a chrome extension implementation using the privileged APIs to verify OS, Browser, etc. Struggling to understand why they need special headers for any of this functionality.
thayne7 months ago
Why would they think this was a good idea after losing the chrome anti-trust trial? I don't know what the intended purpose is for this, but I can see several ways this could be used in an anti-competitive way, although now that it has been reverse engineered, an extension could spoof it. On the other hand, I wonder if they intend to claim the header is a form of DRM and such spoofing is a DMCA violation...
- jsnell7 months ago
  > after losing the chrome anti-trust trial?
  There hasn't been such a trial.
- Retr0id7 months ago
  x-browser-copyright seems like an attempt at something similar to the Gameboy's nintendo-logo DRM (wherein cartridges are required to have the nintendo logo bitmap before they can boot, so any unlicensed carts would be trademark infringement)
  - userbinator7 months ago
    http://en.wikipedia.org/wiki/Sega_Enterprises_Ltd._v._Accola... is the legal precedent that says trying to do that won't work, but then again maybe Google thinks it's invincible and can do whatever it wants after it ironically defeated Oracle in a case about interoperability and copyright.
    Retr0id7 months ago
    Even if they can't defend it legally, it costs them ~nothing to add the header and it could still act as a deterrent.
    meibo7 months ago
    Apple famously does this with this word soup in their SMC chips, and proceeded to bankrupt a company that sold Hackintoshes and shipped it in their EFI: https://en.wikipedia.org/wiki/Psystar_Corporation
    Our hard work by these words guarded please don't steal (c) Apple Computer Inc
    Though one could argue that they would have probably bankrupted them anyway even if they hadn't done that.
    thayne7 months ago
    That was before the DMCA was passed. It's possible DMCA section 1201 could apply here.
- krackers7 months ago
  >an extension could spoof it
  not if they make it dynamic somehow (e.g. include current day in hash). Then with MV3 changes that prevent dynamic header manipulation there is no way for an extension to spoof it.
  - thayne7 months ago
    > Then with MV3 changes that prevent dynamic header manipulation
    That doesn't apply to Firefox
    krackers7 months ago
    Fair, I was considering chrome headless since firefox users are already served google captchas more often.
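A minimal Python sketch of the "make it dynamic" idea above. This is a hypothetical scheme, not something Chrome is known to do: it assumes a SHA-1 digest over the key, the user agent, and the current day in ISO format, Base64-encoded. The function name `daily_validation`, the input order, and the placeholder key are all assumptions made for illustration.

```python
import base64
import hashlib
from datetime import date


def daily_validation(api_key: str, user_agent: str, day: date) -> str:
    """Hypothetical day-dependent header value.

    Mixes the calendar day into the hashed material so the value
    changes every day even if the key and user agent stay fixed.
    """
    material = f"{api_key}{user_agent}{day.isoformat()}"
    digest = hashlib.sha1(material.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    key = "example-api-key"            # made-up placeholder, not a real key
    ua = "ExampleBrowser/1.0"          # made-up user agent
    print(daily_validation(key, ua, date.today()))
```

Under a scheme like this, a header value copied into a static rewrite rule would stop matching the next day, so a spoofer would need to recompute the value per request or per day.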
binary1327 months ago
I think it’s difficult to argue that Google doesn’t have the right and capability to build their own private internet, I just also think they’d like to make the entire internet their own private internet, and do away with the public internet, and I’d really prefer they not do that.
- TacticalCoder7 months ago
  [dead]
aussieguy12347 months ago
So this is basically hidden client attestation?
- Aaargh203187 months ago
  Not really. It's just an API key + the user agent. There is no mechanism to detect the browser hasn't been tampered with. If you wanted to do that you'd at least include a hash over the browser binary, or better yet the in-memory application binary.
  - delusional7 months ago
    That would provide no extra capability. Anybody smart enough to modify the chrome executable could just patch the hash generation to also return a static (but correct) hash.
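To make the "API key + the user agent" description concrete, here is a minimal Python sketch. It assumes the header is the Base64 encoding of a SHA-1 digest over the key concatenated with the User-Agent string. The thread only says it looks like an SHA hash, so the digest choice, the concatenation order, and the encoding are assumptions here, and `PLACEHOLDER_API_KEY` is a made-up value, not a real key.

```python
import base64
import hashlib

# Hypothetical placeholder; a real key would come from the browser binary.
PLACEHOLDER_API_KEY = "example-api-key"


def browser_validation(api_key: str, user_agent: str) -> str:
    """Return Base64(SHA-1(api_key + user_agent)).

    Assumed construction, for illustration only.
    """
    digest = hashlib.sha1((api_key + user_agent).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    ua = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"  # made-up UA
    print(browser_validation(PLACEHOLDER_API_KEY, ua))
```

Note that neither input depends on the state of the running binary, which is why anyone who knows the key and the user agent can reproduce the value.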
delusional7 months ago
Is an "api key" like this covered by copyright? Would that technically mean that spoofing this random sequence of numbers would require me to agree to whatever source license they offer it under, since I wouldn't know the random sequence unless I read it in their source?
That's an odd possibility.
- userbinator7 months ago
  Anti-reverse-engineering clauses in EULAs are limited and exceptions are always present for interoperability. The same goes for copyright. It's hard to argue that this key is secret if it's widely and publicly distributed.
  Ironically, Google just fought a case with Oracle around similar concepts.
dlenski7 months ago
Does the open-source Chromium generate this header as well? (Perhaps with a slightly different UA as input.)
Or is it exclusive to the closed-source Chrome codebase?
_imnothere7 months ago
And why should anyone with a sane mind (except for Googlers) allow this kind of validation bs to exist?
- rs1867 months ago
  At this point I am fully convinced that Google is abusing Chrome's dominant position to push their own agenda and shape the whole Internet the way they want. Privacy sandbox, manifest v3, you name it.
  Sadly nobody can do anything about it, so far. We have yet to see the outcome of the antitrust trial.
  - orphea7 months ago
    > to push their own agenda and shape the whole Internet the way they want
    It is Chrome's raison d'être from the very beginning. You don't think Google made its own browser because they felt generous, right?
RandyOrion7 months ago
Two questions:
Which version of chrome is the first to implement these headers?
What are the potential effects of these headers on chromium forks, e.g. ungoogled chromium?
- kuschkufan7 months ago
  Well, what did you find out?
Everdred2dx7 months ago
If you were using a user agent spoofing extension couldn't this be used to guess your "real" UA?
- jedimastert7 months ago
  It looks like it's an SHA hash, so working backwards would probably be prohibitively irritating.
  - dataflow7 months ago
    That's not how it works. The combination of valid inputs is a small set. You just try each one until you get the hash.
    jedimastert7 months ago
    It's not all that small, although probably small enough to make a rainbow table or something.
    You would have to maintain the code to generate character-perfect strings (or maybe just keep a very large library of the current most popular ones) and also make sure you have the up-to-date API key salt values (which they're probably going to start rotating regularly), which, as I said before, wouldn't be impossible, just prohibitively irritating to maintain for comparatively little benefit.
    And besides, it won't be too long before people just start spoofing the hash too, probably shorter than getting the generator up and running
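The "try each one until you get the hash" approach discussed above can be sketched as a dictionary search. This is a rough Python illustration under the same assumption as elsewhere in the thread's description (Base64 of SHA-1 over key plus user agent); `header_for`, `recover_user_agent`, and the sample keys and user agents are hypothetical names and values, and a real attempt would need the actual keys and a large list of exact User-Agent strings.

```python
import base64
import hashlib
from typing import Optional, Sequence


def header_for(api_key: str, user_agent: str) -> str:
    """Assumed construction: Base64(SHA-1(api_key + user_agent))."""
    digest = hashlib.sha1((api_key + user_agent).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def recover_user_agent(
    observed: str,
    api_keys: Sequence[str],
    candidate_uas: Sequence[str],
) -> Optional[str]:
    """Return the candidate UA whose hash matches `observed`, else None."""
    for key in api_keys:
        for ua in candidate_uas:
            if header_for(key, ua) == observed:
                return ua
    return None


if __name__ == "__main__":
    keys = ["example-key-a", "example-key-b"]            # made-up keys
    uas = ["ExampleBrowser/1.0", "ExampleBrowser/2.0"]   # made-up UAs
    seen = header_for("example-key-b", "ExampleBrowser/2.0")
    print(recover_user_agent(seen, keys, uas))
```

The cost is one hash per (key, candidate) pair, so the search is cheap. The maintenance burden is keeping the candidate list character-perfect and the keys current.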
    giingyui7 months ago
    [dead]
egorfine7 months ago
Is there a way to prevent Chrome from sending those headers?
Larrikin7 months ago
How do I set this in Firefox?