1) this project's Chrome extension sends detailed telemetry to PostHog and Amplitude:
- https://storage.googleapis.com/cobrowser-images/telemetry.pn...
- https://storage.googleapis.com/cobrowser-images/pings.png
2) this project includes source for the local mcp server, but not for its chrome extension, which is likely bundling https://github.com/ruifigueira/playwright-crx without attribution
super suss
1. Yes, the extension uses an anonymous device ID and sends an analytics event when a tool call is used. You can inspect the network traffic to verify that zero personalized or identifying information is sent.
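For readers wondering what "an anonymous device ID plus one event per tool call" could look like on the wire, here is a minimal sketch. It is not the extension's actual code, which isn't published; the field names `device_id`, `event`, and `tool` and the helpers `new_device_id` and `build_event` are assumptions made up for illustration.

```python
import json
import uuid


def new_device_id() -> str:
    """Random identifier with no link to an account or to hardware (assumed scheme)."""
    return str(uuid.uuid4())


def build_event(device_id: str, tool_name: str) -> dict:
    """Build a hypothetical analytics payload that carries only the tool name.

    No URLs, page content, or tool arguments are included.
    """
    return {"device_id": device_id, "event": "tool_call", "tool": tool_name}


if __name__ == "__main__":
    device_id = new_device_id()
    print(json.dumps(build_event(device_id, "navigate")))
```

Note that a stable random ID is pseudonymous rather than strictly anonymous: events from one install can still be linked to each other, which is why inspecting the real network traffic is the only way to verify what is sent.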
I collect anonymized usage data to get an idea of how often people are using the extension in the same way that websites count visitors. I split my time between many projects and having a sense of how many active users there are is helpful for deciding which ones to focus on.
2. The extension is completely written by me, and I wrote in this GitHub issue why the repo currently only contains the MCP server (in short, I use a monorepo that contains code used by all my extensions and extracting this extension and maintaining multiple monorepos while keeping them in sync would require quite a bit of work): https://github.com/BrowserMCP/mcp/issues/1#issuecomment-2784...
I understand that you're frustrated with the way I've built this project, but there's really nothing nefarious going on here. Cheers!
Knee-jerk reactions aren't helpful. Yes, too much tracking is not good, but some tracking is definitely important to improving a product over time and focusing your efforts.
Any other mode of operation is morally bankrupt.
I don't sign a term sheet when I order at McDonald's but you can be damn sure they count how many Big Macs I order. Does that make them morally bankrupt? Or is it just a normal business operation that is actually totally reasonable?
It's 2025 - we want informed consent and voluntary participation with the default assumption that no, we do not want you watching over our shoulders, and no, you are not entitled to covertly harvest all the data you want and monetize that without notifying users or asking permissions. The whole ToS gotcha game is bullshit, and it's way past time for this behavior to stop.
Ignorance and inertia bolstering the status quo doesn't make it any less wrong to pile more bullshit like this onto the existing massive pile of bullshit we put up with. It's still bullshit.
If they were tracking my identity across sites and actually selling it to the highest bidder that's one thing that we'll definitely agree on. This is so so far from that.
You're welcome to build and use your own MCP browser automation if you're so hostile to the developer that built something cool and free for you to use.
Any covert, involuntary, automatic surveillance of a person for any reason whatsoever should have a court order and legal authority behind it - it's gross and exposes the target to vulnerabilities they're not cognizant of.
For telemetry tracking user behavior to be useful at all, it's got to be associated with a user. The idea of telemetry anonymization is marketing speak for "we obfuscated it, we know deanonymization is trivial, but people are stupid, especially regulators."
Any anonymization done is sufficiently obfuscated such that corporate asses get covered in the case of any regulatory investigation. There's no legitimate, mathematically valid anonymization of user data that you could do without destroying the information that you're trying to get in the first place through these tools. This means that any aggregation of user data useful to a malicious actor will inevitably be compromised - the second Posthog or Amplitude become a desirable target, they'll get pwned and breached, and much handwringing will be done, and there will be no recourse or recompense for damages done.
The only strategy to prevent the dissemination of surveillance data is not to collect it in the first place. It should be illegal to collect the data without voluntary, user initiated participation, and any information collected should be ephemeral with regular inspection to ensure compliance. Any violation of user privacy should result in crippling fines, something like 5% of the value of the company per user per day of violation - if you can't responsibly manage the data, you shouldn't be collecting it.
This means all the automatic continuous development a/b testing intrusive corner cutting corporate bullshit would have to stop. Continually leaking surveillance data to malicious actors year over year with no repercussions has thoroughly demonstrated that people cannot be trusted with safekeeping data.
I will build and use my own automation if I need to, based on products that don't covertly, involuntarily, ignorantly surveil their users, without even being aware of potential for harm, and I'll continue to point it out when it shows up in random projects and products, because it's wrong and it should stop.
We should stop embracing the things that enshittify the world, and stop sacrificing things like "other people's privacy" for convenience or profit.
Keep in mind, extensions can update themselves at any time, including when they're bought out by someone else. In fact, I bet that's a huge draw... imagine buying an extension that "can read and modify data on all your websites" and then pushing an update that, oh I dunno, exfiltrates everyone's passwords from their gmail. How would most people even catch that?
DO NOT have any extensions running by default except "on click".
There should be at least some kind of static checker of extensions for their calls to fetch or other network APIs. The Web is just too permissive with updating code, you've got eval and much more. It would be great if browsers had only a narrow bottleneck through which code could be updated, and would ask the user first.
(That wouldn't really solve everything since there can be sleeper code that is "switched on" with certain data coming over the wire, but better than what we have now.)
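A very rough sketch of the kind of static check described above, assuming the extension's JavaScript is available as plain text. The pattern list and the `scan_source` helper are illustrative assumptions, and a regex scan like this is trivially defeated by obfuscation or by code that is "switched on" later, so treat it as a first-pass filter only.

```python
import re
from typing import Dict, List

# Identifiers that can send data off-device or run dynamically supplied code.
# This list is illustrative, not exhaustive.
SUSPECT_PATTERNS = {
    "fetch": re.compile(r"\bfetch\s*\("),
    "XMLHttpRequest": re.compile(r"\bXMLHttpRequest\b"),
    "WebSocket": re.compile(r"\bWebSocket\b"),
    "sendBeacon": re.compile(r"\bsendBeacon\s*\("),
    "eval": re.compile(r"\beval\s*\("),
}


def scan_source(source: str) -> Dict[str, List[int]]:
    """Return {pattern name: [1-based line numbers]} for every match."""
    hits: Dict[str, List[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(name, []).append(lineno)
    return hits


if __name__ == "__main__":
    sample = 'const x = 1;\nfetch("https://example.com/collect");\neval(payload);\n'
    print(scan_source(sample))
```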
I think the permission system should be much more complicated so that the user gets a prompt that explains what is needed and why.
Furthermore there should be [paid] independent reviewers to sign off on extensions. This adds a lot of credibility, especially to a first time publication without users. That would also give app stores someone to talk to before deleting something. Nefarious actors working for app stores can have their credibility questioned.
> Keep in mind, extensions can update themselves at any time
GP suggested only installing extensions you can build yourself from source. Most extensions that auto update do so via the Chrome store. If you install an extension from source, that won't happen.
"Avoids bot detection and CAPTCHAs by using your real browser fingerprint."
Yeah, not really.
I've used a similar system a few weeks back (one I wrote myself), having AI control my browser using my logged in session, and I started to get CAPTCHAs during my human sessions in the browser and eventually I got blocked from a bunch of websites. Now that I've stopped using my browser session in that way, the blocks eventually went away, but be warned, you'll lose access yourself to websites doing this, it isn't a silver bullet.
Also I assume this extension is pretty obvious so it won't take long for CF bot detection to see it the same as Playwright or whatever else.
Hence why projects like this exist: https://github.com/Kaliiiiiiiiii-Vinyzu/patchright. They hide the debugging part from JavaScript.
Screen readers need to see a de-bullshittified, machine-readable version of the site + this is required by law sometimes, and generally considered a nice thing to enable -> the site becomes not just screen-reader friendly, but end user automation-friendly in general.
(I don't know how long this will hold, though. LLMs are already capable of becoming a screen reader without any special provisions - they can make sense of the UI the same way a sighted person can. I wouldn't trust them much now, but they'll only get better.)
> These Captchas are really bad at detecting bots and really good at falsely labelling humans as bots.
As a human it feels that way to you. I suspect their false-positive rate is very low.
Of course, you may well be right that you get pinged more because of your style of browsing, which sux.
source: I work in a team that uses this kind of bot detection and yes, it works. And yes we do our best to keep false positives down
Back when I was playing Call of Duty 4, I got routinely accused of cheating because some people didn't think it was possible to click the mouse button as fast as I did.
To them it looked like I had some auto-trigger bot or Xbox controller.
I did in fact just have a good mouse and a quick finger.
If CloudFlare mislabels you as a bot, however, you may be unable to access medical services, or your bank account, or unable to check in for a flight, stuff like that. Actual important things.
So yes, I think it's not unreasonable to expect more from CF. The fact that some humans are routinely mischaracterized as bots should be a blocker level issue.
I've never failed the CF bot test so don't know how that feels. Though I have managed to get to level 8 or 9 on Google's ReCaptcha in recent times, and actually given up a couple of times.
Though my point was just it's gonna boil down to a duck test, so if you walk like a duck and quack like a duck, CF might just think you're a duck.
Yes, this is a big signal they use.
> adding some more human like noise to the mouse
Yes, this is a standard avoidance strategy. Easier said than done. For every new noise generation method, they work on detection. They also detect more global usage patterns and other signals, so you'd need to imitate the entire workflow of being human. At least within the noise of their current models.
"Avoids bot detection and CAPTCHAs" - Sure asshole, but understand that's only in place because of people like you. If you truly need access to something, ask for an API, maybe you need to pay for it, maybe you don't. Maybe you get it, maybe the site owner tells you to go pound sand and you should take that as your behaviour and/or use case is not wanted.
Most of the automated misbehavior is businesses doing it to other businesses - in many cases, it's direct competition, or a third party the competition outsources it to. Hell, your business is probably doing it to them too (ask the marketing agency you're outsourcing to).
> If you truly need access to something, ask for an API, maybe you need to pay for it, maybe you don't.
Like you'd give it to me when you know I want it to skip your ads, or plug it to some automation or a streamlined UI, so I don't have to waste minutes of my life navigating your bloated, dog-slow SPA? But no, can't have users be invisible in analytics and operate outside your carefully designed sales funnel.
> Maybe you get it, maybe the site owner tells you to go pound sand and you should take that as your behaviour and/or use case is not wanted.
Like they have a final say in this.
This is an evergreen discussion, and well-trodden ground. There is a reason the browser is also called "user agent"; there is a well-established separation between the user's and the server's zones of control, so as a site owner, stop poking your nose where it doesn't belong.
--
[0] - Not "you" 'mrweasel personally, but "you" the imaginary speaker of your second paragraph.
If you have a sales funnel, as in you take orders and ship something to a customer, consumer or business, I almost guarantee you that you can request an API, if the company you want to purchase from is large enough. They'll probably give you the API access for free, or as part of a signup fee and give you access to discounts. Sometimes that API might be an email, or a monthly Excel dump, but it's an API.
When we're talking about sites that purely survive on tracking users and reselling their data, then yes, they aren't going to give you API access. Some sites, like Reddit, do offer it I think, but the price is going to be insane, reflecting their unwillingness to interact with users in this way.
> Not "you" 'mrweasel personally
Understood, but thank you :-)
I wasn't thinking primarily about tracking and ads here either, when it comes to B2B automation. What I meant was e.g. shops automatically scraping competing stores on a continued basis, to adjust their own prices - a modern version of the old "send your employees incognito to the nearby stores and have them secretly note down prices". Then you also have comparison-shopping (pricing aggregators) sites that are after the same data, too.
And then of course there's automated reviews (reading and writing), trying to improve your standing and/or sabotage competition. There's all kinds of more or less legit business intelligence happening, etc. Then there's wholesale copying of sites (or just their data) for SEO content farms, and... I could go on.
Point being, it's not the people who want to streamline their own work, make access more convenient for themselves, etc. that are the badly-behaving actors and reasons for anti-bot defenses.
> If you have a sales funnel, as in you take orders and ship something to a customer, consumer or business, I almost guarantee you that you can request an API, if the company you want to purchase from is large enough. They'll probably give you the API access for free, or as part of a signup fee and give you access to discounts. Sometimes that API might be an email, or a monthly Excel dump, but it's an API.
The problem from the POV of a regular user like me is, I'm not in this for business directly; the services I use are either too small to bother providing me special APIs, or I am too small for them to care. All I need is to streamline my access patterns to services I already use, perhaps consolidate it with other services (that's what MCP is doing, with LLM being the glue), but otherwise not doing anything disruptive to their operations. And I'm denied that, because... Bots Bad, AI Bad, Also Pay Us For Privilege?
> When we're talking about sites that purely survive on tracking users and reselling their data, then yes, they aren't going to give you API access. Some sites, like Reddit, do offer it I think, but the price is going to be insane, reflecting their unwillingness to interact with users in this way.
Reddit is an interesting case because the changes to their API and 3rd-party client policies happened recently, and clearly in response to the rise of LLMs. A lot of companies suddenly realized the vast troves of user-generated content they host are valuable beyond just building marketing profiles, and now they try to lock it all up in order to extort rent for it.
and then the LLM model will ask the MCP server to call the functions, check the result, call the next function if needed, etc
Right now if you go to ChatGPT you can't really tell it "open Google maps with my account, search for bike shops near NYC, and grab their phone numbers", because all it can do is reply in text or make images
with a "browser MCP" it is now possible: ChatGPT has a way to tell your browser "open Google maps", "show me a screenshot", "click at that position", etc
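The loop described above (model asks for a function, the result comes back, the model asks for the next one) can be sketched roughly like this. Everything here is a stand-in: `open_url`, `click`, and `scripted_model` are hypothetical, and a real setup would have an actual LLM choosing the calls and an MCP server driving an actual browser.

```python
from typing import Callable, Dict, List, Optional, Tuple

ToolRequest = Optional[Tuple[str, str]]


# Hypothetical browser tools; a real MCP server would drive an actual browser.
def open_url(url: str) -> str:
    return f"opened {url}"


def click(target: str) -> str:
    return f"clicked {target}"


TOOLS: Dict[str, Callable[[str], str]] = {"open_url": open_url, "click": click}


def scripted_model(history: List[str]) -> ToolRequest:
    """Stand-in for the LLM: picks the next (tool, argument) from a fixed plan."""
    plan = [("open_url", "https://maps.google.com"), ("click", "search box")]
    step = len(history)
    return plan[step] if step < len(plan) else None


def run_agent(model: Callable[[List[str]], ToolRequest]) -> List[str]:
    """Call tools until the model stops asking for them; return every result."""
    history: List[str] = []
    while (request := model(history)) is not None:
        tool_name, argument = request
        history.append(TOOLS[tool_name](argument))
    return history


if __name__ == "__main__":
    for line in run_agent(scripted_model):
        print(line)
```

The point of the sketch is only the control flow: the model sees the results so far and decides whether to request another call or stop.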
Is this what 'calling' is?
It seems strange to me to focus on this sort of standard well in advance of models being reliable enough to, ya know, actually be able to perform these operations on behalf of the user with the sort of reliability that you would need for widespread adoption to be successful.
Cryptocurrency "if you build it they'll come" vibes.
Believe me. It's not there yet.
I was referring more broadly to ClaudePlaysPokemon, a Twitch stream where Claude is given tool calling into a Game Boy Color emulator in order to try to play Pokemon. It has slowly made progress and I recommend looking at the stream to see just how flawed LLMs are currently for even the shortest of timelines w.r.t. planning.
I compared the two because the tool calling API here is similar enough to an MCP configuration with the same hooks/tools (happy to be corrected on that though)
EDIT: Don't get me wrong, the benchmark scores are indeed higher, but in my personal experience, LLMs make as many mistakes as they did before, still too unreliable to use for cases where you actually need a factually correct answer.
Yes, MCP is a way to streamline giving LLMs ability to run arbitrary code on your machine, however indirectly. It's meant to be used on "your side of the airlock", where you trust the things that run. Obviously it's too powerful for it to be used with third-party tools you neither trust nor control; it's not that different than downloading random binaries from the Internet.
I suppose it's good to spell out the risks, but it doesn't make sense blaming MCP itself, because those risks are fundamental aspects of the features it provides.
It introduces a substantial set of novel failure modes, like cross-tool shadowing, which aren't obvious to most folks. Making use of any externally developed tooling — even open source tools on internal architecture — requires more careful consideration and analysis than most would expect. Despite the warnings, there will certainly be major breaches on these lines.
The article also reeks of LLM ironically
https://invariantlabs.ai/blog/mcp-security-notification-tool...
So I'm not sure I'd give up the sum total progress of the automobile just because the first decade was a bad one
Is there any browser that can do this yet as it seems extremely useful to be able to extract details from the page!
Would also be interested in hearing more about what you’re envisioning for your use case. Are you thinking a browser extension that acts on sites you’re already on, or some sort of shopping aggregator that lets you do this, or something else entirely?
Example: find me all of the desks on IKEA that come in light coloured wood, are 55 inches wide, and rank them from deepest to shallowest. Oh, and make sure they're in stock at my nearest IKEA, or are delivering within the next week.
I don't know if you've done it already, but it would be great to pause automation when you detect a captcha on the page and then notify the user that the automation needs attention. Playwright keeps trying to plough through captchas.
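A minimal sketch of the "pause and notify" idea, assuming the automation can see each page's HTML before acting on it. The marker list and the `run_steps` helper are assumptions for illustration; they are not part of Browser MCP or Playwright, and real detection would need more than substring matching.

```python
from typing import Callable, Iterable, List

# Class names commonly used by CAPTCHA widgets; illustrative, not exhaustive.
CAPTCHA_MARKERS = ("g-recaptcha", "h-captcha", "cf-turnstile")


def looks_like_captcha(html: str) -> bool:
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def run_steps(pages: Iterable[str], notify: Callable[[str], None]) -> List[int]:
    """Process page snapshots in order; stop and notify at the first CAPTCHA.

    Returns the indexes of the pages handled before pausing.
    """
    done: List[int] = []
    for index, html in enumerate(pages):
        if looks_like_captcha(html):
            notify(f"CAPTCHA detected at step {index}; automation paused")
            break
        done.append(index)
    return done


if __name__ == "__main__":
    pages = ["<p>ok</p>", '<div class="g-recaptcha"></div>', "<p>later</p>"]
    print(run_steps(pages, print))
```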
Is there an issue with the lag between what is happening in the browser and the MCP app (in my case Claude Desktop)?
I have a feeling the first time I tried it, I was fast enough clicking the "Allow for this chat" permissions, whereas by the time I clicked the permission on subsequent chats, the LLM just reports "It seems we had an issue with the click. Let me try again with a different reference.".
Actions which worked flawlessly the first time (rename a Google spreadsheet by clicking on the title and inputting the name) fail 100% of subsequent attempts.
Same with identifying cells A1, B1, etc. and inserting into the rows.
Almost perfect on 1st try, not reproducible in 100% of attempts afterwards.
Kudos to how smooth this experience is though, very nice setup & execution!
EDIT 2: The lag & speed to click the allow action make it seemingly unusable in Claude Desktop. :(
Also consider publishing it so people can use it without having to use git.
{
  "mcpServers": {
    "ragdocs": {
      "command": "npx",
      "args": [
        "-y",
        "@qpd-v/mcp-server-ragdocs"
      ],
      "env": {
        "QDRANT_URL": "http://127.0.0.1:6333",
        "EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
example: https://x.com/xing101/status/1903391600040083488 set up: https://github.com/xing5/mcp-google-sheets
There's no bug or glitch happening. It's just statistically unlikely to perform the action you wanted and you landed a good dice roll on your first turn.
--Error: Cannot access a chrome-extension:// URL of different extension
Every month, go to service providers, log in, find and download statement, create google doc with details filled in, download it, write new email and upload all the files. Maybe double check the attachments are right (but that requires downloading them again instead of being able to view them in the email).
Automating this is already possible (and a real expense tracking app can eliminate about half of this work) but I think AI tools have the potential to eliminate a lot of the nittier-grittier specification of it. This is especially important because these sorts of workflows are often subject to little changes.
Imagine it controlling plugins remotely, have an LLM do mastering and sound shaping with existing tools. The complex overly-graphical UIs of VSTs might be a barrier to performance there, but you could hook into those labeled midi mapping interfaces to control the knobs and levels.
The tool use / function calling thing far predates Anthropic releasing the MCP specification and it really wasn't that onerous to do before either. You could provide a json schema spec and tell the model to generate compliant json to pass to the API in question. MCP doesn't inherently solve any of the problems that come up in that sort of workflow, but it does provide an idiomatic approach for it (so there's a non-zero value there, but not much).
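To make the pre-MCP workflow concrete: you hand the model a schema, it emits JSON, and you check that JSON before passing it to the API. Below is a rough sketch using only the standard library. `CLICK_SCHEMA` is a hypothetical tool definition, and `validate` handles only a tiny subset of JSON Schema (required fields and three primitive types); a real project would more likely use a full validator library.

```python
import json
from typing import Any, Dict, List

# Hypothetical schema for a "click" tool, using a small subset of JSON Schema.
CLICK_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {"selector": {"type": "string"}, "count": {"type": "integer"}},
    "required": ["selector"],
}

# Note: Python treats bool as a subclass of int, so true/false pass as "integer" here.
PYTHON_TYPES = {"string": str, "integer": int, "object": dict}


def validate(raw_json: str, schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["expected a JSON object"]
    problems = [f"missing field: {name}" for name in schema["required"] if name not in data]
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


if __name__ == "__main__":
    print(validate('{"selector": "#title"}', CLICK_SCHEMA))
    print(validate('{"count": "2"}', CLICK_SCHEMA))
```

The list of problems can be fed back to the model as the tool result so it can retry with corrected arguments.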
And if other vendors sign on to support MCP, then it becomes a self reinforcing cycle of adoption.
Coupled with the fact that any LLM trained for tool use can utilize the protocol, it doesn't feel like much of a moat that uniquely positions Claude Desktop in a meaningful way.
This is exactly what's happening now. A good portion of applications, frameworks and actors are starting to support it.
I've been reluctant on adopting MCP in applications until there was enough adoption.
However, depending on your use case, it may also be more complexity than you need.
Containers are not a big deal when viewed in isolation. But when it's a common size/standard for all kinds of ships, cranes and trucks, it is a big deal then.
In that sense it's more about gathering community around one way to do things.
In theory there are REST APIs and the OpenAPI standard, but those were made for code, not for LLMs. So you usually need some kind of friendly wrapper (like for candy) on top of the REST API.
It really starts to feel like a big deal when you work on integrating LLMs with tools.
APA (Agentic Process Automation) is the new RPA, and this is definitely one example of it.
2025-04-07T18:43:26.537Z [browsermcp] [info] Initializing server...
2025-04-07T18:43:26.603Z [browsermcp] [info] Server started and connected successfully
2025-04-07T18:43:26.610Z [browsermcp] [info] Message from client: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"claude-ai","version":"0.1.0"}},"jsonrpc":"2.0","id":0}
node:internal/errors:983
  const err = new Error(message);
              ^
Error: Command failed: FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :9009') do taskkill /F /PID %a
    at genericNodeError (node:internal/errors:983:15)
    at wrappedFn (node:internal/errors:537:14)
    at checkExecSyncError (node:child_process:882:11)
    at execSync (node:child_process:954:15)
There was another comment that mentioned that there's an issue with port killing code on Windows: https://news.ycombinator.com/item?id=43614145
I just published a new version of the @browsermcp/mcp library (version 0.1.1) that handles the error better until I can investigate further so it should hopefully work now if you're using @browsermcp/mcp@latest.
FWIW, Claude Desktop currently has a bug where it tries to start the server twice, which is why the MCP server tries to kill the process from a previous invocation: https://github.com/modelcontextprotocol/servers/issues/812
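The Windows command in the stack trace above pipes `netstat -ano` through `findstr` and feeds the fifth column to `taskkill`. A more forgiving approach is to parse the netstat text yourself and treat "nothing is listening" as a normal outcome. The sketch below only shows that parsing step; `pids_listening_on` is a hypothetical helper, not the code in @browsermcp/mcp, and it assumes the usual five-column TCP rows.

```python
from typing import List, Set


def pids_listening_on(netstat_output: str, port: int) -> List[int]:
    """Extract PIDs from `netstat -ano` style text for sockets LISTENING on `port`.

    Returns an empty list when nothing matches, instead of raising.
    """
    pids: Set[int] = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State, PID
        if len(fields) != 5 or fields[3] != "LISTENING":
            continue
        if fields[1].endswith(f":{port}") and fields[4].isdigit():
            pids.add(int(fields[4]))
    return sorted(pids)


if __name__ == "__main__":
    sample = (
        "  TCP    0.0.0.0:9009       0.0.0.0:0    LISTENING    1234\n"
        "  TCP    127.0.0.1:19009    0.0.0.0:0    LISTENING    77\n"
    )
    print(pids_listening_on(sample, 9009))
```

Matching on the exact `:port` suffix also avoids picking up unrelated ports that merely contain the same digits.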
Thanks, great job! I like it overall, but I noticed it has some issues entering text in forms, even on google.com. It's able to find a workaround and insert the searched text in the URL, but it would be nice if the entry into forms worked well for UI testing.
1. Kill your Claude Desktop app
2. Click "Connect" in the browser extension.
3. Quickly start your Claude Desktop app.
It will work 50% of the time - I guess the timing must be just right for it to work. Hopefully, the developers can improve this.
Now on to testing :)
"Go to https://news.ycombinator.com/upvoted?id=josefrichter, summarize what topics I am interested in, and then from the homepage pick articles I might be interested in."
Works like a charm.
Do we suppose they will just create a backdoor to allow _some_ bots in? If they do that how long will it be before other bots impersonate them? It seems like a bit of a fad from my small mind.
Suppose it does become a thing, what then? We end up with an internet which is heavily optimised for bots (arguably it already is to an extent) and unusable for humans?
Wild.
https://brightdata.com/pricing/web-unlocker https://2captcha.com/pricing
As opposed to the Web we now have, which is heavily optimized for... wasting human life.
What you're asking for, what "large companies such as CloudFlare have spent millions on", is verifying that on the other end of the connection is a web browser, and behind that web browser there is a human being that's being made to needlessly suffer and waste their limited lifespans, as they tediously work their way through the UI maze like a good little lab rat, watching ads at every turn of the corridor, while being constantly surveilled.
Or do you believe there is some other reason why you should care about whether you're interacting with a "human" (really: a user agent called "web browser") vs. "not human" (really: any other user agent)?
The relationship between the commercial web and its users is antagonistic - businesses make money through friction, by making it more difficult for users to accomplish their goals. That's why we never got the era of APIs and web automation for users. That's why we're dealing with tons of bespoke shitty SPAs instead of consistent interfaces - because no store wants to make it easy for you to comparison-shop, or skip their upsells, or efficiently search through the stock; no news service wants you to skip ads or make focused searches, etc.
As users, we've lost the battle for APIs and continue to be forced to use the "manual web" (with active cooperation of the browser vendors, too). MCP feels promising because we're in a moment in time, however brief, where LLMs can navigate the "manual web" for us, shielding us from all the malicious bullshit (ads, marketing copy, funneling, call to actions, confusing design, dark patterns, less dark patterns, the fact that your store is a bloated SPA instead of an endpoint for a generic database querying frontend, and so on) while remaining mostly impervious to it. This will not last long - the vendors de-facto ruling the web have every reason to shut it down (or turn it around and use LLMs against us). But for now, it works.
Adversarial interoperability is the name of the game. LLMs, especially combined with tool use (and right tools), make it much easier and much more accessible than ever before. For however brief a moment.
As for the optimisation to _waste human life_ I do agree but the reality is that the sites which waste the majority of human life/time are the ones which would not be automated by the MCP and would, ultimately, see more 'real' usage by virtue of the fact that your average human will have more time to mindlessly scroll their favourite echo-chamber.
Then we have the whole other debate of whether we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something which would hurt their bottom line from another angle?
On the topic of:
> whether we really believe that the VC funders who are largely responsible for the current state of the web will continue pumping money into something which would hurt their bottom line from another angle?
No, I don't believe that at all - which is why I keep saying the current situation is an anomaly, a brief moment in time. LLMs deployed in form of general-purpose chatbots/agents are giving too much power to the people, which is already becoming disruptive to many businesses, so that power will be gradually taken away. Expect less general-purpose AI agents, and more "AI powered features" that shackle LLMs behind some limited UI, to ensure you can only get as much benefit from AI as it fits the vendors' business strategies.
That, and maybe they will as CF seem quite big on MCP.[0] Or people just bypass the bot detection. It's already not terribly difficult to do; people in the sneaker bot and ticket scalping communities have long had bypasses for all the major companies.
I mean, we can all imagine bad use-cases of bots, but there's also the pros: the internet wastes loads of human time. I still remember needing to browse marketplaces real estate listings with terrible search and notification functionality to find a flat... shudders. Unbelievable amount of hours wasted.
If fewer people are able to build bots that can index a larger number of sites and give better searching capabilities, for instance, where sites are unable to provide this, I'm personally all for it. For many sites, it's that they lack the in-house development expertise and probably they wouldn't even mind.
[0]: https://developers.cloudflare.com/agents/model-context-proto... etc
The Playwright MCP server is great! Currently Browser MCP is largely an adaptation of the Playwright MCP server to use with your actual browser rather than creating a new one each time. This allows you to reuse your existing Chrome profile so that you don't need to log in to each service all over again and avoids bot detection which often triggers when using the fresh browser instances created by Playwright.
I also plan to add other useful tools (e.g. Browser MCP currently supports a tool to get the console logs which is useful for automated debugging) which will likely diverge from the Playwright MCP server features.
https://playwright.dev/docs/api/class-browsertype#browser-ty...
Unfortunately, Firefox doesn't expose WebDriver BiDi (the standardized version of CDP) to browser extensions AFAIK (someone please correct me if I'm mistaken!), so I don't think I can support it even if I tried.
Not going to lie, this makes me happy.
[0]: https://wiki.mozilla.org/WebDriver/RemoteProtocol/WebDriver_...
https://github.com/microsoft/playwright-mcp/blob/main/src/to... https://github.com/BrowserMCP/mcp/blob/main/src/tools/tool.t...
You’re right, this is an adaptation of Playwright MCP to automate the user’s local browser as mentioned in the GitHub README and here:
- https://github.com/BrowserMCP/mcp/blob/3e6824de6f36eba7d2d3b...
- https://news.ycombinator.com/item?id=43613905
Thanks for all your work on Playwright and Playwright MCP. I’m a big fan!
(For those not familiar, Pavel is the largest contributor to both Playwright and Playwright MCP: https://github.com/microsoft/playwright/graphs/contributors, https://github.com/microsoft/playwright-mcp/graphs/contribut...)
> Credits: Browser MCP was adapted from the Playwright MCP server
2025-04-07 10:57:11.606 [info] rmcp: Starting new stdio process with command: npx @browsermcp/mcp@latest
2025-04-07 10:57:11.606 [error] rmcp: Client error for command spawn npx ENOENT
2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: spawn npx ENOENT
2025-04-07 10:57:11.606 [info] rmcp: Client closed for command
2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: Client closed
2025-04-07 10:57:11.606 [info] rmcp: Handling ListOfferings action
2025-04-07 10:57:11.606 [error] rmcp: No server info found
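`spawn npx ENOENT` means the process doing the spawning could not find an executable named `npx`. One common cause is that a GUI app starts with a different PATH than your shell. A quick way to see what a given environment resolves is sketched below; `resolve_command` and `describe` are hypothetical helper names, and the check only reflects the PATH of the Python process that runs it.

```python
import shutil
from typing import Optional


def resolve_command(name: str) -> Optional[str]:
    """Return the path that `name` resolves to on PATH, or None if it is missing."""
    return shutil.which(name)


def describe(name: str) -> str:
    path = resolve_command(name)
    if path is None:
        return f"{name}: not found on PATH (a spawn would fail with ENOENT)"
    return f"{name}: {path}"


if __name__ == "__main__":
    print(describe("npx"))
```

If the command resolves in your terminal but not for the app, putting the full path to `npx` in the MCP config's `command` field is one thing to try.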
---
EDIT: Ended up fixing it by patching index.js. killProcessOnPort() was the problem. Can hit me up if you have questions, I cannot figure out how to put readable code in HN after all these years with the fake markdown syntax they use.
Not that HN supports much in the way of markup, but code blocks are actually the same as Markdown: indent (by 2 spaces or more, in HN's syntax; Markdown calls for 4 or more, so they're compatible).
    print("Hello, world.")
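If it helps, that indent rule is easy to apply mechanically before pasting. A small sketch using Python's standard `textwrap.indent`; the helper name `to_hn_code_block` is made up for illustration:

```python
import textwrap


def to_hn_code_block(code: str, spaces: int = 4) -> str:
    """Indent each non-blank line so HN (2+ spaces) and Markdown (4+) treat it as code."""
    if spaces < 2:
        raise ValueError("HN needs at least 2 leading spaces for a code block")
    return textwrap.indent(code, " " * spaces)


if __name__ == "__main__":
    print(to_hn_code_block('print("Hello, world.")'))
```

Defaulting to 4 spaces keeps the same text valid in both HN and Markdown, per the compatibility note above.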
Having LLMs interact with applications (without a two-way protocol) is achievable with just tools/functions.
I wonder if it's possible to add such plugins to Electron apps (e.g., Slack). It would be such a nice experience if I could just connect my AI of choice to a local app.
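A rough sketch of the tools/functions approach mentioned above: the application exposes named functions, and the model's output is just a JSON request naming one of them. Everything here (the `TOOLS` registry, `send_message`, the JSON shape) is a made-up illustration, not Slack's or any MCP client's actual API:

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def send_message(channel: str, text: str) -> str:
    # Stand-in for a real app action; it only formats a string.
    return f"[{channel}] {text}"


def dispatch(tool_call_json: str) -> Any:
    """Run one tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return fn(**call.get("arguments", {}))


if __name__ == "__main__":
    request = '{"name": "send_message", "arguments": {"channel": "general", "text": "hi"}}'
    print(dispatch(request))  # prints "[general] hi"
```

The point of the sketch is only that a one-way request/response loop is enough for simple actions; anything the app needs to push back to the model unprompted is where a two-way protocol starts to matter.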
Interesting research and reading via the HN search portal: https://hn.algolia.com/?q=bot+detection
- https://github.com/mayt/BrowserGPT
- https://github.com/TaxyAI/browser-extension
- https://github.com/browser-use/browser-use
- https://github.com/Skyvern-AI/skyvern
- https://github.com/m1guelpf/browser-agent
- https://github.com/richardyc/Chrome-GPT
- https://github.com/handrew/browserpilot
Just because the wheel exists doesn't mean we shouldn't strive to make it better by applying new knowledge and technologies to it.
'Avoids bot detection and CAPTCHAs by using your real browser fingerprint.'
Anything on your machine (such as a rogue browser extension or a malicious npm/pypi package) could scan for this and just get all your cookies - and that's only the beginning of your problems.
CDP can access any origin, any data stored (localStorage, indexedDB ...), any javascript heap, cross iframe and origin boundaries, run almost undetectable code that uses your sessions without you knowing, and the list is very long. CDP was never meant to expose a real browser in an untrusted context.
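To make the exposure concrete: a debugging or automation endpoint is just a local TCP listener, so any local process can probe for one. A minimal sketch, assuming the concern is a locally exposed port (9222 is only a commonly used example value for a remote debugging port, not something stated above); `is_port_open` is a hypothetical helper:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9222 is an assumed, conventional debugging port; adjust to your setup.
    port = 9222
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"127.0.0.1:{port} is {state}")
```

This only shows that the listener is discoverable; it does not speak CDP or read any data. It is the cheapness of this first step that makes an unauthenticated local endpoint risky.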
Also works flawlessly with augmentcode.com!
Use the pre-built Trails[1][2] as MCP servers or create and publish your own with a familiar puppeteer-like API, powered by your own or your friends' browsers.
From Claude I have connected to these MCP servers OK: @modelcontextprotocol/server-filesystem, @executeautomation/playwright-mcp-server.
I have connected to OP's extension (browsermcp.io) from VS Code (and clicked 1 tab button OK), but not from Claude Desktop so far. I get Cannot find module 'node:path', which is required in npm/lib/cli.js. I tried Node 18, 20, and 22; some suggestions here: https://medium.com/@aleksej.gudkov/error-cannot-find-module-...
I'm waiting for that as well. My other options are:
- either bind a host function into wasm that manages the wss connection, then fork a CDP lib to use that.
- or maybe create a proxy between http and wss, then fork a CDP lib to use the http proxy, I think.
> that's a great use case! the aria snapshot that browser mcp generates is enough to write tests for playwright using its role-based locators, but i may add a get_page_html tool in the same way that they're considering: https://github.com/microsoft/playwright-mcp/issues/103
I think this is bullshit. Isn't the dom or whatever sent to the model api?
When you automate using a remote browser, another service (not the AI model) gets all of the browsing activity and any information you send (e.g. usernames and passwords) that's required for the automation.
With Browser MCP, since you're automating locally, your sensitive data and browser activity (apart from the results of MCP tool calls that are sent to the AI model) stay on your device.
A lot of non-technical people are using these tools to "vibe" their way to productivity. I would explicitly tell them that potentially all of their browsing data is going to be exposed to their LLM client and that they use this at their own risk.
Cursor is currently stuck using an outdated snapshot of the VSCode Marketplace, meaning several extensions within Cursor remain affected by high-severity CVEs that have already been patched upstream in VSCode. As a result, Cursor users unknowingly remain vulnerable to known security issues. This issue has been acknowledged but remains unresolved: https://github.com/getcursor/cursor/issues/1602#issuecomment...
Given Cursor's rising popularity, users should be aware of this gap in security updates. Until the Cursor team resolves the marketplace sync issue, caution is advised when using certain extensions.
I've flagged it here, apologies for the repost: https://news.ycombinator.com/item?id=43609572