If Glasswing had been started years ago with the goal of applying fixes to AI-found gaps, then this would just be another model to add to that effort. But doing so in the ominous shadow of some new super model boosts panic IMO.
Interpret that how you will, but if Anthropic had to take cost/resource savings measures after the last major release, less than 6 months ago, it's unlikely they have the economics to offer what Mythos is promised to be, at any sort of product scale. But I agree, it would be great to get stronger models and start securing all the junk on the web. Of course, that requires maintainers to know how to use these tools.
Benchmarks at https://gertlabs.com/?agentic=all
I don’t need cutting edge AI to take you down. I need MetaSploit with a CVE list that’s been updated in the last 6 months.
Common recklessness obviously includes devs running binaries on their work machines, not using basic isolation (why?), sticky IP addresses that straight-up identify them, or even worse, using the same browser to access admin panels and random memes, plus a hundred more like those that are ALREADY solved and KNOWN by the developers themselves. You literally have developers who still use cleartext DNS (apparently they're OK with their history being accessible to random outsourced employees).
I disagree. I think in big tech and the corporate world, it boils down to the organization fundamentally not valuing security and punishing developers if they "move slow", which is often the outcome when you maintain a highly security-oriented process while developing software and infrastructure.
When big leaks happen, the worst that occurs is that some trivial financial penalty is applied to the company so the incentive to ignore security problems until you're forced to acknowledge them is high.
I agree that cyber security is taken too lightly. However, I think that many developers don't actually know about vulnerabilities. In many companies those reports get filtered through other teams and prioritized by PMs. The devs tend to do their best at meeting the aggressive schedules the penny-pinching business people set.
It's most of the time a question of management not caring about security or disliking the inconvenience that security can bring.
I might add however that most companies use FOSS projects without paying for or contributing to them.
The onus is still on the final user to make sure that the code they use is safe.
Nah. It's the corporations that could not care less and therefore do not reward careful work. They care about nothing but time to market. Start stacking legal and financial liability and I guarantee they are suddenly going to start caring a lot.
If you closed all of the AI-discovered security vulnerabilities tomorrow - by the next day there'd be a host of new ones. That's software, baby.
This initiative probably could have started a few months sooner with Opus and similar models, though.
However, no single one of those models could find everything that Mythos found.
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag...
Nevertheless, the gap between free models and Mythos is not as great as Anthropic's marketing claims, which of course is not surprising.
In general, this is expected to be true for other applications as well: because no single model is equally good at everything, even among the SOTA models, trying multiple models may be necessary to obtain the best results. With open-weights models, trying many of them can add negligible cost, especially if they are hosted locally.
Mythos certainly represents a big increase in exploitation capability, and we should have seen this coming.
I know of two F100s that already started using foundation models for SCA in tandem with other products back in 2024. It's noisy, but a false positive is less harmful than an undetected true positive depending on the environment.
Evidently they tried, and even the most recent Opus 4.6 models couldn't find much. There's been a step change in capabilities here.
Also, I'd like to believe that this really is such a huge step forward compared to Opus, but lately I've found that hard when I look at the statements made by the CEOs of AI companies and their associates, who keep fuelling the hype around this topic. Of course, it's good that large companies and industries crucial to the country are the first to get access, but until the launch takes place, I will approach this with a degree of scepticism.
I doubt we'll see a shift away from "everything's on the network!" because it's so incredibly beneficial to the surveillance state, but one can hope.
Already been going on for over a decade - export controls on dual-use technology like Xeon processors were being enforced back in the Obama admin.
> until the launch takes place
It's already launched. Some companies had access to Mythos for months.
> fuelling the hype
This is true. Commercially available models from a year ago are already good enough from an offensive security perspective. Their big issue was noise, but that could be managed.
The issue is that by the late 2000s to 2010s, most European organizations hadn't taken advantage of that base, despite being comparable to the US in the 1970s-90s.
It's definitely NOT, in any way, a meeting to discuss potential systemic risk due to insolvency/bankruptcy at some key AI-related company.
There's no legal mechanism for the president or the government at all to do that.
Often it happens anyway, along with some protests, some resignations, and maybe an eventual court reversal months or years later.
> The “tools” prohibitions, set out in sections 1201(a)(2) and 1201(b), outlaw the manufacturing, sale, distribution, or trafficking of tools and technologies that make circumvention possible. These provisions ban both technologies that defeat access controls, and also technologies that defeat use restrictions imposed by copyright owners, such as copy controls. These provisions prohibit the distribution of software that was designed to defeat CD copy-protection technologies, for example.
https://www.eff.org/pages/unintended-consequences-fifteen-ye...
Banks are required by law to be able to produce account balances within a few days. In some countries they are required to submit them to a deposit protection institution regularly, so that if a bank fails it can quickly reimburse people and prevent panic spreading to other banks.
You can probably request some sort of notarized proof of accounts, but it will probably cost you $100.
I've seen a bunch of people conflate the Claude Code source-map leak with the Mythos story, though not quite as blatantly as here. I'm confident that they are totally unrelated.
I'm sure it's a great big model, but the level of hype and dishonesty is something out of Sam Altman's book.
Of course it's because of the upcoming IPO, but that's the endgame. For now, it's critical to get those private equity guys and banking institutions to believe the gospel and hold the bag; only then will the suckers from the secondary markets be allowed to be suckers too.
It is great to be in a "best-effort" business where there are no consequences for bad things happening. Cybersecurity is one of those businesses. Web search, feeds and ads are another.
Imagine you are selling locks to secure homes. A thief breaks the lock. The lock-maker is not held liable. In fact, they now start selling stronger locks, and lock sales actually improve with more thefts.
Still probably a benefit depending on your philosophy.
This. 100% this.
A large portion of the industry is under NDA right now, but most of the F500 have already deployed, or started deploying, foundation models for AppSec use cases, going all the way back to 2023.
Sev1 vulns have already been detected using "older" foundation models like Opus 4.x.
Of course the noise is significant, but that's something you already faced with DAST, SAST, and other products, which is why most security teams pair the models with experienced security professionals to adjudicate findings and treat foundation-model results as just another threat-intel feed.
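To make the "another threat-intel feed" idea concrete, here's a toy sketch of how a team might triage noisy model output before it reaches a human. This is not any vendor's actual pipeline; the `Finding` shape, the confidence threshold, and the corroboration rule are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    tool: str          # feed name, e.g. "sast", "dast", or a model
    rule: str          # vulnerability class, e.g. "sqli"
    location: str      # file:line or endpoint
    confidence: float  # tool-reported confidence, 0..1


def triage(findings, min_confidence=0.5):
    """Drop low-confidence noise, group duplicates, and order the
    review queue so that findings corroborated by multiple feeds
    (e.g. SAST *and* a model flagging the same spot) come first."""
    by_location = {}
    for f in findings:
        if f.confidence < min_confidence:
            continue  # filter noise before it costs analyst time
        by_location.setdefault((f.rule, f.location), []).append(f)
    # Corroborated groups first, then by best single-feed confidence.
    return sorted(
        by_location.values(),
        key=lambda group: (-len(group), -max(f.confidence for f in group)),
    )


# Example: the corroborated app.py:42 group is queued first;
# the 0.3-confidence model finding is dropped as noise.
queue = triage([
    Finding("sast", "sqli", "app.py:42", 0.9),
    Finding("model", "sqli", "app.py:42", 0.6),
    Finding("model", "xss", "tpl.html:7", 0.3),
])
```

The point is only that model findings slot into the same dedupe-and-prioritize machinery teams already run for SAST/DAST output; the human adjudication step stays at the end.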
Historically bad security that people just got by with, now matched with powerful tools that aren't any better than the best people but can be deployed by mediocre people.