Looks good!
I also like stacked PRs (which is Mercurial's default). Maybe it's worth a shot, tbh.
Instead it's nice to think about how you can express the state of a complete system as a single number. It might be you divide active user sessions by database-connections, and then scale by memory capacity.
But as a single number you can then get used to normal ranges, and have it always visible somewhere obvious. A single number won't show details, but when it changes you can go look at the specific metrics. It's a cute shorthand, and it can work well as a basic "are we normal" check.
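Here's a rough sketch of that kind of number, in case it helps. The names (`Snapshot`, `health_index`, `is_normal`) and the sample figures are made up for illustration, and I'm assuming "scale by memory capacity" means multiplying by the fraction of memory in use; your own formula will differ.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One reading of the system (all fields are hypothetical examples)."""
    active_sessions: int
    db_connections: int
    memory_used_fraction: float  # 0.0 (idle) to 1.0 (full); an assumption


def health_index(s: Snapshot) -> float:
    """Sessions per DB connection, scaled by memory pressure."""
    if s.db_connections <= 0:
        # No connections at all is never "normal"; surface it loudly.
        return float("inf")
    return (s.active_sessions / s.db_connections) * s.memory_used_fraction


def is_normal(value: float, low: float, high: float) -> bool:
    """True when the index sits inside the range you have learned to expect."""
    return low <= value <= high


reading = Snapshot(active_sessions=1200, db_connections=60, memory_used_fraction=0.5)
index = health_index(reading)
print(index, is_normal(index, low=5.0, high=15.0))
```

The number itself means nothing; the point is that you learn its usual range and go dig into the real metrics when it leaves that range.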
there is a subset of the site that pretty much everyone uses — git, issues, pull requests, actions — and if any part of that is broken then the site is broken and the status page should indicate how often this happens
This is a pretty ungenerous take. You could look at it the other way: if I don't use actions then it's useful for me to know that only actions are broken, and I can continue in my normal usage. If you bundle everything up then the status page is reporting an unhelpful false positive for me.
They would love if everyone on the platform used all of the features and had massive lock-in right? So if some part of that is always broken, it’s not a confidence booster for users to adopt more of the feature set.
Sure, the more things you use, the more likely it is that one has an issue, but clearly stability isn’t a goal for these types of companies anymore.
No one would care that much if repo wikis, commit stats or gists had these issues. It's the inter-dependent services that are used in combination, like PRs, actions, discussions, etc.
If one were to build a single percentage for each of these components of both systems, GitHub would still lose. Maybe it would gain a few more outage-free days, but it still isn't much of a contest.
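To make the per-component vs. bundled argument concrete, here's a minimal sketch. The uptime figures are invented, not real status-page data, and `composite_availability` is a hypothetical helper that assumes component failures are independent (which they often are not), so treat the result as a rough lower-bound intuition rather than a measurement.

```python
from math import prod


def composite_availability(uptime_by_component: dict, used: list) -> float:
    """Chance that every component you depend on is up at the same time.

    Assumes independent failures, which is a simplification.
    """
    return prod(uptime_by_component[name] for name in used)


# Invented example numbers, purely for illustration.
uptime = {"git": 0.999, "pull_requests": 0.998, "actions": 0.99, "wiki": 0.9999}

everything = composite_availability(uptime, list(uptime))
without_actions = composite_availability(uptime, ["git", "pull_requests"])
print(f"all components: {everything:.4f}")
print(f"git + PRs only: {without_actions:.4f}")
```

Someone who skips actions sees a better number than someone who uses the whole bundle, which is why both the per-component view and the combined view tell you something.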
Make sure you cache all the actions you need locally if you go this route, otherwise it's not much of an improvement.
Mainly doing it because I think AtProto is cool and self-hosting is fun, but also because owning the infrastructure that hosts my projects is definitely the direction I want to move in.
Tangled's Knot system feels like a really strong abstraction for this. I host the data in an AtProto Repository, but can rely on a third party to host/manage the AtProto Application that presents it to the rest of the world. If Tangled goes under, I can happily take my AtProto login to a different platform and point it at my Knot without changing a thing about my hosting setup.
Much more convenient than hosting an entire, siloed webapp on my own corner of the internet.
Maybe that's good-will doing the work? For me it's always been a sour pill to swallow that I have to buy in to a large company's internal politics and practices in order to work on projects I love. I don't feel like I owe them anything.
Especially if they can't hold up their end of the deal.
Unfettered access to the world's software repositories, for the princely sum of a bucketload of Azure credits.
It’s not like they don’t know that people like us are counting on them: they recognize that their service is the “dial tone” for much of the world’s software development capability. They are keenly aware of the impact.
What happened to #hugops? Does it go out the window because those people happen to work for a company you don’t like?
If I hire a contractor to redo my roof, and that roof leaks, whether they worked hard or not is immaterial. They did not do the task they were paid to do. I'm not going to buy their services again just because their shingles guy was particularly charming.
MS has talented engineers, but that's a complete misdirection. GitHub is a service in decline: there is nothing wrong with criticizing them.
A corporation is not a person. If your organization cannot handle the load, then you need to adjust your practices. The organization needs to prioritize its paying users. The organization needs to shift people from new features to keeping the lights on. And maybe the organization needs to find another strategy to manage its Azure transition.
Invoking individual workers' well-being to defend a billion-dollar company is also very strange.
Would you feel the same way about a colleague who kept causing downtime in your product again and again, seemingly without making any progress in addressing whatever issue was causing their repeated mistakes?
There are web applications out there that are far more complex than GitHub but have much less downtime. It's not like they're facing an unsolvable problem.
You could argue the scales are different, but computers are also faster now.
So, argument to credentialism out of the way... What should we do as consumers if a provider that is a de facto monopoly due to network effects stops functioning?
Scale is everything and a faster computer doesn’t always help. Vertical scaling has limits, and complex distributed systems are complex.
Since you seem to possess a diagnosis and remedy with a reasonable amount of certainty, I’m sure they’d love to hear from you and have you fix all their problems for them. Especially if you can do it while not making the problem worse in any dimension.
I skimmed your profile. Working on the infrastructure for a couple mid-tier video games is a cool accomplishment, but equating this to having solved GitHub level scale rings hollow.
GitHub has a couple orders of magnitude more daily active visitors than the games you worked on had at their peak.
You can make valid criticisms of GitHub without trying to reduce their scale or inflate your credentials to create a false equivalence.
I didn't make one. The sentence after "I have" was literally "you could argue the scales are different."
GitHub spent a decade asking the world to host its code with them. They got what they asked for. You don't get to beg everyone to run services for you for ten years and then have "scaling is hard" be the answer. They should be improving, not regressing, over time. They have some of the world's best engineers and a trillion-dollar corporation behind them; they don't need my sympathy.
The original question is still open and nobody's engaging with it.
Don't you at least see how it's misleading to answer "I have" to a question about scaling GitHub-scale services?
Trying to caveat it with "the scales are different" misses the point. The parent commenter was talking about scale.
Discarding legitimate criticism based on some self-determined criteria of intellectual superiority isn't a good look. It smacks of elitism and isn't something conducive to a productive and positive community discussion.
It is unhelpful, rude, condescending, and completely fails to address the underlying problem.
Put more simply: if you get into the ring, you’d better be prepared to take a punch.
I didn't bring their credentials into the conversation. They did.
They earn bucketloads of money; they should be planning for exactly that. And testing for it via load testing every day.
Perhaps you've forgotten the days of GitHub presenting themselves as software engineering thought leaders.
Genuinely could use a refresher here.
Hot take: if traffic is causing issues, throttle your free tier, pause signups, or stop giving out free things (like runner time).
I know I am speaking from a position of some privilege, but I have previously left workplaces that did not allow me to practice good engineering, and I do expect others to do so.
There are literally thousands of people who are ready to ride up the totem pole; it would not be a difficult decision for a bad manager to swing his axe and replace the new head.
GitHub is promising service they know they cannot meet, not telling you that, and still charging you full price. What's more, one can argue quite convincingly that they're lying about their level of delivered service by not reflecting the actual level of uptime on their status page.
To give benefit of the doubt requires that the other party is not blatantly and overtly acting in bad faith. When they are, you're just apologizing for fraudulent behavior.
For example, if I am using the free tier of a service and "paying" by seeing ads, should I have similar expectations?
I'm not saying that's how users pay for GitHub. In that case it's more subtle: for example, by giving up control of some of their stack and bolstering GitHub's already near-monopolistic network effect.
Of course. GitHub has been an enormous gift to the open source community. Arguably more than Git itself. They deserve a lot of good will.
Also, the former stewards of that open source goodness sold it to Microsoft for a cheap buck.
Any goodwill they earned has been spent.
But you are right in the sense that GitHub has failed to hold up its part of the deal, which is really just to be a usable place. People HAVE previously tolerated so much AI slop and slowness in GitHub's UI just because of its reliability, but this downtime is GitHub's Achilles' heel.
At some point, I'd recommend people accept this and move to healthier alternatives; there is also some momentum. For example, I originally wanted to join Codeberg, but so many projects used GitHub and required signing in with GitHub that I finally gave in. I had thought that Codeberg was good but that nobody was going to come because of the network effects. The tide is turning, though, and I hope more people look into Codeberg and other healthier alternatives.
More than a bit strange. This is an HNism that I'll never get. Why would you go to the comment section anywhere to passionately try to defend the honor of a trillion dollar company, unless 1. you're being paid to astroturf or 2. you own that company's stock? Satya Nadella isn't going to read a post here and say, "Gosh, how nice of that commenter! I'm going to send him some Microsoft stock as a show of appreciation for him defending us online!" I don't think I'll ever understand company-fanboys.
2. Maybe you know a bunch of people who work there, could be ex-colleagues etc. and you think overall it’s mostly good well-intentioned people there. Therefore you want to see them succeed, and also you might disbelieve that the company is deliberately being awful.
I don’t have any specifically warm feelings about a corporate legal entity, but I know people who work at various companies and partly for that reason I am not rooting for those companies to fail and I also don’t believe the least charitable explanations for all their failings.
My free, open-source, bare-bones, caching-free, dependency-free, authentication- and authorization-free pure PHP raw Git viewer. I developed it because GitList blew out my shared host's drive space and memory (due to a caching bug) and to consolidate my GitHub, BitBucket, and GitLab repos. There's something rewarding about self-hosting and not being beholden to the whims of third parties.
One could hope that we'd use these newfound agentic coding powers to actually realize value, improve quality, etc. Instead I see enshittification and stagnation. What are we even doing with all these tokens?
So?
If Microsoft can't scale, who can?
If it can't provide the service, it should stop selling until it can.
This is like the AOL dialup busy signal fiasco of the mid-'90s all over again. Except this time, instead of getting mad, people are making excuses for the poor, beleaguered trillion-dollar company.
You literally cannot buy GitHub Copilot right now [1].
If Microsoft can't scale something like Git 14x, then the problem is with Microsoft.
A volume increase that is a single order of magnitude (which 14x is) should not result in this level of failures.
When I compare what GitHub does, and its volumes, with social media companies, payment companies, video platforms, etc., it just doesn't make sense that it is only a volume problem.
It looks a lot more like a platform that already has baseline issues that are compounded by increased volume.
You mean like every startup ever that has been successful?
And for a service that is heavily text bound? A 14x increase would not be a big deal.
I feel for them -- with AI coders submitting 25 PRs within an hour of an issue being filed, GitHub bears the brunt of that along with the maintainers. That's a lot of work that gets done with each PR.
But they need to make some changes quickly.
Also, respectfully, you have no idea what you're talking about. "Just text" doesn't make it easy to solve. GitHub Actions aren't just text and take a lot of compute.
Life is pretty good if one's biggest concern is work stuff and you're not personally in danger or actively being harmed. That's all I'm saying.
That being said, 300k TC for E4 is still pretty good. Plus the RSUs have gone up like 60% in the last several years so that 300k package from a few years ago is maybe 350k or more by now.
My point is that they are compensated well. They should be feeling pressure to get this stuff right when their product is core infrastructure for a majority of the digital products that exist today.
I'm not saying this is the end-game solution, but they absolutely could have put temporary safeguards in place while they "figure it out" if it _really_ is just AI-driven slop setting their computers on fire.
What "brunt"? These are not large numbers.
AI coding has made this orders of magnitude bigger.
The individual numbers are small, but they add up quickly.
But also, each PR kicks off a bunch of CI work, often in GitHub Actions.
How would a random kid in a 3rd world country ever get noticed enough to enter a trust circle, for example?
The "model" (GH effectively allowing an overload of their infra) is already broken.
> How would a random kid in a 3rd world country ever get noticed enough to enter a trust circle
By submitting a quality change with a clear description, preferably with unit tests? Is that no longer considered an acceptable hurdle?
But the proposal is to specifically disallow that unless the person is already known.
That is the model today, the one that people want to get rid of.
Let's level-set on the issue: Of late, GH has suffered a continuous stream of noteworthy outages. It is hypothesized the underlying cause of the instability has been the dramatic rise in submissions from coding agents ("AI"). The open question is how (or whether) GH can get load at a manageable level, with the proposal being, 'don't immediately allocate build/compute resources against any and all submissions.'
I don't see why that is equivalent to rampant disenfranchisement in the open source community. I believe what people have in mind is closer to, "don't immediately trigger an expensive build process as soon as someone submits a pull request."
EDIT: from GitHub's selfish perspective, this would gatekeep their CI load. I assume (I have no idea, it's just a guess) that mostly serving source code and handling commits is not primarily the scale problem. Instead (again just guessing) probably the vast majority of the compute load due to PRs is running all the CI checks. Nontrivial projects can spawn a hell of a lot of compute per PR, and on every subsequent commit pushed while the PR is open.
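For what it's worth, the "don't immediately trigger an expensive build" idea can be sketched in a few lines. This is a toy policy, not how GitHub works: `PullRequest`, `should_run_ci`, and the `min_merged` threshold are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Minimal stand-in for a submitted PR (fields are assumptions)."""
    author: str
    author_merged_prs: int          # prior merged contributions to this repo
    approved_by_maintainer: bool = False


def should_run_ci(pr: PullRequest, min_merged: int = 1) -> bool:
    """Spend compute right away only for known contributors,
    or once a maintainer has explicitly approved the run."""
    return pr.approved_by_maintainer or pr.author_merged_prs >= min_merged


newcomer = PullRequest(author="first-timer", author_merged_prs=0)
regular = PullRequest(author="regular", author_merged_prs=12)

print(should_run_ci(newcomer))  # held until someone looks at it
print(should_run_ci(regular))   # runs immediately

newcomer.approved_by_maintainer = True
print(should_run_ci(newcomer))  # a maintainer approved, so it runs
```

Note that the newcomer's PR is still accepted and visible; only the expensive build waits for a human, which is a long way from locking people out.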
Heck I stopped using it for projects in 2018, even before the acquisition.
My company was going to end a 6-figure YoY contract with a GitHub Actions competitor to move to GitHub, but scrapped those plans and renewed this morning. That move had been in planning for like 6 months.
/ponder .oO( i must be one of today's lucky 10000 https://xkcd.com/1053/ )