People here need to get over the fact that it's not Rust. I use C++ for my own projects because I enjoy writing in C++. I just wouldn't write them if I forced myself to use Rust or whatever else.
And this is not a garbage collector in the traditional sense; it's more like an arena with smart pointers.
Curious about your approach to the networking stack. Are you planning to support more protocols like HTTPS or WebSockets in the future, or is the focus more on keeping this lightweight and minimal for now?
At a previous company we moved off of wkhtmltopdf to a nodejs service which received static html and rendered it to pdf using phantomjs. These days you probably use puppeteer.
The trick was keeping the page context open to avoid Chrome startup costs and `page` recreation. The node service would initialize a page object once, with a script inside that communicated with the server via a named Linux pipe. Then, for each request:
1. node service sends the static html to the page over the pipe
2. the page script receives the html from the pipe, inserts it into the DOM, and sends an “ack” back over the pipe
3. the node service receives the “ack” and calls the pdf rendering method on the page.
I don’t remember why we chose the pipe method: I’m sure there’s a better way to pass data to headless contexts these days.
The whole thing was super fast (~20 ms) compared to WK, which took at least 30 seconds for us, and would more often than not just time out.
I remember the afternoon I had the idea: it was beer Friday, and it took a few hours to write up a basic prototype that rendered a PDF in a few hundred milliseconds. That was the first time I’d written a 100x speed improvement. Felt like a real rush.
If your job is to render arbitrary user HTML, this could get much more hairy. First of all, print rendering at the time (and probably now) was notoriously finicky. Things like adjusting colors, rendering SVGs properly, and pagination were difficult. It took a lot of effort to get right.
Furthermore, if you're sending arbitrary HTML, you now have a much larger security exploit surface. If someone figures out how to call `addEventListener` within the page context, they can snoop on every PDF generated by that page.
This blog post convinced us that the switch was worth it: https://zerodha.tech/blog/1-5-million-pdfs-in-25-minutes/
I've made posts about it on HN before but they've never gained traction. I hope that this takes off.
You guys make neat software.
So kudos for building it this far. Now let me see if it runs webgl before I eat my hat.
It would be great to standardize alternative browsers on a consistent subset of web standards and document it, so that "smolweb" enthusiasts can target that subset when building their websites, and alternative-browser makers can target something useful without boiling the ocean.
I personally prefer this approach to brand-new protocols like Gemini, because it retains backward compatibility with popular browsers while offering an off-ramp.
Could such a standard be based on the subset of HTML/CSS acceptable in emails? Maybe with a few extra things for interactivity.
HTML 2 might be an interesting subset of HTML to "focus on" for smolweb, but it would be a big retro throwback, and not feel at all modern.
If you were starting today, might be more interesting to start with the most modern stuff and work backwards. HTML 2 TABLE could be implemented as a specialization of CSS Grid, for instance.
Your standard still needs to render in Outlook on Windows, though, which means you need to support Word's weird HTML rendering engine as an upper limit.
Whether it actually gets mainstream adoption or goes into the standards pile is another question entirely.
(While on it, can we also ban loading images from third-party servers?)
But I don’t see any email clients with somewhat significant market share going through with this :(
I think something like a reference implementation (Ladybird, Servo or even Vaev maybe?) getting picked up as the small-web living standard feels like the best bet for me since that still lets browser projects get the big-time funding for making the big-web work in their browser too. "It's got to look good in Ladybird/Vaev/etc".
An idea: a web authoring tool built around libweb from Ladybird! (Or any other new web implementation that's easily embeddable) The implied standard-ness of whatever goes in that slot would just come for free. (Given enough people are using it!)
The phrase "living standard" is an oxymoron, invented by the incumbents who want to pretend they're supporting standards while weaponising constant change to keep themselves incumbent.
A "standard" should mean there is a clear goal to work towards for authors and browser vendors. For example, if a browser implements CSS 2.1 (the last sanely defined CSS version), its vendor can say "we support CSS 2.1", authors who care enough can check their CSS using a validator, and users can report if a CSS 2.1 feature is implemented incorrectly.
With a living standard (e.g. HTML5), all you get is a closed circle of implementations which must add a feature before it is specified. Restricting the number of implementations to one and omitting the descriptive prose sounds even worse than the status quo.
(My opinion as another one who has been slowly working on my own browser engine.)
The least-needed features are often accessibility nightmares (e.g. animation - although usually not semantic).
The accessible subset could then be government standardized and used as a legal hammer against over-complex HTML sites.
For a while search engines helped because they encouraged sites to focus on being more informative (html DOCUMENTS).
I think web applications are a huge improvement over Windows applications, however dynamic HTML is a nightmare. Old school forms were usable.
(edited to improve) Disclosure: I wrote a JS framework and an SPA in the mid '00s (so I've been more on the problem side than the solution side).
<meta name="dependencies" content="mathjax/1.1 highlightjs/2.0 navbar/5.1"/>
then the browser decides how to resolve them.

I just want one of these browsers to give me a proper ComboBox (text, search, and drop-down thing)
But now we have best of both worlds: use <table> for the actual tables, and CSS grid for UI layouts.
Well, until everybody just gave up and declared it a "Living Standard".
I think we do still need something like this, but I worry that basing it on versions of the spec is just repeating the same DOM levels mistake.
("I don't need to worry about the second axis" seems to be a "not thinking fourth dimensionally enough" excuse to me today. You haven't considered enough responsive breakpoints or you haven't considered future features or future expanded data or future localizations, yet.)
For the same reason C++ is chosen for a lot of projects. Probably the authors have a lot of experience in C++.
For an exceedingly complex and large project, you really want to choose a language you're very proficient in. Like, years and years of experience proficient in. If you don't have the experience in Rust then you don't have it. And, Rust is really the only other language that can be considered here. Swift, C#, whatever, are just a tad too high-level to write an engine in. At least, ergonomically.
I looked at the source code briefly and it's very high-quality code. Writing good C++ is hard, harder than pretty much any other language. It's modern, it's consistent, it's readable, and it's typed well.
Rust is a bad language to write an open source browser in because the hardest problem of building a browser is not security but the number of people you can convince to work on it.
C++ programmers are a dime a dozen, there's a huge number of people who write C++ for 8 hours a day. The Rust community is mostly dabblers like myself.
C++ 64.6%
HTML 22.4%
JavaScript 11.0%
CMake 0.7%
Objective-C++ 0.5%
Swift 0.3%
Other 0.5%
Saying a mature engine that you can use today for ~all of the web is being "overtaken" by unreleased pre-alpha software is a strange definition of overtaking.
>secure HTML/CSS engine
No offense to these folks, but I see no evidence of any fuzzing which makes it hard to believe there aren't some exploitable bugs in the codebase. Google has world-class browser devs and tooling, yet they still write exploitable bugs :p (and sorry Apple / Mozilla, you guys have world-class browser devs but I don't know enough about your tooling. Microsoft was purposefully omitted)
Yeah, very few of those bugs are in the renderer, but they still happen!
Performance is often a concern, but a slow secure browser is better than a fast insecure one. Perhaps I'm a security troll, but writing this stuff securely in C++ has been shown over the last 30+ years to be functionally impossible, and yet security is one of the most important things for a browser.
If the answer is that there are more possible contributors, or even that this is a hobby project and it's what the author knows, those are reasonable answers, but I'm interested anyway because perhaps the author has a different way of thinking about these tradeoffs to me, and maybe that's something I can learn from.
Google uses it to power YouTube TV.
Unfortunately, while I'm sure I downloaded a Linux X11 binary a while back to play with, I can't find anything of the like available anymore. The release packages just contain a shared library, and the containers in the registry are just full of compiler toolchains (I installed ncdu in them and checked).
The whole system is mired/buried in layers of hardware integration fluff (because Cobalt is meant to be embedded in set-top boxes) and there is very little in the way of batteries-included demos, potentially to keep the product from gaining cottage-industry traction on older systems. Which does make sense, given that there are specific CSS rules that Cobalt doesn't follow the spec on, and I'm not sure where its JS support is at.
https://developers.google.com/youtube/cobalt
The compilation docs are about as dense as Chromium's are -.-
https://developers.google.com/youtube/cobalt/docs/developmen...
https://github.com/webview/webview
The entire point of WebView is that it's a browser embedded inside of a different application, how do you expect it to be a "standalone project"?
I know there are quite a few options that try and do something similar. But they are all so incredibly bloated when all you want to do is use html5 for a native application UI.
The choice of C++ is bold.
Despite the security concerns often highlighted, modern C++ with smart pointers and RAII patterns can be just as safe as Rust when done right. Vaev’s security model should focus on process isolation, sandboxing techniques, and leveraging modern C++ features to minimize vulnerabilities.
Super excited to see such raw innovation and courage in tackling a colossal task often monopolized by juggernauts like Chromium.
There's also https://weasyprint.org/ which doesn't use any browser engine, but rather a custom renderer.
And both of those (and Prince) can be used as a backend by Pandoc (https://pandoc.org/)
The first one I found to be unreliable, the second one is super slow and the third can be annoying to work with.
Incredible work and dedication
Now google.com is loads of JS crap. The SERP refuses to render without full-blown JS, CSS, and cookies.
Let me guess, it's lightning-fast because it lacks many features and secure because it's a thousand times less code than the alternatives?
I don't want to discourage anyone, but such a description is misleading.
I know there is Lynx but having a non-terminal based browser which could do it would be cool.
> I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it. [0]
I know you wanted something other than Lynx, but you could do this with EWW (the Emacs web browser) or any graphical browser, provided that your wget proxy dropped the images.
It's still early days, but Clang can check some lifetimes, using the [[clang::lifetimebound]] attribute [1]. You can also restrict unsafe pointer usage [2] outside designated blocks of code—somewhat like Rust’s unsafe keyword.
[1] https://clang.llvm.org/docs/AttributeReference.html#id8 [2] https://clang.llvm.org/docs/SafeBuffers.html#buffer-operatio...
But this site is called Hacker News, and one of its most important roles has always been to feature and celebrate novel and interesting projects that people hack on, for whatever reason they choose.
There are all kinds of things that can be learned by starting with a blank slate and re-implementing something as ubiquitous and foundational as a web browser.
Over the years, many users have enjoyed undertaking a course called "Nand to Tetris" [1]. I hope to find time to do it one day. I don't expect it will make me substantially more employable, but I think I'll enjoy learning about the fundamentals, and I'm sure it will be beneficial in my work somehow.
Please let's remember that the playful exploration that happens through a project like this can lead to all kinds of benefits that might be non-obvious, and that it’s fine to appreciate the effort for its own sake.
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
Your passion projects were probably also far more important to your growth than you give them credit for.
Scratching an itch is how we, as programmers/engineers/whatever, grow. It is also how we stumble into solving real problems and make our mark on the world.
Who knows, this could become the next big player in the browsersphere, or maybe it'll pivot into something else, or perhaps it will spark someone's imagination. At the very least it has (probably) already been a source of creative bliss and pride for those involved, which in my opinion makes it worthwhile.
The counter-point is that in the case of a web browser you are studying deeply one of the most impactful technologies in existence, and you will learn 80% of the most important lessons from a minimal working build that is maybe 0.1% of the real thing. You may learn and execute much faster too, because there is a clear blueprint, and you are likely riding a wave of passion that will carry your mind to places you wouldn't have expected.
The perspective gained puts you in a much better place to identify and successfully execute more impactful work. The work may be the seed of something more important that is as yet unseen or unknown.