Have all the bugs in OpenSSL over the years taught us nothing?
Wireguard has a concept of identity as long-term key pairs, but since the handshake is based on Diffie-Hellman, arriving at an ephemeral shared secret, it's only useful for establishing active connections. The post-quantum version of Wireguard would use KEMs, which also don't work for general-purpose PKI.
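To make the limitation concrete: a KEM only gives you encapsulate/decapsulate of a shared secret; there is no sign/verify operation to build certificates on. A rough Go sketch using the ML-KEM API added in Go 1.24 (treat the exact names as my assumption, check the crypto/mlkem docs):

    package main

    import (
        "bytes"
        "crypto/mlkem"
        "fmt"
    )

    func main() {
        // responder's long-term KEM key pair
        dk, err := mlkem.GenerateKey768()
        if err != nil {
            panic(err)
        }
        // initiator encapsulates against the responder's public key,
        // getting a shared secret plus a ciphertext to send across
        shared1, ct := dk.EncapsulationKey().Encapsulate()
        // responder recovers the same shared secret from the ciphertext
        shared2, err := dk.Decapsulate(ct)
        if err != nil {
            panic(err)
        }
        fmt.Println("secrets match:", bytes.Equal(shared1, shared2))
        // note what's missing: nothing here can sign anything,
        // which is why KEMs alone don't give you a PKI
    }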
What we really need is a signature-based handshake and simple VPN solution (like what Wireguard does with the Noise Protocol Framework) that a stream multiplexing protocol can be layered on top of. QUIC gets the layers right, in the right order (first encrypt, then deal with transport features), but annoyingly none of the QUIC implementations make it easy to take one layer without the other.
The Linux Foundation is still funding OpenSSL development after a scathing review of the codebase [1], so I think it's fair to say things haven't changed a bit.
TweetNaCl to the rescue.
There are a lot of benefits for sure, mTLS being a huge one (particularly when combined with ACME). For general-purpose, hub-and-spoke VPNs, tunneling over QUIC is a no-brainer. Trivial to combine with JWT bearer tokens etc. It's a neat solution that should be used more widely.
However there are downsides, and those downsides are primarily performance-related, for a bunch of reasons: some just poorly optimized library code, others the relatively high message parsing/framing/coalescing/fragmenting costs and userspace UDP overheads. On fat pipes today you'll struggle to get more than a few Gbit/s of throughput @ 1500 MTU (which is plenty for internet browsing, for sure).
For fat pipes and hardware/FPGA acceleration use cases, Google probably has the most mature approach here with their datacenter transport PSP [2]. Basically a stripped-down, per-flow IPsec. In-kernel IPsec has gotten a lot faster and more scalable in recent years with multicore/multiqueue support [3]. Internal benchmarking still shows IPsec on Linux absolutely dominating performance benchmarks (throughput and latency).
For the mesh project we ended up pivoting to a custom offload-friendly, kernel-bypass (AF_XDP) dataplane inspired by IPsec/PSP/Geneve.
I'm available for hire btw, if you've got an interesting networking project and need a remote Go/Rust developer (contract/freelance) feel free to reach out!
1. https://www.rfc-editor.org/rfc/rfc9484.html
2. https://cloud.google.com/blog/products/identity-security/ann...
3. https://netdevconf.info/0x17/docs/netdev-0x17-paper54-talk-s...
I don't believe you could implement RFC 9484 directly in the browser (the missing capsule APIs would make upgrading the connection impossible). Though WebTransport does support datagrams, so you could very well implement something custom.
The question is what does QUIC get you that UDP alone does not? I don't know the answer to that. Is it because firewalls understand it better than native wireguard over UDP packets?
> WireGuard does not focus on obfuscation. Obfuscation, rather, should happen at a layer above WireGuard, with WireGuard focused on providing solid crypto with a simple implementation. It is quite possible to plug in various forms of obfuscation, however.
This comment https://news.ycombinator.com/item?id=45562302 goes into a practical example of QUIC being that "layer above WireGuard" which gets plugged in. Once you have that, one may naturally wonder "why not also have an alternative tunnelling protocol with <the additional things built into QUIC originally listed> without the need to also layer Wireguard under it?".
Many design decisions are in direct opposition to Wireguard's. E.g. Wireguard has no AES and no user-selectable ciphers (both intentionally); QUIC has both. Wireguard has no obfuscation built in; QUIC does (plus the happy fact that when you obfuscate traffic by using it, it looks like standard web traffic). Wireguard doesn't support custom authentication schemes; QUIC does. Both are reasonable tunneling protocol designs, just with different goals.
I.e. the advantage here is any% + QUIC%, where QUIC% is the additional chances of getting through by looking and smelling like actual web traffic, not a promise of 100%.
The net result of two reliable transports which are unaware of each other is awful.
What does QUIC get you that TCP over Wireguard over UDP does not?
I think I see your argument, in that it's similar to what sshuttle does to eliminate TCP over TCP through ssh. sshuttle doesn't prevent HOL blocking though.
Adoption is about offering something that is 1) correct 2) easy to install 3) has reasonable performance 4) stable.
Wireguard provides all of those. OpenVPN was not meeting criterion 1 even a few years ago, and IMO, if it doesn't work after a decade of development, it's _never_ going to work.
Now, let's look at your comment, which is full of techno mumbo jumbo (don't worry, I know everything you talk about) and doesn't even mention half of those.
I think an extremely naive but popular position is that when someone comes out with some new tool that "works on their machine", they assume everyone else will immediately believe they are not just as stupid as everyone that came before them. This was even true for Wireguard, since Wireguard was _not_ bug-free either. In fact, one could argue that Wireguard is still an amateur project, despite it working stably on some of my systems.
The problem with software like Wireguard is that there is no incentive to actually make bug-free software. If software always works and has all the required features, nobody will call the person or company associated with it anymore. When was the last time the author of "grep" was recognized as a great programmer? Never. Now, I am not saying that grep is free of bugs, but I just took a fairly stable program as an example. An economy for software like SaaS has much better incentives in that regard (even though it often also does not reach bug-free status). curl is also an excellent example of bug-ridden software that an entire industry is using, while it is written by an amateur (who has no incentive whatsoever to produce something that doesn't need to have bugs fixed).
If humanity had somewhat more of a collective intelligence, a million people would come together and all pay $100 to implement a wireguard replacement (possibly even using the same protocol) to perfection, such that no new implementation would ever be needed and it would adapt to any hardware automatically. Instead we prefer to continue to fuck around with inferior shit all day long.
Ken Thompson wrote grep, and he is definitely recognised as such.
Copyright 1998-2000, 2002, 2005-2023 Free Software Foundation
Sure, he wrote _a_ version of grep, and probably the first, but who cares? "The" current version of grep (sure, you might run some BSD grep) certainly doesn't. Future grep versions, including the FSF one, were all re-implementations.
Your statement in the GP is nonsensical.
In fact, you can quite easily check this by trying to let an LLM generate a program like grep. It can do that. Now, there also exist programs for which LLMs can't generate the code, because they're too complex.
I have even used Plan9 and the silly editor.
I probably have forgotten more than you (or Thompson) have ever known.
Thompson is an amateur, because all of his programs are of "trust me, bro"-quality. Call me when Ken (and all the n00bs from that era) grows up and implements grep in Rocq.
> When was the last time that the author of "grep" was recognized as a great programmer? Never.
He is recognised as that. Your opinion on him is nothing to do with anything.
I recently built a fully Layer2-transparent 25Gbps+ capable wireguard-based solution for LR fiber links at work based on Debian with COTS Zen4 machines and a purpose-tailored Linux kernel build - I'd be curious to know what an optimized FPGA can do compared to that.
25G is a lot for WireGuard [1].
Just to elaborate for others, MACsec is a standard (802.1AE) and runs at line rate. Something like a Juniper PTX10008 can run it at 400Gbps, and it’s just a feature you turn on for the port you’d be using for the link you want to protect anyway (PTXs are routers/switches, not security devices).
If I need to provide encryption on a DCI, I’m at least somewhat likely to have gear that can just do this with vendor support instead of needing to slap together some Linux based solution.
Unless, I suppose, there’s various layer 2 domains you’re stitching together with multiple L2 hops and you don’t control the ones in the middle. In which case I’d just get a different link where that isn’t true.
When you say "exists" ... is there an open-source, high-quality implementation?
Generally it's used when you have links going between two of your sites, so you typically only need it on the switch or router that terminates that link.
If they had produced a platform with four 10 Gbps ports, then it would become interesting. But the whole hardware and bitstream would have to be redeveloped almost from scratch.
A hypothetical ASIC implementation would beat a CPU rather soundly on a per watt and per dollar basis, which is why we have hardware acceleration for other protocols on high end network adaptors.
Personally, if I could buy a Wireguard appliance that was decent for the cost, I'd be interested in that. I ran a FreeBSD server in my closet to do similar things back in the day and don't feel the need to futz around with that again.
It does not have to be all things for all people today. It can be improved. (And it appears to be open-source under a BSD license; anyone can begin making improvements immediately if they wish.)
Concepts like "This proof-of-concept wasn't explored with multiple 10Gbps ports! It is therefore imperfect and thus disinteresting!" are... dismaying, to say the least.
It would be an interesting effort if it only worked with two 10Mbps ports, just because of the new way in which it accomplishes the task.
I don't want to live in a world where the worth of all ideas is reduced to a binary concept, where all things are either perfect or useless.
(Fortunately for me, I do not live in such a world that is as binary as that.)
https://old.reddit.com/r/mikrotik/comments/112mo4v/is_there_...
> > > I see. I'll terminate at the Ryzen 7950 box behind the router and see what I get.
> > That will still be a no. Outside of very specialized solutions this level of performance is not available. It is rarely needed in real life anyways. Only a small amount of traffic needs to be protected this way; for everything else, point-to-point protection with ssh or tls is adequate. I studied different router devices and most (ipsec is dominant) have low encryption throughput compared to routing capabilities. I guess that matches market requirements.
> It looks like I can get 8 Gbps with low CPU utilization using one of my x86 machines as terminal. This is pretty good. Don't need 10 G precisely. 8G is enough.
I've done precisely this so easily. I just terminate the WG at a gateway node and switch in Linux. It's trivial and throughput can easily max the 10G. I had a 40G network behind that on obsolete hardware providing storage and lots of machines reading from that.
Reading that thread was eye-opening since they should have just told him to terminate on the first machine behind. Which he eventually did and predictably worked.
Fully available source from RTL up (although the license seems proprietary?) is very interesting from an audit standpoint, and 1G line-speed performance, although easily achieved by any recent desktop hardware, is quite respectable in worst-case scenarios (large routing table and small frames). The architecture makes sense (software-managed handshakes configure a hardware packet pipeline). WireGuard really lacks acceleration in most contexts (newer Intel QAT supposedly can accelerate ChaCha20, but trying to figure out how one might actually make it work is truly mind-bending), so it’s a pretty interesting place to do a hardware implementation.
Hm, "BSD 3-Clause License" seems really proprietary to you?
But you are right: does the per-file license in many (most?) Verilog files [1] overrule the LICENSE file [2] of the repo?
[1] https://github.com/chili-chips-ba/wireguard-fpga/blob/main/1...
[2] https://github.com/chili-chips-ba/wireguard-fpga/blob/main/L...
So for all intents and purposes, in my opinion, large parts of this Wireguard FPGA project are under this weird proprietary Chili Chips license. In fact, the license is so proprietary that the people who made this wireguard FPGA repository and made it visible to the public are seemingly in violation of it.
It puts us in a weird spot as well: I'm now the "holder of" a file and am obligated to keep all information within it confidential and to protect the file from disclosure. So I guess I can't share a link to the repo, since that would violate my obligation to protect the files within it from disclosure.
I would link to the files in question, but, well, that wouldn't protect them from disclosure now would it.
I can see an argument for IPSec. I haven't used that for many years. However, I see zero evidence that OpenVPN is "running out of steam" in any way shape or form.
I would be interested to know the reasoning behind this. Hopefully the sentiment isn't "this is over five years old so something newer must automatically be better". Pardon me if I am being too cynical, but I've just seen way too much of that recently.
The reasons are abundant, the main ones being that performance is drastically better, security is easier to guarantee because the stack itself is smaller and simpler, and it’s significantly more configurable, making it easier to obtain the behavior you want.
So while corp environments may take a long time to switch for various reasons, it will happen eventually. But for stuff like this corp IT tends to be a lagging adopter, 10-20 years behind the curve.
Which is a shame, because I have a number of problematic links (low bandwidth, high latency) that wireguard would be absolutely fantastic for, but neither end supports it and there's no chance they'll let me start terminating a tonne of VPNs in software on a random *nix box.
Problem is IIRC if you need FIPS compliance you can't use Wireguard, since it doesn't support the mandated FIPS ciphers or what-have-you.
[1] https://docs.tigera.io/calico/latest/network-policy/encrypt-...
Wireguard seems to make this much more difficult from what I can tell, though I don't know enough about networking to know if that's fundamental to wireguard or just a result on less mature tooling.
Add SNAT rule, enable forwarding, add allowedIPs to WG config.
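Roughly, assuming eth0 as the LAN-facing interface and 10.8.0.0/24 as the tunnel subnet (both assumptions, adjust to taste):

    # NAT tunnel traffic out of the LAN-facing interface
    iptables -t nat -A POSTROUTING -s 10.8.0.0/24 -o eth0 -j MASQUERADE
    # enable IPv4 forwarding
    sysctl -w net.ipv4.ip_forward=1
    # and in the remote peer's WG config, route the LAN through the tunnel:
    #   [Peer]
    #   AllowedIPs = 10.8.0.0/24, 192.168.1.0/24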
WireGuard isn’t certified for any federal installation that I’m aware of and I haven’t heard of any vendors willing to take on the work of getting it certified when its “superiority” is of limited relevance in an enterprise situation.
With WireGuard I instead max out the internet bandwidth (400 megabits/s) with like 20% cpu usage if that.
I really don’t understand why. We have AES acceleration. AES-NI can easily do more bps… why is OpenVPN so slow?
Here's a very educational comparison between Wireguard, OpenVPN and IPsec. It shows how easy Wireguard is to manage compared to the other solutions, and it measures and explains the noticeable differences in speed: https://www.youtube.com/watch?v=LmaPT7_T87g
Highly recommended!
The routing isn’t interesting to me, but protecting low-power IoT traffic certainly is.
> Both software and hardware implementations of Wireguard already exist. However, the software performance is far below the speed of wire.
> Existing hardware approaches are both prohibitively expensive and based on proprietary, closed-source IP blocks and tools.
> The intent of this project is to bridge these gaps with an FPGA open-source implementation of Wireguard, written in SystemVerilog HDL.
So having it on an FPGA gives you the best of both worlds, speed of a hardware implementation without the concerns of a proprietary black box.
VPNs are normally processed in software, and that processing is usually multi-step, so latency, jitter, processing time per type of packet, etc. can vary. This is FPGA-based, and an FPGA can run algorithms that can be implemented as chained conditions at fixed latency, without relying on function calls in software. Presumably this is faster and more stable than software approaches thanks to that.
If you want your device to connect to a VPN you need something to implement the protocol. Cycles are precious in the embedded world so you don't want to do it in your microcontroller. You might offload it to another uC in your design but at that point it might make sense to just use an FPGA and have this at the hardware(-ish) level.
You can think of this as a "network interface chip" but speaking Wireguard instead of plain IP.
You run the WireGuard app on your computer/phone, tap Connect, and it creates an encrypted tunnel to a small network box (the “FPGA gateway”) at your office or in the cloud. From then on, your apps behave as if you’re on the company network, even if you’re at home or traveling.
Why the FPGA box: Because software implementations are too slow and existing hardware implementations cost too much.
Internal or Internet: Both.
also just a fun project for the authors. :)
There are loads of 10GbE switches from Cisco/Juniper/Arista/et al.
The last time I checked (which was over 5 years ago now, admittedly) there were no 10GbE switch options at reasonable prices. Juniper had good 16-port options with 1GbE interfaces at not-crazy prices (which I have two of).
Going to 10GbE was many multiples of the 1GbE price. They just seemed way too expensive and were not dropping.
As it goes, maxing out 1GbE is fast enough for the sort of data and IOPS I send over my LAN. So 10GbE would probably have been overkill.
1Gb is fast enough, cheap, and basically foolproof.
That said, when traveling on hotel wifi, TCP port 443 is always open (otherwise the internet wouldn't work at all), thus OpenVPN will always work if you run it on that port.
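Concretely, that's just two directives in the OpenVPN server config (plus a matching remote line on the client):

    proto tcp
    port 443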
For Wireguard, there isn’t a reliable always-open UDP port. Port 123 or 53 could work sometimes, but it’s not as guaranteed.
For any other application though, Wireguard would be my first choice.
[dsvpn]: https://github.com/jedisct1/dsvpn
The one above has a very simple protocol:
The format of the data inside the TCP stream is very simple: each datagram is preceded by a 16-bit unsigned integer in big-endian byte order, specifying the length of the datagram.
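In Go, that framing is only a few lines (a sketch; the helper names are mine, not dsvpn's):

    package main

    import (
        "bytes"
        "encoding/binary"
        "fmt"
        "io"
    )

    // writeDatagram prefixes the payload with its length as a big-endian uint16.
    func writeDatagram(w io.Writer, payload []byte) error {
        var hdr [2]byte
        binary.BigEndian.PutUint16(hdr[:], uint16(len(payload)))
        if _, err := w.Write(hdr[:]); err != nil {
            return err
        }
        _, err := w.Write(payload)
        return err
    }

    // readDatagram reads one length-prefixed datagram back off the stream.
    func readDatagram(r io.Reader) ([]byte, error) {
        var hdr [2]byte
        if _, err := io.ReadFull(r, hdr[:]); err != nil {
            return nil, err
        }
        payload := make([]byte, binary.BigEndian.Uint16(hdr[:]))
        _, err := io.ReadFull(r, payload)
        return payload, err
    }

    func main() {
        var stream bytes.Buffer // stands in for the TCP connection
        _ = writeDatagram(&stream, []byte("one packet"))
        pkt, _ := readDatagram(&stream)
        fmt.Printf("%s\n", pkt)
    }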
Performance would of course suffer, but it's not likely that whichever service is blocking UDP is going to be offering high performance. If you are doing it manually you can include two peers, one over UDP and one over TCP, and prioritize traffic flow over the UDP one. Commercial VPN apps tend to handle that with "auto".
If you want to be fancy, or you are confident that the UDP-blocking service can offer high performance, you can include a third peer using udp2raw: <https://github.com/wangyu-/udp2raw>
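Roughly like this, going from memory of the udp2raw README (double-check the flags against the repo):

    # server: unwrap fake-TCP arriving on 443 back into the local WireGuard UDP port
    udp2raw -s -l 0.0.0.0:443 -r 127.0.0.1:51820 -k "passwd" --raw-mode faketcp -a
    # client: expose a local UDP port for WireGuard that tunnels out as fake-TCP
    udp2raw -c -l 127.0.0.1:51820 -r <server-ip>:443 -k "passwd" --raw-mode faketcp -a

The WireGuard peer endpoint on the client side is then just 127.0.0.1:51820.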
The reason why you may want to retain udp-over-tcp is that some sophisticated firewalls may block fake-TCP.
Couldn't you pipe it through something like udp2raw in those few cases? Performance would probably be worse/terrible, but then you say it's on a hotel network, and those tend to be terrible anyways.
> However, the Blackwire hardware platform is expensive and priced out of reach of most educational institutions. Its gateware is written in SpinalHDL, a nice and powerful but niche HDL, which has not taken root in the industry. While Blackwire is now released as open-source, that decision came from their financial hardship -- it was originally meant for sale.
Here's some kind of link for the old Blackwire 100GbE wireguard project mentioned: https://github.com/FPGA-House-AG/BlackwireSpinal
1. None of the commercial tools support them. All other HDLs compile to SV (or plain Verilog) and then you're wasting hours and hours debugging generated code. Not fun. Ask me how I know...
2. SV has an absolute mountain of features and other HDLs rarely come close. Especially when it comes to multi-clock designs (which are annoying and awkward but very common), and especially verification.
The only glimpse of hope I see on the horizon is Veryl, which hews close enough to SV that interop is going to be easy and the generated code is going to be very readable. Plus it's made by very experienced people. It's kind of the Typescript of SystemVerilog.
My issue with SystemVerilog is the multitude of implementations with widely varying degrees of support, and little open source. Xsim poorly supports the more advanced constructs and crashes on them, leaving you to figure out which part causes the issue. Vivado only supports a subset. Toolchains for smaller FPGAs (Lattice, Chinese, ...) are much worse. The older Modelsim versions I used were also not great. You really have to figure out the basic common subset of all the tools, and for synthesis that basically leaves interfaces and logic. Interfaces are better than Verilog's, but much worse than the equivalents in these neo-HDLs(?).
While tracing back compiled Verilog is annoying, you are also only using one implementation of the HDL, without needing to battle multiple buggy, poorly documented implementations. There is only one, usually less buggy, poorly documented implementation.
This is all feasible with SV or an embedded macro language as well, but you'll either have to live with a poorly documented meta-language (as not a whole lot of people are using it) or heavy mismatches between the meta-language and the "real" language. Cocotb very much suffers from this for simulation usage.
And, tbh, if it can be nicely implemented in the host language (which IMHO is the case with Amaranth, less so with Migen), I don't think there are many benefits to being standalone.
Save for things like SV interfaces (which are equivalently implemented in a far better way using Scala's type system), SpinalHDL can emit pretty much any Verilog you can imagine.