Let’s code a TCP/IP stack, 1: Ethernet and ARP (2016) (www.saminiir.com)

309 points by jcartw 5 days ago 12 comments

cihangir5 days ago
Years ago, I attempted to build a user-space network stack in C [0] that processes raw packets through the TUN interface, and I got it working up to a point. It currently includes a simple shell for configuring IP addresses, routes, and such. A hybrid structure reminiscent of both mbuf and sk_buff holds the network packets. However, after completing the UDP implementation I didn't find the time or motivation to implement TCP. If you want to check it out, here's the link:
[0] https://github.com/cakturk/unet
- VWWHFSfQ5 days ago
  Many years ago, I wrote a pcap/tcpdump parser in pure bash, because it's all I knew how to write "programs" with. It was, of course, the slowest and most brittle thing of all time, but it did actually work. And was kinda fun. Wish I still had that code somewhere.
- jpfr3 days ago
  Many embedded devices run the lwip implementation of TCP/IP.
  The "POSIX port" of lwip does the same. It takes the raw Ethernet bytes from a TUN/TAP device.
  https://github.com/lwip-tcpip/lwip/blob/master/contrib/ports...
zoobab5 days ago
If you compile a minimal linux kernel without a tcp/ip stack -> 400KB. If you add a tcp/ip stack -> 800KB.
For a project where I should just send the temperature, I just made a small C program in userspace that sent the value over a crafted UDP message, saved a lot of space (and complexity) :-).
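For anyone wondering what "crafting" a UDP message involves, here is a minimal sketch in Python (not the C program described above). It builds an IPv4 header and a UDP header by hand, with both checksums. The addresses, ports, and payload are made-up values, and actually putting the bytes on the wire (an Ethernet header plus a raw packet socket) is left out.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udp_packet(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    """Return IPv4 header + UDP header + payload as raw bytes."""
    src = socket.inet_aton(src_ip)
    dst = socket.inet_aton(dst_ip)
    udp_len = 8 + len(payload)

    # The UDP checksum also covers a pseudo-header: src, dst, zero, proto 17, length.
    pseudo = src + dst + struct.pack("!BBH", 0, 17, udp_len)
    udp_zero = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    # A computed checksum of 0 is transmitted as 0xFFFF (0 means "no checksum").
    udp_csum = checksum16(pseudo + udp_zero + payload) or 0xFFFF
    udp = struct.pack("!HHHH", src_port, dst_port, udp_len, udp_csum) + payload

    # 20-byte IPv4 header without options: version 4, IHL 5, TTL 64, protocol 17.
    ip_zero = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + udp_len, 0, 0, 64, 17, 0, src, dst)
    ip = ip_zero[:10] + struct.pack("!H", checksum16(ip_zero)) + ip_zero[12:]
    return ip + udp


# Hypothetical sensor reading sent to a hypothetical collector.
packet = build_udp_packet("10.0.0.4", "10.0.0.5", 40000, 9999, b"temp=21.5")
```

A receiver can validate either header by running the same checksum over it with the checksum field left in place; a valid header sums to zero.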
- chamomeal5 days ago
  Wow that’s crazy!
  As someone who knows nothing about anything: that doesn’t mean the tcp/ip stuff is half the source code of the whole kernel, does it?
  - tga_d5 days ago
    The majority of the Linux kernel's source code is device drivers. The overwhelming majority of that is not included in the kernel image by default, but instead made available as kernel modules you can enable as needed. E.g., your thermostat probably doesn't need support for an obscure game controller, so doesn't have those drivers, but it could if you were so inclined.
- miohtama5 days ago
  Curiously, why is the IP stack so large? 400 KB of binary is a lot of code. Is it highly optimised for the large-server use case?
  - hylaride5 days ago
    Modern TCP/IP stacks have a lot of extra code, including anti-spoofing, performance enhancements (e.g. zero-copy integration with hardware network cards), various attack prevention measures (SYN flood protection, randomization of sequence numbers, etc.), support for various hardware offloading (many network cards will do checksum offloading, etc.), IPv6 (which also originally mandated IPSec integration), and support for lower layer 2 protocols (mostly just ARP for Ethernet, but there are still others around).
kbouck5 days ago
If you disable ARP, you can have a group of servers on the same network configured with the same IP! If a server acting as a routing frontend forwards packets to a backend server's network interface by MAC address (you need a kernel extension for this trickery), that backend server will recognize itself as the destination, swap the source/dest IP, and respond directly to the client (without going back through the routing frontend).
Alternatively, you can accomplish the same without disabling ARP and by just adding the common IP address as an alias to the loopback interface, which allows the backend to recognize itself as the destination, but avoids ARP conflicts.
This was a trick used by IBM's WebSphere software load balancer back in the 90's-00's
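The forwarding step can be sketched in a few lines of Python. This is only an illustration of the idea, not how WebSphere did it: the function name and the MAC addresses are hypothetical, and it assumes untagged Ethernet II frames. The point is that only the Ethernet addresses change, while the IP packet (still addressed to the shared IP) passes through untouched.

```python
def dsr_forward(frame: bytes, frontend_mac: bytes, backend_mac: bytes) -> bytes:
    """Re-address an Ethernet frame to a backend without touching the IP packet."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # Bytes 0-5 are the destination MAC, 6-11 the source MAC.
    # Everything from byte 12 on (EtherType + IP packet) is copied as-is.
    return backend_mac + frontend_mac + frame[12:]


# Hypothetical addresses, for illustration only.
FRONTEND = bytes.fromhex("02000000000a")
BACKEND = bytes.fromhex("02000000000b")
CLIENT_SIDE = bytes.fromhex("020000000001")

incoming = FRONTEND + CLIENT_SIDE + b"\x08\x00" + b"<ip packet to the shared IP>"
outgoing = dsr_forward(incoming, FRONTEND, BACKEND)
```

Because the destination IP is never rewritten, the backend can answer the client directly from the shared address.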
- lmz12 hours ago
  Also known as DSR (Direct Server Return) https://www.haproxy.com/blog/layer-4-load-balancing-direct-s...
- citrin_ru5 days ago
  > This was a trick used by IBM's WebSphere software load balancer back in the 90's-00's
  Cisco IOS SLB can work in a similar way: a virtual IP is added as an alias to loopback on each server in a farm. An advantage over the more widely used L3 balancing is that there is no need to rewrite headers in IP packets.
- Bluecobra5 days ago
  >If you disable ARP, you can have a group of servers on the same network configured with the same IP!
  The downside to this is that a switch/bridge will not learn the MAC address and continue to flood/broadcast these packets to every port in that segment. So if you do decide to do this make sure you make a dedicated VLAN. :)
  - 10000truths5 days ago
    ARP is for the LAN devices. L2 switches don't rely on ARP to build up their forwarding tables, they can just inspect the source MAC of every Ethernet frame they receive, and correlate it with the port they receive it on. Frames with unknown destination MACs are broadcast, but that stops as soon as every device in the LAN has sent at least one frame.
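A toy model of that learning behaviour, as a rough Python sketch. The class and port numbers are made up, and a real switch also ages entries out and handles VLANs, which this ignores.

```python
BROADCAST = b"\xff" * 6


class LearningSwitch:
    """Toy L2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # source MAC -> port it was last seen on

    def handle_frame(self, in_port, frame: bytes):
        """Return the list of ports the frame should be sent out of."""
        dmac, smac = frame[0:6], frame[6:12]
        self.table[smac] = in_port  # learn from the source address, no ARP involved
        out_port = self.table.get(dmac)
        if dmac == BROADCAST or out_port is None:
            return sorted(self.ports - {in_port})  # flood everywhere except the ingress port
        if out_port == in_port:
            return []  # destination is on the same port, nothing to forward
        return [out_port]
```

The first frame to an unknown MAC is flooded; once that host has sent anything at all, later frames to it go out of a single port.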
- mannyv5 days ago
  F5s have an ARP proxy setting so you don't have to do this. The downside is it tends to break DHCP.
- KeplerBoy5 days ago
  For such low-level shenanigans one can also fiddle around with DPDK. ARP is disabled by default.
globular-toast5 days ago
I did a similar thing in Python[0]. Probably not as well written and, to be honest, I just made up the address resolution algorithm. I got as far as pinging an internet host with ICMP. I like that mine is completely contained in a (short) notebook, though (the OP article misses many details that are in the larger source code that is referenced).
I hadn't seen this article and did mine all from Wikipedia! There is a huge jump in complexity for TCP, though, and I lost interest a bit. Part 3 of this covers that so maybe one day I'll read that and finish mine.
I found it very rewarding and it's definitely something that is doable by any level of programmer if you're interested in networking.
[0] https://github.com/georgek/notebooks/blob/master/internet.ip...
kasajian5 days ago
One minute into it, the article says, "The dmac and smac are pretty self-explanatory fields"
This immediately turns off anyone reading it who doesn't know what those things mean. The thought process will be, "Oh, this article is for those for whom these fields are self-explanatory. Since it's not for me, I'll stop reading"
- howerj5 days ago
  The full quote would be "The dmac and smac are pretty self-explanatory fields. They contain the MAC addresses of the communicating parties (destination and source, respectively)." So it does explain them. However, this is an article about how to make a network stack; it is safe to assume the reader knows something about networking beforehand.
- petee5 days ago
  Unless they just updated it, the next sentence explained it -
  > They contain the MAC addresses of the communicating parties (destination and source, respectively).
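For readers who would rather see the fields than read about them, here is a small Python sketch that splits a frame into those parts. The `dmac` and `smac` names follow the quote; `ethertype` and `payload` are my own labels, the sample bytes are invented, and VLAN tags are ignored.

```python
import struct


def format_mac(mac: bytes) -> str:
    """Render six raw bytes as the usual colon-separated hex string."""
    return ":".join(f"{byte:02x}" for byte in mac)


def parse_ethernet(frame: bytes) -> dict:
    """Split an untagged Ethernet II frame into header fields and payload."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # Destination MAC (6 bytes), source MAC (6 bytes), EtherType (2 bytes, big-endian).
    dmac, smac, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dmac": format_mac(dmac),
        "smac": format_mac(smac),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


# Invented sample: a broadcast frame carrying ARP (EtherType 0x0806).
sample = bytes.fromhex("ffffffffffff" "001122334455" "0806") + b"arp-body"
header = parse_ethernet(sample)
```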
intrasight5 days ago
Years ago I instrumented a nuclear power plant. I did the client-side development on Sun workstations. I actually got hired because of my TCP/IP experience - which I got from taking "Operating Systems" at CMU. The plant computer on the other hand was a mini computer that had no TCP/IP stack and so that team had to create one.
dang5 days ago
Related:
Let's code a TCP/IP stack (2016) - https://news.ycombinator.com/item?id=27654182 - June 2021 (49 comments)
Let's code a TCP/IP stack, 1: Ethernet & ARP (2016) - https://news.ycombinator.com/item?id=17316487 - June 2018 (47 comments)
Let's Code a TCP/IP Stack: TCP Retransmission - https://news.ycombinator.com/item?id=14701199 - July 2017 (30 comments)
Let's code a TCP/IP stack, 1: Ethernet and ARP - https://news.ycombinator.com/item?id=11234229 - March 2016 (49 comments)
p4bl05 days ago
I don't get where the author gets the 10.0.0.4 IP address from, the one used to test ARP resolution. What is it supposed to be the address of? A fake device accessible to the made-up Ethernet device programmed here? Or is it an actual device on the author's network? Can someone explain that?
- globular-toast5 days ago
  It isn't mentioned in the article, but the author hardcodes this when initialising an interface: https://github.com/saminiir/level-ip/blob/e9ceb08f01a5499b85...
  A TAP device is like a software-emulated Ethernet link (or any layer 2?). So if you send packets into it, they get sent directly to your user-level program. It's then up to your program to decide what IP address(es) it wants to have, reply to ARPs, etc. Normally this kind of thing is handled by the OS, and adding IP addresses to the interface requires root permissions (as does opening the TAP device). Networking is largely cooperative, and a bad actor with root permissions on your network can do bad things.
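To make "decide what IP address it wants to have and reply to ARPs" concrete, here is a rough Python sketch of the decision, not the article's C code. It takes the ARP payload (the part after the Ethernet header) and answers only if the request is for the address the program has claimed. 10.0.0.4 comes from the article; the MAC addresses and the asking host's 10.0.0.5 are made up.

```python
import socket
import struct

# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def make_arp_reply(request: bytes, my_mac: bytes, my_ip: str):
    """Return an ARP reply payload if the request asks for my_ip, else None."""
    if len(request) < 28:
        return None
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, request[:28]
    )
    # Only handle Ethernet (1) / IPv4 (0x0800) requests.
    if (htype, ptype, hlen, plen, oper) != (1, 0x0800, 6, 4, ARP_REQUEST):
        return None
    if tpa != socket.inet_aton(my_ip):
        return None  # someone else's address, stay silent
    # Swap roles: we become the sender, the asker becomes the target.
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, ARP_REPLY, my_mac, tpa, sha, spa)


MY_MAC = bytes.fromhex("020000000004")    # hypothetical
HOST_MAC = bytes.fromhex("020000000005")  # hypothetical
who_has = struct.pack(
    ARP_FORMAT, 1, 0x0800, 6, 4, ARP_REQUEST,
    HOST_MAC, socket.inet_aton("10.0.0.5"), b"\x00" * 6, socket.inet_aton("10.0.0.4"),
)
reply = make_arp_reply(who_has, MY_MAC, "10.0.0.4")
```

Nothing stops a program from claiming any address this way, which is the "largely cooperative" part.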
  - p4bl05 days ago
    Ah, thanks a lot!
    Forgetting to mention that explicitly in the article is a big miss, I think. It makes the ARP part feel like it's missing crucial information or is not actually entirely explained, while it's the previous part that misses something.
    Thanks again :).
mannyv5 days ago
From what I remember ARP only works on your local segment. Your router will fill in its address and forward the packet along.
There's also RARP, which is one way to ask 'the network' for your IP address. I have no idea if RARP still works in real life.
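The "local segment only" rule can be sketched as the choice a host makes before sending: ARP for the destination itself if it is on the same subnet, otherwise ARP for the router and let it forward. This is a simplified Python illustration with a single interface and a single default gateway; the function name and all addresses other than 10.0.0.4 are invented.

```python
import ipaddress


def arp_target(dst_ip: str, interface: str, gateway: str) -> str:
    """Return the IP address whose MAC must be resolved to reach dst_ip."""
    local_net = ipaddress.ip_interface(interface).network
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip  # same segment: resolve the destination directly
    return gateway     # off-segment: the frame goes to the router's MAC


same_segment = arp_target("10.0.0.7", "10.0.0.4/24", "10.0.0.1")
off_segment = arp_target("192.0.2.10", "10.0.0.4/24", "10.0.0.1")
```

In the off-segment case the IP header still carries the final destination; only the Ethernet destination is the router.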
revskill5 days ago
I appreciate that the article explains things without assuming prior knowledge. Well done.