It's an interactive online IDE for many assembly languages, currently M68K, MIPS, RISC-V and X86 (I need to improve X86). It has a ton of features that are made to teach assembly programming, and it can be embedded in other websites.
It was nominally supposed to be about flow control instructions, but as it goes with those things, it spiralled and ended up touching on relocations, position-independent code, ASLR... One of these days I'll clean it up and post it
https://news.ycombinator.com/item?id=22279051
https://sonictk.github.io/asm_tutorial/#introduction/setting...
https://cs.brown.edu/courses/cs033/docs/guides/x64_cheatshee...
https://people.freebsd.org/~lstewart/articles/cpumemory.pdf
https://learn.microsoft.com/en-us/cpp/build/x64-calling-conv...
OpenSecurityTraining2 Architecture 1001: x86-64 Assembly
https://p.ost2.fyi/courses/course-v1:OpenSecurityTraining2+A...
They also have RISC-V and many other courses, including ones on debuggers and reverse engineering tools (IDA/Ghidra, for example)
For full disclosure, I am the author - apologies for the shameless plug
[1]: https://github.com/shikaan/shikaan.github.io/blob/main/_incl...
I have been planning to glue something together with v86[1], as I did in OSle[2], but I have not gotten to it yet.
In that case, everything would run locally and sandboxed, so you would not have to care.
What I don't understand is why assembly feels so hard to learn in the first place?
I mean, isn't it just a simple language with a few function calls (instructions) and types (operand sizes) and fixed number of variables (registers) and a small number of control flow operators, and that's it? Why does it feel so mysterious?
I think the problem with assembly languages in particular is that the canonical definition is the machine code, not the human-readable text. x86 is particularly annoying because no one agrees on the syntax of that text, there are hundreds of mnemonics, things are constantly being updated, and the practicing assembly programmer cares deeply about the execution semantics of the microarchitecture more than the specific sequence of instructions. Some of the vocabulary is also completely foreign to higher-level programmers: instruction pipelines, uops, instruction latency, and so on.
Rarely do you sit down and write this assembly by hand: you compile some C code, poke at it with VTune/uProf to find the hot sections of the code, break those down, and implement faster versions. It's fundamentally an iterative, experimental process.
1. You are primed to think that it is mysterious because that’s all you usually hear about assembly. (“Roller Coaster Tycoon was written in 100% hand crafted assembly… what an absolute wizard!!”)
2. The language’s textual format is odd - columns vs nested indentation. Actually really nice once you get used to it, but it’s definitely alien at first.
3. Mnemonics and directives have short, cryptic spellings. x86 in particular has arbitrary looking register names as well. RV, AArch64, m68k etc do better here.
4. Mnemonics are inconsistently overloaded and encode lots of stuff. SIMD instructions tend to look like a cat sat on your keyboard.
5. Manually laying out memory is technically simpler than the abstractions provided by higher level languages (structs and classes, fancy generic types, pointer syntax), but it’s fiddly and you have to deal with alignment.
6. You have to do a lot of bookkeeping yourself. It’s like malloc/free turned up to 11.
7. Register allocation is a hard problem for computers. It’s kinda tough for humans, too.
8. Lots of books and online stuff discuss assembly for use with high performance code, tight compute kernels, raw hardware access, and fiddly CPU configuration for OS startup and virtual memory configuration. This requires even more specialized registers, arcane instructions, and bit fiddling. This stuff - along with reverse engineering and security research/attacks - gets lumped into what people think of as “assembly language”. The resulting concept surface therefore looks much larger than it actually is.
I highly recommend writing a non-trivial program entirely in assembly at least once. I occasionally need to do it professionally, but even when I don’t, I usually have a hobby project or two cooking at home.
Becoming as proficient in asm as - say - C or Python is quite the lovely expression of craft. You feel like a wizard (see point 1) while simultaneously learning what’s really going on.
For people with a certain geeky disposition it pays lots of aesthetic, psychological, and professional dividends.
For a little 8-bit microprocessor with program size < 8k it can be quite easy and even a joy. Anything else and your compiler will outperform you; better to inline hand-coded assembler as needed.
It's up to the compiler/programmer to handle calling conventions.
Modern programmers also don't regularly encounter "unstructured programming" in higher-level languages these days.
All of these, and more, make it feel overwhelming, as anything they examine through their disassembler will contain all of these elements.
But only being able to run it in a virtual machine is a little bit demotivating.
Well, there are some Chinese folks selling newly built PC-XT compatible machines on the internet. Maybe I could go this way. And probably, pure, original 8086 assembly is a lot more fun than the overly complicated x86_64 with lots of extensions.
AMD seem to have decided to regularise the instruction set for 64-bit long mode, making all the registers consistently able to operate as 64-bit, 32-bit, 16-bit, and 8-bit, using the lowest bits of each register. This only occurs if using a REX prefix, usually to select one of the 8 additional architectural registers added for 64-bit mode. To achieve this, the bits that are used to select the 'high' part of the legacy 8086 registers in 32- or 16-bit code (and when not using the REX prefix) are used instead to select the lowest 8 bits of the index and pointer registers.
From the "Intel 64 and IA-32 Architectures Software Developer's Manual":
"In 64-bit mode, there are limitations on accessing byte registers. An instruction cannot reference legacy high-bytes (for example: AH, BH, CH, DH) and one of the new byte registers at the same time (for example: the low byte of the RAX register). However, instructions may reference legacy low-bytes (for example: AL, BL, CL, or DL) and new byte registers at the same time (for example: the low byte of the R8 register, or RBP). The architecture enforces this limitation by changing high-byte references (AH, BH, CH, DH) to low byte references (BPL, SPL, DIL, SIL: the low 8 bits for RBP, RSP, RDI, and RSI) for instructions using a REX prefix."
In 64-bit code there is very little reason at all to be using bits 15:8 of a longer register.
This possibly puts another spin on Intel's desire to remove legacy 16- and 32-bit support (termed 'X86S'). It would remove the need to support AH, BH, CH and DH - and therefore some of the complex wiring from the register file to support the shifting. If that's what it currently does.
Actually, looking at Agner Fog's optimisation tables (https://www.agner.org/optimize/instruction_tables.pdf) it appears there is significant extra latency in using AH/BH/CH/DH, which suggests to me that the processor actually implements shifting into and out of the high byte using extra micro-ops.
I disagree: there only exists BSWAP r32 (and, by extension, BSWAP r64 in 64-bit mode): https://www.felixcloutier.com/x86/bswap
No BSWAP r16 exists. Why? In 32-bit mode it was not needed, because you could simply use
XCHG r/m8, r8
with, say, cl and ch (to swap the endianness of cx).
In 64-bit mode, you can thus swap the endianness of a 16-bit value in one instruction only for the "old" registers ax, cx, dx, bx. If you want to swap the 16-bit part of one of the "new" registers, you at least have to do a logical right shift by 16 (SHR) after a BSWAP r32 (EDIT: jstarks pointed out that you could also use ROL r/m16, 8 to do this in one instruction on x86-64). By the way: this solution has a pitfall over BSWAP: BSWAP preserves the flags register, while SHR does not.
All runs in the browser:
https://youtube.com/playlist?list=PLn_It163He32Ujm-l_czgEBhb...
Learning assembly with a really good visualizer or debugger in hand is highly underrated; just watching numbers move around as you run your code is more fun than it has any right to be.
I really like Justine Tunney’s blinkenlights program. (https://justine.lol/blinkenlights/)
A version of that for AArch64 / RISC-V would be really cool.
It's currently 50% off and not only will you learn ARM, and some history about ISAs in general, but you'll learn more about how the computer itself works.
And if ARM isn't a hard requirement, an older edition that uses RISC-V as the core ISA is a free download.
https://www.cs.sfu.ca/~ashriram/Courses/CS295/assets/books/H...
Highly recommended.
The smallest, simplest, 'useful' (in terms of useful enough that lots of devs did good work with it and thus it might also be 'popular') ASM sets are probably also sufficient to start with. Provided you've got a good guide to using them, and also ideally a good sheet for why given instructions are packed the way they are in binary.
I do agree you're more likely to find pointers to such resources for the more classic architectures. It's also easier to find free copies of useful literature for them.
i.e. https://esolangs.org/wiki/FlipJump
Flip Jump is amazing. I understand the theory and how it works, but it still amazes me that it does. Things like this are why I love the science in computer science.
And subleq even has a C compiler and operating system, just wow. https://en.wikipedia.org/wiki/One-instruction_set_computer#S...
https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.h...
It'll be my keynote presentation at the D conference next month.
Rather clunky, and most UNIX assemblers were never as powerful as other systems' macro assemblers; after UNIX System V, the assembler was there only as an additional external pass for the C compilation steps, and even before that it was quite minimalist.
Then there is the whole brain twist of being used to op dest, src, and having to switch into op src, dest.
I prefer the Intel approach as it is more similar to dest = src, rather than src -> dest.
Intel's approach is also more common across various assembly languages.
The silly thing is importing one order into an architecture that uses the opposite order. Now your users have to transpose operands for no reason. The really silly thing is messing up the operand order of the cmp instruction in the process. The VAX assembly language had a compare with the sane operand order, BTW.
- https://en.wikipedia.org/wiki/RISC-V_assembly_language
- https://asm-docs.microagi.org/risc-v/riscv-asm.html
I understand that there were only 14 different instructions in the original design.
"The 386 has about 140 different instructions, compared to a couple dozen in the ARM1 (depending how you count)."
https://www.righto.com/2015/12/reverse-engineering-arm1-ance...
If you are daring, you can find my puny attempt here: https://github.com/libriscv/libriscv/blob/master/lib/librisc...
I did manage to improve it once I figured out some of the various modes of accessing memory, and I even managed to cut the jump table down from 64- to 32-bit which should help keep it in memory. I made the jump table part of .text in order to make it RIP-relative. For the fibonacci sequence program, not many bytecodes are needed. I would greatly appreciate some tips on what can be improved there.
one opportunity for optimization is mapping emulated registers to real x86-64 registers and basically never spilling them to memory (so that way if you have to add you don't have to first fetch then add, but just add directly). though that makes writing the emulator a lot more annoying.
Did you do this manually?
gcc changed to always put jump tables in .rodata, which causes problems when .rodata is stored in ROM.
It does have the `-fno-jump-tables` option but that just disables jump tables rather than allowing you to control where they go.
Did you mean “ax, bx, cx, dx”?
>> Additionally, the higher 8 bits of ax, bx, cx and dx can be referred to as ah, bh, ch and dh.
Together, AX refers to bits [0-15]. EAX refers to [0-31].
It's counterintuitive (or at least, inconsistent) that we have a name for bits [8-15] but not for [16-31] or [32-63]. My fuzzy understanding is that this came about from legacy decisions.
This page has a helpful visualization at the top: https://www.cs.uaf.edu/2017/fall/cs301/lecture/09_11_registe...
Here's the AMD manual: https://docs.amd.com/v/u/en-US/40332-PUB_4.08
For those writing in compiled languages like C/C++, particularly with an interest in performance, it's been very helpful just to be able to read compiler output and see what it's generating. It takes the guesswork out: you can just write code and see what the compiler is actually doing. Who knew! It's actually helped my understanding of C++ by letting me see the bigger picture.
Of course it's also much easier to learn just to read a little disassembly than actually write the stuff. I'm sure I'm not alone in that for me Compiler Explorer (https://godbolt.org) was my gateway into this. You can get quite far even if just knowing the basics (I'm no expert).
If these do sound interesting to you, I'd recommend looking into capture the flag (CTF) competitions, and trying reverse engineering or binary exploitation (pwn) challenges. PicoCTF [1] is an entry-level platform that hosts challenges and has references to learning resources - I believe there's a sequence on assembly in the learning resources.
Aside, I also find it useful to know assembly when debugging C/C++ code, as others have suggested.
You could try writing a game (perhaps using Raylib) or a pixel art editor. Or maybe a little web application for your Homelab.
Simple C libraries (Raylib, libcurl, early win32 APIs) tend to be dead simple to use from assembly.
Most asm tutorials are either bare metal / OS or “we will talk directly to the kernel”, but there’s no reason you can’t interface with higher level libraries and make real apps and games. It’s simpler than it sounds because a whole lot of your code will just be moving things around between registers and memory in order to make function calls and bookkeep your program state.
It's kind of useful for understanding compiler output on godbolt, and occasionally for debugging code without debug info.
But I successfully wrote a ton of C++ for decades without knowing assembly. It's not really necessary.
As a systems engineer it's good to know _some_ x86-64 assembly, as sooner or later you'll be facing a stack trace / register dump and will have to try to make sense of it. Personally, I would only take it further than that if you're interested in building compilers.
As an odd twist to the whole assembly vs. HLL dichotomy, sbcl is used by some as assembly playground: https://pvk.ca/Blog/2014/03/15/sbcl-the-ultimate-assembly-co...
Maybe there is something to do there (I haven't checked).
https://shenzhen-solitaire.tgratzer.com/
Which I find more enjoyable, both because it's online so it's easier to reach from anywhere, and also because I feel like the version of the solitaire inside the game is a bit... heavy feeling. Like there's some sort of input delay? Anyhow, I must have around 3000 completed games of solitaire across my devices.
[1]https://store.steampowered.com/app/570490/SHENZHEN_SOLITAIRE...
https://www.AssemblyArena.com - an educational game for people who would like to learn and/or program in assembly.
It's a (non mobile-optimized) web-based PvP assembly programming game. The syntax is highly inspired by x86 assembly and the game itself by Core War. There is also a small tutorial and a ranked ladder powered by Glicko-2.
Source Code:
There are a couple of great books out there as well that I have used. I cannot recall their names right now. :(
Also, there are so many great resources in the comments.
- https://asmjit.com/asmgrid/
"Higher language version is easier to optimize, because machine gets better idea what you are aiming at." said Lex Fridman et al.
Yes you need to be good at assembly (especially data oriented architecture), yes it takes forever, but that is no excuse to spew falsehoods.
you will likely not be writing a lot of assembly by hand, however steering the compiler codegen in the right direction requires an understanding of what the compiler produces.
even outside of enhancing performance, knowledge of instruction sets is instrumental for security research and reverse engineering. for some fun but practical demonstrations, see work by Nathan Baggs on YouTube - it involves staring at a lot of disassembly.
i don't know where this misguided notion that assembly language is "1975" comes from. it's not like Cobol where a few large but important systems keep it alive. this is something that lies at the core of every interaction with computers that you have daily.
Natural language is the highest of course. We will return to assembly programming eventually, because those intermediate "high-level" languages are not needed anymore between you and the machine.
Also I should note that until the early 1990s, C compilers were quite bad; that is why we wrote what would now be AAA games in straight assembly.