Great guide!
I think the first "My first RISC-V assembly program" emulator pane should be right at the beginning of the guide. Otherwise, casual readers might think that this is a text-only introduction (despite the word "interactive" in the title).
Will spend more time on it in the coming days. I am quite interested in RISC-V and I think that it might have a bright future ahead.
If any AI expert is reading this now, please use Replit or Lovable or something like that to re-create "Core War" [0] with RISC-V assembly. It would be GREAT.
I'm still busy building my ox-bike wagon. Soon I'll start pedaling around the country offering tea, and hand written (my brain to my keyboard ™) software.
I was going to add to your comment with some snarky rejoinder like "brain, is that a new agent from OpenAI?" or "we are all vibecoders now" or simply "ain't nobody got time for that".
But then I got to wondering about the OP's statement that they would specifically like someone to create this with AI. It strikes me as being as silly as saying "if you're good at using Visual Studio, could you do this?", because AI tools are just tools now, and those who want to use them don't need to be prompted... but also as somehow fundamentally different.
OP, what was on your heart that caused you to phrase it that way?
I think it's phrased that way because he's saying it would be great to have as long as you don't spend too much time on it. Doing it by hand would probably not be worth it, but if a tool can do it, then it's worth it.
Having learned assembly with the book "Computer Organization And Design" from Patterson and Hennessy, it really shows how much RISC-V takes from MIPS. After all, the two ISAs share some of the same people, who learned from the MIPS mistakes (no delay slots!). Basically, if you come from MIPS the assembly is very, very similar, as was my case.
Now that book is also available in a RISC-V edition, which has a very interesting chapter comparing all the different RISC ISAs and what they do differently (SH, Alpha, SPARC, PA-RISC, POWER, ARM, ...).
However I've been exploring AArch64 for some time and I think it has some very interesting ideas too. Maybe not as clean as RISC-V but with very pragmatic design and some choices that make me question if RISC-V was too conservative in its design.
> However I've been exploring AArch64 for some time and I think it has some very interesting ideas too. Maybe not as clean as RISC-V but with very pragmatic design and some choices that make me question if RISC-V was too conservative in its design.
Not enough people reflect on this, or the fact that it's remarkably hazy where exactly AArch64 came from and what guided the design of it.
AArch64 came from AArch32. That's why it keeps things like condition codes, which are a big mistake for large out-of-order implementations. RISC-V sensibly avoids this by having condition-and-branch instructions instead. Otherwise, RISC-V is conservative because it tries to avoid possibly encumbered techniques. But other than that it's remarkably simple and elegant.
That will be amazing when it happens, and a year is VERY soon!
Tenstorrent's first "Atlantis" Ascalon dev board is going to have a similar µarch to the Apple M1 but run at a lower clock speed. All 8 cores are "performance" cores, so it should be in the N150 ballpark single-core and soundly beat it multi-core.
They are currently saying Q2 2026, which is only 4-7 months from now.
AFAIR, AArch64 was basically designed by Apple for their A-series iPhone processors, and pushed to be the official ARM standard. Those guys really knew what they were doing and it shows.
It's clear that Arm worked with Apple on AArch64 but saying it was basically designed 'by Apple' rather than 'with Apple' is demonstrably unfair to the Arm team who have decades of experience in ISA design.
If Apple didn't need Arm then they would have probably found a way of going it alone.
Apple helped develop Arm originally and was a (very) early user with Newton. Why would they go it alone when they already had a large amount of history and familiarity available?
> That's why it keeps things like condition codes, which are a big mistake for large out-of-order implementations. RISC-V sensibly avoid this by having condition-and-branch instructions instead.
Respectfully, the statement in question is partially erroneous and, in far greater measure, profoundly misleading. A distortion draped in fragments of truth remains a falsehood nonetheless.
Whilst AArch64 does retain condition flags, it is not simply because of «AArch32 stretched to 64-bit», and condition codes are not a «big mistake» for large out-of-order (OoO) cores. AArch64 also provides compare-and-branch forms similar to RISC-V, so the contrast given is a false dichotomy.
Namely:
– «AArch64 came from AArch32» – historically AArch64 was a fresh ARMv8-A ISA design that removed many AArch32 features. It has kept flags, but discarded pervasive per-instruction predication and redesigned much of the encoding and register model;
– «Flags are a big mistake for large OoO» – global flags do create extra dependencies, yet modern cores (x86 and ARM) eliminate most of the cost with techniques such as flag renaming, out-of-order flag generation and instruction forms that avoid setting flags when unnecessary. High-IPC x86 and ARM cores demonstrate that flags are not an inherent limiter;
– «RISC-V avoids this by having condition-and-branch» – AArch64 also has condition-and-branch style forms that do not use flags, for example:
1) CBZ/CBNZ xN, label – compare register to zero and branch;
2) TBZ/TBNZ xN, #bit, label – test bit and branch.
Compilers freely choose between these and flag-based sequences, depending on what is already available and the code/data flow. Also, many arithmetic operations do not set flags unless explicitly requested, which reduces false flag dependencies.
Lastly, but not least importantly, Apple’s big cores are among the widest, deepest out-of-order designs in production, with very high IPC and excellent branch handling. Their microarchitectures and toolchains make effective use of:
– Flag-free branches where convenient – CBZ/CBNZ, TBZ/TBNZ (see above);
– Flag-setting only when it is free or beneficial – ADDS/SUBS feeding a conditional branch or CSEL;
– Advanced renaming – including flag renaming – which removes most practical downsides of a global NZCV.
I get the same impression w.r.t. RISC-V v. MIPS similarities, just from my (limited) exposure to Nintendo 64 homebrew development. Pretty striking how often I was thinking to myself “huh, that looks exactly like what I was fiddling with in Ares+Godbolt, just without the delay slots”.
Instructions are more easily added than taken away. RISC-V started with a minimum viable set of instructions to efficiently run standard C/C++ code. More instructions are being added over time, but the burden of proof is on someone proposing a new instruction to demonstrate what adding the instruction costs and how much benefit it brings and in what real-world applications.
> Instructions are more easily added than taken away.
That's not saying much, it's basically impossible to remove an instruction. Just because something is easier than impossible doesn't mean that it's easy.
And sure, from a technical perspective, it's quite easy to add new instructions to RISC-V. Anyone can draft up a spec and implement it in their core.
But if you actually want widespread adoption of a new instruction, to the point where compilers can actually emit it by default and expect it to run everywhere, that's really, really hard. First you have to prove that this instruction is worth standardizing, then debate the details and actually agree on a spec. Then you have to repeat the process and argue the extension is worth including in the next RVA profile, which is highly contentious.
Then you have to wait. Not just for the first CPUs to support that profile. You have to wait for every single processor that doesn't support that profile to become irrelevant. It might be over a decade before a compiler can safely switch on that instruction by default.
It's not THAT hard. Heck, I've done it myself. But, as I said, the burden of proof that something new is truly useful quite rightly lies with the proposer.
The ORC.B instruction in Zbb was my idea, never done anywhere before as far as anyone has been able to find. I proposed it in late 2019, it was in the ratified spec in late 2021, and implemented in the very popular JH7110 quad core 1.5 GHz SoC in the VisionFive 2 (and many others later on) that was delivered to pre-order customers in Dec 2022 / Jan 2023.
You might say that's a long time, but that's pretty fast in the microprocessor industry -- just over three years from proposal (by an individual member of RISC-V International) to mass-produced hardware.
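For readers curious what ORC.B actually does, its semantics can be sketched in a few lines of Python (the function name and test values here are mine, not from the spec text): each byte of the result is 0xFF if the corresponding source byte is nonzero, and 0x00 otherwise, which makes it handy for scanning strings a word at a time.

```python
# Sketch of ORC.B ("OR-combine, bytewise", from the Zbb extension),
# modeled for a 64-bit register: each result byte is 0xFF if the
# corresponding source byte is nonzero, and 0x00 otherwise.
def orc_b(rs: int, xlen: int = 64) -> int:
    rd = 0
    for i in range(xlen // 8):
        if (rs >> (8 * i)) & 0xFF:
            rd |= 0xFF << (8 * i)
    return rd

# The classic use case: spotting the terminating NUL byte of a C string
# one word at a time. Inverting the ORC.B result leaves 0xFF exactly at
# the zero bytes.
word = int.from_bytes(b"hi\x00\xffabcd", "little")
mask = ~orc_b(word) & (2**64 - 1)
first_zero_byte = ((mask & -mask).bit_length() - 1) // 8
print(hex(orc_b(word)))  # 0xffffffffff00ffff
print(first_zero_byte)   # 2 -- the NUL sits in byte 2
```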
Compare that to Arm who published the spec for SVE in 2016 and SVE 2 in 2019. The first time you've been able to buy an SBC with SVE was early 2025 with the Radxa Orion O6.
In contrast, the RISC-V Vector extension (RVV) 1.0 was published in late 2021 and was available on the CanMV-K230 development board in November 2023, just two years later, and in a flood of much more powerful octa-core SpacemiT K1/M1 boards (BPI-F3, Milk-V Jupiter, Sipeed LicheePi 3A, Muse Pi, DC-Roma II laptop) starting around six months later.
The question is not so much when the first CPU ships with the instruction, but when the last CPU without it stops being relevant.
It varies from instruction to instruction, but alternative code paths are expensive, and not well supported by compilers, so new instructions tend to go unused (unless you are compiling code with -march=native).
In one way, RISC-V is lucky. It's not currently widely deployed anywhere, so RVA23 should be picked up as the default target, and anything included in it will have widespread support.
But RVA23 is kind of pulling the door closed after itself. It will probably become the default target that all binary distributions will target for the next decade, and anything that didn't make it into RVA23 will have a hard time gaining adoption.
I'm confused. You appear to be against adding new instructions, but also against picking a baseline such as RVA23 and sticking with it for a long time.
Every ISA adds new instructions over time. Exactly the same considerations apply to all of them.
Some Linux distros are still built for the original AMD64 spec published in August 2000, while some now require the x86-64-v2 spec defined in 2020 but actually met by CPUs from Nehalem and Jaguar on.
The ARMv8-A ecosystem (other than Apple) seems to have been very reluctant to move past the 8.2 spec published in January 2016, even on the hardware side, and no Linux distro I'm aware of requires anything past the original October 2011 ARMv8.0-A spec.
I'm not against adding new instructions. I love new instructions, even considered trying to push for a few myself.
What I'm against is the idea that it's easy to add instructions. Or more the idea that it's a good idea to start with the minimum subset of instructions and add them later as needed.
It seems like a good idea: save yourself some upfront work, and respond to actual real-world needs rather than trying to predict them all in advance. But IMO it just doesn't work in the real world.
The fact that distros get stuck on the older spec is the exact problem that drives me mad, and it's not even their fault. For example, compilers are forced to generate some absolutely horrid ARMv8.0-A exclusive load/store loops when it comes to atomics, yet there are some excellent atomic instructions right there in ARMv8.1-A, which most ARM SoCs support.
But they can't emit them because that code would then fail on the (substantial) minority of SoCs that are stuck on ARMv8.0-A. So those wonderful instructions end up largely unused on ARMv8 android/linux, simply because they arrived 11 years ago instead of 14 years ago.
At least I can use them on my Mac, or any linux code I compile myself.
-------
There isn't really a solution. Ecosystems getting stuck on an increasingly outdated baseline is a necessary evil. It has happened to every single ecosystem to some extent or another, and it will happen to the various RISC-V ecosystems too.
I just disagree with the implication that the RISC-V approach was the right approach [1]. I think ARMv8.0-A did a much better job, including almost all the instructions you need in the very first version, if only they had included proper atomics.
[1] That is, not the right approach for creating a modern, commercially relevant ISA. RISC-V was originally intended as more of an academic ISA, so focusing on minimalism and "RISCness" was probably the best approach for that field.
It takes a heck of a lot longer if you wait until all the advanced features are ready before you publish anything at all.
I think RISC-V did pretty well to get everything in RVA23 -- which is more equivalent to ARMv9.0-A than to ARMv8.0-A -- out after RV64GC aka RVA20 in the 2nd half of 2019.
We don't know how long Arm was cooking up ARMv8 in secret before they announced it in 2011. Was it five years? Was it 10? More? It would not surprise me at all if it was kicked off when AMD demonstrated that Itanium was not going to be the only 64 bit future by starting to talk about AMD64 in 1999, publishing the spec in 2001, and shipping Opteron in April 2003 and Athlon64 five months later.
It's pretty hard to do that with an open and community-developed specification. By which I mean impossible.
I can't even imagine the mess if everyone knew RISC-V was being developed from 2015 but no official spec was published until late 2024.
I am sure it would not have the momentum that it has now.
> it's basically impossible to remove an instruction.
Of course not. You can replace an instruction with a polyfill. This will generally be a lot slower, but it won't break any code if you implement it correctly.
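As a sketch of what such a polyfill could look like, here is Zbb's cpop (population count) rebuilt from operations available in the base ISA, modeled in Python. The function name is mine, and a real polyfill would be emitted by the compiler or run in an illegal-instruction trap handler rather than look like this:

```python
# Hypothetical polyfill for Zbb's cpop (population count) using only
# shifts, ANDs, adds and one multiply -- all present in base RV64IM.
# A core lacking cpop could trap on it and run a sequence like this.
def cpop_polyfill(x: int) -> int:
    x = x - ((x >> 1) & 0x5555555555555555)                         # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)  # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                         # 8-bit sums
    return (x * 0x0101010101010101 >> 56) & 0xFF                    # sum the bytes

print(cpop_polyfill(0b1011))  # 3
```

Much slower than a native cpop, but, as said above, correctly implemented it breaks no code.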
While I agree with you, the original comment was still valuable for understanding why RISC-V has evolved the way it has and the philosophy behind the extension idea.
Also, it seems at least some of the RISC-V ecosystem is willing to be a little bit more aggressive. With Ubuntu making RVA23 its minimum profile, perhaps we will not be waiting a decade for it to become the default. RVA23 was only ratified a year ago.
My memory is a bit fuzzy but I think Patterson and Hennessy's "Computer Architecture: A Quantitative Approach" had some bits that were explicitly about RISC-V, and its similarities to MIPS. Unfortunately my copy is buried in a box somewhere so I can't get you any page numbers, but maybe someone else remembers…
Hennessy and Patterson's "Computer Architecture: A Quantitative Approach" has 6 published editions (1990, 1996?, 2003, 2006, 2011, 2019) with the 7th due November 2025. Each edition has a varying set of CPUs as examples for each chapter. For example, the various chapters in the 2nd edition have sections on the MIPS R4000 and the PowerPC 620, while the 3rd edition has sections on the Trimedia TM32, Intel P6, Intel IA-64, Alpha 21264, Sony PS2 Emotion Engine, Sun Wildfire, MIPS R4000, and MIPS R4300. From what I could figure out via web searches, the 6th edition has RISC-V in the appendix, but the 3rd through 5th editions have the MIPS R4000.
Patterson and Hennessy "Computer Organization and Design: The Hardware/Software Interface" has had 6 editions (1998, 2003, 2005, 2012, 2014, 2020) but various editions have had ARM, MIPS, and RISC-V specific editions.
For the uninitiated in AArch64, are there specific parts of it you're referring to here? Mostly what I find is that it lets you stitch common instruction combinations together, like shift + add and fancier addressing. Since the whole point of RISC-V was a RISC instruction set, these things are superfluous.
No, because `lui` results in an absolute address, not a position independent one.
The `auipc/addi` sequence results in 0x3004 + whatever the address of the `auipc` instruction itself is. If the `auipc` is at address 0 then the result will be the same.
Exactly, but the text has the same instruction sequence twice and the parent correctly indicated that the first copy should have used "lui" to illustrate the problem you mentioned and the second copy does use "auipc" to illustrate the fix you mentioned.
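The difference in this exchange can be modeled in a few lines of Python, using the 0x3004 value from the example above (function names are mine; this ignores addi's sign extension, which only matters for low 12-bit parts of 0x800 and above):

```python
# Model of the two address-forming sequences: lui produces an absolute
# address, auipc an address relative to the instruction's own pc.
MASK32 = 0xFFFFFFFF

def lui_addi(imm20: int, imm12: int) -> int:
    # lui rd, imm20 ; addi rd, rd, imm12  -> absolute address
    return ((imm20 << 12) + imm12) & MASK32

def auipc_addi(pc: int, imm20: int, imm12: int) -> int:
    # auipc rd, imm20 ; addi rd, rd, imm12 -> pc-relative address
    return (pc + (imm20 << 12) + imm12) & MASK32

print(hex(lui_addi(0x3, 0x004)))             # 0x3004 wherever the code is loaded
print(hex(auipc_addi(0x0, 0x3, 0x004)))      # 0x3004 only because pc == 0
print(hex(auipc_addi(0x1000, 0x3, 0x004)))   # 0x4004 when loaded at 0x1000
```

The lui form bakes in the absolute address; the auipc form moves with the code, which is what makes it position-independent.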
RISC-V assembly is mostly very easy (but still super tedious). The main difficulty is their insistence on unnecessarily abbreviated instruction mnemonics. `lw` instead of `load4`, `j` instead of `jump`, and so on. I don't really understand why. We aren't writing these on punch cards.
This really makes me want to try fiddling with some low-level stuff again. I studied mechatronics at uni and programmed microcontrollers in C and assembly, but have gone the webdev direction since then. Does anyone have any trusted quality resources for getting into RISC-V hardware? I'm especially interested in using Rust for this if possible - I've wanted an excuse to learn it in more depth for a while.
I have reached “intro to assembly” in my C course this week and had decided on RISC-V to bridge the gap that everyone has different CPUs and that x86-64 is a little harder to learn than MIPS32, but MIPS32 isn’t as relevant.
And here’s someone who made my course material for the subject entirely.
There will also be the visionfive 2 lite (hopefully) soon, for more risc-v capabilities. I am excited about it. Haven't looked too much into it, but from what I've heard the visionfive 2 is not too bad. It lacks a few drivers, or performance in them. We'll see. I am also curious how easily it can be used for some OS dev.
And if someone wants to get fancy and do some HW-SW-Codesign with risc-v and FPGAs, then there is the PolarFire SoC.
A lot cheaper than AMD/Xilinx products and okay-ish to use. Their dev environment feels kinda outdated, but at least it's snappy. Also the documentation feels kinda sparse, but most stuff is documented __somewhere__ - you just gotta find it.
The Microchip "Icicle" came out in late 2020 with the largest FPGA in the range, made using pre-production chips. It was several years more before you could buy the chips themselves. Digikey says it's no longer manufactured and they're just running down stocks.
The BeagleV "Fire" is much cheaper ($150) and uses one of the smallest FPGA parts in the range.
GOWIN also has RISC-V FPGA SoCs (Arora V GW5AST series)
Great work! I was wondering about this after trying out Easy6502. It would be nice to have a more visual component like Easy6502 which has a draw buffer and snake game tho :)
Making this in x86 would be fairly tricky, since there isn't the same sort of unified core (there used to be, but that was the 16-bit instruction set, which isn't how x86 is used now).
Nice project. RISC-V tools like this make learning architecture concepts much easier. It’s great to see more hands-on resources that help people move from theory to actual CPU behavior.
Even if they are not that expensive to implement, I do not use the official ABI register names, nor pseudo-instructions (I use a command to generate the code on the side).
Once RISC-V has performant silicon implementations (server/desktop/mobile/embedded), the really hard part will be to move the software stack, mostly closed source applications like video games.
And a very big warning: do NOT abuse an assembler macro-preprocessor or you will lock your code on that very assembler. On my side I have been using a simple C pre-processor to keep the door open for any other assembler, at minimal technical cost.
The games are barely moving to ARM, they won’t move to yet another architecture. Desktop games will remain x86 windows for the foreseeable future merely due to the existing catalogue.
The only exception is console games, where the architecture doesn’t matter anyway.
Yep, the super hard part: this rv64 "catalog" is empty.
If you put emulation in place (hardware accelerated or not), game devs won't compile for rv64. Look at how valve proton is hurting native elf/linux gaming: game devs hardly build for native elf/linux anymore (though this may be related to the toxic behavior of some glibc and gcc devs), even though their game engines have everything needed for linux support.
It's also about how easily userland can break things. Windows backwards compatibility tends to stay relatively stable (even if it's not 100% successful). It's kind of funny that the most compatible way to distribute Linux binaries for games is to target Proton/Wine.
proton/wine has no official technical support hence illegal in tons of countries.
Since proton/wine is unreliable in time, this is a disaster.
And there is a lot of BS around that: if some devs support their game on elf/linux via proton, they will have to QA it on a linux system anyway, so it does not change that much, even less with a game engine which already has everything for linux... it only adds the ultra-heavy-complex and bug-inducing proton/wine layer... a horrible mess to debug. One unfortunate patch, proton/wine side or game side, and compat is gone, and you are toast... and those patches do happen.
Conclusion: the only sane way is PROTON = 0 BUCKS.
I play only F2P games with proton (mostly gachas), no way I'll pay bucks for a game without technical support.
Valve should allow paying for games _only_ when they have official elf/linux/proton support (aka the game devs do QA on elf/linux with valve proton... which would be no better than stupid if their game engine already has elf/linux support built in). Why not let elf/linux users play all games which do not have official elf/linux support for free - well, those which run, and run decently, and until an unfortunate patch...
Within the basic "123" ASM demo, I get that x10 becomes 0x00000123, as we are taking the value of x0 and adding 0x123 to it, but why is the sp (x2) register at 0x40100000?
What is that sp? Is it important? Why isn't that at 0x000000? Why isn't that explained? That's when I get lost.
[0]: https://en.wikipedia.org/wiki/Core_War
Why wouldn't someone just do this with their brain?
Probably closer to RISC-1 which shouldn't be surprising given Patterson's role in both - as Patterson himself sets out:
https://aspire.eecs.berkeley.edu/2017/06/how-close-is-risc-v...
https://thechipletter.substack.com/p/risc-on-a-chip-david-pa...
Yeah the problem with having flags is demonstrated by multiple very high performance implementations of arm64 and x86, while risc-v has exactly zero.
The time in which you will be able to truthfully say that is very rapidly coming to an end.
RVA23 hopefully.
It looks a lot like Zeno's paradox of RISC-V implementation.
I wish this were true, but we are more than a year away from a consumer RISC-V chip that can beat my Intel N150 mini PC.
Sorry, Apple didn’t help to develop ARM originally. They were an early investor and customer of Advanced RISC Machines when it was spun out of Acorn.
How are you defining "large"? Apple seems to do pretty well with the M-series.
RISC-V’s variable instruction length (since compression is required to have decent density) is a bigger problem for wide designs.
Not insurmountable, as evidenced by recent AMDs. But still a limitation.
I get the same impression w.r.t. RISC-V v. MIPS similarities, just from my (limited) exposure to Nintendo 64 homebrew development. Pretty striking how often I was thinking to myself “huh, that looks exactly like what I was fiddling with in Ares+Godbolt, just without the delay slots”.
Instructions are more easily added than taken away. RISC-V started with a minimum viable set of instructions to efficiently run standard C/C++ code. More instructions are being added over time, but the burden of proof is on someone proposing a new instruction to demonstrate what adding the instruction costs and how much benefit it brings and in what real-world applications.
> Instructions are more easily added than taken away.
That's not saying much, it's basically impossible to remove an instruction. Just because something is easier than impossible doesn't mean that it's easy.
And sure, from a technical perspective, it's quite easy to add new instructions to RISC-V. Anyone can draft up a spec and implement it in their core.
But if you actually want wide-spread adoption of a new instruction, to the point where compilers can actually emit it by default and expect it to run everywhere, that's really, really hard. First you have to prove that this instruction is worthwhile standardizing, then debate the details and actually agree on a spec. Then you have to repeat the process and argue the extension is worth including in the next RVA profile, which is highly contentious.
Then you have to wait. Not just for the first CPUs to support that profile. You have to wait for every single processor that doesn't support that profile to become irrelevant. It might be over a decade before a compiler can safely switch on that instruction by default.
It's not THAT hard. Heck, I've done it myself. But, as I said, the burden of proof that something new is truly useful quite rightly lies with the proposer.
The ORC.B instruction in Zbb was my idea, never done anywhere before as far as anyone has been able to find. I proposed it in late 2019, it was in the ratified spec in later 2021, and implemented in the very popular JH7110 quad core 1.5 GHz SoC in the VisionFive 2 (and many others later on) that was delivered to pre-order customers in Dec 2022 / Jan 2023.
You might say that's a long time, but that's pretty fast in the microprocessor industry -- just over three years from proposal (by an individual member of RISC-V International) to mass-produced hardware.
Compare that to Arm who published the spec for SVE in 2016 and SVE 2 in 2019. The first time you've been able to buy an SBC with SVE was early 2025 with the Radxa Orion O6.
In contrast RISC-V Vector extension (RVV) 1.0 was published in late 2021 and was available on the CanMV-K230 development board in November 2023, just two years later, and in a flood of much more powerful octa-core SpacemiT K1/M1 boards (BPI-F3, Milk-V Jupiter, Sipeed LicheePi 3A, Muse Pi, DC-Roma II laptop) starting around six months later.
The question is not so much when the first CPU ships with the instruction, but when the last CPU without it stops being relevant.
It varies from instruction to instruction, but alternative code paths are expensive, and not well supported by compilers, so new instructions tend to go unused (unless you are compiling code with -march=native).
In one way, RISC-V is lucky. It's not that currently widely deployed anywhere, so RVA23 should be picked up as the default target, and anything included in it will have widespread support.
But RVA23 is kind of pulling the door shut behind itself. It will probably become the default target for all binary distributions for the next decade, and anything that didn't make it into RVA23 will have a hard time gaining adoption.
I'm confused. You appear to be against adding new instructions, but also against picking a baseline such as RVA23 and sticking with it for a long time.
Every ISA adds new instructions over time. Exactly the same considerations apply to all of them.
Some Linux distros are still built for the original AMD64 spec published in August 2000, while some now require the x86-64-v2 spec, defined in 2020 but actually met by CPUs from Nehalem and Jaguar onward.
The ARMv8-A ecosystem (other than Apple) seems to have been very reluctant to move past the 8.2 spec published in January 2016, even on the hardware side, and no Linux distro I'm aware of requires anything past the original October 2011 ARMv8.0-A spec.
I'm not against adding new instructions. I love new instructions, even considered trying to push for a few myself.
What I'm against is the idea that it's easy to add instructions. Or more the idea that it's a good idea to start with the minimum subset of instructions and add them later as needed.
It seems like a good idea: save yourself some upfront work, and respond to actual real-world needs rather than trying to predict them all in advance. But IMO it just doesn't work in the real world.
The fact that distros get stuck on the older spec is exactly the problem that drives me mad, and it's not even their fault. For example, compilers are forced to generate some absolutely horrid ARMv8.0-A exclusive load/store loops when it comes to atomics, yet there are excellent atomic instructions right there in ARMv8.1-A, which most ARM SoCs support.
But they can't emit them because that code would then fail on the (substantial) minority of SoCs that are stuck on ARMv8.0-A. So those wonderful instructions end up largely unused on ARMv8 android/linux, simply because they arrived 11 years ago instead of 14 years ago.
At least I can use them on my Mac, or any linux code I compile myself.
-------
There isn't really a solution. Ecosystems getting stuck on an increasingly outdated baseline is a necessary evil. It has happened to every single ecosystem to some extent or another, and it will happen to the various RISC-V ecosystems too.
I just disagree with the implication that the RISC-V approach was the right approach [1]. I think ARMv8.0-A did a much better job, including almost all the instructions you need in the very first version, if only they had included proper atomics.
[1] That is, not the right approach for creating a modern, commercially relevant ISA. RISC-V was originally intended as more of an academic ISA, so focusing on minimalism and "RISCness" was probably the best approach for that field.
It takes a heck of a lot longer if you wait until all the advanced features are ready before you publish anything at all.
I think RISC-V did pretty well to get everything in RVA23 -- which is more equivalent to ARMv9.0-A than to ARMv8.0-A -- out after RV64GC aka RVA20 in the 2nd half of 2019.
We don't know how long Arm was cooking up ARMv8 in secret before they announced it in 2011. Was it five years? Was it 10? More? It would not surprise me at all if it was kicked off when AMD demonstrated that Itanium was not going to be the only 64 bit future by starting to talk about AMD64 in 1999, publishing the spec in 2001, and shipping Opteron in April 2003 and Athlon64 five months later.
It's pretty hard to do that with an open and community-developed specification. By which I mean impossible.
I can't even imagine the mess if everyone knew RISC-V was being developed from 2015 but no official spec was published until late 2024.
I am sure it would not have the momentum that it has now.
When were these extensions available and used by popular compilers for higher level languages?
it's basically impossible to remove an instruction.
Of course not. You can replace an instruction with a polyfill. This will generally be a lot slower, but it won't break any code if you implement it correctly.
It'll continue to take instruction encoding space.
While I agree with you, the original comment was still valuable for understanding why RISC-V has evolved the way it has and the philosophy behind the extension idea.
Also, it seems at least some of the RISC-V ecosystem is willing to be a little more aggressive. With Ubuntu making RVA23 its minimum profile, perhaps we will not be waiting a decade for it to become the default. RVA23 was only ratified a year ago.
> it's basically impossible to remove an instruction
laughs and/or cries in one of the myriad OISC ISAs
My memory is a bit fuzzy but I think Patterson and Hennessy‘s “Computer Architecture: A Quantitative Approach” had some bits that were explicitly about RISC-V, and similarities to MIPS. Unfortunately my copy is buried in a box somewhere so I can’t get you any page numbers, but maybe someone else remembers…
Hennessy and Patterson's "Computer Architecture: A Quantitative Approach" has 6 published editions (1990, 1996?, 2003, 2006, 2011, 2019) with the 7th due November 2025. Each edition uses a varying set of CPUs as examples for each chapter. For example, the various chapters in the 2nd edition have sections on the MIPS R4000 and the PowerPC 620, while the 3rd edition has sections on the Trimedia TM32, Intel P6, Intel IA-64, Alpha 21264, Sony PS2 Emotion Engine, Sun Wildfire, MIPS R4000, and MIPS R4300. From what I could figure out via web searches, the 6th edition has RISC-V in the appendix, but the 3rd through 5th editions have the MIPS R4000.
Patterson and Hennessy's "Computer Organization and Design: The Hardware/Software Interface" has had 6 editions (1998, 2003, 2005, 2012, 2014, 2020), and various editions have come in ARM-, MIPS-, and RISC-V-specific versions.
My copy sure doesn't ... it was published in 1992, almost 20 years before anyone got an idea to make a new ISA called "RISC-V".
For the uninitiated in AArch64, are there specific parts of it you're referring to here? Mostly what I find is that it lets you stitch common instruction combinations together, like shift + add and fancier addressing. Since the whole point of RISC-V was a RISC instruction set, these things are superfluous.
RISC-V has shift+add instructions as part of the Zba extension. Zba is part of B, so it's included in many recent RISC-V profiles.
Do you have a link to the risc-v version? I have the MIPS version and want to pick up the risc-v version.
https://www.amazon.com/Computer-Organization-Design-RISC-V-A...
https://github.com/triilman25/tcp-socket-in-riscv-assembly
I wrote a TCP socket for RISC-V using the RV64I ISA. You have to know about linker relaxation and how to use it. Some of the references I used are attached there.
I think there's an error in the Position Independence section:
The text says that this should result in 0x3004; was this example intended to use `lui`?

No, because `lui` results in an absolute address, not a position independent one.
The `auipc/addi` sequence results in 0x3004 + whatever the address of the `auipc` instruction itself is. If the `auipc` is at address 0 then the result will be the same.
Exactly, but the text has the same instruction sequence twice and the parent correctly indicated that the first copy should have used "lui" to illustrate the problem you mentioned and the second copy does use "auipc" to illustrate the fix you mentioned.
Oh, fair enough.
I have to praise the interactive style of this content.
As a C/C++ dev, I've always thought assembly was much harder. But this interactive content makes assembly clearer.
RISC-V assembly is mostly very easy (but still super tedious). The main difficulty is their insistence on unnecessarily abbreviated instruction mnemonics. `lw` instead of `load4`, `j` instead of `jump`, and so on. I don't really understand why. We aren't writing these on punch cards.
This really makes me want to try fiddling with some low-level stuff again. I studied mechatronics at uni and programmed microcontrollers in C and assembly, but have gone the webdev direction since then. Does anyone have any trusted quality resources for getting into RISC-V hardware? I'm especially interested in using Rust for this if possible - I've wanted an excuse to learn it in more depth for a while.
Since Rust itself doesn't get you to the hardware, here is a nice tutorial on some simple OS basics: https://operating-system-in-1000-lines.vercel.app/en/
There was also some Rust specific OS tutorial somewhere that was written nicely, but I can't find it right now.
Also if you want to get real hardware, there is the neorv32 that has nice documentation: https://stnolting.github.io/neorv32/ https://github.com/stnolting/neorv32
It's a risc-v core written in VHDL
The Rust OS tutorial you're thinking of is probably this one: https://os.phil-opp.com/
Thank you both! These are some really interesting places to pick up from.
> getting into RISC-V hardware?
Got myself a https://docs.banana-pi.org/en/BPI-F3/BananaPi_BPI-F3 in May last year for 90€. Tinkered again few weeks ago https://bsky.app/profile/benetou.fr/post/3m2m62st3hk2w
> I'm especially interested in using Rust for this if possible
I didn't tinker with Rust on it but if I had to I'd probably try via https://github.com/dockcross/dockcross/tree/master/linux-ris... and avoid compiling on the device itself, unless it's basically a HelloWorld project.
> Subtracting from zero is negation. What’s the negative of 0x123?
It is 0xfffffedd.
> Hmm, we get 0xfffffccd.
No, we didn't. The emulator shows 0xfffffedd, and I've checked it manually. The emulator is right.
This looks great. I like how it starts with a dump of all the instructions we'll be using.
Does anyone know of a complete list, machine readable? e.g.
instructions = [{"name": "lui", "description": "load upper immediate", "args": [...]}, ...]
The gods have spoken.
I have reached “intro to assembly” in my C course this week and had decided on RISC-V to bridge the gap that everyone has different CPUs and that x86-64 is a little harder to learn than MIPS32, but MIPS32 isn’t as relevant.
And here’s someone who made my course material for the subject entirely.
Thank you so much.
RISC-V - you can easily get the hardware: https://www.raspberrypi.com/news/risc-v-on-raspberry-pi-pico...
The VisionFive 2 Lite should also (hopefully) be available soon, for more RISC-V capabilities. I am excited about it. Haven't looked too much into it, but from what I've heard the VisionFive 2 is not too bad; it lacks a few drivers, or performance in the ones it has. We'll see. I am also curious how easily it can be used for some OS dev.
https://www.kickstarter.com/projects/starfive/visionfive-2-l...
And if someone wants to get fancy and do some HW-SW co-design with RISC-V and FPGAs, there is the PolarFire SoC. A lot cheaper than AMD/Xilinx products and okay-ish to use. Their dev environment feels kinda outdated, but at least it's snappy. Also the documentation feels kinda sparse, but most stuff is documented __somewhere__ - you just gotta find it.
https://www.microchip.com/en-us/development-tool/MPFS-DISCO-...
Fun fact: the dev board costs less than the chip itself. (Apparently that's often the case, but this is the first time I noticed it.)
There is more than one dev board.
The Microchip "Icicle" came out in late 2020 with the largest FPGA in the range, made using pre-production chips. It was several years more before you could buy the chips themselves. Digikey says it's no longer manufactured and they're just running down stocks.
The BeagleV "Fire" is much cheaper ($150) and uses one of the smallest FPGA parts in the range.
GOWIN also has RISC-V FPGA SoCs (Arora V GW5AST series)
It'd probably be easier to get on an ESP-32, no? Those are arguably more widely available.
If you're looking for a nice RISC-V emulator, try RARS: https://github.com/TheThirdOne/rars
Tangentially, I want to link back to this: https://github.com/mortbopet/Ripes
Great work! I was wondering about this after trying out Easy6502. It would be nice to have a more visual component like Easy6502 which has a draw buffer and snake game tho :)
I like this. Do you have a link to your simulator code? I might borrow for a personal project of mine if it's ok.
https://github.com/dramforever/easyriscv/blob/main/emulator....
Has anyone seen anything similar to this for x86?
Making this for x86 would be fairly tricky, since there isn't the same sort of unified core (there used to be, but that was the 16-bit instruction set, which isn't how x86 is used now).
Nice project. RISC-V tools like this make learning architecture concepts much easier. It’s great to see more hands-on resources that help people move from theory to actual CPU behavior.
Even if they are not that expensive to implement, I do not use the official ABI register names, nor pseudo-instructions (I use a command to generate the code on the side).
Once RISC-V has performant silicon implementations (server/desktop/mobile/embedded), the really hard part will be to move the software stack, mostly closed source applications like video games.
And a very big warning: do NOT abuse an assembler's macro-preprocessor or you will lock your code to that very assembler. On my side I have been using the plain C preprocessor, keeping the door open for any other assembler at minimal technical cost.
The games are barely moving to ARM, they won’t move to yet another architecture. Desktop games will remain x86 windows for the foreseeable future merely due to the existing catalogue.
The only exception is console games, where the architecture doesn’t matter anyway.
Yep, the super hard part: this rv64 "catalog" is empty.
If you provide emulation (hardware accelerated or not), game devs won't compile for rv64. Look at how Valve's Proton is hurting native elf/linux gaming: game devs hardly build for native elf/linux anymore (though this may be related to the toxic behavior of some glibc and gcc devs), even though their game engines have everything needed for Linux support.
It's also about how easily userland can break things. Windows backwards compatibility tends to stay relatively stable (even if it's not 100% successful). It's kind of funny that the most compatible way to distribute Linux binaries for games is to target Proton/Wine.
proton/wine has no official technical support hence illegal in tons of countries.
Since proton/wine is unreliable in time, this is a disaster.
And there is a lot of BS around that: if some devs support their game on elf/linux via Proton, they will have to QA it on a Linux system anyway, so it does not change that much, even less with a game engine that already has Linux support... it only adds the ultra-heavy, complex, bug-inducing Proton/Wine layer... a horrible mess to debug. One unfortunate patch, on the Proton/Wine side or the game side, and compat is gone, and you are toast... and those patches do happen.
Conclusion: the only sane way is PROTON = 0 BUCKS.
I play only F2P games with proton (mostly gachas), no way I'll pay bucks for a game without technical support.
Valve should allow charging for games _only_ when they have official elf/linux/proton support (i.e. the game devs do QA on elf/linux with Valve's Proton... which would be silly if their game engine already has elf/linux support built in). Why not let elf/linux users play all games without official elf/linux support for free? Well, those which run, and run decently, and only until an unfortunate patch...
Within the basic "123" ASM demo, I get that x10 becomes 0x00000123, since we take the zero in x0 and add 0x123 to it. But why is the sp (x2) register at 0x40100000?
What is that sp? Is it important? Why isn't it at 0x00000000? Why isn't that explained? That's where I get lost.
'sp' is the 'stack pointer' register. There's an explanation of the stack later in the guide: https://dramforever.github.io/easyriscv/#the-stack