Write a compiler (on paper) for a SUBSET of the RISC-V processor assembly
language.
To do list:
1) write a grammar for the SUBSET of RISC-V (we will name it SRISC)
defined below.
2) make the grammar of SRISC suitable for LL(1) parsing
3) compute the FIRST and FOLLOW sets of the SRISC grammar in (2)
4) draw the parsing table
5) write a parser (pseudo code)
6) (optional) implement a real, runnable parser in your chosen
language.
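For step 3, FIRST sets are usually computed by repeating the rules until nothing changes. The sketch below shows that idea in Python on a hypothetical two-rule toy grammar (TOY), which is not the SRISC grammar you are asked to write. The names first_sets, EPS and TOY are assumptions made for this illustration only.

```python
EPS = "ε"  # marker for the empty string (assumed notation)

def first_sets(grammar):
    """Compute FIRST sets by fixed-point iteration.

    grammar maps each nonterminal to a list of productions; each
    production is a list of symbols, and [] stands for epsilon.
    Any symbol that is not a key of grammar is treated as a terminal.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for production in productions:
                before = len(first[nt])
                all_nullable = True
                for symbol in production:
                    if symbol in grammar:
                        # nonterminal: copy its FIRST set without epsilon
                        first[nt] |= first[symbol] - {EPS}
                        if EPS not in first[symbol]:
                            all_nullable = False
                            break
                    else:
                        # terminal: it starts the production
                        first[nt].add(symbol)
                        all_nullable = False
                        break
                if all_nullable:
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first

# Hypothetical toy grammar:
#   Lines -> Line Lines | ε
#   Line  -> add | sub
TOY = {
    "Lines": [["Line", "Lines"], []],
    "Line": [["add"], ["sub"]],
}

if __name__ == "__main__":
    for nonterminal, symbols in first_sets(TOY).items():
        print(nonterminal, sorted(symbols))
```

FOLLOW sets can be computed with the same repeat-until-stable loop once the FIRST sets are known.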
The parser checks whether the input SRISC assembly language file is in
the language of the grammar. It replies with yes/no.
You do not need to generate any parse tree.
You can read about RISC-V assembly here: (link)
You can see the instruction set (subset) of RISC-V here: (link)
add x2, x3, x4
there are 32 registers named x0..x31
the last argument can be a constant (called "immediate" in assembly
language)
add x2, x3, 10
instructions in this group are: add, sub, and, or
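The operand rules above (a register x0..x31, or a constant as the last argument) can be checked at the token level. The following Python sketch is one possible way to do it. It assumes that immediates are unsigned decimal numbers, as in the example "10"; the handout does not say whether negative or hexadecimal constants are allowed. The helper names are hypothetical.

```python
import re

# x0..x31 only: one digit, or 10..29, or 30..31 (no leading zeros)
REGISTER = re.compile(r"x(?:[12]?\d|3[01])")
# Assumption: an immediate is a plain unsigned decimal number
IMMEDIATE = re.compile(r"\d+")

def is_register(token):
    """Return True if token names one of the 32 registers."""
    return REGISTER.fullmatch(token) is not None

def is_immediate(token):
    """Return True if token is an unsigned decimal constant."""
    return IMMEDIATE.fullmatch(token) is not None

if __name__ == "__main__":
    for token in ["x0", "x31", "x32", "10", "x"]:
        print(token, is_register(token), is_immediate(token))
```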
ld x5, 40(x6)
sd x5, 20(x4)
instructions in this group are: ld, sd, lw, sw
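The second operand of this group has the form offset(register), for example 40(x6). As a rough illustration, the Python sketch below splits such an operand into its two parts. It assumes the offset is an unsigned decimal number with no spaces inside the operand; parse_mem_operand is a hypothetical helper name.

```python
import re

# offset "(" register ")", with the register limited to x0..x31
MEM_OPERAND = re.compile(r"(\d+)\((x(?:[12]?\d|3[01]))\)")

def parse_mem_operand(text):
    """Return (offset, register) for text like '40(x6)', else None."""
    match = MEM_OPERAND.fullmatch(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

if __name__ == "__main__":
    print(parse_mem_operand("40(x6)"))
    print(parse_mem_operand("20(x32)"))
```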
beq x5, x6, L3
the labels are L0..L10
instructions in this group are: beq, bne, blt, bge
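Because only the labels L0..L10 exist, a label can be recognised with a small pattern. The Python sketch below is one way to express that restriction; is_label is a hypothetical name, and rejecting forms such as L01 is an assumption.

```python
import re

# L0..L9 or L10; "10" is tried first so that L10 matches as a whole
LABEL = re.compile(r"L(?:10|\d)")

def is_label(token):
    """Return True if token is one of the labels L0..L10."""
    return LABEL.fullmatch(token) is not None

if __name__ == "__main__":
    for token in ["L0", "L3", "L10", "L11", "L01"]:
        print(token, is_label(token))
```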
.code              <-- beginning of the code
    add x1,x2,x3
    ....
    beq x1,x2,L5
    ...
L5: sub x3,x3,1    <-- label appears at the beginning with suffix ":"
    ...
.end               <-- end of the code
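Before parsing, the input text is usually split into tokens. The Python sketch below is a minimal tokenizer for the layout shown above. It assumes that whitespace and newlines only separate tokens and are not tokens themselves, and it does not handle comments. The names TOKEN and tokenize are hypothetical.

```python
import re

# Directives, names (mnemonics, registers, labels), numbers, punctuation
TOKEN = re.compile(r"\.code|\.end|[A-Za-z][A-Za-z0-9]*|\d+|[,():]")

def tokenize(source):
    """Split SRISC source text into a flat list of token strings."""
    tokens = []
    pos = 0
    while True:
        # skip spaces, tabs and newlines between tokens
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos == len(source):
            return tokens
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        tokens.append(match.group())
        pos = match.end()

if __name__ == "__main__":
    print(tokenize(".code\nL5: sub x3,x3,1\n.end"))
```

A parser can then read this list one token at a time, which matches the single-token lookahead that LL(1) parsing needs.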
Submit your work for 1)...5).
If you want to do 6), you can use code assistant tools to do so. I
will provide some code examples for you to try.
(you do not have to handle comments)
C code
int sum10(){
    int i, s;
    s = 0;
    for(i=0; i<10; i++)
        s += i;
    return s;
}
compiled and modified for SRISC
use x2 for i, x3 for s, x4 temp
.code
    add x2,x0,x0    ; i = 0
    add x3,x0,x0    ; s = 0
    add x4,x0,10    ; x4 = 10
L5: bge x2,x4,L4    ; i >= 10 exit
    add x3,x3,x2    ; s += i
    add x2,x2,1     ; i++
    beq x0,x0,L5
L4:                 ; result in x3
.end
Footnote: Thanks to Jirameth Wannasiwapor for pointing out an error in
my code.
last update 6 March 2026