Midterm Compiler Project


Write a compiler (on paper) for a SUBSET of the RISC-V processor assembly language.

To do list:
1)  write a grammar for the SUBSET of RISC-V defined below (we will name it SRISC)
2)  make the SRISC grammar suitable for LL(1) parsing
3)  compute the FIRST and FOLLOW sets of the SRISC grammar from (2)
4)  draw the parsing table
5)  write a parser (pseudo code)
6)  (optional)  implement a real (runnable) parser in your chosen language
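
As an illustration of step 3), here is a minimal sketch of the usual fixed-point computation of FIRST and FOLLOW sets. The grammar in it is a toy grammar for a comma-separated list, chosen only for illustration; it is NOT the SRISC grammar. The names GRAMMAR, EPS, first_of_seq, compute_first and compute_follow are hypothetical.

```python
EPS = "eps"  # marker for the empty string (hypothetical name)

# Toy grammar, NOT the SRISC grammar: list -> item rest, rest -> , item rest | eps
GRAMMAR = {
    "list": [["item", "rest"]],
    "rest": [[",", "item", "rest"], [EPS]],
    "item": [["reg"], ["num"]],
}
START = "list"


def first_of_seq(seq, first):
    """FIRST of a sequence of symbols, given the current FIRST sets."""
    out = set()
    for sym in seq:
        if sym == EPS:
            continue
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out          # this symbol cannot vanish, so stop here
    out.add(EPS)                # every symbol in seq can derive the empty string
    return out


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:              # repeat until no set grows any more
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                new = first_of_seq(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")      # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue            # only nonterminals have FOLLOW sets
                    tail = first_of_seq(prod[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:         # the rest can vanish: inherit FOLLOW(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
```

For your own work, replace the toy grammar with your LL(1) version of the SRISC grammar and check the result against your hand computation.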

The parser checks whether the code in the input SRISC assembly language file is in the language of the grammar.  It replies with yes/no.  You do not need to generate a parse tree.
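
A yes/no parser of this kind can be sketched as a table-driven LL(1) recognizer: a stack, a lookahead token, and a parsing table. The sketch below uses a toy table for a comma-separated list of "reg" and "num" tokens, only to show the mechanism; it is NOT the SRISC table. TABLE, NONTERMINALS and accepts are hypothetical names.

```python
# Toy LL(1) table, NOT the SRISC table: (nonterminal, lookahead) -> right-hand side
TABLE = {
    ("list", "reg"): ["item", "rest"],
    ("list", "num"): ["item", "rest"],
    ("rest", ","): [",", "item", "rest"],
    ("rest", "$"): [],                  # rest -> empty string
    ("item", "reg"): ["reg"],
    ("item", "num"): ["num"],
}
NONTERMINALS = {"list", "rest", "item"}


def accepts(tokens, start="list"):
    """Return True if the token sequence is in the language, else False."""
    tokens = list(tokens) + ["$"]       # append the end marker
    stack = ["$", start]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False            # empty table cell: syntax error
            stack.extend(reversed(rhs)) # leftmost symbol ends up on top
        elif top == look:
            pos += 1                    # matched a terminal or the end marker
        else:
            return False
    return pos == len(tokens)           # all input must be consumed


print("yes" if accepts(["reg", ",", "num"]) else "no")
```

No parse tree is built; the function only reports whether the input is accepted.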

You can read about RISC-V assembly here: (link)
You can see the instruction set (subset) of RISC-V here:  (link)

SUBSET of RISC-V assembly language

basic arithmetic and logic

add x2, x3, x4

there are 32 registers named x0..x31
the last argument can be a constant (called "immediate" in assembly language)

add x2, x3, 10

instructions in this group are:  add, sub, and, or
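
At the token level, the two kinds of operands described above might be recognized as in the sketch below. It assumes that an immediate is an unsigned decimal integer and that register numbers have no leading zeros; the text does not specify either point. REGISTER, IMMEDIATE and operand_kind are hypothetical names.

```python
import re

# x0..x31, with no leading zeros (assumption)
REGISTER = re.compile(r"x(?:[12]?[0-9]|3[01])")
# unsigned decimal constant (assumption: the text does not define the format)
IMMEDIATE = re.compile(r"[0-9]+")


def operand_kind(text):
    """Classify an operand as 'reg', 'imm', or None if it is neither."""
    if REGISTER.fullmatch(text):
        return "reg"
    if IMMEDIATE.fullmatch(text):
        return "imm"
    return None
```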

move data

ld x5, 40(x6)
sd x5, 20(x4)


instructions in this group are: ld, sd, lw, sw
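
The second operand of these instructions has the form offset(base register). A minimal sketch of checking that shape is given below, assuming the offset is always present and is an unsigned decimal integer (the text shows only this form). MEM_OPERAND and parse_mem_operand are hypothetical names.

```python
import re

# offset(base), e.g. 40(x6); the base must be one of x0..x31
MEM_OPERAND = re.compile(r"([0-9]+)\((x(?:[12]?[0-9]|3[01]))\)")


def parse_mem_operand(text):
    """Return (offset, base_register), or None if the text is not offset(base)."""
    m = MEM_OPERAND.fullmatch(text)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```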

branch

beq x5, x6, L3

the labels are L0..L10

instructions in this group are: beq, bne, blt, bge
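
The three instruction groups and the label names can be kept in small lookup structures, as in the sketch below. The group names "arith", "mem" and "branch", and the helpers group_of and is_label, are hypothetical; the mnemonics and the label range come from the text.

```python
import re

# Mnemonics by group, as listed in the text; the group names are made up here
GROUPS = {
    "arith": {"add", "sub", "and", "or"},
    "mem": {"ld", "sd", "lw", "sw"},
    "branch": {"beq", "bne", "blt", "bge"},
}

# L0..L10 only, with no leading zeros (assumption)
LABEL = re.compile(r"L(?:[0-9]|10)")


def group_of(mnemonic):
    """Return the group name of a mnemonic, or None if it is not in SRISC."""
    for name, members in GROUPS.items():
        if mnemonic in members:
            return name
    return None


def is_label(text):
    """Return True if text is one of the labels L0..L10."""
    return LABEL.fullmatch(text) is not None
```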

The structure of an assembly language file is as follows:


.code        <-- beginning of the code
add x1,x2,x3
....
beq x1,x2,L5
...
L5:  sub x3,x3,1  <-- label appears at the beginning with suffix ":"
...

.end         <-- end of the code
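
Before parsing, a file with this structure has to be cut into tokens. The sketch below is one possible tokenizer, assuming that whitespace (including newlines) only separates tokens and that the "<--" annotations above are explanations, not part of the file. TOKEN and tokenize are hypothetical names.

```python
import re

# One token: a directive (.code, .end), a word (mnemonic, register, label),
# a number, or a single punctuation character. Leading whitespace is skipped.
TOKEN = re.compile(r"\s*(\.[a-z]+|[A-Za-z][A-Za-z0-9]*|[0-9]+|[,():])")


def tokenize(source):
    """Split SRISC source text into a list of token strings."""
    tokens = []
    pos = 0
    source = source.rstrip()            # ignore trailing whitespace
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


program = """.code
add x1,x2,x3
L5:  sub x3,x3,1
.end
"""
print(tokenize(program))
```

If your grammar treats the end of a line as significant, the tokenizer would need to emit a newline token as well; this sketch does not.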


Submit your work for items 1) to 5).
If you want to do 6), you may use code assistant tools.  I will provide some example code for you to try.

(you do not have to handle comments)

Example code

C code
int sum10(){
    int i, s;
    s = 0;
    for(i=0; i<10; i++)
        s += i;
    return s;
}


compiled and modified for SRISC

use x2 for i, x3 for s, x4 as a temporary

.code
        add     x2,x0,x0   ; i = 0
        add     x3,x0,x0   ; s = 0
        add     x4,x0,10   ; x4 = 10  
L5:
        bge     x2,x4,L4   ; i >= 10 exit   
        add     x3,x3,x2   ; s += i
        add     x2,x2,1    ; i++
        beq     x0,x0,L5  

L4:                           ; result in x3
.end
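
The example above carries ";" comments, while the parser itself does not have to handle comments. If you want to feed such a file to a real parser in 6), one option is to strip the comments first, as in this small sketch. It assumes a comment runs from ";" to the end of the line, as in the example; strip_comments is a hypothetical name.

```python
def strip_comments(source):
    """Remove everything from ';' to the end of each line."""
    lines = []
    for line in source.splitlines():
        code = line.split(";", 1)[0]    # keep only the part before the comment
        lines.append(code.rstrip())
    return "\n".join(lines)


example = "        add     x2,x0,x0   ; i = 0\nL4:                           ; result in x3"
print(strip_comments(example))
```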

Footnote:  Thanks to Jirameth Wannasiwapor for pointing out an error in my code.

last update 6 March 2026