RISC-V Instruction Formats
The assembler translates an assembly language program (.s) into a machine language program (.o).
The 6 instruction formats:
R-Format: instructions using 3 registers (two source registers and one destination register).
I-Format: instructions with immediates, loads.
S-Format: store instructions: sw, sb.
SB-Format: branch instructions: beq, bge.
U-Format: instructions with upper immediates.
UJ-Format: the jump instruction: jal.
All RV32 R-format instructions:
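As a rough illustration of how the R-format fields pack into a 32-bit word, here is a minimal Python sketch. The helper name `encode_r_type` and the `R_FUNCT` table are made up for this example; the field values follow the RV32I base encoding (opcode 0110011).

```python
# Hypothetical helper for illustration: pack the R-format fields into a 32-bit word.
R_OPCODE = 0b0110011  # opcode shared by the RV32I register-register instructions

# (funct7, funct3) pairs for the ten RV32I base R-format instructions.
R_FUNCT = {
    "add":  (0b0000000, 0b000),
    "sub":  (0b0100000, 0b000),
    "sll":  (0b0000000, 0b001),
    "slt":  (0b0000000, 0b010),
    "sltu": (0b0000000, 0b011),
    "xor":  (0b0000000, 0b100),
    "srl":  (0b0000000, 0b101),
    "sra":  (0b0100000, 0b101),
    "or":   (0b0000000, 0b110),
    "and":  (0b0000000, 0b111),
}


def encode_r_type(name, rd, rs1, rs2):
    """Return the 32-bit encoding of `name rd, rs1, rs2`."""
    funct7, funct3 = R_FUNCT[name]
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError("register numbers must be in 0..31")
    # Field layout, high to low: funct7 | rs2 | rs1 | funct3 | rd | opcode
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | R_OPCODE)


print(hex(encode_r_type("add", 10, 11, 12)))  # 0xc58533
print(hex(encode_r_type("sub", 10, 11, 12)))  # 0x40c58533
```

Note that add and sub differ only in one funct7 bit, as do srl and sra.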
Compared to R-format, I-Format differs in that the funct7 and rs2 fields are replaced with a 12-bit immediate, imm[11:0], which is sign-extended to 32 bits before use.
All RISC-V I-Type arithmetic instructions:
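The I-format layout can be sketched the same way. The function name `encode_i_type` is hypothetical; the opcode and funct3 values used in the usage lines (addi: opcode 0010011, funct3 000; lw: opcode 0000011, funct3 010) come from the RV32I base encoding.

```python
# Hypothetical helper for illustration: pack the I-format fields into a 32-bit word.
def encode_i_type(opcode, funct3, rd, rs1, imm):
    """Return the encoding for an I-format instruction with a signed 12-bit imm."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-format immediate must fit in 12 signed bits")
    imm12 = imm & 0xFFF  # two's complement representation of imm[11:0]
    # Field layout, high to low: imm[11:0] | rs1 | funct3 | rd | opcode
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


OP_IMM = 0b0010011  # register-immediate arithmetic
LOAD = 0b0000011    # loads

print(hex(encode_i_type(OP_IMM, 0b000, 10, 11, -1)))  # addi x10, x11, -1 -> 0xfff58513
print(hex(encode_i_type(LOAD, 0b010, 10, 11, 8)))     # lw x10, 8(x11)    -> 0x85a503
```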
Load instructions are also I-Type. The imm field specifies the offset from the base address stored in rs1. The width of the data to be loaded is specified by the funct3 field.
LBU is "load unsigned byte". When a byte is loaded into the 32-bit register rd, the value is zero-extended rather than sign-extended to 32 bits.
LH is "load half-word": it loads 16 bits and sign-extends them to 32 bits. LHU zero-extends instead.
There is no LWU in RV32 because no sign/zero extension is needed when copying 32 bits from a memory location into a 32-bit register.
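To make the difference between sign extension and zero extension concrete, the following Python sketch models the loads on a small byte array. The names `sign_extend` and `load` are illustrative only, and the sketch assumes little-endian memory.

```python
# Illustrative model of RV32 loads; names and memory layout are assumptions.
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def load(memory, addr, width, signed):
    """Return the 32-bit register value produced by loading `width` bytes."""
    raw = int.from_bytes(memory[addr:addr + width], "little")
    if signed:
        raw = sign_extend(raw, 8 * width)
    return raw & 0xFFFFFFFF  # registers hold 32 bits


memory = bytes([0x80, 0xFF, 0x00, 0x00])

print(hex(load(memory, 0, 1, signed=True)))   # lb  -> 0xffffff80
print(hex(load(memory, 0, 1, signed=False)))  # lbu -> 0x80
print(hex(load(memory, 0, 2, signed=True)))   # lh  -> 0xffffff80
print(hex(load(memory, 0, 2, signed=False)))  # lhu -> 0xff80
print(hex(load(memory, 0, 4, signed=True)))   # lw  -> 0xff80
```

For the 4-byte case the `signed` flag makes no difference to the result, which is why RV32 needs no LWU.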
jalr is also an I-Type instruction.
jalr rd, rs1, offset
Writes PC+4 to rd (the return address).
Sets PC = rs1 + offset.
Uses the same 12-bit immediate as arithmetic instructions and loads, so the offset is a byte offset and is not multiplied by 2.
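The effect of jalr on the machine state can be sketched as a small Python function. The name `jalr` and its return convention are assumptions made for this example. The sketch also clears the least-significant bit of the target, which the base ISA specification requires in addition to the steps listed above.

```python
# Illustrative model of jalr; the function name and return shape are assumptions.
MASK32 = 0xFFFFFFFF


def jalr(pc, rs1_value, offset):
    """Return (new_pc, rd_value) for `jalr rd, rs1, offset`."""
    if not -2048 <= offset <= 2047:
        raise ValueError("offset must fit in 12 signed bits")
    rd_value = (pc + 4) & MASK32               # return address
    target = (rs1_value + offset) & MASK32     # byte offset, not scaled by 2
    new_pc = target & ~1                       # the ISA clears bit 0 of the target
    return new_pc, rd_value


new_pc, ra = jalr(pc=0x1000, rs1_value=0x2000, offset=8)
print(hex(new_pc), hex(ra))  # 0x2008 0x1004
```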
In S-format, compared to R-format, the rd and funct7 fields hold the store offset from the base address in rs1: imm[4:0] takes the place of rd and imm[11:5] takes the place of funct7. Stores don't write a value to the register file, so there is no rd.
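The split store immediate can be illustrated with a short Python sketch. The function name `encode_s_type` is hypothetical; the store opcode 0100011 and funct3 010 for sw come from the RV32I base encoding.

```python
# Hypothetical helper for illustration: pack the S-format fields into a 32-bit word.
STORE = 0b0100011


def encode_s_type(funct3, rs1, rs2, imm):
    """Return the encoding for a store of rs2 to address rs1 + imm."""
    if not -2048 <= imm <= 2047:
        raise ValueError("S-format immediate must fit in 12 signed bits")
    imm12 = imm & 0xFFF
    imm_11_5 = (imm12 >> 5) & 0x7F  # goes where funct7 sits in R-format
    imm_4_0 = imm12 & 0x1F          # goes where rd sits in R-format
    return ((imm_11_5 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (imm_4_0 << 7) | STORE)


print(hex(encode_s_type(0b010, rs1=11, rs2=12, imm=8)))   # sw x12, 8(x11)  -> 0xc5a423
print(hex(encode_s_type(0b010, rs1=11, rs2=12, imm=-4)))  # sw x12, -4(x11) -> 0xfec5ae23
```

Keeping rs1 and rs2 in the same bit positions as in R-format is what forces the immediate to be split in two.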
SB-format branch instructions compare two registers and jump to the target address if the condition is met.
PC-relative addressing: the immediate field is used as a two’s complement offset from the PC. A word offset would seem natural: 32-bit instructions are word-aligned, so their addresses are always multiples of 4 bytes, and because the PC always points to an instruction, the two low-order bits of a byte offset would carry no information.
With a 12-bit immediate, a word offset could specify ±2^11 word-aligned addresses relative to the PC.
RISC-V Feature, n×16-bit instructions
Extensions to the RISC-V base ISA support 16-bit compressed instructions as well as variable-length instructions whose lengths are multiples of 16 bits. To enable this, RISC-V always scales the branch offset in units of half-words (2 bytes), even when there are no 16-bit instructions. A branch can therefore reach ±2^10 32-bit instructions on either side of the PC.
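The primary distinction between SB and S lies in how the imm bits are interpreted: SB encodes imm[12:1] (bit 0 is always zero), with imm[12] and imm[10:5] in the funct7 position and imm[4:1] and imm[11] in the rd position.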
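The bit shuffling of the branch immediate is easier to follow in code. The Python sketch below encodes a branch and recovers its byte offset again; `encode_branch` and `decode_branch_offset` are names invented for this example, while the branch opcode 1100011 and funct3 000 for beq come from the RV32I base encoding.

```python
# Hypothetical helpers for illustration: SB-format immediate packing and unpacking.
BRANCH = 0b1100011


def encode_branch(funct3, rs1, rs2, offset):
    """Return the encoding for a branch with a signed, even byte offset."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError("offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF
    return ((((imm >> 12) & 0x1) << 31)     # imm[12]
            | (((imm >> 5) & 0x3F) << 25)   # imm[10:5]
            | (rs2 << 20) | (rs1 << 15) | (funct3 << 12)
            | (((imm >> 1) & 0xF) << 8)     # imm[4:1]
            | (((imm >> 11) & 0x1) << 7)    # imm[11]
            | BRANCH)


def decode_branch_offset(word):
    """Recover the signed byte offset from an SB-format instruction word."""
    imm = ((((word >> 31) & 0x1) << 12) | (((word >> 7) & 0x1) << 11)
           | (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1))
    return imm - 0x2000 if imm & 0x1000 else imm  # sign-extend from bit 12


word = encode_branch(0b000, rs1=10, rs2=0, offset=16)  # beq x10, x0, +16
print(hex(word), decode_branch_offset(word))           # 0x50863 16
```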
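What do we do if the destination is more than 2^10 instructions away from the branch? Replace beq x10, x0, far with the inverted branch bne x10, x0, next followed by j far, where next labels the instruction immediately after the jump.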
lui writes the immediate value into the upper 20 bits of the destination register and clears the lower 12 bits.
auipc adds the upper immediate value (the 20-bit immediate shifted left by 12 bits) to the PC and places the result in the destination register.
A Corner Case for lui: to set the lower 12 bits we use addi, which adds a sign-extended immediate. If bit 11 of that immediate is set, the immediate is negative, and the addition effectively subtracts 1 from the upper 20 bits. To compensate, add 1 to the upper 20-bit value passed to lui beforehand.
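The adjustment described above can be sketched in Python. The function name `split_hi_lo` is made up for this example; it returns the 20-bit value to give to lui and the signed 12-bit value to give to addi.

```python
# Hypothetical helper for illustration: split a 32-bit constant for lui + addi.
def split_hi_lo(value):
    """Return (hi20, lo12) such that (hi20 << 12) + lo12 == value (mod 2**32)."""
    value &= 0xFFFFFFFF
    lo12 = value & 0xFFF
    if lo12 >= 0x800:
        lo12 -= 0x1000  # addi will sign-extend, so treat the low part as negative
    # Subtracting a negative lo12 carries 1 into the upper 20 bits.
    hi20 = ((value - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


hi, lo = split_hi_lo(0xDEADBEEF)
print(hex(hi), lo)  # 0xdeadc -273
# Check: lui places hi in bits 31:12, then addi adds the sign-extended lo.
print(hex(((hi << 12) + lo) & 0xFFFFFFFF))  # 0xdeadbeef
```

Here the low 12 bits 0xEEF have bit 11 set, so the upper part becomes 0xDEADC rather than 0xDEADB.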
Label: auipc x10, 0 puts the address of Label into x10.
For branches, we assumed that we won’t want to branch too far, so we can specify a change in the PC.
For general jumps (jal), we may jump to anywhere in code memory.
jal saves PC+4 in register rd (the return address).
Sets PC = PC + offset (PC-relative jump).
The target lies within ±2^19 locations, 2 bytes apart (since an instruction is at least 2 bytes long when compressed instructions are enabled).
This corresponds to ±2^18 32-bit instructions.
Reminder: j is a pseudo-instruction. The assembler emits jal with rd = x0 to discard the return address.
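As with branches, the jal immediate is stored out of order. The Python sketch below shows one way to pack it; `encode_jal` and `encode_j` are names invented for this example, and the opcode 1101111 comes from the RV32I base encoding.

```python
# Hypothetical helpers for illustration: UJ-format immediate packing.
JAL = 0b1101111


def encode_jal(rd, offset):
    """Return the encoding of `jal rd, offset` for a signed, even byte offset."""
    if offset % 2 != 0 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF
    return ((((imm >> 20) & 0x1) << 31)     # imm[20]
            | (((imm >> 1) & 0x3FF) << 21)  # imm[10:1]
            | (((imm >> 11) & 0x1) << 20)   # imm[11]
            | (((imm >> 12) & 0xFF) << 12)  # imm[19:12]
            | (rd << 7) | JAL)


def encode_j(offset):
    """The j pseudo-instruction is jal with rd = x0."""
    return encode_jal(0, offset)


print(hex(encode_j(8)))        # j +8        -> 0x80006f
print(hex(encode_jal(1, -4)))  # jal x1, -4  -> 0xffdff0ef
```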