2023.01.03
Exploring Instruction Fusion Opportunities in General Purpose Processors
Introduction
Fusion存在的意义,
Therefore, one could rather keep architectural instructions simple and perform any fusion necessary to optimize pipeline resource utilization at the microarchitectural level.
文章提到的几个概念
- non-contiguous (memory accesses) fusion
- 访问数据连续(特指在统一cache行中)
- asymmetric memory accesses
- 访存大小不一样
- non-consecutive fusion
- 动态执行中不连续的两条指令
文章自己提出自己的3个贡献:fusion归类,刻画non consecutive/contiguous fusion,借助Helios说明正确性、性能问题。
结论:14.2% over no fusion and 8.2% over consecutive and contiguous only microarchitectural memory fusion
2. Background
仅考虑2 uop融合
3. Motivation
RISCV中可融合的指令(Table I)
Head | Tail |
---|---|
add rd, rs1, rs2 | ld rd, 0(rd) |
lui rd, imm[31:12] | addi rd, rd, imm[11:0] |
ld rd, imm(rs1) | add rs1, rs1, 8 |
auipc t, imm20 | jalr ra, imm12(t) |
slli rd, rs1, {1,2,3} | add rd, rd, rs2 |
mulh[[S]U] rdh, rs1, rs2 | mul rdl, rs1, rs2 |
slli rd, rs1, 32 | srli rd, rd, 29/30/31/32 |
div[U] rdq, rs1, rs2 | rem[U] rdr, rs1, rs |
lui rd, imm[31:12] | ld rd, imm11:0 |
auipc rd, symbol[31:12] | ld rd, symbol11:0 |
ld rd1, imm(rs1) | ld rd2, imm+8(rs1) |
st rs2, imm(rs1) | st rs3, imm+8(rs1) |
Figure 2显示内存融合比上面其他融合的指令数量更多