I've prepared some information about:

* current RISC-V hardware
* future RISC-V hardware

http://fedora.riscv.rocks/koji/
https://koji.fedoraproject.org/koji/

Primary architecture or koji-shadow?

## Current RISC-V hardware

* Milk-V Pioneer (SG2042)
* Sophgo 2U server (SG2042)
  - Possible performance and quality issues
  - Likely to end up as e-waste soon
  - Doesn't support any recent extensions, esp. V (vector) and H (hypervisor)
  - No BMC, but might use a BMC-on-PCIe card

* Single board computers (SBCs)
  - VisionFive 2 (StarFive JH7xxx), Lichee Pi 4A, ...
  - Limited memory and core count
  - Upfront & ongoing engineering headaches integrating into a 19" rack

* QEMU emulation (qemu-system-riscv64)
  - Performance numbers below, but not great
  - Actually runs on x86-64 hardware
  - Hardware (x86-64) is well known and easy to manage
  - No "e-waste", servers can be repurposed
  - Supports all the latest extensions
  - Supports any amount of RAM, large numbers of vCPUs
  - We can add new extensions and fix bugs relatively easily

## Future RISC-V hardware

- Announced in August 2023
- Power and efficiency versions, but they are not fully compatible
- 8 lanes of PCIe Gen3
- Mini-ITX development board

## Performance numbers

                    binutils       openssl    python3.12     mingw-gcc

    i686            1589          1577          3411          4292
    x86-64          1419          1172          2462          1827
    aarch64         1573           811 (?)      1845          2521
    ppc64le         2165          1291          3073          4388
    s390x           2553          1380          1984 (?)      6824

    qemu-system-riscv64 16 vCPUs, 16 GB
    on AMD Ryzen 9 7950X server

    (LTO)           4493          3052         14502         12428
                   +217%         +160%         +489%         +580%
    (no LTO)        3267          1351          6353      (failed)

    qemu-system-riscv64 32 vCPUs, 16 GB
    on AMD Genoa-X server

    (no LTO)  5115  1882  (failed)
    (LTO)  7202  8823  (crashed in LTO step)
    (no LTO)  3274  2059  11627
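
The percentages under the first (LTO) row look like the increase over the x86-64 row. A minimal sketch that recomputes them from the figures above; the helper name `percent_over` is made up for this example, and the units of the figures are not stated in the table.

```python
# Figures copied from the table above (units are not stated there).
PACKAGES = ["binutils", "openssl", "python3.12", "mingw-gcc"]
X86_64 = [1419, 1172, 2462, 1827]
RISCV64_QEMU_LTO = [4493, 3052, 14502, 12428]  # 16 vCPUs on the Ryzen 9 7950X


def percent_over(baseline: int, value: int) -> int:
    """Return how much larger value is than baseline, as a rounded percentage."""
    return round((value / baseline - 1) * 100)


# One overhead figure per package, in the same order as PACKAGES.
overheads = [percent_over(b, v) for b, v in zip(X86_64, RISCV64_QEMU_LTO)]

for name, pct in zip(PACKAGES, overheads):
    print(f"{name:<12} +{pct}%")
```

The same helper can be pointed at any other pair of rows, as long as both rows have a figure for the package in question.
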
## Single thread performance

    qemu-system-riscv64      912
    qemu-system-riscv64      598
    AMD Genoa-X (x86-64)      36
    AMD 7950X (x86-64)        35

"TCG" is the name for QEMU's software emulation, e.g. a RISC-V guest
fully emulated on an x86-64 host.

Works using Translation Blocks (TBs), which translate basic blocks of
guest code into host code.

Well understood (by me), easy to fix simpler issues.

- I posted a patch yesterday which gives a ~6% performance gain

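As a rough illustration of the translate-once, run-many idea behind TBs, here is a toy sketch. It is not QEMU's code: the `ToyTranslator` class, the two-operation guest "ISA" and the block layout are all invented for this example, and "translation" here just means building a Python closure.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A guest block: straight-line (op, operand) pairs plus the next block's "pc"
# (None means halt). Entirely hypothetical, for illustration only.
Block = Tuple[List[Tuple[str, int]], Optional[int]]

GUEST_CODE: Dict[int, Block] = {
    0: ([("add", 1)], 1),                 # block at pc 0, continues at pc 1
    1: ([("add", 2), ("mul", 2)], None),  # block at pc 1, then halt
}


class ToyTranslator:
    def __init__(self, code: Dict[int, Block]) -> None:
        self.code = code
        self.cache: Dict[int, Callable[[int], int]] = {}  # pc -> translated block
        self.translations = 0

    def translate(self, pc: int) -> Callable[[int], int]:
        """Turn one guest block into a host callable (the expensive step)."""
        ops, _next_pc = self.code[pc]
        self.translations += 1

        def host_fn(acc: int) -> int:
            for op, n in ops:
                acc = acc + n if op == "add" else acc * n
            return acc

        return host_fn

    def run(self, acc: int = 0) -> int:
        pc: Optional[int] = 0
        while pc is not None:
            if pc not in self.cache:      # translate only on a cache miss
                self.cache[pc] = self.translate(pc)
            acc = self.cache[pc](acc)     # run the cached translation
            pc = self.code[pc][1]
        return acc


vm = ToyTranslator(GUEST_CODE)
first = vm.run()
second = vm.run()  # reuses both cached blocks, nothing is translated again
print(first, second, vm.translations)
```

A new `ToyTranslator` starts with an empty cache, so every block would have to be translated again.
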
A few tips to make TCG run (a bit) fast(er):

- Compile QEMU with -march=native (+4%)
- Don't overprovision host CPUs
  * However, pinning vCPUs to pCPUs didn't really help
- Give it plenty of guest & host RAM
  * Measured memory overhead on host is up to 40% after running
  * Host TBs track guest page cache; as long as a translated
    executable remains in the guest page cache, it will not be
    retranslated
- Don't restart the VM
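
A minimal sketch of how the CPU and RAM tips could become a sanity check before starting a guest. The helper `plan_guest` and its parameters are hypothetical, and treating the "up to 40%" figure as a flat margin on top of guest RAM is an assumption. The sketch does not launch QEMU.

```python
import os

# Assumption: the "up to 40%" host overhead is applied on top of guest RAM.
TCG_HOST_OVERHEAD_PERCENT = 40


def plan_guest(requested_vcpus: int, guest_ram_gb: float,
               host_cpus: int, host_ram_gb: float) -> dict:
    """Clamp vCPUs to the host CPU count and estimate host RAM needed."""
    # Don't overprovision: never hand out more vCPUs than the host has CPUs.
    vcpus = max(1, min(requested_vcpus, host_cpus))
    # Guest RAM plus the assumed TCG overhead on the host side.
    host_ram_needed_gb = guest_ram_gb * (100 + TCG_HOST_OVERHEAD_PERCENT) / 100
    return {
        "vcpus": vcpus,
        "host_ram_needed_gb": host_ram_needed_gb,
        "ram_ok": host_ram_needed_gb <= host_ram_gb,
    }


# Example: ask for 32 vCPUs and 16 GB on whatever machine runs this script.
# The 64 GB host RAM figure is a placeholder, not a measured value.
plan = plan_guest(requested_vcpus=32, guest_ram_gb=16,
                  host_cpus=os.cpu_count() or 1, host_ram_gb=64)
print(plan)
```

Since pinning vCPUs to pCPUs did not help in the measurements above, the sketch leaves CPU affinity alone.
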