## Agenda

I've prepared some information about

* current RISC-V hardware
* future RISC-V hardware
* performance numbers
* notes on QEMU

## Koji

http://fedora.riscv.rocks/koji/

https://koji.fedoraproject.org/koji/

Primary architecture or koji-shadow?

## Current RISC-V hardware

* Milk-V Pioneer (SG2042)
* Sophgo 2U server (SG2042)
  - Possible performance and quality issues
  - Expensive
  - Likely to end up as e-waste soon
* SiFive Unmatched
  - Slow
  - Doesn't support any recent extensions, esp. V, H
  - No BMC, but might use BMC-on-PCIe card
* Single board computers (SBCs)
  - VisionFive2 (StarFive JH7xxx), Lichee Pi 4A, ...
  - Slow
  - Limited memory, cores
  - Reliability issues
  - Upfront & ongoing engineering headaches integrating into a 19" rack
* QEMU
  - Performance numbers below, but not great
  - Really using x86-64 hardware
  - Hardware (x86-64) is well known and easy to manage
  - No "e-waste", servers can be repurposed
  - Supports all the latest extensions
  - Supports any amount of RAM, large numbers of vCPUs
  - We can add new extensions and fix bugs relatively easily

## Future RISC-V hardware

* Sophgo SG2380
  - 16 x SiFive P670
  - announced yesterday
* SiFive P870
  - announced in August 2023
* StarFive JH8100
  - TSMC 12nm
  - H extension
  - power and efficiency versions, but they are not fully compatible
  - 8 lanes of gen3 PCIe
  - 4 USB 3.2 gen2
  - miniITX development board
* Ventana
* Rivos

## Performance numbers

| | binutils | openssl | python3.12 | mingw-gcc |
|---|---|---|---|---|
| i686 | 1589 | 1577 | 3411 | 4292 |
| x86-64 | 1419 | 1172 | 2462 | 1827 |
| aarch64 | 1573 | 811 (?) | 1845 | 2521 |
| ppc64le | 2165 | 1291 | 3073 | 4388 |
| s390x | 2553 | 1380 | 1984 (?) | 6824 |

qemu-system-riscv64, 16 vCPUs, 16 GB, on AMD Ryzen 9 7950X server

| | binutils | openssl | python3.12 | mingw-gcc |
|---|---|---|---|---|
| LTO | 4493 (+217%) | 3052 (+160%) | 14502 (+489%) | 12428 (+580%) |
| no LTO | 3267 (+130%) | 1351 (+15%) | 6353 (+158%) | (failed) |

qemu-system-riscv64, 32 vCPUs, 16 GB, on AMD Genoa-X server

| | binutils | openssl | python3.12 | mingw-gcc |
|---|---|---|---|---|
| LTO | 6841 | 3182 | | |
| no LTO | 5115 | 1882 | (failed) | |

VisionFive 2

| | binutils | openssl | python3.12 | mingw-gcc |
|---|---|---|---|---|
| LTO | 7202 (+408%) | 8823 (+653%) | (crashed in LTO step) | |
| no LTO | 3274 (+130%) | 2059 (+75%) | 11627 | |

## Single thread performance

| | |
|---|---|
| qemu-system-riscv64 on AMD Genoa-X | 912 |
| HiFive Unmatched | 616 |
| qemu-system-riscv64 on AMD 7950x | 598 |
| VisionFive 2 | 425 |
| Koji/ppc64le | 144 |
| Koji/i686 | 105 |
| Koji/x86-64 | 100 |
| Koji/aarch64 | 89 |
| Koji/s390x | 65 |
| AMD Genoa-X (x86-64) | 36 |
| AMD 7950x (x86-64) | 35 |

## QEMU observations

"TCG" is the name for QEMU's software emulation, eg. RISC-V fully emulated guest on x86-64 host.

Works using Translation Blocks (TBs) which translate basic blocks of guest code.

Well understood (by me), easy to fix simpler issues.

- I posted a patch yesterday which gets ~ +6% performance gain

A few tips to make TCG run (a bit) fast(er):

- Compile with -march=native (+4%)
- Profile with perf
- Don't overprovision host CPUs
  * However pinning vCPUs to pCPUs didn't really help
- Give it plenty of guest & host RAM
  * Measured memory overhead on host is up to 40% after running for some time
  * Host TBs track guest page cache; as long as a translated executable remains in the guest page cache, it will not be retranslated
- Don't restart the VM
- Software TLB
- Fast vs slow jumps
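To make the Translation Block idea concrete, here is a toy model in Python. It is not QEMU code: the three-instruction guest "ISA", the `ToyTranslator` class, and the `tb_cache` dictionary are all invented for illustration. It only shows the general pattern of translating a basic block once, caching it by guest address, and reusing the cached translation on later executions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy guest instructions (not a real ISA):
#   ("add", n)     add n to the accumulator
#   ("loop", pc)   decrement the counter; jump to pc while it is > 0
#   ("halt", 0)    stop the guest
Insn = Tuple[str, int]
TB = Callable[[dict], Optional[int]]


class ToyTranslator:
    def __init__(self, program: List[Insn]) -> None:
        self.program = program
        self.tb_cache: Dict[int, TB] = {}  # guest pc -> translated block
        self.translations = 0              # how many blocks were translated

    def translate(self, pc: int) -> TB:
        """Translate the basic block starting at pc into a host callable."""
        self.translations += 1
        total = 0
        # A basic block runs until the first control-flow instruction.
        while self.program[pc][0] == "add":
            total += self.program[pc][1]
            pc += 1
        op, target = self.program[pc]
        end_pc = pc

        def tb(state: dict) -> Optional[int]:
            state["acc"] += total          # all adds folded into one step
            if op == "halt":
                return None
            state["counter"] -= 1
            return target if state["counter"] > 0 else end_pc + 1

        return tb

    def run(self, state: dict) -> dict:
        pc: Optional[int] = 0
        while pc is not None:
            tb = self.tb_cache.get(pc)
            if tb is None:                 # translate only on a cache miss
                tb = self.tb_cache[pc] = self.translate(pc)
            pc = tb(state)
        return state


program = [("add", 1), ("add", 2), ("loop", 0), ("add", 10), ("halt", 0)]
vm = ToyTranslator(program)
final = vm.run({"acc": 0, "counter": 1000})
print(final["acc"], vm.translations)
```

The loop body executes 1000 times but is translated only once, which is the property the "don't restart the VM" and "plenty of RAM" tips rely on: a restart or an eviction throws the cached translations away.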
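The percentages in the performance numbers appear to be slowdowns relative to the Koji x86-64 row. A minimal sketch that recomputes them, assuming that interpretation and assuming the figures are build durations (the function and variable names are made up for this example):

```python
# Figures copied from the performance table.
X86_64_BASELINE = {
    "binutils": 1419, "openssl": 1172, "python3.12": 2462, "mingw-gcc": 1827,
}
QEMU_7950X_LTO = {
    "binutils": 4493, "openssl": 3052, "python3.12": 14502, "mingw-gcc": 12428,
}
QEMU_7950X_NO_LTO = {
    "binutils": 3267, "openssl": 1351, "python3.12": 6353,  # mingw-gcc failed
}


def slowdown_percent(measured: float, baseline: float) -> int:
    """Extra time relative to the baseline, rounded to a whole percent."""
    return round((measured / baseline - 1) * 100)


def slowdowns(run: dict, baseline: dict = X86_64_BASELINE) -> dict:
    """Per-package slowdown for every package present in the run."""
    return {pkg: slowdown_percent(t, baseline[pkg]) for pkg, t in run.items()}


if __name__ == "__main__":
    for name, run in [("LTO", QEMU_7950X_LTO), ("no LTO", QEMU_7950X_NO_LTO)]:
        print(name, {pkg: f"+{pct}%" for pkg, pct in slowdowns(run).items()})
```

With these inputs the sketch reproduces the 16 vCPU Ryzen figures (+217%, +160%, +489%, +580% with LTO; +130%, +15%, +158% without). The VisionFive 2 no-LTO figures come out one point higher than listed when rounded this way, so the original numbers may have been rounded differently.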