Koji is Fedora's build system:

http://fedora.riscv.rocks/koji/
https://koji.fedoraproject.org/koji/

Koji doesn't use cross-compilation, and cross-compilation is not possible with the current design.
We must find a way to run a full build system somehow.

Explanation of primary architecture vs. koji-shadow

## What hardware should we choose for RISC-V as a primary architecture?

* Milk-V Pioneer (SG2042)
* Sophgo 2U server (SG2042)
  - Possible performance and quality issues [unverified]
  - Expensive
  - Likely to end up as e-waste soon
* SiFive Unmatched
  - Slow
  - Doesn't support any recent extensions, esp. V, H
  - No BMC, but might use BMC-on-PCIe card
* [Mystery manufacturer] future development board (under NDA)
* Single board computers (SBCs)
  - VisionFive2 (StarFive JH7xxx), Lichee Pi 4A, ...
  - Slow
  - Limited memory, cores
  - Reliability issues
  - Upfront & ongoing engineering headaches integrating into a 19" rack
* QEMU
  - Performance numbers below, but not great
  - Really using x86-64 hardware
  - Hardware (x86-64) is well known and easy to manage
  - No "e-waste", servers can be repurposed
  - Supports all the latest extensions
  - Supports any amount of RAM, large numbers of vCPUs
  - We can add new extensions and fix bugs relatively easily

## QEMU performance numbers

(Times in seconds, taken from recent Fedora Koji builds. Percentages are the increase over the x86-64 time for the same package.)

| Builder               | binutils     | openssl      | python3.12          | mingw-gcc     |
|-----------------------|--------------|--------------|---------------------|---------------|
| i686                  | 1589         | 1577         | 3411                | 4292          |
| x86-64                | 1419         | 1172         | 2462                | 1827          |
| aarch64               | 1573         | 811 (?)      | 1845                | 2521          |
| ppc64le               | 2165         | 1291         | 3073                | 4388          |
| s390x                 | 2553         | 1380         | 1984 (?)            | 6824          |
| qemu riscv64 (LTO)    | 4493 (+217%) | 3052 (+160%) | 14502 (+489%)       | 12428 (+580%) |
| qemu riscv64 (no LTO) | 3267 (+130%) | 1351 (+15%)  | 6353 (+158%)        | (failed)      |
| VisionFive 2 (LTO)    | 7202 (+408%) | 8823 (+653%) | crashed in LTO step |               |
| VisionFive 2 (no LTO) | 3274 (+130%) | 2059 (+75%)  |                     |               |

qemu riscv64 = qemu-system-riscv64 with 16 cores, 16 GB; on AMD Ryzen 9 7950X server.

## QEMU observations

"TCG" is the name for QEMU's software emulation, e.g. a fully emulated RISC-V guest on an x86-64 host. It works using Translation Blocks (TBs), which translate basic blocks of guest code.
Well understood (by me), easy to fix simpler issues.

- I posted a patch yesterday which gets a ~6% performance gain

A few tips to make TCG run (a bit) fast(er):

- Compile with -march=native (+4%)
- Profile with perf
- Don't overprovision host CPUs
  * However pinning vCPUs to pCPUs didn't really help
- Give it plenty of guest & host RAM
  * Measured memory overhead on host is up to 40% after running for some time
  * Host TBs track guest page cache; as long as a translated executable remains in the guest page cache, it will not be retranslated
- Don't restart the VM
- Software TLB
- Fast vs slow jumps
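
The advice about RAM and not restarting the VM can be illustrated with a toy model. The sketch below is not QEMU's implementation: `ToyTranslationCache`, `run_build`, the page numbers and the string standing in for "translated host code" are all hypothetical, chosen only to show why a warm cache of translated blocks avoids retranslation and why a restart or a page eviction throws that work away.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTranslationCache:
    """Toy stand-in for a TB cache, keyed by (guest page, offset). Not QEMU code."""

    blocks: dict = field(default_factory=dict)  # (page, offset) -> "translated" block
    translations: int = 0                       # number of cache misses so far

    def execute(self, page: int, offset: int) -> str:
        key = (page, offset)
        if key not in self.blocks:
            # Cache miss: pay the translation cost once for this block.
            self.translations += 1
            self.blocks[key] = f"host code for guest {page:#x}+{offset:#x}"
        return self.blocks[key]

    def invalidate_page(self, page: int) -> None:
        # Drop every cached block on a guest page that was evicted or rewritten.
        self.blocks = {k: v for k, v in self.blocks.items() if k[0] != page}


def run_build(cache: ToyTranslationCache, hot_blocks, repeats: int) -> None:
    # Pretend a build keeps executing the same few basic blocks.
    for _ in range(repeats):
        for page, offset in hot_blocks:
            cache.execute(page, offset)


# Hypothetical hot blocks: two on one guest page, one on another.
hot = [(0x1000, 0x0), (0x1000, 0x40), (0x2000, 0x10)]

warm = ToyTranslationCache()
run_build(warm, hot, repeats=3)
translations_first = warm.translations                        # cold cache: every block translated once

run_build(warm, hot, repeats=3)
translations_warm = warm.translations - translations_first    # warm cache: nothing retranslated

warm.invalidate_page(0x1000)                                   # page dropped from the guest page cache
run_build(warm, hot, repeats=1)
translations_after_evict = warm.translations - translations_first

restarted = ToyTranslationCache()                              # "restarting the VM" starts from empty
run_build(restarted, hot, repeats=1)
```

In this model the second build costs no translations, evicting one page forces only the blocks on that page to be translated again, and a restart pays the full cost again. That matches the intuition behind giving the guest enough RAM to keep executables in its page cache.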