CPU scaling benchmark

workers:         6 + 1 main
iters total:     500M (71428571/stream)
elapsed:         1175.46 ms
total CPU used:  6984.21 ms
speedup:         5.94× vs serial
efficiency:      84.9% of 7× ideal

stream    spawn ms  spawned@  work start@  work end@  work ms  reap wait ms
0 (main)  0         10.01     10.02        1157.26    1147.24  0
1         2.097     2.11      15.87        787.63     771.76   0.12
2         1.638     3.8       26.86        1172.65    1145.79  15.5
3         1.552     5.37      24.47        1107.53    1083.06  0.18
4         1.576     6.97      33.33        889.76     856.43   0.21
5         1.485     8.48      46.84        1038.64    991.8    0.23
6         1.491     10        36.88        1025.01    988.13   0.24

[Timeline chart: one row per stream (main, w1 to w6); legend: fork+handshake, CPU work, parent reap wait]

what this measures
Each stream runs a tight integer LCG (linear congruential generator) loop: the working set is one CPU register, with no memory access and no shared data. Speedup = sum(stream CPU time) / wall-clock elapsed. Efficiency = speedup / (workers + 1). For this run that gives 6984.21 ms / 1175.46 ms = 5.94× and 5.94 / 7 = 84.9%. 100% efficiency means perfect linear scaling; anything below 100% is the cost of serial fork setup, the reap tail, and SMT/core contention.
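
As a cross-check, the two formulas can be recomputed from the per-stream numbers reported in the table. This is a minimal sketch, not the benchmark's own code: the function name `scaling_metrics` is hypothetical, and it assumes that "total CPU used" is the sum of the "work ms" column, which matches the reported 6984.21 ms for this run.

```python
# Per-stream "work ms" values copied from the results table (main first, then w1..w6).
work_ms = [1147.24, 771.76, 1145.79, 1083.06, 856.43, 991.80, 988.13]
elapsed_ms = 1175.46   # wall-clock elapsed, from the summary
workers = 6            # forked workers; the main stream adds one more


def scaling_metrics(work_ms, elapsed_ms, workers):
    """Return (total CPU ms, speedup, efficiency) as defined in the text.

    Assumption: total CPU time is the sum of the per-stream work times.
    """
    total_cpu_ms = sum(work_ms)
    speedup = total_cpu_ms / elapsed_ms      # sum(stream CPU time) / elapsed
    efficiency = speedup / (workers + 1)     # fraction of the 7x ideal
    return total_cpu_ms, speedup, efficiency


total_cpu_ms, speedup, efficiency = scaling_metrics(work_ms, elapsed_ms, workers)
print(f"total CPU used: {total_cpu_ms:.2f} ms")
print(f"speedup:        {speedup:.2f}x")
print(f"efficiency:     {efficiency:.1%}")
```

Swapping in the numbers from another run is enough to check its summary line against its own table.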