CPU scaling benchmark

workers: 6 + 1 main
iters total: 100M (14285714/stream)
elapsed: 266.13 ms
total CPU used: 1279.98 ms
speedup: 4.81× vs serial
efficiency: 68.7% of 7× ideal

stream     spawn ms   spawned@   work start@   work end@   work ms   reap wait ms
0 (main)   0          12.42      12.43         140.26      127.83    0
1          2.321      2.34       15.91         250.14      234.23    110.02
2          2.015      4.38       26.07         233.48      207.41    96.23
3          1.882      6.28       25.74         151.63      125.89    14.26
4          2.346      8.65       55.11         234.32      179.21    97.66
5          1.816      10.51      36.55         263.16      226.61    123.02
6          1.877      12.4       68.51         247.31      178.8     107.19

[Timeline chart: one row per stream (main, w1 to w6); segments: fork+handshake, CPU work, parent reap wait]

what this measures
Each stream runs a tight integer LCG (linear congruential generator) loop: the working set is a single CPU register, with no memory access and no shared data. Speedup = sum(stream CPU time) / wall-clock elapsed; here 1279.98 ms / 266.13 ms = 4.81×. Efficiency = speedup / (workers+1); here 4.81 / 7 = 68.7%. 100% efficiency means perfect linear scaling; any shortfall is the cost of serial fork setup, the reap tail, and SMT/core contention.
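
The two formulas above can be checked against the reported numbers with a short sketch. It is written in Python purely for illustration, and the benchmark's own implementation language is not stated. The per-stream values are the "work ms" column from the table. The `lcg_stream` helper and its constants (Knuth's MMIX multiplier and increment, modulo 2**64) are assumptions meant only to show the shape of a register-only LCG loop, not the benchmark's actual kernel.

```python
# Per-stream "work ms" values copied from the table (main first, then streams 1..6).
WORK_MS = [127.83, 234.23, 207.41, 125.89, 179.21, 226.61, 178.80]
ELAPSED_MS = 266.13  # wall-clock elapsed, from the summary


def lcg_stream(seed: int, iters: int) -> int:
    """Tight integer LCG loop. The constants are assumed, not the benchmark's."""
    # Assumption: Knuth's MMIX multiplier/increment, state kept to 64 bits.
    a, c, mask = 6364136223846793005, 1442695040888963407, (1 << 64) - 1
    x = seed
    for _ in range(iters):
        x = (a * x + c) & mask  # the only state is x: no arrays, no shared data
    return x


def scaling_metrics(work_ms, elapsed_ms):
    """Return (total CPU ms, speedup, efficiency) as defined in the text."""
    total_cpu = sum(work_ms)              # sum(stream CPU time)
    speedup = total_cpu / elapsed_ms      # vs. running all streams serially
    efficiency = speedup / len(work_ms)   # len(work_ms) == workers + 1
    return total_cpu, speedup, efficiency


total_cpu, speedup, efficiency = scaling_metrics(WORK_MS, ELAPSED_MS)
print(f"total CPU {total_cpu:.2f} ms, speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

With the table's values this reproduces the summary figures: 1279.98 ms total CPU, 4.81× speedup, 68.7% efficiency.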