CPU scaling benchmark

workers: 2 + 1 main
iters total: 500M (166666666/stream)
elapsed: 1473.07 ms
total CPU used: 4337.75 ms
speedup: 2.94× vs serial
efficiency: 98% of 3× ideal

stream     spawn ms   spawned@   work start@   work end@   work ms   reap wait ms
0 (main)   0          3.33       3.33          1441.54     1438.21   0
1          1.789      1.8        16.04         1462.75     1446.71   21.37
2          1.482      3.31       17.53         1470.36     1452.83   28.94

[Timeline chart: rows main, w1, w2; legend: fork+handshake, CPU work, parent reap wait]

what this measures
Each stream runs a tight integer LCG loop: the working set is one CPU register, with no memory access and no shared data. Speedup = sum(stream CPU time) / wall-clock elapsed. Efficiency = speedup / (workers+1), where the +1 is the main process, which runs a stream of its own. 100% efficiency means perfect linear scaling; anything below 100% is the cost of serial fork setup, the reap tail, and SMT/core contention.
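
As a quick check of the two formulas, the sketch below recomputes the headline numbers from the "work ms" column and the elapsed time reported above. The function name scaling_metrics and the variable names are illustrative assumptions, not part of the benchmark itself.

```python
# Per-stream CPU work time in ms, copied from the "work ms" column above.
work_ms = {"main": 1438.21, "w1": 1446.71, "w2": 1452.83}
elapsed_ms = 1473.07  # wall-clock elapsed, as reported
workers = 2           # forked workers; main runs a third stream itself


def scaling_metrics(work_ms, elapsed_ms, workers):
    """Hypothetical helper: apply the speedup and efficiency formulas."""
    total_cpu_ms = sum(work_ms.values())   # sum(stream CPU time)
    speedup = total_cpu_ms / elapsed_ms    # vs. running the streams serially
    efficiency = speedup / (workers + 1)   # fraction of the ideal 3x
    return total_cpu_ms, speedup, efficiency


total_cpu_ms, speedup, efficiency = scaling_metrics(work_ms, elapsed_ms, workers)
print(f"total CPU used: {total_cpu_ms:.2f} ms")
print(f"speedup: {speedup:.2f}x")
print(f"efficiency: {efficiency:.0%}")
```

With the figures from this run, the three printed values should match the reported total CPU used, speedup, and efficiency after rounding.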