RDRAND (R64) - Throughput and Uops
With 1 independent instruction
With unroll_count=10 and no inner loop
Code:
0: 49 0f c7 f0 rdrand r8
Results:
Instructions retired: 1.0
Core cycles: 1573.58
Reference cycles: 1112.68
UOPS_EXECUTED.THREAD: 9.0
RETIRE_SLOTS: 14.15
UOPS_MITE: -2.67
UOPS_MS: 15.78
UOPS_DISPATCHED.INT_EU_ALL: 14.9
UOPS_DISPATCHED.ALU: 11.9
UOPS_DISPATCHED.SLOW: 1.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 1.0
UOPS_DISPATCHED.JMP: 1.0
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: -0.32
With unroll_count=10, no inner loop, and 1 NOP
Code:
0: 49 0f c7 f0 rdrand r8
4: 90 nop
Results:
Instructions retired: 2.0
Core cycles: 1573.47
Reference cycles: 1111.0
UOPS_EXECUTED.THREAD: 9.0
RETIRE_SLOTS: 15.4
UOPS_MITE: 0.52
UOPS_MS: 14.93
UOPS_DISPATCHED.INT_EU_ALL: 15.0
UOPS_DISPATCHED.ALU: 11.98
UOPS_DISPATCHED.SLOW: 1.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 1.0
UOPS_DISPATCHED.JMP: 1.0
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 1.03
With loop_count=10 and unroll_count=1
Code:
0: 49 0f c7 f0 rdrand r8
Results:
Instructions retired: 3.0
Core cycles: 1573.8
Reference cycles: 1105.95
UOPS_EXECUTED.THREAD: 10.0
RETIRE_SLOTS: 18.13
UOPS_MITE: 0.25
UOPS_MS: 17.03
UOPS_DISPATCHED.INT_EU_ALL: 15.85
UOPS_DISPATCHED.ALU: 11.9
UOPS_DISPATCHED.SLOW: 1.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 1.0
UOPS_DISPATCHED.JMP: 2.0
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 0.13
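The three result blocks can be compared by subtracting the first one (unroll_count=10, no inner loop) from the other two. The sketch below does this for a few counters copied from the results above; the helper name `delta` and the choice of counters are only illustrative. Reading the two extra retired instructions in the loop variant as loop overhead (for example a counter decrement plus a conditional branch) is an assumption about how the benchmark loop is built, not something the results state directly.

```python
# Per-iteration counter values copied from the three result blocks above.
baseline = {  # unroll_count=10, no inner loop
    "Instructions retired": 1.0,
    "Core cycles": 1573.58,
    "UOPS_EXECUTED.THREAD": 9.0,
    "RETIRE_SLOTS": 14.15,
    "UOPS_DISPATCHED.JMP": 1.0,
}
with_nop = {  # unroll_count=10, no inner loop, 1 NOP
    "Instructions retired": 2.0,
    "Core cycles": 1573.47,
    "UOPS_EXECUTED.THREAD": 9.0,
    "RETIRE_SLOTS": 15.4,
    "UOPS_DISPATCHED.JMP": 1.0,
}
looped = {  # loop_count=10, unroll_count=1
    "Instructions retired": 3.0,
    "Core cycles": 1573.8,
    "UOPS_EXECUTED.THREAD": 10.0,
    "RETIRE_SLOTS": 18.13,
    "UOPS_DISPATCHED.JMP": 2.0,
}


def delta(variant, reference):
    """Return variant - reference for every counter, rounded to 2 decimals."""
    return {name: round(variant[name] - reference[name], 2) for name in reference}


nop_delta = delta(with_nop, baseline)
loop_delta = delta(looped, baseline)

print(f"{'counter':<24}{'NOP - base':>12}{'loop - base':>13}")
for name in baseline:
    print(f"{name:<24}{nop_delta[name]:>12}{loop_delta[name]:>13}")
```

In both comparisons the core cycle count moves by well under one cycle, while the retired instruction count rises by one (NOP) or two (loop). This is consistent with RDRAND itself dominating the measured time in all three configurations.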