RDSEED (R64) - Throughput and Uops
With 1 independent instruction
With unroll_count=10 and no inner loop
Code:
0: 49 0f c7 f8 rdseed r8
Results:
Instructions retired: 1.0
Core cycles: 608.12
Reference cycles: 389.58
UOPS_EXECUTED.THREAD: 15.9
RETIRE_SLOTS: 16.0
UOPS_MITE: 0.0
UOPS_MS: 16.0
UOPS_PORT_0: 4.27
UOPS_PORT_1: 3.6
UOPS_PORT_2: 0.4
UOPS_PORT_3: 0.5
UOPS_PORT_4: 0.0
UOPS_PORT_5: 1.4
UOPS_PORT_6: 5.5
UOPS_PORT_7: -0.09
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 0.0
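As a quick sanity check on the numbers above, the per-port counters can be summed and compared with UOPS_EXECUTED.THREAD, and the ratio of core cycles to reference cycles can be derived. The following is only a sketch: the variable names are illustrative, and treating the small negative UOPS_PORT_7 value as measurement noise is an assumption, not something the results state.

```python
# Values copied from the results above (per RDSEED instruction).
port_uops = {0: 4.27, 1: 3.6, 2: 0.4, 3: 0.5, 4: 0.0, 5: 1.4, 6: 5.5, 7: -0.09}
uops_executed = 15.9
core_cycles = 608.12
ref_cycles = 389.58

# Raw sum of the per-port counters.
port_sum = sum(port_uops.values())

# Assumption: slightly negative counts are noise, so clamp them to zero.
port_sum_clamped = sum(max(v, 0.0) for v in port_uops.values())

# Ratio of core clock cycles to reference clock cycles during the run.
cycle_ratio = core_cycles / ref_cycles

print(f"sum of port counters:         {port_sum:.2f}")
print(f"sum with negatives clamped:   {port_sum_clamped:.2f}")
print(f"UOPS_EXECUTED.THREAD:         {uops_executed:.2f}")
print(f"core cycles / reference cycles: {cycle_ratio:.3f}")
```

The port sum comes out slightly below UOPS_EXECUTED.THREAD, which is the kind of small discrepancy one would expect from averaged counter readings.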
With unroll_count=10, no inner loop, and 1 NOP
Code:
0: 49 0f c7 f8 rdseed r8
4: 90 nop
Results:
Instructions retired: 2.0
Core cycles: 607.45
Reference cycles: 387.5
UOPS_EXECUTED.THREAD: 16.2
RETIRE_SLOTS: 17.0
UOPS_MITE: 1.0
UOPS_MS: 16.0
UOPS_PORT_0: 3.6
UOPS_PORT_1: 3.6
UOPS_PORT_2: 0.5
UOPS_PORT_3: 0.6
UOPS_PORT_4: 0.0
UOPS_PORT_5: 1.4
UOPS_PORT_6: 6.2
UOPS_PORT_7: -0.09
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 1.0
With loop_count=10 and unroll_count=1
Code:
0: 49 0f c7 f8 rdseed r8
Results:
Instructions retired: 3.0
Core cycles: 606.85
Reference cycles: 389.58
UOPS_EXECUTED.THREAD: 16.95
RETIRE_SLOTS: 18.6
UOPS_MITE: 0.6
UOPS_MS: 16.0
UOPS_PORT_0: 4.2
UOPS_PORT_1: 3.7
UOPS_PORT_2: 0.5
UOPS_PORT_3: 0.6
UOPS_PORT_4: 0.0
UOPS_PORT_5: 1.5
UOPS_PORT_6: 6.5
UOPS_PORT_7: -0.09
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 0.47
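The three configurations above report nearly identical core-cycle counts, which suggests the added NOP and the loop instructions are hidden behind the RDSEED latency. A minimal sketch of that comparison follows; the dictionary keys are illustrative labels, and attributing the two extra retired instructions in the loop variant to loop overhead is an assumption rather than something the results state.

```python
# Core cycles and retired instructions copied from the three result blocks.
configs = {
    "unroll_10": {"core_cycles": 608.12, "instructions": 1.0},
    "unroll_10_nop": {"core_cycles": 607.45, "instructions": 2.0},
    "loop_10": {"core_cycles": 606.85, "instructions": 3.0},
}

cycles = [c["core_cycles"] for c in configs.values()]
mean_cycles = sum(cycles) / len(cycles)

# Spread between the slowest and fastest configuration, in cycles.
spread = max(cycles) - min(cycles)
relative_spread = spread / mean_cycles

# Assumption: instructions beyond the single RDSEED are NOP or loop overhead.
extra_instructions = {
    name: c["instructions"] - 1.0 for name, c in configs.items()
}

print(f"mean core cycles: {mean_cycles:.2f}")
print(f"spread:           {spread:.2f} cycles ({relative_spread:.2%})")
for name, extra in extra_instructions.items():
    print(f"{name}: {extra:.0f} extra instruction(s) per RDSEED")
```

With a spread of well under one percent, all three measurements point to the same figure of roughly 607 core cycles per RDSEED.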