PUSH (R64) - Throughput and Uops
With unroll_count=500 and no inner loop
- Code:
0: 41 50 push r8
- Results:
- Instructions retired: 1.0
- Core cycles: 1.0
- Reference cycles: 0.68
- UOPS_EXECUTED.THREAD: 2.04
- RETIRE_SLOTS: 1.04
- UOPS_MITE: 1.0
- UOPS_MS: 0.0
- UOPS_PORT_0: 0.0
- UOPS_PORT_1: 0.0
- UOPS_PORT_2: 0.32
- UOPS_PORT_3: 0.34
- UOPS_PORT_4: 1.0
- UOPS_PORT_5: 0.01
- UOPS_PORT_6: 0.02
- UOPS_PORT_7: 0.34
- DIV_CYCLES: 0.0
- ILD_STALL.LCP: 0.0
- UOPS_MITE>=1: 0.25
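As a quick sanity check on the counters above, the per-port values can be summed and compared with UOPS_EXECUTED.THREAD. The sketch below copies the numbers from this run; the variable names are illustrative, and reading port 4 as the store-data port and ports 2, 3 and 7 as store-address ports is an assumption about the measured core, not something the results state.

```python
# Per-port uop counts copied from the unroll_count=500 results above.
port_uops = {0: 0.0, 1: 0.0, 2: 0.32, 3: 0.34, 4: 1.0, 5: 0.01, 6: 0.02, 7: 0.34}
uops_executed = 2.04  # UOPS_EXECUTED.THREAD from the same run

# Total uops dispatched across all ports, per instruction.
port_sum = round(sum(port_uops.values()), 2)

# Assumption: ports 2, 3 and 7 handle the store-address uop,
# port 4 handles the store-data uop.
address_sum = round(port_uops[2] + port_uops[3] + port_uops[7], 2)
data_sum = port_uops[4]

print("sum over ports:        ", port_sum)
print("UOPS_EXECUTED.THREAD:  ", uops_executed)
print("ports 2+3+7 (address?):", address_sum)
print("port 4 (data?):        ", data_sum)
```

Under that assumption, the numbers are consistent with each push issuing one uop to port 4 and one uop spread roughly evenly over ports 2, 3 and 7; the small remainder on ports 5 and 6 looks like measurement noise.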
With loop_count=1000 and unroll_count=10
- Code:
0: 41 50 push r8
- Results:
- Instructions retired: 1.2
- Core cycles: 1.12
- Reference cycles: 0.93
- UOPS_EXECUTED.THREAD: 2.14
- RETIRE_SLOTS: 1.13
- UOPS_MITE: 0.0
- UOPS_MS: 0.0
- UOPS_PORT_0: 0.0
- UOPS_PORT_1: 0.0
- UOPS_PORT_2: 0.33
- UOPS_PORT_3: 0.34
- UOPS_PORT_4: 1.0
- UOPS_PORT_5: 0.02
- UOPS_PORT_6: 0.11
- UOPS_PORT_7: 0.34
- DIV_CYCLES: 0.0
- ILD_STALL.LCP: 0.0
- UOPS_MITE>=1: 0.0
With loop_count=100 and unroll_count=100
- Code:
0: 41 50 push r8
- Results:
- Instructions retired: 1.02
- Core cycles: 1.13
- Reference cycles: 0.89
- UOPS_EXECUTED.THREAD: 2.04
- RETIRE_SLOTS: 1.05
- UOPS_MITE: 0.01
- UOPS_MS: 0.0
- UOPS_PORT_0: 0.0
- UOPS_PORT_1: 0.0
- UOPS_PORT_2: 0.33
- UOPS_PORT_3: 0.33
- UOPS_PORT_4: 1.0
- UOPS_PORT_5: 0.01
- UOPS_PORT_6: 0.03
- UOPS_PORT_7: 0.34
- DIV_CYCLES: 0.0
- ILD_STALL.LCP: 0.0
- UOPS_MITE>=1: 0.0
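The three configurations can be compared side by side. The sketch below copies "Instructions retired" and "Core cycles" from each run and derives two things: how many extra instructions per loop iteration would explain the retired-instruction counts above 1.0, and the lowest cycles-per-instruction figure across the runs. Taking the minimum as the throughput estimate is an assumption made here for illustration, and the dictionary keys are hypothetical names, not nanoBench output fields.

```python
# Values copied from the three result lists above.
runs = [
    {"label": "unroll_count=500, no loop", "unroll_count": 500, "instr": 1.0, "cycles": 1.0},
    {"label": "loop_count=1000, unroll_count=10", "unroll_count": 10, "instr": 1.2, "cycles": 1.12},
    {"label": "loop_count=100, unroll_count=100", "unroll_count": 100, "instr": 1.02, "cycles": 1.13},
]

# Instructions retired per measured instruction, minus the instruction itself,
# scaled back up to one loop iteration: the implied loop overhead.
overhead = [round((r["instr"] - 1.0) * r["unroll_count"], 2) for r in runs]

# Assumption: use the best (lowest) core-cycle count as the throughput estimate.
best_cycles = min(r["cycles"] for r in runs)

for r, extra in zip(runs, overhead):
    print(f'{r["label"]}: {extra} extra instructions per iteration')
print("lowest core cycles per instruction:", best_cycles)
```

Both looped runs work out to two extra retired instructions per iteration, which would fit a counter update plus a conditional branch, although the results do not show the loop code itself. The fully unrolled run has no such overhead and gives the lowest figure, 1.0 core cycles per push.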