MUL (R32) - Throughput and Uops
With unroll_count=500 and no inner loop
Code:
0: 41 f7 e0 mul r8d
Results:
MPERF: 2.41
APERF: 3.0
UOPS: 2.0
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
With loop_count=1000 and unroll_count=10
Code:
0: 41 f7 e0 mul r8d
Results:
MPERF: 2.43
APERF: 3.0
UOPS: 2.1
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
With loop_count=100 and unroll_count=100
Code:
0: 41 f7 e0 mul r8d
Results:
MPERF: 2.44
APERF: 3.0
UOPS: 2.01
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
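The three UOPS values above differ only slightly (2.0, 2.1, 2.01). One plausible reading is that the inner loop adds a fixed number of uops per iteration, which is then divided across the unrolled copies. The Python sketch below checks that reading against the measured numbers. The function name and the even-spreading model are assumptions made for illustration; they are not part of nanoBench.

```python
# Measured UOPS per instruction copy, copied from the results above.
BASELINE_UOPS = 2.0                  # unroll_count=500, no inner loop
LOOPED_UOPS = {10: 2.1, 100: 2.01}   # unroll_count -> UOPS with an inner loop


def loop_overhead_per_iteration(baseline, looped):
    """Estimate the extra uops per loop iteration for each unroll count.

    Assumption: loop overhead is spread evenly over the unrolled copies,
    so overhead_per_iteration = (measured - baseline) * unroll_count.
    """
    return {
        unroll: round((uops - baseline) * unroll, 2)
        for unroll, uops in looped.items()
    }


overhead = loop_overhead_per_iteration(BASELINE_UOPS, LOOPED_UOPS)
print(overhead)
```

Under this model both looped runs come out at roughly one extra uop per iteration, which would fit a single loop-control uop. The measurements alone do not confirm what that uop is.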
With additional dependency-breaking instructions
With unroll_count=500 and no inner loop
Code:
0: 48 31 c0 xor rax,rax
3: 41 f7 e0 mul r8d
Results:
MPERF: 0.8
APERF: 1.0
UOPS: 3.0
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
With loop_count=1000 and unroll_count=10
Code:
0: 48 31 c0 xor rax,rax
3: 41 f7 e0 mul r8d
Results:
MPERF: 0.81
APERF: 1.0
UOPS: 3.1
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
With loop_count=100 and unroll_count=100
Code:
0: 48 31 c0 xor rax,rax
3: 41 f7 e0 mul r8d
Results:
MPERF: 0.81
APERF: 1.0
UOPS: 3.01
FpuPipeAssignment.Total0: 0.0
FpuPipeAssignment.Total1: 0.0
FpuPipeAssignment.Total2: 0.0
FpuPipeAssignment.Total3: 0.0
FpuPipeAssignment.Total4: 0.0
FpuPipeAssignment.Total5: 0.0
DIV_CYCLES: 0.0
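The two groups of results can be compared directly. `mul r8d` implicitly reads EAX and writes EDX:EAX, so back-to-back copies form a dependency chain; the added `xor rax,rax` cuts that chain. The Python sketch below derives a few summary figures from the no-loop runs. The dictionary layout and function name are hypothetical, and reading APERF as actual core cycles and MPERF as reference cycles is an assumption about these counters rather than something stated in the listing.

```python
# Per-iteration values copied from the unroll_count=500, no-inner-loop runs.
CHAINED = {"MPERF": 2.41, "APERF": 3.0, "UOPS": 2.0}  # mul r8d only
BROKEN = {"MPERF": 0.8, "APERF": 1.0, "UOPS": 3.0}    # xor rax,rax + mul r8d


def summarize(chained, broken):
    """Derive summary figures from the two measurement sets."""
    return {
        # Core cycles per mul when each copy waits on the previous EAX result.
        "chained_cycles": chained["APERF"],
        # Core cycles per xor+mul pair once the chain through RAX is cut.
        "independent_cycles": broken["APERF"],
        # Uops attributable to the xor: difference of the two UOPS counts.
        "xor_uops": round(broken["UOPS"] - chained["UOPS"], 2),
        # Ratio of actual to reference cycles (assumed counter meaning).
        "aperf_over_mperf": round(broken["APERF"] / broken["MPERF"], 2),
    }


summary = summarize(CHAINED, BROKEN)
print(summary)
```

On these numbers the chained run takes about 3 core cycles per `mul`, while the dependency-broken run sustains about one `xor`+`mul` pair per cycle, and the `mul` itself accounts for 2 of the 3 uops.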