DIV (M64)
Summary: "Unsigned Divide"
Reference: https://www.felixcloutier.com/x86/DIV.html
Extension: BASE
Category: BINARY
ISA-Set: I86
CPL: 3
iform: DIV_MEMv
iclass: DIV
ASM: DIV
Operands
Operand 1 (r): Memory
Operand 2 (r/w, suppressed): Register (RAX)
Operand 3 (r/w, suppressed): Register (RDX)
Operand 4 (w, suppressed): Flags (AF: undef, CF: undef, OF: undef, PF: undef, SF: undef, ZF: undef)
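The operand list above can be read as follows: the 128-bit dividend is held in RDX:RAX, the 64-bit divisor is read from memory, and the instruction writes the quotient to RAX and the remainder to RDX. The sketch below models that behaviour in Python. The function name `div_m64` and the use of Python exceptions to stand in for the #DE fault are illustrative assumptions, and the status flags are not modelled because they are left undefined.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit mask


def div_m64(rax, rdx, mem):
    """Model DIV with a 64-bit memory operand (hypothetical helper).

    rax, rdx: current register values (operands 2 and 3)
    mem:      64-bit divisor loaded from memory (operand 1)
    Returns the new (RAX, RDX) pair, i.e. (quotient, remainder).
    """
    divisor = mem & MASK64
    if divisor == 0:
        # Hardware raises #DE; modelled here as a Python exception.
        raise ZeroDivisionError("#DE: divisor is zero")
    # The dividend is the 128-bit value RDX:RAX.
    dividend = ((rdx & MASK64) << 64) | (rax & MASK64)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK64:
        # Hardware also raises #DE when the quotient does not fit in RAX.
        raise OverflowError("#DE: quotient does not fit in 64 bits")
    return quotient, remainder


if __name__ == "__main__":
    print(div_m64(100, 0, 7))  # quotient and remainder of 100 / 7
```

Because both RAX and RDX are read and written, each of them appears as a source and as a destination in the latency tables that follow.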
Available performance data
Alder Lake-P
Alder Lake-E
Rocket Lake
Tiger Lake
Ice Lake
Cascade Lake
Cannon Lake
Skylake-X
Coffee Lake
Kaby Lake
Skylake
Broadwell
Haswell
Ivy Bridge
Sandy Bridge
Westmere
Nehalem
Wolfdale
Conroe
Tremont
Goldmont Plus
Goldmont
Airmont
Bonnell
AMD Zen 4
AMD Zen 3
AMD Zen 2
AMD Zen+
Alder Lake-P
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
19
Latency operand 1 → 2 (address, index register):
19
Latency operand 1 → 3 (address, base register):
23
Latency operand 1 → 3 (address, index register):
23
Latency operand 2 → 2:
14 ≤ lat ≤ 15
Latency operand 2 → 3:
18
Latency operand 3 → 2:
18
Latency operand 3 → 3:
18
Throughput
Computed from the port usage: 3.00
Measured (loop):
10.00
Measured (unrolled):
10.00
Number of μops
Executed: 4
Retire slots: 5
Decoded (MITE): 5
Microcode Sequencer (MS): 0
Port usage:
3*p1+1*p23A
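The "Computed from the port usage" values on this page are consistent with a simple lower bound: for every subset of ports, count the μops that can only run on ports in that subset, divide by the subset size, and take the maximum. The sketch below is one way to reproduce the listed numbers from a port-usage string. The function names are hypothetical, and the assumption that the page uses exactly this calculation is mine; it is offered only because it matches the values shown here.

```python
from itertools import combinations


def parse_port_usage(usage):
    """Parse e.g. '3*p1+1*p23A' into [(3, {'1'}), (1, {'2', '3', 'A'})].

    Each character after 'p' is treated as one port name (assumption).
    """
    groups = []
    for term in usage.split("+"):
        count, ports = term.strip().split("*p")
        groups.append((int(count), frozenset(ports)))
    return groups


def throughput_bound(usage):
    """Lower bound on cycles per instruction implied by the port usage."""
    groups = parse_port_usage(usage)
    all_ports = sorted(set().union(*(ports for _, ports in groups)))
    best = 0.0
    for size in range(1, len(all_ports) + 1):
        for subset in combinations(all_ports, size):
            chosen = set(subset)
            # uops whose allowed ports all lie inside the chosen subset
            confined = sum(n for n, ports in groups if ports <= chosen)
            best = max(best, confined / size)
    return best


if __name__ == "__main__":
    print(throughput_bound("3*p1+1*p23A"))         # Alder Lake-P
    print(throughput_bound("1*p0156+3*p1+1*p23"))  # Rocket Lake
```

For DIV the measured throughput is well above this bound on every microarchitecture listed, so port pressure alone does not explain the cost of the instruction.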
Alder Lake-E
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
15 ≤ lat ≤ 46
Latency operand 1 → 2 (address, index register):
15 ≤ lat ≤ 46
Latency operand 1 → 3 (address, base register):
16 ≤ lat ≤ 47
Latency operand 1 → 3 (address, index register):
16 ≤ lat ≤ 47
Latency operand 2 → 2:
11 ≤ lat ≤ 42
Latency operand 2 → 3:
12 ≤ lat ≤ 43
Latency operand 3 → 2:
11 ≤ lat ≤ 42
Latency operand 3 → 3:
11 ≤ lat ≤ 42
Throughput
Measured (loop):
6.00
Measured (unrolled):
6.00
Number of μops
Executed: 5
Microcode Sequencer (MS): 4
Rocket Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
19
Latency operand 1 → 2 (address, index register):
19
Latency operand 1 → 3 (address, base register):
23
Latency operand 1 → 3 (address, index register):
23
Latency operand 2 → 2:
14
Latency operand 2 → 3:
18
Latency operand 3 → 2:
18
Latency operand 3 → 3:
18
Throughput
Computed from the port usage: 3.00
Measured (loop):
10.00
Measured (unrolled):
10.00
Number of μops
Executed: 5
Retire slots: 5
Decoded (MITE): 4
Microcode Sequencer (MS): 0
Port usage:
1*p0156+3*p1+1*p23
Tiger Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
19
Latency operand 1 → 2 (address, index register):
19
Latency operand 1 → 3 (address, base register):
23
Latency operand 1 → 3 (address, index register):
23
Latency operand 2 → 2:
14
Latency operand 2 → 3:
18
Latency operand 3 → 2:
18
Latency operand 3 → 3:
18
Throughput
Computed from the port usage: 3.00
Measured (loop):
10.00
Measured (unrolled):
10.00
Number of μops
Executed: 5
Retire slots: 5
Decoded (MITE): 4
Microcode Sequencer (MS): 0
Port usage:
1*p0156+3*p1+1*p23
Ice Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
19
Latency operand 1 → 2 (address, index register):
19
Latency operand 1 → 3 (address, base register):
23
Latency operand 1 → 3 (address, index register):
23
Latency operand 2 → 2:
14
Latency operand 2 → 3:
18
Latency operand 3 → 2:
18
Latency operand 3 → 3:
18
Throughput
Computed from the port usage: 3.00
Measured (loop):
10.00
Measured (unrolled):
10.00
Number of μops
Executed: 5
Retire slots: 5
Decoded (MITE): 4
Microcode Sequencer (MS): 0
Port usage:
1*p0156+3*p1+1*p23
Cascade Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
40 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
40 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
39 ≤ lat ≤ 93
Latency operand 1 → 3 (address, index register):
39 ≤ lat ≤ 93
Latency operand 2 → 2:
31 ≤ lat ≤ 87
Latency operand 2 → 3:
32 ≤ lat ≤ 87
Latency operand 3 → 2:
6 ≤ lat ≤ 74
Latency operand 3 → 3:
6 ≤ lat ≤ 73
Throughput
Computed from the port usage: 7.75
Measured (loop):
21.00 (if an indexed addressing mode is used: 21.09)
Measured (unrolled):
21.04 (if an indexed addressing mode is used: 21.05)
Number of μops
Executed: 33
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+7*p015+3*p0156+1*p05+13*p06+1*p1+1*p15+1*p23+3*p5 (if an indexed addressing mode is used: 2*p0+8*p015+3*p0156+13*p06+1*p1+1*p15+1*p23+3*p5)
Cannon Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
19 ≤ lat ≤ 20
Latency operand 1 → 2 (address, index register):
19 ≤ lat ≤ 20
Latency operand 1 → 3 (address, base register):
23
Latency operand 1 → 3 (address, index register):
23
Latency operand 2 → 2:
14
Latency operand 2 → 3:
18
Latency operand 3 → 2:
18
Latency operand 3 → 3:
18
Throughput
Computed from the port usage: 3.00
Measured (loop):
10.00
Measured (unrolled):
10.00
Number of μops
Executed: 5
Retire slots: 5
Decoded (MITE): 4
Microcode Sequencer (MS): 0
Port usage:
1*p0156+3*p1+1*p23
Skylake-X
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
40 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
40 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
39 ≤ lat ≤ 93
Latency operand 1 → 3 (address, index register):
39 ≤ lat ≤ 93
Latency operand 2 → 2:
31 ≤ lat ≤ 87
Latency operand 2 → 3:
32 ≤ lat ≤ 87
Latency operand 3 → 2:
5 ≤ lat ≤ 74
Latency operand 3 → 3:
5 ≤ lat ≤ 73
Throughput
Computed from the port usage: 8.00
Measured (loop):
21.00
Measured (unrolled):
21.00
Number of μops
Executed: 33
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+9*p015+4*p0156+13*p06+1*p1+1*p23+3*p5 (if an indexed addressing mode is used: 2*p0+7*p015+4*p0156+1*p05+13*p06+1*p1+1*p15+1*p23+3*p5)
Coffee Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
40 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
40 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
39 ≤ lat ≤ 94
Latency operand 1 → 3 (address, index register):
39 ≤ lat ≤ 94
Latency operand 2 → 2:
32 ≤ lat ≤ 87
Latency operand 2 → 3:
32 ≤ lat ≤ 87
Latency operand 3 → 2:
5 ≤ lat ≤ 74
Latency operand 3 → 3:
5 ≤ lat ≤ 73
Throughput
Computed from the port usage: 8.00
Measured (loop):
21.00
Measured (unrolled):
21.00
Number of μops
Executed: 33
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+8*p015+4*p0156+1*p05+13*p06+1*p1+1*p23+3*p5
Kaby Lake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
40 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
40 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
39 ≤ lat ≤ 93
Latency operand 1 → 3 (address, index register):
39 ≤ lat ≤ 93
Latency operand 2 → 2:
32 ≤ lat ≤ 87
Latency operand 2 → 3:
32 ≤ lat ≤ 87
Latency operand 3 → 2:
6 ≤ lat ≤ 74
Latency operand 3 → 3:
6 ≤ lat ≤ 73
Throughput
Computed from the port usage: 8.00
Measured (loop):
21.00
Measured (unrolled):
21.00
Number of μops
Executed: 33
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+8*p015+4*p0156+1*p05+13*p06+1*p1+1*p23+3*p5
Skylake
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
40 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
40 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
39 ≤ lat ≤ 94
Latency operand 1 → 3 (address, index register):
39 ≤ lat ≤ 94
Latency operand 2 → 2:
32 ≤ lat ≤ 87
Latency operand 2 → 3:
32 ≤ lat ≤ 87
Latency operand 3 → 2:
5 ≤ lat ≤ 74
Latency operand 3 → 3:
5 ≤ lat ≤ 74
Throughput
Computed from the port usage: 8.00
Measured (loop):
21.00
Measured (unrolled):
21.05
Number of μops
Executed: 33
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+8*p015+4*p0156+1*p05+13*p06+1*p1+1*p23+3*p5
Broadwell
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
34 ≤ lat ≤ 96
Latency operand 1 → 2 (address, index register):
34 ≤ lat ≤ 96
Latency operand 1 → 3 (address, base register):
33 ≤ lat ≤ 95
Latency operand 1 → 3 (address, index register):
33 ≤ lat ≤ 95
Latency operand 2 → 2:
30 ≤ lat ≤ 92
Latency operand 2 → 3:
31 ≤ lat ≤ 92
Latency operand 3 → 2:
5 ≤ lat ≤ 81
Latency operand 3 → 3:
5 ≤ lat ≤ 78
Throughput
Computed from the port usage: 7.75
Measured (loop):
21.03
Measured (unrolled):
21.00
Number of μops
Executed: 32
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+7*p015+5*p0156+12*p06+3*p1+1*p23+2*p5
IACA 2.3
Throughput
Computed from the port usage: 2.00
IACA:
27.76
Number of μops:
8
Port usage:
2*p0+1*p0156+2*p1+1*p23+2*p5
IACA 3.0
Throughput
Computed from the port usage: 2.00
IACA:
28.95
Number of μops:
8
Port usage:
2*p0+1*p0156+2*p1+1*p23+2*p5
Haswell
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
34 ≤ lat ≤ 98
Latency operand 1 → 2 (address, index register):
34 ≤ lat ≤ 98
Latency operand 1 → 3 (address, base register):
33 ≤ lat ≤ 97
Latency operand 1 → 3 (address, index register):
33 ≤ lat ≤ 97
Latency operand 2 → 2:
30 ≤ lat ≤ 94
Latency operand 2 → 3:
31 ≤ lat ≤ 95
Latency operand 3 → 2:
5 ≤ lat ≤ 83
Latency operand 3 → 3:
5 ≤ lat ≤ 82
Throughput
Computed from the port usage: 7.75
Measured (loop):
21.03
Measured (unrolled):
21.00
Number of μops
Executed: 32
Retire slots: 36
Decoded (MITE): 4
Microcode Sequencer (MS): 32
Port usage:
2*p0+7*p015+4*p0156+12*p06+3*p1+1*p23+3*p5
Ivy Bridge
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
35 ≤ lat ≤ 94
Latency operand 1 → 2 (address, index register):
34 ≤ lat ≤ 94
Latency operand 1 → 3 (address, base register):
33 ≤ lat ≤ 93
Latency operand 1 → 3 (address, index register):
32 ≤ lat ≤ 93
Latency operand 2 → 2:
28 ≤ lat ≤ 89
Latency operand 2 → 3:
28 ≤ lat ≤ 90
Latency operand 3 → 2:
6 ≤ lat ≤ 78
Latency operand 3 → 3:
6 ≤ lat ≤ 78
Throughput
Computed from the port usage: 11.00
Measured (loop):
22.00
Measured (unrolled):
22.04
Number of μops
Executed: 32
Retire slots: 35
Decoded (MITE): 4
Microcode Sequencer (MS): 31
Port usage:
2*p0+3*p01+10*p015+2*p05+3*p1+1*p23+11*p5
Sandy Bridge
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
34 ≤ lat ≤ 95
Latency operand 1 → 2 (address, index register):
34 ≤ lat ≤ 95
Latency operand 1 → 3 (address, base register):
34 ≤ lat ≤ 94
Latency operand 1 → 3 (address, index register):
34 ≤ lat ≤ 94
Latency operand 2 → 2:
28 ≤ lat ≤ 89
Latency operand 2 → 3:
28 ≤ lat ≤ 89
Latency operand 3 → 2:
6 ≤ lat ≤ 78
Latency operand 3 → 3:
7 ≤ lat ≤ 78
Throughput
Computed from the port usage: 11.00
Measured (loop):
22.00
Measured (unrolled):
22.02
Number of μops
Executed: 32
Retire slots: 34
Decoded (MITE): 4
Microcode Sequencer (MS): 30
Port usage:
2*p0+3*p01+10*p015+2*p05+3*p1+1*p23+11*p5
Westmere
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
31 ≤ lat ≤ 89
Latency operand 1 → 2 (address, index register):
31 ≤ lat ≤ 89
Latency operand 1 → 3 (address, base register):
29 ≤ lat ≤ 88
Latency operand 1 → 3 (address, index register):
29 ≤ lat ≤ 88
Latency operand 2 → 2:
28 ≤ lat ≤ 87
Latency operand 2 → 3:
28 ≤ lat ≤ 87
Latency operand 3 → 2:
10 ≤ lat ≤ 82
Latency operand 3 → 3:
9 ≤ lat ≤ 78
Throughput
Computed from the port usage: 10.33
Measured (loop):
24.08
Measured (unrolled):
24.09
Number of μops
Executed: 32
Retire slots: 32
Microcode Sequencer (MS): 98
Port usage:
1*p0+14*p015+6*p05+3*p1+1*p2+7*p5
Nehalem
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
31 ≤ lat ≤ 91
Latency operand 1 → 2 (address, index register):
31 ≤ lat ≤ 91
Latency operand 1 → 3 (address, base register):
30 ≤ lat ≤ 91
Latency operand 1 → 3 (address, index register):
30 ≤ lat ≤ 91
Latency operand 2 → 2:
29 ≤ lat ≤ 89
Latency operand 2 → 3:
28 ≤ lat ≤ 90
Latency operand 3 → 2:
12 ≤ lat ≤ 79
Latency operand 3 → 3:
10 ≤ lat ≤ 78
Throughput
Computed from the port usage: 9.33
Measured (loop):
19.42
Measured (unrolled):
19.45
Number of μops
Executed: 29
Retire slots: 29
Microcode Sequencer (MS): 85
Port usage:
1*p0+13*p015+4*p05+3*p1+1*p2+7*p5
Wolfdale
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
31 ≤ lat ≤ 69
Latency operand 1 → 2 (address, index register):
31 ≤ lat ≤ 69
Latency operand 1 → 3 (address, base register):
30 ≤ lat ≤ 68
Latency operand 1 → 3 (address, index register):
30 ≤ lat ≤ 68
Latency operand 2 → 2:
27 ≤ lat ≤ 66
Latency operand 2 → 3:
28 ≤ lat ≤ 66
Latency operand 3 → 2:
12 ≤ lat ≤ 61
Latency operand 3 → 3:
11 ≤ lat ≤ 68
Throughput
Computed from the port usage: 10.33
Measured (loop):
17.33
Measured (unrolled):
17.25
Number of μops
Executed: 32
Port usage:
2*p0+8*p015+4*p05+8*p1+1*p2+9*p5
Conroe
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
35 ≤ lat ≤ 124
Latency operand 1 → 2 (address, index register):
35 ≤ lat ≤ 124
Latency operand 1 → 3 (address, base register):
33 ≤ lat ≤ 122
Latency operand 1 → 3 (address, index register):
33 ≤ lat ≤ 122
Latency operand 2 → 2:
32 ≤ lat ≤ 121
Latency operand 2 → 3:
32 ≤ lat ≤ 119
Latency operand 3 → 2:
14 ≤ lat ≤ 102
Latency operand 3 → 3:
13 ≤ lat ≤ 102
Throughput
Computed from the port usage: 11.00
Measured (loop):
17.43
Measured (unrolled):
17.43
Number of μops
Executed: 34
Port usage:
2*p0+8*p015+8*p05+9*p1+1*p2+6*p5
Tremont
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
14 ≤ lat ≤ 45
Latency operand 1 → 2 (address, index register):
14 ≤ lat ≤ 45
Latency operand 1 → 3 (address, base register):
15 ≤ lat ≤ 46
Latency operand 1 → 3 (address, index register):
15 ≤ lat ≤ 46
Latency operand 2 → 2:
11 ≤ lat ≤ 42
Latency operand 2 → 3:
12 ≤ lat ≤ 43
Latency operand 3 → 2:
11 ≤ lat ≤ 42
Latency operand 3 → 3:
11 ≤ lat ≤ 42
Throughput
Measured (loop):
11.00
Measured (unrolled):
11.00
Number of μops
Executed: 6
Microcode Sequencer (MS): 5
Goldmont Plus
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
16 ≤ lat ≤ 45
Latency operand 1 → 2 (address, index register):
16 ≤ lat ≤ 45
Latency operand 1 → 3 (address, base register):
17 ≤ lat ≤ 46
Latency operand 1 → 3 (address, index register):
17 ≤ lat ≤ 46
Latency operand 2 → 2:
13 ≤ lat ≤ 42
Latency operand 2 → 3:
14 ≤ lat ≤ 43
Latency operand 3 → 2:
13 ≤ lat ≤ 42
Latency operand 3 → 3:
13 ≤ lat ≤ 42
Throughput
Measured (loop):
13.00
Measured (unrolled):
13.00
Number of μops
Executed: 5
Microcode Sequencer (MS): 5
Goldmont
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
16 ≤ lat ≤ 45
Latency operand 1 → 2 (address, index register):
16 ≤ lat ≤ 45
Latency operand 1 → 3 (address, base register):
17 ≤ lat ≤ 46
Latency operand 1 → 3 (address, index register):
17 ≤ lat ≤ 46
Latency operand 2 → 2:
13 ≤ lat ≤ 42
Latency operand 2 → 3:
14 ≤ lat ≤ 43
Latency operand 3 → 2:
13 ≤ lat ≤ 42
Latency operand 3 → 3:
13 ≤ lat ≤ 42
Throughput
Measured (loop):
13.00
Measured (unrolled):
13.00
Number of μops
Executed: 5
Microcode Sequencer (MS): 5
Airmont
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
37 ≤ lat ≤ 99
Latency operand 1 → 2 (address, index register):
37 ≤ lat ≤ 99
Latency operand 1 → 3 (address, base register):
36 ≤ lat ≤ 98
Latency operand 1 → 3 (address, index register):
36 ≤ lat ≤ 98
Latency operand 2 → 2:
34 ≤ lat ≤ 96
Latency operand 2 → 3:
33 ≤ lat ≤ 95
Latency operand 3 → 2:
32 ≤ lat ≤ 95
Latency operand 3 → 3:
32 ≤ lat ≤ 96
Throughput
Measured (loop):
24.02
Measured (unrolled):
24.00
Number of μops
Executed: 23
Microcode Sequencer (MS): 23
Bonnell
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
73 ≤ lat ≤ 195
Latency operand 1 → 2 (address, index register):
195 ≤ lat ≤ 246
Latency operand 1 → 3 (address, base register):
246 ≤ lat ≤ 357
Latency operand 1 → 3 (address, index register):
73 ≤ lat ≤ 195
Latency operand 2 → 2:
70 ≤ lat ≤ 192
Latency operand 2 → 3:
70 ≤ lat ≤ 192
Latency operand 3 → 2:
70 ≤ lat ≤ 192
Latency operand 3 → 3:
70 ≤ lat ≤ 192
Throughput
Measured (loop):
71.03
Measured (unrolled):
71.00
Number of μops
Executed: 38
Microcode Sequencer (MS): 38
AMD Zen 4
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
14 ≤ lat ≤ 22
Latency operand 1 → 2 (address, index register):
14 ≤ lat ≤ 22
Latency operand 1 → 3 (address, base register):
14 ≤ lat ≤ 22
Latency operand 1 → 3 (address, index register):
14 ≤ lat ≤ 22
Latency operand 2 → 2:
10 ≤ lat ≤ 18
Latency operand 2 → 3:
10 ≤ lat ≤ 18
Latency operand 3 → 2:
11 ≤ lat ≤ 19
Latency operand 3 → 3:
11 ≤ lat ≤ 19
Throughput
Measured (loop):
7.00
Measured (unrolled):
7.00
Number of μops
Executed: 3
AMD Zen 3
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
14 ≤ lat ≤ 22
Latency operand 1 → 2 (address, index register):
14 ≤ lat ≤ 22
Latency operand 1 → 3 (address, base register):
14 ≤ lat ≤ 22
Latency operand 1 → 3 (address, index register):
14 ≤ lat ≤ 22
Latency operand 2 → 2:
10 ≤ lat ≤ 18
Latency operand 2 → 3:
10 ≤ lat ≤ 18
Latency operand 3 → 2:
11 ≤ lat ≤ 19
Latency operand 3 → 3:
11 ≤ lat ≤ 19
Throughput
Measured (loop):
7.00
Measured (unrolled):
7.00
Number of μops
Executed: 3
Documentation
Latency: 18
Throughput: 10.00
Number of μops: 2
Port usage: DIV
AMD Zen 2
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
12 ≤ lat ≤ 43
Latency operand 1 → 2 (address, index register):
12 ≤ lat ≤ 43
Latency operand 1 → 3 (address, base register):
14 ≤ lat ≤ 45
Latency operand 1 → 3 (address, index register):
14 ≤ lat ≤ 45
Latency operand 2 → 2:
8 ≤ lat ≤ 39
Latency operand 2 → 3:
10 ≤ lat ≤ 41
Latency operand 3 → 2:
10 ≤ lat ≤ 41
Latency operand 3 → 3:
10 ≤ lat ≤ 41
Throughput
Measured (loop):
14.00
Measured (unrolled):
14.00
Number of μops
Executed: 2
Documentation
Latency: 41
Throughput: 41.00
Number of μops: 2
Port usage: ALU2
AMD Zen+
Measurements
Latencies
Latency operand 1 → 2 (address, base register):
12 ≤ lat ≤ 43
Latency operand 1 → 2 (address, index register):
12 ≤ lat ≤ 43
Latency operand 1 → 3 (address, base register):
14 ≤ lat ≤ 45
Latency operand 1 → 3 (address, index register):
14 ≤ lat ≤ 45
Latency operand 2 → 2:
8 ≤ lat ≤ 39
Latency operand 2 → 3:
10 ≤ lat ≤ 41
Latency operand 3 → 2:
10 ≤ lat ≤ 41
Latency operand 3 → 3:
10 ≤ lat ≤ 41
Throughput
Measured (loop):
14.00
Measured (unrolled):
14.00
Number of μops
Executed: 2