RDRAND (R64)
Summary: "Read Random Number"
Reference: https://www.felixcloutier.com/x86/rdrand
Extension: RDRAND
Category: RDRAND
ISA-Set: RDRAND
CPL: 3
iform: RDRAND_GPRv
iclass: RDRAND
ASM: RDRAND
Operands
Operand 1 (w): Register (RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15)
Operand 2 (w, suppressed): Flags (AF: w, CF: w, OF: w, PF: w, SF: w, ZF: w)
Available performance data
Arrow Lake-P, Arrow Lake-E, Meteor Lake-P, Meteor Lake-E, Emerald Rapids, Alder Lake-P, Alder Lake-E, Rocket Lake, Tiger Lake, Ice Lake, Cascade Lake, Cannon Lake, Skylake-X, Coffee Lake, Kaby Lake, Skylake, Broadwell, Haswell, Ivy Bridge, Tremont, Goldmont Plus, Goldmont, Airmont, AMD Zen 5, AMD Zen 4, AMD Zen 3, AMD Zen 2, AMD Zen+
Arrow Lake-P
Measurements
Throughput
Computed from the port usage: 2.67
Measured (loop): 1573.80
Measured (unrolled): 1573.58
Number of μops
Executed: 9
Retire slots: 14
Decoded (MITE): 0
Microcode Sequencer (MS): 15
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 13*ALU+1*JMP+1*LD+1*SHIFT+1*SLOW
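The throughput figures on this page are in cycles per instruction, and this entry is the 64-bit register form, so each successful RDRAND delivers 8 bytes. As a rough worked example, the sketch below converts a measured throughput into an approximate random-data rate. The function name `rdrand_bytes_per_second` and the 3 GHz clock are assumptions made for illustration only; they are not taken from the measurements.

```python
def rdrand_bytes_per_second(cycles_per_instr, clock_hz, bytes_per_instr=8):
    """Approximate random bytes per second for back-to-back RDRAND r64.

    cycles_per_instr: measured throughput in cycles per instruction.
    clock_hz: core clock frequency (an assumption, not part of the data).
    bytes_per_instr: 8 for the 64-bit register form.
    """
    instructions_per_second = clock_hz / cycles_per_instr
    return instructions_per_second * bytes_per_instr


# Arrow Lake-P measured (loop) value, with a hypothetical 3 GHz clock.
rate = rdrand_bytes_per_second(1573.80, 3.0e9)
print(f"{rate / 1e6:.1f} MB/s")
```

The measured value is far larger than the 2.67 cycles computed from port usage, which suggests the instruction is limited by something other than execution ports, presumably the random number generator hardware itself.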
Arrow Lake-E
Measurements
Throughput
Measured (loop): 1565.92
Measured (unrolled): 1459.58
Number of μops
Executed: 13
Microcode Sequencer (MS): 13
Requires the complex decoder
Meteor Lake-P
Measurements
Throughput
Computed from the port usage: 7.00
Measured (loop): 1842.35
Measured (unrolled): 1844.38
Number of μops
Executed: 21
Retire slots: 24
Decoded (MITE): 0
Microcode Sequencer (MS): 20
Requires the complex decoder (1 other instruction can be decoded with simple decoders in the same cycle)
Port usage: 2*p0+3*p0156B+2*p056+5*p06+7*p1+3*p15+1*p23A+2*p5
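The "Computed from the port usage" values on this page are consistent with a simple bound: for every set of ports, count the μops that can only run on ports inside that set, divide by the number of ports in the set, and take the maximum. The sketch below is one way to reproduce that figure from a port usage string. The helper names `parse_port_usage` and `port_bound` are hypothetical, and the parser assumes the `p...` and `FP...` notations only; it does not handle named units such as `ALU` or `JMP`.

```python
import re
from itertools import combinations


def parse_port_usage(text):
    """Parse '3*p015+8*p06' into [(3, {'0','1','5'}), (8, {'0','6'})]."""
    groups = []
    for term in text.split("+"):
        m = re.fullmatch(r"(\d+)\*(?:FP|p)([0-9A-Z]+)", term.strip())
        if m is None:
            raise ValueError(f"unrecognized term: {term!r}")
        # Each character after the prefix names one port (e.g. '0', '6', 'B').
        groups.append((int(m.group(1)), frozenset(m.group(2))))
    return groups


def port_bound(text):
    """Lower bound on cycles per instruction implied by the port usage."""
    groups = parse_port_usage(text)
    ports = sorted(set().union(*(p for _, p in groups)))
    best = 0.0
    for k in range(1, len(ports) + 1):
        for subset in combinations(ports, k):
            chosen = set(subset)
            # μops whose allowed ports all lie inside the chosen set.
            uops = sum(n for n, p in groups if p <= chosen)
            best = max(best, uops / k)
    return best


print(port_bound("2*p0+3*p0156B+2*p056+5*p06+7*p1+3*p15+1*p23A+2*p5"))  # 7.0
```

For the Meteor Lake-P string the bound comes from the seven μops restricted to p1, matching the listed 7.00. The exhaustive subset search is exponential in the number of ports, which is acceptable for the roughly ten ports involved here.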
Meteor Lake-E
Measurements
Throughput
Measured (loop): 1457.13
Measured (unrolled): 1443.82
Number of μops
Executed: 12
Microcode Sequencer (MS): 11
Requires the complex decoder
Emerald Rapids
Measurements
Throughput
Computed from the port usage: 7.00
Measured (loop): 118.65
Measured (unrolled): 118.90
Number of μops
Executed: 21
Retire slots: 24
Decoded (MITE): 0
Microcode Sequencer (MS): 24
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 2*p0+3*p0156B+3*p056+4*p06+7*p1+3*p15+1*p23A+2*p5
Alder Lake-P
Measurements
Throughput
Computed from the port usage: 7.00
Measured (loop): 1379.67
Measured (unrolled): 1373.38
Number of μops
Executed: 21
Retire slots: 24
Decoded (MITE): 0
Microcode Sequencer (MS): 24
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 2*p0+4*p0156B+2*p056+5*p06+7*p1+2*p15+1*p23A+2*p5
Alder Lake-E
Measurements
Throughput
Measured (loop): 1400.78
Measured (unrolled): 1224.35
Number of μops
Executed: 19
Microcode Sequencer (MS): 18
Requires the complex decoder
Rocket Lake
Measurements
Throughput
Computed from the port usage: 7.75
Measured (loop): 1425.23
Measured (unrolled): 1425.33
Number of μops
Executed: 32
Retire slots: 34
Decoded (MITE): 0
Microcode Sequencer (MS): 34
Requires the complex decoder (4 other instructions can be decoded with simple decoders in the same cycle)
Port usage: 2*p0+4*p015+4*p0156+11*p06+7*p1+3*p5+1*p78
Tiger Lake
Measurements
Throughput
Computed from the port usage: 14.25
Measured (loop): 13541.75
Measured (unrolled): 13829.85
Number of μops
Executed: 80
Retire slots: 64
Decoded (MITE): 0
Microcode Sequencer (MS): 91
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 4*p0+5*p015+2*p0156+24*p06+13*p1+1*p23+9*p49+9*p5+12*p78
Ice Lake
Measurements
Throughput
Computed from the port usage: 18.00
Measured (loop): 1548.77
Measured (unrolled): 1548.77
Number of μops
Executed: 56
Retire slots: 54
Decoded (MITE): 0
Microcode Sequencer (MS): 62
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 9*p0+1*p01+17*p06+18*p1+10*p5+1*p78
Cascade Lake
Measurements
Throughput
Computed from the port usage: 4.00
Measured (loop): 606.03
Measured (unrolled): 607.43
Number of μops
Executed: 16
Retire slots: 16
Decoded (MITE): 0
Microcode Sequencer (MS): 16
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 3*p015+1*p0156+8*p06+2*p1+1*p23
Cannon Lake
Measurements
Throughput
Computed from the port usage: 2.75
Measured (loop): 905.75
Measured (unrolled): 905.75
Number of μops
Executed: 12
Retire slots: 14
Decoded (MITE): 0
Microcode Sequencer (MS): 14
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 1*p015+3*p0156+5*p06+2*p1+1*p23
Skylake-X
Measurements
Throughput
Computed from the port usage: 4.50
Measured (loop): 574.57
Measured (unrolled): 575.10
Number of μops
Executed: 16
Retire slots: 16
Decoded (MITE): 0
Microcode Sequencer (MS): 16
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 3*p015+1*p0156+9*p06+2*p1+1*p23
Coffee Lake
Measurements
Throughput
Computed from the port usage: 1869.50
Measured (loop): 5718.93
Measured (unrolled): 5719.53
Number of μops
Executed: 8459
Retire slots: 8416
Decoded (MITE): 0
Microcode Sequencer (MS): 8534
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 1808*p0+2*p015+2*p0156+1931*p06+1783*p1+1162*p23+12*p237+17*p4+1742*p5
Kaby Lake
Measurements
Throughput
Computed from the port usage: 3201.50
Measured (loop): 7241.02
Measured (unrolled): 7242.97
Number of μops
Executed: 14407
Retire slots: 14350
Decoded (MITE): 0
Microcode Sequencer (MS): 14526
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 3102*p0+11*p01+6*p05+3301*p06+3003*p1+1997*p23+17*p4+2987*p5
Skylake
Measurements
Throughput
Computed from the port usage: 2460.00
Measured (loop): 6160.33
Measured (unrolled): 6160.10
Number of μops
Executed: 11108
Retire slots: 11045
Decoded (MITE): 0
Microcode Sequencer (MS): 11216
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 2381*p0+8*p01+4*p05+2539*p06+2329*p1+1*p15+1533*p23+10*p237+18*p4+2293*p5
Broadwell
Measurements
Throughput
Computed from the port usage: 20.00
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 2088.27
Number of μops
Executed: 75
Retire slots: 58
Decoded (MITE): 0
Microcode Sequencer (MS): 103
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 3*p0+35*p06+20*p1+7*p4+15*p5
IACA 2.3
Throughput
Computed from the port usage: 1.00
IACA: 8.57
Number of μops: 5
Port usage: 3*p0156+1*p06+1*p23
IACA 3.0
Throughput
Computed from the port usage: 1.00
IACA: 0.97
Number of μops: 5
Port usage: 3*p0156+1*p06+1*p23
Haswell
Measurements
Throughput
Computed from the port usage: 20.00
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 2381.62
Number of μops
Executed: 73
Retire slots: 58
Decoded (MITE): 0
Microcode Sequencer (MS): 98
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 4*p0+36*p06+18*p1+7*p4+15*p5
Ivy Bridge
Measurements
Throughput
Computed from the port usage: 4.00
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 110.33
Number of μops
Executed: 13
Retire slots: 13
Decoded (MITE): 0
Microcode Sequencer (MS): 13
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage: 7*p015+1*p05+2*p1+1*p23+2*p5
Tremont
Measurements
Throughput
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 3539.98
Number of μops
Executed: 58
Microcode Sequencer (MS): 58
Requires the complex decoder
Goldmont Plus
Measurements
Throughput
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 1940.43
Number of μops
Executed: 16
Microcode Sequencer (MS): 16
Requires the complex decoder
Goldmont
Measurements
Throughput
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 3230.75
Number of μops
Executed: 17
Microcode Sequencer (MS): 17
Requires the complex decoder
Airmont
Measurements
Throughput
Measured (loop): 9223372036854775808.00 (= 2^63; likely a measurement artifact)
Measured (unrolled): 1345.37
Number of μops
Executed: 15
Microcode Sequencer (MS): 15
Requires the complex decoder
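Several "Measured (loop)" entries above read 9223372036854775808.00, which is exactly 2^63 and is most plausibly an overflow or failed measurement rather than a real cycle count. When processing this data programmatically, one option is to map such values to a missing marker. The sketch below illustrates this; the name `parse_measurement` and the choice of `None` as the marker are assumptions for illustration.

```python
# 2**63 is exactly representable as a float, so the comparison is exact.
INVALID_THRESHOLD = float(2**63)


def parse_measurement(text):
    """Return the throughput as a float, or None for the 2^63 sentinel."""
    value = float(text)
    if value >= INVALID_THRESHOLD:
        return None  # treat as "no valid measurement"
    return value


print(parse_measurement("9223372036854775808.00"))  # None
print(parse_measurement("1345.37"))                 # 1345.37
```

For the affected microarchitectures, the "Measured (unrolled)" value is then the only usable throughput figure.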
AMD Zen 5
Measurements
Throughput
Computed from the port usage: 8.50
Measured (loop): 67.40
Measured (unrolled): 66.97
Number of μops
Executed: 56
Port usage: 16*FP01+8*FP03+8*FP12+2*FP23+2*FP45
Documentation
Latency: NA
Throughput: NA
Number of μops: ucode
Port usage: ucode
AMD Zen 4
Measurements
Throughput
Computed from the port usage: 7.33
Measured (loop): 65.33
Measured (unrolled): 65.00
Number of μops
Executed: 58
Port usage: 14*FP01+8*FP12+1*FP23+2*FP45
Documentation
Latency: variable
Throughput: variable
Number of μops: ucode
AMD Zen 3
Measurements
Throughput
Computed from the port usage: 7.00
Measured (loop): 72.25
Measured (unrolled): 72.00
Number of μops
Executed: 55
Port usage: 7*FP01+7*FP03+7*FP1+1*FP12+1*FP23+2*FP45
Documentation
Latency: variable
Throughput: variable
Number of μops: ucode
AMD Zen 2
Measurements
Throughput
Measured (loop): 3610.35
Measured (unrolled): 3610.62
Number of μops
Executed: 17
Documentation
Latency: variable
Throughput: variable
Number of μops: ucode
AMD Zen+
Measurements
Throughput
Measured (loop): 2536.53
Measured (unrolled): 2535.90
Number of μops
Executed: 19
Documentation
Latency: variable
Throughput: variable
Number of μops: ucode