KANDB (K, K, K)
Summary:
"Bitwise Logical AND Masks"
Reference:
https://www.felixcloutier.com/x86/kandw:kandb:kandq:kandd
Extension:
AVX512VEX
Category:
KMASK
ISA-Set:
AVX512DQ_KOPB
CPL:
3
iform:
KANDB_MASKmskw_MASKmskw_MASKmskw_AVX512
iclass:
KANDB
ASM:
KANDB
Operands
Operand 1 (w): Register (K0, K1, K2, K3, K4, K5, K6, K7)
Operand 2 (r): Register (K0, K1, K2, K3, K4, K5, K6, K7)
Operand 3 (r): Register (K0, K1, K2, K3, K4, K5, K6, K7)
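To make the operand roles concrete, here is a minimal Python model of the operation. It is a sketch, not an emulator: the function name `kandb` and the sample values are illustrative, and mask registers are modeled as plain integers. It assumes the documented byte-form behavior, where only the low 8 bits of the two source masks (operands 2 and 3) are ANDed and all higher bits of the destination (operand 1) are cleared.

```python
def kandb(src1: int, src2: int) -> int:
    """Illustrative model of KANDB: AND the low 8 bits of two mask values.

    Bits above bit 7 of the result are zero, regardless of the inputs.
    """
    return (src1 & src2) & 0xFF


# Hypothetical sample masks; bits above bit 7 do not affect the result.
k1 = 0b1111_0000_1010_1010
k2 = 0b0000_1111_1100_1100

k0 = kandb(k1, k2)   # corresponds to: KANDB k0, k1, k2
print(f"{k0:08b}")   # prints 10001000
```

The explicit `& 0xFF` is what separates the byte form from the wider variants (KANDW, KANDD, KANDQ), which operate on 16, 32 and 64 mask bits respectively.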
Available performance data
Emerald Rapids
Alder Lake-P
Rocket Lake
Tiger Lake
Ice Lake
Cascade Lake
Cannon Lake
Skylake-X
AMD Zen 5
AMD Zen 4
Emerald Rapids
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Alder Lake-P
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Rocket Lake
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Tiger Lake
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Ice Lake
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Cascade Lake
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Cannon Lake
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
Skylake-X
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 1.00
Measured (loop):
1.00
Measured (unrolled):
1.00
Number of μops
Executed: 1
Retire slots: 1
Decoded (MITE): 1
Microcode Sequencer (MS): 0
Port usage:
1*p0
IACA 2.3
Throughput
Computed from the port usage: 1.00
IACA:
1.00
Number of μops:
1
Port usage:
1*p0
IACA 3.0
Throughput
Computed from the port usage: 1.00
IACA:
0.98
Number of μops:
1
Port usage:
1*p0
AMD Zen 5
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 0.50
Measured (loop):
0.50
Measured (unrolled):
0.50
Number of μops
Executed: 1
Port usage:
1*FP03
Documentation
Latency: 1
Throughput: 0.50
Number of μops: 1
Port usage: FP0/3
AMD Zen 4
Measurements
Latencies
Latency operand 2 → 1:
1
Latency operand 3 → 1:
1
Throughput
Computed from the port usage: 0.50
Measured (loop):
0.50
Measured (unrolled):
0.50
Number of μops
Executed: 1
Port usage:
1*FP23
Documentation
Latency: 1
Throughput: 0.50
Number of μops: 1
Port usage: FP2/3
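The "Computed from the port usage" values above can be reproduced for this instruction with a short calculation. The sketch below rests on stated assumptions: the port string has the single-group form `N*<prefix><digits>`, each digit names one distinct port, and the μops spread evenly across those ports. The helper name `single_group_throughput` is hypothetical, and strings with several groups (for example `1*p0+1*p5`) need a more general calculation that is not attempted here.

```python
import re


def single_group_throughput(port_usage: str) -> float:
    """Estimate reciprocal throughput (cycles per instruction) for a
    single-group port-usage string such as "1*p0" or "1*FP23".

    Assumption: each digit after the alphabetic prefix is one port, and
    the uops are distributed evenly across those ports.
    """
    match = re.fullmatch(r"(\d+)\*([A-Za-z]+)(\d+)", port_usage)
    if match is None:
        raise ValueError(f"unsupported port-usage string: {port_usage!r}")
    uop_count = int(match.group(1))
    port_count = len(match.group(3))
    return uop_count / port_count


# Port-usage strings taken from the listings above.
for usage in ("1*p0", "1*FP03", "1*FP23"):
    print(usage, f"{single_group_throughput(usage):.2f}")
```

Under these assumptions the Intel entries (`1*p0`, one eligible port) give 1.00, while the AMD Zen 4 and Zen 5 entries (two eligible ports) give 0.50, matching the computed values listed for each microarchitecture.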