They hit the CPU limit; 56 cores work out to just about 14 million IOPS:
Let’s say you want to obtain this with one server. An Intel Xeon E5-2630 v4 has ten cores running at 2.20 GHz each, that is, 2.2 billion cycles per second per core. Divide this number by the 27,000 cycles each I/O costs and you get roughly 81,481 IOPS per core. That means you need 12.27 cores to generate one million IOPS with an AHCI-connected device. That is more than one complete CPU package exhausted, leaving what remains of the other package to deal with VM CPU cycles. By comparison, with NVMe it only takes 4.13 cores.
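
To make the arithmetic easy to check, here is a rough Python sketch. The clock speed, the 27,000 cycles per I/O, and the 4.13-core NVMe figure come from the quote; the function names are mine, and the NVMe per-I/O cycle cost is only backed out from the 4.13 figure, not stated in the quote. It also assumes IOPS scale linearly with core count, which is a simplification.

```python
# Back-of-the-envelope check of the quoted numbers.
# Assumption: throughput scales linearly with cores (no contention, no turbo).

CLOCK_HZ = 2.2e9             # per-core clock from the quote (2.20 GHz)
AHCI_CYCLES_PER_IO = 27_000  # CPU cycles per I/O from the quote
NVME_CORES_PER_MILLION = 4.13  # cores per 1M IOPS from the quote
TARGET_IOPS = 1_000_000


def iops_per_core(clock_hz, cycles_per_io):
    """I/Os one core can drive per second if each I/O costs cycles_per_io."""
    return clock_hz / cycles_per_io


def cores_needed(target_iops, clock_hz, cycles_per_io):
    """Cores required to sustain target_iops."""
    return target_iops / iops_per_core(clock_hz, cycles_per_io)


def implied_cycles_per_io(cores, target_iops, clock_hz):
    """Per-I/O cycle cost implied by a 'cores per target_iops' figure."""
    return cores * clock_hz / target_iops


if __name__ == "__main__":
    ahci_per_core = iops_per_core(CLOCK_HZ, AHCI_CYCLES_PER_IO)
    ahci_cores = cores_needed(TARGET_IOPS, CLOCK_HZ, AHCI_CYCLES_PER_IO)
    nvme_cycles = implied_cycles_per_io(NVME_CORES_PER_MILLION, TARGET_IOPS, CLOCK_HZ)
    nvme_per_core = TARGET_IOPS / NVME_CORES_PER_MILLION

    print(f"AHCI IOPS per core:      {ahci_per_core:,.0f}")
    print(f"AHCI cores per 1M IOPS:  {ahci_cores:.2f}")
    print(f"NVMe implied cycles/IO:  {nvme_cycles:,.0f}")
    print(f"NVMe IOPS on 56 cores:   {56 * nvme_per_core:,.0f}")
```

With these inputs, AHCI comes out at about 81,481 IOPS per core and about 12.27 cores per million IOPS. The NVMe figure implies about 9,086 cycles per I/O, so 56 cores at that rate give roughly 13.6 million IOPS, which is in line with the 14 million mentioned above.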