Alibaba Cloud ECS Cloud Disk IOPS Stress Test | SundayHK

https://help.aliyun.com/zh/ecs/user-guide/test-the-iops-performance-of-an-essd

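Testing a raw disk gives more realistic block storage performance figures, but it destroys the file system structure. Create a snapshot to back up your data before testing. For details, see Create a snapshot.
We strongly recommend that you do not test the system disk that holds the operating system, or any cloud disk that contains important data, to avoid data loss. Run the benchmark tools on a newly created data disk or temporary disk that holds no important data.
If you must stress-test the system disk as a raw disk, reset the system after the test and before deploying workloads, so that latent problems introduced by the test do not affect long-term stability.
All performance results were obtained in a test environment and are for reference only. In a real production environment, cloud disk performance may differ because of factors such as network conditions and concurrent access; rely on your own measurements.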
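The warnings above can be backed by a quick pre-flight check. The following is a minimal sketch, not part of the original procedure: the function name is_device_in_use and its optional second argument (a mounts file, defaulting to /proc/mounts) are illustrative assumptions. It only inspects mount entries, so it will not notice swap, LVM, or RAID members, and it deliberately over-matches (any suffix of digits after the device name counts as a partition).

```shell
# is_device_in_use DEVICE [MOUNTS_FILE]
# Returns 0 if DEVICE, or something that looks like one of its partitions
# (DEVICE followed by an optional "p" and digits), is listed as a mount source.
# MOUNTS_FILE is a hypothetical override for dry runs; it defaults to /proc/mounts.
is_device_in_use() {
    local dev="$1"
    local mounts="${2:-/proc/mounts}"
    awk -v dev="$dev" '
        $1 == dev { found = 1 }
        index($1, dev) == 1 && substr($1, length(dev) + 1) ~ /^p?[0-9]+$/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts"
}

# Usage (assumed device name):
#   if is_device_in_use /dev/nvme1n1; then
#       echo "Refusing to test a mounted device" >&2
#       exit 1
#   fi
```

A device that passes this check can still hold data you care about, so the snapshot advice above still applies.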

sudo yum install libaio libaio-devel fio -y
#sudo apt install libaio1 libaio-dev fio -y

Alibaba Cloud ECS cloud disk IOPS: 3000
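For a sense of scale, an IOPS figure translates into throughput once a block size is fixed: throughput = IOPS x block size. The helper below is only an illustrative sketch (the name iops_to_mibps is an assumption), applied to the 3000 IOPS figure above and the 4k block size used in the test script.

```shell
# iops_to_mibps IOPS BLOCK_KIB
# Prints the throughput in MiB/s (two decimals) for a given IOPS and block size in KiB.
iops_to_mibps() {
    awk -v iops="$1" -v kib="$2" 'BEGIN { printf "%.2f\n", iops * kib / 1024 }'
}

iops_to_mibps 3000 4   # 3000 * 4 KiB = 12000 KiB/s, prints 11.72
```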

root@hk:/tmp# fdisk -l
Disk /dev/nvme0n1: 40 GiB, 42949672960 bytes, 83886080 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: B5398D59-51E9-4E6D-B0E6-67D8813B9F94

Device          Start      End  Sectors  Size Type
/dev/nvme0n1p1   2048     4095     2048    1M BIOS boot
/dev/nvme0n1p2   4096   395263   391168  191M Microsoft basic data
/dev/nvme0n1p3 395264 83886046 83490783 39.8G Linux filesystem

Disk /dev/nvme1n1: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

# /tmp/test_100w.sh
function RunFio
{
 numjobs=$1   # number of test threads, e.g. 10 in the example
 iodepth=$2   # maximum number of I/Os in flight at once, e.g. 128 in the example
 bs=$3        # block size of a single I/O, e.g. 4k in the example
 rw=$4        # read/write pattern, e.g. randwrite in the example
 size=$5      # size of the I/O region for each job, e.g. 1024g in the example
 filename=$6  # test target, e.g. /dev/your_device in the example
 nr_cpus=$(grep -c '^processor' /proc/cpuinfo)
 if [ "$nr_cpus" -lt "$numjobs" ];then
     echo "Numjobs is more than cpu cores, exit!"
     exit 1
 fi
 nu=$((numjobs + 1))
 cpulist=""
 # Round i collects the i-th CPU of every hardware queue, so consecutive
 # entries in cpulist belong to different queues.
 for ((i=1;i<10;i++))
 do
     list=$(cat /sys/block/your_device/mq/*/cpu_list | awk '{if(i<=NF) print $i;}' i="$i" | tr -d ',' | tr '\n' ',')
     if [ -z "$list" ];then
         break
     fi
     cpulist=${cpulist}${list}
 done
 # Skip the first CPU in the list and take the next $numjobs entries.
 spincpu=$(echo "$cpulist" | cut -d ',' -f 2-${nu})
 echo "$spincpu"
 fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --size=${size} --rw=${rw} --filename=${filename} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed="$spincpu" --cpus_allowed_policy=split
}
echo 2 > /sys/block/your_device/queue/rq_affinity
sleep 5
RunFio 10 128 4k randwrite 1024g /dev/your_device # to preserve data, use a file path instead, e.g. /mnt/test.image

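Replace every your_device with the actual device name of the ESSD, for example nvme1n1.
If losing the data on the cloud disk does not affect your workloads, set filename=[device name, for example /dev/vdb]; otherwise, set filename=[a specific file path, for example /mnt/test.image].
Adjust 10, 128, 4k, randwrite, 1024g, and /dev/your_device in RunFio 10 128 4k randwrite 1024g /dev/your_device to suit your environment.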
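The way the script derives the CPU list is easier to follow on sample data. The sketch below reproduces the same idea outside of sysfs: interleave_cpus reads one cpu_list line per queue from standard input and emits the first CPU of every queue, then the second CPU of every queue, and so on; pick_cpus then skips the first entry and keeps the next N, as the cut command in the script does. Both function names and the four-queue sample are illustrative assumptions, and unlike the script this version has no limit of nine rounds and produces no trailing comma.

```shell
# interleave_cpus: stdin has one line per queue, e.g. "0, 4".
# Output: a comma-separated list ordered round by round across queues.
interleave_cpus() {
    awk '
        {
            gsub(/,/, " ")
            n = split($0, f, " ")
            for (i = 1; i <= n; i++) cpu[NR, i] = f[i]
            if (n > max) max = n
        }
        END {
            out = ""
            for (i = 1; i <= max; i++)
                for (q = 1; q <= NR; q++)
                    if ((q, i) in cpu) out = out (out == "" ? "" : ",") cpu[q, i]
            print out
        }
    '
}

# pick_cpus N: skip the first entry of the list on stdin and keep the next N.
pick_cpus() {
    cut -d ',' -f "2-$(( $1 + 1 ))"
}

# Hypothetical layout: 4 queues, 2 CPUs each.
printf '0, 4\n1, 5\n2, 6\n3, 7\n' | interleave_cpus | pick_cpus 4   # prints 1,2,3,4
```

With this layout the four selected CPUs belong to queues 1, 2, 3, and 0, so four jobs bound to them each submit through a different queue.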

Run

bash /tmp/test_100w.sh
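To compare runs, it can help to pull just the IOPS number out of the fio report. This is a rough sketch that assumes the human-readable summary contains a token of the form IOPS=<number>, optionally with a k or M suffix, as recent fio 3.x versions print; the function name extract_iops is an assumption, and other fio versions may format the line differently.

```shell
# extract_iops: reads fio's text output on stdin and prints each IOPS value
# as a plain integer, expanding a trailing k (x1000) or M (x1000000).
extract_iops() {
    awk 'match($0, /IOPS=[0-9.]+[kM]?/) {
        v = substr($0, RSTART + 5, RLENGTH - 5)   # text after "IOPS="
        mult = 1
        if (v ~ /k$/) { mult = 1000; sub(/k$/, "", v) }
        else if (v ~ /M$/) { mult = 1000000; sub(/M$/, "", v) }
        printf "%.0f\n", v * mult
    }'
}

# Usage:
#   bash /tmp/test_100w.sh | extract_iops
```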

Understanding the test_100w.sh script

The following command sets the rq_affinity system parameter of the block device to 2.

echo 2 > /sys/block/your_device/queue/rq_affinity

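rq_affinity value	Description
1	When the block device receives an I/O completion event, the I/O is sent back for processing to the group that contains the vCPU that submitted it. Under multi-threaded concurrency, I/O completions may all end up on a single vCPU, which becomes a bottleneck and caps performance.
2	When the block device receives an I/O completion event, the completion runs on the vCPU that originally submitted the I/O. Under multi-threaded concurrency, this lets every vCPU contribute its full performance.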
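Because the script changes rq_affinity and leaves it that way, you may want to record the previous value and put it back after the test. A minimal sketch follows; the function name set_rq_affinity and its optional third argument (a sysfs root, defaulting to /sys/block, which makes dry runs possible) are assumptions for illustration.

```shell
# set_rq_affinity DEVICE VALUE [SYSFS_ROOT]
# Writes VALUE to the device's rq_affinity file and prints the previous value.
# Needs root when pointed at the real /sys/block.
set_rq_affinity() {
    local dev="$1" value="$2" root="${3:-/sys/block}"
    local f="$root/$dev/queue/rq_affinity"
    local old
    old=$(cat "$f") || return 1
    echo "$value" > "$f" || return 1
    echo "$old"
}

# Usage (assumed device name):
#   old=$(set_rq_affinity nvme1n1 2)
#   ... run fio ...
#   set_rq_affinity nvme1n1 "$old" > /dev/null
```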

The following command binds the jobs to different CPU cores.

fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --size=${size} --rw=${rw} --filename=${filename} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=$spincpu --cpus_allowed_policy=split

#fio --ioengine=libaio --runtime=30s --numjobs=2 --iodepth=128 --bs=4k --size=1024g --rw=randwrite --filename=/data/test.img --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=1, --cpus_allowed_policy=split

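Note

In normal mode, a device has only one request queue. When multiple threads issue I/O concurrently, that single request queue becomes a performance bottleneck. In multi-queue mode, a device can have several request queues for handling I/O, which lets it make full use of the backend storage. If you have 4 I/O threads, for example, bind each of them to a CPU core that belongs to a different request queue so that the workload benefits fully from multi-queue.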

Parameter	Description	Example value
`numjobs`	Number of I/O threads.	10
`/dev/your_device`	Device name of the ESSD.	/dev/nvme1n1
`cpus_allowed_policy`	fio provides the `cpus_allowed_policy` and `cpus_allowed` parameters to bind vCPUs.	split

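The command above runs several jobs, each bound to a different CPU core and therefore served by a different Queue_Id. To see which cpu_core_id each Queue_Id is bound to, run the following commands:

Run ls /sys/block/your_device/mq/, where your_device is your device name, for example nvme1n1. This lists the Queue_Ids of that cloud disk.
Run cat /sys/block/your_device/mq/*/cpu_list, where your_device is your device name, for example nvme1n1. This shows the cpu_core_id that each queue of that cloud disk is bound to.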
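The two commands above can be combined so that each queue is printed next to its CPUs. This is a small sketch under the assumption that every entry under mq/ is a directory containing a cpu_list file; the function name show_queue_cpus and the optional sysfs-root argument are illustrative. Queues are listed in the shell's glob order, which is lexical rather than numeric.

```shell
# show_queue_cpus DEVICE [SYSFS_ROOT]
# Prints one line per hardware queue: "queue <id> -> CPUs <cpu_list contents>".
show_queue_cpus() {
    local dev="$1" root="${2:-/sys/block}"
    local q
    for q in "$root/$dev"/mq/*/; do
        [ -r "${q}cpu_list" ] || continue
        printf 'queue %s -> CPUs %s\n' "$(basename "$q")" "$(cat "${q}cpu_list")"
    done
}

# Usage (assumed device name):
#   show_queue_cpus nvme1n1
```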
