
Monday, October 3, 2022

High await/avgqu-sz with NetBackup RMAN backup on Oracle 19c

High await and avgqu-sz on Oracle/Red Hat Linux 7: NetBackup RMAN Oracle 19c I/O issues

Environment:

OS: Oracle Linux 7

Database Version: 19c (19.14)

Grid/ASM: 19c (19.14)

Storage: NetApp SAN

Backup: RMAN (NetBackup)

DB Size: 120 TB


Issue: High await and avgqu-sz on Oracle/Red Hat Linux 7. Online operations and application user processes see noticeable performance degradation.



avgqu-sz

The average queue length of the requests issued to the device.

await

The average time (milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.




The symptoms can be watched with iostat (the dm devices here are the multipath ASM disks):

iostat -xtzm 5 | grep -E "dm-|Device"



The I/O performance issue is observed while the RMAN backup is running; application processes take much longer to complete.


Check max_sectors_kb for the devices. In my case all ASM devices start with dm-.





You can run as root:


multipath -ll | sed -rn 's/.*(dm-[[:digit:]]+|sd[[:alpha:]]+).*/\1/p' | xargs -I % echo egrep -H \"*\" /sys/block/%/queue/max_sectors_kb | bash | grep dm-
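The same check can also be written as a plain loop over sysfs. This is a sketch rather than the exact command from the post; the SYSBLOCK variable is a hypothetical parameter added only so the loop can be exercised outside the database host (on the real system, leave it at the default /sys/block):

```shell
# Print max_sectors_kb for every dm- multipath device.
# SYSBLOCK defaults to the real sysfs tree; override it only for testing.
SYSBLOCK=${SYSBLOCK:-/sys/block}
for f in "$SYSBLOCK"/dm-*/queue/max_sectors_kb; do
    [ -e "$f" ] || continue              # no dm devices: print nothing
    printf '%s:%s\n' "$f" "$(cat "$f")"
done
```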


All devices had max_sectors_kb set to 64.


We then set max_sectors_kb to 1024 for all of these devices in the multipath.conf file. max_sectors_kb can be sized based on the ASM AU size; in my case the ASM AU size was 4 MB, but we opted for 1024 KB first to see the impact.
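For reference, the same value can also be pushed at runtime by writing to sysfs as root (echo 1024 > /sys/block/dm-8/queue/max_sectors_kb), but such a change is lost at reboot, which is why the setting was made persistent in multipath.conf. The sketch below demonstrates the write and read-back against a scratch file, since touching /sys requires root and real dm devices:

```shell
# Runtime equivalent of the multipath.conf setting (root on the real host):
#   echo 1024 > /sys/block/dm-8/queue/max_sectors_kb
# Demonstrated here against a scratch file instead of the real sysfs entry.
DEV=$(mktemp)            # stand-in for /sys/block/dm-8/queue/max_sectors_kb
echo 1024 > "$DEV"
cat "$DEV"               # prints 1024
rm -f "$DEV"
```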


*** After changing the values ***

/sys/block/dm-8/queue/max_sectors_kb:1024

/sys/block/dm-7/queue/max_sectors_kb:1024

/sys/block/dm-6/queue/max_sectors_kb:1024

/sys/block/dm-3/queue/max_sectors_kb:1024

/sys/block/dm-9/queue/max_sectors_kb:1024

/sys/block/dm-5/queue/max_sectors_kb:1024

/sys/block/dm-4/queue/max_sectors_kb:1024

/sys/block/dm-2/queue/max_sectors_kb:1024

/sys/block/dm-10/queue/max_sectors_kb:1024



On all devices, the max_sectors_kb value was changed from 64 to 1024.


max_sectors_kb was added to the multipath.conf file.

After changing multipath.conf, you need to reboot the system for the change to take effect.



grep -v "#" /etc/multipath.conf



defaults {
        user_friendly_names     no
        find_multipaths         yes
        polling_interval        10
        max_fds                 8192
}

devices {
   device {
      vendor            "NETAPP  "
      product           "LUN.*"
      no_path_retry     fail
      path_checker      tur
      max_sectors_kb    1024
   }
}


blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^(hd|xvd|vd)[a-z]*"
}
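After the reboot, the running multipathd configuration can be inspected and grepped for the new setting. This is a suggestion, not a step from the original post; the command assumes the device-mapper-multipath tooling shipped with RHEL/OL 7:

```shell
# On the live host (as root) you would check the merged setting with:
#   multipathd show config | grep max_sectors_kb
# (or multipathd -k'show config' on older device-mapper-multipath builds).
# The same grep is demonstrated here against an inline copy of the stanza.
grep max_sectors_kb <<'EOF'
device {
  vendor         "NETAPP  "
  product        "LUN.*"
  max_sectors_kb 1024
}
EOF
```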


The high await and avgqu-sz issue was resolved.

The RMAN backup finished in half the time, and application-level performance also improved.

