Read Buffer performance hit
I had some fun yesterday with some odd performance problems. So I did a run with oprofile and got this:
561612 25.0417 /lib64/tls/libc-2.3.4.so memset 429457 19.1491 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux clear_page 214268 9.5540 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_page_fault 144293 6.4339 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_no_page 94410 4.2097 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux buffered_rmqueue 64998 2.8982 /lib64/tls/libc-2.3.4.so memcpy 59565 2.6559 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __down_read_trylock 59369 2.6472 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux handle_mm_fault 47312 2.1096 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux free_hot_cold_page 39161 1.7462 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux release_pages 39140 1.7452 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux unmap_vmas 27200 1.2128 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __alloc_pages 21520 0.9596 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __pagevec_lru_add_active 20772 0.9262 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux page_add_anon_rmap 19852 0.8852 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux page_remove_rmap 16424 0.7323 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux bad_range 11993 0.5348 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __might_sleep
Which seemed to me like an awful lot of page faults, especially on a system that wasn’t
swapping at all.
Alexander and I got to talking about it, and I learned a few new things. The first thing is that mmap() appears as page_faults, because the machinery for doing mmap() and for doing swap is actually the same. Fun.
Back to the problem at hand… the following was in the my.cnf:
read_buffer=5M
As I understand it, if read_buffer set to over 256K, it uses mmap() instead of malloc() for memory
allocation. Actually - this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:
Each thread that does a sequential scan allocates a buffer of this size (in bytes) for each table it scans.
In this case, the server is pretty much handling click logging, so all of the queries are pure inserts. Why then would the read buffer get used at all?
Well, it just so happens that this is a PHP app and is not using any sort of persistent connections, so the app is connecting and disconnecting 60 - 100 times a second. Each of those connections is going to have to authenticate against (at least) the mysql.user table. On the other hand, that _should_ be cached, so it shouldn’t be a problem. However, lowering the read buffer is quite easy … and it’ll be interesting.
So, theory to the fire - I reduced read_buffer to 128k (the default) and then got this output from oprofile:
2220 15.7760 /lib64/tls/libc-2.3.4.so memset 785 5.5785 /lib64/tls/libc-2.3.4.so memcpy 608 4.3206 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux thread_return 398 2.8283 /usr/libexec/mysqld yyparse(void*) 337 2.3948 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux system_call 234 1.6629 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_gettimeoffset_hpet 197 1.3999 /lib64/tls/libc-2.3.4.so _int_malloc 183 1.3005 /lib64/tls/libpthread-2.3.4.so pthread_mutex_lock 181 1.2862 /lib64/tls/libpthread-2.3.4.so pthread_mutex_unlock 181 1.2862 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux clear_page 176 1.2507 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux wake_futex 143 1.0162 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux sys_tgkill 138 0.9807 /usr/libexec/mysqld yylex(void*, void*) 130 0.9238 /usr/libexec/mysqld update_sys_var_str(sys_var_str*, _pthread_rwlock_t*, set_var*) 126 0.8954 /jbd (no symbols) 123 0.8741 /usr/libexec/mysqld my_strntoul_ucs2 106 0.7533 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux __might_sleep 106 0.7533 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux do_futex 104 0.7391 /usr/lib/debug/lib/modules/2.6.9-34.ELsmp/vmlinux tcp_recvmsg
Which is more what we would expect.
The results on the system as a whole were that with read buffer set to 5M, extra 3-5% usr and 6-12% sys cpu time taken on an 8 cpu box (as reported by top) or, to look at it another way, about 100% of a single CPU taken by mysqld.
With read_buffer reduced to 128k, the CPU usage dropped to just about nothing.
3 Responses to 'Read Buffer performance hit'
Leave a Reply
You must be logged in to post a comment.
on September 12, 2007 on 2:35 am
[...] Monty Taylor posted interesting investigation of the fact read_buffer_size variable affects connection speed. [...]
on March 6, 2008 on 8:19 am
[...] http://mysql-ha.com/2007/09/06/read-buffer-performance-hit/ [...]
on May 7, 2008 on 10:55 am
[...] MySQL-HA: Read Buffer performance hit [...]