Upstream fix for http://bugs.mysql.com/bug.php?id=70768 causes RW lock creations and destructions at a high rate during ibuf merge

Bug #1279671 reported by Laurynas Biveinis
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Medium
Laurynas Biveinis
5.1
Invalid
Undecided
Unassigned
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
Medium
Laurynas Biveinis

Bug Description

MySQL 5.6.16 fixed http://bugs.mysql.com/bug.php?id=70768 by changing dict_table_t stat-related member locking. Instead of a static array of 64 shared latches, to eliminate contention on these, each dict_table_t object got its own dynamically created and destroyed latch.

The short-lived private dict_table_t objects such as those used in ibuf merge got these dynamic rwlatches too. Each (non-priority) rwlatch creation performs heap memory allocations, acquires rw_lock_list_mutex, creates two OS events, which in turn acquire os_sync_mutex.

Parts of a Perf report on the preliminary PS-5.6.16 branch.

- 14,88% mysqld mysqld [.] ut_delay(unsigned long)
   - ut_delay(unsigned long)
      - 90,02% mutex_spin_wait(void*, bool, char const*, unsigned long)
         - 47,73% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
         - 46,52% rw_lock_free_func(rw_lock_t*) - dict_mem_table_free(dict_table_t*)
               + 92,34% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,54% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,11% ibuf_rec_get_volume_func(unsigned char const*)
...
- 6,35% mysqld libpthread-2.12.so [.] pthread_mutex_lock
   - pthread_mutex_lock
      - 35,31% os_event_create()
         - 60,64% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
         - 39,36% mutex_create_func(ib_mutex_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 19,28% sync_array_reserve_cell(sync_array_t*, void*, unsigned long, char const*, unsigned long, unsigned long*)
         - 99,16% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,51% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 49,26% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 92,81% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,48% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,71% ibuf_rec_get_volume_func(unsigned char const*)
         + 0,82% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
      - 18,17% os_fast_mutex_free_func(pthread_mutex_t*)
         - 67,53% os_event_free(os_event*)
            - 59,21% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 91,27% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,96% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 1,77% ibuf_rec_get_volume_func(unsigned char const*)
            - 40,79% mutex_free_func(ib_mutex_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 91,62% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,85% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 1,53% ibuf_rec_get_volume_func(unsigned char const*)
         - 32,47% dict_mem_index_free(dict_index_t*)
            + 89,70% ibuf_dummy_index_free(dict_index_t*)
            + 8,57% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
            + 1,73% ibuf_rec_get_volume_func(unsigned char const*)
      - 7,98% os_event_free(os_event*)
         - 53,98% rw_lock_free_func(rw_lock_t*)
            - dict_mem_table_free(dict_table_t*)
               + 91,23% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 7,07% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,69% ibuf_rec_get_volume_func(unsigned char const*)
         - 46,02% mutex_free_func(ib_mutex_t*)
            - dict_mem_table_free(dict_table_t*)
               + 91,32% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,99% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 1,69% ibuf_rec_get_volume_func(unsigned char const*)
      - 7,17% sync_array_wait_event(sync_array_t*, unsigned long)
         - 98,83% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,14% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 49,35% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 92,97% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 6,19% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,84% ibuf_rec_get_volume_func(unsigned char const*)
         + 1,14% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
      - 6,76% os_fast_mutex_init_func(pthread_mutex_t*)
           dict_mem_index_create(char const*, char const*, unsigned long, unsigned long, unsigned long)
         + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 3,89% mutex_signal_object(ib_mutex_t*)
         - 81,44% rw_lock_free_func(rw_lock_t*)
            - dict_mem_table_free(dict_table_t*)
               + 92,90% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
               + 6,37% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
               + 0,73% ibuf_rec_get_volume_func(unsigned char const*)
         - 18,19% rw_lock_create_func(rw_lock_t*, char const*)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 0,76% sync_array_free_cell(sync_array_t*, unsigned long)
         - 87,03% mutex_spin_wait(void*, bool, char const*, unsigned long)
            - 50,19% rw_lock_create_func(rw_lock_t*, char const*)
                 dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
               + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
            - 46,32% rw_lock_free_func(rw_lock_t*)
               - dict_mem_table_free(dict_table_t*)
                  + 93,59% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
                  + 5,77% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
                  + 0,64% ibuf_rec_get_volume_func(unsigned char const*)
            + 1,88% pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long)
            + 0,87% mtr_commit(mtr_t*)
            + 0,53% _ZL20pfs_mutex_enter_funcP10ib_mutex_tPKcm.clone.2
         + 12,46% rw_lock_s_lock_spin(void*, unsigned long, bool, bool, char const*, unsigned long)
- 3,68% mysqld mysqld [.] mutex_spin_wait(void*, bool, char const*, unsigned long)
   - mutex_spin_wait(void*, bool, char const*, unsigned long)
      - 48,40% rw_lock_create_func(rw_lock_t*, char const*)
           dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
         + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
      - 46,88% rw_lock_free_func(rw_lock_t*)
         - dict_mem_table_free(dict_table_t*)
            + 92,22% _ZL35ibuf_get_volume_buffered_count_funcPKhPmmPl.clone.0
            + 6,71% ibuf_merge_or_delete_for_page(buf_block_t*, unsigned long, unsigned long, unsigned long, unsigned long)
            + 1,07% ibuf_rec_get_volume_func(unsigned char const*)

The above causes a performance regression on MySQL 5.6.16 merge in TPS of RW Sysbench (I/O bound (!)): about 7% drop at 512 threads, 20% at 1024 threads.

Related branches

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

The same reasoning as for dict_table_t::stats_latch probably applies to dict_table_t::autoinc_mutex too, which is not a regression but a pre-existing inefficiency. Considering fixing at the same time.

description: updated
tags: added: innodb performance upstream
description: updated
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1474

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.