Introduction on MySQL and performance tuning
MySQL is the most widely used open source relational database management system (RDBMS) and act as a database server providing multiple user access to number of databases.Have a look on MySQL and performance tuning.
SQL means structured query language.
The first version of MySQL was released on 23 May 1995 and was owned and sponsored by a Swedish Company MySQL AB and now its owned by Oracle. It is named after co-founder Michael Widenius’ daughter My.
MySQL is a better choice as database for use in web applications and it’s a main component of widely used LAMP open source web application bundle.
MySQL works on almost all system platforms like IBM-AIX, FreeBSD, HP-UX, Linux, Mac OS X, Microsoft Windows, Novell NetWare, etc, .
MySQL paid editions are also available for commercial use with additional features. The applications which uses MySQL databases includes CMSes like Moodle, Joomla, WordPress, Drupal, etc,. and it is also used by internet giants like Google, Facebook, Wikipedia,Twitter and YouTube.
MySQL is written in C and C++ with SQL parser which is written in YACC (Yet Another Compiler Compiler), but it uses a home-brewed lexical analyzer.
Storage Engines (Database Engine)
It is a software component on DBMS which uses to create, read, update and delete data from a database.
There are many storage engines available for MySQL and performance tuning. Each one has its own advantages and disadvantages and each are more suitable in some situations than other.
The default storage engine before MySQL 5.5 was MyISAM, and on 5.5 or later default storage engine is InnoDB.
The available storage engines are :- MyISAM, InnoDB, Memory, CSV, Merge, Archive, Federated, Blackhole. BerkeleyDB, ndbcluster, MRG_MYISAM, ISAM and Example
To find available storage engines, execute “show engines” command after connecting to MySQL and it also shows whether it supports or not on your server.
Also you can specify which storage engine needs to be used when creating tables.
The two major storage engines of MySQL databases currently using are MyISAM and InnoDB.
MyISAM | InnoDB |
|
|
Advantages
InnoDB
-
InnoDB would be used where data integrity comes priority because of relationship constraints and transactions.
-
Since use of row-level locking InnoDB is faster in write-intensive (inserting and updating values) and lock the row on which data is being inserted or updated.
MyISAM
- While comparing with InnoDB, MyISAM is simple to design and create, thus better for beginners. Also it is not using foreign relationships between tables.
- Because of simple structure MyISAM is faster than InnoDB and thus much less costs of server resources.
- Full text-indexing and faster in read-intensive.
Disadvantages
InnoDB
- Since InnoDB takes care of different relationships between tables, database administrator and scheme creators may take more time in designing the data models which are complex than MyISAM.
- InnoDB consumes more system resourcessuch as memory.
- No full text-indexing
MyISAM
- No data integrity (e.g. relationship constraints) check, which then comes a responsibility and overhead of the database administrators and application developers.
- MyISAM doesn’t support transactions which is essential in critical data applications such as banking.
- Slower than InnoDB for tables that are frequently being inserted to or updated, because the entire table is locked for any insert or update.
Note : Typically MyISAM considered as faster in read-intensive, but current InnoDB improvements removes this difference.
In short, InnoDB is more useful for critical data applications such as banking which require regular inserts and updates whereas MyISAM suits well for applications which don’t require frequent updates, that means just select and display data. Also use where data integrity comes low priority.
MySQL performance tuning
Even though there are lot of variables that you can tweak to improve the performance of MySQL and performance tuning, only few of them are more important.
The main variables are :-
innodb_buffer_pool_size :- This is very important variable if you are using InnoDB while comparing with MyISAM. InnoDB much more sensitive in the buffer pool size. Since InnoDB buffer pool caches both data and indexes, no need to bother about data cache. That means InnoDB doesn’t use OS cache to cache the data, so normally we can set this value to 70-80% of available memory in the system for innodb installations only. Otherwise we can set this upto 50% of available memory. Don’t set this too large because there might be a chance of paging in the OS, also out of memory problems. If you set larger value, the less disk I/O needed to access data in tables.
innodb_additional_mem_pool_size :- This pool used to store data dictionary information and other internal data structures. Actually this variable does not really affect performance too much because the modern operating systems fast in memory allocating functions. It automatically starts allocating memory from OS, if innodb runs out of this pool. 8MB-16MB is enough.
innodb_log_file_size :- It is a very important variable for write intensive workloads especially if there are large data sets. Please note that larger log file size increases performance, but increase recovery time at system crashes. Usually set this value to 25% of innodb_buffer_pool_size. (commonly use values between 64M-512M)
innodb_log_buffer_size :- A large innodb_log_buffer_size allows us to run large transactions without write the log to disk before the transactions committ. Default value is 1MB. Normally we can set this between 1MB and 8 MB. If there are large transactions we can set higher values that will reduce disk I/O.
innodb_flush_log_at_trx_commit :- If you are using innodb, you should set proper values for better performance. This variable has three permitted values (0 , 1 and 2). The default value is 1 in version after 4.0.13 and thats safe.
The value 0 means, write log buffer to log file and flush the log file to disk every second. But nothing happens on transaction commit.
The default value 1 means, write log buffer to log file and flush to disk on every transaction commit.
The value 2 means, write the log buffer to log file on every transaction commit and flush to disk once per second.
The value 2 will return better performance on high-write systems, but there may be a chance of failing transactions on system crashes. The value 2 can be used if the systems having battery backed disk controller with write cache.
key_buffer_size :- This is most important variable that needs to be tuned if you are using MyISAM. Default values is 8MB. This is the global buffer where mysql caches frequently used blocks of index data of MyISAM data and it uses by all threads. Normally set this variable to 25% or upto 50% (not more than 50) of available memory in the system. The size depends on amount of indexes, datasize and workload. Also keep in mind that MyISAM uses OS caches to cache the data, so you need to leave memory for it as well, and data can be much larger than indexes in many cases. Maximum size is 4GB on 32 bit platform, on 62 bit greater values are permitted.
A simple way to check the performance of buffer is by checking four other status variables: key_read_requests, key_reads, key_write_requests, and key_writes.
table_cache_size :- The default value is 64. It had been replaced by table_open_cache since 5.1.3 version. While MySQL and performance tuning accessing a table, it places in the cache. So if we have more tables to open, increase this value. Since MySQL is multi-threaded, there will be a chance of executing many queries on table at a time, so each of these will open a table. You can adjust this value depends upon two other variables (open_tables and opened_tables). Check these values during peak times. If you see open_tables is same as table_cache and opened_tables are rapidly increasing, then you should increase table_cache for better performance. 1024 is good value for applications having more tables or many connections.
thread_cache_size :- On each connect/disconnect of threads will be created/destructed. Default value is 16. This value means, how many thread the server should cache for reuse. that means, if a client disconnects from server, that thread will put in the cache for reuse if there are fewer threads in cache than thread_cache_size. So if a new client connects, thread in cache will use and if there is no threads in cache new thread will be created for that connection. So you can set larger value if you have many connections per second. You can set proper value for this by closely watching threads_created status variable during peak time.
This command would help you to monitor threads_created :-
#mysqladmin -u root -p -r -i 1 ext | grep Threads_created
Enter password:
| Threads_created | 100 |
| Threads_created | 10 |
| Threads_created | 15 |
| Threads_created | 20 |
| Threads_created | 12 I
In this case 20 would be enough.
Please note the first line of output should not consider for calculations since that is an accumulated value.
query_cache_size :- If query cache is enabled, query cache stores the queries and its corresponding results. Later, if the server gets same query again, MySQL and performance tuning takes the result from cache. This would be useful in case of more reads on database such as wordpress blog, that doesn’t update tables regularly. If there is regular update on tables, query cache may slow down the performance. Values from 32M to 512M normally make sense.
The query cache is enabled by setting up three server variables. query_cache_type, query_cache_size, and query_cache_limit
query_cache_type has three status (0 – disabled, 1 – enabled (except SELECT SQL_NO_CACHE … queries), 2 – enabled and works on demand (cache only SELECT SQL_CACHE … queries))
query_cache_size memory for storing results. minimum should be 40KB
query_cache_limit results bigger than this won’t be cached.
To check whether query cache is available on your server with the help of have_query_cache parameter.
sort_buffer_size :- You can set this as session or global variable, but best use this as a per session variable. per-session means the defined memory should be used for each connection/thread. sort_buffer helps us while performing large number of sorts (using ORDER BY or GROUP BY) and it doesn’t depend on which storage engine you are using. If you set larger value, it may slow down the performance. For example, if you set this value to 1MB and need only 100KB for sorting, but mysql allocates 1MB for that query as it is a per session variable and hence there would be waste of memory. So better value is between 256K and 1MB. The maximum permitted value is 4GB.
read_rnd_buffer_size :- The read_rnd_buffer_size is used after a sort, when reading rows in sorted order. If you are having many queries with ORDER BY clause, then this variable would be helpful for improving the performance. Prior to 4.0.3 this variable was record_rnd_buffer. This can be set as either global or session. Since this is a buffer allocated for each client don’t set high value globally. Instead of that set it as session variable. The value of 1KB for each 1MB of memory on the server would be enough. The maximum permitted value is 2GB. This is not specific to any storage engine.
tmp_table_size :- The maximum size of internal in-memory temporary tables. If an in-memory temporary table exceeds the limit, MySQL and performance tuning automatically converts it to an on-disk MyISAM table. Increase the value if you have many GROUP BY queries and have lots of memory. The maximum permitted value is 32 MB. You can set this as either global or session.
myisam_sort_buffer_size :- The size of the buffer that is allocated when sorting MyISAM indexes during a REPAIR TABLE when creating indexes with CREATE INDEX or ALTER TABLE. The maximum size is 4GB. You can set this value as either global or session.
Note: You will get a Perl script https://raw.github.com/rackerhacker/MySQLTuner-perl/master/mysqltuner.pl and that will tell you which variable to what value you need to tweak and it depends on server workload. But the results are not accurate in all situations.
Also as a part of performance tuning enable slow query log which helps you to find out which query takes more time to execute than the specified time. You can enable slow query log with the help of three variables.
slow_query_log : Two values, Yes or No. Yes –enabled, No-disabled
slow_query_log_file : in this parameter you can specify the filename to which the slow queries needs to be written.
query_long_time : The value is in seconds and the queries which takes more time than this value will write to the log file.