O'Reilly FYI

News from within O'Reilly

Read an Excerpt from High Performance MySQL: Winner of a Productivity Award at the 19th Annual Jolt Awards

 
By Kathryn Barrett
March 17, 2009

Hiperfcoverimage.jpgThe following is an excerpt from High Performance MySQL, Second Edition, winner of a Jolt Productivity award. Congratulations to the authors: Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, and Derek J. Balling.

Tools for High Performance

The MySQL server distribution doesn't include tools for many common tasks, such as monitoring the server or comparing data between servers. Fortunately, MySQL's devoted community has made a wide variety of tools available, reducing the need to roll your own. Many companies also provide commercial alternatives or supplements to MySQL's own tools.

This chapter covers some of the most popular and important productivity tools for MySQL. We divide the tools into categories: interface, monitoring, analysis, and utilities.

MySQL Visual Tools

MySQL AB distributes a set of visual tools that includes MySQL Query Browser, MySQL Administrator, MySQL Migration Toolkit, and MySQL Workbench. These are all freely available, and you can download and install them as a bundle. They run on all popular desktop operating systems. These tools previously had many annoying quirks, but MySQL AB recently made an effort to find and fix bugs in all four of them.

MySQL Query Browser can be used for tasks such as running queries, creating tables and stored procedures, exporting data, and browsing database structures. It has integrated documentation on MySQL's SQL commands and functions. It is most useful for those who develop and query MySQL databases.

MySQL Administrator is focused on server administration and is therefore most useful for DBAs, not developers or analysts. It helps automate tasks such as creating backups, creating users and assigning privileges, and viewing server logs and status information. It includes some basic monitoring functionality, such as graphing status variables, but is not as flexible as the interactive monitoring tools presented later in this chapter. It also doesn't record the statistics for later analysis, which many other monitoring tools are designed to do.

The bundle also includes MySQL Migration Toolkit, which helps migrate databases from other systems to MySQL, and the MySQL Workbench modeling tool.

The benefits of MySQL's own tools are that they're free, they're now quite good quality, and they run on most desktop operating systems. They have a simple feature set that's adequate for many tasks. The standout features are the user management and backup features in MySQL Administrator and the integrated documentation in MySQL Query Browser.

The primary drawback of these tools is that they are somewhat simplistic, without all the bells and whistles power users may come to appreciate and demand. A complete description, including screenshots, is available on MySQL's web site at http://www.mysql.com/products/tools/.


Note

MySQL Workbench was recently rewritten from scratch and is now available in both free and commercial versions. The free version is not feature-crippled, but the commercial version includes some plug-ins that help automate tasks so that they require less manual work. At the time of this writing, the new version of the MySQL Workbench tool is still in beta.

SQLyog

SQLyog is the most popular visual tool for MySQL. It is well designed to support DBA and developer productivity. The full feature list is too large to include here, but here are some highlights:


  • Code autocompletion to help you write queries more quickly
  • The ability to connect over SSH tunnels to remotely hosted servers
  • Visual tools and wizards to help with common tasks like building queries
  • The ability to schedule tasks such as backups, data imports, and data synchronization
  • Keyboard shortcuts
  • Schema comparisons, offering access to properties of objects such as tables and views
  • User management

SQLyog also has all the standard features you'd expect, such as a schema editor. It is available only for Microsoft Windows, in a full-featured edition for a price and in a limited-functionality edition for free. More information about SQLyog is available at http://www.webyog.com.

phpMyAdmin

phpMyAdmin is a popular administration tool that runs on a web server and gives you a browser-based interface to your MySQL servers. It has a lot of nice features for querying and administration. Its main advantages are platform independence, a large feature set, and access through a browser. Browser-based access is nice if you're away from your usual environment and a browser is all you have. For example, you can install phpMyAdmin on hosted servers where you have only FTP access and therefore can't run the mysql client or any other programs for a shell.

phpMyAdmin is certainly a handy tool that can be just what you need for a lot of situations. Be very careful when installing it on systems that are accessible to the Web, however, because if your server isn't secured properly, you could hardly give an attacker a better way in.

phpMyAdmin's detractors say it has too many features and is too large and complex.phpMyAdmin is hosted on SourceForge.net, where it is consistently ranked one of the top projects. More information is available at http://sourceforge.net/projects/phpmyadmin/.

Monitoring Tools

Monitoring MySQL is a topic that almost deserves its own book: it's a large and complicated task, with different applications often having different requirements. However, we can direct you to some of the better tools and resources on the subject.

"Monitoring" is one of those terms people tend to overload with several meanings, assuming others know what they're talking about. However, in our experience, most MySQL shops need to do many different kinds of monitoring.

We focus on tools for noninteractive monitoring and interactive monitoring. Noninteractive monitoring usually involves an automated system that takes measurements and potentially alerts the administrator when some parameter is out of its safe range. Interactive monitoring tools let you watch a server in real time. We present these two categories of tools separately in the following sections.

You might also be interested in other distinctions between tools, such as those that monitor passively (such as innotop) versus the active ones that can send alerts or initiate actions (for example, Nagios); or perhaps you're looking for a tool that creates an information warehouse, rather than one that just displays current statistics. We indicate each tool's qualities as we go.

Noninteractive Monitoring Systems

Many monitoring systems are not designed specifically to monitor the MySQL server. Instead, they are general-purpose systems designed to periodically check the status of many kinds of resources, from machines to routers to software (such as MySQL). They usually have some kind of plug-in architecture and often come with ready-made plug-ins for MySQL. Some such systems can record the status of the systems they monitor and graph it via web interfaces. Many can also send alerts or initiate an action when something they're monitoring fails or exceeds a safe limit.

You generally install such a system on its own server and use it to monitor other servers. If you're using it to monitor important systems, it will quickly become a critical part of your infrastructure, so you may need to take extra steps, such as making the monitoring system itself redundant with failover.

An automated monitoring system that records history and shows trends can be a lifesaver when a MySQL instance slows down under increasing load or experiences other troubles. Fixing problems often requires knowing what has changed, which requires knowing your server's history and thus recording that history. A system that alerts you when something looks awry can warn you before disaster strikes and help focus your troubleshooting efforts if it does.

Homegrown systems

Many organizations start by building their own monitoring and alert systems. This usually works OK when there are few systems to monitor and few people involved. However, when the organization becomes larger and more complex and more members of the system administration staff get involved, homegrown monitoring systems tend to break down. They might flood mailboxes with thousands of email messages every time there's a network outage, or they might fail silently and not alert anyone of a critical problem. Duplicate or redundant notifications are a frequent issue with homegrown systems and can be an obstacle to getting any work done.

If you are considering writing a monitoring tool yourself--even something as simple as a cron job that checks a query and emails someone when there's a problem--you should give this some thought. It's probably a better idea to invest the time and energy into learning one of the systems mentioned in the following sections. Even though some of these systems have a steep learning curve and might not seem worth the initial investment, they will save you time and energy in the long run, and your organization will be better off. Implementing one of them, even if it's done poorly at first, will ultimately prove preferable to implementing your own system. At the very least, in the long run, you will have gained experience and competence in using a standard monitoring system.

Nagios

Nagios (http://www.nagios.org) is an open source monitoring and alerting system that periodically checks services you define and compares the results to default or explicit limits. If the results are outside the limits, Nagios can execute a program and/or alert someone to the trouble. Nagios's contact and alert system lets you escalate alerts to different contacts, change alerts or send them to different places depending on the time of day and other conditions, and honor scheduled downtime. Nagios also understands dependencies between services, so it won't bother you about a MySQL instance being down when it notices the server is unreachable because a router in the middle is down, or when it finds that the host server itself is down.

Nagios can run any executable file as a plug-in, provided it accepts the right arguments and gives the right output. As a result, Nagios plug-ins exist in many languages, including the shell, Perl, Python, Ruby, and other scripting languages. There's even a web site, http://www.nagiosexchange.org, devoted to sharing and categorizing plug-ins. And if you can't find a plug-in that does exactly what you need, it's simple to create your own. A plug-in just needs to accept standard arguments, exit with an appropriate status, and optionally print output for Nagios to capture.

Nagios can monitor just about anything you can measure, on many operating systems, via several methods (including active checks, remotely executed plug-ins, and passive checks that merely accept status data "pushed" from other systems). It has a web interface as well, which you can use to check status, view graphs and visualizations of your network and its status, schedule planned downtime, and much more.

Nagios's major shortcoming is its daunting complexity. Even once you've learned it well, it is hard to maintain. It also keeps its entire configuration in files, which have a special syntax that is easy to get wrong, and they are labor-intensive to modify as your systems grow and evolve. Finally, its graphing, trending, and visualization capabilities are limited. Nagios can store some performance and other data in a MySQL server and generate graphs from it, but not as flexibly as some other systems.

There are several books devoted to Nagios; we like Wolfgang Barth's Nagios System and Network Monitoring (No Starch Press).

Alternatives to Nagios

Although Nagios is the most popular general-purpose monitoring and alerting software, there are several open source alternatives:

Zenoss

Zenoss is written in Python and has a browser-based user interface that uses Ajax to make it faster and more productive. It can autodiscover resources on the network, and it folds monitoring, alerting, trending, graphing, and recording historical data into a unified tool. Zenoss uses SNMP to gather data from remote machines by default but can also use SSH, and it has support for Nagios plug-ins. More information is available at http://www.zenoss.com.

Hyperic HQ

Hyperic HQ is a Java-based monitoring system that is targeted more toward so-called enterprise monitoring than most of the other systems in its class. Like Zenoss, it can autodiscover resources and supports Nagios plug-ins, but its logical organization and architecture are different, and it is a little "bulkier." Whether this suits you will depend on your preferences and what you are trying to monitor. More information can be found at http://www.hyperic.com.

OpenNMS

OpenNMS is written in Java and has an active developer community. It has the usual features, such as monitoring and alerting, but adds graphing and trending capabilities as well. Its goals are high performance and scalability, automation, and flexibility. Like Hyperic, it is intended for enterprise monitoring of large, critical systems. For more information, see http://www.opennms.org.

Groundwork Open Source

Groundwork Open Source is actually based on Nagios, and it combines Nagios and several other tools into one system with a portal interface. Perhaps the best way to describe it is as the system you might build in-house if you were an expert in Nagios, Cacti, and a host of other tools and had a lot of time to integrate them seamlessly together. See http://www.groundworkopensource.com for more information.

Zabbix

Zabbix is an open source monitoring system similar in many respects to Nagios but with some key differences. For example, it stores all configuration and other data in a database, not in configuration files. It also stores more types of data than Nagios and can thus generate better trending and history reports. Its network graphing and visualization capabilities are superior to Nagios's, and many people find it easier to configure and more flexible. It is also said to stand up to much heavier loads than Nagios. On the other hand, Zabbix has a more limited community than Nagios, and its alerting capabilities aren't as advanced. See http://www.zabbix.com for more information.

MySQL Monitoring and Advisory Service

MySQL's own monitoring solution is designed specifically to monitor MySQL instances, and it can monitor some key aspects of the host machine as well. It is not open source, and it requires a MySQL Enterprise subscription.

A major advantage of this service over Nagios is that it offers a prebuilt set of rules, or "advisors," that examine many aspects of server performance, status, and configuration. It can also suggest solutions to the problems it notices, instead of just letting the system administrator figure out what's wrong. It has a well-designed dashboard that shows status information for all your servers at once.

Though it would be possible to use Nagios or another system to monitor the same statistics, it would be a fair amount of work to write the necessary plug-ins and configure Nagios to monitor each of the scores of metrics the MySQL Monitoring and Advisory Service provides out of the box.

The disadvantage of this product is that you can't monitor the rest of your network with it; it is designed for monitoring only MySQL. It also requires an agent to be installed on each system it monitors. This is distasteful to some MySQL administrators, who like to keep their servers trimmed down to the bare essentials.

More information is available at http://www.mysql.com/products/enterprise/advisors.html.

MONyog

MONyog (http://www.webyog.com) is a lightweight, agentless monitoring system that takes a different approach from the tools previously mentioned. It is designed to run on a desktop system, where it starts an HTTP listener on an unused port. You can point your browser at this port to see information on your MySQL servers, rendered by a combination of JavaScript and Flash. The underlying implementation uses a JavaScript engine, and all configuration is done via a JavaScript object model.

MONyog is actually both interactive and noninteractive, so you might want to examine its capabilities for both kinds of monitoring.

RRDTool-based systems

Although it's not strictly a monitoring system, RRDTool (http://www.rrdtool.org) is important enough to mention here. Many organizations use some kind of script or program--often homemade--to extract information from servers and save them in round-robin database (RRD) files. RRD files are an elegant solution for many situations that require recording and graphing data. They automatically aggregate incoming data, interpolate missing values in case the incoming values are not delivered when expected, and have powerful graphing tools that generate beautiful, distinctive graphs. Several RRDTool-based systems are available.

The Multi Router Traffic Grapher, or MRTG (http://oss.oetiker.ch/mrtg/), is the quintessential RRDTool-based system. It is really designed for recording network traffic, but it can be extended to record and graph other things as well.

Munin (http://munin.projects.linpro.no) is a system that gathers data for you, puts it into RRDTool, and then generates graphs of the data at several levels of granularity. It creates static HTML files from the configuration, so you can browse them and view trends easily. It is easy to define a graph; you just create a plug-in script whose command-line help output has some special syntaxes Munin recognizes as graphing instructions. Munin's disadvantages include the requirement to load an agent on each system it monitors, and simplified one-size-fits-all configuration and graphing options that might not be flexible enough for some needs.

Cacti (http://www.cacti.net) is another popular graphing and trending system. It works by fetching data from systems, storing it in RRD files, then graphing the data with RRDTool via a PHP web interface, which is also the configuration and management interface (configuration data is stored in a MySQL server). It is template-driven, so you can define templates and then apply them to your systems. It can fetch data from SNMP or custom scripts.

Cricket (http://cricket.sourceforge.net) is a Cacti-like system written in Perl, but with a file-based configuration system. Ganglia (http://ganglia.sourceforge.net) is also similar to Cacti, but it's designed to monitor clusters and grids of systems, so you can view data from many servers in aggregate and drill down to the individual servers if you wish. (Cacti and Cricket can't show aggregated data.)

These systems can all be used to gather, record, and graph data and report on MySQL systems, with various degrees of flexibility and for slightly different purposes. They all lack a really flexible means of alerting someone when something is wrong, and some of them don't even have a concept of "wrong." Some people view this as an advantage, feeling it is better to separate the jobs of recording, graphing, and alerting; in fact, Munin is specifically designed to use Nagios as the alerting system. However, for others it's a drawback. Another disadvantage is the time and effort you may need to invest to install and configure a system that almost meets your needs, but not quite.

Finally, you should consider your future needs. RRD files don't let you query the data by SQL or other standard means, and nor do they store data at a fine granularity forever by default. Many MySQL administrators are unwilling to accept these limitations and opt to store historical data in a relational database instead. A lot of DBAs also want more customized and flexible ways to record data, so they end up writing their own systems or tweaking an existing one.

Whether RRDTool-based systems are a good match for your organization will be a matter of personal choice, availability of the skills needed to administer the system, and your organization's requirements.

Interactive Tools

Interactive tools are those you can start on demand and use to get a continually updating view of what's happening in your server. We concentrate on innotop (http://innotop.sourceforge.net), but there are several others, such as mtop (http://mtop.sourceforge.net), mytop (http://jeremy.zawodny.com/mysql/mytop/), and some web-based clones of mytop.

innotop

Baron Schwartz, one of this book's authors, wrote innotop. Despite its name, it is not limited to monitoring InnoDB internals. This tool was inspired by mytop but offers much more functionality. It has many modes to monitor all kinds of MySQL internals, including all the information available in SHOW INNODB STATUS, which it parses into its component parts. It lets you monitor multiple MySQL instances simultaneously, and it is very configurable and extensible.

Some of its features include:


  • A transaction list that displays current InnoDB transactions
  • A query list that shows currently running queries
  • A list of current locks and lock waits
  • Summaries of server status and variables to show the relative magnitudes of values
  • Modes to display information about InnoDB internals, such as its buffers, deadlocks, foreign key errors, I/O activity, row operations, semaphores, and more
  • Replication monitoring, with master and slave statuses displayed together
  • A mode to view arbitrary server variables
  • Server grouping to help you organize many servers easily
  • Noninteractive mode for use in command-line scripting

It's easy to install innotop. You can either install it from your operating system's package repository or download it from http://innotop.sourceforge.net, unpack it, and run the standard make install routine:

perl Makefile.PL
make install

Once you've installed it, execute innotop at the command line, and it will walk you through the process of connecting to a MySQL instance. It can read your ~/.my.cnf option files, so you may not need to do anything but type your server's hostname and press Enter a few times. Once connected, you'll be in T (InnoDB Transaction) mode, and you should see a list of InnoDB transactions, as shown in Figure 14-1.

Figure 14-1. innotop in T (Transaction) mode

14-1-Innotop.jpg

By default, innotop applies filters to reduce the clutter (as with everything in innotop, you can define your own or customize the built-in filters). In Figure 14-1, most of the transactions have been filtered out to show only active transactions. You can press the i key to disable the filter and fill the screen with as many transactions as will fit.

innotop displays a header and a main thread list in this mode. The header shows some overall InnoDB information, such as the length of the history list, the number of unpurged InnoDB transactions, the percentage of dirty buffers in the buffer pool, and so forth.

The first key you should press is the question mark (?), to see the help screen. This screen's contents will vary depending on what mode innotop is in, but it always displays every active key, so you can see all possible actions. Figure 14-2 shows the help screen in T mode.

We won't go through all of its other modes, but as you can see from the help screen, innotop has a lot of features.

The only other thing we cover here is some basic customization to show you how to monitor whatever you please. One of innotop's strengths is its ability to interpret user-defined expressions, such as Uptime/Questions to derive a queries-per-second metric. It can display the result since the server was started and/or incrementally since the last sample.

This makes it easy to add your own columns to its tabular displays. For example, the Q (Query List) mode has a header that shows some overall server information. Let's see how to modify it to monitor how full the key cache is. Start innotop and press Q to enter Q mode. The result will look like Figure 14-3.

The screenshot is truncated because we're not interested in the query list for this exercise; we care only about the header.

Figure 14-2. innotop help screen

14-2-innotop.jpg

Figure 14-3. innotop in Q (Query List) mode

14-3-innotop.jpg

The header shows statistics for "Now" (which measures incremental activity since the last time innotop refreshed itself with new data from the server) and "Total" (which measures all activity since the MySQL server started 25 days ago). Each column in the header is derived from an equation involving values from SHOW STATUS and SHOW VARIABLES. The default headers shown in Figure 14-3 are built-in, but it's easy to add your own. All you have to do is add a column to the header "table." Press the ^ key to start the table editor, then enter q_header at the prompt to edit the header table (Figure 14-4). Tab completion is built-in, so you can just press q and then Tab to complete the word.

Figure 14-4. Adding a header (start)

14-4-innotop.jpg

After this, you'll see the table definition for the Q mode header (Figure 14-5). The table definition shows the table's columns. The first column is selected. We could move the selection around, reorder and edit the columns, and do several other things (press ? to see a full list), but we're just going to create a new column. Press the n key and type the column name (Figure 14-6).

Figure 14-5. Adding a header (choices)

14-5-innotop.jpg

Figure 14-6. Adding a header (naming column)

14-6-innotop.jpg

Next, type the column's header, which will appear at the top of the column (Figure 14-7). Finally, choose the column's source. This is an expression that innotop compiles into a function internally. You can use names from SHOW VARIABLES and SHOW STATUS as though they're variables in an equation. We use some parentheses and Perl-ish "or" defaults to prevent division by zero, but otherwise this equation is pretty straightforward. We also use an innotop transformation called percent() to format the resulting column as a percentage; check the innotop documentation for more on that. Figure 14-8 shows the expression.

Figure 14-7. Adding a header (text for column)

14-7-innotop.jpg

Figure 14-8. Adding a header (expression to calculate)

14-8-innotop.jpg

Press Enter, and you'll see the table definition just as before, but with the new column added at the bottom. Press the + key a few times to move it up the list, next to the key_buffer_hit column, and then press q to exit the table editor. Voilà: your new column, nestled between KCacheHit and BpsIn (Figure 14-9). It's easy to customize innotop to monitor what you want. You can even write plug-ins if it really can't do what you need. There's more documentation at http://innotop.sourceforge.net.

Figure 14-9. Adding a header (result)

14-9-innotop.jpg

Analysis Tools

Analysis tools help you automate the tedious job of inspecting servers and looking for areas that might benefit from optimization or tuning. These tools can be a great way to get started tackling performance problems. If one of them raises an obvious issue, you can focus your efforts there and perhaps solve the problem more quickly.

HackMySQL Tools

Daniel Nichter maintains a web site called HackMySQL, where he hosts some useful MySQL tools. mysqlreport is a Perl script that examines the server's SHOW STATUS output, transforms it into an easy-to-read report, and prints it out. You can read this report much more quickly than you can examine SHOW STATUS, and it is quite thorough.

Here's an overview of the major parts of the report, as of version 3.23:


  • The "Key" section shows how your keys (indexes) are being used. If these values are not healthy, you probably need to tune your key cache settings.
  • The "Questions" section shows what kinds of queries your server is executing to give you an idea of where the load is concentrated.
  • The "SELECT and Sort" section shows what kinds of query plans and sort strategies your server runs most often. This section can show problems with indexing or poorly optimized queries.
  • The "Query Cache" section shows how well your query cache is performing. If it is not performing well, you may need to tune the settings or, if your workload doesn't benefit from caching, even disable the cache.
  • Several sections show information about tables, locks, connections, and network traffic. Problems here usually indicate a poorly tuned server.
  • Three sections show InnoDB performance metrics and settings. Problems here might indicate bad server settings, hardware problems, or query or schema optimization issues.

More information is available at http://hackmysql.com/mysqlreport, including a detailed tutorial on how to interpret reports. It's worth taking the time to learn how to read the reports, especially if you frequently troubleshoot unfamiliar servers. With practice, you can scan a report and immediately pick out problems.

mysqlsla (the MySQL Statement Log Analyzer) is another useful tool. You can use it to analyze the general log of all queries executed on the server, the slow query log (that is, queries that needed more than the configured maximum time to execute), or any other log. It accepts several log formats and can analyze many logs at once. See "Finer control over logging" on Section 2.6.2.2 for more on analyzing MySQL's log files.

Other programs at the site can help you analyze a server's index usage and examine MySQL-related network traffic.

This is an excerpt from Chapter 14, Tools for High Performance. If you enjoyed this excerpt and want to learn more, get a copy of High Performance MySQL.


You might also be interested in:

 

Popular Topics

Browse Books

Archives

Or, visit our complete archives.

FYI Topics

Recommended for You

Got a Question?