Monday, November 30, 2009
Maakit tool options description
Maatkit to prove replication is working correctly, fix corrupted data, automate repetitive tasks, speed up your servers, and much, much more.
1) mk-archiver : Archive rows from a MySQL table into another table or a file.
- Can move the old data into other table or a file format 'load data infile'.
Adv :( used in Dataware housing concept ) - low impact on OLTP.
2) mk-audit : Analyze, summarize and report on MySQL config, schema and operation.
- OS version and flavor, and information on CPU, memory and disks
- It reports some information of Mysqld server's ( instance wise) and the data in it.
3) mk-deadlock-logger : Extract and log MySQL deadlock information.
- Prints the deadlock info currently only innodb or can store in a table also.
4) mk-duplicate-key-checker : Find duplicate indexes and foreign keys on MySQL tables.
- Examines SHOW CREATE TABLE o/p and display duplicates.
5) mk-fifo-split : Split files and pipe lines to a fifo without really splitting.
- Read hugefile.txt in chunks of a million lines without physically splitting it.
6) mk-find : Find MySQL tables and execute actions, like GNU find.
- Uses SHOW TABLES and SHOW TABLE STATUS and display the info like
table engines, data size, index size.
7) mk-heartbeat : Monitor MySQL replication delay.
- Creates a table does the update and checks the update timestamp in the slave and reports the delay time.
8) mk-kill : Kill MySQL queries that match certain criteria.
- Uses SHOW PROCESSLIST. Also called as “Slow Query Snipper”. Criteria is like time, state. It has may options so Be careful, it kill replication threads also.
We can specify certain queries also to kill like “ select * from tablename.”
9) mk-loadavg : Watch MySQL load and take action when it gets too high.
-Watches including MySQL status values from SHOW STATUS, SHOW INNODB STATUS and SHOW SLAVE STATUS, the three system load averages from uptime, and values from VMSTAT executes our given command when it resches threshold.
10) mk-log-player : Split and play MySQL slow logs.
- Split the slow logs into sessions (for parallel execution) and does the stress-test and load-test the server.
11) mk-parallel-dump : Dump MySQL tables in parallel.
- Dumps big table in chunks as specified size. Only tables and data are dumped; view definitions or any kind of stored code (triggers, events, routines, procedures, etc.) are not dumped.
mk-parallel-restore Load files into MySQL in parallel.
- Restore the dump taken by mk-parallel-dump in parallel.
12) mk-query-profiler : Execute SQL statements and print statistics, or measure activity caused by other processes.
- It reads a file containing one or more SQL statements or shell commands, executes them, and analyzes the output of SHOW STATUS afterwards. It then prints statistics about how the batch performed. For example, it can show how many table scans the batch caused, how many page reads, how many temporary tables.
13) mk-profile-compact : Compact the output from mk-query-profiler.
- It slices and aligns the output from mk-query-profiler so you can compare profile results side by side easily
14) mk-query-digest : Parses logs and more. Analyze, transform, filter, review and report on queries.
- Analyses queries from Slow log and PROCESSLIST and gives a report.
15) mk-show-grants : Canonicalize and print MySQL grants so you can effectively replicate, compare and version-control them.
- With this o/p, we can pipe the grants from one server to another server.
16) mk-slave-delay : Make a MySQL slave server lag behind its master.
- It bases the delay on binlog positions in the slave's relay logs by default, so there is no need to connect to the master.
17) mk-slave-find : Find and print replication hierarchy tree of MySQL slaves.
18) mk-slave-move : Move a MySQL slave around in the replication hierarchy.
- Detach and Make the server a slave of its grandparent, so it is a sibling of its master.
19) mk-slave-prefetch : Pipeline relay logs on a MySQL slave to pre-warm caches.
- reads the slave's relay log slightly ahead of where the slave's SQL thread is reading, converts statements into SELECT, and executes them. Keeps the data in the unmodified data in the cache.
20) mk-slave-restart : Watch and restart MySQL replication after errors.
- We can specify errors to skip and run the slaves until a certain binlog position.
21) mk-table-checksum : Perform an online replication consistency check, or checksum MySQL tables efficiently on one or many servers.
- It examines table structure only on the first host specified, so if anything differs on the others, it won't notice. It ignores views.
22) mk-checksum-filter : Filter checksums from mk-table-checksum.
- Filters the o/p of the mk-table-checksum of two files and display the different checksums or count.
23) mk-table-sync : Synchronize MySQL tables efficiently.
- It does not synchronize table structures, indexes, or any other schema changes. It synchronizes only data.
24) mk-upgrade Execute : SQL statements against two MySQL servers and compare the results.
- Useful before an upgrade or configuration change.
25) mk-visual-explain : Format EXPLAIN output as a tree. - In tree format, helpful when we have more joins.
Friday, November 27, 2009
Tuesday, November 24, 2009
MySQL Cluster Query execution process
MySQL Cluster executes the queries depending on the below type of scan’s it does
- Primary key lookup
- Unique key lookup
- Ordered index scan (i.e., non-unique indexes that use T-trees)
- Full table scan
Let’s say you have 4 data nodes in your cluster (NoOfReplicas=2). This means you have 2 node groups and each one has half the data. Cluster uses a hash on the primary key (unless you’ve controlled the partitioning using the 5.1 partitioning features). So for any table, half the rows are in one node group and half the rows are the in other node group.
Now for the 4 types of query execution. You can verify which type of execution is used with EXPLAIN. Here’s how each ones works:
- Primary key lookup - the MySQL server can calculate the hash on the primary key and know which data node the relevant row resides in. Then the MySQL server contacts that data node and receives the row.
- Unique key lookup - the MySQL server cannot know which data node the row might be stored in. So it contacts a pseudo-random node. That data node has a hidden table that maps the unique key values to the primary key. Then the hash on the primary key reveals where the row resides.
- Ordered index scans are sent in parallel to all data nodes, where they search their local t-tree.
- Full table scans are send in parallel to all data nodes, where they scan their primary fragment.
Summary: primary key lookups are best. If you have more than 2 nodes, throughput goes up because all nodes are actively serving different fragments of data. Ordered index lookup and full table scans are done in parallel, so more nodes leads to better performance.