Description
MySQL Parallel Dump
Multi-threaded mysqldump is no longer a utopia. mysqlpdump can dump all your tables and databases in parallel, so it can be much faster on systems with multiple CPUs.
It stores each table in a different file by default. It can also write the dump to stdout, although this is not recommended because it can use up all the memory in your system if your tables are big.
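The technique is easy to sketch: spawn one mysqldump process per table from separate threads, each writing its own file. Here is a minimal illustration of the idea only; the names and structure are mine, not the actual mysqlpdump source:

import subprocess
import threading

def dump_table(user, password, db, table):
    # One mysqldump process per table, each writing its own file.
    # ("-p" + password gives mysqldump's -pPASSWORD form.)
    out = open("%s.%s.sql" % (db, table), "w")
    subprocess.call(["mysqldump", "-u", user, "-p" + password, db, table],
                    stdout=out)
    out.close()

tables = [("mydb", "users"), ("mydb", "orders"), ("mydb", "logs")]
threads = [threading.Thread(target=dump_table,
                            args=("root", "password", db, tbl))
           for (db, tbl) in tables]
for t in threads:
    t.start()
for t in threads:
    t.join()

The real script adds a configurable thread pool, locking, compression, and the other options described below.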
History
I saw an interesting post on the MySQL Performance Blog with some suggestions for improving mysqldump.
Here is my effort to implement some of those suggestions.
Download
Requirements
- Python 2.4
- MySQL-python module
Usage
Simplest usage (will save a file for each table):
mysqlpdump.py -u root -p password
Save compressed files (gzip) to /tmp/dumps and pass "--skip-opt" to mysqldump:
mysqlpdump.py -u root -p password -d /tmp/dumps/ -g -P "--skip-opt"
Output to stdout and use 20 threads:
mysqlpdump.py -u root -p password -stdout -t 20
Be more “verbose”:
mysqlpdump.py -u root -p password -v
Exclude the “mysql” and “test” databases from the dump:
mysqlpdump.py -u root -p password -e mysql -e test
Only dump the “mysql” database:
mysqlpdump.py -u root -p password -i mysql
Links
- mysqlpdump at freshmeat
- Original article on the MySQL Performance Blog
- mysql-parallel-dump (a similar script from the author of MySQL Toolkit)
Changelog
- 0.5
- Compress the 00_master_data.sql file if compression is specified
- Bugfix: when called without a terminal or a logged-in user, it uses “nobody”.
- Bugfix: the destination directory now works with 00_master_data.sql
- 0.4
- Made it compatible with Python 2.4
- Can include and exclude specified databases.
- 0.3
- Fixed a bug that prevented tables from being dumped because of a lock
- Added the --master-data option to write the “CHANGE MASTER TO” statement
- 0.2
- Store dumps directly to files instead of to stdout
- Can compress files
- Dump each table in its own file
- Can pass parameters directly to mysqldump
- 0.1
- First version
License
mysqlpdump is released under the GNU GPL license.
Interesting, but I’d recommend that you put the license on this website, that’s the usual thing, isn’t it?
I suppose we need Python 2.5 installed, don’t we?
Devis: That’s true. Python 2.5 is necessary because I’m using some functions in the Queue module that are only available in 2.5.
Hi fr3nd, it’s a pity because Debian will release 2.5 in 2070… So for the moment I cannot use this script; hope for the future… Do you think it is possible to remove the 2.5-only calls, like queue.join()?
Anyway, thank you!
Devis, I’ll try it… I just need to find a safe way to substitute .join().
Let me check.
Devis: I made it compatible with Python 2.4. It was easier than I thought, actually. I’ve also added two more options to specify which databases to dump and which to skip. See usage for examples.
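For anyone curious what changed: the 2.5-only calls were Queue.task_done() and Queue.join(). A common 2.4-safe substitute is to push one sentinel per worker and join the threads themselves. A stripped-down sketch of that pattern, not necessarily the exact code in the script:

import threading
import Queue

def worker(queue):
    while True:
        item = queue.get()
        if item is None:           # sentinel: this worker is done
            return
        print "dumping", item      # stand-in for the real per-table work

queue = Queue.Queue()
workers = [threading.Thread(target=worker, args=(queue,)) for _ in range(4)]
for w in workers:
    w.start()
for name in ["db1.users", "db1.orders", "db2.logs"]:
    queue.put(name)
for w in workers:
    queue.put(None)                # one sentinel per worker...
for w in workers:
    w.join()                       # ...then Thread.join() replaces Queue.join()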
Wonderful, it works! 10M Kudos to you!
I’ll surely follow the script’s evolution, and if I can help, just write me :-)
Thank you!
Hi fr3nd, I’ve put mysqlpdump in a cron job but I’m having a little problem:
-------
Traceback (most recent call last):
  File "mysqlpdump.py", line 241, in ?
    main()
  File "mysqlpdump.py", line 187, in main
    parser.add_option("-u", "--user", action="store", dest="user", type="string", default=os.getlogin(), help="User for login.")
OSError: [Errno 25] Inappropriate ioctl for device
-------
It is caused by “os.getlogin()” in combination with cron, as there isn’t any terminal or logged user.
For the moment I’ve substituted this with ‘root’ and it works for me.
Devis: version 0.5 fixes that bug. Thanks for the bug report!
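For the record, os.getlogin() asks the controlling terminal for the login name, and cron jobs have none, hence the OSError. The fix boils down to a fallback along these lines (a sketch; the actual 0.5 code may differ):

import os

def default_user():
    try:
        # os.getlogin() needs a controlling terminal, which cron lacks.
        return os.getlogin()
    except OSError:
        return "nobody"  # the fallback mentioned in the 0.5 changelog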
This sounds very interesting and promising, but I think it will not be possible to create a consistent dump that way? Or how should locking be handled?
Hi danielj, to create a consistent dump you need to lock *all* tables before the dump.
Thank you !
Too bad MySQL does not have support for several connections sharing the same transactional context, so the only way you can get a consistent backup is setting low-priority updates and locking all the tables. As you have multiple threads, you may have no luck locking all of them otherwise.
Thanks for the tool! I love it. I used it and blogged about it yesterday, and then today I wrote a restore tool for it (http://mysql-ha.com/2007/09/13/mysqlprestore-for-parallel-restores/)
Please feel free to grab the file and incorporate it however you like. I tried to do it in such a way that the two could be merged pretty well, I think.
This tool automatically locks all tables before starting the dump, so there is no need to lock them manually. It’ll create a consistent dump.
About the restore tool, I don’t think it’s needed… Restoring is as easy as doing:
cat *.sql | mysql
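For the curious, the locking recipe described above can be sketched with the MySQL-python module: hold a global read lock on one connection while the dump threads run, then release it. This illustrates the idea only; the actual script may lock tables differently:

import MySQLdb

def dump_consistently(run_dumps, user="root", passwd="password"):
    conn = MySQLdb.connect(host="localhost", user=user, passwd=passwd)
    cur = conn.cursor()
    # Block all writers for as long as this connection holds the lock,
    # so every parallel dump thread sees the same point in time.
    cur.execute("FLUSH TABLES WITH READ LOCK")
    try:
        run_dumps()  # e.g. the parallel dump sketch from the Description
    finally:
        cur.execute("UNLOCK TABLES")
        conn.close()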
You get a mention in the second edition of “High Performance MySQL”, but seeing who comments above, I’m not surprised.
How cool! I had read some chapters of the book, but not all of it… I guess I’ll have to give it a more thorough read :)
Thanks for the heads-up!