mysqlpdump

Description

MySQL Parallel Dump

Multi-threaded mysqldump is no longer a utopia. mysqlpdump can dump all your tables and databases in parallel, so it can be much faster on systems with multiple CPUs.

By default it stores each table in a separate file. It can also write the dump to stdout, although this is not recommended because it can use all the memory in your system if your tables are big.
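To illustrate the idea of one file per table, dumped by a pool of threads, here is a minimal sketch in modern Python 3. It is not the code of mysqlpdump itself: the helper names (build_command, run_mysqldump, dump_tables), the file name pattern and the injectable runner are assumptions made for illustration. It relies only on mysqldump accepting a database name followed by a table name; connection options would have to be passed through extra_args.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(database, table, extra_args=()):
    """Return the mysqldump argument list for a single table."""
    return ["mysqldump", *extra_args, database, table]


def run_mysqldump(command, path):
    """Stream mysqldump output straight to a file, so memory use stays low."""
    with open(path, "wb") as out:
        subprocess.run(command, stdout=out, check=True)


def dump_tables(tables, destination=".", threads=4, runner=run_mysqldump):
    """Dump every (database, table) pair to its own file using a thread pool.

    The file name pattern "<database>.<table>.sql" is hypothetical.
    """
    def dump_one(pair):
        database, table = pair
        path = os.path.join(destination, "%s.%s.sql" % (database, table))
        runner(build_command(database, table), path)
        return path

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() keeps the input order and re-raises any worker exception.
        return list(pool.map(dump_one, tables))
```

Threads are enough here because the heavy work happens in the mysqldump child processes. Writing each dump directly to its own file is also what avoids the memory problem of collecting everything for stdout.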

History

I saw an interesting post on MySQL Performance Blog with some suggestions to improve mysqldump.

Here is my effort to implement some of those suggestions.

Download

mysqlpdump on GitHub

Requirements

Python 2.4
MySQL-python module

Usage

Simplest usage (will save a file for each table):

mysqlpdump.py -u root -p password

Save compressed files (gzip) to /tmp/dumps and pass “--skip-opt” to mysqldump:

mysqlpdump.py -u root -p password -d /tmp/dumps/ -g -P "--skip-opt"

Output to stdout and use 20 threads:

mysqlpdump.py -u root -p password --stdout -t 20

Be more “verbose”:

mysqlpdump.py -u root -p password -v

Exclude the “mysql” and “test” databases from the dump:

mysqlpdump.py -u root -p password -e mysql -e test

Only dump the “mysql” database:

mysqlpdump.py -u root -p password -i mysql
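The -e and -i options can be thought of as a filter over the list of databases on the server. The following is a hedged sketch of that logic in Python 3, not the script's actual implementation: the function name select_databases and the precedence rule (include list first, then exclusions) are assumptions.

```python
def select_databases(available, include=(), exclude=()):
    """Return the databases to dump, preserving the server's order.

    Assumed semantics: a non-empty include list restricts the dump to
    those names; names in the exclude list are then dropped.
    """
    selected = [db for db in available if not include or db in include]
    return [db for db in selected if db not in exclude]


# Mirrors "-e mysql -e test" and "-i mysql" from the examples above.
print(select_databases(["mysql", "test", "shop"], exclude=["mysql", "test"]))
print(select_databases(["mysql", "test", "shop"], include=["mysql"]))
```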

Changelog

0.5
- Compress 00_master_data.sql file if specified
- bugfix: when called without a terminal or a logged-in user, it falls back to the user “nobody”.
- bugfix: destination now works with 00_master_data.sql
0.4
- Made it compatible with python 2.4
- Can include and exclude specified databases.
0.3
- Fixed a bug that prevented the tables from being dumped because of a lock
- Added --master-data option to write the “CHANGE MASTER TO” statement
0.2
- Store dumps directly in files instead of writing them to stdout
- Can compress files
- Dump each table in its own file
- Can pass parameters directly to mysqldump
0.1
- First version
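
The 0.5 bugfix for running without a terminal or logged-in user can be sketched as follows. This is an assumption about the shape of the fix, written in Python 3, not the script's actual code: the function name default_user is hypothetical, and only the os.getlogin() call and the “nobody” fallback come from this page.

```python
import os


def default_user(fallback="nobody"):
    """Return the login name, or a fallback when none can be determined."""
    try:
        return os.getlogin()
    except OSError:
        # os.getlogin() needs a controlling terminal; it fails under cron.
        return fallback
```

The result could then serve as the default value of the -u option instead of calling os.getlogin() directly.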

License

mysqlpdump is released under the GNU GPL license.

16 thoughts on “mysqlpdump”

di4blo says:

24/05/2007 at 4:55 pm

Interesting, but I'd recommend putting the license on this website; that's the usual thing, isn't it?
Devis says:

05/06/2007 at 1:06 pm

I suppose we need Python 2.5 installed, don’t we?
fr3nd says:

05/06/2007 at 1:30 pm

Devis: That’s true. Python 2.5 is necessary because I’m using some functions in the Queue module that are only available in 2.5.
Devis says:

05/06/2007 at 1:56 pm

Hi fr3nd, it’s a pity because Debian will release 2.5 in 2070… So for the moment I cannot use this script; I hope I can in the future… Do you think it is possible to remove the 2.5-only calls, like queue.join?
Anyway, thank you!
fr3nd says:

05/06/2007 at 2:00 pm

Devis, I’ll try it… I just need to find some safe way to substitute .join()

Let me check it.
fr3nd says:

05/06/2007 at 3:32 pm

Devis: I made it compatible with python 2.4. It was easier than I thought actually. I’ve also added two more options to specify what databases to dump and what databases not to dump. See usage for examples.
Devis says:

05/06/2007 at 3:51 pm

Wonderful, it works! 10M Kudos to you!
Surely I will follow the script evolution and if I can help just write me :-)
Thank you!
Devis says:

06/06/2007 at 12:37 pm

Hi fr3nd, I’ve put mysqlpdump in a cron job but I am having a small problem:

-------
Traceback (most recent call last):
  File "mysqlpdump.py", line 241, in ?
    main()
  File "mysqlpdump.py", line 187, in main
    parser.add_option("-u", "--user", action="store", dest="user", type="string", default=os.getlogin(), help="User for login.")
OSError: [Errno 25] Inappropriate ioctl for device
-------

It is caused by “os.getlogin()” in combination with cron, as there isn’t any terminal or logged user.
For the moment I’ve substituted this with ‘root’ and it works for me.
fr3nd says:

07/06/2007 at 9:26 am

Devis: version 0.5 fixes that bug. Thanks for the bug report!
danielj says:

13/09/2007 at 7:09 am

This sounds very interesting and promising; but I think it will not be possible to create a consistent dump that way? Or how should locking be handled?
Devis says:

13/09/2007 at 7:19 am

Hi danielj, to create a consistent dump you need to lock *all* tables before the dump.
Peter Zaitsev says:

13/09/2007 at 9:52 am

Thank you !

Too bad MySQL does not support several connections sharing the same transactional context, so the only way you can get a consistent backup is by setting low-priority updates and locking all the tables. As you have multiple threads, you may have no luck locking all of them otherwise.
Monty Taylor says:

13/09/2007 at 6:56 pm

Thanks for the tool! I love it. I used it and blogged about it yesterday, and then today I wrote a restore tool for it (http://mysql-ha.com/2007/09/13/mysqlprestore-for-parallel-restores/)
Please feel free to grab the file and incorporate it however you like. I tried to do it in such a way that the two could be merged pretty well, I think.
fr3nd says:

14/09/2007 at 8:24 am

This tool automatically locks all tables before starting the dump, so there is no need to lock them manually. It’ll create a consistent dump.

About the restore tool, I don’t think it’s needed… Restoring is as easy as doing:

cat *.sql | mysql
dani says:

19/11/2008 at 11:37 pm

You’re referenced in the second edition of “High Performance MySQL”, but seeing who commented above, I’m not surprised.
fr3nd says:

20/11/2008 at 9:35 am

Great! I had read some chapters of the book, but not all of it… I guess I’ll have to give it a more thorough read :)
Thanks for letting me know.