August 28, 2011 in Classic self-hosted technical help
After ten years in operation we've amassed a fairly large database (3 GB, about five minutes to dump the whole thing) as well as a massive number of images that span every version of IP.Board. That means they are stored in about ten different ways depending on the IP.Board version and on whether they are attachments, Gallery images, profile photos, avatars, and so on.
Since the vast majority of the site is more-or-less static (an image uploaded ten years ago is unlikely to change...) and since we pay for bandwidth to our offsite storage, I'd like to develop an incremental backup strategy. This is especially important for the uploads/ directory, which is massive, but ideally I'd love a way to do a backup of the database that doesn't require me to dump all three gigs every time, since the VAST majority of those posts are not changing.
Has anyone successfully developed an incremental or semi-incremental backup strategy for IP.Board?
I use DeltaCopy, which is an rsync port for Windows, to back up the webroot folder to my backup servers. I also use it to back up the database. The task runs once every night.
After the initial transfer only changed files are updated. It works well. For onsite backups I use a batch file that dumps the db, zips it and moves it to one of the slaves.
Unfortunately I can't do something as simple as an rsync: I need to ultimately end up with a single delta (a "patch" file if you will) that I can upload to Amazon's S3 system.
One possible solution is to run something like
tar -czf `date +%F`.tar.gz `find ./ -type f -mmin -2885`
in a cron job every day. Then once a week you take a weekly archive and delete the old dailies, once a month you take a monthly archive and delete the weeklies, and so on. That would work (I think), but I suspect there is some failure mode I'm missing.
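One weakness of a fixed -mmin window is that a skipped or late cron run silently drops changes. A variation on the same idea is to compare against a marker file that is touched on each successful run. The following is only a minimal sketch of that variation, assuming GNU find and tar. The function names, the .last-backup marker name, and the daily-* naming pattern are all made up for illustration, and the output directory is assumed to live outside the tree being backed up.

```sh
# Hypothetical helper: archive every file under SRC that changed since the
# last successful run, producing one .tar.gz "delta" per call.
# Assumes OUT_DIR is NOT inside SRC (otherwise old archives get re-archived).
incremental_tar() {
  local src="$1" out_dir="$2"
  local stamp="$out_dir/.last-backup"         # marker file (name is an assumption)
  local new_stamp="$out_dir/.last-backup.new"
  local archive="$out_dir/$(date +%F).tar.gz"

  mkdir -p "$out_dir" || return 1
  # Record the start time first, so files changed during the scan
  # are picked up again by the next run rather than lost.
  touch "$new_stamp" || return 1

  # First run (no marker yet): take everything. Later runs: only newer files.
  if [ -f "$stamp" ]; then
    find "$src" -type f -newer "$stamp" -print0
  else
    find "$src" -type f -print0
  fi | tar -czf "$archive" --null -T - || return 1

  # Advance the marker only after tar succeeded.
  mv "$new_stamp" "$stamp" || return 1
  echo "$archive"
}

# Hypothetical helper: delete archives in DIR matching PATTERN whose
# modification time is more than DAYS days in the past.
prune_archives() {
  local dir="$1" pattern="$2" days="$3"
  find "$dir" -maxdepth 1 -type f -name "$pattern" -mtime +"$days" -exec rm -f {} +
}
```

A daily cron job could then call something like incremental_tar /path/to/uploads /path/to/backups and upload the file name it prints, with prune_archives handling the "delete the old dailies" step. Note that this still has the failing common to all mtime-based schemes: deleted or renamed files are not recorded in the delta.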
For the database, I'm wondering if there's a query that can be constructed against the posts, topics, messages, etc. tables to dump out only the new posts, changed topics, and so on since some given date. Has anyone tried this? What tables did you get it to work for? Obviously not every table maintains a timestamp column.
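For tables that do carry a timestamp, mysqldump's --where option can restrict a dump to matching rows. Below is a rough sketch that only builds and prints such a command, so it can be inspected before being run. The helper name dump_since_cmd is invented, and the database name forum, the table ibf_posts, and the column post_date are assumptions about the schema that would need checking against the actual install.

```sh
# Hypothetical helper: print a mysqldump command that exports only rows whose
# timestamp column is greater than SINCE (a Unix timestamp).
#   --no-create-info  skip CREATE TABLE, emit rows only
#   --replace         emit REPLACE instead of INSERT, so re-importing a row
#                     that already exists overwrites it instead of failing
dump_since_cmd() {
  local db="$1" table="$2" column="$3" since="$4"

  # Refuse anything that is not a plain non-negative integer.
  case "$since" in
    ''|*[!0-9]*) echo "since must be a Unix timestamp" >&2; return 1 ;;
  esac

  printf 'mysqldump --no-create-info --replace --where="%s > %s" %s %s\n' \
    "$column" "$since" "$db" "$table"
}

# Example (assumed names): rows added to the posts table after
# 2011-08-28 00:00 UTC.
dump_since_cmd forum ibf_posts post_date 1314489600
# prints: mysqldump --no-create-info --replace --where="post_date > 1314489600" forum ibf_posts
```

This only catches rows whose chosen timestamp moves forward. Edited posts would need a separate column that tracks edits (if the table has one), and deleted rows never show up in a dump like this, so a periodic full dump is still needed as a baseline.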
I run a cron job from DirectAdmin. The DB size is around 1 GB.
This topic is now archived and is closed to further replies.