Jump to content

DB replication


Recommended Posts

High availability (HA) setup for those who need it gets it. It's not entirely about the size of the site.

If your site is far too big to be hosted on a single machine (ie, the best available machine isn't enough to support your system) and you obviously don't want to get into the super computer setups where they cost like half a million... you have little choice but to go with replication. But this is why you see really beefy servers being used for database often. So they don't have to bother with replication.

But most sites enter replication usage when they want HA setup. It's basically having everything with a failover that allows it to run even when one goes down. But this literally means you need double of EVERYTHING because your secondary needs to be able to serve when your primary goes down. Minimum of two switches, two power source, two incoming network, two web handlers, two database, etc, etc. The costs of HA setup are always in thousands/mo at minimum, so it's often not realistic solution for many.

4hr down time for hardware failure is quite frankly not bad. I forget the source, but there's a stat that showed uptime of the big sites in the world. Only 2 sites had 100% uptime throughout the year, google (the search engine, not all of their branches) being one of them. Having 100% uptime is something very hard to do even with lots of money.

Link to comment
Share on other sites


High availability (HA) setup for those who need it gets it. It's not entirely about the size of the site.



If your site is far too big to be hosted on a single machine (ie, the best available machine isn't enough to support your system) and you obviously don't want to get into the super computer setups where they cost like half a million... you have little choice but to go with replication. But this is why you see really beefy servers being used for database often. So they don't have to bother with replication.



But most sites enter replication usage when they want HA setup. It's basically having everything with a failover that allows it to run even when one goes down. But this literally means you need double of EVERYTHING because your secondary needs to be able to serve when your primary goes down. Minimum of two switches, two power source, two incoming network, two web handlers, two database, etc, etc. The costs of HA setup are always in thousands/mo at minimum, so it's often not realistic solution for many.



4hr down time for hardware failure is quite frankly not bad. I forget the source, but there's a stat that showed uptime of the big sites in the world. Only 2 sites had 100% uptime throughout the year, google (the search engine, not all of their branches) being one of them. Having 100% uptime is something very hard to do even with lots of money.




I see, so even a high traffic forum might not be using db repication as a backup method?
Link to comment
Share on other sites


I see, so even a high traffic forum might not be using db repication as a backup method?



If all you need is a backup, just run mysqldump on a low priority...
If you need a fail over to run instantly when the primary server goes down, you need replication/mirroring.

I have a high traffic forum, but my budget for my own niche doesn't allow me to do a HA setup. So, I simply have backups stored on amazon cloud -- which is not a db server. Just a storage.

Simple question is, is down time worth more than the HA setup? If yes, then you do it. If not, then you don't do it.
Assuming your server is up 99% of the time (any less and you really have something to fix...). So, is <1% uptime worth thousands of dollars every month? Either in short term or long term.
sql db replication alone is obviously not going to cost thousands. But, replicated sql only reduces ONE factor of a potential down time. So, these numbers are really assuming that you do a full HA setup.
Link to comment
Share on other sites


If all you need is a backup, just run mysqldump on a low priority...


If you need a fail over to run instantly when the primary server goes down, you need replication/mirroring.



I have a high traffic forum, but my budget for my own niche doesn't allow me to do a HA setup. So, I simply have backups stored on amazon cloud -- which is not a db server. Just a storage.



Simple question is, is down time worth more than the HA setup? If yes, then you do it. If not, then you don't do it.


Assuming your server is up 99% of the time (any less and you really have something to fix...). So, is <1% uptime worth thousands of dollars every month? Either in short term or long term.


sql db replication alone is obviously not going to cost thousands. But, replicated sql only reduces ONE factor of a potential down time. So, these numbers are really assuming that you do a full HA setup.



So how you do it? do you take your forum offline while doing db bakcups? and then after the backup is done you transfer the backup to amazon?
how you set up the low priority?

thank you
Link to comment
Share on other sites

No, my site runs while I create backups. Doing so doesn't create a significant impact on the runtime performance. But depending on how you setup/resources available, shutting down http may be necessary.

Anything called with nice (the command) is run on low priority on linux. You can modify already running things with renice. Low priority affects CPU btw, not hard drive -- which is the bigger impact for these.
Low priority obviously doesn't mean no effect, just lot less of an effect when cpu bound.

I partially take back what I said that mysqldump works. This is most of the case true b/c most sites are small... But I guess the question is for big sites. Thinking about it, I'm not sure you want to run mysqldump on low priority if you ARE jammed on cpu.

with innodb, mysqldump has option to do per single transaction. This prevents big table locking. This is often sufficient.

with myisam, there's more options, or alternative options at the lack of above.
1. mysqlhotcopy. You can look this up. This is faster. Only works for myisam and archive.
2. mysqldump with no locks. Yes, you can just turn off table locking. lol~~~ This, however prevents you from having a perfectly synced output. Without locks, it won't cause any down time, and as long as you're not seriously lacking disk resources, it'll run without much of an effect to the runtime of your website.
3. backup by parts with mysqldump or mysqlhotcopy. Not the entire database at once. If you do it by parts, each backups will take a short time, thus prevents big locks. You can further elaborate by parts even further. You can even separate the data to be backed up by time or points (if possible). Thus, making backup parts even smaller and more efficient. Though, restoring needs to be just as creative. Though, just like #2, this will cause incorrect perfect sync, but frankly, don't care. lol Minor corruption... I'm not running a bank transactions here.
4. lvmdump... If you're running on lvm, you can just dump the disk data... It'll lock the disk, but it's supposed to be uber fast. Doesn't work with innodb due to non-os managed write cache. Never tried this myself, but heard it works and somewhat fault-tolerant enough for non critical data. No guarantees for working though since it's not a sql operation at all.

If you don't want to do that manually (duh!) and mysqldump is too slow and clunky for you, there are commercial and free solutions out there like r1soft, amanda/zmanda, mysqldumper, xtrabackup, etc. Not all of them will run on all environments. They're, by no means, perfect solutions though. Zmanda for example is just a tool that manages the above methods for you (including replication).

The truly 0 downtime, 0 loss of performance and no other negative consequence is replication, stop slave and do a mysqldump on the slave. then restart. Well, I guess the negative is money. lol If mysql mirror is the only replication you want, either for backup purposes for partial HA setup, and especially for pure backup, I guess it's not that really outside of people's budget. But... taking a look at my own... the day after I turn on mysql replication is the day I run out of disk space on my SSD. lol

Right now, I just run mysqldump with no locks. :P Risky, yes. But again, not running a bank. If a post is made and the user that made the post is missing, IPB isn't going to crash. ^^ (thank you IPB. lol)
Thinking of trying some tools again.

Link to comment
Share on other sites

Thank you Grumpy for your answer. So doing db backups while the forum is online it will corrupt db but not at the point where you see mysql errors right? it just wont keep a post made while backingup correct? I cant imagine a high traffic forum being backed up while online :S

Link to comment
Share on other sites

It's not going to corrupt your running database, it will generate a possibly (likely in high traffic) non-synced data output. The all table lock is there so that it is capable of taking a perfect snapshot. If you prevent the locks, the table will be constantly be written on to. So, if a table that's already backed up is changed in the middle of the backup procedure, then your rest of the data may be too far in the future. This newly created backup my be "corrupted" depending on the definition of it. It will not contain mysql errors. But it is broken in the sense like my example. Post with user, but user doesn't exist.

Link to comment
Share on other sites

Normally the only need for replication is a very busy forum... I see this happen at about 3500-4500 users in 15 minutes on a proper server, it can happen much sooner depending on the server specs and setup however. There isn't one rule for all, as different configs and setup will determine when you hit that limit. As mentioned above, the issue is locking the tables, if you have too much traffic, the wait times will overload apache/web server while these tables are locked for the dump to complete. IF you run the dump without locking, it's most likely going to have some conflicts if you ever needed to restore it.

Link to comment
Share on other sites


Hello,



Im wondering if database repilcation is really necessary for big forums as a backup method. Is there any other way?




Thank you


not necessary depending on your budget and time you'd want to allocate to such

2 thread/posts you want to read which basically highlight the fact Nice read [*] [*]bit more on how mydumper works and benchmarks comparing mydumper vs traditional mysqldump First post highlights that if you don't mind some downtime/forum closure for maintenance backups, it's a much cheaper alternative. Note here in Australia the top 4 largest banks all have nightly 1-2hr maintenance windows where clients can't access their bank's web sites and services usually when everyone is asleep. So if the big banks can do it, you can too i.e. close of forum for maintenance for 5-15mins while backups are run. But if you want 100% uninterrupted online time, then yes mysql replication is needed. HTH
  1. [*]that mysql replication isn't a backup method but a high availability method which helps with large db backups [*]that there are alternative backup methods which are faster than traditional mysqldump method, i.e. percona xtradb and mydumper multi-threaded tool is 3x to 10x times faster than mysqldump

    1. [*]
http://community.inv...20#entry2234644http://www.webhostin...d.php?t=1112716http://www.webhostingtalk.com/showthread.php?t=1113263http://vbtechsupport.com/1716/





Link to comment
Share on other sites

Thanks all! after reading I think I will prefer to put my commuity offline at nights no matter how little or huge is to do my db backups, the only thing what will change is the frecuency of the backups depending how big the community becomes.



Also, backing up files while forums online doesnt trouble my site right?? all the modifications goes directly to the database right?




thank you

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...