Jump to content

MyISAM, MyISAM - or SolidDB?


Guest W13

Recommended Posts

What mistake and imbroved ?



(I'm the dev of this mod )



server side files would not clear or rotate and sphinx would not complete the nightly rebuild. This happened on the second day after manually clearing all files. The sphinx server would then fail to restart.
To get around that I had to add a rm -rf cron to delete the files before rebuilding.
Hopefully fixed with the gold release of sphinx.

IPB_Mod side.

Searches on just membernames failed, every time.
Link to comment
Share on other sites

  • Replies 73
  • Created
  • Last Reply

if you read that thread it did go down like a million times.




that's probably because they didn't configured their daemon correctly.

I still prefer xapian because it doesn't need a daemon (so don't go down) and I think the results of xapian are more relevant too. But the index speed of xapian is quite long and it does not default support utf .... I think the search speed of xapian is superior to sphinx though

therefor I'm using the xapian as a base search, to find the relevant search and using the mod to use sphinx search for the more filtered out posts.

I'm planning to completely replace the ipb-search witch xapian one day though
Link to comment
Share on other sites

Yeah, we're still doing the dance with Sphinx at Neowin, but I only work on the site part time so it does have a few kinks to work out. Searches are approximately 1 to 2 billion times faster, and no more horrific table locks. We had actually set up a 2nd MySQL server exclusively for FULLTEXT searches. The master runs everything fully InnoDB, and the slave has ibf_posts as a MyISAM table. Works fine with replication, but FULLTEXT still gets slow at high record counts. So, Sphinx is the best option we've seen. Plus, throw in stemming support and it's way more powerful than FULLTEXT. Yeah, 100% recommendation from us. We've just got to upgrade to 0.9.7 now that it's out.

Link to comment
Share on other sites

  • Management

Yeah, we're still doing the dance with Sphinx at Neowin, but I only work on the site part time so it does have a few kinks to work out. Searches are approximately 1 to 2 billion times faster, and no more horrific table locks. We had actually set up a 2nd MySQL server exclusively for FULLTEXT searches. The master runs everything fully InnoDB, and the slave has ibf_posts as a MyISAM table. Works fine with replication, but FULLTEXT still gets slow at high record counts. So, Sphinx is the best option we've seen. Plus, throw in stemming support and it's way more powerful than FULLTEXT. Yeah, 100% recommendation from us. We've just got to upgrade to 0.9.7 now that it's out.




Very interesting, thanks for sharing that, Tim. I'd like to talk to you more about that when you have some spare time! :)
Link to comment
Share on other sites

if you read that thread it did go down like a million times.


It is a bug in sphinx-0.9.7-rc2 sometimes the deamon crach (fixed in final release ! :D )

Yeah, we're still doing the dance with Sphinx at Neowin, but I only work on the site part time so it does have a few kinks to work out. Searches are approximately 1 to 2 billion times faster, and no more horrific table locks. We had actually set up a 2nd MySQL server exclusively for FULLTEXT searches. The master runs everything fully InnoDB, and the slave has ibf_posts as a MyISAM table. Works fine with replication, but FULLTEXT still gets slow at high record counts. So, Sphinx is the best option we've seen. Plus, throw in stemming support and it's way more powerful than FULLTEXT.

Yeah, 100% recommendation from us

. We've just got to upgrade to 0.9.7 now that it's out.


Thanks ! :)
Link to comment
Share on other sites

  • 2 months later...

So I run/host a decent sized IPB (2.3.1) site as well (900,000+ posts and 200,000+ page views / day) and we've spent the last few days heavily optimizing the site. We're running it on a single server at the moment (Dual 2.4 Xeon w/ 2GB of RAM). We're actually about to move up in hardware but we figured we'd optimize what we had as far as we could take it. I can happily say that the server is running worry free right now, so here's what we did:


memcache: This was a big one early on, we actually added this months ago and saw immediate performance gains. Now that memcache support is built into IPB this is a must-have for large sites.

sphinx: We moved away from full text searching in order to move to innodb, but I've been following sphinx for quite a while now and was waiting for an excuse. Searches are extremely fast now and a lot of the "weird" queries have disappeared

innodb: This one was huge and probably the biggest thing we've done. With so many concurrent users, we found that the table would start locking recently and end up in a spiral where load inevitably shot through the roof and the site became inaccessible. We moved the posts and members tables to innodb and locking has pretty much stopped there. We did move attachments temporarily but ended up moving it back to myisam due to the shear # of reads that were going on.

memory sessions: We moved the sessions table to memory as suggested, works well it seems

my.cnf optimizations: Using the mysql analyzer script in this thread, we optimized the configuration and the server seems to be handling tons of concurrent connections much better now

low_priority_updates: We turned this on as well, not sure what kind of effect it's having with everything else going on, but some people swear by it.


All in all, load is now staying below 10 at almost all times, whereas we'd see spikes above 20 constantly and higher than 40 when the locking kicked in. The biggest differences we've seen were due to the innodb and memcache implementation, but the rest of them certainly didn't hurt. There are still a few things on the list to do, but we'll be waiting for the server move to complete them. Here they are if you have any suggestions:


sphinx storage engine: One of the problems we're seeing right now is queries that are essentially pulling a bunch of rows from the posts table by id, but it'll be a list of hundreds of ids. These come from the results of the sphinx daemon so that you can display the posts. We actually have a running compile of the sphinx storage engine for mysql which essentially does a socket connection to the sphinx daemon and queries it based on the parameters you put in. The benefit of this is we can fold our SELECT query and our sphinx search into one query and connect them with a join. This should provide a pretty sizable speed boost when it comes to searches (not that sphinx isn't fast as is).

lighttpd: Unfortunately, our entire httpd.conf is built via mod_perl and it's a bit of a pain to migrate that conf to lighttpd. Hopefully, on the new server, we'll write a special conf just for this site to point all of the static files to lighttpd.

php accelerator: We've been wanting to implement a php accelerator like xcache or APC, but due to conflicts with memcache we haven't done it so far. It'll be implemented on the new server (and isn't a bottleneck at the moment anyway)


And that's about it so far. Somehow we've managed to keep the site+database running on a server shared with dozens of other sites (though this is by far the heaviest) with only 2GB of RAM and 2x 2.4GHz procs. Without all the suggestions on here, this wouldn't have been possible. So thanks to everybody and I hope this helps those that follow.

Link to comment
Share on other sites

What conflicts with memcache are you having with apc? An opcode cache can make a bigger difference than memcachd. If it's a choice of one or the other, make sure you're making the right choice.

I consider lighttpd overrated for much other than rails. Have you explored the caching options offered by apache2.2 much? I'm currently upgrading all my stuff to debian etch which uses 2.2 and the difference has been phenomenal. Apache tweaking is a bit more complicated because it's more versatile, but for most usage cases you can get the same performance out of apache worker_mpm+php-fcgi. It's also better supported, has a longer track record, and is easier to integrate into a full stack.

With the exception of mongral and self-hosting stuff like that I've yet to be convinced there's any reason to use anything other than apache. I expect apache 1.4 is going to be EOLed any day now btw. Apache 2.0 is showing it's age and is really sluggish compared to 2.2 which has been out over a year now. 2.2 is the one to use.

If you're not using php 5.2, do so. There have been major improvements in the memory handling but you're want to increase memory_limit in php.ini to well above what the default was in prior versions.

Stuff like cpanal and plesk use absurd amounts of resources. If you're on a VPS or dedicated server and have some kind of control panal, consider ditching it and doing everything from the shell. It's worth it.

Link to comment
Share on other sites

We're running Apache 2.2.3 now with php 5.2.0-8. The conflicts we've had with memcache and APC actually deal with the apt repos for Debian and while some of our system is custom compiled, some (like memcache and other php modules) are pulled from the repos. For some reason it won't let you pull down php5-apc if you have php5-memcache installed. We've thought about custom compiling that but with a move to a new system coming up, there hasn't been a need.

And as for the system resources, we have a custom built system that's heavily optimized. The panel is actually custom written to build the confs from a pretty minimal mysql database we made. It's very light and doesn't actually do anything except create confs and create flags when a process needs to cycle.

And an opcode cache is nice but we have more database load than php load (by far) at the moment and we're using memcache for other sites as well, to actually cache the results of mysql queries (vs the template caching that I believe IPB does).

I'm still interested in lighttpd though, the benchmarks on it are very promising and it has a memcache module that will allow it to cache static files in memory, which should be a huge benefit.

Link to comment
Share on other sites

There is no APC, xcache or eaccelorator package in debian stable. They have issues with the use of the php license for anything but php, pretty much. I compiled it myself and put an apc.ini file in /etc/php5/conf.d/ to keep with the debian configuration scheme and have apc.mmap_file_mask pointing to a subdirectory under /var/tmp/ rather than tmp as in the default.

http://pecl.php.net/package/APC



Link to comment
Share on other sites

There is no APC, xcache or eaccelorator package in debian stable. They have issues with the use of the php license for anything but php, pretty much. I compiled it myself and put an apc.ini file in /etc/php5/conf.d/ to keep with the debian configuration scheme and have apc.mmap_file_mask pointing to a subdirectory under /var/tmp/ rather than tmp as in the default.



http://pecl.php.net/package/APC


We have a pretty good third party repo we use for a lot of web stuff if we're too lazy to compile it (and it's updated more often than stable). Mostly everything web-oriented is custom compiled on the new server though.
Link to comment
Share on other sites

maddog107, my forum with 500K posts / 100K views takes no more than 30% load. And it has single Xeon (Core 2 Duo) with 4GB memory. IMHO, your config is not enougth efficient.

- switch to lighttpd / nginx and use php in FCGI mode. It's the first thing to do, prior to ANY other optimizations.
- increase memory to 4GB, if possible, and give as mush as you can to mysql. The best situation - if all database can fit in memory.

Memcache for static files... hm... not sure. Those are efficiently cached in system file cache, if you have enougth memory.

Link to comment
Share on other sites

maddog107, my forum with 500K posts / 100K views takes no more than 30% load. And it has single Xeon (Core 2 Duo) with 4GB memory. IMHO, your config is not enougth efficient.



- switch to lighttpd / nginx and use php in FCGI mode. It's the first thing to do, prior to ANY other optimizations.


- increase memory to 4GB, if possible, and give as mush as you can to mysql. The best situation - if all database can fit in memory.



Memcache for static files... hm... not sure. Those are efficiently cached in system file cache, if you have enougth memory.



It's an older Xeon we're on (P4?), not a core 2, there's a *huge* speed difference and from your stats, we get about 2x as much traffic as you. Given that it's an older processor architecture (you say single Xeon but it's dual core so you're running the same # of procs as us) w/ half the RAM and 2x the traffic, the config is very efficient right now. And I'm not sure what you mean by 30%, if you're talking CPU activity, between everything, the cpu is idle 53% of the time on average with load averaging 4 or so. We also run another site on the server that does a ton of innodb writes every day so it's eating up a bit of the database power.

And lighttpd is not always an option, we run a ton of sites on this server and while we want to run lighttpd for some stuff eventually, we're not moving over to it any time soon. And if you didn't see, I said we're upgrading to a completely new server in the future, so it'll have more memory but the comments I made were about optimizing what we have now. We're doing a lot with the little we have.

btw: memory does make a *huge* difference with things like this and while we could just go out and buy RAM for the server, we figured we'd try to optimize for the 2GB we have now (which can't fit the entire database in memory) and just wait for the new server altogether.
Link to comment
Share on other sites

Of cause, I understand, that your traffic is 2 times bigger than my one. I mean, that if my server is loaded less that 1/2, then your hardware is good. 2 older Xeons are usually better that single new one, because they have 2 independed buses. Speed is ~ the same, because they have older core, but higher CPU clock. I guess, your server also have SCSI disks, and my one works with IDE.

30% - load avarage, according to "top". It shows 0.3-0.5. Maximum is 2. That's preliminary, and correct only if disk I/O & bus are not overloaded.

btw: memory does make a *huge* difference with things like this and while we could just go out and buy RAM for the server, we figured we'd try to optimize for the 2GB we have now (which can't fit the entire database in memory) and just wait for the new server altogether.



Any way, it will be a really good step to migrate some projects to lighttpd, as you plan. Because apache under high load can eat as many memory, as you have in server, even 8Gb :) . My old server was Celeron 2000 with 2GB ram, and I was involved in many optimisation tricks, to keep it work :) .

Because CPU & bus speed are OK in your case, the next goal is to reduce disk I/O, caused by database. lighttpd with php FCGI will allow to do that. It's not much "faster" than apache, but uses memory much more effective, and more predictable. SO, more memory will be left for MYSQL.

Of cause, it's difficult sometime to fire out apache in complicated configurations. In this case, it's possible to use lighttpd as proxy and static pages webserver, to reduce loads from keep-alive sessions.

About MyISAM locks. That's not bad itself, but that increases apache processes count, and causes useless waste of memory (so, mysql works with disk and becames slow). With lighttpd you will have no such problem.

I don't know, if innodb is faster, than myisam. Probably, it's slower. But it's Robust and more convenient. In my case, all tables are in innodb. It requires more minimal memory to be effective, and _probably_ (i don't know, really), myisam could be better with lighttpd in your case (with 2 GB).

And note, that if website have many pages, the most load goes from serach bots. For example, my forum stats show 30 users & 1000 guests in the night time, and 400-500 users & 1000 guests in the daytime. I think, those guests are very strange :)
Link to comment
Share on other sites

We have a pretty good third party repo we use for a lot of web stuff if we're too lazy to compile it (and it's updated more often than stable). Mostly everything web-oriented is custom compiled on the new server though.


I used to use dotdeb when it was the only way to get php5 for sarge. They lack the security and stability testing that are the whole point of using debian. I strongly recomend using etch with the stock debian apache and php packages. Otherwise you might as well be running sid.

Link to comment
Share on other sites

Of cause, I understand, that your traffic is 2 times bigger than my one. I mean, that if my server is loaded less that 1/2, then your hardware is good. 2 older Xeons are usually better that single new one, because they have 2 independed buses. Speed is ~ the same, because they have older core, but higher CPU clock. I guess, your server also have SCSI disks, and my one works with IDE.




Any way, it will be a really good step to migrate some projects to lighttpd, as you plan. Because apache under high load can eat as many memory, as you have in server, even 8Gb :) . My old server was Celeron 2000 with 2GB ram, and I was involved in many optimisation tricks, to keep it work :) .





Of cause, it's difficult sometime to fire out apache in complicated configurations. In this case, it's possible to use lighttpd as proxy and static pages webserver, to reduce loads from keep-alive sessions.



About MyISAM locks. That's not bad itself, but that increases apache processes count, and causes useless waste of memory (so, mysql works with disk and becames slow). With lighttpd you will have no such problem.



I don't know, if innodb is faster, than myisam. Probably, it's slower. But it's Robust and more convenient. In my case, all tables are in innodb. It requires more minimal memory to be effective, and _probably_ (i don't know, really), myisam could be better with lighttpd in your case (with 2 GB).





Our hard drives are actually IDE as well and I'd dispute the core 2 vs old Xeon bit but that's not really an issue as we're moving anyway. I'm not sure about MyISAM + lighttpd vs InnoDB + apache, that'd be interesting to see tested, even apache + InnoDB. The main reason for the InnoDB tables is the locking but if you're only running a single threaded server like lighttpd, shouldn't be a problem unless the indexes are taking a long time to rebuild. Somebody should benchmark that.

Bots have pushed about 800k of our 3.6 million pages so far this month, there are usually a couple hundred guests browsing the site at any one time.

The big thing about the new server is it has more RAM and 6x 10k rpm SCSIs to alleviate some of the I/O that's going on. We're also moving all the user uploads off to a shared NAS so we can actually farm out the images through a separate server if we want.

Thanks for the recommendations (from everybody), the site's running well now and I'll report back whenever we perform the server move (2 weeks or so).
Link to comment
Share on other sites

  • 10 months later...

I'm trying to set my ibf_topic_markers to InnoDB per the recommendation by Brandon, but when I go to phpmyadmin and select InnoDB for the table type for that table, the mySQL server becomes non-responsive and I need to go into SSH to restart it, and when it comes back up it's still in myISAM. Am I missing something here?

Link to comment
Share on other sites

Just would like to post a followup saying that I IM'd Brandon and he said he's no longer recommending any use of InnoDB...



Why? :o

That's like, the biggest optimization thing we've done.
Link to comment
Share on other sites

Of course keep in mind that every server is different, there are different versions of MySQL, and every database is different (in terms of size, what tables have a lot of rows, and so on).

We changed over the topic_markers table to InnoDB on a rather large site to try to offset the table being locked so much. Instead of it being locked, it simply took 10+ seconds to select query it, till we switched it back to MyISAM. While InnoDB does support row-level locking, it is still exponentially slower than MyISAM, offsetting any gains from the functionality, in my limited experience.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

  • Recently Browsing   0 members

    • No registered users viewing this page.

×
×
  • Create New...