
How do I scale IPB to handle 25,000 simultaneous users?


Recommended Posts

Hello! I have a large client that wants to build a forum for its customers. I've recommended IPB, but I haven't used it for anything until now, so I'm a real newbie! The client wants us to plan for 25,000 simultaneous users. They haven't specified how many of those would be logged in, so let's suppose 5,000 logged in. I'm sure 25,000 sounds like a lot, but considering the size of the client, I can understand the worry. Do you think this is doable with IPB? How would you go about it? I'm thinking 2-3 load-balanced webservers running memcached and eAccelerator. The database server would need a lot of memory and many CPU cores. For searching we would use Sphinx - where should that run? On the database server?
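Just to put rough numbers on it, here's the back-of-the-envelope math I've been doing (the 30-second think time and 50 ms of PHP per page are my own guesses, not figures from the client):

[code]
<?php
// Very rough load estimate - all the inputs below are assumptions, not client data.
$concurrentUsers   = 25000; // simultaneous visitors, logged in or not
$secondsPerPage    = 30;    // assume each visitor requests a new page every 30 seconds
$phpTimePerRequest = 0.05;  // assume ~50 ms of PHP/CPU time per dynamic page

$requestsPerSecond = $concurrentUsers / $secondsPerPage;      // ~833 dynamic requests/s
$busyPhpProcesses  = $requestsPerSecond * $phpTimePerRequest; // ~42 PHP processes busy on average

printf("~%d requests/s, ~%d PHP processes busy on average\n",
       $requestsPerSecond, ceil($busyPhpProcesses));
[/code]

Even if those guesses are off by a factor of two or three, it suggests the web tier is manageable with a few machines, and the real question is how hard the database gets hit.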

Any tips & ideas are welcome!

matti

Link to comment
Share on other sites

  • 2 weeks later...

[quote name='kmf' date='30 November 2009 - 07:39 AM' timestamp='1259566783' post='1883499']
I guess you need to prepare a server for memcache or other caching too.

This forum has 6000 users at the same time and it's pretty snappy.
http://asianfanatics.net/forum/


I assumed we would run memcached on the webservers. Memcached is memory-bound and Apache is CPU-bound, so they should be a good fit on the same boxes. You have lots of traffic - your "most online at once" record is almost 30k! Good job! What machines do you run your site on? Have you done any special configuration? Thanks.

matti

Link to comment
Share on other sites

[quote name='BacTalan' date='30 November 2009 - 07:50 AM' timestamp='1259567404' post='1883501']
According to some of the devs, MSSQL scales much better than MySQL. That may be something to keep in mind.


Do you know if this is IP.Board specific? Some of the largest sites on the net run MySQL, so I assumed it would scale just fine.

Link to comment
Share on other sites

[quote name='Matti' date='30 November 2009 - 12:16 AM' timestamp='1259568998' post='1883507']
Do you know if this is ipboard specific? Some of the largest sites on the net run mysql so I assumed it would scale just fine.

I'm no expert on this, so don't take anything I say as certain. I'm just going off what I've seen in some posts by the developers. I'd assume it would hold in general, however.

Link to comment
Share on other sites

[quote name='Matti' date='30 November 2009 - 07:45 AM' timestamp='1259567129' post='1883500']
I assumed that we would run memcached on the webservers. Memcached is memory bound, apache is cpu-bound, so it should be a good fit. You have lots of traffic, your at once record is almost 30k! Good job! What machines do you run your site on? Have you done any special configuration? Thanks.

matti



It's running on 3 dedicated servers: 2 load-balanced webservers and 1 database server. The webservers are using eAccelerator.
Gzip was disabled and I use mod_deflate to compress all non-image static files. We just upgraded from IPB 2 to IPB 3, so we're still testing things out - it seems IPB 3 causes roughly twice as much load.

Also, for search we use Xapian and Sphinx. The database uses InnoDB for the large, fast-updating tables like posts, members and sessions.
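For anyone who wants to do the same, the switch is just one ALTER per table - roughly like this (the table names assume the default ibf_ prefix, so adjust them for your install, and take a backup first since ALTER TABLE rewrites the whole table):

[code]
<?php
// Convert the big, write-heavy tables to InnoDB.
// Table names assume IPB's default ibf_ prefix - adjust for your installation.
$db = new mysqli('localhost', 'forum_user', 'secret', 'forum_db');

foreach (array('ibf_posts', 'ibf_members', 'ibf_sessions') as $table) {
    if ($db->query("ALTER TABLE {$table} ENGINE=InnoDB")) {
        echo "{$table} is now InnoDB\n";
    } else {
        echo "{$table} failed: {$db->error}\n";
    }
}
[/code]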

Link to comment
Share on other sites

[quote name='kmf' date='30 November 2009 - 08:40 AM' timestamp='1259570431' post='1883513']
It's running on 3 dedicated servers. 2 webservers loadbalanced and 1 database server. The webservers are using eaccelerator.
Gzip was disabled and I used mod_deflate to zip all non-img static files. We just upgraded to ipb3 from ipb2, so we're still testing things out.

As it seems that ipb3 is causing twice as much load.

Also, for search we used xapian and sphinx. The database is using innodb for the large and fast-update tables like posts, members and session.


Alright, thanks for the info, especially the bit about InnoDB! So if I understand you correctly, you're not using memcached or anything else to offload the database? Since most users are just browsing the forums as guests, I thought a fat cache would be the key to good performance.

matti

Link to comment
Share on other sites

[quote name='Matti' date='30 November 2009 - 09:05 AM' timestamp='1259571908' post='1883519']
Allright, thanks for the info, especially the bit about innodb! So, if I understand you correctly, you're not using mecached or something else to offload the db? Since most users are just browsing the forums as guests, I thought that a fat cache would be the key to get good performance.

matti



I'm using eAccelerator for caching. It works as a combination of memory cache and file cache.
That's because I'm on a budget and can't afford a server with a lot of memory for memcached. And eAccelerator works great too.

Link to comment
Share on other sites

[quote name='kmf' date='30 November 2009 - 12:38 PM' timestamp='1259584734' post='1883553']
I'm using eaccelerator for caching. It's a form of memcache and filecache.
That's because I'm at a budget and can't afford a server with a lot of memory for memcache. And eaccelerator works great too.


OK. I haven't used eAccelerator myself, but I was under the impression that it's mainly used to store compiled PHP opcodes so pages don't have to be recompiled on every request, whereas memcached is used to store data that would otherwise be fetched from the database. I'm glad to hear your site is running well, though, since you have that much traffic.
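To illustrate the difference as I understand it: the opcode cache works transparently once installed, while memcached caching is something the application does explicitly, along the lines of this (the key, query and table are made up for the example, not actual IPB internals):

[code]
<?php
// Read-through caching with memcached: check the cache first, fall back to MySQL,
// then store the result so the next request skips the database.
// Key name, query and table are illustrative only, not actual IPB internals.
$cache = new Memcache();
$cache->addServer('127.0.0.1', 11211);

$stats = $cache->get('board_stats');
if ($stats === false) {                         // cache miss -> hit the database
    $db     = new mysqli('dbhost', 'forum_user', 'secret', 'forum_db');
    $result = $db->query('SELECT COUNT(*) AS posts FROM ibf_posts');
    $stats  = $result->fetch_assoc();
    $cache->set('board_stats', $stats, 0, 300); // keep for 5 minutes
}
echo "Posts: {$stats['posts']}\n";
[/code]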

Link to comment
Share on other sites

[quote name='Matti' date='30 November 2009 - 01:15 PM' timestamp='1259586925' post='1883563']
Ok. I haven't used eaccelerator myself but I was under the impression that it is mainly used to store compiled php opcodes so that the pages didn't have recompile on every req, whereas memcached is used to store data that would otherwise be fetched from the db. I'm glad to hear your site is running well though, since you have that much traffic.



eAccelerator on its own is mainly used for pre-compiled PHP. But if you build it with the "shared-memory" option, you can let it work like memcached too.
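The data side is just a small put/get API - something like this, assuming your eAccelerator build was compiled with the shared-memory content cache:

[code]
<?php
// eAccelerator's key/value API - only present if the extension was built with
// shared-memory content caching enabled, hence the function_exists() check.
if (function_exists('eaccelerator_put')) {
    eaccelerator_put('online_count', 6123, 60); // cache a value for 60 seconds
    $count = eaccelerator_get('online_count');  // NULL if missing or expired
    echo "Cached value: {$count}\n";
} else {
    echo "This eAccelerator build has no shared-memory data API\n";
}
[/code]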

The only downside is that, as far as I know, it only works on the local server, whereas memcached lets you offload the whole cache to another server.
The upside is faster PHP interpretation and automatic rotation between file and memory caching.

IPB uses partial caching, meaning it won't cache raw query results but parsed, sorted and prepared data. That's something I prefer and make use of myself too.

So it doesn't really matter much which cache you use, as long as you know the advantages of one over the other.

I'm pretty sure IPB does not support MULTIPLE caches though... otherwise you could use a combination of eAccelerator and memcached.

Link to comment
Share on other sites

[quote name='kmf' date='30 November 2009 - 09:45 PM' timestamp='1259617525' post='1883720']
eaccelerator on itself is mainly used for pre-compiled php. But if you install it with the "shared-memory" option, you can let it work like memcache too.

Only downside is, as far as I know, it only works on the local server. Whereas memcache allows you to offload all cache to another server.
Upside is, faster PHP-interpreting and auto file/memory caching rotation.


Great, I didn't know about the shared-memory option - I'll have to look into that. Even so, that cache is local to one machine. I think the reason Facebook, Digg and Wikipedia all use memcached is that it can be distributed over many machines, which gives it great scalability. From what I understand, it's basically a distributed hash map that can be installed on all your webserver nodes, or wherever else you have spare memory, in effect giving you a giant virtual memory pool.
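In PHP terms it looks something like this - you register every node with the client and it decides which server each key lives on (the hostnames here are made up):

[code]
<?php
// One logical memcached pool spread over the webserver nodes (hostnames made up).
// The client hashes each key to a server, so every frontend sees the same cache.
$cache = new Memcache();
$cache->addServer('web1.internal', 11211);
$cache->addServer('web2.internal', 11211);
$cache->addServer('web3.internal', 11211);

$cache->set('example_key', 'prepared data shared by every frontend', 0, 600);
var_dump($cache->get('example_key'));
[/code]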

[quote name='kmf' date='30 November 2009 - 09:45 PM' timestamp='1259617525' post='1883720']
IPB is using partial caching. Meaning it won't cache direct results, but parsed, sorted and prepared data. Something I prefer and make use off myself too.


You're right, that is a good design choice. It makes for a more flexible solution.

[quote name='kmf' date='30 November 2009 - 09:45 PM' timestamp='1259617525' post='1883720']
So it doesn't really matter much what caching you use, if you know the advantage of the one over the other.
I'm pretty sure IPB does not support MULTIPLE caches though... otherwise you could use a combo of eaccelerator and memcache.


That's right - IPB only supports one cache type at a time; I found that out today by rummaging through the registry source code. Too bad, I think the combination would be best. If I were planning to run this on a single webserver, I would go for eAccelerator: that would give the opcode-caching performance boost along with the shared-memory data caching. Since our client is willing to pay for the hardware, I think we'll go with memcached. It gives us more scalability and reduces waste.

Link to comment
Share on other sites

What you should run is XCache, which in tests has proven faster than eAccelerator as your opcode cache. It caches the compiled PHP scripts in RAM and needs to run on every front-end machine. Then you run memcached for IPB's own cache system; that cache can be shared across multiple servers, which is exactly what it's designed for - memcached can scale to almost any size.

However, 25,000 active users at once is quite a bit of traffic. You're talking about multiple database servers, at least a primary and a slave (a rough sketch of what I mean by that split is below). Your front end is going to take the brunt of the work and will require good load balancing, so I would maximize your server resources with XCache, memcached and Sphinx. I would also get away from Apache and switch to nginx with PHP-FPM. I only run a forum that caps out at 2k users, but the speed difference is amazing and the drop in server load is quite noticeable.
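Roughly, the primary/slave split means writes go to the primary and reads can be spread over the slaves. The hostnames and columns below are made up, and as far as I know IPB doesn't do this splitting out of the box, so it would need work at the application or proxy level:

[code]
<?php
// Primary/slave read-write split, sketched by hand (hostnames/columns are made up).
// Writes go to the primary; reads go to a slave, accepting that replication lag
// means a row written a moment ago may not be visible on the slave yet.
$primary = new mysqli('db-primary.internal', 'forum_user', 'secret', 'forum_db');
$slave   = new mysqli('db-slave1.internal',  'forum_user', 'secret', 'forum_db');

// Write path - always the primary.
$primary->query("UPDATE ibf_members SET last_visit = UNIX_TIMESTAMP() WHERE member_id = 42");

// Read path - any slave will do for non-critical reads.
$result = $slave->query("SELECT COUNT(*) AS guests FROM ibf_sessions WHERE member_id = 0");
$row    = $result->fetch_assoc();
echo "Guest sessions (from slave): {$row['guests']}\n";
[/code]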

Overall, I think your client also needs to understand the up-front cost of running at least 10-15 servers, even with everything optimized. I personally would start with 2 front-end servers and one database server and work up from there.

Link to comment
Share on other sites

[quote name='ssslippy' date='01 December 2009 - 02:52 AM' timestamp='1259635952' post='1883807']
What you should run is xcache which in tests has proven faster then eaccelerate for your opcache. This will cache the php scripts into your ram. These needs to be run on every front end machine. Then you run memcache for the cache system in IPB itself. This cache can be shared across multiple servers and is designed this way, memcache can scale to any size.

However 25,000 active users at once is quite a bit of traffic. Your talking about requiring multiple database servers at least a primary and slave server. Your front end is going to take the brunt of the work and will require good load balancing and I would recommend maximizing your server resources with xcache, memcached, and running sphinx. I would also get away from apache and switch to nginx with php-fpm. I only run a forum that caps out at 2k users but the speed difference is amazing and the server load is quite noticeable.


OK, thanks for the info. I was actually wondering why IP.Board would need to know about the PHP opcode caching at all. I will definitely look into XCache and nginx - I've been hearing good things about both. Since I'm guessing 90% of the traffic will be guests just browsing the forums, I think a cache will go a long way. This, however, depends on IP.Board actually fetching most of its data from the cache rather than hitting the database. We'll see how that goes during load testing. If the database becomes a bottleneck, we'll have to look into master/slave setups, but I'd rather not because of the complexity.

[quote name='ssslippy' date='01 December 2009 - 02:52 AM' timestamp='1259635952' post='1883807']
Overall I think your client also needs to understand his up front costs of running at least 10-15 servers even if optimized. I personally would start with 2 front end and one database server and work from there.


Sure. What's most important for us is to show that we can help them out if they run into trouble down the line.

Link to comment
Share on other sites

[quote name='kmf' date='30 November 2009 - 08:40 AM' timestamp='1259570431' post='1883513']
It's running on 3 dedicated servers. 2 webservers loadbalanced and 1 database server. The webservers are using eaccelerator.
Gzip was disabled and I used mod_deflate to zip all non-img static files. We just upgraded to ipb3 from ipb2, so we're still testing things out.

As it seems that ipb3 is causing twice as much load.

Also, for search we used xapian and sphinx. The database is using innodb for the large and fast-update tables like posts, members and session.


Wouldn't you want your sessions table to be using the MEMORY engine instead? MyISAM may be a better option in some regards, but for temporary data like this MEMORY is the way to go, since what's stored doesn't need to survive a restart anyway.

I would also recommend FastCGI + suExec over mod_php + suPHP, as I believe suExec is more secure and FastCGI is more resource-friendly. On shared environments you have to use a wrapper script for PHP because of the tight restrictions with suExec... but if you're running a single site it's not that big of a deal.

I've also heard great things about lighttpd (which also supports FastCGI) versus Apache, although I've had no experience with it. I've seen some benchmarks where lighttpd outperformed Apache by a fairly wide margin. But because of the experience I have with Apache, I'm sticking with it unless I really have the need, or there's a feature that trumps Apache in a huge way.

Link to comment
Share on other sites

[quote name='Luke' date='03 December 2009 - 05:52 PM' timestamp='1259862753' post='1884624']
Wouldn't you want your sessions table to be using the MEMORY instead? MyISAM may be a better option in some regard, but for all of the above MEMORY is the way to go when the information is temporary (info stored doesn't survive a reboot).

I would also recommend FastCGI + suExec vs. mod_php + suPHP as I believe suExec is more secure, and FastCGI is more resource friendly. On shared enviroments you have to use a wrapper script for PHP because of the tight restrictions with suExec.... but if you're running a single site it's not that big of a deal.

I've also heard great things about lightHTTP (also supports fastcgi) vs Apache, although I've had no experience with it. I've seen some benchmarks where lightHTTP outperformed Apache at a fairly high margin. But because of the experience I have with Apache, I'm sticking with it unless I really have the need or there is a feature that trumps Apache in a huge way.



I've tried it all before, but the best combination for me is what I'm using at this moment.
Sessions on MEMORY still cause table locks: even though the table itself is fast, it takes time for the script to finish using it and release it. (I tried MEMORY tables on IPB 1.3, years ago, when I merely had 500 people online at the same time. Now, with 10 times that amount, I doubt MEMORY tables would scale correctly.)
Apache 2 with mod_php works fine for the webservers, as long as eAccelerator is there.
I use lighttpd for the static files on another server.
And Sphinx for searches.

At the moment there is just no need for me to experiment further and lose capabilities of Apache in exchange for speed from lighttpd. The use of the various options is all a matter of balance.

Link to comment
Share on other sites

[quote name='kmf' date='05 December 2009 - 04:21 PM' timestamp='1260030116' post='1885206']
Tried it all before, but the best combo for me is what I'm using at this moment too.
Sessions being MEMORY STILL causes tablelocks, even though the table is fast, it takes time for the script to finish using it and releasing it. (i've tried memory-tables on version 1.3 of IPB. That's years ago. And I merely had 500 people online at the same time. Now with 10 times that amount, I doubt memory tables scales correctly)
Apache2 with modPHP works fine for the webservers, as long as eaccelerator is there.
I use lighttpd for the static files on another server.
And sphinx for searches.

ATM there is just no need for me to experiment further and lose capabilities of apache in exchange for speed from lighttpd. The usage of various options is all a matter of balance.


Fair enough.

I do believe the "MEMORY" engine is different from the "HEAP" engine, which is what you would have used back then. A MEMORY table is supposed to keep everything in RAM, so I don't know why there would be any table locks, as nothing is actually written to disk. The only thing I can think of is that perhaps there were too many inactive rows and it wrote to disk temporarily... but fixing that may be a simple matter of configuring the table differently from the engine defaults (I think that's correct?).
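By configuring it differently from the defaults, I mostly mean the size limits - a MEMORY table is capped by max_heap_table_size, and you can give MySQL a row estimate per table. Something like this (the sizes and the ibf_ prefix are just examples):

[code]
<?php
// Sizing a MEMORY sessions table (values and table name are examples only).
// MEMORY tables are capped by max_heap_table_size; if the table fills up,
// inserts start failing with "table is full", so the cap has to be generous.
// Note: MEMORY doesn't support TEXT/BLOB columns, so this only works if the
// sessions table has none.
$db = new mysqli('localhost', 'forum_user', 'secret', 'forum_db');

// Raise the cap for this session before rebuilding the table
// (the same setting can go in my.cnf to make it permanent).
$db->query("SET max_heap_table_size = 268435456"); // 256 MB

// Rebuild the table in memory with an explicit row estimate,
// which MySQL uses when sizing the in-memory structures.
$db->query("ALTER TABLE ibf_sessions ENGINE=MEMORY MAX_ROWS=200000");
[/code]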

Link to comment
Share on other sites

Luke, speaking as someone who has used MEMORY tables on a large dataset (a MEMORY table with 1 GB of data in it, for example), I can tell you that after a certain point it becomes far *less* efficient than MyISAM. We used it and switched back due to locking issues.

Also HEAP is just an alias for the MEMORY engine. :)

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
