Load on server is increasing yet hit rate getting worse


Posted

I'm at my wits' end with this and wondering if someone can help me figure out where the resource issues on my site are coming from. At the moment I host 4 sites on my VPS; the busiest is the one in my sig, and the others don't even register on the "mildly busy" dial. After I upgraded to 3.3 I didn't notice any resource issues at first; if anything the site seemed a little snappier. Then I noticed an issue with the IPDownloads category listings: they were slightly off-tab, and enabling screenshots on sub-categories sorted that. Resource usage went through the roof, so I quickly reverted the change and things went back to normal. I've since removed a few mods that I can live without, but things don't seem to be getting any better. If anything they're worse.

I made the final changes (removing a few mods, etc.) last week and things looked good, but this morning the site is crawling again. Normally the load is anywhere between 0.40 and 1-2. Now it's something like 5+ and slowing the site considerably. A sudden hit from a lot of people (100+) used to spike it to around 3-4 for a moment before it came back down, but now with just 50 guests it sits at 5+ for ages and a page can take up to a minute to refresh.

I'm not sure if there is such a thing as an idiot's guide on where to start with this, but if anyone can walk me through how to begin figuring out what's going on I'd really appreciate it :)

Specs and stats as follows:
4200+ registered members - never really seen any more than 100 people on the site at any one time
CentOS 5 - i386 (32-bit)
CPUs on server: 16
IP.Board Version : v3.3.0 (ID:33011)
MYSQL Version : MYSQL 5.0.95-community-log
PHP Version : 5.3.6 (apache2handler)
Loaded Extensions : Core, PDO, Reflection, SPL, SQLite, SimpleXML, apache2handler, bcmath, calendar, ctype, curl, date, dom, eAccelerator, ereg, exif, filter, ftp, gd, hash, iconv, imap, ionCube Loader, json, libxml, mcrypt, memcache, mysql, mysqli, openssl, pcre, pdo_mysql, pdo_sqlite, posix, session, soap, sockets, sqlite3, standard, tokenizer, xml, xmlreader, xmlrpc, xmlwriter, zlib
Total Server Memory : 4096 MB
Available Server Memory : 2621 MB

Snapshot of System Processes

top - 06:33:58 up  1:55,  0 users,  load average: 1.02, 0.92, 1.90
Tasks:  44 total,   2 running,  42 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.7%us,  0.5%sy,  0.0%ni, 93.3%id,  0.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4194304k total,  1495864k used,  2698440k free,        0k buffers
Swap:        0k total,        0k used,        0k free,        0k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 7918 nobody    25   0  241m 208m  12m R 99.8  5.1   3:26.69 httpd
    1 root      15   0  2156  668  576 S  0.0  0.0   0:03.69 init
 1138 root      15  -4  2260  544  328 S  0.0  0.0   0:00.00 udevd
 1481 root      18   0  1812  572  476 S  0.0  0.0   0:00.04 syslogd
 1528 named     22   0  191m 5492 2068 S  0.0  0.1   0:05.29 named
 1569 memcache  18   0 76988  23m  540 S  0.0  0.6   0:06.35 memcached
 1585 root      18   0  7240 1044  636 S  0.0  0.0   0:00.00 sshd
 1593 root      15   0  2832  860  684 S  0.0  0.0   0:00.00 xinetd
 1605 root      18   0  2548 1164 1000 S  0.0  0.0   0:00.00 mysqld_safe
 1635 mysql     15   0  683m  70m 3588 S  0.0  1.7   4:34.19 mysqld
 3190 root      18   0  138m 102m 1288 S  0.0  2.5   0:04.14 clamd
 3197 mailnull  15   0 10368 1136  640 S  0.0  0.0   0:00.00 exim
 3226 root      18   0 16060  13m 1160 S  0.0  0.3   0:01.42 lfd
 3241 root      15   0 38244  31m 2548 S  0.0  0.8   0:02.12 spamd
 3247 root      18   0  2156  708  544 S  0.0  0.0   0:00.19 dovecot
 3248 root      15   0  2632 1020  824 S  0.0  0.0   0:00.09 dovecot-auth
 3252 dovecot   15   0  5296 2000 1632 S  0.0  0.0   0:00.00 pop3-login
 3253 dovecot   15   0  5296 2000 1632 S  0.0  0.0   0:00.00 pop3-login
 3254 dovecot   15   0  5316 2016 1636 S  0.0  0.0   0:00.00 imap-login
 3255 dovecot   15   0  5316 2012 1636 S  0.0  0.0   0:00.00 imap-login
 3281 root      18   0 42416  14m 4952 S  0.0  0.4   0:01.20 httpd
 3292 root      18   0  5384 1448 1100 S  0.0  0.0   0:00.00 pure-ftpd
 3294 root      16   0  5080 1152  904 S  0.0  0.0   0:00.00 pure-authd
 3303 root      15   0  3328 1112  568 S  0.0  0.0   0:00.08 crond
 3371 root      15   0 38696  31m 1892 S  0.0  0.8   0:03.33 spamd
 3439 root      18   0 14228 7260 1204 S  0.0  0.2   0:00.21 cpsrvd-ssl
 3443 root      18   0 15332 9224 1772 S  0.0  0.2   0:00.00 cpdavd
 3467 root      15   0  6088 4312 1320 S  0.0  0.1   0:00.08 queueprocd
 3482 root      33  18  4140 1828  672 S  0.0  0.0   0:00.00 cpanellogd
 3499 root      15   0  7960 5204 1752 S  0.0  0.1   0:00.65 tailwatchd
 7860 root      16   0 10448  380  228 S  0.0  0.0   0:00.00 vzctl
 7861 root      18   0  2548 1360 1120 S  0.0  0.0   0:00.01 bash
 7915 root      16   0  7996 5252 1824 S  0.0  0.1   0:00.09 leechprotect
 8023 nobody    15   0 49784  27m  11m S  0.0  0.7   0:05.14 httpd
 8026 nobody    15   0 51900  29m  11m S  0.0  0.7   0:07.85 httpd
 8141 nobody    17   0 47308  24m  10m S  0.0  0.6   0:02.86 httpd
 8147 nobody    15   0 47284  24m 9768 S  0.0  0.6   0:01.48 httpd
 8159 nobody    15   0 47324  24m 9988 S  0.0  0.6   0:00.70 httpd
 8160 nobody    15   0 47288  23m 9292 S  0.0  0.6   0:04.79 httpd
 8161 nobody    15   0 46992  24m  10m S  0.0  0.6   0:02.98 httpd
 8164 nobody    15   0 45988  22m 9724 S  0.0  0.6   0:03.62 httpd
 8181 nobody    18   0 45684  22m 9836 S  0.0  0.6   0:00.37 httpd
 9216 nobody    18   0 47296  23m 9524 S  0.0  0.6   0:00.61 httpd
 9225 nobody    18   0  2284  924  732 R  0.0  0.0   0:00.00 top

Posted

Just as a side note, I've already loaded MySQLTuner, but because I've had to restart the server in the last 24 hours the results could be a little off, so I'll post them after the server has been up and running for a while longer :)

Posted

16 CPUs (which I presume is two quads with HT?) and only 4GB of RAM, and x86 so no 64-bit instruction set? Seems an odd setup.

See what MySQLTuner recommends, but my immediate thoughts are that perhaps you should make sure all of your tables are optimised. I would shut down the MySQL server and run a myisamchk on all of the MYI files, then start it back up and repair and optimise all tables. As well as this, I think your MySQL server could be configured to use more of the available RAM although I'm not sure about that. It seems to be Apache causing the hold up though.

Try turning off GZip compression if you have it enabled, and perhaps also try enabling a caching system.

Posted

Also check:
shell$ mysql -uUSER -p
mysql> show processlist;

and see whether any intensive query is running.

I had similar problems and tracked them down to a scheduled task running at the worst time ;)
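
For anyone wanting to script the processlist check, here is a rough sketch. It assumes the list is fetched with `mysql -B -e 'SHOW FULL PROCESSLIST'`, which prints tab-separated rows with the columns Id, User, Host, db, Command, Time, State, Info. The function name `slow_queries` and the 5-second threshold are made up for illustration.

```shell
# Print processlist rows that are actively running (not sleeping) and have
# been in their current state for at least $1 seconds (default 5).
# Reads tab-separated `mysql -B` output on stdin; line 1 is the header.
slow_queries() {
    awk -F '\t' -v min="${1:-5}" '
        NR == 1 { next }                        # skip the header row
        $5 != "Sleep" && $6 + 0 >= min + 0 {    # $5 = Command, $6 = Time
            printf "%s\t%ss\t%s\n", $1, $6, $8  # Id, Time, Info (the query)
        }'
}

# Usage (USER is a placeholder, as in the post above):
#   mysql -uUSER -p -B -e 'SHOW FULL PROCESSLIST' | slow_queries 5
```

Running it a few times while the load is high should show whether the same query keeps turning up.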

Posted

Thanks for the initial advice Peter. I have already tried optimising the tables, but only from the admin CP, and there's been nothing noticeable so far. I also tried rebooting the server from WHM after it became so unresponsive that page loads just weren't happening, and that hung too! I had to raise a support ticket to have the "container" restarted.

Thanks Luis, will give that a try and report back :)

Posted

The process list concerns me somewhat. Can anyone form an opinion on whether Apache is creating an issue here, based on the CPU usage, which doesn't really go down at all?

Bear in mind we only have something like 30-40 guests on the site at the moment. I've also just noticed that eAccelerator is configured and running.

[Attached screenshot: post-116929-0-18046700-1333295268_thumb]

Posted

Apache uses a separate thread for each connection (that looks like your setup from what I can see), so there's no way two threads should be using all of your available CPU. Is anything else other than IPB hosted on the same server?

EDIT: What apache modules are you running?

Posted

Are you using DSO as the PHP handler? (I'm guessing so, since the processes run as nobody.) This is what I would do:

- Drop DSO and switch to mod_ruid2 (it's amazing)
- Update your PHP to 5.3.10 :P

Posted

Yes, 2 wordpress sites and another download site which draw hardly any visitors at all :)

Posted




Okay, a few things. First, is anything going on in your Apache error logs? That would be a good place to start.

Secondly, run top again, get a PID which is causing a problem (in your screenshot, an example PID would be 7742). Issue this command and take note of the output:

lsof -p 7742



Where 7742 is the PID. This lists everything that process has open: files, libraries and network connections (the connection is usually the last entry). From the connection you can get the client's IP address, then check your Apache access log to see what that user requested and perhaps find a URL which is causing the problem. If you want to PM me the output of that command, I can work out what to search your logs for and send it over. I wouldn't post it publicly, as it reveals some path information about your server as well as its IP and the user's, so if you'd rather not, that's fine.
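
Once lsof has given you a client IP, the access log lookup could go something like this. It's only a sketch and assumes the log is in Apache's common/combined format, where the first field is the client IP and the seventh is the requested path. The function name, the IP 203.0.113.5 and the log path are placeholders.

```shell
# Tally the URLs requested by one client IP, busiest first.
# Reads an Apache access log (common/combined format) on stdin.
requests_from_ip() {
    awk -v ip="$1" '
        $1 == ip { hits[$7]++ }    # $1 = client IP, $7 = request path
        END { for (path in hits) print hits[path], path }
    ' | sort -k1,1nr -k2
}

# Usage (placeholder IP and log path):
#   requests_from_ip 203.0.113.5 < /path/to/apache/access_log | head
```

A single path with a far higher count than the rest is the first thing worth looking at.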


Quote:
Are you using DSO as a handler? (based on the nobody). This is what I would do.
-Drop DSO and switch to mod_ruid2 (its amazing)
-Update your php to 5.3.10 :P

Operating systems like CentOS that use a package system often report their PHP version as something like 5.3.6, but they carry the security updates/patches backported from later versions.
Posted

@Connor - I've absolutely no idea what that means mate, I'll certainly look into it but if you are able to offer any guidance on how I would go about that in kinda quick-form I'd seriously appreciate it :)

@Peter - I'm on it :)

Posted

Eww, OK, just found the Apache error log and it's not particularly good... it's absolutely massive, downloading just now. On a side note, I took a look at the sites hosted on my VPS and there were actually more than I remembered. They are all mine, but old ones, so I've suspended all of them for now apart from FBB. Take a look at the processes now!!

I'm still trying to figure out where the issue might be coming from, but I guess I can safely, and thankfully, say it doesn't seem to be coming from FBB :)

[Attached screenshot: post-116929-0-64841800-1333300356_thumb]

Posted

Woah woah don't bother downloading it. We just need the last few issues from it.

tail /path/to/apache/log



If you really want to download it at least compress it first!

Posted

First of all you need to know who/what is eating your CPU. For that, run the following command:


ps -eo pcpu,pid,user,args | sort -k1,1nr | head -10

The catch is that you're using DSO, so every PHP request runs as the Apache user (nobody) and you won't be able to tell which account is the culprit. I'd recommend switching to fcgi. Next we need to know if the overload is caused by a memory shortage. For that, run:


vmstat 2 10



Look at the so (swap out) column. Ideally it should be zero; if it stays above zero, the server is swapping and you need to reduce memory usage or add RAM.
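
If you'd rather not eyeball the numbers, something along these lines could count the samples with swap-out activity. It's a sketch that assumes the usual procps `vmstat` layout, where a header row starting with `r` names the columns. The function name is made up.

```shell
# Count how many vmstat samples show swap-out activity (so > 0).
# Reads `vmstat` output on stdin and locates the "so" column by name,
# so it does not depend on the exact column position.
swap_out_samples() {
    awk '
        $1 == "r" {                      # header row naming the columns
            for (i = 1; i <= NF; i++) if ($i == "so") col = i
            next
        }
        col && $col ~ /^[0-9]+$/ && $col > 0 { n++ }
        END { print n + 0 }
    '
}

# Usage: ten samples, two seconds apart, as in the post above
#   vmstat 2 10 | swap_out_samples
```

Note that the first sample vmstat prints is an average since boot, so a non-zero value there on its own is less telling than non-zero values in the later samples.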

Posted

OK, Peter very kindly took a closer look at what was happening in the logs, etc., and it looks like my file downloads may be causing the issue. When I can't link directly to a developer's website for a file download, I download the file myself, upload it via FTP and then link to it in IPDownloads. A while back I considered using a CDN instead but went off the idea after I realised it would cripple me financially. I never changed the initial setup, and the site has just gone boom over the past year, so I've not had the time to start changing anything. As I say, Peter has looked through just about everything and most of it seems perfectly normal, so the question mark hangs over the file downloads. That's something I'm going to work on very shortly, and hopefully a mod I've been planning for some time will remove the need to serve downloads at all :)
Thanks to everyone for the help, especially Peter, who spent a while on this one with me.
Marko

Posted

Just a final update on this one: Peter spent a great deal of time walking me through setting up an nginx server to serve our locally held downloads, which certainly seems to have helped considerably. I have also suspended a few of the sites on the server that weren't getting used much and that I no longer have the time to administer. So all in all, a great outcome. Thanks a million again to all concerned for their input, and to Peter for his time :thumbsup:

  • 2 weeks later...
Posted

Well, unfortunately it would appear my issue is back, and I'm beginning to wonder if it has something to do with the recent upgrade to IPB and the changes in IPSEO.

I've just about lost it with my host, who seem to make one excuse after another; basically they don't know what the issue is. Their latest excuse is that Googlebot has been making thousands of requests on my site today, and it would appear they are blaming Googlebot for the past few weeks as well. According to our host, Googlebot made 24,000 hits on our site in the last 5 hours, so I reduced the crawl rate in Webmaster Tools to hopefully slow it down, but 24,000 hits in 5 hours???? What the heck?!

However, before the upgrade I didn't have these issues, so I'm left wondering whether the recent "SEO improvements" have contributed to this. I still find it strange that it only happens at weekends; during the week I hardly see a spike on the server at all, and it always starts kicking off around Sunday!!
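
For scale, 24,000 hits in 5 hours works out to 4,800 an hour, or 80 a minute (roughly 1.3 requests per second). One way to check the host's figure against your own logs is an hourly tally like the sketch below. It assumes the Apache combined log format, where the user agent appears on each line and the fourth field holds the timestamp. The function name and log path are placeholders.

```shell
# Count requests per hour whose log line mentions a given user agent.
# Reads an Apache combined-format access log on stdin; $4 looks like
# "[22/Apr/2012:14:05:09", so characters 2-15 are "22/Apr/2012:14".
hits_per_hour() {
    awk -v agent="$1" '
        index(tolower($0), tolower(agent)) { hour[substr($4, 2, 14)]++ }
        END { for (h in hour) print h, hour[h] }
    ' | sort
}

# Usage (placeholder log path):
#   hits_per_hour Googlebot < /path/to/apache/access_log
```

The match is a plain case-insensitive substring test on the whole line, so anything that merely claims to be Googlebot in its user agent is counted too. The output is sorted as text, which is chronological within a single day but not across months.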

Posted

Maybe set up an hourly cron job for the WHM version of mysqlmymonlite.sh (the readme.txt in the zip has a how-to) and log all server stats to timestamped log files every hour for, say, 24-48 hours. Then review the logs to see what's happening at the times of high CPU load.

Of course it could also be true that your server is being hammered by bots etc., but check Google Webmaster Tools > Diagnostics > Crawl stats to see what activity there is, if it's up to date.
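
As a rough illustration of the hourly logging idea, and not a stand-in for mysqlmymonlite.sh itself, a generic snapshot function might look like this. The function name, the default directory and the script path in the crontab line are all hypothetical.

```shell
# Write one timestamped snapshot file into directory $1 and print its path.
# Uses only generic tools; this is NOT mysqlmymonlite.sh.
snapshot_stats() {
    local dir="${1:-/root/loadlogs}"      # hypothetical default directory
    local out
    mkdir -p "$dir" || return 1
    out="$dir/stats-$(date +%Y%m%d-%H%M%S).log"
    {
        echo "== $(date) =="
        uptime 2>/dev/null || true        # load averages
        # Ten busiest processes by CPU (procps ps supports --sort)
        ps -eo pcpu,pmem,pid,user,args --sort=-pcpu 2>/dev/null | head -n 11 || true
    } > "$out"
    echo "$out"
}

# Example crontab entry (hypothetical script path), minute 0 of every hour:
#   0 * * * * /root/bin/snapshot_stats.sh
```

Writing one file per run keeps each snapshot separate, so the ones from a slow Sunday can be compared directly with a quiet weekday.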

Posted

Personally I can't help feeling my host is talking bull. I've restricted the bots, yet our loads are still up at 3-4 and browsing the site is a complete waste of time. It's a managed VPS because I know virtually nothing about managing a server, so I think it's time to look for another host. I get the feeling the host just takes a quick look at the first stat they find interesting and blames that!

Posted

Yep, already banned a few bots in the server firewall. I just banned another IP which was taking up nearly 60% of the CPU for some reason (I couldn't identify what it was), and now the server load is down to 0.02 instead of the 5-6 it was before the ban!!!
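
For spotting heavy hitters like that before reaching for the firewall, a tally of requests per client IP is a reasonable first look. This sketch assumes an Apache access log with the client IP as the first field. The function name and log path are placeholders.

```shell
# List the N busiest client IPs (default 10) in an access log read from stdin.
top_clients() {
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' |
        sort -k1,1nr -k2 | head -n "${1:-10}"
}

# Usage (placeholder log path):
#   top_clients 10 < /path/to/apache/access_log
```

Request counts are only a proxy for CPU use, since one expensive URL can cost more than thousands of cheap ones, so treat the list as a starting point rather than proof.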

Posted

My best guess is that it is Baidu. It always had an active presence on a site I work on, with about 10 IPs crawling the pages.

It actually forced us to move to a VPS, because the host suspended us at the time. Luckily the VPS price was the same as our existing hosting, so that was awesome.

From what I remember, Baidu kept coming back after being blocked. This site helped a lot (it also lists other bots that don't obey robots.txt): http://www.netnuisance.net/ip/se.php

Posted

Keep in mind a VPS is still shared, so other VPSes on the same box can affect your sites no matter what "sales spin" they put on the term VPS.

Many people think a VPS is independent of the other VPSes on the same machine, and that is false. So while you have been pulling your hair out looking for an issue, it could very well have been something not related to your site at all.

Archived

This topic is now archived and is closed to further replies.
