
Something is causing runaway PHP processes and I can't find what


Recommended Posts

Hi All,

So, once or twice a week my site goes down at the moment. When I log in to the server, I see a heap of php-cgi processes with high (20%+) CPU usage.

What I know:

  • New Relic reports the time is spent on /forums/index.php, which doesn't really help: I don't think ?app=blah would register as a separate transaction, so I can't rely on this.
  • New Relic confirms that the number of processes does not change. The PHP process count stays at 51, which is what it is set to.
  • New Relic also confirms that I have a lot of available memory: 6 GB free at the last crash, so it's all CPU.

So, I have no idea where to look. I keep throwing more resources at my site and it keeps hitting the limits.

MySQL is running beautifully; the slowest query during my last downtime was a whopping 18 ms.

Note that this is all in AWS, with the DB in RDS. The server WAS load balanced with auto scaling, but the scaling was all over the place because of this issue, so it's now just one massive single instance: 4 CPUs and 15 GB of memory. For the record, in peak times load will sit around 3 to 4, so the instance should be big enough. It's just whatever is causing these runaway processes that is killing everything.

Would love some help. Please!!

Thanks,
Christian

Do you have image processing tasks? On my site, an image processing task like adding a watermark sometimes takes a long time, and the process ends up looking like it has been running forever.

My solution is simply a cron job that runs every 5 minutes and kills any PHP process that has been running for longer than 5 minutes.
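For what it's worth, a minimal sketch of that approach might look like the script below. The `php-cgi` process name is taken from the original post, the path and filename are placeholders, and `ps`'s `etimes` column needs a reasonably recent procps:

```shell
#!/bin/sh
# kill-stale-php.sh -- kill any php-cgi worker that has been running
# for longer than 5 minutes (300 seconds of elapsed time).
# pid=/etimes=/comm= suppress the header line so awk sees only data.
ps -eo pid=,etimes=,comm= \
  | awk '$3 == "php-cgi" && $2 > 300 { print $1 }' \
  | xargs -r kill
```

Scheduled from cron every 5 minutes with something like `*/5 * * * * /usr/local/bin/kill-stale-php.sh`.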


I don't have much in the way of image processing, but I do like the idea of killing processes; setting a low 'max requests' doesn't seem to do it.

This seems to be a more standard way of handling things these days. Can you show me an example cron? Do you run a script or do it from within cron?

Thanks.

 


Can you post some more stats? 'top' for starters. See sticky.

You probably peak at 51 processes because your config caps it at 50 worker processes, plus 1 for the controller.

What are you running? Apache? suPHP? nginx? php-fpm? Which versions?

Do you have modules/hooks/etc installed? Have you tried disabling them?

Also, don't set up cron to kill PHP. There's a setting in php.ini, max_execution_time, that does this.
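A sketch of that php.ini change (the lowered value here is purely illustrative, and one caveat I'm fairly sure of from the PHP docs: on non-Windows systems the limit counts script execution time only, so time the worker spends in system calls or waiting on the database doesn't count against it):

```ini
; php.ini -- cap how long a single request may run (seconds).
; Illustrative value, not a recommendation for this particular site.
max_execution_time = 10
```

After changing it, the php-fastcgi workers need a restart to pick the new value up.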


Can you post some more stats? 'top' for starters. See sticky.

Yep, when it starts going crazy again I'll capture top, and I reckon netstat too, as I *think* it's connection-related (right now load is 1.35 and connections are at 83). I can't believe I missed the sticky, my apologies.
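A sketch of what that capture could look like, ready to run before the next spike (script name and output filenames are just placeholders):

```shell
#!/bin/sh
# snapshot.sh -- grab a one-shot picture of CPU load and TCP connection
# states, timestamped so snapshots from several spikes can be compared.
ts=$(date +%Y%m%d-%H%M%S)
top -b -n 1 > "top-$ts.txt"
# Count connections per TCP state (ESTABLISHED, TIME_WAIT, ...),
# skipping netstat's two header lines.
netstat -tan | awk 'NR > 2 { print $6 }' | sort | uniq -c | sort -rn > "conns-$ts.txt"
```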

At the last wig-out, the instance had several gigs of memory free. Can't remember the exact number, but it was > 3 GB.

You probably peak at 51 processes because your setting probably has a cap of 50 processes. (plus 1 for controller)

Yep, max processes is 50.

What are you running? apache? su_php? nginx? php-fpm? version?

I'm running php-fastcgi, which I don't like as much as php-fpm.

PHP version is 5.4.4
nginx version: nginx/1.2.1

 

Do you have modules/hooks/etc installed? Have you tried disabling them?

I do have a few, but I haven't gone through them, no. They are mainly hooks, not many apps. Since they make up a very large portion of my site's functionality, I'd like to leave this step until later; it may be 5 or 6 days before we see the spikes again, which would mean at least 5 or 6 days without that functionality.

Also, don't set up cron to kill PHP. There's a setting in php.ini, max_execution_time, that does this.

Currently:
max_execution_time = 30
That's the default, I believe. Do we reduce that to stop processes spending too long on a single task?

 

Thanks *so* much for your help, it really is very appreciated!


Though quite late to respond, I'd advise against raising the process limit without knowing what system and resources we have. If you have runaway PHP processes, having MORE runaway PHP processes only makes the problem worse. Also, comparatively speaking, even on my beefy E5-1650 v2s I don't need more than 50 processes to max out the CPU.

You can decrease max_execution_time a lot. It partly depends on what you think and feel is acceptable. If you think ALL pages on the site should load in under 3 seconds and the average is around 100 ms, you could even set it to 3, but you will then likely fail to serve roughly 1 in 1000 normal pages just due to randomness (no calculation here, just a rough estimate). A higher limit is safer and fails less often. It's a balance between killing runaways early and causing avoidable failures.
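That 1-in-1000 intuition can be played with numerically. This is only a toy model with an assumed log-normal latency distribution whose median is about 100 ms, not a measurement of any real site:

```python
import random

random.seed(42)  # deterministic for the sake of the example

def fraction_killed(timeout_s, n=100_000, mu=-2.3, sigma=1.0):
    """Fraction of simulated requests whose latency exceeds timeout_s.

    Latencies are drawn from an assumed log-normal distribution whose
    median is exp(mu) ~= 0.1 s, i.e. the ~100 ms average mentioned above.
    """
    return sum(random.lognormvariate(mu, sigma) > timeout_s
               for _ in range(n)) / n

# A 3 s limit kills a small but nonzero slice of legitimate requests;
# a 30 s limit kills essentially none of them.
print(fraction_killed(3.0))
print(fraction_killed(30.0))
```

The exact fractions depend entirely on the assumed distribution; the point is only that a tighter timeout trades a few false kills for catching runaways sooner.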


The author says he has 4 CPUs and 15 GB of memory; with my recommended maximum, even if he goes that high he will only use about 5 GB, so there is plenty of RAM left for everything else.

Yesterday I solved this exact problem for a client, and the cause was the SQL; that will be my next recommendation to check if the above doesn't work.

But without looking at the server, it's not so easy to find exactly what the problem is.


Thanks guys,

I have had a couple more instances of downtime, but interestingly the system now seems to resolve the issue itself: where I used to have to restart PHP, it now settles on its own.

I like the idea of dropping max_execution_time. New Relic reports about 250 ms of app server time, so dropping it right down to 3 or 4 seconds might be a good win.

 


Archived

This topic is now archived and is closed to further replies.
