
Excessive 502 errors



Posted

Hello,

We are currently running PHP 7.3.5 with nginx 1.16.0. IPS 4.4.3.

PHP is running as php-fpm.

While navigating our website, we get random 502 errors. As far as I can tell, this is a timeout error between nginx and php-fpm, but I have no clue how to debug and fix this.

php-fpm status page:

pool:                 www
process manager:      dynamic
start time:           08/May/2019:23:54:43 -0300
start since:          554034
accepted conn:        16438369
listen queue:         15
max listen queue:     133
listen queue len:     511
idle processes:       0
active processes:     29
total processes:      29
max active processes: 99
max children reached: 0
slow requests:        0

I've attached the php-fpm config files to this post.

Any pointers would be highly appreciated.

php-fpm.conf www.conf

Posted

As you can see, your listen queue is not 0, which means that PHP runs out of resources and requests have to wait. Depending on your available memory, you should set pm.max_children higher than the current 256. Tweak it so that your listen queue never goes above 0.

Here is mine (restarted this morning for a new kernel, but I watch it closely); the listen queue never changes from 0:

pool:                 www
process manager:      dynamic
start time:           15/May/2019:10:29:10 +0200
start since:          24323
accepted conn:        89294
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       14
active processes:     2
total processes:      16
max active processes: 16
max children reached: 0
slow requests:        0

And here is my www.conf:

pm = dynamic
pm.max_children = 512
pm.start_servers = 16
pm.min_spare_servers = 8
pm.max_spare_servers = 16
pm.max_requests = 1000
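
The check described in this reply (a listen queue above 0 means requests are waiting) can be automated. What follows is only a minimal Python sketch, not part of php-fpm: it parses the plain-text status output as pasted in this thread and lists the conditions worth a look. The function names `parse_fpm_status` and `queue_warnings` are made up for illustration, and the sample values are copied from the first post.

```python
# Sample copied from the status output in the first post of this thread.
SAMPLE_STATUS = """\
pool:                 www
process manager:      dynamic
start time:           08/May/2019:23:54:43 -0300
listen queue:         15
max listen queue:     133
idle processes:       0
active processes:     29
total processes:      29
max children reached: 0
"""


def parse_fpm_status(text):
    """Turn 'key: value' lines of the plain-text status output into a dict."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        # Keep counters as int, everything else as str.
        status[key.strip()] = int(value) if value.isdigit() else value
    return status


def queue_warnings(status):
    """Return human-readable hints for the conditions discussed in the thread."""
    warnings = []
    if status.get("listen queue", 0) > 0:
        warnings.append("requests are waiting in the listen queue")
    if status.get("idle processes") == 0:
        warnings.append("no idle workers available")
    if status.get("max children reached", 0) > 0:
        warnings.append("pm.max_children was hit at least once")
    return warnings


if __name__ == "__main__":
    for hint in queue_warnings(parse_fpm_status(SAMPLE_STATUS)):
        print(hint)
```

Run against the first post's numbers, this flags the non-zero listen queue and the lack of idle workers, while "max children reached" stays quiet because that counter is 0.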

 

Posted
2 hours ago, bfarber said:

Do you see anything in your php error logs, or in your webserver error logs, that might point to the issue?

Not really, php error log is empty, and nginx error log only says "502". I believe @b416 got me a really nice starting point. Thanks, @b416!!!! Tweaked here as suggested and will let you know what happens.

1 hour ago, opentype said:

Did it start with 7.3 as people reported here?

No, it also happened with 7.2 and I was too lazy to take a better look into this.

 

Posted

@b416 Just got a 502 error here, listen queue is 0. (pm.max_children = 512)

pool:                 www
process manager:      dynamic
start time:           15/May/2019:13:50:39 -0300
start since:          12175
accepted conn:        594918
listen queue:         0
max listen queue:     71
listen queue len:     511
idle processes:       27
active processes:     16
total processes:      43
max active processes: 72
max children reached: 0
slow requests:        0

 

Posted

What do you have in your mpm_event.conf? Here is mine:

 

ServerLimit               500
StartServers                4
MinSpareThreads            25
MaxSpareThreads            75
ThreadLimit                64
ThreadsPerChild            25
MaxRequestWorkers         500
MaxConnectionsPerChild   1000

 

Posted

Sorry, I overlooked that. 

You should increase your pm.start_servers and pm.max_spare_servers. You are right that you did not reach max children, but your max active processes reached 72, while you probably start with only about 10 servers and keep at most 10 spare.

Posted

Hi, I enabled some logging here to take a deeper look into this. Thanks.

Update: setting the php-fpm log level to NOTICE gave me several insights, as the log points out which parameters must be increased and why. I will try that and keep you posted.

[16-May-2019 12:31:33] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 3 idle, and 31 total children
[16-May-2019 12:31:34] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 3 idle, and 36 total children
[16-May-2019 12:31:35] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 5 idle, and 41 total children

This seems to be the origin of the 502 errors! 🙂
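
To see how often the pool reports being busy, the warnings can be pulled out of the log with a short script. This is a rough Python sketch under the assumption that the lines look exactly like the ones pasted above; the pattern `BUSY_RE` and the function `parse_busy_warnings` are illustrative names, not anything shipped with php-fpm.

```python
import re

# Two of the warning lines pasted in this post.
SAMPLE_LOG = """\
[16-May-2019 12:31:33] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 3 idle, and 31 total children
[16-May-2019 12:31:35] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 5 idle, and 41 total children
"""

# Matches the "seems busy" warning format shown in this thread.
BUSY_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] seems busy .*?"
    r"spawning (?P<spawning>\d+) children, "
    r"there are (?P<idle>\d+) idle, and (?P<total>\d+) total children"
)


def parse_busy_warnings(log_text):
    """Extract pool name and the three counters from each 'seems busy' line."""
    events = []
    for line in log_text.splitlines():
        match = BUSY_RE.search(line)
        if match:
            events.append({
                "pool": match.group("pool"),
                "spawning": int(match.group("spawning")),
                "idle": int(match.group("idle")),
                "total": int(match.group("total")),
            })
    return events


if __name__ == "__main__":
    for event in parse_busy_warnings(SAMPLE_LOG):
        print(event)
```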

 

Posted

@opentype I can confirm that the excessive number of 502 errors was caused by a bug in PHP 7.3 where random segmentation faults occur. More details here: 

After downgrading to 7.2, the 502 errors are now gone.

I do, however, still need to fine-tune my www.conf file a little.

I see the following advice in my php-fpm.log file:

[19-May-2019 19:51:58] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 63 idle, and 71 total children
[19-May-2019 19:51:59] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 63 idle, and 72 total children
[19-May-2019 19:52:00] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 59 idle, and 73 total children
[20-May-2019 08:57:57] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 110 total children
[20-May-2019 08:57:58] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 118 total children
[20-May-2019 08:57:59] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 134 total children
[20-May-2019 08:58:00] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 24 idle, and 166 total children

What I don't understand is why php-fpm is spawning new children when there are more than enough idle processes... I will increase pm.min_spare_servers to see what happens.

Currently we have:

pm.max_children = 192
pm.start_servers = 64
pm.min_spare_servers = 64
pm.max_spare_servers = 128

I increased pm.start_servers and pm.min_spare_servers to 96 and will let you know what happens.
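
One possible reading of the log lines in this post, offered as an assumption rather than a confirmed description of php-fpm internals: with pm = dynamic, new children seem to be started whenever the idle count is below pm.min_spare_servers, and the number in "spawning N children" looks like a spawn rate that doubles (8, 16, 32) rather than the number actually started. With pm.min_spare_servers = 64 (the value listed before the increase), 63 idle is still one short. The Python sketch below is a simplified model of that reading; `children_to_fork` is a hypothetical helper.

```python
def children_to_fork(idle, total, min_spare, max_children, spawn_rate):
    """Simplified model: how many workers would be started in one check.

    Assumption: workers are added only while idle < min_spare, limited by
    the current spawn rate and by the room left under max_children.
    """
    if idle >= min_spare:
        return 0
    wanted = min(spawn_rate, min_spare - idle)
    return max(0, min(wanted, max_children - total))


if __name__ == "__main__":
    # (spawn rate, idle, total) taken from the 19-May and 20-May log lines,
    # with pm.min_spare_servers = 64 and pm.max_children = 192 as posted.
    logged = [(8, 63, 71), (16, 63, 72), (32, 59, 73),
              (8, 0, 110), (16, 0, 118), (32, 0, 134)]
    for rate, idle, total in logged:
        n = children_to_fork(idle, total, min_spare=64, max_children=192,
                             spawn_rate=rate)
        print(f"idle={idle} total={total} rate={rate} -> start {n}")
```

Under this model the 19-May lines add only one child at a time (71 to 72 to 73), and the 20-May lines add 8, 16, and 32 (110 to 118 to 134 to 166), which is consistent with the totals in the log.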

Posted

My recommendation for you:

pm = dynamic <-- In theory, static performs a bit better because all workers are already running and waiting for connections, but it means higher load and higher resource usage. Most of the time that is not needed, and it leaves fewer resources for the other software on the server.

pm.max_children = 120  <-- I know you have sometimes seen "max children reached" in your logs, but if you raise this value a lot just to avoid that message, you will probably get stability issues and many more 502 errors. Most of the time the limit is hit by a spike from some scripts, and it is better for them to accept a small delay than to push your system into 502 errors and instability.

pm.start_servers = 48

pm.min_spare_servers = 24

pm.max_spare_servers = 96

pm.max_requests = 5000

 

Your testing value:

pm.max_children = 192

If all workers are active, your server will need around 20 GB for php-fpm alone (192 × 100 MB ≈ 19.2 GB). The average process size is around 70-100 MB; I prefer to calculate with 100 MB, which is what I did in your case, as I don't have access to your server.
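
The arithmetic behind that estimate can be written down as a small Python sketch. The 100 MB per process is the rough figure used in this post, not a measurement from the server, and both function names are hypothetical.

```python
def fpm_memory_mb(max_children, avg_process_mb=100):
    """Worst-case memory if every worker is active at once."""
    return max_children * avg_process_mb


def max_children_for(budget_mb, avg_process_mb=100):
    """Largest pm.max_children that fits into a given memory budget."""
    return budget_mb // avg_process_mb


if __name__ == "__main__":
    for children in (120, 192):
        total_mb = fpm_memory_mb(children)
        print(f"pm.max_children = {children}: about {total_mb / 1000:.1f} GB")
```

Measuring the real average worker size on the server and passing it as `avg_process_mb` would give a tighter estimate than the 100 MB default.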

Now, if you still have issues with the above numbers, you may need to check your scripts and add-ons for poorly optimized queries or very long-running processes.

Of course, MySQL optimization is important too!

 

 

 

 

Archived

This topic is now archived and is closed to further replies.
