Huge increase in server load after upgrading to 4.4.x


Posted
On 3/23/2019 at 9:10 AM, ExiledVip3r said:

Update: After enabling php-fpm's slow log and examining it, almost every entry was related to /applications/core/sources/AdminNotification/AdminNotification.php.

Commenting out every instance of \IPS\core\AdminNotification in init.php and truncating the core_acp_notifications table (which had 67,000+ rows) instantly took my site from unusably slow to loading instantly. CPU usage has gone from 98% to 16%. It's only been 30 minutes or so, however, and it's the middle of the night. Hopefully it lasts through the rest of the day, but I'm actually hopeful for the first time since 4.4's release 😕

Thanks for your response and recommendation.
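For anyone who wants to repeat this kind of slow-log check, here is a minimal sketch in Python that counts how many slow-log entries mention each source file. It assumes the usual php-fpm slow log layout (a dated header line per entry, followed by one stack frame per line); the function name is made up for illustration and is not part of IPS or php-fpm.

```python
import re
from collections import Counter

# Assumed php-fpm slow log layout: a "[date time]  [pool name] pid N" header,
# a "script_filename = ..." line, then one
# "[0x...] function() /path/file.php:LINE" line per stack frame.
HEADER = re.compile(r"^\[\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\]")
FRAME = re.compile(r"^\[0x[0-9a-fA-F]+\]\s+\S+\s+(?P<file>.+):\d+$")


def files_in_slow_entries(log_text):
    """Count, for each source file, how many slow-log entries mention it."""
    counts = Counter()
    current = set()  # files seen in the entry currently being read
    for raw in log_text.splitlines():
        line = raw.strip()
        if HEADER.match(line):
            counts.update(current)  # close the previous entry
            current = set()
            continue
        frame = FRAME.match(line)
        if frame:
            current.add(frame.group("file"))
    counts.update(current)  # close the last entry
    return counts
```

Calling `files_in_slow_entries(text).most_common(10)` on the contents of the slow log should put a file that shows up in almost every entry at the top of the list.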

@ExiledVip3r, has this definitely resolved your slowness? Could it be used as a temporary solution?

Posted

@bfarber and everybody else,

After weeks of going through everything with a fine-tooth comb, we were able to find the culprits and fix the high load issue we were facing (or sort of; there is still an issue regarding Redis, but more on that later in this post).

The high-load issues on both web and MySQL servers were being caused by several slow, search-related queries being launched at the same time, by two specific third-party plugins. So, the issue wasn't related to the core software.

One was a custom plugin, and the developer we use was able to debug and fix it. It was caused by a minor change in the IPS framework from 4.3 to 4.4 and another adjustment that was needed. This plugin was caching data for 24 hours, and when this cache expired, it was trying to recache everything at once, thus launching a huge number of search-related queries at the same time. This is what was causing the MySQL server load to skyrocket once a day.
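A common way to avoid this kind of "everything expires at once" pattern is to add a little random jitter to each cache lifetime, so entries written together do not all expire together. The sketch below only illustrates that general technique in Python; it is not the fix our developer applied, and the function name and the 10% spread are assumptions.

```python
import random


def jittered_ttl(base_seconds, spread=0.1, rng=random):
    """Return base_seconds shifted by a random offset of at most +/- spread.

    spread is a fraction of base_seconds (0.1 means up to 10% either way).
    """
    if not 0 <= spread < 1:
        raise ValueError("spread must be in [0, 1)")
    offset = rng.uniform(-spread, spread) * base_seconds
    return max(1, int(round(base_seconds + offset)))
```

With a 24-hour base (86400 seconds) and the default spread, each entry would get a lifetime somewhere between roughly 21.6 and 26.4 hours, which spreads the recaching over several hours instead of a single moment.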

The second plugin is one available at the Marketplace: 

This plugin was launching too many queries at the same time, overloading the MySQL server as well. This plugin might work well on smaller, less busy forums, but with over 1.2 million topics and thousands of users online, it was hitting performance really hard. A suggestion to @Adriano Faria would be to add a setting that limits the search to topics posted within the past "x" days (e.g., the past 180 days). Therefore, we uninstalled this plugin here.
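To make the "past x days" suggestion concrete, here is a small Python sketch of how such a cutoff could be computed as a Unix timestamp. How the plugin would actually apply it (for example as a date condition on its search) is an assumption on my part; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def search_cutoff(days, now=None):
    """Return the Unix timestamp that lies `days` days before `now` (UTC)."""
    if days < 0:
        raise ValueError("days must not be negative")
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp())
```

A search limited to `search_cutoff(180)` would then only have to consider topics newer than that timestamp rather than all 1.2 million.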

With these changes, loads dropped to levels below what we were seeing with 4.3.x.

Now, I went ahead and installed Redis. It really dropped load at the MySQL server. Maximum load I am seeing during the day is below 0.30, which is great and well below the 0.50 I was seeing without Redis. At night, load drops to 0.10~0.20, which is impressive.

Now, for the webserver, I still need some advice. At night, load is around 9 (way below what we were seeing with memcached), but during the day, when our traffic is heavier, I am seeing loads still in the 30~50 range (way above what we were seeing with memcached), with redis-server hitting 100% CPU load:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18476 redis     20   0 1261768 250388   2560 R  97.0  0.8  21:16.89 redis-serv+

The only two lines I changed at redis.conf were:

maxmemory 4gb
maxmemory-policy allkeys-lru

Everything else is at its default value.

I don't know how to debug this Redis issue. I also don't know if there is an optimization tool available.
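One possible starting point, sketched here in Python: `redis-cli INFO commandstats` prints one `cmdstat_<command>:calls=...,usec=...,usec_per_call=...` line per command, and ranking those lines by total time shows which commands keep redis-server busy. The helper names below are made up for illustration, and the sketch only parses text that has already been captured from that command.

```python
def parse_commandstats(text):
    """Parse `INFO commandstats` output into {command: {field: number}}."""
    stats = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip the "# Commandstats" header and blank lines
        name, _, fields = line.partition(":")
        values = {}
        for pair in fields.split(","):
            key, _, value = pair.partition("=")
            if key and value:
                values[key] = float(value)
        stats[name[len("cmdstat_"):]] = values
    return stats


def busiest_commands(stats, top=5):
    """Return the `top` commands ordered by total microseconds spent."""
    ranked = sorted(stats.items(),
                    key=lambda item: item[1].get("usec", 0.0),
                    reverse=True)
    return ranked[:top]
```

`redis-cli SLOWLOG GET` is the other built-in place to look; it lists individual commands that exceeded the configured slow log threshold.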

The significant entries in our constants.php file are:

\define( 'REDIS_ENABLED', true );
\define( 'STORE_METHOD', 'Redis' );
\define( 'STORE_CONFIG', '[]' );
\define( 'CACHE_METHOD', 'Redis' );
\define( 'REDIS_CONFIG', '{"server":"127.0.0.1","port":6379,"password":""}' );
\define( 'CACHE_PAGE_TIMEOUT', 0 );

Any pointers would be really welcome.

Thank you!

Posted
13 minutes ago, Gabriel Torres said:

The second plugin is one available at the Marketplace: 

This plugin was launching too many queries at the same time, overloading the MySQL server as well. This plugin might work well on smaller, less busy forums, but with over 1.2 million topics and thousands of users online, it was hitting performance really hard. A suggestion to @Adriano Faria would be to add a setting that limits the search to topics posted within the past "x" days (e.g., the past 180 days).

It uses the IPS4 default search system and is only fired when someone tries to create a content item (after leaving the TITLE field), so it may be an issue on a board as busy as yours (easily 4,000 users online). You can't use any resource.... I'll take a look at restricting it.

Btw, it's curious: you have been using this since IPS 4.1 or 4.2, and the increase happened only now?

Posted

Hi, we have a very big problem with speed on our site, audiostereo.pl.

It was fine for some time with 4.4.2, but for about 2 weeks it has been very slow. Page loads are maybe not so slow, but adding posts is a lottery. Sometimes it takes 5 seconds, sometimes 30, or even 60+ seconds. Our users are mad. We didn't change anything in the last few days. Traffic is at the same level (Google Analytics shows it even a bit lower). Users also report that Tapatalk runs very slowly. They told us that the timeline takes forever to load. We have 2 servers: one for Apache and files and a second for MySQL. We have new MySQL and PHP versions on them, and the machines are new (the MySQL one is from 2 months ago, and the Apache one is from the beginning of 2018), so they are not too slow.

We tried enabling the cache in the ACP, but after that it was even slower (the whole website was slow), so we are back to no cache now. But with no cache and IPS 4.4 it was fine a few weeks ago (max 3-5 seconds to add a post).

I have disabled all plugins and so on to be sure, but it didn't help.

Please help us.

Zrzut ekranu z 2019-04-19 11-44-02.png

Posted

We were using memcached for a while, but the whole site was slower than without a cache, so as I mentioned the cache is disabled now. Today I only switched the guest cache to 30 minutes, and maybe that will help a bit.

I have also reconfigured the search engine. The cutoff was disabled, so I have now set it to 5 years. Adding posts seems to be faster, but it's still reindexing, so we will see once it finishes. This table was big (more than 4 GB), so it may be related.

I saw that 4.4.3 came out yesterday. Any optimizations? Should we update to it?

Here is the slow query log:

slow01dj0ij1e.tar.gz

Posted

@wegorz23

1. What loads (e.g. using the top command) are you seeing on your servers? What numbers were you seeing before?

2. Try enabling the slow logs for both MySQL and php-fpm

3. Is MySQL running MyISAM or InnoDB? Are you using MySQL itself or a drop-in replacement such as Percona or MariaDB?

4. As suggested by almost everyone here, move from memcached to Redis

5. Are you using a CDN such as CloudFlare? If not, you should.

Other people will certainly chime in with more advice.

As you can see, there is too much information missing from your posts... 😉

Posted

I'll be back with more info from my admin.

1. The Apache server is at the top.

2. We don't have php-fpm on the server right now. I will ask our admin about it.

3. We are using MariaDB, as far as I know. It's mixed: some tables are InnoDB, some are MyISAM - https://ibb.co/fkRGzbq

4. We will check it, but we had already tried memcache(d) and APC, and it was slower than without them.

5. I don't think we have one.

Zrzut ekranu z 2019-04-25 13-10-28.png

Posted
17 hours ago, Gabriel Torres said:

The only two lines I changed at redis.conf were:


maxmemory 4gb
maxmemory-policy allkeys-lru

Everything else is at its default value.

I don't know how to debug this Redis issue. I also don't know if there is an optimization tool available.

Good job and thanks for sharing.

Could you tell us why you changed the Redis maxmemory-policy?

I mostly check the Redis slow log using phpredmin, which lets me check the db, stored keys, config, stats and the slow log. It's the best lightweight tool I have found so far.
That, plus Linux top etc., to check whether memory, CPU etc. are doing fine. We have about 1000 concurrent users, with 1500-1800 at peak times these days...

Posted
17 hours ago, DSyste said:

appendonly yes
appendfsync everysec

@DSyste Thanks a lot, now with full traffic our server load is low, these configuration lines fixed the issue of high load with redis-server! 🙂

5 hours ago, Thomas P said:

Could you tell us why you changed the Redis maxmemory-policy?

@Thomas P I was trying to fix the issue I described above and tried playing with that. It seems this configuration together with the two lines posted by DSyste provided the best performance here for us. However, I still need to run a diagnostics tool to see what is going on with Redis and optimize it. Thanks a lot for the suggestion of phpredmin, I will take a look into it and let you guys know if I am able to improve performance even more.

@wegorz23 From the screenshot you posted, your webserver load is relatively low, so your issue is something different from what we are discussing in this topic (I opened this topic to discuss a load issue; on my webserver it was above 40, while on yours it is around 2). From what you describe, it might be something related to network latency, plugins/apps, or something else. I suggest you open your own topic if you haven't done so, so other people can do some independent auditing of your website and give you some ideas. Meanwhile, please do your website a favor and put it behind a CDN such as CloudFlare (they offer a free plan). It will cache images and scripts and improve load times. Also, install and configure Redis. As explained in this topic, IPS is currently optimized for Redis and not so much for memcached/APC. For us, the difference between using memcached and Redis was night and day. I hope this helps. (PS: always make sure you are running the latest versions of PHP/MySQL/etc. For PHP, make sure opcache is enabled.)

Oh, BTW. Upgrading to 4.4.3 also improved performance.

Posted

Thank you. We will definitely try the Redis cache. Is it just a matter of the admin installing it on the server and enabling it in the ACP? Or does it need more configuration? I ask because I saw some posts here with modifications to the config file.

CloudFlare has a free plan? Oh great, I will check it out, but I have never used it, so I hope I can handle it.

We will upgrade to 4.4.3 early Monday morning, I think. Friday is maybe not the best day for updates 🙂

PS. I also have some recommendations from our data center:

General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
      OPTIMIZE TABLE audiostereo.ipb_core_output_cache; -- can free 7984 MB
      OPTIMIZE TABLE audiostereo.ipb_core_mail_error_logs; -- can free 30.4762802124023 MB
    Total freed space after theses OPTIMIZE TABLE : 8014.4762802124 Mb
    Set up a Password for user with the following SQL statement ( SET PASSWORD FOR 'user'@'SpecificDNSorIp' = PASSWORD('secure_password'); )
    Reduce or eliminate persistent connections to reduce connection usage
    Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
    Increasing the query_cache size over 128M may reduce performance
    Adjust your join queries to always utilize indexes
    Increase table_open_cache gradually to avoid file descriptor limits
    Read this before increasing table_open_cache over 64: http://bit.ly/1mi7c4C
    Beware that open_files_limit (1263) variable should be greater than table_open_cache ( 512)
Variables to adjust:
    max_connections (> 200)
    wait_timeout (< 28800)
    interactive_timeout (< 28800)
    query_cache_size (> 128M) [see warning above]
    join_buffer_size (> 256.0K, or always use indexes with joins)
    table_open_cache (> 512)
    innodb_buffer_pool_size (>= 2G) if possible.

Posted

@wegorz23 When you enable Redis at the ACP, you will download a new constants.php file that must be uploaded to the root directory of your website.

It is clear that you are very "green" regarding technical knowledge, so I recommend you start opening topics here: https://invisioncommunity.com/forums/forum/406-server-management-resources-optimization/

People will help you there, as your questions are now off-topic for this specific topic. The recommendations you posted come up because your MySQL tables are MyISAM; you must convert them to InnoDB. Open a topic at the above forum and people will help you. There, describe your MySQL server configuration and post your my.cnf file. You should set innodb_buffer_pool_size to at least 70% of your server's RAM.
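As a quick illustration of the 70% rule of thumb, here is the arithmetic as a small Python sketch. It assumes a server dedicated to the database; the function names are made up, and the percentage is just the figure mentioned above, not a value confirmed for any particular setup.

```python
def buffer_pool_bytes(total_ram_bytes, percent=70):
    """Return `percent` percent of total RAM, in bytes (integer arithmetic)."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return total_ram_bytes * percent // 100


def my_cnf_line(size_bytes):
    """Format a size in bytes as a my.cnf line, rounded down to whole MiB."""
    return f"innodb_buffer_pool_size = {size_bytes // (1024 ** 2)}M"
```

For a machine with 32 GiB of RAM, `my_cnf_line(buffer_pool_bytes(32 * 1024 ** 3))` gives `innodb_buffer_pool_size = 22937M`.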

-----

Regarding my specific case: I found out today that the www.conf for my php-fpm pool was in need of some tweaking! 🙂

  • 2 weeks later...
Posted
On 4/25/2019 at 1:35 AM, DSyste said:

Here I added the following lines in /etc/redis.conf:
appendonly yes
appendfsync everysec

I also prioritized memory for Redis in /etc/sysctl.conf:
vm.overcommit_memory = 1

I liked this tutorial about it -> https://www.linode.com/docs/databases/redis/install-and-configure-redis-on-centos-7/

@DSyste May I ask why you bother with data persistence?
We have been using Redis since 4.4.x and it works quite well imo, and I am looking to tweak the performance even more.

As far as I understand the documentation, you can use the default RDB file or the AOF log to achieve data persistence.

Regarding performance, I guess these sections are relevant (see this Redis doc page):

Quote

RDB advantages
(...)
RDB maximizes Redis performances since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or alike.
(...)
AOF disadvantages
AOF can be slower than RDB depending on the exact fsync policy. In general with fsync set to every second performances are still very high, and with fsync disabled it should be exactly as fast as RDB even under high load.

Still RDB is able to provide more guarantees about the maximum latency even in the case of an huge write load.

So I would not use RDB + AOF as they recommend in the documentation, as I don't think the cache data is important enough to retain at all costs.
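To see at a glance which persistence directives a given redis.conf actually sets, something like the following Python sketch could be used. It only reports directives that appear in the text (it does not try to infer Redis' built-in defaults), and the function name is hypothetical.

```python
import shlex


def persistence_summary(conf_text):
    """Summarize the persistence directives found in redis.conf text."""
    summary = {"appendonly": None, "appendfsync": None, "save": []}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        try:
            parts = shlex.split(line)  # handles quoted values such as save ""
        except ValueError:
            parts = line.split()  # unbalanced quotes: fall back to plain split
        name, args = parts[0].lower(), parts[1:]
        if name in ("appendonly", "appendfsync") and args:
            summary[name] = args[0].lower()  # a later line overrides an earlier one
        elif name == "save":
            if args == [""]:
                summary["save"] = []  # save "" switches RDB snapshots off
            else:
                summary["save"].append(tuple(args))
    return summary
```

For a file containing the two AOF lines quoted above, this returns `appendonly` as `"yes"` and `appendfsync` as `"everysec"`, with `save` listing whatever snapshot rules are also present.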

Have you noticed any advantages using AOF instead of RDB?

Thanks,
Thomas

Posted

Hi folks! Another follow-up to my original issue.

I upgraded PHP yesterday from 7.2.x to 7.3.x, and it really improved performance here. Load dropped even more, and I am proud to say that right now our webserver load is as low as it has ever been!

Another tweak we did here: installed the phpredis module and configured php to store sessions in memory!

https://github.com/phpredis/phpredis

session.save_handler = redis
session.save_path = "tcp://localhost:6379?auth=yourverycomplexpasswordhere"

 

Posted

I should note that we use a custom session save handler, so that change won't do anything at all for our software (except for upgrading, but that's a rare and quick process only executed by one user once in a blue moon, not something hit by every user on every page load).

  • 2 weeks later...
Posted

I'm about to upgrade our site to 4.4.x and this has been helpful. Thank you everyone!

I did have problems with php-fpm when we migrated our big board (2 frontends / 2 database servers) over from vBulletin back in January. The solution that worked for us was to get a dedicated server for Redis (w/ lots of memory) and move all static files to S3. (So, hopefully this won't be a problem for us. Crossing fingers.)

  • 1 month later...
Posted

I've done most things mentioned in this thread to increase performance, with the exception of using Redis instead of Memcache. All of my testing indicates that Memcache is faster than Redis, so I am running Memcache.

The main reason for my post, is that through all of my server tweaks, database setting changes, network tweaks and apache setting changes, the Pages app still has a measurable and unexplained slower TTFB than the forum app does. 

I've run literally hundreds of tests, and there is no doubt in my mind, or in the minds of the experts I've consulted, that the Pages app is inherently slower than the forum. In my case there is at least a 0.80 to 1.0 second delay in the TTFB for my Pages app in comparison to the forum app.

Are others also seeing this? I suspect a coding issue with Pages is causing this.
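For anyone who wants to compare on their own site, a rough Python sketch of this kind of measurement follows. It is not the tooling used for the tests above; the function names are made up, and the timing includes connection setup (and TLS for https), so it only approximates TTFB.

```python
import http.client
import statistics
import time
from urllib.parse import urlsplit


def ttfb_seconds(url, timeout=10.0):
    """Time from starting the request until the response headers arrive."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # connects, then sends the request
        response = conn.getresponse()  # returns once status line and headers arrive
        elapsed = time.perf_counter() - start
        response.read()                # drain the body before closing
    finally:
        conn.close()
    return elapsed


def median_gap(slow_samples, fast_samples):
    """Difference between the medians of two lists of timings, in seconds."""
    return statistics.median(slow_samples) - statistics.median(fast_samples)
```

Collecting a few dozen `ttfb_seconds(...)` samples for a Pages URL and for a forum URL and passing both lists to `median_gap` gives a single number to compare against the 0.80 to 1.0 second gap described above.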

Archived

This topic is now archived and is closed to further replies.
