Invision Community 4: SEO, prepare for v5 and dormant account notifications By Matt Monday at 02:04 PM
PeterUK Posted September 4, 2012 Posted September 4, 2012 This issue only applies to the 3.3.X series So I'm seeing this issue coming up more and more here and peoples' forums expand and such. The issue consists of the whole forum being generally slow and the issue is caused by one query when a topic is loaded. The problem generally only occurs on a forum that has large topics (consisting of 5,000 posts or more), when one of these topics is loaded, the query runs, and, depending on the size of the topic can cause between 2 seconds and 45 seconds(!) of load time. During this time, MySQL generally locks and no other queries can be processed, if you have a busy forum this is a huge issue. The problem is that on a huge topic, all of the posts are pulled from MySQL and joined on the multiple tables (sometimes up to 8 or 9 tables) for every post, and then the resulting set is limited. So if your topic has 20,000 posts, it does 8 or 9 table joins for every single post and then selects the posts you actually want from that huge set of 20,000. What the patch actually does is selects the posts you want first (usually no more than 20) and then does the joins on those 20 posts, making it unbelievably efficient in comparison. The issue was reported here by a user:http://community.inv...database-r38208 But the patch that he wrote, while probably fixing the issue by splitting the query into multiple queries is unnecessary and quite complex to implement. There have also been multiple users with exactly this problem:http://community.inv...zation-service/ And a few who have PMed me asking for instructions on how to patch this. I wouldn't recommend doing this unofficial patch unless you're experiencing the issue, and it is literally a quick and dirty patch I cooked up to fix our forum (it's not very neat, but it does the job). It's also been submitted to IPS for them to neaten up and I've been told it will be fixed for 3.4. Anyway, enough talk, here's how to patch it. [*]Turn your board offline. [*]Make a backup of (or make sure you have the IP.Board archive) the following file: admin/applications/forums/modules_public/forums/topics.php [*]Inside that file, find the following: //----------------------------------------- // Get posts //----------------------------------------- $this->DB->build( array( 'select' => 'p.*', 'from' => array( 'posts' => 'p' ), 'where' => 'p.topic_id='.$topicData['tid']. $queued_query_bit, 'order' => 'p.'.$this->settings['post_order_column'].' '.$this->settings['post_order_sort'], 'limit' => array( $first, $this->settings['display_max_posts'] ), 'add_join' => $_post_joins ) ); $oq = $this->DB->execute(); Replace this: $this->DB->build( array( 'select' => 'p.*', 'from' => array( 'posts' => 'p' ), 'where' => 'p.topic_id='.$topicData['tid']. $queued_query_bit, 'order' => 'p.'.$this->settings['post_order_column'].' '.$this->settings['post_order_sort'], 'limit' => array( $first, $this->settings['display_max_posts'] ), 'add_join' => $_post_joins ) ); $oq = $this->DB->execute(); With: $this->DB->allow_sub_select = 1; $this->DB->build( array( 'select' => 'p.*', 'from' => array( '(SELECT * FROM '.$this->settings['sql_tbl_prefix'].'posts WHERE '.$this->settings['sql_tbl_prefix'].'posts.topic_id='.$topicData['tid'].str_replace('p.', $this->settings['sql_tbl_prefix'].'posts.', $queued_query_bit).' ORDER BY '.$this->settings['sql_tbl_prefix'].'posts.'.$this->settings['post_order_column'].' '.$this->settings['post_order_sort'].' LIMIT '.$first.','.$this->settings['display_max_posts'].')' => 'p' ), 'add_join' => $_post_joins, 'group' => 'p.pid' ) ); $new_query = $this->DB->fetchSqlString(); $this->DB->flushQuery(); $oq = $this->DB->query(str_replace($this->settings['sql_tbl_prefix']."(", "(", $new_query)); This patch may benefit you even if you don't have huge topics (in my tests, the load times were improved even for topics of just 3-4 pages), but it is an unofficial patch so apply it at your own risk, but after applying it to our forum where the issue was rampant, the difference has been black and white.
Marcher Technologies Posted September 4, 2012 Posted September 4, 2012 ....dirty is the right word... I assume you have tested it, but the group clause is notoriously heavy, and avoidance of $this->DB->query is universally a good thing.
PeterUK Posted September 4, 2012 Author Posted September 4, 2012 ....dirty is the right word... I assume you have tested it, but the group clause is notoriously heavy, and avoidance of $this->DB->query is universally a good thing. It's tested on multiple forums. The group clause is actually a patch for another issue (not this one), it's just included because it will be that way anyway in 3.4: http://community.invisionpower.com/resources/bugs.html/_/ip-board/multiple-warn-logs-causes-posts-to-be-omitted-r38422 It's only grouping on a very small result set and in my tests it shows almost no performance hit. If you have a better way to use $this->DB->build() without it applying the DB prefix to the temporary table then please let me know. My original code just used $this->DB->build() but it turns out if you have a DB prefix it applies it to the start of the temporary table so it ends up like this:ibf_(SELECT .....) AS p which of course causes an SQL error.
Marcher Technologies Posted September 4, 2012 Posted September 4, 2012 >_< proper use of indexes to avoid the temp table entirely seems a more sound solution than forcing the creation of one via sub-selects. Simply an observation, I did observe you have tested it, and it quite apparently is helping in said testing, far be it for me to question something you obviously have spent time troubleshooting.
PeterUK Posted September 4, 2012 Author Posted September 4, 2012 >_< proper use of indexes to avoid the temp table entirely seems a more sound solution than forcing the creation of one via sub-selects. Simply an observation, I did observe you have tested it, and it quite apparently is helping in said testing, far be it for me to question something you obviously have spent time troubleshooting. Feel free to try running the query yourself, it's a fully indexed query, the problem is not that, it's the sheer amount of joins that is done. Every single table has the correct indexes, and every single index gets used correctly when the query is run. The problem is on a huge topic, a result set of say 20,000 posts is selected, all of the joins (8-9 tables) are performed, and then a LIMIT is used on the whole set to pull out the X posts needed to display on the page. Using a sub-select causes the posts table, as far as the rest of the query is concerned, to just consist of the X posts you want anyway, which makes the subsequent joins a hell of a lot faster.
Marcher Technologies Posted September 4, 2012 Posted September 4, 2012 ok.... then an entirely different tack, if the issue literally is it all in one query(the joins being run on the whole result set pre-limit), pull the posts first with said limit, grab the rest separately by pid/member_id? Effectively the same thing, adds 1 query, nukes the need for a group or a sub-select.... again, not doubting the issue, just really not fond of subselect or query.
PeterUK Posted September 4, 2012 Author Posted September 4, 2012 ok.... then an entirely different tack, if the issue literally is it all in one query(the joins being run on the whole result set pre-limit), pull the posts first with said limit, grab the rest separately by pid/member_id? Effectively the same thing, adds 1 query, nukes the need for a group or a sub-select.... again, not doubting the issue, just really not fond of subselect or query. You could indeed do that if you want, but it's like people saying, "You shouldn't ever use X", sometimes it's just simpler, easier and makes most sense to do that. In my opinion it depends on whether or not the DB class has the ability to handle a subselect with an alias properly, if not, it makes more sense to use $this->DB->build() twice to build it up, if it can support it properly (and I just don't know how to do it) then I see no reason to not use the sub-select. Either way, this is just an unofficial interim patch until the official fix in the next major point revision. :)
FabioPaz Posted September 4, 2012 Posted September 4, 2012 My board changed from water to wine, after this patch. I don't know how safe it is or if it uses the most suitable way to fix the problem, but it worked and was far more simple and efficient than the solution that IPB Official Support gave me: "cut user features to make it faster or buy better hosting. " Thanks for the patch.
PeterUK Posted September 4, 2012 Author Posted September 4, 2012 Also just to add on to my previous post, it doesn't remove the need for a group, you still need to group by PID as there's a join on warn logs, and if a post has been warned more than once then it causes issues with display (see the bug report I linked in my 2nd post about posts being omitted).
Marcher Technologies Posted September 4, 2012 Posted September 4, 2012 Also just to add on to my previous post, it doesn't remove the need for a group, you still need to group by PID as there's a join on warn logs, and if a post has been warned more than once then it causes issues with display (see the bug report I linked in my 2nd post about posts being omitted). :) it is only an issue with the join on the posts table, if you nab the posts first, then the rest separately, then there is no need to be grouping. My board changed from water to wine, after this patch. I don't know how safe it is or if it uses the most suitable way to fix the problem, but it worked and was far more simple and efficient than the solution that IPB Official Support gave me: "cut user features to make it faster or buy better hosting. " Thanks for the patch. Glad it helps, as I said, not arguing the issue, just noting the factual fix in IPB will likely be more sanish in implementation. Also, as posts can be archived(different DB, different table), it simply makes more sense to separate it.
Luis Manson Posted September 5, 2012 Posted September 5, 2012 anyone can show the problematic query executed? I have one or two issues i can not track and I cant figure the final query looking at the code posted thanks!
FabioPaz Posted September 5, 2012 Posted September 5, 2012 In my case was this query: SELECT p.*,m.member_id as mid,m.name,m.member_group_id,m.email,m.joined,m.posts, m.last_visit, m.last_activity,m.login_anonymous,m.title as member_title, m.warn_level, m.warn_lastwarn, m.members_display_name, m.members_seo_name, m.member_banned, m.has_gallery, m.has_blog, m.members_bitoptions,m.mgroup_others,pp.*,w.wl_id,rep_index.rep_rating as has_given_rep,rep_cache.rep_points, rep_cache.rep_like_cache,cca.*,ccb.cache_content as cache_content_sig, ccb.cache_updated as cache_updated_sig FROM ibf_posts p LEFT JOIN ibf_members m ON ( m.member_id=p.author_id ) LEFT JOIN ibf_profile_portal pp ON ( m.member_id=pp.pp_member_id ) LEFT JOIN ibf_members_warn_logs w ON ( w.wl_content_app='forums' and w.wl_content_id1=p.pid ) LEFT JOIN ibf_reputation_index rep_index ON ( rep_index.app='forums' AND rep_index.type='pid' AND rep_index.type_id=p.pid AND rep_index.member_id=0 ) LEFT JOIN ibf_reputation_cache rep_cache ON ( rep_cache.app='forums' AND rep_cache.type='pid' AND rep_cache.type_id=p.pid ) LEFT JOIN ibf_content_cache_posts cca ON ( cca.cache_content_id=p.pid ) LEFT JOIN ibf_content_cache_sigs ccb ON ( ccb.cache_content_id=m.member_id ) WHERE p.topic_id=26890 AND p.queued=0 ORDER BY p.pid asc LIMIT 6960,20
PeterUK Posted September 5, 2012 Author Posted September 5, 2012 :) it is only an issue with the join on the posts table, if you nab the posts first, then the rest separately, then there is no need to be grouping. There is, unless you intend to do a single query for *every* post and LIMIT 1. The duplicated rows come from the join on the warns table where there can be multiple entries for a single PID. If you pre-select 20 posts and join them on the warns table, you can still end up with more than 20 rows in which a single PID occurs more than once. anyone can show the problematic query executed? I have one or two issues i can not track and I cant figure the final query looking at the code posted thanks! The query can differ slightly based on whether you have reputation or warn etc. enabled but it's pretty much what FabioPaz posted. :)
Grumpy Posted September 5, 2012 Posted September 5, 2012 Problem certainly exists, but I'm not sure I see this as a solution. Here's my test results on said problematic query: On topic with 2 posts. IPB Version: 0.0026 sec OP Version: 0.0089 sec On topic with ~600 posts. IPB Version: 0.0085 sec OP Version: 0.0121 sec On topic with ~28000 posts. IPB Version: 0.1400 sec OP Version: 0.1415 sec Data used to test this is was not generated, but from a real forum data. It's quite possible I'm not getting the same problem because I can run the query in .14 seconds even with 28k posts. But then, sort of proves the point that, better machine is also a solution. The machine tested on is my home computer, just a typical desktop single 7200 rpm drive. WD Blue to be specific.
PeterUK Posted September 5, 2012 Author Posted September 5, 2012 If your 28,000 post topic is already loading in 0.14 seconds then you're not experiencing the problem. It's reasonable to expect the time to increase as the topic gets larger, just not by the amount it does for most people. This patch is only relevant for 3.3.X (not sure about 3.2, I skipped that series).
Grumpy Posted September 5, 2012 Posted September 5, 2012 3.2 seems to have a different query build, so, I'd guess this would just crash it. lol For reference: 'from' => array( 'posts' => 'p' ), 'where' => "p.pid IN(" . implode( ',', $this->pids ) . ")", 'order' => $this->settings['post_order_column'] . ' ' . $this->settings['post_order_sort'], 'add_join' => $_post_joins ) ); $oq = $this->DB->execute();$this->DB->build( array( 'select' => 'p.*',
PeterUK Posted September 5, 2012 Author Posted September 5, 2012 Yeah, is your forum running 3.2? In 3.3 the warn system, and some changes as to what's displayed regarding the reputation system is what kills the performance. On a 181,000 post topic (real):pre-patch: 53.7164 48.6963 47.9003 48.8025 49.7213 Avg: 49.76736 post-patch: 1.0353 1.1235 1.0724 1.065 1.0285 Avg: 1.06494 Also 3.2 selects a range of posts (which is basically what this does) rather than selecting all of them which is where the performance hit is. I guess I should make it clearer in the OP that this is only for 3.3. This isn't a problem with server configuration or power either, we run an extremely powerful MySQL server that is well configured, this query is fully indexed and it still bogs down the server, that's why it's being fixed in 3.4. :P
AndyF Posted September 5, 2012 Posted September 5, 2012 Initial post modified with notice as requested.
maddog107_merged Posted September 5, 2012 Posted September 5, 2012 Thanks, I may need to try it out. I got SSD's because of the ridiculously slow load times of some 1,000 page threads. And even then the first load time of a particular topic could take a while but once its cached into the temp tables its pretty quick.
PeterUK Posted September 5, 2012 Author Posted September 5, 2012 Thanks, I may need to try it out. I got SSD's because of the ridiculously slow load times of some 1,000 page threads. And even then the first load time of a particular topic could take a while but once its cached into the temp tables its pretty quick. Let me know if it makes a difference for you. :)
Luis Manson Posted September 5, 2012 Posted September 5, 2012 yup, its on the loved slow log... Count: 92 Time=13.05s (1200s) Lock=0.13s (11s) Rows_sent=16.0 (1470), Rows_examined=68736.2 (6323734), USER[MYuser]@localhost yet i will wait for this, since i have bigger issues with another query
PeterUK Posted September 6, 2012 Author Posted September 6, 2012 yup, its on the loved slow log... Count: 92 Time=13.05s (1200s) Lock=0.13s (11s) Rows_sent=16.0 (1470), Rows_examined=68736.2 (6323734), USER[MYuser]@localhost yet i will wait for this, since i have bigger issues with another query Can you run the other problematic query without an issue with your board turned offline? We had loads of stuff turning up in our slow query log but it turned out to be because this caused a table lock and held up the other queries which the server then saw as "slow" and logged them.
Luis Manson Posted September 6, 2012 Posted September 6, 2012 Can you run the other problematic query without an issue with your board turned offline? We had loads of stuff turning up in our slow query log but it turned out to be because this caused a table lock and held up the other queries which the server then saw as "slow" and logged them. the query i have problems with is somewthing like:SELECT * FROM XXreputation_totals WHERE rt_app_type=MD5( CONCAT( 'forums', ';', 'pid' ) ) AND rt_type_id IN (SELECT p.pid FROM posts p,topics t WHERE p.queued=0 AND t.tid=p.topic_id AND t.approved=1 AND ( ( FIND_IN_SET(25,t.forum_id) OR FIND_IN_SET(86,t.forum_id) OR FIND_IN_SET(5,t.forum_id) OR FIND_IN_SET(36,t.forum_id) OR FIND_IN_SET(10,t.forum_id) OR FIND_IN_SET(34,t.forum_id) OR FIND_IN_SET(27,t.forum_id) OR FIND_IN_SET(271,t.forum_id) OR FIND_IN_SET(22,t.forum_id) OR FIND_IN_SET(23,t.forum_id) OR FIND_IN_SET(21,t.forum_id) FIND_IN_SET(266,t.forum_id) ) )) GROUP BY rt_key ORDER BY rt_total DESC LIMIT 0,20; NOTE: I removed like 70% of the query before posting it (lines like "or find_in_set...") do you want me to run this one which is my current problem or the one this post is about?
PeterUK Posted September 6, 2012 Author Posted September 6, 2012 I was just curious as to know whether or not your other slow queries always run slowly, or whether they are actually being delayed by what's described in this topic (since you seem to experience the issue from the slow query info you posted earlier).
Luis Manson Posted September 6, 2012 Posted September 6, 2012 Actually after months of having a processlist mailed to me automatically i feel very confident all my issues in life are because of this query i quoted :P however i have to apply a fix IPB sentme, after that it will be more clear to me if there is an issue with this query too
Recommended Posts
Archived
This topic is now archived and is closed to further replies.