Numbered Posted February 3, 2017 Posted February 3, 2017 We have bigint(20) type of member_id field in db on old forums (3.x). Start convertion. But when script go deeper to member_id, then it do lesser and lesser members by one mr-step. I am found cause of that. In /applications/core/setup/upg_40000/upgrade.php on line 240 we have this select: $select = \IPS\Db::i()->select( 'core_members.*, profile_portal.*', 'core_members', NULL, 'core_members.member_id', array( $limit, $perCycle ) ) ->join( 'profile_portal', 'profile_portal.pp_member_id=core_members.member_id' ); where $limit - is last count of member_id offset (not last member id) and $perCycle is constant 350. But this select create not optimised query, where MySQL select each of member_id for found next $limit needed. So in my test case we have 2.4kk users and in 1.8kk point of convertion script do 1 or 3 members per 20-30 sec. It's so slow because this select query get ~10 seconds for get data. We can fix this MySQL thing with additional join, where we can select only member_id field from core_members with offset. This subquery ends very fast. And after that MySQL select only needed users with profile_portal. Additional info about this problem you can read in english here: https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/ With this change i got less then 0.5s for query with same result against 10s+ of standard. I think you can optimize all big table convertions (and may be other things). Fast examples: SELECT * from core_members LIMIT 1867311,350; // 10.3811315 sec duration from mysql profiling SELECT * FROM core_members JOIN (SELECT member_id FROM core_members LIMIT 1867311, 350) AS b ON b.member_id = core_members.member_id; // 2.48505275 sec duration from mysql profiling same result, ofcouse.
bfarber Posted February 20, 2017 Posted February 20, 2017 Good suggestion. Instead of the subquery, we'd probably approach this differently (if you return the last ID that was operated on, we can do on the next cycle WHERE member_id > x LIMIT y ORDER BY member_id ASC which is probably the most efficient approach), but either way I've raised this internally for discussion and investigation. Thanks for the suggestion.
Numbered Posted February 28, 2017 Author Posted February 28, 2017 @bfarber, you right! In my case I am change convertion script to where. This is /applications/core/setup/upg_40000/upgrade.php line 238: $select = \IPS\Db::i()->select( 'core_members.*, profile_portal.*', 'core_members', NULL, 'core_members.member_id', array( $limit, $perCycle ) ) // old to $select = \IPS\Db::i()->select( 'core_members.*, profile_portal.*', 'core_members', 'core_members.member_id > '.$limit, 'core_members.member_id ASC', $perCycle ) // new line 247: return ( $limit + $did ); to return $did; line 250: $did++; to $did = $row['member_id']; and finally line 786: $_SESSION['_step1Count'] = \IPS\Db::i()->select( 'COUNT(*)', 'core_members' )->first(); to $_SESSION['_step1Count'] = \IPS\Db::i()->select( 'member_id', 'core_members', null, 'member_id DESC', 1 )->first(); With that changes convertion don't lose speed, when it process very big id's
David.. Posted March 6, 2017 Posted March 6, 2017 @bfarber Is there any update on this? I feel like upgrading members is the longest process compared to the others. So if this was sped up, it would mean less downtime for communities with a big member table.
Numbered Posted March 6, 2017 Author Posted March 6, 2017 @David..: much more time take a post convertion process of convertion legacy post markup.. x10 more time, than members with updated method with where. Fix for post convertion is start queue in several streams. But for that I haven't create solution. In simply view I can get all of them in iteration and set $post->post = \IPS\Text\LegacyParser::parse($post->post); $post->save(); It's work well, but not affect RebuildingPosts background queue (
Linguica Posted March 6, 2017 Posted March 6, 2017 I just want to thank you for this snippet, I noticed that my custom conversion script seemed unusually slow and when I looked at the slow query log I realized it was precisely the problem you were talking about here. So I replaced SELECT * FROM post ORDER BY postid ASC LIMIT 1000000, 1000; with SELECT * FROM post JOIN (SELECT postid FROM post ORDER BY postid ASC LIMIT 1000000, 1000) AS b on b.postid = post.postid; and the select query is now an order of magnitude faster, which is nice.
bfarber Posted March 8, 2017 Posted March 8, 2017 On 3/6/2017 at 8:33 AM, David.. said: @bfarber Is there any update on this? I feel like upgrading members is the longest process compared to the others. So if this was sped up, it would mean less downtime for communities with a big member table. On 2/20/2017 at 8:55 AM, bfarber said: Good suggestion. Instead of the subquery, we'd probably approach this differently (if you return the last ID that was operated on, we can do on the next cycle WHERE member_id > x LIMIT y ORDER BY member_id ASC which is probably the most efficient approach), but either way I've raised this internally for discussion and investigation. Thanks for the suggestion. There is no further update beyond this yet. I've raised this internally, and we will likely see this changed in an upcoming maintenance release.
Recommended Posts
Archived
This topic is now archived and is closed to further replies.