Numbered Posted November 8, 2018 Posted November 8, 2018 Hello. Background task queue 'RecountMemberReputation' must be improved. This background queue can start only from two places: Button to recount ALL reputation in ACP - good When 'somebody' removes given reputation - bad. It's bad from two sides: Logic If you have a big amount of users and their activity it is too heavy to recount reputation for ALL users when just one of them get his reaction back. This call located in \system\Member\Member.php line 3507. Please, make it smarter and stop recount all users. Method This background queue calls with that code: \IPS\Task::queue( 'core', 'RecountMemberReputation', array(), 4 ); We haven't fifth parameter here, which should prevent multiplying this task. So for every canceled reaction, this code creates a new one background queue, which starts recounting your millions of users. No matter finished previous one or not. No matter how much the same task already doing. RecountMemberReputation queue (\applications\core\extensions\core\Queue\RecountMemberReputation.php) should be smarter too. Let's see 'run' function. All looks good, but you miss situation for forums, which using member_id as not incremental id. We have an external account database and the special login service. When new user entering the forum and authenticate in this server - this login method create not a clean member. It creates a new member with a special member_id (which are the same as member id in an external database), which nickname, email, and other data. So for the situation, where the difference (or spaces) between two member_id's maybe thousands (for example it enough), we obtain that situation: \IPS\Db::i()->select(... get a 50 members (for example their id's: 1,178,257,258,259...536,718) They reputation rebuilt successfully 'run' return ($offset + $this->rebuild): (in first run it return 0 + 50 = 50) Second 'run' start with a select and offset 50. So in our select, we get... 178,257,258,259...536,718. Very bad, right? Our members recount the second time... 'run' return ($offset + $this->rebuild): (in second run it return 50 + 50 = 100).... What? The third stage runs with a 100... and will rebuild same member_id's 178,257,258,259...536,718. The fourth stage will rebuild the same members 178,257,258,259...536,718... And on the fifth stage, we just obtain new member_id after 718.. something like 1678. The obvious possible solution is returning the last member_id as offset instead of increment by 50. And of course, in the progress, we see the wrong number. Count number will always be lower than $offset. Change the priority for RecountMemberReputation from current 4 to 5 (as much lower as possible). This task is very heavy for large communities. And it totally blocks other queues with 5 priority. For example, indexing new comment item, which tries to start with a five. The result that we can't index any new comment items before completing this RecountMemberReputation. With multiple RecountMemberReputation queues, it totally blocks all other lower priority tasks. I hope you understand my bad English and agree for improve that points. Thanks!
newbie LAC Posted November 8, 2018 Posted November 8, 2018 55 minutes ago, Upgradeovec said: So for the situation, where the difference (or spaces) between two member_id's maybe thousands (for example it enough), we obtain that situation: \IPS\Db::i()->select(... get a 50 members (for example their id's: 1,178,257,258,259...536,718) They reputation rebuilt successfully 'run' return ($offset + $this->rebuild): (in first run it return 0 + 50 = 50) Second 'run' start with a select and offset 50. So in our select, we get... 178,257,258,259...536,718. Very bad, right? Our members recount the second time... 'run' return ($offset + $this->rebuild): (in second run it return 50 + 50 = 100).... What? The third stage runs with a 100... and will rebuild same member_id's 178,257,258,259...536,718. The fourth stage will rebuild the same members 178,257,258,259...536,718... And on the fifth stage, we just obtain new member_id after 718.. something like 1678. What? The queries looks like SELECT * FROM core_members ORDER BY member_id ASC LIMIT 0, 50; SELECT * FROM core_members ORDER BY member_id ASC LIMIT 50, 50; SELECT * FROM core_members ORDER BY member_id ASC LIMIT 100, 50; https://dev.mysql.com/doc/refman/8.0/en/select.html Quote The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. I agree with 1 hour ago, Upgradeovec said: The obvious possible solution is returning the last member_id 1. It's faster SELECT * FROM core_members WHERE member_id > 500000 ORDER BY member_id ASC LIMIT 20; 1.3895 seconds vs SELECT * FROM core_members ORDER BY member_id ASC LIMIT 500000, 20; 0.0003 seconds 2. For example we have 500 users (from 1 to 500) We run the queue SELECT * FROM core_members ORDER BY member_id ASC LIMIT 0, 50; SELECT * FROM core_members ORDER BY member_id ASC LIMIT 50, 50; In the next run we should update users from 101 to 150 (101,102, ..., 150) We remove users from 1 to 10 SELECT * FROM core_members ORDER BY member_id ASC LIMIT 100, 50; return users from 111 to 160 (111,112, ..., 160) We skip 10 users
Numbered Posted November 8, 2018 Author Posted November 8, 2018 @newbie LAC, agree. I was wrong with the idea of same members updates. This part might exclude from attention. I knew about selects, but something broke in my mind when I tried to systematize all needs in this queue task. Feel shame. Thanks for correct me. Agree about performance. Same offset+limit performance degrees I got in upgrade scripts (for members table too) - it was changed well.
bfarber Posted November 8, 2018 Posted November 8, 2018 I can say for 4.4 we've already changed the offsetting to use the last used member_id (instead of using LIMIT 0, 50 then LIMIT 50, 50 and so on). I'll note the rest of your feedback for internal review however. 🙂
Recommended Posts
Archived
This topic is now archived and is closed to further replies.