realmaverickuk Posted December 10, 2011 Posted December 10, 2011 Dear IPS devs, As larger sites can struggle with huge member tables, I'd absolutely love it, if this was taken in to consideration and rectified for IPB 3.3. I'm not a developer, so these suggestions might not be possible but here goes;Members table: Add the ability to archive members. Ideally this would be user defined, i.e archive members who haven't logged in for x days. I think logging in, would be more logical than last post. Alternatively, index the table a-z 0-9 giving 36 tables total, making each one much more manageable. Though I don't know what implication 36 tables has....Members list: I'm not sure what the difference is between IPB member lists and VB. But massive VB member lists load instantly. I'd guess is that IPB are doing some kind of calculation on the entire members table, each time the page is loaded? That's maybe what's causing the slow down in browsing the member list? I realise this only effects 1-2% of websites, but it would be great for both new and existing customers, to know that IPB is fully scalable. Thanks a lot for considering my request :)
bfarber Posted December 12, 2011 Posted December 12, 2011 Splitting the tables isn't realistic. Firstly, 99% of our customers have no need (and this adds a lot of overhead if there's no need), and secondly, you'd be surprised at the number of non-English speaking customers (so, A-Z is useless since their alphabets do not contain latin characters).
realmaverickuk Posted December 13, 2011 Author Posted December 13, 2011 Are you saying adding an archiving system adds overhead to the system? I'm 99% certain that it doesn't. And Brandon, what about the 2% of customers, who do require this?
Brett L Posted December 13, 2011 Posted December 13, 2011 Are you saying adding an archiving system adds overhead to the system? I'm 99% certain that it doesn't. And Brandon, what about the 2% of customers, who do require this? De-normalizing the database in this way will actually increase your database size. I don't understand what you are trying to accomplish by splitting the members table.
Marcher Technologies Posted December 13, 2011 Posted December 13, 2011 Are you saying adding an archiving system adds overhead to the system? I'm 99% certain that it doesn't. And Brandon, what about the 2% of customers, who do require this? members in groups that are not shown in the member listing are not fetched there, firstly. secondly, and most importantly, pulling data from another table in addition makes it less scalable. Thirdly, (this is no fact, but a sneaking suspicion), the A-Z sorting/filtering ITSELF can be one of the slowest queries to run, especially paginated.... think about every time you jump a letter. which is more intensive, showing a single list of all 100k+ pages or that list and one for every letter... kinda obvious.
mat206 Posted December 13, 2011 Posted December 13, 2011 Have you tried partitioning your tables on your own if you are running a site that big? This *may* help.. You don't need IPS to write software to split members into different tables.. MySQL can do it natively. http://dev.mysql.com...g-overview.html Some caveats:http://www.mysqlperf...ou-or-kill-you/ I'm not sure exactly where the problem is but if it's accessing the member tab, I'd have someone write something for you that makes sure that it only pulls members that are relatively new. If it's just access to the members table (which just about everything uses) then try to partition the date yourself and see how that works. TBH, I don't see the value of having a members list if you have 2M+ members..
bfarber Posted December 13, 2011 Posted December 13, 2011 Are you saying adding an archiving system adds overhead to the system? I'm 99% certain that it doesn't. And Brandon, what about the 2% of customers, who do require this? I did not say adding an archiving system adds overhead to the system. I said splitting one table into 36 adds overhead to the 99% of sites that do not need it. I do not believe even 2% of our customers require the tables to be split. Doing this at the software level will not allow the same functionality (it is not easy, in a resource efficient manner, to pull data from multiple tables into one view - this is the same reason we do not have all search results from all apps mashed up together - so you would lose the "All" tab of the member list for instance), and I believe there are probably better solutions to this problem than trying to split the members table into 36 individual tables. Correct me if I'm wrong - you keep saying "accessing members table" slows down the site, but every single page view accesses the members table (usually multiple times). Unless I'm mistaken, you are ONLY having trouble with the actual "Members" tab at the top of the page (i.e. the member list), right?
Management Matt Posted December 14, 2011 Management Posted December 14, 2011 You couldn't do it anyway as many SQL queries join on the member table.
Marcher Technologies Posted December 14, 2011 Posted December 14, 2011 You couldn't do it anyway as many SQL queries join on the member table. the whole kit and ka-boodle really.... count(IPSMember::load()) basically.... every single one. and that's just in the places that function is used over the raw join.
Management Matt Posted December 14, 2011 Management Posted December 14, 2011 Indeed, that's a server level optimisation and not a software level one. When you start asking questions like that, there is usually a simpler answer.
Dmacleo Posted December 14, 2011 Posted December 14, 2011 I am so far from an sql expert I can barely spell it, but can (for those 2 % ) the members table be run on a dedicated sql server? does ip software allow that?
bfarber Posted December 14, 2011 Posted December 14, 2011 I am so far from an sql expert I can barely spell it, but can (for those 2 % ) the members table be run on a dedicated sql server? does ip software allow that? IP.Board cannot query a separate database for member data, no. As Matt said, the data is joined into other queries all over the place. That said, MySQL may have provisions for moving specific tables to other servers (I don't often play around with the more advanced functionality of MySQL like this).
shaks Posted December 15, 2011 Posted December 15, 2011 Here is something (that i believe) is more realistic and achievable for the general community using IPB. In my forums, i have had to recently delete inactive members, which added to my database and made it slow to load and to search for members, so i set a weekly deletion process for all inactive member accounts. The problem with that is when the inactive members come back, they have to register again. Now, there are 2 problems with this that i hope can be solved: 1. Allow "archiving" the inactive members who have not logged for x days, BUT of course nothing drastic like changing the way the database operates. Why not just deactivate the member accounts, by automatic permission settings and auto-assigning these inactive members to a special "inactive" membergroup. That feature should exist by default in IPB, so that it can be automated or manually done via a dedicated menu for administering members. The inactive membergroup should then make those inactive member accounts "invisible" to the entire forum system, meaning all functions for these accounts should be disabled, except for their posts, of course. For example, if another member sends a private message via the forum to an inactive member, there should be an error message, saying that "the member account has been deactivated due to prolonged inactivity". And also, when the inactive member comes back, he will need to verify his email address again (optional) and other settings that can be added as "punishment" such as limited time browsing for x minutes and y days, etc. 2. Once i have deleted the inactive member account, their posts stays on the forum. Now, when the member/s come back, is it possible to re-assign those "Guest" posts back to the new member's account? I ask because it's possible in SMF 2.0 and it would help to keep the IPB forum much cleaner, and easier for deleted members to be allowed to manage their own posts if these posts were re-attributed to them.
realmaverickuk Posted December 21, 2011 Author Posted December 21, 2011 I'm not sure why the entire focus was put on the 36 tables idea, if that can't work, fine. However, if you can archive the posts and are doing so in future versions, why can you not archive the inactive members in the same way? You're willing to provide the facility for boards with large post tables but not boards with large member tables? Once again, to clarify, the issue is not just the members tab. The issue is querying the members table in any way. Of course, the members tab queries the members table and so its slow. VB doesn't have this issue. It's something unique to IPB.
Marcher Technologies Posted December 21, 2011 Posted December 21, 2011 I'm not sure why the entire focus was put on the 36 tables idea, if that can't work, fine. However, if you can archive the posts and are doing so in future versions, why can you not archive the inactive members in the same way? You're willing to provide the facility for boards with large post tables but not boards with large member tables? Once again, to clarify, the issue is not just the members tab. The issue is querying the members table in any way. Of course, the members tab queries the members table and so its slow. VB doesn't have this issue. It's something unique to IPB. the board index queries the members table COUNTLESS TIMES.... every POST queries the member table, every APP. your logic is flawed entirely.
realmaverickuk Posted December 21, 2011 Author Posted December 21, 2011 I'm not a programmer, and regardless of my logic, it's still an issue. What exactly is your point? Are you denying the issue exists? are you saying the fix is not viable? Or are you just trolling?
Marcher Technologies Posted December 21, 2011 Posted December 21, 2011 I'm not a programmer, and regardless of my logic, it's still an issue. What exactly is your point? Are you denying the issue exists? are you saying the fix is not viable? Or are you just trolling? Im saying i'm already looking at having to develop anything i do off the posts table to pull from what may possibly be an external table. im saying that is one app. I'm saying the members table is used EVERYWHERE. I am not trolling, I am stating splitting the tables is a nightmare of bugs waiting to happen. I am also saying if the rest of the board/site runs fine. then its NOT accessing the members table that is slowing it... if that was the case, the forum and every single ipb app would be dragging as well. I'm saying possibly the members list query could possibly be better optimized, and is the source of the trouble.
yacenty Posted December 21, 2011 Posted December 21, 2011 in my case percona and Innodb for users gave us about 15-20% performance growth
realmaverickuk Posted December 21, 2011 Author Posted December 21, 2011 I'll have to ask my developer to elaborate on the issue. But if what you were saying is 100% true, then the members list wouldn't struggle to load, on larger boards. I've not found a single board over 1 million members, who's list loads ok. Yet all the large VB boards load fine. I am guessing the issue isn't only with table size, but also how it's accessed/queried. As I said somewhere else, it's as though each time we click next, the entire table is re-read or something. Rather than just grabbing the next x number of members out of the table. We did establish that the page count was a part of the issue. Page 1 of 50,000 or whatever.
realmaverickuk Posted December 21, 2011 Author Posted December 21, 2011 Correct me if I'm wrong - you keep saying "accessing members table" slows down the site, but every single page view accesses the members table (usually multiple times). Unless I'm mistaken, you are ONLY having trouble with the actual "Members" tab at the top of the page (i.e. the member list), right? Yes you're wrong. At least in regards to the members list. I've shown you several sites with over a million members and they either load slowly or not at all. I have not found a single IPB site, with over a million members, that doesn't struggle with the members tab. Regarding the accessing of the members table, I'm going to ask my developer to elaborate. I'm not sure why it's not an issue in some instances and is in others.
Marcher Technologies Posted December 22, 2011 Posted December 22, 2011 I'll have to ask my developer to elaborate on the issue. But if what you were saying is 100% true, then the members list wouldn't struggle to load, on larger boards. I've not found a single board over 1 million members, who's list loads ok. Yet all the large VB boards load fine. I am guessing the issue isn't only with table size, but also how it's accessed/queried. As I said somewhere else, it's as though each time we click next, the entire table is re-read or something. Rather than just grabbing the next x number of members out of the table. We did establish that the page count was a part of the issue. Page 1 of 50,000 or whatever. Have your coder swap it to prev-next pagination(api has a method for it)... doesn't require a count.... and yes.... i could see that member list pages count being a hazard. Counts are resource-heavy.... page counts hazardously so at that range.
Dmacleo Posted December 22, 2011 Posted December 22, 2011 I bet large part of it is that page count, and how many members referenced in those 50k pages exist still ?
bfarber Posted December 22, 2011 Posted December 22, 2011 1) Grabbing a count is not inherently resource intensive. When testing the issue described in this report, the count(*) query does show slow while the select shows fast...but as soon as you remove the count(*) query, the select becomes slow. In other words, once the count is run MySQL has cached the table data, so the select is fast - removing the count simply means the select query becomes slow instead. The issue is pushed around, not fixed. 2) Splitting the tables (ala the alphabet approach) is not possible, so we'll just move past that. 3) Archiving is *possible* but bound to be require massive rewrites of virtually every file in IP.Board. Nearly every file joins on the members table - there is no way to conditionally join on members table OR archived members table, so everywhere fetching member data would probably need to issue 2 queries instead of 1 if we did that. This is a less than desirable solution, realistically. 4) Accessing the members table is *not* a problem (or your entire site would be unusable, not just select areas pulling lists of members such as the members tab). It's the specific queries that are fetching the members that you find slow (i.e. the members tab), mostly because of various filtering and sorting options. One way this could potentially be tackled is a new column on the members table "show_memberlist" or similar, and then once all criteria are met for that member to display on the members list (has a display name, isn't in a group that shouldn't display in the memberlist, etc.) set that flag to a 1. Index the column along with the default sort column (to prevent a filesort) and then query based on that. It would help, but again would require either (1) massive changes throughout many IP.Board files to maintain this new column (what if they are moved to a new group?) or (2) a task that recalculates this column periodically (in which case your memberlist could be out of date for periods of time until the task fixes accounts).
Marcher Technologies Posted December 22, 2011 Posted December 22, 2011 1) Grabbing a count is not inherently resource intensive. When testing the issue described in this report, the count(*) query does show slow while the select shows fast...but as soon as you remove the count(*) query, the select becomes slow. In other words, once the count is run MySQL has cached the table data, so the select is fast - removing the count simply means the select query becomes slow instead. The issue is pushed around, not fixed. 2) Splitting the tables (ala the alphabet approach) is not possible, so we'll just move past that. 3) Archiving is *possible* but bound to be require massive rewrites of virtually every file in IP.Board. Nearly every file joins on the members table - there is no way to conditionally join on members table OR archived members table, so everywhere fetching member data would probably need to issue 2 queries instead of 1 if we did that. This is a less than desirable solution, realistically. 4) Accessing the members table is *not* a problem (or your entire site would be unusable, not just select areas pulling lists of members such as the members tab). [u]It's the specific queries that are fetching the members that you find slow (i.e. the members tab), mostly because of various filtering and sorting options.[/u] One way this could potentially be tackled is a new column on the members table "show_memberlist" or similar, and then once all criteria are met for that member to display on the members list (has a display name, isn't in a group that shouldn't display in the memberlist, etc.) set that flag to a 1. Index the column along with the default sort column (to prevent a filesort) and then query based on that. It would help, but again would require either (1) massive changes throughout many IP.Board files to maintain this new column (what if they are moved to a new group?) or (2) a task that recalculates this column periodically (in which case your memberlist could be out of date for periods of time until the task fixes accounts). A performance option to turn off the filtering/sorting options and the query bits attached would be nice.
Recommended Posts
Archived
This topic is now archived and is closed to further replies.