Jump to content

Too much select count(*)...


athlonkmf

Recommended Posts

I just had the chance to test out ipb302, preparing for an upgrade.

And i noticed that there are a lot of select count(*) in the admin panel. Like the cache_overview. It counts all posts and members just for a percentage showing.

With an innodb database, 5 million posts and 200k members, that's not funny at all..

Can't it just use the board statistics which is cached anyway?

something like this:

in admin\applications\core\modules_admin\tools\cache.asp

use


$cacheContent['posts'] = intval($this->caches['stats']['total_replies']);


instead of



            $cacheContent['posts'] = $this->DB->buildAndFetch( array( 'select' => 'COUNT(*) as count',

                                                                        'from'   => 'posts' ) );
Link to comment
Share on other sites

A select count(*) query with no where clause is generally a very fast query, because database engines store the total number of records in internal tables/indexes. These queries should not cause any problems whatsoever.

Further to that, how often do you load up the cache management page in the ACP? The queries are done to be sure you have the correct most up to date number. The caches on the front end do not take into account deleted posts, for instance. So if 3 posts are made, all 3 are deleted, then 3 more are made and all 3 are cached - it would reflect only 50% posts being cached, instead of 100%.

Link to comment
Share on other sites

[quote name='bfarber' date='24 August 2009 - 11:11 AM' timestamp='1251126693' post='1848369']
A select count(*) query with no where clause is generally a very fast query, because database engines store the total number of records in internal tables/indexes. These queries should not cause any problems whatsoever.


This is only true for MyISAM tables. For MyISAM tables, count(*) with no where clause does a full index scan which can be quite time consuming for very large tables. You really should cache these values when their only purpose is for displaying stats (or cache them if the table is not MyISAM). MySQL has to do the full scan for InnoDB because InnoDB supports transactions.

Link to comment
Share on other sites

Honestly, we don't recommend anyone to use InnoDB. The theory behind "row level vs table level" locking is sound, but in practice I've never seen any IPB perform better on InnoDB myself.

At the end of the day, we cache this data where it's used on the front end for the purpose it's used. In the ACP on that one particular page, a correct/accurate count is desired, so we run a query. We can't simply NOT run count() queries in a PHP script like a bulletin board.

I'm not sure what kind of resource offset you'd end up seeing if we tried to maintain an accurate count for posts on the front end - it's possible (perhaps probable) the overhead in recounting every time a delete occurs (which happens on the front end) could offset any gain you'd get from the once in a blue moon you might visit the specific page we're referring to here in the ACP.

Link to comment
Share on other sites

[quote name='bfarber' date='24 August 2009 - 07:05 PM' timestamp='1251137148' post='1848440']
Honestly, we don't recommend anyone to use InnoDB. The theory behind "row level vs table level" locking is sound, but in practice I've never seen any IPB perform better on InnoDB myself.



Well, here's one example for you where rowlevel locking is needed. In fact, when our forum had 1000 visitors at once, rowlevel locking was already needed, because the posting went that fast that there were too many table locks. Which often caused an avalanche effect, making the site effectively inaccessable.
InnoDB solved that.
And we're happy too, now that we have 4000+ visitors at a given time, but that innodb vs myisam is another discussion

At the end of the day, we cache this data where it's used on the front end for the purpose it's used. In the ACP on that one particular page, a correct/accurate count is desired, so we run a query. We can't simply NOT run count() queries in a PHP script like a bulletin board. I'm not sure what kind of resource offset you'd end up seeing if we tried to maintain an accurate count for posts on the front end - it's possible (perhaps probable) the overhead in recounting every time a delete occurs (which happens on the front end) could offset any gain you'd get from the once in a blue moon you might visit the specific page we're referring to here in the ACP.

The thing is, the total amount of posts and total amount of members are already cached and in that page, it's only meant for a statistics. xx cached posts vs xx total amount of posts. Unless the percentage had 100 decimals, or the board only has 100 posts, it won't matter much what the percentage is. Why not just use the cached value?

It's only a tip for a better user experience, as I'll be editing it for myself anyway.

Oh, and you're better off doing a select count(primary_id) than a select count(*) if it's really needed. Effect might me marginal, but every little thing helps





Link to comment
Share on other sites

[quote name='bfarber' date='24 August 2009 - 08:44 PM' timestamp='1251143046' post='1848504']
Yes,for larger sites the statistical difference will be nearlynothing. Care to take a guess how many sites use IPB that have about100 posts though, that would instantly see the "bug" if we used theincorrect cached number? :P


Howabout putting an if-statement then ;) Use count if there's less than1000 posts (from cache) otherwise, just use the cachethingie.

Anyway, for the record. I just had time to take a better look.

For (reference and) those who are cursed with a large board, they might want to do this.



            /* Get all posts */

//            $cacheContent['posts'] = $this->DB->buildAndFetch( array( 'select' => 'COUNT(*) as count',

//                                                                        'from'   => 'posts' ) );


            /* Get all members */

//            $cacheContent['members'] = $this->DB->buildAndFetch( array( 'select' => 'COUNT(*) as count',

//

//                

                        $this->caches   =& $this->registry->cache()->fetchCaches();

                        $cacheContent['posts'] = array('count'=>intval($this->caches['stats']['total_replies']));

                        $cacheContent['members'] = array('count'=>intval($this->caches['stats']['mem_count']));

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...