Matt

IP.Board 3.3 Dev Update: Performance Enhancements

During each release cycle we take some time out to assess performance and look at ways to improve it. We're also in the unique position of having first-hand experience hosting tens of thousands of IP.Board installations via our own hosting network.

We also work closely with our clients who constantly give us feedback on how IP.Board is performing and let us know about any areas that need further examination.

All of this data is very useful when it comes to profiling and testing IP.Board and making performance improvements for the next major version.

In this blog entry, I'd like to discuss some of the improvements we've made for IP.Board 3.3.

Topic Markers
IP.Board has had a centralised, database-driven topic marking system since 3.0. As IP.Board is only part of the suite, we wrote the system to be extensible and flexible so that our own apps and apps written by others can use the system without maintaining their own tracking databases.

We wrote the system to use two tables. The first can be considered a 'deep storage' table. It contains permanent tracking data in the format of one row per member per parent. This means that if you had 200 forums, each member would take up 200 rows.
The second table can be considered the 'active' table. When a member is loaded from the database and no 'active' row is found, the markers are pulled from deep storage and written in a serialised form to the 'active' table.
When the member is no longer active, the data is removed from the 'active' table and written back to the 'deep storage' table ready for the next time they visit.

In theory, this is the perfect solution. You only have to read and write to a smaller table which should make the system more efficient. However, we discovered that trying to keep the tables synchronised when you have a very busy site negated the benefit. The sheer number of SQL inserts and deletes often caused bottlenecks affecting the whole board.

Another downside was that all the marking data had to be loaded when the member was loaded. This could be up to 200k of marking data - most of which wouldn't be needed. If the member was viewing a topic, they wouldn't need marking data for Blog, for example.

We've tweaked the system to remove this SQL bottleneck. We've removed the 'active' table and simply write to the main tracker table. Now that there are no tables to synchronise, we can simply write back to the one row that needs updating rather than periodically updating all 200 rows.
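
As a rough illustration of the single-row write (not the actual IP.Board code), here is a minimal sketch in Python using SQLite. The table name `item_markers`, its columns and the `mark_read` helper are all hypothetical; the only point is that one statement touches one row per member per parent, with no second table to keep in sync.

```python
import sqlite3

# Hypothetical schema: one row per member per parent (e.g. one per forum).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE item_markers (
        member_id INTEGER NOT NULL,
        app       TEXT    NOT NULL,
        parent_id INTEGER NOT NULL,
        last_read INTEGER NOT NULL,
        PRIMARY KEY (member_id, app, parent_id)
    )
    """
)


def mark_read(conn, member_id, app, parent_id, timestamp):
    """Write back only the one row for the parent the member just read."""
    # INSERT OR REPLACE overwrites the existing row with the same primary key,
    # so repeated calls never create duplicate rows.
    conn.execute(
        "INSERT OR REPLACE INTO item_markers "
        "(member_id, app, parent_id, last_read) VALUES (?, ?, ?, ?)",
        (member_id, app, parent_id, timestamp),
    )
    conn.commit()
```

Calling `mark_read(conn, 1, "forums", 7, 2000)` twice still leaves a single row for that member and forum.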

Furthermore, we've removed the need to load all markers at once. A new function in 'coreExtensions.php' dictates which markers to load. You can still load all of them, as this may be more efficient (as is the case on the board index when you have a lot of sidebar hooks).

If you choose not to load the marker data when the member is initialised, you can use the new built-in JOIN methods to fetch the marking data along with your dataset.
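
To give a feel for the JOIN approach, here is a hedged sketch in Python with SQLite. It does not use IP.Board's real JOIN methods, which aren't shown here; the `topics` and `item_markers` tables, their columns and the `topics_with_markers` helper are assumptions made up for the example. The idea is simply that the marker comes back in the same query as the rows being listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables, for illustration only.
conn.executescript(
    """
    CREATE TABLE topics (
        topic_id  INTEGER PRIMARY KEY,
        forum_id  INTEGER NOT NULL,
        title     TEXT    NOT NULL,
        last_post INTEGER NOT NULL
    );
    CREATE TABLE item_markers (
        member_id INTEGER NOT NULL,
        app       TEXT    NOT NULL,
        parent_id INTEGER NOT NULL,
        last_read INTEGER NOT NULL,
        PRIMARY KEY (member_id, app, parent_id)
    );
    """
)


def topics_with_markers(conn, member_id, forum_id):
    """Fetch a forum's topics and the member's marker in a single query."""
    rows = conn.execute(
        """
        SELECT t.topic_id, t.title, t.last_post,
               COALESCE(m.last_read, 0) AS last_read
        FROM topics t
        LEFT JOIN item_markers m
               ON m.member_id = ?
              AND m.app = 'forums'
              AND m.parent_id = t.forum_id
        WHERE t.forum_id = ?
        ORDER BY t.topic_id
        """,
        (member_id, forum_id),
    ).fetchall()
    # A topic counts as unread if it was posted in after the member's marker.
    return [
        {"topic_id": r[0], "title": r[1], "unread": r[2] > r[3]}
        for r in rows
    ]
```

A member with no marker row gets `last_read = 0` from the LEFT JOIN, so every topic shows as unread.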

In testing, this has dramatically reduced write overhead and cut the memory footprint required per page view by up to 150k.

We're testing this out right now on our company forums and many people have already commented that 3.3 is seriously faster.

Post Table Access
The largest table in your database is almost certainly the post table. We have clients with millions of rows in this one table alone. It makes good sense to keep reads to a minimum where possible.

In older versions of IP.Board, we had different views such as 'threaded'. These were removed in 3.2 as these legacy views were rarely used and not really applicable in a modern context. However, some of the older code remained, which meant that the post table was being queried twice per topic view: once to fetch a list of post IDs and again to fetch the data.

We've rewritten this bit of code to use a new API and now we only query the table once. This alone will drop read access to your post table by almost 50% in normal daily use. This is a significant change.
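
The difference can be sketched as follows, in Python with SQLite. This is not the new API itself; the `posts` table, its columns and both helper functions are hypothetical and exist only to contrast the two access patterns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical post table with a handful of sample rows.
conn.execute(
    "CREATE TABLE posts ("
    "pid INTEGER PRIMARY KEY, topic_id INTEGER, post_date INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (topic_id, post_date, body) VALUES (?, ?, ?)",
    [(1, 100 + i, f"post {i}") for i in range(5)],
)


def fetch_page_two_queries(conn, topic_id, limit, offset):
    """Legacy pattern: one query for the post IDs, a second for the data."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT pid FROM posts WHERE topic_id = ? "
            "ORDER BY post_date, pid LIMIT ? OFFSET ?",
            (topic_id, limit, offset),
        )
    ]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT pid, body FROM posts WHERE pid IN ({placeholders}) "
        "ORDER BY post_date, pid",
        ids,
    ).fetchall()


def fetch_page_one_query(conn, topic_id, limit, offset):
    """Single pass: select the page of posts and their data together."""
    return conn.execute(
        "SELECT pid, body FROM posts WHERE topic_id = ? "
        "ORDER BY post_date, pid LIMIT ? OFFSET ?",
        (topic_id, limit, offset),
    ).fetchall()
```

Both functions return the same page of posts, but the second reads the post table once instead of twice.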

Today's Top Posters
This fairly innocuous feature is accessible via a link in the board footer and on most boards doesn't get a great deal of traffic. However, we've found that clients with larger boards notice a significant slowdown when this feature is used, which can cause another SQL bottleneck.

This is because the query is fairly complex due to the flexible permissions IP.Board offers. The query causes the creation of a temporary table to sort the data, which isn't desirable for larger boards.

We've added a new caching table which caches recent post IDs. This makes the feature much quicker (saving over a second of SQL time in testing) and, as an added plus, it doesn't have to query the post table to generate the list, which again saves read access on that large table.
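
A caching table along these lines might look like the sketch below, written in Python with SQLite. The table name `recent_posts_cache`, the choice to store the author and date next to each post ID, and the three helper functions are all assumptions for illustration; the real schema isn't described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cache: a small table holding only recent posts.
conn.execute(
    "CREATE TABLE recent_posts_cache ("
    "pid INTEGER PRIMARY KEY, "
    "author_id INTEGER NOT NULL, "
    "post_date INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_cache_date ON recent_posts_cache (post_date)")


def record_post(conn, pid, author_id, post_date):
    """Add a cache row whenever a new post is made."""
    conn.execute(
        "INSERT INTO recent_posts_cache (pid, author_id, post_date) "
        "VALUES (?, ?, ?)",
        (pid, author_id, post_date),
    )


def prune(conn, cutoff):
    """Drop cache rows older than the cutoff so the table stays small."""
    conn.execute("DELETE FROM recent_posts_cache WHERE post_date < ?", (cutoff,))


def top_posters(conn, since, limit=10):
    """Count recent posts per author using only the cache table."""
    return conn.execute(
        "SELECT author_id, COUNT(*) AS post_count "
        "FROM recent_posts_cache WHERE post_date >= ? "
        "GROUP BY author_id ORDER BY post_count DESC, author_id LIMIT ?",
        (since, limit),
    ).fetchall()
```

Because the list is built from the small cache table, the large post table is never read for this feature.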

Conclusion
There are many other, smaller changes in addition to those listed here. Some of these changes may seem trivial but they quickly stack up. It only takes one or two slow queries to bring a site to a crawl while SQL catches up with queued queries. These changes will make a significant difference to everyone, but especially to those working with large databases. Your IP.Board will be faster, consume less memory and be more SQL efficient. Those are changes we can all appreciate!

