Posted

I have a topic that had more than 150k posts. A member has requested that their account and posts be deleted, which includes about 10k posts in that topic (and 4k posts across other topics). That deletion has been running for the past 20 hours, maxing out the CPU on my server the whole time, and is about 71% complete. The massive topic is having issues where pages disappear and reappear, I think because of consistency problems caused by the rate of updates to that topic (although solving that without transactions might be hard), and submitting new posts is very sluggish.

The CPU is mostly being consumed by Elasticsearch, I think because each deleted post results in every post in the topic being updated in the index. There must be a better way to do that, such as batching the index updates by topic.
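To put rough numbers on that suspicion, here is a minimal sketch that only counts index writes under two strategies. It assumes, as guessed above and not confirmed, that each deletion rewrites every remaining post of the topic in the index. The function names are hypothetical and nothing here calls Elasticsearch or the forum software.

```python
from collections import Counter


def naive_index_updates(topic_sizes, deletions):
    """Count index writes if every single deletion reindexes the whole topic.

    topic_sizes: dict of topic id -> number of posts before deletion.
    deletions:   list of topic ids, one entry per deleted post.
    """
    sizes = dict(topic_sizes)  # copy so the caller's dict is not modified
    updates = 0
    for topic_id in deletions:
        sizes[topic_id] -= 1
        updates += sizes[topic_id]  # every remaining post is rewritten
    return updates


def batched_index_updates(topic_sizes, deletions):
    """Count index writes if deletions are grouped and each topic is reindexed once."""
    sizes = dict(topic_sizes)
    updates = 0
    for topic_id, removed in Counter(deletions).items():
        sizes[topic_id] -= removed
        updates += sizes[topic_id]  # one pass over the remaining posts
    return updates


if __name__ == "__main__":
    sizes = {901907: 150_000}          # approximate size of the large topic
    deleted = [901907] * 10_000        # approximate number of posts removed from it
    print(naive_index_updates(sizes, deleted))
    print(batched_index_updates(sizes, deleted))
```

With those approximate figures the per-deletion strategy comes to about 1.45 billion document updates, against 140,000 for a single pass after all deletions, which would be consistent with the CPU being pinned for most of a day.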

Posted

There is no other way in which to delete posts at present unfortunately. I can tag our developers to see if they have any further comments, but unfortunately if you have a topic which is that large in size, it is going to keep moving about pages until everything is deleted.

Posted (edited)
7 hours ago, Marc Stridgen said:

There is no other way in which to delete posts at present unfortunately. I can tag our developers to see if they have any further comments, but unfortunately if you have a topic which is that large in size, it is going to keep moving about pages until everything is deleted.

I do understand posts moving between pages, but there are clearly some optimizations that can be made to the Elasticsearch usage, because it should not need to reindex all 150k documents 10k times.

 

That topic's db record also seems to be corrupt, but I need to look into that further (I think the posts field in the db is out of sync with the actual number of replies, but have not verified it).

Edited by Colonel_mortis
Posted (edited)

I also had some members report today that the most recent posts were still not visible after the deletion had completed. It does look like the cached post count (used for computing the page count) was pretty far off:

mysql> SELECT posts FROM forums_topics WHERE tid=901907;
+--------+
| posts  |
+--------+
| 143021 |
+--------+

mysql> SELECT COUNT(*) FROM forums_posts WHERE topic_id=901907 AND queued=0;
+----------+
| COUNT(*) |
+----------+
|   143048 |
+----------+

I'm not sure what caused the discrepancy to be this big (27 posts), but this looks like a pretty classic race condition. Resyncing the comment counts (which would have happened organically if anyone had replied in the past few hours) has fixed it, and it's probably not straightforward to re-engineer to avoid this, but it does suck.
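For anyone wanting to see what such a resync amounts to, here is a minimal sketch that recomputes the cached count in one statement. It is not the forum software's own resync routine. It assumes the cached `posts` value should equal the `COUNT(*)` query above, uses an in-memory SQLite database as a stand-in for MySQL, and includes only the columns that appear in those two queries. The helper names are hypothetical.

```python
import sqlite3

TOPIC_ID = 901907


def make_demo_db():
    """Build a tiny stand-in schema with a deliberately stale cached count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE forums_topics (tid INTEGER PRIMARY KEY, posts INTEGER)")
    conn.execute(
        "CREATE TABLE forums_posts ("
        "pid INTEGER PRIMARY KEY, topic_id INTEGER, queued INTEGER)"
    )
    conn.execute("INSERT INTO forums_topics VALUES (?, 3)", (TOPIC_ID,))  # stale value
    rows = [(TOPIC_ID, 0)] * 5 + [(TOPIC_ID, 1)]  # five visible posts, one queued
    conn.executemany("INSERT INTO forums_posts (topic_id, queued) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def resync_topic_post_count(conn, tid):
    """Overwrite the cached count with the real number of visible posts.

    The count and the write happen in a single UPDATE, so there is no gap
    between reading the rows and storing the result.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE forums_topics SET posts = ("
            "  SELECT COUNT(*) FROM forums_posts"
            "  WHERE topic_id = forums_topics.tid AND queued = 0"
            ") WHERE tid = ?",
            (tid,),
        )
    (posts,) = conn.execute(
        "SELECT posts FROM forums_topics WHERE tid = ?", (tid,)
    ).fetchone()
    return posts


if __name__ == "__main__":
    db = make_demo_db()
    print(resync_topic_post_count(db, TOPIC_ID))
```

Recomputing from the rows, instead of decrementing a counter once per deleted post, means a lost or duplicated decrement during a long-running deletion cannot leave the cached value drifting.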

Edited by Colonel_mortis