
GDPR data deletion issue


jellyhound


We recently tried to comply with a GDPR data deletion request for one of our clients and stumbled upon a major issue with Invision's "anonymise attributions" feature.

As part of this process we must delete the users from the forum.
We want to keep the conversation flowing, so we want to anonymise attributions, which is great and generally understood to be an acceptable GDPR approach.
However, many users like to use their full names as usernames, and a full name is Personally Identifiable Information (PII).

So you delete the user and the username is changed to guest, great!

However, if anyone has "quoted" that user, the name still appears throughout the thread, leaving a huge data task of cleaning up all the quote references.


  • Management

This is a rare problem that we have heard reported before. I say rare because using real, full names as display names is not that common, but it does come up. For example, I use just my first name, and there are a lot of people named Charles in the world, so I would not consider it PII for the purposes of GDPR.

Unfortunately, we do not yet have a good solution, as "quoted content" is almost the same as someone copy/pasting the post into their own editor and replying with it. The only real way to address this is a full database search and replace, which is, as you might expect, intensive. It is also prone to errors. Let's say someone's name on your community is John Doe. We remove all quoted post references but then elsewhere someone was talking about John Doe in their own post. That's not PII as it's not attributed to the user but would get replaced and break a conversation.


30 minutes ago, Charles said:

We remove all quoted post references but then elsewhere someone was talking about John Doe in their own post. That's not PII as it's not attributed to the user but would get replaced and break a conversation.

Yes, that's understandable, Charles, and I don't think anybody could expect you to do a full replacement of every single mention for precisely that reason. It would be possible to target names in quotes and mentions, wouldn't it? I'm not a particularly versed coder but your team could probably find a way to say "any instance of the username which is located within <ipsCitation> tags or within a container which has the data attribute 'data-mentionid'", I would think.
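To make the idea of targeting only quotes and mentions concrete, here is a minimal sketch in Python. It is not Invision's code, and the markup is an assumption: a citation element carrying an `ipsQuote_citation` class and a mention link carrying a `data-mentionid` attribute. The function name `scrub_attributions` is made up for illustration. The point is only that a name in ordinary post text is left alone while the same name inside a citation or mention is replaced.

```python
import re


def scrub_attributions(html, username, member_id, replacement="Guest"):
    """Replace `username` only inside quote citations and mention links.

    Assumed (hypothetical) markup:
      <div class="ipsQuote_citation">30 minutes ago, NAME said:</div>
      <a href="..." data-mentionid="ID">@NAME</a>
    """
    name = re.compile(re.escape(username))

    def fix(match):
        # Keep the opening and closing tags, rewrite only the inner text.
        inner = name.sub(lambda _: replacement, match.group(2))
        return match.group(1) + inner + match.group(3)

    # Mentions: only links whose data-mentionid matches the deleted member.
    mention = re.compile(
        r'(<a\b[^>]*\bdata-mentionid="%d"[^>]*>)(.*?)(</a>)' % member_id,
        re.S,
    )
    # Citations: the attribution line at the top of a quote box.
    citation = re.compile(
        r'(<div\b[^>]*\bclass="[^"]*\bipsQuote_citation\b[^"]*"[^>]*>)(.*?)(</div>)',
        re.S,
    )
    return citation.sub(fix, mention.sub(fix, html))
```

A regex is used here only to keep the sketch short. Real post HTML would be safer to handle with a proper HTML parser, and any attribute that also stores the name would need the same treatment.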


I had this same issue last week. It's specifically the @Paul E. usertag type stuff, and the names that appear in quotes. The HTML is all pre-parsed in the post table for these names, so it takes (as of the current architecture) a search/replace of the post's text in the database. Usernames attributed to posts are an easy fix--there's an ID that can be matched. Reparsing the content of posts is a bit more of an infrastructure change, though I think necessary. Perhaps as a background job.
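As a rough illustration of the background-job idea, the sketch below walks the post table in primary-key order, one batch at a time, and rewrites only the rows that actually change. Everything here is assumed for the example: the `posts(id, content)` table, the `scrub_in_batches` name, and the use of SQLite. In a real job each batch would presumably be its own queued task, with `last_id` saved between runs.

```python
import sqlite3


def scrub_in_batches(conn, rewrite, batch_size=1000):
    """Apply `rewrite(content)` to every post, one keyed batch at a time.

    Walking by primary key (id > last_id) keeps each query small and
    avoids a single table-wide UPDATE on the content column.
    Returns the number of posts that were changed.
    """
    last_id = 0
    changed = 0
    while True:
        rows = conn.execute(
            "SELECT id, content FROM posts WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return changed

        # Only write back rows whose content actually differs.
        updates = []
        for post_id, content in rows:
            new_content = rewrite(content)
            if new_content != content:
                updates.append((new_content, post_id))

        conn.executemany("UPDATE posts SET content = ? WHERE id = ?", updates)
        conn.commit()  # one short transaction per batch
        changed += len(updates)
        last_id = rows[-1][0]
```

The `rewrite` argument can be any function that takes the stored HTML and returns the cleaned HTML, so the matching logic stays separate from the batching.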

 


  • Management
15 hours ago, Meddysong said:

Yes, that's understandable, Charles, and I don't think anybody could expect you to do a full replacement of every single mention for precisely that reason. It would be possible to target names in quotes and mentions, wouldn't it? I'm not a particularly versed coder but your team could probably find a way to say "any instance of the username which is located within <ipsCitation> tags or within a container which has the data attribute 'data-mentionid'", I would think.

Yes, that's something we have considered. It's just an intensive operation to do a search and replace across a database. Imagine a site with millions of posts.

It's something we want to do but just need to work on the engineering side of it.


1 hour ago, Charles said:

Imagine a site with millions of posts.

This is our scenario. Find/replace operations via SQL on the post content blob field are at the point where they're simply unworkable. Yet, with some engineering and creative indexing, this is not an insurmountable issue. Look forward to seeing it addressed.


On 9/28/2020 at 1:26 PM, jellyhound said:

We recently tried to comply with a GDPR data deletion request for one of our clients and stumbled upon a major issue with Invision's "anonymise attributions" feature.

As part of this process we must delete the users from the forum.
We want to keep the conversation flowing, so we want to anonymise attributions, which is great and generally understood to be an acceptable GDPR approach.
However, many users like to use their full names as usernames, and a full name is Personally Identifiable Information (PII).

So you delete the user and the username is changed to guest, great!

However, if anyone has "quoted" that user, the name still appears throughout the thread, leaving a huge data task of cleaning up all the quote references.

Do we really need to do anything about that?

It would be like forbidding users to mention the other user's name, which does not make sense to me. I'm not a lawyer, but I have serious doubts that the GDPR gives users this right.

 


I would think it's far easier. In quotes, simply have the editor insert a tag for the member ID that's replaced when viewed. Something like ##MID-2245##.

Since it's tied to the MID, which never changes, a quick str_replace would just insert the currently associated member name upon displaying a post (or, if none is found, just list "Guest").
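A minimal sketch of that substitution, written in Python rather than the forum's own language. The `##MID-…##` token format comes from the suggestion above; the `render_post` name and the `names` dictionary (standing in for the member table) are assumptions for the example.

```python
import re

# Token format taken from the suggestion above, e.g. ##MID-2245##
MID_TOKEN = re.compile(r"##MID-(\d+)##")


def render_post(content, names):
    """Swap each ##MID-n## token for the member's current display name.

    `names` maps member ID -> display name (a stand-in for the member
    table). A deleted or unknown member renders as "Guest".
    """
    return MID_TOKEN.sub(
        lambda m: names.get(int(m.group(1)), "Guest"),
        content,
    )
```

With this, `render_post("##MID-2245## said:", {2245: "Fast Lane!"})` gives `Fast Lane! said:`, and the same call with an empty dictionary gives `Guest said:`, with no stored post text needing to change.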

Edited by Fast Lane!

  • Management
9 hours ago, Fast Lane! said:

I would think it's far easier. In quotes, simply have the editor insert a tag for the member ID that's replaced when viewed. Something like ##MID-2245##.

Since it's tied to the MID, which never changes, a quick str_replace would just insert the currently associated member name upon displaying a post (or, if none is found, just list "Guest").

That would require us to query the member database on every page load, perhaps dozens of times. It would be quite a performance impact.
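For illustration only, here is one way the number of lookups per page could be bounded: collect every member ID referenced by the `##MID-…##` tokens on the page first, then resolve them in a single call. The `render_page` name and the `fetch_names` callable (standing in for one `WHERE member_id IN (...)` style query) are assumptions, and the sketch does not show that the remaining cost would be acceptable on a large site.

```python
import re

MID_TOKEN = re.compile(r"##MID-(\d+)##")


def render_page(posts, fetch_names):
    """Render all posts on a page with a single name lookup.

    `posts` is a list of stored post bodies. `fetch_names` is a
    hypothetical callable that takes a set of member IDs and returns
    a dict of ID -> display name for the members that still exist.
    """
    # Pass 1: gather the distinct member IDs used anywhere on the page.
    ids = {int(mid) for post in posts for mid in MID_TOKEN.findall(post)}

    # One lookup for the whole page (skipped if there are no tokens).
    names = fetch_names(ids) if ids else {}

    # Pass 2: substitute names, falling back to "Guest".
    return [
        MID_TOKEN.sub(lambda m: names.get(int(m.group(1)), "Guest"), post)
        for post in posts
    ]
```

This still adds work to every page view, so it trades the dozens of queries for one extra query plus two passes over the post text.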


22 hours ago, ptprog said:

Do we really need to do anything about that?

It would be like forbidding users to mention the other user's name, which does not make sense to me. I'm not a lawyer, but I have serious doubts that the GDPR gives users this right.

Depending on the community, there may be other reasons for display name changes. Imagine you registered for a community that dealt with something embarrassing, and two months later you realize that using your full legal name as your display name may not have been the best life decision. It would be a nice, welcome feature to have the names displayed in quotes and in mentions update with display name changes to help keep continuity over time.

If someone simply refers to the name otherwise (not in a mention or quote), I don't imagine we'd want that to be handled by such an extension of the display name change functionality that presently exists.
