Jump to content

Emojis replaced with question marks


Recommended Posts

I just noticed that I have a bunch of posts where emojis where replaced with a question mark.  I already double checked the database setup, and the correct encoding seems to be used everywhere.  Moreover, in a few test posts I created, emojis were created fine.

Has anybody else experienced the same issue?  Any idea of what is the cause?

Link to comment
Share on other sites

Just for future reference. 

mysqldump --default-character-set=utf8mb4 -h [host] -u [username] -p [database_name] --set-gtid-purged=OFF --port=3306 --protocol=tcp --skip-triggers -r dump.sql

 

It turns out, using > is not always a safe way of exporting. Using -r dump.sql instead of > dump.sql does the trick.

Edited by Linux-Is-Best
format
Link to comment
Share on other sites

There was no move of database in the past years, and the last failure is on a post from two weeks ago.

There was also no recent change on DB configs, so maybe some periodic task is causing this.  I need to recover older backups to confirm this.

Link to comment
Share on other sites

16 hours ago, Stuart Silvester said:

This can happen when you export a database with the wrong character set, i.e. if moving server etc. You would need to specify --default-character-set=utf8mb4 when running mysqldump.

It is irreversible, you cannot restore the now corrupt emoji data.

If its set in the database as default, does it automatically recognise that? I've never used the query in mysqldumps before to include utf8mb4

db.thumb.jpg.d0d33bee6623ef86b5113a41ccfb33ac.jpg

collation.png.9f675c4c72a3a62ab1191e98af65de55.png

Edited by marklcfc
Link to comment
Share on other sites

  • Management

Thanks for posting!

Unfortunately, this issue is beyond the scope of our technical support. 👩‍💻

Our technical support is happy to help you with the Invision Community platform, but we're unable to help with things like server management, theme questions and modifications.

I've moved this to our Community Support area where other Invision Community owners will see it and help where they can.

Link to comment
Share on other sites

6 hours ago, ptprog said:

Yes, I fully manage my server.

Ok, then we can rule out a backup/restore that didn't factor in utf8mb4, and that this is not something introduced from a backup/restore operation that stripped out emoji characters.

The other possibility is that your community uses native emoji characters instead of the Twitter emojis. If you're doing this, and you're browsing on a device that doesn't have newer emojis available, you will see oddness. You can switch to the Twitter emojis and see if those posts begin displaying correctly for you (assuming the new characters have been adopted by Twitter and not in your native platform's supported fonts).

See: https://blog.emojipedia.org/first-look-217-new-emojis-in-ios-14-5/

Edited by Paul E.
Link to comment
Share on other sites

I'm indeed using the native ones.  I never really looked into the difference between native and Twitter, but now I see how the Twitter ones provide more reliability.  This is something I need to discuss internally, as it should prevent losses of data.

 

I still don't think this totally explains the issue I'm seeing, though.  I would expect that if somebody is able to type an emoji, then his OS should be able to send it to the server (we could have problems with quotes or C&P, but they should be uncommon).

Link to comment
Share on other sites

1 minute ago, ptprog said:

I would expect that if somebody is able to type an emoji, then his OS should be able to send it to the server (we could have problems with quotes or C&P, but they should be uncommon).

That's exactly what happens. It's your browser/OS not rendering them correctly.

They will appear just fine to whoever typed them. It's usually iOS that has emojis before everyone else.

Link to comment
Share on other sites

39 minutes ago, ptprog said:

then his OS should be able to send it to the server

The browser does send the emoji..an emoji is really just text though, it's not an image. The browser renders the image from the text...but if the OS/Browser doesn't know the emoji (which happens when apple release new ones as per Paul's comments) then you see the question marks. But someone else viewing the same post could see the image.

Link to comment
Share on other sites

My problem is not with some browsers not showing emojis.  My problem is with having a "?" char (0x3F) stored in the database instead of an emoji.

 

Meanwhile I think I was able to trace the source of the issue based on the posts affected.  I used a REGEXP_REPLACE query to remove some data from posts, and it seems the REGEXP_REPLACE does not handle some UTF-8 characters properly.  Time to try to recover the data from a backup 🙂

Link to comment
Share on other sites

1 hour ago, ptprog said:

My problem is not with some browsers not showing emojis.  My problem is with having a "?" char (0x3F) stored in the database instead of an emoji.

And how are you viewing the database? Does the OS and/or browser through which you are viewing it support that specific emoji? Just saying that sometimes the ? isn't always an indication of a problem with database.

Based on your last paragraph, looks like you figured it out yourself but I am still making the above point in case others happen across this at a later point.

Edited by Nathan Explosion
Link to comment
Share on other sites

On 2/20/2021 at 10:07 AM, Nathan Explosion said:

And how are you viewing the database? Does the OS and/or browser through which you are viewing it support that specific emoji? Just saying that sometimes the ? isn't always an indication of a problem with database.

My terminal supports UTF-8 chars, and I could see emojis on the posts that still had the emojis.  But just in case this was an issue with a specific emoji, I also used the MySQL HEX function to see the hexadecimal representation of the char stored and confirmed it was a plain "?".

 


 

It looks like this bug is known for 4 years: https://jira.mariadb.org/browse/MDEV-11777 (but there is no mention to it in the function documentation).  Time to consider going back to MySQL...

Edited by ptprog
Link to comment
Share on other sites

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...