
[Sticky] Special characters display broken

13 Posts
6 Users
0 Reactions
3,761 Views
(@esteliel)
Posts: 8
Active Member
Topic starter
 

URL to your eFiction: http://faerie-archive.com
Version of eFiction: 3.5.3.
Have you bridged eFiction, if so with what?: no
Version of PHP: 5.6
Version of MySQL: 4.2.12deb2+deb8u2
Have you searched for your problem: yes
If so, what terms did you try: special characters, charset
State the nature of your problem:

I run a Tolkien fic archive, which means that there's lots of names with special characters. I fixed a problem with them in 2011 thanks to this forum, but since around 2 weeks ago, special characters began to be displayed weirdly again. When I looked into this, I found that my hoster did a Debian upgrade to their servers.

To try and fix this, I went into en.php and changed the character set to UTF-8, which seemed to help at first, because all the weird symbols vanished and the special characters were displayed correctly again.

Unfortunately, since that change new errors have appeared in older stories - not all older stories, but some. You can see it for example here: http://faerie-archive.com/viewstory.php?sid=1906

If I change en.php back from UTF-8 to ISO-8859-1, those are displayed correctly again, but then the other special character errors return all over the site.

Do you have a test account for us? Account: Test Password: test


 
Posted : 01/03/2018 11:49 am
(@sheepcontrol)
Posts: 332
Reputable Member
 

Had the same problem on German sites (we have the black belt in umlauts).

If your stories are stored as files, this might be a big problem, as they may be stored in an encoding other than UTF-8.

If you change the encoding to UTF-8 in en.php, what exactly is right and what is wrong?
If one story is wrong, are all of them with special characters wrong?
If stories are right, what else is wrong?


 
Posted : 01/03/2018 1:17 pm
(@esteliel)
Posts: 8
Active Member
Topic starter
 

Right now, the site is set to UTF-8 in en.php.

There are errors popping up in several stories, for example http://faerie-archive.com/viewstory.php?sid=1906
It seems to affect quotation marks and characters like ?, ?, ?, ?.
At the same time, other stories display the same characters correctly.

When en.php is set to ISO-8859-1, text all over the site displays weirdly, not just some of the stories, but also summaries, the little intro header text of the archive, the shoutbox etc. This is what it looks like then:
Meanwhile the example from above displays correctly with that setting:

All of this was displayed correctly with ISO-8859-1 until about two weeks ago, before my hoster did a Debian upgrade.

I have the stories stored as files, and not in the database.


 
Posted : 02/03/2018 11:25 am
(@sheepcontrol)
Posts: 332
Reputable Member
 

When the header and such are created in UTF-8, they can't display properly when the page is served as ISO-8859-1; that makes sense.

With the stories (or rather chapters), could it be that they are okay up to some point and broken past that point (or the other way around)?

There is a fix I bodged together for the above-mentioned German site. It's ugly as night, but it works, so you might want to give it a shot:

Open viewstory.php and go to around line 360.

Look for this code:

$file = STORIESPATH."/$chapterauthor/$chapid.txt";
$log_file = @fopen($file, "r");
$file_contents = @fread($log_file, filesize($file));

Add directly below:

if ( !mb_detect_encoding($file_contents, 'UTF-8', true) )
    $file_contents = mb_convert_encoding($file_contents, "UTF-8", "ISO-8859-1");

In the end, this section should look like this:

$file = STORIESPATH."/$chapterauthor/$chapid.txt";
$log_file = @fopen($file, "r");
$file_contents = @fread($log_file, filesize($file));
// If the chapter file is not valid UTF-8, assume ISO-8859-1 and convert it.
if ( !mb_detect_encoding($file_contents, 'UTF-8', true) )
{
    $file_contents = mb_convert_encoding($file_contents, "UTF-8", "ISO-8859-1");
}
$story = $file_contents;
@fclose($log_file);
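
The check in that fix can be tried on its own, outside viewstory.php. Below is a minimal sketch, assuming PHP 7 or later with the mbstring extension; chapter_to_utf8() is a hypothetical helper name, not something eFiction ships, and it uses mb_check_encoding() in place of the strict mb_detect_encoding() call above. It also assumes that anything that is not valid UTF-8 was saved as ISO-8859-1.

```php
<?php
// Hypothetical helper (not part of eFiction): return chapter text as UTF-8.
function chapter_to_utf8(string $bytes): string
{
    // True only if every byte sequence in $bytes is well-formed UTF-8.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Assumption: text that is not valid UTF-8 was saved as ISO-8859-1.
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "L\xFAthien";      // "Lúthien" saved as ISO-8859-1 (ú is the byte 0xFA)
$utf8   = "L\xC3\xBAthien";  // "Lúthien" saved as UTF-8 (ú is the bytes 0xC3 0xBA)

var_dump(chapter_to_utf8($latin1) === $utf8); // ISO-8859-1 input gets converted
var_dump(chapter_to_utf8($utf8) === $utf8);   // UTF-8 input is left untouched
```

One limitation of this kind of check: text that has already been converted to UTF-8 twice is still valid UTF-8, so it passes through unchanged and stays broken.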

 
Posted : 02/03/2018 5:11 pm
(@esteliel)
Posts: 8
Active Member
Topic starter
 

Thanks! I tried your workaround, but unfortunately the problem stories like http://faerie-archive.com/viewstory.php?sid=120 still don't display correctly. 🙁

As these are chapters that were posted years ago and used to display correctly until my hoster's Debian upgrade, I'm really not sure what changed. I'd also happily go back to using ISO-8859-1 in the header to have it display correctly, but then there are broken characters in the shoutbox, summaries and news etc.


 
Posted : 07/03/2018 11:26 pm
(@esteliel)
Posts: 8
Active Member
Topic starter
 

I think I've got it fixed now.

What I did was to download UTFCast Express and use it to convert all txt files in my fic folder to UTF-8.

When I uploaded them, the black diamond question marks finally displayed the correct characters again. But now there were other broken characters showing up, e.g. Ã« for ë.

So what I did then was to use the Find in Files function of Notepad++ to do a search and replace for all the broken characters in my fic folder, using this UTF-8 debug list: http://www.i18nqa.com/debug/utf8-debug.html , replacing Ã« with ë and so on.
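
For anyone who would rather script that second step than search and replace by hand, here is a rough sketch. It assumes the broken text is UTF-8 that was read as Windows-1252 and encoded to UTF-8 a second time, which is the pattern the debug list above documents (ë becomes the two characters Ã«). fix_double_encoded() is a hypothetical helper name; it needs PHP 7 or later with mbstring.

```php
<?php
// Hypothetical helper: undo one round of "UTF-8 read as Windows-1252" double encoding.
function fix_double_encoded(string $text): string
{
    // Map each character back to the single byte it stood for in Windows-1252.
    $candidate = mb_convert_encoding($text, 'Windows-1252', 'UTF-8');

    // Accept the result only if nothing was lost on the way (round trip matches)
    // and the recovered bytes are themselves well-formed UTF-8.
    $lossless = mb_convert_encoding($candidate, 'UTF-8', 'Windows-1252') === $text;
    if ($lossless && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }
    return $text; // leave anything else alone
}

$broken = "Elenw\xC3\x83\xC2\xAB"; // "ElenwÃ«": both bytes of ë were encoded again
$fixed  = fix_double_encoded($broken);

var_dump($fixed === "Elenw\xC3\xAB"); // compares against "Elenwë" in UTF-8
```

The function works on a whole string at once, so a file that mixes correct and double-encoded characters is returned unchanged; that case would have to be handled line by line or with a replacement table. Try it on a copy of the fic folder first.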


 
Posted : 09/03/2018 2:28 pm
(@sheepcontrol)
Posts: 332
Reputable Member
 

Awesome work.
Going to pin this topic as it could be of value for others too.


 
Posted : 10/03/2018 3:52 am
ShiKahr
(@shikahr)
Posts: 60
Trusted Member
 

I have the same issue, since I moved my archive to my localhost.
I would like to get it right again. Is there any setting in MySQL or Apache that needs to be amended?


"Get your Arsch out of the couch." Gayle Tufts

 
Posted : 19/08/2019 5:26 pm
(@azurite)
Posts: 209
Reputable Member
 

I'm having this same issue despite trying to apply the fix. Now I have the special character (e-acute in my case, é) that shows up at the end of every alphabetical list, but the character itself in context (like in a character's name) still displays as a question mark.


Archive: Dragonfayth
eFiction: 3.5.5/6
Latest Patch(es): Yes
bridged?: No
modified?: Yes
PHP: 7.4.25
MySQL: 5.7.32-35-log

 
Posted : 22/12/2021 7:16 am
(@jimmi)
Posts: 95
Estimable Member
 

@azurite 

It means that your file (the alphabet in the language files) has the correct encoding, but you have a problem with the database (those characters are stored in the database).

When I did a version that is able to work with e107 CMS (clean UTF-8), I noticed some issues related to this topic (not to mention that wordcount() is wrong with national characters).

I had to do some changes, for example in:

includes/mysqli_functions.php 

after

mysqli_query($mysql_access, "SET SESSION sql_mode = 'MYSQL40'");

I added 

 mysqli_query($mysql_access, "SET NAMES UTF8;");

And you need to check your database encoding: not only the database itself, but the tables too.
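
As a rough illustration of checking "the tables too", the sketch below only builds the ALTER statements so they can be reviewed and pasted into phpMyAdmin or the mysql client; it does not run anything. The database name and table names are placeholders, and charset_statements() is a hypothetical helper. Note that CONVERT TO assumes the stored bytes really are in the charset the table currently declares; if UTF-8 bytes were written into latin1 columns, converting this way can double-encode them, so take a backup first.

```php
<?php
// Hypothetical helper: build (but do not execute) charset conversion statements.
function charset_statements(
    string $database,
    array $tables,
    string $charset = 'utf8',
    string $collation = 'utf8_general_ci'
): array {
    // Quote identifiers with backticks, doubling any backtick inside the name.
    $quote = function (string $name): string {
        return '`' . str_replace('`', '``', $name) . '`';
    };

    // Default charset for tables created in this database from now on.
    $sql = [sprintf('ALTER DATABASE %s CHARACTER SET %s COLLATE %s',
                    $quote($database), $charset, $collation)];

    // Convert each existing table, including its text columns.
    foreach ($tables as $table) {
        $sql[] = sprintf('ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s',
                         $quote($table), $charset, $collation);
    }
    return $sql;
}

// Placeholder names; substitute your own database and table list.
foreach (charset_statements('efiction', ['fanfiction_stories', 'fanfiction_chapters']) as $line) {
    echo $line, ";\n";
}
```

MySQL's utf8 holds at most three bytes per character; utf8mb4 with a matching collation is the variant that covers all of Unicode, and the defaults here can be swapped for it.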

 

 


Never say that something is impossible because there will be always some dummy who will do it.
URL for efiction site: https://www.hpfanfiction.cz/
php: 8.1.33 MariaDB 10.5
efic version: 4.0.x
mods: storyimage, notifications, storyend

 
Posted : 22/12/2021 11:54 am
(@azurite)
Posts: 209
Reputable Member
 

@jimmi Thank you for this; I was able to slowly but surely change the database and table encoding from latin1_swedish_ci (not sure how it got set to that, of all things) to utf8_general_ci.

Despite that (and adding the line you recommended in the mysqli_functions file), there are still some places where the character doesn't display correctly. In stories, it's fine, but in the site's welcome message, changing the ? to the e-acute character results in everything starting from that character getting cut off and disappearing completely!

Additionally, for some reason, no matter what skin I pick, things like "Summary:" and "by" are in German, not English! When I go to the Site Settings, "en" is still selected, but there are two mysterious blank options as well. I don't know if it's related to the database encoding, but it's a bit strange that it seems to have happened around the same time that I'm trying to fix this issue.



 
Posted : 23/12/2021 4:10 pm
(@ladama)
Posts: 57
Trusted Member
 

I'm hoping this thread can help me. Everything was working fine until about the 4th of this month; now some pages have turned hyphens and such into â€“, and members aren't getting notification emails for updates, reviews, etc. For some reason or another the collation of the fanfiction_ tables is set to latin1_swedish_ci, and I suspect this is contributing to the issues. Do I need to change the collation (and what would be the best way?), or should I try the tricks Sheepcontrol posted a while ago, or what?


 
Posted : 17/02/2022 1:53 am
(@jimmi)
Posts: 95
Estimable Member
 

@azurite I am sorry, I missed your question. I hope you fixed it, but in case you didn't...

The German text is there because you are using a skin with hardcoded German text.

Characters from files are wrong because the file itself is saved in the wrong encoding. Check your editor: for example, I used PSPad before, and for some time it defaulted to the ANSI charset for all files. If you save a file with that setting, the national characters come out wrong. I think that is fixed now, but I have already moved to Visual Studio Code.

 

 

 



 
Posted : 13/12/2022 3:10 am