[Sticky] Special characters display broken

Esteliel
(@esteliel)
Active Member

URL to your eFiction: http://faerie-archive.com
Version of eFiction: 3.5.3.
Have you bridged eFiction, if so with what?: no
Version of PHP: 5.6
Version of MySQL: 4.2.12deb2+deb8u2
Have you searched for your problem: yes
If so, what terms did you try: special characters, charset
State the nature of your problem:

I run a Tolkien fic archive, which means that there are lots of names with special characters. I fixed a problem with them in 2011 thanks to this forum, but about two weeks ago special characters started displaying incorrectly again. When I looked into this, I found that my hosting provider had upgraded Debian on their servers.

To try and fix this, I went into en.php and changed the character set to UTF-8, which seemed to help at first, because all the weird symbols vanished and the special characters were displayed correctly again.

Unfortunately, since that change new errors have appeared in older stories - not all older stories, but some. You can see it for example here: http://faerie-archive.com/viewstory.php?sid=1906

If I change en.php back from UTF-8 to ISO-8859-1, those are displayed correctly again, but then the other special character errors return all over the site.

Do you have a test account for us? Account: Test Password: test

Topic starter Posted : 01/03/2018 10:49 am
Sheepcontrol
(@sheepcontrol)
Member Admin

Had the same problem on German sites (we have the black belt in umlaut).

If your stories are stored as files, this might be a big problem, as they may be stored in an encoding other than UTF-8.

If you change the encoding to UTF-8 in en.php, what exactly is right and what is wrong?
If one story is wrong, are all of them with special characters wrong?
If stories are right, what else is wrong?

Posted : 01/03/2018 12:17 pm
Esteliel
(@esteliel)
Active Member

Right now, the site is set to UTF-8 in en.php.

There are errors popping up in several stories, for example http://faerie-archive.com/viewstory.php?sid=1906
It seems to affect quotation marks and characters like ?, ?, ?, ?.
At the same time, other stories display the same characters correctly.

When en.php is set to ISO-8859-1, text all over the site displays weirdly, not just some of the stories, but also summaries, the little intro header text of the archive, the shoutbox etc. This is what it looks like then:
Meanwhile the example from above displays correctly with that setting:

All of this was displayed correctly with ISO-8859-1 until about two weeks ago, before my hosting provider did a Debian upgrade.

I have the stories stored as files, and not in the database.

Topic starter Posted : 02/03/2018 10:25 am
Sheepcontrol
(@sheepcontrol)
Member Admin

That makes sense: when the header and such are created in UTF-8, they can't display properly while the page declares ISO-8859-1.
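To make the mismatch concrete, here is a minimal sketch using the character "ë" as the example. It assumes the mbstring extension is available and is only an illustration, not part of eFiction itself; the variable names are made up for the demo.

```php
<?php
// The same character "ë" as raw bytes in each encoding.
$utf8Bytes   = "\xC3\xAB"; // two bytes in UTF-8
$latin1Bytes = "\xEB";     // one byte in ISO-8859-1

// Case 1: UTF-8 bytes on a page that declares ISO-8859-1.
// Each byte is treated as its own character, so "ë" turns into "Ã«".
$misread = mb_convert_encoding($utf8Bytes, 'UTF-8', 'ISO-8859-1');

// Case 2: ISO-8859-1 bytes on a page that declares UTF-8.
// A lone 0xEB byte is not a valid UTF-8 sequence, which is why browsers
// typically show a replacement character (the black diamond question mark).
$validAsUtf8 = mb_check_encoding($latin1Bytes, 'UTF-8');

var_dump(bin2hex($misread), $validAsUtf8);
```

An archive that contains both kinds of bytes will show one of these two symptoms whichever charset the page declares, which matches what is described above.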

With the stories (or rather chapters), could it be that they are okay up to some point and broken past that point (or the other way around)?

There is a fix I bodged together for the above-mentioned German site. It's ugly as night, but it works, so you might want to give it a shot:

Open viewstory.php and go to around line 360.

Look for this code:

    $file = STORIESPATH."/$chapterauthor/$chapid.txt";
    $log_file = @fopen($file, "r");
    $file_contents = @fread($log_file, filesize($file));

Add directly below:

    if ( !mb_detect_encoding($file_contents, 'UTF-8', true) )
    {
        $file_contents = mb_convert_encoding($file_contents, "UTF-8", "ISO-8859-1");
    }

In the end, this section should look like this:

    $file = STORIESPATH."/$chapterauthor/$chapid.txt";
    $log_file = @fopen($file, "r");
    $file_contents = @fread($log_file, filesize($file));
    // If the chapter file is not valid UTF-8, treat it as ISO-8859-1 and convert it.
    if ( !mb_detect_encoding($file_contents, 'UTF-8', true) )
    {
        $file_contents = mb_convert_encoding($file_contents, "UTF-8", "ISO-8859-1");
    }
    $story = $file_contents;
    @fclose($log_file);
Posted : 02/03/2018 4:11 pm
Esteliel
(@esteliel)
Active Member

Thanks! I tried your workaround, but unfortunately the problem stories like http://faerie-archive.com/viewstory.php?sid=120 still don't display correctly. 🙁

As these are chapters that were posted years ago and used to display correctly until my hosting provider's Debian upgrade, I'm really not sure what changed. I'd also happily go back to using ISO-8859-1 in the header to have them display correctly, but then there are broken characters in the shoutbox, summaries, news etc.

Topic starter Posted : 07/03/2018 10:26 pm
Esteliel
(@esteliel)
Active Member

I think I've got it fixed now.

What I did was to download UTFCast Express and use it to convert all txt files in my fic folder to UTF-8.

When I uploaded them, the black diamond question marks finally displayed the correct characters again. But now there were other broken characters showing up, e.g. Ã« instead of ë.

So what I did then was to use the Find in Files function of Notepad++ to do a search and replace for all the broken characters in my fic folder, using this UTF-8 debug list: http://www.i18nqa.com/debug/utf8-debug.html , replacing Ã« with ë and so on.
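For anyone who would rather script these two steps than use desktop tools, here is a rough PHP sketch of the same idea. It is not what I actually ran, and it makes several assumptions: the mbstring extension is installed, the legacy encoding is ISO-8859-1, and chapter files sit at `<folder>/<author>/<chapter>.txt`. The function names `to_utf8` and `convert_folder` are made up for illustration. Back up the folder before trying anything like this.

```php
<?php
// Return $bytes as clean UTF-8, handling two cases:
//  1. legacy bytes that are not valid UTF-8 (the black diamond case)
//  2. text that was converted to UTF-8 twice (e.g. "Ã«" instead of "ë")
function to_utf8(string $bytes, string $legacy = 'ISO-8859-1'): string
{
    // Case 1: not valid UTF-8, so treat it as the legacy encoding.
    if (!mb_check_encoding($bytes, 'UTF-8')) {
        return mb_convert_encoding($bytes, 'UTF-8', $legacy);
    }

    // Case 2: try to undo one layer of double encoding.
    $candidate = mb_convert_encoding($bytes, $legacy, 'UTF-8');

    // Accept the result only if it changed something, is itself valid
    // UTF-8, and converts back to exactly the original text. The round
    // trip rejects characters the legacy encoding cannot represent.
    if ($candidate !== $bytes
        && mb_check_encoding($candidate, 'UTF-8')
        && mb_convert_encoding($candidate, 'UTF-8', $legacy) === $bytes) {
        return $candidate;
    }

    // Already clean UTF-8 (or plain ASCII): leave it alone.
    return $bytes;
}

// Hypothetical batch helper: rewrites every chapter file in place.
function convert_folder(string $storiesDir): int
{
    $changed = 0;
    foreach (glob($storiesDir . '/*/*.txt') ?: [] as $file) {
        $old = file_get_contents($file);
        if ($old === false) {
            continue; // unreadable file, skip it
        }
        $new = to_utf8($old);
        if ($new !== $old) {
            file_put_contents($file, $new);
            $changed++;
        }
    }
    return $changed;
}
```

A file that mixes correct and double-encoded characters is left untouched by this sketch, so a manual search and replace may still be needed for those. Curly quotation marks are not part of ISO-8859-1, so files containing them may need a different legacy encoding passed as the second argument.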

Topic starter Posted : 09/03/2018 1:28 pm
Sheepcontrol
(@sheepcontrol)
Member Admin

Awesome work.
Going to pin this topic as it could be of value for others too.

Posted : 10/03/2018 2:52 am
ShiKahr
(@shikahr)
Trusted Member

I have had the same issue since I moved my archive to my localhost.
I would like to get it right again. Is there any setting in MySQL or Apache that needs to be amended?

KS-Shipping-Community
"Get your Arsch out of the couch." Gayle Tufts

Posted : 19/08/2019 5:26 pm