Foreign characters behaving oddly

16/07/2007, 22:31   #1
Matthew White
Foreign characters behaving oddly

Hello,
I have a website that is supposed to take a French word and return the
English translation. The front end has an AJAX script that dynamically
POSTs the value to the backend:

function post() {
    var string = document.getElementById("string").value;
    var poststr = "string=" + encodeURI( string );
    makePOSTRequest('dict.eng.php', poststr);
}

Then the backend takes the string and queries a database for the 30 words
most like that word:

$query = "SELECT * FROM dictionary WHERE fr like ('" . $string . "%') ORDER BY fr LIMIT 30";
$query = mysql_query($query);

If I enter a word like "bonjour", the script returns the words that are
most like bonjour. A word with a special character, like "français",
returns no results, even though it is in the dictionary. The page is in
UTF-8, and the database, tables, and fields are all utf8_bin. Can anyone
please point me in the right direction?

Matt
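
A side note on the front-end snippet above: encodeURI() leaves characters such as "&", "=" and "+" untouched, so a value containing them would break the POST body, whereas encodeURIComponent() is the usual choice for a single field value. Both emit non-ASCII characters as percent-encoded UTF-8 bytes, so the word should already reach the backend as UTF-8. A minimal sketch of this, where buildPostBody is a hypothetical helper and not part of the original script:

```javascript
// Hypothetical helper (not from the original post): builds a form-style
// POST body, percent-encoding every field name and value.
function buildPostBody(fields) {
  return Object.keys(fields)
    .map(function (name) {
      return encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]);
    })
    .join("&");
}

// "\u00e7" is the character ç; it is sent as the UTF-8 bytes C3 A7.
console.log(buildPostBody({ string: "fran\u00e7ais" })); // string=fran%C3%A7ais

// encodeURI() keeps "&", which would split the value into two fields.
console.log(encodeURI("chat & chien"));          // chat%20&%20chien
console.log(encodeURIComponent("chat & chien")); // chat%20%26%20chien
```

Since the percent-encoded bytes are UTF-8 either way, this does not explain the missing results by itself; it only rules out the front end as the place where "ç" gets lost.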




19/07/2007, 13:29   #2
Markus
Re: Foreign characters behaving oddly

Matthew White wrote:
> "Markus" <derernst@NO#SP#AMgmx.ch> wrote in message
> news:469db6f9$1_1@news.cybercity.ch...
>> Matthew White wrote:
>>
>>> "Rik" <luiheidsgoeroe@hotmail.com> wrote in message
>>> news:op.tvmvq3jkqnv3q9@metallium...
>>>> On Tue, 17 Jul 2007 20:31:56 +0200, Matthew White <mgw854@msn.com>
>>>> wrote:
>>>>
>>>>> I added that query right after calling the database, and it now
>>>>> works fine,
>>>>> but here is a problem: "français" returns three matches:
>>>>> franÃ§ais
>>>>> franÃ§aise
>>>>> franÃ§aises
>>>>>
>>>>> Why is "Ã§" being substituted for "ç", even when I pass each
>>>>> returned string
>>>>> through htmlentities()?

[...]
>> It looks like your string is in UTF-8 encoding, but the output is
>> converted to Latin-1 or whatever. Check the following points:
>>
>> 1. All scripts (PHP, HTML) are in UTF-8 encoding
>>
>> 2. Send UTF-8 header to the browser:
>> header('Content-Type: text/html; charset=UTF-8');
>>
>> 3. Set also the appropriate Meta tag in the HTML source (should not be
>> necessary if correct header is sent, but you never know about browsers):
>> <meta http-equiv="content-type" content="text/html;charset=UTF-8">

[...]
>
> I had already made sure of the first and last, but I did add the
> header() to my PHP file. It has made no difference in the output.


Hum... if you don't find the solution in the links posted by Good Man,
you could try to add

ini_set('default_charset', 'utf-8');

to your PHP script (somewhere at the top); but I also think it is rather
a MySQL issue now. BTW, which MySQL version do you use?

One possible reason is that the DB contents that existed before you
added mysql_query("SET NAMES 'utf8'") are now returned distorted: you
entered them without telling the DB they are UTF-8, so "ç" was
stored as "Ã§", which is now returned in proper UTF-8 encoding. To
test this, run the same test with data you entered after you added the
"SET NAMES" query.

Anyway, if this is the case, it is likely that your original problem
will reappear with all data entered with the proper SET NAMES setting!
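
The distortion described above can be reproduced outside MySQL. The following Node.js sketch (an illustration only, assuming Node's built-in Buffer; it does not touch a database) takes the UTF-8 bytes of "français", misreads them as Latin-1 the way a connection without SET NAMES would, and then reverses the mistake:

```javascript
// "\u00e7" is the character ç.
const original = "fran\u00e7ais";

// UTF-8 encoding: ç becomes the two bytes 0xC3 0xA7.
const utf8Bytes = Buffer.from(original, "utf8");

// Misreading those bytes as Latin-1 turns each byte into one character,
// so the two bytes of ç show up as "Ã" and "§".
const misread = utf8Bytes.toString("latin1");

// Undoing it: convert the misread text back to Latin-1 bytes and decode
// those bytes as UTF-8.
const repaired = Buffer.from(misread, "latin1").toString("utf8");

console.log(utf8Bytes.toString("hex")); // 6672616ec3a7616973
console.log(misread);                   // franÃ§ais
console.log(repaired);                  // français
```

Rows written before the connection charset was set would hold the misread form, which is why old and new rows can behave differently in the same table.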
19/07/2007, 13:34   #3
Nis Jørgensen
Re: Foreign characters behaving oddly

Matthew White wrote:
> I added that query right after calling the database, and it now works fine,
> but here is a problem: "français" returns three matches:
> franÃ§ais
> franÃ§aise
> franÃ§aises
>
> Why is "Ã§" being substituted for "ç", even when I pass each returned
> string
> through htmlentities()?


htmlentities() will interpret what comes from the database as ISO-8859-1,
while it is in fact UTF-8.

Either use
htmlentities($myvar, ENT_QUOTES, 'utf-8')
or
htmlspecialchars($myvar)

I recommend the second option - if your output is utf-8, you should
hardly ever need htmlentities.

Nis
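
To illustrate the difference between the two approaches, here is a rough JavaScript analogue (not PHP, and not the actual implementation of either PHP function). escapeHtml is a hypothetical stand-in for htmlspecialchars() with ENT_QUOTES, and entitiesAssumingLatin1 simulates an entity encoder that treats every UTF-8 byte as a separate Latin-1 character. PHP would emit named entities such as &Atilde;&sect; where this sketch emits numeric ones, but they render the same:

```javascript
// Hypothetical analogue of htmlspecialchars(): escapes only the characters
// that are special in HTML and leaves everything else, including ç, alone.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#039;" };
  return text.replace(/[&<>"']/g, function (ch) {
    return map[ch];
  });
}

// Simulation of an entity encoder that wrongly assumes Latin-1 input:
// each byte above 0x7F in the UTF-8 encoding becomes its own entity.
function entitiesAssumingLatin1(text) {
  let out = "";
  for (const byte of Buffer.from(text, "utf8")) {
    out += byte > 0x7f ? "&#" + byte + ";" : String.fromCharCode(byte);
  }
  return out;
}

// "\u00e7" is the character ç (UTF-8 bytes 0xC3 0xA7, decimal 195 and 167).
console.log(escapeHtml("<b>fran\u00e7ais</b>"));      // &lt;b&gt;français&lt;/b&gt;
console.log(entitiesAssumingLatin1("fran\u00e7ais")); // fran&#195;&#167;ais
```

The second output is what a browser renders as "franÃ§ais", matching the symptom quoted above.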
19/07/2007, 20:02   #4
Matthew White
Re: Foreign characters behaving oddly

"Matthew White" <mgw854@msn.com> wrote in message
news:CsRmi.2399$s25.1211@trndny04...
> Hello,
> I have a website that is supposed to take a French word and return the
> English translation. The front end has an AJAX script that dynamically
> POSTs the value to the backend:
>
> function post() {
> var string = document.getElementById("string").value;
> var poststr = "string=" + encodeURI( string );
> makePOSTRequest('dict.eng.php', poststr);
> }
>
> Then the backend takes the string and queries a database for the 30 words
> most like that word:
>
> $query = "SELECT * FROM dictionary WHERE fr like ('" . $string . "%')
> ORDER BY fr LIMIT 30";
> $query = mysql_query($query);
>
> If I enter a word like "bonjour", the script returns the words that are
> most like bonjour. A word with a special character, like "français",
> returns no results, even though it is in the dictionary. The page is in
> UTF-8, and the database, tables, and fields are all utf8_bin. Can anyone
> please point me in the right direction?
>
> Matt


Retracing my steps, I opened up the MySQL database, only to find those
values were corrupted. After adding mysql_query("SET NAMES 'utf8'") to
the script that parses the dictionary file, I was able to make everything
work well. Thanks for everyone's help!

Matt
