Strange 'Â' character output when using simplexml load string

Posted 24/02/2008, 18:33   #1
bizt

Hi,

I am converting an XML string using the simplexml_load_string function. It
is giving me a 'Â' character dotted around the text for some reason. I
retrieve the XML with a cURL function, and this is the second time, from
two different sources, that I have seen this. I have checked the XML after
it is retrieved and the character is not there; it only appears after I use
the simplexml function (and do a print_r). Until now I have been using
str_replace, but I was wondering whether anyone else has encountered this
and what may cause it, so that hopefully I can get rid of it.

Cheers

Burnsy

Posted 24/02/2008, 20:18   #2
Andy Hassall

On Sun, 24 Feb 2008 10:33:47 -0800 (PST), bizt <bissatch@yahoo.co.uk> wrote:

>I am converting an XML string using the simplexml_load_string function. It
>is giving me a 'Â' character dotted around the text for some reason. I
>retrieve the XML with a cURL function, and this is the second time, from
>two different sources, that I have seen this. I have checked the XML after
>it is retrieved and the character is not there; it only appears after I use
>the simplexml function (and do a print_r). Until now I have been using
>str_replace, but I was wondering whether anyone else has encountered this
>and what may cause it, so that hopefully I can get rid of it.

simplexml always outputs in UTF-8. Is your page's encoding UTF-8?
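
A minimal sketch of that behaviour, assuming the simplexml extension is loaded. The sample document and the variable names are made up for illustration and are not the original poster's data. It feeds SimpleXML a Latin-1 document and shows that the text comes back as UTF-8, which is why the page itself should be declared as UTF-8:

```php
<?php
// When served over HTTP, declare the charset so the browser decodes the
// output as UTF-8. (Guarded so the sketch also runs from the command line.)
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Hypothetical input: a Latin-1 document whose text contains a
// non-breaking space (the single byte 0xA0 in ISO-8859-1).
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
           . "<price>10\xA0GBP</price>";

// SimpleXML hands the text back as UTF-8 whatever the input declared,
// so the 0xA0 byte has become the two-byte sequence 0xC2 0xA0.
$text = (string) simplexml_load_string($latin1Xml);

echo bin2hex($text), "\n"; // 3130c2a0474250
```

If the page is then displayed as ISO-8859-1, that 0xC2 byte is what shows up as the stray 'Â'.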
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool

Posted 25/02/2008, 09:56   #3
Toby A Inkster

Andy Hassall wrote:
> bizt <bissatch@yahoo.co.uk> wrote:
>
>> I am converting an XML string using the simplexml_load_string function. It is
>> giving me a 'Â' character for some reason dotted around the text.
>
> simplexml always outputs in UTF-8. Is your page's encoding UTF-8?

At a guess, the page is being read as ISO-8859-1 or perhaps ISO-8859-15.

In UTF-8, a "prefix" byte of 0xC2 is used to reach the first half of the
"Latin-1 Supplement" block (U+0080 to U+00BF), which includes a lot of juicy
characters such as currency symbols, fractions, superscript 2 and 3, the
copyright and registered trademark symbols, and the non-breaking space.

However, in ISO-8859-1 and -15 the byte 0xC2 represents an Â, so if UTF-8 is
misinterpreted as one of those, then you get an Â followed by whatever
character the second byte stands for in that charset.
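
A minimal sketch of that misreading, assuming the iconv extension is available. The sample string is invented for illustration. It takes the UTF-8 bytes for a non-breaking space and a pound sign, decodes them as though they were ISO-8859-1, and re-encodes the result as UTF-8 so the damage can be inspected:

```php
<?php
// U+00A0 (non-breaking space) and U+00A3 (pound sign) encoded as UTF-8:
// each is the prefix byte 0xC2 followed by one more byte.
$utf8 = "\xC2\xA0\xC2\xA3";

// Simulate a reader that wrongly treats those bytes as ISO-8859-1:
// every byte becomes a character of its own.
$misread = iconv('ISO-8859-1', 'UTF-8', $utf8);

echo bin2hex($utf8), "\n";     // c2a0c2a3
echo bin2hex($misread), "\n";  // c382c2a0c382c2a3
```

In the second line of output, each c382 is an Â (U+00C2), so the two original characters are displayed as an Â followed by a non-breaking space, then an Â followed by a pound sign.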

Probably the easiest solution would be to take the output from SimpleXML
and pass it through iconv():

$xmlout = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);

Note that UTF-8 is capable of representing a far greater range of
characters than ISO-8859-1/-15 are, so certain characters may not properly
survive conversion. (Using the '//TRANSLIT' option tells iconv to do its
best, and if, say, a particular accented character is not available in
ISO-8859-1, then to substitute an unaccented one in its place.)
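
A slightly fuller sketch of the same call, with an invented sample string standing in for the SimpleXML output. It also checks the return value, since iconv() returns false when the input is not valid in the source charset. What '//TRANSLIT' substitutes for an unavailable character depends on the iconv implementation PHP was built against, so the sample sticks to characters that exist in ISO-8859-15:

```php
<?php
// Hypothetical SimpleXML output: UTF-8 text containing a pound sign and
// an accented letter, both of which exist in ISO-8859-15.
$xmlout = "\xC2\xA310 caf\xC3\xA9";

$converted = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);

if ($converted === false) {
    // The input was not valid UTF-8; fall back to an empty string here,
    // though real code would probably log or report the problem.
    $converted = '';
}

echo bin2hex($converted), "\n"; // a3313020636166e9
```

Each two-byte UTF-8 sequence has become a single ISO-8859-15 byte (0xA3 for the pound sign, 0xE9 for the accented e).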

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 26 days, 15:55.]

Bottled Water
http://tobyinkster.co.uk/blog/2008/02/18/bottled-water/

Posted 07/04/2008, 14:19   #4
bizt

On 25 Feb, 10:56, Toby A Inkster <usenet200...@tobyinkster.co.uk>
wrote:
> Probably the easiest solution would be to take the output from SimpleXML
> and pass it through iconv():
>
> $xmlout = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $xmlout);

Hi, I've tried what you said, which worked for one of my pages, but when
I tried it on another I got the following:

Notice: iconv() [function.iconv]: Detected an illegal character in
input string in /home/public_html/search_apartments.php on line 67

I'm using the following to convert my XML string, which is fetched via
cURL:

$result = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $result);

Could it be that I am not giving iconv() the correct input encoding for
my $result string? If so, is there a way for me to detect the input
encoding?

Cheers

Martyn

Posted 08/04/2008, 16:00   #5
AnrDaemon

Greetings, bizt.
In reply to Your message dated Monday, April 7, 2008, 17:19:28,

>> >> It is giving me a 'Â' character for some reason dotted around the text.

> Notice: iconv() [function.iconv]: Detected an illegal character in
> input string in /home/public_html/search_apartments.php on line 67

> Could it be that I am not giving iconv() the correct input encoding for
> my $result string? If so, is there a way for me to detect the input
> encoding?

At a guess, your 'Â' is probably followed by a space, and the pair
represents a non-breaking space.

As for your trouble with iconv on $result, I think you should look at the
SOURCE BEFORE using simplexml_load_string and see what encoding it uses.
If your source is in, say, ISO-8859-15, then the UTF-8 that SimpleXML gives
you cannot contain any characters that cannot be converted back to
ISO-8859-15.
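
One way to follow that advice, sketched under assumptions: the helper declaredXmlEncoding() and the sample document are hypothetical, and the check relies only on the XML declaration plus a UTF-8 validity test, which is a heuristic and not true charset detection. The idea is to inspect the raw source first, let the parser honour the declared encoding, and only then convert the UTF-8 output:

```php
<?php
// Hypothetical helper: return the encoding named in the XML declaration,
// or null when the declaration does not name one.
function declaredXmlEncoding(string $xml): ?string
{
    $pattern = '/^\s*<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']/';
    return preg_match($pattern, $xml, $m) === 1 ? strtoupper($m[1]) : null;
}

// Stand-in for the string fetched via cURL: a Latin-1 document in which
// the single byte 0xE9 is an accented e.
$result = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Caf\xE9</name>";

$declared = declaredXmlEncoding($result);

// With the /u modifier, preg_match() returns false on malformed UTF-8,
// which gives a cheap validity check without extra extensions.
$isValidUtf8 = preg_match('//u', $result) === 1;

// Let the parser honour the declaration; the text comes back as UTF-8.
$text = (string) simplexml_load_string($result);

// Only now is 'UTF-8' the right source charset for iconv().
$forLatinPage = iconv('UTF-8', 'ISO-8859-15//TRANSLIT', $text);

printf("declared=%s validUtf8=%s output=%s\n",
    $declared, var_export($isValidUtf8, true), bin2hex($forLatinPage));
```

Here the raw $result is not valid UTF-8, so passing it straight to iconv('UTF-8', ...) would trigger the "illegal character" notice quoted above, whereas the text taken from SimpleXML converts cleanly.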


--
Sincerely Yours, AnrDaemon <anrdaemon@freemail.ru>
