|
|
|
|
||||||
![]() |
|
|
LinkBack | Outils de la discussion |
|
|
#1 |
|
Messages: n/a
Hébergeur: |
I got an XML Data in UTF-8 encoding like this
Dah$)A(&li but when I run the schema to insert data into SQL Server 2005 Express edition, the data will become like this Dah$)A!'0„7li How can I solve the encoding problem? XML data: ======================================= <crew program='SH008774030000'> <member> <role>Director</role> <givenname>Dah$)A(&li</givenname> <surname>Hall</surname> </member> <member> <role>Writer</role> <givenname>Dah$)A(&li</givenname> <surname>Hall</surname> </member> </crew> XML schema: ======================================= <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:sql="urn:schemas-microsoft-com:mapping-schema"> <xsd:annotation> <xsd:appinfo> <sql:relationship name="OrderOD" parent="productionID" parent-key="program" child="productionCrew" child-key="program"/> <sql:relationship name="ODProduct" parent="productionCrew" parent-key="role givenname surname" child="crew" child-key="role givenname surname"/> </xsd:appinfo> </xsd:annotation> <xsd:element name="crew" sql:relation="productionID"> <xsd:complexType> <xsd:sequence> <xsd:element name="member" sql:relation="crew" sql:relationship="OrderOD ODProduct"> <xsd:complexType> <xsd:sequence> <xsd:element name="role" type="xsd:string"/> <xsd:element name="givenname" type="xsd:string"/> <xsd:element name="surname" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> <xsd:attribute name="program" type="xsd:string"/> </xsd:complexType> </xsd:element> </xsd:schema> Database table: ======================================= CREATE TABLE ProgramListings.dbo.productionCrew ( program VARCHAR(20), role VARCHAR(20), givenname VARCHAR(20), surname VARCHAR(20), PRIMARY KEY(program, role, surname) ) CREATE TABLE ProgramListings.dbo.crew ( role VARCHAR(20), givenname VARCHAR(20), surname VARCHAR(20), PRIMARY KEY(role, surname) ) |
|
|
|
#2 |
|
Messages: n/a
Hébergeur: |
>I got an XML Data in UTF-8 encoding like this
> > Dah$)A(&li > > but when I run the schema to insert data into SQL Server 2005 Express > edition, the data will become like this > > Dah$)A!'0"7li > > How can I solve the encoding problem? Your XML is not well-formed. Entity references need to be specified in place of illegal XML characters (e.g. "&" instead of "&"). Note that this is not specific to SQLXML but part of the basic XML standards. See http://www.w3.org/TR/REC-xml/. A CDATA section is commonly used in XML in order to eliminate the need to escape illegal characters. In a CDATA section, only the end tag ("]]>"). For example: <crew program='SH008774030000'> <member> <role>Director</role> <givenname><![CDATA[Dah$)A(&li]]></givenname> <surname>Hall</surname> </member> <member> <role>Writer</role> <givenname><![CDATA[Dah$)A(&li]]></givenname> <surname>Hall</surname> </member> </crew> -- Hope this s. Dan Guzman SQL Server MVP http://weblogs.sqlteam.com/dang/ <worlman385@yahoo.com> wrote in message news:23ouu39b3arbpskfmgdtfof9b81ilskjg3@4ax.com... >I got an XML Data in UTF-8 encoding like this > > Dah$)A(&li > > but when I run the schema to insert data into SQL Server 2005 Express > edition, the data will become like this > > Dah$)A!'0"7li > > How can I solve the encoding problem? > > XML data: > ======================================= > > <crew program='SH008774030000'> > <member> > <role>Director</role> > <givenname>Dah$)A(&li</givenname> > <surname>Hall</surname> > </member> > <member> > <role>Writer</role> > <givenname>Dah$)A(&li</givenname> > <surname>Hall</surname> > </member> > </crew> > > > XML schema: > ======================================= > > <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" > elementFormDefault="qualified" > xmlns:sql="urn:schemas-microsoft-com:mapping-schema"> > > <xsd:annotation> > <xsd:appinfo> > <sql:relationship name="OrderOD" parent="productionID" > parent-key="program" > child="productionCrew" child-key="program"/> > > <sql:relationship name="ODProduct" parent="productionCrew" > parent-key="role givenname surname" child="crew" > child-key="role givenname surname"/> > </xsd:appinfo> > </xsd:annotation> > > > <xsd:element name="crew" sql:relation="productionID"> > <xsd:complexType> > <xsd:sequence> > <xsd:element name="member" sql:relation="crew" > sql:relationship="OrderOD ODProduct"> > <xsd:complexType> > <xsd:sequence> > <xsd:element name="role" type="xsd:string"/> > <xsd:element name="givenname" type="xsd:string"/> > <xsd:element name="surname" type="xsd:string"/> > </xsd:sequence> > </xsd:complexType> > </xsd:element> > </xsd:sequence> > <xsd:attribute name="program" type="xsd:string"/> > </xsd:complexType> > </xsd:element> > > > </xsd:schema> > > Database table: > ======================================= > > CREATE TABLE ProgramListings.dbo.productionCrew > ( > program VARCHAR(20), > role VARCHAR(20), > givenname VARCHAR(20), > surname VARCHAR(20), > PRIMARY KEY(program, role, surname) > ) > > CREATE TABLE ProgramListings.dbo.crew > ( > role VARCHAR(20), > givenname VARCHAR(20), > surname VARCHAR(20), > PRIMARY KEY(role, surname) > > ) |
|
|
|
#3 |
|
Messages: n/a
Hébergeur: |
> In a CDATA section, only the end tag ("]]>").
This sentence should have been: In a CDATA section, only the end tag ("]]>") is recognized as markup. -- Hope this s. Dan Guzman SQL Server MVP http://weblogs.sqlteam.com/dang/ "Dan Guzman" <guzmanda@nospam-online.sbcglobal.net> wrote in message news:6E2D5295-651D-4DE4-ACF6-149251E0433B@microsoft.com... > >I got an XML Data in UTF-8 encoding like this >> >> Dah$)A(&li >> >> but when I run the schema to insert data into SQL Server 2005 Express >> edition, the data will become like this >> >> Dah$)A!'0"7li >> >> How can I solve the encoding problem? > > Your XML is not well-formed. Entity references need to be specified in > place of illegal XML characters (e.g. "&" instead of "&"). Note that > this is not specific to SQLXML but part of the basic XML standards. See > http://www.w3.org/TR/REC-xml/. > > A CDATA section is commonly used in XML in order to eliminate the need to > escape illegal characters. In a CDATA section, only the end tag ("]]>"). > For example: > > <crew program='SH008774030000'> > <member> > <role>Director</role> > <givenname><![CDATA[Dah$)A(&li]]></givenname> > <surname>Hall</surname> > </member> > <member> > <role>Writer</role> > <givenname><![CDATA[Dah$)A(&li]]></givenname> > <surname>Hall</surname> > </member> > </crew> > > -- > Hope this s. > > Dan Guzman > SQL Server MVP > http://weblogs.sqlteam.com/dang/ > > <worlman385@yahoo.com> wrote in message > news:23ouu39b3arbpskfmgdtfof9b81ilskjg3@4ax.com... >>I got an XML Data in UTF-8 encoding like this >> >> Dah$)A(&li >> >> but when I run the schema to insert data into SQL Server 2005 Express >> edition, the data will become like this >> >> Dah$)A!'0"7li >> >> How can I solve the encoding problem? >> >> XML data: >> ======================================= >> >> <crew program='SH008774030000'> >> <member> >> <role>Director</role> >> <givenname>Dah$)A(&li</givenname> >> <surname>Hall</surname> >> </member> >> <member> >> <role>Writer</role> >> <givenname>Dah$)A(&li</givenname> >> <surname>Hall</surname> >> </member> >> </crew> >> >> >> XML schema: >> ======================================= >> >> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >> elementFormDefault="qualified" >> xmlns:sql="urn:schemas-microsoft-com:mapping-schema"> >> >> <xsd:annotation> >> <xsd:appinfo> >> <sql:relationship name="OrderOD" parent="productionID" >> parent-key="program" >> child="productionCrew" child-key="program"/> >> >> <sql:relationship name="ODProduct" parent="productionCrew" >> parent-key="role givenname surname" child="crew" >> child-key="role givenname surname"/> >> </xsd:appinfo> >> </xsd:annotation> >> >> >> <xsd:element name="crew" sql:relation="productionID"> >> <xsd:complexType> >> <xsd:sequence> >> <xsd:element name="member" sql:relation="crew" >> sql:relationship="OrderOD ODProduct"> >> <xsd:complexType> >> <xsd:sequence> >> <xsd:element name="role" type="xsd:string"/> >> <xsd:element name="givenname" type="xsd:string"/> >> <xsd:element name="surname" type="xsd:string"/> >> </xsd:sequence> >> </xsd:complexType> >> </xsd:element> >> </xsd:sequence> >> <xsd:attribute name="program" type="xsd:string"/> >> </xsd:complexType> >> </xsd:element> >> >> >> </xsd:schema> >> >> Database table: >> ======================================= >> >> CREATE TABLE ProgramListings.dbo.productionCrew >> ( >> program VARCHAR(20), >> role VARCHAR(20), >> givenname VARCHAR(20), >> surname VARCHAR(20), >> PRIMARY KEY(program, role, surname) >> ) >> >> CREATE TABLE ProgramListings.dbo.crew >> ( >> role VARCHAR(20), >> givenname VARCHAR(20), >> surname VARCHAR(20), >> PRIMARY KEY(role, surname) >> >> ) > |
|
|
|
#4 |
|
Messages: n/a
Hébergeur: |
> I think the input is UTF-8 data but the SQLXML interface convert UTF-8
> to ASCII so the data is messed up when loaded from XML to database. Does your XML include a processing instruction to specify UTF-8 encoding? For example: <?xml version="1.0" encoding="UTF-8" ?> -- Hope this s. Dan Guzman SQL Server MVP http://weblogs.sqlteam.com/dang/ <worlman385@yahoo.com> wrote in message news:7m50v3dsmbcdqcrvhf2qm0c0q0apc724mo@4ax.com... > Thanks your Dan! > > But no! the XML is well formed: > http://www.oniva.com/upload/1356/x1.jpg > > but after using the COM object of SQLXML to load XML file into > Database, the data will look like this: > http://www.oniva.com/upload/1356/x1.jpg > > notice the givenname data is changed. > > I think the input is UTF-8 data but the SQLXML interface convert UTF-8 > to ASCII so the data is messed up when loaded from XML to database. > > Since some data in XML is non-ASCII > > > >>Your XML is not well-formed. Entity references need to be specified in >>place of illegal XML characters (e.g. "&" instead of "&"). Note that >>this is not specific to SQLXML but part of the basic XML standards. See >>http://www.w3.org/TR/REC-xml/. >> >>A CDATA section is commonly used in XML in order to eliminate the need to >>escape illegal characters. In a CDATA section, only the end tag ("]]>"). >>For example: >> >><crew program='SH008774030000'> >><member> >><role>Director</role> >><givenname><![CDATA[Dah$)A(&li]]></givenname> >><surname>Hall</surname> >></member> >><member> >><role>Writer</role> >><givenname><![CDATA[Dah$)A(&li]]></givenname> >><surname>Hall</surname> >></member> >></crew> |
|
|
|
#5 |
|
Messages: n/a
Hébergeur: |
(worlman385@yahoo.com) writes:
> Thanks your Dan! > > But no! the XML is well formed: > http://www.oniva.com/upload/1356/x1.jpg > > but after using the COM object of SQLXML to load XML file into > Database, the data will look like this: > http://www.oniva.com/upload/1356/x1.jpg > > notice the givenname data is changed. > > I think the input is UTF-8 data but the SQLXML interface convert UTF-8 > to ASCII so the data is messed up when loaded from XML to database. Your original post was encoded in iso-2022-cn, which may explain why I and Dan so very funny characters. Since your target columns are varchar, I need to ask: what is the collation of these columns? If that collation does not include é in its ANSI set, you cannot get the name right. Then again, then you should get "e". What you got appears to be UTF-8 interpreted as ANSI. I don't have any experience of XML bulkload, so I don't know what is going on. You could try to add <?xml version="1.0" encoding="utf-8" ?> first in the file, even though this should not be needed since UTF-8 is the default for XML. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
|
|
|
#6 |
|
Messages: n/a
Hébergeur: |
Thanks your Dan!
But no! the XML is well formed: http://www.oniva.com/upload/1356/x1.jpg but after using the COM object of SQLXML to load XML file into Database, the data will look like this: http://www.oniva.com/upload/1356/x1.jpg notice the givenname data is changed. I think the input is UTF-8 data but the SQLXML interface convert UTF-8 to ASCII so the data is messed up when loaded from XML to database. Since some data in XML is non-ASCII >Your XML is not well-formed. Entity references need to be specified in >place of illegal XML characters (e.g. "&" instead of "&"). Note that >this is not specific to SQLXML but part of the basic XML standards. See >http://www.w3.org/TR/REC-xml/. > >A CDATA section is commonly used in XML in order to eliminate the need to >escape illegal characters. In a CDATA section, only the end tag ("]]>"). >For example: > ><crew program='SH008774030000'> ><member> ><role>Director</role> ><givenname><![CDATA[Dah$)A(&li]]></givenname> ><surname>Hall</surname> ></member> ><member> ><role>Writer</role> ><givenname><![CDATA[Dah$)A(&li]]></givenname> ><surname>Hall</surname> ></member> ></crew> |
|
|
|
#7 |
|
Messages: n/a
Hébergeur: |
Sorry, the link of second one should be
http://www.oniva.com/upload/1356/x2.jpg > >but after using the COM object of SQLXML to load XML file into >Database, the data will look like this: >http://www.oniva.com/upload/1356/x1.jpg > |
|
|
|
#8 |
|
Messages: n/a
Hébergeur: |
Yes, I put the string like
<?xml version="1.0" encoding="UTF-8" ?> also tried other encoding string but no luck. I tried to put the parameter in the connection string: sendStringParametersAsUnicode=true but doesn't ======================= hr = pIXMLBulkLoad->put_ConnectionString(L"provider=SQLOLEDB;data source=(local)\\SQLEXPRESS;database=ProgramListing s;integrated security=SSPI;sendStringParametersAsUnicode=true") ; } >> I think the input is UTF-8 data but the SQLXML interface convert UTF-8 >> to ASCII so the data is messed up when loaded from XML to database. > >Does your XML include a processing instruction to specify UTF-8 encoding? >For example: > ><?xml version="1.0" encoding="UTF-8" ?> |
|
|
|
#9 |
|
Messages: n/a
Hébergeur: |
Just solve the problem
use NVARCHAR instead of VARCHAR NVARCHAR - support unicdoe ============== CREATE TABLE ProgramListings.dbo.productionCrew ( program NVARCHAR(20), role NVARCHAR(20), givenname NVARCHAR(20), surname NVARCHAR(20), PRIMARY KEY(program, role, surname) ) >> I think the input is UTF-8 data but the SQLXML interface convert UTF-8 >> to ASCII so the data is messed up when loaded from XML to database. > >Your original post was encoded in iso-2022-cn, which may explain why I >and Dan so very funny characters. > >Since your target columns are varchar, I need to ask: what is the collation >of these columns? If that collation does not include $)A(& in its ANSI set, >you cannot get the name right. Then again, then you should get "e". What >you got appears to be UTF-8 interpreted as ANSI. > >I don't have any experience of XML bulkload, so I don't know what is >going on. You could try to add > > <?xml version="1.0" encoding="utf-8" ?> > >first in the file, even though this should not be needed since UTF-8 >is the default for XML. > |
|
|
|
#10 |
|
Messages: n/a
Hébergeur: |
Just solve the problem
use NVARCHAR instead of VARCHAR NVARCHAR - support unicdoe ============== CREATE TABLE ProgramListings.dbo.productionCrew ( program NVARCHAR(20), role NVARCHAR(20), givenname NVARCHAR(20), surname NVARCHAR(20), PRIMARY KEY(program, role, surname) ) |
|
|
|
#11 |
|
Messages: n/a
Hébergeur: |
Just solve the problem
use NVARCHAR instead of VARCHAR NVARCHAR - support unicdoe ============== CREATE TABLE ProgramListings.dbo.productionCrew ( program NVARCHAR(20), role NVARCHAR(20), givenname NVARCHAR(20), surname NVARCHAR(20), PRIMARY KEY(program, role, surname) ) |
|
|
|
#12 |
|
Messages: n/a
Hébergeur: |
> Just solve the problem
> > use NVARCHAR instead of VARCHAR I'm glad you were able to figure this out. I had assumed that your database default collation was appropriate for the characters being stored. -- Hope this s. Dan Guzman SQL Server MVP http://weblogs.sqlteam.com/dang/ <worlman385@yahoo.com> wrote in message news:4s80v3dun5cl5vf7vks6mnkqt7tdk116uq@4ax.com... > Just solve the problem > > use NVARCHAR instead of VARCHAR > > NVARCHAR - support unicdoe > > ============== > > CREATE TABLE ProgramListings.dbo.productionCrew > ( > program NVARCHAR(20), > role NVARCHAR(20), > givenname NVARCHAR(20), > surname NVARCHAR(20), > PRIMARY KEY(program, role, surname) > ) |
|
![]() |
| Outils de la discussion | |
|
|