#1
Hi List,

I am searching frantically for a solution (or the procedure) for setting the encoding of a new DB to UTF-8. I can find no setting in the Server Manager, during creation of the DB, to influence this. Can someone please show me the way? Thanks

--Shawn

#2
See the following articles:

Description of storing UTF-8 data in SQL Server (the article points to SQL 7/2000, but 2005 still uses UCS-2)
http://support.microsoft.com/kb/232580

UTF8 String User-Defined Data Type
http://msdn2.microsoft.com/en-us/library/ms160893.aspx

International Features in Microsoft SQL Server 2005
http://msdn2.microsoft.com/en-us/library/bb330962.aspx

HTH,

Plamen Ratchev
http://www.SQLStudio.com

#3
Hi Plamen,

Thanks for your answer.

> See the following articles:
>
> Description of storing UTF-8 data in SQL Server (the article points to
> SQL 7/2000, but 2005 still uses UCS-2)
> http://support.microsoft.com/kb/232580

I found this one, but could not really understand the meat of the matter.

> UTF8 String User-Defined Data Type
> http://msdn2.microsoft.com/en-us/library/ms160893.aspx
>
> International Features in Microsoft SQL Server 2005
> http://msdn2.microsoft.com/en-us/library/bb330962.aspx

Basically, there is no native UTF-8 storage in MSSQL then, am I right? I mean, the Perl scripts that we are using write to the DB in UTF-8, and the information displayed in the web app comes back properly, but when you look at the data in the table, it is displayed incorrectly. Does that seem right? I am talking about storing German characters like "äöüß".

--shawn
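
The symptom described above (fine in the web app, wrong in the table view) is what one would expect if UTF-8 bytes are stored unchanged and then shown by a tool that reads them as a single-byte code page. A minimal Python sketch of that effect follows. Python is used only for illustration (the thread itself uses Perl), and the choice of Windows-1252 as the viewing code page is an assumption, not something stated in the thread.

```python
# The German test string from the post above.
text = "äöüß"

# What a UTF-8 client would send: two bytes per character for these letters.
utf8_bytes = text.encode("utf-8")

# Assumption: the table viewer interprets the same bytes as Windows-1252.
mojibake = utf8_bytes.decode("cp1252")

# Character count, byte count, and the number of characters the viewer shows.
print(len(text), len(utf8_bytes), len(mojibake))
print(mojibake)

# A client that reads the stored bytes back as UTF-8 recovers the original text,
# which would explain why the web application still looks correct.
round_trip = mojibake.encode("cp1252").decode("utf-8")
print(round_trip == text)
```

If the garbage in the table resembles the second line of output, the data has most likely been stored byte for byte rather than converted to Unicode on the way in.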

#4
Basically, if you store your data in a BINARY or VARBINARY data type, SQL Server couldn't care less what encoding the data is in. But then you can use it only as storage, as ordering and string functions do not work properly.

SQL Server supports UTF-8 encoding for handling XML documents. You can define a column of the XML data type and use it for storing UTF-8 data in XML format. See more here:
http://msdn2.microsoft.com/en-us/lib...ql2005_topic14

Also, the example for implementing a UTF-8 user-defined data type works fine too:
http://msdn2.microsoft.com/en-us/library/ms160893.aspx

HTH,

Plamen Ratchev
http://www.SQLStudio.com
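
To make the point about ordering and string functions concrete, here is a small Python sketch (Python is an illustration language only; the word list is made up). It assumes that raw binary values are compared byte by byte, and shows how that differs from what a reader would expect for text.

```python
# Hypothetical sample values, as they might sit in a binary column as UTF-8.
words = ["Zebra", "apfel", "Äpfel", "Apfel"]

# Byte-wise ordering: uppercase ASCII first, then lowercase ASCII,
# then anything whose UTF-8 encoding starts with a byte >= 0x80.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
print(by_bytes)

# Length in bytes versus length in characters: "Ä" takes two bytes in UTF-8.
word = "Äpfel"
byte_length = len(word.encode("utf-8"))
char_length = len(word)
print(byte_length, char_length)
```

A dictionary-style collation would normally keep "Apfel", "apfel" and "Äpfel" close together, whereas the byte-wise order puts "Zebra" between them. Length and substring operations on the raw bytes are off in the same way, since they count bytes rather than characters.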

#5
Shawn Beasley (shawn.beasley@freenet.de) writes:
> Basically, there is no native UTF-8 storage then in MSSQL, am I right?

Right.

> I mean, the scripts that we are using in perl are writing to the DB in
> UTF8, and the displayed information in the webapp is returning properly,
> but when you look at the data in the table, it is displayed
> incorrectly. Does that seem right?

The best approach would be to store the data in the nvarchar data type as UCS-2. You would then use Unicode both in the web application and in the database, just encoded differently.

I don't know what API you use from Perl, but if you use Win32::SqlServer, it will handle all necessary conversion from UTF-8 to UCS-2 and back for you completely seamlessly. I don't know about DBD/DBI, but it may be more difficult with these, as they are designed to be portable between platforms, whereas Win32::SqlServer is designed for SQL Server only.

You can find Win32::SqlServer on my web site, http://www.sommarskog.se/mssqlperl/index.html.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
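
As a rough illustration of "Unicode on both sides, just encoded differently", the following Python sketch re-encodes the same text between UTF-8 and little-endian UTF-16, which is identical to UCS-2 for characters in the Basic Multilingual Plane such as "äöüß". The helper names are hypothetical, and this only shows the byte-level conversion that a client library would perform; it is not how Win32::SqlServer is implemented.

```python
def utf8_to_ucs2(utf8_bytes: bytes) -> bytes:
    """Re-encode UTF-8 bytes as little-endian UTF-16 (equal to UCS-2 for BMP text)."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def ucs2_to_utf8(ucs2_bytes: bytes) -> bytes:
    """Re-encode little-endian UTF-16 bytes back to UTF-8."""
    return ucs2_bytes.decode("utf-16-le").encode("utf-8")


# What the web application handles (UTF-8) and what nvarchar would hold (UCS-2).
web_side = "äöüß".encode("utf-8")
db_side = utf8_to_ucs2(web_side)

print(web_side.hex())
print(db_side.hex())

# The conversion is lossless in both directions.
back_again = ucs2_to_utf8(db_side)
print(back_again == web_side)
```

Both byte strings represent the same four characters; only the encoding differs, which is why a driver that converts at the boundary lets the table display correctly and the web page stay in UTF-8.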

#6
On Mar 26, 11:11 am, Shawn Beasley <shawn.beas...@freenet.de> wrote:
> Hi List,
>
> I am searching franticly for a solution (or the procedure) to setting
> the coding of a new DB to UTF-8. I can find no setting in the Server
> Manager, during creation of the DB, to influence this. Can someone
> please show me the way? Thanks
>
> --Shawn

I have converted my data from Latin 1 to UTF-8 in MSSQL. The first thing to realise (as has already been mentioned) is that MSSQL uses UCS-2, a 2-byte Unicode encoding scheme. Not UTF-8, but we can still use it.

What I did was create another table with an ntext column type (instead of the old text type used for Latin 1; of course, your column in question may be varchar etc., so you will need to create a table with a column of type nvarchar etc.).

I then created a script to pull the data out of the old table and insert it into the new table with the ntext column. For each line I pulled out, I used a function to encode it into UTF-8 (I was using PHP; maybe you could use Convert::Scalar::utf8_encode()?). For that line I executed an SQL statement to insert the line into the new table, but I prefixed the letter 'N' to the value of the newly UTF-8 encoded string. This tells the DB to treat the string as a Unicode constant and not to apply any of its own formatting/encoding to the string.

Now the newly encoded values in your new table may look like garbage through the MSSQL GUI, but the data is correctly encoded in UTF-8 and will display in a web page with UTF-8 content-type headers.

Maybe there is an easier way to do it, but this worked for me. Hope that helps at least a little.

Raymond
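
The per-row steps described above (read a Latin 1 value, re-encode it, wrap it in an N-prefixed literal) could look roughly like the Python sketch below. This is not Raymond's PHP script: the function names and the sample row are hypothetical, and the literal here is built from the decoded text rather than from raw UTF-8 bytes, which is a deliberate variation. Single quotes are doubled, as T-SQL string literals require.

```python
def latin1_to_text(raw: bytes) -> str:
    """Decode a value read from the old Latin 1 column into a Unicode string."""
    return raw.decode("latin-1")


def to_n_literal(value: str) -> str:
    """Build an N'...' Unicode literal, doubling embedded single quotes."""
    return "N'" + value.replace("'", "''") + "'"


# Hypothetical row as it might come out of the old table (Latin 1 bytes).
old_row = b"Gr\xfc\xdfe aus M\xfcnchen, it's fine"

text = latin1_to_text(old_row)

# The UTF-8 form of the same text, for comparison with the Latin 1 byte count.
utf8_bytes = text.encode("utf-8")
print(len(old_row), len(utf8_bytes))

# The literal that would go into the INSERT statement.
literal = to_n_literal(text)
print(literal)
```

Where the client library supports it, passing the value as a parameter is usually preferable to assembling the literal by hand, since it avoids quoting mistakes altogether.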