
Converting Oracle Varchar2 Data Type to SQL Server

Have you ever had to import data from Oracle and store it in a SQL Server database? Normally, I don't think too hard about what is required, but I ran into a situation where a table's row size would exceed the 8000-byte row size limit if converted to SQL Server. I was using SSIS (SQL Server Integration Services). The big issue was that the columns were defined as varchar2 in Oracle, and my initial plan was to convert them to nvarchar in SQL Server. By defining those columns as nvarchar, however, I exceeded the SQL Server buffer size limit for certain operations. An easy solution is to define the columns as varchar in SQL Server, which immediately brings the row back under the buffer size limit. But now my questions become: "Did I mess up the data? Will there be any character loss? What am I affecting by moving the data from an Oracle varchar2 column into a SQL Server varchar column?" Whether the data is damaged depends on the character set used in Oracle and the collation used in SQL Server. In Oracle, the character set is chosen when the database is created.
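Before getting into character sets, here is a minimal T-SQL sketch of why the nvarchar conversion blew past the size limit in the first place; the table and column names are illustrative, not my actual table. Every nvarchar column needs up to 2 bytes per character, so declaring the same widths as nvarchar roughly doubles the maximum row size.

    -- Hypothetical staging table: as varchar, the three columns need at most
    -- 4000 + 2000 + 1500 = 7500 bytes per row.
    CREATE TABLE dbo.OracleImportStage (
        OrderNotes    varchar(4000) NULL,
        ShippingNotes varchar(2000) NULL,
        InternalNotes varchar(1500) NULL
    );

    -- The same widths declared as nvarchar(4000), nvarchar(2000), nvarchar(1500)
    -- need up to 2 bytes per character, so the maximum row size roughly
    -- doubles to about 15000 bytes.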

In Oracle, varchar2(10) means the column can store 10 bytes (with the default BYTE length semantics). The maximum length is 4000 bytes. In a single-byte character set, a varchar2(10) column can store up to 10 characters; the number of bytes and the number of characters are essentially the same. In a multi-byte character set, things are different. If the database is set up to use a Unicode character set, the column can store characters that use more than 1 byte.
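A small sketch of what that means in practice; the table name is made up, and it assumes an AL32UTF8 database with the default BYTE length semantics:

    -- VARCHAR2(10 BYTE) limits the column to 10 bytes; VARCHAR2(10 CHAR)
    -- limits it to 10 characters regardless of how many bytes each one takes.
    CREATE TABLE length_demo (
        val_bytes VARCHAR2(10 BYTE),
        val_chars VARCHAR2(10 CHAR)
    );

    -- Ten 3-byte characters (30 bytes) fail for val_bytes with ORA-12899
    -- (value too large for column), but fit in val_chars.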

Oracle also supports UTF-16 by using the national character set (which is used for data stored in nvarchar2, nchar, and nclob columns). Any character in a UTF-16 implementation occupies no less than 2 bytes of storage, whereas varchar2 characters can be stored in as little as 1 byte. In an nvarchar2 column, size is the number of characters. (The number of bytes may be up to 2 times this number for the AL16UTF16 encoding and 3 times this number for the UTF8 encoding.)
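For example, assuming the national character set is AL16UTF16 (the table name is illustrative), a quick way to see the 2-bytes-per-character storage:

    CREATE TABLE nls_demo (name NVARCHAR2(10));   -- 10 characters, up to 20 bytes under AL16UTF16

    INSERT INTO nls_demo (name) VALUES (N'0123456789');   -- exactly 10 characters

    SELECT LENGTH(name) AS char_count, LENGTHB(name) AS byte_count
      FROM nls_demo;
    -- Expect char_count = 10 and byte_count = 20.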

When an Oracle database is created, the user chooses the character set. Oracle's recommendation is that if the environment (clients and servers) consists entirely of Oracle 9i or higher, the database character set (NLS_CHARACTERSET) should be AL32UTF8, which encodes Unicode data in UTF-8, and the national character set should be AL16UTF16, which encodes Unicode data in UTF-16. AL32UTF8 is a variable-width character set, which means the code for a character can be 1 to 4 bytes long, depending on the character itself; the AL16UTF16 character set uses 2 or 4 bytes to store a character. More detailed notes on the AL32UTF8 character set can be found here.
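If you are not sure what a given Oracle database uses, you can query the NLS settings directly:

    -- NLS_CHARACTERSET governs VARCHAR2/CHAR/CLOB columns;
    -- NLS_NCHAR_CHARACTERSET governs NVARCHAR2/NCHAR/NCLOB columns.
    SELECT parameter, value
      FROM nls_database_parameters
     WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');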

UTF-8 is the 8-bit encoding of Unicode. It is a variable-width encoding. One Unicode character can be 1 to 4 bytes in UTF-8 encoding. Characters from the European scripts are represented in either 1 or 2 bytes. Characters from most Asian scripts are represented in 3 bytes. Supplementary characters are represented in 4 bytes.
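You can see the variable widths for yourself; the queries below assume an AL32UTF8 database, and the sample characters are just illustrations:

    SELECT LENGTH('e') AS char_count, LENGTHB('e') AS byte_count FROM dual;    -- 1 character, 1 byte (ASCII)
    SELECT LENGTH('é') AS char_count, LENGTHB('é') AS byte_count FROM dual;    -- 1 character, 2 bytes (European accented letter)
    SELECT LENGTH('日') AS char_count, LENGTHB('日') AS byte_count FROM dual;  -- 1 character, 3 bytes (Asian script)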

UTF-16 is the 16-bit encoding of Unicode. It is an extension of UCS-2 and supports the supplementary characters defined in Unicode 3.1 by using a pair of UCS-2 code points. One Unicode character can be 2 bytes or 4 bytes in UTF-16 encoding. Characters (including ASCII characters) from European scripts and most Asian scripts are represented in 2 bytes. Supplementary characters are represented in 4 bytes.
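The same kind of check for UTF-16, assuming the national character set is AL16UTF16; UNISTR returns its result in the national character set, and the code points below are just examples:

    SELECT LENGTHB(UNISTR('e'))          AS byte_count FROM dual;  -- 2 bytes: even an ASCII letter uses 2 bytes
    SELECT LENGTHB(UNISTR('\65E5'))      AS byte_count FROM dual;  -- 2 bytes: most Asian characters (here U+65E5) fit in one 2-byte unit
    SELECT LENGTHB(UNISTR('\D83D\DE00')) AS byte_count FROM dual;  -- 4 bytes: a supplementary character stored as a surrogate pair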

In SQL Server, varchar uses one byte per character, and within that 1 byte it can carry the ASCII characters. Characters 0 through 127 are the ASCII characters (which covers English). Characters 128 through 255 also represent characters; which ones depends on the code page of the collation chosen for the database. nchar/nvarchar use 2 bytes per character to store a Unicode string. SQL Server 2012 provides full support for UTF-16 through its supplementary character (_SC) collations, using up to 4 bytes per character. Previous versions of SQL Server do not support UTF-16, but they do support UCS-2, which is a subset of UTF-16 limited to characters that fit in 2 bytes. Also note that nvarchar(10) allocates 10 byte-pairs (20 bytes), so it can carry only 5 characters that need 4 bytes each (supplementary characters). You have to examine the collation and code page being used to determine which characters can actually be stored.
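Here is a small T-SQL sketch of both points. It assumes the varchar collation maps to a Latin code page (for example Latin1_General_CI_AS), and the sample string is illustrative:

    DECLARE @n nvarchar(10) = N'Köln 日本';          -- 7 characters
    DECLARE @v varchar(10)  = @n;                    -- implicit conversion to the varchar collation's code page

    SELECT @n              AS unicode_value,
           @v              AS varchar_value,         -- likely 'Köln ??' under a Latin1 collation: the CJK characters are lost
           DATALENGTH(@n)  AS nvarchar_bytes,        -- 14 (2 bytes per character)
           DATALENGTH(@v)  AS varchar_bytes;         -- 7  (1 byte per character)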

With the above information, you can see that when converting Oracle varchar2 data to SQL Server, you should use an nvarchar data type in SQL Server. When that is not possible, as I experienced in one of my recent projects, you need to find an alternative or inform the end user that there may be some character loss if that conversion is made. The only time you can be 100 percent sure there is no character loss when using a varchar data type is when the source column contains only ASCII characters.
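If you do have to go with varchar, one way to verify that claim for your data is to stage it in an nvarchar column first and round-trip it through varchar; the table and column names below are illustrative:

    -- Any row whose value changes after the varchar round trip would lose
    -- (or alter) characters if stored as varchar.
    SELECT COUNT(*) AS rows_with_character_loss
      FROM dbo.StagingOrders
     WHERE CAST(CAST(OrderNotes AS varchar(4000)) AS nvarchar(4000)) <> OrderNotes;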

