nchar vis nvarchar

Question

nchar vis nvarchar

Tariq Changgez

Old Hand

Points: 399
More actions
January 2, 2005 at 1:48 am

#84497

In one of my databases where I am working these days, I have used nvarchar instead of nchar. I did this because I was expecting to store Arabic language characters in these fields. However so far I have only English letters saved in these fields and do not expect, at least in the near future, to store any other language characters.
My question is that is there any problem if I change field type from nvarchar to nchar and (possibly in future) back to nvarchar.
I want to do this to save space as nvarchar consumes two bytes for each letter.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 728164 More actions · Answer 1

nchar consumes two bytes as well. Do you mean convert to varchar instead? In SS2K, above 4 or 5 char, it doesn't make sense to use char.

Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com

Frank Kalis SSC Guru Points: 111183 More actions · Answer 2

In SS2K, above 4 or 5 char, it doesn't make sense to use char.

Steve, happy new year!

What do you mean by this? I think it always makes sense to use CHAR when you don't expect the length of the string to vary greatly in length.

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

depeg911 Newbie Points: 5 More actions · Answer 3

It makes sense to use nvarchar or varchar in this case. If you use nchar or char, each nchar field in each row of the table will contain that many characters. For example, if your field is set to nchar(10), then each row will have a field with space for 10 characters. The inefficiency arises when you have hundreds of thousands of rows and most of the data for the nchar field is less than 10 characters in length. I have always found that using nvarchar or varchar is the better solution.

Cheers.

John

fhanlon SSChampion Points: 13131 More actions · Answer 4

I agree with Frank. Using char (or nchar if using Unicode) makes sense if the values are always or almost always the same length. For example in Canada (i think Britain as well) postal codes are always 6 characters. I would use Char(6) not varchar(6). On the other had if the length might vary - like last name, then varchar is almost always better.

Francis

PBirch SSCrazy Eights Points: 8483 More actions · Answer 5

A row with variable-length columns requires special offset entries in order to be internally maintained. These entries keep track of the actual length of the column. Calculating and maintaining the offsets requires slightly more overhead than does a pure fixed-length row, which needs no offsets...However, the extra overhead of maintaining these offsets is generally inconsequential, and this alone would not make a significant difference in most systems, if any.

[The other issue is rows per page. This is found on page 230 - 232]

Now that you know the issues, this general rule merits repeating: variable-length datatypes are most appropriate when you expect significant variation in the size of the data for that column and when the column won't be updated frequently.

(Inside SQL Server 2000)

Quand on parle du loup, on en voit la queue

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 6

Not a challenge here... I'd really like to know...

Using the following code example, I get what most people would consider to be an "undesirable result"...

 CREATE TABLE #MyTest (TestChar CHAR(10))
 INSERT INTO #MyTest (TestChar)
 SELECT 'abcd' UNION ALL
 SELECT 'efgh'

DECLARE @t1 CHAR(10)
DECLARE @t2 CHAR(10)

 SELECT @t1=TestChar 
   FROM #MyTest
  WHERE TestChar = 'abcd'

 SELECT @t2=TestChar 
   FROM #MyTest
  WHERE TestChar = 'efgh'

 SELECT '|'+@t1+@t2+'|'

---------------------- 
|abcd      efgh      | -- Notice the trailing blanks...

(1 row(s) affected)

Now, I'm aware that RTRIM could be used in this case but which is faster for large sets of records? RTRIM twice for each record or VARCHAR?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Tariq Changgez Old Hand Points: 399 More actions · Answer 7

Actually I did a small mistake in my original question which took the whole discussion to a different direction. As Steve has noticed I wanted to ask nvarchar vis varchar (by mistake I wrote nchar).

I repeat my question but this time summarized in two sentences:

Is there any harm if I convert all my nvarchar columns to varchar, and probably at a later date I revert them to nvarchar again. As of today I have only English characters stored in these columns but probably need to store Arabic characters some time in future.

Frank Kalis SSC Guru Points: 111183 More actions · Answer 8

---------------------- 
|abcd      efgh      | -- Notice the trailing blanks...

(1 row(s) affected)

Could it be you're mixing some kind of presentational stuff in here, Jeff?

I would consider such thing the job of the client, and certainly won't do this on larger sets, no matter what string datatype I use. And if I would do it, I would probably use

SELECT REPLACE('|'+@t1+@t2+'|',' ','')

But to answer to (slightly changed) original question:

If you know in advance that you some day need to store unicode data, design your schema accordingly right from the start. I wouldn't wait until the first client comes up with that request. One thing to keep in mind is that unicode data requires twice as much storage space, or put it this way, you can only half as much characters as compared to non-unicode datatypes.

--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

PBirch SSCrazy Eights Points: 8483 More actions · Answer 9

I agree with Frank -

Do not switch back and forth between Unicode and ANSI. You may find yourself losing data if, for example, someone has more than 4,000 characters in one of your Notes fields. Make a choice and live with it. If there is a chance that your database will be used in an International system then Unicode is the best option.

Quand on parle du loup, on en voit la queue

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 10

Frank wrote: >Could it be you're mixing some kind of presentational stuff in here, Jeff?

Only for demonstration purposes... the pipes were so you could see the spaces.

I'm one of those poor slobs that has to put up with both third party and "homegrown" databases written by others. My contention has been that unless there's no chance of trailing spaces, use VarChar instead of Char to save the overhead of using Replace or RTrim (as well as having to remember to do it). Depending on the speed of each method, that contention may be in error.

I wanted your experience (or opinion) as to which may be faster... using VarChar in these instances or using Char with either Replace or RTrim to remove the trailing spaces from the fields when they are concatenated for one reason or another (usually to provide data to a print vendor or as an input to a less than normal system).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

depeg911 Newbie Points: 5 More actions · Answer 11

I "second" what PBirch says. Always plan for the least common denominator. If International characters then unicode else ANSI <- at least in this case.

-- John

Yelena Varshal SSC-Dedicated Points: 34286 More actions · Answer 12

Hi,

Talking about the ANSI /Unicode conversion

Is there an easy and automated way to convert an ANSI text file to Unicode text file? We are taking over some app where the data feed was designed to import the Unicode text file while the file comes as ANSI. I was told to open it everyday in Notepad and save as Unicode then use a state-of-art dot NET program to import data (no comments from me here.....). I assume I can import in SQL Server, use Cast or Convert and export back which would not make sence because the destination of the data will be eventually in SQL Server (why those developers did not do it this way from the start anyway?)

I can write a VBscript for example using ChrW function or something like that, but maybe you know of something ready-to-use?

Yelena

Regards,Yelena Varsha