How to programmatically find all non-English values across multiple tables?

  • I have lots of tables with nvarchar columns and want to change all of them to varchar. But the risk is I don't want to change when the values are indeed non-English. In our case they could be Japanese. I've already found some by visually examining select * from them. But how to find them programmatically? Obviously, I will write a cursor for it, but how to test each value or column?

    Thanks

     

  • I would recommend against this.  You're likely to break things in ways that aren't obvious.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • blindly updating is not a solution, as mentioned, but you can at least review the data and make plans accordingly.

    from the metadata, this query generates the queries to compare the varchar conversion to the nvarchar, to show the items that cannot convert cleanly to normal text.

    it produces something that looks like this:

     SELECT 'dbo.SomeComplianceTable' AS QualifiedObjectName,
    CONVERT(varchar(2048),[ApplicationName]) AS [cnv_ApplicationName],
    [ApplicationName]
    FROM dbo.SomeComplianceTable
    WHERE CONVERT(varchar(2048),[ApplicationName]) <> [ApplicationName]

    note this does not fix high ascii, like some of the characters in this string '•ÀAlbèert ËEîinstêeiìn ÌInstìitúutëe', just unicode values:

    SELECT
    CONVERT(VARCHAR(128),DB_NAME()) AS DatabaseName,
    OBJECT_SCHEMA_NAME(object_id) AS SchemaName,
    OBJECT_NAME(object_id),name AS ObjectName,
    [max_length],
    'SELECT '''
    + OBJECT_SCHEMA_NAME(object_id) + '.' + OBJECT_NAME(object_id) + ''' AS QualifiedObjectName,' + CHAR(10)
    + 'CONVERT(varchar('
    + CASE WHEN [max_length] = -1 THEN 'max' ELSE CONVERT(VARCHAR,[max_length] / 2) END
    + '),' + QUOTENAME(name) + ') AS ' + quotename('cnv_' + CONVERT(VARCHAR(128),name)) + ',' + CHAR(10)
    + QUOTENAME(name) + CHAR(10)
    + ' FROM ' + OBJECT_SCHEMA_NAME(object_id) + '.' + OBJECT_NAME(object_id) + CHAR(10)
    + ' WHERE CONVERT(varchar('
    + CASE WHEN [max_length] = -1 THEN 'max' ELSE CONVERT(VARCHAR,[max_length] / 2) END
    + + '),' + QUOTENAME(name) + ') <> '
    + QUOTENAME(name)
    AS Testcmd
    FROM sys.columns
    WHERE TYPE_NAME([system_type_id]) = 'nvarchar'
    AND object_schema_name(object_id) <> 'sys'

    • This reply was modified 5 years, 3 months ago by  Lowell.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply