March 28, 2011 at 8:32 am
Sure,
I'd like the output to be like this.
Name | Oct 2010 | Nov 2010 | Dec 2010 | 2010 Total | Jan 2011 | Feb 2011 | Mar 2011 | ... | 2011 Total |
March 28, 2011 at 2:36 pm
Tim-24860 (3/28/2011)
Sure, I'd like the output to be like this.
Name | Oct 2010 | Nov 2010 | Dec 2010 | 2010 Total | Jan 2011 | Feb 2011 | Mar 2011 | ... | 2011 Total |
Perfect. I've got just the article for that...
http://qa.sqlservercentral.com/articles/Crosstab/65048/
For further understanding, especially when it comes to performance on the subject of Cross Tabs by using "pre-aggregation" as Peter Larsson calls it, see part one of that series...
http://qa.sqlservercentral.com/articles/T-SQL/63681/
--Jeff Moden
May 2, 2011 at 4:32 pm
Great article Jeff!
One situation I'd like to see solved: I wrote a script that searches my database's char columns for 'funny' characters. Of course, it does this RBAR and takes a good 20 minutes to run (but it does save me some headaches, because our source system does NOT have anything in place to prevent these characters).
I wanted to try the tally table method to see if I could improve performance, but I have run into a problem. My loop iteratively builds the character field, so, for instance,
'test data' in the test varchar(100) column would be searched one character at a time and the output compared to the original (I have a lot of other code in between that makes this useful, but the loop is the crux of the performance problem).
Anyway, when I use the tally method it spits the result out as rows instead of a single column like my loop produces, so it appears as
test
1 t
2 e
3 s
4 t
5
6 d
7 a
8 t
9 a
Is there a way using the tally method to roll this back up, or return it as just 'test data' from this point?
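For what it's worth, one common way to roll such rows back up is the FOR XML PATH('') concatenation trick, which reassembles the split characters in position order. A minimal sketch (the #Split table and its column names are hypothetical stand-ins for the tally-split output):

```sql
-- Hypothetical: #Split holds the tally-split output, one row per character,
-- with N = character position and Ch = the character itself.
IF OBJECT_ID('tempdb..#Split') IS NOT NULL DROP TABLE #Split;
CREATE TABLE #Split (N INT NOT NULL, Ch CHAR(1));

INSERT INTO #Split (N, Ch)
VALUES (1,'t'),(2,'e'),(3,'s'),(4,'t'),(5,' '),(6,'d'),(7,'a'),(8,'t'),(9,'a');

--===== Concatenate the characters back into one string, in position order
SELECT Reassembled =
      (SELECT '' + Ch
         FROM #Split
        ORDER BY N
          FOR XML PATH(''))
;
```

Note that FOR XML PATH entitizes characters such as & and <, so for arbitrary data a TYPE directive with .value() is the safer variant of the same trick.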
Link to my blog http://notyelf.com/
May 3, 2011 at 5:48 pm
Thanks for the feedback, Shannon.
When you say...
One situation I would like to see if it can be solved, is I wrote a script that will search my database char columns for 'funny' characters.
I'm thinking that you may not need a loop or a Tally Table. Could you give an example of something with "funny" characters in it, along with the original, and explain what you want done once you find a "funny" character? Perhaps even post your loop code for comparison?
Thanks.
--Jeff Moden
May 4, 2011 at 12:05 pm
Not a problem. I know I didn't do too good a job explaining myself. Basically, we transfer some of our data to a 3rd party for reporting, and that data needs to be in ASCII. Because our source is in Unicode and not constrained to certain character types, we get a lot of diacritical marks and sometimes just some really strange characters (some I have no idea what they are!). My code at the moment doesn't remove them yet; it simply spits out a list of which rows and columns in which tables are offenders, and I clean them up afterwards.
Here is the function I use to do this
CREATE FUNCTION dbo.RemoveSpecialChars (@s varchar(256)) RETURNS varchar(256)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @s2 varchar(256) = '', @p int = 1, @l int = LEN(@s), @c int
    IF @s IS NULL
        RETURN NULL
    IF LEN(@s) = 0
        RETURN ''
    WHILE @p <= @l
    BEGIN
        SET @c = ASCII(SUBSTRING(@s, @p, 1))
        IF @c BETWEEN 32 AND 126
            OR @c IN (220,146,252,167,214,246,196,228,145,147,148,150,180,188,189,190,201,233,166,195,227,193,225,199,231,209,241)
            SET @s2 = @s2 + CHAR(@c)
        SET @p = @p + 1
    END
    RETURN @s2
END
GO
From there I have a stored proc that simply loops through every column in every table and builds a SQL statement that is executed via sp_executesql @SQL. So technically it's 3 loops, which I have been trying to stray away from for a long time :-D.
Anyway, any help on this is appreciated. I have been looking for an alternate solution in my spare time but have yet to find anything. I have successfully worked the tally table into some loops in my production code in the past, though, so thank you again!
Link to my blog http://notyelf.com/
May 4, 2011 at 2:00 pm
Ideas:
1) Are you dumping your results into a text file of some sort? If so, it might be easier to do this in a post-processing script on the text file.
1b) Dump it to a text file, process it to strip the bad characters, then import it back in.
2) Use SSIS and use a script to do the processing.
3) What I fundamentally want to do is something like:
REPLACE(...(Replace(Column,CHAR(1),' '), Char(2), ' ')..., Char(255), ' ')
Which is much easier (for me anyway) in a script, which I would build like this:
// PseudoCode
Dim BadChars as String, temp as string
BadChars = Char(1) & Char(2) & ... // (All of the Characters that you do not want...)
temp = inputstring
For i = 1 to len(BadChars)
temp = replace(temp,mid(BadChars,i,1),' ')
next
return temp
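The loop above translates almost line-for-line to T-SQL; a sketch only, where @BadChars holds a made-up handful of control characters:

```sql
-- Sketch: strip a list of unwanted characters by looping over the short
-- list of characters (not the data) and letting REPLACE do the rest.
DECLARE @BadChars VARCHAR(256) = CHAR(1) + CHAR(2) + CHAR(7),  -- hypothetical list
        @Temp     VARCHAR(100) = 'te' + CHAR(1) + 'st' + CHAR(7) + ' data',
        @i        INT = 1;

WHILE @i <= LEN(@BadChars)
BEGIN
    SET @Temp = REPLACE(@Temp, SUBSTRING(@BadChars, @i, 1), ' ');
    SET @i = @i + 1;
END;

SELECT @Temp;
```

The advantage of this shape is that the loop count is the number of bad characters, not the number of rows or the length of the data.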
Hopefully someone else has a better answer for you.
--
JimFive
Edited to fix pseudocode error
May 4, 2011 at 3:21 pm
shannonjk (5/4/2011)
Not a problem. I know I didn't do too good a job explaining myself. Basically, we transfer some of our data to a 3rd party for reporting, and that data needs to be in ASCII. Because our source is in Unicode and not constrained to certain character types, we get a lot of diacritical marks and sometimes just some really strange characters (some I have no idea what they are!). My code at the moment doesn't remove them yet; it simply spits out a list of which rows and columns in which tables are offenders, and I clean them up afterwards. Here is the function I use to do this
(.... function removed so won't kick up a storm at work...)
From there I have a stored proc that simply loops through every column in every table and builds a SQL statement that is executed via sp_executesql @SQL. So technically it's 3 loops, which I have been trying to stray away from for a long time :-D.
Anyway, any help on this is appreciated. I have been looking for an alternate solution in my spare time but have yet to find anything. I have successfully worked the tally table into some loops in my production code in the past, though, so thank you again!
That function does actually remove all but the "sanctioned" list of characters from the return variable. You may not be using it that way, but that's what it actually does. Anyway, we should be able to get rid of some of the loops no matter which way you need to have it. I'll see if I can demo for you after work.
--Jeff Moden
May 4, 2011 at 3:31 pm
Yes, you are correct; sorry, it has been a while since I designed or even looked at this code. I use it as a comparison, like
select dbo.RemoveSpecialChars(column) from table where column != dbo.RemoveSpecialChars(column)
The output shows only the 'bad' data, and I take it from there to manipulate it.
I have a master table that shows me how to get to unique records in all my tables; it's referenced in the stored proc loop, so when the proc spits out the bad data, it comes along with the unique row identifiers as well.
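The column loop itself can often be collapsed into a single string build against INFORMATION_SCHEMA.COLUMNS. A sketch only, assuming the dbo.RemoveSpecialChars function above exists; everything else is illustrative:

```sql
-- Sketch: build one SELECT per character column, all in one string,
-- then execute the whole batch once. No explicit column loop needed.
DECLARE @SQL NVARCHAR(MAX) = N'';

SELECT @SQL = @SQL + N'
SELECT ''' + c.TABLE_SCHEMA + N'.' + c.TABLE_NAME + N'.' + c.COLUMN_NAME + N''' AS ColumnName,
       ' + QUOTENAME(c.COLUMN_NAME) + N' AS BadValue
  FROM ' + QUOTENAME(c.TABLE_SCHEMA) + N'.' + QUOTENAME(c.TABLE_NAME) + N'
 WHERE ' + QUOTENAME(c.COLUMN_NAME) + N' <> dbo.RemoveSpecialChars(' + QUOTENAME(c.COLUMN_NAME) + N');'
  FROM INFORMATION_SCHEMA.COLUMNS c
 WHERE c.DATA_TYPE IN ('char','varchar','nchar','nvarchar');

EXEC sys.sp_executesql @SQL;
```

In practice you'd also want to join in the unique row identifiers from the master table mentioned above, and skip columns longer than the function's varchar(256) parameter.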
Link to my blog http://notyelf.com/
May 5, 2011 at 6:45 am
Sorry... I didn't forget you... I just got busy. I'll be back.
--Jeff Moden
May 5, 2011 at 12:27 pm
No worries, take your time. I have been looking for a solution as well; I will post something if I figure it out 🙂
Link to my blog http://notyelf.com/
May 5, 2011 at 2:51 pm
shannonjk (5/5/2011)
No worries, take your time. I have been looking for a solution as well; I will post something if I figure it out 🙂
Shannon, to confirm, in the end, you wanted something like this:
IF OBJECT_ID( 'tempdb..#tmp') is not null
drop table #tmp
CREATE table #tmp ( RowID INT IDENTITY( 1,1) NOT NULL, testfield nVARCHAR(256))
INSERT INTO #tmp (testfield) VALUES
( N'abcdefghijklmnopqrstuvwxyz'),
(N'abcdef' + nCHAR(300) + N'ghi'),
(N'abcd' + nchar(301) + N'efg' + nchar(302) + N'hij' + nCHAR(303))
SELECT * FROM #tmp
--===== "Inline" CTE-driven "Tally Table" produces values up to
-- 10,000... enough to cover VARCHAR(8000)
;WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
SELECT
t.RowID,
t.testfield AS BadField
FROM
#tmp AS t
JOIN
(SELECT DISTINCT
t.RowID
FROM
cteTally AS ct,
#tmp AS t
WHERE
ct.N <= LEN( t.testfield)
AND UNICODE(substring( t.testfield, n, 1)) > 256
) AS drv
ON t.RowID = drv.RowID
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
May 7, 2011 at 11:02 am
Craig Farrell (5/5/2011)
Shannon, to confirm, in the end, you wanted something like this:
What I was thinking is that no Tally Table is necessary for this. I believe a LIKE NOT (using [^]) would do it faster than a Tally Table implementation. I just haven't had the time to tweak it so it escapes the embedded wildcard and reserved characters _, %, and [, although I do believe I just had an epiphany in that area. 🙂
--Jeff Moden
May 7, 2011 at 11:43 am
Jeff Moden (5/7/2011)
Craig Farrell (5/5/2011)
Shannon, to confirm, in the end, you wanted something like this:
What I was thinking is that no Tally Table is necessary for this. I believe a LIKE NOT (using [^]) would do it faster than a Tally Table implementation. I just haven't had the time to tweak it so it escapes the embedded wildcard and reserved characters _, %, and [, although I do believe I just had an epiphany in that area. 🙂
Most likely, and I'm sure you're right. This would allow for stripping the characters or whatnot as a baseline for a number of different tasks, but for simple location it's probably a bit overkill and slower. If I get a chance, I'll see if I can determine whether an accented A exists in [a-Z]. 🙂
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
May 7, 2011 at 11:53 am
At this point, I thought Shannon didn't really want to strip characters... only find those "cells" that had unwanted characters. At the very least, LIKE NOT would likely make a decent filter for those "cells" that do need cleaning.
--Jeff Moden
May 7, 2011 at 8:52 pm
Ok, Shannon... here's part 1. No need for a Tally Table just to figure out if a "cell" has "weird" characters in it. The following code makes 100,000 rows with a column that may have occasional "weird" characters in it (about 7,800 rows... I say "about" because they're randomly generated).
--===== Create a test table where the last character
-- may or may not be an illegal character
SELECT TOP 100000
RowNum = IDENTITY(INT,1,1),
TestString =
CAST(NEWID() AS VARCHAR(36))
+ CHAR(ABS(CHECKSUM(NEWID()))%90+25)
INTO #TestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
Finding them is easy. We'll build your string of "valid" characters like you did before, but we'll insert a special "escape" character in front of the characters normally used as wildcards and character-range blocks, using a handful of nested REPLACEs, so they'll be treated as normal characters. Then it's just a matter of doing a SELECT with the right kind of LIKE in the WHERE clause, along with an ESCAPE clause naming the "escape" character we inserted.
--===== Build the "valid" character string.
DECLARE @LegalCharacters VARCHAR(256)
SELECT @LegalCharacters = ISNULL(@LegalCharacters,'')+CHAR(t.N)
FROM dbo.Tally t
WHERE t.N between 32 and 126
OR t.N IN (220,146,252,167,214,246,196,228,145,147,148,150,180,188,189,190,201,233,166,195,227,193,225,199,231,209,241)
;
--===== Add the "escape" characters to the characters normally used as WildCards and character range blocks.
-- In this case, I've used an underscore as the "escape" character.
SELECT @LegalCharacters = REPLACE(REPLACE(REPLACE(REPLACE(@LegalCharacters,'_','__'),'%','_%'),'[','_['),']','_]')
;
--===== And now we find ALL of the "cells" in the column that have a "weird" character.
-- The "AsciiValue" column shows the decimal ASCII value of the "weird" character found.
-- I didn't use every possible "weird" character but it will work for all "weird" characters.
SELECT RowNum, TestString, AsciiValue = ASCII(RIGHT(TestString,1))
FROM #TestTable
WHERE TestString LIKE '%[^'+@LegalCharacters+']%' ESCAPE '_'
;
The SELECT to find the "weird" characters finds all ~7,800 rows out of 100,000 rows in about 9 seconds on my 9-year-old single 1.8GHz CPU. It should do a whole lot better on a server.
--Jeff Moden
Viewing 15 posts - 316 through 330 (of 497 total)