A maths puzzle for anyone up for a challenge!

  • webtekkie (10/30/2012)


    Dwain - that's impressive! I'm going through it now to see understand what's going on here.

    Jeff - I've checked with the data sourcing team - they have told me that there will ONLY be 7 chars for Value1 and 10 chars for Value2 - never more and never less.

    I think Jeff may have been trying to determine whether lower case letters are allowed in either or both strings.

    If you thought that was impressive (not really, just foolin' around) then here's one for the math geeks out there. Might even run slightly faster.

    Same set up except change to the length of v (got it down to 34 bytes!):

    DECLARE @Values TABLE

    (ID INT IDENTITY, Value1 VARCHAR(7)COLLATE SQL_Latin1_General_CP1_CS_AS

    , Value2 VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CS_AS

    ,v DECIMAL(34,20))

    DECLARE @Alphanumerics CHAR(62) =

    '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

    -- Scramble the encryption key

    ;WITH Tally (n) AS (

    SELECT n=number

    FROM [master].dbo.spt_values Tally

    WHERE [Type] = 'P' AND Number BETWEEN 1 AND 62)

    SELECT @Alphanumerics = (

    SELECT SUBSTRING(@Alphanumerics, n, 1)

    FROM Tally a

    ORDER BY NEWID()

    FOR XML PATH(''))

    SELECT EncryptionKey=@Alphanumerics

    -- Sample data

    INSERT INTO @Values (Value1, Value2)

    SELECT 'AXH32CT','22124587'

    UNION ALL SELECT '544DX88','21457751'

    UNION ALL SELECT '43sAb2', '32m9SDSA09'

    First the encode:

    -- Encode

    ;WITH Tally (n) AS (

    SELECT n=number

    FROM [master].dbo.spt_values Tally

    WHERE [Type] = 'P' AND Number BETWEEN 1 AND 10),

    Encode1 AS (

    SELECT ID, Value1, Value2, a.v1, n=POWER(CAST(10 AS DECIMAL(14,0)), 2*(LEN(Value1)-n))

    FROM @Values

    CROSS APPLY (

    SELECT n, RIGHT('0' +

    CAST(CHARINDEX(SUBSTRING(Value1,n,1), @Alphanumerics) AS VARCHAR),2)

    FROM Tally

    WHERE n BETWEEN 1 AND LEN(Value1)) a(n, v1)),

    Catenate1 AS (

    SELECT ID, Value1, Value2

    ,v1=( SELECT SUM(n * CAST(v1 AS DECIMAL(14,0)))

    FROM Encode1 b

    WHERE a.ID = b.ID)

    FROM Encode1 a

    GROUP BY ID, Value1, Value2),

    Encode2 AS (

    SELECT ID, Value1, Value2, v1, a.v2, n=POWER(CAST(10 AS DECIMAL(20,0)), 2*(LEN(Value2)-n))

    FROM Catenate1

    CROSS APPLY (

    SELECT n, RIGHT('0' +

    CAST(CHARINDEX(SUBSTRING(Value2,n,1), @Alphanumerics) AS VARCHAR),2)

    FROM Tally

    WHERE n BETWEEN 1 AND LEN(Value2)) a(n, v2)),

    Catenate2 AS (

    SELECT ID, Value1, Value2, v1

    ,v2=( SELECT SUM(n * CAST(v2 AS DECIMAL(20,0)))

    FROM Encode2 b

    WHERE a.ID = b.ID)

    FROM Encode2 a

    GROUP BY ID, Value1, Value2, v1)

    UPDATE a

    SET v=CAST(v1 AS VARCHAR(14)) + '.' +

    CASE LEN(CAST(v2 AS VARCHAR(20)))%2 WHEN 1 THEN '0' ELSE '' END +

    CAST(v2 AS VARCHAR(20))

    -- Display the encoded values

    OUTPUT INSERTED.ID, INSERTED.Value1, INSERTED.Value2, INSERTED.v

    FROM @Values a

    INNER JOIN Catenate2 b

    ON a.ID = b.ID

    Now the decode:

    -- Decode

    ;WITH Tally (n) AS (

    SELECT n=number

    FROM [master].dbo.spt_values Tally

    WHERE [Type] = 'P' AND Number BETWEEN 1 AND 37),

    Decode1 AS (

    SELECT ID, Value1, Value2, v, n, v1

    FROM @Values

    CROSS APPLY (

    SELECT n, SUBSTRING(@Alphanumerics

    ,CAST(SUBSTRING(CASE LEN(CAST(FLOOR(v) AS VARCHAR(14)))%2 WHEN 1 THEN '0' ELSE '' END +

    CAST(FLOOR(v) AS VARCHAR(14)), 2*n-1, 2) AS INT), 1)

    FROM Tally

    WHERE n BETWEEN 1 AND 1+LEN(CAST(FLOOR(v) AS VARCHAR(14)))/2) a(n, v1)),

    Decode2 AS (

    SELECT ID, Value1, Value2, v, n, v2

    FROM @Values

    CROSS APPLY (

    SELECT n, SUBSTRING(@Alphanumerics

    ,CAST(SUBSTRING(CAST(v-CAST(FLOOR(v) AS DECIMAL(18,0)) AS VARCHAR(22)), 3+2*(n-1), 2) AS INT), 1)

    FROM Tally

    WHERE n BETWEEN 1 AND 10) a(n, v2)),

    SELECT ID, Value1, Value2, v

    ,v1=( SELECT v1 + ''

    FROM Decode1 b

    WHERE a.ID = b.ID

    ORDER BY n

    FOR XML PATH(''))

    ,v2=( SELECT v2 + ''

    FROM Decode2 b

    WHERE a.ID = b.ID

    ORDER BY n

    FOR XML PATH(''))

    FROM @Values a

    GROUP BY ID, Value1, Value2, v

    There's probably a few unnecessary CASTs running around in this one but taking them out would be a chore. Did manage to remove a couple of the CTEs on the decode though. Depending on which approach you choose, you should be able to do the same (remove 2 CTEs) from the first approach the same way I did on the second.

    Both solutions use the same basic idea. Use a tally table to convert each string one character at a time to get an index into @Alphanuerics. The second solution avoids needing to store the length of each string in the encoded string itself - that's how I reduced the size by 3 miserly bytes. It also uses sums against powers of 10 to put the strings back together instead of doing a character concatenate.

    Definitely can't wait to see what Jeff comes up with. The man is in a class all by himself.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • And here is my version:

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[f_JoinToOne]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))

    DROP FUNCTION [dbo].[f_JoinToOne]

    GO

    CREATE FUNCTION [dbo].[f_JoinToOne] (@value1 VARCHAR(7),@value2 DECIMAL(10))

    RETURNS DECIMAL(22) WITH SCHEMABINDING

    AS

    BEGIN

    -- some variables

    DECLARE @characters CHAR(36),

    @iterator BIGINT = LEN(@value1),

    @multiplier BIGINT = 1,

    @value10 BIGINT = 0

    -- encoding string and the default result

    SET @characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    IF @value1 IS NULL OR @value2 IS NULL RETURN NULL;

    --convert value to decimal

    WHILE @iterator > 0

    SELECT @value10 = @value10 + (( CHARINDEX(SUBSTRING(@value1, @iterator, 1 ), @characters)-1) * @multiplier )

    ,@multiplier = @multiplier * 36

    ,@iterator = @iterator -1;

    RETURN CAST(@value10 * POWER(10.0,10) AS DECIMAL(22)) + @value2;

    END

    GO

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[f_SplitToTwo]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))

    DROP FUNCTION [dbo].[f_SplitToTwo]

    GO

    CREATE FUNCTION [dbo].[f_SplitToTwo] (@value DECIMAL(22))

    RETURNS @t TABLE (value1 VARCHAR(7), value2 DECIMAL(10)) WITH SCHEMABINDING

    AS

    BEGIN

    -- some variables

    DECLARE @value1 VARCHAR(7) = '',

    @value2 DECIMAL(10),

    @characters CHAR(36),

    @value10 BIGINT = 0

    IF @value IS NULL RETURN;

    -- encoding string and the default result

    SELECT @characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    SET @value2 = CAST(RIGHT(CAST(@value AS VARCHAR(22)),10) AS DECIMAL(10));

    SET @value10 = (@value - @value2) / POWER(10.0,10)

    WHILE @value10 > 0

    SELECT @value1 = SUBSTRING(@characters, @value10 % 36 + 1, 1) + @value1,

    @value10 = @value10 / 36;

    INSERT @t SELECT @value1, @value2

    RETURN;

    END

    GO

    select [dbo].[f_JoinToOne]('ZZZZZZZ',9999999999)

    select * from [dbo].[f_SplitToTwo](783641640959999999999)

    select [dbo].[f_JoinToOne]('1',0)

    select * from [dbo].[f_SplitToTwo](10000000000)

    select [dbo].[f_JoinToOne]('A',1)

    select * from [dbo].[f_SplitToTwo](100000000001)

    select [dbo].[f_JoinToOne]('',1)

    select * from [dbo].[f_SplitToTwo](1)

    Cannot be sure for performance, but will not be surprised that [f_JoinToOne] (loop-based scalar UDF) will outperform CTE/tally table based.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • webtekkie (10/30/2012)


    Dwain - that's impressive! I'm going through it now to see understand what's going on here.

    Jeff - I've checked with the data sourcing team - they have told me that there will ONLY be 7 chars for Value1 and 10 chars for Value2 - never more and never less.

    Even better! But I still need to know what the actual content possibilites will be. To re-ask the question, only 0-9 and A-Z (upper case only) characters for Value 1 and only 0-9 for Value 2?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks again, Jeff - I'm even more impressed now.

    Jeff, I think that Dwain is setting some pretty high expecatations here! Personally, I don't think that there is a solution better than his but are you up for proving me wrong? 😛

    The alpha characters are all uppercase.

  • Sorry Jeff - yes, exactly as you stated - 0-9 and A-Z (upper case only) characters for Value 1 and only 0-9 for Value 2?

  • dwain.c (10/30/2012)


    Definitely can't wait to see what Jeff comes up with. The man is in a class all by himself.

    Thanks for the compliment :blush: but seriously not true. I haven't checked your code for performance but it looks like something that I might write. Well done!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • webtekkie (10/30/2012)


    Sorry Jeff - yes, exactly as you stated - 0-9 and A-Z (upper case only) characters for Value 1 and only 0-9 for Value 2?

    Perfect! I'll take a look at this tonight after work. I'm not sure that I can beat any of the fine solutions posted so far but it'll sure be fun trying. I love these kinds of problems.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I hate to throw cold water all over your efforts Dwain, but it appears the output does have to be an int and not a decimal. I think that means that your solution isn't going to work?

  • Another take on it. Simple... but I think Eugene's might be faster. It's a lot closer with a physical tally table, but I think his still wins.

    CREATE FUNCTION dbo.TallyCombine(

    @Value1varchar(7),

    @Value2varchar(10)

    )

    RETURNS decimal(24,10)

    AS

    BEGIN

    DECLARE @holding varchar(25) = ''

    ;WITH

    t1 AS (SELECT 1 N UNION ALL SELECT 1 N),

    t2 AS (SELECT 1 N FROM t1 x, t1 y),

    Tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM t2 x, t2 y)

    SELECT @holding = @holding + CAST(ASCII(SUBSTRING(@Value1,N,N+1)) AS varchar(30))

    FROM Tally

    WHERE N < LEN(@Value1)

    SELECT @holding = @holding + '.' + @Value2

    RETURN @holding

    END

    GO

    CREATE FUNCTION dbo.TallySplitNums(

    @Value3varchar(25)

    )

    RETURNS @t TABLE(Value1 VARCHAR(7), Value2 int)

    AS

    BEGIN

    DECLARE @Value1 varchar(7) = ''

    ;WITH

    t1 AS (SELECT 1 N UNION ALL SELECT 1 N),

    t2 AS (SELECT 1 N FROM t1 x, t1 y),

    Tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM t2 x, t2 y)

    SELECT @Value1 = @Value1 + CAST(CHAR(SUBSTRING(@Value3,N-1,2)) AS varchar(30))

    FROM Tally

    WHERE N < CHARINDEX('.',@Value3)

    AND N%2 = 0

    INSERT INTO @t(value1,value2)

    SELECT @Value1, RIGHT(@Value3, LEN(@Value3) - CHARINDEX('.',@Value3))

    RETURN;

    END

    GO

    SELECT dbo.TallyCombine('ABCD12Z','1234567890')

    SELECT * FROM dbo.TallySplitNums('65666768495090.1234567890')

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • webtekkie (10/30/2012)


    I hate to throw cold water all over your efforts Dwain, but it appears the output does have to be an int and not a decimal. I think that means that your solution isn't going to work?

    7-digits in 36-based numeric system will take upto 12 digits in 10-based (decimal) numeric system, plus your 10 digits of second value... So, it cannot be INT or even BIGINT! 4 and 8 bytes are not enough.

    Combination of your two values in decimal numeric system will need upto 22 digits.

    If you just don't want to have decimal point, then you can use my code, it does produce DECIMAL(22).

    BTW, I've tested my scalar UDF against Dwain method on 1,000,000 records. As I thought, my UDF is faster. Stopped Dwain one after 3 min of execution but my one did complete update in 52 seconds.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • It has to be an "integer value", but does it have to be an "integer datatype"?

    A varchar(22) can hold things like "1234567890123456789012". Looks like a number, is a number to a human user, allows more digits than Int/BigInt. Only problem is you'll get overflow errors if you try to do mathematical operations on it as an integer instead of a floating point number. (And you'll get the usual rounding issues with floating point numerics if you do math on it that way, just not error messages.)

    If the intent is to store a numeric representation of the alphanumeric codes, then varchar() should be fine. If the intent is to do math on it, then it won't work, or will require custom code in order to work.

    Usually, this kind of data doesn't end up having math done on it.

    Store it that way, and you can convert to/from without error. Will that do what you need?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Eugene - I didn't even notice your post earlier as I was posting at the same time. It's an excellent solution but sadly fails for the same reason as Dwain's - it returns a decimal.

    GSquared - full marks for thinking "outside the box" - unfortunately it does have to be an integer datatype as the third party solution requires it to be so.

    So, it does look like I'm asking the impossible - it isn't mathematically possible to turn that many characters into an integer datatype which will fit into a bigint.

  • webtekkie (10/30/2012)


    Eugene - I didn't even notice your post earlier as I was posting at the same time. It's an excellent solution but sadly fails for the same reason as Dwain's - it returns a decimal.

    GSquared - full marks for thinking "outside the box" - unfortunately it does have to be an integer datatype as the third party solution requires it to be so.

    So, it does look like I'm asking the impossible - it isn't mathematically possible to turn that many characters into an integer datatype which will fit into a bigint.

    Not in a way that can be recomposed back to the original data. Checksum() will get you an integer from pretty much any input, and it's deterministic so far as I know, but there's no way to compose back to the original value. Plus, of course, you can end up with collisions on checksums, which means it's not guaranteed unique for different inputs.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • ...

    So, it does look like I'm asking the impossible - it isn't mathematically possible to turn that many characters into an integer datatype which will fit into a bigint.

    Yes, exactly right! It is not possible mathematically and I've explained already why is that.

    But...

    Practically, you can do it. I don't believe that number of rows in your table is larger than upper limit of INT (or BIGINT).

    So, as it was discussed somewhere at the beginning, create a mapping table with IDENTITY (int or bigint whatever appropriate). Simple and very fast.;-)

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (10/30/2012)


    ...

    So, it does look like I'm asking the impossible - it isn't mathematically possible to turn that many characters into an integer datatype which will fit into a bigint.

    Yes, exactly right! It is not possible mathematically and I've explained already why is that.

    But...

    Practically, you can do it. I don't believe that number of rows in your table is larger than upper limit of INT (or BIGINT).

    So, as it was discussed somewhere at the beginning, create a mapping table with IDENTITY (int or bigint whatever appropriate). Simple and very fast.;-)

    Not that it matters now but I agree. Not mathematically possible with BIGINT given 367 * 1010.

    Shifting gears, my recommendation would be to stop messing around with such a conversions and do something very similar to what Eugene suggested except, instead of a mapping table, just add an IDENTITY column to the existing table and call it a day. If you can't modify the table by adding an identity column to it, then a mapping table, as Eugene suggested would certainly do the trick.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply