can i use hashkey to speedup join with varchar

  • if i don't have a numeric column to uniquely identify a varchar description field and i need to join the tables on the varchar field, is hashkey a valid option... if not does anyone have any suggestion on joining or filtering on a varchar() field.

    thanks

  • How big is the varchar field? While they are not as performant as integers, if it's not large and the data inside is selective enough, you can put an index on the column and use it in the join.

    If you could add a hash column, why not just add an integer and use that instead?

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • see i was hoping someone would tell me i don't even know what i'm talking about with the hashkey usage.

    the field is varchar(100) and it exists on multiple tables.. and there is no shared key. I guess a shared key could be created, but that means

    creating the key relationship to the varchar on all the tables to make sure they are the same and then making sure all our processing are using the right key, probably throught a lookup table... sounds good, but takes time. so for today i was just trying to figure out the fastest way join the varchar field on two tables as a work around.

    thanks

  • Well, a hash key can work. In fact, if you're dealing with really big fields, it's probably a good idea since you can't go beyond 600 bytes in an index key. But it's not very high on my list of preferences. It sounds like you have a problem with your database design. Instead of trying to hack a solution with a hash field, rearchitect your design. It's much more likely to be a successful solution that way.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Natarshia,

    You have to have some way to uniquely identity each row in each table in order to properly join to other tables or you will get multiple joins from one to the other instead of a join from one distinct record to the other.

    You may want to do a proof of concept first to determine if the varchar(100) column is unique in each table. Something like:

    SELECT Varchar100Column

    INTO #Temp

    FROM YourTable

    ALTER TABLE #Temp ADD CONSTRAINT #TempPK PRIMARY KEY CLUSTERED (Varchar100Column)

    If you get an error message, then the varchar(100) column is not unique in the table. Even if you determine that the varchar(100) column is unique, you would have to have a way to keep it that way as new rows are inserted into the table.

    Even then, turning it into a hash key could produce duplicate hash keys - there can be duplicates if the right conditions exist.

    I agree with Grant here. There seems to be a problem with the database design.

    Todd Fifield

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply