Generating Username

  • Hi All,

    I am adding a username column to an existing users table. I am generating the usernames based upon the email addresses we already have in the system in a function. Then passing the users on at a time in a cursor, getting the username back from the function and updating the table. I am using the cursor because the logic in the function to avoid duplicate usernames depends on the usernames in the base table.

    Does anyone have a more efficient solution that does not involve a cursor or while loop?

    Thanks in advance for everyone's help,

    Paul

  • Populate new column into temp table by set based SQL and then insert desired usernames by selcting distinct values from temp table.

  • Well the logic to generate a unique username depends on whether or not it already exists. So i could insert them one at a time into a temp table and check against that, then do the massive update on the base table, but i dont know if this would be any faster.

    Originally i was thinking of using the cross apply function but i ended up with dups and i cannot do the update inside of the function. I have not used it all that much but was wondering if some how CTE's could be used.

  • Well, I'm not sure what the specification for your login names is, but if it were something like, say, take the email and make it a login name, then in order to handle duplicates, what you could do is something like,

    WITH cte AS

    (SELECT ID, email, ROW_NUMBER() over (PARTITION BY email ORDER BY ID) AS RowNum

    FROM CurrentTable)

    UPDATE cte

    SET LoginName = (CASE WHEN rowNum = 1 THEN email ELSE LEFT(@email, CHARINDEX('@', email) - 1) + CAST(RowNum AS VARCHAR) + SUBSTRING(email, CHARINDEX('@', email), 9999))

  • Awesome, thanks for your help. Now would i be able to add logic to lets say concatenate a first and last name if the login is taken? I was told to avoid adding numbers to the end of the login if possible.

  • Oops - had a syntax error in my last post, forgot the END for the CASE statement.

    Anyway- something like this?

    WITH cte AS

    (SELECT ID, firstname, lastname, email, ROW_NUMBER() over (PARTITION BY email ORDER BY ID) AS RowNum

    FROM CurrentTable)

    UPDATE cte

    SET LoginName = (CASE WHEN rowNum = 1 THEN email ELSE firstname + '_' + lastname + '_' + email END)

  • I just wanted to add that the email addresses are unique.

  • Well, that makes a pretty big difference! If email addresses are unique, and you used the email address to generate the login name, then how can you ever run into duplicate login names?

  • Well lets say we have the following addresses:

    paul@yahoo.com

    paul@gmail.com

    First Name = Paul

    Last Name = Smith

    They want me to take paul as the first login and then paulsmith as the second. Then i have criteria to follow if thats taken and so on using first name last initial...

    I have built all of that into a function. Originally i tried using the cross apply in an update calling the this function, the problem there is the base table is never updated so all of my IF statements are useless because the logins don't exist yet.

    Does that make any sense?

    Thanks again for your time...

  • Okay, sure, makes sense. So you're saying that your function is going to return the correct login name based on some parameters, but your function is going to take a long time to run since it needs to do a full table scan to figure out what the correct login name should be, for each login name accessed?

    The thing about row_number is that, it's good for giving you a seperation of values, but it's not so useful for dealing with multiple conditions. In your case, i'd have to think about it more ... going home for the day though, so I'm sure someone else will help you out before I get back tomorrow 😛

  • Yeah i see why the code you posted originally will not work. Thanks again for you time. Hopefully i can come up with something if nobody else can assist.

  • Alright - depending upon how big your table currently is, and what kind of system resources you have to make use of, you might want to consider using SSIS to handle your initial population of the login name value.

    It seems to me that, once initially populated, you should be able to handle things relatively easily from then on; as long as your columns that you're using to define the login name are indexed, all you need to do is have your procedure that's creating your records, call the function to obtain the correct login name. Yeah, it's going to have to look through the current records to get the login name, but itt'l do so using index seeks, so it shouldn't be too bad, and if your function is written well, it should handle it.

    So, the problem becomes just dealing with populating it initially. This is what SSIS is good for.

    What you can do in SSIS is load the whole table as an in-memory lookup table. Once that's done, the task of coming up with the login name for each record is much, much faster, as lookups against a lookup table are very fast with SSIS. The only catch is that you'll need a fairly powerful system, depending on the size of your table.

    When running some SSIS tasks against one of my applications, which involved loading 3 tables of about 10 million records each into memory, the SSIS application was taking up around 14gb of memory to run; now, mind you, it wasn't tuned efficiently, and if it were, it would be much less intensive, but if you're dealing with only having a 2gb memory machine, it could be tricky to get SSIS to work for you. Still, it should be doable.

  • Ill give SSIS a shot, thanks again for your help. Even though i was unable to solve this with a CTE i def learned more about them,

    Thanks again,

    Paul

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply