Fuzzy Lookup Transformation

  • Does anyone know what algorithm the SSIS Fuzzy Grouping component uses to produce a confidence score?

    I've been able to calculate the similarity score and produce a subset of data from a lookup table, but I need to produce a confidence score for each result in that subset.

    My ultimate aim is to produce a fuzzy grouping SSIS custom component. I know one already exists, but this component only comes with SQL Server 2012 Enterprise edition, hence my need to develop the component.

    I've searched the internet for the last few days to no avail. Any help is greatly appreciated.

  • I am looking in BIDS for SQL 2005 right now and I see a fuzzy grouping transform.

  • I found the answer.

    I've developed a basic console application in C# to test the algorithms and so far they seem to be working perfectly.

    Next step is to stick the code into a script component and then develop a data transformation custom component.

  • Daniel Bowlin (5/16/2012)


    I am looking in BIDS for SQL 2005 right now and I see a fuzzy grouping transform.

    This. It's right there... hope you haven't wasted too much time?

  • I know there are Fuzzy Lookup and Fuzzy Grouping components in SQL Server, but please read my original post:

    "My ultimate aim is to produce a fuzzy grouping SSIS custom component. I know one already exists, but this component only comes with SQL Server 2012 Enterprise edition, hence my need to develop the component."

    My company will be upgrading to SQL Server 2012 as soon as it is released, but not all our SQL Server instances will be Enterprise edition.

  • I did read your original post... but read what I quoted. This component has existed in SSIS at least as early as SQL Server 2005 Standard edition. This isn't a new component; it's been around for years.

  • I know, but when we upgrade to SQL Server 2012 and begin to create SSIS packages there, we won't be able to use the fuzzy components unless we are using the Enterprise edition.

    Besides, the company wants more control over the fuzzy matching components. I've got a whole host of other custom components to develop, but this is the biggy.

  • Ah, it is not that you don't have it now, it is that you won't have it after the upgrade. That is unfortunate.

  • Are you saying you're currently running Fuzzy Grouping in SQL 2008 on a server that is not Enterprise? This is an Enterprise-only (except Developer) feature.

  • No, I'm not saying that at all. We have 2008 Enterprise edition, but the company is concerned about the licensing costs of 2012 Enterprise edition.

  • And I assume all your servers are not on SA then either. I think attempting to write your own fuzzy logic will be a daunting task; the MS version is pretty black box and I haven't seen a whole lot on how they actually get the component to work.

  • I agree, but I'm making rather good headway.

    I've got a prototype creating an ETI (error-tolerant index) with tokens and subtokens from a reference table, as well as accurately calculating the similarity score and confidence score.

    It's only in a console application at the moment but I'll move it into a component once I'm happy with the prototype.
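
    The thread never says which formulas the prototype uses, and the built-in component's algorithm is not described here either. Purely as an illustration of the general shape (tokens, subtokens, an index over a reference table, a similarity score and a confidence score), here is a minimal Python sketch. Everything in it is an assumption: subtokens are taken to be 3-grams, similarity is plain Jaccard overlap of subtoken sets, confidence is a candidate's share of the total similarity among all candidates, and the function names are made up for this example.

```python
from collections import defaultdict


def tokens(text):
    """Lowercase and split on whitespace; a stand-in for real delimiter handling."""
    return text.lower().split()


def subtokens(token, q=3):
    """Return the q-grams of a token, or the token itself if it is too short."""
    if len(token) <= q:
        return {token}
    return {token[i:i + q] for i in range(len(token) - q + 1)}


def features(text, q=3):
    """Collect the subtokens of every token in a string."""
    result = set()
    for tok in tokens(text):
        result |= subtokens(tok, q)
    return result


def build_index(reference_rows, q=3):
    """Map each subtoken to the ids of the reference rows that contain it."""
    index = defaultdict(set)
    for row_id, text in enumerate(reference_rows):
        for feat in features(text, q):
            index[feat].add(row_id)
    return index


def similarity(a, b, q=3):
    """Assumed measure: Jaccard similarity of the two subtoken sets, in [0, 1]."""
    fa, fb = features(a, q), features(b, q)
    if not fa and not fb:
        return 1.0
    return len(fa & fb) / len(fa | fb)


def lookup(query, reference_rows, index, q=3):
    """Return (row_id, similarity, confidence) tuples, best match first."""
    # Only rows sharing at least one subtoken with the query are candidates.
    candidates = set()
    for feat in features(query, q):
        candidates |= index.get(feat, set())
    scored = [(rid, similarity(query, reference_rows[rid], q)) for rid in candidates]
    total = sum(score for _, score in scored)
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    # Assumed confidence: a candidate's share of the total similarity mass.
    return [(rid, score, score / total if total else 0.0) for rid, score in ranked]
```

    With a reference table such as `["Microsoft Corporation", "Micro Soft Corp", "Oracle Corporation"]`, calling `lookup("Microsft Corporation", rows, build_index(rows))` ranks row 0 first. The confidence values sum to 1 across the returned candidates, so a lone strong match scores high while several near-equal matches each score low.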

  • Wait a minute, do you mean you don't use Fuzzy Grouping but...

    No, I'm only joking. Thought I'd flog that dead horse.
