Handling "optional fields"

  • I have a situation where I need to be able to handle "optional fields" within a database structure that is used in an ASP setting. In other words, I have tables that have a set of base columns. Let's say accountID, recordID, companyname for example. Multiple customers will store data in this table and each customer's data will have their accountID so that we can segment user data. The catch is that each customer must be able to add additional columns of their choosing to the base table.

    For example, one customer may add two columns called SICcode (char15) and EmployeeCount (int) while another customer may add a description (text) column while a third customer may not add any new columns. The data added into these additional columns must be searchable.

    I can think of three basic ways to handle this, all outlined below. I wonder if anyone else has had experience with this and can make any suggestions.

    option 1) add a text column that stores the optional data in an XML structure. Each customer would have a different XML structure to store their data in this field (of course, the actual structure would need to be stored somewhere). Everytime the user added or subtracted a field, the XML for their dataset would need to be modified.

    Pros - powerful - users can add fields to their hearts content. No real limit in what they would store.

    Cons - pia to maintain - everytime a user wanted to add/remove a field, there would be lots of logic that would need to be added within a transaction to add/remove the correct nodes. If the user has a lot of records, this could potentially cause a major slowdown and blocking/locking issues. Also, could not easily search the "fields".

    option 2) create a separate table for each customer that was an extended table that linked back to the records they owned in the primary org table. They would be able to add and remove fields to this table which would be handled through code. There would still be locking issues, but they would pretty much only affect their data in most cases.

    Pros - powerful, again users can add and remove whatever they want. Much easier to maintain than option1 because the database would handle all the logic of adding and removing fields. Very easy to search the data that was in the extended table.

    Cons - again, with a customer that has a lot of records, could cause blockages. Also, need to manage all the tables for the different customers. Makes it hard to optimize stored procedures for joins as one stored procedure would not work for all customers since we'd be joining different tables based on who the customer was.

    option 3) Simply add 20-30 varchar255 columns. Whenever a customer wanted a new field, the varchar field would be utilitzed for this purpose and thus could hold anything.

    Pros - probably best in terms of performance. I would be able to do all the joins, etc. in stored procedures.

    Cons - least flexible - limited in amount of data that could be stored. Would be joining lots of fields which would not have anything in them.

    Anyway, those are the 3 options I can think of. I think I am probably leaning towards 3 although 2 is also a possibility. 1 seems the least attractive. Anyone have some suggestions on how best to proceed?

    Thanks,

    Alex Gadea

    Apptik Inc.

  • Personally, I think 2 is actually your best option. The cons that you outlined I don't see as drawbacks, actually. You can easily add powerful join indexing simply by adding a mandatory index on each extended table on its foreign key:

    
    
    --Main Table Sample:
    CREATE TABLE MainTable
    (
    accountID INT NOT NULL
    , recordID INT NOT NULL
    , companyname VARCHAR(100) NOT NULL
    , CONSTRAINT PK_MainTable PRIMARY KEY (accountID, recordID)
    )
    --Example Extended Tables
    CREATE TABLE ExTable_Cust1
    (
    accountID INT NOT NULL
    , recordID INT NOT NULL
    , someIntField INT NULL
    , someTextField TEXT NULL
    , CONSTRAINT FK_MainTable_ExTable_Cust1 FOREIGN KEY (accountID, recordID) REFERENCES MainTable (accountID, recordID)
    )
    --Index on Foreign Key
    CREATE NONCLUSTERED INDEX IDX_ExTable_Cust1_MainTable ON ExTable_Cust1 (accountID, recordID)

    This will give you a great indexing scheme for table joins, and your ease of maintenance will probably depend on how modular you make your logic to create the extended tables (ie with a standard suffix, etc...)

    HTH,

    Jay

  • When one customer is modifying their extended table, would we see locking issues on the primary table for customers other than the one modifying the extended table? In other words if customer A is modifying ext_tableA and customer B does a join of primarytable to ext_tableB, would customer B experience locking/contention issues?

    Also, this would mean that all my select statements would need to be dynamic in that they would be joining different extended tables based on the customer. Would it be worth it to make one primary stored procedure that dynamically called a second stored procedure based on the customer doing the select? This way I would not get real optimization on the first sp but would on the secondary SP?

    Truthfully, I know that option 2 is probably the best option, but it's also going to be a PIA to write the code based on some other variables in our app. I guess I was hoping for validation that option 3 might work since it will be the easiest to implement ;).

    thanks,

    Alex

  • quote:


    When one customer is modifying their extended table, would we see locking issues on the primary table for customers other than the one modifying the extended table?


    No, SQL Server implements row-level lock. Looking at your keys, I don't believe this would be an issue.

    quote:


    Also, this would mean that all my select statements would need to be dynamic in that they would be joining different extended tables based on the customer. Would it be worth it to make one primary stored procedure that dynamically called a second stored procedure based on the customer doing the select? This way I would not get real optimization on the first sp but would on the secondary SP?


    Good point, and yes, it would be worth your while.

    Good luck with your solution; I realize this type of solution can be tricky! 🙂

    Jay

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply