The Ad Hoc Change

  • Comments posted to this topic are about the item The Ad Hoc Change

  • Right on. I would add only one precept: use scripts, not the GUI. Scripts can be reviewed and reused later, but with the GUI you may log a step correctly while actually doing something slightly different, and wrong.
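
    Even a one-line fix is worth saving as a script, because the script is both the review artifact and the rerun artifact. A minimal sketch of what I mean (the table, column, and ticket number are invented):

    -- CHG-1234 (hypothetical ticket): clear the suspect flag left by the failed import
    -- author, reviewer, and run date go here so the file stands on its own in source control
    SELECT OrderId, IsSuspect FROM dbo.Orders WHERE ImportBatchId = 987;  -- capture the "before" state in the run log

    UPDATE dbo.Orders
    SET IsSuspect = 0
    WHERE ImportBatchId = 987;

    SELECT OrderId, IsSuspect FROM dbo.Orders WHERE ImportBatchId = 987;  -- capture the "after" state in the same log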

  • Ironically, it's actually easier to make manual adjustments like this in a badly designed database--normalisation makes things easier for computers but a lot harder for people!

  • It's also easier to put invalid data in a denormalized database or one that lacks proper declarative referential integrity.

  • gitmo (5/31/2011)


    It's also easier to put invalid data in a denormalized database or one that lacks proper declarative referential integrity.

    Entirely true, and I wasn't suggesting that we should all design bad databases just to make manual updates easier--just that if you're working with an already badly designed database, this sort of manipulation isn't as tricky!

  • Right on! Thank you for sharing the original post, Steve! In my 6 years of experience, I have done numerous production ad-hoc updates, but have always ensured that all 7 processes are followed. This ensures that we don't end up in a cycle of ad-hoc updates (we fix the data, fix the code, and that's it - we should not have to fix the data again).

    Also, the team was structured such that we had our own DBA. The regular DBA would be kept in sync (just like the rest of the development team) on any production issues so that they could take care of them in the next release of the product. Most of the time, we found that both were masters of their own areas - the production/sustenance team/DBA was best at troubleshooting and applying such patches, and the development DBA at actually fixing the issue once the root cause was identified.

    Thanks & Regards,
    Nakul Vachhrajani.
    http://nakulvachhrajani.com

    Follow me on
    Twitter: @sqltwins

  • In the past, whenever a "data fix" was required in production, a change order was created, and I'd write a T-SQL script which would go through the normal version control, QA, and production deployment process.

    Regarding the possibility of an ad-hoc change violating some data integrity or business rule: that's a very good point. What I do is attempt to identify stored procedure calls that could accomplish the same task, rather than just coding a series of ad-hoc DML statements. Fortunately we don't put DML logic in triggers. Sometimes the application doesn't support a specific workflow or use case, and a one-off script is needed as an interim solution until the application developers can develop, QA, and deploy the permanent solution to production.

    These scripts, which make stored procedure calls to mimic the application workflow, are actually very similar to the scripts I use for unit testing purposes.
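
    For example, a fix script along these lines (the procedure names and parameters here are just placeholders for whatever the application would really call):

    -- one-off fix: close out an order the application left stranded, using the same
    -- procedures the application itself would call so their business rules still apply
    set xact_abort on;
    begin transaction;

    exec dbo.usp_CancelOrderLine @OrderId = 98765, @LineNo = 3, @Reason = N'data fix';
    exec dbo.usp_RecalculateOrderTotals @OrderId = 98765;

    commit transaction;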

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I was the senior DBA in a PCI-certified business. I had zero access to the production systems and was required to work through an operator for all ad-hoc changes.

    When I took the role I thought these constraints impossible to manage in a timely fashion. I was wrong.

    Over time we developed the ability to rapidly emulate the production system in a dev environment.

    All ad-hoc changes were designed, scripted and tested in the DEV environment. Once they passed DEV, they were reviewed and approved by QA. If there was enough risk, the scripts were also tested in the QA environment.

    When complete, the scripts were passed to the production operator. The production operator knew little SQL, and simply ran the scripts from a command line.

    In the four years I worked with this process, we NEVER had to execute our backout plan. We were very quick to get changes into production with 100% success.

    Having worked within this process, seen how quickly it can be executed, and experienced the safety and integrity of the data, I am sure I will embrace this kind of process long into the future.

    Cheers,

    Ben

  • One other thing I'd like to add about deploying ad-hoc change scripts: I also wrap the DML or procedure calls in a transaction and add assertions on the expected number of rows modified. It's not always possible to unit test the script against a copy of the same data as it exists in production, but the developer should know ahead of time how many rows are to be affected. For example:

    update some_table
    set some_column = null
    where member_id = 42376 and task_id = 52;

    if @@rowcount <> 3
        raiserror ('Assertion 38: An unexpected number of rows were updated. The transaction will now be aborted.', 16, 1);
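
    To be complete, the transaction wrapper I mentioned looks roughly like this (same made-up table and row count as above); the assertion rolls the work back before raising the error:

    begin transaction;

    update some_table
    set some_column = null
    where member_id = 42376 and task_id = 52;

    if @@rowcount <> 3
    begin
        rollback transaction;
        raiserror ('Assertion 38: An unexpected number of rows were updated. The transaction has been rolled back.', 16, 1);
    end
    else
        commit transaction;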

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Assertions are a great extra check, Eric. Thanks for the note.

  • Steve Jones - SSC Editor (5/31/2011)


    Assertions are a great extra check, Eric. Thanks for the note.

    I'd much rather have the production control team notify me that a script failed to commit changes for a single client due to an assertion than be notified that the script completed without error but inadvertently deleted records for 100,000 clients in the process. I can re-submit a fix for the first type of script failure and still get home in time for dinner, but the second type of script failure is a disaster.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I am rather appalled that all the answers to date are so narrowly concerned with T-SQL, scripts, etc., to "correct" the so-called errors.

    There appears, to me at least, to be no effort to discover the final effect the changes will have once the scripts are executed. For example, changing a negative inventory on-hand quantity to either zero or some positive value. In my specific case it did at first appear to be an application error, but further evaluation showed that the effect of that supposed error was an overstatement of inventory to the value of slightly over one million USD. Fortunately I had been employed by the purchasing company to examine the database and interfaces of the company being purchased. Further discussions with the prior DBA revealed that he had been "instructed" by management to alter the interface code, and when he learned of the pending purchase of the company, he quickly resigned and obtained new employment. What I discovered, combined with the prior DBA's revelation, uncovered an unethical scam to garner the private owners of the company being purchased an illicit gain of one million plus dollars.

    So my recommendation to all who are requested to make "ad hoc changes": be aware of the effect on the company as a whole, and, to protect your own butt, get the "ad hoc change request" in writing, signed by a company official, and keep a copy of that written instruction off site.

    And once satisfied that you are correcting a pure and simple error, go ahead: write the scripts, do the testing and checking, and get the job done.

    If everything seems to be going well, you have obviously overlooked something.

    Ron


  • Ron aka bitbucket-25253,

    According to GAAP, for each change to business data there has to be a "ledger correction document." DBA should not touch the DB without a copy of the doc.

    So if there is a problem and the data is wrong, there ought to be a doc explaining that the data is wrong and why. Then there is a corrective action, which may be applied to the database directly: if the problem was in an app and there is no GUI that could make the correction with a log, the DBA can correct the database, but again only with a doc. (Yes, I know, it is a PITA. Done that.)

  • Revenant (5/31/2011)


    Ron aka bitbucket-25253,

    According to GAAP, for each change to business data there has to be a "ledger correction document." DBA should not touch the DB without a copy of the doc.

    So if there is a problem and the data is wrong, there ought to be a doc explaining that the data is wrong and why. Then there is a corrective action, which may be applied to the database directly: if the problem was in an app and there is no GUI that could make the correction with a log, the DBA can correct the database, but again only with a doc. (Yes, I know, it is a PITA. Done that.)

    Exactly. My concern was that in the press of a business day a DBA would just go ahead and make the alterations without thinking of the consequences. And I did want to make those who are tech geeks, as the term goes, aware of the possible greater social consequences, so that, because of their position and their distance from the non-technical aspects, they are not taken advantage of.

    If everything seems to be going well, you have obviously overlooked something.

    Ron


  • paul.knibbs (5/31/2011)


    Ironically, it's actually easier to make manual adjustments like this in a badly designed database--normalisation makes things easier for computers but a lot harder for people!

    It's demonstrably false that normalisation always makes things easier for computers. Sometimes it makes things impossible for computers and inordinately difficult for humans.

    Take a simple schema like this favourite foods setup

    -- this version of the schema is in 3NF and EKNF.
    -- it is NOT in BCNF, 4NF, or 5NF
    CREATE TABLE Food_Class (
        Class nvarchar(32) NOT NULL,
        Food nvarchar(32) NOT NULL CONSTRAINT PK_Food_Class Primary Key Clustered,
        CONSTRAINT UQ_Food_Class UNIQUE (Class, Food)
    )

    CREATE TABLE FavFoods (
        Name nvarchar(32) NOT NULL,
        FoodClass nvarchar(32) NOT NULL,
        Favorite nvarchar(32) NOT NULL,
        CONSTRAINT PK_FavFoods Primary Key Clustered (Name, FoodClass),
        CONSTRAINT FK_FavFoods_Classes FOREIGN KEY (FoodClass, Favorite)
            REFERENCES Food_Class (Class, Food)
    )

    This is, as you can see, in 3NF but not in 4NF (it's in EKNF but not in BCNF, to be a little more precise). The attribute FoodClass is functionally dependent on the attribute Favorite - that's OK in 3NF because Favorite is a key attribute ((Name,Favorite) is a candidate key) and so is FoodClass (it's part of the primary key), but not in BCNF or higher. Similarly in the other table Class is dependent on Food - a Food can be either fish or vegetable (or something else) but can't be both. The computer can easily prevent insertions or updates which would produce invalid data like a person having two favorites in the same class, or something being classified both as fowl and as fish, because the constraints enforce the uniqueness required by the business rules.
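
    To make that concrete, with a few made-up sample rows the constraints in this version do the work for us:

    INSERT Food_Class (Class, Food)
    VALUES (N'Fish', N'Salmon'), (N'Fish', N'Tuna'), (N'Vegetable', N'Carrot')

    -- rejected: Salmon is already classified as Fish, and PK_Food_Class keys on Food alone
    INSERT Food_Class (Class, Food) VALUES (N'Fowl', N'Salmon')

    INSERT FavFoods (Name, FoodClass, Favorite) VALUES (N'Alice', N'Fish', N'Salmon')

    -- rejected: Alice already has a Fish favourite, and PK_FavFoods keys on (Name, FoodClass)
    INSERT FavFoods (Name, FoodClass, Favorite) VALUES (N'Alice', N'Fish', N'Tuna')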

    If I normalise to 4NF (or to BCNF, or to anything higher than BCNF) in the obvious manner, the schema will no longer be able to enforce this business rule, because I end up with

    -- this version of the schema is in BCNF, 4NF, and 5NF; but it can't enforce the business rules
    CREATE TABLE Food_Class (
        Class nvarchar(32) NOT NULL,
        Food nvarchar(32) NOT NULL,
        CONSTRAINT PK_Food_Class Primary Key Clustered (Food)
    )

    CREATE TABLE FavFoods (
        Name nvarchar(32) NOT NULL,
        Favorite nvarchar(32) NOT NULL,
        CONSTRAINT PK_FavFoods Primary Key Clustered (Name, Favorite),
        CONSTRAINT FK_FavFoods_Classes FOREIGN KEY (Favorite)
            REFERENCES Food_Class (Food)
    )

    Although the uniqueness of a Food's class is enforced by the constraints, there is now nothing to stop a person from having two favorites in the same class; so as a result of normalisation beyond EKNF the computer can't enforce the business rule and prevent those insert and/or update errors (and it is much easier now for a manual update or an application bug to cause the database to hold erroneous data).
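
    With the same made-up sample rows as before, the difference shows up immediately:

    INSERT Food_Class (Class, Food)
    VALUES (N'Fish', N'Salmon'), (N'Fish', N'Tuna'), (N'Vegetable', N'Carrot')

    -- still rejected: PK_Food_Class keeps each Food in a single Class
    INSERT Food_Class (Class, Food) VALUES (N'Fowl', N'Salmon')

    -- but both of these now succeed, giving Alice two favourites in the same class
    INSERT FavFoods (Name, Favorite) VALUES (N'Alice', N'Salmon')
    INSERT FavFoods (Name, Favorite) VALUES (N'Alice', N'Tuna')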

    This problem - that the schema can no longer represent certain sorts of groups of functional dependencies if you normalise too far - has been known since Carlo Zaniolo pointed it out about 30 years ago; but DBAs still blindly treat BCNF, 4NF, and 5NF as if the important thing were to get as far up the normalisation path as possible, regardless of whether this introduces a failure of representation which costs more to deal with than any anomaly introduced by (for example) multi-valued dependencies or non-trivial decompositions, instead of properly evaluating the costs and advantages of normalisation beyond EKNF in each case.

    Edit: it's not just 30 years that the problem has been known; Philip Bernstein pointed out the problem in BCNF 35 years ago in the paper introducing his schema generation algorithm for 3NF; the algorithm was intended to enforce the representation principle. Zaniolo defined a new normal form, EKNF, lying between 3NF and BCNF, proved that Bernstein's algorithm generated schemas that were not just 3NF but EKNF, showed that EKNF maintained the representation principle (unlike BCNF), and gave a new definition for 3NF which made the relationships between 3NF, EKNF and BCNF crystal clear. So maybe we've had 35 years of DBAs with their heads in the sand, not just 30 (or maybe it was just 30; after all, until Zaniolo it wasn't known that there was a normal form which fixed the anomalies that BCNF was intended to address without breaking the representation principle).

    Tom
