An Introduction to Database Design

  • Excellent article Paul. A great primer to the concept of normalization with examples that I think anyone can understand. It's all well and good t tell people that all the columns in a table need to relate to the key, but in articles I've read and sessions I've attended no one has explained it as clearly as you have.

    I don't claim to be a data architect or some great database designer and I'm not well-versed in theory, but this article provides, in my opinion, a great starting point by modeling what happens often in the real world, especially in small business. How many small businesses have started with paper, then used excel, and then Access or SQL Server Express? A lot, and this shows how they tend to work through the process. Sure in the real world a business owner may ask the question about what happens when a price changes on a product, and what about when a customer's address changes, but that is part of the iterative process that most businesses go through.

    I look forward to the continuing saga of Bob and Alice.

  • i wish more people knew the basics like this. troubleshooting an issue this morning and it turned out our devs tried to use NULL values as real data. worked OK until the table in question grew and some other data changed and now a lot of queries are running way to slow

  • this was an awesome article on database design................

    what a beautiful concept u have posted paul..........i am so happy that i have learned a great thing on your site...........................

    so many thanks!!!!!!!!!!!!

  • A very good article, Paul. I look forward to the further adventures of Alice and Bob as Bob discovers his database needs some upgrading. Very nice for the basics.

    -- Kit

  • This article is an incredible launch for the continuing adventures of Bob's database and the friends that help him with the original design and modifications coming up. The article is very well written, highly appropriate graphics where used at just the right times, and it's an entertaining easy-read to boot. The amount of and order of "revelation" is perfect especially for an introductory article. I also think it's perfect that you didn't confound beginners with things like Type 2 SCD's or natural keys for the Customer table right up front. Those will make perfect follow-up articles somewhere down the line when Bob's business begins to age a bit and Bob starts having his own revelations about his additional needs.

    Incredibly well done, Paul. Keep up the good work! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • mike_walsh (1/18/2011)


    This is quite possibly one of the best technical articles for a wide audience that I have read in a long time, Paul. To anyone who is already well acquainted with database design, there won't be much new here...

    I respectfully disagree (not with the first sentence, with the second!) I think the breadth of the comments in this article show that one key thing that is "new" here to many DB professionals (myself included) is the end-user perspective. We spend so much time sitting in front of our screens modelling diagrams or tweaking scripts or reports that it's easy to forget how to help our users understand and get comfortable with tools that can help them a great deal. We get bogged down in query efficiency and degrees of normalization and can lose sight of the whole reason we're doing that.

    I read this article and the comments not to learn how to design a database, but to remember what drives people to databases and to get ideas on how to introduce the concepts without overloading people with technical jargon. This article does a great job of illustrating that.

  • I usually don't rate articles but this deserved all the stars possible! I plan to make it mandatory reading with all the work study students we have volunteering to help with our database clean up and web designs.

    Tim

    Dept of VA

    Rehabiltation Counseling Services

  • I have to admit I am surprised that the consensus seems to be here that it takes an expert to design a database that will prevent duplicates or that the concept and implementation is somehow too advanced for an introduction to database design.

    Am I really the only person here that thinks such concepts as "what is a duplicate record" and "how they are prevented" IS rudimentary database and beginning database concept??

    Perhaps the industry is in even worse shape than I think it is, and that is saying something!

  • sknox (1/20/2011)


    mike_walsh (1/18/2011)


    This is quite possibly one of the best technical articles for a wide audience that I have read in a long time, Paul. To anyone who is already well acquainted with database design, there won't be much new here...

    I respectfully disagree (not with the first sentence, with the second!) I think the breadth of the comments in this article show that one key thing that is "new" here to many DB professionals (myself included) is the end-user perspective. We spend so much time sitting in front of our screens modelling diagrams or tweaking scripts or reports that it's easy to forget how to help our users understand and get comfortable with tools that can help them a great deal. We get bogged down in query efficiency and degrees of normalization and can lose sight of the whole reason we're doing that.

    I read this article and the comments not to learn how to design a database, but to remember what drives people to databases and to get ideas on how to introduce the concepts without overloading people with technical jargon. This article does a great job of illustrating that.

    Fair enough, I agree and that is the point of my comment. We are saying the same thing and I had an implied "won't be much new technical information here..."

    Definitely a good reminder on the end user perspective and training/article writing approach to those on the business side or just starting out.

    __________________________________________________

    Mike Walsh
    SQL Server DBA
    Blog - www.straightpathsql.com/blog |Twitter

  • kenambrose (1/21/2011)


    I have to admit I am surprised that the consensus seems to be here that it takes an expert to design a database that will prevent duplicates or that the concept and implementation is somehow too advanced for an introduction to database design.

    Am I really the only person here that thinks such concepts as "what is a duplicate record" and "how they are prevented" IS rudimentary database and beginning database concept??

    Perhaps the industry is in even worse shape than I think it is, and that is saying something!

    Seriously?

    It appears as though you've missed many comments, including one by the author talking about this being a first bit sized chunk. The audience on SQL Server Central is wide ranging. From business people and managers without a technical background to intermediate to advanced to sages like you. So to a sage with many years of teaching experience, such as yourself, this will seem a bit beneath you and that is okay. We get that, you are very skilled, you know the finer points of database design and are a guru of gurus who eats Codd and Date for breakfast. This article isn't for you then. Perhaps when the author is done writing all parts (as he indicated he is going to be doing) you can take the entire set and then pass final critique.

    Perhaps you can share your vast knowledge with us in the form of a follow up article of your own and woo us with deeper database theory and thoughts on how the industry can come back from this brink of utter failure that you seem to describe as a result of reading this article.

    In the meantime I am still going to refer to this article for clients and folks starting out and will be looking forward to the next installment from Paul (or one from you, which would be so much more constructive).

    __________________________________________________

    Mike Walsh
    SQL Server DBA
    Blog - www.straightpathsql.com/blog |Twitter

  • Well I guess you totally and completely missed my point. If there is a more fundamental and foundational concept to db design than preventing duplicates, I don't know what it is.

    I was hoping that this duscussion might help folks to realize what was missed, and why it is so important for BEGINNERS to grasp the concept. Which won't happen of course unless it is offered in training to them.

    *Why* it was missed, and why so few practitioners even noticed it here, is clear- the industry is terribly weak on basic DB theory.

    I won't repeat myself further so you can have the final turn.

  • kenambrose (1/22/2011)


    Well I guess you totally and completely missed my point. If there is a more fundamental and foundational concept to db design than preventing duplicates, I don't know what it is.

    I was hoping that this duscussion might help folks to realize what was missed, and why it is so important for BEGINNERS to grasp the concept. Which won't happen of course unless it is offered in training to them.

    *Why* it was missed, and why so few practitioners even noticed it here, is clear- the industry is terribly weak on basic DB theory.

    I won't repeat myself further so you can have the final turn.

    Apologies for the sarcasm in the last post. I don't believe I missed the point. I don't believe anyone has. I don't think Paul's post structure is Phase 1 Fundamentals, Phase 2 Advanced, Phase 3 Expert and Phase 4 you are ready to apply for Ken Ambrose's job designing complex database systems. I see the whole series as geared for beginners. If dealing with duplicates is never covered in the series, then yes something is missing.

    The first post introduced the problem statement. It introduced the players and it showed a database design created by someone who recalled her days in one database design course in college way back when.

    So he started with some first day concepts. He started with what normalization is with an example of normalization in general terms. He showed how to look at the data and figure out how to avoid repeating groups. He talked about the importance and one function of a primary key. Yes you don't like a surrogate key for a Primary Key it seems. Yes he didn't suggest identifying a unique index on a natural key, etc. He introduced the concept of foreign key relationships and showed the solution to one of Bob's biggest points of frustration - having to key in the full customer info each time he created a receipt. It could allow for duplicates of customer information depending on the source system that assigns customer numbers, certainly.

    I have read through the comments here and never got the impression that duplicates are a really complex problem that take Joe Celko to solve. Not sure why you read that. Also to your point on training, your livelihood from an earlier post, absolutely necessary. I don't think this replaces training but it is a great way of bridging the gap from the business to IT. It is a great introduction for someone just browsing around here trying to understand basics.

    If someone reads a few posts online (no matter who they are from) and designs a database for a mission critical enterprise application without consulting any expert advice or resources then I feel bad for them and feel good for you because you'll eventually gain a new customer.

    Let's see where the series goes and I still really hope to see a follow up article by you designed for someone who thinks of Excel as a database and covers everything critical without room for disagreement 🙂

    __________________________________________________

    Mike Walsh
    SQL Server DBA
    Blog - www.straightpathsql.com/blog |Twitter

  • Ken,

    The fact is that the surrogate key + enforced unique natural key idea just would not fit nicely within the flow of the story, while keeping the article to a reasonable length.

    The first article leaves a number of very obvious 'open doors' for subsequent parts to explore; I decided that enforcing the uniqueness of natural keys belongs there. Clearly you disagree, but there we are.

    Paul

    P.S. Bob's customer notebook (referred to in the article) has numbers pre-printed for each entry. This is where the Customer IDs shown on the receipt came from.

  • kenambrose (1/22/2011)


    Well I guess you totally and completely missed my point. If there is a more fundamental and foundational concept to db design than preventing duplicates, I don't know what it is.

    I was hoping that this duscussion might help folks to realize what was missed, and why it is so important for BEGINNERS to grasp the concept. Which won't happen of course unless it is offered in training to them.

    *Why* it was missed, and why so few practitioners even noticed it here, is clear- the industry is terribly weak on basic DB theory.

    I won't repeat myself further so you can have the final turn.

    The way I read this article it was primarily addressing an even more fundamental concept: getting the job done. The starting scenario is that Bob's business was run on paper. Perhaps that's contrived; surely no small business owner keeps records on paper - right? For the sake of simplicity assume Bob is that guy and suspend disbelief long enough to follow Paul's story about how Bob might discover a proper database. Of course an uninitiated business person doesn't arrive at "basic DB theory" in a single step, let alone an ideal schema on the first try. I concede that might make a better textbook example but it would not be a very entertaining story.

    I know mike_walsh already apologized for his sarcasm, but I'd like to further the sentiment. My first comment on SQLServerCentral was unnecessarily critical of the author. Part of the reason was that I thought this site's was fueled by professional authors due to the volume of high quality articles. I learned this is a community effort and realized I wouldn't want to receive such grief as I had given if I were to submit an article of my own. In this sense of community, I would suggest if we aren't up to contributing something constructive then don't hit "post." Having said that, though, I think this has been a great conversation thread following a good start on Paul's ongoing story.

  • Hi Paul,

    What I try to do when I train beginners in DB work is to emphasize Codd's fundamental logical concepts- in layman's language.

    I start out with the basic concepts such as modeling of data processing, aligning a model of the data to the business meaning of real world processes and things (duplicates never exist in the real world so preventing them is a basic step in recording facts about the world) , why we seperate out complex business facts into seperate record keeping "entities", and "what are the common record keeping problems that relational principles address and how does it try to address them" , etc.

    I think it's a different opinion - from a trainers standpoint - of what's important for beginners to start with...

Viewing 15 posts - 46 through 60 (of 125 total)

You must be logged in to reply to this topic. Login to reply