Using an Int to store a date

  • I am working on a database that a consultant created. He likes to use INTs to store a date.

    For example "20070107" is stored in an int column instead of storing it as 07/01/2007 in a datetime column

    Has anyone else used this method? What do you think of it?

  • Jason Messersmith (6/17/2010)


    I am working on a database that a consultant created. He likes to use INTs to store a date.

    For example "20070107" is stored in an int column instead of storing it as 07/01/2007 in a datetime column

    Has anyone else used this method? What do you think of it?

    Well, for one thing, that uses a 4 bit int instead of a 3 bit date. If you're using SQL Server 2008, that is.

    It makes it harder to determine the difference between 2 dates. With a Date is is just DateDiff. With an integer, you have to convert to date first or account for how many days are in each month and if there's a leapyear.

    It increases the likelihood that there could be an error in the data. With a date field, there's no way to enter 2/57/2007, with an int, there's no problem at all entering 20075707.

    Did you get to ask him why he did it? I don't really see any reason for it. Even if you're using an older database and have to use the 8 bit datetime, the better accuracy and ease of manipulation more than makes up for the storage savings.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Thanks for the reply.

    he isnt round anymore, i'm sure you know how consultants are, get it working, who cares how it works as long as it gets me the money.

  • Jason Messersmith (6/17/2010)


    Thanks for the reply.

    he isnt round anymore, i'm sure you know how consultants are, get it working, who cares how it works as long as it gets me the money.

    Well, as a consultant, that's not how I operate. I want it to work right so whomever comes after me doesn't have problems. Some consultants are indeed like that, but not all of us.

    If you can change it to a date column, I think you should. There may be others who can give you a good reason to keep it the way it is, but I don't know of one. If you're not concerned about day and just want to store the month, I know there's an argument to be made that you can store the date as 200701 and not have to worry about the day. If you don't care about the day, many of my arguments against it go away, except for data entry problems and that can be handled with checks. If you care about the day, though, there's no reason to store it as an int.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • If you have any chance to get the money back from that "consultant": go for it!

    It's one of the worst choices I've seen to store a date value. Even worse than storing it as char/varchar, since you need to use rather complicated code to convert it into a valid datetime format.

    Example:

    -- you'd need to comment the following block out to get a valid result...

    DECLARE @a INT

    SET @a = 20070107

    SELECT CAST (@a AS DATETIME)

    -- versus

    DECLARE @b-2 CHAR(8)

    SET @b-2 = '20070107'

    SELECT CAST (@b AS DATETIME)

    To save a byte by using an inappropriate data type is not a valid argument at all. You'd waste a lot more processing time to convert the data than you'd gain due to smaller tables.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Stefan Krzywicki (6/17/2010)


    ...

    If you can change it to a date column, I think you should. There may be others who can give you a good reason to keep it the way it is, but I don't know of one. If you're not concerned about day and just want to store the month, I know there's an argument to be made that you can store the date as 200701 and not have to worry about the day. If you don't care about the day, many of my arguments against it go away, except for data entry problems and that can be handled with checks. If you care about the day, though, there's no reason to store it as an int.

    I second Stefan in terms of trying to change it to a date column. But you might want to add a separate column (a persistent computed column) to hold the date version of the int value. The other option would be a code review to clean up the bad design.

    We usually make the vendor to pay for it, assuming we detect it during approval process/testing... 😉

    Regarding a person who might be able to provide "a good reason": let's see if there will be a single person being able to provide valid arguments for it...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • That's how MSDB stores job-related dates. If anyone's ever tried to write custom scripts against the MSDB job tables, they'll know how complex that method of storage makes queries.

    Dates should be stored in a date/datetime column. Even storing a date as a varchar's better than storing one as an int (because implicit and explicit conversions are possible)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • As already stated a DATE column ought to be a better choice for SQL Server 2008 onwards.

    However, there is at least one situation in previous versions where it could make sense to use an INT for dates. Suppose you need to represent dates earlier than the year 1753. You can't do that with a DATETIME and therefore INT is a reasonable choice. One situation where you may want to handle dates in that way is in a data integration situation where you are importing dates from another system that has a less restricted date range which would otherwise cause problems.

  • lmu92 (6/17/2010)


    Regarding a person who might be able to provide "a good reason": let's see if there will be a single person being able to provide valid arguments for it...

    I just didn't want to rule anything out. I'm constantly learning new things here.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • David Portas (6/17/2010)


    As already stated a DATE column ought to be a better choice for SQL Server 2008 onwards.

    However, there is at least one situation in previous versions where it could make sense to use an INT for dates. Suppose you need to represent dates earlier than the year 1753. You can't do that with a DATETIME and therefore INT is a reasonable choice. One situation where you may want to handle dates in that way is in a data integration situation where you are importing dates from another system that has a less restricted date range which would otherwise cause problems.

    I still wouldn't use an INT there. I'd stick with a varchar(10) for readability.

    The only time I think I might use an INT would be if all you want to store is YearMonth.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • I don't think that's a right way to store dates at all.

    You have several drawbacks storing dates as integers

    1. Int takes more disk space than the Date data type. So the database size is unnecessarily larger for no benefit.

    2. You lose the automatic data integrity mechanism that you get with using the Date data type. So users can enter wrong values in the column. For example, the Int column will allow even non-date values like 20101355. You will have to write your own rules and triggers to maintain data integrity issues if you use Int column to store date values. Date column provides this type of data integrity automatically.

    3. It is hard to understand for programmers and other users of the database. It also impacts query performance. The Int value will always need to be converted to Date either implicitly or explicitly whenver you want to compare it with other date values in your queries.

  • I forgot to mention that this is in a data warehouse. so maybe it was to be able to join to a fiscal calendar quicker?

    I am going to try and test some execute times on queries tomorrow with it broken out into a date column.

  • Joe Celko (6/17/2010)


    David Portas (6/17/2010)


    As already stated a DATE column ought to be a better choice for SQL Server 2008 onwards.

    However, there is at least one situation in previous versions where it could make sense to use an INT for dates. Suppose you need to represent dates earlier than the year 1753. You can't do that with a DATETIME and therefore INT is a reasonable choice. One situation where you may want to handle dates in that way is in a data integration situation where you are importing dates from another system that has a less restricted date range which would otherwise cause problems.

    At that point, I would drop SQL Server and go to Oracle or DB2 which can handle a full ISO-8601 date range. I worked for the Getty Museum and this was a real problem. But most commercial outfits are not billing a Mummy 🙂

    There's no reason to drop SQL server, just upgrade to 2008. Why give yourself Oracle or DB2 headaches? Heck, I'd rather deal with trying to store it all in varchar than have to use Oracle or DB2.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Jason Messersmith (6/17/2010)


    I forgot to mention that this is in a data warehouse. so maybe it was to be able to join to a fiscal calendar quicker?

    I am going to try and test some execute times on queries tomorrow with it broken out into a date column.

    Even if this would have been the purpose: a calendar table should hold the date as date/datetime data type instead of integer. So, no argument there, I'd say.

    @stefan:

    I just didn't want to rule anything out. I'm constantly learning new things here.

    I didn't want to sound negative against your argument. There was no intention whatsoever. If I left that impression I apologize.

    @david-2:

    I guess that's about the only argument. But if you'd need to deal with dates within that range and you'd need to do any calculation based on that, it get's complicated anyway (e.g. you'd need to consider that the Gregorian calendar didn't even exist before 1582 in most catholic countries, in others up to a few hundred years later... All of a sudden the calculation of Februry 29th would depend on the country referenced. *shudder*) 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Jason Messersmith (6/17/2010)


    I forgot to mention that this is in a data warehouse. so maybe it was to be able to join to a fiscal calendar quicker?

    I am going to try and test some execute times on queries tomorrow with it broken out into a date column.

    Even if it is to speed up comparisons, it will only make sense if the fiscal calendar also uses an Int column to store the dates. In that case, your fiscal calendar also needs to be fixed. Comparing two values of the Date data type will almost always perform better than comparing an int with a date value(in which case one has to be implicitly converted and an index seek would not be performed)

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply