RE: many small tables vs. one large table??

Question

RE: many small tables vs. one large table??

lmeinke

SSCommitted

Points: 1956
More actions
July 23, 2009 at 7:01 am

#385040

Hello,
I am dealing with 12 months of sales data 2-3M rows per month. I initially set them up as 12 seperate tables and have dealt with my queries through stored procedures (rarely change them). However, recently I have been running many ad hoc queries that require all 12 months of data....this makes for lenghty SQL.
I am wondering if 12 2-3 million row tables perform better than one 30 million row table (indexes being equal)? Should I combine these into a single table for convenience sake?
Please advise.
Thank you,
Lonnie

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

SQLZ SSChampion Points: 12872 More actions · Answer 1

In this situation I would be inclined to put all of the data into one table and then partition the table with one partition for each month. Thereafter, when you query the table SQL Server will take care of which partitions it needs to read from and give you optimum performance, with the added benefit that you can query as many or as few months as you'd like to.

Having the data split across 12 tables isn't very scalable and will provide more management overhead than using partitions - although you'll still need to manage the partitions and possibly use a sliding window mechanism with each new month.

Hope that helps.

Karl
source control for SQL Server

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 2

I'll second that.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Paul White SSC Guru Points: 150457 More actions · Answer 3

SQLZ (7/23/2009)
In this situation I would be inclined to put all of the data into one table and then partition the table with one partition for each month. Thereafter, when you query the table SQL Server will take care of which partitions it needs to read from and give you optimum performance, with the added benefit that you can query as many or as few months as you'd like to.
Having the data split across 12 tables isn't very scalable and will provide more management overhead than using partitions - although you'll still need to manage the partitions and possibly use a sliding window mechanism with each new month.
Hope that helps.

Partitioning requires Enterprise Edition.

Partitioning performance can be disappointing, especially in 2005.

Splitting the data into 12 tables and using a partitioned view (available in all editions) will almost certainly perform better.

Table partitioning in SQL Server is internally implemented as separate table structures anyway.

Paul

edit: also, a database containing a partitioned table (or even a partition scheme or function) prevents that database being attached, or restored, to an edition of SQL Server which is not Enterprise or equivalent. This can be extremely tedious if you use copies of the database for reporting purposes (for instance).

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Steve Jones - SSC Editor SSC Guru Points: 728164 More actions · Answer 4

Good points, Paul.

I like the idea of using a view to join the tables together if you need that.

Are you querying by month that often? I'm just wondering if you need a current and archived table instead, letting SQL separate out things by table based on a date or a couple of columns (yr/mn)

Follow me on Twitter: http://www.twitter.com/way0utwest
Forum Etiquette: How to post data/code on a forum to get the best help
My Blog: www.voiceofthedba.com

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 5

Paul White (7/25/2009)
Splitting the data into 12 tables and using a partitioned view (available in all editions) will almost certainly perform better.

Steve Jones - Editor (7/25/2009)
I'm just wondering if you need a current and archived table instead...

I absolutely agree with both of those suggestions... we had a similar problem and I split the data at the 120 day mark (Telephony and Call Detail Records (CDR's))... we would call the table directly for the short term stuff and a partitioned view for the long term stuff. It helped even really good code never mind the junk code.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

lmeinke SSCommitted Points: 1956 More actions · Answer 6

Hello All,

Thanks for the suggestions. I have created three views of the data (based on the time frames I generally query). This will suffice for now but I believe I will eventually create one large table with partitions. I am curious about the archiving idea of Steve Jones....but not sure how to implement?

Thanks again,

Lonnie