Question/Advice needed: sliding window partitioning (range right), what's the efficient way to implement it, and are staging tables needed?

  • Okay, yesterday I was in a bit of a panic and made the post Partition Sliding window causing loads of... in haste without thinking it through, so that topic can be ignored; I managed to rest, calm down and think.

    Here's the situation: we have a database for our warehouse where they test machines before shipping them off to the clients; these machines are either newly built or came back for repairs. Each machine gets its own table, named with the warehouse name followed by a sequence number that's unique for that machine. The database is starting to get sluggish, so we want to implement a sliding window partition scheme containing 2 partitions, Archive & Current, with Current containing at most the past 45 days.

    The issue, like I said, is new machines: on an almost 24/7 basis new tables can be added, or an old table that hasn't been touched in years gets new data.

    On a weekly basis there is a job that checks the tables for compression levels, and every 4 hours a backup of the transaction log is taken.

    There is a 0.00 margin for downtime :crying:

    So this is a range right partition; I've done the following:

    CREATE PARTITION FUNCTION [WAREHOUSE_P_F](datetime) AS RANGE RIGHT FOR VALUES (N'2015-09-08T00:00:00.000')

    CREATE PARTITION SCHEME [WAREHOUSE_S_F] AS PARTITION [WAREHOUSE_P_F] TO ([FG_DATA_001], [FG_DATA_CURRENT])

    Now I've been moving the tables over, starting with the smallest and also a few big ones, checking locks, transaction logs and the filegrowth of FG_DATA_CURRENT.

    Now comes my problem with the implementation of the sliding window: the procedure I've got takes too long, blocks too many processes, and gets deadlocked itself after a while.

    This was not an issue during development of the setup, and my request back then for a proper, realistic test environment fell on deaf ears.

    I've seen some potential solutions, but they've all used staging tables (and they were also dealing with a single table), so has anyone got an idea what I could do?
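    For reference, on a RANGE RIGHT function the weekly slide is normally done with SPLIT and MERGE against the partition function. This is only a sketch; the boundary dates and the filegroup named in NEXT USED are illustrative, not the actual procedure from the job:

```sql
-- Sliding-window sketch (illustrative dates): widen, then shrink.
-- 1. Tell the scheme which filegroup receives the next boundary.
ALTER PARTITION SCHEME [WAREHOUSE_S_F] NEXT USED [FG_DATA_001];

-- 2. SPLIT in the new boundary so Current keeps covering the last 45 days.
ALTER PARTITION FUNCTION [WAREHOUSE_P_F]() SPLIT RANGE (N'2015-09-15T00:00:00.000');

-- 3. MERGE away the old boundary; rows older than it fold into Archive.
ALTER PARTITION FUNCTION [WAREHOUSE_P_F]() MERGE RANGE (N'2015-09-08T00:00:00.000');
```

    Note that MERGE physically moves the aged-out rows when the two partitions sit on different filegroups; performed across every machine table, that fully logged data movement alone could explain the blocking and deadlocking described above.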

  • Resender (10/30/2015)



    Before proceeding any further,

    how many rows are in the table?

    What size is it?

    What gains do you expect from partitioning?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Before proceeding any further,

    ChrisM@Work (10/30/2015): how many rows are in the table?

    The largest table contains over 1.5 million records, the smallest about 16, and there are hundreds upon hundreds that are currently empty.

    ChrisM@Work (10/30/2015): What size is it?

    I currently don't have the figures or the time, but based on our estimates we would need a minimum of 54 GB of disk for all the data from the past 45 days, and it's been over 1600 days since this database first came into use with this system.

    ChrisM@Work (10/30/2015): What gains do you expect from partitioning?

    Improved query response times; for 1 machine we will need all the historical data, but the reports only need the data from the past 6 weeks.
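    That hoped-for gain relies on partition elimination, i.e. the report queries carrying a predicate on the partitioning column so that only the Current partition is read. A sketch, with a hypothetical table name standing in for the warehouse + sequence naming pattern:

```sql
-- Illustrative only: the [UtcTime] predicate is what lets the optimizer
-- read just the Current partition (partition elimination).
SELECT [MachineRegisterId], [UtcTime], [Value]
FROM [dbo].[WAREHOUSE1_000123]          -- hypothetical machine table
WHERE [UtcTime] >= DATEADD(DAY, -45, GETUTCDATE());
```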

  • Resender (11/3/2015)



    Partitioning is unlikely to improve your query response times, in fact it's quite likely to make them worse. There's a good proof of this here. If performance is your sole reason for considering partitioning, then you'd be well advised not to pursue it further, and focus on other avenues for query tuning.

    1.5 million rows is very modest.


  • OK, not so good news then: the total size of the database currently is 1.9 TB.

  • Each machine gets its own table, which is the warehouse name followed by a sequence number that's unique for that machine.

    How many of these tables are there in the db?

    Are they structurally identical?


  • Total number of machine tables (3/11/2015 13:44 CET): 123,851

    The tables all share the same structure:

    [MachineRegisterId] [int] NOT NULL,

    [UtcTime] [datetime] NOT NULL,

    [Value] [float] NOT NULL

    with a primary key on the combination of MachineRegisterId & UtcTime.
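    For what it's worth, putting a table of that shape onto the partition scheme would look roughly like this; the table name is a made-up instance of the warehouse + sequence pattern, and note that the partitioning column [UtcTime] has to be part of the clustered key:

```sql
-- Hypothetical machine table on the partition scheme; real names follow
-- the warehouse-name + sequence-number convention described earlier.
CREATE TABLE [dbo].[WAREHOUSE1_000123]
(
    [MachineRegisterId] INT      NOT NULL,
    [UtcTime]           DATETIME NOT NULL,
    [Value]             FLOAT    NOT NULL,
    CONSTRAINT [PK_WAREHOUSE1_000123]
        PRIMARY KEY CLUSTERED ([MachineRegisterId], [UtcTime])
) ON [WAREHOUSE_S_F]([UtcTime]);
```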

  • Resender (11/3/2015)



    Your plan is to merge these 123,851 tables into one partitioned table (with two additional columns derived from the table names), with a partition on [UtcTime]?

    What is the total rowcount?


  • ChrisM@Work (11/3/2015)



    Oh, not at all; the idea is to partition all 123,851 tables over the 2 partitions so that each table has an archive/active section. My apologies if my explanation was bad.

  • Resender (11/3/2015)



    what's the total rowcount?


  • ChrisM@Work (11/3/2015)



    what's the total rowcount?

    and how is this lot typically interrogated?


  • In the meantime there are 123,883 tables (yes, 30 new tables got added); the total recordcount is 89,372,968,620.

    ChrisM@Work (11/3/2015): and how is this lot typically interrogated?

    I wish I knew, but I'm frequently told that the reports are beyond my pay grade.

  • Resender (11/3/2015)



    Until you posted up that number, I was thinking that the best way to handle this lot would be a single table, but the largest single table I've worked with was a little over the INT limit in rows, about 2,800,000,000, and it took some effort to get it to play nicely. Here, our largest table is a little under 1,000,000,000 rows and we get results back in milliseconds. 89,372,968,620 is out of my experience / comfort zone, but I'd still lean towards one or two tables: one to hold individual devices, the other to hold activity. Let's see what alternatives folks come up with.


  • ChrisM@Work (11/3/2015)



    Let's up that discomfort a bit: from the figures I have, a minute of unexpected downtime with the input buffers for the data not holding would cost us 4,160 euro.

    Or to put it better: this is an application that resulted from the merger of several dozen others instead of being developed from the ground up, and we've reached a point where we can neither maintain it nor get proper feedback from it.

  • Resender (11/3/2015)


    OK, not so good news then: the total size of the database currently is 1.9 TB.

    Hold the phone a minute... While it's true that partitioning is NOT a panacea of performance for queries in general, it's fantastic at being able to almost instantly drop (using SWITCH) large amounts of legacy data.

    With that in mind, let's peel just one potato at a time to get to the bottom of all this. For starters, what are the 3 largest tables in the database that you'd like to apply the 45 day drop-rows rule to and how are they used? Are they OLTP or more like an audit/invoice detail table?
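    For context on the SWITCH approach: SWITCH moves a whole partition into an empty, structurally identical staging table as a metadata-only operation, so dropping the aged-out rows is nearly instant. A sketch with illustrative names (this does use a staging table, which is what the posts above were trying to avoid, but it is per-table and momentary):

```sql
-- Staging table: same columns, same clustered key, on the archive filegroup.
CREATE TABLE [dbo].[WAREHOUSE1_000123_SWITCH]
(
    [MachineRegisterId] INT      NOT NULL,
    [UtcTime]           DATETIME NOT NULL,
    [Value]             FLOAT    NOT NULL,
    CONSTRAINT [PK_WAREHOUSE1_000123_SWITCH]
        PRIMARY KEY CLUSTERED ([MachineRegisterId], [UtcTime])
) ON [FG_DATA_001];

-- Metadata-only: partition 1 (Archive) swaps into the staging table.
ALTER TABLE [dbo].[WAREHOUSE1_000123]
    SWITCH PARTITION 1 TO [dbo].[WAREHOUSE1_000123_SWITCH];

-- The staging table can now be dropped or archived without touching live data.
DROP TABLE [dbo].[WAREHOUSE1_000123_SWITCH];
```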

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 1 through 15 (of 24 total)
