Query on a large database

  • Hi Gail

    The tempdb example was not meant to indicate that there is some kind of partitioning involved but to point out that indexes are not the only way to improve read performance and that proper allocation and placement of files od disk also provide a improvement in performance.

    With regard to clustered index and placement of rows on disk , I wanted to point out how we are able to move files from one file group to another ( and thus from one file to another ) by simply creating a clustered index on the partition scheme , this way clustered index does have a role to play in moving data on the disk and where it gets placed.

    Mainly trying to highlight the difference between an allocation scan and traversing the b tree.

    Jayanth Kurup[/url]

  • Jayanth_Kurup (10/23/2015)


    Mainly trying to highlight the difference between an allocation scan and traversing the b tree.

    Once again:

    Can you show how a query for an age group for ages between @m and @N should look like?

    _____________
    Code for TallyGenerator

  • Yeah, a covered index in my case would be highly expensive on a table with billion+ rows. But, I get the point.

  • Sergiy (10/23/2015)


    Jayanth_Kurup (10/23/2015)


    Mainly trying to highlight the difference between an allocation scan and traversing the b tree.

    Once again:

    Can you show how a query for an age group for ages between @m and @N should look like?

    Here you go Sergiy

    Screenshot for table with clustered index on ref_age vs unindexed heap that is just partitioned by ref_age

    for a range of 5 ages

    Let me know your thought, I would be happy to test out any other scenarios if you can provide me the scripts with details on how you want me to test.

    EDIT :- to take things further aggregated some data over the indexed and the partitioned table to measure performance , when summarizing , partitioned table provides a minor improvement. screenshot attached.

    Check out the screenshot for aggregated over non partitioned or indexed column. here I am not even using the partition key or index key column to filter the data , the the difference in performance is almost double.

    Jayanth Kurup[/url]

  • Jayanth_Kurup (10/23/2015)


    Hi Sergiy

    I can demonstrate what I am explaining in my video , no gimmicks just plain tables and rows.If you have a few minutes , please add me on skype and I can demo the concept.

    my skypeid is Jayanth (dot) Kurup at hotmail dot com

    More than happy to demo the concept if it helps , Also the sample script and data are shared on google drive have you tried it?

    PS:Big table resides in a single file , it can be made to sit in multiple files but without any control over which row goes into which file ( proportinal filling) ,

    if you are going to quote me please do so completely.

    What's wrong with building a test data generator and posting the code for that and for what you've done?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jayanth_Kurup (10/23/2015)


    Here are the scripts , I usually create the video to show the steps to reproduce the issue, mostly for those cases where it more important to explain the concept than how i got there , the video is not meant to explain how partitioning works just that if you compare a table with and without partitioning the latter does provide benefits.

    . As on linkedin I provide the code when requested it help keep things focused on the main point.

    https://drive.google.com/folderview?id=0ByGed3QzAHjfZl9OTEdZRHZBdnc&usp=sharing%5B/quote%5D

    Ah... was reading the posts bottom up. Thanks. I'll check it out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jayanth_Kurup (10/23/2015)


    Hi Sergiy

    Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).

    It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (10/23/2015)


    Jayanth_Kurup (10/23/2015)


    Hi Sergiy

    Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).

    It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.

    There are screenshot with aggregated results as well which use just two columns and return around 16 rows.

    Jayanth Kurup[/url]

  • Took a quick peek at the code you posted.

    EXEC sys.sp_db_vardecimal_storage_format N'partitioned', N'ON'

    I could be misreading something but you do know that VarDecimal has been deprecated, yes?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • i havent changed the storage formats , its all at defaults , the only thing done extra is trace 1117

    Jayanth Kurup[/url]

  • Jayanth_Kurup (10/23/2015)


    Jeff Moden (10/23/2015)


    Jayanth_Kurup (10/23/2015)


    Hi Sergiy

    Here is a screen shot with clustered index on the monolithic table on REF_AGE column as requested. Notice that it took 17 sec , just the same amount of time my partitioned heap table took without any indexes ( as shown in the video).

    It looks like you returned a ton of rows to the screen, which would be the great equalizer in duration.

    There are screenshot with aggregated results as well which use just two columns and return around 16 rows.

    This is the screen shot I'm looking at and it contains a wee bit more than 16 rows.

    You might also want to do an upgrade. According to that screen shot, you're still using 2012 RTM.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Yes I am using DB engine version 2014 RTM

    Jayanth Kurup[/url]

  • You mentioned the report should be accessible by multiple customers and should run within 15 seconds. Does each customer see the exact same report, or does each customer see a subset of records pertaining only to them? Some reporting tools like SSRS have a feature to cache reports for reuse, if for example you need to produce only one report each month and then make it accessible to multiple users.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • You must cluster the table properly to get good performance from large tables. You can instead try creating gazillions of covering indexes, to cover each (major) query, but that is a lot more maintenance effort. INSERTs than also take much longer, which can lead to blocking/deadlocking in other tasks.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Just wondering if anybody was able to finish testing the scripts and if there are any new results ?

    EDIT : BUMP, hoping for some independently verified results , also created a complete video with more scenarios in case any body is looking for a deep dive.

    Jayanth Kurup[/url]

Viewing 15 posts - 46 through 60 (of 104 total)

You must be logged in to reply to this topic. Login to reply