query with several joins fills up tempdb and won't finish

  • Hi,

    so I have this query I need for a report. Originally it was 4 queries to be used in Crystal Reports. Now I want to create the same report with SSRS and therefore I incorporated all queries in one in order not to use subreports (http://qa.sqlservercentral.com/Forums/Topic1600522-1633-1.aspx).

    I am new to this and therefore am not sure if it is the query itself or any other problem with configuration. tempdb fills up to nearly 90 GB. I am running SQL Server on a local box, so I am sure there is no other traffic. Here is the query:

    SELECT AdHaupt.NSprache_ID

    ,AdHaupt.mengentext AS mengentextHaupt

    ,AdHaupt.Einzelpreis

    ,AdHaupt.Anzeigebezeichnung

    ,AdHaupt.Gesamtpreis

    ,AdHaupt.Anzahl

    ,AdHaupt.mm

    ,AdHaupt.Spalten

    ,AdHaupt.istmm

    ,AdHaupt.gesamtpreisAEO

    ,AdHaupt.gesamtpreisAE

    ,AdHaupt.NSprache_ID AS NSprache_IDHaupt

    ,AdHaupt.preisprostueck

    ,AdHaupt.AnzahlAusgaben

    ,AdHauptAdr.id_adressen

    ,AdHauptAdr.hauptauftrag

    ,AdHauptAdr.Netto

    ,AdHauptAdr.MwSt

    ,AdHauptAdr.Brutto

    ,AdHauptAdr.MwStSatz

    ,AdHauptAdr.Skontotext

    ,AdHauptAdr.MainRechnungID

    ,AdHauptAdr.id_rechnung

    ,AdHauptAdr.Auftragsnummer

    ,AdHauptAdr.adressen_id_ag

    ,AdHauptAdr.rechnungsnummer

    ,AdHauptAdr.rechnungsdatum

    ,AdHauptAdr.Rechnungsart

    ,AdHauptAdr.getmail

    ,AdHauptAdr.rechnungstext

    ,AdHauptAdr.Bemerkung

    ,AdHauptAdr.BanktextSEPAlang

    ,AdHauptAdr.IstKundeMitAE

    ,AdHauptAdr.IstAE

    ,AdAdressen.NAME1

    ,AdAdressen.NAME2

    ,AdAdressen.STRASSE

    ,AdAdressen.PLZ

    ,AdAdressen.PLZ_ORT

    ,AdAdressen.POSTFACH

    ,AdAdressen.PLZ_POSTFACH

    ,AdAdressen.POSTFACH_ORT

    ,AdAdressen.ADRESSNR

    ,AdAdressen.NAME5

    ,AdAdressen.NAME6

    ,AdAdressen.Anschrift1

    ,AdAdressen.Anschrift2

    ,AdAdressen.Anschrift3

    ,AdAdressen.Anschrift4

    ,AdAdressen.Anschrift5

    ,AdAdressen.Anschrift6

    ,AdAdressen.Anschrift7

    ,AdAdressen.ANREDE

    ,AdAdressen.NAME1 AS NAME1AG

    ,AdAdressen.NAME2 AS NAME2AG

    ,AdAdressen.NAME5 AS NAME5AG

    ,AdAdressen.NAME6 AS NAME6AG

    ,AdAdressen.STRASSE AS STRASSEAG

    ,AdAdressen.PLZ AS PLZAG

    ,AdAdressen.PLZ_ORT AS PLZ_ORTAG

    ,AdAdressen.POSTFACH AS POSTFACHAG

    ,AdAdressen.PLZ_POSTFACH AS PLZ_POSTFACHAG

    ,AdAdressen.POSTFACH_ORT AS POSTFACH_ORTAG

    ,AdAdressen.Bezeichnung_nnation

    ,AdAdressen.debitornr

    ,AdAdressen.UstIDNummer

    ,AdAdressen.SteuerNr

    ,AdAdressen.ADRESSNR AS ADRESSNRAG

    ,AdAdressen.STRASSE AS STRASSEKU

    ,AdAdressen.PLZ AS PLZKU

    ,AdAdressen.PLZ_ORT AS PLZ_ORTKU

    ,AdAdressen.POSTFACH AS POSTFACHKU

    ,AdAdressen.PLZ_POSTFACH AS PLZ_POSTFACHKU

    ,AdAdressen.POSTFACH_ORT AS POSTFACH_ORTKU

    ,AdAdressen.Bezeichnung_nnation AS Bezeichnung_nnationKU

    ,AdAdressen.debitornr AS debitornrKU

    ,AdAdressen.UstIDNummer AS UstIDNummerKU

    ,AdAdressen.ADRESSNR AS ADRESSNRKU

    ,AdAuftrag.Bestellnummer

    ,AdAuftrag.Auftragsnummer AS AuftragsnummerAd

    ,AdAuftrag.erscheinungsdatum

    ,AdAuftrag.ausgabeextern

    ,AdAuftrag.lfdnr

    ,AdAuftrag.ID_AdAuftrag

    ,AdAuftrag.adAnzeigeArt_id

    ,AdAuftrag.LaufzeitVon

    ,AdAuftrag.LaufzeitBis

    ,AdAuftrag.AdAnzeigeArt_IstInternetanzeige

    ,AdAuftrag.Motiv

    ,AdAuftrag.PlazierungExtern

    ,AdAuftrag.Rubrik

    ,AdAuftrag.objekt

    ,AdZusatz.Bezeichnung

    ,AdZusatz.Zwischenergebnis

    ,AdZusatz.Basiswert

    ,AdZusatz.RechenArt

    ,AdZusatz.Reihung

    ,AdZusatz.ID_AdAddonArt

    ,AdBelege.anzahlbelege

    ,AdBelege.adressen_id AS adressen_idBelege

    FROM

    AdHauptAdr

    INNER JOIN AdHaupt AS AdHaupt

    ON AdHauptAdr.hauptauftrag = AdHaupt.ID_AdAuftrag

    LEFT JOIN AdZusatz

    ON AdZusatz.AdAuftrag_ID = AdHauptAdr.hauptauftrag

    INNER JOIN AdAuftrag AS AdAuftrag

    ON AdHauptAdr.hauptauftrag = AdAuftrag.Hauptauftrag

    INNER JOIN AdAdressen AS AdAdressen

    ON AdHauptAdr.id_adressen = AdAdressen.ID_ADRESSEN

    LEFT JOIN AdAdressen AS AdAdressenAG

    ON AdHauptAdr.adressen_id_ag = AdAdressenAG.ID_ADRESSEN

    INNER JOIN AdAdressen AS AdAdressenRA

    ON AdHauptAdr.adressen_id_ra = AdAdressenRA.ID_ADRESSEN

    INNER JOIN AdAdressen AS AdAdressenKU

    ON AdAuftrag.Adressen_ID = AdAdressenKU.ID_ADRESSEN

    INNER JOIN AdBelege

    ON AdBelege.adressen_id = AdAdressen.ID_ADRESSEN

    ORDER BY AdHauptAdr.MainRechnungID

    ,AdHauptAdr.id_rechnung

    ,AdHauptAdr.hauptauftrag

    ,AdAuftrag.ID_AdAuftrag

    ,AdZusatz.Reihung

    ,adressen_idBelege

    Is the code bad? I am not sure about the joins.

    Any other Info you might need?

    Thanks in advance.

    EDIT: I ran it with TOP 10 as well, just to see if it will finish at all, but it never did (ran for an hour now).

  • First of would like to see the execution plan. Appreciate that the query will never finish so the estimated plan will be better than nothing.

    Then please provide all of the DDL of all of the table including any indexes, foreign keys etc.

    Please see the "how to post code and data" link in my signature should you need any help with the above.

  • Sooo, after having had a look at the execution plan I noticed that CPU cost listed for the grouping is 100%. After removing the order by clause it still takes some time, but it starts giving out rows. So the problem seems to be the ordering. Since this report is only for testing purposes as of now, that suffices. However Anthony, I thank you for your fast response.

    Edit: grammar

  • NoDBA (8/14/2014)


    Since this report is only for testing purposes as of now, that suffices.

    Possibly not. It may actually be that the query never got to the ORDER BY. Check the execution plan for some really fat arrows and see if any of them are larger than the table they originally eminate from. If you find any, those tables are possibly a part of an accidental CROSS JOIN known as a "Many-to-Many" join and it's sometimes caused by not having sufficient criteria.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff,

    thank you for your response. There are indeed some arrows more bold than others. They, however, display a cost of 0% (alas, I am far too much unexperienced to determine if that really means what it implies). So, I attached the plan and I would appreciate it if you'd have a look at it. Since it is the estimated execution plan, do you need me to post the create statements for the tables?

    Wanted to add, that I have been lurking here on SSC for some months now and basically that is what taught me SQL. So thanks for all the articles and posts!

    Flo

  • It might be just me but the exec plan you attached returns a blank page for me even if I save the file and try to open it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I tried and it works for me. Let me attach it again.

  • Wowwww ...... Everything is table Scan.

    the thickest line is showing the following stats

    Estimated Data Size : 5595 GB

    Estimated Number of Rows : 4228010000

    Estimated Row Size : 1421 B

    im afraid if this query is will ever give you the result.

    Things start going crazy when following tables are joined

    [dbo].[AdAuftrag].Hauptauftrag

    [dbo].[AdAdressen].ID_ADRESSEN

    [ReportDB].[dbo].[AdHaupt] [AdHaupt]

    Please do check your data as Jeff suggested, do your analysis where you are getting accidental CROSS Join. you will identify the actual problem as your query is not that much complex. you might required few adjustment. i believe you would never wanted 4228010000 many rows. OR are you ?

  • Thanks twin devil for taking a look at the plan. Where can I find that information looking at the plan with SSMS? Or is there any other tool to analyze those plans?

    Actually I do not want to receive 4 228 010 000 rows. But it makes sense now, that sthequery ran for over an hour wwithoutfinishing (I aborted it).

    The thing is, I made several attempts to enenlargehe tables in order to get more rows back. Now that I think of it I am not sure any longer at what stage the exec plan was created. When I run the query now it gives me back 40 000 rows which is far more, than we actually need. But since I want to test SSRS I thought I'll scale things up a bit.

    I'd be interested though how SSRS would handle 4 228 010 000 rows with my report.

    Nevertheless, I'll take a look at the query and try wipe out the cross joins. I don't know why, but joins always leave me puzzled. I understand the concept, but joining more than two tables takes me forever.

    Flo

  • NoDBA (8/15/2014)


    Thanks twin devil for taking a look at the plan. Where can I find that information looking at the plan with SSMS? Or is there any other tool to analyze those plans?

    Actually I do not want to receive 4 228 010 000 rows. But it makes sense now, that sthequery ran for over an hour wwithoutfinishing (I aborted it).

    The thing is, I made several attempts to enenlargehe tables in order to get more rows back. Now that I think of it I am not sure any longer at what stage the exec plan was created. When I run the query now it gives me back 40 000 rows which is far more, than we actually need. But since I want to test SSRS I thought I'll scale things up a bit.

    I'd be interested though how SSRS would handle 4 228 010 000 rows with my report.

    Nevertheless, I'll take a look at the query and try wipe out the cross joins. I don't know why, but joins always leave me puzzled. I understand the concept, but joining more than two tables takes me forever.

    Flo

    yes you can get the information from SSMS, just hover your mouse on the thick arrows you find in the execution plan OR you right click on the arrows and select 'Properties' option in the popup menu. it will show you the properties tab Right on your screen by default.

    you need to break your problem down.

    1) check why are you not getting any Index scan/Index seek in your query (check either the clustered/non clustered) are available or not.

    if not then create appropriate indexes.

    2) After that try to execute your query simply with inner join see if its bring any information or not.(there are two Left join 1st exclude them from the query then execute, after that try to inner join them and execute query.)

    3) check at which particular "many to many" relationship exists and if you find try to eliminate them.

    There are N number of possibilities when one start to optimize the query. its all about find the bottle neck and then try to fix it.

    all you need to check what is causing this. Hope it helps.

  • From the execution you can see the numbre of rows going into each join, and the number coming out. From that you can get some interesting information: for example, it appears that LEFT JOIN AdZusatz ON AdZusatz.AdAuftrag_ID = AdHauptAdr.hauptauftrag contributes a factor of about 13 to the number of rows generated: I don't know whether that makes sense or not because I don't know what your data looks like, but you should know whether it is reasonable or indicates a problem. Alternatively, it's worth trying each ON condition on it's own with the corresponding INNER or OUTER join using only the tables mentioned in the ON condition, to see where the the number of rows out is crazy. But maybe a different approach to combining the four queries you started from would be easier than fixing the current combined query.

    Perhaps the best approach would be to look again at the original four queries used previously; use each one as a CTE, but add an extra column to indicate which of the original queries this CTE comes from from plus further extra columns (set to NULL) to each so that the four results each have the same signature so that UNION makes sense; then the main query can just select the union (using UNION ALL) of the four CTEs plus whatever ORDER clause you want. This means you have no extra joins that were not in the original four queries, you are combing them by union so the total number of rows will be the sum of the four numbers of rows that you had before; this avoids having what amounts to joins bewteen these four rowsets, which risk causing a combinatorial explosion of row numbers. It will work provided the report can be made to use the data set this generates.

    Tom

  • Thanks twin.devil and Tom, although not easy for me to understand some of the things you said, I am working on using your advice to trim down the query, especially the joins.

    Taking a closer look at the code revealed there are several joins, that are useless since the data is already retrieved by others, I also saw that I put multiple columns twice in the SELECT part.

    It works for now, but I think I'll reread your posts and try doing it with CTEs. I just yesterday used my 1st CTE and think it is a good way to understand how they can be used (and it might add to my lacking knowledge about joins as well).

    I appreciate your input

    Flo

  • did you check the indexes ?

  • No, I didn't, but I do not have any influence on how the table schema looks like. But I think it is taken care of by someone who has far more experience than me. However, I will probably talk to the someone and learn some trigger index stuff.

    Flo

    EDIT: I don't know why I would think of trigger when I mean index...

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply