Heterogeneous query performance question

  • Hi:

    I have two databases, on the same server, in a SQL Server 7.0 environment.

    I can run heterogeneous query between them, but there appears to be a big performance hit if I do.

    My questions are:

    1) is that expected

    2) if so, how can that hit be minimized, if at all.

    I've looked at linked servers, but I don't think that applies here.

    Here is the situation:

    One database is a turnkey contact management (CM) database, the second is a license management database built in-house. For the time being I'd like to keep them separate, mostly because we don't have a support agreement for the CM database and I'm concerned about doing things that will come back to haunt me.

    Each record in the license has a contact ID that matches it up with the appropriate record in the contact database. So there are two routes I can take to retrieve contact information for a specific license:

    1) use a single connection to the license database and use a heterogeneous query between the license and contact databases to return the contact data

    2) use two connections, on each to the license and contact database. The first returns the contact ID from the license database which is used as the basis for a second query to the contact database via the second connection.

    The second option is much, much faster than the first option, at least according to the tests that I've run, both through Query Analyzer and actual application.

    The end application is a web based look-up application, not for database maintenance. I suspect that heterogeneous are more useful for database maintenance where performance can often be less of an issue, assuming off-hours type of work.

    Is the answer, like most things, an evaluation of resource trade-offs? Primarily, the number of available connections vs application response?

    All comments welcome and appreciated.

    Thanks,

    JK

  • You may want to review index structure and exeution plans. We have run multiple databases on one machine for years without significant performance issues.

    Making a second connection is fine if you don't plan on updating anything. Otherwise, you run the risk of deadlocking yourself which wouldn't happen under a single connection scenario. Licensing is also a factor.

    Seems a little strange to have a significant performanc penalty when they are both in the same SQL environment.

    Guarddata-

  • I thought so too, but my experience is limited. I'll go back and take another look at it.

    Thanks,

    JK

  • I thought so too, but my experience is limited. I'll go back and take another look at it.

    Thanks,

    JK

  • Hello Guarddata:

    Well, I went back to the drawing board and set up a test case:

    intTestChoice = 1 ' use separate database connections/queries

    intTestChoice = 2 ' use single databases connection w/ heterogeneous query

    timerStart = Timer

    If intTestChoice = 2 Then

    Response.Write "Not heterogenous<BR>"

    Set rsContact = GetContactDBData(rsLocal("ProxyContactID"), "Contact")

    Else

    Response.Write "heterogenous<BR>"

    Set rsContact = GetContactDBDataTest(rsLocal("ProxyContactID"), "Contact", cnnLocal)

    End If

    Response.Write "Time: " & CStr(Timer - timerStart)

    The SQL for intTestChoice = 1 (single DB)

    strSQL = "SELECT 'DOM' AS 'PrimaryMarket'," & _

    " ContactID, FirstName, LastName, Account, WorkPhone, HomePhone, Fax, T2.Address1, Address2, City, State, PostalCode, Country, Email, SecondaryEmail, WebAddress" & _

    " FROM SalesLogix.sysdba.Contact AS T1 INNER JOIN SalesLogix.sysdba.Address AS T2 ON T1.AddressID = T2.AddressID" & _

    " WHERE T1.ContactID = '" & strIDValue & "'"

    The SQL for intTestChoice = 2 (heterogeneous)

    strSQL = "SELECT 'DOM' AS 'PrimaryMarket'," & _

    " T1.ContactID, T1.FirstName, T1.LastName, T1.Account, T1.WorkPhone, T1.HomePhone, T1.Fax, T1.Email, T1.SecondaryEmail, T1.WebAddress, T2.Address1, T2.Address2, T2.City, T2.State, T2.PostalCode, T2.Country" & _

    " FROM Register AS T0 INNER JOIN SalesLogix.sysdba.Contact AS T1 ON T0.ProxyContactID = T1.ContactID" & _

    " INNER JOIN SalesLogix.sysdba.Address AS T2 ON T1.AddressID = T2.AddressID" & _

    " WHERE T0.ProxyContactID = '" & strIDValue & "'"

    The single DB (non-heterogeous) implementation has additional overhead in that I create a second connection and command object to run the query against the database.

    The results:

    non-heterogeneous: max of 2 seconds on the first request, consistently in tenths or thousands of seconds for all subsequent requests.

    heterogeneous: consistently between 7.5 and 10 seconds, mostly mid 9's. However, there can be stretches where it executes in tenths of seconds, but it isn't something that can be relied upon.

    Since this is not a stored procedure, the query has to be compiled prior to running. Isn't the compiled query cached for a short time? Is it possible that with the hetergeneous query the compiled query must be recompiled every time since it includes two databases?

    I'm baffled.

    Thanks for whatever insight you can provide.

    JK

  • Hi:

    Sorry. The line

    'If intTestChoice = 2 Then'

    should be

    'If intTestChoice = 1 Then'

    Also, there are two functions that return a recordset based on the value of intTestChoice

    GetContactDBData

    vs

    GetContactDBDataTest

    That's why there are a different number of parameters fed to the functions. Just wanted to help highlight that information.

    JK

  • JK (Same first initial and last name as my brother - trivia point)

    This must be an abbreviation of your works since it seems the only reason to use the Register table is to join with the contact and could be removed. Have you run these two queries directly from Query Analyzer? It would be very interesting to see how the execution plans look. My guess is that an index, that you expect, is not being used (or does not exist).

    Guarddata-

  • Hello Guarddata:

    You are dead-on regarding both points. I've never worked with the Query Analyzer Execution Plans before, but I used it last night to come to the same conclusion you did.

    I don't know why I was linking the Register table in the query. It was simply a mistake. Its inclusion caused the use of a terrible index. Removing the table from the query allowed the selection of the correct index and it looks like things work as expected. I'll test it some more today.

    Thanks for your time and for pointing me in the right direction. I learned something.

    Cheers to you (and your almost my namesake brother).

    JK

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply