Best way to get large amount of data for a join from another server?

  • We have a main database on its own server for our internal application. Several other applications with databases on different servers need this data in real time. We currently replicate the data to the other instances of SQL, but are trying to get away from that. In some cases, the data needed is small and can be retrieved with a web service or something like that. However, in many cases... The application data needs to join with large tables in the other. Does anyone have any better method than replication for handling this?

    Jared
    CE - Microsoft

  • SQLKnowItAll (1/27/2014)


    We have a main database on its own server for our internal application. Several other applications with databases on different servers need this data in real time. We currently replicate the data to the other instances of SQL, but are trying to get away from that. In some cases, the data needed is small and can be retrieved with a web service or something like that. However, in many cases... The application data needs to join with large tables in the other. Does anyone have any better method than replication for handling this?

    Based on forum name you are on SQL 2012, so if you have Enterprise Edition you have the potential for Always On Readable Secondary(ies). I note that read-only access comes with NUMEROUS caveats, provisos, limitations and gotchas. Plus you only get 4 copies.

    I would also investigate SiOS - they have some very cool sh-t, although I don't know how many copies you can make or if it is appropriate for your scenario.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • SQLKnowItAll (1/27/2014)


    We have a main database on its own server for our internal application. Several other applications with databases on different servers need this data in real time. We currently replicate the data to the other instances of SQL, but are trying to get away from that. In some cases, the data needed is small and can be retrieved with a web service or something like that. However, in many cases... The application data needs to join with large tables in the other. Does anyone have any better method than replication for handling this?

    Personally, I usually just link the DB and dump the foreign pieces into a local #temp table... which works for the small data needs you're discussing.

    My other option is to reverse the process. If I need to send a large limiter over to the foreign server to reduce the 'noise' between them, I'll send over an XML table as a parameter, use that as a join on the foreign server, and have that return back to me the limited dataset, which I'll then connect up all the dimensions/attributes/whatnot to. Most of that communication can usually be limited to just IDs and/or filter records.

    It all depends on where the heavy data restrictions lie.

    For the record, while I can do replication, I tend to avoid it unless circumstances require it (IE: a database needs to function when the another data server is down). That's become much less of a concern overall as HADR and Clustering has become much more advanced.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks for the thoughts guys. We will actually have a read-only server in the HAGR that will handle a lot of the stuff. It is the large stuff I am worried about. I like the XML thought and will also check out SiOS. Thanks!!!

    Jared
    CE - Microsoft

  • Can you give some more detail on the big cases? From what I understand, you need to join IA.DA1.SA1.TA1 to IB.DB1.SB1.TB1, where both are "large". Please extend to however many tables are involved :).

    My questions would start with:

    How much data from IA.DA1.SA1.TA1 needs to be included in the join?

    How much data from IB.DB1.SB1.TB1 needs to be included in the join?

    How much data from IA.DA1.SA1.TA1 needs to be included in the results?

    How much data from IB.DB1.SB1.TB1 needs to be included in the results?

    How rapidly does the data in IA.DA1.SA1.TA1 change?

    How rapidly does the data in IB.DB1.SB1.TB1 change?

    How up to date does the data in IA.DA1.SA1.TA1 need to be?

    How up to date does the data in IB.DB1.SB1.TB1 need to be?

    Are there locking/blocking considerations on IA.DA1.SA1.TA1?

    Are there locking/blocking considerations on IB.DB1.SB1.TB1?

    How fast is the network connection between InstanceA and InstanceB?

    How reliable is the network connection between InstanceA and InstanceB?

    How fast does the query with the big join have to be?

    How often is the query with the big join run? If infrequently, is it during on-hours or off-hours?

    I would note that you don't have to do InstanceA.X -> InstanceB.X; if you have several tables or databases on each, there are options to have some tables going one way and others going the other that may be valuable if you can restrict yourself to moving slowly changing data or the like, handle the rest of the limitations of that technique, and if it has a net benefit. It's a rare case, but it's something to consider if it fits.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply