SQL Server slows down and hangs after 300,000+ connections

  • Hi all

    I should've posted this here to begin with.

    Sorry for posting this in administration.

    I've got a Windows 2000 SP4 box (4 CPUs, 4 GB RAM) running SQL Server 2000 Standard Edition SP3 with 12 databases.

    Memory is configured dynamically, and nothing else runs on this box.

    All databases feed magazine websites.

    4-5 times a month we experience very heavy traffic on one of our magazine sites (that's when a global portal like AOL or Yahoo links to us).

    We get about 300,000 to 800,000 connections, sometimes within 1-2 hours.

    At that point SQL Server's response time starts to decline, and then it just becomes unresponsive.

    We have to restart the server to bring it back to a stable level. If we wait, we start getting alerts from our monitoring tools that all 12 sites are down.

    Now I've been reviewing our application code (it doesn't look good at all) and fixing some of the SQL by converting queries to stored procedures and replacing temp tables with subqueries. I would like to know if there are any SQL Server settings I can change (like max worker threads) to at least keep the server stable and avoid the locks and timeouts.
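
    That rewrite looks something like this (a sketch with hypothetical table and column names; the derived-table form avoids the extra tempdb write):

    ```sql
    -- Before (hypothetical): materialize into a temp table, then join.
    --   SELECT ArticleID INTO #hot FROM dbo.PageViews WHERE Views > 1000
    --   SELECT a.* FROM dbo.Articles a JOIN #hot h ON h.ArticleID = a.ArticleID
    -- After: a derived table does the same work without touching tempdb.
    SELECT a.*
    FROM dbo.Articles AS a
    JOIN (SELECT ArticleID
          FROM dbo.PageViews
          WHERE Views > 1000) AS h
      ON h.ArticleID = a.ArticleID
    ```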

    I increased memory for queries from 1024 to 2048.

    I've been using WITH (NOLOCK) and WITH (ROWLOCK) hints in some of the procedures, but would like to know if anyone has any other ideas.
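
    For reference, the hint pattern in question looks like this (illustrative names only; NOLOCK allows dirty reads, so it only belongs on queries that can tolerate stale or uncommitted data):

    ```sql
    -- Illustrative: read without taking shared locks (dirty reads possible).
    SELECT ArticleID, Title
    FROM dbo.Articles WITH (NOLOCK)
    WHERE MagazineID = @MagazineID
    ```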

     

    Thank you

    Any help appreciated.

    Alex S
  • It depends on why it's unresponsive. Are you seeing lots of blocking on queries? Are you seeing logins timing out?

    Check the front end and see if it has connection pooling enabled. If not, try to push for that. Are connections being kept open longer than necessary?

    > I added more memory for queries from 1024 to 2048.

    Did you change the memory per query? Or the amount that SQL is allowed to use? Is 3GB enabled on the server? AWE?
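
    The two settings are easy to confuse; a quick way to see both (a sketch, just inspecting current values, not recommending new ones):

    ```sql
    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    -- "Memory for queries" is the per-query floor, in KB:
    EXEC sp_configure 'min memory per query (KB)'
    -- The total memory SQL Server may use is a different knob entirely, in MB:
    EXEC sp_configure 'max server memory (MB)'
    ```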

    On the SQL side, I would suggest running Profiler during a relatively busy time and tracking down the longer-running queries, the ones doing high numbers of reads or using lots of CPU, then focusing on getting them working better, either by changing the code or by indexing.
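
    To answer the blocking question above, a quick SQL 2000-era check is master..sysprocesses (a sketch; a non-zero blocked column holds the spid of the session doing the blocking):

    ```sql
    -- Any rows here mean sessions are waiting on locks held by the "blocked" spid.
    SELECT spid, blocked, waittime, lastwaittype, cmd
    FROM master..sysprocesses
    WHERE blocked <> 0
    ```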

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • 1) SQL 2000 Standard can't use more than 2GB of RAM, IIRC, so more memory won't help here.

    2) That is truly a staggering number of connections in such a short span. I can't remember the specific number of bytes allocated per connection, but whatever the number, 100K+ of them is not good for server performance. I would investigate the following: max worker threads, connection pooling (actually, several things about how you connect/disconnect), buffer cache hit ratio, and CPU usage during high-connection times. Heck, how about your network bandwidth and internet server utilization too?

    3) What, exactly, are the symptoms when the box starts to roll over?

    4) Before I get further details, I will venture a guess that the RAM on the box is insufficient for that level of connections. Max worker threads and the number of CPUs may need to be bumped up too, and the disk I/O needed to serve the requests that such a number of connections make will need to be high-end. That last could be alleviated somewhat if it is mostly reads and sufficient RAM exists to cache most or all of the data.
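
    If max worker threads does turn out to be the bottleneck, it's an sp_configure setting (the new value below is purely illustrative - the SQL 2000 default is 255, and raising it trades memory per thread for concurrency, so test before touching production):

    ```sql
    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    EXEC sp_configure 'max worker threads'        -- check the current value first
    EXEC sp_configure 'max worker threads', 512   -- illustrative, not a recommendation
    RECONFIGURE
    ```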

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I believe that you have a major architectural issue to solve first. That many connections in such a short amount of time? Are there not multiple application/web servers accessing the database server? Is each of the application/web servers using connection pooling? What are the connection pool parameters - initial size, increment, and maximum size?

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

  • I changed the memory per query.

    I did run SQL Profiler and captured many queries.

    I'm going over each one and optimizing as we speak. There are hundreds of queries, so it will take me some time.

    Alex S
  • It's mostly reads as far as disk I/O goes.

    Our 12 web servers are load balanced, with fiber connections and 2 Gbit NICs in between.

    Two separate Cisco switches and two separate T3 lines carry all incoming and internal traffic.

     

    Here is the RAID configuration:

    System databases and setup files (drives C and F): controller 1, RAID 1

    Transaction logs (drive E): controller 1, RAID 1

    Data files (drive D): controller 2, RAID 10

    Alex S
  • Thanks for all your replies.

    There is a fashion show in NYC all this week, so last night we hit 1.6 million connections in one hour. Of course, we had to restart again.

    Alex S
  • Man, this is a performance specialist's wet dream!! 🙂

    Hey, are you caching data on your web farm? You seriously need to limit your database activity in this scenario, and appropriate data caching is CRITICAL for situations like this.

    How many CONCURRENT connections to SQL Server are you seeing?

    What is the exact reason you are having to reboot sql server?

    How does performance degrade? Is it fine and then falls off a cliff, or does it slowly get worse over many minutes?

    Were you able to monitor ANY perf mon counters during the period?

    ADO or ADO.NET?

    Stored Procedures??

    Connection pooling defaulted or explicitly controlled?

    Can you duplicate the database onto multiple database servers? That would provide cheap scale-out for performance, but would require management to control load balancing and data synchronization.

    There are a BUNCH of things to check and evaluate for performance analysis . . .

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I believe you have an architectural issue. Connection pooling is a must, and if you don't use it you will hit a bottleneck.

    You must ensure that everything is being pooled. Use Perfmon to trace "User Connections" so that you have an idea of how many concurrent connections you are actually dealing with.

    I believe you are hitting some other problem. SQL Server's maximum capacity specifications indicate that the maximum number of user connections that can be configured (sp_configure) is 32,767.
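
    As a cross-check on the Perfmon counter, you can also count connections directly (a sketch; on SQL 2000, spids 50 and below are system processes):

    ```sql
    -- Current connections per login and host, busiest first.
    SELECT loginame, hostname, COUNT(*) AS connections
    FROM master..sysprocesses
    WHERE spid > 50
    GROUP BY loginame, hostname
    ORDER BY connections DESC
    ```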


    * Noel

  • Thank you noeld ...

    Regards
    Rudy Komacsar
    Senior Database Administrator
    "Ave Caesar! - Morituri te salutamus."

  • Any chance your front-end apps are creating new connections for each call rather than closing and reusing connections correctly? If you have a single user session with 20 calls to the DB, is that creating 20 connections or reusing a single connection? Try setting the app up in a test environment and running a single user through each app to ensure you are reusing existing DB connections rather than creating new ones. Been there a few times, although not on this scale. Good luck.

  • There have been great suggestions in this thread, but I'd submit that your hardware is likely very underpowered for this setup. If you really have that many connections, you need to engage PSS and work with them, because changing some of the settings, like max worker threads, can have strange effects. They are better suited to evaluate the effects in your case.

    I'd also tell your boss that you need new hardware. I'd hate to guess at a size here, but I'm sure you need Enterprise Edition and more than 4GB of RAM. At this point in time, you'd be one of the few for whom I'd suggest a 64-bit server (if you don't have reasons, like DTS, not to use it), and SS2K5, even running in SS2K compatibility mode. I'd probably look for at least 8 if not 16GB or more of RAM. No idea on CPUs; look at what your current setup uses.

    If you're running 12 magazine sites, I can't believe there isn't a good reason to invest in more hardware and a more complete architecture. I'd really recommend you engage Microsoft consultants, preferably a SQL Ranger like my buddy David Reed (http://blogs.msdn.com/reedme/), and get them to evaluate what makes sense.

    This doesn't seem like a problem that will be solved in a forum like this.

  • Thank you Noel.

    I changed the max number of concurrent connections parameter to 65534.

    Also, using Perfmon, I can see the concurrent connections value changing.

    How can I check if the web server connection pool is enabled?

    We use IIS and all 12 servers are load balanced.

    Alex S
  • Thank you Steve.

    We are looking into every possibility here, but as far as upgrading and investing in hardware goes, that will never happen, since we are moving to a LAMP architecture.

    Alex S
  • I sure hope you aren't moving to LAMP because you (or the powers that be) decided that SQL/IIS can't perform acceptably!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 1 through 15 (of 29 total)
