SQL 2008 not starting on one node in a cluster

  • Sorry if this is in the wrong section; please move if required.

    I built a WS 2008 failover cluster on 2 HP blades and have one instance of SQL 2008 running on it. It has run fine for weeks, but then something odd started.

    Node 1 has been active for a few months, then we failed it over to node 2 for a while.

    On switching back to node 1, SQL will not start. I get the following group of errors in the cluster log:

    checkODBCConnectError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]Login timeout expired

    [sqsrvres] ODBC sqldriverconnect failed

    [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 274d; message = [Microsoft][SQL Server Native Client 10.0]TCP Provider: No connection could be made because the target machine actively refused it.

    Report Server Windows Service (MSSQLSERVER) cannot connect to the report server database.

    [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 274d; message = [Microsoft][SQL Server Native Client 10.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.
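    The "native error = 274d" value in the messages above looks like a hexadecimal Winsock code. That reading is an assumption, but it matches the message text: 0x274D is 10061, which Windows defines as WSAECONNREFUSED ("target machine actively refused it"). In other words, nothing was listening on the SQL port when the cluster resource tried to connect. A minimal sketch of the conversion, where the function name and the small lookup table are illustrative only:

```python
# Two Winsock error codes as defined by Windows; the table is deliberately tiny.
WINSOCK_ERRORS = {
    10060: "WSAETIMEDOUT",      # connection attempt timed out
    10061: "WSAECONNREFUSED",   # target machine actively refused the connection
}


def decode_native_error(hex_code):
    """Interpret a 'native error' value from the cluster log as hexadecimal.

    Assumption: the log prints the Winsock code in hex (e.g. '274d').
    Returns (decimal_code, symbolic_name_or_'unknown').
    """
    code = int(hex_code, 16)
    return code, WINSOCK_ERRORS.get(code, "unknown")


print(decode_native_error("274d"))  # (10061, 'WSAECONNREFUSED')
```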

    It's hard to get downtime on the system as it's used a lot, so when I do get a window of a few hours I need to make sure I can get it working again. Any ideas would be greatly appreciated!

    Thanks

    Martyn

  • PS: to add to this, clients can now only connect to the instance by its name, not by its IP address.
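    One way to confirm the name-versus-IP behaviour from a client is a plain TCP probe against both addresses. This is only a sketch: it assumes the instance listens on the default port 1433, and the host name and IP in the usage comment are placeholders, not values from this cluster. It checks that something accepts a TCP connection, not that SQL Server will accept a login.

```python
import socket


def can_connect(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 1433 is the SQL Server default and is an assumption here; a named
    instance or a custom configuration may use a different port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (placeholder targets, replace with the real virtual name and IP):
# for target in ("sqlcluster01", "10.0.0.50"):
#     print(target, can_connect(target))
```

    If the name connects and the IP does not, the next thing to check is which IP addresses the SQL Server network name and listener are actually bound to on the active node.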

    Thanks

  • On further investigation, it appears that SQL Reporting Services has been installed on one node (the one that will not run SQL).

    I will stop the service and see if this makes a difference when failing over.

    thanks

  • Was this cluster tested thoroughly before SQL Server was installed on it?

    Also, if it was a side-by-side upgrade, did you fail over the nodes after installing SQL Server?

    These issues are usually caught during testing, where they can be resolved without risking downtime on a live system.

    Thanks...

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • the cluster was tested for a long period of time both before and after SQL was installed.

    It was a fresh build of 2 servers, with fresh install of SQL, no upgrades were required.

    I think I have found the problem...

    SQL was installed and worked perfectly.

    Our 3rd-party support have now installed SQL Reporting Services onto the cluster, BUT have put it on node 1 instead of installing it as a clustered service.

    This happened at the same time SQL stopped working on node 1.

    So I currently have SQL on node 2 and Reporting Services on node 1, which is causing SQL to fail when I try to fail it back to node 1.

    Now that I know the problem, I hope to get them to reinstall Reporting Services as a clustered service, or on its own server if not.

    Thanks again

  • Got to the bottom of the problems this morning...

    Disabling SQL Reporting Services on node 1 has now allowed it to accept SQL in the event of a failover. I'm really not sure why Reporting Services would stop it, but I am happy it is working again as we needed failover abilities on this live app - yippee...

    Many thanks guys and girls
