Two-node WSFC across multiple subnets: is there any need for a file share witness if we do not want automatic failover?

  • We are running SQL Server 2012 Enterprise on Windows Server 2012 R2. We have set up a WSFC with the primary node in our East Coast data center and the secondary node in our West Coast data center; each node is in a different subnet. Since we have high latency between the sites, we run in asynchronous mode with manual failover only. My question concerns quorum. Everything I read indicates we need a third vote (an odd number), e.g. a file share witness, with only the file share and the primary node having a vote. What I don't see is any need for a file share if we can only do manual failover.

    Any advice would be appreciated...

  • Which quorum mode did you configure?

  • Reg_Mayfield (2/10/2015)


    everything I read would indicate we need a third node (odd number) (e.g. a fileshare Witness)

    Yes, the vote count should be an odd number so that a majority can still be formed in failure situations.

    Reg_Mayfield (2/10/2015)


    with only the file share and the primary node having a vote.

    So you go to all the trouble of configuring a file share witness on a third site and then bump the vote count back down to 2?

    The vote count needs to be an odd number

    Reg_Mayfield (2/10/2015)


    ..What I don't see is any need for a file share if we can only do manual failover.

    Any advice would be appreciated...

    The witness is there to keep the cluster service online. Without it, if the cluster loses quorum you can't perform a manual failover of the instance without first forcing the cluster service to start (forced quorum), which you really don't want to do except as a last resort.
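
    The vote arithmetic behind this can be sketched in a few lines of Python. This is only a simplified model, not anything the cluster service exposes: the function names `votes_needed` and `failures_tolerated` are made up for illustration, and the model ignores the dynamic quorum and dynamic witness behaviour in Windows Server 2012 R2, which can adjust votes at runtime.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the configured votes."""
    return total_votes // 2 + 1


def failures_tolerated(total_votes: int) -> int:
    """How many votes can be lost while a majority still remains."""
    return total_votes - votes_needed(total_votes)


# Hypothetical vote layouts for a two-node cluster.
configs = {
    "primary node only": 1,
    "both nodes, no witness": 2,
    "both nodes + file share witness": 3,
}

for name, total in configs.items():
    print(f"{name}: {total} vote(s), needs {votes_needed(total)}, "
          f"tolerates {failures_tolerated(total)} lost vote(s)")
```

    In this model only the three-vote layout survives the loss of one voter, which is why the witness matters even when failover is manual only.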

    It might pay you to read through my Stairway to AlwaysOn series on this site to get essential knowledge on clusters and FCIs.

    Start at this link

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • OK, I have read all the comments as well as the Stairway series on AlwaysOn.

    I am still unsure how to set up the quorum.

    Here is what I understand (and this would be so much easier if I had more than one node that could vote):

    We have a two-node setup, each node in a different subnet. Since one of the nodes is our DR node, it should not have a vote; that is according to all the articles I have read.

    That means I have 1 node that can vote, and that is an odd count. So far so good.

    If the primary node goes down while we are running in asynchronous mode and the failover node has no vote, then I am guessing the cluster is no longer in good health.

    Adding a file share witness as another vote would keep the cluster healthy enough for a manual failover, but then you have an even number of votes, which according to the Stairway article is also a no-go.

    So I am in a quandary as to how I should set the quorum for a two-node cluster where only one node has a vote.

    Do we need two file share witnesses and one node that has a vote?

  • Reg_Mayfield (2/11/2015)


    Since one of the nodes is our DR node, it should not have a vote; that is according to all the articles I have read.

    Where have you read this? It isn't necessarily true. In your situation you will need the DR node to have a vote.

    I think you may be confused. The golden rule is that your DR site should not be vote heavy, as this can cause unwanted failovers to the DR location.

    Reg_Mayfield (2/11/2015)


    That means I have 1 node that can vote, and that is an odd count. So far so good.

    No. A cluster with only 1 voting node cannot sustain an outage of that node: if it goes down, the cluster will be offline and so will your SQL Server FCIs and AlwaysOn availability groups.

    Reg_Mayfield (2/11/2015)


    If the primary node goes down while we are running in asynchronous mode and the failover node has no vote, then I am guessing the cluster is no longer in good health.

    Adding a file share witness as another vote would keep the cluster healthy enough for a manual failover, but then you have an even number of votes, which according to the Stairway article is also a no-go.

    So I am in a quandary as to how I should set the quorum for a two-node cluster where only one node has a vote.

    Do we need two file share witnesses and one node that has a vote?

    Both nodes will have a vote. If the nodes are in the same site you may use a disk-based witness; for geographically dispersed nodes use a file share witness, ideally on a third site. If a third site is not available, place it on the primary site. Do not place it on the DR site (I think this is where you are confused about DR voting), because that site will become vote heavy, i.e.

    Primary site 1 vote

    DR site 2 votes
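
    To make the "vote heavy" point concrete, here is a rough Python sketch of what happens when the link between the two sites drops. It is a simplified model under stated assumptions: `site_with_quorum` and the site names are hypothetical, each site is assumed to reach only its own voters while the link is down, and dynamic quorum adjustments are ignored.

```python
from typing import Dict, Optional


def site_with_quorum(votes_by_site: Dict[str, int]) -> Optional[str]:
    """Return the site that still holds a strict majority of all votes
    when the sites cannot see each other, or None if no site does."""
    total = sum(votes_by_site.values())
    for site, votes in votes_by_site.items():
        if votes > total // 2:
            return site
    return None


# Witness placed at the DR site: DR holds 2 of 3 votes.
print(site_with_quorum({"primary": 1, "dr": 2}))
# Witness placed at the primary site: primary holds 2 of 3 votes.
print(site_with_quorum({"primary": 2, "dr": 1}))
# No witness at all: neither site has a majority.
print(site_with_quorum({"primary": 1, "dr": 1}))
```

    With the witness at the DR site, a link failure leaves the primary node without quorum even though nothing is wrong with it. A third-site witness is not modelled here, because the outcome then depends on which site can still reach the share.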

  • The file share witness is located on the primary node's subnet.

    The reason I did not give the DR site a vote is the following Microsoft article:

    https://msdn.microsoft.com/en-us/library/hh270280(v=sql.110)

    Recommended Adjustments to Quorum Voting

    When enabling or disabling a given WSFC node’s vote, follow these guidelines:

    • No vote by default. Assume that each node should not vote without explicit justification.

    • Include all primary replicas. Each WSFC node that hosts an availability group primary replica or is the preferred owner of an FCI should have a vote.

    • Include possible automatic failover owners. Each node that could host a primary replica, as the result of an automatic availability group failover or FCI failover, should have a vote. If there is only one availability group in the WSFC cluster and availability replicas are hosted only by standalone instances, this rule includes only the secondary replica that is the automatic failover target.

    • Exclude secondary site nodes. In general, do not give votes to WSFC nodes that reside at a secondary disaster recovery site. You do not want nodes in the secondary site to contribute to a decision to take the cluster offline when there is nothing wrong with the primary site.

    • Odd number of votes. If necessary, add a witness file share, a witness node, or a witness disk to the cluster and adjust the quorum mode to prevent possible ties in the quorum vote.

    • Re-assess vote assignments post-failover. You do not want to fail over into a cluster configuration that does not support a healthy quorum.

    So if I follow MS advice and do not allow the DR site to have a vote, that still leaves me with a two-vote quorum: the primary node and the file share witness.

  • It's all about interpretation of the text, and these are the points you're missing here:

    Reg_Mayfield (2/11/2015)


    Voting and Non Voting Nodes

    By default, each node in the WSFC cluster is included as a member of the cluster quorum; each node has a single vote in determining the overall cluster health, and each node will continuously attempt to establish a quorum. The quorum discussion to this point has carefully qualified the set of WSFC cluster nodes that vote on cluster health as voting nodes.

    No individual node in a WSFC cluster can definitively determine that the cluster as a whole is healthy or unhealthy. At any given moment, from the perspective of each node, some of the other nodes may appear to be offline, or appear to be in the process of failover, or appear unresponsive due to a network communication failure. A key function of the quorum vote is to determine whether the apparent state of each node in the WSFC cluster is indeed the actual state of those nodes.

    For all of the quorum models except ‘Disk Only’, the effectiveness of a quorum vote depends on reliable communications between all of the voting nodes in the cluster. Network communications between nodes on the same physical subnet should be considered reliable; the quorum vote should be trusted.

    However, if a node on another subnet is seen as non-responsive in a quorum vote, but it is actually online and otherwise healthy, that is most likely due to a network communications failure between subnets. Depending upon the cluster topology, quorum mode, and failover policy configuration, that network communications failure may effectively create more than one set (or subset) of voting nodes.

    When more than one subset of voting nodes is able to establish a quorum on its own, that is known as a split-brain scenario. In such a scenario, the nodes in the separate quorums may behave differently, and in conflict with one another.

    Note

    The split-brain scenario is only possible when a system administrator manually performs a forced quorum operation, or in very rare circumstances, a forced failover; explicitly subdividing the quorum node set.

    In order to simplify your quorum configuration and increase up-time, you may want to adjust each node’s NodeWeight setting so that the node’s vote is not counted towards the quorum.

    And also here

    Reg_Mayfield (2/11/2015)


    When enabling or disabling a given WSFC node’s vote, follow these guidelines:

    • No vote by default. Assume that each node should not vote without explicit justification.

    • Include all primary replicas. Each WSFC node that hosts an availability group primary replica or is the preferred owner of an FCI should have a vote.

    • Include possible automatic failover owners. Each node that could host a primary replica, as the result of an automatic availability group failover or FCI failover, should have a vote. If there is only one availability group in the WSFC cluster and availability replicas are hosted only by standalone instances, this rule includes only the secondary replica that is the automatic failover target.

    • Exclude secondary site nodes. In general, do not give votes to WSFC nodes that reside at a secondary disaster recovery site. You do not want nodes in the secondary site to contribute to a decision to take the cluster offline when there is nothing wrong with the primary site.

    • Odd number of votes. If necessary, add a witness file share, a witness node, or a witness disk to the cluster and adjust the quorum mode to prevent possible ties in the quorum vote.

    • Re-assess vote assignments post-failover. You do not want to fail over into a cluster configuration that does not support a healthy quorum.

    So if I follow MS advice and do not allow the DR site to have a vote, that still leaves me with a two-vote quorum: the primary node and the file share witness.

    Situations will be different and there are times when all nodes will vote. Obviously as the number of cluster nodes rises you will have more flexibility with the voting system.

  • So one last question and I'll put this discussion to bed.

    I will set up my WSFC with three votes: the primary node, the secondary node and a file share witness, each having one vote.

    We have quite a few servers that are being upgraded to AlwaysOn. The sysadmin is insisting that he build one VM/server per file share witness. His reasoning is that if he has to shut down the server and multiple WSFCs use that one file share, we are setting ourselves up for trouble down the line. Is this a worthy endeavor on his part, or is he wasting resources by having a dedicated server for each file share? What would you consider best practice when setting up a file share witness that could be used by 10, 20 or even 50 servers?

    Thanks
