
ZooKeeper Source Code Reading (5): Leader Election

2013-10-10 

Leader election in ZooKeeper is not Paxos either. The classes that implement it include FastLeaderElection and LeaderElection, with the following inheritance hierarchy:


//Election 

//  +AuthFastLeaderElection

//  +FastLeaderElection

//    +MockFLE (test)

//  +LeaderElection

//    +MockLeaderElection

 

FastLeaderElection is used by default. ZooKeeper itself does not actually care how leader election is implemented, as long as the following hold:

  • The leader has seen the highest zxid of all the followers.

  • A quorum of servers have committed to following the leader.

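    If a follower has a zxid higher than any zxid the leader has seen, that follower connected to the leader only after the election had finished. In that case the leader sends a TRUNC message telling the follower to discard the extra transactions.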


    1) Implementation of FastLeaderElection

    // Constructor
         // Two queues, used later by the Messenger
         sendqueue = new LinkedBlockingQueue<ToSend>();
         recvqueue = new LinkedBlockingQueue<Notification>();

     

    // The messenger object
    // Messenger starts two threads: WorkerSender and WorkerReceiver
         this.messenger = new Messenger(manager);

                      

    // WorkerSender consumes sendqueue
             ToSend m = sendqueue.poll(3000, TimeUnit.MILLISECONDS);
             if (m == null) continue;   // poll timed out, nothing to send
             process(m);

     

    // Messages in the queue are of type ToSend
    // {leader, zxid, elecEpoch, state, sid, peerEpoch, configData}
             // Each message is either a Notification or a reply to a Notification
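The queue hand-off described above can be sketched as a small, self-contained program. This is only an illustration under stated assumptions: `SendQueueSketch`, its trimmed-down `ToSend`, the `delivered` list (standing in for the network), and `demo()` are hypothetical names, not ZooKeeper's classes, and the 100 ms poll timeout is chosen for the example rather than taken from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class SendQueueSketch {
    // Hypothetical, trimmed-down stand-in for ToSend (the real type has more fields).
    static final class ToSend {
        final long leader, zxid, electionEpoch, sid;
        ToSend(long leader, long zxid, long electionEpoch, long sid) {
            this.leader = leader;
            this.zxid = zxid;
            this.electionEpoch = electionEpoch;
            this.sid = sid;
        }
    }

    final LinkedBlockingQueue<ToSend> sendqueue = new LinkedBlockingQueue<ToSend>();
    final List<ToSend> delivered = new ArrayList<ToSend>(); // stands in for the network
    final CountDownLatch firstDelivery = new CountDownLatch(1);
    volatile boolean stop = false;

    // Worker loop: poll with a timeout so the stop flag is re-checked regularly.
    void runSender() {
        try {
            while (!stop) {
                ToSend m = sendqueue.poll(100, TimeUnit.MILLISECONDS);
                if (m == null) continue; // timed out, nothing queued
                process(m);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Assumption: "sending" just records the message locally.
    void process(ToSend m) {
        synchronized (delivered) { delivered.add(m); }
        firstDelivery.countDown();
    }

    // Queues one message, waits for the worker to handle it, then shuts down.
    static int demo() {
        SendQueueSketch s = new SendQueueSketch();
        Thread sender = new Thread(s::runSender, "WorkerSender-sketch");
        sender.start();
        s.sendqueue.offer(new ToSend(1L, 0x100L, 1L, 2L));
        try {
            s.firstDelivery.await(5, TimeUnit.SECONDS);
            s.stop = true;
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (s.delivered) { return s.delivered.size(); }
    }
}
```

Polling with a timeout, rather than blocking forever on `take()`, lets the worker notice the stop flag even when no messages arrive.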

     

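    // WorkerReceiver receives messages
             // If the sender is not a voting member, reply directly with our own vote by putting it on sendqueue
             //
             // Otherwise, parse the message into a Notification. If we are still LOOKING, put it on recvqueue;
             // if we are not LOOKING but the sender is, reply with the id of our leader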

     

    // lookForLeader starts a new round of election
             // Advance logicalclock
             synchronized(this){
                 logicalclock++;
                 updateProposal(getInitId(), getInitLastLoggedZxid(), getPeerEpoch());
             }

             sendNotifications();

     

    // Keep exchanging Notifications until we are no longer LOOKING
            while ((self.getPeerState() == ServerState.LOOKING) &&
                   (!stop)){
                /*
                 * Remove next notification from queue, times out after 2 times
                 * the termination time
                 */
                Notification n = recvqueue.poll(notTimeout, TimeUnit.MILLISECONDS);

     

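                       // If the sender is a voter (VoterFollower), act on its state
                       // LOOKING:
                       // update logicalclock, update the Proposal, send Notifications
                       // put the vote into recvset, then check whether a quorum has been reached
                       // if so, wait finalizeWait milliseconds; if no better vote arrives, the election ends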

     

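                       // The sender is FOLLOWING or LEADING:
                       // if logicalclock matches, check whether a quorum has been reached
                       // everything placed in outofelection is in the LEADING or FOLLOWING state, i.e. has already finished electing
                       // in both cases checkLeader(outofelection, n.leader, n.electionEpoch) must pass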

     

                       // checkLeader
            /*
             * If everyone else thinks I'm the leader, I must be the leader.
             * The other two checks are just for the case in which I'm not the
             * leader. If I'm not the leader and I haven't received a message
             * from leader stating that it is leading, then predicate is false.
             */

            if(leader != self.getId()){
                if(votes.get(leader) == null) predicate = false;
                else if(votes.get(leader).getState() != ServerState.LEADING) predicate = false;
            } else if(logicalclock != electionEpoch) {
                predicate = false;
            }
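To make the excerpt above easier to experiment with, here is a standalone sketch of the same check. It rests on simplifying assumptions: `CheckLeaderSketch` is a hypothetical class, the `votes` map goes straight from server id to state (the excerpt calls `getState()` on the stored value instead), and `selfId` and `logicalclock` are passed in as parameters rather than read from fields.

```java
import java.util.Map;

final class CheckLeaderSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING }

    // Returns true when it is safe to accept `leader` as the elected leader.
    static boolean checkLeader(Map<Long, ServerState> votes, long leader,
                               long electionEpoch, long selfId, long logicalclock) {
        boolean predicate = true;
        if (leader != selfId) {
            // Someone else is the proposed leader: it must have told us it is LEADING.
            ServerState state = votes.get(leader);
            if (state == null) predicate = false;
            else if (state != ServerState.LEADING) predicate = false;
        } else if (logicalclock != electionEpoch) {
            // We are the proposed leader, but the vote belongs to another round.
            predicate = false;
        }
        return predicate;
    }
}
```

For example, with `votes = {2 -> LEADING}` and `selfId = 1`, proposing leader 2 passes the check, while an empty map fails it because the leader itself has not yet been heard from.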

     

    An increment of logicalclock marks the start of a new round of election.


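    In leader election every server initially votes for itself. When it receives another server's vote, it checks whether that vote is better than its own and, if so, adopts it. Eventually the follower with the highest zxid collects votes from a quorum; if no better vote arrives within finalizeWait milliseconds, it switches to the LEADING state.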
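The "is this vote better" test and the quorum check can be sketched as follows. This is a minimal illustration, not ZooKeeper's code: `VoteSketch`, `isBetter`, `hasQuorum`, and `demo` are hypothetical names. The text above only states that the highest zxid wins; comparing the peer epoch first and breaking zxid ties by the larger server id reflects my reading of FastLeaderElection and should be treated as an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class VoteSketch {
    // The three fields mirror the arguments of updateProposal in the excerpt.
    static final class Vote {
        final long leader, zxid, peerEpoch;
        Vote(long leader, long zxid, long peerEpoch) {
            this.leader = leader;
            this.zxid = zxid;
            this.peerEpoch = peerEpoch;
        }
        boolean sameAs(Vote o) {
            return leader == o.leader && zxid == o.zxid && peerEpoch == o.peerEpoch;
        }
    }

    // Assumed ordering: higher epoch, then higher zxid, then higher server id.
    static boolean isBetter(Vote candidate, Vote current) {
        if (candidate.peerEpoch != current.peerEpoch) return candidate.peerEpoch > current.peerEpoch;
        if (candidate.zxid != current.zxid) return candidate.zxid > current.zxid;
        return candidate.leader > current.leader;
    }

    // A proposal wins once strictly more than half of the voting members agree with it.
    static boolean hasQuorum(Map<Long, Vote> recvset, Vote proposal, int votingMembers) {
        int agreeing = 0;
        for (Vote v : recvset.values()) {
            if (v.sameAs(proposal)) agreeing++;
        }
        return agreeing > votingMembers / 2;
    }

    // Server 1 starts with a vote for itself and adopts any better vote it sees.
    static Vote demo() {
        Vote[] initial = {
            new Vote(1L, 0x10L, 1L), new Vote(2L, 0x10L, 1L), new Vote(3L, 0x12L, 1L)
        };
        Vote proposal = initial[0];
        for (Vote v : initial) {
            if (isBetter(v, proposal)) proposal = v;
        }
        return proposal;
    }
}
```

In `demo()`, server 3 holds the highest zxid, so server 1 ends up proposing server 3. Once two of the three servers have sent that same vote, `hasQuorum` returns true.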

     

    2) The implementation of LeaderElection is simpler
             // Open a socket to every server in the VotingView and send/receive votes
             // Then countVotes decides whether the election is over


    3) AuthFastLeaderElection adds simple authentication on top of FastLeaderElection

     

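    Leader election picks the server with the highest zxid. This is also an optimization, because the new leader then does not need to fetch missing transactions from the other members.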


    References:

    http://zookeeper.apache.org/doc/r3.2.2/zookeeperInternals.html#sc_leaderElection
