Hey, I’m Eugene and this is the second part of the live contract review series and today we’re going to review the staking pool contract, which is used right now to secure the NEAR Protocol proof of stake system. Basically all of the validators that are currently running on NEAR Protocol are running on behalf of this contract. They don’t control the account that stakes the amount of NEAR tokens required for proof of stake themselves, but instead the contract stakes this amount, and they just provide a staking pool and run their nodes. Today we’re gonna dig into this contract. In the core contracts we have a staking pool contract and it’s a little more complicated than the previous contract that we reviewed (the voting contract). So today we’re going to focus more on the logic and less on the near_bindgen and Rust specific stuff, but it will probably involve a little more of NEAR Protocol knowledge. Here is the staking pool contract on the NEAR Github. Below is the original video which this guide is based on.
As before, the contract starts from the main structure. In this case it’s a staking contract structure. As you can see there’s near_bindgen, and BorshSerialize and BorshDeserialize. The structure now has way more fields than the previous one, and there are comments on them, most of which are probably up to date. The logic of the staking pool contract allows us to do the following: anyone can deposit some amount of NEAR tokens to the staking pool, and delegate them by staking them in the pool. That allows us to bundle together balances from multiple people (we call them accounts here) into one large stake, and this way the large stake may actually qualify for a validator seat. NEAR Protocol has a limited number of seats: for a single shard right now, there are at most 100 validator seats. You can think about the seats in the following way: if you take the total amount of tokens staked and divide it by 100, the result will be the minimum amount of tokens required for a single seat, except it’s a little more complicated because it involves removing the stakes that do not reach this minimum amount, etc. This contract is a standalone contract without any access key on it, and it is controlled by the owner. In this case the owner is provided in the initialization method.
So let’s go to the initialization method. It has three arguments, and the first is the owner_id, which is the account id of the owner account. The owner has a bunch of permissions on this contract that allow the owner to perform actions that are not available to the rest of the accounts. One of these was to vote on behalf of the staking pool on the voting contract that we discussed last time. So the owner can call the vote method.
We then verify that the predecessor is equal to the owner since this method can only be called by the owner.
So what the vote method does is that it verifies that the method was only called by the owner and then it verifies some logic, but that’s not important right now.
So the contract has an owner, and this owner can do certain things because it has the extra permissions. Then the initialization method takes a few more fields: it takes the stake_public_key. When you stake on the NEAR Protocol you need to provide a public key that will be used by your validator node to sign messages on behalf of the validator node. This public key can be different from any access key, and ideally it should be different from any access key, because your node may be run in a data center that may be vulnerable to some attacks. In that case the most an attacker can do is something bad to the network, but not to your account. They cannot steal your funds, and this key is easier to replace than a compromised access key. Finally, the third argument that the contract takes is the initial reward_fee_fraction. This is the commission that the owner of the staking pool takes for running the validator node.
This is a fraction that has a numerator and a denominator, and it allows you to say, for example, “I take 1% of the rewards for running this particular pool”. Let’s say the pool has 1 000 000 tokens and they acquired a 10 000 token reward; then the owner will take 1% out of this, which is 100 tokens. We use a fraction rather than a float because floats can behave unpredictably when you multiply them, whereas with fractions you can do the math with a larger number of bits. So the way you apply the fraction, for example, is you first multiply the amount, which is u128, by the numerator. This can already overflow u128, and that’s why we do it in u256. Then you divide by the denominator, which should bring the result below u128 again. This gives you higher precision than float64, which cannot represent a u128 at full precision, so it will have rounding or precision errors when you do the math. So you would need higher precision floats, which are not really different from the math that we simulate with u256. Solidity originally didn’t support floats, and we also originally did not, but that caused some issues around string formatting in Rust for debugging, so we decided that there is no harm in supporting floats, especially as we standardized this on the VM side. The biggest issue with floats was the undefined behavior around certain float values, for example, what the other bits contain when you have an infinite float. We standardized this, and now floats behave the same way independent of the platform. So, it’s okay to use floats now in our VM environment.
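The fraction arithmetic described above can be sketched in plain Rust. This is a minimal illustration, not the contract's code: to stay within the standard library it uses u64 as the balance type and u128 as the wide intermediate, standing in for the contract's u128 and u256. The type and method names are assumptions chosen for the example.

```rust
/// Scaled-down stand-ins (assumption): the contract uses u128 balances with a
/// u256 intermediate; here u64 and u128 play those roles.
type Balance = u64;
type Wide = u128;

/// Hypothetical mirror of the reward fee fraction described in the text.
#[derive(Clone, Copy, Debug)]
struct RewardFeeFraction {
    numerator: u32,
    denominator: u32,
}

impl RewardFeeFraction {
    /// Computes floor(value * numerator / denominator).
    /// The product is formed in the wider type, so it cannot overflow.
    /// Assumes denominator != 0 and numerator <= denominator, which keeps
    /// the result no larger than `value`, so the final cast is lossless.
    fn multiply(&self, value: Balance) -> Balance {
        let product = Wide::from(self.numerator) * Wide::from(value);
        (product / Wide::from(self.denominator)) as Balance
    }
}
```

With a fee of 1/100, `multiply(10_000)` gives the 100 token commission from the example above. The second assertion in the test uses a value where the product would not fit in the narrow type, which is the case the wide intermediate exists for.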
The standard practice with init is that we first check that the state doesn’t exist. Then we verify the input. The first thing we do is verify that the fraction is valid and check that the denominator is not zero. Next, we have an assert statement that checks that the numerator is less than or equal to the denominator, which means that the fraction is less than or equal to 1. This is important to avoid some logic mistakes. The next thing we do is verify that the account id is valid. This contract was written before some of the helper types that exist now. For example, we now have the valid account id in JSON types that does this check automatically during deserialization; if the account id is invalid it will just panic. After that we pull the current account balance of the staking contract. This balance is usually large enough because it has to pay for the storage of this particular contract, and then we say that we’re going to allocate some tokens for the STAKE_SHARE_PRICE_GUARANTEE_FUND. The staking pool has certain guarantees that are important for lockup contracts. The guarantees ensure that when you deposit to the staking pool, you should be able to withdraw at least the same amount of tokens, and you cannot lose even 1 yocto NEAR on this contract by depositing to and withdrawing from the staking pool. The STAKE_SHARE_PRICE_GUARANTEE_FUND is 1 trillion yocto NEAR (1 000 000 000 000), while we usually consume around 1 or 2 yocto NEAR per operation for rounding errors. Finally we remember the balance that we’re going to stake on behalf of this contract. This is required to establish a baseline to limit the rounding differences. Next, we verify that the account has not staked already, because that might break some logic. We don’t want this to happen, so we want to initialize the contract before it stakes anything. Finally we initialize the structure, but we don’t return it immediately. We just create the StakingContract structure here.
Then we issue a restaking transaction. This is important, because we need to make sure that the staking key that was provided is a valid ristretto-restricted key and not just, for example, a valid ed25519 key. There are some keys on the curve that are valid keys, but are not ristretto specific, and validator keys can only be ristretto specific. This is a NEAR Protocol specific thing, and what happens is the contract makes a staking transaction with the given key. Once this transaction is created from the contract, it is validated when it leaves the contract. If the key is invalid, then it will throw an error, and the entire initialization of this staking pool will fail. If you pass an invalid stake_public_key as an input, then your contract initialization and deployment, and everything else that happens in this one batch transaction, will be reverted. This is important so that the pool doesn’t have an invalid key, because that might allow the stake of other people to be blocked. As a part of the guarantees we’re saying that if you unstake, your tokens will be returned in 4 epochs. They will be eligible for withdrawal, and this is important to be able to return them to lockups.
I think this is too many details before I explain the high level overview of how the contract works and how balances work. Let’s explain the concept of how we actually can distribute rewards to account owners in constant time when an epoch passes. This is important for most smart contracts. They want to act in constant time for every method instead of linear time in the number of users, because if the number of users grows then the amount of gas required for a linear operation will grow as well, and it will eventually run out of gas. That’s why all smart contracts have to act in constant time.
The way it works is that for every user we keep a structure called account. Every user that has delegated to this staking pool will have a structure called account that has the following fields: unstaked is the balance in yocto NEAR that is not staked, so it’s just the balance of the user. Then stake_shares is also a balance, not in NEAR but in the number of stake shares. Stake shares are a concept that was added to this particular staking pool. The way it works is when you stake, you essentially buy new shares at the current price by converting your unstaked balance into stake shares. The stake share price is originally 1, but over time it grows with the rewards: when the pool receives rewards its total stake balance increases, but the total amount of stake shares doesn’t change. Essentially when the pool receives validation rewards, or some other deposits straight to the balance, it increases the amount that you can receive for every stake share. Let’s say, for example, you originally had 1 million NEAR that was deposited to this pool. Let’s say you get 1 million shares (ignoring the yocto NEAR for now). If the staking pool received 10 000 NEAR in rewards, you still have 1 million shares, but the 1 million shares now correspond to 1 010 000 NEAR. Now if someone else wants to stake at this time, they will purchase stake shares internally within the contract at the price of 1.01 NEAR, because every share is worth that now. When the pool receives another reward, nobody needs to buy more shares even though the total balance grew, and in constant time everybody shares the reward proportionally to the number of shares they have. Now, when you unstake, you’re essentially selling these shares, or burning them (to use the concept from fungible tokens) in favor of unstaked balance.
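The two fraction checks described above can be sketched as follows. This is a hedged illustration: the contract itself panics through assertions during init, while this sketch returns a `Result` so the checks can be exercised in plain Rust. The `validate` name and the error messages are assumptions made for the example.

```rust
/// Hypothetical stand-in for the reward fee fraction passed to init.
struct RewardFeeFraction {
    numerator: u32,
    denominator: u32,
}

impl RewardFeeFraction {
    /// Returns an error instead of panicking (an assumption for testability).
    fn validate(&self) -> Result<(), &'static str> {
        // A zero denominator would make the division undefined.
        if self.denominator == 0 {
            return Err("denominator must be a positive number");
        }
        // A fraction above 1 would let the owner take more than the reward.
        if self.numerator > self.denominator {
            return Err("the reward fee must be less than or equal to 1");
        }
        Ok(())
    }
}
```

A fee of 5/5 (100%) passes, while 3/2 or anything with a zero denominator is rejected.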
So you sell at the current price, you decrease the total amount of stake as well as the total amount of shares, and when you purchase you increase the total stake balance and the total stake shares while keeping the price constant. When you stake or unstake you don’t change the price; when you receive the rewards you increase the price.
The price can only go up, and this may lead to rounding errors when your balance in yocto NEAR cannot correspond precisely to a whole number of shares. That’s why we have this guarantee fund of 1 trillion yocto NEAR that will throw one extra yocto NEAR into the mix a few times. The final field of the account is there because the NEAR Protocol does not unstake and return the balance immediately; it has to wait three epochs until your balance becomes unstaked and is returned to the account. If you unstake you cannot withdraw this balance immediately from the staking pool, you need to wait three epochs. So we remember at which epoch height you called the last unstake action, and after three epochs your balance will become unlocked, and you should be able to withdraw from unstaked. However, there’s one caveat: if you call unstake at the last block of the epoch, the actual promise that does the unstaking will arrive in the next epoch. It will arrive at the first block of the next epoch, and that will delay your locked balance becoming unlocked to four epochs instead of three. This is because we recorded the epoch in the previous block, but the actual transaction happened in the next block, in the next epoch. To make sure that doesn’t cause problems, we lock the balance for four epochs instead of three to account for this border case. That’s what constitutes an account. The idea of shares is not that new, because on Ethereum the majority of liquidity providers and automated market makers use a similar concept. When you, for example, deposit to a liquidity pool you get some kind of token from this pool instead of the actual amount that is represented there. When you withdraw from the liquidity pool you burn this token and get the actual represented tokens. The idea is very similar; we call them shares because they have a corresponding price, but we could have called them differently. This design was there from almost the beginning of this staking pool contract.
There was exploration around how we can do this properly, and one option was to limit the number of accounts that can deposit to a given pool so that all of them could be updated. We eventually landed on the constant time model, and it was actually a simpler model. Then the math of the stake_shares structure became somewhat reasonable, even though there is still some complexity involved there.
Let’s go through this contract. It’s not as well structured as the lockup contract, for example, because lockup is even more complicated. The types are still bundled in the same file. There are a bunch of types, for example reward_fee_fraction is a separate type.
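The share mechanics described above can be modeled with a toy pool that tracks only the two totals that define the price. This is a simplified sketch under stated assumptions: it uses whole tokens instead of yocto NEAR, plain u128 arithmetic without the wider intermediate, always rounds down, and assumes the pool is never empty. The struct and method names are hypothetical.

```rust
/// Toy pool: the share price is total_staked_balance / total_stake_shares.
struct Pool {
    total_staked_balance: u128,
    total_stake_shares: u128,
}

impl Pool {
    /// Staking buys shares at the current price (rounded down).
    /// Both totals grow together, so the price stays the same.
    fn stake(&mut self, amount: u128) -> u128 {
        let shares = amount * self.total_stake_shares / self.total_staked_balance;
        self.total_staked_balance += amount;
        self.total_stake_shares += shares;
        shares
    }

    /// A reward grows only the balance, so every share becomes worth more.
    /// Note that no per-account state is touched: this is the constant-time step.
    fn add_reward(&mut self, reward: u128) {
        self.total_staked_balance += reward;
    }

    /// Value of `shares` at the current price (rounded down).
    fn amount_for_shares(&self, shares: u128) -> u128 {
        shares * self.total_staked_balance / self.total_stake_shares
    }
}
```

Starting from 1 000 000 tokens and 1 000 000 shares, a 10 000 token reward makes the original shares worth 1 010 000 tokens. A newcomer who then stakes 101 tokens receives 100 shares, which matches the price of 1.01 per share.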
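The four-epoch lock on unstaked balances described above can be sketched like this. It is a minimal model, assuming an account that only tracks its unstaked balance and the epoch at which that balance unlocks; the method names and the constant name are assumptions made for the example.

```rust
type EpochHeight = u64;

/// Four epochs rather than three, to cover an unstake issued in the last
/// block of an epoch whose promise lands in the following epoch.
const NUM_EPOCHS_TO_UNLOCK: EpochHeight = 4;

/// Hypothetical, reduced account record.
struct Account {
    unstaked: u128,
    unstaked_available_epoch_height: EpochHeight,
}

impl Account {
    /// Records an unstake at `current_epoch` and restarts the lock.
    fn record_unstake(&mut self, amount: u128, current_epoch: EpochHeight) {
        self.unstaked += amount;
        self.unstaked_available_epoch_height = current_epoch + NUM_EPOCHS_TO_UNLOCK;
    }

    /// The unstaked balance can be withdrawn once the lock epoch is reached.
    fn can_withdraw(&self, current_epoch: EpochHeight) -> bool {
        self.unstaked_available_epoch_height <= current_epoch
    }
}
```

An unstake recorded at epoch 10 is still locked at epoch 13 and becomes withdrawable at epoch 14.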
Account is a separate type and there’s also a human readable account which is also a type that is only used for view calls, so it’s not used for logic internally.
Then after we finish with all of the types, we have cross contract calls using a high level interface.
There are two of them. The way it works is that you have a macro from near_bindgen called ext_contract (standing for external contract). You can give it a short name for the module that it will generate and that you will be able to use. Then you have a trait description describing the interface of the external contract that you want to use. This one describes the fact that you can call a vote method on a remote contract and pass one argument in. The argument is is_vote, which is a true or false boolean. Now you will be able to create a promise when you need it, and pass a positional argument instead of a JSON serialized argument. The macro will turn it into low-level promise APIs behind the scenes. The second interface is for a callback on ourselves. This is fairly common, and you can call it ext_self. When you need to do a callback and do something with the result of an asynchronous promise, you can have this type of interface. What we do with it is check whether the staking action succeeded. Finally, we have the main implementation body of the staking pool.
Contract File Structure
This contract is split into multiple modules.
You have lib.rs which is the main entry point, and you also have an internal module. The internal module has the implementation without the near_bindgen macro, so none of these methods can be called on the contract by someone else on the chain. They can only be called within this contract internally, so they don’t generate JSON wrappers and don’t deserialize state. They all act as regular Rust methods. How this contract works at a high level is that when an epoch passes you may acquire certain rewards as a validator.
Important Methods of the Contract
We have a ping method which pings the contract. The ping method checks if an epoch has passed and then we need to distribute rewards. If the epoch changed then it will also restake, because there might be some change in the amount of total stake the contract has to stake. The next is deposit.
The deposit method is payable, which means it can accept an attached deposit. This is similar to the payable modifier in Ethereum, which allows only the methods that expect funds to receive them. By default near_bindgen will panic if you try to call a method, for example ping, and attach a deposit to it. Consequently, payable allows us to attach deposits. In every method there’s an internal ping to make sure that we distributed previous rewards before changing any state. The common structure is that if we need to restake, then we first do some logic, and then restake.
The next method is deposit_and_stake. This is a combination of two methods. First, you deposit the balance to the unstaked balance of your account, and then you stake the same amount immediately instead of doing two transactions. It’s also payable because it also accepts a deposit.
The next is withdraw_all. It tries to withdraw the entire unstaked balance from the account that called it. When you interact with the staking pool you need to interact from the account that owns the balance. In this case this is the predecessor_account_id, and we basically check the account, and then we withdraw the unstaked amount if we can. If it cannot be withdrawn, then it will panic, for example, if it’s still locked due to unstaking less than 4 epochs ago.
Withdraw allows you to withdraw only partial balance.
Then stake_all stakes all of the unstaked balance. It’s pretty rare to use this method, because you usually use deposit_and_stake, which already stakes the whole deposited balance.
Then in the stake method you just stake some amount of the unstaked balance. Moonlight wallet uses separate calls to deposit and stake, but they use a batched transaction to do this.
Finally you have unstake_all which basically unstakes all your stake shares by converting them to yocto NEAR. There is a helper method which says convert my number of shares to an amount of yocto NEAR and round down, because we cannot give you extra for your share multiplied by price. That’s how we get the amount and then we call unstake for the given amount.
The staked_amount_from_num_shares_rounded_down logic uses u256, because balances operate on u128. To avoid overflow, we multiply the total_staked_balance by the number of shares in u256. The result is the quotient rounded down.
The round up version, staked_amount_from_num_shares_rounded_up, is very similar, except we add an adjustment before dividing that makes the result round up. At the end of both we cast it back to u128.
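The two conversions described above can be sketched as free functions. This is an illustration under stated assumptions: u64 and u128 stand in for the contract's u128 and u256 so that only the standard library is needed, the pool totals are passed as explicit parameters instead of being read from contract state, and the shortened function names are chosen for the example.

```rust
/// Scaled-down stand-ins (assumption): u64 for balances, u128 for the wide type.
type Balance = u64;
type Wide = u128;

/// floor(num_shares * total_staked_balance / total_stake_shares).
/// Assumes total_stake_shares > 0 and num_shares <= total_stake_shares,
/// so the result fits back into the narrow type.
fn staked_amount_rounded_down(
    num_shares: Balance,
    total_staked_balance: Balance,
    total_stake_shares: Balance,
) -> Balance {
    let product = Wide::from(total_staked_balance) * Wide::from(num_shares);
    (product / Wide::from(total_stake_shares)) as Balance
}

/// ceil(num_shares * total_staked_balance / total_stake_shares).
/// Adding (divisor - 1) before dividing turns the floor into a ceiling.
fn staked_amount_rounded_up(
    num_shares: Balance,
    total_staked_balance: Balance,
    total_stake_shares: Balance,
) -> Balance {
    let divisor = Wide::from(total_stake_shares);
    let product = Wide::from(total_staked_balance) * Wide::from(num_shares);
    ((product + divisor - 1) / divisor) as Balance
}
```

At a price of 1.01 per share, 3 shares are worth 3.03 units, so the two functions return 3 and 4. When the value is exact, as for 100 shares, both return 101.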
Then we have an unstake action which is very similar to unstake_all, except you pass the amount.
After that there’s a bunch of getter methods that are view calls that return you some amounts. You can get the account unstaked balance, account staked balance, account total balance, check if you can withdraw, total stake balance, which is the total amount the staking pool has in active stake.
Then you can get who the owner of the staking pool is, you can get the current reward fee or commission of the staking pool, get the current staking key, and there’s a separate thing that checks if the owner paused the staking pool.
Let’s say the owner does a migration of the staking pool node and needs to completely unstake. They can pause the staking pool, which will send a staking transaction with a zero amount to the NEAR Protocol, and the pool will not restake until they resume it. You can still withdraw your balances, but you will stop acquiring rewards after it’s paused.
Finally you can get a human readable account which gives you how many tokens you actually have for the number of shares at the current price, and finally it says whether you can withdraw or not.
Then it gives you the number of accounts, which is the number of delegators to this staking pool, and you can also retrieve multiple delegators at once. This is pagination over a large number of accounts within the unordered map. One way of doing this is to use the helper called keys_as_vector from the unordered map. It gives you a persistent collection of keys from the map, and then you can use an iterator to request accounts from these keys. That’s not the most efficient way, but it allows you to implement pagination on unordered maps.
There are a bunch of owner methods. An owner method is a method that can only be called by the owner. The owner can update the staking key. Let’s say they have a different node, and the owner needs to use a different key. All of these methods first check that the caller is the owner.
This is the method that changes the commission on the staking pool. The owner can change the commission, and the new value becomes active immediately starting from this epoch, but all of the rewards from previous epochs will be calculated using the previous fee.
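The pagination approach described above can be sketched with standard collections. This is only a model: a `Vec` of keys stands in for the persistent key vector that the unordered map exposes, a `HashMap` stands in for the map itself, and the `Pool`, `from_pairs`, and `get_accounts` names and signatures are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Toy model of delegator storage: ordered keys plus a lookup map.
struct Pool {
    keys: Vec<String>,
    accounts: HashMap<String, u128>,
}

impl Pool {
    /// Builds the model from (account id, balance) pairs, keeping insertion order.
    fn from_pairs(pairs: &[(&str, u128)]) -> Self {
        let keys = pairs.iter().map(|(id, _)| id.to_string()).collect();
        let accounts = pairs.iter().map(|(id, b)| (id.to_string(), *b)).collect();
        Pool { keys, accounts }
    }

    /// Returns up to `limit` accounts starting at position `from_index`.
    /// An out-of-range start simply yields an empty page.
    fn get_accounts(&self, from_index: u64, limit: u64) -> Vec<(String, u128)> {
        self.keys
            .iter()
            .skip(from_index as usize)
            .take(limit as usize)
            .map(|id| (id.clone(), self.accounts[id]))
            .collect()
    }
}
```

A client would call `get_accounts(0, n)`, then `get_accounts(n, n)`, and so on until a page comes back shorter than `n`.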
Then this was the vote method that allowed us to transition to phase two of the mainnet.
Next are the two methods that I already described, which allow the owner to pause staking and to resume staking.
The rest are just tests. Most of the logic is happening in the internals.
We also basically have simulation tests for a particular pool. This simulation test is how the network is actually going to work. We first initialized the pool.
Bob is the delegator. Bob calls the pool’s deposit method, attaching the deposit_amount. Then Bob can verify that the unstaked balance is working correctly. Then Bob stakes the amount. Then we check the staked amount now. We verify that Bob has staked the same amount.
Bob calls the ping method. There are no rewards, but in simulations the rewards are not working anyway, so you need to do this manually. We’ll verify once more that Bob’s amount is still the same. Then the pool resumes. We verify that the pool has resumed, then lock to zero. Then we simulate that the pool has acquired some rewards (1 NEAR) and Bob pings the pool. Then we verify that the amount that Bob received is positive. That’s a very simple simulation case in which Bob first deposits to the pool, and which verifies that pause and resume work, or simulates that they work, and makes sure that the pool doesn’t stake while being paused. Then, when resumed, the pool actually stakes. So this test verifies not only this, but also that Bob has acquired the reward and that the reward got distributed. There’s another test that verifies some logic, but that’s more complicated. There are some unit tests at the bottom of this that are supposed to verify certain stuff.
Some of these tests are not ideal but they verify certain stuff that was good enough to make sure that math adds up.
Internal Ping Method
Let’s move on to internal_ping. It is the method that anyone can call through ping to make sure rewards are distributed. Right now we have active staking pools, and there’s an account sponsored by one of the NEAR folks that pings every staking pool every 15 minutes to make sure they have distributed the rewards to display on the balance. That’s how the reward distribution works. We first check the current epoch height: if the epoch height is the same, then the epoch hasn’t changed, and we return false so you don’t need to restake. If the epoch has changed, then we remember the current epoch height and get the new total balance of the account. Ping may be called as part of a call that attached a deposit, and those tokens are already part of the account_balance, so since ping runs first we need to subtract the attached deposit before we distribute the rewards. We get the total amount that the account has, including both locked balance and unlocked balance. Locked balance is the staked amount that acquires rewards, and unlocked balance may also contain rewards in certain scenarios where you decrease your stake, but your rewards will still be reflected for the next two epochs. After that they will come to the unstaked amount. We verify using assert! that the total balance is at least the previous total balance. This is an invariant that the staking pool requires. There was a bunch of stuff on the testnet that happened to fail this invariant, because people still had access keys on the same staking pool account, and when you have one you spend the balance on gas, and you may decrease your total balance without acquiring the reward. Finally we calculate the amount of rewards that the staking pool received. This is the total balance minus the previous known total balance, the balance from the previous epoch. If the rewards are positive we distribute them. The first thing we do is calculate the reward that the owner takes for themselves as a commission.
We multiply the reward_fee_fraction by the total reward received and this is similarly rounded down with the numerator in u256 multiplied by value divided by denominator in u256.
The owners_fee is the amount in yocto NEAR that the owner will keep for themselves. The remaining_reward is the rest of the reward, which has to be restaked. Then it goes on to be restaked. The owner receives the reward in yocto NEAR, not in shares, but because all of the logic has to be in shares, the owner of the staking pool purchases shares at the price after the reward has been distributed to the rest of the delegators. So num_shares is the number of shares that the owner will receive as compensation for running the staking pool. If it’s positive we increase the owner’s amount of shares, save the owner account back, and also increase the total amount of stake shares. If for some reason this number became zero during rounding down, for example because the reward was very small and the price per share was very large, the owner receives zero shares. In that case the fee will just go into the price per share instead of compensating the owner. Next, we log some data: the current epoch height, the rewards we received as an amount of tokens, the total stake balance of the pool, and the number of shares. The only way we expose the number of shares to the external world is through the logs. Next, if the owner received a reward, we log how many shares it was. Lastly, we just remember the new total balance and that’s it. We have distributed all rewards in constant time, and we only updated one account (the owner’s account) for the commission, and only if the commission was positive.
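The reward distribution steps described above can be sketched as a single function on a toy pool. This is a simplified model under stated assumptions: the pool's total balance is taken to equal its staked balance (no unstaked funds), plain u128 arithmetic replaces the wider intermediate, and all field and method names are hypothetical rather than taken from the contract.

```rust
/// Toy pool state needed for one reward distribution.
struct Pool {
    total_staked_balance: u128,
    total_stake_shares: u128,
    last_total_balance: u128,
    fee_numerator: u128,
    fee_denominator: u128,
    owner_shares: u128,
}

impl Pool {
    /// Distributes the reward implied by `total_balance` and returns the
    /// number of shares minted for the owner as commission.
    fn distribute_rewards(&mut self, total_balance: u128) -> u128 {
        // Invariant from the text: the balance must never go down.
        assert!(total_balance >= self.last_total_balance, "balance decreased");
        let reward = total_balance - self.last_total_balance;
        let mut owner_new_shares = 0;
        if reward > 0 {
            let owners_fee = reward * self.fee_numerator / self.fee_denominator;
            let remaining_reward = reward - owners_fee;
            // Delegators' part raises the price for every existing share.
            self.total_staked_balance += remaining_reward;
            // The owner buys shares at the post-distribution price, rounded down.
            owner_new_shares =
                owners_fee * self.total_stake_shares / self.total_staked_balance;
            self.owner_shares += owner_new_shares;
            self.total_stake_shares += owner_new_shares;
            // The fee is staked too, even if it rounded down to zero shares.
            self.total_staked_balance += owners_fee;
        }
        self.last_total_balance = total_balance;
        owner_new_shares
    }
}
```

With 1 000 000 staked, 1 000 000 shares, a 1% fee, and a 10 000 reward, the owner's 100 token fee buys 99 shares at the new price. Only the owner's record and the pool totals change, regardless of how many delegators there are.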
Internal Stake Method
The internal_stake is where we implement the price guarantee fund. Let’s say the predecessor, which in this case we’re going to call account_id, wants to stake an amount of tokens. Balance is actually not a JSON type here, because it’s an internal method, so we don’t need JSON. We calculate how many shares, rounded down, the given amount buys, so this is how many shares the account owner will receive. It has to be positive. Then we calculate the amount the account owner should pay for the shares, again rounded down. This is to guarantee that an account that purchases shares and converts them back without receiving rewards never loses even 1 yocto NEAR, because that would break the guarantee. Finally, we assert that the account has enough to pay the amount charged, and we decrease the internal unstaked balance and increase the internal number of shares of the account. Next we call staked_amount_from_num_shares_rounded_up so that the amount that is actually staked for these shares is rounded up. This 1 extra penny, or 1 extra yocto NEAR, will come from the guarantee fund during the rounding up. We charged the user less, but we contributed more to the staked amount from the 1 trillion yocto NEAR that we had originally designated for this. This difference is usually just 1 yocto NEAR, and it comes from rounding up versus rounding down. After that we increase total_staked_balance by the staked amount and total_stake_shares by the number of shares, which mints the new shares. Finally we put a log and return the result.
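The staking steps described above can be sketched on a toy pool to show where the extra unit from the guarantee fund appears. This is a hedged model, not the contract's code: it uses small whole numbers, plain u128 arithmetic without the wider intermediate, and hypothetical type and method names.

```rust
struct Pool {
    total_staked_balance: u128,
    total_stake_shares: u128,
}

struct Account {
    unstaked: u128,
    stake_shares: u128,
}

impl Pool {
    fn shares_from_amount_rounded_down(&self, amount: u128) -> u128 {
        amount * self.total_stake_shares / self.total_staked_balance
    }

    fn amount_from_shares_rounded_down(&self, shares: u128) -> u128 {
        shares * self.total_staked_balance / self.total_stake_shares
    }

    fn amount_from_shares_rounded_up(&self, shares: u128) -> u128 {
        let divisor = self.total_stake_shares;
        (shares * self.total_staked_balance + divisor - 1) / divisor
    }

    /// Converts part of the account's unstaked balance into shares.
    fn stake(&mut self, account: &mut Account, amount: u128) {
        let num_shares = self.shares_from_amount_rounded_down(amount);
        assert!(num_shares > 0, "amount too small to buy a share");
        // The account is charged the rounded-down value of its new shares.
        let charge_amount = self.amount_from_shares_rounded_down(num_shares);
        assert!(account.unstaked >= charge_amount, "not enough unstaked balance");
        account.unstaked -= charge_amount;
        account.stake_shares += num_shares;
        // The pool records the rounded-up value; the difference (at most one
        // unit here) is what the guarantee fund covers.
        let stake_amount = self.amount_from_shares_rounded_up(num_shares);
        self.total_staked_balance += stake_amount;
        self.total_stake_shares += num_shares;
    }
}
```

At a price of 1.01, staking 1000 units buys 990 shares worth 999.9. The account is charged 999 while the pool's staked balance grows by 1000, so the share price never drops for the other delegators.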
Unstaking works very similarly. You round up the number of shares you need to pay for the amount. Then we calculate the amount you receive, again rounding up, so we slightly overpay you for this. This also comes from the guarantee fund. Then we decrease the shares, increase the unstaked amount, and record that you can unlock the balance in four epochs. The unstake_amount is rounded down so that we unstake slightly less, to guarantee the price for the other participants of the pool. That’s pretty much how the staking pool works and how the math works. We compensate for rounding errors from the fund that we allocated.
The ristretto key requirement came up during the design of this contract, and it was surprising that we needed to account for it. The 1 trillion yocto NEAR in the STAKE_SHARE_PRICE_GUARANTEE_FUND should be enough for 500 billion transactions, which should last long enough for the staking pool. The fund cannot be refilled, because any extra balance would be immediately redistributed as rewards to the total_stake_balance on the next ping. We spent quite a bit of time and effort on this contract, because we did tons of security reviews, both internal and external, especially around this math. That was complicated, and some stuff was discovered, like the ristretto key issue that popped up during the reviews. We maintained the change log of this contract as well: in the readme there’s a bunch of stuff that popped up during the development and testing on the live system. The original version took about a week to write. Later we cleaned it up, tested it and improved it. Then we did a bunch of revisions. Pausing and resuming was asked for by the pool owners, because otherwise the owner had no ability to unstake if their node goes down. In that case they would effectively be attacking the network: this active stake would be requesting a validator seat without running a node. We did not have slashing, so this was not an issue for the participants, but it was an issue for the network itself. That way the owner can pause staking if they don’t want to run the pool or while they migrate it, and they should communicate as much as possible before doing this. Next we updated the vote interface to match the final phase two voting contract. We added helper view methods to be able to query accounts in a human readable way. Finally, there were some improvements around batching methods together, so there are deposit_and_stake, stake_all, unstake_all and withdraw_all instead of having to make a view call first, get the amount, and pass the amount to the stake call. Here is the way we fixed it.
When you stake, not only do you stake the amount, we also attach a promise to check if the stake was successful. It’s needed for two things. First, if you’re trying to stake with an invalid key (a key that is not ristretto specific), then the promise will fail before execution. It will fail validation before it is sent, so you don’t need to check it within the contract. It will revert the last call, and it’s all going to be good. Second, we also introduced the minimum stake on the protocol level. The minimum stake is one tenth of the last seat price, and if your contract tries to stake less than this, then the action will fail after the promise has been sent. Let’s say you want to unstake some amount and you dropped your balance below one tenth of the seat price. The staking action may fail, and you will not unstake, while we need to guarantee that unstaking happens. For this case we have a callback that checks whether the staking action has completed successfully. If it failed and the locked balance is positive, we need to unstake. So it will issue a staking action where the stake amount is zero to make sure that all of the balance is released, and you can withdraw in 4 epochs. This came up during the testing of these contracts that we did on betanet and testnet before mainnet. The contract was ready maybe around summer time, so the testing of this iteration took probably 2-4 months due to the complexity involved in interacting with the protocol. There was quite a lot of learning, from pagination to helper methods and batching together some stuff. One thing that would be really nice to have is stake_all or deposit_and_stake_all on a lockup contract. Right now you have to manually specify how much you want to stake on a lockup contract, but it would be great if you didn’t need to think about your yocto NEAR and how much of it is locked for storage.
You just want to stake everything from your lockup, but because it was already deployed it was too late to think about this. There’s also some gas that is hardcoded, and even with the coming decrease in fees these numbers cannot be changed, because they’re already on chain.
So the gas for vote is not important, but the ON_STAKE_ACTION_GAS constant requires a large amount of gas for every stake, and you cannot decrease it. Restaking actions on every call to this contract will require us to attach a large amount of gas, and the problem is that that’s wasteful. If we ever agree on burning all attached gas, this gas will always be burned and wasted, plus it limits the number of transactions you can put into one block if we restrict gas on that basis. There was lots of iteration on the testing of the contract using the simulation test framework, which we improved a lot. If we get to the lockup contracts eventually, you can see how the structure of lockup contracts improved over this one.
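The unstaking steps described above can be sketched in the same style. This is a model under stated assumptions: whole numbers instead of yocto NEAR, plain u128 arithmetic without the wider intermediate, and hypothetical type, method, and constant names.

```rust
const NUM_EPOCHS_TO_UNLOCK: u64 = 4;

struct Pool {
    total_staked_balance: u128,
    total_stake_shares: u128,
}

struct Account {
    unstaked: u128,
    stake_shares: u128,
    unstaked_available_epoch_height: u64,
}

impl Pool {
    /// ceil(a * b / divisor), used for both round-up conversions.
    fn mul_div_up(a: u128, b: u128, divisor: u128) -> u128 {
        (a * b + divisor - 1) / divisor
    }

    /// Converts shares back into unstaked balance and starts the lock.
    fn unstake(&mut self, account: &mut Account, amount: u128, current_epoch: u64) {
        // Shares to burn for `amount`, rounded up.
        let num_shares =
            Self::mul_div_up(amount, self.total_stake_shares, self.total_staked_balance);
        assert!(num_shares > 0, "amount too small");
        assert!(account.stake_shares >= num_shares, "not enough staked balance");
        // The account receives the rounded-up value of the burned shares.
        let receive_amount =
            Self::mul_div_up(num_shares, self.total_staked_balance, self.total_stake_shares);
        account.stake_shares -= num_shares;
        account.unstaked += receive_amount;
        account.unstaked_available_epoch_height = current_epoch + NUM_EPOCHS_TO_UNLOCK;
        // The pool gives up only the rounded-down value, protecting the price
        // for the remaining delegators; the gap comes from the guarantee fund.
        let unstake_amount = num_shares * self.total_staked_balance / self.total_stake_shares;
        self.total_staked_balance -= unstake_amount;
        self.total_stake_shares -= num_shares;
    }
}
```

In the test values, burning 990 shares credits the account with 1000 units while the pool's staked balance drops by only 999.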