oxen-core/src/cryptonote_core/service_node_voting.h

// Copyright (c) 2018, The Loki Project
//
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without modification, are
// permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice, this list of
// conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright notice, this list
// of conditions and the following disclaimer in the documentation and/or other
// materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its contributors may be
// used to endorse or promote products derived from this software without specific
// prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
// THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
// STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
// THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#pragma once
#include <boost/serialization/base_object.hpp>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string_view>
#include <utility>
#include <vector>
#include "common/periodic_task.h"
#include "cryptonote_basic/cryptonote_basic.h"
#include "cryptonote_basic/tx_extra.h"
namespace cryptonote {
struct tx_verification_context;
struct vote_verification_context;
struct checkpoint_t;
}; // namespace cryptonote
namespace service_nodes {
struct quorum;

struct checkpoint_vote {
    crypto::hash block_hash;
};

struct state_change_vote {
    uint16_t worker_index;
    new_state state;
    uint16_t reason;
};

enum struct quorum_type : uint8_t { obligations = 0, checkpointing, blink, pulse, _count };

inline constexpr std::string_view to_string(const quorum_type& q) {
    switch (q) {
        case quorum_type::obligations: return "obligation";
        case quorum_type::checkpointing: return "checkpointing";
        case quorum_type::blink: return "blink";
        case quorum_type::pulse: return "pulse";
        default: assert(false); return "xx_unhandled_type";
    }
}

enum struct quorum_group : uint8_t { invalid, validator, worker, _count };

struct quorum_vote_t {
    uint8_t version = 0;
    quorum_type type;
    uint64_t block_height;
    quorum_group group;
    uint16_t index_in_group;
    crypto::signature signature;

    union {
        state_change_vote state_change;
        checkpoint_vote checkpoint;
    };

    KV_MAP_SERIALIZABLE

    // TODO(oxen): Not sure whether I want to implement this, but it is needed for the core tests to
    // compile. Not sure I care about serializing for core tests at all.
  private:
    friend class boost::serialization::access;
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {}
};

struct service_node_keys;

quorum_vote_t make_state_change_vote(
        uint64_t block_height,
        uint16_t index_in_group,
        uint16_t worker_index,
        new_state state,
        uint16_t reason,
        const service_node_keys& keys);
quorum_vote_t make_checkpointing_vote(
        cryptonote::hf hf_version,
        crypto::hash const& block_hash,
        uint64_t block_height,
        uint16_t index_in_quorum,
        const service_node_keys& keys);
cryptonote::checkpoint_t make_empty_service_node_checkpoint(
        crypto::hash const& block_hash, uint64_t height);

bool verify_checkpoint(
        cryptonote::hf hf_version,
        cryptonote::checkpoint_t const& checkpoint,
        service_nodes::quorum const& quorum);
bool verify_tx_state_change(
        const cryptonote::tx_extra_service_node_state_change& state_change,
        uint64_t latest_height,
        cryptonote::tx_verification_context& vvc,
        const service_nodes::quorum& quorum,
        cryptonote::hf hf_version);
bool verify_vote_age(
        const quorum_vote_t& vote,
        uint64_t latest_height,
        cryptonote::vote_verification_context& vvc);
bool verify_vote_signature(
        cryptonote::hf hf_version,
        const quorum_vote_t& vote,
        cryptonote::vote_verification_context& vvc,
        const service_nodes::quorum& quorum);
bool verify_quorum_signatures(
        service_nodes::quorum const& quorum,
        service_nodes::quorum_type type,
        cryptonote::hf hf_version,
        uint64_t height,
        crypto::hash const& hash,
        std::vector<quorum_signature> const& signatures,
        const cryptonote::block* block = nullptr);
bool verify_pulse_quorum_sizes(service_nodes::quorum const& quorum);

crypto::signature make_signature_from_vote(
        quorum_vote_t const& vote, const service_node_keys& keys);
crypto::signature make_signature_from_tx_state_change(
        cryptonote::tx_extra_service_node_state_change const& state_change,
        const service_node_keys& keys);

struct pool_vote_entry {
    quorum_vote_t vote;
    uint64_t time_last_sent_p2p;
};

struct voting_pool {
    // return: The vector of votes if the vote is valid (and even if it is not unique), otherwise
    // an empty vector
    std::vector<pool_vote_entry> add_pool_vote_if_unique(
            const quorum_vote_t& vote, cryptonote::vote_verification_context& vvc);

    // TODO(oxen): Review relay behaviour and all the cases when it should be triggered
    void set_relayed(const std::vector<quorum_vote_t>& votes);
    void remove_expired_votes(uint64_t height);
    void remove_used_votes(std::vector<cryptonote::transaction> const& txs, cryptonote::hf version);

    /// Returns relayable votes for either p2p (quorum_relay=false) or quorumnet
    /// (quorum_relay=true). Before HF14 everything goes via p2p; starting in HF14 obligation votes
    /// go via quorumnet, checkpoints go via p2p.
    std::vector<quorum_vote_t> get_relayable_votes(
            uint64_t height, cryptonote::hf hf_version, bool quorum_relay) const;
    bool received_checkpoint_vote(uint64_t height, size_t index_in_quorum) const;

  private:
    std::vector<pool_vote_entry>* find_vote_pool(
            const quorum_vote_t& vote, bool create_if_not_found = false);

    struct obligations_pool_entry {
        explicit obligations_pool_entry(const quorum_vote_t& vote) :
                height{vote.block_height},
                worker_index{vote.state_change.worker_index},
                state{vote.state_change.state} {}
        obligations_pool_entry(const cryptonote::tx_extra_service_node_state_change& sc) :
                height{sc.block_height}, worker_index{sc.service_node_index}, state{sc.state} {}

        uint64_t height;
        uint32_t worker_index;
        new_state state;
        std::vector<pool_vote_entry> votes;

        bool operator==(const obligations_pool_entry& e) const {
            return height == e.height && worker_index == e.worker_index && state == e.state;
        }
    };
    std::vector<obligations_pool_entry> m_obligations_pool;

    struct checkpoint_pool_entry {
        explicit checkpoint_pool_entry(const quorum_vote_t& vote) :
                height{vote.block_height}, hash{vote.checkpoint.block_hash} {}
        checkpoint_pool_entry(uint64_t height, crypto::hash const& hash) :
                height(height), hash(hash) {}

        uint64_t height;
        crypto::hash hash;
        std::vector<pool_vote_entry> votes;

        bool operator==(const checkpoint_pool_entry& e) const {
            return height == e.height && hash == e.hash;
        }
    };
    std::vector<checkpoint_pool_entry> m_checkpoint_pool;

    mutable std::recursive_mutex m_lock;
};
}; // namespace service_nodes
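
// Illustration only, not part of this header: a minimal standalone sketch of the "add if unique"
// behaviour documented on voting_pool::add_pool_vote_if_unique, using a bucket keyed like
// checkpoint_pool_entry (height + hash, compared with operator==). All names below
// (pooled_vote, sketch_pool_entry, sketch_pool, add_if_unique) are hypothetical, the hash is
// reduced to an integer, and the duplicate test (same index_in_group within a bucket) is an
// assumption; the header does not show the real criterion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct pooled_vote {  // hypothetical, reduced stand-in for quorum_vote_t
    uint64_t block_height;
    uint64_t block_hash;  // stand-in for crypto::hash
    uint16_t index_in_group;
};

struct sketch_pool_entry {  // modelled on checkpoint_pool_entry
    uint64_t height;
    uint64_t hash;
    std::vector<pooled_vote> votes;
    bool operator==(const sketch_pool_entry& e) const {
        return height == e.height && hash == e.hash;
    }
};

struct sketch_pool {
    std::vector<sketch_pool_entry> entries;

    // Returns a copy of the bucket's votes after the attempt, whether or not the vote was
    // new; `added` reports if the vote was actually inserted.
    std::vector<pooled_vote> add_if_unique(const pooled_vote& vote, bool& added) {
        sketch_pool_entry key{vote.block_height, vote.block_hash, {}};
        auto it = std::find(entries.begin(), entries.end(), key);
        if (it == entries.end())
            it = entries.insert(entries.end(), std::move(key));  // create the bucket on demand
        auto& votes = it->votes;
        // Assumption: a vote is a duplicate when the same voter index is already present.
        added = std::none_of(votes.begin(), votes.end(), [&](const pooled_vote& v) {
            return v.index_in_group == vote.index_in_group;
        });
        if (added)
            votes.push_back(vote);
        return votes;
    }
};
```

// In the sketch a repeated vote leaves the bucket unchanged but still gets the current votes
// back, matching the "even if it is not unique" wording of the comment on
// add_pool_vote_if_unique. The real function takes a vote_verification_context instead of a
// bool flag.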
template <>
inline constexpr bool formattable::via_to_string<service_nodes::quorum_type> = true;
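
// Illustration only, not part of this header: a minimal standalone sketch of the transport rule
// described on voting_pool::get_relayable_votes (before HF14 everything goes via p2p; from HF14
// obligation votes go via quorumnet and checkpoint votes stay on p2p). The names routed_vote,
// HF_QUORUMNET, goes_via_quorumnet and filter_relayable are hypothetical, hard forks are plain
// integers rather than cryptonote::hf, and the height argument of the real function is ignored.
// The comment does not cover blink or pulse votes; the sketch simply leaves them on p2p.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the quorum_type enum declared in this header.
enum struct quorum_type : uint8_t { obligations = 0, checkpointing, blink, pulse, _count };

struct routed_vote {  // hypothetical, reduced stand-in for quorum_vote_t
    quorum_type type;
    uint64_t block_height;
};

// Assumption: the hard fork at which obligation votes move to quorumnet, as a plain integer.
constexpr int HF_QUORUMNET = 14;

// True when a vote of this type travels over quorumnet rather than p2p.
constexpr bool goes_via_quorumnet(quorum_type type, int hf_version) {
    return hf_version >= HF_QUORUMNET && type == quorum_type::obligations;
}

// Keeps only the votes that belong on the requested transport
// (quorum_relay=true for quorumnet, false for p2p).
inline std::vector<routed_vote> filter_relayable(
        const std::vector<routed_vote>& votes, int hf_version, bool quorum_relay) {
    std::vector<routed_vote> result;
    for (const auto& v : votes)
        if (goes_via_quorumnet(v.type, hf_version) == quorum_relay)
            result.push_back(v);
    return result;
}
```

// Calling filter_relayable once with quorum_relay=false and once with quorum_relay=true
// partitions the votes between the two transports, so no vote is selected for both.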