Move service node list methods to state_t methods
Add querying state from alt blocks and put key image parsing into function
Incorporate hash into state_t to find alt states
Add a way to query alternate testing quorums
Rebase cleanup
This removes a bunch of macros from cryptonote_config.h that either
aren't used at all, or have never applied to Loki, along with
removing/simplifying some of the dead code touching these macros.
A few of these are still used only in the test suite, so I moved them
there instead.
One of these sounds sort of scary -- ALLOW_DEBUG_COMMANDS -- but this
has *always* been forcibly enabled with no way to disable it going back
to the very first monero git commit.
DYNAMIC_FEE_PER_KB_BASE_FEE_V5 was defined in a very strange way that
doesn't make a lot of sense (including using a constant that is not
otherwise applied in loki), so I just replaced it with the expanded
value.
* Don't relay service node votes or uptime proof if synchronising
* Only relay votes if state is > state_synchronizing
Not before. Handshake = no, synchronizing = no.
* Relay votes/uptime to all nodes including those on I2P/TOR.
This converts the stored service_node_info value into a
`shared_ptr<const service_node_info>` rather than a plain
`service_node_info`. This yields a huge performance benefit by
significantly eliminating the vast majority of service_node_info
construction, destruction, and copying.
Most of the time when we copy a service_node_info nothing in it has
changed, which means we're storing exactly the same thing; this means an
extra construction for every SN info on every block *and* an extra
destruction when we cull old stored history. By using a
shared_ptr, the vast majority of those constructions and destructions
are eliminated.
The immediately previous commit (upon which this one builds) already
reduced a full rescan from 180s to 171s; this commit further reduces
that time to 104s, or about 42% reduced from the rescan time required
before this pair of commits. (All timings are from the dev.lokinet.org
box, tested over multiple runs with the entire lmdb cached in memory).
With the shared_ptr approach, we only make a copy when a change is
actually needed: because of infrequent (at the per-SN level) events like
a state_change, received reward, contribution, etc. The contained
reference is deliberately `const` so that values are not changeable;
there's a new function that does an explicit copying duplication,
returning the new non-const and storing the const ref in the shared
pointer.
Related to this is a small change (and fix) to how proof info and
public_ip/storage_port are stored: rather than store the values in the
service_node_info struct itself, they now gets stored in a shared_ptr
inside the service_node_info that intentionally gets shared among all
copies of the service_node_info (that is, a SN info copy deliberately
copies the pointer rather than the values). This also moves the ip/port
values into the proof struct, since that seemed much easier than
maintaining a separate shared_ptr for each value.
Previously, because these were stored as values in the service_node_info
they would actually get rolled back in the event of a reorg, but that
seems highly undesirable: you would end up rolling back to the old
values of the uptime proof and ip address (for example), but that should
not happen: those values are not dependent on the blockchain and so
should not be affected by a reorg/rollback. With this change they
aren't since there is only one actual proof stored.
Note that the shared storage here only applies to in-memory states;
states loaded from the db will still be duplicated.
This commit makes various simplifications and optimizations, mainly in
the service node list code.
All in all, this shaves about 5% of the CPU time used for re-processing
the entire service node list.
In particular:
- changed m_state_history from a std::vector to a std::set that sorts on
height. This is responsible for the bulk of the CPU reduction by
significant reducing the amount of work required for checkpoint
culling, which has to shuffle a lot of `state_t`s around when removing
from the midde of a vector.
- the above also allows replacing the binary-search `std::lower_bound`
complexity with a much simpler `find()`.
- since the history is now always sorted, removed the error messages
related to sorting that either can't happen (on store) or don't matter
(on load).
- Added some converting constructors to simplify code (for example, a
`state_t` can now be very usefully constructed from an r-value
`state_serialized`).
- Many construct + moves (and a couple construct + copy) were replaced
with in-place constructions.
- removed some unused variables
* Update state transition check to account for height and universally set timestamp on recommission
Reject invalidated state changes by their height after HF13
* Prune invalidated state changes on blockchain increment
Simplify check_tx_inputs for state_changes by using service node list
Instead of querying the last 60 historical blocks for every transaction,
use the service node list and determine the state of the service node
and if it can transition to its new state.
We also now enforce at hardfork 13, that the network cannot commit
transactions to the network if they would have been invalidated by
a newer state change that already is already on the blockchain.
This is backwards compatible all the way back to hardfork 9.
Greatly simplify state change tx pruning on block added
Use the new stricter rules for pruning state changes in the txpool
We can do so because pruning the TX pool won't cause issues at the
protocol level, but the more people we can upgrade the better network
behaviour we get in terms of propagating more intrinsically correct
ordering of state changes to other peers.
* Don't generate state changes if not valid, disallow voting if node is non-votable
* Add disabled-by-default quorum storage support
This adds support for storing N blocks of recent expired quorum state
history in lokid (and dumping to/from the lmdb).
This isn't useful for regular nodes (and that's why we don't store it)
but is incredibly useful for the block explorer to be able to report
*which* node got deregistered/decommissioned/etc. by a given
state_change tx.
* Fix quorum states copy
* Add soft forking for checkpointing on mainnet
* Clear checkpoints on softfork on mainnet
* Move softfork date until we're ready, fix test results and modulo addition
* Only round up delete_height if it's not a multiple of CHECKPOINT_INTERVAL
* Remove unused variable soft fork in service_node_rules (replaced by cryptonote_config)
Update the way votes are relayed to fix superfluous p2p disconnections:
- check that votes are actually not older than VOTE_LIFETIME before
relaying.
- Fix off-by-one error in incoming vote handling (it was rejecting on >=
maximum age instead of > maximum age).
- Add a 5-block buffer to the incoming vote tolerance handling: don't
accept a vote that is too new or too old, but also don't trigger a
disconnection if it is within 5 blocks of where it would be
acceptable so that relays from slightly out of sync peers don't
trigger p2p disconnections.
* Add deregistration of checkpoints by checking how many votes are missed
Move uptime proofs and add checkpoint counts in the service_node_list
because we typically prune uptime proofs by time, but it seems we want
to switch to a model where we persist proof data until the node expires
otherwise currently we would prune uptime entries and potentially our
checkpoint vote counts which would cause premature deregistration as the
expected vote counts start mismatching with the number of received
votes.
* Revise deregistration
* Fix test breakages
* uint16_t for port, remove debug false, min votes to 2 in integration mode
* Fix integration build
* Removes sn-public-ip and combines into --service-node
* Only rename the argument, don't combine args for clarity
* Update help message to use --service-node-public-ip
The vast majority of the time `is_active()` is what we actually want.
Most of these are just a name change, but there is also one important
fix here to the next-winner list which needs to use active, not
fully-funded.
epee's is_ip_local is missing two ranges that are commonly found:
link-local auto-config addresses (169.254.0.0/16) that are sometimes
used as a fallback when DHCP isn't present; and the carrier-grade NAT
range (100.64.0.0/10) reserved for carriers who impose NAT on their
customers.
There are also other ranges that aren't exactly "local" but aren't
public either: 0.0.0.0/8 isn't a valid destination address; and
224.0.0.0/3 (includes includes both the 224/4 multicast range, and the
reserved (but will most likely never be used) 240.0.0.0/4 range).
These are now added to a new `is_ip_public` function that returns true
if it's not local or loopback, and not one of these special ranges.
This also simplifies some convoluted netmask logic. (The simplification
would look better except that epee took the extremely bizarre and wrong
decision to store IPv4 addresses in little-endian order).
* core: do not commit half constructed batch db txn
* Add defer macro
* Revert dumb extra copy/move change
* Fix pop_blocks not calling hooks, fix BaseTestDB missing prototypes
* Merge ServiceNodeCheckpointing5 branch, syncing and integration fixes
* Update tests to compile with relaxed-registration changes
* Get back to feature parity pre-relaxed registration changes
* Remove debug changes noticed in code review and some small bugs
The replaces the deregistration mechanism with a new state change
mechanism (beginning at the v12 fork) which can change a service node's
network status via three potential values (and is extensible in the
future to handle more):
- deregistered -- this is the same as the existing deregistration; the
SN is instantly removed from the SN list.
- decommissioned -- this is a sort of temporary deregistration: your SN
remains in the service node list, but is removed from the rewards list
and from any network duties.
- recommissioned -- this tx is sent by a quorum if they observe a
decommissioned SN sending uptime proofs again. Upon reception, the SN
is reactivated and put on the end of the reward list.
Since this is broadening the quorum use, this also renames the relevant
quorum to a "obligations" quorum (since it validates SN obligations),
while the transactions are "state_change" transactions (since they
change the state of a registered SN).
The new parameters added to service_node_rules.h control how this works:
// Service node decommissioning: as service nodes stay up they earn "credits" (measured in blocks)
// towards a future outage. A new service node starts out with INITIAL_CREDIT, and then builds up
// CREDIT_PER_DAY for each day the service node remains active up to a maximum of
// DECOMMISSION_MAX_CREDIT.
//
// If a service node stops sending uptime proofs, a quorum will consider whether the service node
// has built up enough credits (at least MINIMUM): if so, instead of submitting a deregistration,
// it instead submits a decommission. This removes the service node from the list of active
// service nodes both for rewards and for any active network duties. If the service node comes
// back online (i.e. starts sending the required performance proofs again) before the credits run
// out then a quorum will reinstate the service node using a recommission transaction, which adds
// the service node back to the bottom of the service node reward list, and resets its accumulated
// credits to 0. If it does not come back online within the required number of blocks (i.e. the
// accumulated credit at the point of decommissioning) then a quorum will send a permanent
// deregistration transaction to the network, starting a 30-day deregistration count down.
This commit currently includes values (which are not necessarily
finalized):
- 8 hours (240 blocks) of credit required for activation of a
decommission (rather than a deregister)
- 0 initial credits at registration
- a maximum of 24 hours (720 blocks) of credits
- credits accumulate at a rate that you hit 24 hours of credits after 30
days of operation.
Miscellaneous other details of this PR:
- a new TX extra tag is used for the state change (including
deregistrations). The old extra tag has no version or type tag, so
couldn't be reused. The data in the new tag is slightly more
efficiently packed than the old deregistration transaction, so it gets
used for deregistrations (starting at the v12 fork) as well.
- Correct validator/worker selection required generalizing the shuffle
function to be able to shuffle just part of a vector. This lets us
stick any down service nodes at the end of the potential list, then
select validators by only shuffling the part of the index vector that
contains active service indices. Once the validators are selected, the
remainder of the list (this time including decommissioned SN indices) is
shuffled to select quorum workers to check, thus allowing decommisioned
nodes to be randomly included in the nodes to check without being
selected as a validator.
- Swarm recalculation was not quite right: swarms were recalculated on
SN registrations, even if those registrations were include shared node
registrations, but *not* recalculated on stakes. Starting with the
upgrade this behaviour is fixed (swarms aren't actually used currently
and aren't consensus-relevant so recalculating early won't hurt
anything).
- Details on decomm/dereg are added to RPC info and print_sn/print_sn_status
- Slightly improves the % of reward output in the print_sn output by
rounding it to two digits, and reserves space in the output string to
avoid excessive reallocations.
- Adds various debugging at higher debug levels to quorum voting (into
all of voting itself, vote transmission, and vote reception).
- Reset service node list internal data structure version to 0. The SN
list has to be rescanned anyway at upgrade (its size has changed), so we
might as well reset the version and remove the version-dependent
serialization code. (Note that the affected code here is for SN states
in lmdb storage, not for SN-to-SN communication serialization).
This converts the transaction type and version to scoped enum, giving
type safety and making the tx type assignment less error prone because
there is no implicit conversion or comparison with raw integers that has
to be worried about.
This ends up converting any use of `cryptonote::transaction::type_xyz`
to `cryptonote::transaction::txtype::xyz`. For version, names like
`transaction::version_v4` become `cryptonote::txversion::v4_tx_types`.
This also allows/includes various other simplifications related to or
enabled by this change:
- handle `is_deregister` dynamically in serialization code (setting
`type::standard` or `type::deregister` rather than using a
version-determined union)
- `get_type()` is no longer needed with the above change: it is now
much simpler to directly access `type` which will always have the
correct value (even for v2 or v3 transaction types). And though there
was an assertion on the enum value, `get_type()` was being used only
sporadically: many places accessed `.type` directly.
- the old unscoped enum didn't have a type but was assumed castable
to/from `uint16_t`, which technically meant there was potential
undefined behaviour when deserializing any type values >= 8.
- tx type range checks weren't being done in all serialization paths;
they are now. Because `get_type()` was not used everywhere (lots of
places simply accessed `.type` directory) these might not have been
caught.
- `set_type()` is not needed; it was only being used in a single place
(wallet2.cpp) and only for v4 txes, so the version protection code was
never doing anything.
- added a std::ostream << operator for the enum types so that they can be
output with `<< tx_type <<` rather than needing to wrap it in
`type_to_string(tx_type)` everywhere. For the versions, you get the
annotated version string (e.g. 4_tx_types) rather than just the number
4.
* Initial updates to allow syncing of checkpoints in protocol_handler
* Handle checkpoints in prepare_handle_incoming_blocks
* Debug changes for testing, cancel changes later
* Add checkpoint pruning code
* Reduce DB checkpoint accesses, sync votes for up to 60 blocks
* Remove duplicate lifetime variable
* Move parsing checkpoints to above early return for consistency
* Various integration test fixes
- Don't try participate in quorums at startup if you are not a service node
- Add lock for when potentially writing to the DB
- Pre-existing batch can be open whilst updating checkpoint so can't use
DB guards
- Temporarily emit the ban message using msgwriter instead of logs
* Integration mode bug fixes
- Always deserialize empty quorums so get_testing_quorum only fails when
an actual critical error has occurred
- Cache the last culled height so we don't needlessly query the DB over
the same heights trying to delete already-deleted checkpoints
- Submit votes when new blocks arrive for more reliable vote
transportation
* Undo debug changes for testing
* Update incorrect DB message and stale comment
* Incorporate service node ip address into uptime proofs; expose them using rpc
* Check that storage server port is specified in service-node mode
* Remove problematic const, rename argument name for storage port, update comments
* Validate ip address when receive uptime proof
* Better argument names and descriptions
* Initial updates to allow syncing of checkpoints in protocol_handler
* Handle checkpoints in prepare_handle_incoming_blocks
* Parse checkpoints sent by peer
* Fix rebase to dev referencing no longer valid argument
* Unify checkpointing and uptime quorums
* Begin making checkpoints cull old votes/checkpoints
* Begin rehaul of service node code out of core, to assist checkpoints
* Begin overhaul of votes to move resposibility into quorum_cop
* Update testing suite to work with the new system
* Remove vote culling from checkpoints and into voting_pool
* Fix bugs making integration deregistration fail
* Votes don't always specify an index in the validators
* Update tests for validator index member change
* Rename deregister to voting, fix subtle hashing bug
Update the deregister hash derivation to use uint32_t as originally set
not uint64_t otherwise this affects the result and produces different
results.
* Remove un-needed nettype from vote pool
* PR review, use <algorithms>
* Rename uptime_deregister/uptime quorums to just deregister quorums
* Remove unused add_deregister_vote, move side effect out of macro
* Beging adding functions to recalculate difficulty
* Add command line args to utility for recalculating difficulty
* Exception safety for recalculating difficulty
* Update help text for recalc difficulty
* Add recalc flag on the daemon
* Make context be const, signify intent for var++ to 1
* Add functions for storing checkpoints to the DB
* Allocate the DB entry on the stack instead of heap
* Add virtual overrides for new checkpoint functions
* Clean up for pull request, simplify some logic
* Revise API to include height in checkpoint header
* Move log to top of function even if early exit
* Begin moving checkpoints to db
* Allow storing of checkpoints to DB
* Cleanup for code reviewer, fix unit tests
* Fix tests, fix casting issues
* Don't use DUPSORT, use height->checkpoint mapping in DB
* Remove if 0 disabling checkpoint vote, we already check HF12
* Fix unit test infinite loop
* Update db schemas to match blk_checkpoint_header
* Code review