This enables optional support for systemd notification which allows
lokid to be run via `Type=notify`, allowing it to better signal status
to systemd and enables systemd watchdog handling to restart if something
goes wrong.
Enabled here are:
- systemd watchdog ping every 10s
- systemd status update every 10s, so that `systemctl status loki-node`
gives you a status line such as:
Status: "Height: 450085, SN: active, proof: 15m12s, storage: 3m7s, lokinet: 27s"
- initialization notification so that systemd can wait for
and report on initialization status rather than just that the process
has launched.
- shutdown notification
All of these require changing the service type to `Type=notify` in the
`[Service]` section of the systemd service file; enabling the watchdog
also requires adding a `WatchdogSec=5min` line in the `[Service]`
section.
The systemd support is optional and requires the libsystemd-dev package
to be built (and is probably not feasible at all for a static build).
cryptonote_protocol_handler calls `pool.get_blink(hash)` while already
holding a blink shared lock, which should have been
`pool.get_blink(hash, true)` to avoid `get_blink` trying to take its own
lock.
That double lock is undefined behaviour and can cause a deadlock on the
mutex, although it appears rare that it actually does. If it does,
however, this eventually backs up into vote relaying during the idle
loop, which then stalls the idle loop so we stop sending out uptime
proofs (since that is also in the idle loop).
A simple fix here is to add the `true` argument, but on reconsideration
this extra argument to take or not take a lock is messy and error prone,
so this commit instead removes the second argument entirely and instead
documents which call must and must not hold a lock, getting rid of the
three methods (get_blink, has_blink, and add_existing_blink) that had
the `have_lock` argument. This ends up having only a small impact on
calling code - the vast majority of callers already hold a lock, and the
few that don't are easily adjusted.
Handle errors better when long polling is disabled instead of endlessly
spamming logs.
Avoid lock contention when set_daemon is called. Instead of immediately
affecting the long polling thread (which could be engaging the mutex
until RPC timeout, meaning the program stalls for that duration), update
the address on the next iteration of the long polling thread.
Wallets handle daemons that disable long polling better by sleeping.
`--regtest` didn't work in some edge cases, this fixes various things:
- the genesis block wasn't accepted because it needed to be v7, not
vMax
- reduce initial uptime proof delay to 5s in regtest mode
- add --regtest flag to the wallet so that it can talk to a daemon in
--regtest mode.
This also adds two new mining options, available via rpc:
- slow_mining - this avoids the RandomX initialization. It is much
slower, but for regtest with fixed difficulty of 1 that is perfectly
fine.
- `num_blocks` - instruct the miner to mine for the given number of
blocks, then stop. (This can overmine if mining with multiple
threads at a low difficulty, but that's fine).
This can occur when syncing if we get a blink tx before the blocks that
let us determine the quorum. Just ignore it at this point; we'll pick
it up at the next once-per-minute sync run.
This replaces the horrible, horrible, badly misused templated
once_a_time_seconds and once_a_time_milliseconds with a `periodic_task`
that works the same way but takes parameters as constructor arguments
instead of template parameters.
It also makes various small improvements:
- uses std::chrono::steady_clock instead of ifdef'ing platform dependent
timer code.
- takes a std::chrono duration rather than a template integer and
scaling parameter.
- timers can be reset to trigger on the next invocation, and this is
thread-safe.
- timer intervals can be changed at run-time.
This all then gets used to reset the proof timer immediately upon
receiving a ping (initially or after expiring) from storage server and
lokinet so that we send proofs out faster.
quorum_vote_t's were serialized as blob data, which is highly
non-portable (probably isn't the same on non-amd64 arches) and broke
between 5.x and 6.x because `signature` is aligned now (which changed
its offset and thus broke 5.x <-> 6.x vote transmission).
This adds a hack to write votes into a block of memory compatible with
AMD64 5.x nodes up until HF14, then switches to a new command that fully
serializes starting at the hard fork (after which we can remove the
backwards compatibility stuff added here).
Checkpoint votes were only going over quorumnet, which was quite broken
as they need to go out universally: random nodes piece together the
individual checkpoints to create checkpoints on their own, and so need
to see all of the votes. With the current code only service nodes that
participated in the specific quorum ever saw the checkpoint votes.
This commit returns checkpoint vote distribution to distribution over
p2p.
We could, in theory, do checkpoint vote collection over quorumnet and
then relay the set of votes as a pack to reduce p2p traffic a bit, but
this wouldn't be a huge optimization since we'd still have to distribute
all the votes over p2p at some point anyway, so leaving that as a future
optimization for now.
(Obligations votes, in contrast, don't need to be distributed at all --
if a vote results in a state change, the quorum members themselves can
produce and distribute the state change tx).
This extracts uptime proof data entirely from service node states,
instead storing (some) proof data as its own first-class object in the
code and backed by the database. We now persistently store:
- timestamp
- version
- ip & ports
- ed25519 pubkey
and update it every time a proof is received. Upon restart, we load all
the proofs from the database, which means we no longer lose last proof
received times on restart, and always have the most recently received ip
and port information (rather than only having whatever it was in the
most recently stored state, which could be ~12 blocks ago). It also
means we don't need to spam the network with proofs for up to half an
hour after a restart because we now know when we last sent (and
received) our own proof before restarting.
This separation required some restructuring: most notably changing
`state_t` to be constructed with a `service_node_list` pointer, which it
needs both directly and for BlockchainDB access. Related to this is
also eliminating the separate BlockchainDB pointer stored in
service_node_list in favour of just accessing it via m_blockchain (which
now has a new `has_db()` method to detect when it has been initialized).
- Fix assert to use version_t::_count in service_node_list.cpp
- Incoporate staking changes from LNS which revamp construct tx to
derive more of the parameters from hf_version and type. This removes
extraneous parameters that can be derived elsewhere.
Also delegate setting loki_construct_tx_params into wallet2 atleast,
so that callers into the wallet logic, like rpc server, simplewallet
don't need bend backwards to get the HF version whereas the wallet has
dedicated functions for determining the HF.
Rename handle_incoming_txs to handle_parsed_txs, and re-add a
handle_incoming_txs that does the parse+handle_parsed call with the
appropriate lock held for code that doesn't care about being able to
split it into two steps.
Clarifies the locking requirements with a comment in the code, and adds
an incoming tx lock over the blink receiving process in
cryptonote_protocol_handler.inl.
Also fixes a bug in the weight calculation happening because the
blobdata pointer was being invalidated when going through
the singular `handle_parsed_tx()`.
This adds support for bumping conflicting transactions from the mempool
upon receipt of a blink tx if those conflicting txs are not themselves
blinks.
This commit chops the handle_incoming_txs effectively in half into a
parse_incoming_txs phase, followed by a handle_incoming_txs phase which
is needed so that we can first parse the txes then, assuming they parse
successfully, can flag them as blink txes if blink info was included
with them, so that the actual insertion becomes forcing.
There is a stub here for also allowing a rollback of the mutable part of
the chain; implementation of actually doing that rollback will follow in
another commit.
- actually include the blink hashes in the core sync data
- fix cleanup to delete heights in (0, immutable] instead of [0,
immutable); we want to keep 0 because it signifies the mempool, and we
only need blocks after immutable, not immutable itself.
- fixed NOTIFY_REQUEST_GET_TXS to handle mempool txes properly (it was
only searching the chain and ignored missed_txs, but missed_txs are ones
we need to look up in the mempool)
- Add a method to tx_pool (needed for the above) to grab multiple txes
by hash (essentially a multi-tx version of `get_transaction()`), and
change get_transaction() to use it with a single value.
- Added various debugging statements
- Added a bunch of comments to each condition of the preliminary blink
data check condition.
- Don't abort blink addition on a single signature failure: if there are
enough valid signatures we should still accept it.
- Check for blink signature approval when receiving blink signatures;
it's not enough to know that all were added successfully, we also have
to ask the blink tx if it is approved (which does additional checks on
subquorum counts) once we add them all.
This adds blink tx synchronization: when doing a periodic sync with
other nodes each node includes a map of {HEIGHT, HASH} pairs where
HEIGHT is a mined, post-immutable block height and HASH is an xor of all
the tx hashes mined into that block.
If upon receipt the node disagrees about what it thinks the HASH should
be, it can request a list of txes for one or more disagreeing heights,
for which it gets a list of tx hashes of all blink txes mined into that
block.
If it is then missing any of those TXes, this adds an ability to request
that the remote send TXes via the existing NOTIFY_NEW_TRANSACTIONS
mechanism, but with an added flag (so that these don't get relayed).
This originally was going to request the TXes via the existing
NOTIFY_REQUEST_GET_OBJECTS, which has a `txs` field, but unfortunately
any txs passed to it are completely ignored; it is *only* usable for
block synchronization. As such I renamed it and removed the `txs` field
to make the responsibility/capability clearer.
This adds the ability for check_fee() to also check the burn amount.
This requires passing extra info through `add_tx()` (and the various
things that call it), so I took the:
bool keeped_by_block, bool relayed, bool do_not_relay
argument triplet, moved it into a struct in tx_pool.h, then added the other fee
options there (along with some static factory functions for generating the
typical sets of option).
The majority of this commit is chasing that change through the codebase and
test suite.
This is used by blink but should also help LNS and other future burn
transactions to verify a burn amount simply when adding the transation to the
mempool. It supports a fixed burn amount, a burn amount as a multiple of the
minimum tx fee, and also allows you to increase the minimum tx fee (so that,
for example, we could require blink txes to pay miners 250% of the usual
minimum (unimportant) priority tx fee.
- Removed a useless core::add_new_tx() overload that wasn't used anywhere.
Blink-specific changes:
(I'd normally separate these into a separate commit, but they got interwoven
fairly heavily with the above change).
- changed the way blink burning is specified so that we have three knobs for
fee adjustment (fixed burn fee; base fee multiple; and required miner tx fee).
The fixed amount is currently 0, base fee is 400%, and require miner tx fee is
simply 100% (i.e. no different than a normal transaction). This is the same as
before this commit, but is changing how they are being specified in
cryptonote_config.h.
- blink tx fee, burn amount, and miner tx fee (if > 100%) now get checked
before signing a blink tx. (These fee checks don't apply to anyone else --
when propagating over the network only the miner tx fee is checked).
- Added a couple of checks for blink quorums: 1) make sure they have reached
the blink hf; 2) make sure the submitted tx version conforms to the current hf
min/max tx version.
- print blink fee information in simplewallet's `fee` output
- add "typical" fee calculations in the `fee` output:
[wallet T6SCwL (has locked stakes)]: fee
Current fee is 0.000000850 loki per byte + 0.020000000 loki per output
No backlog at priority 1
No backlog at priority 2
No backlog at priority 3
No backlog at priority 4
Current blink fee is 0.000004250 loki per byte + 0.100000000 loki per output
Estimated typical small transaction fees: 0.042125000 (unimportant), 0.210625000 (normal), 1.053125000 (elevated), 5.265625000 (priority), 0.210625000 (blink)
where "small" here is the same tx size (2500 bytes + 2 outputs) used to
estimate backlogs.
- Adds blink signature synchronization and storage through the regular
p2p network
- Adds wallet support (though this is still currently buggy and needs
additional fixes - it sees the tx when it arrives in the mempool but
isn't properly updating when the blink tx gets mined.)
There are a bunch trivial forwarding wrappers in cryptonote_core that
simply call the same method in the pool, and blink would require adding
several more. Instead of all of these existing (and new) wrappers, just
directly expose the tx_pool reference so that anything with a `core`
reference can access and call to the mempool directly.
Pool printing is separately implemented in rpc_command_executor; this
method in tx_pool is both obsolete (missing some fields) and not called
anywhere.
Neither of these have a place in modern C++11; boost::value_initialized
is entirely superseded by `Type var{};` which does value initialization
(or default construction if a default constructor is defined). More
problematically, each `boost::value_initialized<T>` requires
instantiation of another wrapping templated type which is a pointless
price to pay the compiler in C++11 or newer.
Also removed is the AUTO_VAL_INIT macro (which is just a simple macro
around constructing a boost::value_initialized<T>).
BOOST_FOREACH is a similarly massive pile of code to implement
C++11-style for-each loops. (And bizarrely it *doesn't* appear to fall
back to C++ for-each loops even when under a C++11 compiler!)
This removes both entirely from the codebase.
This is the bulk of the work for blink. There is two pieces yet to come
which will follow shortly, which are: the p2p communication of blink
transactions (which needs to be fully synchronized, not just shared,
unlike regular mempool txes); and an implementation of fee burning.
Blink approval, multi-quorum signing, cli wallet and node support for
submission denial are all implemented here.
This overhauls and fixes various parts of the SNNetwork interface to fix
some issues (particularly around non-SN communication with SNs, which
wasn't working).
There are also a few sundry FIXME's and TODO's of other minor details
that will follow shortly under cleanup/testing/etc.
The owning pointer is stashed in a unique_ptr so that it gets properly
destroyed; everything else gets a plain pointer to it.
(Ideally it would be std::optional instead, but that's not until C++17)
This tweaks uptime proofs to go out on the first timer tick in the
58.5-62.5 interval (rather than 60-65) which should result in proofs
usually being very close to the 60 minute mark. (Currently a fair
percentage end up at the 65m mark because the timer fires just before
the 60m mark).
It also tweaks the first uptime proof to go out after 2 minutes instead
of 5 minutes, and uses separate constants
(UPTIME_PROOF_INITIAL_DELAY_SECONDS, UPTIME_PROOF_TIMER_SECONDS) for
times that were previously overloading the
UPTIME_PROOF_BUFFER_IN_SECONDS constant.
For debugging this also adds a cmake option LOKI_DEBUG_SHORT_PROOFS
that, when enabled, makes the whole proof cycle (sending and acceptance)
20x faster, which is very useful for local testnet debugging.
Currently we store it as various different things: 3 separate ints, 2
u16s, 3 separate u16s, and a vector of u16s. This unifies all version
values to a `std::array<uint16_t,3>`.
- LOKI_VERSION_{MAJOR,MINOR,PATCH} are now just LOKI_VERSION
- The previous LOKI_VERSION (C-string of the version) is now renamed
LOKI_VERSION_STR
A related change included here is that the restricted RPC now returns
the major version in the get_info rpc call instead of an empty string
(e.g. "5" instead of ""). There is almost certainly enough difference
in the RPC results to distinguish major versions already so this doesn't
seem like it actually leaks anything significant.
This adds vote relaying via quorumnet.
- The main glue between existing code and quorumnet code is in
cryptonote_protocol/quorumnet.{h,cpp}
- Uses function pointers assigned at initialization to call into the
glue without making cn_core depend on p2p
- Adds ed25519 and quorumnet port to uptime proofs and serialization.
- Added a second uptime proof signature using the ed25519 key to that
SNs are proving they actually have it (otherwise they could spoof
someone else's pubkey).
- Adds quorumnet port, defaulting to zmq rpc port + 1. quorumnet
listens on the p2p IPv4 address (but broadcasts the `public_ip` to the
network).
- Derives x25519 when received or deserialized.
- Converted service_info_info::version_t into a scoped enum (with the
same underlying type).
- Added contribution_t::version_t scoped enum. It was using
service_info_info::version for a 0 value which seemed rather wrong.
Random small details:
- Renamed internal `arg_sn_bind_port` for consistency
- Moved default stagenet ports from 38153-38155 to 38056-38058, and add the default stagenet
quorumnet port at 38059 (this keeps the range contiguous; otherwise the next stagenet default port
would collide with the testnet p2p port).
This allows a `"if_block_not_equal": "hash"` parameter to be given to
`get_n_service_nodes` which, if the given value matches the current
top block hash, skips building and returning a reply.
This is primarily aimed at lokinet which polls fairly frequenty for an
update but where the vast majority of those polls contain no new
information.
It also removes the get_all_service_nodes_public_keys RPC request
completely as lokinet was the only consumer of it (particularly unlikely
that anyone else was using this because it was returning the keys
base32z-encoded).
Neither of these have a place in modern C++11; boost::value_initialized
is entirely superseded by `Type var{};` which does value initialization
(or default construction if a default constructor is defined). More
problematically, each `boost::value_initialized<T>` requires
instantiation of another wrapping templated type which is a pointless
price to pay the compiler in C++11 or newer.
Also removed is the AUTO_VAL_INIT macro (which is just a simple macro
around constructing a boost::value_initialized<T>).
BOOST_FOREACH is a similarly massive pile of code to implement
C++11-style for-each loops. (And bizarrely it *doesn't* appear to fall
back to C++ for-each loops even when under a C++11 compiler!)
This removes both entirely from the codebase.
This adds a tx extra field that specifies an amount of the tx fee that
must be burned; the miner can claim only (txnFee - burnFee) when
including the block.
This will be used for the extra, burned part of blink fees and LNS fees
and any other transaction that requires fee burning in the future.
This allows flushing internal caches (for now, the bad tx cache,
which will allow debugging a stuck monerod after it has failed to
verify a transaction in a block, since it would otherwise not try
again, making subsequent log changes pointless)
* Fix elapsed time in storage server warning message
This was passing the time value rather than the number of seconds so
basically always printed "a long time" instead of the elapsed time.
* Removed pre-HF12 code
* Wait for storage server before sending proof
On startup we send a proof immediately, but this is misleading to the
operator - they may see a proof broadcast over the network suggesting
everything is good even when the storage server is still down. This
makes lokid wait for a first ping before the first proof so that the
user doesn't get mislead.
It also adds the storage server last ping into the `lokid status`
message, such as:
Height: 375634/375634 (100.0%) on mainnet, not mining, net hash 63.00 MH/s, v12, up to date, 1(out)+0(in) connections, uptime 0d 0h 0m 4s
SN: f4558f60b1c4075e469a15411f12d5a747c1c62b44bcbc8523f1a90becc80475 not registered, s.server: NO PING RECEIVED
or:
Height: 375663/375663 (100.0%) on mainnet, not mining, net hash 76.17 MH/s, v12, up to date, 1(out)+0(in) connections, uptime 0d 0h 0m 32s
SN: f4558f60b1c4075e469a15411f12d5a747c1c62b44bcbc8523f1a90becc80475 not registered, s.server: last ping 11 seconds ago
This generates a ed25519 keypair (and from it derives a x25519 keypair)
and broadcasts the ed25519 pubkey in HF13 uptime proofs.
This auxiliary key will be used both inside lokid (starting in HF14) in
places like the upcoming quorumnet code where we need a standard
pub/priv keypair that is usable in external tools (e.g. sodium) without
having to reimplement the incompatible (though still 25519-based) Monero
pubkey format.
This pulls it back into HF13 from the quorumnet code because the
generation code is ready now, and because there may be opportunities to
use this outside of lokid (e.g. in the storage server and in lokinet)
before HF14. Broadcasting it earlier also allows us to be ready to go
as soon as HF14 hits rather than having to wait for every node to have
sent a post-HF14 block uptime proof.
For a similar reason this adds a placeholder for the quorumnet port in
the uptime proof: currently the value is just set to 0 and ignored, but
allowing it to be passed will allow upgraded loki 6.x nodes to start
sending it to each other without having to wait for the fork height so
that they can start using it immediately when HF14 begins.
This puts the SN pubkey/privkey into a single struct (which have
ed25519/x25519 keys added to it in the future), which simplifies various
places to just pass the struct rather than storing and passing the
pubkey and privkey separately.
If the peer (whether pruned or not itself) supports sending pruned blocks
to syncing nodes, the pruned version will be sent along with the hash
of the pruned data and the block weight. The original tx hashes can be
reconstructed from the pruned txes and theur prunable data hash. Those
hashes and the block weights are hashes and checked against the set of
precompiled hashes, ensuring the data we received is the original data.
It is currently not possible to use this system when not using the set
of precompiled hashes, since block weights can not otherwise be checked
for validity.
This is off by default for now, and is enabled by --sync-pruned-blocks
2cd4fd8 Changed the use of boost:value_initialized for C++ list initializer (JesusRami)
4ad191f Removed unused boost/value_init header (whyamiroot)
928f4be Make null hash constants constexpr (whyamiroot)
If a peer views the destination peer as not synchronizing, then the
destination peer should just accept the uptime proof, rather than accept
it and then conditionally relay it depending on whether or not you are
synchronizing at the point of attempting to relay (which you could
transition into synchronizing state interim between accepting and
attempting to relay the proof).
Otherwise we get into a ping-pong situation as follows
Node1 sends uptime ->
Node2 receives uptime and relays it back to Node1 for acknowledgement ->
Node1 receives it, handle_uptime_proof returns true to acknowledge ->
Node1 tries to resend to the same peers again
Instead, if we receive our own uptime proof, then acknowledge but don't
send on. If the we are missing an uptime proof it will have been
submitted automatically by the daemon itself instead of using my own
proof relayed by other nodes.
Move service node list methods to state_t methods
Add querying state from alt blocks and put key image parsing into function
Incorporate hash into state_t to find alt states
Add a way to query alternate testing quorums
Rebase cleanup
This removes a bunch of macros from cryptonote_config.h that either
aren't used at all, or have never applied to Loki, along with
removing/simplifying some of the dead code touching these macros.
A few of these are still used only in the test suite, so I moved them
there instead.
One of these sounds sort of scary -- ALLOW_DEBUG_COMMANDS -- but this
has *always* been forcibly enabled with no way to disable it going back
to the very first monero git commit.
DYNAMIC_FEE_PER_KB_BASE_FEE_V5 was defined in a very strange way that
doesn't make a lot of sense (including using a constant that is not
otherwise applied in loki), so I just replaced it with the expanded
value.
* Don't relay service node votes or uptime proof if synchronising
* Only relay votes if state is > state_synchronizing
Not before. Handshake = no, synchronizing = no.
* Relay votes/uptime to all nodes including those on I2P/TOR.
This converts the stored service_node_info value into a
`shared_ptr<const service_node_info>` rather than a plain
`service_node_info`. This yields a huge performance benefit by
significantly eliminating the vast majority of service_node_info
construction, destruction, and copying.
Most of the time when we copy a service_node_info nothing in it has
changed, which means we're storing exactly the same thing; this means an
extra construction for every SN info on every block *and* an extra
destruction when we cull old stored history. By using a
shared_ptr, the vast majority of those constructions and destructions
are eliminated.
The immediately previous commit (upon which this one builds) already
reduced a full rescan from 180s to 171s; this commit further reduces
that time to 104s, or about 42% reduced from the rescan time required
before this pair of commits. (All timings are from the dev.lokinet.org
box, tested over multiple runs with the entire lmdb cached in memory).
With the shared_ptr approach, we only make a copy when a change is
actually needed: because of infrequent (at the per-SN level) events like
a state_change, received reward, contribution, etc. The contained
reference is deliberately `const` so that values are not changeable;
there's a new function that does an explicit copying duplication,
returning the new non-const and storing the const ref in the shared
pointer.
Related to this is a small change (and fix) to how proof info and
public_ip/storage_port are stored: rather than store the values in the
service_node_info struct itself, they now gets stored in a shared_ptr
inside the service_node_info that intentionally gets shared among all
copies of the service_node_info (that is, a SN info copy deliberately
copies the pointer rather than the values). This also moves the ip/port
values into the proof struct, since that seemed much easier than
maintaining a separate shared_ptr for each value.
Previously, because these were stored as values in the service_node_info
they would actually get rolled back in the event of a reorg, but that
seems highly undesirable: you would end up rolling back to the old
values of the uptime proof and ip address (for example), but that should
not happen: those values are not dependent on the blockchain and so
should not be affected by a reorg/rollback. With this change they
aren't since there is only one actual proof stored.
Note that the shared storage here only applies to in-memory states;
states loaded from the db will still be duplicated.
This commit makes various simplifications and optimizations, mainly in
the service node list code.
All in all, this shaves about 5% of the CPU time used for re-processing
the entire service node list.
In particular:
- changed m_state_history from a std::vector to a std::set that sorts on
height. This is responsible for the bulk of the CPU reduction by
significant reducing the amount of work required for checkpoint
culling, which has to shuffle a lot of `state_t`s around when removing
from the midde of a vector.
- the above also allows replacing the binary-search `std::lower_bound`
complexity with a much simpler `find()`.
- since the history is now always sorted, removed the error messages
related to sorting that either can't happen (on store) or don't matter
(on load).
- Added some converting constructors to simplify code (for example, a
`state_t` can now be very usefully constructed from an r-value
`state_serialized`).
- Many construct + moves (and a couple construct + copy) were replaced
with in-place constructions.
- removed some unused variables
* Update state transition check to account for height and universally set timestamp on recommission
Reject invalidated state changes by their height after HF13
* Prune invalidated state changes on blockchain increment
Simplify check_tx_inputs for state_changes by using service node list
Instead of querying the last 60 historical blocks for every transaction,
use the service node list and determine the state of the service node
and if it can transition to its new state.
We also now enforce at hardfork 13, that the network cannot commit
transactions to the network if they would have been invalidated by
a newer state change that already is already on the blockchain.
This is backwards compatible all the way back to hardfork 9.
Greatly simplify state change tx pruning on block added
Use the new stricter rules for pruning state changes in the txpool
We can do so because pruning the TX pool won't cause issues at the
protocol level, but the more people we can upgrade the better network
behaviour we get in terms of propagating more intrinsically correct
ordering of state changes to other peers.
* Don't generate state changes if not valid, disallow voting if node is non-votable
* Add disabled-by-default quorum storage support
This adds support for storing N blocks of recent expired quorum state
history in lokid (and dumping to/from the lmdb).
This isn't useful for regular nodes (and that's why we don't store it)
but is incredibly useful for the block explorer to be able to report
*which* node got deregistered/decommissioned/etc. by a given
state_change tx.
* Fix quorum states copy
* Add soft forking for checkpointing on mainnet
* Clear checkpoints on softfork on mainnet
* Move softfork date until we're ready, fix test results and modulo addition
* Only round up delete_height if it's not a multiple of CHECKPOINT_INTERVAL
* Remove unused variable soft fork in service_node_rules (replaced by cryptonote_config)
Update the way votes are relayed to fix superfluous p2p disconnections:
- check that votes are actually not older than VOTE_LIFETIME before
relaying.
- Fix off-by-one error in incoming vote handling (it was rejecting on >=
maximum age instead of > maximum age).
- Add a 5-block buffer to the incoming vote tolerance handling: don't
accept a vote that is too new or too old, but also don't trigger a
disconnection if it is within 5 blocks of where it would be
acceptable so that relays from slightly out of sync peers don't
trigger p2p disconnections.
* Add deregistration of checkpoints by checking how many votes are missed
Move uptime proofs and add checkpoint counts in the service_node_list
because we typically prune uptime proofs by time, but it seems we want
to switch to a model where we persist proof data until the node expires
otherwise currently we would prune uptime entries and potentially our
checkpoint vote counts which would cause premature deregistration as the
expected vote counts start mismatching with the number of received
votes.
* Revise deregistration
* Fix test breakages
* uint16_t for port, remove debug false, min votes to 2 in integration mode
* Fix integration build
* Removes sn-public-ip and combines into --service-node
* Only rename the argument, don't combine args for clarity
* Update help message to use --service-node-public-ip
The vast majority of the time `is_active()` is what we actually want.
Most of these are just a name change, but there is also one important
fix here to the next-winner list which needs to use active, not
fully-funded.
epee's is_ip_local is missing two ranges that are commonly found:
link-local auto-config addresses (169.254.0.0/16) that are sometimes
used as a fallback when DHCP isn't present; and the carrier-grade NAT
range (100.64.0.0/10) reserved for carriers who impose NAT on their
customers.
There are also other ranges that aren't exactly "local" but aren't
public either: 0.0.0.0/8 isn't a valid destination address; and
224.0.0.0/3 (includes includes both the 224/4 multicast range, and the
reserved (but will most likely never be used) 240.0.0.0/4 range).
These are now added to a new `is_ip_public` function that returns true
if it's not local or loopback, and not one of these special ranges.
This also simplifies some convoluted netmask logic. (The simplification
would look better except that epee took the extremely bizarre and wrong
decision to store IPv4 addresses in little-endian order).
* core: do not commit half constructed batch db txn
* Add defer macro
* Revert dumb extra copy/move change
* Fix pop_blocks not calling hooks, fix BaseTestDB missing prototypes
* Merge ServiceNodeCheckpointing5 branch, syncing and integration fixes
* Update tests to compile with relaxed-registration changes
* Get back to feature parity pre-relaxed registration changes
* Remove debug changes noticed in code review and some small bugs
The replaces the deregistration mechanism with a new state change
mechanism (beginning at the v12 fork) which can change a service node's
network status via three potential values (and is extensible in the
future to handle more):
- deregistered -- this is the same as the existing deregistration; the
SN is instantly removed from the SN list.
- decommissioned -- this is a sort of temporary deregistration: your SN
remains in the service node list, but is removed from the rewards list
and from any network duties.
- recommissioned -- this tx is sent by a quorum if they observe a
decommissioned SN sending uptime proofs again. Upon reception, the SN
is reactivated and put on the end of the reward list.
Since this is broadening the quorum use, this also renames the relevant
quorum to a "obligations" quorum (since it validates SN obligations),
while the transactions are "state_change" transactions (since they
change the state of a registered SN).
The new parameters added to service_node_rules.h control how this works:
// Service node decommissioning: as service nodes stay up they earn "credits" (measured in blocks)
// towards a future outage. A new service node starts out with INITIAL_CREDIT, and then builds up
// CREDIT_PER_DAY for each day the service node remains active up to a maximum of
// DECOMMISSION_MAX_CREDIT.
//
// If a service node stops sending uptime proofs, a quorum will consider whether the service node
// has built up enough credits (at least MINIMUM): if so, instead of submitting a deregistration,
// it instead submits a decommission. This removes the service node from the list of active
// service nodes both for rewards and for any active network duties. If the service node comes
// back online (i.e. starts sending the required performance proofs again) before the credits run
// out then a quorum will reinstate the service node using a recommission transaction, which adds
// the service node back to the bottom of the service node reward list, and resets its accumulated
// credits to 0. If it does not come back online within the required number of blocks (i.e. the
// accumulated credit at the point of decommissioning) then a quorum will send a permanent
// deregistration transaction to the network, starting a 30-day deregistration count down.
This commit currently includes values (which are not necessarily
finalized):
- 8 hours (240 blocks) of credit required for activation of a
decommission (rather than a deregister)
- 0 initial credits at registration
- a maximum of 24 hours (720 blocks) of credits
- credits accumulate at a rate that you hit 24 hours of credits after 30
days of operation.
Miscellaneous other details of this PR:
- a new TX extra tag is used for the state change (including
deregistrations). The old extra tag has no version or type tag, so
couldn't be reused. The data in the new tag is slightly more
efficiently packed than the old deregistration transaction, so it gets
used for deregistrations (starting at the v12 fork) as well.
- Correct validator/worker selection required generalizing the shuffle
function to be able to shuffle just part of a vector. This lets us
stick any down service nodes at the end of the potential list, then
select validators by only shuffling the part of the index vector that
contains active service indices. Once the validators are selected, the
remainder of the list (this time including decommissioned SN indices) is
shuffled to select quorum workers to check, thus allowing decommisioned
nodes to be randomly included in the nodes to check without being
selected as a validator.
- Swarm recalculation was not quite right: swarms were recalculated on
SN registrations, even if those registrations were include shared node
registrations, but *not* recalculated on stakes. Starting with the
upgrade this behaviour is fixed (swarms aren't actually used currently
and aren't consensus-relevant so recalculating early won't hurt
anything).
- Details on decomm/dereg are added to RPC info and print_sn/print_sn_status
- Slightly improves the % of reward output in the print_sn output by
rounding it to two digits, and reserves space in the output string to
avoid excessive reallocations.
- Adds various debugging at higher debug levels to quorum voting (into
all of voting itself, vote transmission, and vote reception).
- Reset service node list internal data structure version to 0. The SN
list has to be rescanned anyway at upgrade (its size has changed), so we
might as well reset the version and remove the version-dependent
serialization code. (Note that the affected code here is for SN states
in lmdb storage, not for SN-to-SN communication serialization).
This converts the transaction type and version to scoped enum, giving
type safety and making the tx type assignment less error prone because
there is no implicit conversion or comparison with raw integers that has
to be worried about.
This ends up converting any use of `cryptonote::transaction::type_xyz`
to `cryptonote::transaction::txtype::xyz`. For version, names like
`transaction::version_v4` become `cryptonote::txversion::v4_tx_types`.
This also allows/includes various other simplifications related to or
enabled by this change:
- handle `is_deregister` dynamically in serialization code (setting
`type::standard` or `type::deregister` rather than using a
version-determined union)
- `get_type()` is no longer needed with the above change: it is now
much simpler to directly access `type` which will always have the
correct value (even for v2 or v3 transaction types). And though there
was an assertion on the enum value, `get_type()` was being used only
sporadically: many places accessed `.type` directly.
- the old unscoped enum didn't have a type but was assumed castable
to/from `uint16_t`, which technically meant there was potential
undefined behaviour when deserializing any type values >= 8.
- tx type range checks weren't being done in all serialization paths;
they are now. Because `get_type()` was not used everywhere (lots of
places simply accessed `.type` directory) these might not have been
caught.
- `set_type()` is not needed; it was only being used in a single place
(wallet2.cpp) and only for v4 txes, so the version protection code was
never doing anything.
- added a std::ostream << operator for the enum types so that they can be
output with `<< tx_type <<` rather than needing to wrap it in
`type_to_string(tx_type)` everywhere. For the versions, you get the
annotated version string (e.g. 4_tx_types) rather than just the number
4.
* Initial updates to allow syncing of checkpoints in protocol_handler
* Handle checkpoints in prepare_handle_incoming_blocks
* Debug changes for testing, cancel changes later
* Add checkpoint pruning code
* Reduce DB checkpoint accesses, sync votes for up to 60 blocks
* Remove duplicate lifetime variable
* Move parsing checkpoints to above early return for consistency
* Various integration test fixes
- Don't try participate in quorums at startup if you are not a service node
- Add lock for when potentially writing to the DB
- Pre-existing batch can be open whilst updating checkpoint so can't use
DB guards
- Temporarily emit the ban message using msgwriter instead of logs
* Integration mode bug fixes
- Always deserialize empty quorums so get_testing_quorum only fails when
an actual critical error has occurred
- Cache the last culled height so we don't needlessly query the DB over
the same heights trying to delete already-deleted checkpoints
- Submit votes when new blocks arrive for more reliable vote
transportation
* Undo debug changes for testing
* Update incorrect DB message and stale comment
* Incorporate service node ip address into uptime proofs; expose them using rpc
* Check that storage server port is specified in service-node mode
* Remove problematic const, rename argument name for storage port, update comments
* Validate ip address when receive uptime proof
* Better argument names and descriptions
* Initial updates to allow syncing of checkpoints in protocol_handler
* Handle checkpoints in prepare_handle_incoming_blocks
* Parse checkpoints sent by peer
* Fix rebase to dev referencing no longer valid argument
* Unify checkpointing and uptime quorums
* Begin making checkpoints cull old votes/checkpoints
* Begin rehaul of service node code out of core, to assist checkpoints
* Begin overhaul of votes to move resposibility into quorum_cop
* Update testing suite to work with the new system
* Remove vote culling from checkpoints and into voting_pool
* Fix bugs making integration deregistration fail
* Votes don't always specify an index in the validators
* Update tests for validator index member change
* Rename deregister to voting, fix subtle hashing bug
Update the deregister hash derivation to use uint32_t as originally set
not uint64_t otherwise this affects the result and produces different
results.
* Remove un-needed nettype from vote pool
* PR review, use <algorithms>
* Rename uptime_deregister/uptime quorums to just deregister quorums
* Remove unused add_deregister_vote, move side effect out of macro
* Beging adding functions to recalculate difficulty
* Add command line args to utility for recalculating difficulty
* Exception safety for recalculating difficulty
* Update help text for recalc difficulty
* Add recalc flag on the daemon
* Make context be const, signify intent for var++ to 1
* Add functions for storing checkpoints to the DB
* Allocate the DB entry on the stack instead of heap
* Add virtual overrides for new checkpoint functions
* Clean up for pull request, simplify some logic
* Revise API to include height in checkpoint header
* Move log to top of function even if early exit
* Begin moving checkpoints to db
* Allow storing of checkpoints to DB
* Cleanup for code reviewer, fix unit tests
* Fix tests, fix casting issues
* Don't use DUPSORT, use height->checkpoint mapping in DB
* Remove if 0 disabling checkpoint vote, we already check HF12
* Fix unit test infinite loop
* Update db schemas to match blk_checkpoint_header
* Code review
* Collect checkpoint votes and signatures
* Copy deregister vote and adapt for checkpoint vote
Monkey see, monkey do
Write boilerplate code for receiving and relaying votes
* Make current checkpoints service node aware
Fix not saving checkpoint hash for normal checkpoints
Simplify some of the checkpointing API
* Verify blocks against service node checkpoints
Remove checkpoint copy, make checkpoint at vote threshold
Fix null ptr dereferences, add debug print checkpoints
* Add simple node checkpointing w/no conflict resolution
* Use endl to flush output
* Add hash to vote
* Revise checkpoint rules
* Don't store checkpoints in a linked list
* Commit checkpoints on super majority of votes received
* Add locks since checkpoints is accessed in network thread
* Handle checkpoint vote conflicts better
* Comment out early exiting voting process for simplicity
* Clean up for code review
* Fix whitespacing
* More cleanup for review
* Remove debug changes, fix using upper_bound incorrectly, return points by value
* find_if uses == not < as the predicate
* Fix merge issues from rebasing multiple quorum types
This changed the stored proof info to a struct storing both the
timestamp and the major/minor/patch versions that were included in the
last uptime proof.
The version gets exposed as a "service_node_version" 3-int array in the
get_service_nodes RPC call.
This commit also includes a minor non-critical fix to the uptime proof
pruning as part of the changes: deleting while iterating isn't
guaranteed to iterate through all elements before C++14.
The original intent of one false positive a week on average
was not met, since what we really want is not the probability
of having N blocks in T seconds, but either N blocks of fewer
in T seconds, or N blocks or more in T seconds.
Some of this could be cached since it calculates the same fairly
complex floating point values, but it seems pretty fast already.
Current 2.0.x nodes will only accept a v10 network uptime proof if the
proof's snode_version_major is exactly equal to 2 (rather than >= 2), so
the proofs generated by a 3.0.x node won't be accepted by 2.0.x nodes.
This commit fakes the snode version put into an uptime proof prior to
the to the v11 fork to a fictitious 2.3.x version instead of the actual
3.0.x to keep v2 nodes happy with the proof.
(The same issue isn't present for future upgrade: the code in 3.0
properly recognizes >= 3 as valid for sending v11 network proofs).
The 10 minute one will never trigger for 0 blocks, as it's still
fairly likely to happen even without the actual hash rate changing
much, so we add a 20 minute window, where it will (for 0 blocks)
and a one hour window.
This curbs runaway growth while still allowing substantial
spikes in block weight
Original specification from ArticMine:
here is the scaling proposal
Define: LongTermBlockWeight
Before fork:
LongTermBlockWeight = BlockWeight
At or after fork:
LongTermBlockWeight = min(BlockWeight, 1.4*LongTermEffectiveMedianBlockWeight)
Note: To avoid possible consensus issues over rounding the LongTermBlockWeight for a given block should be calculated to the nearest byte, and stored as a integer in the block itself. The stored LongTermBlockWeight is then used for future calculations of the LongTermEffectiveMedianBlockWeight and not recalculated each time.
Define: LongTermEffectiveMedianBlockWeight
LongTermEffectiveMedianBlockWeight = max(300000, MedianOverPrevious100000Blocks(LongTermBlockWeight))
Change Definition of EffectiveMedianBlockWeight
From (current definition)
EffectiveMedianBlockWeight = max(300000, MedianOverPrevious100Blocks(BlockWeight))
To (proposed definition)
EffectiveMedianBlockWeight = min(max(300000, MedianOverPrevious100Blocks(BlockWeight)), 50*LongTermEffectiveMedianBlockWeight)
Notes:
1) There are no other changes to the existing penalty formula, median calculation, fees etc.
2) There is the requirement to store the LongTermBlockWeight of a block unencrypted in the block itself. This is to avoid possible consensus issues over rounding and also to prevent the calculations from becoming unwieldy as we move away from the fork.
3) When the EffectiveMedianBlockWeight cap is reached it is still possible to mine blocks up to 2x the EffectiveMedianBlockWeight by paying the corresponding penalty.
Note: the long term block weight is stored in the database, but not in the actual block itself,
since it requires recalculating anyway for verification.
This curbs runaway growth while still allowing substantial
spikes in block weight
Original specification from ArticMine:
here is the scaling proposal
Define: LongTermBlockWeight
Before fork:
LongTermBlockWeight = BlockWeight
At or after fork:
LongTermBlockWeight = min(BlockWeight, 1.4*LongTermEffectiveMedianBlockWeight)
Note: To avoid possible consensus issues over rounding the LongTermBlockWeight for a given block should be calculated to the nearest byte, and stored as a integer in the block itself. The stored LongTermBlockWeight is then used for future calculations of the LongTermEffectiveMedianBlockWeight and not recalculated each time.
Define: LongTermEffectiveMedianBlockWeight
LongTermEffectiveMedianBlockWeight = max(300000, MedianOverPrevious100000Blocks(LongTermBlockWeight))
Change Definition of EffectiveMedianBlockWeight
From (current definition)
EffectiveMedianBlockWeight = max(300000, MedianOverPrevious100Blocks(BlockWeight))
To (proposed definition)
EffectiveMedianBlockWeight = min(max(300000, MedianOverPrevious100Blocks(BlockWeight)), 50*LongTermEffectiveMedianBlockWeight)
Notes:
1) There are no other changes to the existing penalty formula, median calculation, fees etc.
2) There is the requirement to store the LongTermBlockWeight of a block unencrypted in the block itself. This is to avoid possible consensus issues over rounding and also to prevent the calculations from becoming unwieldy as we move away from the fork.
3) When the EffectiveMedianBlockWeight cap is reached it is still possible to mine blocks up to 2x the EffectiveMedianBlockWeight by paying the corresponding penalty.
The 10 minute one will never trigger for 0 blocks, as it's still
fairly likely to happen even without the actual hash rate changing
much, so we add a 20 minute window, where it will (for 0 blocks)
and a one hour window.
No need to do a round-trip just to call set relayed on votes. Also makes
it more robust by actually checking that we were able to relay the vote
before declaring it as relayed.
* Cleanup and undoing some protocol breakages
* Simplify expiration of nodes
* Request unlock schedules entire node for expiration
* Fix off by one in expiring nodes
* Undo expiring code for pre v10 nodes
* Fix RPC returning register as unlock height and not checking 0
* Rename key image unlock height const
* Undo testnet hardfork debug changes
* Remove is_type for get_type, fix missing var rename
* Move serialisable data into public namespace
* Serialise tx types properly
* Fix typo in no service node known msg
* Code review
* Fix == to >= on serialising tx type
* Code review 2
* Fix tests and key image unlock
* Add command to print locked key images
* Update ui to display lock stakes, query in print cmd blacklist
* Modify print stakes to be less slow
* Remove autostaking code
* Refactor staking into sweep functions
It appears staking was derived off stake_main written separately at
implementation at the beginning. This merges them back into a common
code path, after removing autostake there's only some minor differences.
It also makes sure that any changes to sweeping upstream are going to be
considered in the staking process which we want.
* Display unlock height for stakes
* Begin creating output blacklist
* Make blacklist output a migration step
* Implement get_output_blacklist for lmdb
* In wallet output selection ignore blacklisted outputs
* Apply blacklisted outputs to output selection
* Fix broken tests, switch key image unlock
* Fix broken unit_tests
* Begin change to limit locked key images to 4 globally
* Revamp prepare registration for new min contribution rules
* Fix up old back case in prepare registration
* Remove debug code
* Cleanup debug code and some unecessary changes
* Fix migration step on mainnet db
* Fix blacklist outputs for pre-existing DB's
* Remove irrelevant note
* Tweak scanning addresses for locked stakes
Since we only now allow contributions from the primary address we can
skip checking all subaddress + lookahead to speed up wallet scanning
* Define macro for SCNu64 for Mingw
* Fix failure on empty DB
* Add missing error msg, remove contributor from stake
* Improve staking messages
* Flush prompt to always display
* Return the msg from stake failure and fix stake parsing error
* Tweak fork rules for smaller bulletproofs
* Tweak pooled nodes minimum amounts
* Fix crash on exit, there's no need to store on destructor
Since all information about service nodes is derived from the blockchain
and we store state every time we receive a block, storing in the
destructor is redundant as there is no new information to store.
* Make prompt be consistent with CLI
* Check max number of key images from per user to node
* Implement error message on get_output_blacklist failure
* Remove resolved TODO's/comments
* Handle infinite staking in print_sn
* Atoi->strtol, fix prepare_registration, virtual override, stale msgs
This will trigger if a reorg is seen. This may be used to do things
like stop automated withdrawals on large reorgs.
%s is replaced by the height at the split point
%h is replaced by the height of the new chain
%n is replaced by the number of new blocks after the reorg
b6534c40 ringct: remove unused senderPk from ecdhTuple (moneromooo-monero)
7d375981 ringct: the commitment mask is now deterministic (moneromooo-monero)
99d946e6 ringct: encode 8 byte amount, saving 24 bytes per output (moneromooo-monero)
cdc3ccec ringct: save 3 bytes on bulletproof size (moneromooo-monero)
f931e16c add a bulletproof version, new bulletproof type, and rct config (moneromooo-monero)
* Remove dead branches in hot-path check_tx_inputs
Also renames #define for mixins to better match naming convention
* Shuffle around some more code into common branches
* Fix min/max tx version rules, since there 1 tx v2 on v9 fork
* First draft infinite staking implementation
* Actually generate the right key image and expire appropriately
* Add framework to lock key images after expiry
* Return locked key images for nodes, add request unlock option
* Introduce transaction types for key image unlock
* Update validation steps to accept tx types, key_image_unlock
* Add mapping for lockable key images to amounts
* Change inconsistent naming scheme of contributors
* Create key image unlock transaction type and process it
* Update tx params to allow v4 types and as a result construct_tx*
* Fix some serialisation issues not sending all the information
* Fix dupe tx extra tag causing incorrect deserialisation
* Add warning comments
* Fix key image unlocks parsing error
* Simplify key image proof checks
* Fix rebase errors
* Correctly calculate the key image unlock times
* Blacklist key image on deregistration
* Serialise key image blacklist
* Rollback blacklisted key images
* Fix expiry logic error
* Disallow requesting stake unlock if already unlocked client side
* Add double spend checks for key image unlocks
* Rename get_staking_requirement_lock_blocks
To staking_initial_num_lock_blocks
* Begin modifying output selection to not use locked outputs
* Modify output selection to avoid locked/blacklisted key images
* Cleanup and undoing some protocol breakages
* Simplify expiration of nodes
* Request unlock schedules entire node for expiration
* Fix off by one in expiring nodes
* Undo expiring code for pre v10 nodes
* Fix RPC returning register as unlock height and not checking 0
* Rename key image unlock height const
* Undo testnet hardfork debug changes
* Remove is_type for get_type, fix missing var rename
* Move serialisable data into public namespace
* Serialise tx types properly
* Fix typo in no service node known msg
* Code review
* Fix == to >= on serialising tx type
* Code review 2
* Fix tests and key image unlock
* Add additional test, fix assert
* Remove debug code in wallet
* Fix merge dev problem
* create get_service_node_list rpc call
currently does nothing, just a shell (that compiles)
* implement get_all_service_node_keys rpc call
* change get_all_service_node_keys rpc to use json rpc
also change the result to be vector of hex strings rather than binary keys
* Make nodes be plural, add hex to base32z for keys
Base32z for Lokinet internal usage
* Add option to return fully funded service nodes only
* Add nullptr check for conversion
* Add assert for incorrect usage of to base32z