The recent PR that revamped the connection IDs missed a case when
connecting to service nodes where we store the SN pubkey in peers, but
then fail to find the peer when we look it up by connection id.
This adds the required tracking to fix that case (and adds a test that
fails without the fix here).
The existing code was overly complicated by trying to track indices in
the `connections` vector, which complication happening because things
get removed from `connections` requiring all the internal index values
to be updated. So we ended up with a connection ID inside the
ConnectionID object, plus a map of those connection IDs to the
`connections` index, and need a map back from indices to ConnectionIDs.
Though this seems to work usually, I recently noticed an
oxen-storage-server sending oxend requests on the wrong connection and
so I suspect there is some rare edge cases here where a failed
connection index might not be updated properly.
This PR simplifies the whole thing by making getting rid of connection
ids entirely and keeping the connections in a map (with connection ids
that never change). This might end up being a little less efficient
than the vector, but it's unlikely to matter and the added complexity
isn't worth it.
- Allow up to 200ms (instead of 100ms) for the things we are waiting on
to become available, to prevent occasional spurious failures.
- Add unscoped info for how long we waited.
- Avoid calling into oxenmq with the catch lock held in the "hey google"
tests (because this will deadlock if the oxenmq call invokes any
logging).
- Replace an old std::cerr logger with the updated catch2 logger.
This commit adds support for listening on new ports after startup. This
will make things easier in storage server, in particular, where we want
to delay listening on public ports until we have an established
connection and initial block status update from oxend.
I realized after merging the previous PR that it is difficult to
correctly pass ownership into a timer, because something like:
TimerID x = omq.add_timer([&] { omq.cancel_timer(x); }, 5ms);
doesn't work when the timer job needs to outlive the caller. My next
approach was:
auto x = std::make_shared<TimerID>();
*x = omq.add_timer([&omq, x] { omq.cancel_timer(*x); }, 5ms);
but this has two problems: first, TimerID wasn't default constructible,
and second, there is no guarantee that the assignment to *x happens
before (and is visible to) the access for the cancellation.
This commit fixes both issues: TimerID is now default constructible, and
an overload is added that takes the lvalue reference to the TimerID to
set rather than returning it (and guarantees that it will be set before
the timer is created).
Updates `add_timer` to return a new opaque TimerID object that can later
be passed to `cancel_timer` to cancel an existing timer.
Also adds timer tests, which was omitted (except for one in the tagged
threads section), along with a new test for timer deletion.
Storage server, in particular, needs to disable pubkey-based routing on
its connection to oxend (because it is sharing oxend's own keys), but
wants it by default for SS-to-SS connections. This allows the oxend
connection to turn it off so that we don't have oxend omq connections
replacing each other.
This provides an interface for sending a reply to a message later (i.e.
after the Message& itself is no longer valid) by using a new
`send_later()` method of the Message instance that returns an object
that can properly route replies (and can outlive the Message it was
called on).
Intended use is:
run_this_lambda_later([send=msg.send_later()] {
send.reply("content");
});
which is equivalent to:
run_this_lambda_later([&msg] {
msg.send_reply("content");
});
except that it works properly even if the lambda is invoked beyond the
lifetime of `msg`.
Decoding into a std::byte output iterator was not working because the
`*out++ = val` assignment doesn't work when the output is std::byte and
val is a char/unsigned char/uint8_t. Instead we need to explicitly
cast, but figuring out what we have to cast to is a little bit tricky.
This PR makes it work (and bumps the version for this and the is_hex
fix).
`is_hex()` is a bit misleading as `from_hex()` requires an even-length
hex string, but `is_hex()` also allows odd-length hex strings, which
means currently callers should be doing `if (lokimq::is_hex(str) &&
str.size() % 2 == 0)`, but probably aren't.
Since the main point of `lokimq/hex.h` is for byte<->hex conversions it
doesn't make much sense to allow `is_hex()` to return true for something
that can't be validly decoded via `from_hex()`, thus this PR changes it
to return false.
If someone *really* wants to test for an odd-length hex string (though
I'm skeptical that there is a need for this), this also exposes
`is_hex_digit` so that they could use:
bool all_hex = std::all_of(str.begin(), str.end(), lokimq::is_hex_digit<char>)
This is making lokimq headers & static lib get installed when lokimq is
used as a project subdirectory, which is very annoying.
This adds an option for enabling the install lines, and only enables it
if doing a shared library or a top-level project build.
The thread_local `std::map` here can end up being destructed *before*
the LokiMQ instance (if both are being destroyed during thread joining),
in which case we segfault by trying to use the map. Move the owning
container into the LokiMQ instead (indexed by the thread) to prevent
that.
Also cleans this code up by:
- Don't close control sockets from the proxy thread; socket_t's aren't
necessarily thread safe so this could be causing issues where we trouble
double-closing or using a closed socket.
- We can just let them get closed during destruction of the LokiMQ.
- Avoid needing shared_ptr's; instead we can just use a unique pointer
with raw pointers in the thread_local cache. This simplifies closing
because all closing will happen during the LokiMQ destruction.
Apple, in particular, often fails tests with an address already in use
if attempt to reuse a port that the process just closed, because it is a
wonderful OS.
Add var::get/var::visit implementations of std::get/std::visit that get
used if compiling for an old macos target, and use those.
The issue is that on a <10.14 macos target Apple's libc++ is missing
std::bad_variant_access, and so any method that can throw it (such as
std::get and std::visit) can't be used. This workaround is ugly, but
such is life when you want to support running on Apple platforms.
On the wire they are just lists, but this lets you put tuples onto and
pull tuples off of the wire. (Also supports std::pair).
Supports direct serialization (via bt_serialize()/bt_deserialize()),
list/dict consumer deserialization, and conversion from a bt_value or
bt_list via a new bt_tuple() function.