This replaces the NIH epee http server, which does not work all that
well, with an external C++ library called uWebSockets. Fundamentally
this gives the following advantages:
- Much less code to maintain
- Just one thread for handling HTTP connections versus epee's pool of
threads
- Uses existing LokiMQ job server and existing thread pool for handling
the actual tasks; they are processed/scheduled in the same "rpc" or
"admin" queues as lokimq rpc calls. One notable benefit is that "admin"
rpc commands get their own queue (and thus cannot be delayed by long rpc
commands). Currently the lokimq threads and the http rpc thread pool
and the p2p thread pool and the job queue thread pool and the dns lookup
thread pool and... are *all* different thread pools; this is a step
towards consolidating them.
- Very little mutex contention (which has been a major problem with epee
RPC in the past): there is one mutex (inside uWebSockets) for putting
responses back into the thread managing the connection; everything
internally gets handled through (lock-free) lokimq inproc sockets.
- Faster RPC performance on average, and much better worst case
performance. Epee's http interface seems to have some race condition
that occasionally stalls a request (even a very simple one) for a dozen
or more seconds for no good reason.
- Long polling gets redone here to no longer need threads; instead we
just store the request and respond to it later, either from the thread
pool or else from a timer (that runs once/second) for timing out long
polls.
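
A minimal sketch of the thread-free long polling idea described in the last item. Everything here is hypothetical (`LongPollStore`, `PendingPoll`, the integer connection id standing in for a stored response handle); it is not the lokid implementation, only an illustration of parking requests and answering them later from either a data notification or a periodic timer.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; none of these come from lokid or uWebSockets.
using Clock = std::chrono::steady_clock;

struct PendingPoll {
    int connection_id;           // stand-in for a stored response handle
    Clock::time_point deadline;  // when this long poll should time out
};

class LongPollStore {
public:
    // Park a request instead of blocking a thread on it.
    void add(int connection_id, Clock::time_point deadline) {
        pending_.push_back({connection_id, deadline});
    }

    // Called when new data is available: every parked request gets a reply.
    std::vector<std::pair<int, std::string>> notify(const std::string& data) {
        std::vector<std::pair<int, std::string>> replies;
        for (const auto& p : pending_)
            replies.emplace_back(p.connection_id, data);
        pending_.clear();
        return replies;
    }

    // Called from a periodic timer: reply to (and drop) expired requests only.
    std::vector<std::pair<int, std::string>> expire(Clock::time_point now) {
        std::vector<std::pair<int, std::string>> replies;
        std::vector<PendingPoll> still_waiting;
        for (const auto& p : pending_) {
            if (p.deadline <= now)
                replies.emplace_back(p.connection_id, "timeout");
            else
                still_waiting.push_back(p);
        }
        pending_ = std::move(still_waiting);
        return replies;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::vector<PendingPoll> pending_;
};
```

In this sketch the timer callback would call `expire(Clock::now())` once per second, and no thread ever sleeps while a poll is pending.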
---
The basic idea of how this works from a high level:
We launch a single thread to handle HTTP RPC requests and response data.
This uWebSockets thread is essentially running an event loop: it never
actually handles any logic; it only serves to shuttle data that arrives
in a request to some other thread, and then, at some later point, to
send some reply back to that waiting connection. Everything is
asynchronous and non-blocking here: the basic uWebSockets event loop
just operates as things arrive, passes it off immediately, and goes back
to waiting for the next thing to arrive.
The basic flow is like this:
0. uWS thread -- listens on localhost:22023
1. uWS thread -- incoming request on localhost:22023
2. uWS thread -- fires callback, which injects the task into the LokiMQ job queue
3. LMQ main loop -- schedules it as an RPC job
4. LMQ rpc thread -- Some LokiMQ thread runs it, gets the result
5. LMQ rpc thread -- Result gets queued up for the uWS thread
6. uWS thread -- takes the result and starts sending it
(asynchronously) back to the requestor.
In more detail:
uWebSockets has registered handlers for non-jsonrpc
requests (legacy JSON or binary). If the port is restricted then admin
commands get mapped to an "Access denied" response handler, otherwise
public commands (and admin commands on an unrestricted port) go to the
rpc command handler.
POST requests to /json_rpc have their own handler; this is a little
different than the above because it has to parse the request before it
can determine whether it is allowed or not, but once this is done it
continues roughly the same as legacy/binary requests.
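
The handler selection for legacy/binary endpoints can be sketched roughly as follows. The `Handler` enum, the `Command` struct and `select_handler` are hypothetical names made up for this illustration; the real code registers uWebSockets callbacks rather than returning an enum.

```cpp
#include <string>

// Hypothetical types for illustration; not the actual lokid definitions.
enum class Handler { rpc_command, access_denied };

struct Command {
    std::string name;
    bool is_admin;  // false means the command is public
};

// Decide, at registration time, which handler a command's endpoint gets:
// admin commands on a restricted port are wired to "Access denied",
// everything else goes to the normal rpc command handler.
Handler select_handler(const Command& cmd, bool restricted_port) {
    if (cmd.is_admin && restricted_port)
        return Handler::access_denied;
    return Handler::rpc_command;
}
```

For /json_rpc the same decision would have to be made per request, after parsing, since the command name is inside the request body.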
uWebSockets then listens on the given IP/port for new incoming requests,
and starts listening for requests in a thread (we own this thread).
When a request arrives, it fires the event handler for that request.
(This may happen multiple times, if the client is sending a bunch of
data in a POST request). Once we have the full request, we then queue
the job in LokiMQ, putting it in the "rpc" or "admin" command
categories. (The one practical difference here is that "admin" is
configured to be allowed to start up its own thread if all other threads
are busy, while "rpc" commands are prioritized along with everything
else.) LokiMQ then schedules this, along with native LokiMQ "rpc." or
"admin." requests.
When an LMQ worker thread becomes available, the RPC command gets called
in it and runs. Whatever output it produces (or error message, if it
throws) then gets wrapped up in jsonrpc boilerplate (if necessary), and
delivered to the uWebSockets thread to be sent in reply to that request.
uWebSockets picks up the data and sends whatever it can without
blocking, then buffers whatever it couldn't send to be sent again in a
later event loop iteration once the requestor can accept more data.
(This part is outside lokid; we only have to give uWS the data and let
it worry about delivery).
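
The hand-off described above can be illustrated with a small standard-library sketch. This is not how lokid, LokiMQ or uWebSockets are actually wired together: `ReplyQueue`, `run_rpc` and `handle_all` are hypothetical, one short-lived `std::thread` per job stands in for the LokiMQ worker pool, and the single mutex only mimics the role of the one inside uWebSockets.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Replies travelling from worker threads back to the connection thread.
class ReplyQueue {
public:
    void push(std::size_t request_id, std::string body) {
        std::lock_guard lock{mutex_};
        replies_.emplace_back(request_id, std::move(body));
    }
    // Take everything queued so far, leaving the queue empty.
    std::vector<std::pair<std::size_t, std::string>> drain() {
        std::lock_guard lock{mutex_};
        return std::exchange(replies_, {});
    }

private:
    std::mutex mutex_;
    std::vector<std::pair<std::size_t, std::string>> replies_;
};

// Stand-in for an RPC command; a real one would do actual work.
std::string run_rpc(const std::string& request) { return "result:" + request; }

std::map<std::size_t, std::string> handle_all(const std::vector<std::string>& requests) {
    ReplyQueue queue;
    std::vector<std::thread> workers;
    // Steps 1-3: the connection thread hands each complete request off
    // and immediately goes back to waiting; it runs no RPC logic itself.
    for (std::size_t id = 0; id < requests.size(); ++id)
        workers.emplace_back([&queue, &requests, id] {
            // Steps 4-5: a worker runs the command and queues the result.
            queue.push(id, run_rpc(requests[id]));
        });
    for (auto& w : workers)
        w.join();
    // Step 6: the connection thread picks up the results for sending.
    std::map<std::size_t, std::string> sent;
    for (auto& [id, body] : queue.drain())
        sent.emplace(id, std::move(body));
    return sent;
}
```

The point of the shape is that the only shared state between the workers and the connection thread is the reply queue.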
---
PR specifics:
Things removed from this PR:
1. ssl settings; with this PR the HTTP RPC interface is plain-text. The
previous default generated a self-signed certificate for the server on
startup and then the client accepted any certificate. This is actually
*worse* than unencrypted because it is entirely MITM-readable and yet
might make people think that their RPC communication is encrypted, and
setting up actual certificates is difficult enough that I think most
people don't bother.
uWebSockets *does* support HTTPS, and we could glue the existing options
into it, but I'm not convinced it's worthwhile: it works much better to
put HTTPS in a front-end proxy holding the certificate that proxies
requests to the backend (which can then listen in restricted mode on
some localhost port). One reason this is better is that it is much
easier to reload and/or restart such a front-end server, while
certificate updates with lokid require a full restart. Another reason
is that you get an error page instead of a timeout if something is wrong
with the backend. Finally we also save having to generate a temporary
certificate on *every* lokid invocation.
2. HTTP Digest authentication. Digest authentication is obsolete (and
was already obsolete when it got added to Monero). HTTP-Digest was
originally an attempt to provide a password authentication mechanism
that does not leak the password in transit, but still required that the
server know the password. It only has marginal value against replay
attacks, and is made entirely obsolete by sending traffic over HTTPS
instead. No client out there supports Digest but *not* Basic auth, and
so given the limited usefulness it seems pointless to support more than
Basic auth for HTTP RPC login.
What's worse is that epee's HTTP Digest authentication is a terrible
implementation: it uses boost::spirit -- a recursive descent parser
meant for building complex language grammars -- just to parse a single
HTTP header for Digest auth. This is a big load of crap that should
never have been accepted upstream, and that we should get rid of (even
if we wanted to support Digest auth it takes less than 100 lines of code
to do it when *not* using a recursive descent parser).
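
To give a sense of scale for that claim, here is a rough sketch of tokenizing a Digest `Authorization` header value with nothing but `std::string_view`. It is deliberately simplified and is an assumption-laden illustration, not a compliant implementation: `parse_digest_header` is a hypothetical name, the scheme match is case-sensitive, whitespace around `=` is not handled, and none of the actual hash verification is shown.

```cpp
#include <map>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Strip leading spaces, tabs and commas from the view.
inline void skip_separators(std::string_view& in) {
    while (!in.empty() && (in.front() == ' ' || in.front() == '\t' || in.front() == ','))
        in.remove_prefix(1);
}

// Parse e.g. `Digest username="bob", nc=00000001` into key/value pairs.
// Returns std::nullopt on malformed input.
std::optional<std::map<std::string, std::string>> parse_digest_header(std::string_view in) {
    constexpr std::string_view scheme = "Digest ";
    if (in.substr(0, scheme.size()) != scheme)
        return std::nullopt;
    in.remove_prefix(scheme.size());

    std::map<std::string, std::string> fields;
    for (skip_separators(in); !in.empty(); skip_separators(in)) {
        auto eq = in.find('=');
        if (eq == std::string_view::npos || eq == 0)
            return std::nullopt;
        std::string key{in.substr(0, eq)};
        in.remove_prefix(eq + 1);

        std::string value;
        if (!in.empty() && in.front() == '"') {
            // Quoted value: runs to the closing quote; a backslash
            // escapes the character that follows it.
            in.remove_prefix(1);
            bool closed = false;
            while (!in.empty()) {
                char c = in.front();
                in.remove_prefix(1);
                if (c == '"') { closed = true; break; }
                if (c == '\\') {
                    if (in.empty()) return std::nullopt;
                    c = in.front();
                    in.remove_prefix(1);
                }
                value += c;
            }
            if (!closed) return std::nullopt;
        } else {
            // Unquoted token: runs to the next comma (or the end).
            auto end = in.find(',');
            value = std::string{in.substr(0, end)};
            in.remove_prefix(end == std::string_view::npos ? in.size() : end);
        }
        fields[std::move(key)] = std::move(value);
    }
    return fields;
}
```

Even with the missing pieces added, a hand-written scanner of this shape stays small.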
With the current approach -fPIC wasn't getting used when building
external libraries. I tried moving add_subdirectory(external) down
below where we set flags, but that results in a slew of warnings because
it *also* turns on a bunch of warnings that aren't safe in the various
external code.
Switch loki dev branch to C++17 compilation, and update the code with
various C++17 niceties.
- stop including the (deprecated) lokimq/string_view.h header and
instead switch everything to use std::string_view and `""sv` instead of
`""_sv`.
- std::string_view is much nicer than epee::span, so updated various
loki-specific code to use it instead.
- made epee "portable storage" serialization accept a std::string_view
instead of const lvalue std::string so that we can avoid copying.
- switched from mapbox::variant to std::variant
- use `auto [a, b] = whatever()` instead of `T1 a; T2 b; std::tie(a, b)
= whatever()` in a couple places (in the wallet code).
- switch to std::lock(...) instead of boost::lock(...) for simultaneous
lock acquisition. boost::lock() won't compile in C++17 mode when given
locks of different types.
- removed various pre-C++17 workarounds, e.g. for fold expressions,
unused argument attributes, and byte-spannable object detection.
- class template argument deduction means lock types no longer have to specify
the mutex, so `std::unique_lock<std::mutex> lock{mutex}` can become
`std::unique_lock lock{mutex}`. This will make switching any mutex
types (e.g. from boost to std mutexes) far easier as you just have to
update the type in the header and everything should work. This also
makes the tools::unique_lock and tools::shared_lock methods redundant
(which were a sort of poor-mans-pre-C++17 way to eliminate the
redundancy) so they are now gone and replaced with direct unique_lock or
shared_lock constructions.
- Redid the LNS validation using a string_view; instead of using raw
char pointers the code now uses a string view and chops off parts of the
view as it validates. So, for instance, it starts with "abcd.loki",
validates the ".loki" and chops the view to "abcd", then validates the
first character and chops to "bcd", validates the last and chops to
"bc", then can just check everything remaining for is-valid-middle-char.
- LNS validation gained a couple minor validation checks in the process:
- slightly tightened the requirement on lokinet addresses to require
that the last character of the mapped address is 'y' or 'o' (the
last base32z char holds only one significant bit).
- In parse_owner_to_generic_owner made sure that the owner value has
the correct size (otherwise we could end up not filling or
overfilling the pubkey buffer).
- Replaced base32z/base64/hex conversions with lokimq's versions which
have a nicer interface, are better optimized, and don't depend on epee.
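
The "validate a piece, then chop it off the view" approach from the LNS item above can be sketched like this. The character rules are assumptions made for the example (lowercase alphanumerics at the ends, hyphens allowed in the middle) and `validate_name` is a hypothetical function; the real LNS rules live in the loki source and may differ.

```cpp
#include <string_view>

using namespace std::literals;

// Assumed character classes, for illustration only.
inline bool is_alnum_lower(char c) {
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}
inline bool is_valid_middle(char c) { return is_alnum_lower(c) || c == '-'; }

bool validate_name(std::string_view name) {
    constexpr auto suffix = ".loki"sv;
    // Validate the suffix, then chop it: "abcd.loki" -> "abcd".
    if (name.size() <= suffix.size() ||
        name.substr(name.size() - suffix.size()) != suffix)
        return false;
    name.remove_suffix(suffix.size());

    // Validate the first character, then chop it: "abcd" -> "bcd".
    if (!is_alnum_lower(name.front()))
        return false;
    name.remove_prefix(1);

    // Validate the last character (if any is left): "bcd" -> "bc".
    if (!name.empty()) {
        if (!is_alnum_lower(name.back()))
            return false;
        name.remove_suffix(1);
    }

    // Whatever remains only has to pass the middle-character check.
    for (char c : name)
        if (!is_valid_middle(c))
            return false;
    return true;
}
```

Because a `std::string_view` carries its own length, there is no pointer arithmetic and no separate end pointer to keep in sync.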
-Ofast turns on -ffast-math, which allows gcc to do IEEE-violating FP
math optimizations. The consequence is that a Release build and a Debug
build produce *different* difficulty values. (It may also have
contributed to difficulty value divergence).
Turn it off because we are not Gentoo.
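
As a small illustration of why this matters (it does not reproduce the actual difficulty calculation): floating point addition is not associative, and -ffast-math permits the compiler to reassociate it. The two hypothetical helpers below compute the same sum in the two possible groupings; built without fast-math they give different answers for the same inputs, so any build that is allowed to switch between groupings can change results.

```cpp
// Sum grouped as (a + b) + c.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// Sum grouped as a + (b + c).
double sum_right(double a, double b, double c) { return a + (b + c); }
```

With `a = 1e20`, `b = -1e20`, `c = 1.0` the left grouping cancels the large values first and keeps the 1.0, while the right grouping loses the 1.0 when it is added to -1e20.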
There is actually little point in setting CMAKE_CXX_FLAGS_RELEASE at all
here: adding `-DNDEBUG` is already the default for a cmake release
build, and cmake similarly has a release build default optimization
level (which I think is `-O3` -- which is already what we're doing with
`-Ofast` on Linux, except that `-Ofast` turns on unsafe stuff on top of
`-O3`).
For Debug builds, this removes -Og and lets cmake use its default (-O0),
which is usually better when developing: lack of optimization means
faster builds and binaries that are easier to trace. The downside is
enormous binaries (~250MB), but that seems at least manageable for a
debug build.
From a cmake build dir (`make` used for simple example; can also be
something else, such as ninja):
make create_tarxz
make create_zip
creates a loki-<OS>[-arch]-x.y.z[-dev]-<GITHASH>.tar.xz or .zip.
make create_archive
decides what to do based on the build type: creates a .zip for a windows
build, a tar.xz for anything else. (We have been distributing the macOS
binaries as a .zip but that seems unnecessary: I tested on our dev mac
and a .tar.xz offers exactly the same UX as a .zip, but is noticeably
smaller).
From the top-level makefile there is also a new `make
release-full-static-archive` that does a full static build (including
all deps) and builds the archive.
This adds a static dependency script for libraries like boost, unbound,
etc. to cmake, invokable with:
cmake .. -DBUILD_STATIC_DEPS=ON
which downloads and builds static versions of all our required
dependencies (boost, unbound, openssl, ncurses, etc.). It also implies
-DSTATIC=ON to build other vendored deps (like miniupnpc, lokimq) as
static as well.
Unlike the contrib/depends system, this is easier to maintain (one
script using nicer cmake with functions instead of raw Makefile
spaghetti code), and isn't concerned with reproducible builds -- this
doesn't rebuild the compiler, for instance. It also works with the
existing build system so that it is simply another way to invoke the
cmake build scripts but doesn't require any external tooling.
This works on Linux, Mac, and Windows.
Some random comments on this commit (for preserving history):
- Don't use target_link_libraries on imported targets. Newer cmake is
fine with it, but Bionic's cmake doesn't like it; it does seem okay with
setting the properties directly.
- This rebuilds libzmq and libsodium, even though there is some
provision already within loki-core to do so: however, the existing
embedded libzmq fails with the static deps because it uses libzmq's
cmake build script, which relies on pkg-config to find libsodium which
ends up finding the system one (or not finding any), rather than the one
we build with DownloadLibSodium. Since both libsodium and libzmq are
fairly simple builds it seemed easiest to just add them to the cmake
static build rather than trying to shoehorn the current code into the
static build script.
- Half of the protobuf build system ignores CC/CXX just because Google,
and there's no documentation anywhere except for a random closed bug
report about needing to set these other variables (CC_FOR_BUILD,
CXX_FOR_BUILD) instead, but you need to. Thanks Google.
- The boost build is set to output very little because even the minimum
-d1 output level spams ~15k lines of output just for the headers it
installs.
This makes three big changes to how translation files are generated:
- use Qt5 cmake built-in commands to do the translations rather than
calling lrelease directly. lrelease is often not in the path, while Qt5
cmake knows how to find and invoke it.
- Slam the resulting files into a C++ file using a cmake script rather
than needing to compile a .c file to generate C++ file. This is
simpler, but more importantly avoids the mess needed when cross
compiling of having to import a cmake script from an external native
build.
- In the actual generated files, use an unordered_map rather than a
massive list of static variable pointers.
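
A rough idea of what a generated file using the unordered_map approach might look like. The function names, the file name and the contents are placeholders invented for this sketch; the actual generated code may be laid out differently.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical shape of the generated file: one map from file name to the
// raw file contents. The explicit length keeps embedded NUL bytes intact.
const std::unordered_map<std::string, std::string>& embedded_translations() {
    static const std::unordered_map<std::string, std::string> files{
        {"example_fr.qm", std::string("QM\0data", 7)},
        {"example_de.qm", std::string("QM\0more", 7)},
    };
    return files;
}

// Look up an embedded file by name; empty optional if it does not exist.
std::optional<std::string_view> find_translation(const std::string& name) {
    const auto& files = embedded_translations();
    if (auto it = files.find(name); it != files.end())
        return std::string_view{it->second};
    return std::nullopt;
}
```

Adding a language then only means one more entry in the map rather than a new static variable plus a new branch in the lookup code.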
CMake 3.9+ has generic LTO enabling code, switch to it.
Update required cmake version to 3.10 (3.9 is probably sufficient, but
3.10 is bionic's version that we're actually testing).
static_assert is required both by C++11 and C11; if we don't have a
standard compliant compiler then compilation should fail, not be hacked
like this.
The C++ version of this definition is particularly preposterous; the C
version is probably just covering up that the C code forgot to include
the `<assert.h>` header where the `static_assert` macro is defined.
There's really no reason to submodule it - we work with pretty much any
libunbound version, and it's a very commonly available library
(comparable to sqlite3 or boost, which we don't submodule).
This removes the submodule and switches it to a hard dependency.
This triggers a pile of false positives from gtest and mapbox variant.
In the case of gtest, these were being hidden by including gtest as a
system include, which was simply disgusting.
The gtest version bundled inside tests/ is ancient (7 years old) and
doesn't build properly for some compilers.
Replace it with a current gtest submodule in external/.
Using an external project to build a subdirectory is gross, and moreover
it breaks if you are trying to use a custom compiler and uses the wrong
one (or just fails if a `c++` binary doesn't exist).
Since the builds appear to run just fine without this, just include it
via add_subdirectory instead.
Link readline directly into epee; having a separate epee_readline
library is not saving anything since we have it widely linked anyway.
Conditionally linking it to epee simplifies a bit of CMake code.
Also simplify how epee detects readline to just look for a `readline`
target, which we now only set up if we find readline in the top-level
CMakeLists.txt
The policy needs to be set before the `link_dep_libs` function (it was
before when it was in `external`, but this started generating warnings
again when I moved it from there to the top level CMakeLists.txt).
This will require a one-time change when we merge to master (to delete
the `-dev`) but after that it should just sit there without needing any
modifications on `master` or `dev`. (And letting it be specified via
cmake arguments will let me slightly simplify the debs which currently
have to add a vaguely similar patch to get the debian version into the
version string).
CMake already provides variables to handle the version major/minor/patch
if we give it the dotted version in the `project()` command. Using it
significantly reduces the amount of macro stuff we have to do in
version.cpp.in, and it seems a little nicer to have it defined in the
project top level rather than buried in a needs-to-be-processed .cpp
file.
This moves the release codename there, too, so that it stays being
defined in essentially the same place as the version.
This change here requires some minor tweaking of the version generation
code to do it in two steps (when we have git): the first
(`src/version.cpp.in` -> `build/version.cpp.in`) replaces all the main
version variables during cmake configuration, the second
(`build/version.cpp.in` -> `build/version.cpp`) then replaces the
VERSIONTAG at build time. (Before this commit, there was only version
tag replacement that only happened at build time).
Also bumps up the version here (since I'm moving it anyway) to match
master's 7.1.8.
Not needed anywhere else. This change is mainly for Android: compiling
Boost::locale means also compiling ICU, which can get complicated and
which we're able to sidestep completely.
- set WITH_SYSTEMD=OFF in contrib/depends so that we don't try to go
look for it on the host system.
- link icu libs properly
- don't build embedded zmq but rather set up a target so that loki-mq
uses the externally built one
- fix OpenSSL::Crypto not properly depending on ws2_32
- build boost atomic in the built boost because boost::thread depends on
it (and we were just getting lucky before by not happening to touch
anything that needed it)
- make bundled unbound link against the `extra` interface for various
required windows crap (ws2_32 and other stuff).
- updating to latest loki-mq (1.0.0 + various linking fixes)
- BUILD_SHARED_LIBS was being handled very strangely; make it a full
option instead (defaulting to off) that a cmake invoker can specify, as
per cmake recommendations.
- travis ci tweaks/changes:
- Add a static bionic build
- Simplify cmake argument code
- Add `--version` invocation for lokid and loki-wallet-cli to test
that the binaries were linked properly.
- always build an embedded sodium statically; if we do it dynamically
and an older system one exists we are going to have trouble.
- don't force epee and blocks to be static; rather they get controlled
by the above BUILD_SHARED_LIBS, just like all the other internal
libraries.
- use some PkgConfig:: imported targets rather than bunch-of-variables.
- use kitware upstream cmake instead of building it from source
- use NPROC everywhere
- upgrade openssl to 1.1.1d
- don't install libzmq (just let loki-mq build it)
- boost 1.72
- various tweaks to build parameters to speed up/correct build a bit
Some submodules can request a libsodium version and set SODIUM_*
variables before us. Combining with force downloading we can end up
building dependencies with differing versions of Sodium.
This adds the loki-mq dependency and replaces SNNetwork with it (along
with some syntax updates for how loki-mq changed a bit from SNNetwork).
This also replaces common/hex.h and common/string_view.h with loki-mq's
faster (hex) and more complete and tested (string_view) implementations.
The archaic (i.e. decade old) cmake usage here really got in the way of
trying to properly use newer libraries (like lokimq), so this undertakes
overhauling it considerably to make it much more sane (and significantly
reduce the size).
I left more of the architecture-specific bits in the top-level
CMakeLists.txt intact; most of the efforts here are about properly
loading dependencies, specifying dependencies and avoiding a whole pile
of cmake antipatterns.
This bumps the required cmake version to 3.5, which is what xenial comes
with.
- extensive use of interface libraries to include libraries,
definitions, and include paths
- use Boost::whatever instead of ${Boost_WHATEVER_LIBRARY}. The
interface targets are (again) much better as they also give you any
needed include or linking flags without needing to worry about them.
- don't list header files when building things. This has *never* been
correct cmake usage (cmake has always known how to scan the headers that
.cpp files include to know about build changes).
- remove the loki_add_library monstrosity; it breaks target names and
makes compiling less efficient because the author couldn't figure out
how to link things together.
- make loki_add_executable take the output filename, and set the output
path to bin/ and install to bin because *every single usage* of
loki_add_executable was immediately followed by setting the output
filename and setting the output path to bin/ and installing to bin.
- move a bunch of crap that is only used in one particular
src/whatever/CMakeLists.txt into that particular CMakeLists.txt instead
of the top level CMakeLists.txt (or src/CMakeLists.txt).
- Remove a bunch of redundant dependencies; most of them look like they
were just copy-and-pasted in, and many more aren't needed (since they
are implied by the PUBLIC linking of other dependencies).
- Removed `die` since it just does a FATAL_ERROR, but adds color (which
is useless since CMake already makes FATAL_ERRORs perfectly visible).
- Change the way LOKI_DAEMON_AND_WALLET_ONLY works to just change the
make targets to daemon and simplewallet rather than changing the build
process (this should make it faster, too, since there are various other
things that will be excluded).