Commit Graph

541 Commits

Author SHA1 Message Date
Jason Rhinelander 4a79ce9599
1.2.16 release 2023-09-28 11:15:29 -03:00
Jason Rhinelander 8b827cef4e
Rediff patches 2023-09-28 11:15:23 -03:00
Jason Rhinelander e43fde7618
Merge remote-tracking branch 'origin/stable' into ubuntu/bionic 2023-09-28 11:15:21 -03:00
Jason Rhinelander 5b8597d308
Merge remote-tracking branch 'origin/dev' into stable 2023-09-28 11:07:54 -03:00
Jason Rhinelander dc7fb35493
Merge pull request #88 from jagerman/epoll
epoll: always retrieve events from triggered sockets
2023-09-16 12:24:57 -03:00
Jason Rhinelander caadd35052
epoll: fix hang on heavily loaded sockets
This fixes a hang in the epoll code that triggers on heavy, bursty
connections (such as the live SPNS APNs notifier).

It turns out that side-effects of processing our sockets could leave
other sockets (that we processed earlier in the loop) in a
needs-attention state which we might not notice if we go back to
epoll_wait right away.  zmq::poll apparently takes care of this (and so
is safe to re-poll even in this state), but when we are using epoll we
need to worry about it by always checking for zmq events (which itself
has side effects) and, if we get any, re-enter the loop body immediately
*without* polling to deal with them.
2023-09-15 18:29:23 -03:00
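The re-check described in this commit can be illustrated with a small model. This is a sketch under stated assumptions, not the OxenMQ code: FakeSocket, Handler and drain_before_blocking are hypothetical names, and a plain in-memory queue stands in for a zmq socket. It only shows the control flow: after handling every socket, look again for pending events and, if any exist, run the loop body again instead of returning to the blocking wait.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a socket: just a queue of pending messages.
struct FakeSocket {
    std::deque<std::string> inbox;
    bool has_events() const { return !inbox.empty(); }
};

// A handler may queue new messages on any socket as a side effect
// (it must not add or remove sockets).
using Handler =
        std::function<void(std::size_t index, const std::string& msg, std::vector<FakeSocket>& socks)>;

// Handles all pending messages and returns how many passes over the sockets
// were needed before it became safe to go back to a blocking wait.
std::size_t drain_before_blocking(std::vector<FakeSocket>& socks, const Handler& handle) {
    std::size_t passes = 0;
    bool pending = true;  // we were woken up, so assume there is work
    while (pending) {
        passes++;
        for (std::size_t i = 0; i < socks.size(); i++) {
            while (!socks[i].inbox.empty()) {
                std::string msg = std::move(socks[i].inbox.front());
                socks[i].inbox.pop_front();
                handle(i, msg, socks);
            }
        }
        // Re-check every socket: handling a later socket may have left an
        // earlier one needing attention, which a blocking wait could miss.
        pending = false;
        for (const auto& s : socks) {
            if (s.has_events()) {
                pending = true;
                break;
            }
        }
    }
    return passes;
}
```

With two sockets where handling a message on the second queues a reply on the first, the function needs a second pass; without the re-check that reply would sit unprocessed until something else woke the loop.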
Jason Rhinelander fd58ab9cac
Merge pull request #87 from jagerman/epoll
Add epoll support for Linux (huge proxy thread CPU reduction)
2023-09-14 16:22:51 -03:00
Jason Rhinelander 8f97add30f
Add epoll support for Linux
Each call to zmq::poll is painfully slow when we have many open zmq
sockets, such as when we have 1800 outbound connections (i.e. connected
to every other service node, as service nodes might sometimes have and
the Session push notification server *always* has).

In testing on my local Ryzen 5950 system each time we go back to
zmq::poll incurs about 1.5ms of (mostly system) CPU time with 2000 open
outbound sockets, and so if we're being pelted with a nearly constant
stream of requests (such as happens with the Session push notification
server) we incur massive CPU costs every time we finish processing
messages and go back to wait (via zmq::poll) for more.

In testing a simple ZMQ (no OxenMQ) client/server that establishes 2000
connections to a server, and then has the server send a message back on
a random connection every 1ms, we get atrocious CPU usage: the proxy
thread spends a constant 100% CPU time.  Virtually all of this is in the
poll call itself, though, so we aren't really bottlenecked by how much
can go through the proxy thread: in such a scenario the poll call uses
its CPU then returns right away, we process the queue of messages, and
return to another poll call.  If we have lots of messages received in
that time, though (because messages are coming fast and the poll was
slow) then we process a lot all at once before going back to the poll,
so the main consequences here are that:

1) We use a huge amount of CPU
2) We introduce latency in a busy situation because the CPU has to make
   the poll call (e.g. 1.5ms) before the next message can be processed.
3) If traffic is very bursty then the latency can manifest another
   problem: in the time it takes to poll we could accumulate enough
   incoming messages to overfill our internal per-category job queue,
   which was happening in the SPNS.

(I also tested with 20k connections, and the poll time scaling was
linear: we still processed everything, but in larger chunks because
every poll call took about 15ms, and so we'd have about 15 messages at a
time to process with added latency of up to 15ms).

Switching to epoll *drastically* reduces the CPU usage in two ways:

1) It's massively faster by design: there's a single setup and
   communication of all the polling details to the kernel which we only
   have to do when our set of zmq sockets changes (which is relatively
   rare).
2) We can further reduce CPU time because epoll tells us *which* sockets
   need attention, and so if only 1 connection out of the 2000 sent us
   something we need only check that single socket for
   messages.  (In theory we can do the same with zmq::poll by querying
   for events available on the socket, but in practice it doesn't
   improve anything over just trying to read from them all).

In my straight zmq test script, using epoll instead reduced CPU usage in
the sends-every-1ms scenario from a constant pegged 100% of a core to an
average of 2-3% of a single core.  (Moreover this CPU usage level didn't
noticeably change when using 20k connections instead of 2k).
2023-09-14 15:03:15 -03:00
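The two properties of epoll listed in this commit (one-time registration, and being told which descriptors are ready) can be seen in a small Linux-only sketch. It makes several assumptions: ReadyPoller and demo_ready_ids are hypothetical names, plain pipes stand in for connections, and nothing here touches ZeroMQ, whose sockets have their own notification semantics that this example does not cover.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

#include <array>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of OxenMQ): owns one epoll instance.
class ReadyPoller {
  public:
    ReadyPoller() : epfd_{epoll_create1(0)} {
        if (epfd_ < 0) throw std::runtime_error{"epoll_create1 failed"};
    }
    ~ReadyPoller() { close(epfd_); }
    ReadyPoller(const ReadyPoller&) = delete;
    ReadyPoller& operator=(const ReadyPoller&) = delete;

    // One-time setup per descriptor; `id` is reported back when fd is readable.
    void add(int fd, std::uint32_t id) {
        epoll_event ev{};
        ev.events = EPOLLIN;
        ev.data.u32 = id;
        if (epoll_ctl(epfd_, EPOLL_CTL_ADD, fd, &ev) != 0)
            throw std::runtime_error{"epoll_ctl failed"};
    }

    // Waits up to timeout_ms and returns only the ids that are readable.
    std::vector<std::uint32_t> wait(int timeout_ms) {
        std::array<epoll_event, 64> events;
        int n = epoll_wait(epfd_, events.data(), static_cast<int>(events.size()), timeout_ms);
        if (n < 0) throw std::runtime_error{"epoll_wait failed"};
        std::vector<std::uint32_t> ready;
        for (int i = 0; i < n; i++) ready.push_back(events[i].data.u32);
        return ready;
    }

  private:
    int epfd_;
};

// Demo: opens `count` pipes, makes only pipe `active` readable, and returns
// the ids epoll reports.  (Error paths leak descriptors; fine for a sketch.)
std::vector<std::uint32_t> demo_ready_ids(std::uint32_t count, std::uint32_t active) {
    ReadyPoller poller;
    std::vector<std::array<int, 2>> pipes(count);
    for (std::uint32_t i = 0; i < count; i++) {
        if (pipe(pipes[i].data()) != 0) throw std::runtime_error{"pipe failed"};
        poller.add(pipes[i][0], i);  // registered once, not on every wait
    }
    char byte = 'x';
    if (write(pipes[active][1], &byte, 1) != 1) throw std::runtime_error{"write failed"};
    auto ready = poller.wait(100);
    for (auto& p : pipes) {
        close(p[0]);
        close(p[1]);
    }
    return ready;
}
```

Calling demo_ready_ids(100, 42) registers 100 descriptors but gets back only the one that has data, so the caller never has to probe the other 99.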
Jason Rhinelander e1b66ced48
Update oxen-encoding submodule 2023-08-28 18:46:54 -03:00
Jason Rhinelander 4f6dc35ea1
Merge remote-tracking branch 'origin/dev' into stable 2023-07-17 13:53:17 -03:00
Jason Rhinelander 4f3ee28784
Bump version 2023-07-17 13:50:00 -03:00
Jason Rhinelander bd3e2cdfb0
Merge pull request #85 from jagerman/random-string-redux
Redo random string generation
2023-04-28 15:52:49 -03:00
Jason Rhinelander b8bb10eac5
Redo random string generation
This is probably slightly more efficient (as it avoids going through
uniform_int_distribution), but more importantly, won't trigger some of
Apple's new xcode buggy crap.
2023-04-04 12:16:43 -03:00
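One way to generate random strings without std::uniform_int_distribution is to copy the generator's raw output into the string. The sketch below is an illustration only: the commit message does not say which generator or approach OxenMQ uses, random_bytes is a hypothetical name, and std::mt19937_64 is an assumption chosen because it yields full 64-bit values (it is not a cryptographic generator).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <string>

// Hypothetical helper: fills a string with `size` random bytes taken directly
// from the generator's 64-bit outputs, with no distribution object involved.
std::string random_bytes(std::size_t size, std::mt19937_64& rng) {
    std::string result(size, '\0');
    std::size_t pos = 0;
    while (pos < size) {
        std::uint64_t chunk = rng();  // 8 random bytes per draw
        std::size_t n = std::min(sizeof(chunk), size - pos);
        std::memcpy(&result[pos], &chunk, n);
        pos += n;
    }
    return result;
}
```

Because every bit of the 64-bit output is used as is, each draw yields eight bytes instead of one bounded integer per call.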
Jason Rhinelander b33fabd2a6
Drop long-deprecated liblokimq-dev package 2022-10-05 20:38:51 -03:00
Jason Rhinelander dadcafc519
1.2.14 release 2022-10-05 20:32:52 -03:00
Jason Rhinelander df65f8857f
Rediff patches 2022-10-05 20:32:51 -03:00
Jason Rhinelander bc5a8d747a
Merge branch 'stable' into ubuntu/bionic 2022-10-05 20:32:49 -03:00
Jason Rhinelander ac6ef82ff6
Merge branch 'dev' into stable 2022-10-05 20:27:29 -03:00
Jason Rhinelander ff0e515c51
Fix installed headers
- Remove more deprecated shim headers
- Remove the gone (and newly gone) headers from the install list
- Add missing pubsub.h to install list
2022-10-05 20:26:34 -03:00
Jason Rhinelander ae57e27c4b
Merge remote-tracking branch 'origin/dev' into stable 2022-10-05 19:40:25 -03:00
Jason Rhinelander 2e308d4f43
Merge pull request #82 from oxen-io/fix-race-condition
Attempt to fix a race condition
2022-10-05 19:35:28 -03:00
Jason Rhinelander 445f214840
Fix a race condition with tagged thread startup
There's a very rare race condition where a tagged thread doesn't seem to
exist when the proxy tries syncing startup with it, and so the proxy
thread hangs in startup.

This addresses it by avoiding looking at the `proxy_thread` variable
(which probably isn't thread safe) in the worker's startup, and
signalling the you-need-to-shutdown condition via a third option for the
(formerly boolean) `tagged_go`.
2022-10-05 19:32:54 -03:00
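The idea of turning a boolean start flag into a three-state signal can be sketched as follows. Everything here is an assumption made for illustration: the enumerator names, the StartGate type, and the use of a mutex with a condition variable are not taken from OxenMQ, whose actual synchronization mechanism is not described in the commit message.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical three-state replacement for a boolean "go" flag.
enum class TaggedGo { wait, go, shutdown };

struct StartGate {
    std::mutex mutex;
    std::condition_variable cv;
    TaggedGo state = TaggedGo::wait;

    // Called by the coordinating thread to release the workers.
    void signal(TaggedGo decision) {
        {
            std::lock_guard<std::mutex> lock{mutex};
            state = decision;
        }
        cv.notify_all();
    }

    // Called by a worker: blocks until a decision is made and returns true
    // if the worker should run, false if it should exit immediately.
    bool wait_for_start() {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return state != TaggedGo::wait; });
        return state == TaggedGo::go;
    }
};

// Demo: starts one worker, sends it `decision`, and reports whether the
// worker went on to run its body.
bool worker_ran_after(TaggedGo decision) {
    StartGate gate;
    bool ran = false;
    std::thread worker{[&] {
        if (gate.wait_for_start()) ran = true;
    }};
    gate.signal(decision);
    worker.join();  // join makes the write to `ran` visible here
    return ran;
}
```

The worker learns everything it needs from the gate's own state, so it never has to inspect another thread's variables to decide whether to shut down.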
Jason Rhinelander 358005df06
Merge pull request #80 from tewinget/pubsub
initial implementation of generic pub/sub management
2022-09-28 16:48:13 -03:00
Thomas Winget 85437d167b
initial implementation of generic pub/sub management
Implements a generic pub/sub system for RPC endpoints to allow clients
to subscribe to things.

patch version bump

tests included and passing
2022-09-28 15:43:45 -04:00
Jason Rhinelander b26fe8cb04
Merge pull request #81 from jagerman/remove-deprecated
Remove deprecated code
2022-09-28 14:47:49 -03:00
Jason Rhinelander df19d1dd94
Add sid workaround
lsb_release -sc on sid currently prints 'n/a' because of debian bugs
1020893 and 1008735.  Add a workaround.

Also bumps clang builds to latest version.
2022-09-28 14:00:05 -03:00
Jason Rhinelander 25f714371b
Remove deprecated code
- Removes the old lokimq name compatibility shims
- Removes the oxenmq::bt* -> oxenc::bt* shim headers
2022-09-28 13:28:48 -03:00
Jason Rhinelander d4e5d6ef57
rebuild for updated oxenc 2022-08-31 13:47:57 -03:00
Jason Rhinelander bd5a43d3c7
Bump oxen-encoding dep 2022-08-31 13:47:29 -03:00
Jason Rhinelander b56d8fea9a
Ubuntu cmake hack 2022-08-31 13:37:32 -03:00
Jason Rhinelander 461029aa40
Drop bionic arm builds
Ubuntu has apparently broken bionic arm, yay!
2022-08-31 13:31:10 -03:00
Jason Rhinelander f2593a9437
1.2.13 release 2022-08-31 12:53:35 -03:00
Jason Rhinelander 5ac7ca4a91
Rediff patches 2022-08-31 12:53:34 -03:00
Jason Rhinelander 1576f52251
Merge remote-tracking branch 'origin/stable' into ubuntu/bionic 2022-08-31 12:53:32 -03:00
Jason Rhinelander 0858dd278b
oxen-encoding submodule to latest tagged release 2022-08-31 12:00:07 -03:00
Jason Rhinelander a93c16af04
oxen-encoding submodule to latest tagged release 2022-08-31 11:59:24 -03:00
Jason Rhinelander aea9b98b0c
Merge remote-tracking branch 'origin/dev' into stable 2022-08-31 11:57:34 -03:00
Jason Rhinelander 057685b7c0
Merge pull request #79 from jagerman/socket-limits
Fix zmq socket limit setting
2022-08-31 11:57:22 -03:00
Jason Rhinelander 3a3ffa7d23
Increase ulimit on macos
The test suite is now running out of file descriptors, because of
macos's default tiny limit.
2022-08-31 11:49:44 -03:00
Jason Rhinelander edcde9246a
Fix zmq socket limit setting
MAX_SOCKETS wasn't working properly because ZMQ uses it when the context
is initialized, which happens when the first socket is constructed on
that context.

For OxenMQ, we had several sockets constructed on the context during
OxenMQ construction, which meant the context_t was being initialized
during OxenMQ construction, rather than during start(), and so setting
MAX_SOCKETS would have no effect and you'd always get the default.

This fixes it by making all the member variable zmq::socket_t's
default-constructed, then replacing them with proper zmq::socket_t's
during start() so that we also defer zmq::context_t initialization to
the right place.

A second issue found during testing (also fixed here) is that creating
the socket a worker thread uses to communicate with the proxy could fail
if it would violate the zmq max sockets limit, which wound up throwing
an uncaught exception and aborting.  This pre-initializes (but doesn't
connect) sockets for all potential worker threads during start() so
that the lazily-initialized worker thread will have one already set up
rather than having to create a new one (which could fail).
2022-08-05 10:40:01 -03:00
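The ordering problem in this commit can be modelled without ZeroMQ. The sketch below is a simplified stand-in based only on the behaviour the commit message describes: ModelContext, ModelSocket and Service are hypothetical types, and the default limit is a placeholder value for the model. The context freezes its socket limit when the first socket is created, so members are default-constructed as empty handles and only replaced with real sockets in start().

```cpp
#include <stdexcept>

// Hypothetical model of a context whose limit is fixed at first socket creation.
class ModelContext {
  public:
    // Only takes effect before the first socket initializes the context.
    void set_max_sockets(int n) {
        if (!initialized_) max_sockets_ = n;
    }
    int max_sockets() const { return max_sockets_; }
    bool initialized() const { return initialized_; }

    void register_socket() {
        initialized_ = true;
        if (open_sockets_ >= max_sockets_) throw std::runtime_error{"too many sockets"};
        open_sockets_++;
    }

  private:
    int max_sockets_ = 1023;  // placeholder default for this model
    int open_sockets_ = 0;
    bool initialized_ = false;
};

class ModelSocket {
  public:
    ModelSocket() = default;  // empty handle: touches no context
    explicit ModelSocket(ModelContext& ctx) {
        ctx.register_socket();
        valid_ = true;
    }
    bool valid() const { return valid_; }

  private:
    bool valid_ = false;
};

class Service {
  public:
    void set_max_sockets(int n) { context_.set_max_sockets(n); }

    // Real sockets are created here, not in the constructor, so the limit
    // set before start() is the one the context is initialized with.
    void start() {
        command_ = ModelSocket{context_};
        workers_ = ModelSocket{context_};
    }

    bool context_initialized() const { return context_.initialized(); }
    int effective_max_sockets() const { return context_.max_sockets(); }

  private:
    ModelContext context_;
    ModelSocket command_;  // default-constructed until start()
    ModelSocket workers_;
};
```

Constructing a Service leaves the context uninitialized, so a limit set between construction and start() is honoured, while one set after start() is ignored in this model.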
Sean c854046684
Merge pull request #78 from darcys22/custom-formatters
Adds custom formatters for ConnectionID and AuthLevel
2022-08-04 11:03:01 +10:00
Sean Darcy c91e56cf2d
adds custom formatter for OMQ structs that have to_string member 2022-08-04 10:50:02 +10:00
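Selecting types by the presence of a to_string member can be done with a small detection trait. The sketch below is a standard-library-only stand-in and is not the code from this commit, which presumably hooks into a formatting library: has_to_string, format_value and ExampleId are hypothetical names used for illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Detects whether `const T` has a callable to_string() member.
template <typename T, typename = void>
struct has_to_string : std::false_type {};

template <typename T>
struct has_to_string<T, std::void_t<decltype(std::declval<const T&>().to_string())>>
        : std::true_type {};

// Generic formatting entry point, enabled only for types with to_string().
template <typename T, std::enable_if_t<has_to_string<T>::value, int> = 0>
std::string format_value(const T& value) {
    return value.to_string();
}

// Hypothetical example type; any struct with a to_string() member qualifies.
struct ExampleId {
    int id;
    std::string to_string() const { return "id#" + std::to_string(id); }
};
```

A single generic overload then covers every qualifying struct, so adding a new type with a to_string() member requires no extra formatting code.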
Jason Rhinelander 61b7505304
Update oxenc so that oxenc::oxenc target exists 2022-06-09 13:26:58 -03:00
Jason Rhinelander b74df139b0
1.2.12 release 2022-05-30 16:42:33 -03:00
Jason Rhinelander 08a57670f3
Rediff patches 2022-05-30 16:42:17 -03:00
Jason Rhinelander 61a4190c81
Merge remote-tracking branch 'origin/stable' into ubuntu/bionic 2022-05-30 16:42:15 -03:00
Jason Rhinelander eadb37c765
Merge remote-tracking branch 'origin/dev' into stable 2022-05-30 13:29:44 -03:00
Jason Rhinelander b0c3bd4ee9
fix linkage for submodule dep use 2022-05-30 13:28:52 -03:00
Jason Rhinelander 53443bbc31
Merge remote-tracking branch 'origin/dev' into stable 2022-05-30 13:13:52 -03:00
Jason Rhinelander fd95919704
Merge pull request #77 from jagerman/private-linking
Fix use of parent oxenc::oxenc target
2022-05-30 13:13:17 -03:00