oxen-mq

Commit Graph

Author	SHA1	Message	Date
Jason Rhinelander	fc1ea66599	Reduce heartbeat frequency to 15s 3s was excessive especially considering that the default heartbeat timeout is set to 30s.	2020-04-18 02:58:22 -03:00
Jason Rhinelander	238dfa7f78	Drop idle connections regularly The check here on "only if we have some idle workers" fails catastrophically with one worker because that worker is always occupied when this code gets called because of how the loop works and so connections don't get expired at all.	2020-04-18 02:55:12 -03:00
Jason Rhinelander	911c66140f	Bump version to 1.1.1	2020-04-17 16:19:32 -03:00
Jason Rhinelander	2966427cc0	Increase ZMQ socket limit ZMQ's default is 1024, which we are close to hitting; this changes the default for LokiMQ to 10000.	2020-04-17 16:13:04 -03:00
Jason Rhinelander	34bbaaf612	Use slower and exponential backoff in reconnection ZMQ's default reconnection time is 100ms, indefinitely, which seems far too aggressive, particularly where we have some potential for hundreds or thousands of connections. This changes the default to be slightly slower (250ms instead of 100ms) on the first attempt, and to use exponential backoff doubling the time between each failed connection attempt up to a max of 5s between reconnection attempts to calm things down.	2020-04-17 16:09:53 -03:00
Jason Rhinelander	b2518b8eb3	Fix broken idle expiry timeout Idle time was being calculated as the negative of what it should have been, so a connection idle for 30s was idle for "-30s", and since -30 is not greater than whatever the idle time is, it would never expire and get closed. This was resulting in SNs keeping connections open forever, which was very likely not helping with connectivity (and probably also responsible for some of the connection rushes triggering ISP DDOS warnings).	2020-04-17 16:06:54 -03:00
Jason Rhinelander	712662f144	Fix storing reference to temporary consume_string returns a temporary string; we want consume_string_view which returns a view into the data being consumed.	2020-04-17 16:05:41 -03:00
Jason Rhinelander	131bc95f65	Fix pre-1.1.0 UNKNOWNCOMMAND detection 1.0.5 sends just ["UNKNOWNCOMMAND"], so the detection here was broken, which resulted in a warning rather than just a debug log message.	2020-04-14 23:53:19 -03:00
Jason Rhinelander	3aa63c059d	Test suite timing tweaks	2020-04-14 17:40:41 -03:00
Jason Rhinelander	7de36da483	Add ZMTP heartbeating (enabled by default) ZMTP heartbeating should help keep the connection alive, and should result in earlier detection of connection failures.	2020-04-14 16:08:54 -03:00
Jason Rhinelander	b081cf9331	Add missing SET_SNS proxy handler	2020-04-13 16:11:30 -03:00
Jason Rhinelander	84bd5544cc	Move pubkey_set into auth.h header This allows it to be brought in without the full lokimq.h header.	2020-04-13 13:03:19 -03:00
Jason Rhinelander	3b86eb1341	1.1.0: invocation-time SN auth; failure responses This replaces the recognition of SN status to be checked per-command invocation rather than on connection. As this breaks the API quite substantially, though doesn't really affect the functionality, it seems suitable to bump the minor version. This requires a fundamental shift in how the calling application tells LokiMQ about service nodes: rather than using a callback invoked on connection, the application now has to call set_active_sns() (or the more efficient update_active_sns(), if changes are readily available) to update the list whenever it changes. LokiMQ then keeps this list internally and uses it when determining whether to invoke. This release also brings better request responses on errors: when a request fails, the data argument will now be set to the failure reason, one of: - TIMEOUT - UNKNOWNCOMMAND - NOT_A_SERVICE_NODE (the remote isn't running in SN mode) - FORBIDDEN (auth level denies the request) - FORBIDDEN_SN (SN required and the remote doesn't see us as a SN) Some of these (UNKNOWNCOMMAND, NOT_A_SERVICE_NODE, FORBIDDEN) were already sent by remotes, but there was no connection to a request and so they would log a warning, but the request would have to time out. These errors (minus TIMEOUT, plus NO_REPLY_TAG signalling that a command is a request but didn't include a reply tag) are also sent in response to regular commands, but they simply result in a log warning showing the error type and the command that caused the failure when received.	2020-04-12 19:57:19 -03:00
Jason Rhinelander	fb3bf9bd1f	Bump version to 1.0.5	2020-04-06 18:16:59 -03:00
Jason Rhinelander	95540ec7d5	Fix pollitems_stale not being set in some cases This could cause stalls of up to 250ms before we detect an incoming message.	2020-04-06 13:16:55 -03:00
Jason Rhinelander	af42875e97	Made simple_string_view take a char type This allows (most usefully) a `ustring_view` for viewing unsigned char strings.	2020-04-03 12:28:50 -03:00
Jason Rhinelander	bc49b5e9a0	Expose advanced zmq context setting ability	2020-04-03 12:28:50 -03:00
Jason Rhinelander	e3a86aaf71	Add `send_option::outgoing` to force a send on an outgoing connection SS wants this, in particular, to be able to do reachability tests. (Using connect_remote for this was bad with pubkey-based routing ids because the second connection could replace an existing connection).	2020-04-03 01:34:21 -03:00
Jason Rhinelander	b9e9f10f29	Reset stale pollitems This was never being reset to false which could really hurt performance (because it staying true would cause the proxy socket reading loop to short circuit before reading all available msgs, basically needing one full proxy loop per incoming message).	2020-04-03 01:34:21 -03:00
Jason Rhinelander	d4ffebebbd	Change thread count logs to debug from trace	2020-04-03 01:34:21 -03:00
Jason Rhinelander	6ba70923b9	Add job queue check on total workers size Without this there could be a race condition where a job could create a new worker during shutdown, and end up causing an assert failure.	2020-03-29 15:43:17 -03:00
Jason Rhinelander	4c470f3e33	Bump version to 1.0.4	2020-03-29 15:21:44 -03:00
Jason Rhinelander	bd196d08b8	Allow log level to be specified in constructor It can still be set using `lmq.log_level(...)`, but this can be slightly more convenient -- and without this log messages in the constructor are completely useless.	2020-03-29 15:21:20 -03:00
Jason Rhinelander	b66f653708	Less verbose logging at `info` level Downgrades a bunch of not-useful-at-info-level debug messages from info -> debug. This makes `info` a more useful value for a client that wants messages about startup/shutdown but not random non-serious connection related messages.	2020-03-29 15:21:20 -03:00
Jason Rhinelander	716d73d196	All sends use dontwait; add send failure callbacks We really don't ever want send to block, no matter how it is called, since the send is always in the proxy thread. This makes the actual send call always non-blocking, and adds callbacks that we can invoke on send failures: either on queue full errors (which might be recoverable), or both full queue and hard failures (which are generally not recoverable). These callbacks are both optional: they have to be passed in using `send_option::queue_full` (if you just want queue full notifies) or `send_option::queue_failure` (if you want queue full notifies and other send exceptions).	2020-03-29 15:21:20 -03:00
Jason Rhinelander	8e1b2dffa5	Catch connect failures socket.connect() can throw, e.g. if given an invalid connection address; catch this, log the error, and return a failure condition.	2020-03-29 14:40:21 -03:00
Jason Rhinelander	2493e2abd4	Remove empty file All the batch implementation code is in jobs.cpp, this file wasn't meant to be committed originally.	2020-03-29 12:29:38 -03:00
Jason Rhinelander	bcca8dd34e	Catch errors on internal msgs; support non-blocking sends When we try to route an internal message ("BYE", "NOT_A_SERVICE_NODE", etc.) back to the remote from the proxy thread we can end up trying to send to a disconnected remote, which raises an exception, but this isn't caught in proxy code: fix this by catching and ignoring it. This also changes the code to send these messages in "dontwait" mode so that if we can't queue the message we get (and ignore) an exception rather than blocking.	2020-03-29 11:34:55 -03:00
Jason Rhinelander	7f9141a4a9	1.0.3 release	2020-03-27 18:55:16 -03:00
Jason Rhinelander	fd19f7b183	Trim logged filenames to lokimq/* Otherwise this includes the full build path which is gross.	2020-03-27 15:17:34 -03:00
Jason Rhinelander	0639bfa629	Avoid segfault on retried SN connection request When we fail to send to a SN but can retry (e.g. because we had an incoming connection which no longer works, but can retry an outgoing connection) we were recursing, but this was resulting in a double-free of the request callback (since we'd try to take ownership of the incoming serialized pointer twice). Rewrite the code to use a loop with single ownership instead. This also changes the request callback behaviour to fire a failure callback immediately if we can't send a request; previously you'd have to wait for a timeout, but that is pointless if we couldn't get the request out.	2020-03-27 14:59:11 -03:00
Jason Rhinelander	a7c669775f	Avoid masking ReplyCallback type with template param	2020-03-27 14:48:35 -03:00
Jason Rhinelander	9fec81856f	1.0.2 version bump	2020-03-24 11:35:31 -03:00
Jason Rhinelander	8b6f6f498c	Make request timeout configurable For example: lmq.request(conn, "some.method", callback, lokimq::request_timeout{5s}); will result in the callback being called with a failure if the response doesn't arrive within 5s. (If it still arrives, but after the failure callback, it gets dropped).	2020-03-23 22:30:53 -03:00
Jason Rhinelander	75750001ce	Reduce connection check interval and make configurable The previous 1s default seems on the long side; this reduces it to 250ms. It also makes it a public member so that it can be configured (which is mainly needed for the test suite, but might be useful for lokimq-calling code that needs faster or slower connection cleanups).	2020-03-23 22:29:14 -03:00
Jason Rhinelander	b97f3442e7	Rename keep-alive -> keep_alive in internal serialization This makes it consistent with other internal parameter names.	2020-03-23 22:28:23 -03:00
Jason Rhinelander	48d3f261d3	1.0.1 release - internal data structure change to help armhf/gcc-6 - various test suite fixes - various build system improvements	2020-03-21 12:57:45 -03:00
Jason Rhinelander	04e2bf7cf7	Change pending_connects from vector to list Having this as a vector seems to cause armhf/gcc-6 to segfault. On closer inspection there's no good reason this should be a vector in the first place: it only gets used during new connection handshaking and isn't in any hot loop, plus the elements are fairly large tuples where shifting elements is going to be relatively expensive. Thus switching it to a list everywhere (rather than just on old gcc arm) seems fine.	2020-03-21 12:56:46 -03:00
Jason Rhinelander	98b1bd6930	Add more locks around assertions Catch2 isn't currently thread safe, so if we hit one of these assertions while some other thread is doing things such as logging we might segfault.	2020-03-21 12:56:13 -03:00
Jason Rhinelander	3a120efb79	Increase test timeouts for arm These sometimes spuriously fail because apparently they weren't quite long enough to pass tests on my Pi 4.	2020-03-21 11:10:07 -03:00
Jason Rhinelander	0a7074c573	Add BUILD_BYPRODUCTS so that ninja build works	2020-03-19 19:54:24 -03:00
Jason Rhinelander	a36e53d409	More linking overhaul - Don't try to use cppzmq, just find libzmq ourselves. - Allow existing `libzmq` and `sodium` targets to be used to control how we link to libzmq and/or sodium. - Use PkgConfig:: targets instead of the older bunch-of-variables approach (requires cmake >= 3.6).	2020-03-15 01:43:23 -03:00
Jason Rhinelander	bc0e6be801	Add sodium dep if embedding static lib when doing a shared build, too	2020-03-14 16:06:58 -03:00
Jason Rhinelander	dd088c8ba5	cmake compatibility fix	2020-03-14 15:17:48 -03:00
Jason Rhinelander	3d315ba123	More static build linking fixes Static linking is a dumpster fire.	2020-03-14 14:34:56 -03:00
Jason Rhinelander	dd1a8eeb1d	Use the correct variable for shared libs	2020-03-14 02:20:43 -03:00
Jason Rhinelander	1176b946e5	1.0.0 release	2020-03-13 21:08:34 -03:00
Jason Rhinelander	ec50ee8cbd	Compile libzmq statically if embedding	2020-03-13 21:08:34 -03:00
Jason Rhinelander	c4d74a8640	Slightly relax build dep to 4.3 Distros (such as buster) include a patched 4.3.1, which is fine to use.	2020-03-13 19:41:08 -03:00
Jason Rhinelander	036e871cdb	32-bit warning fix	2020-03-13 19:05:12 -03:00
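
The reconnection schedule described in commit 34bbaaf612 (250ms on the first attempt, doubling after each failed attempt, capped at 5s) can be sketched as below. This is only an illustration of the arithmetic, not the code from that commit: `reconnect_delays` is a hypothetical helper that does not exist in LokiMQ or libzmq. libzmq does expose the `ZMQ_RECONNECT_IVL` and `ZMQ_RECONNECT_IVL_MAX` socket options for this kind of behaviour, but whether the commit sets exactly those options is an assumption.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

using namespace std::chrono_literals;

// Hypothetical helper (not part of LokiMQ or libzmq): returns the delay used
// before each of the first `attempts` reconnection attempts. The delay starts
// at `initial` and doubles after every failed attempt, capped at `max_delay`.
std::vector<std::chrono::milliseconds> reconnect_delays(
        int attempts,
        std::chrono::milliseconds initial = 250ms,
        std::chrono::milliseconds max_delay = 5000ms) {
    std::vector<std::chrono::milliseconds> delays;
    auto delay = initial;
    for (int i = 0; i < attempts; i++) {
        delays.push_back(delay);
        // Double the wait for the next attempt, but never exceed the cap.
        delay = std::min(delay * 2, max_delay);
    }
    return delays;
}
```

With the defaults, the first attempts wait 250ms, 500ms, 1s, 2s and 4s, and every attempt after that waits the 5s maximum.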

122 Commits