Binary data cannot be sent via D-Bus as string, because D-Bus will
balk on non-UTF8 sequences of bytes. GIO D-Bus failed particularly
spectacularly, with internal asserts followed by a segfault.
This affected sending the initial message as part of SyncParams and
thus all testing with syncevo-http-server. D-Bus testing itself
didn't have a Connection test involving WBXML.
Fixed by using SharedBuffer inside SyncParams with D-Bus traits based
on DBusArray. DBusArray itself is less suitable because it cannot be
copied easily inside the app. Also added a test, with a b64 encoded
binary message inside the Python script.
The previous implementation relied on the glib event loop to free the
allocated shared pointer. If the process quit before executing the
idle callback, that memory was listed as "potentially lost" by
valgrind. To avoid that, track all pending deletions in the Server
instance and explicitly free them in the destructor, in addition to
the idle callback.
Because we don't have a common base class for all items which might
have to be tracked like this, boost::bind is used to automatically
hold a copy of the shared pointer. The goal is not to invoke the
resulting functor, only freeing it matters.
Running the fork/exec implementation under valgrind caused
some tests to fail because a) some tests ran longer (fixed
by increasing timeouts) and b) some tests resulted in
different D-Bus communication depending on the timing.
Added more debug logging in the syncevo-dbus-server Connection
class and the syncevo-dbus-helper DBusTransport to track this
down.
There were multiple reasons, usually related to handling aborted
connections.
The D-Bus API explicitly says about the "Abort" signal sent by the
server: "This signal is sent at most once for each connection. No
reply will be sent on an aborted connection." The old code did send an
empty, final reply after aborting and the test-dbus.py actually
checked for it. Now that final message is really only send when the
connection is still waiting for a reply (state == PROCESSING) and
hasn't been aborted. The test was fixed accordingly.
The "Abort" documentation also says that "all further operations on it
[= Connection] will fail". Despite that comment one D-Bus test did a
Connection.Close() after receiving the Abort signal. The server now
destroys the Connection instance once it has failed and thus the
Close() call failed. It was removed.
The Connection class now consistently uses delayed deletion, instead
of destructing itself while some of its methods are still active. A
bit safer.
While thinking about the server<->helper communication I noticed that
a Connection.Close() succeeds even if the helper hasn't shut down
yet. Not sure whether there are relevant error scenarios where we
need to tell the client that shutdown of the helper failed.
This commit moves the blocking syncing, database restore and command
line execution into a separate, short-lived process executing the
syncevo-dbus-helper. The advantage is that the main
syncevo-dbus-server remains responsive under all circumstances (fully
asynchronous now) and suffers less from memory leaks and/or crashes
during a sync.
The core idea behind the new architecture is that Session remains the
D-Bus facing side of a session. It continues to run inside
syncevo-dbus-server and uses the syncevo-dbus-helper transparently via
a custom D-Bus interface between the two processes. State changes of
the helper are mirrored in the server.
Later the helper might also be used multiple times in a Session. For
example, anything related to loading backends should be moved into the
helper (currently the "is config usable" check still runs in the
syncevo-dbus-server and needs to load/initialize backends). The
startup code of the helper already handles that (see boolean result of
operation callback), but it is not used yet in practice.
At the moment, only the helper provides a D-Bus API. It sends out
signals when it needs information from the server. The server watches
those and replies when ready. The helper monitors the connection to
the parent and detects that it won't get an answer if that connection
goes down.
The problem of "helper died unexpectedly" is also handled, by not
returning a D-Bus method reply until the requested operation is
completed (different from the way how the public D-Bus API is
defined!).
The Connection class continues to use such a Session, as before. It's
now fully asynchronous and exchanges messages with the helper via the
Session class.
Inside syncevo-dbus-server, boost::signals2 and the dbus-callbacks
infrastructure for asynchronous methods execution are used heavily
now. The glib event loop is entered exactly once and only left to shut
down.
Inside syncevo-dbus-helper, the event loop is entered only as
needed. Password requests sent from syncevo-local-sync to
syncevo-dbus-helper are handled asynchronously inside the event loop
driven by the local transport.
syncevo-dbus-helper and syncevo-local-sync are conceptually very
similar. Should investigate whether a single executable can serve both
functions.
The AutoSyncManager was completely rewritten. The data structure is a
lot simpler now (basically just a cache of transient information about
a sync config and the relevant config properties that define auto
syncing). The main work happens inside the schedule() call, which
verifies whether a session can run and, if not possible for some
reasons, ensures that it gets invoked again when that blocker is
gone (timeout over, server idle, etc.). The new code also uses
signals/slots instead of explicit coupling between the different
classes.
All code still lives inside the src/dbus/server directory. This
simplifies checking differences in partly modified files like
dbus-sync.cpp. A future commit will move the helper files.
The syslog logger code is referenced by the server, but never used.
This functionality needs further thought:
- Make usage depend on command line option? Beware that test-dbus.py
looks for the "ready to run" output and thus startup breaks when
all output goes to syslog instead of stdout.
- Redirect glib messages into syslog (done by LogRedirect, disabled when
using LoggerSyslog)?
The syncevo-dbus-server now sends the final "Session.StatusChanged
done" signal immediately. The old implementation accidentally delayed
sending that for 100 seconds. The revised test-dbus.py checks for
more "session done" quit events to cover this fix.
Only user-visible messages should have the INFO level in any of the
helpers. Messages about starting and stopping processes are related to
implementation details and thus should only have DEBUG level.
The user also doesn't care about where the operation eventually
runs. All messages related to it should be in INFO/DEBUG/ERROR
messages without a process name. Therefore now syncevo-dbus-server
logs with a process name (also makes some explicit argv[0] logging
redundant; requires changes in test-dbus.py) and syncevo-dbus-helper
doesn't.
syncevo-local-sync is different from syncevo-dbus-helper: it produces
user-relevant output (the other half of the local sync). It's output
is carefully chosen so that the process name is something the user
understands (target context) and output can be clearly related to one
side or the other (for example, context names are included in the sync
table).
Output handling is based on the same idea as output handling in the
syncevo-dbus-server:
- Session registers itself as the top-most logger and sends
SyncEvolution logging via D-Bus to the parent, which re-sends
it with the right D-Bus object path as output of the session.
- Output redirection catches all other output and feeds it back
to the Session log handler, from where it goes via D-Bus
to the parent.
The advantage of this approach is that level information is made
available directly to the parent and that message boundaries are
preserved properly.
stderr and stdout are redirected into the parent and logged there as
error. Normally the child should not print anything. While it runs,
LogRedirect inside it will capture output and log it
internally. Anything reaching the parent thus must be from early
process startup or shutdown.
Almost all communication from syncevo-dbus-helper to
syncevo-dbus-server is purely information for the syncevo-dbus-server;
syncevo-dbus-helper doesn't care whether the signal can be
delivered. The only exception is the information request, which must
succeed.
Instead of catching exceptions everywhere, the optional signals are
declared as such in the EmitSignal template parameterization and
no longer throw exceptions when something goes wrong. They also don't
log anything, because that could lead to quite a lof of output.
KDE complains with "realPath called for relative path ..., please fix"
if the XDG env variables point to a relative path. Using absolute paths
makes sense because it allows processes to change their current directory.
Always unsetting SYNCEVOLUTION_DEBUG removed an important debugging
method: SYNCEVOLUTION_DEBUG is useful in cases where output
redirection and/or handling is broken.
Therefore this commit removes the forced unsetting of
SYNCEVOLUTION_DEBUG again and instead let's at least
TestCmdline.testConfigureTemplates pass with and without the
additional debug output.
Other tests fail when it is set, so the right way to invoke
test-dbus.py is without it.
The doSync() method does several steps which can take a long time
(for example, send SAN message) before finally entering the while
loop which checks for abort/suspend. In the case that a suspend or
abort request arrives early, doSync() should not complete the
sync setup. Now it checks more often and returns early.
This was found because the fork/exec sync changed the timing so that
the syncevo-dbus-helper ended up being killed by SIGTERM before it
even started syncing. That had the downside that the process didn't
clean up, resulting in many "potentially lost" memory chunks.
Better choose the timing so that doSync() really is ready to handle
the signal. Aborting at that time is also more realistic (normally
sync startup isn't that slow, only valgrind makes it slow).
When a test is stuck in a loop.run() and times out before the expected
event arrives, then the logged message wasn't very informative (just a
generic "timed out"). Now the error message includes a stack backtrace
(useful to find out where the test was waiting, and thus for what) and
the list of all events received so far (useful to determine which one
is missing).
When the parent has died, sending a reply to its method call will
fail. This must not kill the D-Bus helper. Currently only the failure
reply is secured that way, reasoning that if something already went
wrong, further errors can be ignored.
Sending a positive result might need the same treatment.
There are cases in the D-Bus server where the caller of
Exception::handle() just wants the error explanation, without having
it logged as error at the same time. The new EXCEPTION_HANDLE_NO_ERROR
flag reduces the level so that the explanation is merely logged as
debug information.
A failure to obtain the password must be communicated back to the caller.
Include as much information in the error message as possible.
Avoid reporting this as a generic transport error in the parent.
This fixes an infinite loop when the engine is waiting for data and
the transport is dead: trying to abort just re-triggers the transport
problem, so to abort, the code noq proceeds to shutdown when it all
involved parties (engine, transport) were already notified of the
abort.
This is necessary for the syncevo-dbus-helper: it must integrate the
password request to its parent into the normal event loop, to avoid
triggering assertions in glib when calling glib from inside glib.
In contrast to the version from GDBus libdbus, GDBus GIO's
dbus_get_bus_connection() with name doesn't fail even if the name
cannot be obtained. That's because it asynchronously asks for the
name without waiting for success.
As a quick-and-dirty solution register a function which kills the
process when the name registration fails. This assumes that the
process will run as a daemon with exactly one name, where failures to
obtain that name are indeed fatal.
In some cases the sender of a D-Bus signal doesn't care whether it can
be delivered (log messages, for example). Such signals can be declared
"optional" with a template parameter, which will suppress throwing
error exceptions.
The API is unchanged. EmitSignal0 is a typedef for the variant with
the traditional semantic (throw errors). To suppress that, use
EmitSignal0Template<true>.
In contrast to the same patch in GDBus libdbus, the GIO version also
includes some refactoring. The common code (storage of DBusObject and
signal name, message send code) is now in a EmitSignalHelper base
class.
Moved common code into the DBusResult base class (sendMsg()) and
throwFailure() function. The revised code there includes the GError
message in the exception, to provide a better explanation than before.
"invalid argument" as error messages provided no information about the
location or reason for the failure. Now the text refers to the
GVariant decoding problem ("g_variant failure") and includes source
code + line (to identify the exact type).
In some cases the sender of a D-Bus signal doesn't care whether it can
be delivered (log messages, for example). Such signals can be declared
"optional" with a template parameter, which will suppress throwing
error exceptions.
The API is unchanged. EmitSignal0 is a typedef for the variant with
the traditional semantic (throw errors). To suppress that, use
EmitSignal0Template<true>.
Strictly speaking only GDBus for libdbus really needed the "connection
lost" callback, because the "catch error when pending method call
fails" approach didn't work with it. But GDBus GIO also has a "closed"
signal. Using it is straight-forward and will allow removing the more
complex code in ForkExec.
libdbus has a bug where it doesn't notify the caller when a pending
method call failed because the connection went
down (https://bugs.freedesktop.org/show_bug.cgi?id=49728). ForkExec
relied on that to detect loss of connection.
Therefore a different method is needed to alert ForkExec. This commit
finishes the implementation of g_dbus_set_disconnect_function() and
provides a C++ binding for ForkExec.
A parent already could detect when the child died, but the child
wasn't notified when the parent died or (more important) closed the
connection.
Now it is possible to detect loss of connection, which includes
"parent died" but also the weaker "parent closed its side of the
communication". The idea is the same as in the handling of pending
operations in syncevo-dbus-helper: invoke a method and keep the method
call open forever. When the connection is lost, D-Bus will invoke the
error callback of the pending method.
ForkExecParent now can capture the child's stdout and/or stderr (if
desired, combined into one stream) and make them available via
signals.
Unit tests were added for this functionality. They could be extended
to cover D-Bus connection setup and usage.
LoggerSyslog copied quite a bit of code from LoggerStdout. Moved that
code into the LoggerBase. That avoids code duplication and allows
LoggerSyslog to benefit from improvements made recently in that
code (prefix in each line).
As it was before, LoggerSyslog did not handle line breaks inside the
printed text. syslog() showed them as #012. Now each line is passed to
syslog() separately, with a proper prefix.
The level in the syslog() also seemed to be wrong earlier. It needs to
be based on the log level of the current message, not on the threshold
set for logging in the logger.
Split Timespec class out of util.h into its own Timespec.h so that
Logging.h can use it (next patch). Otherwise there would be a circular
dependency between Logging.h and util.h.
The last output of syncevo-dbus-server was about deactivating the
SuspendFlags. Added a "done" part, to avoid the false impression that
this operation didn't complete.
Both old and new implementation only look at the start times
of syncs to determine whether the autoSyncInterval is over.
D-Bus test should do the same. It still passed because of
the inaccurate time comparison. Now the comparison is more
likely to succeed.
Writing D-Bus traits involves writing append() and get() methods which
take native types (connection, message, iterator, ...). Code which
calls other dbus_traits can be written without ifdefs when these
native types have aliases which are the same in the GIO and libdbus
implementations.
A signals.connect() to a EmitSignal instance works only after defining
the result_type for boost::bind. Example:
m_configChangedSignal.connect(boost::bind(boost::ref(configChanged)))
A signals.connect() to a EmitSignal instance works only after defining
the result_type for boost::bind. Example:
m_configChangedSignal.connect(boost::bind(boost::ref(configChanged)))
Use same pattern as for global logging: keep instance around forever,
instead of implicitly calling its destructor during process shutdown.
If any other global instance needs the SuspendFlags instance after
that point, the process would crash. Not seen in practice yet, but
might happen.
Added runOnce(), which takes a callback with void return code
and which returns false in a wrapper -> useful for one-time
timeouts with a more natural callback.
When a callback is invoked, it gets called directly from the glib
event loop. Letting an exception escape would jump right through the
pure C levels in the callstack, which is likely to cause all kinds of
trouble
Exceptions should never reach the callback. If they don't get caught
inside the callback, then the callback wrapper now catches them and
aborts the process.
This allows writing simple C++ callbacks without a try/catch clause
when exceptions are truely fatal (like "out of memory" in STL).
Asking for the type of an empty array returns zero instead of
the real type. Because the type was checked, this triggered
an exception during decoding. Now the check accepts zero if,
and only if, the array is empty.
All methods called from glib have to catch C++ exceptions. When a
callback let a C++ exception pass, the process was aborted with a
rather mysterious "std::terminate called without pending exception".
All that the D-Bus wrapper can do is abort the program, which is done
here together with logging by calling g_error().
Therefore the real solution is to handle errors inside the C++
callbacks.
NULL callbacks are always allowed. Checking for those was inconsistent
(no need to call empty(), better use implicit conversion to bool) and
missing for asynchronous method calls.
NULL callbacks are always allowed. Checking for those was inconsistent
(no need to call empty(), better use implicit conversion to bool) and
missing for asynchronous method calls.
The HANDLE_EXCEPTION_FATAL flag tells Exception::handle() that the
caller cannot recover from the exception. handle() then calls abort(),
which terminates the process. Useful for unexpected exceptions in the
syncevo-dbus-server which might have left the process in an
inconsistent state.
The test covers waiting for a reply that never comes because the peer
dies. GIO D-Bus returns a "connection lost" error in this case. Still
need to test what happens when a process isn't currently waiting for a
response to its own call, but rather a method invocation from its peer
- it seems that it'll simply get stuck waiting forever when the peer
dies.
The new Exception::tryRethrowDBus() extracts the original internal
exception attributes (class, parameters) from the string that gets
transmitted via D-Bus and throws recognized exceptions. Will be needed
for syncevo-dbus-helper->syncevo-dbus-server error reporting.
tryRethrow() is the version which works on strings without the D-Bus
exception prefix. It's useful for relaying exceptions from inside an
event loop to the caller outside of the loop by storing the
explanation temporarily in a string.