I managed to trace things to the file libmetrics/netbsd/metrics.c in
the get_netbw function. Apparently, the code in get_netbw violates
alignment constraints for sparc64. I attached a patch against the result
of a "make patch" in parallel/ganglia-monitor-core. While I was at it, I
also changed proc_run_func somewhat to only count actually running
processes (having a look at NetBSD's ps(1) implementation) - without the
change, I got around 30 running processes on an idle machine.
"Looks good at a quick glance" martin@
Bump PKGREVISION.
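
For readers unfamiliar with the sparc64 alignment problem mentioned above,
here is a minimal sketch of the usual remedy. It is not the attached patch:
read_u64_unaligned and its parameters are hypothetical names used only for
illustration. It assumes the failure comes from dereferencing a wide integer
pointer at an address that is not suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): read a 64-bit counter
 * that starts at an arbitrary byte offset inside a buffer.
 *
 * Casting (buf + off) to uint64_t * and dereferencing it is undefined
 * behaviour when the address is not 8-byte aligned; on strict-alignment
 * CPUs such as sparc64 it typically ends in a bus error. memcpy() has no
 * alignment requirement, so copying into an aligned local is safe.
 */
uint64_t
read_u64_unaligned(const unsigned char *buf, size_t buflen, size_t off)
{
	uint64_t value;

	/* Refuse to read past the end of the buffer. */
	assert(off <= buflen && buflen - off >= sizeof(value));
	memcpy(&value, buf + off, sizeof(value));
	return value;
}
```

The value is returned in host byte order, which is what is wanted when the
buffer was filled by the local kernel.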
Update to 1.2.8 (formerly in devel/apr1), no longer build from the
httpd distfile.
devel/rapidsvn:
devel/subversion-base:
parallel/ganglia-monitor-core:
security/hydra:
www/apache2:
Use devel/apr0.
www/apache22:
Use devel/apr and devel/apr-util.
INSTALLATION_DIRS, as well as all occurrences of ${PREFIX}/man with
${PREFIX}/${PKGMANDIR}.
Fixes PR 35265, although I did not use the patch provided therein.
Changes (mostly bugfixes):
* srclib/libmetrics/freebsd/metrics.c (1.6): Many bug fixes and
cleanups:
- Make cpu_state act like get_netbw and get new values only if
called more than 1/2 second after the last value update. This was
causing obviously weird results from the CPU metrics on sparc64
(where the counters seem to be very coarse) and bogus, but more
subtly broken, results on other architectures. This has always
been broken.
- Implement cpu_intr_func (one line!)
- Make the logic for handling bad returns from sysctl make sense.
It should never be triggered in most cases, but at least this
way it won't return bogus values when it happens.
- Prefer sysctlbyname() to sysctl(). It's much easier to read.
- Reduce the use of pointless temporary variables.
- Comment/white space fixes, include more comments on metrics we
are unlikely to actually implement and comments on other rather
bogus metrics, mostly memory related ones.
* lib/libgmond.c (1.17): Set the default time for
tcp_accept_channels to be -1 (blocking io)
* srclib/libmetrics/linux/metrics.c (1.5): Fixed a bug in
pkts_in/out and bytes_in/out on some Linux 2.6.x kernels
http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=21
* gmond/: g25_config.c (1.3), gmond.c (1.102): Patched two bugs in
gmond. The first bug causes gmond to occasionally stop reporting
when there is a network failure. The second bug relates to the
host mask being set to 24 instead of 32 when converting old
gmond.conf configuration files.
* srclib/libmetrics/freebsd/metrics.c (1.5): Fix a number of bugs
of varying severity:
- makenetvfslist had some nasty uninitialized variable bugs under
FreeBSD 4.x, fix those.
- General reorganization and logic clarity improvements in
makenetvfslist.
- Correct the error handling code in machine_type_func,
os_name_func, and os_release_func so that it actually does
something useful (not that it should ever be triggered).
* srclib/libmetrics/freebsd/metrics.c (1.4):
- Fix a memory leak in find_disk_space() as reported by Glen Beane.
- Overhaul makenetvfslist() a bit to fix a leak in low memory
situations, reduce duplicated code, and streamline error handling.
- Fix a few compiler warnings.
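
The cpu_state change in revision 1.6 above (refresh only when more than 1/2
second has passed since the last update) can be illustrated roughly as
follows. This is a sketch, not the upstream code: struct cached_sample,
sample_needs_refresh and sample_store are hypothetical names, and the caller
is assumed to supply the current time in seconds as a double.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimum age, in seconds, before cached counters are read again. */
#define MIN_REFRESH_SECONDS 0.5

/* Hypothetical cache for one metric; zero-initialise before first use. */
struct cached_sample {
	bool   valid;	/* false until the first value has been stored */
	double stamp;	/* time of the last refresh, in seconds */
	double value;	/* value computed at that time */
};

/*
 * Return true when the cache is empty or older than the minimum age.
 * Re-reading coarse counters too soon yields a delta dominated by
 * counter granularity, which is what produced the odd CPU numbers.
 */
bool
sample_needs_refresh(const struct cached_sample *c, double now)
{
	assert(c != NULL);
	return !c->valid || now - c->stamp > MIN_REFRESH_SECONDS;
}

/* Record a freshly computed value together with its timestamp. */
void
sample_store(struct cached_sample *c, double now, double value)
{
	assert(c != NULL);
	c->valid = true;
	c->stamp = now;
	c->value = value;
}
```

A metric function would call sample_needs_refresh() first, query the kernel
and call sample_store() only when it returns true, and otherwise return the
cached value unchanged.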
Ganglia is a scalable distributed monitoring system for high-performance
computing systems such as clusters and Grids. It is based on a hierarchical
design targeted at federations of clusters. It relies on a multicast-based
listen/announce protocol to monitor state within clusters and uses a tree of
point-to-point connections amongst representative cluster nodes to federate
clusters and aggregate their state. It leverages widely used technologies such
as XML for data representation, XDR for compact, portable data transport, and
RRDtool for data storage and visualization. It uses carefully engineered data
structures and algorithms to achieve very low per-node overheads and high
concurrency. The implementation is robust, has been ported to an extensive set
of operating systems and processor architectures, and is currently in use on
over 500 clusters around the world. It has been used to link clusters across
university campuses and around the world and can scale to handle clusters with
2000 nodes.
http://ganglia.sourceforge.net