New in 1.4.3
------------
- Fixed handling of the array_of_argv parameter in the Fortran
binding of MPI_COMM_SPAWN_MULTIPLE.
- Fixed a problem with the Fortran binding for
MPI_FILE_CREATE_ERRHANDLER. Thanks to Secretan Yves for identifying
the issue.
- Updates to the LSF PLM to ensure that the path is correctly passed.
Thanks to Teng Lin for the patch.
- Fixes for the F90 MPI_COMM_SET_ERRHANDLER and MPI_WIN_SET_ERRHANDLER
bindings. Thanks to Paul Kapinos for pointing out the issue.
- Fixed various MPI_THREAD_MULTIPLE race conditions.
- Fixed an issue with an undeclared variable from ptmalloc2 munmap on
BSD systems.
- Fixes for BSD interface detection.
- Various other BSD fixes. Thanks to Kevin Buckley for helping to track
all of this down.
- Fixed issues with the use of the -nper* mpirun command line arguments.
- Fixed an issue with coll tuned dynamic rules.
- Fixed an issue with the use of OPAL_DESTDIR being applied too aggressively.
- Fixed an issue with one-sided xfers when the displacement exceeds 2GBytes.
- Change to ensure TotalView works properly on Darwin.
- Added support for Visual Studio 2010.
- Fix to ensure proper placement of VampirTrace header files.
- Added the volatile keyword to a variable used in debugging
(MPIR_being_debugged).
- Fixed a bug in inter-allgather.
- Fixed malloc(0) warnings.
- Corrected a typo in the MPI_Comm_size man page (intra -> inter). Thanks
to Simon number.cruncher for pointing this out.
- Fixed a SegV in orted when given more than 127 app_contexts.
- Removed xgrid source code from the 1.4 branch since it is no longer
supported in the 1.4 series.
- Removed the --enable-opal-progress-threads config option since
opal progress thread support does not work in 1.4.x.
- Fixed a defect in VampirTrace's vtfilter.
- Fixed wrong Windows path in hnp_contact.
- Removed the requirement for a paffinity component.
- Removed a hardcoded limit of 64 interconnected jobs.
- Fix to allow singletons to use ompi-server for rendezvous.
- Fixed bug in output-filename option.
- Fix to correctly handle failures in mx_init().
- Fixed a potential Fortran memory leak.
- Fixed an incorrect branch in some ppc32 assembly code. Thanks
to Matthew Clark for this fix.
- Remove use of undocumented AS_VAR_GET macro during configuration.
- Fixed an issue with VampirTrace's wrapper for MPI_Init_thread.
- Updated mca-btl-openib-device-params.ini file with various new vendor IDs.
- Configuration fixes to ensure CPPFLAGS is handled properly if a non-standard
valgrind location was specified.
- Various man page updates.
I managed to trace things to the file libmetrics/netbsd/metrics.c in
the get_netbw function. Apparently, the code in get_netbw violates
alignment constraints for sparc64. I attached a patch against the result
of a "make patch" in parallel/ganglia-monitor-core. While I was at it, I
also changed proc_run_func somewhat to only count actually running
processes (having a look at NetBSD's ps(1) implementation) - without the
change, I got around 30 running processes on an idle machine.
"Looks good at a quick glance" martin@
Bump PKGREVISION.
to trigger/signal a rebuild for the transition 5.10.1 -> 5.12.1.
The list of packages is computed by finding all packages which end
up having either of PERL5_USE_PACKLIST, BUILDLINK_API_DEPENDS.perl,
or PERL5_PACKLIST defined in their make setup (tested via
"make show-vars VARNAMES=..."), minus the packages updated after
the perl package update.
sno@ was right after all, obache@ kindly asked and he@ led the
way. Thanks!
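The selection rule described above amounts to a set difference. A minimal sketch in Python, assuming the variable names defined by each package have already been collected (for instance from "make show-vars" output). The package paths and the helper name packages_to_bump are hypothetical and only illustrate the rule:

```python
# Variables whose presence marks a package as depending directly on perl
# (names taken from the commit message above).
PERL_VARS = ("PERL5_USE_PACKLIST", "BUILDLINK_API_DEPENDS.perl", "PERL5_PACKLIST")


def packages_to_bump(defined_vars, updated_after_perl):
    """Return the sorted list of packages whose PKGREVISION should be bumped.

    defined_vars: mapping of package path -> set of variable names that are
                  defined in that package's make setup.
    updated_after_perl: packages already updated after the perl update,
                        which therefore need no bump.
    """
    selected = {
        pkg
        for pkg, names in defined_vars.items()
        if any(var in names for var in PERL_VARS)
    }
    return sorted(selected - set(updated_after_perl))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real package tree.
    sample = {
        "cat/pkg-a": {"PERL5_PACKLIST"},
        "cat/pkg-b": {"USE_TOOLS"},
        "cat/pkg-c": {"BUILDLINK_API_DEPENDS.perl"},
    }
    print(packages_to_bump(sample, {"cat/pkg-c"}))
```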
Changes in v1.4.2 as compared to v1.4.1:
- Fixed problem when running in heterogeneous environments.
- Update LSF support to ensure that the path is passed correctly.
- Fixed some miscellaneous oversubscription detection bugs.
- IBM re-licensed its LoadLeveler code to be BSD-compliant.
- Various fixes for multithreading deadlocks, race conditions, and
other nefarious things.
- Fixed ROMIO's handling of "nearly" contiguous issues (e.g., with
non-zero true_lb).
- Bunches of Windows build fixes.
- Now allow the graceful failover from MTLs to BTLs if no MTLs can
initialize successfully.
- Added "clobber" information to various atomic operations, fixing
erroneous behavior in some newer versions of the GNU compiler suite.
- Update various iWARP and InfiniBand device specifications in the
OpenFabrics .ini support file.
- Fix the use of hostfiles when a username is supplied.
- Various fixes for rankfile support.
- Updated the internal version of VampirTrace to 5.4.12.
- Fixed OS X TCP wireup issues having to do with IPv4/IPv6 confusion
(see https://svn.open-mpi.org/trac/ompi/changeset/22788 for more
details).
- Fixed some problems in processor affinity support, including when
there are "holes" in the processor namespace (e.g., offline
processors).
- Ensure that Open MPI's "session directory" (usually located in /tmp)
is cleaned up after process termination.
- Fixed some problems with the collective "hierarch" implementation
that could occur in some obscure conditions.
- Various MPI_REQUEST_NULL, API parameter checking, and attribute
error handling fixes.
- Fix case where MPI_GATHER erroneously used datatypes on non-root nodes.
- Patched ROMIO support for PVFS2 > v2.7 (patch taken from MPICH2
version of ROMIO).
- Fixed "mpirun --report-bindings" behavior when used with
mpi_paffinity_alone=1. Also fixed mpi_paffinity_alone=1 behavior
with non-MPI applications.
- Ensure that all OpenFabrics devices have compatible receive_queues
specifications before allowing them to communicate. See the lengthy
comment in https://svn.open-mpi.org/trac/ompi/changeset/22592 for details.
- Fix some issues with checkpoint/restart.
- Improve the pre-MPI_INIT/post-MPI_FINALIZE error messages.
- Ensure that loopback addresses are never advertised to peer
processes for RDMA/OpenFabrics support.
- Fixed a CSUM PML false positive.
- Various fixes for Catamount support.
- Minor update to wrapper compilers in how user-specific argv is
ordered on the final command line. Thanks to Jed Brown for the
suggestions.
- Update to PLPA v1.3.2, addressing a licensing issue identified by
the Fedora project. See
https://svn.open-mpi.org/trac/plpa/changeset/262 for details.
- Add check for malformed checkpoint metadata files (Ticket #2141).
- Fix error path in ompi-checkpoint when not able to checkpoint
(Ticket #2138).
- Cleanup component release logic when selecting checkpoint/restart
enabled components (Ticket #2135).
- Fixed VT node name detection for Cray XT platforms, and fixed some
broken VT documentation files.
- Fix a possible race condition in tearing down RDMA CM-based
connections.
- Relax error checking on MPI_GRAPH_CREATE. Thanks to David Singleton
for pointing out the issue.
- Fix a shared memory "hang" problem that occurred on x86/x86_64
platforms when used with the GNU >=4.4.x compiler series.
- Add fix for Libtool 2.2.6b's problems with the PGI 10.x compiler
suite. Inspired directly from the upstream Libtool patches that fix
the issue (but we need something working before the next Libtool
release).
===============================================================================
Changes in 1.2.1
===============================================================================
# OVERALL: Improved support for fine-grained multithreading.
# OVERALL: Improved integration with Valgrind for debugging builds of MPICH2.
# PM/PMI: Initial support for hwloc process-core binding library in
Hydra.
# PM/PMI: Updates to the PMI-2 code to match the PMI-2 API and
wire-protocol draft.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r5425:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.2.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.2.1?action=follow_copy&rev=HEAD&stop_rev=5425&mode=follow_copy
===============================================================================
Changes in 1.2
===============================================================================
# OVERALL: Support for MPI-2.2
# OVERALL: Several fixes to Nemesis/MX.
# WINDOWS: Performance improvements to Nemesis/windows.
# PM/PMI: Scalability and performance improvements to Hydra using
PMI-1.1 process-mapping features.
# PM/PMI: Support for process-binding for hyperthreading enabled
systems in Hydra.
# PM/PMI: Initial support for PBS as a resource management kernel in
Hydra.
# PM/PMI: PMI2 client code is now officially included in the release.
# TEST SUITE: Support to run the MPICH2 test suite through valgrind.
# Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using:
svn log -r5025:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.2
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.2?action=follow_copy&rev=HEAD&stop_rev=5025&mode=follow_copy
===============================================================================
Changes in 1.1.1p1
===============================================================================
- OVERALL: Fixed an invalid read in the dataloop code for zero count types.
- OVERALL: Fixed several bugs in ch3:nemesis:mx (tickets #744,#760;
also change r5126).
- BUILD SYSTEM: Several fixes for functionality broken in 1.1.1 release,
including MPICH2LIB_xFLAGS and extra libraries living in $LIBS instead of
$LDFLAGS. Also, '-lpthread' should no longer be duplicated in link lines.
- BUILD SYSTEM: MPICH2 shared libraries are now compatible with glibc versioned
symbols on Linux, such as those present in the MX shared libraries.
- BUILD SYSTEM: Minor tweaks to improve compilation under the nvcc CUDA
compiler.
- PM/PMI: Fix mpd incompatibility with python2.3 introduced in mpich2-1.1.1.
- PM/PMI: Several fixes to hydra, including memory leak fixes and process
binding issues.
- TEST SUITE: Correct invalid arguments in the coll2 and coll3 tests.
- Several other minor bug fixes, memory leak fixes, and code cleanup. A full
list of changes is available using:
svn log -r5032:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1.1p1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1.1p1?action=follow_copy&rev=HEAD&stop_rev=5032&mode=follow_copy
===============================================================================
Changes in 1.1.1
===============================================================================
# OVERALL: Improved support for Boost MPI.
# PM/PMI: Significantly improved time taken by MPI_Init with Nemesis and MPD on
large numbers of processes.
# PM/PMI: Improved support for hybrid MPI-UPC program launching with
Hydra.
# PM/PMI: Improved support for process-core binding with Hydra.
# PM/PMI: Preliminary support for PMI-2. Currently supported only
with Hydra.
# Many other bug fixes, memory leak fixes and code cleanup. A full
list of changes is available using:
svn log -r4655:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1.1?action=follow_copy&rev=HEAD&stop_rev=4655&mode=follow_copy
===============================================================================
Changes in 1.1
===============================================================================
- OVERALL: Added MPI 2.1 support.
- OVERALL: Nemesis is now the default configuration channel with a
completely new TCP communication module.
- OVERALL: Windows support for nemesis.
- OVERALL: Added a new Myrinet MX network module for nemesis.
- OVERALL: Initial support for shared-memory aware collective
communication operations. Currently MPI_Bcast, MPI_Reduce, MPI_Allreduce,
and MPI_Scan.
- OVERALL: Improved handling of MPI Attributes.
- OVERALL: Support for BlueGene/P through the DCMF library (thanks to
IBM for the patch).
- OVERALL: Experimental support for fine-grained multithreading
- OVERALL: Added dynamic processes support for Nemesis.
- OVERALL: Added automatic as well as statically runtime configurable
receive timeout variation for MPD (thanks to OSU for the patch).
- OVERALL: Improved performance for MPI_Allgatherv, MPI_Gatherv, and MPI_Alltoall.
- PM/PMI: Initial support for the new Hydra process management
framework (current support is for ssh, rsh, fork and a preliminary
version of slurm).
- ROMIO: Added support for MPI_Type_create_resized and
MPI_Type_create_indexed_block datatypes in ROMIO.
- ROMIO: Optimized Lustre ADIO driver (thanks to Weikuan Yu for
initial work and Sun for further improvements).
- Many other bug fixes, memory leak fixes and code cleanup. A full
list of changes is available using:
svn log -r813:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1
... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1?action=follow_copy&rev=HEAD&stop_rev=813&mode=follow_copy
New in OpenPA v1.0.2:
Major Changes:
* Add support for 64-bit PPC.
* Static initializer macros for OPA types.
balaji (1):
* Fix pthread_mutex usage for inter-process shared memory regions.
buntinas (1):
* added OPA typedef for pthread_mutex_t
fortnern (4):
* Add more tests for compare-and-swap.
* Add integer compare-and-swap fairness test.
* Add pointer version of compare-and-swap fairness test.
* Added configure test for pthread_yield.
goodell (6):
* Fix bad include guard in the opa_by_lock.h header.
* Add new "unsafe" primitives. Also minor updates to the docs.
* Add support for 64-bit PPC.
* Update README to reflect 64-bit PPC support.
* Add static initializer macros for OPA_int_t/OPA_ptr_t.
* Actually include the COPYRIGHT and CHANGELOG files in the distribution.
jayesh (1):
* Fixed compiler warnings in NT intrinsics. Now type casting the arguments to NT intrinsics correctly
Grid Engine 6.2, which has undergone significant changes in qmaster to
significantly improve its scalability in challenging environments, adds
powerful features to the core system, introduces multi cluster support
for the Accounting and Reporting Console (ARCo) and comes with a new
module extending the scope of Grid Engine to a new domain of use cases:
the Service Domain Manager (SDM), a.k.a. project Hedeby, which allows
computational resources to be (re-)assigned dynamically on demand.
plus lots of bug fixes.
This changes the buildlink3.mk files to use an include guard for the
recursive include. The use of BUILDLINK_DEPTH, BUILDLINK_DEPENDS,
BUILDLINK_PACKAGES and BUILDLINK_ORDER is handled by a single new
variable BUILDLINK_TREE. Each buildlink3.mk file adds a pair of
enter/exit markers, which can be used to reconstruct the tree and
to determine first level includes. Avoiding := for large variables
(BUILDLINK_ORDER) speeds up parse time as += has linear complexity.
The include guard reduces system time by avoiding reading files over and
over again. For complex packages this reduces both %user and %sys time to
half of the former time.
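How the enter/exit markers let a consumer rebuild the tree can be sketched as follows. This is an illustrative Python sketch, not the pkgsrc implementation (which lives in make and shell code). It assumes each buildlink3.mk appends "pkg" on entry and "-pkg" on exit to BUILDLINK_TREE; the function name and the sample marker list are made up for the example:

```python
def parse_buildlink_tree(markers):
    """Rebuild include relationships from a BUILDLINK_TREE-style marker list.

    markers: whitespace-separated tokens, "pkg" to enter and "-pkg" to exit.
    Returns (first_level, edges):
      first_level: packages included at depth 1, in first-seen order
      edges:       (parent, child) pairs for nested includes
    """
    stack = []        # packages currently being "entered"
    first_level = []  # direct (depth-1) includes
    edges = []        # nested include relationships
    for token in markers.split():
        if token.startswith("-"):
            name = token[1:]
            # An exit marker must match the most recent enter marker.
            if not stack or stack[-1] != name:
                raise ValueError("unbalanced exit marker: " + token)
            stack.pop()
        else:
            if stack:
                edges.append((stack[-1], token))
            elif token not in first_level:
                first_level.append(token)
            stack.append(token)
    if stack:
        raise ValueError("missing exit marker for: " + stack[-1])
    return first_level, edges


if __name__ == "__main__":
    # Hypothetical marker list: glib2 is included twice, but the include
    # guard means its dependencies are only walked the first time.
    tree = "glib2 gettext-lib -gettext-lib pcre -pcre -glib2 zlib -zlib glib2 -glib2"
    direct, nested = parse_buildlink_tree(tree)
    print("first-level includes:", direct)
    print("nested includes:", nested)
```

A single pass with a stack is enough here because the markers are properly nested, which is what makes one BUILDLINK_TREE variable a plausible replacement for separate depth, order, and package-list variables.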
Changes since 1.0.7:
- Added support for MPI 2.1
- Added support for MPI_Type_create_resized and
MPI_Type_create_indexed_block datatypes in ROMIO.
- Bug fixes, memory leak fixes and code cleanup.
patch-au compiles sge_arch.c with -ansi so that the stringification hack
works on NetBSD and FreeBSD and probably others. Otherwise architecture
names like nbsd-i386 turn into nbsd-1. From the FreeBSD port.
Bugs fixed in SGE 6.1u5 since release 6.1u4
wrong documentation for upgrade 6.0u2 and higher to 6.1u2 and higher
Multiple loadsensor instances are trying to access the same temp load
file on AIX51
Validation of the Filter List in Simple Query builder fails
qhost -l h=<hostname> does not work
Numbers in error mail too large
use of the same paths for input/output stream must be dealt with
DRMAA Java language binding segfaults on Session.exit() with sol-x86
binaries on AMD64
sgeexecd startup script shouldn't suppress error messages from sge_execd
binary
Advanced Query with wild card character * does not produce correct results.
'Infinity' must be rejected when specified in 'complex_values' or RQS
limits for consumables
Invalid qconf -mrqs crashes qmaster with segmentation fault
RQS: Line wrap of host list introduces syntax error
Row Limit in ARCo Simple Query builder cannot be empty
loadsensor does not work on AIX51
qhost -xml has wrong namespace
QMON: The help for Resource Quotas is not available
qmon fills in fields incorrectly for restoring config for Submit Job
sgemaster -qmaster stop also shuts down shadowd
incorrect dependency on xinetd in init scripts for linux
Latebindings for Advanced Queries do not work
Switching from Simple Query to Advanced Query removes the Latebindings
32-bit Linux binaries are having problems with file access in 64-bit NFS
environments
using of default_domain may prohibit execd installation
Commlib might crash if running out of memory
Configuration file check of automatic installation does not recognize remsh
loadcheck prints error message "kstat finds too many cpus"
Communication library thread locking problem results in qmaster crash
ARCo should not print exception stack trace in the console
TABLESPACE values should be written to dbwriter.conf
Incorrectly considering two host group names to be the same
Clients not disconnecting correctly
SGE util/arch script is broken for AIX 5.3 Operating System
error message given by qalter -q '' <jobid> suggests a memory access problem
bootstrap(5) man page sees itself as sge_conf(5)
qmaster reinstall overwriting an existing installation fails
qconf -ae|-Ae return 0 even if exechost exists already
qconf -dxxxx does not set exit status on error
qconf -as, -am, -ao, -Ae, -Acal, -Ackpt, -Ap when msg "already exist"
should return not 0 exit code
qconf -acal doesn't return error code 1 when failed
setting of QMaster port number leads to infinite loop
use of -l tmpdir=abc can crash schedd
load scaling display not working correctly
qstat -j does not print array task information
job hold due to -hold_jid is not indicated as STATE_SYSTEM_ON_HOLD by
drmaa_job_ps(3)
Segmentation fault of sge_schedd
A load sensor reporting values for other hosts does not work
reporting file is lacking information about global consumables, if
log_consumables=false
Wallclock_Time query should be more constrained
"./install_execd -winsvc -auto /path/to/auto.conf" command causes error
The default has to be local spool directory when install_execd is run for
a Windows host
qmaster runs out of memory on AIX
dbw install parameters are not verified
Incorrect slots_total from qstat -F -xml output
Wrong permissions if install_qmaster creates qmaster spool directory
Installation of execution daemon left user unclear which port was chosen.
Exception occurs during the exportation of a query result to pdf
memory leak in sge_execd with qsub -v SGE_* or qsub -V
ARCo should support SJWC 3.1
Bugs fixed in SGE 6.1u4 since release 6.1u3
on Windows installation fails when installing as root and SGE admin
user = none
accounting records for slave tasks of pe jobs should contain the correct
task submission time
check if config parameters qlogin_daemon and rlogin_daemon are paths
parallel scheduling memory leak in sge_schedd
execd installation does not test absolute path for local spool dir
Sort on table column throws exception if explicit SORT specified in
SimpleQuery
Error.jsp contains unbalanced tag
arco_read should be able to create synonyms instead of arco_write
DBW should use batch inserts
prolog and epilog descriptions should include exit codes
It is possible to set negative tickets / shares in qmon and from the command
line
ORDER BY clause ignored in Advanced Query
Queue Consumables query incorrect in ARCo predefined queries
CLI accepts the slot number of more than 10000000
ARCo online help contains invalid, unclear or outdated information
the installation of two rpc databases on the same host fails
DBWriter should not exit if there is a database connection error
Reporting 'View' dropdown menu and 'Save Result' functionality is confusing
DBW derived rules and reporting queries that count jobs need to be updated
incomplete error logging in case of classic spooling failures
Row Limit in Simple Query uses wrong syntax
'NONE' as value is not rejected for queue_conf(5) shell and qsub(1) -S
Upgrade to 6.1u3 fails for PostgreSQL < 8.0, minor issues in
dbdefinition.xml for PSQL > 8.0
dbwriter should write checkpoint to database
dbwriter deletion rules delete tasks of pe_jobs
unclear 'exit_status' description in accounting(5) about Grid Engine
specific status
autoinstall configfile should be parsed and checked for valid input!
qstat -j output is broken for shell_path
the project field should be displayed in the qstat -j output
Wrong variable for calculating daily host values from hourly ones
Pending PE job qstat -j output displays additional useless message when not
running because of RQs
automatic backup is broken!
Spelling mistakes in the qmon help menus
deletion rule for PostgreSQL incorrect for deletion of sge_share_log
qquota broken if quota definition contains "hosts" or "users" scope negation
Access_list(5) man page not precise enough with regard to secondary/primary
group(s)
RQS debitation of running jobs is broken if enabled by -mattr
Set SGE_QMASTER_PORT in settings file if sge_qmaster is not found in
/etc/services file
Failed to deliver STOP signal for subordinated jobs
Missing array job task usage in the accounting file
qhost/qstat can't be interrupted with ctrl-c
typographical errors in messages from install_qmaster
Sort order and row limit cannot be specified together in ARCo Simple Query
builder
Qmaster segfaults with long host resource evaluation expression
Error message for unsupported platforms should be more verbose
qsub does not accept resource strings size larger than 256
Memory leak in drmaa_run_job()/drmaa_run_bulk_job()
ARCo reporting module installation script is broken on Red Hat Enterprise
Linux 4 Update 4
Job predecessor list missing from qstat -j output
In SJWC on Oracle dates appear truncated to just MM/DD/YYYY
configfile check in automatic installation is too strict
load sensor might block execd port
Uninstallation of remote execd if not interactive
Infotext spawned on remote machine with -wait or -ask does not display the
text
Uninstall does not remove the SGE_STARTUP_SCRIPT
qmaster crashes when SGE_ND=1, dl 2 and BDB server spooling
inst_sge -ux all -um fails
Usage string for some commands is incomplete
dbwriter installation can't finish on large amount of data
reprioritize disappears after sge_qmaster restart
qmaster failover should not change the state of any queue
to trigger/signal a rebuild for the transition 5.8.8 -> 5.10.0.
The list of packages is computed by finding all packages which end
up having either of PERL5_USE_PACKLIST, BUILDLINK_API_DEPENDS.perl,
or PERL5_PACKLIST defined in their make setup (tested via
"make show-vars VARNAMES=...").
-------------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
376 4743006 problem with floating point job resource limits
1909 6353628 information provided by qstat -j and qstat -j -xml are not equivalent
2076 6440408 qstat -j messages disagree between plain, XML output
2077 6440412 qstat -j -xml messages incomplete
2138 6506667 forbid deletion of global config values
2194 6527836 authuser binary returns unusable error message!
2249 6568575 SGE does not work if primary group entry is too big in groups map
2270 6575720 ENABLE_ADDGRP_KILL is missing from sge_conf(5)
2272 6575727 sge_shadowd(8) man page is missing some env vars
2274 6564461 Duplicate scheduling info messages for reservation jobs
2276 6575731 share_tree(5) doesn't explain type field
2283 6565821 Oracle, Postgres DBW should prompt for tablespace where indexes and tables should be created
2293 6569088 Resource reservation broken for sequential jobs depending on RQS specified for subset of queues only
2303 6571749 parallel resource reservation broken when non-queue instance based quotas limits apply
2323 6576153 Creating a userset with NONE as a type results in a core dump
2327 6578213 qconf -(A,D,M,R)attr dumps core when the supplied file is empty
2328 6579232 high scheduler dispatching time with many sequential resource reservation jobs and resource quotas
2336 6287501 rctemplates lack of requirement
2338 6585721 Parallel RR broken if jobs wait for queue slots and no RQS configured
2342 6590010 Original primary group vanishes after newgrp command (USE_QSUB_GID=true)
2344 6590079 Resource reservation broken with sequences of identical jobs differing only in their -R y|n
2346 6604155 qmon binary job submit is broken
2351 6597463 qsub -t 1-N:N creates a normal job with one task
2352 6594665 Installation fails on Linux with glibc 2.6
2353 6597423 commit method of UnixLoginModule does not report RuntimeExceptions
2356 6600619 Userset spooling in classic mode is broken
2367 6597547 qdel does not recognize wc_job_range_list as it is defined
2369 6577034 Several qconf options display only single message when a list of messages should be printed
2372 6469494 clients should issue a more explicit error message when qmaster is busy
2374 6589459 Expose the availability of keyword "none" in the manual page of calendar_conf
2382 6569862 Unset old_value out of the scope
2383 6553062 qconf -mc accepts erroneous resource entries without an urgency; qmon gives (poor) error message
2387 6614041 Multiple occurrence of a name in RQS limit definition break classic spooling
2392 6614108 Specifying more than one drmaa_v_env attribute causes spurious error msg
2394 6608259 scheduler prints empty line in messages file after every 'sge_mirror' logging
2396 6608236 scheduling of parallel jobs does not respect consumables, if consumable is referenced in rqs
2400 6564543 sge_shepherd should exit if it cannot write to any of its essential files
2401 6617450 add option to reporting_params for switching off writing of consumables
2404 6618328 qmon displays wrong string for queue filtering
2406 6596931 Incorrect messages in qconf command
2407 6618619 the restore feature does not delete old configuration before restoring
2409 6619016 removing parameters from the reporting_params will not fallback to the default
2410 6619657 qmod -e|-d '*' times out in large clusters
2411 6619662 qhost becomes sluggish in large clusters
2414 6618599 Long running jobs cause incorrect usage summary for ARCo database
2415 6620930 ARCO view_accounting filters out parallel job usage incorrectly
2416 6621482 ju_exit_status should provide means to recognize the intermediate record
2417 6622842 the start_time field in intermediate accounting records is incorrect
2418 6588743 qrsh fails with "connection refused" error message
2419 6391244 qstat -ext reports wrong usage as compared to other commands such as qstat -t or qstat -j
2424 6620253 During the installation the admin user should create web.xml file
2428 6630268 upgrade from 6.0u2 and higher to 6.1u2 and higher does not work
2435 6599335 inst_sge help output for -upd switch is incorrect
Bugs fixed in SGE 6.1u2 since release 6.1u1
-------------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
- 6590960 Man pages show the wrong version number
2345 6590574 resource quota can prevent dispatching of jobs that request no resource in this quota
2343 6589807 newline missing from "illegal debug level format" message
2338 6585721 Parallel RR broken if jobs wait for queue slots and no RQS configured
2334 6584632 user/system/operator hold state combinations cause strange qstat output
Bugs fixed in SGE 6.1u1 since release 6.1
-----------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
2323 6576153 Creating a userset with NONE as a type results in a core dump
2317 6574565 Oracle, Postgres FOREIGN KEY fields need to be indexed
- 6573980 'qconf -help' suggests usage of patterns in user_list which is not true
2316 6573508 qrsh with ssh causes job to go in error state when Ctrl-C is pressed
2308 6572803 qhost -xml lacks '>' with initial qhost tag
2309 6572801 sge_queue_values definition does not contain PRIMARY KEY
2321 6571714 Inadequate error message when qconf -sstree is run when no share tree is configured
2241 6568712 util/arch has problem recognizing libc version number with comma
2292 6568578 6.1 upgrade procedure shall exit when there are jobs in the cluster
2249 6568575 SGE does not work if primary group entry is too big in groups map
2284 6565841 Oracle: rollback segments keep filling up, Postgres: delete query keeps running
2306 6564592 SGE 6.1 upgrade procedure is broken when using the classic qmaster spooling
2275 6564503 sge_schedd deadlock upon schedd_job_info job_list being enabled
2250 6558006 qmaster may crash with projects or usersets used in RQS
2243 6555744 qmon crashes when displaying about dialog
2248 6554313 add -u <user> to scheduler category only if there is a resource quota for the user
2238 6551568 need faster resource quota matchmaking and more concise job info messages
- 6550718 qstat -j lacks resource quota info messages in case of "incomplete" resource quotas
2296 6548455 csp mode installation, using /etc/services, qmaster is not starting!
2232 6546807 qhost -j -xml does not work
2325 6542987 drmaa_run_job(3) raises error if drmaa_native_specification has leading spaces
2239 6542137 use of hostgroups in resource quotas is less performant than the full list of hosts
- 6541085 NFS write error on N1GE trace file
2300 6539199 qquota(1) filtering broken for project and pe if -P/-pe switch is not used
2299 6536039 sgeremoterun not working
2201 6529974 Use of MORE fails on some architectures
- 6528949 inst_sge -ux uninstallation of exechost tries to delete local spooldir, even if it isn't configured!
2191 6525883 qstat -s hX filtering is broken on darwin
2189 6525375 qacct ignores jobs in output
2320 6513115 in qmon, under calendar configuration, it is possible to modify even if no calendar exists
2326 6506661 sge_conf(5), description for rlogin_daemon and qlogin_daemon is wrong
2307 6433628 qconf -sq all.q@myhost produces no value at all for complex_values (not even NONE)
2289 6565951 Qmon panel does not check for valid data in Scheduler Configuration
2314 6513116 Qmon x qconf inconsistent in allowed characters in attribute names
- 6195248 QMON Job Control Window: Incomprehensible Priority Button
2313 6410592 Double clicking in Consumables/Fixed Attributes list does not behave as a GUI should
2312 6482211 complex attributes whose deletion is denied do not reflect back after the denial message in qmon
2301 6551121 Memory leak in libdrmaa.so
916 6355875 qsub -terse to just output job id
- 6522273 Wrong exit code with qconf -sds
2266 6563346 Wrong usage of 'day' format model in trunc(date) Oracle functions
2187 6562190 memory leak in sge_schedd
2265 6280747 qmon loses sharetree changes
747 6291044 "Modify"-Button is activated but should be grayed
2263 6553066 qmon's Complex Configuration Load and Save buttons did not work
2262 4742097 Qmon has a ticket number limitation
1729 4818801 qmon on secondary screen crashes when "Job Control" is pressed
2261 6538740 clear usage operation should implicitly trigger refresh in share-tree dialogue
2260 6327539 Ability to sort queue instances using each column of the queue instances table
2229 6544869 UNKNOWN group/owner in accounting(5)
2247 6556411 DBW queries "Average Job Turnaround Time", "Average Job Wait Time" might not work
- 6481737 Arco should support webconsole 3.0.x
- 6559385 Calling JGDI getQueueInstanceSummary results in a memory leak
1813 6328064 Queue request -q from sge_request can't be overridden through command line
- 6355674 arcorun can not be used as sge_admin user if the toc file is not available
2164 6514085 Need a possibility to update existing example queries for the ARCo web application
- 6426331 remove util/sge_log_tee from distribution
- 6476263 function job_get_id_string() is not MT safe and used in qmaster
2219 6536426 inst_sge -m fails for non-root when USER variable is not set
1860 6345522 qdel on a job in deleted state does not output any information
2258 5081743 queue status in reporting file is missing.
2050 6422335 still used usersets/project/calendar/pe/checkpoint can be removed under certain conditions

Bugs fixed in SGE 6.1 since release 6.1_beta
--------------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
1941 5086007 qstat -qs doesn't work
2183 6499217 meaningless error in clients when reporting_param flush_time is incorrectly set
- 6525497 JGDI crashes JVM when null is passed to JNI GetStringUTFChars function
2220 6440226 add installation of SGE_Helper_Service to auto installation
2221 6521802 the binary check in inst_sge is wrong!
- 6537633 Extraneous space in qsub's "Invalid month specification." message
2222 6538293 Hybrid user/project share-tree is broken for user sharing amongst array jobs
2180 6518684 Qconf usage x man page inconsistency
2181 6518689 Project man page contains different attribute names.
2171 6516288 Scheduler does not write pid file in daemonize phase
2178 6518607 invalid memory access in cl_com_get_handle
- 6520761 add background mode to N1 Grid Engine Helper Service
- 6233523 loadcheck reports on a hyperthreaded CPU only one processor
- 6276612 provide support for Itanium platform
752 6288953 scalability issue with qdel and very large array jobs
751 6291047 qconf -sstnode cannot find root
- 6303750 Install guide ambiguous on role of CSP
1930 6329378 incorrect qsub error message, if an invalid integer value is passed to the -l option
1858 6344960 qtcsh behaves differently in direct mode from qrsh mode
1933 6349037 "qstat -explain E" displays explanation of the same error two times.
1940 6362523 qstat -q filter does compare hosts in queue instances
- 6363245 on some Windows execution hosts, execd hangs after the job has finished
1978 6383256 no newline at end of sge_shepherd's exit_status messages
- 6395078 wrong entry in sgepasswd file wrongly sets whole host in error state
2012 6402127 qconf -suserl reports incorrect status if no users are defined
- 6403152 qconf -as returns error code 0 even in case of unresolvable host
2015 6403810 JavaDocs for DRMAA need improvement
- 6428621 add a reserved complex value to control displaying Windows GUIs
- 6453426 Event clients will not get list updates, when they change their subscription after the registration
- 6461308 Wrong path to spooled parallel jobs when using classic spooling
2130 6501447 No online usage for MacOS X
2141 6506701 sge_shepherd dumps core on linux amd64 for qrsh jobs with very long cmdline (> 10k)
2233 6528950 modifying a RQS with invalid syntax results in its deletion
- 6533952 Admins guide does not mention that parallel environments must be linked with queues
- 6535768 Upgrade chapter 5 in 6.1 install guide must mention abolition of LD_LIBRARY_PATH for Solaris/Linux
- 6535775 Upgrade chapter in 6.1 Install Guide wrongly indicates upgrade from 5.3 was possible
- 6537476 6.1 install guide broken and incomplete with respect to MySQL installation for ARCO
- 6537607 6.1 Admins guide needs improvement on the linking between queues and parallel environments
- 6539215 quota verification time may not grow with the number of queues
2224 6539792 resource quotas broken after qmaster restart
- 6542483 Important changes with Resource Quota chapter in 6.1 admins guide
- 6545277 sge_statistic tables are not documented
2230 6546370 Pivot for ARCo Accounting Queries does not show all the fields
2231 6546802 qstat -F -xml does not show resources

Bugs fixed in SGE 6.1_beta since release 6.1_preview2
-----------------------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
- 6267190 Typo before "About the urgent priority" in Admin Guide
1445 6291021 64 bit solaris BDB rpc server broken
1703 6295319 Admin guide: refers to sge_host(5) instead of host_conf(5)
- 6344917 Error in Embedded Command Line Options example
- 6395075 on Windows, execd doesn't provide useful error messages when SSL keys are broken
2188 6421113 CSP mode auto installation: certificates are not copied to submit hosts
- 6444526 Admin guide describes N1GE backup facility, but restore is not described
2196 6472614 Auto installation option failed to save the install log
2182 6513433 remote installation of execds needs enhancement, rework, cleanup
2139 6506690 dbwriter should not use autocommit mode
- 6520257 need to define continuation character behaviour with qconf file formats
- 6521285 describe useful characters for every parameter
2185 6522385 qmon crash in cluster configuration dialog when modifying a host
2192 6525917 qacct -l h=<hostname> dumps core on darwin and linux itanium
2198 6528808 sge_ca script fails on nfs no root access file systems
2202 6530335 qmaster aborts when a resource quota set is modified while jobs are running
2204 6531317 qstat -xml does not show pending/zombie jobs
2206 6531921 qstat -r -xml is not working
2207 6533754 resource quotas are modified on qconf -mrqs, even if the editor is exited without saving

Bugs fixed in SGE 6.1_preview2 since release 6.1_preview1
---------------------------------------------------------
Issue Sun BugId Description
-------- --------- ------------------------------------------------------------------------------------------
- 5093930 ARCo should work with MySQL
- 5101053 Regular expressions should also be mentioned in qsub in addition to complex
- 5101735 Needs more boolean operators support for resource requests
56 6205203 Logical OR operator works only with complex attributes of type RESTRING
2135 6506115 Invalid qconf -mattr crashes qmaster
2150 6507572 qconf -Arqs added invalid RQS
2146 6510635 Default requests for complexes not honored by resource quotas
2161 6513944 qmaster core dump with usersets referenced in RQS
2162 6513967 unix groups are not considered by RQS
2166 6515122 add -wd working_dir in addition to -cwd option for submission