Commit graph

5 commits

Author SHA1 Message Date
jnemeth
cc6147e399 Update to MySQL Cluster 7.4.6:
----

Changes in MySQL Cluster NDB 7.4.6 (5.6.24-ndb-7.4.6)

Bugs Fixed

    During backup, loading data from one SQL node followed by
repeated DELETE statements on the tables just loaded from a different
SQL node could lead to data node failures. (Bug #18949230)

    When an instance of NdbEventBuffer was destroyed, any references
to GCI operations that remained in the event buffer data list were
not freed. Now these are freed, and items from the event buffer data
list are returned to the free list when purging GCI containers.
(Bug #76165, Bug #20651661)

    When a bulk delete operation was committed early to avoid an
additional round trip, while also returning the number of affected
rows, but failed with a timeout error, an SQL node performed no
verification that the transaction was in the Committed state. (Bug
#74494, Bug #20092754)

    References: See also Bug #19873609.

Changes in MySQL Cluster NDB 7.4.5 (5.6.23-ndb-7.4.5)

Bugs Fixed

    In the event of a node failure during an initial node restart
followed by another node start, the restart of the affected
node could hang with a START_INFOREQ that occurred while invalidation
of local checkpoints was still ongoing. (Bug #20546157, Bug #75916)

    References: See also Bug #34702.

    It was found during testing that problems could arise when the
node registered as the arbitrator disconnected or failed during
the arbitration process.

    In this situation, the node requesting arbitration could never
receive a positive acknowledgement from the registered arbitrator;
this node also lacked a stable set of members and could not initiate
selection of a new arbitrator.

    Now in such cases, when the arbitrator fails or loses contact
during arbitration, the requesting node immediately fails rather
than waiting to time out. (Bug #20538179)

    DROP DATABASE failed to remove the database when the database
directory contained a .ndb file which had no corresponding table
in NDB. Now, when executing DROP DATABASE, NDB performs a check
specifically for leftover .ndb files, and deletes any that it finds.
(Bug #20480035)

    References: See also Bug #44529.

    The maximum failure time calculation used to ensure that normal
node failure handling mechanisms are given time to handle survivable
cluster failures (before global checkpoint watchdog mechanisms
start to kill nodes due to GCP delays) was excessively conservative,
and neglected to consider that there can be at most number_of_data_nodes
/ NoOfReplicas node failures before the cluster can no longer
survive. Now the value of NoOfReplicas is properly taken into
account when performing this calculation.  (Bug #20069617, Bug
#20069624)

    References: See also Bug #19858151, Bug #20128256, Bug #20135976.
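
    The bound described in this entry can be written out as a short
calculation. This is a minimal sketch of the arithmetic only, not the
NDB kernel code; the function name is hypothetical.

```python
def max_survivable_node_failures(number_of_data_nodes: int,
                                 no_of_replicas: int) -> int:
    """Upper bound on node failures before the cluster cannot survive.

    Implements only the ratio quoted in the entry above:
    number_of_data_nodes / NoOfReplicas.
    """
    if number_of_data_nodes < 1 or no_of_replicas < 1:
        raise ValueError("both values must be positive")
    if number_of_data_nodes % no_of_replicas != 0:
        raise ValueError("data node count must be a multiple of NoOfReplicas")
    return number_of_data_nodes // no_of_replicas
```

    For example, 4 data nodes with NoOfReplicas=2 gives a bound of 2.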

    When performing a restart, it was sometimes possible to find
a log end marker which had been written by a previous restart, and
that should have been invalidated. Now when searching for the
last page to invalidate, the same search algorithm is used as when
searching for the last page of the log to read.  (Bug #76207, Bug
#20665205)

    During a node restart, if there was no global checkpoint
completed between the START_LCP_REQ for a local checkpoint and its
LCP_COMPLETE_REP it was possible for a comparison of the LCP ID
sent in the LCP_COMPLETE_REP signal with the internal value
SYSFILE->latestLCP_ID to fail. (Bug #76113, Bug #20631645)

    When sending LCP_FRAG_ORD signals as part of master takeover,
it is possible that the master is not synchronized with complete
accuracy in real time, so that some signals must be dropped. During
this time, the master can send a LCP_FRAG_ORD signal with its
lastFragmentFlag set even after the local checkpoint has been
completed. This enhancement causes this flag to persist until the
start of the next local checkpoint, which causes these signals to
be dropped as well.

    This change affects ndbd only; the issue described did not
occur with ndbmtd. (Bug #75964, Bug #20567730)

    When reading and copying transporter short signal data, it was
possible for the data to be copied back to the same signal with
overlapping memory. (Bug #75930, Bug #20553247)
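
    The general hazard behind this entry can be shown with a plain
byte buffer. This is a generic illustration of copying between
overlapping regions, not the transporter code; both function names
are hypothetical.

```python
def naive_forward_copy(buf: bytearray, src: int, dst: int, n: int) -> None:
    """Copy n bytes one at a time; unsafe when the regions overlap."""
    for i in range(n):
        # Later reads may see bytes that this loop has already overwritten.
        buf[dst + i] = buf[src + i]


def overlap_safe_copy(buf: bytearray, src: int, dst: int, n: int) -> None:
    """Snapshot the source first, so overlap cannot corrupt the result."""
    snapshot = bytes(buf[src:src + n])
    buf[dst:dst + n] = snapshot
```

    Copying 4 bytes from offset 0 to offset 2 of "abcdef" yields
"ababab" with the naive loop, but "ababcd" with the safe version.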

    NDB node takeover code made the assumption that there would be
only one takeover record when starting a takeover, based on the
further assumption that the master node could never perform copying
of fragments. However, this is not the case in a system restart,
where a master node can have stale data and so need to perform such
copying to bring itself up to date. (Bug #75919, Bug #20546899)

    Cluster API: A scan operation, whether it is a single table
scan or a query scan used by a pushed join, stores the result set
in a buffer. The maximum size of this buffer is calculated and
preallocated before the scan operation is started. This buffer may
consume a considerable amount of memory; in some cases we observed
a 2 GB buffer footprint in tests that executed 100 parallel scans
with 2 single-threaded (ndbd) data nodes.  This memory consumption
was found to scale linearly with additional fragments.

    A number of root causes, listed here, were discovered that led
to this problem:

	Result rows were unpacked to full NdbRecord format before
they were stored in the buffer. If only some but not all columns
of a table were selected, the buffer contained empty space (essentially
wasted).

	Due to the buffer format being unpacked, VARCHAR and
VARBINARY columns always had to be allocated for the maximum size
defined for such columns.

	BatchByteSize and MaxScanBatchSize values were not taken
into consideration as a limiting factor when calculating the maximum
buffer size.

    These issues became more evident in NDB 7.2 and later MySQL
Cluster release series. This was due to the fact that buffer size is
scaled by BatchSize, and that the default value for this parameter
was increased fourfold (from 64 to 256) beginning with MySQL Cluster
NDB 7.2.1.

    This fix causes result rows to be buffered using the packed
format instead of the unpacked format; a buffered scan result row
is now not unpacked until it becomes the current row. In addition,
BatchByteSize and MaxScanBatchSize are now used as limiting factors
when calculating the required buffer size.

    Also as part of this fix, refactoring has been done to separate
handling of buffered (packed) from handling of unbuffered result
sets, and to remove code that had been unused since NDB 7.0 or
earlier. The NdbRecord class declaration has also been cleaned up
by removing a number of unused or redundant member variables.  (Bug
#73781, Bug #75599, Bug #19631350, Bug #20408733)
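
    The effect of using BatchByteSize and MaxScanBatchSize as
limiting factors can be sketched with a simplified model. The
formulas below are assumptions made for illustration (the real
calculation in the NDB API is not given in these notes), and all
function and parameter names are hypothetical.

```python
def unpacked_buffer_bytes(batch_size: int, full_record_bytes: int,
                          fragments: int) -> int:
    """Old behaviour (assumed model): every row slot is sized for the
    full unpacked record, and the total scales with fragment count."""
    return batch_size * full_record_bytes * fragments


def limited_buffer_bytes(batch_size: int, packed_row_bytes: int,
                         fragments: int, batch_byte_size: int,
                         max_scan_batch_size: int) -> int:
    """New behaviour (assumed model): rows stay packed, and the two
    configuration limits cap the buffer size."""
    per_fragment = min(batch_size * packed_row_bytes, batch_byte_size)
    return min(per_fragment * fragments, max_scan_batch_size)
```

    With the made-up inputs batch_size=256, an 8192-byte unpacked
record, and 8 fragments, the first function returns 16777216 bytes;
the second, with 512-byte packed rows and limits of 16384 and 262144
bytes, returns 131072.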

-----

Changes in MySQL Cluster NDB 7.4.4 (5.6.23-ndb-7.4.4)

Bugs Fixed

    When upgrading a MySQL Cluster from NDB 7.3 to NDB 7.4, the
first data node started with the NDB 7.4 data node binary caused
the master node (still running NDB 7.3) to fail with Error 2301,
then itself failed during Start Phase 5. (Bug #20608889)

    A memory leak in NDB event buffer allocation caused an event
to be leaked for each epoch. (Due to the fact that an SQL node uses
3 event buffers, each SQL node leaked 3 events per epoch.) This
meant that a MySQL Cluster mysqld leaked an amount of memory that
was inversely proportional to the size of TimeBetweenEpochs; that
is, the smaller the value for this parameter, the greater the amount
of memory leaked per unit of time. (Bug #20539452)
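
    The inverse relationship can be made concrete by counting leaked
events per second. This sketch counts events, not bytes, since the
size of a leaked event is not stated here; the function name is
hypothetical.

```python
EVENT_BUFFERS_PER_SQL_NODE = 3  # figure stated in the entry above


def leaked_events_per_second(time_between_epochs_ms: float,
                             sql_nodes: int = 1) -> float:
    """Events leaked per second, given one leaked event per event
    buffer per epoch."""
    if time_between_epochs_ms <= 0:
        raise ValueError("TimeBetweenEpochs must be positive")
    epochs_per_second = 1000.0 / time_between_epochs_ms
    return EVENT_BUFFERS_PER_SQL_NODE * epochs_per_second * sql_nodes
```

    Halving TimeBetweenEpochs from 100 ms to 50 ms doubles the rate
from 30 to 60 leaked events per second per SQL node.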

    The values of the Ndb_last_commit_epoch_server and
Ndb_last_commit_epoch_session status variables were incorrectly
reported on some platforms. To correct this problem, these values
are now stored internally as long long, rather than long. (Bug
#20372169)
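
    Why the storage type matters can be shown by wrapping a 64-bit
value into a narrower signed integer, as happens on platforms where
long is 32 bits wide. This is an illustration only; the epoch value
used in the usage note is made up, and the helper name is
hypothetical.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Return value as it would be stored in a signed integer of the
    given width (two's complement wraparound)."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value
```

    For a value such as (1234 << 32) | 7, wrap_signed(value, 32)
returns 7, losing the upper half, while wrap_signed(value, 64)
returns the value unchanged.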

    When restoring a MySQL Cluster from backup, nodes that failed
and were restarted during restoration of another node became
unresponsive, which subsequently caused ndb_restore to fail and
exit. (Bug #20069066)

    When a data node fails or is being restarted, the remaining
nodes in the same nodegroup resend to subscribers any data which
they determine has not already been sent by the failed node.
Normally, when a data node (actually, the SUMA kernel block) has
sent all data belonging to an epoch for which it is responsible,
it sends a SUB_GCP_COMPLETE_REP signal, together with a count, to
all subscribers, each of which responds with a SUB_GCP_COMPLETE_ACK.
When SUMA receives this acknowledgment from all subscribers, it
reports this to the other nodes in the same nodegroup so that they
know that there is no need to resend this data in case of a subsequent
node failure. If a node failed after all subscribers sent this
acknowledgement but before all the other nodes in the same nodegroup
received it from the failing node, data for some epochs could be
sent (and reported as complete) twice, which could lead to an
unplanned shutdown.

    The fix for this issue adds to the count reported by
SUB_GCP_COMPLETE_ACK a list of identifiers which the receiver can
use to keep track of which buckets are completed and to ignore any
duplicate reported for an already completed bucket.  (Bug #17579998)
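
    The receiver-side bookkeeping described in the fix can be
sketched as follows. This is a simplified model, not the SUMA or NDB
API code; the class and method names are hypothetical, and plain
integers stand in for epoch and bucket identifiers.

```python
class CompletedBucketTracker:
    """Remember which buckets are complete for each epoch and ignore
    duplicate completion reports."""

    def __init__(self) -> None:
        self._completed = {}  # epoch -> set of completed bucket ids

    def report_complete(self, epoch: int, bucket_ids) -> list:
        """Return only the bucket ids newly completed by this report."""
        seen = self._completed.setdefault(epoch, set())
        fresh = []
        for bucket in bucket_ids:
            if bucket not in seen:
                seen.add(bucket)
                fresh.append(bucket)
        return fresh
```

    A second report for epoch 10 that repeats bucket 1 and adds
bucket 2 returns only [2]; the duplicate is dropped.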

    The output format of SHOW CREATE TABLE for an NDB table containing
foreign key constraints did not match that for the equivalent InnoDB
table, which could lead to issues with some third-party applications.
(Bug #75515, Bug #20364309)

    An ALTER TABLE statement containing comments and a partitioning
option against an NDB table caused the SQL node on which it was
executed to fail. (Bug #74022, Bug #19667566)

    Cluster API: When a transaction is started from a cluster
connection, Table and Index schema objects may be passed to this
transaction for use. If these schema objects have been acquired
from a different connection (Ndb_cluster_connection object), they
can be deleted at any point by the deletion or disconnection of
the owning connection. This can leave a connection with invalid
schema objects, which causes an NDB API application to fail when
these are dereferenced.

    To avoid this problem, if your application uses multiple
connections, you can now set a check to detect sharing of schema
objects between connections when passing a schema object to a
transaction, using the NdbTransaction::setSchemaObjectOwnerChecks()
method added in this release. When this check is enabled, the schema
objects having the same names are acquired from the connection and
compared to the schema objects passed to the transaction. Failure
to match causes the application to fail with an error. (Bug #19785977)
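
    The check described above can be modelled in a few lines. This
is a sketch of the idea only (look up the same-named object on the
transaction's own connection and compare), not the NDB API; every
class and method name here is hypothetical.

```python
class SchemaObject:
    def __init__(self, name: str) -> None:
        self.name = name


class Connection:
    """Stand-in for a cluster connection owning its schema objects."""

    def __init__(self) -> None:
        self._cache = {}

    def get_table(self, name: str) -> SchemaObject:
        if name not in self._cache:
            self._cache[name] = SchemaObject(name)
        return self._cache[name]


class Transaction:
    def __init__(self, connection: Connection,
                 owner_checks: bool = False) -> None:
        self.connection = connection
        self.owner_checks = owner_checks

    def use_table(self, table: SchemaObject) -> SchemaObject:
        # With checks enabled, compare against this connection's own object.
        if self.owner_checks:
            if self.connection.get_table(table.name) is not table:
                raise RuntimeError("schema object owned by another connection")
        return table
```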

    Cluster API: The increase in the default number of hashmap
buckets (DefaultHashMapSize API node configuration parameter) from
240 to 3480 in MySQL Cluster NDB 7.2.11 increased the size of the
internal DictHashMapInfo::HashMap type considerably.  This type
was allocated on the stack in some getTable() calls which could
lead to stack overflow issues for NDB API users.

    To avoid this problem, the hashmap is now dynamically allocated
from the heap. (Bug #19306793)

-----

Changes in MySQL Cluster NDB 7.4.3 (5.6.22-ndb-7.4.3)

Functionality Added or Changed

    Important Change; Cluster API: This release introduces an
epoch-driven Event API for the NDB API that supersedes the earlier
GCI-based model. The new version of this API also simplifies error
detection and handling, and monitoring of event buffer memory usage
has been improved.

    New event handling methods for Ndb and NdbEventOperation added
by this change include NdbEventOperation::getEventType2(),
pollEvents2(), nextEvent2(), getHighestQueuedEpoch(),
getNextEventOpInEpoch2(), getEpoch(), isEmptyEpoch(), and isErrorEpoch().
The pollEvents(), nextEvent(), getLatestGCI(), getGCIEventOperations(),
isConsistent(), isConsistentGCI(), getEventType(), getGCI(),
isOverrun(), hasError(), and clearError() methods
are deprecated beginning with the same release.

    Some (but not all) of the new methods act as replacements for
deprecated methods; not all of the deprecated methods map to new
ones. The Event Class provides information as to which old methods
correspond to new ones.

    Error handling using the new API is no longer handled using
dedicated hasError() and clearError() methods, which are now
deprecated as previously noted. To support this change, TableEvent
now supports the values TE_EMPTY (empty epoch), TE_INCONSISTENT
(inconsistent epoch), and TE_OUT_OF_MEMORY (insufficient event
buffer memory).

    Event buffer memory management has also been improved with the
introduction of the get_eventbuffer_free_percent(),
set_eventbuffer_free_percent(), and get_eventbuffer_memory_usage()
methods, as well as a new NDB API error Free percent out of range
(error code 4123). Memory buffer usage can now be represented in
applications using the EventBufferMemoryUsage data structure, and
checked from MySQL client applications by reading the
ndb_eventbuffer_free_percent system variable.

    For more information, see the detailed descriptions for the
Ndb and NdbEventOperation methods listed. See also The Event::TableEvent
Type, as well as The EventBufferMemoryUsage Structure.

    Additional logging is now performed of internal states occurring
during system restarts such as waiting for node ID allocation and
master takeover of global and local checkpoints. (Bug #74316, Bug
#19795029)

    Added the MaxParallelCopyInstances data node configuration
parameter. In cases where the parallelism used during restart copy
phase (normally the number of LDMs up to a maximum of 16) is
excessive and leads to system overload, this parameter can be used
to override the default behavior by reducing the degree of parallelism
employed.

    Added the operations_per_fragment table to the ndbinfo information
database. Using this table, you can now obtain counts of operations
performed on a given fragment (or fragment replica).  Such operations
include reads, writes, updates, and deletes, scan and index operations
performed while executing them, and operations refused, as well as
information relating to rows scanned on and returned from a given
fragment replica. This table also provides information about
interpreted programs used as attribute values, and values returned
by them.

    Cluster API: Two new example programs, demonstrating reads and
writes of CHAR, VARCHAR, and VARBINARY column values, have been
added to storage/ndb/ndbapi-examples in the MySQL Cluster source
tree. For more information about these programs, including source
code listings, see NDB API Simple Array Example, and NDB API Simple
Array Example Using Adapter.

Bugs Fixed

    The global checkpoint commit and save protocols can be delayed
by various causes, including slow disk I/O. The DIH master node
monitors the progress of both of these protocols, and can enforce
a maximum lag time during which the protocols are stalled by killing
the node responsible for the lag when it reaches this maximum. This
DIH master GCP monitor mechanism did not perform its task more than
once per master node; that is, it failed to continue monitoring
after detecting and handling a GCP stop. (Bug #20128256)

    References: See also Bug #19858151, Bug #20069617, Bug #20062754.

    When running mysql_upgrade on a MySQL Cluster SQL node, the
expected drop of the performance_schema database on this node was
instead performed on all SQL nodes connected to the cluster.  (Bug
#20032861)

    The warning shown when an ALTER TABLE ALGORITHM=INPLACE ...
ADD COLUMN statement automatically changes a column's COLUMN_FORMAT
from FIXED to DYNAMIC now includes the name of the column whose
format was changed. (Bug #20009152, Bug #74795)

    The local checkpoint scan fragment watchdog and the global
checkpoint monitor can each exclude a node when it is too slow when
participating in their respective protocols. This exclusion was
implemented by simply asking the failing node to shut down, which,
if this was delayed (for whatever reason), could prolong the
duration of the GCP or LCP stall for other, unaffected nodes.

    To minimize this time, an isolation mechanism has been added
to both protocols whereby any other live nodes forcibly disconnect
the failing node after a predetermined amount of time. This allows
the failing node the opportunity to shut down gracefully (after
logging debugging and other information) if possible, but limits
the time that other nodes must wait for this to occur. Now, once
the remaining live nodes have processed the disconnection of any
failing nodes, they can commence failure handling and restart the
related protocol or protocols, even if the failed node takes an
excessively long time to shut down.  (Bug #19858151)

    References: See also Bug #20128256, Bug #20069617, Bug #20062754.

    The matrix of values used for thread configuration when applying
the setting of the MaxNoOfExecutionThreads configuration parameter
has been improved to align with support for greater numbers of LDM
threads. See Multi-Threading Configuration Parameters (ndbmtd),
for more information about the changes.  (Bug #75220, Bug #20215689)

    When a new node failed after connecting to the president but
not to any other live node, then reconnected and started again, a
live node that did not see the original connection retained old
state information. This caused the live node to send redundant
signals to the president, causing it to fail. (Bug #75218, Bug
#20215395)

    In the NDB kernel, it was possible for a TransporterFacade
object to reset a buffer while the data contained by the buffer
was being sent, which could lead to a race condition. (Bug #75041,
Bug #20112981)

    mysql_upgrade failed to drop and recreate the ndbinfo database
and its tables as expected. (Bug #74863, Bug #20031425)

    Due to a lack of memory barriers, MySQL Cluster programs such
as ndbmtd did not compile on POWER platforms. (Bug #74782, Bug
#20007248)

    In spite of the presence of a number of protection mechanisms
against overloading signal buffers, it was still in some cases
possible to do so. This fix adds block-level support in the NDB
kernel (in SimulatedBlock) to make signal buffer overload protection
more reliable than when implementing such protection on a case-by-case
basis. (Bug #74639, Bug #19928269)

    Copying of metadata during local checkpoints caused node restart
times to be highly variable which could make it difficult to diagnose
problems with restarts. The fix for this issue introduces signals
(including PAUSE_LCP_IDLE, PAUSE_LCP_REQUESTED, and
PAUSE_NOT_IN_LCP_COPY_META_DATA) to pause LCP execution and flush
LCP reports, making it possible to block LCP reporting at times
when LCPs during restarts become stalled in this fashion. (Bug
#74594, Bug #19898269)

    When a data node was restarted from its angel process (that
is, following a node failure), it could be allocated a new node ID
before failure handling was actually completed for the failed node.
(Bug #74564, Bug #19891507)

    In NDB version 7.4, node failure handling can require completing
checkpoints on up to 64 fragments. (This checkpointing is performed
by the DBLQH kernel block.) The requirement for master takeover to
wait for completion of all such checkpoints led in such cases to
excessively long completion times.

    To address these issues, the DBLQH kernel block can now report
that it is ready for master takeover before it has completed any
ongoing fragment checkpoints, and can continue processing these
while the system completes the master takeover. (Bug #74320, Bug
#19795217)

    Local checkpoints were sometimes started earlier than necessary
during node restarts, while the node was still waiting for copying
of the data distribution and data dictionary to complete.  (Bug
#74319, Bug #19795152)

    The check to determine when a node was restarting and so know
when to accelerate local checkpoints sometimes reported a false
positive. (Bug #74318, Bug #19795108)

    Values in different columns of the ndbinfo tables
disk_write_speed_aggregate and disk_write_speed_aggregate_node were
reported using differing multiples of bytes. Now all of these
columns display values in bytes.

    In addition, this fix corrects an error made when calculating
the standard deviations used in the std_dev_backup_lcp_speed_last_10sec,
std_dev_redo_speed_last_10sec, std_dev_backup_lcp_speed_last_60sec,
and std_dev_redo_speed_last_60sec columns of the
ndbinfo.disk_write_speed_aggregate table. (Bug #74317, Bug #19795072)

    Recursion in the internal method Dblqh::finishScanrec() led to
an attempt to create two list iterators with the same head.  This
regression was introduced during work done to optimize scans for
version 7.4 of the NDB storage engine. (Bug #73667, Bug #19480197)

    Transporter send buffers were not updated properly following
a failed send. (Bug #45043, Bug #20113145)

    Disk Data: An update on many rows of a large Disk Data table
could in some rare cases lead to node failure. In the event that
such problems are observed with very large transactions on Disk
Data tables you can now increase the number of page entries allocated
for disk page buffer memory by raising the value of the
DiskPageBufferEntries data node configuration parameter added in
this release. (Bug #19958804)

    Disk Data: In some cases, during DICT master takeover, the new
master could crash while attempting to roll forward an ongoing
schema transaction. (Bug #19875663, Bug #74510)

    Cluster API: It was possible to delete an Ndb_cluster_connection
object while there remained instances of Ndb using references to
it. Now the Ndb_cluster_connection destructor waits for all related
Ndb objects to be released before completing. (Bug #19999242)

    References: See also Bug #19846392.

-----

Changes in MySQL Cluster NDB 7.4.2 (5.6.21-ndb-7.4.2)

Functionality Added or Changed

    Added the restart_info table to the ndbinfo information database
to provide current status and timing information relating to node
and system restarts. By querying this table, you can observe the
progress of restarts in real time. (Bug #19795152)

    After adding new data nodes to the configuration file of a
MySQL Cluster having many API nodes, but prior to starting any of
the data node processes, API nodes tried to connect to these missing
data nodes several times per second, placing extra loads on management
nodes and the network. To reduce unnecessary traffic caused in this
way, it is now possible to control the amount of time that an API
node waits between attempts to connect to data nodes which fail to
respond; this is implemented in two new API node configuration
parameters StartConnectBackoffMaxTime and ConnectBackoffMaxTime.

    Time elapsed during node connection attempts is not taken into
account when applying these parameters, both of which are given in
milliseconds with approximately 100 ms resolution. As long as the
API node is not connected to any data nodes as described previously,
the value of the StartConnectBackoffMaxTime parameter is applied;
otherwise, ConnectBackoffMaxTime is used.

    In a MySQL Cluster with many unstarted data nodes, the values
of these parameters can be raised to circumvent connection attempts
to data nodes which have not yet begun to function in the cluster,
as well as moderate high traffic to management nodes.

    For more information about the behavior of these parameters,
see Defining SQL and Other API Nodes in a MySQL Cluster. (Bug
#17257842)
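
    How the two parameters might interact can be sketched as below.
The selection rule follows the description above; the doubling of
the delay between attempts is an assumption made for illustration
and is not stated in these notes. Function names are hypothetical.

```python
RESOLUTION_MS = 100  # approximate resolution stated above


def backoff_cap_ms(connected_to_any_data_node: bool,
                   start_connect_backoff_max_ms: int,
                   connect_backoff_max_ms: int) -> int:
    """Pick which parameter applies, per the description above."""
    if connected_to_any_data_node:
        return connect_backoff_max_ms
    return start_connect_backoff_max_ms


def retry_delays_ms(cap_ms: int, attempts: int) -> list:
    """Delays between attempts, doubling up to the cap (assumed policy)."""
    cap = max(cap_ms, RESOLUTION_MS)
    delay = RESOLUTION_MS
    delays = []
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays
```

    With a cap of 1500 ms, six attempts are spaced 100, 200, 400,
800, 1500, and 1500 ms apart under this assumed policy.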

Bugs Fixed

    When performing a batched update, where one or more successful
write operations from the start of the batch were followed by write
operations which failed without being aborted (due to the AbortOption
being set to AO_IgnoreError), the failure handling for these by
the transaction coordinator leaked CommitAckMarker resources. (Bug
#19875710)

    References: This bug was introduced by Bug #19451060, Bug #73339.

    Online downgrades to MySQL Cluster NDB 7.3 failed when a MySQL
Cluster NDB 7.4 master attempted to request a local checkpoint with
32 fragments from a data node already running NDB 7.3, which supports
only 2 fragments for LCPs. Now in such cases, the NDB 7.4 master
determines how many fragments the data node can handle before making
the request. (Bug #19600834)

    The fix for a previous issue with the handling of multiple node
failures required determining the number of TC instances the failed
node was running, then taking them over. The mechanism to determine
this number sometimes provided an invalid result which caused the
number of TC instances in the failed node to be set to an excessively
high value. This in turn caused redundant takeover attempts, which
wasted time and had a negative impact on the processing of other
node failures and of global checkpoints. (Bug #19193927)

    References: This bug was introduced by Bug #18069334.

    The server side of an NDB transporter disconnected an incoming
client connection very quickly during the handshake phase if the
node at the server end was not yet ready to receive connections
from the other node. This led to problems when the client immediately
attempted once again to connect to the server socket, only to be
disconnected again, and so on in a repeating loop, until it succeeded.
Since each client connection attempt left behind a socket in
TIME_WAIT, the number of sockets in TIME_WAIT increased rapidly,
leading in turn to problems with the node on the server side of
the transporter.

    Further analysis of the problem and code showed that the root
of the problem lay in the handshake portion of the transporter
connection protocol. To keep the issue described previously from
occurring, the node at the server end now sends back a WAIT message
instead of disconnecting the socket when the node is not yet ready
to accept a handshake. This means that the client end should no
longer need to create a new socket for the next retry, but can
instead begin immediately with a new handshake hello message. (Bug
#17257842)
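
    The difference between the two behaviours can be seen by
counting sockets in a toy simulation. This is a simplified model of
the retry loop, not the transporter code; the function name and the
idea of a fixed number of attempts until the server is ready are
assumptions.

```python
def simulate_handshake(ready_after_attempts: int,
                       server_sends_wait: bool) -> tuple:
    """Return (sockets_opened, sockets_left_in_time_wait)."""
    sockets_opened = 0
    time_wait = 0
    have_socket = False
    for attempt in range(1, ready_after_attempts + 1):
        if not have_socket:
            sockets_opened += 1
            have_socket = True
        if attempt == ready_after_attempts:
            break  # server is ready; handshake succeeds
        if server_sends_wait:
            continue  # WAIT: keep the socket, send a new hello
        # Old behaviour: server disconnects, socket lingers in TIME_WAIT.
        have_socket = False
        time_wait += 1
    return sockets_opened, time_wait
```

    If the server becomes ready on the fifth attempt, the old
behaviour opens 5 sockets and leaves 4 in TIME_WAIT, while the WAIT
behaviour opens 1 socket and leaves none.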

    Corrupted messages to data nodes sometimes went undetected,
causing a bad signal to be delivered to a block which aborted the
data node. This failure in combination with disconnecting nodes
could in turn cause the entire cluster to shut down.

    To keep this from happening, additional checks are now made
when unpacking signals received over TCP, including checks for byte
order, compression flag (which must not be used), and the length
of the next message in the receive buffer (if there is one).

    Whenever two consecutive unpacked messages fail the checks just
described, the current message is assumed to be corrupted. In this
case, the transporter is marked as having bad data and no more
unpacking of messages occurs until the transporter is reconnected.
In addition, an entry is written to the cluster log containing the
error as well as a hex dump of the corrupted message. (Bug #73843,
Bug #19582925)
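
    The rule of two consecutive failed checks can be sketched as a
small state machine. This is a simplified model, not the NDB
transporter code: the header fields, the handling of a single failed
message (skipped here), and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Header:
    byte_order_ok: bool
    compressed: bool   # the compression flag must not be set
    length: int        # message length claimed by the header


def header_ok(h: Header, bytes_remaining: int) -> bool:
    return (h.byte_order_ok and not h.compressed
            and 0 < h.length <= bytes_remaining)


class Transporter:
    def __init__(self) -> None:
        self.bad_data = False

    def unpack(self, headers, buffer_size: int) -> int:
        """Deliver messages until two consecutive headers fail."""
        delivered, failures, remaining = 0, 0, buffer_size
        for h in headers:
            if self.bad_data:
                break  # no unpacking until reconnect
            if header_ok(h, remaining):
                failures = 0
                delivered += 1
                remaining -= h.length
            else:
                failures += 1
                if failures == 2:
                    self.bad_data = True
        return delivered

    def reconnect(self) -> None:
        self.bad_data = False
```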

    During restore operations, an attribute's maximum length was
used when reading variable-length attributes from the receive buffer
instead of the attribute's actual length. (Bug #73312, Bug #19236945)
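
    The effect of reading by maximum length instead of actual
length can be demonstrated on a toy buffer. The layout used here (a
one-byte length prefix followed by the data) is an assumption for
illustration, not the ndb_restore wire format; the function name is
hypothetical.

```python
def read_var_attributes(buffer: bytes, count: int, max_len: int,
                        use_actual_length: bool = True) -> list:
    """Read count length-prefixed values from buffer."""
    values = []
    offset = 0
    for _ in range(count):
        actual_len = buffer[offset]
        offset += 1
        # The bug: advancing by max_len overruns into the next attribute.
        take = actual_len if use_actual_length else max_len
        values.append(buffer[offset:offset + take])
        offset += take
    return values
```

    For a buffer holding "ab" and "xyz" with a maximum length of 4,
the correct reader returns both values, while the buggy one swallows
the next length byte into the first value and misreads the rest.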

-----

Changes in MySQL Cluster NDB 7.4.1 (5.6.20-ndb-7.4.1)

Node Restart Performance and Reporting Enhancements

    Performance: A number of performance and other improvements
have been made with regard to node starts and restarts. The following
list contains a brief description of each of these changes:

	Before memory allocated on startup can be used, it must be
touched, causing the operating system to allocate the actual physical
memory needed. The process of touching each page of memory that
was allocated has now been multithreaded, with touch times on the
order of 3 times shorter than with a single thread when performed
by 16 threads.

	When performing a node or system restart, it is necessary
to restore local checkpoints for the fragments. This process
previously used delayed signals at a point which was found to be
critical to performance; these have now been replaced with normal
(undelayed) signals, which should shorten significantly the time
required to back up a MySQL Cluster or to restore it from backup.

	Previously, there could be at most 2 LDM instances active
with local checkpoints at any given time. Now, up to 16 LDMs can
be used for performing this task, which increases utilization of
available CPU power, and can speed up LCPs by a factor of 10, which
in turn can greatly improve restart times.

	Better reporting of disk writes and increased control over
these also make up a large part of this work. New ndbinfo tables
disk_write_speed_base, disk_write_speed_aggregate, and
disk_write_speed_aggregate_node provide information about the speed
of disk writes for each LDM thread that is in use. The DiskCheckpointSpeed
and DiskCheckpointSpeedInRestart configuration parameters have been
deprecated, and are subject to removal in a future MySQL Cluster
release. This release adds the data node configuration parameters
MinDiskWriteSpeed, MaxDiskWriteSpeed, MaxDiskWriteSpeedOtherNodeRestart,
and MaxDiskWriteSpeedOwnRestart to control write speeds for LCPs
and backups when the present node, another node, or no node is
currently restarting.

	For more information, see the descriptions of the ndbinfo
tables and MySQL Cluster configuration parameters named previously.

	Reporting of MySQL Cluster start phases has been improved,
with more frequent printouts. New and better information about the
start phases and their implementation has also been provided in
the sources and documentation. See Summary of MySQL Cluster Start
Phases.

Improved Scan and SQL Processing

    Performance: Several internal methods relating to the NDB
receive thread have been optimized to make mysqld more efficient
in processing SQL applications with the NDB storage engine. In
particular, this work improves the performance of the
NdbReceiver::execTRANSID_AI() method, which is commonly used
to receive a record from the data nodes as part of a scan
operation. (Since the receiver thread sometimes has to process
millions of received records per second, it is critical that
this method does not perform unnecessary work, or tie up
resources that are not strictly needed.) The associated internal
functions receive_ndb_packed_record() and handleReceivedSignal()
have also been improved, and made more efficient.

Per-Fragment Memory Reporting

    Information about memory usage by individual fragments can now
be obtained from the memory_per_fragment view added in this release
to the ndbinfo information database. This information includes
pages having fixed and variable element size, rows, fixed element
free slots, variable element free bytes, and hash index memory
usage. For more information, see The ndbinfo memory_per_fragment Table.

Bugs Fixed

    In some cases, transporter receive buffers were reset by one
thread while being read by another. This happened when a race
condition occurred between a thread receiving data and another
thread initiating disconnect of the transporter (disconnection
clears this buffer). Concurrency logic has now been implemented to
keep this race from taking place. (Bug #19552283, Bug #73790)

    When a new data node started, API nodes were allowed to attempt
to register themselves with the data node for executing transactions
before the data node was ready. This forced the API node to wait
an extra heartbeat interval before trying again.

    To address this issue, a number of HA_ERR_NO_CONNECTION errors
(Error 4009) that could be issued during this time have been changed
to Cluster temporarily unavailable errors (Error 4035), which should
allow API nodes to use new data nodes more quickly than before. As
part of this fix, some errors which were incorrectly categorised
have been moved into the correct categories, and some errors which
are no longer used have been removed. (Bug #19524096, Bug #73758)

    Executing ALTER TABLE ... REORGANIZE PARTITION after increasing
the number of data nodes in the cluster from 4 to 16 led to a crash
of the data nodes. This issue was shown to be a regression caused
by a previous fix which added a new dump handler using a dump code
that was already in use (7019), which caused the command to execute
two different handlers with different semantics. The new handler
was assigned a new DUMP code (7024).  (Bug #18550318)

    References: This bug is a regression of Bug #14220269.

    When certain queries generated signals having more than 18 data
words prior to a node failure, such signals were not written
correctly in the trace file. (Bug #18419554)

    Failure of multiple nodes while using ndbmtd with multiple TC
threads was not handled gracefully under a moderate amount of
traffic, which could in some cases lead to an unplanned shutdown
of the cluster. (Bug #18069334)

    For multithreaded data nodes, some threads do not communicate often,
with the result that very old signals can remain at the top of the
signal buffers. When performing a thread trace, the signal dumper
calculated the latest signal ID from what it found in the signal
buffers, which meant that these old signals could be erroneously
counted as the newest ones. Now the signal ID counter is kept as
part of the thread state, and it is this value that is used when
dumping signals for trace files. (Bug #73842, Bug #19582807)

    Cluster API: When an NDB API client application received a
signal with an invalid block or signal number, NDB provided only
a very brief error message that did not accurately convey the nature
of the problem. Now in such cases, appropriate printouts are provided
when a bad signal or message is detected.  In addition, the message
length is now checked to make certain that it matches the size of
the embedded signal. (Bug #18426180)

-----

The following improvements to MySQL Cluster have been made in MySQL
Cluster NDB 7.4:

    Conflict detection and resolution enhancements.  A reserved
column name namespace NDB$ is now employed for exceptions table
metacolumns, allowing an arbitrary subset of main table columns to
be recorded, even if they are not part of the original table's
primary key.

    Recording the complete original primary key is no longer
required, due to the fact that matching against exceptions table
columns is now done by name and type only. It is now also possible
for you to record values of columns which are not part of the main
table's primary key in the exceptions table.

    Read conflict detection is now possible. All rows read by the
conflicting transaction are flagged, and logged in the exceptions
table. Rows inserted in the same transaction are not included among
the rows read or logged. This read tracking depends on the slave
having an exclusive read lock which requires setting
ndb_log_exclusive_reads in advance. See Read conflict detection
and resolution, for more information and examples.

    Existing exceptions tables remain supported. For more information,
see Section 18.6.11, "MySQL Cluster Replication Conflict Resolution".

    Circular ("active-active") replication improvements.  When
using a circular or "active-active" MySQL Cluster Replication
topology, you can assign one of the roles of primary or secondary
to a given MySQL Cluster using the ndb_slave_conflict_role server
system variable, which can be employed when failing over from a
MySQL Cluster acting as primary, or when using conflict detection
and resolution with NDB$EPOCH2() and NDB$EPOCH2_TRANS() (MySQL
Cluster NDB 7.4.2 and later), which support delete-delete conflict
handling.

    See the description of the ndb_slave_conflict_role variable,
as well as NDB$EPOCH2(), for more information. See also Section
18.6.11, MySQL Cluster Replication Conflict Resolution.

    Per-fragment memory usage reporting.  You can now obtain data
about memory usage by individual MySQL Cluster fragments from the
memory_per_fragment view, added in MySQL Cluster NDB 7.4.1 to the
ndbinfo information database. For more information, see Section
18.5.10.17, "The ndbinfo memory_per_fragment Table".

    Node restart improvements.  MySQL Cluster NDB 7.4 includes a
number of improvements which decrease the time needed for data
nodes to be restarted. These are described in the following list:

	Memory that is allocated on node startup cannot be used
until it has been touched, which causes the operating system to
set aside the actual physical memory required. In previous versions
of MySQL Cluster, the process of touching each page of memory that
was allocated was singlethreaded, which made it relatively
time-consuming. This process has now been reimplemented with
multithreading. In tests with 16 threads, touch times on the order
of 3 times shorter than with a single thread were observed.

	Increased parallelization of local checkpoints; in MySQL
Cluster NDB 7.4, LCPs now support 32 fragments rather than 2 as
before. This greatly increases utilization of CPU power that would
otherwise go unused, and can make LCPs faster by up to a factor of
10; this speedup in turn can greatly improve node restart times.

	The degree of parallelization used for the node copy phase
during node and system restarts can be controlled in MySQL Cluster
NDB 7.4.3 and later by setting the MaxParallelCopyInstances data
node configuration parameter to a nonzero value.

	Reporting on disk writes is provided by new ndbinfo tables
disk_write_speed_base, disk_write_speed_aggregate, and
disk_write_speed_aggregate_node, which provide information about
the speed of disk writes for each LDM thread that is in use.

	This release also adds the data node configuration parameters
MinDiskWriteSpeed, MaxDiskWriteSpeed, MaxDiskWriteSpeedOtherNodeRestart,
and MaxDiskWriteSpeedOwnRestart to control write speeds for LCPs
and backups when the present node, another node, or no node is
currently restarting.

	These changes are intended to supersede configuration of
disk writes using the DiskCheckpointSpeed and DiskCheckpointSpeedInRestart
configuration parameters.  These 2 parameters have now been
deprecated, and are subject to removal in a future MySQL Cluster
release.

	Faster times for restoring a MySQL Cluster from backup have
been obtained by replacing delayed signals found at a point which
was found to be critical to performance with normal (undelayed)
signals. The elimination or replacement of these unnecessary delayed
signals should noticeably reduce the amount of time required to
back up a MySQL Cluster, or to restore a MySQL Cluster from backup.

	Several internal methods relating to the NDB receive thread
have been optimized, to increase the efficiency of SQL processing
by NDB. The receiver thread at times may have to process several
million received records per second, so it is critical that it not
perform unnecessary work or waste resources when retrieving records
from MySQL Cluster data nodes.

    Improved reporting of MySQL Cluster restarts and start phases.
The restart_info table (included in the ndbinfo information database
beginning with MySQL Cluster NDB 7.4.2) provides current status
and timing information about node and system restarts.

    Reporting and logging of MySQL Cluster start phases also provides
more frequent and specific printouts during startup than previously.
See Section 18.5.1, Summary of MySQL Cluster Start Phases, for more
information.

    NDB API: new Event API.  MySQL Cluster NDB 7.4.3 introduces an
epoch-driven Event API that supersedes the earlier GCI-based model.
The new version of the API also simplifies error detection and
handling. These changes are realized in the NDB API by implementing
a number of new methods for Ndb and NdbEventOperation, deprecating
several other methods of both classes, and adding new type values
to Event::TableEvent.

    The event handling methods added to Ndb in MySQL Cluster NDB
7.4.3 are pollEvents2(), nextEvent2(), getHighestQueuedEpoch(),
and getNextEventOpInEpoch2(). The Ndb methods pollEvents(),
nextEvent(), getLatestGCI(), getGCIEventOperations(), isConsistent(),
and isConsistentGCI() are deprecated beginning with the same release.

    MySQL Cluster NDB 7.4.3 adds the NdbEventOperation event handling
methods getEventType2(), getEpoch(), isEmptyEpoch(), and isErrorEpoch();
it obsoletes getEventType(), getGCI(), getLatestGCI(), isOverrun(),
hasError(), and clearError().

    While some (but not all) of the new methods are direct replacements
for deprecated methods, not all of the deprecated methods map to
new ones. The Event Class provides information as to which old
methods correspond to new ones.

    Error handling using the new API is no longer handled using
dedicated hasError() and clearError() methods, which are now
deprecated (and thus subject to removal in a future release of
MySQL Cluster). To support this change, the list of TableEvent
types now includes the values TE_EMPTY (empty epoch), TE_INCONSISTENT
(inconsistent epoch), and TE_OUT_OF_MEMORY (inconsistent data).
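
    The shift from dedicated error methods to event types can be
sketched as a simple classification step. The following is a plain
Python model, not the NDB API (which is C++): the enum below only
borrows the TableEvent names mentioned in the text plus TE_INSERT as
an ordinary data event, its numeric values are placeholders, and
classify_epoch() is a hypothetical helper.

```python
from enum import Enum, auto


class TableEvent(Enum):
    # Names follow the text; the values are placeholders, not NDB's.
    TE_INSERT = auto()         # an ordinary data event
    TE_EMPTY = auto()          # empty epoch
    TE_INCONSISTENT = auto()   # inconsistent epoch
    TE_OUT_OF_MEMORY = auto()  # inconsistent data


def classify_epoch(event_type):
    """Decide how a consumer might treat an event, by type alone."""
    if event_type is TableEvent.TE_EMPTY:
        return "empty"
    if event_type in (TableEvent.TE_INCONSISTENT,
                      TableEvent.TE_OUT_OF_MEMORY):
        return "error"
    return "data"
```

    In this model an application inspects the type of each event it
receives, so no separate error flag has to be polled or cleared.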

    Improvements in event buffer management have also been made by
implementing new get_eventbuffer_free_percent(),
set_eventbuffer_free_percent(), and get_eventbuffer_memory_usage()
methods. Memory buffer usage can now be represented in application
code using EventBufferMemoryUsage. The ndb_eventbuffer_free_percent
system variable, also implemented in MySQL Cluster NDB 7.4, makes
it possible for event buffer memory usage to be checked from MySQL
client applications.

    For more information, see the detailed descriptions for the
Ndb and NdbEventOperation methods listed. See also The Event::TableEvent
Type, as well as The EventBufferMemoryUsage Structure.

    Per-fragment operations information.  In MySQL Cluster NDB
7.4.3 and later, counts of various types of operations on a given
fragment or fragment replica can be obtained easily using the
operations_per_fragment table in the ndbinfo information database.
This includes read, write, update, and delete operations, as well
as scan and index operations performed by these. Information about
operations refused, and about rows scanned and returned from a
given fragment replica, is also shown in operations_per_fragment.
This table also provides information about interpreted programs
used as attribute values, and values returned by them.

MySQL Cluster NDB 7.4 is also supported by MySQL Cluster Manager,
which provides an advanced command-line interface that can simplify
many complex MySQL Cluster management tasks. See MySQL Cluster
Manager 1.3.5 User Manual, for more information.

-----

Changes in MySQL Cluster NDB 7.3.9 (5.6.24-ndb-7.3.9)

Bugs Fixed

    It was found during testing that problems could arise when the
node registered as the arbitrator disconnected or failed during
the arbitration process.

    In this situation, the node requesting arbitration could never
receive a positive acknowledgement from the registered arbitrator;
this node also lacked a stable set of members and could not initiate
selection of a new arbitrator.

    Now in such cases, when the arbitrator fails or loses contact
during arbitration, the requesting node immediately fails rather
than waiting to time out. (Bug #20538179)

    The values of the Ndb_last_commit_epoch_server and
Ndb_last_commit_epoch_session status variables were incorrectly
reported on some platforms. To correct this problem, these values
are now stored internally as long long, rather than long. (Bug
#20372169)
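
    Why the storage type matters can be shown with a short sketch. It
assumes that an epoch is a 64-bit value with the GCI in the upper 32
bits, and that long is 32 bits wide on the affected platforms; both
are assumptions made for illustration, and pack_epoch() is a
hypothetical helper rather than a server function.

```python
import ctypes


def pack_epoch(gci, micro_gci):
    """Combine two 32-bit parts into one 64-bit epoch (assumed layout)."""
    return (gci << 32) | micro_gci


epoch = pack_epoch(5, 7)

# A 64-bit "long long" holds the whole value.
as_long_long = ctypes.c_int64(epoch).value

# A 32-bit "long" silently keeps only the low 32 bits.
as_32bit_long = ctypes.c_int32(epoch).value
```

    Here as_long_long equals the original epoch, while as_32bit_long
retains only the lower half, so the upper 32 bits are lost.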

    The maximum failure time calculation used to ensure that normal
node failure handling mechanisms are given time to handle survivable
cluster failures (before global checkpoint watchdog mechanisms
start to kill nodes due to GCP delays) was excessively conservative,
and neglected to consider that there can be at most number_of_data_nodes
/ NoOfReplicas node failures before the cluster can no longer
survive. Now the value of NoOfReplicas is properly taken into
account when performing this calculation.  (Bug #20069617, Bug
#20069624)

    References: See also Bug #19858151, Bug #20128256, Bug #20135976.
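
    The bound quoted above is simple arithmetic, sketched here in
Python. The function name is hypothetical, and the sketch computes
only the number_of_data_nodes / NoOfReplicas figure given in the
text; it does not reproduce the actual failure time calculation.

```python
def max_node_failures(number_of_data_nodes, no_of_replicas):
    """Return number_of_data_nodes / NoOfReplicas as a whole number."""
    if no_of_replicas <= 0:
        raise ValueError("NoOfReplicas must be positive")
    if number_of_data_nodes % no_of_replicas != 0:
        raise ValueError("data nodes must be a multiple of NoOfReplicas")
    return number_of_data_nodes // no_of_replicas
```

    For example, 4 data nodes with NoOfReplicas=2 give a bound of 2,
and 16 data nodes with NoOfReplicas=2 give a bound of 8.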

    When a data node fails or is being restarted, the remaining
nodes in the same nodegroup resend to subscribers any data which
they determine has not already been sent by the failed node.
Normally, when a data node (actually, the SUMA kernel block) has
sent all data belonging to an epoch for which it is responsible,
it sends a SUB_GCP_COMPLETE_REP signal, together with a count, to
all subscribers, each of which responds with a SUB_GCP_COMPLETE_ACK.
When SUMA receives this acknowledgment from all subscribers, it
reports this to the other nodes in the same nodegroup so that they
know that there is no need to resend this data in case of a subsequent
node failure. If a node failed after all subscribers sent this
acknowledgement but before all the other nodes in the same nodegroup
received it from the failing node, data for some epochs could be
sent (and reported as complete) twice, which could lead to an
unplanned shutdown.

    The fix for this issue adds to the count reported by
SUB_GCP_COMPLETE_ACK a list of identifiers which the receiver can
use to keep track of which buckets are completed and to ignore any
duplicate reported for an already completed bucket.  (Bug #17579998)
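
    The duplicate handling described in this fix can be sketched as a
per-epoch set of completed bucket identifiers. This is a hypothetical
Python model of the receiver's bookkeeping only; EpochTracker and
on_complete() are invented names, and the real signal layout is not
modelled.

```python
class EpochTracker:
    """Remember which buckets have been reported complete, per epoch."""

    def __init__(self):
        self._completed = {}  # epoch -> set of completed bucket ids

    def on_complete(self, epoch, bucket_ids):
        """Record a completion report; return only the new bucket ids."""
        seen = self._completed.setdefault(epoch, set())
        fresh = [b for b in bucket_ids if b not in seen]
        seen.update(fresh)
        return fresh


tracker = EpochTracker()
first = tracker.on_complete(10, [0, 1])
# A resend after a node failure repeats bucket 1 for the same epoch.
second = tracker.on_complete(10, [1, 2])
```

    In this model the repeated bucket is dropped from the second
report, so an epoch is never counted as complete twice for the same
bucket.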

    When performing a restart, it was sometimes possible to find
a log end marker which had been written by a previous restart, and
that should have been invalidated. Now when searching for the
last page to invalidate, the same search algorithm is used as when
searching for the last page of the log to read.  (Bug #76207, Bug
#20665205)

    When reading and copying transporter short signal data, it was
possible for the data to be copied back to the same signal with
overlapping memory. (Bug #75930, Bug #20553247)
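
    The hazard of copying between overlapping regions can be
demonstrated with a small sketch. It is a generic Python
illustration using a bytearray, not the transporter code; both
function names are hypothetical.

```python
def naive_forward_copy(buf, src, dst, n):
    """Copy n bytes one at a time; unsafe if the regions overlap."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def overlap_safe_copy(buf, src, dst, n):
    """Snapshot the source first, so overlap cannot corrupt it."""
    buf[dst:dst + n] = bytes(buf[src:src + n])


damaged = bytearray(b"ABCDEF")
naive_forward_copy(damaged, 0, 2, 4)   # source bytes are overwritten mid-copy

intact = bytearray(b"ABCDEF")
overlap_safe_copy(intact, 0, 2, 4)
```

    The naive copy yields ABABAB because it rereads bytes it has
already overwritten, whereas the snapshot copy yields ABABCD.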

    When a bulk delete operation was committed early to avoid an
additional round trip, while also returning the number of affected
rows, but failed with a timeout error, an SQL node performed no
verification that the transaction was in the Committed state. (Bug
#74494, Bug #20092754)

    References: See also Bug #19873609.

    An ALTER TABLE statement containing comments and a partitioning
option against an NDB table caused the SQL node on which it was
executed to fail. (Bug #74022, Bug #19667566)

    Cluster API: When a transaction is started from a cluster
connection, Table and Index schema objects may be passed to this
transaction for use. If these schema objects have been acquired
from a different connection (Ndb_cluster_connection object), they
can be deleted at any point by the deletion or disconnection of
the owning connection. This can leave a connection with invalid
schema objects, which causes an NDB API application to fail when
these are dereferenced.

    To avoid this problem, if your application uses multiple
connections, you can now set a check to detect sharing of schema
objects between connections when passing a schema object to a
transaction, using the NdbTransaction::setSchemaObjectOwnerChecks()
method added in this release. When this check is enabled, the schema
objects having the same names are acquired from the connection and
compared to the schema objects passed to the transaction. Failure
to match causes the application to fail with an error. (Bug #19785977)

    Cluster API: The increase in the default number of hashmap
buckets (DefaultHashMapSize API node configuration parameter) from
240 to 3480 in MySQL Cluster NDB 7.2.11 increased the size of the
internal DictHashMapInfo::HashMap type considerably.  This type
was allocated on the stack in some getTable() calls which could
lead to stack overflow issues for NDB API users.

    To avoid this problem, the hashmap is now dynamically allocated
from the heap. (Bug #19306793)

    Cluster API: A scan operation, whether it is a single table
scan or a query scan used by a pushed join, stores the result set
in a buffer. The maximum size of this buffer is calculated and
preallocated before the scan operation is started. This buffer may
consume a considerable amount of memory; in some cases we observed
a 2 GB buffer footprint in tests that executed 100 parallel scans
with 2 single-threaded (ndbd) data nodes.  This memory consumption
was found to scale linearly with additional fragments.

    A number of root causes, listed here, were discovered that led
to this problem:

	Result rows were unpacked to full NdbRecord format before
they were stored in the buffer. If only some but not all columns
of a table were selected, the buffer contained empty space (essentially
wasted).

	Due to the buffer format being unpacked, VARCHAR and
VARBINARY columns always had to be allocated for the maximum size
defined for such columns.

	BatchByteSize and MaxScanBatchSize values were not taken
into consideration as a limiting factor when calculating the maximum
buffer size.

    These issues became more evident in NDB 7.2 and later MySQL
Cluster release series. This was due to the fact that buffer size is
scaled by BatchSize, and that the default value for this parameter
was increased fourfold (from 64 to 256) beginning with MySQL Cluster
NDB 7.2.1.

    This fix causes result rows to be buffered using the packed
format instead of the unpacked format; a buffered scan result row
is now not unpacked until it becomes the current row. In addition,
BatchByteSize and MaxScanBatchSize are now used as limiting factors
when calculating the required buffer size.
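
    The effect of the unpacked format and of the missing byte limit
can be illustrated with a rough sketch. It is not the calculation
used by NDB: the helper names, the column sizes, and the 16384-byte
cap are assumptions chosen only to show the shape of the problem.

```python
def unpacked_row_bytes(column_max_sizes):
    """Unpacked format: every column occupies its declared maximum."""
    return sum(column_max_sizes)


def capped_buffer_bytes(batch_rows, row_bytes, byte_limit):
    """Buffer size for batch_rows rows, limited by a byte cap."""
    return min(batch_rows * row_bytes, byte_limit)


# Illustrative row: a 4-byte integer plus a 255-byte variable column.
row_bytes = unpacked_row_bytes([4, 255])

# Without a byte limit the size grows with the row count alone.
uncapped = 256 * row_bytes

# With a byte limit applied, the same batch needs far less.
capped = capped_buffer_bytes(256, row_bytes, 16384)
```

    In this example the uncapped figure is 66304 bytes per batch,
while the capped one is 16384 bytes.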

    Also as part of this fix, refactoring has been done to separate
handling of buffered (packed) from handling of unbuffered result
sets, and to remove code that had been unused since NDB 7.0 or
earlier. The NdbRecord class declaration has also been cleaned up
by removing a number of unused or redundant member variables.  (Bug
#73781, Bug #75599, Bug #19631350, Bug #20408733)
2015-05-25 22:17:36 +00:00
jperkin
aad20bde6b Fix PLIST on non-x86_64 platforms. 2015-02-19 09:28:49 +00:00
jnemeth
0a853b56c3 Update to MySQL Cluster 7.3.8:
Changes in MySQL Cluster NDB 7.3.8 (5.6.22-ndb-7.3.8) (2015-01-21)

   MySQL Cluster NDB 7.3.8 is a new release of MySQL Cluster, based on
   MySQL Server 5.6 and including features from version 7.3 of the NDB
   storage engine, as well as fixing a number of recently discovered bugs
   in previous MySQL Cluster releases.

   This release also incorporates all bugfixes and changes made in
   previous MySQL Cluster releases, as well as all bugfixes and feature
   changes which were added in mainline MySQL 5.6 through MySQL 5.6.22
   (see Changes in MySQL 5.6.22 (2014-12-01)).

   Functionality Added or Changed
     * Performance: Recent improvements made to the multithreaded
       scheduler were intended to optimize the cache behavior of its
       internal data structures, with members of these structures placed
       such that those local to a given thread do not overflow into a
       cache line which can be accessed by another thread. Where required,
       extra padding bytes are inserted to isolate cache lines owned (or
       shared) by other threads, thus avoiding invalidation of the entire
       cache line if another thread writes into a cache line not entirely
       owned by itself. This optimization improved MT Scheduler
       performance by several percent.
       It has since been found that the optimization just described
       depends on the global instance of struct thr_repository starting at
       a cache line aligned base address as well as the compiler not
       rearranging or adding extra padding to the scheduler struct; it was
       also found that these prerequisites were not guaranteed (or even
       checked). Thus this cache line optimization has previously worked
       only when g_thr_repository (that is, the global instance) ended up
       being cache line aligned only by accident. In addition, on 64-bit
       platforms, the compiler added extra padding words in struct
       thr_safe_pool such that attempts to pad it to a cache line aligned
       size failed.
       The current fix ensures that g_thr_repository is constructed on a
       cache line aligned address, and the constructors have been
       modified so as to verify cache line aligned addresses where these
       are assumed by design.
       Results from internal testing show improvements in MT Scheduler
       read performance of up to 10% in some cases, following these
       changes. (Bug #18352514)
     * Cluster API: Two new example programs, demonstrating reads and
       writes of CHAR, VARCHAR, and VARBINARY column values, have been
       added to storage/ndb/ndbapi-examples in the MySQL Cluster source
       tree. For more information about these programs, including source
       code listings, see NDB API Simple Array Example, and NDB API Simple
       Array Example Using Adapter.
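
   The cache line padding and alignment checks described under
   Performance above reduce to modular arithmetic, sketched here in
   Python. The 64-byte line size and all three helper names are
   assumptions for illustration; the scheduler itself is not modelled.

```python
CACHE_LINE = 64  # bytes; an assumed cache line size


def padding_for(size, line=CACHE_LINE):
    """Bytes needed to round size up to a multiple of the line size."""
    return -size % line


def is_aligned(address, line=CACHE_LINE):
    """True if the address starts exactly on a cache line boundary."""
    return address % line == 0


def align_up(address, line=CACHE_LINE):
    """Next cache line aligned address at or above the given one."""
    return address + padding_for(address, line)
```

   A 40-byte structure needs 24 bytes of padding to fill a 64-byte
   line, and padding of that kind isolates a structure only if its
   base address is itself aligned, which is what is_aligned() checks.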

   Bugs Fixed
     * The global checkpoint commit and save protocols can be delayed by
       various causes, including slow disk I/O. The DIH master node
       monitors the progress of both of these protocols, and can enforce a
       maximum lag time during which the protocols are stalled by killing
       the node responsible for the lag when it reaches this maximum. This
       DIH master GCP monitor mechanism did not perform its task more than
       once per master node; that is, it failed to continue monitoring
       after detecting and handling a GCP stop. (Bug #20128256)
       References: See also Bug #19858151.
     * When running mysql_upgrade on a MySQL Cluster SQL node, the
       expected drop of the performance_schema database on this node was
       instead performed on all SQL nodes connected to the cluster. (Bug
       #20032861)
     * A number of problems relating to the fired triggers pool have been
       fixed, including the following issues:
          + When the fired triggers pool was exhausted, NDB returned Error
            218 (Out of LongMessageBuffer). A new error code 221 is added
            to cover this case.
          + An additional, separate case in which Error 218 was wrongly
            reported now returns the correct error.
          + Setting low values for MaxNoOfFiredTriggers led to an error
            when no memory was allocated if there was only one hash
            bucket.
          + An aborted transaction now releases any fired trigger records
            it held. Previously, these records were held until its
            ApiConnectRecord was reused by another transaction.
          + In addition, for the Fired Triggers pool in the internal
            ndbinfo.ndb$pools table, the high value always equalled the
            total, due to the fact that all records were momentarily
            seized when initializing them. Now the high value shows the
            maximum following completion of initialization.
       (Bug #19976428)
     * Online reorganization when using ndbmtd data nodes and with binary
       logging by mysqld enabled could sometimes lead to failures in the
       TRIX and DBLQH kernel blocks, or in silent data corruption. (Bug
       #19903481)
       References: See also Bug #19912988.
     * The local checkpoint scan fragment watchdog and the global
       checkpoint monitor can each exclude a node when it is too slow when
       participating in their respective protocols. This exclusion was
       implemented by simply asking the failing node to shut down, which
       in case this was delayed (for whatever reason) could prolong the
       duration of the GCP or LCP stall for other, unaffected nodes.
       To minimize this time, an isolation mechanism has been added to
       both protocols whereby any other live nodes forcibly disconnect the
       failing node after a predetermined amount of time. This allows the
       failing node the opportunity to shut down gracefully (after logging
       debugging and other information) if possible, but limits the time
       that other nodes must wait for this to occur. Now, once the
       remaining live nodes have processed the disconnection of any
       failing nodes, they can commence failure handling and restart the
       related protocol or protocols, even if the failed node takes an
       excessively long time to shut down. (Bug #19858151)
       References: See also Bug #20128256.
     * A watchdog failure resulted from a hang while freeing a disk page
       in TUP_COMMITREQ, due to use of an uninitialized block variable.
       (Bug #19815044, Bug #74380)
     * Multiple threads crashing led to multiple sets of trace files being
       printed and possibly to deadlocks. (Bug #19724313)
     * When a client retried against a new master a schema transaction
       that failed previously against the previous master while the latter
       was restarting, the lock obtained by this transaction on the new
       master prevented the previous master from progressing past start
       phase 3 until the client was terminated, and resources held by it
       were cleaned up. (Bug #19712569, Bug #74154)
     * When using the NDB storage engine, the maximum possible length of a
       database or table name is 63 characters, but this limit was not
       always strictly enforced. This meant that a statement such as
       CREATE DATABASE, DROP DATABASE, or ALTER TABLE RENAME that used a
       name having 64 characters could cause the SQL node on which it was
       executed to fail. Now such statements fail with an appropriate
       error message.
       (Bug #19550973)
     * When a new data node started, API nodes were allowed to attempt to
       register themselves with the data node for executing transactions
       before the data node was ready. This forced the API node to wait an
       extra heartbeat interval before trying again.
       To address this issue, a number of HA_ERR_NO_CONNECTION errors
       (Error 4009) that could be issued during this time have been
       changed to Cluster temporarily unavailable errors (Error 4035),
       which should allow API nodes to use new data nodes more quickly
       than before. As part of this fix, some errors which were
       incorrectly categorised have been moved into the correct
       categories, and some errors which are no longer used have been
       removed. (Bug #19524096, Bug #73758)
     * When executing very large pushdown joins involving one or more
       indexes each defined over several columns, it was possible in some
       cases for the DBSPJ block (see The DBSPJ Block) in the NDB kernel
       to generate SCAN_FRAGREQ signals that were excessively large. This
       caused data nodes to fail when these could not be handled
       correctly, due to a hard limit in the kernel on the size of such
       signals (32K). This fix bypasses that limitation by breaking up
       SCAN_FRAGREQ data that is too large for one such signal, and
       sending the SCAN_FRAGREQ as a chunked or fragmented signal instead.
       (Bug #19390895)
     * ndb_index_stat sometimes failed when used against a table
       containing unique indexes. (Bug #18715165)
     * Queries against tables containing a CHAR(0) column failed with
       ERROR 1296 (HY000): Got error 4547 'RecordSpecification has
       overlapping offsets' from NDBCLUSTER. (Bug #14798022)
     * In the NDB kernel, it was possible for a TransporterFacade object
       to reset a buffer while the data contained by the buffer was being
       sent, which could lead to a race condition. (Bug #75041, Bug
       #20112981)
     * mysql_upgrade failed to drop and recreate the ndbinfo database and
       its tables as expected. (Bug #74863, Bug #20031425)
     * Due to a lack of memory barriers, MySQL Cluster programs such as
       ndbmtd did not compile on POWER platforms. (Bug #74782, Bug
       #20007248)
     * In some cases, when run against a table having an AFTER DELETE
       trigger, a DELETE statement that matched no rows still caused the
       trigger to execute. (Bug #74751, Bug #19992856)
     * A basic requirement of the NDB storage engine's design is that the
       transporter registry not attempt to receive data
       (TransporterRegistry::performReceive()) from and update the
       connection status (TransporterRegistry::update_connections()) of
       the same set of transporters concurrently, due to the fact that the
       updates perform final cleanup and reinitialization of buffers used
       when receiving data. Changing the contents of these buffers while
       reading or writing to them could lead to "garbage" or inconsistent
       signals being read or written.
       During the course of work done previously to improve the
       implementation of the transporter facade, a mutex intended to
       protect against the concurrent use of the performReceive() and
       update_connections() methods on the same transporter was
       inadvertently removed. This fix adds a watchdog check for
       concurrent usage. In addition, update_connections() and
       performReceive() calls are now serialized together while polling
       the transporters. (Bug #74011, Bug #19661543)
     * ndb_restore failed while restoring a table which contained both a
       built-in conversion on the primary key and a staging conversion on
       a TEXT column.
       During staging, a BLOB table is created with a primary key column
       of the target type. However, a conversion function was not provided
       to convert the primary key values before loading them into the
       staging BLOB table, which resulted in corrupted primary key values
       in the staging BLOB table. While moving data from the staging table
       to the target table, the BLOB read failed because it could not find
       the primary key in the BLOB table.
       Now all BLOB tables are checked to see whether there are
       conversions on primary keys of their main tables. This check is
       done after all the main tables are processed, so that conversion
       functions and parameters have already been set for the main tables.
       Any conversion functions and parameters used for the primary key in
       the main table are now duplicated in the BLOB table. (Bug #73966,
       Bug #19642978)
     * Corrupted messages to data nodes sometimes went undetected, causing
       a bad signal to be delivered to a block which aborted the data
       node. This failure in combination with disconnecting nodes could in
       turn cause the entire cluster to shut down.
       To keep this from happening, additional checks are now made when
       unpacking signals received over TCP, including checks for byte
       order, compression flag (which must not be used), and the length of
       the next message in the receive buffer (if there is one).
       Whenever two consecutive unpacked messages fail the checks just
       described, the current message is assumed to be corrupted. In this
       case, the transporter is marked as having bad data and no more
       unpacking of messages occurs until the transporter is reconnected.
       In addition, an entry is written to the cluster log containing the
       error as well as a hex dump of the corrupted message. (Bug #73843,
       Bug #19582925)
     * Transporter send buffers were not updated properly following a
       failed send. (Bug #45043, Bug #20113145)
     * ndb_restore --print_data truncated TEXT and BLOB column values to
       240 bytes rather than 256 bytes.
     * Disk Data: An update on many rows of a large Disk Data table could
       in some rare cases lead to node failure. In the event that such
       problems are observed with very large transactions on Disk Data
       tables you can now increase the number of page entries allocated
       for disk page buffer memory by raising the value of the
       DiskPageBufferEntries data node configuration parameter added in
       this release. (Bug #19958804)
     * Disk Data: When a node acting as a DICT master fails, the
       arbitrator selects another node to take over in place of the failed
       node. During the takeover procedure, which includes cleaning up any
       schema transactions which are still open when the master failed,
       the disposition of the uncommitted schema transaction is decided.
       Normally this transaction is rolled back, but if it has completed a
       sufficient portion of a commit request, the new master finishes
       processing the commit. Until the fate of the transaction has been
       decided, no new TRANS_END_REQ messages from clients can be
       processed. In addition, since multiple concurrent schema
       transactions are not supported, takeover cleanup must be completed
       before any new transactions can be started.
       A similar restriction applies to any schema operations which are
       performed in the scope of an open schema transaction. The counter
       used to coordinate schema operations across all nodes is employed
       both during takeover processing and when executing any non-local
       schema operations. This means that starting a schema operation
       while its schema transaction is in the takeover phase causes this
       counter to be overwritten by concurrent uses, with unpredictable
       results.
       The scenarios just described were handled previously using a
       pseudo-random delay when recovering from a node failure. Now we
       check before the new master has rolled forward or backwards any
       schema transactions remaining after the failure of the previous
       master and avoid starting new schema transactions or performing
       operations using old transactions until takeover processing has
       cleaned up after the abandoned transaction. (Bug #19874809, Bug
       #74503)
     * Disk Data: When a node acting as DICT master fails, it is still
       possible to request that any open schema transaction be either
       committed or aborted by sending this request to the new DICT
       master. In this event, the new master takes over the schema
       transaction and reports back on whether the commit or abort request
       succeeded. In certain cases, it was possible for the new master to
       be misidentified--that is, the request was sent to the wrong node,
       which responded with an error that was interpreted by the client
       application as an aborted schema transaction, even in cases where
       the transaction could have been successfully committed, had the
       correct node been contacted. (Bug #74521, Bug #19880747)
     * Cluster Replication: When an NDB client thread made a request to
       flush the binary log using statements such as FLUSH BINARY LOGS or
       SHOW BINLOG EVENTS, this caused not only the most recent changes
       made by this client to be flushed, but all recent changes made by
       all other clients to be flushed as well, even though this was not
       needed. This behavior caused unnecessary waiting for the statement
       to execute, which could lead to timeouts and other issues with
       replication. Now such statements flush the most recent database
       changes made by the requesting thread only.
       As part of this fix, the status variables
       Ndb_last_commit_epoch_server, Ndb_last_commit_epoch_session, and
       Ndb_slave_max_replicated_epoch, originally implemented in MySQL
       Cluster NDB 7.4, are also now available in MySQL Cluster NDB 7.3.
       For descriptions of these variables, see MySQL Cluster Status
       Variables; for further information, see MySQL Cluster Replication
       Conflict Resolution. (Bug #19793475)
     * Cluster Replication: It was possible using wildcards to set up
       conflict resolution for an exceptions table (that is, a table named
       using the suffix $EX), which should not be allowed. Now when a
       replication conflict function is defined using wildcard
       expressions, these are checked for possible matches so that, in the
       event that the function would cover an exceptions table, it is not
       set up for this table. (Bug #19267720)
     * Cluster API: It was possible to delete an Ndb_cluster_connection
       object while there remained instances of Ndb using references to
       it. Now the Ndb_cluster_connection destructor waits for all related
       Ndb objects to be released before completing. (Bug #19999242)
       References: See also Bug #19846392.
     * Cluster API: The buffer allocated by an NdbScanOperation for
       receiving scanned rows was not released until the NdbTransaction
       owning the scan operation was closed. This could lead to excessive
       memory usage in an application where multiple scans were created
       within the same transaction, even if these scans were closed at the
       end of their lifecycle, unless NdbScanOperation::close() was
       invoked with the releaseOp argument equal to true. Now the buffer
       is released whenever the cursor navigating the result set is closed
       with NdbScanOperation::close(), regardless of the value of this
       argument. (Bug #75128, Bug #20166585)
     * ClusterJ: The following errors were logged at the SEVERE level;
       they are now logged at the NORMAL level, as they should be:
          + Duplicate primary key
          + Duplicate unique key
          + Foreign key constraint error: key does not exist
          + Foreign key constraint error: key exists
       (Bug #20045455)
     * ClusterJ: The com.mysql.clusterj.tie class emitted a log message
       at the INFO logging level for every single query, which was
       unnecessary and degraded the performance of applications that
       used ClusterJ. (Bug #20017292)
     * ClusterJ: ClusterJ reported a segmentation violation when an
       application closed a session factory while some sessions were still
       active. This was because MySQL Cluster allowed an
       Ndb_cluster_connection object to be deleted while some Ndb
       instances were still active, which might result in the usage of
       null pointers by ClusterJ. This fix stops that happening by
       preventing ClusterJ from closing a session factory when any of its
       sessions are still active. (Bug #19846392)
       References: See also Bug #19999242.
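
The corrupted-message handling described above for Bug #73843 can be
illustrated with a small sketch. This is not the NDB transporter code or its
wire format: the three-byte header, the flag bits, MAX_MESSAGE_LEN, and the
Transporter class are all hypothetical, and skipping a single failing message
without delivering it is an assumption made for the example. It only shows the
general idea of checking each message, marking the transporter as having bad
data after two consecutive failures, logging a hex dump, and refusing to
unpack until reconnection.

```python
import struct

# Toy header (hypothetical, not the NDB wire format):
# one flags byte followed by the total message length in bytes.
HEADER = struct.Struct(">BH")
FLAG_LITTLE_ENDIAN = 0x01   # hypothetical byte-order bit
FLAG_COMPRESSED = 0x02      # hypothetical compression bit, must stay clear
MAX_MESSAGE_LEN = 64        # hypothetical upper bound used by this sketch


def pack(payload, flags=0):
    """Build one toy message: header followed by payload."""
    return HEADER.pack(flags, HEADER.size + len(payload)) + payload


class Transporter:
    def __init__(self):
        self.bad_data = False
        self.failed_in_a_row = 0
        self.cluster_log = []

    def reconnect(self):
        """Reconnecting clears the bad-data mark and the failure counter."""
        self.bad_data = False
        self.failed_in_a_row = 0

    def _check(self, buf, pos, length):
        """Return an error string, or None if the message passes all checks."""
        flags, _ = HEADER.unpack_from(buf, pos)
        if flags & FLAG_LITTLE_ENDIAN:
            return "unexpected byte order"
        if flags & FLAG_COMPRESSED:
            return "compression flag set"
        nxt = pos + length
        if nxt + HEADER.size <= len(buf):  # a next header is present
            _, next_len = HEADER.unpack_from(buf, nxt)
            if not HEADER.size <= next_len <= MAX_MESSAGE_LEN:
                return "implausible length for next message"
        return None

    def unpack(self, buf):
        """Return delivered payloads; stop as soon as data is marked bad."""
        delivered = []
        pos = 0
        while not self.bad_data and pos + HEADER.size <= len(buf):
            _, length = HEADER.unpack_from(buf, pos)
            if length < HEADER.size or pos + length > len(buf):
                break  # incomplete or unusable length: cannot advance
            error = self._check(buf, pos, length)
            if error is None:
                self.failed_in_a_row = 0
                delivered.append(buf[pos + HEADER.size:pos + length])
            else:
                # Assumption: a single failing message is skipped, not delivered.
                self.failed_in_a_row += 1
                if self.failed_in_a_row == 2:
                    self.bad_data = True
                    self.cluster_log.append(
                        f"{error}: {buf[pos:pos + length].hex()}")
            pos += length
        return delivered
```

In this sketch, a buffer holding one good message followed by two messages
with the compression bit set delivers only the first payload, writes one log
entry ending in the hex dump of the second bad message, and returns nothing
from later calls to unpack() until reconnect() is called.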
2015-02-09 06:46:55 +00:00
fhajny
c112971377 Fix PLIST for SunOS. 2015-01-20 11:03:51 +00:00
jnemeth
7697633caf MySQL Cluster is a highly scalable, real-time, ACID-compliant
transactional database, combining 99.999% availability with the
low TCO of open source.

Designed around a distributed, multi-master architecture with no
single point of failure, MySQL Cluster scales horizontally on
commodity hardware to serve read and write intensive workloads,
accessed via SQL and NoSQL interfaces.
2014-12-01 05:57:48 +00:00