Changes since 8.0, from ChangeLog
commit 895183d5a2eceabcfdd81daff87ecab1159c32c6
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Wed Sep 16 07:15:41 2020 +0000
doc: Added release 8.2 notes
Updates: #1485
Change-Id: Ia42666051df1624444ea203bf8b7c876cf78b592
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit 85ff28ace3901a5a54d8de42d33ab2f9ac528ed8
Author: Srijan Sivakumar <ssivakum@redhat.com>
Date: Tue Sep 1 12:48:48 2020 +0530
Events: Fixing coverity issues.
Fixing resource leak reported by coverity scan.
CID: 1431237
Change-Id: I2bed106b3dc4296c50d80542ee678d32c6928c25
Updates: #1060
Signed-off-by: Srijan Sivakumar <ssivakum@redhat.com>
(cherry picked from commit ebc0253269d8a538239dd0b99d42f56ea320b0f0)
commit 93d48622d9ddb96f07a8590312c2885e11751436
Author: srijan-sivakumar <ssivakum@redhat.com>
Date: Sat Jul 18 05:59:09 2020 +0530
Events: Socket creation after getaddrinfo and IPv4 and IPv6 packet capture
Issue: Currently, the socket creation is done
prior to getaddrinfo function being invoked. This
can cause mismatch in the protocol and address
families of the created socket and the result
of the getaddrinfo api. Also, the glustereventsd
UDP server by default only captures IPv4 packets
hence IPv6 packets are not even captured.
Code Changes:
1. Modified the socket creation in such a way that
the parameters taken in are dependent upon the
result of the getaddrinfo function.
2. Created a subclass for adding address family
in glustereventsd.py for both AF_INET and AF_INET6.
3. Modified addresses in the eventsapiconf.py.in
Reasoning behind the approach:
1. If we are using getaddrinfo function then
socket creation should happen only after we
check if we received back valid addresses.
Hence socket creation should come after the call
to getaddrinfo
2. The listening server which pushes the events
to the webhook has to listen for both IPv4
and IPv6 messages as we would not be sure as to
what address family is picked in _gf_event.
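The ordering argued for above (resolve first, create the socket second) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the glustereventsd code: `open_udp_socket` is a hypothetical helper name, and the caller supplies the host and port.

```python
import socket


def open_udp_socket(host, port):
    """Resolve the address first, then create a matching UDP socket.

    Hypothetical helper for illustration; not part of glustereventsd.
    """
    last_error = None
    # getaddrinfo() yields (family, type, proto, canonname, sockaddr) tuples.
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_DGRAM
    ):
        try:
            # The socket is created only after a valid address is known,
            # so its family always agrees with the resolved address.
            sock = socket.socket(family, socktype, proto)
        except OSError as err:
            last_error = err
            continue
        return sock, sockaddr
    raise OSError("no usable address for %s" % host) from last_error
```

Because the family comes from the getaddrinfo result, the same call returns an AF_INET socket for an IPv4 address and an AF_INET6 socket for an IPv6 address.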
Fixes: #1377
Change-Id: I568dcd1a977c8832f0fef981e1f81cac7043c760
Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
(cherry picked from commit 7c309928591deb8d0188793677958226ac03897a)
commit b4cc0988d5e9a5bf354dd4c2cb254ce52546facb
Author: nik-redhat <nladha@redhat.com>
Date: Thu Sep 10 14:55:35 2020 +0530
glusterd: readdir-ahead off by default
Changing the default value of readdir-ahead to
off. It can still be enabled or disabled later with the
gluster vol set <volname> performance.readdir-ahead enable/disable
command.
Updates: #1472
Change-Id: Idb3e16e8be98d7a811fc8e5d09906919ef50fbab
Signed-off-by: nik-redhat <nladha@redhat.com>
(cherry picked from commit 84a4cf76219b6187fc625740d1a1ebbe40e9f22c)
commit 68db6b60f621d371c4059a7ee728ebb267854708
Author: nik-redhat <nladha@redhat.com>
Date: Wed Aug 26 15:08:56 2020 +0530
glusterd: cksum mismatch on upgrading to latest gluster
Issue:
In gluster versions earlier than 7, the checksums were calculated
whether or not quota was enabled, and that cksum value
was also stored in the quota.cksum file. From gluster
7 onwards, the cksum is calculated only if quota is enabled.
Due to this, the cksums in quota.cksum files differ after upgrading.
Fix:
Added a check: if the OP_VERSION is less than 7, follow
the previous method; otherwise, use the latest changes for
cksum calculation.
The change to the cksum calculation was made in
this commit: https://github.com/gluster/glusterfs/commit/3b5eb592f5
Updates: #1332
Change-Id: I7a95e5e5f4d4be4983fb7816225bf9187856c003
Signed-off-by: nik-redhat <nladha@redhat.com>
(cherry picked from commit 865cca1190e233381f975ff36118f46e29477dcf)
Signed-off-by: nik-redhat <nladha@redhat.com>
commit a5d9edce9b59ee00d2a4027fafba126e82e2fcfd
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Fri Sep 4 14:49:50 2020 +0200
open-behind: implement create fop
Open behind didn't implement create fop. As a result, created files
were not counted in the number of open fd's. This could cause future
opens to be delayed when they shouldn't be.
This patch implements the create fop. It also fixes a problem when
destroying the stack: when frame->local was not NULL, STACK_DESTROY()
tried to mem_put() it, which is not correct.
Fixes: #1440
Change-Id: Ic982bad07d4af30b915d7eb1fbcef7a847a45869
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
commit 473453c4e2b1b6fc94edbce438dd9a3c0ea58c67
Author: Amar Tumballi <amar@kadalu.io>
Date: Tue Aug 18 14:08:20 2020 +0530
tests: provide an option to mark tests as 'flaky'
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.
Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi <amar@kadalu.io>
(cherry picked from 097db13c11390174c5b9f11aa0fd87eca1735871)
commit 635dcf82505efcdeaf01c4e0450a157b533099ba
Author: Ravishankar N <ravishankar@redhat.com>
Date: Tue Sep 1 11:36:42 2020 +0530
libglusterfs: fix dict leak
Problem:
gf_rev_dns_lookup_cached() allocated struct dnscache->dict if it was null
but the freeing was left to the caller.
Fix:
Moved dict allocation and freeing into corresponding init and fini
routines so that it's easier for the caller to avoid such leaks.
Updates: #1000
Change-Id: I90d6a6f85ca2dd4fe0ab461177aaa9ac9c1fbcf9
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 079f7a7d8a2bd85070c1da4dde2452ca82a1cdbb)
commit f9b8462ba212e0fd572efdf6ade03f4d5c53d11e
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Tue Aug 25 12:31:20 2020 +0000
doc: Updated release 8.1 notes
Updates: #1318
Change-Id: I87787a1aaf59302ad045ed6d2562920e17654678
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit ab40a26dcd9ce8b676f482bf751e57024e227e89
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Sat Aug 22 17:23:25 2020 +0000
doc: Added release 8.1 notes
Updates: #1318
Change-Id: I14d589bd9af85bdd4ae02902e41d4c5f7d930358
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit f120311737cf681c36423d8551e2e218509cd5f7
Author: Ravishankar N <ravishankar@redhat.com>
Date: Wed Aug 19 11:14:25 2020 +0530
afr: add null check for thin-arbiter gfid.
Problem:
Lookup/creation of thin-arbiter ID file happens in background during
mounting. On new volumes, if the ID file creation is in progress, and a
FOP fails on data brick, a post-op (xattrop) is attempted on TA. Since
the TA file's gfid is null at this point, the ASSERT checks in protocol/
client cause a crash.
Fix:
Given that we decided to do Lookup/creation of thin-arbiter in
background, fail the other AFR FOPS on TA if the ID file's gfid is null
instead of winding it down to protocol/client.
Also remove afr_changelog_thin_arbiter_post_op() which seems to be dead
code.
Updates: #763
Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit f9b5074394e3d2f3b6728aab97230ba620879426)
commit 4398db9d70f34e9a8af88fe3de564a906db7b182
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Wed Aug 19 23:27:38 2020 +0200
open-behind: fix call_frame leak
When an open was delayed, a copy of the frame was created because the
current frame was used to unwind the "fake" open. When the open was
actually sent, the frame was correctly destroyed. However if the file
was closed before needing to send the open, the frame was not destroyed.
This patch correctly destroys the frame in all cases.
Change-Id: I8c00fc7f15545c240e8151305d9e4cf06d653926
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: #1440
commit e173c5b0ee32c210a7d36f03f1847c42218a62e5
Author: Mohit Agrawal <moagrawa@redhat.com>
Date: Mon Jul 27 18:08:00 2020 +0530
posix: Implement a janitor thread to close fd
Problem: In the commit fb20713b380e1df8d7f9e9df96563be2f9144fd6 we use
syntask to close fd but we have found the patch is reducing the
performance
Solution: Use janitor thread to close fd's and save the pfd ctx into
ctx janitor list and also save the posix_xlator into pfd object to
avoid the race condition during cleanup in brick_mux environment
Change-Id: Ifb3d18a854b267333a3a9e39845bfefb83fbc092
Fixes: #1396
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
(cherry picked from commit 41b9616435cbdf671805856e487e373060c9455b)
commit 05060c9664153beb392206ae05a498d4d4178f5f
Author: Leonid Ishimnikov <lishim@fastmail.com>
Date: Thu Aug 13 15:37:50 2020 -0400
glusterd: dump SSL error stack on disconnect
Problem: When a non-SSL connection is attempted on an SSL-enabled
management port, unrelated peers are subsequently disconnected
from the node with a misleading error message.
Cause: A non-SSL client causes OpenSSL to push a wrong version error
into its thread-local error stack, but this error is never
cleared, and it lingers in the stack until the thread is used
by another SSL session, and a certain condition requires the error
stack to be examined, at which time the old error is discovered and
the connection is terminated.
Solution: Log and clear the error stack upon terminating the connection.
Change-Id: I82f3a723285df24dafc88850ae4fca65b69f6ae4
Fixes: #1418
Signed-off-by: Leonid Ishimnikov <lishim@fastmail.com>
(cherry picked from commit bb5801d1480314e09b4203d2525bd01aada5c683)
commit c5fc58c8cb01753e2fed173c76aea1e9cc333862
Author: Vinayakswami Hariharmath <vharihar@redhat.com>
Date: Thu Aug 6 14:39:59 2020 +0530
features/shard: optimization over shard lookup in case of prealloc
Assume that we are preallocating a VM of size 1TB with a shard
block size of 64MB then there will be ~16k shards.
This creation happens in 2 steps in the shard_fallocate() path, i.e.
1. lookup for any shards already present and
2. mknod for those shards that do not exist.
But in case of fresh creation, we don't have to look up all the
shards, which are not present as the file size will be 0.
Through this, we can save lookups on all shards which are not
present. This optimization is quite useful in the case of
preallocating a big VM.
Also if the file is already present and the call is to
extend it to a bigger size then we need not look up non-existent
shards. Just look up preexisting shards, populate
the inodes and issue mknod on the extended size.
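The arithmetic behind the "~16k shards" figure, and the lookup savings described above, can be sketched in Python. This is only a back-of-the-envelope model: `shard_count` and `plan_prealloc` are hypothetical names, the sizes are taken as binary units (TiB/MiB), and the shard translator's own bookkeeping is ignored.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2


def shard_count(file_size, shard_block_size):
    """Number of shard-sized blocks needed to hold file_size bytes (rounded up)."""
    return (file_size + shard_block_size - 1) // shard_block_size


def plan_prealloc(old_size, new_size, shard_block_size):
    """Return (lookups, mknods) when growing a file from old_size to new_size.

    Only blocks that already exist need a lookup; every block beyond the
    old size is known to be missing and can go straight to mknod.
    """
    existing = shard_count(old_size, shard_block_size)
    wanted = shard_count(new_size, shard_block_size)
    return existing, max(wanted - existing, 0)


# Fresh 1 TiB preallocation with 64 MiB shards: no lookups at all.
fresh = plan_prealloc(0, 1 * TIB, 64 * MIB)

# Extending a 512 GiB file to 1 TiB: look up only the existing half.
extend = plan_prealloc(TIB // 2, 1 * TIB, 64 * MIB)
```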
Fixes: #1425
Change-Id: I60036fe8302c696e0ca80ff11ab0ef5bcdbd7880
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
(cherry picked from commit 2ede911d07c6dc07a0f729526ab590ace77341ae)
commit 8ef4b79162a0409023b10a15561c84606e0b3ae0
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date: Mon May 4 14:30:57 2020 +0530
extras: Modify group 'virt' to include network-related options
This is needed to work around an issue seen where vms running on
online hosts are getting killed when a different host is rebooted
in ovirt-gluster hyperconverged environments. Actual RCA is quite
lengthy and documented in the github issue. Please refer to it
for more details.
Change-Id: Ic25b5f50144ad42458e5c847e1e7e191032396c1
Fixes: #1217
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit 5391f16fc4aa00f75af2a4c2707768370ace5f6c)
commit 7b372cdeed876e68293620c25c6821324068fb54
Author: Ashish Pandey <aspandey@redhat.com>
Date: Thu Jul 23 11:07:32 2020 +0530
cluster/ec: Remove stale entries from indices/xattrop folder
Problem:
If a gfid is present in indices/xattrop folder while
the file/dir is actually healthy and all the xattrs are healthy,
it causes a lot of lookups by shd on an entry which does not need
to be healed.
This whole process eats up a lot of CPU without doing meaningful
work.
Solution:
Set trusted.ec.dirty xattr of the entry so that actual heal process
happens and at the end of it, during unset of dirty, gfid entry from
indices/xattrop will be removed.
Change-Id: Ib1b9377d8dda384bba49523e9ff6ba9f0699cc1b
Fixes: #1385
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit ba1b0a471dec968633f89c7f790b099fb4ad700d)
commit 7ff51badda5cbcbaa17f729d1e4ab715c462396a
Author: Mohit Agrawal <moagrawa@redhat.com>
Date: Sat Aug 1 09:28:47 2020 +0530
glusterd: Increase buffer length to save multiple hostnames in peer file
Problem: At the time of handling a friend update request glusterd updates the peer
file and if DNS has returned multiple hostnames for the same IP, glusterd
saves all hostnames in the peer file. In commit 1fa089e7a2b180e0bdcc1e7e09a63934a2a0c0ef
we changed the approach to save all key value pairs in a single shot.
If the buffer does not have space to store the hostnames, glusterd
writes a partial hostname in the peer file.
Solution: To avoid the failure, increase the buffer length
Change-Id: Iee969d165333e9c5ba69431d474c541b8f12d442
Fixes: #1407
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
(cherry picked from commit 6e8e73a06d71382f8f6e3cd83fe72692d19e66ba)
commit 4be26c88732f55b38da171c86334eddbdaac5c14
Author: Sunny Kumar <sunkumar@redhat.com>
Date: Tue May 19 16:13:01 2020 +0100
geo-rep: Fix corner case in rename on mkdir during hybrid crawl
Problem:
The issue is hit during hybrid mode while handling rename on slave.
In this special case the rename is recorded as mkdir and geo-rep processes it
by resolving the path from the backend.
While resolving the backend path during this special handling, one corner case is not considered.
<snip>
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops
src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 710, in get_slv_dir_path
dir_entry = os.path.join(pfx, pargfid, basename)
File "/usr/lib64/python2.7/posixpath.py", line 75, in join
if b.startswith('/'):
AttributeError: 'int' object has no attribute 'startswith'
In python3:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.8/posixpath.py", line 90, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib64/python3.8/genericpath.py", line 152, in _check_arg_types
raise TypeError(f'{funcname}() argument must be str, bytes, or '
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int'
</snip>
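The failure in the tracebacks above comes from handing a non-string value to os.path.join. A minimal Python 3 sketch of that failure mode follows; `join_entry` is a hypothetical helper that only demonstrates the problem and one possible guard, and it is not the geo-rep fix itself.

```python
import os.path


def join_entry(prefix, pargfid, basename):
    """Join path components, rejecting non-string values up front.

    Hypothetical helper for illustration only.
    """
    for part in (prefix, pargfid, basename):
        if not isinstance(part, (str, bytes)):
            raise ValueError("path component is not a string: %r" % (part,))
    return os.path.join(prefix, pargfid, basename)


# Passing an int where a name is expected fails inside os.path.join.
try:
    os.path.join(".gfid", "parent", 2)
except TypeError as err:
    print("os.path.join failed:", err)
```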
Backport of:
>Patch link: https://review.gluster.org/#/c/glusterfs/+/24468/
>Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
>Fixes: #1250
>Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
Fixes: #1250
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
(cherry picked from commit 27f5c8ba844e9da54fc1304df4ffe015a3bbb9bd)
Change-Id: I171eb9ad4e30f49cfe86cb258918682d3c0f5af9
commit 269ece312c9fd890c74c46e79de70efe1720752c
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Thu Jul 2 18:08:52 2020 +0200
cluster/ec: Improve detection of new heals
When EC successfully healed a directory it assumed that maybe other
entries inside that directory could have been created, which could
require additional heal cycles. For this reason, when the heal happened
as part of one index heal iteration, it triggered a new iteration.
The problem happened when the directory was healthy, so no new entries
were added, but its index entry was not removed for some reason. In
this case self-heal started an endless loop healing the same directory
continuously, causing high CPU utilization.
This patch improves detection of new files added to the heal index so
that a new index heal iteration is only triggered if there is new work
to do.
Change-Id: I2355742b85fbfa6de758bccc5d2e1a283c82b53f
Fixes: #1354
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
commit 19dc7b37fc9f6d6037958e5dd3c0c6a4a993e2af
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date: Thu Sep 12 11:07:10 2019 +0530
features/shard: Convert shard block indices to uint64
This patch fixes a crash in FOPs that operate on really large sharded
files where number of participant shards could sometimes exceed
signed int32 max.
The patch also adds GF_ASSERTs to ensure that number of participating
shards is always greater than 0 for files that do have more than one
shard.
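Why a signed 32-bit counter is not enough can be illustrated with a small Python sketch. It uses ctypes only to mimic fixed-width C integers; `as_int32` and `as_uint64` are hypothetical helper names, and the values are illustrative rather than taken from the shard translator.

```python
import ctypes

INT32_MAX = 2 ** 31 - 1


def as_int32(value):
    """Value as it would be stored in a signed 32-bit integer (wraps on overflow)."""
    return ctypes.c_int32(value).value


def as_uint64(value):
    """Value as it would be stored in an unsigned 64-bit integer."""
    return ctypes.c_uint64(value).value


# One block index past the signed 32-bit maximum.
index = INT32_MAX + 1

wrapped = as_int32(index)     # becomes negative in a signed int32
preserved = as_uint64(index)  # kept intact in a uint64
```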
Change-Id: I354de58796f350eb1aa42fcdf8092ca2e69ccbb6
Fixes: #1348
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit cdf01cc47eb2efb427b5855732d9607eec2abc8a)
commit 96e7ff7396a8e18ca69d5198be0a8d29bcc37129
Author: Vinayakswami Hariharmath <vharihar@redhat.com>
Date: Wed Jun 3 18:58:56 2020 +0530
features/shard: Use fd lookup post file open
Issue:
When a process has the open fd and the same file is
unlinked in middle of the operations, then file based
lookup fails with ENOENT or stale file
Solution:
When the file is already open and an fd is available, use fstat
to get the file attributes
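The difference between a path-based lookup and an fd-based one after unlink can be reproduced on any POSIX system with a few lines of Python. This is an analogy using the local filesystem, not shard translator code; `attrs_after_unlink` is a hypothetical name.

```python
import os
import tempfile


def attrs_after_unlink():
    """Open a file, unlink it, then compare path-based and fd-based stat.

    Assumes POSIX semantics, where an open file may be unlinked.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"data")
        os.unlink(path)              # the name is gone, the open fd remains
        try:
            os.stat(path)            # path-based lookup
            path_lookup_ok = True
        except FileNotFoundError:    # ENOENT
            path_lookup_ok = False
        size = os.fstat(fd).st_size  # fd-based lookup still succeeds
        return path_lookup_ok, size
    finally:
        os.close(fd)
```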
Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4
Fixes: #1281
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
(cherry picked from commit 71dd19f710b81136f318b3a95ae430971198ee70)
commit 6b10b33f8a9bce054fec980583cc597f5a438bb5
Author: Soumya Koduri <skoduri@redhat.com>
Date: Thu Jul 2 02:07:56 2020 +0530
Issue with gf_fill_iatt_for_dirent
In "gf_fill_iatt_for_dirent()", while calculating inode_path for loc,
the inode should be the parent's. Instead loc.inode is used, which results in an error,
and eventually lookup/readdirp fails.
This patch fixes that.
Change-Id: Ied086234a4634e8cb13520521ac547c87b3c76b5
Fixes: #1351
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit ab8308333aaf033e07dbbdf2f69f9313a7e311f3)
commit e8aedcd40f9f24e5b821e1539275e40ebfccca94
Author: Pranith Kumar K <pkarampu@redhat.com>
Date: Fri May 29 14:24:53 2020 +0530
cluster/afr: Delay post-op for fsync
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.
Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.
Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
commit 30b95ff9cdec72d9089f4882dafca447ae3174f1
Author: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Date: Thu Jul 2 15:52:15 2020 -0400
api: libgfapi symbol versions break LTO in Fedora rawhide/f33
The way symbol versions are implemented is incompatible with gcc-10 and LTO.
Fedora provenpackager Jeff Law (law [at] redhat.com) writes in the
Fedora dist-git glusterfs.spec:
This package uses top level ASM constructs which are incompatible with LTO.
Top level ASMs are often used to implement symbol versioning. gcc-10
introduces a new mechanism for symbol versioning which works with LTO.
Converting packages to use that mechanism instead of toplevel ASMs is
recommended.
In particular, note that the version of gluster in Fedora rawhide/f33 is
glusterfs-8.0RC0. Once this fix is merged it will be necessary to backport
it to the release-8 branch.
At the time that gfapi symbol versions were first implemented we copied
the GNU libc (glibc) symbol version implementation following Uli Drepper's
symbol versioning HOWTO.
Now gcc-10 has a symver attribute that can be used instead. (Maybe it
has been there all along?)
Both the original implementation and this implementation yield the same
symbol versions. This can be seen by running
`nm -D --with-symbol-versions libgfapi.so`
on the libgfapi.so built before and after applying this fix.
Change-Id: I05fda580afacfff1bfc07be810dd1afc08a92fb8
Fixes: #1352
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
- Properly screen the .attribute directory where NetBSD UFS1 stores
extended attributes.
- Fix NULL pointer usage.
- Make FUSE notification optional at configure time since NetBSD does not
implement them.
And update comment in patches with references to commits upstream
There is an important performance bug fix specific to NetBSD here,
which disables gfid2path by default. This feature causes a huge
number of different extended attributes to be created, and the
NetBSD implementation does not scale well with it.
In order to recover a server after the feature is disabled, stop
glusterfs daemons, disable extended attributes using extattrctl,
remove ${BRICK_ROOT}/.attribute/system/trusted.gfid2path.*,
re-enable extended attributes and restart glusterfs.
From http://blog.gluster.org/2016/06/glusterfs-3-8-released/
Gluster.org announces the release of 3.8 on June 14, 2016, marking
a decade of active development.
The 3.8 release focuses on:
- containers with inclusion of Heketi
- hyperconvergence
- ecosystem integration
- protocol improvements with NFS Ganesha
Contributed features are marked with the supporting organizations.
Automatic conflict resolution, self-healing improvements (Facebook)
Synchronous Replication receives a major boost with features
contributed from Facebook. Multi-threaded self-healing makes
self-heal perform at a faster rate than before. Automatic
Conflict resolution ensures that conflicts due to network
partitions are handled without the need for administrative
intervention
NFSv4.1 (Ganesha) - protocol
Gluster's native NFSv3 server is disabled by default with this
release. Gluster's integration with NFS Ganesha provides NFS
v3, v4 and v4.1 accesses to data stored in Gluster volume.
BareOS - backup / data protection
Gluster 3.8 is ready for integration with BareOS 16.2. BareOS
16.2 leverages glusterfind for intelligently backing up objects
stored in a Gluster volume.
"Next generation" tiering and sharding - VM images
Sharding is now stable for VM image storage. Geo-replication
has been enhanced to integrate with sharding for offsite
backup/disaster recovery of VM images. Self-healing and data
tiering with sharding makes it an excellent candidate for
hyperconverged virtual machine image storage.
block device & iSCSI with LIO - containers
File backed block devices are usable from Gluster through iSCSI.
This release of Gluster integrates with tcmu-runner
[https://github.com/agrover/tcmu-runner] to access block devices
natively through libgfapi.
Heketi - containers, dynamic provisioning
Heketi provides the ability to dynamically provision Gluster
volumes without administrative intervention. Heketi can manage
multiple Gluster clusters and will be the cornerstone for
integration with Container and Storage as a Service management
ecosystems.
glusterfs-coreutils (Facebook) - containers
Native coreutils for Gluster developed by Facebook that uses
libgfapi to interact with gluster volumes. Useful for systems
and containers that do not have FUSE.
For more details, our release notes are included:
https://github.com/gluster/glusterfs/blob/release-3.8/doc/release-notes/3.8.0.md
The release of 3.8 also marks the end of life for GlusterFS 3.5;
there will be no further updates for this version.
Complete list of changes since 3.7.1:
- doc: add 1233044, 1232179 in 3.7.2 release-notes
- features/bitrot: fix fd leak in truncate (stub)
- doc: add release notes for 3.7.2
- libgfchangelog: Fix crash in gf_changelog_process
- glusterd: Fix snapshot of a volume with geo-rep
- cluster/ec: Avoid parallel executions of the same state machine
- quota: fix double accounting with rename operation
- cluster/dht: Prevent use after free bug
- cluster/ec: Wind unlock fops at all cost
- glusterd: Buffer overflow causing crash for glusterd
- NFS-Ganesha: Automatically export vol that was exported before vol restart
- common-ha: cluster HA setup sometimes fails
- cluster/ec: Prevent double unwind
- quota/glusterd: porting to new logging framework.
- bitrot/glusterd: gluster volume set command for bitrot should not be supported
- tests: fix spurious failure in bug-857330/xml.t
- features/bitrot: tunable object signing waiting time value for bitrot
- quota: don't log error when disk quota exceeded
- protocol/client : porting log messages to new framework
- cluster/afr: Do not attempt entry self-heal if the last lookup on entry
failed on src
- changetimerecorder : port log messages to a new framework
- tier/volume set: Validate volume set option for tier
- glusterd/tier: glusterd crashed with detach-tier commit force
- rebalance,store,glusterd/glusterd: porting to new logging framework.
- libglusterfs: Enabling the fini() in cleanup_and_exit()
- sm/glusterd: Porting messages to new logging framework
- nfs: Authentication performance improvements
- common-ha: cluster HA setup sometimes fails
- glusterd: subvol_count value for replicate volume should be calculate
correctly
- common-ha : Clean up cib state completely
- NFS-Ganesha : Return correct return value
- glusterd: Porting messages to new logging framework.
- glusterd: Stop tcp/ip listeners during glusterd exit
- storage/posix: Handle MAKE_INODE_HANDLE failures
- cluster/ec: Prevent Null dereference in dht-rename
- doc: fix markdown formatting
- upcall: prevent busy loop in reaper thread
- protocol/server : port log messages to a new framework
- nfs.c nfs3.c: port log messages to a new framework
- logging: log "Stale filehandle" on the client as Debug
- snapshot/scheduler: Modified main() function to take arguments.
- tools/glusterfind: print message for good cases
- geo-rep: ignore symlink and hardlink errors in geo-rep
- tools/glusterfind: ignoring deleted files
- spec/geo-rep: Add rsync as dependency for georeplication rpm
- features/changelog: Do htime setxattr without XATTR_REPLACE flag
- tools/glusterfind: Cleanup glusterfind dir after a volume delete
- tools/glusterfind: Cleanup session dir after delete
- geo-rep: Validate use_meta_volume option
- spec: correct the vendor string in spec file
- tools/glusterfind: Fix GFID to Path conversion for dir
- libglusterfs: update glfs-message header for reserved segments
- features/qemu-block: Don't unref root inode
- features/changelog: Avoid setattr fop logging during rename
- common-ha: handle long node names and node names with '-' and '.' in them
- features/marker : Pass along xdata to lower translator
- tools/glusterfind: verifying volume is online
- build: fix compiling on older distributions
- snapshot/scheduler: Handle OSError in os. callbacks
- snapshot/scheduler: Check if GCRON_TASKS exists before
- features/quota: Fix ref-leak
- tools/glusterfind: verifying volume presence
- stripe: fix use-after-free
- Upcall/cache-invalidation: Ignore fops with frame->root->client not set
- rpm: correct date and order of entries in the %changelog
- nfs: allocate and return the hashkey for the auth_cache_entry
- doc: add release notes for 3.7.1
- snapshot: Fix finding brick mount path logic
- glusterd/snapshot: Return correct errno in events of failure - PATCH 2
- rpc: call transport_unref only on non-NULL transport
- heal : Do not invoke glfs_fini for glfs-heal commands
- Changing log level from Warning to Debug
- features/shard: Handle symlinks appropriately in fops
- cluster/ec: EC_XATTR_DIRTY doesn't come in response
- worm: Let lock, zero xattrop calls succeed
- bitrot/glusterd: scrub option should be disabled once bitrot option is
reset
- glusterd/shared_storage: Provide a volume set option to create and mount
the shared storage
- dht: Add lookup-optimize configuration option for DHT
- glusterfs.spec.in: move libgf{db,changelog}.pc from -api-devel to -devel
- fuse: squash 64-bit inodes in readdirp when enable-ino32 is set
- glusterd: do not show pid of brick in volume status if brick is down.
- cluster/dht: fix incorrect dst subvol info in inode_ctx
- common-ha: fix race between setting grace and virt IP fail-over
- heal: Do not call glfs_fini in final builds
- dht/rebalance : Fixed rebalance failure
- cluster/dht: Fix dht_setxattr to follow files under migration
- meta: implement fsync(dir)
- socket: throttle only connected transport
- contrib/timer-wheel: fix deadlock in del_timer()
- snapshot/scheduler: Return proper error code in case of failure
- quota: retry connecting to quotad on ENOTCONN error
- features/quota: prevent statfs frame loss when an error happens during
ancestry
- features/quota : Make "quota-deem-statfs" option "on" by default, when
quota is enabled
- cluster/dht: pass a destination subvol to fop2 variants to avoid races.
- cli: Fix incorrect parse logic for volume heal commands
- glusterd: Bump op version and max op version for 3.7.2
- cluster/dht: Don't rely on linkto xattr to find destination subvol
- afr: honour selfheal enable/disable volume set options
- features/shard: Fix incorrect parameter to get_lowest_block()
- libglusterfs: Copy d_len and dict as well into dst dirent
- features/quota : Do unwind if postbuf is NULL
- cluster/ec: Fix incorrect check for iatt differences
- features/shard: Fix issue with readdir(p) fop
- glusterfs.spec.in: python-gluster should be 'noarch'
- glusterd: Bump op version and max op version for 3.7.1
- glusterd: fix repeated connection to nfssvc failed msgs
Bitrot detection is a technique used to identify an "insidious"
type of disk error where data is silently corrupted with no indication
from the disk to the storage software layer that an error has
occurred. When bitrot detection is enabled on a volume, gluster
performs signing of all files/objects in the volume and scrubs data
periodically for signature verification. All anomalies observed
will be noted in log files.
* Multi threaded epoll for performance improvements
Gluster 3.7 introduces multiple threads to dequeue and process more
requests from epoll queues. This improves performance by processing
more I/O requests. Workloads that involve read/write operations on
a lot of small files can benefit from this enhancement.
* Volume Tiering [Experimental]
Policy based tiering for placement of files. This feature will serve
as a foundational piece for building support for data classification.
Volume Tiering is marked as an experimental feature for this release.
It is expected to be fully supported in a 3.7.x minor release.
* Trashcan
This feature will enable administrators to temporarily store deleted
files from Gluster volumes for a specified time period.
* Efficient Object Count and Inode Quota Support
This improvement enables an easy mechanism to retrieve the number
of objects per directory or volume. Count of objects/files within
a directory hierarchy is stored as an extended attribute of a
directory. The extended attribute can be queried to retrieve the
count.
This feature has been utilized to add support for inode quotas.
* Pro-active Self healing for Erasure Coding
Gluster 3.7 adds pro-active self healing support for erasure coded
volumes.
* Exports and Netgroups Authentication for NFS
This feature adds Linux-style exports & netgroups authentication
to the native NFS server. This enables administrators to restrict
access to specific clients & netgroups for volume/sub-directory
NFSv3 exports.
* GlusterFind
GlusterFind is a new tool that provides a mechanism to monitor data
events within a volume. Detection of events like modified files is
made easier without having to traverse the entire volume.
* Rebalance Performance Improvements
Rebalance and remove brick operations in Gluster get a performance
boost by speeding up identification of files needing movement and
a multi-threaded mechanism to move all such files.
* NFSv4 and pNFS support
Gluster 3.7 supports export of volumes through NFSv4, NFSv4.1 and
pNFS. This support is enabled via NFS Ganesha. Infrastructure changes
done in Gluster 3.7 to support this feature include:
- Addition of upcall infrastructure for cache invalidation.
- Support for lease locks and delegations.
- Support for enabling Ganesha through Gluster CLI.
- Corosync and pacemaker based implementation providing resource
monitoring and failover to accomplish NFS HA.
pNFS support for Gluster volumes and NFSv4 delegations are in beta
for this release. Infrastructure changes to support Lease locks and
NFSv4 delegations are targeted for a 3.7.x minor release.
* Snapshot Scheduling
With this enhancement, administrators can schedule volume snapshots.
* Snapshot Cloning
Volume snapshots can now be cloned to create a new writeable volume.
* Sharding [Experimental]
Sharding addresses the problem of fragmentation of space within a
volume. This feature adds support for files that are larger than
the size of an individual brick. Sharding works by chunking files
into blobs of a configurable size.
Sharding is an experimental feature for this release. It is expected
to be fully supported in a 3.7.x minor release.
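As a rough illustration of the chunking idea only (this is not the
shard translator's code), the sketch below splits a file of a given
size into fixed-size pieces. The function name shard_ranges and the
sizes used are hypothetical.

```python
def shard_ranges(file_size, shard_size):
    """Return (index, offset, length) for each fixed-size chunk of a file.

    Illustrative only: shard_size stands in for the configurable blob
    size mentioned above; the last chunk may be shorter than the rest.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")

    ranges = []
    offset = 0
    index = 0
    while offset < file_size:
        # The final chunk only covers whatever bytes are left.
        length = min(shard_size, file_size - offset)
        ranges.append((index, offset, length))
        offset += length
        index += 1
    return ranges
```

With a 10-byte file and a 4-byte chunk size, the sketch yields two
full chunks and one 2-byte remainder.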
* RCU in glusterd
Thread synchronization and critical section access have been improved
by introducing userspace RCU in glusterd.
* Arbiter Volumes
Arbiter volumes are 3-way replicated volumes where the 3rd brick
of the replica is automatically configured as an arbiter. The 3rd
brick contains only metadata which provides network partition
tolerance and prevents split-brains from happening.
Update to GlusterFS 3.7.1
* Better split-brain resolution
Split-brain resolution can now also be driven by users without
administrative intervention.
* Geo-replication improvements
There have been several improvements in geo-replication for stability
and performance.
* Minor Improvements
- Message ID based logging has been added for several translators.
- Quorum support for reads.
- Snapshot names contain timestamps by default. Subsequent access
to the snapshots should be done by the name listed in gluster
snapshot list.
- Support for gluster volume get <volname> added.
- libgfapi has added handle based functions to get/set POSIX ACLs
based on common libacl structures.
New features:
- Volume Snapshots
Distributed lvm thin-pool based snapshots for backing up volumes
in a Gluster Trusted Storage Pool. Apart from providing cluster
wide co-ordination to trigger a consistent snapshot, several
improvements have been performed through the GlusterFS stack to
make translators more crash consistent. Snapshotting of volumes is
tightly coupled with lvm today but one could also enhance the same
framework to integrate with a backend storage technology like btrfs
that can perform snapshots.
- Erasure Coding
Xavier Hernandez from Datalab added support to perform erasure
coding of data in a GlusterFS volume across nodes in a Trusted
Storage Pool. Erasure Coding requires fewer nodes to provide better
redundancy than an n-way replicated volume and can help in reducing
the overall deployment cost. We look forward to building on this
foundation and delivering more enhancements in upcoming releases.
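The cost argument can be made concrete with some simple arithmetic.
The sketch below is a generic model, not GlusterFS code: it assumes a
layout of "data" fragments plus "redundancy" fragments that survives
the loss of as many bricks as there are redundancy fragments, and
compares it with plain n-way replication. All names are hypothetical.

```python
from fractions import Fraction


def replica_overhead(copies):
    """Return (raw bytes per usable byte, brick failures tolerated)
    for n-way replication, where every brick holds a full copy."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(copies), copies - 1


def dispersed_overhead(data, redundancy):
    """Return (raw bytes per usable byte, brick failures tolerated)
    for an assumed erasure-coded layout of data + redundancy fragments."""
    if data < 1 or redundancy < 0:
        raise ValueError("need data >= 1 and redundancy >= 0")
    return Fraction(data + redundancy, data), redundancy
```

Under these assumptions, 3-way replication stores 3 raw bytes per
usable byte and tolerates 2 brick failures, while a 4+2 layout
tolerates the same 2 failures at 1.5 raw bytes per usable byte.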
- Better SSL support
Multiple improvements to SSL support in GlusterFS. The GlusterFS
driver in OpenStack Manila that provides certificate based access
to tenants relies on these improvements.
- Meta translator
This translator provides a /proc like view for examining internal
state of translators on the client stack of a GlusterFS volume and
certainly looks like an interface that I would be heavily consuming
for introspection of GlusterFS.
- Automatic File Replication (AFR) v2
A significant refactor of the synchronous replication translator
that provides granular entry self-healing and reduced resource
consumption with entry self-heals.
- NetBSD, OSX and FreeBSD ports
Many fixes on the portability front. The NetBSD port passes most
regression tests as of 3.6.0. At this point, none of these ports
are ready to be deployed in production. However, with dedicated
maintainers for each of these ports, we expect to reach production
readiness on these platforms in a future release.
Complete release notes are available at
https://github.com/gluster/glusterfs/blob/release-3.6/doc/release-notes/3.6.0.md
* Improvements for Virtual Machine Image Storage
A number of improvements have been performed to let Gluster volumes provide
storage for Virtual Machine Images. Some of them include:
- qemu / libgfapi integration.
- Causal ordering in write-behind translator.
- Tunables for a gluster volume in group-virt.example.
The above results in significant improvements in performance for VM image
hosting.
* Synchronous Replication Improvements
GlusterFS 3.4 features significant improvements in performance for
the replication (AFR) translator. This is in addition to bug fixes
for volumes that used replica 3.
* Open Cluster Framework compliant Resource Agents
Resource Agents (RA) plug glusterd into Open Cluster Framework
(OCF) compliant cluster resource managers, like Pacemaker.
The glusterd RA manages the glusterd daemon like any upstart or
systemd job would, except that Pacemaker can do it in a cluster-aware
fashion.
The volume RA starts a volume and monitors individual brick's
daemons in a cluster-aware fashion, recovering bricks when their
processes fail.
* POSIX ACL support over NFSv3
setfacl and getfacl commands can now be used on an NFS mount that
exports a gluster volume to set or read POSIX ACLs.
* 3.3.x compatibility
The new op-version infrastructure provides compatibility with the
3.3.x releases of GlusterFS. 3.3.x clients can talk to 3.4.x servers
and vice versa.
If a volume option that corresponds to 3.4 is enabled, then 3.3
clients cannot mount the volume.
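The mount rule above can be modelled as a simple version comparison.
The sketch below is an assumption-laden illustration, not glusterd's
logic: the function name can_mount and the numbering (1 for a
3.3-level client or option, 2 for a 3.4-level one) are hypothetical.

```python
def can_mount(client_op_version, enabled_option_versions):
    """Return True if the client is new enough for every enabled option.

    enabled_option_versions lists the (assumed) op-version that each
    enabled volume option requires; a volume with no such options
    falls back to the lowest level, 1.
    """
    required = max(enabled_option_versions, default=1)
    return client_op_version >= required
```

In this model a 3.3-level client (1) can mount a volume whose enabled
options are all at level 1, but is refused once any level 2 option is
enabled, while a 3.4-level client (2) can mount either volume.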
* Packaging changes
New RPMs for libgfapi and OCF RA are present with 3.4.0.
* Experimental Features
- RDMA-connection manager (RDMA-CM)
- New Block Device translator
- Support for NUFA
As experimental features, we don't expect them to work perfectly
for this release, but you can expect them to improve dramatically
as we make successive 3.4.x releases.
* Minor Improvements:
- The Ext4 file system change which affected readdir workloads for
Gluster volumes has been addressed.
- More options for selecting read-child with AFR are available now.
- Custom layouts possible with distribute translator.
- No 32-aux-gid limit
- SSL support for socket connections.
- Known issues with replica count greater than 2 addressed.
- quick-read and md-cache translators have been refactored.
- open-behind translator introduced.
- Ability to prevent glusterfs from binding to reserved ports.
- statedumps are now created in /var/run/gluster instead of /tmp by default.
and the amount of data memory involved is not easy to forecast. We therefore
raise the limit to the maximum.
Patch from Manuel Bouyer. It helps complete a cvs update on a glusterfs
volume.
This is a maintenance release with no new features. Major bug fixes are:
Bug 2464 Fixed all the issues caused by GFID mismatch during
distribute rename.
Bug 2988 Fixed the issue of high CPU usage when Directory Quota
is enabled.
Bug 3122 Enhanced the volume set interface to support io-threads
on the client.
Bug 3210 Fixed the issue of modified mtime/atime of the files after
rebalance operation.
Bug 3191 Fixed the issue with symlinks during rebalance operation.