The restriction to python 2.7 was noted as being current as of an
ancient version. Dropping that line and building (therefore with 3.9)
succeeded, and the upstream configure.ac searches for python3 and
accepts it. Thus, even without testing, this seems ok.
Changes since 8.0, from ChangeLog:
commit 895183d5a2eceabcfdd81daff87ecab1159c32c6
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Wed Sep 16 07:15:41 2020 +0000
doc: Added release 8.2 notes
Updates: #1485
Change-Id: Ia42666051df1624444ea203bf8b7c876cf78b592
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit 85ff28ace3901a5a54d8de42d33ab2f9ac528ed8
Author: Srijan Sivakumar <ssivakum@redhat.com>
Date: Tue Sep 1 12:48:48 2020 +0530
Events: Fixing coverity issues.
Fixing resource leak reported by coverity scan.
CID: 1431237
Change-Id: I2bed106b3dc4296c50d80542ee678d32c6928c25
Updates: #1060
Signed-off-by: Srijan Sivakumar <ssivakum@redhat.com>
(cherry picked from commit ebc0253269d8a538239dd0b99d42f56ea320b0f0)
commit 93d48622d9ddb96f07a8590312c2885e11751436
Author: srijan-sivakumar <ssivakum@redhat.com>
Date: Sat Jul 18 05:59:09 2020 +0530
Events: Socket creation after getaddrinfo and IPv4 and IPv6 packet capture
Issue: Currently, the socket creation is done
prior to getaddrinfo function being invoked. This
can cause mismatch in the protocol and address
families of the created socket and the result
of the getaddrinfo api. Also, the glustereventsd
UDP server by default only captures IPv4 packets
hence IPv6 packets are not even captured.
Code Changes:
1. Modified the socket creation in such a way that
the parameters taken in are dependent upon the
result of the getaddrinfo function.
2. Created a subclass for adding address family
in glustereventsd.py for both AF_INET and AF_INET6.
3. Modified addresses in the eventsapiconf.py.in
Reasoning behind the approach:
1. If we are using getaddrinfo function then
socket creation should happen only after we
check if we received back valid addresses.
Hence socket creation should come after the call
to getaddrinfo
2. The listening server which pushes the events
to the webhook has to listen for both IPv4
and IPv6 messages as we would not be sure as to
what address family is picked in _gf_event.
Fixes: #1377
Change-Id: I568dcd1a977c8832f0fef981e1f81cac7043c760
Signed-off-by: srijan-sivakumar <ssivakum@redhat.com>
(cherry picked from commit 7c309928591deb8d0188793677958226ac03897a)
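The ordering argued for in the commit above (resolve first, then create the socket from the resolved parameters) can be sketched in Python as follows. This is a minimal illustration, not the code from glustereventsd.py; the helper name `open_udp_socket` and the port number are assumptions made for the example.

```python
import socket


def open_udp_socket(host, port):
    """Resolve first, then create a socket that matches the resolved family."""
    # getaddrinfo may return IPv4 and/or IPv6 results depending on the host.
    results = socket.getaddrinfo(host, port, type=socket.SOCK_DGRAM)
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in results:
        try:
            # The socket parameters come from the lookup result, so the
            # socket's address family cannot disagree with the address.
            sock = socket.socket(family, socktype, proto)
        except OSError as err:
            last_error = err
            continue
        return sock, sockaddr
    raise last_error or OSError("no usable address for %r" % (host,))


# Hypothetical endpoint, chosen only for illustration.
sock, addr = open_udp_socket("127.0.0.1", 24009)
print(sock.family, addr)
sock.close()
```

Creating the socket inside the loop also means an address family the local system cannot use is skipped instead of aborting the whole lookup.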
commit b4cc0988d5e9a5bf354dd4c2cb254ce52546facb
Author: nik-redhat <nladha@redhat.com>
Date: Thu Sep 10 14:55:35 2020 +0530
glusterd: readdir-ahead off by default
Changing the default value of readdir-ahead to
off, but it can be enabled/disabled later on with the
gluster vol set <volname> performance.readdir-ahead enable/disable
command.
Updates: #1472
Change-Id: Idb3e16e8be98d7a811fc8e5d09906919ef50fbab
Signed-off-by: nik-redhat <nladha@redhat.com>
(cherry picked from commit 84a4cf76219b6187fc625740d1a1ebbe40e9f22c)
commit 68db6b60f621d371c4059a7ee728ebb267854708
Author: nik-redhat <nladha@redhat.com>
Date: Wed Aug 26 15:08:56 2020 +0530
glusterd: cksum mismatch on upgrading to latest gluster
Issue:
In gluster versions less than 7, the checksums were calculated
whether or not quota was enabled, and that cksum value
was also stored in the quota.cksum file. But from gluster
version 7 onwards, cksum is calculated only if quota is enabled.
Due to this, the cksums in quota.cksum files differ after upgrading.
Fix:
Added a check on OP_VERSION: if it is less than 7, follow
the previous method; otherwise, follow the latest changes for
cksum calculation.
This change to the cksum calculation was made in
this commit : https://github.com/gluster/glusterfs/commit/3b5eb592f5
Updates: #1332
Change-Id: I7a95e5e5f4d4be4983fb7816225bf9187856c003
Signed-off-by: nik-redhat <nladha@redhat.com>
(cherry picked from commit 865cca1190e233381f975ff36118f46e29477dcf)
Signed-off-by: nik-redhat <nladha@redhat.com>
commit a5d9edce9b59ee00d2a4027fafba126e82e2fcfd
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Fri Sep 4 14:49:50 2020 +0200
open-behind: implement create fop
Open behind didn't implement create fop. As a result, created files
were not counted in the number of open fd's. This could cause future
opens to be delayed when they shouldn't.
This patch implements the create fop. It also fixes a problem when
destroying the stack: when frame->local was not NULL, STACK_DESTROY()
tried to mem_put() it, which is not correct.
Fixes: #1440
Change-Id: Ic982bad07d4af30b915d7eb1fbcef7a847a45869
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
commit 473453c4e2b1b6fc94edbce438dd9a3c0ea58c67
Author: Amar Tumballi <amar@kadalu.io>
Date: Tue Aug 18 14:08:20 2020 +0530
tests: provide an option to mark tests as 'flaky'
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.
Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi <amar@kadalu.io>
(cherry picked from 097db13c11390174c5b9f11aa0fd87eca1735871)
commit 635dcf82505efcdeaf01c4e0450a157b533099ba
Author: Ravishankar N <ravishankar@redhat.com>
Date: Tue Sep 1 11:36:42 2020 +0530
libglusterfs: fix dict leak
Problem:
gf_rev_dns_lookup_cached() allocated struct dnscache->dict if it was null
but the freeing was left to the caller.
Fix:
Moved dict allocation and freeing into corresponding init and fini
routines so that it's easier for the caller to avoid such leaks.
Updates: #1000
Change-Id: I90d6a6f85ca2dd4fe0ab461177aaa9ac9c1fbcf9
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 079f7a7d8a2bd85070c1da4dde2452ca82a1cdbb)
commit f9b8462ba212e0fd572efdf6ade03f4d5c53d11e
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Tue Aug 25 12:31:20 2020 +0000
doc: Updated release 8.1 notes
Updates: #1318
Change-Id: I87787a1aaf59302ad045ed6d2562920e17654678
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit ab40a26dcd9ce8b676f482bf751e57024e227e89
Author: Rinku Kothiya <rkothiya@redhat.com>
Date: Sat Aug 22 17:23:25 2020 +0000
doc: Added release 8.1 notes
Updates: #1318
Change-Id: I14d589bd9af85bdd4ae02902e41d4c5f7d930358
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
commit f120311737cf681c36423d8551e2e218509cd5f7
Author: Ravishankar N <ravishankar@redhat.com>
Date: Wed Aug 19 11:14:25 2020 +0530
afr: add null check for thin-arbiter gfid.
Problem:
Lookup/creation of thin-arbiter ID file happens in background during
mounting. On new volumes, if the ID file creation is in progress, and a
FOP fails on data brick, a post-op (xattrop) is attempted on TA. Since
the TA file's gfid is null at this point, the ASSERT checks in protocol/
client cause a crash.
Fix:
Given that we decided to do Lookup/creation of thin-arbiter in
background, fail the other AFR FOPS on TA if the ID file's gfid is null
instead of winding it down to protocol/client.
Also remove afr_changelog_thin_arbiter_post_op() which seems to be dead
code.
Updates: #763
Change-Id: I70dc666faf55cc5c8f7cf8e7d36085e4fa399c4d
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit f9b5074394e3d2f3b6728aab97230ba620879426)
commit 4398db9d70f34e9a8af88fe3de564a906db7b182
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Wed Aug 19 23:27:38 2020 +0200
open-behind: fix call_frame leak
When an open was delayed, a copy of the frame was created because the
current frame was used to unwind the "fake" open. When the open was
actually sent, the frame was correctly destroyed. However if the file
was closed before needing to send the open, the frame was not destroyed.
This patch correctly destroys the frame in all cases.
Change-Id: I8c00fc7f15545c240e8151305d9e4cf06d653926
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
Fixes: #1440
commit e173c5b0ee32c210a7d36f03f1847c42218a62e5
Author: Mohit Agrawal <moagrawa@redhat.com>
Date: Mon Jul 27 18:08:00 2020 +0530
posix: Implement a janitor thread to close fd
Problem: In the commit fb20713b380e1df8d7f9e9df96563be2f9144fd6 we use
syntask to close fd but we have found the patch is reducing the
performance
Solution: Use janitor thread to close fd's and save the pfd ctx into
ctx janitor list and also save the posix_xlator into pfd object to
avoid the race condition during cleanup in brick_mux environment
Change-Id: Ifb3d18a854b267333a3a9e39845bfefb83fbc092
Fixes: #1396
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
(cherry picked from commit 41b9616435cbdf671805856e487e373060c9455b)
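The general janitor pattern named in the commit above (callers hand descriptors to a dedicated thread instead of closing them inline) can be sketched in Python as follows. This is only an illustration of the pattern under simplifying assumptions; the class `FdJanitor` and its methods are hypothetical and do not mirror the posix xlator's data structures.

```python
import os
import queue
import threading


class FdJanitor:
    """Background thread that closes descriptors handed to it."""

    _STOP = object()

    def __init__(self):
        self._pending = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def schedule_close(self, fd):
        # Callers return immediately; the close happens on the janitor thread.
        self._pending.put(fd)

    def stop(self):
        # Everything queued before the stop marker is closed first.
        self._pending.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            item = self._pending.get()
            if item is self._STOP:
                return
            os.close(item)


janitor = FdJanitor()
read_end, write_end = os.pipe()
janitor.schedule_close(read_end)
janitor.schedule_close(write_end)
janitor.stop()
```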
commit 05060c9664153beb392206ae05a498d4d4178f5f
Author: Leonid Ishimnikov <lishim@fastmail.com>
Date: Thu Aug 13 15:37:50 2020 -0400
glusterd: dump SSL error stack on disconnect
Problem: When a non-SSL connection is attempted on an SSL-enabled
management port, unrelated peers are subsequently disconnected
from the node with a misleading error message.
Cause: A non-SSL client causes OpenSSL to push a wrong version error
into its thread-local error stack, but this error is never
cleared, and it lingers in the stack until the thread is used
by another SSL session, and a certain condition requires the error
stack to be examined, at which time the old error is discovered and
the connection is terminated.
Solution: Log and clear the error stack upon terminating the connection.
Change-Id: I82f3a723285df24dafc88850ae4fca65b69f6ae4
Fixes: #1418
Signed-off-by: Leonid Ishimnikov <lishim@fastmail.com>
(cherry picked from commit bb5801d1480314e09b4203d2525bd01aada5c683)
commit c5fc58c8cb01753e2fed173c76aea1e9cc333862
Author: Vinayakswami Hariharmath <vharihar@redhat.com>
Date: Thu Aug 6 14:39:59 2020 +0530
features/shard: optimization over shard lookup in case of prealloc
Assume that we are preallocating a VM of size 1TB with a shard
block size of 64MB then there will be ~16k shards.
This creation happens in 2 steps in the shard_fallocate() path, i.e.
1. lookup for the shards if any already present and
2. mknod over those shards that do not exist.
But in case of fresh creation, we don't have to lookup for all
shards which are not present as the file size will be 0.
Through this, we can save lookup on all shards which are not
present. This optimization is quite useful in the case of
preallocating a big VM.
Also if the file is already present and the call is to
extend it to a bigger size, then we need not lookup non-existent
shards. Just lookup preexisting shards, populate
the inodes and issue mknod on the extended size.
Fixes: #1425
Change-Id: I60036fe8302c696e0ca80ff11ab0ef5bcdbd7880
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
(cherry picked from commit 2ede911d07c6dc07a0f729526ab590ace77341ae)
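The arithmetic behind the "~16k shards" figure in the commit above, and the split between shards worth a lookup and shards that can go straight to creation, can be sketched in Python. The helpers `shard_count` and `plan_prealloc` are hypothetical names for this illustration and assume binary units (1 TB = 2^40 bytes, 64 MB = 2^26 bytes); they are not the shard translator's code.

```python
def shard_count(size, block_size):
    """Number of block-sized pieces needed to hold `size` bytes."""
    return (size + block_size - 1) // block_size


def plan_prealloc(current_size, new_size, block_size):
    """Split shard indices into those to look up and those to create.

    Assumption for this sketch: shards covered by the current file size
    may already exist and are looked up; shards beyond it cannot exist
    and go straight to creation (mknod in the commit's terms).
    """
    existing = shard_count(current_size, block_size)
    total = shard_count(new_size, block_size)
    return range(0, existing), range(existing, total)


MiB = 1024 * 1024
TiB = 1024 * 1024 * MiB

# Fresh 1 TiB preallocation with a 64 MiB shard block size.
lookup, create = plan_prealloc(0, 1 * TiB, 64 * MiB)
print(len(lookup), len(create))  # prints: 0 16384
```

For a fresh file the lookup range is empty, which is the saving the commit describes; when extending an existing file only the indices below the current size fall in the lookup range.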
commit 8ef4b79162a0409023b10a15561c84606e0b3ae0
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date: Mon May 4 14:30:57 2020 +0530
extras: Modify group 'virt' to include network-related options
This is needed to work around an issue seen where vms running on
online hosts are getting killed when a different host is rebooted
in ovirt-gluster hyperconverged environments. Actual RCA is quite
lengthy and documented in the github issue. Please refer to it
for more details.
Change-Id: Ic25b5f50144ad42458e5c847e1e7e191032396c1
Fixes: #1217
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit 5391f16fc4aa00f75af2a4c2707768370ace5f6c)
commit 7b372cdeed876e68293620c25c6821324068fb54
Author: Ashish Pandey <aspandey@redhat.com>
Date: Thu Jul 23 11:07:32 2020 +0530
cluster/ec: Remove stale entries from indices/xattrop folder
Problem:
If a gfid is present in indices/xattrop folder while
the file/dir is actually healthy and all the xattrs are healthy,
it causes a lot of lookups by shd on an entry which does not need
to be healed.
This whole process eats up a lot of CPU without doing meaningful
work.
Solution:
Set trusted.ec.dirty xattr of the entry so that actual heal process
happens and at the end of it, during unset of dirty, gfid entry from
indices/xattrop will be removed.
Change-Id: Ib1b9377d8dda384bba49523e9ff6ba9f0699cc1b
Fixes: #1385
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
(cherry picked from commit ba1b0a471dec968633f89c7f790b099fb4ad700d)
commit 7ff51badda5cbcbaa17f729d1e4ab715c462396a
Author: Mohit Agrawal <moagrawa@redhat.com>
Date: Sat Aug 1 09:28:47 2020 +0530
glusterd: Increase buffer length to save multiple hostnames in peer file
Problem: At the time of handling friend update request glusterd updates peer
file and if DNS has returned multiple hostnames for the same IP, glusterd
saves all hostnames in peer file. In commit 1fa089e7a2b180e0bdcc1e7e09a63934a2a0c0ef
we changed the approach to save all key value pairs in a single shot.
If the buffer does not have enough space to store the hostnames, glusterd
writes a partial hostname in the peer file.
Solution: To avoid the failure, increase the buffer length.
Change-Id: Iee969d165333e9c5ba69431d474c541b8f12d442
Fixes: #1407
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
(cherry picked from commit 6e8e73a06d71382f8f6e3cd83fe72692d19e66ba)
commit 4be26c88732f55b38da171c86334eddbdaac5c14
Author: Sunny Kumar <sunkumar@redhat.com>
Date: Tue May 19 16:13:01 2020 +0100
geo-rep: Fix corner case in rename on mkdir during hybrid crawl
Problem:
The issue is being hit during hybrid mode while handling rename on slave.
In this special case the rename is recorded as mkdir and geo-rep processes it
by resolving the path from backend.
While resolving the backend path during this special handling one corner case is not considered.
<snip>
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 588, in entry_ops
src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 710, in get_slv_dir_path
dir_entry = os.path.join(pfx, pargfid, basename)
File "/usr/lib64/python2.7/posixpath.py", line 75, in join
if b.startswith('/'):
AttributeError: 'int' object has no attribute 'startswith'
In python3:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.8/posixpath.py", line 90, in join
genericpath._check_arg_types('join', a, *p)
File "/usr/lib64/python3.8/genericpath.py", line 152, in _check_arg_types
raise TypeError(f'{funcname}() argument must be str, bytes, or '
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int'
</snip>
Backport of:
>Patch link: https://review.gluster.org/#/c/glusterfs/+/24468/
>Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
>Fixes: #1250
>Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
Change-Id: I8b926899c60ad8c4ffc886d57028ba70fd21e332
Fixes: #1250
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
(cherry picked from commit 27f5c8ba844e9da54fc1304df4ffe015a3bbb9bd)
Change-Id: I171eb9ad4e30f49cfe86cb258918682d3c0f5af9
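The failure in the python3 traceback above can be reproduced in isolation: os.path.join() rejects a non-string component. The guard shown afterwards is a hypothetical helper (`safe_join`) for illustration only; it is not the upstream fix, and the path components are made-up values.

```python
import os.path


def safe_join(*parts):
    """Join path components, or return None if any component is not a str."""
    if not all(isinstance(part, str) for part in parts):
        return None
    return os.path.join(*parts)


# An int where a basename is expected triggers the error from the traceback.
try:
    os.path.join(".gfid", "parent-gfid", 5)
except TypeError as err:
    print("join failed:", err)

print(safe_join(".gfid", "parent-gfid", 5))       # prints: None
print(safe_join(".gfid", "parent-gfid", "name"))
```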
commit 269ece312c9fd890c74c46e79de70efe1720752c
Author: Xavi Hernandez <xhernandez@redhat.com>
Date: Thu Jul 2 18:08:52 2020 +0200
cluster/ec: Improve detection of new heals
When EC successfully healed a directory it assumed that maybe other
entries inside that directory could have been created, which could
require additional heal cycles. For this reason, when the heal happened
as part of one index heal iteration, it triggered a new iteration.
The problem happened when the directory was healthy, so no new entries
were added, but its index entry was not removed for some reason. In
this case self-heal started an endless loop healing the same directory
continuously, causing high CPU utilization.
This patch improves detection of new files added to the heal index so
that a new index heal iteration is only triggered if there is new work
to do.
Change-Id: I2355742b85fbfa6de758bccc5d2e1a283c82b53f
Fixes: #1354
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
commit 19dc7b37fc9f6d6037958e5dd3c0c6a4a993e2af
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date: Thu Sep 12 11:07:10 2019 +0530
features/shard: Convert shard block indices to uint64
This patch fixes a crash in FOPs that operate on really large sharded
files where number of participant shards could sometimes exceed
signed int32 max.
The patch also adds GF_ASSERTs to ensure that number of participating
shards is always greater than 0 for files that do have more than one
shard.
Change-Id: I354de58796f350eb1aa42fcdf8092ca2e69ccbb6
Fixes: #1348
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
(cherry picked from commit cdf01cc47eb2efb427b5855732d9607eec2abc8a)
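Why a count just past the signed 32-bit maximum is a problem can be shown with a generic wraparound example in Python. This only illustrates the integer behaviour; the commit does not describe how the overflow surfaced inside the shard translator, and the helper names are assumptions for the sketch.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(value):
    """Value as it would be stored in a signed 32-bit integer."""
    return ctypes.c_int32(value).value


def as_uint64(value):
    """Value as it would be stored in an unsigned 64-bit integer."""
    return ctypes.c_uint64(value).value


shards = INT32_MAX + 1
print(as_int32(shards))   # prints: -2147483648
print(as_uint64(shards))  # prints: 2147483648
```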
commit 96e7ff7396a8e18ca69d5198be0a8d29bcc37129
Author: Vinayakswami Hariharmath <vharihar@redhat.com>
Date: Wed Jun 3 18:58:56 2020 +0530
features/shard: Use fd lookup post file open
Issue:
When a process has the open fd and the same file is
unlinked in the middle of the operations, then file based
lookup fails with ENOENT or stale file
Solution:
When the file is already open and fd is available, use fstat
to get the file attributes
Change-Id: I0e83aee9f11b616dcfe13769ebfcda6742e4e0f4
Fixes: #1281
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
(cherry picked from commit 71dd19f710b81136f318b3a95ae430971198ee70)
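The behaviour the commit above relies on can be seen on any POSIX system: after unlink, a path-based stat fails while fstat on the still-open descriptor succeeds. The sketch below uses a local temporary file and a hypothetical function name (`stat_after_unlink`); it illustrates the system call semantics, not the shard translator code.

```python
import os
import tempfile


def stat_after_unlink():
    """Return (path_lookup_failed, size_from_fstat) for an unlinked open file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"data")
        os.unlink(path)  # the name is gone, the open descriptor is not

        try:
            os.stat(path)
            path_lookup_failed = False
        except FileNotFoundError:
            path_lookup_failed = True  # ENOENT, as in the commit message

        # The descriptor still refers to the file, so fstat works.
        return path_lookup_failed, os.fstat(fd).st_size
    finally:
        os.close(fd)


print(stat_after_unlink())  # prints: (True, 4)
```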
commit 6b10b33f8a9bce054fec980583cc597f5a438bb5
Author: Soumya Koduri <skoduri@redhat.com>
Date: Thu Jul 2 02:07:56 2020 +0530
Issue with gf_fill_iatt_for_dirent
In "gf_fill_iatt_for_dirent()", while calculating inode_path for loc,
the inode should be the parent's. Instead it is loc.inode, which results in an error
and eventually lookup/readdirp fails.
This patch fixes the same.
Change-Id: Ied086234a4634e8cb13520521ac547c87b3c76b5
Fixes: #1351
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit ab8308333aaf033e07dbbdf2f69f9313a7e311f3)
commit e8aedcd40f9f24e5b821e1539275e40ebfccca94
Author: Pranith Kumar K <pkarampu@redhat.com>
Date: Fri May 29 14:24:53 2020 +0530
cluster/afr: Delay post-op for fsync
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.
Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.
Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
commit 30b95ff9cdec72d9089f4882dafca447ae3174f1
Author: Kaleb S. KEITHLEY <kkeithle@redhat.com>
Date: Thu Jul 2 15:52:15 2020 -0400
api: libgfapi symbol versions break LTO in Fedora rawhide/f33
The way symbol versions are implemented is incompatible with gcc-10 and LTO.
Fedora provenpackager Jeff Law (law [at] redhat.com) writes in the
Fedora dist-git glusterfs.spec:
This package uses top level ASM constructs which are incompatible with LTO.
Top level ASMs are often used to implement symbol versioning. gcc-10
introduces a new mechanism for symbol versioning which works with LTO.
Converting packages to use that mechanism instead of toplevel ASMs is
recommended.
In particular, note that the version of gluster in Fedora rawhide/f33 is
glusterfs-8.0RC0. Once this fix is merged it will be necessary to backport
it to the release-8 branch.
At the time that gfapi symbol versions were first implemented we copied
the GNU libc (glibc) symbol version implementation following Uli Drepper's
symbol versioning HOWTO.
Now gcc-10 has a symver attribute that can be used instead. (Maybe it
has been there all along?)
Both the original implementation and this implementation yield the same
symbol versions. This can be seen by running
`nm -D --with-symbol-versions libgfapi.so`
on the libgfapi.so built before and after applying this fix.
Change-Id: I05fda580afacfff1bfc07be810dd1afc08a92fb8
Fixes: #1352
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
- Properly screen the .attribute directory where NetBSD UFS1 stores
extended attributes.
- Fix NULL pointer usage.
- Make FUSE notification optional at configure time since NetBSD does not
implement them.
Also update comments in patches with references to upstream commits.
There is an important performance bug fix specific to NetBSD here,
which disables gfid2path by default. This feature causes a huge
number of different extended attributes to be created, and the
NetBSD implementation does not scale well with it.
In order to recover a server after the feature is disabled, stop
glusterfs daemons, disable extended attributes using extattrctl,
remove ${BRICK_ROOT}/.attribute/system/trusted.gfid2path.*,
re-enable extended attributes and restart glusterfs.