linux-hardened/fs/ext4
Wang Shilong 901ed070df ext4: reduce lock contention in __ext4_new_inode
While running number of creating file threads concurrently,
we found heavy lock contention on group spinlock:

FUNC                           TOTAL_TIME(us)       COUNT        AVG(us)
ext4_create                    1707443399           1440000      1185.72
_raw_spin_lock                 1317641501           180899929    7.28
jbd2__journal_start            287821030            1453950      197.96
jbd2_journal_get_write_access  33441470             73077185     0.46
ext4_add_nondir                29435963             1440000      20.44
ext4_add_entry                 26015166             1440049      18.07
ext4_dx_add_entry              25729337             1432814      17.96
ext4_mark_inode_dirty          12302433             5774407      2.13

most of cpu time blames to _raw_spin_lock, here is some testing
numbers with/without patch.

Test environment:
Server : SuperMicro Sever (2 x E5-2690 v3@2.60GHz, 128GB 2133MHz
         DDR4 Memory, 8GbFC)
Storage : 2 x RAID1 (DDN SFA7700X, 4 x Toshiba PX02SMU020 200GB
          Read Intensive SSD)

format command:
        mkfs.ext4 -J size=4096

test command:
        mpirun -np 48 mdtest -n 30000 -d /ext4/mdtest.out -F -C \
                -r -i 1 -v -p 10 -u #first run to load inode

        mpirun -np 48 mdtest -n 30000 -d /ext4/mdtest.out -F -C \
                -r -i 3 -v -p 10 -u

Kernel version: 4.13.0-rc3

Test  1,440,000 files with 48 directories by 48 processes:

Without patch:

File Creation   File removal
79,033          289,569 ops/per second
81,463          285,359
79,875          288,475

With patch:
File Creation   File removal
810669		301694
812805		302711
813965		297670

Creation performance is improved more than 10X with large
journal size. The main problem here is we test bitmap
and do some check and journal operations which could be
slept, then we test and set with lock hold, this could
be racy, and make 'inode' steal by other process.

However, after first try, we could confirm handle has
been started and inode bitmap journaled too, then
we could find and set bit with lock hold directly, this
will mostly gurateee success with second try.

Tested-by: Shuichi Ihara <sihara@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2017-08-24 12:56:35 -04:00
..
acl.c ext4: Don't clear SGID when inheriting ACLs 2017-07-30 23:33:01 -04:00
acl.h ext2/3/4: use generic posix ACL infrastructure 2014-01-25 23:58:19 -05:00
balloc.c The major change this cycle is deleting ext4's copy of the file system 2016-07-26 18:35:55 -07:00
bitmap.c ext4: remove unused header files 2015-04-02 23:47:42 -04:00
block_validity.c ext4: add missing KERN_CONT to a few more debugging uses 2016-10-15 09:57:31 -04:00
dir.c ext4: remove unused variable 2016-09-30 02:14:56 -04:00
ext4.h ext4: make xattr inode reads faster 2017-08-06 00:07:01 -04:00
ext4_extents.h ext4: fix misspellings in comments. 2016-03-09 23:49:05 -05:00
ext4_jbd2.c ext4: add shutdown bit and check for it 2017-02-05 01:28:48 -05:00
ext4_jbd2.h ext4, project: expand inode extra size if possible 2017-08-06 01:00:49 -04:00
extents.c ext4: fix copy paste error in ext4_swap_extents() 2017-08-06 01:33:07 -04:00
extents_status.c scripts/spelling.txt: add "comsume(r)" pattern and fix typo instances 2017-02-27 18:43:47 -08:00
extents_status.h ext4: move procfs registration code to fs/ext4/sysfs.c 2015-09-23 12:46:17 -04:00
file.c ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize 2017-08-05 17:43:24 -04:00
fsmap.c ext4: fix off-by-one fsmap error on 1k block filesystems 2017-06-23 00:58:57 -04:00
fsmap.h ext4: support GETFSMAP ioctls 2017-04-30 00:36:53 -04:00
fsync.c ext4: use errseq_t based error handling for reporting data writeback errors 2017-07-06 07:02:30 -04:00
hash.c ext4: move halfmd4 into hash.c directly 2017-02-02 11:52:14 -05:00
ialloc.c ext4: reduce lock contention in __ext4_new_inode 2017-08-24 12:56:35 -04:00
indirect.c ext4: call journal revoke when freeing ea_inode blocks 2017-06-21 21:36:51 -04:00
inline.c ext4: xattr-in-inode support 2017-06-21 21:10:32 -04:00
inode.c ext4, project: expand inode extra size if possible 2017-08-06 01:00:49 -04:00
ioctl.c ext4, project: expand inode extra size if possible 2017-08-06 01:00:49 -04:00
Kconfig dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
Makefile ext4: support GETFSMAP ioctls 2017-04-30 00:36:53 -04:00
mballoc.c ext4: fix clang build regression 2017-08-14 08:29:18 -04:00
mballoc.h ext4: send parallel discards on commit completions 2017-06-22 23:54:33 -04:00
migrate.c ext4: do not set posix acls on xattr inodes 2017-06-21 21:21:39 -04:00
mmp.c block,fs: use REQ_* flags directly 2016-11-01 09:43:26 -06:00
move_extent.c ext4: add ext4_is_quota_file() 2017-06-22 11:31:25 -04:00
namei.c ext4: make xattr inode reads faster 2017-08-06 00:07:01 -04:00
page-io.c ext4: add support for passing in write hints for buffered writes 2017-06-27 12:05:44 -06:00
readpage.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
resize.c ext4: fix overflow caused by missing cast in ext4_resize_fs() 2017-08-06 01:18:31 -04:00
super.c ext4: remove unused metadata accounting variables 2017-07-30 22:30:11 -04:00
symlink.c ext4: Add statx support 2017-04-03 01:05:58 -04:00
sysfs.c ext4: check return value of kstrtoull correctly in reserved_clusters_store 2017-06-23 01:08:22 -04:00
truncate.h ext4: fix races between page faults and hole punching 2015-12-07 14:28:03 -05:00
xattr.c ext4: add missing xattr hash update 2017-08-14 08:30:06 -04:00
xattr.h ext4: fix __ext4_new_inode() journal credits calculation 2017-07-06 00:01:59 -04:00
xattr_security.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
xattr_trusted.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
xattr_user.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00