linux-hardened/net
Jarek Poplawski 345aa03120 ipv4: Fix fib_trie rebalancing, part 4 (root thresholds)
Pawel Staszewski wrote:
<blockquote>
Some time ago i report this:
http://bugzilla.kernel.org/show_bug.cgi?id=6648

and now with 2.6.29 / 2.6.29.1 / 2.6.29.3 and 2.6.30 it back
dmesg output:
oprofile: using NMI interrupt.
Fix inflate_threshold_root. Now=15 size=11 bits
...
Fix inflate_threshold_root. Now=15 size=11 bits

cat /proc/net/fib_triestat
Basic info: size of leaf: 40 bytes, size of tnode: 56 bytes.
Main:
        Aver depth:     2.28
        Max depth:      6
        Leaves:         276539
        Prefixes:       289922
        Internal nodes: 66762
          1: 35046  2: 13824  3: 9508  4: 4897  5: 2331  6: 1149  7: 5  9: 1  18: 1
        Pointers: 691228
Null ptrs: 347928
Total size: 35709  kB
</blockquote>

The current threshold for root resizing seems too aggressive. It
causes misleading warnings during big updates, and it might also be
responsible for memory problems, especially with non-preempt configs,
where RCU freeing is delayed long after call_rcu.
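
To illustrate why a lower threshold counts as more aggressive, here is a
minimal sketch of a percentage-based inflate check. It is a simplified
model, not the kernel's actual formula: should_inflate() and the slot
counts are hypothetical, the value 15 and the 11-bit size are taken from
the dmesg line quoted above, and 30 is an arbitrary higher value chosen
only for contrast.

```python
def should_inflate(used_slots: int, bits: int, threshold_pct: int) -> bool:
    """Simplified model: double a node that has 2**bits child slots once
    the share of occupied slots reaches threshold_pct percent."""
    total_slots = 1 << bits
    return 100 * used_slots >= threshold_pct * total_slots


# Hypothetical root node: 11 bits (2048 slots), 400 slots occupied (~19.5%).
used, bits = 400, 11

low = should_inflate(used, bits, 15)   # threshold reported in dmesg
high = should_inflate(used, bits, 30)  # arbitrary higher threshold

print(f"threshold 15: inflate={low}")
print(f"threshold 30: inflate={high}")
```

In this model the same node is doubled under the lower threshold and left
alone under the higher one, so a low root threshold tends to produce a
wider, sparser root.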

It should also be mentioned that, because changes during
resizing/rebalancing are non-atomic, the current lookup algorithm can
miss valid leaves. This is an additional argument for shortening these
activities, even at the cost of minimally longer searching.

This patch restores the values used before the patch "[IPV4]: fib_trie
root node settings", commit 965ffea43d from v2.6.22.

Pawel's report:
<blockquote>
I dont see any big change of (cpu load or faster/slower
routing/propagating routes from bgpd or something else) - in avg there
is from 2% to 3% more of CPU load i dont know why but it is - i change
from "preempt" to "no preempt" 3 times and check this my "mpstat -P ALL
1 30"
always avg cpu load was from 2 to 3% more compared to "no preempt"
[...]
cat /proc/net/fib_triestat
Basic info: size of leaf: 20 bytes, size of tnode: 36 bytes.
Main:
        Aver depth:     2.44
        Max depth:      6
        Leaves:         277814
        Prefixes:       291306
        Internal nodes: 66420
          1: 32737  2: 14850  3: 10332  4: 4871  5: 2313  6: 942  7: 371  8: 3  17: 1
        Pointers: 599098
Null ptrs: 254865
Total size: 18067  kB
</blockquote>

According to this and other similar reports, average depth is slightly
increased (~0.2) and root nodes are shorter (log 17 vs. 18), but
there is no visible performance decrease. So, until memory handling is
improved or parameters for changing this individually are added, this
patch resets to safer defaults.
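
The comparison can be reproduced from the two fib_triestat dumps quoted
in this message. The sketch below is only an illustration:
parse_triestat() and null_ratio() are hypothetical helpers, and only the
counters copied from the dumps are real data. The two dumps report
different leaf and tnode sizes, so "Total size" is not directly
comparable and the sketch sticks to size-independent figures.

```python
import re

# Counters copied from the two fib_triestat dumps quoted in this message.
BEFORE = """
        Aver depth:     2.28
        Pointers: 691228
Null ptrs: 347928
"""
AFTER = """
        Aver depth:     2.44
        Pointers: 599098
Null ptrs: 254865
"""


def parse_triestat(text):
    """Pull the three counters used below out of fib_triestat-style text."""
    patterns = {
        "aver_depth": r"Aver depth:\s+([\d.]+)",
        "pointers": r"Pointers:\s+(\d+)",
        "null_ptrs": r"Null ptrs:\s+(\d+)",
    }
    return {name: float(re.search(p, text).group(1))
            for name, p in patterns.items()}


def null_ratio(stats):
    """Fraction of child pointers that are empty."""
    return stats["null_ptrs"] / stats["pointers"]


before = parse_triestat(BEFORE)
after = parse_triestat(AFTER)

print(f"depth change: {after['aver_depth'] - before['aver_depth']:+.2f}")
print(f"null pointer share: {null_ratio(before):.1%} -> {null_ratio(after):.1%}")
```

With these inputs the average depth rises by 0.16, while the share of
empty child pointers drops from about 50.3% to about 42.5%.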

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 10:46:45 -07:00
9p net/9p: Fix crash due to bad mount parameters. 2009-07-02 13:17:01 -07:00
802 net: remove COMPAT_NET_DEV_OPS 2009-05-25 01:53:53 -07:00
8021q 8021q: Vlan driver should use rcu_barrier() on unload instead of syncronize_net() 2009-06-10 01:11:22 -07:00
appletalk net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
atm atm: sk_wmem_alloc initial value is one 2009-06-18 00:29:12 -07:00
ax25 net: Move rx skb_orphan call to where needed 2009-06-23 16:36:25 -07:00
bluetooth net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
bridge bridge: Use rcu_barrier() instead of syncronize_net() on unload. 2009-06-26 13:51:32 -07:00
can can: af_can.c use rcu_barrier() on module unload. 2009-06-10 01:11:24 -07:00
core gro: Flush GRO packets in napi_disable_pending path 2009-06-26 19:27:04 -07:00
dcb DCB: fix kfree(skb) 2009-01-04 17:29:21 -08:00
dccp ipv6: Use correct data types for ICMPv6 type and code 2009-06-23 04:31:07 -07:00
decnet decnet: Use rcu_barrier() on module unload. 2009-06-26 13:51:27 -07:00
dsa dsa: fix 88e6xxx statistics counter snapshotting 2009-07-05 18:03:35 -07:00
econet net: sk_wmem_alloc has initial value of one, not zero 2009-06-17 04:31:25 -07:00
ethernet net: remove COMPAT_NET_DEV_OPS 2009-05-25 01:53:53 -07:00
ieee802154 nl802154: add module license and description 2009-06-29 18:20:28 +04:00
ipv4 ipv4: Fix fib_trie rebalancing, part 4 (root thresholds) 2009-07-08 10:46:45 -07:00
ipv6 IPv6: preferred lifetime of address not getting updated 2009-07-03 19:10:13 -07:00
ipx net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
irda net: Move rx skb_orphan call to where needed 2009-06-23 16:36:25 -07:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-06-22 11:57:09 -07:00
key net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
lapb
llc net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
mac80211 mac80211: Use rcu_barrier() on unload. 2009-06-26 13:51:36 -07:00
netfilter netfilter: xtables: conntrack match revision 2 2009-06-29 14:31:46 +02:00
netlabel netlabel: Use genl_register_family_with_ops() 2009-05-21 16:50:24 -07:00
netlink net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
netrom net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
packet net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
phonet Phonet: generate Netlink RTM_DELADDR when destroying a device 2009-06-25 02:58:16 -07:00
rds Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-05-18 21:08:20 -07:00
rfkill rfkill: export persistent attribute in sysfs 2009-06-19 11:50:18 -04:00
rose net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
rxrpc RxRPC: Don't attempt to reuse aborted connections 2009-06-16 21:20:14 -07:00
sched net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
sctp sctp: fix warning at inet_sock_destruct() while release sctp socket 2009-07-06 12:47:08 -07:00
sunrpc sunrpc: Use rcu_barrier() on unload. 2009-06-26 13:51:34 -07:00
tipc tipc: Use genl_register_family_with_ops() 2009-05-21 16:50:23 -07:00
unix net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
wanrouter wanrouter: fix sparse warnings: context imbalance 2009-02-26 23:13:36 -08:00
wimax wimax: fix warning caused by not checking retval of rfkill_set_hw_state() 2009-06-11 11:12:48 -07:00
wireless cfg80211: validate station settings 2009-06-19 11:50:24 -04:00
x25 net: correct off-by-one write allocations reports 2009-06-18 00:29:12 -07:00
xfrm xfrm: use xfrm_addr_cmp() instead of compare addresses directly 2009-06-29 19:41:46 -07:00
compat.c net: socket infrastructure for SO_TIMESTAMPING 2009-02-15 22:43:35 -08:00
Kconfig net: add IEEE 802.15.4 socket family implementation 2009-06-09 05:25:32 -07:00
Makefile net: add IEEE 802.15.4 socket family implementation 2009-06-09 05:25:32 -07:00
nonet.c
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2009-04-06 18:05:43 -07:00
sysctl_net.c net: sysctl_net - use net_eq to compare nets 2009-03-16 16:23:30 +01:00
TUNABLE