arch_dup_task_struct() copies thread.fpu if fpu_allocated(), this
looks suboptimal and misleading. Say, a forking process could use
FPU only once in a signal handler but now tsk_used_math(src) == F,
in this case the child gets a copy of fpu->state for no reason. The
child won't use the saved registers anyway even if it starts to use
FPU, this can only avoid fpu_alloc() in do_device_not_available().
Change this code to check tsk_used_math(current) instead. We still
need to clear fpu->has_fpu/state, we could do this memset(0) under
fpu_allocated() check but I think this doesn't make sense. See also
the next change.
use_eager_fpu() assumes that fpu_allocated() is always true, but a
forking task (and thus its child) must always have PF_USED_MATH set,
otherwise the child can either use FPU without used_math() (note that
switch_fpu_prepare() doesn't do stts() in this case), or it will be
killed by do_device_not_available()->BUG_ON(use_eager_fpu).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20140902175723.GA21659@redhat.com
Reviewed-by: Suresh Siddha <sbsiddha@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>