[cap-talk] execve() and CLOEXEC (was: [Apparmor-dev] Object Capabilities for AppArmor)

Jonathan S. Shapiro shap at eros-os.com
Sat Nov 24 13:14:22 EST 2007


On Sat, 2007-11-24 at 15:54 +0000, Mark Seaborn wrote:
> Crispin Cowan <crispin at mercenarylinux.com> wrote:
> 
> > Rob Meijer wrote:
> 
> > > Further, where there may be some justification for making Fd passing
> > > across exec policy based (although I would feel it's a bit overkill for
> > > the problem it solves), doing the same for Fd's passed over sockets would,
> > > I feel, be a bit too much.
> > >
> > Because of the nature of the APIs, it is very easy for software to
> > mistakenly delegate permission by failing to close on exec, because
> > delegation across exec is the default behavior unless you block it. In
> > contrast, delegation over a socket cannot happen by mistake, you must
> > explicitly pass the FD through the socket. Therefore I am ok with the
> > idea that our policy control should only apply to passing via exec, and
> > we can trust that if software delegates over a socket, that it really
> > meant to do that. If we want to stop delegation by socket, then we need
> > to block the communication entirely.

If I may interject:

The close on exec issue is a problem of badly selected defaults. A
better design would have the following behavior:

 - Close on exec by default, ioctl required to express that
   surviving exec is intended.

 - All descriptors that survive an exec boundary are restored to "close
   on exec by default" behavior.

Curiously, the majority of UNIX programs would operate just fine if the
current behavior were changed for fd >= 3.
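
For reference, this is roughly how the per-descriptor flag is inspected and changed with the existing fcntl() interface. It is a minimal sketch: the helper names set_cloexec() and is_cloexec() are made up for illustration, and only F_GETFD, F_SETFD and FD_CLOEXEC come from the real API.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper: set (on != 0) or clear (on == 0) the
 * close-on-exec flag on fd. Returns 0 on success, -1 on error. */
int set_cloexec(int fd, int on)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;

    if (on)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;

    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: returns 1 if fd would be closed by exec,
 * 0 if it would survive, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A descriptor freshly returned by open() without extra flags reports 0 from is_cloexec(), which is the "survives exec unless you say otherwise" default under discussion.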

The socket behavior (also, I think, for pipes) of I_SENDFD is different
because it requires an explicit expression of intent by the program.
That's not a cure-all, but it goes a long way toward mitigating stupid
mistakes.
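
To illustrate how explicit that expression of intent is, here is a sketch of passing a descriptor over a UNIX-domain socket. It assumes the BSD-style SCM_RIGHTS ancillary-data mechanism rather than the STREAMS I_SENDFD ioctl named above. The function names send_fd(), recv_fd() and pass_fd_roundtrip() are invented for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send fd_to_send over the connected UNIX-domain socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
    assert(sock >= 0 && fd_to_send >= 0);

    char byte = 'F';   /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {            /* union keeps the control buffer suitably aligned */
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;
}

/* Demo: pass the write end of a pipe across a socketpair, then write
 * through the received copy. Returns 0 if the byte arrives, else -1. */
int pass_fd_roundtrip(void)
{
    int sv[2], p[2];
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(p) == -1)
        return -1;
    if (send_fd(sv[0], p[1]) == -1)
        return -1;
    int got = recv_fd(sv[1]);
    if (got == -1)
        return -1;
    if (write(got, "x", 1) != 1 || read(p[0], &c, 1) != 1)
        return -1;
    return c == 'x' ? 0 : -1;
}
```

Nothing here can happen by omission: the sender has to build the control message and name the descriptor, which is the contrast with exec being drawn above.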

> OK, I agree with this, although I don't entirely agree with what you
> say earlier:
> 
> > There is already such a system call. "man 2 fcntl" and look for the
> > CLOEXEC flag.
> > 
> > The issue is that some software forgets to set this flag.
> 
> The problem is that software has to set this flag at all.  The problem
> is that this interface (execve() + fcntl() + all the calls that add
> FDs to the FD table) is very easy to use in an unsafe way.

Exactly.

> But it is not very difficult to use execve() in a safe way: you just
> close all FDs except the ones you know you want to pass to the new
> program before calling execve().  (Performance is another issue.  This
> assumes that there is an efficient way to close all but a specified
> set of FDs.)
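
A minimal sketch of the "close everything except a known set" pattern described in the quoted text, to be run just before execve(). The name close_fds_except() and its parameters are hypothetical. The caller supplies the descriptor range, and real code would have to pick the upper bound itself, for example from sysconf(_SC_OPEN_MAX), which can be very large on some systems.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 1 if fd appears in keep[0..nkeep-1], else 0. */
static int in_keep_set(int fd, const int *keep, size_t nkeep)
{
    for (size_t i = 0; i < nkeep; i++)
        if (keep[i] == fd)
            return 1;
    return 0;
}

/* Hypothetical helper: close every descriptor in [lowfd, highfd] that is
 * not listed in keep. Returns the number of descriptors actually closed. */
int close_fds_except(int lowfd, int highfd, const int *keep, size_t nkeep)
{
    assert(lowfd >= 0 && (keep != NULL || nkeep == 0));

    int closed = 0;
    for (int fd = lowfd; fd <= highfd; fd++) {
        if (in_keep_set(fd, keep, nkeep))
            continue;
        if (close(fd) == 0)   /* a failure with EBADF just means fd was not open */
            closed++;
    }
    return closed;
}
```

The linear scan is the performance concern mentioned in the quote: the cost is one close() call per candidate descriptor number, whether or not it is open.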

Closing FDs is usually fast. When it is not, the cost has more to do
with the cost of "sync on close" than with any cost of system calls.
Unfortunately, this has no bearing on "close on exec". In the "close on
exec" case, the most common pattern is a fork followed by an exec. In
this scenario, the close on exec is generally closing a descriptor that
is still open in the pre-fork process. The close is therefore
decrementing an open count rather than performing a true close.
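
The fork-then-exec case can be sketched as follows. This is an illustration only: the function name is invented, and it assumes /bin/sh exists. Both ends of a pipe are marked close-on-exec, a child execs, and the parent then checks that its own descriptors still work, since the child's exec only dropped the child's references.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the parent's pipe is still usable after a child has
 * exec'd with both pipe ends marked close-on-exec, -1 otherwise. */
int parent_fd_survives_child_exec(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;
    if (fcntl(p[0], F_SETFD, FD_CLOEXEC) == -1 ||
        fcntl(p[1], F_SETFD, FD_CLOEXEC) == -1)
        return -1;
    assert((fcntl(p[1], F_GETFD) & FD_CLOEXEC) != 0);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: a successful exec closes the child's copies of p[0] and p[1]. */
        execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
        _exit(127);   /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Parent: the underlying pipe was never torn down, so I/O still works. */
    char c = 0;
    if (write(p[1], "x", 1) != 1 || read(p[0], &c, 1) != 1)
        return -1;
    close(p[0]);
    close(p[1]);
    return c == 'x' ? 0 : -1;
}
```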

> The current work to extend FD-creating system calls on Linux to be
> able to set the CLOEXEC flag at the time the FD is added/created
> doesn't really address the root of the problem.

Revising the interface to provide a smarter default is a good thing. The
design error is that the option flag should be "O_PASSONEXEC", not
"O_CLOEXEC".  Also, the fcntl interface needs to be updated such that the
bit can be both set and cleared.

> It's less practical to change
> FD-creating code and it's less reasonable because this problem is not
> really the responsibility of FD-creating code.

This depends on how you select the default. If the default behavior is
close on exec, and an option must be added for pass on exec, then (a)
what you say is probably not right, (b) the number of lines requiring
modification is small, because the number of descriptors (therefore
opens) passed across exec in practice is quite small, and (c) the burden
falls on the right programs -- those doing something unusual.

The more interesting question is: what should the behavior of dup(2) be?
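
For context on the dup(2) question, here is how duplication behaves on current POSIX systems: dup() hands back a descriptor with FD_CLOEXEC cleared, regardless of the original, while fcntl() with F_DUPFD_CLOEXEC (a later POSIX addition, assumed available here) sets it on the copy. The helper names below are invented for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd, choosing the close-on-exec state
 * of the copy explicitly. Returns the new fd, or -1 on error. */
int dup_with_cloexec(int fd, int cloexec)
{
    assert(fd >= 0);
    return fcntl(fd, cloexec ? F_DUPFD_CLOEXEC : F_DUPFD, 0);
}

/* Returns 1 if a plain dup() of a close-on-exec descriptor is itself
 * close-on-exec, 0 if it is not, -1 on error. */
int plain_dup_is_cloexec(int fd)
{
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1)
        return -1;

    int copy = dup(fd);
    if (copy == -1)
        return -1;

    int flags = fcntl(copy, F_GETFD);
    close(copy);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

So under today's interface a dup() silently turns a close-on-exec descriptor into one that survives exec, which is one concrete reason the question matters.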


shap


