[cap-talk] execve() and CLOEXEC (was: [Apparmor-dev] Object Capabilities for AppArmor)
Mark Seaborn
mrs at mythic-beasts.com
Sat Nov 24 10:54:08 EST 2007
Crispin Cowan <crispin at mercenarylinux.com> wrote:
> Rob Meijer wrote:
> > Further, where there may be some justification for making Fd passing
> > across exec policy based (although I would feel it's a bit of overkill
> > for the problem it solves), doing the same for Fd's passed over sockets
> > would I feel be a bit too much.
> >
> Because of the nature of the APIs, it is very easy for software to
> mistakenly delegate permission by failing to close on exec, because
> delegation across exec is the default behavior unless you block it. In
> contrast, delegation over a socket cannot happen by mistake, you must
> explicitly pass the FD through the socket. Therefore I am ok with the
> idea that our policy control should only apply to passing via exec, and
> we can trust that if software delegates over a socket, that it really
> meant to do that. If we want to stop delegation by socket, then we need
> to block the communication entirely.
OK, I agree with this, although I don't entirely agree with what you
said earlier:
> There is already such a system call. "man 2 fcntl" and look for the
> CLOEXEC flag.
>
> The issue is that some software forgets to set this flag.
The problem is that software has to set this flag at all: this
interface (execve() + fcntl() + all the calls that add FDs to the FD
table) is very easy to use in an unsafe way.
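
To make the per-FD burden concrete, here is a minimal sketch in Python of
what FD-creating code has to remember to do. The helper name set_cloexec()
is made up for illustration; the fcntl calls are the standard ones. Python
3.4 and later already create FDs non-inheritable (PEP 446), so the sketch
first marks the pipe ends inheritable to mimic the C default.

```python
import fcntl
import os

def set_cloexec(fd):
    """Set FD_CLOEXEC on fd so that it is closed across execve()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

# Mimic the C default: a freshly created FD survives execve().
r, w = os.pipe()
os.set_inheritable(r, True)
os.set_inheritable(w, True)

# The step that is easy to forget, repeated for every FD that is created.
set_cloexec(r)
set_cloexec(w)
```

Every call site that creates an FD needs the equivalent of the last two
lines, which is why forgetting one is so easy.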
But it is not very difficult to use execve() in a safe way: you just
close all FDs except the ones you know you want to pass to the new
program before calling execve(). (Performance is another issue. This
assumes that there is an efficient way to close all but a specified
set of FDs.)
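
A minimal sketch of "close all FDs except the ones you want", in Python.
It assumes a Linux-style /proc/self/fd for enumerating open FDs, and the
names list_open_fds() and close_all_except() are hypothetical, not part of
any existing library.

```python
import os

def list_open_fds():
    """Return the set of FDs open in this process (assumes /proc/self/fd)."""
    fds = set()
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            os.fstat(fd)  # skips the FD that listdir() itself used, now closed
        except OSError:
            continue
        fds.add(fd)
    return fds

def close_all_except(keep):
    """Close every open FD whose number is not in keep."""
    for fd in list_open_fds() - set(keep):
        try:
            os.close(fd)
        except OSError:
            pass  # already closed, so there is nothing left to leak
```

A caller would run something like close_all_except({0, 1, 2, log_fd})
immediately before os.execve(), where log_fd stands for whatever FD it
deliberately wants the new program to have. Enumerating /proc/self/fd
costs time proportional to the number of open FDs rather than to the FD
limit, which is the performance concern mentioned above.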
So execve() is unsafe, but it's easy to provide a wrapper around
execve() that provides a safe interface. In the general case, a
safe_execve() would take an array mapping from FD indexes in the new
FD table to FD indexes in the current table. If it worked like this,
it would be very similar to how capability arguments are passed in
capability invocations on kernels like EROS/KeyKOS (albeit that
execve() does not return).
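
As a rough sketch of the mapping form, assuming Linux (/proc/self/fd) and
Python 3.4 or later: the signature of safe_execve() below is my own guess
at what such a wrapper could look like, not an existing interface.

```python
import fcntl
import os

def safe_execve(path, argv, env, fd_map):
    """Exec path with exactly the FDs named in fd_map.

    fd_map maps an FD index in the new program to an FD index in the
    current process, e.g. {0: 0, 1: 1, 2: 2, 3: log_fd}.
    """
    # Park a duplicate of every source above all indexes in use, so that
    # installing one target cannot clobber a source that is still needed.
    base = max([2, *fd_map.keys(), *fd_map.values()]) + 1
    parked = {new: fcntl.fcntl(old, fcntl.F_DUPFD, base)
              for new, old in fd_map.items()}
    # Install each duplicate at its target index, inheritable across exec.
    for new, tmp in parked.items():
        os.dup2(tmp, new, inheritable=True)
    # Close everything else, including the parked duplicates.
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd not in fd_map:
            try:
                os.close(fd)
            except OSError:
                pass  # e.g. the FD listdir() used, which is already closed
    # If this fails, the process is left with only the mapped FDs open.
    os.execve(path, argv, env)
```

The caller names every FD the new program receives, in the same way that
a capability invocation names every capability argument. Anything not
named is simply absent from the new FD table.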
The current work to extend FD-creating system calls on Linux to be
able to set the CLOEXEC flag at the time the FD is added/created
doesn't really address the root of the problem. There are probably a
lot more lines of code that deal with creating FDs than deal with
calling execve() in some way. It's less practical to change
FD-creating code and it's less reasonable because this problem is not
really the responsibility of FD-creating code.
I don't see this as different in principle to dealing with other
unsafe interfaces. For example, buffer overruns are caused both by
specific library interfaces that are unsafe (i.e. difficult to use
safely, or too easy to use unsafely) and by the fact that the C
language is not memory-safe. People have dealt with the buffer
overrun problem at lots of different levels:
* static analysis (the simplest being grepping for calls to gets() :-) )
* glibc changes (checked versions of libc calls)
* refactoring to use safer string abstractions
* language changes (use a memory safe language)
* compiler changes (checking canaries on the stack)
* compiler + ABI changes (CCured, CapC)
* kernel changes (address space randomisation)
* even processor changes (non-executable stacks)
The execve() problem is much smaller in scope because it involves
fewer lines of code, but it can also be addressed at different levels.
If you want to work at the level of processes, static analysis is out,
but you can record when FDs (other than stdin/stdout/stderr) are
passed across execve() and then flag the calling executable as
potentially problematic. You can track whether the callee uses the
FDs that were passed. If the FDs were used, the caller probably needs
to pass FDs and is doing so deliberately. If the FDs were not used,
the caller is probably passing them accidentally. This detection code
could be implemented in the kernel, or it could be implemented in
glibc.
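
The caller-side half of that detection could be sketched as follows. This
is a user-space illustration only, assuming Linux's /proc/self/fd; the
names fds_passed_across_exec() and audited_execve() are hypothetical.
Tracking whether the callee actually uses the FDs is not shown, since that
needs support in the kernel or in libc.

```python
import fcntl
import os
import sys

def fds_passed_across_exec():
    """Return FDs above stderr that an execve() right now would pass on."""
    passed = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd <= 2:
            continue
        try:
            flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        except OSError:
            continue  # the FD listdir() used, which is already closed
        if not flags & fcntl.FD_CLOEXEC:
            passed.append(fd)
    return sorted(passed)

def audited_execve(path, argv, env):
    """Flag the caller if it is about to pass extra FDs, then exec."""
    passed = fds_passed_across_exec()
    if passed:
        print(f"warning: {sys.argv[0]} passes FDs {passed} to {path}",
              file=sys.stderr)
    os.execve(path, argv, env)
```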
If you have a program that does not need to pass FDs across execve()
(other than stdin/stdout/stderr), you could change its execve() call
to close any offending FDs (perhaps logging a warning) or halt with an
error if there are any offending FDs. Again, that can be done in the
kernel or in glibc.
The general pattern could be:
* Change the unsafe call, execve(), to point to a safe but limited version,
  safe_simple_execve(), which closes or errors on FDs other than 0-2.
* If testing or inspection reveals that this breaks code, then either:
  1) change the specific call sites to use unsafe_execve(), which
     gets back the original behaviour of execve(), and review the code
     thoroughly; or
  2) refactor the call sites to use safe_execve(), which takes a list of
     FDs to pass on and may be implemented in terms of unsafe_execve().
I was going to cite Python's subprocess module as an example of a safe
interface, but unfortunately when I checked I found its default is
close_fds=False and not close_fds=True. Part of the reason for that
is that closing FDs can be slow for large FD tables
(http://mail.python.org/pipermail/python-bugs-list/2007-February/037228.html).
Some useful steps here would be to make sure there is a good way to
close FDs in bulk efficiently and then to get Python's default
changed. It would have been nice if I could point to a safe version
of execve() that already exists and is widely available and widely
used, but I don't know of one.
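
For illustration, this is roughly what the safe usage of subprocess looks
like. The sketch assumes Python 3.2 or later, where the pass_fds parameter
exists and close_fds defaults to True on POSIX; run_passing_only() is a
hypothetical wrapper name.

```python
import os
import subprocess
import sys

def run_passing_only(argv, pass_fds=()):
    """Run argv with FDs 0-2 plus exactly pass_fds open in the child."""
    return subprocess.run(argv, close_fds=True, pass_fds=tuple(pass_fds))

# Usage: hand the child the write end of a pipe and nothing else.
r, w = os.pipe()
child = [sys.executable, "-c", f"import os; os.write({w}, b'hello')"]
run_passing_only(child, pass_fds=(w,))
os.close(w)
print(os.read(r, 16))
os.close(r)
```

Here the FDs to pass are named at the call site, so any other FD the
parent happens to hold is not handed to the child.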
> > Do you at least agree that it is a designation rather than a delegation
> > that is the underlying problem in the accidentally passed Fd's?
> >
> I assume that "designation" and "delegation" are technical terms in the
> OC space, so that just like "authority" means something very specific,
> so do "designation" and "delegation". I just don't know what that
> meaning is, so I can't answer your question yet.
>
> I've made this complaint before: it is an unfortunate habit of the OC
> community to overload the meaning of common words such as "authority",
> "permission", and in this case "designation" with specific technical
> meanings. It confuses people from outside the community, and because the
> technical terms are all common words, it makes it hard to google for.
Perhaps you should raise this issue on the cap-talk mailing list?
(CC'ing cap-talk)
When Rob says that this is a designation problem rather than a
delegation problem, he means that calls to execve() do not designate
which FDs to pass.
Regards,
Mark