[cap-talk] Designation linux kernel patch concept
Mark Seaborn
mrs at mythic-beasts.com
Tue Dec 4 15:13:38 EST 2007
David Hopwood <david.hopwood at industrial-designers.co.uk> wrote:
> It seems to me that using fds as (essentially) capabilities for files
> does not quite work, because:
>
> - an fd to an open file can't be safely used to designate the file
> itself, if duplicates of the fd are to be shared between processes.
> That is because the duplicates share attributes that you don't want
> to be shared, such as the O_NONBLOCK flag (see
> <http://plash.beasts.org/wiki/UsefulKernelChanges>
> and <http://lkml.org/lkml/2007/8/14/135>).
I've discovered that you can use /proc/self/fd/N to re-open a pipe FD
and that gives you another FD on which O_NONBLOCK can be set
independently. That's not ideal for the use case I had in mind
(http://plash.beasts.org/wiki/EventLoopAndFDs) because you would have
to treat file FDs and pipe FDs differently -- re-opening a file FD
would give you an FD with an independent seek position -- and I'm a
bit reluctant to change behaviour based on the FD type reported by
fstat().
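To make the difference concrete, here is a minimal sketch in Python,
assuming Linux with /proc mounted. The helper names (nonblock_is_set,
set_nonblock, compare_dup_and_reopen) are made up for illustration. It
sets O_NONBLOCK on a dup() of a pipe's read end and then on a
/proc/self/fd/N re-open of it, and reports whether the original FD saw
the change in each case.

```python
import fcntl
import os


def nonblock_is_set(fd):
    """Report whether O_NONBLOCK is set on fd's open file description."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def set_nonblock(fd, enabled):
    """Set or clear O_NONBLOCK on fd's open file description."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    if enabled:
        flags |= os.O_NONBLOCK
    else:
        flags &= ~os.O_NONBLOCK
    fcntl.fcntl(fd, fcntl.F_SETFL, flags)


def compare_dup_and_reopen():
    """Return (dup_shares_flag, reopen_shares_flag) for a pipe's read end."""
    r, w = os.pipe()
    try:
        # dup() gives a second FD for the same open file description,
        # so the status flags are shared with the original.
        d = os.dup(r)
        try:
            set_nonblock(d, True)
            dup_shares = nonblock_is_set(r)
        finally:
            os.close(d)
        set_nonblock(r, False)  # undo before the second experiment

        # Re-opening through /proc creates a new open file description.
        # The write end is still open, so this open() does not block.
        n = os.open("/proc/self/fd/%d" % r, os.O_RDONLY)
        try:
            set_nonblock(n, True)
            reopen_shares = nonblock_is_set(r)
        finally:
            os.close(n)
        return dup_shares, reopen_shares
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(compare_dup_and_reopen())
```

The same re-open on a regular file FD would also give the new FD its
own seek position, which is the asymmetry described above.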
> - there is no fd type that designates just a file inode.
You can open a file and use the resulting FD to re-open the file using
/proc/self/fd/N. Most operations available through pathname-based
calls can also be done on file FDs -- assuming you can acquire the FD
in the first place (e.g. you can't do an lchmod() by doing open() +
fchmod() if the file has its permission bits unset to start with).
If /proc isn't directly available (as under Plash) you could have a
trusted intermediary for /proc that re-opens an FD when you ask for a
subset of the FD's file mode flags.
That doesn't work for inodes that are not files, such as Unix domain
sockets, though.
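A rough sketch of what such an intermediary's check might look like,
in Python and assuming Linux with /proc mounted. The function name
reopen_with_subset and the policy table are hypothetical, not taken
from Plash. The check is done explicitly on the assumption that a
/proc re-open is permission-checked against the file rather than
against the access mode of the original FD.

```python
import fcntl
import os

# Access modes that may be requested, keyed by the access mode the
# caller's FD already has. Illustrative policy only.
_ALLOWED = {
    os.O_RDONLY: {os.O_RDONLY},
    os.O_WRONLY: {os.O_WRONLY},
    os.O_RDWR: {os.O_RDONLY, os.O_WRONLY, os.O_RDWR},
}


def reopen_with_subset(fd, wanted):
    """Re-open fd via /proc with access mode `wanted`.

    `wanted` is one of os.O_RDONLY, os.O_WRONLY or os.O_RDWR. The
    request is refused unless it asks for no more access than fd
    already carries.
    """
    held = fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_ACCMODE
    if wanted not in _ALLOWED[held]:
        raise PermissionError("requested mode exceeds the FD's own mode")
    # The result is a new open file description with its own seek
    # position and status flags.
    return os.open("/proc/self/fd/%d" % fd, wanted)
```

For example, reopen_with_subset(fd, os.O_RDONLY) on a read-write file
FD returns a read-only FD whose reads do not move the original FD's
offset, while asking for os.O_RDWR from a read-only FD is refused.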
> - to do system call interposition in a way that is safe against
> race conditions, you *really* need an fd type that designates a
> file inode. Let's say that you are interposing on one of the "at"
> calls, and you do a security check based on the dirfd and the
> relative path. The check passes, so you forward the call to the
> kernel. But how do you know that by the time the kernel
> interprets the dirfd and path, it is pointing to the same file?
> You don't. As pointed out in
> <https://db.usenix.org/events/woot07/tech/full_papers/watson/watson.pdf>,
> such race conditions are quite exploitable in practice.
What sort of checks do you have in mind? The main problem Plash faces
in this area is doing operations relative to a dir FD without
following symlinks. connect() on Unix domain sockets is the main
problem here because there is no way to switch its symlink-following
off (http://plash.beasts.org/wiki/PlashIssues/ConnectRaceCondition).
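For open(), at least, the symlink-following can be switched off. A
minimal sketch in Python, assuming Linux; the helper name
open_nofollow_at is made up for illustration. It opens a single path
component relative to a dir FD with O_NOFOLLOW, which only protects
the final component, so the helper refuses multi-component names.

```python
import os


def open_nofollow_at(dir_fd, name, flags=os.O_RDONLY):
    """Open `name` relative to dir_fd, refusing to follow a symlink.

    `name` must be a single path component: O_NOFOLLOW does not stop
    symlinks in earlier components from being followed, and ".." would
    step outside the directory.
    """
    if "/" in name or name in ("", ".", ".."):
        raise ValueError("expected a single path component")
    # Equivalent to openat(dir_fd, name, flags | O_NOFOLLOW).
    return os.open(name, flags | os.O_NOFOLLOW, dir_fd=dir_fd)
```

If name is a symlink, the open fails (with ELOOP on Linux) instead of
following it. There is no equivalent flag to pass to connect(), which
is why the Unix domain socket case remains a problem.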
> However, the "at" calls don't have the right interface to be used with
> an inode fd. It would be bordering on insane to add yet another set of
> API calls and/or syscalls (and it would beg the question, "how many
> more iterations will we need to get this right?").
Yes, Linux seems to be facing that problem at the moment: there is
pressure to add interfaces, but no good framework for doing so. Now
if only they had some sort of generic object invocation
interface... :-)
Regards,
Mark