[cap-talk] NERSC 10 person survey on SELinux (was: capabilities vs. SELinux)

Jed Donnelley jed at nersc.gov
Thu Mar 27 15:47:02 EDT 2008


On 3/27/2008 3:23 AM, Richard Uhtenwoldt wrote:
> Since the cognitive skills required to admin and defend a LAN of Linux
> boxes is very unevenly distributed in the population, Dr. Shapiro's
> experience with SELinux does not mean that Jed Donnelly's assertion is
> wrong.  It might be the case that Dr. Shapiro is particularly high in
> the requisite cognitive skills.  It still might be the case that
> making available an object-capability platform will greatly increase
> the number of sysadmins who can do what he does (once those sysadmins
> learn capability principles) or greatly decrease the labor cost of
> doing it or greatly increase the range and scope of the systems that
> can be adminned.

Heh.  Not what I'd call a ringing vote of confidence for our
system administrators at NERSC:

http://www.nersc.gov/

Perhaps by clarifying what I asserted (Donnelley with a final "e")
we can better understand the tradeoffs we've faced at NERSC
in deploying SELinux controls.

First, a little background that may help.  At NERSC the focus
of our work is scientific computing.  The majority of our IT
professionals support what these days are "supercomputer" clusters
with thousands to tens of thousands of processors such as our
newest Cray-XT4 "Franklin" system:

http://www.nersc.gov/nusers/systems/franklin/

The "Server Team" that I most recently worked in consists of two
system administrators who run some 50-60 Unix "server" systems (small
SMPs distinct from the "big iron" supercomputer clusters), about
1/3 development and the rest production.  The number varies from
year to year and month to month.  The amount of application level
support also varies across the systems.  About 1/2 of these systems
are Linux systems (almost all Redhat derived, mostly the CentOS
distribution), about 1/3 are FreeBSD systems, and we still run a
few Solaris systems.  For the Redhat/CentOS systems we use the
"Enterprise" distributions.  In order to effectively support so many
systems with so few people we have to highly automate our system
installation (kickstart/jumpstart to a "standard" base configuration)
and support (CFEngine configuration management with automated updates)
and we have to carefully husband our systems administration time.

In the NERSC Server Team we naturally update software on our
systems as a regular part of our work.  As part of this process,
when Red Hat Enterprise Linux 4 came out, it shipped with
SELinux enabled by default, namely in:

/etc/selinux/config

SELINUX=enforcing

though this could be easily changed either to permissive, or
disabled.
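
For anyone who wants to survey this setting across a set of hosts,
the configured mode can be pulled out of that file with a few lines
of script.  What follows is only a minimal sketch in Python: the
function name configured_mode and the sample text are made up for
illustration, and it reports the boot-time setting from the file,
not the mode the running kernel is actually in.

```python
import re


def configured_mode(config_text):
    """Return the SELINUX= value (lowercased) from config file text, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Match SELINUX=..., but not SELINUXTYPE=...
        match = re.match(r"SELINUX\s*=\s*(\w+)$", line)
        if match:
            return match.group(1).lower()
    return None


# Illustrative sample text, not copied from any real host.
SAMPLE = """\
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
"""

if __name__ == "__main__":
    print(configured_mode(SAMPLE))
```

Something like this could be run from a configuration management
job to report which hosts are set to enforcing, permissive, or
disabled.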

While we found that we could leave some of our simplest systems
with SELINUX=enforcing, for most of our more complex systems (e.g.
Oracle or other DB servers, rich Web services with lots of custom
applications, systems with GPFS mounted file systems, etc.) the
time cost of the process that Jonathan describes as 'tweak'ing
the policy, whether in response to installation requirements or
in response to version drift, was greater than the value that we
felt could be derived from such "tweak"ing.  If such tweaking
couldn't be completed in a day, then it seemed to drag on for
weeks and was too costly.  We chose (I made the call for the Server
Team, and others did likewise in other areas - see the survey results
below) to run nearly all our systems with SELinux disabled, a few with
it configured as "permissive", and only a handful of essentially stock
systems with SELinux enforcing.

From our perspective the SELinux enabling (above) is a bit
like Secure Level 2 in FreeBSD.  It enables more checking
and consequently can block some access.  If you have a
stock system where no such access blocking causes problems,
then you can run a more secure system (e.g. one where some
software faults that would otherwise result in vulnerabilities
end up not resulting in a breach) at little to no additional
cost.  However, if one turns on SELinux (either "permissive" or
particularly "enforcing") and it does cause problems (services
don't work), then the work, as Jonathan says, to "tweak"
its configuration to enable the needed services costs more
than the derived security value is worth.

At this time SELinux enforcements don't even show up in our
priorities for system administration work.  Except for a brief
flurry when new systems are stood up, to decide whether running
SELinux enforcing causes problems or not, we don't consider
SELinux configuration an area where we can get significant
value for our systems administration time (e.g. vs. tightening local
firewall configurations or tightening authentication/authorization
mechanisms, or tightening application level configurations or ...).

"SELinux's nearly perfect lack of documentation" hasn't been
a significant issue to this point for us because we never have
time to get into the guts of the policy configuration for our
services.

Do others feel we would do well to put some time into reevaluating
the value of tighter SELinux enforcement for improving our
security posture?  If so, I'd be interested to hear the
thoughts and experiences of others so we can present them to
our management to argue for a change in priorities.  I'd be
interested to hear what others feel is the "state of the art"
with regard to SELinux configurations for Linux systems,
particularly for highly customized systems.

Survey:

I just ran around our organization (NERSC) and surveyed the state
of thinking by various people here about SELinux.  I hit
up nine people (besides me) whose focuses are:

1.  Two people in our computer security group
2.  A person in our "open source software" group (almost all
     our software is open source - these folks do development)
3.  The two current "Server Team" members (relatively recent hires)
4.  A person on our mass storage team, and
5.  Three people who administer some of our scientific clusters.

All were familiar with SELinux to some extent.  While there
was some variation in the thoughts I heard (I'll separate
some of the thoughts below), that variation was in a rather
narrow range around the following:

SELinux can provide some useful additional checks that can
enhance security/integrity, but *only* for the most stock
systems (e.g. LAMP).  As soon as configuration problems
arise (services start failing due to SELinux checks) the
costs for getting SELinux working are higher than the
value that it provides.  There are only a very few systems
that we run (less than 0.1% of the processors) that run
with SELinux enabled.

Even within the above consensus, opinions ranged from
"Only 'nut cases' run SELinux" to "I tried it, but it
was just too much overhead for our custom configurations"
and on to "things have improved recently with some
new support tools, we're considering running more
systems with the SELinux controls enforced, but
haven't gotten there yet."

In particular one of the support tools mentioned was
"sealert", e.g.:

http://linux.die.net/man/8/sealert

that I have no experience with.  Perhaps Jonathan does?

If the promise of such tools (namely that after you
have problems due to SELinux blocked access, you can
run such tools and have them generate minimized policy
changes that can get your services running again)
is realized, then perhaps at some time in the future
we will begin running more systems at NERSC with
SELinux enabled.  I don't see any movement in that
direction at this time, but my survey and shared
thoughts got a few people considering SELinux more
seriously than they have to this point.
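
To make the idea of such tools concrete, here is a rough sketch in
Python of the general approach: scan the audit log for AVC denials
and group them into candidate "allow" rules.  This is not how
sealert or audit2allow is actually implemented, just an illustration
of the concept.  The function names and the sample log lines are
made up, and the sketch assumes denial records carry the usual
scontext=, tcontext= and tclass= fields.

```python
import re
from collections import defaultdict

# Pulls the denied permissions plus source/target contexts and object class
# out of one AVC denial record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def context_type(context):
    """Return the type field of a user:role:type[:level] security context."""
    return context.split(":")[2]


def summarize_denials(log_lines):
    """Map (source type, target type, class) to the set of denied permissions."""
    needed = defaultdict(set)
    for line in log_lines:
        m = AVC_RE.search(line)
        if not m:
            continue  # not an AVC denial record
        key = (context_type(m["scontext"]),
               context_type(m["tcontext"]),
               m["tclass"])
        needed[key].update(m["perms"].split())
    return needed


def as_allow_rules(needed):
    """Render the summary as candidate policy rules, for human review only."""
    return [
        "allow %s %s:%s { %s };" % (src, tgt, cls, " ".join(sorted(perms)))
        for (src, tgt, cls), perms in sorted(needed.items())
    ]


# Made-up sample records for illustration, not from a real system.
SAMPLE_LOG = [
    'type=AVC msg=audit(1206647222.123:456): avc:  denied  { read } for  '
    'pid=1234 comm="httpd" name="index.html" dev=sda1 ino=12345 '
    'scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=user_u:object_r:user_home_t:s0 tclass=file',
    'type=AVC msg=audit(1206647223.456:457): avc:  denied  { getattr } for  '
    'pid=1234 comm="httpd" name="index.html" dev=sda1 ino=12345 '
    'scontext=system_u:system_r:httpd_t:s0 '
    'tcontext=user_u:object_r:user_home_t:s0 tclass=file',
]

if __name__ == "__main__":
    for rule in as_allow_rules(summarize_denials(SAMPLE_LOG)):
        print(rule)
```

Of course, blindly allowing everything that was denied would defeat
the purpose of the checks, so the generated rules would still need
review, which is where the administration time goes.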

I expect part of the difference in attitudes towards
software like SELinux comes from the different focuses
of organizations - e.g. our production scientific
computing focus at NERSC vs. Jonathan's research
focused organization, which even has a focus on systems
security and integrity.

I'll be interested to hear about others who enable
SELinux with customized applications where they
are required to "tweak" the SELinux configurations,
and how they minimize their "tweak"ing administration
costs.

This message is already much too long, but perhaps
after such a survey of the state of the art regarding
SELinux it might be worthwhile to revisit the topic
of "SELinux vs. capabilities" for POLA computing.

--Jed  http://www.webstart.com/jed/
