Seccomp-based system call filtering for Upstart

Registered by David Gaarenstroom on 2012-11-27

Starting with Linux 3.5, the kernel security feature "seccomp" has been extended into a filtering mechanism that lets a process specify a system call filter as a BPF program. Individual system calls can be allowed, denied (killing the offending program), or completely bypassed while setting errno. (There are also "trap" and "trace" actions; see the kernel docs.) This feature is already available in the Ubuntu 12.04 LTS kernels.

Systemd already implements seccomp filtering through its "SystemCallFilter" option (see: http://0pointer.de/public/systemd-man/systemd.exec.html ). I'd like to add an implementation to Upstart too, staying relatively close to systemd's syntax so that developers can easily write a policy for both Upstart and systemd (although I haven't seen many examples around). However, I am extending the syntax slightly by adding an optional policy for each syscall.

The EBNF representation I'm thinking of would be:
seccomp filter = "seccomp-filter", white space, [ "~" ], seccomp rules;
seccomp rules = seccomp rule, { ",", seccomp rule };
seccomp rule = systemcall, [ ":", policy ];
policy = "allow" | "errno" | "kill" | "trace" | "trap";

The default behaviour is to allow the explicitly listed syscalls and to kill on anything not listed, unless the set of rules is prefixed with "~", which inverts this (deny the explicitly listed syscalls, allow anything not listed).

E.g.:
  seccomp-filter write
for "echo hello world".
or:
  seccomp-filter getrlimit:allow,setrlimit:errno
for a fictional program that may call both getrlimit and setrlimit, although calls to the latter are not executed and only set errno.
or:
  seccomp-filter ~setuid, socket
to prevent the use of setuid and socket.
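
The grammar and the three examples above can be sketched as a small parser. This is only an illustration in Python, not the Upstart or guardian implementation. The names parse_seccomp_filter and policy_for are hypothetical, "deny" is assumed to mean "kill", syscall names are assumed to be lowercase identifiers, and whitespace around commas is tolerated because the third example uses it even though the grammar does not spell it out.

```python
import re

POLICIES = ("allow", "errno", "kill", "trace", "trap")
# Assumption: syscall names look like lowercase C identifiers.
SYSCALL_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def parse_seccomp_filter(stanza):
    """Return (fallback_policy, {syscall: policy}) for one stanza line."""
    parts = stanza.split(None, 1)
    if len(parts) != 2 or parts[0] != "seccomp-filter":
        raise ValueError("expected: seccomp-filter [~]rule[,rule...]")
    body = parts[1].strip()

    inverted = body.startswith("~")
    if inverted:
        body = body[1:]
    # Without "~": listed syscalls default to allow, the rest is killed.
    # With "~": listed syscalls default to kill, the rest is allowed.
    listed_default, fallback = ("kill", "allow") if inverted else ("allow", "kill")

    rules = {}
    for item in body.split(","):
        name, sep, policy = (part.strip() for part in item.partition(":"))
        if not SYSCALL_NAME.fullmatch(name):
            raise ValueError("bad syscall name: %r" % name)
        if sep and policy not in POLICIES:
            raise ValueError("bad policy for %s: %r" % (name, policy))
        # An explicit ":policy" overrides the default for listed syscalls.
        rules[name] = policy if sep else listed_default
    return fallback, rules


def policy_for(syscall, parsed):
    """Look up the policy that applies to a syscall under a parsed stanza."""
    fallback, rules = parsed
    return rules.get(syscall, fallback)
```

For instance, policy_for("open", parse_seccomp_filter("seccomp-filter write")) yields "kill", while the same lookup against "seccomp-filter ~setuid, socket" yields "allow".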

Most of this is already implemented in a seccomp exec wrapper I wrote, which can be found here:
https://gitorious.org/guardian/guardian

References:
https://lwn.net/Articles/498231/
http://kernelnewbies.org/Linux_3.5#head-c48d6a7a26b6aae95139358285eee012d6212b9e
http://0pointer.de/public/systemd-man/systemd.exec.html
https://gitorious.org/guardian/guardian

Blueprint information

Status: Not started
Approver: None
Priority: Undefined
Drafter: None
Direction: Needs approval
Assignee: None
Definition: New
Series goal: None
Implementation: Unknown
Milestone target: None

Whiteboard

Current progress:
- I have a very simple, working proof of concept on top of upstart-1.5 for hostname.conf, allowing only sethostname and the calls required by default for exec-ing and exiting a program.
- I am still figuring out how the development process works for Ubuntu. E.g. who should fill in the optional fields of this blueprint? Is someone going to review my changes and provide me with feedback? And where can I post my current progress?
- At the moment seccomp filtering is only supported on x86 and amd64; ARM support is already implemented for ChromeOS but has not yet been added to mainline. I haven't taken this into account, so my modification may not compile for ARM or other architectures.
- A related job option should probably be added to Upstart to control whether the "No New Privileges" prctl is set; in most cases setting it is required for seccomp to work. (See systemd.exec.)
- For the trap and errno policies, a return value can be set in the seccomp filter. I still need to implement this, but I am thinking of:
( "errno" | "trap" ), [ "(", short integer, ")" ];

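As a rough sketch of the last item, a policy with an optional value could be turned into the 32-bit value that a seccomp BPF filter returns. The SECCOMP_RET_* constants below are the ones defined in the kernel's linux/seccomp.h; everything else is an assumption made for illustration: the function name filter_return_value is hypothetical, and an omitted value is treated as 0.

```python
import re

# Action values as defined in the kernel header <linux/seccomp.h>.
SECCOMP_RET_KILL = 0x00000000
SECCOMP_RET_TRAP = 0x00030000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000
SECCOMP_RET_DATA = 0x0000FFFF  # low 16 bits carry the optional value

ACTIONS = {
    "allow": SECCOMP_RET_ALLOW,
    "errno": SECCOMP_RET_ERRNO,
    "kill": SECCOMP_RET_KILL,
    "trace": SECCOMP_RET_TRACE,
    "trap": SECCOMP_RET_TRAP,
}
POLICY = re.compile(r"(allow|errno|kill|trace|trap)(?:\((\d+)\))?")


def filter_return_value(policy):
    """Map e.g. 'errno(13)' to SECCOMP_RET_ERRNO | 13 (hypothetical helper)."""
    match = POLICY.fullmatch(policy.strip())
    if match is None:
        raise ValueError("bad policy: %r" % policy)
    name, arg = match.group(1), match.group(2)
    if arg is not None and name not in ("errno", "trap"):
        raise ValueError("%s does not take a value" % name)
    # Assumption: a policy written without a value carries 0 as its data.
    data = int(arg) if arg is not None else 0
    if data > SECCOMP_RET_DATA:
        raise ValueError("value does not fit in 16 bits: %d" % data)
    return ACTIONS[name] | data
```

With this mapping, "errno(13)" (EACCES on Linux) becomes 0x0005000D, and a bare "trap" becomes 0x00030000.
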