Containers have become popular with developers and operators for their scalability, portability, and agility. These lightweight packages encapsulate an entire runtime environment, enabling rapid development, testing, and deployment.
Unlike virtual machines, which rely on hardware virtualization, containers use OS kernel features (such as Linux namespaces and cgroups) to limit what processes inside a container can see of the host's process space and how much of its resources they can use.
POSIX ACLs in Kubernetes
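Kubernetes does not manage POSIX ACLs itself, but ACLs set on the filesystem backing a volume are enforced by the node's kernel and restrict access to files and directories by user and group. This allows the administrator to ensure that a specific file or directory is read-only or only readable by certain groups. The kubectl command line has no dedicated options for editing ACLs; a file or directory's access permissions are changed with tools such as setfacl, run on the node or inside a container (for example through kubectl exec, provided the image includes the tool), alongside the Pod's securityContext settings.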
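A minimal POSIX ACL contains three required entries: owner, owning group, and other, which correspond to the traditional permission bits. Once an ACL contains any named user or named group entries, a fourth entry, the mask, is also required. The mask caps the permissions of the owning group and of all named entries: the effective permission of each is the intersection of the entry's permissions with the mask.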
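The role of the mask can be sketched in a few lines of Python. This is a minimal illustration, not a real ACL library; the helper names parse_perms, format_perms, and effective are hypothetical.

```python
# Sketch: effective permission = ACL entry AND mask.
# All helper names here are hypothetical, chosen for illustration only.
R, W, X = 4, 2, 1


def parse_perms(text: str) -> int:
    """Turn a string such as 'rw-' into permission bits."""
    bits = 0
    for ch, bit in zip(text, (R, W, X)):
        if ch != "-":
            bits |= bit
    return bits


def format_perms(bits: int) -> str:
    """Turn permission bits back into a string such as 'r-x'."""
    return "".join(c if bits & b else "-" for c, b in zip("rwx", (R, W, X)))


def effective(entry: str, mask: str) -> str:
    """Owning-group and named entries are limited by the mask."""
    return format_perms(parse_perms(entry) & parse_perms(mask))


print(effective("rwx", "r-x"))  # r-x
print(effective("rw-", "r--"))  # r--
```

The owner and other entries are not passed through the mask, which is why only group-class entries appear here.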
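When a process creates a file or directory inside a directory that has a default ACL, the new object inherits that default ACL as its access ACL (limited by the mode requested at creation), and a new subdirectory also inherits it as its own default ACL. For example, setting a default ACL on a directory mydir and then running mkdir inside it produces a subdirectory that carries the same entries.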
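One way to try the inheritance behaviour is to drive the standard setfacl and getfacl tools from Python. The sketch below only builds the command lines; it assumes a Linux host with the acl utilities installed and an ACL-capable filesystem. The group name devs and the function names are hypothetical.

```python
import subprocess


def default_acl_demo_commands(directory: str, group: str) -> list:
    """Return the commands that set a default ACL and show it being inherited."""
    sub = f"{directory}/sub"
    return [
        ["mkdir", "-p", directory],
        # -d targets the default ACL, -m adds or modifies an entry
        ["setfacl", "-d", "-m", f"g:{group}:rwx", directory],
        ["mkdir", sub],
        # the subdirectory should list the inherited group entry
        ["getfacl", sub],
    ]


def run_demo(directory: str = "mydir", group: str = "devs") -> None:
    """Run the commands in order; requires setfacl/getfacl and an existing group."""
    for argv in default_acl_demo_commands(directory, group):
        subprocess.run(argv, check=True)
```

Calling run_demo() on a test machine creates mydir and mydir/sub in the current directory, so it is best run in a scratch location.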
In Unix/Linux, a process's privileges depend on its user ID (UID), its primary and supplementary group IDs (GIDs), and its Linux capabilities. Inside a container, these are the UID and GIDs that the container process runs as. POSIX ACLs do not provide fine-grained control over individual process privileges; in containers, Linux capabilities, SELinux (Security-Enhanced Linux), and AppArmor provide this functionality.
In earlier releases of Kubernetes, every time a volume was mounted with fsGroup set in the Pod's securityContext, the kubelet would recursively chown and chmod the files on the volume, even if their ownership already matched the requested fsGroup. This is a costly operation for large volumes with many small files, and Kubernetes 1.20 and later releases address it. Setting fsGroupChangePolicy to OnRootMismatch in the Pod's securityContext skips the recursive change when the root of the volume already has the expected ownership and permissions. In addition, a CSI driver can declare through the fsGroupPolicy field of its CSIDriver object whether it supports these fsGroup-based permission changes at all.
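At the Pod level, the recursive change can be limited with the fsGroupChangePolicy field of the Pod's securityContext. The following sketch builds such a manifest as a Python dictionary and prints it as JSON, which kubectl apply -f accepts. The Pod name, image, mount path, and claim name are placeholder assumptions.

```python
import json


def pod_manifest(fs_group: int, claim_name: str) -> dict:
    """Build a Pod manifest that skips the recursive chown when ownership matches."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "fsgroup-demo"},  # hypothetical name
        "spec": {
            "securityContext": {
                "fsGroup": fs_group,
                # Only change ownership if the volume root does not match
                "fsGroupChangePolicy": "OnRootMismatch",
            },
            "containers": [{
                "name": "app",
                "image": "busybox",  # placeholder image
                "command": ["sleep", "3600"],
                "volumeMounts": [{"name": "data", "mountPath": "/data"}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


print(json.dumps(pod_manifest(2000, "data-pvc"), indent=2))
```

The other accepted value for the field is Always, which keeps the older behaviour of changing ownership on every mount.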
With this granularity, Kubernetes volume mount permissions become a powerful tool for administrators. You can prevent accidental deletions, enforce data confidentiality, and keep collaboration smooth across your containerized environment.
Access Control Lists (ACLs)
Access Control Lists (ACLs) dictate who has which privileges for an object on a computer system. They help prevent unintentional data exposure by allowing only the access that is explicitly intended and blocking everything else. Network ACLs should also be configured on every public-facing network interface and within the internal network. This ensures that sensitive devices aren't accidentally exposed and lets administrators create detailed access controls.
A filesystem ACL is a table that tells the operating system which users have permission to access specific objects, such as a directory or file on the system. Whenever a user or process attempts to access such an object, the operating system checks the request against the ACL entries and either grants or denies it.
Network ACLs come in two varieties: standard and extended. On routers, a standard ACL filters traffic based only on a packet's source IP address. In contrast, an extended ACL can filter on several criteria, including port numbers, protocols, time ranges, and priority as indicated by the Differentiated Services Code Point (DSCP). Some ACLs even allow administrators to add comments that provide more detail about the rules. Make sure you understand what each ACL rule does before implementing it. You should also set up a monitoring mechanism to confirm that ACLs are working correctly and that any changes are reflected in your environment.
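The difference between the two varieties can be illustrated with a small rule matcher. This is a simplified sketch rather than any router's actual implementation: it assumes first-match evaluation with an implicit deny at the end of the list, and the Packet and Rule types are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    src: str
    dst: str
    protocol: str
    dst_port: int


@dataclass
class Rule:
    action: str                      # "permit" or "deny"
    src_net: str                     # the only criterion a standard ACL uses
    dst_net: Optional[str] = None    # extended criteria below
    protocol: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, p: Packet) -> bool:
        if ipaddress.ip_address(p.src) not in ipaddress.ip_network(self.src_net):
            return False
        if self.dst_net and ipaddress.ip_address(p.dst) not in ipaddress.ip_network(self.dst_net):
            return False
        if self.protocol and p.protocol != self.protocol:
            return False
        if self.dst_port is not None and p.dst_port != self.dst_port:
            return False
        return True


def evaluate(rules, packet: Packet) -> str:
    """Return the action of the first matching rule (assumed implicit deny)."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return "deny"


standard = [Rule("permit", "10.0.0.0/24")]
extended = [
    Rule("deny", "10.0.0.0/24", protocol="tcp", dst_port=22),
    Rule("permit", "10.0.0.0/24"),
]
ssh = Packet("10.0.0.5", "192.168.1.10", "tcp", 22)
print(evaluate(standard, ssh))  # permit
print(evaluate(extended, ssh))  # deny
```

The standard list can only admit or reject the whole subnet, while the extended list blocks SSH from that subnet and still lets its other traffic through.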
POSIX ACLs
POSIX ACLs allow more detailed permissions to be granted and restricted than the traditional POSIX permissions model. However, they can quickly become confusing to manage. Entries are stored as numeric user and group identifiers that are local to the system. Adding the ability to grant and restrict permissions to non-local users would require substantial changes to the process model.
The traditional POSIX file system object permission concept defines three classes of users: owner, group, and other. Each class is associated with a set of permissions that define access to a file or directory. The ls -l command displays the effective permission bits for a file or directory.
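The three classes can be read directly from a file's mode. The sketch below uses Python's standard stat module to split a mode into its owner, group, and other parts; the function name permission_classes is hypothetical.

```python
import stat


def permission_classes(mode: int) -> dict:
    """Split a file mode into the owner, group, and other permission strings."""
    text = stat.filemode(mode)  # same layout as the first column of ls -l
    return {"owner": text[1:4], "group": text[4:7], "other": text[7:10]}


print(stat.filemode(0o100640))        # -rw-r-----
print(permission_classes(0o100640))   # {'owner': 'rw-', 'group': 'r--', 'other': '---'}
print(stat.filemode(0o040755))        # drwxr-xr-x
```

For a real file, the mode comes from os.stat(path).st_mode.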
POSIX ACLs extend this model so that additional users or groups can be given access to files and directories even though they are neither the file's owner nor members of its owning group. These additional entries are called named user and named group entries; a separate mask entry sets the upper limit on the permissions they can grant.
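The order in which the kernel consults these entries can be sketched as follows. This is a simplified model of the access check described in the Linux acl(5) manual page; it ignores root and capability overrides, and the Acl class and allowed function are hypothetical names.

```python
from dataclasses import dataclass, field

R, W, X = 4, 2, 1


@dataclass
class Acl:
    owner_uid: int
    owner_gid: int
    owner: int                                   # owner permissions
    group: int                                   # owning group permissions
    other: int
    mask: int = R | W | X                        # no mask entry behaves like rwx
    users: dict = field(default_factory=dict)    # named users: uid -> perms
    groups: dict = field(default_factory=dict)   # named groups: gid -> perms


def allowed(acl: Acl, uid: int, gids: set, want: int) -> bool:
    """Check whether a process with this uid and these gids gets the wanted bits."""
    if uid == acl.owner_uid:
        return (acl.owner & want) == want        # owner entry is not masked
    if uid in acl.users:
        return (acl.users[uid] & acl.mask & want) == want
    group_entries = []
    if acl.owner_gid in gids:
        group_entries.append(acl.group)
    group_entries += [p for g, p in acl.groups.items() if g in gids]
    if group_entries:
        # Any single matching group entry may grant the request
        return any((p & acl.mask & want) == want for p in group_entries)
    return (acl.other & want) == want


acl = Acl(owner_uid=1000, owner_gid=100, owner=R | W, group=R, other=0,
          mask=R, users={1001: R | W}, groups={200: R | W})
print(allowed(acl, 1001, {300}, R))  # True
print(allowed(acl, 1001, {300}, W))  # False
```

In the example, the named user 1001 is granted rw- by its entry, but the mask of r-- removes the write bit.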
POSIX ACLs require that the filesystem holding the file or directory be mounted with POSIX ACL support; common Linux filesystems such as ext4 and XFS typically enable it by default. The ls -l command only indicates that an ACL is present by appending a + to the permission string; to see the mask entry and the other ACL entries, use the getfacl command. The POSIX ACLs for a file or directory can be changed using the setfacl command. The -m (modify) argument of setfacl adds new ACL entries or changes the permissions of existing ones.
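When ACLs need to be inspected from a script, the text form printed by getfacl is straightforward to parse. The sketch below handles the common line layout only; the sample text, the user bob, and the function name parse_getfacl are illustrative assumptions.

```python
SAMPLE = """\
# file: mydir
# owner: alice
# group: devs
user::rwx
user:bob:rwx\t#effective:r-x
group::r-x
mask::r-x
other::---
default:group:devs:rwx
"""


def parse_getfacl(text: str) -> list:
    """Return (kind, tag, qualifier, perms) tuples from getfacl-style text."""
    entries = []
    for line in text.splitlines():
        # Drop header comments and trailing "#effective:" annotations
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(":")
        kind = "access"
        if parts[0] == "default":
            kind, parts = "default", parts[1:]
        tag, qualifier, perms = parts
        entries.append((kind, tag, qualifier, perms))
    return entries


for entry in parse_getfacl(SAMPLE):
    print(entry)
```

An empty qualifier marks the owner, owning group, mask, or other entry; a non-empty one marks a named user or group.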
POSIX ACLs in Docker
ACLs allow you to control access to files or directories on the file system. The initial permissions of a new file or directory are determined by the default ACL of its parent directory, which is inherited at creation time. POSIX ACLs have no explicit DENY entry; to block a specific user or group, you add a named entry with no permissions (for example, u:bob:--- for a hypothetical user bob).
For example, if a process inside a container creates a file on a bind mount and the UID of the container user matches that of an existing user on the host, the file will appear on the host to be owned by that host user, even though the two accounts are unrelated. This is a common problem in development and testing environments, where containers are often run as root or as some hard-coded UID such as 1000.
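The effect comes from the fact that files store only a numeric UID, and each side resolves that number against its own user database. A minimal sketch, with invented user names standing in for the two /etc/passwd files:

```python
def owner_name(uid: int, passwd: dict) -> str:
    """Resolve a numeric UID to a name, falling back to the number itself."""
    return passwd.get(uid, str(uid))


# Hypothetical user databases; real systems read these from /etc/passwd
host_passwd = {0: "root", 1000: "alice"}
container_passwd = {0: "root", 1000: "node"}

file_uid = 1000  # what is actually stored on the bind-mounted file

print(owner_name(file_uid, container_passwd))  # node
print(owner_name(file_uid, host_passwd))       # alice
```

The file is the same in both cases; only the name shown for UID 1000 changes.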
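One solution is to run the container with the UID and GID of the intended host user (for example with docker run --user) or to enable user namespace remapping, so that container IDs map to known host IDs. Capabilities such as CAP_CHOWN and CAP_FOWNER do not map IDs; they only let a process change file ownership or bypass ownership checks. Another solution is to use ACLs to control access for a given volume.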
In a Dockerfile, an instruction such as VOLUME <path> marks the path as a volume to be created, populated with any content already at that path, and mounted at runtime. A named volume such as nginx-vol can be mounted by other containers as well, read-only if they should not change its content. The -v and --mount flags of docker run differ in syntax: the --mount flag is more verbose but offers similar options to -v; the -v flag combines all of its options into a single colon-separated field, while --mount separates them into comma-separated key=value pairs.
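To make the syntax difference concrete, the sketch below translates a -v style value into the equivalent --mount string. It is deliberately simplified: it handles only the ro option, treats any source starting with / as a bind mount, and does not support Windows paths. The function name volume_to_mount is hypothetical.

```python
def volume_to_mount(spec: str) -> str:
    """Convert a -v value such as 'nginx-vol:/data:ro' to --mount syntax."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path: an anonymous volume
        return f"type=volume,target={parts[0]}"
    source, target = parts[0], parts[1]
    options = parts[2].split(",") if len(parts) > 2 else []
    # Simplifying assumption: absolute host paths are bind mounts
    kind = "bind" if source.startswith("/") else "volume"
    fields = [f"type={kind}", f"source={source}", f"target={target}"]
    if "ro" in options:
        fields.append("readonly")
    return ",".join(fields)


print(volume_to_mount("nginx-vol:/usr/share/nginx/html:ro"))
# type=volume,source=nginx-vol,target=/usr/share/nginx/html,readonly
print(volume_to_mount("/srv/data:/data"))
# type=bind,source=/srv/data,target=/data
```

The positional fields of -v become explicit keys in --mount, which is what makes the longer form easier to read in scripts.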