If you are in the midst of a compliance effort, or just beginning that difficult journey, you might find a lot of red alerts for the EC2 instances you are spinning up. You may also start to question whether the default Amazon Linux images are right for your use case. You might be wondering whether there are custom AMIs that support your containerized workloads, have less patching overhead and provide better security.
Bottlerocket is a purpose-built Linux-based operating system meticulously crafted to serve as an optimal container host. Designed to reduce unnecessary overhead, Bottlerocket emerges as a lean and efficient solution tailored for the modern demands of containerized applications.
This specialized Linux image is engineered with a singular focus on streamlining container orchestration while prioritizing security, compliance, and operational efficiency.
Bottlerocket relies on three core values, elaborated below.
Why have hundreds of unwanted applications when all I want to run is a Docker image? Bottlerocket takes a minimalist approach and ships only the required packages in its base image, so that it stays light and easier to manage.
All the required packages are installed in a single Linux image, built for a specific combination of AWS service and platform/architecture. Each combination is called a variant.
For example, bottlerocket-aws-k8s-1.28-x86_64-1.16 is a variant of the Bottlerocket image for EKS version 1.28 on the x86_64 architecture. A variant for ECS would be something like bottlerocket-aws-ecs-2-x86_64-1.16.
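To make the naming scheme concrete, here is a minimal shell sketch that splits such an image name into its parts. The function name and the field names (flavor, arch, release) are my own labels for illustration, and reading the trailing field as the Bottlerocket release is an assumption based on the two examples above.

```sh
# Split a name shaped like bottlerocket-<flavor>-<arch>-<release>.
# Field names are illustrative labels, not official Bottlerocket terms.
parse_image_name() {
  local name="$1"
  local rest="${name#bottlerocket-}"   # drop the fixed prefix
  local release="${rest##*-}"          # text after the last hyphen
  rest="${rest%-*}"                    # drop "-<release>"
  local arch="${rest##*-}"             # text after the (new) last hyphen
  local flavor="${rest%-*}"            # whatever remains, e.g. aws-k8s-1.28
  printf 'flavor=%s arch=%s release=%s\n' "$flavor" "$arch" "$release"
}

parse_image_name "bottlerocket-aws-k8s-1.28-x86_64-1.16"
# prints: flavor=aws-k8s-1.28 arch=x86_64 release=1.16
```

The split works from the right because the flavor itself contains hyphens, while the architecture (x86_64) uses an underscore.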
Bottlerocket fundamentally relies on packing everything into a single image, and hence there is no package manager.
Because everything is bundled into the image, to update anything you simply update the image version. The new image carries the latest (stable and secure) versions of all components, tested together, which removes the need to manage packages separately.
When the running instance has a newer version available, the update is downloaded onto the same instance but to a separate location. Once the download is complete, a reboot takes place and the new version becomes the one the instance boots from.
Security is the topmost concern for Bottlerocket, and it does a few things to ensure it. In particular, the root file system is immutable and is verified by dm-verity.
Below are a few high-level concepts that help in understanding Bottlerocket instances.
As Bottlerocket instances don't have a shell, how would you query things like the image version and its updates, or perform basic operations and admin-level tasks? For this, Bottlerocket provides a well-defined HTTP API. It covers these tasks while ensuring that only permitted operations are performed on the instance, with precise steps for each operation.
As already discussed, the root file system is immutable and is verified by dm-verity, while /etc is part of the mutable file system and is backed by tmpfs.
Using bootstrap containers, you can enable programs or features that you want to set up on top of the root file system while the instance boots. These are a set of containers that run on the container runtime, containerd. You can run multiple bootstrap containers, and the instance finishes booting once all of them have exited successfully. You can also apply certain exit conditions to these bootstrap containers. You can read more on this in the Bottlerocket documentation.
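As a rough sketch of what declaring a bootstrap container looks like, the shell function below prints the user-data settings (TOML) for one container. The `source`, `mode` and `essential` keys are Bottlerocket settings; the function name, the container name and the image URI are hypothetical placeholders.

```sh
# Print Bottlerocket user data (TOML) declaring one bootstrap container.
# Usage: bootstrap_container_toml <name> <image-uri> [mode] [essential]
#   mode:      off | once | always  (defaults to once here)
#   essential: true means a failing container should fail the boot
bootstrap_container_toml() {
  local name="$1" image="$2" mode="${3:-once}" essential="${4:-true}"
  cat <<EOF
[settings.bootstrap-containers.${name}]
source = "${image}"
mode = "${mode}"
essential = ${essential}
EOF
}

# Hypothetical container name and image URI, for illustration only.
bootstrap_container_toml "node-setup" "registry.example.com/node-setup:latest"
```

The output could be appended to the instance user data, for example alongside the other settings passed when the node is launched.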
Secure boot is also enabled by default, making sure that only the right software is loaded by the UEFI firmware when the machine boots.
Bottlerocket instances are specific to containerized workloads, and for this they run two containerd instances. One runs your regular containers under your orchestrator agent, such as kubelet. The other runs an admin container, which can act as a pseudo shell for debugging your instances and for making API-driven HTTP calls using apiclient, a tool provided by Bottlerocket to run API requests. The admin container does not make the root file system mutable.
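For a feel of what working through the API looks like, here is a small sketch meant to be run from the admin container. The wrapper function and its name are mine; the `apiclient` subcommands used (`get`, `update check`, `update apply`, `reboot`) are the ones I am aware of, so treat the exact invocations as assumptions to check against your Bottlerocket version.

```sh
# Query a setting and check for an OS update through the Bottlerocket API.
# Intended for the admin container, where apiclient is available.
bottlerocket_update_check() {
  if ! command -v apiclient >/dev/null 2>&1; then
    echo "apiclient not found; run this from a Bottlerocket admin container" >&2
    return 1
  fi
  apiclient get settings.motd   # read a single setting via the API
  apiclient update check        # ask whether a newer image is available
  # To actually update, stage the new image and then reboot into it:
  #   apiclient update apply
  #   apiclient reboot
}
```

The apply and reboot steps are left commented out on purpose, since in an EKS cluster they are normally driven by the update operator rather than by hand.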
Using Bottlerocket instances as your EKS nodes is very simple. We just need to make sure that we pass the right AMIs and the correct labels to the nodes, so that the Bottlerocket update operator can check for image updates on these nodes and reboot them whenever an update is available. We will discuss the Bottlerocket update operator shortly.
In our current EKS deployment we deploy nodes in two forms: EKS managed node groups and nodes provisioned by Karpenter.
To make the change in our Terraform code for EKS, we pass the option eks_managed_node_groups, in which we add an additional node pool, something like this:
    eks_managed_node_groups = {
      bottle = {
        enable_bootstrap_user_data = true
        platform                   = "bottlerocket"
        bootstrap_extra_args       = <<-EOT
          [settings]
          "motd" = "TrueFoundry: MLOps platform"
          [settings.kubernetes.node-labels]
          "bottlerocket.aws/updater-interface-version" = "2.0.0"
        EOT

        instance_types = local.env.user_input.tfy_control_plane.enabled == "True" ? ["c6a.xlarge", "m6a.xlarge", "c6i.xlarge", "r6a.xlarge"] : ["c6a.large", "m6a.large", "c6i.large", "r6a.xlarge"]
        capacity_type  = "SPOT"
        ami_type       = "BOTTLEROCKET_x86_64"

        # Not required nor used - avoid tagging two security groups with same tag as well
        create_security_group = false

        # Ensure enough capacity to run 2 Karpenter pods
        min_size     = 2
        max_size     = 2
        desired_size = 2

        labels = {
          "class.truefoundry.io" = "bottle"
        }

        tags = {
          # This will tag the launch template created for use by Karpenter
          "karpenter.sh/discovery" = local.env.cluster_name
        }

        block_device_mappings = {
          xvdb = {
            device_name = "/dev/xvdb"
            ebs = {
              volume_size           = 100
              volume_type           = "gp3"
              throughput            = 150
              delete_on_termination = true
            }
          }
        }
      }
    }
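There are a few important things to note in the above spec:
"bottlerocket.aws/updater-interface-version" = "2.0.0" is the node label that lets the Bottlerocket update operator identify the nodes it should manage.
ami_type = "BOTTLEROCKET_x86_64" selects the Bottlerocket AMI for the node group.
/dev/xvdb is the data volume, which holds container images and container storage. This is the volume sized in block_device_mappings.
/dev/xvda is the OS volume, which this spec leaves at its default.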
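Once the node group is up, a quick way to confirm that the label landed on the nodes is to filter on it with kubectl. This is a minimal sketch; the helper function names are hypothetical, and the label key and value are the ones from the spec above.

```sh
# Label the update operator looks for on Bottlerocket nodes.
BRUPOP_LABEL_KEY="bottlerocket.aws/updater-interface-version"
BRUPOP_LABEL_VALUE="2.0.0"

# Build the label selector string, e.g. for use with `kubectl -l`.
brupop_selector() {
  printf '%s=%s\n' "$BRUPOP_LABEL_KEY" "$BRUPOP_LABEL_VALUE"
}

# List nodes carrying the label; the OS-IMAGE column of the wide
# output shows which operating system each node is running.
list_brupop_nodes() {
  kubectl get nodes -l "$(brupop_selector)" -o wide
}
```

Running `list_brupop_nodes` against the cluster should return the two nodes of the `bottle` node group; an empty result usually means the label was not passed through the user data.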
Karpenter is a relatively new way of autoscaling your workloads. Based on the compute required, it tries to bring up a right-sized node, while also bin-packing the daemonsets so that all necessary workloads run on the node.
We rely heavily on Karpenter to spin up compute and GPU workloads. Karpenter has a concept called Provisioner (now deprecated and renamed NodePool), which defines the allowed node sizes along with the right labels and taints, if required. Through AWSNodeTemplate (now deprecated and replaced by EC2NodeClass) you can define the template of the node: the security group, the AMI family and the root volume size.
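From this you can see where changes are needed in the Karpenter provisioner and awsnodetemplate to make sure Karpenter spins up Bottlerocket instances. The key change is setting amiFamily to Bottlerocket in the awsnodetemplate, while the provisioner carries the node labels, including the updater-interface-version label used by the update operator.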
With this, Karpenter is able to launch Bottlerocket instances as well.
I am avoiding the full provisioner and awsnodetemplate specs here, as they are now deprecated in newer versions of Karpenter.
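For the newer API, a minimal sketch of the equivalent node class might look like the following. It assumes the `karpenter.k8s.aws/v1beta1` API and the `karpenter.sh/discovery` tag used earlier in this post; the function name, the resource name and the IAM role argument are hypothetical, so check the field names against the Karpenter version you run.

```sh
# Print an EC2NodeClass manifest that selects the Bottlerocket AMI family.
# Usage: ec2nodeclass_manifest <cluster-name> <node-iam-role-name>
ec2nodeclass_manifest() {
  local cluster_name="$1" node_role="$2"
  cat <<EOF
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: bottlerocket
spec:
  amiFamily: Bottlerocket
  role: ${node_role}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
EOF
}
```

It can be piped straight into the cluster, for example `ec2nodeclass_manifest my-cluster MyKarpenterNodeRole | kubectl apply -f -`, where both arguments are placeholders for your own values.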
The Bottlerocket update operator, or brupop, is a controller for keeping the Bottlerocket instances in an EKS cluster up to date.
It consists of three main components: a controller, an agent that runs on each node, and an API server.
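Brupop has two Helm charts which we need to install: bottlerocket-shadow, which provides the bottlerocketshadow CRD, and bottlerocket-update-operator, which installs the operator itself. The Argo CD applications for both are: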
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: bottlerocket-shadow
      namespace: argocd
      finalizers:
        - resources-finalizer.argocd.argoproj.io
    spec:
      destination:
        namespace: brupop-bottlerocket-aws
        server: https://kubernetes.default.svc
      project: default
      source:
        chart: bottlerocket-shadow
        repoURL: https://bottlerocket-os.github.io/bottlerocket-update-operator
        targetRevision: 1.0.0
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=true
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: bottlerocket-update-operator
      namespace: argocd
      finalizers:
        - resources-finalizer.argocd.argoproj.io
    spec:
      destination:
        namespace: brupop-bottlerocket-aws
        server: https://kubernetes.default.svc
      project: default
      source:
        chart: bottlerocket-update-operator
        repoURL: https://bottlerocket-os.github.io/bottlerocket-update-operator
        targetRevision: 1.3.0
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=true
Make sure the destination namespace is always brupop-bottlerocket-aws, as it is fixed in their Helm chart.
The controller uses the bottlerocketshadow CRDs to manage your nodes. Earlier in the document we asked the nodes to carry the label "bottlerocket.aws/updater-interface-version" = "2.0.0". This is so that the controller can identify which Bottlerocket instances you want brupop to manage.
You can check the status of your nodes by running:
    $ kubectl get brs -n brupop-bottlerocket-aws
    NAME                               STATE   VERSION   TARGET STATE   TARGET VERSION   CRASH COUNT
    brs-ip-xx-xx-1-243.ec2.internal    Idle    1.17.0    Idle           <no value>       0
    brs-ip-xx-xx-14-136.ec2.internal   Idle    1.17.0    Idle           <no value>       0
    brs-ip-xx-xx-31-78.ec2.internal    Idle    1.17.0    Idle           <no value>       0
We at TrueFoundry support Bottlerocket instances for both CPU and GPU workloads, so that your entire MLOps journey can focus on developing the right code to solve the right problems, relieving the pressure of maintaining patches and security fixes.