Tuesday, April 13, 2021

A safe worker on the main Jenkins node

A CI farm using Jenkins can start small and grow big. It can be a PoC or a multi-config builder running on a laptop for a single developer's fun, or it can be a dispatcher of jobs commanding numerous machines and swarms of VMs and containers. In the latter case, when a Jenkins instance grows big (and public), security considerations come into view. One of these is the safety of the Jenkins controller (née master) against the payload of the jobs it runs, at the lowest layers of the OS: how safe are its configuration files and running processes against the arbitrary scripts someone puts into Git?

Documents like https://wiki.jenkins.io/display/JENKINS/Security+implication+of+building+on+master or https://www.jenkins.io/redirect/building-on-controller/ and plugins like https://wiki.jenkins.io/display/JENKINS/Job+Restrictions+Plugin address this situation, effectively by reducing the list of jobs that can run on the controller.

In many cases, for a modern deployment, you would not need to have jobs running on the controller at all: a pipeline would use some agent definition (maybe a Docker container, maybe a match to some label expression) and its payload would run there. The "master" node can then be configured with zero executors and not pick up any work, except for some system tasks like initial parsing of the pipeline scripts or SCM scanning (which do not consume an executor).
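
For illustration, the executor count of the controller's own node can also be set from the script console ($JENKINS_URL/script) rather than through the web form. This is only a minimal sketch, assuming an administrator account and a reasonably recent Jenkins core (very old releases lack Jenkins.get() and use Jenkins.getInstance() instead):

```groovy
import jenkins.model.Jenkins

// Assumption: this runs in the script console as an administrator.
def controller = Jenkins.get()

// Zero executors: the controller's own node stops accepting builds.
// Lightweight tasks that do not occupy an executor are unaffected.
controller.setNumExecutors(0)

// Persist the setting; harmless if the setter has already saved it.
controller.save()

println "Executors on the controller node: ${controller.numExecutors}"
```

The same value is the "number of executors" field on the node's configuration page, so the script is mainly useful when the setting has to be reapplied automatically.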

However, you may still need "infrastructural" jobs, more likely in legacy deployments but possibly also in new ones tailored to the physical layout of some build farm: for grooming your hypervisor, orchestrating integration tests of product images, or collecting health stats from non-Jenkins players in your farm.

Quite likely, such jobs need a worker on a predictable host, or even a persistent workspace to pass some state between runs. Reasons may include using certain files you leave in the FS (an anti-pattern for pure Jenkins setups, where credentials may be better, including a text-file "credential"), using NFS shares, and so on. This is where the solution below can help:

My recurring pattern for avoiding that "insecure" setup, while providing an equivalent type of worker, is to create a persistent agent (SSH, Swarm...) running under a different Unix/Linux account on the same machine as the Jenkins controller. It is labeled e.g. "master-worker" and limited in its node configuration to only run jobs that match it by label. The few infra jobs which need it explicitly ask to run on that node via a label expression: by agent definition in pipelines, or by "Restrict where this project can be run" in legacy job types (e.g. Freestyle). The original "(master)" node is then limited to 0 executors, so effectively it only processes pipeline start-ups; you can manage it at your $JENKINS_URL/computer/(master)/configure/ (parentheses included).
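
As a sketch of the pipeline side of this pattern, a declarative Jenkinsfile for such an infra job could pin itself to the agent by label. The label "master-worker" is the example name used above; the stage name and the shell command are placeholders for illustration, not part of any real setup:

```groovy
pipeline {
    // Run only on the persistent agent that carries this label.
    agent { label 'master-worker' }

    options {
        // Assumption: the job keeps state in its workspace between runs,
        // so overlapping builds are not wanted.
        disableConcurrentBuilds()
    }

    stages {
        stage('Collect farm stats') {   // hypothetical stage name
            steps {
                // Placeholder payload: print the account and host running the job.
                sh 'id -un && hostname'
            }
        }
    }
}
```

If the agent is set up as described, the printed account should be the agent's own one rather than the account the controller runs under, which is the whole point of the pattern.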

This way, such a "master-worker" is just another worker that does not endanger the Jenkins master, as far as messing with the FS and processes is concerned: agent.jar runs under a different account which just happens to be on the "localhost" relative to the controller. Unlike containers and the like, however, it is persistent and runs on a predictable machine, which may be an advantage.