Current state clarity, Part 2 – Policyfiles

If you’ve never heard of Policyfiles, you should start by reading Michael Hedgpeth’s excellent Introduction to Policyfiles.

The Problem

When working with Chef for configuration management, there’s maybe nothing more rewarding than the moment when you’ve got a particular set of configs exactly right and Chef just rolls them to every single applicable node for you automatically. No matter how many times we’ve done that, it never stops feeling a little magical.

But getting to that point can be one of the least rewarding parts of infrastructure automation with Chef. You have to make sure that your nodes are collecting the right versions of the right cookbooks.  You have to figure out which attributes to set where.  You have to decide whether you’re going to compose your run_list in a role, a role cookbook, or (yikes!) directly on the node object.  You also have to ensure the correct dependencies for all of those cookbooks come together exactly the way you need them.

If you have a complex infrastructure (and all of ours are simple, right?), building all of that can take a substantial amount of time. Compounding those needs, chef-client recalculates those dependencies at the start of every run so that it can automatically pick up new changes. That means, once you’ve built up an entire solution set just right, you also have to ensure it comes together the right way every single time.

To do that, you’d typically build up precautions against new changes being introduced that might accidentally alter that solution out from under you. There are many established patterns to solve this problem: Berkshelf, pinning cookbook versions with Chef environments, the “role cookbook” pattern, the “environment cookbook” pattern, and others. Each of these solutions has its benefits and drawbacks. Let’s specifically look at one of those others: a Policyfile.

Policyfiles: Benefits

A Policyfile provides a unified interface: the run_list, the attributes, and the cookbook dependencies that together describe a particular node state are all declared in one single file.
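
As a rough illustration, a Policyfile.rb for the web server example used later in this post might look something like the sketch below. The cookbook names (nginx, firewall, base) come from that example; the version constraint, the local path, and the attribute value are made up for illustration.

```ruby
# Policyfile.rb -- a minimal sketch; names, versions, and paths are illustrative.

# The policy's name; nodes are assigned to a policy by this name.
name 'webserver'

# Where to look for cookbooks that have no explicit source below.
default_source :supermarket

# The run_list every node using this policy will get.
run_list 'recipe[base::default]', 'recipe[firewall::default]', 'recipe[nginx::default]'

# Version constraint and source for specific cookbooks (assumed values).
cookbook 'nginx', '~> 2.7'
cookbook 'base', path: '../cookbooks/base'

# Attributes live alongside the run_list and cookbook constraints (assumed value).
default['nginx']['worker_processes'] = 4
```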

When the author of the policy is satisfied, the policy is compiled on the author’s workstation. This generates a solution bundle that is ready to share, either by publishing it to a Chef Server or by copying it to remote machines as a tarball for chef-solo runs. The solution bundle consists of a lockfile and uniquely-identified copies of every cookbook required by the solution. It becomes impossible to use just part of the solution bundle: the lockfile and cookbook set are now (effectively) one single object.  They are transported, reassembled, and applied together in lockstep.
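
To make the “one single object” idea a little more concrete, here is a toy model in plain Ruby. It is not how the Chef tooling is implemented; the helper names (cookbook_identifier, build_lock, bundle_intact?) and the hashing scheme are assumptions invented for this sketch. It only shows why a lock that records a content-derived identifier for every cookbook cannot be satisfied by a partial or edited cookbook set.

```ruby
require 'digest'

# Toy model only: identify a cookbook by a hash of its file contents.
# Real tooling computes its own identifiers; this just shows the idea.
def cookbook_identifier(files)
  canonical = files.sort.map { |path, body| "#{path}\0#{body}" }.join("\0")
  Digest::SHA256.hexdigest(canonical)
end

# Build a lock that records the run_list and the exact identifier of each cookbook.
def build_lock(run_list, cookbooks)
  {
    run_list: run_list,
    cookbook_locks: cookbooks.transform_values { |files| cookbook_identifier(files) }
  }
end

# A bundle is usable only if every locked cookbook is present and unmodified.
def bundle_intact?(lock, cookbooks)
  lock[:cookbook_locks].all? do |name, id|
    cookbooks.key?(name) && cookbook_identifier(cookbooks[name]) == id
  end
end

cookbooks = {
  'nginx' => { 'recipes/default.rb' => "package 'nginx'\n" },
  'base'  => { 'recipes/default.rb' => "package 'curl'\n" }
}
lock = build_lock(['recipe[base::default]', 'recipe[nginx::default]'], cookbooks)

puts bundle_intact?(lock, cookbooks)   # true: complete, unmodified bundle

tampered = cookbooks.merge('nginx' => { 'recipes/default.rb' => "package 'nginx-full'\n" })
puts bundle_intact?(lock, tampered)    # false: a cookbook was edited after locking

partial = cookbooks.reject { |name, _| name == 'base' }
puts bundle_intact?(lock, partial)     # false: a cookbook is missing
```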

This represents a simplicity shift: Instead of thinking about the world as a set of object relationships and worrying about a run_list with a set of cookbooks to manage for each node, policy authors need only think about and generate the final bundle (editor’s note: Habitat, anyone?). Nodes are then assigned to use a particular policy bundle. Instead of thinking “web servers run the nginx cookbook, the firewall cookbook, and the base cookbook,” authors need only think, “web servers run the webserver policy.”
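
On the node itself, that assignment is typically expressed in client.rb through the policy_name and policy_group settings (a policy group being a named set of nodes that share one revision of a policy). The values below are illustrative: webserver is the policy from the example above, and staging is a hypothetical group name.

```ruby
# client.rb -- sketch; 'webserver' and 'staging' are illustrative values.
policy_name  'webserver'   # which policy this node runs
policy_group 'staging'     # which group's revision of that policy to use
```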

Because policies cannot be changed once bundled, the chef-client no longer recalculates dependency sets at the start of every run, and authors no longer have to take precautions against the configuration changing out from under them. The solution bundle is locked. No matter what changes anywhere else in the system, even to components that also appear in your solution set, your node will still get the exact same code (and only the code in your locked solution set) every single time.  In some settings, that level of certainty is also beneficial for non-technical reasons (e.g. grokability, onboarding, working with strict change management policies, etc.).

Again, policy solution bundles are immutable.  A generated policy will never change. Updating an existing policy means re-compiling everything on a developer workstation and publishing or sharing a new version of the solution bundle. That means policy bundles are promotable artifacts and they can (and should!) be tested and managed via build pipelines.
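
One way to picture “updating means publishing a new version” is to derive a revision ID from the entire content of the lock. The following is a toy sketch under that assumption, not the actual Chef implementation; revision_id and update_cookbook are hypothetical helpers, and the identifiers abc123 and def456 are placeholders.

```ruby
require 'digest'
require 'json'

# Toy model only: derive a revision ID from the lock's full content,
# so any change to the content necessarily produces a different ID.
def revision_id(lock)
  Digest::SHA256.hexdigest(JSON.generate(lock.sort.to_h))
end

# "Updating" never edits the old lock; it returns a new, frozen one.
def update_cookbook(lock, name, new_identifier)
  locks = lock['cookbook_locks'].merge(name => new_identifier)
  lock.merge('cookbook_locks' => locks).freeze
end

v1 = {
  'name' => 'webserver',
  'run_list' => ['recipe[nginx::default]'],
  'cookbook_locks' => { 'nginx' => 'abc123' }
}.freeze

v2 = update_cookbook(v1, 'nginx', 'def456')

puts revision_id(v1) == revision_id(v2)   # false: the update is a new revision
puts v1['cookbook_locks']['nginx']        # abc123: the old revision is untouched
```

Because each revision is a distinct, identifiable artifact, it can be tested and promoted through a pipeline like any other build output.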

Policyfiles: Drawbacks

While Policyfiles provide definite advantages, they may be difficult for some users to work with precisely because a generated policy will never change.  Is the juice worth the squeeze? That depends on your needs. There are existing patterns that might provide a more flexible way of accomplishing many of the same outcomes. Let’s explore where Policyfiles may have some drawbacks for you.

It’s difficult to make small and targeted changes across many policies. Because locked policy bundles are immutable, applying any change means generating an entire new policy bundle. At larger scales, this becomes impractical: imagine that you had two dozen policies spread across hundreds of nodes. Now imagine making a change to a core underlying cookbook that is in use across every single policy. In order for all nodes in the infrastructure to pick up that change, you would have to generate two dozen new policy updates, test all the resulting solution bundles, and promote each policy bundle to each affected node directly.
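
A small sketch can help estimate that fan-out before you commit to a change. Everything here is hypothetical: the inventory of policies, the node counts, and the helper names are invented for illustration, and in practice you would pull this data from your own lockfiles and node assignments.

```ruby
# Hypothetical inventory: policy name => cookbooks locked into that policy.
POLICIES = {
  'webserver' => %w[base firewall nginx],
  'database'  => %w[base firewall postgresql],
  'cache'     => %w[base redis]
}.freeze

# Hypothetical assignment: policy name => number of nodes running it.
NODE_COUNTS = { 'webserver' => 120, 'database' => 12, 'cache' => 8 }.freeze

# Every policy that locks the changed cookbook needs a new bundle.
def policies_to_rebuild(policies, changed_cookbook)
  policies.select { |_, cookbooks| cookbooks.include?(changed_cookbook) }.keys
end

# Total nodes that will only see the change after their policy is rebuilt.
def nodes_affected(policies, node_counts, changed_cookbook)
  policies_to_rebuild(policies, changed_cookbook).sum { |name| node_counts.fetch(name, 0) }
end

puts policies_to_rebuild(POLICIES, 'base').inspect    # all three policies
puts nodes_affected(POLICIES, NODE_COUNTS, 'base')    # 140
puts policies_to_rebuild(POLICIES, 'redis').inspect   # only "cache"
```

A change to a leaf cookbook touches one policy; a change to a core cookbook touches all of them, and each rebuilt bundle still has to be tested and promoted separately.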

We all try to avoid having special nodes in our infrastructure, but if a single node needs to be slightly different from all the rest, that requires generating a new policy just for that node. When using Policyfiles at scale, you trade the upfront complexity of managing dependency resolution for the tail-end complexity of managing sprawling policies. There are mitigation techniques to ease that burden. For example, many users of Policyfiles turn to Data Bags to make modeling exceptions easier by storing data separate from the policy. But, again, that’s a mitigation technique.

Policyfiles are a great way to statically lock dependent objects together, but they aren’t very good at describing infrastructure rules that cut across boundaries like, say, “all nodes in the staging environment must use the staging database.” Depending on the size and complexity of your infrastructure, this approach may prove to be unsustainable. You may find yourself jumping through hoops as your adoption matures just to get the same basic functionality you find with other existing solutions to the dependency resolution problem. If your infrastructure is complex, that complexity never entirely goes away no matter what you choose. The question becomes when and how you’re going to pay for it.

What this means for you

Should you be using Policyfiles? It depends.

If your workflow for managing cookbook dependencies is already working well for you, then you should continue to use your working solution. There is no compelling reason to switch.

Policyfiles may be for you if you are in one of the scenarios where they’re particularly useful:

  1. You’re having trouble with cookbook dependency management and you’re looking for an easily understandable way to simplify the problem upfront.
  2. You’re looking for a faster way to onboard new Chef practitioners in your organization and you’re willing to deal with shifting complexity further down the line in order to make it easier for new people to work with Chef.
  3. Your organization’s change control requirements are very strict, and the atomic guarantees Policyfiles provide are worth the expense and effort of the management overhead that comes with using them at scale.
  4. You work in air-gapped (i.e. network disconnected) environments where the process of packaging a policy and its dependent cookbooks into a single archive file works well with your chef-solo based approach for configuration management.

What this means for the Chef community

As we’ve seen above, the effort required to generate, test, deploy, and manage Policyfiles can be significant. Sometimes the benefits are worth that additional effort. Sometimes they are not.  You can determine that for yourself by examining how the guidelines above apply to your situation and organization.

Another consideration for Policyfiles is that they’re not presently compatible with the workflow feature of Chef Automate. Making them compatible means writing a build cookbook that supports a pipeline for Policyfiles. The delivery-truck build cookbook, used as the default for defining workflow in Chef Automate, is open-source. It could potentially be modified to support Policyfiles if the community finds that effort worthwhile.

Policyfiles have a small, but passionate, following in the Chef community. If you’d like to learn more about them and how you can help spread adoption, consider joining the #policyfile channel in the Chef community Slack.

Author George Miranda

George is a Product Marketing Director at Chef. He worked in webops for over 15 years at a variety of small dotcoms and large enterprises before delving into DevOps and Infrastructure as Code. He enjoys being a technical advocate and discussing effective solutions. He's an automation junkie who lives to help others solve problems and would love to help you solve yours. He lives in the Pacific Northwest and is a sucker for artisanal whiskey.