A Basic Guide for New Marionettists


This article covers only the so-called masterless mode, a.k.a. puppet apply, because the master-slave mode sounds unethical and Bolt sounds violent. puppet apply, on the other hand, plays the role of both master and slave, which is arguably inoffensive.

Puppet uses a declarative DSL, i.e. you describe the desired state of puppets rather than the steps to reach it. Below is an example from the official documentation:

case $operatingsystem {
  centos, redhat: { $service_name = 'ntpd' }
  debian, ubuntu: { $service_name = 'ntp' }
}

package { 'ntp':
  ensure => installed,
}

service { 'ntp':
  name      => $service_name,
  ensure    => running,
  enable    => true,
  subscribe => File['ntp.conf'],
}

file { 'ntp.conf':
  path    => '/etc/ntp.conf',
  ensure  => file,
  require => Package['ntp'],
  source  => "puppet:///modules/ntp/ntp.conf",
}

This guide applies to Puppet 6.7.

Quick Start #

Install the puppet-agent package (which provides puppet apply), following the instructions in the official documentation.

As an oversimplified example, I wrote a script to ensure git is installed:

# filename: laptop.pp
package { 'git':
  ensure => installed,
}

Validate its syntax:

puppet parser validate laptop.pp

In practice, this is mainly useful in CI, since a decent editor should already warn you about syntax mistakes.

Rehearse the play (dry run):

puppet apply --noop laptop.pp

It's show time (run as root, since installing packages usually requires root permission):

puppet apply laptop.pp

You can also pass a directory instead of a file to puppet apply. In that case, however, Puppet combines all files under that directory into one script, without isolated scopes. I therefore prefer to use a single file.

Puppet DSL #

Style #

The Puppet community prefers snake_case names and two-space soft tabs.

Selector Expression #

A selector expression returns a value. For example, the case "statement" from the opening code sample:

case $operatingsystem {
  centos, redhat: { $service_name = 'ntpd' }
  debian, ubuntu: { $service_name = 'ntp' }
}

can be rewritten as:

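# Unlike case, a selector does not accept a comma-separated list of cases,
# so each group of names is matched with a regular expression instead.
$service_name = $operatingsystem ? {
  /^(CentOS|RedHat)$/ => 'ntpd',
  /^(Debian|Ubuntu)$/ => 'ntp',
}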

Strictly speaking, the two are not semantically equivalent. If the OS fails to match any case, the former does nothing, while the latter refuses to compile.
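
If you want the selector to tolerate an unmatched OS the way the case version does, one option is to add a default case. The following is only a sketch; returning undef as the fallback is my assumption about what "do nothing" should mean here, not something taken from the official example:

```puppet
# A selector with a default case compiles even when no other case matches.
$service_name = $operatingsystem ? {
  /^(CentOS|RedHat)$/ => 'ntpd',
  /^(Debian|Ubuntu)$/ => 'ntp',
  default             => undef, # assumption: leave the name unset on other systems
}
```

Whether a silent undef or a hard compile error is preferable depends on how much you trust the rest of the script to cope with a missing service name.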

That said, the if and case "statements" are actually expressions as well: they return the value of the last expression in the executed block, or undef when no block was executed.
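
As a rough illustration of that point, the case from the opening sample could be assigned directly to a variable. This is a sketch that assumes Puppet 4 or later, where case produces a value; the notice call is only there to make the result visible:

```puppet
# Each block's last expression becomes the value of the whole case.
$service_name = case $operatingsystem {
  'centos', 'redhat': { 'ntpd' }
  'debian', 'ubuntu': { 'ntp' }
}

# If no case matched, $service_name is undef at this point.
notice("service name: ${service_name}")
```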

Puppet Defaults #

The Puppet DSL has a weird syntax (capitalizing the type name) for declaring defaults for a specific type of puppet:

# Set a default value for the path attribute of every exec puppet.
Exec {
  path => '/usr/bin:/bin:/usr/sbin:/sbin',
}

The weirder part is that puppet defaults are dynamically scoped!
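
To make the dynamic scoping concrete, here is a small sketch. The class names outer and inner are hypothetical, and the behavior described in the comments is my reading of how dynamic scope applies to defaults, so treat it as an illustration rather than a guarantee:

```puppet
class outer {
  # The default is declared in outer's scope.
  Exec {
    path => '/usr/bin:/bin',
  }

  # inner is first declared here, so outer becomes its dynamic parent scope.
  include inner
}

class inner {
  # No path is given, yet this exec should pick up outer's default,
  # even though nothing in this class mentions it.
  exec { 'echo hello from inner': }
}

include outer
```

The surprising part is that the exec in inner changes behavior depending on who declares the class first, which is hard to see when reading inner on its own.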

Therefore, per-expression defaults are preferred. Below is an example from the official documentation:

file {
  default:
    ensure => file,
    owner  => "root",
    group  => "wheel",
    mode   => "0600",
  ;
  ['ssh_host_dsa_key', 'ssh_host_key', 'ssh_host_rsa_key']:
    # use all defaults
  ;
  ['ssh_config', 'ssh_host_dsa_key.pub', 'ssh_host_key.pub', 'ssh_host_rsa_key.pub', 'sshd_config']:
    # override mode
    mode => "0644",
  ;
}

Puppet Collector #

A puppet collector, sometimes called the "spaceship operator", selects a group of puppets by searching their attributes:

User <| groups == 'wheel' |>

Besides ==, search expressions support the operators !=, and, and or.
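
The operators can be combined. The sketch below assumes that title is accepted as a searchable attribute and that parentheses may be used for grouping; the group name sudo is just an example value:

```puppet
# Users in the wheel group, excluding root.
User <| groups == 'wheel' and title != 'root' |>

# Parentheses group sub-expressions.
User <| (groups == 'wheel' or groups == 'sudo') and title != 'root' |>
```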

Lazy Puppet #

The at (@) prefix marks a puppet declaration as lazy:

@user { 'ubuntu':
  uid     => 1000,
  comment => 'default user',
  groups  => 'wheel',
}

Lazy puppets are not applied until they are:

  1. explicitly realized, e.g. realize User['ubuntu'], or
  2. implicitly realized by a matching collector, e.g. User <| groups == 'wheel' |>.
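
Putting the two pieces together, a minimal sketch might look like the following. The guest user and its uid are hypothetical, and the example assumes the user type's groups attribute for supplementary groups:

```puppet
@user { 'ubuntu':
  uid     => 1000,
  comment => 'default user',
  groups  => 'wheel',
}

@user { 'guest':
  uid    => 1001,
  groups => 'users',
}

# Only ubuntu is realized here; guest stays lazy and is never applied.
realize User['ubuntu']
```

Replacing the realize line with the collector User <| groups == 'wheel' |> should have the same effect in this case, because only ubuntu matches the search.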

Glossary #

This guide uses different terms from the official Puppet documentation. These are just my personal preference. For example, what this guide calls a script is called a manifest in the official Puppet documentation. In my opinion, script makes more sense than manifest. In fact, Chef calls the similar notion a recipe.