salt.states.archive

Extract an archive

New in version 2014.1.0.

salt.states.archive.extracted(name, source, source_hash=None, source_hash_name=None, source_hash_update=False, skip_files_list_verify=False, skip_verify=False, password=None, options=None, list_options=None, force=False, overwrite=False, clean=False, clean_parent=False, user=None, group=None, if_missing=None, trim_output=False, use_cmd_unzip=None, extract_perms=True, enforce_toplevel=True, enforce_ownership_on=None, archive_format=None, **kwargs)

New in version 2014.1.0.

Changed in version 2016.11.0: This state has been rewritten. Some arguments are new to this release and will not be available in the 2016.3 release cycle (and earlier). Additionally, the ZIP Archive Handling section below applies specifically to the 2016.11.0 release (and newer).

Ensure that an archive is extracted to a specific directory.

Important

Changes for 2016.11.0

In earlier releases, this state would rely on the if_missing argument to determine whether or not the archive needed to be extracted. When this argument was not passed, then the state would just assume if_missing is the same as the name argument (i.e. the parent directory into which the archive would be extracted).

This caused a number of annoyances. One such annoyance was the need to know beforehand a path that would result from the extraction of the archive, and setting if_missing to that directory, like so:

extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    - user: www
    - group: www
    - if_missing: /var/www/myapp-16.2.4

If /var/www already existed, this would effectively make if_missing a required argument, just to get Salt to extract the archive.

Some users worked around this by adding the top-level directory of the archive to the end of the name argument, and then used --strip or --strip-components to remove that top-level dir when extracting:

extract_myapp:
  archive.extracted:
    - name: /var/www/myapp-16.2.4
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    - user: www
    - group: www

With the rewrite for 2016.11.0, these workarounds are no longer necessary. if_missing is still a supported argument, but it is no longer required. The equivalent SLS in 2016.11.0 would be:

extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    - user: www
    - group: www

Salt now uses a function called archive.list to get a list of files/directories in the archive. Using this information, the state can now check the minion to see if any paths are missing, and know whether or not the archive needs to be extracted. This makes the if_missing argument unnecessary in most use cases.

Important

ZIP Archive Handling

Note: this information applies to 2016.11.0 and later.

Salt has two different functions for extracting ZIP archives:

  1. archive.unzip, which uses Python's zipfile module to extract ZIP files.

  2. archive.cmd_unzip, which uses the unzip CLI command to extract ZIP files.

Salt will prefer the use of archive.cmd_unzip when CLI options are specified (via the options argument), and will otherwise prefer the archive.unzip function. However, use of archive.cmd_unzip can be forced by setting the use_cmd_unzip argument to True. By contrast, setting this argument to False will force usage of archive.unzip. For example:

/var/www:
  archive.extracted:
    - source: salt://foo/bar/myapp.zip
    - use_cmd_unzip: True

When use_cmd_unzip is omitted, Salt will choose which extraction function to use based on the source archive and the arguments passed to the state. When in doubt, simply do not set this argument; it is provided as a means of overriding the logic Salt uses to decide which function to use.

The two extraction functions differ in the features they support. These differences are detailed below.

  • Command-line options (only supported by archive.cmd_unzip) - When the options argument is used, archive.cmd_unzip is the only function that can be used to extract the archive. Therefore, if use_cmd_unzip is specified and set to False, and options is also set, the state will not proceed.

  • Permissions - Due to an upstream bug in Python, permissions are not preserved when the zipfile module is used to extract an archive. As of the 2016.11.0 release, archive.unzip (as well as this state) has an extract_perms argument which, when set to True (the default), will attempt to match the permissions of the extracted files/directories to those defined within the archive. To disable this functionality and have the state not attempt to preserve the permissions from the ZIP archive, set extract_perms to False:

    /var/www:
      archive.extracted:
        - source: salt://foo/bar/myapp.zip
        - extract_perms: False
    
name

Directory into which the archive should be extracted

source

Archive to be extracted

Note

This argument uses the same syntax as its counterpart in the file.managed state.

source_hash

Hash of source file, or file with list of hash-to-file mappings

Note

This argument uses the same syntax as its counterpart in the file.managed state.

Changed in version 2016.11.0: If this argument specifies the hash itself, instead of a URI to a file containing hashes, the hash type can now be omitted and Salt will determine the hash type based on the length of the hash. For example, both of the below states are now valid, while before only the second one would be:

foo_app:
  archive.extracted:
    - name: /var/www
    - source: https://mydomain.tld/foo.tar.gz
    - source_hash: 3360db35e682f1c5f9c58aa307de16d41361618c

bar_app:
  archive.extracted:
    - name: /var/www
    - source: https://mydomain.tld/bar.tar.gz
    - source_hash: sha1=5edb7d584b82ddcbf76e311601f5d4442974aaa5

source_hash_name

When source_hash refers to a hash file, Salt will try to find the correct hash by matching the filename part of the source URI. When managing a file with a source of salt://files/foo.tar.gz, then the following line in a hash file would match:

acbd18db4cc2f85cedef654fccc4a4d8    foo.tar.gz

This line would also match:

acbd18db4cc2f85cedef654fccc4a4d8    ./dir1/foo.tar.gz

However, sometimes a hash file will include multiple similar paths:

37b51d194a7513e45b56f6524f2d51f2    ./dir1/foo.tar.gz
acbd18db4cc2f85cedef654fccc4a4d8    ./dir2/foo.tar.gz
73feffa4b7f6bb68e44cf984c85f6e88    ./dir3/foo.tar.gz

In cases like this, Salt may match the incorrect hash. This argument can be used to tell Salt which filename to match, to ensure that the correct hash is identified. For example:

/var/www:
  archive.extracted:
    - source: https://mydomain.tld/dir2/foo.tar.gz
    - source_hash: https://mydomain.tld/hashes
    - source_hash_name: ./dir2/foo.tar.gz

Note

This argument must contain the full filename entry from the checksum file, as this argument is meant to disambiguate matches for multiple files that have the same basename. So, in the example above, simply using foo.tar.gz would not match.

New in version 2016.11.0.

source_hash_update : False

Set this to True to extract the archive when source_hash has changed and the contents of the archive differ from the local files. In that case the archive is extracted regardless of the if_missing parameter.

Note that this is only checked if the source value has not changed. If it has (e.g. to increment a version number in the path) then the archive will not be extracted even if the hash has changed.

Note

Setting this to True with keep_source set to False causes the source to be re-downloaded so that the archive's file list can be checked. If this is not desirable, consider using the skip_files_list_verify argument.

New in version 2016.3.0.

skip_files_list_verify : False

Set this to True if the archive should be extracted when source_hash has changed, with only the checksum of the archive (not its file list) being checked to determine whether extraction is required.

Salt will try to find a local cache of the source and check its hash against source_hash. If no local cache is available, for example because keep_source is set to False, it will try to find a cached source hash file in the minion's archives cache directory.

Note

A current limitation of this logic is that the minion's hash_type config option must be set to the same hash type as the one passed via the source_hash argument.

Warning

With this argument set to True, Salt only checks source_hash against the local hash of the source. So if you, for example, remove extracted files without clearing the Salt minion cache, the next time you execute the state Salt will not notice that extraction is required as long as the hashes still match.

New in version 3000.
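
As an illustration, a state that relies on checksum-only verification might look like the sketch below. The URL is hypothetical and the hash is a placeholder, not a real release checksum. The sketch also assumes the minion's hash_type option is set to sha256, so that it matches the hash type passed in source_hash.

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    # Hypothetical URL; the hash below is a placeholder value
    - source: https://mydomain.tld/myapp-16.2.4.tar.gz
    - source_hash: sha256=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    # Re-extract when source_hash changes...
    - source_hash_update: True
    # ...but compare checksums only, without listing the archive contents
    - skip_files_list_verify: True
```

Per the warning above, files removed by hand from /var/www would not be restored by this state while the cached hash still matches.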

skip_verify : False

If True, hash verification of remote file sources (http://, https://, ftp://) will be skipped, and the source_hash argument will be ignored.

New in version 2016.3.4.

keep_source : True

For source archives not local to the minion (i.e. from the Salt fileserver or a remote source such as http(s) or ftp), Salt will need to download the archive to the minion cache before it can be extracted. To remove the downloaded archive after extraction, set this argument to False.

New in version 2017.7.3.
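
A minimal sketch of discarding the cached archive after extraction, reusing the salt:// path from the earlier examples on this page:

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    # Remove the downloaded copy from the minion cache once extracted
    - keep_source: False
```

The trade-off is that later runs which need to inspect the archive have to download it again.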

keep : True

Same as keep_source, kept for backward-compatibility.

Note

If both keep_source and keep are used, keep will be ignored.

password

For ZIP archives only. Password used for extraction.

New in version 2016.3.0.

Changed in version 2016.11.0: The newly-added archive.is_encrypted function will be used to determine if the archive is password-protected. If it is, then the password argument will be required for the state to proceed.
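
For example, a password-protected ZIP archive could be extracted as sketched below. The archive path and the pillar key myapp_zip_password are hypothetical; reading the password from pillar is just one way to keep it out of the SLS file.

```yaml
/var/www:
  archive.extracted:
    # Hypothetical encrypted archive on the Salt fileserver
    - source: salt://foo/bar/secure.zip
    # Hypothetical pillar key holding the ZIP password
    - password: "{{ pillar['myapp_zip_password'] }}"
```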

options

For tar and zip archives only. This option can be used to specify a string of additional arguments to pass to the tar/unzip command.

If this argument is not used, then the minion will attempt to use Python's native tarfile/zipfile support to extract the archive. For zip archives, this argument is mostly used to overwrite existing files with o.

Using this argument means that the tar or unzip command will be used, which is less platform-independent, so keep this in mind when using this option; the CLI options must be valid options for the tar/unzip implementation on the minion's OS.

New in version 2016.11.0.

Changed in version 2015.8.11,2016.3.2: XZ-compressed tar archives no longer require J to be set manually in the options; they are now detected automatically, decompressed using the xz CLI command, and extracted using tar xvf. This is a more platform-independent solution, as not all tar implementations support the J argument for extracting archives.

Note

For tar archives, main operators like -x, --extract, --get, -c and -f/--file should not be used here.

list_options

For tar archives only. This state uses archive.list to discover the contents of the source archive so that it knows which file paths should exist on the minion if the archive has already been extracted. For the vast majority of tar archives, archive.list "just works". Archives compressed using gzip, bzip2, and xz/lzma (with the help of the xz CLI command) are supported automatically. However, for archives compressed using other compression types, CLI options must be passed to archive.list.

This argument will be passed through to archive.list as its options argument, to allow it to successfully list the archive's contents. For the vast majority of archives, this argument should not need to be used; it should only be needed in cases where the state fails with an error stating that the archive's contents could not be listed.

New in version 2016.11.0.

force : False

If a path that should be occupied by a file in the extracted result is instead a directory (or vice-versa), the state will fail. Set this argument to True to force these paths to be removed in order to allow the archive to be extracted.

Warning

Use this option very carefully.

New in version 2016.11.0.

overwrite : False

Set this to True to force the archive to be extracted. This is useful for cases where the filenames/directories have not changed, but the content of the files has.

New in version 2016.11.1.

clean : False

Set this to True to remove any top-level files and recursively remove any top-level directory paths before extracting.

Note

Files will only be cleaned first if extracting the archive is deemed necessary, either by paths missing on the minion, or if overwrite is set to True.

New in version 2016.11.1.
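
A short sketch of combining clean with overwrite, so that the archive's top-level paths are removed and then re-extracted on each run. The paths reuse the earlier examples on this page.

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    # Always extract, even if every path already exists
    - overwrite: True
    # Remove the archive's top-level files/directories before extracting
    - clean: True
```

Without overwrite, the cleanup would only happen when paths from the archive are missing on the minion.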

clean_parent : False

If True, and the archive is extracted, delete the parent directory (i.e. the directory into which the archive is extracted), and then re-create that directory before extracting. Note that clean and clean_parent are mutually exclusive.

New in version 3000.

user

The user to own each extracted file. Not available on Windows.

New in version 2015.8.0.

Changed in version 2016.3.0: When used in combination with if_missing, ownership will only be enforced if if_missing is a directory.

Changed in version 2016.11.0: Ownership will be enforced only on the file/directory paths found by running archive.list on the source archive. An alternative root directory on which to enforce ownership can be specified using the enforce_ownership_on argument.

group

The group to own each extracted file. Not available on Windows.

New in version 2015.8.0.

Changed in version 2016.3.0: When used in combination with if_missing, ownership will only be enforced if if_missing is a directory.

Changed in version 2016.11.0: Ownership will be enforced only on the file/directory paths found by running archive.list on the source archive. An alternative root directory on which to enforce ownership can be specified using the enforce_ownership_on argument.

if_missing

If specified, this path will be checked, and if it exists then the archive will not be extracted. This path can be either a directory or a file, so this option can also be used to check for a semaphore file and conditionally skip extraction.

Changed in version 2016.3.0: When used in combination with either user or group, ownership will only be enforced when if_missing is a directory.

Changed in version 2016.11.0: Ownership enforcement is no longer tied to this argument. The path is simply checked for existence, and extraction will be skipped if it is present.
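
As a sketch of the semaphore-file use case, the state below skips extraction once a marker file exists. The marker path .installed is hypothetical and would have to be created by some other step, for instance a post-install state.

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    # Hypothetical marker file; if it exists, extraction is skipped
    - if_missing: /var/www/myapp-16.2.4/.installed
```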

trim_output : False

Useful for archives with many files in them. This can either be set to True (in which case only the first 100 files extracted will be in the state results), or it can be set to an integer for more exact control over the max number of files to include in the state results.

New in version 2016.3.0.
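
For instance, to cap the state results at 20 entries rather than the 100 used when the argument is set to True (the value 20 is arbitrary):

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    # Report at most 20 extracted files in the state results
    - trim_output: 20
```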

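use_cmd_unzip : None

Set to True for zip files to force usage of the archive.cmd_unzip function to extract, or to False to force usage of archive.unzip. When omitted, Salt chooses the extraction function based on the source archive and the arguments passed to the state.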

New in version 2016.11.0.

extract_perms : True

For ZIP archives only. When using archive.unzip to extract ZIP archives, Salt works around an upstream bug in Python to set the permissions on extracted files/directories to match those encoded into the ZIP archive. Set this argument to False to skip this workaround.

New in version 2016.11.0.

enforce_toplevel : True

This option will enforce a single directory at the top level of the source archive, to prevent extracting a 'tar-bomb'. Set this argument to False to allow archives with files (or multiple directories) at the top level to be extracted.

New in version 2016.11.0.
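
A minimal sketch for an archive that is known to contain several files at its top level. The state ID, target directory, and archive name are hypothetical.

```yaml
extract_scripts:
  archive.extracted:
    # Extract into a dedicated directory, since the archive has no single top-level dir
    - name: /opt/tools
    - source: salt://tools/scripts.tar.gz
    # Allow files (or multiple directories) at the top level of the archive
    - enforce_toplevel: False
```

Pointing name at a dedicated directory keeps the loose top-level files from being scattered among unrelated content.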

enforce_ownership_on

When user or group is specified, Salt will default to enforcing ownership on the file/directory paths detected by running archive.list on the source archive. Use this argument to specify an alternate directory on which ownership should be enforced.

Note

This path must be within the path specified by the name argument.

New in version 2016.11.0.
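
For example, ownership can be enforced on a specific directory under name, as sketched below. The directory shown is assumed to be the one created by the archive from the earlier examples.

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    - source: salt://apps/src/myapp-16.2.4.tar.gz
    - user: www
    - group: www
    # Must be a path within /var/www (the name argument)
    - enforce_ownership_on: /var/www/myapp-16.2.4
```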

archive_format

One of tar, zip, or rar.

Changed in version 2016.11.0: If omitted, the archive format will be guessed based on the value of the source argument. If the minion is running a release older than 2016.11.0, this option is required.
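
When the source name carries no recognizable extension, the format can be stated explicitly. In this sketch the source path myapp-latest is hypothetical:

```yaml
extract_myapp:
  archive.extracted:
    - name: /var/www
    # Hypothetical source with no file extension to guess the format from
    - source: salt://apps/src/myapp-latest
    - archive_format: tar
```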

Examples

  1. tar with lzma (i.e. xz) compression:

    graylog2-server:
      archive.extracted:
        - name: /opt/
        - source: https://github.com/downloads/Graylog2/graylog2-server/graylog2-server-0.9.6p1.tar.lzma
        - source_hash: md5=499ae16dcae71eeb7c3a30c75ea7a1a6
    
  2. tar archive with flag for verbose output, and enforcement of user/group ownership:

    graylog2-server:
      archive.extracted:
        - name: /opt/
        - source: https://github.com/downloads/Graylog2/graylog2-server/graylog2-server-0.9.6p1.tar.gz
        - source_hash: md5=499ae16dcae71eeb7c3a30c75ea7a1a6
        - options: v
        - user: foo
        - group: foo
    
  3. tar archive, with source_hash_update set to True to prevent the state from attempting extraction unless the source_hash differs from the previous time the archive was extracted:

    graylog2-server:
      archive.extracted:
        - name: /opt/
        - source: https://github.com/downloads/Graylog2/graylog2-server/graylog2-server-0.9.6p1.tar.lzma
        - source_hash: md5=499ae16dcae71eeb7c3a30c75ea7a1a6
        - source_hash_update: True