salt.runners.cluster
Salt runner for cluster ring management and inspection.
Query-only operator surface for the Raft-backed cluster. Reads come
from the per-master persisted Raft state on disk, so the runner does
not need IPC into the publish daemon's RaftService (which is a
separate process and not reachable from a runner subprocess).
CLI Examples:
# Show this master's view of the cluster voter/learner set.
salt-run cluster.members
# Show this master's current ring state.
salt-run cluster.ring_info
New in version 3009.0.
- salt.runners.cluster.collect_from_peers(channels=(), banks=('jobs/loads', 'jobs/minions', 'jobs/endtimes', 'jobs/nocache'))
Pull cache contents from every peer to this master.
The migration "going out" runner: it reverses the direction of cluster.sync_roots(). This master fires a cluster/runner/collect_from_peers event; the publish daemon broadcasts a collect-request to every peer. Each peer streams its cache contents for the requested channels back over the existing state-sync chunk transport, and this master's receiver applies them locally.
Use it to gather full coverage before flipping a data type back to broadcast with cluster.route_clear(): after every master has run this runner successfully, every master holds the full keyspace again and a route flip won't strand reads.
Two channel families are supported:
  - channels: fixed state-sync channels keys and denied_keys (the join-time minion-key transport).
  - banks: arbitrary salt.cache.Cache banks (e.g. the salt_cache returner's jobs/* banks). Each bank name is wrapped as a bank:<bank> channel on the wire and the peer streams it via salt.cluster.state_sync.iter_bank_chunks().
- Parameters:
channels -- Iterable of fixed state-sync channel names (subset of {"keys", "denied_keys"}). Defaults to empty; only set when migrating PKI banks (the default keys/denied_keys layout is intentionally broadcast in this branch; see MULTI_RING_DESIGN.md).
banks -- Iterable of salt.cache.Cache bank names. Defaults to the four jobs/* banks written by salt.returners.salt_cache, which is the production case for multi-ring migrations.
Fire-and-forget: the runner returns immediately after the event is on the bus. Poll local cache contents (or tail the master log for state-sync ... installed N items) to confirm delivery from each peer.
CLI Examples:
# Default: collect the jobs/* banks from every peer.
salt-run cluster.collect_from_peers
# Collect a specific bank only.
salt-run cluster.collect_from_peers banks='["jobs/loads"]'
# Operator migrating a routed PKI-keys bank (rare).
salt-run cluster.collect_from_peers channels='["keys"]' banks='[]'
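The two channel families can be pictured with a small sketch. The helper name wire_channels is hypothetical and is not part of Salt; it only illustrates the documented rule that fixed channels must be a subset of {"keys", "denied_keys"} and that each bank travels as a bank:<bank> channel.

```python
# Fixed state-sync channels named in the documentation above.
FIXED_CHANNELS = frozenset({"keys", "denied_keys"})


def wire_channels(channels=(), banks=()):
    """Return the channel names a collect request would carry.

    Hypothetical helper for illustration; the real runner may differ.
    """
    unknown = sorted(set(channels) - FIXED_CHANNELS)
    if unknown:
        raise ValueError(f"unsupported fixed channels: {unknown}")
    # Each cache bank is wrapped as a "bank:<bank>" channel on the wire.
    return list(channels) + [f"bank:{bank}" for bank in banks]
```

Calling wire_channels(banks=("jobs/loads",)) under these assumptions yields a single bank:jobs/loads channel, which mirrors the second CLI example.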
- salt.runners.cluster.members()
Return this master's view of the cluster's committed Raft membership.
Reads the persisted Raft log and snapshot from the local SaltStorage and replays committed CONFIG entries through a fresh MembershipStateMachine. The returned set is what this master has applied locally: in a healthy cluster every master converges to the same answer, but the response is local-only and may briefly diverge during membership changes.
Output:
{
    "node_id": str,             # this master's interface
    "voters": [str, ...],       # sorted
    "learners": [str, ...],     # sorted
    "membership_version": int,  # log index of latest CONFIG entry
    "voter_count": int,
    "learner_count": int,
}
membership_version is -1 when no CONFIG entry has been applied yet (e.g. a fresh master that has not finished joining).
CLI Example:
salt-run cluster.members
New in version 3009.0.
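The replay described above can be sketched in a few lines. This is not the MembershipStateMachine implementation: the entry shape (dicts with index, type, voters, learners), the commit_index argument, and the function name membership_view are all assumptions made for illustration.

```python
def membership_view(node_id, entries, commit_index):
    """Replay committed CONFIG entries into the documented output shape.

    Assumes `entries` is ordered by log index and each entry is a dict
    with "index" and "type"; CONFIG entries also carry "voters" and
    optionally "learners". The shape is hypothetical.
    """
    voters, learners, version = [], [], -1
    for entry in entries:
        if entry["index"] > commit_index:
            break  # the uncommitted tail of the log is ignored
        if entry["type"] != "CONFIG":
            continue
        # The latest committed CONFIG entry wins.
        voters = sorted(entry["voters"])
        learners = sorted(entry.get("learners", []))
        version = entry["index"]
    return {
        "node_id": node_id,
        "voters": voters,
        "learners": learners,
        "membership_version": version,
        "voter_count": len(voters),
        "learner_count": len(learners),
    }
```

With no CONFIG entry at or below the commit index, the sketch reports membership_version as -1, matching the fresh-master case.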
- salt.runners.cluster.migrate_jobs_to_cache(dry_run=False)
Migrate job-cache state from the local_cache returner layout into the bank layout salt.returners.salt_cache uses.
The default master_job_cache: local_cache returner writes each JID to <cachedir>/jobs/<2-hex>/<28-hex>/{.load.p, .minions.p, <minion_id>/return.p, …}. Operators flipping to master_job_cache: salt_cache (the multi-ring-capable returner) start with an empty bank set, so every job submitted before the flip becomes invisible to the new returner.
This one-shot runner walks the old filesystem layout and populates the salt_cache banks:
<cachedir>/jobs/<2>/<28>/.load.p        -> bank "jobs/loads", key=jid
<cachedir>/jobs/<2>/<28>/.minions.p     -> bank "jobs/minions", key=jid
<cachedir>/jobs/<2>/<28>/endtime        -> bank "jobs/endtimes", key=jid
<cachedir>/jobs/<2>/<28>/nocache        -> bank "jobs/nocache", key=jid
<cachedir>/jobs/<2>/<28>/<m>/return.p   -> bank "jobs/returns/<jid>", key=<m>
<cachedir>/jobs/<2>/<28>/<m>/out.p      -> folded into the same record
The original files are left in place. Operators who want to reclaim the disk can rm -rf <cachedir>/jobs after confirming the new banks are correct (running cluster.members / salt-run jobs.list_jobs against the new returner is the smoke check).
- Parameters:
dry_run -- If True, walk and count without writing any cache entries. Use to verify the runner sees every JID before committing.
Returns a structured result:
{
    "status": "ok" | "skipped",
    "scanned": int,           # JIDs walked
    "migrated": int,          # JIDs successfully written
    "skipped": int,           # malformed entries the runner ignored
    "returns_migrated": int,  # minion return records written
    "dry_run": bool,
    "jobs_root": str,         # path that was walked
}
CLI Examples:
# Preview without writing anything.
salt-run cluster.migrate_jobs_to_cache dry_run=True
# Actually copy the state across.
salt-run cluster.migrate_jobs_to_cache
Operationally: stop the master before flipping master_job_cache so new writes don't race the migration, run this runner, restart the master with the new opt set.
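The path-to-bank table for this runner can be restated as a small lookup. The function target_for and its arguments are hypothetical; how the runner recovers the JID for a directory is not described here, so the sketch simply takes it as an input.

```python
# File names directly inside a JID directory and the bank each one feeds.
TOP_LEVEL_BANKS = {
    ".load.p": "jobs/loads",
    ".minions.p": "jobs/minions",
    "endtime": "jobs/endtimes",
    "nocache": "jobs/nocache",
}


def target_for(jid, relative_path):
    """Map a file inside one JID directory to (bank, key).

    `relative_path` is relative to the JID directory and uses "/" as the
    separator. Returns None for files the table does not mention.
    Hypothetical helper, not Salt's implementation.
    """
    parts = relative_path.split("/")
    if len(parts) == 1 and parts[0] in TOP_LEVEL_BANKS:
        return TOP_LEVEL_BANKS[parts[0]], jid
    if len(parts) == 2 and parts[1] in ("return.p", "out.p"):
        # Per-minion files land in the per-JID returns bank, keyed by minion id.
        return f"jobs/returns/{jid}", parts[0]
    return None
```

Both return.p and out.p resolve to the same (bank, key) pair, which reflects the "folded into the same record" row of the table.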
- salt.runners.cluster.ring_create(name, voters)
Create a named ring with the given founding voters.
Fires a cluster/runner/ring_create event on the master's local bus; the publish daemon intercepts it and proposes a RING_REGISTRY entry through the cluster Raft group. Each master that is in voters will then bring up the per-ring Raft group locally when the registry entry commits.
- Parameters:
name -- Operator-chosen ring identifier (e.g. "jobs").
voters -- List of master node-ids (interface addresses) to serve as the founding voter set of the ring.
Asymmetric with cluster.ring_destroy: this runner only requests creation; bring-up of the per-ring Node is driven by the registry's commit callback inside the daemon.
CLI Example:
salt-run cluster.ring_create name=jobs voters='["m1","m2","m3"]'
- salt.runners.cluster.ring_destroy(name)
Mark the named ring as destroyed.
Fires a cluster/runner/ring_destroy event; the publish daemon proposes a RING_REGISTRY entry with status="destroyed". Once committed, every master that hosted the ring's Raft group tears it down locally. The on-disk state is left in place so an operator who re-creates the same ring picks up the persisted state.
- Parameters:
name -- Ring identifier (must match the name used at ring_create() time).
CLI Example:
salt-run cluster.ring_destroy name=jobs
- salt.runners.cluster.ring_info()
Return a snapshot of this master's ring state.
Reads the per-process ring populated by RaftService. Output:
{
    "is_clustered": bool,
    "node_count": int,
    "nodes": [str, ...],  # sorted
    "vnodes": int,
}
Note that runners run in their own subprocess; the ring instance they see is not the publish daemon's ring. In the current design that subprocess never has a populated ring, so this function will always report is_clustered=False until stage 2 introduces a process-shared ring (see GAPS.md). The signature is stable so the caller's contract does not change when the backing source does.
CLI Example:
salt-run cluster.ring_info
- salt.runners.cluster.ring_set(name=None, members=None, replicas=None)
Propose a new policy for the named ring.
Fires a cluster/runner/ring_set event; the publish daemon proposes a RING_CONFIG entry on the ring's own Raft log (not the cluster log). Partial updates are honoured: omit a knob to keep its existing value.
- Parameters:
name -- Ring identifier (required).
members -- "self" (ring is self-only; gate writes broadcast) or "voters" (ring tracks the ring's committed voter set; gate writes shard). None keeps the existing value.
replicas -- Integer >= 1. None keeps the existing value.
Must be invoked on the master that is the leader of the named ring's Raft group. Operators typically discover this by checking cluster.members first to find the ring's current leader.
CLI Example:
salt-run cluster.ring_set name=jobs members=voters replicas=2
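The partial-update rule (None keeps the existing value) can be sketched as a merge over the current policy. The function merge_ring_policy and the dict representation of a policy are assumptions for illustration; the actual RING_CONFIG entry format is not described here.

```python
VALID_MEMBERS = ("self", "voters")


def merge_ring_policy(current, members=None, replicas=None):
    """Return a new policy dict with only the supplied knobs changed.

    `current` is a hypothetical {"members": ..., "replicas": ...} dict.
    """
    updated = dict(current)  # never mutate the caller's policy
    if members is not None:
        if members not in VALID_MEMBERS:
            raise ValueError(f"members must be one of {VALID_MEMBERS}")
        updated["members"] = members
    if replicas is not None:
        if not isinstance(replicas, int) or replicas < 1:
            raise ValueError("replicas must be an integer >= 1")
        updated["replicas"] = replicas
    return updated
```

Validating before merging means a rejected call leaves the existing policy untouched, which is the behaviour an operator would expect from a proposal that never commits.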
- salt.runners.cluster.rings()
Return the cluster-log multi-ring registry as this master sees it.
Reads the persisted cluster Raft log on this master and replays every committed RING_REGISTRY entry through a fresh RingRegistryStateMachine. The result is the registry view this master has applied locally: in a healthy cluster every master converges to the same answer, but during a membership change a follower may lag by a heartbeat.
Output:
{
    "node_id": str,                  # this master's interface
    "rings": {
        "<ring_id>": {
            "founding_voters": [str, ...],
            "status": "active" | "destroyed",
        },
        ...
    },
    "active_rings": [str, ...],      # sorted, status=="active" only
    "registry_version": int,         # log index of last commit, -1 if none
}
CLI Example:
salt-run cluster.rings
New in version 3009.0.
- salt.runners.cluster.route_clear(data_type)
Clear the route for a data type, returning it to broadcast.
Fires a cluster/runner/route_clear event; the publish daemon proposes a ROUTE entry mapping data_type to None. Once committed, every master mirrors the data type's writes again (the pre-multi-ring default).
- Parameters:
data_type -- Logical cache identifier (e.g. "jobs").
CLI Example:
salt-run cluster.route_clear data_type=jobs
- salt.runners.cluster.route_set(data_type, ring)
Route a data type to a named ring.
Fires a cluster/runner/route_set event; the publish daemon proposes a ROUTE entry through the cluster Raft group. Once committed, gate sites in salt.master consult the routing table when they receive a write for data_type and defer to that ring's HashRing.owns() answer.
- Parameters:
data_type -- Logical cache identifier (e.g. "jobs").
ring -- Ring name to route to (must have been created via ring_create()).
CLI Example:
salt-run cluster.route_set data_type=jobs ring=jobs
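The gate decision described for route_set and route_clear can be sketched as follows. Everything here is illustrative: gate_accepts_write is a hypothetical name, and the owns_by_ring callables are stand-ins for each ring's HashRing.owns(), whose real signature is not shown in this document.

```python
def gate_accepts_write(routes, data_type, key, owns_by_ring):
    """Decide whether this master should store a write locally.

    routes:       {"<data_type>": "<ring_id>" or None}
    owns_by_ring: {"<ring_id>": callable(key) -> bool} for the rings this
                  master hosts (stand-in for HashRing.owns()).
    """
    ring_id = routes.get(data_type)
    if ring_id is None:
        # Unrouted or cleared data types stay broadcast: every master mirrors.
        return True
    owns = owns_by_ring.get(ring_id)
    if owns is None:
        # This master is not a member of the routed ring, so it drops the write.
        return False
    return owns(key)
```

Treating a missing route and an explicit None the same way reflects the statement that a cleared route returns the data type to the broadcast default.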
- salt.runners.cluster.routes()
Return the cluster-log data-type -> ring routing table as this master sees it.
Reads the persisted cluster Raft log and replays every committed ROUTE entry through a fresh RoutingStateMachine. Same caveats as rings(): a follower's view may briefly lag the leader during a routing change.
Output:
{
    "node_id": str,
    "routes": {"<data_type>": "<ring_id>" | None, ...},
    "routing_version": int,  # log index of last commit, -1 if none
    "drop_stats": {          # see ring_membership.drop_stats
        "<data_type>": {
            "ring_id": str,
            "not_a_member": int,
            "other_ring_member": int,
        },
        ...
    },
}
The drop_stats field is local-process only: it reflects what this master has gated since startup. not_a_member is the misconfig signal: a non-zero count means traffic for the named data type landed on a master that isn't in the routed ring (the load balancer probably needs adjusting).
Note: the runner subprocess and the publish daemon are separate processes with their own counter state, so this surface reflects the runner's view, not the daemon's. For an operational signal use grep "ring_membership: dropping" in the master log.
CLI Example:
salt-run cluster.routes
New in version 3009.0.
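A short sketch of reading the misconfig signal out of a routes() result. The function name misrouted_data_types is hypothetical, and the counts it inspects are only as meaningful as the process that produced them, per the note above.

```python
def misrouted_data_types(routes_result):
    """Return the data types whose not_a_member counter is non-zero, sorted.

    `routes_result` is a dict in the documented routes() output shape.
    """
    stats = routes_result.get("drop_stats", {})
    return sorted(
        data_type
        for data_type, counters in stats.items()
        if counters.get("not_a_member", 0) > 0
    )
```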
- salt.runners.cluster.shed_status()
Read this master's local cluster-shed-status.json sentinel, if any.
The sentinel is written by the master daemon whenever it runs a local shed (either operator-triggered cluster.shed_unowned, or a peer-triggered fan-out via cluster.shed_unowned_all). Operators check this file cluster-wide to confirm shed completed on every master.
Returns {"status": "missing"} when no sentinel has been written yet, which is typical on a master that has never run shed.
CLI Example:
salt-run cluster.shed_status
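The read-with-fallback behaviour can be sketched as below. The function read_shed_status is hypothetical, and the sketch assumes the sentinel sits directly under the master's cachedir and contains a JSON object; the fields inside a real sentinel are not specified here.

```python
import json
import os


def read_shed_status(cachedir):
    """Load cluster-shed-status.json from cachedir, or report it missing."""
    path = os.path.join(cachedir, "cluster-shed-status.json")
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        # No shed has run on this master yet.
        return {"status": "missing"}
```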
- salt.runners.cluster.shed_unowned(ring, banks=('jobs/loads', 'jobs/minions', 'jobs/endtimes', 'jobs/nocache'), subbank_template='jobs/returns/{key}', driver=None, dry_run=False)
Drop cache entries this master does not own for the named ring.
The migration "going in" runner. After cluster.ring_create / route_set have wired ring into the routing table and the per-ring Raft group has elected a leader, every master still has the full keyspace in its caches (a legacy of the pre-multi-ring broadcast era). This runner walks the configured cache banks on this master and deletes the entries that hash to other ring members.
- Parameters:
ring -- Ring identifier whose voter set defines ownership.
banks -- Cache banks to scan. Defaults match the salt.returners.salt_cache job layout (jobs/loads is the primary JID index; the others are sibling banks keyed by JID). Operators routing other caches override.
subbank_template -- Optional str.format-able template. When set, for each unowned key found in the first banks entry the runner also flushes the templated bank in its entirety; this is used for the salt_cache returner's per-JID returns bank ("jobs/returns/{key}"). Pass None for caches without sub-banks.
driver -- Optional override for the salt.cache.Cache driver. Defaults to the cache: opt, the same driver the returner writes through.
dry_run -- If True, compute the counts but don't flush anything. Use to preview the partition before committing.
Returns a structured result:
{
    "status": "ok" | "skipped",
    "ring": str,
    "dropped": int,           # primary-bank entries flushed
    "kept": int,              # primary-bank entries this master owns
    "subbanks_dropped": int,  # cascade banks flushed wholesale
    "dry_run": bool,
}
Reads membership from local persisted Raft state (same path cluster.members already uses) so the runner subprocess can answer "what does the ring look like?" without IPC into the publish daemon.
CLI Examples:
# Preview which JIDs would be dropped on this master.
salt-run cluster.shed_unowned ring=jobs dry_run=True
# Commit the deletions on the default jobs/* banks.
salt-run cluster.shed_unowned ring=jobs
# Shard a different cache type (the keys/denied-keys banks
# are intentionally broadcast and should NOT be sharded; this
# example assumes the operator has built a routed
# ``inventory`` cache).
salt-run cluster.shed_unowned ring=inventory \
    banks='["inventory/items"]' subbank_template=None
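The partition that a dry run previews can be sketched with a toy ownership function. Note the assumptions: toy_owner is a plain hash-modulo stand-in and is NOT Salt's HashRing, replicas are ignored, and plan_shed plus its extra "subbanks" field are hypothetical names added for illustration.

```python
import hashlib


def toy_owner(key, members):
    """Pick one owner per key by hashing. Illustration only, not HashRing."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    ordered = sorted(members)
    return ordered[int(digest, 16) % len(ordered)]


def plan_shed(node_id, members, primary_keys, subbank_template="jobs/returns/{key}"):
    """Count what a dry run would keep and drop on node_id."""
    kept, dropped, subbanks = 0, 0, []
    for key in primary_keys:
        if toy_owner(key, members) == node_id:
            kept += 1
            continue
        dropped += 1
        if subbank_template is not None:
            # The cascade flushes the whole templated bank for an unowned key.
            subbanks.append(subbank_template.format(key=key))
    return {
        "dropped": dropped,
        "kept": kept,
        "subbanks_dropped": len(subbanks),
        "subbanks": subbanks,  # extra field, not in the runner's documented output
        "dry_run": True,
    }
```

Because each key has exactly one owner in this toy scheme, summing "kept" across all members gives the number of primary keys, which is a useful sanity check before committing a real shed.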
- salt.runners.cluster.shed_unowned_all(ring, banks=('jobs/loads', 'jobs/minions', 'jobs/endtimes', 'jobs/nocache'), subbank_template='jobs/returns/{key}', driver=None, dry_run=False)
Fan-out shed_unowned() across every master in the cluster.
The single-master shed_unowned() runner drops the local master's unowned cache entries. For a complete migration the operator has to run that on every ring member, which is error-prone and verbose for clusters with more than three or four masters. This runner solves the operator UX:
  1. Fires a cluster/runner/shed_unowned_all event from the runner subprocess on this master.
  2. The publish daemon intercepts the event, broadcasts a cluster/peer/shed-request event (cluster_aes-encrypted) to every peer carrying the runner's parameters.
  3. Each peer's daemon intercepts the request and runs the same shed-unowned logic locally, writing a per-master sentinel at cachedir/cluster-shed-status.json so the operator can poll for results without tailing logs.
  4. The originator also runs its own local shed inline so the runner returns with a useful result even before peer sentinels appear.
- Parameters:
ring -- Ring identifier whose voter set defines ownership. Same shape as shed_unowned().
banks -- Cache banks to scan; defaults to the salt_cache jobs layout.
subbank_template -- Cascade bank template; defaults to "jobs/returns/{key}". Pass None to disable the cascade.
driver -- Optional salt.cache.Cache driver override. Defaults to the cache: opt.
dry_run -- When True, runs the partition preview on every master without committing.
Returns the same shape as shed_unowned() for this master's local pass, plus a fan_out field naming the cluster/peer/shed-request event that fanned to peers. Per-peer results land in their own sentinel files; operators can collect them with cluster.shed_status.
CLI Examples:
# Preview shed across every master in the cluster.
salt-run cluster.shed_unowned_all ring=jobs dry_run=True
# Commit shed across every master.
salt-run cluster.shed_unowned_all ring=jobs
- salt.runners.cluster.sync_roots(roots='both')
Push this master's file_roots and/or pillar_roots to every other cluster master.
Runs the operator-driven counterpart of the bulk state-sync that fires automatically during a cluster join. Use it when the canonical content on this master has changed and you want every peer to pick up the new files without restarting them or waiting for the next join handshake.
The runner fires a local event; the master daemon picks it up and fans out chunks to every peer over the encrypted cluster pub bus (same transport as the join-time state-sync). Returns immediately after the event is fired; the actual sync runs asynchronously in the master process. Check each peer's master log for the state-sync ... installed N items lines to confirm delivery.
- Parameters:
roots -- "file", "pillar", or "both" (default "both"). Selects which content trees to sync.
CLI Examples:
# Push both file_roots and pillar_roots to all peers
salt-run cluster.sync_roots
# Push only file_roots
salt-run cluster.sync_roots roots=file
New in version 3009.0.