This repository was archived by the owner on Jan 21, 2020. It is now read-only.

Commit e6c2deb

Author: David Chung
Manager: a stateful group plugin with leader detection (#283)
Signed-off-by: David Chung <[email protected]>
1 parent 89a65c0 commit e6c2deb

11 files changed, +948 -3 lines changed
Makefile

Lines changed: 1 addition & 0 deletions
@@ -82,6 +82,7 @@ ifneq (,$(findstring .m,$(VERSION)))
 endif

 	$(call build_binary,infrakit,github.com/docker/infrakit/cmd/cli)
+	$(call build_binary,infrakit-manager,github.com/docker/infrakit/cmd/manager)
 	$(call build_binary,infrakit-group-default,github.com/docker/infrakit/cmd/group)
 	$(call build_binary,infrakit-flavor-combo,github.com/docker/infrakit/example/flavor/combo)
 	$(call build_binary,infrakit-flavor-swarm,github.com/docker/infrakit/example/flavor/swarm)

cmd/manager/README.md

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
InfraKit Manager
================

The Manager is a binary that offers a Group interface while providing the following:

+ Leadership detection - for coordinating multiple sets (replicas) of InfraKit plugins
+ State storage - persists user configuration in some backend

Both file-based and Docker Swarm (Swarm Mode) based leadership detection and state storage are
available.

## Group Interface

Currently the manager exposes the same Group plugin interface as `infrakit-group-default`.
This means the `infrakit group ...` commands work as usual. The manager expects a group plugin
to be running before it starts, and it functions as a proxy for that group plugin:

+ When the user runs `infrakit group watch` or `infrakit group update`, the manager
  persists the input configuration in the data store it was configured with at startup.
+ If the data store is configured with a backend that is shared or replicated across multiple
  instances of the InfraKit ensemble (all the collaborating plugins), high availability can be
  achieved via leader detection and global availability of state (the stored config).
+ Multiple replicas of the manager can do leader detection so that only one is active. As
  soon as leadership changes, the responsibility for maintaining infrastructure state is transferred
  to the new manager that became active.

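The persist-then-forward behaviour described above can be sketched as follows. This is a minimal illustration and not the manager's actual code: the `Group` and `Store` interfaces, the `proxy` type and the in-memory fakes are all hypothetical names introduced for the example.

```go
package main

import "fmt"

// Group is a hypothetical, pared-down stand-in for the group plugin interface.
type Group interface {
	Update(name, config string) error
}

// Store is a hypothetical stand-in for the manager's data store.
type Store interface {
	Save(configs map[string]string) error
}

// proxy persists every config it receives before forwarding the call downstream.
type proxy struct {
	store      Store
	downstream Group
	configs    map[string]string
}

func newProxy(s Store, g Group) *proxy {
	return &proxy{store: s, downstream: g, configs: map[string]string{}}
}

// Update saves the full set of configs first and forwards only if the save succeeded.
func (p *proxy) Update(name, config string) error {
	next := make(map[string]string, len(p.configs)+1)
	for k, v := range p.configs {
		next[k] = v
	}
	next[name] = config
	if err := p.store.Save(next); err != nil {
		return fmt.Errorf("persisting config for %q: %w", name, err)
	}
	p.configs = next
	return p.downstream.Update(name, config)
}

// memStore is an in-memory fake that remembers the last saved configs.
type memStore struct{ saved map[string]string }

func (s *memStore) Save(configs map[string]string) error {
	s.saved = make(map[string]string, len(configs))
	for k, v := range configs {
		s.saved[k] = v
	}
	return nil
}

// recordingGroup is a fake downstream plugin that records which groups were updated.
type recordingGroup struct{ calls []string }

func (g *recordingGroup) Update(name, _ string) error {
	g.calls = append(g.calls, name)
	return nil
}

func main() {
	s, g := &memStore{}, &recordingGroup{}
	p := newProxy(s, g)
	if err := p.Update("workers", `{"size": 3}`); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	fmt.Println("persisted:", s.saved, "forwarded:", g.calls)
}
```

Saving before forwarding means a replica that later becomes leader can always find the most recent config that the user submitted, which is the property the high-availability design relies on.
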
## Leadership

The manager can use either `os` or `swarm` for leadership detection:

### OS mode (via the `os` subcommand)

1. Assumes multiple manager instances can access a shared file (e.g. over NFS or FUSE on S3).
2. Each manager starts up with a name (the `--name` flag).
3. The manager instance whose name matches the content of the shared file is the leader.

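The file-based check in the list above amounts to comparing the manager's name with the file content. The sketch below is an assumption-laden illustration, not the code in `leader/file`: the `isLeader` and `checkWithTempFile` helpers are hypothetical, and trimming whitespace is assumed so that a file written with `echo` still matches.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// isLeader reports whether the shared file at path names this manager.
// Trimming surrounding whitespace is an assumption of this sketch; it lets a
// file written with `echo name > file` match despite the trailing newline.
func isLeader(path, name string) (bool, error) {
	content, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(content)) == name, nil
}

// checkWithTempFile writes leaderName to a temporary file and asks whether
// the manager called myName would consider itself the leader.
func checkWithTempFile(leaderName, myName string) (bool, error) {
	dir, err := os.MkdirTemp("", "leader")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "leader")
	if err := os.WriteFile(path, []byte(leaderName+"\n"), 0o644); err != nil {
		return false, err
	}
	return isLeader(path, myName)
}

func main() {
	for _, name := range []string{"group", "group2"} {
		ok, err := checkWithTempFile("group", name)
		fmt.Printf("name=%s leader=%v err=%v\n", name, ok, err)
	}
}
```
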
### Swarm mode (via the `swarm` subcommand)

1. Assumes there is one InfraKit manager instance per Docker Swarm manager node.
2. Leadership depends on the status of the Swarm manager node. If the Swarm manager node is the
   leader, then the InfraKit manager instance running on that node is the leader.
3. When leadership changes in the Swarm, InfraKit leadership follows.

When an instance assumes leadership:

+ State is retrieved from shared storage (see below) and, for each group in the config, a group
  `watch` is invoked so that the new leader can begin watching the groups.
+ Since the manager is the frontend for the stateless group plugin, it records any input the user
  provides when the user performs an update. The new config is written to the shared store and
  `update` is forwarded to the actual group plugin to do the real work.

When an instance loses leadership:

+ The manager uses the previous configuration and 'deactivates' the local group plugin by calling `unwatch`
  on the downstream group plugin.
+ It rejects the user's attempts to `update`, since it is not the leader.

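The two transitions above can be read as a small state machine. The following is a hedged sketch rather than the manager's implementation: the `group` interface, the `manager` struct, `errNotLeader` and `logGroup` are hypothetical, and an in-memory map stands in for the shared store.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotLeader is returned when a non-leader is asked to update a group.
var errNotLeader = errors.New("not the leader")

// group is a hypothetical stand-in for the downstream (stateless) group plugin.
type group interface {
	Watch(name, config string) error
	Unwatch(name string) error
	Update(name, config string) error
}

// manager tracks leadership and the last known config of each group.
type manager struct {
	leader     bool
	configs    map[string]string // stands in for the shared store
	downstream group
}

// onLeadershipChange watches every stored group when leadership is gained
// and unwatches every group when leadership is lost.
func (m *manager) onLeadershipChange(isLeader bool) error {
	if isLeader == m.leader {
		return nil // no transition, nothing to do
	}
	m.leader = isLeader
	for name, config := range m.configs {
		var err error
		if isLeader {
			err = m.downstream.Watch(name, config)
		} else {
			err = m.downstream.Unwatch(name)
		}
		if err != nil {
			return fmt.Errorf("group %q: %w", name, err)
		}
	}
	return nil
}

// Update is rejected unless this instance currently holds leadership.
func (m *manager) Update(name, config string) error {
	if !m.leader {
		return errNotLeader
	}
	m.configs[name] = config
	return m.downstream.Update(name, config)
}

// logGroup is a fake plugin that records the calls it receives.
type logGroup struct{ calls []string }

func (g *logGroup) Watch(name, _ string) error {
	g.calls = append(g.calls, "watch "+name)
	return nil
}

func (g *logGroup) Unwatch(name string) error {
	g.calls = append(g.calls, "unwatch "+name)
	return nil
}

func (g *logGroup) Update(name, _ string) error {
	g.calls = append(g.calls, "update "+name)
	return nil
}

func main() {
	g := &logGroup{}
	m := &manager{configs: map[string]string{"workers": "{}"}, downstream: g}

	fmt.Println("update before leadership:", m.Update("workers", "{}"))
	fmt.Println("gain leadership:", m.onLeadershipChange(true))
	fmt.Println("update as leader:", m.Update("workers", `{"size": 3}`))
	fmt.Println("lose leadership:", m.onLeadershipChange(false))
	fmt.Println("calls:", g.calls)
}
```
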
## State Storage

The manager can use either `os` or `swarm` for state storage:

### OS mode (via the `os` subcommand)

1. State is stored in a local file at a well-known location that is defined when the manager starts.
2. This file is a global config that can include multiple groups.

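A single-file global config of this kind might be saved and loaded as sketched below. This is an illustration under stated assumptions, not the `store/file` package: the JSON layout, the `saveSnapshot`, `loadSnapshot` and `roundTrip` helpers, and the write-then-rename step are all hypothetical. Only the file name `global.config` comes from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// saveSnapshot writes every group's config into one JSON file at dir/name.
// Writing to a temporary file and renaming it is an assumption of this
// sketch; it keeps readers from seeing a half-written config.
func saveSnapshot(dir, name string, groups map[string]string) error {
	data, err := json.MarshalIndent(groups, "", "  ")
	if err != nil {
		return err
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	tmp := filepath.Join(dir, name+".tmp")
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, filepath.Join(dir, name))
}

// loadSnapshot reads the global config back into a map keyed by group name.
func loadSnapshot(dir, name string) (map[string]string, error) {
	data, err := os.ReadFile(filepath.Join(dir, name))
	if err != nil {
		return nil, err
	}
	groups := map[string]string{}
	if err := json.Unmarshal(data, &groups); err != nil {
		return nil, err
	}
	return groups, nil
}

// roundTrip saves groups into a temporary directory and loads them back.
func roundTrip(groups map[string]string) (map[string]string, error) {
	dir, err := os.MkdirTemp("", "infrakit-store")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	if err := saveSnapshot(dir, "global.config", groups); err != nil {
		return nil, err
	}
	return loadSnapshot(dir, "global.config")
}

func main() {
	got, err := roundTrip(map[string]string{
		"managers": `{"size": 3}`,
		"workers":  `{"size": 5}`,
	})
	fmt.Println(got, err)
}
```
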
### Swarm mode (via the `swarm` subcommand)

1. State is stored in the Swarm via annotations.
2. A single global state is stored in a single annotation. The data is compressed and encoded.

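The README does not say which compression or encoding is used. As a sketch only, assuming gzip for compression and base64 for encoding (both are assumptions, as are the `encodeState` and `decodeState` names), packing a global state into a single string value could look like this:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
)

// encodeState marshals the state to JSON, gzips it and base64-encodes the
// result so it can be stored as a single string value.
func encodeState(state map[string]string) (string, error) {
	raw, err := json.Marshal(state)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return "", err
	}
	// Close flushes the remaining compressed bytes into buf.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decodeState reverses encodeState: base64-decode, gunzip, then unmarshal.
func decodeState(encoded string) (map[string]string, error) {
	compressed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	raw, err := io.ReadAll(zr)
	if err != nil {
		return nil, err
	}
	state := map[string]string{}
	if err := json.Unmarshal(raw, &state); err != nil {
		return nil, err
	}
	return state, nil
}

func main() {
	encoded, err := encodeState(map[string]string{"workers": `{"size": 5}`})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	decoded, err := decodeState(encoded)
	fmt.Println(len(encoded), decoded, err)
}
```
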
## Frontend (Proxy) for Group

The manager requires a group plugin to be running so that it can forward calls to it; the group
plugin performs the actual work of watching and updating:

+ When you intend to use the manager, start your default group plugin with a name like
  `group-stateless`.
+ Then, when starting the manager, set the `--proxy-for-group` flag to the name of the group plugin
  (e.g. `group-stateless`). By default, the manager starts up with the name `group`. This matches
  the default name that the CLI (`infrakit group ...`) uses.


## Running

```shell
$ make binaries
$ build/infrakit-manager -h
Manager

Usage:
  infrakit-manager [command]

Available Commands:
  os          os
  swarm       swarm mode for leader detection and storage
  version     print build version information

Flags:
      --log int                  Logging level. 0 is least verbose. Max is 5 (default 4)
      --name string              Name of the manager (default "group")
      --proxy-for-group string   Name of the group plugin to proxy for. (default "group-stateless")

Use "infrakit-manager [command] --help" for more information about a command.
```

### Running in OS Mode

Useful for local testing:

```shell
$ infrakit-manager os --log 5
```

### Running in Swarm Mode

First enable Swarm mode:

```shell
$ docker swarm init
```

On each Swarm manager node, run:

```shell
$ infrakit-manager swarm --log 5
```

The manager connects to Docker using the default Docker socket.


## Example -- Running Locally

You can use the `os` subcommand to run the manager in local (OS) mode, where a
shared file is used to determine leadership.

1. Start the plugins that your config references. Note that the
   usual Group plugin is renamed `group-stateless`.

```shell
$ make binaries
$ build/infrakit-group-default --name group-stateless &
$ build/infrakit-instance-file &
$ build/infrakit-flavor-vanilla &
```

2. Use a local file for leadership, for example `/tmp/leader`:

```shell
$ echo group > /tmp/leader
```

3. Start the manager with the name `group`:

```shell
$ build/infrakit-manager os --log 5 --proxy-for-group group-stateless --leader-file /tmp/leader --name group
DEBU[0000] Opening: /Users/myuser/.infrakit/plugins
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/group-stateless
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-flavor-vanilla
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-instance-file
INFO[0000] Starting up manager: &{group 0xc4202ce7e0 0xc4202e8810 0xc4202ce940 group-stateless}
DEBU[0000] Opening: /Users/myuser/.infrakit/plugins
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/group-stateless
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-flavor-vanilla
DEBU[0000] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-instance-file
INFO[0000] Manager starting
INFO[0000] Listening at: /Users/myuser/.infrakit/plugins/group
DEBU[0005] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
DEBU[0005] leader: true
INFO[0005] Assuming leadership
INFO[0005] Loaded snapshot. err= <nil>
INFO[0005] Start watching groups
DEBU[0005] Opening: /Users/myuser/.infrakit/plugins
DEBU[0005] Discovered plugin at /Users/myuser/.infrakit/plugins/group
DEBU[0005] Discovered plugin at /Users/myuser/.infrakit/plugins/group-stateless
DEBU[0005] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-flavor-vanilla
DEBU[0005] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-instance-file
INFO[0005] Processing group managers with plugin group-stateless
DEBU[0005] exec on group managers plugin= group-stateless
INFO[0005] WATCH group managers with spec: {managers 0xc420122600}
INFO[0005] Processing group workers with plugin group-stateless
DEBU[0005] exec on group workers plugin= group-stateless
INFO[0005] WATCH group workers with spec: {workers 0xc420122a60}
DEBU[0010] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
DEBU[0015] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
DEBU[0020] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
DEBU[0025] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
DEBU[0030] ID (group) - checked /tmp/leader for leadership: group, err=<nil>, leader=true
```

You should see the current instance detect that it is the leader, since `$(cat /tmp/leader) == 'group'`.
You can change the leadership by changing the content of the file `/tmp/leader`:

```shell
$ echo group2 > /tmp/leader
```

The instance should then detect that it is no longer the leader and unwatch any groups it was watching.

```shell
DEBU[0150] leader: false
INFO[0150] Lost leadership
INFO[0150] Unwatching groups
DEBU[0150] Opening: /Users/myuser/.infrakit/plugins
DEBU[0150] Discovered plugin at /Users/myuser/.infrakit/plugins/group
DEBU[0150] Discovered plugin at /Users/myuser/.infrakit/plugins/group-stateless
DEBU[0150] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-flavor-vanilla
DEBU[0150] Discovered plugin at /Users/myuser/.infrakit/plugins/infrakit-instance-file
INFO[0150] Processing group managers with plugin group-stateless
DEBU[0150] exec on group managers plugin= group-stateless
INFO[0150] UNWATCH group managers with spec: {managers 0xc420123580}
INFO[0150] Processing group workers with plugin group-stateless
DEBU[0150] exec on group workers plugin= group-stateless
INFO[0150] UNWATCH group workers with spec: {workers 0xc4200cb880}
DEBU[0155] ID (group) - checked /tmp/leader for leadership: group2, err=<nil>, leader=false
DEBU[0160] ID (group) - checked /tmp/leader for leadership: group2, err=<nil>, leader=false
```
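
The logs above show a leadership check every five seconds, with `leader: true` or `leader: false` logged only when the status changes. A minimal sketch of that polling pattern follows. It is not the `leader/file` detector: the `poll` and `transitions` functions are hypothetical, and reporting the first observation as a change is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// poll calls check once per interval, n times in total, and sends on the
// returned channel only when the observed leadership status changes.
// The first observation is always reported (an assumption of this sketch).
func poll(check func() bool, interval time.Duration, n int) <-chan bool {
	changes := make(chan bool)
	go func() {
		defer close(changes)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()

		known, last := false, false
		for i := 0; i < n; i++ {
			<-ticker.C
			now := check()
			if !known || now != last {
				changes <- now
				known, last = true, now
			}
		}
	}()
	return changes
}

// transitions feeds a fixed series of observations through poll and collects
// the changes that come out.
func transitions(samples []bool) []bool {
	i := 0
	check := func() bool {
		v := samples[i]
		i++
		return v
	}
	var out []bool
	for change := range poll(check, time.Millisecond, len(samples)) {
		out = append(out, change)
	}
	return out
}

func main() {
	// Leader for two polls, not leader for two polls, then leader again.
	fmt.Println(transitions([]bool{true, true, false, false, true}))
}
```

Emitting only the changes keeps the consumer simple: it reacts once per transition (watch or unwatch all groups) instead of on every poll.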

cmd/manager/main.go

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
package main

import (
	"os"
	"path/filepath"

	log "github.com/Sirupsen/logrus"
	"github.com/docker/infrakit/cli"
	"github.com/docker/infrakit/discovery"
	"github.com/docker/infrakit/leader"
	"github.com/docker/infrakit/manager"
	"github.com/docker/infrakit/rpc"
	group_rpc "github.com/docker/infrakit/rpc/group"
	"github.com/docker/infrakit/store"
	"github.com/spf13/cobra"
)

type backend struct {
	id         string
	plugins    discovery.Plugins
	leader     leader.Detector
	snapshot   store.Snapshot
	pluginName string // This is the name of the stateless group plugin that the manager will proxy for.
}

func main() {

	logLevel := cli.DefaultLogLevel
	backend := &backend{}

	cmd := &cobra.Command{
		Use:   filepath.Base(os.Args[0]),
		Short: "Manager",
		PersistentPreRun: func(c *cobra.Command, args []string) {
			cli.SetLogLevel(logLevel)
		},
	}
	cmd.PersistentFlags().IntVar(&logLevel, "log", logLevel, "Logging level. 0 is least verbose. Max is 5")
	cmd.PersistentFlags().StringVar(&backend.id, "name", "group", "Name of the manager")
	cmd.PersistentFlags().StringVar(&backend.pluginName, "proxy-for-group", "group-stateless", "Name of the group plugin to proxy for.")

	cmd.AddCommand(cli.VersionCommand(), osEnvironment(backend), swarmEnvironment(backend))

	err := cmd.Execute()
	if err != nil {
		log.Error(err)
		os.Exit(1)
	}
}

func runMain(backend *backend) error {

	log.Infoln("Starting up manager:", backend)

	manager, err := manager.NewManager(backend.plugins,
		backend.leader, backend.snapshot, backend.pluginName)
	if err != nil {
		return err
	}

	_, err = manager.Start()
	if err != nil {
		return err
	}

	_, stopped, err := rpc.StartPluginAtPath(
		filepath.Join(discovery.Dir(), backend.id),
		group_rpc.PluginServer(manager),
	)
	if err != nil {
		return err
	}

	<-stopped // block until done

	manager.Stop()
	log.Infoln("Manager stopped")

	return err
}

cmd/manager/os.go

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
package main

import (
	"os"
	"os/user"
	"path/filepath"
	"time"

	"github.com/docker/infrakit/discovery"
	file_leader "github.com/docker/infrakit/leader/file"
	file_store "github.com/docker/infrakit/store/file"
	"github.com/spf13/cobra"
)

const (
	// LeaderFileEnvVar is the environment variable that may be used to customize the plugin leader detection
	LeaderFileEnvVar = "INFRAKIT_LEADER_FILE"

	// StoreDirEnvVar is the directory where the configs are stored
	StoreDirEnvVar = "INFRAKIT_STORE_DIR"
)

func getHome() string {
	if usr, err := user.Current(); err == nil {
		return usr.HomeDir
	}
	return os.Getenv("HOME")
}

func defaultLeaderFile() string {
	if leaderFile := os.Getenv(LeaderFileEnvVar); leaderFile != "" {
		return leaderFile
	}
	return filepath.Join(getHome(), ".infrakit/leader")
}

func defaultStoreDir() string {
	if storeDir := os.Getenv(StoreDirEnvVar); storeDir != "" {
		return storeDir
	}
	return filepath.Join(getHome(), ".infrakit/configs")
}

func osEnvironment(backend *backend) *cobra.Command {

	var pollInterval time.Duration
	var filename, storeDir string

	cmd := &cobra.Command{
		Use:   "os",
		Short: "os",
		RunE: func(c *cobra.Command, args []string) error {

			plugins, err := discovery.NewPluginDiscovery()
			if err != nil {
				return err
			}

			leader, err := file_leader.NewDetector(pollInterval, filename, backend.id)
			if err != nil {
				return err
			}

			snapshot, err := file_store.NewSnapshot(storeDir, "global.config")
			if err != nil {
				return err
			}

			backend.plugins = plugins
			backend.leader = leader
			backend.snapshot = snapshot

			return runMain(backend)
		},
	}
	cmd.Flags().StringVar(&filename, "leader-file", defaultLeaderFile(), "File used for leader election/detection")
	cmd.Flags().StringVar(&storeDir, "store-dir", defaultStoreDir(), "Dir to store the config")
	cmd.Flags().DurationVar(&pollInterval, "poll-interval", 5*time.Second, "Leader polling interval")
	return cmd
}
