Add doc on multi-az cassandra deployments #267
kragniz
commented
Feb 28, 2018
wallrj
left a comment
Thanks @kragniz
I left a few comments below.
I propose we adhere to http://rhodesmill.org/brandon/2012/one-sentence-per-line/ in our docs.
It'll make diffs easier to read and review comments easier to pinpoint.
docs/cassandra/multi-az.md
Outdated
| @@ -0,0 +1,51 @@ | |||
| Cassandra across multiple availability zones | |||
Maybe Use Title Case
docs/cassandra/multi-az.md
Outdated
|
|
||
| Navigator supports running Cassandra with rack and datacenter-aware | ||
| replication. To deploy this, you must run a `nodePool` in each availability | ||
| zone, and mark each as a separate Cassandra rack. |
Add a link to Cassandra docs on DC / rack awareness.
docs/cassandra/multi-az.md
Outdated
|
|
||
| The `nodeSelector` field of a nodePool allows scheduling the nodePool to a set | ||
| of nodes matching labels. This should be used with a node label such as | ||
| `failure-domain.beta.kubernetes.io/zone`. |
❔ Maybe add a link to the Kubernetes failure-domain label docs, if there are any.
docs/cassandra/multi-az.md
Outdated
|
|
||
| The `datacenter` and `rack` fields mark all Cassandra nodes in a nodepool as | ||
| being located in that datacenter and rack. This information can then be used | ||
| with the `NetworkTopologyStrategy` keyspace replica placement strategy. |
❔ Link to NetworkTopologyStrategy docs.
| - name: "np-europe-west1-b" | ||
| replicas: 3 | ||
| datacenter: "europe-west1" | ||
| rack: "europe-west1-b" |
❔ This makes me wonder what happens if I used rack: "b". Does Cassandra know that rack b in europe-west1 is different than rack b in us-west1?
And if not, perhaps we should add a note that rack labels must be unique.
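To make the question concrete, here is a sketch of two node pools that reuse the short rack name `b` in different datacenters. The `nodePools` key and the second pool name are assumptions for illustration; the `replicas`, `datacenter`, and `rack` fields follow the example in the patch.

```yaml
# Hypothetical fragment: the same short rack name used in two datacenters.
nodePools:
- name: "np-europe-west1-b"
  replicas: 3
  datacenter: "europe-west1"
  rack: "b"                 # rack "b" in europe-west1
- name: "np-us-west1-b"     # hypothetical pool name
  replicas: 3
  datacenter: "us-west1"
  rack: "b"                 # rack "b" in us-west1
```

If Cassandra scopes rack names by datacenter, these would be two distinct racks and no uniqueness note is needed; if it does not, the docs should say so.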
Ec2Snitch simply labels a rack in us-east-1a as 1a, so it seems Cassandra knows the difference:
The current version is failing with an obscure sed command.
|
The current hack/verify-links.sh is failing to parse the markdown added in this patch, so I've updated it with the version in https://github.com/duglin/vlinker which seems to work better. |
|
|
||
| The `nodeSelector` field of a nodePool allows scheduling the nodePool to a set of nodes matching labels. | ||
| This should be used with a node label such as | ||
| [`failure-domain.beta.kubernetes.io/zone`](https://kubernetes.io/docs/reference/labels-annotations-taints/#failure-domainbetakubernetesiozone). |
Would be good to add a link out to the official docs on nodeSelector. It's not a concept we've invented and there are probably better descriptions out there that we want to include 😄
👍
docs/cassandra/multi-az.md
Outdated
| [`NetworkTopologyStrategy`](http://cassandra.apache.org/doc/latest/architecture/dynamo.html#network-topology-strategy) | ||
| keyspace replica placement strategy. | ||
|
|
||
| As an example, the nodePool section of a CassandraCluster spec for deploying into GKE in europe-west1: |
... "with rack awareness/network topology strategy/whatever it is called enabled"
| datacenter: "europe-west1" | ||
| rack: "europe-west1-c" | ||
| nodeSelector: | ||
| failure-domain.beta.kubernetes.io/zone: "europe-west1-c" |
Did we drop the "auto-pull zone from nodeSelector into rack attribute" functionality for now?
Yes, I figure it's better to be explicit about it for now. We can add it in later if we think it'll be useful.
|
As we default the 'rack' to the node pool name, perhaps we should also document how someone would disable rack awareness? I worry that we've made it harder for users to do this, and it might be unexpected during some failure scenario (and difficult to change after initial deployment). Otherwise lgtm - happy to add the lgtm label once you've responded to this query. |
|
I updated with two sections - with/without rack awareness |
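A rough sketch of what the "without rack awareness" case could look like, assuming it is done by giving every node pool the same static `rack` value so that the default (the node pool name) no longer applies. The `nodePools` key and the rack name `default-rack` are placeholders, not taken from the patch.

```yaml
# Hypothetical fragment: every pool shares one rack name, so Cassandra
# sees a single rack even though the pods run in different zones.
nodePools:
- name: "np-europe-west1-b"
  replicas: 3
  datacenter: "europe-west1"
  rack: "default-rack"      # placeholder; same value in every pool
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-b"
- name: "np-europe-west1-c"
  replicas: 3
  datacenter: "europe-west1"
  rack: "default-rack"      # placeholder; same value in every pool
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-c"
```

Under that assumption the `nodeSelector` still spreads pods across zones, but replica placement no longer takes the zone into account.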
wallrj
left a comment
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED.
This pull-request has been approved by: wallrj.
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
|
/test all [submit-queue is verifying that this PR is safe to merge] |
|
Automatic merge from submit-queue. |