Commit 687f186

Add v1 Deployment & Ops Skills Taxonomy (#19400)
Fixes DOC-12354

NB. Changes ported to supported versions v23.1+
1 parent c08bd98 commit 687f186

12 files changed: 597 additions, 0 deletions

src/current/_includes/v23.1/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [

src/current/_includes/v23.2/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [

src/current/_includes/v24.1/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [

src/current/_includes/v24.3/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [

src/current/_includes/v25.1/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [

src/current/_includes/v25.2/sidebar-data/self-hosted-deployments.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@
       "/${VERSION}/recommended-production-settings.html"
     ]
   },
+  {
+    "title": "Deployment and Operations Skills Taxonomy",
+    "urls": [
+      "/${VERSION}/deployment-operations-skills-taxonomy.html"
+    ]
+  },
   {
     "title": "Deploy Locally",
     "items": [
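
Note: the sidebar hunks above all make the same change, placing the new entry directly after the entry whose `urls` contains `recommended-production-settings.html` and before "Deploy Locally". A minimal Python sketch of that edit follows. It is not how the commit was produced; the function name `add_taxonomy_entry`, the sample list, and the "Production Checklist" title of the anchor entry are assumptions for illustration.

```python
import json

# Entry text copied from the added lines in the hunks above.
NEW_ENTRY = {
    "title": "Deployment and Operations Skills Taxonomy",
    "urls": ["/${VERSION}/deployment-operations-skills-taxonomy.html"],
}

# URL of the entry that precedes the new one in every hunk.
ANCHOR_URL = "/${VERSION}/recommended-production-settings.html"


def add_taxonomy_entry(items):
    """Return a new list with NEW_ENTRY placed right after the anchor entry.

    Idempotent: if an entry with the same title already exists, a plain copy
    of the list is returned. Raises ValueError if the anchor entry is missing.
    """
    if any(item.get("title") == NEW_ENTRY["title"] for item in items):
        return list(items)
    for index, item in enumerate(items):
        if ANCHOR_URL in item.get("urls", []):
            return items[: index + 1] + [dict(NEW_ENTRY)] + items[index + 1:]
    raise ValueError("anchor entry not found")


# Hypothetical fragment: only the two neighbours visible in the diff context.
# The anchor entry's title is assumed; the diff does not show it.
sidebar_items = [
    {"title": "Production Checklist", "urls": [ANCHOR_URL]},
    {"title": "Deploy Locally", "items": []},
]

updated = add_taxonomy_entry(sidebar_items)
print(json.dumps([item["title"] for item in updated], indent=2))
```

Matching on the anchor URL rather than on a list index keeps the sketch independent of how many entries precede it in each version's file.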
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+---
+title: Deployment & Operations Skills Taxonomy
+summary: Learn the foundational skills required to deploy and operate CockroachDB
+toc: true
+docs_area: deploy
+---
+
+This document outlines the foundational skills required to deploy and operate CockroachDB in production environments.
+
+The skills are organized into sections based on the following operational domains:
+
+- [Infrastructure configuration](#infrastructure-configuration)
+- [Security](#security)
+- [Cluster maintenance](#cluster-maintenance)
+- [Troubleshooting](#troubleshooting)
+- [Disaster recovery](#disaster-recovery)
+
+Each section includes links to relevant documentation for the listed skills.
+
+{{site.data.alerts.callout_success}}
+Cockroach Labs offers [Professional Services](https://www.cockroachlabs.com/company/professional-services/) that can assist you with getting applications into production faster and more efficiently.
+{{site.data.alerts.end}}
+
+## Infrastructure configuration
+
+This section covers how to ensure that your hardware and network are properly configured to meet the performance and connectivity requirements of CockroachDB.
+
+- [Verify vCPU, RAM, storage, and disk IOPS performance]({% link {{ page.version.version }}/recommended-production-settings.md %}#hardware)
+- [Configure time synchronization with NTP server]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}#step-1-synchronize-clocks)
+- [Validate network connectivity]({% link {{ page.version.version }}/known-limitations.md %}#cockroachdb-does-not-test-for-all-connection-failure-scenarios)
+
+## Security
+
+This section covers how to secure a CockroachDB deployment, including certificate management, load balancing setup, role-based access control, and data encryption.
+
+- [Create and distribute certificates; initialize cluster]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}#step-2-generate-certificates)
+- [Configure load balancer and direct a workload]({% link {{ page.version.version }}/deploy-cockroachdb-on-premises.md %}#step-6-set-up-load-balancing)
+- [Configure RBAC]({% link {{ page.version.version }}/security-reference/authorization.md %})
+- [Encryption at rest]({% link {{ page.version.version }}/encryption.md %})
+
+## Cluster maintenance
+
+This section covers how to manage the lifecycle of CockroachDB nodes, including adding and removing nodes, handling outages, performing upgrades or downgrades, and modifying cluster settings.
+
+- [Shut down a node gracefully]({% link {{ page.version.version }}/node-shutdown.md %})
+- [Handle unplanned node outages]({% link {{ page.version.version }}/recommended-production-settings.md %}#load-balancing)
+- [Add nodes]({% link {{ page.version.version }}/cockroach-start.md %}#add-a-node-to-a-cluster)
+- [Remove nodes]({% link {{ page.version.version }}/node-shutdown.md %}?filters=decommission#remove-nodes)
+- [Add a region]({% link {{ page.version.version }}/alter-database.md %}#add-regions-to-a-database)
+- [Remove a region]({% link {{ page.version.version }}/alter-database.md %}#drop-region)
+- [Rolling upgrades]({% link {{ page.version.version }}/upgrade-cockroach-version.md %})
+- Downgrade a cluster from a [patch or major version]({% link {{ page.version.version }}/upgrade-cockroach-version.md %}#step-5-roll-back-the-upgrade-optional)
+- [Change a cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}#change-a-cluster-setting)
+- Repave a cluster: cluster repaving involves the following individual skills, which are also used during [rolling upgrades]({% link {{ page.version.version }}/upgrade-cockroach-version.md %}):
+    1. [Shut down a node gracefully]({% link {{ page.version.version }}/node-shutdown.md %})
+    1. Detach the [persistent volume]({% link {{ page.version.version }}/kubernetes-overview.md %}#kubernetes-terminology) (a.k.a. persistent disk) from the removed node's virtual machine (VM) (this step is optional but recommended)
+    1. Delete the removed node's VM
+    1. Start a new VM
+    1. Reattach the persistent disk to the new VM (necessary if you did step #2)
+    1. [Add a node to the cluster]({% link {{ page.version.version }}/cockroach-start.md %}#add-a-node-to-a-cluster) from the new VM
+
+## Troubleshooting
+
+This section contains a list of common issues related to SQL performance, cluster stability, memory usage, load balancing, and changefeed lag.
+
+- [SQL response time for specific queries]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#query-issues)
+- [SQL throughput degradation across the board]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#low-throughput)
+- [Cluster instability: Dead/suspect nodes]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#node-liveness-issues)
+- [Out of memory (OOM) problems]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#out-of-memory-oom-crash)
+- [Imbalanced cluster load]({% link {{ page.version.version }}/architecture/replication-layer.md %}#load-based-replica-rebalancing)
+- [End of file (EOF) errors]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#client-connection-issues)
+- [Changefeed is falling behind]({% link {{ page.version.version }}/advanced-changefeed-configuration.md %}#lagging-ranges)
+- [Gather diagnostic data from a "debug zip" file]({% link {{ page.version.version }}/cockroach-debug-zip.md %})
+- [Collect timeseries diagnostic data from a "tsdump" file]({% link {{ page.version.version }}/cockroach-debug-tsdump.md %})
+
+## Disaster recovery
+
+This section covers how to set up and manage backup and restore of your cluster to ensure data recovery in case of failures.
+
+- [Create AWS IAM access key]({% link {{ page.version.version }}/cloud-storage-authentication.md %})
+- [Create S3 bucket for backup data]({% link {{ page.version.version }}/use-cloud-storage.md %}#amazon-s3-storage-classes)
+- [Full cluster backup to S3]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#full-backups)
+- [Incremental backup to S3]({% link {{ page.version.version }}/take-full-and-incremental-backups.md %}#incremental-backups)
+- [Cluster restore from AWS S3]({% link {{ page.version.version }}/restore.md %}#restore-a-cluster)
+
+## See also
+
+- [Production Checklist]({% link {{ page.version.version }}/recommended-production-settings.md %})
+- [Manual Deployment]({% link {{ page.version.version }}/manual-deployment.md %})
+- [Deploy a Local Cluster from Binary (Secure)]({% link {{ page.version.version }}/secure-a-cluster.md %})
+- [SQL Performance Best Practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %})
+- [Performance Tuning Recipes]({% link {{ page.version.version }}/performance-recipes.md %})
+- [Troubleshoot Self-Hosted Setup]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %})
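
Note: the repave steps listed under "Cluster maintenance" in the added page have one conditional dependency: reattaching the persistent disk is needed only if the disk was detached in step 2. The Python sketch below lays out that ordering as a plan of plain-text steps. It performs no cluster or cloud operations; the function name `repave_plan`, the node names, and the choice to process one node at a time are assumptions, not part of the commit.

```python
def repave_plan(node_names, detach_disk=True):
    """Build the ordered repave steps for each node.

    Assumption: nodes are handled one at a time, as in a rolling upgrade.
    The detach and reattach steps appear together or not at all, mirroring
    "necessary if you did step #2" in the added page.
    """
    plan = []
    for name in node_names:
        steps = [f"shut down node {name} gracefully"]
        if detach_disk:
            steps.append(f"detach the persistent disk from the VM of {name}")
        steps.append(f"delete the VM of {name}")
        steps.append(f"start a new VM for {name}")
        if detach_disk:
            steps.append(f"reattach the persistent disk to the new VM for {name}")
        steps.append(f"add a node to the cluster from the new VM for {name}")
        plan.append((name, steps))
    return plan


# Hypothetical node names, used only to show the resulting order.
for name, steps in repave_plan(["node-1", "node-2"]):
    print(name)
    for number, step in enumerate(steps, start=1):
        print(f"  {number}. {step}")
```

With `detach_disk=False` the plan drops to four steps per node, since the new VM then starts with a fresh disk.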
