This repository provisions infrastructure resources on Azure for deploying Datafold using the datafold-operator.
The module provisions Azure infrastructure resources that are required for Datafold deployment. Application configuration is now managed through the datafoldapplication custom resource on the cluster using the datafold-operator, rather than through Terraform application directories.
- The `application` directory is no longer part of this repository.
- Application configuration is now managed through the `datafoldapplication` custom resource on the cluster.
Note: Unlike the AWS module, the Azure module deploys an Application Gateway as the load balancer by default, because Azure Application Gateway provides better integration with AKS and is the recommended approach for Azure deployments. Set `deploy_lb = false` to deploy the load balancer via helm-charts/kubernetes instead.
- An Azure subscription, preferably a new isolated one.
- Terraform >= 1.4.6
- A customer contract with Datafold
- The application does not work without credentials supplied by sales
- Access to our public helm-charts repository
The full deployment will create the following resources:
- Azure Virtual Network
- Azure subnets
- Azure Blob Storage for ClickHouse backups
- Azure Application Gateway (optional; enabled by default via `deploy_lb`)
- Azure certificate (if the load balancer is enabled)
- Azure bastion
- Azure jump VM
- Three Azure managed disks for local data storage
- Azure PostgreSQL database
- An AKS cluster
- Service accounts for the AKS cluster to perform actions outside of its cluster boundary:
- Provisioning existing managed disks
- Updating application gateway to point to specific pods in the cluster
- Rescaling the nodegroup between 1-2 nodes
Infrastructure Dependencies: For a complete list of required infrastructure resources and detailed deployment guidance, see the Datafold Dedicated Cloud Azure Deployment Documentation.
- This module will not provision DNS names in your zone.
- See the example for a potential setup, which has dependencies on our helm-charts
Create the storage account and container for the Terraform state file:
- Use the files in `bootstrap` to create a Terraform state storage account and container.
- Run `./run_bootstrap.sh` to create them. Enter the `deployment_name` when prompted.
  - The `deployment_name` is important. It is used for the Kubernetes namespace, the Datadog unified logging tags, and in other places. Suggestion: `company-datafold`.
- Transfer the name of that storage account and container into `backend.hcl`.
- Set the `resource_group_name` and `location` where the storage account is stored. `backend.hcl` only determines where the Terraform state file is located.
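For illustration, a `backend.hcl` along these lines is what the init step expects. All values below are placeholders; substitute the storage account and container created by `./run_bootstrap.sh`, and keep the key set consistent with your `azurerm` backend configuration:

```hcl
# backend.hcl -- placeholder values, not real resources
resource_group_name  = "company-datafold-rg"       # where the state storage account lives
storage_account_name = "companydatafoldtfstate"    # created by ./run_bootstrap.sh
container_name       = "tfstate"
key                  = "infra.tfstate"
```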
The example directory contains a single deployment example for infrastructure setup.
Setting up the infrastructure:
- It is easiest if you have full admin access in the target subscription.
- Pre-create a symmetric encryption key that is used to encrypt/decrypt the secrets of this deployment.
  - Use the alias instead of the `mrk` link. Put that into `locals.tf`.
- Certificate requirements: pre-create and validate the certificate in your DNS, then refer to that certificate in `main.tf` using its domain name (replace `datafold.acme.com`).
- Change the settings in `locals.tf`:
  - `provider_region` = the region you want to deploy in.
  - `resource_group_name` = the resource group in which to deploy.
  - `kms_profile` = can be the same profile, unless you want the encryption key elsewhere.
  - `kms_key` = a pre-created symmetric KMS key. Its only purpose is the encryption/decryption of deployment secrets.
  - `deployment_name` = the name of the deployment, used in the Kubernetes namespace, container naming, and the Datadog "deployment" unified tag.
  - `azure_tenant_id` = the tenant ID to deploy in.
  - `azure_subscription_id` = the ID of the subscription to deploy in.
- Run `terraform init -backend-config=../backend.hcl` in the `infra` directory.
- Run `terraform apply` in the `infra` directory. This should complete OK.
  - Check in the console that you see the AKS cluster, the PostgreSQL database, etc.
  - If you enabled load balancer deployment, check for the Application Gateway as well.
  - The configuration values needed for application deployment are printed to the console after the apply completes.
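Concretely, a `locals.tf` with the settings above might look like the sketch below. Every value is a placeholder for illustration, not a real tenant, subscription, or key:

```hcl
# locals.tf -- all values are placeholders
locals {
  provider_region       = "eastus"
  resource_group_name   = "acme-datafold-rg"
  kms_profile           = "default"                # same profile unless the key lives elsewhere
  kms_key               = "alias/acme-datafold"    # alias of the pre-created symmetric key
  deployment_name       = "acme-datafold"
  azure_tenant_id       = "00000000-0000-0000-0000-000000000000"
  azure_subscription_id = "00000000-0000-0000-0000-000000000000"
}
```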
Application Deployment: After infrastructure is ready, deploy the application using the datafold-operator. Continue with the Datafold Helm Charts repository to deploy the operator manager and then the application through the operator. The operator is the default and recommended method for deploying Datafold.
This module is designed to provide the complete infrastructure stack for Datafold deployment. However, if you already have AKS infrastructure in place, you can choose to configure the required resources independently.
Required Infrastructure Components:
- AKS cluster with appropriate node pools
- Azure Database for PostgreSQL
- Azure Storage account for ClickHouse backups
- Azure managed disks for persistent storage (ClickHouse data, ClickHouse logs, Redis data)
- Managed identities and role assignments for cluster operations
- Azure Application Gateway (deployed by default by this module; toggled with `deploy_lb`)
- Virtual Network and networking components
- SSL certificate (must be pre-created and validated)
Alternative Approaches:
- Use this module: Provides complete infrastructure setup for new deployments
- Use existing infrastructure: Configure required resources manually or through other means
- Hybrid approach: Use this module for some components and existing infrastructure for others
For detailed specifications of each required component, see the Datafold Dedicated Cloud Azure Deployment Documentation. For application deployment instructions, continue with the Datafold Helm Charts repository to deploy the operator manager and then the application through the operator.
The terraform-azure-datafold module provides 41 resource name override variables that allow you to customize resource names according to your organization's naming standards or compliance requirements.
- Compliance: Meet organizational naming conventions
- Environment Separation: Different naming patterns for dev/staging/prod
- Multi-tenant: Unique identifiers for different customers/teams
- Integration: Match existing resource naming patterns
```hcl
module "azure" {
  source = "datafold/datafold/azure"

  # Standard configuration...
  deployment_name = "example-datafold"

  # Custom resource group name (set directly, no override needed)
  resource_group_name   = "prod-acme-datafold-rg"
  create_resource_group = false

  # Custom resource names via overrides
  aks_cluster_name_override     = "prod-acme-datafold-aks"
  storage_account_name_override = "prodacmedatafoldstorage"
  key_vault_name_override       = "prod-acme-datafold-kv"
  virtual_network_name_override = "prod-acme-datafold-vnet"
}
```

- Azure Storage Accounts: Max 24 chars, lowercase letters/numbers only
- Key Vault Names: Max 24 chars, alphanumeric and hyphens only
- Service Account Scopes: Update role assignment scopes when using overrides
- Storage Account Consistency: When overriding storage account names, ensure service account scopes reference the same name to avoid permission errors
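The storage-account naming rule above can be checked up front with a small shell sketch. The sample names are hypothetical, not values this module produces:

```shell
# Sketch: validate a proposed storage_account_name_override against
# Azure's storage-account rules (3-24 chars, lowercase letters and digits only).
validate_storage_account_name() {
  name="$1"
  if [ "${#name}" -lt 3 ] || [ "${#name}" -gt 24 ]; then
    echo "invalid: length must be 3-24"
    return 1
  fi
  case "$name" in
    *[!a-z0-9]*)
      echo "invalid: lowercase letters and digits only"
      return 1
      ;;
  esac
  echo "valid"
}

validate_storage_account_name "prodacmedatafoldstorage"             # prints "valid" (23 chars)
validate_storage_account_name "prod-acme-datafold-storage" || true  # prints "invalid: length must be 3-24" (26 chars)
```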
📖 For complete documentation and examples, see `examples/README.md`
Connecting to the AKS cluster requires 3 terminals in total.
- The first terminal is set up to access the VPC through the bastion.
- The second sets up a tunnel to the jumpbox.
- The third terminal is the one doing the work.
```shell
# Set up kube config
deployment_name="acme-datafold"
proxy_port="1081"
az aks get-credentials --resource-group "${deployment_name}-rg" --name "${deployment_name}-cluster"
kubectl config set "clusters.${deployment_name}-cluster.proxy-url" "socks5://localhost:${proxy_port}"
kubectl config set-context --current --namespace="${deployment_name}"
```

```shell
# Run in terminal 1: open an Azure Bastion tunnel into the VM
deployment_name="acme-datafold"
target="jumpbox-vm"
vm_id=$(az vm list --resource-group "${deployment_name}-rg" | jq -r '.[].id' | grep "${deployment_name}-${target}")
az network bastion tunnel --name "bastion" --resource-group "${deployment_name}-rg" --target-resource-id "${vm_id}" --resource-port 22 --port 50022
```

```shell
# Run in terminal 2 (authorized_keys are passed on by the cloud-init.txt file / jumpbox_custom_data):
proxy_port="1081"
ssh -i ~/.ssh/id_rsa -D "$proxy_port" -p 50022 [email protected] -N
```

```shell
# Run in terminal 3:
k9s
```

After deploying the application through the operator (see the Datafold Helm Charts repository), establish a shell into the `<deployment>-dfshell` container.
It is likely that the scheduler and server containers are crashing in a loop.
All we need to do is to run these commands:
```shell
./manage.py clickhouse create-tables
./manage.py database create-or-upgrade
./manage.py installation set-new-deployment-params
```
Now all containers should be up and running.
You can get more information from our documentation site:
https://docs.datafold.com/datafold-deployment/dedicated-cloud/azure
| Name | Version |
|---|---|
| acme | ~> 2.0 |
| azurerm | ~>4.35.0 |
| tls | ~> 3.0 |
| Name | Version |
|---|---|
| azurerm | ~>4.35.0 |
| Name | Source | Version |
|---|---|---|
| aks | ./modules/aks | n/a |
| clickhouse_backup | ./modules/clickhouse_backup | n/a |
| data_lake | ./modules/data_lake | n/a |
| database | ./modules/database | n/a |
| identity | ./modules/identity | n/a |
| key_vault | ./modules/key_vault | n/a |
| load_balancer | ./modules/load_balancer | n/a |
| networking | ./modules/networking | n/a |
| Name | Type |
|---|---|
| azurerm_resource_group.default | data source |
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| acme_config | The configuration for the provider of the DNS challenge | any |
n/a | yes |
| acme_provider | The name of the provider for the DNS challenge | string |
n/a | yes |
| adls_dns_link_name_override | Override for the name used in resource.azurerm_private_dns_zone_virtual_network_link.adls (modules/data_lake) | string |
"" |
no |
| adls_filesystem_name_override | Override for the name used in resource.azurerm_storage_data_lake_gen2_filesystem.adls (modules/data_lake) | string |
"" |
no |
| adls_private_dns_zone_name_override | Override for the name used in resource.azurerm_private_dns_zone.adls (modules/data_lake) | string |
"" |
no |
| adls_private_endpoint_name_override | Override for the name used in resource.azurerm_private_endpoint.adls (modules/data_lake) | string |
"" |
no |
| adls_storage_account_name_override | Override for the name used in resource.azurerm_storage_account.adls (modules/data_lake) | string |
"" |
no |
| aks_cluster_name_override | Override for the name used in resource.azurerm_kubernetes_cluster.default (modules/aks) | string |
"" |
no |
| aks_dns_prefix_override | Override for the dns_prefix used in resource.azurerm_kubernetes_cluster.default (modules/aks) | string |
"" |
no |
| aks_dns_service_ip | The IP address for the Kubernetes DNS service | string |
"172.16.0.10" |
no |
| aks_service_cidr | The CIDR block for the Kubernetes services | string |
"172.16.0.0/16" |
no |
| aks_sku_tier | The SKU tier for the cluster | string |
"Free" |
no |
| aks_subnet_cidrs | The CIDR block for the AKS subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| aks_subnet_name_override | Override for the name used in resource.azurerm_subnet.aks_subnet (modules/networking) | string |
"" |
no |
| aks_subnet_size | The size of the AKS subnet in number of IPs | number |
1024 |
no |
| aks_workload_identity_enabled | Flag to enable workload identity | bool |
true |
no |
| app_gw_subnet_cidrs | The CIDR block for the app gateway subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| app_gw_subnet_name_override | Override for the name used in resource.azurerm_subnet.app_gw_subnet (modules/networking) | string |
"" |
no |
| app_gw_subnet_size | The size of the app gateway subnet in number of IPs | number |
256 |
no |
| app_subnet_cidrs | The CIDR block for the app subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| app_subnet_name_override | Override for the name used in resource.azurerm_subnet.app_subnet (modules/networking) | string |
"" |
no |
| app_subnet_size | The size of the app subnet in number of IPs | number |
256 |
no |
| application_gateway_name_override | Override for the name used in resource.azurerm_application_gateway.default (modules/load_balancer) | string |
"" |
no |
| azure_bastion_subnet_cidrs | The CIDR block for the Azure Bastion subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| azure_bastion_subnet_name_override | Override for the name used in resource.azurerm_subnet.azure_bastion_subnet (modules/networking) | string |
"" |
no |
| azure_bastion_subnet_size | The size of the Azure Bastion subnet in number of IPs | number |
256 |
no |
| bastion_host_name_override | Override for the name used in resource.azurerm_bastion_host.bastion (modules/networking) | string |
"" |
no |
| bastion_public_ip_name_override | Override for the name used in resource.azurerm_public_ip.ip_bastion_host (modules/networking) | string |
"" |
no |
| ch_data_disk_iops | IOPS of volume | number |
3000 |
no |
| ch_data_disk_throughput | Throughput of volume | number |
1000 |
no |
| ch_logs_disk_iops | IOPS of volume | number |
3000 |
no |
| ch_logs_disk_throughput | Throughput of volume | number |
250 |
no |
| clickhouse_backup_container_name_override | Override for the name used in resource.azurerm_storage_container.clickhouse_backup (modules/clickhouse_backup) | string |
"" |
no |
| clickhouse_data_disk_name_override | Override for the name used in resource.azurerm_managed_disk.clickhouse_data | string |
"" |
no |
| clickhouse_data_size | ClickHouse data disk size in GB | number |
40 |
no |
| clickhouse_logs_disk_name_override | Override for the name used in resource.azurerm_managed_disk.clickhouse_logs | string |
"" |
no |
| clickhouse_logs_size | ClickHouse logs disk size in GB | number |
40 |
no |
| create_adls | Whether to create Azure Data Lake Storage | bool |
false |
no |
| create_database | Flag to toggle PostgreSQL database creation | bool |
true |
no |
| create_resource_group | Flag to toggle resource group creation | bool |
true |
no |
| custom_node_pools | Dynamic extra node pools | list(object({ |
[] |
no |
| database_backup_retention_days | PostgreSQL backup retention days | number |
7 |
no |
| database_dns_link_name_override | Override for the name used in resource.azurerm_private_dns_zone_virtual_network_link.database (modules/networking) | string |
"" |
no |
| database_name | Postgres database name | string |
"datafold" |
no |
| database_private_dns_zone_name_override | Override for the name used in resource.azurerm_private_dns_zone.database (modules/networking) | string |
"" |
no |
| database_sku | PostgreSQL SKU | string |
"GP_Standard_D2s_v3" |
no |
| database_storage_mb | PostgreSQL storage in MB. One of a predetermined set of values, see: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/postgresql_flexible_server#storage_mb | number |
32768 |
no |
| database_subnet_cidrs | The CIDR block for the database subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| database_subnet_name_override | Override for the name used in resource.azurerm_subnet.database_subnet (modules/networking) | string |
"" |
no |
| database_subnet_size | The size of the database subnet in number of IPs | number |
256 |
no |
| database_username | PostgreSQL username | string |
"datafold" |
no |
| deploy_lb | Flag to toggle load balancer creation. When false, load balancer should be deployed via helm-charts/kubernetes. | bool |
true |
no |
| deployment_name | The name of the deployment | string |
n/a | yes |
| disk_sku | Disk SKU type | string |
"StandardSSD_LRS" |
no |
| domain_name | The domain name for the load balancer. E.g. azure-dev.datafold.io | string |
n/a | yes |
| environment | The environment for the resources | string |
"dev" |
no |
| etcd_key_name_override | Override for the name used in resource.azurerm_key_vault_key.etcd (modules/key_vault) | string |
"" |
no |
| gw_private_ip_address | The private IP address of the gateway. Should be within the gateway subnet CIDR range. | string |
"10.0.9.10" |
no |
| identity_name_override | Override for the name used in resource.azurerm_user_assigned_identity.default (modules/identity) | string |
"" |
no |
| jumpbox_custom_data | Custom data for the jumpbox. Can be used to e.g. pass on ~/.ssh/authorized_keys with a cloud-init script. | string |
"" |
no |
| jumpbox_nsg_name_override | Override for the name used in resource.azurerm_network_security_group.jumpbox (modules/networking) | string |
"" |
no |
| jumpbox_public_ip_name_override | Override for the name used in resource.azurerm_public_ip.jumpbox (modules/networking) | string |
"" |
no |
| k8s_public_access_cidrs | List of CIDRs that are allowed to connect to the AKS control plane | list(string) |
n/a | yes |
| key_vault_name_override | Override for the name used in resource.azurerm_key_vault.default (modules/key_vault) | string |
"" |
no |
| lb_is_public | Flag that determines if LB is public | bool |
true |
no |
| linux_vm_name_override | Override for the name used in resource.azurerm_linux_virtual_machine.linux_vm (modules/networking) | string |
"" |
no |
| location | The Azure location where the resources will be created | string |
"" |
no |
| max_node_count | The maximum number of nodes in the node pool | number |
6 |
no |
| max_pods | The maximum number of pods that can run on a node | number |
50 |
no |
| min_node_count | The minimum number of nodes in the node pool | number |
1 |
no |
| node_pool_name | The name of the node pool | string |
"default" |
no |
| node_pool_node_count | The number of nodes in the pool | number |
1 |
no |
| node_pool_vm_size | The size of the VMs in the pool | string |
"Standard_DS2_v2" |
no |
| postgresql_database_name_override | Override for the name used in resource.azurerm_postgresql_flexible_server_database.main (modules/database) | string |
"" |
no |
| postgresql_major_version | PostgreSQL major version | string |
"15" |
no |
| postgresql_server_name_override | Override for the name used in resource.azurerm_postgresql_flexible_server.main (modules/database) | string |
"" |
no |
| private_cluster_enabled | Flag to enable private cluster | bool |
true |
no |
| private_endpoint_adls_subnet_cidrs | List of subnet CIDRs for ADLS private endpoints | list(string) |
[] |
no |
| private_endpoint_adls_subnet_name_override | Override for the name used in resource.azurerm_subnet.private_endpoint_adls (modules/networking) | string |
"" |
no |
| private_endpoint_adls_subnet_size | Size of the ADLS subnet (number of IP addresses) | number |
256 |
no |
| private_endpoint_storage_subnet_cidrs | The CIDR block for the private endpoint storage subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| private_endpoint_storage_subnet_name_override | Override for the name used in resource.azurerm_subnet.private_endpoint_storage (modules/networking) | string |
"" |
no |
| private_endpoint_storage_subnet_size | The size of the private endpoint storage subnet in number of IPs | number |
256 |
no |
| public_ip_name_override | Override for the name used in resource.azurerm_public_ip.default (modules/networking) | string |
"" |
no |
| redis_data_disk_name_override | Override for the name used in resource.azurerm_managed_disk.redis_data | string |
"" |
no |
| redis_data_size | Redis data disk size in GB | number |
50 |
no |
| redis_disk_iops | IOPS of redis volume | number |
3000 |
no |
| redis_disk_throughput | Throughput of redis volume | number |
125 |
no |
| resource_group_name | The name of the resource group where the resources will be created | string |
"" |
no |
| resource_group_tags | The tags to apply to the resource group | map(string) |
{} |
no |
| service_accounts | Map of service accounts and their configuration | map(object({ |
{} |
no |
| ssl_cert_name | The name of the SSL certificate to use for the load balancer. This needs to be referenced by the k8s azure-application-gateway ingress config. | string |
n/a | yes |
| ssl_certificate_name_override | Override for the name used in resource.azurerm_key_vault_certificate.ssl (modules/key_vault) | string |
"" |
no |
| storage_account_name_override | Override for the name used in resource.azurerm_storage_account.storage (modules/clickhouse_backup) | string |
"" |
no |
| storage_dns_link_name_override | Override for the name used in resource.azurerm_private_dns_zone_virtual_network_link.storage_account_link (modules/clickhouse_backup) | string |
"" |
no |
| storage_private_dns_zone_name_override | Override for the name used in resource.azurerm_private_dns_zone.storage_account_dns (modules/clickhouse_backup) | string |
"" |
no |
| storage_private_endpoint_name_override | Override for the name used in resource.azurerm_private_endpoint.storage (modules/clickhouse_backup) | string |
"" |
no |
| virtual_network_name_override | Override for the name used in resource.azurerm_virtual_network.vnet (modules/networking) | string |
"" |
no |
| virtual_network_tags | The tags to apply to the virtual network | map(string) |
{} |
no |
| vm_bastion_subnet_cidrs | The CIDR block for the VM Bastion subnet. If empty it will be calculated from the VPC CIDR and given size. | list(string) |
[] |
no |
| vm_bastion_subnet_name_override | Override for the name used in resource.azurerm_subnet.vm_bastion_subnet (modules/networking) | string |
"" |
no |
| vm_bastion_subnet_size | The size of the VM Bastion subnet in number of IPs | number |
256 |
no |
| vm_nic_name_override | Override for the name used in resource.azurerm_network_interface.vm_nic (modules/networking) | string |
"" |
no |
| vnet_nsg_name_override | Override for the name used in resource.azurerm_network_security_group.nsg_vnet (modules/networking) | string |
"" |
no |
| vpc_cidrs | The address space for the virtual network | list(string) |
[ |
no |
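The various `*_subnet_size` inputs express subnet size as a number of IPs, from which the module derives a CIDR prefix when the matching `*_subnet_cidrs` list is left empty. For a power-of-two size the prefix works out to 32 - log2(size); a small shell sketch of that mapping (my helper name, not part of the module):

```shell
# Sketch: map a subnet size in IPs to a CIDR prefix length.
# Assumes size is a power of two, as the *_subnet_size defaults (256, 1024) are.
size_to_prefix() {
  size="$1"
  prefix=32
  while [ "$size" -gt 1 ]; do
    size=$((size / 2))
    prefix=$((prefix - 1))
  done
  echo "$prefix"
}

size_to_prefix 1024  # prints 22: the aks_subnet_size default yields a /22
size_to_prefix 256   # prints 24: most other subnet defaults yield a /24
```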
| Name | Description |
|---|---|
| adls_account_key | The access key for the Azure Data Lake Storage account |
| adls_account_name | The name of the Azure Data Lake Storage account |
| adls_filesystem | The filesystem details for the Azure Data Lake Storage |
| azure_blob_account_key | The access key for the Azure Blob Storage account |
| azure_blob_account_name | The name of the Azure Blob Storage account |
| azure_blob_container | The name of the Azure Blob Storage container |
| clickhouse_data_volume_id | The volume ID where clickhouse data will be stored. |
| clickhouse_logs_volume_id | The volume ID where clickhouse logs will be stored. |
| cloud_provider | The cloud provider being used (always 'azure' for this module) |
| cluster_name | The name of the AKS cluster |
| domain_name | The domain name configured for the deployment |
| load_balancer_ips | The public IP addresses assigned to the load balancer |
| postgres_database_name | The name of the PostgreSQL database |
| postgres_host | The hostname of the PostgreSQL server |
| postgres_password | The password for PostgreSQL database access |
| postgres_username | The username for PostgreSQL database access |
| public_ip_jumpbox | The public IP address of the jumpbox |
| redis_data_volume_id | The volume ID of the Redis data volume. |
| resource_group_name | The resource group where resources were deployed |
| service_account_configs | The Azure identity configs |
| vnet_name | The name of the virtual network |
| vpc_cidr | The CIDR block of the VPC |