Using Workspaces for Environment Separation
Terraform workspaces let you manage multiple environments (such as dev, staging, and prod) from the same configuration, each with its own state. For example, to switch to the production workspace:
tofu workspace select prod
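Inside the configuration, the active workspace is available as `terraform.workspace`, which you can use to derive environment-specific names. A minimal sketch (the resource group and naming scheme here are hypothetical):

locals {
  environment = terraform.workspace # "dev", "staging", or "prod"
}

resource "azurerm_resource_group" "rg" {
  # Hypothetical naming scheme based on the selected workspace
  name     = "rg-myapp-${local.environment}"
  location = "westeurope"
}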
Applying Configuration Changes
When applying configuration changes, you can pass an environment-specific variable file and skip the interactive approval prompt:
tofu apply -var-file ./dev.tfvars -auto-approve
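The .tfvars file is simply a set of assignments for input variables declared in the configuration; a hypothetical dev.tfvars might contain:

location            = "westeurope"
environment         = "dev"
storage_account_sku = "Standard_LRS"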
Importing Existing Resources
To import an existing resource into Terraform’s state, use the import command with the appropriate variable file:
tofu import -var-file prod.tfvars azurerm_storage_account.res-5 "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/my_rg_name/providers/Microsoft.Storage/storageAccounts/my_storage_account"
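Note that the import only populates the state; a matching resource block for azurerm_storage_account.res-5 must already exist in the configuration. A minimal sketch, with placeholder attribute values:

resource "azurerm_storage_account" "res-5" {
  name                     = "my_storage_account" # placeholder; real names are 3-24 lowercase alphanumeric characters
  resource_group_name      = "my_rg_name"
  location                 = "westeurope" # assumption
  account_tier             = "Standard"   # assumption
  account_replication_type = "LRS"        # assumption
}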
Mass Import Tool
For importing multiple resources at once, consider using the Azure Export for Terraform tool (aztfexport):
https://github.com/Azure/aztfexport
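A typical invocation might look like the following, exporting everything in a resource group into Terraform configuration and state (check the repository's README for the exact subcommands and options):

aztfexport resource-group my_rg_name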
KeyVault Deletion Considerations
KeyVaults in Azure are not deleted immediately when you remove them. Soft delete keeps them in a recoverable state for a retention period (90 days by default) so they can be restored. If you need to completely purge a KeyVault, use the following command:
az keyvault purge --name keyvaultname
You can also purge soft-deleted vaults from the Azure Portal.
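If the vault itself is managed by the azurerm provider, you can also have Terraform purge it automatically when it is destroyed via the provider's features block (a sketch; check your provider version for the current defaults):

provider "azurerm" {
  features {
    key_vault {
      # Purge soft-deleted vaults when the resource is destroyed
      purge_soft_delete_on_destroy = true
    }
  }
}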
Forcing Resource Refresh with Taint
If you need to force Terraform to recreate a specific resource on the next apply, you can mark it with the taint command:
tofu taint -var-file=prod.tfvars "terraform_data.run_databricks_job"
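On current Terraform and OpenTofu releases, the recommended alternative is the -replace planning option, which forces recreation in a single step without editing state beforehand:

tofu apply -var-file=prod.tfvars -replace="terraform_data.run_databricks_job"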
Executing Arbitrary Code with terraform_data
Terraform’s `terraform_data` resource, combined with a `local-exec` provisioner, is useful for executing arbitrary code during deployment. In this example, a Databricks job is created and then immediately triggered to run:
# Job for a script that takes two parameters
resource "databricks_job" "mounting_job" {
  name     = "Mount the datalake folders"
  provider = databricks.az_databricks

  task {
    task_key            = "Mount"
    existing_cluster_id = databricks_cluster.db_cluster.cluster_id

    notebook_task {
      notebook_path = databricks_notebook.notebook.path
      base_parameters = {
        "Datalake" = azurerm_storage_account.res-5.primary_dfs_endpoint
        "Folders"  = join(",", formatlist("%s", keys(local.storage_containers)))
      }
    }
  }
}

# This triggers a run of the job that was created during the deployment
resource "terraform_data" "run_databricks_job" {
  # run_job_now.py is a local helper script that starts the job,
  # given the workspace URL, a token, and the job ID
  provisioner "local-exec" {
    command = "python ./run_job_now.py \"https://${azurerm_databricks_workspace.res-2.workspace_url}\" ${databricks_token.pat.token_value} ${databricks_job.mounting_job.id}"
  }

  lifecycle {
    replace_triggered_by = [
      databricks_job.mounting_job,
      databricks_notebook.notebook
    ]
  }

  depends_on = [databricks_job.mounting_job]
}
This configuration ensures that the Databricks job is executed every time it or its associated notebook changes (or you force a taint), thanks to the `replace_triggered_by` lifecycle rule.
Using Maps and For Each Functionality
Terraform supports maps for organizing data in key-value pairs. The for_each meta-argument lets you iterate over a map or a set of strings, creating one instance of a resource or module per key.
For example, if you have a map of storage containers and want to create resources for each container:
locals {
  storage_containers = {
    container1 = "value1"
    container2 = "value2"
  }
}

resource "azurerm_storage_container" "container" {
  for_each             = local.storage_containers
  name                 = each.key
  storage_account_name = each.value
}
In this case, Terraform creates two azurerm_storage_container resources, one named container1 and another named container2, each taking its storage_account_name from the corresponding value in the map. Within the block, each.key is the key of the current iteration and each.value is the associated value.
This approach is useful for managing similar resources that have different names or configurations based on a set of keys, making your Terraform configuration more dynamic and reusable.
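As mentioned above, for_each also accepts a set of strings, in which case each.key and each.value are identical. A small sketch with hypothetical per-environment resource groups:

locals {
  environments = toset(["dev", "staging", "prod"])
}

resource "azurerm_resource_group" "env_rg" {
  for_each = local.environments
  name     = "rg-${each.key}" # for sets, each.key equals each.value
  location = "westeurope"
}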
Have fun!