Terraform/OpenTofu Notes

Using Workspaces for Environment Separation

Terraform workspaces let you manage multiple environments (such as dev, staging, and prod) from the same configuration, with each workspace keeping its own state. For example, to switch to the production workspace:

tofu workspace select prod
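
A few other handy workspace subcommands (staging here is just an example name):

tofu workspace list
tofu workspace new staging
tofu workspace show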

Applying Configuration Changes

When applying changes to your Terraform configuration, you can point at an environment-specific variable file and skip the interactive approval prompt with -auto-approve:

tofu apply -var-file ./dev.tfvars -auto-approve
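
A variable file is just key = value assignments for declared input variables. A minimal sketch of what dev.tfvars might contain (the variable names below are hypothetical, not from the actual project):

# dev.tfvars (hypothetical variables, for illustration only)
location    = "westeurope"
environment = "dev"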

Importing Existing Resources

To import an existing resource into Terraform’s state, use the import command with the appropriate variable file:

tofu import -var-file prod.tfvars azurerm_storage_account.res-5 "/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/my_rg_name/providers/Microsoft.Storage/storageAccounts/my_storage_account"
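
Note that the target resource block has to exist in the configuration before you run the import. A minimal sketch of what that block might look like (the attribute values here are assumptions, not the real ones):

resource "azurerm_storage_account" "res-5" {
    name                     = "my_storage_account"
    resource_group_name      = "my_rg_name"
    location                 = "westeurope"
    account_tier             = "Standard"
    account_replication_type = "LRS"
}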

Mass Import Tool

For importing multiple resources at once, consider using the Azure Terraform Export tool:

https://github.com/Azure/aztfexport
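
The tool can be pointed at a whole resource group and will generate both the configuration and the state imports, roughly like this:

aztfexport resource-group my_rg_name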

KeyVault Deletion Considerations

Key Vaults in Azure are not deleted immediately. Instead, they are soft-deleted and remain recoverable for the retention period (90 days by default) before being purged. If you need to completely purge a Key Vault right away, use the following command:

az keyvault purge --name keyvaultname

You can also list, recover, or purge soft-deleted Key Vaults from the Azure Portal:

https://learn.microsoft.com/en-us/azure/key-vault/general/key-vault-recovery?tabs=azure-portal#list-recover-or-purge-a-soft-deleted-key-vault
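
If you prefer Terraform to handle this for you, the azurerm provider can be told to purge soft-deleted Key Vaults on destroy (and `az keyvault list-deleted` shows what is still lingering):

provider "azurerm" {
    features {
        key_vault {
            purge_soft_delete_on_destroy = true
        }
    }
}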

Forcing Resource Refresh with Taint

If you need to force Terraform to recreate a specific resource on the next apply, you can mark it with the taint command:

tofu taint -var-file=prod.tfvars "terraform_data.run_databricks_job"
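
Newer Terraform/OpenTofu versions also support the -replace plan option, which achieves the same thing in a single step without marking the state:

tofu apply -var-file=prod.tfvars -replace="terraform_data.run_databricks_job"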

Executing Arbitrary Code with terraform_data

Terraform’s `terraform_data` resource is useful for running arbitrary commands during an apply. In this example, a Databricks job is created and then immediately triggered to run:

# Job for a script that takes two parameters
resource "databricks_job" "mounting_job" {
    name = "Mount the datalake folders"
    provider = databricks.az_databricks
    task {
        task_key = "Mount"
        existing_cluster_id = databricks_cluster.db_cluster.cluster_id
        notebook_task {
            notebook_path = databricks_notebook.notebook.path
            base_parameters = {
                "Datalake"  = azurerm_storage_account.res-5.primary_dfs_endpoint
                "Folders" = join(",", formatlist("%s", keys(local.storage_containers)))
            }
        }
    }
}

# This triggers a run of the job that's created during deployment
resource "terraform_data" "run_databricks_job" {
    provisioner "local-exec" {
        command = "python ./run_job_now.py \"https://${azurerm_databricks_workspace.res-2.workspace_url}\" ${databricks_token.pat.token_value} ${databricks_job.mounting_job.id}"
    }
    lifecycle {
        replace_triggered_by = [
            databricks_job.mounting_job,
            databricks_notebook.notebook
        ]
    }
    depends_on = [databricks_job.mounting_job]
}

This configuration re-runs the Databricks job whenever the job or its associated notebook changes (or when you force a taint), thanks to the `replace_triggered_by` lifecycle rule.
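
The run_job_now.py script itself isn’t shown here; as a sketch, the same trigger could be done with a plain call to the Databricks Jobs run-now REST endpoint, using the workspace URL, PAT, and job id that the provisioner passes in (placeholders in angle brackets):

curl -X POST "https://<workspace_url>/api/2.1/jobs/run-now" \
    -H "Authorization: Bearer <pat_token>" \
    -H "Content-Type: application/json" \
    -d '{"job_id": <job_id>}'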

Using Maps and For Each Functionality

Terraform supports maps (and the closely related object type) for organizing data as key-value pairs. The for_each meta-argument iterates over a map or set of strings, creating one instance of a resource or module per key.

For example, if you have a map of storage containers and want to create resources for each container:

locals {
  storage_containers = {
    container1 = "value1"
    container2 = "value2"
  }
}

resource "azurerm_storage_container" "container" {
  for_each = local.storage_containers

  name                 = each.key
  storage_account_name = each.value
}

In this case, Terraform creates two azurerm_storage_container resources: one named container1 and another named container2. Within each block, each.key is the current map key and each.value is the corresponding value (used here as the storage account name).

This approach is useful for managing similar resources that have different names or configurations based on a set of keys, making your Terraform configuration more dynamic and reusable.
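
Resources created with for_each are addressed by key, which also makes it easy to collect their attributes afterwards. A small sketch building on the map above:

# Reference a single instance by its map key
output "container1_id" {
    value = azurerm_storage_container.container["container1"].id
}

# Or collect an attribute from every instance
output "container_ids" {
    value = { for k, c in azurerm_storage_container.container : k => c.id }
}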

Have fun!
