In the last part of this series, we showed how seemingly harmless data sources in Terraform modules can become a serious performance issue. Multi-minute terraform plan runtimes, unstable pipelines and uncontrollable API throttling effects were the result.
But how can you avoid this scalability trap in an elegant and sustainable way?
In this part, we present proven architectural patterns that allow you to centralize data sources, inject them efficiently and thereby achieve fast, stable and predictable Terraform executions even with hundreds of module instances.
Included: three scalable solution strategies, a practical step-by-step guide and a best practices checklist for production-ready infrastructure modules.
Best Practice: Scalable Alternatives
Solution 1 (simple scenarios): Variable Injection Pattern
Instead of using data sources in modules, inject the required data as variables:
    # Root module: each data source is queried exactly once
    data "oci_identity_availability_domains" "available" {
      compartment_id = var.tenancy_ocid
    }

    data "oci_core_subnets" "database" {
      compartment_id = var.compartment_id
      vcn_id         = var.vcn_id

      # OCI provider filters match exactly unless regex = true is set
      filter {
        name   = "display_name"
        values = [".*database.*"]
        regex  = true
      }
    }

    locals {
      availability_domains = data.oci_identity_availability_domains.available.availability_domains
      database_subnets     = data.oci_core_subnets.database.subnets
    }

    module "databases" {
      for_each = var.database_configs != null ? var.database_configs : {}
      source   = "./modules/database"

      availability_domains = local.availability_domains
      subnet_ids           = [for subnet in local.database_subnets : subnet.id]
      name                 = each.key
      size                 = each.value.size
    }
    # modules/database: receives pre-resolved data through variables
    variable "availability_domains" {
      type = list(object({
        name = string
        id   = string
      }))
      description = "Available ADs for database placement"
    }

    variable "subnet_ids" {
      type        = list(string)
      description = "Database subnet IDs"
    }

    # Declarations of var.db_systems and var.compartment_id are not shown here
    resource "oci_database_db_system" "main" {
      for_each = var.db_systems != null ? var.db_systems : {}

      availability_domain = var.availability_domains[0].name
      subnet_id           = var.subnet_ids[0]
      compartment_id      = var.compartment_id
    }
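The root module call above guards against a null database_configs with a conditional expression. One possible alternative, sketched here under the assumption that Terraform 1.1 or later is in use, is to declare the variable with nullable = false and an empty default, so that for_each can reference it directly. The declaration itself is not part of the original example, and the type of the size attribute is an assumption.

```hcl
# Hypothetical declaration of the root variable consumed by module "databases".
variable "database_configs" {
  type = map(object({
    size = string # assumed type; the original does not show it
  }))
  default     = {}
  nullable    = false # a null value from the caller falls back to the default
  description = "Database instances to create, keyed by name"
}

module "databases" {
  source   = "./modules/database"
  for_each = var.database_configs # never null, so no conditional is needed

  availability_domains = local.availability_domains
  subnet_ids           = [for subnet in local.database_subnets : subnet.id]
  name                 = each.key
  size                 = each.value.size
}
```

The same declaration style would apply to the other collection variables used with for_each in this article.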
Solution 2 (complex scenarios): Structured Configuration Pattern
For more complex scenarios, use structured configuration objects:
    # Root module
    data "oci_core_images" "ol8" {
      compartment_id           = var.tenancy_ocid
      operating_system         = "Oracle Linux"
      operating_system_version = "8"
    }

    locals {
      compute_images = {
        "VM.Standard.E4.Flex" = {
          image_id = [
            for img in data.oci_core_images.ol8.images : img.id
            if can(regex(".*E4.*", img.display_name))
          ][0]
          boot_volume_size = 50
        }
        "BM.Standard3.64" = {
          image_id = [
            for img in data.oci_core_images.ol8.images : img.id
            if can(regex(".*Standard.*", img.display_name))
          ][0]
          boot_volume_size = 100
        }
      }

      # data.oci_identity_availability_domains.ads and data.oci_core_vcn.main
      # are declared elsewhere in the root module
      network_config = {
        availability_domains = data.oci_identity_availability_domains.ads.availability_domains
        vcn_id               = data.oci_core_vcn.main.id
      }
    }

    module "compute_instances" {
      for_each = var.instance_configs != null ? var.instance_configs : {}
      source   = "./modules/compute-instance"

      compute_config = local.compute_images[each.value.shape]
      network_config = local.network_config
    }
    variable "compute_config" {
      type = object({
        image_id         = string
        boot_volume_size = number
      })
      description = "Pre-resolved compute configuration"
    }

    variable "network_config" {
      type = object({
        availability_domains = list(object({
          name = string
          id   = string
        }))
        vcn_id = string
      })
      description = "Pre-resolved network configuration"
    }

    resource "oci_core_instance" "this" {
      for_each = var.instances != null ? var.instances : {}

      availability_domain = var.network_config.availability_domains[0].name
      compartment_id      = var.compartment_id

      source_details {
        source_id               = var.compute_config.image_id
        source_type             = "image"
        boot_volume_size_in_gbs = var.compute_config.boot_volume_size
      }
    }
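Structured objects can also carry defaults and their own validation. The following sketch is a variation of the compute_config variable above, assuming Terraform 1.3 or later for optional() attributes. The default of 50 and the lower bound in the validation rule are illustrative assumptions, not values taken from the original module.

```hcl
variable "compute_config" {
  type = object({
    image_id = string
    # optional(type, default): callers may omit the attribute entirely.
    boot_volume_size = optional(number, 50) # assumed default, in GB
  })
  description = "Pre-resolved compute configuration"

  validation {
    condition     = can(regex("^ocid1\\.image\\.", var.compute_config.image_id))
    error_message = "The image_id attribute must be an OCI image OCID."
  }

  validation {
    condition     = var.compute_config.boot_volume_size >= 50 # assumed minimum
    error_message = "The boot_volume_size attribute must be at least 50 GB."
  }
}
```

With this declaration, a root module could pass only image_id for shapes that do not need a custom boot volume size.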
Solution 3 (very complex scenarios): Data Proxy Pattern
For very complex scenarios, create dedicated "Data Proxy" modules:
    # modules/data-proxy: bundles all lookups behind a single output
    data "oci_core_images" "oracle_linux" {
      compartment_id           = var.tenancy_ocid
      operating_system         = "Oracle Linux"
      operating_system_version = "8"
    }

    data "oci_core_vcn" "main" {
      vcn_id = var.vcn_id
    }

    data "oci_core_security_lists" "web" {
      compartment_id = var.compartment_id
      vcn_id         = var.vcn_id

      # OCI provider filters match exactly unless regex = true is set
      filter {
        name   = "display_name"
        values = [".*web.*"]
        regex  = true
      }
    }

    output "platform_data" {
      value = {
        image_id = data.oci_core_images.oracle_linux.images[0].id
        vcn_id   = data.oci_core_vcn.main.id
        instance_shapes = {
          small  = "VM.Standard.E3.Flex"
          medium = "VM.Standard.E4.Flex"
          large  = "VM.Standard3.Flex"
        }
      }
    }
    module "platform_data" {
      source = "./modules/data-proxy"

      tenancy_ocid   = var.tenancy_ocid
      compartment_id = var.compartment_id
      vcn_id         = var.vcn_id
    }

    module "web_servers" {
      for_each = var.web_server_configs != null ? var.web_server_configs : {}
      source   = "./modules/oci-instance"

      platform_data = module.platform_data.platform_data
      name          = each.key
      instance_type = each.value.size
    }
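The consuming side of the data proxy is not shown above. A minimal sketch of how the oci-instance module might declare and use the injected object could look like this; the file layout, the validation rule, and the resolved_shape output are hypothetical additions for illustration.

```hcl
# modules/oci-instance/variables.tf (hypothetical layout)
variable "platform_data" {
  type = object({
    image_id        = string
    vcn_id          = string
    instance_shapes = map(string)
  })
  description = "Platform data resolved once by the data proxy module"
}

variable "name" {
  type        = string
  description = "Instance name"
}

variable "instance_type" {
  type        = string
  description = "Logical size: small, medium or large"

  validation {
    condition     = contains(["small", "medium", "large"], var.instance_type)
    error_message = "The instance_type must be one of small, medium or large."
  }
}

locals {
  # Translate the logical size into a concrete shape without any API call.
  shape = var.platform_data.instance_shapes[var.instance_type]
}

# Illustrative output only; a real module would use local.shape and
# var.platform_data.image_id in its oci_core_instance resource.
output "resolved_shape" {
  value = local.shape
}
```

The module contains no data blocks, so adding more web_servers instances does not add lookups.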
Performance Comparison
A concrete example from a customer project deploying 50 VM instances illustrates the dramatic difference:
| Before: Data Sources in Modules | After: Variable Injection |
| --- | --- |
| 150 API calls | 3 API calls |
| $ time terraform plan (real 4m23.415s) | $ time terraform plan (real 0m18.732s) |
Result: 93% less planning time and 98% fewer API calls.
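The percentages follow directly from the measured values. As a quick check, the arithmetic can be reproduced in a scratch configuration; the figure of three data sources per module instance is an assumption derived from 150 calls across 50 instances.

```hcl
locals {
  module_instances        = 50 # VM instances in the example
  data_sources_per_module = 3  # assumption: 150 calls / 50 instances

  api_calls_before = local.module_instances * local.data_sources_per_module # 150
  api_calls_after  = local.data_sources_per_module                          # 3

  plan_seconds_before = 4 * 60 + 23.415 # 4m23.415s
  plan_seconds_after  = 18.732          # 0m18.732s

  api_call_reduction_pct  = 100 * (local.api_calls_before - local.api_calls_after) / local.api_calls_before
  plan_time_reduction_pct = 100 * (local.plan_seconds_before - local.plan_seconds_after) / local.plan_seconds_before
}

output "api_call_reduction_pct" {
  value = local.api_call_reduction_pct # 98
}

output "plan_time_reduction_pct" {
  value = format("%.1f", local.plan_time_reduction_pct) # "92.9"
}
```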
Variable Injection: Step-by-Step Guide
Step 1: Centralize Data Sources
Goal: Remove all data sources from modules and centralize them in the root module to consolidate API calls and establish a single source of truth.
How: Move all data sources used by modules into the root module. This ensures that each piece of information is queried only once, regardless of how many modules require that data. In doing so, you reduce the number of API calls from N×M (number of modules × number of data sources) to just M (number of data sources).
    data "oci_identity_availability_domains" "ads" {
      compartment_id = var.tenancy_ocid
    }

    data "oci_core_images" "ol8" {
      compartment_id           = var.tenancy_ocid
      operating_system         = "Oracle Linux"
      operating_system_version = "8"
    }

    data "oci_core_vcn" "main" {
      vcn_id = var.vcn_id
    }
Step 2: Process Data in Locals
Goal: Transform raw data source results into a consumable format while keeping complexity out of the modules.
How: Use locals to filter, sort and convert data source results into structured data. This lets you handle complex logic centrally and supply modules with clean, already processed data. With for expressions and conditional expressions, you can also implement fallback mechanisms and validation logic in the same place.
    locals {
      availability_domains = [
        for ad in data.oci_identity_availability_domains.ads.availability_domains : ad.name
      ]

      compute_images = {
        standard = [
          for img in data.oci_core_images.ol8.images : img.id
          if can(regex(".*Standard.*", img.display_name))
        ][0]
        gpu = [
          for img in data.oci_core_images.ol8.images : img.id
          if can(regex(".*GPU.*", img.display_name))
        ][0]
      }

      network_config = {
        vcn_id               = data.oci_core_vcn.main.id
        vcn_cidr             = data.oci_core_vcn.main.cidr_block
        availability_domains = local.availability_domains
      }
    }
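Indexing a filtered list with [0] fails the plan when no image matches. One way the fallback mechanism mentioned in this step could look is sketched below. It reuses data.oci_core_images.ol8 from Step 1; the fallback_image_id variable and the local names are hypothetical.

```hcl
# Hypothetical fallback input; not part of the original example.
variable "fallback_image_id" {
  type        = string
  default     = null
  description = "Image OCID to use when no image matches the filter"
}

locals {
  # data.oci_core_images.ol8 is the data source declared in Step 1.
  gpu_image_candidates = [
    for img in data.oci_core_images.ol8.images : img.id
    if can(regex("GPU", img.display_name))
  ]

  # try() returns the first argument that evaluates without an error, so an
  # empty candidate list falls through to the fallback value.
  gpu_image_id = try(local.gpu_image_candidates[0], var.fallback_image_id)
}
```

If both the candidate list is empty and no fallback is set, gpu_image_id is null, which a validation rule in the consuming module can then reject with a clear message.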
Step 3: Define Variables in Modules
Goal: Create clear interfaces for passing data to modules while ensuring type safety and validation.
How: Replace data sources in modules with typed variables that include descriptive documentation and validation rules. The type definitions ensure consistency and robustness, while validation blocks guarantee that only valid data is passed into the modules. This makes modules more testable and independent from the cloud provider API.
    variable "availability_domains" {
      type        = list(string)
      description = "List of available availability domains"

      validation {
        condition     = length(var.availability_domains) > 0
        error_message = "At least one availability domain must be provided."
      }
    }

    variable "compute_images" {
      type        = map(string)
      description = "Map of compute images by type"

      validation {
        condition = alltrue([
          for image_id in values(var.compute_images) :
          can(regex("^ocid1\\.image\\.", image_id))
        ])
        error_message = "All image IDs must be valid OCI OCIDs."
      }
    }
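The network_config object built in Step 2 needs a matching declaration in the module as well. A possible sketch, with validation rules that are illustrative assumptions rather than part of the original module:

```hcl
variable "network_config" {
  type = object({
    vcn_id               = string
    vcn_cidr             = string
    availability_domains = list(string)
  })
  description = "Pre-resolved network configuration"

  validation {
    condition     = can(regex("^ocid1\\.vcn\\.", var.network_config.vcn_id))
    error_message = "The vcn_id attribute must be a VCN OCID."
  }

  validation {
    # cidrhost() fails on malformed input, so can() turns it into a boolean.
    condition     = can(cidrhost(var.network_config.vcn_cidr, 0))
    error_message = "The vcn_cidr attribute must be a valid CIDR block."
  }
}
```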
Step 4: Implement Modules Without Data Sources
Goal: Completely decouple modules from external API calls and turn them into pure resource definition containers.
How: Replace all data source references in modules with variable references. This makes modules deterministic and predictable, as they only operate on the parameters passed in and do not make any unexpected API calls. At the same time, it makes modules independently testable, since you can inject mock data through the variables.
    resource "oci_core_instance" "this" {
      for_each = var.instances != null ? var.instances : {}

      availability_domain = var.availability_domains[each.value.ad_index]
      compartment_id      = var.compartment_id
      shape               = each.value.shape

      create_vnic_details {
        subnet_id = each.value.subnet_id
      }

      source_details {
        source_id   = var.compute_images[each.value.image_type]
        source_type = "image"
      }

      metadata = {
        ssh_authorized_keys = var.ssh_public_key
      }
    }
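Because the module depends only on its inputs, it can be exercised with mock data. The following is a rough sketch of a test file for terraform test, assuming Terraform 1.7 or later (for mock_provider) and a module that declares the OCI provider and the variables used in this step. The file path and all values are made up; if the module also declares network_config as a required variable, it would need a value here too.

```hcl
# modules/compute/tests/injection.tftest.hcl (hypothetical path)

# Replace the real provider with a mock so that no API calls are made.
mock_provider "oci" {}

variables {
  availability_domains = ["AD-1", "AD-2"]
  compute_images = {
    standard = "ocid1.image.oc1..exampleimage"
  }
  compartment_id = "ocid1.compartment.oc1..examplecompartment"
  ssh_public_key = "ssh-ed25519 AAAA-example"
  instances = {
    web1 = {
      ad_index   = 1
      shape      = "VM.Standard.E4.Flex"
      subnet_id  = "ocid1.subnet.oc1..examplesubnet"
      image_type = "standard"
    }
  }
}

run "instance_uses_injected_data" {
  command = plan

  assert {
    condition     = oci_core_instance.this["web1"].availability_domain == "AD-2"
    error_message = "The instance should use the injected availability domain."
  }

  assert {
    condition     = oci_core_instance.this["web1"].source_details[0].source_id == "ocid1.image.oc1..exampleimage"
    error_message = "The instance should use the injected image ID."
  }
}
```

Running terraform test in the module directory would then plan the module against the mock values only.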
Step 5: Call Modules with Injected Data
Goal: Establish the connection between centrally retrieved data and modules to implement a clean data flow pattern.
How: Pass the data processed in locals as parameters to the modules. This completes the variable injection loop: data is retrieved centrally once, processed and then explicitly distributed to the modules. This explicit data transfer creates clear dependencies that are understandable both for humans and for Terraform itself.
    module "web_servers" {
      for_each = var.web_server_configs != null ? var.web_server_configs : {}
      source   = "./modules/compute"

      availability_domains = local.availability_domains
      compute_images       = local.compute_images
      network_config       = local.network_config
      instances            = each.value.instances
      compartment_id       = each.value.compartment_id
      ssh_public_key       = var.ssh_public_key
    }
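The shape of web_server_configs is implied by the attributes referenced in Steps 4 and 5. A hypothetical declaration and a matching terraform.tfvars entry might look as follows; the group name, instance names and OCIDs are placeholders.

```hcl
# variables.tf (root module): hypothetical declaration
variable "web_server_configs" {
  type = map(object({
    compartment_id = string
    instances = map(object({
      ad_index   = number # index into local.availability_domains
      shape      = string
      subnet_id  = string
      image_type = string # key in local.compute_images, e.g. "standard"
    }))
  }))
  default     = {}
  description = "Web server groups, keyed by group name"
}
```

```hcl
# terraform.tfvars: placeholder values
web_server_configs = {
  frontend = {
    compartment_id = "ocid1.compartment.oc1..examplecompartment"
    instances = {
      web1 = {
        ad_index   = 0
        shape      = "VM.Standard.E4.Flex"
        subnet_id  = "ocid1.subnet.oc1..examplesubnet"
        image_type = "standard"
      }
    }
  }
}
```

Only plain values cross the module boundary here; the lookups behind image_type and ad_index stay in the root module.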
Best Practices Checklist
✅ Do's: Scalable Patterns
- [ ] Central Data Sources: Define all data sources in the root module
- [ ] Variable Injection: Pass data to modules via variables
- [ ] Structured Objects: Organize complex data in typed objects
- [ ] Validation Rules: Implement variable validations for injected data
- [ ] Documentation: Write variable descriptions for injected data
- [ ] Local Processing: Process data in locals in the root module
- [ ] Data Proxy Pattern: Use separate data modules for very complex scenarios
❌ Don'ts: Avoid Anti-Patterns
- [ ] Data Sources in Modules: Never use data sources in reusable modules
- [ ] Redundant Lookups: Identical data sources in multiple modules
- [ ] Complex Filtering: Costly filter operations in every module
- [ ] Nested Data Sources: Data sources depending on other data sources
- [ ] Dynamic References: for_each on data source results within modules
- [ ] Missing Validation: Using injected data without validation
Monitoring and Debugging
To monitor data source performance, you can search and evaluate the debug output of terraform plan for data source entries:
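    # With TF_LOG_PATH set, the debug log is written to the file, not the terminal
    export TF_LOG=DEBUG
    export TF_LOG_PATH=./terraform.log
    terraform plan -no-color > plan.txt 2>&1

    # Rough count of log lines that mention data sources or HTTP requests
    grep -cE "(data\.|GET|POST)" terraform.log

    # Number of mentions per data source address in the plan output
    grep -oE "data\.[a-z0-9_]+\.[A-Za-z0-9_-]+" plan.txt | sort | uniq -c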
Conclusion: Performant Modules Through Intentional Architecture
Data sources are a powerful Terraform feature - but in modules, they can become a performance trap. The variable injection pattern offers an elegant solution:
Advantages:
- Drastically reduced API calls (95%+ savings possible)
- Data source calls that stay constant as module instances are added, instead of multiplying with every instance
- Centralized data logic for better maintainability
- Explicit dependencies instead of hidden data source calls
- Better testability through injectable mock data
The key lies in a paradigm shift: Instead of fetching data when needed, fetch it once centrally and distribute it in a targeted manner.
At ICT.technology, we have reduced Terraform planning times from minutes to seconds - even with hundreds of module instances - by consistently applying these patterns.