[{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/authors/","section":"Authors","summary":"","title":"Authors","type":"authors"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/cloud/","section":"Tags","summary":"","title":"Cloud","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/cost/","section":"Tags","summary":"","title":"Cost","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/authors/george-boone/","section":"Authors","summary":"","title":"George Boone","type":"authors"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/migration/","section":"Tags","summary":"","title":"Migration","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/posts/","section":"Posts","summary":"","title":"Posts","type":"posts"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/categories/strategy/","section":"Categories","summary":"","title":"Strategy","type":"categories"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/strategy/","section":"Tags","summary":"","title":"Strategy","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"Every cloud migration starts with a business case. Compute costs will drop. You\u0026rsquo;ll stop paying for underutilized hardware. The capital expense becomes an operational expense. The numbers look good in a spreadsheet.\nThen the bill arrives.\nThis isn\u0026rsquo;t a post arguing that cloud migration is a bad idea. For most organizations, moving to the cloud is the right call. 
But the gap between projected and actual costs is wide enough that it\u0026rsquo;s worth being honest about what the estimates usually miss.\nThe Lift-and-Shift Tax # The cheapest migration, on paper, is lift-and-shift: take what you have, move it to the cloud unchanged, and optimize later. In practice, this often produces cloud infrastructure that costs more than the on-premise equivalent it replaced.\nOn-premise hardware is sized for peak load and then sits idle. In the cloud, you pay for what you provision, not for what you use: a server that\u0026rsquo;s at 15% utilization in your data center is still billed at its full size after the move, so roughly 85% of that spend buys idle capacity unless you right-size it. Lift-and-shift migrations frequently inherit the same oversizing patterns without the fixed-cost protection.\n\u0026ldquo;Optimize later\u0026rdquo; is a plan that often doesn\u0026rsquo;t happen. Once services are running in the cloud and teams have moved on, the political will to re-architect for efficiency rarely materializes.\nData Transfer Costs # Cloud providers charge for data leaving their network. This is called egress, and it surprises almost every organization that hasn\u0026rsquo;t budgeted for it.\nThe pattern is always the same: data going in is free or cheap, data coming out is not. For workloads that move significant amounts of data (analytics pipelines, content delivery, backup and restore workflows, applications with high-traffic APIs), egress can represent a substantial fraction of the total cloud bill.\nThis cost is particularly acute for organizations that run hybrid architectures, where data moves regularly between on-premise and cloud environments. Budget for it explicitly, and factor it into the architectural decisions about where data lives and how it moves.\nThe Operations Gap # On-premise infrastructure comes with a staff that knows how to run it. 
They know which servers have had hardware issues, how the network is laid out, what the change management process looks like, and where the runbooks are.\nCloud infrastructure requires a different skill set. The tooling is different, the failure modes are different, and the operational patterns are different. Organizations that migrate without investing in training and hiring often find that their existing operations team is underwater, leading to incidents, slower deployments, and remediation costs.\nThis isn\u0026rsquo;t a reason to delay migration — it\u0026rsquo;s a reason to plan for the transition. Budget for training. Hire people with cloud experience before you need them. Accept that productivity will dip during the transition period.\nUnused Resources # Cloud resources are easy to create and easy to forget. Development environments spin up and never get torn down. Experiments run and the infrastructure persists. Snapshots accumulate. Reserved capacity goes unused after projects end.\nIn a data center, unused hardware is visible and occupies physical space. In the cloud, it\u0026rsquo;s invisible until you look at your bill. Organizations without good resource governance practices typically find 20-30% of their cloud spend going to resources that serve no current purpose.\nThe solution is governance infrastructure: tagging requirements, automated cleanup of old resources, regular cost reviews, and alerts for spending anomalies. This costs something to build and maintain, but it costs less than the waste it prevents.\nReserved Capacity Commitments # Cloud providers offer significant discounts — often 30-60% — in exchange for committing to use a certain amount of compute for one to three years. 
These commitments are worth taking in stable parts of your infrastructure, but they introduce a new category of hidden cost: commitment waste.\nIf your workload shrinks, or you migrate to a different instance type, or a service gets deprecated, you may end up paying for reserved capacity you can\u0026rsquo;t use. The discount that looked attractive becomes a liability.\nThe mitigation is to commit conservatively, for shorter terms, and only for workloads you\u0026rsquo;re confident will remain stable. Cover your baseline with reservations and let variable workloads run on-demand.\nThe True Cost of Migration Projects # The migration itself costs money. Engineering time to re-architect applications, build automation, test in new environments, and manage the cutover is significant. So is running parallel environments during the transition period, when you\u0026rsquo;re paying for both old and new infrastructure simultaneously.\nThese costs are often underestimated in initial migration business cases because they\u0026rsquo;re hard to forecast. A reasonable heuristic: double whatever the engineering estimate is, and assume you\u0026rsquo;ll run in parallel for longer than planned.\nGetting the Numbers Right # None of this means cloud migration doesn\u0026rsquo;t pencil out — it usually does. But the analysis needs to be honest.\nA realistic cloud migration budget includes:\nInfrastructure costs, sized accurately for your actual workload Egress costs, estimated based on data movement patterns Reserved capacity strategy, with realistic assumptions about commitment risk Governance tooling and ongoing cost management Training and hiring to close the skills gap Migration project costs, including parallel running periods Build the full picture before committing. 
Surprises in cloud bills are manageable when they\u0026rsquo;re expected and sized; they\u0026rsquo;re a crisis when they\u0026rsquo;re not.\nThe organizations that get cloud economics right treat the cloud as a different kind of infrastructure that requires different financial modeling — not as a data center with a pay-as-you-go billing model.\n","date":"20 March 2026","externalUrl":null,"permalink":"/posts/the-hidden-costs-of-cloud-migration/","section":"Posts","summary":"","title":"The Hidden Costs of Cloud Migration","type":"posts"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/","section":"ThinkInfra","summary":"","title":"ThinkInfra","type":"page"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/tags/automation/","section":"Tags","summary":"","title":"Automation","type":"tags"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/series/cloud-fundamentals/","section":"Series","summary":"","title":"Cloud-Fundamentals","type":"series"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/tags/devops/","section":"Tags","summary":"","title":"Devops","type":"tags"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/categories/engineering/","section":"Categories","summary":"","title":"Engineering","type":"categories"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/tags/iac/","section":"Tags","summary":"","title":"Iac","type":"tags"},{"content":"Most teams that adopt infrastructure as code follow the same trajectory: they start by translating their existing setup into configuration files, feel good about the reproducibility, and then gradually accumulate a codebase that becomes harder to change than the manual processes it replaced.\nThis post is about avoiding that outcome.\nThe Problem with \u0026ldquo;It Works\u0026rdquo; # When infrastructure as code is done poorly, it exhibits specific failure modes:\nDrift between what the code describes 
and what actually exists in the cloud, usually caused by manual changes made during incidents Sprawl as the codebase grows without consistent structure, making it hard to find things or understand dependencies Fear of change because nobody is confident what a modification will actually do Long feedback loops because running a plan or apply takes ten minutes and touches too many resources at once These aren\u0026rsquo;t tool problems. They\u0026rsquo;re process and structure problems that any IaC tool can develop.\nState is the Hard Part # The central challenge in infrastructure as code is state management. Your IaC tool maintains a record of what it believes exists in the world. When that record diverges from reality, things break in confusing ways.\nA few practices that help:\nLock state during applies. If two engineers run applies simultaneously against the same state, the results are unpredictable. Remote state backends with locking prevent this.\nStore state remotely, not locally. Local state files get lost, go stale, and can\u0026rsquo;t be shared. Remote backends with versioning give you a safety net.\nImport before you write. If you\u0026rsquo;re codifying existing infrastructure, import resources into state before writing the configuration. Writing the config first leads to duplicated resources.\nTreat state corruption seriously. If your state file is corrupted or severely drifted, stop and fix it before making any other changes. Applying on top of bad state compounds the problem.\nModule Design # Modules are the primary unit of reuse in most IaC tools. 
They\u0026rsquo;re also where most complexity lives, for better and worse.\nA well-designed module:\nHas a single, clear purpose Exposes inputs for everything that legitimately varies between uses Hides implementation details that callers shouldn\u0026rsquo;t need to know about Is versioned so that callers can upgrade deliberately A poorly designed module:\nDoes too many things and accumulates unrelated resources over time Exposes every input, making callers deal with details they don\u0026rsquo;t care about Has implicit dependencies that aren\u0026rsquo;t expressed in the interface Is modified in place, breaking callers unexpectedly The right granularity is usually \u0026ldquo;one module per logical component\u0026rdquo; — a database, a service, a network — rather than \u0026ldquo;one module per resource type\u0026rdquo; or \u0026ldquo;one module for everything.\u0026rdquo;\nTesting Infrastructure Code # Infrastructure code is harder to test than application code because side effects are the point. You can\u0026rsquo;t meaningfully test a VPC configuration without creating a VPC.\nThat said, several layers of testing are worth implementing:\nStatic analysis catches formatting issues, deprecated syntax, and common mistakes without touching the cloud. This should run on every commit and be fast.\nPlan validation runs a plan against a real (or ephemeral) environment and asserts on the output. You can check that expected resources will be created, that certain tags are present, that no unexpected destructive changes will occur.\nIntegration testing actually applies infrastructure and verifies it behaves correctly. 
This is slow and expensive, so it\u0026rsquo;s usually reserved for CI on the main branch rather than every pull request.\nCompliance scanning checks that resources will be configured according to your security and governance policies before they\u0026rsquo;re created.\nThe Pipeline Question # Infrastructure changes should go through a pipeline, not be applied from an engineer\u0026rsquo;s laptop. This isn\u0026rsquo;t just about consistency — it\u0026rsquo;s about auditability. When something goes wrong, you want to know exactly what was applied, by whom, and when.\nA reasonable pipeline:\nValidate and lint on every commit Plan on every pull request, with the plan output posted as a comment Require approval before applying to production Apply in CI, not locally Notify on success or failure The plan-then-apply workflow is particularly important. Engineers reviewing a pull request should see exactly what infrastructure changes will result from merging. Reviewing a diff of configuration files is not the same as reviewing a diff of what will actually change.\nOrganizational Considerations # As infrastructure codebases grow, ownership becomes a challenge. A few patterns that help:\nSeparate repositories by blast radius. Core networking and shared services go in one repository, individual application infrastructure in another. Changes to the network shouldn\u0026rsquo;t require touching the same codebase as changes to a specific service\u0026rsquo;s database.\nEstablish conventions early. Naming conventions, tagging requirements, and module structure are much harder to retrofit than to establish from the start. Write them down and enforce them in CI.\nMake the safe path the easy path. If following best practices requires significantly more work than ignoring them, they won\u0026rsquo;t be followed. Invest in tooling and templates that make the right thing the easy thing.\nInfrastructure as code is one of those practices that\u0026rsquo;s worth doing well. 
Done poorly, it\u0026rsquo;s overhead. Done well, it becomes the foundation that everything else is built on.\n","date":"16 March 2026","externalUrl":null,"permalink":"/posts/infrastructure-as-code-beyond-the-basics/","section":"Posts","summary":"","title":"Infrastructure as Code: Beyond the Basics","type":"posts"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/series/","section":"Series","summary":"","title":"Series","type":"series"},{"content":"","date":"16 March 2026","externalUrl":null,"permalink":"/tags/terraform/","section":"Tags","summary":"","title":"Terraform","type":"tags"},{"content":"","date":"10 March 2026","externalUrl":null,"permalink":"/tags/beginner/","section":"Tags","summary":"","title":"Beginner","type":"tags"},{"content":"Standing up cloud infrastructure for the first time can feel overwhelming. There are dozens of services to choose from, pricing models to understand, and architectural decisions to make before you\u0026rsquo;ve even written a line of configuration. This post cuts through the noise and gives you a straightforward path from zero to a working environment.\nWhy Cloud Infrastructure Matters # The shift from on-premise servers to cloud infrastructure isn\u0026rsquo;t just about cost — it\u0026rsquo;s about capability. When your database can scale from handling 100 requests per minute to 100,000 without a phone call to a data center, your entire relationship with growth changes. Capacity planning stops being a six-month exercise and becomes a runtime concern.\nThat said, the cloud introduces its own complexity. Misconfigured security groups, runaway compute costs, and distributed systems failures are all very real risks. Getting the fundamentals right matters more than picking the \u0026ldquo;correct\u0026rdquo; provider.\nThe Core Primitives # Regardless of which cloud you use, the same primitives appear everywhere:\nCompute is where your code runs. 
This might be a virtual machine, a container, or a serverless function. The right choice depends on how much control you need over the runtime environment versus how much operational overhead you\u0026rsquo;re willing to accept.\nStorage comes in several flavors. Object storage (like S3-compatible buckets) is cheap, durable, and ideal for files, backups, and static assets. Block storage behaves like a traditional disk and is better suited to databases. Network-attached file systems bridge the gap for workloads that need shared filesystem semantics.\nNetworking is where most beginners underinvest. Your virtual private cloud, subnets, routing tables, and security groups form the skeleton of your infrastructure. Getting this wrong can mean services that can\u0026rsquo;t reach each other, or worse, services that are reachable when they shouldn\u0026rsquo;t be.\nIdentity and Access Management controls who and what can do what. Overly permissive IAM policies are one of the most common causes of cloud security incidents. Start restrictive and expand as needed.\nA Minimal Starting Point # For most early-stage projects, a reasonable starting topology looks like this:\nA single virtual private cloud with public and private subnets spread across two availability zones Compute in the private subnet, with a load balancer in the public subnet accepting traffic A managed database service in the private subnet, not directly accessible from the internet Object storage for files and backups A bastion host or VPN for administrative access This isn\u0026rsquo;t the most cost-efficient setup possible, and it\u0026rsquo;s not optimized for a specific workload. But it\u0026rsquo;s secure by default, easy to reason about, and gives you room to evolve.\nCommon Mistakes # Over-architecting too early. Kubernetes is a powerful tool. It\u0026rsquo;s also significant operational overhead. 
If you have fewer than a dozen services and a small team, a simpler container runtime or even traditional VMs will serve you better.\nIgnoring cost from the start. Cloud billing can surprise you. Enable cost alerts before you deploy anything. Tag every resource. Understand the pricing model for every service you use before you commit to it.\nTreating infrastructure as an afterthought. The decisions you make in week one — how you structure your network, how you handle secrets, how you think about IAM — will shape everything that comes after. Invest time here early.\nNext Steps # Once you have a basic environment running, the natural next investments are:\nInfrastructure as code — committing your infrastructure definitions to version control and applying them through a pipeline Observability — logs, metrics, and traces that give you visibility into what\u0026rsquo;s happening at runtime Disaster recovery — understanding your recovery time and recovery point objectives, and testing your ability to meet them None of these are glamorous. All of them will save you pain.\nCloud infrastructure done well feels invisible. The goal is to build systems you stop thinking about, so you can focus on the work that actually matters.\n","date":"10 March 2026","externalUrl":null,"permalink":"/posts/getting-started-with-cloud-infrastructure/","section":"Posts","summary":"","title":"Getting Started with Cloud Infrastructure","type":"posts"},{"content":"","date":"10 March 2026","externalUrl":null,"permalink":"/tags/infrastructure/","section":"Tags","summary":"","title":"Infrastructure","type":"tags"},{"content":"","date":"10 March 2026","externalUrl":null,"permalink":"/categories/tutorials/","section":"Categories","summary":"","title":"Tutorials","type":"categories"}]