20 points by mooreds 6 hours ago | 9 comments
  • MPSimmons 5 hours ago
    Cloud providers in general haven't gone very far toward providing hooks for validation.

    It seems like it would be easy for the cloud provider to implement the equivalent of a dry-run flag in its API calls, validating that the call would succeed (even as a best-effort determination), which tools like Terraform could use during planning and dependency-tree generation.

    Instead, you have platform providers like AzureRM that squint at the supplied objects and guess whether they look valid, which causes a ton of failures upon actual application. For instance, if you try to create storage with a redundancy level not supported by the target region, Terraform will pass the plan stage, but applying the resource will fail.

    There are countless other examples in a similar vein, all of which could be resolved if API providers had a dry-run flag.
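
    EC2 is actually a partial exception: many of its mutating calls accept a DryRun flag that validates permissions and some parameters without executing. A minimal boto3 sketch (ami_id is a placeholder):

      import boto3
      from botocore.exceptions import ClientError

      ec2 = boto3.client("ec2")
      try:
          # DryRun=True never launches anything; it only checks the call
          ec2.run_instances(ImageId=ami_id, InstanceType="t3.micro",
                            MinCount=1, MaxCount=1, DryRun=True)
      except ClientError as e:
          if e.response["Error"]["Code"] != "DryRunOperation":
              raise  # anything other than DryRunOperation means the real call would fail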

  • willi59549879 5 hours ago
    I am not a fan of abbreviations; this article didn't even write out "Terraform" once.
    • parpfish 5 hours ago
      I assumed it was going to be about TensorFlow
  • akersten 6 hours ago
    The most confusing part of Terraform for me is that Terraform's view of the infrastructure is a singleton state file that is often stored in that very infrastructure. You then have to share it with your team somehow and be very careful that no one gets it out of sync.

    Why don't cloud providers have a nice way for tools like TF to query the current state of the infra? Maybe they do and I'm doing IaC wrong?

    • cobolexpert 5 hours ago
      At $WORK we have a Git repo set up by the devops team, where we can manage our junk by creating Terraform resources in our main AWS account.

      The state, however, is always stored in a _separate AWS account_ that only the devops team can manage. I find this a reasonable way of working with TF. I agree it is confusing, though, because one is using $PROVIDER to both create things and manage those things at the same time, but conceptually, from TF’s perspective, those are very different things.

    • raffraffraff 5 hours ago
      There are three things: the code, the recorded state of the infra from when you applied the code, and the actual state at some point in the future (which may have drifted). You store the code in git and the recorded state (which contains unique IDs, ARNs, etc.) in a bucket; the next time you run a plan, you read the "actual state" and detect drift.
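
      As a toy illustration of "read the recorded state, read the actual state, diff them" (assuming a local terraform.tfstate in the v4 JSON format; real backends pull it from a bucket):

        import json, boto3

        state = json.load(open("terraform.tfstate"))  # recorded state
        recorded = {i["attributes"]["id"]
                    for r in state["resources"] if r["type"] == "aws_instance"
                    for i in r["instances"]}
        resp = boto3.client("ec2").describe_instances()  # actual state
        live = {i["InstanceId"] for r in resp["Reservations"]
                for i in r["Instances"]}
        for iid in recorded - live:
            print(f"drift: {iid} is in the state file but gone from AWS")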

      These days people store the state in Terraform Cloud, Spacelift, env0, or whatever. It doesn't have to live in the same infra you deployed.

      If you were a lunatic, you could skip the state backend entirely and let Terraform create state files in the code directory, checking the file into git with all those secrets and unique IDs.

    • don-code 5 hours ago
      > Why don't cloud providers have a nice way for tools like TF to query the current state of the infra? Maybe they do and I'm doing IaC wrong?

      This is technically how Ansible works. Here's an extensive list of modules that deploy resources in various public clouds: https://docs.ansible.com/projects/ansible/2.9/modules/list_o...

      That said, it looks like Ansible has deprecated those modules, and that seems fair - I haven't heard of anyone deploying infrastructure in a public cloud with Ansible in years. It found its niche in image generation and systems management. Almost all modern tools like Terraform, Pulumi, and even CloudFormation (albeit under the hood) keep a state file.

    • mooreds 6 hours ago
      > The most confusing part of terraform for me is that terraform's view of the infrastructure is a singleton config file that is often stored in that very infrastructure.

      These folks also have an article about that: https://newsletter.masterpoint.io/p/how-to-bootstrap-your-st...

      • bigstrat2003 6 hours ago
        That article is way overkill. One should just manually create the backend storage (S3 bucket or whatever you use). No reason to faff about with the steps in the article.
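
        For example, a one-off boto3 sketch of that manual bootstrap (bucket name is a placeholder; versioning lets you roll back a bad state write):

          import boto3

          s3 = boto3.client("s3")
          # works as-is in us-east-1; other regions need CreateBucketConfiguration
          s3.create_bucket(Bucket="my-tf-state")
          s3.put_bucket_versioning(
              Bucket="my-tf-state",
              VersioningConfiguration={"Status": "Enabled"})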
        • catlifeonmars 5 hours ago
          This is excellent advice.

          When you have a hammer… as the expression goes. It’s crazy how many times, even knowing this, I have to catch myself and step back. IaC is a contextually different way of thinking and it’s easy to get lost.

    • colechristensen 5 hours ago
      There are three things:

      * Your terraform code

      * The state terraform holds which is what it thinks your infrastructure state is

      * The actual state of your infrastructure

      >Why don't cloud providers have a nice way for tools like TF to query the current state of the infra?

      A Terraform provider is code that queries the targeted resources through whatever APIs they provide. I guess you could argue these APIs could be better, faster, or more tuned toward infrastructure management... but gathering state from whatever resources it manages is one of the core things Terraform does. I'm not sure what you're asking for.

      • fragmede 5 hours ago
        For the plan file to be updated to the state of the world in a non-confusing way, so that apply does the right thing without a chance it's gonna blow things up.
        • colechristensen 5 hours ago
          This is really up to the writer of the provider (very often the service vendor itself) to make the provider code correctly model how the service works. Very often it doesn't, and it lets you produce an error-free plan for something that will fail during apply.

          It's not an API issue but a Terraform provider issue - missing or incomplete code (e.g. https://github.com/hashicorp/terraform-provider-aws).

    • cyberax 5 hours ago
      > Why don't cloud providers have a nice way for tools like TF to query the current state of the infra?

      They do! In fact, this is my greatest pet peeve with TF: it adds state when it's not needed.

      I was doing infra-as-code on AWS without TF a long time ago. It went like this:

        env_tag = f"{project_name}-{env_name}"
        aws_instances = conn.describe_instances(filter_by_tag={"env_tag": env_tag})
        if not aws_instances:  # launch only when the tagged instance is missing
            conn.launch_aws_instances(tags={"env_tag": env_tag})
      
      AWS has tag-on-create now, making this sort of code reliable. Before that, you could do the same with instance idempotency tokens. GCP also has tags.
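
      A minimal boto3 version of the same pattern with tag-on-create plus an idempotency token (ami_id and the other names are placeholders):

        import boto3

        ec2 = boto3.client("ec2")
        ec2.run_instances(
            ImageId=ami_id, InstanceType="t3.micro", MinCount=1, MaxCount=1,
            ClientToken=env_tag,  # retries with the same token won't double-launch
            TagSpecifications=[{"ResourceType": "instance",
                                "Tags": [{"Key": "env_tag", "Value": env_tag}]}])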
  • jdalsgaard 5 hours ago
    Most tools, frameworks and articles in IT, SaaS in particular, are about spinning up things. It is what people find exciting.

    Work a few years in Ops and you learn that spinning up things is not a big part of your work. It's maintenance, such as deleting stuff.

    Unfortunately, this process is the hardest, and there's very little to help you do it right. Many tools, frameworks, and vendors don't even have proper support for it.

    Some even recommend 'rinse and repeat' instead of adjusting what you have - and this method is not great if you value uptime, nor if you have state that you want to preserve, such as customer data :-)

    Deleting stuff, shutting services down, turning off servers - those are hard tasks in IT.

    • jiggawatts 3 hours ago
      My acid test for provisioning automation products is asking: Can it rename deployed resources?

      Practically none can, even in market segments where this is highly relevant. For example: user identity and access management products. Women get married and change their name all the time!

      The next level up is the ability to rename a container such as an organisational unit or a security group.

      Then, products that can rearrange a hierarchy to accommodate a merger, split, or a new layer of management. This obviously needs to preserve the data. “Immutable infrastructure” where everything is recreated from scratch and the original is dropped is cheating.

      I’ve only ever seen one provisioning tool that can; the rest don’t even begin to approach this level of capability.

  • sshine 4 hours ago
    I love how terraform can describe what I’ve got. Sort of. Assuming I or my colleagues or my noob customers don’t modify resources on the same account.

    I don’t love how unreliable providers are, even for creating resources. Clouds like DigitalOcean will 429-throttle me for making too many plans in a row with only 100+ resources. Sometimes the plan goes through but the apply fails - sometimes halfway through.

    I’d rather use a cloud-specific API, unless I’m certain of the quality of the specific terraform provider.

  • based2 5 hours ago
    Because TF lacks sequential state descriptions in rare cases - e.g., termination protection in AWS.
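
    A concrete case: termination protection has to be switched off in a separate call before a destroy can go through, and TF can't sequence that inside a single apply, so people end up scripting it (a boto3 sketch; instance_id is a placeholder):

      import boto3

      ec2 = boto3.client("ec2")
      # must land before TerminateInstances / terraform destroy will succeed
      ec2.modify_instance_attribute(
          InstanceId=instance_id,
          DisableApiTermination={"Value": False})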
  • dpkirchner 5 hours ago
    Hell, let's talk about why ^c'ing the plan phase sucks.
  • otterley 4 hours ago
    "Because referential integrity is a thing, and if you don't have all dependencies either explicitly declared or implicitly determinable in your plan, your cloud provider is going to enforce it for you."