Ansible and AWS: adding hosts to known_hosts

Background

Ansible uses SSH to control hosts.
SSH (by default) expects the user to verify the identity of a server upon initially connecting to it.
When connecting to a host for the first time, you will be prompted:
Are you sure you want to continue connecting (yes/no/[fingerprint])?
If you type yes at this prompt, the host's public key and DNS name are added to a file called known_hosts.
The problem is that in many cloud environments, the DNS name of an instance changes at each boot.
Because Ansible relies on SSH to connect to hosts, a run fails if a host is not already in the known_hosts file: SSH stops and asks the user to add the host instead.

There are a few ways to solve this. You can instruct Ansible not to verify the server's identity, but that defeats the purpose of the mechanism, which is to protect you from, e.g., man-in-the-middle attacks.


Solution
The ssh-keyscan utility collects a host's public keys, which can then be appended to the known_hosts file.
We will use the AWS CLI to enumerate running instances and add them to the known_hosts file.
I have actually created a Jenkins job to do this for me on demand.

Prerequisites

    • ssh-keyscan (part of the OpenSSH client tools)
    • AWS CLI installed and configured
    • Security Group (firewall) rules on AWS that allow SSH from the machine running the script
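A quick way to confirm the first two prerequisites before running anything is sketched below. `require_tools` is a made-up helper name for this post, not something shipped with OpenSSH or the AWS CLI, and it only checks that the commands exist on the PATH, not that the AWS CLI is configured.

```shell
# require_tools is a hypothetical helper (assumption, not part of any package).
# It prints every missing command to stderr and returns non-zero if any is absent.
require_tools() {
    missing=0
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing prerequisite: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

At the top of the script it could be used as `require_tools ssh-keyscan aws || exit 1`.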

      TL;DR

      my_known_hosts="$HOME/.ssh/known_hosts"
      ## housekeeping ##
      if [ -f "$my_known_hosts.old" ]
          then rm -f "$my_known_hosts.old"
      fi
      ## housekeeping ##

      ## backup ##
      if [ -f "$my_known_hosts" ]
          then mv "$my_known_hosts" "$my_known_hosts.old"
      fi
      ## backup ##

      ## query aws for active hosts and add to known_hosts
      aws ec2 describe-instances --query 'Reservations[*].Instances[*].NetworkInterfaces[*].Association.PublicDnsName' --output text | xargs -L1 ssh-keyscan -H >> "$my_known_hosts"
      ## query aws for active hosts and add to known_hosts

      Explained

      1. Use a variable to store the file location:
        my_known_hosts="$HOME/.ssh/known_hosts"
      2. Clean up the old backup if it exists:
        if [ -f "$my_known_hosts.old" ]
             then rm -f "$my_known_hosts.old"
        fi

      3. Back up known_hosts just in case:
        if [ -f "$my_known_hosts" ]
             then mv "$my_known_hosts" "$my_known_hosts.old"
        fi
      4. Query AWS for the public DNS name of every instance that has one, run ssh-keyscan against each (-H hashes the hostnames in its output), and append the results to the known_hosts file:
        aws ec2 describe-instances --query 'Reservations[*].Instances[*].NetworkInterfaces[*].Association.PublicDnsName' --output text | xargs -L1 ssh-keyscan -H >> "$my_known_hosts"
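One thing the steps above do not cover: if the query returns no hosts, or ssh-keyscan cannot reach any of them, the `>>` redirect still leaves a new, empty known_hosts while the previous entries sit in the .old file. A minimal sketch of a guard follows, assuming a hypothetical helper named `restore_if_empty` that would be called right after step 4. It is one possible approach, not part of the original script.

```shell
# restore_if_empty is a hypothetical helper (assumption, not in the original script).
# If the freshly written file is missing or empty and a backup exists,
# copy the backup back into place and return 1 so the caller can react.
restore_if_empty() {
    new_file=$1
    backup_file=$2
    if [ ! -s "$new_file" ] && [ -f "$backup_file" ]; then
        echo "no keys collected, restoring $backup_file" >&2
        cp -p "$backup_file" "$new_file"
        return 1
    fi
    return 0
}
```

With the variable from step 1, the call would look like `restore_if_empty "$my_known_hosts" "$my_known_hosts.old"`. Returning non-zero after a restore lets a calling job mark the run as failed instead of continuing silently.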

      Bonus: Jenkins Job

      This script is what I ended up using on Jenkins. It assumes the hosts have just been powered up, so it waits for them to come online and then adds their keys to known_hosts as above.
      my_known_hosts="$HOME/.ssh/known_hosts"
      ## housekeeping ##
      echo Housekeeping
      echo ____________________________________

      if [ -f "$my_known_hosts.old" ]
          then rm -f "$my_known_hosts.old"
      fi
      ## housekeeping ##
      echo Backup
      echo ____________________________________

      ## backup ##
      if [ -f "$my_known_hosts" ]
          then mv "$my_known_hosts" "$my_known_hosts.old"
      fi
      ## backup ##
      echo waiting for instances to come online
      echo ____________________________________

      ## query aws for active hosts and add to known_hosts
      counter=1
      until [[ -n $(aws ec2 describe-instances --query 'Reservations[*].Instances[*].NetworkInterfaces[*].Association.PublicDnsName' --output text) ]]
      do
          echo "Waiting for instances to come online"  && echo $counter
          ((counter++))
          sleep 5s
          if [[ "$counter" -ge 10 ]]; then
              echo "AWS Instances did not come online within timeout. quitting"
              exit 1 # instances are not online
          fi
      done
      echo "Grace period: let all instances complete booting"
      sleep 20s
      echo adding hosts to known_hosts file
      echo ____________________________________

      aws ec2 describe-instances --query 'Reservations[*].Instances[*].NetworkInterfaces[*].Association.PublicDnsName' --output text | xargs -L1 ssh-keyscan -H >> "$my_known_hosts"
      ## query aws for active hosts and add to known_hosts
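The wait loop in the job above is a retry-with-timeout pattern, and it could be pulled out into a small reusable function. The sketch below assumes a hypothetical helper named `retry_until`. It is not an AWS or Jenkins feature, just plain shell, and it gives up after a fixed number of attempts rather than a wall-clock timeout.

```shell
# retry_until is a hypothetical helper (assumption, not in the original job).
# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds; returns 1 once MAX_ATTEMPTS have failed.
retry_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "gave up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}
```

In the job, the command passed in would be a small function that succeeds once the describe-instances query returns a non-empty result, for example `retry_until 10 5 have_hosts || exit 1`, where `have_hosts` is likewise a name invented here.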
      Any suggestions for improvement? Problems? Ideas?
