Automating RavenDB 4 with Ansible
I've finished migrating my database infrastructure for Keep Track of My Games from RavenDB 3.5 to 4.1 (and soon, 4.2!).
Summary
I thought I'd take a moment to lay out how I used Ansible, a popular configuration automation tool, to stand up my RavenDB servers with a bit less manual work and to maintain easy upgrades.
Since I use Vultr for hosting as it's fast and cheap I decided to "do it right" and plan for the future architecture of KTOMG which I hope to work towards this year, namely hosting on Linux only and cutting a ton of cost from my monthly hosting bill.
I believe it's important to treat your servers like cattle, not pets. Unfortunately, when I originally set up KTOMG on my Vultr VMs I didn't follow my own advice (remember dear reader: Do as I say not as I do). I had no automation scripts or configuration management, it was entirely bespoke (with documented steps, granted). I have used Ansible in the past and knew it would be a good fit for configuring my new RavenDB VMs.
Preparing the VM
In order to execute Ansible in the first place, I needed to set up a fresh Ubuntu VM on Vultr. Typically, you don't want to just run as root
for everything in Linux. Instead, you set up dedicated users with permissions to do the things you need. This is the principle of least privilege. I followed the steps outlined in this basic guide to securing Ubuntu 18.04 to set up a deployment user (deploy
) with appropriate permissions and lock down SSH.
In Vultr as well I created a Firewall group for the databases that allows my custom SSH port through (from specific IPs) which provides another layer of security.
The last thing I did was enable Canonical Live Patch to hotpatch the servers without rebooting, which is free for 3 servers. I did try to set up unattended upgrades with Ansible but was running into strange apt locking issues.
Much of this VM setup could be automated at least partially with VM Startup Scripts, a feature Vultr offers. Since it only takes a couple minutes and I only have a few VMs, I didn't bother with that step.
Once you test logging in via SSH with your new deploy
user, you can configure Ansible to do the same.
Setting Up Your Inventory
Ansible has the idea of an inventory which is the sets of servers to configure. You can group them and provide ansible variables to each one as-needed.
Ansible also supports dynamic inventory and even supports Vultr as a source! I opted to not do this, again due to the limited number of VMs I needed to configure.
The inventory file looks something like this:
[dbservers]
10.0.0.1
10.0.0.2
[dbservers:vars]
ansible_port=12345
ansible_user=deploy
ansible_become_user=ravensu
ansible_become_method=su
Each node is listed that you wish to configure followed by some shared "group" variables that you want to provide to Ansible (they are shared for each node, see variable precedence). In this case, we are connecting as the user deploy
on a custom SSH port.
What is ansible_become_user
? For the playbook we'll use, most if not all the tasks requires sudo permissions to configure. The way I accomplish this is by ensuring I connect via SSH with my approved user (deploy
) who has low permissions, then at playbook runtime specify Ansible to become a sudo user (ravensu
) to execute any privileged tasks.
Note: The ansible_become_method=su
was the only option that worked for me against my Vultr VMs and I'm unsure why. The default is sudo
but that wasn't working!
Creating the Playbook
In my example playbook, I only do a handful of tasks to set up a server. Possibly in the future, it'd be nice to even automate some of the cluster setup tasks you need to perform for Raven, scripted using the HTTP API. Bonus points: Make a formal Ansible role!
At a high-level, the playbook does the following:
- Install Docker on Linux
- Copy RavenDB settings and certificates to the remote machine
- Start a RavenDB 4 container with the appropriate volume mounts, networking configuration and environment variables set
View the playbook in my ravendb-devops repository.
I want to touch on a few key configuration settings:
- Port Mapping: The
published_ports
maps the VM host ports (8081, 38889) to the container ports Raven will bind to (8080, 38888). I don't think they need to be different but in experimenting with setting up Raven I ended up using different ports. - Settings: I am providing a custom
settings.json
file from the host machine. This contains pre-set config with IP addresses or DNS for what URLs to bind to and other server settings I wanted to customize. - Certificates: Raven uses two different kinds of certificates for server administration: a Cluster certificate and an Admin Client certificate. The cluster certificate is for communicating between nodes securely. The Admin Client cert is for you to connect to Studio securely via your browser. Raven needs the private key so it can authenticate you to the Studio interface using Client Certificate Authentication. Both certs need to be copied to the remote host and we map the host
/var/db_cert
to the container path/opt/RavenDB/cert
where Raven looks for certificates. - Raven Args: The
RAVEN_ARGS
env variable sets up some config needed such as the path to the customsettings.json
file we copy over, specifying no setup wizard experience, and what server URL to bind Raven to, in this case the private container address172.17.0.2:8080
. - Volumes: Containers stop and start, they are ephemeral. We don't want to lose our data if a container is deleted either (during an upgrade to a new image, say). Raven stores data within the container at
/opt/RavenDB/Server/RavenData
and we map that to our host path/var/db
so we can take backups, persist to SSD or block storage, etc.
Remember: Raven is running within a container, so we bind to a private IP. We then map the container port (8080) to a host port (8081) to expose it. Why don't we bind to 10.0.0.1
, the host IP? Because Raven can't see that from within a container and it'll fail to boot up. Instead, the DNS *.ravendb.community
can be used to set up a name that points to the private VM network IP and so you'll never need to expose Raven publicly outside your VM, besides the Studio interface to manage it (which you can lockdown to a limited IP range using Vultr).
Note about DNS: Notice I am copying certificates over. Where did I get them? I actually generated them all through Setup Wizard in a fresh local RavenDB installation so I could more easily set up a new cluster with certs and DNS already created. You can walk through the same steps with a local Docker container of RavenDB.
I found the following guides helpful when setting up RavenDB in a container:
- RavenDB Installation: Setup Wizard
- RavenDB: Docker on AWS Linux VM
- RavenDB: Running in a Docker Container
- DockerHub for ravendb/ravendb
Running the Ansible Playbook
Now that we have our ravendb.yml
playbook and inventory
file, we can run ansible-playbook
:
ansible-playbook -i inventory -K ravendb.yml
The -K
flag will prompt for the ansible_become_user
password. This should not be anywhere in plaintext in source control or your host machine. It either needs to be provided when running the command interactively or via Ansible Vault.
You'll be promoted first for the become_user
password, then the SSH key phrase if necessary, and finally it will run through the playbook.
Finishing the Installation
If it was successful, everything will be ok
and you can then browse to the Studio in your local browser against your VM's public IP address (and hopefully, only allowing your IP via a firewall rule).
If the Studio prompts for your certificate, choose it, and now you can finish setting up the cluster:
- Enter your license to activate the cluster
- Set up backups, which took all of 5 minutes to hook up to Azure Storage.
- Create client certs for your applications to connect to Raven programmatically. When creating a client certificate in the Studio, a zip file will be downloaded automatically after generating containing your
pfx
file you can use in your app.
Conclusion
While I've laid out what works for me if you deviate chances are you'll need to do a continuous trial-and-error process to successfully configure your Raven installation. But I hope this provides a nice starting point for configuring a production cluster! If you run into any issues, feel free to leave a comment so others can learn too.
If you are still learning RavenDB 4, check out my Getting Started with RavenDB 4 Pluralsight course! It covers the basics like how to get around the Studio, connect with .NET, and more. Using my link won't cost you anything extra but I will get credit for your subscription which helps me out!