I’m not a devops engineer or an admin. My day-to-day job is software development: mainly back-end (Ruby / Golang), sometimes front-end (React). I also deal with databases, APIs, all that stuff. And I’m here to say that, truthfully, nowadays you don’t need any knowledge of infrastructure or deployment strategies if you just want to run yet another web app. So why should we, developers, care about devops?
To begin with, let’s admit that PaaS platforms like heroku.com or now.sh are amazing game changers. They dramatically reduced the cost of infrastructure support at the initial stage of a project. If you just want to test an idea or launch a simple utility that will make your life easier, you can get it up and running with almost zero effort. During my career I’ve seen companies that were able to build successful online businesses from scratch without in-house admins or devops for a pretty long time, just by using PaaS. And by “pretty long” I mean years.
So, yes, for sure, it’s possible to create an app, publish it and deliver it to end users without any knowledge of servers, clusters, IPs, installations, replication and other buzzwords. No problem, bro. It’s a fact proven by reality.
The issue with PaaS is that it hides complexity behind common conventions. That helps in the beginning, while all apps are alike, like small kids. They scream, dream and want to eat, right? But every adult service is a unique set of decisions, needs, biases and exceptions. So once your strategies and requirements grow up, you may find yourself learning the internal details of a specific PaaS very deeply just to make things work the way you want. That raises the complexity of the infra to the level of a standalone installation. So what’s the point of staying away from servers?
As an alternative to PaaS, it’s very common to see the opposite approach, where the system is built entirely in-house and managed by a special team of smart people. In the end it makes no difference for the developer: you still have to treat production as a black box, but often one with poor quality and a UX not even close to PaaS.
It also draws a border between “my responsibility” and “not my responsibility”. How often have you seen an investigation that goes down to the networking or service discovery level and stops there with the resolution “infrastructure problem, please ask devops”? Or a situation where some good initiative can’t be done because “oh, f*ck, we’ll need to go to devops to do it, so let’s keep it simple”? And “simple” here is a euphemism for “a little bit stupid”.
So it seems very important to me to know how the applications I create are operated. Well, maybe I’m a bit of an old-school control freak, but I strongly believe that your ops affect your development process, your decisions and even the way you think about a problem. The quality of the infra is the basis for the quality of the application itself. So a developer who doesn’t know what their infra is and isn’t capable of is sometimes blind, or at least not effective.
- If your commit is deployed immediately after being merged to the master branch and you know how that works (hooks, actions, pipelines, rolling deployment, etc.), you’ll be much more concerned about having proper documentation, test coverage and feature flags. “Oh, service Y ended up being deployed before service X, but our beautiful feature there requires the new version of X to work, so now everything is down”. What were you thinking about when you pressed “merge”, pal? And, btw, have you discussed a rollout strategy for your db migrations in production? Can you see why your `ALTER TABLE xxx ADD COLUMN yyy` should be merged and executed as a separate deployment stage?
- If your infra has advanced support for shadow / canary / beta deployments and you know how to use it, a lot of problems can be approached by simply battle-testing an idea in production. At the same time you get better protection from “black swans”, because you build with possible failures during experiments and shadow testing in mind. If your team laughs when somebody says “let’s just check it in production”, something has gone wrong with the infra. Switching db replicas and monitoring a specifically labeled deployment should be a regular thing for developers.
- If you know in advance that there is no guarantee regarding the hosts or IPs where your apps are running, then service discovery and dependency injection become your best friends. They make your system more agile and open up a lot of interesting options for security and service-level access management. You can clearly see how the insecure local version of your code becomes a TLS-aware thing in production. Why don’t I care that this endpoint, which is supposed to be used only by A, might be called by B? Because that’s enforced at the level of the mesh.
- If you are aware that your services can go up and down multiple times per hour, and sometimes even per minute, then alive and heartbeat endpoints, distributed tracing, traffic monitoring, retries and circuit breakers are not an option but a requirement. All those capabilities may be provided and exposed by your environment, but it’s hard to use them without knowing what a “sidecar” or a “service mesh” is.
- When operations are 100% decoupled from developers, a team may put a lot of effort into solving a problem that could easily be resolved by the environment, just because people prefer the tools they know and understand. Have you spent a couple of months building a networking library that supports circuit breakers and request / response rate monitoring in Ruby? Good. Now repeat it in Go and Node.js and don’t forget to keep them in sync. Or just add a proxy to your pod. But never mind. Keep coding, pal.
With these points in mind I’m starting a series of blog posts called “Devops for developers”. But as I said, I haven’t been a devops for most of my life, so a lot of things may be missed or done wrong just because I’m not good enough. So if you see something done the wrong way, or you know a better one, share it. If you don’t know anything and are trying to learn, be careful and think before following instructions.
Here are a few of the topics we are going to talk about:
- How can you use Terraform to automate your infrastructure creation?
- How can you use Ansible to keep configuration management under control? How can you make it friends with Terraform?
- How can you use Kubernetes to build a cluster for your application? How can the master be initialized and the nodes joined by using Ansible inside Terraform’s provisioners?
- How can you install and use Prometheus to start monitoring your application?
- How can different tools for organizing a service mesh, like Istio or Consul Connect, be deployed and used in your cluster?
- How can you manage your cluster as a digital space for “pet” projects in your company?
There will probably be more, since I’m going to write as I go. That’s, by the way, another reason to read carefully and give your feedback: there is a chance that some things I do in the post you are reading will be fixed or changed in later ones. Don’t treat this series as a manual. It’s more like a live blog.
Ok, so if the motivation is clear, let’s get it up and running, right?