EDIT (26 July 2017) Updated the post with a better solution (since I was wrong and also etcd 3.2 accepts peerURLs containing domain names).
When using an etcd cluster to store important key value data you’ll probably prefer data persistency over availability. If more than half of your etcd cluster members go down, you’ll prefer to wait for them to come back, accepting a loss of availability, instead of recreating a new etcd cluster from a backup that will probably contain an old version of the data at the moment of the disaster.
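This trade-off follows from etcd’s Raft-based quorum rule: the cluster makes progress only while a strict majority of members is alive. A minimal sketch of the arithmetic (illustrative only, not etcd source code):

```python
# Illustrative sketch of etcd's quorum arithmetic.
# etcd needs a majority (quorum) of members to accept writes; with fewer
# live members it becomes unavailable rather than risking data loss.

def quorum(members: int) -> int:
    """Minimum number of live members needed to make progress."""
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    """How many members can fail while the cluster stays available."""
    return members - quorum(members)

for n in (1, 3, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

So a 3-member cluster tolerates one failure and a 5-member cluster two; losing more than that means waiting for members to return rather than restoring from a possibly stale backup.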
As an example, in stolon the stolon cluster data is saved inside a store like etcd or consul. Restoring the stolon cluster data from a backup could lead to bad behaviors, since the stolon cluster state (which contains different information, the primary one being which postgres instance is the master/primary) won’t be in sync with the real stolon cluster state.
Stolon was architected to be seamlessly deployed inside a k8s cluster, so it becomes logical to also deploy the store (etcd or consul) inside k8s.
Today there are different ways to deploy an etcd cluster inside k8s but, as I’m going to explain in this post, they don’t meet the above requirement:
Why do the above options fail to meet our requirements?