There were a few reasons for the upgrade. First and foremost, our provider found.no (who are absolutely phenomenal and really know their stuff: so much so that Elastic, the company behind elasticsearch itself, recently acquired Found) was in the process of deprecating older versions of elasticsearch. This was the main driving force behind the switch. In addition, we were still on a pretty old version (1.1.1), and development on elasticsearch moves quickly, so we were missing out on lots of new progress and features.
Upon starting the transition I did some searching around to see if there were any guides or best practices on upgrading a production Rails application from ‘retire’ to ‘elasticsearch-ruby’. The ‘retire’ gem, formerly ‘tire’, was and is fairly popular, but it had been retired for some time, with a suggestion at the top of the README to move to elasticsearch-ruby. Yet, with my searching, I found very little. So: this is a guide to how we made the transition, with tips that we hope will help readers.
The first big change that you will notice is that the new elasticsearch alternative is much more modular and broken up than Tire was. Tire was mostly one library with all of the functionality built-in. With elastic’s solution, you have lots of smaller, nicely encapsulated gems:
elasticsearch-api
elasticsearch-dsl
elasticsearch-extensions
elasticsearch-transport
elasticsearch-watcher
elasticsearch-model
elasticsearch-persistence
elasticsearch-rails
This modularity makes it simple to grab only the parts you need. In our case we used the following:
```ruby
## Search
gem 'elasticsearch-rails', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
gem 'elasticsearch-model', git: 'git://github.com/elasticsearch/elasticsearch-rails.git'
gem 'elasticsearch-dsl',   git: 'git://github.com/elasticsearch/elasticsearch-ruby.git'
```
‘elasticsearch-rails’ is fairly small and mostly builds on top of elasticsearch-model to provide Rails-specific functionality. The elasticsearch-model gem incorporates the API into your models: it is what lets you define mappings, specify settings, import your data into your cluster, call convenient search methods from your models, and more. Last but certainly not least, the DSL makes it very easy to write complex queries in Ruby and makes the transition from ‘tire’ very straightforward.
Let’s take a look at an example of a DSL change.
As you can see, they are very similar. There are only minute changes, such as terms queries accepting a hash rather than separate parameters.
You may note that above we have a few method calls such as Pin.add_aggregations. These methods are defined in a concern we call Searchable, which each searchable class includes. This concern has made the transition much simpler for us, and we would recommend the pattern to everyone: it makes searching, index updates, aggregation definitions, and other boilerplate much more manageable and DRY.
Rake tasks and Tools
A common data migration that we run at Handshake is a mapping change. Over time this produced a library of elasticsearch-based helpers, which of course needed to be updated. A nice change in the new suite of gems is the introduction of the ‘elasticsearch-api’ gem, which defines lots of helper actions. To view them on GitHub (along with all of the great comments within each one), go to: https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-api/lib/elasticsearch/api/actions. I couldn’t find anything in Tire that wasn’t properly replaced by an API action in the new suite.
One facet of the ‘retire’ gem that was greatly appreciated across our team was the incredible depth and helpfulness of its comments. I’m happy to say that the great comments found in ‘retire’ are also in the new suite of gems. In addition, the suite is very well tested, and one can almost always find an example of a particular piece of the library in a unit or integration test. Although more in-depth documentation for the gems would be nice, the comments and test suite get you 90% of the way there.
There were a few other ‘gotchas’ that we noticed during our upgrade.
The _source attribute in elasticsearch results. In tire, the contents of _source were moved to the top level of each returned result. The new suite doesn’t do this, so you will find yourself referring to _source on the result rather than having the expected data at the top level of the result itself. With some existing abstraction, this problem was solved fairly easily.
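In terms of a raw search hit, the difference looks like this (a toy hit, not real data):

```ruby
# A search hit in the shape the elasticsearch API returns it:
hit = {
  '_index'  => 'pins',
  '_id'     => '1',
  '_source' => { 'title' => 'Hello' }
}

# tire flattened _source so document fields read as top-level attributes;
# with the new suite you reach through _source yourself:
hit['_source']['title']  # => "Hello"
```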
No automatic index creation! This one is a great change, as we use an alias strategy for our reindexing. Previously, with ‘retire’, if the index for a model did not exist it would be created automatically. This is no longer the case. While this is great for us overall, it did introduce some spec issues where we were not explicitly creating the index before it was tested.
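For the spec issue, the fix is to create the index explicitly before the examples that need it. A sketch assuming RSpec and an elasticsearch-model class; the Pin model and the :elasticsearch tag are illustrative, and create_index!/delete_index! are real elasticsearch-model helpers.

```ruby
require 'rspec'

RSpec.configure do |config|
  config.around(:each, elasticsearch: true) do |example|
    # force: true drops any leftover index before creating a fresh one
    Pin.__elasticsearch__.create_index!(force: true)
    example.run
    Pin.__elasticsearch__.delete_index!
  end
end
```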
No more load: true. Instead, elasticsearch-model introduces two different methods which change how records are loaded: Article.search().results will return JSON results directly from elasticsearch, while Article.search().records will load them from the database based on the returned ids.
When it came to finally upgrading our cluster, we wanted to ensure as little disruption as possible. We also wanted a plan in place in case the upgrade failed. The high-level process we ended up using was to: 1) take a snapshot of the old cluster, 2) keep track of all changes happening from the start of the snapshot so they could be replayed onto the new cluster, 3) spin up the new cluster and import the snapshot, and 4) hot swap the application to point to the new cluster.
We were thankful that found.no provided an upgrade solution almost exactly like the one we had planned, and we were able to opt for their automated and well-tested process. If the upgrade were to fail, it would automatically fall back to the old cluster for us.
During the upgrade process we wanted to make sure no changes were lost. This was very simple for us thanks to our ‘Searchable’ concern: we simply forced all index updates to go through our background job queue, and turned the queue off during the upgrade. While this resulted in a few minutes of updates not being propagated, catching back up was as simple as turning the queue back on. And if the upgrade had failed, it would have been just as simple to get the updates back into the old cluster. In the future we hope to implement a more robust strategy of writing updates to both clusters during an upgrade.
Overall we found our upgrade strategy to be successful, with just a few minutes of lagged index updates. Awesome!
Have thoughts on our transition or questions? We’d love to hear them! Just reach out to us on our website and we’ll get in touch.
Originally posted on Medium