Elasticsearch: 5 Things I Wish I Understood Before Starting.
I got these wrong, and I don’t want you to make the same mistakes.
“I just found a wonderful tool that can help us to develop the search engine for our free images website and that’s Elasticsearch”. That was my friend Aiman when we were building our first real project, years ago. We thought we had found the perfect solution to all of our problems.
Guess what, we failed. Not because Elasticsearch is not a great tool but because we didn’t really understand what it’s meant to help with and how it really works. Now that I understand, I don’t want you to have the same misunderstandings.
1- Elasticsearch is a full-text search engine
The first time I read that Elasticsearch is a full-text search engine I thought about fuzzy search, partial text search, and a lot of other search engine features. Elasticsearch has got all these features but this is not really what full-text search means.
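We talk about full-text search when we’re looking for an exact match of a given word within a piece of content or a collection of words. By default, Elasticsearch splits text into whole-word terms when it indexes it, and a query matches against those whole terms.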
It means that if we have the sentence “Elasticsearch is such an amazing tool.” and we run a full-text search for the word “amaz”, we’re going to get an empty result.
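To make this concrete, here is a minimal sketch in plain Python of whole-word matching against an inverted index. It is only an illustration of the idea, not Elasticsearch’s actual implementation: the names `tokenize`, `build_inverted_index`, and `search` are made up for this example, and requiring every query word to match is a simplifying assumption.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into whole-word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every token of the query.

    Assumption for this sketch: all query tokens must match.
    """
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = None
    for token in tokens:
        ids = set(index.get(token, ()))
        result = ids if result is None else result & ids
    return result

docs = {1: "Elasticsearch is such an amazing tool."}
index = build_inverted_index(docs)

print(search(index, "amazing"))  # {1}
print(search(index, "amaz"))     # set()
```

The index only knows the token "amazing", so the partial word "amaz" has nothing to match against.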
But as I was saying, Elasticsearch has features that help you implement partial-text search and fuzzy search, such as the search-as-you-type field type. To learn more about those features, check this article: Elasticsearch 7.13 is there: Software Engineer, Get to Know the Giant of Searching and Analytics.
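One common way to support partial matching is to index the prefixes of each word (often called edge n-grams) alongside the word itself. The sketch below shows that idea in plain Python. It is a simplified illustration rather than how Elasticsearch works internally, and `edge_ngrams`, `build_prefix_index`, and the `min_len` parameter are hypothetical names chosen for this example.

```python
def edge_ngrams(token, min_len=2):
    """Return every prefix of the token, from min_len characters up to the full token."""
    return [token[:i] for i in range(min_len, len(token) + 1)]

def build_prefix_index(docs, min_len=2):
    """Map each prefix of each word to the set of document ids containing that word."""
    index = {}
    for doc_id, text in docs.items():
        # Very naive tokenization: lowercase, drop periods, split on whitespace.
        for token in text.lower().replace(".", " ").split():
            for gram in edge_ngrams(token, min_len):
                index.setdefault(gram, set()).add(doc_id)
    return index

docs = {1: "Elasticsearch is such an amazing tool."}
prefix_index = build_prefix_index(docs)

print(edge_ngrams("amazing"))
print(prefix_index.get("amaz", set()))  # {1}
```

The price of this approach is a larger index, since every word is stored several times, once per prefix.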
2- Think of a relational database first
We wanted to build a search engine, so we started by researching how to integrate Elasticsearch into our stack. That was a big mistake. If I had to start over, I would first implement a simple, basic search engine on top of the relational database (or whatever primary database we were working with). Only once that search was working would I introduce Elasticsearch to improve our search capabilities. I still regret not doing it that way!
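As a rough idea of what such a basic, database-only search could look like, here is a small sketch using Python’s built-in `sqlite3` module. The `images` table, its columns, and the `basic_search` function are assumptions made up for this example, not the schema we actually used.

```python
import sqlite3

# In-memory database standing in for the primary relational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO images (title, description) VALUES (?, ?)",
    [
        ("Sunset over the sea", "Orange sky above calm water"),
        ("Mountain lake", "A quiet lake at sunrise"),
        ("City at night", "Street lights and traffic"),
    ],
)

def basic_search(conn, term):
    """Return (id, title) rows whose title or description contains the term.

    Note: '%' and '_' inside the term are not escaped in this sketch.
    """
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT id, title FROM images "
        "WHERE title LIKE ? OR description LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return cur.fetchall()

print(basic_search(conn, "lake"))  # [(2, 'Mountain lake')]
```

A `LIKE` query like this will not scale or rank results well, but it gives you a working search to compare against before adding another system to the stack.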
3- You need to replicate your data on Elasticsearch
Back then, we were really confused about how Elasticsearch works. After a few days of reading, we thought Elasticsearch would connect directly to our database to process our requests. That’s not how Elasticsearch actually works.
What we had to do, after designing and setting up the relational database that stores our data, was design the schema of an index (an index is roughly what we would call a database from a relational perspective). We then had to store our data in that index and sync every change of the primary data to Elasticsearch (that’s what is called indexing). That way, we can run our search requests against the data indexed in Elasticsearch, even though it originally comes from our relational database.
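Here is a small sketch of those two steps in Python: describing an index schema (a mapping) and turning database rows into the newline-delimited JSON body that Elasticsearch’s bulk endpoint accepts. The `images` index, its fields, and the `to_bulk_payload` helper are hypothetical names for this example, and the sketch only builds the payload; it does not talk to a real cluster.

```python
import json

# Hypothetical index schema (mapping) for an "images" index.
# "text" fields are analyzed for full-text search; "keyword" fields are matched as-is.
images_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "description": {"type": "text"},
            "license": {"type": "keyword"},
        }
    }
}

def to_bulk_payload(index_name, rows):
    """Build a bulk request body: one action line, then one document line, per row."""
    lines = []
    for row in rows:
        doc = dict(row)
        doc_id = doc.pop("id")  # the primary key becomes the document id
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

# Rows as they might come out of the relational database (made-up data).
rows = [
    {"id": 1, "title": "Sunset over the sea", "description": "Orange sky", "license": "cc0"},
    {"id": 2, "title": "Mountain lake", "description": "A quiet lake", "license": "cc0"},
]

payload = to_bulk_payload("images", rows)
print(payload)
```

Reusing the primary key as the document id means that re-indexing the same row overwrites the existing document instead of creating a duplicate.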
4- You need a tool to sync your primary data with Elasticsearch
Yes, you read that right. But it doesn’t mean that you have to stress too much about it. The first question you should ask yourself if you want to add Elasticsearch to your stack is:
Which software/engine/package can I use to easily sync my data from my primary database to my Elasticsearch index?
Elasticsearch is part of a stack: the ELK stack (Elasticsearch, Logstash, and Kibana). Logstash is the part of that stack that can make it easy to sync data from your primary database to your Elasticsearch instance.
Depending on the framework and the database system you’re using, you can find the right tool for this synchronization. I’ve worked on a project in which we combined a MongoDB database with Elasticsearch, and that worked really well.
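Whatever tool you pick, it helps to know what a sync roughly does. One common approach is to poll the primary database for rows changed since the last run. The sketch below illustrates that with Python’s `sqlite3` module and a plain dict standing in for the Elasticsearch index. The `updated_at` column, `fetch_changes`, and `sync_once` are assumptions for this example, and it does not handle deleted rows.

```python
import sqlite3

def fetch_changes(conn, last_sync):
    """Return rows modified after last_sync, oldest first."""
    cur = conn.execute(
        "SELECT id, title, updated_at FROM images "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    )
    return cur.fetchall()

def sync_once(conn, search_index, last_sync):
    """Copy changed rows into search_index and return the new checkpoint."""
    for row_id, title, updated_at in fetch_changes(conn, last_sync):
        search_index[row_id] = {"title": title}  # insert or overwrite the document
        last_sync = max(last_sync, updated_at)
    return last_sync

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO images VALUES (1, 'Sunset', 100)")
conn.execute("INSERT INTO images VALUES (2, 'Lake', 200)")

search_index = {}  # stand-in for the Elasticsearch index
checkpoint = sync_once(conn, search_index, 0)

# A later change in the primary database is picked up by the next run.
conn.execute("UPDATE images SET title = 'Mountain lake', updated_at = 300 WHERE id = 2")
checkpoint = sync_once(conn, search_index, checkpoint)

print(search_index, checkpoint)
```

Only rows newer than the checkpoint are read on each run, which keeps the sync cheap even when the table is large.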
5- From relational data to a document-oriented database
Here is one of the mistakes that a lot of people who want to use Elasticsearch make. In most cases, your primary database will be a relational one, while Elasticsearch is a document-oriented database. It means that you need to think about how to organize your relational data into documents so that it’s easy to store, sync, and search through. I’m going to write a dedicated article about this topic in the next couple of days to make it clearer.
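As a small taste of what that reorganization means, here is a sketch that flattens rows from three related tables into a single document. The tables (`authors`, `images`, `image_tags`), the sample data, and the `build_document` function are all made up for this example; your own schema will drive what the document should look like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    title TEXT,
    author_id INTEGER REFERENCES authors(id)
);
CREATE TABLE image_tags (image_id INTEGER REFERENCES images(id), tag TEXT);

INSERT INTO authors VALUES (1, 'Example Author');
INSERT INTO images VALUES (10, 'Sunset over the sea', 1);
INSERT INTO image_tags VALUES (10, 'sunset');
INSERT INTO image_tags VALUES (10, 'sea');
""")

def build_document(conn, image_id):
    """Join the related rows of one image into a single nested document."""
    row = conn.execute(
        "SELECT i.id, i.title, a.name FROM images i "
        "JOIN authors a ON a.id = i.author_id WHERE i.id = ?",
        (image_id,),
    ).fetchone()
    if row is None:
        return None
    tags = [
        tag
        for (tag,) in conn.execute(
            "SELECT tag FROM image_tags WHERE image_id = ? ORDER BY tag",
            (image_id,),
        )
    ]
    # Everything needed to search and display the image lives in one document.
    return {"id": row[0], "title": row[1], "author": {"name": row[2]}, "tags": tags}

print(build_document(conn, 10))
```

The trade-off is duplication: if the author’s name changes, every document that embeds it has to be re-indexed.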
Elasticsearch is such an outstanding tool that I don’t even know how it’s possible that it’s free. But that’s also what makes it so valuable for us as engineers, geeks, and tech guys.
If you’re looking for a logging system that you can rely on and that gives you amazing insights in real time, ELK is for sure the way to go. If your aim is to build your search engine on Elasticsearch, you should take into account the five points I’ve mentioned in this article. I hope you found them useful.
Till next time, take care! And don’t forget to follow me.