Elasticsearch: Complete Guide for Searching (100% Practical)

Akintola L. F. ADJIBAO
8 min read · Jul 1, 2021

You’ve just finished reading tons of “Elasticsearch from scratch” tutorials or “x things you should know about Elasticsearch” articles. You sit staring bitterly at your screen. You gathered a lot of information but nothing really practical to do with Elasticsearch.

Welcome to the party! You’re not the first; you’re just going through the same hell I went through months ago.

Searching is one of the most important features of Elasticsearch, and in this post we’ll be discussing the 20% of the knowledge that gives you 80% of the searching skills you need in Elasticsearch.

Elasticsearch is constantly improving. The way searches are made, along with parameters and settings, keeps changing. So apart from this guide, you will need to check the Elasticsearch site for the latest updates.

Prerequisite

This post is 100% practical. So we need to be sure you have Elasticsearch installed and can use the terminal. If you don’t have Elasticsearch already installed, check Installing Elasticsearch on Ubuntu, Mac, or Windows to install it.

We agreed that this post would be truly practical, but a few theoretical concepts are mandatory before we start. I’ve condensed them for you: it’ll take just 5 Minutes for the Whole Theory you need to know about Elasticsearch.

We need data to make searches

Before starting, we need to seed an Elasticsearch index to execute our queries.

For that purpose, we’ll perform our queries on (guess what?) some trendy data: mangas.

Without any further formalities, let’s execute this query to create an index and store our manga data into it. Our index is called mangas.
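The original seed data is not reproduced here, so what follows is only a minimal sketch of what such a bulk insertion could look like. The four documents are illustrative stand-ins (the title “The Journey Home” is made up, and the years are for demonstration only); the only things taken from this post are the index name mangas and the title and year fields. I’m also assuming Elasticsearch 7 or later, where documents no longer need a mapping type.

```shell
# Illustrative stand-in data, NOT the post's original dataset.
# "The Journey Home" is a made-up title; years are for demonstration only.
BULK_BODY=$(cat <<'EOF'
{ "index": { "_index": "mangas", "_id": "1" } }
{ "title": "One Piece", "year": 1997 }
{ "index": { "_index": "mangas", "_id": "2" } }
{ "title": "Slam Dunk", "year": 1990 }
{ "index": { "_index": "mangas", "_id": "3" } }
{ "title": "Naruto", "year": 1999 }
{ "index": { "_index": "mangas", "_id": "4" } }
{ "title": "The Journey Home", "year": 2003 }
EOF
)

# The bulk API expects newline-delimited JSON ending with a newline,
# so printf re-adds the trailing newline that $(...) strips.
printf '%s\n' "$BULK_BODY" | curl -s -XPOST -H "Content-Type: application/x-ndjson" \
  "127.0.0.1:9200/_bulk?pretty" --data-binary @- \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```

Because the mangas index does not exist yet, Elasticsearch creates it on the fly and guesses the field types by itself (dynamic mapping).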

Now let’s check if the magic happened. To get all the mangas inserted, run the command below.

curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?pretty"

You will get all the mangas we’ve just inserted in the hits attribute of the output.

Now let’s start the interesting part of the cake: the queries. Let‘s jump right in!

There are basically two ways to perform a search in Elasticsearch:

  • Through Query Lite searches, which is another way of saying “I don’t have time to write long clauses” or “I need quick checks”,
  • And through Body searches, for deep and advanced searches.

1. Query Lite

Query Lite is the shortest and quickest way to perform searches against an Elasticsearch index. Let’s assume that you want to know if your index has a certain manga whose title contains “Journey”.

Actually, you don’t have to stress yourself. All you have to do is to rely on Query Lite searches. Let’s make our first queries thanks to Query Lite searches.

Example 1: Searching for documents that have “Journey” in any of their fields

curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?q=Journey&pretty"

Let’s break down this first example:

  • curl is a command-line tool used here to interact with the Elasticsearch API,
  • 127.0.0.1:9200 is the combination of the IP address (127.0.0.1) and the port (9200) we use to reach the Elasticsearch API,
  • mangas is the name of our index,
  • q=Journey is the main part of our request. It means we are searching for the word “Journey” in all the data we have in our index.

Example 2: Searching for documents that have “Journey” in their “title” field.

curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?q=title:Journey&pretty"

Example 3: Searching for documents that have “Journey” in their “title” field OR a publication “year” greater than or equal to 2000 (in a URL, “+” stands for a space, and clauses separated by a space are combined with OR by default)

curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?q=title:Journey+year:>=2000&pretty"

Something to keep in mind is that Query Lite searches can be performed in a browser since they are GET requests. As you’d imagine, if we want to make our search through a browser, we have to encode the URL because we’ll be using some special characters (:, +, etc.). So it quickly becomes difficult to write a complex and advanced search with the Query Lite interface. The other snag is security: if you expose this interface to end users, they can run arbitrary, potentially expensive queries against your cluster.

This is where our main character comes into play 😎 : Body search or Query DSL.

2. Query DSL

Let’s now dive into the most interesting part of Elasticsearch search.

First of all, what does DSL stand for?

In Elasticsearch, DSL stands for domain-specific language. DSL queries use the HTTP request body and allow you to specify the full range of search options Elasticsearch puts at your disposal.

Now that we know what DSL stands for, we’re good to go through its different usages and options.

Summary

  • Match query
  • Term query
  • Filters
  • Compound queries

Match query

The Match query is one of the most used queries in Elasticsearch. Match queries return results if the search term is present in the field. You can use this query to search for text, numbers, or boolean values.

For example, let’s search for all of the documents (mangas in our case) that have the word “Journey” in their title, as we did before in the Query Lite section.
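Here is a minimal sketch of that search, assuming the mangas index and its title field described above. The query lives in the request body, sent with curl’s -d option.

```shell
# Match query: find documents whose "title" contains the word "Journey".
MATCH_QUERY='{
  "query": {
    "match": { "title": "Journey" }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MATCH_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```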

Well, look at the result. You’ll notice that one manga has got a title which contains “Journey”.

Term query

Let’s assume that you want to select only the mangas published in 1990. The best query type to use is the Term query. Why? Because you know exactly the publication year of the mangas you’re looking for.

Term queries find documents that contain the exact term specified in the search query.

Let’s take a look at it.
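A minimal sketch of such a Term query, assuming the year field of our mangas index holds the publication year as a number:

```shell
# Term query: find documents whose "year" is exactly 1990.
TERM_QUERY='{
  "query": {
    "term": { "year": 1990 }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$TERM_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```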

I hope you got it. Perfect! It’s working.

Now, let’s dive deeper into the Term query. What we are going to see is one of the most common sources of confusion when it comes to using Term queries.

Let’s assume you want to select the “One Piece” manga. So we know exactly the title of the manga we’re looking for. Let’s search for it with a Term query.
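The attempt could look like this sketch (same assumptions as before: the mangas index and a title field that Elasticsearch mapped on its own):

```shell
# Term query on a text field: looks for the exact term "One Piece".
TITLE_TERM_QUERY='{
  "query": {
    "term": { "title": "One Piece" }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$TITLE_TERM_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```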

No document found 😲 !!! Why?

The simple explanation is mapping.

  • During our bulk insertion, Elasticsearch dynamically mapped the title field to the “text” type.
  • Because every text field is analyzed before it is indexed, each manga title has been split into lowercased words (tokens).

Example: “One Piece” = [“one”, “piece”]

  • So when we search for “One Piece” with a Term query, Elasticsearch looks for the exact term “One Piece” and finds that there is none: there is only “one”, “piece”, and so on.

That’s why we don’t have any results.

To solve that problem, you have to reindex your data. You’ll define a new index with a new mapping, and then you will seed your new index with the former data. To learn more about mapping, I invite you to check my article on the topic: Software Engineer, Before Inserting One Iota of Data in an Elasticsearch Index, You Must Do This.
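As a rough sketch of that reindexing step, and not necessarily the mapping used in the linked article: the index name mangas_v2 is hypothetical, and I’m assuming Elasticsearch 7 or later (typeless mappings). The title field becomes a keyword, so it is stored as one exact, non-analyzed value.

```shell
# Hypothetical new index "mangas_v2" with an explicit mapping:
# "title" is a keyword (exact value, not analyzed), "year" an integer.
NEW_MAPPING='{
  "mappings": {
    "properties": {
      "title": { "type": "keyword" },
      "year":  { "type": "integer" }
    }
  }
}'

# Copy every document from the old index into the new one.
REINDEX_BODY='{
  "source": { "index": "mangas" },
  "dest":   { "index": "mangas_v2" }
}'

curl -s -XPUT -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas_v2?pretty" -d "$NEW_MAPPING" \
  && curl -s -XPOST -H "Content-Type: application/json" \
  "127.0.0.1:9200/_reindex?pretty" -d "$REINDEX_BODY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```

The trade-off is that a pure keyword field no longer matches on individual words such as “Journey”. Also worth checking before reindexing: with the default dynamic mapping, Elasticsearch usually creates a title.keyword sub-field next to the text field, so a Term query on title.keyword may already do the job. You can inspect your mapping at 127.0.0.1:9200/mangas/_mapping.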

Filter queries

Filters are very interesting in Elasticsearch. A filter asks a yes/no question: it finds all of the documents that match its exact conditions. There is no relevance score here, unlike with queries. The main advantage of filters is that they are faster and cacheable. Here, we’ll be talking about the Term filter and the Range filter.

- Term filter

Term filters behave like term queries. The main difference is the absence of relevance evaluation for Term filter.

As before, let’s try to find the manga(s) published in 1990.
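One way to write this, sketched under the assumption that the filter is placed in the filter clause of a bool query (the usual home for filters in recent Elasticsearch versions):

```shell
# Term filter: same condition as the Term query, but inside "filter",
# so Elasticsearch does not compute a relevance score.
TERM_FILTER='{
  "query": {
    "bool": {
      "filter": { "term": { "year": 1990 } }
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$TERM_FILTER" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```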

Take a look at the “_score” field: you will notice that its value is 0.0, which means no relevance estimation.

- Range filter

The second filter we’re going to study is the Range filter. It can be used on date and number fields, as well as on text and keyword fields. The snag is that a range filter on text or keyword fields is time-consuming, and it’s not good practice, especially for text fields.

Let’s take a look at it.

We’ll select all the mangas published after 1995 and up to 2000, 2000 included (1995 < year ≤ 2000).

To express the conditions of a Range filter, you can use the following parameters:

  • gt: greater than
  • gte: greater than or equal to
  • lt: less than
  • lte: less than or equal to

So let’s run this query.
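A minimal sketch of that Range filter, again placed in the filter clause of a bool query and assuming a numeric year field:

```shell
# Range filter: 1995 < year <= 2000
#   gt  = strictly greater than 1995
#   lte = less than or equal to 2000
RANGE_FILTER='{
  "query": {
    "bool": {
      "filter": {
        "range": { "year": { "gt": 1995, "lte": 2000 } }
      }
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$RANGE_FILTER" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```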

Crazy ✨ !!! I don’t know about you, but it took around … 0 seconds 😲 (filters) on my machine!

Compound queries

Till now, we’ve tried quite simple queries and filters. However, in real life, you will rarely be dealing with simple queries like those ones. In real-world situations, engineers and database administrators deal with complex queries with many conditions.

Elasticsearch makes it easy to run complex and advanced queries (queries with filters) by providing Compound Queries.

The common way to make those searches is through Bool queries. Let’s have a look at a typical bool query structure:
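As a sketch, a bool query using all four options could look like this. The search values (“Journey”, “Home”, 1990, 2000) are placeholders picked for illustration; only the title and year fields come from our index.

```shell
# Typical bool query skeleton with its four options.
BOOL_QUERY='{
  "query": {
    "bool": {
      "must":     [ { "match": { "title": "Journey" } } ],
      "must_not": [ { "term":  { "year": 1990 } } ],
      "should":   [ { "match": { "title": "Home" } } ],
      "filter":   [ { "range": { "year": { "gte": 2000 } } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$BOOL_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```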

Let’s go through that query:

  • must: the clause must match in the returned documents, and it contributes to the score.
  • must_not: the clause must not match in the returned documents.
  • should: the clause should match in the returned documents. It is most often combined with must purely to increase relevance, but it can also be used alone.
  • filter: the clause must match in the returned documents. However, unlike must, it does not contribute to the score.

When it comes to talking about compound queries there are basically 5 statements to keep in mind:

  • Statement 1: It’s possible to use only one, two, three, or all four of the options (must, must_not, should, and filter).
  • Statement 2: Every option can contain one or more queries. They all accept an array of queries.

Example:
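A sketch with two queries inside the same must array (the values are placeholders): a document has to satisfy both of them to be returned.

```shell
# One option ("must"), several queries in its array.
MULTI_MUST_QUERY='{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Journey" } },
        { "range": { "year": { "gte": 2000 } } }
      ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MULTI_MUST_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```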

  • Statement 3: For every option, you can use the simple queries we went through earlier 👆, I mean match, term, range, and so on.

Example:
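A sketch mixing the three simple query types, one per option. The values are placeholders; the point is only that match, term, and range plug into any option.

```shell
# match in "must", range in "filter", term in "must_not".
MIXED_QUERY='{
  "query": {
    "bool": {
      "must":     [ { "match": { "title": "One" } } ],
      "filter":   [ { "range": { "year": { "gt": 1995, "lte": 2000 } } } ],
      "must_not": [ { "term":  { "year": 1999 } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MIXED_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```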

  • Statement 4: When you combine must and should, your should queries don’t change the set of documents selected by your must queries. However, your should queries add relevance to your documents. Let’s take a look at these examples.

Example 1: Compound query with only must
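A possible version, sketched with a placeholder search word:

```shell
# Only "must": selects the documents and scores them.
MUST_ONLY_QUERY='{
  "query": {
    "bool": {
      "must": [ { "match": { "title": "Journey" } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MUST_ONLY_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```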

Example 2: Compound query with must and should
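A possible version, sketched with placeholder values. The must clause decides which documents come back; the should clause only raises the _score of the ones that also satisfy it.

```shell
# "must" selects the documents, "should" only boosts matching ones.
MUST_SHOULD_QUERY='{
  "query": {
    "bool": {
      "must":   [ { "match": { "title": "Journey" } } ],
      "should": [ { "range": { "year": { "gte": 2000 } } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MUST_SHOULD_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```

Comparing the _score values of this response with those of a must-only query on the same word is a quick way to see the effect of should.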

  • Statement 5: If you use only the filter or must_not options, there is no relevance estimation per document: the score of every document is 0.

Example 1: Only filter clause
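A sketch with a placeholder year range. In the response, every hit should come back with a _score of 0.0.

```shell
# Only "filter": documents are selected but not scored.
FILTER_ONLY_QUERY='{
  "query": {
    "bool": {
      "filter": [ { "range": { "year": { "gte": 1995 } } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$FILTER_ONLY_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```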

Example 2: Only must_not clause
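A sketch with a placeholder year: it returns every manga except those published in 1990.

```shell
# Only "must_not": everything except the excluded documents, unscored.
MUST_NOT_ONLY_QUERY='{
  "query": {
    "bool": {
      "must_not": [ { "term": { "year": 1990 } } ]
    }
  }
}'

curl -s -XGET -H "Content-Type: application/json" \
  "127.0.0.1:9200/mangas/_search?pretty" -d "$MUST_NOT_ONLY_QUERY" \
  || echo "Elasticsearch is not reachable on 127.0.0.1:9200" >&2
```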

Conclusion

Elasticsearch searching features offer almost infinite possibilities, and there are plenty of advanced capabilities you need to check out later. As long as you practice, you will get better for sure. We are all on the journey, and I hope I will meet you one day to talk about our experiences with Elasticsearch: things we’ve done, features we’ve tried, and so on 🎓.

That’s it, amigo 👌.

Till next time, take care.
