Pagination With Elasticsearch: You Must Think About This Before Any Request.
Years ago, while I was working as a software engineering intern at a company that processed large amounts of data, I learned one of those lessons we don’t get a chance to learn at school.
I was in charge of producing some reports from a production database. After looking at the description of the tables, I wrote my query and it was time to test it. No syntax error! Yes, but the result was taking too long to come back, so I started worrying about what was happening.
And when the result finally showed up, my machine just stopped working. After thinking it over, I realized the result set was too large to display, and a senior database administrator working near me taught me that pagination was one of the keys to the problem.
Well, in this article, we’ll be going through data pagination on Elasticsearch.
Prerequisites
- Elasticsearch
- curl to make HTTP requests
First things first, as the old saying goes. Let’s talk about what pagination means.
Pagination, also known as paging, is the process of dividing a large number of elements into sets called pages.
Elasticsearch offers a rich set of tools to help you query your data and chunk it easily.
Among other pagination features, the Elasticsearch search API offers simple pagination with the from and size parameters, Search After, and Scroll.
Seed an Index
This is a pure hands-on article so we’re going to go with some quick and simple pagination implementations.
For the sake of this lesson, we need an index in order to make requests and analyze the results.
Our index will be called mangas. To create the mangas index and insert some mangas, run the command below.
Don’t forget to start your Elasticsearch server before!
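As a minimal sketch, assuming a handful of made-up documents, the index could be seeded through the bulk API as follows. The titles, the title field, the file name mangas.ndjson, and the seed_mangas function are all illustrative names chosen for this example, not a fixed dataset.

```shell
# Hypothetical sample data: five documents with an illustrative "title" field.
# The bulk format pairs each action line with a document line, and the
# body must end with a newline.
cat > mangas.ndjson <<'EOF'
{"index":{"_id":"1"}}
{"title":"Naruto"}
{"index":{"_id":"2"}}
{"title":"One Piece"}
{"index":{"_id":"3"}}
{"title":"Bleach"}
{"index":{"_id":"4"}}
{"title":"Dragon Ball"}
{"index":{"_id":"5"}}
{"title":"Death Note"}
EOF

# Hypothetical helper: send the file to the mangas index.
# It is only defined here; call it once the server is running.
seed_mangas() {
  curl -XPOST -H "Content-Type: application/x-ndjson" \
    "127.0.0.1:9200/mangas/_bulk?pretty" --data-binary @mangas.ndjson
}
```

Run seed_mangas once Elasticsearch is up. The mangas index is created automatically on the first write unless automatic index creation has been disabled on your cluster.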
Paginate
The Elasticsearch search API has two parameters that are used to paginate results: from and size.
The figure below shows how Elasticsearch numbers the documents returned by your request.
The from parameter defines the number of hits to skip, defaulting to 0. The size parameter is the maximum number of hits to return. Together, these two parameters define a page of results.
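To make the relationship concrete, here is a small sketch that converts a 1-based page number and a page size into the matching from value. The page_from and page_url helpers are hypothetical names; the host and index follow the examples in this article.

```shell
# Hypothetical helper: number of hits to skip for a 1-based page number.
page_from() {
  page=$1
  size=$2
  echo $(( (page - 1) * size ))
}

# Hypothetical helper: build the search URL for a given page and page size.
page_url() {
  echo "127.0.0.1:9200/mangas/_search?from=$(page_from "$1" "$2")&size=$2&pretty"
}

page_url 2 3   # prints 127.0.0.1:9200/mangas/_search?from=3&size=3&pretty
```

You could then fetch the page with curl -XGET "$(page_url 2 3)". Keep in mind that, by default, Elasticsearch rejects searches where from plus size is greater than 10,000 (the index.max_result_window setting), which is where Search After and Scroll come in.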
Now let’s take a look at some examples.
Request 1
For this first request, we want to get the first two mangas from our index, so we start from document number zero and take two. Here is how it goes:
curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?from=0&size=2&pretty"
Request 2
For this second request, we want the third through fifth mangas, so we skip the first two documents and take three. Here is how it goes:
curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?from=2&size=3&pretty"
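The arithmetic behind Request 2 can be sketched with a small helper that turns a 1-based, inclusive range of positions into the two parameters. The range_params name is hypothetical and exists only for illustration.

```shell
# Hypothetical helper: convert "first through last" (1-based, inclusive)
# into the from and size query parameters.
range_params() {
  first=$1
  last=$2
  echo "from=$(( first - 1 ))&size=$(( last - first + 1 ))"
}

range_params 3 5   # prints from=2&size=3
```

Skipping first minus one documents lands on the first wanted hit, and last minus first plus one is the number of hits in the range.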
Request 3
curl -XGET -H "Content-Type: application/json" "127.0.0.1:9200/mangas/_search?pretty" -d '
{
  "from": 0,
  "size": 2
}'
What major difference did you notice between the first two requests and the last one?
Well, Requests 1 and 2 pass their parameters in the URL, but Request 3 does not. Request 3 is what we call a Body Request, and in real-life situations you’ll mostly be working with Body Requests.
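One reason Body Requests are so common is that the same JSON can carry a query alongside from and size. The sketch below builds such a body for a given page; page_body and search_page are hypothetical helpers, and match_all is used only as a placeholder query.

```shell
# Hypothetical helper: print a JSON body for a 1-based page number and page size.
# match_all is a placeholder; swap in whatever query you need.
page_body() {
  page=$1
  size=$2
  printf '{"from": %d, "size": %d, "query": {"match_all": {}}}\n' \
    "$(( (page - 1) * size ))" "$size"
}

# Hypothetical wrapper: send the body to the mangas index used in this article.
# It is only defined here; call it once the server is running.
search_page() {
  curl -XGET -H "Content-Type: application/json" \
    "127.0.0.1:9200/mangas/_search?pretty" -d "$(page_body "$1" "$2")"
}

page_body 1 2   # prints {"from": 0, "size": 2, "query": {"match_all": {}}}
```

With a server running, search_page 1 2 would return the same two documents as Request 3.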
Okay, dude, we’re done with this tutorial.
I hope you enjoyed the journey! Take care and, more importantly, keep it up.