Lunr is a great client-side search engine that works really well with the Jamstack. (Be sure to see my earlier posts on the topic: Adding Search to your Eleventy Static Site with Lunr, Integrating Navigation Search with Lunr and Eleventy, and Using Pre-Built Lunr Indexes with Eleventy.) While I like the simplicity of Lunr, it requires that the data you search be loaded on the client. This makes it a bad fit for large sites. For example, my blog has over six thousand posts, so I went with Algolia instead. However, I was thinking recently and realized that Lunr can also be used on the server. I decided to take a stab at it, and here's what I came up with.

What I had in mind was this:

  • Use Eleventy to generate a JSON file that represents my site content, something appropriate for Lunr to index.
  • Make that JSON available to a serverless function.
  • In the serverless function, accept input (what to search for) and pass it to the index.

First off, I wasn't sure whether Eleventy could generate content for my serverless function to use. On this site, for example, the JSON file is simply stored under the web root. A serverless function that runs as part of my deploy process makes an HTTP call to my own site to get that content and then passes it to Algolia.

Instead, I wanted Eleventy to give the file directly to the serverless function itself. But I wasn't sure if Netlify would deploy the serverless function before Eleventy ran. I did some testing and it turns out that Eleventy runs before your functions are deployed so that seemed to be a safe thing to do. I also got confirmation from a support person at Netlify that it was indeed supposed to work that way.

Here's how I built the index in Eleventy:

---
permalink: netlify/functions/search/data.json
permalinkBypassOutputDir: true
---

{% assign posts = collections.posts %}
[
{% for post in posts %}
	{
		"title": {{post.data.title | jsonify}},
		"date":"{{ post.date }}",
		"url":"{{ post.url }}",
		"content":{{ post.templateContent | jsonify }}
	}{% unless forloop.last %},{% endunless %}
{% endfor %}
]

The critical bit is the front matter. Notice I'm writing directly to my Netlify functions directory and I've told Eleventy not to use the output directory. You need both of these to avoid writing to _site. The actual content of the JSON isn't terribly important for this demo. I decided to use the title, date, url, and complete content of a post. I could have also included categories, tags, even an author field. This is where you would customise it to match the shape of your site and what you need to search.

Now let's look at the search serverless function:

const lunrjs = require('lunr');

const handler = async (event) => {
  try {

    const search = event.queryStringParameters.term;
    if(!search) throw('Missing term query parameter');

    const data = require('./data.json');
    const index = createIndex(data);

    let results = index.search(search);

    results.forEach(r => {
      r.title = data[r.ref].title;
      r.content = truncate(data[r.ref].content, 400);
      r.date = data[r.ref].date;
      r.url = data[r.ref].url;
      
      delete r.matchData;
      delete r.ref;
    });

    return {
      statusCode: 200,
      body: JSON.stringify(results),
    }
  } catch (error) {
    return { statusCode: 500, body: error.toString() }
  }
}

function createIndex(posts) {
  return lunrjs(function() {
    this.ref('id');
    this.field('title');
    this.field('content');
    this.field('date');

    posts.forEach((p,idx) => {
      p.id = idx;
      this.add(p);
    });
  });
}

function truncate(str, size) {
  //first, remove HTML
  str = str.replace(/<.*?>/g, '');
  if(str.length < size) return str;
  return str.substring(0, size-3) + '...';
}

module.exports = { handler }

I begin by looking at the query string for a search term. I load in my JSON file (that's what is generated by the Liquid template above) and then tell Lunr to create an index from it. Pay special attention to these lines in the loop:

posts.forEach((p,idx) => {
	p.id = idx;

Remember that Lunr has that weird behavior where search results do not contain the original data of the matched item. I need a way to associate a particular search result back with its original data. My original data is an array, so when I loop over it, I store an id value that matches the loop index.

You can see this being used here:

results.forEach(r => {
	r.title = data[r.ref].title;
	r.content = truncate(data[r.ref].content, 400);
	r.date = data[r.ref].date;
	r.url = data[r.ref].url;

The ref value in a search match is that loop index, which means I can get the original information from the data array. Note that I also truncate the content, since the full text of a post isn't needed in the search results.
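To make that lookup a bit more concrete, here is a minimal sketch of the same idea pulled out into a standalone function. The name hydrateResults and the sample data are made up for illustration, and the result objects only mimic the ref and score properties of a Lunr match, so treat this as a sketch rather than the code from the demo.

```javascript
// Hypothetical helper: maps Lunr-style results back to the original records.
// `results` mimics what index.search() returns: objects with a `ref` and a `score`.
// `data` is the original array, where each item's position was used as its id.
function hydrateResults(results, data) {
  return results.map(r => {
    // ref may come back as a string; array lookup by "1" still works
    const original = data[r.ref];
    return {
      score: r.score,
      title: original.title,
      url: original.url,
      date: original.date
    };
  });
}

// Made-up sample data, standing in for data.json
const sampleData = [
  { title: 'First post', url: '/first/', date: '2021-01-01' },
  { title: 'Working with PDFs', url: '/pdfs/', date: '2021-02-01' }
];

// Made-up search results, shaped like a Lunr match
const sampleResults = [{ ref: '1', score: 0.8 }];

const hydrated = hydrateResults(sampleResults, sampleData);
console.log(hydrated[0].title);
```

Returning new objects instead of modifying the matches in place avoids having to delete matchData and ref afterwards, but either approach works.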

And honestly that's it. My front end search form just hits the endpoint. Here's my pretty basic vanilla JS page that does this:

---
title: Search
---

<h1>Search</h1>

<p>
<input type="search" id="term">  
<button id="searchBtn">Search</button>
</p>

<div id="results"></div>


<script>
let field, resultsDiv;

document.addEventListener('DOMContentLoaded', init, false);

async function init() {

	document.querySelector('#searchBtn').addEventListener('click', search);

	field = document.querySelector('#term');

	resultsDiv = document.querySelector('#results');

}

async function search() {
	let search = field.value.trim();
	if(!search) return;
	console.log(`search for ${search}`);

	let searchRequest = await fetch(`/api/search?term=${encodeURIComponent(search)}`);
	let results = await searchRequest.json();

	let resultsHTML = '<p><strong>Search Results</strong></p>';

	if(!results.length) {
		resultsHTML += '<p>Sorry, there were no results.</p>';
		resultsDiv.innerHTML = resultsHTML;
		return;
	}

	resultsHTML += '<ul>';

	// each result already includes the title and url added by the serverless function
	results.forEach(r => {
		resultsHTML += `<li><a href="${r.url}">${ r.title }</a></li>`;
	});

	resultsHTML += '</ul>';
	resultsDiv.innerHTML = resultsHTML;

}
</script>
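If you'd rather keep the URL building and the HTML output in small helpers, here is a rough sketch that encodes the search term and escapes titles before they go into the page. The names buildSearchUrl, escapeHTML, and renderResult are hypothetical and not part of the demo; the /api/search path is the one used by the page above.

```javascript
// Hypothetical helper: build the search URL with the term safely encoded,
// so characters like & or + in the term don't break the query string.
function buildSearchUrl(term) {
  return `/api/search?term=${encodeURIComponent(term)}`;
}

// Hypothetical helper: escape text before dropping it into an HTML string.
function escapeHTML(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: render one search result as a list item.
function renderResult(r) {
  return `<li><a href="${escapeHTML(r.url)}">${escapeHTML(r.title)}</a></li>`;
}

// Made-up inputs, just to exercise the helpers
const exampleUrl = buildSearchUrl('c++ & pdf');
const exampleItem = renderResult({ url: '/pdfs/', title: 'PDFs <and> more' });
console.log(exampleUrl);
console.log(exampleItem);
```

For a small personal site where you control all the titles, the escaping is arguably optional, but the encoding matters as soon as someone searches for a term containing an ampersand.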

You can test this yourself here: https://eleventy-lunrtest.netlify.app/search/. The content I used for my test comes from a subset of my blog content, so try searching for pdf to see some results. Note that I didn't use any kind of "loading" indicator while the search is performed. It takes about two seconds (more on that in a second), so be patient when searching. The source code for the demo may be found here: https://github.com/cfjedimaster/eleventy-demos/tree/master/lunr4

So... one thing I'll note is that I'm recreating the index on every search. That's wasteful. One thing I could do is generate both my data JSON file and a pre-built index. Both could be copied to the search serverless function directory. I'd still need to load both into memory when doing a search, but I wouldn't have to build the index. I've got an idea of how to do that and will give it a shot next week.
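Short of a pre-built index, a smaller step would be to keep the index around between calls. This is a different technique from the pre-built index described above. Serverless platforms often reuse a function instance for several requests, so a value stored outside the handler may survive between calls while the instance stays warm. Assuming that holds for your function, a sketch could look like this. The names getIndex, cachedIndex, buildCount, and fakeBuild are hypothetical, and fakeBuild is only a stand-in for createIndex(data) so the sketch runs without Lunr installed.

```javascript
// Cached at module scope: may survive between calls while the function stays warm.
let cachedIndex = null;
let buildCount = 0; // only here to show how often the index gets built

// `buildIndex` stands in for a function like () => createIndex(data).
function getIndex(buildIndex) {
  if (!cachedIndex) {
    cachedIndex = buildIndex();
    buildCount++;
  }
  return cachedIndex;
}

// Stand-in builder (an assumption for this sketch, not Lunr itself).
const fakeBuild = () => ({ search: term => [`result for ${term}`] });

// Two "searches" in a row: the second one reuses the cached index.
getIndex(fakeBuild).search('pdf');
getIndex(fakeBuild).search('eleventy');
console.log(`index built ${buildCount} time(s)`);
```

This only helps on warm invocations. A cold start would still pay the full cost of building the index, which is where a pre-built index would come in.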

I hope this helps! Reach out if you've got any questions!