FlexSearch v0.8 (Preview)

npm install flexsearch@latest

What's New

  • Persistent indexes support for: IndexedDB (Browser), Redis, SQLite, Postgres, MongoDB, Clickhouse
  • Enhanced language customization via the new Encoder class
  • Result Highlighting
  • Query performance up to 4.5 times faster than the previous generation v0.7.x, while also improving the quality of results
  • Enhanced support for larger indexes or larger result sets
  • Improved offset and limit processing achieves up to 100 times faster traversal through large datasets
  • Support for larger In-Memory indexes with extended key size (the default maximum keystore limit is 2^24)
  • Greatly enhanced performance of the whole text encoding pipeline
  • Improved indexing of numeric content (Triplets)
  • Intermediate result sets and Resolver
  • Basic Resolver: and, or, xor, not, limit, offset, boost, resolve
  • Improved charset collection
  • New charset preset soundex which further reduces memory consumption while also increasing "fuzziness"
  • Performance gain when queuing tasks to the index by using "Event-Loop-Caches"
  • Up to 100 times faster deletion/replacement when not using the additional "fastupdate" register
  • Regex Pre-Compilation (transforms hundreds of regex rules into just a few)
  • Extended support for multiple tags (DocumentIndex)
  • Custom Fields ("Virtual Fields")
  • Custom Filter
  • Custom Score Function
  • Added French language preset (stop-word filter, stemmer)
  • Enhanced Worker Support
  • Export / Import index in chunks
  • Improved Build System + Bundler (supported: CommonJS, ESM, Global Namespace); language packs can now also be imported in Node.js
  • Fully covering index.d.ts type definitions
  • Fast-Boot Serialization optimized for Server-Side-Rendering (PHP, Python, Ruby, Rust, Java, Go, Node.js, ...)

Compare Benchmark: 0.7.0 vs. 0.8.0

Persistent Indexes

FlexSearch provides a new storage adapter through which indexes are delegated to persistent storage.

Supported: IndexedDB (Browser), Redis, SQLite, Postgres, MongoDB, Clickhouse.

The .export() and .import() methods are still available for non-persistent In-Memory indexes.

All search capabilities are available on persistent indexes like:

  • Context-Search
  • Suggestions
  • Cursor-based Queries (Limit/Offset)
  • Scoring (supports a resolution of up to 32767 slots)
  • Document-Search
  • Partial Search
  • Multi-Tag-Search
  • Boost Fields
  • Custom Encoder
  • Resolver
  • Tokenizer (Strict, Forward, Reverse, Full)
  • Document Store (incl. enrich results)
  • Worker Threads to run in parallel
  • Auto-Balanced Cache (top queries + last queries)

All persistent variants are optimized for large indexes under heavy workload. Almost every task is streamlined to run in batch/parallel, getting the most out of the selected database engine. Whereas the In-Memory index can't share its data between different nodes when running in a cluster, every persistent storage can handle this by default.

Examples Node.js

Examples Browser

import FlexSearchIndex from "./index.js";
import Database from "./db/indexeddb/index.js";
// create an index
const index = new FlexSearchIndex();
// create db instance with optional prefix
const db = new Database("my-store");
// mount and await before transferring data
await index.mount(db);

// update the index as usual
index.add(1, "content...");
index.update(2, "content...");
index.remove(3);

// changes are automatically committed by default
// when you need to wait for task completion,
// use the commit method explicitly:
await index.commit();

Alternatively, mount a store on index creation:

const index = new FlexSearchIndex({
    db: new Database("my-store")
});

// await the db response before accessing it the first time
await index.db;
// apply changes to the index
// ...

Query against a persistent storage just as usual:

const result = await index.search("gulliver");

Auto-Commit is enabled by default and will process changes asynchronously in batch. You can fully disable the auto-commit feature and apply commits manually:

const index = new FlexSearchIndex({
    db: new Database("my-store"),
    commit: false
});
// update the index
index.add(1, "content...");
index.update(2, "content...");
index.remove(3);

// transfer all changes to the db
await index.commit();

You can also call the commit method manually when the option commit: true is set.

Benchmark

The benchmark was measured in "terms per second".

| Store | Add | Search 1 | Search N | Replace | Remove | Not Found | Scaling |
|:---|---:|---:|---:|---:|---:|---:|:---|
| IndexedDB | 123,298 | 83,823 | 62,370 | 57,410 | 171,053 | 425,744 | No |
| Redis | 1,566,091 | 201,534 | 859,463 | 117,013 | 129,595 | 875,526 | Yes |
| SQLite | 269,812 | 29,627 | 129,735 | 174,445 | 1,406,553 | 122,566 | No |
| Postgres | 354,894 | 24,329 | 76,189 | 324,546 | 3,702,647 | 50,305 | Yes |
| MongoDB | 515,938 | 19,684 | 81,558 | 243,353 | 485,192 | 67,751 | Yes |
| Clickhouse | 1,436,992 | 11,507 | 22,196 | 931,026 | 3,276,847 | 16,644 | Yes |

Search 1: Single term query
Search N: Multi term query (Context-Search)

The benchmark was executed against a single client.

Encoder

Search capabilities highly depend on language processing. The old workflow wasn't really practicable. The new Encoder class is a huge improvement and fully replaces the encoding part. Some FlexSearch options were moved to the new Encoder instance.

New Encoding Pipeline:

  1. charset normalization
  2. custom preparation
  3. split into terms (apply includes/excludes)
  4. filter (pre-filter)
  5. matcher (substitute terms)
  6. stemmer (substitute term endings)
  7. filter (post-filter)
  8. replace chars (mapper)
  9. custom regex (replacer)
  10. letter deduplication
  11. apply finalize

Example

const encoder = new Encoder({
    normalize: true,
    dedupe: true,
    cache: true,
    include: {
        letter: true,
        number: true,
        symbol: false,
        punctuation: false,
        control: false,
        char: "@"
    }
});

Alternatively, you can use an exclude definition instead of an include:

const encoder = new Encoder({
    exclude: {
        letter: false,
        number: false,
        symbol: true,
        punctuation: true,
        control: true
    }
});

Instead of using include or exclude you can pass a regular expression to the field split:

const encoder = new Encoder({
    split: /\s+/
});

The definitions include and exclude are a replacement for split. You can only define one of these three.

Adding custom functions to the encoder pipeline:

const encoder = new Encoder({
    normalize: function(str){
        return str.toLowerCase();
    },
    prepare: function(str){
        return str.replace(/&/g, " and ");
    },
    finalize: function(arr){
        return arr.filter(term => term.length > 2);
    }
});

Assign encoder to an index:

const index = new Index({ 
    encoder: encoder
});

Define language-specific transformations:

const encoder = new Encoder({
    replacer: [
        /[´`ʼ]/g, "'"
    ],
    filter: new Set([
        "and",
    ]),
    matcher: new Map([
        ["xvi", "16"]
    ]),
    stemmer: new Map([
        ["ly", ""]
    ]),
    mapper: new Map([
        ["é", "e"]
    ])
});

Or use a predefined language and extend it with custom options:

import EnglishBookPreset from "./lang/en.js";
const encoder = new Encoder(EnglishBookPreset, {
    filter: false
});

Equivalent:

import EnglishBookPreset from "./lang/en.js";
const encoder = new Encoder(EnglishBookPreset);
encoder.assign({ filter: false });

Assign extensions to the encoder instance:

import LatinEncoderPreset from "./charset/latin/simple.js";
import EnglishBookPreset from "./lang/en.js";
// stack definitions to the encoder instance
const encoder = new Encoder()
    .assign(LatinEncoderPreset)
    .assign(EnglishBookPreset)
// override preset options ...
    .assign({ minlength: 3 });
// assign further presets ...

When adding extensions to the encoder, every previously assigned configuration stays intact (very much like mixins), even when assigning custom functions.

Add custom transformations to an existing encoder:

import LatinEncoderPreset from "./charset/latin/default.js";
const encoder = new Encoder(LatinEncoderPreset);
encoder.addReplacer(/[´`ʼ]/g, "'");
encoder.addFilter("and");
encoder.addMatcher("xvi", "16");
encoder.addStemmer("ly", "");
encoder.addMapper("é", "e");

Shortcut for just assigning one encoder configuration to an index:

import LatinEncoderPreset from "./charset/latin/default.js";
const index = new Index({ 
    encoder: LatinEncoderPreset
});

Resolver

Retrieve an unresolved result:

const raw = index.search("a short query", { 
    resolve: false
});

You can apply and chain different resolver methods to the raw result, e.g.:

raw.and( ... )
   .and( ... )
   .boost(2)
   .or( ... ,  ... )
   .limit(100)
   .xor( ... )
   .not( ... )
   // final resolve
   .resolve({
       limit: 10,
       offset: 0,
       enrich: true
   });

The default resolver:

const raw = index.search("a short query", { 
    resolve: false
});
const result = raw.resolve();

Or use declaration style:

import Resolver from "./resolver.js";
const raw = new Resolver({ 
    index: index,
    query: "a short query"
});
const result = raw.resolve();

Chainable Boolean Operations

The basic concept explained:

// 1. get one or multiple unresolved results
const raw1 = index.search("a short query", { 
    resolve: false
});
const raw2 = index.search("another query", {
    resolve: false,
    boost: 2
});

// 2. apply and chain resolver operations
const raw3 = raw1.and(raw2, /* ... */);
// you can access the aggregated result by raw3.result
console.log("The aggregated result is:", raw3.result)
// apply further operations ...

// 3. resolve final result
const result = raw3.resolve({
    limit: 100,
    offset: 0
});
console.log("The final result is:", result)

Use inline queries:

const result = index.search("further query", {
    // set resolve to false on the first query
    resolve: false,
    boost: 2
})
.or( // union
    index.search("a query")
    .and( // intersection
        index.search("another query", {
            boost: 2
        })
    )
)
.not( // exclusion
    index.search("some query")
)
// resolve the result
.resolve({
    limit: 100,
    offset: 0
});
The same query in declaration style:

import Resolver from "./resolver.js";
const result = new Resolver({
    index: index,
    query: "further query",
    boost: 2
})
.or({
    and: [{ // inner expression
        index: index,
        query: "a query"
    },{
        index: index,
        query: "another query",
        boost: 2
    }]
})
.not({ // exclusion
    index: index,
    query: "some query"
})
.resolve({
    limit: 100,
    offset: 0
});

When all queries are made against the same index, you can skip the index in every declaration that follows the initial new Resolver() call:

import Resolver from "./resolver.js";
const result = new Resolver({
    index: index,
    query: "a query"
})
.and({ query: "another query", boost: 2 })
.or ({ query: "further query", boost: 2 })
.not({ query: "some query" })
.resolve(100);

Custom Resolver

function CustomResolver(raw){
    // console.log(raw)
    let output;
    // generate output ...
    return output;
}

const result = index.search("a short query", { 
    resolve: CustomResolver
});

Result Highlighting

Result highlighting can only be enabled when using a Document-Index with an enabled document store. Even when you just want to add id-content pairs, you'll need to use a Document-Index for this feature (just define a simple document descriptor as shown below).

// import members from the bundle (see the "Load Library" section below)
import { Document, Charset } from "flexsearch";

// create the document index
const index = new Document({
  document: {
    store: true,
    index: [{
      field: "title",
      tokenize: "forward",
      encoder: Charset.LatinBalance
    }]
  }
});

// add data
index.add({
  "id": 1,
  "title": "Carmencita"
});
index.add({
  "id": 2,
  "title": "Le clown et ses chiens"
});

// perform a query
const result = index.search({
  query: "karmen or clown or not found",
  suggest: true,
  // set enrich to true (required)
  enrich: true,
  // highlight template
  // $1 is a placeholder for the matched partial
  highlight: "<b>$1</b>"
});

The result will look like:

[{
  "field": "title",
  "result": [{
      "id": 1,
      "doc": {
        "id": 1,
        "title": "Carmencita"
      },
      "highlight": "<b>Carmen</b>cita"
    },{
      "id": 2,
      "doc": {
        "id": 2,
        "title": "Le clown et ses chiens"
      },
      "highlight": "Le <b>clown</b> et ses chiens"
    }
  ]
}]

Big In-Memory Keystores

The default maximum keystore limit for the In-Memory index is 2^24 distinct terms/partials being stored (the so-called "cardinality"). An additional register can be enabled which divides the index into self-balanced partitions.

const index = new FlexSearchIndex({
    // e.g. set keystore range to 8-Bit:
    // 2^8 * 2^24 = 2^32 keys total
    keystore: 8 
});

You can theoretically store up to 2^88 keys (64-Bit address range).

The internal ID arrays scale automatically by using Proxy when the limit of 2^31 is reached.

Persistent storages have no keystore limit by default. You should not enable the keystore when using persistent indexes, as long as you don't stress the buffer too hard before calling index.commit().

Multi-Tag-Search

Assume this document schema (a dataset from IMDB):

{
    "tconst": "tt0000001",
    "titleType": "short",
    "primaryTitle": "Carmencita",
    "originalTitle": "Carmencita",
    "isAdult": 0,
    "startYear": "1894",
    "endYear": "",
    "runtimeMinutes": "1",
    "genres": [
        "Documentary",
        "Short"
    ]
}

An appropriate document descriptor could look like:

import LatinEncoder from "./charset/latin/simple.js";

const flexsearch = new Document({
    encoder: LatinEncoder,
    resolution: 3,
    document: {
        id: "tconst",
        //store: true, // document store
        index: [{
            field: "primaryTitle",
            tokenize: "forward"
        },{
            field: "originalTitle",
            tokenize: "forward"
        }],
        tag: [
            "startYear",
            "genres"
        ]
    }
});

The field contents of primaryTitle and originalTitle are encoded by the forward tokenizer. The field contents of startYear and genres are added as tags.

Get all entries of a specific tag:

const result = flexsearch.search({
    //enrich: true, // enrich documents
    tag: { "genres": "Documentary" },
    limit: 1000,
    offset: 0
});

Get entries of multiple tags (intersection):

const result = flexsearch.search({
    //enrich: true, // enrich documents
    tag: { 
        "genres": ["Documentary", "Short"],
        "startYear": "1894"
    }
});

Combine tags with queries (intersection):

const result = flexsearch.search({
    query: "Carmen", // forward tokenizer
    tag: { 
        "genres": ["Documentary", "Short"],
        "startYear": "1894"
    }
});

Alternative declaration:

const result = flexsearch.search("Carmen", {
    tag: [{
        field: "genres",
        tag: ["Documentary", "Short"]
    },{
        field: "startYear",
        tag: "1894"
    }]
});

Filter Fields (Index / Tags / Datastore)

const flexsearch = new Document({
    document: {
        id: "id",
        index: [{
            // custom field:
            field: "somefield",
            filter: function(data){
                // return false to filter out
                // return anything else to keep
                return true;
            }
        }],
        tag: [{
            field: "city",
            filter: function(data){
                // return false to filter out
                // return anything else to keep
                return true;
            }
        }],
        store: [{
            field: "anotherfield",
            filter: function(data){
                // return false to filter out
                // return anything else to keep
                return true;
            }
        }]
    }
});

Custom Fields (Index / Tags / Datastore)

Dataset example:

{
    "id": 10001,
    "firstname": "John",
    "lastname": "Doe",
    "city": "Berlin",
    "street": "Alexanderplatz",
    "number": "1a",
    "postal": "10178"
}

You can apply custom fields derived from the data or from anything else:

const flexsearch = new Document({
    document: {
        id: "id",
        index: [{
            // custom field:
            field: "fullname",
            custom: function(data){
                // return custom string
                return data.firstname + " " + 
                       data.lastname;
            }
        },{
            // custom field:
            field: "location",
            custom: function(data){
                return data.street + " " +
                       data.number + ", " +
                       data.postal + " " +
                       data.city;
            }
        }],
        tag: [{
            // existing field
            field: "city"
        },{
            // custom field:
            field: "category",
            custom: function(data){
                let tags = [];
                // push one or multiple tags
                // ....
                return tags;
            }
        }],
        store: [{
            field: "anotherfield",
            custom: function(data){
                // return a falsy value to filter out
                // return anything else to keep it in store
                return data;
            }
        }]
    }
});

Filtering is also available in custom functions by returning false.
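A minimal sketch combining both, assuming the dataset from above (the field name fullname is illustrative):

const flexsearch = new Document({
    document: {
        id: "id",
        index: [{
            field: "fullname",
            custom: function(data){
                // return false to filter out incomplete entries
                if(!data.firstname || !data.lastname) return false;
                // otherwise return the derived field content
                return data.firstname + " " + data.lastname;
            }
        }]
    }
});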

Perform a query against the custom field as usual:

const result = flexsearch.search({
    query: "10178 Berlin Alexanderplatz",
    field: "location"
});

// or combine the custom field query with a tag:
const result2 = flexsearch.search({
    query: "john doe",
    tag: { "city": "Berlin" }
});

Custom Score Function

const index = new FlexSearchIndex({
    resolution: 10,
    score: function(content, term, term_index, partial, partial_index){
        // you'll need to return a number between 0 and resolution - 1
        // a score of 0 is the highest (best) score
        // for a resolution of 10 you can return 0 - 9
        // ... 
        return 3;
    } 
});

A common situation is that you have predefined labels related to some kind of order, e.g. importance or priority. A priority label could be high, moderate or low, so you can derive the scoring from those properties. Another example is when your content is already ordered and you would like to keep this order as relevance (see the sketch at the end of this section).

You probably won't need the parameters passed to the score function. But when needed, here are the score function's parameters explained:

  1. content is the whole content as an array of terms (encoded)
  2. term is the term currently being processed (encoded)
  3. term_index is the index of the term in the content array
  4. partial is the partial of the term currently being processed
  5. partial_index is the index position of the partial within the term

The partial params are empty when using the tokenizer strict. Let's take an example using the tokenizer full.

The content: "This is an example of partial encoding"
Assume the partial "amp" within the term "example" is currently being processed. Then your score function will be called with these parameters:

function score(content, term, term_index, partial, partial_index){
    // content       = ["this", "is", "an", "example", "of", "partial", "encoding"]
    // term          = "example"
    // term_index    = 3
    // partial       = "amp"
    // partial_index = 2
}
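A hedged sketch of the "keep existing order as relevance" case mentioned above: derive the score from the term position, so terms appearing earlier in the content rank higher. The clamp to 9 is needed because valid scores range from 0 to resolution - 1:

const index = new FlexSearchIndex({
    resolution: 10,
    score: function(content, term, term_index, partial, partial_index){
        // earlier terms get a lower score value,
        // and 0 is the highest (best) score
        return Math.min(term_index, 9);
    }
});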

Merge Document Results

By default, the result set of Field-Search has a structure grouped by field names:

[{
    field: "fieldname-1",
    result: [{
        id: 1001,
        doc: {/* stored document */}
    }]
},{
    field: "fieldname-2",
    result: [{
        id: 1001,
        doc: {/* stored document */}
    }]
},{
    field: "fieldname-3",
    result: [{
        id: 1002,
        doc: {/* stored document */}
    }]
}]

By passing the search option merge: true the result set will be merged into (grouped by id):

[{
    id: 1001,
    doc: {/* stored document */},
    field: ["fieldname-1", "fieldname-2"]
},{
    id: 1002,
    doc: {/* stored document */},
    field: ["fieldname-3"]
}]
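A minimal usage sketch, assuming a document index with an enabled document store:

const result = flexsearch.search("a query", {
    merge: true,  // group results by id
    enrich: true  // attach the stored documents
});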

Extern Worker Configuration

When using Worker indexes and also assigning custom functions to the options, e.g.:

  • Custom Encoder
  • Custom Encoder methods (normalize, prepare, finalize)
  • Custom Score (function)
  • Custom Filter (function)
  • Custom Fields (function)

... then you'll need to move your field configuration into a file which provides the configuration as a default export. The field configuration is not the whole document descriptor.

When not using custom functions in combination with Worker you can skip this part.

Since every field resolves into a dedicated Worker, every field which includes custom functions needs its own configuration file accordingly.

Let's take this document descriptor:

{
    document: {
        index: [{
            // this is the field configuration
            // ---->
            field: "custom_field",
            custom: function(data){
                return "custom field content";
            }
            // <------
        }]
    }
};

The configuration which needs to be available as a default export is:

{
    field: "custom_field",
    custom: function(data){
        return "custom field content";
    }
};

You're welcome to make suggestions on how to improve the handling of external configurations.

Example Node.js:

An external configuration for one WorkerIndex, let's assume it is located in ./custom_field.js:

const { Charset } = require("flexsearch");
const { LatinSimple } = Charset;
// it requires a default export:
module.exports = {
    encoder: LatinSimple,
    tokenize: "forward",
    // custom function:
    custom: function(data){
        return "custom field content";
    }
};

Create the Worker index using the configuration above:

const { Document } = require("flexsearch");
const flexsearch = new Document({
    worker: true,
    document: {
        index: [{
            // the field name needs to be set here
            field: "custom_field",
            // path to your config from above:
            config: "./custom_field.js",
        }]
    }
});

Browser (ESM)

An external configuration for one WorkerIndex, let's assume it is located in ./custom_field.js:

import { Charset } from "./dist/flexsearch.bundle.module.min.js";
const { LatinSimple } = Charset;
// it requires a default export:
export default {
    encoder: LatinSimple,
    tokenize: "forward",
    // custom function:
    custom: function(data){
        return "custom field content";
    }
};

Create Worker Index with the configuration above:

import { Document } from "./dist/flexsearch.bundle.module.min.js";
// you will need to await the response!
const flexsearch = await new Document({
    worker: true,
    document: {
        index: [{
            // the field name needs to be set here
            field: "custom_field",
            // Absolute URL to your config from above:
            config: "http://localhost/custom_field.js"
        }]
    }
});

An absolute URL is needed here because the WorkerIndex context is of type Blob, and relative URLs can't be resolved from within this context.

Test Case

As a test, the whole IMDB data collection was indexed, consisting of:

JSON Documents: 9,273,132
Fields: 83,458,188
Tokens: 128,898,832

The index configuration used here has 2 fields (using a bidirectional context of depth: 1), 1 custom field, 2 tags and a full datastore of all input JSON documents.

A non-Worker Document index requires 181 seconds to index all contents.
The Worker index takes just 32 seconds to index them all, processing every field and tag in parallel. For content of this size, that is quite an impressive result.

CSP-friendly Worker (Browser)

When enabling workers by just passing the option worker: true, the worker will be created through code generation under the hood. This might cause issues with strict CSP settings.

You can overcome this issue by passing the filepath to the worker file, e.g. worker: "./worker.js". The original worker file is located at src/worker/worker.js.
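A minimal sketch, assuming the worker file was copied next to your application (the path and the field name are placeholders):

const flexsearch = new Document({
    // pass a filepath instead of `true` to avoid inline code generation
    worker: "./worker.js",
    document: {
        index: ["content"]
    }
});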

Fuzzy-Search

Fuzzy search describes a basic concept of making queries more tolerant. FlexSearch provides several methods to achieve fuzziness:

  1. Use a tokenizer: forward, reverse or full
  2. Use one of the built-in encoder presets simple > balance > advanced > extra > soundex (sorted by fuzziness)
  3. Use one of the language-specific presets, e.g. /lang/en.js for en-US specific content
  4. Enable suggestions by passing the search option suggest: true

Additionally, you can apply a custom Mapper, Replacer, Stemmer or Filter, or assign a custom normalize(str), prepare(str) or finalize(arr) function to the Encoder.
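A hedged sketch combining several of these options, assuming the soundex preset is exported as Charset.LatinSoundex analogous to the presets used elsewhere in this document:

import { Index, Charset } from "./dist/flexsearch.bundle.module.min.js";
const index = new Index({
    tokenize: "forward",
    encoder: Charset.LatinSoundex
});
index.add(1, "Struldbrugs");
// a strongly misspelled query (see the table below)
const result = index.search("struhlbrogger", { suggest: true });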

Compare Fuzzy-Search Encoding

Original term which was indexed: "Struldbrugs"

| Encoder | Index Size | Matching Query Example |
|:---|---:|:---|
| LatinExact | 3.1 Mb | Struldbrugs |
| LatinDefault | 1.9 Mb | struldbrugs |
| LatinSimple | 1.8 Mb | strũldbrųĝgs |
| LatinBalance | 1.7 Mb | strultbrooks |
| LatinAdvanced | 1.6 Mb | shtruhldbrohkz |
| LatinExtra | 1.1 Mb | zdroltbrykz |
| LatinSoundex | 0.7 Mb | struhlbrogger |

The index size was measured after indexing the book "Gulliver's Travels".

Custom Encoder

Since it is very simple to create a custom Encoder, you are welcome to create your own, e.g.:

function customEncoder(content){
   // split the content into an Array of terms/tokens
   // and apply your changes to each term/token,
   // e.g. a minimal normalization:
   return content.toLowerCase()
                 .split(/\s+/)
                 .filter(term => term.length > 1);
}

const index = new Index({
   // set to strict when your tokenization was already done
   tokenize: "strict",
   encode: customEncoder
});

If you get good results, please feel free to share your encoder.

Fast-Boot Serialization for Server-Side-Rendering (PHP, Python, Ruby, Rust, Java, Go, Node.js, ...)

This is an experimental feature with limited support which might be dropped in a future release. You're welcome to give feedback.

When using server-side rendering you can create a different kind of export which boots up instantly. Especially when serving server-side rendered content, this can help to restore a static index on page load. Document-Indexes aren't supported by this method yet.

When your index is too large you should use the default export/import mechanism.

As the first step populate the FlexSearch index with your contents.

You have two options:

1. Create a function as string

const fn_string = index.serialize();

The content of fn_string is a valid JavaScript function declared as inject(index). Store it or place it somewhere in your code.

This function basically looks like:

function inject(index){
    index.reg = new Set([/* ... */]);
    index.map = new Map([/* ... */]);
    index.ctx = new Map([/* ... */]);
}

You can save this function by e.g. fs.writeFileSync("inject.js", fn_string); or place it as a string in your SSR-generated markup.

After creating the index on the client side, just call the inject method like:

const index = new Index({/* use same configuration! */});
inject(index);

That's it.

You'll need to use the same configuration as you used before the export. Any changes to the configuration require re-indexing.

2. Create just a function body as string

Alternatively, you can use a lazy function declaration by passing false to the serialize function:

const fn_body = index.serialize(false);

You will get just the function body which looks like:

index.reg = new Set([/* ... */]);
index.map = new Map([/* ... */]);
index.ctx = new Map([/* ... */]);

Now you can place this directly in your code (name your index variable index), or you can create an inject function from it, e.g.:

const inject = new Function("index", fn_body);

This function is callable like the above example:

const index = new Index();
inject(index);

Load Library (Node.js, ESM, Legacy Browser)

npm install flexsearch

The dist folder is located at: node_modules/flexsearch/dist/

| Build | File | CDN |
|:---|:---|:---|
| flexsearch.bundle.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.debug.js |
| flexsearch.bundle.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.min.js |
| flexsearch.bundle.module.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.module.debug.js |
| flexsearch.bundle.module.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.bundle.module.min.js |
| flexsearch.es5.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.es5.debug.js |
| flexsearch.es5.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.es5.min.js |
| flexsearch.light.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.debug.js |
| flexsearch.light.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.min.js |
| flexsearch.light.module.debug.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.module.debug.js |
| flexsearch.light.module.min.js | Download | https://cdn.jsdelivr.net/gh/nextapps-de/flexsearch@0.8.1/dist/flexsearch.light.module.min.js |
| Javascript Modules (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module |
| Javascript Modules Minified (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module-min |
| Javascript Modules Debug (ESM) | Download | https://github.com/nextapps-de/flexsearch/tree/0.8.1/dist/module-debug |
| flexsearch.custom.js | | Read more about "Custom Build" |

All debug versions provide debug information through the console and give you helpful advice in certain situations. Do not use them in production, since they are special builds containing extra debugging processes which noticeably reduce performance.

The abbreviations used at the end of the filenames indicate:

  • bundle All features included, FlexSearch is available on window.FlexSearch
  • light Only basic features are included, FlexSearch is available on window.FlexSearch
  • es5 bundle has support for EcmaScript5, FlexSearch is available on window.FlexSearch
  • module indicates that this bundle is a Javascript module (ESM), FlexSearch members are available by import { Index, Document, Worker, Encoder, Charset } from "./flexsearch.bundle.module.min.js" or alternatively using the default export import FlexSearch from "./flexsearch.bundle.module.min.js"
  • min bundle is minified
  • debug bundle has enabled debug mode and contains additional code just for debugging purposes (do not use for production)

Non-Module Bundles (ES5 Legacy)

Non-Module Bundles export all their features to the public namespace "FlexSearch" e.g. window.FlexSearch.Index or window.FlexSearch.Document.

Load the bundle by a script tag:

<script src="dist/flexsearch.bundle.min.js"></script>
<script>
  // ... access FlexSearch
  var Index = window.FlexSearch.Index;
  var index = new Index(/* ... */);
</script>

FlexSearch Members are accessible on:

var Index = window.FlexSearch.Index;
var Document = window.FlexSearch.Document;
var Encoder = window.FlexSearch.Encoder;
var Charset = window.FlexSearch.Charset;
var Resolver = window.FlexSearch.Resolver;
var Worker = window.FlexSearch.Worker;
var IdxDB = window.FlexSearch.IndexedDB;
// only exported by non-module builds:
var Language = window.FlexSearch.Language;

Load language packs:

<!-- English: -->
<script src="dist/lang/en.min.js"></script>
<!-- German: -->
<script src="dist/lang/de.min.js"></script>
<!-- French: -->
<script src="dist/lang/fr.min.js"></script>
<script>
  var EnglishEncoderPreset = window.FlexSearch.Language.en;
  var GermanEncoderPreset = window.FlexSearch.Language.de;
  var FrenchEncoderPreset = window.FlexSearch.Language.fr;
</script>

Module (ESM)

When using modules you can choose between 2 variants: flexsearch.xxx.module.min.js has all features bundled and is ready for production, whereas the folder /dist/module/ exports all features in the same structure as the source code, but with the compiler flags resolved.

Also, for each variant there exists:

  1. A debug version for the development
  2. A pre-compiled minified version for production

Use the bundled version exported as a module (default export):

<script type="module">
    import FlexSearch from "./dist/flexsearch.bundle.module.min.js";
    const index = new FlexSearch.Index(/* ... */);
</script>

Or import FlexSearch members separately by:

<script type="module">
    import { Index, Document, Encoder, Charset, Resolver, Worker, IdxDB } 
        from "./dist/flexsearch.bundle.module.min.js";
    const index = new Index(/* ... */);
</script>

Use non-bundled modules:

<script type="module">
    import Index from "./dist/module/index.js";
    import Document from "./dist/module/document.js";
    import Encoder from "./dist/module/encoder.js";
    import Charset from "./dist/module/charset.js";
    import Resolver from "./dist/module/resolver.js";
    import Worker from "./dist/module/worker.js";
    import IdxDB from "./dist/module/db/indexeddb/index.js";
    const index = new Index(/* ... */);
</script>

Language packs are accessible via:

import EnglishEncoderPreset from "./dist/module/lang/en.js";
import GermanEncoderPreset from "./dist/module/lang/de.js";
import FrenchEncoderPreset from "./dist/module/lang/fr.js";

Also, pre-compiled non-bundled production-ready modules are located in dist/module-min/, whereas the debug version is located at dist/module-debug/.

You can also load modules via CDN:

<script type="module">
    import Index from "https://unpkg.com/flexsearch@0.8.1/dist/module/index.js";
    const index = new Index(/* ... */);
</script>

Node.js

Install FlexSearch via NPM:

npm install flexsearch

Use the default export:

const FlexSearch = require("flexsearch");
const index = new FlexSearch.Index(/* ... */);

Or require FlexSearch members separately by:

const { Index, Document, Encoder, Charset, Resolver, Worker, IdxDB } = require("flexsearch");
const index = new Index(/* ... */);

When using ESM in Node.js, just use the modules as explained in the section above.

Language packs are accessible via:

const EnglishEncoderPreset = require("flexsearch/lang/en");
const GermanEncoderPreset = require("flexsearch/lang/de");
const FrenchEncoderPreset = require("flexsearch/lang/fr");

Persistent Connectors are accessible via:

const Postgres = require("flexsearch/db/postgres");
const Sqlite = require("flexsearch/db/sqlite");
const MongoDB = require("flexsearch/db/mongodb");
const Redis = require("flexsearch/db/redis");
const Clickhouse = require("flexsearch/db/clickhouse");

Custom Builds

The /src/ folder of this repository requires some compilation to resolve the build flags. These are your options:

  • Closure Compiler (Advanced Compilation) (used by this library here)
  • Babel + Plugin babel-plugin-conditional-compile (used by this library here)

You can't resolve build flags with:

  • Webpack
  • esbuild
  • rollup
  • Terser

These are some of the basic builds located in the /dist/ folder:

npm run build:bundle
npm run build:light
npm run build:module
npm run build:es5

Perform a custom build (UMD bundle) by passing build flags:

npm run build:custom SUPPORT_DOCUMENT=true SUPPORT_TAGS=true LANGUAGE_OUT=ECMASCRIPT5 POLYFILL=true

Perform a custom build in ESM module format:

npm run build:custom RELEASE=custom.module SUPPORT_DOCUMENT=true SUPPORT_TAGS=true 

Perform a debug build:

npm run build:custom DEBUG=true SUPPORT_DOCUMENT=true SUPPORT_TAGS=true 

On custom builds each build flag will be set to false by default when not passed.

The custom build will be saved to dist/flexsearch.custom.xxxx.min.js or when format is module to dist/flexsearch.custom.module.xxxx.min.js (the "xxxx" is a hash based on the used build flags).

Supported Build Flags

Feature Flags

| Flag | Values | Info |
|:---|:---|:---|
| SUPPORT_WORKER | true, false | |
| SUPPORT_ENCODER | true, false | |
| SUPPORT_CHARSET | true, false | |
| SUPPORT_CACHE | true, false | |
| SUPPORT_ASYNC | true, false | Asynchronous Rendering (supports Promises) |
| SUPPORT_STORE | true, false | |
| SUPPORT_SUGGESTION | true, false | |
| SUPPORT_SERIALIZE | true, false | |
| SUPPORT_DOCUMENT | true, false | |
| SUPPORT_TAGS | true, false | |
| SUPPORT_PERSISTENT | true, false | |
| SUPPORT_KEYSTORE | true, false | |
| SUPPORT_COMPRESSION | true, false | |
| SUPPORT_RESOLVER | true, false | |

Compiler Flags

| Flag | Values | Info |
|:---|:---|:---|
| DEBUG | true, false | Output debug information to the console (default: false) |
| RELEASE | custom, custom.module, bundle, bundle.module, es5, light, compact | |
| POLYFILL | true, false | Include Polyfills (based on LANGUAGE_OUT) |
| PROFILER | true, false | Just used for automatic performance tests |
| LANGUAGE_OUT | ECMASCRIPT3, ECMASCRIPT5, ECMASCRIPT_2015, ECMASCRIPT_2016, ECMASCRIPT_2017, ECMASCRIPT_2018, ECMASCRIPT_2019, ECMASCRIPT_2020, ECMASCRIPT_2021, ECMASCRIPT_2022, ECMASCRIPT_NEXT, STABLE | Target language |

Misc

A formula to determine a well-balanced value for the resolution is: $2 \cdot \lfloor\sqrt{content.length}\rfloor$, where content is the value passed to index.add(). Here, the maximum length over all contents should be used.
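A minimal sketch of that calculation, where contents is a placeholder for the array of all values you plan to add:

// maximum content length over all inputs
const max_length = Math.max(...contents.map(str => str.length));
// well-balanced resolution: 2 * floor(sqrt(max_length))
const index = new Index({
    resolution: 2 * Math.floor(Math.sqrt(max_length))
});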

Import / Export (In-Memory)

Persistent-Indexes and Worker-Indexes don't support Import/Export.

Export an Index or Document-Index to the folder /export/:

import { promises as fs } from "fs";

await index.export(async function(key, data){
  await fs.writeFile("./export/" + key, data, "utf8");
});

Import from folder /export/ into an Index or Document-Index:

const index = new Index({/* keep old config and place it here */});

const files = await fs.readdir("./export/");
for(let i = 0; i < files.length; i++){
  const data = await fs.readFile("./export/" + files[i], "utf8");
  await index.import(files[i], data);
}

You'll need to use the same configuration as you used before the export. Any changes to the configuration require re-indexing.

Migration

  • The index option property "minlength" has moved to the Encoder Class
  • The index option flag "optimize" was removed
  • The index option flag "lang" was replaced by the Encoder Class .assign()
  • Boost can no longer be applied upfront when indexing; instead, use the boost property dynamically on a query
  • All definitions of the old text encoding process were replaced by similar definitions (Array changed to Set, Object changed to Map). You can use helper methods like .addMatcher(char_match, char_replace) which add everything properly.
  • The default value for fastupdate is false when not passed via options
  • The method index.encode() has moved to index.encoder.encode()
  • The options charset and lang were removed from the index (replaced by Encoder.assign({...}))
  • Every charset collection (files in the folder /lang/**.js) is now exported as a config object (instead of a function). This config needs to be passed to the constructor new Encoder(config) or can be added to an existing instance via encoder.assign(config). The reason was to keep the default encoder configuration when having multiple document indexes.
  • The property bool from DocumentOptions was removed (replaced by Resolver)
  • The static methods FlexSearch.registerCharset() and FlexSearch.registerLanguage() were removed; those collections are now exported to FlexSearch.Charset, which can be accessed as a module via import { Charset } from "flexsearch", and language packs are now applied by encoder.assign()
  • Instead of e.g. "latin:simple", the Charset collection is exported as a module and has to be imported by e.g. import LatinSimple from "./charset.js" and then assigned to an existing Encoder via encoder.assign(LatinSimple) or on creation via encoder = new Encoder(LatinSimple)
  • You can import language packs from dist/module/lang/* when using ESM, or via const EnglishPreset = require("flexsearch/lang/en"); when using CommonJS (Node.js)
  • The method index.append() is now deprecated and will be removed in the near future, because it isn't consistent and leads to unexpected behavior when not used properly. You should only use index.add() to push contents to the index.
  • The async variants like .searchAsync are now deprecated (but still work). Asynchronous responses will always be returned by Worker-Indexes and Persistent-Indexes; everything else returns a non-promised result. Offering both method types suggests developers can choose between them, but they can't.
  • Exports made with versions below v0.8 are not compatible for import into v0.8

What's next?

Unfortunately, not everything could be finished; the remainder needs to be done in an upcoming version.

  • The Resolver currently does not support Document-Indexes, there is still some work to do.
  • Config serialization for persistent indexes (store configuration, check migrations, import and restore field configurations)
  • Tooling for persistent indexes (list all tables, remove tables)