A high-performance text search engine developed in Go, designed to efficiently load, index, and search large datasets from XML wiki dumps. This project features a custom indexing algorithm that significantly improves search speed and accuracy.

likhithkp/text-search-engine

Simple Go Text Search Engine

This is a simple text search engine built in Go. It loads documents, indexes them, and allows you to perform keyword searches.

Features

  • Load documents from a specified data dump file.
  • Index and search for keywords within the documents.
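The README does not describe how the index is built. As a rough illustration of the load, index, and search flow, the sketch below uses a plain inverted index (token to document IDs) with AND semantics across query terms. The names `Index`, `tokenize`, `Add`, `Search`, and `intersect` are hypothetical and not taken from the repository, and the real implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Index maps a lowercased token to the sorted IDs of documents containing it.
// Hypothetical type: the repository's actual index structure is not documented.
type Index map[string][]int

// tokenize splits text on any rune that is not a letter or digit
// and lowercases every token.
func tokenize(text string) []string {
	fields := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for i, f := range fields {
		fields[i] = strings.ToLower(f)
	}
	return fields
}

// Add indexes one document. IDs are assumed to be added in ascending
// order so that each posting list stays sorted.
func (idx Index) Add(id int, text string) {
	for _, tok := range tokenize(text) {
		ids := idx[tok]
		if len(ids) > 0 && ids[len(ids)-1] == id {
			continue // token already recorded for this document
		}
		idx[tok] = append(ids, id)
	}
}

// Search returns the IDs of documents that contain every query token.
func (idx Index) Search(query string) []int {
	var result []int
	for i, tok := range tokenize(query) {
		if i == 0 {
			result = append([]int(nil), idx[tok]...)
			continue
		}
		result = intersect(result, idx[tok])
	}
	return result
}

// intersect returns the values present in both sorted slices.
func intersect(a, b []int) []int {
	out := []int{}
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	docs := []string{
		"A small wild cat native to Africa",
		"The domestic cat is a small mammal",
		"Wild horses roam the plains",
	}
	idx := Index{}
	for id, d := range docs {
		idx.Add(id, d)
	}
	fmt.Println(idx.Search("Small wild cat")) // prints [0]
}
```

Keeping posting lists sorted lets the intersection run in a single linear pass, which is one common reason to prefer an inverted index over scanning every document per query.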

Getting Started

Prerequisites

  • Go (version 1.15 or later)

Installation

  1. Clone the repository:

    git clone https://github.com/likhithkp/go-text-search-engine
    cd go-text-search-engine
  2. Install the required Go packages:

    go mod tidy
  3. Prepare your data dump file (e.g., enwiki-latest-abstract1.xml.gz). Place it in the project directory, or pass its full path with the -p flag.
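For orientation, here is a minimal sketch of how a gzip-compressed abstract dump could be read with the standard library. It assumes the usual layout of Wikipedia abstract dumps (a root element containing `doc` elements with `title`, `url`, and `abstract` children). The `document` type and the `loadDocuments` and `gzipString` helpers are hypothetical names, and the sketch decodes the whole file into memory at once, which may not be how the project does it.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/xml"
	"fmt"
	"io"
	"log"
)

// document holds the fields read from one <doc> element.
// Hypothetical type; field names are assumptions.
type document struct {
	Title string `xml:"title"`
	URL   string `xml:"url"`
	Text  string `xml:"abstract"`
	ID    int    `xml:"-"` // assigned after decoding, not read from XML
}

// loadDocuments decompresses r and decodes every <doc> element under the root.
func loadDocuments(r io.Reader) ([]document, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gz.Close()

	var dump struct {
		Documents []document `xml:"doc"`
	}
	if err := xml.NewDecoder(gz).Decode(&dump); err != nil {
		return nil, err
	}
	for i := range dump.Documents {
		dump.Documents[i].ID = i // sequential IDs in file order
	}
	return dump.Documents, nil
}

// gzipString compresses s in memory; used here only to build demo input.
func gzipString(s string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	// In real use the reader would come from os.Open on the dump file.
	in, err := gzipString(`<feed><doc><title>Wildcat</title><url>https://en.wikipedia.org/wiki/Wildcat</url><abstract>Small wild cat</abstract></doc></feed>`)
	if err != nil {
		log.Fatal(err)
	}
	docs, err := loadDocuments(in)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(docs), docs[0].Title)
}
```

For very large dumps, a streaming decode (reading one `doc` element at a time with `Decoder.Token` and `DecodeElement`) would keep memory use lower than this all-at-once approach.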

Usage

Run the search engine with the following command:

go run main.go -p <path_to_data_dump> -q <search_query>

Example:

go run main.go -p enwiki-latest-abstract1.xml.gz -q "Small wild cat"
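The `-p` and `-q` flags shown above could be parsed with Go's standard `flag` package along these lines. This is only a sketch: the `parseArgs` helper, the flag descriptions, and the choice to treat both flags as required are assumptions, since the README does not say whether the program defines defaults.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseArgs reads the -p (dump path) and -q (query) flags from args.
// Hypothetical helper; requiring both flags is an assumption.
func parseArgs(args []string) (path, query string, err error) {
	fs := flag.NewFlagSet("search", flag.ContinueOnError)
	fs.StringVar(&path, "p", "", "path to the data dump file")
	fs.StringVar(&query, "q", "", "search query")
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	if path == "" || query == "" {
		return "", "", errors.New("both -p and -q are required")
	}
	return path, query, nil
}

func main() {
	// A real program would pass os.Args[1:]; fixed arguments keep the demo runnable.
	path, query, err := parseArgs([]string{"-p", "enwiki-latest-abstract1.xml.gz", "-q", "Small wild cat"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("searching for %q in %s\n", query, path)
}
```

Using a dedicated `FlagSet` instead of the package-level flags keeps the parsing logic testable without touching the process arguments.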

Logging

The application logs how many documents were loaded, how many were indexed, and how many matches the search found.

Contributing

Feel free to contribute by opening issues or submitting pull requests!

License

This project is licensed under the MIT License.
