IamNikolayTverdokhleb/inverted_index


inverted_index

This is a document search engine.

So far we have implemented a reader for the CRANFIELD collection (http://ir.dcs.gla.ac.uk/resources/test_collections/cran/), which provides a set of documents and corresponding queries, as well as relevance rankings for those queries.
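To illustrate what such a reader has to do, here is a minimal sketch, not the reader from this repository. It assumes the documents file uses the SMART-style layout in which `.I <id>` starts a record and `.T`, `.A`, `.B` and `.W` introduce the title, author, bibliography and body text. The function name `parse_cranfield` and the returned dictionary shape are hypothetical.

```python
def parse_cranfield(text):
    """Parse SMART-style records into {doc_id: {"T": ..., "A": ..., "B": ..., "W": ...}}.

    Assumed layout: ".I <id>" opens a record; ".T", ".A", ".B", ".W" open a field.
    """
    docs = {}
    doc_id, field = None, None
    for line in text.splitlines():
        stripped = line.strip()
        if line.startswith(".I "):
            # New record: remember its id and start with empty fields.
            doc_id = int(line[3:].strip())
            docs[doc_id] = {"T": [], "A": [], "B": [], "W": []}
            field = None
        elif stripped in (".T", ".A", ".B", ".W"):
            # Field marker: following lines belong to this field.
            field = stripped[1]
        elif doc_id is not None and field is not None:
            docs[doc_id][field].append(stripped)
    # Join the collected lines of every field into a single string.
    return {
        d: {name: " ".join(lines).strip() for name, lines in fields.items()}
        for d, fields in docs.items()
    }
```

The same loop could be pointed at the queries file if it follows the same marker convention, which is also an assumption here.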

The Preprocessing class provides methods for tokenizing and stemming both queries and documents, as well as for removing stop words from them.
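The three steps can be sketched as follows. This is an illustration only and does not reproduce the Preprocessing class: the function names, the tiny stop-word list and the suffix-stripping "stemmer" are all assumptions, and the stemmer is a crude stand-in for a real algorithm such as Porter's.

```python
import re

# Tiny illustrative stop-word list (assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip at most one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase and tokenize, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]
```

Applying the same function to both documents and queries keeps their vocabularies comparable, which is why the README mentions both.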

There are implementations of the Vector Space Model with TF-IDF weighting and of a Language Model with unigram (1-gram) query likelihood.
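Both ranking approaches can be sketched in a few lines. The code below is an illustration under stated assumptions, not the repository's implementation: documents are given as `{doc_id: [tokens]}`, the TF-IDF weight is raw term frequency times `log(N / df)`, ranking uses cosine similarity with the constant query norm omitted, and the language model uses Jelinek-Mercer smoothing with a hypothetical parameter `lam`. The README does not say which weighting or smoothing variants are actually used.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to its postings {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index


def tfidf_rank(query, docs):
    """Rank documents by cosine similarity between TF-IDF vectors."""
    index = build_inverted_index(docs)
    n_docs = len(docs)
    idf = {term: math.log(n_docs / len(post)) for term, post in index.items()}
    # Squared length of every document vector, accumulated term by term.
    norms = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * idf[term]) ** 2
    # Dot product of query and document, touching only matching postings.
    scores = defaultdict(float)
    for term, q_tf in Counter(query).items():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += (q_tf * idf[term]) * (tf * idf[term])
    # The query norm is the same for every document, so it is left out.
    ranked = {d: s / math.sqrt(norms[d]) for d, s in scores.items() if s > 0}
    return sorted(ranked, key=ranked.get, reverse=True)


def query_likelihood_rank(query, docs, lam=0.5):
    """Rank documents by log P(query | document) under a smoothed unigram model."""
    collection = Counter()
    for tokens in docs.values():
        collection.update(tokens)
    total = sum(collection.values())
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        log_p = 0.0
        for term in query:
            p_doc = tf[term] / len(tokens) if tokens else 0.0
            p_coll = collection[term] / total
            # Jelinek-Mercer: mix document and collection estimates (assumption).
            p = lam * p_doc + (1 - lam) * p_coll
            if p == 0.0:
                continue  # term absent from the whole collection; ignore it
            log_p += math.log(p)
        scores[doc_id] = log_p
    return sorted(scores, key=scores.get, reverse=True)
```

Note the difference in behavior: the vector space sketch returns only documents sharing at least one informative term with the query, while the smoothed language model assigns a score to every document.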

Note: VectorSpaceModel provides the methods write_inverted_file and read_inverted_file to write and read the inverted file from memory, but our search engine is already fast and efficient on the chosen dataset, so we do not use them.
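For readers curious what persisting an inverted file might look like, here is a minimal sketch. It is an assumption-laden illustration: the standalone functions below borrow the method names from the note above, but their signatures, the `{term: {doc_id: tf}}` layout and the JSON file format are hypothetical and may differ from what VectorSpaceModel actually does.

```python
import json


def write_inverted_file(index, path):
    """Save an index of the form {term: {doc_id: tf}} as JSON (assumed format)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def read_inverted_file(path):
    """Load an index written by write_inverted_file."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    # JSON object keys are always strings, so restore integer document ids.
    return {
        term: {int(doc_id): tf for doc_id, tf in postings.items()}
        for term, postings in raw.items()
    }
```

Caching the index this way mainly pays off when rebuilding it is slow, which matches the note that it is not needed for a collection of this size.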

About

Vector Space Model and Language Model for Information Retrieval system based on collections of text documents.
