Skip to content

Positional Index vs Inverted Index

Le Thu Nguyen edited this page Mar 28, 2018 · 6 revisions

Previous

Positional index:

Explain:

•Each term in the vocabulary, store postings of the form docID (position1, position2,..) •Store each term in the form of

  <number of docs containing term;
  Doc1: position1, position2…;
  Doc2: position1, position2…;
  etc.>

Example:

To, 993:
(1,6:(7,18,33,72,..))

The word to has a document frequency 993 and occurs 6 times in document 1 at positions 7,18,33, 72 and etc.

  • Phrase queries, use a merge algorithm recursively at the document level.

Pros/Cons:

Fast indexing Less efficient query's

Inverted index

Explain

•For each term T, store a list of all documents containing term T. It is a list of all unique word that appears in any document.

Example:

Document 1: The quick brown fox jumped
Document 2: Quick brown foxes
> Term	Doc1	Doc2
> The	x	
> quick	x	
> brown	x	x
> fox	x	
> jumped	x	
> Quick		x
> foxes		x

Pros/Cons:

Fast query Slower indexing

Previous

Clone this wiki locally