
Understanding how Simila works

Mehran Davoudi edited this page Oct 3, 2015 · 1 revision

How Simila Works

How Simila Structures a String

When Simila compares two strings, it first breaks each of them into phrases.

Phrase: A string consists of one or more phrases separated by the '.' character.

For example:

String: Today is a good day. Tomorrow will be better.

Phrases: [Today is a good day] + [Tomorrow will be better]

Next, Simila breaks each phrase into words.

Word: A phrase consists of one or more words separated by non-alphanumeric characters.

Phrase: It is a good day

Words: [It]+[is]+[a]+[good]+[day]

Finally, Simila breaks each word into characters.

Character: A word consists of one or more characters.

Word: Tomorrow

Characters: [T]+[o]+[m]+[o]+[r]+[r]+[o]+[w]
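The three-level breakdown described above can be sketched in a few lines of C#. This is only an illustration of the splitting rules as stated on this page, not Simila's actual code: the `Decomposition` class and its method names are hypothetical, and the choices to trim whitespace and drop empty pieces are assumptions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Simila: it only mirrors the rules described above.
static class Decomposition
{
    // A string is split into phrases on the '.' character.
    // Assumption: surrounding whitespace is trimmed and empty phrases are dropped.
    public static string[] SplitPhrases(string text) =>
        text.Split('.')
            .Select(p => p.Trim())
            .Where(p => p.Length > 0)
            .ToArray();

    // A phrase is split into words on runs of non-alphanumeric characters.
    public static string[] SplitWords(string phrase) =>
        Regex.Split(phrase, @"[^\p{L}\p{N}]+")
            .Where(w => w.Length > 0)
            .ToArray();

    // A word is simply its sequence of characters.
    public static char[] SplitCharacters(string word) => word.ToCharArray();
}
```

Under these assumptions, `Decomposition.SplitPhrases("Today is a good day. Tomorrow will be better.")` yields the two phrases from the example above, and `Decomposition.SplitWords("It is a good day")` yields the five words `It`, `is`, `a`, `good`, `day`.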

How Simila Compares Everything

In Simila, there is a very basic interface:

interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}

Simila provides a default implementation for each type:

class CharacterSimilarityResolverDefault : ISimilarityResolver<char>
{
    // Some implementation for characters.
}

class WordSimilarityResolverDefault : ISimilarityResolver<Word>
{
    // Some implementation for words.
}

class PhraseSimilarityResolverDefault : ISimilarityResolver<Phrase>
{
    // Some implementation for phrases.
}

The interesting point is that all of these classes are only Simila's default implementations. You can configure Simila to use your own ISimilarityResolver implementation for any of these types when it runs its comparison.
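As a rough illustration, a custom resolver for characters might look like the sketch below. The interface is redeclared so the snippet compiles on its own. The class name `CaseInsensitiveCharacterResolver`, the idea of scoring case-only differences at 0.8, and the assumption that scores fall between 0 and 1 are all hypothetical. This page does not specify the score range or how a resolver is registered with Simila, so registration is not shown.

```csharp
using System;

// Redeclared from the page above so this sketch is self-contained.
interface ISimilarityResolver<T>
{
    float GetSimilarity(T left, T right);
}

// Hypothetical custom resolver: treats letters that differ only by case as nearly equal.
// Assumption: 1.0 means identical and 0.0 means completely different.
class CaseInsensitiveCharacterResolver : ISimilarityResolver<char>
{
    public float GetSimilarity(char left, char right)
    {
        if (left == right)
            return 1.0f;                 // exact match

        if (char.ToUpperInvariant(left) == char.ToUpperInvariant(right))
            return 0.8f;                 // same letter, different case (arbitrary score)

        return 0.0f;                     // unrelated characters
    }
}
```

A word or phrase resolver could follow the same pattern by implementing `ISimilarityResolver<Word>` or `ISimilarityResolver<Phrase>` against Simila's own types.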
