Description
The upstream code uses a heavily object-oriented approach: you have to create an object for an algorithm in order to use it. In many cases this is unnecessary and just leads to more code and more allocations. Additionally, static method calls are slightly faster. We should retain the ability to instantiate these algorithms as objects, so that this is not a breaking change and users can still swap implementations via the interfaces, but the instance methods should pass their state as parameters (or just call directly, if there is no state) to public static methods on these types.
For example, in MetricLCS, instead of this:
```c#
public double Distance<T>(ReadOnlySpan<T> s1, ReadOnlySpan<T> s2)
    where T : IEquatable<T>
{
    if (s1.SequenceEqual(s2))
    {
        return 0;
    }

    int m_len = Math.Max(s1.Length, s2.Length);

    if (m_len == 0) return 0.0;

    return 1.0
        - (1.0 * LongestCommonSubsequence.Length(s1, s2))
        / m_len;
}
```

It would look something like this:
```c#
public double Distance<T>(ReadOnlySpan<T> s1, ReadOnlySpan<T> s2)
    where T : IEquatable<T>
    => GetDistance(s1, s2);

public static double GetDistance<T>(ReadOnlySpan<T> s1, ReadOnlySpan<T> s2)
    where T : IEquatable<T>
{
    if (s1.SequenceEqual(s2))
    {
        return 0;
    }

    int m_len = Math.Max(s1.Length, s2.Length);

    if (m_len == 0) return 0.0;

    return 1.0
        - (1.0 * LongestCommonSubsequence.Length(s1, s2))
        / m_len;
}
```

This may require some reworking of the ShingleBased base class.
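For algorithms that do carry state, the same pattern could pass that state as an extra argument. The following is only a rough sketch: `PrefixDistance`, its `k` field, and the toy comparison are hypothetical stand-ins and are not the actual ShingleBased API.

```c#
using System;

// Hypothetical stateful algorithm; NOT the library's ShingleBased class.
public class PrefixDistance
{
    // Instance state (assumed here: how many leading characters to compare).
    private readonly int k;

    public PrefixDistance(int k)
    {
        if (k <= 0) throw new ArgumentOutOfRangeException(nameof(k));
        this.k = k;
    }

    // Instance method kept so existing callers and interfaces keep working;
    // it only forwards its state to the static method.
    public double Distance(ReadOnlySpan<char> s1, ReadOnlySpan<char> s2)
        => GetDistance(s1, s2, k);

    // Static entry point: callers supply the state explicitly and
    // do not need to allocate an instance.
    public static double GetDistance(ReadOnlySpan<char> s1, ReadOnlySpan<char> s2, int k)
    {
        if (k <= 0) throw new ArgumentOutOfRangeException(nameof(k));

        // Toy metric: 0.0 if the first k characters match, otherwise 1.0.
        ReadOnlySpan<char> p1 = s1.Slice(0, Math.Min(k, s1.Length));
        ReadOnlySpan<char> p2 = s2.Slice(0, Math.Min(k, s2.Length));

        return p1.SequenceEqual(p2) ? 0.0 : 1.0;
    }
}
```

With this shape, `new PrefixDistance(3).Distance(a, b)` and `PrefixDistance.GetDistance(a, b, 3)` return the same value, so the instance API stays a thin wrapper.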
The benchmarks should be conditionally compiled to include these static method calls for performance comparison, since the static methods are not available in older versions.
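One way the conditional compilation might look is sketched below. The symbol `FEATURE_STATIC_DISTANCE` is an assumption (it would need to be defined only for builds that reference a version with the static methods), and `LcsStandIn` is a placeholder so the snippet compiles on its own; it is not the real MetricLCS. In a BenchmarkDotNet project, each method would additionally carry the `[Benchmark]` attribute.

```c#
// #define FEATURE_STATIC_DISTANCE  // hypothetical symbol; normally set in the project file
using System;

// Placeholder for the library type, so the sketch is self-contained.
public class LcsStandIn
{
    // Placeholder body, not the real LCS metric.
    public double Distance(ReadOnlySpan<char> s1, ReadOnlySpan<char> s2)
        => s1.SequenceEqual(s2) ? 0.0 : 1.0;

#if FEATURE_STATIC_DISTANCE
    // Only exists in "newer versions" of the stand-in.
    public static double GetDistance(ReadOnlySpan<char> s1, ReadOnlySpan<char> s2)
        => s1.SequenceEqual(s2) ? 0.0 : 1.0;
#endif
}

public class DistanceBenchmarks
{
    private readonly LcsStandIn instance = new LcsStandIn();

    // Always compiled: works against both old and new versions.
    public double InstanceCall() => instance.Distance("ABCDEFG", "ABCDEFHJKL");

#if FEATURE_STATIC_DISTANCE
    // Compiled only when the static method is available.
    public double StaticCall() => LcsStandIn.GetDistance("ABCDEFG", "ABCDEFHJKL");
#endif
}
```

Keeping the instance benchmark unconditional means older versions can still be measured as a baseline, while the static benchmark simply drops out of builds where the symbol is not defined.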