Class IntersectionSimilarity<T>
java.lang.Object
org.apache.commons.text.similarity.IntersectionSimilarity<T>
- Type Parameters:
T
- the type of the elements extracted from the character sequence
- All Implemented Interfaces:
SimilarityScore<IntersectionResult>
public class IntersectionSimilarity<T>
extends Object
implements SimilarityScore<IntersectionResult>
Measures the intersection of two sets created from a pair of character sequences.
It is assumed that the type T correctly conforms to the requirements for storage within a Set or HashMap. Ideally the type is immutable and implements Object.equals(Object) and Object.hashCode().
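As an illustration of the storage requirement above, here is a minimal sketch of an element type that is immutable and defines value-based equality. The `Bigram` class is a hypothetical name introduced only for this example; it is not part of Commons Text.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical element type for illustration; not part of Commons Text.
final class Bigram {
    private final char first;
    private final char second;

    Bigram(char first, char second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Bigram)) {
            return false;
        }
        Bigram that = (Bigram) other;
        return first == that.first && second == that.second;
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal bigrams produce equal hash codes.
        return Objects.hash(first, second);
    }

    public static void main(String[] args) {
        Set<Bigram> set = new HashSet<>();
        set.add(new Bigram('a', 'b'));
        // An equal but distinct instance is found because of equals/hashCode.
        System.out.println(set.contains(new Bigram('a', 'b'))); // prints true
    }
}
```

Without the `equals` and `hashCode` overrides, two separately created instances would never match, so an intersection over such elements would likely be empty.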
- Since:
- 1.7
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static final class | IntersectionSimilarity.BagCount | Mutable counter class for storing the count of elements.
private class | IntersectionSimilarity.TinyBag | A minimal implementation of a Bag that can store elements and a count.
Field Summary
Fields
Modifier and Type | Field | Description
private final Function<CharSequence, Collection<T>> | converter | The converter used to create the elements from the characters.
Constructor Summary
Constructors
Constructor | Description
IntersectionSimilarity(Function<CharSequence, Collection<T>> converter) | Create a new intersection similarity using the provided converter.
Method Summary
Modifier and Type | Method | Description
IntersectionResult | apply(CharSequence left, CharSequence right) | Calculates the intersection of two character sequences passed as input.
private static <T> int | getIntersection(Set<T> setA, Set<T> setB) | Computes the intersection between two sets.
private int | getIntersection(IntersectionSimilarity<T>.TinyBag bagA, IntersectionSimilarity<T>.TinyBag bagB) | Computes the intersection between two bags.
private IntersectionSimilarity<T>.TinyBag | toBag(Collection<T> objects) | Converts the collection to a bag.
-
Field Details
-
converter
The converter used to create the elements from the characters.
-
-
Constructor Details
-
IntersectionSimilarity
Create a new intersection similarity using the provided converter.
If the converter returns a Set then the intersection result will not include duplicates. Any other Collection is used to produce a result that will include duplicates in the intersect and union.
- Parameters:
converter
- the converter used to create the elements from the characters
- Throws:
IllegalArgumentException
- if the converter is null
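To make the Set versus Collection distinction concrete, the following sketch defines two converters with the signature the constructor expects, using only the JDK. The class name `ConverterSketch` and the constants `TO_SET` and `TO_LIST` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical holder class for illustration; not part of Commons Text.
final class ConverterSketch {
    // Returns a Set, so each distinct character appears only once.
    static final Function<CharSequence, Collection<Character>> TO_SET = cs -> {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < cs.length(); i++) {
            set.add(cs.charAt(i));
        }
        return set;
    };

    // Returns a List, so repeated characters are retained as duplicates.
    static final Function<CharSequence, Collection<Character>> TO_LIST = cs -> {
        List<Character> list = new ArrayList<>();
        for (int i = 0; i < cs.length(); i++) {
            list.add(cs.charAt(i));
        }
        return list;
    };

    public static void main(String[] args) {
        System.out.println(TO_SET.apply("aabbc").size());  // prints 3
        System.out.println(TO_LIST.apply("aabbc").size()); // prints 5
    }
}
```

Either function could be passed as the `converter` argument; which one is appropriate depends on whether repeated elements should count toward the result.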
-
-
Method Details
-
getIntersection
Computes the intersection between two sets. This is the count of all the elements that are within both sets.
- Type Parameters:
T
- the type of the elements in the set
- Parameters:
setA
- the set A
setB
- the set B
- Returns:
- The intersection
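The count described above can be sketched with plain JDK collections. This is an illustrative version under the assumption that iterating the smaller set is a reasonable strategy; it is not claimed to match the library's private method, and `SetIntersectionSketch` is a hypothetical name.

```java
import java.util.Set;

// Hypothetical sketch; not the library's private implementation.
final class SetIntersectionSketch {
    static <T> int intersection(Set<T> setA, Set<T> setB) {
        // Iterate the smaller set and probe the larger one to reduce lookups.
        Set<T> small = setA.size() <= setB.size() ? setA : setB;
        Set<T> large = small == setA ? setB : setA;
        int count = 0;
        for (T element : small) {
            if (large.contains(element)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "b" and "c" are shared between the two sets.
        System.out.println(intersection(Set.of("a", "b", "c"), Set.of("b", "c", "d"))); // prints 2
    }
}
```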
-
apply
Calculates the intersection of two character sequences passed as input.
- Specified by:
apply
in interface SimilarityScore<IntersectionResult>
- Parameters:
left
- first character sequence
right
- second character sequence
- Returns:
- The intersection result
- Throws:
IllegalArgumentException
- if either input sequence is null
-
getIntersection
private int getIntersection(IntersectionSimilarity<T>.TinyBag bagA, IntersectionSimilarity<T>.TinyBag bagB)
Computes the intersection between two bags. This is the sum of the minimum count of each element that is within both bags.
- Parameters:
bagA
- the bag A
bagB
- the bag B
- Returns:
- The intersection
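The "sum of the minimum count" rule can be sketched as follows. Because TinyBag is private, this example assumes a bag can be modeled as a `Map` from element to count; `BagIntersectionSketch` is a hypothetical name and the code is not the library's implementation.

```java
import java.util.Map;

// Hypothetical sketch: a bag is modeled as a map from element to its count.
final class BagIntersectionSketch {
    static <T> int intersection(Map<T, Integer> bagA, Map<T, Integer> bagB) {
        int total = 0;
        for (Map.Entry<T, Integer> entry : bagA.entrySet()) {
            Integer countInB = bagB.get(entry.getKey());
            if (countInB != null) {
                // An element shared by both bags contributes its smaller count.
                total += Math.min(entry.getValue(), countInB);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Character, Integer> bagA = Map.of('a', 2, 'b', 3);         // "aabbb"
        Map<Character, Integer> bagB = Map.of('a', 1, 'b', 2, 'c', 1); // "abbc"
        // min(2, 1) + min(3, 2) = 3; 'c' is absent from bagA.
        System.out.println(intersection(bagA, bagB)); // prints 3
    }
}
```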
-
toBag
Converts the collection to a bag. The bag will contain the count of each element in the collection.
- Parameters:
objects
- the objects
- Returns:
- The bag
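A rough equivalent of this conversion, assuming a `HashMap` of counts stands in for the private TinyBag, might look like the sketch below. The name `BagSketch` is hypothetical.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counts occurrences of each element in a collection.
final class BagSketch {
    static <T> Map<T, Integer> toBag(Collection<T> objects) {
        Map<T, Integer> bag = new HashMap<>();
        for (T object : objects) {
            // Start at 1 for a new element, otherwise add 1 to its count.
            bag.merge(object, 1, Integer::sum);
        }
        return bag;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toBag(List.of("a", "b", "a"));
        System.out.println(bag.get("a")); // prints 2
        System.out.println(bag.get("b")); // prints 1
    }
}
```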
-