Class EmbeddingModelTextClassifier<E extends Enum<E>>

java.lang.Object
dev.langchain4j.classification.EmbeddingModelTextClassifier<E>
Type Parameters:
E - Enum that is the result of classification.
All Implemented Interfaces:
TextClassifier<E>

public class EmbeddingModelTextClassifier<E extends Enum<E>> extends Object implements TextClassifier<E>
A TextClassifier that uses an EmbeddingModel and predefined examples to perform classification. Classification is done by comparing the embedding of the text being classified with the embeddings of the predefined examples. Classification quality generally improves as more examples are provided for each label. Examples can easily be generated by an LLM.

Example:


 enum Sentiment {
     POSITIVE, NEUTRAL, NEGATIVE
 }

 // A few example texts per label; more examples generally improve quality.
 Map<Sentiment, List<String>> examples = Map.of(
     Sentiment.POSITIVE, List.of("This is great!", "Wow, awesome!"),
     Sentiment.NEUTRAL,  List.of("Well, it's fine", "It's ok"),
     Sentiment.NEGATIVE, List.of("It is pretty bad", "Worst experience ever!")
 );

 EmbeddingModel embeddingModel = new AllMiniLmL6V2QuantizedEmbeddingModel();

 // Uses the defaults: maxResults = 1, minScore = 0, meanToMaxScoreRatio = 0.5.
 TextClassifier<Sentiment> classifier = new EmbeddingModelTextClassifier<>(embeddingModel, examples);

 List<Sentiment> sentiments = classifier.classify("Awesome!");
 System.out.println(sentiments); // [POSITIVE]
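
The comparison of embeddings described above can be illustrated with a minimal, library-free sketch. It assumes cosine similarity as the measure, which this page does not state explicitly, and it uses hand-written two-dimensional vectors in place of real embeddings. The class name EmbeddingComparisonSketch and the method cosine are hypothetical and not part of this API.

```java
public class EmbeddingComparisonSketch {

    // Cosine similarity between two vectors of equal length (assumed metric).
    static double cosine(float[] a, float[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy "embeddings"; a real embedding model produces many more dimensions.
        float[] text = {1.0f, 0.0f};
        float[] positiveExample = {0.8f, 0.6f};
        float[] negativeExample = {-1.0f, 0.0f};

        // The example pointing in a similar direction scores higher.
        System.out.println(cosine(text, positiveExample));
        System.out.println(cosine(text, negativeExample));
    }
}
```

In this sketch the text vector is closer in direction to positiveExample than to negativeExample, so the first similarity is the larger one.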
 
  • Constructor Details

    • EmbeddingModelTextClassifier

      public EmbeddingModelTextClassifier(dev.langchain4j.model.embedding.EmbeddingModel embeddingModel, Map<E,? extends Collection<String>> examplesByLabel)
      Creates a classifier with the default values for maxResults (1), minScore (0) and meanToMaxScoreRatio (0.5).
      Parameters:
      embeddingModel - The embedding model used for embedding both the examples and the text to be classified.
      examplesByLabel - A map containing examples of texts for each label. The more examples, the better. Examples can easily be generated by an LLM.
    • EmbeddingModelTextClassifier

      public EmbeddingModelTextClassifier(dev.langchain4j.model.embedding.EmbeddingModel embeddingModel, Map<E,? extends Collection<String>> examplesByLabel, int maxResults, double minScore, double meanToMaxScoreRatio)
      Creates a classifier with explicit values for maxResults, minScore and meanToMaxScoreRatio.
      Parameters:
      embeddingModel - The embedding model used for embedding both the examples and the text to be classified.
      examplesByLabel - A map containing examples of texts for each label. The more examples, the better. Examples can easily be generated by an LLM.
      maxResults - The maximum number of labels to return for each classification.
      minScore - The minimum similarity score required for classification, in the range [0..1]. Labels scoring lower than this value will be discarded.
      meanToMaxScoreRatio - A ratio, in the range [0..1], between the mean and max scores used for calculating the final score. During classification, the embeddings of examples for each label are compared to the embedding of the text being classified. This results in two metrics: the mean and max scores. The mean score is the average similarity score for all examples associated with a given label. The max score is the highest similarity score, corresponding to the example most similar to the text being classified. A value of 0 means that only the mean score will be used for ranking labels. A value of 0.5 means that both scores will contribute equally to the final score. A value of 1 means that only the max score will be used for ranking labels.
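
The description of meanToMaxScoreRatio gives the behavior at 0, 0.5 and 1 but not the formula itself. A linear interpolation between the mean and max scores is one formula consistent with that description, and the sketch below assumes it; the library's actual computation may differ. The names ScoreBlendSketch and blend are hypothetical.

```java
import java.util.List;

public class ScoreBlendSketch {

    // Hypothetical helper: blends the mean and max similarity of one label's examples.
    // Assumed formula: ratio 0 -> mean only, ratio 1 -> max only, as the parameter text describes.
    static double blend(List<Double> similarities, double meanToMaxScoreRatio) {
        double mean = similarities.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
        double max = similarities.stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        return (1.0 - meanToMaxScoreRatio) * mean + meanToMaxScoreRatio * max;
    }

    public static void main(String[] args) {
        // Similarity of the input text to each of three examples of a single label.
        List<Double> similarities = List.of(0.75, 0.5, 0.25);

        System.out.println(blend(similarities, 0.0)); // 0.5   (mean only)
        System.out.println(blend(similarities, 0.5)); // 0.625 (equal contribution)
        System.out.println(blend(similarities, 1.0)); // 0.75  (max only)
    }
}
```

A ratio closer to 1 favors labels that have one very close example; a ratio closer to 0 favors labels whose examples are similar to the text on average.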
  • Method Details

    • classify

      public List<E> classify(String text)
      Description copied from interface: TextClassifier
      Classifies the given text.
      Specified by:
      classify in interface TextClassifier<E extends Enum<E>>
      Parameters:
      text - Text to classify.
      Returns:
      A list of classification categories.
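
How maxResults and minScore could shape the returned list can be sketched without the library. The code below assumes that each label already has a final score, that labels scoring below minScore are dropped, and that the remaining labels are returned best first, capped at maxResults. The ordering of the real result is not stated on this page, so treat it as an assumption. LabelSelectionSketch and select are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LabelSelectionSketch {

    enum Sentiment { POSITIVE, NEUTRAL, NEGATIVE }

    // Hypothetical helper: drops labels below minScore, sorts by score descending,
    // and keeps at most maxResults labels.
    static <E extends Enum<E>> List<E> select(Map<E, Double> scoreByLabel, int maxResults, double minScore) {
        return scoreByLabel.entrySet().stream()
                .filter(entry -> entry.getValue() >= minScore)
                .sorted(Map.Entry.<E, Double>comparingByValue().reversed())
                .limit(maxResults)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up final scores for illustration only.
        Map<Sentiment, Double> scores = Map.of(
                Sentiment.POSITIVE, 0.75,
                Sentiment.NEUTRAL, 0.5,
                Sentiment.NEGATIVE, 0.25);

        System.out.println(select(scores, 1, 0.0)); // [POSITIVE]
        System.out.println(select(scores, 3, 0.5)); // [POSITIVE, NEUTRAL]
    }
}
```

With the default maxResults of 1 the list holds at most one label; raising minScore can make the list empty when no label scores high enough.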