Consider the following code:
final String[] texts = {
    "Allons, enfants de la Patrie, Le jour de gloire est arrivé",
    "O Tannenbaum, o Tannenbaum, wie treu sind deine Blätter!",
    "..."
};
final LanguageDetector ld = new OptimaizeLangDetector(); // or e.g. OpenNLPDetector
ld.loadModels();
Arrays.stream(texts).parallel().forEach(text -> System.out.println(ld.detect(text)));
Can I assume that ld.detect() and ld.detectAll() are thread-safe and can be run in parallel on multiple texts using a single LanguageDetector instance?
The thing that makes me worry is that LanguageDetector has methods like addText(), hasEnoughText() and reset() which make it stateful, and therefore - by definition - non-thread-safe...
https://tika.apache.org/2.7.0/api/org/apache/tika/language/detect/LanguageDetector.html
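The simplest way for a class to be thread-safe is to be immutable. That means after construction, instance methods are not allowed to change any members. A class with mutable state is thread-safe only if every access to that state is properly synchronized.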
When reading the source for org.apache.tika.langdetect.optimaize.OptimaizeLangDetector, we'll see an instance method which changes a member, and with that the state of the OptimaizeLangDetector instance. Hence OptimaizeLangDetector is not thread-safe.
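
One common way around a non-thread-safe, stateful object is to give every thread its own instance. The sketch below shows that pattern with ThreadLocal. Note that StatefulDetector is a hypothetical stand-in written for this example, not Tika's class: it only mimics the reset()/addText()/detect() shape from the question, and its "detection" is a toy keyword check. It also assumes that detect(text) works by writing into the instance's internal buffer, as the stateful API suggests.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a stateful detector (NOT Tika's implementation).
class StatefulDetector {
    // Mutable member: this is what makes a shared instance unsafe.
    private final StringBuilder buffer = new StringBuilder();

    void reset() { buffer.setLength(0); }

    void addText(String text) { buffer.append(text); }

    // Toy "detection" by keyword; a real detector would use language models.
    String detect() {
        String s = buffer.toString();
        if (s.contains("Tannenbaum")) return "de";
        if (s.contains("Patrie")) return "fr";
        return "unknown";
    }

    // Convenience overload (assumed shape): it goes through the shared buffer,
    // so two threads calling it on one instance could interleave.
    String detect(String text) {
        reset();
        addText(text);
        return detect();
    }
}

class PerThreadDetectorDemo {
    // Each thread lazily creates and then reuses its own detector instance.
    private static final ThreadLocal<StatefulDetector> DETECTOR =
            ThreadLocal.withInitial(StatefulDetector::new);

    static List<String> detectAllInParallel(String[] texts) {
        return Arrays.stream(texts)
                .parallel()
                .map(text -> DETECTOR.get().detect(text)) // no instance is shared across threads
                .collect(Collectors.toList());            // keeps the order of the input array
    }

    public static void main(String[] args) {
        String[] texts = {
            "Allons, enfants de la Patrie, Le jour de gloire est arrivé",
            "O Tannenbaum, o Tannenbaum, wie treu sind deine Blätter!",
            "..."
        };
        System.out.println(detectAllInParallel(texts));
    }
}
```

With Tika, the ThreadLocal supplier would presumably construct a real detector and call loadModels() on it instead of creating the stand-in; that part is untested here. The trade-off is that every worker thread holds its own detector instance.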