But if you have to run long queries against a Solr application and the query language is not known, you should apply language detection at query time. This improves relevancy (through language-specific stemming and analysis) and performance (through a smaller index and the use of stop words or a common terms query). Language detection is also the basis for using common grams to speed up phrase queries.
Here is how it works:
- Run language detection at indexing time and store each document's text in a language-specific field (e.g. text_de, text_en, ...).
- Run language detection at query time as well, so that the search targets the corresponding language-specific field. Here is a GitHub Gist with an implementation that uses Google's language detection library, which outperforms the Tika library.
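
The two steps can be sketched in a few lines of Python. This is only an illustration of the routing idea, not the implementation from the Gist: the stop-word heuristic in `detect_language` is a toy stand-in for a real language detection library, and the helper names (`field_for`, `to_solr_doc`, `to_solr_params`) and the English fallback are assumptions made for this example. Only the field names text_de / text_en and the Solr parameters `q` and `df` (default field) come from Solr itself.

```python
# Toy stop-word lists; a stand-in for a real language detection library.
STOP_WORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "for", "with"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "mit", "für"},
}
DEFAULT_LANG = "en"  # assumed fallback when no stop word matches


def detect_language(text):
    """Guess the language by counting stop-word hits (toy heuristic)."""
    tokens = text.lower().split()
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOP_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_LANG


def field_for(text):
    """Map a text to its language-specific field, e.g. text_de or text_en."""
    return "text_" + detect_language(text)


def to_solr_doc(doc_id, text):
    """Indexing time: put the text into the field for its detected language."""
    return {"id": doc_id, field_for(text): text}


def to_solr_params(query):
    """Query time: search only the field for the query's detected language."""
    return {"q": query, "df": field_for(query)}


if __name__ == "__main__":
    print(to_solr_doc("1", "das Haus und der Garten"))
    print(to_solr_params("history of the printing press"))
```

Because the same detection function is used on both sides, a German query ends up in the same text_de field that German documents were indexed into, so it is analyzed with the same stemmer and stop words.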