Create a pipeline to split terms in order to avoid Lucene limitations
Lucene limits the length of a single term (32766 bytes), as mentioned in the documentation for ignore_above. In many cases, however, this limit severely restricts how searchable file contents are.
One possible solution is an ingest pipeline that uses the Split processor.
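As a rough sketch of what such a pipeline could look like, the Python snippet below builds an ingest pipeline body with a single `split` processor and approximates its effect locally. The field name `content` and the whitespace separator are assumptions chosen for illustration, not part of this request, and the local simulation only mimics the processor with a regular expression split (edge cases such as trailing empty values may differ from Elasticsearch).

```python
import json
import re

# Hypothetical field name and separator, used for illustration only.
FIELD = "content"
SEPARATOR = r"\s+"


def build_split_pipeline(field: str = FIELD, separator: str = SEPARATOR) -> dict:
    """Return an ingest pipeline body containing a single split processor."""
    return {
        "description": "Split a long text field into an array of shorter values",
        "processors": [
            # The split processor turns one string into an array of strings.
            {"split": {"field": field, "separator": separator}}
        ],
    }


def simulate_split(doc: dict, field: str = FIELD, separator: str = SEPARATOR) -> dict:
    """Approximate the split processor locally with a regex split."""
    result = dict(doc)
    result[field] = re.split(separator, doc[field])
    return result


if __name__ == "__main__":
    # JSON body that could be sent when creating the ingest pipeline.
    print(json.dumps(build_split_pipeline(), indent=2))
    # Local preview of how a document's field would be broken up.
    print(simulate_split({"content": "alpha beta  gamma"}))
```

Each resulting array element is indexed as its own value, so a long field is less likely to produce a single oversized term. Whether whitespace is the right separator depends on the files being indexed.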