We decided to avoid the keyword type and use text for everything, relying on truncation in the analyzer stage to keep tokens within the Lucene limits. This does not seem to have worked: importing the enwiki dumps into relforge hit an error on the external_link field for the page https://en.wikipedia.org/wiki/Wikipedia:Editor_assistance/Requests/Archive_122?action=cirrusdump
The error was:
java.lang.IllegalArgumentException: Document contains at least one immense term in field="external_link" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms.
The document should be found at https://relforge1001.eqiad.wmnet:9243/crosswiki_enwiki_general/page/45641485 but the entire document was not indexed because of the above exception. We need to put together a test case and fix our mapping to handle this case.
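
As a starting point for the test case, here is a minimal Python sketch of the limit quoted in the error message. It only illustrates that the 32766 limit is measured in bytes of UTF-8 rather than in characters, so a term can be under the limit by character count and still be rejected. It does not reproduce our analyzer, and it does not claim this is the cause of the failure above. The helper names (utf8_length, is_immense, truncate_to_bytes) and the example.org URLs are hypothetical, made up for illustration.

```python
# Limit quoted in the Lucene error message: bytes of UTF-8, not characters.
MAX_TERM_BYTES = 32766


def utf8_length(term: str) -> int:
    """Return the length of the term once encoded as UTF-8."""
    return len(term.encode("utf-8"))


def is_immense(term: str, max_bytes: int = MAX_TERM_BYTES) -> bool:
    """True if the term's UTF-8 encoding is longer than max_bytes."""
    return utf8_length(term) > max_bytes


def truncate_to_bytes(term: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Cut the term so its UTF-8 encoding fits in max_bytes.

    errors="ignore" drops a trailing partial multi-byte sequence, so the
    result is always valid text and never longer than max_bytes.
    """
    encoded = term.encode("utf-8")
    if len(encoded) <= max_bytes:
        return term
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# Hypothetical inputs, not taken from the failing document.
ascii_url = "http://example.org/" + "a" * 40000
# Fewer than 32766 characters, but each "é" takes two bytes in UTF-8.
non_ascii_url = "http://example.org/" + "é" * 20000

if __name__ == "__main__":
    for name, url in [("ascii", ascii_url), ("non-ascii", non_ascii_url)]:
        cut = truncate_to_bytes(url)
        print(name, len(url), utf8_length(url), is_immense(url), utf8_length(cut))
```

A real test case would need to push a value like these through the actual external_link analyzer and check that every emitted token passes the byte check.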