Interface TermToBytesRefAttribute
- All Superinterfaces:
Attribute
- All Known Subinterfaces:
BytesTermAttribute
- All Known Implementing Classes:
BytesTermAttributeImpl,CharTermAttributeImpl,PackedTokenAttributeImpl
This attribute is requested by TermsHashPerField to index the contents. This attribute can be
used to customize the final byte[] encoding of terms.
Consumers of this attribute call getBytesRef() for each term. Example:
final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
while (tokenStream.incrementToken() {
final BytesRef bytes = termAtt.getBytesRef();
if (isInteresting(bytes)) {
// because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
// you should make a copy if you need persistent access to the bytes, otherwise they will
// be rewritten across calls to incrementToken()
doSomethingWith(BytesRef.deepCopyOf(bytes));
}
}
...
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
- This is a very expert and internal API, please use
CharTermAttributeand its implementation for UTF-8 terms; to index binary terms, useBytesTermAttributeand its implementation.
-
Method Summary
-
Method Details
-
getBytesRef
BytesRef getBytesRef()Retrieve this attribute's BytesRef. The bytes are updated from the current term. The implementation may return a new instance or keep the previous one.- Returns:
- a BytesRef to be indexed (only stays valid until token stream gets incremented)
-