Class BlockReader
- java.lang.Object
-
- org.apache.lucene.index.TermsEnum
-
- org.apache.lucene.index.BaseTermsEnum
-
- org.apache.lucene.codecs.uniformsplit.BlockReader
-
- All Implemented Interfaces:
Accountable, BytesRefIterator
- Direct Known Subclasses:
IntersectBlockReader, STBlockReader
public class BlockReader extends BaseTermsEnum implements Accountable
Seeks the block corresponding to a given term, reads the block bytes, and scans the block terms.
Reads the block fully into blockReadBuffer, then scans the block terms in memory. The details region is lazily decoded with termStatesReadBuffer, which shares the same byte array as blockReadBuffer. See BlockWriter and BlockLine for the block format.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
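The shared-array arrangement described above can be illustrated with a small, self-contained sketch. It does not use Lucene classes: `SharedBlockBufferSketch`, its toy block layout, and the helper `buildBlock` are all hypothetical, and `java.nio.ByteBuffer` stands in for `ByteArrayDataInput`. The only point is that two readers over one byte array keep independent positions, so the details region can be decoded later without copying.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy model only: two read views over one block byte array. */
final class SharedBlockBufferSketch {
  private final ByteBuffer linesView;   // plays the role of blockReadBuffer
  private final ByteBuffer detailsView; // plays the role of termStatesReadBuffer
  private final int termsLength;

  SharedBlockBufferSketch(byte[] blockBytes) {
    // Both views wrap the same array; each keeps its own read position.
    linesView = ByteBuffer.wrap(blockBytes);
    detailsView = ByteBuffer.wrap(blockBytes);
    termsLength = linesView.getInt();
    // Position the details view at the start of the (hypothetical) details region.
    detailsView.position(Integer.BYTES + termsLength);
  }

  /** Hypothetical layout: [int termsLength][term bytes][one int per detail]. */
  static byte[] buildBlock(String terms, int... details) {
    byte[] termBytes = terms.getBytes(StandardCharsets.UTF_8);
    ByteBuffer block =
        ByteBuffer.allocate(Integer.BYTES + termBytes.length + Integer.BYTES * details.length);
    block.putInt(termBytes.length);
    block.put(termBytes);
    for (int detail : details) {
      block.putInt(detail);
    }
    return block.array();
  }

  /** Reads the term bytes once; only the lines view advances. */
  String readTerms() {
    byte[] termBytes = new byte[termsLength];
    linesView.get(termBytes);
    return new String(termBytes, StandardCharsets.UTF_8);
  }

  /** Decodes the next detail on demand; only the details view advances. */
  int nextDetail() {
    return detailsView.getInt();
  }
}
```

In this sketch, calling `nextDetail()` before or after `readTerms()` gives the same values, because neither view moves the other's position.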
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.index.TermsEnum
TermsEnum.SeekStatus
-
-
Field Summary
Fields
Modifier and Type  Field  Description
protected BlockDecoder blockDecoder
protected int blockFirstLineStart
    Offset of the start of the first line of the current block (just after the header), relative to the block start.
protected BlockHeader blockHeader
    Current block header.
protected BlockHeader.Serializer blockHeaderReader
protected IndexInput blockInput
    IndexInput on the block file.
protected BlockLine blockLine
    Current block line.
protected BlockLine.Serializer blockLineReader
protected ByteArrayDataInput blockReadBuffer
    In-memory read buffer for the current block.
protected long blockStartFP
    Current block start file pointer, absolute in the block file.
protected IndexDictionary.Browser dictionaryBrowser
    Holds the IndexDictionary.Browser once loaded.
protected IndexDictionary.BrowserSupplier dictionaryBrowserSupplier
    IndexDictionary.Browser supplier for lazy loading.
protected FieldMetadata fieldMetadata
protected BytesRefBuilder forcedTerm
    Set when seekExact(BytesRef, TermState) is called.
protected int lineIndexInBlock
    Current line index in the block.
protected PostingsReaderBase postingsReader
protected BytesRef scratchBlockBytes
protected BlockLine scratchBlockLine
protected BlockTermState scratchTermState
protected BlockTermState termState
    Current block line details.
protected boolean termStateForced
    Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
protected DeltaBaseTermStateSerializer termStateSerializer
protected ByteArrayDataInput termStatesReadBuffer
    In-memory read buffer for the details region of the current block.
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors
Modifier  Constructor  Description
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder)
-
Method Summary
Modifier and Type  Method  Description
protected void clearTermState()
protected int compareToMiddleAndJump(BytesRef searchedTerm)
    Compares the searched term to the middle term of the block.
protected BlockHeader.Serializer createBlockHeaderSerializer()
protected BlockLine.Serializer createBlockLineSerializer()
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
protected BytesRef decodeBlockBytesIfNeeded(int numBlockBytes)
int docFreq()
protected IndexDictionary.Browser getOrCreateDictionaryBrowser()
ImpactsEnum impacts(int flags)
protected void initializeBlockReadLazily()
protected void initializeHeader(BytesRef searchedTerm, long targetBlockStartFP)
    Reads and sets blockHeader.
protected boolean isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP)
    Indicates whether the searched term is beyond the last term of the field.
protected boolean isCurrentTerm(BytesRef searchedTerm)
protected CorruptIndexException newCorruptIndexException(String msg, Long fp)
BytesRef next()
protected BytesRef nextTerm()
    Moves to the next term line and reads it; the line may be in the next block.
long ord()
PostingsEnum postings(PostingsEnum reuse, int flags)
long ramBytesUsed()
protected BlockHeader readHeader()
    Reads the block header.
protected BlockLine readLineInBlock()
    Reads the current block line.
protected BlockTermState readTermState()
    Reads the BlockTermState on the current line.
protected BlockTermState readTermStateIfNotRead()
    Reads the BlockTermState if it is not already set.
TermsEnum.SeekStatus seekCeil(BytesRef searchedTerm)
void seekExact(long ord)
    Not supported.
boolean seekExact(BytesRef searchedTerm)
void seekExact(BytesRef term, TermState state)
    Positions this BlockReader without re-seeking the term dictionary.
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm)
    Seeks to the provided term in this block.
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP)
    Seeks to the provided term in the block starting at the provided file pointer.
BytesRef term()
TermState termState()
long totalTermFreq()
-
Methods inherited from class org.apache.lucene.index.BaseTermsEnum
attributes
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
-
-
-
Field Detail
-
blockInput
protected IndexInput blockInput
IndexInput on the block file.
-
postingsReader
protected final PostingsReaderBase postingsReader
-
fieldMetadata
protected final FieldMetadata fieldMetadata
-
blockDecoder
protected final BlockDecoder blockDecoder
-
blockHeaderReader
protected BlockHeader.Serializer blockHeaderReader
-
blockLineReader
protected BlockLine.Serializer blockLineReader
-
blockReadBuffer
protected ByteArrayDataInput blockReadBuffer
In-memory read buffer for the current block.
-
termStatesReadBuffer
protected ByteArrayDataInput termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, with a different position.
-
termStateSerializer
protected DeltaBaseTermStateSerializer termStateSerializer
-
dictionaryBrowserSupplier
protected final IndexDictionary.BrowserSupplier dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.
-
dictionaryBrowser
protected IndexDictionary.Browser dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.
-
blockStartFP
protected long blockStartFP
Current block start file pointer, absolute in the block file.
-
blockHeader
protected BlockHeader blockHeader
Current block header.
-
blockLine
protected BlockLine blockLine
Current block line.
-
termState
protected BlockTermState termState
Current block line details.
-
blockFirstLineStart
protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.
-
lineIndexInBlock
protected int lineIndexInBlock
Current line index in the block.
-
termStateForced
protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
- See Also:
forcedTerm
-
forcedTerm
protected BytesRefBuilder forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case the terms block file is not accessed (no seek is performed); the postings file is read directly, because the TermState already holds the file pointers into the postings file.
-
scratchBlockBytes
protected BytesRef scratchBlockBytes
-
scratchTermState
protected final BlockTermState scratchTermState
-
scratchBlockLine
protected BlockLine scratchBlockLine
-
-
Constructor Detail
-
BlockReader
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
- Parameters:
dictionaryBrowserSupplier - to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
- Throws:
IOException
-
-
Method Detail
-
seekCeil
public TermsEnum.SeekStatus seekCeil(BytesRef searchedTerm) throws IOException
- Specified by:
seekCeil in class TermsEnum
- Throws:
IOException
-
seekExact
public boolean seekExact(BytesRef searchedTerm) throws IOException
- Overrides:
seekExact in class BaseTermsEnum
- Throws:
IOException
-
isCurrentTerm
protected boolean isCurrentTerm(BytesRef searchedTerm)
-
isBeyondLastTerm
protected boolean isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP)
Indicates whether the searched term is beyond the last term of the field.
- Parameters:
blockStartFP - The current block start file pointer.
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer. Does not exceed the block.
- Throws:
IOException
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm) throws IOException
Seeks to the provided term in this block.
Does not exceed this block; TermsEnum.SeekStatus.END is returned if the term follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
- Throws:
IOException
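As a rough illustration of a linear scan over incrementally encoded terms, the sketch below rebuilds each term from a shared-prefix length plus a suffix and compares it with the searched term. It is a toy model, not the Lucene implementation: `IncrementalScanSketch`, `Line`, and `Status` are hypothetical names, terms are Java strings rather than byte sequences, and the sketch does not try to skip comparisons using prefix lengths as a real implementation might.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: linear scan over prefix-compressed, sorted terms. */
final class IncrementalScanSketch {
  enum Status { FOUND, NOT_FOUND, END }

  /** Hypothetical block line: length of the prefix shared with the previous term, then the rest. */
  static final class Line {
    final int prefixLength;
    final String suffix;

    Line(int prefixLength, String suffix) {
      this.prefixLength = prefixLength;
      this.suffix = suffix;
    }
  }

  /** Encodes sorted terms incrementally against the previous term. */
  static List<Line> encode(List<String> sortedTerms) {
    List<Line> lines = new ArrayList<>();
    String previous = "";
    for (String term : sortedTerms) {
      int shared = 0;
      int max = Math.min(previous.length(), term.length());
      while (shared < max && previous.charAt(shared) == term.charAt(shared)) {
        shared++;
      }
      lines.add(new Line(shared, term.substring(shared)));
      previous = term;
    }
    return lines;
  }

  /** Scans the lines in order; never looks past the end of this block. */
  static Status seek(List<Line> lines, String searchedTerm) {
    StringBuilder current = new StringBuilder();
    for (Line line : lines) {
      // Keep the shared prefix of the previous term and append the new suffix.
      current.setLength(line.prefixLength);
      current.append(line.suffix);
      int cmp = current.toString().compareTo(searchedTerm);
      if (cmp == 0) {
        return Status.FOUND;
      }
      if (cmp > 0) {
        return Status.NOT_FOUND; // positioned on the first term after the searched term
      }
    }
    return Status.END; // the searched term follows every term of the block
  }
}
```

For the terms "apple", "apply", "banana", this sketch stores "apply" as a shared prefix of length 4 plus the suffix "y".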
-
compareToMiddleAndJump
protected int compareToMiddleAndJump(BytesRef searchedTerm) throws IOException
Compares the searched term to the middle term of the block. If the searched term is lexicographically equal to or after the middle term, jumps directly to the second half of the block.
- Returns:
The comparison between the searched term and the middle term.
- Throws:
IOException
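The compare-and-jump idea can be sketched as follows, assuming for simplicity that the block's terms sit in a plain sorted array and that the middle term is directly addressable. `MiddleJumpSketch` and its methods are hypothetical and only mirror the documented contract: the return value is the comparison of the searched term with the middle term, and the scan position moves to the middle line when that comparison is zero or positive.

```java
/** Toy model only: skip the first half of a sorted block when the target is not before the middle. */
final class MiddleJumpSketch {
  private final String[] terms;
  private int lineIndex; // index of the next line to scan

  MiddleJumpSketch(String... sortedTerms) {
    if (sortedTerms.length == 0) {
      throw new IllegalArgumentException("the block must contain at least one term");
    }
    this.terms = sortedTerms;
  }

  /** Returns searchedTerm compared to the middle term; jumps to the middle if the result is >= 0. */
  int compareToMiddleAndJump(String searchedTerm) {
    int middle = terms.length / 2;
    int cmp = searchedTerm.compareTo(terms[middle]);
    if (cmp >= 0) {
      lineIndex = middle;
    }
    return cmp;
  }

  /** Returns the index of the first term >= searchedTerm, or -1 if it follows the block. */
  int seek(String searchedTerm) {
    lineIndex = 0;
    if (compareToMiddleAndJump(searchedTerm) == 0) {
      return lineIndex; // exact match on the middle term
    }
    // Linear scan from the start, or from the middle if the jump happened.
    while (lineIndex < terms.length && terms[lineIndex].compareTo(searchedTerm) < 0) {
      lineIndex++;
    }
    return lineIndex < terms.length ? lineIndex : -1;
  }
}
```

With five terms, a search for a term after the middle one starts scanning at index 2 instead of 0, which roughly halves the number of lines visited in that case.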
-
readLineInBlock
protected BlockLine readLineInBlock() throws IOException
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
- Returns:
The BlockLine; or null if there are no more lines in the block.
- Throws:
IOException
-
seekExact
public void seekExact(BytesRef term, TermState state)
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It will be read lazily only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
- Overrides:
seekExact in class BaseTermsEnum
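The lazy behaviour tied to forcedTerm can be pictured with a minimal model that counts block reads. Everything here is hypothetical (`LazyBlockSketch`, the string term, the single long standing in for a TermState's postings file pointer); it only mirrors the documented idea that positioning with a known state avoids touching the block until something like next() needs it.

```java
/** Toy model only: positioning with a known term state defers the block read. */
final class LazyBlockSketch {
  private int blockReads;
  private boolean blockLoaded;
  private String forcedTerm;             // analogous to the forcedTerm field
  private long postingsFilePointer = -1; // stands in for pointers held by a TermState

  /** Positions on the term using an already known state; no block bytes are read. */
  void seekExact(String term, long knownPostingsFilePointer) {
    forcedTerm = term;
    postingsFilePointer = knownPostingsFilePointer;
    blockLoaded = false;
  }

  /** Uses the known pointer directly, so the block is not needed. */
  long postings() {
    if (postingsFilePointer < 0) {
      throw new IllegalStateException("not positioned on a term");
    }
    return postingsFilePointer;
  }

  /** Moving on requires the block, so it is loaded here if it was not already. */
  void next() {
    if (!blockLoaded) {
      blockReads++; // a real reader would read and decode the block bytes here
      blockLoaded = true;
    }
    // Toy simplification: a real reader would also advance to the following term.
  }

  String term() {
    return forcedTerm;
  }

  int blockReads() {
    return blockReads;
  }
}
```

In this model, `seekExact` followed by `postings` leaves the read counter at zero; only a later `next` call increments it, and it does so once.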
-
seekExact
public void seekExact(long ord)
Not supported.
-
next
public BytesRef next() throws IOException
- Specified by:
next in interface BytesRefIterator
- Throws:
IOException
-
nextTerm
protected BytesRef nextTerm() throws IOException
Moves to the next term line and reads it; the line may be in the next block. The term details are not read yet. They will be read only when needed with readTermStateIfNotRead().
- Returns:
The read term bytes; or null if there are no more terms for the field.
- Throws:
IOException
-
initializeHeader
protected void initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) throws IOException
Reads and sets blockHeader. Sets null if there is no block for the field anymore.
- Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
- Throws:
IOException
-
initializeBlockReadLazily
protected void initializeBlockReadLazily() throws IOException
- Throws:
IOException
-
createBlockHeaderSerializer
protected BlockHeader.Serializer createBlockHeaderSerializer()
-
createBlockLineSerializer
protected BlockLine.Serializer createBlockLineSerializer()
-
createDeltaBaseTermStateSerializer
protected DeltaBaseTermStateSerializer createDeltaBaseTermStateSerializer()
-
readHeader
protected BlockHeader readHeader() throws IOException
Reads the block header. Sets blockHeader.
- Returns:
The block header; or null if there is no block for the field anymore.
- Throws:
IOException
-
decodeBlockBytesIfNeeded
protected BytesRef decodeBlockBytesIfNeeded(int numBlockBytes) throws IOException
- Throws:
IOException
-
readTermStateIfNotRead
protected BlockTermState readTermStateIfNotRead() throws IOException
Reads the BlockTermState if it is not already set. Sets termState.
- Throws:
IOException
-
readTermState
protected BlockTermState readTermState() throws IOException
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
- Returns:
The BlockTermState; or null if none.
- Throws:
IOException
-
docFreq
public int docFreq() throws IOException
- Specified by:
docFreq in class TermsEnum
- Throws:
IOException
-
totalTermFreq
public long totalTermFreq() throws IOException
- Specified by:
totalTermFreq in class TermsEnum
- Throws:
IOException
-
termState
public TermState termState() throws IOException
- Overrides:
termState in class BaseTermsEnum
- Throws:
IOException
-
postings
public PostingsEnum postings(PostingsEnum reuse, int flags) throws IOException
- Specified by:
postings in class TermsEnum
- Throws:
IOException
-
impacts
public ImpactsEnum impacts(int flags) throws IOException
- Specified by:
impacts in class TermsEnum
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
- Specified by:
ramBytesUsed in interface Accountable
-
getOrCreateDictionaryBrowser
protected IndexDictionary.Browser getOrCreateDictionaryBrowser() throws IOException
- Throws:
IOException
-
clearTermState
protected void clearTermState()
-
newCorruptIndexException
protected CorruptIndexException newCorruptIndexException(String msg, Long fp)
-
-