Package org.apache.lucene.tests.analysis
Class MockAnalyzer
java.lang.Object
  org.apache.lucene.analysis.Analyzer
    org.apache.lucene.tests.analysis.MockAnalyzer

All Implemented Interfaces:
Closeable, AutoCloseable
public final class MockAnalyzer extends Analyzer
Analyzer for testing.
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a queryparser or analyzer-wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior:
- By default, the assertions in MockTokenizer are turned on for extra checks that the consumer is consuming properly. These checks can be disabled with setEnableChecks(boolean).
- Payload data is randomly injected into the stream for more thorough testing of payloads.
- See Also:
MockTokenizer
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
Constructor Summary
Constructors
- MockAnalyzer(Random random): Create a Whitespace-lowercasing analyzer with no stopwords removal.
- MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
- MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter): Creates a new MockAnalyzer.
-
Method Summary
All Methods / Instance Methods / Concrete Methods
- Analyzer.TokenStreamComponents createComponents(String fieldName)
- int getOffsetGap(String fieldName): Get the offset gap between tokens in fields if several fields with the same name were added.
- int getPositionIncrementGap(String fieldName)
- protected TokenStream normalize(String fieldName, TokenStream in)
- void setEnableChecks(boolean enableChecks): Toggle consumer workflow checking: if your test consumes tokenstreams normally you should leave this enabled.
- void setMaxTokenLength(int length): Set the maxTokenLength for MockTokenizer.
- void setOffsetGap(int offsetGap): Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
- void setPositionIncrementGap(int positionIncrementGap)
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getReuseStrategy, initReader, initReaderForNormalization, normalize, tokenStream, tokenStream
-
Constructor Detail
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase, CharacterRunAutomaton filter)
Creates a new MockAnalyzer.
- Parameters:
random - Random for payloads behavior
runAutomaton - DFA describing how tokenization should happen (e.g. [a-zA-Z]+)
lowerCase - true if the tokenizer should lowercase terms
filter - DFA describing how terms should be filtered (set of stopwords, etc)
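The roles of runAutomaton, lowerCase and filter can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not MockAnalyzer's implementation: a java.util.regex.Pattern stands in for the tokenizing CharacterRunAutomaton, a plain Set of strings stands in for the filter automaton, and the class and method names (MockAnalyzerPipelineSketch, analyze) are hypothetical. Payload injection and consumer checks are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of "tokenize, optionally lowercase, then filter".
// It does not use or reproduce any Lucene class.
public class MockAnalyzerPipelineSketch {

  public static List<String> analyze(
      String text, Pattern tokenPattern, boolean lowerCase, Set<String> stopwords) {
    List<String> terms = new ArrayList<>();
    Matcher matcher = tokenPattern.matcher(text);
    while (matcher.find()) {
      // Each maximal run accepted by the pattern becomes one term.
      String term = matcher.group();
      if (lowerCase) {
        term = term.toLowerCase(Locale.ROOT);
      }
      // Assumption: filtering sees the already-lowercased term.
      if (!stopwords.contains(term)) {
        terms.add(term);
      }
    }
    return terms;
  }

  public static void main(String[] args) {
    List<String> terms =
        analyze("The Quick brown fox, the END", Pattern.compile("[a-zA-Z]+"), true, Set.of("the"));
    System.out.println(terms); // prints [quick, brown, fox, end]
  }
}
```

In this model, a stopword set written in lowercase removes "The" and "the" alike, because lowercasing is applied before the filter step.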
-
MockAnalyzer
public MockAnalyzer(Random random, CharacterRunAutomaton runAutomaton, boolean lowerCase)
-
MockAnalyzer
public MockAnalyzer(Random random)
Create a Whitespace-lowercasing analyzer with no stopwords removal.
Equivalent to MockAnalyzer(random, MockTokenizer.WHITESPACE, true, MockTokenFilter.EMPTY_STOPSET).
-
Method Detail
-
createComponents
public Analyzer.TokenStreamComponents createComponents(String fieldName)
- Specified by:
createComponents in class Analyzer
-
normalize
protected TokenStream normalize(String fieldName, TokenStream in)
-
setPositionIncrementGap
public void setPositionIncrementGap(int positionIncrementGap)
-
getPositionIncrementGap
public int getPositionIncrementGap(String fieldName)
- Overrides:
getPositionIncrementGap in class Analyzer
-
setOffsetGap
public void setOffsetGap(int offsetGap)
Set a new offset gap which will then be added to the offset when several fields with the same name are indexed.
- Parameters:
offsetGap - The offset gap that should be used.
-
getOffsetGap
public int getOffsetGap(String fieldName)
Get the offset gap between tokens in fields if several fields with the same name were added.
- Overrides:
getOffsetGap in class Analyzer
- Parameters:
fieldName - Currently not used, the same offset gap is returned for each field.
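The effect of the offset gap can be sketched with plain arithmetic. The model below assumes that, for several values of the same field, the indexer keeps a running offset and advances it by each value's final end offset plus the offset gap. That is a simplification for illustration, and the names OffsetGapSketch and offsetBases are hypothetical.

```java
import java.util.Arrays;

// Hypothetical model of how an offset gap shifts the offsets of later values
// of the same field. It does not use or reproduce any Lucene class.
public class OffsetGapSketch {

  /**
   * Returns, for each value of a multi-valued field, the base that is added
   * to that value's token offsets.
   *
   * @param valueEndOffsets final end offset of each value, in indexing order
   * @param offsetGap gap added between consecutive values
   */
  public static int[] offsetBases(int[] valueEndOffsets, int offsetGap) {
    int[] bases = new int[valueEndOffsets.length];
    int base = 0;
    for (int i = 0; i < valueEndOffsets.length; i++) {
      bases[i] = base;
      // Assumption: the next value starts after this value's end plus the gap.
      base += valueEndOffsets[i] + offsetGap;
    }
    return bases;
  }

  public static void main(String[] args) {
    int[] bases = offsetBases(new int[] {5, 3, 7}, 1);
    System.out.println(Arrays.toString(bases)); // prints [0, 6, 10]
  }
}
```

With a gap of 0 the values would be laid end to end, so the same input would give bases 0, 5 and 8.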
-
setEnableChecks
public void setEnableChecks(boolean enableChecks)
Toggle consumer workflow checking: if your test consumes tokenstreams normally you should leave this enabled.
-
setMaxTokenLength
public void setMaxTokenLength(int length)
Set the maxTokenLength for MockTokenizer.
-