public abstract class IndexReader
extends Object
implements Closeable
IndexReader is an abstract class, providing an interface for accessing a point-in-time view of an index. Any changes made to the index via IndexWriter will not be visible until a new IndexReader is opened. It's best to use DirectoryReader.open(IndexWriter) to obtain an IndexReader if your IndexWriter is in-process. When you need to re-open to see changes to the index, it's best to use DirectoryReader.openIfChanged(DirectoryReader), since the new reader will share resources with the previous one when possible. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.
There are two different types of IndexReaders:
- LeafReader: These indexes do not consist of several sub-readers; they are atomic. They support retrieval of stored fields, doc values, terms, and postings.
- CompositeReader: Instances of this reader (like DirectoryReader) can only be used to get stored fields from the underlying LeafReaders; it is not possible to retrieve postings from them directly. To do that, get the sub-readers via CompositeReader.getSequentialSubReaders().
IndexReader instances for indexes on disk are usually constructed with a call to one of the static DirectoryReader.open() methods, e.g. DirectoryReader.open(org.apache.lucene.store.Directory). Because DirectoryReader implements the CompositeReader interface, it is not possible to get postings from it directly.
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral -- they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
Modifier and Type | Class | Description |
---|---|---|
static interface | IndexReader.CacheHelper | A utility class that gives hooks in order to help build a cache based on the data that is contained in this index. |
static class | IndexReader.CacheKey | A cache key identifying a resource that is being cached on. |
static interface | IndexReader.ClosedListener | A listener that is called when a resource gets closed. |
Modifier and Type | Method | Description |
---|---|---|
void | close() | Closes files associated with this index. |
void | decRef() | Expert: decreases the refCount of this IndexReader instance. |
abstract int | docFreq(Term term) | Returns the number of documents containing the term. |
protected abstract void | doClose() | Implements close. |
Document | document(int docID) | Returns the stored fields of the nth Document in this index. |
Document | document(int docID, Set<String> fieldsToLoad) | Like document(int) but only loads the specified fields. |
abstract void | document(int docID, StoredFieldVisitor visitor) | Expert: visits the fields of a stored document, for custom processing/loading of each field. |
protected void | ensureOpen() | Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns. |
boolean | equals(Object obj) | |
abstract IndexReaderContext | getContext() | Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree. |
abstract int | getDocCount(String field) | Returns the number of documents that have at least one term for this field. |
abstract IndexReader.CacheHelper | getReaderCacheHelper() | Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this reader. |
int | getRefCount() | Expert: returns the current refCount for this reader. |
abstract long | getSumDocFreq(String field) | Returns the sum of TermsEnum.docFreq() for all terms in this field. |
abstract long | getSumTotalTermFreq(String field) | Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. |
Terms | getTermVector(int docID, String field) | Retrieve term vector for this document and field, or null if term vectors were not indexed. |
abstract Fields | getTermVectors(int docID) | Retrieve term vectors for this document, or null if term vectors were not indexed. |
boolean | hasDeletions() | Returns true if any documents have been deleted. |
int | hashCode() | |
void | incRef() | Expert: increments the refCount of this IndexReader instance. |
List<LeafReaderContext> | leaves() | Returns the reader's leaves, or itself if this reader is atomic. |
abstract int | maxDoc() | Returns one greater than the largest possible document number. |
int | numDeletedDocs() | Returns the number of deleted documents. |
abstract int | numDocs() | Returns the number of documents in this index. |
void | registerParentReader(IndexReader reader) | Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterLeafReader) to register the parent at the child (this reader) on construction of the parent. |
abstract long | totalTermFreq(Term term) | Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). |
boolean | tryIncRef() | Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet and returns true iff the refCount was successfully incremented, otherwise false. |
public final void registerParentReader(IndexReader reader)
Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterLeafReader) to register the parent at the child (this reader) on construction of the parent. When this reader is closed, it will mark all registered parents as closed, too. The references to parent readers are weak only, so they can be GCed once they are no longer in use.
public final int getRefCount()
Expert: returns the current refCount for this reader.
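public final void incRef()
Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See Also: decRef(), tryIncRef()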
public final boolean tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet, and returns true iff the refCount was successfully incremented, otherwise false. If this method returns false, the reader is either already closed or is currently being closed. Either way, this reader instance shouldn't be used by an application unless true is returned.
RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
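public final void decRef() throws IOException
Expert: decreases the refCount of this IndexReader instance.
Throws: IOException - in case an IOException occurs in doClose()
See Also: incRef()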
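The reference-counting contract described for incRef(), tryIncRef() and decRef() can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: RefCountedResource and useSafely are hypothetical names, the class uses only the JDK, and it is not Lucene's actual implementation. It starts with one reference held by the creator, refuses new references once the count has reached zero, and pairs every acquired reference with a release in a finally clause.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified stand-in for a reference-counted reader (hypothetical, JDK only). */
final class RefCountedResource {
    // A new resource starts with one reference, owned by whoever created it.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    int getRefCount() { return refCount.get(); }

    boolean isClosed() { return closed; }

    /** Takes a reference, failing loudly if the resource is already closed. */
    void incRef() {
        if (!tryIncRef()) {
            throw new IllegalStateException("resource is closed");
        }
    }

    /** Takes a reference only while the count is still positive. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false; // closed, or being closed by another thread
    }

    /** Releases one reference; the last release closes the resource. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // a real reader would release its files here
        } else if (remaining < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
    }

    /** Runs a task while holding a reference, releasing it in a finally clause. */
    static boolean useSafely(RefCountedResource resource, Runnable task) {
        if (!resource.tryIncRef()) {
            return false; // do not touch a resource that is closed or closing
        }
        try {
            task.run();
            return true;
        } finally {
            resource.decRef();
        }
    }
}
```

In this model, the creator's final decRef() plays the role that close() plays in the text above: the resource is only really closed once every outstanding reference has been released.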
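protected final void ensureOpen() throws AlreadyClosedException
Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns.
Throws: AlreadyClosedException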
public final boolean equals(Object obj)
IndexReader subclasses are not allowed to implement equals/hashCode, so methods are declared final.
Overrides: equals in class Object
public final int hashCode()
IndexReader subclasses are not allowed to implement equals/hashCode, so methods are declared final.
Overrides: hashCode in class Object
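public abstract Fields getTermVectors(int docID) throws IOException
Retrieve term vectors for this document, or null if term vectors were not indexed.
public final Terms getTermVector(int docID, String field) throws IOException
Retrieve term vector for this document and field, or null if term vectors were not indexed.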
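public abstract int numDocs()
Returns the number of documents in this index.
NOTE: This operation may run in O(maxDoc). Implementations that can't return this number in constant time should cache it.
public abstract int maxDoc()
Returns one greater than the largest possible document number.
public final int numDeletedDocs()
Returns the number of deleted documents.
NOTE: This operation may run in O(maxDoc).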
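How the three counts relate can be sketched with a toy model. DocCountsSketch and its BitSet of live documents are hypothetical and use only the JDK; the model assumes that the number of deleted documents equals maxDoc() minus numDocs(), and that deleting a document does not renumber the remaining ones until a merge happens.

```java
import java.util.BitSet;

/** Toy model of document bookkeeping; class and field names are hypothetical. */
final class DocCountsSketch {
    private final int maxDoc;      // one greater than the largest document number
    private final BitSet liveDocs; // bit set = document is live (not deleted)

    DocCountsSketch(int maxDoc) {
        this.maxDoc = maxDoc;
        this.liveDocs = new BitSet(maxDoc);
        liveDocs.set(0, maxDoc); // every document starts out live
    }

    /** Marks a document as deleted; other document numbers stay unchanged. */
    void delete(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID " + docID);
        }
        liveDocs.clear(docID);
    }

    int maxDoc() { return maxDoc; }

    // Counting live bits scans the whole set, i.e. O(maxDoc) in this model,
    // which is the situation where caching the result is advisable.
    int numDocs() { return liveDocs.cardinality(); }

    // Assumption of this sketch: deleted = maxDoc - numDocs.
    int numDeletedDocs() { return maxDoc() - numDocs(); }

    boolean hasDeletions() { return numDeletedDocs() > 0; }
}
```

Deleting two documents out of five in this model leaves maxDoc() at 5 while numDocs() drops to 3, so document numbers alone do not tell you whether a document is still live.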
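public abstract void document(int docID, StoredFieldVisitor visitor) throws IOException
Expert: visits the fields of a stored document, for custom processing/loading of each field. If you simply want to load all fields, use document(int). If you want to load a subset, use DocumentStoredFieldVisitor.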
public final Document document(int docID) throws IOException
Returns the stored fields of the nth Document in this index. This is just sugar for using DocumentStoredFieldVisitor.
NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can test if the doc is deleted by checking the Bits returned from MultiBits.getLiveDocs(org.apache.lucene.index.IndexReader).
NOTE: only the content of a field is returned, if that field was stored during indexing. Metadata like boost, omitNorm, IndexOptions, tokenized, etc., are not preserved.
Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public final Document document(int docID, Set<String> fieldsToLoad) throws IOException
Like document(int) but only loads the specified fields. Note that this is simply sugar for DocumentStoredFieldVisitor.DocumentStoredFieldVisitor(Set).
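public boolean hasDeletions()
Returns true if any documents have been deleted.
public final void close() throws IOException
Closes files associated with this index.
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Throws: IOException - if there is a low-level IO error
protected abstract void doClose() throws IOException
Implements close.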
public abstract IndexReaderContext getContext()
Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.
Iff this reader is composed of sub-readers, i.e. this reader is a composite reader, this method returns a CompositeReaderContext holding the reader's direct children as well as a view of the reader tree's atomic leaf contexts. All sub-IndexReaderContext instances referenced from this reader's top-level context are private to this reader and are not shared with another context tree. For example, IndexSearcher uses this API to drive searching by one atomic leaf reader at a time. If this reader is not composed of child readers, this method returns a LeafReaderContext.
Note: None of the sub-CompositeReaderContext instances referenced from this top-level context support CompositeReaderContext.leaves(). Only the top-level context maintains the convenience leaf-view for performance reasons.
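public final List<LeafReaderContext> leaves()
Returns the reader's leaves, or itself if this reader is atomic. This is a convenience method calling this.getContext().leaves().
See Also: IndexReaderContext.leaves()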
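Because postings are only available per leaf, code that starts from a top-level document number usually has to find the leaf that holds it and the document's number within that leaf. The following is a minimal sketch of that lookup, assuming each leaf records the first top-level document number it covers (a "doc base"). LeafLookupSketch and its methods are hypothetical, JDK-only names, and the sketch assumes every leaf holds at least one document.

```java
import java.util.Arrays;

/** Toy model of a composite reader's leaves; names are hypothetical. */
final class LeafLookupSketch {
    private final int[] docBases; // docBases[i] = first top-level doc number of leaf i
    private final int maxDoc;     // total number of document slots across all leaves

    /** Builds the model from the maxDoc of each leaf, in leaf order. */
    LeafLookupSketch(int... leafMaxDocs) {
        docBases = new int[leafMaxDocs.length];
        int base = 0;
        for (int i = 0; i < leafMaxDocs.length; i++) {
            if (leafMaxDocs[i] <= 0) {
                throw new IllegalArgumentException("each leaf needs at least one document");
            }
            docBases[i] = base;
            base += leafMaxDocs[i];
        }
        maxDoc = base;
    }

    /** Returns the index of the leaf that contains the top-level docID. */
    int leafIndex(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID " + docID);
        }
        int pos = Arrays.binarySearch(docBases, docID);
        // Exact hit: docID is the first document of that leaf.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the owning
        // leaf is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Converts a top-level docID into a leaf-local document number. */
    int localDocID(int docID) {
        return docID - docBases[leafIndex(docID)];
    }
}
```

With leaves of 3, 2 and 4 documents, top-level document 4 falls in the second leaf (index 1) as local document 1.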
public abstract IndexReader.CacheHelper getReaderCacheHelper()
Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this reader. Two readers that have different data or different sets of deleted documents will be considered different.
A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped reader.
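One way to picture the caching contract is a cache keyed by an opaque per-reader key that evicts its entry when the reader reports that it has closed, and that skips caching entirely when no helper is available. The sketch below is a JDK-only model: ReaderCacheSketch, Helper, fireClosed and the use of Consumer as the listener type are all hypothetical stand-ins, not the Lucene interfaces, and the class is not thread safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.IntSupplier;

/** Toy cache keyed by a per-reader key; all names here are hypothetical. */
final class ReaderCacheSketch {

    /** Stand-in for a reader's cache helper: an identity key plus close hooks. */
    static final class Helper {
        private final Object key = new Object();
        private final List<Consumer<Object>> listeners = new ArrayList<>();

        Object getKey() { return key; }

        void addClosedListener(Consumer<Object> listener) { listeners.add(listener); }

        /** In this model, the owning reader calls this when it closes. */
        void fireClosed() {
            for (Consumer<Object> listener : listeners) {
                listener.accept(key);
            }
        }
    }

    private final Map<Object, Integer> cache = new HashMap<>();

    /** Returns the cached value for the helper's key, computing it on a miss. */
    int get(Helper helper, IntSupplier compute) {
        if (helper == null) {
            return compute.getAsInt(); // reader not suited for caching
        }
        Object key = helper.getKey();
        Integer cached = cache.get(key);
        if (cached == null) {
            cached = compute.getAsInt();
            cache.put(key, cached);
            // Evict the entry once the reader behind this key is closed.
            helper.addClosedListener(closedKey -> cache.remove(closedKey));
        }
        return cached;
    }

    int size() { return cache.size(); }
}
```

Registering the eviction listener at insertion time keeps the cache from holding entries for readers that no longer exist.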
public abstract int docFreq(Term term) throws IOException
Returns the number of documents containing the term. This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
See Also: TermsEnum.docFreq()
public abstract long totalTermFreq(Term term) throws IOException
Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). Note that, like other term measures, this measure does not take deleted documents into account.
public abstract long getSumDocFreq(String field) throws IOException
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
See Also: Terms.getSumDocFreq()
public abstract int getDocCount(String field) throws IOException
Returns the number of documents that have at least one term for this field.
See Also: Terms.getDocCount()
public abstract long getSumTotalTermFreq(String field) throws IOException
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
See Also: Terms.getSumTotalTermFreq()
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.