Package org.apache.lucene.util
Class BytesRefHash
java.lang.Object
    org.apache.lucene.util.BytesRefHash
public final class BytesRefHash extends Object
BytesRefHash is a special-purpose, hash-map-like data structure optimized for BytesRef instances. BytesRefHash maintains mappings of byte arrays to ordinals (Map<BytesRef,int>), storing the hashed bytes efficiently in contiguous storage. The mapping to the ordinal is encapsulated inside BytesRefHash, and the ordinal is guaranteed to increase for each added BytesRef.
Note: A BytesRef instance passed to add(BytesRef) must not be longer than ByteBlockPool.BYTE_BLOCK_SIZE - 2. The internal storage is limited to 2GB total byte storage.
NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
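
The following is a minimal sketch of the idea described above, using only the JDK. It is not Lucene's implementation: the class name TinyBytesHash and all of its fields are hypothetical, and a plain growing byte[] stands in for the ByteBlockPool. It shows bytes being appended to contiguous storage, ordinals increasing by one per new entry, and a repeated entry being reported as -(ord)-1.

```java
import java.util.Arrays;

/** Hypothetical stand-in for illustration only; this is not Lucene's implementation. */
final class TinyBytesHash {
    private byte[] pool = new byte[64];  // all added bytes, stored back to back
    private int poolUsed = 0;            // number of bytes of pool in use
    private int[] starts = new int[8];   // starts[ord] = offset of the entry in pool
    private int[] lengths = new int[8];  // lengths[ord] = length of the entry
    private int[] slots = new int[16];   // open-addressing table of ords, -1 = empty
    private int count = 0;               // number of distinct entries added

    TinyBytesHash() {
        Arrays.fill(slots, -1);
    }

    int size() {
        return count;
    }

    /** Returns the new ord, or -(ord)-1 if the same bytes were added before. */
    int add(byte[] bytes) {
        int slot = findSlot(bytes);
        if (slots[slot] != -1) {
            return -slots[slot] - 1; // already present
        }
        if (count == starts.length) {
            starts = Arrays.copyOf(starts, count * 2);
            lengths = Arrays.copyOf(lengths, count * 2);
        }
        if (poolUsed + bytes.length > pool.length) {
            pool = Arrays.copyOf(pool, Math.max(pool.length * 2, poolUsed + bytes.length));
        }
        System.arraycopy(bytes, 0, pool, poolUsed, bytes.length);
        starts[count] = poolUsed;
        lengths[count] = bytes.length;
        poolUsed += bytes.length;
        slots[slot] = count;
        int ord = count++;
        if (count * 2 > slots.length) {
            rehash(); // keep the table at most half full so probing terminates
        }
        return ord;
    }

    /** Returns a copy of the bytes stored for the given ord. */
    byte[] get(int ord) {
        return Arrays.copyOfRange(pool, starts[ord], starts[ord] + lengths[ord]);
    }

    // Linear probing: returns the slot holding these bytes, or the first empty slot.
    // Copying via get() keeps the sketch short; a real structure would compare in place.
    private int findSlot(byte[] bytes) {
        int mask = slots.length - 1;
        int slot = Arrays.hashCode(bytes) & mask;
        while (slots[slot] != -1 && !Arrays.equals(get(slots[slot]), bytes)) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    // Doubles the table and re-inserts every ord; the stored bytes do not move.
    private void rehash() {
        int[] old = slots;
        slots = new int[old.length * 2];
        Arrays.fill(slots, -1);
        for (int ord : old) {
            if (ord != -1) {
                slots[findSlot(get(ord))] = ord;
            }
        }
    }
}
```

In this sketch only the small table of ords is rebuilt when it grows; the byte storage is append-only, which is the property the class description refers to.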
-
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
static class    BytesRefHash.BytesStartArray    Manages allocation of the per-term addresses.
static class    BytesRefHash.DirectBytesStartArray    A simple BytesRefHash.BytesStartArray that tracks memory allocation using a private AtomicLong instance.
static class    BytesRefHash.MaxBytesLengthExceededException
static class    BytesRefHash.TrackingDirectBytesStartArray    A simple BytesRefHash.BytesStartArray that tracks all memory allocation using a shared AtomicLong instance.
-
Field Summary
Fields
Modifier and Type    Field    Description
static int    DEFAULT_CAPACITY
-
Constructor Summary
Constructors
Constructor    Description
BytesRefHash()
BytesRefHash(ByteBlockPool pool)    Creates a new BytesRefHash.
BytesRefHash(ByteBlockPool pool, int capacity, BytesRefHash.BytesStartArray bytesStartArray)    Creates a new BytesRefHash.
-
Method Summary
Modifier and Type    Method    Description
int    add(BytesRef bytes)    Adds a new BytesRef.
int    add(BytesRef bytes, int code)    Adds a new BytesRef with a pre-calculated hash code.
int    addByPoolOffset(int offset)
int    byteStart(int ord)    Returns the bytesStart offset into the internally used ByteBlockPool for the given ord.
void    clear()
void    clear(boolean resetPool)
void    close()    Closes the BytesRefHash and releases all internally used memory.
int[]    compact()    Returns the ords array in arbitrary order.
BytesRef    get(int ord, BytesRef ref)    Populates and returns a BytesRef with the bytes for the given ord.
void    reinit()    Reinitializes the BytesRefHash after a previous clear() call.
int    size()    Returns the number of BytesRef values in this BytesRefHash.
int[]    sort(Comparator<BytesRef> comp)    Returns the values array sorted by the referenced byte values.
-
-
-
Field Detail
-
DEFAULT_CAPACITY
public static final int DEFAULT_CAPACITY
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
BytesRefHash
public BytesRefHash()
-
BytesRefHash
public BytesRefHash(ByteBlockPool pool)
Creates a new BytesRefHash.
-
BytesRefHash
public BytesRefHash(ByteBlockPool pool, int capacity, BytesRefHash.BytesStartArray bytesStartArray)
Creates a new BytesRefHash.
-
-
Method Detail
-
size
public int size()
Returns the number of BytesRef values in this BytesRefHash.
Returns:
the number of BytesRef values in this BytesRefHash
-
get
public BytesRef get(int ord, BytesRef ref)
Populates and returns a BytesRef with the bytes for the given ord.
Note: the given ord must be a non-negative integer less than the current size (size()).
Parameters:
ord - the ord
ref - the BytesRef to populate
Returns:
the given BytesRef instance populated with the bytes for the given ord
-
compact
public int[] compact()
Returns the ords array in arbitrary order. Valid ords start at offset 0 and end at offset size() - 1.
Note: This is a destructive operation. clear() must be called in order to reuse this BytesRefHash instance.
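
As an illustration of why such an operation is destructive, here is a small sketch using only the JDK. It assumes a hypothetical open-addressing table of ords in which -1 marks an empty slot; that layout and the names CompactSketch and compact are assumptions for this example, not taken from the Lucene source.

```java
/** Hypothetical illustration only; not Lucene's implementation. */
final class CompactSketch {
    /**
     * Moves every used entry (any value other than -1) to the front of the
     * table, in place, and returns the number of used entries. Afterwards the
     * entries no longer sit at their hash positions, so lookups by hash would
     * fail until the table is rebuilt or cleared.
     */
    static int compact(int[] slots) {
        int upto = 0;
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != -1) {
                int ord = slots[i];
                slots[i] = -1;       // vacate the hash position
                slots[upto++] = ord; // pack the ord at the front
            }
        }
        return upto;
    }
}
```

After the call, the ords occupy offsets 0 through (returned count - 1), in whatever order the table happened to hold them.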
-
sort
public int[] sort(Comparator<BytesRef> comp)
Returns the values array sorted by the referenced byte values.
Note: This is a destructive operation. clear() must be called in order to reuse this BytesRefHash instance.
Parameters:
comp - the Comparator used for sorting
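
The notion of ordering ords by the bytes they refer to, rather than by the ord values themselves, can be sketched with the JDK alone. In this sketch a List<byte[]> indexed by ord stands in for the hash's storage, and a Comparator<byte[]> stands in for Comparator<BytesRef>; the names OrdSortSketch and sortOrds are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not Lucene's implementation. */
final class OrdSortSketch {
    /** Returns the ords 0..n-1 ordered by the byte values each ord refers to. */
    static int[] sortOrds(List<byte[]> valuesByOrd, Comparator<byte[]> comp) {
        Integer[] ords = new Integer[valuesByOrd.size()];
        for (int i = 0; i < ords.length; i++) {
            ords[i] = i;
        }
        // Compare the referenced bytes, not the ord numbers.
        Arrays.sort(ords, (a, b) -> comp.compare(valuesByOrd.get(a), valuesByOrd.get(b)));
        int[] sorted = new int[ords.length];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = ords[i];
        }
        return sorted;
    }
}
```

A comparator such as `(x, y) -> Arrays.compareUnsigned(x, y)` (available since Java 9) orders the values as unsigned bytes.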
-
clear
public void clear(boolean resetPool)
-
clear
public void clear()
-
close
public void close()
Closes the BytesRefHash and releases all internally used memory
-
add
public int add(BytesRef bytes)
Adds a new BytesRef.
Parameters:
bytes - the bytes to hash
Returns:
the ord the given bytes are hashed to if there was no mapping for the given bytes, otherwise (-(ord)-1). This guarantees that the return value will always be >= 0 if the given bytes haven't been hashed before.
Throws:
BytesRefHash.MaxBytesLengthExceededException - if the given bytes are longer than ByteBlockPool.BYTE_BLOCK_SIZE - 2
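
A caller typically needs to tell the two cases of the return value apart and recover the ord in either case. The helper below is a small sketch of that decoding; the class and method names are hypothetical and are not part of BytesRefHash.

```java
/** Hypothetical helper for decoding the value returned by add; not part of Lucene. */
final class AddResultSketch {
    /** True if the bytes had not been hashed before (return value >= 0). */
    static boolean wasAdded(int result) {
        return result >= 0;
    }

    /** Recovers the ord from either encoding: ord itself, or -(ord)-1. */
    static int toOrd(int result) {
        return result >= 0 ? result : -result - 1;
    }
}
```

For example, a return value of 5 means the bytes were newly added with ord 5, while -6 means they were already present with ord 5.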
-
add
public int add(BytesRef bytes, int code)
Adds a new BytesRef with a pre-calculated hash code.
Parameters:
bytes - the bytes to hash
code - the hash code of the bytes
Hashcode is defined as:
    int hash = 0;
    for (int i = offset; i < offset + length; i++) {
        hash = 31 * hash + bytes[i];
    }
Returns:
the ord the given bytes are hashed to if there was no mapping for the given bytes, otherwise (-(ord)-1). This guarantees that the return value will always be >= 0 if the given bytes haven't been hashed before.
Throws:
BytesRefHash.MaxBytesLengthExceededException - if the given bytes are longer than ByteBlockPool.BYTE_BLOCK_SIZE - 2
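
The hash code definition above can be wrapped in a standalone method for pre-computing the code argument. This is a sketch over a plain byte[] with explicit offset and length; the class name BytesHashCode is hypothetical.

```java
/** Hypothetical helper that evaluates the hash code formula given above. */
final class BytesHashCode {
    static int hash(byte[] bytes, int offset, int length) {
        int hash = 0;
        for (int i = offset; i < offset + length; i++) {
            // bytes[i] is a signed Java byte, so values >= 0x80 contribute negatively.
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }
}
```

For ASCII input this yields the same value as String.hashCode() on the corresponding string, since both use the multiplier 31 starting from 0.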
-
addByPoolOffset
public int addByPoolOffset(int offset)
-
reinit
public void reinit()
Reinitializes the BytesRefHash after a previous clear() call. If clear() has not been called previously, this method has no effect.
-
byteStart
public int byteStart(int ord)
Returns the bytesStart offset into the internally used ByteBlockPool for the given ord.
Parameters:
ord - the ord to look up
Returns:
the bytesStart offset into the internally used ByteBlockPool for the given ord
-
-