Package org.apache.uima.cas.impl
Implementation and Low-Level API for the CAS Interfaces.
Internal APIs. Use these APIs at your own risk. APIs in this package are subject to change without notice, even in minor releases. Use of this package is not supported. If you think you have found a bug in this package, please try to reproduce it with the officially supported APIs before reporting it.
Internals documentation
NOTE: This documentation is plain HTML, generated from the WYSIWYG editor "tinymce". To work on it, set up a small web page with tinymce (running from a local file), then use Tools - Source code to cut and paste between this file's source and that editor.
Java Cover Objects
It is possible to run UIMA without creating Java cover objects for FSs. However, for convenience, many of the APIs return Java objects that provide, in turn, various APIs for accessing features, updating indexes, etc.
There are two kinds of Java cover objects:
- Basic - this provides a generic API that works for all FSs.
- JCas - this provides a custom Java Class for a type that gives get/set style accessors for the Features.
Both of these inherit from the FeatureStructure interface. Use of the JCas is optional; if the JCas cover classes are available (in the class path), they are used.
UIMA Indexes
Indexes are defined for a pipeline, and are kept as part of the general CAS definition.
Each CAS View has its own instantiation of the defined indexes (there's one definition for all views), and as a result, a particular FS may be added-to-indexes and indexed in some views, and not in others.
There are 3 kinds of indexes: Sorted, Set, and Bag. The basic object type for an index is FSLeafIndexImpl. This has 3 subtypes, one for each of the index types:
- FSBagIndex
- FSIntArrayIndex (for Sorted)
- FSRBTSetIndex (for Set)
The leaf index is just for one type (and doesn't include entries for any subtypes).
Indexes are connected to specific index definitions; these definitions include a type which is the top type for elements of this index. The index definition logically includes that type and all of its subtypes.
An additional data structure, the IndexIteratorCachePair, is associated with each index definition. It holds references to the FSLeafIndexImpls for all subtypes of an index; this list is created lazily, only when an iterator is created over this index at a particular type level (which can be the type the index was defined for, or any subtype). This lazy aspect is important, because UIMA is often used in cases where there's a giant type system, with lots of subtypes, only a few of which are used in a particular pipeline instance.
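The lazy construction described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the class name, the toy type system keyed by strings, and the build counter are assumptions for illustration, not the actual IndexIteratorCachePair code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the per-index-definition cache; not the real IndexIteratorCachePair.
class LazySubtypeCacheSketch {
    // Toy type system: type name -> its direct subtypes.
    private final Map<String, List<String>> directSubtypes = new HashMap<>();
    // Type name -> the types whose leaf indexes an iterator at that level must visit.
    private final Map<String, List<String>> cache = new HashMap<>();
    // Counts how many times a list was actually computed (for demonstration only).
    int cacheBuilds = 0;

    void addType(String parent, String child) {
        directSubtypes.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // Called only when an iterator is requested at this type level.
    List<String> leafTypesFor(String type) {
        List<String> cached = cache.get(type);
        if (cached == null) {
            cached = new ArrayList<>();
            collect(type, cached);
            cache.put(type, cached);
            cacheBuilds++;
        }
        return cached;
    }

    // Depth-first walk: the type itself, then each subtype's subtree.
    private void collect(String type, List<String> out) {
        out.add(type);
        for (String sub : directSubtypes.getOrDefault(type, Collections.<String>emptyList())) {
            collect(sub, out);
        }
    }
}
```

In this sketch, types that are never iterated over never get an entry in the cache, which is the property that matters for very large type systems.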
There are two tasks that indexes accomplish:
- updating the index with adds and removes of FSs. This update operation is optimized by
- keeping each type indexed separately, so only that data structure for the particular type need be updated (this design choice has a cost in iteration, though)
- treating more common use cases efficiently - the main one being that of adding something "to the end" of the items in the index.
- iterating over an index for a type and its subtypes.
- For indexes having no subtypes, this is done by iterating over the FSLeafIndexImpl for that index and type.
- For indexing with subtypes, this is done by creating individual iterators for the type and all of its subtypes, each iterating over the FSLeafIndexImpl for that type. These iterators are then logically combined into one iterator.
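As a rough illustration of the "add to the end" optimization mentioned in the list above, here is a sketch of a sorted int index with an append fast path. It assumes plain int ordering; the real sorted index compares FSs through an index comparator, and the class and field names here are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sorted index over ints; the real FSIntArrayIndex orders FS addresses with a comparator.
class SortedIntIndexSketch {
    private int[] items = new int[8];
    private int size = 0;
    // Counts how often the cheap append path was taken (for demonstration only).
    int fastAppends = 0;

    void add(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        // Common case: the new item sorts at or after the current last item.
        if (size == 0 || items[size - 1] <= value) {
            items[size++] = value;
            fastAppends++;
            return;
        }
        // Otherwise find the insertion point and shift the tail right by one.
        int pos = Arrays.binarySearch(items, 0, size, value);
        if (pos < 0) {
            pos = -pos - 1;
        }
        System.arraycopy(items, pos, items, pos + 1, size - pos);
        items[pos] = value;
        size++;
    }

    int[] toArray() {
        return Arrays.copyOf(items, size);
    }
}
```

Because each type has its own structure, an add touches only one such array; the cost of that choice shows up later, when iterators over several types have to be combined.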
Iterators
There are two main kinds of iterators:
- Iterators over UIMA Indexes
- Iterators over other UIMA objects, such as Views, or internal structures.
Iterators over UIMA indexes
There are two main kinds of iterators over UIMA indexes:
- those returning int values representing the location of the FS in the heap.
- those returning Java cover objects representing the FS. This is typically implemented as a wrapper around the one returning the int value. (It can't be a subclass, overriding the get() method, because you can't change the return type when overriding).
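The wrapper arrangement can be sketched as follows. The interface and class names are hypothetical and far smaller than the real ones; the cover-object factory is modeled as an IntFunction, which is an assumption made for illustration.

```java
import java.util.function.IntFunction;

// Hypothetical int-returning iterator; illustrative only, not the UIMA interface.
interface IntIterSketch {
    boolean isValid();
    int get();          // heap address of the current FS
    void moveToNext();
}

class ArrayIntIterSketch implements IntIterSketch {
    private final int[] addrs;
    private int pos = 0;

    ArrayIntIterSketch(int... addrs) {
        this.addrs = addrs;
    }

    public boolean isValid() { return pos < addrs.length; }
    public int get() { return addrs[pos]; }
    public void moveToNext() { pos++; }
}

// A subclass could not redeclare get() with return type T, since int and T are
// unrelated return types in Java; so the cover-object iterator wraps instead.
class CoverObjectIterSketch<T> {
    private final IntIterSketch inner;
    private final IntFunction<T> coverFactory; // maps a heap address to a cover object

    CoverObjectIterSketch(IntIterSketch inner, IntFunction<T> coverFactory) {
        this.inner = inner;
        this.coverFactory = coverFactory;
    }

    boolean isValid() { return inner.isValid(); }
    T get() { return coverFactory.apply(inner.get()); }
    void moveToNext() { inner.moveToNext(); }
}
```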
The basic int iterators are implemented with instances of the classes:
- FSIntIteratorImplBase<T extends FeatureStructure> - the common superclass
- IntIterator4bag<T extends FeatureStructure>
- IntIterator4set<T extends FeatureStructure>
- IntIterator4sorted<T extends FeatureStructure>
All of these implement an iterator over the corresponding FSLeafIndexImpl for one type.
The class PointerIterator in FSIndexRepositoryImpl is an int iterator that combines the iterators for a type and its subtypes into one aggregated iterator, taking into account the comparator sorting order among the various iterators. So, for instance, if you do a moveTo operation, it does a moveTo on all the individual iterators, and then figures out which of those is the left-most one in the comparator ordering.
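A minimal sketch of this kind of ordered aggregation is given here. It assumes each per-type index is a sorted int array and that plain int order stands in for the comparator ordering; the class name and its linear scans are hypothetical simplifications, not the PointerIterator implementation.

```java
import java.util.NoSuchElementException;

// Hypothetical aggregate over several sorted per-type arrays; not the real PointerIterator.
class MergedPointerIterSketch {
    private final int[][] perType; // one sorted array per type (stand-ins for leaf indexes)
    private final int[] pos;       // current position within each per-type array

    MergedPointerIterSketch(int[]... perType) {
        this.perType = perType;
        this.pos = new int[perType.length];
    }

    // Do a "moveTo" on every sub-iterator: position each at its first element >= key.
    void moveTo(int key) {
        for (int i = 0; i < perType.length; i++) {
            int p = 0;
            while (p < perType[i].length && perType[i][p] < key) {
                p++;
            }
            pos[i] = p;
        }
    }

    // Index of the sub-iterator whose current element is left-most, or -1 if all are exhausted.
    private int leftMost() {
        int best = -1;
        for (int i = 0; i < perType.length; i++) {
            if (pos[i] >= perType[i].length) {
                continue;
            }
            if (best < 0 || perType[i][pos[i]] < perType[best][pos[best]]) {
                best = i;
            }
        }
        return best;
    }

    boolean isValid() {
        return leftMost() >= 0;
    }

    int get() {
        int b = leftMost();
        if (b < 0) {
            throw new NoSuchElementException();
        }
        return perType[b][pos[b]];
    }

    // Advance only the sub-iterator that supplied the current element.
    void moveToNext() {
        int b = leftMost();
        if (b >= 0) {
            pos[b]++;
        }
    }
}
```

The sketch rescans all sub-iterators on every call for brevity; an implementation that cares about speed would keep them in an ordered structure instead.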
PointerIteratorUnordered is a variant that also combines iterators for a type and its subtypes, but doesn't try to keep these in order. It is designed to be used when iterating through all instances of a type and its subtypes, in an arbitrary order, such as what the method getAllIndexedFS(type) does.
SnapshotPointerIterator is a variant which creates a snapshot when the iterator is created, and then (ignoring any subsequent index updates) iterates over that. This iterator won't throw ConcurrentModificationException.
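The difference between iterating a live index and iterating a snapshot can be shown with ordinary java.util collections. This is only an analogy: the ArrayList stands in for an index, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Analogy using java.util collections; not the SnapshotPointerIterator code.
class SnapshotIterSketch {
    // Iterates the live list while adding to it; returns true if the iterator failed fast.
    static boolean liveIterationFails(List<Integer> index) {
        try {
            for (Iterator<Integer> it = index.iterator(); it.hasNext(); ) {
                int v = it.next();
                if (v == 1) {
                    index.add(99); // index update during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a copy taken up front; later updates to the live list are ignored.
    static List<Integer> snapshotIteration(List<Integer> index) {
        List<Integer> snapshot = new ArrayList<>(index);
        List<Integer> seen = new ArrayList<>();
        for (int v : snapshot) {
            seen.add(v);
            if (v == 1) {
                index.add(99); // does not disturb the snapshot being iterated
            }
        }
        return seen;
    }
}
```

The snapshot costs a copy at creation time, in exchange for letting the caller add to or remove from the index while iterating.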
The basic impls of IntIterator4bag/set/sorted are created by calls to pointerIterator; this method is implemented in each of the IntIterator4bag/set/sorted classes.
The 2nd argument passed is a ref to this FSIndexRepositoryImpl's int[] used to detect concurrent modification. If null is passed, no such check is done. This kind of call happens with the use of the refIterator() methods, which are used internally when it is known that the iteration will not be modifying the indexes in any way.
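One way such a check could work is sketched here, under the assumption that the shared int[] holds a modification count in slot 0 which the repository increments on every index update. The class name, the single-slot layout, and the point at which the check happens are all hypothetical.

```java
import java.util.ConcurrentModificationException;

// Hypothetical checked iterator; assumes slot 0 of the shared array is a modification count.
class CheckedIntIterSketch {
    private final int[] items;
    private final int[] modCounter;   // shared with the repository; null means "do not check"
    private final int countAtCreation;
    private int pos = 0;

    CheckedIntIterSketch(int[] items, int[] modCounter) {
        this.items = items;
        this.modCounter = modCounter;
        this.countAtCreation = (modCounter == null) ? 0 : modCounter[0];
    }

    boolean isValid() {
        return pos < items.length;
    }

    int get() {
        // Compare the count seen at creation with the current shared count.
        if (modCounter != null && modCounter[0] != countAtCreation) {
            throw new ConcurrentModificationException();
        }
        return items[pos];
    }

    void moveToNext() {
        pos++;
    }
}
```

Passing null skips the comparison entirely, which suits internal callers that are known not to modify the indexes while iterating.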
Iterators which return Java cover objects:
- FSIteratorImplBase<T> abstract class implements FSIterator
- FSIteratorWrapper<T> - this is the standard wrapper around the above int iterators.
- FSIteratorAggregate<T> - created with a list of iterators, operates like PointerIteratorUnordered, except the underlying iterators are ones returning Java cover objects
- FSIteratorFlat<T> - iterates over a corresponding "flattened" iterator of a type and its subtypes (an optimization).
- FSIteratorWrapperDoubleCheck<T> - only used for debugging
- FilteredIterator<T> - wraps an FSIterator, and applies a FSMatchConstraint
- Subiterator<T> - created from an AnnotationIndex; wraps a plain FSIterator
- UnambiguousIteratorImpl<T> - created from an AnnotationIndex; wraps a plain FSIterator
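As an illustration of the wrapping pattern used by several entries in the list above, here is a sketch of a filtering wrapper. It uses java.util.Iterator and java.util.function.Predicate as stand-ins for FSIterator and FSMatchConstraint; the class name is hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical filtering wrapper; Predicate stands in for FSMatchConstraint.
class FilteredIterSketch<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private final Predicate<T> constraint;
    private T nextMatch;
    private boolean hasNextMatch;

    FilteredIterSketch(Iterator<T> inner, Predicate<T> constraint) {
        this.inner = inner;
        this.constraint = constraint;
        advance();
    }

    // Skip forward in the wrapped iterator until an element satisfies the constraint.
    private void advance() {
        hasNextMatch = false;
        while (inner.hasNext()) {
            T candidate = inner.next();
            if (constraint.test(candidate)) {
                nextMatch = candidate;
                hasNextMatch = true;
                return;
            }
        }
    }

    public boolean hasNext() {
        return hasNextMatch;
    }

    public T next() {
        if (!hasNextMatch) {
            throw new NoSuchElementException();
        }
        T result = nextMatch;
        advance();
        return result;
    }
}
```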
Plain FSIterators are created from index instances via the iterator() method; corresponding int iterators are created from low-level indexes via ll_iterator(). This method picks the appropriate underlying iterator based on
- the index kind (BAG/SET/SORTED)
- whether or not the index has subtypes
- index extended function flags (snapshot)
- whether or not the index has a currently valid "flat" version (for SORTED indexes)
Iterator Interfaces
There are several overlapping interfaces (probably due to historical reasons) for these iterators.
First, interfaces for iterators returning ints:
- IntPointerIterator - returns an int, implements isValid, inc/dec, moveToFirst, moveToLast, get, copy
- ComparableIntPointerIterator<T> - combines IntPointerIterator and the standard Java Comparable. This is the main interface for full-function iterators over a FSLeafIndexImpl.
- The comparable is to allow the iterators to be compared (which means to compare the values of the FSs the iterators are positioned at), which is only used when ordering multiple iterators in use when iterating over a type and its subtypes (one iterator for each).
- LowLevelIterator - like IntPointerIterator, has moveToFirst/Last and isValid; it differs in providing ll_get, moveToNext/Previous, and moveTo(fs), as well as ll_indexSize and ll_getIndex.
Next, interfaces for iterators returning Java cover objects:
- FSIterator - extends the standard Java interface: hasNext, next, remove, with isValid, get, moveToNext/Previous/First/Last, moveTo(fs), copy.
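The two iteration styles that FSIterator combines can be sketched with a small list-backed iterator. The class is hypothetical, and expressing next() as "get, then moveToNext" is an assumption made for this illustration rather than a statement about the actual implementation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical list-backed iterator offering both positional and java.util.Iterator styles.
class DualStyleIterSketch<T> implements Iterator<T> {
    private final List<T> items;
    private int pos = 0;

    DualStyleIterSketch(List<T> items) {
        this.items = items;
    }

    // Positional style: check validity, read the current element, move explicitly.
    boolean isValid() { return pos >= 0 && pos < items.size(); }

    T get() {
        if (!isValid()) {
            throw new NoSuchElementException();
        }
        return items.get(pos);
    }

    void moveToNext() { if (isValid()) pos++; }
    void moveToPrevious() { if (isValid()) pos--; }
    void moveToFirst() { pos = 0; }
    void moveToLast() { pos = items.size() - 1; }

    // Standard Java style, expressed through the positional operations (an assumption).
    public boolean hasNext() { return isValid(); }

    public T next() {
        T current = get();
        moveToNext();
        return current;
    }
}
```

With the positional style, a typical loop moves to the first element, then repeats get and moveToNext while isValid holds; the same object can also be consumed through hasNext and next.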
Interface Summary
- FSComparator - Interface to compare two feature structures.
- FSGenerator<T extends FeatureStructure>
- FSImplComparator - Interface to compare two feature structures, represented by their addresses.
- FSIndexImpl - Class comment for FSIndexImpl.java goes here.
- FSRefIterator
- LowLevelCAS - Defines the low-level CAS APIs.
- LowLevelIndex - Low-level FS index object.
- LowLevelIndexRepository - Low-level index repository access.
- LowLevelIterator - Low-level FS iterator.
- LowLevelTypeSystem - Low-level version of the type system APIs.
- StringMap - Appears to be unused, 1-2015 schor
- XMLTypeSystemConsts - Class comment for XMLTypeSystemConsts.java goes here.
Class Summary
- AnnotationBaseImpl - Class comment for AnnotationImpl.java goes here.
- AnnotationImpl - Class comment for AnnotationImpl.java goes here.
- AnnotationIndexImpl<T extends AnnotationFS> - Implementation of annotation indexes.
- AnnotationTreeImpl<T extends AnnotationFS> - Implementation of annotation tree.
- AnnotationTreeNodeImpl<T extends AnnotationFS> - TODO: Create type comment for AnnotationTreeNodeImpl.
- ArrayFSGenerator
- ArrayFSImpl - Implementation of the ArrayFS interface.
- BinaryCasSerDes4 - User callable serialization and deserialization of the CAS in a compressed binary format. This serializes/deserializes the state of the CAS, assuming that the type information remains constant.
- BinaryCasSerDes6 - User callable serialization and deserialization of the CAS in a compressed binary format. This serializes/deserializes the state of the CAS.
- BinaryCasSerDes6.ReuseInfo - Info reused for 1) multiple serializations of the same CAS to multiple targets (a speedup), or 2) for delta CAS serialization, where it represents the fsStartIndex info before any mods were done which could change that info, or 3) for deserializing with a delta CAS, where it represents the fsStartIndex info at the time the CAS was serialized out.
- BooleanArrayFSImpl - Implementation of the BooleanArrayFS interface.
- BooleanConstraint - Implementation of boolean match constraint.
- ByteArrayFSImpl - Implementation of the ByteArrayFS interface.
- CASCompleteSerializer - A small object which contains a CASMgrSerializer instance (a Java serializable form of the type system and index definitions) and a CASSerializer instance (a Java serializable form of the CAS, including lists of which FSs are indexed).
- CASImpl - Implements the CAS interfaces.
- CASMgrSerializer - Container for serialized CAS typing information.
- CasSeqAddrMaps - Manages the conversion of Items (FSrefs) to relative sequential index numbers, and back. Manages the difference between two type systems, both in the size of the FSs and in handling excluded types. During serialization, these maps are constructed before serialization.
- CASSerializer - This object has 2 purposes.
- CasSerializerSupport - CAS serializer support for XMI and JSON formats.
- CasSerializerSupport.CasSerializerSupportSerialize - Methods used to serialize items. Separate implementations for JSON and XMI.
- CasTypeSystemMapper - This class gets initialized with two type systems, and then provides resources to map type and feature codes between them.
- CommonArrayFSImpl - Common part of array impl for those arrays of primitives which exist in the main heap.
- CommonAuxArrayFSImpl - Common part of array impl for those arrays of primitives which use auxiliary heaps.
- CommonSerDes - Common de/serialization.
- CommonSerDes.Header - Headers: serialization versioning. There are 1 or 2 words used for versioning.
- CommonSerDes.Reading - Byte-swapping reads of integer forms.
- ConstraintFactoryImpl - Implementation of the ConstraintFactory interface.
- DebugFSLogicalStructure
- DebugFSLogicalStructure.IndexInfo - Class holding information about an FSIndex. Includes the "label" of the index, and a ref to the CAS this index's contents are in.
- DebugFSLogicalStructure.ViewInfo - Class holding info about a View/Sofa.
- DebugNameValuePair
- DefaultAnnotationComparator - Default implementation to compare two annotations.
- DefaultFSAnnotationComparator - Default implementation to compare two annotations.
- DoubleArrayFSImpl - Implementation of the DoubleArrayFS interface.
- FeatureImpl - The implementation of features in the type system.
- FeatureStructureImpl - Feature structure implementation.
- FeatureStructureImplC - Feature structure implementation.
- FeatureValuePathImpl - Contains CAS Type and Feature objects to represent a feature path of the form feature1/.../featureN.
- FloatArrayFSImpl - Implementation of the FloatArrayFS interface.
- FSBagIndex<T extends FeatureStructure> - Used for UIMA FS Bag Indexes. Uses IntVector or PositiveIntSet to hold values of FSs.
- FSBooleanConstraintImpl - See interface for documentation.
- FSClassRegistry
- FSIndexComparatorImpl
- FSIndexFlat<T extends FeatureStructure> - Flattened indexes built as a speed-up alternative for Sorted indexes.
- FSIndexFlat.FSIteratorFlat<TI extends FeatureStructure>
- FSIndexRepositoryImpl - There is one instance of this class per CAS View.
- FSIntArrayIndex<T extends FeatureStructure> - Used for sorted indexes only. Uses IntVector (sorted) as the index (of FSs).
- FSIntIteratorImplBase<T extends FeatureStructure> - Base class for int Iterators over indexes.
- FSIteratorImplBase<T extends FeatureStructure> - Base class for FSIterator implementations.
- FSIteratorWrapper<T extends FeatureStructure>
- FSIteratorWrapperDoubleCheck<T extends FeatureStructure> - Only used for debugging. Takes two iterators and compares them; returns the 1st, throws an error if unequal.
- FSLeafIndexImpl<T extends FeatureStructure> - The common (among all index kinds - set, sorted, bag) info for an index. Subtypes define the actual index repository (integers indexing the CAS) for each kind.
- FSRBTSetIndex<T extends FeatureStructure> - Used for UIMA FS Set Indexes. Uses CompIntArrayRBT red black tree to hold items. Same as FSRBTIndex, but duplicates are not inserted.
- Heap - A heap for CAS.
- IntArrayFSImpl - Implementation of the IntArrayFS interface.
- IntIterator4set<T extends FeatureStructure>
- LinearTypeOrderBuilderImpl - Implementation of the LinearTypeOrderBuilder interface.
- ListUtils - Utilities for dealing with CAS List types.
- LLUnambiguousIteratorImpl
- LongArrayFSImpl - Implementation of the ArrayFS interface.
- MarkerImpl - A MarkerImpl holds a high-water "mark" in the CAS, for all views.
- OutOfTypeSystemData - This class is used by the XCASDeserializer to store feature structures that do not fit into the type system of the CAS it is deserializing into.
- Serialization - This class has no fields or instance methods, but instead has only static methods.
- ShortArrayFSImpl - Implementation of the ArrayFS interface.
- SlotKinds - NOTE: adding or altering slots breaks backward compatibility and the ability to deserialize previously serialized things. This definition is shared with BinaryCasSerDes4. Defines all the slot kinds.
- SofaFSImpl - Implementation of the SofaFS interface.
- StringArrayFSImpl - Implementation of the ArrayFS interface.
- StringHeapDeserializationHelper - Support for legacy string heap format.
- StringTypeImpl - Class comment for StringTypeImpl.java goes here.
- Subiterator<T extends AnnotationFS> - Subiterator implementation.
- TypeImpl - The implementation of types in the type system.
- TypeNameSpaceImpl
- TypeSystem2Xml - Dumps a Type System object to XML.
- TypeSystemImpl - Type system implementation.
- TypeSystemUtils - Class comment for TypeSystemUtils.java goes here.
- XCASDeserializer - XCAS Deserializer.
- XCASSerializer - XCAS serializer.
- XmiCasDeserializer - XMI CAS deserializer.
- XmiCasSerializer - CAS serializer for XMI format; writes a CAS in the XML Metadata Interchange (XMI) format.
- XmiSerializationSharedData - A container for data that is shared between the XmiCasSerializer and the XmiCasDeserializer.
- XmiSerializationSharedData.XmiArrayElement - Data structure holding the index and the xmi:id of an array or list element that is a reference to an out-of-typesystem FS.
Enum Summary
- AllowPreexistingFS
- BinaryCasSerDes4.Compression
- BinaryCasSerDes4.CompressLevel - Compression alternatives
- BinaryCasSerDes4.CompressStrat
- BinaryCasSerDes4.SlotKind - Define all the slot kinds.
- BinaryCasSerDes6.CompressLevel - Compression alternatives
- BinaryCasSerDes6.CompressStrat
- FSIndexRepositoryImpl.IteratorExtraFunction - Kinds of extra functions for iterators
- SlotKinds.SlotKind
- TypeSystemUtils.PathValid
Exception Summary
- AnnotationImplException - Exception class for package org.apache.uima.cas.impl.
- LowLevelException - Exception class for package org.apache.uima.cas.impl.
- XCASParsingException - Exception class for package org.apache.uima.cas.impl.