Class BaseCharFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Readable
    Direct Known Subclasses:
    HTMLStripCharFilter, MappingCharFilter

    public abstract class BaseCharFilter
    extends CharFilter

    Base utility class for implementing a CharFilter. Subclasses record offset correction mappings by calling addOffCorrectMap(int, int); the correct(int) method then uses those mappings to translate an offset in the output stream back to the corresponding offset in the input stream.

    CharFilters modify an input stream via a series of substring replacements (including deletions and insertions) to produce an output stream. There are three possible replacement cases: the replacement string has the same length as the original substring; the replacement is shorter; and the replacement is longer. In the latter two cases (when the replacement has a different length than the original), one or more offset correction mappings are required.

    When the replacement is shorter than the original (e.g. when the replacement is the empty string), a single offset correction mapping should be added at the replacement's end offset in the output stream. The cumulativeDiff parameter to the addOffCorrectMap(int, int) method will be the sum of all previous replacement offset adjustments, with the addition of the difference between the lengths of the original substring and the replacement string (a positive value).
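
    The shorter-replacement case can be worked through with a small standalone sketch. ShrinkOffsetMap below is a hypothetical stand-in that only mimics the bookkeeping described here (it is not the Lucene class, and the TreeMap storage is an assumption made for brevity). The example deletes "XXX" from "abXXXcd", so one mapping is added at output offset 2 with a cumulativeDiff of 0 + (3 - 0) = 3.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the offset bookkeeping; not the Lucene class.
class ShrinkOffsetMap {
    // Output offset -> cumulative diff that applies from that offset onward.
    private final TreeMap<Integer, Integer> diffs = new TreeMap<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        diffs.put(off, cumulativeDiff);
    }

    // Input offset = output offset + diff of the closest mapping at or before it.
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = diffs.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }
}

class ShorterReplacementDemo {
    public static void main(String[] args) {
        String input = "abXXXcd";   // "XXX" at input offsets 2..5 is deleted
        String output = "abcd";

        ShrinkOffsetMap map = new ShrinkOffsetMap();
        int previousDiff = 0;             // no earlier replacements in this stream
        int originalLen = 3;
        int replacementLen = 0;
        int replacementEndInOutput = 2;   // the empty replacement starts and ends at 2

        // Single mapping: previous adjustments plus (original - replacement) length.
        map.addOffCorrectMap(replacementEndInOutput,
                previousDiff + (originalLen - replacementLen));

        for (int i = 0; i < output.length(); i++) {
            int in = map.correct(i);
            System.out.println(output.charAt(i) + ": output " + i
                    + " -> input " + in + " (" + input.charAt(in) + ")");
        }
    }
}
```

    In this sketch, output offsets 0 and 1 are unchanged, while 'c' at output offset 2 maps back to input offset 5.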

    When the replacement is longer than the original (e.g. when the original is the empty string), you should add as many offset correction mappings as the difference between the lengths of the replacement string and the original substring, starting at the end offset the original substring would have had in the output stream. The cumulativeDiff parameter to the addOffCorrectMap(int, int) method will be the sum of all previous replacement offset adjustments, with the addition of the difference between the lengths of the original substring and the replacement string so far (a negative value).
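
    The longer-replacement case needs one mapping per extra character, which is easiest to see in a worked example. GrowOffsetMap is again a hypothetical stand-in rather than the Lucene class. The example replaces "x" with "xyz" in "axb", so two mappings are added starting at output offset 2, with cumulativeDiff values of -1 and then -2.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the offset bookkeeping; not the Lucene class.
class GrowOffsetMap {
    // Output offset -> cumulative diff that applies from that offset onward.
    private final TreeMap<Integer, Integer> diffs = new TreeMap<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        diffs.put(off, cumulativeDiff);
    }

    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = diffs.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }
}

class LongerReplacementDemo {
    public static void main(String[] args) {
        String input = "axb";     // "x" at input offsets 1..2 becomes "xyz"
        String output = "axyzb";

        GrowOffsetMap map = new GrowOffsetMap();
        int previousDiff = 0;          // no earlier replacements in this stream
        int originalLen = 1;
        int replacementLen = 3;
        int originalEndInOutput = 2;   // where "x" would have ended in the output

        // One mapping per extra character; the diff decreases by one each time.
        for (int extra = 1; extra <= replacementLen - originalLen; extra++) {
            map.addOffCorrectMap(originalEndInOutput + extra - 1, previousDiff - extra);
        }

        for (int i = 0; i < output.length(); i++) {
            int in = map.correct(i);
            System.out.println(output.charAt(i) + ": output " + i
                    + " -> input " + in + " (" + input.charAt(in) + ")");
        }
    }
}
```

    Here every character of "xyz" (output offsets 1 to 3) maps back to input offset 1, and 'b' at output offset 4 maps back to input offset 2.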

    • Constructor Detail

      • BaseCharFilter

        public BaseCharFilter(CharStream in)
    • Method Detail

      • correct

        protected int correct(int currentOff)
        Retrieve the corrected offset.
        Overrides:
        correct in class CharFilter
        Parameters:
        currentOff - current offset
        Returns:
        corrected offset
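
        Because mappings are recorded at non-decreasing output offsets, a lookup like this can be done with a binary search for the last mapping at or before currentOff. The sketch below shows one plausible way to do that over two parallel arrays; ArrayOffsetMap and its fields are hypothetical names, and nothing here is taken from the actual implementation.

```java
// Hypothetical sketch of an offset lookup; not the Lucene implementation.
class ArrayOffsetMap {
    private final int[] offsets;  // output offsets, sorted ascending
    private final int[] diffs;    // cumulative diff recorded at each offset

    ArrayOffsetMap(int[] offsets, int[] diffs) {
        if (offsets.length != diffs.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        this.offsets = offsets.clone();
        this.diffs = diffs.clone();
    }

    // Binary search for the last mapping whose offset is <= currentOff.
    int correct(int currentOff) {
        int lo = 0;
        int hi = offsets.length - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (offsets[mid] <= currentOff) {
                found = mid;      // candidate; keep looking to the right
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // Before the first mapping, offsets are unchanged.
        return found < 0 ? currentOff : currentOff + diffs[found];
    }
}

class CorrectLookupDemo {
    public static void main(String[] args) {
        ArrayOffsetMap map = new ArrayOffsetMap(new int[] {2, 6}, new int[] {3, 5});
        for (int off : new int[] {1, 2, 5, 6, 10}) {
            System.out.println("output " + off + " -> input " + map.correct(off));
        }
    }
}
```

        With mappings (2, 3) and (6, 5), output offset 1 stays at 1, offsets 2 and 5 shift by 3, and offsets 6 and 10 shift by 5.
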
      • getLastCumulativeDiff

        protected int getLastCumulativeDiff()
      • addOffCorrectMap

        protected void addOffCorrectMap(int off,
                                        int cumulativeDiff)

        Adds an offset correction mapping at the given output stream offset.

        Assumption: the offset given with each successive call to this method will not be smaller than the offset given at the previous invocation.

        Parameters:
        off - The output stream offset at which to apply the correction
        cumulativeDiff - The input offset is given by adding this to the output offset
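
        The sketch below illustrates how cumulativeDiff accumulates across successive calls and what the ordering assumption means in practice. CumulativeOffsetMap is a hypothetical stand-in, not the Lucene class; in particular, the explicit check that rejects a decreasing offset is an addition of this sketch, since the documentation above only states the assumption. The example deletes "XX" and then "Y" from "aXXbYc".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the offset bookkeeping; not the Lucene class.
class CumulativeOffsetMap {
    // Each entry is {output offset, cumulative diff}, in the order added.
    private final List<int[]> entries = new ArrayList<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        // Sketch-only check: enforce the "not smaller than the previous" assumption.
        if (!entries.isEmpty() && off < entries.get(entries.size() - 1)[0]) {
            throw new IllegalArgumentException("offsets must not decrease: " + off);
        }
        entries.add(new int[] {off, cumulativeDiff});
    }

    // Linear scan for the last mapping at or before currentOff.
    int correct(int currentOff) {
        int diff = 0;
        for (int[] e : entries) {
            if (e[0] > currentOff) {
                break;
            }
            diff = e[1];
        }
        return currentOff + diff;
    }
}

class CumulativeDiffDemo {
    public static void main(String[] args) {
        String input = "aXXbYc";   // delete "XX" (2 chars), then "Y" (1 char)
        String output = "abc";

        CumulativeOffsetMap map = new CumulativeOffsetMap();
        int cumulativeDiff = 0;

        cumulativeDiff += 2;                     // "XX" removed before output offset 1
        map.addOffCorrectMap(1, cumulativeDiff);

        cumulativeDiff += 1;                     // "Y" removed before output offset 2
        map.addOffCorrectMap(2, cumulativeDiff);

        for (int i = 0; i < output.length(); i++) {
            int in = map.correct(i);
            System.out.println(output.charAt(i) + ": output " + i
                    + " -> input " + in + " (" + input.charAt(in) + ")");
        }
    }
}
```

        The second call passes 3 rather than 1 because each cumulativeDiff includes all earlier adjustments, so 'b' maps to input offset 3 and 'c' to input offset 5.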