Class BreakTransliterator

java.lang.Object
com.ibm.icu.text.Transliterator
com.ibm.icu.text.BreakTransliterator
All Implemented Interfaces:
StringTransform, Transform<String,String>

final class BreakTransliterator extends Transliterator
Inserts the specified insertion string at word breaks. To restrict the insertion to particular characters, apply a filter. TODO: This is a temporary internal class; remove it once Transliterator supports \b notation.
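As a rough illustration of inserting text at word breaks, the sketch below places a string at each interior word boundary of a plain String. It is a stand-in under stated assumptions, not this class's implementation: it uses the JDK's java.text.BreakIterator rather than com.ibm.icu.text.BreakIterator, it operates on a String rather than a Replaceable, it applies no filter, and the names WordBreakInsertSketch and insertAtBreaks are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of ICU.
final class WordBreakInsertSketch {

    // Returns text with insertion placed at every word boundary strictly
    // inside the string (the boundaries at 0 and text.length() are skipped).
    static String insertAtBreaks(String text, String insertion) {
        BreakIterator bi = BreakIterator.getWordInstance(Locale.ROOT);
        bi.setText(text);

        // Collect the boundaries first so that later insertions do not
        // invalidate offsets that are still to be processed.
        List<Integer> boundaries = new ArrayList<>();
        for (int b = bi.first(); b != BreakIterator.DONE; b = bi.next()) {
            if (b > 0 && b < text.length()) {
                boundaries.add(b);
            }
        }

        // Insert from the last boundary to the first so earlier offsets stay valid.
        StringBuilder sb = new StringBuilder(text);
        for (int i = boundaries.size() - 1; i >= 0; i--) {
            sb.insert(boundaries.get(i), insertion);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertAtBreaks("hello world", "|"));
    }
}
```

Collecting the offsets before editing, then inserting back to front, avoids having to re-adjust each remaining boundary after every insertion.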
  • Field Details

    • bi

      private BreakIterator bi
    • insertion

      private String insertion
    • boundaries

      private int[] boundaries
    • boundaryCount

      private int boundaryCount
    • LETTER_OR_MARK_MASK

      static final int LETTER_OR_MARK_MASK
  • Constructor Details

  • Method Details

    • getInsertion

      public String getInsertion()
    • setInsertion

      public void setInsertion(String insertion)
    • getBreakIterator

      public BreakIterator getBreakIterator()
    • setBreakIterator

      public void setBreakIterator(BreakIterator bi)
    • handleTransliterate

      protected void handleTransliterate(Replaceable text, Transliterator.Position pos, boolean incremental)
      Description copied from class: Transliterator
      Abstract method that concrete subclasses define to implement their transliteration algorithm. This method handles both incremental and non-incremental transliteration. Let originalStart refer to the value of pos.start upon entry.
      • If incremental is false, then this method should transliterate all characters between pos.start and pos.limit. Upon return, pos.start must equal pos.limit.
      • If incremental is true, then this method should transliterate all characters between pos.start and pos.limit that can be unambiguously transliterated, regardless of future insertions of text at pos.limit. Upon return, pos.start should be in the range [originalStart, pos.limit). pos.start should be positioned such that characters [originalStart, pos.start) will not be changed in the future by this transliterator and characters [pos.start, pos.limit) are unchanged.

      Implementations of this method should also obey the following invariants:

      • pos.limit and pos.contextLimit should be updated to reflect changes in length of the text between pos.start and pos.limit. The difference pos.contextLimit - pos.limit should not change.
      • pos.contextStart should not change.
      • Upon return, neither pos.start nor pos.limit should be less than originalStart.
      • Text before originalStart and text after pos.limit should not change.
      • Text before pos.contextStart and text after pos.contextLimit should be ignored.
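The invariants above can be sketched with a toy, non-incremental transform. Everything here is an assumption made for illustration: PosSketch is a hypothetical stand-in for Transliterator.Position, a StringBuilder stands in for Replaceable, and doubling each 'a' is an arbitrary transformation chosen only because it changes the text length.

```java
// Hypothetical stand-in for Transliterator.Position; not the ICU class.
final class PosSketch {
    int contextStart;
    int start;
    int limit;
    int contextLimit;

    PosSketch(int contextStart, int start, int limit, int contextLimit) {
        this.contextStart = contextStart;
        this.start = start;
        this.limit = limit;
        this.contextLimit = contextLimit;
    }
}

// Hypothetical example of a non-incremental handleTransliterate-style pass.
final class HandleTransliterateSketch {

    // Doubles every 'a' in [pos.start, pos.limit) and updates pos accordingly.
    static void doubleA(StringBuilder text, PosSketch pos) {
        int i = pos.start;
        while (i < pos.limit) {
            if (text.charAt(i) == 'a') {
                text.insert(i, 'a');
                // The text grew by one character: move limit and contextLimit
                // together so that contextLimit - limit stays constant.
                pos.limit++;
                pos.contextLimit++;
                i += 2; // skip the original and the inserted character
            } else {
                i++;
            }
        }
        // Non-incremental case: everything up to limit has been processed.
        pos.start = pos.limit;
        // pos.contextStart is deliberately left untouched.
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("xabay");
        PosSketch pos = new PosSketch(0, 1, 4, 5);
        doubleA(text, pos);
        System.out.println(text + " start=" + pos.start + " limit=" + pos.limit
                + " contextLimit=" + pos.contextLimit);
    }
}
```

In this run the leading "x" (before the original start) and the trailing "y" (after the limit) are never modified, which corresponds to the rule that text outside the range should not change.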

      Subclasses may safely assume that all characters in [pos.start, pos.limit) are filtered. In other words, the filter has already been applied by the time this method is called. See filteredTransliterate().

      This method is not for public consumption. Calling this method directly will transliterate [pos.start, pos.limit) without applying the filter. End user code should call transliterate() instead of this method. Subclass code should call filteredTransliterate() instead of this method.

      Specified by:
      handleTransliterate in class Transliterator
      Parameters:
      text - the buffer holding transliterated and untransliterated text
      pos - the indices indicating the start, limit, context start, and context limit of the text.
      incremental - if true, assume more text may be inserted at pos.limit and act accordingly. Otherwise, transliterate all text between pos.start and pos.limit and move pos.start up to pos.limit.
    • register

      static void register()
      Registers standard variants with the system. Called by Transliterator during initialization.
    • addSourceTargetSet

      public void addSourceTargetSet(UnicodeSet inputFilter, UnicodeSet sourceSet, UnicodeSet targetSet)
      Description copied from class: Transliterator
      Adds to sourceSet and targetSet the characters that this transliterator may read as source text and may generate as replacement text, filtered by both the input filter and the current getFilter().

      SHOULD BE OVERRIDDEN BY SUBCLASSES. It is probably an error for any transliterator to NOT override this, but we can't force them to for backwards compatibility.

      Other methods delegate to this one.

      When gathering the information on source and target, the compound transliterator makes things complicated. For example, suppose we have:

       Global FILTER = [ax]
       a > b;
       :: NULL;
       b > c;
       x > d;
       
      Although the filter allows only a and x, b is an intermediate result that could produce c, so the source and target sets cannot be gathered independently. Instead, filter the sources of the first transliterator by the global filter intersected with that transliterator's own filter, and derive its targets from the result. The next transliterator then receives (global filter + last targets) as its global filter, and so on.
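The propagation described for the [ax] example can be sketched as a toy model. The assumptions are considerable: Set<Character> stands in for UnicodeSet, each stage is a hypothetical map from one source character to one replacement character, stages have no filter of their own (so the per-transliterator intersection is omitted), and SourceTargetSketch is not an ICU class. The sketch requires Java 9 or later for List.of and Map.of.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of gathering source and target sets stage by stage.
final class SourceTargetSketch {

    // Each stage maps a source character to its replacement character.
    static void addSourceTargetSet(List<Map<Character, Character>> stages,
                                   Set<Character> globalFilter,
                                   Set<Character> sourceSet,
                                   Set<Character> targetSet) {
        // Working copy of the filter; it grows as stages produce targets.
        Set<Character> filter = new TreeSet<>(globalFilter);
        for (Map<Character, Character> stage : stages) {
            Set<Character> stageTargets = new TreeSet<>();
            for (Map.Entry<Character, Character> rule : stage.entrySet()) {
                // Only rules whose source passes the current filter can fire.
                if (filter.contains(rule.getKey())) {
                    sourceSet.add(rule.getKey());
                    stageTargets.add(rule.getValue());
                }
            }
            targetSet.addAll(stageTargets);
            // The next stage sees (global filter + targets produced so far).
            filter.addAll(stageTargets);
        }
    }

    public static void main(String[] args) {
        // a > b;  :: NULL;  b > c;  x > d;
        List<Map<Character, Character>> stages = List.of(
                Map.of('a', 'b'),
                Map.of('b', 'c', 'x', 'd'));
        Set<Character> sources = new TreeSet<>();
        Set<Character> targets = new TreeSet<>();
        addSourceTargetSet(stages, Set.of('a', 'x'), sources, targets);
        System.out.println(sources + " -> " + targets);
    }
}
```

Because the working filter is widened after the first stage, the second stage accepts b as a source even though the original filter contains only a and x, which is how c ends up in the target set.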

      There is another complication:

       Global FILTER = [ax]
       a >|b;
       b >c;
       
      Even though b would be filtered out of the input, a backup (the | in the first rule) can make it part of the input again. Ideally, therefore, the global filter should change as we go.
      Overrides:
      addSourceTargetSet in class Transliterator
      Parameters:
      targetSet - TODO