Class PDFStreamEngine

    • Constructor Detail

      • PDFStreamEngine

        public PDFStreamEngine()
        Default constructor.
      • PDFStreamEngine

        public PDFStreamEngine(java.util.Properties properties)
                        throws java.io.IOException
        Constructor with engine properties. Each property key is a PDF operator, and each value is the name of the class used to execute that operator. An empty value means that the operator will be silently ignored.
        Parameters:
        properties - The engine properties.
        Throws:
        java.io.IOException - If there is an error setting the engine properties.
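        The key/value convention described for this constructor can be illustrated with a minimal, self-contained sketch. It does not use the engine itself: OperatorTableSketch, load, sample and the com.example class names are hypothetical, and the sketch only shows how a Properties object with operator keys, class-name values and an empty value might be interpreted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for illustration; not the engine's own loader.
class OperatorTableSketch {

    /** Maps each operator key to its handler class name, skipping empty values. */
    static Map<String, String> load(Properties properties) {
        Map<String, String> handlers = new HashMap<>();
        for (String operator : properties.stringPropertyNames()) {
            String className = properties.getProperty(operator).trim();
            if (className.isEmpty()) {
                continue; // empty value: the operator is silently ignored
            }
            handlers.put(operator, className);
        }
        return handlers;
    }

    /** Builds sample properties; the class names are made up for this sketch. */
    static Properties sample() {
        Properties properties = new Properties();
        properties.setProperty("BT", "com.example.BeginText");
        properties.setProperty("Tj", "com.example.ShowText");
        properties.setProperty("q", ""); // ignored operator
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> handlers = load(sample());
        System.out.println(handlers.containsKey("q")); // false
        System.out.println(handlers.get("Tj"));        // com.example.ShowText
    }
}
```

        A real properties file would name classes that exist on the classpath; the sketch stops at the lookup table and does not instantiate anything.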
    • Method Detail

      • isForceParsing

        public boolean isForceParsing()
        Indicates if force parsing is activated.
        Returns:
        true if force parsing is active
      • setForceParsing

        public void setForceParsing(boolean forceParsingValue)
        Enables or disables force parsing.
        Parameters:
        forceParsingValue - true activates force parsing
      • registerOperatorProcessor

        public void registerOperatorProcessor(java.lang.String operator,
                                              OperatorProcessor op)
        Register a custom operator processor with the engine.
        Parameters:
        operator - The operator as a string.
        op - Processor instance.
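        The register-then-dispatch idea behind this method can be sketched without the library. Everything below is hypothetical: OperatorRegistrySketch and its Handler interface stand in for the engine and OperatorProcessor, and returning false for an unregistered operator is an assumption of the sketch, not documented engine behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not the engine's implementation.
class OperatorRegistrySketch {

    /** Hypothetical callback type standing in for OperatorProcessor. */
    interface Handler {
        void process(String operator, List<Object> arguments);
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    /** Associates a handler with an operator string, replacing any earlier one. */
    void register(String operator, Handler handler) {
        handlers.put(operator, handler);
    }

    /** Runs the handler for the operator; returns false if none is registered (assumption). */
    boolean dispatch(String operator, List<Object> arguments) {
        Handler handler = handlers.get(operator);
        if (handler == null) {
            return false;
        }
        handler.process(operator, arguments);
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        OperatorRegistrySketch registry = new OperatorRegistrySketch();
        registry.register("Tj", (op, arguments) -> log.add(op + " with " + arguments.size() + " argument(s)"));
        registry.dispatch("Tj", Arrays.asList((Object) "Hello"));
        System.out.println(log);
    }
}
```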
      • resetEngine

        public void resetEngine()
        Call this method between documents. The PDFStreamEngine caches information for the document between pages, and this method releases that cached information. It does not need to be called between pages of the same document, only before processing a new document.
      • processStream

        public void processStream(PDPage aPage,
                                  PDResources resources,
                                  COSStream cosStream)
                           throws java.io.IOException
        Processes the contents of the stream.
        Parameters:
        aPage - The page.
        resources - The location to retrieve resources.
        cosStream - The stream to execute.
        Throws:
        java.io.IOException - If there is an error accessing the stream.
      • processSubStream

        public void processSubStream(PDPage aPage,
                                     PDResources resources,
                                     COSStream cosStream)
                              throws java.io.IOException
        Process a sub stream of the current stream.
        Parameters:
        aPage - The page used for drawing.
        resources - The resources used when processing the stream.
        cosStream - The stream to process.
        Throws:
        java.io.IOException - If there is an exception while processing the stream.
      • processTextPosition

        protected void processTextPosition(TextPosition text)
        An event hook that lets a subclass perform its own handling when text needs to be processed.
        Parameters:
        text - The text to be processed.
      • inspectFontEncoding

        protected java.lang.String inspectFontEncoding(java.lang.String str)
        An event hook that lets a subclass perform its own handling on the string encoded by a glyph.
        Parameters:
        str - The string to be processed.
      • processEncodedText

        public void processEncodedText(byte[] string)
                                throws java.io.IOException
        Processes encoded text from the PDF stream. Override this method to perform an action when encoded text is being processed.
        Parameters:
        string - The encoded text.
        Throws:
        java.io.IOException - If there is an error processing the string.
      • processOperator

        public void processOperator(java.lang.String operation,
                                    java.util.List<COSBase> arguments)
                             throws java.io.IOException
        Handles an operation identified by its operator string.
        Parameters:
        operation - The operation to perform.
        arguments - The list of arguments.
        Throws:
        java.io.IOException - If there is an error processing the operation.
      • processOperator

        protected void processOperator(PDFOperator operator,
                                       java.util.List<COSBase> arguments)
                                throws java.io.IOException
        Handles an operation identified by a PDFOperator.
        Parameters:
        operator - The operation to perform.
        arguments - The list of arguments.
        Throws:
        java.io.IOException - If there is an error processing the operation.
      • getColorSpaces

        public java.util.Map<java.lang.String, PDColorSpace> getColorSpaces()
        Returns:
        Returns the colorSpaces.
      • getXObjects

        public java.util.Map<java.lang.String, PDXObject> getXObjects()
        Returns:
        Returns the XObjects.
      • setColorSpaces

        public void setColorSpaces(java.util.Map<java.lang.String, PDColorSpace> value)
        Parameters:
        value - The colorSpaces to set.
      • getFonts

        public java.util.Map<java.lang.String, PDFont> getFonts()
        Returns:
        Returns the fonts.
      • setFonts

        public void setFonts(java.util.Map<java.lang.String, PDFont> value)
        Parameters:
        value - The fonts to set.
      • getGraphicsStack

        public java.util.Stack<PDGraphicsState> getGraphicsStack()
        Returns:
        Returns the graphicsStack.
      • setGraphicsStack

        public void setGraphicsStack(java.util.Stack<PDGraphicsState> value)
        Parameters:
        value - The graphicsStack to set.
      • getGraphicsState

        public PDGraphicsState getGraphicsState()
        Returns:
        Returns the graphicsState.
      • setGraphicsState

        public void setGraphicsState(PDGraphicsState value)
        Parameters:
        value - The graphicsState to set.
      • getGraphicsStates

        public java.util.Map<java.lang.String, PDExtendedGraphicsState> getGraphicsStates()
        Returns:
        Returns the graphicsStates.
      • setGraphicsStates

        public void setGraphicsStates(java.util.Map<java.lang.String, PDExtendedGraphicsState> value)
        Parameters:
        value - The graphicsStates to set.
      • getTextLineMatrix

        public Matrix getTextLineMatrix()
        Returns:
        Returns the textLineMatrix.
      • setTextLineMatrix

        public void setTextLineMatrix(Matrix value)
        Parameters:
        value - The textLineMatrix to set.
      • getTextMatrix

        public Matrix getTextMatrix()
        Returns:
        Returns the textMatrix.
      • setTextMatrix

        public void setTextMatrix(Matrix value)
        Parameters:
        value - The textMatrix to set.
      • getResources

        public PDResources getResources()
        Returns:
        Returns the resources.
      • getCurrentPage

        public PDPage getCurrentPage()
        Get the current page that is being processed.
        Returns:
        The page being processed.
      • getValidCharCnt

        public int getValidCharCnt()
        Get the total number of valid characters in the document that could be decoded in processEncodedText().
        Returns:
        The number of valid characters.
      • getTotalCharCnt

        public int getTotalCharCnt()
        Get the total number of characters in the document (including ones that could not be mapped).
        Returns:
        The number of characters.
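        One possible use of the two counters is to estimate how much of a document's text could be decoded. The helper below is a hypothetical sketch, not part of the class: CharCountSketch and decodedFraction are made-up names, and treating an empty document as fully decoded is an assumption. A caller would pass in the values returned by getValidCharCnt() and getTotalCharCnt().

```java
// Hypothetical helper for illustration; not part of the engine.
class CharCountSketch {

    /**
     * Returns the fraction of characters that could be decoded, in the range 0.0 to 1.0.
     * Assumption: when no characters were seen, the result is 1.0.
     */
    static double decodedFraction(int validCharCnt, int totalCharCnt) {
        if (totalCharCnt <= 0) {
            return 1.0;
        }
        return (double) validCharCnt / totalCharCnt;
    }

    public static void main(String[] args) {
        // Example values standing in for getValidCharCnt() and getTotalCharCnt().
        System.out.println(decodedFraction(90, 120)); // 0.75
    }
}
```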