Class FileEncoding

java.lang.Object
org.freebsd.file.FileEncoding

public class FileEncoding extends Object
Tries to guess the encoding of a byte sequence. Original code taken from https://github.com/file/file/blob/master/src/encoding.c
  • Field Details

    • type

      private String type
    • code

      private String code
    • codeMime

      private String codeMime
    • F

      private static final byte F
    • T

      private static final byte T
    • I

      private static final byte I
    • X

      private static final byte X
    • text_chars

      private byte[] text_chars
    • EBCDIC_TO_ASCII

      private static final char[] EBCDIC_TO_ASCII
    • EBCDIC_1047_TO_8859

      private static final char[] EBCDIC_1047_TO_8859
  • Constructor Details

    • FileEncoding

      public FileEncoding()
  • Method Details

    • getCodeMime

      public String getCodeMime()
    • getType

      public String getType()
    • getCode

      public String getCode()
    • guessFileEncoding

      public boolean guessFileEncoding(byte[] buf)
      Tries to determine whether the bytes in buf are text in a character encoding that this class can identify. EBCDIC input is also identified, by converting it to ISO-8859-1.
      Returns:
      true if an encoding could be guessed; false otherwise.
    • looksAscii

      private boolean looksAscii(byte[] buf, int nbytes)
    • looksLatin1

      private boolean looksLatin1(byte[] buf, int nbytes)
    • looksExtended

      private boolean looksExtended(byte[] buf, int nbytes)
    • looksUtf8

      protected int looksUtf8(byte[] buf, int nbytes)
    • looksUtf8WithBOM

      private boolean looksUtf8WithBOM(byte[] buf, int nbytes)
    • looksUtf7

      private boolean looksUtf7(byte[] buf, int nbytes)
    • looksUcs16

      private int looksUcs16(byte[] buf, int nbytes)
    • fromEbcdic

      private byte[] fromEbcdic(byte[] buf, int nbytes)
    • unsignedByte

      private int unsignedByte(byte value)
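
The private helpers above are listed without descriptions, so the following is only a minimal sketch of what two of them plausibly do, not the library's actual code. It assumes that unsignedByte widens a signed Java byte to its 0..255 value and that looksUtf8WithBOM checks for the UTF-8 byte order mark (EF BB BF) at the start of the buffer. The class name EncodingSketch and the method startsWithUtf8Bom are hypothetical names introduced for illustration.

```java
// Illustrative sketch only; names and behavior are assumptions, not the
// implementation of org.freebsd.file.FileEncoding.
final class EncodingSketch {

    // Java bytes are signed (-128..127). Masking with 0xFF yields the
    // unsigned value (0..255) that byte-oriented C code works with.
    static int unsignedByte(byte value) {
        return value & 0xFF;
    }

    // Returns true when at least three bytes are available and they are
    // the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] buf, int nbytes) {
        return nbytes >= 3
                && buf.length >= 3
                && unsignedByte(buf[0]) == 0xEF
                && unsignedByte(buf[1]) == 0xBB
                && unsignedByte(buf[2]) == 0xBF;
    }
}
```

The mask is needed because comparing a raw byte such as (byte) 0xEF directly against the int literal 0xEF is always false in Java: the byte is sign-extended to a negative int before the comparison. The standard library method Byte.toUnsignedInt performs the same widening.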