Package org.freebsd.file
Class FileEncoding
java.lang.Object
org.freebsd.file.FileEncoding
Tries to guess the encoding of the byte sequence.
Original code taken from https://github.com/file/file/blob/master/src/encoding.c
-
Field Summary
Fields
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method, Description
private byte[] fromEbcdic(byte[] buf, int nbytes)
getCode()
getType()
boolean guessFileEncoding(byte[] buf)
Try to determine whether text is in some character code we can identify.
private boolean looksAscii(byte[] buf, int nbytes)
private boolean looksExtended(byte[] buf, int nbytes)
private boolean looksLatin1(byte[] buf, int nbytes)
private int looksUcs16(byte[] buf, int nbytes)
private boolean looksUtf7(byte[] buf, int nbytes)
protected int looksUtf8(byte[] buf, int nbytes)
private boolean looksUtf8WithBOM(byte[] buf, int nbytes)
private int unsignedByte(byte value)
-
Field Details
-
type
-
code
-
codeMime
-
F
private static final byte F
-
T
private static final byte T
-
I
private static final byte I
-
X
private static final byte X
-
text_chars
private byte[] text_chars
-
EBCDIC_TO_ASCII
private static final char[] EBCDIC_TO_ASCII
-
EBCDIC_1047_TO_8859
private static final char[] EBCDIC_1047_TO_8859
-
-
Constructor Details
-
FileEncoding
public FileEncoding()
-
-
Method Details
-
getCodeMime
-
getType
-
getCode
-
guessFileEncoding
public boolean guessFileEncoding(byte[] buf)
Tries to determine whether the text is in a character encoding that can be identified. EBCDIC is also recognized, by converting it to ISO-8859-1.
Returns:
true if an encoding could be guessed.
-
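The general idea of trying detectors in turn can be sketched as follows. This is a minimal illustration and not the code of FileEncoding: the class EncodingGuessSketch, its methods, and the returned labels are hypothetical, and the ASCII check here is simplified to "no byte has the high bit set".

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the FileEncoding implementation.
final class EncodingGuessSketch {

    /** Returns a label for the first check that matches, or null if none does. */
    static String guess(byte[] buf) {
        if (isSevenBit(buf)) {
            return "us-ascii";
        }
        if (isStrictUtf8(buf)) {
            return "utf-8";
        }
        return null; // no match in this sketch
    }

    /** Simplified ASCII test: every byte must have its high bit clear. */
    static boolean isSevenBit(byte[] buf) {
        for (byte b : buf) {
            if (b < 0) {
                return false; // a negative Java byte means the high bit is set
            }
        }
        return true;
    }

    /** Strict UTF-8 test: the decoder reports malformed input instead of replacing it. */
    static boolean isStrictUtf8(byte[] buf) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(buf));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(guess("plain".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(guess("caf\u00e9".getBytes(StandardCharsets.UTF_8)));
        System.out.println(guess(new byte[] {(byte) 0xE9})); // lone Latin-1 byte
    }
}
```

The order matters in a scheme like this: pure ASCII is also valid UTF-8, so the narrower check has to run first if the two are to be told apart.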
looksAscii
private boolean looksAscii(byte[] buf, int nbytes)
-
looksLatin1
private boolean looksLatin1(byte[] buf, int nbytes)
-
looksExtended
private boolean looksExtended(byte[] buf, int nbytes)
-
looksUtf8
protected int looksUtf8(byte[] buf, int nbytes)
-
looksUtf8WithBOM
private boolean looksUtf8WithBOM(byte[] buf, int nbytes)
-
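As an illustration of what a check with this signature might involve, the sketch below tests only for the three-byte UTF-8 byte order mark (EF BB BF). It is an assumption about the approach and not the method's actual body, which may also validate the bytes after the mark; the class and method names are hypothetical.

```java
// Hypothetical illustration; not the FileEncoding implementation.
final class Utf8BomSketch {

    /** True if the first nbytes of buf start with the UTF-8 BOM EF BB BF. */
    static boolean startsWithUtf8Bom(byte[] buf, int nbytes) {
        int n = Math.min(nbytes, buf.length); // never read past the array
        return n >= 3
            && (buf[0] & 0xFF) == 0xEF
            && (buf[1] & 0xFF) == 0xBB
            && (buf[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(startsWithUtf8Bom(withBom, withBom.length)); // true
        System.out.println(startsWithUtf8Bom(new byte[] {'h', 'i'}, 2)); // false
    }
}
```

The masking with 0xFF is needed because Java bytes are signed, so (byte) 0xEF compares as -17 unless it is widened to an unsigned value first.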
looksUtf7
private boolean looksUtf7(byte[] buf, int nbytes)
-
looksUcs16
private int looksUcs16(byte[] buf, int nbytes)
-
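The int return type suggests that this method reports more than a yes/no answer, for example the byte order. The sketch below shows one way a two-byte byte order mark could be mapped to such a result. The class name, method name, and the three result codes are assumptions made for this example, not values documented for FileEncoding, and the sketch does not inspect anything beyond the first two bytes.

```java
// Hypothetical illustration; not the FileEncoding implementation.
final class Utf16BomSketch {

    // Result codes chosen for this sketch only.
    static final int NOT_UTF16 = 0;
    static final int LITTLE_ENDIAN = 1;
    static final int BIG_ENDIAN = 2;

    /** Classifies the first two bytes as a little-endian BOM, a big-endian BOM, or neither. */
    static int detectUtf16Bom(byte[] buf, int nbytes) {
        int n = Math.min(nbytes, buf.length);
        if (n < 2) {
            return NOT_UTF16;
        }
        int b0 = buf[0] & 0xFF;
        int b1 = buf[1] & 0xFF;
        if (b0 == 0xFF && b1 == 0xFE) {
            return LITTLE_ENDIAN; // U+FEFF stored low byte first
        }
        if (b0 == 0xFE && b1 == 0xFF) {
            return BIG_ENDIAN; // U+FEFF stored high byte first
        }
        return NOT_UTF16;
    }

    public static void main(String[] args) {
        byte[] le = {(byte) 0xFF, (byte) 0xFE, 'h', 0};
        byte[] be = {(byte) 0xFE, (byte) 0xFF, 0, 'h'};
        System.out.println(detectUtf16Bom(le, le.length)); // 1
        System.out.println(detectUtf16Bom(be, be.length)); // 2
    }
}
```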
fromEbcdic
private byte[] fromEbcdic(byte[] buf, int nbytes)
-
unsignedByte
private int unsignedByte(byte value)
-
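A helper with the signature of unsignedByte is commonly written as a mask with 0xFF, since Java bytes are signed and range from -128 to 127. The sketch below shows that idiom under the assumption that this is the intent; the class name is hypothetical and the actual method body is not shown in this documentation.

```java
// Hypothetical illustration; not the FileEncoding implementation.
final class UnsignedByteSketch {

    /** Widens a signed Java byte to its unsigned value in the range 0..255. */
    static int unsignedByte(byte value) {
        return value & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE9;
        System.out.println(b);                       // -23
        System.out.println(unsignedByte(b));         // 233
        System.out.println(Byte.toUnsignedInt(b));   // 233, the JDK equivalent
    }
}
```

A value widened this way can be used directly as an index into a 256-entry lookup table.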