Java NIO - Charset



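Java represents text internally as a sequence of 16-bit Unicode code units, while files and network streams carry bytes. The java.nio.charset package defines an abstract class named Charset, which represents a named mapping between sequences of Unicode code units and sequences of bytes. It is used to encode characters into bytes and to decode bytes back into characters.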

Standard charsets

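Every implementation of the Java platform is required to support the standard charsets listed below. Other charsets may be available, depending on the runtime.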

  • US-ASCII − Seven-bit ASCII characters.

  • ISO-8859-1 − ISO Latin Alphabet No. 1.

  • UTF-8 − Eight-bit UCS transformation format.

  • UTF-16BE − Sixteen-bit UCS transformation format with big-endian byte order.

  • UTF-16LE − Sixteen-bit UCS transformation format with little-endian byte order.

  • UTF-16 − Sixteen-bit UCS transformation format, with the byte order identified by an optional byte-order mark.
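As a rough illustration of how these charsets differ, the sketch below encodes the same short string in each of them and prints the resulting byte count. It uses the constants of java.nio.charset.StandardCharsets (available since Java 7); the class name StandardCharsetsSketch and the helper encodedLength are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StandardCharsetsSketch {
   // Hypothetical helper: number of bytes needed to encode the text in the given charset.
   static int encodedLength(String text, Charset charset) {
      return text.getBytes(charset).length;
   }

   public static void main(String[] args) {
      String text = "Demo";
      Charset[] charsets = {
         StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8,
         StandardCharsets.UTF_16BE, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16
      };
      // Print the canonical name of each charset and the size of the encoded text.
      for (Charset cs : charsets) {
         System.out.println(cs.name() + " : " + encodedLength(text, cs) + " bytes");
      }
   }
}
```

For plain ASCII text, the first three charsets use one byte per character and the UTF-16 variants use two. When encoding, Java's UTF-16 charset also writes a two-byte byte-order mark, so its result is two bytes longer than UTF-16BE or UTF-16LE.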

Important methods of Charset class

  • forName() − This static method returns a Charset object for the given charset name. The name can be canonical or an alias.

  • displayName() − This method returns a human-readable name for the charset in the default locale. The default implementation returns the canonical name.

  • canEncode() − This method tells whether the charset supports encoding. Nearly all charsets do.

  • decode() − This method decodes the bytes of a ByteBuffer, interpreted in this charset, into a CharBuffer of Unicode characters.

  • encode() − This method encodes Unicode characters, given as a CharBuffer or a String, into a ByteBuffer of bytes in this charset.
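The lookup and conversion methods can be sketched together as follows. This is a minimal sketch, not a definitive implementation: the class name CharsetLookupSketch and the helper roundTrip are hypothetical, and "latin1" is used on the assumption that the runtime registers it as an alias of ISO-8859-1, as common JDKs do.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public class CharsetLookupSketch {
   // Hypothetical helper: encodes the text in the named charset, then decodes it back.
   static String roundTrip(String text, String charsetName) {
      Charset charset = Charset.forName(charsetName);
      ByteBuffer bytes = charset.encode(text);   // Unicode characters -> bytes
      CharBuffer chars = charset.decode(bytes);  // bytes -> Unicode characters
      return chars.toString();
   }

   public static void main(String[] args) {
      // Look up a charset by alias; name() reports its canonical name.
      Charset latin1 = Charset.forName("latin1");
      System.out.println(latin1.name());
      // isSupported() checks availability without throwing for an unknown name.
      System.out.println(Charset.isSupported("UTF-8"));
      System.out.println(roundTrip("Demo text", "UTF-8"));
   }
}
```

Note that forName() throws UnsupportedCharsetException when the name is legal but unknown to the runtime, so calling Charset.isSupported() first is one way to guard a lookup driven by user input.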

Example

The following example illustrates the important methods of the Charset class.

package com.java.nio;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public class CharsetExample {
   public static void main(String[] args) {
      Charset charset = Charset.forName("US-ASCII");
      System.out.println(charset.displayName());
      System.out.println(charset.canEncode());
      String str = "Demo text for conversion.";
      // Decode: bytes in the given charset -> Unicode characters
      ByteBuffer byteBuffer = ByteBuffer.wrap(str.getBytes(charset));
      CharBuffer charBuffer = charset.decode(byteBuffer);
      // Encode: Unicode characters -> bytes in the given charset
      ByteBuffer newByteBuffer = charset.encode(charBuffer);
      // Print each encoded byte as a character (valid here because the text is pure ASCII)
      while (newByteBuffer.hasRemaining()) {
         char ch = (char) newByteBuffer.get();
         System.out.print(ch);
      }
   }
}

Output

US-ASCII
true
Demo text for conversion.