Java Internationalization - Unicode Conversion from/to Reader/Writer


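Reader and Writer are character-oriented stream classes. Combined with a charset, they convert between encoded bytes and Java's Unicode characters: an InputStreamReader decodes bytes into characters, and an OutputStreamWriter encodes characters into bytes.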
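
As a minimal sketch of what "character oriented" means here, the snippet below decodes UTF-8 bytes through a Reader and counts the characters it produces. The class name ReaderDecodeSketch and the helper countChars are illustrative names, not part of the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReaderDecodeSketch {
   // Hypothetical helper: counts the chars (UTF-16 code units) that a
   // Reader produces when it decodes the given UTF-8 bytes.
   static int countChars(byte[] utf8Bytes) {
      int count = 0;
      try (Reader reader = new InputStreamReader(
            new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
         while (reader.read() != -1) {
            count++;
         }
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return count;
   }

   public static void main(String[] args) {
      // "\u00e9" is e-acute: two bytes in UTF-8, but a single Java char.
      byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
      System.out.println("bytes: " + bytes.length);       // prints "bytes: 5"
      System.out.println("chars: " + countChars(bytes));  // prints "chars: 4"
   }
}
```

The byte count and the character count differ because the Reader works in characters, not bytes. Each call to read() returns one decoded UTF-16 code unit, regardless of how many bytes encoded it.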

Conversion

The following example shows how to convert UTF-8 bytes to Unicode characters with a Reader, and a Unicode String to a UTF-8 byte[] with a Writer.

I18NTester.java

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class I18NTester {
   public static void main(String[] args) throws IOException {

      String input = "This is a sample text";

      // Encode the String as UTF-8 explicitly; getBytes() without an
      // argument would use the platform default charset
      InputStream inputStream = new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));

      // Wrap the byte stream in a Reader that decodes UTF-8
      Reader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_8);

      // Read the decoded Unicode characters one at a time
      int data = reader.read();
      while (data != -1) {
         char theChar = (char) data;
         System.out.print(theChar);
         data = reader.read();
      }
      reader.close();

      System.out.println();

      // Convert the Unicode String to UTF-8 bytes through a Writer
      ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
      Writer writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);

      writer.write(input);
      writer.close();

      // Decode the UTF-8 bytes back into a String for printing
      String out = new String(outputStream.toByteArray(), StandardCharsets.UTF_8);

      System.out.println(out);
   }
}

Output

It will print the following result.

This is a sample text
This is a sample text
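
The sample text above is plain ASCII, so it would print the same under most charsets. The following sketch, which assumes a non-ASCII input string and uses illustrative names (CharsetRoundTripSketch, encode, decode), suggests why the charset passed to the Reader should match the one used by the Writer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTripSketch {
   // Hypothetical helper: encodes text to bytes through a Writer.
   static byte[] encode(String text, Charset charset) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (Writer writer = new OutputStreamWriter(out, charset)) {
         writer.write(text);
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return out.toByteArray();
   }

   // Hypothetical helper: decodes bytes to a String through a Reader.
   static String decode(byte[] bytes, Charset charset) {
      StringBuilder result = new StringBuilder();
      try (Reader reader = new InputStreamReader(
            new ByteArrayInputStream(bytes), charset)) {
         char[] buffer = new char[256];
         int read;
         while ((read = reader.read(buffer)) != -1) {
            result.append(buffer, 0, read);
         }
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return result.toString();
   }

   public static void main(String[] args) {
      // "Gr\u00fc\u00dfe" is the German word for "greetings" with two non-ASCII letters.
      String input = "Gr\u00fc\u00dfe";
      byte[] utf8 = encode(input, StandardCharsets.UTF_8);

      System.out.println(utf8.length);                                               // prints 7
      System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(input));        // prints true
      System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(input));   // prints false
   }
}
```

Decoding with a different charset than the one used for encoding does not throw an exception here; it silently yields different characters, which is why passing the charset explicitly on both sides is usually the safer choice.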