IO Buffering...

Joe

VIP Member
21/1/13
2,969
1,311
113
Hi

Most newbies who start working with files prefer "Buffered IO" because they were told that buffering is "good". In most cases this extra IO buffering is unnecessary and more or less superfluous. The reasons are manifold, but the main source of the misuse is a lack of knowledge about operating systems (and IT in general). Newbies are told that they only need to learn a programming language (e.g. PASCAL) or an object-oriented programming language (e.g. JAVA, C# or PYTHON) if they want to work with computers or software, and that there is NO NEED to learn OS concepts. That is very irresponsible. The outcome is clear: newbies code what they were taught, including misconceptions about coding technique and algorithms. One of the greatest misconceptions is the use of IO buffering. IO buffering is an OS concept for speeding up an IO process.

What, then, is IO buffering? As mentioned, it speeds up IO. Normally the OS controls and regulates the computer and its peripherals (disks, printers, etc.). When a process needs data from a disk, the OS fetches the data for the requester. Because the CPU is usually many times faster than any disk IO driver, it has to wait for the driver, and the result is "jerky" or "stuttering" processing while it waits for each chunk of data.

To accelerate the IO process, the OS buffers the data by reading more of it in advance, so that the request for the next data block can be served with minimal latency. The following images show the different IO buffering techniques the OS uses (source: IO Management).

IOBuffering_1.jpg

IOBuffering_2.jpg

Back to Java and the newbies' misconceptions. The following situations highlight the misuse of IO buffering by newbies and by some developers who lack OS knowledge:

  • Scanner for dialog-oriented IO. Dialog-oriented IO normally works with an input device (e.g. a terminal) where the user enters the requested data and then waits for the next request. A user types slowly (on the order of seconds per input), so IO buffering here is nonsense.
  • Logging for event-oriented IO. As with dialog-oriented IO, logged events are usually exceptions or unexpected events, which occur only rarely. IO buffering here can lead to data loss in a power blackout, because the data were still buffered in memory and are therefore lost.
  • Communication & network processing. Communication and networking data are transferred block-wise (TCP: max. 64K) and the blocks are read byte-wise as they arrive (see Socket-IO). IO buffering here is also a waste of time and resources.
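The data-loss risk in the logging case can be made visible with a small sketch. It is only an illustration under my own assumptions: the class name FlushDemo and the method bytesReachedTarget are made up for this demo, and an in-memory ByteArrayOutputStream stands in for the log file. The sketch writes one log line through a BufferedOutputStream and checks how many bytes have reached the target before and after flush().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the in-memory stream stands in for a log file.
class FlushDemo {
  // Returns {bytes in the target before flush, bytes in the target after flush}.
  static int[] bytesReachedTarget(String line) throws IOException {
    ByteArrayOutputStream target = new ByteArrayOutputStream();
    BufferedOutputStream log = new BufferedOutputStream(target);
    log.write(line.getBytes(StandardCharsets.UTF_8));
    int before = target.size(); // a short line is still held in the Java-side buffer
    log.flush();
    int after = target.size();  // now handed over to the target
    return new int[] { before, after };
  }

  public static void main(String... a) throws IOException {
    int[] r = bytesReachedTarget("Error: something unexpected happened\n");
    System.out.println("before flush: " + r[0] + " bytes, after flush: " + r[1] + " bytes");
  }
}
```

Anything that happens between write() and flush() (a crash, a power cut) takes the pending bytes with it. That is why the logging examples in this thread flush after every write.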

Example 1: compare the 2 following sources with Scanner.

Without IO Buffering
Java:
import java.io.*;
import java.util.*;
public class TestScanner {
  public static void main(String... A) {
    try (FileOutputStream fout = new FileOutputStream("List.txt", false)) {
      Scanner sc = new Scanner(System.in);
      StringBuilder sb = new StringBuilder();
      while (true) {
        sb.setLength(0); // start a fresh record on every pass
        System.out.print("Name:");
        sb.append("Name:"+sc.nextLine()+System.lineSeparator());
        System.out.print("Age:");
        sb.append("Age:"+sc.nextLine()+System.lineSeparator());
        System.out.print("Address:");
        sb.append("Address:"+sc.nextLine()+System.lineSeparator());
        // write the record before asking to quit, so the last one is not lost
        fout.write(sb.toString().getBytes());
        fout.flush();
        System.out.print("Quit? Y/N:");
        if (sc.nextLine().equalsIgnoreCase("y")) break;
      }
      // no explicit close(): try-with-resources closes fout
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
With IO Buffering
Java:
import java.io.*;
import java.util.*;
public class Test_Scanner {
  public static void main(String... A) {
    try (BufferedOutputStream fout =
               new BufferedOutputStream(new FileOutputStream("List.txt", false))) {
      Scanner sc = new Scanner(new BufferedInputStream(System.in));
      StringBuilder sb = new StringBuilder();
      while (true) {
        sb.setLength(0); // start a fresh record on every pass
        System.out.print("Name:");
        sb.append("Name:"+sc.nextLine()+System.lineSeparator());
        System.out.print("Age:");
        sb.append("Age:"+sc.nextLine()+System.lineSeparator());
        System.out.print("Address:");
        sb.append("Address:"+sc.nextLine()+System.lineSeparator());
        // write the record before asking to quit, so the last one is not lost
        fout.write(sb.toString().getBytes());
        fout.flush();
        System.out.print("Quit? Y/N:");
        if (sc.nextLine().equalsIgnoreCase("y")) break;
      }
      // no explicit close(): try-with-resources closes fout
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
As you see, the 2nd version with IO buffering is not only more "complicated" than the 1st version; the buffering is also superfluous, because the inputs and outputs happen at the same (human) pace. You gain nothing but some performance overhead. Agreed?

Example 2: Logging and Communication & Networking

Without IO Buffering
Java:
import java.io.*;
import java.net.*;
import java.util.*;
public class TestSocket {
  public static void main(String... A) throws Exception {
    FileOutputStream log = new FileOutputStream("log.txt", false);
    try (Socket soc = new Socket("Localhost", 10000);
      FileOutputStream fout = new FileOutputStream("List.txt", false);
      InputStream finp = soc.getInputStream()) {
      int n;
      byte[] bytes = new byte[65536]; // max. 64K
      while ((n = finp.read(bytes)) > 0) {
        fout.write(bytes, 0, n);
        fout.flush();
      }
      fout.close();
      finp.close();
      soc.close();
    } catch (Exception ex) {
      log.write(("Error:"+ex.toString()).getBytes());
      log.flush();
    }
    log.close();
  }
}
With IO Buffering
Java:
import java.io.*;
import java.net.*;
import java.util.*;
public class Test_Socket {
  public static void main(String... A) throws Exception {
    BufferedOutputStream log = new BufferedOutputStream(new FileOutputStream("log.txt", false));
    try (Socket soc = new Socket("Localhost", 10000);
      BufferedOutputStream fout = new BufferedOutputStream(new FileOutputStream("List.txt", false));
      BufferedInputStream finp = new BufferedInputStream(soc.getInputStream())) {
      int n;
      byte[] bytes = new byte[65536]; // max. 64K
      while ((n = finp.read(bytes)) > 0) {
        fout.write(bytes, 0, n);
        fout.flush();
      }
      fout.close();
      finp.close();
      soc.close();
    } catch (Exception ex) {
      log.write(("Error:"+ex.toString()).getBytes());
      log.flush();
    }
    log.close();
  }
}
Again, more loss than gain.
 

quydtkt

Administrator
1/11/19
389
38
28
27
Great article. How a technology works is very important, but people are less interested in it.
 

Joe

@quydtkt: Thank you :)

(cont.)

The question now is: when, where and how should I use IO buffering? As mentioned: to speed up IO. That means data are read in advance, in bundles, into the buffer, so that the next accesses are served from the buffer instead of from the disk; or, for output, data are gathered in the buffer and written in bundles. In common practice this applies to file copying (read/write buffering), displaying (read buffering) and so on.
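The "read in advance and in bundles" idea can be checked with a small counting sketch. It is an illustration under my own assumptions: ReadCountDemo, CountingInputStream, drain and countReads are hypothetical names, and a ByteArrayInputStream stands in for the disk. The sketch reads the same data one byte at a time, once directly and once through a BufferedInputStream, and counts how often the underlying source is actually asked for data.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class: counts how often the underlying source is asked for data.
class ReadCountDemo {
  // Wrapper that counts every read request passed on to the wrapped stream.
  static class CountingInputStream extends FilterInputStream {
    int calls = 0;
    CountingInputStream(InputStream in) { super(in); }
    @Override public int read() throws IOException {
      calls++;
      return super.read();
    }
    @Override public int read(byte[] b, int off, int len) throws IOException {
      calls++;
      return super.read(b, off, len);
    }
  }

  // Reads the stream one byte at a time until the end of the stream.
  static void drain(InputStream in) throws IOException {
    while (in.read() != -1) { /* consume */ }
  }

  // Returns {source reads without buffering, source reads with buffering}.
  static int[] countReads(int size) throws IOException {
    byte[] data = new byte[size]; // stand-in for the content of a file
    CountingInputStream direct = new CountingInputStream(new ByteArrayInputStream(data));
    drain(direct);
    CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
    drain(new BufferedInputStream(source));
    return new int[] { direct.calls, source.calls };
  }

  public static void main(String... a) throws IOException {
    int[] r = countReads(10000);
    System.out.println("source reads without buffer: " + r[0] + ", with buffer: " + r[1]);
  }
}
```

The buffered variant asks the source only a handful of times because each request fetches a whole bundle. This pays off when the program makes many small accesses. It gains little when the program already reads in large blocks.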

Example: File Copying:

With IO Buffering
Java:
import java.io.*;
public class FileCopy {
  public static void main(String... a) {
    if (a.length != 2) {
      System.out.println("Usage: java FileCopy source destination");
      System.exit(0);
    }
    long beg = System.nanoTime();
    try (BufferedInputStream bin = new BufferedInputStream(new FileInputStream(a[0]));
         BufferedOutputStream bou = new BufferedOutputStream(new FileOutputStream(a[1], false))){
      int N = 0;
      byte[] bytes = new byte[4048];
      for (int n = bin.read(bytes); n > 0; n = bin.read(bytes)) {
        bou.write(bytes, 0, n);
        N += n;
      }
      bou.flush();
      bou.close();
      bin.close();
      System.out.println(a[0]+" is copied to "+a[1]+", elapsed Time:"+
                        (System.nanoTime()-beg)+" nanoSec. ("+N+" bytes)");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
Without IO Buffering
Java:
import java.io.*;
public class File_Copy {
  public static void main(String... a) {
    if (a.length != 2) {
      System.out.println("Usage: java FileCopy source destination");
      System.exit(0);
    }
    long beg = System.nanoTime();
    try (FileInputStream bin = new FileInputStream(a[0]);
         FileOutputStream bou = new FileOutputStream(a[1], false)){
      int N = 0;
      byte[] bytes = new byte[4048];
      for (int n = bin.read(bytes); n > 0; n = bin.read(bytes)) {
        bou.write(bytes, 0, n);
        N += n;
      }
      bou.flush();
      bou.close();
      bin.close();
      System.out.println(a[0]+" is copied to "+a[1]+", elapsed Time:"+
                        (System.nanoTime()-beg)+" nanoSec. ("+N+" bytes)");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
The next question is: what about NewIO, such as FileChannel?
NewIO (java.nio) is an IO implementation that sits close to the OS, which means java.nio takes full advantage of the OS IO buffering. Therefore there is NO NEED to do any IO buffering at the programming level. Example:

Java:
import java.io.*;
import java.nio.*;
import java.nio.file.*;
import java.nio.channels.*;
public class FileNIOCopy {
  public static void main(String... a) {
    if (a.length != 2) {
      System.out.println("Usage: java FileCopy source destination");
      System.exit(0);
    }
    long beg = System.nanoTime();
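    try (FileChannel bin = FileChannel.open(Paths.get(a[0]));
         FileChannel bou = FileChannel.open(Paths.get(a[1]),
                                            StandardOpenOption.CREATE,
                                            StandardOpenOption.TRUNCATE_EXISTING, // drop old content
                                            StandardOpenOption.WRITE)) {
      long N = bin.size();
      // do the copy; transferTo may move fewer bytes than requested, so loop
      long pos = 0;
      while (pos < N) {
        long moved = bin.transferTo(pos, N - pos, bou);
        if (moved <= 0) break; // nothing more to transfer
        pos += moved;
      }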
      bou.close();
      bin.close();
      System.out.println(a[0]+" is copied to "+a[1]+", elapsed Time:"+
                        (System.nanoTime()-beg)+" nanoSec. ("+N+" bytes)");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
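For a plain file-to-file copy, java.nio.file.Files.copy does the same job in a single call and leaves the buffering to the JDK and the OS. The sketch below is not one of the measured versions and makes no timing claim; the class name FilesCopyDemo and the helper copy are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class: file copy without any buffering code of its own.
class FilesCopyDemo {
  // Copies source to target (overwriting it) and returns the size of the copy.
  static long copy(Path source, Path target) throws IOException {
    Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    return Files.size(target);
  }

  public static void main(String... a) throws IOException {
    if (a.length != 2) {
      System.out.println("Usage: java FilesCopyDemo source destination");
      return;
    }
    long beg = System.nanoTime();
    long n = copy(Paths.get(a[0]), Paths.get(a[1]));
    System.out.println(a[0]+" is copied to "+a[1]+", elapsed Time:"+
                      (System.nanoTime()-beg)+" nanoSec. ("+n+" bytes)");
  }
}
```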
The results of classic IO (with and without IO Buffering) and newIO (OS-IO Buffering) are:
Code:
C:\java\Test>javac -g:none -d ./classes File*Copy.java

C:\java\Test>java FileCopy pic.jpg x1.jpg
pic.jpg is copied to x1.jpg, elapsed Time:14452800 nanoSec. (2167306 bytes)

C:\java\Test>java File_Copy pic.jpg x2.jpg
pic.jpg is copied to x2.jpg, elapsed Time:19901500 nanoSec. (2167306 bytes)

C:\java\Test>java FileNIOCopy pic.jpg x3.jpg
pic.jpg is copied to x3.jpg, elapsed Time:7808600 nanoSec. (2167306 bytes)

C:\java\Test>
+---------------------+-------------------+----------------------+-----------------------+
|                     | With IO Buffering | Without IO Buffering | NewIO OS IO Buffering |
+---------------------+-------------------+----------------------+-----------------------+
| Elapsed in NanoSec. |     14452800      |       19901500       |        7808600        |
+---------------------+-------------------+----------------------+-----------------------+
| Diff. in NanoSec.   | faster: -5448700  |          -           |  fastest: -12092900   |
+---------------------+-------------------+----------------------+-----------------------+
(Differences are relative to the version without IO Buffering.)
Conclusion: for a file copy, the version without IO buffering is the slowest, the NewIO version with OS IO buffering is the fastest, and the classic IO buffering version is in the middle.
Note: for small files it is better to work with classic IO buffering. NewIO with OS IO buffering only pays off for large files (from the MB range upward).
 

quydtkt

Normally I use IO buffering to read or download files. In the future I will try NewIO to see how it works.
 

Joe

@quydtkt
Boss, the gain is minimal for a download, because a URL or Socket gives you an InputStream, which is classic IO and cannot be turned into a FileChannel. Only the output (the downloaded file) can be a FileChannel. Furthermore, the InputStream in communication and networking is read byte-wise as the bytes come in, so the data cannot be buffered efficiently in advance (see the 3rd point in the 1st section). Example:

BufferedOutput and BufferedInput
Java:
import java.io.*;
import java.net.*;
import java.util.*;
public class Download {
  public static void main(String... a) throws Exception {
    if (a.length != 2) {
      System.out.println("Usage: java Download URL_link fileName");
      System.exit(0);
    }
    long beg = System.nanoTime();
    try (BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(a[1], false));
      BufferedInputStream inp = new BufferedInputStream(new URL(a[0]).openStream())) {
      int n;
      byte[] bytes = new byte[65536]; // max. 64K
      while ((n = inp.read(bytes)) > 0) {
        out.write(bytes, 0, n);
        out.flush();
      }
      out.close();
      inp.close();
      System.out.println("Download "+a[0]+" to "+a[1]+"\nElapsed time:"+
                        ((double)(System.nanoTime()-beg)/1000)+" microSec.");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
FileChannel/BufferedInput
Java:
import java.io.*;
import java.net.*;
import java.nio.*;
import java.nio.file.*;
import java.nio.channels.*;
public class Down_load {
  public static void main(String... a) throws Exception {
    if (a.length != 2) {
      System.out.println("Usage: java Download URL_link fileName");
      System.exit(0);
    }
    long beg = System.nanoTime();
    try (FileChannel out = FileChannel.open(Paths.get(a[1]),
                                            StandardOpenOption.CREATE,
                                            StandardOpenOption.TRUNCATE_EXISTING, // drop old content
                                            StandardOpenOption.WRITE);
         BufferedInputStream inp = new BufferedInputStream(new URL(a[0]).openStream())) {
      int n;
      byte[] bytes = new byte[65536]; // max. 64K
      while ((n = inp.read(bytes)) > 0) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, 0, n);
        while (buf.hasRemaining()) out.write(buf); // a single write may be partial
      }
      out.close();
      inp.close();
      System.out.println("Download "+a[0]+" to "+a[1]+"\nElapsed time:"+
                        ((double)(System.nanoTime()-beg)/1000)+" microSec.");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
The gains:
Code:
C:\links\java\Test>javac -g:none -d ./classes Down*.java

C:\links\java\Test>java Download https://i.ibb.co/f0XMCkJ/Server-3.png pic1.jpg
Download https://i.ibb.co/f0XMCkJ/Server-3.png to pic1.jpg
Elapsed time:1042827.7 microSec.

C:\links\java\Test>java Down_load https://i.ibb.co/f0XMCkJ/Server-3.png pic2.jpg
Download https://i.ibb.co/f0XMCkJ/Server-3.png to pic2.jpg
Elapsed time:1042086.1 microSec.


C:\links\java\Test>

+------------+-----------+
| version 1  | version 2 |
+------------+-----------+
| 1042827.7  | 1042086.1 |
+------------+-----------+
|  +741.6    |     0     |
+------------+-----------+
(microSeconds)
If you un-buffer the input as follows:
Java:
import java.io.*;
import java.net.*;
import java.nio.*;
import java.nio.file.*;
import java.nio.channels.*;
public class DownXload {
  public static void main(String... a) throws Exception {
    if (a.length != 2) {
      System.out.println("Usage: java Download URL_link fileName");
      System.exit(0);
    }
    long beg = System.nanoTime();
    try (FileChannel out = FileChannel.open(Paths.get(a[1]),
                                            StandardOpenOption.CREATE,
                                            StandardOpenOption.TRUNCATE_EXISTING, // drop old content
                                            StandardOpenOption.WRITE);
         InputStream inp = (new URL(a[0]).openStream())) {
      int n;
      byte[] bytes = new byte[65536]; // max. 64K
      while ((n = inp.read(bytes)) > 0) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, 0, n);
        while (buf.hasRemaining()) out.write(buf); // a single write may be partial
      }
      out.close();
      inp.close();
      System.out.println("Download "+a[0]+" to "+a[1]+"\nElapsed time:"+
                        ((double)(System.nanoTime()-beg)/1000)+" microSec.");
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
The outcome of version 3 is somewhat better than that of the 1st version with buffered input/output:
Code:
C:\links\java\Test>java Download https://i.ibb.co/f0XMCkJ/Server-3.png pic1.jpg
Download https://i.ibb.co/f0XMCkJ/Server-3.png to pic1.jpg
Elapsed time:1117427.6 microSec.

C:\links\java\Test>java Down_load https://i.ibb.co/f0XMCkJ/Server-3.png pic2.jpg
Download https://i.ibb.co/f0XMCkJ/Server-3.png to pic2.jpg
Elapsed time:1028713.0 microSec.

C:\links\java\Test>java DownXload https://i.ibb.co/f0XMCkJ/Server-3.png pic3.jpg
Download https://i.ibb.co/f0XMCkJ/Server-3.png to pic3.jpg
Elapsed time:1074018.1 microSec.

C:\links\java\Test>
That is 1074018.1 versus 1117427.6 microSec. for downloading a 65 KB image.
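A further variation, not measured here and offered only as a sketch: an InputStream cannot become a FileChannel, but java.nio.channels.Channels.newChannel can wrap it as a ReadableByteChannel, and FileChannel.transferFrom then pulls the bytes into the file without a hand-written copy loop over a byte array. The names ChannelDownloadDemo and save are hypothetical, and a ByteArrayInputStream stands in for the stream of a URL or Socket so that the sketch runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: saves an InputStream to a file through channels.
class ChannelDownloadDemo {
  // Copies everything from the stream into the file and returns the byte count.
  static long save(InputStream in, Path target) throws IOException {
    try (ReadableByteChannel src = Channels.newChannel(in);
         FileChannel out = FileChannel.open(target,
                                            StandardOpenOption.CREATE,
                                            StandardOpenOption.TRUNCATE_EXISTING,
                                            StandardOpenOption.WRITE)) {
      long pos = 0, n;
      // ask for up to 64K per call; 0 is returned once the blocking source has ended
      while ((n = out.transferFrom(src, pos, 1 << 16)) > 0) pos += n;
      return pos;
    }
  }

  public static void main(String... a) throws IOException {
    byte[] fake = new byte[100_000]; // stand-in for the bytes of a download
    Path target = Files.createTempFile("download", ".bin");
    long n = save(new ByteArrayInputStream(fake), target);
    System.out.println(n + " bytes written to " + target);
    Files.delete(target);
  }
}
```

For a real download the stream would come from new URL(a[0]).openStream(), as in the versions above. The network remains the slow side either way, so no big gain should be expected.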
 