Optimal Buffer and Destination Byte Array Size for java.io.BufferedInputStream Reads (for a slow disk)
A continuation of micro benchmarking
Earlier in the week I ran some micro-benchmarks against my new very performant Intel X25M solid state drive. Today, for 'kicks', I ran the same benchmarks with the same data against a 5,400 RPM external USB-attached 3.5" hard disk drive. Wow, what a difference that makes!
Destination Byte Array Size vs. KBps (for 16 different Buffer Sizes + the default)

Detailed description of the graph (same as previous post):
- x-axis: The destination byte array size used in individual read method calls, as defined by payloadSize in this snippet:
    byte[] payload = new byte[payloadSize];
    int readIn = is.read(payload);
- y-axis: The speed in kilobytes per second for a complete run: stream setup, file opening and reading, and the algorithm's computation.
- series (individual lines): The individual BufferedInputStream buffer sizes, as set with buffSize during class initialization:
    FileInputStream fis = new FileInputStream(file);
    InputStream is = new BufferedInputStream(fis, buffSize);
For a slower disk, the conclusions are different only in the specific sizes:
One aspect of this entire performance question that I have not measured is the CPU load for these different operations. Hmmm...maybe for a new post...
Optimal Buffer and Destination Byte Array Size for java.io.BufferedInputStream Reads (for a fast disk)
Micro-Benchmark Results on a MacBook Pro with an Intel X25M SSD
When implementing file-reading Java code with Java IO's BufferedInputStream, what buffer size should one choose? Should we just not specify it and go with the default? And what destination byte array size is best?
These questions pop up from time to time when I have the opportunity to write such code. And I've seen folks ask similar questions on Stack Overflow and The Java Ranch. So, with my trusty new SSD drive, and a bit of spare time this holiday season, I set out to answer those questions.
The methodology for the micro-benchmark results below was simple: graph the maximum speed of a simple algorithm with various BufferedInputStream buffer sizes and destination byte array (payload) sizes. The algorithm was just as simple: add up all the bytes in a file. This is both computationally lightweight and serves as a simple checksum (to ensure that my algorithm stays consistent across the various parameters). The code: OptimalBufferSizeSquentialReader.java
The target file read in these metrics was a 31MB video clip. To prevent the OS file cache from mucking up the results, there is a unique file for each buffer size + destination byte array size, for a total of 7GB in test data.
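For readers who want to see the shape of the byte-summing algorithm described above, here is a minimal sketch. It is not the code from OptimalBufferSizeSquentialReader.java; the class name ByteSumSketch, the method sumBytes, and the sizes used in main are assumptions made for illustration. It assumes payloadSize is at least 1, since reading into a zero-length array would never reach the end of the file.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ByteSumSketch {

    // Adds up every byte in the file, reading through a BufferedInputStream
    // of buffSize bytes into a destination array of payloadSize bytes.
    static long sumBytes(String file, int buffSize, int payloadSize) throws IOException {
        long sum = 0;
        byte[] payload = new byte[payloadSize];
        try (InputStream is = new BufferedInputStream(new FileInputStream(file), buffSize)) {
            int readIn;
            // read() returns the number of bytes placed in payload, or -1 at end of file
            while ((readIn = is.read(payload)) != -1) {
                for (int i = 0; i < readIn; i++) {
                    sum += payload[i] & 0xFF; // treat each byte as unsigned
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sizes for illustration: 8 KB buffer, 1 KB destination array
        System.out.println(sumBytes(args[0], 8192, 1024));
    }
}
```

Because the sum depends only on the file contents, any combination of buffSize and payloadSize should print the same value for the same file, which is what makes it usable as a consistency check.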
Destination Byte Array Size vs. KBps (for 16 different Buffer Sizes + the default)

Detailed description of the graph:
- x-axis: The destination byte array size used in individual read method calls, as defined by payloadSize in this snippet:
    byte[] payload = new byte[payloadSize];
    int readIn = is.read(payload);
- y-axis: The speed in kilobytes per second for a complete run: stream setup, file opening and reading, and the algorithm's computation.
- series (individual lines): The individual BufferedInputStream buffer sizes, as set with buffSize during class initialization:
    FileInputStream fis = new FileInputStream(file);
    InputStream is = new BufferedInputStream(fis, buffSize);
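To make the y-axis concrete, the following is a rough sketch of how one data point could be timed. It is an assumption about the approach, not the original benchmark code: the class ThroughputSketch, the methods kbps and measure, and the payload sizes in main are all hypothetical. Unlike the real test setup, it reuses a single file, so repeated runs will largely be served from the OS file cache.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ThroughputSketch {

    // Converts a byte count and an elapsed time in nanoseconds to kilobytes per second.
    static double kbps(long bytes, long elapsedNanos) {
        return (bytes / 1024.0) / (elapsedNanos / 1_000_000_000.0);
    }

    // Times stream setup, file opening, reading, and the byte sum as one unit.
    static double measure(String file, int buffSize, int payloadSize) throws IOException {
        long start = System.nanoTime();
        long total = 0;
        long sum = 0;
        byte[] payload = new byte[payloadSize];
        try (InputStream is = new BufferedInputStream(new FileInputStream(file), buffSize)) {
            int readIn;
            while ((readIn = is.read(payload)) != -1) {
                total += readIn;
                for (int i = 0; i < readIn; i++) {
                    sum += payload[i] & 0xFF;
                }
            }
        }
        // Guard against a zero interval on a very small file
        long elapsed = Math.max(1L, System.nanoTime() - start);
        double speed = kbps(total, elapsed);
        // Printing the checksum also ensures the summing work is actually used
        System.out.printf("payload=%d buffer=%d KBps=%.0f checksum=%d%n",
                payloadSize, buffSize, speed, sum);
        return speed;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical payload sizes; one series of the graph with an 8 KB buffer
        for (int payloadSize : new int[] {8, 64, 512, 1024}) {
            measure(args[0], 8192, payloadSize);
        }
    }
}
```

A more careful harness would warm up the JVM and give each combination its own file, as described in the methodology.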
An interesting graph. My conclusions:
- One generally cannot go wrong with a destination byte array size of 512 or 1024 bytes, regardless of what the BufferedInputStream's buffer size has been initialized to.
- BufferedInputStream's default buffer size is pretty well tuned, as long as one does not use a destination byte array size smaller than 8 or 16 bytes.
- There is no point in initializing BufferedInputStream's buffer size to anything larger than 2KB. In fact, if an application is going to have many concurrent threads running this code (like in a web application), then large values will only wastefully consume memory, limiting overall scalability.
- At the lower destination byte array sizes, the speed seems to follow some sort of sigmoid function.
- My new SSD is freaky fast. ~130,000KBps is ~125MBps! Yeah, baby!
One thing I don't get is that the default buffer size is 8KB, but the 8KB series does not match the default series. Humph.
Update! I was curious about two other aspects of the buffer size and destination byte array size: CPU load and the impact of the speed of the disk. To that end, I've followed up with a second, similar set of metrics in Optimal Buffer and Destination Byte Array Size for java.io.BufferedInputStream Reads (for a slow disk).
Intel X25M SSD in a MacBook Pro: Before and After Performance Results
A new Intel X25M Solid State Drive for my early 2008 MacBook Pro does wonders for performance.
Early last week I convinced my employer, xtendx AG, that I needed one of those new, fancy Intel X25M G2 Solid State Drives for my 18-month-old early-2008 MacBook Pro. The prices had finally come down to a tolerable sweet spot for us: ~SFr500 for the 160GB model from a favored local electronics store, Digitec AG.
Last Monday evening at home I opened up my MBP and dropped in the new SSD. The instructions I used to do this can be found on iFixIt.com. In the below photograph, the new drive is on the left.

It took about 90 minutes, but I took some of that time to clean out the dust and grime from the case interior. Note that it is really important to have a T6 Torx screwdriver on hand. Don't even attempt this operation without one.
After getting the drive installed I needed to format it, and install OS X. That went amazingly quickly. Especially the format. I did not time it. But trust me, it was fast!
You don't trust me? OK. Here are some benchmarks and a graph. The data was gathered using Xbench 3.1 on my newly modified MBP and an identical MBP of my colleague's. The raw results:
Xbench Scores

| | SSD Score | HDD Score | Boost |
| --- | --- | --- | --- |
| Disk Test | 182.55 | 41.16 | 340% |
| Sequential | 115.18 | 70.08 | 61% |
| Random | 439.80 | 29.14 | 1400% |

Uncached Sequential Speed Metrics (MB/sec)

| | SSD | HDD | Boost |
| --- | --- | --- | --- |
| Write 4K blocks | 84.12 | 63.24 | 33% |
| Write 256K blocks | 61.80 | 57.92 | 6.7% |
| Read 4K blocks | 21.04 | 10.09 | 93% |
| Read 256K blocks | 115.20 | 58.41 | 97% |

Uncached Random Speed Metrics (MB/sec)

| | SSD | HDD | Boost |
| --- | --- | --- | --- |
| Write 4K blocks | 67.60 | 1.00 | 6700% |
| Write 256K blocks | 64.64 | 30.35 | 110% |
| Read 4K blocks | 8.08 | 0.57 | 1300% |
| Read 256K blocks | 109.16 | 23.20 | 370% |

Wow! Some of those numbers are absolutely amazing, like the 4k Block Random Write. Below we have a graph of the improvement in performance over the stock 7,200 RPM drive.
Percent Improvement in Performance by Xbench Test

There are two things to remember when interpreting the above graph:
- The y-scale is logarithmic. This is because some of the numbers were so large that the smaller values, although important, looked negligible next to them.
- The values are the % improvement over the original hard drive.
"Uh, whatever" you say? Here are some points on how to interpret the data:
- A value of 100% means the performance was doubled
- A value of 1000% means the performance was eleven times that of the original drive
- <10% is probably not that noticeable to most users
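The arithmetic behind a "% improvement" figure can be sketched as follows. This assumes the boost is computed as (new - old) / old, which is consistent with 100% meaning doubled; the class BoostSketch and both method names are hypothetical, and the exact rounding used for the table is not known.

```java
public class BoostSketch {

    // Percent improvement of the new score over the baseline score.
    static double percentImprovement(double newScore, double baseline) {
        return (newScore - baseline) / baseline * 100.0;
    }

    // Converts a percent improvement back into a plain multiplier of the baseline.
    static double speedupFactor(double percent) {
        return 1.0 + percent / 100.0;
    }

    public static void main(String[] args) {
        // Uncached random 4K write figures from the table above: 67.60 vs 1.00 MB/sec
        double boost = percentImprovement(67.60, 1.00);
        System.out.printf("boost=%.0f%% factor=%.1fx%n", boost, speedupFactor(boost));
    }
}
```

For the random 4K write row this works out to roughly 6660%, which the table shows rounded as 6700%.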
Now do you believe me?!?! An SSD as an upgrade will dramatically change your computing experience.