
Real world performance metrics: java.io vs. java.nio (The Sequel)

About 2.75 times as fast for my particular use case

The first set of results (September 2008), measuring the performance improvement gained by switching to java.nio for FLV indexing, was not particularly scientific. Each data point came from a different file, and the files differed dramatically in size and key-frame spacing. The improvements are visible, but fuzzy.

From the ever powerful yet flawed Wikipedia, there is a concept to help bring these metrics into focus:
Cēterīs paribus is a Latin phrase, literally translated as "with other things the same." It is commonly rendered in English as "all other things being equal." A prediction, or a statement about causal or logical connections between two states of affairs, is qualified by ceteris paribus in order to acknowledge, and to rule out, the possibility of other factors which could override the relationship between the antecedent and the consequent.

A ceteris paribus assumption is often fundamental to the predictive purpose of scientific inquiry. In order to formulate scientific laws, it is usually necessary to rule out factors which interfere with examining a specific causal relationship. Experimentally, the ceteris paribus assumption is realized when a scientist controls for all of the independent variables other than the one under study, so that the effect of a single independent variable on the dependent variable can be isolated. By holding all the other relevant factors constant, a scientist is able to focus on the unique effects of a given factor in a complex causal situation.
Blah, blah, blah.  OK, back to gathering more data with this in mind.

With a single set of 8 files from production webcasts, more results were captured in two series. The second series was measured immediately after the first, on physically separate copies of the files. Measurements were taken one fine evening last November on the production server described in the first post--the server was not too busy at the time, at maybe 10% of capacity.
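
As a minimal sketch of how a KBps figure like the ones below can be derived: time one pass over a file and divide its size by the elapsed time. The class and method names here are hypothetical, and 1 KB is assumed to be 1024 bytes; the original measurements may have used a different harness or definition.

```java
final class Throughput {
    // Converts a byte count and an elapsed time in nanoseconds into KB per second.
    // Assumption: 1 KB = 1024 bytes.
    static double kbPerSecond(long bytes, long elapsedNanos) {
        double kilobytes = bytes / 1024.0;
        double seconds = elapsedNanos / 1e9;
        return kilobytes / seconds;
    }

    // Times a single run of a task that processes a file of the given size.
    // The task is a stand-in for the real work (for example, indexing an FLV file).
    static double measure(long fileSizeBytes, Runnable task) {
        long start = System.nanoTime();
        task.run();
        // Guard against a zero interval on very fast tasks.
        long elapsed = Math.max(1L, System.nanoTime() - start);
        return kbPerSecond(fileSizeBytes, elapsed);
    }

    public static void main(String[] args) {
        // Dummy workload: sleep briefly instead of reading a real file.
        double kbps = measure(61_793L * 1024L, () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("%.0f KBps%n", kbps);
    }
}
```

A single timed run like this is sensitive to file-system caching and JIT warm-up, which is one reason to repeat the measurement in more than one series.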

(Drum roll please...) The results:

File Size (KB)  Speed (KBps)
                v2.2 (java.io)        v2.4 (java.nio)
                Series 1  Series 2    Series 1  Series 2
61,793          67,965    72,647      200,239   202,806
70,079          31,418    30,798      73,225    91,648
82,645          30,529    31,286      84,291    91,887
82,951          53,388    50,290      144,458   154,158
88,086          29,106    29,134      71,360    69,012
101,500         28,491    28,935      75,644    80,758
122,606         30,839    31,954      84,035    92,383
289,423         42,374    41,479      112,543   112,773

[Chart: java.io versus java.nio]

Interpretations:
  1. Much more consistent, although in hindsight I should have encoded a video from a single source at several different quality levels. This would have made the number of index points consistent across the files.
  2. On average, java.nio ran at 273% of the speed of java.io, or about 2.7 times as fast.
  3. The median was 277%
  4. The minimum was 241%
  5. The maximum was 288%
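
The figures above can be recomputed from the table. A minimal sketch, assuming each file's result is the ratio of its average java.nio speed to its average java.io speed over the two series (the class name `SpeedupStats` is hypothetical):

```java
import java.util.Arrays;

final class SpeedupStats {
    // Speeds in KBps copied from the table, one row per file:
    // {java.io series 1, java.io series 2, java.nio series 1, java.nio series 2}
    static final double[][] KBPS = {
        {67_965, 72_647, 200_239, 202_806},
        {31_418, 30_798, 73_225, 91_648},
        {30_529, 31_286, 84_291, 91_887},
        {53_388, 50_290, 144_458, 154_158},
        {29_106, 29_134, 71_360, 69_012},
        {28_491, 28_935, 75_644, 80_758},
        {30_839, 31_954, 84_035, 92_383},
        {42_374, 41_479, 112_543, 112_773},
    };

    // Per-file ratio: average java.nio speed divided by average java.io speed.
    static double[] ratios() {
        double[] result = new double[KBPS.length];
        for (int i = 0; i < KBPS.length; i++) {
            double io = (KBPS[i][0] + KBPS[i][1]) / 2.0;
            double nio = (KBPS[i][2] + KBPS[i][3]) / 2.0;
            result[i] = nio / io;
        }
        return result;
    }

    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        double[] r = ratios();
        System.out.printf("mean    %.0f%%%n", 100 * mean(r));
        System.out.printf("median  %.0f%%%n", 100 * median(r));
        System.out.printf("minimum %.0f%%%n", 100 * Arrays.stream(r).min().getAsDouble());
        System.out.printf("maximum %.0f%%%n", 100 * Arrays.stream(r).max().getAsDouble());
    }
}
```

Under that assumption the program prints 273%, 277%, 241%, and 288% for the mean, median, minimum, and maximum. These are ratios of nio speed to io speed, so 273% means roughly 2.7 times the throughput.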


Re: Real world performance metrics: java.io vs. java.nio (The Sequel)

Hi, do you have a gut feel for where the performance increase came from? Did you use memory-mapped byte buffers?

Thanks.

Re: Real world performance metrics: java.io vs. java.nio (The Sequel)

Yes, I did use MappedByteBuffers. But I do not know authoritatively whether the gain is a trait of NIO in general or of MBBs specifically. Hmmm....good question...I feel another blog post coming on....

Re: Real world performance metrics: java.io vs. java.nio (The Sequel)

What size did you set your MappedByteBuffers to? Did you map the whole file into memory?

Re: Real world performance metrics: java.io vs. java.nio (The Sequel)

@Simon: Yes. I mapped the entire file, so technically the buffer size was the file size. MBBs do not allocate their contents on the Java heap, so there is no OutOfMemoryError (Java heap space) problem. (It is possible, on some JVM implementations, to load the entire file into memory: check out the .load() method.) It is important to understand that Java's MappedByteBuffers are backed by the OS's virtual memory subsystem. This has a fundamental impact on how they behave underneath compared to normal ByteBuffers.
Stu
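
For readers who have not used them, here is a minimal sketch of mapping a whole file in one go. It is not the indexing code itself: the class name `MapWholeFile` and the byte-summing loop are hypothetical stand-ins. One caveat that is certain: a single `FileChannel.map` call cannot cover more than `Integer.MAX_VALUE` bytes, so larger files need several mappings.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MapWholeFile {
    // Maps the entire file read-only. The mapping remains valid after the
    // channel is closed, so try-with-resources is safe here.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Stand-in for real work: sums every remaining byte as an unsigned value.
    static long sumBytes(MappedByteBuffer buffer) {
        long sum = 0;
        while (buffer.hasRemaining()) {
            sum += buffer.get() & 0xFF;
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mbb-demo", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {1, 2, 3, 4});

        MappedByteBuffer buffer = mapReadOnly(tmp);
        // Mapped buffers are direct buffers: their contents live outside the Java heap.
        System.out.println("direct: " + buffer.isDirect());
        System.out.println("size:   " + buffer.capacity());
        System.out.println("sum:    " + sumBytes(buffer));
    }
}
```

Pages are brought in by the OS as the buffer is read, which is why mapping a file of a few hundred MB does not require a heap of that size.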

Re: Real world performance metrics: java.io vs. java.nio (The Sequel)

@Stu: This might be a dumb question, sorry if it is :). Can we use the same concept for a producer-consumer pattern? If we don't want to use the heap for a very large quantity of produced data, could we write it to a file through an MBB and have the consumer read it later? Does it really make a difference? Assuming both are in the same application, we could do something like this: once the producer finishes with the first buffer, it notifies the consumer to take it, and so on. And maybe once all the buffers are filled, it notifies the consumer to destroy it after processing the last buffer, and the producer continues with a new one. The producers will not always stay ahead of the consumer; at some point the consumer will catch up, I guess. Wouldn't it be better to keep this on disk rather than in memory and deal with OOMs? (This is for a particular case where we don't want to lose any data, which is difficult to keep in memory because fast producers will eventually cause an OOM.)

Not really sure. Just asking..

