<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Java - Blogs at Near Infinity</title>


        <link>http://www.nearinfinity.com/blogs/</link>
        <description>Employee Blogs</description>
        <language>en</language>
        <copyright>Copyright 2011</copyright>
        <lastBuildDate>Sun, 27 Nov 2011 21:29:12 -0500</lastBuildDate>
        <generator>http://www.sixapart.com/movabletype/</generator>
        <docs>http://www.rssboard.org/rss-specification</docs>
        
        <item>
            <title>Integral Image for Faster Image Processing</title>
            <description><![CDATA[Recently, I was looking into computer vision technologies. &nbsp;One interesting technique is the "integral image", which enables more advanced techniques, most notably the <a href="http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework">Viola-Jones object detection framework</a>. Viola-Jones uses a series of simple Haar-like features, which are rectangular areas with dark and light regions, to find areas of an image that match patterns corresponding to the object you are trying to find.<div><br /></div><div>To do this computation, you slide an imaginary window containing the pattern over the entire image at multiple scales, looking for differences between the dark and light areas of the pattern that meet a threshold determined by the training algorithm. &nbsp;Doing this can require many computations, so how can we speed it up? Let's frame the question this way: starting with a grayscale image in which each pixel has a value from 0 to 255, how can we quickly compute the sum of the pixel values (and from it the average value) in a sub-region of the image?</div><div><br /></div><div>The conventional way would be to write a loop similar to the following:


<pre class="prettyprint">private static int computeSum(BufferedImage image, int regionX1, int regionY1, int regionX2, int regionY2) {
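   //Sum the pixel values in a sub-region of the image.
   //Here x is a row index and y is a column index, so getSample(column, row, band) receives (y, x, 0).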
   int sum = 0;
   Raster data = image.getData();
   for (int x = regionX1; x &lt;= regionX2; x++) {
      for (int y = regionY1; y &lt;= regionY2; y++) {
         int value = data.getSample(y, x, 0);
         sum = sum + value;
      }
   }
   return sum;
}
</pre>
</div><div><br /></div><div>But we can do better than this, especially if we need the sums of many regions and do not want to recompute them each time. &nbsp;To create an integral image, we make a single pass over the entire image and, for each pixel, store the sum of all the pixels above it and to its left plus the value of the pixel itself. &nbsp;The code for that is below.</div><div><br /></div><div><div><br /></div></div>

<pre class="prettyprint">public class IntegralImage {
   private int[][] integralImage = null;
	
   public IntegralImage(BufferedImage image) {
      int originalImageHeight = image.getHeight();
      int originalImageWidth = image.getWidth();
      integralImage = new int[originalImageHeight][originalImageWidth];
      Raster originalPixels = image.getData();

      int originalPixelValue = 0;
      for (int row = 0; row &lt; originalImageHeight; row++) {
         for (int column = 0; column &lt; originalImageWidth; column++) {
            originalPixelValue = originalPixels.getSample(column, row, 0);

            //For the top-left pixel, just copy the value from the original
            if (row == 0 &amp;&amp; column == 0) {
               integralImage[row][column] = originalPixelValue;
            }

            //For the first row, just add the integral value to the left of this pixel
            else if (row == 0) {
               integralImage[row][column] = originalPixelValue + integralImage[row][column - 1];
            }

            //For the first column, just add the integral value above this pixel
            else if (column == 0) {
               integralImage[row][column] = originalPixelValue + integralImage[row - 1][column];
            }

            //For any other pixel, add the integral values to the left and above,
            //then subtract the diagonal value, which was counted twice
            else {
               integralImage[row][column] = originalPixelValue + integralImage[row][column - 1] + integralImage[row - 1][column] - integralImage[row - 1][column - 1];
            }
         }
      }
   }

   //The total(...) method shown further down belongs here
}
</pre>

After this pass through the image, we can compute the sum of any region in constant time by doing the following:<div><br /><pre class="prettyprint">public int total(int x1, int y1, int x2, int y2) {
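   //x1 and x2 are row indexes, y1 and y2 are column indexes (integralImage is indexed [row][column]).
   //The region is inclusive: rows x1..x2 and columns y1..y2.
   int a = x1 &gt; 0 &amp;&amp; y1 &gt; 0 ? integralImage[x1-1][y1-1] : 0;
   int b = x1 &gt; 0 ? integralImage[x1-1][y2] : 0;
   int c = y1 &gt; 0 ? integralImage[x2][y1-1] : 0;
   int d = integralImage[x2][y2];
   return a + d - b - c;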
}
</pre><div><br /></div>
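<div>To check the arithmetic by hand, here is a minimal, self-contained sketch that works on a plain int[][] instead of a BufferedImage. It is a variation on the class above, not a replacement: the class name IntegralImageSketch and its method names are made up for this example, the table is padded with an extra row and column of zeros so the border checks disappear, and it uses long so that large images cannot overflow.</div>

```java
final class IntegralImageSketch {

    // Builds a summed-area table padded with one extra row and column of zeros,
    // so queries need no special cases at the image border.
    static long[][] build(int[][] pixels) {
        int rows = pixels.length;
        int cols = rows == 0 ? 0 : pixels[0].length;
        long[][] table = new long[rows + 1][cols + 1];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                table[r + 1][c + 1] = pixels[r][c]
                        + table[r][c + 1]   // region above this pixel
                        + table[r + 1][c]   // region to the left of this pixel
                        - table[r][c];      // overlap that was counted twice
            }
        }
        return table;
    }

    // Sum of the pixels in rows r1..r2 and columns c1..c2, inclusive.
    static long sum(long[][] table, int r1, int c1, int r2, int c2) {
        return table[r2 + 1][c2 + 1] - table[r1][c2 + 1] - table[r2 + 1][c1] + table[r1][c1];
    }

    // Average pixel value over the same region.
    static double average(long[][] table, int r1, int c1, int r2, int c2) {
        long count = (long) (r2 - r1 + 1) * (c2 - c1 + 1);
        return (double) sum(table, r1, c1, r2, c2) / count;
    }

    public static void main(String[] args) {
        int[][] pixels = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        long[][] table = build(pixels);
        System.out.println(sum(table, 1, 1, 2, 2));     // 5 + 6 + 8 + 9 = 28
        System.out.println(average(table, 0, 0, 2, 2)); // 45 / 9 = 5.0
    }
}
```

<div>Each query touches exactly four table entries, no matter how large the region is.</div>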

And that's it!  We can now apply this faster computation to a number of interesting problems.  For a nice tutorial on this, you may want to check out <a href="http://computersciencesource.wordpress.com/2010/09/03/computer-vision-the-integral-image/">the following</a>.</div>]]></description>
            <link>http://www.nearinfinity.com/blogs/tom_neumark/integral_tables_for_faster_ima.html</link>
            <guid>http://www.nearinfinity.com/blogs/tom_neumark/integral_tables_for_faster_ima.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">image</category>
            
            <pubDate>Sun, 27 Nov 2011 21:29:12 -0500</pubDate>
        </item>
        
        <item>
            <title>An Introduction to Blur</title>
            <description><![CDATA[<p>Blur is a new Apache 2.0 licensed software project that provides a search capability built on top of Hadoop and Lucene.  Elastic Search and Solr already exist so why build something new?  While these projects work well, they didn't have a solid integration with the Hadoop ecosystem.  Blur was built specifically for Big Data, taking scalability, redundancy, and performance into consideration from the very start, while leveraging all the goodness that already exists in the Hadoop stack.  </p>

<p>A year and a half ago, my project began using Hadoop for data processing.  Very early on, we were having networking issues that would make our HDFS cluster network connectivity spotty at best.  Over one weekend in particular, we steadily lost network connection to 47 of the 90 data nodes in the cluster.  When we came in on Monday morning, I noticed that the MapReduce system was a little sluggish but still working.  When I checked HDFS I saw that our capacity had dropped by about 50%.  After running an fsck on the cluster I was amazed to find that what seemed like a catastrophic failure over the weekend resulted in a still healthy file system.  This experience left a lasting impression on me.  It was then that I got the idea to somehow leverage the redundancy and fault tolerance of HDFS for the next version of a search system that I was just beginning to (re)write.  </p>

<p>I had already written a custom sharded Lucene server that had been in a production system for a couple of years.  Lucene worked really well and did everything that we needed for search.  The issue that we faced was that it was running on big iron that was not redundant and could not be easily expanded.  After seeing the resilient characteristics of Hadoop firsthand, I decided to look into marrying the already mature and impressive feature set of Lucene with the built-in redundancy and scalability of the Hadoop platform.  From this experiment Blur was created.</p>

<p>The biggest technical issues that Blur addresses, and the features it provides to do so:</p>

<ul>
<li>Rapid mass indexing of entire datasets</li>
<li>Automatic Shard Server Failover</li>
<li>Near Real-time update compatibility via Lucene NRT</li>
<li>Compression of Lucene FDT files while maintaining random access performance</li>
<li>Lucene WAL (Write Ahead Log) to provide data reliability</li>
<li>Lucene R/W directly into HDFS (the seek on write problem)</li>
<li>Random access performance with block caching of the Lucene Directory</li>
</ul>

<h1>Data Model</h1>

<p>Data in Blur is stored in Tables that contain Rows.  Rows must have a unique row id and contain one or more Records.  Records have a unique record id (unique within the Row) and a column family for grouping columns that logically make up a single record.  Columns contain a name and a value, and a Record can contain multiple columns with the same name.</p>

<script src="https://gist.github.com/1349055.js?file=gistfile1.js"></script>

<h1>Architecture</h1>

<p>Blur uses Hadoop's MapReduce framework for indexing data, and Hadoop's HDFS filesystem for storing indexes.  Thrift is used for all inter-process communication, and ZooKeeper is used to track the state of the system and to store metadata.  The Blur architecture is made up of two types of server processes:</p>

<ul>
<li>Blur Controller Server</li>
<li>Blur Shard Server</li>
</ul>

<p>Each shard server serves zero or more shards from all the currently online tables.  Which shards are online in each shard server is calculated from the state information in ZooKeeper.  If a shard server goes down, the remaining shard servers detect the failure through ZooKeeper and determine which, if any, of the missing shards they need to serve from HDFS.  </p>
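<p>To make the failover idea concrete, here is a minimal Java sketch. It is not Blur's actual algorithm or API: the class name ShardFailoverSketch, the reassign method, the server names, and the round-robin policy are all assumptions made for illustration, and the ZooKeeper interaction is replaced by plain arrays. It only shows how the orphaned shards of a failed server could be handed to the survivors while every other shard stays where it is.</p>

```java
import java.util.Arrays;

final class ShardFailoverSketch {

    // owners[i] is the name of the server currently serving shard i.
    // Returns a new layout in which the failed server's shards are spread
    // round-robin over the survivors; all other shards keep their owner.
    static String[] reassign(String[] owners, String failedServer, String[] survivors) {
        if (survivors.length == 0) {
            throw new IllegalArgumentException("no surviving shard servers");
        }
        // Sort so that every server computing this independently gets the same answer.
        String[] sorted = survivors.clone();
        Arrays.sort(sorted);

        String[] result = owners.clone();
        int next = 0;
        for (int shard = 0; shard < result.length; shard++) {
            if (failedServer.equals(result[shard])) {
                result[shard] = sorted[next % sorted.length];
                next++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] owners = {"server-a", "server-b", "server-c", "server-b"};
        String[] after = reassign(owners, "server-b", new String[] {"server-c", "server-a"});
        System.out.println(Arrays.toString(after));
    }
}
```

<p>Because the inputs are sorted before use, each surviving server can run the same calculation on the same published state and agree on the result without talking to the others.</p>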

<p>The controller server provides a single point of entry (logically) to the cluster for spraying out queries, collecting the responses, and providing a single response.  Both the controller and shard servers expose the same Thrift API which helps to ease debugging.  It also allows developers to start a single shard server and interact with it the same way they would with a large cluster.  Many controller servers can be (and should be) run for redundancy. The controllers act as gateways to all of the data that is being served by the shard servers.</p>

<h1>Updating / Loading Data</h1>

<p>Currently there are two ways to load and update data.  The first is through a bulk load in MapReduce and the second is through mutation calls in Thrift.</p>

<h2>Bulk Load MapReduce Example</h2>

<script src="https://gist.github.com/1348788.js?file=BlurMapReduce.java"></script>

<h2>Data Mutation Thrift Example</h2>

<script src="https://gist.github.com/1348845.js?file=ThriftMutationExample.java"></script>

<h1>Searching Data</h1>

<p>Any element in the Blur data model is searchable using normal Lucene semantics, with text processed by Lucene analyzers. Analyzers are defined per Blur table.</p>

<p>The standard Lucene query syntax is the default way to search Blur.  If anything outside of the standard syntax is needed, you can create a Lucene query directly with Java objects and submit it through the expert query API.</p>

<p>The column family grouping within Rows allows for results to be discovered across column families similar to what you would get with an inner join across two tables that share the same key (or in this case rowid).  For complicated data models that have multiple column families, this makes for a very powerful search capability.</p>

<p>The following example searches for "value" as a full text search.  If I had wanted to search for "value" in a single field like column "colA" in column family "famB" the query would look like "famB.colA:value".</p>

<script src="https://gist.github.com/1348874.js?file=ThriftSearchExample.java"></script>

<h1>Fetching Data</h1>

<p>Fetches can be done by row or by record.  This is done by creating a selector object in which you specify the rowid or recordid, and the specific column families or columns that you would like returned.  When not specified, the entire Row or Record is returned.</p>

<script src="https://gist.github.com/1348865.js?file=ThriftFetchExample.java"></script>

<h1>Current State</h1>

<p>Blur is nearing its first release, 0.1, and is relatively stable.  The first release candidate should be available for download within the next few weeks.  In the meantime you can check it out on GitHub:</p>

<p><a href="https://github.com/nearinfinity/blur">https://github.com/nearinfinity/blur</a></p>

<p><a href="http://blur.io">http://blur.io</a></p>
]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/an_introduction_to_blur.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/an_introduction_to_blur.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Big Data</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Lucene</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">MapReduce</category>
            
            <pubDate>Wed, 09 Nov 2011 15:00:00 -0500</pubDate>
        </item>
        
        <item>
            <title>Gradle Class July 18-20 by Tim Berglund</title>
            <description><![CDATA[<p>Near Infinity is proud to host a new Gradle Expert Training class, taught by Tim Berglund, author of the O'Reilly Gradle book series. This is an intensive and highly practical 3-day Gradle course. You will become familiar with all the major concepts of Gradle and learn how best to use it for simple as well as complex build scenarios. The course is packed with hands-on exercises.
<br /><br />To learn more, please visit Gradleware's <a href="http://gradleware.com/training.html">Gradle Expert Training</a> course description. To register, visit Gradleware's <a href="http://www.regonline.com/builder/site/Default.aspx?EventID=980125">Gradle course registration</a> web page.
<br /><br />Be sure to check out our <a href="../../trainingcenter/coursecatalog/upcoming/">Upcoming NIC-U Training Courses</a> page for information on more upcoming courses.]]></description>
            <link>http://www.nearinfinity.com/blogs/gray_herter/gradle_class_july_18-20_by_tim.html</link>
            <guid>http://www.nearinfinity.com/blogs/gray_herter/gradle_class_july_18-20_by_tim.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">gradle</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Thu, 16 Jun 2011 19:40:19 -0500</pubDate>
        </item>
        
        <item>
            <title>Lucene Thrift and Ruby</title>
            <description><![CDATA[ This post is going to demonstrate thrift usage by searching a Lucene index from Ruby.
<h3>Thrift In a Nutshell</h3>
Essentially thrift is a serialization and RPC framework that allows you to communicate between programs that are not necessarily written in the same language.  Thrift is used by defining data types and services in a .thrift file.  You then run the thrift compiler against the .thrift file, which generates the stub code needed for clients and servers.  Currently thrift will generate code for C++, C#, Erlang, Haskell, Java, Objective C/Cocoa, OCaml, Perl, PHP, Python, Ruby, and Squeak.  For a more detailed description of thrift, along with instructions on how to install it if needed, consult the <a href="http://wiki.apache.org/thrift/" target="new">thrift wiki</a>.

<h3>Generating the Lucene Index</h3>
Our first step is to generate a small, simple lucene index.  To build our index, 50,000 fake person records were downloaded from the <a href="http://www.fakenamegenerator.com/order.php" target="new">Fake Name Generator</a> in a comma delimited file.  Each person record will contain a  first name, last name, address and email address.  Our indexing code will be very simple and will not be using any of lucene's advanced features.
<pre class="prettyprint">
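import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexBuilder {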

    public static void main(String[] args) throws Exception {
        String namesFile = "names.csv";
        Document doc = new Document();
        Field[] fields = new Field[]{new Field("firstName", "", Field.Store.YES, Field.Index.ANALYZED_NO_NORMS),
                new Field("lastName", "", Field.Store.YES, Field.Index.ANALYZED_NO_NORMS),
                new Field("address", "", Field.Store.YES, Field.Index.ANALYZED_NO_NORMS),
                new Field("email", "", Field.Store.YES, Field.Index.ANALYZED_NO_NORMS)};
        addFieldsToDocument(doc, fields);

        BufferedReader reader = new BufferedReader(new FileReader(namesFile));

        IndexWriter indexWriter = new IndexWriter(FSDirectory.open(new File("blog-index")),new IndexWriterConfig(Version.LUCENE_31, new StandardAnalyzer(Version.LUCENE_31)));

        String line;
        while ((line = reader.readLine()) != null) {
            String[] personData = getPersonData(line);
            setFieldData(personData, fields);
            indexWriter.addDocument(doc);
        }
        reader.close();
        indexWriter.optimize();
        indexWriter.close();
    }

    private static String[] getPersonData(String line) {
        return line.split(",");
    }

    private static void setFieldData(String[] data, Field[] fields) {
        int index = 0;
        for (Field field : fields) {
            field.setValue(data[index++]);
        }
    }

    private static void addFieldsToDocument(Document doc, Field[] fields) {
        for (Field field : fields) {
            doc.add(field);
        }
    }
}
</pre> 
<h3>Creating the .thrift File </h3>
The next step will be to define what objects and services we want in our .thrift file, which will be called lucene_search.thrift.  The lucene_search.thrift file is intentionally very basic. For more details on the structure of .thrift files, consult the <a href="http://wiki.apache.org/thrift/Tutorial" target="new">thrift wiki tutorial</a>.
<pre class="prettyprint">
//all generated java code will have the following for package name
namespace java bbejeck.thrift.gen

//this is the person object 
struct Person {
  1: string firstName,
  2: string lastName,
  3: string address,
  4: string email
}

//exception used to send meaningful error messages back to user
exception LuceneSearchException {
  1: string message
}

//service definition used by client and server
service LuceneSearch { 
    list&lt;Person&gt; search(1: string query) throws (1:LuceneSearchException error) 
}
</pre>
As you can see from the example above, the .thrift file format is completely language agnostic. Next we need to generate our java and ruby code.   The following were run from the command line:
<ul>
<li>$ thrift --gen java lucene_search.thrift</li>
<li> $ thrift --gen rb lucene_search.thrift</li>
</ul>
The generated code ends up in two directories named gen-java/ and gen-rb/ respectively. The files generated for java are LuceneSearch.java, LuceneSearchException.java and Person.java.  The generated ruby files are lucene_search.rb, lucene_search_types.rb and lucene_search_constants.rb.   In our next step, we are going to use generated java code to write our thrift server.
<h3>Thrift Server - Java</h3>
Thrift generates all the stub code you need for a server to expose your service or program.  The only code we will need to write is a class that implements the generated Iface interface (defined in the LuceneSearch class), which contains the search method defined in our .thrift file.  
<pre class="prettyprint">
public class LuceneThriftServer {
    private static final int PORT = 9090;
    private static int numberThreads = 5;

    public static void main(String[] args) throws Exception {
        TServerSocket serverSocket = new TServerSocket(PORT, 100000);
        LuceneSearch.Processor searchProcessor = new LuceneSearch.Processor(new SearchHandler(args[0]));
        if (args.length > 1) {
            numberThreads = Integer.parseInt(args[1]);
        }
        TThreadPoolServer.Args serverArgs = new TThreadPoolServer.Args(serverSocket);
        serverArgs.maxWorkerThreads(numberThreads);
        TServer thriftServer = new TThreadPoolServer(serverArgs.processor(searchProcessor).protocolFactory(new TBinaryProtocol.Factory()));
        thriftServer.serve();
    }
}
</pre> 
<h3>Iface Implementation </h3>
The SearchHandler class actually does the work of searching the lucene index.  One tradeoff made here is that any exception while searching is caught and re-thrown as a LuceneSearchException.  While it's usually not a great idea to just re-throw an exception, in this case it makes sense to do so.  Since the LuceneSearchException is defined in the lucene_search.thrift file, the generated client code will handle that exception.  So instead of receiving a generic thrift exception when an error occurs, the client should receive a more meaningful error message. 
<pre class="prettyprint">
public class SearchHandler implements LuceneSearch.Iface {
    private IndexSearcher searcher;
    private QueryParser queryParser;
    private static final int MAX_RESULTS = 1000;

    public SearchHandler(String indexPath) {
        try {
            searcher = new IndexSearcher(FSDirectory.open(new File(indexPath)), true);
            queryParser = new QueryParser(Version.LUCENE_31, null, new StandardAnalyzer(Version.LUCENE_31));
            queryParser.setAllowLeadingWildcard(true);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public List&lt;Person&gt; search(String query) throws LuceneSearchException {
        List&lt;Person&gt; results = new ArrayList&lt;Person&gt;();
        try {
            Query q = queryParser.parse(query);
            TopDocs topDocs = searcher.search(q, MAX_RESULTS);
            for (ScoreDoc sd : topDocs.scoreDocs) {
                Document document = searcher.doc(sd.doc);
                results.add(getPersonFromDocument(document));
            }
        } catch (Exception e) {
            throw new LuceneSearchException(e.getMessage());
        }
        return results;
    }

    private Person getPersonFromDocument(Document document) {
        Person p = new Person();
        p.firstName = document.get("firstName");
        p.lastName = document.get("lastName");
        p.address = document.get("address");
        p.email = document.get("email");

        return p;
    }
}
</pre>
The next step in our process is to write the client.
<h3>Thrift Client - Ruby</h3>
Writing the thrift ruby client is even easier than writing the server code.  If you have not already done so, install the thrift gem by running "gem install thrift" to get  the required thrift library code.  All the code you need for your client is already generated by thrift.  At this point we are only doing what is needed to get the client to communicate with the server.
<pre class="prettyprint">
module ThriftConnection

  class LuceneClient

    def initialize(host='localhost', port=9090)
      socket = Thrift::Socket.new(host, port)
      @transport = Thrift::BufferedTransport.new(socket)
      protocol_factory = ::Thrift::BinaryProtocolFactory.new
      protocol = protocol_factory.get_protocol(@transport)
      @transport.open
      @client = LuceneSearch::Client.new(protocol)
    end

    def search(query)
      @client.search(query)
    end

    def close
      @transport.close
    end

  end
end
</pre>
<h3>Running With Scissors</h3>
This section has the odd title "Running With Scissors", because like actual running with scissors, what we are about to do may not be a great idea.
In all the thrift generated code there is a warning at the top: "DO NOT EDIT UNLESS YOU ARE SURE THAT YOU KNOW WHAT YOU ARE DOING".  Obviously I don't, but I'm not going to let that stop me at this point (sometimes you just have to see if you can get something to work!).  What we've done is implement method_missing in the generated Person class (found in lucene_search_types.rb) so we can specify searches in the style of ActiveRecord dynamic finders.  To accomplish this, we use a regular expression to pull out which fields to search on and use the arguments passed in as the search values.  The regular expression here is fairly simple and only aims to handle simple searches.
<pre class="prettyprint">
#added to translate from symbol to expected search format
  SEARCH_KEYS_MAPPING = {:first_name => 'firstName',
                         :last_name => 'lastName',
                         :email => 'email',
                         :address => 'address'}


  def self.method_missing(method_name, *args)
    #handles find_by_first_name etc
    if method_name.to_s =~ /^find_by_([a-z]+_?[a-z]*)$/
      query = "#{SEARCH_KEYS_MAPPING[$1.to_sym]}:#{args[0]}"
    #handles find_by_first_name_or_last_name, find_by_first_name_and_email
    elsif method_name.to_s =~ /^find_by_([a-z]+_[a-z]+)_([a-z]+)_([a-z]+_?[a-z]*)$/
      query = "#{SEARCH_KEYS_MAPPING[$1.to_sym]}:#{args[0]} #{$2.upcase} #{SEARCH_KEYS_MAPPING[$3.to_sym]}:#{args[1]}"
    else
      raise ArgumentError.new("search method pattern #{method_name} not recognized")
    end

    #open the connection only once the query is known to be valid, and always close it
    lucene_client = ThriftConnection::LuceneClient.new
    begin
      lucene_client.search(query)
    ensure
      lucene_client.close
    end
  end
</pre>
As we'll see in the next section, this actually worked, but I still view this more as a useful experiment.  First of all this was placed in generated code, so any time you make changes you would have to manually get the method_missing definition back into the Person class.  Secondly, Lucene search syntax is really not all that hard to learn.  
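For readers more at home on the Java side of this post, the same name-to-query translation can be sketched in Java. This is only an illustration of the technique, not part of the project: the class FinderQuerySketch and its methods are hypothetical, and unlike the Ruby version it rejects unknown field names instead of producing an empty field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FinderQuerySketch {
    // Same patterns as the Ruby version; matches() anchors them to the whole name.
    private static final Pattern SINGLE =
            Pattern.compile("find_by_([a-z]+_?[a-z]*)");
    private static final Pattern PAIR =
            Pattern.compile("find_by_([a-z]+_[a-z]+)_([a-z]+)_([a-z]+_?[a-z]*)");

    // Maps a snake_case finder key to the field name used in the Lucene index.
    static String fieldFor(String key) {
        switch (key) {
            case "first_name": return "firstName";
            case "last_name":  return "lastName";
            case "email":      return "email";
            case "address":    return "address";
            default: throw new IllegalArgumentException("unknown search key " + key);
        }
    }

    // Turns a finder-style method name plus its arguments into a Lucene query string.
    static String toQuery(String methodName, String... args) {
        Matcher single = SINGLE.matcher(methodName);
        if (single.matches()) {
            return fieldFor(single.group(1)) + ":" + args[0];
        }
        Matcher pair = PAIR.matcher(methodName);
        if (pair.matches()) {
            return fieldFor(pair.group(1)) + ":" + args[0]
                    + " " + pair.group(2).toUpperCase() + " "
                    + fieldFor(pair.group(3)) + ":" + args[1];
        }
        throw new IllegalArgumentException(
                "search method pattern " + methodName + " not recognized");
    }

    public static void main(String[] args) {
        System.out.println(toQuery("find_by_first_name", "Tia"));
        System.out.println(toQuery("find_by_first_name_or_last_name", "elizabeth", "krause"));
    }
}
```

Written out like this, the limits of the patterns are easier to see: the two-field form only matches when the first field contains an underscore, so a name such as find_by_email_and_address is rejected.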
<h3>Testing</h3>
All of what we have done so far would not be worth much if we could not verify our work with some testing.  Here is the unit test to verify that we are indeed able to search a Lucene index from Ruby.  To get some names to search on I simply ran <pre class="prettyprint">
head names.csv</pre> and then used some of the information in various combinations to get counts of what searches should return.  For example to get an idea of what searching for a first name of Elizabeth or last name of Krause would return I ran<pre class="prettyprint">cat names.csv | grep -iE 'elizabeth|krause' | wc -l </pre> which returned a count of 289.  So, first making sure that our thrift server was running in the background, here is the unit test that was run to verify our Ruby client searching against a Lucene index.
<pre class="prettyprint">
class SearchTest < Test::Unit::TestCase

  def setup
    @lucene_client = ThriftConnection::LuceneClient.new
  end


  def teardown
    @lucene_client.close
  end

  def test_search_client_first_name
    persons = @lucene_client.search("firstName:Tia")
    assert_equal(5, persons.length)

    persons.each do |person|
      assert_equal("Tia", person.firstName)
    end
  end

  def test_search_person_class_first_name
    persons = Person.find_by_first_name("Tia")
    assert_equal(5, persons.length)

    persons.each do |person|
      assert_equal("Tia", person.firstName)
    end
  end

  def test_search_client_first_name_email_domain
    persons = @lucene_client.search("+firstName:Elizabeth +email:*pookmail.com")
    assert_equal(59, persons.length)
  end

  def test_search_person_class_first_name_email_domain
    persons = Person.find_by_first_name_and_email("elizabeth", "*pookmail.com")
    assert_equal(59, persons.length)
  end

  def test_search_client_first_name_and_last_name
    persons = @lucene_client.search("+firstName:Elizabeth +lastName:Krause")
    assert_equal(1, persons.length)
    person = persons[0]

    assert_equal("Elizabeth", person.firstName)
    assert_equal("Krause", person.lastName)
  end

  def test_search_person_class_first_name_and_last_name
    persons = Person.find_by_first_name_and_last_name("elizabeth", "krause")
    assert_equal(1, persons.length)
    person = persons[0]

    assert_equal("Elizabeth", person.firstName)
    assert_equal("Krause", person.lastName)
  end

  def test_search_person_class_first_name_or_last_name
    persons = Person.find_by_first_name_or_last_name("elizabeth", "krause")
    assert_equal(289, persons.length)
  end

  def test_invalid_search
    assert_raises ArgumentError do
      Person.find_person_by_name("tia")
    end
  end

end
</pre>
<h3>Conclusion</h3>
Thrift is a compelling alternative for RPC or message passing where one might otherwise be using REST, Java RMI or middleware (JMS, AMQP).  There is a great comparison of how thrift performs against other forms of RPC near the end of this <a href="http://jnb.ociweb.com/jnb/jnbJun2009.html" target="new">thrift tutorial from OCI</a>.  I hope you were able to learn something useful.  Thanks for your time.
<h3>Resources</h3>
Full source for the blog, including the generated code, can be found <a href="https://github.com/bbejeck/LuceneThriftRuby" target="new">on github</a>. If you are interested in running the test, you can download <a href="https://github.com/downloads/bbejeck/LuceneThriftRuby/lucene-thrift-example.tar.gz" target="new">lucene-thrift-example.tar.gz</a>, extract the tar file and execute the runSearchTest.sh script.  You do not need to have thrift installed to run the test.
<ul>
<li>For more information on thrift, the <a href="http://wiki.apache.org/thrift/" target="new">thrift wiki</a> is a great start</li>
<li>More information on Lucene can be found <a href="http://lucene.apache.org/java/docs/index.html" target="new">here</a></li>
</ul>




]]></description>
            <link>http://www.nearinfinity.com/blogs/bill_bejeck/lucene_thrift_and_ruby.html</link>
            <guid>http://www.nearinfinity.com/blogs/bill_bejeck/lucene_thrift_and_ruby.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Ruby</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Java Thrift Lucene Ruby</category>
            
            <pubDate>Wed, 18 May 2011 23:38:30 -0500</pubDate>
        </item>
        
        <item>
            <title>Monitoring JMS Resources Using WebLogic MBeans</title>
            <description><![CDATA[<div>Recently, I needed to enhance some existing code to apply optional filters when retrieving the status of application-created JMS queues/topics (resources) from WebLogic. &nbsp;These filters would be based upon a pre-defined set of categories for the JMS resources in our application. &nbsp;The existing code used the deprecated MBeanHome interface to connect to the WebLogic MBean server, retrieved all "JMSServerRuntime" MBeans, and returned the status of only application-created JMS resources through the use of nested iterations over the returned MBean set. &nbsp;Since I just needed to add a filter on top of the returned MBean set, the easiest route would have been to add an iteration over the returned JMSServerRuntimeMBeans to return only the ones that matched the provided filter. &nbsp;However, since the existing performance was already slower than desired, I knew that adding an iteration would degrade it even further. &nbsp;As a result, I decided to leverage MBeanServerConnection&nbsp;queryNames when retrieving the MBeans directly from WebLogic to allow WebLogic to do the filtering for me instead of iterating over the results after the MBeans are returned. &nbsp;Also, I figured I'd refactor the existing code to no longer use deprecated WebLogic APIs in favor of the standard JMX programming model.</div><div><br /></div><div>Here is an example of the existing code. &nbsp;It performs the following steps:</div><div>1. &nbsp;Connect to the MBeanServer using MBeanHome &nbsp;</div><div>2. &nbsp;Retrieve all MBeans of type "JMSServerRuntime"</div><div>3. &nbsp;Iterate over the set to find the MBean for the JMS Server created by the application</div><div>4. &nbsp;Retrieve all JMS destinations for the application-created JMS Server, and then iterate over them to retrieve each destination</div><div><span class="Apple-style-span" style="font-family: monospace; white-space: pre; "><br /></span></div>
<pre class="prettyprint">/**
 * Obtain the status of JMS queues and topics
 * @return List of all JMSDestinationRuntimeMBean objects (one per destination)
 */
private static List&lt;JMSDestinationRuntimeMBean&gt; getJMSStatus()
{
    List&lt;JMSDestinationRuntimeMBean&gt; destinations = new ArrayList&lt;JMSDestinationRuntimeMBean&gt;();
    try
    {
        // Look up the management bean home for the admin server
        Context ctx = new InitialContext();
        MBeanHome home = (MBeanHome) ctx.lookup(MBeanHome.ADMIN_JNDI_NAME);

        // Get all JMS server MBeans
        Set mbeanSet = home.getMBeansByType("JMSServerRuntime");

        // Iterate over the set to find the JMS Server used by our application
        Iterator mBeanIterator = mbeanSet.iterator();
        while ( mBeanIterator.hasNext() )
        {
            JMSServerRuntimeMBean server = (JMSServerRuntimeMBean) mBeanIterator.next();

            // If it's our application-created JMS server, then get all of its destinations
            String name = server.getName().toUpperCase();
            // The name was upper-cased above, so compare against the upper-cased
            // name of the JMS server used by the application
            if ( name.startsWith("JMSSERVERNAME") )
            {
                JMSDestinationRuntimeMBean[] dest = server.getDestinations();
                for ( int i=0; i &lt; dest.length; i++ )
                {
                    // NOTE: This is where we would add another iteration to determine if the destination meets any provided filters
                    destinations.add(dest[i]);
                }
            }
        }
    }
    catch (Exception ex)
    {
        LOG.warn("Could not retrieve JMS statistics for all destinations", ex);
    }

    return destinations;
}
</pre>

<div>Based on my analysis and testing, I found that the performance of this method degraded with the number of MBeans returned by WebLogic and the nested iterations over the returned MBean set. &nbsp;So instead of retrieving MBeans using MBeanHome getMBeansByType, I decided to try queryNames from the MBeanServerConnection interface. &nbsp;The queryNames method returns the names of the matching MBeans from the MBean server and takes an ObjectName and a QueryExp as parameters. &nbsp;Since the ObjectName acts as a pattern that identifies the MBean names to be retrieved, I used "Type=JMSDestinationRuntime". &nbsp;The QueryExp parameter was where I was hoping WebLogic would do the filtering for me. &nbsp;After doing some more testing, I found that since WebLogic performed the filtering before returning the MBean set, the results were returned much faster than retrieving the full list of MBeans. &nbsp;The only drawback of this approach was that queryNames returns only the ObjectNames of the selected MBeans, not the attributes needed by our application for the status of the JMS queues/topics. &nbsp;An additional query to the MBean server is necessary to retrieve the missing attributes. &nbsp;Even with this additional query, the enhanced code returned the results faster than the existing code.</div><div><br /></div><div>Here is an example of the updated code. &nbsp;It performs the following steps:</div><div>1. &nbsp;Create the QueryExp based upon the provided filters</div><div>2. &nbsp;Connect to the WebLogic Admin MBeanServer using JMXConnector</div><div>3. &nbsp;Retrieve the names of the MBeans of type "JMSDestinationRuntime" that match the QueryExp (using queryNames)</div><div>4. &nbsp;Keep only the MBeans whose "JMSServerRuntime" key property names the application-created JMS Server</div><div>5. &nbsp;For each remaining destination, retrieve its status attributes with an additional getAttributes call</div><div><br /></div>

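<div>Before the WebLogic-specific listing, the core idea of steps 1 and 3 (build a QueryExp, then let the MBean server apply it inside queryNames) can be tried in isolation. &nbsp;What follows is only a minimal sketch under stated assumptions, not the application code: it uses a throwaway local MBeanServer instead of WebLogic, a hypothetical Destination MBean that exposes nothing but a Name attribute, and made-up destination names. &nbsp;Only the javax.management calls (Query.attr, Query.value, Query.initialSubString, Query.or and queryNames) are the same ones the real code relies on.</div>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;
import javax.management.Query;
import javax.management.QueryExp;

class QueryNamesSketch {

    // Hypothetical standard MBean interface: exposes a single "Name" attribute.
    public interface DestinationMBean {
        String getName();
    }

    // Hypothetical stand-in for a JMS destination runtime MBean.
    public static class Destination implements DestinationMBean {
        private final String name;
        public Destination(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    // OR together one "Name starts with prefix" expression per prefix; null means "no filter".
    static QueryExp prefixQuery(List<String> prefixes) {
        QueryExp full = null;
        if (prefixes != null) {
            for (String prefix : prefixes) {
                QueryExp q = Query.initialSubString(Query.attr("Name"), Query.value(prefix));
                full = (full == null) ? q : Query.or(full, q);
            }
        }
        return full;
    }

    // Register three made-up destinations locally and return the names the server-side query selects.
    static Set<String> matchingNames(List<String> prefixes) {
        try {
            MBeanServer server = MBeanServerFactory.newMBeanServer();
            for (String n : new String[] {"OrdersQueue", "OrdersTopic", "AuditQueue"}) {
                server.registerMBean(new Destination(n),
                        new ObjectName("demo:Type=JMSDestinationRuntime,Name=" + n));
            }
            ObjectName pattern = new ObjectName("*:Type=JMSDestinationRuntime,*");
            Set<String> names = new TreeSet<String>();
            for (ObjectName on : server.queryNames(pattern, prefixQuery(prefixes))) {
                names.add(on.getKeyProperty("Name"));
            }
            return names;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchingNames(Arrays.asList("Orders")));
    }
}
```

<div>The filtering happens inside queryNames, so the caller never iterates over destinations it does not want. &nbsp;Passing null for the prefixes yields a null QueryExp, which selects every MBean that matches the ObjectName pattern.</div>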
<pre class="prettyprint">    /**
    * Obtain the status of JMS queues and topics
    * @param jmsNameList list of queue/topic name prefixes to match (if null then all queues/topics returned)
    * @return List of all JMSStatus objects (one per destination)
    */
    public static List&lt;JMSStatus&gt; getJMSStatusByNamedList(List&lt;String&gt; jmsNameList)
    {
        QueryExp fullQuery = null;
        if (jmsNameList != null)
        {
            for (String jmsName : jmsNameList)
            {
                // mBeanName is a field defined elsewhere in the class; it names the MBean attribute to match on
                AttributeValueExp attribute = Query.attr(mBeanName);
                StringValueExp name = Query.value(jmsName);
                QueryExp query = Query.initialSubString(attribute, name);
                // OR the per-name prefix matches together
                fullQuery = (fullQuery != null) ? Query.or(fullQuery, query) : query;
            }
        }

        return getJMSStatusByQuery(fullQuery);
    }

    /**
    * Obtain the status of JMS queues and topics
    * @param query optional QueryExp to filter the returned jms destinations (if null then all queues/topics returned)
    * @return List of all JMSStatus objects (one per destination)
    */
    private static List&lt;JMSStatus&gt; getJMSStatusByQuery(QueryExp query)
    {
        List&lt;JMSStatus&gt; destinations = new ArrayList&lt;JMSStatus&gt;();
        String[] mBeanAttributes = {"Name", "DestinationType", "MessagesCurrentCount",
            "MessagesPendingCount", "MessagesReceivedCount", "MessagesHighCount"};

        MBeanServerConnection connection = null;
        JMXConnector connector = null;

        try
        {
            connector = initConnection();
            connection = connector.getMBeanServerConnection();

            // The MBean server applies the QueryExp before returning the matching names
            Set&lt;ObjectName&gt; mbeanSet = connection.queryNames(new ObjectName("*:*,Type=JMSDestinationRuntime"), query);
            for (ObjectName mbean : mbeanSet)
            {
                // The key property is lower-cased, so compare against the lower-cased
                // name of the application-created JMS Server
                if (mbean.getKeyProperty("JMSServerRuntime").toLowerCase().startsWith("jmsservername"))
                {
                    // queryNames returns only names, so fetch the attributes separately
                    AttributeList attributes = connection.getAttributes(mbean, mBeanAttributes);
                    if (attributes.size() == mBeanAttributes.length)
                    {
                        JMSStatus stat = new JMSStatus(
                                (Attribute) attributes.get(0),
                                (Attribute) attributes.get(1),
                                (Attribute) attributes.get(2),
                                (Attribute) attributes.get(3),
                                (Attribute) attributes.get(4),
                                (Attribute) attributes.get(5),
                                mbean.getKeyProperty("JMSServerRuntime"),
                                mbean.getKeyProperty("ServerRuntime"));

                        destinations.add(stat);
                    }
                }
            }
        }
        catch (Exception ex)
        {
            LOG.warn("Could not retrieve JMS statistics for all destinations", ex);
        }
        finally
        {
            if (connector != null)
            {
                try {
                    connector.close();
                } catch (IOException ex) {
                    LOG.warn("Could not close the MBean Server connection", ex);
                }
            }
        }

        return destinations;
    }

    /**
    * Initialize the JMX connection to the WebLogic Admin MBeanServer.
    * @return JMXConnector for the MBeanServer connection
    */
    private static JMXConnector initConnection() throws IOException {

        JMXServiceURL serviceURL = new JMXServiceURL(protocol, host, port, jndiPath); // Update using the appropriate protocol, host, port, jndiPath

        Map&lt;String,Object&gt; env = new HashMap&lt;String,Object&gt;();
        env.put(Context.SECURITY_PRINCIPAL, "login");  // update with admin user login
        env.put(Context.SECURITY_CREDENTIALS, "password");  // update with admin user password
        env.put(JMXConnectorFactory.PROTOCOL_PROVIDER_PACKAGES, "weblogic.management.remote");
        env.put("jmx.remote.x.request.waiting.timeout", 10000);

        return JMXConnectorFactory.connect(serviceURL, env);
    }
</pre>]]></description>
            <link>http://www.nearinfinity.com/blogs/sara_bevels/using_jmx_querynames_for_retri.html</link>
            <guid>http://www.nearinfinity.com/blogs/sara_bevels/using_jmx_querynames_for_retri.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
            <pubDate>Mon, 16 May 2011 19:01:26 -0500</pubDate>
        </item>
        
        <item>
            <title>Hadoop Presentation at NOVA/DC Java Users Group</title>
            <description><![CDATA[<p>Last Thursday (on Cinco de Mayo) I gave a presentation on <a href="http://hadoop.apache.org/">Hadoop</a> and <a href="http://hive.apache.org/">Hive</a> at the <a href="http://www.meetup.com/dc-jug/">Nova/DC Java Users Group</a>. As several people asked about getting the slides, I've shared them <a href="http://www.slideshare.net/scottleber/hadoop-7904044">here</a> on Slideshare. I also posted the presentation sample code on Github at <a href="https://github.com/sleberknight/basic-hadoop-examples">basic-hadoop-examples</a>.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/hadoop_presentation_at_novadc.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/hadoop_presentation_at_novadc.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Big Data</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hive</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Tue, 10 May 2011 00:30:15 -0500</pubDate>
        </item>
        
        <item>
            <title>Lucene Compression</title>
            <description><![CDATA[ <div>For a number of years I have used Lucene for both search and data storage: I store all the data necessary to display search results in the Lucene index. &nbsp;This means that there is typically a lot of data in the FDT file, and compressing that data becomes a necessity as the indexes grow in size.</div><div><br /></div><div>Lucene dropped support for field compression in version 3.0. &nbsp;I could compress each field into a byte array and store the data in the document, but if you have <i>a lot</i> of small fields in a document this doesn't work very well. &nbsp;It typically won't save you any disk space, and it might actually cost you some, depending on the compression algorithm.</div><div><br /></div><div>So I have built block-level compression for the FDT file in the form of a Lucene directory. &nbsp;It allows you to choose whatever compression algorithm you want and whatever block size makes sense for your data.</div><div><br /></div><div>The block-level compression allows you to compress the entire document (possibly&nbsp;multiple&nbsp;documents) into a single block and achieve a higher compression ratio than if you had compressed each field separately.</div><div><br /></div><div>Plus you don't have to modify any of your Lucene code, other than wrapping your <i>real</i> directory with the compressed implementation.</div><div><br /></div><div>NOTE: &nbsp;This only works with compound files turned off.</div><div><br /></div><div><a href="https://github.com/nearinfinity/lucene-compression">https://github.com/nearinfinity/lucene-compression</a></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/lucene_compression.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/lucene_compression.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Big Data</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">compression</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">lucene</category>
            
            <pubDate>Thu, 21 Apr 2011 09:00:00 -0500</pubDate>
        </item>
        
        <item>
            <title>Seam, Resteasy, and Capturing Incoming REST Requests</title>
            <description><![CDATA[So...I embarked upon a task the other day to ensure that all REST traffic coming into our application would be captured (when turned "on" for a particular end user) and logged to our database, so that the support staff could review the traffic and help end users with any issues they might have while submitting REST requests to our system.<div><br /></div><div>Sounds pretty straightforward...but as is usually the case, things are not always as easy as they first seem. &nbsp;Based on the&nbsp;<a href="http://www.jboss.org/resteasy/docs">Resteasy docs</a>, I immediately gravitated to the PreProcessInterceptor. &nbsp;I mean - hey - that fits the need quite nicely, right? &nbsp;Well, yes, until you account for the need to capture requests that may have failed to resolve properly, e.g., if the user sends a POST instead of a PUT to a particular URL. &nbsp;In that scenario, attempting to capture requests in the PreProcessInterceptor will not work, since Resteasy will not get far enough into its request dispatching process to map the request to a particular resource method...and if Resteasy does not execute a resource method, then the PreProcessInterceptor does not get called. &nbsp;Fail.</div><div><br /></div><div>I could clearly see in my server logs that Resteasy was logging its failure to find a resource method (for an incorrect request) from its SynchronousDispatcher class:</div><div><br /></div>
<pre class="prettyprint">org.jboss.resteasy.spi.NotFoundException: Could not find resource for relative : /samplePath/RESOURCEID of full path: http://127.0.0.1:8080/approot/seam/resource/rest/samplePath/RESOURCEID?apikey=API-1234-1234-1234-1234
	at org.jboss.resteasy.core.registry.RootSegment.matchChildren(RootSegment.java:358)
	at org.jboss.resteasy.core.registry.RootSegment.matchRoot(RootSegment.java:372)
	at org.jboss.resteasy.core.registry.RootSegment.matchRoot(RootSegment.java:365)
	at org.jboss.resteasy.core.ResourceMethodRegistry.getResourceInvoker(ResourceMethodRegistry.java:251)
	at org.jboss.resteasy.core.SynchronousDispatcher.getInvoker(SynchronousDispatcher.java:155)
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:116)
</pre><div><br /></div><div>So I began to look more closely at the source code of that class to identify the code execution path and find out what my options were to capture requests prior to resource method resolution. &nbsp;I quickly noticed in the SynchronousDispatcher code the existence of a collection of HttpRequestPreprocessors. &nbsp;Well now - that class sounded more promising...and there is a public method on SynchronousDispatcher to add such a pre-processor. &nbsp;But I wasn't seeing a path forward on how to create a custom HttpRequestPreprocessor and add it to the collection - where do I do that?</div><div><br /></div><div>I looked at the Resteasy configuration docs but those examples weren't helpful - I was forgetting that my Resteasy components are being accessed via&nbsp;<a href="http://docs.jboss.org/seam/2.2.1.Final/reference/en-US/html_single/">JBoss Seam</a>&nbsp;integration. &nbsp;But looking at the Seam / Resteasy integration documentation didn't immediately expose a solution either - no reference directly to HttpRequestPreprocessor at least. &nbsp;(In fact, I didn't see ANY reference to HttpRequestPreprocessor in Resteasy documentation at all.)</div><div><br /></div><div>Ultimately, I found that the Seam / Resteasy integration lies with the existence of the ResteasyBootstrap class which is configured as a Seam Startup component. &nbsp;The ResteasyBootstrap component is what is responsible for constructing the SynchronousDispatcher instance to which we need to add a HttpRequestPreprocessor. &nbsp;My solution was to create my own Seam component that extends Resteasy's ResteasyBootstrap but that has a higher precedence on the @Install annotation - like so:</div><div><br /></div><pre class="prettyprint">/**
 * Custom Resteasy bootstrapper used to add our own HttpRequestPreprocessor.
 */
@Name("org.jboss.seam.resteasy.bootstrap")
@Scope( ScopeType.APPLICATION)
@Startup
@AutoCreate
@Install(classDependencies = "org.jboss.resteasy.spi.ResteasyProviderFactory", precedence = Install.DEPLOYMENT )
public class RESTCaptureBootstrap extends ResteasyBootstrap
{
    /**
     * This method overrides RESTEasy's bootstrapping method solely to add our RESTCaptureInterceptor
     * as an HttpRequestPreprocessor.  We're using that to capture a REST request as early in the pipeline as
     * possible.
     *
     * @param providerFactory Resteasy provider from Seam
     * @return dispatcher with our added HttpRequestPreprocessor
     */
    @Override
    protected Dispatcher createDispatcher( SeamResteasyProviderFactory providerFactory)
    {
        Dispatcher dispatcher = super.createDispatcher( providerFactory );
        dispatcher.addHttpPreprocessor( new RESTCaptureInterceptor() );

        return dispatcher;
    }
}
</pre>
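<div><br /></div><div>To see why the hook point matters, here is a toy model of the dispatch order described above. &nbsp;It is only a sketch with hypothetical stand-in types (RequestPreprocessor and ToyDispatcher are not Resteasy classes, and the status codes are simplified); it assumes nothing beyond what the stack trace and the SynchronousDispatcher source suggested: pre-processors run first, resource method resolution happens afterwards.</div>

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DispatchOrderSketch {

    /** Hypothetical stand-in for a request pre-processor: sees every request before routing. */
    interface RequestPreprocessor {
        void preProcess(String httpMethod, String path);
    }

    /** Toy dispatcher: runs the pre-processors first, then tries to resolve a resource method. */
    static class ToyDispatcher {
        private final List<RequestPreprocessor> preprocessors = new ArrayList<RequestPreprocessor>();
        private final Set<String> routes = new HashSet<String>();

        void addPreprocessor(RequestPreprocessor preprocessor) {
            preprocessors.add(preprocessor);
        }

        void addRoute(String httpMethod, String path) {
            routes.add(httpMethod + " " + path);
        }

        /** Returns a simplified status: 200 if a route matched, 404 otherwise. */
        int invoke(String httpMethod, String path) {
            for (RequestPreprocessor preprocessor : preprocessors) {
                preprocessor.preProcess(httpMethod, path);
            }
            return routes.contains(httpMethod + " " + path) ? 200 : 404;
        }
    }

    /** Sends one resolvable and one unresolvable request and returns what the pre-processor captured. */
    static List<String> captureAll() {
        final List<String> captured = new ArrayList<String>();
        ToyDispatcher dispatcher = new ToyDispatcher();
        dispatcher.addRoute("PUT", "/samplePath/RESOURCEID");
        dispatcher.addPreprocessor(new RequestPreprocessor() {
            @Override
            public void preProcess(String httpMethod, String path) {
                captured.add(httpMethod + " " + path);
            }
        });

        dispatcher.invoke("PUT", "/samplePath/RESOURCEID");   // resolves to a resource method
        dispatcher.invoke("POST", "/samplePath/RESOURCEID");  // fails to resolve, but is still captured
        return captured;
    }

    public static void main(String[] args) {
        System.out.println(captureAll());
    }
}
```

<div>Both requests end up in the captured list, including the POST that never reaches a resource method, which is exactly the case an interceptor tied to resource method execution would miss.</div>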
<div><br /></div><div>This code will allow our custom HttpRequestPreprocessor to capture and log requests as necessary.</div><div><br /></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/jim_clingenpeel/seam_resteasy_and_httprequestp.html</link>
            <guid>http://www.nearinfinity.com/blogs/jim_clingenpeel/seam_resteasy_and_httprequestp.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">seam resteasy</category>
            
            <pubDate>Sat, 16 Apr 2011 23:11:29 -0500</pubDate>
        </item>
        
        <item>
            <title>What&apos;s in JDK 7 Lightning Talk Slides</title>
            <description><![CDATA[ <p>Yesterday at the <a href="http://www.nearinfinity.com">Near Infinity</a> 2011 Spring Conference I gave a talk on CoffeeScript (see <a href="http://www.nearinfinity.com/blogs/scott_leberknight/coffeescript_slides.html"> here</a>) and a very short lightning talk on what exactly is in JDK 7. You can find the slides for the JDK 7 talk <a href="http://www.slideshare.net/scottleber/wtf-is-in-javajdkwtf7">here</a> if you're interested.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/whats_in_jdk_7_lightning_talk.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/whats_in_jdk_7_lightning_talk.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">jdk7</category>
            
            <pubDate>Sat, 16 Apr 2011 11:01:01 -0500</pubDate>
        </item>
        
        <item>
            <title>Inheriting Annotations on Java Methods</title>
            <description><![CDATA[<p>As is clearly stated in the @Inherited meta annotation docs <a href="http://download.oracle.com/javase/6/docs/api/java/lang/annotation/Inherited.html">here</a>, you can&#8217;t inherit annotations on anything other than classes. Or, stated differently, you can&#8217;t expect <a href="http://download.oracle.com/javase/6/docs/api/java/lang/reflect/AccessibleObject.html#isAnnotationPresent(java.lang.Class)">Method::isAnnotationPresent(Class&lt;? extends Annotation&gt; annotationClass)</a> to return true unless the annotation you are querying for is on <em>the</em> Method instance you are querying (not on the interface method it may be implementing, for example).</p>

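<p>A quick way to convince yourself is the following minimal sketch. The names in it (Marker, Greeter, EnglishGreeter) are hypothetical and exist only for this illustration; the annotation is even marked @Inherited to show that it makes no difference for methods.</p>

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class MethodAnnotationDemo {

    // Hypothetical marker annotation; @Inherited only has an effect on classes.
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface Marker {}

    interface Greeter {
        @Marker
        String greet();
    }

    // Implements greet() without repeating the annotation.
    static class EnglishGreeter implements Greeter {
        @Override
        public String greet() {
            return "hello";
        }
    }

    // Looks up greet() on the given type and asks that Method instance for the annotation.
    static boolean isMarked(Class<?> type) {
        try {
            return type.getMethod("greet").isAnnotationPresent(Marker.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interface method:    " + isMarked(Greeter.class));
        System.out.println("implementing method: " + isMarked(EnglishGreeter.class));
    }
}
```

<p>The lookup on Greeter finds the annotation, while the lookup on EnglishGreeter does not, even though it is the same method as far as callers are concerned.</p>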
<p>Well, if you&#8217;re like me, as soon as you hear that you can&#8217;t inherit annotations on methods, you start thinking of cases where it would be useful. In my case, the scenario that created my need for it was a RESTful web service. Long story short, we had interfaces specifying our web services, and the services we were &#8220;publishing&#8221; were methods on those interfaces annotated with the <a href="http://download.oracle.com/javaee/6/api/javax/ws/rs/Path.html">@Path</a> annotation. We also had an interceptor that intercepted service calls in order to provide an extra layer of error handling. The problem was that we didn&#8217;t want to intercept <em>every</em> method on our service implementations (for example, we didn&#8217;t want to intercept <em>private</em> methods); we only wanted to intercept those methods that were the result of a web service call. The most precise way to determine whether or not a method was a service call (from the interceptor&#8217;s perspective) was to determine if it was an implementation of a method that was annotated with @Path.</p>

<p>This isn&#8217;t a blog post about intercepting web service calls, though. To recreate that example would obscure the real problem being solved. To demonstrate, I&#8217;ve created a small <a href="https://github.com/emacdona/InheritingMethodAnnotationsExample">sample project</a> on Github. It models the life of one of my favorite Scientist/Mathematician/Engineers, Sir Isaac Newton.</p>

<p>The code that solves the problem of interrogating a method for an annotation it may have &#8220;inherited&#8221; follows. I&#8217;ve added some inline comments. If you&#8217;d like to see it used in context, check out the Github project linked to above.</p>

<pre class="prettyprint" style="border: 1px solid black">
package net.edmacdonald.javaPlayground.metaProgramming.util;

import org.apache.commons.lang.ArrayUtils;

import java.lang.annotation.Annotation;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Utils {

    /*
     * Check to see if a given method, or an interface method it implements, is annotated
     * with the given annotation. This is the method you call in your client code.
     */
    public static boolean isImplementedMethodAnnotatedWith(Method method, Class&lt;? extends Annotation&gt; annotation) {

        //Collect the class the given method belongs to plus every interface it implements, either
        //**explicitly** (listed in its implements clause) or through a super-interface.
        //Superclasses are not visited, so don't expect Object to show up.
        Class[] clazzes = getAllInterfaces(method.getDeclaringClass());

        //For each of these types, see if there is a method that looks exactly like this one and
        //is annotated with the given annotation
        for(Class clazz : clazzes) {
            try {
                //Throws NoSuchMethodException if this type does not declare the method.
                Method m = clazz.getDeclaredMethod(method.getName(), method.getParameterTypes());

                if(m.isAnnotationPresent(annotation)) {
                    return true;
                }
            }
            catch(NoSuchMethodException e) {
                /*This type does not declare the method; keep looking*/
            }
        }

        return false;
    }

    private static Class[] getAllInterfaces(Class clazz) {
        return getAllInterfaces(new Class[] { clazz } );
    }

    //This method walks up the interface hierarchy to make sure we get every interface that could
    //possibly contain the declaration of the annotated method we're looking for.
    private static Class[] getAllInterfaces(Class[] classes) {
        if(0 == classes.length ) {
            return classes;
        }
        else {
            List&lt;Class&gt; extendedClasses = new ArrayList&lt;Class&gt;();
            for (Class clazz: classes) {
                extendedClasses.addAll(Arrays.asList( clazz.getInterfaces() ) );
            }
            //Class::getInterfaces() gets only the interfaces implemented/extended directly by a given type.
            //We need to walk the whole way up the tree.
            return (Class[]) ArrayUtils.addAll( classes,
                                                getAllInterfaces(
                                                        extendedClasses.toArray(new Class[extendedClasses.size()]))
            );
        }
    }
}
</pre>
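<p>The utility above follows Class::getInterfaces() only, so an annotated interface that is implemented by a <em>superclass</em> of the method&#8217;s declaring class is not visited. If that case matters to you, the following is one possible variant, offered as a sketch rather than as the code from the sample project: it needs no commons-lang, walks superclasses as well as interfaces, and uses hypothetical names (Published, Service, BaseService, ServiceImpl) for the demonstration. Like the original, it matches methods by name and exact parameter types only.</p>

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class InheritedMethodAnnotations {

    // Breadth-first walk over the declaring class, its superclasses and all their interfaces.
    static boolean isAnnotatedInHierarchy(Method method, Class<? extends Annotation> annotation) {
        Deque<Class<?>> pending = new ArrayDeque<Class<?>>();
        Set<Class<?>> seen = new HashSet<Class<?>>();
        pending.add(method.getDeclaringClass());

        while (!pending.isEmpty()) {
            Class<?> type = pending.poll();
            if (!seen.add(type)) {
                continue; // already visited through another path
            }
            try {
                Method candidate = type.getDeclaredMethod(method.getName(), method.getParameterTypes());
                if (candidate.isAnnotationPresent(annotation)) {
                    return true;
                }
            } catch (NoSuchMethodException e) {
                // this type does not declare the method; keep walking
            }
            if (type.getSuperclass() != null) {
                pending.add(type.getSuperclass());
            }
            pending.addAll(Arrays.asList(type.getInterfaces()));
        }
        return false;
    }

    // Hypothetical types used only to exercise the walk.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Published {}

    interface Service {
        @Published
        void call();
    }

    static abstract class BaseService implements Service {}

    static class ServiceImpl extends BaseService {
        @Override
        public void call() {}

        public void helper() {}
    }

    // Convenience wrapper: look up a no-argument public method on ServiceImpl and check it.
    static boolean check(String methodName) {
        try {
            return isAnnotatedInHierarchy(ServiceImpl.class.getMethod(methodName), Published.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("call:   " + check("call"));
        System.out.println("helper: " + check("helper"));
    }
}
```

<p>Here ServiceImpl has no implements clause of its own, so the annotated declaration is only reachable through BaseService; call is reported as annotated and helper is not.</p>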
]]></description>
            <link>http://www.nearinfinity.com/blogs/ed_macdonald/inheriting_annotations_on_java.html</link>
            <guid>http://www.nearinfinity.com/blogs/ed_macdonald/inheriting_annotations_on_java.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">annotation</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">reflection</category>
            
            <pubDate>Wed, 13 Apr 2011 13:20:25 -0500</pubDate>
        </item>
        
        <item>
            <title>Introducing RJava</title>
            <description><![CDATA[<p>You've no doubt heard about JRuby, which lets you run Ruby code on the JVM. This is nice, but wouldn't it be nicer if you could write Java code on a Ruby VM? This would let you take advantage of the power of Ruby 1.9's new YARV (Yet Another Ruby VM) interpreter while letting you write code in a statically-typed language. Without further ado, I'd like to introduce <strong>RJava</strong>, which does just that!</p>

<p>RJava lets you write code in Java and run it on a Ruby VM! And you still get the full benefit of the Java compiler to ensure your code is 100% correct. Of course with Java you also get checked exceptions and proper interfaces and abstract classes to ensure compliance with your design. You no longer need to worry about whether an object responds to a random message, because the Java compiler will enforce that it does.</p>

<p>You get all this and more, but with the power and flexibility of a Ruby VM. And because Java does not support closures, you are assured that everything is properly designed since you'll be able to define interfaces and then implement anonymous inner classes just like you're used to doing! Even when JDK 8 arrives sometime in the future with lambdas, you can rest assured that they will be statically typed.</p>

<p>As a first example, let's see how you could filter a collection in RJava to find only the even numbers from one to ten. In Ruby you'd probably write something like this:</p>

<pre class="prettyprint">
evens = (1..10).find_all { |n| n % 2 == 0 }
</pre>

<p>With RJava, you'd write this:</p>

<pre class="prettyprint">
List&lt;Integer&gt; evens = new ArrayList&lt;Integer&gt;();
for (int i = 1; i &lt;= 10; i++) {
  if (i % 2 == 0) {
    evens.add(i);
  }
}
</pre>

<p>This example shows the benefits of declaring variables with specific types, how you can use interfaces (e.g. List in the example) when declaring variables, and shows how you also get the benefits of Java generics to ensure your collections are always type-safe. Without any doubt you know that "evens" is a List containing Integers and that "i" is an int, so you can sleep soundly knowing your code is correct. You can also see Java's powerful "for" loop at work here, to easily traverse from 1 to 10, inclusive. Finally, you saw how to effectively use Java's braces to organize code to clearly show blocks, and semi-colons ensure you always know where lines terminate.</p>

<p>I've just released <a href="https://github.com/sleberknight/rjava" onclick="alert('April Fools!'); return false;">RJava</a> on GitHub, so go check it out. Please download RJava today and give it a try and let me know what you think!</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/introducing_rjava.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/introducing_rjava.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">JRuby</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Ruby</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">ruby</category>
            
            <pubDate>Fri, 01 Apr 2011 00:00:00 -0500</pubDate>
        </item>
        
        <item>
            <title>Micro Benchmarking with Caliper</title>
            <description><![CDATA[ From time to time I think all developers have done some form of benchmarking.  I recently discovered <a href="http://code.google.com/p/caliper/" target="new">Caliper</a> which is according to the site -  "Caliper is Google's open-source framework for writing, running and viewing the results of Java Microbenchmarks".  I am aware that micro-benchmarking can be misleading depending on who is writing the tests, but sometimes they are very helpful in getting a feel for how your code is running.
<h3>Background</h3>
Micro benchmarks are dead simple to do.  Get the current time in milliseconds, execute your code, then get the time in milliseconds again and subtract the first reading from the second.  So why use a tool like Caliper?  To me Caliper is great because it has a very familiar JUnit type of structure and feel to it. Instead of trying to describe what Caliper does, it's probably easiest to look at a simple example I put together.  I decided to benchmark implementations of Heap Sort, Merge Sort, Quick Sort and the sort method of the java.util.Arrays class.  Not a terribly original idea I admit, but it was quick to do and it makes the point. <b>NOTE: Package statement and imports left out intentionally for brevity</b>
<pre class="prettyprint">
public class SortingBenchmarks extends SimpleBenchmark {
	
	private static final int SIZE = 100000;
	private static final int MAX_VALUE = 80000;
	private int[] values;

	@Override
	protected void setUp() throws Exception {
		values = new int[SIZE];
		Random generator = new Random();
		for (int i = 0; i < values.length; i++) {
			values[i] = generator.nextInt(MAX_VALUE);
		}
	}

	public void timeHeapSort(int reps) {
		for (int i = 0; i < reps; i++) {
			HeapSort.sort(values);
		}
	}

	public void timeMergeSort(int reps) {
		for (int i = 0; i < reps; i++) {
			MergeSort.sort(values);
		}
	}
	
	public void timeQuickSort(int reps) {
		for (int i = 0; i < reps; i++) {
			QuickSort.sort(values);
		}
	}

	public void timeArraysSort(int reps) {
		for (int i = 0; i < reps; i++) {
			Arrays.sort(values);
		}
	}
}
</pre>
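For comparison, the hand-rolled approach described in the Background section (read the clock, run the code, read the clock again) might look like the sketch below. The class name NaiveTimer and the helper timeMillis are made up for this illustration and are not part of Caliper; the array size and value range simply mirror the constants used in the benchmark class.

```java
import java.util.Arrays;
import java.util.Random;

public class NaiveTimer {

    // Returns the wall-clock milliseconds taken by one run of the task.
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Same size and value range as the SortingBenchmarks constants.
        final int[] values = new int[100000];
        Random generator = new Random();
        for (int i = 0; i < values.length; i++) {
            values[i] = generator.nextInt(80000);
        }

        long elapsed = timeMillis(new Runnable() {
            @Override
            public void run() {
                Arrays.sort(values);
            }
        });
        System.out.println("Arrays.sort took " + elapsed + " ms");
    }
}
```

A single run like this is measured at millisecond resolution and includes JIT warm-up, which is part of why a harness that repeats the work (the reps loop above) tends to give steadier numbers.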
<h3>Getting Started</h3>
Here are the basic steps to writing benchmarks in Caliper:
<ol>
<li>Extend the class SimpleBenchmark</li>
<li>Do any test setup/clean up in respective setUp or tearDown methods (Similar to JUnit setUp/tearDown)</li>
<li>Write the methods that will execute the code to benchmark starting with the word "time" (Again similar to JUnit, just "time" instead of "test")</li>
<li>Place the code you want to benchmark inside your timeSomeOperation methods</li>
</ol>
<h3>Getting and Running Caliper</h3>
To get started with Caliper:
<ol>
<li>Go <a href="http://code.google.com/p/caliper/source/checkout" target="new">here</a> to get a read-only svn link to the source code</li>
<li>Check out the code then cd into &lt;CALIPER_INSTALL_DIR&gt; and run ant (obviously you need ant installed and on your path)</li>
</ol>
To actually run your benchmarks there is a bash script included in the project that you can use to run Caliper from the command line.  I took a little different approach as I wanted to run from inside my Eclipse project.
<ol>
<li>In &lt;CALIPER_INSTALL_DIR&gt;/build/caliper-0.0/lib/ there are two jar files, allocation.jar and caliper-0.0.jar.  These will need to be on the classpath of your project.  I chose to create a user library in Eclipse</li>
<li>JUnit will also need to be on the project classpath.  Again for me it's referenced in a user library</li>
<li>I created a driver class that simply executes the main method of the com.google.caliper.Runner class.  I pass in the name of my benchmark class by setting up a run configuration in Eclipse for my driver class</li>
</ol>
Here is the code for my driver class:
<pre class="prettyprint">
package bbejeck.caliper;

public class CaliperRunner {

    public static void main(String[] args) {
        com.google.caliper.Runner.main(args[0]);
    }

}
</pre>
After running the driver this is the output received in the Eclipse console screen:
<br/>
<pre style="font-size:medium">
0% Scenario{vm=java, trial=0, benchmark=HeapSort} 10889428.57 ns; σ=69810.62 ns @ 3 trials
25% Scenario{vm=java, trial=0, benchmark=MergeSort} 9066618.18 ns; σ=44341.70 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=QuickSort} 3312312.93 ns; σ=21028.91 ns @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=ArraysSort} 3104668.79 ns; σ=23965.54 ns @ 3 trials

 benchmark    ms logarithmic runtime
  HeapSort 10.89 ==============================
 MergeSort  9.07 =========================
 QuickSort  3.31 ==
ArraysSort  3.10 =

vm: java
trial:
</pre>
<h3>Conclusion</h3>
Caliper is fairly new and still has some work to be done, but I think that it's a great tool that could be very useful. Another potentially useful feature is that Caliper can <a href="http://code.google.com/p/caliper/wiki/OnlineResults" target="new">automatically publish your benchmarks.</a>   Full source code for my examples can be found on <a href="https://github.com/bbejeck/CaliperBlog" target="new">github</a>.]]></description>
            <link>http://www.nearinfinity.com/blogs/bill_bejeck/mico_benchmarking_with_caliper.html</link>
            <guid>http://www.nearinfinity.com/blogs/bill_bejeck/mico_benchmarking_with_caliper.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">General</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Tue, 01 Feb 2011 00:07:14 -0500</pubDate>
        </item>
        
        <item>
            <title>JEE 6 and Spring MVC</title>
            <description><![CDATA[With the release of JEE 6 and the Servlet 3.0 specification came support for asynchronous servlets.  While continuations and Comet are not new, the fact that asynchronous processing is now part of the servlet specification, and could be "baked in" to an application, piqued my curiosity.  Although I have not used plain servlets in development for some time, I have been using Spring MVC.  So I wanted to see what would happen if I added asynchronous support to a Spring MVC DispatcherServlet.  I created a very simple web application using Spring version 3.0.2 and annotation configuration.  I would like to make clear this is purely an experiment and is not being used in production.
<h3>Environment Used</h3>
For this blog I'm using:
<ul>
<li>a T61 ThinkPad running Ubuntu 10.10 (64-bit) with 4 GB of RAM</li>
<li>Java JDK 1.6.0_21</li>
<li>Glassfish Version 3. (I tried Tomcat 7 and Jetty 8, but had the best luck at this point with Glassfish)</li>
</ul>
<h3>DispatcherServlet Code</h3>
My first step was to extend Spring's DispatcherServlet.
<pre class="prettyprint">@WebServlet(urlPatterns = {"/async/*"}, asyncSupported = true, name = "async")
public class AsyncDispatcherServlet extends DispatcherServlet {
    private ExecutorService executor;
    private static final int NUM_ASYNC_TASKS = 15;
    private static final long TIME_OUT = 10 * 1000;
    private final Log log = LogFactory.getLog(AsyncDispatcherServlet.class);

    @Override
    public void init(ServletConfig config) throws ServletException {
        super.init(config);
        executor = Executors.newFixedThreadPool(NUM_ASYNC_TASKS);
    }

    @Override
    public void destroy() {
        executor.shutdownNow();
        super.destroy();
    }

    @Override
    protected void doDispatch(final HttpServletRequest request, final HttpServletResponse response) throws Exception {
        final AsyncContext ac = request.startAsync(request, response);
        ac.setTimeout(TIME_OUT);
        FutureTask task = new FutureTask(new Runnable() {

            @Override
            public void run() {
                try {
                    log.debug("Dispatching request " + request);
                    AsyncDispatcherServlet.super.doDispatch(request,response );
                    log.debug("doDispatch returned from processing request " + request);
                    ac.complete();
                } catch (Exception ex) {
                    log.error("Error in async request", ex);
                }
            }
        }, null);

        ac.addListener(new AsyncDispatcherServletListener(task));
        executor.execute(task);
    }</pre>
The only methods overridden were init, destroy and doDispatch. I won't go into detail on init and destroy, since what they do is obvious: they create and shut down the thread pool.  All the interesting work is done in doDispatch.  The doDispatch method starts the asynchronous request, then wraps the call to super.doDispatch in a Runnable and passes that to an executor service. There are a few key points to consider here:
<ol>
        <li>The @WebServlet annotation at the class definition level.  This is part of the Servlet 3.0 specification: you can now declare servlets, filters, etc. via annotations, although you can still use web.xml.  To enable asynchronous support, set the 'asyncSupported' attribute to true.</li>
        <li>The call to ac.setTimeout(TIME_OUT) sets a timeout on the AsyncContext object. In this case the timeout is 10 seconds.</li>
        <li>The call to ac.addListener registers an AsyncListener, the AsyncDispatcherServletListener covered in the Listener Code section.</li>
        <li>The application server thread returns almost immediately.</li>
</ol>
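The last point, that the calling thread returns almost immediately, can be seen without a servlet container. The sketch below uses only java.util.concurrent; the class name HandOffSketch and the method handOffMillis are made up for this illustration, and the sleep merely stands in for the call to super.doDispatch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandOffSketch {

    // Hands slow work to the pool and returns how long (in milliseconds)
    // the calling thread was held up by the hand-off itself.
    static long handOffMillis(ExecutorService executor, final long workMillis) {
        long start = System.nanoTime();
        executor.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(workMillis); // stands in for super.doDispatch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        // Pool size mirrors NUM_ASYNC_TASKS in the servlet.
        ExecutorService executor = Executors.newFixedThreadPool(15);
        long blocked = handOffMillis(executor, 2000);
        System.out.println("Caller was blocked for about " + blocked + " ms");
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The caller should report a hold-up far shorter than the two seconds of simulated work, because the work runs on a pool thread rather than on the thread that accepted the request.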
<h3>Listener Code</h3>
Getting the request to run asynchronously is only half the battle.  The other half is setting up hooks to handle different events during the life-cycle of the asynchronous request.  The Servlet 3.0 spec added the AsyncListener interface.  AsyncListener has 4 methods, onStartAsync, onComplete, onError and onTimeout. For the AsyncDispatcherServlet we have the inner class AsyncDispatcherServletListener  that takes a FutureTask object as a constructor argument.
<pre class="prettyprint">private class AsyncDispatcherServletListener implements AsyncListener {

        private FutureTask futureTask;

        public AsyncDispatcherServletListener(FutureTask futureTask) {
            this.futureTask = futureTask;
        }

        @Override
        public void onTimeout(AsyncEvent event) throws IOException {
            log.warn("Async request did not complete, timeout occurred");
            handleTimeoutOrError(event, "Request timed out");
        }

        @Override
        public void onComplete(AsyncEvent event) throws IOException {
            log.debug("Completed async request");
        }

        @Override
        public void onError(AsyncEvent event) throws IOException {
            log.error("Error in async request", event.getThrowable());
            handleTimeoutOrError(event, "Error processing " + event.getThrowable().getMessage());
        }

        @Override
        public void onStartAsync(AsyncEvent event) throws IOException {
            log.debug("Async Event started..");
        }

        private void handleTimeoutOrError(AsyncEvent event, String message) {
            PrintWriter writer = null;
            try {
                // Interrupt the worker thread that is still processing the request
                futureTask.cancel(true);
                HttpServletResponse response = (HttpServletResponse) event.getAsyncContext().getResponse();
                //HttpServletRequest request = (HttpServletRequest) event.getAsyncContext().getRequest();
                //request.getRequestDispatcher("/app/error.htm").forward(request, response);
                writer = response.getWriter();
                writer.print(message);
                writer.flush();
            } catch (IOException ex) {
                log.error(ex);
            } finally {
                event.getAsyncContext().complete();
                if (writer != null) {
                    writer.close();
                }
            }
        }
    }</pre>
The onStartAsync and onComplete methods merely log a statement, but certainly could be used to open and close resources respectively.  The only methods that do any work are onTimeout and onError, delegating to the handleTimeoutOrError method, passing a message and the AsyncEvent object.  In handleTimeoutOrError we call cancel on the futureTask object, write the message to the response stream, then mark the asyncContext as completed.  While we are writing the error directly to the response stream, we could have just as easily forwarded to an error page by using the commented-out call to request.getRequestDispatcher().forward (in which case you would remove the three writer calls that follow it).
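The cancel step can also be tried outside a container. This is a minimal sketch using only java.util.concurrent, not the servlet code itself: the class name CancelSketch and the method cancelledAfterTimeout are made up for the illustration, and waiting on the task with a timeout plays the role that the container's onTimeout callback plays in the listener.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CancelSketch {

    // Runs work that sleeps for workMillis and waits up to timeoutMillis for it.
    // Returns true if the wait timed out and the task was cancelled,
    // false if the task finished in time.
    static boolean cancelledAfterTimeout(final long workMillis, long timeoutMillis)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        FutureTask<Void> task = new FutureTask<Void>(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(workMillis); // stands in for the slow request
                } catch (InterruptedException e) {
                    // cancel(true) interrupts the sleep; restore the flag and stop
                    Thread.currentThread().interrupt();
                }
            }
        }, null);
        executor.execute(task);
        try {
            task.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            // Same call the listener makes in handleTimeoutOrError
            return task.cancel(true);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("slow task cancelled: " + cancelledAfterTimeout(2000, 100));
        System.out.println("fast task cancelled: " + cancelledAfterTimeout(10, 2000));
    }
}
```

Passing true to cancel asks for the worker thread to be interrupted, which only helps if the work responds to interruption, as Thread.sleep does here and in the SearchController shown later.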
<h3>Web Application Structure</h3>
<p>
This web application is very simple and has only two controllers - SimpleViewController and SearchController.
<pre class="prettyprint">

@Controller
public class SimpleViewController {
    
    @RequestMapping({"/","/index.htm"})
    public String showHome(){
        return "index";
    }

    @RequestMapping({"/error.htm"})
    public String error(){
        return "error";
    }
}
</pre>

<pre class="prettyprint">
@Controller
public class SearchController {

    @RequestMapping("/search.htm")
    public String doSearch(@RequestParam(value = "latency", defaultValue = "2000") long latency,
                                        @RequestParam(value = "blowup", defaultValue = "false") boolean blowUp,
                                        Model model) throws Exception {

        String searchResult = getSearchResult(latency, blowUp);

        model.addAttribute("result", searchResult);
        return "searchResult";
    }

    @RequestMapping("/search.ajax")
    public void doSearchAjax(@RequestParam(value = "latency", defaultValue = "2000") long latency,
            @RequestParam(value = "blowup", defaultValue = "false") boolean blowUp,
            HttpServletResponse response) throws Exception {

        String searchResult = getSearchResult(latency, blowUp);
        
        PrintWriter writer = null;
        try {
            writer = response.getWriter();
            writer.print(searchResult);
            writer.flush();
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
    }

    private String getSearchResult(long latency, boolean blowUp) throws Exception {

        if (blowUp) {
            throw new RuntimeException("Bad error happened in controller");
        }

        Thread.sleep(latency);

        StringBuilder builder = new StringBuilder("Some search/whatever results being returned");
        Date now = new Date();
        builder.append(" @").append(now);
        
        return builder.toString();
    }
}
</pre>
The latency and blowup parameters in SearchController are used to simulate different response times and errors respectively.  Although there is the doSearchAjax method which writes directly to the response stream, in the tests that are run, we will only be using the doSearch method.  There are the usual context files, which are very light due to the annotation configuration and a web.xml file (needed for the regular DispatcherServlet). </p>
<p>

<h3>Testing</h3>
 Now it's time to see if this experiment works at all.   JMeter is a great tool and it is what I used to load test our simple web application.  I have set up three tests.
<ol>
<li>A "control" test - There are two thread-groups of 50 threads each, and each ramps up to run all of its threads in 3 seconds.  One thread-group will make requests to /app/index.htm and the other thread-group will make requests to /app/search.htm.  The thread-groups will execute simultaneously and loop 3 times for a total of 300 requests.  Each thread-group has a "listener" attached to it to measure throughput, and there is a listener attached to the test to measure overall throughput. This test will give us our baseline.  The requests to /app/search.htm will not set any parameters, so each request will have the default value of 2 seconds for latency.
</li>
<li>The "asynchronous" test - This test will measure the effect of using asynchronous servlets in the application.  Setup is identical to the control test above with one exception - the search requests will go to /async/search.htm and hit the AsyncDispatcherServlet.
</li>
<li>An error condition test - This test will be structured a little differently.  The thread-group for /app/index.htm has a longer ramp up time, but will remain the same otherwise.  The thread group for /async/search.htm will add a JMeter option known as a 'RandomController'.  There will be 3 possible search requests sent, a valid request, a request with the latency parameter set to 12 seconds causing a timeout and a request with the blowup parameter set to true, so a RuntimeException will be thrown.
</li>
</ol>
<h3>Test Results</h3>
<table cellpadding="3" cellspacing="10" width="100%">
<tr>
<td>
<table cellpadding="3" cellspacing="10" width="100%">
<tr>
<th colspan="2" align="center">Control Test</th>
</tr>
<tr>
<th>Request</th>
<th>Throughput</th>
</tr>
<tr>
<td>Overall</td>
<td>296.174/minute</td>
</tr>
<tr>
<td>Index</td>
<td>164.489/minute</td>
</tr>
<tr>
<td>Search</td>
<td>148.569/minute</td>
</tr>
</table>
</td>
<td>
<table cellpadding="3" cellspacing="10" width="100%">
<tr>
<th align="center" colspan="2">Asynchronous Test</th>
</tr>
<tr>
  <th>Request</th><th>Throughput</th>
</tr>
<tr>
   <td>Overall</td><td>849.177/minute</td>
</tr>
<tr>
   <td>Index</td><td>2,909.796/minute</td>
</tr>
<tr>
   <td>Search</td><td>429.84/minute</td>
</tr>
</table>
</td>
<td>
<table cellpadding="3" cellspacing="10" width="100%">
<tr>
<th colspan="2" align="center">Error Test</th>
</tr>
<tr>
  <th>Request</th><th>Throughput</th>
</tr>
<tr>
   <td>Overall</td><td>115.848/minute</td>
</tr>
<tr>
   <td>Index</td><td>2,803.738/minute</td>
</tr>
<tr>
   <td>Search</td><td>57.974/minute</td>
</tr>
</table>
</td>
</tr>
</table>
<br/>
<p>
As we can see from the test results, sending the search requests through the AsyncDispatcherServlet increased application throughput.  The requests-per-minute numbers don't mean that much though, given that the web application was so simple and the test was very contrived. What matters more is that the index requests had roughly the same response time and were seemingly unaffected when asynchronous support was used for the search requests.</p>
<h3>Summary</h3>
<p>For me there were two main takeaways from this experiment:
<ul>
<li>Even though asynchronous support seemed to help with throughput, it is still using a thread pool which consumes server resources, so it should only be applied to very select parts of an application.</li>
<li>By setting timeouts and getting a chance to handle them gracefully via the event listener, asynchronous support acts as a "circuit breaker" of sorts.  This could be valuable when your application makes requests to outside resources that may be down or otherwise unresponsive.</li>
</ul>
</p>
<h3>Resources</h3>
<p>Source for everything is <a href="http://github.com/bbejeck/spring_servlet3" target="new">available on github</a>.<br/> 
To run the JMeter tests<ol>
<li><a href="http://jakarta.apache.org/site/downloads/downloads_jmeter.cgi" target="new">Download JMeter</a> and extract the tar/zip file to some directory</li>
<li>Copy all of the *.jmx files in the jmeter directory from the github site for the code into &lt;JMeter install&gt;/bin. From the bin directory run jmeter or jmeter.bat depending on your platform.  Once JMeter is up and running select File and you should see AsyncWebTestControl.jmx, AsyncWebTestErrors.jmx, AsyncWebTest.jmx in the File menu.   Just click on one of those to open then Ctrl+r to run a test</li>
<li>Download the war file and deploy to glassfish.  I placed the war file in the autodeploy directory in glassfish.  On my laptop it's in /usr/local/servers/glassfishv3/glassfish/domains/domain1/autodeploy.</li>
</ol>
</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/bill_bejeck/jee_6_and_spring_mvc.html</link>
            <guid>http://www.nearinfinity.com/blogs/bill_bejeck/jee_6_and_spring_mvc.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Web Development</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Java Spring</category>
            
            <pubDate>Thu, 09 Dec 2010 00:12:30 -0500</pubDate>
        </item>
        
        <item>
            <title>Android Development Bootcamps coming in January and February!</title>
            <description><![CDATA[            <p>We have the droids you're looking for! <br /></p><p>NIC-U is happy to announce that
we will be hosting two Android Development Bootcamp classes this
winter. These courses will be held January 10-14 and February 28-March
4 at the NIC-U Training Center.</p>

<p>The Android™ platform is a world-wide phenomenon. Thousands of new
apps are being published in Android Market every week. Take advantage
of this new and exciting smartphone platform by learning how to create
Android applications at one of our Android Development Bootcamps!</p>

<p>These classes will immerse students in the Android platform. Through a
combination of lecture instruction and hands-on labs, attendees will
come away from the class ready to begin writing sophisticated Android
applications.</p>

<p>Both bootcamps will be taught by Mark Murphy, founder of
<a href="http://commonsware.com/" target="_blank">CommonsWare</a> and the author of <a href="http://www.amazon.com/Beginning-Android-2-Mark-Murphy/dp/1430226293" target="_blank">Beginning
Android 2</a> and many other books on Android application
development. A polished speaker and trainer, Mr. Murphy has delivered conference
presentations and training sessions on a wide array of topics
internationally.</p>

<p>These classes are guaranteed to run, and are both already half full.
The early bird price only lasts until Nov 30. So, don't miss this
great opportunity to learn Android from a true expert!</p>

<p>To learn more and to register please visit the registration pages for
the <a href="http://www.eventbrite.com/event/992684145" target="_blank">January</a> and <a href="http://www.eventbrite.com/event/999005051" target="_blank">February/March</a> classes.</p>
        
Be sure to check out our <a href="https://www.nearinfinity.com/trainingcenter/coursecatalog/upcoming/">Upcoming NIC-U Training Courses</a> page for future courses. ]]></description>
            <link>http://www.nearinfinity.com/blogs/gray_herter/android_development_bootcamps.html</link>
            <guid>http://www.nearinfinity.com/blogs/gray_herter/android_development_bootcamps.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Android</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">iphone</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">smartphone</category>
            
            <pubDate>Mon, 15 Nov 2010 09:57:44 -0500</pubDate>
        </item>
        
        <item>
            <title>Apache Cassandra Training At NIC-U</title>
            <description><![CDATA[ <p>If you are interested in learning about the <a href="http://cassandra.apache.org/">Apache Cassandra</a> distributed database, which is used by companies like <a href="http://digg.com/">Digg</a>, <a href="http://www.facebook.com/">Facebook</a>, and <a href="http://twitter.com/">Twitter</a>, then you are in luck. <a href="http://www.riptano.com/">Riptano</a> is giving a one-day training course at our cool new training facility. <a href="https://www.nearinfinity.com/trainingcenter/coursecatalog/upcoming/riptano_to_offer_apache_cassandra_training_at_near_infinity.html">Here</a> you can find information about the course or you can go direct to the <a href="http://www.eventbrite.com/event/900402127">registration</a> page.</p>

<p>Be sure to check back at our <a href="https://www.nearinfinity.com/trainingcenter/coursecatalog/upcoming/">NIC-U Upcoming Training Courses</a> page for upcoming courses.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/apache_cassandra_training_at_n.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/apache_cassandra_training_at_n.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Database</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Persistence</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">apache</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">cassandra</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">database</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">NIC-U</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">nosql</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">riptano</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">training</category>
            
            <pubDate>Sat, 30 Oct 2010 12:04:36 -0500</pubDate>
        </item>
        
    </channel>
</rss>

