<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Java - Blogs at Near Infinity</title>


        <link>http://www.nearinfinity.com/blogs/</link>
        <description>Employee Blogs</description>
        <language>en</language>
        <copyright>Copyright 2010</copyright>
        <lastBuildDate>Sat, 27 Feb 2010 00:44:19 -0500</lastBuildDate>
        <generator>http://www.sixapart.com/movabletype/</generator>
        <docs>http://www.rssboard.org/rss-specification</docs>
        
        <item>
            <title>Learning ANTLR part I</title>
            <description><![CDATA[<div style="font-size: 10pt;">This year one of my goals is to become proficient in ANTLR.  I think that learning to translate text or build an external DSL is a skill that, although not used every day, will be very useful to know. For my first attempt I settled on something fairly easy: a SQL-like grammar that can be used to search for files and the content within those files.  You should also be able to narrow the search results based on when a file was last modified.   My goal is to take something like the following:
<pre class="prettyprint">select * from /logs where file='*.out' and pattern='foobar' and modified &lt; 2 days ago
select * from /logs where file='*.out' and pattern='foobar' and modified between 20 and 30 minutes ago
</pre>
and translate it to the corresponding find command and pipe the results to xargs and grep:
<pre class="prettyprint">find /logs -name '*.out' -mtime -2 | xargs grep 'foobar'
find /logs -name '*.out' -mmin +20 -mmin -30 | xargs grep 'foobar'
</pre>
As an aside, if you are not familiar with xargs, check out <a href="http://www.cyberciti.biz/faq/linux-unix-bsd-xargs-construct-argument-lists-utility/" target="new">this xargs tutorial</a> or the <a href="http://unixhelp.ed.ac.uk/CGI/man-cgi?xargs" target="new">xargs man pages</a>. It's a great utility that builds and executes a command using the output of a previous command as its arguments.
<h4>Disclaimer</h4>
Now before the villagers gather up with torches and pitchforks to run me out of town (I'm channeling <a href="http://en.wikipedia.org/wiki/Young_Frankenstein" target="new"> Young Frankenstein</a> here), I would like to make somewhat of a disclaimer.  I am not suggesting a new language or discouraging learning the *nix command line tools.  The point here is to learn ANTLR.  I found it more interesting to translate something I use every day on my current project than to work through the "Hello World" ANTLR examples I have seen.  So other than using this grammar as a learning exercise, I don't see it as being useful.
<h4>Introduction</h4>
ANTLR is a deep topic, so one blog post cannot go into any great detail.   What follows is not in-depth coverage of ANTLR, but a detailed description of the grammar I developed.  I will explain each section as well as some of the decisions and trade-offs I made.  For my development environment I'm using:
<ol>
	<li>Eclipse 3.5.1</li>
	<li>Java 6</li>
	<li>The <a href="http://antlrv3ide.sourceforge.net/" target="new">ANTLR IDE </a>plugin for Eclipse.  You could also use <a href="http://www.antlr.org/works/index.html" target="new">ANTLRWorks</a>, the GUI development environment for ANTLR.  ANTLRWorks is an excellent tool; I just felt more comfortable doing this work in Eclipse.</li>
	<li> ANTLR version 3.2</li>
	<li> Mac OS X 10.6.2.</li>
</ol>
So with all of that out of the way,  let's get started looking at the grammar.
<h3>options, @header</h3>
<pre class="prettyprint">grammar FQL;
options {
     language = Java;
}
@header {
     package bbejeck.antlr.fql;
}
</pre>
Here I am specifying a combined grammar named FQL.  (FQL is short for File Query Language and yes, I know the name sucks.)
In options I'm specifying that I want the generated code to be Java.  I could have also specified C, C++ or Python here.  ANTLR also has support for generating code in Ruby, but with the version I am using (v 3.2) I could not get it to work.  I did find <a href="http://rubyforge.org/projects/antlr3/" target="new">ANTLR Ruby</a>.  I have not tried it out, but from the documentation it looks promising.  The @header action sets the package for the generated parser code.  This is also where I would have specified any needed imports.
<h3>@members</h3>

The @members section is where you place instance variables and methods that will be placed and used in the generated parser.  Most likely the code in the members section will be used in embedded actions in the parser rules.
<pre class="prettyprint"> @members {
  private StringBuilder findBuilder = new StringBuilder("find ");
  
  private StringBuilder filter = new StringBuilder();
  
  private void addString(String s){
    if(s!=null){
        findBuilder.append(s);
     }
  }
  
  private String buildTimeArg(String s, String snum, String sign){
       StringBuilder timeBuilder = new StringBuilder();
       int num = Integer.parseInt(snum);
       
       if(s.equals("days")){
           return timeBuilder.append(" -mtime ").append(sign).append(num).toString();
       }
       if(s.equals("hours")){
           return timeBuilder.append(" -mmin ").append(sign).append((num*60)).toString();
       }
       
       return timeBuilder.append(" -mmin ").append(sign).append(num).toString();
  }
  
  protected void mismatch(IntStream input, int ttype, BitSet follow) throws RecognitionException{
        throw new MismatchedTokenException(ttype,input);
  }
  
  public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow) throws RecognitionException{
     throw e;
  }
  
}
</pre>
The two StringBuilders <i>findBuilder</i> and <i>filter</i> will be used by embedded actions to build up our translated query.   The reason for two StringBuilders will be explained when we cover the parsing rules.  The <i>addString</i> method is to check for optional tokens that could be null.  I could have easily checked for null in the embedded code within each rule,  but I felt it cluttered the grammar too much.  The <i>buildTimeArg</i> method is used as sort of a poor man's symbol table to translate the <i>modified</i> clause to the proper time format for the <i>mmin</i> or <i>mtime</i> arguments.  
The final two methods override how the generated parser responds to recognition errors (the generated parser extends ANTLR's Parser class, which in turn extends the BaseRecognizer class).  By default ANTLR will recover from recognition errors and continue on, trying to read more tokens if available.   But in this grammar, if there is a recognition error along the way I want to stop processing right there.  

<h3>@rulecatch</h3>
Each parser rule is converted into a method call in the generated parser with a try - catch block surrounding the parsing code.  The catch statement here will be embedded in each one of the try-catch blocks in the parser.  
<pre class="prettyprint">@rulecatch{
    catch (RecognitionException e){
            throw e;
      }
}</pre>
If you remember from the previous section, we want to stop parsing when a RecognitionException is encountered, so we re-throw the caught exception.
<h3>@lexer::header</h3>
Here we are specifying the package for the generated lexer.
<pre class="prettyprint">@lexer::header {
  package bbejeck.antlr.fql;
}
</pre>

Now let's move on to the parsing rules.

<h3>Parsing Rules</h3>
<pre class="prettyprint">evaluate returns [String query]
      :  query';' {$query = findBuilder.toString() + filter.toString() ;}
      ;

query
       :   select_stmt where_stmt
       ;

select_stmt
      :  'select' '*' 'from' directory
      ;
</pre>
Here <i>evaluate</i> is our top level rule and returns a String, translated and built as the input is parsed.  Anything within the curly braces is code that will be embedded in the generated parser.  Note how we reference query from the grammar by placing a '$' before the word 'query'.  Also note that the string returned is a concatenation from the two StringBuilders we declared in the @members section.  The <i>query</i> rule is comprised of a <i>select_stmt</i> followed by a <i>where_stmt</i>.  The <i>select_stmt</i> is "select * from" followed by the directory rule.
<pre class="prettyprint">directory
       : (p='.'{addString($p.text);} | (p='/'?{addString($p.text);}IDENT{addString($IDENT.text);})+ )
       ;
</pre>
The directory rule accepts either a '.' or a relative or absolute path.  If the '.' alternative is not used, there must be at least one path segment, as denoted by the '+'.  The variable 'p' gives a handle to the '.' or '/' token so its text can be extracted.  IDENT is a lexer rule which will be explained a little bit later.  All tokens here are passed into the <i>addString</i> method defined in the members section.
<pre class="prettyprint">where_stmt
       :  ('where'  clause ('and' clause)* ) ?
       ;
clause
       : file_name
       | pattern
       | modified
       ;
</pre>
The <i>where_stmt</i> rule expects the string 'where' followed by one or more clauses joined by 'and'.  The entire <i>where_stmt</i> is also optional.  Here I chose form over substance.  By that I mean the grammar as it stands will allow combinations of clauses that do not make sense, e.g. multiple file_name arguments.  I could have specified an exact order of clauses, which would also have limited the number of clauses entered, but I would rather the grammar be flexible and trust that the user knows what they want to do.
<pre class="prettyprint">  
file_name
       : 'file'  '=' STRING_LITERAL
         {addString(" -name ");addString($STRING_LITERAL.text);}
       ;

pattern
       :   'pattern'  '=' STRING_LITERAL
             { filter.append(" | xargs grep  ").append($STRING_LITERAL.text); }
       ;
</pre>
The <i>file_name</i> rule sets the -name argument, again using the <i>addString</i> method.  The lexer rule STRING_LITERAL will accept whatever the user places between single quotes.  The <i>pattern</i> rule builds up the grep command.  Here we see the use of the second StringBuilder <i>filter</i> that was defined in the @members section.  I feel that having a second StringBuilder to capture text for the grep filter is a hack.   The issue is that the <i>grep</i> command needs to be last in our translated query, but I really want the clauses of the where statement to be allowed in any order.  So by placing the tokens captured by the <i>pattern</i> rule in a separate StringBuilder I can easily guarantee the <i>grep</i> statement will be last.  
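To see why the second buffer guarantees the ordering, here is a minimal stand-alone sketch outside of ANTLR.  The class name, the <i>translate</i> method and the way clauses are passed in as string pairs are made up for illustration; only the two-StringBuilder idea comes from the grammar above.

```java
public class TwoBuilderSketch {

    // Hypothetical helper: each clause is a {kind, value} pair, in whatever
    // order the user typed it. Only "file" and "pattern" are handled here.
    static String translate(String directory, String[][] clauses) {
        StringBuilder findBuilder = new StringBuilder("find ").append(directory);
        StringBuilder filter = new StringBuilder();
        for (String[] clause : clauses) {
            String kind = clause[0];
            String value = clause[1];
            if (kind.equals("file")) {
                // find arguments accumulate in the first buffer
                findBuilder.append(" -name ").append(value);
            } else if (kind.equals("pattern")) {
                // the grep stage is held back in the second buffer
                filter.append(" | xargs grep ").append(value);
            }
        }
        // the filter is appended last, no matter where the pattern clause appeared
        return findBuilder.toString() + filter.toString();
    }

    public static void main(String[] args) {
        String[][] patternFirst = { {"pattern", "'foobar'"}, {"file", "'*.out'"} };
        // prints: find /logs -name '*.out' | xargs grep 'foobar'
        System.out.println(translate("/logs", patternFirst));
    }
}
```

Whether the pattern clause comes first or last in the input, the grep stage ends up at the end of the command.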
<pre class="prettyprint">modified
       :  modified_less
       |  modified_more
       |  modified_between
       ;
</pre>
The modified rule has three options.  This portion builds the mmin/mtime argument(s) for the <i>find</i> command.
<pre class="prettyprint">   
modified_less
       :   'modified'  '&lt;'  INTEGER time_span                             
           { addString(buildTimeArg($time_span.text,$INTEGER.text,"-")); }                     
       ; 
  
modified_more                     
       :   'modified'  '&gt;' INTEGER time_span
           { addString(buildTimeArg($time_span.text,$INTEGER.text,"+")); }
       ;

modified_between
       :   'modified' 'between' int1=INTEGER 'and' int2=INTEGER time_span
            { addString(buildTimeArg($time_span.text,$int1.text,"+")); }
            { addString(buildTimeArg($time_span.text,$int2.text,"-")); }
       ;
</pre>
The grammar allows you to specify searching by the time a file was last modified.  Here we use the method <i>buildTimeArg</i> to translate the input to the correct argument for either <i>mmin</i> (minutes modified) or <i>mtime</i> (days modified). Also take note of setting the two variables <i>int1</i> and <i>int2</i>.  Those are used to disambiguate which INTEGER token to use.
<pre class="prettyprint">time_span
       :   'days'
       |   'minutes'
       |   'hours'
       ;
</pre>
The time_span rule allows input of days, minutes or hours.  The hours argument is converted into minutes by the <i>buildTimeArg</i> method.
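As a quick check of the translation, the logic of <i>buildTimeArg</i> can be pulled out of the grammar and run on its own.  This is only a sketch: the class name and the static wrapper are hypothetical, and the method body is condensed from the one in the @members section.

```java
public class TimeArgSketch {

    // Same mapping as buildTimeArg in the @members section, condensed:
    // days become -mtime, hours are converted to minutes, minutes pass through.
    static String buildTimeArg(String span, String snum, String sign) {
        int num = Integer.parseInt(snum);
        if (span.equals("days")) {
            return " -mtime " + sign + num;
        }
        if (span.equals("hours")) {
            return " -mmin " + sign + (num * 60);
        }
        return " -mmin " + sign + num;
    }

    public static void main(String[] args) {
        // modified < 2 days
        System.out.println(buildTimeArg("days", "2", "-"));       // prints " -mtime -2"
        // modified > 3 hours
        System.out.println(buildTimeArg("hours", "3", "+"));      // prints " -mmin +180"
        // modified between 20 and 30 minutes
        System.out.println(buildTimeArg("minutes", "20", "+")
                + buildTimeArg("minutes", "30", "-"));            // prints " -mmin +20 -mmin -30"
    }
}
```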

That's it for the parsing rules, now on to the lexer rules.
<h3>Lexer Rules</h3>
<pre class="prettyprint">fragment DIGIT : '0'..'9';
fragment LETTER : 'a'..'z'|'A'..'Z' ;

STRING_LITERAL : '\''.*'\'';
INTEGER : DIGIT+ ;
IDENT : LETTER(LETTER | DIGIT)* ;
WS : (' ' | '\t' | '\n' | '\r' | '\f')+  {$channel=HIDDEN;};
</pre>
DIGIT and LETTER are fragment rules: they never produce tokens on their own and can only be referenced from other lexer rules.  They are used here to make the grammar more readable.  In the WS definition the {$channel=HIDDEN;} action sends whitespace to the hidden channel, so the parser never sees it.
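For readers who think in regular expressions, the lexer rules above can be roughly approximated with java.util.regex.  This is not what ANTLR generates; it is a sketch under a few assumptions: the class name is made up, the wildcard in STRING_LITERAL is assumed to stop at the first closing quote, and keywords and operators that ANTLR would turn into their own literal tokens are simply reported as IDENT or OTHER here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerSketch {

    // Group 1: STRING_LITERAL, 2: INTEGER, 3: IDENT, 4: WS, 5: any other single character
    private static final Pattern TOKEN = Pattern.compile(
            "('.*?')|([0-9]+)|([a-zA-Z][a-zA-Z0-9]*)|([ \\t\\n\\r\\f]+)|(.)");

    static String tokenize(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(4) != null) {
                continue; // whitespace is skipped, like the HIDDEN channel
            }
            String type = m.group(1) != null ? "STRING_LITERAL"
                        : m.group(2) != null ? "INTEGER"
                        : m.group(3) != null ? "IDENT"
                        : "OTHER";
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(type).append('(').append(m.group()).append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: IDENT(file) OTHER(=) STRING_LITERAL('*.out') INTEGER(20)
        System.out.println(tokenize("file = '*.out' 20"));
    }
}
```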

<h3>Test Code</h3>
I used the following code to test the grammar from the command line:
<pre class="prettyprint">// Assumed to live in the same package as the generated lexer and parser
package bbejeck.antlr.fql;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.TokenStream;

public class FQLTester {

    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        String line = null;
        System.out.println("Enter your search:");
        while ((line = reader.readLine()) != null) {
            if (line.equalsIgnoreCase("quit")) {
                System.exit(0);
            }
            // Feed the typed query through the generated lexer and parser
            CharStream charstream = new ANTLRStringStream(line);
            FQLLexer lexer = new FQLLexer(charstream);

            TokenStream tokenStream = new CommonTokenStream(lexer);
            FQLParser parser = new FQLParser(tokenStream);

            String parsed = null;
            try {
                parsed = parser.evaluate();
                System.out.println("parsed query is [" + parsed + "]");
                // Run the translated command through a shell so the pipe works
                Process process = Runtime.getRuntime().exec(new String[]{"sh", "-c", parsed});
                InputStream input = process.getInputStream();
                BufferedReader procReader = new BufferedReader(new InputStreamReader(input));
                String searchResults = null;
                while ((searchResults = procReader.readLine()) != null) {
                    System.out.println(searchResults);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.out.println("Enter your search:");
        }
    }
}
</pre>

Since this post just scratches the surface of ANTLR's capabilities, I plan to write more about ANTLR in the near future.  Full source code for everything presented is <a href="http://github.com/bbejeck/antlr_code" target="new">available here</a>.
More resources for learning ANTLR are:
<ul>
	<li><a href="http://javadude.com/articles/antlr3xtut/index.html" target="new">Scott Stanchfield's video tutorial on ANTLR</a></li>
	<li><a href="http://www.pragprog.com/titles/tpantlr/the-definitive-antlr-reference" target="new">Definitive Guide to ANTLR, Pragmatic Books</a></li>
</ul>

That's it for now, thanks for your time.</div>]]></description>
            <link>http://www.nearinfinity.com/blogs/bill_bejeck/learning_antlr_part_i.html</link>
            <guid>http://www.nearinfinity.com/blogs/bill_bejeck/learning_antlr_part_i.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">General</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
            <pubDate>Sat, 27 Feb 2010 00:44:19 -0500</pubDate>
        </item>
        
        <item>
            <title>Low Memory Patch For Lucene</title>
            <description><![CDATA[While I'm working on getting my code into Lucene, I have patched 3.0.0 and 2.9.1 with my <a href="http://www.nearinfinity.com/blogs/aaron_mccurry/my_first_lucene_patch.html">low memory patch</a>.<div><br /></div><div>By default the small memory footprint is enabled. &nbsp;To change back to the default implementation, set the following system property:</div><div><br /></div><div>-Dorg.apache.lucene.index.TermInfosReader=default<div><br /></div><div>Have fun! &nbsp;If you have any problems or questions, please let me know or comment on <a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a>.</div><div><br /></div><div><a href="http://www.nearinfinity.com/blogs/assets/amccurry/lucene-core-3.0.0-nic.jar">lucene-core-3.0.0-nic.jar</a></div><div><br /></div><div><a href="http://www.nearinfinity.com/blogs/assets/amccurry/lucene-core-2.9.1-nic.jar">lucene-core-2.9.1-nic.jar</a></div><div><br /></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/low_memory_patch_for_lucene.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/low_memory_patch_for_lucene.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
            
            <pubDate>Tue, 19 Jan 2010 21:25:35 -0500</pubDate>
        </item>
        
        <item>
            <title>Lucene Hit Paging Made Easier</title>
            <description><![CDATA[Lucene gives users the ability to search massive amounts of data in a very short amount of time. &nbsp;However, allowing users to page through the entire result set of their search can be difficult and risky, depending on how many users are performing searches and how many of those users are paging through hundreds if not thousands of hits.<div><br /></div><div>Problem Scenario:</div><div><br /></div><div><ul><li>Each of your indexes contains hundreds of millions of documents.</li><li>You have 500 users on your system&nbsp;actively&nbsp;performing searches.</li><li>You have 100 search results per page.</li><li>And your typical user pages through the first 10 pages of results. &nbsp;(A normal&nbsp;occurrence&nbsp;on some systems.)</li></ul></div><div><br /></div><div>So for the 10th page you will have to collect 1,000 hits, at a cost of a float plus an int plus some object overhead per hit. &nbsp;Let's say 20 bytes per hit. &nbsp;So you have 500 users * 1,000 hits * 20 bytes = 10,000,000 bytes, or 10 MB. &nbsp;Easy, no problem, right?</div><div><br /></div><div>Well, what if you also give the users an easy way to move to the end of the result set? &nbsp;Hmm... &nbsp;For a result set size of 10,000 it's no big deal. &nbsp;But what if you hand out result sets on the order of 1,000,000 or even 10,000,000 hits?</div><div><br /></div><div>At this point you really just want to prevent the system from running out of memory. &nbsp;If you have 25 users getting 10,000,000 results each and they all click last page at the same time, that's going to cost you 5 GB of heap (25 * 10,000,000 * 20 bytes)! &nbsp;At least. &nbsp;Some might say that it won't ever happen, but in my experience, if it can happen, it will.</div><div><br /></div><div>So I created a Paging Hit Collector that windows the hits to the users. &nbsp;It uses the last hit collected from the previous search pass to feed the next search pass. 
&nbsp;So yes, if a user clicks the last page it might perform multiple searches, but the system won't run out of memory.</div><div><br /></div><div>The users will get their answer&nbsp;eventually, and if your system gives them some feedback as it searches and pages, they will probably sit and wait for it to come back, instead of giving up and hitting cancel and search and cancel and search, and making the system worse and worse.</div><div><br /></div><b>

The Simple Example:</b><div><b><br /></b><div><div>
<pre class="prettyprint">IndexSearcher searcher = new IndexSearcher(reader);<br />
TermQuery query = new TermQuery(new Term("f1", "value"));
IterablePaging paging = new IterablePaging(searcher, query, 100);<br />
for (ScoreDoc sd : paging.skipTo(90)) {
&nbsp;&nbsp;System.out.println("doc id [" + sd.doc + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"score [" + sd.score + "]");
}
</pre>
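As a rough illustration of the windowing idea described earlier (each search pass keeps only one window of hits in memory and starts after the last hit of the previous pass), here is a Lucene-free sketch.  Everything in it is an assumption made for illustration: the class and method names are made up, scores live in a plain array indexed by doc id, and ties are broken by doc id.  It is not the IterablePaging implementation from the patch.

```java
import java.util.Arrays;

public class WindowedPagingSketch {

    // True if doc a ranks after doc b: lower score, or equal score and higher doc id.
    static boolean ranksAfter(float[] scores, int a, int b) {
        return scores[a] < scores[b] || (scores[a] == scores[b] && a > b);
    }

    // One search pass: scans every doc but keeps at most `window` hits,
    // only considering docs that rank strictly after `afterDoc` (-1 means start at the top).
    static int[] searchPass(float[] scores, int afterDoc, int window) {
        int[] top = new int[window];
        int size = 0;
        for (int doc = 0; doc < scores.length; doc++) {
            if (afterDoc >= 0 && !ranksAfter(scores, doc, afterDoc)) {
                continue; // already handed out on an earlier page
            }
            // find the insertion point in the small sorted window
            int pos = size;
            while (pos > 0 && ranksAfter(scores, top[pos - 1], doc)) {
                pos--;
            }
            if (pos >= window) {
                continue; // ranks below everything in a full window
            }
            // shift lower-ranked hits down, dropping the last one if the window is full
            int end = Math.min(size, window - 1);
            for (int i = end; i > pos; i--) {
                top[i] = top[i - 1];
            }
            top[pos] = doc;
            if (size < window) {
                size++;
            }
        }
        return Arrays.copyOf(top, size);
    }

    public static void main(String[] args) {
        float[] scores = {0.5f, 0.9f, 0.1f, 0.7f, 0.9f, 0.3f};
        int after = -1;
        int page = 1;
        while (true) {
            int[] hits = searchPass(scores, after, 2);
            if (hits.length == 0) {
                break;
            }
            System.out.println("page " + page + ": " + Arrays.toString(hits));
            after = hits[hits.length - 1]; // the last hit feeds the next pass
            page++;
        }
    }
}
```

Memory per user stays proportional to the window size rather than to the position in the result set, at the price of one extra search pass per window.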
</div></div><b><div><b><br /></b></div>

The More Advanced Example:</b></div><div><b><br /></b><div><div>
<pre class="prettyprint">IndexSearcher searcher = new IndexSearcher(reader);<br />
TotalHitsRef totalHitsRef = new TotalHitsRef();
ProgressRef progressRef = new ProgressRef();<br />
TermQuery query = new TermQuery(new Term("f1", "value"));
IterablePaging paging = new IterablePaging(searcher, query, 100);<br />
for (ScoreDoc sd : paging.skipTo(90).
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;gather(20).
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;totalHits(totalHitsRef).
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;progress(progressRef)) {<br />
&nbsp;&nbsp;System.out.println("time [" + progressRef.queryTime() + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"total hits [" + totalHitsRef.totalHits() + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"searches [" + progressRef.searchesPerformed() + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"position [" + progressRef.currentHitPosition() + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"doc id [" + sd.doc + "] " + 
&nbsp;&nbsp;&nbsp;&nbsp;"score [" + sd.score + "]");
}
</pre>
</div></div><div><br /></div><div>Here's a link to the code <a href="https://issues.apache.org/jira/browse/LUCENE-2215">LUCENE-2215</a>.</div><div><br /></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/making_lucene_hit_results_pagi.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/making_lucene_hit_results_pagi.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
            
            <pubDate>Mon, 18 Jan 2010 16:16:59 -0500</pubDate>
        </item>
        
        <item>
            <title>My First Lucene Patch - Making Lucene Do More With Less</title>
            <description><![CDATA[I've been using Lucene for the better part of 2 years, from initial playing around, to prototyping, to production application. &nbsp;It's an impressive library and it has come a long way in the past couple of years.<div><br /></div><div>When I first started playing around with it the version was 2.1, and the search times were so much faster than what we were trying to use at the time (Oracle Text). &nbsp;The first test was indexing a monster dataset and searching it quickly. &nbsp;It passed with flying colors!</div><div><br /></div><div>Next was to add in record level access control. &nbsp;<a href="http://java.dzone.com/articles/how-implement-row-level-access">Easy</a>&nbsp;and extremely fast.</div><div><br /></div><div>Next was to add in all the other data needed for our application. &nbsp;That was a little bit harder, considering that we have close to 150 fields in our index and are well into the billion record range (growing every day).</div><div><br /></div><div>The problem was that we needed more memory and there was no extra money for any more servers (or upgrades). &nbsp;So there we were, stuck. &nbsp;So I decided to start poking around using <a href="https://visualvm.dev.java.net/">visualvm</a>&nbsp;to see if there were any places in our application or in Lucene to save some memory.</div><div><br /></div><div>We had already disabled norms on all our fields (we really didn't need norms for our data, nor did we have the resources). &nbsp;I took a long look at all the fields we were indexing to see if there were any we didn't need, but we really did need them all. &nbsp;Then I stumbled across the TermInfosReader class in Lucene.</div><div><br /></div><div>This is where Lucene really gets its speed, but it also uses quite a bit of memory to do it. 
&nbsp;And this is where I wrote my first Lucene patch.</div><div><br /></div><div>In TermInfosReader there is a bunch of stuff, but the big memory hogs are three arrays.</div><div><br /></div><div><ul><li>Terms[]</li><li>TermInfos[]</li><li>long[]</li></ul></div><div>Basically Lucene does a binary search across the Terms array (which by default contains every 128th Term in the index) with a given Term to find where on disk the exact Term needed lives. &nbsp;There's a little bit more going on in the class than that, but that's basically what it's doing.</div><div><br /></div><div>So, I started this patch with the need to save memory. &nbsp;How in the world do you do that in Java when everything is already in basic arrays and everything is needed in memory? &nbsp;Well, you have to save it another way: references. &nbsp;References are a hidden cost in Java. &nbsp;Every single reference in a 32-bit JVM costs you 4 bytes, and in a 64-bit JVM it's 8 (assuming that you don't have compressed references).</div><div><br /></div><div>Let's count the references.</div><div><br /></div><div><ul><li>Terms[] length * 3: 1 reference for the Term and 2 references for the two Strings inside the Term</li><li>TermInfos[] length * 1</li><li>long[] = 1 reference total<br /></li></ul><div>So, let's talk numbers. &nbsp;If you have a billion terms in your index, that's 125 MB of memory for the references alone (1,000,000,000 / 128 * (3 + 1) references * 4 bytes per reference). &nbsp;In a 64-bit JVM that doubles to 250 MB. &nbsp;Not to mention the object overhead for every one of those Term and TermInfos objects. 
Wow, that's a lot!</div><div><br /></div><div>So I decided to remove nearly all of those references by using a byte array and an int array as an offset index.</div><div><br /></div><div>The results were impressive!</div><div><br /></div><div>Given a 6.2 GB index with 1,010,000&nbsp;documents and 179,822,683 terms, the default implementation uses 292,235,512 bytes just to get the index usable.</div><div><br /></div><div>My no-ref implementation of the same index uses only 49,849,744 bytes to get the index usable. &nbsp;That's 17% of the original size, an 83% savings!</div><div><br /></div><div>And the best part is that it loads the segments into memory faster. &nbsp;So those real-time updates will be online faster. &nbsp;The run-time performance is slightly faster as well. &nbsp;But the huge performance saving is in garbage collection: full GCs are over 7 times faster on my MacBook Pro. &nbsp;Wow!</div><div><br /></div><div>I think that the results speak for themselves, and I hope that the Lucene folks will accept my patch. &nbsp;That way I won't have to continue patching each version after the fact. &nbsp;Also, removing references can be great, but the code required to do it and maintain the same level of performance is ugly! &nbsp;So don't try this at home!</div><div><br /></div><div><a href="https://issues.apache.org/jira/browse/LUCENE-2205">LUCENE-2205</a></div><div><br /></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/my_first_lucene_patch.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/my_first_lucene_patch.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Lucene</category>
            
            
            <pubDate>Mon, 11 Jan 2010 20:06:21 -0500</pubDate>
        </item>
        
        <item>
            <title>Using HBase-dsl</title>
            <description><![CDATA[At the beginning of last month I started prototyping various solutions for a customer using HBase. &nbsp;However, I found myself writing tons of code to perform some fairly simple tasks. &nbsp;So I set out to simplify my HBase code and ended up writing a Java <a href="http://wiki.github.com/nearinfinity/hbase-dsl" target="_blank">HBase DSL</a>. &nbsp;It's still fairly rough around the edges, but it does allow the use of standard Java types and it's extensible.<div><br /><font class="Apple-style-span" style="font-size: 1.25em; "><font class="Apple-style-span" style="font-size: 1.25em; ">

Simple Put and Get Example</font></font><br /><br /><b>

Direct HBase API:</b><br />

<br />
<pre class="prettyprint">public class PutAndGet {
   public static void main(String[] args) throws IOException {
      HTable hTable = new HTable("test");

      byte[] rowId = Bytes.toBytes("abcd");
      byte[] famA = Bytes.toBytes("famA");
      byte[] col1 = Bytes.toBytes("col1");
      Put put = new Put(rowId).
         add(famA, col1, Bytes.toBytes("hello world!"));
      hTable.put(put);
      Get get = new Get(rowId);
      Result result = hTable.get(get);
      byte[] value = result.getValue(famA, col1);
      System.out.println(Bytes.toString(value));
   }
}
</pre><b>HBase-dsl API:</b><br /><br />
<pre class="prettyprint">public class PutAndGetWithDsl { 
   public static void main(String[] args) throws IOException { 
      HBase&lt;QueryOps&lt;String&gt;, String&gt; hBase = new HBase&lt;QueryOps&lt;String&gt;, String&gt;(String.class);

      hBase.save("test").  
         row("abcd"). 
            family("famA"). 
               col("col1", "hello world!"); 
      String value = hBase.fetch("test"). 
         row("abcd").
            family("famA"). 
               value("col1", String.class);
      System.out.println(value);
   }
 }</pre>

Now this is where the DSL becomes more powerful!<div><br /><font class="Apple-style-span" style="font-size: 1.25em; "><font class="Apple-style-span" style="font-size: 1.25em; ">

Scanner Example</font></font><br /><br /><b>

Direct HBase API:</b><br /><br />

<pre class="prettyprint">public class Scanner {
   public static void main(String[] args) throws IOException {
      byte[] famA = Bytes.toBytes("famA");
      byte[] col1 = Bytes.toBytes("col1");  

      HTable hTable = new HTable("test");  

      Scan scan = new Scan(Bytes.toBytes("a"), Bytes.toBytes("z"));
      scan.addColumn(famA, col1);  

      SingleColumnValueFilter singleColumnValueFilterA = new SingleColumnValueFilter(
           famA, col1, CompareOp.EQUAL, Bytes.toBytes("hello world!"));
      singleColumnValueFilterA.setFilterIfMissing(true);  

      SingleColumnValueFilter singleColumnValueFilterB = new SingleColumnValueFilter(
           famA, col1, CompareOp.EQUAL, Bytes.toBytes("hello hbase!"));
      singleColumnValueFilterB.setFilterIfMissing(true);  

      FilterList filter = new FilterList(Operator.MUST_PASS_ONE, Arrays
           .asList((Filter) singleColumnValueFilterA,
                singleColumnValueFilterB));  

      scan.setFilter(filter);  

      ResultScanner scanner = hTable.getScanner(scan);  

      for (Result result : scanner) {
         System.out.println(Bytes.toString(result.getValue(famA, col1)));
      }
   }
}</pre>
<b>HBase-dsl API:</b><br /><br />

<pre class="prettyprint">public class ScannerWithDsl {
   public static void main(String[] args) throws IOException {
      HBase&lt;QueryOps&lt;String&gt;, String&gt; hBase = new HBase&lt;QueryOps&lt;String&gt;, String&gt;(String.class);

      hBase.scan("test","a","z").
         select().
            family("famA").
               col("col1").
         where().
            family("famA").
               col("col1").eq("hello world!","hello hbase!").
         foreach(new ForEach&lt;Row&gt;() {
            @Override
            public void process(Row row) {
               System.out.println(row.value("famA", "col1", String.class));
            }
         });
  }
}</pre><br />
See the unit tests for more examples.<br /><br /></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/using_hbase-dsl.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/using_hbase-dsl.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Database</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Persistence</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hbase</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hbase-dsl</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Tue, 05 Jan 2010 22:34:20 -0500</pubDate>
        </item>
        
        <item>
            <title>Hibernate Performance Tuning Part 2 Article Published</title>
            <description><![CDATA[  <p>I've just published the second article of a two-part series in the December 2009 <a href="http://www.nofluffjuststuff.com/home/magazine_subscribe?id=10">NFJS Magazine</a> on Hibernate Performance Tuning. Here's the abstract:</p>

<p><em>Tuning performance in Hibernate applications is all about reducing the number of database queries or eliminating them entirely using caching. In the first article in this two part series, you saw how to tune object retrieval using eager fetching techniques to optimize queries and avoid lazy-loads. In this second and final article, I'll show you how inheritance strategy affects performance, how to eliminate queries using the Hibernate second-level cache, and show some simple but effective tools you can use to monitor and profile your applications.</em></p>

<p>If you are using Hibernate and want to know more about how inheritance affects performance, how to use the second-level cache, and some simple monitoring and profiling techniques, check it out and let me know what you think. Note that NFJS Magazine does require a subscription.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/hibernate_performance_tuning_p_1.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/hibernate_performance_tuning_p_1.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Database</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">ORM</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Persistence</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">hibernate</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">ORM</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">performance</category>
            
            <pubDate>Mon, 21 Dec 2009 14:17:55 -0500</pubDate>
        </item>
        
        <item>
            <title>Making Cobertura Reports Show Groovy Code with Maven</title>
            <description><![CDATA[ <p>A recent project started out life as an all-Java project that used Maven as the build tool. Initially we used <a href="http://www.atlassian.com/software/clover/">Atlassian Clover</a> to measure unit test coverage. Clover is a great product for Java code, but unfortunately it only works with Java code because it works at the Java source level. (This was the case as of Spring 2009, and I haven't checked since then.) As we started migrating existing code from Java to Groovy and writing new code in Groovy, we started to lose data about unit test coverage because Clover does not understand Groovy code. To remedy this problem we switched from Clover to <a href="http://cobertura.sourceforge.net/">Cobertura</a>, which instruments at the bytecode level and thus works with Groovy code. Theoretically it would also work with any JVM-based language, but I'm not sure whether it could handle something like Clojure.</p>

<p>In any case, we only cared about Groovy so Cobertura was a good choice. With the <a href="http://mojo.codehaus.org/cobertura-maven-plugin/">Cobertura Maven</a> plugin we quickly found a problem, which was that even though the code coverage was running, the reports only showed coverage for Java code, not Groovy. This post shows you how to display coverage on Groovy code when using Maven and the Cobertura plugin. In other words, I'll show how to get Cobertura reports to link to the real Groovy source code in Maven, so you can navigate Cobertura reports as you normally would.</p>

<p>The core problem is pretty simple, though it took me a while to figure out how to fix it. Seems to be pretty standard in Maven: I know what I want to do, but finding out how to do it is the <i>really</i> hard part. The only thing you need to do is tell Maven about the Groovy source code and where it lives. The way I did this is to use the Codehaus <a href="http://mojo.codehaus.org/build-helper-maven-plugin/">build-helper-maven-plugin</a> which has an add-source goal. The add-source goal does just what you would expect; it adds a specified directory (or directories) as a source directory in your Maven build. Here's how you use it in your Maven pom.xml file:</p>

<pre class="prettyprint">
&lt;plugin&gt;
    &lt;groupId&gt;org.codehaus.mojo&lt;/groupId&gt;
    &lt;artifactId&gt;build-helper-maven-plugin&lt;/artifactId&gt;
    &lt;executions&gt;
        &lt;execution&gt;
            &lt;phase&gt;generate-sources&lt;/phase&gt;
            &lt;goals&gt;
                &lt;goal&gt;add-source&lt;/goal&gt;
            &lt;/goals&gt;
            &lt;configuration&gt;
                &lt;sources&gt;
                    &lt;source&gt;src/main/groovy&lt;/source&gt;
                &lt;/sources&gt;
            &lt;/configuration&gt;
        &lt;/execution&gt;
    &lt;/executions&gt;
&lt;/plugin&gt;
</pre>

<p>In the above code snippet, we're using the "build-helper-maven-plugin" to add the src/main/groovy directory. That's pretty much it. Run Cobertura as normal, view the reports, and you should now see coverage on Groovy source code as well as Java.</p>
]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/making_cobertura_reports_show.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/making_cobertura_reports_show.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Testing</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">cobertura</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">maven</category>
            
            <pubDate>Tue, 15 Dec 2009 23:33:19 -0500</pubDate>
        </item>
        
        <item>
            <title>Hibernate Performance Tuning Part 1 Article Published</title>
            <description><![CDATA[ <p>I've just published an article in the November 2009 <a href="http://www.nofluffjuststuff.com/home/magazine_subscribe?id=9">NFJS Magazine</a> on Hibernate Performance Tuning. Here's the abstract:</p>

<p><em>Many developers treat Hibernate like a "black box" and assume it will simply "Do the Right Thing" when it comes to all things related to the underlying database. This is a faulty assumption because, while Hibernate is great at the mechanics of database interaction, it cannot and will likely not ever be able to figure out the specific details of your domain model and discern the most efficient and best performing data access strategies. In this first article of a two part series, I'll show you how to achieve better performance in your Hibernate applications by focusing on tuning object retrieval, which forms the basis of your "fetch plan" for finding and storing objects in the database.</em></p>

<p>If you are using Hibernate and want to know more about how to change how objects are fetched from the database, check it out and let me know what you think. Note that NFJS Magazine does require a subscription.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/hibernate_performance_tuning_p.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/hibernate_performance_tuning_p.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Database</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">ORM</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Persistence</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">hibernate</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">ORM</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">performance</category>
            
            <pubDate>Tue, 01 Dec 2009 19:29:07 -0500</pubDate>
        </item>
        
        <item>
            <title>Can Java Be Saved?</title>
            <description><![CDATA[ <h3>Java and Evolution</h3>

<p>The Java language has been around for a pretty long time, and in my view is now a stagnant language. I don't consider it <a href="http://codemonkeyism.com/java-dead/">dead</a> because I believe it will be around for probably decades if not longer. But it appears to have reached its evolutionary peak, and it doesn't look like it's going to evolve any further. This is not due to problems inherent in the language itself. Instead it seems the problem lies with Java's stewards (Sun and the JCP) and their unwillingness to evolve the language to keep it current and modern, and more importantly the goal to keep backward compatibility at all costs. Not just Sun, but also it seems the large corporations with correspondingly large investments in Java like IBM and Oracle aren't exactly chomping at the bit to improve Java. I don't even know if they think it needs improvement at all. So really, the ultra-conservative attitude towards change and evolution is the problem with Java from my admittedly limited view of things.</p>

<p>That's why I don't hate Java. But, I do hate the way it has been treated by the people charged with improving it. It is clear many in the Java community want things like closures and a native property syntax but instead we got Project Coin. This, to me, is sad really. It is a shame that things like closures and native properties were not addressed in Java/JDK/whatever-it-is-called 7.</p>

<h3>Why Not?</h3>

<p>I want to know why Java can't be improved. We have concrete examples that it is possible to change a major language in major ways. Even in ways that break backward compatibility in order to evolve and improve. Out with the old, in with the new. Microsoft with C# showed that you can successfully evolve a language over time in major ways. For example C# has always had a property syntax but it now also has many features found in dynamically typed and functional languages such as type inference and, effectively, closures. With LINQ it introduced functional concepts. When C# added generics they did it correctly and retained the type information in the compiled IL, whereas Java used type-erasure and simply dropped the types from the compiled  bytecode. There is a great irony here: though C# began life about five or six years after Java, it not only has caught up but has surpassed Java in most if not all ways, and has continued to evolve while Java has become stagnant.</p>

<p>C# is not the only example. Python 3 is a major overhaul of the Python language, and it introduced breaking changes that are not backwards compatible. I believe they provide a migration tool to assist you should you want to move from the 2.x series to version 3 and beyond. Microsoft has done this kind of thing as well. I remember when they made Visual Basic conform to the .NET platform and introduced some rather gut-wrenching (for VB developers anyway) changes, and they also provided a tool to aid the transition. One more recent example is Objective-C, which has experienced a resurgence in importance mainly because of the iPhone. Objective-C has been around since the 1980s, longer than Java, C#, Ruby, or Python. Apple has made improvements to Objective-C and it now sports a way to define and synthesize properties and most recently added blocks (effectively closures). If a language that pre-dates Java (Python also pre-dates Java by the way) can evolve, I just don't get why Java can't.</p>

<p>While it is certainly possible to remain on older versions of software, forcing yourself to upgrade can be a Good Thing, because it ensures you don't get the "COBOL Syndrome" where you end up with nothing but binaries that have to run on a specific hardware platform forever and you are trapped until you rewrite or you go out of business. The other side of this, of course, is that organizations don't have infinite time, money, and resources to update every single application. Sometimes this too can be good, because it forces you to triage older systems, and possibly consolidate or outright eliminate them if they have outlived their usefulness. In order to facilitate large transitions, I believe it is very important to use tools that help automate the upgrade process, e.g. tools that analyze code and fix it if possible (reporting all changes in a log) and which provide warnings and guidance when a simple fix isn't possible.</p>

<h3>The JVM Platform</h3>

<p>Before I get into the changes I'd make to Java to make it not feel like I'm developing with a straitjacket on while having to type masses of unnecessary boilerplate code, I want to say that I think the JVM is a great place to be. Obviously the JVM itself facilitates developing all kinds of languages as evidenced by the huge number of languages that run on the JVM. The most popular ones and most interesting ones these days are probably JRuby, Scala, Groovy, and Clojure though there are probably hundreds more. So I suppose you could make an argument that Java doesn't need to evolve any more because we can simply use a more modern language that runs on the JVM.</p>

<p>The main problem I have with that argument is simply that there is already a ton of Java code out there, and there are many organizations who are simply not going to allow other JVM-based languages; they're going to stick with Java for the long haul, right or wrong. This means that even if you can manage to convince someone to try writing that shiny new web app using Scala and its Lift framework, JRuby on Rails, Grails, or Clojure, chances are at some point you'll also need to maintain or enhance existing large Java codebases. Wouldn't you like to be able to first upgrade to a version of Java that has closures, native property syntax, method/property handles, etc?</p>

<p>Next I'll choose what would be my top three choices to make Java much better immediately.</p>

<h3>Top Three Java Improvements</h3>

<p>If given the chance to change just three things about Java to make it better, I would choose these:</p>

<ul>
<li>Remove checked exceptions</li>
<li>Add closures</li>
<li>Add formal property support</li>
</ul>

<p>I think these three changes alone would make coding in Java much, much better. Let's see how.</p>

<h4>Remove Checked Exceptions</h4>

<p>By removing checked exceptions you eliminate a ton of boilerplate: try/catch clauses that do nothing except log a message or wrap and re-throw as a RuntimeException, throws clauses that pollute the API all over the place, and, worst of all, empty catch blocks that can cause very subtle and evil bugs. With unchecked exceptions, developers still have the option to catch exceptions that they can actually handle. It would be interesting to see how many times in a typical Java codebase people actually handle exceptions and do something at the point of exception, or whether they simply punt it away for the caller to handle, who in turn also punts, and so forth all the way up the call stack until some global handler catches it or the program crashes. If I were a betting man, I'd bet a lot of money that for most applications, developers punt the vast majority of the time. So why force people to handle something they cannot possibly handle?</p>
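
<p>To make the boilerplate concrete, here is a minimal sketch of the wrap-and-re-throw pattern. The class and method names (<code>ConfigReader</code>, <code>firstLine</code>) are invented for illustration only. Note how much of the method exists purely to satisfy the compiler, even though an <code>IOException</code> is not going to happen when reading from an in-memory string:</p>

<pre class="prettyprint">
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ConfigReader {

    // Returns the first line of the given text (hypothetical helper).
    public static String firstLine(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            return reader.readLine();
        } catch (IOException e) {
            // Nothing useful can be done here, so wrap and re-throw unchecked.
            throw new RuntimeException("Could not read text", e);
        } finally {
            try {
                reader.close();
            } catch (IOException e) {
                // close() declares IOException too, forcing yet another handler.
                throw new RuntimeException("Could not close reader", e);
            }
        }
    }
}
</pre>

<p>The one line that does real work is the call to <code>readLine()</code>; everything else is ceremony that the caller never sees and cannot act on.</p>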

<h4>Add Closures</h4>

<p>I specifically listed removing checked exceptions first, because to me it is the first step to being able to have a closure/block syntax that isn't totally horrendous. If you remove checked exceptions, then adding closures would seem to be much easier since you don't need to worry at all about what exceptions could possibly be thrown and there is obviously no need to declare exceptions. Closures/blocks would lead to better ability to handle collections, for example as in Groovy but in Java you would still have types (note I'm also using a literal property syntax here):</p>

<pre class="prettyprint">
// Find all people whose last name is "Smith"
List&lt;Person&gt; peeps = people.findAll { Person person -> person.lastName.equals("Smith");   } 
</pre>

or

<pre class="prettyprint">
// Create a list of names by projecting the name property of a bunch of Person objects
List&lt;String&gt; names = people.collect { Person person -> person.name; }
</pre>

<p>Not quite as clean as Groovy but still much better than the for loops that would traditionally be required (or trying to shoehorn functional-style into Java using the <a href="http://commons.apache.org/collections/">Jakarta Commons Collections</a>  or <a href="http://code.google.com/p/google-collections/">Google Collections</a>). Removal of checked exceptions would allow, as mentioned earlier, the block syntax to not have to deal with declaring exceptions all over the place. Having to declare checked exceptions in blocks makes the syntax worse instead of better, at least when I saw the various closure proposals for Java/JDK/whatever 7 which did not get included. Requiring types in the blocks is still annoying, especially once you get used to Ruby and Groovy, but it would be passable.</p>
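
<p>For comparison, this is roughly what the two one-liners above look like with plain for loops. It is only a sketch: the <code>Person</code> class, its <code>name</code> and <code>lastName</code> fields, and the <code>PersonQueries</code> helper are assumptions made up for the example.</p>

<pre class="prettyprint">
import java.util.ArrayList;
import java.util.List;

public class PersonQueries {

    // Hypothetical model class, used only for this illustration.
    public static class Person {
        private final String name;
        private final String lastName;

        public Person(String name, String lastName) {
            this.name = name;
            this.lastName = lastName;
        }

        public String getName() { return name; }
        public String getLastName() { return lastName; }
    }

    // Loop equivalent of: people.findAll { Person person -> person.lastName.equals("Smith"); }
    public static List&lt;Person&gt; findSmiths(List&lt;Person&gt; people) {
        List&lt;Person&gt; result = new ArrayList&lt;Person&gt;();
        for (Person person : people) {
            if ("Smith".equals(person.getLastName())) {
                result.add(person);
            }
        }
        return result;
    }

    // Loop equivalent of: people.collect { Person person -> person.name; }
    public static List&lt;String&gt; names(List&lt;Person&gt; people) {
        List&lt;String&gt; result = new ArrayList&lt;String&gt;();
        for (Person person : people) {
            result.add(person.getName());
        }
        return result;
    }
}
</pre>

<p>Each query needs its own method, its own temporary list, and its own loop, which is exactly the repetition a block syntax would remove.</p>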

<h4>Native Property Syntax</h4>

<p>The third change should do essentially what Groovy does for properties but should introduce a "property" keyword (i.e. don't rely on whether someone accidentally put an access modifier in there as Groovy does). The syntax could be very clean:</p>

<pre class="prettyprint">
property String firstName;
property String lastName;
property Date dateOfBirth;
</pre>

<p>The compiler could automatically generate the appropriate getter/setter for you like Groovy does. This obviates the need to manually code the getter/setter. Like Groovy you should be able to override either or both. It de-clutters code enormously and removes a ton of lines of silly getter/setter code (plus JavaDocs if you are actually still writing them for every get/set method). Then you could reference properties as you would expect: person.name is the "getter" and person.name = "Fred" is the "setter." Much cleaner syntax, way less boilerplate code. By the way, if someone used the word "property" in their code, i.e. as a variable name, it is just not that difficult to rename refactor, especially with all the advanced <a href="http://www.jetbrains.com/idea/">IDEs</a> in the Java community that do this kind of thing in their sleep.</p>
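
<p>As a point of reference, here is a sketch of what those three property declarations cost when written by hand. The class name <code>PersonBean</code> is an assumption for the example; the fields mirror the ones declared above.</p>

<pre class="prettyprint">
import java.util.Date;

// Hypothetical bean: three fields, six hand-written accessor methods.
public class PersonBean {
    private String firstName;
    private String lastName;
    private Date dateOfBirth;

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public Date getDateOfBirth() { return dateOfBirth; }
    public void setDateOfBirth(Date dateOfBirth) { this.dateOfBirth = dateOfBirth; }
}
</pre>

<p>Even squeezed onto one line each, the accessors outnumber the declarations two to one, and none of them says anything the field declaration did not already say.</p>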

<p>Lots of other things could certainly be done, but if just these three were done I think Java would be much better off, and maybe it would even come into the 21st century like Objective-C. (See the very long but very good <a href="http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars">Ars Technica Snow Leopard review</a> for information on Objective-C's new <a href="http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/10#blocks">blocks</a> feature.)</p>

<h3>Dessert Improvements</h3>

<p>If (as I suspect they certainly will :-) ) Sun/Oracle/whoever takes my suggestions and makes these changes and improves Java, then I'm sure they'll want to add in a few more for dessert. After the main course which removes checked exceptions, adds closures, and adds native property support, dessert includes the following:</p>

<ul>
<li>Remove type-erasure and clean up generics</li>
<li>Add property/method handles</li>
<li>String interpolation</li>
<li>Type inference</li>
<li>Remove "new" keyword</li>
</ul>

<h4>Clean Up Generics</h4>

<p>Generics should simply not remove type information when compiled. If you're going to have generics in the first place, do it correctly and stop worrying about backward compatibility. Keep type information in the bytecode, allow reflection on it, and allow me to instantiate a "new T()" where T is some type passed into a factory method, for example. I think an improved generics implementation could basically copy the way C# does it and be done.</p>
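
<p>A small sketch of what erasure means in practice. The <code>ErasureDemo</code> class and its method names are invented for the example, and it assumes Java 7 or later for <code>ReflectiveOperationException</code>. The first method shows that two differently parameterized lists share a single runtime class; the second shows the usual workaround for the missing "new T()", which is to make the caller pass a <code>Class</code> token and go through reflection.</p>

<pre class="prettyprint">
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {

    // Type arguments are erased, so both lists report the same runtime class.
    public static boolean sameRuntimeClass() {
        List&lt;String&gt; strings = new ArrayList&lt;String&gt;();
        List&lt;Integer&gt; numbers = new ArrayList&lt;Integer&gt;();
        return strings.getClass() == numbers.getClass();
    }

    // "new T()" does not compile, so the caller has to hand in a Class token.
    public static &lt;T&gt; T create(Class&lt;T&gt; type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}
</pre>

<p>A call such as <code>ErasureDemo.create(StringBuilder.class)</code> works, but the type has to be spelled out at the call site because the method itself has no idea what T is at runtime.</p>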

<h4>Property/Method Handles</h4>

<p>Property/method <a href="http://blogs.sun.com/jrose/entry/method_handles_in_a_nutshell">handles</a> would allow you to reference a property or method directly. They would make code that must now use strings strongly typed, refactoring-safe, and much nicer to work with (IDEs like IntelliJ already know how to search in text and strings, but they can never be perfect). For example, a particular pet peeve of mine, and I'm sure of a lot of other developers, is writing Criteria queries in Hibernate. You are forced to reference properties as simple strings. If the lastName property is changed to surname then you had better make sure to catch all the places the String "lastName" is referenced. So you could replace code like this:</p>

<pre class="prettyprint">
session.createCriteria(Person.class)
	.add(Restrictions.eq("lastName", "Smith"))
	.addOrder(Order.asc("firstName"))
	.list();
</pre>

<p>with this using method/property handles:</p>

<pre class="prettyprint">
session.createCriteria(Person.class)
	.add(Restrictions.eq(Person.lastName, "Smith"))
	.addOrder(Order.asc(Person.firstName))
	.list();
</pre>

<p>Now the code is strongly-typed and refactoring-safe. JPA 2.0 tries mightily to overcome having strings in the new criteria query API with its metamodel. But I find it pretty much appalling to even look at, what with having to create or code-generate a separate "metamodel" class which you reference like "Person_.lastName" or some similar awful way. This metamodel class lives only to represent properties on your real model object for the sole purpose of making JPA 2.0 criteria queries strongly typed. It just isn't worth it and is total overkill. In fact, it reminds me of the bad-old days of rampant over-engineering in Java (which apparently is still alive and well in many circles but I try to avoid it as best I can). The right thing is to fix the language, not to invent something that adds yet more boilerplate and more complexity to an already overcomplicated ecosystem.</p>

<p>Method handles could also be used to make calling methods using reflection much cleaner than it currently is, among other things. Similarly it would make accessing properties via reflection easier and cleaner. And with only unchecked exceptions you would not need to catch the four or five kinds of exceptions reflective code can throw.</p>
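
<p>Here is a sketch of what a reflective method call looks like today, to show both problems at once: the method is named by a string, and the call site has to deal with several checked exceptions. <code>ReflectiveCall</code> and its <code>invoke</code> helper are hypothetical names for the example; the reflection calls themselves are the standard <code>java.lang.reflect</code> API.</p>

<pre class="prettyprint">
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReflectiveCall {

    // Invokes a public no-argument method by name (hypothetical helper).
    public static Object invoke(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Method not accessible: " + methodName, e);
        } catch (InvocationTargetException e) {
            // The real failure is the cause, thrown by the invoked method itself.
            throw new IllegalStateException("Method threw: " + methodName, e.getCause());
        }
    }
}
</pre>

<p>Calling <code>ReflectiveCall.invoke("abc", "length")</code> goes through all of that just to run <code>"abc".length()</code>, and a typo in the string is only discovered at runtime.</p>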

<h4>String Interpolation</h4>

<p>String interpolation is like the sorbet that you get at fancy restaurants to cleanse your palate. This would seem to be a no-brainer to add. You could make code like:</p>

<pre class="prettyprint">
log.error("The object of type ["
    + foo.getClass().getName()
    + "] and identifier ["
    + foo.getId()
    + "] does not exist.", cause);
</pre>

<p>turn into this much more palatable version (using the native property syntax I mentioned earlier):</p>

<pre class="prettyprint">
log.error("The object of type [${foo.class.name}] and identifier [${foo.id}] does not exist.", cause);
</pre>

<h4>Type Inference</h4>

<p>I'd also suggest adding type inference, if only for local variables like C# does. Why do we have to repeat ourselves? Instead of writing:</p>

<pre class="prettyprint">
Person person = new Person();
</pre>

<p>why can't we just write:</p>

<pre class="prettyprint">
var person = new Person();
</pre>

<p>I have to believe the compiler and all the tools are smart enough to infer the type from the "new Person()". Especially since other strongly-typed JVM languages like Scala do exactly this kind of thing.</p>

<h4>Eliminate "new"</h4>

<p>Last but not least, and actually not the last thing I can think of but definitely the last I'm writing about here, let's get rid of the "new" keyword and either go with Ruby's new <i>method</i> or Python's constructor syntax, like so:</p>

<pre class="prettyprint">
// Ruby-like new method
var person = Person.new()

// or Python-like construction
var person = Person()
</pre>

<p>This one came to me recently after hearing <a href="http://en.wikipedia.org/wiki/Bruce_Eckel">Bruce Eckel</a> give an excellent talk on language evolution and archaeology. He had a ton of really interesting examples of why things are the way they are, and how Java and other languages like C++ evolved from C. One example was the reason for "new" in Java. In C++ you can allocate objects on the stack or the heap, so there is a stack-based constructor syntax that does not use "new" while the heap-based constructor syntax uses the "new" operator. Even though Java only has heap-based object allocation, it retained the "new" keyword which is not only boilerplate code but also makes the entire process of object construction pretty much immutable: you cannot change anything about it nor can you easily add hooks into the object creation process.</p>

<p>I am not an expert at all in the low-level details, and Bruce obviously knows what he is <a href="http://www.amazon.com/Thinking-C-2-Practical-Programming/dp/0130353132/">talking</a> <a href="http://www.amazon.com/Thinking-Java-4th-Bruce-Eckel/dp/0131872486/">about</a> way more than I do, but I can say that I believe the Ruby and Python syntaxes are not only nicer but more internally consistent, especially in the Ruby case because there is no special magic or sauce going on. In Ruby, new is just a method, on a class, just like everything else.</p>

<h3>Conclusion to this Way Too Long Blog Entry</h3>

<p>I did not actually set out to write a blog whose length is worthy of a <a href="http://blogs.tedneward.com/">Ted Neward</a> blog. It just turned out that way. (And I do in fact like reading Ted's long blogs!) Plus, I found out that <a href="http://en.wikipedia.org/wiki/Speculative_fiction">speculative fiction</a> can be pretty fun to write, since I don't think pretty much any of these things are going to make it into Java anytime soon, if ever, and I'm sure there are lots of people in the Java world who hate things like Ruby and won't agree anyway.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/can_java_be_saved.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/can_java_be_saved.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">C#</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">clojure</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">closure</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">python</category>
            
            <pubDate>Mon, 09 Nov 2009 15:23:10 -0500</pubDate>
        </item>
        
        <item>
            <title>Hive - The next great data warehouse</title>
            <description><![CDATA[In the past few weeks I have been spending more and more time working with Hadoop and Hive.&nbsp; For those of you who don't know what Hadoop is, check out what <a href="http://en.wikipedia.org/wiki/Hadoop">wikipedia</a> has to say.&nbsp; Hive is built on top of Hadoop; simply stated, it is a SQL engine that submits <a href="http://en.wikipedia.org/wiki/Map_Reduce">map/reduce</a> jobs to Hadoop for execution.<br /><br />So next you ask yourself, "why do I care"?&nbsp; Well, with Hive using Hadoop for all the heavy lifting, the amount of data that you can process is only limited by the amount of hardware you have in your cluster.&nbsp; Hive is used for data warehousing, which means that it is designed to work on huge datasets, huge joins, huge data loads, huge query results, etc.&nbsp; However, before you start thinking about getting rid of that MySQL database, think again.&nbsp; Hive is not and never will be low latency.&nbsp; All queries submit map/reduce jobs to Hadoop, which then operates on files stored in HDFS.<br /><br />Hive has a lot of nice features built in, like:<br /><ul><li>It can operate on <i>raw</i> files located in HDFS, like logs from your application or csv files from your database(s).&nbsp; This can reduce your load time, because you don't have to actually load the data into a database before you can use it.</li><li>It can operate on compressed files.&nbsp; I started using this feature last week because I am getting a 4 to 1 compression ratio with no difference in performance (I am using sequence files with block compression).</li><li>In your SQL statements you can actually use the Hadoop streaming api to build your own mappers and reducers, and they don't even have to be written in Java!</li><li>You can also create your own user defined functions, so when you have to do something crazy with the data, you can!</li></ul><br />And there are lots more, so go check it out!<br /><br /><a 
href="http://wiki.apache.org/hadoop/Hive">Hive</a>, the real Netezza killer.<br />]]></description>
            <link>http://www.nearinfinity.com/blogs/aaron_mccurry/hive_-_the_next_great_data_war.html</link>
            <guid>http://www.nearinfinity.com/blogs/aaron_mccurry/hive_-_the_next_great_data_war.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">SQL</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">hadoop</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hdfs</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hive</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">sql</category>
            
            <pubDate>Sun, 04 Oct 2009 13:18:55 -0500</pubDate>
        </item>
        
        <item>
            <title>JVM Language Summit 2009 - Day 3</title>
            <description><![CDATA[Terence Parr - ANTLR<div><br /></div><div>Terence is entertaining and energetic. &nbsp;He showed some of the challenges ANTLR faces and had some requests of the JVM that are slightly different from those of most language implementors.</div><div><br /></div><div>Stadler - JVM Continuations</div><div><br /></div><div>Another research talk. &nbsp;The definition of continuations caused a side discussion, but I really enjoyed this talk. &nbsp;It is only implemented in the c1 client compiler. &nbsp;Security (serialization) and finally blocks are some of the unanswered questions here.</div><div><br /></div><div>Wimmer - trace-based JIT</div><div><br /></div><div>More research (UC-Irvine, but really more from Linz). &nbsp;Discussed TCO, a client compiler visualization tool and trace-based JIT. &nbsp;Uses traces and not just heuristics to identify hotspots for compilation. &nbsp;Works with blocks as well as methods. &nbsp;Think TraceMonkey for HotSpot.</div><div><br /></div><div>Gafter - Static dynamic types</div><div><br /></div><div>Neal spoke about the new dynamic keyword in C# and the DLR. &nbsp;Described the DLR as basically a MOP. &nbsp;I would love to see a clean language implementation on the JVM with this sort of type system.</div><div><br /></div><div>Siek - Blame tracking</div><div><br /></div><div>Blame tracking means tracking type problems in dynamic languages and blaming the proper point in the code that made the initial improper call. &nbsp;It also works for mixed static/dynamic type systems. &nbsp;Currently it only identifies the outermost problem for space efficiency reasons.</div><div><br /></div><div>Field - JavaFX binding</div><div><br /></div><div>Bindings are handled by the creation of locations, which can be pretty inefficient. &nbsp;This talk went through the various rounds of attempting to make bindings fast(er).</div><div><br /></div><div>Baker - Jython</div><div><br /></div><div>History of Jython. &nbsp;Optimizations. 
&nbsp;Mostly discussed the libraries and future direction.</div><div><br /></div><div>Remi Forax - 292 (invoke dynamic aka indy) backport</div><div><br /></div><div>Backport works via proxies and code weaving since the byte code instruction is not available. &nbsp;Two transformations occur via Java agents and the performance is very respectable. &nbsp;It works with JDK 5 and 6.</div><div><br /></div><div>The final session (not recorded) was by Slava on the Factor programming language compiler. Factor is a stack-based language like Forth that does NOT run on the JVM. &nbsp;This session was awesome. &nbsp;While there weren't any new compiler optimizations that I hadn't heard of previously, it is very impressive what Slava has done with Factor. &nbsp;If you haven't ever checked it out, the URL is&nbsp;<a href="http://factorcode.org/">http://factorcode.org/</a>&nbsp;&nbsp;Again, kudos to Sun for inviting non-JVM participants, it really adds to the quality of the event imho.</div><div><br /></div><div>PS Linux doesn't work well for presentations. :)</div><div>PPS People want TCO</div><div>PPPS 64k byte limit in methods is a problem</div><div>PPPPS The amount of cooperation among projects seems to have increased significantly since last year</div><div>PPPPPS Startup time needs to be improved</div>]]></description>
            <link>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day_2.html</link>
            <guid>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day_2.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
            <pubDate>Fri, 18 Sep 2009 12:54:58 -0500</pubDate>
        </item>
        
        <item>
            <title>JVM Language Summit 2009 - Day 2</title>
            <description><![CDATA[<div>While day 1 focused on invokedynamic, day 2 focused on time and locality. &nbsp;</div><div><br /></div>Hickey keynote<div><br /></div><div>Very thought provoking. &nbsp;Argued for immutable data structures and pure functions, but with time transitions and observers. &nbsp;In a way, a nice pragmatic response to Erik Meijer's Fundamentalist Functional Programming from last year's JVM Language Summit. &nbsp;This was a somewhat philosophical presentation, but much like last year it made me want to re-visit Clojure. &nbsp;Here is a better summary than I could have written:&nbsp;<a href="http://wiki.jvmlangsummit.com/Clojure_Keynote">http://wiki.jvmlangsummit.com/Clojure_Keynote</a></div><div><br /><div>Cliff Click</div><div><br /></div><div>Awesome scheduling to put this immediately after Rich's talk. &nbsp;Cliff offered up four presentations and the audience overwhelmingly chose the one about x86. &nbsp;Interestingly this had some similar themes to Rich's talk (time and locality), but at an extremely low level.</div><div><br /></div><div>Hotswap</div><div><br /></div><div>Research presentation from European students who have partnered with Sun. &nbsp;This was interesting, but the limitations make it not that useful for real production code hotswapping in my opinion.</div><div><br /></div><div>Groovy</div><div><br /></div><div>This talk by Theodorou focused on the performance of Groovy and was almost an advertisement for not using Groovy. &nbsp;Of course the performance of Groovy doesn't match Java, but it is really bad in some instances. &nbsp;The moral was to use Java where performance is necessary.</div><div><br /></div><div>Sun update</div><div><br /></div><div>This was probably not recorded, but Octavian took hard questions from Josh Bloch, Neal Gafter and others. &nbsp;Hopefully the departure of key Sun employees ends up helping Sun in the long run. 
&nbsp;In my opinion Sun is overestimating the value of the Java store.</div><div><br /></div><div>Erik Meijer</div><div><br /></div><div>Not nearly as entertaining as last year's talk, but still interesting and based on mathematics. &nbsp;Erik was looking for solutions to concurrency and resource cleanup problems that are difficult and that currently have no "clean" solution.</div><div><br /></div><div>JRockit</div><div><br /></div><div>Frederik's talk focused mostly on anti-optimizations and basically asked the language designers not to try to implement optimizations that are also performed by the JVM. &nbsp;The audience was hoping for more specific patterns than Frederik offered up, but Frederik is clearly very knowledgeable about JRockit and the JVM specification.</div><div><br /></div><div>Ioke</div><div><br /></div><div>Ola tried to give a one-hour introduction to Ioke in 30 minutes and by his own admission it did not work out that well. &nbsp;Many in the audience were not familiar with the syntax of Io, which originally inspired Ioke, and that did not help him very much. &nbsp;The BNF was shown much more than actual source examples.</div><div><br /></div><div>Dinner</div><div><br /></div><div>Dinner was at Faultline, which has good food and terrible acoustics. &nbsp;Aside from having to scream at the people sitting near me it was a brilliant time. &nbsp;Dick Wall of the Java Posse was very friendly and a fantastic ambassador for the Java/Scala community. &nbsp;If you aren't familiar with the Java Posse podcast then I suggest you check it out on iTunes or at&nbsp;<a href="http://javaposse.com/">http://javaposse.com/</a>. Their Google group and mailing list are also interesting.</div><div><br /></div><div><br /></div><div><br /></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day_1.html</link>
            <guid>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day_1.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
            <pubDate>Fri, 18 Sep 2009 11:57:43 -0500</pubDate>
        </item>
        
        <item>
            <title>JVM Language Summit 2009 - Day 1</title>
            <description><![CDATA[Day 1 of the 2nd annual JVM Language Summit at Sun Microsystems has come to a close. &nbsp;Once again this conference has had excellent content in the presentations. &nbsp;My overall impression of today was that many projects (JRuby, Attila's MOP, etc.) already have done significant work on integrating invoke dynamic.&nbsp;<div><br /></div><div>And once again Sun was smart enough to invite non-JVM participants. &nbsp;Today was Allison Randal from Parrot. &nbsp;Tomorrow will include Erik Meijer of Microsoft and Haskell fame. I would love to see him top his presentation from last year; it will be difficult.<div><br /></div><div>Charlie Nutter (of Engine Yard, still not quite used to that) talked about JRuby, Duby and Surinx. &nbsp;</div><div><br /></div><div>Mark Reinhold talked about the JDK, Attila spoke about his MOP framework, Miles Sabin spoke about Scala tooling, and David Pollack spoke about Scala basics.<br /><div><br /></div><div>The JVM presentations were interesting, but the best material was at dinner. &nbsp;Sorry to non-attendees, that won't be on InfoQ later.</div><div><br /></div><div>John Rose's breakout session was the highlight of Day 1 for me. &nbsp;It is really fun to think about the parts of the implementation of Method Handles that do not have solutions yet or where certain features could be taken in the future.&nbsp;</div><div><br /></div><div>One of the breakout sessions was on Noop.&nbsp;<a href="http://code.google.com/p/noop/">http://code.google.com/p/noop/</a>&nbsp;This reminded me of the Fan programming language from last year's event. &nbsp;Only Fan has a real implementation and noop is more or less in the vaporware phase at this point in time. &nbsp;The two goals of noop are easy testing (even of legacy code) and readability. &nbsp;As conceived today it is not going to achieve either to the degree the implementors would like IMHO. 
&nbsp;It almost reminds me of turning Guice into a programming language...</div><div><br /></div><div>Oh yeah, and if you didn't already know, Brian Goetz knows a lot about concurrency. &nbsp;A whole lot. &nbsp;He was able to detect a concurrency bug in sample code that fit on one slide after a few seconds of reading it. &nbsp;</div><div><br /></div><div>Tomorrow's schedule is jam packed with presentations that interest me. &nbsp;Will post more tomorrow night!</div></div></div>]]></description>
            <link>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day.html</link>
            <guid>http://www.nearinfinity.com/blogs/bryan_weber/jvm_language_summit_2009_-_day.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">JVM</category>
            
            <pubDate>Wed, 16 Sep 2009 12:05:57 -0500</pubDate>
        </item>
        
        <item>
            <title>Sorting Collections in Hibernate Using SQL in @OrderBy</title>
            <description><![CDATA[<p>When you have collections of associated objects in domain objects, you generally want to specify some kind of default sort order. For example, suppose I have domain objects <code>Timeline</code> and <code>Event</code>:</p>

<pre class="prettyprint">
@Entity
class Timeline {

    @Required 
    String description

    @OneToMany(mappedBy = "timeline")
    @javax.persistence.OrderBy("startYear, endYear")
    Set&lt;Event&gt; events
}

@Entity
class Event {

    @Required
    Integer startYear

    Integer endYear

    @Required
    String description

    @ManyToOne
    Timeline timeline
}
</pre>

<p>In the above example I've used the standard JPA (Java Persistence API) <code>@OrderBy</code> annotation, which allows you to specify the order of a collection of objects via object properties, in this example a <code>@OneToMany</code> association. I'm ordering first by <code>startYear</code> in ascending order and then by <code>endYear</code>, also in ascending order. This is all well and good, but note that I've specified that only the start year is required. (The <a href="http://www.nearinfinity.com/blogs/scott_leberknight/validating_domain_objects_in_hibernate2.html">@Required</a> annotation is a custom Hibernate Validator annotation which does exactly what you would expect.) How are the events ordered when you have several events that start in the same year but some of them have no end year? The answer is that it depends on how your database sorts null values by default. Under Oracle 10g nulls will come last. For example, if three events all start in 2001 and one of them has no end year, here is how they are ordered:</p>

<pre class="prettyprint">
2001 2002  Some event
2001 2003  Other event
2001       Event with no end year
</pre>

<p>What if you want to control how null values are ordered so they come first rather than last? In Hibernate there are several ways you could do this. First, you could use the Hibernate-specific <code>@Sort</code> annotation to perform in-memory (i.e. not in the database) sorting, using natural sorting or sorting using a <code>Comparator</code> you supply. For example, assume I have an <code>EventComparator</code> helper class that implements <code>Comparator</code>. I could change <code>Timeline</code>'s collection of events to look like this:</p>

<pre class="prettyprint">
@OneToMany(mappedBy = "timeline")
@org.hibernate.annotations.Sort(type = SortType.COMPARATOR, comparator = EventComparator)
 Set&lt;Event&gt; events
</pre>
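
<p>For illustration only, here is a minimal Java sketch of what such an <code>EventComparator</code> might look like. The nested <code>Event</code> class is a simplified stand-in for the entity above, and both the nulls-first rule and the tie-break on <code>description</code> are assumptions made for this example rather than anything the mapping requires:</p>

<pre class="prettyprint">
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class EventComparatorSketch {

    // Simplified stand-in for the Event entity (persistence annotations omitted).
    static class Event {
        final Integer startYear;   // required
        final Integer endYear;     // optional, may be null
        final String description;  // required

        Event(Integer startYear, Integer endYear, String description) {
            this.startYear = startYear;
            this.endYear = endYear;
            this.description = description;
        }
    }

    // Orders by startYear, then by endYear with null end years first.
    // The final comparison on description is an assumption: it keeps a sorted
    // set from treating two different events with equal years as duplicates.
    static class EventComparator implements Comparator&lt;Event&gt; {
        @Override
        public int compare(Event a, Event b) {
            int byStart = a.startYear.compareTo(b.startYear);
            if (byStart != 0) {
                return byStart;
            }
            if (a.endYear == null || b.endYear == null) {
                if (a.endYear != null) return 1;   // only b has no end year: b first
                if (b.endYear != null) return -1;  // only a has no end year: a first
            } else {
                int byEnd = a.endYear.compareTo(b.endYear);
                if (byEnd != 0) return byEnd;
            }
            return a.description.compareTo(b.description);
        }
    }

    // Returns the descriptions of three sample events in comparator order.
    static List&lt;String&gt; sortedDescriptions() {
        SortedSet&lt;Event&gt; events = new TreeSet&lt;&gt;(new EventComparator());
        events.add(new Event(2001, 2003, "Other event"));
        events.add(new Event(2001, null, "Event with no end year"));
        events.add(new Event(2001, 2002, "Some event"));
        List&lt;String&gt; result = new ArrayList&lt;&gt;();
        for (Event e : events) {
            result.add(e.description);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedDescriptions());
    }
}
</pre>

<p>With this comparator the event that has no end year sorts ahead of the other two events that start in 2001.</p>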

<p>Using <code>@Sort</code> will perform sorting in-memory once the collection has been retrieved from the database. While you can certainly do this and implement arbitrarily complex sorting logic, it's probably better to sort in the database when you can. So we now need to turn to <i>Hibernate's</i> <code>@OrderBy</code> annotation, which lets you specify a <i>SQL fragment</i> describing how to perform the sort. For example, you can change the events mapping to:</p>

<pre class="prettyprint">
@OneToMany(mappedBy = "timeline")
@org.hibernate.annotations.OrderBy("start_year, end_year")
 Set&lt;Event&gt; events
</pre>

<p>This sort order is the same as using the JPA <code>@OrderBy</code> with "startYear, endYear" sort order. But since you write actual SQL in Hibernate's <code>@OrderBy</code> you can take advantage of whatever features your database has, at the possible expense of portability across databases. As an example, Oracle 10g supports using a syntax like "order by start_year, end_year nulls first" to order null end years before non-null end years. You could also say "order by start_year, end_year nulls last" which sorts null end years last as you would expect. This syntax is probably not portable, so another trick you can use is the NVL function, which several databases support (the standard SQL <code>COALESCE</code> function behaves the same way with two arguments). You can rewrite <code>Timeline</code>'s collection of events like so:</p>

<pre class="prettyprint">
@OneToMany(mappedBy = "timeline")
@org.hibernate.annotations.OrderBy("start_year, nvl(end_year , start_year)")
 Set&lt;Event&gt; events
</pre>

<p>The expression "nvl(end_year , start_year)" simply says to use <code>end_year</code> as the sort value if it is not null, and <code>start_year</code> if it is null. So for sorting purposes you end up treating <code>end_year</code> as the same as the <code>start_year</code> if <code>end_year</code> is null. In the contrived example earlier, applying the nvl-based sort using Hibernate's <code>@OrderBy</code> to specify SQL sorting criteria, you now end up with the events sorted like this:</p>

<pre class="prettyprint">
2001       Event with no end year
2001 2002  Some event
2001 2003  Other event
</pre>

<p>Which is what you wanted in the first place. So if you need more complex sorting logic than what you can get out of the standard JPA <code>@javax.persistence.OrderBy</code>, try one of the Hibernate sorting options, either <code>@org.hibernate.annotations.Sort</code> or <code>@org.hibernate.annotations.OrderBy</code>. Adding a SQL fragment into your domain class isn't necessarily the most <i>elegant</i> thing in the world, but it might be the most <i>pragmatic</i> thing.</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/sorting_collections_in_hiberna.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/sorting_collections_in_hiberna.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">database</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">hibernate</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Tue, 15 Sep 2009 12:40:00 -0500</pubDate>
        </item>
        
        <item>
            <title>Groovification</title>
            <description><![CDATA[ <p>Last week I tweeted about groovification, which is defined thusly:</p>

<p><i>groovification.</i> noun. the process of converting java source code into groovy source code (usually done to make development more fun)</p>

<p>On my main day-to-day project, we've been writing unit tests in Groovy for quite a while now, and recently we decided to start implementing new code in Groovy rather than Java. The reason for doing this is to gain more flexibility in development, to make testing easier (i.e. in terms of the ability to mock dependencies in a trivial fashion), to eliminate a lot of Java boilerplate code and thus write less code, and of course to make developing more fun. It's not that I hate Java so much as I feel Java simply isn't innovating anymore and hasn't for a while, and isn't adding features that I simply don't want to live without anymore such as closures and the ability to do metaprogramming when I need to. In addition, it isn't removing features that I don't want, such as checked exceptions. If I know, for a fact, that I can handle an exception, I'll handle it appropriately. Otherwise, when there's nothing I can do anyway, I want to let the damn thing propagate up and just show a generic error message to the user, log the error, and send the admin team an email with the problem details.</p>

<p>This being, for better or worse, a Maven project, we've had some interesting issues with mixed compilation of Java and Groovy code. The <a href="http://groovy.codehaus.org/">GMaven plugin</a> is easy to install and works well but currently has some outstanding issues related to Groovy stub generation, specifically it cannot handle <a href="http://jira.codehaus.org/browse/MGROOVY-108">generics</a> or <a href="http://jira.codehaus.org/browse/MGROOVY-109">enums</a> properly right now. (Maybe someone will be less lazy than me and help them fix it instead of complaining about it.) Since many of our classes use generics, e.g. in service classes that return domain objects, we currently are not generating stubs. We'll convert existing classes and any other necessary dependencies to Groovy as we make updates to Java classes, and we are implementing new code in Groovy. Especially in the web controller code, this becomes trivial since the controllers generally depend on other Java and/or Groovy code, but no other classes depend on the controllers. So starting in the web tier seems to be a good choice. Groovy combined with implementing controllers using the Spring @MVC annotation-based controller configuration style (i.e. no XML configuration), is making the controllers <i>really</i> thin, lightweight, and easy to read, implement, and test.</p>

<p>I estimate it will take a while to fully convert all the existing Java code to Groovy code. The point here is that we are doing it piecemeal rather than trying to do it all at once. Also, whenever we convert a Java file to a Groovy one, there are a few basics to make the classes Groovier without going totally overboard and spending loads of time. First, once you've used <a href="http://www.jetbrains.com/idea/">IntelliJ's</a> move refactoring to move the .java file to the Groovy source tree (since we have src/main/java and src/main/groovy) you can then use IntelliJ's handy-dandy "Rename to Groovy" refactoring. In IntelliJ 8.1 you need to use the "Search - Find Action" menu option or keystroke and type "Rename to..." and select "Rename to Groovy" since they goofed in version 8 and that option was left off a menu somehow. Once that's done you can do a few simple things to make the class a bit more groovy. First, get rid of all the semi-colons. Next, replace getter/setter code with direct property access. Third, replace for loops with "each"-style internal iterators when you don't need the loop index and "eachWithIndex" where you do. You can also get rid of some of the redundant modifiers like "public class" since that is the Groovy default. That's not too much at once, doesn't take long, and makes your code Groovier. Over time you can do more groovification if you like.</p>

<p>The most common gotchas I've found have to do with code that uses anonymous or inner classes since Groovy doesn't support those Java language features. In that case you can either make a non-public named class (and it's OK to put it in the same Groovy file unlike Java as long as it's not public) or you can refactor the code some other way (using your creativity and expertise since we are not <a href="http://www.nearinfinity.com/blogs/scott_leberknight/thinking_matters.html">monkeys</a>, right?). This can sometimes be a pain, especially if you are using a lot of them. So it goes. (And yes, that is a <a href="http://en.wikipedia.org/wiki/Slaughterhouse-Five">Slaughterhouse Five</a> reference.)</p>

<p>Happy groovification!</p>]]></description>
            <link>http://www.nearinfinity.com/blogs/scott_leberknight/groovification.html</link>
            <guid>http://www.nearinfinity.com/blogs/scott_leberknight/groovification.html</guid>
            
                <category domain="http://www.sixapart.com/ns/types#category">Groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#category">Java</category>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">gmaven</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">groovy</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">java</category>
            
            <pubDate>Mon, 04 May 2009 17:22:38 -0500</pubDate>
        </item>
        
    </channel>
</rss>
