
Monday, November 07, 2005

Regular Expressions Are Your Friend

Once you get past their cryptic syntax, regular expressions let you accomplish a lot with very little code.

For a great overview, visit the Tao of Regular Expressions (http://sitescooper.org/tao_regexps.html).

Using regular expressions within Java is simple. One of my projects is a Struts web application, and I wanted to give users the option of uploading a CSV (comma-separated values) file to run batch updates. The upload piece was easy using Jakarta Commons FileUpload. The interesting part was correctly parsing the CSV file into value objects (VOs), allowing for quoted or unquoted values, some of which may contain embedded commas. I found the regular expression used below in the Java Cookbook by Ian Darwin.

You can get more detail on the java.util.regex package from the JavaDoc API documentation online. The difference between m.find() and m.group() below is basically that m.find() advances to the next match and reports whether there is one, whereas m.group() returns the text of that match. It's roughly like the difference between iterator.hasNext() and iterator.next(), except that find() itself moves the matcher forward.
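To see find() and group() in isolation, here is a minimal sketch using a much simpler pattern than the CSV one. The class name FindGroupDemo and the helper numbersIn are made up for illustration; only Pattern, Matcher, find() and group() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindGroupDemo {
    // Hypothetical helper: collects every run of digits found in the text.
    static List<String> numbersIn(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {         // advance to the next match, if there is one
            found.add(m.group());  // read the text of the match just found
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(numbersIn("a1 b22 c333"));
    }
}
```

Note that calling group() before a successful find() throws an IllegalStateException, which is why the loop always calls find() first.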

import java.util.regex.Matcher;
import java.util.regex.Pattern;
...

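// Alternatives: a quoted field, an unquoted field, or a lone comma (empty field).
// A comma that follows a field is consumed as part of that field's match.
String CSV_PATTERN = "\"([^\"]+?)\",?|([^,]+),?|,";
Pattern _csvRegex;
String input;

_csvRegex = Pattern.compile(CSV_PATTERN);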

// read in your file or whatever into 'input'

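Matcher m = this._csvRegex.matcher(input);

while (m.find()) // attempts to find the next match; returns false when none remain
{
    // whole text of the current match, including any surrounding quotes and trailing comma
    String field = m.group();
    // do something with field ...
}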
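
Since m.group() returns the entire match, quotes and trailing comma included, one way to get clean values is to read the capture groups instead. The following is only a sketch built around the same pattern. The class name CsvLineParser and the method parseLine are my own names for illustration, not something from the Java Cookbook, and it assumes one CSV record per input line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CsvLineParser {
    // Same pattern as above: quoted field | unquoted field | lone comma.
    private static final Pattern CSV_REGEX =
            Pattern.compile("\"([^\"]+?)\",?|([^,]+),?|,");

    // Hypothetical helper: splits one CSV line into its field values.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = CSV_REGEX.matcher(line);
        while (m.find()) {
            if (m.group(1) != null) {
                fields.add(m.group(1));  // quoted field: capture excludes the quotes
            } else if (m.group(2) != null) {
                fields.add(m.group(2));  // unquoted field: capture excludes the comma
            } else {
                fields.add("");          // lone comma: treat as an empty field
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Four fields; the first contains an embedded comma, the third is empty.
        for (String field : parseLine("\"Smith, John\",42,,Boston")) {
            System.out.println("[" + field + "]");
        }
    }
}
```

This pattern has limits worth knowing about: it does not handle doubled quotes inside a quoted field, and an empty field after a final trailing comma is not reported. For anything beyond simple input, a dedicated CSV library is probably the safer choice.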