Wednesday, February 14, 2007

Parsing Fixed-Length Positional Text Data

I recently had to parse out positional (fixed-length field) data, and I really didn't want to write some ugly hard-coded parser that would break every time a field changed length or a new field was added.

I also didn't want to write some overly complicated abomination with XML configuration files and AbstractFactories and visitor patterns and so on (you know, like the Apache guys would have done.)

Instead, I chose to use some of the new features of JDK 5 -- like enums and the enhanced for loop. I created an enum like the one shown below.


public enum GetTransactionFields
{
    // Constants are declared in the same order as the fields appear in the record.
    ADDR_ID (10, Justification.LEFT_SPACEFILLED),
    ADDRESS (70, Justification.LEFT_SPACEFILLED),
    TYPE_CODE (1, Justification.NONE),
    ACCOUNT_INFO_IND (1, Justification.NONE),
    RECEIVE_OFFERS_IND (1, Justification.NONE),
    THIRD_PARTY_IND (1, Justification.NONE),
    CUSTOMER_EMAIL_STATUS (1, Justification.NONE),
    WIRELESS_CARRIER (4, Justification.NONE),
    INVALID_IND (1, Justification.NONE),
    SOURCE_CODE (1, Justification.NONE),
    SYSTEM_CODE (1, Justification.NONE),
    LAST_UPDATE (8, Justification.NONE),
    ;

    private final int fieldSize;
    private final Justification justification;

    /** Private constructor */
    private GetTransactionFields(int fieldSize, Justification justification)
    {
        this.fieldSize = fieldSize;
        this.justification = justification;
    }

    /**
     * @return Returns the justification.
     */
    public Justification getJustification()
    {
        return justification;
    }

    /**
     * @return Returns the fieldSize.
     */
    public int getFieldSize()
    {
        return fieldSize;
    }
}


Now, to add a new field, I simply insert it in the right position, with its fixed length and its justification. (If you only get 10 characters in a 20-character field, are they left or right justified? In this case Justification is an enum as well, but I ended up not needing to use it.)
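The Justification enum itself isn't shown in this post, so the following is only a sketch of what it might look like if you did need it to strip padding from a parsed value. LEFT_SPACEFILLED and NONE are the constants used above; RIGHT_SPACEFILLED and the strip() method are assumptions added purely for illustration.

```java
class JustificationDemo {
    public static void main(String[] args) {
        // Left-justified value: trailing padding is removed.
        System.out.println("[" + Justification.LEFT_SPACEFILLED.strip("12345     ") + "]");
        // NONE leaves the raw value untouched.
        System.out.println("[" + Justification.NONE.strip(" ") + "]");
    }
}

// Hypothetical reconstruction: the real enum from the application is not shown.
enum Justification {
    NONE,
    LEFT_SPACEFILLED,   // value starts at the left, padded with trailing spaces
    RIGHT_SPACEFILLED;  // assumed constant, not named in the original post

    /** Strips the space padding implied by this justification (assumed helper). */
    String strip(String raw) {
        switch (this) {
            case LEFT_SPACEFILLED:
                return raw.replaceAll(" +$", "");
            case RIGHT_SPACEFILLED:
                return raw.replaceAll("^ +", "");
            default:
                return raw;
        }
    }
}
```

With something like this in place, the parser could call f.getJustification().strip(value) on each field instead of hard-coding any trimming rules.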

If a field size changes, you only need to change the size here. Your parser code does not need to know or care. That parsing code is extremely simple. An example is shown below. This example just prints the field name (from the enum) and the value that was parsed out.

In my real application, I created an XML document using the enum names as the XML nodes, and the values were the parsed values. This way, the data was more flexible to work with, as I could use XSLT to show the data nicely formatted on a JSP log viewer, and use another XSLT to transform it into the format needed by the backend web services.
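The XML step can be sketched with the DOM and transformer classes that ship with the JDK. This is not the code from the real application: the DemoFields enum is a cut-down stand-in for GetTransactionFields, and the "record" root element name, the trim() call, and the toXml method are all assumptions made for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RecordToXml {
    // Cut-down, hypothetical stand-in for GetTransactionFields.
    enum DemoFields {
        ADDR_ID(10), TYPE_CODE(1), LAST_UPDATE(8);

        private final int fieldSize;
        DemoFields(int fieldSize) { this.fieldSize = fieldSize; }
        int getFieldSize() { return fieldSize; }
    }

    /** Builds one XML element per enum constant, named after the constant. */
    static String toXml(String record) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("record"); // root name is an assumption
            doc.appendChild(root);

            int posn = 0;
            for (DemoFields f : DemoFields.values()) {
                int endIdx = posn + f.getFieldSize();
                Element node = doc.createElement(f.name());
                node.setTextContent(record.substring(posn, endIdx).trim());
                root.appendChild(node);
                posn = endIdx;
            }

            // Serialize the DOM tree to a string without the XML declaration.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML for record", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("A123456789H20070214"));
    }
}
```

Because the element names come straight from the enum, adding a field to the enum adds a node to the XML with no other code changes, which is what makes the XSLT step downstream practical.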


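private void parseGetRecord(String record)
{
    int posn = 0;   // start index of the current field
    int endIdx = 0; // end index (exclusive) of the current field
    // values() returns the constants in declaration order, which matches the record layout
    for (GetTransactionFields f : GetTransactionFields.values())
    {
        endIdx = posn + f.getFieldSize();
        System.out.println("Parsing field " + f.toString() + ", value = " + record.substring(posn, endIdx));
        posn = endIdx;
    }
}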


And that's all there is to it. An enum and a for loop. Simple, elegant, and best of all, maintainable even by drooling neanderthal programmers.
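One thing the loop does assume is that the record is at least as long as the sum of the field sizes; otherwise substring throws a StringIndexOutOfBoundsException. A minimal sketch of a guard follows, again using a cut-down hypothetical DemoFields enum rather than the real one; expectedLength and isParseable are names invented for this example.

```java
class RecordLengthCheck {
    // Cut-down, hypothetical stand-in for GetTransactionFields.
    enum DemoFields {
        ADDR_ID(10), TYPE_CODE(1), LAST_UPDATE(8);

        private final int fieldSize;
        DemoFields(int fieldSize) { this.fieldSize = fieldSize; }
        int getFieldSize() { return fieldSize; }
    }

    /** Sums the field sizes, so the expected length tracks the enum automatically. */
    static int expectedLength() {
        int total = 0;
        for (DemoFields f : DemoFields.values()) {
            total += f.getFieldSize();
        }
        return total;
    }

    /** True if the record is long enough for every substring call to succeed. */
    static boolean isParseable(String record) {
        return record != null && record.length() >= expectedLength();
    }

    public static void main(String[] args) {
        System.out.println(expectedLength());
        System.out.println(isParseable("A123456789H20070214"));
        System.out.println(isParseable("too short"));
    }
}
```

Since the expected length is derived from the enum, this check stays correct when a field is added or resized.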

3 comments:

David Clarke said...

Just a thanks for your post - I've used your approach for parsing a couple of different flat file formats. I generalised the parsing method so I could pass into it the appropriate enum and I'm really happy with the result, details in the last post at Help passing enum as method parameter.
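The generalised code from that thread isn't reproduced here, so what follows is only one way such a method could look, not David's actual implementation. It assumes a small FixedField interface that each field enum implements; that interface, the DemoFields enum, and the parse method are all hypothetical names.

```java
import java.util.EnumMap;
import java.util.Map;

class GenericFixedWidthParser {
    /** Hypothetical interface implemented by every field-layout enum. */
    interface FixedField {
        int getFieldSize();
    }

    // Cut-down, hypothetical stand-in for GetTransactionFields.
    enum DemoFields implements FixedField {
        ADDR_ID(10), TYPE_CODE(1), LAST_UPDATE(8);

        private final int fieldSize;
        DemoFields(int fieldSize) { this.fieldSize = fieldSize; }
        public int getFieldSize() { return fieldSize; }
    }

    /** Parses a record using whichever field enum is passed in. */
    static <E extends Enum<E> & FixedField> Map<E, String> parse(Class<E> type, String record) {
        Map<E, String> values = new EnumMap<>(type);
        int posn = 0;
        // getEnumConstants() returns the constants in declaration order.
        for (E f : type.getEnumConstants()) {
            int endIdx = posn + f.getFieldSize();
            values.put(f, record.substring(posn, endIdx));
            posn = endIdx;
        }
        return values;
    }

    public static void main(String[] args) {
        Map<DemoFields, String> parsed = parse(DemoFields.class, "A123456789H20070214");
        System.out.println(parsed);
    }
}
```

The same parse method then works for any record layout, as long as its enum implements the interface.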

Jacob von Eyben said...

Parsing this kind of fixed-length data is something we all come across some day :-)

Therefore I have created a small non-intrusive framework called fixedformat4j to do the work.

It uses annotations to specify where data is located in text files and is capable of transforming java objects to and from string representation.

The project is open source and can be found here: fixedformat4j.ancientprogramming.com
