java.lang.Object
  ewe.util.DataParser
A DataParser extracts numeric and textual information from a formatted line of text. It works in a similar fashion to the C/C++ scanf() routines, in that you specify the format of the data using "%" fields in a format String.
To parse a large text file you should use the ewe.io.StreamScanner class. This can scan entire files at the maximum speed while not creating any objects for each scan.
Specifying Data Format
You specify the data you wish to parse as a String containing a set of '%' formats separated by spaces, similar to the C/C++ scanf() function. You can scan for Strings (either as fixed length Strings or individual words), integer/long values or floating point values.
The formats you can use for scanning numbers are:
%#i %#f %#d
'i' indicates an integer value and 'f' or 'd' indicates a floating point (double) value.
'#' indicates an optional number specifying the number of digits to read.
If you do not specify a number of digits, then all non-space characters will be read in and
then converted to a number.
For strings, words or characters use:
%#c %#s %q
'c' indicates a single character (byte) to read.
's' indicates a single word or String to read.
'q' indicates a word or set of words that may be in quotes. In other words, if the first
character read is a ' or " character, then all characters will be read until a matching
quote is found. If the first character is not a quote character, then only the first word
is read in.
'#' indicates an optional number specifying the number of characters to read (note that
this cannot be used with the 'q' format).
Note that '%10c' and '%10s' will have the same effect - i.e. both will read in a string of 10
characters, but '%c' reads a single character and '%s' reads the next single word.
Skipping fields - Using a '!' character instead of a '%' character will indicate that the specified field should be skipped over instead of being converted and returned.
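The format rules above can be sketched as a small tokenizer. This is only an illustration of the notation, not the DataParser implementation: FormatSketch, Field and fields() are hypothetical names, and the exact error handling of the real class is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of ewe.util.DataParser. It only splits a
// format string into fields according to the rules described above.
final class FormatSketch {

    /** One field of a format string, e.g. "%10s" or "!5s". */
    static final class Field {
        final boolean skipped; // true for '!' fields, false for '%' fields
        final int width;       // the optional '#' count, or -1 if absent
        final char type;       // one of i, f, d, c, s, q

        Field(boolean skipped, int width, char type) {
            this.skipped = skipped;
            this.width = width;
            this.type = type;
        }
    }

    /** Splits a space-separated format string into its fields. */
    static List<Field> fields(String format) {
        List<Field> out = new ArrayList<>();
        for (String token : format.trim().split("\\s+")) {
            // Every field starts with '%' (returned) or '!' (skipped).
            if (token.length() < 2 || "%!".indexOf(token.charAt(0)) < 0) {
                throw new IllegalArgumentException("Bad field: " + token);
            }
            char type = token.charAt(token.length() - 1);
            if ("ifdcsq".indexOf(type) < 0) {
                throw new IllegalArgumentException("Bad type: " + token);
            }
            // Whatever sits between the marker and the type is the count.
            String digits = token.substring(1, token.length() - 1);
            if (!digits.isEmpty() && !digits.matches("\\d+")) {
                throw new IllegalArgumentException("Bad count: " + token);
            }
            // A count cannot be combined with the 'q' format.
            if (!digits.isEmpty() && type == 'q') {
                throw new IllegalArgumentException("No count allowed: " + token);
            }
            int width = digits.isEmpty() ? -1 : Integer.parseInt(digits);
            out.add(new Field(token.charAt(0) == '!', width, type));
        }
        return out;
    }
}
```

For the format "%10s !5s %i %f" this yields four fields, of which the second is marked as skipped and the last two carry no count.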
Retrieving the Parsed Data
This can be done in two ways. The parse() methods return an Object array that contains a single Object for each '%' field in the scan string (but NOT for any '!' fields). Each object will be either a ewe.sys.Long object (for %i fields), a ewe.sys.Double object (for %f fields) or a ewe.util.SubString object (for all text fields). So a scan of "%10s !5s %i %f" will return an array of 3 objects. The object at index 0 will be a SubString, the one at index 1 will be a Long object and the one at index 2 will be a Double object. Note that these objects are re-used for the next parse(), so copy out any values you need before parsing again.
You can also ignore the return value of parse() and instead call one of the getXXX() methods to retrieve a particular data type from the scanned array of values. Using the same example "%10s !5s %i %f" after a parse you could call getString(0) followed by getInt(1) followed by getDouble(2). These calls are only valid until the next parse().
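The index mapping in the "%10s !5s %i %f" example can be sketched with a small stand-in scanner. This is a simplified illustration under stated assumptions, not the DataParser code: ScanSketch and scan() are hypothetical names, java.lang.Long, java.lang.Double and String stand in for ewe.sys.Long, ewe.sys.Double and ewe.util.SubString, spaces before a field are assumed to be ignored, and the 'q' format is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the real ewe.util.DataParser. It shows how
// '!' fields consume input without taking up an index in the result.
final class ScanSketch {

    static List<Object> scan(String format, String data) {
        List<Object> values = new ArrayList<>();
        int pos = 0;
        for (String field : format.trim().split("\\s+")) {
            boolean skip = field.charAt(0) == '!';
            char type = field.charAt(field.length() - 1);
            String digits = field.substring(1, field.length() - 1);
            if (type == 'q') {
                throw new UnsupportedOperationException("%q is not sketched here");
            }
            // Assumption: spaces in front of a field are ignored.
            while (pos < data.length() && data.charAt(pos) == ' ') pos++;
            if (pos >= data.length()) {
                throw new IndexOutOfBoundsException("Not enough data for " + field);
            }
            int end;
            if (!digits.isEmpty()) {
                end = pos + Integer.parseInt(digits); // fixed number of characters
                if (end > data.length()) {
                    throw new IndexOutOfBoundsException("Not enough data for " + field);
                }
            } else if (type == 'c') {
                end = pos + 1;                        // a single character
            } else {
                end = data.indexOf(' ', pos);         // up to the next space
                if (end < 0) end = data.length();
            }
            String text = data.substring(pos, end);
            pos = end;
            if (skip) continue;                       // '!' fields return nothing
            switch (type) {
                case 'i': values.add(Long.valueOf(text.trim())); break;
                case 'f':
                case 'd': values.add(Double.valueOf(text.trim())); break;
                default:  values.add(text);
            }
        }
        return values;
    }
}
```

Calling scan("%10s !5s %i %f", "abcdefghij 12345 42 3.5") in this sketch gives three values: the text "abcdefghij" at index 0, the integer 42 at index 1 and the floating point value 3.5 at index 2. The skipped "12345" occupies no index, which is why the example above uses getInt(1) and getDouble(2).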
| Constructor Summary | |
DataParser(String format)
Create a new DataParser for the specified format. |
|
| Method Summary | |
double |
getDouble(int index)
Use this to get the value that was just parsed at the specified index. |
String |
getFormat()
|
int |
getInt(int index)
Use this to get the value that was just parsed at the specified index. |
long |
getLong(int index)
Use this to get the value that was just parsed at the specified index. |
String |
getString(int index)
Use this to get the value that was just parsed at the specified index. |
SubString |
getSubString(int index)
Use this to get the value that was just parsed at the specified index. |
Object |
getValue(int index)
Get the parsed value at the specified index. |
Object[] |
parse(byte[] buffer,
int start,
int length)
Parse a string of UTF encoded bytes. |
Object[] |
parse(char[] chars,
int start,
int length)
Parse an array of characters. |
Object[] |
parse(String data)
Parse a string. |
static DataParser |
parseString(String data,
String format)
This creates a new DataParser for the specified format and then parses the String. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, toString |
| Constructor Detail |
public DataParser(String format)
throws IllegalArgumentException
format - The format in % notation.
IllegalArgumentException - if the format is malformed.
| Method Detail |
public String getFormat()
public static DataParser parseString(String data,
String format)
throws IllegalArgumentException,
IndexOutOfBoundsException
data - The data to parse.
format - The format string.
IllegalArgumentException - if the format is malformed.
IndexOutOfBoundsException - if there was not enough data to parse all formats.
public Object[] parse(String data)
throws IndexOutOfBoundsException
data - the String data.
IndexOutOfBoundsException - if there was not enough data to parse all formats.
public Object[] parse(byte[] buffer,
int start,
int length)
throws IndexOutOfBoundsException
buffer - the array containing the bytes.
start - the start of the data bytes in the array.
length - the number of data bytes in the array.
IndexOutOfBoundsException - if there was not enough data to parse all formats.
public Object[] parse(char[] chars,
int start,
int length)
throws IndexOutOfBoundsException
chars - the array containing the characters.
start - the start of the data characters in the array.
length - the number of data characters in the array.
IndexOutOfBoundsException - if there was not enough data to parse all formats.
public Object getValue(int index)
throws IndexOutOfBoundsException
index - The index of the retrieved value.
IndexOutOfBoundsException - if the index is out of bounds.
public long getLong(int index)
throws IllegalArgumentException,
IndexOutOfBoundsException
index - The index of the value for the '%i' element as specified in the format string.
IllegalArgumentException - If the element did not denote an integer value.
IndexOutOfBoundsException - If the index is out of bounds.
public int getInt(int index)
throws IllegalArgumentException,
IndexOutOfBoundsException
index - The index of the value for the '%i' element as specified in the format string.
IllegalArgumentException - If the element did not denote an integer value.
IndexOutOfBoundsException - If the index is out of bounds.
public double getDouble(int index)
throws IllegalArgumentException,
IndexOutOfBoundsException
index - The index of the value for the '%f' element as specified in the format string.
IllegalArgumentException - If the element did not denote a floating point value.
IndexOutOfBoundsException - If the index is out of bounds.
public SubString getSubString(int index)
throws IllegalArgumentException,
IndexOutOfBoundsException
index - The index of the value for the '%' element as specified in the format string.
IllegalArgumentException - If the element did not denote a text value.
IndexOutOfBoundsException - If the index is out of bounds.
public String getString(int index)
throws IllegalArgumentException,
IndexOutOfBoundsException
index - The index of the value for the '%' element as specified in the format string.
IllegalArgumentException - If the element did not denote a text value.
IndexOutOfBoundsException - If the index is out of bounds.