ewe.io
Class StreamScanner

java.lang.Object
  extended by ewe.io.StreamScanner

public abstract class StreamScanner
extends Object

A StreamScanner parses formatted data from a text file as quickly as possible. To use one, extend the class and implement the abstract lineParsed() method; override lineReceived() as well if you need to inspect each line before it is parsed.

Specifying Data Format

You specify the data you wish to parse as a String containing a set of '%' formats separated by spaces, similar to the C/C++ scanf function. You can scan for Strings (either fixed-length Strings or individual words), integer/long values, or floating-point values. The StreamScanner uses an ewe.util.DataParser object for line scanning, so the format string specifications can be found in the API documentation for ewe.util.DataParser.

Scanning the Data

The startScanning() method begins the scan operation on the Stream. Before calling it, you must register each scanning format that you will use on the file by calling addFormat(). Each format is assigned a new unique integer ID that identifies it. Note that the standard constructors accept one format, which is usually all you need.

Retrieving Scanned Data

After each line of text is parsed, the lineParsed() method is called. It receives an array of Objects that hold the parsed data (no entries are provided for skipped data). Each element is an ewe.sys.Long object (for integer values), an ewe.sys.Double object (for floating-point values), or a SubString object.

If the preprocessLine variable is set to true, the lineReceived() method is called before each line is parsed. It should return a format ID to indicate which format to use, or 0 to indicate that the line should be skipped.
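
The callback flow described above can be sketched with a small self-contained stand-in. MiniScanner below is hypothetical and is not the ewe.io.StreamScanner class; it only imitates the documented contract (lineReceived() returning a format ID or 0 to skip, followed by lineParsed()). Whitespace splitting stands in for the real ewe.util.DataParser format strings, java.lang.Long, Double and String stand in for ewe.sys.Long, ewe.sys.Double and SubString, and the single format ID of 1 is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that imitates the documented callback contract.
// It is NOT ewe.io.StreamScanner and does not use ewe.util.DataParser.
abstract class MiniScanner {
    public boolean preprocessLine = false;
    public boolean shouldStop = false;
    public long linesRead = 0;

    // Called before parsing when preprocessLine is true.
    // Returns a format ID, or 0 to skip the line.
    protected int lineReceived(String line) { return 1; }

    // Called once per parsed line with the parsed values.
    protected abstract void lineParsed(int format, Object[] parsedValues);

    // Scans every line of the text and returns the number of lines read.
    public long scan(String text) {
        linesRead = 0;
        for (String line : text.split("\r\n|\r|\n")) {
            if (shouldStop) break;
            linesRead++;
            int format = preprocessLine ? lineReceived(line) : 1;
            if (format == 0) continue; // skipped line, no lineParsed() call
            lineParsed(format, parse(line));
        }
        return linesRead;
    }

    // Simplified parsing: each token becomes a Long, a Double or a String.
    private static Object[] parse(String line) {
        List<Object> out = new ArrayList<>();
        for (String tok : line.trim().split("\\s+")) {
            if (tok.isEmpty()) continue;
            try { out.add(Long.valueOf(tok)); continue; } catch (NumberFormatException e) { }
            try { out.add(Double.valueOf(tok)); continue; } catch (NumberFormatException e) { }
            out.add(tok);
        }
        return out.toArray();
    }
}

// Example subclass: skips lines starting with '#' and collects the rest.
class CommentSkippingScanner extends MiniScanner {
    final List<Object[]> rows = new ArrayList<>();

    CommentSkippingScanner() { preprocessLine = true; }

    @Override
    protected int lineReceived(String line) {
        return line.startsWith("#") ? 0 : 1;
    }

    @Override
    protected void lineParsed(int format, Object[] parsedValues) {
        rows.add(parsedValues);
    }
}
```

With the real class the same shape would apply: a subclass sets preprocessLine, returns 0 from lineReceived() for lines it wants ignored, and handles the typed values in lineParsed().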


Field Summary
 int bufferSize
          This is the buffer size to use when reading.
 long linesRead
          This indicates how many lines have been read since startScanning() was called.
 boolean preprocessLine
          Set this to true to indicate that the lineReceived() method should be called prior to parsing the line.
 boolean shouldStop
          Set this to true to tell the StreamScanner to stop scanning.
 int waitTime
          This is the length of time to wait between calls to the nonBlockingRead() method of the input stream when that call indicates that no data is available yet.
 
Constructor Summary
StreamScanner(BasicStream s, String format)
           
StreamScanner(InputStream s, String format)
           
 
Method Summary
 int addFormat(String format)
           
protected abstract  void lineParsed(int format, Object[] parsedValues)
           
protected  int lineReceived(byte[] buffer, int lineStart, int lineLength)
          If preprocessLine is true, this method is called after each line is read.
protected  void parseError(Exception e, int format, byte[] buffer, int lineStart, int lineLength)
          This method is called if there is an error parsing a line.
 boolean readLine()
          This reads in a line of text.
protected  boolean readMore()
           
 long startScanning()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, toString
 

Field Detail

bufferSize

public int bufferSize
This is the buffer size to use when reading. Generally, the larger the buffer, the faster reading will go.


waitTime

public int waitTime
This is the length of time to wait between calls to the nonBlockingRead() method of the input stream when that call indicates that no data is available yet. By default it is 0 (which simply passes control to another thread). Set it to -1 to not wait at all.
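
The three cases for waitTime can be illustrated with a minimal sketch. WaitPolicy is a hypothetical helper, not part of ewe.io.StreamScanner; treating positive values as milliseconds, treating every negative value like -1, and using Thread.yield() for the 0 case are all assumptions made for illustration.

```java
// Hypothetical helper, not part of ewe.io.StreamScanner.
final class WaitPolicy {
    // Pauses after a read that reported no available data.
    // Returns true if the thread yielded or slept, false if it did not wait.
    static boolean pause(int waitTime) {
        if (waitTime < 0) {
            return false;              // -1: do not wait at all (assumed for all negatives)
        }
        if (waitTime == 0) {
            Thread.yield();            // 0: just give other threads a chance to run
            return true;
        }
        try {
            Thread.sleep(waitTime);    // positive: sleep (milliseconds is an assumption)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return true;
    }
}
```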


shouldStop

public boolean shouldStop
Set this to true to tell the StreamScanner to stop scanning.


linesRead

public long linesRead
This indicates how many lines have been read since startScanning() was called.


preprocessLine

public boolean preprocessLine
Set this to true to indicate that the lineReceived() method should be called prior to parsing the line.

Constructor Detail

StreamScanner

public StreamScanner(BasicStream s,
                     String format)
              throws IllegalArgumentException

StreamScanner

public StreamScanner(InputStream s,
                     String format)
              throws IllegalArgumentException
Method Detail

startScanning

public long startScanning()
                   throws IOException
Throws:
IOException

lineReceived

protected int lineReceived(byte[] buffer,
                           int lineStart,
                           int lineLength)
                    throws IOException
If preprocessLine is true, this method is called after each line is read.

Parameters:
buffer - The bytes for the line.
lineStart - The starting index of the bytes in the buffer.
lineLength - The number of bytes in the line.
Returns:
the integer ID of a format specification, or 0 to skip this line.
Throws:
IOException - if parsing should stop.

parseError

protected void parseError(Exception e,
                          int format,
                          byte[] buffer,
                          int lineStart,
                          int lineLength)
                   throws IOException
This method is called if there is an error parsing a line. It should throw an IOException if it wishes parsing to stop. By default this does nothing.

Parameters:
e - The exception caused by parsing the line.
format - The format used for parsing.
buffer - The bytes for the line.
lineStart - The starting index of the bytes in the buffer.
lineLength - The number of bytes in the line.
Throws:
IOException - if parsing should stop.

lineParsed

protected abstract void lineParsed(int format,
                                   Object[] parsedValues)
                            throws IOException
Throws:
IOException

addFormat

public int addFormat(String format)
              throws IllegalArgumentException
Throws:
IllegalArgumentException

readMore

protected boolean readMore()
                    throws IOException
Throws:
IOException

readLine

public boolean readLine()
                 throws IOException
This reads in a line of text. The line is terminated by a line feed (\n), a carriage return (\r), or a CR followed by an LF (\r\n). The terminating LF or CR is not included in the line.

Throws:
IOException
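
The line-termination rule given for readLine() can be sketched as follows. LineSplitter is a hypothetical helper, not the ewe implementation; it works on a complete in-memory byte array rather than a stream, and decoding the bytes as ISO-8859-1 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the ewe implementation.
final class LineSplitter {
    // Splits a buffer into lines terminated by LF, CR, or CR followed by LF.
    // Terminators are not included in the returned lines.
    static List<String> splitLines(byte[] buffer) {
        List<String> lines = new ArrayList<>();
        int lineStart = 0;
        int i = 0;
        while (i < buffer.length) {
            byte b = buffer[i];
            if (b == '\n' || b == '\r') {
                lines.add(new String(buffer, lineStart, i - lineStart,
                        StandardCharsets.ISO_8859_1));
                // Treat CR immediately followed by LF as a single terminator.
                if (b == '\r' && i + 1 < buffer.length && buffer[i + 1] == '\n') {
                    i++;
                }
                i++;
                lineStart = i;
            } else {
                i++;
            }
        }
        // A final line without a terminator is still returned.
        if (lineStart < buffer.length) {
            lines.add(new String(buffer, lineStart, buffer.length - lineStart,
                    StandardCharsets.ISO_8859_1));
        }
        return lines;
    }
}
```

Tracking lineStart and a length, rather than copying each line as it is found, mirrors the (buffer, lineStart, lineLength) parameters that lineReceived() and parseError() receive.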