pt.tumba.spell
Class XMLWordFinder

java.lang.Object
  extended by pt.tumba.spell.DefaultWordFinder
      extended by pt.tumba.spell.XMLWordFinder

public class XMLWordFinder
extends DefaultWordFinder

A word finder for XML documents. It searches the text for sequences of letters and ignores tags.

Author:
Bruno Martins
See Also:
DefaultWordFinder

Field Summary
 
Fields inherited from class pt.tumba.spell.DefaultWordFinder
currentSegmentPos, currentWord, currentWordPos, nextSegmentPos, nextWord, nextWordPos, sentenceIterator, solveHardCases, startsSentence, text
 
Constructor Summary
XMLWordFinder()
          No-argument constructor for XMLWordFinder.
XMLWordFinder(java.lang.String inText)
          Constructor for XMLWordFinder that takes the input text to tokenize.
 
Method Summary
 java.lang.String currentSegment()
          Returns the current text segment from the input.
 java.lang.String next()
          Scans the text from the end of the last word and returns a String corresponding to the next word.
 
Methods inherited from class pt.tumba.spell.DefaultWordFinder
current, getText, hasNext, ignore, ignore, ignore, ignore, isWordChar, isWordChar, lookAhead, nextSegment, replace, replaceBigram, replaceSegment, setText, splitSegments, splitWords, startsSentence, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

XMLWordFinder

public XMLWordFinder(java.lang.String inText)
Constructor for XMLWordFinder that takes the input text to tokenize.

Parameters:
inText - A String with the input text to tokenize.

XMLWordFinder

public XMLWordFinder()
No-argument constructor for XMLWordFinder.

Method Detail

currentSegment

public java.lang.String currentSegment()
Returns the current text segment from the input. A segment is the character sequence between the current position and the next non-alphanumeric character; white space is also taken into account.

Overrides:
currentSegment in class DefaultWordFinder
Returns:
A String with the current text segment.

next

public java.lang.String next()
Scans the text from the end of the last word and returns a String corresponding to the next word. If there are no more words to return, it returns null.

Overrides:
next in class DefaultWordFinder
Returns:
The next word, or null if there are no more words.
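
The contract described for next() (return successive letter sequences, skip tags, return null when the input is exhausted) can be illustrated with a small standalone sketch. This is not the pt.tumba.spell implementation: the class TagSkippingScanner and its words helper are hypothetical names introduced only for illustration, and the sketch assumes a tag is everything between '<' and the next '>', so a '>' inside an attribute value would be handled incorrectly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that mimics the documented next() contract. */
final class TagSkippingScanner {
    private final String text;
    private int pos = 0; // index just past the last word returned

    TagSkippingScanner(String text) {
        this.text = text;
    }

    /** Returns the next run of letters found outside a tag, or null if none remain. */
    String next() {
        boolean inTag = false;
        while (pos < text.length()) {
            char c = text.charAt(pos);
            if (inTag) {
                // Skip everything up to and including the closing '>'.
                if (c == '>') {
                    inTag = false;
                }
                pos++;
            } else if (c == '<') {
                inTag = true;
                pos++;
            } else if (Character.isLetter(c)) {
                // Consume a maximal run of letters and return it as a word.
                int start = pos;
                while (pos < text.length() && Character.isLetter(text.charAt(pos))) {
                    pos++;
                }
                return text.substring(start, pos);
            } else {
                pos++; // punctuation, digits, white space
            }
        }
        return null; // no more words
    }

    /** Collects every word by calling next() until it returns null. */
    static List<String> words(String xml) {
        TagSkippingScanner scanner = new TagSkippingScanner(xml);
        List<String> out = new ArrayList<>();
        for (String w = scanner.next(); w != null; w = scanner.next()) {
            out.add(w);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Hello, spell, checker]
        System.out.println(words("<p class=\"x\">Hello <b>spell</b> checker</p>"));
    }
}
```

With the real class, the same loop shape would presumably apply: call next() repeatedly, or use the inherited hasNext(), and stop when null is returned.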