java.lang.Object
  pt.tumba.spell.SpellChecker
public class SpellChecker
The main class of the spell checking package.
| Field Summary | |
|---|---|
| private CommonMisspellings | commonErrors - A dictionary of common misspellings. |
| private TernarySearchTrie | dictionary - The main dictionary for the spelling checker. |
| private boolean | useBigrams - Use bigrams for context-dependent spelling correction. |
| Constructor Summary | |
|---|---|
| SpellChecker() | |
| Method Summary | |
|---|---|
| java.lang.String | findMostSimilar(java.lang.String key) - Takes a word and returns the most similar word from the dictionary, using Levenshtein Distance, Phonetic similarity, Keyboard Proximity and other heuristics to measure similarity. |
| java.lang.String | findMostSimilar(java.lang.String key, boolean useFrequency) - Takes a word and returns the most similar word from the dictionary, using Levenshtein Distance, Phonetic similarity, Keyboard Proximity and other heuristics to measure similarity. |
| java.util.List | findMostSimilarList(java.lang.String key) - Takes a word and returns a List with similar words from the dictionary, using Levenshtein Distance to rank words in the list. |
| SpellChecker | getInstance() - Deprecated. TODO: Remove this method and check dependencies with other code. |
| private static java.lang.String | heuristicsPortuguese(java.lang.String str) - Phonetic heuristics for the Portuguese language, taking as input a Portuguese word and replacing letters and groups of letters that correspond to a specific "sound" with a canonical representation. |
| void | initialize(java.lang.String path) - Reads the dictionary into memory. |
| void | initialize(java.lang.String path1, java.lang.String path2) - Reads the dictionary into memory. |
| void | initialize(java.lang.String path1, java.lang.String path2, java.lang.String path3) - Reads the dictionary into memory. |
| static void | main(java.lang.String[] args) - Main method. |
| java.lang.String | spellCheck(java.lang.String s) - Checks spelling errors in terms from a given String. |
| java.lang.String | spellCheckQuery(java.lang.String s) - Checks spelling errors in terms for a search engine query, ignoring commands to the search system. |
| java.lang.String | spellCheckTeX(java.lang.String s) - Checks spelling errors in terms from a TeX document. |
| java.lang.String | spellCheckWord(java.lang.String word) - Checks if a word is correctly spelled, producing as output a string with the word plus SGML tags indicating if it is correctly spelled or not. |
| java.lang.String | spellCheckXML(java.lang.String s) - Checks spelling errors in terms from an XML document. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
private TernarySearchTrie dictionary
private CommonMisspellings commonErrors
private boolean useBigrams
| Constructor Detail |
|---|
public SpellChecker()
| Method Detail |
|---|
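public SpellChecker getInstance()
Deprecated. TODO: Remove this method and check dependencies with other code.
Returns:
an instance of SpellChecker.
private static java.lang.String heuristicsPortuguese(java.lang.String str)
Parameters:
str - A String with a Portuguese word.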
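The method summary describes heuristicsPortuguese as replacing letters and letter groups that share a "sound" with a canonical representation. The actual rules are not listed on this page, so the following is only a minimal sketch of that idea: the class name `PhoneticSketch`, the method `canonical`, and every replacement rule in it are illustrative assumptions, not the library's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative phonetic canonicalization; all rules here are assumptions. */
final class PhoneticSketch {
    // Rules are applied in insertion order, so multi-letter groups are
    // rewritten before any single-letter rule can touch them.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ss", "s");      // "passo" -> "paso"
        RULES.put("ch", "x");      // "chave" -> "xave"
        RULES.put("ph", "f");      // archaic spelling of the "f" sound
        RULES.put("\u00e7", "s");  // c-cedilla: "pa\u00e7o" -> "paso"
    }

    /** Lower-cases the word and applies each replacement rule in order. */
    static String canonical(String word) {
        String result = word.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            result = result.replace(rule.getKey(), rule.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Two spellings that sound alike collapse to the same canonical form.
        System.out.println(canonical("passo").equals(canonical("pa\u00e7o"))); // prints "true"
    }
}
```

Comparing canonical forms rather than raw spellings lets a checker treat words that sound alike as close matches even when their edit distance is large.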
public void initialize(java.lang.String path)
throws java.lang.Exception
Parameters:
path - The file path leading to the dictionary.
Throws:
java.lang.Exception - if any problem occurred while reading the dictionary.
public void initialize(java.lang.String path1,
java.lang.String path2)
throws java.lang.Exception
Parameters:
path1 - The file path leading to the dictionary.
path2 - The file path leading to a dictionary of common misspellings.
Throws:
java.lang.Exception - if any problem occurred while reading the dictionary.
public void initialize(java.lang.String path1,
java.lang.String path2,
java.lang.String path3)
throws java.lang.Exception
Parameters:
path1 - The file path leading to the dictionary.
path2 - The file path leading to a dictionary of common misspellings.
path3 - The file path leading to a dictionary of correct spellings.
Throws:
java.lang.Exception - if any problem occurred while reading the dictionary.
public java.lang.String findMostSimilar(java.lang.String key)
Parameters:
key - The word to check in the dictionary.
public java.lang.String findMostSimilar(java.lang.String key,
boolean useFrequency)
Parameters:
key - The word to check in the dictionary.
useFrequency - Use the relative frequency method.
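The findMostSimilar methods are documented as using Levenshtein Distance among other heuristics. As a rough illustration of that one component only, the sketch below computes the edit distance with the standard dynamic-programming recurrence and picks the closest word from a small list. The class `LevenshteinSketch`, its methods, and the sample word list are hypothetical; the real class searches a TernarySearchTrie and combines further heuristics that are not reproduced here.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative edit-distance ranking; not the library's implementation. */
final class LevenshteinSketch {

    /** Levenshtein distance computed with two rolling rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the word closest to key, or null if the list is empty. */
    static String mostSimilar(String key, List<String> words) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : words) {
            int d = distance(key, word);
            if (d < bestDistance) {
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("casa", "caso", "carro");
        System.out.println(mostSimilar("cassa", words)); // prints "casa"
    }
}
```

A linear scan like this is only practical for tiny word lists; it is shown to make the ranking criterion concrete, not to suggest how the dictionary lookup is performed.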
public java.util.List findMostSimilarList(java.lang.String key)
Takes a word and returns a List with similar words from the dictionary,
using Levenshtein Distance to rank words in the list.
Parameters:
key - The word to check in the dictionary.
Returns:
List of similar words from the dictionary.
public java.lang.String spellCheckQuery(java.lang.String s)
Parameters:
s - A String with a search engine query.
Returns:
String with spelling errors identified.
See Also:
spellCheckWord(String)
public java.lang.String spellCheck(java.lang.String s)
Checks spelling errors in terms from a given String.
Parameters:
s - A String.
Returns:
String with spelling errors identified.
See Also:
spellCheckWord(String)
public java.lang.String spellCheckTeX(java.lang.String s)
Parameters:
s - A String with the TeX document.
Returns:
String with spelling errors identified.
See Also:
spellCheckWord(String)
public java.lang.String spellCheckXML(java.lang.String s)
Parameters:
s - A String with the XML document.
Returns:
String with spelling errors identified.
See Also:
spellCheckWord(String)
public java.lang.String spellCheckWord(java.lang.String word)
Checks if a word is correctly spelled, producing as output a string with the word plus SGML tags indicating if it is correctly spelled or not.
The possible SGML tags are:
<misspell> - The word was not found in the dictionary and no suggestion could be generated.
<plain> - The word is correctly spelled.
<suggestion> - The word was not found in the dictionary and a suggestion was generated.
Parameters:
word - The word to check.
Returns:
String with the word provided as input (or an appropriate correction) surrounded with SGML tags indicating if it is correctly spelled or not.
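A caller will usually need to separate the tag from the word in the returned string. The page does not show the exact output layout, so the sketch below assumes the word is wrapped as `<tag>word</tag>` with a matching closing tag; that layout, the class `TagSketch`, and its `parse` method are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for tagged output; the tag layout is an assumption. */
final class TagSketch {
    // Assumed layout: <tag>word</tag>, where tag is one of the three
    // documented names. The closing tag is not confirmed by the docs.
    private static final Pattern TAGGED =
            Pattern.compile("<(misspell|plain|suggestion)>(.*?)</\\1>");

    /** Returns {tag, word} for the first tagged word, or null if none is found. */
    static String[] parse(String output) {
        Matcher m = TAGGED.matcher(output);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] parsed = parse("<suggestion>casa</suggestion>");
        System.out.println(parsed[0] + " -> " + parsed[1]); // prints "suggestion -> casa"
    }
}
```

If the real output omits closing tags or nests them differently, the pattern would need adjusting before use.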
public static void main(java.lang.String[] args)
throws java.lang.Exception
Parameters:
args - The command line input, tokenized.
Throws:
java.lang.Exception