com.googlecode.whatswrong
Class TokenFilter

java.lang.Object
  extended by com.googlecode.whatswrong.TokenFilter
All Implemented Interfaces:
NLPInstanceFilter

public class TokenFilter
extends java.lang.Object
implements NLPInstanceFilter

A TokenFilter removes certain properties from each token and removes tokens that do not contain certain property values. The filter also removes all edges connected to one or more removed tokens.

Author:
Sebastian Riedel

Constructor Summary
TokenFilter()
          Creates a new TokenFilter.
 
Method Summary
 void addAllowedString(java.lang.String string)
          Add an allowed property value.
 void addForbiddenProperty(java.lang.String name)
          Add a property that is forbidden so that the corresponding values are removed from each token.
 void clearAllowedStrings()
          Remove all allowed strings.
 NLPInstance filter(NLPInstance original)
          Filter an NLP instance by first filtering the tokens and then removing edges connected to tokens that were filtered out.
 java.util.List<Token> filterTokens(java.util.Collection<Token> original)
          Filter a set of tokens by removing property values and individual tokens according to the set of allowed strings and forbidden properties.
 java.util.Set<TokenProperty> getForbiddenTokenProperties()
          Returns an unmodifiable view on the set of all forbidden token properties.
 boolean isWholeWord()
          Returns whether tokens are allowed only if they have a property value that equals one of the allowed strings, or whether it is sufficient for one value to contain one of the allowed strings.
 void removeForbiddenProperty(java.lang.String name)
          Remove a property from the forbidden set so that the corresponding values are shown again.
 void setWholeWord(boolean wholeWord)
          Sets whether tokens are allowed only if they have a property value that equals one of the allowed strings, or whether it is sufficient for one value to contain one of the allowed strings.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenFilter

public TokenFilter()
Creates a new TokenFilter.

Method Detail

isWholeWord

public boolean isWholeWord()
Returns whether tokens are allowed only if they have a property value that equals one of the allowed strings, or whether it is sufficient for one value to contain one of the allowed strings.

Returns:
true iff tokens are allowed based on exact matches with allowed strings, false otherwise.

setWholeWord

public void setWholeWord(boolean wholeWord)
Sets whether tokens are allowed only if they have a property value that equals one of the allowed strings, or whether it is sufficient for one value to contain one of the allowed strings.

Parameters:
wholeWord - true iff tokens should be allowed based on exact matches with allowed strings, false otherwise.
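
The difference between the two matching modes can be illustrated with a minimal sketch. This is not the library's implementation: the class `WholeWordSketch` and the method `isAllowed` are hypothetical stand-ins, and property values are modelled as plain strings. The empty-set case follows the description of clearAllowedStrings, which states that the filter then allows all tokens.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
class WholeWordSketch {

    /** Returns true if a token with the given property values would be kept. */
    static boolean isAllowed(Collection<String> values, Set<String> allowed, boolean wholeWord) {
        if (allowed.isEmpty()) {
            return true; // no allowed strings: every token passes
        }
        for (String value : values) {
            for (String candidate : allowed) {
                // wholeWord: exact equality; otherwise: substring containment
                boolean match = wholeWord ? value.equals(candidate) : value.contains(candidate);
                if (match) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = List.of("running", "VBG");
        Set<String> allowed = Set.of("run");
        System.out.println(isAllowed(values, allowed, true));  // prints false
        System.out.println(isAllowed(values, allowed, false)); // prints true
    }
}
```

With whole-word matching, "running" is rejected because no value equals "run"; with substring matching it is kept because "running" contains "run".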

addAllowedString

public void addAllowedString(java.lang.String string)
Add an allowed property value.

Parameters:
string - the allowed property value.

clearAllowedStrings

public void clearAllowedStrings()
Remove all allowed strings. In this state the filter allows all tokens.


addForbiddenProperty

public void addForbiddenProperty(java.lang.String name)
Add a property that is forbidden so that the corresponding values are removed from each token.

Parameters:
name - the name of the property to forbid.

removeForbiddenProperty

public void removeForbiddenProperty(java.lang.String name)
Remove a property from the forbidden set so that the corresponding values are shown again.

Parameters:
name - the name of the property to show again.

getForbiddenTokenProperties

public java.util.Set<TokenProperty> getForbiddenTokenProperties()
Returns an unmodifiable view on the set of all forbidden token properties.

Returns:
an unmodifiable view on the set of all forbidden token properties.
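
What "unmodifiable view" typically means in Java can be sketched with `Collections.unmodifiableSet`. Whether TokenFilter uses this exact call is an assumption; the class `ForbiddenViewSketch` is hypothetical and uses property names (strings) in place of TokenProperty objects.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
class ForbiddenViewSketch {
    private final Set<String> forbidden = new HashSet<>();

    void addForbiddenProperty(String name) {
        forbidden.add(name);
    }

    void removeForbiddenProperty(String name) {
        forbidden.remove(name);
    }

    /** Read-only view: reflects later changes, but rejects direct modification. */
    Set<String> getForbiddenProperties() {
        return Collections.unmodifiableSet(forbidden);
    }

    public static void main(String[] args) {
        ForbiddenViewSketch sketch = new ForbiddenViewSketch();
        Set<String> view = sketch.getForbiddenProperties();
        sketch.addForbiddenProperty("Pos");
        System.out.println(view.contains("Pos")); // prints true: the view tracks the backing set
        try {
            view.add("Word");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

A view of this kind lets callers inspect the current state without being able to bypass the add and remove methods.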

filterTokens

public java.util.List<Token> filterTokens(java.util.Collection<Token> original)
Filter a set of tokens by removing property values and individual tokens according to the set of allowed strings and forbidden properties.

Parameters:
original - the original set of tokens.
Returns:
the filtered set of tokens.
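
The two effects described above (dropping forbidden properties, dropping tokens without an allowed value) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: tokens are modelled as maps from property name to value, the class `FilterTokensSketch` is hypothetical, and matching against the values before forbidden properties are stripped is an assumption about ordering that the documentation does not specify.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
class FilterTokensSketch {

    static List<Map<String, String>> filterTokens(Collection<Map<String, String>> original,
                                                  Set<String> allowed,
                                                  Set<String> forbidden,
                                                  boolean wholeWord) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> token : original) {
            // An empty allowed set lets every token through.
            boolean keep = allowed.isEmpty();
            // Assumption: allowed strings are matched against the original values.
            for (String value : token.values()) {
                for (String candidate : allowed) {
                    if (wholeWord ? value.equals(candidate) : value.contains(candidate)) {
                        keep = true;
                    }
                }
            }
            if (!keep) {
                continue; // token has no allowed value: drop it entirely
            }
            // Copy the token and strip the forbidden properties from the copy.
            Map<String, String> copy = new LinkedHashMap<>(token);
            copy.keySet().removeAll(forbidden);
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> tokens = List.of(
                Map.of("Word", "dog", "Pos", "NN"),
                Map.of("Word", "runs", "Pos", "VBZ"));
        // Keep tokens with a value equal to "NN", then hide the "Pos" property.
        System.out.println(filterTokens(tokens, Set.of("NN"), Set.of("Pos"), true)); // prints [{Word=dog}]
    }
}
```

The input tokens are copied rather than mutated, so the original collection stays usable after filtering.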

filter

public NLPInstance filter(NLPInstance original)
Filter an NLP instance by first filtering the tokens and then removing edges connected to tokens that were filtered out.

Specified by:
filter in interface NLPInstanceFilter
Parameters:
original - the original NLP instance.
Returns:
the filtered NLP instance.
See Also:
NLPInstanceFilter.filter(NLPInstance)
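
The edge-removal step can be sketched as below: an edge survives only if both of its endpoints survived token filtering. The class `EdgePruningSketch` and its nested `Edge` type are hypothetical; tokens are identified by integer index, which is an assumption made for the example rather than a statement about NLPInstance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
class EdgePruningSketch {

    /** Hypothetical edge between two token indices. */
    static final class Edge {
        final int from;
        final int to;

        Edge(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Keeps only edges whose endpoints are both in the set of kept tokens. */
    static List<Edge> pruneEdges(List<Edge> edges, Set<Integer> keptTokens) {
        List<Edge> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (keptTokens.contains(edge.from) && keptTokens.contains(edge.to)) {
                result.add(edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = List.of(new Edge(0, 1), new Edge(1, 2));
        // Token 2 was filtered out, so the edge 1 -> 2 is dropped.
        System.out.println(pruneEdges(edges, Set.of(0, 1)).size()); // prints 1
    }
}
```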


Copyright © 2009. All Rights Reserved.