com.googlecode.whatswrong
Class EdgeTokenFilter

java.lang.Object
  extended by com.googlecode.whatswrong.EdgeTokenFilter
All Implemented Interfaces:
NLPInstanceFilter

public class EdgeTokenFilter
extends java.lang.Object
implements NLPInstanceFilter

An EdgeTokenFilter filters out edges based on the properties of their tokens. For example, we can filter out all edges that do not contain at least one token with the word "blah". The filter can also be configured to filter out all edges which are not on a path between tokens with certain properties. For example, we can filter out all edges that are not on the paths between a token with word "blah" and a token with word "blub".

This filter can also filter out the tokens for which all edges have been filtered out via the edge filtering process. This mode is called "collapsing" because the graph is collapsed to contain only connected components.

Note that if no allowed property values are defined (addAllowedProperty(String)) then the filter does nothing and keeps all edges.
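
The basic edge filtering rule described above can be sketched without the library. This is a minimal illustration, not the class's actual implementation: EdgeFilterSketch and SimpleEdge are hypothetical stand-ins, and each token is reduced to a single word property for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
final class EdgeFilterSketch {

    /** Simplified edge: each of its two tokens is represented by one word property. */
    static final class SimpleEdge {
        final String fromWord;
        final String toWord;

        SimpleEdge(String fromWord, String toWord) {
            this.fromWord = fromWord;
            this.toWord = toWord;
        }
    }

    /**
     * Keeps every edge that has at least one token with an allowed word.
     * If no allowed values are defined, all edges are kept.
     */
    static List<SimpleEdge> filterEdges(List<SimpleEdge> edges, Set<String> allowed) {
        if (allowed.isEmpty()) {
            return new ArrayList<>(edges); // nothing configured: the filter does nothing
        }
        List<SimpleEdge> kept = new ArrayList<>();
        for (SimpleEdge edge : edges) {
            if (allowed.contains(edge.fromWord) || allowed.contains(edge.toWord)) {
                kept.add(edge);
            }
        }
        return kept;
    }
}
```

With the edges blah-foo, foo-bar and bar-blub and the single allowed value "blah", only the blah-foo edge would remain under this sketch.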

Author:
Sebastian Riedel

Constructor Summary
EdgeTokenFilter(java.util.Set<java.lang.String> allowedPropertyValues)
          Creates a new filter with the given allowed property values.
EdgeTokenFilter(java.lang.String... allowedProperties)
          Creates a new filter with the given allowed property values.
 
Method Summary
 void addAllowedProperty(java.lang.String propertyValue)
          Adds an allowed property value.
 boolean allows(java.lang.String propertyValue)
          Returns whether the given value is an allowed property value.
 void clear()
          Removes all allowed property values.
 NLPInstance filter(NLPInstance original)
          First filters out edges and then filters out tokens without edges if isCollaps() is true.
 java.util.Collection<Edge> filterEdges(java.util.Collection<Edge> original)
          Filters out all edges that do not have at least one token with an allowed property value.
 boolean isCollaps()
          If active, this property causes the filter to remove all tokens whose edges were all filtered out in the edge filtering step.
 boolean isUsePaths()
          Usually the filter allows all edges that have tokens with allowed properties.
 boolean isWholeWords()
          If true, at least one of an edge's tokens must have at least one property value that exactly matches one of the allowed property values.
 void removeAllowedProperty(java.lang.String propertyValue)
          Removes an allowed property value.
 void setCollaps(boolean collaps)
          If active, this property causes the filter to remove all tokens whose edges were all filtered out in the edge filtering step.
 void setUsePaths(boolean usePaths)
          Sets whether the filter uses paths.
 void setWholeWords(boolean wholeWords)
          Sets whether the filter should check for whole word matches of properties.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EdgeTokenFilter

public EdgeTokenFilter(java.lang.String... allowedProperties)
Creates a new filter with the given allowed property values.

Parameters:
allowedProperties - A varargs array of allowed property values. An Edge will be filtered out if none of its tokens has a property with an allowed property value (or a property value that contains an allowed value, if isWholeWords() is false).

EdgeTokenFilter

public EdgeTokenFilter(java.util.Set<java.lang.String> allowedPropertyValues)
Creates a new filter with the given allowed property values.

Parameters:
allowedPropertyValues - A set of allowed property values. An Edge will be filtered out if none of its tokens has a property with an allowed property value (or a property value that contains an allowed value, if isWholeWords() is false).
Method Detail

isCollaps

public boolean isCollaps()
If active, this property causes the filter to remove all tokens whose edges were all filtered out in the edge filtering step.

Returns:
true if the filter collapses the graph and removes tokens without edges.
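
As a rough illustration of collapsing, the sketch below keeps only the tokens that are still touched by a remaining edge. It is a simplified model under stated assumptions, not the library's code: CollapseSketch and IndexEdge are hypothetical names, tokens are plain strings, and edges refer to tokens by list index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
final class CollapseSketch {

    /** Simplified edge that refers to its two tokens by their index in the token list. */
    static final class IndexEdge {
        final int from;
        final int to;

        IndexEdge(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Returns the tokens that still have at least one edge, in their original order. */
    static List<String> collapse(List<String> tokens, List<IndexEdge> keptEdges) {
        Set<Integer> connected = new TreeSet<>(); // sorted, so original token order is preserved
        for (IndexEdge edge : keptEdges) {
            connected.add(edge.from);
            connected.add(edge.to);
        }
        List<String> result = new ArrayList<>();
        for (int index : connected) {
            result.add(tokens.get(index));
        }
        return result;
    }
}
```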

setCollaps

public void setCollaps(boolean collaps)
If active, this property causes the filter to remove all tokens whose edges were all filtered out in the edge filtering step.

Parameters:
collaps - true if the filter should collapse the graph and remove tokens without edges.

isUsePaths

public boolean isUsePaths()
Usually the filter allows all edges that have tokens with allowed properties. However, if it "uses paths" an edge will only be allowed if it is on a path between two tokens with allowed properties. This also means that if there is only one token with allowed properties all edges will be filtered out.

Returns:
true if the filter uses paths.
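
One way to picture the path mode is the sketch below. It is an assumption-laden illustration rather than the library's algorithm: PathFilterSketch is a hypothetical name, edges are treated as undirected pairs of token indices, and "a path" is taken to mean a shortest path found by breadth-first search (in a tree-shaped graph the path between two tokens is unique, so the distinction does not matter there).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
final class PathFilterSketch {

    /**
     * Returns the indices (into edges) of all edges that lie on a shortest
     * undirected path between any two tokens in allowedTokens.
     * Each edge is an int[2] holding the indices of its two tokens.
     */
    static Set<Integer> edgesOnPaths(int tokenCount, int[][] edges, Set<Integer> allowedTokens) {
        Set<Integer> kept = new LinkedHashSet<>();
        List<Integer> allowed = new ArrayList<>(allowedTokens);
        for (int i = 0; i < allowed.size(); i++) {
            int start = allowed.get(i);
            // Breadth-first search from the start token, remembering for each
            // reached token the edge through which it was first reached.
            int[] viaEdge = new int[tokenCount];
            Arrays.fill(viaEdge, -1);
            boolean[] seen = new boolean[tokenCount];
            seen[start] = true;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                int current = queue.remove();
                for (int e = 0; e < edges.length; e++) {
                    int next;
                    if (edges[e][0] == current) {
                        next = edges[e][1];
                    } else if (edges[e][1] == current) {
                        next = edges[e][0];
                    } else {
                        continue;
                    }
                    if (!seen[next]) {
                        seen[next] = true;
                        viaEdge[next] = e;
                        queue.add(next);
                    }
                }
            }
            // Walk back from every other allowed token to the start token,
            // collecting the edges on the way. Unreachable tokens add nothing.
            for (int j = i + 1; j < allowed.size(); j++) {
                int node = allowed.get(j);
                while (node != start && viaEdge[node] != -1) {
                    int e = viaEdge[node];
                    kept.add(e);
                    node = (edges[e][0] == node) ? edges[e][1] : edges[e][0];
                }
            }
        }
        return kept;
    }
}
```

With a single allowed token there is no pair to connect, so the returned set is empty, which mirrors the statement above that all edges are filtered out in that case.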

setUsePaths

public void setUsePaths(boolean usePaths)
Sets whether the filter uses paths.

Parameters:
usePaths - true if the filter should use paths.
See Also:
isUsePaths()

addAllowedProperty

public void addAllowedProperty(java.lang.String propertyValue)
Adds an allowed property value. To be kept, an Edge must have at least one token with at least one property value that either matches one of the allowed property values or contains one of them, depending on isWholeWords().

Parameters:
propertyValue - the property value to allow.

removeAllowedProperty

public void removeAllowedProperty(java.lang.String propertyValue)
Removes an allowed property value.

Parameters:
propertyValue - the property value to remove from the set of allowed property values.

clear

public void clear()
Removes all allowed property values. Note that if no allowed property values are specified, the filter changes its behaviour and allows all edges.


isWholeWords

public boolean isWholeWords()
If true, at least one of an edge's tokens must have at least one property value that exactly matches one of the allowed property values. If false, it is sufficient for a property value to contain an allowed property value as a substring.

Returns:
whether property values need to exactly match the allowed properties or can contain them as a substring.
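
The difference between the two matching modes can be sketched as follows. This is a minimal illustration of exact versus substring matching as described above, not the library's code; WholeWordsSketch and its matches method are hypothetical names.

```java
import java.util.Set;

// Hypothetical stand-in for illustration; not part of com.googlecode.whatswrong.
final class WholeWordsSketch {

    /**
     * Tests one property value against the allowed values.
     * wholeWords == true:  the value must equal an allowed value.
     * wholeWords == false: the value only needs to contain an allowed value.
     */
    static boolean matches(String propertyValue, Set<String> allowed, boolean wholeWords) {
        if (wholeWords) {
            return allowed.contains(propertyValue);
        }
        for (String allowedValue : allowed) {
            if (propertyValue.contains(allowedValue)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this sketch, a token with the word "blahblah" matches the allowed value "blah" only when whole-word matching is switched off.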

setWholeWords

public void setWholeWords(boolean wholeWords)
Sets whether the filter should check for whole word matches of properties.

Parameters:
wholeWords - true iff the filter should check for whole words.
See Also:
isWholeWords()

filterEdges

public java.util.Collection<Edge> filterEdges(java.util.Collection<Edge> original)
Filters out all edges that do not have at least one token with an allowed property value. If the set of allowed property values is empty this method just returns the original set and does nothing.

Parameters:
original - the input set of edges.
Returns:
the filtered set of edges, that is, the edges that were not filtered out.

allows

public boolean allows(java.lang.String propertyValue)
Returns whether the given value is an allowed property value.

Parameters:
propertyValue - the value to test.
Returns:
whether the given value is an allowed property value.

filter

public NLPInstance filter(NLPInstance original)
First filters out edges and then filters out tokens without edges if isCollaps() is true.

Specified by:
filter in interface NLPInstanceFilter
Parameters:
original - the original NLP instance.
Returns:
the filtered instance.
See Also:
NLPInstanceFilter.filter(NLPInstance)


Copyright © 2010. All Rights Reserved.