com.googlecode.whatswrong.io
Class TabFormat

java.lang.Object
  extended by com.googlecode.whatswrong.io.TabFormat
All Implemented Interfaces:
CorpusFormat

public class TabFormat
extends java.lang.Object
implements CorpusFormat

A TabFormat loads data from text files in which token properties are represented as whitespace- or tab-separated values. This includes formats such as the CoNLL shared task formats and the MALT-Tab format. This class provides the generic framework for processing such tab-separated data. To implement a concrete format, clients have to implement the TabProcessor interface.
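
As an illustration of the kind of input described above, the following is a minimal, self-contained sketch of turning whitespace-separated lines into row lists of the shape the static extractSpan methods accept (List<? extends List<String>>). It does not use TabFormat or TabProcessor; the class TabRowsSketch and its method parseSentences are hypothetical names, and the use of blank lines as sentence separators is an assumption based on the CoNLL convention, not something this page states about TabFormat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the whatswrong API.
class TabRowsSketch {

    /**
     * Groups lines into sentences. Each non-blank line becomes one row of
     * column values, split on runs of whitespace (tabs or spaces).
     * Assumption: a blank line ends the current sentence.
     */
    static List<List<List<String>>> parseSentences(List<String> lines) {
        List<List<List<String>>> sentences = new ArrayList<>();
        List<List<String>> current = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                // Close the sentence collected so far, if any.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(Arrays.asList(trimmed.split("\\s+")));
            }
        }
        // The last sentence may not be followed by a blank line.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1\tThe\tDT", "2\tcat\tNN", "", "1\tSleep\tVB");
        List<List<List<String>>> sentences = parseSentences(lines);
        System.out.println(sentences.size() + " sentences, first row: "
                + sentences.get(0).get(0));
    }
}
```

Each inner List<List<String>> holds the rows of one sentence, so a column index such as the int column parameter of the extractSpan methods would select one property per token.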

Author:
Sebastian Riedel

Nested Class Summary
 
Nested classes/interfaces inherited from interface com.googlecode.whatswrong.io.CorpusFormat
CorpusFormat.Monitor
 
Constructor Summary
TabFormat()
           
 
Method Summary
 void addProcessor(java.lang.String name, TabProcessor processor)
           
 void addProcessor(TabProcessor processor)
           
static void extractSpan00(java.util.List<? extends java.util.List<java.lang.String>> rows, int column, java.lang.String type, NLPInstance instance)
           
static void extractSpan03(java.util.List<? extends java.util.List<java.lang.String>> rows, int column, java.lang.String type, NLPInstance instance)
           
static void extractSpan05(java.util.List<? extends java.util.List<java.lang.String>> rows, int column, java.lang.String type, java.lang.String prefix, NLPInstance instance)
           
 javax.swing.JComponent getAccessory()
          Returns the GUI element that controls how this format is to be loaded.
 java.lang.String getLongName()
          Returns a longer name that may contain information about the configuration of this format.
 java.lang.String getName()
          Returns the name of this format.
 java.util.List<NLPInstance> load(java.io.File file, int from, int to)
          Loads a corpus from a file, starting at instance from and ending at instance to (exclusive).
 void loadProperties(java.util.Properties properties, java.lang.String prefix)
          Loads a configuration for this format from the given Properties object.
 void saveProperties(java.util.Properties properties, java.lang.String prefix)
          Saves the configuration of this format to a Properties object.
 void setMonitor(CorpusFormat.Monitor monitor)
          Sets the object that monitors the progress of this format when loading a file.
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TabFormat

public TabFormat()
Method Detail

addProcessor

public void addProcessor(java.lang.String name,
                         TabProcessor processor)

addProcessor

public void addProcessor(TabProcessor processor)

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

getName

public java.lang.String getName()
Description copied from interface: CorpusFormat
Returns the name of this format.

Specified by:
getName in interface CorpusFormat
Returns:
the name of this format.

getLongName

public java.lang.String getLongName()
Description copied from interface: CorpusFormat
Returns a longer name that may contain information about the configuration of this format.

Specified by:
getLongName in interface CorpusFormat
Returns:
the long name of this format.

getAccessory

public javax.swing.JComponent getAccessory()
Description copied from interface: CorpusFormat
Returns the GUI element that controls how this format is to be loaded.

Specified by:
getAccessory in interface CorpusFormat
Returns:
the GUI element that controls how this format is to be loaded.

setMonitor

public void setMonitor(CorpusFormat.Monitor monitor)
Description copied from interface: CorpusFormat
Sets the object that monitors the progress of this format when loading a file.

Specified by:
setMonitor in interface CorpusFormat
Parameters:
monitor - the monitor for this format.

loadProperties

public void loadProperties(java.util.Properties properties,
                           java.lang.String prefix)
Description copied from interface: CorpusFormat
Loads a configuration for this format from the given Properties object.

Specified by:
loadProperties in interface CorpusFormat
Parameters:
properties - the Properties object to load from.
prefix - the prefix that properties for this format have in the Properties object.

saveProperties

public void saveProperties(java.util.Properties properties,
                           java.lang.String prefix)
Description copied from interface: CorpusFormat
Saves the configuration of this format to a Properties object.

Specified by:
saveProperties in interface CorpusFormat
Parameters:
properties - the Properties object to store the configuration of this format in.
prefix - the prefix that the properties should have.
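
The prefix parameter of loadProperties and saveProperties lets several formats share one Properties object. The sketch below shows one way such a round trip could work using only java.util.Properties. The class PrefixedPropertiesSketch, the setting processorName, and the key layout prefix + ".processor" are all assumptions made for illustration; this page does not say which keys TabFormat actually reads or writes.

```java
import java.util.Properties;

// Hypothetical stand-in for a format with one configurable setting.
class PrefixedPropertiesSketch {

    // Assumed setting; the real TabFormat may store different values.
    private String processorName = "default";

    void setProcessorName(String processorName) {
        this.processorName = processorName;
    }

    String getProcessorName() {
        return processorName;
    }

    /** Writes the setting under a key that starts with the given prefix. */
    void saveProperties(Properties properties, String prefix) {
        properties.setProperty(prefix + ".processor", processorName);
    }

    /** Reads the setting back; keeps the current value if the key is absent. */
    void loadProperties(Properties properties, String prefix) {
        processorName = properties.getProperty(prefix + ".processor", processorName);
    }

    public static void main(String[] args) {
        Properties properties = new Properties();

        PrefixedPropertiesSketch saved = new PrefixedPropertiesSketch();
        saved.setProcessorName("CoNLL 2003");
        saved.saveProperties(properties, "gold");

        PrefixedPropertiesSketch restored = new PrefixedPropertiesSketch();
        restored.loadProperties(properties, "gold");
        System.out.println(restored.getProcessorName());
    }
}
```

Using a distinct prefix per format instance (here the made-up value "gold") keeps the entries of different instances from overwriting each other in the same Properties object.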

load

public java.util.List<NLPInstance> load(java.io.File file,
                                        int from,
                                        int to)
                                 throws java.io.IOException
Description copied from interface: CorpusFormat
Loads a corpus from a file, starting at instance from and ending at instance to (exclusive). This method is required to call CorpusFormat.Monitor.progressed(int) after each instance that was processed.

Specified by:
load in interface CorpusFormat
Parameters:
file - the file to load the corpus from.
from - the starting instance index.
to - the end instance index (exclusive).
Returns:
a list of NLP instances loaded from the given file in the given interval.
Throws:
java.io.IOException - if an I/O error occurs.
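
The half-open interval and the progress callback described for load can be sketched without the library as follows. The class HalfOpenLoadSketch, the method select, and the interface ProgressListener are hypothetical; ProgressListener merely stands in for CorpusFormat.Monitor, and passing the instance index to progressed is an assumption, since this page does not document the meaning of the int argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the from/to contract; not the real loader.
class HalfOpenLoadSketch {

    /** Stand-in for CorpusFormat.Monitor; only the callback used here. */
    interface ProgressListener {
        void progressed(int index);
    }

    /**
     * Returns the instances with indices from (inclusive) to to (exclusive),
     * stopping early if fewer instances are available, and reports progress
     * after each instance that was processed.
     */
    static List<String> select(List<String> instances, int from, int to,
                               ProgressListener listener) {
        List<String> result = new ArrayList<>();
        for (int index = Math.max(0, from); index < to && index < instances.size(); index++) {
            result.add(instances.get(index));
            listener.progressed(index); // assumed argument: the instance index
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList("s0", "s1", "s2", "s3");
        List<String> loaded = select(corpus, 1, 3,
                index -> System.out.println("processed instance " + index));
        System.out.println(loaded);
    }
}
```

With from = 1 and to = 3 the sketch selects the instances at indices 1 and 2, which matches the "(exclusive)" wording for the end index.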

extractSpan03

public static void extractSpan03(java.util.List<? extends java.util.List<java.lang.String>> rows,
                                 int column,
                                 java.lang.String type,
                                 NLPInstance instance)

extractSpan00

public static void extractSpan00(java.util.List<? extends java.util.List<java.lang.String>> rows,
                                 int column,
                                 java.lang.String type,
                                 NLPInstance instance)

extractSpan05

public static void extractSpan05(java.util.List<? extends java.util.List<java.lang.String>> rows,
                                 int column,
                                 java.lang.String type,
                                 java.lang.String prefix,
                                 NLPInstance instance)


Copyright © 2010. All Rights Reserved.