java.lang.Object
  com.googlecode.whatswrong.io.GaleAlignmentFormat
public class GaleAlignmentFormat extends java.lang.Object implements CorpusFormat
GaleAlignmentFormat reads bilingual alignment data in an XML-like format. The source element contains the tokenized source sentence, and the translation element contains the tokenized target sentence. The matrix element contains a matrix whose first row and first column indicate which tokens are null-aligned; the remaining cells form the alignment matrix proper, in which each column corresponds to a source token and each row corresponds to a target token. The seg element may carry the id of the sentence, but the id is optional. What matters is that there is one seg element for each sentence.
    <seg id=1>
    <source>Ich habe den Fehler in meiner Sprachverarbeitung gefunden .</source>
    <translation>I've found the error in my NLP .</translation>
    <matrix>
    0 0 0 0 0 0 0 0 0 0
    0 1 1 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 1 0
    0 0 0 1 0 0 0 0 0 0
    0 0 0 0 1 0 0 0 0 0
    0 0 0 0 0 1 0 0 0 0
    0 0 0 0 0 0 1 0 0 0
    0 0 0 0 0 0 0 1 0 0
    0 0 0 0 0 0 0 0 0 1
    </matrix>
    <seg id=2>
    ...
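The matrix layout described above can be read back into token pairs with a few lines of code. The following is a minimal sketch, not the class's actual parser: `GaleMatrixSketch` and `alignments` are hypothetical names, and the sketch assumes the matrix cells are whitespace-separated and listed row by row.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the whatswrong API. */
final class GaleMatrixSketch {

    /**
     * Returns 0-based {sourceIndex, targetIndex} pairs for every 1 in the
     * body of the matrix. Row 0 and column 0 hold the null-alignment
     * markers and are skipped here.
     */
    static List<int[]> alignments(String matrixText, int sourceLen, int targetLen) {
        String[] cells = matrixText.trim().split("\\s+");
        int cols = sourceLen + 1; // one extra column for null alignment
        int rows = targetLen + 1; // one extra row for null alignment
        if (cells.length != rows * cols) {
            throw new IllegalArgumentException(
                "expected " + (rows * cols) + " cells, got " + cells.length);
        }
        List<int[]> pairs = new ArrayList<>();
        for (int row = 1; row < rows; row++) {
            for (int col = 1; col < cols; col++) {
                if (cells[row * cols + col].equals("1")) {
                    pairs.add(new int[] {col - 1, row - 1});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] src = "Ich habe den Fehler in meiner Sprachverarbeitung gefunden .".split(" ");
        String[] tgt = "I've found the error in my NLP .".split(" ");
        String matrix =
              "0 0 0 0 0 0 0 0 0 0 "
            + "0 1 1 0 0 0 0 0 0 0 "
            + "0 0 0 0 0 0 0 0 1 0 "
            + "0 0 0 1 0 0 0 0 0 0 "
            + "0 0 0 0 1 0 0 0 0 0 "
            + "0 0 0 0 0 1 0 0 0 0 "
            + "0 0 0 0 0 0 1 0 0 0 "
            + "0 0 0 0 0 0 0 1 0 0 "
            + "0 0 0 0 0 0 0 0 0 1";
        for (int[] p : alignments(matrix, src.length, tgt.length)) {
            System.out.println(src[p[0]] + " -> " + tgt[p[1]]);
        }
    }
}
```

Run on the example segment, the sketch pairs both "Ich" and "habe" with "I've" and pairs "gefunden" with "found". The length check catches a matrix whose size does not match the two sentences.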
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface com.googlecode.whatswrong.io.CorpusFormat |
---|
CorpusFormat.Monitor |
Constructor Summary | |
---|---|
GaleAlignmentFormat()
|
Method Summary | |
---|---|
javax.swing.JComponent | getAccessory() - Returns the GUI element that controls how this format is to be loaded. |
java.lang.String | getLongName() - Returns a longer name that may contain information about the configuration of this format. |
java.lang.String | getName() - Returns the name of this format. |
java.util.List<NLPInstance> | load(java.io.File file, int from, int to) - Loads a corpus from a file, starting at instance from and ending at instance to (exclusive). |
void | loadProperties(java.util.Properties properties, java.lang.String prefix) - Loads a configuration for this format from the given Properties object. |
void | saveProperties(java.util.Properties properties, java.lang.String prefix) - Saves the configuration of this format to a Properties object. |
void | setMonitor(CorpusFormat.Monitor monitor) - Sets the object that monitors the progress of this format when loading a file. |
java.lang.String | toString() - Returns the name of this format. |
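The prefix argument of loadProperties and saveProperties lets several formats share one Properties object without key collisions. The sketch below illustrates that idea only. The class `PrefixedPropertiesSketch`, the setting `exampleFlag`, and the `prefix + ".name"` key layout are all assumptions made for illustration; this page does not say which keys GaleAlignmentFormat actually reads or writes, if any.

```java
import java.util.Properties;

/** Hypothetical stand-in for a format with one configurable setting. */
final class PrefixedPropertiesSketch {

    // Hypothetical setting; not a documented option of GaleAlignmentFormat.
    boolean exampleFlag = false;

    /** Stores the setting under a key namespaced by the given prefix. */
    void saveProperties(Properties properties, String prefix) {
        properties.setProperty(prefix + ".exampleFlag", Boolean.toString(exampleFlag));
    }

    /** Restores the setting, falling back to false if the key is absent. */
    void loadProperties(Properties properties, String prefix) {
        exampleFlag = Boolean.parseBoolean(
            properties.getProperty(prefix + ".exampleFlag", "false"));
    }
}
```

A caller would save with one prefix and later load with the same prefix; a second format using a different prefix can then store its own settings in the same Properties object.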
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public GaleAlignmentFormat()
Method Detail |
---|
public java.lang.String getName()
Returns the name of this format.
Specified by: getName in interface CorpusFormat

public java.lang.String getLongName()
Returns a longer name that may contain information about the configuration of this format.
Specified by: getLongName in interface CorpusFormat

public javax.swing.JComponent getAccessory()
Returns the GUI element that controls how this format is to be loaded.
Specified by: getAccessory in interface CorpusFormat

public void setMonitor(CorpusFormat.Monitor monitor)
Sets the object that monitors the progress of this format when loading a file.
Specified by: setMonitor in interface CorpusFormat
Parameters:
monitor - the monitor for this format.

public void loadProperties(java.util.Properties properties, java.lang.String prefix)
Loads a configuration for this format from the given Properties object.
Specified by: loadProperties in interface CorpusFormat
Parameters:
properties - the Properties object to load from.
prefix - the prefix that properties for this format have in the Properties object.

public void saveProperties(java.util.Properties properties, java.lang.String prefix)
Saves the configuration of this format to a Properties object.
Specified by: saveProperties in interface CorpusFormat
Parameters:
properties - the Properties object to store the configuration of this format to.
prefix - the prefix that the properties should have.

public java.util.List<NLPInstance> load(java.io.File file, int from, int to) throws java.io.IOException
Loads a corpus from a file, starting at instance from and ending at instance to (exclusive). This method is required to call CorpusFormat.Monitor.progressed(int) after each instance that was processed.
Specified by: load in interface CorpusFormat
Parameters:
file - the file to load the corpus from.
from - the starting instance index.
to - the end instance index (exclusive).
Throws:
java.io.IOException - if I/O goes wrong.

public java.lang.String toString()
Returns the name of this format.
Overrides: toString in class java.lang.Object
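
The contract of load, a half-open range of instances with a progress callback after each one, can be sketched in isolation. This is an illustration under assumptions, not the library's code: `RangeLoadSketch` and its nested `Monitor` are hypothetical stand-ins for GaleAlignmentFormat and CorpusFormat.Monitor, the input is an in-memory list rather than a file, and passing the instance index to progressed(int) is a guess, since this page does not say what the argument means.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the [from, to) loading contract. */
final class RangeLoadSketch {

    /** Stand-in for CorpusFormat.Monitor; only progressed(int) is named on this page. */
    interface Monitor {
        void progressed(int index);
    }

    /** Returns the instances with index in [from, to) and reports each one. */
    static List<String> load(List<String> instances, int from, int to, Monitor monitor) {
        List<String> result = new ArrayList<>();
        int start = Math.max(from, 0);
        int end = Math.min(to, instances.size()); // "to" is exclusive
        for (int i = start; i < end; i++) {
            result.add(instances.get(i));
            monitor.progressed(i); // assumption: the argument is the instance index
        }
        return result;
    }
}
```

With four instances, from = 1 and to = 3 select the second and third instances only, and the monitor is notified once for each of them.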