java.lang.Object
  org.apache.mahout.classifier.BayesFileFormatter
public final class BayesFileFormatter
Flattens a file into a format that the Bayes M/R (MapReduce) job can read.
The output has one document per line: the first token is the label, followed by a tab, and the rest of the line holds the document's terms.
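The line layout described above can be sketched in plain Java. This is only an illustration: `BayesLineSketch` and `toLine` are hypothetical names, not part of Mahout, and the single space between terms is an assumption, since the description does not say how terms are separated.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper illustrating the line layout; not part of Mahout. */
final class BayesLineSketch {

    /** Builds one output line: the label, a tab, then the terms. */
    static String toLine(String label, List<String> terms) {
        // Assumption: terms are separated by a single space.
        return label + "\t" + String.join(" ", terms);
    }

    public static void main(String[] args) {
        // Prints the label and the three terms, separated by a tab.
        System.out.println(toLine("sports", Arrays.asList("team", "wins", "final")));
    }
}
```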
| Method Summary | |
|---|---|
| static void | collapse(String label, org.apache.lucene.analysis.Analyzer analyzer, File inputDir, Charset charset, File outputFile) - Collapse all the files in the inputDir into a single file in the proper Bayes format, 1 document per line |
| static void | format(String label, org.apache.lucene.analysis.Analyzer analyzer, File input, Charset charset, File outDir) - Write the input files to the outdir, one output file per input file |
| static void | main(String[] args) - Run the FileFormatter |
| static String[] | readerToDocument(org.apache.lucene.analysis.Analyzer analyzer, Reader reader) - Convert a Reader to a vector |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
|---|
public static void collapse(String label,
org.apache.lucene.analysis.Analyzer analyzer,
File inputDir,
Charset charset,
File outputFile)
throws IOException
Parameters:
label - The label
analyzer - The analyzer to use
inputDir - The input directory
charset - The charset of the input files
outputFile - The file to collapse to
Throws:
IOException
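To make the collapse step concrete, here is a minimal sketch under stated assumptions, not Mahout's implementation. `CollapseSketch` is a hypothetical class; it takes `java.nio.file.Path` arguments instead of `File`, replaces the Lucene `Analyzer` with plain whitespace splitting, and reuses the input charset for the output file, which the documentation does not specify.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the collapse step; not Mahout's implementation. */
final class CollapseSketch {

    static void collapse(String label, Path inputDir, Charset charset, Path outputFile)
            throws IOException {
        // Collect the regular files first and sort them for a stable output order.
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    files.add(p);
                }
            }
        }
        Collections.sort(files);

        // Assumption: the output is written with the same charset as the input.
        try (BufferedWriter out = Files.newBufferedWriter(outputFile, charset)) {
            for (Path file : files) {
                String text = new String(Files.readAllBytes(file), charset);
                // Whitespace splitting stands in for the Lucene Analyzer.
                String terms = String.join(" ", text.trim().split("\\s+"));
                // One document per line: label, tab, terms.
                out.write(label + "\t" + terms);
                out.newLine();
            }
        }
    }
}
```

Every document in the directory receives the same label, which matches the single `label` parameter in the signature above.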
public static void format(String label,
org.apache.lucene.analysis.Analyzer analyzer,
File input,
Charset charset,
File outDir)
throws IOException
Parameters:
label - The label of the file
analyzer - The analyzer to use
input - The input file or directory. May not be null
charset - The character set of the input files
outDir - The output directory. Files will be written there with the same name as the input file
Throws:
IOException
public static String[] readerToDocument(org.apache.lucene.analysis.Analyzer analyzer,
Reader reader)
throws IOException
Parameters:
analyzer - The Analyzer to use
reader - The reader to feed to the Analyzer
Throws:
IOException
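The shape of this conversion, a `Reader` in and a `String[]` of terms out, can be sketched without Lucene. The class below is hypothetical and only approximates what an Analyzer might do: it splits on whitespace and lowercases each token, both of which are assumptions rather than documented behavior.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for readerToDocument; no Lucene Analyzer involved. */
final class ReaderToDocumentSketch {

    static String[] readerToDocument(Reader reader) throws IOException {
        List<String> terms = new ArrayList<>();
        BufferedReader buffered = new BufferedReader(reader);
        String line;
        while ((line = buffered.readLine()) != null) {
            // Assumption: whitespace tokenization plus lowercasing approximates an Analyzer.
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    terms.add(token.toLowerCase(Locale.ROOT));
                }
            }
        }
        return terms.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        String[] doc = readerToDocument(new StringReader("Hello Bayes\nworld"));
        System.out.println(String.join(" ", doc));
    }
}
```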
public static void main(String[] args)
throws Exception
Parameters:
args - The input args. Run with -h to see the help
Throws:
ClassNotFoundException - if the Analyzer can't be found
IllegalAccessException - if the Analyzer can't be constructed
InstantiationException - if the Analyzer can't be constructed
IOException - if the files can't be dealt with properly
Exception
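The exception list suggests, though this page does not state it, that the Analyzer is loaded by class name and built through its no-argument constructor. The following sketch shows that general pattern with JDK reflection only; `ReflectiveLoadSketch` and `instantiate` are hypothetical names and do not reflect how `main` is actually written.

```java
/** Hypothetical illustration of loading a class by name; not Mahout code. */
final class ReflectiveLoadSketch {

    static <T> T instantiate(String className, Class<T> type)
            throws ClassNotFoundException, InstantiationException, IllegalAccessException {
        // Throws ClassNotFoundException if no class with this name is on the classpath.
        Class<?> raw = Class.forName(className);
        // Throws ClassCastException if the class is not a subtype of the requested type.
        Class<? extends T> typed = raw.asSubclass(type);
        // Class.newInstance() is deprecated since Java 9 but declares exactly the
        // InstantiationException and IllegalAccessException listed above.
        return typed.newInstance();
    }

    public static void main(String[] args) throws Exception {
        // A JDK class is used here so the sketch runs without Lucene on the classpath.
        Object list = instantiate("java.util.ArrayList", java.util.List.class);
        System.out.println(list.getClass().getName());
    }
}
```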