java.lang.Object
  weka.attributeSelection.ASSearch
    weka.attributeSelection.RaceSearch
public class RaceSearch
Races the cross validation error of competing attribute subsets. Use in conjunction with a ClassifierSubsetEval. RaceSearch has four modes:
Forward selection races all single attribute additions to a base set (initially no attributes), selects the winner to become the new base set, and then iterates until there is no improvement over the base set.
Backward elimination is similar but the initial base set has all attributes included and races all single attribute deletions.
Schemata search is a bit different. In each iteration a series of races is run in parallel. Each race in a set determines whether a particular attribute should be included or not, i.e. the race is between the attribute being "in" or "out". The other attributes for this race are included or excluded randomly at each point in the evaluation. As soon as one race has a clear winner (i.e. it has been decided whether a particular attribute should be in or not), the next set of races begins, using the result of the winning race from the previous iteration as the new base set.
Rank race first ranks the attributes using an attribute evaluator and then races the ranking. The race includes no attributes, the top ranked attribute, the top two attributes, the top three attributes, etc.
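The candidate subsets described for the rank race can be illustrated with a small, self-contained sketch. It is not Weka's implementation: the class name `RankRaceCandidates`, the method `nestedSubsets`, and the example ranking are hypothetical, and the sketch only enumerates the nested subsets that would compete, without racing them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka.
final class RankRaceCandidates {
    /**
     * Builds the nested subsets that compete in a rank race: the empty set,
     * then the top 1, top 2, ... top n attributes of the given ranking.
     */
    static List<int[]> nestedSubsets(int[] rankedAttributes) {
        List<int[]> subsets = new ArrayList<>();
        for (int k = 0; k <= rankedAttributes.length; k++) {
            // copyOf keeps the first k entries of the ranking
            subsets.add(Arrays.copyOf(rankedAttributes, k));
        }
        return subsets;
    }

    public static void main(String[] args) {
        int[] ranking = {4, 0, 2}; // assumed ranking, best attribute first
        for (int[] subset : nestedSubsets(ranking)) {
            System.out.println(Arrays.toString(subset));
        }
        // prints [], [4], [4, 0], [4, 0, 2] on separate lines
    }
}
```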
It is also possible to generate a ranked list of attributes through the forward racing process. If generateRanking is set to true then a complete forward race will be run; that is, racing continues until all attributes have been selected. The order in which they are added determines a complete ranking of all the attributes.
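How a complete forward pass yields a ranking can be sketched as follows. This is a simplified illustration under stated assumptions, not the class's actual code: the statistical race is replaced by a plain minimum over a caller-supplied error function, and the names `ForwardRankingSketch`, `rank`, and the toy error function are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch, not part of Weka.
final class ForwardRankingSketch {
    /**
     * Adds one attribute per round, always the one whose addition gives the
     * lowest error, until every attribute has been added. The order of
     * addition is returned as the ranking (best first).
     */
    static List<Integer> rank(int numAttributes, ToDoubleFunction<List<Integer>> error) {
        List<Integer> base = new ArrayList<>();
        while (base.size() < numAttributes) {
            int winner = -1;
            double bestError = Double.POSITIVE_INFINITY;
            for (int a = 0; a < numAttributes; a++) {
                if (base.contains(a)) {
                    continue; // already selected in an earlier round
                }
                List<Integer> candidate = new ArrayList<>(base);
                candidate.add(a);
                double e = error.applyAsDouble(candidate);
                if (winner < 0 || e < bestError) {
                    bestError = e;
                    winner = a;
                }
            }
            base.add(winner); // the winner extends the base set
        }
        return base;
    }

    public static void main(String[] args) {
        // Assumed toy error: each attribute removes a fixed amount of error.
        double[] gain = {0.05, 0.30, 0.10};
        ToDoubleFunction<List<Integer>> error = subset -> {
            double e = 1.0;
            for (int a : subset) {
                e -= gain[a];
            }
            return e;
        };
        System.out.println(rank(3, error)); // prints [1, 2, 0]
    }
}
```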
Racing uses paired and unpaired t-tests on cross-validation errors of competing subsets. When there is a significant difference between the means of the errors of two competing subsets then the poorer of the two can be eliminated from the race. Similarly, if there is no significant difference between the mean errors of two competing subsets and they are within some threshold of each other, then one can be eliminated from the race.
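The elimination rule above can be sketched for the paired case. This is a minimal illustration, not the class's actual code: it covers only the paired t-test, it expects the caller to supply the critical t value for the chosen significance level (the standard library has no t-distribution quantile), and the names `RaceDecisionSketch`, `pairedT`, `decide`, and `Outcome` are hypothetical.

```java
// Hypothetical sketch, not part of Weka.
final class RaceDecisionSketch {
    enum Outcome { DROP_FIRST, DROP_SECOND, DROP_EITHER, KEEP_BOTH }

    /** Mean of the per-fold differences a[i] - b[i]. */
    static double meanDifference(double[] a, double[] b) {
        double mean = 0.0;
        for (int i = 0; i < a.length; i++) {
            mean += (a[i] - b[i]) / a.length;
        }
        return mean;
    }

    /** Paired t statistic for the per-fold error differences a[i] - b[i]. */
    static double pairedT(double[] a, double[] b) {
        int n = a.length;
        double mean = meanDifference(a, b);
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double d = (a[i] - b[i]) - mean;
            sumSquares += d * d;
        }
        double stdErr = Math.sqrt(sumSquares / (n - 1)) / Math.sqrt(n);
        return mean / stdErr; // NaN when all differences are exactly zero
    }

    /**
     * Significant difference: the subset with the higher mean error loses.
     * No significant difference but means within the threshold: one of the
     * two can be dropped. Otherwise both stay in the race.
     */
    static Outcome decide(double[] a, double[] b, double criticalT, double threshold) {
        double t = pairedT(a, b);
        if (Math.abs(t) > criticalT) {
            return t > 0 ? Outcome.DROP_FIRST : Outcome.DROP_SECOND;
        }
        if (Math.abs(meanDifference(a, b)) < threshold) {
            return Outcome.DROP_EITHER;
        }
        return Outcome.KEEP_BOTH;
    }

    public static void main(String[] args) {
        double[] a = {0.30, 0.32, 0.31, 0.33, 0.29}; // assumed per-fold errors
        double[] b = {0.20, 0.21, 0.19, 0.22, 0.20};
        // 8.610: tabulated two-sided critical t, 4 degrees of freedom, level 0.001
        System.out.println(decide(a, b, 8.610, 0.001)); // prints DROP_FIRST
    }
}
```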
For more information see:
Andrew W. Moore, Mary S. Lee: Efficient Algorithms for Minimizing Cross Validation Error. In: Eleventh International Conference on Machine Learning, 190-198, 1994.
@inproceedings{Moore1994,
   author = {Andrew W. Moore and Mary S. Lee},
   booktitle = {Eleventh International Conference on Machine Learning},
   pages = {190-198},
   publisher = {Morgan Kaufmann},
   title = {Efficient Algorithms for Minimizing Cross Validation Error},
   year = {1994}
}
Valid options are:
-R <0 = forward | 1 = backward race | 2 = schemata | 3 = rank> Type of race to perform. (default = 0).
-L <significance> Significance level for comparisons (default = 0.001 (forward/backward/rank) / 0.01 (schemata)).
-T <threshold> Threshold for error comparison. (default = 0.001).
-A <attribute evaluator> Attribute ranker to use if doing a rank search. Place any evaluator options LAST on the command line following a "--". e.g. -A weka.attributeSelection.GainRatioAttributeEval ... -- -M. (default = GainRatioAttributeEval)
-F <0 = 10 fold | 1 = leave-one-out> Folds for cross validation (default = 0 (1 if schemata race))
-Q Generate a ranked list of attributes. Forces the search to be forward and races until all attributes have been selected, thus producing a ranking.
-N <num to select> Specify number of attributes to retain from the ranking. Overrides -T. Use in conjunction with -Q
-J <threshold> Specify a threshold by which attributes may be discarded from the ranking. Use in conjunction with -Q
-Z Verbose output for monitoring the search.
Options specific to evaluator weka.attributeSelection.GainRatioAttributeEval:
-M Treat missing values as a separate value.
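The option layout above, in particular the rule that evaluator options go last after a "--", can be sketched by assembling the option array by hand. The helper class `RaceSearchOptionsSketch` and its method are hypothetical and do not touch Weka itself. An array of this shape is what setOptions(java.lang.String[]) is documented to accept, though how it is parsed is not verified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka.
final class RaceSearchOptionsSketch {
    /** Assembles options for a rank race; evaluator options go last, after "--". */
    static String[] rankRaceOptions(String evaluatorClass, String... evaluatorOptions) {
        List<String> opts = new ArrayList<>(Arrays.asList(
            "-R", "3",      // race type: 3 = rank
            "-L", "0.001",  // significance level for comparisons
            "-T", "0.001",  // threshold for error comparison
            "-F", "0",      // 0 = 10 fold cross validation
            "-A", evaluatorClass));
        if (evaluatorOptions.length > 0) {
            opts.add("--"); // separator before evaluator-specific options
            opts.addAll(Arrays.asList(evaluatorOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = rankRaceOptions(
            "weka.attributeSelection.GainRatioAttributeEval", "-M");
        System.out.println(String.join(" ", options));
        // prints: -R 3 -L 0.001 -T 0.001 -F 0 -A weka.attributeSelection.GainRatioAttributeEval -- -M
    }
}
```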
Field Summary | |
---|---|
static Tag[] | TAGS_SELECTION |
static Tag[] | XVALTAGS_SELECTION |
Constructor Summary | |
---|---|
RaceSearch() | |
Method Summary | |
---|---|
java.lang.String | attributeEvaluatorTipText() - Returns the tip text for this property |
java.lang.String | debugTipText() - Returns the tip text for this property |
java.lang.String | foldsTypeTipText() - Returns the tip text for this property |
java.lang.String | generateRankingTipText() - Returns the tip text for this property |
ASEvaluation | getAttributeEvaluator() - Get the attribute evaluator used to generate the ranking. |
int | getCalculatedNumToSelect() - Gets the calculated number of attributes to retain. |
boolean | getDebug() - Get whether output is to be verbose |
SelectedTag | getFoldsType() - Get the cross-validation fold type |
boolean | getGenerateRanking() - Gets whether ranking has been requested. |
int | getNumToSelect() - Gets the number of attributes to be retained. |
java.lang.String[] | getOptions() - Gets the current settings of RaceSearch. |
SelectedTag | getRaceType() - Get the race type |
java.lang.String | getRevision() - Returns the revision string. |
double | getSelectionThreshold() - Returns the threshold so that the AttributeSelection module can discard attributes from the ranking. |
double | getSignificanceLevel() - Get the significance level |
TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
double | getThreshold() - Get the threshold |
java.lang.String | globalInfo() - Returns a string describing this search method |
java.util.Enumeration | listOptions() - Returns an enumeration describing the available options. |
java.lang.String | numToSelectTipText() - Returns the tip text for this property |
java.lang.String | raceTypeTipText() - Returns the tip text for this property |
double[][] | rankedAttributes() - Returns an X by 2 list of attribute indexes and corresponding evaluations from best (highest) to worst. |
int[] | search(ASEvaluation ASEval, Instances data) - Searches the attribute subset space by racing cross validation errors of competing subsets |
java.lang.String | selectionThresholdTipText() - Returns the tip text for this property |
void | setAttributeEvaluator(ASEvaluation newEvaluator) - Set the attribute evaluator to use for generating the ranking. |
void | setDebug(boolean d) - Set whether verbose output should be generated. |
void | setFoldsType(SelectedTag d) - Set the cross-validation fold type |
void | setGenerateRanking(boolean doRank) - Records whether the user has requested a ranked list of attributes. |
void | setNumToSelect(int n) - Specify the number of attributes to select from the ranked list (if generating a ranking). |
void | setOptions(java.lang.String[] options) - Parses a given list of options. |
void | setRaceType(SelectedTag d) - Set the race type |
void | setSelectionThreshold(double threshold) - Set the threshold by which the AttributeSelection module can discard attributes. |
void | setSignificanceLevel(double sig) - Sets the significance level to use |
void | setThreshold(double t) - Sets the threshold for comparisons |
java.lang.String | significanceLevelTipText() - Returns the tip text for this property |
java.lang.String | thresholdTipText() - Returns the tip text for this property |
java.lang.String | toString() - Returns a string representation |
Methods inherited from class weka.attributeSelection.ASSearch |
---|
forName, makeCopies |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final Tag[] TAGS_SELECTION
public static final Tag[] XVALTAGS_SELECTION
Constructor Detail |
---|
public RaceSearch()
Method Detail |
---|
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
public java.lang.String raceTypeTipText()
public void setRaceType(SelectedTag d)
Parameters:
d - the type of race
public SelectedTag getRaceType()
public java.lang.String significanceLevelTipText()
public void setSignificanceLevel(double sig)
Parameters:
sig - the significance level
public double getSignificanceLevel()
public java.lang.String thresholdTipText()
public void setThreshold(double t)
Specified by:
setThreshold in interface RankedOutputSearch
Parameters:
t - the threshold to use
public double getThreshold()
Specified by:
getThreshold in interface RankedOutputSearch
public java.lang.String foldsTypeTipText()
public void setFoldsType(SelectedTag d)
Parameters:
d - the type of cross validation
public SelectedTag getFoldsType()
public java.lang.String debugTipText()
public void setDebug(boolean d)
Parameters:
d - true if output is to be verbose.
public boolean getDebug()
public java.lang.String attributeEvaluatorTipText()
public void setAttributeEvaluator(ASEvaluation newEvaluator)
Parameters:
newEvaluator - the attribute evaluator to use.
public ASEvaluation getAttributeEvaluator()
public java.lang.String generateRankingTipText()
public void setGenerateRanking(boolean doRank)
Specified by:
setGenerateRanking in interface RankedOutputSearch
Parameters:
doRank - true if ranking is requested
public boolean getGenerateRanking()
Specified by:
getGenerateRanking in interface RankedOutputSearch
public java.lang.String numToSelectTipText()
public void setNumToSelect(int n)
Specified by:
setNumToSelect in interface RankedOutputSearch
Parameters:
n - the number of attributes to retain
public int getNumToSelect()
Specified by:
getNumToSelect in interface RankedOutputSearch
public int getCalculatedNumToSelect()
Specified by:
getCalculatedNumToSelect in interface RankedOutputSearch
public java.lang.String selectionThresholdTipText()
public void setSelectionThreshold(double threshold)
Parameters:
threshold - the threshold.
public double getSelectionThreshold()
public java.util.Enumeration listOptions()
Specified by:
listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by:
getOptions in interface OptionHandler
public int[] search(ASEvaluation ASEval, Instances data) throws java.lang.Exception
Specified by:
search in class ASSearch
Parameters:
ASEval - the attribute evaluator to guide the search
data - the training instances.
Throws:
java.lang.Exception - if the search can't be completed
public double[][] rankedAttributes() throws java.lang.Exception
Specified by:
rankedAttributes in interface RankedOutputSearch
Throws:
java.lang.Exception - if the ranking can't be produced
public java.lang.String toString()
Overrides:
toString in class java.lang.Object
public java.lang.String getRevision()
Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class ASSearch