weka package¶
Subpackages¶
- weka.core package
- Submodules
- weka.core.capabilities module
- weka.core.classes module
- weka.core.converters module
- weka.core.database module
- weka.core.dataset module
- weka.core.jvm module
- weka.core.packages module
- weka.core.serialization module
- weka.core.stemmers module
- weka.core.stopwords module
- weka.core.tokenizers module
- weka.core.types module
- weka.core.version module
- Module contents
- weka.flow package
- weka.plot package
Submodules¶
weka.associations module¶
-
class
weka.associations.
AssociationRule
(jobject)¶ Bases:
weka.core.classes.JavaObject
Wrapper for weka.associations.AssociationRule class.
-
consequence
¶ Get the consequence.
Returns: the consequence, list of Item objects Return type: list
-
consequence_support
¶ Get the support for the consequence.
Returns: the support Return type: int
-
metric_names
¶ Returns the metric names for the rule.
Returns: the metric names Return type: list
-
metric_value
(name)¶ Returns the named metric value for the rule.
Parameters: name (str) – the name of the metric Returns: the metric value Return type: float
-
metric_values
¶ Returns the metric values for the rule.
Returns: the metric values Return type: ndarray
-
premise
¶ Get the premise.
Returns: the premise, list of Item objects Return type: list
-
premise_support
¶ Get the support for the premise.
Returns: the support Return type: int
-
primary_metric_name
¶ Returns the primary metric name for the rule.
Returns: the metric name Return type: str
-
primary_metric_value
¶ Returns the primary metric value for the rule.
Returns: the metric value Return type: float
-
total_support
¶ Get the total support.
Returns: the support Return type: int
-
total_transactions
¶ Get the total transactions.
Returns: the transactions Return type: int
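The support counts exposed above are the usual inputs to rule metrics such as confidence and lift. A minimal sketch of that arithmetic in plain Python follows, assuming total_support counts the transactions that contain both premise and consequence. The function names and the counts are hypothetical; the metrics a given associator actually reports come from metric_names and metric_value.

```python
def rule_confidence(premise_support, total_support):
    """Share of premise transactions that also contain the consequence."""
    return float(total_support) / premise_support


def rule_lift(premise_support, consequence_support, total_support, total_transactions):
    """Confidence divided by the overall frequency of the consequence."""
    confidence = rule_confidence(premise_support, total_support)
    consequence_rate = float(consequence_support) / total_transactions
    return confidence / consequence_rate


# Hypothetical counts standing in for premise_support, consequence_support,
# total_support and total_transactions of a single rule.
confidence = rule_confidence(40, 30)
lift = rule_lift(40, 50, 30, 100)
print(confidence, lift)
```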
-
-
class
weka.associations.
AssociationRules
(jobject)¶ Bases:
weka.core.classes.JavaObject
Wrapper for weka.associations.AssociationRules class.
-
producer
¶ Returns a string describing the producer that generated these rules.
Returns: the producer Return type: str
-
-
class
weka.associations.
AssociationRulesIterator
(rules)¶ Bases:
object
Iterator for weka.associations.AssociationRules class.
-
next
()¶ Returns the next rule.
Returns: the next rule object Return type: AssociationRule
-
-
class
weka.associations.
Associator
(classname=None, jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for associators.
-
association_rules
()¶ Returns the association rules that were generated. Only available if the scheme implements AssociationRulesProducer.
Returns: the association rules that were generated Return type: AssociationRules
-
build_associations
(data)¶ Builds the associator with the data.
Parameters: data (Instances) – the data to train the associator with
-
can_produce_rules
()¶ Checks whether association rules can be generated.
Returns: whether the scheme implements the AssociationRulesProducer interface and association rules can be generated Return type: bool
-
capabilities
¶ Returns the capabilities of the associator.
Returns: the capabilities Return type: Capabilities
-
classmethod
make_copy
(associator)¶ Creates a copy of the associator.
Parameters: associator (Associator) – the associator to copy Returns: the copy of the associator Return type: Associator
-
rule_metric_names
¶ Returns the rule metric names of the association rules. Only available if the scheme implements AssociationRulesProducer.
Returns: the metric names Return type: list
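A sketch of how the Associator members above might fit together. It assumes the Loader class from weka.core.converters and its load_file method (documented in another module, not in this section), that an AssociationRules object can be iterated directly, and that str() on an Item yields a readable label. The function name mine_rules and the choice of Apriori are illustrative only.

```python
def mine_rules(arff_path):
    """Build Apriori on a nominal dataset and return (premise, consequence, metric) tuples."""
    # Imports are local so this module can be loaded without a JVM present.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader  # assumption: documented elsewhere
    from weka.associations import Associator

    jvm.start()
    try:
        loader = Loader(classname="weka.core.converters.ArffLoader")
        data = loader.load_file(arff_path)

        associator = Associator(classname="weka.associations.Apriori")
        associator.build_associations(data)

        results = []
        # Rules are only available for schemes implementing AssociationRulesProducer.
        if associator.can_produce_rules():
            for rule in associator.association_rules():
                results.append((
                    [str(item) for item in rule.premise],      # assumption: str() gives a label
                    [str(item) for item in rule.consequence],
                    rule.primary_metric_value,
                ))
        return results
    finally:
        jvm.stop()
```

Calling mine_rules with the path to a nominal ARFF file returns one tuple per generated rule.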
-
-
class
weka.associations.
Item
(jobject)¶ Bases:
weka.core.classes.JavaObject
Wrapper for weka.associations.Item class.
-
comparison
¶ Returns the comparison operator as string.
Returns: the comparison operator Return type: str
-
decrease_frequency
(frequency=None)¶ Decreases the frequency.
Parameters: frequency (int) – the frequency to decrease by, 1 if None
-
frequency
¶ Returns the frequency.
Returns: the frequency Return type: int
-
increase_frequency
(frequency=None)¶ Increases the frequency.
Parameters: frequency (int) – the frequency to increase by, 1 if None
-
item_value
¶ Returns the item value as string.
Returns: the item value Return type: str
-
-
weka.associations.
main
(args=None)¶ Runs an associator from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.associations.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int
weka.attribute_selection module¶
-
class
weka.attribute_selection.
ASEvaluation
(classname='weka.attributeSelection.CfsSubsetEval', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for attribute selection evaluation algorithm.
-
build_evaluator
(data)¶ Builds the evaluator with the data.
Parameters: data (Instances) – the data to use
-
capabilities
¶ Returns the capabilities of the evaluator.
Returns: the capabilities Return type: Capabilities
-
post_process
(indices)¶ Post-processes the evaluator with the selected attribute indices.
Parameters: indices (ndarray) – the attribute indices list to use Returns: the processed indices Return type: ndarray
-
-
class
weka.attribute_selection.
ASSearch
(classname='weka.attributeSelection.BestFirst', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for attribute selection search algorithm.
-
search
(evaluation, data)¶ Performs the search and returns the indices of the selected attributes.
Parameters: - evaluation (ASEvaluation) – the evaluation algorithm to use
- data (Instances) – the data to use
Returns: the selected attributes (0-based indices)
Return type: ndarray
-
-
class
weka.attribute_selection.
AttributeSelection
¶ Bases:
weka.core.classes.JavaObject
Performs attribute selection using search and evaluation algorithms.
-
classmethod
attribute_selection
(evaluator, args)¶ Performs attribute selection using the given attribute evaluator and options.
Parameters: - evaluator (ASEvaluation) – the evaluator to use
- args (list) – the command-line args for the attribute selection
Returns: the results string
Return type: str
-
crossvalidation
(crossvalidation)¶ Sets whether to perform cross-validation.
Parameters: crossvalidation (bool) – whether to perform cross-validation
-
cv_results
¶ Generates a results string from the last cross-validation attribute selection.
Returns: the results string Return type: str
-
evaluator
(evaluator)¶ Sets the evaluator to use.
Parameters: evaluator (ASEvaluation) – the evaluator to use.
-
folds
(folds)¶ Sets the number of folds to use for cross-validation.
Parameters: folds (int) – the number of folds
-
number_attributes_selected
¶ Returns the number of attributes that were selected.
Returns: the number of attributes Return type: int
-
ranked_attributes
¶ Returns the matrix of ranked attributes from the last run.
Returns: the Numpy matrix Return type: ndarray
-
ranking
(ranking)¶ Sets whether to perform a ranking, if possible.
Parameters: ranking (bool) – whether to perform a ranking
-
reduce_dimensionality
(data)¶ Reduces the dimensionality of the provided Instance or Instances object.
Parameters: data (Instances) – the data to process Returns: the reduced dataset Return type: Instances
-
results_string
¶ Generates a results string from the last attribute selection.
Returns: the results string Return type: str
-
search
(search)¶ Sets the search algorithm to use.
Parameters: search (ASSearch) – the search algorithm
-
seed
(seed)¶ Sets the seed for cross-validation.
Parameters: seed (int) – the seed value
-
select_attributes
(instances)¶ Performs attribute selection on the given dataset.
Parameters: instances (Instances) – the data to process
-
select_attributes_cv_split
(instances)¶ Performs attribute selection on the given cross-validation split.
Parameters: instances (Instances) – the data to process
-
selected_attributes
¶ Returns the selected attributes from the last run.
Returns: the Numpy array of 0-based indices Return type: ndarray
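A sketch of a typical attribute selection run using the setters and properties listed above. The Loader class, load_file and class_is_last come from other modules of the package and are assumptions here, as are the function name select_with_cfs and the search options.

```python
def select_with_cfs(arff_path):
    """Run CfsSubsetEval with BestFirst and return (count, indices, report)."""
    # Imports are local so this module can be loaded without a JVM present.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader  # assumption: documented elsewhere
    from weka.attribute_selection import ASEvaluation, ASSearch, AttributeSelection

    jvm.start()
    try:
        data = Loader(classname="weka.core.converters.ArffLoader").load_file(arff_path)
        data.class_is_last()  # assumption: the class attribute is the last one

        search = ASSearch(classname="weka.attributeSelection.BestFirst",
                          options=["-D", "1", "-N", "5"])
        evaluator = ASEvaluation(classname="weka.attributeSelection.CfsSubsetEval")

        attsel = AttributeSelection()
        attsel.search(search)        # setter: search algorithm
        attsel.evaluator(evaluator)  # setter: evaluation algorithm
        attsel.select_attributes(data)

        return (attsel.number_attributes_selected,
                attsel.selected_attributes,  # 0-based indices
                attsel.results_string)
    finally:
        jvm.stop()
```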
-
-
weka.attribute_selection.
main
(args=None)¶ Runs attribute selection from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.attribute_selection.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int
weka.classifiers module¶
-
class
weka.classifiers.
Classifier
(classname='weka.classifiers.rules.ZeroR', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for classifiers.
-
batch_size
¶ Returns the batch size, in case this classifier is a batch predictor.
Returns: the batch size, None if not a batch predictor Return type: str
-
build_classifier
(data)¶ Builds the classifier with the data.
Parameters: data (Instances) – the data to train the classifier with
-
capabilities
¶ Returns the capabilities of the classifier.
Returns: the capabilities Return type: Capabilities
-
classify_instance
(inst)¶ Performs a prediction.
Parameters: inst (Instance) – the Instance to get a prediction for Returns: the classification (either regression value or 0-based label index) Return type: float
-
classmethod
deserialize
(ser_file)¶ Deserializes a classifier from a file.
Parameters: ser_file (str) – the model file to deserialize Returns: model and, if available, the dataset header Return type: tuple
-
distribution_for_instance
(inst)¶ Performs a prediction, returning the class distribution.
Parameters: inst (Instance) – the Instance to get the class distribution for Returns: the class distribution array Return type: ndarray
-
distributions_for_instances
(data)¶ Performs predictions, returning the class distributions.
Parameters: data (Instances) – the Instances to get the class distributions for Returns: the class distribution matrix, None if not a batch predictor Return type: ndarray
-
graph
¶ Returns the graph if classifier implements weka.core.Drawable, otherwise None.
Returns: the generated graph string Return type: str
-
graph_type
¶ Returns the graph type if classifier implements weka.core.Drawable, otherwise -1.
Returns: the type Return type: int
-
has_efficient_batch_prediction
()¶ Returns whether the classifier implements a more efficient batch prediction.
Returns: True if a more efficient batch prediction is implemented, always False if not batch predictor Return type: bool
-
classmethod
make_copy
(classifier)¶ Creates a copy of the classifier.
Parameters: classifier (Classifier) – the classifier to copy Returns: the copy of the classifier Return type: Classifier
-
serialize
(ser_file, header=None)¶ Serializes the classifier to the specified file.
Parameters: - ser_file (str) – the file to save the model to
- header (Instances) – the (optional) dataset header to store alongside; recommended
-
to_source
(classname)¶ Returns the model as Java source code if the classifier implements weka.classifiers.Sourcable.
Parameters: classname (str) – the classname for the generated Java code Returns: the model as source code string Return type: str
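A sketch of the build, serialize and deserialize cycle described above. Loader, load_file and class_is_last are assumed from other modules of the package; the function name train_and_roundtrip and the J48 options are illustrative.

```python
def train_and_roundtrip(arff_path, model_path):
    """Train J48, save it with the dataset header, reload it and classify."""
    # Imports are local so this module can be loaded without a JVM present.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader  # assumption: documented elsewhere
    from weka.classifiers import Classifier

    jvm.start()
    try:
        data = Loader(classname="weka.core.converters.ArffLoader").load_file(arff_path)
        data.class_is_last()  # assumption: the class attribute is the last one

        cls = Classifier(classname="weka.classifiers.trees.J48", options=["-C", "0.3"])
        cls.build_classifier(data)

        # Storing the header alongside the model is optional but recommended.
        cls.serialize(model_path, header=data)

        # deserialize returns a tuple: the model and, if available, the header.
        model, header = Classifier.deserialize(model_path)

        # For nominal classes each result is a 0-based label index.
        return [model.classify_instance(inst) for inst in data]
    finally:
        jvm.stop()
```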
-
-
class
weka.classifiers.
CostMatrix
(matrx=None, num_classes=None)¶ Bases:
weka.core.classes.JavaObject
Class for storing and manipulating a misclassification cost matrix. The element at position i,j in the matrix is the penalty for classifying an instance of class j as class i. Cost values can be fixed or computed on a per-instance basis (cost sensitive evaluation only) from the value of an attribute or an expression involving attribute(s).
-
apply_cost_matrix
(data, rnd)¶ Applies the cost matrix to the data.
Parameters:
-
expected_costs
(class_probs, inst=None)¶ Calculates the expected misclassification cost for each possible class value, given class probability estimates.
Parameters: class_probs (ndarray) – the class probabilities Returns: the calculated costs Return type: ndarray
-
get_cell
(row, col)¶ Returns the JB_Object at the specified location.
Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
Returns: the object in that cell
Return type: JB_Object
-
get_element
(row, col, inst=None)¶ Returns the value at the specified location.
Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- inst (Instance) – the Instance
Returns: the value in that cell
Return type: float
-
get_max_cost
(class_value, inst=None)¶ Gets the maximum cost for a particular class value.
Parameters: - class_value (int) – the class value to get the maximum cost for
- inst (Instance) – the Instance
Returns: the cost
Return type: float
-
initialize
()¶ Initializes the matrix.
-
normalize
()¶ Normalizes the matrix.
-
num_columns
¶ Returns the number of columns.
Returns: the number of columns Return type: int
-
num_rows
¶ Returns the number of rows.
Returns: the number of rows Return type: int
-
classmethod
parse_matlab
(matlab)¶ Parses the costmatrix definition in matlab format and returns a matrix.
Parameters: matlab (str) – the matlab matrix string, e.g., [1 2; 3 4]. Returns: the generated matrix Return type: CostMatrix
-
set_cell
(row, col, obj)¶ Sets the JB_Object at the specified location. Automatically unwraps JavaObject.
Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- obj (object) – the object for that cell
-
set_element
(row, col, value)¶ Sets the float value at the specified location.
Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- value (float) – the float value for that cell
-
size
¶ Returns the number of rows/columns.
Returns: the number of rows/columns Return type: int
-
to_matlab
()¶ Returns the matrix in Matlab format.
Returns: the matrix as Matlab formatted string Return type: str
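To make the Matlab-style string format concrete, here is a plain-Python sketch that reads and writes strings such as [1 2; 3 4]. It only illustrates the format. It is not the parser used by parse_matlab, which may accept more variations, and both function names are hypothetical.

```python
def parse_matlab_matrix(text):
    """Turn '[1 2; 3 4]' into a list of rows of floats."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a string like '[1 2; 3 4]'")
    rows = []
    # Rows are separated by semicolons, cells by whitespace.
    for row_text in body[1:-1].split(";"):
        rows.append([float(cell) for cell in row_text.split()])
    return rows


def to_matlab_string(rows):
    """Turn a list of rows back into a Matlab-style string."""
    return "[" + "; ".join(" ".join("%g" % value for value in row) for row in rows) + "]"


matrix = parse_matlab_matrix("[1 2; 3 4]")
text = to_matlab_string(matrix)
print(matrix, text)
```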
-
-
class
weka.classifiers.
Evaluation
(data, cost_matrix=None)¶ Bases:
weka.core.classes.JavaObject
Evaluation class for classifiers.
-
area_under_prc
(class_index)¶ Returns the area under precision recall curve.
Parameters: class_index (int) – the 0-based index of the class label Returns: the area Return type: float
-
area_under_roc
(class_index)¶ Returns the area under the receiver operating characteristic (ROC) curve.
Parameters: class_index (int) – the 0-based index of the class label Returns: the area Return type: float
-
avg_cost
¶ Returns the average cost.
Returns: the cost Return type: float
-
class_details
(title=None)¶ Generates the class details.
Parameters: title (str) – optional title Returns: the details Return type: str
-
class_priors
¶ Returns the class priors.
Returns: the priors Return type: ndarray
-
confusion_matrix
¶ Returns the confusion matrix.
Returns: the matrix Return type: ndarray
-
correct
¶ Returns the correct count (nominal classes).
Returns: the count Return type: float
-
correlation_coefficient
¶ Returns the correlation coefficient (numeric classes).
Returns: the coefficient Return type: float
-
coverage_of_test_cases_by_predicted_regions
¶ Returns the coverage of the test cases by the predicted regions at the confidence level specified when evaluation was performed.
Returns: the coverage Return type: float
-
crossvalidate_model
(classifier, data, num_folds, rnd, output=None)¶ Crossvalidates the model using the specified data, number of folds and random number generator wrapper.
Parameters: - classifier (Classifier) – the classifier to cross-validate
- data (Instances) – the data to evaluate on
- num_folds (int) – the number of folds
- rnd (Random) – the random number generator to use
- output (PredictionOutput) – the output generator to use
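A sketch of a cross-validation run that also collects per-instance predictions through a PredictionOutput. Loader, load_file, class_is_last and the Random wrapper from weka.core.classes are assumed from other modules of the package; the function name crossvalidate_j48 and the defaults are illustrative.

```python
def crossvalidate_j48(arff_path, folds=10, seed=1):
    """Cross-validate J48 and return (summary text, prediction output text)."""
    # Imports are local so this module can be loaded without a JVM present.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader  # assumption: documented elsewhere
    from weka.core.classes import Random     # assumption: documented elsewhere
    from weka.classifiers import Classifier, Evaluation, PredictionOutput

    jvm.start()
    try:
        data = Loader(classname="weka.core.converters.ArffLoader").load_file(arff_path)
        data.class_is_last()  # assumption: the class attribute is the last one

        cls = Classifier(classname="weka.classifiers.trees.J48")
        pout = PredictionOutput(
            classname="weka.classifiers.evaluation.output.prediction.PlainText")

        evl = Evaluation(data)
        evl.crossvalidate_model(cls, data, folds, Random(seed), output=pout)

        return evl.summary(), pout.buffer_content()
    finally:
        jvm.stop()
```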
-
cumulative_margin_distribution
()¶ Outputs the cumulative margin distribution as a string suitable as input for gnuplot or a similar package.
Returns: the cumulative margin distribution Return type: str
-
discard_predictions
¶ Returns whether to discard predictions (saves memory).
Returns: True if to discard Return type: bool
-
error_rate
¶ Returns the error rate (numeric classes).
Returns: the rate Return type: float
-
classmethod
evaluate_model
(classifier, args)¶ Evaluates the classifier with the given options.
Parameters: - classifier (Classifier) – the classifier instance to use
- args (list) – the command-line arguments to use
Returns: the evaluation string
Return type: str
-
evaluate_train_test_split
(classifier, data, percentage, rnd=None, output=None)¶ Splits the data into train and test, builds the classifier with the training data and evaluates it against the test set.
Parameters: - classifier (Classifier) – the classifier to build and evaluate
- data (Instances) – the data to evaluate on
- percentage (float) – the percentage split to use (amount to use for training)
- rnd (Random) – the random number generator to use, if None the order gets preserved
- output (PredictionOutput) – the output generator to use
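The idea of a percentage split can be sketched in plain Python: the first part of the (optionally shuffled) data is used for training and the remainder for testing. The rounding rule and the function name below are assumptions for illustration, not necessarily what the library does internally.

```python
import random


def train_test_split(rows, percentage, seed=None):
    """Split rows into (train, test); the order is preserved when seed is None."""
    rows = list(rows)
    if seed is not None:
        random.Random(seed).shuffle(rows)
    # Assumption: the training size is the rounded share of the data.
    train_size = int(round(len(rows) * percentage / 100.0))
    return rows[:train_size], rows[train_size:]


# Ten rows with a 66% split and no shuffling.
train, test = train_test_split(range(10), 66.0)
print(train, test)
```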
-
f_measure
(class_index)¶ Returns the f measure.
Parameters: class_index (int) – the 0-based index of the class label Returns: the measure Return type: float
-
false_negative_rate
(class_index)¶ Returns the false negative rate.
Parameters: class_index (int) – the 0-based index of the class label Returns: the rate Return type: float
-
false_positive_rate
(class_index)¶ Returns the false positive rate.
Parameters: class_index (int) – the 0-based index of the class label Returns: the rate Return type: float
-
incorrect
¶ Returns the incorrect count (nominal classes).
Returns: the count Return type: float
-
kappa
¶ Returns kappa.
Returns: kappa Return type: float
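For reference, Cohen's kappa can be derived from a confusion matrix as the observed agreement corrected for the agreement expected by chance. The sketch below is a plain-Python illustration with a made-up matrix and a hypothetical function name. It assumes rows are actual classes and columns are predicted classes, and it does not handle the degenerate case where chance agreement equals 1.

```python
def cohen_kappa(matrix):
    """Compute kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    size = len(matrix)
    total = float(sum(sum(row) for row in matrix))
    observed = sum(matrix[i][i] for i in range(size)) / total
    expected = 0.0
    for i in range(size):
        row_share = sum(matrix[i]) / total
        col_share = sum(row[i] for row in matrix) / total
        expected += row_share * col_share
    return (observed - expected) / (1.0 - expected)


# Made-up 2x2 confusion matrix.
kappa = cohen_kappa([[20, 5], [10, 15]])
print(kappa)
```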
-
kb_information
¶ Returns KB information.
Returns: the information Return type: float
-
kb_mean_information
¶ Returns KB mean information.
Returns: the information Return type: float
-
kb_relative_information
¶ Returns KB relative information.
Returns: the information Return type: float
-
matrix
(title=None)¶ Generates the confusion matrix.
Parameters: title (str) – optional title Returns: the matrix Return type: str
-
matthews_correlation_coefficient
(class_index)¶ Returns the Matthews correlation coefficient (nominal classes).
Parameters: class_index (int) – the 0-based index of the class label Returns: the coefficient Return type: float
-
mean_absolute_error
¶ Returns the mean absolute error.
Returns: the error Return type: float
-
mean_prior_absolute_error
¶ Returns the mean prior absolute error.
Returns: the error Return type: float
-
num_false_negatives
(class_index)¶ Returns the number of false negatives.
Parameters: class_index (int) – the 0-based index of the class label Returns: the count Return type: float
-
num_false_positives
(class_index)¶ Returns the number of false positives.
Parameters: class_index (int) – the 0-based index of the class label Returns: the count Return type: float
-
num_instances
¶ Returns the number of instances that had a known class value.
Returns: the number of instances Return type: float
-
num_true_negatives
(class_index)¶ Returns the number of true negatives.
Parameters: class_index (int) – the 0-based index of the class label Returns: the count Return type: float
-
num_true_positives
(class_index)¶ Returns the number of true positives.
Parameters: class_index (int) – the 0-based index of the class label Returns: the count Return type: float
-
percent_correct
¶ Returns the percent correct (nominal classes).
Returns: the percentage Return type: float
-
percent_incorrect
¶ Returns the percent incorrect (nominal classes).
Returns: the percentage Return type: float
-
percent_unclassified
¶ Returns the percent unclassified.
Returns: the percentage Return type: float
-
precision
(class_index)¶ Returns the precision.
Parameters: class_index (int) – the 0-based index of the class label Returns: the precision Return type: float
-
predictions
¶ Returns the predictions.
Returns: the predictions. None if not available Return type: list
-
recall
(class_index)¶ Returns the recall.
Parameters: class_index (int) – the 0-based index of the class label Returns: the recall Return type: float
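The per-class measures above (precision, recall, F-measure) all take a 0-based class_index and can be derived from the confusion matrix. A plain-Python sketch of the arithmetic follows, assuming rows are actual classes and columns are predicted classes. The matrix is made up, the function names are hypothetical, and returning 0.0 for an empty denominator is a choice of this sketch rather than documented library behaviour.

```python
def class_counts(matrix, class_index):
    """Return (true positives, false positives, false negatives) for one class."""
    tp = matrix[class_index][class_index]
    fn = sum(matrix[class_index]) - tp                  # actual class, predicted otherwise
    fp = sum(row[class_index] for row in matrix) - tp   # predicted class, actually otherwise
    return tp, fp, fn


def precision(matrix, class_index):
    tp, fp, _ = class_counts(matrix, class_index)
    return tp / float(tp + fp) if tp + fp else 0.0


def recall(matrix, class_index):
    tp, _, fn = class_counts(matrix, class_index)
    return tp / float(tp + fn) if tp + fn else 0.0


def f_measure(matrix, class_index):
    p = precision(matrix, class_index)
    r = recall(matrix, class_index)
    return 2 * p * r / (p + r) if p + r else 0.0


# Made-up 2x2 confusion matrix.
matrix = [[20, 5], [10, 15]]
print(precision(matrix, 0), recall(matrix, 0), f_measure(matrix, 0))
```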
-
relative_absolute_error
¶ Returns the relative absolute error.
Returns: the error Return type: float
-
root_mean_prior_squared_error
¶ Returns the root mean prior squared error.
Returns: the error Return type: float
-
root_mean_squared_error
¶ Returns the root mean squared error.
Returns: the error Return type: float
-
root_relative_squared_error
¶ Returns the root relative squared error.
Returns: the error Return type: float
-
sf_entropy_gain
¶ Returns the total SF, which is the null model entropy minus the scheme entropy.
Returns: the gain Return type: float
-
sf_mean_entropy_gain
¶ Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.
Returns: the gain Return type: float
-
sf_mean_prior_entropy
¶ Returns the entropy per instance for the null model.
Returns: the entropy Return type: float
-
sf_mean_scheme_entropy
¶ Returns the entropy per instance for the scheme.
Returns: the entropy Return type: float
-
sf_prior_entropy
¶ Returns the total entropy for the null model.
Returns: the entropy Return type: float
-
sf_scheme_entropy
¶ Returns the total entropy for the scheme.
Returns: the entropy Return type: float
-
size_of_predicted_regions
¶ Returns the average size of the predicted regions, relative to the range of the target in the training data, at the confidence level specified when evaluation was performed.
Returns: the size of the regions Return type: float
-
summary
(title=None, complexity=False)¶ Generates a summary.
Parameters: - title (str) – optional title
- complexity (bool) – whether to print the complexity information as well
Returns: the summary
Return type: str
-
test_model
(classifier, data, output=None)¶ Evaluates the built model using the specified test data and returns the classifications.
Parameters: - classifier (Classifier) – the trained classifier to evaluate
- data (Instances) – the data to evaluate on
- output (PredictionOutput) – the output generator to use
Returns: the classifications
Return type: ndarray
-
test_model_once
(classifier, inst)¶ Evaluates the built model using the specified test instance and returns the classification.
Parameters: - classifier (Classifier) – the trained classifier to evaluate
- inst (Instance) – the Instance to evaluate on
Returns: the classification
Return type: float
-
total_cost
¶ Returns the total cost.
Returns: the cost Return type: float
-
true_negative_rate
(class_index)¶ Returns the true negative rate.
Parameters: class_index (int) – the 0-based index of the class label Returns: the rate Return type: float
-
true_positive_rate
(class_index)¶ Returns the true positive rate.
Parameters: class_index (int) – the 0-based index of the class label Returns: the rate Return type: float
-
unclassified
¶ Returns the unclassified count.
Returns: the count Return type: float
-
unweighted_macro_f_measure
¶ Returns the unweighted macro-averaged F-measure.
Returns: the measure Return type: float
-
unweighted_micro_f_measure
¶ Returns the unweighted micro-averaged F-measure.
Returns: the measure Return type: float
-
weighted_area_under_prc
¶ Returns the weighted area under precision recall curve.
Returns: the weighted area Return type: float
-
weighted_area_under_roc
¶ Returns the weighted area under the receiver operating characteristic (ROC) curve.
Returns: the weighted area Return type: float
-
weighted_f_measure
¶ Returns the weighted f measure.
Returns: the measure Return type: float
-
weighted_false_negative_rate
¶ Returns the weighted false negative rate.
Returns: the rate Return type: float
-
weighted_false_positive_rate
¶ Returns the weighted false positive rate.
Returns: the rate Return type: float
-
weighted_matthews_correlation
¶ Returns the weighted Matthews correlation (nominal classes).
Returns: the correlation Return type: float
-
weighted_precision
¶ Returns the weighted precision.
Returns: the precision Return type: float
-
weighted_recall
¶ Returns the weighted recall.
Returns: the recall Return type: float
-
weighted_true_negative_rate
¶ Returns the weighted true negative rate.
Returns: the rate Return type: float
-
weighted_true_positive_rate
¶ Returns the weighted true positive rate.
Returns: the rate Return type: float
-
-
class
weka.classifiers.
FilteredClassifier
(jobject=None, options=None)¶ Bases:
weka.classifiers.SingleClassifierEnhancer
Wrapper class for the filtered classifier.
-
check_for_modified_class_attribute
(check)¶ Sets whether to check for class attribute modifications.
Parameters: check (bool) – True if checking for modifications
-
-
class
weka.classifiers.
GridSearch
(jobject=None, options=None)¶ Bases:
weka.classifiers.SingleClassifierEnhancer
Wrapper class for the GridSearch meta-classifier.
-
best
¶ Returns the best classifier setup found during the search.
Returns: the best classifier setup Return type: Classifier
-
evaluation
¶ Returns the currently set statistic used for evaluation.
Returns: the statistic Return type: SelectedTag
-
x
¶ Returns a dictionary with all the current values for the X of the grid. Keys for the dictionary: property, min, max, step, base, expression Types: property=str, min=float, max=float, step=float, base=float, expression=str
Returns: the dictionary with the parameters Return type: dict
-
y
¶ Returns a dictionary with all the current values for the Y of the grid. Keys for the dictionary: property, min, max, step, base, expression Types: property=str, min=float, max=float, step=float, base=float, expression=str
Returns: the dictionary with the parameters Return type: dict
-
-
class
weka.classifiers.
Kernel
(classname=None, jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for kernels.
-
build_kernel
(data)¶ Builds the kernel with the data.
Parameters: data (Instances) – the data to build the kernel with
-
capabilities
()¶ Returns the capabilities of the kernel.
Returns: the capabilities Return type: Capabilities
-
checks_turned_off
¶ Returns whether checks are turned off.
Returns: True if checks turned off Return type: bool
-
clean
()¶ Frees the memory used by the kernel.
-
eval
(id1, id2, inst1)¶ Computes the result of the kernel function for two instances. If id1 == -1, eval uses inst1 instead of an instance in the dataset.
Parameters: - id1 (int) – the index of the first instance in the dataset
- id2 (int) – the index of the second instance in the dataset
- inst1 (Instance) – the instance corresponding to id1 (used if id1 == -1)
-
-
class
weka.classifiers.
KernelClassifier
(classname=None, jobject=None, options=None)¶ Bases:
weka.classifiers.Classifier
Wrapper class for classifiers that have a kernel property, like SMO.
-
class
weka.classifiers.
MultiSearch
(jobject=None, options=None)¶ Bases:
weka.classifiers.SingleClassifierEnhancer
Wrapper class for the MultiSearch meta-classifier. NB: ‘multi-search-weka-package’ must be installed (https://github.com/fracpete/multisearch-weka-package), version 2016.1.15 or later.
-
best
¶ Returns the best classifier setup found during the search.
Returns: the best classifier setup Return type: Classifier
-
evaluation
¶ Returns the currently set statistic used for evaluation.
Returns: the statistic Return type: SelectedTag
-
parameters
¶ Returns the list of currently set search parameters.
Returns: the list of AbstractSearchParameter objects Return type: list
-
-
class
weka.classifiers.
MultipleClassifiersCombiner
(classname=None, jobject=None, options=None)¶ Bases:
weka.classifiers.Classifier
Wrapper class for classifiers that use multiple base classifiers.
-
classifiers
¶ Returns the list of base classifiers.
Returns: the classifier list Return type: list
-
-
class
weka.classifiers.
NominalPrediction
(jobject)¶ Bases:
weka.classifiers.Prediction
Wrapper class for a nominal prediction.
-
distribution
¶ Returns the class distribution.
Returns: the class distribution list Return type: ndarray
-
margin
¶ Returns the margin.
Returns: the margin Return type: float
-
-
class
weka.classifiers.
NumericPrediction
(jobject)¶ Bases:
weka.classifiers.Prediction
Wrapper class for a numeric prediction.
-
error
¶ Returns the error.
Returns: the error Return type: float
-
prediction_intervals
¶ Returns the prediction intervals.
Returns: the intervals Return type: ndarray
-
-
class
weka.classifiers.
Prediction
(jobject)¶ Bases:
weka.core.classes.JavaObject
Wrapper class for a prediction.
-
actual
¶ Returns the actual value.
Returns: the actual value (internal representation) Return type: float
-
predicted
¶ Returns the predicted value.
Returns: the predicted value (internal representation) Return type: float
-
weight
¶ Returns the weight.
Returns: the weight of the Instance that was used Return type: float
-
-
class
weka.classifiers.
PredictionOutput
(classname='weka.classifiers.evaluation.output.prediction.PlainText', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Collects predictions and generates output from them. The underlying Java class must be derived from weka.classifiers.evaluation.output.prediction.AbstractOutput.
-
buffer_content
()¶ Returns the content of the buffer as string.
Returns: The buffer content Return type: str
-
print_all
(cls, data)¶ Prints the header, classifications and footer to the buffer.
Parameters: - cls (Classifier) – the classifier
- data (Instances) – the test data
-
print_classification
(cls, inst, index)¶ Prints the classification to the buffer.
Parameters: - cls (Classifier) – the classifier
- inst (Instance) – the test instance
- index (int) – the 0-based index of the test instance
-
print_classifications
(cls, data)¶ Prints the classifications to the buffer.
Parameters: - cls (Classifier) – the classifier
- data (Instances) – the test data
-
print_footer
()¶ Prints the footer to the buffer.
-
print_header
()¶ Prints the header to the buffer.
-
-
class
weka.classifiers.
SingleClassifierEnhancer
(classname=None, jobject=None, options=None)¶ Bases:
weka.classifiers.Classifier
Wrapper class for classifiers that use a single base classifier.
-
classifier
¶ Returns the base classifier.
Returns: the base classifier Return type: Classifier
-
-
weka.classifiers.
main
(args=None)¶ Runs a classifier from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.classifiers.
predictions_to_instances
(data, preds)¶ Turns the predictions into an Instances object.
Parameters: - data (Instances) – the original dataset format
- preds (list) – the predictions to convert
Returns: the predictions, None if no predictions present
Return type: Instances
-
weka.classifiers.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int
weka.clusterers module¶
-
class
weka.clusterers.
ClusterEvaluation
¶ Bases:
weka.core.classes.JavaObject
Evaluation class for clusterers.
-
classes_to_clusters
¶ Return the array (ordered by cluster number) of minimum error class to cluster mappings.
Returns: the mappings Return type: ndarray
-
cluster_assignments
¶ Return an array of cluster assignments corresponding to the most recent set of instances clustered.
Returns: the cluster assignments Return type: ndarray
-
cluster_results
¶ The cluster results as string.
Returns: the results string Return type: str
-
classmethod
crossvalidate_model
(clusterer, data, num_folds, rnd)¶ Cross-validates the clusterer and returns the loglikelihood.
Parameters: Returns: the cross-validated loglikelihood
Return type: float
-
classmethod
evaluate_clusterer
(clusterer, args)¶ Evaluates the clusterer with the given options.
Parameters: - clusterer (Clusterer) – the clusterer instance to evaluate
- args (list) – the command-line arguments
Returns: the evaluation result
Return type: str
-
log_likelihood
¶ Returns the log likelihood.
Returns: the log likelihood Return type: float
-
num_clusters
¶ Returns the number of clusters.
Returns: the number of clusters Return type: int
-
-
class
weka.clusterers.
Clusterer
(classname='weka.clusterers.SimpleKMeans', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for clusterers.
-
build_clusterer
(data)¶ Builds the clusterer with the data.
Parameters: data (Instances) – the data to use for training the clusterer
-
capabilities
¶ Returns the capabilities of the clusterer.
Returns: the capabilities Return type: Capabilities
-
cluster_instance
(inst)¶ Performs a prediction.
Parameters: inst (Instance) – the instance to determine the cluster for Returns: the clustering result Return type: float
-
classmethod
deserialize
(ser_file)¶ Deserializes a clusterer from a file.
Parameters: ser_file (str) – the model file to deserialize Returns: model and, if available, the dataset header Return type: tuple
-
distribution_for_instance
(inst)¶ Performs a prediction, returning the cluster distribution.
Parameters: inst (Instance) – the Instance to get the cluster distribution for Returns: the cluster distribution Return type: float[]
-
graph
¶ Returns the graph if the clusterer implements weka.core.Drawable, otherwise None.
Returns: the graph or None if not available Return type: str
-
graph_type
¶ Returns the graph type if the clusterer implements weka.core.Drawable, otherwise -1.
Returns: the type Return type: int
-
classmethod
make_copy
(clusterer)¶ Creates a copy of the clusterer.
Parameters: clusterer (Clusterer) – the clusterer to copy Returns: the copy of the clusterer Return type: Clusterer
-
number_of_clusters
¶ Returns the number of clusters found.
Returns: the number of clusters Return type: int
-
serialize
(ser_file, header=None)¶ Serializes the clusterer to the specified file.
Parameters: - ser_file (str) – the file to save the model to
- header (Instances) – the (optional) dataset header to store alongside; recommended
-
update_clusterer
(inst)¶ Updates the clusterer with the instance.
Parameters: inst (Instance) – the Instance to update the clusterer with
-
update_finished
()¶ Signals the clusterer that updating with new data has finished.
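A minimal sketch of how the Clusterer methods above fit together, assuming python-weka-wrapper and a Java runtime are installed. The helper name cluster_arff, the ARFF loader class, and the SimpleKMeans option -N are choices made for this illustration and are not taken from this page.

```python
def cluster_arff(path, num_clusters=3):
    """Build SimpleKMeans on an ARFF file and return one cluster index per row."""
    # Imported lazily: these need python-weka-wrapper and a Java runtime.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.clusterers import Clusterer

    jvm.start()
    try:
        # Assumption: the file is in ARFF format.
        loader = Loader(classname="weka.core.converters.ArffLoader")
        data = loader.load_file(path)

        # Assumption: -N sets the number of clusters for SimpleKMeans.
        clusterer = Clusterer(
            classname="weka.clusterers.SimpleKMeans",
            options=["-N", str(num_clusters)],
        )
        clusterer.build_clusterer(data)

        # cluster_instance is documented as returning a float; cast to int.
        return [int(clusterer.cluster_instance(inst)) for inst in data]
    finally:
        jvm.stop()
```

Because the helper starts and stops the JVM itself, it is suited to one-off scripts; a longer program would more likely start the JVM once and keep the Clusterer around.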
-
-
class
weka.clusterers.
FilteredClusterer
(jobject=None, options=None)¶ Bases:
weka.clusterers.SingleClustererEnhancer
Wrapper class for the filtered clusterer.
-
class
weka.clusterers.
SingleClustererEnhancer
(classname=None, jobject=None, options=None)¶ Bases:
weka.clusterers.Clusterer
Wrapper class for clusterers that use a single base clusterer.
-
weka.clusterers.
main
(args=None)¶ Runs a clusterer from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.clusterers.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int
weka.datagenerators module¶
-
class
weka.datagenerators.
DataGenerator
(classname='weka.datagenerators.classifiers.classification.Agrawal', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for datagenerators.
-
generate_finish
()¶ Returns a “finish” string.
Returns: a finish comment Return type: str
-
generate_start
()¶ Returns a “start” string.
Returns: the start comment Return type: str
-
classmethod
make_copy
(generator)¶ Creates a copy of the generator.
Parameters: generator (DataGenerator) – the generator to copy Returns: the copy of the generator Return type: DataGenerator
-
classmethod
make_data
(generator, args)¶ Generates data using the generator and commandline arguments.
Parameters: - generator (DataGenerator) – the generator instance to use
- args (list) – the command-line arguments
-
num_examples_act
¶ Returns the actual number of examples to generate.
Returns: the number of examples Return type: int
-
single_mode_flag
¶ Returns whether data is generated row by row (True) or in one go (False).
Returns: whether incremental Return type: bool
-
-
weka.datagenerators.
main
(args=None)¶ Runs a datagenerator from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.datagenerators.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int
weka.experiments module¶
-
class
weka.experiments.
Experiment
(classname='weka.experiment.Experiment', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for an experiment.
-
class
weka.experiments.
ResultMatrix
(classname='weka.experiment.ResultMatrixPlainText', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
For generating results from an Experiment run.
-
average
(col)¶ Returns the average mean at this location (if valid location).
Parameters: col (int) – the 0-based column index Returns: the mean Return type: float
-
columns
¶ Returns the column count.
Returns: the count Return type: int
-
get_col_name
(index)¶ Returns the column name.
Parameters: index (int) – the 0-based column index Returns: the column name, None if invalid index Return type: str
-
get_mean
(col, row)¶ Returns the mean at this location (if valid location).
Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
Returns: the mean
Return type: float
-
get_row_name
(index)¶ Returns the row name.
Parameters: index (int) – the 0-based row index Returns: the row name, None if invalid index Return type: str
-
get_stdev
(col, row)¶ Returns the standard deviation at this location (if valid location).
Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
Returns: the standard deviation
Return type: float
-
hide_col
(index)¶ Hides the column.
Parameters: index (int) – the 0-based column index
-
hide_row
(index)¶ Hides the row.
Parameters: index (int) – the 0-based row index
Returns whether the column is hidden.
Parameters: index (int) – the 0-based column index Returns: true if hidden Return type: bool
Returns whether the row is hidden.
Parameters: index (int) – the 0-based row index Returns: true if hidden Return type: bool
-
rows
¶ Returns the row count.
Returns: the count Return type: int
-
set_col_name
(index, name)¶ Sets the column name.
Parameters: - index (int) – the 0-based column index
- name (str) – the name of the column
-
set_mean
(col, row, mean)¶ Sets the mean at this location (if valid location).
Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
- mean (float) – the mean to set
-
set_row_name
(index, name)¶ Sets the row name.
Parameters: - index (int) – the 0-based row index
- name (str) – the name of the row
-
set_stdev
(col, row, stdev)¶ Sets the standard deviation at this location (if valid location).
Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
- stdev (float) – the standard deviation to set
-
show_col
(index)¶ Shows the column.
Parameters: index (int) – the 0-based column index
-
show_row
(index)¶ Shows the row.
Parameters: index (int) – the 0-based row index
-
to_string_header
()¶ Returns the header of the matrix as a string.
Returns: the header Return type: str
-
to_string_key
()¶ Returns a key for all the col names, for better readability if the names got cut off.
Returns: the key Return type: str
-
to_string_matrix
()¶ Returns the matrix as a string.
Returns: the generated output Return type: str
-
to_string_ranking
()¶ Returns the ranking in a string representation.
Returns: the ranking Return type: str
-
to_string_summary
()¶ Returns the summary as a string.
Returns: the summary Return type: str
-
-
class
weka.experiments.
SimpleCrossValidationExperiment
(datasets, classifiers, classification=True, runs=10, folds=10, result=None)¶ Bases:
weka.experiments.SimpleExperiment
Performs a simple cross-validation experiment. Can output the results either in ARFF or CSV.
-
configure_resultproducer
()¶ Configures and returns the ResultProducer and PropertyPath as tuple.
Returns: producer and property path Return type: tuple
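A minimal sketch of driving SimpleCrossValidationExperiment with the constructor arguments listed above, assuming python-weka-wrapper and a Java runtime are available. The helper name run_cv_experiment and the choice of ZeroR and J48 as classifiers are illustrative assumptions.

```python
def run_cv_experiment(datasets, result_file, runs=10, folds=10):
    """Cross-validate two classifiers on the given dataset files.

    datasets is a list of dataset filenames; results are written to
    result_file (ARFF or CSV, per the class description).
    """
    # Imported lazily: these need python-weka-wrapper and a Java runtime.
    import weka.core.jvm as jvm
    from weka.classifiers import Classifier
    from weka.experiments import SimpleCrossValidationExperiment

    jvm.start()
    try:
        # Assumption: ZeroR and J48 are picked only as example classifiers.
        classifiers = [
            Classifier(classname="weka.classifiers.rules.ZeroR"),
            Classifier(classname="weka.classifiers.trees.J48"),
        ]
        exp = SimpleCrossValidationExperiment(
            datasets=datasets,
            classifiers=classifiers,
            classification=True,
            runs=runs,
            folds=folds,
            result=result_file,
        )
        exp.setup()  # initializes the experiment
        exp.run()    # executes it and writes the result file
    finally:
        jvm.stop()
```

The result file produced this way is what a Tester and ResultMatrix would later be pointed at for statistical comparison.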
-
-
class
weka.experiments.
SimpleExperiment
(datasets, classifiers, jobject=None, classification=True, runs=10, result=None)¶ Bases:
weka.core.classes.OptionHandler
Ancestor for simple experiments.
See the following URL for how to use the Experiment API: http://weka.wikispaces.com/Using+the+Experiment+API
-
configure_resultproducer
()¶ Configures and returns the ResultProducer and PropertyPath as tuple.
Returns: producer and property path Return type: tuple
-
configure_splitevaluator
()¶ Configures and returns the SplitEvaluator and Classifier instance as tuple.
Returns: evaluator and classifier Return type: tuple
-
experiment
()¶ Returns the internal experiment, if set up, otherwise None.
Returns: the internal experiment Return type: Experiment
-
classmethod
load
(filename)¶ Loads the experiment from disk.
Parameters: filename (str) – the filename of the experiment to load Returns: the experiment Return type: Experiment
-
run
()¶ Executes the experiment.
-
classmethod
save
(filename, experiment)¶ Saves the experiment to disk.
Parameters: - filename (str) – the filename to save the experiment to
- experiment (Experiment) – the Experiment to save
-
setup
()¶ Initializes the experiment.
-
-
class
weka.experiments.
SimpleRandomSplitExperiment
(datasets, classifiers, classification=True, runs=10, percentage=66.6, preserve_order=False, result=None)¶ Bases:
weka.experiments.SimpleExperiment
Performs a simple random split experiment. Can output the results either in ARFF or CSV.
-
configure_resultproducer
()¶ Configures and returns the ResultProducer and PropertyPath as tuple.
Returns: producer and property path Return type: tuple
-
-
class
weka.experiments.
Tester
(classname='weka.experiment.PairedCorrectedTTester', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
For generating statistical results from an experiment.
-
dataset_columns
¶ Returns the list of column names that uniquely identify a dataset.
Returns: the list of attribute names Return type: list
-
fold_column
¶ Returns the column name that holds the Fold number.
Returns: the attribute name Return type: str
-
header
(comparison_column)¶ Creates a “header” string describing the current resultsets.
Parameters: comparison_column (int) – the index of the column to compare against Returns: the header Return type: str
-
init_columns
()¶ Sets the column indices based on the supplied names if necessary.
-
multi_resultset_full
(base_resultset, comparison_column)¶ Creates a comparison table where a base resultset is compared to the other resultsets.
Parameters: - base_resultset (int) – the 0-based index of the base resultset (e.g., the classifier to compare against)
- comparison_column (int) – the 0-based index of the column to compare against
Returns: the comparison
Return type: str
-
multi_resultset_ranking
(comparison_column)¶ Creates a ranking.
Parameters: comparison_column (int) – the 0-based index of the column to compare against Returns: the ranking Return type: str
-
multi_resultset_summary
(comparison_column)¶ Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other.
Parameters: comparison_column (int) – the 0-based index of the column to compare against Returns: the summary Return type: str
-
result_columns
¶ Returns the list of column names that uniquely identify a result (e.g., classifier + options + ID).
Returns: the list of attribute names Return type: list
-
resultmatrix
¶ Returns the ResultMatrix instance in use.
Returns: the matrix in use Return type: ResultMatrix
-
run_column
¶ Returns the column name that holds the Run number.
Returns: the attribute name Return type: str
-
weka.filters module¶
-
class
weka.filters.
Filter
(classname='weka.filters.AllFilter', jobject=None, options=None)¶ Bases:
weka.core.classes.OptionHandler
Wrapper class for filters.
-
batch_finished
()¶ Signals the filter that the batch of data has finished.
Returns: True if instances can be collected from the output Return type: bool
-
capabilities
()¶ Returns the capabilities of the filter.
Returns: the capabilities Return type: Capabilities
-
classmethod
deserialize
(ser_file)¶ Deserializes a filter from a file.
Parameters: ser_file (str) – the file to deserialize from Returns: model Return type: Filter
-
filter
(data)¶ Filters the dataset(s). When providing a list, this can be used to create compatible train/test sets, since the filter only gets initialized with the first dataset and all subsequent datasets get transformed using the same setup.
NB: inputformat(Instances) must have been called beforehand.
Parameters: data (Instances or list of Instances) – the Instances to filter Returns: the filtered Instances object(s) Return type: Instances or list of Instances
-
input
(inst)¶ Inputs the Instance.
Parameters: inst (Instance) – the instance to filter Returns: True if the filtered instance can be collected from the output Return type: bool
-
classmethod
make_copy
(flter)¶ Creates a copy of the filter.
Parameters: flter (Filter) – the filter to copy Returns: the copy of the filter Return type: Filter
-
output
()¶ Outputs the filtered Instance.
Returns: the filtered instance Return type: Instance
-
serialize
(ser_file)¶ Serializes the filter to the specified file.
Parameters: ser_file (str) – the file to save the filter to
-
to_source
(classname, data)¶ Returns the model as Java source code if the filter implements weka.filters.Sourcable.
Parameters: - classname (str) – the classname for the generated Java code
- data (Instances) – the dataset used for initializing the filter
Returns: the model as source code string
Return type: str
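The list form of filter described above can be used to produce compatible train/test sets. The sketch below assumes the JVM has already been started and that train and test are Instances objects with the same structure. The helper name standardize_train_test and the choice of the Standardize filter are illustrative assumptions.

```python
def standardize_train_test(train, test):
    """Standardize both datasets using statistics taken from train only."""
    # Imported lazily: needs python-weka-wrapper and a running JVM.
    from weka.filters import Filter

    # Assumption: Standardize is used purely as an example filter.
    flter = Filter(classname="weka.filters.unsupervised.attribute.Standardize")

    # The input format must be set before filtering (see the NB above).
    flter.inputformat(train)

    # The filter is initialized on the first dataset; the second one is
    # transformed with that same setup.
    train_filtered, test_filtered = flter.filter([train, test])
    return train_filtered, test_filtered
```

Filtering the two sets in separate calls with separately initialized filters could yield differently scaled data, which is the situation the list form is meant to avoid.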
-
-
class
weka.filters.
MultiFilter
(jobject=None, options=None)¶ Bases:
weka.filters.Filter
Wrapper class for weka.filters.MultiFilter.
-
filters
¶ Returns the list of base filters.
Returns: the filter list Return type: list
-
-
class
weka.filters.
StringToWordVector
(jobject=None, options=None)¶ Bases:
weka.filters.Filter
Wrapper class for weka.filters.unsupervised.attribute.StringToWordVector.
-
weka.filters.
main
(args=None)¶ Runs a filter from the command-line. Calls JVM start/stop automatically. Use -h to see all options.
Parameters: args (list) – the command-line arguments to use, uses sys.argv if None
-
weka.filters.
sys_main
()¶ Runs the main function using the system cli arguments, and returns a system error code.
Returns: 0 for success, 1 for failure. Return type: int