weka package¶
Subpackages¶
- weka.core package- Submodules
- weka.core.capabilities module
- weka.core.classes module
- weka.core.converters module
- weka.core.database module
- weka.core.dataset module
- weka.core.jvm module
- weka.core.packages module
- weka.core.serialization module
- weka.core.stemmers module
- weka.core.stopwords module
- weka.core.tokenizers module
- weka.core.types module
- weka.core.version module
- Module contents
 
- weka.flow package
- weka.plot package
Submodules¶
weka.associations module¶
- 
class weka.associations.AssociationRule(jobject)¶
- Bases: - weka.core.classes.JavaObject- Wrapper for weka.associations.AssociationRule class. - 
consequence¶
- Get the consequence. - Returns: - the consequence, list of Item objects - Return type: - list 
 - 
consequence_support¶
- Get the support for the consequence. - Returns: - the support - Return type: - int 
 - 
metric_names¶
- Returns the metric names for the rule. - Returns: - the metric names - Return type: - list 
 - 
metric_value(name)¶
- Returns the named metric value for the rule. - Parameters: - name (str) – the name of the metric - Returns: - the metric value - Return type: - float 
 - 
metric_values¶
- Returns the metric values for the rule. - Returns: - the metric values - Return type: - ndarray 
 - 
premise¶
- Get the premise. - Returns: - the premise, list of Item objects - Return type: - list 
 - 
premise_support¶
- Get the support for the premise. - Returns: - the support - Return type: - int 
 - 
primary_metric_name¶
- Returns the primary metric name for the rule. - Returns: - the metric name - Return type: - str 
 - 
primary_metric_value¶
- Returns the primary metric value for the rule. - Returns: - the metric value - Return type: - float 
 - 
total_support¶
- Get the total support. - Returns: - the support - Return type: - int 
 - 
total_transactions¶
- Get the total transactions. - Returns: - the transactions - Return type: - int 
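The four counts exposed by AssociationRule (premise_support, consequence_support, total_support, total_transactions) are enough to derive the usual textbook rule metrics. The following is a minimal sketch in plain Python, not the wrapper's own code; the function name rule_metrics and the sample counts are made up for illustration, and the metrics an associator actually reports through metric_names may differ.

```python
def rule_metrics(premise_support, consequence_support,
                 total_support, total_transactions):
    """Derive textbook rule metrics from the four support counts."""
    # Fraction of all transactions that contain premise and consequence.
    support = total_support / total_transactions
    # Fraction of premise transactions that also contain the consequence.
    confidence = total_support / premise_support
    # Confidence relative to the baseline frequency of the consequence.
    lift = confidence / (consequence_support / total_transactions)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = rule_metrics(premise_support=40, consequence_support=50,
                       total_support=30, total_transactions=100)
print(metrics)
```

A lift above 1 suggests the premise makes the consequence more likely than its baseline frequency.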
 
- 
- 
class weka.associations.AssociationRules(jobject)¶
- Bases: - weka.core.classes.JavaObject- Wrapper for weka.associations.AssociationRules class. - 
producer¶
- Returns a string describing the producer that generated these rules. - Returns: - the producer - Return type: - str 
 
- 
- 
class weka.associations.AssociationRulesIterator(rules)¶
- Bases: - object- Iterator for weka.associations.AssociationRules class. - 
next()¶
- Returns the next rule. - Returns: - the next rule object - Return type: - AssociationRule 
 
- 
- 
class weka.associations.Associator(classname=None, jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for associators. - 
association_rules()¶
- Returns the association rules that were generated. Only available if the scheme implements AssociationRulesProducer. - Returns: - the association rules that were generated - Return type: - AssociationRules 
 - 
build_associations(data)¶
- Builds the associator with the data. - Parameters: - data (Instances) – the data to train the associator with 
 - 
can_produce_rules()¶
- Checks whether association rules can be generated. - Returns: - whether the scheme implements the AssociationRulesProducer interface and association rules can be generated - Return type: - bool 
 - 
capabilities¶
- Returns the capabilities of the associator. - Returns: - the capabilities - Return type: - Capabilities 
 - 
classmethod make_copy(associator)¶
- Creates a copy of the associator. - Parameters: - associator (Associator) – the associator to copy - Returns: - the copy of the associator - Return type: - Associator 
 - 
rule_metric_names¶
- Returns the rule metric names of the association rules. Only available if the scheme implements AssociationRulesProducer. - Returns: - the metric names - Return type: - list 
 
- 
- 
class weka.associations.Item(jobject)¶
- Bases: - weka.core.classes.JavaObject- Wrapper for weka.associations.Item class. - 
comparison¶
- Returns the comparison operator as string. - Returns: - the comparison operator - Return type: - str 
 - 
decrease_frequency(frequency=None)¶
- Decreases the frequency. - Parameters: - frequency (int) – the frequency to decrease by, 1 if None 
 - 
frequency¶
- Returns the frequency. - Returns: - the frequency - Return type: - int 
 - 
increase_frequency(frequency=None)¶
- Increases the frequency. - Parameters: - frequency (int) – the frequency to increase by, 1 if None 
 - 
item_value¶
- Returns the item value as string. - Returns: - the item value - Return type: - str 
 
- 
- 
weka.associations.main(args=None)¶
- Runs an associator from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.associations.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
weka.attribute_selection module¶
- 
class weka.attribute_selection.ASEvaluation(classname='weka.attributeSelection.CfsSubsetEval', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for attribute selection evaluation algorithm. - 
build_evaluator(data)¶
- Builds the evaluator with the data. - Parameters: - data (Instances) – the data to use 
 - 
capabilities¶
- Returns the capabilities of the evaluator. - Returns: - the capabilities - Return type: - Capabilities 
 - 
post_process(indices)¶
- Post-processes the evaluator with the selected attribute indices. - Parameters: - indices (ndarray) – the attribute indices list to use - Returns: - the processed indices - Return type: - ndarray 
 
- 
- 
class weka.attribute_selection.ASSearch(classname='weka.attributeSelection.BestFirst', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for attribute selection search algorithm. - 
search(evaluation, data)¶
- Performs the search and returns the indices of the selected attributes. - Parameters: - evaluation (ASEvaluation) – the evaluation algorithm to use
- data (Instances) – the data to use
 - Returns: - the selected attributes (0-based indices) - Return type: - ndarray 
 
- 
- 
class weka.attribute_selection.AttributeSelection¶
- Bases: - weka.core.classes.JavaObject- Performs attribute selection using search and evaluation algorithms. - 
classmethod attribute_selection(evaluator, args)¶
- Performs attribute selection using the given attribute evaluator and options. - Parameters: - evaluator (ASEvaluation) – the evaluator to use
- args (list) – the command-line args for the attribute selection
 - Returns: - the results string - Return type: - str 
 - 
crossvalidation(crossvalidation)¶
- Sets whether to perform cross-validation. - Parameters: - crossvalidation (bool) – whether to perform cross-validation 
 - 
cv_results¶
- Generates a results string from the last cross-validation attribute selection. - Returns: - the results string - Return type: - str 
 - 
evaluator(evaluator)¶
- Sets the evaluator to use. - Parameters: - evaluator (ASEvaluation) – the evaluator to use. 
 - 
folds(folds)¶
- Sets the number of folds to use for cross-validation. - Parameters: - folds (int) – the number of folds 
 - 
number_attributes_selected¶
- Returns the number of attributes that were selected. - Returns: - the number of attributes - Return type: - int 
 - 
ranked_attributes¶
- Returns the matrix of ranked attributes from the last run. - Returns: - the Numpy matrix - Return type: - ndarray 
 - 
ranking(ranking)¶
- Sets whether to perform a ranking, if possible. - Parameters: - ranking (bool) – whether to perform a ranking 
 - 
reduce_dimensionality(data)¶
- Reduces the dimensionality of the provided Instance or Instances object. - Parameters: - data (Instances) – the data to process - Returns: - the reduced dataset - Return type: - Instances 
 - 
results_string¶
- Generates a results string from the last attribute selection. - Returns: - the results string - Return type: - str 
 - 
search(search)¶
- Sets the search algorithm to use. - Parameters: - search (ASSearch) – the search algorithm 
 - 
seed(seed)¶
- Sets the seed for cross-validation. - Parameters: - seed (int) – the seed value 
 - 
select_attributes(instances)¶
- Performs attribute selection on the given dataset. - Parameters: - instances (Instances) – the data to process 
 - 
select_attributes_cv_split(instances)¶
- Performs attribute selection on the given cross-validation split. - Parameters: - instances (Instances) – the data to process 
 - 
selected_attributes¶
- Returns the selected attributes from the last run. - Returns: - the Numpy array of 0-based indices - Return type: - ndarray 
 
- 
- 
weka.attribute_selection.main(args=None)¶
- Runs attribute selection from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.attribute_selection.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
weka.classifiers module¶
- 
class weka.classifiers.Classifier(classname='weka.classifiers.rules.ZeroR', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for classifiers. - 
batch_size¶
- Returns the batch size, in case this classifier is a batch predictor. - Returns: - the batch size, None if not a batch predictor - Return type: - str 
 - 
build_classifier(data)¶
- Builds the classifier with the data. - Parameters: - data (Instances) – the data to train the classifier with 
 - 
capabilities¶
- Returns the capabilities of the classifier. - Returns: - the capabilities - Return type: - Capabilities 
 - 
classify_instance(inst)¶
- Performs a prediction. - Parameters: - inst (Instance) – the Instance to get a prediction for - Returns: - the classification (either regression value or 0-based label index) - Return type: - float 
 - 
classmethod deserialize(ser_file)¶
- Deserializes a classifier from a file. - Parameters: - ser_file (str) – the model file to deserialize - Returns: - model and, if available, the dataset header - Return type: - tuple 
 - 
distribution_for_instance(inst)¶
- Performs a prediction, returning the class distribution. - Parameters: - inst (Instance) – the Instance to get the class distribution for - Returns: - the class distribution array - Return type: - ndarray 
 - 
distributions_for_instances(data)¶
- Performs predictions, returning the class distributions. - Parameters: - data (Instances) – the Instances to get the class distributions for - Returns: - the class distribution matrix, None if not a batch predictor - Return type: - ndarray 
 - 
graph¶
- Returns the graph if classifier implements weka.core.Drawable, otherwise None. - Returns: - the generated graph string - Return type: - str 
 - 
graph_type¶
- Returns the graph type if classifier implements weka.core.Drawable, otherwise -1. - Returns: - the type - Return type: - int 
 - 
has_efficient_batch_prediction()¶
- Returns whether the classifier implements a more efficient batch prediction. - Returns: - True if a more efficient batch prediction is implemented, always False if not batch predictor - Return type: - bool 
 - 
classmethod make_copy(classifier)¶
- Creates a copy of the classifier. - Parameters: - classifier (Classifier) – the classifier to copy - Returns: - the copy of the classifier - Return type: - Classifier 
 - 
serialize(ser_file, header=None)¶
- Serializes the classifier to the specified file. - Parameters: - ser_file (str) – the file to save the model to
- header (Instances) – the (optional) dataset header to store alongside; recommended
 
 - 
to_source(classname)¶
- Returns the model as Java source code if the classifier implements weka.classifiers.Sourcable. - Parameters: - classname (str) – the classname for the generated Java code - Returns: - the model as source code string - Return type: - str 
 
- 
- 
class weka.classifiers.CostMatrix(matrx=None, num_classes=None)¶
- Bases: - weka.core.classes.JavaObject- Class for storing and manipulating a misclassification cost matrix. The element at position i,j in the matrix is the penalty for classifying an instance of class j as class i. Cost values can be fixed or computed on a per-instance basis (cost sensitive evaluation only) from the value of an attribute or an expression involving attribute(s). - 
apply_cost_matrix(data, rnd)¶
- Applies the cost matrix to the data. - Parameters: 
 - 
expected_costs(class_probs, inst=None)¶
- Calculates the expected misclassification cost for each possible class value, given class probability estimates. - Parameters: - class_probs (ndarray) – the class probabilities - Returns: - the calculated costs - Return type: - ndarray 
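The expected-cost calculation can be sketched in plain Python. This is an illustration only, not the wrapper's implementation: it assumes the indexing convention stated in the class description (the element at position i,j is the penalty for classifying an instance of class j as class i), and the function name and sample numbers are hypothetical.

```python
def expected_costs(cost, class_probs):
    """Expected cost of predicting each class, given class probabilities.

    Assumes cost[i][j] is the penalty for classifying an instance of
    class j as class i, as stated in the CostMatrix description.
    """
    n = len(cost)
    if any(len(row) != n for row in cost) or len(class_probs) != n:
        raise ValueError("cost matrix must be square and match class_probs")
    # For each candidate prediction i, weight the penalties by the
    # probability that the instance really belongs to class j.
    return [sum(cost[i][j] * class_probs[j] for j in range(n))
            for i in range(n)]


cost = [[0.0, 5.0],   # predicting class 0 for a class-1 instance costs 5
        [1.0, 0.0]]   # predicting class 1 for a class-0 instance costs 1
probs = [0.8, 0.2]
costs = expected_costs(cost, probs)
# The cost-sensitive decision is the class with the smallest expected cost.
best = min(range(len(costs)), key=costs.__getitem__)
print(costs, best)
```

In this example class 0 is the more probable label, yet class 1 has the lower expected cost because mistaking a class-1 instance for class 0 is penalised heavily.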
 - 
get_cell(row, col)¶
- Returns the JB_Object at the specified location. - Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
 - Returns: - the object in that cell - Return type: - JB_Object 
 - 
get_element(row, col, inst=None)¶
- Returns the value at the specified location. - Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- inst (Instance) – the Instance
 - Returns: - the value in that cell - Return type: - float 
 - 
get_max_cost(class_value, inst=None)¶
- Gets the maximum cost for a particular class value. - Parameters: - class_value (int) – the class value to get the maximum cost for
- inst (Instance) – the Instance
 - Returns: - the cost - Return type: - float 
 - 
initialize()¶
- Initializes the matrix. 
 - 
normalize()¶
- Normalizes the matrix. 
 - 
num_columns¶
- Returns the number of columns. - Returns: - the number of columns - Return type: - int 
 - 
num_rows¶
- Returns the number of rows. - Returns: - the number of rows - Return type: - int 
 - 
classmethod parse_matlab(matlab)¶
- Parses the cost matrix definition in Matlab format and returns a matrix. - Parameters: - matlab (str) – the Matlab matrix string, e.g., [1 2; 3 4]. - Returns: - the generated matrix - Return type: - CostMatrix 
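To make the Matlab string format concrete, here is a small stand-alone sketch that converts such a string to nested Python lists and back. It does not call Weka; the two helper functions are hypothetical and only mirror the names of parse_matlab and to_matlab for readability. The exact whitespace and number formatting produced by the real to_matlab may differ.

```python
def parse_matlab(text):
    """Parse a string like '[1 2; 3 4]' into a list of rows of floats."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a string like '[1 2; 3 4]'")
    # Rows are separated by ';', cells by whitespace (commas tolerated).
    rows = [[float(tok) for tok in row.replace(",", " ").split()]
            for row in body[1:-1].split(";") if row.strip()]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("rows have differing lengths")
    return rows


def to_matlab(rows):
    """Format a list of rows as a Matlab-style matrix string."""
    return "[" + "; ".join(" ".join(repr(v) for v in row) for row in rows) + "]"


matrix = parse_matlab("[1 2; 3 4]")
print(matrix)
print(to_matlab(matrix))
```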
 - 
set_cell(row, col, obj)¶
- Sets the JB_Object at the specified location. Automatically unwraps JavaObject. - Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- obj (object) – the object for that cell
 
 - 
set_element(row, col, value)¶
- Sets the float value at the specified location. - Parameters: - row (int) – the 0-based index of the row
- col (int) – the 0-based index of the column
- value (float) – the float value for that cell
 
 - 
size¶
- Returns the number of rows/columns. - Returns: - the number of rows/columns - Return type: - int 
 - 
to_matlab()¶
- Returns the matrix in Matlab format. - Returns: - the matrix as Matlab formatted string - Return type: - str 
 
- 
- 
class weka.classifiers.Evaluation(data, cost_matrix=None)¶
- Bases: - weka.core.classes.JavaObject- Evaluation class for classifiers. - 
area_under_prc(class_index)¶
- Returns the area under the precision-recall curve. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the area - Return type: - float 
 - 
area_under_roc(class_index)¶
- Returns the area under the receiver operating characteristic (ROC) curve. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the area - Return type: - float 
 - 
avg_cost¶
- Returns the average cost. - Returns: - the cost - Return type: - float 
 - 
class_details(title=None)¶
- Generates the class details. - Parameters: - title (str) – optional title - Returns: - the details - Return type: - str 
 - 
class_priors¶
- Returns the class priors. - Returns: - the priors - Return type: - ndarray 
 - 
confusion_matrix¶
- Returns the confusion matrix. - Returns: - the matrix - Return type: - ndarray 
 - 
correct¶
- Returns the correct count (nominal classes). - Returns: - the count - Return type: - float 
 - 
correlation_coefficient¶
- Returns the correlation coefficient (numeric classes). - Returns: - the coefficient - Return type: - float 
 - 
coverage_of_test_cases_by_predicted_regions¶
- Returns the coverage of the test cases by the predicted regions at the confidence level specified when evaluation was performed. - Returns: - the coverage - Return type: - float 
 - 
crossvalidate_model(classifier, data, num_folds, rnd, output=None)¶
- Crossvalidates the model using the specified data, number of folds and random number generator wrapper. - Parameters: - classifier (Classifier) – the classifier to cross-validate
- data (Instances) – the data to evaluate on
- num_folds (int) – the number of folds
- rnd (Random) – the random number generator to use
- output (PredictionOutput) – the output generator to use
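The idea behind cross-validation with num_folds and a seeded random number generator can be sketched without Weka. The generator below is a simplified, unstratified illustration (Weka itself may stratify folds for nominal classes); the function name fold_indices is hypothetical and it uses Python's random.Random in place of the wrapper's Random object.

```python
import random


def fold_indices(num_instances, num_folds, seed):
    """Yield (train, test) index lists for a plain k-fold split."""
    indices = list(range(num_instances))
    # Seeding makes the shuffle, and therefore the folds, reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into num_folds groups.
    folds = [indices[i::num_folds] for i in range(num_folds)]
    for test in folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


splits = list(fold_indices(num_instances=10, num_folds=3, seed=1))
for train, test in splits:
    print(len(train), len(test))
```

Every instance lands in exactly one test fold, so each instance is predicted once by a model that never saw it during training.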
 
 - 
cumulative_margin_distribution()¶
- Output the cumulative margin distribution as a string suitable for input for gnuplot or similar package. - Returns: - the cumulative margin distribution - Return type: - str 
 - 
discard_predictions¶
- Returns whether to discard predictions (saves memory). - Returns: - True if predictions are discarded - Return type: - bool 
 - 
error_rate¶
- Returns the error rate (numeric classes). - Returns: - the rate - Return type: - float 
 - 
classmethod evaluate_model(classifier, args)¶
- Evaluates the classifier with the given options. - Parameters: - classifier (Classifier) – the classifier instance to use
- args (list) – the command-line arguments to use
 - Returns: - the evaluation string - Return type: - str 
 - 
evaluate_train_test_split(classifier, data, percentage, rnd=None, output=None)¶
- Splits the data into train and test, builds the classifier with the training data and evaluates it against the test set. - Parameters: - classifier (Classifier) – the classifier to train and evaluate
- data (Instances) – the data to evaluate on
- percentage (double) – the percentage split to use (amount to use for training)
- rnd (Random) – the random number generator to use, if None the order gets preserved
- output (PredictionOutput) – the output generator to use
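The percentage split described above can be illustrated with a few lines of plain Python. This is a sketch under assumptions rather than the wrapper's code: the helper name is hypothetical, the data is a plain list, random.Random stands in for the wrapper's Random object, and the rounding of the cut point may not match Weka exactly in tie cases.

```python
import random


def train_test_split(rows, percentage, rnd=None):
    """Split rows so that `percentage` percent is used for training."""
    rows = list(rows)
    if rnd is not None:
        # Without a generator the original order is preserved.
        rnd.shuffle(rows)
    cut = int(round(len(rows) * percentage / 100.0))
    return rows[:cut], rows[cut:]


# Order-preserving split: the first 66% of the rows become training data.
train, test = train_test_split(range(10), 66.0)
print(train, test)

# Randomised split with a seeded generator for reproducibility.
shuffled_train, shuffled_test = train_test_split(range(10), 66.0,
                                                 rnd=random.Random(42))
```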
 
 - 
f_measure(class_index)¶
- Returns the F-measure. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the measure - Return type: - float 
 - 
false_negative_rate(class_index)¶
- Returns the false negative rate. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the rate - Return type: - float 
 - 
false_positive_rate(class_index)¶
- Returns the false positive rate. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the rate - Return type: - float 
 - 
incorrect¶
- Returns the incorrect count (nominal classes). - Returns: - the count - Return type: - float 
 - 
kappa¶
- Returns kappa. - Returns: - kappa - Return type: - float 
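For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa from a confusion matrix held as nested lists. It is a stand-alone illustration of the standard formula, not the wrapper's code, and the sample matrix is invented.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (list of rows)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(sum(matrix[i]) * sum(row[i] for row in matrix)
                   for i in range(n)) / total ** 2
    # Undefined when expected == 1 (all instances in a single class).
    return (observed - expected) / (1.0 - expected)


confusion = [[40, 10],
             [5, 45]]
kappa = cohen_kappa(confusion)
print(kappa)
```

A value of 0 means agreement no better than chance, and 1 means perfect agreement.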
 - 
kb_information¶
- Returns KB information. - Returns: - the information - Return type: - float 
 - 
kb_mean_information¶
- Returns KB mean information. - Returns: - the information - Return type: - float 
 - 
kb_relative_information¶
- Returns KB relative information. - Returns: - the information - Return type: - float 
 - 
matrix(title=None)¶
- Generates the confusion matrix. - Parameters: - title (str) – optional title - Returns: - the matrix - Return type: - str 
 - 
matthews_correlation_coefficient(class_index)¶
- Returns the Matthews correlation coefficient (nominal classes). - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the coefficient - Return type: - float 
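The Matthews correlation coefficient for one class can be derived from the four counts that num_true_positives, num_true_negatives, num_false_positives and num_false_negatives return. The function below is a plain-Python sketch of the standard formula with invented counts; returning 0.0 for a zero denominator is a choice made here, not necessarily what Weka does.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four outcome counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention chosen for this sketch: 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
print(round(mcc, 4))
```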
 - 
mean_absolute_error¶
- Returns the mean absolute error. - Returns: - the error - Return type: - float 
 - 
mean_prior_absolute_error¶
- Returns the mean prior absolute error. - Returns: - the error - Return type: - float 
 - 
num_false_negatives(class_index)¶
- Returns the number of false negatives. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the count - Return type: - float 
 - 
num_false_positives(class_index)¶
- Returns the number of false positives. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the count - Return type: - float 
 - 
num_instances¶
- Returns the number of instances that had a known class value. - Returns: - the number of instances - Return type: - float 
 - 
num_true_negatives(class_index)¶
- Returns the number of true negatives. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the count - Return type: - float 
 - 
num_true_positives(class_index)¶
- Returns the number of true positives. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the count - Return type: - float 
 - 
percent_correct¶
- Returns the percent correct (nominal classes). - Returns: - the percentage - Return type: - float 
 - 
percent_incorrect¶
- Returns the percent incorrect (nominal classes). - Returns: - the percentage - Return type: - float 
 - 
percent_unclassified¶
- Returns the percent unclassified. - Returns: - the percentage - Return type: - float 
 - 
precision(class_index)¶
- Returns the precision. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the precision - Return type: - float 
 - 
predictions¶
- Returns the predictions. - Returns: - the predictions. None if not available - Return type: - list 
 - 
recall(class_index)¶
- Returns the recall. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the recall - Return type: - float 
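The per-class measures that take a class_index (precision, recall, f_measure) all follow from the confusion matrix. The sketch below shows the arithmetic in plain Python; it assumes rows are actual classes and columns are predicted classes, and the helper name and sample matrix are hypothetical.

```python
def class_stats(matrix, class_index):
    """Precision, recall and F-measure for one class.

    Assumes matrix[actual][predicted] holds instance counts.
    """
    tp = matrix[class_index][class_index]
    # Instances of this class that were predicted as something else.
    fn = sum(matrix[class_index]) - tp
    # Instances of other classes that were predicted as this class.
    fp = sum(row[class_index] for row in matrix) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


confusion = [[40, 10],
             [5, 45]]
precision, recall, f_measure = class_stats(confusion, 0)
print(precision, recall, f_measure)
```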
 - 
relative_absolute_error¶
- Returns the relative absolute error. - Returns: - the error - Return type: - float 
 - 
root_mean_prior_squared_error¶
- Returns the root mean prior squared error. - Returns: - the error - Return type: - float 
 - 
root_mean_squared_error¶
- Returns the root mean squared error. - Returns: - the error - Return type: - float 
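For a numeric class, the mean absolute error and root mean squared error reduce to simple averages over the prediction errors. The following is a plain-Python sketch with invented values; it does not reproduce how Weka treats nominal classes or instance weights.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, rmse)
```

Because the errors are squared before averaging, the RMSE is never smaller than the MAE and reacts more strongly to a few large errors.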
 - 
root_relative_squared_error¶
- Returns the root relative squared error. - Returns: - the error - Return type: - float 
 - 
sf_entropy_gain¶
- Returns the total SF, which is the null model entropy minus the scheme entropy. - Returns: - the gain - Return type: - float 
 - 
sf_mean_entropy_gain¶
- Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance. - Returns: - the gain - Return type: - float 
 - 
sf_mean_prior_entropy¶
- Returns the entropy per instance for the null model. - Returns: - the entropy - Return type: - float 
 - 
sf_mean_scheme_entropy¶
- Returns the entropy per instance for the scheme. - Returns: - the entropy - Return type: - float 
 - 
sf_prior_entropy¶
- Returns the total entropy for the null model. - Returns: - the entropy - Return type: - float 
 - 
sf_scheme_entropy¶
- Returns the total entropy for the scheme. - Returns: - the entropy - Return type: - float 
 - 
size_of_predicted_regions¶
- Returns the average size of the predicted regions, relative to the range of the target in the training data, at the confidence level specified when evaluation was performed. - Returns: - the size of the regions - Return type: - float 
 - 
summary(title=None, complexity=False)¶
- Generates a summary. - Parameters: - title (str) – optional title
- complexity (bool) – whether to print the complexity information as well
 - Returns: - the summary - Return type: - str 
 - 
test_model(classifier, data, output=None)¶
- Evaluates the built model using the specified test data and returns the classifications. - Parameters: - classifier (Classifier) – the trained classifier to evaluate
- data (Instances) – the data to evaluate on
- output (PredictionOutput) – the output generator to use
 - Returns: - the classifications - Return type: - ndarray 
 - 
test_model_once(classifier, inst)¶
- Evaluates the built model using the specified test instance and returns the classification. - Parameters: - classifier (Classifier) – the trained classifier to evaluate
- inst (Instance) – the Instance to evaluate on
 - Returns: - the classification - Return type: - float 
 - 
total_cost¶
- Returns the total cost. - Returns: - the cost - Return type: - float 
 - 
true_negative_rate(class_index)¶
- Returns the true negative rate. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the rate - Return type: - float 
 - 
true_positive_rate(class_index)¶
- Returns the true positive rate. - Parameters: - class_index (int) – the 0-based index of the class label - Returns: - the rate - Return type: - float 
 - 
unclassified¶
- Returns the unclassified count. - Returns: - the count - Return type: - float 
 - 
unweighted_macro_f_measure¶
- Returns the unweighted macro-averaged F-measure. - Returns: - the measure - Return type: - float 
 - 
unweighted_micro_f_measure¶
- Returns the unweighted micro-averaged F-measure. - Returns: - the measure - Return type: - float 
 - 
weighted_area_under_prc¶
- Returns the weighted area under the precision-recall curve. - Returns: - the weighted area - Return type: - float 
 - 
weighted_area_under_roc¶
- Returns the weighted area under the receiver operating characteristic (ROC) curve. - Returns: - the weighted area - Return type: - float 
 - 
weighted_f_measure¶
- Returns the weighted F-measure. - Returns: - the measure - Return type: - float 
 - 
weighted_false_negative_rate¶
- Returns the weighted false negative rate. - Returns: - the rate - Return type: - float 
 - 
weighted_false_positive_rate¶
- Returns the weighted false positive rate. - Returns: - the rate - Return type: - float 
 - 
weighted_matthews_correlation¶
- Returns the weighted Matthews correlation (nominal classes). - Returns: - the correlation - Return type: - float 
 - 
weighted_precision¶
- Returns the weighted precision. - Returns: - the precision - Return type: - float 
 - 
weighted_recall¶
- Returns the weighted recall. - Returns: - the recall - Return type: - float 
 - 
weighted_true_negative_rate¶
- Returns the weighted true negative rate. - Returns: - the rate - Return type: - float 
 - 
weighted_true_positive_rate¶
- Returns the weighted true positive rate. - Returns: - the rate - Return type: - float 
 
- 
- 
class weka.classifiers.FilteredClassifier(jobject=None, options=None)¶
- Bases: - weka.classifiers.SingleClassifierEnhancer- Wrapper class for the filtered classifier. - 
check_for_modified_class_attribute(check)¶
- Sets whether to check for class attribute modifications. - Parameters: - check (bool) – True if checking for modifications 
 
- 
- 
class weka.classifiers.GridSearch(jobject=None, options=None)¶
- Bases: - weka.classifiers.SingleClassifierEnhancer- Wrapper class for the GridSearch meta-classifier. - 
best¶
- Returns the best classifier setup found during the search. - Returns: - the best classifier setup - Return type: - Classifier 
 - 
evaluation¶
- Returns the currently set statistic used for evaluation. - Returns: - the statistic - Return type: - SelectedTag 
 - 
x¶
- Returns a dictionary with all the current values for the X of the grid. Keys for the dictionary: property, min, max, step, base, expression Types: property=str, min=float, max=float, step=float, base=float, expression=str - Returns: - the dictionary with the parameters - Return type: - dict 
 - 
y¶
- Returns a dictionary with all the current values for the Y of the grid. Keys for the dictionary: property, min, max, step, base, expression Types: property=str, min=float, max=float, step=float, base=float, expression=str - Returns: - the dictionary with the parameters - Return type: - dict 
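The dictionaries used for the X and Y axes can be pictured with the sketch below. It is an assumption-heavy illustration: the keys are the documented ones, but the property path is invented, and the expression is modelled as a hypothetical Python callable taking the base and the current grid position instead of being evaluated from the expression string.

```python
def grid_axis_values(axis, expression):
    """Enumerate the values one grid axis would visit.

    `axis` uses the documented keys; `expression` is a hypothetical
    callable (base, i) -> value standing in for the expression string.
    """
    # Number of whole steps between min and max (small epsilon for floats).
    steps = int((axis["max"] - axis["min"]) / axis["step"] + 1e-9)
    points = [axis["min"] + k * axis["step"] for k in range(steps + 1)]
    return [expression(axis["base"], i) for i in points]


x_axis = {
    "property": "classifier.ridge",  # illustrative property path only
    "min": -3.0,
    "max": 3.0,
    "step": 1.0,
    "base": 10.0,
    "expression": "pow(BASE,I)",     # kept as a string, not evaluated here
}
values = grid_axis_values(x_axis, lambda base, i: base ** i)
print(values)
```

With these settings the axis covers seven values on a logarithmic scale, from 0.001 to 1000.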
 
- 
- 
class weka.classifiers.Kernel(classname=None, jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for kernels. - 
build_kernel(data)¶
- Builds the kernel with the data. - Parameters: - data (Instances) – the data to build the kernel with 
 - 
capabilities()¶
- Returns the capabilities of the kernel. - Returns: - the capabilities - Return type: - Capabilities 
 - 
checks_turned_off¶
- Returns whether checks are turned off. - Returns: - True if checks turned off - Return type: - bool 
 - 
clean()¶
- Frees the memory used by the kernel. 
 - 
eval(id1, id2, inst1)¶
- Computes the result of the kernel function for two instances. If id1 == -1, eval uses inst1 instead of an instance in the dataset. - Parameters: - id1 (int) – the index of the first instance in the dataset
- id2 (int) – the index of the second instance in the dataset
- inst1 (Instance) – the instance corresponding to id1 (used if id1 == -1)
 
 
- 
- 
class weka.classifiers.KernelClassifier(classname=None, jobject=None, options=None)¶
- Bases: - weka.classifiers.Classifier- Wrapper class for classifiers that have a kernel property, like SMO. 
- 
class weka.classifiers.MultiSearch(jobject=None, options=None)¶
- Bases: - weka.classifiers.SingleClassifierEnhancer- Wrapper class for the MultiSearch meta-classifier. NB: ‘multi-search-weka-package’ must be installed (https://github.com/fracpete/multisearch-weka-package), version 2016.1.15 or later. - 
best¶
- Returns the best classifier setup found during the search. - Returns: - the best classifier setup - Return type: - Classifier 
 - 
evaluation¶
- Returns the currently set statistic used for evaluation. - Returns: - the statistic - Return type: - SelectedTag 
 - 
parameters¶
- Returns the list of currently set search parameters. - Returns: - the list of AbstractSearchParameter objects - Return type: - list 
 
- 
- 
class weka.classifiers.MultipleClassifiersCombiner(classname=None, jobject=None, options=None)¶
- Bases: - weka.classifiers.Classifier- Wrapper class for classifiers that use multiple base classifiers. - 
classifiers¶
- Returns the list of base classifiers. - Returns: - the classifier list - Return type: - list 
 
- 
- 
class weka.classifiers.NominalPrediction(jobject)¶
- Bases: - weka.classifiers.Prediction- Wrapper class for a nominal prediction. - 
distribution¶
- Returns the class distribution. - Returns: - the class distribution list - Return type: - ndarray 
 - 
margin¶
- Returns the margin. - Returns: - the margin - Return type: - float 
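The margin of a nominal prediction is commonly defined as the probability assigned to the actual class minus the highest probability assigned to any other class. The sketch below illustrates that definition in plain Python, assuming it is the one used here; the function name and the sample distribution are hypothetical.

```python
def prediction_margin(distribution, actual):
    """Probability of the actual class minus the best competing class."""
    others = [p for i, p in enumerate(distribution) if i != actual]
    return distribution[actual] - max(others)


distribution = [0.7, 0.2, 0.1]
# Correct, confident prediction: positive margin.
margin_correct = prediction_margin(distribution, actual=0)
# Actual class received less probability than another class: negative margin.
margin_wrong = prediction_margin(distribution, actual=1)
print(margin_correct, margin_wrong)
```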
 
- 
- 
class weka.classifiers.NumericPrediction(jobject)¶
- Bases: - weka.classifiers.Prediction- Wrapper class for a numeric prediction. - 
error¶
- Returns the error. - Returns: - the error - Return type: - float 
 - 
prediction_intervals¶
- Returns the prediction intervals. - Returns: - the intervals - Return type: - ndarray 
 
- 
- 
class weka.classifiers.Prediction(jobject)¶
- Bases: - weka.core.classes.JavaObject- Wrapper class for a prediction. - 
actual¶
- Returns the actual value. - Returns: - the actual value (internal representation) - Return type: - float 
 - 
predicted¶
- Returns the predicted value. - Returns: - the predicted value (internal representation) - Return type: - float 
 - 
weight¶
- Returns the weight. - Returns: - the weight of the Instance that was used - Return type: - float 
 
- 
- 
class weka.classifiers.PredictionOutput(classname='weka.classifiers.evaluation.output.prediction.PlainText', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Collects predictions and generates output from them. The underlying Java class must be derived from weka.classifiers.evaluation.output.prediction.AbstractOutput. - 
buffer_content()¶
- Returns the content of the buffer as string. - Returns: - The buffer content - Return type: - str 
 - 
print_all(cls, data)¶
- Prints the header, classifications and footer to the buffer. - Parameters: - cls (Classifier) – the classifier
- data (Instances) – the test data
 
 - 
print_classification(cls, inst, index)¶
- Prints the classification to the buffer. - Parameters: - cls (Classifier) – the classifier
- inst (Instance) – the test instance
- index (int) – the 0-based index of the test instance
 
 - 
print_classifications(cls, data)¶
- Prints the classifications to the buffer. - Parameters: - cls (Classifier) – the classifier
- data (Instances) – the test data
 
 - 
print_footer()¶
- Prints the footer to the buffer. 
 - 
print_header()¶
- Prints the header to the buffer. 
 
- 
- 
class weka.classifiers.SingleClassifierEnhancer(classname=None, jobject=None, options=None)¶
- Bases: - weka.classifiers.Classifier- Wrapper class for classifiers that use a single base classifier. - 
classifier¶
- Returns the base classifier. - Returns: - the base classifier - Return type: - Classifier 
 
- 
- 
weka.classifiers.main(args=None)¶
- Runs a classifier from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.classifiers.predictions_to_instances(data, preds)¶
- Turns the predictions into an Instances object. - Parameters: - data (Instances) – the original dataset format
- preds (list) – the predictions to convert
 - Returns: - the predictions, None if no predictions present - Return type: - Instances 
- 
weka.classifiers.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
weka.clusterers module¶
- 
class weka.clusterers.ClusterEvaluation¶
- Bases: - weka.core.classes.JavaObject- Evaluation class for clusterers. - 
classes_to_clusters¶
- Return the array (ordered by cluster number) of minimum error class to cluster mappings. - Returns: - the mappings - Return type: - ndarray 
 - 
cluster_assignments¶
- Return an array of cluster assignments corresponding to the most recent set of instances clustered. - Returns: - the cluster assignments - Return type: - ndarray 
 - 
cluster_results¶
- Returns the cluster results as a string. - Returns: - the results string - Return type: - str 
 - 
classmethod crossvalidate_model(clusterer, data, num_folds, rnd)¶
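- Cross-validates the clusterer and returns the log-likelihood. - Parameters: - clusterer (Clusterer) – the clusterer to cross-validate
- data (Instances) – the data to evaluate on
- num_folds (int) – the number of folds
- rnd (Random) – the random number generator to use
 - Returns: - the cross-validated log-likelihood - Return type: - float 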
 - 
classmethod evaluate_clusterer(clusterer, args)¶
- Evaluates the clusterer with the given options. - Parameters: - clusterer (Clusterer) – the clusterer instance to evaluate
- args (list) – the command-line arguments
 - Returns: - the evaluation result - Return type: - str 
 - 
log_likelihood¶
- Returns the log likelihood. - Returns: - the log likelihood - Return type: - float 
 - 
num_clusters¶
- Returns the number of clusters. - Returns: - the number of clusters - Return type: - int 
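To show how `cluster_assignments` and `num_clusters` fit together, the following sketch tallies how many instances landed in each cluster. The helper name `cluster_sizes` is hypothetical, and the code assumes the assignments are numeric cluster indices; any value outside `0 .. num_clusters - 1` is simply ignored.

```python
from collections import Counter


def cluster_sizes(evaluation):
    """Count instances per cluster from a ClusterEvaluation-like object.

    Hypothetical helper: reads only the documented ``cluster_assignments``
    and ``num_clusters`` properties.
    """
    # Assignments are assumed to be numeric indices (possibly floats).
    counts = Counter(int(c) for c in evaluation.cluster_assignments)
    # Indices outside the valid range do not appear in the result.
    return [counts.get(i, 0) for i in range(evaluation.num_clusters)]
```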
 
- 
- 
class weka.clusterers.Clusterer(classname='weka.clusterers.SimpleKMeans', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for clusterers. - 
build_clusterer(data)¶
- Builds the clusterer with the data. - Parameters: - data (Instances) – the data to use for training the clusterer 
 - 
capabilities¶
- Returns the capabilities of the clusterer. - Returns: - the capabilities - Return type: - Capabilities 
 - 
cluster_instance(inst)¶
- Performs a prediction. - Parameters: - inst (Instance) – the instance to determine the cluster for - Returns: - the clustering result - Return type: - float 
 - 
classmethod deserialize(ser_file)¶
- Deserializes a clusterer from a file. - Parameters: - ser_file (str) – the model file to deserialize - Returns: - model and, if available, the dataset header - Return type: - tuple 
 - 
distribution_for_instance(inst)¶
- Performs a prediction, returning the cluster distribution. - Parameters: - inst (Instance) – the Instance to get the cluster distribution for - Returns: - the cluster distribution - Return type: - float[] 
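A cluster distribution can be reduced to a single cluster index by taking the position of its largest value. The sketch below does this with plain Python; `most_likely_cluster` is a hypothetical name, and the code assumes the distribution is an iterable of floats, one per cluster.

```python
def most_likely_cluster(clusterer, inst):
    """Return the index of the cluster with the highest membership value.

    Hypothetical helper built on the documented
    ``distribution_for_instance`` method.
    """
    dist = list(clusterer.distribution_for_instance(inst))
    if not dist:
        return None  # empty distribution, nothing to choose from
    return max(range(len(dist)), key=lambda i: dist[i])
```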
 - 
graph¶
- Returns the graph if the clusterer implements weka.core.Drawable, otherwise None. - Returns: - the graph or None if not available - Return type: - str 
 - 
graph_type¶
- Returns the graph type if the clusterer implements weka.core.Drawable, otherwise -1. - Returns: - the type - Return type: - int 
 - 
classmethod make_copy(clusterer)¶
- Creates a copy of the clusterer. - Parameters: - clusterer (Clusterer) – the clusterer to copy - Returns: - the copy of the clusterer - Return type: - Clusterer 
 - 
number_of_clusters¶
- Returns the number of clusters found. - Returns: - the number of clusters - Return type: - int 
 - 
serialize(ser_file, header=None)¶
- Serializes the clusterer to the specified file. - Parameters: - ser_file (str) – the file to save the model to
- header (Instances) – the (optional) dataset header to store alongside; recommended
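A save-and-reload round trip using `serialize` and `deserialize` might look like the sketch below. The helper name `save_and_reload` is hypothetical, and the code assumes that `deserialize` always returns a two-element tuple, with the header slot set to None when no header was stored.

```python
def save_and_reload(clusterer, ser_file, header=None):
    """Serialize a clusterer, then read it back from the same file.

    Hypothetical helper: relies on the documented ``serialize`` method and
    ``deserialize`` classmethod; assumes the latter yields (model, header).
    """
    clusterer.serialize(ser_file, header=header)
    # A classmethod can be called through the instance as well.
    model, stored_header = clusterer.deserialize(ser_file)
    return model, stored_header
```

Storing the header alongside the model makes it possible to check later that new data is compatible with the data the model was built on.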
 
 - 
update_clusterer(inst)¶
- Updates the clusterer with the instance. - Parameters: - inst (Instance) – the Instance to update the clusterer with 
 - 
update_finished()¶
- Signals the clusterer that updating with new data has finished. 
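The three methods `build_clusterer`, `update_clusterer` and `update_finished` suggest an incremental training loop, sketched below. This is an assumption-laden outline rather than the wrapper's prescribed usage: it presumes the underlying Weka clusterer supports incremental updates and can be initialized from a dataset that carries only the structure (here called `header`). The name `train_incrementally` is hypothetical.

```python
def train_incrementally(clusterer, header, instances):
    """Feed instances one by one to an updateable clusterer.

    Hypothetical helper: ``header`` is assumed to be an Instances-like
    object describing the data structure, ``instances`` any iterable of
    Instance-like objects.
    """
    clusterer.build_clusterer(header)      # initialize with the structure
    count = 0
    for inst in instances:
        clusterer.update_clusterer(inst)   # incremental update
        count += 1
    clusterer.update_finished()            # signal the end of the stream
    return count
```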
 
- 
- 
class weka.clusterers.FilteredClusterer(jobject=None, options=None)¶
- Bases: - weka.clusterers.SingleClustererEnhancer- Wrapper class for the filtered clusterer. 
- 
class weka.clusterers.SingleClustererEnhancer(classname=None, jobject=None, options=None)¶
- Bases: - weka.clusterers.Clusterer- Wrapper class for clusterers that use a single base clusterer. 
- 
weka.clusterers.main(args=None)¶
- Runs a clusterer from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.clusterers.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
weka.datagenerators module¶
- 
class weka.datagenerators.DataGenerator(classname='weka.datagenerators.classifiers.classification.Agrawal', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for datagenerators. - 
generate_finish()¶
- Returns a “finish” string. - Returns: - a finish comment - Return type: - str 
 - 
generate_start()¶
- Returns a “start” string. - Returns: - the start comment - Return type: - str 
 - 
classmethod make_copy(generator)¶
- Creates a copy of the generator. - Parameters: - generator (DataGenerator) – the generator to copy - Returns: - the copy of the generator - Return type: - DataGenerator 
 - 
classmethod make_data(generator, args)¶
- Generates data using the generator and commandline arguments. - Parameters: - generator (DataGenerator) – the generator instance to use
- args (list) – the command-line arguments
 
 - 
num_examples_act¶
- Returns the actual number of examples to generate. - Returns: - the number of examples - Return type: - int 
 - 
single_mode_flag¶
- Returns whether data is generated row by row (True) or in one go (False). - Returns: - whether incremental - Return type: - bool 
 
- 
- 
weka.datagenerators.main(args=None)¶
- Runs a datagenerator from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.datagenerators.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
weka.experiments module¶
- 
class weka.experiments.Experiment(classname='weka.experiment.Experiment', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for an experiment. 
- 
class weka.experiments.ResultMatrix(classname='weka.experiment.ResultMatrixPlainText', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- For generating results from an Experiment run. - 
average(col)¶
- Returns the average mean at this location (if valid location). - Parameters: - col (int) – the 0-based column index - Returns: - the mean - Return type: - float 
 - 
columns¶
- Returns the column count. - Returns: - the count - Return type: - int 
 - 
get_col_name(index)¶
- Returns the column name. - Parameters: - index (int) – the 0-based column index - Returns: - the column name, None if invalid index - Return type: - str 
 - 
get_mean(col, row)¶
- Returns the mean at this location (if valid location). - Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
 - Returns: - the mean - Return type: - float 
 - 
get_row_name(index)¶
- Returns the row name. - Parameters: - index (int) – the 0-based row index - Returns: - the row name, None if invalid index - Return type: - str 
 - 
get_stdev(col, row)¶
- Returns the standard deviation at this location (if valid location). - Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
 - Returns: - the standard deviation - Return type: - float 
 - 
hide_col(index)¶
- Hides the column. - Parameters: - index (int) – the 0-based column index 
 - 
hide_row(index)¶
- Hides the row. - Parameters: - index (int) – the 0-based row index 
 - 
is_col_hidden(index)¶
- Returns whether the column is hidden. - Parameters: - index (int) – the 0-based column index - Returns: - True if hidden - Return type: - bool 
 - 
is_row_hidden(index)¶
- Returns whether the row is hidden. - Parameters: - index (int) – the 0-based row index - Returns: - True if hidden - Return type: - bool 
 - 
rows¶
- Returns the row count. - Returns: - the count - Return type: - int 
 - 
set_col_name(index, name)¶
- Sets the column name. - Parameters: - index (int) – the 0-based column index
- name (str) – the name of the column
 
 - 
set_mean(col, row, mean)¶
- Sets the mean at this location (if valid location). - Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
- mean (float) – the mean to set
 
 - 
set_row_name(index, name)¶
- Sets the row name. - Parameters: - index (int) – the 0-based row index
- name (str) – the name of the row
 
 - 
set_stdev(col, row, stdev)¶
- Sets the standard deviation at this location (if valid location). - Parameters: - col (int) – the 0-based column index
- row (int) – the 0-based row index
- stdev (float) – the standard deviation to set
 
 - 
show_col(index)¶
- Shows the column. - Parameters: - index (int) – the 0-based column index 
 - 
show_row(index)¶
- Shows the row. - Parameters: - index (int) – the 0-based row index 
 - 
to_string_header()¶
- Returns the header of the matrix as a string. - Returns: - the header - Return type: - str 
 - 
to_string_key()¶
- Returns a key for all the column names, for better readability if the names were truncated. - Returns: - the key - Return type: - str 
 - 
to_string_matrix()¶
- Returns the matrix as a string. - Returns: - the generated output - Return type: - str 
 - 
to_string_ranking()¶
- Returns the ranking in a string representation. - Returns: - the ranking - Return type: - str 
 - 
to_string_summary()¶
- Returns the summary as a string. - Returns: - the summary - Return type: - str 
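For post-processing outside Weka, the matrix cells can be read out one by one. The sketch below flattens a matrix into a list of dictionaries using only the documented accessors; `matrix_to_records` is a hypothetical name. Note the argument order of `get_mean` and `get_stdev`: column first, then row.

```python
def matrix_to_records(matrix):
    """Flatten a ResultMatrix-like object into a list of dicts.

    Hypothetical helper: uses ``rows``, ``columns``, ``get_row_name``,
    ``get_col_name``, ``get_mean`` and ``get_stdev`` as documented above.
    """
    records = []
    for row in range(matrix.rows):
        for col in range(matrix.columns):
            records.append({
                "row": matrix.get_row_name(row),
                "col": matrix.get_col_name(col),
                "mean": matrix.get_mean(col, row),    # column index first
                "stdev": matrix.get_stdev(col, row),  # column index first
            })
    return records
```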
 
- 
- 
class weka.experiments.SimpleCrossValidationExperiment(datasets, classifiers, classification=True, runs=10, folds=10, result=None)¶
- Bases: - weka.experiments.SimpleExperiment- Performs a simple cross-validation experiment. Can output the results either in ARFF or CSV. - 
configure_resultproducer()¶
- Configures and returns the ResultProducer and PropertyPath as tuple. - Returns: - producer and property path - Return type: - tuple 
 
- 
- 
class weka.experiments.SimpleExperiment(datasets, classifiers, jobject=None, classification=True, runs=10, result=None)¶
- Bases: - weka.core.classes.OptionHandler- Ancestor for simple experiments. - See following URL for how to use the Experiment API: http://weka.wikispaces.com/Using+the+Experiment+API - 
configure_resultproducer()¶
- Configures and returns the ResultProducer and PropertyPath as tuple. - Returns: - producer and property path - Return type: - tuple 
 - 
configure_splitevaluator()¶
- Configures and returns the SplitEvaluator and Classifier instance as tuple. - Returns: - evaluator and classifier - Return type: - tuple 
 - 
experiment()¶
- Returns the internal experiment, if set up, otherwise None. - Returns: - the internal experiment - Return type: - Experiment 
 - 
classmethod load(filename)¶
- Loads the experiment from disk. - Parameters: - filename (str) – the filename of the experiment to load - Returns: - the experiment - Return type: - Experiment 
 - 
run()¶
- Executes the experiment. 
 - 
classmethod save(filename, experiment)¶
- Saves the experiment to disk. - Parameters: - filename (str) – the filename to save the experiment to
- experiment (Experiment) – the Experiment to save
 
 - 
setup()¶
- Initializes the experiment. 
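The methods above imply a simple life cycle: initialize, execute, then inspect the internal experiment. A minimal sketch of that sequence follows; `run_experiment` is a hypothetical name, and the code assumes that `setup()` has to precede `run()`.

```python
def run_experiment(exp):
    """Initialize and execute a SimpleExperiment-like object.

    Hypothetical helper: calls the documented ``setup``, ``run`` and
    ``experiment`` methods in that order.
    """
    exp.setup()              # initialize the experiment
    exp.run()                # execute it
    return exp.experiment()  # internal experiment, or None if not set up
```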
 
- 
- 
class weka.experiments.SimpleRandomSplitExperiment(datasets, classifiers, classification=True, runs=10, percentage=66.6, preserve_order=False, result=None)¶
- Bases: - weka.experiments.SimpleExperiment- Performs a simple random split experiment. Can output the results either in ARFF or CSV. - 
configure_resultproducer()¶
- Configures and returns the ResultProducer and PropertyPath as tuple. - Returns: - producer and property path - Return type: - tuple 
 
- 
- 
class weka.experiments.Tester(classname='weka.experiment.PairedCorrectedTTester', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- For generating statistical results from an experiment. - 
dataset_columns¶
- Returns the list of column names that uniquely identify a dataset. - Returns: - the list of attribute names - Return type: - list 
 - 
fold_column¶
- Returns the column name that holds the Fold number. - Returns: - the attribute name - Return type: - str 
 - 
header(comparison_column)¶
- Creates a “header” string describing the current resultsets. - Parameters: - comparison_column (int) – the index of the column to compare against - Returns: - the header - Return type: - str 
 - 
init_columns()¶
- Sets the column indices based on the supplied names if necessary. 
 - 
multi_resultset_full(base_resultset, comparison_column)¶
- Creates a comparison table where a base resultset is compared to the other resultsets. - Parameters: - base_resultset (int) – the 0-based index of the base resultset (eg classifier to compare against)
- comparison_column (int) – the 0-based index of the column to compare against
 - Returns: - the comparison - Return type: - str 
 - 
multi_resultset_ranking(comparison_column)¶
- Creates a ranking. - Parameters: - comparison_column (int) – the 0-based index of the column to compare against - Returns: - the ranking - Return type: - str 
 - 
multi_resultset_summary(comparison_column)¶
- Carries out a comparison between all resultsets, counting the number of datasets where one resultset outperforms the other. - Parameters: - comparison_column (int) – the 0-based index of the column to compare against - Returns: - the summary - Return type: - str 
 - 
result_columns¶
- Returns the list of column names that uniquely identify a result (e.g. classifier + options + ID). - Returns: - the list of attribute names - Return type: - list 
 - 
resultmatrix¶
- Returns the ResultMatrix instance in use. - Returns: - the matrix in use - Return type: - ResultMatrix 
 - 
run_column¶
- Returns the column name that holds the Run number. - Returns: - the attribute name - Return type: - str 
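The string-producing methods of the tester can be chained into one report, as in the sketch below. It assumes the tester has already been supplied with experiment results and a result matrix, which is outside the scope of this example. The name `comparison_report` is hypothetical.

```python
def comparison_report(tester, base_resultset, comparison_column):
    """Concatenate header, full comparison, summary and ranking.

    Hypothetical helper: ``tester`` is expected to behave like a fully
    configured Tester; both indices are 0-based.
    """
    parts = [
        tester.header(comparison_column),
        tester.multi_resultset_full(base_resultset, comparison_column),
        tester.multi_resultset_summary(comparison_column),
        tester.multi_resultset_ranking(comparison_column),
    ]
    return "\n".join(parts)
```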
 
- 
weka.filters module¶
- 
class weka.filters.Filter(classname='weka.filters.AllFilter', jobject=None, options=None)¶
- Bases: - weka.core.classes.OptionHandler- Wrapper class for filters. - 
batch_finished()¶
- Signals the filter that the batch of data has finished. - Returns: - True if instances can be collected from the output - Return type: - bool 
 - 
capabilities()¶
- Returns the capabilities of the filter. - Returns: - the capabilities - Return type: - Capabilities 
 - 
classmethod deserialize(ser_file)¶
- Deserializes a filter from a file. - Parameters: - ser_file (str) – the file to deserialize from - Returns: - model - Return type: - Filter 
 - 
filter(data)¶
- Filters the dataset(s). When providing a list, this can be used to create compatible train/test sets, since the filter only gets initialized with the first dataset and all subsequent datasets get transformed using the same setup. - NB: inputformat(Instances) must have been called beforehand. - Parameters: - data (Instances or list of Instances) – the Instances to filter - Returns: - the filtered Instances object(s) - Return type: - Instances or list of Instances 
 - 
input(inst)¶
- Inputs the Instance. - Parameters: - inst (Instance) – the instance to filter - Returns: - True if the filtered instance can be collected from the output - Return type: - bool 
 - 
classmethod make_copy(flter)¶
- Creates a copy of the filter. - Parameters: - flter (Filter) – the filter to copy - Returns: - the copy of the filter - Return type: - Filter 
 - 
output()¶
- Outputs the filtered Instance. - Returns: - the filtered instance - Return type: - an Instance object 
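The `input`, `output` and `batch_finished` methods together describe a streaming protocol. The following sketch pushes instances through a filter one at a time and collects whatever becomes available immediately. It is an outline under assumptions: the input format is taken to have been set beforehand, and instances that only become available after `batch_finished()` are not drained here; the returned flag merely reports that some may be pending. The name `filter_stream` is hypothetical.

```python
def filter_stream(flter, instances):
    """Filter instances one by one, collecting immediately available output.

    Hypothetical helper: returns the collected instances and the flag
    reported by ``batch_finished`` (True if more output can be collected).
    """
    collected = []
    for inst in instances:
        if flter.input(inst):                 # output ready for collection?
            collected.append(flter.output())
    more_pending = flter.batch_finished()     # signal the end of the batch
    return collected, more_pending
```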
 - 
serialize(ser_file)¶
- Serializes the filter to the specified file. - Parameters: - ser_file (str) – the file to save the filter to 
 - 
to_source(classname, data)¶
- Returns the model as Java source code if the filter implements weka.filters.Sourcable. - Parameters: - classname (str) – the classname for the generated Java code
- data (Instances) – the dataset used for initializing the filter
 - Returns: - the model as source code string - Return type: - str 
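The list form of `filter(data)` described earlier in this class is what makes compatible train/test sets possible. A minimal sketch of that pattern follows, assuming `inputformat` (named in the note on `filter`) accepts the training data to initialize the filter. The helper name `filter_train_test` is hypothetical.

```python
def filter_train_test(flter, train, test):
    """Filter train and test data with one shared filter setup.

    Hypothetical helper: the filter is initialized on ``train`` only, and
    ``test`` is then transformed using the same setup.
    """
    flter.inputformat(train)  # must be called before filter()
    filtered_train, filtered_test = flter.filter([train, test])
    return filtered_train, filtered_test
```

Initializing on the training data alone keeps statistics derived from the test set out of the filter's setup.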
 
- 
- 
class weka.filters.MultiFilter(jobject=None, options=None)¶
- Bases: - weka.filters.Filter- Wrapper class for weka.filters.MultiFilter. - 
filters¶
- Returns the list of base filters. - Returns: - the filter list - Return type: - list 
 
- 
- 
class weka.filters.StringToWordVector(jobject=None, options=None)¶
- Bases: - weka.filters.Filter- Wrapper class for weka.filters.unsupervised.attribute.StringToWordVector. 
- 
weka.filters.main(args=None)¶
- Runs a filter from the command-line. Calls JVM start/stop automatically. Use -h to see all options. - Parameters: - args (list) – the command-line arguments to use, uses sys.argv if None 
- 
weka.filters.sys_main()¶
- Runs the main function using the system cli arguments, and returns a system error code. - Returns: - 0 for success, 1 for failure. - Return type: - int 
