weka.plot package

weka.plot.classifiers module

weka.plot.classifiers.generate_thresholdcurve_data(evaluation, class_index)

Generates the threshold curve data from the evaluation object’s predictions.

Parameters
  • evaluation (Evaluation) – the evaluation to obtain the predictions from

  • class_index (int) – the 0-based index of the class-label to create the plot for

Returns

the generated threshold curve data

Return type

Instances

weka.plot.classifiers.get_auc(data)

Calculates the area under the ROC curve (AUC).

Parameters

data (Instances) – the threshold curve data

Returns

the area

Return type

float
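
For intuition about the returned value, the area under a ROC curve can be approximated with the trapezoidal rule over (false positive rate, true positive rate) points. The sketch below is illustrative only: trapezoid_auc is a hypothetical helper, it works on plain Python lists rather than Instances, and it is not necessarily how get_auc computes the value internally.

```python
def trapezoid_auc(fpr, tpr):
    """Approximate the area under a curve given x (fpr) and y (tpr) points.

    Hypothetical helper for illustration; not part of weka.plot.classifiers.
    """
    if len(fpr) != len(tpr):
        raise ValueError("fpr and tpr must have the same length")
    # Sort the points by x so the trapezoids are summed left to right.
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Area of one trapezoid: width times the mean of the two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A perfect classifier, whose curve passes through (0, 1), yields an area of 1.0, while the diagonal from (0, 0) to (1, 1) yields 0.5.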

weka.plot.classifiers.get_prc(data)

Calculates the area under the precision-recall curve (PRC).

Parameters

data (Instances) – the threshold curve data

Returns

the area

Return type

float

weka.plot.classifiers.get_thresholdcurve_data(data, xname, yname)

Retrieves the x and y columns from the data generated by the weka.classifiers.evaluation.ThresholdCurve class.

Parameters
  • data (Instances) – the threshold curve data

  • xname (str) – the name of the X column

  • yname (str) – the name of the Y column

Returns

tuple of x and y arrays

Return type

tuple
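
The functions documented above are typically chained: generate the threshold curve data, pull out two columns, and compute the area. The following is a minimal sketch under stated assumptions: roc_points is a hypothetical wrapper, evaluation is an Evaluation object that already holds predictions, the JVM has been started beforehand, and the column names are assumed to be the ones that ThresholdCurve emits for the ROC axes.

```python
def roc_points(evaluation, class_index=0):
    """Return (x, y, auc) for the ROC curve of one class label.

    Hypothetical wrapper for illustration; not part of the package.
    """
    # Imported lazily so the package is only required when the function runs.
    import weka.plot.classifiers as plcls

    # Build the threshold curve data from the evaluation's predictions.
    data = plcls.generate_thresholdcurve_data(evaluation, class_index)
    # Assumption: these are the column names used by ThresholdCurve.
    x, y = plcls.get_thresholdcurve_data(
        data, "False Positive Rate", "True Positive Rate")
    return x, y, plcls.get_auc(data)
```

Returning the raw arrays makes it possible to draw the curve with a custom matplotlib figure instead of the ready-made plot functions.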

weka.plot.classifiers.plot_classifier_errors(predictions, absolute=True, max_relative_size=50, absolute_size=50, title=None, outfile=None, wait=True, key_loc='lower center')

Plots the classifier errors for the given list of predictions.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • predictions (list or dict) – the predictions to plot; use a dict to plot the predictions of multiple classifiers (keys are used as prefixes for plots)

  • absolute (bool) – whether to use absolute errors as size or relative ones

  • max_relative_size (int) – the maximum size in points in case of relative mode

  • absolute_size (int) – the size in points in case of absolute mode

  • title (str) – an optional title

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot

  • key_loc (str) – the location string for the key

weka.plot.classifiers.plot_learning_curve(classifiers, train, test=None, increments=100, metric='percent_correct', title='Learning curve', label_template='[#] @ $', key_loc='lower right', outfile=None, wait=True)

Plots a learning curve.

Parameters
  • classifiers (list of Classifier) – list of Classifier template objects

  • train (Instances) – dataset to use for building the classifier; also used for evaluating it if the test set is None

  • test (list or Instances) – optional dataset (or list of datasets) to use for testing the built classifiers

  • increments (float) – the increments (>= 1: number of instances, < 1: fraction of the dataset)

  • metric (str) – the name of the numeric metric to plot (Evaluation.<metric>)

  • title (str) – the title for the plot

  • label_template (str) – the template for the label in the plot (#: 1-based index of classifier, @: full classname, !: simple classname, $: options, *: 1-based index of test set)

  • key_loc (str) – the location string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot
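
The increments and label_template parameters are easiest to understand with a small worked example. The sketch below is not the library's implementation: training_sizes and expand_label are hypothetical helpers, and the choice to always finish on the full dataset is an assumption made for illustration.

```python
def training_sizes(num_instances, increments):
    """Return the training set sizes implied by `increments`.

    Assumption: sizes grow by a fixed step and end on the full dataset.
    """
    if increments <= 0:
        raise ValueError("increments must be positive")
    if increments >= 1:
        step = int(increments)  # absolute number of instances
    else:
        step = max(1, round(num_instances * increments))  # fraction of dataset
    sizes = list(range(step, num_instances + 1, step))
    if not sizes or sizes[-1] != num_instances:
        sizes.append(num_instances)  # assumption: include the full dataset
    return sizes


def expand_label(template, index, classname, options="", test_index=1):
    """Expand the placeholders of a label template in a single pass."""
    values = {
        "#": str(index),                # 1-based index of the classifier
        "@": classname,                 # full class name
        "!": classname.split(".")[-1],  # simple class name
        "$": options,                   # option string
        "*": str(test_index),           # 1-based index of the test set
    }
    # One pass over the template, so substituted text is never re-expanded.
    return "".join(values.get(ch, ch) for ch in template)
```

With 200 training instances, increments=0.25 gives a step of 50 instances, whereas increments=100 would give a step of 100 instances. Expanding the template in a single pass avoids accidentally replacing placeholder characters that occur inside an option string.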

weka.plot.classifiers.plot_prc(evaluation, class_index=None, title=None, key_loc='lower center', outfile=None, wait=True)

Plots the PRC (precision-recall) curve for the given predictions.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • evaluation (Evaluation) – the evaluation to obtain the predictions from

  • class_index (list) – the list of 0-based indices of the class-labels to create the plot for

  • title (str) – an optional title

  • key_loc (str) – the location string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.classifiers.plot_prcs(evaluations, class_index=0, title=None, key_loc='lower center', outfile=None, wait=True)

Plots the PRC (precision-recall) curve for the predictions of multiple classifiers on the same dataset.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • evaluations (dict) – the dictionary of Evaluation objects to obtain the predictions from; each key is used as a prefix in the plot key

  • class_index (int) – the 0-based index of the class-label to create the plot for

  • title (str) – an optional title

  • key_loc (str) – the location string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.classifiers.plot_roc(evaluation, class_index=None, title=None, key_loc='lower right', outfile=None, wait=True)

Plots the ROC (receiver operating characteristic) curve for the given predictions.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • evaluation (Evaluation) – the evaluation to obtain the predictions from

  • class_index (list) – the list of 0-based indices of the class-labels to create the plot for

  • title (str) – an optional title

  • key_loc (str) – the position string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.classifiers.plot_rocs(evaluations, class_index=0, title=None, key_loc='lower right', outfile=None, wait=True)

Plots the ROC (receiver operating characteristic) curve for the predictions of multiple classifiers on the same dataset.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • evaluations (dict) – the dictionary of Evaluation objects to obtain the predictions from; each key is used as a prefix in the plot key

  • class_index (int) – the 0-based index of the class-label to create the plot for

  • title (str) – an optional title

  • key_loc (str) – the position string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot
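
As a usage illustration, several evaluations can be passed as a dictionary so that each curve is labelled with its key. This is a minimal sketch, assuming the Evaluation objects were produced on the same dataset and that the JVM is already running; compare_rocs and the title text are hypothetical, only the plot_rocs signature is taken from the documentation above.

```python
def compare_rocs(evaluations, outfile=None):
    """Plot the ROC curves of several classifiers for the first class label.

    Hypothetical wrapper for illustration; not part of the package.
    """
    if not isinstance(evaluations, dict) or not evaluations:
        raise ValueError("expected a non-empty dict of name -> Evaluation")
    # Imported lazily so the package is only required when the function runs.
    import weka.plot.classifiers as plcls

    # wait=False: do not wait for the user to close the plot.
    plcls.plot_rocs(evaluations, class_index=0, title="ROC comparison",
                    key_loc="lower right", outfile=outfile, wait=False)
```

A call such as compare_rocs({"J48": evl_j48, "NaiveBayes": evl_nb}, outfile="rocs.png") would then use "J48" and "NaiveBayes" as prefixes in the plot key, where evl_j48 and evl_nb stand for previously computed Evaluation objects.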

weka.plot.clusterers module

weka.plot.clusterers.plot_cluster_assignments(evl, data, atts=None, inst_no=False, size=10, title=None, outfile=None, wait=True)

Plots the cluster assignments against the specified attributes.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • evl (ClusterEvaluation) – the cluster evaluation to obtain the cluster assignments from

  • data (Instances) – the dataset the clusterer was evaluated against

  • atts (list) – the list of attribute indices to plot, None for all

  • inst_no (bool) – whether to include a fake attribute with the instance number

  • size (int) – the size of the circles in points

  • title (str) – an optional title

  • outfile (str) – the (optional) file to save the generated plot to. The extension determines the file format.

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.dataset module

weka.plot.dataset.line_plot(data, atts=None, percent=100.0, seed=1, title=None, outfile=None, wait=True)

Uses the internal format to plot the dataset, one line per instance.

Parameters
  • data (Instances) – the dataset

  • atts (list) – the list of 0-based attribute indices of attributes to plot

  • percent (float) – the percentage of the dataset to use for plotting

  • seed (int) – the seed value to use for subsampling

  • title (str) – an optional title

  • outfile (str) – the (optional) file to save the generated plot to. The extension determines the file format.

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.dataset.matrix_plot(data, percent=100.0, seed=1, size=10, title=None, outfile=None, wait=True)

Plots all attributes against each other.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • data (Instances) – the dataset

  • percent (float) – the percentage of the dataset to use for plotting

  • seed (int) – the seed value to use for subsampling

  • size (int) – the size of the circles in points

  • title (str) – an optional title

  • outfile (str) – the (optional) file to save the generated plot to. The extension determines the file format.

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.dataset.scatter_plot(data, index_x, index_y, percent=100.0, seed=1, size=50, title=None, outfile=None, wait=True)

Plots two attributes against each other.

TODO: click events http://matplotlib.org/examples/event_handling/data_browser.html

Parameters
  • data (Instances) – the dataset

  • index_x (int) – the 0-based index of the attribute on the x axis

  • index_y (int) – the 0-based index of the attribute on the y axis

  • percent (float) – the percentage of the dataset to use for plotting

  • seed (int) – the seed value to use for subsampling

  • size (int) – the size of the circles in points

  • title (str) – an optional title

  • outfile (str) – the (optional) file to save the generated plot to. The extension determines the file format.

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.experiments module

weka.plot.experiments.plot_experiment(mat, title='Experiment', axes_swapped=False, measure='Statistic', show_stdev=False, key_loc='lower right', outfile=None, wait=True)

Plots the results from an experiment.

Parameters
  • mat (ResultMatrix) – the result matrix to plot

  • title (str) – the title for the experiment

  • axes_swapped (bool) – whether the axes are swapped (“sets x cls” or “cls x sets”)

  • measure (str) – the measure that is being displayed

  • show_stdev (bool) – whether to show the standard deviation as error bar

  • key_loc (str) – the location string for the key

  • outfile (str) – the output file, ignored if None

  • wait (bool) – whether to wait for the user to close the plot

weka.plot.graph module

weka.plot.graph.plot_dot_graph(graph, filename=None)

Plots a graph in graphviz dot notation.

Parameters
  • graph (str) – the dot notation graph

  • filename (str) – the (optional) file to save the generated plot to. The extension determines the file format.

Module contents

weka.plot.create_subsample(data, percent, seed=1)

Generates a subsample of the dataset.

Parameters
  • data (Instances) – the data to create the subsample from

  • percent (float) – the percentage (0-100)

  • seed (int) – the seed value to use
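
The roles of percent and seed can be illustrated on a plain Python list. This sketch is not how create_subsample works internally: subsample_rows is a hypothetical helper, and sampling without replacement is an assumption made for the example.

```python
import random


def subsample_rows(rows, percent, seed=1):
    """Return a reproducible random subset holding `percent` (0-100) of `rows`.

    Hypothetical helper for illustration; it does not operate on Instances.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    k = round(len(rows) * percent / 100.0)  # number of rows to keep
    rng = random.Random(seed)               # local generator, seeded
    return rng.sample(list(rows), k)        # assumption: no replacement
```

Using the same seed twice produces the same subset, which keeps plots of subsampled data reproducible between runs.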

weka.plot.set_window_title(fig, title)

Sets the window title of the figure (if matplotlib is available).

Parameters
  • fig – the figure to update

  • title (str) – the title to set