weka.core package

weka.core.capabilities module

class weka.core.capabilities.Capabilities(jobject=None, owner=None)

Bases: weka.core.classes.JavaObject

Wrapper for Capabilities.

attribute_capabilities()

Returns all the attribute capabilities.

Returns

attribute capabilities

Return type

Capabilities

capabilities()

Returns all the capabilities.

Returns

all capabilities

Return type

list

class_capabilities()

Returns all the class capabilities.

Returns

class capabilities

Return type

Capabilities

dependencies()

Returns all the dependencies.

Returns

the dependency list

Return type

list

disable(capability)

Disables the specified capability.

Parameters

capability (Capability) – the capability to disable

disable_all()

Disables all capabilities.

disable_all_attribute_dependencies()

Disables all attribute dependencies.

disable_all_attributes()

Disables all attributes.

disable_all_class_dependencies()

Disables all class dependencies.

disable_all_classes()

Disables all classes.

disable_dependency(capability)

Disables the dependency of the given capability. Disabling NOMINAL_ATTRIBUTES also disables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters

capability (Capability) – the dependency to disable

enable(capability)

Enables the specified capability.

Parameters

capability (Capability) – the capability to enable

enable_all()

Enables all capabilities.

enable_all_attribute_dependencies()

Enables all attribute dependencies.

enable_all_attributes()

Enables all attributes.

enable_all_class_dependencies()

Enables all class dependencies.

enable_all_classes()

Enables all classes.

enable_dependency(capability)

Enables the dependency of the given capability. Enabling NOMINAL_ATTRIBUTES also enables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters

capability (Capability) – the dependency to enable
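The cascading described for enable_dependency and disable_dependency can be pictured with a small pure-Python model. This is only an illustrative sketch: the class DependencyModel and the IMPLIED table are hypothetical and not part of the wrapper, which delegates to the Java class. Only the capability names are taken from the descriptions above.

```python
# Hypothetical model of dependency cascading; not the wrapper's implementation.
IMPLIED = {
    "NOMINAL_ATTRIBUTES": (
        "BINARY_ATTRIBUTES",
        "UNARY_ATTRIBUTES",
        "EMPTY_NOMINAL_ATTRIBUTES",
    ),
}


class DependencyModel:
    """Tracks which capabilities are flagged as dependencies."""

    def __init__(self):
        self.dependencies = set()

    def enable_dependency(self, capability):
        # Enabling a capability also enables everything it implies.
        self.dependencies.add(capability)
        self.dependencies.update(IMPLIED.get(capability, ()))

    def disable_dependency(self, capability):
        # Disabling mirrors enabling: implied capabilities are dropped too.
        self.dependencies.discard(capability)
        self.dependencies.difference_update(IMPLIED.get(capability, ()))

    def has_dependency(self, capability):
        return capability in self.dependencies


model = DependencyModel()
model.enable_dependency("NOMINAL_ATTRIBUTES")
print(sorted(model.dependencies))
model.disable_dependency("NOMINAL_ATTRIBUTES")
print(model.has_dependency("BINARY_ATTRIBUTES"))
```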

classmethod for_instances(data, multi=None)

Returns a Capabilities object specific to this data. The minimum number of instances is not set; the check for multi-instance data is optional.

Parameters
  • data (Instances) – the data to generate the capabilities for

  • multi (bool) – whether to check the structure, too

Returns

the generated capabilities

Return type

Capabilities

handles(capability)

Returns whether the specified capability is set.

Parameters

capability (Capability) – the capability to check

Returns

whether the capability is set

Return type

bool

has_dependencies()

Returns whether any dependencies are set.

Returns

whether any dependencies are set

Return type

bool

has_dependency(capability)

Returns whether the specified dependency is set.

Parameters

capability (Capability) – the capability to check

Returns

whether the dependency is set

Return type

bool

property min_instances

Returns the minimum number of instances that must be supported.

Returns

the minimum number

Return type

int

other_capabilities()

Returns all other capabilities.

Returns

all other capabilities

Return type

Capabilities

property owner

Returns the owner of these capabilities, if any.

Returns

the owner, can be None

Return type

JavaObject

supports(capabilities)

Returns true if the currently set capabilities support at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters

capabilities (Capabilities) – the capabilities to check

Returns

whether the current capabilities support at least the specified ones

Return type

bool

supports_maybe(capabilities)

Returns true if the currently set capabilities support (or have a dependency) at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters

capabilities (Capabilities) – the capabilities to check

Returns

whether the current capabilities (potentially) support the specified ones

Return type

bool
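The difference between supports and supports_maybe can be sketched with plain sets. The two functions below are a hypothetical model written for illustration, assuming that a capability counts as "maybe supported" when it is either enabled or flagged as a dependency. They are not the wrapper's code, and the capability names are used only as example strings.

```python
# Hypothetical set-based model; the wrapper delegates to the Java class instead.
def supports(enabled, required):
    """True if every required capability is enabled."""
    return set(required) <= set(enabled)


def supports_maybe(enabled, dependencies, required):
    """True if every required capability is enabled or has a dependency set."""
    return set(required) <= (set(enabled) | set(dependencies))


enabled = {"NUMERIC_ATTRIBUTES", "NOMINAL_CLASS"}
dependencies = {"NOMINAL_ATTRIBUTES"}
required = {"NUMERIC_ATTRIBUTES", "NOMINAL_ATTRIBUTES"}

print(supports(enabled, required))
print(supports_maybe(enabled, dependencies, required))
```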

test_attribute(att, is_class=None, fail=False)

Tests whether the attribute meets the conditions.

Parameters
  • att (Attribute) – the Attribute to test

  • is_class (bool) – whether this attribute is the class attribute

  • fail (bool) – whether to fail with an exception in case the test fails

Returns

whether the attribute meets the conditions

Return type

bool

test_instances(data, from_index=None, to_index=None, fail=False)

Tests whether the dataset meets the conditions.

Parameters
  • data (Instances) – the Instances to test

  • from_index (int) – the first attribute to include

  • to_index (int) – the last attribute to include

  • fail (bool) – whether to fail with an exception in case the test fails

Returns

whether the dataset meets the requirements

Return type

bool

class weka.core.capabilities.Capability(jobject=None, member=None)

Bases: weka.core.classes.Enum

Wrapper for a Capability.

property is_attribute

Returns whether this capability is an attribute.

Returns

whether it is an attribute

Return type

bool

property is_attribute_capability

Returns whether this capability is an attribute capability.

Returns

whether it is an attribute capability

Return type

bool

property is_class

Returns whether this capability is a class.

Returns

whether it is a class

Return type

bool

property is_class_capability

Returns whether this capability is a class capability.

Returns

whether it is a class capability

Return type

bool

property is_other_capability

Returns whether this capability is an other capability.

Returns

whether it is an other capability

Return type

bool

weka.core.classes module

class weka.core.classes.AbstractParameter(classname=None, jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Ancestor for all parameter classes used by SetupGenerator and MultiSearch.

property prop

Returns the currently set property to apply the parameter to.

Returns

the property

Return type

str

class weka.core.classes.Date(jobject=None, msecs=None)

Bases: weka.core.classes.JavaObject

Wraps a java.util.Date object.

property time

Returns the stored milliseconds.

Returns

the milliseconds

Return type

long

class weka.core.classes.Enum(jobject=None, enum=None, member=None)

Bases: weka.core.classes.JavaObject

Wrapper for Java enums.

property name

Returns the name of the enum member.

Returns

the name

Return type

str

property ordinal

Returns the ordinal of the enum member.

Returns

the ordinal

Return type

int

property values

Returns list of all enum members.

Returns

all enum members

Return type

list

class weka.core.classes.Environment(jobject=None)

Bases: weka.core.classes.JavaObject

Wraps around weka.core.Environment.

add_variable(key, value, system_wide=False)

Adds the environment variable.

Parameters
  • key (str) – the name of the variable

  • value (str) – the value

  • system_wide (bool) – whether to add the variable system wide

remove_variable(key)

Removes the environment variable.

Parameters

key (str) – the name of the variable

classmethod system_wide()

Returns the system-wide environment.

Returns

the environment

Return type

Environment

variable_names()

Returns the names of all environment variables.

Returns

the names of the variables

Return type

list

variable_value(key)

Returns the value of the environment variable.

Parameters

key (str) – the name of the variable

Returns

the variable value

Return type

str

class weka.core.classes.JavaArray(jobject)

Bases: weka.core.classes.JavaObject

Convenience wrapper around Java arrays.

component_type()

Returns the classname of the elements.

Returns

the class of the elements

Return type

str

classmethod new_instance(classname, length)

Creates a new array with the given classname and length; initial values are null.

Parameters
  • classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

  • length (int) – the length of the array

Returns

the Java array

Return type

JB_Object

class weka.core.classes.JavaArrayIterator(data)

Bases: object

Iterator for elements in a Java array.

class weka.core.classes.JavaObject(jobject)

Bases: confobj._core.JSONObject

Basic Java object.

classmethod check_type(jobject, intf_or_class)

Returns whether the object implements the specified interface or is a subclass.

Parameters
  • jobject (JB_Object) – the Java object to check

  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

Returns

whether object implements interface or is subclass

Return type

bool

property classname

Returns the Java classname in dot-notation.

Returns

the Java classname

Return type

str

classmethod enforce_type(jobject, intf_or_class)

Raises an exception if the object does not implement the specified interface or is not a subclass.

Parameters
  • jobject (JB_Object) – the Java object to check

  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters

d (dict) – the object dictionary

Returns

the object

Return type

object

get_property(path)

Attempts to get the value (jobject, a Java object) of the provided (bean) property path.

Parameters

path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair

Returns

the wrapped Java object

Return type

JavaObject

property is_serializable

Returns true if the object is serializable.

Returns

true if serializable

Return type

bool

property jclass

Returns the Java class object of the underlying Java object.

Returns

the Java class

Return type

JB_Object

property jclasswrapper

Returns a JClassWrapper instance of the class for the encapsulated Java object, giving access to the class methods using dot notation.

http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection

Returns

the wrapper

Return type

JClassWrapper

property jwrapper

Returns a JWrapper instance of the encapsulated Java object, giving access to methods using dot notation.

http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection

Returns

the wrapper

Return type

JWrapper

classmethod new_instance(classname, options=None)

Creates a new object from the given classname using the default constructor, None in case of error.

Parameters
  • classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

  • options (list) – the list of options to use, ignored if None

Returns

the Java object

Return type

JB_Object

set_property(path, jobject)

Attempts to set the value (jobject, a Java object) of the provided (bean) property path.

Parameters
  • path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair

  • jobject (JB_Object) – the Java object to set; if instance of JavaObject class, the jobject member is automatically used
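The relation between a property path such as “filter” and its setFilter(…)/getFilter() method pair follows the usual bean naming convention. The helper below is a hypothetical sketch of that convention for a single, non-empty property name; it does not handle nested paths and is not how the wrapper resolves properties internally.

```python
def accessor_names(prop):
    """Return the (getter, setter) method names for a simple bean property."""
    # Bean convention: capitalize the first letter and prefix with get/set.
    suffix = prop[0].upper() + prop[1:]
    return "get" + suffix, "set" + suffix


print(accessor_names("filter"))
```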

to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns

the object dictionary

Return type

dict

class weka.core.classes.ListParameter(jobject=None, options=None)

Bases: weka.core.classes.AbstractParameter

Parameter using a predefined list of values, used by SetupGenerator and MultiSearch.

property values

Returns the currently set values.

Returns

the list of values (strings)

Return type

list

class weka.core.classes.MathParameter(jobject=None, options=None)

Bases: weka.core.classes.AbstractParameter

Parameter using a math expression for generating values, used by SetupGenerator and MultiSearch.

property base

Returns the currently set base value.

Returns

the base

Return type

float

property expression

Returns the currently set expression.

Returns

the expression

Return type

str

property maximum

Returns the currently set maximum value.

Returns

the maximum

Return type

float

property minimum

Returns the currently set minimum value.

Returns

the minimum

Return type

float

property step

Returns the currently set step value.

Returns

the step

Return type

float

class weka.core.classes.Option(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.Option class.

property description

Returns the description of the option.

Returns

the description

Return type

str

property name

Returns the name of the option.

Returns

the name

Return type

str

property num_arguments

Returns the number of arguments of the option.

Returns

the number of arguments

Return type

int

property synopsis

Returns the synopsis of the option.

Returns

the synopsis

Return type

str

class weka.core.classes.OptionHandler(jobject, options=None)

Bases: weka.core.classes.JavaObject, confobj._core.Configurable

Ancestor for option-handling classes. Classes should implement the weka.core.OptionHandler interface to have any effect.

description()

Returns a description of the object.

Returns

the description

Return type

str

classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters

d (dict) – the object dictionary

Returns

the object

Return type

object

global_info()

Returns the globalInfo() result, None if not available.

Return type

str

property options

Obtains the currently set options as list.

Returns

the list of options

Return type

list

to_commandline()

Generates a commandline string from the JavaObject instance.

Returns

the commandline string

Return type

str

to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns

the object dictionary

Return type

dict

to_help(title=True, description=True, options=True, use_headers=True, separator='')

Returns a string that contains the ‘global_info’ text and the options.

Parameters
  • title (bool) – whether to output a title

  • description (bool) – whether to output the description

  • options (bool) – whether to output the options

  • use_headers (bool) – whether to output headers, describing the sections

  • separator (str) – the separator line to use between sections

Returns

the generated help string

Return type

str

class weka.core.classes.Random(seed)

Bases: weka.core.classes.JavaObject

Wrapper for the java.util.Random class.

next_double()

Next random double.

Returns

the next random double

Return type

double

next_int(n=None)

Next random integer. If n is provided, the value lies between 0 and n-1.

Parameters

n (int) – the upper limit (minus 1) for the random integer

Returns

the next random integer

Return type

int
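The seeding and the 0 to n-1 range of next_int(n) can be illustrated with Python's own random module. This is only an analogy: random.Random uses a different algorithm than java.util.Random, so the numbers produced here will not match those of the wrapper.

```python
import random

# Python's random.Random stands in for java.util.Random here; the two
# generators use different algorithms, so the actual numbers will differ.
rng_a = random.Random(42)
rng_b = random.Random(42)

n = 10
draws_a = [rng_a.randrange(n) for _ in range(5)]  # each value in 0..n-1
draws_b = [rng_b.randrange(n) for _ in range(5)]

print(draws_a == draws_b)  # same seed, same sequence
print(all(0 <= d < n for d in draws_a))
print(0.0 <= rng_a.random() < 1.0)  # analogue of next_double()
```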

class weka.core.classes.Range(jobject=None, ranges=None)

Bases: weka.core.classes.JavaObject

Wrapper for a Weka Range object.

property invert

Returns whether the range is inverted.

Returns

true if inverted

Return type

bool

property ranges

Returns the string range.

Returns

the string range of 1-based indices

Return type

str

selection()

Returns the selection list.

Returns

the list of 0-based integer indices

Return type

list

upper(upper)

Sets the upper limit.

Parameters

upper (int) – the upper limit
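How a 1-based range string maps to the 0-based list returned by selection() can be sketched in plain Python. The function parse_range is hypothetical and assumes that the upper limit is the largest valid 0-based index, which is also what “last” resolves to. Inversion is not modelled, and the real parsing is done by the Java class.

```python
def parse_range(ranges, upper):
    """Expand a 1-based range string into sorted 0-based indices.

    `upper` is assumed to be the largest valid 0-based index ("last").
    """
    def to_index(token):
        token = token.strip()
        if token == "first":
            return 0
        if token == "last":
            return upper
        return int(token) - 1  # 1-based -> 0-based

    selected = set()
    for part in ranges.split(","):
        if "-" in part:
            start, end = (to_index(t) for t in part.split("-", 1))
        else:
            start = end = to_index(part)
        selected.update(range(start, end + 1))
    # Drop anything outside the valid 0..upper window.
    return sorted(i for i in selected if 0 <= i <= upper)


print(parse_range("1-3,5,last", upper=9))
```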

class weka.core.classes.SelectedTag(jobject=None, tag_id=None, tag_text=None, tags=None)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.SelectedTag class.

property selected

Returns the selected tag.

Returns

the tag

Return type

Tag

property tags

Returns the associated tags.

Returns

the list of Tag objects

Return type

list

class weka.core.classes.SetupGenerator(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Allows generation of large number of setups using parameter setups.

property base_object

Returns the base object to apply the setups to.

Returns

the base object

Return type

JavaObject or OptionHandler

property parameters

Returns the list of currently set search parameters.

Returns

the list of AbstractParameter objects

Return type

list

setups()

Generates and returns all the setups according to the parameter search space.

Returns

the list of configured objects (of type JavaObject)

Return type

list

class weka.core.classes.SingleIndex(jobject=None, index=None)

Bases: weka.core.classes.JavaObject

Wrapper for a Weka SingleIndex object.

index()

Returns the integer index.

Returns

the 0-based integer index

Return type

int

property single_index

Returns the string index.

Returns

the 1-based string index

Return type

str

upper(upper)

Sets the upper limit.

Parameters

upper (int) – the upper limit

class weka.core.classes.Stoppable

Bases: object

Classes that can be stopped.

is_stopped()

Returns whether the object has been stopped.

Returns

whether stopped

Return type

bool

stop_execution()

Triggers the stopping of the object.

class weka.core.classes.Tag(jobject=None, ident=None, ident_str='', readable='', uppercase=True)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.Tag class.

property ident

Returns the current integer ID of the tag.

Returns

the integer ID

Return type

int

property identstr

Returns the current ID string.

Returns

the ID string

Return type

str

property readable

Returns the ‘human readable’ string.

Returns

the readable string

Return type

str

class weka.core.classes.Tags(jobject=None, tags=None)

Bases: weka.core.classes.JavaObject

Wrapper for an array of weka.core.Tag objects.

find(name)

Returns the Tag that matches the name.

Parameters

name (str) – the string representation of the tag

Returns

the tag, None if not found

Return type

Tag

classmethod get_object_tags(javaobject, methodname)

Instantiates the Tag array obtained from the object using the specified method name.

Example:

    cls = Classifier(classname="weka.classifiers.meta.MultiSearch")
    tags = Tags.get_object_tags(cls, "getMetricsTags")

Parameters
  • javaobject (JavaObject) – the javaobject to obtain the tags from

  • methodname (str) – the method name returning the Tag array

Returns

the Tags object

Return type

Tags

classmethod get_tags(classname, field)

Instantiates the Tag array located in the specified class with the given field name.

Example: tags = Tags.get_tags("weka.classifiers.functions.SMO", "TAGS_FILTER")

Parameters
  • classname (str) – the classname in which the tags reside

  • field (str) – the field name of the Tag array

Returns

the Tags object

Return type

Tags

weka.core.classes.backquote(s)

Backquotes the string.

Parameters

s (str) – the string to process

Returns

the backquoted string

Return type

str

weka.core.classes.call_non_public_method(jobject, method, arg_types=None, arg_values=None)

For calling a non-public method of the provided Java object.

Parameters
  • jobject (JB_Object) – the Java object to call the method on

  • method (str) – the name of the method to call

  • arg_types (list) – the method argument types, either Java objects or classname strings (eg “java.lang.Integer” or “int”)

  • arg_values (list) – the method argument values

Returns

the result of the method call

weka.core.classes.complete_classname(classname)

Attempts to complete a partial classname like ‘.J48’ and returns the full classname if a single match was found, otherwise an exception is raised.

Parameters

classname (str) – the partial classname to expand

Returns

the full classname

Return type

str
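The "single match or exception" rule can be sketched as follows. The list KNOWN_CLASSES and the function name are hypothetical stand-ins, since the wrapper consults Weka itself for the available classes, and the choice of ValueError is an assumption made for this sketch.

```python
# Hypothetical list of known classes, for illustration only.
KNOWN_CLASSES = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.filters.supervised.attribute.Discretize",
    "weka.filters.unsupervised.attribute.Discretize",
]


def complete_classname_sketch(partial, known=KNOWN_CLASSES):
    """Expand a partial classname; raise unless exactly one class matches."""
    matches = [name for name in known if name.endswith(partial)]
    if len(matches) != 1:
        raise ValueError(
            "expected exactly one match for %r, found %d" % (partial, len(matches))
        )
    return matches[0]


print(complete_classname_sketch(".J48"))
```

A partial name such as ‘.Discretize’ matches two entries in this list and therefore raises.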

weka.core.classes.deepcopy(obj)

Creates a deep copy of the JavaObject (or derived class) or JB_Object.

Parameters

obj (object) – the object to create a copy of

Returns

the copy, None if failed to copy

Return type

object

weka.core.classes.from_byte_array(array)

Deserializes Java objects from the numpy array.

Parameters

array (ndarray) – the numpy array to deserialize the Java objects from

Returns

the list of deserialized JB_Object instances

Return type

list

weka.core.classes.from_commandline(cmdline, classname=None)

Creates an OptionHandler based on the provided commandline string.

Parameters
  • cmdline (str) – the commandline string to use

  • classname (str) – the classname of the wrapper to return other than OptionHandler (in dot-notation)

Returns

the generated option handler instance

Return type

object

weka.core.classes.get_classname(obj)

Returns the classname of the JB_Object, Python class or object.

Parameters

obj (object) – the java object or Python class/object to get the classname for

Returns

the classname

Return type

str

weka.core.classes.get_enum(classname, enm)

Returns the instance of the enum.

Parameters
  • classname (str) – the classname of the enum

  • enm (str) – the name of the enum element to return

Returns

the enum instance

Return type

JB_Object

weka.core.classes.get_jclass(classname)

Returns the Java class object associated with the dot-notation classname. Also supports the Java primitives: boolean, byte, short, int, long, float, double, char.

Parameters

classname (str) – the classname

Returns

the class object

Return type

JB_Object

weka.core.classes.get_non_public_field(jobject, field)

Returns the specified non-public field from the Java object.

Parameters
  • jobject (JB_Object) – the Java object to get the field from

  • field (str) – the name of the field to retrieve

Returns

the value

weka.core.classes.get_static_field(classname, fieldname, signature)

Returns the Java object associated with the static field of the specified class.

Parameters
  • classname (str) – the classname of the class to get the field from

  • fieldname (str) – the name of the field to retrieve

  • signature (str) – the JNI signature of the field's type

Returns

the object

Return type

JB_Object

weka.core.classes.help_for(classname, title=True, description=True, options=True, use_headers=True, separator='')

Generates a help screen for the specified class.

Parameters
  • classname (str) – the class to get the help screen for, must implement the OptionHandler interface

  • title (bool) – whether to output a title

  • description (bool) – whether to output the description

  • options (bool) – whether to output the options

  • use_headers (bool) – whether to output headers, describing the sections

  • separator (str) – the separator line to use between sections

Returns

the help screen, None if not available

Return type

str

weka.core.classes.is_array(obj)

Checks whether the Java object is an array.

Parameters

obj (JB_Object) – the Java object to check

Returns

whether the object is an array

Return type

bool

weka.core.classes.is_instance_of(obj, class_or_intf_name)

Checks whether the Java object implements the specified interface or is a subclass of the superclass.

Parameters
  • obj (JB_Object) – the Java object to check

  • class_or_intf_name (str) – the superclass or interface to check, dot notation or with forward slashes

Returns

true if either implements interface or subclass of superclass

Return type

bool

weka.core.classes.join_options(options)

Turns the list of options back into a single commandline string.

Parameters

options (list) – the list of options to process

Returns

the combined options

Return type

str

weka.core.classes.list_property_names(obj)

Lists the property names (Bean properties, ie read/write method pair) of the Java object.

Parameters

obj (JB_Object or JavaObject) – the object to inspect

Returns

the list of property names

Return type

list

weka.core.classes.load_suggestions()

Loads the class/package suggestions, if necessary.

weka.core.classes.main()

Runs a classifier from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

weka.core.classes.new_instance(classname)

Instantiates an object of the specified class. Does not raise an Exception if it fails to do so (as opposed to JavaObject.new_instance).

Parameters

classname (str) – the name of the class to instantiate

Returns

the object, None if failed to instantiate

Return type

JB_Object

weka.core.classes.quote(s)

Quotes the string if necessary.

Parameters

s (str) – the string to process

Returns

the quoted string

Return type

str

weka.core.classes.serialization_read(filename)

Reads the serialized object from disk. Caller must wrap object in appropriate Python wrapper class.

Parameters

filename (str) – the file with the serialized object

Returns

the JB_Object

Return type

JB_Object

weka.core.classes.serialization_read_all(filename)

Reads the serialized objects from disk. Caller must wrap objects in appropriate Python wrapper classes.

Parameters

filename (str) – the file with the serialized objects

Returns

the list of JB_Object instances

Return type

list

weka.core.classes.serialization_write(filename, jobject)

Serializes the object to disk. JavaObject instances get automatically unwrapped.

Parameters
  • filename (str) – the file to serialize the object to

  • jobject (JB_Object or JavaObject) – the object to serialize

weka.core.classes.serialization_write_all(filename, jobjects)

Serializes the list of objects to disk. JavaObject instances get automatically unwrapped.

Parameters
  • filename (str) – the file to serialize the object to

  • jobjects (list) – the list of objects to serialize

weka.core.classes.split_commandline(cmdline)

Splits the commandline string into classname and options list.

Parameters

cmdline (str) – the commandline string to split

Returns

the tuple of classname and options list

Return type

tuple

weka.core.classes.split_options(cmdline)

Splits the commandline into a list of options.

Parameters

cmdline (str) – the commandline string to split into individual options

Returns

the split list of commandline options

Return type

list
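The idea behind split_commandline, split_options and join_options can be approximated with Python's shlex module. This is a rough analogy under the assumption of shell-like quoting; Weka's own quoting rules differ in details, and the function name below is hypothetical.

```python
import shlex


def split_commandline_sketch(cmdline):
    """Split 'classname opt1 opt2 ...' into (classname, [options])."""
    tokens = shlex.split(cmdline)  # honours quotes around option values
    if not tokens:
        return "", []
    return tokens[0], tokens[1:]


cmdline = "weka.classifiers.trees.J48 -C 0.25 -M 2"
classname, options = split_commandline_sketch(cmdline)
print(classname)
print(options)
print(shlex.join(options))  # rough analogue of join_options (Python 3.8+)
```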

weka.core.classes.suggest_package(name, exact=False)

Suggests package(s) for the given name (classname, package name). Matching can be either exact or just a substring.

Parameters
  • name (str) – the name to look for

  • exact (bool) – whether to perform exact matching or substring matching

Returns

list of matching package names

Return type

list
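Exact versus substring matching can be sketched against a small class-to-package dictionary. The table and all names in it are invented for illustration and do not correspond to real Weka packages; the function is a hypothetical model, not the wrapper's lookup.

```python
# Hypothetical class -> package table standing in for the suggestions dict.
SUGGESTIONS = {
    "example.classifiers.AlphaLearner": "alphaPackage",
    "example.classifiers.BetaLearner": "betaPackage",
    "example.filters.AlphaFilter": "alphaPackage",
}


def suggest_package_sketch(name, exact=False, table=SUGGESTIONS):
    """Return sorted package names whose class or package name matches."""
    matches = set()
    for classname, package in table.items():
        if exact:
            hit = name in (classname, package)  # whole-name comparison
        else:
            hit = name in classname or name in package  # substring search
        if hit:
            matches.add(package)
    return sorted(matches)


print(suggest_package_sketch("Alpha"))
print(suggest_package_sketch("Alpha", exact=True))
```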

weka.core.classes.suggestions = None

dictionary for class -> package relation

weka.core.classes.to_byte_array(jobjects)

Serializes the list of objects into a numpy array.

Parameters

jobjects (list) – the list of objects to serialize

Returns

the numpy array

Return type

ndarray

weka.core.classes.to_commandline(optionhandler)

Generates a commandline string from the OptionHandler instance.

Parameters

optionhandler (OptionHandler) – the OptionHandler instance to turn into a commandline

Returns

the commandline string

Return type

str

weka.core.classes.unbackquote(s)

Un-backquotes the string.

Parameters

s (str) – the string to process

Returns

the un-backquoted string

Return type

str

weka.core.classes.unquote(s)

Un-quotes the string.

Parameters

s (str) – the string to process

Returns

the un-quoted string

Return type

str

weka.core.converters module

class weka.core.converters.IncrementalLoaderIterator(loader, structure)

Bases: object

Iterator for dataset rows when loading incrementally.

class weka.core.converters.Loader(classname='weka.core.converters.ArffLoader', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for Loaders.

load_file(dfile, incremental=False, class_index=None)

Loads the specified file and returns the Instances object. In case of incremental loading, only the structure is returned.

Parameters
  • dfile (str) – the file to load

  • incremental (bool) – whether to load the dataset incrementally

  • class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns

the full dataset or the header (if incremental)

Return type

Instances

Raises

Exception – if the file does not exist
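The accepted class_index strings can be read as follows. The helper resolve_class_index is a hypothetical sketch that maps each string to a 0-based attribute position for a dataset with a given number of attributes; the wrapper itself handles this internally.

```python
def resolve_class_index(class_index, num_attributes):
    """Translate a class index string into a 0-based attribute index."""
    named = {
        "first": 0,
        "second": 1,
        "third": 2,
        "last-2": num_attributes - 3,
        "last-1": num_attributes - 2,
        "last": num_attributes - 1,
    }
    if class_index in named:
        return named[class_index]
    return int(class_index) - 1  # plain numbers are 1-based


print(resolve_class_index("last", 5))
print(resolve_class_index("2", 5))
```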

load_url(url, incremental=False)

Loads the specified URL and returns the Instances object. In case of incremental loading, only the structure is returned.

Parameters
  • url (str) – the URL to load the data from

  • incremental (bool) – whether to load the dataset incrementally

Returns

the full dataset or the header (if incremental)

Return type

Instances

class weka.core.converters.Saver(classname='weka.core.converters.ArffSaver', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for Savers.

capabilities()

Returns the capabilities of the saver.

Returns

the capabilities

Return type

Capabilities

save_file(data, dfile)

Saves the Instances object in the specified file.

Parameters
  • data (Instances) – the data to save

  • dfile (str) – the file to save the data to

class weka.core.converters.TextDirectoryLoader(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for TextDirectoryLoader.

load()

Loads the text files from the specified directory and returns the Instances object. In case of incremental loading, only the structure.

Returns

the full dataset or the header (if incremental)

Return type

Instances

weka.core.converters.load_any_file(filename, class_index=None)

Determines a Loader based on the file extension. If successful, loads the full dataset and returns it.

Parameters
  • filename (str) – the name of the file to load

  • class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns

the full dataset

Return type

Instances

weka.core.converters.load_csv_file(filename, dialect='excel', delimiter=',', quotechar='"', num_cols=None)

Loads a CSV file using the Python csv module and then converts it to an Instances object. Better at reading CSV files than Weka’s built-in CSVLoader. String attributes can be converted to nominal ones using the weka.filters.unsupervised.attribute.StringToNominal filter.

Parameters
  • filename (str) – the name of the CSV file to load

  • dialect (str) – the type of CSV file to load

  • delimiter (str) – the field delimiter

  • quotechar (str) – the character used for quoting cells

  • quoting – how the quoting works

  • num_cols (list) – the list of 0-based column indices that are numeric, default for cols is str
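The role of dialect, delimiter, quotechar and num_cols can be illustrated with the csv module alone. The function read_csv_rows is a hypothetical sketch that assumes the first row is a header and stops short of building an Instances object. Columns listed in num_cols are converted to float, all others stay strings.

```python
import csv
import io


def read_csv_rows(handle, dialect="excel", delimiter=",", quotechar='"',
                  num_cols=None):
    """Read CSV rows; columns listed in num_cols become floats, rest stay str."""
    num_cols = set(num_cols or [])
    reader = csv.reader(handle, dialect=dialect, delimiter=delimiter,
                        quotechar=quotechar)
    header = next(reader)  # assumption: the first row holds the column names
    rows = []
    for row in reader:
        rows.append([float(cell) if i in num_cols else cell
                     for i, cell in enumerate(row)])
    return header, rows


sample = io.StringIO('name,weight\n"Smith, J.",71.5\nLee,64\n')
header, rows = read_csv_rows(sample, num_cols=[1])
print(header)
print(rows)
```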

weka.core.converters.loader_for_file(filename)

Returns a Loader that can load the specified file, based on the file extension. None if failed to determine.

Parameters

filename (str) – the filename to get the loader for

Returns

the associated loader instance or None if none found

Return type

Loader
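Extension-based lookup with a None fallback can be sketched like this. The table LOADERS_BY_EXTENSION and the function name are hypothetical; Weka maintains its own registry of file extensions, and the sketch returns a classname string rather than a Loader instance.

```python
import os

# Hypothetical lookup table; Weka maintains its own extension registry.
LOADERS_BY_EXTENSION = {
    ".arff": "weka.core.converters.ArffLoader",
    ".csv": "weka.core.converters.CSVLoader",
}


def loader_classname_for_file(filename):
    """Return the loader classname for the file's extension, or None."""
    extension = os.path.splitext(filename)[1].lower()
    return LOADERS_BY_EXTENSION.get(extension)


print(loader_classname_for_file("iris.arff"))
print(loader_classname_for_file("notes.txt"))
```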

weka.core.converters.ndarray_to_instances(array, relation, att_template='Att-#', att_list=None)

Converts the numpy matrix into an Instances object and returns it.

Parameters
  • array (numpy.ndarray) – the numpy ndarray to convert

  • relation (str) – the name of the dataset

  • att_template (str) – the prefix to use for the attribute names, “#” is the 1-based index, “!” is the 0-based index, “@” the relation name

  • att_list (list) – the list of attribute names to use

Returns

the generated instances object

Return type

Instances
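The placeholders in att_template expand per column as described above. The function attribute_names is a hypothetical sketch of that substitution only; it produces the names and does not create an Instances object.

```python
def attribute_names(num_attributes, relation, att_template="Att-#"):
    """Generate attribute names from the template placeholders."""
    names = []
    for i in range(num_attributes):
        name = (att_template
                .replace("#", str(i + 1))   # 1-based index
                .replace("!", str(i))       # 0-based index
                .replace("@", relation))    # relation name
        names.append(name)
    return names


print(attribute_names(3, "iris"))
print(attribute_names(2, "iris", att_template="@_!"))
```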

weka.core.converters.save_any_file(data, filename)

Determines a Saver based on the file extension. Returns whether successfully saved.

Parameters
  • filename (str) – the name of the file to save

  • data (Instances) – the data to save

Returns

whether successfully saved

Return type

bool

weka.core.converters.saver_for_file(filename)

Returns a Saver that can save the specified file, based on the file extension. None if failed to determine.

Parameters

filename (str) – the filename to get the saver for

Returns

the associated saver instance or None if none found

Return type

Saver

weka.core.database module

class weka.core.database.DatabaseUtils(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for weka.experiment.DatabaseUtils.

property db_url

Obtains the currently set database URL.

Returns

the database URL

Return type

str

property password

Obtains the currently set database password.

Returns

the database password

Return type

str

property user

Obtains the currently set database user.

Returns

the database user

Return type

str

class weka.core.database.InstanceQuery(jobject=None, options=None)

Bases: weka.core.database.DatabaseUtils

Wrapper class for weka.experiment.InstanceQuery.

property custom_properties

Obtains the currently set custom properties file.

Returns

the custom properties file

Return type

str

property query

Obtains the current SQL query to execute.

Returns

the SQL query

Return type

str

retrieve_instances(query=None)

Executes either the supplied query or the one set via options (or the ‘query’ property).

Parameters

query (str) – query to execute if not the currently set one

Returns

the generated data

Return type

Instances

property sparse_data

Obtains whether sparse data is returned or not.

Returns

whether sparse data is generated

Return type

bool

weka.core.dataset module

class weka.core.dataset.Attribute(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Attribute.

add_relation(instances)

Adds the relation value, returns the index.

Parameters

instances (Instances) – the Instances object to add

Returns

the index

Return type

int

add_string_value(s)

Adds the string value, returns the index.

Parameters

s (str) – the string to add

Returns

the index

Return type

int

copy(name=None)

Creates a copy of this attribute.

Parameters

name (str) – the new name, uses the old one if None

Returns

the copy of the attribute

Return type

Attribute

classmethod create_date(name, formt="yyyy-MM-dd'T'HH:mm:ss")

Creates a date attribute.

Parameters
  • name (str) – the name of the attribute

  • formt (str) – the date format, see Javadoc for java.text.SimpleDateFormat

classmethod create_nominal(name, labels)

Creates a nominal attribute.

Parameters
  • name (str) – the name of the attribute

  • labels (list) – the list of string labels to use

classmethod create_numeric(name)

Creates a numeric attribute.

Parameters

name (str) – the name of the attribute

classmethod create_relational(name, inst)

Creates a relational attribute.

Parameters
  • name (str) – the name of the attribute

  • inst (Instances) – the structure of the relational attribute

classmethod create_string(name)

Creates a string attribute.

Parameters

name (str) – the name of the attribute

property date_format

Returns the format of this date attribute. See java.text.SimpleDateFormat Javadoc.

Returns

the format string

Return type

str

equals(att)

Checks whether this attribute is the same as the provided one.

Parameters

att (Attribute) – the Attribute to check against

Returns

whether the same

Return type

bool

equals_msg(att)

Checks whether this attribute is the same as the provided one. Returns None if the same, otherwise an error message.

Parameters

att (Attribute) – the Attribute to check against

Returns

None if the same, otherwise error message

Return type

str

property index

Returns the index of this attribute.

Returns

the index

Return type

int

index_of(label)

Returns the index of the label in this attribute.

Parameters

label (str) – the string label to get the index for

Returns

the 0-based index

Return type

int

property is_averagable

Returns whether the attribute is averagable.

Returns

whether averagable

Return type

bool

property is_date

Returns whether the attribute is a date one.

Returns

whether date attribute

Return type

bool

is_in_range(value)

Checks whether the value is within the bounds of the numeric attribute.

Parameters

value (float) – the numeric value to check

Returns

whether between lower and upper bound

Return type

bool

property is_nominal

Returns whether the attribute is a nominal one.

Returns

whether nominal attribute

Return type

bool

property is_numeric

Returns whether the attribute is a numeric one (date or numeric).

Returns

whether numeric attribute

Return type

bool

property is_relation_valued

Returns whether the attribute is a relation valued one.

Returns

whether relation valued attribute

Return type

bool

property is_string

Returns whether the attribute is a string attribute.

Returns

whether string attribute

Return type

bool

property lower_numeric_bound

Returns the lower numeric bound of the numeric attribute.

Returns

the lower bound

Return type

float

property name

Returns the name of the attribute.

Returns

the name

Return type

str

property num_values

Returns the number of labels.

Returns

the number of labels

Return type

int

property ordering

Returns the ordering of the attribute.

Returns

the ordering (ORDERING_SYMBOLIC, ORDERING_ORDERED, ORDERING_MODULO)

Return type

int

parse_date(s)

Parses the date string and returns the internal format value.

Parameters

s (str) – the date string

Returns

the internal format

Return type

float

property type

Returns the type of the attribute. See weka.core.Attribute Javadoc.

Returns

the type

Return type

int

type_str(short=False)

Returns the type of the attribute as string.

Returns

the type

Return type

str

property upper_numeric_bound

Returns the upper numeric bound of the numeric attribute.

Returns

the upper bound

Return type

float

value(index)

Returns the label for the index.

Parameters

index (int) – the 0-based index of the label to return

Returns

the label

Return type

str
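
The relationship between index_of(label), value(index) and the "internal format" of nominal values can be sketched in plain Python. This is an illustration only: the label list is hypothetical, and returning -1 for an unknown label is an assumption carried over from Weka's Java API.

```python
labels = ["sunny", "overcast", "rainy"]  # hypothetical nominal labels


def index_of(label):
    # Assumption: -1 signals an unknown label.
    return labels.index(label) if label in labels else -1


def value(index):
    # 0-based lookup of the label, mirroring value(index) above.
    return labels[index]


# A nominal value is stored internally as the float index of its label.
internal = float(index_of("rainy"))
print(internal, value(int(internal)))
```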

property values

Returns the labels, strings or relation-values.

Returns

all the values, None if not NOMINAL, STRING, or RELATION

Return type

list

property weight

Returns the weight of the attribute.

Returns

the weight

Return type

float

class weka.core.dataset.AttributeIterator(data)

Bases: object

Iterator for attributes in an Instances object.

class weka.core.dataset.AttributeStats(jobject)

Bases: weka.core.classes.JavaObject

Container for attribute statistics.

property distinct_count

The number of distinct values.

Returns

The number of distinct values

Return type

int

property int_count

The number of int-like values.

Returns

The number of int-like values

Return type

int

property missing_count

The number of missing values.

Returns

The number of missing values

Return type

int

property nominal_counts

Counts of each nominal value.

Returns

Counts of each nominal value

Return type

ndarray

property nominal_weights

Weight mass for each nominal value.

Returns

Weight mass for each nominal value

Return type

ndarray

property numeric_stats

Stats on numeric value distributions.

Returns

Stats on numeric value distributions

Return type

Stats

property total_count

The total number of values.

Returns

The total number of values

Return type

int

property unique_count

The number of values that only appear once.

Returns

The number of values that only appear once

Return type

int
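
The difference between distinct_count and unique_count is easy to mix up. The sketch below recomputes the counters for a small hypothetical column in plain Python; it is not the library's code, and treating total_count as including missing values is an assumption.

```python
from collections import Counter

values = ["a", "b", "a", None, "c", "b", "a"]  # None marks a missing value

present = [v for v in values if v is not None]
counts = Counter(present)

total_count = len(values)        # assumption: missing values are included
missing_count = values.count(None)
distinct_count = len(counts)     # how many different values occur
unique_count = sum(1 for n in counts.values() if n == 1)  # values seen exactly once

print(total_count, missing_count, distinct_count, unique_count)
```

Here "a", "b" and "c" are distinct values, but only "c" appears exactly once.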

class weka.core.dataset.Instance(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Instance.

property class_attribute

Returns the currently set class attribute.

Returns

the class attribute

Return type

Attribute

property class_index

Returns the currently set class index.

Returns

the class index, -1 if not set

Return type

int

classmethod create_instance(values, classname='weka.core.DenseInstance', weight=1.0)

Creates a new instance.

Parameters
  • values (ndarray or list) – the float values (internal format) to use, numpy array or list.

  • classname (str) – the classname of the instance (eg weka.core.DenseInstance).

  • weight (float) – the weight of the instance

classmethod create_sparse_instance(values, max_values, classname='weka.core.SparseInstance', weight=1.0)

Creates a new sparse instance.

Parameters
  • values (list) – the list of tuples (0-based index and internal format float). The indices of the tuples must be in ascending order and “max_values” must be set to the maximum number of attributes in the dataset.

  • max_values (int) – the maximum number of attributes

  • classname (str) – the classname of the instance (eg weka.core.SparseInstance).

  • weight (float) – the weight of the instance
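
The shape of the values argument for a sparse instance can be sketched as follows. This plain-Python helper (sparse_to_dense is a hypothetical name) only illustrates the constraints stated above, ascending 0-based indices bounded by max_values; it assumes that positions not listed are zero, as is usual for sparse data.

```python
def sparse_to_dense(values, max_values):
    """Expand (index, value) tuples into a dense list; unlisted positions are 0.0."""
    indices = [i for i, _ in values]
    if indices != sorted(indices) or len(set(indices)) != len(indices):
        raise ValueError("indices must be strictly ascending")
    if indices and (indices[0] < 0 or indices[-1] >= max_values):
        raise ValueError("index out of range for max_values")
    dense = [0.0] * max_values
    for i, v in values:
        dense[i] = float(v)
    return dense


dense = sparse_to_dense([(1, 3.5), (4, 1.0)], max_values=6)
print(dense)
```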

property dataset

Returns the dataset that this instance belongs to.

Returns

the dataset or None if no dataset set

Return type

Instances

get_relational_value(index)

Returns the relational value at the specified position (0-based).

Parameters

index (int) – the 0-based index of the internal value

Returns

the relational value

Return type

Instances

get_string_value(index)

Returns the string value at the specified position (0-based).

Parameters

index (int) – the 0-based index of the internal value

Returns

the string value

Return type

str

get_value(index)

Returns the internal value at the specified position (0-based).

Parameters

index (int) – the 0-based index of the internal value

Returns

the internal value

Return type

float

has_class()

Returns whether a class attribute is set (convenience method).

Returns

whether a class attribute is currently set

Return type

bool

has_missing()

Returns whether at least one attribute has a missing value.

Returns

whether at least one value is missing

Return type

bool

is_missing(index)

Returns whether the attribute at the specified index is missing.

Parameters

index (int) – the 0-based index of the attribute

Returns

whether the value is missing

Return type

bool

classmethod missing_value()

Returns the numeric value that represents a missing value in Weka (NaN).

Returns

missing value

Return type

float

property num_attributes

Returns the number of attributes.

Returns

the number of attributes

Return type

int

property num_classes

Returns the number of class labels.

Returns

the number of class labels

Return type

int

set_missing(index)

Sets the attribute at the specified index to missing.

Parameters

index (int) – the 0-based index of the attribute

set_string_value(index, s)

Sets the string value at the specified position (0-based).

Parameters
  • index (int) – the 0-based index of the internal value

  • s (str) – the string value

set_value(index, value)

Sets the internal value at the specified position (0-based).

Parameters
  • index (int) – the 0-based index of the attribute

  • value (float) – the internal float value to set

to_numpy(internal=False)

Turns the instance into a numpy matrix.

Parameters

internal (bool) – whether to return the internal format

Returns

the dataset as matrix with single row

Return type

np.ndarray

property values

Returns the internal values of this instance.

Returns

the values as numpy array

Return type

ndarray

property weight

Returns the currently set weight.

Returns

the weight

Return type

float

class weka.core.dataset.InstanceIterator(data)

Bases: object

Iterator for rows in an Instances object.

class weka.core.dataset.InstanceValueIterator(data)

Bases: object

Iterator for values in an Instance object.

class weka.core.dataset.Instances(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Instances.

add_instance(inst, index=None)

Adds the specified instance to the dataset.

Parameters
  • inst (Instance) – the Instance to add

  • index (int) – the 0-based index where to add the Instance

classmethod append_instances(inst1, inst2)

Merges the two datasets (one-after-the-other). Throws an exception if the datasets aren’t compatible.

Parameters
  • inst1 (Instances) – the first dataset

  • inst2 (Instances) – the second dataset

Returns

the combined dataset

Return type

Instances

attribute(index)

Returns the specified attribute.

Parameters

index (int) – the 0-based index of the attribute

Returns

the attribute

Return type

Attribute

attribute_by_name(name)

Returns the specified attribute, None if not found.

Parameters

name (str) – the name of the attribute

Returns

the attribute or None

Return type

Attribute

attribute_names()

Returns a list of all the attribute names.

Returns

list of attribute names

Return type

list

attribute_stats(index)

Returns the specified attribute statistics.

Parameters

index (int) – the 0-based index of the attribute

Returns

the attribute statistics

Return type

AttributeStats

attributes()

Returns an iterator over the attributes.

property class_attribute

Returns the currently set class attribute.

Returns

the class attribute

Return type

Attribute

property class_index

Returns the currently set class index (0-based).

Returns

the class index, -1 if not set

Return type

int

class_is_first()

Sets the first attribute as class attribute (convenience method).

class_is_last()

Sets the last attribute as class attribute (convenience method).

compactify()

Compactifies the set of instances.

classmethod copy_instances(dataset, from_row=None, num_rows=None)

Creates a copy of the Instances. If either from_row or num_rows are None, then all of the data is being copied.

Parameters
  • dataset (Instances) – the original dataset

  • from_row (int) – the 0-based start index of the rows to copy

  • num_rows (int) – the number of rows to copy

Returns

the copy of the data

Return type

Instances

copy_structure()

Returns a copy of the dataset structure.

Returns

the structure of the dataset

Return type

Instances

classmethod create_instances(name, atts, capacity)

Creates a new Instances.

Parameters
  • name (str) – the relation name

  • atts (list of Attribute) – the list of attributes to use for the dataset

  • capacity (int) – how many data rows to reserve initially (see compactify)

Returns

the dataset

Return type

Instances

cv_splits(folds=10, rnd=None, stratify=True)

Generates a list of train/test pairs used in cross-validation. Creates a copy of the dataset beforehand when randomizing.

Parameters
  • folds (int) – the number of folds to use, >= 2

  • rnd (Random) – the random number generator to use for randomization, skips randomization if None

  • stratify (bool) – whether to stratify the data after randomization

Returns

the list of train/test split tuples

Return type

list
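
To make the idea of train/test pairs concrete, here is a plain-Python sketch that partitions row indices into contiguous folds. It works on indices only and leaves out randomization and stratification; the helper names fold_bounds and cv_split_indices are hypothetical, and the exact fold boundaries Weka produces may differ.

```python
def fold_bounds(num_instances, num_folds, fold):
    """Return (first, size) of the contiguous test block for a 0-based fold."""
    if num_folds < 2:
        raise ValueError("number of folds must be at least 2")
    if num_folds > num_instances:
        raise ValueError("cannot have more folds than instances")
    base, remainder = divmod(num_instances, num_folds)
    # The first `remainder` folds receive one extra row each.
    size = base + 1 if fold < remainder else base
    first = fold * base + min(fold, remainder)
    return first, size


def cv_split_indices(num_instances, num_folds):
    """List of (train_indices, test_indices) pairs, one per fold."""
    splits = []
    for fold in range(num_folds):
        first, size = fold_bounds(num_instances, num_folds, fold)
        test = list(range(first, first + size))
        train = [i for i in range(num_instances) if i < first or i >= first + size]
        splits.append((train, test))
    return splits


splits = cv_split_indices(10, 3)
for train, test in splits:
    print(len(train), test)
```

Every row lands in exactly one test fold, and each training set is the complement of its test fold.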

delete(index=None)

Removes either the specified Instance or all Instance objects.

Parameters

index (int) – the 0-based index of the instance to remove

delete_attribute(index)

Deletes an attribute at the given position.

Parameters

index (int) – the 0-based index of the attribute to remove

delete_attribute_type(typ)

Deletes all attributes of the given type in the dataset.

Parameters

typ (int) – the attribute type to remove, see weka.core.Attribute Javadoc

delete_first_attribute()

Deletes the first attribute.

delete_last_attribute()

Deletes the last attribute.

delete_with_missing(index)

Deletes all rows that have a missing value at the specified attribute index.

Parameters

index (int) – the attribute index to check for missing attributes

equal_headers(inst)

Compares this dataset against the given one in terms of attributes.

Parameters

inst (Instances) – the dataset to compare against

Returns

None if the same, otherwise an error message

Return type

str

get_instance(index)

Returns the Instance object at the specified location.

Parameters

index (int) – the 0-based index of the instance

Returns

the instance

Return type

Instance

has_class()

Returns whether a class attribute is set (convenience method).

Returns

whether a class attribute is currently set

Return type

bool

insert_attribute(att, index)

Inserts the attribute at the specified location.

Parameters
  • att (Attribute) – the attribute to insert

  • index (int) – the index to insert the attribute at

classmethod merge_instances(inst1, inst2)

Merges the two datasets (side-by-side).

Parameters
  • inst1 (Instances) – the first dataset

  • inst2 (Instances) – the second dataset

Returns

the combined dataset

Return type

Instances

no_class()

Unsets the class attribute (convenience method).

property num_attributes

Returns the number of attributes.

Returns

the number of attributes

Return type

int

property num_instances

Returns the number of instances.

Returns

the number of instances

Return type

int

randomize(random)

Randomizes the dataset using the random number generator.

Parameters

random (Random) – the random number generator to use

property relationname

Returns the name of the dataset.

Returns

the name

Return type

str

set_instance(index, inst)

Sets the Instance at the specified location in the dataset.

Parameters
  • index (int) – the 0-based index of the instance to replace

  • inst (Instance) – the Instance to set

Returns

the instance

Return type

Instance

sort(index)

Sorts the dataset using the specified attribute index.

Parameters

index (int) – the index of the attribute

stratify(folds)

Stratifies the data after randomization for nominal class attributes.

Parameters

folds (int) – the number of folds to perform the stratification for

subset(col_range=None, col_names=None, invert_cols=False, row_range=None, invert_rows=False, keep_relationame=False)

Returns a subset of attributes/rows of the Instances object. If neither attributes nor rows have been specified, a copy of the dataset is returned. The inverse of the specified cols/rows can be returned by setting invert_cols and/or invert_rows to True. The method uses the weka.filters.unsupervised.attribute.Remove and weka.filters.unsupervised.instance.RemoveRange filters under the hood.

Parameters
  • col_range (str) – the subset of attributes to return (eg ‘1-3,7-12,67-last’), None for all

  • col_names (list) – the list of attributes to return (list of names; case-sensitive), takes precedence over col_range

  • invert_cols (bool) – whether to invert the returned attributes

  • row_range (str) – the subset of rows to return (eg ‘1-3,7-12,67-last’), None for all

  • invert_rows (bool) – whether to invert the returned rows

  • keep_relationame (bool) – whether to keep the original relation name

Returns

the subset

Return type

Instances
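
Range strings such as '1-3,7-12,67-last' are 1-based, whereas most other indices in this module are 0-based. The sketch below shows one way such a string could be read; parse_range is a hypothetical helper written for illustration, not the parser that the Weka filters use, and support for the "first" keyword is an assumption.

```python
def parse_range(spec, num_items):
    """Convert a 1-based range string such as '1-3,7-last' to 0-based indices."""
    def position(token):
        token = token.strip()
        if token == "first":
            return 1
        if token == "last":
            return num_items
        return int(token)

    indices = []
    for part in spec.split(","):
        start, sep, end = part.partition("-")
        lo = position(start)
        hi = position(end) if sep else lo
        if lo < 1 or hi > num_items or lo > hi:
            raise ValueError("invalid range: %r" % part)
        indices.extend(range(lo - 1, hi))
    return indices


cols = parse_range("1-3,7-last", 9)
inverted = [i for i in range(9) if i not in cols]  # what invert_cols=True would keep
print(cols, inverted)
```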

classmethod summary(inst)

Generates a summary of the dataset.

Parameters

inst (Instances) – the dataset

Returns

the summary

Return type

str

classmethod template_instances(dataset, capacity=0)

Uses the Instances as template to create an empty dataset.

Parameters
  • dataset (Instances) – the original dataset

  • capacity (int) – how many data rows to reserve initially (see compactify)

Returns

the empty dataset

Return type

Instances

test_cv(num_folds, fold)

Generates a test fold for cross-validation.

Parameters
  • num_folds (int) – the number of folds of cross-validation, eg 10

  • fold (int) – the current fold (0-based)

Returns

the test fold

Return type

Instances

to_numpy(internal=False)

Turns the dataset into a numpy matrix.

Parameters

internal (bool) – whether to return the internal format

Returns

the dataset as matrix

Return type

np.ndarray

train_cv(num_folds, fold, random=None)

Generates a training fold for cross-validation.

Parameters
  • num_folds (int) – the number of folds of cross-validation, eg 10

  • fold (int) – the current fold (0-based)

  • random (Random) – the random number generator

Returns

the training fold

Return type

Instances

train_test_split(percentage, rnd=None)

Generates a train/test split. Creates a copy of the dataset first before applying randomization.

Parameters
  • percentage (float) – the percentage split to use (amount to use for training; 0-100)

  • rnd (Random) – the random number generator to use, if None the order gets preserved

Returns

the train/test splits

Return type

tuple
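
How a percentage translates into split sizes can be sketched on a plain list of rows. The rounding rule used here is an assumption, split_sizes is a hypothetical helper, and the order is preserved as when rnd is None.

```python
def split_sizes(num_instances, percentage):
    """Return (train_size, test_size) for a percentage in the range 0-100."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Assumption: the training size is rounded to the nearest integer.
    train_size = int(round(num_instances * percentage / 100.0))
    return train_size, num_instances - train_size


rows = list(range(150))
train_size, test_size = split_sizes(len(rows), 66.0)
train, test = rows[:train_size], rows[train_size:]
print(len(train), len(test))
```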

values(index)

Returns the internal values of this attribute from all the instance objects.

Returns

the values as numpy array

Return type

ndarray

class weka.core.dataset.Stats(jobject)

Bases: weka.core.classes.JavaObject

Container for numeric attribute stats.

property count

The number of values seen.

Returns

The number of values seen

Return type

float

property max

The maximum value seen, or Double.NaN if no values seen.

Returns

The maximum value seen, or Double.NaN if no values seen

Return type

float

property mean

The mean of values at the last calculateDerived() call.

Returns

The mean of values at the last calculateDerived() call

Return type

float

property min

The minimum value seen, or Double.NaN if no values seen.

Returns

The minimum value seen, or Double.NaN if no values seen

Return type

float

property stddev

The std deviation of values at the last calculateDerived() call.

Returns

The std deviation of values at the last calculateDerived() call

Return type

float

property sum

The sum of values seen.

Returns

The sum of values seen

Return type

float

property sumsq

The sum of values squared seen.

Returns

The sum of values squared seen

Return type

float
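
The derived properties (mean, stddev) can be recovered from the running totals (count, sum, sumsq). The following sketch assumes the sample standard deviation with an n - 1 denominator; returning NaN where a quantity is undefined is a choice made for this sketch and may not match what the Java class reports.

```python
import math


def derived_stats(count, total, sumsq):
    """Return (mean, stddev) from running totals; NaN where undefined."""
    mean = total / count if count > 0 else float("nan")
    if count > 1:
        variance = (sumsq - (total * total) / count) / (count - 1)
        stddev = math.sqrt(max(variance, 0.0))  # guard against rounding below zero
    else:
        stddev = float("nan")
    return mean, stddev


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, stddev = derived_stats(len(values), sum(values), sum(v * v for v in values))
print(mean, stddev)
```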

weka.core.dataset.check_col_names_unique(cols_x, col_y=None)

Checks whether the column names are unique (a requirement for Instances objects).

Parameters
  • cols_x (list) – the column names for the input variables

  • col_y (str) – the optional name for the output variable

Returns

None if check passed, otherwise error message

Return type

str
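
The "None if passed, otherwise error message" convention used here (and by equals_msg and equal_headers) can be sketched as follows. check_unique is a hypothetical stand-in, and the wording of the error message is invented for illustration.

```python
def check_unique(cols_x, col_y=None):
    """Return None if all names are unique, otherwise an error message."""
    names = list(cols_x)
    if col_y is not None:
        names.append(col_y)
    seen, duplicates = set(), []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    if duplicates:
        return "Duplicate column names: " + ", ".join(duplicates)
    return None


msg = check_unique(["a", "b"], "a")
if msg is not None:
    print(msg)
```

Callers test the result against None rather than for truthiness, so an empty message could never be mistaken for success.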

weka.core.dataset.create_instances_from_lists(x, y=None, name='data', cols_x=None, col_y=None)

Allows the generation of an Instances object from a list of lists for X and a list for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. None values are interpreted as missing values.

Parameters
  • x (list of list) – the input variables (row wise)

  • y (list) – the output variable (optional)

  • name (str) – the name of the dataset

  • cols_x (list) – the column names to use

  • col_y (str) – the column name to use for the output variable (y)

Returns

the generated dataset

Return type

Instances
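
Since None entries are interpreted as missing values, and Weka represents a missing value as NaN, the conversion for numeric data can be pictured like this. The sketch is plain Python, handles numeric columns only, and to_internal_row is a hypothetical helper rather than part of the module.

```python
import math

MISSING = float("nan")  # Weka represents a missing value as NaN


def to_internal_row(row):
    """Replace None with NaN and convert the remaining numbers to float."""
    return [MISSING if v is None else float(v) for v in row]


x = [[1, 2.5, None], [4, None, 6]]
internal = [to_internal_row(r) for r in x]

# NaN never compares equal to itself, so missing cells need math.isnan().
missing_mask = [[math.isnan(v) for v in r] for r in internal]
print(missing_mask)
```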

weka.core.dataset.create_instances_from_matrices(x, y=None, name='data', cols_x=None, col_y=None)

Allows the generation of an Instances object from a 2-dimensional matrix for X and a 1-dimensional matrix for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. nan values are interpreted as missing values.

Parameters
  • x (np.ndarray) – the input variables

  • y (np.ndarray) – the output variable (optional)

  • name (str) – the name of the dataset

  • cols_x (list) – the column names to use

  • col_y (str) – the column name to use for the output variable (y)

Returns

the generated dataset

Return type

Instances

weka.core.dataset.missing_value()

Returns the value that represents missing values in Weka (NaN).

Returns

missing value

Return type

float

weka.core.distances module

class weka.core.distances.DistanceFunction(classname='weka.core.EuclideanDistance', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper for Weka’s weka.core.DistanceFunction interface.

property attribute_indices

Returns the attribute indices in use.

Returns

the attribute indices

Return type

str

distance(first, second, cutoff=None)

Computes the distance between the two Instance objects.

Parameters
  • first (Instance) – the first instance

  • second (Instance) – the second instance

  • cutoff (float) – optional cutoff value to speed up calculation

Returns

the calculated distance

Return type

float

property instances

Returns the dataset in use.

Returns

the dataset

Return type

Instances

weka.core.jvm module

weka.core.jvm.add_bundled_jars()

Adds the bundled jars to the JVM’s classpath.

weka.core.jvm.add_system_classpath()

Adds the system’s classpath to the JVM’s classpath.

weka.core.jvm.automatically_install_packages = None

whether to automatically install missing packages

weka.core.jvm.lib_dir()

Returns the “lib” directory path.

Returns

the path to the “lib” directory

Return type

str

weka.core.jvm.start(class_path=None, bundled=True, packages=False, system_cp=False, max_heap_size=None, system_info=False, auto_install=False, logging_level=10)

Initializes the javabridge connection (starts up the JVM).

Parameters
  • class_path (list) – the additional classpath elements to add

  • bundled (bool) – whether to add jars from the “lib” directory

  • packages (bool or str) – whether to add jars from Weka packages as well (bool) or an alternative Weka home directory (str)

  • system_cp (bool) – whether to add the system classpath as well

  • max_heap_size (str) – the maximum heap size (-Xmx parameter, eg 512m or 4g)

  • system_info (bool) – whether to print the system info (generated by weka.core.SystemInfo)

  • auto_install (bool) – whether to automatically install missing Weka packages (based on suggestions); in conjunction with package support

  • logging_level (int) – the logging level to use for this module, e.g., logging.DEBUG or logging.INFO

weka.core.jvm.started = None

whether the JVM has been started

weka.core.jvm.stop()

Kills the JVM.

weka.core.jvm.with_package_support = None

whether JVM was started with package support

weka.core.packages module

class weka.core.packages.Dependency(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.Dependency class.

property source

Returns the source package.

Returns

the package

Return type

Package

property target

Returns the target package constraint.

Returns

the package constraint

Return type

PackageConstraint

weka.core.packages.LATEST = 'Latest'

Constant for the latest version of a package

class weka.core.packages.Package(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.Package class.

as_dict()

Turns the package information into a dictionary. Not to be confused with ‘to_dict’!

Returns

the package information as dictionary

Return type

dict

property dependencies

Returns the dependencies of the package.

Returns

the list of Dependency objects

Return type

list of Dependency

install()

Installs the package.

property is_installed

Returns whether the package is installed.

Returns

whether installed

Return type

bool

property metadata

Returns the meta-data.

Returns

the meta-data dictionary

Return type

dict

property name

Returns the name of the package.

Returns

the name

Return type

str

property url

Returns the URL of the package.

Returns

the url

Return type

str

property version

Returns the version of the package.

Returns

the version

Return type

str

class weka.core.packages.PackageConstraint(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.PackageConstraint class.

check_constraint(pkge=None, constr=None)

Checks the constraints.

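Parameters
  • pkge (Package) – the package to check

  • constr (PackageConstraint) – the package constraint to check

get_package()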

Returns the package.

Returns

the package

Return type

Package

set_package(pkge)

Sets the package.

Parameters

pkge (Package) – the package

weka.core.packages.all_package(name)

Returns Package object for the specified package (either installed or available). Returns None if not found.

Parameters

name (str) – the name of the package to retrieve

Returns

the package information, None if not available

Return type

Package

weka.core.packages.all_packages()

Returns a list of all packages.

Returns

the list of packages

Return type

list

weka.core.packages.available_package(name)

Returns Package object for the specified, available package. Returns None if not available.

Parameters

name (str) – the name of the available package to retrieve

Returns

the package information

Return type

Package

weka.core.packages.available_packages()

Returns a list of all packages that aren’t installed yet.

Returns

the list of packages

Return type

list

weka.core.packages.establish_cache()

Establishes the package cache if necessary.

weka.core.packages.install_missing_package(pkge, version='Latest', quiet=False, stop_jvm_and_exit=False)

Installs the package if not yet installed.

Parameters
  • pkge (str) – the name of the repository package, a URL (http/https) or a zip file

  • version (str) – in case of the repository packages, the version

  • quiet (bool) – whether to suppress console output and only print error messages

  • stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type

tuple

weka.core.packages.install_missing_packages(pkges, quiet=False, stop_jvm_and_exit=False)

Installs the missing packages.

Parameters
  • pkges (list) – the packages to install: a list of tuples (packagename, version) or strings (packagename only, LATEST is assumed for the version); use “Latest” or the LATEST constant to grab the latest version

  • quiet (bool) – whether to suppress console output and only print error messages

  • stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type

tuple
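
The mixed list accepted for pkges (plain names or name/version tuples) can be normalized as sketched below. normalize_packages is a hypothetical helper and the package names are placeholders; only the "Latest" constant is taken from this module.

```python
LATEST = "Latest"  # same value as weka.core.packages.LATEST


def normalize_packages(pkges):
    """Turn a mix of names and (name, version) tuples into (name, version) tuples."""
    result = []
    for entry in pkges:
        if isinstance(entry, str):
            result.append((entry, LATEST))  # bare name: latest version assumed
        else:
            name, version = entry
            result.append((name, version))
    return result


wanted = normalize_packages(["packageA", ("packageB", "1.2.0")])
print(wanted)
```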

weka.core.packages.install_package(pkge, version='Latest', details=False)

Installs the specified package.

Parameters
  • pkge (str) – the name of the repository package, a URL (http/https) or a zip file

  • version (str) – in case of the repository packages, the version

  • details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (from_repo, version, error, install_message, success)

Returns

whether successfully installed or dict with detailed information

Return type

bool or dict

weka.core.packages.install_packages(pkges, fail_fast=True, details=False)

Installs the specified packages. In fail_fast mode, the first package that fails to install stops the installation process. Otherwise, an attempt is made to install every package.

The details dictionary uses the package name, URL or file path as the key and stores the following information in a dict as the value:
  • from_repo (bool): whether installed from the repository or as an “unofficial” package (ie URL or local file)

  • version (str): the version that was attempted to be installed (if applicable)

  • error (str): any error message that was encountered

  • install_message (str): any installation message that got returned when installing from URL or zip file

  • success (bool): whether successfully installed or not

Parameters
  • pkges (list) – the list of packages to install (name of the repository package, a URL (http/https) or a zip file), if tuple must be name/version

  • fail_fast (bool) – whether to quit the installation of packages with the first package that fails (True) or whether to attempt to install all packages (False)

  • details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (per package: from_repo, version, error, install_message, success)

Returns

whether successfully installed or detailed information

Return type

bool or dict
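
When details=True, the result has to be inspected per package. The sketch below works on a hand-written dictionary shaped like the description above; the package names, versions and error text are made up, and failed_packages is a hypothetical helper.

```python
# Hypothetical result, shaped like the per-package details described above.
details = {
    "packageA": {"from_repo": True, "version": "1.0.0", "error": None,
                 "install_message": None, "success": True},
    "/tmp/packageB.zip": {"from_repo": False, "version": None,
                          "error": "file not found", "install_message": None,
                          "success": False},
}


def failed_packages(details):
    """Map each failed package key to its error message."""
    return {key: info["error"] for key, info in details.items() if not info["success"]}


failures = failed_packages(details)
all_ok = not failures
print(all_ok, failures)
```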

weka.core.packages.installed_package(name)

Returns Package object for the specified, installed package. Returns None if not installed.

Parameters

name (str) – the name of the installed package to retrieve

Returns

the package information

Return type

Package

weka.core.packages.installed_packages()

Returns a list of the installed packages.

Returns

the list of packages

Return type

list

weka.core.packages.is_installed(name, version=None)

Checks whether a package with the name is already installed.

Parameters
  • name (str) – the name of the package

  • version (str) – the version to check as well, ignored if None

Returns

whether the package is installed

Return type

bool

weka.core.packages.main(args=None)

Performs the specified package operation from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

Parameters

args (list) – the command-line arguments to use, uses sys.argv if None

weka.core.packages.refresh_cache()

Refreshes the cache.

weka.core.packages.suggest_package(name, exact=False)

Suggests package(s) for the given name (classname, package name). Matching can be either exact or just a substring.

Parameters
  • name (str) – the name to look for

  • exact (bool) – whether to perform exact matching or substring matching

Returns

list of matching package names

Return type

list

weka.core.packages.sys_main()

Runs the main function using the system command-line arguments and returns a system exit code.

Returns

0 for success, 1 for failure.

Return type

int
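
The documented contract (0 for success, 1 for failure) follows a common entry-point pattern. The sketch below illustrates that pattern only; it is not the library's source, and run_as_exit_code is a hypothetical helper name.

```python
import traceback


def run_as_exit_code(main_func):
    """Call main_func and map its outcome to a process exit code."""
    try:
        main_func()
        return 0
    except Exception:
        # Report the failure on stderr and signal it through the exit code.
        traceback.print_exc()
        return 1
```

A script would typically pass such a return value to sys.exit(), for example sys.exit(sys_main()).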

weka.core.packages.uninstall_package(name)

Uninstalls a package.

Parameters

name (str) – the name of the package

weka.core.packages.uninstall_packages(names)

Uninstalls multiple packages.

Parameters

names (list) – the names of the packages to uninstall

weka.core.serialization module

weka.core.stemmers module

class weka.core.stemmers.Stemmer(classname='weka.core.stemmers.NullStemmer', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for stemmers.

stem(s)

Performs stemming on the string.

Parameters

s (str) – the string to stem

Returns

the stemmed string

Return type

str

weka.core.stopwords module

class weka.core.stopwords.Stopwords(classname='weka.core.stopwords.Null', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for stopwords handlers.

is_stopword(s)

Checks whether a string is a stopword.

Parameters

s (str) – the string to check

Returns

True if a stopword

Return type

bool

weka.core.tokenizers module

class weka.core.tokenizers.TokenIterator(tokenizer)

Bases: object

Iterator for string tokens.

class weka.core.tokenizers.Tokenizer(classname='weka.core.tokenizers.AlphabeticTokenizer', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for tokenizers.

tokenize(s)

Tokenizes the string.

Parameters

s (str) – the string to tokenize

Returns

the iterator

Return type

TokenIterator
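
The tokenizer, stopwords and stemmer wrappers can be combined into a small text-cleaning step. The sketch below is an illustration under assumptions: clean_tokens is a hypothetical helper, it relies only on the documented methods tokenize(s), is_stopword(s) and stem(s), and it assumes that the object returned by tokenize can be iterated with a for loop.

```python
def clean_tokens(text, tokenizer, stopwords, stemmer):
    """Tokenize text, drop stopwords, and stem the remaining tokens."""
    result = []
    for token in tokenizer.tokenize(text):
        if stopwords.is_stopword(token):
            continue  # skip stopwords entirely
        result.append(stemmer.stem(token))
    return result
```

With the JVM started (weka.core.jvm.start()), the arguments could be Tokenizer(), Stopwords() and Stemmer() instances created with their default class names. Any objects offering the same three methods work as well, which keeps the helper easy to test.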

weka.core.typeconv module

weka.core.typeconv.float_to_jfloat(d)

Turns the Python float into a Java java.lang.Float object.

Parameters

d (float) – the Python float

Returns

the Float object

Return type

JB_Object

weka.core.typeconv.jdouble_matrix_to_ndarray(m)

Turns the Java matrix (2-dim array) of doubles into a numpy 2-dim array.

Parameters

m (JB_Object) – the double matrix

Returns

Numpy array

Return type

numpy.ndarray

weka.core.typeconv.jdouble_to_float(d)

Turns the Java java.lang.Double object into a Python float.

Parameters

d (JB_Object) – the java.lang.Double

Returns

the Python float

Return type

float

weka.core.typeconv.jenumeration_to_list(enm)

Turns the java.util.Enumeration into a list.

Parameters

enm (JB_Object) – the enumeration to convert

Returns

the list

Return type

list

weka.core.typeconv.jstring_array_to_list(a)

Turns the Java string array into a Python unicode string list.

Parameters

a (JB_Object) – the string array to convert

Returns

the string list

Return type

list

weka.core.typeconv.jstring_list_to_string_list(l, return_empty_if_none=True)

Converts a Java java.util.List containing strings into a Python list.

Parameters
  • l (JB_Object) – the list to convert

  • return_empty_if_none (bool) – whether to return an empty list or None when list object is None

Returns

the list with UTF strings

Return type

list

weka.core.typeconv.string_list_to_jarray(l)

Turns a Python unicode string list into a Java String array.

Parameters

l (list) – the string list

Returns

the Java string array

Return type

JB_Object

weka.core.version module

weka.core.version.weka_version()

Determines the version of Weka in use.

Returns

the version

Return type

str

Module contents