weka.core package

weka.core.capabilities module

class weka.core.capabilities.Capabilities(jobject=None, owner=None)

Bases: JavaObject

Wrapper for Capabilities.

attribute_capabilities()

Returns all the attribute capabilities.

Returns:

attribute capabilities

Return type:

Capabilities

capabilities()

Returns all the capabilities.

Returns:

all capabilities

Return type:

list

class_capabilities()

Returns all the class capabilities.

Returns:

class capabilities

Return type:

Capabilities

dependencies()

Returns all the dependencies.

Returns:

the dependency list

Return type:

list

disable(capability)

Disables the specified capability.

Parameters:

capability (Capability) – the capability to disable

disable_all()

Disables all capabilities.

disable_all_attribute_dependencies()

Disables all attribute dependencies.

disable_all_attributes()

Disables all attributes.

disable_all_class_dependencies()

Disables all class dependencies.

disable_all_classes()

Disables all classes.

disable_dependency(capability)

Disables the dependency of the given capability. Disabling NOMINAL_ATTRIBUTES also disables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:

capability (Capability) – the dependency to disable

enable(capability)

Enables the specified capability.

Parameters:

capability (Capability) – the capability to enable

enable_all()

Enables all capabilities.

enable_all_attribute_dependencies()

Enables all attribute dependencies.

enable_all_attributes()

Enables all attributes.

enable_all_class_dependencies()

Enables all class dependencies.

enable_all_classes()

Enables all classes.

enable_dependency(capability)

Enables the dependency of the given capability. Enabling NOMINAL_ATTRIBUTES also enables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:

capability (Capability) – the dependency to enable

classmethod for_instances(data, multi=None)

Returns a Capabilities object specific to this data. The minimum number of instances is not set; the check for multi-instance data is optional.

Parameters:
  • data (Instances) – the data to generate the capabilities for

  • multi (bool) – whether to check the structure, too

Returns:

the generated capabilities

Return type:

Capabilities

handles(capability)

Returns whether the specified capability is set.

Parameters:

capability (Capability) – the capability to check

Returns:

whether the capability is set

Return type:

bool

has_dependencies()

Returns whether any dependencies are set.

Returns:

whether any dependencies are set

Return type:

bool

has_dependency(capability)

Returns whether the specified dependency is set.

Parameters:

capability (Capability) – the capability to check

Returns:

whether the dependency is set

Return type:

bool

property min_instances

Returns the minimum number of instances that must be supported.

Returns:

the minimum number

Return type:

int

other_capabilities()

Returns all other capabilities.

Returns:

all other capabilities

Return type:

Capabilities

property owner

Returns the owner of these capabilities, if any.

Returns:

the owner, can be None

Return type:

JavaObject

supports(capabilities)

Returns true if the currently set capabilities support at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters:

capabilities (Capabilities) – the capabilities to check

Returns:

whether the current capabilities support at least the specified ones

Return type:

bool

supports_maybe(capabilities)

Returns true if the currently set capabilities support (or have a dependency) at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters:

capabilities (Capabilities) – the capabilities to check

Returns:

whether the current capabilities (potentially) support the specified ones

Return type:

bool
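The difference between supports() and supports_maybe() can be pictured with plain sets. The following is a minimal sketch, not the wrapper's or Weka's implementation: CapabilitySet is a hypothetical stand-in class, and capabilities are represented as strings rather than Capability enum members.

```python
# Illustrative model only: plain sets stand in for Weka's Capabilities object.
from dataclasses import dataclass, field


@dataclass
class CapabilitySet:
    """Hypothetical stand-in holding enabled capabilities and dependencies."""
    enabled: set = field(default_factory=set)
    dependencies: set = field(default_factory=set)

    def supports(self, other: "CapabilitySet") -> bool:
        # Every capability enabled in `other` must be enabled here.
        return other.enabled <= self.enabled

    def supports_maybe(self, other: "CapabilitySet") -> bool:
        # A capability also counts if it is only registered as a dependency.
        return other.enabled <= (self.enabled | self.dependencies)


scheme = CapabilitySet(
    enabled={"NUMERIC_ATTRIBUTES", "NOMINAL_CLASS"},
    dependencies={"NOMINAL_ATTRIBUTES"},
)
data = CapabilitySet(enabled={"NOMINAL_ATTRIBUTES", "NOMINAL_CLASS"})

print(scheme.supports(data))        # False
print(scheme.supports_maybe(data))  # True
```

In this model a dependency means "possibly supported, depending on configuration", which is why only supports_maybe() accepts the nominal attributes.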

test_attribute(att, is_class=None, fail=False)

Tests whether the attribute meets the conditions.

Parameters:
  • att (Attribute) – the Attribute to test

  • is_class (bool) – whether this attribute is the class attribute

  • fail (bool) – whether to fail with an exception in case the test fails

Returns:

whether the attribute meets the conditions

Return type:

bool

test_instances(data, from_index=None, to_index=None, fail=False)

Tests whether the dataset meets the conditions.

Parameters:
  • data (Instances) – the Instances to test

  • from_index (int) – the first attribute to include

  • to_index (int) – the last attribute to include

  • fail (bool) – whether to fail with an exception in case the test fails

Returns:

whether the dataset meets the requirements

Return type:

bool

class weka.core.capabilities.Capability(jobject=None, member=None)

Bases: Enum

Wrapper for a Capability.

property is_attribute

Returns whether this capability is an attribute.

Returns:

whether it is an attribute

Return type:

bool

property is_attribute_capability

Returns whether this capability is an attribute capability.

Returns:

whether it is an attribute capability

Return type:

bool

property is_class

Returns whether this capability is a class.

Returns:

whether it is a class

Return type:

bool

property is_class_capability

Returns whether this capability is a class capability.

Returns:

whether it is a class capability

Return type:

bool

property is_other_capability

Returns whether this capability is an other capability.

Returns:

whether it is an other capability

Return type:

bool

weka.core.classes module

class weka.core.classes.AbstractParameter(classname=None, jobject=None, options=None)

Bases: OptionHandler

Ancestor for all parameter classes used by SetupGenerator and MultiSearch.

property prop

Returns the currently set property to apply the parameter to.

Returns:

the property

Return type:

str

class weka.core.classes.Date(jobject=None, msecs=None)

Bases: JavaObject

Wraps a java.util.Date object.

property time

Returns the stored milli-seconds.

Returns:

the milli-seconds

Return type:

long

class weka.core.classes.Enum(jobject=None, enum=None, member=None)

Bases: JavaObject

Wrapper for Java enums.

property name

Returns the name of the enum member.

Returns:

the name

Return type:

str

property ordinal

Returns the ordinal of the enum member.

Returns:

the ordinal

Return type:

int

property values

Returns list of all enum members.

Returns:

all enum members

Return type:

list

class weka.core.classes.Environment(jobject=None)

Bases: JavaObject

Wraps around weka.core.Environment

add_variable(key, value, system_wide=False)

Adds the environment variable.

Parameters:
  • key (str) – the name of the variable

  • value (str) – the value

  • system_wide (bool) – whether to add the variable system wide

remove_variable(key)

Removes the environment variable.

Parameters:

key (str) – the name of the variable

classmethod system_wide()

Returns the system-wide environment.

Returns:

the environment

Return type:

Environment

variable_names()

Returns the names of all environment variables.

Returns:

the names of the variables

Return type:

list

variable_value(key)

Returns the value of the environment variable.

Parameters:

key (str) – the name of the variable

Returns:

the variable value

Return type:

str

class weka.core.classes.JavaArray(jobject)

Bases: JavaObject

Convenience wrapper around Java arrays.

component_type()

Returns the classname of the elements.

Returns:

the class of the elements

Return type:

str

classmethod new_array(classname, length)

Creates a new array with the given classname and length; initial values are null.

Parameters:
  • classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

  • length (int) – the length of the array

Returns:

the Java array

Return type:

JPype object

class weka.core.classes.JavaArrayIterator(data)

Bases: object

Iterator for elements in a Java array.

class weka.core.classes.JavaObject(jobject)

Bases: JSONObject

Basic Java object.

classmethod check_type(jobject, intf_or_class)

Returns whether the object implements the specified interface or is a subclass.

Parameters:
  • jobject (JPype object) – the Java object to check

  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

Returns:

whether object implements interface or is subclass

Return type:

bool

property classname

Returns the Java classname in dot-notation.

Returns:

the Java classname

Return type:

str

classmethod enforce_type(jobject, intf_or_class)

Raises an exception if the object does not implement the specified interface or is not a subclass.

Parameters:
  • jobject (JPype object) – the Java object to check

  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters:

d (dict) – the object dictionary

Returns:

the object

Return type:

object

get_property(path)

Attempts to get the value (jobject, a Java object) of the provided (bean) property path.

Parameters:

path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair

Returns:

the wrapped Java object

Return type:

JavaObject

property is_serializable

Returns true if the object is serializable.

Returns:

true if serializable

Return type:

bool

property jclass

Returns the Java class object of the underlying Java object.

Returns:

the Java class

Return type:

JClass

property jwrapper

DEPRECATED: use self.jobject directly, as it is already wrapped.

Returns the encapsulated Java object, giving access to methods using dot notation.

Returns:

the JPype object

classmethod new_instance(classname, options=None)

Creates a new object from the given classname using the default constructor, None in case of error.

Parameters:
  • classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

  • options (list) – the list of options to use, ignored if None

Returns:

the Java object

Return type:

JPype object

set_property(path, jobject)

Attempts to set the value (jobject, a Java object) of the provided (bean) property path.

Parameters:
  • path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair

  • jobject (JPype object) – the Java object to set; if instance of JavaObject class, the jobject member is automatically used

to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:

the object dictionary

Return type:

dict

class weka.core.classes.ListParameter(jobject=None, options=None)

Bases: AbstractParameter

Parameter using a predefined list of values, used by SetupGenerator and MultiSearch.

property values

Returns the currently set values.

Returns:

the list of values (strings)

Return type:

list

class weka.core.classes.MathParameter(jobject=None, options=None)

Bases: AbstractParameter

Parameter using a math expression for generating values, used by SetupGenerator and MultiSearch.

property base

Returns the currently set base value.

Returns:

the base

Return type:

float

property expression

Returns the currently set expression.

Returns:

the expression

Return type:

str

property maximum

Returns the currently set maximum value.

Returns:

the maximum

Return type:

float

property minimum

Returns the currently set minimum value.

Returns:

the minimum

Return type:

float

property step

Returns the currently set step value.

Returns:

the step

Return type:

float
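How the minimum, maximum, step, base and expression properties combine is easiest to see with numbers. The sketch below is an assumption about the semantics, not the wrapper's code: it assumes the expression is evaluated once per index running from minimum to maximum in increments of step. math_parameter_values is a hypothetical helper, and a Python callable replaces the expression string.

```python
import math


def math_parameter_values(minimum, maximum, step, base, expression):
    """Evaluate expression(base, i) for i = minimum, minimum + step, ... up to maximum."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Count the steps up front so float rounding does not accumulate.
    count = int(math.floor((maximum - minimum) / step + 1e-9)) + 1
    return [expression(base, minimum + k * step) for k in range(count)]


# Assumed example: powers of 10 for the exponents -2, -1, 0, 1, 2.
values = math_parameter_values(-2, 2, 1, 10.0, lambda b, i: math.pow(b, i))
print(values)
```

Under these assumptions, a logarithmic grid needs only the exponent range and the base, with the expression supplying the transformation.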

class weka.core.classes.Option(jobject)

Bases: JavaObject

Wrapper for the weka.core.Option class.

property description

Returns the description of the option.

Returns:

the description

Return type:

str

property name

Returns the name of the option.

Returns:

the name

Return type:

str

property num_arguments

Returns the number of arguments of the option.

Returns:

the number of arguments

Return type:

int

property synopsis

Returns the synopsis of the option.

Returns:

the synopsis

Return type:

str

class weka.core.classes.OptionHandler(jobject, options=None)

Bases: JavaObject, Configurable

Ancestor for option-handling classes. Classes should implement the weka.core.OptionHandler interface to have any effect.

description()

Returns a description of the object.

Returns:

the description

Return type:

str

classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters:

d (dict) – the object dictionary

Returns:

the object

Return type:

object

global_info()

Returns the globalInfo() result, None if not available.

Return type:

str

property options

Obtains the currently set options as list.

Returns:

the list of options

Return type:

list

to_commandline()

Generates a commandline string from the JavaObject instance.

Returns:

the commandline string

Return type:

str

to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:

the object dictionary

Return type:

dict

to_help(title=True, description=True, options=True, use_headers=True, separator='')

Returns a string that contains the ‘global_info’ text and the options.

Parameters:
  • title (bool) – whether to output a title

  • description (bool) – whether to output the description

  • options (bool) – whether to output the options

  • use_headers (bool) – whether to output headers, describing the sections

  • separator (str) – the separator line to use between sections

Returns:

the generated help string

Return type:

str

class weka.core.classes.Random(seed)

Bases: JavaObject

Wrapper for the java.util.Random class.

next_double()

Next random double.

Returns:

the next random double

Return type:

double

next_int(n=None)

Next random integer. If n is provided, the value lies between 0 and n-1.

Parameters:

n (int) – the upper limit (minus 1) for the random integer

Returns:

the next random integer

Return type:

int

class weka.core.classes.Range(jobject=None, ranges=None)

Bases: JavaObject

Wrapper for a Weka Range object.

property invert

Returns whether the range is inverted.

Returns:

true if inverted

Return type:

bool

property ranges

Returns the string range.

Returns:

the string range of 1-based indices

Return type:

str

selection()

Returns the selection list.

Returns:

the list of 0-based integer indices

Return type:

list

upper(upper)

Sets the upper limit.

Parameters:

upper (int) – the upper limit
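The ranges property uses 1-based indices while selection() returns 0-based ones. The conversion can be sketched in plain Python as follows. This is an illustration, not Weka's parser: selection_from_range is a hypothetical helper, upper is assumed to be the largest valid 0-based index, and only numbers, "first" and "last" are handled (inversion is not).

```python
def selection_from_range(ranges, upper):
    """Expand a 1-based range string such as "1-3,5,last" into 0-based indices."""

    def to_index(token):
        token = token.strip()
        if token == "first":
            return 0
        if token == "last":
            return upper
        return int(token) - 1  # 1-based string to 0-based index

    selected = set()
    for part in ranges.split(","):
        if not part.strip():
            continue
        if "-" in part:
            start, end = (to_index(t) for t in part.split("-", 1))
        else:
            start = end = to_index(part)
        # Keep only indices inside the valid window [0, upper].
        selected.update(i for i in range(start, end + 1) if 0 <= i <= upper)
    return sorted(selected)


print(selection_from_range("1-3,5", upper=9))     # [0, 1, 2, 4]
print(selection_from_range("first,last", upper=4))  # [0, 4]
```

This also shows why the upper limit must be set before asking for the selection: "last" has no meaning until the number of items is known.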

class weka.core.classes.SelectedTag(jobject=None, tag_id=None, tag_text=None, tags=None)

Bases: JavaObject

Wrapper for the weka.core.SelectedTag class.

property selected

Returns the selected tag.

Returns:

the tag

Return type:

Tag

property tags

Returns the associated tags.

Returns:

the list of Tag objects

Return type:

list

class weka.core.classes.SetupGenerator(jobject=None, options=None)

Bases: OptionHandler

Allows generation of a large number of setups using parameter setups.

property base_object

Returns the base object to apply the setups to.

Returns:

the base object

Return type:

JavaObject or OptionHandler

property parameters

Returns the list of currently set search parameters.

Returns:

the list of AbstractParameter objects

Return type:

list

setups()

Generates and returns all the setups according to the parameter search space.

Returns:

the list of configured objects (of type JavaObject)

Return type:

list

class weka.core.classes.SingleIndex(jobject=None, index=None)

Bases: JavaObject

Wrapper for a Weka SingleIndex object.

index()

Returns the integer index.

Returns:

the 0-based integer index

Return type:

int

property single_index

Returns the string index.

Returns:

the 1-based string index

Return type:

str

upper(upper)

Sets the upper limit.

Parameters:

upper (int) – the upper limit

class weka.core.classes.Tag(jobject=None, ident=None, ident_str='', readable='', uppercase=True)

Bases: JavaObject

Wrapper for the weka.core.Tag class.

property ident

Returns the current integer ID of the tag.

Returns:

the integer ID

Return type:

int

property identstr

Returns the current ID string.

Returns:

the ID string

Return type:

str

property readable

Returns the ‘human readable’ string.

Returns:

the readable string

Return type:

str

class weka.core.classes.Tags(jobject=None, tags=None)

Bases: JavaObject

Wrapper for an array of weka.core.Tag objects.

find(name)

Returns the Tag that matches the name.

Parameters:

name (str) – the string representation of the tag

Returns:

the tag, None if not found

Return type:

Tag

classmethod get_object_tags(javaobject, methodname)

Instantiates the Tag array obtained from the object using the specified method name.

Example:

    cls = Classifier(classname="weka.classifiers.meta.MultiSearch")
    tags = Tags.get_object_tags(cls, "getMetricsTags")

Parameters:
  • javaobject (JavaObject) – the javaobject to obtain the tags from

  • methodname (str) – the method name returning the Tag array

Returns:

the Tags objects

Return type:

Tags

classmethod get_tags(classname, field)

Instantiates the Tag array located in the specified class with the given field name.

Example:

    tags = Tags.get_tags("weka.classifiers.functions.SMO", "TAGS_FILTER")

Parameters:
  • classname (str) – the classname in which the tags reside

  • field (str) – the field name of the Tag array

Returns:

the Tags objects

Return type:

Tags

weka.core.classes.backquote(s)

Backquotes the string.

Parameters:

s (str) – the string to process

Returns:

the backquoted string

Return type:

str

weka.core.classes.complete_classname(classname)

Attempts to complete a partial classname like ‘.J48’ and returns the full classname if a single match was found, otherwise an exception is raised.

Parameters:

classname (str) – the partial classname to expand

Returns:

the full classname

Return type:

str
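The "single match or exception" rule can be sketched as a suffix match over a list of known classnames. This is an illustration only: complete_classname_sketch and the hard-coded list are hypothetical, and the real function looks up its candidates from the class/package suggestions.

```python
def complete_classname_sketch(partial, known_classnames):
    """Return the single known classname ending with `partial`, else raise."""
    matches = [c for c in known_classnames if c.endswith(partial)]
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} matches for {partial!r}: {matches}")
    return matches[0]


# Assumed candidate list, used for illustration only.
known = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.filters.supervised.attribute.Discretize",
    "weka.filters.unsupervised.attribute.Discretize",
]

print(complete_classname_sketch(".J48", known))  # weka.classifiers.trees.J48

try:
    complete_classname_sketch(".Discretize", known)
except ValueError as e:
    print("ambiguous:", e)
```

A partial name that is shared by several packages, such as ".Discretize" here, has to be extended (for example to ".supervised.attribute.Discretize") until it is unique.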

weka.core.classes.deepcopy(obj)

Creates a deep copy of the JavaObject (or derived class) or JPype object.

Parameters:

obj (object) – the object to create a copy of

Returns:

the copy, None if failed to copy

Return type:

object

weka.core.classes.from_byte_array(array)

Deserializes Java objects from the numpy array.

Parameters:

array (ndarray) – the numpy array to deserialize the Java objects from

Returns:

the list of deserialized JPype object instances

Return type:

list

weka.core.classes.from_commandline(cmdline, classname=None)

Creates an OptionHandler based on the provided commandline string.

Parameters:
  • cmdline (str) – the commandline string to use

  • classname (str) – the classname of the wrapper to return other than OptionHandler (in dot-notation)

Returns:

the generated option handler instance

Return type:

object

weka.core.classes.get_classname(obj)

Returns the classname of the JPype object, Python class or object.

Parameters:

obj (object) – the java object or Python class/object to get the classname for

Returns:

the classname

Return type:

str

weka.core.classes.get_enum(classname, enm)

Returns the instance of the enum.

Parameters:
  • classname (str) – the classname of the enum

  • enm (str) – the name of the enum element to return

Returns:

the enum instance

Return type:

JPype object

weka.core.classes.get_jclass(classname)

Returns the Java class object associated with the dot-notation classname. Also supports the Java primitives: boolean, byte, short, int, long, float, double, char.

Parameters:

classname (str) – the classname

Returns:

the class object

Return type:

JPype object

weka.core.classes.get_static_field(classname, fieldname)

Returns the Java object associated with the static field of the specified class.

Parameters:
  • classname (str) – the classname of the class to get the field from

  • fieldname (str) – the name of the field to retrieve

Returns:

the object

Return type:

JPype object

weka.core.classes.help_for(classname, title=True, description=True, options=True, use_headers=True, separator='')

Generates a help screen for the specified class.

Parameters:
  • classname (str) – the class to get the help screen for, must implement the OptionHandler interface

  • title (bool) – whether to output a title

  • description (bool) – whether to output the description

  • options (bool) – whether to output the options

  • use_headers (bool) – whether to output headers, describing the sections

  • separator (str) – the separator line to use between sections

Returns:

the help screen, None if not available

Return type:

str

weka.core.classes.is_array(obj)

Checks whether the Java object is an array.

Parameters:

obj (JPype object) – the Java object to check

Returns:

whether the object is an array

Return type:

bool

weka.core.classes.is_instance_of(obj, class_or_intf_name)

Checks whether the Java object implements the specified interface or is a subclass of the superclass.

Parameters:
  • obj (JPype object) – the Java object to check

  • class_or_intf_name (str) – the superclass or interface to check, dot notation or with forward slashes

Returns:

true if either implements interface or subclass of superclass

Return type:

bool

weka.core.classes.join_options(options)

Turns the list of options back into a single commandline string.

Parameters:

options (list) – the list of options to process

Returns:

the combined options

Return type:

str

weka.core.classes.list_property_names(obj)

Lists the property names (Bean properties, ie read/write method pair) of the Java object.

Parameters:

obj (JPype object or JavaObject) – the object to inspect

Returns:

the list of property names

Return type:

list

weka.core.classes.load_suggestions()

Loads the class/package suggestions, if necessary.

weka.core.classes.main()

Runs a classifier from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

weka.core.classes.new_array(classname, length)

Creates a new array of the specified class and length.

Parameters:
  • classname (str) – the type of the array

  • length (int) – the length of the array

Returns:

the generated array

weka.core.classes.new_instance(classname)

Instantiates an object of the specified class. Does not raise an Exception if it fails to do so (opposed to JavaObject.new_array).

Parameters:

classname (str) – the name of the class to instantiate

Returns:

the object, None if failed to instantiate

Return type:

JPype object

weka.core.classes.quote(s)

Quotes the string if necessary.

Parameters:

s (str) – the string to process

Returns:

the quoted string

Return type:

str

weka.core.classes.serialization_read(filename)

Reads the serialized object from disk. Caller must wrap object in appropriate Python wrapper class.

Parameters:

filename (str) – the file with the serialized object

Returns:

the JPype object

Return type:

JPype object

weka.core.classes.serialization_read_all(filename)

Reads the serialized objects from disk. Caller must wrap objects in appropriate Python wrapper classes.

Parameters:

filename (str) – the file with the serialized objects

Returns:

the list of JPype objects

Return type:

list

weka.core.classes.serialization_write(filename, jobject)

Serializes the object to disk. JavaObject instances get automatically unwrapped.

Parameters:
  • filename (str) – the file to serialize the object to

  • jobject (JPype object or JavaObject) – the object to serialize

weka.core.classes.serialization_write_all(filename, jobjects)

Serializes the list of objects to disk. JavaObject instances get automatically unwrapped.

Parameters:
  • filename (str) – the file to serialize the object to

  • jobjects (list) – the list of objects to serialize

weka.core.classes.split_commandline(cmdline)

Splits the commandline string into classname and options list.

Parameters:

cmdline (str) – the commandline string to split

Returns:

the tuple of classname and options list

Return type:

tuple
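The shape of the returned tuple can be approximated with Python's standard shlex module. This is a sketch, not the function's actual implementation: split_commandline_sketch is a hypothetical name, and shlex's quoting rules are assumed to be close enough to Weka's for simple commandlines.

```python
import shlex


def split_commandline_sketch(cmdline):
    """Split a commandline into (classname, options); quoted options stay intact."""
    tokens = shlex.split(cmdline)
    if not tokens:
        return "", []
    return tokens[0], tokens[1:]


cmdline = (
    "weka.classifiers.meta.FilteredClassifier "
    '-F "weka.filters.unsupervised.attribute.Remove -R 1" '
    "-W weka.classifiers.trees.J48"
)
classname, options = split_commandline_sketch(cmdline)
print(classname)
print(options)
```

The quoted filter specification stays a single list element, so nested commandlines survive the split and can be split again at the next level.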

weka.core.classes.split_options(cmdline)

Splits the commandline into a list of options.

Parameters:

cmdline (str) – the commandline string to split into individual options

Returns:

the split list of commandline options

Return type:

list

weka.core.classes.suggest_package(name, exact=False)

Suggests package(s) for the given name (classname, package name). Matching can be either exact or just a substring.

Parameters:
  • name (str) – the name to look for

  • exact (bool) – whether to perform exact matching or substring matching

Returns:

list of matching package names

Return type:

list

weka.core.classes.suggestions = None

dictionary for class -> package relation

weka.core.classes.to_byte_array(jobjects)

Serializes the list of objects into a numpy array.

Parameters:

jobjects (list) – the list of objects to serialize

Returns:

the numpy array

Return type:

ndarray

weka.core.classes.to_commandline(optionhandler)

Generates a commandline string from the OptionHandler instance.

Parameters:

optionhandler (OptionHandler) – the OptionHandler instance to turn into a commandline

Returns:

the commandline string

Return type:

str

weka.core.classes.unbackquote(s)

Un-backquotes the string.

Parameters:

s (str) – the string to process

Returns:

the un-backquoted string

Return type:

str

weka.core.classes.unquote(s)

Un-quotes the string.

Parameters:

s (str) – the string to process

Returns:

the un-quoted string

Return type:

str

weka.core.converters module

class weka.core.converters.IncrementalLoaderIterator(loader, structure)

Bases: object

Iterator for dataset rows when loading incrementally.

class weka.core.converters.Loader(classname='weka.core.converters.ArffLoader', jobject=None, options=None)

Bases: OptionHandler

Wrapper class for Loaders.

load_file(dfile, incremental=False, class_index=None)

Loads the specified file and returns the Instances object. In case of incremental loading, only the structure.

Parameters:
  • dfile (str) – the file to load

  • incremental (bool) – whether to load the dataset incrementally

  • class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns:

the full dataset or the header (if incremental)

Return type:

Instances

Raises:

Exception – if the file does not exist

load_url(url, incremental=False)

Loads the specified URL and returns the Instances object. In case of incremental loading, only the structure.

Parameters:
  • url (str) – the URL to load the data from

  • incremental (bool) – whether to load the dataset incrementally

Returns:

the full dataset or the header (if incremental)

Return type:

Instances

class weka.core.converters.Saver(classname='weka.core.converters.ArffSaver', jobject=None, options=None)

Bases: OptionHandler

Wrapper class for Savers.

capabilities()

Returns the capabilities of the saver.

Returns:

the capabilities

Return type:

Capabilities

save_file(data, dfile)

Saves the Instances object in the specified file.

Parameters:
  • data (Instances) – the data to save

  • dfile (str) – the file to save the data to

class weka.core.converters.TextDirectoryLoader(jobject=None, options=None)

Bases: OptionHandler

Wrapper class for TextDirectoryLoader.

load()

Loads the text files from the specified directory and returns the Instances object. In case of incremental loading, only the structure.

Returns:

the full dataset or the header (if incremental)

Return type:

Instances

weka.core.converters.load_any_file(filename, class_index=None)

Determines a Loader based on the file extension. If successful, loads the full dataset and returns it.

Parameters:
  • filename (str) – the name of the file to load

  • class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns:

the full dataset

Return type:

Instances

weka.core.converters.load_csv_file(filename, dialect='excel', delimiter=',', quotechar='"', num_cols=None, nom_cols=None)

Loads a CSV file using the Python csv module and then converts it to an Instances object. Better at reading CSV files than Weka’s built-in CSVLoader. String attributes can be converted to nominal ones using the weka.filters.unsupervised.attribute.StringToNominal filter.

Parameters:
  • filename (str) – the name of the CSV file to load

  • dialect (str) – the type of CSV file to load

  • delimiter (str) – the field delimiter

  • quotechar (str) – the character used for quoting cells

  • quoting – how the quoting works

  • num_cols (list) – the list of 0-based column indices that are numeric, default for cols is str

  • nom_cols (list) – the list of 0-based column indices that are nominal, default for cols is str

weka.core.converters.loader_for_file(filename)

Returns a Loader that can load the specified file, based on the file extension. None if failed to determine.

Parameters:

filename (str) – the filename to get the loader for

Returns:

the associated loader instance or None if none found

Return type:

Loader

weka.core.converters.ndarray_to_instances(array, relation, att_template='Att-#', att_list=None)

Converts the numpy matrix into an Instances object and returns it.

Parameters:
  • array (numpy.ndarray) – the numpy ndarray to convert

  • relation (str) – the name of the dataset

  • att_template (str) – the prefix to use for the attribute names, “#” is the 1-based index, “!” is the 0-based index, “@” the relation name

  • att_list (list) – the list of attribute names to use

Returns:

the generated instances object

Return type:

Instances
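The placeholders in att_template can be illustrated without NumPy or Weka. The sketch below only mimics the naming rule described above ("#" 1-based index, "!" 0-based index, "@" relation name). attribute_names is a hypothetical helper, not part of the library.

```python
def attribute_names(num_attributes, relation, att_template="Att-#"):
    """Expand the template once per column, substituting the placeholders."""
    names = []
    for index in range(num_attributes):
        substitutions = {"#": str(index + 1), "!": str(index), "@": relation}
        # Substitute character by character so inserted text is never re-expanded.
        names.append("".join(substitutions.get(ch, ch) for ch in att_template))
    return names


print(attribute_names(3, "iris"))         # ['Att-1', 'Att-2', 'Att-3']
print(attribute_names(2, "iris", "@_!"))  # ['iris_0', 'iris_1']
```

When att_list is supplied to ndarray_to_instances, explicit names are given instead, so a template like this is only relevant for automatically generated names.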

weka.core.converters.save_any_file(data, filename)

Determines a Saver based on the file extension. Returns whether successfully saved.

Parameters:
  • filename (str) – the name of the file to save

  • data (Instances) – the data to save

Returns:

whether successfully saved

Return type:

bool

weka.core.converters.saver_for_file(filename)

Returns a Saver that can save the specified file, based on the file extension. None if failed to determine.

Parameters:

filename (str) – the filename to get the saver for

Returns:

the associated saver instance or None if none found

Return type:

Saver

weka.core.database module

class weka.core.database.DatabaseUtils(jobject=None, options=None)

Bases: OptionHandler

Wrapper class for weka.experiment.DatabaseUtils.

property db_url

Obtains the currently set database URL.

Returns:

the database URL

Return type:

str

property password

Obtains the currently set database password.

Returns:

the database password

Return type:

str

property user

Obtains the currently set database user.

Returns:

the database user

Return type:

str

class weka.core.database.InstanceQuery(jobject=None, options=None)

Bases: DatabaseUtils

Wrapper class for weka.experiment.InstanceQuery.

property custom_properties

Obtains the currently set custom properties file.

Returns:

the custom properties file

Return type:

str

property query

Obtains the current SQL query to execute.

Returns:

the SQL query

Return type:

str

retrieve_instances(query=None)

Executes either the supplied query or the one set via options (or the ‘query’ property).

Parameters:

query (str) – query to execute if not the currently set one

Returns:

the generated data

Return type:

Instances

property sparse_data

Obtains whether sparse data is returned or not.

Returns:

whether sparse data is generated

Return type:

bool

weka.core.dataset module

class weka.core.dataset.Attribute(jobject)

Bases: JavaObject

Wrapper class for weka.core.Attribute.

add_relation(instances)

Adds the relation value, returns the index.

Parameters:

instances (Instances) – the Instances object to add

Returns:

the index

Return type:

int

add_string_value(s)

Adds the string value, returns the index.

Parameters:

s (str) – the string to add

Returns:

the index

Return type:

int

copy(name=None)

Creates a copy of this attribute.

Parameters:

name (str) – the new name, uses the old one if None

Returns:

the copy of the attribute

Return type:

Attribute

classmethod create_date(name, formt="yyyy-MM-dd'T'HH:mm:ss")

Creates a date attribute.

Parameters:
  • name (str) – the name of the attribute

  • formt (str) – the date format, see Javadoc for java.text.SimpleDateFormat

classmethod create_nominal(name, labels)

Creates a nominal attribute.

Parameters:
  • name (str) – the name of the attribute

  • labels (list) – the list of string labels to use

classmethod create_numeric(name)

Creates a numeric attribute.

Parameters:

name (str) – the name of the attribute

classmethod create_relational(name, inst)

Creates a relational attribute.

Parameters:
  • name (str) – the name of the attribute

  • inst (Instances) – the structure of the relational attribute

classmethod create_string(name)

Creates a string attribute.

Parameters:

name (str) – the name of the attribute

property date_format

Returns the format of this date attribute. See java.text.SimpleDateFormat Javadoc.

Returns:

the format string

Return type:

str

equals(att)

Checks whether this attribute is the same as the provided one.

Parameters:

att (Attribute) – the Attribute to check against

Returns:

whether the same

Return type:

bool

equals_msg(att)

Checks whether this attribute is the same as the provided one. Returns None if the same, otherwise an error message.

Parameters:

att (Attribute) – the Attribute to check against

Returns:

None if the same, otherwise error message

Return type:

str

property index

Returns the index of this attribute.

Returns:

the index

Return type:

int

index_of(label)

Returns the index of the label in this attribute.

Parameters:

label (str) – the string label to get the index for

Returns:

the 0-based index

Return type:

int

property is_averagable

Returns whether the attribute is averagable.

Returns:

whether averagable

Return type:

bool

property is_date

Returns whether the attribute is a date one.

Returns:

whether date attribute

Return type:

bool

is_in_range(value)

Checks whether the value is within the bounds of the numeric attribute.

Parameters:

value (float) – the numeric value to check

Returns:

whether between lower and upper bound

Return type:

bool

property is_nominal

Returns whether the attribute is a nominal one.

Returns:

whether nominal attribute

Return type:

bool

property is_numeric

Returns whether the attribute is a numeric one (date or numeric).

Returns:

whether numeric attribute

Return type:

bool

property is_relation_valued

Returns whether the attribute is a relation valued one.

Returns:

whether relation valued attribute

Return type:

bool

property is_string

Returns whether the attribute is a string attribute.

Returns:

whether string attribute

Return type:

bool

property lower_numeric_bound

Returns the lower numeric bound of the numeric attribute.

Returns:

the lower bound

Return type:

float

property name

Returns the name of the attribute.

Returns:

the name

Return type:

str

property num_values

Returns the number of labels.

Returns:

the number of labels

Return type:

int

property ordering

Returns the ordering of the attribute.

Returns:

the ordering (ORDERING_SYMBOLIC, ORDERING_ORDERED, ORDERING_MODULO)

Return type:

int

parse_date(s)

Parses the date string and returns the internal format value.

Parameters:

s (str) – the date string

Returns:

the internal format

Return type:

float

property type

Returns the type of the attribute. See weka.core.Attribute Javadoc.

Returns:

the type

Return type:

int

type_str(short=False)

Returns the type of the attribute as string.

Returns:

the type

Return type:

str

property upper_numeric_bound

Returns the upper numeric bound of the numeric attribute.

Returns:

the upper bound

Return type:

float

value(index)

Returns the label for the index.

Parameters:

index (int) – the 0-based index of the label to return

Returns:

the label

Return type:

str

property values

Returns the labels, strings or relation-values.

Returns:

all the values, None if not NOMINAL, STRING, or RELATION

Return type:

list

property weight

Returns the weight of the attribute.

Returns:

the weight

Return type:

float

class weka.core.dataset.AttributeIterator(data)

Bases: object

Iterator for attributes in an Instances object.

class weka.core.dataset.AttributeStats(jobject)

Bases: JavaObject

Container for attribute statistics.

property distinct_count

The number of distinct values.

Returns:

The number of distinct values

Return type:

int

property int_count

The number of int-like values.

Returns:

The number of int-like values

Return type:

int

property missing_count

The number of missing values.

Returns:

The number of missing values

Return type:

int

property nominal_counts

Counts of each nominal value.

Returns:

Counts of each nominal value

Return type:

ndarray

property nominal_weights

Weight mass for each nominal value.

Returns:

Weight mass for each nominal value

Return type:

ndarray

property numeric_stats

Stats on numeric value distributions.

Returns:

Stats on numeric value distributions

Return type:

NumericStats

property total_count

The total number of values.

Returns:

The total number of values

Return type:

int

property unique_count

The number of values that only appear once.

Returns:

The number of values that only appear once

Return type:

int

class weka.core.dataset.Instance(jobject)

Bases: JavaObject

Wrapper class for weka.core.Instance.

property class_attribute

Returns the currently set class attribute.

Returns:

the class attribute

Return type:

Attribute

property class_index

Returns the currently set class index.

Returns:

the class index, -1 if not set

Return type:

int

classmethod create_instance(values, classname='weka.core.DenseInstance', weight=1.0)

Creates a new instance.

Parameters:
  • values (ndarray or list) – the float values (internal format) to use, numpy array or list.

  • classname (str) – the classname of the instance (eg weka.core.DenseInstance).

  • weight (float) – the weight of the instance

classmethod create_sparse_instance(values, max_values, classname='weka.core.SparseInstance', weight=1.0)

Creates a new sparse instance.

Parameters:
  • values (list) – the list of tuples (0-based index and internal format float). The indices of the tuples must be in ascending order and “max_values” must be set to the maximum number of attributes in the dataset.

  • max_values (int) – the maximum number of attributes

  • classname (str) – the classname of the instance (eg weka.core.SparseInstance).

  • weight (float) – the weight of the instance
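The tuple format expected by create_sparse_instance can be produced from a dense row with a small helper. The sketch below is not part of the library; to_sparse_values and create_sparse are hypothetical names, and skipping zeros relies on the usual sparse convention that values not stored are treated as 0.

```python
def to_sparse_values(dense):
    """Converts a dense list of floats into (index, value) tuples.

    Zeros are skipped and the tuples come out in ascending index order,
    which is the ordering create_sparse_instance asks for.
    """
    return [(i, float(v)) for i, v in enumerate(dense) if v != 0]


def create_sparse(dense):
    """Builds a sparse instance from a dense list (requires a running JVM)."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.dataset import Instance

    # max_values is the total number of attributes, not the number of tuples
    return Instance.create_sparse_instance(to_sparse_values(dense), len(dense))


dense_row = [0.0, 2.5, 0.0, 0.0, 1.0]
print(to_sparse_values(dense_row))  # [(1, 2.5), (4, 1.0)]
```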

property dataset

Returns the dataset that this instance belongs to.

Returns:

the dataset or None if no dataset set

Return type:

Instances

get_relational_value(index)

Returns the relational value at the specified position (0-based).

Parameters:

index (int) – the 0-based index of the internal value

Returns:

the relational value

Return type:

Instances

get_string_value(index)

Returns the string value at the specified position (0-based).

Parameters:

index (int) – the 0-based index of the internal value

Returns:

the string value

Return type:

str

get_value(index)

Returns the internal value at the specified position (0-based).

Parameters:

index (int) – the 0-based index of the internal value

Returns:

the internal value

Return type:

float

has_class()

Returns whether a class attribute is set (convenience method).

Returns:

whether a class attribute is currently set

Return type:

bool

has_missing()

Returns whether at least one attribute has a missing value.

Returns:

whether at least one value is missing

Return type:

bool

is_missing(index)

Returns whether the attribute at the specified index is missing.

Parameters:

index (int) – the 0-based index of the attribute

Returns:

whether the value is missing

Return type:

bool

classmethod missing_value()

Returns the numeric value that represents a missing value in Weka (NaN).

Returns:

missing value

Return type:

float

property num_attributes

Returns the number of attributes.

Returns:

the number of attributes

Return type:

int

property num_classes

Returns the number of class labels.

Returns:

the number of class labels

Return type:

int

set_missing(index)

Sets the attribute at the specified index to missing.

Parameters:

index (int) – the 0-based index of the attribute

set_string_value(index, s)

Sets the string value at the specified position (0-based).

Parameters:
  • index (int) – the 0-based index of the internal value

  • s (str) – the string value

set_value(index, value)

Sets the internal value at the specified position (0-based).

Parameters:
  • index (int) – the 0-based index of the attribute

  • value (float) – the internal float value to set

to_numpy(internal=False)

Turns the instance into a numpy matrix.

Parameters:

internal (bool) – whether to return the internal format

Returns:

the dataset as matrix with single row

Return type:

np.ndarray

property values

Returns the internal values of this instance.

Returns:

the values as numpy array

Return type:

ndarray

property weight

Returns the currently set weight.

Returns:

the weight

Return type:

float

class weka.core.dataset.InstanceIterator(data)

Bases: object

Iterator for rows in an Instances object.

class weka.core.dataset.InstanceValueIterator(data)

Bases: object

Iterator for values in an Instance object.

class weka.core.dataset.Instances(jobject)

Bases: JavaObject

Wrapper class for weka.core.Instances.

add_instance(inst, index=None)

Adds the specified instance to the dataset.

Parameters:
  • inst (Instance) – the Instance to add

  • index (int) – the 0-based index where to add the Instance

classmethod append_instances(inst1, inst2)

Merges the two datasets (one-after-the-other). Throws an exception if the datasets aren’t compatible.

Parameters:
  • inst1 (Instances) – the first dataset

  • inst2 (Instances) – the second dataset

Returns:

the combined dataset

Return type:

Instances

attribute(index)

Returns the specified attribute.

Parameters:

index (int) – the 0-based index of the attribute

Returns:

the attribute

Return type:

Attribute

attribute_by_name(name)

Returns the specified attribute, None if not found.

Parameters:

name (str) – the name of the attribute

Returns:

the attribute or None

Return type:

Attribute

attribute_names()

Returns a list of all the attribute names.

Returns:

list of attribute names

Return type:

list

attribute_stats(index)

Returns the specified attribute statistics.

Parameters:

index (int) – the 0-based index of the attribute

Returns:

the attribute statistics

Return type:

AttributeStats

attributes()

Returns an iterator over the attributes.

property class_attribute

Returns the currently set class attribute.

Returns:

the class attribute

Return type:

Attribute

property class_index

Returns the currently set class index (0-based).

Returns:

the class index, -1 if not set

Return type:

int

class_is_first()

Sets the first attribute as class attribute (convenience method).

class_is_last()

Sets the last attribute as class attribute (convenience method).

compactify()

Compactifies the set of instances.

classmethod copy_instances(dataset, from_row=None, num_rows=None)

Creates a copy of the Instances. If either from_row or num_rows is None, all of the data is copied.

Parameters:
  • dataset (Instances) – the original dataset

  • from_row (int) – the 0-based start index of the rows to copy

  • num_rows (int) – the number of rows to copy

Returns:

the copy of the data

Return type:

Instances

copy_structure()

Returns a copy of the dataset structure.

Returns:

the structure of the dataset

Return type:

Instances

classmethod create_instances(name, atts, capacity)

Creates a new Instances.

Parameters:
  • name (str) – the relation name

  • atts (list of Attribute) – the list of attributes to use for the dataset

  • capacity (int) – how many data rows to reserve initially (see compactify)

Returns:

the dataset

Return type:

Instances
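A minimal sketch of building a dataset from scratch with the Attribute, Instance and Instances class methods documented in this module. The attribute names, labels and rows are made up for illustration, and the JVM is assumed to be running when build_dataset is called.

```python
LABELS = ["no", "yes"]


def encode_row(temperature, play):
    """Encodes one row in the internal format: a nominal label is stored
    as the 0-based index of the label, as a float."""
    return [float(temperature), float(LABELS.index(play))]


def build_dataset(rows):
    """Creates an Instances object from (temperature, play) tuples."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.dataset import Attribute, Instance, Instances

    atts = [
        Attribute.create_numeric("temperature"),
        Attribute.create_nominal("play", LABELS),
    ]
    # capacity only reserves space, rows still have to be added
    data = Instances.create_instances("weather", atts, len(rows))
    for temperature, play in rows:
        data.add_instance(Instance.create_instance(encode_row(temperature, play)))
    data.class_is_last()
    return data


ROWS = [(21.5, "yes"), (30.0, "no"), (18.2, "yes")]
print(encode_row(*ROWS[0]))  # [21.5, 1.0]
```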

cv_splits(folds=10, rnd=None, stratify=True)

Generates a list of train/test pairs used in cross-validation. Creates a copy of the dataset beforehand when randomizing.

Parameters:
  • folds (int) – the number of folds to use, >= 2

  • rnd (Random) – the random number generator to use for randomization, skips randomization if None

  • stratify (bool) – whether to stratify the data after randomization

Returns:

the list of train/test split tuples

Return type:

list

delete(index=None)

Removes either the specified Instance or all Instance objects.

Parameters:

index (int) – the 0-based index of the instance to remove

delete_attribute(index)

Deletes an attribute at the given position.

Parameters:

index (int) – the 0-based index of the attribute to remove

delete_attribute_type(typ)

Deletes all attributes of the given type in the dataset.

Parameters:

typ (int) – the attribute type to remove, see weka.core.Attribute Javadoc

delete_first_attribute()

Deletes the first attribute.

delete_last_attribute()

Deletes the last attribute.

delete_with_missing(index)

Deletes all rows that have a missing value at the specified attribute index.

Parameters:

index (int) – the attribute index to check for missing values

equal_headers(inst)

Compares this dataset against the given one in terms of attributes.

Parameters:

inst (Instances) – the dataset to compare against

Returns:

None if the same, otherwise an error message

Return type:

str

get_instance(index)

Returns the Instance object at the specified location.

Parameters:

index (int) – the 0-based index of the instance

Returns:

the instance

Return type:

Instance

has_class()

Returns whether a class attribute is set (convenience method).

Returns:

whether a class attribute is currently set

Return type:

bool

insert_attribute(att, index)

Inserts the attribute at the specified location.

Parameters:
  • att (Attribute) – the attribute to insert

  • index (int) – the index to insert the attribute at

classmethod merge_instances(inst1, inst2)

Merges the two datasets (side-by-side).

Parameters:
  • inst1 (Instances) – the first dataset

  • inst2 (Instances) – the second dataset

Returns:

the combined dataset

Return type:

Instances

no_class()

Unsets the class attribute (convenience method).

property num_attributes

Returns the number of attributes.

Returns:

the number of attributes

Return type:

int

property num_instances

Returns the number of instances.

Returns:

the number of instances

Return type:

int

randomize(random)

Randomizes the dataset using the random number generator.

Parameters:

random (Random) – the random number generator to use

property relationname

Returns the name of the dataset.

Returns:

the name

Return type:

str

set_instance(index, inst)

Sets the Instance at the specified location in the dataset.

Parameters:
  • index (int) – the 0-based index of the instance to replace

  • inst (Instance) – the Instance to set

Returns:

the instance

Return type:

Instance

sort(index)

Sorts the dataset using the specified attribute index.

Parameters:

index (int) – the index of the attribute

stratify(folds)

Stratifies the data after randomization for nominal class attributes.

Parameters:

folds (int) – the number of folds to perform the stratification for

subset(col_range=None, col_names=None, invert_cols=False, row_range=None, invert_rows=False, keep_relationame=False)

Returns a subset of attributes/rows of the Instances object. If neither attributes nor rows have been specified, a copy of the dataset gets returned. The inverse of the specified cols/rows can be returned by setting invert_cols and/or invert_rows to True. The method uses the weka.filters.unsupervised.attribute.Remove and weka.filters.unsupervised.instance.RemoveRange filters under the hood.

Parameters:
  • col_range (str) – the subset of attributes to return (eg ‘1-3,7-12,67-last’), None for all

  • col_names (list) – the list of attributes to return (list of names; case-sensitive), takes precedence over col_range

  • invert_cols (bool) – whether to invert the returned attributes

  • row_range (str) – the subset of rows to return (eg ‘1-3,7-12,67-last’), None for all

  • invert_rows (bool) – whether to invert the returned rows

  • keep_relationame (bool) – whether to keep the original relation name

Returns:

the subset

Return type:

Instances
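The range strings accepted by col_range and row_range can be easy to misread because they are 1-based, while most other indices in this module are 0-based. The sketch below is not the library's parser; parse_range is a hypothetical helper that only illustrates the syntax, assuming the first/last keywords and 1-based positions used by Weka range expressions.

```python
def parse_range(expr, total):
    """Expands a 1-based range string such as '1-3,7-last' into a list
    of 0-based indices, for a dataset with 'total' columns or rows."""

    def position(token):
        token = token.strip()
        if token == "first":
            return 1
        if token == "last":
            return total
        return int(token)

    indices = []
    for part in expr.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            lo, hi = position(start), position(end)
        else:
            lo = hi = position(part)
        # shift from 1-based inclusive bounds to 0-based indices
        indices.extend(range(lo - 1, hi))
    return indices


print(parse_range("1-3,5,7-last", 8))  # [0, 1, 2, 4, 6, 7]

# usage with a loaded dataset (requires a running JVM):
# first_three_cols = data.subset(col_range="1-3")
# without_header_rows = data.subset(row_range="1-10", invert_rows=True)
```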

classmethod summary(inst)

Generates a summary of the dataset.

Parameters:

inst (Instances) – the dataset

Returns:

the summary

Return type:

str

classmethod template_instances(dataset, capacity=0)

Uses the Instances as template to create an empty dataset.

Parameters:
  • dataset (Instances) – the original dataset

  • capacity (int) – how many data rows to reserve initially (see compactify)

Returns:

the empty dataset

Return type:

Instances

test_cv(num_folds, fold)

Generates a test fold for cross-validation.

Parameters:
  • num_folds (int) – the number of folds of cross-validation, eg 10

  • fold (int) – the current fold (0-based)

Returns:

the test fold

Return type:

Instances

to_numpy(internal=False)

Turns the dataset into a numpy matrix.

Parameters:

internal (bool) – whether to return the internal format

Returns:

the dataset as matrix

Return type:

np.ndarray

train_cv(num_folds, fold, random=None)

Generates a training fold for cross-validation.

Parameters:
  • num_folds (int) – the number of folds of cross-validation, eg 10

  • fold (int) – the current fold (0-based)

  • random (Random) – the random number generator

Returns:

the training fold

Return type:

Instances
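A sketch of a manual cross-validation loop built from randomize, stratify, train_cv and test_cv. The Random wrapper from weka.core.classes is assumed to take a seed, and the fold-size helper reflects the assumption that Weka hands the remainder of an uneven division to the first folds; neither detail is stated in this section.

```python
def expected_test_sizes(num_instances, num_folds):
    """Test fold sizes, assuming the first (n mod k) folds get one extra row."""
    base, extra = divmod(num_instances, num_folds)
    return [base + (1 if fold < extra else 0) for fold in range(num_folds)]


def manual_folds(data, num_folds=10, seed=1):
    """Yields (train, test) pairs; requires a running JVM."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.classes import Random
    from weka.core.dataset import Instances

    # work on a copy, randomize() changes the row order in place
    data = Instances.copy_instances(data)
    data.randomize(Random(seed))
    if data.has_class() and data.class_attribute.is_nominal:
        data.stratify(num_folds)
    for fold in range(num_folds):
        yield data.train_cv(num_folds, fold), data.test_cv(num_folds, fold)


print(expected_test_sizes(10, 3))  # [4, 3, 3]
```

When no manual control over the folds is needed, cv_splits covers the same ground in a single call.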

train_test_split(percentage, rnd=None)

Generates a train/test split. Creates a copy of the dataset first before applying randomization.

Parameters:
  • percentage (double) – the percentage split to use (amount to use for training; 0-100)

  • rnd (Random) – the random number generator to use, if None the order gets preserved

Returns:

the train/test splits

Return type:

tuple
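A short sketch of a percentage split. The arithmetic in split_sizes is an assumption about how the percentage maps to row counts (rounded training size, remainder for testing), and Random from weka.core.classes is assumed to accept a seed.

```python
def split_sizes(num_instances, percentage):
    """Returns (train_size, test_size) for a percentage between 0 and 100."""
    train = int(round(num_instances * percentage / 100.0))
    return train, num_instances - train


def split(data, percentage=66.0, seed=1):
    """Returns (train, test); requires a running JVM."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.classes import Random

    # passing None instead of a Random would preserve the row order
    train, test = data.train_test_split(percentage, Random(seed))
    return train, test


print(split_sizes(150, 66.0))  # (99, 51)
```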

values(index)

Returns the internal values of this attribute from all the instance objects.

Returns:

the values as numpy array

Return type:

np.ndarray

class weka.core.dataset.Stats(jobject)

Bases: JavaObject

Container for numeric attribute stats.

property count

The number of values seen.

Returns:

The number of values seen

Return type:

float

property max

The maximum value seen, or Double.NaN if no values seen.

Returns:

The maximum value seen, or Double.NaN if no values seen

Return type:

float

property mean

The mean of values at the last calculateDerived() call.

Returns:

The mean of values at the last calculateDerived() call

Return type:

float

property min

The minimum value seen, or Double.NaN if no values seen.

Returns:

The minimum value seen, or Double.NaN if no values seen

Return type:

float

property stddev

The std deviation of values at the last calculateDerived() call.

Returns:

The std deviation of values at the last calculateDerived() call

Return type:

float

property sum

The sum of values seen.

Returns:

The sum of values seen

Return type:

float

property sumsq

The sum of values squared seen.

Returns:

The sum of values squared seen

Return type:

float

weka.core.dataset.check_col_names_unique(cols_x, col_y=None)

Checks whether the column names are unique (a requirement for Instances objects).

Parameters:
  • cols_x (list) – the column names for the input variables

  • col_y (str) – the optional name for the output variable

Returns:

None if check passed, otherwise error message

Return type:

str

weka.core.dataset.create_instances_from_lists(x, y=None, name='data', cols_x=None, col_y=None, nominal_x=None, nominal_y=False)

Allows the generation of an Instances object from a list of lists for X and a list for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. None values are interpreted as missing values.

Parameters:
  • x (list of list) – the input variables (row wise)

  • y (list) – the output variable (optional)

  • name (str) – the name of the dataset

  • cols_x (list) – the column names to use

  • col_y (str) – the column name to use for the output variable (y)

  • nominal_x (list) – the list of 0-based column indices to treat as nominal ones, ignored if None

  • nominal_y (bool) – whether the y column is to be treated as nominal

Returns:

the generated dataset

Return type:

Instances
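A minimal sketch of turning plain Python lists into a dataset. The data and column names are invented for illustration; the None entry shows how a missing value would be passed, and the JVM is assumed to be running when lists_to_instances is called.

```python
# three rows, two input columns; None marks a missing value
X = [[1.0, 0], [3.0, 1], [None, 0]]
Y = [1, 0, 1]
COLS_X = ["size", "flag"]
COL_Y = "label"


def lists_to_instances():
    """Creates an Instances object from X and Y (requires a running JVM)."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.dataset import (
        check_col_names_unique,
        create_instances_from_lists,
    )

    # returns None if all names are unique, otherwise an error message
    msg = check_col_names_unique(COLS_X, COL_Y)
    if msg is not None:
        raise ValueError(msg)
    return create_instances_from_lists(
        X, Y, name="demo", cols_x=COLS_X, col_y=COL_Y,
        nominal_x=[1],   # treat the 0/1 column as nominal
        nominal_y=True,  # treat the output variable as nominal
    )
```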

weka.core.dataset.create_instances_from_matrices(x, y=None, name='data', cols_x=None, col_y=None, nominal_x=None, nominal_y=False)

Allows the generation of an Instances object from a 2-dimensional matrix for X and a 1-dimensional matrix for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. nan values are interpreted as missing values.

Parameters:
  • x (np.ndarray) – the input variables

  • y (np.ndarray) – the output variable (optional)

  • name (str) – the name of the dataset

  • cols_x (list) – the column names to use

  • col_y (str) – the column name to use for the output variable (y)

  • nominal_x (list) – the list of 0-based column indices to treat as nominal ones, ignored if None

  • nominal_y (bool) – whether the y column is to be treated as nominal

Returns:

the generated dataset

Return type:

Instances

weka.core.dataset.missing_value()

Returns the value that represents missing values in Weka (NaN).

Returns:

missing value

Return type:

float
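Because missing values are represented by NaN, they cannot be detected with an equality comparison. The sketch below uses a plain Python NaN as a stand-in for the value returned by missing_value(), so it runs without a JVM; count_missing is a hypothetical helper.

```python
import math

# stand-in for weka.core.dataset.missing_value(), documented to be NaN
MISSING = float("nan")


def count_missing(values):
    """Counts the entries that would be treated as missing."""
    return sum(1 for v in values if math.isnan(v))


row = [1.0, MISSING, 3.5, MISSING]
print(count_missing(row))   # 2
print(MISSING == MISSING)   # False, NaN never compares equal to itself
```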

weka.core.distances module

class weka.core.distances.DistanceFunction(classname='weka.core.EuclideanDistance', jobject=None, options=None)

Bases: OptionHandler

Wrapper for Weka’s weka.core.DistanceFunction interface.

property attribute_indices

Returns the attribute indices in use.

Returns:

the attribute indices

Return type:

str

distance(first, second, cutoff=None)

Computes the distance between the two Instance objects.

Parameters:
  • first (Instance) – the first instance

  • second (Instance) – the second instance

  • cutoff (float) – optional cutoff value to speed up calculation

Returns:

the calculated distance

Return type:

float

property instances

Returns the dataset in use.

Returns:

the dataset

Return type:

Instances
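A sketch of computing the distance between two rows of a dataset. It assumes the instances property can be assigned (only the getter is documented here). The pure Python euclidean function is included only as an unnormalized reference; Weka's EuclideanDistance normalizes attribute values by default, so the two results will generally differ.

```python
import math


def euclidean(a, b):
    """Plain, unnormalized Euclidean distance between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def row_distance(data, i, j):
    """Distance between rows i and j of the dataset (requires a running JVM)."""
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.distances import DistanceFunction

    dist = DistanceFunction(classname="weka.core.EuclideanDistance")
    # assumption: the property is writable; the function needs the dataset
    # to know the attribute ranges
    dist.instances = data
    return dist.distance(data.get_instance(i), data.get_instance(j))


print(euclidean([0.0, 0.0], [3.0, 4.0]))  # 5.0
```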

weka.core.jvm module

weka.core.jvm.add_bundled_jars(cp)

Adds the bundled jars to the JVM’s classpath.

Parameters:

cp (list) – the list to append the classpath to

weka.core.jvm.add_system_classpath(cp)

Adds the system’s classpath to the JVM’s classpath.

Parameters:

cp (list) – the list to append the classpath to

weka.core.jvm.automatically_install_packages = None

whether to automatically install missing packages

weka.core.jvm.lib_dir()

Returns the “lib” directory path.

Returns:

the path to the “lib” directory

Return type:

str

weka.core.jvm.start(class_path=None, bundled=True, packages=False, system_cp=False, max_heap_size=None, system_info=False, auto_install=False, logging_level=10)

Initializes the jpype connection (starts up the JVM).

Parameters:
  • class_path (list) – the additional classpath elements to add

  • bundled (bool) – whether to add jars from the “lib” directory

  • packages (bool or str) – whether to add jars from Weka packages as well (bool) or an alternative Weka home directory (str)

  • system_cp (bool) – whether to add the system classpath as well

  • max_heap_size (str) – the maximum heap size (-Xmx parameter, eg 512m or 4g)

  • system_info (bool) – whether to print the system info (generated by weka.core.SystemInfo)

  • auto_install (bool) – whether to automatically install missing Weka packages (based on suggestions); in conjunction with package support

  • logging_level (int) – the logging level to use for this module, e.g., logging.DEBUG or logging.INFO

weka.core.jvm.started = None

whether the JVM has been started

weka.core.jvm.stop()

Kills the JVM.
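Since start() and stop() bracket all other calls, wrapping them in a context manager makes sure the JVM is shut down even if an error occurs. This is a sketch, not part of the library; weka_jvm is a hypothetical helper and the option values are examples. A JVM that has been stopped generally cannot be started again within the same process, so the block should enclose all Weka work.

```python
from contextlib import contextmanager


@contextmanager
def weka_jvm(**start_options):
    """Starts the JVM, yields the jvm module, and always stops the JVM."""
    # imported lazily, so this module loads without python-weka-wrapper3
    import weka.core.jvm as jvm

    jvm.start(**start_options)
    try:
        yield jvm
    finally:
        jvm.stop()


# example options, using parameter names from the start() signature
DEFAULT_OPTIONS = {"packages": True, "max_heap_size": "512m"}

# usage:
# with weka_jvm(**DEFAULT_OPTIONS):
#     ...  # load data, build models, etc.
```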

weka.core.jvm.with_package_support = None

whether JVM was started with package support

weka.core.packages module

class weka.core.packages.Dependency(jobject)

Bases: JavaObject

Wrapper for the weka.core.packageManagement.Dependency class.

property source

Returns the source package.

Returns:

the package

Return type:

Package

property target

Returns the target package constraint.

Returns:

the package constraint

Return type:

PackageConstraint

weka.core.packages.LATEST = 'Latest'

Constant for the latest version of a package

class weka.core.packages.Package(jobject)

Bases: JavaObject

Wrapper for the weka.core.packageManagement.Package class.

as_dict()

Turns the package information into a dictionary. Not to be confused with ‘to_dict’!

Returns:

the package information as dictionary

Return type:

dict

property dependencies

Returns the dependencies of the package.

Returns:

the list of Dependency objects

Return type:

list of Dependency

install()

Installs the package.

property is_installed

Returns whether the package is installed.

Returns:

whether installed

Return type:

bool

property metadata

Returns the meta-data.

Returns:

the meta-data dictionary

Return type:

dict

property name

Returns the name of the package.

Returns:

the name

Return type:

str

property url

Returns the URL of the package.

Returns:

the url

Return type:

str

property version

Returns the version of the package.

Returns:

the version

Return type:

str

class weka.core.packages.PackageConstraint(jobject)

Bases: JavaObject

Wrapper for the weka.core.packageManagement.PackageConstraint class.

check_constraint(pkge=None, constr=None)

Checks the constraints.

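Parameters:
  • pkge (Package) – the package to check

  • constr (PackageConstraint) – the package constraint to check

get_package()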

Returns the package.

Returns:

the package

Return type:

Package

set_package(pkge)

Sets the package.

Parameters:

pkge (Package) – the package

weka.core.packages.all_package(name)

Returns Package object for the specified package (either installed or available). Returns None if not found.

Parameters:

name (str) – the name of the package to retrieve

Returns:

the package information, None if not available

Return type:

Package

weka.core.packages.all_packages()

Returns a list of all packages.

Returns:

the list of packages

Return type:

list

weka.core.packages.available_package(name)

Returns Package object for the specified, available package. Returns None if not available.

Parameters:

name (str) – the name of the available package to retrieve

Returns:

the package information

Return type:

Package

weka.core.packages.available_packages()

Returns a list of all packages that aren’t installed yet.

Returns:

the list of packages

Return type:

list

weka.core.packages.establish_cache()

Establishes the package cache if necessary.

weka.core.packages.install_missing_package(pkge, version='Latest', quiet=False, stop_jvm_and_exit=False)

Installs the package if not yet installed.

Parameters:
  • pkge (str) – the name of the repository package, a URL (http/https) or a zip file

  • version (str) – in case of the repository packages, the version

  • quiet (bool) – whether to suppress console output and only print error messages

  • stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns:

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type:

tuple
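The returned tuple needs a little interpretation: a successful call may still require a JVM restart before the package can be used. A minimal sketch, assuming the JVM was started with package support; interpret and ensure_package are hypothetical helpers.

```python
def interpret(success, exit_required):
    """Maps the (success, exit_required) tuple to a short status string."""
    if not success:
        return "failed"
    # a freshly installed package only becomes usable after a JVM restart
    return "restart" if exit_required else "ready"


def ensure_package(name, version="Latest"):
    """Installs the package if missing and reports what to do next.

    Requires a JVM started with packages=True.
    """
    # imported lazily, so this module loads without python-weka-wrapper3
    from weka.core.packages import install_missing_package

    success, exit_required = install_missing_package(name, version=version)
    return interpret(success, exit_required)


print(interpret(True, True))  # restart
```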

weka.core.packages.install_missing_packages(pkges, quiet=False, stop_jvm_and_exit=False)

Installs the missing packages.

Parameters:
  • pkges (list) – the packages to install: list of tuples (packagename, version) or strings (packagename, LATEST is assumed for the version), use “Latest” or the LATEST constant to grab the latest version

  • quiet (bool) – whether to suppress console output and only print error messages

  • stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns:

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type:

tuple

weka.core.packages.install_package(pkge, version='Latest', details=False)

Installs the specified package.

Parameters:
  • pkge (str) – the name of the repository package, a URL (http/https) or a zip file

  • version (str) – in case of the repository packages, the version

  • details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (from_repo, version, error, install_message, success)

Returns:

whether successfully installed or dict with detailed information

Return type:

bool or dict

weka.core.packages.install_packages(pkges, fail_fast=True, details=False)

Installs the specified packages. In fail_fast mode, the first package that fails to install stops the installation process. Otherwise, installation is attempted for all packages.

The details dictionary uses the package name, URL or file path as the key and stores the following information in a dict as value:
  • from_repo (bool): whether installed from repo or “unofficial” package (ie URL or local file)

  • version (str): the version that was attempted to be installed (if applicable)

  • error (str): any error message that was encountered

  • install_message (str): any installation message that got returned when installing from URL or zip file

  • success (bool): whether successfully installed or not

Parameters:
  • pkges (list) – the list of packages to install (name of the repository package, a URL (http/https) or a zip file), if tuple must be name/version

  • fail_fast (bool) – whether to quit the installation of packages with the first package that fails (True) or whether to attempt to install all packages (False)

  • details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (per package: from_repo, version, error, install_message, success)

Returns:

whether successfully installed or detailed information

Return type:

bool or dict
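With details=True, the result can be post-processed in plain Python. The sketch below works on a hand-written dictionary that follows the documented keys; the package names, versions and messages are invented for illustration.

```python
def failed_installs(details):
    """Returns {package: error message} for every entry that did not succeed."""
    return {
        key: info.get("error")
        for key, info in details.items()
        if not info.get("success")
    }


# invented example in the documented shape of the details dictionary
example_details = {
    "somePackage": {
        "from_repo": True, "version": "1.0.0", "error": None,
        "install_message": None, "success": True,
    },
    "/tmp/custom.zip": {
        "from_repo": False, "version": None, "error": "file not found",
        "install_message": None, "success": False,
    },
}

print(failed_installs(example_details))  # {'/tmp/custom.zip': 'file not found'}

# usage (requires a JVM started with package support):
# details = install_packages(["somePackage"], fail_fast=False, details=True)
```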

weka.core.packages.installed_package(name)

Returns Package object for the specified, installed package. Returns None if not installed.

Parameters:

name (str) – the name of the installed package to retrieve

Returns:

the package information

Return type:

Package

weka.core.packages.installed_packages()

Returns a list of the installed packages.

Returns:

the list of packages

Return type:

list

weka.core.packages.is_installed(name, version=None)

Checks whether a package with the name is already installed.

Parameters:
  • name (str) – the name of the package

  • version (str) – the version to check as well, ignored if None

Returns:

whether the package is installed

Return type:

bool

weka.core.packages.is_official_package(name, version=None)

Checks whether the package is an official one.

Parameters:
  • name (str) – the name of the package to check

  • version (str) – the specific version to check

Returns:

whether an official package or not

Return type:

bool

weka.core.packages.main(args=None)

Performs the specified package operation from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

Parameters:

args (list) – the command-line arguments to use, uses sys.argv if None

weka.core.packages.refresh_cache()

Refreshes the cache.

weka.core.packages.suggest_package(name, exact=False)

Suggests package(s) for the given name (classname, package name). Matching can be either exact or just a substring.

Parameters:
  • name (str) – the name to look for

  • exact (bool) – whether to perform exact matching or substring matching

Returns:

list of matching package names

Return type:

list

weka.core.packages.sys_main()

Runs the main function using the system cli arguments, and returns a system error code.

Returns:

0 for success, 1 for failure.

Return type:

int

weka.core.packages.uninstall_package(name)

Uninstalls a package.

Parameters:

name (str) – the name of the package

weka.core.packages.uninstall_packages(names)

Uninstalls multiple packages.

Parameters:

names (list) – the names of the packages

weka.core.serialization module

weka.core.stemmers module

class weka.core.stemmers.Stemmer(classname='weka.core.stemmers.NullStemmer', jobject=None, options=None)

Bases: OptionHandler

Wrapper class for stemmers.

stem(s)

Performs stemming on the string.

Parameters:

s (str) – the string to stem

Returns:

the stemmed string

Return type:

str

weka.core.stopwords module

class weka.core.stopwords.Stopwords(classname='weka.core.stopwords.Null', jobject=None, options=None)

Bases: OptionHandler

Wrapper class for stopwords handlers.

is_stopword(s)

Checks whether the string is a stopword.

Parameters:

s (str) – the string to check

Returns:

True if a stopword

Return type:

bool

weka.core.tokenizers module

class weka.core.tokenizers.TokenIterator(tokenizer)

Bases: object

Iterator for string tokens.

class weka.core.tokenizers.Tokenizer(classname='weka.core.tokenizers.AlphabeticTokenizer', jobject=None, options=None)

Bases: OptionHandler

Wrapper class for tokenizers.

tokenize(s)

Tokenizes the string.

Parameters:

s (str) – the string to tokenize

Returns:

the iterator

Return type:

TokenIterator
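The three text-processing wrappers (Tokenizer, Stopwords, Stemmer) are typically chained. The following is a minimal sketch of such a chain, not code from the library: preprocess is a hypothetical helper, and it assumes that the TokenIterator returned by tokenize can be consumed with a regular Python for loop. Only the documented methods tokenize(s), is_stopword(s) and stem(s) are used.

```python
def preprocess(text, tokenizer, stopwords, stemmer):
    """Tokenize text, drop stopwords, and stem the remaining tokens.

    tokenizer -- object with tokenize(s) returning an iterable of tokens
    stopwords -- object with is_stopword(s) returning a bool
    stemmer   -- object with stem(s) returning a str
    """
    result = []
    for token in tokenizer.tokenize(text):
        token = str(token)  # normalize to a plain Python string
        if stopwords.is_stopword(token):
            continue  # skip stopwords entirely
        result.append(stemmer.stem(token))
    return result
```

Because the helper relies only on method names, it should work with Tokenizer(), Stopwords() and Stemmer() instances created after the JVM has been started, or with simple stand-in objects for experimentation.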

weka.core.typeconv module

weka.core.typeconv.float_to_jfloat(d)

Turns the Python float into a Java java.lang.Float object.

Parameters:

d (float) – the Python float

Returns:

the Float object

Return type:

JPype object

weka.core.typeconv.from_jobject_array(a)

Converts the java object array into a list.

Parameters:

a – the java array to convert

Returns:

the generated list

weka.core.typeconv.jdouble_array_to_ndarray(a)

Turns the Java array of doubles into a numpy array.

Parameters:

a (JPype object) – the double array

Returns:

Numpy array

Return type:

numpy.ndarray

weka.core.typeconv.jdouble_matrix_to_ndarray(m)

Turns the Java matrix (2-dim array) of doubles into a numpy 2-dim array.

Parameters:

m (JPype object) – the double matrix

Returns:

Numpy array

Return type:

numpy.ndarray

weka.core.typeconv.jdouble_to_float(d)

Turns the Java java.lang.Double object into a Python float.

Parameters:

d (JPype object) – the java.lang.Double

Returns:

the Python float

Return type:

float

weka.core.typeconv.jenumeration_to_list(enm)

Turns the java.util.Enumeration into a list.

Parameters:

enm (JPype object) – the enumeration to convert

Returns:

the list

Return type:

list

weka.core.typeconv.jint_array_to_ndarray(a)

Turns the Java array of ints into a numpy array.

Parameters:

a (JPype object) – the int array

Returns:

Numpy array

Return type:

numpy.ndarray

weka.core.typeconv.jstring_array_to_list(a)

Turns the Java string array into a Python unicode string list.

Parameters:

a (JPype object) – the string array to convert

Returns:

the string list

Return type:

list

weka.core.typeconv.jstring_list_to_string_list(l, return_empty_if_none=True)

Converts a Java java.util.List containing strings into a Python list.

Parameters:
  • l (JPype object) – the list to convert

  • return_empty_if_none (bool) – whether to return an empty list or None when list object is None

Returns:

the list with UTF strings

Return type:

list

weka.core.typeconv.string_list_to_jarray(l)

Turns a Python unicode string list into a Java String array.

Parameters:

l (list) – the string list

Returns:

the Java string array

Return type:

JPype object

weka.core.typeconv.string_list_to_jlist(l)

Turns a Python unicode string list into a Java List.

Parameters:

l (list) – the string list

Return type:

Java list

weka.core.typeconv.to_jdouble_array(values, none_as_nan: bool = False)

Converts the list of floats or the numpy array into a Java array.

Parameters:
  • values – the values to convert

  • none_as_nan (bool) – whether to convert None values to NaN

Returns:

the java array
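The none_as_nan flag is easiest to picture on the Python side. The sketch below is not the wrapper's code: none_to_nan is a hypothetical helper showing the substitution that the flag's description implies, where each None becomes a float NaN (the value Weka uses to represent missing values) before the data is handed to Java.

```python
import math


def none_to_nan(values):
    """Return a new list of floats with every None replaced by NaN."""
    return [math.nan if v is None else float(v) for v in values]


# Example: the middle entry is missing and becomes NaN.
cleaned = none_to_nan([1.0, None, 3])
```

A list prepared this way contains only floats, so it could be passed to to_jdouble_array without relying on the flag.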

weka.core.typeconv.to_jint_array(values)

Converts the list of ints into a Java array.

Parameters:

values – the values to convert

Returns:

the java array

weka.core.typeconv.to_jobject_array(values)

Converts the list of objects into a Java object array.

Parameters:

values – the list of objects to convert

Returns:

the java array

weka.core.typeconv.to_string(o)

Turns the object into a string.

Parameters:

o – the object to convert

Returns:

the generated string

Return type:

str

weka.core.utils module

weka.core.utils.correlation(values1, values2)

Computes the correlation between the two lists of floats.

Parameters:
  • values1 (list) – the first list of floats

  • values2 (list) – the second list of floats

Returns:

the correlation coefficient

Return type:

float
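The entry above does not name the coefficient. Assuming it is the Pearson correlation coefficient, the following pure-Python sketch (hypothetical function pearson, independent of the wrapper and the JVM) can serve as a reference when checking results.

```python
import math


def pearson(values1, values2):
    """Pearson correlation coefficient of two equally long lists of floats."""
    if len(values1) != len(values2):
        raise ValueError("both lists must have the same length")
    n = len(values1)
    mean1 = sum(values1) / n
    mean2 = sum(values2) / n
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(values1, values2))
    ss1 = sum((a - mean1) ** 2 for a in values1)
    ss2 = sum((b - mean2) ** 2 for b in values2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss1 * ss2)
```

How the wrapped function treats constant input is not documented here, so the sketch simply raises an error in that case.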

weka.core.utils.normalize(values, sum_=None)

Normalizes the doubles in the array using the given value.

Parameters:
  • values (list) – the list of floats to normalize

  • sum_ (float) – the value by which the floats are to be normalized

Returns:

the normalized float values

Return type:

list
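As a rough illustration of the arithmetic, the sketch below divides every value by sum_. It is a stand-in written for this reference, not the wrapper's implementation. The name normalize_sketch is hypothetical, and two behaviours are assumptions: falling back to the sum of the values when sum_ is None, and rejecting a zero divisor.

```python
def normalize_sketch(values, sum_=None):
    """Divide each value by sum_ (assumed default: the sum of the values)."""
    if sum_ is None:
        sum_ = sum(values)  # assumption: normalize so the result sums to 1
    if sum_ == 0:
        raise ValueError("cannot normalize by zero")  # assumed behaviour
    return [v / sum_ for v in values]
```

For example, normalize_sketch([1.0, 1.0, 2.0]) divides by 4.0, while normalize_sketch([2.0, 4.0], 2.0) divides by the explicit value 2.0.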

weka.core.utils.variance(values)

Computes the variance for a list of floats.

Parameters:

values (list) – the list of floats to compute the variance for

Returns:

the variance

Return type:

float
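The entry above does not say whether the sample estimator (dividing by n - 1) or the population estimator (dividing by n) is used. The sketch below computes both with Python's standard statistics module so that a result from the wrapper can be compared against them. The data is made up for illustration.

```python
import statistics

# Illustrative data only; mean is 5.0, sum of squared deviations is 32.0.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_var = statistics.variance(values)       # divides by n - 1
population_var = statistics.pvariance(values)  # divides by n
```

The two estimators differ noticeably on short lists, so this comparison is a quick way to identify which one a given function implements.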

weka.core.version module

weka.core.version.pww_version()

Returns the installed version of python-weka-wrapper3.

Returns:

the version, None if failed to obtain

Return type:

str

weka.core.version.weka_version()

Determines the version of Weka in use.

Returns:

the version

Return type:

str

weka.core.version.with_graph_support()

Checks whether pygraphviz is installed for graph support.

Returns:

True if with pygraphviz support

Return type:

bool

weka.core.version.with_plot_support()

Checks whether matplotlib is installed for plot support.

Returns:

True if with matplotlib support

Return type:

bool
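The four functions of this module lend themselves to a single diagnostic report, for instance at the top of a bug report. The following is a minimal sketch: environment_report is a hypothetical helper, and it takes the weka.core.version module as a parameter (an assumption made so it can also be tried with a stand-in object).

```python
def environment_report(version_mod):
    """Collect version and optional-feature information into a dict.

    version_mod -- the weka.core.version module, or any object offering the
                   same four functions (hypothetical parameter)
    """
    return {
        "pww": version_mod.pww_version(),          # may be None if not obtainable
        "weka": version_mod.weka_version(),
        "graphs": version_mod.with_graph_support(),
        "plots": version_mod.with_plot_support(),
    }
```

Whether weka_version needs a running JVM is not stated here. Starting the JVM before calling the helper is the cautious choice.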

Module contents