weka.core package

Submodules

weka.core.capabilities module

class weka.core.capabilities.Capabilities(jobject=None, owner=None)

Bases: weka.core.classes.JavaObject

Wrapper for Capabilities.

attribute_capabilities()

Returns all the attribute capabilities.

Returns:attribute capabilities
Return type:Capabilities
capabilities()

Returns all the capabilities.

Returns:all capabilities
Return type:list
class_capabilities()

Returns all the class capabilities.

Returns:class capabilities
Return type:Capabilities
dependencies()

Returns all the dependencies.

Returns:the dependency list
Return type:list
disable(capability)

Disables the specified capability.

Parameters:capability (Capability) – the capability to disable
disable_all()

Disables all capabilities.

disable_all_attribute_dependencies()

Disables all attribute dependencies.

disable_all_attributes()

Disables all attributes.

disable_all_class_dependencies()

Disables all class dependencies.

disable_all_classes()

Disables all classes.

disable_dependency(capability)

Disables the dependency of the given capability. Disabling NOMINAL_ATTRIBUTES also disables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:capability (Capability) – the dependency to disable
enable(capability)

Enables the specified capability.

Parameters:capability (Capability) – the capability to enable
enable_all()

Enables all capabilities.

enable_all_attribute_dependencies()

Enables all attribute dependencies.

enable_all_attributes()

Enables all attributes.

enable_all_class_dependencies()

Enables all class dependencies.

enable_all_classes()

Enables all classes.

enable_dependency(capability)

Enables the dependency of the given capability. Enabling NOMINAL_ATTRIBUTES also enables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:capability (Capability) – the dependency to enable
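
The cascade described for enable_dependency and disable_dependency can be pictured with a small standalone sketch. It does not call the wrapper or Weka; the IMPLIED table and both function bodies are hypothetical stand-ins that only reproduce the NOMINAL_ATTRIBUTES behaviour stated above.

```python
# Hypothetical stand-in for the dependency cascade; not the wrapper's code.
IMPLIED = {
    "NOMINAL_ATTRIBUTES": (
        "BINARY_ATTRIBUTES",
        "UNARY_ATTRIBUTES",
        "EMPTY_NOMINAL_ATTRIBUTES",
    ),
}


def enable_dependency(dependencies, capability):
    """Add the capability and everything it implies to the dependency set."""
    dependencies.add(capability)
    dependencies.update(IMPLIED.get(capability, ()))
    return dependencies


def disable_dependency(dependencies, capability):
    """Remove the capability and everything it implies from the set."""
    dependencies.discard(capability)
    dependencies.difference_update(IMPLIED.get(capability, ()))
    return dependencies


deps = enable_dependency(set(), "NOMINAL_ATTRIBUTES")
print(sorted(deps))
```
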
classmethod for_instances(data, multi=None)

Returns a Capabilities object specific to this data. The minimum number of instances is not set; the check for multi-instance data is optional.

Parameters:
  • data (Instances) – the data to generate the capabilities for
  • multi (bool) – whether to check the structure, too
Returns:

the generated capabilities

Return type:

Capabilities

handles(capability)

Returns whether the specified capability is set.

Parameters:capability (Capability) – the capability to check
Returns:whether the capability is set
Return type:bool
has_dependencies()

Returns whether any dependencies are set.

Returns:whether any dependencies are set
Return type:bool
has_dependency(capability)

Returns whether the specified dependency is set.

Parameters:capability (Capability) – the capability to check
Returns:whether the dependency is set
Return type:bool
min_instances

Returns the minimum number of instances that must be supported.

Returns:the minimum number
Return type:int
other_capabilities()

Returns all other capabilities.

Returns:all other capabilities
Return type:Capabilities
owner

Returns the owner of these capabilities, if any.

Returns:the owner, can be None
Return type:JavaObject
supports(capabilities)

Returns true if the currently set capabilities support at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters:capabilities (Capabilities) – the capabilities to check
Returns:whether the current capabilities support at least the specified ones
Return type:bool
supports_maybe(capabilities)

Returns true if the currently set capabilities support (or have a dependency) at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters:capabilities (Capabilities) – the capabilities to check
Returns:whether the current capabilities (potentially) support the specified ones
Return type:bool
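
The difference between supports and supports_maybe can be read as two subset checks. The following is a minimal sketch under that reading, using plain Python sets of capability names instead of Capabilities objects; the function bodies are assumptions for illustration, not the wrapper's implementation.

```python
def supports(own, other):
    """True if every capability in `other` is also enabled in `own`."""
    return set(other) <= set(own)


def supports_maybe(own, own_dependencies, other):
    """True if every capability in `other` is enabled or is a dependency."""
    return set(other) <= (set(own) | set(own_dependencies))


enabled = {"NUMERIC_ATTRIBUTES", "NOMINAL_CLASS"}
dependencies = {"NOMINAL_ATTRIBUTES"}
wanted = {"NUMERIC_ATTRIBUTES", "NOMINAL_ATTRIBUTES"}

print(supports(enabled, wanted))
print(supports_maybe(enabled, dependencies, wanted))
```

In this example the strict check fails because NOMINAL_ATTRIBUTES is only a dependency, while the relaxed check accepts it.
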
test_attribute(att, is_class=None, fail=False)

Tests whether the attribute meets the conditions.

Parameters:
  • att (Attribute) – the Attribute to test
  • is_class (bool) – whether this attribute is the class attribute
  • fail (bool) – whether to fail with an exception in case the test fails
Returns:

whether the attribute meets the conditions

Return type:

bool

test_instances(data, from_index=None, to_index=None, fail=False)

Tests whether the dataset meets the conditions.

Parameters:
  • data (Instances) – the Instances to test
  • from_index (int) – the first attribute to include
  • to_index (int) – the last attribute to include
  • fail (bool) – whether to fail with an exception in case the test fails
Returns:

whether the dataset meets the requirements

Return type:

bool

class weka.core.capabilities.Capability(jobject=None, member=None)

Bases: weka.core.classes.Enum

Wrapper for a Capability.

is_attribute

Returns whether this capability is an attribute.

Returns:whether it is an attribute
Return type:bool
is_attribute_capability

Returns whether this capability is an attribute capability.

Returns:whether it is an attribute capability
Return type:bool
is_class

Returns whether this capability is a class.

Returns:whether it is a class
Return type:bool
is_class_capability

Returns whether this capability is a class capability.

Returns:whether it is a class capability
Return type:bool
is_other_capability

Returns whether this capability is an other capability.

Returns:whether it is an other capability
Return type:bool

weka.core.classes module

class weka.core.classes.AbstractParameter(classname=None, jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Ancestor for all parameter classes used by SetupGenerator and MultiSearch.

prop

Returns the currently set property to apply the parameter to.

Returns:the property
Return type:str
class weka.core.classes.Configurable(config=None)

Bases: weka.core.classes.JSONObject

The ancestor for all actors.

config

Obtains the currently set options of the actor.

Returns:the options
Return type:dict
description()

Returns a description of the object.

Returns:the description
Return type:str
fix_config(options)

Fixes the options, if necessary. I.e., it adds all required elements to the dictionary.

Parameters:options (dict) – the options to fix
Returns:the (potentially) fixed options
Return type:dict
classmethod from_dict(d)

Restores its state from a dictionary, used in de-JSONification.

Parameters:d (dict) – the object dictionary
generate_help()

Generates a help string for this actor.

Returns:the help string
Return type:str
help

Obtains the help information per option for this actor.

Returns:the help
Return type:dict
logger

Returns the logger object.

Returns:the logger
Return type:logger
new_logger()

Returns a new logger instance.

Returns:the logger instance
Return type:logger
print_help()

Prints a help string for this actor to stdout.

to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:the object dictionary
Return type:dict
class weka.core.classes.Enum(jobject=None, enum=None, member=None)

Bases: weka.core.classes.JavaObject

Wrapper for Java enums.

name

Returns the name of the enum member.

Returns:the name
Return type:str
ordinal

Returns the ordinal of the enum member.

Returns:the ordinal
Return type:int
values

Returns list of all enum members.

Returns:all enum members
Return type:list
class weka.core.classes.Environment(jobject=None)

Bases: weka.core.classes.JavaObject

Wraps around weka.core.Environment

add_variable(key, value, system_wide=False)

Adds the environment variable.

Parameters:
  • key (str) – the name of the variable
  • value (str) – the value
  • system_wide (bool) – whether to add the variable system wide
remove_variable(key)

Removes the environment variable.

Parameters:key (str) – the name of the variable
classmethod system_wide()

Returns the system-wide environment.

Returns:the environment
Return type:Environment

variable_names()

Returns the names of all environment variables.

Returns:the names of the variables
Return type:list
variable_value(key)

Returns the value of the environment variable.

Parameters:key (str) – the name of the variable
Returns:the variable value
Return type:str
class weka.core.classes.JSONObject

Bases: object

Ancestor for classes that can be represented as JSON and restored from JSON.

classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters:d (dict) – the object dictionary
Returns:the object
Return type:object
classmethod from_json(s)

Restores the object from the given JSON.

Parameters:s (str) – the JSON string to parse
Returns:the restored object
shallow_copy()

Returns a shallow copy of itself.

Returns:the copy
Return type:object
to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:the object dictionary
Return type:dict
to_json()

Returns the options as JSON.

Returns:the object as string
Return type:str
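
The four JSONObject methods form a round trip: to_dict and from_dict convert between object and dictionary, and to_json and from_json wrap those in JSON serialization. A minimal sketch of that pattern follows. The Point class and its 'type' entry are hypothetical and independent of the wrapper.

```python
import json


class Point:
    """Hypothetical class following the to_dict/from_dict convention."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_dict(self):
        # The 'type' entry records which class to restore (an assumption
        # of this sketch).
        return {"type": "Point", "x": self.x, "y": self.y}

    @classmethod
    def from_dict(cls, d):
        return cls(d["x"], d["y"])

    def to_json(self):
        return json.dumps(self.to_dict(), sort_keys=True)

    @classmethod
    def from_json(cls, s):
        return cls.from_dict(json.loads(s))


restored = Point.from_json(Point(1, 2).to_json())
print(restored.to_dict())
```
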
class weka.core.classes.JavaArray(jobject)

Bases: weka.core.classes.JavaObject

Convenience wrapper around Java arrays.

component_type()

Returns the classname of the elements.

Returns:the class of the elements
Return type:str
classmethod new_instance(classname, length)

Creates a new array with the given classname and length; initial values are null.

Parameters:
  • classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
  • length (int) – the length of the array
Returns:

the Java array

Return type:

JB_Object

class weka.core.classes.JavaArrayIterator(data)

Bases: object

Iterator for elements in a Java array.

next()

Returns the next element from the array.

Returns:the next array element object, wrapped as JavaObject if not null
Return type:JavaObject or None
class weka.core.classes.JavaObject(jobject)

Bases: weka.core.classes.JSONObject

Basic Java object.

classmethod check_type(jobject, intf_or_class)

Returns whether the object implements the specified interface or is a subclass.

Parameters:
  • jobject (JB_Object) – the Java object to check
  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
Returns:

whether object implements interface or is subclass

Return type:

bool

classname

Returns the Java classname in dot-notation.

Returns:the Java classname
Return type:str
classmethod enforce_type(jobject, intf_or_class)

Raises an exception if the object does not implement the specified interface or is not a subclass. E.g.: self._enforce_type(‘weka.core.OptionHandler’, ‘Lweka/core/OptionHandler;’) or self._enforce_type(‘weka.core.converters.AbstractFileLoader’)

Parameters:
  • jobject (JB_Object) – the Java object to check
  • intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters:d (dict) – the object dictionary
Returns:the object
Return type:object
get_property(path)

Attempts to get the value (jobject, a Java object) of the provided (bean) property path.

Parameters:path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair
Returns:the wrapped Java object
Return type:JavaObject
is_serializable

Returns true if the object is serializable.

Returns:true if serializable
Return type:bool
jclass

Returns the Java class object of the underlying Java object.

Returns:the Java class
Return type:JB_Object
jclasswrapper

Returns a JClassWrapper instance of the class for the encapsulated Java object, giving access to the class methods using dot notation.

http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection

Returns:the wrapper
Return type:JClassWrapper
jwrapper

Returns a JWrapper instance of the encapsulated Java object, giving access to methods using dot notation.

http://pythonhosted.org//javabridge/highlevel.html#wrapping-java-objects-using-reflection

Returns:the wrapper
Return type:JWrapper
classmethod new_instance(classname)

Creates a new object from the given classname using the default constructor, None in case of error.

Parameters:classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
Returns:the Java object
Return type:JB_Object
set_property(path, jobject)

Attempts to set the value (jobject, a Java object) of the provided (bean) property path.

Parameters:
  • path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair
  • jobject (JB_Object) – the Java object to set; if instance of JavaObject class, the jobject member is automatically used
to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:the object dictionary
Return type:dict
class weka.core.classes.ListParameter(jobject=None, options=None)

Bases: weka.core.classes.AbstractParameter

Parameter using a predefined list of values, used by SetupGenerator and MultiSearch.

values

Returns the currently set values.

Returns:the list of values (strings)
Return type:list
class weka.core.classes.MathParameter(jobject=None, options=None)

Bases: weka.core.classes.AbstractParameter

Parameter using a math expression for generating values, used by SetupGenerator and MultiSearch.

base

Returns the currently set base value.

Returns:the base
Return type:float
expression

Returns the currently set expression.

Returns:the expression
Return type:str
maximum

Returns the currently set maximum value.

Returns:the maximum
Return type:float
minimum

Returns the currently set minimum value.

Returns:the minimum
Return type:float
step

Returns the currently set step value.

Returns:the step
Return type:float
class weka.core.classes.Option(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.Option class.

description

Returns the description of the option.

Returns:the description
Return type:str
name

Returns the name of the option.

Returns:the name
Return type:str
num_arguments

Returns the number of arguments of the option.

Returns:the number of arguments
Return type:int
synopsis

Returns the synopsis of the option.

Returns:the synopsis
Return type:str
class weka.core.classes.OptionHandler(jobject, options=None)

Bases: weka.core.classes.JavaObject, weka.core.classes.Configurable

Ancestor for option-handling classes. Classes should implement the weka.core.OptionHandler interface to have any effect.

description()

Returns a description of the object.

Returns:the description
Return type:str
classmethod from_dict(d)

Restores an object state from a dictionary, used in de-JSONification.

Parameters:d (dict) – the object dictionary
Returns:the object
Return type:object
global_info()

Returns the globalInfo() result, None if not available.

Return type:str
options

Obtains the currently set options as list.

Returns:the list of options
Return type:list
to_commandline()

Generates a commandline string from the JavaObject instance.

Returns:the commandline string
Return type:str
to_dict()

Returns a dictionary that represents this object, to be used for JSONification.

Returns:the object dictionary
Return type:dict
to_help()

Returns a string that contains the ‘global_info’ text and the options.

Returns:the generated help string
Return type:str
class weka.core.classes.Random(seed)

Bases: weka.core.classes.JavaObject

Wrapper for the java.util.Random class.

next_double()

Next random double.

Returns:the next random double
Return type:float
next_int(n=None)

Next random integer. If n is provided, the result is between 0 and n-1.

Parameters:n (int) – the upper limit (minus 1) for the random integer
Returns:the next random integer
Return type:int
class weka.core.classes.Range(jobject=None, ranges=None)

Bases: weka.core.classes.JavaObject

Wrapper for a Weka Range object.

invert

Returns whether the range is inverted.

Returns:true if inverted
Return type:bool
ranges

Returns the string range.

Returns:the string range of 1-based indices
Return type:str
selection()

Returns the selection list.

Returns:the list of 0-based integer indices
Return type:list
upper(upper)

Sets the upper limit.

Parameters:upper (int) – the upper limit
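
The relationship between the 1-based string in ranges and the 0-based list from selection() can be sketched in plain Python. This is an illustration only: the parsing below, the "first" and "last" keywords, and the reading of upper as the largest valid 0-based index are assumptions modelled on Weka's range syntax, not code from the wrapper.

```python
def selection(ranges, upper):
    """Expand a 1-based range string into a list of 0-based indices.

    `upper` is taken to be the largest valid 0-based index and is used
    to resolve the 'last' keyword.
    """
    def to_index(token):
        token = token.strip()
        if token == "first":
            return 0
        if token == "last":
            return upper
        return int(token) - 1

    result = []
    for part in ranges.split(","):
        if "-" in part:
            start, end = part.split("-")
            result.extend(range(to_index(start), to_index(end) + 1))
        else:
            result.append(to_index(part))
    return result


print(selection("1-3,5,last", upper=9))
```
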
class weka.core.classes.SelectedTag(jobject=None, tag_id=None, tag_text=None, tags=None)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.SelectedTag class.

selected

Returns the selected tag.

Returns:the tag
Return type:Tag
tags

Returns the associated tags.

Returns:the list of Tag objects
Return type:list
class weka.core.classes.SetupGenerator(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Allows the generation of a large number of setups from parameter definitions.

base_object

Returns the base object to apply the setups to.

Returns:the base object
Return type:JavaObject or OptionHandler
parameters

Returns the list of currently set search parameters.

Returns:the list of AbstractParameter objects
Return type:list
setups()

Generates and returns all the setups according to the parameter search space.

Returns:the list of configured objects (of type JavaObject)
Return type:list
class weka.core.classes.SingleIndex(jobject=None, index=None)

Bases: weka.core.classes.JavaObject

Wrapper for a Weka SingleIndex object.

index()

Returns the integer index.

Returns:the 0-based integer index
Return type:int
single_index

Returns the string index.

Returns:the 1-based string index
Return type:str
upper(upper)

Sets the upper limit.

Parameters:upper (int) – the upper limit
class weka.core.classes.Stoppable

Bases: object

Classes that can be stopped.

is_stopped()

Returns whether the object has been stopped.

Returns:whether stopped
Return type:bool
stop_execution()

Triggers the stopping of the object.

class weka.core.classes.Tag(jobject=None, ident=None, ident_str='', readable='', uppercase=True)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.Tag class.

ident

Returns the current integer ID of the tag.

Returns:the integer ID
Return type:int
identstr

Returns the current ID string.

Returns:the ID string
Return type:str
readable

Returns the ‘human readable’ string.

Returns:the readable string
Return type:str
class weka.core.classes.Tags(jobject=None, tags=None)

Bases: weka.core.classes.JavaObject

Wrapper for an array of weka.core.Tag objects.

find(name)

Returns the Tag that matches the name.

Parameters:name (str) – the string representation of the tag
Returns:the tag, None if not found
Return type:Tag
classmethod get_object_tags(javaobject, methodname)

Instantiates the Tag array obtained from the object using the specified method name.

Example:

  cls = Classifier(classname="weka.classifiers.meta.MultiSearch")
  tags = Tags.get_object_tags(cls, "getMetricsTags")

Parameters:
  • javaobject (JavaObject) – the javaobject to obtain the tags from
  • methodname (str) – the method name returning the Tag array
Returns:

the Tags objects

Return type:

Tags

classmethod get_tags(classname, field)

Instantiates the Tag array located in the specified class with the given field name.

Example:

  tags = Tags.get_tags("weka.classifiers.functions.SMO", "TAGS_FILTER")

Parameters:
  • classname (str) – the classname in which the tags reside
  • field (str) – the field name of the Tag array
Returns:

the Tags objects

Return type:

Tags

weka.core.classes.backquote(s)

Backquotes the string.

Parameters:s (str) – the string to process
Returns:the backquoted string
Return type:str
weka.core.classes.complete_classname(classname)

Attempts to complete a partial classname like ‘.J48’ and returns the full classname if a single match was found, otherwise an exception is raised.

Parameters:classname (str) – the partial classname to expand
Returns:the full classname
Return type:str
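
The single-match rule can be illustrated without a running JVM. The sketch below matches a partial name against a fixed, hypothetical list of class names; how the wrapper actually obtains the candidate classes, and which exception type it raises, are not stated here, so both are assumptions.

```python
# Hypothetical candidate list; the wrapper would use the classes Weka knows.
KNOWN_CLASSES = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.filters.supervised.attribute.Discretize",
    "weka.filters.unsupervised.attribute.Discretize",
]


def complete(partial, known=KNOWN_CLASSES):
    """Return the only class name ending in `partial`, otherwise raise."""
    matches = [name for name in known if name.endswith(partial)]
    if len(matches) != 1:
        raise ValueError(
            "%d matches for %r: %s" % (len(matches), partial, matches))
    return matches[0]


print(complete(".J48"))
```

A partial name such as '.Discretize' matches two entries in this list and therefore raises instead of guessing.
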
weka.core.classes.deregister_dict_handler(typestr)

Deregisters a handler for restoring an object from a JSON dictionary.

Parameters:typestr (str) – the type of the object
weka.core.classes.from_commandline(cmdline, classname=None)

Creates an OptionHandler based on the provided commandline string.

Parameters:
  • cmdline (str) – the commandline string to use
  • classname (str) – the classname of the wrapper to return other than OptionHandler (in dot-notation)
Returns:

the generated option handler instance

Return type:

object

weka.core.classes.from_dict_handlers = {}

The methods that handle the restoration from a JSON dictionary, stored under their ‘type’.

weka.core.classes.get_class(classname)

Returns the class object associated with the dot-notation classname.

Taken from here: http://stackoverflow.com/a/452981

Parameters:classname (str) – the classname
Returns:the class object
Return type:object
weka.core.classes.get_classname(obj)

Returns the classname of the JB_Object, Python class or object.

Parameters:obj (object) – the java object or Python class/object to get the classname for
Returns:the classname
Return type:str
weka.core.classes.get_dict_handler(typestr)

Returns the handler for restoring an object from a JSON dictionary.

Parameters:typestr (str) – the type of the object
Returns:the handler, None if not available
weka.core.classes.get_jclass(classname)

Returns the Java class object associated with the dot-notation classname.

Parameters:classname (str) – the classname
Returns:the class object
Return type:JB_Object
weka.core.classes.get_static_field(classname, fieldname, signature)

Returns the Java object associated with the static field of the specified class.

Parameters:
  • classname (str) – the classname of the class to get the field from
  • fieldname (str) – the name of the field to retrieve
  • signature (str) – the JNI signature of the field
Returns:

the object

Return type:

JB_Object

weka.core.classes.has_dict_handler(typestr)

Returns whether a handler for restoring an object from a JSON dictionary is registered.

Parameters:typestr (str) – the type of the object
Returns:whether a handler is available
Return type:bool
weka.core.classes.is_array(obj)

Checks whether the Java object is an array.

Parameters:obj (JB_Object) – the Java object to check
Returns:whether the object is an array
Return type:bool
weka.core.classes.is_instance_of(obj, class_or_intf_name)

Checks whether the Java object implements the specified interface or is a subclass of the superclass.

Parameters:
  • obj (JB_Object) – the Java object to check
  • class_or_intf_name (str) – the superclass or interface to check, dot notation or with forward slashes
Returns:

true if either implements interface or subclass of superclass

Return type:

bool

weka.core.classes.join_options(options)

Turns the list of options back into a single commandline string.

Parameters:options (list) – the list of options to process
Returns:the combined options
Return type:str
weka.core.classes.main()

Runs a classifier from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

weka.core.classes.quote(s)

Quotes the string if necessary.

Parameters:s (str) – the string to process
Returns:the quoted string
Return type:str
weka.core.classes.register_dict_handler(typestr, handler)

Registers a handler for restoring an object from a JSON dictionary.

Parameters:
  • typestr (str) – the type of the object
  • handler – the method
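
The register, get, has and deregister functions describe a registry keyed by a type string. A minimal standalone sketch of that pattern follows. The names are shortened stand-ins, and the 'type' key used for dispatch in restore() is an assumption of this sketch.

```python
# Stand-in registry, playing the role of from_dict_handlers.
handlers = {}


def register(typestr, handler):
    handlers[typestr] = handler


def deregister(typestr):
    handlers.pop(typestr, None)


def has_handler(typestr):
    return typestr in handlers


def restore(d):
    """Dispatch on the dictionary's 'type' entry to the registered handler."""
    handler = handlers.get(d["type"])
    if handler is None:
        raise KeyError("no handler registered for %r" % d["type"])
    return handler(d)


register("Pair", lambda d: (d["left"], d["right"]))
print(restore({"type": "Pair", "left": 1, "right": 2}))
deregister("Pair")
print(has_handler("Pair"))
```
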
weka.core.classes.split_options(cmdline)

Splits the commandline into a list of options.

Parameters:cmdline (str) – the commandline string to split into individual options
Returns:the split list of commandline options
Return type:list
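
Splitting and joining matter because a single option value may itself be a quoted command line, as with a nested filter specification. The sketch below uses Python's shlex module as a stand-in to show the idea. Weka applies its own quoting rules, so the output of split_options and join_options may differ in detail.

```python
import shlex

cmdline = ('-F "weka.filters.unsupervised.attribute.Remove -R 1" '
           '-W weka.classifiers.trees.J48')

# shlex is only a stand-in here; it keeps the quoted filter specification
# together as a single list element.
options = shlex.split(cmdline)
print(options)

# Joining quotes any element that contains whitespace, so splitting the
# result again yields the same list.
rejoined = shlex.join(options)
print(shlex.split(rejoined) == options)
```
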
weka.core.classes.to_commandline(optionhandler)

Generates a commandline string from the OptionHandler instance.

Parameters:optionhandler (OptionHandler) – the OptionHandler instance to turn into a commandline
Returns:the commandline string
Return type:str
weka.core.classes.unbackquote(s)

Un-backquotes the string.

Parameters:s (str) – the string to process
Returns:the un-backquoted string
Return type:str
weka.core.classes.unquote(s)

Un-quotes the string.

Parameters:s (str) – the string to process
Returns:the un-quoted string
Return type:str

weka.core.converters module

class weka.core.converters.IncrementalLoaderIterator(loader, structure)

Bases: object

Iterator for dataset rows when loading incrementally.

next()

Reads the next dataset row.

Returns:the next row
Return type:Instance
class weka.core.converters.Loader(classname='weka.core.converters.ArffLoader', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for Loaders.

load_file(dfile, incremental=False)

Loads the specified file and returns the Instances object. In case of incremental loading, only the structure is returned.

Parameters:
  • dfile (str) – the file to load
  • incremental (bool) – whether to load the dataset incrementally
Returns:

the full dataset or the header (if incremental)

Return type:

Instances

Raises:

Exception – if the file does not exist

load_url(url, incremental=False)

Loads the specified URL and returns the Instances object. In case of incremental loading, only the structure is returned.

Parameters:
  • url (str) – the URL to load the data from
  • incremental (bool) – whether to load the dataset incrementally
Returns:

the full dataset or the header (if incremental)

Return type:

Instances

class weka.core.converters.Saver(classname='weka.core.converters.ArffSaver', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for Savers.

capabilities()

Returns the capabilities of the saver.

Returns:the capabilities
Return type:Capabilities
save_file(data, dfile)

Saves the Instances object in the specified file.

Parameters:
  • data (Instances) – the data to save
  • dfile (str) – the file to save the data to
class weka.core.converters.TextDirectoryLoader(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for TextDirectoryLoader.

load()

Loads the text files from the specified directory and returns the Instances object. In case of incremental loading, only the structure is returned.

Returns:the full dataset or the header (if incremental)
Return type:Instances
weka.core.converters.load_any_file(filename)

Determines a Loader based on the file extension. If successful, loads the full dataset and returns it.

Parameters:filename (str) – the name of the file to load
Returns:the loaded dataset
Return type:Instances
weka.core.converters.loader_for_file(filename)

Returns a Loader that can load the specified file, based on the file extension. None if failed to determine.

Parameters:filename (str) – the filename to get the loader for
Returns:the associated loader instance or None if none found
Return type:Loader
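
The extension-based lookup, including the None result for unknown extensions, can be sketched with a fixed table. The table below is hypothetical and deliberately tiny; how the wrapper actually determines a matching converter is not described here.

```python
import os

# Hypothetical extension table for illustration only.
LOADERS = {
    ".arff": "weka.core.converters.ArffLoader",
    ".csv": "weka.core.converters.CSVLoader",
}


def loader_classname_for_file(filename):
    """Return the loader class name for the file's extension, or None."""
    ext = os.path.splitext(filename)[1].lower()
    return LOADERS.get(ext)


print(loader_classname_for_file("iris.arff"))
print(loader_classname_for_file("notes.txt"))
```

Callers of loader_for_file should check for None in the same way before using the result.
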
weka.core.converters.ndarray_to_instances(array, relation, att_template='Att-#', att_list=None)

Converts the numpy matrix into an Instances object and returns it.

Parameters:
  • array (numpy.ndarray) – the numpy ndarray to convert
  • relation (str) – the name of the dataset
  • att_template (str) – the prefix to use for the attribute names, “#” is the 1-based index, “!” is the 0-based index, “@” the relation name
  • att_list (list) – the list of attribute names to use
Returns:

the generated instances object

Return type:

Instances
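
The placeholder rules for att_template can be made concrete with a small standalone function. This sketch only generates the attribute names and does not build an Instances object. That att_list takes precedence over the template is an assumption.

```python
def attribute_names(num_atts, relation, att_template="Att-#", att_list=None):
    """Build attribute names from an explicit list or from the template."""
    if att_list is not None:
        # Assumed here: an explicit list overrides the template.
        return list(att_list)
    names = []
    for i in range(num_atts):
        name = att_template.replace("#", str(i + 1))  # 1-based index
        name = name.replace("!", str(i))              # 0-based index
        name = name.replace("@", relation)            # relation name
        names.append(name)
    return names


print(attribute_names(3, "iris"))
print(attribute_names(2, "iris", att_template="@_!"))
```
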

weka.core.converters.save_any_file(data, filename)

Determines a Saver based on the file extension and saves the data. Returns whether the data was saved successfully.

Parameters:
  • filename (str) – the name of the file to save
  • data (Instances) – the data to save
Returns:

whether successfully saved

Return type:

bool

weka.core.converters.saver_for_file(filename)

Returns a Saver that can save the specified file, based on the file extension. None if failed to determine.

Parameters:filename (str) – the filename to get the saver for
Returns:the associated saver instance or None if none found
Return type:Saver

weka.core.database module

class weka.core.database.DatabaseUtils(jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for weka.experiment.DatabaseUtils.

db_url

Obtains the currently set database URL.

Returns:the database URL
Return type:str
password

Obtains the currently set database password.

Returns:the database password
Return type:str
user

Obtains the currently set database user.

Returns:the database user
Return type:str
class weka.core.database.InstanceQuery(jobject=None, options=None)

Bases: weka.core.database.DatabaseUtils

Wrapper class for weka.experiment.InstanceQuery.

custom_properties

Obtains the currently set custom properties file.

Returns:the custom properties file
Return type:str
query

Obtains the current SQL query to execute.

Returns:the SQL query
Return type:str
retrieve_instances(query=None)

Executes either the supplied query or the one set via options (or the ‘query’ property).

Parameters:query (str) – query to execute if not the currently set one
Returns:the generated data
Return type:Instances
sparse_data

Obtains whether sparse data is returned or not.

Returns:whether sparse data is generated
Return type:bool

weka.core.dataset module

class weka.core.dataset.Attribute(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Attribute.

add_relation(instances)

Adds the relation value, returns the index.

Parameters:instances (Instances) – the Instances object to add
Returns:the index
Return type:int
add_string_value(s)

Adds the string value, returns the index.

Parameters:s (str) – the string to add
Returns:the index
Return type:int
copy(name=None)

Creates a copy of this attribute.

Parameters:name (str) – the new name, uses the old one if None
Returns:the copy of the attribute
Return type:Attribute
classmethod create_date(name, formt="yyyy-MM-dd'T'HH:mm:ss")

Creates a date attribute.

Parameters:
  • name (str) – the name of the attribute
  • formt (str) – the date format, see Javadoc for java.text.SimpleDateFormat
classmethod create_nominal(name, labels)

Creates a nominal attribute.

Parameters:
  • name (str) – the name of the attribute
  • labels (list) – the list of string labels to use
classmethod create_numeric(name)

Creates a numeric attribute.

Parameters:name (str) – the name of the attribute
classmethod create_relational(name, inst)

Creates a relational attribute.

Parameters:
  • name (str) – the name of the attribute
  • inst (Instances) – the structure of the relational attribute
classmethod create_string(name)

Creates a string attribute.

Parameters:name (str) – the name of the attribute
date_format

Returns the format of this date attribute. See java.text.SimpleDateFormat Javadoc.

Returns:the format string
Return type:str
equals(att)

Checks whether this attribute is the same as the provided one.

Parameters:att (Attribute) – the Attribute to check against
Returns:whether the same
Return type:bool
equals_msg(att)

Checks whether this attribute is the same as the provided one. Returns None if the same, otherwise error message.

Parameters:att (Attribute) – the Attribute to check against
Returns:None if the same, otherwise error message
Return type:str
index

Returns the index of this attribute.

Returns:the index
Return type:int
index_of(label)

Returns the index of the label in this attribute.

Parameters:label (str) – the string label to get the index for
Returns:the 0-based index
Return type:int
is_averagable

Returns whether the attribute is averagable.

Returns:whether averagable
Return type:bool
is_date

Returns whether the attribute is a date one.

Returns:whether date attribute
Return type:bool
is_in_range(value)

Checks whether the value is within the bounds of the numeric attribute.

Parameters:value (float) – the numeric value to check
Returns:whether between lower and upper bound
Return type:bool
is_nominal

Returns whether the attribute is a nominal one.

Returns:whether nominal attribute
Return type:bool
is_numeric

Returns whether the attribute is a numeric one (date or numeric).

Returns:whether numeric attribute
Return type:bool
is_relation_valued

Returns whether the attribute is a relation valued one.

Returns:whether relation valued attribute
Return type:bool
is_string

Returns whether the attribute is a string attribute.

Returns:whether string attribute
Return type:bool
lower_numeric_bound

Returns the lower numeric bound of the numeric attribute.

Returns:the lower bound
Return type:float
name

Returns the name of the attribute.

Returns:the name
Return type:str
num_values

Returns the number of labels.

Returns:the number of labels
Return type:int
ordering

Returns the ordering of the attribute.

Returns:the ordering (ORDERING_SYMBOLIC, ORDERING_ORDERED, ORDERING_MODULO)
Return type:int
parse_date(s)

Parses the date string and returns the internal format value.

Parameters:s (str) – the date string
Returns:the internal format
Return type:float
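
As a rough illustration of the internal format mentioned above: Weka stores date values as a floating-point count of milliseconds since the Unix epoch. The sketch below mimics that conversion in plain Python without a JVM. The helper name to_internal_date, the UTC assumption and the strptime pattern (a stand-in for a SimpleDateFormat pattern such as yyyy-MM-dd'T'HH:mm:ss) are assumptions made for illustration and are not part of the wrapper.

```python
from datetime import datetime, timezone


def to_internal_date(s, pattern="%Y-%m-%dT%H:%M:%S"):
    """Hypothetical helper: date string -> milliseconds since the epoch (float)."""
    # Assumption: the string is read as UTC; a JVM would apply its default time zone.
    dt = datetime.strptime(s, pattern).replace(tzinfo=timezone.utc)
    return dt.timestamp() * 1000.0


print(to_internal_date("1970-01-02T00:00:00"))  # 86400000.0
```

With a running JVM, parse_date(s) on a date attribute is the call to use; the sketch only shows what kind of number to expect back.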
type

Returns the type of the attribute. See weka.core.Attribute Javadoc.

Returns:the type
Return type:int
type_str(short=False)

Returns the type of the attribute as string.

Returns:the type
Return type:str
upper_numeric_bound

Returns the upper numeric bound of the numeric attribute.

Returns:the upper bound
Return type:float
value(index)

Returns the label for the index.

Parameters:index (int) – the 0-based index of the label to return
Returns:the label
Return type:str
values

Returns the labels, strings or relation-values.

Returns:all the values, None if not NOMINAL, STRING, or RELATION
Return type:list
weight

Returns the weight of the attribute.

Returns:the weight
Return type:float
class weka.core.dataset.AttributeIterator(data)

Bases: object

Iterator for attributes in an Instances object.

next()

Returns the next attribute from the Instances object.

Returns:the next Attribute object
Return type:Attribute
class weka.core.dataset.AttributeStats(jobject)

Bases: weka.core.classes.JavaObject

Container for attribute statistics.

distinct_count

The number of distinct values.

Returns:The number of distinct values
Return type:int
int_count

The number of int-like values.

Returns:The number of int-like values
Return type:int
missing_count

The number of missing values.

Returns:The number of missing values
Return type:int
nominal_counts

Counts of each nominal value.

Returns:Counts of each nominal value
Return type:ndarray
nominal_weights

Weight mass for each nominal value.

Returns:Weight mass for each nominal value
Return type:ndarray
numeric_stats

Stats on numeric value distributions.

Returns:Stats on numeric value distributions
Return type:Stats
total_count

The total number of values.

Returns:The total number of values
Return type:int
unique_count

The number of values that only appear once.

Returns:The number of values that only appear once
Return type:int
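
To make the difference between the counters concrete, the following sketch recomputes them in plain Python for a single column of internal float values. It is not the wrapper's implementation: attribute_counts is a hypothetical helper, and it assumes that the total count includes missing rows while the other counters only consider non-missing values.

```python
import math
from collections import Counter


def attribute_counts(values):
    """Hypothetical helper: summarize one column of internal float values."""
    present = [v for v in values if not math.isnan(v)]  # NaN marks a missing value
    freq = Counter(present)
    return {
        "total_count": len(values),                       # assumption: includes missing rows
        "missing_count": len(values) - len(present),
        "distinct_count": len(freq),                      # different values seen
        "unique_count": sum(1 for c in freq.values() if c == 1),  # values seen exactly once
        "int_count": sum(c for v, c in freq.items() if v == int(v)),  # int-like occurrences
    }


stats = attribute_counts([1.0, 2.0, 2.0, 3.5, float("nan")])
print(stats)
```

Here 2.0 appears twice, so it counts towards distinct_count but not towards unique_count.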
class weka.core.dataset.Instance(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Instance.

class_attribute

Returns the currently set class attribute.

Returns:the class attribute
Return type:Attribute
class_index

Returns the currently set class index.

Returns:the class index, -1 if not set
Return type:int
classmethod create_instance(values, classname='weka.core.DenseInstance', weight=1.0)

Creates a new instance.

Parameters:
  • values (ndarray or list) – the float values (internal format) to use, numpy array or list.
  • classname (str) – the classname of the instance (eg weka.core.DenseInstance).
  • weight (float) – the weight of the instance
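
The values passed to create_instance are in internal format: numeric attributes keep their value, nominal attributes are stored as the 0-based index of the label (as a float), and missing values are NaN. The sketch below encodes a mixed row that way in plain Python. The helper to_internal and the example labels are hypothetical and only illustrate the encoding.

```python
def to_internal(row, labels):
    """Hypothetical helper: encode a mixed row as internal float values.

    labels[i] is the list of nominal labels of attribute i, or None if numeric.
    """
    encoded = []
    for value, attr_labels in zip(row, labels):
        if value is None:
            encoded.append(float("nan"))                      # missing value
        elif attr_labels is not None:
            encoded.append(float(attr_labels.index(value)))   # nominal: label index
        else:
            encoded.append(float(value))                      # numeric: the value itself
    return encoded


labels = [None, ["sunny", "overcast", "rainy"], ["yes", "no"]]
print(to_internal([85.0, "rainy", "no"], labels))  # [85.0, 2.0, 1.0]
```

A list like the one printed here is what the values parameter expects; with a loaded dataset, Attribute.index_of(label) provides the label index.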
classmethod create_sparse_instance(values, max_values, classname='weka.core.SparseInstance', weight=1.0)

Creates a new sparse instance.

Parameters:
  • values (list) – the list of tuples (0-based index and internal format float). The indices of the tuples must be in ascending order and “max_values” must be set to the maximum number of attributes in the dataset.
  • max_values (int) – the maximum number of attributes
  • classname (str) – the classname of the instance (eg weka.core.SparseInstance).
  • weight (float) – the weight of the instance
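
The tuple list for a sparse instance can be derived from a dense row by keeping only the non-zero entries. The following sketch shows one way to build it in plain Python; to_sparse_pairs is a hypothetical helper, not a function of the wrapper.

```python
def to_sparse_pairs(dense_row):
    """Hypothetical helper: keep only non-zero entries as (index, value) tuples."""
    # enumerate() yields indices in ascending order, as the parameter description requires
    return [(i, float(v)) for i, v in enumerate(dense_row) if v != 0]


dense = [0, 0, 3.5, 0, 1, 0]
pairs = to_sparse_pairs(dense)
max_values = len(dense)  # total number of attributes in the dataset
print(pairs, max_values)  # [(2, 3.5), (4, 1.0)] 6
```

The resulting pairs and max_values correspond to the values and max_values parameters described above.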
dataset

Returns the dataset that this instance belongs to.

Returns:the dataset or None if no dataset set
Return type:Instances
get_relational_value(index)

Returns the relational value at the specified position (0-based).

Parameters:index (int) – the 0-based index of the internal value
Returns:the relational value
Return type:Instances
get_string_value(index)

Returns the string value at the specified position (0-based).

Parameters:index (int) – the 0-based index of the internal value
Returns:the string value
Return type:str
get_value(index)

Returns the internal value at the specified position (0-based).

Parameters:index (int) – the 0-based index of the internal value
Returns:the internal value
Return type:float
has_class()

Returns whether a class attribute is set (convenience method).

Returns:whether a class attribute is currently set
Return type:bool
has_missing()

Returns whether at least one attribute has a missing value.

Returns:whether at least one value is missing
Return type:bool
is_missing(index)

Returns whether the attribute at the specified index is missing.

Parameters:index (int) – the 0-based index of the attribute
Returns:whether the value is missing
Return type:bool
classmethod missing_value()

Returns the numeric value that represents a missing value in Weka (NaN).

Returns:missing value
Return type:float
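
Because the missing value is NaN, it cannot be detected with an equality comparison. The sketch below demonstrates this in plain Python, using float("nan") as a stand-in for the value returned by missing_value(); the names MISSING and row are illustrative only.

```python
import math

MISSING = float("nan")  # stand-in for the value returned by missing_value()

row = [5.1, MISSING, 1.4]

print(row[1] == MISSING)   # False: NaN never compares equal, not even to itself
print(math.isnan(row[1]))  # True: the reliable test in plain Python

missing_positions = [i for i, v in enumerate(row) if math.isnan(v)]
print(missing_positions)   # [1]
```

On an Instance object, is_missing(index) and has_missing() perform this check without touching the raw values.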
num_attributes

Returns the number of attributes.

Returns:the number of attributes
Return type:int
num_classes

Returns the number of class labels.

Returns:the number of class labels
Return type:int
set_missing(index)

Sets the attribute at the specified index to missing.

Parameters:index (int) – the 0-based index of the attribute
set_string_value(index, s)

Sets the string value at the specified position (0-based).

Parameters:
  • index (int) – the 0-based index of the internal value
  • s (str) – the string value
set_value(index, value)

Sets the internal value at the specified position (0-based).

Parameters:
  • index (int) – the 0-based index of the attribute
  • value (float) – the internal float value to set
values

Returns the internal values of this instance.

Returns:the values as numpy array
Return type:ndarray
weight

Returns the currently set weight.

Returns:the weight
Return type:float
class weka.core.dataset.InstanceIterator(data)

Bases: object

Iterator for rows in an Instances object.

next()

Returns the next row from the Instances object.

Returns:the next Instance object
Return type:Instance
class weka.core.dataset.InstanceValueIterator(data)

Bases: object

Iterator for values in an Instance object.

next()

Returns the next value from the Instance object.

Returns:the next value, which is either a number or a string depending on the attribute
Return type:str or float
class weka.core.dataset.Instances(jobject)

Bases: weka.core.classes.JavaObject

Wrapper class for weka.core.Instances.

add_instance(inst, index=None)

Adds the specified instance to the dataset.

Parameters:
  • inst (Instance) – the Instance to add
  • index (int) – the 0-based index where to add the Instance
classmethod append_instances(inst1, inst2)

Merges the two datasets (one-after-the-other). Throws an exception if the datasets aren’t compatible.

Parameters:
  • inst1 (Instances) – the first dataset
  • inst2 (Instances) – the second dataset
Returns:

the combined dataset

Return type:

Instances

attribute(index)

Returns the specified attribute.

Parameters:index (int) – the 0-based index of the attribute
Returns:the attribute
Return type:Attribute
attribute_by_name(name)

Returns the specified attribute, None if not found.

Parameters:name (str) – the name of the attribute
Returns:the attribute or None
Return type:Attribute
attribute_stats(index)

Returns the specified attribute statistics.

Parameters:index (int) – the 0-based index of the attribute
Returns:the attribute statistics
Return type:AttributeStats
attributes()

Returns an iterator over the attributes.

class_attribute

Returns the currently set class attribute.

Returns:the class attribute
Return type:Attribute
class_index

Returns the currently set class index (0-based).

Returns:the class index, -1 if not set
Return type:int
class_is_first()

Sets the first attribute as class attribute (convenience method).

class_is_last()

Sets the last attribute as class attribute (convenience method).

compactify()

Compactifies the set of instances.

classmethod copy_instances(dataset, from_row=None, num_rows=None)

Creates a copy of the Instances. If either from_row or num_rows is None, all of the data is copied.

Parameters:
  • dataset (Instances) – the original dataset
  • from_row (int) – the 0-based start index of the rows to copy
  • num_rows (int) – the number of rows to copy
Returns:

the copy of the data

Return type:

Instances

classmethod create_instances(name, atts, capacity)

Creates a new Instances.

Parameters:
  • name (str) – the relation name
  • atts (list of Attribute) – the list of attributes to use for the dataset
  • capacity (int) – how many data rows to reserve initially (see compactify)
Returns:

the dataset

Return type:

Instances

delete(index=None)

Removes either the specified Instance or all Instance objects.

Parameters:index (int) – the 0-based index of the instance to remove
delete_attribute(index)

Deletes an attribute at the given position.

Parameters:index (int) – the 0-based index of the attribute to remove
delete_attribute_type(typ)

Deletes all attributes of the given type in the dataset.

Parameters:typ (int) – the attribute type to remove, see weka.core.Attribute Javadoc
delete_first_attribute()

Deletes the first attribute.

delete_last_attribute()

Deletes the last attribute.

delete_with_missing(index)

Deletes all rows that have a missing value at the specified attribute index.

Parameters:index (int) – the attribute index to check for missing values
equal_headers(inst)

Compares this dataset against the given one in terms of attributes.

Parameters:inst (Instances) – the dataset to compare against
Returns:None if the same, otherwise an error message
Return type:str
get_instance(index)

Returns the Instance object at the specified location.

Parameters:index (int) – the 0-based index of the instance
Returns:the instance
Return type:Instance
has_class()

Returns whether a class attribute is set (convenience method).

Returns:whether a class attribute is currently set
Return type:bool
insert_attribute(att, index)

Inserts the attribute at the specified location.

Parameters:
  • att (Attribute) – the attribute to insert
  • index (int) – the index to insert the attribute at
classmethod merge_instances(inst1, inst2)

Merges the two datasets (side-by-side).

Parameters:
  • inst1 (Instances) – the first dataset
  • inst2 (Instances) – the second dataset
Returns:

the combined dataset

Return type:

Instances

no_class()

Unsets the class attribute (convenience method).

num_attributes

Returns the number of attributes.

Returns:the number of attributes
Return type:int
num_instances

Returns the number of instances.

Returns:the number of instances
Return type:int
randomize(random)

Randomizes the dataset using the random number generator.

Parameters:random (Random) – the random number generator to use
relationname

Returns the name of the dataset.

Returns:the name
Return type:str
set_instance(index, inst)

Sets the Instance at the specified location in the dataset.

Parameters:
  • index (int) – the 0-based index of the instance to replace
  • inst (Instance) – the Instance to set
Returns:

the instance

Return type:

Instance

sort(index)

Sorts the dataset using the specified attribute index.

Parameters:index (int) – the index of the attribute
stratify(folds)

Stratifies the data after randomization for nominal class attributes.

Parameters:folds (int) – the number of folds to perform the stratification for
classmethod summary(inst)

Generates a summary of the dataset.

Parameters:inst (Instances) – the dataset
Returns:the summary
Return type:str
classmethod template_instances(dataset, capacity=0)

Uses the Instances as template to create an empty dataset.

Parameters:
  • dataset (Instances) – the original dataset
  • capacity (int) – how many data rows to reserve initially (see compactify)
Returns:

the empty dataset

Return type:

Instances

test_cv(num_folds, fold)

Generates a test fold for cross-validation.

Parameters:
  • num_folds (int) – the number of folds of cross-validation, eg 10
  • fold (int) – the current fold (0-based)
Returns:

the test fold

Return type:

Instances

train_cv(num_folds, fold, random=None)

Generates a training fold for cross-validation.

Parameters:
  • num_folds (int) – the number of folds of cross-validation, eg 10
  • fold (int) – the current fold (0-based)
  • random (Random) – the random number generator
Returns:

the training fold

Return type:

Instances
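
To show how test_cv and train_cv partition the rows for a given fold, the sketch below computes the row ranges in plain Python. It assumes contiguous folds in which the first (number of rows mod number of folds) folds receive one extra row; this is believed to match the underlying Java class but should be treated as an assumption. The helpers fold_range and train_rows are hypothetical.

```python
def fold_range(num_instances, num_folds, fold):
    """Hypothetical helper: (first_row, size) of the test fold."""
    base, extra = divmod(num_instances, num_folds)
    if fold < extra:
        size, offset = base + 1, fold   # early folds absorb one leftover row each
    else:
        size, offset = base, extra
    return fold * base + offset, size


def train_rows(num_instances, num_folds, fold):
    """Hypothetical helper: row indices that remain for training."""
    first, size = fold_range(num_instances, num_folds, fold)
    return [i for i in range(num_instances) if not first <= i < first + size]


for fold in range(3):
    print(fold, fold_range(10, 3, fold))
# 0 (0, 4)
# 1 (4, 3)
# 2 (7, 3)
```

Every row ends up in exactly one test fold, and the training fold for the same fold number is the complement of the test fold.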

train_test_split(percentage, rnd=None)

Generates a train/test split. Creates a copy of the dataset first before applying randomization.

Parameters:
  • percentage (float) – the percentage split to use (amount to use for training; 0-100)
  • rnd (Random) – the random number generator to use, if None the order gets preserved
Returns:

the train/test splits

Return type:

tuple
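
The percentage determines how many rows land in each part of the returned tuple. The sketch below works through the arithmetic in plain Python, assuming that the training share is rounded to the nearest whole row and the remainder becomes the test set. split_sizes is a hypothetical helper, not part of the wrapper.

```python
def split_sizes(num_instances, percentage):
    """Hypothetical helper: (train_size, test_size) for a percentage split."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Assumption: the training share is rounded to the nearest whole row
    train_size = int(round(num_instances * percentage / 100.0))
    return train_size, num_instances - train_size


print(split_sizes(150, 66.0))  # (99, 51)
```

For a dataset with 150 rows and a 66% split, the first element of the returned tuple would therefore hold 99 rows and the second 51.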

values(index)

Returns the internal values of this attribute from all the instance objects.

Returns:the values as numpy array
Return type:ndarray
class weka.core.dataset.Stats(jobject)

Bases: weka.core.classes.JavaObject

Container for numeric attribute stats.

count

The number of values seen.

Returns:The number of values seen
Return type:float
max

The maximum value seen, or Double.NaN if no values seen.

Returns:The maximum value seen, or Double.NaN if no values seen
Return type:float
mean

The mean of values at the last calculateDerived() call.

Returns:The mean of values at the last calculateDerived() call
Return type:float
min

The minimum value seen, or Double.NaN if no values seen.

Returns:The minimum value seen, or Double.NaN if no values seen
Return type:float
stddev

The std deviation of values at the last calculateDerived() call.

Returns:The std deviation of values at the last calculateDerived() call
Return type:float
sum

The sum of values seen.

Returns:The sum of values seen
Return type:float
sumsq

The sum of values squared seen.

Returns:The sum of values squared seen
Return type:float
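
The mean and standard deviation are derived from the running totals count, sum and sumsq. The following sketch recomputes them in plain Python, assuming the sample form of the standard deviation (division by count - 1). derived_stats is a hypothetical helper used only to illustrate the relationship between the properties.

```python
import math


def derived_stats(count, total, sumsq):
    """Hypothetical helper: (mean, stddev) from running totals."""
    if count <= 0:
        return float("nan"), float("nan")
    mean = total / count
    if count <= 1:
        return mean, float("nan")  # sample deviation is undefined for a single value
    variance = (sumsq - total * total / count) / (count - 1)
    return mean, math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, stddev = derived_stats(len(values), sum(values), sum(v * v for v in values))
print(mean, round(stddev, 4))  # 5.0 2.1381
```

For these eight values the totals are count = 8, sum = 40 and sumsq = 232, which gives a variance of (232 - 200) / 7.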
weka.core.dataset.create_instances_from_lists(x, y=None, name='data')

Allows the generation of an Instances object from a list of lists for X and a list for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter.

Parameters:
  • x (list of list) – the input variables (row wise)
  • y (list) – the output variable (optional)
  • name (str) – the name of the dataset
Returns:

the generated dataset

Return type:

Instances

weka.core.dataset.create_instances_from_matrices(x, y=None, name='data')

Allows the generation of an Instances object from a 2-dimensional matrix for X and a 1-dimensional matrix for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter.

Parameters:
  • x (ndarray) – the input variables
  • y (ndarray) – the output variable (optional)
  • name (str) – the name of the dataset
Returns:

the generated dataset

Return type:

Instances

weka.core.dataset.missing_value()

Returns the value that represents missing values in Weka (NaN).

Returns:missing value
Return type:float

weka.core.jvm module

weka.core.jvm.add_bundled_jars()

Adds the bundled jars to the JVM’s classpath.

weka.core.jvm.add_system_classpath()

Adds the system’s classpath to the JVM’s classpath.

weka.core.jvm.start(class_path=None, bundled=True, packages=False, system_cp=False, max_heap_size=None)

Initializes the javabridge connection (starts up the JVM).

Parameters:
  • class_path (list) – the additional classpath elements to add
  • bundled (bool) – whether to add jars from the “lib” directory
  • packages (bool or str) – whether to add jars from Weka packages as well (bool) or an alternative Weka home directory (str)
  • system_cp (bool) – whether to add the system classpath as well
  • max_heap_size (str) – the maximum heap size (-Xmx parameter, eg 512m or 4g)
weka.core.jvm.stop()

Kills the JVM.

weka.core.packages module

class weka.core.packages.Dependency(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.Dependency class.

source

Returns the source package.

Returns:the package
Return type:Package
target

Returns the target package constraint.

Returns:the package constraint
Return type:PackageConstraint
class weka.core.packages.Package(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.Package class.

dependencies

Returns the dependencies of the package.

Returns:the list of Dependency objects
Return type:list of Dependency
install()

Installs the package.

is_installed

Returns whether the package is installed.

Returns:whether installed
Return type:bool
metadata

Returns the meta-data.

Returns:the meta-data dictionary
Return type:dict
name

Returns the name of the package.

Returns:the name
Return type:str
url

Returns the URL of the package.

Returns:the url
Return type:str
class weka.core.packages.PackageConstraint(jobject)

Bases: weka.core.classes.JavaObject

Wrapper for the weka.core.packageManagement.PackageConstraint class.

check_constraint(pkge=None, constr=None)

Checks the constraints.

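Parameters:
  • pkge (Package) – the package to check
  • constr (PackageConstraint) – the package constraint to check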
get_package()

Returns the package.

Returns:the package
Return type:Package
set_package(pkge)

Sets the package.

Parameters:pkge (Package) – the package
weka.core.packages.all_packages()

Returns a list of all packages.

Returns:the list of packages
Return type:list
weka.core.packages.available_packages()

Returns a list of all packages that aren’t installed yet.

Returns:the list of packages
Return type:list
weka.core.packages.establish_cache()

Establishes the package cache if necessary.

weka.core.packages.install_package(pkge, version='Latest')

Installs the specified package.

Parameters:
  • pkge (str) – the name of the repository package, a URL (http/https) or a zip file
  • version (str) – in case of the repository packages, the version
Returns:

whether successfully installed

Return type:

bool
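
The pkge parameter accepts three kinds of input: a repository package name, a URL, or a zip file. The sketch below shows one plausible way to tell them apart in plain Python. The helper classify_package_source and its rules (URL prefix first, then the .zip suffix) are assumptions for illustration; the wrapper's actual dispatch logic is not documented here.

```python
def classify_package_source(pkge):
    """Hypothetical helper: decide which kind of package source a string names."""
    lowered = pkge.lower()
    if lowered.startswith(("http://", "https://")):
        return "url"          # downloaded from the given address
    if lowered.endswith(".zip"):
        return "zip"          # local archive
    return "repository"       # looked up by name; the version parameter applies here


# The names and paths below are made up for the example
for source in ["somePackage", "https://example.com/pkg.zip", "/tmp/pkg.zip"]:
    print(source, "->", classify_package_source(source))
```

Only the repository case makes use of the version parameter, which matches its description above.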

weka.core.packages.installed_packages()

Returns a list of the installed packages.

Returns:the list of packages
Return type:list
weka.core.packages.is_installed(name)

Checks whether a package with the name is already installed.

Parameters:name (str) – the name of the package
Returns:whether the package is installed
Return type:bool
weka.core.packages.refresh_cache()

Refreshes the cache.

weka.core.packages.uninstall_package(name)

Uninstalls a package.

Parameters:name (str) – the name of the package
Returns:whether successfully uninstalled
Return type:bool

weka.core.serialization module

weka.core.serialization.deepcopy(obj)

Creates a deep copy of the JavaObject (or derived class) or JB_Object.

Parameters:obj (object) – the object to create a copy of
Returns:the copy, None if failed to copy
Return type:object
weka.core.serialization.read(filename)

Reads the serialized object from disk. Caller must wrap object in appropriate Python wrapper class.

Parameters:filename (str) – the file with the serialized object
Returns:the JB_Object
Return type:JB_Object
weka.core.serialization.read_all(filename)

Reads the serialized objects from disk. Caller must wrap objects in appropriate Python wrapper classes.

Parameters:filename (str) – the file with the serialized objects
Returns:the list of JB_Objects
Return type:list
weka.core.serialization.write(filename, jobject)

Serializes the object to disk. JavaObject instances get automatically unwrapped.

Parameters:
  • filename (str) – the file to serialize the object to
  • jobject (JB_Object or JavaObject) – the object to serialize
weka.core.serialization.write_all(filename, jobjects)

Serializes the list of objects to disk. JavaObject instances get automatically unwrapped.

Parameters:
  • filename (str) – the file to serialize the object to
  • jobjects (list) – the list of objects to serialize

weka.core.stemmers module

class weka.core.stemmers.Stemmer(classname='weka.core.stemmers.NullStemmer', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for stemmers.

stem(s)

Performs stemming on the string.

Parameters:s (str) – the string to stem
Returns:the stemmed string
Return type:str

weka.core.stopwords module

class weka.core.stopwords.Stopwords(classname='weka.core.stopwords.Null', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for stopwords handlers.

is_stopword(s)

Checks whether a string is a stopword.

Parameters:s (str) – the string to check
Returns:True if a stopword
Return type:bool

weka.core.tokenizers module

class weka.core.tokenizers.TokenIterator(tokenizer)

Bases: object

Iterator for string tokens.

next()

Returns the next token.

Returns:the next token
Return type:str
class weka.core.tokenizers.Tokenizer(classname='weka.core.tokenizers.AlphabeticTokenizer', jobject=None, options=None)

Bases: weka.core.classes.OptionHandler

Wrapper class for tokenizers.

tokenize(s)

Tokenizes the string.

Parameters:s (str) – the string to tokenize
Returns:the iterator
Return type:TokenIterator

weka.core.types module

weka.core.types.double_matrix_to_ndarray(m)

Turns the Java matrix (2-dim array) of doubles into a numpy 2-dim array.

Parameters:m (JB_Object) – the double matrix
Returns:Numpy array
Return type:numpy.ndarray
weka.core.types.double_to_float(d)

Turns the Python float into a Java java.lang.Float object.

Parameters:d (float) – the Python float
Returns:the Float object
Return type:JB_Object
weka.core.types.enumeration_to_list(enm)

Turns the java.util.Enumeration into a list.

Parameters:enm (JB_Object) – the enumeration to convert
Returns:the list
Return type:list
weka.core.types.string_array_to_list(a)

Turns the Java string array into Python unicode string list.

Parameters:a (JB_Object) – the string array to convert
Returns:the string list
Return type:list
weka.core.types.string_list_to_array(l)

Turns a Python unicode string list into a Java String array.

Parameters:l (list) – the string list
Returns:the Java string array
Return type:JB_Object

weka.core.version module

weka.core.version.weka_version()

Determines the version of Weka in use.

Returns:the version
Return type:str

Module contents