weka.core package¶

weka.core.capabilities module¶

class weka.core.capabilities.Capabilities(jobject=None, owner=None)¶

Bases: JavaObject

Wrapper for Capabilities.

attribute_capabilities()¶

Returns all the attribute capabilities.

Returns:: attribute capabilities
Return type:: Capabilities

capabilities()¶

Returns all the capabilities.

Returns:: all capabilities
Return type:: list

class_capabilities()¶

Returns all the class capabilities.

Returns:: class capabilities
Return type:: Capabilities

dependencies()¶

Returns all the dependencies.

Returns:: the dependency list
Return type:: list

disable(capability)¶

Disables the specified capability.

Parameters:: capability (Capability) – the capability to disable

disable_all()¶: Disables all capabilities.

disable_all_attribute_dependencies()¶: Disables all attribute dependencies.

disable_all_attributes()¶: Disables all attributes.

disable_all_class_dependencies()¶: Disables all class dependencies.

disable_all_classes()¶: Disables all classes.

disable_dependency(capability)¶

Disables the dependency of the given capability. Disabling NOMINAL_ATTRIBUTES also disables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:: capability (Capability) – the dependency to disable

enable(capability)¶

Enables the specified capability.

Parameters:: capability (Capability) – the capability to enable

enable_all()¶: Enables all capabilities.

enable_all_attribute_dependencies()¶: Enables all attribute dependencies.

enable_all_attributes()¶: Enables all attributes.

enable_all_class_dependencies()¶: Enables all class dependencies.

enable_all_classes()¶: Enables all classes.

enable_dependency(capability)¶

Enables the dependency of the given capability. Enabling NOMINAL_ATTRIBUTES also enables BINARY_ATTRIBUTES, UNARY_ATTRIBUTES and EMPTY_NOMINAL_ATTRIBUTES.

Parameters:: capability (Capability) – the dependency to enable

classmethod for_instances(data, multi=None)¶

Returns a Capabilities object specific to this data. The minimum number of instances is not set; the check for multi-instance data is optional.

Parameters:

data (Instances) – the data to generate the capabilities for
multi (bool) – whether to check the structure, too

Returns:

the generated capabilities

Return type:

Capabilities

handles(capability)¶

Returns whether the specified capability is set.

Parameters:: capability (Capability) – the capability to check
Returns:: whether the capability is set
Return type:: bool

has_dependencies()¶

Returns whether any dependencies are set.

Returns:: whether any dependencies are set
Return type:: bool

has_dependency(capability)¶

Returns whether the specified dependency is set.

Parameters:: capability (Capability) – the capability to check
Returns:: whether the dependency is set
Return type:: bool

property min_instances¶

Returns the minimum number of instances that must be supported.

Returns:: the minimum number
Return type:: int

other_capabilities()¶

Returns all other capabilities.

Returns:: all other capabilities
Return type:: Capabilities

property owner¶

Returns the owner of these capabilities, if any.

Returns:: the owner, can be None
Return type:: JavaObject

supports(capabilities)¶

Returns true if the currently set capabilities support at least all of the capabilities of the given Capabilities object (checks only the enum!).

Parameters:: capabilities (Capabilities) – the capabilities to check
Returns:: whether the current capabilities support at least the specified ones
Return type:: bool

supports_maybe(capabilities)¶

Returns true if the currently set capabilities support (or have a dependency) at least all of the capabilities of the given Capabilities object (checks only the enum!)

Parameters:: capabilities (Capabilities) – the capabilities to check
Returns:: whether the current capabilities (potentially) support the specified ones
Return type:: bool
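The difference between supports and supports_maybe can be illustrated with a toy model. This is only a sketch: plain Python sets stand in for the capability enums, and ToyCapabilities is a hypothetical name that is not part of the wrapper. It assumes that supports_maybe treats a capability that is only registered as a dependency as potentially supported.

```python
# Toy model only: plain Python sets stand in for Weka capability enums.
# ToyCapabilities is a hypothetical class, not part of the wrapper.
class ToyCapabilities:
    def __init__(self, enabled=(), dependencies=()):
        self.enabled = set(enabled)            # capabilities that are set
        self.dependencies = set(dependencies)  # capabilities set only as dependencies

    def supports(self, other):
        # True if every capability enabled in `other` is also enabled here.
        return other.enabled <= self.enabled

    def supports_maybe(self, other):
        # Like supports(), but a dependency also counts as potential support.
        return other.enabled <= (self.enabled | self.dependencies)


scheme = ToyCapabilities(
    enabled={"NUMERIC_ATTRIBUTES", "NOMINAL_CLASS"},
    dependencies={"NOMINAL_ATTRIBUTES"},
)
needed = ToyCapabilities(enabled={"NOMINAL_ATTRIBUTES", "NOMINAL_CLASS"})

print(scheme.supports(needed))        # False
print(scheme.supports_maybe(needed))  # True
```

In this model a dependency means "may be supported, depending on configuration", which is why only supports_maybe accepts the request.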

test_attribute(att, is_class=None, fail=False)¶

Tests whether the attribute meets the conditions.

Parameters:

att (Attribute) – the Attribute to test
is_class (bool) – whether this attribute is the class attribute
fail (bool) – whether to fail with an exception in case the test fails

Returns:

whether the attribute meets the conditions

Return type:

bool

test_instances(data, from_index=None, to_index=None, fail=False)¶

Tests whether the dataset meets the conditions.

Parameters:

data (Instances) – the Instances to test
from_index (int) – the first attribute to include
to_index (int) – the last attribute to include
fail (bool) – whether to fail with an exception in case the test fails

Returns:

whether the dataset meets the requirements

Return type:

bool

class weka.core.capabilities.Capability(jobject=None, member=None)¶

Bases: Enum

Wrapper for a Capability.

property is_attribute¶

Returns whether this capability is an attribute.

Returns:: whether it is an attribute
Return type:: bool

property is_attribute_capability¶

Returns whether this capability is an attribute capability.

Returns:: whether it is an attribute capability
Return type:: bool

property is_class¶

Returns whether this capability is a class.

Returns:: whether it is a class
Return type:: bool

property is_class_capability¶

Returns whether this capability is a class capability.

Returns:: whether it is a class capability
Return type:: bool

property is_other_capability¶

Returns whether this capability is an other capability.

Returns:: whether it is an other capability
Return type:: bool

weka.core.classes module¶

class weka.core.classes.AbstractParameter(classname=None, jobject=None, options=None)¶

Bases: OptionHandler

Ancestor for all parameter classes used by SetupGenerator and MultiSearch.

property prop¶

Returns the currently set property to apply the parameter to.

Returns:: the property
Return type:: str

class weka.core.classes.Date(jobject=None, msecs=None)¶

Bases: JavaObject

Wraps a java.util.Date object.

property time¶

Returns the stored milliseconds.

Returns:: the milliseconds
Return type:: long

class weka.core.classes.Enum(jobject=None, enum=None, member=None)¶

Bases: JavaObject

Wrapper for Java enums.

property name¶

Returns the name of the enum member.

Returns:: the name
Return type:: str

property ordinal¶

Returns the ordinal of the enum member.

Returns:: the ordinal
Return type:: int

property values¶

Returns list of all enum members.

Returns:: all enum members
Return type:: list

class weka.core.classes.Environment(jobject=None)¶

Bases: JavaObject

Wraps around weka.core.Environment

add_variable(key, value, system_wide=False)¶

Adds the environment variable.

Parameters:

key (str) – the name of the variable
value (str) – the value
system_wide (bool) – whether to add the variable system wide

remove_variable(key)¶

Removes the environment variable.

Parameters:: key (str) – the name of the variable

classmethod system_wide()¶

Returns the system-wide environment.

Returns:: the environment
Return type:: Environment

variable_names()¶

Returns the names of all environment variables.

Returns:: the names of the variables
Return type:: list

variable_value(key)¶

Returns the value of the environment variable.

Parameters:: key (str) – the name of the variable
Returns:: the variable value
Return type:: str

class weka.core.classes.JavaArray(jobject)¶

Bases: JavaObject

Convenience wrapper around Java arrays.

component_type()¶

Returns the classname of the elements.

Returns:: the class of the elements
Return type:: str

classmethod new_array(classname, length)¶

Creates a new array with the given classname and length; initial values are null.

Parameters:

classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
length (int) – the length of the array

Returns:

the Java array

Return type:

JPype object

class weka.core.classes.JavaArrayIterator(data)¶

Bases: object

Iterator for elements in a Java array.

class weka.core.classes.JavaObject(jobject)¶

Bases: JSONObject

Basic Java object.

classmethod check_type(jobject, intf_or_class)¶

Returns whether the object implements the specified interface or is a subclass.

Parameters:

jobject (JPype object) – the Java object to check
intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

Returns:

whether object implements interface or is subclass

Return type:

bool

property classname¶

Returns the Java classname in dot-notation.

Returns:: the Java classname
Return type:: str

classmethod enforce_type(jobject, intf_or_class)¶

Raises an exception if the object does not implement the specified interface or is not a subclass.

Parameters:

jobject (JPype object) – the Java object to check
intf_or_class (str) – the classname in Java notation (eg “weka.core.DenseInstance”)

classmethod from_dict(d)¶

Restores an object state from a dictionary, used in de-JSONification.

Parameters:: d (dict) – the object dictionary
Returns:: the object
Return type:: object

get_property(path)¶

Attempts to get the value (jobject, a Java object) of the provided (bean) property path.

Parameters:: path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair
Returns:: the wrapped Java object
Return type:: JavaObject

property is_serializable¶

Returns true if the object is serializable.

Returns:: true if serializable
Return type:: bool

property jclass¶

Returns the Java class object of the underlying Java object.

Returns:: the Java class
Return type:: JClass

property jwrapper¶

DEPRECATED: use self.jobject directly, as it is already wrapped.

Returns the encapsulated Java object, giving access to methods using dot notation.

Returns:: the JPype object

classmethod new_instance(classname, options=None)¶

Creates a new object from the given classname using the default constructor, None in case of error.

Parameters:

classname (str) – the classname in Java notation (eg “weka.core.DenseInstance”)
options (list) – the list of options to use, ignored if None

Returns:

the Java object

Return type:

JPype object

set_property(path, jobject)¶

Attempts to set the value (jobject, a Java object) of the provided (bean) property path.

Parameters:

path (str) – the property path, e.g., “filter” for a setFilter(…)/getFilter() method pair
jobject (JPype object) – the Java object to set; if instance of JavaObject class, the jobject member is automatically used

to_dict()¶

Returns a dictionary that represents this object, to be used for JSONification.

Returns:: the object dictionary
Return type:: dict

class weka.core.classes.ListParameter(jobject=None, options=None)¶

Bases: AbstractParameter

Parameter using a predefined list of values, used by SetupGenerator and MultiSearch.

property values¶

Returns the currently set values.

Returns:: the list of values (strings)
Return type:: list

class weka.core.classes.MathParameter(jobject=None, options=None)¶

Bases: AbstractParameter

Parameter using a math expression for generating values, used by SetupGenerator and MultiSearch.

property base¶

Returns the currently set base value.

Returns:: the base
Return type:: float

property expression¶

Returns the currently set expression.

Returns:: the expression
Return type:: str

property maximum¶

Returns the currently set maximum value.

Returns:: the maximum
Return type:: float

property minimum¶

Returns the currently set minimum value.

Returns:: the minimum
Return type:: float

property step¶

Returns the currently set step value.

Returns:: the step
Return type:: float

class weka.core.classes.Option(jobject)¶

Bases: JavaObject

Wrapper for the weka.core.Option class.

property description¶

Returns the description of the option.

Returns:: the description
Return type:: str

property name¶

Returns the name of the option.

Returns:: the name
Return type:: str

property num_arguments¶

Returns the number of arguments of the option.

Returns:: the number of arguments
Return type:: int

property synopsis¶

Returns the synopsis of the option.

Returns:: the synopsis
Return type:: str

class weka.core.classes.OptionHandler(jobject, options=None)¶

Bases: JavaObject, Configurable

Ancestor for option-handling classes. Classes should implement the weka.core.OptionHandler interface to have any effect.

description()¶

Returns a description of the object.

Returns:: the description
Return type:: str

classmethod from_dict(d)¶

Restores an object state from a dictionary, used in de-JSONification.

Parameters:: d (dict) – the object dictionary
Returns:: the object
Return type:: object

global_info()¶

Returns the globalInfo() result, None if not available.

Return type:: str

property options¶

Obtains the currently set options as list.

Returns:: the list of options
Return type:: list

to_commandline()¶

Generates a commandline string from the JavaObject instance.

Returns:: the commandline string
Return type:: str

to_dict()¶

Returns a dictionary that represents this object, to be used for JSONification.

Returns:: the object dictionary
Return type:: dict

to_help(title=True, description=True, options=True, use_headers=True, separator='')¶

Returns a string that contains the ‘global_info’ text and the options.

Parameters:

title (bool) – whether to output a title
description (bool) – whether to output the description
options (bool) – whether to output the options
use_headers (bool) – whether to output headers, describing the sections
separator (str) – the separator line to use between sections

Returns:

the generated help string

Return type:

str

class weka.core.classes.Random(seed)¶

Bases: JavaObject

Wrapper for the java.util.Random class.

next_double()¶

Next random double.

Returns:: the next random double
Return type:: double

next_int(n=None)¶

Next random integer. If n is provided, the value lies between 0 and n-1.

Parameters:: n (int) – the upper limit (minus 1) for the random integer
Returns:: the next random integer
Return type:: int

class weka.core.classes.Range(jobject=None, ranges=None)¶

Bases: JavaObject

Wrapper for a Weka Range object.

property invert¶

Returns whether the range is inverted.

Returns:: true if inverted
Return type:: bool

property ranges¶

Returns the string range.

Returns:: the string range of 1-based indices
Return type:: str

selection()¶

Returns the selection list.

Returns:: the list of 0-based integer indices
Return type:: list

upper(upper)¶

Sets the upper limit.

Parameters:: upper (int) – the upper limit
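The relationship between the 1-based range string and the 0-based selection can be sketched in plain Python. This is a simplified illustration, not Weka's parser: range_to_indices is a hypothetical helper, it only understands numbers, 'first', 'last' and 'a-b' spans, and it assumes that the upper limit is the largest valid 0-based index.

```python
def range_to_indices(ranges, upper):
    """Expand a 1-based range string into sorted 0-based indices.

    `upper` is assumed to be the largest valid 0-based index.
    Simplified sketch: numbers, 'first', 'last' and 'a-b' spans only.
    """
    def position(token):
        token = token.strip().lower()
        if token == "first":
            return 0
        if token == "last":
            return upper
        return int(token) - 1  # 1-based -> 0-based

    selected = set()
    for part in ranges.split(","):
        if not part.strip():
            continue  # skip empty entries such as a trailing comma
        if "-" in part:
            start, end = (position(t) for t in part.split("-", 1))
        else:
            start = end = position(part)
        for i in range(start, end + 1):
            if 0 <= i <= upper:  # ignore positions outside the valid range
                selected.add(i)
    return sorted(selected)


print(range_to_indices("1-3,5,last", upper=9))  # [0, 1, 2, 4, 9]
```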

class weka.core.classes.SelectedTag(jobject=None, tag_id=None, tag_text=None, tags=None)¶

Bases: JavaObject

Wrapper for the weka.core.SelectedTag class.

property selected¶

Returns the selected tag.

Returns:: the tag
Return type:: Tag

property tags¶

Returns the associated tags.

Returns:: the list of Tag objects
Return type:: list

class weka.core.classes.SetupGenerator(jobject=None, options=None)¶

Bases: OptionHandler

Allows generation of large number of setups using parameter setups.

property base_object¶

Returns the base object to apply the setups to.

Returns:: the base object
Return type:: JavaObject or OptionHandler

property parameters¶

Returns the list of currently set search parameters.

Returns:: the list of AbstractParameter objects
Return type:: list

setups()¶

Generates and returns all the setups according to the parameter search space.

Returns:: the list of configured objects (of type JavaObject)
Return type:: list

class weka.core.classes.SingleIndex(jobject=None, index=None)¶

Bases: JavaObject

Wrapper for a Weka SingleIndex object.

index()¶

Returns the integer index.

Returns:: the 0-based integer index
Return type:: int

property single_index¶

Returns the string index.

Returns:: the 1-based string index
Return type:: str

upper(upper)¶

Sets the upper limit.

Parameters:: upper (int) – the upper limit

class weka.core.classes.Tag(jobject=None, ident=None, ident_str='', readable='', uppercase=True)¶

Bases: JavaObject

Wrapper for the weka.core.Tag class.

property ident¶

Returns the current integer ID of the tag.

Returns:: the integer ID
Return type:: int

property identstr¶

Returns the current ID string.

Returns:: the ID string
Return type:: str

property readable¶

Returns the ‘human readable’ string.

Returns:: the readable string
Return type:: str

class weka.core.classes.Tags(jobject=None, tags=None)¶

Bases: JavaObject

Wrapper for an array of weka.core.Tag objects.

find(name)¶

Returns the Tag that matches the name.

Parameters:: name (str) – the string representation of the tag
Returns:: the tag, None if not found
Return type:: Tag

classmethod get_object_tags(javaobject, methodname)¶

Instantiates the Tag array obtained from the object using the specified method name.

Example:

cls = Classifier(classname="weka.classifiers.meta.MultiSearch")
tags = Tags.get_object_tags(cls, "getMetricsTags")

Parameters:

javaobject (JavaObject) – the javaobject to obtain the tags from
methodname (str) – the method name returning the Tag array

Returns:

the Tags object

Return type:

Tags

weka.core.converters module¶

class weka.core.converters.IncrementalLoaderIterator(loader, structure)¶

Bases: object

Iterator for dataset rows when loading incrementally.

class weka.core.converters.Loader(classname='weka.core.converters.ArffLoader', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for Loaders.

load_file(dfile, incremental=False, class_index=None)¶

Loads the specified file and returns the Instances object. In case of incremental loading, only the structure.

Parameters:

dfile (str) – the file to load
incremental (bool) – whether to load the dataset incrementally
class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns:

the full dataset or the header (if incremental)

Return type:

Instances

Raises:

Exception – if the file does not exist
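The class index strings accepted by load_file can be read as follows. This is a minimal sketch under stated assumptions: resolve_class_index is a hypothetical helper (not part of the wrapper), and it assumes that 'last-1' and 'last-2' mean the second-to-last and third-to-last attribute.

```python
def resolve_class_index(spec, num_attributes):
    """Translate a class index string into a 0-based attribute index.

    Hypothetical helper; the mapping of 'last-1' and 'last-2' is an assumption.
    """
    named = {
        "first": 0,
        "second": 1,
        "third": 2,
        "last-2": num_attributes - 3,
        "last-1": num_attributes - 2,
        "last": num_attributes - 1,
    }
    # Anything that is not a named position is treated as a 1-based index.
    index = named[spec] if spec in named else int(spec) - 1
    if not 0 <= index < num_attributes:
        raise ValueError("class index %r is out of range" % spec)
    return index


print(resolve_class_index("last", 5))    # 4
print(resolve_class_index("last-1", 5))  # 3
print(resolve_class_index("2", 5))       # 1
```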

load_url(url, incremental=False)¶

Loads the specified URL and returns the Instances object. In case of incremental loading, only the structure.

Parameters:

url (str) – the URL to load the data from
incremental (bool) – whether to load the dataset incrementally

Returns:

the full dataset or the header (if incremental)

Return type:

Instances

class weka.core.converters.Saver(classname='weka.core.converters.ArffSaver', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for Savers.

capabilities()¶

Returns the capabilities of the saver.

Returns:: the capabilities
Return type:: Capabilities

save_file(data, dfile)¶

Saves the Instances object in the specified file.

Parameters:

data (Instances) – the data to save
dfile (str) – the file to save the data to

class weka.core.converters.TextDirectoryLoader(jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for TextDirectoryLoader.

load()¶

Loads the text files from the specified directory and returns the Instances object. In case of incremental loading, only the structure.

Returns:: the full dataset or the header (if incremental)
Return type:: Instances

weka.core.converters.load_any_file(filename, class_index=None)¶

Determines a Loader based on the file extension. If successful, loads the full dataset and returns it.

Parameters:

filename (str) – the name of the file to load
class_index (str) – the class index string to use (‘first’, ‘second’, ‘third’, ‘last-2’, ‘last-1’, ‘last’ or 1-based index)

Returns:

the full dataset

Return type:

Instances

weka.core.converters.load_csv_file(filename, dialect='excel', delimiter=',', quotechar='"', num_cols=None, nom_cols=None)¶

Loads a CSV file using the Python csv module and then converts it to an Instances object. Better at reading CSV files than Weka’s built-in CSVLoader. String attributes can be converted to nominal ones using the weka.filters.unsupervised.attribute.StringToNominal filter.

Parameters:

filename (str) – the name of the CSV file to load
dialect (str) – the type of CSV file to load
delimiter (str) – the field delimiter
quotechar (str) – the character used for quoting cells
quoting – how the quoting works
num_cols (list) – the list of 0-based column indices that are numeric, default for cols is str
nom_cols (list) – the list of 0-based column indices that are nominal, default for cols is str
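The column typing described above (numeric columns listed explicitly, everything else left as a string) can be sketched with the Python csv module alone. read_typed_rows is a hypothetical helper for illustration; it does not build an Instances object and it leaves out nominal columns.

```python
import csv
import io


def read_typed_rows(text, delimiter=",", quotechar='"', num_cols=None):
    """Parse CSV text; cells in `num_cols` (0-based) become floats, the rest stay str."""
    num_cols = set(num_cols or [])
    reader = csv.reader(io.StringIO(text), dialect="excel",
                        delimiter=delimiter, quotechar=quotechar)
    header = next(reader)  # the first row holds the column names
    rows = []
    for raw in reader:
        rows.append([float(cell) if i in num_cols else cell
                     for i, cell in enumerate(raw)])
    return header, rows


sample = 'name,weight\n"Smith, J",71.5\nLee,64\n'
header, rows = read_typed_rows(sample, num_cols=[1])
print(header)  # ['name', 'weight']
print(rows)    # [['Smith, J', 71.5], ['Lee', 64.0]]
```

The quoted cell shows why the quotechar matters: the comma inside "Smith, J" is not treated as a field delimiter.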

weka.core.converters.loader_for_file(filename)¶

Returns a Loader that can load the specified file, based on the file extension. None if failed to determine.

Parameters:: filename (str) – the filename to get the loader for
Returns:: the associated loader instance or None if none found
Return type:: Loader
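Extension-based lookup with a None fallback can be modelled as below. The registry is a hypothetical, partial mapping written for illustration; how the wrapper actually determines the loader is not shown in this reference.

```python
import os

# Hypothetical, partial registry for illustration only.
LOADERS_BY_EXTENSION = {
    ".arff": "weka.core.converters.ArffLoader",
    ".csv": "weka.core.converters.CSVLoader",
}


def loader_classname_for_file(filename):
    """Return the loader classname for the file extension, or None if unknown."""
    ext = os.path.splitext(filename)[1].lower()
    return LOADERS_BY_EXTENSION.get(ext)


print(loader_classname_for_file("iris.arff"))  # weka.core.converters.ArffLoader
print(loader_classname_for_file("notes.xyz"))  # None
```

Callers of loader_for_file should likewise check for None before using the result.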

weka.core.converters.ndarray_to_instances(array, relation, att_template='Att-#', att_list=None)¶

Converts the numpy matrix into an Instances object and returns it.

Parameters:

array (numpy.ndarray) – the numpy ndarray to convert
relation (str) – the name of the dataset
att_template (str) – the prefix to use for the attribute names, “#” is the 1-based index, “!” is the 0-based index, “@” the relation name
att_list (list) – the list of attribute names to use

Returns:

the generated instances object

Return type:

Instances
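The placeholders in att_template can be illustrated with a small helper. This is a sketch only: attribute_names is a hypothetical function, and it assumes that att_list, when given, takes precedence over the template.

```python
def attribute_names(num_cols, relation, att_template="Att-#", att_list=None):
    """Generate attribute names for `num_cols` columns.

    '#' is replaced by the 1-based index, '!' by the 0-based index and
    '@' by the relation name. Assumes att_list overrides the template.
    """
    if att_list is not None:
        return list(att_list)
    names = []
    for i in range(num_cols):
        # The relation name is substituted last, so '#' or '!' inside it stay untouched.
        name = (att_template
                .replace("#", str(i + 1))
                .replace("!", str(i))
                .replace("@", relation))
        names.append(name)
    return names


print(attribute_names(3, "demo"))         # ['Att-1', 'Att-2', 'Att-3']
print(attribute_names(2, "demo", "@_!"))  # ['demo_0', 'demo_1']
```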

weka.core.converters.save_any_file(data, filename)¶

Determines a Saver based on the file extension. Returns whether successfully saved.

Parameters:

filename (str) – the name of the file to save
data (Instances) – the data to save

Returns:

whether successfully saved

Return type:

bool

weka.core.converters.saver_for_file(filename)¶

Returns a Saver that can save the specified file, based on the file extension. None if failed to determine.

Parameters:: filename (str) – the filename to get the saver for
Returns:: the associated saver instance or None if none found
Return type:: Saver

weka.core.database module¶

class weka.core.database.DatabaseUtils(jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for weka.experiment.DatabaseUtils.

property db_url¶

Obtains the currently set database URL.

Returns:: the database URL
Return type:: str

property password¶

Obtains the currently set database password.

Returns:: the database password
Return type:: str

property user¶

Obtains the currently set database user.

Returns:: the database user
Return type:: str

class weka.core.database.InstanceQuery(jobject=None, options=None)¶

Bases: DatabaseUtils

Wrapper class for weka.experiment.InstanceQuery.

property custom_properties¶

Obtains the currently set custom properties file.

Returns:: the custom properties file
Return type:: str

property query¶

Obtains the current SQL query to execute.

Returns:: the SQL query
Return type:: str

retrieve_instances(query=None)¶

Executes either the supplied query or the one set via options (or the ‘query’ property).

Parameters:: query (str) – query to execute if not the currently set one
Returns:: the generated data
Return type:: Instances

property sparse_data¶

Obtains whether sparse data is returned or not.

Returns:: whether sparse data is generated
Return type:: bool

weka.core.dataset module¶

class weka.core.dataset.Attribute(jobject)¶

Bases: JavaObject

Wrapper class for weka.core.Attribute.

add_relation(instances)¶

Adds the relation value, returns the index.

Parameters:: instances (Instances) – the Instances object to add
Returns:: the index
Return type:: int

add_string_value(s)¶

Adds the string value, returns the index.

Parameters:: s (str) – the string to add
Returns:: the index
Return type:: int

copy(name=None)¶

Creates a copy of this attribute.

Parameters:: name (str) – the new name, uses the old one if None
Returns:: the copy of the attribute
Return type:: Attribute

classmethod create_date(name, formt="yyyy-MM-dd'T'HH:mm:ss")¶

Creates a date attribute.

Parameters:

name (str) – the name of the attribute
formt (str) – the date format, see Javadoc for java.text.SimpleDateFormat

classmethod create_nominal(name, labels)¶

Creates a nominal attribute.

Parameters:

name (str) – the name of the attribute
labels (list) – the list of string labels to use

classmethod create_numeric(name)¶

Creates a numeric attribute.

Parameters:: name (str) – the name of the attribute

classmethod create_relational(name, inst)¶

Creates a relational attribute.

Parameters:

name (str) – the name of the attribute
inst (Instances) – the structure of the relational attribute

classmethod create_string(name)¶

Creates a string attribute.

Parameters:: name (str) – the name of the attribute

property date_format¶

Returns the format of this date attribute. See java.text.SimpleDateFormat Javadoc.

Returns:: the format string
Return type:: str

equals(att)¶

Checks whether this attribute is the same as the provided one.

Parameters:: att (Attribute) – the Attribute to check against
Returns:: whether the same
Return type:: bool

equals_msg(att)¶

Checks whether this attribute is the same as the provided one. Returns None if the same, otherwise an error message.

Parameters:: att (Attribute) – the Attribute to check against
Returns:: None if the same, otherwise error message
Return type:: str

property index¶

Returns the index of this attribute.

Returns:: the index
Return type:: int

index_of(label)¶

Returns the index of the label in this attribute.

Parameters:: label (str) – the string label to get the index for
Returns:: the 0-based index
Return type:: int

property is_averagable¶

Returns whether the attribute is averagable.

Returns:: whether averagable
Return type:: bool

property is_date¶

Returns whether the attribute is a date one.

Returns:: whether date attribute
Return type:: bool

is_in_range(value)¶

Checks whether the value is within the bounds of the numeric attribute.

Parameters:: value (float) – the numeric value to check
Returns:: whether between lower and upper bound
Return type:: bool

property is_nominal¶

Returns whether the attribute is a nominal one.

Returns:: whether nominal attribute
Return type:: bool

property is_numeric¶

Returns whether the attribute is a numeric one (date or numeric).

Returns:: whether numeric attribute
Return type:: bool

property is_relation_valued¶

Returns whether the attribute is a relation valued one.

Returns:: whether relation valued attribute
Return type:: bool

property is_string¶

Returns whether the attribute is a string attribute.

Returns:: whether string attribute
Return type:: bool

property lower_numeric_bound¶

Returns the lower numeric bound of the numeric attribute.

Returns:: the lower bound
Return type:: float

property name¶

Returns the name of the attribute.

Returns:: the name
Return type:: str

property num_values¶

Returns the number of labels.

Returns:: the number of labels
Return type:: int

property ordering¶

Returns the ordering of the attribute.

Returns:: the ordering (ORDERING_SYMBOLIC, ORDERING_ORDERED, ORDERING_MODULO)
Return type:: int

parse_date(s)¶

Parses the date string and returns the internal format value.

Parameters:: s (str) – the date string
Returns:: the internal format
Return type:: float

property type¶

Returns the type of the attribute. See weka.core.Attribute Javadoc.

Returns:: the type
Return type:: int

type_str(short=False)¶

Returns the type of the attribute as string.

Returns:: the type
Return type:: str

property upper_numeric_bound¶

Returns the upper numeric bound of the numeric attribute.

Returns:: the upper bound
Return type:: float

value(index)¶

Returns the label for the index.

Parameters:: index (int) – the 0-based index of the label to return
Returns:: the label
Return type:: str

property values¶

Returns the labels, strings or relation-values.

Returns:: all the values, None if not NOMINAL, STRING, or RELATION
Return type:: list

property weight¶

Returns the weight of the attribute.

Returns:: the weight
Return type:: float

class weka.core.dataset.AttributeIterator(data)¶

Bases: object

Iterator for attributes in an Instances object.

class weka.core.dataset.AttributeStats(jobject)¶

Bases: JavaObject

Container for attribute statistics.

property distinct_count¶

The number of distinct values.

Returns:: The number of distinct values
Return type:: int

property int_count¶

The number of int-like values.

Returns:: The number of int-like values
Return type:: int

property missing_count¶

The number of missing values.

Returns:: The number of missing values
Return type:: int

property nominal_counts¶

Counts of each nominal value.

Returns:: Counts of each nominal value
Return type:: ndarray

property nominal_weights¶

Weight mass for each nominal value.

Returns:: Weight mass for each nominal value
Return type:: ndarray

property numeric_stats¶

Stats on numeric value distributions.

Returns:: Stats on numeric value distributions
Return type:: NumericStats

property total_count¶

The total number of values.

Returns:: The total number of values
Return type:: int

property unique_count¶

The number of values that only appear once.

Returns:: The number of values that only appear once
Return type:: int

class weka.core.dataset.Instance(jobject)¶

Bases: JavaObject

Wrapper class for weka.core.Instance.

property class_attribute¶

Returns the currently set class attribute.

Returns:: the class attribute
Return type:: Attribute

property class_index¶

Returns the currently set class index.

Returns:: the class index, -1 if not set
Return type:: int

classmethod create_instance(values, classname='weka.core.DenseInstance', weight=1.0)¶

Creates a new instance.

Parameters:

values (ndarray or list) – the float values (internal format) to use, numpy array or list.
classname (str) – the classname of the instance (eg weka.core.DenseInstance).
weight (float) – the weight of the instance
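The internal format mentioned above stores every value as a float: numeric values as they are, nominal values as the 0-based index of their label, and missing values as NaN. The helper below is a hypothetical sketch of that encoding for a single row; it does not call the wrapper.

```python
import math


def to_internal(row, nominal_labels):
    """Encode one row as a list of floats in Weka's internal format.

    `nominal_labels` maps a 0-based column index to its list of labels.
    None marks a missing value and is encoded as NaN.
    """
    values = []
    for i, cell in enumerate(row):
        if cell is None:
            values.append(math.nan)  # missing value
        elif i in nominal_labels:
            values.append(float(nominal_labels[i].index(cell)))  # label -> index
        else:
            values.append(float(cell))  # numeric value as-is
    return values


labels = {2: ["no", "yes"]}
encoded = to_internal([5.1, None, "yes"], labels)
print(encoded)  # [5.1, nan, 1.0]
```

A list produced this way has the shape that the values parameter of create_instance expects.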

classmethod create_sparse_instance(values, max_values, classname='weka.core.SparseInstance', weight=1.0)¶

Creates a new sparse instance.

Parameters:

values (list) – the list of tuples (0-based index and internal format float). The indices of the tuples must be in ascending order and “max_values” must be set to the maximum number of attributes in the dataset.
max_values (int) – the maximum number of attributes
classname (str) – the classname of the instance (eg weka.core.SparseInstance).
weight (float) – the weight of the instance
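Because the tuples must be in ascending index order, it can help to build them from a dictionary and sort before creating the instance. to_sparse_values is a hypothetical helper sketching that preparation step; it only prepares the list and does not create the instance.

```python
def to_sparse_values(values_by_index, max_values):
    """Turn {0-based index: value} into a sorted list of (index, float) tuples."""
    for index in values_by_index:
        if not 0 <= index < max_values:
            raise ValueError("index %d is outside 0..%d" % (index, max_values - 1))
    # Sorting the tuples orders them by index, as the parameter description requires.
    return sorted((i, float(v)) for i, v in values_by_index.items())


print(to_sparse_values({3: 1.0, 0: 2.5}, max_values=5))  # [(0, 2.5), (3, 1.0)]
```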

property dataset¶

Returns the dataset that this instance belongs to.

Returns:: the dataset or None if no dataset set
Return type:: Instances

get_relational_value(index)¶

Returns the relational value at the specified position (0-based).

Parameters:: index (int) – the 0-based index of the internal value
Returns:: the relational value
Return type:: Instances

get_string_value(index)¶

Returns the string value at the specified position (0-based).

Parameters:: index (int) – the 0-based index of the internal value
Returns:: the string value
Return type:: str

get_value(index)¶

Returns the internal value at the specified position (0-based).

Parameters:: index (int) – the 0-based index of the internal value
Returns:: the internal value
Return type:: float

has_class()¶

Returns whether a class attribute is set (convenience method).

Returns:: whether a class attribute is currently set
Return type:: bool

has_missing()¶

Returns whether at least one attribute has a missing value.

Returns:: whether at least one value is missing
Return type:: bool

is_missing(index)¶

Returns whether the attribute at the specified index is missing.

Parameters:: index (int) – the 0-based index of the attribute
Returns:: whether the value is missing
Return type:: bool

classmethod missing_value()¶

Returns the numeric value that represents a missing value in Weka (NaN).

Returns:: missing value
Return type:: float

property num_attributes¶

Returns the number of attributes.

Returns:: the number of attributes
Return type:: int

property num_classes¶

Returns the number of class labels.

Returns:: the number of class labels
Return type:: int

set_missing(index)¶

Sets the attribute at the specified index to missing.

Parameters:: index (int) – the 0-based index of the attribute

set_string_value(index, s)¶

Sets the string value at the specified position (0-based).

Parameters:

index (int) – the 0-based index of the internal value
s (str) – the string value

set_value(index, value)¶

Sets the internal value at the specified position (0-based).

Parameters:

index (int) – the 0-based index of the attribute
value (float) – the internal float value to set

to_numpy(internal=False)¶

Turns the instance into a numpy matrix.

Parameters:: internal (bool) – whether to return the internal format
Returns:: the instance as a matrix with a single row
Return type:: np.ndarray

property values¶

Returns the internal values of this instance.

Returns:: the values as numpy array
Return type:: ndarray

property weight¶

Returns the currently set weight.

Returns:: the weight
Return type:: float

class weka.core.dataset.InstanceIterator(data)¶

Bases: object

Iterator for rows in an Instances object.

class weka.core.dataset.InstanceValueIterator(data)¶

Bases: object

Iterator for values in an Instance object.

class weka.core.dataset.Instances(jobject)¶

Bases: JavaObject

Wrapper class for weka.core.Instances.

add_instance(inst, index=None)¶

Adds the specified instance to the dataset.

Parameters:

inst (Instance) – the Instance to add
index (int) – the 0-based index where to add the Instance

classmethod append_instances(inst1, inst2)¶

Merges the two datasets (one-after-the-other). Throws an exception if the datasets aren’t compatible.

Parameters:

inst1 (Instances) – the first dataset
inst2 (Instances) – the second dataset

Returns:

the combined dataset

Return type:

Instances

attribute(index)¶

Returns the specified attribute.

Parameters:: index (int) – the 0-based index of the attribute
Returns:: the attribute
Return type:: Attribute

attribute_by_name(name)¶

Returns the specified attribute, None if not found.

Parameters:: name (str) – the name of the attribute
Returns:: the attribute or None
Return type:: Attribute

attribute_names()¶

Returns a list of all the attribute names.

Returns:: list of attribute names
Return type:: list

attribute_stats(index)¶

Returns the specified attribute statistics.

Parameters:: index (int) – the 0-based index of the attribute
Returns:: the attribute statistics
Return type:: AttributeStats

attributes()¶: Returns an iterator over the attributes.

property class_attribute¶

Returns the currently set class attribute.

Returns:: the class attribute
Return type:: Attribute

property class_index¶

Returns the currently set class index (0-based).

Returns:: the class index, -1 if not set
Return type:: int

class_is_first()¶: Sets the first attribute as class attribute (convenience method).

class_is_last()¶: Sets the last attribute as class attribute (convenience method).

compactify()¶: Compactifies the set of instances.

classmethod copy_instances(dataset, from_row=None, num_rows=None)¶

Creates a copy of the Instances. If either from_row or num_rows is None, all of the data is copied.

Parameters:

dataset (Instances) – the original dataset
from_row (int) – the 0-based start index of the rows to copy
num_rows (int) – the number of rows to copy

Returns:

the copy of the data

Return type:

Instances

copy_structure()¶

Returns a copy of the dataset structure.

Returns:: the structure of the dataset
Return type:: Instances

classmethod create_instances(name, atts, capacity)¶

Creates a new Instances.

Parameters:

name (str) – the relation name
atts (list of Attribute) – the list of attributes to use for the dataset
capacity (int) – how many data rows to reserve initially (see compactify)

Returns:

the dataset

Return type:

Instances

cv_splits(folds=10, rnd=None, stratify=True)¶

Generates a list of train/test pairs used in cross-validation. Creates a copy of the dataset beforehand when randomizing.

Parameters:

folds (int) – the number of folds to use, >= 2
rnd (Random) – the random number generator to use for randomization, skips randomization if None
stratify (bool) – whether to stratify the data after randomization

Returns:

the list of train/test split tuples

Return type:

list
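
To make the shape of the returned train/test pairs concrete, here is a plain-Python sketch that partitions row indices into folds. It is not the wrapper's implementation: it assumes folds are contiguous blocks with any remainder rows spread over the first folds, it skips randomization and stratification, and the helper names `fold_bounds` and `cv_index_splits` are hypothetical.

```python
def fold_bounds(num_rows, num_folds, fold):
    """Return (start, stop) of the test block for the 0-based fold."""
    if num_folds < 2:
        raise ValueError("number of folds must be at least 2")
    if num_folds > num_rows:
        raise ValueError("cannot have more folds than rows")
    base, extra = divmod(num_rows, num_folds)
    # the first `extra` folds receive one additional row each
    start = fold * base + min(fold, extra)
    size = base + (1 if fold < extra else 0)
    return start, start + size


def cv_index_splits(num_rows, num_folds):
    """Return a list of (train_indices, test_indices) tuples, one per fold."""
    splits = []
    for fold in range(num_folds):
        start, stop = fold_bounds(num_rows, num_folds, fold)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, num_rows))
        splits.append((train, test))
    return splits


splits = cv_index_splits(10, 3)
for train, test in splits:
    print(len(train), len(test))
```

Every row lands in exactly one test fold, and each train list is the complement of its test list.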

delete(index=None)¶

Removes either the specified Instance or all Instance objects.

Parameters:: index (int) – the 0-based index of the instance to remove, None to remove all instances

delete_attribute(index)¶

Deletes an attribute at the given position.

Parameters:: index (int) – the 0-based index of the attribute to remove

delete_attribute_type(typ)¶

Deletes all attributes of the given type in the dataset.

Parameters:: typ (int) – the attribute type to remove, see weka.core.Attribute Javadoc

delete_first_attribute()¶: Deletes the first attribute.

delete_last_attribute()¶: Deletes the last attribute.

delete_with_missing(index)¶

Deletes all rows that have a missing value at the specified attribute index.

Parameters:: index (int) – the 0-based index of the attribute to check for missing values

equal_headers(inst)¶

Compares this dataset against the given one in terms of attributes.

Parameters:: inst (Instances) – the dataset to compare against
Returns:: None if the same, otherwise an error message
Return type:: str

get_instance(index)¶

Returns the Instance object at the specified location.

Parameters:: index (int) – the 0-based index of the instance
Returns:: the instance
Return type:: Instance

has_class()¶

Returns whether a class attribute is set (convenience method).

Returns:: whether a class attribute is currently set
Return type:: bool

insert_attribute(att, index)¶

Inserts the attribute at the specified location.

Parameters:

att (Attribute) – the attribute to insert
index (int) – the index to insert the attribute at

classmethod merge_instances(inst1, inst2)¶

Merges the two datasets (side-by-side).

Parameters:

inst1 (Instances or str) – the first dataset
inst2 (Instances) – the second dataset

Returns:

the combined dataset

Return type:

Instances

no_class()¶: Unsets the class attribute (convenience method).

property num_attributes¶

Returns the number of attributes.

Returns:: the number of attributes
Return type:: int

property num_instances¶

Returns the number of instances.

Returns:: the number of instances
Return type:: int

randomize(random)¶

Randomizes the dataset using the random number generator.

Parameters:: random (Random) – the random number generator to use

property relationname¶

Returns the name of the dataset.

Returns:: the name
Return type:: str

set_instance(index, inst)¶

Sets the Instance at the specified location in the dataset.

Parameters:

index (int) – the 0-based index of the instance to replace
inst (Instance) – the Instance to set

Returns:

the instance

Return type:

Instance

sort(index)¶

Sorts the dataset using the specified attribute index.

Parameters:: index (int) – the index of the attribute

stratify(folds)¶

Stratifies the data after randomization for nominal class attributes.

Parameters:: folds (int) – the number of folds to perform the stratification for

subset(col_range=None, col_names=None, invert_cols=False, row_range=None, invert_rows=False, keep_relationame=False)¶

Returns a subset of attributes/rows of the Instances object. If neither attributes nor rows have been specified, a copy of the dataset is returned. The inverse of the specified cols/rows can be returned by setting invert_cols and/or invert_rows to True. The method uses the weka.filters.unsupervised.attribute.Remove and weka.filters.unsupervised.instance.RemoveRange filters under the hood.

Parameters:

col_range (str) – the subset of attributes to return (eg ‘1-3,7-12,67-last’), None for all
col_names (list) – the list of attributes to return (list of names; case-sensitive), takes precedence over col_range
invert_cols (bool) – whether to invert the returned attributes
row_range (str) – the subset of rows to return (eg ‘1-3,7-12,67-last’), None for all
invert_rows (bool) – whether to invert the returned rows
keep_relationame (bool) – whether to keep the original relation name

Returns:

the subset

Return type:

Instances
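
The range strings accepted by col_range and row_range (such as ‘1-3,7-12,67-last’) can be read as 1-based, inclusive ranges. The sketch below expands such a string into 0-based indices in plain Python. It only illustrates the notation and assumes the 1-based convention and the `first`/`last` keywords; `parse_range` and `select` are hypothetical helpers, not part of the wrapper.

```python
def parse_range(spec, num_items):
    """Expand a 1-based range string into a sorted list of 0-based indices."""

    def to_index(token):
        token = token.strip().lower()
        if token == "first":
            return 0
        if token == "last":
            return num_items - 1
        return int(token) - 1  # 1-based in the string, 0-based in Python

    indices = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-", 1)
            start, stop = to_index(lo), to_index(hi)
        else:
            start = stop = to_index(part)
        if start < 0 or stop >= num_items or start > stop:
            raise ValueError("invalid range: " + part)
        indices.update(range(start, stop + 1))
    return sorted(indices)


def select(indices, num_items, invert=False):
    """Return the chosen indices, or their complement when invert is True."""
    chosen = set(indices)
    return [i for i in range(num_items) if (i in chosen) != invert]


kept = parse_range("1-3,5,7-last", 8)
print(kept)
print(select(kept, 8, invert=True))
```

The `invert` flag mirrors what invert_cols and invert_rows describe: the complement of the specified selection is returned.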

classmethod summary(inst)¶

Generates a summary of the dataset.

Parameters:: inst (Instances) – the dataset
Returns:: the summary
Return type:: str

classmethod template_instances(dataset, capacity=0)¶

Uses the Instances as template to create an empty dataset.

Parameters:

dataset (Instances) – the original dataset
capacity (int) – how many data rows to reserve initially (see compactify)

Returns:

the empty dataset

Return type:

Instances

test_cv(num_folds, fold)¶

Generates a test fold for cross-validation.

Parameters:

num_folds (int) – the number of folds of cross-validation, eg 10
fold (int) – the current fold (0-based)

Returns:

the test fold

Return type:

Instances

to_numpy(internal=False)¶

Turns the dataset into a numpy matrix.

Parameters:: internal (bool) – whether to return the internal format
Returns:: the dataset as matrix
Return type:: np.ndarray

train_cv(num_folds, fold, random=None)¶

Generates a training fold for cross-validation.

Parameters:

num_folds (int) – the number of folds of cross-validation, eg 10
fold (int) – the current fold (0-based)
random (Random) – the random number generator

Returns:

the training fold

Return type:

Instances

train_test_split(percentage, rnd=None)¶

Generates a train/test split. Creates a copy of the dataset first before applying randomization.

Parameters:

percentage (double) – the percentage split to use (amount to use for training; 0-100)
rnd (Random) – the random number generator to use, if None the order gets preserved

Returns:

the train/test splits

Return type:

tuple
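
The copy-then-split behaviour described above can be sketched on a plain Python list. This is an illustration, not the wrapper's code: rounding the training size to the nearest integer is an assumption, Python's `random.Random` stands in for the wrapper's Random class, and `split_rows` is a hypothetical name.

```python
import random


def split_rows(rows, percentage, rnd=None):
    """Split rows into (train, test); percentage is the training share (0-100)."""
    if not 0 < percentage < 100:
        raise ValueError("percentage must be between 0 and 100 (exclusive)")
    data = list(rows)  # copy first, so the caller's list stays untouched
    if rnd is not None:
        rnd.shuffle(data)  # without a generator the original order is preserved
    # assumption: the training size is rounded to the nearest integer
    train_size = int(round(len(data) * percentage / 100.0))
    return data[:train_size], data[train_size:]


rows = list(range(10))
train, test = split_rows(rows, 66.0)
shuffled_train, shuffled_test = split_rows(rows, 66.0, rnd=random.Random(1))
print(train, test)
```

Passing no generator is useful for time-ordered data, where the test portion should consist of the most recent rows.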

values(index)¶

Returns the internal values of this attribute from all the instance objects.

Parameters:: index (int) – the 0-based index of the attribute
Returns:: the values as numpy array
Return type:: np.ndarray

class weka.core.dataset.Stats(jobject)¶

Bases: JavaObject

Container for numeric attribute stats.

property count¶

The number of values seen.

Returns:: The number of values seen
Return type:: float

property max¶

The maximum value seen, or Double.NaN if no values seen.

Returns:: The maximum value seen, or Double.NaN if no values seen
Return type:: float

property mean¶

The mean of values at the last calculateDerived() call.

Returns:: The mean of values at the last calculateDerived() call
Return type:: float

property min¶

The minimum value seen, or Double.NaN if no values seen.

Returns:: The minimum value seen, or Double.NaN if no values seen
Return type:: float

property stddev¶

The std deviation of values at the last calculateDerived() call.

Returns:: The std deviation of values at the last calculateDerived() call
Return type:: float

property sum¶

The sum of values seen.

Returns:: The sum of values seen
Return type:: float

property sumsq¶

The sum of values squared seen.

Returns:: The sum of values squared seen
Return type:: float
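
The derived properties (mean, stddev) follow from the three running totals count, sum and sumsq. The plain-Python sketch below shows one way to derive them. It assumes the sample (n-1) formula for the standard deviation and returns NaN for the standard deviation when fewer than two values were seen; both are assumptions of this sketch, and `derive_stats` is a hypothetical helper.

```python
import math


def derive_stats(values):
    """Compute count, sum, sumsq and derive mean and standard deviation."""
    count = float(len(values))
    total = float(sum(values))
    sumsq = float(sum(v * v for v in values))
    mean = float("nan")
    stddev = float("nan")
    if count > 0:
        mean = total / count
    if count > 1:
        # assumption: sample variance with an (n - 1) denominator
        var = (sumsq - (total * total) / count) / (count - 1)
        stddev = math.sqrt(max(var, 0.0))  # guard against tiny negative values
    return {"count": count, "sum": total, "sumsq": sumsq,
            "mean": mean, "stddev": stddev}


stats = derive_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(stats)
```

Keeping only running totals means the statistics can be updated one value at a time without storing the values themselves.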

weka.core.dataset.check_col_names_unique(cols_x, col_y=None)¶

Checks whether the column names are unique (a requirement for Instances objects).

Parameters:

cols_x (list) – the column names for the input variables
col_y (str) – the optional name for the output variable

Returns:

None if check passed, otherwise error message

Return type:

str
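
The return convention here (None on success, a message on failure) can be sketched in plain Python as follows. The message format and the helper name `check_unique` are assumptions made for illustration; the wrapper's actual message may differ.

```python
def check_unique(cols_x, col_y=None):
    """Return None if all column names are unique, otherwise an error message."""
    names = list(cols_x)
    if col_y is not None:
        names.append(col_y)
    seen = set()
    duplicates = []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    if duplicates:
        # hypothetical message format
        return "Duplicate column names: " + ", ".join(duplicates)
    return None


print(check_unique(["a", "b"], "c"))
print(check_unique(["a", "b", "a"]))
```

Callers therefore test the result with `is None` rather than treating it as a boolean.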

weka.core.dataset.create_instances_from_lists(x, y=None, name='data', cols_x=None, col_y=None, nominal_x=None, nominal_y=False)¶

Allows the generation of an Instances object from a list of lists for X and a list for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. None values are interpreted as missing values.

Parameters:

x (list of list) – the input variables (row wise)
y (list) – the output variable (optional)
name (str) – the name of the dataset
cols_x (list) – the column names to use
col_y (str) – the column name to use for the output variable (y)
nominal_x (list) – the list of 0-based column indices to treat as nominal ones, ignored if None
nominal_y (bool) – whether the y column is to be treated as nominal

Returns:

the generated dataset

Return type:

Instances

weka.core.dataset.create_instances_from_matrices(x, y=None, name='data', cols_x=None, col_y=None, nominal_x=None, nominal_y=False)¶

Allows the generation of an Instances object from a 2-dimensional matrix for X and a 1-dimensional matrix for Y (optional). Data can be numeric, string or bytes. Attributes can be converted to nominal with the weka.filters.unsupervised.attribute.NumericToNominal filter. nan values are interpreted as missing values.

Parameters:

x (np.ndarray) – the input variables
y (np.ndarray) – the output variable (optional)
name (str) – the name of the dataset
cols_x (list) – the column names to use
col_y (str) – the column name to use for the output variable (y)
nominal_x (list) – the list of 0-based column indices to treat as nominal ones, ignored if None
nominal_y (bool) – whether the y column is to be treated as nominal

Returns:

the generated dataset

Return type:

Instances

weka.core.dataset.missing_value()¶

Returns the value that represents missing values in Weka (NaN).

Returns:: missing value
Return type:: float

weka.core.distances module¶

class weka.core.distances.DistanceFunction(classname='weka.core.EuclideanDistance', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper for Weka’s weka.core.DistanceFunction interface.

property attribute_indices¶

Returns the attribute indices in use.

Returns:: the attribute indices
Return type:: str

distance(first, second, cutoff=None)¶

Computes the distance between the two Instance objects.

Parameters:

first (Instance) – the first instance
second (Instance) – the second instance
cutoff (float) – optional cutoff value to speed up calculation

Returns:

the calculated distance

Return type:

float
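
To illustrate how a cutoff can speed up a distance calculation, here is a plain-Python Euclidean distance over two lists of floats. This is a sketch only: it ignores attribute normalization, it assumes that exceeding the cutoff yields positive infinity, and `euclidean` is a hypothetical helper rather than the wrapper's method.

```python
import math


def euclidean(first, second, cutoff=None):
    """Euclidean distance between two equal-length lists of floats."""
    if len(first) != len(second):
        raise ValueError("both rows must have the same number of values")
    limit = None if cutoff is None else cutoff * cutoff
    total = 0.0
    for a, b in zip(first, second):
        diff = a - b
        total += diff * diff
        # assumption: stop early once the partial sum exceeds the cutoff
        if limit is not None and total > limit:
            return float("inf")
    return math.sqrt(total)


print(euclidean([0.0, 0.0], [3.0, 4.0]))
print(euclidean([0.0, 0.0], [3.0, 4.0], cutoff=4.0))
```

The early exit pays off in nearest-neighbour searches, where any candidate farther away than the current best can be discarded without finishing the sum.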

property instances¶

Returns the dataset in use.

Returns:: the dataset
Return type:: Instances

weka.core.jvm module¶

weka.core.jvm.add_bundled_jars(cp)¶

Adds the bundled jars to the JVM’s classpath.

Parameters:: cp (list) – the list to append the classpath to

weka.core.jvm.add_system_classpath(cp)¶

Adds the system’s classpath to the JVM’s classpath.

Parameters:: cp (list) – the list to append the classpath to

weka.core.jvm.automatically_install_packages = None¶: whether to automatically install missing packages

weka.core.jvm.lib_dir()¶

Returns the “lib” directory path.

Returns:: the path to the “lib” directory
Return type:: str

weka.core.jvm.start(class_path=None, bundled=True, packages=False, system_cp=False, max_heap_size=None, system_info=False, auto_install=False, logging_level=10)¶

Initializes the jpype connection (starts up the JVM).

Parameters:

class_path (list) – the additional classpath elements to add
bundled (bool) – whether to add jars from the “lib” directory
packages (bool or str) – whether to add jars from Weka packages as well (bool) or an alternative Weka home directory (str)
system_cp (bool) – whether to add the system classpath as well
max_heap_size (str) – the maximum heap size (-Xmx parameter, eg 512m or 4g)
system_info (bool) – whether to print the system info (generated by weka.core.SystemInfo)
auto_install (bool) – whether to automatically install missing Weka packages (based on suggestions); in conjunction with package support
logging_level (int) – the logging level to use for this module, e.g., logging.DEBUG or logging.INFO

weka.core.jvm.started = None¶: whether the JVM has been started

weka.core.jvm.stop()¶: Kills the JVM.

weka.core.jvm.with_package_support = None¶: whether JVM was started with package support
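
A common pattern is to pair start() and stop() so that the JVM is shut down even when the work in between raises an exception. The sketch below wraps that pattern in a helper; `run_in_jvm` and its `task` parameter are hypothetical names, and only the documented start(max_heap_size=...) and stop() calls are used.

```python
def run_in_jvm(task, max_heap_size="512m"):
    """Start the JVM, run the zero-argument callable, and always stop the JVM."""
    # Imported lazily so that merely defining this helper does not require
    # python-weka-wrapper3 to be importable.
    import weka.core.jvm as jvm

    jvm.start(max_heap_size=max_heap_size)
    try:
        return task()
    finally:
        jvm.stop()
```

A call such as `run_in_jvm(my_experiment)` would then run the hypothetical function `my_experiment` between start and stop.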

weka.core.packages module¶

class weka.core.packages.Dependency(jobject)¶

Bases: JavaObject

Wrapper for the weka.core.packageManagement.Dependency class.

property source¶

Returns the source package.

Returns:: the package
Return type:: Package

property target¶

Returns the target package constraint.

Returns:: the package constraint
Return type:: PackageConstraint

weka.core.packages.LATEST = 'Latest'¶: Constant for the latest version of a package

class weka.core.packages.Package(jobject)¶

Bases: JavaObject

Wrapper for the weka.core.packageManagement.Package class.

as_dict()¶

Turns the package information into a dictionary. Not to be confused with ‘to_dict’!

Returns:: the package information as dictionary
Return type:: dict

property dependencies¶

Returns the dependencies of the package.

Returns:: the list of Dependency objects
Return type:: list of Dependency

install()¶: Installs the package.

property is_installed¶

Returns whether the package is installed.

Returns:: whether installed
Return type:: bool

property metadata¶

Returns the meta-data.

Returns:: the meta-data dictionary
Return type:: dict

property name¶

Returns the name of the package.

Returns:: the name
Return type:: str

property url¶

Returns the URL of the package.

Returns:: the url
Return type:: str

property version¶

Returns the version of the package.

Returns:: the version
Return type:: str

class weka.core.packages.PackageConstraint(jobject)¶

Bases: JavaObject

Wrapper for the weka.core.packageManagement.PackageConstraint class.

check_constraint(pkge=None, constr=None)¶

Checks the constraints.

Parameters:

pkge (Package) – the package to check
constr (PackageConstraint) – the package constraint to check

get_package()¶

Returns the package.

Returns:: the package
Return type:: Package

set_package(pkge)¶

Sets the package.

Parameters:: pkge (Package) – the package

weka.core.packages.all_package(name)¶

Returns Package object for the specified package (either installed or available). Returns None if not found.

Parameters:: name (str) – the name of the package to retrieve
Returns:: the package information, None if not available
Return type:: Package

weka.core.packages.all_packages()¶

Returns a list of all packages.

Returns:: the list of packages
Return type:: list

weka.core.packages.available_package(name)¶

Returns the Package object for the specified, available package. Returns None if not available.

Parameters:: name (str) – the name of the available package to retrieve
Returns:: the package information
Return type:: Package

weka.core.packages.available_packages()¶

Returns a list of all packages that aren’t installed yet.

Returns:: the list of packages
Return type:: list

weka.core.packages.establish_cache()¶: Establishes the package cache if necessary.

weka.core.packages.install_missing_package(pkge, version='Latest', quiet=False, stop_jvm_and_exit=False)¶

Installs the package if not yet installed.

Parameters:

pkge (str) – the name of the repository package, a URL (http/https) or a zip file
version (str) – in case of the repository packages, the version
quiet (bool) – whether to suppress console output and only print error messages
stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns:

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type:

tuple

weka.core.packages.install_missing_packages(pkges, quiet=False, stop_jvm_and_exit=False)¶

Installs the missing packages.

Parameters:

pkges (list) – the packages to install: a list of tuples (packagename, version) or strings (packagename; LATEST is assumed for the version); use “Latest” or the LATEST constant to grab the latest version
quiet (bool) – whether to suppress console output and only print error messages
stop_jvm_and_exit (bool) – whether to stop the JVM and exit if anything was installed

Returns:

tuple of (success, exit_required); “success” being True if either nothing to install or all successfully installed, False otherwise; “exit_required” being True if at least one package was installed and the JVM needs restarting

Return type:

tuple
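
Since pkges may mix plain names and (packagename, version) tuples, it can help to picture the list being normalized to tuples first. The plain-Python sketch below does that; `normalize_specs` is a hypothetical helper, `LATEST` mirrors the documented constant, and the package names are made up.

```python
LATEST = "Latest"  # mirrors the documented weka.core.packages.LATEST constant


def normalize_specs(pkges):
    """Turn a mixed list of names and (name, version) tuples into tuples."""
    specs = []
    for item in pkges:
        if isinstance(item, str):
            # a bare name means "latest version"
            specs.append((item, LATEST))
        else:
            name, version = item
            specs.append((name, version))
    return specs


# hypothetical package names, for illustration only
specs = normalize_specs(["pkgA", ("pkgB", "1.0.2")])
print(specs)
```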

weka.core.packages.install_package(pkge, version='Latest', details=False)¶

Installs the specified package.

Parameters:

pkge (str) – the name of the repository package, a URL (http/https) or a zip file
version (str) – in case of the repository packages, the version
details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (from_repo, version, error, install_message, success)

Returns:

whether successfully installed or dict with detailed information

Return type:

bool or dict

weka.core.packages.install_packages(pkges, fail_fast=True, details=False)¶

Installs the specified packages. In fail_fast mode, the first package that fails to install stops the installation process. Otherwise, installation is attempted for every package.

The details dictionary uses the package name, URL or file path as the key and stores the following information in a dict as value:

from_repo (bool) – whether installed from repo or “unofficial” package (i.e., URL or local file)
version (str) – the version that was attempted to be installed (if applicable)
error (str) – any error message that was encountered
install_message (str) – any installation message that got returned when installing from URL or zip file
success (bool) – whether successfully installed or not

Parameters:

pkges (list) – the list of packages to install (name of the repository package, a URL (http/https) or a zip file), if tuple must be name/version
fail_fast (bool) – whether to quit the installation of packages with the first package that fails (True) or whether to attempt to install all packages (False)
details (bool) – whether to just return a success/failure flag (False) or a dict with detailed information (per package: from_repo, version, error, install_message, success)

Returns:

whether successfully installed or detailed information

Return type:

bool or dict
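
When details=True, the returned dictionary can be filtered for failures. The sketch below works on a hand-written dictionary with the documented keys; the sample entries, the error text and the helper name `failed_installs` are all hypothetical.

```python
def failed_installs(details):
    """Map each failed package key to its error message."""
    return {key: info.get("error")
            for key, info in details.items()
            if not info.get("success")}


# hand-written sample in the documented shape; the values are made up
sample = {
    "pkgA": {"from_repo": True, "version": "Latest", "error": None,
             "install_message": None, "success": True},
    "/tmp/pkgB.zip": {"from_repo": False, "version": None,
                      "error": "hypothetical failure",
                      "install_message": None, "success": False},
}
print(failed_installs(sample))
```

Combined with fail_fast=False, this gives a full report of which packages need attention after a single pass.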

weka.core.packages.installed_package(name)¶

Returns Package object for the specified, installed package. Returns None if not installed.

Parameters:: name (str) – the name of the installed package to retrieve
Returns:: the package information
Return type:: Package

weka.core.packages.installed_packages()¶

Returns a list of the installed packages.

Returns:: the list of packages
Return type:: list

weka.core.packages.is_installed(name, version=None)¶

Checks whether a package with the name is already installed.

Parameters:

name (str) – the name of the package
version (str) – the version to check as well, ignored if None

Returns:

whether the package is installed

Return type:

bool

weka.core.packages.is_official_package(name, version=None)¶

Checks whether the package is an official one.

Parameters:

name (str) – the name of the package to check
version (str) – the specific version to check

Returns:

whether an official package or not

Return type:

bool

weka.core.packages.main(args=None)¶

Performs the specified package operation from the command-line. Calls JVM start/stop automatically. Use -h to see all options.

Parameters:: args (list) – the command-line arguments to use, uses sys.argv if None

weka.core.packages.refresh_cache()¶: Refreshes the cache.

weka.core.packages.suggest_package(name, exact=False)¶

Suggests package(s) for the given name (classname, package name). Matching can be either exact or just a substring.

Parameters:

name (str) – the name to look for
exact (bool) – whether to perform exact matching or substring matching

Returns:

list of matching package names

Return type:

list

weka.core.packages.sys_main()¶

Runs the main function using the system cli arguments, and returns a system error code.

Returns:: 0 for success, 1 for failure.
Return type:: int

weka.core.packages.uninstall_package(name)¶

Uninstalls a package.

Parameters:: name (str) – the name of the package

weka.core.packages.uninstall_packages(names)¶

Uninstalls multiple packages.

Parameters:: names (list) – the names of the packages to uninstall

weka.core.serialization module¶

weka.core.stemmers module¶

class weka.core.stemmers.Stemmer(classname='weka.core.stemmers.NullStemmer', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for stemmers.

stem(s)¶

Performs stemming on the string.

Parameters:: s (str) – the string to stem
Returns:: the stemmed string
Return type:: str

weka.core.stopwords module¶

class weka.core.stopwords.Stopwords(classname='weka.core.stopwords.Null', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for stopwords handlers.

is_stopword(s)¶

Checks whether the string is a stopword.

Parameters:: s (str) – the string to check
Returns:: True if a stopword
Return type:: bool

weka.core.tokenizers module¶

class weka.core.tokenizers.TokenIterator(tokenizer)¶

Bases: object

Iterator for string tokens.

class weka.core.tokenizers.Tokenizer(classname='weka.core.tokenizers.AlphabeticTokenizer', jobject=None, options=None)¶

Bases: OptionHandler

Wrapper class for tokenizers.

tokenize(s)¶

Tokenizes the string.

Parameters:: s (str) – the string to tokenize
Returns:: the iterator
Return type:: TokenIterator

weka.core.typeconv module¶

weka.core.typeconv.float_to_jfloat(d)¶

Turns the Python float into a Java java.lang.Float object.

Parameters:: d (float) – the Python float
Returns:: the Float object
Return type:: JPype object

weka.core.typeconv.from_jobject_array(a)¶

Converts the java object array into a list.

Parameters:: a – the java array to convert
Returns:: the generated list

weka.core.typeconv.jdouble_array_to_ndarray(a)¶

Turns the Java array of doubles into a numpy array.

Parameters:: a (JPype object) – the double array
Returns:: Numpy array
Return type:: numpy.ndarray

weka.core.typeconv.jdouble_matrix_to_ndarray(m)¶

Turns the Java matrix (2-dim array) of doubles into a numpy 2-dim array.

Parameters:: m (JPype object) – the double matrix
Returns:: Numpy array
Return type:: numpy.ndarray

weka.core.typeconv.jdouble_to_float(d)¶

Turns the Java java.lang.Double object into a Python float.

Parameters:: d (JPype object) – the java.lang.Double
Returns:: the Python float
Return type:: float

weka.core.typeconv.jenumeration_to_list(enm)¶

Turns the java.util.Enumeration into a list.

Parameters:: enm (JPype object) – the enumeration to convert
Returns:: the list
Return type:: list

weka.core.typeconv.jint_array_to_ndarray(a)¶

Turns the Java array of ints into a numpy array.

Parameters:: a (JPype object) – the int array
Returns:: Numpy array
Return type:: numpy.ndarray

weka.core.typeconv.jstring_array_to_list(a)¶

Turns the Java string array into a Python list of unicode strings.

Parameters:: a (JPype object) – the string array to convert
Returns:: the string list
Return type:: list

weka.core.typeconv.jstring_list_to_string_list(l, return_empty_if_none=True)¶

Converts a Java java.util.List containing strings into a Python list.

Parameters:

l (JPype object) – the list to convert
return_empty_if_none (bool) – whether to return an empty list or None when list object is None

Returns:

the list with UTF strings

Return type:

list

weka.core.typeconv.string_list_to_jarray(l)¶

Turns a Python unicode string list into a Java String array.

Parameters:: l (list) – the string list
Returns:: the Java string array
Return type:: JPype object

weka.core.typeconv.string_list_to_jlist(l)¶

Turns a Python unicode string list into a Java List.

Parameters:: l (list) – the string list
Returns:: the Java list
Return type:: JPype object

weka.core.typeconv.to_jdouble_array(values, none_as_nan: bool = False)¶

Converts the list of floats or the numpy array into a Java array.

Parameters:

values – the values to convert
none_as_nan (bool) – whether to convert None values to NaN

Returns:

the java array
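
The effect of none_as_nan can be pictured as a preprocessing step that replaces None with NaN before conversion. The plain-Python sketch below shows that step only. It does not create a Java array; raising an error when None appears without the flag is an assumption of this sketch, and `none_to_nan` is a hypothetical helper.

```python
def none_to_nan(values, none_as_nan=False):
    """Return a list of floats, mapping None to NaN when requested."""
    result = []
    for v in values:
        if v is None:
            if not none_as_nan:
                # assumption of this sketch: refuse None unless asked to map it
                raise ValueError("None encountered; pass none_as_nan=True")
            result.append(float("nan"))
        else:
            result.append(float(v))
    return result


converted = none_to_nan([1, None, 2.5], none_as_nan=True)
print(converted)
```

Since NaN is also Weka's missing value, a None in the Python data becomes a missing value after this step.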

weka.core.typeconv.to_jint_array(values)¶

Converts the list of ints into a Java array.

Parameters:: values – the values to convert
Returns:: the java array

weka.core.typeconv.to_jobject_array(values)¶

Converts the list of objects into a Java object array.

Parameters:: values – the list of objects to convert
Returns:: the java array

weka.core.typeconv.to_string(o)¶

Turns the object into a string.

Parameters:: o – the object to convert
Returns:: the generated string
Return type:: str

weka.core.utils module¶

weka.core.utils.correlation(values1, values2)¶

Computes the correlation between the two lists of floats.

Parameters:

values1 (list) – the first list of floats
values2 (list) – the second list of floats

Returns:

the correlation coefficient

Return type:

float
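
For reference, a correlation coefficient between two lists of floats can be computed in plain Python as sketched below. This is the Pearson formula; that the wrapper's function computes exactly this coefficient, and how it treats constant inputs, are assumptions. Returning NaN for a constant list is a choice of this sketch, and `pearson` is a hypothetical name.

```python
import math


def pearson(values1, values2):
    """Pearson correlation coefficient of two equal-length lists of floats."""
    if len(values1) != len(values2):
        raise ValueError("both lists must have the same length")
    n = len(values1)
    if n < 2:
        raise ValueError("need at least two value pairs")
    mean1 = sum(values1) / n
    mean2 = sum(values2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(values1, values2))
    ss1 = sum((a - mean1) ** 2 for a in values1)
    ss2 = sum((b - mean2) ** 2 for b in values2)
    if ss1 == 0 or ss2 == 0:
        return float("nan")  # undefined when one list is constant
    return cov / math.sqrt(ss1 * ss2)


print(pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```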

weka.core.utils.normalize(values, sum_=None)¶

Normalizes the doubles in the array using the given value.

Parameters:

values (list) – the list of floats to normalize
sum_ (float) – the value by which the floats are to be normalized

Returns:

the normalized float values

Return type:

list
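
Normalizing by a value means dividing every element by it. The plain-Python sketch below assumes that the sum of the values is used when no explicit value is given, and that a zero or NaN divisor is rejected; both are assumptions, and `normalize_values` is a hypothetical name.

```python
def normalize_values(values, sum_=None):
    """Divide each value by sum_, or by the sum of the values if sum_ is None."""
    total = sum(values) if sum_ is None else sum_
    if total != total:  # NaN check
        raise ValueError("cannot normalize: divisor is NaN")
    if total == 0:
        raise ValueError("cannot normalize: divisor is zero")
    return [v / total for v in values]


print(normalize_values([1.0, 1.0, 2.0]))
print(normalize_values([2.0, 4.0], sum_=2.0))
```

With the default divisor the result sums to 1, which is the usual way to turn counts into a probability distribution.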

weka.core.utils.variance(values)¶

Computes the variance for a list of floats.

Parameters:: values (list) – the list of floats to compute the variance for
Returns:: the variance
Return type:: float
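
A variance over a list of floats can be sketched in plain Python as follows. The sketch assumes the sample variance (n-1 denominator) and returns NaN for fewer than two values; whether the wrapper uses the same conventions is an assumption, and `sample_variance` is a hypothetical name.

```python
def sample_variance(values):
    """Sample variance (n - 1 denominator) of a list of floats."""
    n = len(values)
    if n <= 1:
        return float("nan")  # undefined for fewer than two values
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


print(sample_variance([1.0, 2.0, 3.0, 4.0]))
```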

weka.core.version module¶

weka.core.version.pww_version()¶

Returns the installed version of python-weka-wrapper3.

Returns:: the version, None if failed to obtain
Return type:: str

weka.core.version.weka_version()¶

Determines the version of Weka in use.

Returns:: the version
Return type:: str

weka.core.version.with_graph_support()¶

Checks whether pygraphviz is installed for graph support.

Returns:: True if with pygraphviz support
Return type:: bool

weka.core.version.with_plot_support()¶

Checks whether matplotlib is installed for plot support.

Returns:: True if with matplotlib support
Return type:: bool
