Command-line
============

From the command line, *python-weka-wrapper3* behaves much like Weka's own command-line interface.
Most of Weka's general options are available, as well as the following:

* `-j` for adding additional jars, in the same format as the classpath for the platform.
  E.g., for Linux, `-j /some/where/a.jar:/other/place/b.jar`
* `-X` for defining the maximum heap size.
  E.g., `-X 512m` for 512 MB of heap size.

The following examples are all for a Linux bash environment. Windows users have to replace
forward slashes `/` with backslashes `\\` and place each command on a single line, with the
backslashes `\\` at the end of the lines removed.
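
The `-j` and `-X` options described above can be combined with any of the tools.
As a minimal sketch, the following builds a Linux-style classpath from the placeholder
jar paths used in the option list and prints the resulting `pww-classifier` call
instead of running it. The jar and dataset paths are assumptions for illustration only.

.. code-block:: bash

   # Placeholder jar locations, taken from the option description above
   classpath=""
   for jar in /some/where/a.jar /other/place/b.jar; do
       # Append each jar, separated by ':' (the Linux classpath separator)
       classpath="${classpath:+$classpath:}$jar"
   done

   # Print the resulting call; remove the leading "echo" to execute it
   echo pww-classifier \
       -j "$classpath" \
       -X 512m \
       -t /my/datasets/iris.arff \
       -c last \
       weka.classifiers.trees.J48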


Data generators
---------------

Artificial data can be generated using one of Weka's data generators, e.g., the
`Agrawal` classification generator:

.. code-block:: bash

   pww-datagenerator \
       -o /tmp/out.arff \
       weka.datagenerators.classifiers.classification.Agrawal


Command-line help screen:

::

    usage: pww-datagenerator [-h] [-j classpath] [-X heap] datagenerator ...

    Executes a data generator from the command-line. Calls JVM start/stop
    automatically.

    positional arguments:
      datagenerator  data generator classname, e.g.,
                     weka.datagenerators.classifiers.classification.LED24
      option         additional data generator options

    optional arguments:
      -h, --help     show this help message and exit
      -j classpath   additional classpath, jars/directories
      -X heap        max heap size for jvm, e.g., 512m


Filters
-------

Filtering a single ARFF dataset, removing the last attribute using the `Remove` filter:

.. code-block:: bash

   pww-filter \
       -i /my/datasets/iris.arff \
       -o /tmp/out.arff \
       -c last \
       weka.filters.unsupervised.attribute.Remove \
       -R last

For batch filtering, use the `-r` and `-s` options to specify the input and output
of the second file.
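
A batch-filtering call might look like the sketch below, which would typically be used
to process a training and a test set with the same filter setup. The file names are
hypothetical, and the call is skipped when `pww-filter` is not on the `PATH`.

.. code-block:: bash

   # Hypothetical input locations
   train=/my/datasets/train.arff
   test_set=/my/datasets/test.arff

   if command -v pww-filter >/dev/null 2>&1; then
       # First file: -i/-o, second file: -r/-s
       pww-filter \
           -i "$train" \
           -o /tmp/train-filtered.arff \
           -r "$test_set" \
           -s /tmp/test-filtered.arff \
           -c last \
           weka.filters.unsupervised.attribute.Remove \
           -R first
   else
       echo "pww-filter not found, skipping" >&2
   fi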


Command-line help screen:

::

    usage: pww-filter [-h] [-j classpath] [-X heap] -i input1 -o output1
                      [-r input2] [-s output2] [-c classindex]
                      filter ...

    Executes a filter from the command-line. Calls JVM start/stop automatically.

    positional arguments:
      filter         filter classname, e.g., weka.filters.AllFilter
      option         additional filter options

    optional arguments:
      -h, --help     show this help message and exit
      -j classpath   additional classpath, jars/directories
      -X heap        max heap size for jvm, e.g., 512m
      -i input1      input file 1
      -o output1     output file 1
      -r input2      input file 2
      -s output2     output file 2
      -c classindex  1-based class attribute index


Classifiers
-----------

Example on how to cross-validate a `J48` classifier (with confidence factor 0.3)
on the iris UCI dataset:

.. code-block:: bash

   pww-classifier \
       -t /my/datasets/iris.arff \
       -c last \
       weka.classifiers.trees.J48 \
       -C 0.3
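
Instead of cross-validating, a classifier can also be trained on one file and
evaluated on another. The following sketch relies on the `-T` and `-d` options from the
help screen and additionally saves the trained model. The dataset and model paths are
hypothetical, and the call is skipped when `pww-classifier` is not on the `PATH`.

.. code-block:: bash

   # Hypothetical train/test split and model location
   train=/my/datasets/iris-train.arff
   test_set=/my/datasets/iris-test.arff
   model=/tmp/j48.model

   if command -v pww-classifier >/dev/null 2>&1; then
       # Train on -t, evaluate on -T, store the model via -d
       pww-classifier \
           -t "$train" \
           -T "$test_set" \
           -c last \
           -d "$model" \
           weka.classifiers.trees.J48 \
           -C 0.3
   else
       echo "pww-classifier not found, skipping" >&2
   fi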


Command-line help screen:

::

    usage: pww-classifier [-h] [-j classpath] [-X heap] -t train [-T test]
                          [-c class index] [-d outmodel] [-l inmodel]
                          [-x num folds] [-s seed] [-v] [-o] [-i] [-k]
                          [-m costmatrix] [-g graph]
                          classifier ...

    Performs classification/regression from the command-line. Calls JVM start/stop
    automatically.

    positional arguments:
      classifier      classifier classname, e.g., weka.classifiers.trees.J48
      option          additional classifier options

    optional arguments:
      -h, --help      show this help message and exit
      -j classpath    additional classpath, jars/directories
      -X heap         max heap size for jvm, e.g., 512m
      -t train        Training set file
      -T test         Test set file
      -c class index  1-based class attribute index
      -d outmodel     model output file name
      -l inmodel      model input file name
      -x num folds    number of folds for cross-validation
      -s seed         seed value for randomization
      -v              no statistics for training
      -o              only statistics, don't output model
      -i              output information retrieval statistics
      -k              output information theoretic statistics
      -m costmatrix   cost matrix file
      -g graph        output file for graph (if supported)


Clusterers
----------

Example on how to perform classes-to-clusters evaluation for `SimpleKMeans`
(with 3 clusters) using the iris UCI dataset:

.. code-block:: bash

   pww-clusterer \
       -t /my/datasets/iris.arff \
       -c last \
       weka.clusterers.SimpleKMeans \
       -N 3


Command-line help screen:

::

    usage: pww-clusterer [-h] [-j classpath] [-X heap] -t train [-T test]
                         [-d outmodel] [-l inmodel] [-p attributes] [-x num folds]
                         [-s seed] [-c class index] [-g graph]
                         clusterer ...

    Performs clustering from the command-line. Calls JVM start/stop automatically.

    positional arguments:
      clusterer       clusterer classname, e.g., weka.clusterers.SimpleKMeans
      option          additional clusterer options

    optional arguments:
      -h, --help      show this help message and exit
      -j classpath    additional classpath, jars/directories
      -X heap         max heap size for jvm, e.g., 512m
      -t train        training set file
      -T test         test set file
      -d outmodel     model output file name
      -l inmodel      model input file name
      -p attributes   attribute range
      -x num folds    number of folds
      -s seed         seed value for randomization
      -c class index  1-based class attribute index
      -g graph        graph output file (if supported)


Attribute selection
-------------------

You can perform attribute selection using `BestFirst` as the search algorithm and
`CfsSubsetEval` as the evaluator as follows:

.. code-block:: bash

   pww-attsel \
       -i /my/datasets/iris.arff \
       -x 5 \
       -n 42 \
       -s "weka.attributeSelection.BestFirst -D 1 -N 5" \
       weka.attributeSelection.CfsSubsetEval \
       -P 1 \
       -E 1


Command-line help screen:

::

    usage: pww-attsel [-h] [-j classpath] [-X heap] -i input [-c class index]
                      [-s search] [-x num folds] [-n seed]
                      evaluator ...

    Performs attribute selection from the command-line. Calls JVM start/stop
    automatically.

    positional arguments:
      evaluator       evaluator classname, e.g.,
                      weka.attributeSelection.CfsSubsetEval
      option          additional evaluator options

    optional arguments:
      -h, --help      show this help message and exit
      -j classpath    additional classpath, jars/directories
      -X heap         max heap size for jvm, e.g., 512m
      -i input        input file
      -c class index  1-based class attribute index
      -s search       search method, classname and options
      -x num folds    number of folds
      -n seed         the seed value for randomization


Associators
-----------

Associators, like `Apriori`, can be run like this:

.. code-block:: bash

   pww-associator \
       -t /my/datasets/iris.arff \
       weka.associations.Apriori \
       -N 9 -I


Command-line help screen:

::

    usage: pww-associator [-h] [-j classpath] [-X heap] -t train associator ...

    Executes an associator from the command-line. Calls JVM start/stop
    automatically.

    positional arguments:
      associator    associator classname, e.g., weka.associations.Apriori
      option        additional associator options

    optional arguments:
      -h, --help    show this help message and exit
      -j classpath  additional classpath, jars/directories
      -X heap       max heap size for jvm, e.g., 512m
      -t train      training set file


Package management
------------------

Versions newer than 0.2.9 also offer package management from the command-line via the `pww-packages`
command. There are several sub-commands available:

::

    usage: pww-packages [-h]
                        {list,info,install,uninstall,remove,freeze,suggest,is-installed,bootstrap}
                        ...

    Manages Weka packages.

    positional arguments:
      {list,info,install,uninstall,remove,freeze,suggest,is-installed,bootstrap}
        list                For listing all/installed/available packages
        info                Outputs information about packages
        install             For installing one or more packages
        uninstall (remove)  For uninstalling one or more packages
        freeze              For outputting list of installed packages
        suggest             For suggesting packages that contain the specified
                            class
        is-installed        Checks whether a package is installed, simply outputs
                            true/false
        bootstrap           Generates Python script for recreating current pww3
                            environment.

    optional arguments:
      -h, --help            show this help message and exit


Listing packages
++++++++++++++++

Listing all, available or installed packages can be done using the `list` sub-command:

::

    usage: pww-packages list [-h] [-f {text,json}] [-o FILE] [-r]
                             [{all,installed,available}]

    positional arguments:
      {all,installed,available}
                            defines what packages to list

    optional arguments:
      -h, --help            show this help message and exit
      -f {text,json}, --format {text,json}
                            the output format to use
      -o FILE, --output FILE
                            the file to store the output in, uses stdout if not
                            supplied
      -r, --refresh-cache   whether to refresh the package cache


Info on packages
++++++++++++++++

Outputting information on one or more packages is achieved with the `info` sub-command:

::

    usage: pww-packages info [-h] [-t {brief,full}] [-f {text,json}] [-o FILE]
                             [-r]
                             name [name ...]

    positional arguments:
      name                  the package(s) to output the information for

    optional arguments:
      -h, --help            show this help message and exit
      -t {brief,full}, --type {brief,full}
                            the type of information to output
      -f {text,json}, --format {text,json}
                            the output format to use
      -o FILE, --output FILE
                            the file to store the output in, uses stdout if not
                            supplied
      -r, --refresh-cache   whether to refresh the package cache


Installing/uninstalling/check installed status
++++++++++++++++++++++++++++++++++++++++++++++

The `install` sub-command installs one or more packages:

::

    usage: pww-packages install [-h] [-r FILE] [--refresh-cache]
                                [packages [packages ...]]

    positional arguments:
      packages              the name of the package(s) to install, append
                            '==VERSION' to pin to a specific version

    optional arguments:
      -h, --help            show this help message and exit
      -r FILE, --requirements FILE
                            the text file with packages to install (one per line,
                            format: PKGNAME[==VERSION[|URL]])
      --refresh-cache       whether to refresh the package cache


The `uninstall` (or `remove`) sub-command removes one or more packages:

::

    usage: pww-packages uninstall [-h] packages [packages ...]

    positional arguments:
      packages    the name of the package(s) to uninstall

    optional arguments:
      -h, --help  show this help message and exit


The `is-installed` sub-command outputs whether a package is installed or not:

::

    usage: pww-packages is-installed [-h] [-f {text,json}] [-o FILE]
                                     name [name ...]

    positional arguments:
      name                  the name of the package to check, append '==VERSION'
                            to pin to a specific version

    optional arguments:
      -h, --help            show this help message and exit
      -f {text,json}, --format {text,json}
                            the output format to use
      -o FILE, --output FILE
                            the file to store the output in, uses stdout if not
                            supplied


The `freeze` sub-command outputs all installed packages in the `requirements.txt` format
(PKGNAME==VERSION[|URL], one per line):

::

    usage: pww-packages freeze [-h] [-r FILE] [-u] [-f]

    optional arguments:
      -h, --help            show this help message and exit
      -r FILE, --requirements FILE
                            the text file to store the package/version pairs in
                            (one per line, format: PKGNAME[==VERSION])
      -u, --output_urls     whether to output the download URL for unofficial
                            packages (appends '|URL')
      -f, --force_urls      forces the output of the URLs for all packages, not
                            just unofficial ones
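
Taken together, `freeze` and `install -r` make it possible to mirror the installed
packages elsewhere. The following is only a sketch: the file name `requirements.txt`
is an arbitrary choice, and the commands are skipped when `pww-packages` is not on
the `PATH`.

.. code-block:: bash

   # Arbitrary file name for the package list
   reqs=requirements.txt

   if command -v pww-packages >/dev/null 2>&1; then
       # Source installation: record installed packages, adding URLs for unofficial ones
       pww-packages freeze -r "$reqs" -u

       # Target installation: install everything listed in the file
       pww-packages install -r "$reqs"
   else
       echo "pww-packages not found, skipping" >&2
   fi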


Suggest packages
++++++++++++++++

If you are not sure which package a certain class is part of, then use the `suggest`
sub-command to help with that (this works only for official packages):

::

    usage: pww-packages suggest [-h] [-e] [-f {text,json}] [-o FILE] classname

    positional arguments:
      classname             the classname to suggest packages for

    optional arguments:
      -h, --help            show this help message and exit
      -e, --exact           whether to match the name exactly or perform substring
                            matching
      -f {text,json}, --format {text,json}
                            the output format to use
      -o FILE, --output FILE
                            the file to store the output in, uses stdout if not
                            supplied
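
A possible workflow is to look up the package for a class and then install it by name.
In the sketch below, the classname and the package name `LibSVM` are assumptions chosen
for illustration; use whatever name `suggest` actually reports. The commands are skipped
when `pww-packages` is not on the `PATH`.

.. code-block:: bash

   # Class to look up (assumed example)
   classname=weka.classifiers.functions.LibSVM

   if command -v pww-packages >/dev/null 2>&1; then
       # -e: match the classname exactly instead of by substring
       pww-packages suggest -e "$classname"

       # Install the package that was suggested (assumed here to be LibSVM)
       pww-packages install LibSVM
   else
       echo "pww-packages not found, skipping" >&2
   fi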


Bootstrapping
+++++++++++++

If you want to recreate your python-weka-wrapper3 installation in another virtual
environment or even on another machine, then you can use the `bootstrap` sub-command
to generate a Python script that performs all the necessary steps:

::

    usage: pww-packages bootstrap [-h] [-f] [-o FILE]

    optional arguments:
      -h, --help            show this help message and exit
      -f, --force_urls      forces the install from URLs, not just unofficial ones
      -o FILE, --output FILE
                            the file to store the Python script in, otherwise
                            outputs it on stdout
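
As a rough sketch, the script could be generated as shown below and then copied to the
target environment. The output path is hypothetical, and running the generated script
with `python3` is an assumption rather than something the help screen prescribes.

.. code-block:: bash

   # Hypothetical location for the generated script
   script=/tmp/pww3-bootstrap.py

   if command -v pww-packages >/dev/null 2>&1; then
       # Write the bootstrap script to a file instead of stdout
       pww-packages bootstrap -o "$script"
   else
       echo "pww-packages not found, skipping" >&2
   fi

   # In the target environment, the script would then be executed with that
   # environment's interpreter, e.g.:
   #   python3 /tmp/pww3-bootstrap.py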