Name

ssmake -- tool for building semantic spaces

Synopsis

ssmake [-w weighting_scheme] [-k maxdims] space train_data [test_data]

Description

ssmake trains a semantic space from the training documents in train_data, optionally projects the documents from test_data into that space, and saves the resulting semantic space to space.

The options are as follows:

-w weighting_scheme
Sets the matrix weighting scheme to weighting_scheme. Currently, the only supported values are none and normalised_entropy.
-k maxdims
Sets the maximum number of dimensions for the space. If there is insufficient data, the resulting space may have fewer dimensions than this value.
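
As an illustration of how the options and file arguments combine, the following Python sketch assembles an ssmake argument list in the order given in the synopsis. The helper name build_ssmake_command and the file names in the usage line are hypothetical; only the -w and -k options and the argument order come from this page.

```python
def build_ssmake_command(space, train_data, test_data=None,
                         weighting_scheme=None, maxdims=None):
    """Return an argv list for ssmake, following the synopsis order."""
    cmd = ["ssmake"]
    if weighting_scheme is not None:
        # The page documents only these two weighting schemes.
        if weighting_scheme not in ("none", "normalised_entropy"):
            raise ValueError("unknown weighting scheme: %r" % weighting_scheme)
        cmd += ["-w", weighting_scheme]
    if maxdims is not None:
        cmd += ["-k", str(int(maxdims))]
    # Positional arguments: space, train_data, then the optional test_data.
    cmd += [space, train_data]
    if test_data is not None:
        cmd.append(test_data)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_ssmake_command(
        "images.space", "train.txt", "test.txt",
        weighting_scheme="normalised_entropy", maxdims=40)))
```

The resulting list could be passed to subprocess.run on a system where ssmake is installed.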

Files

space
The file to save the resultant semantic space to.
train_data
A file containing the data required to train the space. The file should consist of one line for each document. Each line should start with the name of the document, followed by terms and their respective weights or occurrence counts. Each term is separated from its weight by a space, and each term-weight pair is followed by a comma, so every line must end in a comma.

An example file might look like this:

1000.jpg sun 1.0, RGB[0] 100, RGB[20] 2000,
1001.jpg car 1.0, RGB[10] 56, RGB[40] 5, RGB[41] 8091,
1002.jpg plane 1.0, sky 1.0, RGB[25] 5000, RGB[27] 499, RGB[63] 200,
...
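
The line format described above can be sketched in Python as follows. This is a minimal illustration, not part of ssmake: the function names format_line and parse_line are hypothetical, and the sketch assumes that document names contain no spaces and that terms contain no commas.

```python
def format_line(name, weights):
    """Format one document as 'name term weight, term weight, ...,'."""
    # Every term-weight pair is followed by a comma, so the line ends in one.
    pairs = "".join(" %s %s," % (term, weight)
                    for term, weight in weights.items())
    return name + pairs


def parse_line(line):
    """Parse one line back into (name, {term: weight})."""
    # Assumption: the document name is the first space-delimited token.
    name, _, rest = line.strip().partition(" ")
    weights = {}
    for pair in rest.split(","):
        pair = pair.strip()
        if not pair:
            continue  # skip the empty piece after the trailing comma
        # The weight is the last space-delimited token of the pair.
        term, weight = pair.rsplit(None, 1)
        weights[term] = float(weight)
    return name, weights


if __name__ == "__main__":
    line = format_line("1000.jpg", {"sun": 1.0, "RGB[0]": 100, "RGB[20]": 2000})
    print(line)
    print(parse_line(line))
```

Writing one such line per document produces a file in the layout shown in the example.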

test_data
The test_data file contains the information required to build the document space. Its format is the same as that of the train_data file, with each line representing a document. Depending on what the space is for, test_data may be omitted completely, or it may contain only partial data (for example, in an image retrieval scenario, it might contain visual terms but no annotation or keyword terms).

See Also

ssfind(1) ssutil(1) libSemanticSpace(3)

Copyright

School of Electronics and Computer Science, University of Southampton

Author

Jonathon Hare <jsh2@ecs.soton.ac.uk>

