ssmake -- tool for building semantic spaces
ssmake [-w weighting_scheme] [-k maxdims] space train_data [test_data]
ssmake trains a semantic space from the training documents in train_data,
optionally projects the documents in test_data into that space, and saves
the resulting semantic space to space.
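As an illustration of the synopsis, the following Python sketch assembles an ssmake command line. The helper name build_ssmake_command and the file names images.space, train.txt and test.txt are hypothetical; only the -w and -k options and the argument order are taken from this page.

```python
def build_ssmake_command(space, train_data, test_data=None,
                         weighting_scheme=None, maxdims=None):
    """Return an argument list following the ssmake synopsis.

    Hypothetical helper: it only assembles arguments, it does not run ssmake.
    """
    argv = ["ssmake"]
    if weighting_scheme is not None:
        # The page lists these two values as the only ones currently available.
        if weighting_scheme not in ("none", "normalised_entropy"):
            raise ValueError("unknown weighting scheme: " + weighting_scheme)
        argv += ["-w", weighting_scheme]
    if maxdims is not None:
        argv += ["-k", str(int(maxdims))]
    argv += [space, train_data]
    if test_data is not None:
        # test_data is optional in the synopsis.
        argv.append(test_data)
    return argv


# File names below are placeholders chosen for the example.
argv = build_ssmake_command("images.space", "train.txt", "test.txt",
                            weighting_scheme="normalised_entropy", maxdims=40)
print(" ".join(argv))
```

If ssmake is installed and on the PATH, the returned list could be passed to subprocess.run(argv, check=True) from the Python standard library.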
The options are as follows:
- -w weighting_scheme
Sets the matrix weighting scheme to weighting_scheme. Currently,
the only supported values are none and normalised_entropy.
- -k maxdims
Sets the maximum number of dimensions for the space. If
there is insufficient data, the actual number of
dimensions may be less than this value.
- space
The file to save the resulting semantic space to.
- train_data
A file containing the data required to train the space. The
file should consist of one line for each document. Each
line should start with the name of the document, followed
by terms and their respective weights or occurrence counts.
Each term is separated from its weight by a space, and
term-weight pairs are separated by commas. Each line must
end in a comma.
An example file might look like this:
1000.jpg sun 1.0, RGB 100, RGB 2000,
1001.jpg car 1.0, RGB 56, RGB 5, RGB 8091,
1002.jpg plane 1.0, sky 1.0, RGB 5000, RGB 499, RGB 200,
- test_data
The test_data file contains the information required to
build the document space. The format of the file is the
same as for the train_data file, with each line representing
a document. Depending on what the space is for, test_data
may be omitted completely, or it may contain only partial
data (for example, in an image retrieval scenario it might
contain only visual terms and no annotation or keyword
terms).
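The file format described above can be read and written with a few lines of Python. This is a minimal sketch rather than the parser ssmake itself uses: the function names are hypothetical, weights are kept as the original text so nothing is reformatted, and repeated terms on a line (as in the example file) are preserved in order because this page does not say how ssmake combines them. The last function shows one way to derive a partial test_data line by dropping annotation terms.

```python
def parse_line(line):
    """Split one document line into (name, [(term, weight_text), ...])."""
    line = line.strip()
    if not line.endswith(","):
        raise ValueError("each line must end in a comma")
    name, _, rest = line.partition(" ")
    pairs = []
    for chunk in rest.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # the trailing comma leaves one empty chunk
        term, weight = chunk.rsplit(" ", 1)
        float(weight)  # raises ValueError if the weight is not numeric
        pairs.append((term, weight))
    return name, pairs


def format_line(name, pairs):
    """Inverse of parse_line: every term-weight pair is followed by a comma."""
    return " ".join([name] + [f"{term} {weight}," for term, weight in pairs])


def drop_terms(line, excluded):
    """Return the line with every term in `excluded` removed."""
    name, pairs = parse_line(line)
    kept = [(term, weight) for term, weight in pairs if term not in excluded]
    return format_line(name, kept)


train_line = "1002.jpg plane 1.0, sky 1.0, RGB 5000, RGB 499, RGB 200,"
# Treating "plane" and "sky" as the annotation terms is an assumption
# made for this example.
print(drop_terms(train_line, {"plane", "sky"}))
```

Keeping the pairs in a list rather than a dictionary is deliberate, since a dictionary would silently collapse the repeated RGB entries.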
School of Electronics and Computer Science, University of Southampton
Jonathon Hare <firstname.lastname@example.org>