SUMMA Centroid Computation
Functionality
Computes the centroid of a set of documents (e.g. a corpus) from the vectors of the individual documents. The result is stored as a feature named 'centroid' on the corpus itself.
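As an illustration only, the computation can be sketched in Python. This is not SUMMA's or GATE's code. It assumes that each document vector is a sparse mapping from term to weight and that the centroid is the component-wise mean of those vectors, which the description above does not spell out. The names compute_centroid, doc_vectors and corpus_features are hypothetical.

```python
from collections import defaultdict


def compute_centroid(doc_vectors):
    """Return the component-wise mean of sparse term-weight vectors.

    Assumption: a term missing from a document counts as weight 0.0.
    """
    if not doc_vectors:
        raise ValueError("cannot compute the centroid of an empty corpus")
    totals = defaultdict(float)
    for vector in doc_vectors:
        for term, weight in vector.items():
            totals[term] += weight
    n = len(doc_vectors)
    return {term: total / n for term, total in totals.items()}


# Hypothetical stand-in for a corpus: one sparse vector per document.
doc_vectors = [
    {"summary": 2.0, "text": 1.0},
    {"summary": 4.0, "corpus": 3.0},
]

# Stand-in for the corpus feature map; the result goes under 'centroid'.
corpus_features = {}
corpus_features["centroid"] = compute_centroid(doc_vectors)
```

Dividing by the number of documents rather than by the number of documents containing each term keeps the centroid in the same space as the document vectors. If SUMMA weights or normalises differently, only the final division would need to change.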
Parameters of the Resource
- annSet: the annotation set where the document vectors are to be found (only one vector per document)
- vecName: the annotation representing the document vector
- corpus: the corpus holding the set of documents
Restriction
This resource should be used in a GATE pipeline; it does not make sense to use it in a Corpus Pipeline. Each document must already have a document vector, which can be produced using the vector computation component in SUMMA.