
SUMMA is a toolkit for the development of text summarization systems.




SUMMA Centroid Computation


Computes the centroid of a set of documents (e.g. a corpus) from the vectors of the individual documents. The centroid is stored as a feature of the corpus itself, under the name 'centroid'.
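As an illustration of the idea only, the sketch below computes a centroid as the component-wise mean of sparse term vectors. It is not SUMMA's implementation: the averaging formula, the dictionary representation of a vector, and the names compute_centroid, corpus_vectors and corpus_features are all assumptions made for this example.

```python
from collections import defaultdict


def compute_centroid(doc_vectors):
    """Return the component-wise mean of sparse vectors (dicts of term -> weight).

    Assumption: the centroid is the arithmetic mean of the document vectors;
    a term missing from a document counts as weight 0 for that document.
    """
    if not doc_vectors:
        return {}
    totals = defaultdict(float)
    for vector in doc_vectors:
        for term, weight in vector.items():
            totals[term] += weight
    n = len(doc_vectors)
    return {term: total / n for term, total in totals.items()}


# Hypothetical stand-in for a corpus: exactly one vector per document.
corpus_vectors = [
    {"summary": 2.0, "text": 1.0},
    {"summary": 4.0, "corpus": 3.0},
]

# Hypothetical stand-in for the corpus features, with the result
# stored under the name 'centroid' as described above.
corpus_features = {"centroid": compute_centroid(corpus_vectors)}
```

Dividing by the number of documents rather than by the number of documents containing each term keeps rare terms from dominating the centroid; other weighting schemes are possible.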

Parameters of the Resource

  • annSet: the annotation set where the document vectors are to be found (only one vector per document)
  • vecName: the annotation representing the document vector
  • corpus: the corpus holding the set of documents


This resource should be used in a GATE pipeline; it does not make sense to use it in a Corpus Pipeline! Each document must already have a document vector computed. This vector can be produced using the vector computation component in SUMMA.






Copyright 2002-2014 Universitat Pompeu Fabra