November 17, 2004
In the lab directory /mailocal/lab/numt/ngssc/summarize/ there is a newspaper text in two versions: the first is the original; the second has been lightly preprocessed for the text parser (e.g. the sentences have been numbered). Extract the top ten keywords and the top five sentences from the text using the saliency score method.
Use gtp from the text mining assignment (modify the runmedline script); perform stemming and remove common words.
The report should include the code and the result.
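
As a rough illustration of the scoring step, here is a minimal sketch in Python. It assumes that "the saliency score method" means the mutual reinforcement scheme, in which a term's score is proportional to the weighted sum of the scores of the sentences it occurs in, and vice versa, so that the scores converge to the leading singular vectors of the term-by-sentence matrix. The function names (saliency_scores, top_k) and the toy matrix are hypothetical and are not part of gtp or the lab files; in the lab, the matrix would come from the gtp output after stemming and removal of common words.

```python
import math


def saliency_scores(A, iterations=100):
    """Mutual-reinforcement scores for a term-by-sentence matrix.

    A is a list of rows (one per term); A[i][j] is the non-negative
    weight of term i in sentence j. Returns (u, v): term scores and
    sentence scores, each normalized to unit Euclidean length.
    Assumes A is not the zero matrix.
    """
    m, n = len(A), len(A[0])
    v = [1.0] * n  # start from equal sentence scores
    u = [0.0] * m
    for _ in range(iterations):
        # Term score: weighted sum of the scores of the sentences it appears in.
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
        # Sentence score: weighted sum of the scores of the terms it contains.
        v = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return u, v


def top_k(scores, labels, k):
    """Return the k labels with the highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order[:k]]


# Toy data (made up for illustration): 3 terms, 3 sentences.
terms = ["matrix", "vector", "coffee"]
sentences = ["s1", "s2", "s3"]
A = [
    [2, 1, 0],  # "matrix" occurs twice in s1, once in s2
    [1, 1, 0],  # "vector" occurs once in s1 and s2
    [0, 0, 1],  # "coffee" occurs only in s3
]

u, v = saliency_scores(A)
print("keywords: ", top_k(u, terms, 2))
print("sentences:", top_k(v, sentences, 1))
```

For the assignment itself, the same two calls with k=10 for the terms and k=5 for the sentences would give the requested lists. The fixed iteration count is a simplification; a convergence test on the change in u and v, or a library SVD routine, would be a reasonable alternative.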