Automatic Representative News Generation using On-Line Clustering

  • Marlisa Sigita, Electronic Engineering Polytechnic Institute of Surabaya
  • Ali Ridho Barakbah, Electronic Engineering Polytechnic Institute of Surabaya
  • Entin Martiana Kusumaningtyas, Electronic Engineering Polytechnic Institute of Surabaya
  • Idris Winarno, Electronic Engineering Polytechnic Institute of Surabaya

Abstract

The growing number of online news providers produces a large volume of news every day. This volume makes it hard to consume information efficiently, because many articles carry similar content under different titles. This paper presents a new system for automatically generating representative news using on-line clustering. The system keeps the clustering dynamic through two features: centroid update and new cluster creation. Text mining is used to extract the news content. The representative news item of each cluster is the article closest to the cluster centroid, measured by Euclidean distance. In an experimental study, we applied the system to 460 news articles in Bahasa Indonesia. The experiment achieved a precision ratio of 70.9%. The errors are mainly caused by imprecise keyword extraction, which yields only one or two keywords for an article. The distribution of the centroids' keywords also affects the clustering results.

Keywords: News Representation, On-line Clustering, Keyword Aggregation, Text Mining.
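
The clustering idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the distance threshold used to decide when a new cluster is created, the running-mean centroid update, and the function names `online_cluster` and `representatives` are all assumptions made for illustration, and the input vectors stand in for hypothetical keyword-weight vectors produced by text mining.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def online_cluster(vectors, threshold):
    """Assign vectors to clusters one at a time (single pass).

    Assumed rule: an article joins its nearest cluster if the distance
    to that centroid is at most `threshold`; otherwise it starts a new
    cluster. The centroid update is an assumed running mean.
    """
    centroids = []  # one centroid per cluster
    members = []    # indices of the vectors assigned to each cluster
    for i, v in enumerate(vectors):
        k = None
        if centroids:
            dists = [euclidean(v, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > threshold:
                k = None  # too far from every centroid
        if k is None:
            # New cluster creation: the article becomes the centroid.
            centroids.append(list(v))
            members.append([i])
        else:
            # Centroid update: incremental mean of the cluster members.
            members[k].append(i)
            n = len(members[k])
            centroids[k] = [c + (x - c) / n for c, x in zip(centroids[k], v)]
    return centroids, members


def representatives(vectors, centroids, members):
    """For each cluster, return the index of the member closest to its centroid."""
    return [
        min(idx, key=lambda i: euclidean(vectors[i], c))
        for c, idx in zip(centroids, members)
    ]


if __name__ == "__main__":
    # Hypothetical keyword-weight vectors for six articles.
    articles = [
        [1.0, 0.0, 0.0],
        [0.9, 0.1, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.8, 0.2],
        [1.0, 0.1, 0.0],
        [0.0, 0.9, 0.1],
    ]
    cents, groups = online_cluster(articles, threshold=0.5)
    print(groups)
    print(representatives(articles, cents, groups))
```

Because articles are processed in arrival order, the result of such a single-pass scheme depends on that order and on the chosen threshold; both would need tuning on real keyword vectors.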

Published
2013-12-15
How to Cite
Sigita, M., Barakbah, A. R., Kusumaningtyas, E. M., & Winarno, I. (2013). Automatic Representative News Generation using On-Line Clustering. EMITTER International Journal of Engineering Technology, 1(1). https://doi.org/10.24003/emitter.v1i1.11