Visual Computing

University of Konstanz
IEEE Transactions on Visualization and Computer Graphics

Document Cards: A Top Trumps Visualization for Documents

H. Strobelt, D. Oelke, C. Rohrdantz, A. Stoffel, D. Keim, O. Deussen

Material

Paper (.pdf, 4.8MB)

Abstract

Finding suitable, less space-consuming views for a document’s main content is crucial for providing convenient access to large document collections on display devices of different sizes. We present a novel compact visualization that represents the document’s key semantics as a mixture of images and important key terms, similar to cards in a Top Trumps game. The key terms are extracted using an advanced text mining approach based on a fully automatic document structure extraction. The images and their captions are extracted using a graphical heuristic, and the captions are used for a semi-semantic image weighting. Furthermore, we use the image color histogram for classification and show at least one representative from each non-empty image class. The approach is demonstrated for the IEEE InfoVis publications of a complete year. The method can easily be applied to other publication collections and sets of documents that contain images.
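
The image selection step mentioned in the abstract (classify images by color histogram, then show at least one representative per non-empty class) can be illustrated with a minimal sketch. This is not the paper's implementation: the two class names, the `few_colors` threshold, the bin count, and the function names are all assumptions made for illustration, and the per-image weight stands in for the caption-based weighting.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized RGB bin (channel values 0-255)."""
    step = 256 // bins_per_channel
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)


def classify(hist, few_colors=8):
    """Hypothetical two-class rule (an assumption, not the paper's classes):
    few occupied bins -> 'diagram', otherwise 'photo'."""
    return "diagram" if len(hist) <= few_colors else "photo"


def pick_representatives(images):
    """images: iterable of (name, pixels, weight).

    Returns the highest-weighted image name for each class that
    contains at least one image, so empty classes never appear.
    """
    best = {}
    for name, pixels, weight in images:
        label = classify(color_histogram(pixels))
        if label not in best or weight > best[label][1]:
            best[label] = (name, weight)
    return {label: name for label, (name, _) in best.items()}


# Toy input: two images with few colors, one spread over 16 color bins.
images = [
    ("fig1", [(0, 0, 0), (255, 255, 255)] * 10, 0.4),
    ("fig2", [(0, 0, 0)] * 5, 0.9),
    ("fig3", [(r, g, 0) for r in (0, 64, 128, 192) for g in (0, 64, 128, 192)], 0.5),
]
representatives = pick_representatives(images)
```

With this toy input, `fig2` is chosen over `fig1` within the same class because of its higher weight, and `fig3` is the only member of the other class.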

BibTeX

@article{Strobelt2009DocumentCardsTop,
  author    = {H. Strobelt and D. Oelke and C. Rohrdantz and A. Stoffel and D. Keim and O. Deussen},
  doi       = {10.1109/TVCG.2009.139},
  issn      = {1077-2626},
  journal   = {IEEE Transactions on Visualization and Computer Graphics},
  keywords  = {data mining;data visualisation;document image processing;IEEE InfoVis publications;advanced text mining;compact visualization;display devices;document cards;document structure extraction;image color histogram;semi-semantic image weighting;top trumps document visualization;Displays;Feeds;Histograms;Image databases;Operating systems;Pipelines;Search engines;Text mining;Visualization;content extraction;document collection browsing;document visualization;visual summary},
  month     = {nov},
  number    = {6},
  pages     = {1145--1152},
  title     = {Document Cards: A Top Trumps Visualization for Documents},
  url       = {http://graphics.uni-konstanz.de/publikationen/2009/documentcards/website/},
  volume    = {15},
  year      = {2009}
}