Visual Computing

University of Konstanz
IEEE Transactions on Visualization and Computer Graphics

Document Cards: A Top Trumps Visualization for Documents

H. Strobelt, D. Oelke, C. Rohrdantz, A. Stoffel, D. Keim, O. Deussen

Material

Paper (.pdf, 4.8 MB)

Abstract

Finding suitable, less space-consuming views of a document’s main content is crucial for providing convenient access to large document collections on display devices of different sizes. We present a novel compact visualization that represents a document’s key semantics as a mixture of images and important key terms, similar to cards in a Top Trumps game. The key terms are extracted using an advanced text mining approach based on fully automatic document structure extraction. The images and their captions are extracted using a graphical heuristic, and the captions are used for semi-semantic image weighting. Furthermore, we use the image color histogram for classification and show at least one representative from each non-empty image class. The approach is demonstrated on the IEEE InfoVis publications of a complete year. The method can easily be applied to other publication collections and sets of documents that contain images.
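
The image selection step described in the abstract (classify images by color histogram, then show at least one representative from each non-empty class) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the class names `diagram` and `photo`, the thresholds `min_share` and `max_dominant_bins`, and the function names are all assumptions, and the per-image weight is simply taken as given rather than derived from captions.

```python
from collections import defaultdict


def color_histogram(pixels, bins=4):
    """Normalized RGB histogram with `bins` levels per channel."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [h / total for h in hist]


def classify(hist, min_share=0.01, max_dominant_bins=4):
    """Hypothetical rule: few occupied color bins -> 'diagram', otherwise 'photo'."""
    dominant = sum(1 for share in hist if share >= min_share)
    return "diagram" if dominant <= max_dominant_bins else "photo"


def select_representatives(images, max_images):
    """Pick images by weight, favoring one image from each non-empty class.

    `images` is a list of (name, pixels, weight) tuples; `weight` stands in
    for an importance score such as one derived from the caption.
    If `max_images` is smaller than the number of non-empty classes,
    the lowest-weighted class representatives are dropped.
    """
    by_class = defaultdict(list)
    for name, pixels, weight in images:
        label = classify(color_histogram(pixels))
        by_class[label].append((weight, name))

    # Best-weighted image of every non-empty class goes first.
    chosen = sorted((max(items) for items in by_class.values()), reverse=True)
    chosen_names = {name for _, name in chosen}

    # Remaining slots are filled by weight, regardless of class.
    rest = sorted(
        (item for items in by_class.values() for item in items
         if item[1] not in chosen_names),
        reverse=True,
    )
    return [name for _, name in (chosen + rest)[:max_images]]
```

With two flat-colored figures and one many-colored figure, a budget of two images would keep the best flat-colored figure plus the many-colored one, even though the second flat-colored figure has a higher weight than the latter.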

BibTeX

@article{Strobelt2009DocumentCardsTop,
  author     = {H. Strobelt and D. Oelke and C. Rohrdantz and A. Stoffel and D. Keim and O. Deussen},
  doi        = {10.1109/TVCG.2009.139},
  issn       = {1077-2626},
  journal    = {IEEE Transactions on Visualization and Computer Graphics},
  keywords   = {data mining;data visualisation;document image processing;IEEE InfoVis publications;advanced text mining;compact visualization;display devices;document cards;document structure extraction;image color histogram;semi-semantic image weighting;top trumps document visualization;Displays;Feeds;Histograms;Image databases;Operating systems;Pipelines;Search engines;Text mining;Visualization;content extraction;document collection browsing;document visualization;visual summary},
  month      = {nov},
  number     = {6},
  pages      = {1145--1152},
  title      = {Document Cards: A Top Trumps Visualization for Documents},
  url        = {http://graphics.uni-konstanz.de/publikationen/2009/documentcards/website/},
  volume     = {15},
  year       = {2009},
}