Commit 76a896d8 authored by Cory Quammen, committed by Bill Lorensen

ENH: Add vtkWordCloud to Infovis/Core

vtkWordCloud is an Image Source that creates a word cloud.

Word Clouds, AKA Tag Clouds, are a text visualization technique that displays individual words with properties that depend on the frequency of a word in a document. vtkWordCloud varies the font size based on word frequency. Numerous options are available, including the color of the words, the size of the words, the orientation of the words, font files, and mask files.

Word Clouds are useful for quickly perceiving the most prominent terms in a document. Word Clouds can also reveal trends and patterns that would otherwise be unclear or difficult to see in a tabular format. Frequently used keywords stand out better in a Word Cloud: common words that might be overlooked in tabular form are rendered in larger text, making them pop out.
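As a rough illustration of the frequency data a word cloud is built from, here is a minimal, self-contained sketch that counts words in a text and orders them from most to least frequent. `CountWordFrequencies` is a hypothetical helper written for this note, not part of vtkWordCloud, and its tokenization rule (maximal runs of letters, case-insensitive) is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of vtkWordCloud): count how often each word
// occurs. A word is a maximal run of letters, compared case-insensitively.
std::vector<std::pair<std::string, int>> CountWordFrequencies(const std::string& text)
{
  std::map<std::string, int> counts;
  std::string word;
  for (char c : text + " ") // the trailing space flushes the last word
  {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (std::isalpha(uc))
    {
      word += static_cast<char>(std::tolower(uc));
    }
    else if (!word.empty())
    {
      ++counts[word];
      word.clear();
    }
  }

  // Most frequent first. std::map iterates alphabetically, and
  // std::stable_sort preserves that order among words with equal counts.
  std::vector<std::pair<std::string, int>> sorted(counts.begin(), counts.end());
  std::stable_sort(sorted.begin(), sorted.end(),
    [](const std::pair<std::string, int>& a, const std::pair<std::string, int>& b) {
      return a.second > b.second;
    });
  return sorted;
}
```

Calling `CountWordFrequencies` on the contents of a document yields the (word, frequency) pairs that a layout step could then size and place, largest first.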

There is some controversy about the usefulness of word clouds. Their best use may be for presentations. Word clouds can be used to "compare" texts from similar subjects, e.g., Presidential Inaugural Addresses, job candidate comparisons, etc.

Several methods are available to customize the resulting visualization. The class provides defaults that give a reasonable result. Default values are shown in parentheses.

BackgroundColorName - The vtkNamedColors name for the background (MidNightBlue).

BWMask - Whether the mask image has a single channel (false). Mask images typically have three channels (r,g,b).

ColorDistribution - Distribution of random colors (.6 1.0), used if WordColorName is empty.

ColorSchemeName - Name of a color scheme from vtkColorSeries to be used to select colors for the words (), if WordColorName is empty.

DPI - Dots per inch (200) of the rendered text. DPI is used as a scaling mechanism for the words. As DPI increases, the word size increases. If too few words are skipped, increase this value; if too many are skipped, decrease it.

FontFileName - Name of a file that contains a TrueType font (). If empty, the built-in Arial font is used.

FontMultiplier - Font multiplier (6). The final font size is this value * the word frequency.

Gap - Space gap of words (2). The gap is the number of spaces added to the beginning and end of each word.

MaskColorName - Name of the color for the mask (black). This color is the name of the vtkNamedColors that defines the foreground of the mask. Usually black or white.

MaskFileName - Mask file name (). If a mask file is specified, it will be used as the mask. Otherwise, a black square is used as the mask. The mask file should contain three channels of unsigned char values. If the mask file has just a single unsigned char channel, turn the boolean BWMask on. If BWMask is on, the class will create a three-channel image using vtkImageAppendComponents.

MaxFontSize - Maximum font size (48).

MinFontSize - Minimum font size (8).

MinFrequency - Minimum word frequency accepted (2). Words with frequencies less than this will be ignored.

OffsetDistribution - Range of uniform random offsets (-size[0]/100.0 -size[1]/100.0)(-20 20). These offsets are offsets from the generated path for word layout.

OrientationDistribution - Range of random orientations (-20 20). If discrete orientations are not defined, these orientations will be generated.

Orientations - Discrete orientations for displayed words. If present, this overrides OrientationDistribution.

ReplacementPairs - Replace the first word of each pair with the second word (). The first word is also added to the StopList.

Sizes - Size of image (640 480).

StopWords - User-provided stop words (). vtkWordCloud has built-in stop words. The user-provided stop words are added to the built-in list.

Title - Add this word to the document's words and set a high frequency, so that it will be rendered first.

WordColorName - Name of the color for the words (). The name is selected from vtkNamedColors. If the name is empty, the ColorDistribution will generate random colors.
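
To make the interplay of a few of these options concrete, the following self-contained sketch derives a font size and an orientation for one word. It is an illustration under stated assumptions, not vtkWordCloud's implementation: `WordStyleOptions`, `FontSizeFor`, and `OrientationFor` are hypothetical names, the defaults are copied from the list above, and clamping the font size to [MinFontSize, MaxFontSize] is an assumed reading of those two options.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical settings struct; defaults mirror the values quoted above.
struct WordStyleOptions
{
  int FontMultiplier = 6;
  int MinFontSize = 8;
  int MaxFontSize = 48;
  int MinFrequency = 2;
  std::vector<double> Orientations;             // discrete choices, may be empty
  double OrientationRange[2] = { -20.0, 20.0 }; // used only when Orientations is empty
};

// Returns 0 when the word is ignored (frequency below MinFrequency).
// Otherwise the size is FontMultiplier * frequency, clamped to
// [MinFontSize, MaxFontSize] (the clamping is an assumption of this sketch).
int FontSizeFor(int frequency, const WordStyleOptions& opts)
{
  if (frequency < opts.MinFrequency)
  {
    return 0;
  }
  const int size = opts.FontMultiplier * frequency;
  return std::min(opts.MaxFontSize, std::max(opts.MinFontSize, size));
}

// Discrete orientations, when present, override the random range.
double OrientationFor(const WordStyleOptions& opts, std::mt19937& rng)
{
  if (!opts.Orientations.empty())
  {
    std::uniform_int_distribution<std::size_t> pick(0, opts.Orientations.size() - 1);
    return opts.Orientations[pick(rng)];
  }
  std::uniform_real_distribution<double> angle(opts.OrientationRange[0], opts.OrientationRange[1]);
  return angle(rng);
}
```

With the defaults, a word seen twice gets size 12 and a word seen a hundred times is capped at 48; adding 0 and 90 to `Orientations` restricts every word to horizontal or vertical placement.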
parent a15e0073
@@ -42,7 +42,8 @@ set(classes
 vtkTreeDifferenceFilter
 vtkTreeFieldAggregator
 vtkTreeLevelsFilter
-vtkVertexDegree)
+vtkVertexDegree
+vtkWordCloud)
 vtk_module_add_module(VTK::InfovisCore
 CLASSES ${classes})
@@ -28,4 +28,27 @@ vtk_add_test_cxx(vtkInfovisCoreCxxTests tests
 TestTreeDifferenceFilter.cxx,NO_VALID
 )
-vtk_test_cxx_executable(vtkInfovisCoreCxxTests tests)
+# add to the list but don't define a test
+list(APPEND tests UnitTestWordCloud.cxx)
+list(APPEND tests TestWordCloud.cxx)
+ExternalData_add_test(${_vtk_build_TEST_DATA_TARGET}
+NAME VTK::InfovisCoreCxxTests-UnitTestWordCloud
+COMMAND vtkInfovisCoreCxxTests UnitTestWordCloud
+DATA{../../../../Testing/Data/Gettysburg.txt} DATA{../../../../Testing/Data/Canterbury.ttf} DATA{../../../../Testing/Data/hearts.png} DATA{../../../../Testing/Data/hearts8bit.png} DATA{../../../../Testing/Data/NLTKStopList.txt})
+ExternalData_add_test(${_vtk_build_TEST_DATA_TARGET}
+NAME VTK::InfovisCoreCxxTests-TestWordCloud
+COMMAND vtkInfovisCoreCxxTests TestWordCloud
+DATA{../../../../Testing/Data/Gettysburg.txt} DATA{../../../../Testing/Data/Canterbury.ttf}
+-V DATA{../Data/Baseline/TestWordCloud.png,:}
+-T "${VTK_TEST_OUTPUT_DIR}"
+)
+set(all_tests
+${tests}
+${data_tests}
+${output_tests}
+${custom_tests}
+)
+vtk_test_cxx_executable(vtkInfovisCoreCxxTests all_tests RENDERING_FACTORY)
/*=========================================================================
Program: Visualization Toolkit
Module: TestWordCloud.cxx
-------------------------------------------------------------------------
Copyright 2008 Sandia Corporation.
Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
the U.S. Government retains certain rights in this software.
-------------------------------------------------------------------------
Copyright (c) Ken Martin, Will Schroeder, Bill Lorensen
All rights reserved.
See Copyright.txt or http://www.kitware.com/Copyright.htm for details.
This software is distributed WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the above copyright notice for more information.
=========================================================================*/
#include "vtkSmartPointer.h"
#include "vtkWordCloud.h"
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>
#include <vtkRenderer.h>
#include <vtkCamera.h>
#include <vtkImageViewer2.h>
#include <vtkNamedColors.h>
#include <cstdlib> // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
int TestWordCloud(int argc, char* argv[])
{
  // Both the text file (argv[1]) and the font file (argv[2]) are required.
  if (argc < 3)
  {
    std::cout << "Usage: " << argv[0] << " filename fontfile" << std::endl;
    return EXIT_FAILURE;
  }

  // Use zero offsets from the generated layout path.
  vtkWordCloud::OffsetDistributionContainer offset;
  offset[0] = 0;
  offset[1] = 0;

  auto wordCloud = vtkSmartPointer<vtkWordCloud>::New();
  wordCloud->SetFileName(argv[1]);
  wordCloud->SetOffsetDistribution(offset);
  wordCloud->SetFontFileName(argv[2]);
  // Discrete orientations: words are drawn horizontally or vertically.
  wordCloud->AddOrientation(0.0);
  wordCloud->AddOrientation(90.0);
  wordCloud->Update();

  std::cout << "File: " << argv[1] << std::endl;
  std::cout << "Font: " << argv[2] << std::endl;
  std::cout << "Kept Words: " << wordCloud->GetKeptWords().size() << std::endl;
  std::cout << "Stopped Words: " << wordCloud->GetStoppedWords().size() << std::endl;
  std::cout << "Skipped Words: " << wordCloud->GetSkippedWords().size() << std::endl;

  // Display the final image
  auto colors = vtkSmartPointer<vtkNamedColors>::New();
  auto interactor = vtkSmartPointer<vtkRenderWindowInteractor>::New();
  auto imageViewer = vtkSmartPointer<vtkImageViewer2>::New();
  imageViewer->SetInputData(wordCloud->GetOutput());
  imageViewer->SetupInteractor(interactor);
  imageViewer->GetRenderer()->SetBackground(colors->GetColor3d("Wheat").GetData());
  imageViewer->SetSize(wordCloud->GetSizes()[0], wordCloud->GetSizes()[1]);
  imageViewer->GetRenderer()->ResetCamera();

  // Zoom in a bit
  vtkCamera* camera = imageViewer->GetRenderer()->GetActiveCamera();
  camera->ParallelProjectionOn();
  camera->SetParallelScale(wordCloud->GetAdjustedSizes()[0] * .4);

  imageViewer->GetRenderWindow()->Render();
  interactor->Start();

  return EXIT_SUCCESS;
}
e74d9cb19ff60205f117da692cbeca99f79d389a54d00ba1378fcef18fcea5bf515035a670885865add6e4286336f938b3a5a626014b058f58c79a5098a595b4
cc2e1e5a1d74c59edc4efe2e4d9b3cc7d970337529dcf6d55a6bbfd995b28d19b29b6065156aba1815e28347839b40cb0df6f65a32669fa5d18293a386ce7e94
798126b4f5c0f8b6aa43089df279a42f09075cd3476e83b2663590eeb2ab6793674610e94ee130ece763174b3ba637cb24f7339530e0f696c601779b8f6998f6
a3cf52fff463054c16c282dd410dedec54ea017435ca70ac937a89e1c7786476769dac781f4cba61a144697acd8d00c0408280df119cbd61f57a1e00a2f6b92f
@@ -8,6 +8,11 @@ DEPENDS
 VTK::CommonCore
 VTK::CommonDataModel
 VTK::CommonExecutionModel
+VTK::CommonColor
+VTK::IOImage
+VTK::ImagingCore
+VTK::ImagingSources
+VTK::RenderingFreeType
 PRIVATE_DEPENDS
 VTK::FiltersExtraction
 VTK::FiltersGeneral
@@ -16,6 +21,7 @@ TEST_DEPENDS
 VTK::IOImage
 VTK::IOInfovis
 VTK::InfovisLayout
+VTK::InteractionImage
 VTK::InteractionStyle
 VTK::RenderingOpenGL2
 VTK::TestingRendering
ae780e61477b98aedf9cb47ea86637f45a7d9ecb2c6cf9ce4fb7360eabfcd9659b0fd79a5cf42e0ea44ae2140f83d3a305da1b088853fd8a20dfcf497c9b6dcf
a8edbf80252f7ec17d324dd766db68d4aa5afa12203b2a81d73fcaf59ac4c23dc9586a0d82370ab0ef92eafcf8453a67047d720828145526b1ef28c7358f8150
9552144f2af135d070315dedf5e530daf3ba93e480c2d64c77e3099395f491011d5fbd7679c4daaed2709a83832c09a31c308d346eb4dffbeda969fcbe99e9f4
0ce83e87c630a4eada6cdb116549353621baaaf35e386046b4992708ecae2ee0a9fa7f826dcc87dac617506befeaa588e012beecab0999ac5b7c9baa698d447e
aa229211252470922b25995bd73b6d7ce540eff76738cab92b727a6712ff0494b27b799eefad978406a200d6e7a7f1386143f17c6a4de8054e474a985206afa0