    ENH: Add vtkWordCloud to Infovis/Core · 76a896d8
    Cory Quammen authored and Bill Lorensen committed
    vtkWordCloud is an Image Source that creates a word cloud.
    
    Word Clouds, AKA Tag Clouds, are a text visualization technique that displays individual words with properties that depend on the frequency of a word in a document. vtkWordCloud varies the font size based on word frequency. Numerous options are available, including the color of the words, the size of the words, the orientation of the words, font files, and mask files.
    
    Word Clouds are useful for quickly perceiving the most prominent terms in a document. They can also reveal trends and patterns that would be unclear or difficult to see in a tabular format. Frequently used keywords that might be overlooked in a table are rendered in larger text, making them stand out in a word cloud.
    
    There is some controversy about the usefulness of word clouds. Their best use may be for presentations. Word clouds can be used to "compare" texts from similar subjects, e.g., Presidential Inaugural Addresses, job candidate comparisons, etc.
    
    Several methods are available to customize the resulting visualization. The class's defaults give a reasonable result. Default values are shown in parentheses; empty parentheses mean the default is empty.
    
    BackgroundColorName - The vtkNamedColors name for the background (MidNightBlue).
    
    BWMask - Mask image has a single channel(false). Mask images typically have three channels (r,g,b).
    
    ColorDistribution - Distribution of random colors(.6 1.0), if WordColorName is empty.
    
    ColorSchemeName - Name of a color scheme from vtkColorSeries to be used to select colors for the words (), if WordColorName is empty.
    
    DPI - Dots per inch(200) of the rendered text. DPI is used as a scaling mechanism for the words. As DPI increases, the word size increases. If there are too few skipped words, increase this value; if there are too many, decrease it.
    
    FontFileName - If empty, the built-in Arial font is used(). The FontFileName is the name of a file that contains a TrueType font.
    
    FontMultiplier - Font multiplier(6). The final font size is this value * the word frequency.
    
    Gap - Space gap of words (2). The gap is the number of spaces added to the beginning and end of each word.
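The Gap option can be illustrated with a minimal sketch. `PadWord` is a hypothetical helper, not part of vtkWordCloud; it only shows what "spaces added to the beginning and end of each word" amounts to for a given gap value.

```cpp
#include <string>

// Hypothetical helper (not a vtkWordCloud method): surround a word with
// `gap` spaces on each side, as the Gap description states.
std::string PadWord(const std::string& word, int gap = 2)
{
  // A negative gap is treated as zero in this sketch (assumption).
  const std::string padding(gap > 0 ? gap : 0, ' ');
  return padding + word + padding;
}
```

With the default gap of 2, the word "vtk" becomes a seven-character string, so its rendered bounding box is wider than the glyphs alone.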
    
    MaskColorName - Name of the color for the mask (black). This color is the name of the vtkNamedColors that defines the foreground of the mask. Usually black or white.
    
    MaskFileName - Mask file name(). If a mask file is specified, it will be used as the mask. Otherwise, a black square is used as the mask. The mask file should contain three channels of unsigned char values. If the mask file has only a single channel of unsigned char values, turn the boolean BWMask on. If BWMask is on, the class will create a three-channel image using vtkImageAppendComponents.
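The effect of BWMask can be sketched without VTK. The class itself uses vtkImageAppendComponents; the hypothetical `GrayToRGB` below only illustrates the result, under the assumption that the single channel is copied into each of the three output channels.

```cpp
#include <vector>

// Hypothetical illustration (not VTK code): expand a single-channel mask
// into an interleaved three-channel (r,g,b) buffer by repeating each value.
// Assumption: all three output channels carry the same gray value.
std::vector<unsigned char> GrayToRGB(const std::vector<unsigned char>& gray)
{
  std::vector<unsigned char> rgb;
  rgb.reserve(gray.size() * 3);
  for (unsigned char value : gray)
  {
    rgb.insert(rgb.end(), 3, value); // r, g and b all equal the gray value
  }
  return rgb;
}
```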
    
    MaxFontSize - Maximum font size(48).
    
    MinFontSize - Minimum font size(8).
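How FontMultiplier, MinFontSize and MaxFontSize interact can be sketched as follows. `WordFontSize` is a hypothetical helper; the product comes from the FontMultiplier description, while clamping the product into the minimum/maximum range is an assumption about how the two limits are applied.

```cpp
#include <algorithm>

// Hypothetical helper (not a vtkWordCloud method).
// Font size = fontMultiplier * frequency, then (assumption) clamped to
// [minFontSize, maxFontSize]. Defaults mirror the values listed above.
int WordFontSize(int frequency, int fontMultiplier = 6,
                 int minFontSize = 8, int maxFontSize = 48)
{
  const int raw = fontMultiplier * frequency;
  return std::min(std::max(raw, minFontSize), maxFontSize);
}
```

Under these assumptions and the default values, any word occurring eight or more times is drawn at the maximum size, so raising MaxFontSize or lowering FontMultiplier is what separates the most frequent words visually.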
    
    MinFrequency - Minimum word frequency accepted(2). Words with frequencies less than this will be ignored.
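The MinFrequency filter can be sketched as a simple count-then-prune pass. `CountFrequentWords` is a hypothetical helper, and splitting on whitespace is a simplifying assumption; the class's own tokenization is not described here.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not a vtkWordCloud method): count whitespace-separated
// words, then discard every word seen fewer than minFrequency times.
std::map<std::string, int> CountFrequentWords(const std::string& text,
                                              int minFrequency = 2)
{
  std::map<std::string, int> counts;
  std::istringstream stream(text);
  std::string word;
  while (stream >> word)
  {
    ++counts[word];
  }
  for (auto it = counts.begin(); it != counts.end();)
  {
    if (it->second < minFrequency)
    {
      it = counts.erase(it); // erase returns the next valid iterator
    }
    else
    {
      ++it;
    }
  }
  return counts;
}
```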
    
    OffsetDistribution - Range of uniform random offsets(-size[0]/100.0 -size[1]/100.0)(-20 20). These are offsets from the generated path for word layout.
    
    OrientationDistribution - Ranges of random orientations(-20 20). If discrete orientations are not defined, these orientations will be generated.
    
    Orientations - Discrete orientations for displayed words. If present, this overrides OrientationDistribution.
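The precedence between Orientations and OrientationDistribution can be sketched as below. `PickOrientation` is a hypothetical helper; choosing uniformly among the discrete orientations is an assumption, since the text only says that the discrete list overrides the range.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not a vtkWordCloud method): if discrete orientations
// are supplied, pick one of them (assumption: uniformly at random);
// otherwise draw an angle uniformly from [low, high).
double PickOrientation(const std::vector<double>& orientations,
                       double low, double high, std::mt19937& rng)
{
  if (!orientations.empty())
  {
    std::uniform_int_distribution<std::size_t> pick(0, orientations.size() - 1);
    return orientations[pick(rng)];
  }
  std::uniform_real_distribution<double> angle(low, high);
  return angle(rng);
}
```

Passing an empty list with the range (-20 20) reproduces the default behaviour described above; passing a list such as 0 and 90 restricts every word to horizontal or vertical.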
    
    ReplacementPairs - Replace the first word of each pair with the second word(). The first word is also added to the StopList.
    
    Sizes - Size of image(640 480).
    
    StopWords - User provided stop words(). vtkWordCloud has built-in stop words. The user-provided stop words are added to the built-in list.
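The combined effect of ReplacementPairs and StopWords can be sketched as a single pass over the words. `FilterWords` and its parameters are hypothetical, and the order of operations (replace first, then drop stop words) is an assumption made for this sketch.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not a vtkWordCloud method).
// 1. The stop list is the built-in list plus the user-provided words,
//    plus the first word of every replacement pair.
// 2. Each word is replaced if it is the first word of a pair (assumption:
//    replacement happens before the stop-list check).
// 3. Words found in the stop list are dropped.
std::vector<std::string> FilterWords(
  const std::vector<std::string>& words,
  const std::set<std::string>& builtInStopWords,
  const std::set<std::string>& userStopWords,
  const std::map<std::string, std::string>& replacements)
{
  std::set<std::string> stopList(builtInStopWords);
  stopList.insert(userStopWords.begin(), userStopWords.end());
  for (const auto& pair : replacements)
  {
    stopList.insert(pair.first);
  }

  std::vector<std::string> kept;
  for (const std::string& word : words)
  {
    const auto found = replacements.find(word);
    const std::string& candidate = (found != replacements.end()) ? found->second : word;
    if (stopList.count(candidate) == 0)
    {
      kept.push_back(candidate);
    }
  }
  return kept;
}
```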
    
    Title - Add this word to the document's words and set a high frequency, so that it will be rendered first.
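Why a high frequency makes the title render first can be sketched as follows, assuming words are laid out in order of descending frequency. `OrderForLayout` and the sentinel `kTitleFrequency` are hypothetical names and values chosen for illustration only.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sentinel; the actual value used by the class is not stated.
const int kTitleFrequency = 1000;

// Hypothetical helper (not a vtkWordCloud method): insert the title with a
// high frequency, then sort by descending frequency (assumed layout order).
std::vector<std::pair<std::string, int>> OrderForLayout(
  std::map<std::string, int> counts, const std::string& title)
{
  if (!title.empty())
  {
    counts[title] = kTitleFrequency;
  }
  std::vector<std::pair<std::string, int>> ordered(counts.begin(), counts.end());
  std::stable_sort(ordered.begin(), ordered.end(),
    [](const std::pair<std::string, int>& a, const std::pair<std::string, int>& b) {
      return a.second > b.second;
    });
  return ordered;
}
```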
    
    WordColorName - Name of the color for the words(). The name is selected from vtkNamedColors. If the name is empty, the ColorDistribution will generate random colors.