
AutoMap©

1. General Information
2. Product
3. Key Terms and Constraints
4. Coding Choices: Filtering the Text and Windowing
5. References

1. General Information

AutoMap extracts, analyzes and represents cognitive maps of texts as representations of individuals' mental models. Mental models are conceptual networks of relations between concepts. Texts contain a portion of the author's mental model.

Differences in the distribution of concepts and of the relationships among the concepts across texts provide insight into the similarities and differences in the content and structure of texts.

AutoMap contains content analytic techniques and map analytic techniques to code and analyze texts and a toolkit for data reduction. The content analytic techniques are focused on collecting and analyzing quantitative information on the concepts. The map analytic techniques extract conceptual networks of texts.

AutoMap is not restricted to any language.
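
The two kinds of technique can be illustrated with a minimal Python sketch. It is not AutoMap code: the function names `concept_counts` and `compare_maps` and the sample data are hypothetical, and a map is assumed here to be a plain set of undirected concept pairs.

```python
from collections import Counter


def concept_counts(concepts):
    """Content analysis: count how often each concept occurs in a text."""
    return Counter(concepts)


def compare_maps(map_a, map_b):
    """Map analysis: split the statements of two maps into shared and unique ones."""
    return {
        "shared": map_a & map_b,
        "only_a": map_a - map_b,
        "only_b": map_b - map_a,
    }


# Hypothetical, already filtered concepts of one text.
text_a = ["team", "model", "team", "plan"]

# Hypothetical maps: each statement is an unordered pair of concepts.
map_a = {frozenset(("team", "model")), frozenset(("team", "plan"))}
map_b = {frozenset(("team", "model")), frozenset(("model", "code"))}

counts = concept_counts(text_a)
comparison = compare_maps(map_a, map_b)
```

Here `counts` gives the quantitative information on concepts, while `comparison["shared"]` holds the statements that both texts have in common.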

2. Product

AutoMap is a program written in Java that runs under the Linux, UNIX and Windows operating systems. The release that has been made available as AutoMap.exe runs only under Windows.

As an input AutoMap takes one or several texts. The user can pre-process the text and determine the window size and further settings according to the research question. Every step of data reduction is visualized and can be stored for further analysis. As an output AutoMap generates the pre-processed text, a coded map, a visualization of the map and a statistical overview.
There are no scientific standards for defining information as irrelevant or for creating a delete list or a thesaurus. The user has to determine the most appropriate level of generalization considering the research question. AutoMap supports this decision-making process.

3. Key Terms and Constraints

1.    A concept is a single idea or ideational kernel represented by a single word or phrase.
2.    A relation is a connection between two concepts.
3.    A statement is a set of two concepts and the relation between them.
4.    A map is a network of concepts formed from statements. Two statements are linked if they share one concept.
5.    A Thesaurus is a set of key concepts and their synonyms.
6.    A key concept is a concept that other concepts will be translated into.
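
The terms above can be made concrete with a small Python sketch. This is an illustration only, not AutoMap's internal representation: `Statement` and `build_map` are hypothetical names, and relations are assumed to be undirected.

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    """Two concepts and the relation between them (hypothetical structure)."""
    source: str
    target: str


def build_map(statements):
    """Return a map as a dict: concept -> set of concepts it is related to."""
    network = defaultdict(set)
    for statement in statements:
        # Relations are treated as undirected in this sketch.
        network[statement.source].add(statement.target)
        network[statement.target].add(statement.source)
    return dict(network)


statements = [Statement("team", "model"), Statement("model", "map")]
concept_map = build_map(statements)
# Both statements share the concept "model", so they are linked in the map.
```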

4. Coding Choices: Filtering the Text and Windowing

Coding a map is a two-step process that requires the user to make decisions about Filtering a text and Windowing (proximity analysis).
The coding choices may change the analysis results significantly.

Filtering a text means reducing the data to a minimal set of the most relevant, content-bearing terms. Pre-processing is a semi-automated, iterative process that allows the user to stay close to the data and to go beyond explicitly articulated ideas to implied ideas. The size of both text and map is decreased significantly, and therefore meaningful comparisons across texts become possible. A simplified text is generated that can be visually inspected. There are no scientific standards for defining information as irrelevant. The user has to determine the most appropriate level of data reduction considering the research question. Inter-coder reliability may be a second criterion.
AutoMap allows the researcher to use a three-step process for data reduction: punctuation, deletion and generalization.
AutoMap helps to make these decisions and to carry out the generalization.
By determining the punctuation, the user decides whether statements within sentences, paragraphs or the entire text are considered in the analysis.
On a more detailed level, deletion and generalization are applied to filter the text.
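
The punctuation choice can be sketched as splitting the text into units before any statements are coded. The Python below is a simplified illustration under stated assumptions: `split_units` is a hypothetical helper, sentences are assumed to end in ".", "!" or "?", and paragraphs are assumed to be separated by blank lines. AutoMap's own rules may differ.

```python
import re


def split_units(text, level="sentence"):
    """Split a text into the units within which statements may be coded."""
    if level == "text":
        units = [text]
    elif level == "paragraph":
        # Assumption: paragraphs are separated by one or more blank lines.
        units = re.split(r"\n\s*\n", text)
    elif level == "sentence":
        # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
        units = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError("level must be 'sentence', 'paragraph' or 'text'")
    return [unit.strip() for unit in units if unit.strip()]


sample = "Teams share models. Models guide work.\n\nMaps show models."
sentences = split_units(sample, "sentence")
paragraphs = split_units(sample, "paragraph")
whole_text = split_units(sample, "text")
```

With sentence-level punctuation, no statement would connect "work" in the second sentence to "Maps" in the third; with the entire text as one unit, it could.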

Windowing is a method that codes the (filtered) text as a map by placing relationships between pairs of concepts that occur within a window. A window is a set of contiguous concepts. By determining the window size the user defines how far apart concepts can be from each other and still have a relationship.

Delete List:

Deletion removes words from the text which do not help answer the research question such as proper names, pronouns, conjunctions, articles, prepositions and notations.
AutoMap has two delete lists available – an extensive one and a limited one – and the researcher can modify these or design a unique one.
To view these Delete Lists, go to the Delete Concepts Index Card and select one of the Lists.

To construct an optimal Delete List the user should start from a predefined set of concepts and modify this list during the process of coding.
AutoMap supports this process.
The user can choose one of the predefined Delete Lists.
After running a first analysis, the map output should be checked to decide which further concepts should be deleted.
The Delete List is then extended by these concepts.
The intermediate versions of the filtered text are displayed on the Concepts deleted Index Card and can be saved.
The process can be repeated as often as necessary.
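
The iterative use of a delete list can be sketched as follows. This is a minimal Python illustration, not AutoMap's implementation: `apply_delete_list`, the tokenizing pattern and the sample list are assumptions made for the example.

```python
import re


def apply_delete_list(text, delete_list):
    """Return the words of the text that are not on the delete list."""
    delete = {word.lower() for word in delete_list}
    # Assumption: a word is a run of letters or apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text)
    return [token for token in tokens if token.lower() not in delete]


text = "The members of the team share a model of their work"

# First pass with a small, hypothetical delete list.
delete_list = {"the", "a", "of", "their"}
first_pass = apply_delete_list(text, delete_list)

# After inspecting the output, the user extends the list and runs again.
delete_list |= {"share"}
second_pass = apply_delete_list(text, delete_list)
```

Each pass yields an intermediate filtered text that can be inspected before deciding whether to extend the list further.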

Thesaurus:

A Thesaurus is built from synonyms and key concepts. The synonyms will be translated into the key concepts. A Thesaurus is typically designed specifically for a dataset. AutoMap uses the entries in the thesaurus to search the text and “translate” specific words and phrases into more basic concepts specified by the researcher.

The user can decide whether or not to keep the rest of the text after replacing concepts.

To generate an optimal Thesaurus the user should apply a first draft of the Thesaurus and modify it during the process of coding.
AutoMap supports this process.
After running a first analysis, the Map-output can be checked to decide which further concepts should be replaced.
Then the Thesaurus can be modified.
The intermediate versions of the filtered text are displayed on the Concepts replaced Index card and can be saved.
The process can be repeated as often as necessary.

Once the user has defined a Thesaurus, AutoMap offers two ways to apply it: when the words and phrases included in the thesaurus are replaced by the corresponding key concepts, the rest of the text can be maintained or neglected. The difference between the two is the resulting data reduction, which is much higher if the rest of the text, which is not included in the thesaurus, is neglected.
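
The two ways of applying a thesaurus can be sketched in Python. The sketch is an assumption-laden illustration: `apply_thesaurus`, its `keep_rest` parameter and the sample entries are hypothetical, and only single-word entries are handled, although AutoMap's thesaurus also covers phrases.

```python
def apply_thesaurus(tokens, thesaurus, keep_rest=True):
    """Translate synonyms into key concepts.

    thesaurus maps a lower-case synonym to its key concept. With
    keep_rest=False, words that are not in the thesaurus are dropped.
    """
    result = []
    for token in tokens:
        key_concept = thesaurus.get(token.lower())
        if key_concept is not None:
            result.append(key_concept)
        elif keep_rest:
            result.append(token)
    return result


# Hypothetical thesaurus: synonym -> key concept.
thesaurus = {"programmers": "developer", "coders": "developer", "bugs": "defect"}
tokens = ["programmers", "fix", "bugs", "while", "coders", "rest"]

maintained = apply_thesaurus(tokens, thesaurus, keep_rest=True)
neglected = apply_thesaurus(tokens, thesaurus, keep_rest=False)
```

In this example the maintained version still has six words, while the neglected version is reduced to the three key concepts.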

When the pre-processed text is analyzed in order to code a map, statements are placed between the concepts within every single window.
If the user did not apply a delete list or a thesaurus, statements are placed between all contiguous concepts.
If a delete list and/or a thesaurus was applied in such a way that concepts were replaced by the corresponding key concepts while the rest of the text was maintained, statements are placed between the concepts of the pre-processed text.
If the user applied the thesaurus in such a way that concepts were replaced by the corresponding key concepts and the rest of the text was neglected, AutoMap offers two methods to place statements between concepts: direct adjacency and rhetorical adjacency.

Direct adjacency means that only the key concepts are maintained, while the rest of the text is not considered. AutoMap displays the text that is to be analyzed as the resulting plain string of key concepts.
Rhetorical adjacency means that the original distance between the key concepts is considered in the analysis. In this approach AutoMap displays the resulting text as a string of xxx symbols and key concepts. The xxx symbols can be considered placeholders for the neglected text.
Examples of the different approaches to applying the thesaurus are provided in the AutoMap Help under the root directory of AutoMap.
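
The difference between the two adjacency methods can be sketched in Python. This is an illustration rather than AutoMap's implementation: `thesaurus_only` is a hypothetical function, and it is assumed that each neglected word is replaced by exactly one xxx symbol.

```python
PLACEHOLDER = "xxx"


def thesaurus_only(tokens, thesaurus, adjacency="direct"):
    """Keep only key concepts; optionally mark neglected words with xxx."""
    if adjacency not in ("direct", "rhetorical"):
        raise ValueError("adjacency must be 'direct' or 'rhetorical'")
    result = []
    for token in tokens:
        key_concept = thesaurus.get(token.lower())
        if key_concept is not None:
            result.append(key_concept)
        elif adjacency == "rhetorical":
            # Assumption: one placeholder per neglected word.
            result.append(PLACEHOLDER)
    return result


# Hypothetical thesaurus and text.
thesaurus = {"programmers": "developer", "bugs": "defect"}
tokens = ["programmers", "often", "fix", "bugs"]

direct = thesaurus_only(tokens, thesaurus, "direct")
rhetorical = thesaurus_only(tokens, thesaurus, "rhetorical")
```

Under direct adjacency "developer" and "defect" become neighbors; under rhetorical adjacency they remain three positions apart, so a small window would not link them.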

Windowing

Windowing is a method that codes the (filtered) text as a map by placing relationships between pairs of concepts that occur within a window.
A window is a set of contiguous concepts.
By determining the window size the user defines how far apart concepts can be from each other and still have a relationship.
AutoMap offers Windowing as a completely automated process.
The user can select a window size between 2 and 100.
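
A sliding window over the concepts can be sketched in Python as follows. This is a minimal illustration and not AutoMap's algorithm: `window_statements` is a hypothetical name, relations are assumed to be undirected, and every pair of distinct concepts inside a window is assumed to be linked.

```python
def window_statements(concepts, window_size=2):
    """Return the sorted pairs of concepts that co-occur within a window."""
    if not 2 <= window_size <= 100:
        raise ValueError("window size must be between 2 and 100")
    pairs = set()
    for start in range(len(concepts)):
        window = concepts[start:start + window_size]
        first = window[0]
        # Linking the first concept to every later one in each window
        # covers all pairs that are fewer than window_size positions apart.
        for other in window[1:]:
            if other != first:
                pairs.add(tuple(sorted((first, other))))
    return sorted(pairs)


concepts = ["team", "model", "map", "text"]
narrow = window_statements(concepts, window_size=2)
wide = window_statements(concepts, window_size=3)
```

With a window size of 2 only directly neighboring concepts are linked (three statements here); a window size of 3 adds the pairs that are two positions apart (five statements).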

Windowing carries two risks:

Overlinking:     Placing relationships between concepts that are not semantically related but are simply contiguous.
Underlinking:    Not placing relationships between concepts that are semantically related but are distant from each other due to grammatical phrasing.
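
Both risks follow from the distance between concepts relative to the window size. The Python sketch below makes this explicit; `min_distance` and `is_linked` are hypothetical helpers, and the concept list is invented for illustration.

```python
def min_distance(concepts, a, b):
    """Smallest positional distance between occurrences of a and b, or None."""
    positions_a = [i for i, concept in enumerate(concepts) if concept == a]
    positions_b = [i for i, concept in enumerate(concepts) if concept == b]
    if not positions_a or not positions_b:
        return None
    return min(abs(i - j) for i in positions_a for j in positions_b)


def is_linked(concepts, a, b, window_size):
    """True if a and b can fall inside one window of the given size."""
    distance = min_distance(concepts, a, b)
    return distance is not None and distance < window_size


# Hypothetical filtered text: "team" and "succeeded" belong together but are
# separated by an inserted phrase; "weather" starts an unrelated remark.
concepts = ["team", "despite", "early", "setbacks", "succeeded", "weather"]

underlinked = not is_linked(concepts, "team", "succeeded", window_size=2)
recovered = is_linked(concepts, "team", "succeeded", window_size=5)
overlinked = is_linked(concepts, "succeeded", "weather", window_size=2)
```

A small window misses the related pair (underlinking), while even the smallest window links the unrelated neighbors (overlinking), which is why the window size has to be chosen with the research question in mind.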

5. References

For further information about textual analysis see:

http://www.hss.cmu.edu/departments/sds/faculty/carley/publications.htm

All listed abstracts are available at this address.
For full versions of the texts please contact Prof. Kathleen Carley: carley+@andrew.cmu.edu

Kathleen M. Carley, 1997, "Network Text Analysis: The Network Position of Concepts."
Chapter 4 in C. Roberts (Ed.), Text Analysis for the Social Sciences:
Methods for Drawing Statistical Inferences from Texts and Transcripts.
Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 79-100.

Kathleen M. Carley, 1997, "Extracting Team Mental Models Through Textual Analysis."
Journal of Organizational Behavior, 18: 533-558.

Abstract

An approach, called map analysis, for extracting, analyzing and combining representations of individual's mental models as cognitive maps is presented. This textual analysis technique allows the researcher to extract cognitive maps, locate similarities across maps, and combine maps to generate a team map. Using map analysis the researcher can address questions about the nature of team mental models and the extent to which sharing is necessary for effective teamwork. This technique is illustrated using data drawn from a study of software engineering teams. The impact of critical coding choices on the resultant findings is examined. It is shown that various coding choices have systematic effects on the complexity of the coded maps and their similarity. Consequently a thorough analysis requires analyzing the data several times under different coding choices. For example, re-analysis under different coding scenarios revealed that although members of successful teams tend to have more elaborate, more widely shared maps than members of non-successful teams, this difference is significant only when the data is unfiltered. Thus a better interpretation of this result is that all teams have comparable models, but successful teams are able to describe their models in more ways than are non-successful teams.

Kathleen Carley, 1994. "Extracting Culture through Textual Analysis." Poetics, 22:291-312.

Abstract

Language has been viewed as a window on the mind. Language is also a window on culture. Through analyzing texts the interplay between human cognition and culture can be examined. Through analyzing texts cognitive similarities and differences across individuals, which serve as a basis for culture, can be described. Through analyzing texts the impact of culture on individual behavior can be examined. In addition, such analyses can locate similarities and differences across cultures and changes within cultures. This paper explores the relative benefits for using content analysis and map analysis for extracting and analyzing culture given a series of texts. Content analysis has been the traditional textual analysis method used for examining culture. However, it is not theoretically grounded. In contrast, map analysis has received less use and is theoretically grounded in an understanding of human cognition. It is shown that under certain conditions map analysis subsumes content analysis. Researchers can thus use map analysis not only to extract and analyze culture but to examine the relationship between cognition and culture. Illustrative applications are drawn from four different studies.

Kathleen Carley, 1993, "Coding Choices for Textual Analysis: A Comparison of Content Analysis and Map Analysis."
In Marsden P. (Ed.), Sociological Methodology, 23: 75-126. Oxford: Blackwell.

Abstract

Content and map analysis, procedures for coding and understanding texts, are described and contrasted. Where content analysis focuses on the extraction of concepts from texts, map analysis focuses on the extraction of both concepts and the relationships among them. Map analysis thus subsumes content analysis. Coding choices that must be made prior to employing content-analytic procedures are enumerated, as are additional coding choices necessary for employing map-analytic procedures. The discussion focuses on general issues that transcend specific software procedures for coding texts from either a content-analytic or map-analytic perspective.

Kathleen Carley & Michael Palmquist, 1992, "Extracting, Representing and Analyzing Mental Models."
Social Forces, 70(3): 601-636.

Abstract

When making decisions or talking to others, people use mental models of the world to evaluate choices and frame discussions. This article describes a methodology for representing mental models as maps, extracting these maps from texts, and analyzing and comparing these maps. The methodology employs a set of computer-based tools to analyze written and spoken texts. These tools support textual comparison both in terms of what concepts are present and in terms of what structures of information are present. The methodology supports both qualitative and quantitative comparisons of the resulting representations. This approach is illustrated using data drawn from a larger study of students learning to write where it is possible to compare mental models of the students with those of the instructor.

Eleanor T. Lewis & Jana Diesner & Kathleen Carley, 2001, "Using Automated Text Analysis to Study Self-Presentation Strategies"
Presented at the Computational Analysis of Social and Organizational Systems (CASOS) conference, Pittsburgh Pennsylvania, July 2001. Available through the CASOS working paper series.

Abstract

Extracting and representing the networks of ties between concepts in a set of texts creates a “map” of each text. Map analysis allows a researcher to compare the networks of ties between concepts in these texts by systematically reducing their content. The goals of this research paper are to answer both a methodological and a substantive question. First, how do the choices a researcher makes about how to generate maps using an automated text program alter the results, and how do these results compare to the results of hand-coding? Second, how can we interpret the results of map analysis to better understand the strategies authors use to manage their self-presentation, a central purpose of many texts? The texts we use are a subsample of a dataset of applications by entrepreneurs for an “Entrepreneur of the Year” award. Applicants value uniqueness in their application’s content because it sets them apart and demonstrates their worthiness for the award, but the value placed on uniqueness in the structure of their accounts is not as clear. Our analysis allows us to extract four general self-presentation strategies: the prepared entrepreneur, the driven entrepreneur, the creative niche entrepreneur, and the humble entrepreneur (a single entrepreneur may employ multiple strategies).

Contact:

Prof. Kathleen Carley
Carnegie Mellon University, Pittsburgh, PA
E-Mail: Carley+@andrew.cmu.edu

Jana Diesner
Carnegie Mellon University, Pittsburgh, PA
janadiesner@gmx.net