Text Analysis of Open-Ends
Ennis, D. M. and Russ, W. (2019). IFPress, 22(3) 3-4.
Open-ended questions in surveys have always posed a challenge for the interpreter. When we design closed-ended questions in a survey, we assume that the respondent understands the meaning of the questions. However, to interpret responses to open-ended questions, we assume the opposite: that a survey analyst understands what a respondent means. In some areas where surveys are used, heavy reliance is placed on open-ended questions. This occurs frequently in “consumer perception” or “consumer takeaway” surveys of ads in false advertising cases. Open-ended responses are given considerable weight in these cases, where the occurrence of certain words and their counts relative to a control ad are often at the heart of the key evidence on which the decision-maker depends.
The decision-maker could be a judge in a bench trial, a jury, or an arbitrator. The decision-maker could also be a member of the staff of the National Advertising Division (NAD) who decides cases as part of the advertising industry’s self-regulatory body. It is clear from decisions made by the NAD that the advertiser, the challenger and the NAD itself may interpret the meaning(s) intended in responses to open-ended questions differently, and these differing interpretations lead to different counts that may contribute to deciding a case. For example, Schick Manufacturing Inc. challenged The Gillette Company when the latter made reference to “moisture strips”, “intensive moisture” and “moisturize” in advertising its Venus Divine Shaving System for Women. The challenger noted a much larger difference between control and test ads in messages related to the “moisturizing” benefit than the advertiser did. The NAD’s count was intermediate. Since the meaning of words depends heavily on the context in which they occur and on the possible bias of the interpreter, the task of obtaining reliable counts from open-ended questions in surveys is difficult even for a human, and even more problematic for a machine.
Despite the difficulty for machines, computational tools can provide information gleaned from open-ended questions to help an analyst decide the most likely meaning of a word or phrase and the frequency of occurrence of different possible meanings. Although the tools may not replace a human compiler, they offer useful descriptive statistics on the occurrence of individual words and word combinations, and they may also capture some of the textual context, as is attempted in natural language processing. In this technical report, we explore some of these computer-aided tools and apply them to an actual survey involving open-ended questions about the meaning of a product label to a consumer audience.
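As a rough illustration of the descriptive statistics mentioned above, the following Python sketch counts individual words and adjacent word pairs in a set of open-ended responses and compares how often a word stem is mentioned for a test ad versus a control ad. It is not the procedure used in this report: the responses are invented for illustration, and the function names, the small stopword list, and the prefix-matching rule are all assumptions chosen to keep the example short.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "it", "is", "and", "my", "of", "to", "i"}


def tokenize(text):
    """Lowercase a response and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(responses):
    """Count non-stopword tokens across all responses."""
    counts = Counter()
    for response in responses:
        counts.update(t for t in tokenize(response) if t not in STOPWORDS)
    return counts


def bigram_counts(responses):
    """Count adjacent word pairs within each response."""
    counts = Counter()
    for response in responses:
        tokens = tokenize(response)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def mention_rate(responses, stem):
    """Share of responses with at least one token starting with `stem`."""
    if not responses:
        return 0.0
    hits = sum(
        any(token.startswith(stem) for token in tokenize(response))
        for response in responses
    )
    return hits / len(responses)


# Invented responses, for illustration only (not survey data).
test_ad = [
    "It moisturizes my skin",
    "The strips add moisture while you shave",
    "A close shave",
    "It is a razor for women",
]
control_ad = [
    "A close shave",
    "It is a razor for women",
    "Smooth legs",
    "It moisturizes",
]

test_rate = mention_rate(test_ad, "moistur")
control_rate = mention_rate(control_ad, "moistur")
net_rate = test_rate - control_rate  # test minus control

print(word_counts(test_ad).most_common(3))
print(bigram_counts(test_ad).most_common(3))
print(test_rate, control_rate, net_rate)
```

A prefix match such as this one counts every token beginning with the stem, whether or not the respondent meant to describe a moisturizing benefit, so counts of this kind are a starting point for a human reviewer and not a substitute for one.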
Figure 1a. Word cloud from the original label.