Fig 1.
A graphical model of narratives.
For a story with a set of actants A1, …, An, the narrative can be divided into a set of contexts. In each context, the story is summarized as a set of interactions (relationships) between actants, as shown in the figure. An edge between actants A1 and A2 therefore carries the set of relationships that exist between the two actants, together with the significance of each relationship in that context. Note that relationships are derived not only from verbs but also from other syntactic structures in the text that imply relationships.
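The model described in this caption can be sketched as a small data structure. This is a minimal illustration rather than the authors' implementation: the class name `NarrativeModel`, its methods, the context label, and the significance values are all hypothetical.

```python
from collections import defaultdict


class NarrativeModel:
    """Hypothetical container: context -> (source, target) -> {relationship: significance}."""

    def __init__(self):
        self.contexts = defaultdict(lambda: defaultdict(dict))

    def add_relationship(self, context, source, target, relationship, significance):
        # An edge between two actants holds a set of relationships, each with
        # its own significance inside the given context.
        self.contexts[context][(source, target)][relationship] = significance

    def edge(self, context, source, target):
        # Return a copy of the relationships carried by one directed edge.
        return dict(self.contexts[context].get((source, target), {}))


model = NarrativeModel()
# Illustrative values only (assumed, not taken from the paper).
model.add_relationship("context_1", "A1", "A2", "threatens", 0.7)
model.add_relationship("context_1", "A1", "A2", "owns", 0.2)
model.add_relationship("context_2", "A1", "A2", "emails", 0.9)
```

Keeping the context as the outermost key means the same pair of actants can carry different relationships, with different significance, in different contexts.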
Fig 2.
Modeling the steps a user takes to generate a social media post or a story fragment for a given domain.
Fig 3.
Representation of the narrative framework discovery pipeline.
Most of these numbered blocks are described briefly in the main paper and in more detail in the S1 File.
Fig 4.
An example of syntax-based relationship extraction patterns.
The sentence, “The spark for the attack was the cache of e-mails stolen from John Podesta, chair of Clinton’s campaign” is analyzed to extract three relationship triples. These relationships are then aggregated across the entire corpus to create the final narrative network.
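The aggregation step mentioned in this caption can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `aggregate_triples` and the lowercase normalization are hypothetical, and the example triples are hand-written stand-ins, not actual output of the pipeline.

```python
from collections import Counter


def aggregate_triples(triples_per_sentence):
    """Count how often each normalized (subject, relationship, object) triple occurs."""
    counts = Counter()
    for triples in triples_per_sentence:
        for subject, relationship, obj in triples:
            # Assumed normalization: trim whitespace and lowercase each element.
            key = (
                subject.strip().lower(),
                relationship.strip().lower(),
                obj.strip().lower(),
            )
            counts[key] += 1
    return counts


# Hand-written, illustrative triples (assumed, not actual pipeline output).
corpus_triples = [
    [("e-mails", "stolen from", "John Podesta")],
    [
        ("E-mails", "stolen from", "John Podesta"),
        ("John Podesta", "chair of", "Clinton's campaign"),
    ],
]
edge_weights = aggregate_triples(corpus_triples)
```

Each distinct triple becomes a candidate edge in the narrative network, and its count gives a simple measure of how often the corpus asserts that relationship.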
Table 1.
Summary statistics for the extracted graphs from the two corpora.
Fig 5.
Relationship extraction patterns.
Patterns by total number for A: Pizzagate (top) and for B: Bridgegate (bottom). For example, SVO is (nsubj, verb, obj), SRL is (A0, Verb, A1) and (A0, Verb, A2). A larger list can be found in S1 File.
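As an illustration of the SVO pattern (nsubj, verb, obj), the sketch below pulls triples out of a sentence that has already been dependency-parsed. The token format, the function name `extract_svo`, and the hand-annotated parse are assumptions made for this example; a real pipeline would obtain the parse from a dependency parser and would also check that the head token is a verb.

```python
def extract_svo(tokens):
    """Return (nsubj, head, obj) triples from a dependency-parsed sentence.

    Each token is a dict with "text", "dep" (dependency label), and
    "head" (index of the governing token; the root points to itself).
    """
    subjects, objects = {}, {}
    for token in tokens:
        if token["dep"] == "nsubj":
            subjects.setdefault(token["head"], []).append(token["text"])
        elif token["dep"] == "obj":
            objects.setdefault(token["head"], []).append(token["text"])

    triples = []
    for head, subject_texts in subjects.items():
        for subject in subject_texts:
            for obj in objects.get(head, []):
                # The shared head token supplies the relationship label.
                triples.append((subject, tokens[head]["text"], obj))
    return triples


# Hand-annotated parse of "James Alefantis threatens Pizzagate researcher".
parsed = [
    {"text": "James", "dep": "compound", "head": 1},
    {"text": "Alefantis", "dep": "nsubj", "head": 2},
    {"text": "threatens", "dep": "ROOT", "head": 2},
    {"text": "Pizzagate", "dep": "compound", "head": 4},
    {"text": "researcher", "dep": "obj", "head": 2},
]
svo_triples = extract_svo(parsed)
```

This sketch returns only head words; expanding them to full noun phrases (for example "James Alefantis") would require following the compound dependents as well.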
Table 2.
A sample of the top 5 supernodes and subnodes for Pizzagate and Bridgegate.
Table 3.
Comparison of pipeline actant discovery with the gold standard evaluation data.
Fig 6.
Comparison of our results with the NY Times Pizzagate hand-drawn graph.
Edges and nodes that the pipeline does not discover among the top-ranked actants are greyed out (cannibalism). Highly ranked edges and nodes that we discover but that are not included in the NY Times illustration are shown in green (Bill Clinton and Clinton Foundation). We maintain the NY Times visual convention of using dashed lines to identify relationships based on the conspiracy theorists’ interpretation of hidden knowledge. The ranking of each actant, as discovered by our pipeline, immediately follows its node label.
Table 4.
Comparison of pipeline inter-actant relationship discovery with the NY Times and the gold standard corpora.
Fig 7.
A subnetwork of the Pizzagate narrative framework.
Some of the nodes are subnodes (e.g. “Clinton Foundation”), and others are supernodes (e.g. “Pizzagate”). Because we pick only the lead verbs for labeling edges, the contextual meaning of a relationship becomes clearer when one considers the entire relationship phrase. For example, the relationship “began” connecting “Pizzagate” to “Hillary Clinton Campaign email….” derives from sentences such as, “What has come to be known as Pizzagate began with the public release of Hillary Clinton campaign manager John Podesta’s emails by WikiLeaks…”. Similarly, the edge labeled “threaten” connecting “Alefantis” to the “Pizzagate” supernode is derived from sentences such as, “James Alefantis threatens Pizzagate researcher….”. Here the supernode “Pizzagate” includes the entity “Pizzagate researcher,” which appears as a subnode.
Fig 8.
Identification of the Pizzagate narrative framework from the Pizzagate corpus.
Subnodes with a mention frequency count < 265, together with their edges, are removed from the community-partitioned network obtained from Algorithm 1 (see Fig 10 for the network before filtering). Solid nodes are core nodes, while nodes without color, such as “fbi”, are non-core nodes. Colors are based on the core nodes’ assigned community, and all relationships between two nodes are collapsed to a single edge. Core nodes and open shared nodes are each assigned to a community based on their respective thresholds (see Algorithm 1). Pizzagate subnodes are concatenated into their supernodes, which are outlined in red, while the subnodes retain their community coloring. Contextual communities are shaded in yellow, metanarratives in blue, nucleations in green, and unrelated discussions in purple.
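The frequency filter described in this caption can be sketched as below. Only the cutoff of 265 comes from the caption; the function name `filter_by_frequency`, the node names, and the mention counts are hypothetical.

```python
def filter_by_frequency(mention_counts, edges, min_count=265):
    """Drop nodes mentioned fewer than min_count times, along with their edges."""
    kept_nodes = {node for node, count in mention_counts.items() if count >= min_count}
    # An edge survives only if both of its endpoints survive.
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


# Illustrative counts and edges (assumed, not from the corpus).
mention_counts = {"node_a": 900, "node_b": 265, "node_c": 264, "node_d": 12}
edges = [("node_a", "node_b"), ("node_b", "node_c"), ("node_c", "node_d")]
kept_nodes, kept_edges = filter_by_frequency(mention_counts, edges)
```

Because the caption removes subnodes with a count below 265, a node with exactly 265 mentions is retained in this sketch.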
Fig 9.
A three dimensional visualization of the narrative framework for Pizzagate in terms of domains.
Top, A: the graph including the relationships generated by WikiLeaks; the aggregate graph in blue shows a single large connected component. Bottom, B: the graph with the WikiLeaks relationships removed; at the aggregate level, the remaining domains appear as disjoint components. In the Pizzagate conspiracy theory, the different domains have been causally linked via a single dubious source: the conspiracy theorists’ interpretations of the leaked emails released by WikiLeaks. No such keystone exists in the Bridgegate narrative network.
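The contrast between the two panels can be illustrated by counting connected components before and after removing one class of edges. The sketch below uses a toy graph: the node names, the `source` tag on each edge, and the helper `connected_components` are assumptions made for the example, not the paper's data or code.

```python
def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node collects one component.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Toy graph (assumed): edges are tagged with the source that generated them.
nodes = ["a1", "a2", "b1", "b2", "c1"]
tagged_edges = [
    ("a1", "a2", "domain"),
    ("b1", "b2", "domain"),
    ("a2", "b1", "wikileaks"),
    ("b2", "c1", "wikileaks"),
]
all_edges = [(u, v) for u, v, _ in tagged_edges]
without_leaks = [(u, v) for u, v, source in tagged_edges if source != "wikileaks"]

with_count = len(connected_components(nodes, all_edges))
without_count = len(connected_components(nodes, without_leaks))
```

In this toy graph the full edge set forms one component, while dropping the tagged edges leaves three, mirroring the single component in panel A and the disjoint domains in panel B.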
Fig 10.
Community detection on the overall Pizzagate corpus.
Subnodes are colored based on their assigned community, and all relationships between any two subnodes are collapsed to a single edge. Solid core nodes and open shared nodes are each assigned to a community based on their respective thresholds (see Algorithm 1). Main Pizzagate supernodes are outlined in red and include their subnodes, colored by community. Meta-narrative frameworks are shaded in blue, context groupings in yellow, and narrative framework nucleations in green. Unrelated discussions are circled in purple. The entire Pizzagate narrative framework is highlighted with a red box (see Fig 8 for a frequency-filtered version of this figure).
Fig 11.
Time series of the first mention of Bridgegate entities.
Starting with the events of September 2013.
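A first-mention time series of this kind can be computed from dated mentions as sketched below. The function name `first_mentions`, the entity names, and the dates are hypothetical placeholders, not values from the Bridgegate corpus.

```python
from datetime import date


def first_mentions(mentions):
    """Map each entity to its earliest mention date, ordered chronologically."""
    earliest = {}
    for day, entity in mentions:
        if entity not in earliest or day < earliest[entity]:
            earliest[entity] = day
    # Sort by date, then by name so that ties are ordered deterministically.
    return sorted(earliest.items(), key=lambda item: (item[1], item[0]))


# Placeholder mentions (assumed dates and entity names).
mentions = [
    (date(2013, 9, 12), "entity_a"),
    (date(2013, 9, 9), "entity_a"),
    (date(2013, 10, 1), "entity_b"),
]
timeline = first_mentions(mentions)
```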
Fig 12.
Selection of nodes from the Podesta subnode egonet subgraph.
The self-loop edge for the node “Podesta” is labeled with an automatically derived description of John Podesta as the Clinton Campaign Chair.
Fig 13.
Subgraph of the Podesta supernode.
The supernode consists of several subnodes, including those automatically labeled as leaked emails, Tony Podesta, John Podesta, the Podesta brothers, and John Podesta as Hillary Clinton’s campaign manager. The most significant context-dependent relationships for each of the subnodes are presented as labeled, directed edges. See Fig 14 for further examples where both ends of the relationships are shown.
Fig 14.
A subset of the ego network for Bridget Anne Kelly.
The specific relationships between Kelly and other important named individuals, as determined by both their frequency and their centrality in the narrative network, are revealed in this ego network subgraph. For each named entity, we added a self-loop edge labeled with automatically derived descriptions of the entity. These relationships show the endemic nature of the Bridgegate conspiracy: all actants are local to New Jersey politics. Because the edges are labeled with only the lead verbs appearing in the relationship phrases, the edge labels can be difficult to understand; however, because we retain the original phrases, the relationship can be recovered. For example, the relationship “pinned” from Christie to Kelly can be linked to sentences such as: “Critchley said evidence to support that claim is contained in interview summaries that accompanied a report commissioned by Christie’s office that pinned the blame largely on Kelly and Wildstein.”
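The idea of labeling an edge with its lead verb while retaining the original phrase can be sketched as follows. The edge dictionary layout and the helpers `ego_network` and `phrases_for` are hypothetical; the “pinned” phrase is taken from the example sentence in this caption, while the self-loop description and the unrelated edge are placeholders.

```python
def ego_network(edges, ego):
    """Return every edge that has the ego node as its source or target."""
    return [edge for edge in edges if ego in (edge["source"], edge["target"])]


def phrases_for(edges, source, target, lead_verb):
    """Recover the retained relationship phrases behind a lead-verb edge label."""
    phrases = []
    for edge in edges:
        if (edge["source"], edge["target"], edge["lead_verb"]) == (source, target, lead_verb):
            phrases.extend(edge["phrases"])
    return phrases


edges = [
    {
        "source": "Christie",
        "target": "Kelly",
        "lead_verb": "pinned",
        "phrases": ["pinned the blame largely on Kelly and Wildstein"],
    },
    # Self-loop carrying an entity description (placeholder text).
    {"source": "Kelly", "target": "Kelly", "lead_verb": "is", "phrases": ["(description placeholder)"]},
    # Unrelated edge between hypothetical actants; excluded from Kelly's ego network.
    {"source": "actant_x", "target": "actant_y", "lead_verb": "met", "phrases": ["met with"]},
]

kelly_edges = ego_network(edges, "Kelly")
pinned_phrases = phrases_for(edges, "Christie", "Kelly", "pinned")
```

Storing the phrases alongside the lead verb is what lets a terse label such as “pinned” be expanded back into its full context.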
Fig 15.
Two closeups of labeled edges related to Pizzagate.
Excerpts from our auto-generated NY Times matched Pizzagate graph reveal the relationships between a subset of nodes. Top A: the graph reveals that John Podesta follows Satanism, and bottom B: that Tony Podesta owns weird art that uses coded language to promote pedophilia.
Fig 16.
Comparison of relationship labels generated by our automated methodology with the NY Times Bridgegate graph for Chris Christie.
The most significant relationship labels from the “Chris Christie” node to other nodes are displayed here. For each node, we also include one descriptive phrase that our pipeline found automatically. These descriptive phrases closely match the roles portrayed in the NY Times Bridgegate graph. As in other figures, the edge labels show only the most important verb of the associated relationship phrase; the remaining words in the phrase provide the context needed to interpret the verb. For example, the verb “pinned” connecting Christie to Bridget Anne Kelly is part of the phrase “pinned the blame on,” which we extracted from the text.