
Fig 1.

A representation of the steps in iterative abductive analysis.

The process begins with conducting experiments and flows clockwise through reasoning about data and what experiments to perform next, whereupon the process repeats. Deductive analyses include these steps, but modeling occurs before experiments, so that the steps are rearranged. Parts of many of these steps (e.g., computing model properties) can be automated, and this automation is the focus of this paper. Other steps are not automated, such as the process of developing a model, because this requires a significant element of human reasoning. Thus, our software system requires human-in-the-loop execution. The process can be used in a purely experimental approach (i.e., no modeling). See the text for a description of this graphic.

Fig 2.

Roadmap of, and relationships among, sections in this manuscript.

Arrows indicate dependencies among sections, and dashed arrows identify the theoretical models that impact the design and implementation of the software pipeline system. The Introduction, Related Work, and Conclusions are not shown. See text for details.

Fig 3.

Five software pipelines (in gray) for NESS experiments.

The five pipelines are itemized and described in Table 1. In this human-in-the-loop analysis, experiments (upper left in the figure) are performed. Any experiment whose data can be cast in terms of the data model specification can be analyzed with this system. These pipelines are the focus of this work. The pipeline composition shown here, for abductive looping, is one of several possibilities. The first, second, and fifth pipelines can be used with a purely experimental approach (omitting modeling). An earlier version of the pipeline system is provided in [34], Fig 1.

Table 1.

Description of the five pipelines for NESS experiments.

Fig 4.

The three types of models described in this work: (Abstract) data model, graph dynamical system model, and pipeline model.

The data model enables rigorous reasoning about both (i) experiments and experimental data specifications (requirements) and (ii) modeling and simulation (MAS) specifications. It, along with the graph dynamical system (GDS) model, helps to ensure consistency and correspondence between experiments and MAS. We use GDS to model the dynamics of particular application systems. Specific data sources and modeling approaches are shown. These are used within our pipeline model. Figure adapted from [34].

Table 2.

This work involves three major topics (left column of table): Data representation, modeling representation, and software pipelines.

Fig 5.

An application-specific pipeline is composed of an invariant framework that performs general operations (see text) and application-specific h-functions.

Table 3.

Definition of our abstract data model.

Fig 6.

Sequence of data models for reasoning about experiments and modeling and simulation.

We advocate prepending the abstract data model to the front end of the modeling process, as shown here. Table 3 shows our abstract data model, and Fig 7 shows this data model translated into an entity-relationship diagram in unified modeling language (UML) form. The table and figures in A (which support Section 7) show the Data Common Specification for our software design.

Fig 7.

Data model of Table 3 translated into an entity-relationship diagram in unified modeling language (UML) form.

This illustrates that the abstract data model can be translated into customary forms of data models (e.g., UML) that are more amenable to software development.

Table 4.

Symbols used to describe our computational model known as a discrete Graph Dynamical System (GDS).

Fig 8.

Network G(V, E) for a GDS example, with V = {v1, v2, v3, v4, v5, v6}.

Thresholds θi are provided for nodes vi, in blue, by the respective nodes. The local functions fi are threshold functions for vi ∈ V, 1 ≤ i ≤ 6; see text for details. The discrete system dynamics are given by the configurations at successive times from 0 to 4, at the right in the figure. Each configuration is given by the vector of node states. The system reaches a fixed point at time t = 3, as evidenced by no change in the configuration in going from t = 3 to t = 4.
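The synchronous threshold dynamics described here can be sketched in a few lines of Python. The network, threshold values, and initial configuration below are illustrative stand-ins (the figure's exact values are not reproduced in this text), but the update rule and fixed-point check follow the GDS description:

```python
# Minimal sketch of a synchronous graph dynamical system (GDS) with
# threshold local functions. The network, thresholds, and initial
# configuration are assumed for illustration.
neighbors = {
    1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5],
    4: [2, 5, 6], 5: [3, 4, 6], 6: [4, 5],
}
theta = {i: 1 for i in range(1, 7)}  # threshold values (assumed)

def step(x):
    # Node i takes state 1 when the number of 1s in its closed
    # neighborhood (itself plus neighbors) meets its threshold.
    return {i: int(x[i] + sum(x[j] for j in neighbors[i]) >= theta[i])
            for i in x}

x = {1: 1, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0}  # assumed initial configuration
trajectory = [x]
for t in range(4):                         # configurations at times 0..4
    x = step(x)
    trajectory.append(x)

# A fixed point is reached when two successive configurations agree.
fixed_point = trajectory[-1] == trajectory[-2]
```

With these assumed values, the state spreads through the network and the system reaches a fixed point at t = 3, mirroring the behavior the caption describes.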

Fig 9.

Conceptual view of a pipeline that is composed of the pipeline framework (represented by the bounding box) and the h-functions that provide the application-based functionality of a particular pipeline.

Functions, or h-functions, hi, 1 ≤ i ≤ 3, are implemented as software within a pipeline. The pipeline framework (red box) controls the execution order of functions, and the inputs and outputs for each function, through a pipeline job specification. Circles in the figure denote input and output digital objects, such as ASCII files or database tables. This figure is a more detailed representation of Fig 4. Adapted from [34], Fig 3.

Fig 10.

One arbitrary software h-function within a pipeline.

Data instances , , and are transformed by transformation code τ1 to conform to the required input for h. Similarly, and are used by τ2 to produce input . Outputs from the h-function are , , and . Inputs and outputs are subjected to verification through comparisons with specified schemas (not shown here). The pipeline framework is represented by the red box, which controls execution of the h-functions and transformation codes. This is a more detailed representation of Figs 5 and 9.

Fig 11.

Two pipelines are shown to illustrate similarities and differences between them.

To run a pipeline (called a job), a pipeline-specific configuration input file is verified and read by the pipeline framework. The file specifies h-functions and their order of execution, as well as required input files for the pipeline. Here we show how function h1 is executed in pipeline 1 and how h4 is executed in pipeline 2. The pipeline framework invokes the corresponding functions. If specified in the configuration file, the pipeline framework invokes a transformation function that transforms the contents of one or more files into an input file of the correct format for the h-function. There may be one transformation function for each direct input to an h-function. At appropriate points in a pipeline, data files are verified against their corresponding JSON schemas (input file verification). The h-function is executed and output files are generated (these digital-object outputs may be, e.g., plot files, ASCII data files, and binary data files). There may be additional h-functions within pipeline 1, indicated by the ellipsis below the pipeline 1 function h1 execution. In this example, outputs from the generic pipeline 1 are inputs for the generic pipeline 2. Function h4 in pipeline 2 is executed in a similar fashion to function h1 in pipeline 1. See the text for descriptions of these various components. Note: the pipeline framework (in brown) is the same code for all pipelines. See Table 5 for implementation details of the elements in this figure.
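The control flow described here (read a job specification, optionally transform inputs, then invoke each h-function in order) can be sketched as follows. All function, field, and file names are hypothetical illustrations, not the authors' actual API:

```python
# Minimal sketch of the pipeline-framework idea: a job specification
# names h-functions, their execution order, and their inputs; the
# framework applies optional transformations, then invokes each
# function. Names are illustrative, not the authors' actual code.

def h1(data):
    # Example h-function: count actions per player.
    counts = {}
    for player, action in data:
        counts[player] = counts.get(player, 0) + 1
    return counts

def to_pairs(rows):
    # Example transformation: raw dict rows -> (player, action) tuples.
    return [(r["player"], r["action"]) for r in rows]

JOB_SPEC = {
    "steps": [
        {"function": h1, "transform": to_pairs, "input": "raw_rows"},
    ]
}

def run_pipeline(spec, inputs):
    outputs = {}
    for step in spec["steps"]:
        data = inputs[step["input"]]
        if step.get("transform"):          # optional pre-transformation
            data = step["transform"](data)
        outputs[step["function"].__name__] = step["function"](data)
    return outputs

result = run_pipeline(JOB_SPEC, {"raw_rows": [
    {"player": "p1", "action": "word"},
    {"player": "p1", "action": "request"},
    {"player": "p2", "action": "reply"},
]})
# result["h1"] == {"p1": 2, "p2": 1}
```

Because the framework code is the same for every pipeline, only the job specification and the h-functions change from one application to another, which is the design point the figure makes.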

Table 5.

Sections and files from the execution of a generic pipeline.

Table 6.

Configuration input file description.

Table 7.

Summary table of h-functions.

Fig 12.

The anagram game screen, phase-2, for one player.

This player has their own letters “R,” “O,” and “L” and has requested an “E” and an “A” from neighbors. The “E” is green, so this player's request has been fulfilled and the “E” can be used in forming words; the request for “A” is still outstanding, so it cannot yet be used in words. Below these letters, the screen shows that Player 2 has requested “O” and “L” from this player. This player can reply to these requests, if she so chooses. Below that is a box where the player types and submits new words.

Fig 13.

Case study 1.

Partial representation of the data model for the online experiment composed of 3 phases with a set V of players (n = |V|). The phase 1 DIFI measure, a proxy for CI, uses a null (i.e., empty) network on n players; i.e., there are no edges in the graph because players play individually. In phase 2, a team-based CI-priming game, edges E are communication channels. Initial conditions Bv include letter assignments to players. The individual DIFI measure is repeated in phase 3. The action set A and illustrative action tuples Ti are given for each phase.

Fig 14.

The Data Analytics Pipeline (DAP) was executed to analyze phase 2 of three experiments with n = 6 and d = 5.

The time series of the number of words formed by each player for experiment #2 is generated by function h3.

Fig 15.

The Data Analytics Pipeline (DAP) was executed to analyze phase 2 of three experiments with n = 6 and d = 5.

The histogram of the number of “letter request” actions for three experiments is generated by function h5. The x-axis is time in the group anagram game, binned in 30-second intervals.
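The binning step behind such a histogram is straightforward; a minimal sketch (the timestamps below are invented for illustration) is:

```python
# Sketch: bin "letter request" action timestamps (in seconds) into
# 30-second intervals, as in the histogram. Timestamps are made up.
request_times = [3, 12, 28, 31, 45, 95, 100, 118]

bin_width = 30
counts = {}
for t in request_times:
    b = t // bin_width  # bin index: 0 -> [0, 30), 1 -> [30, 60), ...
    counts[b] = counts.get(b, 0) + 1
# counts maps bin index -> number of requests in that interval
```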

Fig 16.

The Data Analytics Pipeline (DAP) was executed to analyze phase 2 of three experiments with n = 6 and d = 5.

The discrete-time actions for all three experiments are generated by function h7. This output will inform the Property Inference Pipeline in computing parameters for simulation models. Time (in seconds) is shown in the first row as 1, 2, 3, …, and counts of the z vector components, per player and per experiment, are given.

Fig 17.

The Property Inference Pipeline receives the input from h7 of the Data Analytics Pipeline (DAP).

The parameters in this figure were generated to inform an agent-based model (ABM) for the Modeling and Simulation Pipeline (MASP). The transitions in the figure are from i to j, where ai ∈ A is the action at time t and aj ∈ A is the action at (t + 1). Rows not shown indicate that there are no such transitions in the data.
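Estimating such action-transition parameters from a discrete-time action sequence amounts to counting consecutive pairs and normalizing by row. A minimal sketch (the action sequence is invented; this is the general technique, not necessarily the authors' exact estimator):

```python
# Sketch of property inference for an ABM: estimate transition
# probabilities P(a_j at t+1 | a_i at t) from an action sequence.
from collections import Counter

actions = ["idle", "request", "request", "reply", "word", "idle",
           "request", "reply", "word", "word"]   # illustrative data

# Count consecutive (a_i, a_j) pairs.
pair_counts = Counter(zip(actions, actions[1:]))

# Total transitions out of each action a_i.
row_totals = Counter(a for a, _ in pair_counts.elements())

# Conditional transition probabilities; pairs never observed are
# simply absent (the "rows not shown" in the figure).
P = {(i, j): c / row_totals[i] for (i, j), c in pair_counts.items()}
```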

Fig 18.

The Modeling And Simulation Pipeline (MASP) and Model Evaluation And Prediction Pipeline (MEAPP) were run to obtain simulation results and model predictions, and to compare experimental data to model predictions.

All three plots contain model predictions and use results from h1 of the MASP. Function h1 of MEAPP plots corresponding experimental and model output data (top plot) and compares experiment and model output using KL-divergence (center plot) for six parameters. Function h2 of MEAPP uses h3 of the Data Analytics Pipeline (DAP) to plot model predictions from h1 of the MASP (bottom plot), where now n = 15 (in experiments, n = 6).
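The KL-divergence comparison between an experimental distribution p and a model-predicted distribution q can be sketched as follows (the two distributions are hypothetical; both must be over the same bins, sum to 1, and q must be positive wherever p is):

```python
# Sketch of a KL-divergence comparison: D_KL(p || q) between an
# experimental distribution p and a model prediction q (both assumed).
import math

p = [0.5, 0.3, 0.2]   # experimental distribution (assumed)
q = [0.4, 0.4, 0.2]   # model-predicted distribution (assumed)

# D_KL(p || q) = sum_i p_i * ln(p_i / q_i); terms with p_i = 0 vanish.
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

A value of 0 indicates identical distributions; larger values indicate greater divergence between experiment and model.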

Table 8.

Data common specification.

Fig 19.

JSON schema for the “Experiment” of the data common specification.

Fig 20.

JSON schema for the “Phase” of the data common specification.

Fig 21.

JSON schema for the “Phase Description” of the data common specification.

Fig 22.

JSON schema for the “Player” of the data common specification.

Fig 23.

JSON schema for the “Action” of the data common specification.

Fig 24.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

This figure shows a portion of the schema for a configuration file that specifies the experiment JSON schema file location.
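A schema fragment of this kind, together with a minimal validation check, might look like the following sketch. The field name and the check are hypothetical illustrations, not the authors' actual schema (which the figure shows):

```python
# Hypothetical fragment of a configuration-file JSON schema that
# requires a property giving the location of the experiment JSON
# schema file. Field names are illustrative.
import json

schema_fragment = {
    "type": "object",
    "properties": {
        "experiment_schema_location": {"type": "string"},
    },
    "required": ["experiment_schema_location"],
}

config = json.loads('{"experiment_schema_location": "schemas/experiment.json"}')

# Minimal check in the spirit of JSON-schema validation (stdlib only):
# every required key is present, and string-typed keys hold strings.
missing = [k for k in schema_fragment["required"] if k not in config]
valid = not missing and all(
    isinstance(config[k], str)
    for k, spec in schema_fragment["properties"].items()
    if spec["type"] == "string" and k in config
)
```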

Fig 25.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

This figure shows a portion of the schema for a configuration file that specifies the phase description JSON schema file location.

Fig 26.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

This figure shows a portion of the schema for a configuration file that specifies the phase JSON schema file location.

Fig 27.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

This figure shows a portion of the schema for a configuration file that specifies the action description JSON schema file location.

Fig 28.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

This figure shows a portion of the schema for a configuration file that specifies the player description JSON schema file location.

Fig 29.

To run a pipeline (called a job), a configuration input file specifies functions and their order of execution.

In this configuration file there are five possible functions that can be executed in any order. This figure shows a portion of the schema for a configuration file that specifies how to compose and execute one or more functions of a simple pipeline. For example, it defines that a parameter called “actionId” is necessary only for functions h2 through h5.

Fig 30.

An example of the (1) Experimental Data Transformation Pipeline execution, which transforms raw experimental data into the data common specification.

Here we show how function h1 is executed, using an input CSV file as an example of the “Completed Session Summary” input file. If necessary, file contents are transformed to obtain the direct input for a function in the correct format: here, the “Completed Session Summary” CSV input file is transformed into a “Completed Session Summary” JSON file that becomes the input for the function. After verification of formats against the corresponding JSON schemas, the function is executed and output files are generated. The output shown is the JSON file for the “Experiment” data common specification.
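A CSV-to-JSON transformation of this kind can be sketched with the Python standard library. The file contents below are invented stand-ins for the real “Completed Session Summary” data:

```python
# Sketch of the transformation step: a "Completed Session Summary"
# CSV (contents invented here) is converted to JSON before being
# passed to the h-function and validated against its schema.
import csv
import io
import json

csv_text = "player,words_formed\np1,4\np2,7\n"   # stand-in for the real file

# Parse CSV rows into dicts, then serialize as JSON.
rows = list(csv.DictReader(io.StringIO(csv_text)))
json_text = json.dumps(rows)

# The h-function would then read this JSON as its direct input.
records = json.loads(json_text)
```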

Fig 31.

An example of the (2) Data Analytics Pipeline execution, which analyzes data files in the common specification.

Here we show how function h7 is executed. Input files are validated against their corresponding JSON schemas; an example JSON schema file for the “Experiment” description input file is shown (Fig 19 contains the whole file). If necessary, file contents are then transformed to obtain the direct input for the function in the correct format. After verification, function h7 is executed and output files are generated. In this example, the output file is an input for the (3) Property Inference Pipeline.

Table 9.

Listing of types of functions as microservices for the (1) Experimental Data Transformation Pipeline (EDTP).

Table 10.

Listing of types of functions as microservices for the (2) Data Analytics Pipeline (DAP).

Table 11.

Listing of types of functions as microservices for the (3) Property Inference Pipeline (PIP).

Table 12.

Listing of types of functions as microservices for the (4) Modeling and Simulation Pipeline (MASP).

Table 13.

Listing of types of functions as microservices for the (5) Model Evaluation and Prediction pipeline (MEAPP).

Fig 32.

Elements of the data model (Table 3) for the online social network experiment in [3].

Table 14.

Online social network experiment in [3], defined with our data model.

Fig 33.

Data model of Table 14 translated into an entity-relationship diagram in Unified Modeling Language (UML) form.

Table 15.

How the structure of communication networks among actors can affect system-level performance is studied in [44].

Fig 34.

Data model of Table 15 translated into an entity-relationship diagram in Unified Modeling Language (UML) form.
