Blaunet: An R-based graphical user interface package to analyze Blau space

McPherson’s Blau space and affiliation ecology model is a powerful tool for analyzing the ecological competition among social entities, such as organizations, along a combination of sociodemographic characteristics of their members. In this paper we introduce the R-based Graphical User Interface (GUI) package Blaunet, an integrated set of tools to calculate, visualize, and analyze the statuses of individuals and social entities in Blau space, parameterized by multiple sociodemographic traits as dimensions. The package calculates the Blau statuses at the nodal, dyadic, and meso levels based on three types of information: sociodemographic characteristics, group affiliations (e.g., membership in groups/organizations), and network ties. To this end, Blaunet has five main capabilities. It can: 1) identify a list of possible salient dimensions; 2) calculate, plot, and analyze niches for social entities by measuring the social distance along the salient dimensions between individuals affiliated with them; 3) generate Blau bubbles for individuals, thereby allowing the study of interpersonal influence of similar others even with limited or no network information; 4) capture niche dynamics cross-sectionally by calculating the intensity of exploitation from the carrying capacity and the membership rate; and 5) analyze niche movement longitudinally by estimating the predicted niche movement equations. We illustrate these capabilities of Blaunet with example datasets.


Installation
Users are advised to install the most recent version of R. Thirteen dependent packages are installed along with the Blaunet graphic package, so the first installation may take longer. These thirteen packages are: (1-4) 'gWidgets', 'gWidgetsRGtk2', 'RGtk2', and 'cairoDevice', which build the graphic interface; (5-7) 'plot3D', 'plot3Drgl', and 'rgl', which produce 3D plots; (8-11) 'network', 'sna', 'ergm', and 'statnet.common', which handle network operations and analysis; and (12-13) 'haven' and 'foreign', which import and export data files between R and other statistical software, such as SAS, SPSS, and Stata.
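Before installing, it can be useful to see which of the thirteen dependencies are already present, since that determines how long the first installation takes. The following is a minimal sketch using only base R; the package names are those listed above, and the variable names are illustrative.

```r
# Dependencies named in the installation notes.
deps <- c("gWidgets", "gWidgetsRGtk2", "RGtk2", "cairoDevice",
          "plot3D", "plot3Drgl", "rgl",
          "network", "sna", "ergm", "statnet.common",
          "haven", "foreign")

# Names of all packages visible in the current library paths.
installed <- rownames(installed.packages())

# Dependencies that would still have to be downloaded.
missing_deps <- setdiff(deps, installed)

cat(length(deps) - length(missing_deps), "of", length(deps),
    "dependencies already installed\n")
if (length(missing_deps) > 0) {
  cat("Still to be installed:", paste(missing_deps, collapse = ", "), "\n")
}
```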

Linux Installation
Blaunet installation under Linux requires building the X11 and GTK+ environment.

 Hit the keyboard shortcut Ctrl+Alt+T to open 'Terminal'.
 If the dependent library 'x11' is not installed, install it by typing: sudo apt-get install libglpk-dev
 If the dependent library 'gtk2.0' is not installed, install it by typing: sudo apt-get install libgtk2.0-dev
 If the dependent library 'rgl' is not installed, install it by typing: sudo apt-get build-dep r-cran-rgl
 Type 'R' in the terminal and install Blaunet in R via R> install.packages('Blaunet', repos="http://cran.r-project.org", dependencies=TRUE)

OS X Installation
Blaunet installation under OS X requires building the X11 and GTK+ environment.

As shown in Figure 1, the main interface of Blaunet includes five elements, from top to bottom: the menu bar, the toolbar, the text frame showing the current working directory, the 'Set Working Directory' bar, and the general information of Blaunet.

Set Working Directory
 The current working directory is shown in the text frame right below the toolbar.
 The working directory may be changed by clicking the 'Set Working Directory' bar in the main window on the Windows platform; on the OS X or Linux platform, type the absolute path in the text frame and then click the 'Set Working Directory' bar.
 When the user chooses to save any of the generated results, they are saved into the working directory.
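For readers who also work at the R console, the same idea can be sketched with base R's getwd() and setwd(). The folder name below is hypothetical and is created inside a temporary directory so that the example leaves no trace; this only illustrates how a working directory governs where results are written and is not a description of Blaunet's internals.

```r
# Show the current working directory (what the text frame displays).
old_dir <- getwd()
print(old_dir)

# Hypothetical project folder; replace with an absolute path of your own.
new_dir <- file.path(tempdir(), "blaunet_project")
dir.create(new_dir, showWarnings = FALSE)

# Change the working directory and confirm the change.
setwd(new_dir)
print(getwd())

# Anything written with a relative path now lands in new_dir.
write.csv(data.frame(id = 1:3), "example_result.csv", row.names = FALSE)
saved_ok <- file.exists(file.path(new_dir, "example_result.csv"))
print(saved_ok)

# Restore the original directory.
setwd(old_dir)
```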

Please be careful NOT to load the network data as attribute data, or the attribute data as network data. The order in which the two files are loaded does not matter.
 The first row of the attribute data should be variable names.
 The attribute data should be sorted by a unique identification indicator for each node (to link the attribute data with network data) as well as ecology if there are multiple subsamples.
 The attribute data can be opened by clicking 'Data' → 'Open Attribute File' from the menu bar, or simply by clicking the first icon on the toolbar.
 The network data can be formatted as either an adjacency matrix or an edge list.
 The network data should NOT include any column names in the first row.

 The network data can be opened by clicking 'Data' → 'Open Network File' from the menu bar, or simply clicking the second icon from the toolbar.
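The formatting requirements above can be illustrated with a small base R sketch. The toy data, column names, and object names are all hypothetical: the attribute data carry variable names in the first row and are sorted by node id, the edge list has no header row, and the edge list is converted to the equivalent adjacency matrix.

```r
# Toy attribute data: header row with variable names, sorted by node id.
attr_text <- "id,age,educ
1,25,12
2,40,16
3,33,14"
attrs <- read.csv(text = attr_text, header = TRUE)
attrs <- attrs[order(attrs$id), ]

# Toy edge list: sender and receiver ids, no column names in the first row.
edge_text <- "1,2
2,3
3,1
1,3"
edges <- read.csv(text = edge_text, header = FALSE)

# Build the equivalent adjacency matrix: rows are senders, columns receivers.
ids <- attrs$id
adj <- matrix(0, nrow = length(ids), ncol = length(ids),
              dimnames = list(as.character(ids), as.character(ids)))
adj[cbind(match(edges$V1, ids), match(edges$V2, ids))] <- 1
print(adj)
```

Matching the edge list against the sorted ids in the attribute data is what links the two files, which is why the unique identification indicator matters.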

Clear the Memory
 Before opening a new dataset, the user should clear the memory so that the data of the next project are not corrupted by the data of the previous project.
 This can be done by clicking 'Data' → 'Clear Memory' from the menu bar or the third icon from the toolbar.

Exit Package
 If the user wishes to exit the program, he or she may do so by clicking 'Data' → 'Quit' from the menu bar or the last icon from the toolbar.

Browse Menu - Browse Data
 Data can be browsed by clicking the 'Browse' button on the menu bar and selecting the corresponding data type.

Network Menu - Generate Network Statistics
 Clicking 'Network' → 'Info' from the menu bar displays the summary information about the network data that has been loaded.
 Clicking 'Network' → 'Density' from the menu bar displays the network density.
 Clicking 'Network' → 'Centrality' from the menu bar displays the out-degree, in-degree, betweenness, closeness, and eigenvector centrality measures for each node. The results can also be saved into the working directory for later use.
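To make the reported quantities concrete, the sketch below computes density, out-degree, and in-degree for a small directed network in base R. The adjacency matrix is invented for illustration, and the code is not taken from Blaunet; betweenness, closeness, and eigenvector centrality are omitted because they require more than a few lines.

```r
# Hypothetical directed network: adj[i, j] = 1 means i sends a tie to j.
adj <- matrix(c(0, 1, 1, 0,
                0, 0, 1, 0,
                1, 0, 0, 0,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("n", 1:4), paste0("n", 1:4)))

n <- nrow(adj)

# Density of a directed network without self-ties:
# observed ties divided by the n * (n - 1) possible ties.
density <- sum(adj) / (n * (n - 1))

# Degree centrality: ties sent (rows) and ties received (columns).
out_degree <- rowSums(adj)
in_degree <- colSums(adj)

centrality <- data.frame(node = rownames(adj), out_degree, in_degree)
print(density)
print(centrality)

# Save the table for later use (here into a temporary directory).
write.csv(centrality, file.path(tempdir(), "centrality.csv"),
          row.names = FALSE)
```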

Plot the Network Graph
 After clicking 'Graph' → 'Network Graph' from the menu bar, a dialog box (as shown in Figure 3) is displayed where the user can choose whether vertex names/labels appear in the graph; how to define the vertex color, number of sides (starting from a minimum of 3 sides, i.e., a triangle), and size; and which type of layout is applied.
 Click 'Plot Graph' to display the network graph. The network graph can be saved into the working directory with a chosen height, width, resolution (in DPI), and file type.

Histogram of Out-degree and In-degree Distribution
 Clicking 'Graph' → 'Histogram Out-degree' from the menu bar plots the out-degree distribution histogram. The histogram can be saved into the working directory with a chosen height, width, resolution (in DPI), and file type.
 Clicking 'Graph' → 'Histogram In-degree' from the menu bar plots the in-degree distribution histogram. The histogram can be saved into the working directory with a chosen height, width, resolution (in DPI), and file type.
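A rough base R equivalent of the out-degree histogram and its export options is sketched below. The adjacency matrix, file name, and size settings are hypothetical; the png() call is guarded because not every R build supports that device.

```r
# Hypothetical directed network: adj[i, j] = 1 means i sends a tie to j.
adj <- matrix(c(0, 1, 1, 0,
                0, 0, 1, 0,
                1, 0, 0, 0,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE)
out_degree <- rowSums(adj)

# One bin per integer degree value; plot = FALSE only computes the counts.
h <- hist(out_degree,
          breaks = seq(-0.5, max(out_degree) + 0.5, by = 1),
          plot = FALSE)
print(h$counts)

# Save with a chosen height, width, and resolution (DPI).
out_file <- file.path(tempdir(), "hist_outdegree.png")
if (capabilities("png")) {
  png(out_file, width = 6, height = 4, units = "in", res = 300)
  plot(h, main = "Out-degree distribution", xlab = "Out-degree")
  dev.off()
}
```

Replacing rowSums() with colSums() gives the in-degree version.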

Salient Dimensions
 This feature provides a list of potential salient dimensions that may be used for subsequent analysis. The potential salient dimensions are derived from network and membership variables. Therefore, the attribute data loaded should contain membership variables, network variables, or both.
 Clicking 'Analysis' → 'Salient Dimensions' from the menu bar opens a dialog box (as shown in Figure 4) that asks the user to specify the unique identification indicator for each node (as Node.ids), the ecology indicator if there are multiple sub-samples (as Ecology.ids), the set of dimensions, the group affiliation variables, and the alpha value. Please note that the computation may take longer when the numbers of cases, dimensions, and groups are large.

 When the 'Continue' button is clicked again, a rectangle (for 2 dimensions) or a cuboid (for 3 dimensions) is displayed for each group (and for each ecology if multiple sub-samples are specified) to indicate its niche along the selected dimensions. If the nodes or networks option is selected, all nodes or network ties also appear overlaid on the niches. Selecting 3 dimensions allows the user to view two 3D plots side by side: one generated with the 'plot3D' package, which can be rotated by selecting values for the horizontal or vertical angles and saved into the working directory with a chosen height, width, resolution (in DPI), and file type; and the other generated with the 'plot3Drgl' package, which can be freely rotated to any viewing point by dragging the mouse.
 If the 'Network included' box is checked in the Niche Analysis dialog box, two extra columns appear in 'Nodal result': 'Spanner' indicates whether each node spans to other niche(s) through his or her network tie(s), and 'NumSpannedTo' indicates how many niche(s) he or she spans to. An additional 'Dyadic result' button appears in the aforementioned small dialog box (as shown in Figure 9); it computes six dyadic measures for each present edge: co-nicher, co-outsider, straddler, spanner, Euclidean distance, and Mahalanobis distance. Finally, out-degree, in-degree, betweenness, closeness, and eigenvector centrality measures are added to the 'Correlation Matrix' output.
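The rectangle drawn for a group can be approximated in a few lines of base R. The sketch assumes that a niche is bounded on each dimension by the members' mean plus or minus a multiple of their standard deviation (the dialog described under Niche Dynamics mentions a default of 1.5); the data, the column names, and this exact definition are illustrative assumptions rather than Blaunet's own code.

```r
# Hypothetical attribute data with one group affiliation variable.
people <- data.frame(
  id    = 1:6,
  age   = c(20, 30, 40, 50, 60, 35),
  educ  = c(10, 12, 14, 16, 18, 13),
  grp_a = c(1, 1, 1, 0, 0, 0)
)
dims <- c("age", "educ")
width <- 1.5  # assumed number of standard deviations around the mean

# Niche boundaries are computed from the group's members only.
members <- people[people$grp_a == 1, dims]
centre <- colMeans(members)
spread <- apply(members, 2, sd)
lower <- centre - width * spread
upper <- centre + width * spread
niche <- rbind(lower, upper)
print(niche)

# A person is in the niche when inside the bounds on every dimension,
# whether or not he or she is a member of the group.
in_niche <- apply(people[, dims], 1,
                  function(x) all(x >= lower & x <= upper))
print(data.frame(id = people$id, member = people$grp_a, in_niche))
```

In this toy example the sixth person is not a member of the group but falls inside its niche, which is the kind of nodal status the niche analysis is meant to reveal.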

Niche Dynamics
 Clicking 'Analysis' → 'Niche Dynamics' from the menu bar displays a dialog box (as shown in Figure 10), which requests the unique identification indicator for each node (as node.ids), the ecology indicator if there are multiple sub-samples (as ecology.ids), whether or not the network is included, the dimensions, the groups, the sample weights if there are any, and whether or not only complete cases are included.
 After the 'Continue' button is clicked, a new dialog box (similar to the one shown in Figure 8) asks the user to specify the standard deviation around the mean (1.5 by default) for each dimension.
 If there are multiple sub-samples/ecologies being specified (as ecology.ids) and the user wishes to see the predicted niche movement equations, he or she can click the 'Continue' button, select 'all' in the 'Niche Dynamics Option' dialog box (as shown in Figure 11), recategorize the 2 selected dimensions in the 'Dimension Category Selection' dialog box (as shown in Figure 12), and find the results in the 'Predicted Niche Movements' dialog box (as shown in Figure 13). The results can be saved into the working directory for later use.
Figure 11: Interface for selecting ecology.

 Table' outputs all the information about carrying capacity, membership rate, and intensity of exploitation. All the plots and results can be saved into the working directory for later use.
 If multiple sub-samples/ecologies are specified and the user wishes to make the plot for just one ecology, he or she can select the ecology in the aforementioned 'Niche Dynamics Option' dialog box (as shown in Figure 11), re-categorize the 2 selected dimensions in the aforementioned 'Dimension Category Selection' dialog box (as shown in Figure 12), and choose which plot he or she would like displayed or just save the results in the aforementioned 'Plot…' dialog box (as shown in Figure 14). All the plots and results can be saved into the working directory for later use.
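The re-categorization step can be pictured as cutting each of the two selected dimensions into intervals and tabulating a rate for every resulting cell. The sketch below does this in base R, under the assumption that the membership rate of a cell is the share of people in that cell who belong to the group; the data, cut points, and labels are hypothetical, and the carrying capacity and intensity of exploitation are not reproduced here.

```r
# Hypothetical attribute data with a single membership indicator.
people <- data.frame(
  age    = c(22, 27, 34, 38, 45, 52, 58, 63),
  educ   = c(10, 12, 12, 16, 14, 16, 18, 12),
  member = c(0, 1, 1, 1, 0, 1, 0, 0)
)

# Re-categorize the two dimensions into intervals (right-closed by default).
age_cat <- cut(people$age, breaks = c(20, 40, 65),
               labels = c("20-40", "41-65"))
educ_cat <- cut(people$educ, breaks = c(8, 12, 18),
                labels = c("9-12", "13-18"))

# Assumed membership rate: mean of the 0/1 indicator within each cell.
rate <- tapply(people$member, list(age = age_cat, educ = educ_cat), mean)
print(round(rate, 2))
```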

Blau Bubbles (Blau Proximity Analysis)
 Clicking 'Analysis' → 'Blau Bubbles' from the menu bar displays a new dialog box (as shown in Figure 15) that requests the unique identification indicator for each node (as Node.ids), the ecology indicator if there are multiple sub-samples (as Ecology.ids), and the dimensions used to generate Blau bubbles.
 Clicking 'Continue' displays another dialog box (as shown in Figure 16) that asks the user to identify the categorical variables among the selected dimensions and to define the radius
 If network data are loaded in the program, 'Blau Bubble List' includes two additional measures indicating whether or not there is a tie between each pair of nodes and their geodesic distance in the network; and 'Nodal Bubble List' has four additional columns: the degree, the alter list, the number of alters who are also in the node's Blau bubble, and the identities of those alters.
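The idea of a Blau bubble can be sketched in base R as follows. The sketch assumes that two individuals share a bubble when their Euclidean distance on the continuous dimensions is within the radius and they fall in the same category on every categorical dimension; the data, the radius, and this matching rule are illustrative assumptions, and no rescaling of the dimensions is attempted.

```r
# Hypothetical attribute data: two continuous and one categorical dimension.
people <- data.frame(
  id   = 1:5,
  age  = c(20, 22, 40, 41, 60),
  educ = c(12, 13, 16, 12, 12),
  sex  = c("f", "f", "m", "m", "f")
)
cont_dims <- c("age", "educ")
radius <- 5  # hypothetical radius, in the raw units of the dimensions

# Pairwise Euclidean distance on the continuous dimensions.
d <- as.matrix(dist(people[, cont_dims]))

# Pairs that share the same category on the categorical dimension.
same_cat <- outer(people$sex, people$sex, "==")

# Two people are in each other's bubble if close enough and same category.
bubble <- d <= radius & same_cat
diag(bubble) <- FALSE

# Dyadic list (each pair once) and nodal count of similar others.
pairs <- which(bubble & upper.tri(bubble), arr.ind = TRUE)
bubble_list <- data.frame(ego = people$id[pairs[, "row"]],
                          alter = people$id[pairs[, "col"]])
bubble_size <- rowSums(bubble)
print(bubble_list)
print(bubble_size)
```

Because the bubbles are built from attributes alone, the same code runs with no network data at all, which is the situation the Blau bubble feature is designed for.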

About
 Clicking 'Help' → 'About' from the menu bar displays the Blaunet version information.

Graphic User Interface Package Manual
 Clicking 'Help' → 'Graphic Package Manual' from the menu bar displays the Manual for the Blaunet Graphic User Interface Package.

Command Line Manual
 Clicking 'Help' → 'Command Line Manual' from the menu bar displays the manual for the command line codes (mainly on niche analysis).

BSANet.rda
 BSANet.rda is a small dataset containing 10 individuals in two non-overlapping locations (New York and San Francisco), created solely to illustrate the functions of the Blaunet package. It contains demographic information such as the individuals' age, education, and income; group affiliation information of memberships in a liberal or conservative organization (or both); and network information among the 10 individuals available in both adjacency matrix and edge list formats.