Network Science with Python and NetworkX Quick Start Guide

Adding attributes to nodes and edges

In the last chapter, I said that networks were entirely defined by the number of nodes and which nodes were connected. I lied. Kind of. Now that we're all a little older and wiser than we were in Chapter 1, What is a Network?, I can tell you the whole truth: sometimes, network nodes and edges are annotated with additional information. In the Graph class, each node and edge can have a set of attributes to store this additional information. Attributes can simply be a convenient place to store information related to the nodes and edges, or they can be used by visualizations and network algorithms.

The Graph class allows you to add any number of attributes to a node. For a network G, each node's attributes are stored in the dict at G.nodes[v], where v is the node's ID. In the karate club example, the club members eventually split into two separate clubs. We can add an attribute to each node to describe which splinter club the corresponding member joined after the original club disbanded. The club joined by member i is given by the ith element of the following list:

member_club = [ 
0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
1, 0, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1]

This information can be added by iterating over all the node IDs and setting the node attribute based on the value in member_club, as follows:

for node_id in G.nodes:
    G.nodes[node_id]["club"] = member_club[node_id]
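
The same assignment can also be made in a single call. The following is a minimal sketch, assuming G is the karate club graph that ships with NetworkX (loaded here with nx.karate_club_graph(), whose node IDs are the integers 0 to 33). It uses nx.set_node_attributes(), which accepts a dict mapping node IDs to values; the club_by_node name is only illustrative:

```python
import networkx as nx

# Assumption: G is the built-in karate club graph, with node IDs 0..33
G = nx.karate_club_graph()

member_club = [
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
    0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
    1, 0, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1]

# Build a {node_id: club} dict and attach it as the "club" attribute.
# This overwrites any "club" value already stored on the nodes.
club_by_node = {node_id: member_club[node_id] for node_id in G.nodes}
nx.set_node_attributes(G, club_by_node, name="club")

# Read the attribute back as a {node_id: club} dict
clubs = nx.get_node_attributes(G, "club")
```

Whether the loop or the single call reads better is mostly a matter of taste; both leave the same values in G.nodes[v]["club"].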

Attributes can also be added automatically when a new node is added by passing keyword arguments to add_node() after the node ID as follows:

G.add_node(11, club=0)
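
Here is a small sketch of how this behaves, using a toy graph H rather than the karate club network (the role attribute is made up for illustration). Calling add_node() with the ID of a node that already exists does not create a duplicate; it updates that node's attribute dict. The related add_nodes_from() method accepts (node ID, attribute dict) pairs, which is handy when adding several annotated nodes at once:

```python
import networkx as nx

H = nx.Graph()

# New node with one attribute
H.add_node(11, club=0)

# Node 11 already exists, so this only adds the new attribute to it
H.add_node(11, role="member")

# Several nodes at once, each with its own attribute dict
H.add_nodes_from([(12, {"club": 1}), (13, {"club": 0})])

print(H.nodes[11])
```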

Now that the club attribute has been set for all the nodes, it's possible to check the value of that attribute for individual nodes, shown as follows:

G.nodes[mr_hi] 
{'club': 0}

G.nodes[john_a]
{'club': 1}
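
The lookups above rely on the mr_hi and john_a variables from earlier in the book. The sketch below assumes they hold the node IDs 0 and 33, as in NetworkX's built-in karate club graph. It also uses G.nodes(data="club"), which yields (node ID, attribute value) pairs, to count how many members joined each splinter club:

```python
from collections import Counter

import networkx as nx

G = nx.karate_club_graph()

member_club = [
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
    0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
    1, 0, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1]
for node_id in G.nodes:
    G.nodes[node_id]["club"] = member_club[node_id]

# Assumption: these match the node IDs used for the two leaders
mr_hi, john_a = 0, 33

# Indexing raises KeyError if the attribute is missing; .get() returns None
hi_club = G.nodes[mr_hi]["club"]
john_club = G.nodes[john_a].get("club")

# Tally club membership across all nodes
club_sizes = Counter(club for _, club in G.nodes(data="club"))
```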

It looks like Mr. Hi and John A really don't get along very well and ended up joining different clubs. We can visualize these different clubs by using different colors. The list of node colors can be created by iterating through the nodes and assigning a color based on their club attribute. That list can then be passed to the draw_networkx() function as follows:

node_colors = [
    '#1f78b4' if G.nodes[v]["club"] == 0
    else '#33a02c' for v in G]
nx.draw_networkx(G, karate_pos, with_labels=True, node_color=node_colors)

In the preceding code, a color is stored in the node_colors list for each node, and passed to draw_networkx(), which will produce the following:

The Zachary network after splitting into two clubs
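
With more than two groups, the conditional expression becomes awkward, and a lookup table is easier to extend. The following sketch uses a three-node toy graph H in place of the karate club network, and club_colors is a hypothetical name for the table. The list is built by iterating over the graph itself because the drawing functions match colors to nodes by position, in the graph's own node order:

```python
import networkx as nx

# Toy stand-in for the karate club graph
H = nx.Graph()
H.add_nodes_from([(0, {"club": 0}), (1, {"club": 0}), (2, {"club": 1})])

# Hypothetical lookup table: club ID -> hex color
club_colors = {0: "#1f78b4", 1: "#33a02c"}

# One color per node, in the same order as iterating over H
node_colors = [club_colors[H.nodes[v]["club"]] for v in H]
```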

Adding attributes to edges works much like it does for nodes. In a network G, an edge's attributes are stored in the dict at G.edges[v, w], where v and w are the node IDs of the edge endpoints. Note that since the Graph class represents an undirected network, these attributes can also be accessed at G.edges[w, v]. You might think that you'd need to update both separately (if you're prone to anxiety), but NetworkX takes care of that for you.
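
A quick way to convince yourself is to set an attribute through one endpoint order and read it back through the other. This sketch uses a toy graph and a made-up weight attribute, and contrasts the behavior with DiGraph, where the reverse edge is a different edge that may not exist at all:

```python
import networkx as nx

H = nx.Graph()
H.add_edge("a", "b")

# Set the attribute using one endpoint order...
H.edges["a", "b"]["weight"] = 3

# ...and read it back using the other
reverse_weight = H.edges["b", "a"]["weight"]

# In a directed graph, (a, b) and (b, a) are distinct edges
D = nx.DiGraph()
D.add_edge("a", "b", weight=3)
has_reverse = D.has_edge("b", "a")
```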

Some of the edges in the karate club network connect members who joined the same splinter club, while other edges connect members from different splinter clubs. This information can be stored in the Graph class using edge attributes. To do so, iterate through all the edges, and check whether the edge endpoints have the same club attribute. In this example, I create an attribute called internal to represent whether an edge is internal to a single splinter club. This can be done using the following code:

# Iterate through all edges
for v, w in G.edges:
    # Compare `club` property of edge endpoints
    # Set edge `internal` property to True if they match
    if G.nodes[v]["club"] == G.nodes[w]["club"]:
        G.edges[v, w]["internal"] = True
    else:
        G.edges[v, w]["internal"] = False
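
Since the comparison already evaluates to True or False, the same result can be written more compactly. The following is a sketch on a four-node toy graph (two members per club), not the karate club data. It builds a dict keyed by edge and hands it to nx.set_edge_attributes(); the is_internal name is only illustrative:

```python
import networkx as nx

# Toy network: nodes 0 and 1 in club 0, nodes 2 and 3 in club 1
H = nx.Graph()
H.add_nodes_from([0, 1], club=0)
H.add_nodes_from([2, 3], club=1)
H.add_edges_from([(0, 1), (1, 2), (2, 3)])

# Map each edge to True if both endpoints share a club
is_internal = {
    (v, w): H.nodes[v]["club"] == H.nodes[w]["club"]
    for v, w in H.edges
}

# Store the values as the "internal" edge attribute
nx.set_edge_attributes(H, is_internal, name="internal")
```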

The two types of edges could also be visualized with color, but we'll need color in the next section, so let's use solid lines for internal edges and dashed lines for external ones instead. The internal and external edges can be found by iterating through the edges and checking the internal attribute. Note that rather than using the individual node IDs v and w, this example references each edge using a single variable, e, which contains a 2-tuple of node IDs, as follows:

internal = [e for e in G.edges if G.edges[e]["internal"]]
external = [e for e in G.edges if not G.edges[e]["internal"]]
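
An equivalent way to split the edges is to ask NetworkX for the attribute value alongside each edge, using G.edges(data="internal"), which yields (v, w, value) triples. The sketch below does this on a toy graph. Note that the external list must be filtered with Python's logical not: the bitwise ~ operator maps True to -2 and False to -1, both of which count as true, so it would select every edge.

```python
import networkx as nx

# Toy network with the "internal" attribute already set on each edge
H = nx.Graph()
H.add_edge(0, 1, internal=True)
H.add_edge(1, 2, internal=False)
H.add_edge(2, 3, internal=True)

# Each item is a (v, w, internal) triple
internal = [(v, w) for v, w, flag in H.edges(data="internal") if flag]
external = [(v, w) for v, w, flag in H.edges(data="internal") if not flag]
```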

NetworkX can only draw one line style at a time, so using multiple line styles requires the nodes, edges, and labels to be drawn separately. While doing so takes more code, it gives more control over the final output. First, we draw the nodes and node labels, specifying node colors using the node_color parameter:

# Draw nodes and node labels 
nx.draw_networkx_nodes(G, karate_pos, node_color=node_colors)
nx.draw_networkx_labels(G, karate_pos)

Next, we draw the internal and external edges separately, using the style parameter to draw the external edges as dashed lines:

# Draw internal edges as solid lines 
nx.draw_networkx_edges(G, karate_pos, edgelist=internal)
# Draw external edges as dashed lines
nx.draw_networkx_edges(G, karate_pos, edgelist=external, style="dashed")

The preceding code will produce the following:

Internal and external edges in the Zachary network