## SNAP/email-Eu-core

SNAP network: email-Eu-core network

Name | email-Eu-core |
---|---|
Group | SNAP |
Matrix ID | 2784 |
Num Rows | 1,005 |
Num Cols | 1,005 |
Nonzeros | 25,571 |
Pattern Entries | 25,571 |
Kind | Directed Graph With Communities |
Symmetric | No |
Date | 2007 |
Author | J. Leskovec, J. Kleinberg and C. Faloutsos |
Editor | J. Leskovec |

Download | MATLAB, Rutherford Boeing, Matrix Market |
---|---|

Notes |
SNAP (Stanford Network Analysis Platform) Large Network Dataset Collection, Jure Leskovec and Andrej Krevl, http://snap.stanford.edu/data, June 2014. email: jure at cs.stanford.edu

email-Eu-core network: https://snap.stanford.edu/data/email-Eu-core.html

Dataset information

The network was generated using email data from a large European research institution. We have anonymized information about all incoming and outgoing email between members of the research institution. There is an edge (u, v) in the network if person u sent person v at least one email. The e-mails only represent communication between institution members (the core), and the dataset does not contain incoming messages from or outgoing messages to the rest of the world.

The dataset also contains "ground-truth" community memberships of the nodes. Each individual belongs to exactly one of 42 departments at the research institute.

This network represents the "core" of the email-EuAll (https://snap.stanford.edu/data/email-EuAll.html) network, which also contains links between members of the institution and people outside of the institution (although the node IDs are not the same).

Dataset statistics

- Nodes: 1,005
- Edges: 25,571
- Nodes in largest WCC: 986 (0.981)
- Edges in largest WCC: 25,552 (0.999)
- Nodes in largest SCC: 803 (0.799)
- Edges in largest SCC: 24,729 (0.967)
- Average clustering coefficient: 0.3994
- Number of triangles: 105,461
- Fraction of closed triangles: 0.1085
- Diameter (longest shortest path): 7
- 90-percentile effective diameter: 2.9

Source (citation)

Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich. "Local Higher-order Graph Clustering." In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017.

J. Leskovec, J. Kleinberg and C. Faloutsos. Graph Evolution: Densification and Shrinking Diameters. ACM Transactions on Knowledge Discovery from Data (ACM TKDD), 1(1), 2007. http://www.cs.cmu.edu/~jure/pubs/powergrowth-tkdd.pdf

File descriptions

- email-Eu-core.txt.gz: email communication links between members of the institution
- email-Eu-core-department-labels.txt.gz: department membership labels

Data format for community membership

NODEID DEPARTMENT

- NODEID: id of the node (a member of the institute)
- DEPARTMENT: id of the member's department (number in 0, 1, ..., 41)

Notes on inclusion into the SuiteSparse Matrix Collection, July 2018

The SNAP graph is 0-based with nodes numbered 0 to 1004. In the SuiteSparse Matrix Collection, Problem.A is the directed graph, where A(i,j)=1 if person 1+i sent person 1+j at least one email. (1+, since the SNAP graph is 0-based).

Each person is in exactly one community, so this could be represented as a vector of size n, as node meta data. However, to be consistent with the other SNAP/com-* problems in the SuiteSparse Matrix Collection, the community structure is represented as a sparse matrix, created from the file email-Eu-core-department-labels.txt.

C = Problem.aux.Communities_all is a sparse matrix of size n by 42. C(i,k)=1 if person 1+i is in department 1+k (again, 1+ to convert the data to 1-based). Thus, column C(:,k) represents the (1+k)th community, where each community is a member's department. |
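
The 0-based to 1-based conversion described in the notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to build the collection: it assumes each line of the edge file is a whitespace-separated pair `u v`, that each line of the label file is `NODEID DEPARTMENT`, and that the "1+" offset means SNAP node `u` maps to row/column `u + 1`. The function names and the toy input are hypothetical, and the toy input is invented rather than taken from the dataset.

```python
from collections import defaultdict


def parse_pairs(text):
    """Parse whitespace-separated integer pairs, one per line.

    Blank lines and lines starting with '#' are skipped (an assumption;
    the real files may not contain either).
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        a, b = line.split()
        pairs.append((int(a), int(b)))
    return pairs


def adjacency_entries(edge_text):
    """Return the set of 1-based (row, col) positions where A is 1.

    A 0-based edge "u v" (u emailed v) becomes the entry (u + 1, v + 1).
    Repeated lines collapse because A is a 0/1 pattern.
    """
    return {(u + 1, v + 1) for u, v in parse_pairs(edge_text)}


def community_entries(label_text):
    """Return the set of 1-based (row, col) positions where C is 1.

    A 0-based label "node dept" becomes the entry (node + 1, dept + 1).
    """
    return {(node + 1, dept + 1) for node, dept in parse_pairs(label_text)}


def members_by_community(c_entries):
    """Group 1-based node ids by 1-based community, i.e. by column of C."""
    groups = defaultdict(list)
    for node, dept in sorted(c_entries):
        groups[dept].append(node)
    return dict(groups)


# Toy input, invented for illustration (NOT taken from the real dataset).
toy_edges = """\
0 1
1 0
1 2
2 2
"""
toy_labels = """\
0 0
1 0
2 1
"""

A = adjacency_entries(toy_edges)
C = community_entries(toy_labels)
groups = members_by_community(C)

print(sorted(A))  # [(1, 2), (2, 1), (2, 3), (3, 3)]
print(groups)     # {1: [1, 2], 2: [3]}
```

Storing the memberships as coordinate pairs mirrors the sparse n-by-42 matrix described above: every node contributes exactly one entry, so each row of C has a single nonzero and column k lists the members of the k-th community.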