SNAP/com-Friendster

SNAP network: Friendster social network and ground-truth communities
Name com-Friendster
Group SNAP
Matrix ID 2780
Num Rows 65,608,366
Num Cols 65,608,366
Nonzeros 3,612,134,270
Pattern Entries 3,612,134,270
Kind Undirected Graph With Communities
Symmetric Yes
Date 2012
Author J. Yang, J. Leskovec
Editor J. Leskovec
Structural Rank
Structural Rank Full
Num Dmperm Blocks
Strongly Connect Components 1
Num Explicit Zeros 0
Pattern Symmetry 100%
Numeric Symmetry 100%
Cholesky Candidate no
Positive Definite no
Type binary
Download MATLAB Rutherford Boeing Matrix Market
Notes
SNAP (Stanford Network Analysis Platform) Large Network Dataset Collection,
Jure Leskovec and Anrej Krevl, http://snap.stanford.edu/data, June 2014.   
email: jure at cs.stanford.edu                                             
                                                                           
Friendster social network and ground-truth communities                     
                                                                           
https://snap.stanford.edu/data/com-Friendster.html                         
                                                                           
Dataset information                                                        
                                                                           
Friendster (http://www.friendster.com/) is an on-line gaming network.      
Before re-launching as a game website, Friendster was a social networking  
site where users can form a friendship edge with each other. Friendster    
social network also allows users form a group which other members can then 
join. We consider such user-defined groups as ground-truth communities. For
the social network, we take the induced subgraph of the nodes that either  
belong to at least one community or are connected to other nodes that      
belong to at least one community. This data is provided by The Web Archive 
Project (http://archive.org/details/friendster-dataset-201107), where the  
full graph is available.                                                   
                                                                           
We regard each connected component in a group as a separate ground-truth   
community. We remove the ground-truth communities which have less than 3   
nodes.  We also provide the top 5,000 communities with highest quality     
which are described in our paper (http://arxiv.org/abs/1205.6233). As for  
the network, we provide the largest connected component.                   
                                                                           
Dataset statistics                                                         
Nodes   65,608,366                                                         
Edges   1,806,067,135                                                      
Nodes in largest WCC    65608366 (1.000)                                   
Edges in largest WCC    1806067135 (1.000)                                 
Nodes in largest SCC    65608366 (1.000)                                   
Edges in largest SCC    1806067135 (1.000)                                 
Average clustering coefficient  0.1623                                     
Number of triangles 4173724142                                             
Fraction of closed triangles    0.005859                                   
Diameter (longest shortest path)    32                                     
90-percentile effective diameter    5.8                                    
                                                                           
Source (citation)                                                          
J. Yang and J. Leskovec. Defining and Evaluating Network Communities based 
on Ground-truth. ICDM, 2012.  http://arxiv.org/abs/1205.6233               
                                                                           
Files                                                                      
File    Description                                                        
com-friendster.ungraph.txt.gz   Undirected Friendster network              
com-friendster.all.cmty.txt.gz  Friendster communities                     
com-friendster.top5000.cmty.txt.gz  Friendster communities (Top 5,000)     
                                                                           
---------------------------------------------------------------------------
Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:      
---------------------------------------------------------------------------
                                                                           
The graph in the SNAP data set is 1-based, with nodes numbered 1 to        
124,836,179.                                                               
                                                                           
In the SuiteSparse Matrix Collection, Problem.A is the undirected          
Friendster network, a matrix of size n-by-n with n=65,608,366, which is    
the number of unique user id's appearing in any edge.                      
                                                                           
Problem.aux.nodeid is a list of the node id's that appear in the SNAP data 
set.  A(i,j)=1 if person nodeid(i) is friends with person nodeid(j).  The  
node id's are the same as the SNAP data set (1-based).                     
                                                                           
C = Problem.aux.Communities_all is a sparse matrix of size n by 1,620,991, 
which represents the same number communities in the                        
com-friendster.all.cmty.txt file.  The kth line in that file defines the   
kth community, and is the column C(:,k), where where C(i,k)=1 if person    
nodeid(i) is in the kth community.  Row C(i,:) and row/column i of the A   
matrix thus refer to the same person, nodeid(i).                           
                                                                           
Ctop = Problem.aux.Communities_top5000 is n-by-5000, with the same         
structure as the C array above, with the content of the top 5000           
communities in the com-friendster.top5000.cmty.txt file.