SNAP/soc-Pokec

SNAP network: soc-Pokec Slovakian social network
Name soc-Pokec
Group SNAP
Matrix ID 2789
Num Rows 1,632,803
Num Cols 1,632,803
Nonzeros 30,622,564
Pattern Entries 30,622,564
Kind Directed Graph
Symmetric No
Date 2012
Author L. Takac, M. Zabovsky
Editor J. Leskovec
Structural Rank
Structural Rank Full
Num Dmperm Blocks
Strongly Connect Components 325,892
Num Explicit Zeros 0
Pattern Symmetry 54.3%
Numeric Symmetry 54.3%
Cholesky Candidate no
Positive Definite no
Type binary
Download MATLAB Rutherford Boeing Matrix Market
Notes
SNAP (Stanford Network Analysis Platform) Large Network Dataset Collection,
Jure Leskovec and Anrej Krevl, http://snap.stanford.edu/data, June 2014.   
email: jure at cs.stanford.edu                                             
                                                                           
soc-Pokec social network                                                   
                                                                           
https://snap.stanford.edu/data/soc-Pokec.html                              
                                                                           
Dataset information                                                        
                                                                           
Pokec (http://pokec.azet.sk/) is the most popular on-line social network in
Slovakia. The popularity of network has not changed even after the coming  
of Facebook. Pokec has been provided for more than 10 years and connects   
more than 1.6 million people.  Datasets contains anonymized data of the    
whole network.  Profile data contains gender, age, hobbies, interest,      
education etc. Profile data are in Slovak language. Friendships in Pokec   
are oriented.                                                              
                                                                           
Dataset statistics                                                         
Nodes    1,632,803                                                         
Edges   30,622,564                                                         
Nodes in largest WCC  1632803 (1.000)                                      
Edges in largest WCC  30622564 (1.000)                                     
Nodes in largest SCC  1304537 (0.799)                                      
Edges in largest SCC  29183655 (0.953)                                     
Average clustering coefficient  0.1094                                     
Number of triangles  32557458                                              
Fraction of closed triangles  0.01611                                      
Diameter (longest shortest path)  11                                       
90-percentile effective diameter  5.2                                      
                                                                           
Source (citation)                                                          
L. Takac, M. Zabovsky. Data Analysis in Public Social Networks,            
International Scientific Conference & International Workshop Present Day   
Trends of Innovations, May 2012 Lomza, Poland.                             
https://snap.stanford.edu/data/soc-pokec.pdf                               
                                                                           
Files                                                                      
File  Description                                                          
soc-pokec-relationships.txt.gz  User relationship data                     
soc-pokec-profiles.txt.gz  User profile data                               
soc-pokec-readme.txt  Description of files                                 
                                                                           
soc-pokec-readme.txt:                                                      
                                                                           
========================================================================   
Pokec social network                                                       
========================================================================   
                                                                           
Pokec is the most popular Slovak on-line social network. These datasets    
are anonymized and contains relationships and user profile data of the     
whole network. Profile data are in Slovak language. Friendships in the     
Pokec network are oriented. Datasets were crawled during MAY 25-27 2012.   
                                                                           
Author: Lubos Takac, lubos.takac@gmail.com                                 
                                                                           
DATASET STATISTICS:                                                        
                                                                           
Nodes ............................  1,632,803                              
Edges ............................ 30,622,564                              
Nodes in largest WCC .............  1,632,803 (1.000)                      
Edges in largest WCC ............. 30,622,564 (1.000)                      
Nodes in largest SCC .............  1,304,537 (0.799)                      
Edges in largest SCC ............. 29,183,655 (0.953)                      
Average clustering coefficient ...   0.1094                                
Number of triangles .............. 32557458                                
Fraction of closed triangles .....  0.01611                                
Diameter (longest shortest path) .       11                                
90-percentile effective diameter .      5.3                                
                                                                           
FILES:                                                                     
                                                                           
soc-pokec-relationships.txt                                                
  Contains friendship relations between users. There is one relation per   
  line. Values in the line are tab separated. For example if a row         
  contains 3 5, this means that user 3 has a friend: user 5.               
                                                                           
soc-pokec-profiles.txt                                                     
  Contains profile data in a tab separated form. Columns with no data      
  contain the string "null". Data are in Slovak Language.                  
                                                                           
  Columns (attributes):                                                    
    user_id                                                                
    public                                                                 
    completion_percentage                                                  
    gender                                                                 
    region                                                                 
    last_login                                                             
    registration                                                           
    AGE                                                                    
    body                                                                   
    I_am_working_in_field                                                  
    spoken_languages                                                       
    hobbies                                                                
    I_most_enjoy_good_food                                                 
    pets                                                                   
    body_type                                                              
    my_eyesight                                                            
    eye_color                                                              
    hair_color                                                             
    hair_type                                                              
    completed_level_of_education                                           
    favourite_color                                                        
    relation_to_smoking                                                    
    relation_to_alcohol                                                    
    sign_in_zodiac                                                         
    on_pokec_i_am_looking_for                                              
    love_is_for_me                                                         
    relation_to_casual_s_x                                                 
    my_partner_should_be                                                   
    marital_status                                                         
    children                                                               
    relation_to_children                                                   
    I_like_movies                                                          
    I_like_watching_movie                                                  
    I_like_music                                                           
    I_mostly_like_listening_to_music                                       
    the_idea_of_good_evening                                               
    I_like_specialties_from_kitchen                                        
    fun                                                                    
    I_am_going_to_concerts                                                 
    my_active_sports                                                       
    my_passive_sports                                                      
    profession                                                             
    I_like_books                                                           
    life_style                                                             
    music                                                                  
    cars                                                                   
    politics                                                               
    relationships                                                          
    art_culture                                                            
    hobbies_interests                                                      
    science_technologies                                                   
    computers_internet                                                     
    education                                                              
    sport                                                                  
    movies                                                                 
    travelling                                                             
    health                                                                 
    companies_brands                                                       
    more                                                                   
                                                                           
  Column descriptions:                                                     
    user_id:                                                               
      integer, users' nicknames were mapped to integers                    
    public:                                                                
      bool, 1 - all friendships are public                                 
    completion_percentage:                                                 
      integer, percentage proportion of filled values                      
    gender:                                                                
      bool, 1 - man                                                        
    region:                                                                
      string, mostly regions in Slovakia (example: "zilinsky kraj,         
      kysucke nove mesto" means county Zilina, town Kysucke Nove Mesto,    
      Slovakia), some foreign countries (example: "zahranicie,             
      zahranicie - nemecko" means foreign country Germany (nemecko)),      
      some Czech regions (example: "ceska republika, cz - ostravsky        
      kraj" means Czech Republic, county Ostrava (ostravsky kraj))         
    last_login:                                                            
      datetime, last time at which the user has logged in                  
    registration:                                                          
      datetime, time at which the user registered at the site              
    age:                                                                   
      integer, 0 - age attribute not set                                   
                                                                           
  Notes:                                                                   
    Other attributes are free fillable by user. They contain opinions      
    in text form or URL of club which the user prefer. There are 163       
    records without any data and completion_percentage=0. (I think that    
    these are the profiles of people who have canceled their accounts      
    during the data crawling.)                                             
                                                                           
---------------------------------------------------------------------------
Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:      
---------------------------------------------------------------------------
                                                                           
The SNAP data set is 1-based with nodes numbered 1 to 1,632,803.  In the   
SuiteSparse Matrix Collection, Problem.A is the directed graph, where      
A(i,j)=1 means person i states that person j is their friend.  The value   
A(i,j) can differ from A(j,i).  That is, the friend list of person i is    
A(i,:), and it can include person j.  But the friend list of person j,     
which is A(j,:) need not include person i.                                 
                                                                           
The profiles are stored in Problem.aux, with one array of size n-by-* for  
each component of the profile.  Most fields are strings (even age, which   
can be 'null').  Only user_id, public, and completion_percentage are       
numeric.