Lijun Peng



PhD Candidate

Dept. of Mathematics & Statistics

 

 


Network Community Detection


Community detection in networks has drawn much attention in diverse fields, especially in the social sciences. In this project, we propose a novel stochastic blockmodel, based on a logistic regression setup with node correction terms, to better address this problem. We follow a Bayesian approach that explicitly captures community behavior through the prior specification. We then adopt a data augmentation strategy with latent Pólya-Gamma variables to obtain posterior samples. We conduct inference based on a canonically mapped centroid estimator that formally addresses label non-identifiability. We demonstrate the proposed model and estimator on real-world and simulated benchmark networks, and show that they outperform the classical degree-corrected stochastic blockmodel of Karrer and Newman.
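To make the model class concrete, here is a minimal sketch that simulates an undirected network from a blockmodel with a logistic link and node correction terms. The specific form of the linear predictor (a block-level log-odds plus one additive effect per node) is an assumption made for illustration, as are the names `simulate_logistic_sbm`, `block_logits`, and `node_effects`; the summary above does not state the exact parameterization, and the sketch covers neither the prior nor the Pólya-Gamma sampler.

```python
import numpy as np

def simulate_logistic_sbm(labels, block_logits, node_effects, seed=0):
    """Draw an undirected, loop-free adjacency matrix.

    Assumed (illustrative) model:
        logit P(A_ij = 1) = block_logits[c_i, c_j] + node_effects[i] + node_effects[j]
    where c_i is the community label of node i.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    block_logits = np.asarray(block_logits, dtype=float)
    node_effects = np.asarray(node_effects, dtype=float)
    n = labels.size

    # Linear predictor for every ordered pair (i, j).
    eta = (block_logits[labels[:, None], labels[None, :]]
           + node_effects[:, None] + node_effects[None, :])
    prob = 1.0 / (1.0 + np.exp(-eta))

    # Sample only the strict upper triangle, then mirror it,
    # so the graph is symmetric with an empty diagonal.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return (upper | upper.T).astype(int)

# Two communities of three nodes: dense within, sparse between.
labels = np.array([0, 0, 0, 1, 1, 1])
block_logits = np.array([[2.0, -2.0], [-2.0, 2.0]])
adjacency = simulate_logistic_sbm(labels, block_logits, np.zeros(6), seed=1)
```

Raising a node's entry in `node_effects` increases its expected degree without changing its community, which is the role the node correction terms play in this sketch.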

 

 

Community Detection On Large-scale Social Networks


Community detection in networks is becoming increasingly important in many applications, especially in the social sciences. Today's era of big data poses a new challenge in this field: how to efficiently detect community structure in large-scale social networks. In this project, we modify the previous community detection approach to make it suitable for large-scale networks such as the YouTube, Amazon, LiveJournal, and Friendster data. We obtain the data from the Stanford Large Network Dataset Collection and clean it. We efficiently learn significant portions of the networks and shrink the number of communities by adopting machine learning techniques, and we conduct exact inference based on a maximum a posteriori (MAP) estimator. We demonstrate the proposed model and estimator on large real-world networks as well as simulated benchmark networks, and show that, compared to the MAP estimator from classical degree-corrected stochastic blockmodels, the proposed estimator is more computationally efficient and achieves higher Normalized Mutual Information (NMI).
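For readers unfamiliar with the evaluation metric, the following sketch computes NMI between two community assignments. Several normalizations are in use; this version divides by the arithmetic mean of the two label entropies, which is an assumption, since the summary above does not say which variant is used. The function name is hypothetical.

```python
import numpy as np

def normalized_mutual_information(labels_a, labels_b):
    """NMI = 2 * I(A; B) / (H(A) + H(B)), using natural logarithms."""
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    if a.size != b.size:
        raise ValueError("label vectors must have the same length")

    # Re-index arbitrary labels as 0..k-1 and build the contingency table.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    table = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(table, (a_idx, b_idx), 1)

    joint = table / a.size
    p_a = joint.sum(axis=1)
    p_b = joint.sum(axis=0)

    # Mutual information, skipping empty cells (0 * log 0 = 0).
    nonzero = joint > 0
    mi = np.sum(joint[nonzero] * np.log(joint[nonzero] / np.outer(p_a, p_b)[nonzero]))

    h_a = -np.sum(p_a * np.log(p_a))
    h_b = -np.sum(p_b * np.log(p_b))
    if h_a + h_b == 0:
        # Both partitions put every node in a single community.
        return 1.0
    return 2.0 * mi / (h_a + h_b)
```

Because the score depends only on the contingency table, it is invariant to relabeling of the communities, which is why it is a convenient way to compare a detected partition with a ground-truth one.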

 

 

Ridge-Regularized Covariance Selection in Gaussian Graphical Models


Gaussian graphical models have been extensively used to model conditional independence via the concentration matrix of a random vector. They are particularly useful for incorporating structure when the vector is long and naive methods lead to unstable estimates of the concentration matrix. In covariance selection, we have a latent network among the vector components in which two components are not connected if they are conditionally independent, that is, if their corresponding entry in the concentration matrix is zero. To identify the latent network and detect communities, we propose a Bayesian approach with a two-level hierarchical prior: a spike-and-slab prior on each off-diagonal entry of the concentration matrix for variable selection, and a degree-corrected stochastic blockmodel to capture the community behavior. To conduct inference, we develop an efficient routine based on ridge regularization and MAP estimation.
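The link between the concentration matrix and the latent network can be illustrated with a simple, non-Bayesian stand-in: add a ridge term to the sample covariance so that it can be inverted even when there are fewer observations than variables, then read candidate edges off the partial correlations. This is only a sketch of the general idea, not the routine described above; the function names, the `ridge` value, and the hard `threshold` (used here in place of the spike-and-slab selection) are all assumptions.

```python
import numpy as np

def ridge_precision(X, ridge=0.1):
    """Invert the ridge-regularized sample covariance S + ridge * I."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    S = centered.T @ centered / X.shape[0]
    return np.linalg.inv(S + ridge * np.eye(S.shape[0]))

def partial_correlations(precision):
    """Partial correlation of i and j given the rest: -K_ij / sqrt(K_ii * K_jj)."""
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def select_edges(precision, threshold=0.1):
    """Keep an edge when the absolute partial correlation exceeds the threshold."""
    edges = np.abs(partial_correlations(precision)) > threshold
    np.fill_diagonal(edges, False)
    return edges

# Five observations of ten variables: the plain sample covariance is singular,
# but the ridge-regularized version is still invertible.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
K = ridge_precision(X, ridge=0.5)
network = select_edges(K, threshold=0.1)
```

A zero entry in the concentration matrix corresponds to a zero partial correlation, so thresholding plays the role of deciding which entries are treated as zero in this simplified setting.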
 
Contact

Office:  Room 151
111 Cummington Mall
Boston MA 02215

Email:

Phone: +1(617)358-2378

