Active

$499.6K Funding

2 People

External

Many of today's real-world systems form information networks, in which nodes represent entities of multiple types and links reflect relations between entities. Examples include social networks, user-product networks, and knowledge graphs, which exist in social networking sites, e-commerce systems, and digital encyclopedia systems, respectively. One major goal of data science is to obtain effective representations of information networks to support analytical tasks such as similarity search, link prediction, and node ranking, which in turn enable real-world applications such as search and recommendation. Prior research has developed machine learning and optimization solutions for inferring network data representations. However, these approaches still suffer from computation and storage challenges, especially when responding to real-time requests and computing on small devices. This project will develop novel methods for learning network data representations that could substantially reduce computational cost and storage space. Ultimately, the project will make applications driven by network data run more effectively and efficiently.

This project will develop novel machine learning methods for hashing both homogeneous and heterogeneous information networks. In addition to structure information, the developed approach will also consider attribute information that may be available for many information networks. Because the formalized learning objectives are NP-hard problems with binary decision variables, this project will develop new optimization methods that explore two different directions for solving them. The first is to creatively transform the formalized learning problems into well-studied Max-Cut problems and then leverage existing approaches to solve the transformed problems. The second is to study continuous optimization reformulations of the formalized learning problems and then develop convex approximations to solve the reformulated problems. The learned binary representations of information networks will be applied to several important application tasks such as similar node search, node classification, link prediction, and recommendation. Multiple information network datasets will be used to evaluate the performance of the developed methods using different quantitative evaluation metrics. The information network hashing solutions developed in this project will significantly advance the research fields of representation learning of information networks, data mining and machine learning for network data analysis, and optimization for machine learning problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
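
As a rough illustration of what hashing an information network means in practice, the sketch below assigns each node a short binary code and ranks nodes by Hamming distance to a query node. It is not the project's method: it uses a generic spectral relax-and-round baseline (drop the binary constraint, take low-frequency eigenvectors of the graph Laplacian, then threshold by sign). It assumes a small, connected, undirected, homogeneous graph. The function names `spectral_binary_codes`, `hamming_distances`, and `two_clique_graph` are hypothetical and chosen only for this example.

```python
import numpy as np


def spectral_binary_codes(adjacency, n_bits):
    """Hypothetical baseline: relax binary codes to real vectors, then round.

    Assumes a connected, undirected graph given as a dense 0/1 matrix.
    """
    degrees = adjacency.sum(axis=1)
    laplacian = np.diag(degrees) - adjacency
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    # Skip the first (constant) eigenvector; it carries no structure.
    relaxed = eigvecs[:, 1:1 + n_bits]
    # Round the continuous solution to bits by thresholding at zero.
    return (relaxed > 0).astype(np.uint8)


def hamming_distances(codes, query_index):
    """Number of differing bits between the query node and every node."""
    return np.count_nonzero(codes != codes[query_index], axis=1)


def two_clique_graph(clique_size=4):
    """Toy network: two cliques joined by a single bridging edge."""
    n = 2 * clique_size
    adjacency = np.zeros((n, n))
    adjacency[:clique_size, :clique_size] = 1
    adjacency[clique_size:, clique_size:] = 1
    np.fill_diagonal(adjacency, 0)
    adjacency[clique_size - 1, clique_size] = 1
    adjacency[clique_size, clique_size - 1] = 1
    return adjacency


adjacency = two_clique_graph()
codes = spectral_binary_codes(adjacency, n_bits=1)
distances = hamming_distances(codes, query_index=0)
print(distances)
```

On this toy graph, a single bit is enough to separate the two cliques: nodes in the query's clique sit at Hamming distance 0 and nodes in the other clique at distance 1. Each node is stored in one bit rather than a vector of floating-point numbers, and similarity search reduces to bit comparisons, which is the kind of storage and computation saving the abstract describes.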