Integrated framework for criminal network extraction from Web

Afra S., Alhajj R.

Journal of Information Science, vol.47, no.2, pp.206-226, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 47 Issue: 2
  • Publication Date: 2021
  • Doi Number: 10.1177/0165551519888606
  • Journal Name: Journal of Information Science
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, Academic Search Premier, FRANCIS, IBZ Online, Periodicals Index Online, ABI/INFORM, Aerospace Database, Analytical Abstracts, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EBSCO Education Source, Education Abstracts, Index Islamicus, Information Science and Technology Abstracts, INSPEC, Library and Information Science Abstracts, Library Literature and Information Science, Library, Information Science & Technology Abstracts (LISTA), Metadex, Civil Engineering Abstracts
  • Page Numbers: pp.206-226
  • Keywords: Crime analysis, criminal graph, information extraction, social network, Web data mining
  • Istanbul Medipol University Affiliated: Yes


Extracting information about criminals and discovering their networks are techniques that investigators often rely on to learn more about criminal incidents and potential criminals. With its recent advances (a.k.a. Web 2.0), the Web has become a rich source of data that provides a variety of information sources. In this article, we propose an integrated framework that combines a variety of available components and makes use of different sources of information on the Web to gain better knowledge about criminals or terrorists (in the rest of this article, we use "criminals" to cover terrorists as well). Our system extracts criminals’ information and their corresponding network from Web sources such as online newspapers, official reports, and social media. It uses text analysis to identify key persons and topics in crawled Web documents. We build a criminal graph from the analysed text based on co-occurring mentions of criminals. Further analysis is applied to the constructed graph to identify key people, hidden relationships and interactions between criminals, as well as hierarchical criminal groups within a network. For every process in the framework, we analysed various available works and implementations that could be used in that process. While analysing social media posts, we identified several challenges that indicate what solutions could be used for that purpose. Finally, we provide a Web application that implements the proposed framework and shows how helpful and efficient the system is in extracting and analysing criminal information.
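
The co-occurrence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that person names have already been extracted from each crawled document by some named-entity step, and the function names (`build_cooccurrence_graph`, `weighted_degree`), the placeholder names, and the use of weighted degree as a rough indicator of key people are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(doc_mentions):
    """Count, for each pair of names, the number of documents mentioning both.

    doc_mentions: one list of person names per document (hypothetical input,
    assumed to come from an earlier entity-extraction step).
    Returns a Counter mapping a sorted (name_a, name_b) tuple to its weight.
    """
    edges = Counter()
    for names in doc_mentions:
        # Deduplicate so a document contributes at most 1 to each pair,
        # and sort so each undirected edge has a single canonical key.
        unique = sorted(set(names))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each person."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Placeholder data, not taken from the article.
docs = [
    ["Person A", "Person B", "Person C"],
    ["Person A", "Person B"],
    ["Person C", "Person D"],
]
edges = build_cooccurrence_graph(docs)
degree = weighted_degree(edges)
```

Here two people are linked whenever they are mentioned in the same document, and the edge weight is the number of such documents. Counting at sentence or paragraph level, or ranking people with another centrality measure, would be equally plausible choices under the same description.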