Exploring and Evaluating a Malicious Site on the Dark Web Using Machine Learning
Abstract
Web-based attacks, especially drive-by download attacks, have become increasingly serious in this era of globalisation. To protect legitimate users, it is crucial to gather data on harmful websites that can feed a blacklist-based detection tool. In this research, we propose a technique for collecting the URLs of harmful websites on the dark web. The proposed method crawls dark web resources and gathers harmful URLs, which are verified using the Gred algorithm or VirusTotal. Using a content embedding and a gradient boosting decision tree model, we additionally predict the risk categories of collected websites that are potentially harmful. Our experiments show that the proposed method achieves an F1-score of 0.82 when predicting harmful site types. Harmful websites, commonly referred to as malicious uniform resource locators (URLs), serve as the foundation for a variety of online offences, including phishing, spamming, identity theft, financial fraud, and malware distribution. They pose a frequent and severe threat to cyber security. Researchers have proposed several approaches, including machine learning methods and blacklisting systems, to identify and classify hazardous URLs online. Blacklisting, however, is ineffective at detecting variants of known malicious URLs or freshly created URLs. It also requires human input and takes time to complete, which limits its use in real-time settings.
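The weakness of list-based detection described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the list entries, the example URLs (on the reserved `.test` domain), and the function names `is_blacklisted` and `host_is_blacklisted` are hypothetical and are not part of the proposed system.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries; example.test is a reserved placeholder domain.
BLACKLIST = {
    "http://malware.example.test/payload.exe",
    "http://phish.example.test/login",
}


def is_blacklisted(url, blacklist=BLACKLIST):
    """Exact-match lookup: True only if this exact URL string is listed."""
    return url in blacklist


def host_is_blacklisted(url, blacklist=BLACKLIST):
    """Looser lookup: True if the URL's hostname matches any listed hostname."""
    listed_hosts = {urlsplit(entry).hostname for entry in blacklist}
    return urlsplit(url).hostname in listed_hosts


known = "http://phish.example.test/login"
variant = "http://phish.example.test/login?session=1"  # same page, extra query string
fresh = "http://phish2.example.test/login"              # newly registered host

for url in (known, variant, fresh):
    print(url, is_blacklisted(url), host_is_blacklisted(url))
```

In this sketch the exact-match lookup flags only the known URL. Matching on hostname catches the query-string variant but still misses the freshly created host, which is the gap that a learned classifier over page content is meant to address.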