AI against cybercriminals: how DarkBERT solves the mysteries of the dark web
Now no one will escape from digital justice!
The dark web is a part of the Internet that is hidden from ordinary users and is only accessible through special anonymizing programs, such as Tor. Various illegal activities take place on the dark web, such as the sale of illegal substances, weapons, false documents, and hacking services.
Researchers in South Korea have created an AI model that can analyze and extract useful information from dark web texts. The model is called DarkBERT and is based on RoBERTa, a natural language processing architecture introduced in 2019 that is among the most powerful of its kind.
To train the model, the researchers crawled the dark web through the Tor network to collect a large corpus of texts, then filtered out repetitive and off-topic material. They then trained the RoBERTa model on this corpus. The result, DarkBERT, can process dark web texts and highlight key elements in them.
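The filtering step can be pictured with a minimal sketch. This is not the researchers' actual pipeline: the function names, the `min_words` threshold, and the idea of treating very short pages as low-information stand-ins for "off-topic" material are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def filter_corpus(pages, min_words=5):
    """Keep the first copy of each page and drop very short pages.

    Hypothetical helper: `min_words` is an assumed threshold, used here
    as a crude proxy for pages that carry too little information.
    """
    seen = set()   # hashes of pages already kept
    kept = []
    for page in pages:
        norm = normalize(page)
        if len(norm.split()) < min_words:
            continue  # too short to be useful (e.g. an error page)
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate of a page we already kept
        seen.add(digest)
        kept.append(page)
    return kept


if __name__ == "__main__":
    sample = [
        "Welcome to the   market, browse listings here",
        "welcome to the market, browse listings here",
        "Error 404",
        "Forum thread about account security and passwords",
    ]
    for page in filter_corpus(sample):
        print(page)
```

Hashing the normalized text only catches exact repeats. A real crawl would likely also need near-duplicate detection and a proper topic classifier, which this sketch leaves out.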
The researchers report that DarkBERT outperforms other language models at analyzing dark web text. This could help cybersecurity professionals and law enforcement look deeper into the corners of the internet where criminals lurk. However, DarkBERT is not yet finished and requires further training and tuning. Exactly how it will be used, and what knowledge it can provide, is still unknown.