TUCSON, Ariz. – The quivering images and militant writings are frightening: an exploding Humvee blankets passing cars with dust; a lab technician makes explosives, step by step; hatred oozes from "A guide to kill Americans in Saudi Arabia."
Tens of thousands of Web pages are now devoted to terrorist propaganda designed to attract followers. On the surface, the messages and videos reveal little about their creators. But programmers and writers leave digital clues: the greetings and other words they choose, their punctuation and syntax, and the way they code multimedia attachments and Web links.
Researchers at the University of Arizona are developing a tool that uses these clues to automate the analysis of online jihadism. The Dark Web project aims to scour Web sites, forums and chat rooms to find the Internet's most prolific and influential jihadists and learn how they reel in adherents.
Lab director Hsinchun Chen hopes Dark Web will crimp what he calls "al-Qaida University on the Web," the mass of Web sites where potential terrorists learn their trade, from making explosives to planning attacks. Experts said they are not aware of any comparable effort, though some said the project may have only limited applications.
The project in the university's Artificial Intelligence Lab will not identify people outside cyberspace "because that involves civil liberties," Chen said, preferring to let law enforcement and intelligence analysts take over from there. Instead, it will help identify messages with the same author and reveal links that aren't obvious.
"Our tool will help them ID the high-risk, radical opinion leaders in cyberspace," Chen said.
Chen said a few agencies are on the verge of using some of his team's techniques, but he would not name them.
Former FBI counterterror chief Dale Watson, who noted that terrorist Web sites and communications are now analyzed manually, said the ability to sort through so much data electronically "would be a great asset in the fight against terrorism."
"It would greatly enhance the speed and capability to sort through a large amount of data," Watson said. "That would be the key here. The issue will be where is the Web site originating and where are the tentacles going?"
The only other computer-driven analysis of terrorist Web sites that Chen said he knew of is at the Pacific Northwest National Laboratory in Richland, Wash. Spokesman Greg Koller said the lab's program is "developing some tools that a decision-maker could use, but nothing that is completely automated."
The bulk of a $1.3 million grant that the National Science Foundation gave Chen's group will go toward studying who produces improvised explosives and what they talk about, such as American troop movements and terrorist tactics. Before getting the NSF funding, Chen started the project with about $3 million from other Artificial Intelligence Lab programs.
Dark Web's software, which Chen calls Writeprint, samples 480 different factors to determine whether the same people are posting to multiple radical forums. It can analyze everything from a fragment of an e-mail to videos depicting American soldiers blown up in Humvees and fuel tankers.
Writeprint is derived from a program originally used to determine the authenticity of William Shakespeare's works. It looks at writing style, greetings, and word usage and frequency, and at technical elements ranging from Web addresses to the coding on multimedia attachments. It also looks at linguistic and formatting features such as special characters, punctuation, word roots, font size and color, Chen said.
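The general idea of comparing texts by writing style can be illustrated with a short sketch. This is not Writeprint's code or method, which the article does not detail. It is a toy example that assumes a dozen or so hypothetical features (the word list, the punctuation list and the average word length are choices made here for illustration, not drawn from Chen's 480 factors) and a simple cosine-similarity comparison.

```python
import math
import re
from collections import Counter

# Hypothetical feature choices for illustration only; not the Writeprint feature set.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "that", "is", "it"]
PUNCTUATION = [",", ".", "!", "?", ";", ":"]


def style_vector(text):
    """Turn a text into a small vector of style features."""
    words = re.findall(r"[a-z']+", text.lower())
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    word_counts = Counter(words)
    # Relative frequency of each common function word.
    features = [word_counts[w] / n_words for w in FUNCTION_WORDS]
    # Punctuation marks per character of text.
    features += [text.count(p) / n_chars for p in PUNCTUATION]
    # Average word length, scaled down so it does not dominate the vector.
    features.append(sum(len(w) for w in words) / n_words / 10)
    return features


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_author(unknown_text, known_texts):
    """Return the label in known_texts whose style vector is closest to unknown_text."""
    target = style_vector(unknown_text)
    return max(
        known_texts,
        key=lambda label: cosine_similarity(target, style_vector(known_texts[label])),
    )
```

Given a dictionary mapping author labels to sample writings, `most_similar_author` returns the label whose sample most resembles an unattributed posting. A real system would need far more features and far more text per author than this sketch uses.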
Currently, intelligence analysts cannot effectively analyze writing style in cyberspace, particularly multilingual writings, he said.
"But using our tool ... we can get about 95 percent accuracy, because I'm utilizing a lot of things your naked eye cannot see," Chen said.
Chen and counterterror specialists said the volume of jihadist content appearing online, which Chen said has increased tenfold in the last two years, has outstripped intelligence analysts' abilities.
"Automating this is absolutely necessary," said Evan Kohlmann, a terrorism expert with the Washington-based Investigative Project on Terrorism. "We're reaching that finite limit" of what can be done manually by humans.
Dark Web compares writings it finds to others in its logs of about 500 million pages of jihadist-produced documents, videos, images, e-mails and other postings, Chen said.
Most of the material is in Arabic, but as terrorist sympathizers have spawned new sites worldwide since 2005, Dark Web has expanded to look at Chinese-, Spanish- and French-language postings, and others will be added.
Given that some forums include close to 70,000 members and a million postings, analyzing Web traffic by hand "is really like drinking water from a fire hydrant," Chen said.
Some counterterror specialists, including some Chen consulted when he started the project nearly four years ago, are unconvinced Dark Web will deliver first-class analysis or produce real-world results.
"To be anything more than a scientific exercise, the techniques and methods developed need to be applicable to real-world counterterrorism," said Ben Venzke, head of IntelCenter, a private company that studies terrorist groups for intelligence agencies.
Venzke, who did not have specific knowledge of Chen's project, cautioned that public discussion of attempts to identify jihadists can damage those efforts.
"If you develop a method to identify something that a group is doing and then publicly disclose the method or enough of what you're looking for, there's a very good chance that they're going to stop doing it," Venzke said.
Others Chen consulted at Dark Web's outset also aren't sold on how much real-world value it can deliver.
"He has to show that the guys who post it have anything to do with the bombings," said Dr. Marc Sageman, a forensic psychiatrist, a CIA case worker in the late 1980s in Afghanistan and a senior fellow with the Foreign Policy Research Institute.
Gabriel Weimann, an international terrorism expert at Israel's University of Haifa, also tempered his support.
"I am not very thrilled with `computerized scanning,'" Weimann said in an e-mail. "A human eye sees more, and deeper."