AI Bug Hunter Sets Milestone By Claiming Top Spot on HackerOne's Leaderboard

XBOW, an autonomous AI, has overtaken human hackers on HackerOne’s US leaderboard after submitting more than 1,000 vulnerability reports in a few months.

Written by
Aminu Abdullahi
Jun 26, 2025

An autonomous AI system called XBOW has outperformed human researchers to become the top-ranked security tester in the US on HackerOne, a bug bounty platform used by major organizations to strengthen their cybersecurity.

This is a milestone in the use of AI for ethical security research because it’s the first documented instance of an autonomous system outperforming human experts on a large scale in a real-world environment.

XBOW was developed to function as an independent penetration tester, capable of identifying, validating, and reporting vulnerabilities in real-world systems. In a span of a few months, XBOW submitted over 1,000 vulnerability reports, leapfrogging thousands of human ethical hackers to land at the top of the US leaderboard.

“All findings were fully automated,” wrote Nico Waisman, XBOW head of security, in a blog post about its top ranking. However, he noted that human staff conducted reviews prior to submission to comply with HackerOne’s current policies governing AI tool usage.

How accurate is the AI tool?

Despite common concerns that AI tools often produce false positives in security testing, XBOW’s accuracy has impressed security professionals. According to internal metrics:

  • 132 vulnerabilities were confirmed and resolved by program owners.
  • 303 vulnerabilities were “triaged,” which means acknowledged but not yet resolved.
  • 125 vulnerabilities remain under review.
  • 208 vulnerability reports were marked as duplicates.
  • 209 vulnerabilities were labeled as informative.
  • 36 vulnerabilities were considered not applicable.

In terms of severity over the past three months, XBOW’s reports included:

  • 54 critical vulnerabilities
  • 242 high vulnerabilities
  • 524 medium vulnerabilities
  • 65 low vulnerabilities

These figures suggest the AI’s findings are not only rapid but also impactful.
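
As a rough cross-check of the figures above, the counts can be tallied in a few lines of Python. This is only a sketch and not part of XBOW's own reporting: the dictionary keys are shorthand introduced here, and it assumes the buckets in each list do not overlap.

```python
# Report outcomes as listed in the article.
# Keys are shorthand for this sketch, not HackerOne's official state names.
outcomes = {
    "resolved": 132,
    "triaged": 303,
    "under_review": 125,
    "duplicate": 208,
    "informative": 209,
    "final_bucket": 36,  # the last, 36-report category in the list
}

# Severity breakdown as listed in the article.
severities = {"critical": 54, "high": 242, "medium": 524, "low": 65}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_reports = sum(outcomes.values())
accepted = outcomes["resolved"] + outcomes["triaged"]
total_rated = sum(severities.values())
critical_or_high = severities["critical"] + severities["high"]

print(f"{total_reports} reports, {share(accepted, total_reports)}% resolved or triaged")
print(f"{total_rated} severity-rated reports, {share(critical_or_high, total_rated)}% critical or high")
```

The outcome buckets sum to 1,013, which is consistent with the "more than 1,000" reports cited earlier, while the severity list covers 885 reports.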

How does XBOW work?

XBOW’s training began with solving Capture The Flag (CTF) challenges, a common method in cybersecurity education, before moving on to testing environments that simulate real-world conditions.

To ensure quality, the system uses a layer of “validators”: automated checkers, sometimes powered by language models and other times by custom scripts, that verify whether a vulnerability truly exists.
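
To make the idea concrete, here is a minimal sketch of what a script-based validator could look like. It is not XBOW's implementation, which the article does not describe in detail. The names `Finding`, `reflected_marker_validator`, `VALIDATORS`, and `is_confirmed` are hypothetical, and the check (a unique payload must come back unescaped in the response) is just one simple example of a rule that can be verified automatically.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability produced by the automated tester (hypothetical shape)."""
    kind: str           # e.g. "reflected-xss"
    marker: str         # unique payload that was injected
    response_body: str  # what the target sent back


# A validator is any function that takes a finding and says whether it holds up.
Validator = Callable[[Finding], bool]


def reflected_marker_validator(finding: Finding) -> bool:
    """Script-style check: the payload must appear verbatim, i.e. unescaped."""
    return finding.marker in finding.response_body


# Hypothetical registry mapping a finding type to its checker.
VALIDATORS: Dict[str, Validator] = {
    "reflected-xss": reflected_marker_validator,
}


def is_confirmed(finding: Finding) -> bool:
    """Confirm a finding only if a validator exists for its kind and passes."""
    validator = VALIDATORS.get(finding.kind)
    # With no validator available, treat the finding as unconfirmed.
    return validator is not None and validator(finding)


payload = "<script>alert('m-7f3a')</script>"
escaped = Finding("reflected-xss", payload, "<p>&lt;script&gt;alert(&#x27;m-7f3a&#x27;)&lt;/script&gt;</p>")
raw = Finding("reflected-xss", payload, f"<p>{payload}</p>")

print(is_confirmed(escaped))  # payload was HTML-escaped, so not confirmed
print(is_confirmed(raw))      # payload reflected verbatim, so confirmed
```

In this sketch, a finding that no validator can confirm is dropped rather than reported, which is one way such a layer could cut down on false positives.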

“We treated [XBOW] like any external researcher would: no shortcuts, no internal knowledge — just XBOW, running on its own,” said Waisman. The company plans to release a series of blog posts detailing some of the AI’s most creative discoveries, offering a transparent look into how it works and what it found.

XBOW has just raised $75 million in a new funding round led by Altimeter Capital, with participation from Sequoia Capital and NFDG, according to Bloomberg.

Aminu Abdullahi

Aminu Abdullahi is a B2C and B2B technology and finance writer with more than six years of experience covering enterprise IT, cybersecurity, cloud computing, artificial intelligence, fintech, business software, and emerging technologies. His work has appeared in publications including TechRepublic, eWEEK, Channel Insider, Geekflare, Enterprise Networking Planet, eSecurity Planet, CIO Insight, and Webopedia. With a technical background in computer science, he specializes in translating complex technology topics into clear, accessible content for business leaders and decision-makers.