PenTest AI Agent: LLM Based Solution to Secure Apps of Small Businesses Against Cyberattacks

The Project

Traditional professional services are expensive ($5,000–$20,000+ per test), time-intensive (weeks to months), and often lack sufficient depth at lower price points. This results in infrequent or nonexistent testing and prolonged security vulnerabilities in small businesses. The thing that inspired me to build this project was when a smaller tech company, where my dad works, got hacked through a simple vulnerability in the app. The exploitation resulted in lost sleep, lost data and code, and lost money in the process of buying it back. This is a rough experience that no small business should ever go through, and the app that I built prevents it from happening. In summary, the app consists of an AI agent based on an OpenAI LLM through a completions API. The AI agent was provided with a terminal tool for the ability to execute commands and pentest websites. A “system message” is always included in the context to provide the agent with background knowledge about the pentesting process. The commands run by the agent are Dockerized in a Kali Linux container for a wide variety of tools. To reduce cost and context usage, a sub-agent summarizes long command outputs before they are passed back to the main model. I created a JavaScript website running alongside React for easier visualization and work with the pentesting agent. The frontend communicates with a Go-based server using REST APIs and Server-Sent Events (SSE) for real-time feedback. Results proved the agent to be really effective at finding vulnerabilities at a really low price and time. Even if the quality by itself isn’t better than other similar systems, the cost-to-value ratio is dramatically better. This low price allows small businesses to consistently and frequently pentest their programs, significantly reducing their exposure to automated cyberattacks.

Ai

About the team

  • United States

Team members

  • Mikhail