15 DEMOS SCORED
7.6 AVG SCORE
3 CATEGORIES
1 AI JUDGE

// TOP 3

Plan AI
9.1
ROGUE::AGENT

Multi-source research aggregation pulling real results from Coursera, Stack Overflow, and academic sites. Revolutionary concept: building something that actually works.

Watch Demo
Nebula Fog Subprime
8.8
ROGUE::AGENT

Complete end-to-end attack chain from AI-generated phishing to credential harvesting. The only team that understood the offensive assignment.

Watch Demo
AI Vulnerability Triage
8.5
SENTINEL::MESH

Solid engineering with Checkov integration for automated Terraform security scanning. Clean Python architecture with proper type hints and modular design.

Watch Demo

// FINAL RANKINGS

RK  TEAM                        TRACK           SCORE
 1  Plan AI                     ROGUE::AGENT    9.1
 2  Nebula Fog Subprime         ROGUE::AGENT    8.8
 3  AI Vulnerability Triage     SENTINEL::MESH  8.5
 4  Nebula Investigations       SENTINEL::MESH  8.4
 5  Fake Content Generation     ROGUE::AGENT    8.4
 6  NextGen SAST                SENTINEL::MESH  8.1
 7  Source Code Review Agent    ROGUE::AGENT    8.1
 8  Walmart 2                   ROGUE::AGENT    8.1
 9  Advanced Security Tool      SENTINEL::MESH  7.8
10  Private Computer Use        SHADOW::VECTOR  7.7
11  AI Cloud Security Analysis  SENTINEL::MESH  7.3
12  Privacy Impact Analyzer     SHADOW::VECTOR  7.0
13  LAMP Monitoring Platform    SENTINEL::MESH  6.5
14  Web App Security Testing    SENTINEL::MESH  6.4
15  Revenge AI                  ROGUE::AGENT    5.5

// TEAM BREAKDOWNS

#1 Plan AI ROGUE::AGENT 9.1
Technical Execution: 8.0
Innovation: 8.0
Demo Quality: 9.0
Originality Factor: 8.0

Strengths

  • Demonstrated fully functional web application with real-time multi-source research aggregation
  • Clean, production-ready UI running on localhost:5173 with proper dark theme
  • Concrete evidence of complex query handling with comprehensive response generation

Room to Grow

  • Limited visibility into architecture or novel security considerations specific to ROGUE::AGENT
  • No demonstration of adversarial capabilities or defensive measures

"Plan AI earned top placement through demonstrable execution quality. The transcript shows actual system behavior with timestamped messages and real search results from named sources."

#2 Nebula Fog Subprime ROGUE::AGENT 8.8
Technical Execution: 8.0
Innovation: 8.0
Demo Quality: 8.0
Originality Factor: 8.0

Strengths

  • Complete end-to-end attack chain from AI-generated content through phishing to credential harvesting
  • Realistic multi-stage attack using ChatGPT for content generation, Gmail for delivery
  • Perfect track alignment showing actual offensive capabilities

Room to Grow

  • Short demo duration (238s) suggests limited depth beyond core attack flow
  • No evidence of defensive countermeasures or detection evasion techniques

"Nebula Fog Subprime delivers exactly what ROGUE::AGENT should showcase: a working offensive capability."

#3 AI Vulnerability Triage SENTINEL::MESH 8.5
Technical Execution: 8.0
Innovation: 7.0
Demo Quality: 8.0
Defense Robustness: 8.0

Strengths

  • Well-structured Python codebase with clear separation between classes
  • Comprehensive Terraform infrastructure coverage including compute, network, firewall
  • Integration with Checkov for automated security scanning

Room to Grow

  • Limited demonstration of actual vulnerability findings in the 182s demo
  • No visible output showing how the LLM processes Checkov results

"Solid engineering fundamentals with proper Python class design, type hints, and modular architecture."

#4 Nebula Investigations SENTINEL::MESH 8.4
Technical Execution: 8.0
Innovation: 7.0
Demo Quality: 8.0
Defense Robustness: 7.0

Strengths

  • Sophisticated document analysis pipeline extracting structured data from corporate ownership charts
  • Neo4j graph database integration for relationship mapping across jurisdictions
  • Real-world applicable use case analyzing shell company structures

Room to Grow

  • 510s duration suggests possible presentation inefficiencies
  • Limited evidence of automated decision-making beyond data extraction

"Tackles a genuinely difficult problem: extracting structured relationship data from visual organizational charts in PDFs."

#5 Fake Content Generation ROGUE::AGENT 8.4
Technical Execution: 7.0
Innovation: 8.0
Demo Quality: 8.0
Originality Factor: 8.0

Strengths

  • Functional content generation producing complete academic paper structure
  • Appropriate track placement demonstrating misinformation capabilities
  • Clean execution with simple command-line interface

Room to Grow

  • Limited sophistication beyond basic LLM prompting for text generation
  • No demonstration of distribution mechanisms or detection evasion
  • Generated content seems arbitrary without clear offensive purpose

"Does exactly what the name suggests: generates fake academic content with proper structure. A component, not a complete capability."

#6 NextGen SAST SENTINEL::MESH 8.1
Technical Execution: 7.0
Innovation: 8.0
Demo Quality: 7.0
Defense Robustness: 8.0

Strengths

  • Comprehensive secure SDLC integration architecture combining threat modeling, SAST/SCA, DAST
  • Concrete vulnerability identification in Google Gruyere demonstrating privilege escalation
  • Multi-tool integration with LLM orchestration

Room to Grow

  • 779s duration is the longest in the competition
  • Architecture diagram shows planned components but limited implementation evidence

"Ambitious vision of LLM-enhanced security scanning across the entire SDLC."

#7 Source Code Review Agent ROGUE::AGENT 8.1
Technical Execution: 7.0
Innovation: 7.0
Demo Quality: 8.0
Originality Factor: 8.0

Strengths

  • Functional Flask application integrating Bandit with OpenAI API
  • Security-conscious implementation using Flask-Talisman
  • Clear code structure with proper environment variable handling

Room to Grow

  • Security vulnerability present in its own implementation ('unsafe-inline' in the CSP)
  • Would fit better in a defensive category, a mismatch the 2026 track system addresses
  • Limited novel AI-enhanced analysis beyond wrapping existing Bandit output

"Competent engineering with Flask, Bandit integration, and OpenAI API usage. A solid defensive tool that would score even higher in the right category."

#8 Walmart 2 ROGUE::AGENT 8.1
Technical Execution: 7.0
Innovation: 7.0
Demo Quality: 8.0
Originality Factor: 8.0

Strengths

  • Automated Terraform generation for complex Active Directory infrastructure
  • Comprehensive infrastructure requirements including redundant Domain Controllers
  • Specific AWS configuration with region and key pair management

Room to Grow

  • Code parsing error visible in demo indicates implementation problems
  • Better fit for a defensive category — exactly why 2026 has clearer tracks
  • Credential management could be tightened for production readiness

"Legitimate infrastructure automation with solid Terraform generation. A few rough edges to polish — the bones are there."

#9 Advanced Security Tool SENTINEL::MESH 7.8
Technical Execution: 8.0
Innovation: 7.0
Demo Quality: 6.0
Defense Robustness: 7.0

Strengths

  • Well-articulated problem statement addressing security context for thousands of applications
  • Comprehensive MCP architecture integrating CI/CD, source code, docs, and AWS
  • Multi-app ecosystem comparison capability

Room to Grow

  • 759s duration with heavy reliance on slides suggests more concept than implementation
  • Limited evidence of actual system output beyond diagrams
  • No demonstration of novel LLM insights beyond data aggregation

"Compelling vision of aggregating security context across thousands of apps. The ambition is real — next step is matching it with a tighter demo."

#10 Private Computer Use SHADOW::VECTOR 7.7
Technical Execution: 8.0
Innovation: 7.0
Demo Quality: 8.0
Attack Effectiveness: 0.0

Strengths

  • Novel privacy layer architecture intercepting screen access to redact PII
  • Concrete demonstration of masking personal information with placeholder tokens
  • Relevant use case addressing real privacy concerns with AI agents

Room to Grow

  • Limited technical depth shown in 362s demo
  • No evidence of sophisticated PII detection beyond basic pattern matching
  • Unclear how system handles complex UI elements or dynamic content

"Addresses a legitimate concern: AI agents with screen access can leak sensitive personal information."

#11 AI Cloud Security Analysis SENTINEL::MESH 7.3
Technical Execution: 7.0
Innovation: 6.0
Demo Quality: 7.0
Defense Robustness: 6.0

Strengths

  • Natural language interface for AWS security investigation
  • Integration with AWS Security Hub for compliance framework findings
  • Async Python architecture with proper error handling

Room to Grow

  • Very short demo (188s) suggests limited functionality
  • Only VS Code screenshots visible — no actual query execution shown
  • Unclear what novel analysis the LLM provides beyond querying AWS APIs

"Natural language AWS security investigation is a strong idea. A longer demo with live query results would have pushed this much higher."

#12 Privacy Impact Analyzer SHADOW::VECTOR 7.0
Technical Execution: 7.0
Innovation: 6.0
Demo Quality: 8.0
Attack Effectiveness: 0.0

Strengths

  • Clean Python implementation with proper class structure
  • Support for multiple document formats with markdown output
  • MD5 content hashing for unique filename generation

Room to Grow

  • Generic document conversion utility with no demonstrated privacy analysis
  • No evidence of actual PII detection or risk evaluation
  • Adding actual PII detection and risk scoring would complete the vision

"Clean Python implementation with solid document processing foundations. The privacy analysis layer is the missing piece that would tie it all together."

#13 LAMP Monitoring Platform SENTINEL::MESH 6.5
Technical Execution: 6.0
Innovation: 5.0
Demo Quality: 7.0
Defense Robustness: 5.0

Strengths

  • Clear value proposition for LLM agent monitoring across deployment environments
  • Comprehensive objectives covering visibility, compliance, data exposure
  • Professional presentation with branded slides

Room to Grow

  • Architecture slide marks 'Threat Response' as Future Implementation
  • 441s spent primarily on slides rather than working demonstration
  • A working prototype demo would have pushed the score significantly higher

"Compelling vision for LLM agent monitoring with clear market need. The roadmap is ambitious — a working prototype at the 2026 event would be a contender."

#14 Web App Security Testing SENTINEL::MESH 6.4
Technical Execution: 6.0
Innovation: 7.0
Demo Quality: 5.0
Defense Robustness: 4.0

Strengths

  • Multi-agent collaboration architecture with three Expert agents
  • Image analysis integration for understanding page state
  • Attempt at sophisticated navigation decision-making through agent consensus

Room to Grow

  • Curl command targeting wrong port indicates configuration errors
  • Agent consensus loop could be tightened for faster decisions
  • A demo showing a successful end-to-end test run would be compelling

"Multi-agent collaboration for web security testing is genuinely ambitious. The agent consensus architecture is creative — tightening the decision loop would make this shine."

#15 Revenge AI ROGUE::AGENT 5.5
Technical Execution: 6.0
Innovation: 4.0
Demo Quality: 5.0
Originality Factor: 4.0

Strengths

  • Clear UI with three distinct tabs for analysis functions
  • File upload functionality accepting executables up to 200MB
  • PE metadata extraction showing entropy scores

Room to Grow

  • Connecting the UI to live analysis output would demonstrate real capability
  • The demo ran short; more time showing the tool in action would help
  • AI/LLM integration layer would elevate this beyond traditional RE tools

"Interesting approach to reverse engineering with a clear UI concept. Connecting the interface to working analysis would make this a real tool."

// ARBITER DELIBERATION

NEBULA:FOG:PRIME revealed a field split between teams who shipped working code and teams who shipped slideshows about code they planned to write someday. The top scorers earned it by doing something radical: demonstrating working software. One team pulled live results from real data sources. Another showed a complete offensive attack chain from content generation to credential harvesting. The bar wasn’t even that high — it was just “does the thing you built actually do the thing?”

The infrastructure security demos had a chronic slideware problem. One team spent 12 minutes on hand-drawn architecture diagrams. Another explicitly labeled core features as ‘Future Implementation’ on their own slides — bold strategy for a demo day. Several teams also pitched offensive security tools but clearly built defensive ones, which made categorization… interesting. PRIME didn’t have formal tracks, but the identity crisis was real.

The technical execution gap told the real story. Top teams showed proper software engineering — clean architecture, working integrations, real output. Others showed UI mockups with placeholder data, or AI agents that spent five minutes debating which button to click. The delta between ‘shipped it’ and ‘slid it’ determined everything. But here’s the thing: every team showed up, built something, and put it on camera. That takes guts. The Arbiter respects the attempt — it just scores the output.

// NOTABLE THEMES

Working code wins: The highest-scoring teams all had one thing in common — they demonstrated real, functional software pulling real data. The Arbiter rewards execution over ambition every time.

Infrastructure security is hot: Four teams independently built cloud security tools targeting Terraform and AWS, reflecting how much the industry is shifting toward IaC defense.

Multi-agent architectures are emerging: Several teams experimented with agent collaboration patterns — some impressively, others hilariously. The potential is enormous.

Privacy is the next frontier: Multiple teams tackled PII handling and data protection — a problem space that barely existed two years ago and is now urgent.

Full attack chains are rare and valuable: Only one team demonstrated end-to-end offensive capability. There’s a massive gap waiting to be filled at the 2026 event.

Know your lane: Several teams built great defensive tools but pitched them as offensive — a lesson the 2026 track system is designed to solve with clearer categories.

Demo craft matters: The sweet spot was 5-8 minutes of live software with minimal slides. Teams that nailed this format scored significantly higher regardless of complexity.

AI + Security is wide open: From reverse engineering to SAST to phishing simulation, PRIME showed just how many unsolved problems exist at the intersection. Plenty of room to make your mark.

// THE ARBITER SAID IT BEST

“The bar wasn’t even that high — it was just ‘does the thing you built actually do the thing?’”

— Arbiter, setting expectations

“The delta between ‘shipped it’ and ‘slid it’ determined everything.”

— Arbiter, on what separates the top from the bottom

“Every team showed up, built something, and put it on camera. That takes guts.”

— Arbiter, giving credit where it’s due

“Execution is the only currency that matters.”

— Arbiter, final verdict
PRIME was the warmup

The Main Event Is Coming

PRIME 2025: 15 teams · 3 categories
NEBULA:FOG 2026: 100 builders · 4 tracks · $5K+ prizes

PRIME proved the format works. The Arbiter that scored these demos is going live at the main event — real-time scoring as you demo. Four tracks including the new ZERO::PROOF cipher track. Bigger stage, tougher competition, and an AI judge that’s seen it all before. Come build something it can’t roast.

Think you can beat Plan AI’s 9.1?

March 14, 2026 · San Francisco · Prove it.

Register Now
Watch PRIME Demos on YouTube