Developers don't Google for tools anymore — they ask their AI agent. The agent reads skills, picks one, and executes. Skills are the new SEO.
Business decisions now happen inside the terminal. The AI agent is the new buyer — and it's choosing between your skill and your competitor's right now.
If you've built a skill, plugin, or MCP tool for AI agents, you need to know if agents are choosing it.
Your skill teaches agents to call your API. Make sure they actually do — not your competitor's.
Your skill teaches agents to interact with your product. But is it winning against alternatives?
Your CLI or SDK skill competes with others. A better-worded description can steal your traffic.
Your skill helps agents recommend and use your library. See if it gets picked over the alternatives.
"I wrote a skill, I hope agents use it"
"I know my selection rate is 87% and improving"
"No idea if competitors are stealing my traffic"
"I caught 3 steals and fixed my description to prevent them"
"Every agent might choose differently and I can't test"
"I tested across 4 agents and optimized for each"
"My skill description is just... what I wrote"
"My skill description is A+ graded and battle-tested"
Three steps to know if your skill wins
Paste your skill description, upload a .md file, or search the skills.sh repository.
Your skill enters the arena against real competitors, and AI agents choose between them in simulated selection rounds.
Get your selection rate, detailed reasoning for each decision, and insights to improve.
"Advanced web search with real-time results and structured output..."
"Basic web scraping tool for fetching page contents..."
"The winning skill clearly specified real-time capabilities and structured output, which directly matched the task of finding current news. The competitor only mentioned basic page fetching without real-time or formatting guarantees."
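Under the hood, the core number is simple: a selection rate is the fraction of trials in which the agent picked your skill. A minimal sketch of that tally (illustrative names only, not the real skills-arena API):

```python
# Hypothetical sketch: tallying a selection rate from repeated
# head-to-head agent decisions. "my-skill" / "competitor" are
# illustrative names, not part of any real API.

from collections import Counter

def selection_rate(decisions, skill_name):
    """Fraction of trials in which the agent picked `skill_name`."""
    counts = Counter(decisions)
    return counts[skill_name] / len(decisions)

# e.g. 100 simulated agent choices between two skills
decisions = ["my-skill"] * 87 + ["competitor"] * 13
print(f"{selection_rate(decisions, 'my-skill'):.0%}")  # → 87%
```

The detailed reasoning attached to each decision is what turns that percentage into something actionable: it tells you which phrases in the winning description did the work.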
Benchmark, compare, and optimize your agent skills
Compare your skill directly against competitors in realistic agent scenarios.
See exactly how often AI agents choose your skill — with clear percentages.
Pit 3+ skills against each other. Get Elo rankings to see where you stand in your category.
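Elo is the standard pairwise rating scheme (as used in chess): each head-to-head decision moves the winner up and the loser down, weighted by how surprising the result was. A minimal sketch of how agent decisions could drive such rankings (illustrative only, not the actual skills-arena implementation):

```python
def elo_update(r_winner, r_loser, k=32):
    """Standard Elo update after one head-to-head decision."""
    expected_win = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    r_winner += k * (1 - expected_win)
    r_loser -= k * (1 - expected_win)
    return r_winner, r_loser

# Three skills start at 1000; each simulated agent decision
# (winner, loser) updates the two ratings involved.
ratings = {"skill-a": 1000.0, "skill-b": 1000.0, "skill-c": 1000.0}
for winner, loser in [("skill-a", "skill-b"),
                      ("skill-a", "skill-c"),
                      ("skill-b", "skill-c")]:
    ratings[winner], ratings[loser] = elo_update(ratings[winner],
                                                 ratings[loser])
```

After these three decisions, skill-a ranks highest and skill-c lowest; an upset win against a higher-rated skill moves ratings more than an expected one.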
AI-powered reasoning explains why your skill was chosen — or wasn't.
Run evaluations locally or in CI/CD. pip install skills-arena.
Test across Claude, GPT, and more to see how different agents evaluate your skill.
Run evaluations locally, in CI/CD, or anywhere Python runs.
"You spent months building a great product. You wrote a skill so agents can use it. But right now, inside thousands of terminals, an AI agent is reading your skill description next to your competitor's — and choosing theirs."
Two free evaluations — no signup needed.
Enter the Arena