AI Agents & Trust

Unlocking AI Potential

AI agents are reshaping how we work, handling repetitive tasks and freeing humans to focus on creative leaps. The thinking and judgment that only people bring are now at the forefront. This shift in how software gets built and used is one of the most exciting developments in the industry.

Building Trust in AI Systems

The ultimate enabler for this shift is trust. People must be able to hand real tasks to an agent and know it will do what it should, and nothing it shouldn’t. This trust can’t be built by any single team behind closed doors. It has to be earned collectively, in the open, by a community of researchers, engineers, and the genuinely curious.

The Fabraix Playground exists to make this effort tangible. Every challenge deploys a live AI agent (not a toy scenario or a mocked-up document parser, but an agent with real capabilities) and opens it up for the community to break. System prompts are published, and challenge configs are versioned in the open.

How the Playground Works

Each challenge puts a live AI agent in front of you with a specific persona, a set of tools, and something it’s been instructed to protect. The system prompt is fully visible. Your job is to find a way past the guardrails anyway. The community drives what gets tested: anyone can propose a challenge, including the scenario, the agent, and the objective.

The community votes

The top-voted challenge is considered for going live, with a ticking clock

The fastest successful jailbreak wins

The winning technique gets published — approach, reasoning, everything
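The cycle above can be sketched in TypeScript. This is only an illustration, not the Playground's actual schema: the `Challenge` and `Attempt` types, their field names, and both helper functions are hypothetical, chosen to mirror the steps listed.

```typescript
// Hypothetical shapes; the real challenge config format may differ.
interface Challenge {
  id: string;
  persona: string;            // who the agent plays
  tools: string[];            // capabilities the agent can call
  protectedObjective: string; // what the agent is instructed to protect
  votes: number;              // community votes for this proposal
}

interface Attempt {
  user: string;
  succeeded: boolean; // whether the attempt got past the guardrails
  elapsedMs: number;  // time from challenge start to this attempt
}

// Pick the proposal with the most votes; on a tie, the earliest one is kept.
function topVoted(proposals: Challenge[]): Challenge | undefined {
  return proposals.reduce<Challenge | undefined>(
    (best, c) => (best === undefined || c.votes > best.votes ? c : best),
    undefined,
  );
}

// The fastest successful jailbreak wins; failed attempts are ignored.
function fastestWinner(attempts: Attempt[]): Attempt | undefined {
  return attempts
    .filter((a) => a.succeeded)
    .sort((a, b) => a.elapsedMs - b.elapsedMs)[0];
}
```

Both helpers return `undefined` when there is nothing to choose from (no proposals, or no successful attempt), so a caller has to handle the "no winner yet" case explicitly.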

Advancing AI Security

Every technique we publish advances what the community collectively understands about how AI agents fail, and how to build ones that don’t. The project is structured to be open and collaborative, with a React frontend and challenge configs versioned in the open.

Guardrail evaluation runs server-side to prevent client-side tampering, and the agent runtime is being open-sourced separately. You can get involved by proposing a challenge, suggesting agent capabilities, or reporting bugs.

Getting Started

To run the project locally, install the dependencies and start the development server:

npm install

npm run dev

The frontend connects to the live API by default. To develop against a local backend, set the VITE_API_URL environment variable.
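As a rough sketch of how a frontend might honor that variable, the helper below falls back to a default when VITE_API_URL is unset or blank. The function name `resolveApiUrl` and the placeholder default URL are assumptions for illustration; the project's real default and lookup code are not shown here.

```typescript
// Placeholder fallback; the actual live API URL is not stated in this document.
const DEFAULT_API_URL = "https://api.example.invalid";

// Takes an env-like object so it works with Vite's import.meta.env
// and can also be exercised in isolation.
function resolveApiUrl(env: Record<string, string | undefined>): string {
  const override = env["VITE_API_URL"]?.trim();
  const base = override && override.length > 0 ? override : DEFAULT_API_URL;
  // Strip trailing slashes so request paths can be appended consistently.
  return base.replace(/\/+$/, "");
}
```

In a Vite app this would typically be called as `resolveApiUrl(import.meta.env)`, since Vite exposes variables prefixed with `VITE_` to client code. For local development the variable can be set in the shell or in a `.env.local` file, for example `VITE_API_URL=http://localhost:8000` (the port here is only an example).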

Join the discussion on Discord to share approaches and techniques with the community.

About Fabraix: We build runtime security for AI agents. The Playground is how we stress-test defenses in the open and how the broader community contributes to the shared understanding of AI security and failure modes.