Project Roadmap (ui_ui)

From API Tool to Universal Agentic Layer - Strategic Roadmap

Current State: API Conversation Platform

What We Have Today

UIUI currently transforms API documentation into conversational experiences. Users upload OpenAPI specs, ask questions in natural language, and receive both textual insights and custom visualizations. The system consists of four microservices (Forge, Atlas, Prism, Orchestrator) that handle the complete pipeline from documentation normalization to visualization generation. We've proven that complex data interfaces can be made conversational, demonstrating this with platforms like DefiLlama where users find our interface easier than the native one.

The core innovation is already here: the ability to understand intent, plan multi-step workflows, execute them reliably, and generate meaningful visual outputs. This infrastructure - not the API focus - is the real foundation for the agentic web.


Target State: The Agentic Internet

What We're Building Toward

Imagine opening your browser to a simple text input - like Google's homepage, but for actions, not searches. Type "Show me my AWS costs trending up" or "Find customers who haven't ordered in 30 days across all our tools" and get instant answers with visualizations. No need to know which app has the data, no need to navigate interfaces, no need to learn query languages.

The ultimate form might be uiui.wtf - a universal interface where every partner gets a subdomain (defillama.uiui.wtf, ahrefs.uiui.wtf) providing their users with a conversational layer over their existing product. Or it could be a browser extension that adds an agentic command bar to every website. Or both. The key is that UIUI becomes the translation layer between human intent and digital capability, regardless of the underlying application.


The Phased Evolution

Phase 1: API Excellence & Market Validation

  • Onboard 20-30 complex data platforms or other products (DefiLlama, Dune, analytics tools)

  • Perfect the conversational-to-visual pipeline:

    • Fully live workflows

    • Ability to change the workflow or the visualization

    • Partner automation pipeline: Forge integration into different apps; ability to view and manage UISpec, SDK, and tool catalog objects; support for partner-specific styles

    • Memory inside the conversation/workflow

  • Establish the *.uiui.wtf subdomain model with early partners

  • Build reputation as "the agentic layer for complex tools that makes them simple"

  • Generate revenue and usage data to validate demand

  • Model agnosticism: ability for users to plug in their own models (consider for future phases)

  • Better understanding of what specific APIs can or cannot do, with clear communication to users about capabilities and limitations

  • Launch UIUI Script Store (Beta): Community marketplace for sharing API workflows and automations

    • Users can publish their successful workflows as templates

    • Others can clone, modify, and improve shared scripts

    • Rating and review system for quality control

    • Start building network effects early

Key Milestones:


Phase 2: Beyond APIs - Web Understanding

  • Develop Sphere: Agent-friendly knowledge repository where agents share learned patterns about websites and domains. Like Stack Overflow for agents: each agent contributes what it learns, and all agents can query this collective knowledge. Could become a separate product if demand emerges from other companies.

  • Develop browser extension that understands current page context

    • DOM parsing as primary method (more reliable, structured)

    • OCR as fallback for images, PDFs, canvas elements (charts/graphs)

    • Hybrid approach: DOM for structure, OCR for visual-only content

    • Cost optimization: DOM first (cheap), OCR only when needed

  • Combine API calls with DOM reading (not manipulation yet)

  • Enable queries like "Summarize this page's data in a chart"

  • Introduce user-related memory: "Compare this to what I saw yesterday" - reimagining browser history as a queryable story of user interactions

  • Build webpage understanding without requiring API documentation
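The "DOM first, OCR only when needed" approach above can be sketched roughly as follows. This is an illustrative sketch only, not the extension's actual design: the names `read_page`, `DomExtractor`, and `VISUAL_ONLY_TAGS` are hypothetical, the set of visual-only tags is an assumption, and the OCR step is passed in as a plain function so that no particular OCR library is implied.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Callable, List

# Assumption for illustration: elements whose content cannot be read from the DOM.
VISUAL_ONLY_TAGS = {"img", "canvas", "embed"}
# Non-visible content that should not be treated as page text.
SKIPPED_TAGS = {"script", "style"}


@dataclass
class PageContent:
    text: List[str] = field(default_factory=list)         # cheap: read from the DOM
    ocr_targets: List[str] = field(default_factory=list)  # visual-only: needs OCR


class DomExtractor(HTMLParser):
    """Collects visible text and records visual-only elements for later OCR."""

    def __init__(self) -> None:
        super().__init__()
        self.content = PageContent()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in VISUAL_ONLY_TAGS:
            attr_map = dict(attrs)
            # Identify the element by src, then id, then tag name.
            label = attr_map.get("src") or attr_map.get("id") or tag
            self.content.ocr_targets.append(label)

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.content.text.append(data.strip())


def read_page(html: str, ocr: Callable[[str], str]) -> List[str]:
    """DOM first; the (expensive) OCR function runs only for visual-only elements."""
    extractor = DomExtractor()
    extractor.feed(html)
    extractor.close()
    result = list(extractor.content.text)
    for target in extractor.content.ocr_targets:
        result.append(ocr(target))
    return result
```

On a page with no images or canvas elements, `ocr` is never called, which is where the cost saving would come from under these assumptions.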

Key Capabilities to Build:


Phase 3: Read-Write Web Agents with Autopilot Mode

  • Introduce Autopilot Mode: Browser automation with two modes:

    • Full Autopilot: User states goal, agent executes completely autonomously ("Book me the cheapest flight to NYC next week")

    • Supervised Mode: Agent pauses at critical decisions for user confirmation (for example, "Found 3 flights, which one?" with the options shown)

  • Add DOM manipulation capabilities (form filling, clicking, navigation)

  • Enable complex workflows: "Fill out this form using data from that spreadsheet"

  • Automated Web Crawler: Agents explore entire websites autonomously, learning all capabilities and constraints, then uploading discoveries to Sphere for collective knowledge

  • Build library of common web patterns (login, search, filter, export)

  • Expand UIUI Script Store to include DOM-based automations (building on Phase 1 beta)
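The difference between Full Autopilot and Supervised Mode can be illustrated with a small sketch. Everything here is hypothetical: the `Step` and `Mode` types, the `critical` flag, and the `confirm` callback are assumptions made for illustration, and which steps count as "critical decisions" is not defined in this roadmap.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    FULL_AUTOPILOT = "full_autopilot"
    SUPERVISED = "supervised"


@dataclass
class Step:
    description: str
    critical: bool = False  # assumption: e.g. choosing a flight or submitting payment


def run_workflow(
    steps: List[Step],
    mode: Mode,
    confirm: Callable[[Step], bool],
) -> List[str]:
    """Run steps in order and return a log of what happened.

    In supervised mode the agent asks `confirm` before each critical step
    and stops if the user declines. In full autopilot it never asks.
    """
    log: List[str] = []
    for step in steps:
        if mode is Mode.SUPERVISED and step.critical:
            if not confirm(step):
                log.append(f"stopped before: {step.description}")
                break
        log.append(f"done: {step.description}")
    return log
```

In this sketch the two modes share one executor and differ only in whether the confirmation callback is consulted, which would keep workflows written for one mode usable in the other.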

Key Capabilities to Build:


Phase 4: The Agentic Protocol & Standards

  • Establish UIUI Protocol (UIP): Open standard for how websites expose capabilities to agents

    • Similar to how robots.txt tells crawlers what to index

    • uiui.json tells agents what actions are possible

    • Partners implement UIP for native agent integration

  • Launch uiui.wtf as "Google for Agents": Central search engine where users type intent and discover which services can fulfill it

  • Sphere becomes public infrastructure: Other companies' agents can query and contribute

  • Build partnerships for native integration (no extension needed for UIP-compliant sites)

  • Create certification program: "UIUI Ready" badge for websites
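To make the uiui.json idea more concrete, here is a rough sketch of what a manifest and an agent-side check might look like. The roadmap does not define a schema, so every field name below (`version`, `actions`, `name`, `description`, `params`) and both helper functions are assumptions for illustration only.

```python
import json
from typing import Dict, List, Mapping

# Hypothetical manifest: UIP does not yet specify a uiui.json schema.
EXAMPLE_MANIFEST = """
{
  "version": "0.1",
  "actions": [
    {"name": "search_orders",
     "description": "Find orders by customer or date",
     "params": ["customer", "since"]},
    {"name": "export_csv",
     "description": "Export the current table as CSV",
     "params": []}
  ]
}
"""


def load_actions(manifest_text: str) -> Dict[str, List[str]]:
    """Map each declared action name to the list of its parameter names."""
    manifest = json.loads(manifest_text)
    return {
        action["name"]: list(action.get("params", []))
        for action in manifest.get("actions", [])
    }


def can_perform(
    actions: Mapping[str, List[str]],
    name: str,
    supplied: Mapping[str, str],
) -> bool:
    """True if the site declares the action and every listed param is supplied."""
    return name in actions and all(p in supplied for p in actions[name])
```

An agent could run a check like `can_perform` before attempting an action, so that an unsupported request fails with a clear message instead of a broken interaction.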

Key Capabilities to Build:


Phase 5: The Agentic Browser

  • Launch UIUI Browser: Purpose-built for agent-first interaction

    • Text input as primary navigation (URL bar replaced by intent bar)

    • Native Autopilot Mode built-in

    • Sphere knowledge integrated

    • All web history becomes queryable agent memory

  • Every website automatically works through natural language

  • Traditional clicking/typing becomes secondary interaction method

  • Agent learns user preferences and automates routine tasks

End State Markers:


Key Strategic Decisions

1. Distribution Strategy

Decision: Both browser extension AND standalone site

  • Extension for seamless integration with any website

  • uiui.wtf as central hub and discovery engine

  • Partners choose their integration depth

2. Brand Strategy

Decision: Invisible infrastructure (like Google's approach)

  • UIUI brand for marketing and awareness

  • But actual usage feels native to partner sites

  • Users might not even know they're using UIUI

3. Ecosystem Evolution

Decision: Curated → Open with Standards

  • Phase 1-2: Carefully selected partners for quality control

  • Phase 3: Open with approval process

  • Phase 4: Fully open with UIUI Protocol (UIP) standards

4. Pricing Philosophy

Decision: Agent-as-Commodity Model

  • Users pay for agent compute as base cost

  • UIUI adds premium for orchestration value

  • Vision: This becomes THE standard way to pay for AI agents

  • Phase 1: We absorb Claude costs to prove value

  • Phase 2+: Transparent pass-through + margin
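As a worked illustration of "pass-through + margin", the sketch below computes the model-compute line item billed to a user. It assumes the margin is a percentage of compute cost and that the absorbed Phase 1 cost means this line item is zero; neither is specified above, and the function name and the 25% rate used in the example are hypothetical.

```python
from decimal import Decimal


def compute_line_item(compute_cost: Decimal, margin_rate: Decimal, phase: int) -> Decimal:
    """Model-compute charge billed to the user, rounded to cents.

    Assumptions: in Phase 1 the compute cost is absorbed (charge is zero);
    from Phase 2 on, the charge is cost plus a percentage margin.
    """
    if phase <= 1:
        return Decimal("0.00")
    total = compute_cost * (Decimal("1") + margin_rate)
    return total.quantize(Decimal("0.01"))


# Hypothetical numbers: $2.00 of compute with a 25% orchestration margin.
phase_1_charge = compute_line_item(Decimal("2.00"), Decimal("0.25"), phase=1)
phase_2_charge = compute_line_item(Decimal("2.00"), Decimal("0.25"), phase=2)
```

With these illustrative inputs the Phase 2 charge is 2.00 × 1.25 = 2.50, of which 2.00 is the transparent pass-through and 0.50 is the margin.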

5. Target Audience

Decision: Dual-Market from Day One

  • Consumers: Simple access to complex tools

  • Developers: Powerful automation without building infrastructure

  • Same product, different messaging

  • Let usage patterns guide future focus

6. Model Strategy

Decision: Model Agnostic Future

  • Phase 1-2: Claude/GPT for quality and simplicity

  • Phase 3: BYO model for enterprises

  • Phase 4-5: Personal models for privacy-conscious users

  • Long-term: Every user has their own fine-tuned agent


The North Star

Every web interaction should be as simple as having a conversation. Whether it's checking your AWS costs, analyzing customer data, or comparing prices across sites - one text box, natural language, instant results. UIUI becomes the universal translator between human intent and digital capability.

The path from API tool to agentic layer isn't just about adding features. It's about fundamentally changing how humans interact with software. We're not building a better interface. We're eliminating interfaces altogether.
