# Project Roadmap (ui\_ui)

## From API Tool to Universal Agentic Layer - Strategic Roadmap

### Current State: API Conversation Platform

#### What We Have Today

UIUI currently transforms API documentation into conversational experiences. Users upload OpenAPI specs, ask questions in natural language, and receive both textual insights and custom visualizations. The system consists of four microservices (Forge, Atlas, Prism, Orchestrator) that handle the complete pipeline from documentation normalization to visualization generation. We've proven that complex data interfaces can be made conversational, demonstrating this with platforms like DefiLlama where users find our interface easier than the native one.

The core innovation is already here: the ability to understand intent, plan multi-step workflows, execute them reliably, and generate meaningful visual outputs. This infrastructure - not the API focus - is the real foundation for the agentic web.

***

### Target State: The Agentic Internet

#### What We're Building Toward

Imagine opening your browser to a simple text input - like Google's homepage, but for actions rather than searches. Type "Show me my AWS costs trending up" or "Find customers who haven't ordered in 30 days across all our tools" and get instant answers with visualizations. No need to know which app has the data, no need to navigate interfaces, no need to learn query languages.

The ultimate form might be **uiui.wtf** - a universal interface where every partner gets a subdomain (defillama.uiui.wtf, ahrefs.uiui.wtf) providing their users with a conversational layer over their existing product. Or it could be a browser extension that adds an agentic command bar to every website. Or both. The key is that UIUI becomes the translation layer between human intent and digital capability, regardless of the underlying application.

***

### The Phased Evolution

#### Phase 1: API Excellence & Market Validation

* Onboard 20-30 complex data platforms or other products (DefiLlama, Dune, analytics tools)
* Perfect the conversational-to-visual pipeline:
  * Fully live workflows
  * Ability to change the workflow or the visualization
  * Partner automation pipeline (Forge integration into different apps; ability to view and manage UISpec, SDK, and tool catalog objects; support for partner-specific styles)
  * Memory inside the conversation/workflow
* Establish the \*.uiui.wtf subdomain model with early partners
* Build reputation as "the agentic layer for complex tools that makes them simple"
* Generate revenue and usage data to validate demand
* Model agnosticism - ability for users to plug in their own models (consider for future phases)
* Better understanding of what specific APIs can or cannot do, with clear communication to users about capabilities and limitations
* **Launch UIUI Script Store (Beta)**: Community marketplace for sharing API workflows and automations
  * Users can publish their successful workflows as templates
  * Others can clone, modify, and improve shared scripts
  * Rating and review system for quality control
  * Start building network effects early

**Key Milestones:**

* [ ] 10,000 queries/day across all platforms
* [ ] 3 flagship partners using subdomain model
* [ ] $50K MRR from API simplification alone

***

#### Phase 2: Beyond APIs - Web Understanding

* **Develop Sphere**: Agent-friendly knowledge repository where agents share learned patterns about websites and domains. Like Stack Overflow for agents - each agent contributes what it learns, all agents can query this collective knowledge. Potential separate product if demand emerges from other companies.
* Develop browser extension that understands current page context
  * DOM parsing as primary method (more reliable, structured)
  * OCR as fallback for images, PDFs, canvas elements (charts/graphs)
  * Hybrid approach: DOM for structure, OCR for visual-only content
  * Cost optimization: DOM first (cheap), OCR only when needed
* Combine API calls with DOM reading (not manipulation yet)
* Enable queries like "Summarize this page's data in a chart"
* Introduce user-related memory: "Compare this to what I saw yesterday" - reimagining browser history as a queryable story of user interactions
* Build webpage understanding without requiring API documentation
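
The DOM-first, OCR-as-fallback idea above can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the extension's actual design: the `PageScanner` class, the `plan_extraction` function, and the list of "visual-only" tags are all assumptions made for the example. It reads text cheaply from the DOM and only records which elements would need the more expensive OCR pass.

```python
from html.parser import HTMLParser

# Assumed for illustration: tags whose content cannot be read as DOM text.
VISUAL_ONLY_TAGS = {"img", "canvas", "embed", "object"}
# Tags whose text is code or styling rather than page content.
SKIPPED_TAGS = {"script", "style"}


class PageScanner(HTMLParser):
    """Collects readable DOM text and notes elements that would need OCR."""

    def __init__(self):
        super().__init__()
        self.text_chunks = []
        self.ocr_targets = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in VISUAL_ONLY_TAGS:
            attr_map = dict(attrs)
            # Identify the element by src, then id, then tag name.
            self.ocr_targets.append(attr_map.get("src") or attr_map.get("id") or tag)

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_chunks.append(data.strip())


def plan_extraction(html):
    """DOM first (cheap); list OCR targets only for visual-only elements."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    return {"dom_text": scanner.text_chunks, "ocr_targets": scanner.ocr_targets}
```

Under these assumptions, a page with no images or canvas elements produces an empty `ocr_targets` list, so the OCR step is never invoked for it.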

**Key Capabilities to Build:**

* [ ] Sphere v1: Shared agent knowledge base
* [ ] Visual recognition of data on any webpage (DOM + OCR hybrid)
* [ ] Automatic schema inference from HTML tables/lists
* [ ] Cross-tab context awareness
* [ ] Session memory and comparison
* [ ] User journey recording and replay
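
As one possible shape for "automatic schema inference from HTML tables", the sketch below reads the first header row and the data rows of a table and guesses a type per column. The names `TableReader`, `infer_type`, and `infer_schema`, the three type labels, and the comma-stripping rule are assumptions for illustration; real pages would need handling for merged cells, units, dates, and missing headers.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Reads table rows; the first row containing <th> cells becomes the header."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._row = None
        self._cell = None
        self._row_is_header = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._row_is_header = [], False
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            if tag == "th":
                self._row_is_header = True

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row_is_header and not self.headers:
                self.headers = self._row
            elif self._row:
                self.rows.append(self._row)
            self._row = None


def infer_type(values):
    """Returns 'integer', 'number', or 'string' for a column of cell strings."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value.replace(",", ""))  # tolerate thousands separators
            return True
        except ValueError:
            return False

    if not values:
        return "string"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "number"
    return "string"


def infer_schema(html):
    reader = TableReader()
    reader.feed(html)
    reader.close()
    schema = {}
    for index, name in enumerate(reader.headers):
        column = [row[index] for row in reader.rows if index < len(row)]
        schema[name] = infer_type(column)
    return schema
```

The inferred schema could then feed the same visualization pipeline that API responses use today, which is the point of treating a web page as an undocumented data source.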

***

#### Phase 3: Read-Write Web Agents with Autopilot Mode

* **Introduce Autopilot Mode**: Browser automation with two modes:
  * Full Autopilot: User states goal, agent executes completely autonomously ("Book me the cheapest flight to NYC next week")
  * Supervised Mode: Agent pauses at critical decisions for user confirmation ("Found 3 flights, which one?" *shows options*)
* Add DOM manipulation capabilities (form filling, clicking, navigation)
* Enable complex workflows: "Fill out this form using data from that spreadsheet"
* **Automated Web Crawler**: Agents explore entire websites autonomously, learning all capabilities and constraints, then uploading discoveries to Sphere for collective knowledge
* Build library of common web patterns (login, search, filter, export)
* Expand UIUI Script Store to include DOM-based automations (building on Phase 1 beta)
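
The difference between Full Autopilot and Supervised Mode can be illustrated with a small sketch. Everything here is hypothetical: the `Step` type, its `critical` flag, and the `confirm` callback stand in for whatever the real planner and confirmation UI turn out to be. The only behavior taken from the roadmap is that Supervised Mode pauses at critical decisions while Full Autopilot does not.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    description: str
    critical: bool = False  # hypothetical flag: needs user sign-off when supervised


def run_plan(
    steps: List[Step],
    supervised: bool,
    confirm: Callable[[Step], bool],
) -> List[Tuple[str, str]]:
    """Runs steps in order and returns a log of (description, status) pairs."""
    log = []
    for step in steps:
        # Only Supervised Mode asks the user, and only at critical steps.
        if supervised and step.critical and not confirm(step):
            log.append((step.description, "declined"))
            break  # later steps may depend on the declined one, so stop here
        log.append((step.description, "executed"))
    return log


plan = [
    Step("search flights"),
    Step("purchase ticket", critical=True),
    Step("email receipt"),
]
```

Calling `run_plan(plan, supervised=False, confirm=...)` executes every step without asking, while `supervised=True` routes the purchase through `confirm` first. Stopping at the first declined step is a design choice of this sketch, not something the roadmap specifies.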

**Key Capabilities to Build:**

* [ ] Autopilot Mode: Full autopilot and supervised options
* [ ] Secure DOM manipulation framework
* [ ] Action recording and replay
* [ ] Cross-site workflow orchestration
* [ ] Community script marketplace
* [ ] Web crawler for automatic site understanding

***

#### Phase 4: The Agentic Protocol & Standards

* **Establish UIUI Protocol (UIP)**: Open standard for how websites expose capabilities to agents
  * Similar to how robots.txt tells crawlers what to index
  * uiui.json tells agents what actions are possible
  * Partners implement UIP for native agent integration
* **Launch uiui.wtf as "Google for Agents"**: Central search engine where users type intent and discover which services can fulfill it
* **Sphere becomes public infrastructure**: Other companies' agents can query and contribute
* Build partnerships for native integration (no extension needed for UIP-compliant sites)
* Create certification program: "UIUI Ready" badge for websites
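
To make the `uiui.json` idea more concrete, here is a purely hypothetical manifest and a few lines showing how an agent might read it. No UIP specification exists yet, so every field name below (`version`, `actions`, `name`, `method`, `path`, `requires_confirmation`) and both helper functions are assumptions invented for this sketch.

```python
import json

# Hypothetical manifest a site might serve at /uiui.json (schema is assumed).
EXAMPLE_MANIFEST = """
{
  "version": "0.1",
  "actions": [
    {"name": "search_flights", "method": "GET", "path": "/api/flights",
     "requires_confirmation": false},
    {"name": "book_flight", "method": "POST", "path": "/api/bookings",
     "requires_confirmation": true}
  ]
}
"""


def load_actions(manifest_text):
    """Indexes the declared actions by name."""
    manifest = json.loads(manifest_text)
    return {action["name"]: action for action in manifest.get("actions", [])}


def can_perform(actions, name):
    """An agent should only attempt actions the site has declared."""
    return name in actions


actions = load_actions(EXAMPLE_MANIFEST)
```

In this sketch an undeclared action such as `delete_account` is simply refused, mirroring how a crawler respects paths that robots.txt disallows.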

**Key Capabilities to Build:**

* [ ] UIUI Protocol specification v1.0
* [ ] uiui.wtf search engine for agent-compatible services
* [ ] Developer tools for UIP implementation
* [ ] Sphere API for third-party agents
* [ ] Certification and validation system

***

#### Phase 5: The Agentic Browser

* **Launch UIUI Browser**: Purpose-built for agent-first interaction
  * Text input as primary navigation (URL bar replaced by intent bar)
  * Native Autopilot Mode built-in
  * Sphere knowledge integrated
  * All web history becomes queryable agent memory
* Every website automatically works through natural language
* Traditional clicking/typing becomes secondary interaction method
* Agent learns user preferences and automates routine tasks

**End State Markers:**

* [ ] UIUI Browser reaches 1M+ daily active users
* [ ] 50%+ of user web interactions happen through natural language
* [ ] Major websites implement UIP natively
* [ ] "Agentic layer" is recognized as a candidate for W3C standardization

***

### Key Strategic Decisions

#### 1. **Distribution Strategy**

**Decision: Both browser extension AND standalone site**

* Extension for seamless integration with any website
* uiui.wtf as central hub and discovery engine
* Partners choose their integration depth

#### 2. **Brand Strategy**

**Decision: Invisible infrastructure (like Google's approach)**

* UIUI brand for marketing and awareness
* But actual usage feels native to partner sites
* Users might not even know they're using UIUI

#### 3. **Ecosystem Evolution**

**Decision: Curated → Open with Standards**

* Phase 1-2: Carefully selected partners for quality control
* Phase 3: Open with approval process
* Phase 4: Fully open with UIUI Protocol (UIP) standards

#### 4. **Pricing Philosophy**

**Decision: Agent-as-Commodity Model**

* Users pay for agent compute as base cost
* UIUI adds premium for orchestration value
* Vision: This becomes THE standard way to pay for AI agents
* Phase 1: We absorb Claude costs to prove value
* Phase 2+: Transparent pass-through + margin
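
The arithmetic behind "pass-through + margin" might look like the sketch below. The 25% margin and the function name `billed_compute` are placeholders chosen for illustration, not proposed pricing; the function covers only the model-compute portion of a bill and assumes Phase 1 charges nothing for it because those costs are absorbed.

```python
def billed_compute(compute_cost, phase, margin_rate=0.25):
    """Compute-related charge for one billing period.

    compute_cost: raw model cost incurred for the user, in dollars.
    phase: roadmap phase number (1 absorbs the cost, 2+ passes it through).
    margin_rate: hypothetical orchestration premium, as a fraction.
    """
    if phase == 1:
        return 0.0  # model costs are absorbed to prove value
    return round(compute_cost * (1 + margin_rate), 2)
```

For example, with the placeholder margin, $2.00 of model compute in Phase 2 would be billed at $2.50, with the $2.00 pass-through and the $0.50 premium shown separately to keep the pricing transparent.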

#### 5. **Target Audience**

**Decision: Dual-Market from Day One**

* Consumers: Simple access to complex tools
* Developers: Powerful automation without building infrastructure
* Same product, different messaging
* Let usage patterns guide future focus

#### 6. **Model Strategy**

**Decision: Model Agnostic Future**

* Phase 1-2: Claude/GPT for quality and simplicity
* Phase 3: BYO model for enterprises
* Phase 4-5: Personal models for privacy-conscious users
* Long-term: Every user has their own fine-tuned agent

***

### The North Star

Every web interaction should be as simple as having a conversation. Whether it's checking your AWS costs, analyzing customer data, or comparing prices across sites - one text box, natural language, instant results. UIUI becomes the universal translator between human intent and digital capability.

The path from API tool to agentic layer isn't just about adding features. It's about fundamentally changing how humans interact with software. We're not building a better interface. We're eliminating interfaces altogether.


