Blog / Engineering
Building an AI copilot inside your Tiptap text editor

How to add advanced AI editing to your Tiptap text editor with tool calls, realtime streaming, and diffs

November 21st · 10 min read

When most SaaS teams talk about adding AI to their product, they think of a sidebar assistant that drafts text or a modal that pastes content into the editor. But a true AI copilot doesn’t live outside the editor—it lives within it. It understands the document, its structure, and the user’s intent, and it can edit, reorganize, and reason over complex content.

That’s what we built when working with our client, Distribute, a go-to-market platform with a collaborative document editor. Here’s what we learned while turning a traditional editor into an AI-native workspace.

The editor

Distribute’s text editor uses a structured, schema-based document system that supports custom layouts, tabbed sections, and user mentions. To make AI truly useful in that context, we needed more than a content generator. The copilot had to:

  • Understand the entire document structure (not just current selection)
  • Modify any part of it, from rewriting a paragraph to merging two tabs
  • Preserve strict schema validity and contextual placeholders
  • Respect the collaboration model — users must see, review, and accept changes safely

The result: a copilot that feels native to the editor, not bolted on top.

The Distribute editor with the copilot: inline diffs in the editor, and the AI chat with its accept/reject dialog

Key technical challenges

Working with a strict document model

Distribute’s editor runs on Tiptap, built over ProseMirror, meaning every node and mark is validated against a schema. That’s great for consistency but hard for AI, since large language models are much more comfortable producing text than tree structures.

Our first idea was to ask the LLM to generate a stream of edit operations referencing node IDs. In practice, this proved unstable—the model could easily lose track of structure, break references, or output malformed operations.

So we inverted the problem. Instead of operations, the model outputs a complete edited document, but written in a restricted JSX-style markup that mirrors our schema. Alongside it, the model includes a short meta-comment explaining what it changed and why.

We then diff this version against the user’s original document to extract precise edits.

// System prompt for AI copilot
const SYSTEM_PROMPT = `# RESPONSE FORMAT
Return JSON with two fields:
{
  "doc": "<complete edited document in allowed markup>",
  "comment": "<brief explanation of changes>"
}

# ALLOWED MARKUP ELEMENTS
- Structure: <Doc>, <Tab id="x" name="Y">
- Text: <Paragraph>, <Heading level="1-3">, <HardBreak>
- Formatting: <Bold>, <Italic>, <Strike>, <Underline>
- Lists: <BulletList>, <OrderedList>, <ListItem>
- Special: <CustomLink href="...">, <CustomImage src="...">, <ContactCard>
- NEVER modify: <ContactCard>, <CustomTaskList>, <Button>, <VideoRecord>

# RULES
- Return COMPLETE document, preserving all unchanged content
- Keep all node attributes (ids, alignment, etc.)
- Never invent new tags or attributes
- Preserve all special blocks exactly as provided`;

// JSX-style document
const documentKnowledge = `<Doc>
  <Tab id="tab1" name="Overview">
    <Paragraph>Current proposal draft...</Paragraph>
  </Tab>
</Doc>`;

// Document context
const metadataKnowledge = `{
  "title": "Enterprise Sales Proposal",
  "userName": "John",
  "userCompany": "Acme Corp"
}`;
“We realized that LLMs are great at rewriting but terrible at patching. Once we gave it a strict language and compared outputs ourselves, everything clicked.”
Myron Mavko, Founder at Flexum

This design gave us reliability without sacrificing flexibility.
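One way to make this contract enforceable is to validate the model’s output against an allow-list of tags before diffing ever begins. Here is a minimal sketch of that idea; `validateMarkup` is a hypothetical helper, not part of Distribute’s codebase, and the tag set is a subset of the one in the prompt above:

```typescript
// Reject any tag outside the allowed set before the diffing
// stage ever sees the model's output.
const ALLOWED_TAGS = new Set([
  "Doc", "Tab", "Paragraph", "Heading", "HardBreak",
  "Bold", "Italic", "Strike", "Underline",
  "BulletList", "OrderedList", "ListItem",
  "CustomLink", "CustomImage", "ContactCard",
]);

function validateMarkup(doc: string): { valid: boolean; badTag?: string } {
  // Match opening/closing tag names, e.g. <Tab id="x"> or </Tab>
  const tagPattern = /<\/?([A-Za-z]+)[^>]*>/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(doc)) !== null) {
    if (!ALLOWED_TAGS.has(match[1])) {
      return { valid: false, badTag: match[1] };
    }
  }
  return { valid: true };
}
```

If validation fails, the simplest recovery is to re-prompt the model with the offending tag named, rather than trying to repair the markup yourself.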

Extended-text diffing

Diffing two structured documents is hard. Naive character/word diffs ignore node boundaries and semantics; purely tree-based diffs often treat intra‑node text edits as whole‑node replacements, obscuring the minimal change set. We needed both views at once. We built a hybrid method we call extended-text diffing.

The idea is to flatten the document into a text-like sequence but preserve structure as metadata. We convert every visible and invisible element into a token—even visual blocks like images or dividers get symbolic representations. Each token carries metadata about its original formatting and position in the tree.

This lets us:

  • Run a classic text-diff algorithm (LCS-style) on the flattened form.
  • Reconstruct a structured document with semantic awareness.
  • Label every node as unchanged, added, updated, or removed.

The outcome is a perfect balance: simple diff logic, yet structure-aware precision.

Extended-text diffing (diagram)

// 1) Flatten to extended-text tokens
const tokens = flattenToExtendedText(editorState); // emits chars + meta tokens: EOB, IMG, HR, MENTION(uid)

// 2) Diff with LCS-like algorithm
const diff = lcsDiff(tokens, tokensFrom(editedDoc)); // yields ops: keep/add/remove/update

// 3) Rebuild structured doc with suggestions
const suggested = rebuildWithDiff(editorSchema, originalDoc, diff);
renderDiffView(suggested); // highlights added/updated/removed
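To make the LCS step concrete, here is a rough, self-contained illustration of the approach—not Distribute’s actual implementation. Real tokens carry formatting metadata, but plain strings are enough to show how a classic dynamic-programming LCS walk classifies tokens as keep/add/remove:

```typescript
type DiffOp = { op: "keep" | "add" | "remove"; token: string };

// Classic LCS diff over flattened token sequences.
function lcsDiff(a: string[], b: string[]): DiffOp[] {
  const n = a.length, m = b.length;
  // dp[i][j] = length of the LCS of a[i..] and b[j..]
  const dp: number[][] = Array.from({ length: n + 1 }, () =>
    new Array<number>(m + 1).fill(0)
  );
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      dp[i][j] = a[i] === b[j]
        ? dp[i + 1][j + 1] + 1
        : Math.max(dp[i + 1][j], dp[i][j + 1]);
    }
  }
  // Walk the table to emit keep/remove/add operations.
  const ops: DiffOp[] = [];
  let i = 0, j = 0;
  while (i < n && j < m) {
    if (a[i] === b[j]) {
      ops.push({ op: "keep", token: a[i] }); i++; j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      ops.push({ op: "remove", token: a[i] }); i++;
    } else {
      ops.push({ op: "add", token: b[j] }); j++;
    }
  }
  while (i < n) ops.push({ op: "remove", token: a[i++] });
  while (j < m) ops.push({ op: "add", token: b[j++] });
  return ops;
}
```

Because block boundaries and embeds are tokens too (EOB, IMG, and so on), an inserted paragraph break shows up as a single added token, which the rebuild step can map back onto the tree.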

Streaming responses in realtime

Waiting for the full model response before showing anything would make the experience feel sluggish. Instead, we stream tokens as they arrive and immediately generate diffs on that partial content so the user can watch the copilot’s work unfold in realtime.

The challenge is that a streamed document is incomplete, so the temporary tail of the diff may falsely appear to be removed text simply because those tokens haven’t arrived yet. To keep the experience consistent, we hide that trailing removal fragment until streaming completes, revealing the full text once the model finishes.

This way, users see responsive, continuously updating feedback with accurate diffs from the first second to the last.

Streaming preview behavior (video)

Our AI App Builder example demos this—the trailing code stays visible, rather than flashing as removed, until the stream completes.

function generateDiff(
  originalDoc: JSONContent,
  editedDoc: JSONContent,
  options = { hideTrailingRemovals: false }
) {
  const originalTokens = flattenToExtendedText(originalDoc);
  const editedTokens = flattenToExtendedText(editedDoc);

  let diff = computeDiff(originalTokens, editedTokens);

  // Hide trailing deletions for partial/streaming diffs
  if (options.hideTrailingRemovals) {
    diff = hideTrailingRemovals(diff);
  }

  return rebuildWithDiff(diff);
}
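The trailing-removal suppression itself can be very small: walk backward from the end of the op list and drop removals until the first kept or added op. A minimal sketch, assuming the same keep/add/remove op shape as above:

```typescript
type DiffOp = { op: "keep" | "add" | "remove"; token: string };

// During streaming, the tail of the original document looks "removed"
// only because those tokens haven't arrived yet. Strip that artifact.
function hideTrailingRemovals(ops: DiffOp[]): DiffOp[] {
  let end = ops.length;
  while (end > 0 && ops[end - 1].op === "remove") end--;
  return ops.slice(0, end);
}
```

Note this only touches the tail: removals in the middle of the document are genuine edits the model has already streamed past, so they stay visible.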

Collaboration and privacy

In a shared document, AI should never surprise teammates. Our rule: preview locally, publish deliberately.

Here’s the workflow:

  • When a user invokes the copilot, their editor switches to a read‑only diff view of the proposed changes.
  • Other collaborators continue editing as usual; their updates keep syncing and are immediately reflected in the user’s diff view.
  • The user can review, chat, and iterate as needed. Accept applies the change set and publishes it to everyone as that user’s edits; Reject discards it.

This keeps the session private without pausing collaboration or forking state. The implementation is a custom diff renderer inside the editor—no paid editor plugins, no special server mode. It works with any strict‑schema editor and any real‑time layer (e.g., Liveblocks) because it’s just view state plus deterministic apply/revert.

// Copilot UI component
function AICopilotPreview({ diffDoc, onAccept, onReject }) {
  return (
    <div className="copilot-preview">
      {/* Read-only editor showing diff */}
      <TiptapEditor
        content={diffDoc}
        editable={false}
        extensions={[
          // Render insertions in green, deletions in red with strikethrough
          DiffMark.configure({
            HTMLAttributes: {
              "data-insert": { class: "bg-green-100 text-green-900" },
              "data-delete": { class: "bg-red-100 text-red-900 line-through" },
              "data-update": { class: "bg-blue-100 text-blue-900" },
            },
          }),
        ]}
      />

      {/* Controls */}
      <div className="controls">
        <button onClick={() => onAccept()}>✓ Accept Changes</button>

        {/* Discard suggestions, return to original */}
        <button onClick={() => onReject()}>✗ Reject</button>
      </div>
    </div>
  );
}

This handles the diff preview outside of the AI chat, but we need to build the AI chat itself, with its own accept/reject dialog.

Building an advanced AI chat

Most AI chat libraries are built on HTTP streaming, which is prone to breaking due to network issues, timeouts, and other factors. Additionally, it’s difficult to handle front-end tool calls, confirmation flows, and realtime feedback with HTTP streaming, which is why we recommend using WebSockets instead of HTTP for AI chats.

The ready-made Liveblocks AiChat component handles all this, rendering a realtime chat interface, and you can add the current document state to the chat as knowledge, so it understands the starting point before it makes changes.

import { AiChat, AiTool } from "@liveblocks/react-ui";
import { RegisterAiKnowledge } from "@liveblocks/react";
import { useEditor } from "@tiptap/react";
import { EditDocumentTool } from "./EditDocumentTool";

function Chat({ chatId, metadata }) {
  const editor = useEditor();

  return (
    <>
      {/* The Liveblocks AI chat component */}
      <AiChat chatId={chatId} />

      {/* Giving the AI chat the current document */}
      <RegisterAiKnowledge
        description="The current document"
        value={editor.getJSON()}
      />

      {/* Giving the AI chat the document's metadata */}
      <RegisterAiKnowledge
        description="The current metadata"
        value={metadata}
      />

      {/* See below */}
      <EditDocumentTool />
    </>
  );
}

Alongside this, you need a human-in-the-loop tool that lets the AI edit the document and stream changes in realtime, with a confirmation flow so the user can accept or reject them.

import { AiTool } from "@liveblocks/react-ui";
import { RegisterAiTool } from "@liveblocks/react";
import { defineAiTool } from "@liveblocks/client";
import { useEditor } from "@tiptap/react";
import { applyDiff, generateDiff, DiffPreview } from "src/utils/diff";

function EditDocumentTool() {
  const editor = useEditor();

  // A tool that allows AI to edit the document
  return (
    <RegisterAiTool
      name="edit-document"
      tool={defineAiTool()({
        description: "Edit a document",
        parameters: {
          type: "object",
          properties: {
            doc: { type: "string" },
            comment: { type: "string" },
          },
          required: ["doc", "comment"],
          additionalProperties: false,
        },
        execute: () => {},
        render: ({ stage, args, partialArgs, result, types }) => {
          const diff = generateDiff(editor.getJSON(), partialArgs.doc, {
            hideTrailingRemovals: stage !== "executed",
          });

          // Document is streaming in, update diff in the editor
          if (stage === "receiving") {
            __showDiffInEditor__(diff);

            // Show the current diff in the chat
            return (
              <AiTool title="Modifying document…">
                <DiffPreview diff={diff} />
              </AiTool>
            );
          }

          // Document has fully streamed in, show confirm/cancel dialog in chat
          if (stage === "executing") {
            __showDiffInEditor__(diff);

            return (
              <AiTool title="Modifying document…">
                <AiTool.Confirmation
                  types={types}
                  confirm={async () => {
                    // On confirm, apply changes to the document,
                    // remove in-editor diff preview
                    const updatedDoc = applyDiff(editor.getJSON(), args.doc);
                    editor.commands.setContent(updatedDoc);
                    __showDiffInEditor__(null);

                    return {
                      data: { success: true },
                      description: "The user accepted the changes",
                    };
                  }}
                  cancel={() => {
                    // On cancel, remove in-editor diff preview
                    __showDiffInEditor__(null);

                    return {
                      data: { success: false },
                      description: "The user rejected the changes",
                    };
                  }}
                >
                  <DiffPreview diff={diff} />
                </AiTool.Confirmation>
              </AiTool>
            );
          }

          // stage === "executed"
          // Dialog has been clicked, show collapsed final diff
          return (
            <AiTool
              title={
                result.data.success ? "Document updated" : "Update cancelled"
              }
              collapsed={true}
            >
              <DiffPreview diff={diff} />
            </AiTool>
          );
        },
      })}
    />
  );
}

Results

The editor now supports full-document understanding and structural manipulation through AI, enabling copilot suggestions that can modify entire layouts while keeping schema validity intact. Users can review changes inline, accept them safely, and continue collaborating instantly as AI responses stream in real time, creating a seamless, engaging authoring experience.

Lessons learned

  1. LLMs prefer rewriting, not patching. Let them produce a clean new version and control structure externally.
  2. Schema safety comes first. A strict markup contract keeps your data valid and predictable.
  3. Diffing is where magic happens. Our extended-text diffing bridged language and structure elegantly.
  4. Streaming adds life without lag. Show diffs as tokens arrive; suppress only the trailing‑removal artifact until the stream completes.
  5. Collaboration needs isolation. Private layers preserve UX trust without breaking realtime sync.
  6. WebSockets instead of HTTP. WebSockets reliably enable more advanced features like tool calls, confirmation flows, and realtime feedback.

Takeaways for builders

Embedding AI in structured, collaborative editors isn’t about prompt crafting; it’s about architecture. Once you own your schema, diffing, and collaboration layer, AI can operate natively within your product.

Whether you stream your own model output or use Liveblocks’ ready-made AiChat component for collaborative streaming, the principle is the same—responsiveness, control, and trust.

Collaboration · AI · Text editing
