Confidence Indicators
Visual signals communicating AI certainty and uncertainty
The core tension: transparency demands that users know when the AI is uncertain, but surfacing that uncertainty the wrong way creates anxiety, false precision, or decision paralysis. The right approach depends entirely on what's at stake and what the user can do about it.
Source Code
'use client'
import { Button } from '@/components/ui/button'
import { Popover, PopoverContent, PopoverTrigger } from '@/components/ui/popover'
import { Separator } from '@/components/ui/separator'
import { cn } from '@/lib/utils'
import { Search } from 'lucide-react'
import { useState } from 'react'
type ConfidenceLevel = 'low' | 'medium' | 'high'
interface MissingContext {
id: string
text: string
}
interface ConfidenceIndicatorProps {
level?: ConfidenceLevel
missingContext?: MissingContext[]
onFindContext?: () => void
}
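// Label and bar colors per level: filled bars use the level color, unfilled bars stay neutral grey.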
const LEVEL_CONFIG = {
low: {
label: 'Low',
colors: ['bg-[#7d0000]', 'bg-[#475467]', 'bg-[#475467]'],
},
medium: {
label: 'Medium',
colors: ['bg-[#844600]', 'bg-[#844600]', 'bg-[#475467]'],
},
high: {
label: 'High',
colors: ['bg-[#12651a]', 'bg-[#12651a]', 'bg-[#12651a]'],
},
} as const
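// Compact trigger: three small bars that preview the confidence level at a glance.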
function ConfidenceLevelTrigger({ level, isSelected }: { level: ConfidenceLevel; isSelected: boolean }) {
const { colors } = LEVEL_CONFIG[level]
return (
<div
className={cn(
'flex h-8 items-center justify-center rounded-xl px-2 transition-colors',
isSelected ? 'bg-[#263035]' : 'hover:bg-[#263035]/50'
)}
aria-label={`Confidence: ${level}`}
>
<div className="flex h-1 w-14 gap-1">
{colors.map((color, i) => (
<div key={i} className={cn('h-full flex-1 rounded-sm', color)} />
))}
</div>
</div>
)
}
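// Popover body: the confidence label, the list of missing context items, and a follow-up action.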
function ConfidenceLevelCard({
level,
missingContext,
onFindContext,
}: {
level: ConfidenceLevel
missingContext: MissingContext[]
onFindContext?: () => void
}) {
const { label } = LEVEL_CONFIG[level]
return (
<div className="flex w-[300px] flex-col overflow-hidden">
<div className="px-4 pt-3">
<span className="text-base leading-6 text-[#d0d5dd]">Confidence: {label}</span>
</div>
<Separator className="mt-3 bg-[#3d4a54]" />
<div className="not-prose space-y-3 overflow-y-auto px-4 py-3">
<p className="text-base text-[#f2f7fc]">Missing Context:</p>
<ul className="ml-6 list-disc space-y-3">
{missingContext.map((item) => (
<li key={item.id} className="text-base text-[#f2f7fc]">
{item.text}
</li>
))}
</ul>
</div>
<div className="rounded-b-xl bg-[#3d4a54] px-3 py-2">
<Button
variant="ghost"
onClick={onFindContext}
className="h-auto w-full justify-start gap-2 p-0 text-base leading-6 text-[#f2f7fc] hover:bg-transparent hover:opacity-80"
>
<Search size={16} strokeWidth={1.5} />
Find missing context
</Button>
</div>
</div>
)
}
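// Main component. The default missingContext items below are placeholders for previewing the component.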
export function ConfidenceIndicator({
level = 'low',
missingContext = [
{ id: '1', text: 'Missing context 1' },
{ id: '2', text: 'Missing context 2' },
{ id: '3', text: 'Missing context 3' },
{ id: '4', text: 'Missing context 4' },
],
onFindContext,
}: ConfidenceIndicatorProps) {
const [isOpen, setIsOpen] = useState(false)
return (
<Popover open={isOpen} onOpenChange={setIsOpen}>
<PopoverTrigger asChild>
<Button
variant="ghost"
className="h-auto p-0 hover:bg-transparent"
aria-expanded={isOpen}
>
<ConfidenceLevelTrigger level={level} isSelected={isOpen} />
</Button>
</PopoverTrigger>
<PopoverContent
align="end"
className="w-auto border-none bg-[#263035] p-0 rounded-xl shadow-md"
>
<ConfidenceLevelCard
level={level}
missingContext={missingContext}
onFindContext={() => {
setIsOpen(false)
onFindContext?.()
}}
/>
</PopoverContent>
</Popover>
)
}
export default ConfidenceIndicator
API Reference
ConfidenceIndicator
The main component for displaying confidence levels with expandable context.
| Prop | Type | Default | Description |
|---|---|---|---|
| `level` | `'low' \| 'medium' \| 'high'` | `'low'` | The confidence level to display. Controls the visual indicator colors. |
| `missingContext` | `MissingContext[]` | Four placeholder items | Array of missing context items to display in the popover. The built-in default is preview data; pass real items in practice. |
| `onFindContext` | `() => void` | `undefined` | Callback fired when the "Find missing context" button is clicked. |
MissingContext
The shape of each missing context item.
| Property | Type | Description |
|---|---|---|
| `id` | `string` | Unique identifier for the context item. |
| `text` | `string` | Description of the missing context. |
Level Colors
| Level | Description | Visual |
|---|---|---|
| `low` | High uncertainty, significant missing context | 1 of 3 bars filled (red) |
| `medium` | Moderate confidence, some context missing | 2 of 3 bars filled (orange) |
| `high` | High confidence, minimal missing context | 3 of 3 bars filled (green) |
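Usage
A minimal usage sketch. The import path, the chosen level, and the missing-context text are illustrative; only the props themselves come from the API above.
```tsx
import { ConfidenceIndicator } from '@/components/confidence-indicator' // path is illustrative

export function ResponseFooter() {
  return (
    <ConfidenceIndicator
      level="medium"
      missingContext={[
        { id: 'deadline', text: 'No explicit deadline in the email' },
        { id: 'tone', text: 'Sender writes urgent and casual messages in the same tone' },
      ]}
      onFindContext={() => {
        // Hook this up to whatever gathers more context, e.g. opening a search panel
        console.log('find context requested')
      }}
    />
  )
}
```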
Design Philosophy
Where I started
When I started designing the confidence indicator, my focus was on answering "how sure is the AI?" That's the obvious framing: give the user a confidence score so they can make an informed decision. A percentage next to each response. It's transparent, it's what most AI products ship, and it felt like the responsible thing to do.
What made me doubt it
Then I ran an action-mapping exercise. I listed five confidence ranges and wrote down what the user would actually do at each one: not what they'd feel, but what they'd do.
| Range | User Action |
|---|---|
| >90% | Trust it, move on |
| 80–90% | Trust it, move on |
| 65–80% | Verify before acting on it |
| 50–65% | Don't trust it, check independently |
| <50% | Don't trust it, check independently |
Five ranges, two behaviors. The user either trusts the answer or goes to check it. There's no third action. The score wasn't changing anyone's behavior.
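To make that concrete, here is a small sketch of the finding. The 0.8 cutoff is read off the table above and is purely illustrative; nothing in the component consumes a raw score.
```ts
// Hypothetical: collapse a precise score into the only two behaviors users actually showed.
type UserAction = 'trust' | 'verify'

function actionForScore(score: number): UserAction {
  // At or above ~80% users trusted the answer; below that they went to check it.
  return score >= 0.8 ? 'trust' : 'verify'
}

console.log(actionForScore(0.73)) // 'verify'
console.log(actionForScore(0.68)) // 'verify': the extra precision changed nothing
```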
What I killed
I killed the percentage for three reasons.
The precision is fake. When a model says 73%, users read it as a thermometer reading: an exact measurement. But most models are poorly calibrated, so 73% might mean correct anywhere from 55–90% of the time. The number communicates an exactness that doesn't exist in the system.
The number doesn't change the action. What does a user do differently at 73% vs 68%? Nothing. A precise input to a binary decision is noise.
It shifts attention to the score. The moment you show a number, users evaluate the number instead of the content. "Is 73% good enough?" replaces actually reading the response. The score becomes a worse proxy than the user's own eyes.
The reframe
Killing the percentage cleared the table but didn't solve the problem. I still needed to figure out what to show instead. That's when I realized the question I'd been designing around, "how sure is the AI?", was the wrong question entirely. It produces a number. The user's real question is "what would make this answer better?" and that produces a path forward.
Compare: "62% confident this is Urgent" gives the user one option, trust it or don't. But "this email is short, has no explicit deadline, and the sender writes both urgent and casual messages in the same tone" tells the user exactly where to look. They glance at the email, see there's no deadline, reclassify in three seconds. The AI showed what it was missing. The user closed the gap.
This works across response types. "I don't have access to your codebase, so I'm guessing at your project structure." "This answer is based on training data, the API may have changed." Each tells the user what would make the answer better without prescribing what to do about it. A senior engineer pastes in their file tree. A junior engineer double-checks the imports. Same information, different actions, because the user knows their context better than the component ever could.
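For instance, the email example above maps directly onto the component's missingContext prop. The item text below is illustrative, and the import path is an assumption.
```ts
import type { MissingContext } from '@/components/confidence-indicator' // path is illustrative

// The same uncertainty expressed as concrete, checkable gaps instead of a score.
const emailMissingContext: MissingContext[] = [
  { id: 'length', text: 'The email is short, so there is little signal to classify on' },
  { id: 'deadline', text: 'No explicit deadline is mentioned' },
  { id: 'tone', text: 'This sender writes urgent and casual messages in the same tone' },
]
```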
The principle
Don't show the AI's confidence score. Show what information is missing, ambiguous, or uncertain and let the user decide what to do about it. The number asks the user to trust a black box. The explanation turns it into a collaboration.