
Trust Map

Explore how unequal data distribution affects AI-generated insights across different sectors like health, crime, education, and more.

Scroll down to explore how we calculated the AI Trust Score and how it varies across different regions. The interactive map visualizes the AI Trust Score across the United States.

Understanding AI Trust Scores

The AI Trust Score is a composite metric ranging from 1 to 5 that measures the quality of AI Overviews in a given county relative to all other counties. Counties with higher scores (shown in green) have higher-quality AI Overviews than their peers.

How Scores Are Calculated

The AI Trust Score is the average of three components:

  1. AI Availability
  2. Source Diversity
  3. Epistemic Humility

Each component score is placed on a scale from 1 to 5 based on how it compares to the distribution of scores across all counties.

Standard deviation measures how spread out scores are from the average. Most scores fall close to the mean, and the further a score is from the mean, the more unusual it is. A score more than one standard deviation above the mean stands out as high; one more than a standard deviation below is low. This makes standard deviation a fair way to group scores: instead of judging the raw number on its own, we see how it compares to the rest of the group, telling us whether a score is typical, slightly better or worse than average, or truly exceptional.

  • Highest (5): x > μ + σ
  • High (4): μ + σ > x > μ
  • Moderate (3): μ > x > μ - σ
  • Low (2): μ - σ > x > μ - 2σ
  • Lowest (1): x < μ - 2σ
μ = mean, σ = standard deviation, x = score
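The banding rule above can be sketched directly in code. This is a minimal illustration of the thresholds as stated, not the project's actual implementation; the example values are taken from the AI Availability statistics quoted later in the page.

```python
def trust_band(x, mu, sigma):
    """Map a raw component score to a 1-5 band by its distance
    from the mean, using the thresholds described above."""
    if x > mu + sigma:
        return 5  # Highest
    elif x > mu:
        return 4  # High
    elif x > mu - sigma:
        return 3  # Moderate
    elif x > mu - 2 * sigma:
        return 2  # Low
    else:
        return 1  # Lowest

# With mean 55.86 and standard deviation 15.50:
print(trust_band(80.0, 55.86, 15.50))  # 5: more than one sd above the mean
print(trust_band(50.0, 55.86, 15.50))  # 3: within one sd of the mean
```

Note the asymmetry in the rule: the Highest band starts one standard deviation above the mean, while the Lowest band starts two standard deviations below it.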

AI Availability

The AI Availability Score is calculated using two key metrics:

  1. Trigger Rate: how often the AI responds when prompted
  2. Average Content Score: quality and relevance of responses

Calculation Process:

  1. Apply min-max normalization to both Trigger Rate and Average Content Score
  2. Sum the normalized values to get a combined score
  3. Apply final min-max normalization to obtain scores between 0-100

Final scores are mapped based on these thresholds:

  • Highest (5): ≥ 71.35
  • High (4): ≥ 55.86
  • Moderate (3): ≥ 40.36
  • Low (2): ≥ 24.87
  • Lowest (1): < 24.87
Mean = 55.86, Std Dev = 15.50

Source Diversity

The Source Diversity Score measures how evenly information sources are distributed across different domains for each county.

  1. Calculate Shannon Diversity Index
  2. Normalize Based on Domain Count

Calculation Process:

  1. Calculate Shannon Index (H):
    • H = -Σ(p × ln(p))
    • p = proportion of sources from each domain
  2. Normalize by domain count:
    • H_normalized = H / ln(k)
    • k = number of unique domains
  3. Convert to 0-100 scale for final score
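The Shannon calculation above can be sketched in a few lines. The domain lists are hypothetical examples, not actual AI Overview citations.

```python
import math
from collections import Counter

def source_diversity(domains):
    """Shannon index H = -sum(p * ln p), normalized by ln(k)
    and scaled to 0-100, as described above."""
    counts = Counter(domains)
    n, k = len(domains), len(counts)
    if k < 2:
        return 0.0  # a single domain has no diversity
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return 100 * h / math.log(k)

# Hypothetical citation lists for two counties
print(source_diversity(["cdc.gov", "nih.gov", "mayo.org", "webmd.com"]))  # 100.0: evenly spread
print(source_diversity(["cdc.gov"] * 9 + ["nih.gov"]))                    # ~47: heavily skewed
```

Dividing by ln(k) caps the score at 100, reached when sources are spread perfectly evenly across the domains a county cites.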

Final scores are mapped based on these thresholds:

  • Highest (5): ≥ 100.7
  • High (4): ≥ 91.7
  • Moderate (3): ≥ 82.7
  • Low (2): ≥ 73.6
  • Lowest (1): < 73.6
Mean = 91.7, Std Dev = 9.0

Epistemic Humility

The Epistemic Humility Score measures how well AI systems acknowledge uncertainty in their responses.

  1. Initial Scoring: Analyzing AI Response Confidence
  2. Normalizing Scores to 0-100 Scale

Calculation Process:

  1. Classify confidence levels using BART:
    • Confident responses (1.0)
    • Neutral responses (0.5)
    • Uncertain responses (0.0)
  2. Calculate initial score: 1 - (weighted sum of confidence)
  3. Apply min-max normalization for final 0-100 score
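The initial scoring step can be sketched as below. The confidence labels would come from the zero-shot classifier (the text names BART); here they are assumed inputs, and the sample list is hypothetical.

```python
# Weights from the classification scheme above
CONFIDENCE_WEIGHTS = {"confident": 1.0, "neutral": 0.5, "uncertain": 0.0}

def humility_raw(labels):
    """Initial epistemic humility score: 1 minus the mean confidence
    weight across a county's classified AI responses."""
    mean_confidence = sum(CONFIDENCE_WEIGHTS[l] for l in labels) / len(labels)
    return 1 - mean_confidence

# Hypothetical classified responses for one county
labels = ["confident", "confident", "neutral", "uncertain"]
print(humility_raw(labels))  # 0.375: moderately hedged responses
```

A county whose responses are all classified "confident" scores 0, and one whose responses are all "uncertain" scores 1; min-max normalization then stretches the observed range to 0-100.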

Final scores are mapped based on these thresholds:

  • Highest (5): ≥ 58.12
  • High (4): ≥ 38.19
  • Moderate (3): ≥ 18.26
  • Low (2): ≥ -1.67
  • Lowest (1): < -1.67
Mean = 38.19, Std Dev = 19.93

[Map legend: Low Trust → High Trust]