Methodology

Survivor ELO Methodology: How SHALLOW Ratings Work

Last updated: 2026-05-22

Survivor Hierarchical Algorithm for Logistic Likelihood & Odds Weighting

SHALLOW is a quantitative rating system for Survivor contestants. It applies the Elo rating methodology—originally developed for chess—to measure strategic and social positioning across all US and Australian Survivor seasons.


What SHALLOW Measures Directly


What SHALLOW Measures Indirectly

What SHALLOW Does Not Measure


The Rating System

SHALLOW uses the Elo rating system, where rating changes depend on two factors:

  1. The Outcome: Winners gain points; losers lose points.
  2. The Expectation: Upsets (a lower-rated player beating a higher-rated one) cause larger rating swings than expected outcomes.

All players begin at 1500. Ratings are zero-sum: points gained by winners equal points lost by losers.
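The two factors above can be sketched in a few lines. This is a minimal illustration that assumes the conventional Elo logistic curve with a scale of 400; the page does not state which scale SHALLOW uses, so treat `scale` (and the function names) as assumptions rather than the actual implementation.

```python
from __future__ import annotations


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the conventional Elo curve (scale is assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update(rating_winner: float, rating_loser: float, k: float = 8.0) -> tuple[float, float]:
    """Return (new_winner, new_loser) after one matchup.

    The winner gains exactly what the loser gives up, so the pool stays zero-sum.
    """
    surprise = 1.0 - expected_score(rating_winner, rating_loser)
    delta = k * surprise
    return rating_winner + delta, rating_loser - delta


even = update(1500.0, 1500.0)      # two new players: half of K changes hands
upset = update(1400.0, 1600.0)     # lower-rated player wins: larger swing
expected = update(1600.0, 1400.0)  # favorite wins: smaller swing
```

With two players at the 1500 starting rating, the winner gains 4 points and the loser drops 4. An upset moves more points than a result the ratings already predicted.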

The K Factor

The K factor controls rating volatility—how much ratings change per event. After empirical optimization across all seasons, SHALLOW uses K = 8.

Why K = 8?

We tested K values from 4 to 128, measuring predictive accuracy using Brier score (lower is better). K = 8 minimizes prediction error, indicating it best balances responsiveness to new information against stability from accumulated history.

Lower K values mean established ratings are more stable and require more evidence to change significantly. This reflects Survivor's high-variance environment where even skilled players face elimination through circumstances beyond their control.
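A K sweep of this kind could look roughly like the sketch below. It replays matchups in order, scores each pre-match prediction with the Brier score, and keeps the K with the lowest average. The matchup list is made-up toy data, the power-of-two grid is an assumption (the text gives only the range 4 to 128), and the 400-point scale is the conventional Elo default rather than a confirmed SHALLOW setting.

```python
from collections import defaultdict


def brier_for_k(matchups, k, start=1500.0, scale=400.0):
    """Replay (winner, loser) pairs in order and return the mean Brier score.

    Each prediction is made from the ratings *before* the matchup is applied.
    """
    ratings = defaultdict(lambda: start)
    total = 0.0
    for winner, loser in matchups:
        p_win = 1.0 / (1.0 + 10.0 ** ((ratings[loser] - ratings[winner]) / scale))
        total += (1.0 - p_win) ** 2  # observed outcome is 1 for the winner
        delta = k * (1.0 - p_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return total / len(matchups)


# Hypothetical toy matchups, for illustration only.
toy = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B"), ("A", "C")]
grid = [4, 8, 16, 32, 64, 128]  # assumed grid over the stated 4..128 range

scores = {k: brier_for_k(toy, k) for k in grid}
best_k = min(scores, key=scores.get)
```

On real data the sweep would run over every recorded matchup across all seasons; the toy list here is far too small to say anything about which K is best.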

Returning Players

Players carry their ratings across seasons. A returning player enters their new season with whatever rating they accumulated previously, allowing the system to track career-long performance.


Provisional Ratings

Players with fewer than 5 tribal council attendances are marked "provisional." Their ratings affect other players normally, but they are excluded from official rankings due to insufficient sample size.

This prevents early boots with small samples from appearing alongside established players in the rankings.
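As a rough sketch, the provisional rule amounts to a filter applied when rankings are published, not when ratings are computed. The `Player` class, its field names, and the sample players are hypothetical.

```python
from dataclasses import dataclass

PROVISIONAL_THRESHOLD = 5  # fewer tribal councils than this means provisional


@dataclass
class Player:
    name: str
    rating: float
    tribals_attended: int

    @property
    def provisional(self) -> bool:
        return self.tribals_attended < PROVISIONAL_THRESHOLD


def official_rankings(players):
    """Sort by rating, highest first, leaving out provisional players."""
    eligible = [p for p in players if not p.provisional]
    return sorted(eligible, key=lambda p: p.rating, reverse=True)


# Hypothetical players, for illustration only.
pool = [
    Player("Early Boot", 1512.0, 2),
    Player("Veteran", 1640.0, 31),
    Player("Mid-Gamer", 1495.0, 5),
]
ranked_names = [p.name for p in official_rankings(pool)]
```

Here "Early Boot" is left out of the rankings despite a higher rating than "Mid-Gamer", because two tribal councils is below the threshold.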


Tiers

Players are assigned tiers based on statistical z-scores (standard deviations from mean rating):

Tier            Z-Score       Interpretation
Sole Survivor   ≥ 2.0         Top ~2% of all players
Finalist        1.0 to 2.0    Top ~15%
Merge           0 to 1.0      Above average
Jury            -1.0 to 0     Below average
Pre-Merge       < -1.0        Bottom ~15%

Tier boundaries are automatically calibrated to the full player population.
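The tier assignment can be sketched as a z-score lookup. Two details are assumptions here: the population standard deviation is used, and each band includes its lower bound (the table only makes that explicit for the top and bottom tiers).

```python
from __future__ import annotations

from statistics import mean, pstdev


def tier_for_z(z: float) -> str:
    """Map a z-score to a tier; lower bounds are treated as inclusive (assumed)."""
    if z >= 2.0:
        return "Sole Survivor"
    if z >= 1.0:
        return "Finalist"
    if z >= 0.0:
        return "Merge"
    if z >= -1.0:
        return "Jury"
    return "Pre-Merge"


def assign_tiers(ratings: dict[str, float]) -> dict[str, str]:
    """Compute z-scores against the whole pool, then look up each tier."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values())
    tiers = {}
    for name, rating in ratings.items():
        z = (rating - mu) / sigma if sigma > 0 else 0.0  # guard: identical ratings
        tiers[name] = tier_for_z(z)
    return tiers


# Hypothetical three-player pool, for illustration only.
example = assign_tiers({"low": 1400.0, "mid": 1500.0, "high": 1600.0})
```

Because the mean and standard deviation come from the whole pool, the rating cut-off for each tier shifts as the population changes.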


Limitations

Survivor is high-variance. Tribe swaps, idol finds, challenge outcomes, and production twists introduce randomness that can eliminate strong players and advance weaker ones. SHALLOW captures outcomes, not intentions or abilities in isolation.

Vote data has edge cases. Split votes, revotes, unanimous decisions, and special tribal formats require interpretation. We process these consistently but acknowledge ambiguity in some situations.

US and Australian Survivor are pooled. We treat all seasons as one continuous rating pool. This maximizes signal on legendary players who have competed across both franchises—Sandra Diaz-Twine, Tony Vlachos, Russell Hantz, and others have ratings built from appearances in both US and AU seasons. The tradeoff is that the two franchises may have different average competition levels, but cross-franchise returnees anchor the populations together.


Case Studies

These examples illustrate what SHALLOW captures—and what it cannot.

The Namesake: Parvati Shallow

Parvati Shallow ranks #1 with an Elo of 1773—four standard deviations above the mean. Across five seasons and 47 tribal councils, she maintained 83.7% vote accuracy while reaching Final Tribal Council three times. Her rating reflects not a single dominant season, but sustained excellence across Micronesia, Heroes vs. Villains, and Winners at War.

Vote Accuracy vs. Perceived Control: Dee Valladares

Dee Valladares won Season 45 and is widely considered one of the strongest recent winners. She ranks #66 amongst US players. Why?

Player                  Rank   Vote Accuracy   Tribals   Jury Votes
Kyle Fraser (S48)       #19    100%            10        5/8
Rachel LaMont (S47)     #20    77.8%           12        7/8
Kenzie Petty (S46)      #36    90.9%           13        5/8
Dee Valladares (S45)    #66    66.7%           10        5/8

Dee voted incorrectly at three tribal councils. Kyle voted correctly every time. SHALLOW cannot observe jury management, strategic misdirection, or edit portrayal. It sees only who voted for whom and who went home. A player can dominate strategically while appearing on the "wrong" side of votes that eliminated allies or pawns. This is a known limitation: the model rewards correct reads as recorded, not necessarily optimal strategy.
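One plausible reading of "vote accuracy" is the share of a player's cast votes that landed on the person who went home. The sketch below assumes that definition and assumes tribals where the player cast no vote are left out of the denominator; neither detail is spelled out on this page. Under that reading, three misses in nine cast votes gives 66.7%.

```python
from __future__ import annotations

from typing import Optional


def vote_accuracy(records: list[tuple[Optional[str], str]]) -> float:
    """records holds (player's vote or None, eliminated player) per tribal.

    Tribals with no vote cast (None) are skipped; this is an assumption.
    """
    cast = [(vote, out) for vote, out in records if vote is not None]
    if not cast:
        return float("nan")
    correct = sum(1 for vote, out in cast if vote == out)
    return correct / len(cast)


# Hypothetical record: 10 tribals, 6 correct votes, 3 incorrect, 1 with no vote.
records = [("X", "X")] * 6 + [("Y", "X")] * 3 + [(None, "X")]
accuracy_pct = round(vote_accuracy(records) * 100, 1)
```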

Reaching FTC Twice Without Winning: Amanda Kimmel

Amanda Kimmel reached Final Tribal Council twice (China and Micronesia) and lost both times. She still ranks #40 amongst US players with an A-tier rating. Across three seasons, she attended 30 tribal councils with 85.7% vote accuracy and earned 4 jury votes total.

Her rating reflects sustained survival and strategic positioning. FTC losses cost her rating points against the winners, but 30 tribals of strong performance outweigh two jury defeats. SHALLOW values longevity and vote accuracy; losing at FTC hurts, but does not erase an otherwise excellent record.

Fan Favorite, Mixed Bag: Spencer Bledsoe

Spencer Bledsoe is a fan favorite who won 6 individual immunities across two seasons and reached Final Tribal Council in Cambodia. He ranks #71 amongst US players with an A-tier rating.

His vote accuracy was 65.4%—he voted incorrectly 9 times across 27 tribals. At FTC, he received zero jury votes. Challenge dominance kept him alive, but the system sees frequent misreads and jury rejection.

Goat Dampening: Sue Smey and Ben Katzman

Sue Smey (Season 47) and Ben Katzman (Season 46) both reached Final Tribal Council. Both received zero jury votes. Despite surviving to the end of the game, their ratings sit near the population mean: they rank #271 and #297, respectively, amongst all US and AU players.

Reaching the end without jury support generates matchup losses against every other finalist for every jury vote cast. The system does not reward survival alone—it rewards survival validated by peer respect.
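The jury mechanism described above can be sketched as pairwise matchups: each jury vote counts as a win for the finalist who received it over every other finalist. The sequential update order, the 400-point scale, and the sample ratings are assumptions for illustration; the page does not say how SHALLOW batches these updates.

```python
def apply_jury_votes(ratings, finalists, jury_votes, k=8.0, scale=400.0):
    """Treat each jury vote as a win for its recipient over every other finalist.

    Updates are applied one vote at a time (an assumed ordering).
    """
    new = dict(ratings)
    for voted_for in jury_votes:
        for other in finalists:
            if other == voted_for:
                continue
            p_win = 1.0 / (1.0 + 10.0 ** ((new[other] - new[voted_for]) / scale))
            delta = k * (1.0 - p_win)
            new[voted_for] += delta
            new[other] -= delta
    return new


# Hypothetical final three and an 8-person jury split 7-1-0.
before = {"Winner": 1600.0, "Runner-up": 1580.0, "Goat": 1560.0}
votes = ["Winner"] * 7 + ["Runner-up"]
after = apply_jury_votes(before, list(before), votes)
```

The zero-vote finalist is on the losing side of all eight matchups, which is how a full-length game can still leave a rating near the population mean.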

Elite Without Winning: Cirie Fields

Cirie Fields ranks #7 among US players with an Elo of 1683 despite never winning Survivor and never reaching Final Tribal Council. Across five seasons and 44 tribal councils, she maintained 78.6% vote accuracy with zero individual immunity wins.

Her rating is built entirely on strategic positioning—being on the right side of votes and outlasting opponents at the tribals she attended. Cirie demonstrates that the system measures gameplay outcomes, not just results. You do not need to win challenges or reach FTC to rate among the all-time greats.

Rise and Fall: Russell Hantz

Russell Hantz peaked at 1733 Elo (S-tier) in the penultimate episode of Heroes vs. Villains. Then came Final Tribal Council: zero jury votes against Sandra and Parvati. His rating dropped immediately as he lost every jury matchup.

It continued falling. In Redemption Island (Season 22), Russell was eliminated second. In Australian Survivor: Champions vs. Contenders, he was voted out pre-merge. Each early elimination generated outlast losses against the players who survived. Russell's current rating reflects the full arc: dominant strategic play in Samoa and Heroes vs. Villains, followed by three consecutive poor performances.

Rise and Fall: Sandra Diaz-Twine

Sandra Diaz-Twine peaked at 1708 Elo (S-tier) after winning Heroes vs. Villains, her second consecutive victory. She was, at that moment, one of the highest-rated players in Survivor history.

Three subsequent seasons eroded her rating. In Game Changers (Season 34), she was voted out at the second swap. In Winners at War (Season 40), she was eliminated pre-merge. In Australian Survivor: Blood vs. Water, she left early again. Sandra currently ranks far from her peak. Like Russell, her trajectory illustrates that SHALLOW treats each tribal independently. Past wins provide no immunity from future losses.


Data Sources

Vote data is sourced from the survivoR package.


Methodology Changes

This system is under active development. Rating calculations may be refined as we identify improvements. Historical ratings are recalculated when methodology changes to maintain consistency.

Current version: 1.0.0


See also: BEAST challenge rating methodology, the FAQ, and the full rankings.