Survivor ELO Methodology: How SHALLOW Ratings Work

Last updated: 2026-05-22

Survivor Hierarchical Algorithm for Logistic Likelihood & Odds Weighting

SHALLOW is a quantitative rating system for Survivor contestants. It applies the Elo rating methodology—originally developed for chess—to measure strategic and social positioning across all US and Australian Survivor seasons.

What SHALLOW Measures Directly

Vote Accuracy (Outwit): At each tribal council, players who vote for the eliminated contestant are considered "correct." Players who vote elsewhere, have their votes nullified, or are themselves eliminated are considered "incorrect." Each correct voter is matched against each incorrect voter in a pairwise comparison. Correct voters gain rating points; incorrect voters lose them. This captures the essence of strategic positioning: being on the right side of the vote requires social awareness, alliance management, and information gathering. Being on the correct overall side of a split vote is considered "correct," even if you did not write down the name of the player who went home.
Survival (Outlast): Players who survive a tribal council gain rating points against the eliminated player. Only players present at the tribal compete: pre-merge, only your tribe receives credit for an elimination. This captures the fundamental objective of Survivor: lasting longer than other players.
Final Tribal Council: Jury votes are processed as head-to-head matchups between finalists. Each jury vote for a player creates a "win" against each finalist who did not receive that vote. This structure naturally dampens "goat" ratings. A player who reaches Final Tribal Council but receives zero jury votes will lose matchups to every other finalist for every jury vote cast. Strong finalists who earn jury respect gain rating points; passengers who were carried to the end lose them.
Fire-Making Challenges: Final four fire-making challenges are direct head-to-head Elo matchups between the two competitors.
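
The matchup rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not SHALLOW's actual code: the function names and input shapes are hypothetical, and the source does not say how the resulting pairs are weighted or combined into a single rating change.

```python
from itertools import product


def tribal_matchups(correct, incorrect, survivors, eliminated):
    """Build (winner, loser) pairs for one tribal council.

    correct / incorrect: names classified by the vote-accuracy rule
    (the eliminated player counts as incorrect).
    survivors: everyone present at this tribal who was not eliminated.
    """
    # Outwit: every correct voter beats every incorrect voter.
    outwit = list(product(correct, incorrect))
    # Outlast: every surviving attendee beats the eliminated player.
    outlast = [(name, eliminated) for name in survivors if name != eliminated]
    return outwit, outlast


def ftc_matchups(finalists, jury_votes):
    """Each jury vote is a win for its recipient over every other finalist."""
    return [
        (pick, other)
        for pick in jury_votes
        for other in finalists
        if other != pick
    ]


# Three finalists, three jury votes, "Z" receives none.
pairs = ftc_matchups(["X", "Y", "Z"], ["X", "X", "Y"])
losses_for_z = sum(1 for _, loser in pairs if loser == "Z")
print(losses_for_z)  # 3: one loss per jury vote cast
```

A zero-vote finalist appears as the loser once for every jury vote, which is the dampening effect described under Final Tribal Council.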

What SHALLOW Measures Indirectly

Challenge Performance (Outplay): We do not award bonus points for immunity wins. Challenges are indirectly measured: winning immunity means attending tribal council without elimination risk, earning outlast credit against whoever goes home. Immunity also provides negotiating leverage: the necklace holder cannot be voted out and can influence the target. These strategic benefits flow through vote accuracy and survival, not a separate challenge score. We track immunity wins as a display statistic.
Idol Play: Successfully playing an idol keeps you in the game. The value is captured through survival and vote accuracy, not the idol play itself. Like immunity, idols provide leverage: a known idol holder can negotiate from a position of power. This influence appears in the resulting votes, not as a direct bonus.

What SHALLOW Does Not Measure

Strategic Attribution: The system cannot determine who drove a vote. If five players vote together, all five are credited equally. In large samples, players who consistently find themselves on the right side of votes will accumulate higher ratings. In small samples, a strategic mastermind and a follower may look identical. Every correct voter gains; every incorrect voter loses.
Intentional Wrong Votes: A player who deliberately votes against the majority to protect a relationship is penalized the same as a genuine misread. The model cannot distinguish intent. In the short run this is a real limitation. Over a full season, if the relationship pays off through deeper survival or jury votes, the outlast and FTC components capture the downstream value.
Entertainment Value: Confessional count, memorable moments, and fan popularity are not inputs to the rating.

The Rating System

SHALLOW uses the Elo rating system, where rating changes depend on two factors:

The Outcome: Winners gain points; losers lose points.
The Expectation: Upsets (a lower-rated player beating a higher-rated one) cause larger rating swings than expected outcomes.

All players begin at 1500. Ratings are zero-sum: points gained by winners equal points lost by losers.
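
As a minimal sketch, a single Elo matchup can be written as follows. The logistic expected-score formula with a scale of 400 is the conventional Elo form and is an assumption here; the source does not state which scale SHALLOW uses. The default `k=8.0` is the K factor described in the next section.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A beats B under the standard Elo logistic curve.

    scale=400 is the conventional Elo value (assumed, not confirmed by the source).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update(winner_rating, loser_rating, k=8.0):
    """Return new (winner, loser) ratings after one matchup.

    The winner gains exactly what the loser gives up, so the update is zero-sum.
    """
    delta = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + delta, loser_rating - delta


print(update(1500.0, 1500.0))  # (1504.0, 1496.0)
```

Under these assumptions, a 1400-rated player beating a 1600-rated player gains about 6.1 points, while the reverse result moves each rating by only about 1.9. That asymmetry is the "expectation" factor.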

The K Factor

The K factor controls rating volatility—how much ratings change per event. After empirical optimization across all seasons, SHALLOW uses K = 8.

Why K = 8?

We tested K values from 4 to 128, measuring predictive accuracy using Brier score (lower is better). K = 8 minimizes prediction error, indicating it best balances responsiveness to new information against stability from accumulated history.
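
A K sweep of this kind could look roughly like the sketch below. It is not the procedure actually used: the candidate grid, the 400-point Elo scale, the sequential update order, and the choice to score each matchup before applying its update are all assumptions made for illustration.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo logistic curve (scale=400 is an assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def brier_for_k(matchups, k, start=1500.0):
    """Mean squared error of pre-matchup predictions for a given K.

    matchups: chronological list of (winner, loser) names.
    Each prediction is made from the ratings held before that matchup.
    """
    ratings = {}
    total = 0.0
    for winner, loser in matchups:
        rw = ratings.get(winner, start)
        rl = ratings.get(loser, start)
        p = expected_score(rw, rl)   # forecast that the eventual winner wins
        total += (p - 1.0) ** 2      # observed outcome is 1 for the winner
        delta = k * (1.0 - p)
        ratings[winner] = rw + delta
        ratings[loser] = rl - delta
    return total / len(matchups)


def best_k(matchups, candidates=(4, 8, 16, 32, 64, 128)):
    """Pick the candidate K with the lowest Brier score (hypothetical grid)."""
    return min(candidates, key=lambda k: brier_for_k(matchups, k))
```

With every player starting at 1500, the first prediction is always 0.5, so a single matchup scores 0.25 regardless of K. Differences between K values only emerge as history accumulates.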

Lower K values mean established ratings are more stable and require more evidence to change significantly. This reflects Survivor's high-variance environment where even skilled players face elimination through circumstances beyond their control.

Returning Players

Players carry their ratings across seasons. A returning player enters their new season with whatever rating they accumulated previously, allowing the system to track career-long performance.

Provisional Ratings

Players with fewer than 5 tribal council attendances are marked "provisional." Their ratings affect other players normally, but they are excluded from official rankings due to insufficient sample size.

This prevents early boots with small samples from appearing alongside established players in the rankings.
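
The provisional rule amounts to a simple filter applied at ranking time. A minimal sketch, assuming a hypothetical `players` mapping of name to (rating, tribal councils attended):

```python
PROVISIONAL_THRESHOLD = 5  # fewer than 5 tribal councils means provisional


def official_rankings(players):
    """Return (name, rating) pairs sorted by rating, excluding provisional players.

    players: dict mapping name -> (rating, tribals_attended). Hypothetical shape.
    Provisional players still take part in rating updates; they are only
    hidden from the ranked list.
    """
    eligible = [
        (name, rating)
        for name, (rating, tribals) in players.items()
        if tribals >= PROVISIONAL_THRESHOLD
    ]
    return sorted(eligible, key=lambda item: item[1], reverse=True)


sample = {"A": (1550.0, 12), "B": (1600.0, 3), "C": (1480.0, 5)}
print(official_rankings(sample))  # [('A', 1550.0), ('C', 1480.0)]
```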

Tiers

Players are assigned tiers based on statistical z-scores (standard deviations from mean rating):

Tier	Z-Score	Interpretation
Sole Survivor	≥ 2.0	Top ~2% of all players
Finalist	1.0 to 2.0	Top ~15%
Merge	0 to 1.0	Above average
Jury	-1.0 to 0	Below average
Pre-Merge	< -1.0	Bottom ~15%

Tier boundaries are automatically calibrated to the full player population.
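
The tier lookup can be sketched as below. Two details are assumptions, because the table does not specify them: the lower bound of each band is treated as inclusive, and the population standard deviation is used. Whether provisional players count toward the mean is also not stated.

```python
from statistics import mean, pstdev


def tier_for_z(z):
    """Map a z-score to a tier name; lower bounds are inclusive (assumed)."""
    if z >= 2.0:
        return "Sole Survivor"
    if z >= 1.0:
        return "Finalist"
    if z >= 0.0:
        return "Merge"
    if z >= -1.0:
        return "Jury"
    return "Pre-Merge"


def assign_tiers(ratings):
    """ratings: dict mapping name -> rating. Returns name -> tier."""
    mu = mean(ratings.values())
    sigma = pstdev(ratings.values())  # population std dev (assumed)
    if sigma == 0:
        # Everyone has the same rating, so every z-score is 0.
        return {name: tier_for_z(0.0) for name in ratings}
    return {name: tier_for_z((r - mu) / sigma) for name, r in ratings.items()}


print(assign_tiers({"a": 1400.0, "b": 1500.0, "c": 1600.0}))
# {'a': 'Pre-Merge', 'b': 'Merge', 'c': 'Finalist'}
```

Because the mean and standard deviation are recomputed from the current population, a player's tier can change even when their own rating does not.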

Limitations

Survivor is high-variance. Tribe swaps, idol finds, challenge outcomes, and production twists introduce randomness that can eliminate strong players and advance weaker ones. SHALLOW captures outcomes, not intentions or abilities in isolation.

Vote data has edge cases. Split votes, revotes, unanimous decisions, and special tribal formats require interpretation. We process these consistently but acknowledge ambiguity in some situations.

US and Australian Survivor are pooled. We treat all seasons as one continuous rating pool. This maximizes signal on legendary players who have competed across both franchises—Sandra Diaz-Twine, Tony Vlachos, Russell Hantz, and others have ratings built from appearances in both US and AU seasons. The tradeoff is that the two franchises may have different average competition levels, but cross-franchise returnees anchor the populations together.

Case Studies

These examples illustrate what SHALLOW captures—and what it cannot.

The Namesake: Parvati Shallow

Parvati Shallow ranks #1 with an Elo of 1773—four standard deviations above the mean. Across five seasons and 47 tribal councils, she maintained 83.7% vote accuracy while reaching Final Tribal Council three times. Her rating reflects not a single dominant season, but sustained excellence across Micronesia, Heroes vs. Villains, and Winners at War.

Vote Accuracy vs. Perceived Control: Dee Valladares

Dee Valladares won Season 45 and is widely considered one of the strongest recent winners. She ranks #66 amongst US players. Why?

Player	Rank	Vote Accuracy	Tribals	Jury Votes
Kyle Fraser (S48)	#19	100%	10	5/8
Rachel LaMont (S47)	#20	77.8%	12	7/8
Kenzie Petty (S46)	#36	90.9%	13	5/8
Dee Valladares (S45)	#66	66.7%	10	5/8

Dee voted incorrectly at three tribal councils. Kyle voted correctly every time. SHALLOW cannot observe jury management, strategic misdirection, or edit portrayal. It sees only who voted for whom and who went home. A player can dominate strategically while appearing on the "wrong" side of votes that eliminated allies or pawns. This is a known limitation: the model rewards correct reads as recorded, not necessarily optimal strategy.

Reaching FTC Twice Without Winning: Amanda Kimmel

Amanda Kimmel reached Final Tribal Council twice (China and Micronesia) and lost both times. She still ranks #40 amongst US players with an A-tier rating. Across three seasons, she attended 30 tribal councils with 85.7% vote accuracy and earned 4 jury votes total.

Her rating reflects sustained survival and strategic positioning. FTC losses cost her rating points against the winners, but 30 tribals of strong performance outweigh two jury defeats. SHALLOW values longevity and vote accuracy; losing at FTC hurts, but does not erase an otherwise excellent record.

Fan Favorite, Mixed Bag: Spencer Bledsoe

Spencer Bledsoe is a fan favorite who won 6 individual immunities across two seasons and reached Final Tribal Council in Cambodia. He ranks #71 amongst US players with an A-tier rating.

His vote accuracy was 65.4%—he voted incorrectly 9 times across 27 tribals. At FTC, he received zero jury votes. Challenge dominance kept him alive, but the system sees frequent misreads and jury rejection.

Goat Dampening: Sue Smey and Ben Katzman

Sue Smey (Season 47) and Ben Katzman (Season 46) both reached Final Tribal Council. Both received zero jury votes. Despite surviving all 26 days, their ratings sit near the population mean: they rank #271 and #297, respectively, in the combined US and AU pool.

Reaching the end without jury support generates matchup losses against every other finalist for every jury vote cast. The system does not reward survival alone—it rewards survival validated by peer respect.

Elite Without Winning: Cirie Fields

Cirie Fields ranks #7 among US players with an Elo of 1683 despite never winning Survivor and never reaching Final Tribal Council. Across five seasons and 44 tribal councils, she maintained 78.6% vote accuracy with zero individual immunity wins.

Her rating is built entirely on strategic positioning: being on the right side of votes and outlasting opponents at the tribals she attended. Cirie demonstrates that the system measures tribal-by-tribal outcomes, not just final placement. You do not need to win challenges or reach FTC to rate among the all-time greats.

Rise and Fall: Russell Hantz

Russell Hantz peaked at 1733 Elo (S-tier) in the penultimate episode of Heroes vs. Villains. Then came Final Tribal Council: zero jury votes against Sandra and Parvati. His rating dropped immediately as he lost every jury matchup.

It continued falling. In Redemption Island (Season 22), Russell was eliminated second. In Australian Survivor: Champions vs. Contenders, he was voted out pre-merge. Each early elimination generated outlast losses against the players who survived. Russell's current rating reflects the full arc: dominant strategic play in Samoa and Heroes vs. Villains, followed by three consecutive poor performances.

Rise and Fall: Sandra Diaz-Twine

Sandra Diaz-Twine peaked at 1708 Elo (S-tier) after winning Heroes vs. Villains, her second consecutive victory. She was, at that moment, one of the highest-rated players in Survivor history.

Three subsequent seasons eroded her rating. In Game Changers (Season 34), she was voted out at the second swap. In Winners at War (Season 40), she was eliminated pre-merge. In Australian Survivor: Blood vs. Water, she left early again. Sandra currently ranks far from her peak. Like Russell, her trajectory illustrates that SHALLOW treats each tribal independently. Past wins provide no immunity from future losses.

Data Sources

Vote data is sourced from the survivoR package.

Methodology Changes

This system is under active development. Rating calculations may be refined as we identify improvements. Historical ratings are recalculated when methodology changes to maintain consistency.

Current version: 1.0.0

See also: BEAST challenge rating methodology, the FAQ, and the full rankings.