Before we begin

Most of the work was done a couple years back (during COVID), so significant parts may have changed since then

Disclaimer

This work isn’t endorsed by Riot Games and doesn’t reflect the views or opinions of Riot Games or anyone officially involved in producing or managing League of Legends. League of Legends and Riot Games are trademarks or registered trademarks of Riot Games, Inc.

Table Of Content

Overview
Current Datasets And Their Issues
At 10000 feet
Replays
Reverse Engineering
Performance
Other Tidbits
Thoughts

Overview

League of Legends is one of the world’s most popular competitive games, with millions of players generating vast amounts of gameplay data daily. Basic match statistics are available, but accessing moment-by-moment gameplay data is near impossible. This article demonstrates how to create a high-fidelity dataset by reverse engineering the game engine, capturing information such as precise player positions to ability usage timings and damage calculations.

Current League of Legends datasets and analytics tools face several limitations:

Granularity: Most available data comes from Riot’s public APIs, which only provide aggregated statistics (total kills, gold earned, etc.) rather than detailed gameplay events
Precision: Existing tools that sample game state often miss critical micro-interactions
Completeness: Important gameplay elements like exact ability used, hiding in fog, positions data are missing

Because of this third-party developers can only build basic tools and datasets. We haven’t seen announcements with regards to game-playing agents unlike other games where high fidelity data has enabled breakthoughs in reinforcement learning ~ OpenAI5, AlphaStar.

I built a tool that can create high fidelity datasets. It directly decrypts and processes the game’s internal replay files, but keeps the game’s native data format. It can capture precise details like ability usage, movement, and state changes at millisecond intervals and can scale efficiently.

I’ll walk through the technical challenges involved in reverse engineering League’s proprietary game engine and replay format. These methods can be applied to similar problems in data extraction.

Expand to show sample of data scraped from a game

{
  [
    {
      "CreateSummoner": {
        "time": 0.0,
        "id": 1073741859,
        "name": "SomeUserName",
        "champion": "akali"
      }
    },
    {
      "CastSpell": {
        "time": 10.23234,
        "champion_caster_id": 1073741859,
        "spell_name": "AkaliE",
        "level": 1,
        "source_position": {
          "x": 14045.15,
          "z": 13559.334
        },
        "target_position": {
          "x": 14400.824,
          "z": 14443.168
        },
        "target_end_position": {
          "x": 0.0,
          "z": 0.0
        },
        "target_ids": [],
        "windup_time": 0.25,
        "cooldown": 14.5,
        "mana_cost": 30.0,
        "slot": 2
      }
    },
    {
      "UpdateState": {
        "time": 721.11426,
        "1073741859": {
          [
            {
              "name": "health",
              "data": 1516.6107
            },
            {
              "name": "movement_speed",
              "data": 270.777
            },
          ]
        }
      }
    },
    {
      "BasicAttackAtTarget": {
        "time": 122.12,
        "source_id": 1073741859,
        "target_id": 1073741858,
        "source_position": {
          "x": 9222.389,
          "z": 2501.3594
        },
        "target_position": {
          "x": 9266.0,
          "z": 2522.0
        },
        "slot": 49,
        "spell_name": "AkaliP",
        "level": 0,
        "target_end_position": {
          "x": 0.0,
          "z": 0.0
        },
        "target_ids": [
          1073741858
        ],
        "windup_time": 1.3504388,
        "cooldown": 0.0,
        "mana_cost": 0.0
      }
    },
    {
      "CreateEntity": {
        "time": 722.9974,
        "id": 1073770044,
        "position": {
          "x": 9326.244,
          "z": 2522.7634
        },
        "name": "TwilightShroud",
        "level": 1,
      }
    },
    {
      "BecomeVisibleInFogOfWar": {
        "time": 862.4249,
        "id": 1073741859
      }
    },
    {
      "Death": {
        "time": 860.31085,
        "id": 1073741859,
      }
    }
  ]
}

Current Datasets And Their Issues

League of Legends is a popular MOBA game. With millions of games played daily, several companiesBackseat.ggblitz.ggMobalytics have invested significant effort in analyzing gameplay data to improve the player experience. However, detailed, fine-grained data remains relatively hard to obtain.

Typical data sources provide only limited, high-level information. Despite these limitations, people build basic models use this data to build basic win prediction models Kaggle or create user interfaces for gameplay analysisMobalytics.

The following section provides different ways to fetch data and what datasets exist out there. I’ll show and explain what these datasets contain.

Most existing datasets are created by scraping recent games through the official RIOT matches API. When provided with a match ID, this API returns a list of participating players and various game statistics and metrics.

Expand to show match descriptionhttps://developer.riotgames.com/apis#match-v5/GET_getMatch

Name	Data Type	Description
endOfGameResult	string	Refer to indicate if the game ended in termination.
gameCreation	long	Unix timestamp for when the game is created on the game server (i.e., the loading screen).
gameDuration	long	Prior to patch 11.20, this field returns the game length in milliseconds calculated from gameEndTimestamp - gameStartTimestamp. Post patch 11.20, this field returns the max timePlayed of any participant in the game in seconds.
gameEndTimestamp	long	Unix timestamp for when match ends on the game server. This timestamp can occasionally be significantly longer than when the match “ends”.
gameId	long
gameMode	string	Refer to the Game Constants documentation.
gameName	string
gameStartTimestamp	long	Unix timestamp for when match starts on the game server.
gameType	string
gameVersion	string	The first two parts can be used to determine the patch a game was played on.
mapId	int	Refer to the Game Constants documentation.
participants	List[ParticipantDto]
platformId	string	Platform where the match was played.
queueId	int	Refer to the Game Constants documentation.
teams	List[TeamDto]
tournamentCode	string	Tournament code used to generate the match. This field was added to match-v5 in patch 11.13 on June 23rd, 2021.
allInPings	int	Yellow crossed swords
assistMePings	int	Green flag
assists	int
baronKills	int
bountyLevel	int
champExperience	int
champLevel	int
championId	int	Prior to patch 11.4, on Feb 18th, 2021, this field returned invalid championIds.
championName	string
commandPings	int	Blue generic ping (ALT+click)
championTransform	int	This field is currently only utilized for Kayn’s transformations. (Legal values: 0 - None, 1 - Slayer, 2 - Assassin)
consumablesPurchased	int
challenges	ChallengesDto
damageDealtToBuildings	int
damageDealtToObjectives	int
damageDealtToTurrets	int
damageSelfMitigated	int
deaths	int
detectorWardsPlaced	int
doubleKills	int
dragonKills	int
eligibleForProgression	boolean
enemyMissingPings	int	Yellow questionmark
enemyVisionPings	int	Red eyeball
firstBloodAssist	boolean
firstBloodKill	boolean
firstTowerAssist	boolean
firstTowerKill	boolean
gameEndedInEarlySurrender	boolean
fistBumpParticipation	int
voidMonsterKill	int
abilityUses	int
acesBefore15Minutes	int
alliedJungleMonsterKills	float
baronTakedowns	int
blastConeOppositeOpponentCount	int
bountyGold	int
buffsStolen	int
completeSupportQuestInTime	int
controlWardsPlaced	int
damagePerMinute	float
damageTakenOnTeamPercentage	float
dancedWithRiftHerald	int
deathsByEnemyChamps	int
dodgeSkillShotsSmallWindow	int
doubleAces	int
dragonTakedowns	int
legendaryItemUsed	List[int]
effectiveHealAndShielding	float
elderDragonKillsWithOpposingSoul	int
elderDragonMultikills	int
enemyChampionImmobilizations	int
enemyJungleMonsterKills	float
epicMonsterKillsNearEnemyJungler	int
epicMonsterKillsWithin30SecondsOfSpawn	int
epicMonsterSteals	int
epicMonsterStolenWithoutSmite	int
firstTurretKilled	int
firstTurretKilledTime	float
flawlessAces	int
fullTeamTakedown	int
gameLength	float

These fields provide coarse, post-processed data from Riot, typically containing aggregate statistics like total kills, wards placed, and other cumulative events.

The Live Client API (LCU) offers a more detailed alternative, but requires running the game client. While it provides event timestamps (like deaths, destroyed turrets, and objective kills), it still lacks crucial data points such as player positions and ability usage timestamps.

Expand to show LCU API json examplehttps://static.developer.riotgames.com/docs/lol/liveclientdata_sample.json

{
  "activePlayer": {
    "abilities": {
      "E": {
        "abilityLevel": 0,
        "displayName": "Molten Shield",
        "id": "AnnieE",
        "rawDescription": "GeneratedTip_Spell_AnnieE_Description",
        "rawDisplayName": "GeneratedTip_Spell_AnnieE_DisplayName"
      },
      "Passive": {
        "displayName": "Pyromania",
        "id": "AnniePassive",
        "rawDescription": "GeneratedTip_Passive_AnniePassive_Description",
        "rawDisplayName": "GeneratedTip_Passive_AnniePassive_DisplayName"
      },
      "Q": {
        "abilityLevel": 0,
        "displayName": "Disintegrate",
        "id": "AnnieQ",
        "rawDescription": "GeneratedTip_Spell_AnnieQ_Description",
        "rawDisplayName": "GeneratedTip_Spell_AnnieQ_DisplayName"
      },
      "R": {
        "abilityLevel": 0,
        "displayName": "Summon: Tibbers",
        "id": "AnnieR",
        "rawDescription": "GeneratedTip_Spell_AnnieR_Description",
        "rawDisplayName": "GeneratedTip_Spell_AnnieR_DisplayName"
      },
      "W": {
        "abilityLevel": 0,
        "displayName": "Incinerate",
        "id": "AnnieW",
        "rawDescription": "GeneratedTip_Spell_AnnieW_Description",
        "rawDisplayName": "GeneratedTip_Spell_AnnieW_DisplayName"
      }
    },
    "championStats": {
      "abilityPower": 0.00000000000000,
      "armor": 0.00000000000000,
      "armorPenetrationFlat": 0.0,
      "armorPenetrationPercent": 0.0,
      "attackDamage": 0.00000000000000,
      "attackRange": 0.0,
      "attackSpeed": 0.00000000000000,
      "bonusArmorPenetrationPercent": 0.0,
      "bonusMagicPenetrationPercent": 0.0,
      "cooldownReduction": -0.00,
      "critChance": 0.0,
      "critDamage": 0.0,
      "currentHealth": 0.0,
      "healthRegenRate": 0.00000000000000,
      "lifeSteal": 0.0,
      "magicLethality": 0.0,
      "magicPenetrationFlat": 0.0,
      "magicPenetrationPercent": 0.0,
      "magicResist": 0.00000000000000,
      "maxHealth": 0.00000000000000,
      "moveSpeed": 0.00000000000000,
      "physicalLethality": 0.0,
      "resourceMax": 0.00000000000000,
      "resourceRegenRate": 0.00000000000000,
      "resourceType": "MANA",
      "resourceValue": 0.00000000000000,
      "spellVamp": 0.0,
      "tenacity": 0.0
    },
    "currentGold": 0.00000000000000,
    "fullRunes": {
      "generalRunes": [{
          "displayName": "Electrocute",
          "id": 8112,
          "rawDescription": "perk_tooltip_Electrocute",
          "rawDisplayName": "perk_displayname_Electrocute"
        },
        {
          "displayName": "Cheap Shot",
          "id": 8126,
          "rawDescription": "perk_tooltip_CheapShot",
          "rawDisplayName": "perk_displayname_CheapShot"
        },
        {
          "displayName": "Eyeball Collection",
          "id": 8138,
          "rawDescription": "perk_tooltip_EyeballCollection",
          "rawDisplayName": "perk_displayname_EyeballCollection"
        },
        {
          "displayName": "Relentless Hunter",
          "id": 8105,
          "rawDescription": "perk_tooltip_8105",
          "rawDisplayName": "perk_displayname_8105"
        },
        {
          "displayName": "Celerity",
          "id": 8234,
          "rawDescription": "perk_tooltip_Celerity",
          "rawDisplayName": "perk_displayname_Celerity"
        },
        {
          "displayName": "Gathering Storm",
          "id": 8236,
          "rawDescription": "perk_tooltip_GatheringStorm",
          "rawDisplayName": "perk_displayname_GatheringStorm"
        }
      ],
      "keystone": {
        "displayName": "Electrocute",
        "id": 8112,
        "rawDescription": "perk_tooltip_Electrocute",
        "rawDisplayName": "perk_displayname_Electrocute"
      },
      "primaryRuneTree": {
        "displayName": "Domination",
        "id": 8100,
        "rawDescription": "perkstyle_tooltip_7200",
        "rawDisplayName": "perkstyle_displayname_7200"
      },
      "secondaryRuneTree": {
        "displayName": "Sorcery",
        "id": 8200,
        "rawDescription": "perkstyle_tooltip_7202",
        "rawDisplayName": "perkstyle_displayname_7202"
      },
      "statRunes": [{
          "id": 5008,
          "rawDescription": "perk_tooltip_StatModAdaptive"
        },
        {
          "id": 5003,
          "rawDescription": "perk_tooltip_StatModMagicResist"
        },
        {
          "id": 5003,
          "rawDescription": "perk_tooltip_StatModMagicResist"
        }
      ]
    },
    "level": 1,
    "summonerName": "Riot Tuxedo"
  },
  "allPlayers": [{
    "championName": "Annie",
    "isBot": false,
    "isDead": false,
    "items": [],
    "level": 1,
    "position": "",
    "rawChampionName": "game_character_displayname_Annie",
    "respawnTimer": 0.0,
    "runes": {
      "keystone": {
        "displayName": "Electrocute",
        "id": 8112,
        "rawDescription": "perk_tooltip_Electrocute",
        "rawDisplayName": "perk_displayname_Electrocute"
      },
      "primaryRuneTree": {
        "displayName": "Domination",
        "id": 8100,
        "rawDescription": "perkstyle_tooltip_7200",
        "rawDisplayName": "perkstyle_displayname_7200"
      },
      "secondaryRuneTree": {
        "displayName": "Sorcery",
        "id": 8200,
        "rawDescription": "perkstyle_tooltip_7202",
        "rawDisplayName": "perkstyle_displayname_7202"
      }
    },
    "scores": {
      "assists": 0,
      "creepScore": 0,
      "deaths": 0,
      "kills": 0,
      "wardScore": 0.0
    },
    "skinID": 0,
    "summonerName": "Riot Tuxedo",
    "summonerSpells": {
      "summonerSpellOne": {
        "displayName": "Flash",
        "rawDescription": "GeneratedTip_SummonerSpell_SummonerFlash_Description",
        "rawDisplayName": "GeneratedTip_SummonerSpell_SummonerFlash_DisplayName"
      },
      "summonerSpellTwo": {
        "displayName": "Ignite",
        "rawDescription": "GeneratedTip_SummonerSpell_SummonerDot_Description",
        "rawDisplayName": "GeneratedTip_SummonerSpell_SummonerDot_DisplayName"
      }
    },
    "team": "ORDER"
  }],
  "events": {
    "Events": [{
      "EventID": 0,
      "EventName": "GameStart",
      "EventTime": 0.00000000000000000
    }]
  },
  "gameData": {
    "gameMode": "CLASSIC",
    "gameTime": 0.000000000,
    "mapName": "Map11",
    "mapNumber": 11,
    "mapTerrain": "Default"
  }
}

The Live Client API provides more granular data than the Match API by including timestamps for key events like deaths, turret destructions, and objective kills. However, it still lacks crucial gameplay details such as player positions, ability usage, and combat interactions.

Several public datasets are available, with Kaggle being a prominent source. A popular Kaggle example demonstrates typical data granularity ~ per-game totals (dragon/baron kills, champion kills, wards placed), and per-minute metrics (CS/gold rates). These datasets generally rely on the RIOT Match API, inheriting its limitations.

Expand to show detailed data

gameId	blueWins	blueWardsPlaced	blueWardsDestroyed	blueFirstBlood	blueKills	blueDeaths	blueAssists	blueEliteMonsters	blueDragons	blueHeralds	blueTotalGold	blueAvgLevel	blueTotalExperience	blueTotalMinionsKilled	blueTotalJungleMinionsKilled	blueGoldDiff	blueExperienceDiff	blueCSPerMin	blueGoldPerMin	redWardsPlaced	redWardsDestroyed	redFirstBlood	redKills	redDeaths	redAssists	redEliteMonsters	redDragons	redHeralds	redTowersDestroyed	redTotalGold	redAvgLevel	redTotalExperience	redTotalMinionsKilled	redTotalJungleMinionsKilled	redGoldDiff	redExperienceDiff	redCSPerMin	redGoldPerMin
4519157822	0	28	2	1	9	6	11	0	0	0	17210	6.6	17039	195	36	643	-8	19.5	1721.0	15	6	0	6	9	8	0	0	0	0	16567	6.8	17047	197	55	-643	8	19.7	1656.7
4523371949	0	12	1	0	5	5	5	0	0	0	14712	6.6	16265	174	43	-2908	-1173	17.4	1471.2	12	1	1	5	5	2	2	1	1	1	17620	6.8	17438	240	52	2908	1173	24.0	1762.0
4521474530	0	15	0	0	7	11	4	1	1	0	16113	6.4	16221	186	46	-1172	-1033	18.6	1611.3	15	3	1	11	7	14	0	0	0	0	17285	6.8	17254	203	28	1172	1033	20.3	1728.5
4524384067	0	43	1	0	4	5	5	1	0	1	15157	7.0	17954	201	55	-1321	-7	20.1	1515.7	15	2	1	5	4	10	0	0	0	0	16478	7.0	17961	235	47	1321	7	23.5	1647.8
4436033771	0	75	4	0	6	6	6	0	0	0	16400	7.0	18543	210	57	-1004	230	21.0	1640.0	17	2	1	6	6	7	1	1	0	0	17404	7.0	18313	225	67	1004	-230	22.5	1740.4
4475365709	1	18	0	0	5	3	6	1	1	0	15899	7.0	18161	225	42	698	101	22.5	1589.9	36	5	1	3	5	2	0	0	0	0	15201	7.0	18060	221	59	-698	-101	22.1	1520.1
4493010632	1	18	3	1	7	6	7	1	1	0	16874	6.8	16967	225	53	2411	1563	22.5	1687.4	57	1	0	6	7	9	0	0	0	0	14463	6.4	15404	164	35	-2411	-1563	16.4	1446.3
4496759358	0	16	2	0	5	13	3	0	0	0	15305	6.4	16138	209	48	-2615	-800	20.9	1530.5	15	0	1	13	5	11	1	1	0	0	17920	6.6	16938	157	54	2615	800	15.7	1792.0
4443048030	0	16	3	0	7	7	8	0	0	0	16401	7.2	18527	189	61	-1979	-771	18.9	1640.1	15	2	1	7	7	5	2	1	1	0	18380	7.2	19298	240	53	1979	771	24.0	1838.0
4509433346	1	13	1	1	4	5	5	1	1	0	15057	6.8	16805	220	39	-1548	-1574	22.0	1505.7	16	2	0	5	4	4	0	0	0	0	16605	6.8	18379	247	43	1548	1574	24.7	1660.5
4452162573	0	20	3	1	4	4	6	0	0	0	15474	6.6	16611	231	28	331	-1585	23.1	1547.4	15	2	0	4	4	5	1	1	0	0	15143	7.2	18196	216	51	-331	1585	21.6	1514.3
4453038156	0	33	2	1	11	11	7	1	0	1	16695	7.0	18507	157	40	-1505	-635	15.7	1669.5	17	1	0	11	11	9	0	0	0	0	18200	7.0	19142	188	52	1505	635	18.8	1820.0
4515594785	1	18	1	1	7	1	11	1	1	0	17865	7.4	19102	238	53	3274	1659	23.8	1786.5	12	1	0	1	7	1	0	0	0	0	14591	6.8	17443	240	50	-3274	-1659	24.0	1459.1
4524924257	0	14	3	0	4	9	1	1	0	1	14979	6.6	17213	210	52	-3414	-1141	21.0	1497.9	20	3	1	9	4	11	0	0	0	0	18393	7.2	18354	229	51	3414	1141	22.9	1839.3
4516505202	1	15	3	1	4	4	4	0	0	0	15722	6.8	17896	224	51	-470	-187	22.4	1572.2	102	1	0	4	4	3	0	0	0	0	16192	7.0	18083	242	48	470	187	24.2	1619.2
4482120064	0	17	1	0	3	7	3	0	0	0	15015	6.8	16974	209	53	-1996	-1804	20.9	1501.5	18	3	1	7	3	13	0	0	0	0	17011	7.2	18778	237	51	1996	1804	23.7	1701.1
4523758462	1	14	1	1	10	2	8	0	0	0	19733	7.6	20862	263	56	5228	3378	26.3	1973.3	13	2	0	2	10	2	1	1	0	0	14505	6.8	17484	210	64	-5228	-3378	21.0	1450.5
4503636905	0	43	3	0	3	7	3	1	0	1	14852	6.8	16888	203	54	-1975	-1345	20.3	1485.2	17	14	1	7	3	6	0	0	0	0	16827	6.8	18233	218	53	1975	1345	21.8	1682.7
4486384947	1	21	4	1	5	4	11	0	0	0	16282	6.8	17378	213	49	882	512	21.3	1628.2	19	3	0	4	5	3	2	1	1	0	15400	6.6	16866	228	52	-882	-512	22.8	1540.0
4457103291	0	11	3	0	5	9	5	0	0	0	14994	7.0	17924	188	48	-3155	-2773	18.8	1499.4	15	1	1	9	5	9

Some third-party vendors partnered with RIOT offer additional APIs. For instance, the Overwolf API combines features from both RIOT APIs but doesn’t provide significant additional granularity beyond the official sources.

Expand to show detailed data

{
  "info": {
    "live_client_data": {
      "active_player": {
        "abilities": {
          "E": {
            "abilityLevel": 0,
            "displayName": "Stacked Deck",
            "id": "CardmasterStack",
            "rawDescription": "GeneratedTip_Spell_CardmasterStack_Description",
            "rawDisplayName": "GeneratedTip_Spell_CardmasterStack_DisplayName"
          },
          "Passive": {
            "displayName": "Loaded Dice",
            "id": "SecondSight",
            "rawDescription": "GeneratedTip_Passive_SecondSight_Description",
            "rawDisplayName": "GeneratedTip_Passive_SecondSight_DisplayName"
          },
          "Q": {
            "abilityLevel": 0,
            "displayName": "Wild Cards",
            "id": "WildCards",
            "rawDescription": "GeneratedTip_Spell_WildCards_Description",
            "rawDisplayName": "GeneratedTip_Spell_WildCards_DisplayName"
          },
          "R": {
            "abilityLevel": 0,
            "displayName": "Destiny",
            "id": "Destiny",
            "rawDescription": "GeneratedTip_Spell_Destiny_Description",
            "rawDisplayName": "GeneratedTip_Spell_Destiny_DisplayName"
          },
          "W": {
            "abilityLevel": 0,
            "displayName": "Pick a Card",
            "id": "PickACard",
            "rawDescription": "GeneratedTip_Spell_PickACard_Description",
            "rawDisplayName": "GeneratedTip_Spell_PickACard_DisplayName"
          }
        },
        "championStats": {
          "abilityHaste": 0,
          "abilityPower": 0,
          "armor": 21,
          "armorPenetrationFlat": 0,
          "armorPenetrationPercent": 1,
          "attackDamage": 25,
          "attackRange": 0,
          "attackSpeed": 0.6510000228881836,
          "bonusArmorPenetrationPercent": 1,
          "bonusMagicPenetrationPercent": 1,
          "cooldownReduction": 0,
          "critChance": 0,
          "critDamage": 0,
          "currentHealth": 534,
          "healthRegenRate": 0,
          "lifeSteal": 0,
          "magicLethality": 0,
          "magicPenetrationFlat": 0,
          "magicPenetrationPercent": 1,
          "magicResist": 30,
          "maxHealth": 534,
          "moveSpeed": 330,
          "physicalLethality": 0,
          "resourceMax": 333,
          "resourceRegenRate": 0,
          "resourceType": "MANA",
          "resourceValue": 333,
          "spellVamp": 0,
          "tenacity": 0
        },
        "currentGold": 0,
        "fullRunes": {
          "generalRunes": [
            {
              "displayName": "Dark Harvest",
              "id": 8128,
              "rawDescription": "perk_tooltip_DarkHarvest",
              "rawDisplayName": "perk_displayname_DarkHarvest"
            },
            {
              "displayName": "Taste of Blood",
              "id": 8139,
              "rawDescription": "perk_tooltip_TasteOfBlood",
              "rawDisplayName": "perk_displayname_TasteOfBlood"
            },
            {
              "displayName": "Eyeball Collection",
              "id": 8138,
              "rawDescription": "perk_tooltip_EyeballCollection",
              "rawDisplayName": "perk_displayname_EyeballCollection"
            },
            {
              "displayName": "Ravenous Hunter",
              "id": 8135,
              "rawDescription": "perk_tooltip_RavenousHunter",
              "rawDisplayName": "perk_displayname_RavenousHunter"
            },
            {
              "displayName": "Presence of Mind",
              "id": 8009,
              "rawDescription": "perk_tooltip_PresenceOfMind",
              "rawDisplayName": "perk_displayname_PresenceOfMind"
            },
            {
              "displayName": "Coup de Grace",
              "id": 8014,
              "rawDescription": "perk_tooltip_CoupDeGrace",
              "rawDisplayName": "perk_displayname_CoupDeGrace"
            }
          ],
          "keystone": {
            "displayName": "Dark Harvest",
            "id": 8128,
            "rawDescription": "perk_tooltip_DarkHarvest",
            "rawDisplayName": "perk_displayname_DarkHarvest"
          },
          "primaryRuneTree": {
            "displayName": "Domination",
            "id": 8100,
            "rawDescription": "perkstyle_tooltip_7200",
            "rawDisplayName": "perkstyle_displayname_7200"
          },
          "secondaryRuneTree": {
            "displayName": "Precision",
            "id": 8000,
            "rawDescription": "perkstyle_tooltip_7201",
            "rawDisplayName": "perkstyle_displayname_7201"
          },
          "statRunes": [
            { "id": 5008, "rawDescription": "perk_tooltip_StatModAdaptive" },
            { "id": 5008, "rawDescription": "perk_tooltip_StatModAdaptive" },
            { "id": 5003, "rawDescription": "perk_tooltip_StatModMagicResist" }
          ]
        },
        "level": 1,
        "summonerName": "Sh4rgaas"
      }
    }
  },
  "feature": "live_client_data"
}

Using The Datasets

Startups currently use these datasets to provide player analysis and insights. Access to more granular data would enable significantly deeper and more sophisticated analysis tools.

Example of overlays provided in game (Image source: blitz.gg)

Blitz.gg that give timers and pathing for players which is pretty cool!

AI voice suggestions (Image source: backseat.gg)

Some startups give AI suggestions to help with gameplay.

U.GG post game analysis (Image source: u.gg)

Others give deep analysis to one’s gameplay.

At 10000 Feet

Overview

Here’s a high level overview of the system, which I’ll describe each part in the following sections

Overview of the system

Replays And Packets

The process begins by fetching encrypted replay files from a data source. These replays contain compressed gameplay state data that the League of Legends client uses to reconstruct game sessions for viewing. After decryption and decompression, the system processes individual encrypted packets through an emulator, which converts them into structured JSON format. The complete process and replay format are detailed in the rofl format section.

Emulator And Decryption Background

The emulatorEmulator is software that runs an architecture. It’s used here to run specific parts of league of legends binary without running the entire game (useful for debugging and seeing what variables are important). An example of an emulator is unicorn engine inspects and debugs the League of Legends binary to handle encrypted packets. Packet data is decrypted when accessed and immediately re-encrypted afterwardYou also might say: you still don’t need an emulator because you can figure out how decryption works and write your own code to decrypt the entire packet buffer without running an emulator.
League of legends is an actively updated game and thus the decryption changes often, so the emulator makes debugging much easier
.

Here’s a simple Python example demonstrating packet decryption using XOR:

TakeDamagePacket has two fields we're interested in extracting information out of

The TakeDamage packet contains two encrypted fields: the target object and damage amount. For this example, we’ll use a simple XOR operation with 0x69 to decrypt these values.

Expand to show code (in-case above python doesn't show)

# Champion has hp and other fields
class Champion:
  def __init__(self, name, id, hp):
    self.name = name
    self.id = id
    self.hp = hp
    
  def __str__(self):
    return f'{self.name}: id [{self.id}] has {self.hp} hp remaining\n'
    
# Encryption/Decryption using XOR -- not real thing
KEY=0x69
def decrypt(data: bytes) -> bytes:
    return bytes(data[i] ^ KEY for i in range(len(data)))

def encrypt(data: bytes) -> bytes:
    return bytes(data[i] ^ KEY for i in range(len(data)))

class TakeDamagePacket:
  def __init__(self, buffer):
    self.buffer = buffer
  
  def target_id_bytes(self) -> bytes:
    return self.buffer[12:16]
    
  def damage_bytes(self) -> bytes:
    return self.buffer[16:20]

# Helper functions
def convert_bytes_to_int(data: bytes) -> int:
  return int.from_bytes(data, byteorder='big')

name = 'Ezreal'
id = 10
hp = 1000
ezreal = Champion(name, id, hp)
print(ezreal)

# example data from rofl file...
buffer = b'\xeb\x9d\x65\x38\xd3\x0e\x47\xfa\xa1\x29\x13\x9b\x69\x69\x69\x63\x69\x69\x68\x45\x75\xe2\x3e\x9f\x50\xfb\xa8\x26\x1f\xc7\x47\x47\x47'

packet = TakeDamagePacket(buffer)

# First, check if the target is ezreal

# Fetch the offset in packet to target id
encrypted_target_id_bytes = packet.target_id_bytes()
print(f'Encrypted target id bytes[' + ''.join(f'\\x{b:02x}' for b in encrypted_target_id_bytes) + ']')

# Decrypt the bytes and convert to int
target_id_bytes = decrypt(encrypted_target_id_bytes)
print(f'Decrypted target id bytes[' + ''.join(f'\\x{b:02x}' for b in target_id_bytes) + ']')

target_id = convert_bytes_to_int(target_id_bytes)
print(f'\nTarget id [{target_id}] == Ezreal id [{ezreal.id}]?')

if target_id == ezreal.id:
  # After access, remove variable and encrypt bytes
  del target_id
  target_id_bytes = encrypt(target_id_bytes)
  print('Damage target is Ezreal\n')
  print(f'Reencrypt target id bytes[' + ''.join(f'\\x{b:02x}' for b in target_id_bytes) + ']')

  # Ezreal took some damage
  encrypted_damage_bytes = packet.damage_bytes()
  print(f'Encrypted damage bytes[' + ''.join(f'\\x{b:02x}' for b in encrypted_damage_bytes) + ']')
  damage_bytes = decrypt(encrypted_damage_bytes)
  print(f'Decrypted damage bytes[' + ''.join(f'\\x{b:02x}' for b in damage_bytes) + ']')
  damage = convert_bytes_to_int(damage_bytes)

  print(f'\nOh no, id [{id}] (Ezreal) took {damage} damage!')
  ezreal.hp -= damage
  print(ezreal)

  del damage
  damage_bytes = encrypt(damage_bytes)
  print(f'Reencrypt damage bytes[' + ''.join(f'\\x{b:02x}' for b in damage_bytes) + ']')

Each packet field undergoes a decrypt-access-release cycle. In the TakeDamagePacket example, when checking if Ezreal is the target, the id field is decrypted, compared, and immediately re-encrypted and deleted. Once the comparison is complete, the decrypted reference is deleted. The same process applies to the damage field. This pattern (unintentionally) is a good defense against memory scannersMemory scanners such as Cheat Engine allows reverse engineers to find determinstic values in memory and be able to trace them back to the code that modifies or reads the values. By deleting the value, it makes it harder for reverse engineers to find important values and thus, important sections of code that produces that value.

Replays

Data sources

Replays files are available through the official league client or third-party websites such as op.gg or Mobalytics. The replays end in the .rofl Interesting name to choose for the file extension.

How Replays Are Viewed

Players can watch replays through either the League client interface or command line.

Example of how to view a replay through the client

ROFL Replay

While the ROFL replay format is undocumented, community efforts have reverse-engineered its structure. Here’s a simplified version of the format:

The ROFL file format

The key components are the encrypted chunks. Each chunk contains packet metadata (time, type, size) and encrypted packet data. While the timestamps tell us when events occur, we still need to figure out what the packets actually do.

Visually Tracing How Packets Are Used

To understand packet structures, we can analyze what the client must process during replay playback. Let’s examine a specific game frame:

Example of viewing a replay (Image source: League Of Legends)

Frame Analysis (Timestamp 4:02):

There are four champions in the image.

Jhin, champion with blue hp on bottom left
Braum,
Thresh,
Ezreal,

Two key actions are occurring:

Jhin is taking damage (as indicated by a small red indicator of his health bar).
Ezreal and Thresh (red health bars) moving leftward

We should expect at least two packets:

Damage Packet
- Jhin’s id
- float value for damage taken
Movement packet
- Ezreal or Thresh’s id
- X, Y, Z coordinates
- Start and end positions

These packets should align with the timestamp and visible game state. The next section will explore their detailed implementation.

Reverse Engineering

Some Background

League of Legends runs on a custom game engine developed in 2009. At that time, Unreal Engine was the primary commercial option, as Unity hadn’t yet emerged. This means we’re working with a closed-source game engine.

While RIOT’s technical blogs offer insights into their enginePlanning a future game engine for league which discusses the issues with the current game engineGraphics pipeline, they don’t reveal the implementation details we need.

Our analysis focuses on the I/O and networking layers where packet processing occurs. We’re particularly interested in the pipelineIn this context, a pipeline as a process that takes an input, transforms the input and splits out an output that transforms raw packet data into game state:

A stream of packets flows through a game engine to update game state

Packet Processing

Packets are done in 3 steps:

Memory Allocation (Packet::Packet)
- Sets up packet metadata
- Allocates space for packet data
Deserialization (DeserializePacket)
- Converts chunk data to in-memory representation
- Extracts some information from the chunk data as packet fields
Game State Update (UsePacket)
- Applies packet data to game state

GameState game_state;

int ParseBufferAndUpdateGameState(char* buffer, uint64_t buffer_size)
{
  // Read the packet type from the buffer
  uint16_t packet_type = *(uint16_t*)buffer;

  // Allocate an empty packet data structure
  Packet packet(packet_type);

  // Parse buffer into packet data structure
  int status_ok = packet.deserialize(buffer, buffer_size);

  // Update game state with packet
  if (status_ok)
    game_state.use_packet(&packet);

  // At this point, the packet is deallocated,
  // invoking the destructor Packet::~Packet()
  return status_ok;
}

Here’s an example of TakeDamagePacket’s DeserializePacket function

enum PacketType : uint16_t
{
  TAKE_DAMAGE = 0x2385,
};

struct TakeDamagePacket : public Packet
{
  PacketType packet_type;
  char* data; // To be parsed later by game state (this contains the source and target id and how much damage was applied)
  int expected_size = 12; // Should expect this left in the buffer
  bool flag1 = false; // Other misc data parsed
}

virtual bool TakeDamagePacket::deserialize(char* buffer, uint64_t buffer_size)
{
  // Skip the packet type from buffer
  char* start_buffer = buffer + sizeof(PacketType);

  // Check if buffer has enough size to read into this packet
  if (buffer_size - sizeof(PacketType) < expected_size)
  {
    return false;
  }

  // Copy the data over
  memcpy(this->data, start_buffer, this->expected_size);

  // See if certain flag is set
  if (start_buffer[3] == 79)
    this->flag1 = true;

  return true;
}

Click to expand to see how we got to packet decompiled code

The first decompiled code snippet below shows the parent function that takes in the raw chunk data, instantiates a specific packet object using a factory class to make a packet object based on its type. At this point, the packet object doesn’t hold any information from the chunk data, so it deserializes the packet data called from its vtable ((*(_QWORD *)packet + 8LL)(packet, &v19, (char *)v6 + v4), which is DeserializePacket). Now, the packet object contains metadata and the chunk data. Then, it uses it by calling from its vtable again ((*(_QWORD *)v11 + 16LL))(v11, packet_, v20))), which calls UsePacket to update the game state.

__int64 __fastcall CallPacketsInOrder(__int64 a1, __int16 **raw_data, __int64 a3, double a4)
{
  __int64 packet; // [rsp+18h] [rbp-38h] BYREF
  __int16 packet_type; // [rsp+24h] [rbp-2Ch] BYREF

  packet_type = **raw_data;
  Packet::Packet(&packet, (unsigned __int16)packet_type);// Packet::Packet(&packet, packet_type);
  if ( !packet )
    return 0;
  v7 = (*(__int64 (__fastcall **)(__int64, __int16 **, char *))(*(_QWORD *)packet + 8LL))(packet, &v19, (char *)v6 + v4);// Calls Packet::Deserialize from packet's vtable
  packet_ = packet;
  *(double *)(a1 + 688) = v20;
  ...
          v11 = *(_QWORD *)(*(_QWORD *)(a1 + 696) + 8 * v10);
          v12 = (*(__int64 (__fastcall **)(__int64, __int64, double))(*(_QWORD *)v11 + 16LL))(v11, packet_, v20);// Calling UsePacket from Packet's VTable
  ...
  if ( packet )
    (*(void (__fastcall **)(__int64))(*(_QWORD *)packet + 32LL))(packet); // Call destructor to packet
  return v7;
}

Decompiled Packet Code

Expand to show decompiled code for allocating memory

int64 *fastcall Packet::Packet(PacketStruct *packet, int packet_type)
{
  switch ( a2 )
  {
    case 2:
      v4 = operator new(0xEuLL);
      *(_WORD *)(v4 + 8) = 2;
      *(_DWORD *)(v4 + 10) = 0;
      v5 = off_101BECE88;
      goto LABEL_639;
    case 3:
      v101 = operator new(0x28uLL);
      *(_WORD *)(v101 + 8) = 3;
      *(_DWORD *)(v101 + 10) = 0;
      *(_QWORD *)v101 = off_101BED530;
      *(_QWORD *)(v101 + 16) = 0LL;
      *(_QWORD *)(v101 + 24) = 0LL;
      *(_WORD *)(v101 + 32) = -22610;
      *packet = v101;
      return packet;
    case 5:
      v4 = operator new(0x20uLL);
      *(_WORD *)(v4 + 8) = 5;
      *(_DWORD *)(v4 + 10) = 0;
      v31 = off_101BEA880;
      goto LABEL_578;
    case 9:
      v4 = operator new(0x20uLL);
      *(_WORD *)(v4 + 8) = 9;
      *(_DWORD *)(v4 + 10) = 0;
      v31 = off_101BE9618;
      goto LABEL_578;
    case 10:
      v4 = operator new(0xEuLL);
      *(_WORD *)(v4 + 8) = 10;
      *(_DWORD *)(v4 + 10) = 0;
      v5 = off_101BECF80;
      goto LABEL_639;
    case 11:
      v381 = operator new(0x1CuLL);
      *(_WORD *)(v381 + 8) = 11;
      *(_DWORD *)(v381 + 10) = 0;
      *(_QWORD *)v381 = off_101BEDDC0;
      *(_QWORD *)(v381 + 16) = 0xE7E7E7E7E7E7E7E7LL;
      *(_DWORD *)(v381 + 24) = -404232217;
      *packet = v381;
      return packet;
    case 12:
      v6 = operator new(0x38uLL);
      *(_WORD *)(v6 + 8) = 12;
      *(_DWORD *)(v6 + 10) = 0;
      *(_QWORD *)v6 = off_101BEEA98;
      *(_QWORD *)(v6 + 16) = off_101BEEA60;
      *(_DWORD *)(v6 + 24) = 320017171;
      *(_QWORD *)(v6 + 32) = 0xBBBBBBBBBBBBBBBBLL;
      *(_QWORD *)(v6 + 40) = 0xCECECECE2E2E2E2ELL;
      *(_DWORD *)(v6 + 48) = -235802127;
      *packet = v6;
      return a1;

Expand to show decompiled code for decrypting packet

char __fastcall DeserializePacket(Packet* packet, unsigned __int64 *a2, unsigned __int64 a3)
{
  v3 = *a2;
  if ( a3 - *a2 < 6 )
    return 0;
  v5 = a2;
  v6 = packet;
  *(_WORD *)(packet + 8) = *(_WORD *)v3;
  *(_DWORD *)(packet + 10) = *(_DWORD *)(v3 + 2);
  v7 = (__int16 *)*a2;
  v8 = (_BYTE *)(*a2 + 6);
  *a2 = (unsigned __int64)v8;
  if ( a3 - (unsigned __int64)v8 < 3 )
    return 0;
  *a2 = (unsigned __int64)v7 + 9;
  v10 = 86;
  if ( (v7[3] & 8) == 0 )
    v10 = 114;
  *(_BYTE *)(packet + 14) = v10;
  switch ( (*((unsigned __int8 *)v7 + 7) >> 4) & 7 )
  {
    case 0:
      *(_DWORD *)(packet + 16) = -1027423577;
      goto LABEL_20;
    case 1:
      if ( (unsigned __int8)sub_100D8D2F0(packet + 16, a2, a3) )
        goto LABEL_20;
      return 0;
    case 2:
      if ( (unsigned __int8)sub_100D8CFB0(packet + 16, a2, a3) )
        goto LABEL_20;
      return 0;
    case 3:
      if ( (unsigned __int8)sub_100D8D490(packet + 16, a2, a3) )
        goto LABEL_20;
      return 0;
    case 4:
      if ( (unsigned __int8)sub_100D8D150(packet + 16, a2, a3) )
        goto LABEL_20;
      return 0;
    case 5:
      *(_DWORD *)(packet + 16) = -1027423550;
      goto LABEL_20;
    case 6:
      *(_DWORD *)(packet + 16) = -1027423610;
      goto LABEL_20;
    case 7:
      *(_DWORD *)(packet + 16) = 1852730990;

Expand to show decompiled code for updating game state

__int64 __fastcall UsePacket(
        __int64 a1,
        __int64 a2,
        __m128 a3,
        __m128 a4,
        double a5,
        double a6,
        double a7,
        double a8,
        double a9)
{
 v9 = *(double *)&a2;
  v10 = 1;
  switch ( *(_WORD *)(a2 + 8) )
  {
    case 0xA:
      sub_100384590(d);
      return v10;
    case 0xE:
      v564 = *(_DWORD *)(a2 + 16);
      v565 = ((187
             - ((65793
               * ((2050
                 * ((unsigned __int8)((65793
                                     * ((2050 * (unsigned __int8)(v564 - 86)) & 0x22110 | (32800
                                                                                         * (unsigned __int8)(v564 - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x22110 | (32800 * ((unsigned __int8)((65793 * ((2050 * (unsigned __int8)(v564 - 86)) & 0x22110 | (32800 * (unsigned __int8)(v564 - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x88440u)) >> 16)) ^ 0xED)
           + 244;
      v566 = (unsigned __int8)((65793
                              * ((2050 * (unsigned __int8)(HIBYTE(v564) - 86)) & 0x22110 | (32800
                                                                                          * (unsigned __int8)(HIBYTE(v564) - 86)) & 0x88440u)) >> 16) ^ 0xE6;
      v567 = c(
               d,
               ((unsigned __int16)(((-17664
                                   - (((65793
                                      * ((2050
                                        * ((unsigned __int8)((65793
                                                            * ((2050 * (unsigned __int8)(BYTE1(v564) - 86)) & 0x22110 | (32800 * (unsigned __int8)(BYTE1(v564) - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x22110 | (32800 * ((unsigned __int8)((65793 * ((2050 * (unsigned __int8)(BYTE1(v564) - 86)) & 0x22110 | (32800 * (unsigned __int8)(BYTE1(v564) - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x88440u)) >> 8) & 0xFF00)) ^ 0xED00)
                                 - 3072)
              + ((-1157627904 - ((16843008 * ((2050 * v566) & 0x22110 | (32800 * v566) & 0x88440)) & 0xFF000000)) ^ 0xED000000 | (((12255232 - ((65793 * ((2050 * ((unsigned __int8)((65793 * ((2050 * (unsigned __int8)(BYTE2(v564) - 86)) & 0x22110 | (32800 * (unsigned __int8)(BYTE2(v564) - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x22110 | (32800 * ((unsigned __int8)((65793 * ((2050 * (unsigned __int8)(BYTE2(v564) - 86)) & 0x22110 | (32800 * (unsigned __int8)(BYTE2(v564) - 86)) & 0x88440u)) >> 16) ^ 0xE6)) & 0x88440)) & 0xFF0000)) ^ 0xED0000) + 15990784) & 0xFF0000)
              - 201326592) | (unsigned __int8)v565);
      if ( v567 )
      {
        v568 = sub_1002F12F0(d, *(unsigned int *)(v567 + 16));
        if ( v568 )
        {
          v569 = v568;
          v570 = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)v568 + 16LL))(v568);
          v571 = sub_1000EE9E0();
          v = sub_123(v570, v569, v571);
          if ( v )
            (*(void (__fastcall **)(__int64, _QWORD))(*(_QWORD *)v + 2960LL))(
              v,
              (unsigned __int8)((((65793
                                 * ((2050
                                   * (unsigned __int8)((65793
                                                      * ((2050 * *(unsigned __int8 *)(a2 + 14)) & 0x22110 | (32800 * *(unsigned __int8 *)(a2 + 14)) & 0x88440u)) >> 16)) & 0x22110 | (32800 * (unsigned __int8)((65793 * ((2050 * *(unsigned __int8 *)(a2 + 14)) & 0x22110 | (32800 * *(unsigned __int8 *)(a2 + 14)) & 0x88440u)) >> 16)) & 0x88440u)) >> 15) & 0xAA)
                              + (((65793
                                 * ((2050
                                   * (unsigned __int8)((65793
                                                      * ((2050 * *(unsigned __int8 *)(a2 + 14)) & 0x22110 | (32800 * *(unsigned __int8 *)(a2 + 14)) & 0x88440u)) >> 16)) & 0x22110 | (32800 * (unsigned __int8)((65793 * ((2050 * *(unsigned __int8 *)(a2 + 14)) & 0x22110 | (32800 * *(unsigned __int8 *)(a2 + 14)) & 0x88440u)) >> 16)) & 0x88440u)) >> 17) & 0x55)
                              + 51));
        }
      }
      return v10;
    case 0x14:
      v27 = __ROL1__(*(_BYTE *)(a2 + 14), 5);
      v28 = __ROL1__(((v27 >> 1) & 0x55 | (2 * v27) & 0xAA) + 50, 3);
      sub_100384E20(
        d,
        (unsigned __int8)((65793 * ((2050 * v28) & 0x22110 | (32800 * v28) & 0x88440u)) >> 16) != 0);
      return v10;
    case 0x15:
      sub_101153C40();
      if ( *(_DWORD *)(a2 + 24) )
      {
        v930 = *(_QWORD *)(a2 + 16);
        v931 = 60LL * *(unsigned int *)(a2 + 24);
        do
        {
          sub_100170F20(v930);
          v930 += 60LL;
          v931 -= 60LL;
        }
        while ( v931 );
      }
      sub_1001BCB10(0LL);
      return v10;

As you may notice, all three functions is composed of a big switch statement: it’s because it runs each case statement based on the packet type.

We’re interested in how the packet is used to update game state. Let’s take a look into that.

Reading The Packet

After deserialization, we still can’t directly read the packet data. The packet object maintains encrypted chunk data, only decrypting fields when they’re accessed - as explained in the emulator and decryption section. Let’s examine how this works in practice by analyzing the TakeDamage function, focusing on two key pieces of information that we want to extract: who is taking damage and how much damage they’ve taken.

Click to expand to view the raw decompiled TakeDamage function

char __fastcall TakeDamage(__int64 idk, __int64 packet)
{
  v3 = *(_DWORD *)(packet + 16);
  v4 = HIWORD(v3);
  v5 = HIBYTE(v3);
  LOBYTE(v3) = __ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v3, 4)], 4);
  v6 = byte_1019891E0[(unsigned __int8)(__ROL1__(
                                          ~byte_1019891E0[(unsigned __int8)~__ROL1__(BYTE1(*(_DWORD *)(packet + 16)), 4)],
                                          4)
                                      - 54)];
  v7 = 65793 * ((2050 * v6) & 0x22110 | (32800 * v6) & 0x88440);
  v8 = byte_1019891E0[(unsigned __int8)(__ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v4, 4)], 4) - 54)];
  v9 = (2050 * v8) & 0x22110 | (32800 * v8) & 0x88440;
  v10 = byte_1019891E0[(unsigned __int8)(__ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v5, 4)], 4) - 54)];
  target_object = GetInternalObjectFromID(
                    qword_101C60850,
                    (16843008 * ((2050 * v10) & 0x22110 | (32800 * v10) & 0x88440)) & 0xFF000000 | (65793 * v9) & 0xFF0000 | (v7 >> 8) & 0xFF00 | (unsigned __int8)((65793 * ((2050 * byte_1019891E0[(unsigned __int8)(v3 - 54)]) & 0x22110 | (32800 * byte_1019891E0[(unsigned __int8)(v3 - 54)]) & 0x88440u)) >> 16));
  ...
  source_object = GetInternalObjectFromID(
                    qword_101C60850,
                    ((unsigned __int8)(__ROR1__(
                                         byte_1019891E0[byte_1019891E0[(unsigned __int8)(-113
                                                                                       - HIBYTE(*(_DWORD *)(packet + 36)))]],
                                         1)
                                     + 89) << 24) | ((unsigned __int8)(__ROR1__(
                                                                         byte_1019891E0[byte_1019891E0[(unsigned __int8)(-113 - HIWORD(*(_DWORD *)(packet + 36)))]],
                                                                         1)
                                                                     + 89) << 16) | ((unsigned __int8)(__ROR1__(byte_1019891E0[byte_1019891E0[(unsigned __int8)(-113 - BYTE1(*(_DWORD *)(packet + 36)))]], 1) + 89) << 8) | (unsigned int)(unsigned __int8)(__ROR1__(byte_1019891E0[byte_1019891E0[(unsigned __int8)(-113 - *(_DWORD *)(packet + 36))]], 1) + 89));
  ...
  v46 = (HIWORD(*(_DWORD *)(packet + 24)) ^ 0xF2) + 9;
  (*(void (__fastcall **)(__int64, double))(*(_QWORD *)obj + 1920LL))(
    obj,
    *(double *)_mm_cvtsi32_si128(__ROL1__(      // float that represents damage taken
                                   ((unsigned __int8)((*(_DWORD *)(packet + 24) ^ 0xF2) + 9) >> 1) & 0x55 | (2 * ((*(_DWORD *)(packet + 24) ^ 0xF2) + 9)) & 0xAA,
                                   5) ^ 4 | ((__ROL1__(
                                                ((unsigned __int8)((BYTE1(*(_DWORD *)(packet + 24)) ^ 0xF2) + 9) >> 1) & 0x55 | (2 * ((BYTE1(*(_DWORD *)(packet + 24)) ^ 0xF2) + 9)) & 0xAA,
                                                5) ^ 4) << 8) | ((__ROL1__((v46 >> 1) & 0x55 | (2 * v46) & 0xAA, 5) ^ 4) << 16) | ((unsigned __int8)(__ROL1__(((unsigned __int8)((HIBYTE(*(_DWORD *)(packet + 24)) ^ 0xF2) + 9) >> 1) & 0x55 | (2 * ((HIBYTE(*(_DWORD *)(packet + 24)) ^ 0xF2) + 9)) & 0xAA, 5) ^ 4) << 24)).i64);
  ...
  return 1;
}

Tthe second parameter to GetInternalObjectFromID is the id that are the source and target network ids. The functions converts the network ids to the game engine objects by looking through a map of [id -> object pointer]. Then, to apply damage to the target object, there’s a call to obj, using its vtable at offset 0x1920 to apply damage. The conversion using _mm_cvtsi32_si128 just puts the bytes into the xmm0 register because the function takes in a float.

The code can be boiled down to this

void TakeDamage(__int64 idk, char* packet_buffer)
{
  target_object = GetInternalObjectFromID(DECRYPT_VALUE1((uint32_t)packet_buffer + 16));
  source_object = GetInternalObjectFromID(DECRYPT_VALUE2((uint32_t)packet_buffer + 36));

  target_object->take_damage(DECRYPT_FLOAT1((float)packet + 24));
}

GetInternalObjectFromID converts an id (uint32_t) to an internal object in the game. DECRYPT_VALUE1 and DECRYPT_VALUE2 are different inlined functions that decrypts a value that I’ll explain in the next section.

Below is the an emulator for GetInternalObjectFromID for the target object. Feel free to try to it out. In the end, you should expect the value 265086324 in RSI register, which matches the second argument to GetInternalObjectFromID. RDI is the first argument, which is a global map of id -> object and RSI the is the second argument containing the network id.

Address	Bytes	Assembly

Address	Bytes	ASCII

Reg	Value	Number

How Does The Decryption Work

Let’s break down this decompiled code: target_object = GetInternalObjectFromID(DECRYPT_VALUE1((uint32_t)packet_buffer + 16));? At first glance it’s hard to read, so let’s examine what’s happening step by step:

  v3 = *(_DWORD *)(packet + 16);
  v4 = HIWORD(v3);
  v5 = HIBYTE(v3);
  LOBYTE(v3) = __ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v3, 4)], 4);
  v6 = byte_1019891E0[(unsigned __int8)(__ROL1__(
                                          ~byte_1019891E0[(unsigned __int8)~__ROL1__(BYTE1(*(_DWORD *)(packet + 16)), 4)],
                                          4)
                                      - 54)];
  v7 = 65793 * ((2050 * v6) & 0x22110 | (32800 * v6) & 0x88440);
  v8 = byte_1019891E0[(unsigned __int8)(__ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v4, 4)], 4) - 54)];
  v9 = (2050 * v8) & 0x22110 | (32800 * v8) & 0x88440;
  v10 = byte_1019891E0[(unsigned __int8)(__ROL1__(~byte_1019891E0[(unsigned __int8)~__ROL1__(v5, 4)], 4) - 54)];
  target_object = GetInternalObjectFromID(
                    qword_101C60850,
                    (16843008 * ((2050 * v10) & 0x22110 | (32800 * v10) & 0x88440)) & 0xFF000000 | (65793 * v9) & 0xFF0000 | (v7 >> 8) & 0xFF00 | (unsigned __int8)((65793 * ((2050 * byte_1019891E0[(unsigned __int8)(v3 - 54)]) & 0x22110 | (32800 * byte_1019891E0[(unsigned __int8)(v3 - 54)]) & 0x88440u)) >> 16));

The decryption process uses a 255-byte lookup table combined with arithmetic operations. It’s an enhanced version of XOR obfuscation: instead of a simple XOR, each byte goes through multiple transformations:

Initial lookup in the 255-byte table
Series of arithmetic operations (multiply, add, subtract)
Additional table lookups
Final transformation to produce the decrypted value

Boiling this down, it looks like this:

  char lookup_table[255] = { 0x56, 0x82, 0xDC, 0x83, 0x02, 0x8F, 0x29, 0x35, 0x04, 0x21, 0x71, 0x79, 0x9E, 0x92, 0x7F, ... };
  uint32_t enc_data = *(uint32_t*)((char*)packet + 16);
  uint8_t lookup_byte1 = enc_data & 0xFF;
  uint8_t lookup_byte2 = (enc_data >> 8) & 0xFF;
  uint8_t lookup_byte3 = (enc_data >> 16) & 0xFF;

  // Here are the operations for 1 byte
  uint8_t lookup_byte3_intermediate = lookup_table[~lookup_byte3 >>= 4];
  // lookup again
  uint8_t value3 = lookup_table[lookup_byte3_intermediate];
  uint32_t transform_value3 = (16843008 * (2050 * value3) & 0x22110) | (value3 * 32800);
  // Extract the first byte out of it to place in our resulting value (network id)
  uint32_t high_byte = transform_value3 & 0xFF000000

  // Repeat for other bytes...

  uint32_t network_id = 0; // To fill in
  network_id |= high_byte;

Here’s the emulator for taking damage target_object->take_damage(DECRYPT_FLOAT1((float)packet + 24)); where performs the decryption and finally the damage value appears. At the end, you should get 000000003e874000 in RAX which is 0.26416015625

Address	Bytes	Assembly

Address	Bytes	ASCII

Reg	Value	Number

So what’s the catch?

While it might seem feasible to reimplement these functions in Python without running the client, several factors make this approach impracticalActually, a more seasoned reverse engineer may know better tools for doing such things.:

the encryption and decryption methods changes
- lookup table might be different
- the operations after the lookup might be different
- there’s not a single global lookup table
  - a different lookup table associated with every function
the chunk layout (fields) get shuffled per patch
- What may be in 0x10 offset of the packet may now be at offset 0x38
the packet types switch up each patch

“Beating” The Obfuscation

Instead of reverse-engineering each encryption function, I chose a pragmatic approach: reading values directly from registers. This method requires an emulator, which typically runs rather slow (unicorn emulator). Instead, I developed an “exception emulator” that runs natively (detailed in the performance section).

For example, in the TakeDamage function, I intercept two key values:

The network id from the RSI register before target_object creation
The damage value from RAX before it’s applied

This interception uses a trampoline hookIf you’re not familiar with trampoline hook, it’s essentially a callback that you add or similar to breakpointing in gdb and editing the program’s variables before continuing:

Visually demonstrating what a trampoline hook does (Image source: codereversing.com)

Click to expand to view how the trampoline hook works

The process has two parts:

Save original bytes and insert jump:

movd xmm0,eax
jmp FFF90000 # Insert my trampoline hook here

Create trampoline:

Save all register states
Call MY_FUNCTION to expose registers as a struct
Allow high-level language access to register values
Restore registers
Execute original instructions
Return to normal code flow

push rsp                               
pushfq                                 
...     
push r8                                
push r9                                
push r10                               
push r11                               
push r12                               
push r13                               
push r14                               
push r15                               
push rbx                               
push rcx                               
push rdx                               
push rsi                               
push rdi                               
push rbp                               
sub rsp,40                             
movaps xmmword ptr ss:[rsp],xmm0       
movaps xmmword ptr ss:[rsp+10],xmm1    
movaps xmmword ptr ss:[rsp+20],xmm2    
movaps xmmword ptr ss:[rsp+30],xmm3    
...
mov rax, MY_FUNCTION # This is what we redirect to, to read RAX
call rax                               
...
movaps xmm0,xmmword ptr ss:[rsp]       
movaps xmm1,xmmword ptr ss:[rsp+10]    
movaps xmm2,xmmword ptr ss:[rsp+20]    
movaps xmm3,xmmword ptr ss:[rsp+30]    
add rsp,40                             
pop rbp                                
pop rdi                                
pop rsi                                
pop rdx                                
pop rcx                                
pop rbx                                
pop r15                                
pop r14                                
pop r13                                
pop r12                                
pop r11                                
pop r10                                
pop r9                                 
pop r8                      
...            # Execute original function somewhere here        
jmp qword ptr ds:[FFF90105] 
xchg esp,eax                           
ret far        # Jump back

At the high level, Rust code interfaces with these hooks to read and modify program state:

emulator.hook_address(TakeDamageAddress, move |emulator, context| {
    let damage_float = convert_xmm_to_float(context.xmm0);
    obj.take_damage(damage_float);
});

By applying similar hooks to other game events (ability casts, movement, resource changes, etc.), we can reconstruct the complete gameplay state.

Why do it this way?

Works directly with compressed ROFL files
Uses fewer resources than running the game
- Ignores irrelevant game systems
Provides precise value extraction

Why not do it this way?

Requires hooking undocumented packet structures
- Have deep understanding of packet structures
System breaks with unexpected packet changes
Potential for value extraction errors
No game engine validation

Alternative Approach: Direct Game State Reading

MiscellaneousStuff’s approach avoids packet analysis entirely by reading game state directly:

Replay extraction by running the game and reading game state per interval (Image source: miscellaneousstuff.github.io)

This method works by running the full game client (sound, graphics, network, etc) in replay mode, sampling the game at fixed time intervals and dumping the game state snapshots to disk.

While simpler to maintain, this sampling approach trades precision for simplicity. Lower sampling intervals increase precision but increases data usage, similar to sample-based profilers.

Performance

Since we’re scraping, perfomance matters a little bit. We don’t want it to be running at a snail’s pace (hours+ per replay).

I’ve written the emulator originally with the unicorn emulatorIf you’re unfamiliar with unicorn emulator, it’s a CPU emulator. I like to think of it as GDB with much more powerful scripting capabilities in python. It was a good for bootstrapping, but it was terribly slow (by how much? somewhere like 5 minutes per replay).

I’ve since rewritten it in rust with an api similar to unicorn emulator’s api. I called it the exception emulator because it runs natively and any exceptions (ex, out of bounds memory accesses) are routed and handled by a hook similar to unicorn’s api. It works by inserting a software breakpoint (INT3 instruction on x86 CPUs) and creating an exception handler that lookups the corresponding the function to call that is mapped to the breakpoint. I use this to reverse functions where the game state is not properly initialized.

My very simple methodology for evaluating my program using a single replay with a machine setup:

Input:
- ~15 minute replay (11.5mb rofl file)
- “Gaming” machine
- NVME device
- Windows 11
- Other applications are running like chrome and discord
Output:
- prettified json ~(135MB)

PS> Measure-Command {start-process cargo "run --release -- -file test.rofl" -Wait}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 3
Milliseconds      : 33
Ticks             : 30338426
TotalDays         : 3.51139189814815E-05
TotalHours        : 0.000842734055555556
TotalMinutes      : 0.0505640433333333
TotalSeconds      : 3.0338426
TotalMilliseconds : 3033.8426

Running with one replay of ~15 minutes finishes in 3 seconds. Eyeballing task manager, it takes around 400MB up when running. Output is 135MB of prettified json.

Comparison to MiscellaneousStuff’s approach

Author also does some napkin math ~ 26 / 16 + 0.25 = 1.6 minutes = 96 seconds for 26 minute replay. So accounting for the replay, I’m running at 5 seconds, which is a bit faster than the author’s approach. The memory might bit a bit higher (running the game), which I’ve eyeballed at 1GB. Space utilization is ~820MB unpacking one game from his repo compared to 300MB - 500MB.

While this does show that my approach has better performance, this is definitely eyeballing the results. I haven’t ran the author’s code.

In either case, both should scale decently. Both can run multiple instances of the game/emulator.

Other Tidbits

Packet Optimizations

If you are working at RIOT, I’ve listed some optimizations that could be applied to the packets. I don’t have a full view of how the game engine/server works, so some of these optimizations may not apply.

Repeat Packets

Packets of leaving the fog appear multiple times (20+ repeats sometimes!) at the same time for the same id. I don’t see why this should be done unless there’s an engine specific issue or an issue when generating a replay.

{
  "LeaveFromFog": {
    "time": 862.4249,
    "id": 1073741859
  }
}

A look into a single replay file shows the following:

Repeat packets: 21762
Total packets: 770650
Percentage of repeat packets: 2.82%
Repeat packet total bytes: 65370
Packet total bytes: 3399813
Percentage of bytes used up by repeat packets: 1.92%

A graph showing the breakdown of repeat packets per packet type

Removing these repeats in the replay could reduce the amount of data used per replay by around ~2%, which could save cost for hosting the space and bandwidth saved using the cloud (perhaps AWS).

Doing some quick estimates only for storage writes cost

~30 million active players per day and a player plays at least 2 games per day

average replay size = 10MB = 10 MB
optimization = 0.02
daily active users = 30 million / day = 30M/day
number of games = daily active users × 2 / 10 = 6M/day
data written in a day = number of games × average replay size to GB/day = 60,000 GB/day
data saved = optimization × data written in a day = 1,200 GB/day
aws S3 cost = $0.02 per GB  = $0.02/GB
money saved per day = data saved × aws S3 cost = $24.00/day
money saved per day per month = $730.49
money saved per day per year = $8,765.82

Integer Values

For integer values (ids, hashes, etc), consider using variable integer compression, which is commonly used in serialization frameworks such as protobuf and wasm to save more space.

The chunk format already does this for single byte versus 4 byte integer, the variable integer compression can be used to create 2 byte or 3 byte integers.

Same-ish Packets

Some packets have different ids, but mostly do the same thing For example, there could be a MovementPacket1 and MovementPacket2, and they both have the same fields, except for 1 or 2 extra fields in MovementPacket2. I believe that merging them could reduce complexity, but I do not know how the game server is structured or how the code base is structured

Repeat of Different Packets

Some packets encompass the same meaning. For example, typically, a damage packet is applied (by the same id) followed by a death packet at the same time. The damage packet is not necessary. However, due to the game engine and the way it is structured, it may be difficult to remove this.

Replay Time offset

An optimization I observed for each packet is using a byte (instead of a float) to represent the time offset for the next packet. This is quite neat.

Game Engine Optimizations

I’m going to suggest some game engine optimizations below. However, consider that the game engine itself is quite old, it may be difficult to near impossible to change how the engine is structured.

Single Threadedness

Packets are handled as follows:

Create/Allocate packet -> Deserialize buffer into packet -> Update game state with packet

Instead, multiple threads could be used to split the work

Thread 1: Create/Allocate packet -> Deserialize buffer into packet
Thread 2: Create/Allocate packet -> Deserialize buffer into packet
Thread 3: Create/Allocate packet -> Deserialize buffer into packet
Thread 4: Create/Allocate packet -> Deserialize buffer into packet

Thread 5: Update game state with packet

The packets may be out of order, so have another thread order the packets with queues:

Thread 1: Create/Allocate packet -> Deserialize buffer into packet -> Unordered packet queue
Thread 2: Create/Allocate packet -> Deserialize buffer into packet -> Unordered packet queue
Thread 3: Create/Allocate packet -> Deserialize buffer into packet -> Unordered packet queue
Thread 4: Create/Allocate packet -> Deserialize buffer into packet -> Unordered packet queue

Thread 6: Unordered packet queue -> Order packets -> Ordered packet queue

Thread 5: Ordered Packet queue -> Update game state with packet

Buffer Pools

Instead of allocating/freeing memory for each packet, use a buffer/memory pool for commonly used packets to reduce CPU overhead.

Thoughts

Improvements

One idea is to lift the assembly to LLVM, perform static analysis to generate or extract the decryption mechanism or the functions themselves into a different language such as C++, but that would take way too much time.

Lastly, the bottleneck in the system is most likely replay aggregation (which I haven’t measure yet, but I would assume so given that downloading the replay takes seconds).

Wrapping Up

Fun project. Captured hundred of thousands of replays and did some analysis on them, but got busy. Learned a bit more about reversing and rust.

Before we begin

Disclaimer

Table Of Content

Overview

Current Datasets And Their Issues

Using The Datasets

At 10000 Feet

Overview

Replays And Packets

Emulator And Decryption Background

Replays

Data sources

How Replays Are Viewed

ROFL Replay

Visually Tracing How Packets Are Used

Reverse Engineering

Some Background

Packet Processing

Decompiled Packet Code

Reading The Packet

How Does The Decryption Work

So what’s the catch?

“Beating” The Obfuscation

Why do it this way?

Why not do it this way?

Alternative Approach: Direct Game State Reading

Performance

Comparison to MiscellaneousStuff’s approach

Other Tidbits

Packet Optimizations

Repeat Packets

Integer Values

Same-ish Packets

Repeat of Different Packets

Replay Time offset

Game Engine Optimizations

Single Threadedness

Buffer Pools

Thoughts

Improvements

Wrapping Up

Enjoy Reading This Article?