Why Amazon Has Dropped its Internal AI Usage Leaderboard

Amazon has dropped an internal initiative that sought to encourage employees to embrace AI tools after it emerged staff were using AI to complete what the Financial Times describes as "pointless" tasks.
The initiative, known as Kirorank, ranked Amazon employees in a leaderboard based on how often they used AI in their day-to-day work. However, with staff consuming tokens to complete low-value tasks, the company's computing costs spiked.
The FT reports that Kirorank is now unavailable for use.
Dave Treadwell, Senior Vice President at Amazon, told staff the leaderboard was built with "good intentions" but encouraged "tokenmaxxing", the inflating of AI token consumption.
Tokens are the units of data processed by AI models. "Please do not use AI just for the sake of using AI," Treadwell added.
Amazon says in a statement that "the beta dashboard was not a formal or approved tool, and has since been deprecated". It adds that the leaderboard "was created by a group of employees who wanted to drive awareness for how AI can accelerate work".
The Financial Times also reports that Meta employees have attempted similar gaming of internal tables by driving up token consumption.
Why Amazon's AI incentives misfired
The rankings collapse when employees optimise for the metric rather than the mission. Some reportedly assign AI agents, autonomous bots that act on a user's behalf, to needless tasks to climb the table.
Each pointless task consumes computing capacity that Amazon must pay for. That turns the scoreboard into a record of busywork rather than a signal of useful adoption.
Public leaderboards heighten the pressure to perform to the number. Without a clear link to business outcomes, the metric invites work that looks active but creates little value.
Attribution and transparency matter. Amazon's distancing of the beta tool underlines how unofficial dashboards can drift from leadership's intent.
Adoption targets, costs and vendor pricing
Amazon introduced targets for more than 80% of its developers to use AI each week, applying adoption pressure on individual engineers, according to the Financial Times.
Employees reportedly use Kiro and MeshClaw, an in-house agent tool, to generate additional AI activity and demonstrate usage. That activity boosts tokens consumed, not necessarily output delivered.
Costs are rising as model providers move to consumption-based pricing. Anthropic, whose models Amazon uses extensively, has shifted from flat monthly fees to metered usage, increasing some customers’ bills, the Financial Times reports.
The stakes are significant as Amazon expects to spend about US$200bn in capital expenditure, the vast majority on AI and data centre infrastructure.
To help fund that investment, the company has undertaken sweeping layoffs, cutting at least 30,000 corporate roles since October 2025, CBS News reports.
From tokens burned to code shipped
Amazon's replacement metric points to where AI adoption measurement is heading. It now tracks "normalised deployments", evidence that engineers regularly use AI to create useful code, rather than raw token consumption, the Financial Times reports.
Treadwell tells staff he does not want workers to focus on tokens. He instructs them to concentrate on building better products and shipping improvements customers notice.
Other leaders echo this outcomes-first view. Ravi Kumar S, CEO at Cognizant, calls token consumption a "vanity metric", telling Fortune that the company measures results over usage. He is hiring more than 20,000 graduates this year as rivals cut.
Measuring deployments encourages thoughtful integration of gen AI into the software lifecycle. It rewards teams for merging AI-assisted code into production rather than spinning tokens on experiments that never ship.
The lesson for HR and engineering leaders
This episode is a textbook incentive-design failure. Mandate a behaviour, attach a public scoreboard and people will deliver the number, whether or not it creates value.
The fix is straightforward. Measure the outcome the business actually wants, in this case working code rather than tokens burned, and the incentive to game the metric fades.
Where AI is concerned, adoption quality matters more than adoption quantity. Leaders should define high-value use cases, align targets with delivery milestones and validate impact post-deployment.
Clear ownership helps. If dashboards drive behaviour, they must be approved, audited and linked to goals that matter for customers and the business.


