Why Amazon Has Dropped its Internal AI Usage Leaderboard

By Diya Joseph

June 07, 2026

Amazon ends AI usage leaderboard as staff game tokens, shifting focus to deployments amid rising compute costs. Credit: Amazon

Amazon has removed an internal AI leaderboard after employees inflated token consumption, opting instead to track deployments that show real code shipping.

Amazon has dropped an internal initiative that sought to encourage employees to embrace AI tools after it emerged staff were using AI to complete what the Financial Times describes as "pointless" tasks.

The initiative, known as Kirorank, ranked Amazon employees in a leaderboard based on how often they used AI in their day-to-day work. However, with staff consuming tokens to complete low-value tasks, the company's computing costs spiked.

The FT reports that Kirorank is now unavailable for use.

Dave Treadwell, Senior Vice President at Amazon, told staff the leaderboard was built with “good intentions” but encouraged “tokenmaxxing” – the inflating of AI token consumption.

Tokens are the units of data processed by AI models. “Please do not use AI just for the sake of using AI,” Treadwell added.

Amazon says in a statement that “the beta dashboard was not a formal or approved tool, and has since been deprecated”. It adds that the leaderboard “was created by a group of employees who wanted to drive awareness for how AI can accelerate work”.

The Financial Times also reports that Meta employees have attempted similar gaming of internal tables by driving up token consumption.

Dave Treadwell, Senior Vice President at Amazon

Why Amazon's AI incentives misfired

The rankings collapse when employees optimise for the metric rather than the mission. Some reportedly assign AI agents, autonomous bots that act on a user’s behalf, to needless tasks to climb the table.

Each pointless task consumes computing capacity that Amazon must pay for. That turns the scoreboard into a source of busywork rather than a signal of useful adoption.

Public leaderboards heighten the pressure to perform to the number. Without a clear link to business outcomes, the metric invites work that looks active but creates little value.

Attribution and transparency matter. Amazon’s decision to distance itself from the beta tool underlines how unofficial dashboards can drift from leadership’s intent.

Adoption targets, costs and vendor pricing

Amazon introduced targets for more than 80% of its developers to use AI each week, putting pressure on individual engineers to adopt the tools, according to the Financial Times.

Employees reportedly use Kiro and MeshClaw, an in‑house agent tool, to generate additional AI activity and demonstrate usage. That activity boosts tokens consumed, not necessarily output delivered.

Costs are rising as model providers move to consumption‑based pricing. Anthropic, whose models Amazon uses extensively, has shifted from flat monthly fees to metered usage, increasing some customers’ bills, the Financial Times reports.

The stakes are significant as Amazon expects to spend about US$200bn in capital expenditure, the vast majority on AI and data centre infrastructure.

To help fund that investment, the company has undertaken sweeping layoffs, cutting at least 30,000 corporate roles since October 2025, CBS News reports.

Andy Jassy, President and CEO of Amazon

From tokens burned to code shipped

Amazon’s replacement metric points to where AI adoption measurement is heading. It now tracks “normalised deployments”, evidence that engineers regularly use AI to create useful code, rather than raw token consumption, the Financial Times reports.

Treadwell tells staff he does not want workers to focus on tokens. He instructs them to concentrate on building better products and shipping improvements customers notice.

Other leaders echo this outcomes‑first view. Ravi Kumar S, CEO at Cognizant, calls token consumption a “vanity metric”, telling Fortune that the company measures results over usage. He is hiring more than 20,000 graduates this year as rivals cut.

Measuring deployments encourages thoughtful integration of gen AI into the software lifecycle. It rewards teams for merging AI‑assisted code into production rather than spinning tokens on experiments that never ship.

Ravi Kumar S, CEO at Cognizant

The lesson for HR and engineering leaders

This episode is a textbook incentive‑design failure. Mandate a behaviour, attach a public scoreboard and people will deliver the number, whether or not it creates value.

The fix is straightforward. Measure the outcome the business actually wants, in this case working code rather than tokens burned, and the incentive to game the metric fades.

Where AI is concerned, adoption quality matters more than adoption quantity. Leaders should define high‑value use cases, align targets with delivery milestones and validate impact post‑deployment.

Clear ownership helps. If dashboards drive behaviour, they must be approved, audited and linked to goals that matter for customers and the business.