
GitHub Stars Are a Vanity Metric (And What to Read Instead)

GitHub stars are a lifetime popularity counter that never decays. Here are 5 signals that actually predict adoption, health, risk, quality, and trajectory.

Ryan Clinton

The problem: Stars are the only GitHub signal most teams ever look at, and they correlate with almost nothing that matters in 2026. A 60,000-star framework can have one active maintainer and a release schedule that quietly stopped 18 months ago. A 3,000-star security library can be wired into half the Fortune 500. Stars are a lifetime popularity counter — they tick up forever, they never decay when a project dies, and they don't distinguish "adopted in production" from "bookmarked during a Hacker News scroll." The signal stopped working sometime around the time GitHub crossed 100 million repos. Teams keep using it because it sits at the top of every search result, and because nobody's quite sure what to read instead.

What is a vanity metric on GitHub? A vanity metric is a number that looks meaningful, moves in a flattering direction, and has zero predictive power for the decision it's being used to make. GitHub stars are the canonical example. They measure cumulative bookmarking interest at some point in the past — not current health, current adoption, current risk, or current trajectory. The metric flatters the project; it doesn't inform the buyer.

Why it matters: Stars drive real money decisions. Engineering meetings re-litigate "but it has 30k stars" before adopting a dependency. VCs sourcing dev-tool deals filter their pipeline by star count. Security teams skip audits on "popular" projects. According to GitHub's 2025 Octoverse report, the platform now hosts 630 million repositories with 180 million developers — a noise floor where star count tells you almost nothing about whether the specific project in front of you is safe to depend on.

Use it when: You're evaluating whether to adopt, fund, integrate, or audit a GitHub project — and you want to read signals that survived past 2014.

Also known as: github vanity metrics, github star inflation, fake github stars, gameable github signals, undecaying popularity counter, lifetime stargazer count, github discovery vs evaluation signals

Quick answer:

  • What it is: Stars are a lifetime popularity bookmark counter — additive, never decaying, decoupled from current behaviour.
  • When to use them: Discovery only. Useful at the top of a funnel to find candidates worth evaluating.
  • When NOT to use them: Adoption, dependency audits, risk reviews, due diligence, trend detection, or anything where current state matters.
  • Replace with: Fork-to-star ratio, 90-day commit velocity, top-contributor share, trajectory enum, and decay score — read together.
  • Main tradeoff: The replacement signals need enrichment data (community profile, contributor stats, releases) — more API calls per repo, in exchange for a verdict instead of a vibe.

Decision shortcut — when stars actually mislead:

  1. Project hasn't seen a release in 18 months — stars say "popular," reality says "frozen."
  2. Top contributor did 90%+ of last-year commits — stars say "trusted," reality says "one person away from collapse."
  3. Recent commits are 100% bot-driven (Dependabot, Actions) — stars say "active," reality says "zombie."
  4. Repo went viral on launch day, traffic decayed since — stars say "trending," reality says "spike, not slope."
  5. Stars are climbing but forks aren't — stars say "adopting," reality says "bookmarking."

If any of these are true, the star count is misleading you. The signal you want lives elsewhere.

If you only remember 5 things:

  1. Stars don't decay. A 50,000-star repo abandoned in 2022 still has 50,000 stars in 2026. The number can't go down even when the project's already dead.
  2. Stars are gameable. Dagster documented a working fake-star economy where 1,000 stars can be purchased for as little as $64. Their detector found inflation patterns across multiple AI projects.
  3. Stars don't measure adoption. They measure bookmarking. Forks, downstream dependents, and production-grade language tags measure adoption. They're different numbers.
  4. Stars don't measure depth. A 3,000-star vector DB is huge for its category. A 3,000-star JS framework is a curiosity. Star counts mean nothing without category baselines.
  5. The replacement is a stack, not a single field. Adoption, health, risk, quality, and trajectory each need their own signal. Five fields, read together, replace one number that was lying anyway.

Stars suggest vs reality signal — the article in one table:

Stars suggest | Reality often is
Popular | Maybe abandoned (lifetime counter, no decay)
Trusted | Maybe single-maintainer (concentration hidden)
Adopted | Maybe just bookmarked (no fork/dependent signal)
Growing | Maybe decaying (no rolling-window velocity)
Quality | Maybe a viral demo (no governance signal)

Five ways the same number lies in five different directions. The replacement isn't a better single signal — it's a stack of five.

Problems this solves:

  • How to evaluate a GitHub repo without trusting the star count
  • How to spot a popular-but-dead project before adopting it
  • How to detect bought or inflated GitHub stars
  • How to compare repos across different size categories fairly
  • How to filter a category for current health rather than lifetime popularity
  • How to build a dependency adoption checklist that survives an audit

In this article: What's a vanity metric · Five ways stars mislead · Stars-to-real-signals map · JSON output example · Adoption signal · Health signal · Risk signal · Quality signal · Trajectory signal · Naive vs replacement · Archetypes · Star buying · What teams do with this · Cost math · Limitations · Best practices · Common mistakes · FAQ

Examples table — concrete input/output mappings:

Repo profile | Star verdict | Real verdict
60k-star "flagship" framework, no formal governance, sole BDFL | "Industry standard, safe" | CAUTION, MEDIUM bus factor risk, COLLAPSING governance signal
3k-star security library, 8 active contributors, monthly releases, used in production by 4 of the top 10 cloud providers | "Too small to matter" | STRONGLY_RECOMMENDED, GROWING adoption percentile, top 3% of category
12k-star AI demo, peaked Day 1 on Hacker News, no commits since launch month | "Still trending" | ABANDONED, decay velocity FAST, no human activity 11 months
20k-star foundation-governed project, top contributor share 88%, paid lead | "Single-maintainer risk" | MODERATE — concentration high but governance distributed, MINIMAL_IMPACT if lead leaves
800-star Rust toolchain, REVIVING from 14-month dormancy, 4 new contributors in last 60 days | "Dead repo" | REVIVING trajectory, MEDIUM adoption readiness, breakout flag pending

The takeaway: in five out of five rows, the star count points the wrong way. The replacement stack flips the verdict in every case.

What is a vanity metric on GitHub?

Definition (short version): A vanity metric on GitHub is a number that looks important, only moves up, and has no predictive power for adoption, health, risk, or trajectory. Stars are the canonical example — a lifetime, undecaying, additive bookmark counter that doesn't reflect current project behaviour.

The longer version: a vanity metric is a flattering top-of-funnel number, decoupled from any decision the buyer is actually trying to make. It survives because it's easy to display, easy to compare, and easy to brag about. It fails because it doesn't tell you whether to depend on the project today.

There are roughly four ways stars get used in practice:

  1. Discovery filter — "Show me repos in this topic above 1,000 stars." Defensible. Stars are a popularity prior; they help find candidates worth evaluating.
  2. Comparison metric — "Which of these three frameworks is bigger?" Weak. Cross-category miscalibration breaks this immediately.
  3. Adoption proxy — "This has 50k stars, must be safe." Wrong. Adoption needs forks, dependents, and production-language signals — not stars.
  4. Health proxy — "This has 50k stars, must be maintained." Wrong, and the most expensive failure mode. Stars stay frozen at 50k whether the project ships weekly or hasn't shipped in two years.

Use 1 with caution. Use 2-4 only if you've never been burned by a dependency choice and want to find out what that feels like.

The cleanest mental model is a two-stage funnel — discovery, where stars work, then evaluation, where they fail:

    ┌─────────────────────────────────────────┐
    │              DISCOVERY                  │
    │     (stars are a useful prior)          │
    │                                         │
    │   "Show me the most-starred repos       │
    │    in topic:vector-database"            │
    │                                         │
    │   Output: candidate shortlist           │
    └────────────────────┬────────────────────┘
                         │
                         ▼
    ┌─────────────────────────────────────────┐
    │             EVALUATION                  │
    │   (stars actively mislead here)         │
    │                                         │
    │   "Of these candidates, which are       │
    │    safe to adopt / fund / integrate?"   │
    │                                         │
    │   Output: per-candidate verdict —       │
    │   adoption · health · risk · quality    │
    │   · trajectory                          │
    └─────────────────────────────────────────┘

Stars get you onto the list. They cannot get you off it. The stack of five replacement signals is what runs the second stage.

Quick extraction answers for common queries:

Are GitHub stars a good metric? GitHub stars are a useful discovery signal but a poor evaluation signal. They measure cumulative bookmarking, not current health, adoption, or risk. They never decay when a project dies, they don't reflect current usage, and they're inflatable through documented fake-star markets. Use them to find candidates, not to judge them.

What should I use instead of GitHub stars? Replace stars with a five-signal stack: fork-to-star ratio plus downstream dependents (adoption), 90-day commit velocity plus trajectory enum (health), top-contributor share plus signed-commit ratio (risk), community profile plus release cadence (quality), and rolling star velocity plus breakout detection (trajectory). Stars compress all five into one number badly. The stack reads them separately.

Do GitHub stars predict project quality? No. Quality requires recent maintenance activity, distributed contributors, release discipline, and community governance — none of which are correlated with star count. The Borges and Valente 2018 study of the top-5,000 starred GitHub repositories explicitly warned about the risks of selecting projects by star count, while finding that 75% of developers do exactly that.

Five ways GitHub stars actively mislead

There are five distinct failure modes. Stars don't fail one way — they fail five different ways, and the failure modes compound.

1. Lifetime accumulation — stars never decay

A star is permanent unless the user explicitly unstars. The lifetime counter has no decay, no half-life, and no mechanism for "this project went dormant." A repo that hit 40,000 stars in 2020 still has 40,000 stars in 2026 if nobody actively unstars it — even if the maintainer left, the releases stopped, the issues filled with "closed by stale-bot for inactivity" notices, and the dependency tree it sits on top of has moved on. This is the single largest reason stars stop working as the platform ages: the older GitHub gets, the more zombie popularity it accumulates.

2. Star velocity is gameable

Big star spikes correlate with social events, not adoption events. A Hacker News front page, a viral social-media thread, a Reddit r/programming feature, a "look what I built" Show HN — all produce sustained 3-7 day star inflows that look like adoption from outside. Six weeks later the curve flattens and the project is back to its baseline. The peak star count survives forever; the actual user base never materialised.

3. Cross-category miscalibration

3,000 stars in vector databases is a top-3 player. 3,000 stars in JavaScript UI frameworks is a curiosity nobody's heard of. 3,000 stars in CLI utilities is enormous. Stars without category baselines are unreadable — the same number means five different things across five categories. Teams that don't normalise against category percentiles draw the wrong conclusions reliably.

4. Star buying and fake-star markets exist

Dagster's investigation documented a working market for purchased GitHub stars, with services pricing stars at €0.85 each and one provider selling 1,000 fake stars for $64. Their detector identified clear fingerprint patterns: accounts created after a certain date, one or fewer followers, one or fewer accounts followed, no public Gist, four or fewer public repositories, empty profile fields, and star-grant, account-creation, and profile-update dates that all match. They found inflation patterns across multiple AI-tagged repositories. The point isn't that every popular repo is gaming stars; it's that you can no longer tell from the star count alone.

5. Stars measure discovery, not depth

Bookmarking is not adoption. A user who stars a repo while reading a thread on the train has the same effect on the counter as a team that integrates the project into production for two years. Real adoption shows up in forks (someone copied it to use), in downstream dependents (someone wired it into a package manifest), and in production-grade language tags (someone shipped it). Stars compress all of those into a single bookmarking signal that can't distinguish between them.

Read together, these five failure modes mean the star count answers the wrong question. The right question is "is this project healthy, adopted, low-risk, well-maintained, and trending in the right direction right now?" None of those have a one-field shortcut, but each has a reliable replacement signal.

Mapping stars to the signals they replace

The cleanest way to think about this is to map each thing teams think stars tell them to the signal that actually tells them.

                              ★ STARS ★
                         (one lifetime counter,
                          undecaying, gameable)
                                  │
           ┌──────────┬───────────┼───────────┬──────────┐
           │          │           │           │          │
           ▼          ▼           ▼           ▼          ▼
       ADOPTION    HEALTH       RISK       QUALITY   TRAJECTORY
           │          │           │           │          │
   ┌───────┴──┐  ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ ┌───┴────┐
   │ fork-to- │  │ 90-day  │ │ top-    │ │ community│ │ rolling│
   │ star     │  │ commit  │ │ contrib │ │ profile  │ │ velocity│
   │ ratio    │  │ velocity│ │ share   │ │ %        │ │ +      │
   │ +        │  │ +       │ │ +       │ │ +        │ │ breakout│
   │ depend-  │  │ traject-│ │ signed- │ │ release  │ │ flag   │
   │ ents     │  │ ory     │ │ commit  │ │ cadence  │ │ +      │
   │ +        │  │ enum    │ │ ratio   │ │ +        │ │ accel- │
   │ language │  │ +       │ │ +       │ │ days     │ │ eration│
   │ tags     │  │ decay   │ │ ifMain  │ │ since    │ │ vs     │
   │          │  │ score   │ │ Leaves  │ │ release  │ │ baseline│
   └──────────┘  └─────────┘ └─────────┘ └──────────┘ └────────┘

Field-by-field:

What teams think stars mean | What the real signal is
"It's adopted" | Fork-to-star ratio + downstream dependents + production language tags
"It's healthy" | 90-day commit velocity + maintenance.trajectory + decayScore
"It's low-risk" | topContributorShare + signedCommitRatio + ifMaintainerLeaves
"It's high quality" | communityProfile.healthPercentage + release cadence + daysSinceRelease
"It's trending" | Rolling star velocity + isBreakout flag + acceleration vs baseline

One number gets replaced by five. The replacement looks heavier — it isn't, in practice, because the GitHub Repo Intelligence actor returns all five for every repo in a single call. Conceptually, the substitution is one-to-five, not one-to-one.

JSON output — what the replacement signals look like

Concrete output from a single call. This is what reading "is this project actually any good?" looks like when stars are no longer the answer.

{
  "fullName": "scrapy/scrapy",
  "stars": 53200,
  "forks": 10580,
  "scores": {
    "projectHealthScore": 91,
    "adoptionReadinessScore": 94,
    "communityScore": 88,
    "supplyChainRiskScore": 8,
    "outreachScore": 62
  },
  "benchmarks": {
    "healthPercentile": 92,
    "adoptionPercentile": 88,
    "riskPercentile": 12,
    "categoryRank": 3,
    "totalInCategory": 50
  },
  "communityProfile": {
    "healthPercentage": 100,
    "hasReadme": true,
    "hasContributing": true,
    "hasCodeOfConduct": true,
    "hasLicense": true
  },
  "activityStats": {
    "commitActivity90d": 87,
    "commitActivity365d": 312,
    "weeklyCommitAvg90d": 6.7
  },
  "contributors": {
    "count": 547,
    "topContributorShare": 0.18,
    "signedCommitRatio": 0.73
  },
  "latestRelease": {
    "tag": "v2.12.0",
    "daysSinceRelease": 23
  },
  "maintenance": {
    "status": "ACTIVE",
    "trajectory": "STABLE",
    "decayScore": 3,
    "decayVelocity": "NONE",
    "isZombie": false,
    "busFactorRisk": "LOW",
    "ifMaintainerLeaves": "MINIMAL_IMPACT"
  },
  "forecast": {
    "growthProjection30d": "HIGH",
    "abandonmentRisk90d": "LOW"
  },
  "recommendations": {
    "adoptionVerdict": "STRONGLY_RECOMMENDED",
    "riskLevel": "LOW",
    "maintenanceStatus": "ACTIVE"
  }
}

The star count (53,200) is in there. It's just one field of forty. The fields that actually answer "should I adopt this?" are recommendations.adoptionVerdict, scores.adoptionReadinessScore, and maintenance.trajectory. None of them are stars.

Compare against a hypothetical 60,000-star "flagship" framework with one BDFL and a frozen release cadence: same star count range, but topContributorShare: 0.94, daysSinceRelease: 487, maintenance.status: AT_RISK, trajectory: COLLAPSING, adoptionVerdict: CAUTION. The star count would have told you to adopt it. The verdict tells you to wait.

What to read for real adoption

The signal: Fork-to-star ratio + downstream dependents + production-grade language tags.

Stars measure who bookmarked the project. Forks measure who copied it to actually use. The fork-to-star ratio is the cleanest single proxy for "did people adopt this, or did they just save it for later?" A healthy adopted project lives somewhere in the 0.10–0.25 range — one fork for every 4–10 stars. A heavily-bookmarked-but-not-used project sits below 0.05. A heavily-forked-relative-to-stars project is being used for derivative work or production deployment.
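A minimal sketch of that read in Python, assuming you already have the two counts for a repo. The bands are the rough ones from the paragraph above, not thresholds defined by GitHub or the actor:

def classify_adoption(stars: int, forks: int) -> str:
    """Rough fork-to-star read: bookmarking vs actual use.
    Bands are illustrative, taken from the prose, not canonical."""
    if stars == 0:
        return "NO_SIGNAL"
    ratio = forks / stars
    if ratio < 0.05:
        return "BOOKMARKED_NOT_USED"
    if ratio < 0.10:
        return "LIGHT_ADOPTION"
    if ratio <= 0.25:
        return "HEALTHY_ADOPTION"
    return "HEAVY_DERIVATIVE_USE"

print(classify_adoption(stars=53200, forks=10580))  # ratio ~0.20 -> HEALTHY_ADOPTION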

Downstream dependents are the next layer. GitHub surfaces "Used by N" in the sidebar of repos that produce installable packages. That number captures actual ecosystem integration — package.json, requirements.txt, go.mod, Cargo.toml entries that pull this project into other people's builds. Stars say "I noticed you." Dependents say "I shipped you."

Production-grade language tags add the third layer. A 30,000-star demo repo tagged jupyter notebook is fundamentally different from a 30,000-star library tagged go or rust. The language profile from a repo's enrichment data — the actual code-mix percentages — separates "tutorial / demo / showcase" from "production library other people depend on."

The actor exposes all three: forks and stars for the ratio, the languages enrichment for the production-grade tag, and adoptionReadinessScore as the composite read. Where the ratio plus dependents plus language profile diverge from the star count, the star count is the one to ignore.

What to read for real project health

The signal: 90-day commit velocity + maintenance.trajectory + maintenance.status enum + decayScore.

Health is a current-state signal. Stars measure popularity at some unspecified point in the past; health asks "what's happening this quarter?" The 90-day commit count is the headline number. A library shipping 40-100 commits in the last 90 days is alive. One shipping 5 in 90 days is decaying. One shipping 0 is abandoned, regardless of what the star counter says.

The maintenance.status enum maps that velocity to a class: ACTIVE, STABLE, SLOWING, AT_RISK, ABANDONED. STABLE and ACTIVE are both fine — STABLE is mature low-velocity activity, ACTIVE is high-velocity. SLOWING is the early warning. AT_RISK is the late warning. ABANDONED is the post-mortem.

The trajectory enum (GROWING, STABLE, DECLINING, COLLAPSING, REVIVING) gives the second derivative — direction matters more than current state. A 60,000-star repo with status: AT_RISK and trajectory: COLLAPSING is a different animal from a 3,000-star repo with status: ACTIVE and trajectory: GROWING. Stars rank them in the wrong order; the enums rank them correctly.

The decayScore (0-100) and decayVelocity (NONE, SLOW, FAST) sharpen the signal further. A score of 3 with decayVelocity: NONE is a healthy steady-state. A score of 67 with decayVelocity: FAST is a project losing 2-3% of its activity baseline per week. Either way, the star count can't tell you which is which.
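A hedged sketch of how those bands compose, using illustrative thresholds taken from the prose above; the actor's own scoring is richer, so treat this as the shape of the idea rather than its implementation:

def health_read(commits_90d: int, decay_score: int, decay_velocity: str) -> dict:
    """Coarse current-state health read from 90-day activity.
    Thresholds are illustrative; the actor's scoring is more nuanced."""
    if commits_90d == 0:
        status = "ABANDONED"
    elif commits_90d < 5:
        status = "AT_RISK"
    elif commits_90d < 20:
        status = "SLOWING"
    elif commits_90d < 40:
        status = "STABLE"
    else:
        status = "ACTIVE"
    decaying = decay_score > 50 or decay_velocity == "FAST"
    return {"status": status, "decaying": decaying}

print(health_read(commits_90d=87, decay_score=3, decay_velocity="NONE"))
# {'status': 'ACTIVE', 'decaying': False}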

For deeper coverage of the abandonment-at-scale problem, see the detect abandoned GitHub repositories at scale post — that's the field-by-field guide to reading the maintenance enum across hundreds of repos.

What to read for real supply-chain risk

The signal: topContributorShare + signedCommitRatio + bus factor + ifMaintainerLeaves impact.

This is where stars fail most expensively. A popular dependency with one maintainer is the structural precondition for the worst kinds of supply-chain incident. The xz-utils backdoor (CVE-2024-3094), analysed by Akamai and documented on Wikipedia, happened because the project was effectively maintained by one person — Lasse Collin — over years, and a long-running social-engineering campaign exploited that concentration to insert a working remote-code-execution backdoor into a library that ships in nearly every Linux distribution. The star count said "popular and trustworthy." The contributor concentration said "one person away from collapse." Only one of those would have warned a buyer.

The headline replacement signal is topContributorShare over a rolling window — the share of the last year's commits authored by the top single contributor. Above 0.80 on a critical dependency is a flag. Above 0.90 is a five-alarm fire. Below 0.40 with multiple contributors above 0.10 each is a healthily distributed project.
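As a sketch, the concentration read is a few lines over last-year commit counts per author. The thresholds are the ones named above; the "multiple contributors above 0.10 each" nuance for the distributed case is simplified away:

def bus_factor_read(commits_by_author: dict) -> dict:
    """Concentration read over last-12-months commit counts per author.
    Thresholds follow the paragraph above and are illustrative."""
    total = sum(commits_by_author.values())
    if total == 0:
        return {"topContributorShare": None, "flag": "NO_ACTIVITY"}
    top_share = max(commits_by_author.values()) / total
    if top_share > 0.90:
        flag = "FIVE_ALARM_FIRE"
    elif top_share > 0.80:
        flag = "FLAG_ON_A_CRITICAL_DEPENDENCY"
    elif top_share < 0.40:
        flag = "HEALTHILY_DISTRIBUTED"
    else:
        flag = "REVIEW_WITH_CONTEXT"
    return {"topContributorShare": round(top_share, 2), "flag": flag}

print(bus_factor_read({"lead": 910, "helper": 60, "drive-by": 30}))
# {'topContributorShare': 0.91, 'flag': 'FIVE_ALARM_FIRE'}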

signedCommitRatio adds a verification layer. Signed commits cryptographically attest the author. A high signed-commit ratio doesn't make a project safe, but a low one removes one of the few defences against commit spoofing in compromised contributor accounts.

busFactorRisk (LOW / MEDIUM / HIGH) and ifMaintainerLeaves (MINIMAL_IMPACT / SLOWS_DEVELOPMENT / PROJECT_LIKELY_STALLS) close the loop with an impact prediction. Concentration alone overstates risk for foundation-governed projects with paid leads; concentration plus impact gives the real read. For the deeper field-by-field treatment, the bus factor explainer is the companion piece — that post does the rolling-window methodology and worked examples.

What to read for real quality

The signal: Community profile completeness + release cadence + latestRelease.daysSinceRelease.

Quality is the hardest of the five to compress into one number, because "quality" means different things to different buyers — code quality, governance quality, release-discipline quality, documentation quality. The replacement signal that survives across all those uses is the GitHub Community Profile completeness check.

A repo's community profile (hasReadme, hasContributing, hasCodeOfConduct, hasIssueTemplate, hasPullRequestTemplate, hasLicense, healthPercentage) is a structured measurement of governance maturity. A 100% community-profile score doesn't guarantee good code — but a sub-50% score on a 60k-star "flagship" project tells you nobody set up the basics, and that the project's apparent maturity is a marketing read, not a structural one.

Release cadence adds the discipline layer. latestRelease.daysSinceRelease under 90 for a production library is healthy. 180 is a yellow flag. 365 is a red flag. 540+ on a project that claims production-readiness is an open invitation to read the issues tab and find out why.
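The cadence bands above translate directly into a small helper; the band names are illustrative, not actor output values:

def release_cadence_flag(days_since_release: int) -> str:
    """Release-discipline read for a production library (bands from the prose)."""
    if days_since_release < 90:
        return "HEALTHY"
    if days_since_release < 180:
        return "WATCH"
    if days_since_release < 365:
        return "YELLOW_FLAG"
    if days_since_release < 540:
        return "RED_FLAG"
    return "READ_THE_ISSUES_TAB"

print(release_cadence_flag(23))   # HEALTHY  (the scrapy example above)
print(release_cadence_flag(487))  # RED_FLAG (the frozen 60k-star flagship)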

The composite communityScore (0-100) rolls these together with a weighted breakdown. A 60k-star repo with a 42 community score is a different proposition than a 3k-star repo with an 88 community score, even though the popularity metric ranks them the opposite way around.

What to read for real trajectory

The signal: Trajectory enum (GROWING / STABLE / DECLINING / COLLAPSING / REVIVING) + breakout detection + acceleration vs category baseline.

Stars are an integral. Trajectory is a derivative. The integral can be enormous while the derivative is negative — that's the "popular but dying" case stars can't see by definition. A repo can have 60,000 stars and a trajectory: COLLAPSING reading at the same time, because trajectory measures the slope of the activity curve, not the cumulative star total.

The five-state enum is the readable summary: GROWING (accelerating), STABLE (mature consistent), DECLINING (slowing), COLLAPSING (rapidly losing activity), REVIVING (coming back from dormancy). One field replaces a manual reading of seven different time-series charts.

Breakout detection (isBreakout: true) flags repos in the top 1% of a category by current velocity — useful for finding projects that haven't yet hit the popularity threshold the star sort optimises for. For the "find rising projects before they hit Trending" workflow, the find trending GitHub repositories before they blow up post does the deep treatment of acceleration-vs-baseline signal stacking.

Acceleration vs category baseline is the subtlety that matters most. A 30k-star repo gaining 5 stars/day is decelerating. A 1,200-star repo gaining 80 stars/day is accelerating. The star count says the first repo is bigger; the acceleration says the second one is the trade. Both can be true simultaneously, which is why the trajectory enum exists — it surfaces the second-order signal stars hide.
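A sketch of the acceleration read. The baseline figures in the usage lines are assumed for illustration, and the actor also normalises against a category baseline, which this omits:

def velocity_read(stars_per_day_now: float, stars_per_day_baseline: float) -> str:
    """Is the repo accelerating against its own recent baseline?"""
    if stars_per_day_baseline == 0:
        return "NEW_OR_NO_HISTORY"
    acceleration = stars_per_day_now / stars_per_day_baseline
    if acceleration >= 2.0:
        return "ACCELERATING"
    if acceleration >= 0.8:
        return "STEADY"
    return "DECELERATING"

# Baselines below are assumed figures for the two repos in the paragraph above.
print(velocity_read(5, 25))   # big repo slowing down -> DECELERATING
print(velocity_read(80, 10))  # small repo taking off -> ACCELERATING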

Naive vanity verdict vs replacement-signal verdict

A side-by-side at scale. These are realistic profiles I've seen come out of category scans, mapped to what the star count alone would tell you and what the replacement stack actually says.

Repo profile | Star verdict | Replacement-signal verdict
60k-star CSS framework, sole BDFL, last release 412 days ago | Industry standard | CAUTION, AT_RISK, COLLAPSING governance
3k-star security library, 8 active contributors, monthly tagged releases, used by major cloud providers | Niche, skip | STRONGLY_RECOMMENDED, top 3% of category
12k-star AI demo repo, peaked Day 1 on Hacker News, no commits since launch month | Trending | ABANDONED, decayVelocity FAST, no human activity 11 months
20k-star foundation-governed compiler, top contributor share 0.88 (paid lead) | Single-maintainer risk | MODERATE — concentration high, governance distributed, MINIMAL_IMPACT
800-star Rust toolchain, REVIVING from 14-month dormancy, 4 new contributors in last 60 days | Dead | REVIVING trajectory, breakout pending, MEDIUM adoption readiness
45k-star ORM, 100% of last 6 months' commits from dependabot[bot] and github-actions[bot] | Active | ZOMBIE, AT_RISK, no human activity 180d
1.1k-star CLI tool, 15 active contributors, weekly releases, fork-to-star ratio 0.31 | Too small to bother | STRONGLY_RECOMMENDED, GROWING, high adoption percentile
28k-star LLM agent framework, +30 stars/day for last year, no acceleration | Hot project | STABLE — not accelerating, ranked below smaller velocity leaders


The pattern is consistent. The star count and the verdict point in opposite directions in 6 of 8 rows. In the 2 rows where they roughly agree, the verdict still adds context the star count couldn't carry — concentration vs governance, REVIVING vs dead, breakout vs steady-grower.

Concrete archetypes — where stars and reality disagree

Five archetypes worth knowing. Each is the kind of profile a star-only filter mishandles in a recognisably different way.

The flagship-framework-without-governance. A 60,000-star JS framework with a single benevolent dictator, no formal governance, no foundation backing, and a release cadence that quietly stopped 14 months ago. The star count says industry standard. The community profile says 38% complete. The contributor graph says one person doing 91% of the last year's commits. When the BDFL eventually steps back — they always eventually step back — the project has no continuity plan, and "we picked this because it had 60k stars" becomes a post-mortem line.

The quiet workhorse. A 3,000-star security library, 8 active contributors, monthly tagged releases, signed commits at 89%, used in production by 4 of the top 10 cloud providers. The star count says skip. The dependents count says half the internet runs on this. The fork-to-star ratio says active integration work. This is the profile most likely to be the correct dependency choice and the least likely to be picked by a star-sorted filter.

The Hacker News spike. A 12,000-star AI demo with a launch-day peak of 8,000 stars in 48 hours, then 4,000 more across the next month, then a flat curve. Last commit was the launch month. README still says "production-ready." The star counter never got the memo. Adoption never happened. This archetype is overrepresented in trending lists and underrepresented in things you should depend on.

The foundation-governed concentration paradox. A 20,000-star compiler under a recognised foundation. Top-contributor share is 0.88 — looks like single-maintainer risk by raw concentration. But the lead is a paid full-time employee with a documented succession plan, and the foundation funds two backup maintainers. busFactorRisk: HIGH by share alone, ifMaintainerLeaves: MINIMAL_IMPACT by impact analysis. The naive concentration read overstates risk; the impact-weighted read corrects it. Stars don't help either way.

The REVIVING from dormancy. An 800-star Rust toolchain that went dormant for 14 months, then 4 new contributors picked it up in the last 60 days, shipped a release, opened the issue tracker, and started accelerating. The star count says dead. The trajectory says REVIVING with revivalStrength: STRONG. This is the archetype VC sourcing teams pay for — the inflection point where a category is being re-attacked by a new team. Star-sorted filters bury it; trajectory-sorted filters surface it.

These five archetypes — flagship-without-governance, quiet workhorse, Hacker News spike, governance paradox, and revival — are 80% of the cases where stars produce the wrong verdict. None of them are obscure edge cases; they're recurring patterns. The replacement stack handles all five.

The star-buying economy

A short detour because it changes how to read any star-sorted result.

Star-buying is not a hypothetical. Dagster's investigation, which set up a controlled dummy repo and bought stars from multiple services, found a working market: services like Baddhi Shop selling 1,000 fake stars for $64, GitHub24 selling stars at €0.85 each. Their detector ran fingerprint analysis across recent star events and identified inflation patterns concentrated in AI-tagged repositories — accounts created clustered around specific dates, near-empty profiles, follower counts under 1, public repo counts under 5, and grant/creation/update timestamps that all line up.
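For illustration, a single-account version of that fingerprint, loosely following the criteria Dagster published. Field names are simplified stand-ins, not the exact GitHub API schema, and a real detector scores the whole stargazer history rather than one profile:

from datetime import date

def looks_like_fake_star(account: dict) -> bool:
    """Single-account fingerprint: recent creation, near-empty profile,
    matching star-grant / creation / update dates. Illustrative only."""
    return (
        account["created_at"] >= date(2022, 1, 1)
        and account["followers"] <= 1
        and account["following"] <= 1
        and account["public_gists"] == 0
        and account["public_repos"] <= 4
        and not account.get("bio")
        and account["starred_at"] == account["created_at"] == account["updated_at"]
    )

suspicious = {
    "created_at": date(2025, 11, 2), "updated_at": date(2025, 11, 2),
    "starred_at": date(2025, 11, 2), "followers": 0, "following": 0,
    "public_gists": 0, "public_repos": 1, "bio": "",
}
print(looks_like_fake_star(suspicious))  # True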

The implication is not that every popular repo is gaming stars. It's that you can no longer use the star count as ground truth for "real human interest." A new AI project at 3,000 stars in two weeks could be organic acceleration, paid promotion, or actual purchased stars — the star counter can't distinguish. The replacement signals — commit activity, contributor onboarding, fork-to-star ratio, downstream dependents — are harder to fake at the same scale, because they require either maintained infrastructure or actual usage to produce.

The defence isn't to detect every fake star. It's to never make an adoption decision on a metric that's been monetised.

What teams actually do with the replacement stack

Five concrete workflows where the substitution pays for itself.

Adoption decisions on a single repo. "Should we depend on this?" is the canonical use case. Pull the full intelligence — mode: "repo-due-diligence" against a specific compareRepos list — and read the verdict, the maintenance status, the bus factor risk, and the community score. The star count is reference context, not input.

Dependency tree audits. A 200-package dependency tree at $0.15/repo is $30/run. Run it weekly. Read the diff. The team only needs to look at NEW, NEWLY_ABANDONED, and STATUS_CHANGE rows — typically 3-7 per week against a steady tree. Star counts are irrelevant; the audit is asking "what's degrading in our supply chain?" and stars don't measure degradation.

Open-source due diligence for VCs. Sourcing dev-tool deals on star count produces a pipeline full of frozen flagships. Sourcing on trajectory: GROWING plus adoptionPercentile > 70 plus decayScore < 20 plus rising star velocity produces a pipeline of actual current movers. The star sort buries inflection points; the trajectory sort surfaces them.
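Against the JSON output shown earlier, that filter is a few lines. The field paths follow the example output above, and the thresholds are the ones quoted in this paragraph:

def sourcing_candidates(rows: list) -> list:
    """Filter a category-scan dataset down to current movers.
    The rising-star-velocity check is left out here because it needs a
    prior run to diff against."""
    return [
        r for r in rows
        if r["maintenance"]["trajectory"] == "GROWING"
        and r["benchmarks"]["adoptionPercentile"] > 70
        and r["maintenance"]["decayScore"] < 20
    ]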

Content sourcing and ecosystem analysis. Newsletter writers, dev-tool category analysts, and competitive intelligence teams need ecosystem maps that are accurate as of this week, not five years ago. A market-map run with auto-partitioning past the 1,000-result GitHub Search cap produces a current ranked dataset of up to 10,000 repos with all five replacement signals. Stars are the discovery filter; the verdicts are the editorial.

Security team risk reviews. Security reviews on dependency trees should never use star count. They should use supplyChainRiskScore, topContributorShare, signedCommitRatio, and busFactorRisk — those are the fields that map to the failure modes the security team is paid to prevent. Stars are decorative on a security review.

For teams already running these workflows manually, the evaluate GitHub repositories framework post walks through the three-tier model — naive (stars), intermediate (activity metrics), and decision intelligence (composite scoring). This post is the meta-essay tying that framework's "naive" tier to the failure modes that make replacement worth the cost. ApifyForge runs this whole cluster of GitHub-evaluation content because the same five replacement signals show up in every workflow.

How much does it cost to replace star-based filtering?

The cost math is the part most teams underweight, in both directions.

The GitHub Repo Intelligence Apify actor — published by ApifyForge — charges $0.15 per repository scored. A typical use:

Workflow | Repos | Cost
One-off due diligence on 5 candidate frameworks | 5 | $0.75
Weekly dependency audit on a 200-package tree | 200 | $30
Monthly market map of a 1,500-repo category | 1,500 | $225
Quarterly ecosystem scan, 5,000 repos with auto-partition | 5,000 | $750

Compare against the alternative. A 200-package weekly audit done manually is 4-8 engineering hours per week reading commit graphs and contributor lists — call it 6 hours at $150/hour loaded cost, or $900/week. The actor is $30. The savings funds the replacement stack twenty-nine times over and removes the recurring "but it has 30k stars" debate from adoption meetings.
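The break-even arithmetic, spelled out (the hourly figure is the assumption stated above, not measured data):

# Back-of-envelope comparison from the paragraph above.
repos = 200
actor_cost = repos * 0.15                       # $30.00 per weekly run
manual_cost = 6 * 150                           # 6 h/week at $150/h loaded = $900
savings_multiple = (manual_cost - actor_cost) / actor_cost
print(f"${actor_cost:.2f} vs ${manual_cost}; savings cover {savings_multiple:.0f} more runs")
# $30.00 vs $900; savings cover 29 more runs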

The hidden cost is the one that doesn't show up in invoices: the cost of getting the adoption decision wrong on a popular-but-unhealthy project, finding out 18 months later when the maintainer leaves or the CVE drops, and absorbing the migration work to swap it out under pressure. That cost is bigger than the entire annual budget for the audit. The ApifyForge actor is the cheap way to never pay it.

How to call the actor

Concrete input for the three workflows that replace star-based filtering. These are inputs to the actor — what the actor returns is the JSON output shown above.

{
  "query": "topic:vector-database",
  "mode": "market-map",
  "maxResults": 5000,
  "autoPartitionResults": true,
  "excludeForks": true
}

That's the category-scan workflow — replaces "sort by stars" with a multi-signal ranked dataset across up to 10,000 repos.

{
  "compareRepos": ["facebook/react", "vuejs/vue", "sveltejs/svelte", "solidjs/solid"],
  "mode": "adoption-shortlist"
}

That's the side-by-side comparison workflow — replaces "the one with the most stars wins" with a verdict-per-repo and an explicit winner.

{
  "query": "topic:web-scraping",
  "mode": "dependency-audit",
  "maxResults": 200,
  "compareToPreviousRun": true
}

That's the scheduled-monitoring workflow — replaces "we'll check on it next time someone remembers" with a weekly diff flagging NEW, SCORE_CHANGE, and NEWLY_ABANDONED rows.
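If you're calling the actor programmatically, a minimal sketch with the Apify Python client looks like the following. The actor ID string is a placeholder to substitute with the real one from the actor's store page, and the printed fields assume the output shape shown earlier:

from apify_client import ApifyClient

client = ApifyClient("<APIFY_API_TOKEN>")

run_input = {
    "compareRepos": ["facebook/react", "vuejs/vue", "sveltejs/svelte", "solidjs/solid"],
    "mode": "adoption-shortlist",
}

# Actor ID below is a placeholder; use the real ID from the actor's store page.
run = client.actor("apifyforge/github-repo-intelligence").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["fullName"], item["recommendations"]["adoptionVerdict"])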

Best practices — when stars are still useful

Stars aren't useless. They're a discovery signal; they fail as an evaluation signal. Six rules for using them correctly.

  1. Use stars as a top-of-funnel filter, never as a verdict. minStars: 100 removes toy projects from a category scan. That's a defensible use. Sorting the final ranked list by stars is not.
  2. Never compare star counts across categories. 3,000 stars in vector DBs is huge; 3,000 stars in JS frameworks is a curiosity. Always normalise to category percentiles before comparing.
  3. Treat star count as a popularity prior, not a quality signal. Stars tell you a repo is more likely to have working basics — README, license, some structure. They don't tell you it's safe to adopt.
  4. Read star velocity, not star count, for trend signals. Cumulative stars are useless for trend detection. Rolling 30-day star velocity normalised by category baseline is the actual signal.
  5. Combine stars with fork-to-star ratio for adoption read. Stars alone over-index on bookmarking. Stars + forks + a ratio above 0.10 is a defensible adoption proxy.
  6. Decay your own internal star data. If you keep an internal "popular GitHub projects" list, decay it. A repo that hit 40k stars three years ago and froze should not still rank above a 5k-star project that's accelerating now. A minimal sketch of that decay follows this list.
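One simple way to implement the decay in rule 6 is an exponential half-life over a staleness clock. Both choices below, the 365-day half-life and days-since-last-release as the clock, are assumptions for illustration:

import math

def decayed_stars(stars: int, days_stale: int, half_life_days: float = 365.0) -> float:
    """Exponential decay of a star count for internal ranking lists."""
    return stars * math.exp(-math.log(2) * days_stale / half_life_days)

print(round(decayed_stars(40_000, days_stale=3 * 365)))  # frozen 3 years -> 5000
print(round(decayed_stars(5_000, days_stale=30)))        # shipping now   -> 4723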

Common mistakes

Five recurring patterns I see teams make with stars. Each has a one-line correction.

Mistake 1: "It has 30k stars, must be safe." Stars don't measure safety. Read supplyChainRiskScore, topContributorShare, and signedCommitRatio for safety. Star count is independent of all three.

Mistake 2: Sorting a category-scan dataset by stars and reading the top 10. The 1,000-result GitHub Search cap means the top 10 by stars are the 10 most-bookmarked-ever, not the 10 most-relevant-now. Sort by projectHealthScore or adoptionReadinessScore instead.

Mistake 3: Treating Hacker News spikes as adoption. A 7-day star inflow from a Show HN post is a discovery event, not an adoption event. Wait 60 days and check the slope before drawing conclusions.

Mistake 4: Ignoring star-buying as a possibility. Any new project hitting 5k stars in three weeks in an AI-adjacent category should be checked against a fake-star detector before being treated as organic. Dagster open-sourced a working one.

Mistake 5: Using lifetime star count to compare two repos at different ages. A 5-year-old project with 30k stars (~6k/year) is on a different trajectory from a 1-year-old project with 8k stars. The 8k repo is accelerating six times faster. Lifetime stars hide that.

Limitations

The replacement signals aren't free.

  • Enrichment costs API time. Community profile, activity stats, contributor data, language breakdowns, and release info each add API calls per repo. A 1,000-repo enriched run takes ~35 minutes with a token, vs ~22 seconds for raw search.
  • Some signals need a baseline. Trajectory and breakout detection need at least one prior run for cross-run comparison. The first run is just current state; the second run gives you the diff.
  • Bus factor needs context. Top-contributor share alone overstates risk for foundation-governed projects with paid leads. The ifMaintainerLeaves impact prediction adjusts for that, but no automated signal is perfect — final calls on critical dependencies still benefit from a human read.
  • Fake-star detection is not built in. The actor does not currently flag fake stars directly. The fake-star defence here is structural: don't depend on the star count, depend on signals that are harder to fake at scale.
  • Signed-commit ratio is a hygiene signal, not a safety guarantee. A high signed ratio is good. It doesn't prove the signers haven't been compromised — see the xz-utils incident for the worst-case version.

Glossary

Vanity metric — A flattering number with no predictive power for the decision being made. Stars are the canonical GitHub example.

Lifetime counter — A metric that only increases. Stars, total commits, total contributors. Lifetime counters can't reflect decay.

Bus factor — The minimum number of contributors who would need to leave for a project to stall. Concentration, not headcount.

Trajectory — The direction of a project's activity curve, summarised as GROWING / STABLE / DECLINING / COLLAPSING / REVIVING.

Decay score — A 0-100 measurement of how fast a project is declining, paired with a velocity flag (NONE / SLOW / FAST).

Zombie repo — A repository with recent commit timestamps but no real human activity — bot-only commits, dependency bumps, license edits.

Breakout — A repository in the top 1% of a category by current velocity, regardless of absolute star count.

Fake stars — Programmatically purchased GitHub stars from an external service, used to inflate apparent popularity. Documented economy with detectable fingerprints.

Key facts about GitHub stars

  • GitHub stars are a lifetime counter. They never decay when a project stops being maintained.
  • 75% of developers consider star count before using a project, per the Borges and Valente 2018 study of the top-5,000 starred repositories.
  • The same study explicitly recommended caution about selecting projects based on star count.
  • A working market for purchased GitHub stars exists, with documented pricing as low as $0.064 per star, per Dagster's fake-star investigation.
  • GitHub now hosts 630 million repositories with 180 million developers, per the 2025 Octoverse report — a noise floor where star count alone tells you almost nothing.
  • The xz-utils backdoor (CVE-2024-3094) targeted a project effectively maintained by one person, despite carrying significant apparent popularity. Stars said "trustworthy"; concentration said "fragile."
  • Replacement signals — fork-to-star ratio, 90-day commit velocity, top-contributor share, community profile, trajectory enum — are individually harder to inflate than star count.

Broader applicability

The argument generalises beyond GitHub.

These patterns apply to any platform where lifetime cumulative engagement metrics get used as quality signals. Five universal principles:

  1. Lifetime counters are vanity by default. Followers, total downloads, total signups, total page views — all lifetime counters with no decay. None are quality signals on their own.
  2. Direction beats magnitude. Whether the metric is going up or down right now matters more than the absolute level. A derivative read (trajectory) survives the failure modes an integral read (count) doesn't.
  3. Concentration is a leading risk indicator. One person doing 90% of the work is the structural precondition for collapse, regardless of which platform's number says the project is popular.
  4. Discovery and evaluation are different funnels. The metric that's good at "find candidates" is rarely good at "decide between candidates." Don't reuse the same number for both.
  5. Any monetised metric should be assumed inflatable. Once a number drives money decisions, a market for inflating that number will exist. Detection or replacement, not trust, is the defence.

These apply to npm download counts, LinkedIn followers, Product Hunt upvotes, Stack Overflow rep, and most "trending" tabs across most platforms. GitHub stars happen to be the version where the gap is most documented — but the structural pattern is platform-agnostic.

When you need this

Replace star-based filtering with the signal stack when:

  • You're picking dependencies that will run in production and "the maintainer disappeared" isn't an acceptable failure mode
  • You're funding open-source companies and "the project froze 18 months ago" would change the cheque size
  • You're auditing a dependency tree larger than 50 packages
  • You're sourcing emerging technologies where breakout matters more than current popularity
  • You're comparing competing frameworks and "X has more stars" keeps coming up in adoption meetings

You probably don't need this if:

  • You're picking a one-off utility for a throwaway script and the cost of a bad pick is rerunning the script
  • You're choosing between first-party internal repos where you already have full context
  • You're doing a presentation slide that needs round numbers and "30k stars" is the right level of detail for the audience
  • The decision is fully reversible in under 30 minutes with no migration cost
  • You're doing pure discovery and absolute popularity is genuinely the only signal you need at that stage

Frequently asked questions

Are GitHub stars a good metric?

GitHub stars are a useful discovery signal but a poor evaluation signal. They measure cumulative bookmarking, not current health, adoption, or risk. Stars are a lifetime counter — they never decay when a project goes dormant — and they're inflatable through a documented fake-star market. Use them at the top of a funnel to find candidates worth evaluating, then switch to the replacement stack (fork-to-star ratio, commit velocity, contributor concentration, trajectory) for the actual decision.

What should I use instead of GitHub stars?

Replace stars with a five-signal stack, one per question: fork-to-star ratio plus downstream dependents for adoption; 90-day commit velocity plus the maintenance.trajectory enum for health; top-contributor share plus signed-commit ratio for risk; community profile plus release cadence for quality; and rolling star velocity plus breakout detection for trajectory. Each signal answers a question stars can't. Read together, the stack produces a verdict instead of a vibe.

Do GitHub stars predict project quality?

No. Quality requires recent maintenance activity, distributed contributors, release discipline, and community governance — none of which correlate strongly with star count. The Borges and Valente 2018 study of the top-5,000 starred GitHub repositories explicitly warned about the risks of selecting projects by star count, while finding that 75% of developers do exactly that. Stars and quality have similar distributions but different drivers.

Are GitHub stars bought?

Yes, in some cases. Dagster's fake-star investigation documented a working market with services pricing fake stars at €0.85 each and at least one provider selling 1,000 stars for $64. Their detector identified clear fingerprint patterns across multiple AI-tagged projects — recent account creation, near-empty profiles, follower counts of one or fewer, matching grant/creation/update dates. The implication is structural: any popular new project's star count should be treated as suspect until corroborated by activity, fork, and contributor signals that are harder to fake at scale.

How do I evaluate a GitHub repo without stars?

Pull a multi-signal read: 90-day commit count, top-contributor share over the last year, days since last release, community profile health percentage, and the trajectory enum. The GitHub Repo Intelligence Apify actor on ApifyForge returns all of these per repo at $0.15 each, with composite verdicts (STRONGLY_RECOMMENDED, CAUTION, HIGH_RISK) that compress the stack into a decision. Stars become reference context — one field among forty, not the headline number.

Should I trust a 50k-star repo?

Not on the basis of the star count alone. A 50k-star repo can have one active maintainer doing 91% of last year's commits, a release cadence that froze 14 months ago, a 38% community profile score, and a trajectory: COLLAPSING reading at the same time — all while the star counter is still at 50k because nobody unstars old projects. Pull the full intelligence and read the verdict. If the verdict is STRONGLY_RECOMMENDED with low bus-factor risk and active maintenance, then yes. If not, the 50k is decorative.

What's a good GitHub star count?

There isn't one. Star count is a category-relative measurement at best and a meaningless absolute at worst. 3,000 stars in vector databases is a top-3 player; 3,000 stars in JS frameworks is invisible. The right question is whether the repo is in the top decile of its category by health, adoption, and trajectory — which depends on the category baseline, not on a fixed threshold. Always compare percentile within a category, never absolute count across categories.

How do GitHub stars correlate with adoption?

Weakly. Stars measure bookmarking; adoption shows up in forks, downstream package dependents, and production-grade language profiles. The fork-to-star ratio is the cleanest single proxy: above 0.20 suggests heavy adoption, under 0.05 suggests heavy bookmarking-without-use. Stars alone are biased toward projects that get social traction (Hacker News, conference talks, viral threads), which is a different filter from "things people actually wired into production."

Can a 1,000-star project be more important than a 50,000-star project?

Yes, regularly. The most consequential dependencies in many ecosystems are sub-3,000-star security and infrastructure libraries that quietly run inside half the Fortune 500. Star count rewards consumer-visible projects with social-graph reach. It under-rewards plumbing — the kind of project that gets imported, never starred, and is structurally critical. The replacement stack — particularly downstream-dependents and production-grade language tags — is what separates "popular" from "important."

Why do GitHub stars never decay?

Because the star action is permanent unless the user explicitly unstars. There's no half-life, no inactivity decay, no "this project went dormant" mechanism. Once a star is granted, it persists until the user revisits the repo and clicks unstar. Almost nobody does that — so lifetime accumulated stars survive long after the project that earned them stops being healthy. The metric is mechanically additive, which means it can't reflect any current reality.

Is the GitHub Trending page a better signal?

Slightly, but not enough to depend on. Trending is a 24-hour star-velocity slice, dominated by AI-of-the-week noise, with no editorial filtering. It catches viral spikes after they've already happened — by the time a project is trending publicly, the early-mover advantage is gone. The find trending GitHub repositories before they blow up post covers the rolling-window acceleration approach that surfaces breakouts in the 100-3,000 star band, before they hit Trending.


Ryan Clinton publishes Apify actors and MCP servers as ryanclinton and builds developer intelligence tools at ApifyForge.


Last updated: May 2026

This guide focuses on GitHub, but the same patterns apply broadly to any platform where lifetime cumulative engagement metrics — followers, downloads, upvotes, signups — get repurposed as quality signals.