
Building a Resale-Safe Business Dataset

Scraped Google data isn't legally yours to resell. A resale-safe business dataset is built on licensed open ground truth: Overture Maps places under CDLA Permissive 2.0.

Ryan Clinton

The problem: You scraped a few thousand businesses, cleaned them up, and now someone wants to ship that data in a product, sell it to a customer, or hand it to a client as a deliverable. Then legal asks one question: "where did this come from, and are we allowed to resell it?" The honest answer is usually no. A scrape of Google content isn't a licence. It's a copy of someone else's data, taken under terms that explicitly restrict redistributing it. The whole project was sitting on a legal hole nobody checked for.

This is the part of place-data work that gets ignored right up until the moment it can't be. I publish actors on the Apify Store as ryanclinton, including ones in this space, so I'm not here to scare you off scraping. Scraping for a one-off lookup is fine. But "scrape it and sell it" is a different claim, and that claim has a structural problem in the middle of it. This post is about the positive version: what it actually takes to end up with a business dataset you own, can resell, ship, or hand off, and why the answer is licensed open ground truth, not a cleaner scrape.

This is the legal-foundation piece in a cluster around the pillar post, "Google Maps Scraping Isn't a Data Strategy." The pillar flagged that scraped Maps data isn't legally yours to build on. This post goes deep on the fix.

If your problem is already "I have a business list and I need records I can legally resell," that's the job the Business Data Enricher Apify actor was built for. But read on for why the licence, not the scrape, is the thing that matters.

What is a resale-safe business dataset? A resale-safe business dataset is a set of business records built on data licensed for redistribution and product-building, most commonly Overture Maps places under CDLA Permissive 2.0, so you can legally resell, ship, or hand off the output, unlike scraped content bound by a platform's terms.

Why it matters: Scraped Google content is governed by the Google Maps Platform Terms, which restrict caching, redistribution, and building competing datasets. "Public" does not mean "yours to resell." The licence the data carries decides what you're allowed to do with it, and a scrape carries the wrong one.

Use it when: You're building a product on business data, selling a list or dataset to a customer, delivering data as a client deliverable, or storing place records you'll redistribute in any form.

Also known as: commercially-licensed business data · resale-permissive place data · CDLA-licensed places · open ground-truth business records · data you can legally redistribute · licence-safe local business data.

Quick answer

  • What it is: business records built on a redistribution-permissive licence (CDLA Permissive 2.0 on Overture Maps), so the output is legally yours to resell, ship, or hand off.
  • When you need it: any time the data leaves your own four walls, whether sold, productized, or delivered to a client.
  • When you don't: genuine internal one-off lookups you'll never redistribute.
  • The core distinction: a scrape copies data under restrictive terms; a resale-safe dataset is built on data explicitly licensed for redistribution.
  • Main tradeoff: open licensed data carries fewer live fields (no reviews or today's hours) but is the only foundation you can legally build a saleable product on.

In this article: What it is · Why scrape-and-sell breaks · What CDLA allows · Alternatives · Best practices · Mistakes · FAQ

Key takeaways

  • Scraped Google Maps content is bound by the Google Maps Platform Terms, which restrict caching, redistribution, and competing-dataset use, so it isn't a foundation you can resell.
  • Overture Maps places ship under CDLA Permissive 2.0, a Community Data License Agreement licence built specifically for redistribution and commercial product-building.
  • "Scraping public data isn't a crime" and "this data is legally mine to resell" are two completely different legal bars: courts ruling on the first say nothing about the second.
  • CDLA Permissive 2.0 has one practical obligation: preserve attribution and licence notices on redistribution. There is no share-alike, no copyleft, no requirement to open your own product.
  • Not all Overture themes are resale-permissive: places are CDLA, but buildings and transportation are ODbL (share-alike), so a resale-safe dataset has to keep them apart.

Licence posture, in concrete terms

What you want to do | Scraped Google data | CDLA-licensed Overture data
--- | --- | ---
Use it internally for a lookup | Generally fine | Fine
Sell the dataset to a customer | Restricted by Google TOS | Permitted with attribution
Ship it inside a paid product | Restricted | Permitted with attribution
Hand it to a client as a deliverable | Restricted | Permitted with attribution
Redistribute or republish it | Restricted | Permitted with attribution

Licensing posture is based on publicly available terms as of June 2026 and may change. This is a plain-language summary, not legal advice.

What is a resale-safe business dataset?

Definition (short version): A resale-safe business dataset is a collection of business records built on data licensed for redistribution and commercial use, most commonly Overture Maps places under CDLA Permissive 2.0, so the records can legally be resold, shipped in a product, or delivered to a customer.

The word doing the work is licensed. A dataset isn't resale-safe because it's clean, or accurate, or yours-feeling. It's resale-safe because the data underneath it carries a licence that grants you the right to redistribute it.

That's a property of the source, not of how nicely you formatted the rows. You can take a scrape, dedupe it, validate it, and make it beautiful, and it's still not resale-safe, because the underlying content is still bound by the terms it was scraped under. Licence posture is decided upstream, at the source, and it travels with the data no matter what you do downstream.

There are broadly three places business data comes from, and they have opposite legal postures. Scraped platform content (Google Maps, Yelp, and similar) is copied under terms that restrict redistribution. Licensed open data (Overture under CDLA, Foursquare's open release under Apache 2.0) is published specifically to be reused commercially. Commercial licensed data (paid aggregators) grants resale rights per contract. A resale-safe dataset is built on the second or third, never the first.

Why isn't a scrape legally yours to resell?

A scrape isn't yours to resell because you didn't acquire the data under a licence that grants redistribution rights. You copied it under terms that explicitly restrict it. The Google Maps Platform Terms prohibit caching, redistributing, and using Google content to build a competing dataset, and those terms bind you contractually whether or not the data was publicly visible.

This is the distinction that trips up almost everyone: visibility is not a licence. "It was on a public web page, so I can do what I want with it" feels intuitive and is wrong. And while recent case law has clarified that scraping publicly accessible data isn't automatically a Computer Fraud and Abuse Act violation, that ruling is about the act of accessing, not your right to resell the result. Access rights and redistribution rights are different questions under different law: redistribution is governed by contract and the platform's terms, not the CFAA. The Google Maps/Google Earth Additional Terms reinforce the same restriction.

I'm not your lawyer and none of this is legal advice. But the posture is clear enough to plan around: if the data plan is "scrape a platform and sell the output," there's a contractual hole in the middle of it that doesn't get smaller as the project grows. It gets more expensive to discover.

What does CDLA Permissive 2.0 actually allow?

CDLA Permissive 2.0 allows you to use, modify, and redistribute the data, including in commercial products and for resale, with one practical obligation: preserve the attribution and licence notices when you pass the data along. There's no share-alike clause, no requirement to release your own work, and no royalty.

The Community Data License Agreement family was written by the Linux Foundation specifically for sharing data the way open-source licences share code. The "Permissive" variant is the data equivalent of a permissive software licence like MIT or Apache: take it, build on it, sell what you build, just keep the credit line intact.

Here's what that means in plain terms for a business dataset built on Overture places:

  1. You can resell it. Selling a dataset or a product built on CDLA-licensed data is explicitly permitted.
  2. You can ship it closed. Nothing forces you to open-source your product or share your derived data back. The permissive variant has no copyleft.
  3. You can modify it. Clean it, enrich it, reshape it, merge it with your own data. All fine.
  4. You must keep attribution. The one real obligation: the recipient of your data needs the attribution and licence notice. This is a credit line, not a constraint on how you use it.
  5. There's no warranty. Like every open licence, the data is provided as-is. You're responsible for fitness for your use.

The contrast with a scrape is total. One licence is written to be redistributed; the other set of terms is written to prevent it.

One important caveat, because it's the trap people fall into: Overture Maps does not ship every theme under the same licence. The places theme is CDLA Permissive 2.0. Other themes, namely buildings and transportation, are licensed under ODbL, which carries a share-alike obligation that can force your derived data open. A genuinely resale-safe business dataset keeps the CDLA places data clean and never silently mixes ODbL-licensed fields into the output. That separation is part of what "resale-safe" has to mean, and it's exactly the kind of thing that's easy to get wrong by hand.
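One way to make that separation mechanical is a guard that runs before any output is built. The following is a minimal sketch, not Overture tooling: the theme-to-licence map covers only the three themes named above and should be checked against Overture's current documentation, and `THEME_LICENCES`, `SHARE_ALIKE`, `share_alike_themes`, and `assert_resale_safe` are hypothetical names chosen for illustration.

```python
# Hypothetical map covering only the themes named in this post.
# Verify against the current Overture documentation before relying on it.
THEME_LICENCES = {
    "places": "CDLA-Permissive-2.0",
    "buildings": "ODbL",
    "transportation": "ODbL",
}

# Licences that carry a share-alike obligation.
SHARE_ALIKE = {"ODbL"}


def share_alike_themes(themes):
    """Return the requested themes that would import share-alike terms.

    Unknown themes raise, so a new theme cannot slip through unreviewed.
    """
    unknown = sorted(set(themes) - set(THEME_LICENCES))
    if unknown:
        raise ValueError(f"no licence on file for themes: {unknown}")
    return sorted(t for t in set(themes) if THEME_LICENCES[t] in SHARE_ALIKE)


def assert_resale_safe(themes):
    """Refuse to build an output that mixes share-alike themes in."""
    offending = share_alike_themes(themes)
    if offending:
        raise ValueError(f"share-alike themes in resale output: {offending}")


assert_resale_safe(["places"])                      # passes silently
print(share_alike_themes(["places", "buildings"]))  # ['buildings']
```

Failing loudly on an unknown theme is a deliberate choice in this sketch: a licence nobody has reviewed is treated as a blocker rather than assumed permissive.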

What does a resale-safe record look like?

A resale-safe record is a business record that carries its own licence proof (the source, the licence identifier, and the attribution string) alongside the business data, so the resale posture travels with the row instead of living in a document nobody can find later.

Here's the shape. This is what the output looks like, not how it's produced:

{
  "name": "Domino's Pizza",
  "category": "pizza_restaurant",
  "address": "Belfast BT9 6AA",
  "gers_id": "08f1949...c3a",
  "license": "CDLA-Permissive-2.0",
  "source": "Overture Maps",
  "attribution": "© Overture Maps Foundation",
  "resale_safe": true
}

The point isn't the business fields; any tool gives you those. The point is the last four. license, source, attribution, and resale_safe are the row telling you, per record, that this data carries redistribution rights and exactly what credit line to ship with it. When legal asks "can we resell this," the answer is in the data, not in a Slack thread from four months ago. A scrape row can't carry that flag honestly, because the honest value would be false.
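A check like this can run on every row before data ships. It is a sketch under stated assumptions: the field names come from the sample record above, while `PROOF_FIELDS` and `missing_licence_proof` are hypothetical names, and the rule that `resale_safe` must be the boolean `True` is my own strictness rather than anything a tool mandates.

```python
# Field names follow the sample record above; the checks are assumptions.
PROOF_FIELDS = ("license", "source", "attribution")


def missing_licence_proof(record):
    """List the licence-proof fields a record lacks (empty list = complete)."""
    missing = [
        field
        for field in PROOF_FIELDS
        if not str(record.get(field) or "").strip()
    ]
    # resale_safe must be the boolean True, not a truthy string like "yes".
    if record.get("resale_safe") is not True:
        missing.append("resale_safe")
    return missing


licensed_row = {
    "name": "Domino's Pizza",
    "license": "CDLA-Permissive-2.0",
    "source": "Overture Maps",
    "attribution": "© Overture Maps Foundation",
    "resale_safe": True,
}
scraped_row = {"name": "Domino's Pizza", "address": "Belfast BT9 6AA"}

print(missing_licence_proof(licensed_row))  # []
print(missing_licence_proof(scraped_row))
# ['license', 'source', 'attribution', 'resale_safe']
```

The check only confirms that the proof is present on the row. It cannot confirm that the proof is true, which still depends on where the data actually came from.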

The procurement test

This is where that flag earns its place. A serious buyer's procurement or legal team will eventually send a line like:

"Describe the provenance and licensing of your business dataset."

On a scraped dataset, the honest answer is a paragraph of hedging: where it came from, which terms it was taken under, and why you think that might be fine. That paragraph stalls deals. On a resale-safe dataset, the answer is one line, "Overture Maps places under CDLA Permissive 2.0, attribution attached per record," backed by a sample row that proves it. Same question, same buyer, opposite outcomes, and the difference is entirely in the licence underneath.
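When the proof lives on the rows, the one-line answer can be generated from the data rather than written from memory. The sketch below assumes each record carries `source` and `license` fields as in the sample record; `provenance_summary` and `provenance_lines` are hypothetical helper names.

```python
from collections import Counter


def provenance_summary(records):
    """Count records per (source, license) pair; missing values become 'UNKNOWN'."""
    return Counter(
        (r.get("source") or "UNKNOWN", r.get("license") or "UNKNOWN")
        for r in records
    )


def provenance_lines(records):
    """Render the summary as one line per source/licence pair."""
    summary = provenance_summary(records)
    return [
        f"{source} under {licence}: {count} record(s)"
        for (source, licence), count in sorted(summary.items())
    ]


rows = [
    {"name": "A", "source": "Overture Maps", "license": "CDLA-Permissive-2.0"},
    {"name": "B", "source": "Overture Maps", "license": "CDLA-Permissive-2.0"},
    {"name": "C"},  # no provenance on the row
]
for line in provenance_lines(rows):
    print(line)
```

Any row without provenance shows up as an explicit UNKNOWN line, which is the part of the dataset a procurement team would ask about first.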

What are the alternatives for getting resale-safe data?

There are four honest routes to a business dataset you can legally resell. Each has real tradeoffs across cost, effort, and how much legal work lands on you. I'm naming where each one breaks, not handing you a build guide. The "how to actually produce the records" is the part you don't want to own by hand.

1. Keep scraping and hope nobody asks. Best for: nothing you'll ever redistribute. It's the cheapest path and the one most teams are quietly on. It works right up until the data leaves the building, at which point the Google Maps Platform Terms make it a liability rather than an asset. Fine for internal one-offs, wrong for anything saleable.

2. Build resolution on open data in-house. Best for: organisations with a data team and a permanent need. You'd take a licensed open source like Overture, build the matching and cleaning yourself, and take on the licence-tracking discipline: keeping CDLA places separate from ODbL themes, attaching attribution correctly, and proving provenance per record. That last part is legal-engineering, not just data-engineering, and getting the licence separation wrong can quietly poison your resale posture. Real, but it's an ongoing maintained responsibility, not a one-time build.

3. License a commercial place-data provider. Best for: enterprises that want a contract and a vendor to point at. You buy cleaned data with resale rights spelled out in the contract. The data is genuinely resale-safe, but you're locked to the vendor's schema, refresh cadence, and pricing, often at enterprise scale, and the resale rights are exactly as broad as the contract you negotiated.

4. Resolve your list against licensed open ground truth. Best for: teams that have a list (a CRM, store locations, a scrape they already paid for) and want it returned as records they can legally resell, without building or maintaining any of the licence machinery. This is the category the Business Data Enricher Apify actor sits in. You bring a list or pull a territory, and get back records built on Overture Maps under CDLA Permissive 2.0, each flagged resale-safe with the attribution already attached. It's one of the few tools where the licence posture is part of the output rather than your problem. Pay-per-resolved-place, so you pay for records you can actually keep.

Approach | Source licence | Legal to resell | Attribution handled | Who owns licence tracking
--- | --- | --- | --- | ---
Keep scraping | Platform TOS (restrictive) | No | N/A | N/A
In-house on open data | CDLA / Apache | Yes, if done right | You build it | You
Commercial provider | Per contract | Yes | Per contract | Vendor
Resolve vs open ground truth | CDLA Permissive 2.0 | Yes | Attached per record | The tool

Licensing and features are based on publicly available information as of June 2026 and may change. This is a plain-language summary, not legal advice.

Each approach has trade-offs in cost, control, and how much licence-compliance work you carry. The right choice depends on whether you have a data team, how much you'll redistribute, and whether you want to own provenance tracking or have it handled.

Best practices for data licence posture

Six things I'd tell anyone whose data is going to leave the building.

  1. Decide the resale question at ingestion, not at delivery. "Can we legally redistribute this" is a question to answer the day the data enters your system, not the day a customer asks where it came from. By delivery it's too late to change the source.
  2. Treat licence as a property of the source, not the format. Cleaning, validating, and reshaping a scrape doesn't change its licence. The posture is inherited from where the data came from and nothing downstream rewrites it.
  3. Keep licences from mixing. If you touch multiple open sources, never silently blend a share-alike licence (like ODbL) into a permissive-licensed output. One ODbL field can drag share-alike obligations onto the whole thing.
  4. Carry attribution with the data, not in a doc. The attribution string should travel on the records, so the credit line ships automatically when the data does. A licence note in a README nobody reads isn't compliance.
  5. Prefer per-record licence proof over a blanket claim. "Our whole dataset is CDLA" is weaker than every row carrying its own license and source. Provenance you can point at per record is what survives a legal review.
  6. Match the source to what you're allowed to sell. If the use case lives on live reviews and today's hours, no open licence carries those, and a scrape can't make them saleable. Be honest about which fields you can legally redistribute.
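Practice 4 is the easiest one to lose during an export, when someone picks a handful of columns and the credit line quietly disappears. A minimal sketch of an export step that refuses to drop it, using Python's standard `csv` module: `LICENCE_COLUMNS` and `export_csv` are hypothetical names, and the field names are taken from the sample record earlier in this post.

```python
import csv
import io

# Licence columns that always ship, whatever business fields the caller picks.
LICENCE_COLUMNS = ["license", "source", "attribution"]


def export_csv(records, fields):
    """Write records to CSV text, always appending the licence columns."""
    columns = [f for f in fields if f not in LICENCE_COLUMNS] + LICENCE_COLUMNS
    buffer = io.StringIO()
    # extrasaction="ignore" drops record keys that were not requested.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        absent = [c for c in LICENCE_COLUMNS if not record.get(c)]
        if absent:
            raise ValueError(f"{record.get('name')!r} lacks {absent}")
        writer.writerow(record)
    return buffer.getvalue()


rows = [
    {
        "name": "Domino's Pizza",
        "category": "pizza_restaurant",
        "license": "CDLA-Permissive-2.0",
        "source": "Overture Maps",
        "attribution": "© Overture Maps Foundation",
    }
]
print(export_csv(rows, ["name"]))
```

The caller asked only for `name`, and the output still carries the three licence columns. A row with no attribution stops the export instead of shipping without a credit line.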

Common mistakes with data licensing

Five mistakes I see constantly, each with a real cost.

  • Assuming "public" means "resellable." It doesn't. Public visibility and redistribution rights are unrelated. A platform can show you data publicly and still contractually forbid you from reselling it, and Google's terms do exactly that.
  • Cleaning a scrape and calling it yours. Deduping and validating a scrape changes its quality, not its licence. The polished output inherits the same restrictive terms as the raw rows. Effort doesn't launder licence posture.
  • Mixing CDLA and ODbL data without noticing. Pulling places (CDLA) and buildings (ODbL) from the same open source and merging them into one output silently imports share-alike obligations. Now your "permissive" dataset has a copyleft string attached.
  • Skipping attribution. CDLA's one obligation is attribution, and it's easy to drop when you reshape the data. Stripping the credit line is the one way to actually breach an otherwise generous licence.
  • Confusing "not a crime" with "safe to sell." A court saying scraping isn't a CFAA violation gets read as "scraping is legal, so I can sell the data." The ruling is about access, not resale. Different bar entirely.

A concrete before/after

A small data vendor I talked this through with was selling local-business lists built from a Google Maps scraper. The before state: a clean, well-formatted product, a handful of paying customers, and a licence posture that was, when you actually looked at it, "scraped Google content resold as our own." One enterprise prospect's procurement team asked for data provenance, and the deal stalled the moment the honest answer surfaced. The product was good. The foundation wasn't sellable.

The change was reframing the question from "how do we clean the scrape" to "where does the data legally come from." They stopped treating their scrape as the product and started treating their list as input to resolution against licensed open ground truth. After: the same lists came back as records built on Overture under CDLA Permissive 2.0, each carrying its source, licence, and attribution. The provenance question that had killed the enterprise deal now had a one-line answer the buyer's legal team accepted. The product looked nearly identical to the customer; the difference was entirely in the licence underneath. That was their context and their numbers; your results will vary with list quality and how much redistribution you actually do.

Implementation checklist

The sequence for moving from a scrape you can't sell to a dataset you can.

  1. Audit your current sources. For every dataset you redistribute or plan to, write down where the data actually came from and what licence (or terms) it carries. Most teams have never done this and are surprised.
  2. Flag anything platform-scraped. Google Maps, Yelp, and similar scraped content is the liability. Mark it as not-resale-safe until proven otherwise.
  3. Classify the use case. Internal-only and never redistributed? You may be fine. Sold, shipped, or delivered? You need a resale-permissive source.
  4. Inventory your inputs. A CRM table, store locations, a scrape you already paid for: anything with a name and a location is a valid input to resolution.
  5. Resolve against licensed open ground truth. Run the list through a resolution tool like the Business Data Enricher Apify actor to get records built on Overture under CDLA Permissive 2.0, flagged resale-safe with attribution.
  6. Carry the licence proof downstream. Keep the license, source, and attribution fields on every record so the resale posture travels with the data into your product or deliverable.
  7. Document provenance once, point at it forever. With per-record licence proof in the data, the "where did this come from" question has a permanent, checkable answer.
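Steps 1 to 3 of the checklist amount to a small classification rule, which can be sketched as follows. The source-type labels, the `Dataset` class, and the `audit` function are all hypothetical; the rule encodes this post's framing that anything redistributed needs a resale-permissive source, and that an unrecognised source is treated as not resale-safe until proven otherwise.

```python
from dataclasses import dataclass

# Source types from this post; the labels themselves are hypothetical.
RESALE_PERMISSIVE = {"licensed_open", "commercial_contract"}


@dataclass
class Dataset:
    name: str
    source_type: str     # e.g. "platform_scrape", "licensed_open"
    redistributed: bool  # sold, shipped, or delivered outside the organisation


def audit(dataset):
    """Classify one dataset following checklist steps 1 to 3."""
    if not dataset.redistributed:
        return "internal only: may be fine"
    if dataset.source_type in RESALE_PERMISSIVE:
        return "resale-permissive source: keep licence proof on every record"
    # Scrapes and unrecognised sources both land here.
    return "not resale-safe: resolve against a licensed source first"


inventory = [
    Dataset("territory lookup", "platform_scrape", redistributed=False),
    Dataset("customer list product", "platform_scrape", redistributed=True),
    Dataset("resolved places", "licensed_open", redistributed=True),
]
for d in inventory:
    print(f"{d.name}: {audit(d)}")
```

This is a triage aid for the audit, not a legal determination. The commercial-contract case in particular is only as permissive as the contract actually says.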

Limitations

Honest constraints, because licensed open data isn't a free lunch and a scrape isn't worthless.

  • Open data carries fewer live fields. Overture Maps refreshes monthly and carries no reviews, ratings, live hours, or popular times. If your saleable product depends on those, no open licence provides them, and a scrape can't legally sell them either. That's a hard constraint, not a tooling gap.
  • CDLA still has an obligation. "Permissive" isn't "no rules." You must preserve attribution and licence notices on redistribution. It's a light obligation, but dropping it is the one way to breach the licence.
  • Coverage varies by region. Open place data is strong in well-mapped areas and thinner in some regions. Dense urban markets carry richer records than sparse rural ones.
  • This isn't legal advice. I publish actors that produce resale-safe records; I'm not a lawyer. For a specific commercial product, get the licence posture reviewed by counsel. This post is the map, not the legal sign-off.
  • Licence separation has to be enforced. Resale-safe requires keeping CDLA places data away from share-alike themes. If a tool or a build silently mixes them, the resale posture is compromised regardless of intent.

Key facts about resale-safe business data

  • A resale-safe business dataset is built on data licensed for redistribution, most commonly Overture Maps places under CDLA Permissive 2.0.
  • Scraped Google Maps content is bound by the Google Maps Platform Terms, which restrict caching, redistribution, and competing-dataset use.
  • "Public" visibility does not grant resale rights; the licence the data carries decides what you can do with it.
  • CDLA Permissive 2.0 permits commercial use and resale with one obligation: preserving attribution and licence notices.
  • CDLA Permissive 2.0 has no share-alike clause, so you can ship a closed product built on the data without releasing your own work.
  • Overture's places theme is CDLA Permissive 2.0, but its buildings and transportation themes are ODbL (share-alike) and must not be mixed into resale-safe output.
  • "Scraping isn't a CFAA crime" rulings concern access, not the separate question of resale rights.
  • Per-record licence proof (source, licence ID, attribution) makes data provenance checkable instead of a guess.

Glossary

  • CDLA Permissive 2.0: a Community Data License Agreement licence that permits commercial use, modification, and redistribution of data with attribution and no share-alike.
  • ODbL: the Open Database License, a share-alike data licence that can require derived databases to be shared under the same terms.
  • Overture Maps: an open geospatial data project whose places theme provides business records under CDLA Permissive 2.0.
  • Attribution: the credit line a permissive licence requires you to preserve when redistributing the data.
  • Provenance: a record of where data came from and under what licence, ideally carried on each row.
  • Resale-safe: a property of data whose source licence permits redistribution and commercial resale.

Where these patterns apply beyond business data

The licence-posture distinction isn't really about places. It's a general truth about any external data you intend to redistribute, and it applies far beyond local business records.

  • Licence travels with the source. Whatever data you build on, its redistribution rights are inherited from where it came from, not from how you reshape it. True for text, images, code, and structured data alike.
  • "Public" is not "permissive." Across every domain, visible-on-the-web and licensed-for-reuse are unrelated properties that get conflated constantly.
  • Provenance belongs in the data. Carrying source and licence on the record, not in a separate document, is the pattern that survives a legal review in any field.
  • Don't mix incompatible licences. Blending share-alike and permissive sources silently imports the stricter obligation, whether you're combining datasets, code libraries, or media.
  • Decide resale posture upstream. "Can we legally redistribute this" is an ingestion-time question for any external data, not a delivery-time surprise.

When you need this

You probably need a resale-safe dataset (not just a clean scrape) if:

  • You're building a product on top of business data.
  • You sell lists, datasets, or reports to customers.
  • You deliver data as a client deliverable or consulting output.
  • You redistribute place data outside your organisation in any form.
  • A procurement or legal team will eventually ask for data provenance.

You probably don't need this if:

  • The data is genuinely internal and never leaves your organisation.
  • You're doing a one-off lookup you'll discard.
  • Your use case lives entirely on live reviews and today's hours, which no open licence covers anyway.

Frequently asked questions

Can I legally resell scraped Google Maps data?

Generally no. Scraping publicly visible data isn't automatically a crime (courts have repeatedly declined to treat it as one), but reselling the result is a separate question. The Google Maps Platform Terms restrict caching, redistribution, and building competing datasets, and those terms bind you contractually. Internal use is one thing; selling or shipping scraped Google content as your own dataset runs into those restrictions.

What does CDLA Permissive 2.0 allow me to do?

CDLA Permissive 2.0 lets you use, modify, and redistribute the data, including commercially and for resale, with one obligation: preserve the attribution and licence notices when you pass the data on. There's no share-alike, no requirement to open-source your product, and no royalty. It's the data equivalent of a permissive software licence like MIT or Apache.

Does "public data" mean I can resell it?

No, and this is the most common misconception in the space. Public visibility and redistribution rights are unrelated. A platform can display data publicly while contractually forbidding you from reselling it, which is exactly what Google's terms do. What you can legally do with data is decided by its licence or the terms you accessed it under, not by whether it was visible on a web page.

Why is Overture Maps data resale-safe when scraped data isn't?

Because of the licence, not the data quality. Overture Maps publishes its places theme under CDLA Permissive 2.0, a licence written specifically to allow redistribution and commercial product-building. Scraped Google content is copied under terms written to prevent it. Same kind of business records, opposite legal postures, and the licence is what decides whether you can sell the output.

Do I have to open-source my product if I use CDLA data?

No. CDLA Permissive 2.0 is the permissive variant, which has no share-alike or copyleft clause. You can build a closed, commercial product on CDLA-licensed data and never release your own code or derived data. The one obligation is preserving attribution when you redistribute the underlying data. This is different from ODbL, a share-alike licence that can require derived databases to be shared.

How do I get business records I can actually resell?

You build them on a redistribution-permissive source. The most direct route, if you have a list or a territory, is to resolve it against licensed open ground truth. The Business Data Enricher Apify actor returns records built on Overture under CDLA Permissive 2.0, each flagged resale-safe with attribution attached, on a pay-per-resolved-place basis. For the full picture on why a scrape isn't the foundation, see the pillar on why Google Maps scraping isn't a data strategy.

Ryan Clinton publishes Apify actors and MCP servers as ryanclinton and builds developer tools at ApifyForge.


Last updated: June 2026

This guide focuses on local business data and Apify, but the same licence-posture patterns apply broadly to any external data you intend to resell, ship, or redistribute.

Related actors mentioned in this article

Business Data Enricher

Returns business records built on Overture Maps under CDLA Permissive 2.0, each flagged resale-safe with attribution, so the output is legally yours to reuse

Google Maps Lead Enricher

Companion enrichment when a resolved, resale-safe cohort needs contact-level data for outreach
