126: How to Scrape Hard-to-Find ICPs
Our 3-step process for finding companies and contacts that aren't on LinkedIn
Welcome back to The Practical Prospecting newsletter!
Today, I’m sharing how we find companies, contacts, emails, and phone numbers in industries that aren’t on LinkedIn. These are usually blue-collar industries.
Many markets don’t have clean, ready-made lists of companies and decision-makers. If your ICP is hard to find, you need to be more creative than just exporting from LinkedIn or ZoomInfo.
Here’s the 3-step process we use to pull the most out of any hard-to-find ICP:
Side note: These are my favorite industries to prospect because when the data is hard to find, it means fewer competitors are reaching out to them. If you sell to these industries, I’d love to work with you.
Agenda
Step 1: Find Every Company Website
Step 2: Confirm ICP Fit
Step 3: Find & Enrich Contacts

Step 1: Find Every Company Website
When we work with a new client, our goal is always to get a complete picture of their total addressable market (TAM).
We want to have one source of truth that contains every single company we could want to prospect.
To do this, we start by combining multiple data sources to cast the widest net:
First, we want to scrape everything we can find on LinkedIn (even if there isn’t much).
Here are the tools we use for that:
Sales Navigator → Exporting company lists using the Prospeo Chrome extension
Clay’s “Find Companies Search” feature → This data is pulled directly from LinkedIn (and it doesn’t cost any Clay credits)
Next, we use ZoomInfo. It’s one of the few databases that doesn’t source all of its data from LinkedIn, so we usually find new companies there.
Finally, we find the bulk of our list from these two data sources:
Google Maps searches with industry-specific keywords (via Apify)
Industry directories & associations (manufacturer/distributor lists, partner directories, certification databases, etc.)
The goal: build a single master list of websites in our TAM.
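To make the "single master list" idea concrete, here is a minimal sketch of merging exports from several sources and deduplicating them by website domain. The source names and sample URLs are made up for illustration, and the normalization rules (lowercase, drop "www.", ignore paths) are an assumption about what counts as the same company, not a description of how Clay or any other tool does it.

```python
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    url = url.strip().lower()
    if not url:
        return ""
    # urlparse only fills in the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host


def build_master_list(sources: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each unique domain to the list of sources it appeared in."""
    master: dict[str, list[str]] = {}
    for source_name, urls in sources.items():
        for url in urls:
            domain = normalize_domain(url)
            if not domain:
                continue  # skip blank rows from messy exports
            seen_in = master.setdefault(domain, [])
            if source_name not in seen_in:
                seen_in.append(source_name)
    return master


# Hypothetical exports from three of the sources mentioned above.
exports = {
    "linkedin": ["https://www.Acme-Roofing.com/about"],
    "google_maps": ["acme-roofing.com", "http://bobsplumbing.com/"],
    "zoominfo": ["", "WWW.bobsplumbing.com"],
}
master = build_master_list(exports)
for domain, found_in in master.items():
    print(domain, found_in)
```

Keeping track of which sources each domain came from also shows how much each source adds beyond LinkedIn.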
Step 2: Confirm ICP Fit
Now that we have a giant spreadsheet of companies, we need to confirm that they’re actually an ICP fit.
As you know, industry filters aren’t always accurate. For example, a company listed under the “Software Development” industry on LinkedIn isn’t necessarily a SaaS tool.
That’s why we validate every single company for ICP fit using Clay.
Here’s how we do it (watch the video above for a live example):
Upload all company websites to Clay.com
Using our OpenAI API key, we have ChatGPT check every website for ICP fit
Note: If you use your own API key, it costs roughly $1 for every 5k companies you check, so the cost is negligible.
Before running the whole list, we human-verify a sample list to catch edge cases that the AI missed.
We tweak the ICP fit prompt until it’s at least 90% accurate on that sample.
Then we run the prompt on the entire list, and remove the bad-fit companies.
The result: a refined list of true, qualified companies in your TAM.
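The sample check and the cost note above can be sketched in a few lines. This is a rough illustration, assuming you have exported the human verdicts and the AI verdicts for the sample as simple domain-to-True/False mappings. The function names and the example data are hypothetical, and the cost function just applies the roughly $1 per 5k companies figure from the note.

```python
import random


def draw_sample(domains: list[str], size: int, seed: int = 0) -> list[str]:
    """Pick a random subset of domains for human review."""
    rng = random.Random(seed)
    return rng.sample(domains, min(size, len(domains)))


def accuracy(human: dict[str, bool], model: dict[str, bool]) -> float:
    """Share of reviewed domains where the AI verdict matches the human one."""
    shared = [d for d in human if d in model]
    if not shared:
        raise ValueError("no overlapping domains to compare")
    correct = sum(human[d] == model[d] for d in shared)
    return correct / len(shared)


def disagreements(human: dict[str, bool], model: dict[str, bool]) -> list[str]:
    """Domains to inspect when tweaking the prompt."""
    return sorted(d for d in human if d in model and human[d] != model[d])


def estimated_cost_usd(n_companies: int, usd_per_5k: float = 1.0) -> float:
    """Rough API cost, using the ~$1 per 5k companies figure as an assumption."""
    return n_companies / 5000 * usd_per_5k


# Hypothetical verdicts for a four-company sample (True = ICP fit).
human_labels = {"a.com": True, "b.com": False, "c.com": True, "d.com": True}
ai_labels = {"a.com": True, "b.com": False, "c.com": False, "d.com": True}

score = accuracy(human_labels, ai_labels)
print(f"accuracy: {score:.0%}")
print("review these:", disagreements(human_labels, ai_labels))
print("ready to run full list:", score >= 0.90)
print("est. cost for 20k companies: $", estimated_cost_usd(20_000))
```

The disagreement list is the useful part: those are the edge cases to read through before rewording the prompt and re-running the sample.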
Step 3: Find & Enrich Contacts
Now it’s time to find decision-makers at these companies, along with their emails and phone numbers.
First, we use Clay’s “Find People” search feature, which uses a company domain to find contacts that work at those companies on LinkedIn.
It’s way easier to find emails when you have a LinkedIn profile. But for these hard-to-find industries, we usually only find about 20% of the contacts with this method.
The next step is to scrape the site. Here’s the prompt we use.
Again, we use an OpenAI prompt in Clay to extract the full names of any decision-makers listed on the website (usually on the “About” or “Contact Us” pages), as well as any emails on the website (blue-collar industries tend to have emails on the site).
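For the email half of that extraction, a plain pattern match over the page text can serve as a cheap cross-check on the AI output. To be clear, this is a swapped-in technique for illustration and not the OpenAI prompt described above. The pattern is a common simplified one, so it will miss obfuscated addresses and can pick up lookalikes such as image filenames.

```python
import re

# Simplified email pattern: good enough for visible addresses on a page,
# but it can also match strings like "logo@2x.png", so review the output.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text: str) -> list[str]:
    """Return unique lowercase emails in the order they first appear."""
    seen: set[str] = set()
    emails: list[str] = []
    for match in EMAIL_RE.findall(page_text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            emails.append(email)
    return emails


# Hypothetical text scraped from a "Contact Us" page.
sample_page = (
    "Call us! Email Info@AcmeRoofing.com or sales@acmeroofing.com. "
    "Owner: info@acmeroofing.com"
)
print(extract_emails(sample_page))
```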
If we find full names, we run those through LeadMagic, which uses a Full Name + Company Website to find contact info.
If we can’t find contact info using these methods, we fall back on two options:
Capture company phone numbers and hand them off to SDRs for cold calling.
Use Contact Us forms when no direct email can be found (can be automated now with Clay’s “Navigator” feature).
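Step 3 is essentially a waterfall: try the best channel first and drop to the next one only when it comes up empty. Here is a minimal sketch of that decision logic. The `Company` fields and the action labels are hypothetical names for illustration, and the exact ordering (website emails before name lookups, phone before contact form) is an assumption, since the steps above don't rank them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Company:
    """What we know about one company after the searches in Step 3."""
    domain: str
    linkedin_contacts: list[str] = field(default_factory=list)
    website_emails: list[str] = field(default_factory=list)
    website_names: list[str] = field(default_factory=list)
    phone: Optional[str] = None
    has_contact_form: bool = False


def next_action(company: Company) -> str:
    """Pick the first outreach channel that has data, in waterfall order."""
    if company.linkedin_contacts:
        return "enrich_from_linkedin"
    if company.website_emails:
        return "use_website_email"
    if company.website_names:
        # Full name + company website goes to an enrichment tool.
        return "lookup_by_name_and_domain"
    if company.phone:
        return "cold_call"
    if company.has_contact_form:
        return "submit_contact_form"
    return "no_channel_found"


# Hypothetical companies at different stages of the waterfall.
companies = [
    Company("acme-roofing.com", linkedin_contacts=["Pat Owner"]),
    Company("bobsplumbing.com", website_names=["Bob Smith"]),
    Company("citypaving.com", phone="555-0100"),
    Company("deltahvac.com", has_contact_form=True),
]
for c in companies:
    print(c.domain, "->", next_action(c))
```

Writing the order down as one function makes it easy to count how many companies land in each bucket, which tells you how much SDR calling time the list will need.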
Final Note: Why This Works
Most teams stop at LinkedIn and miss a huge chunk of their TAM. By layering multiple company sources, validating against ICP criteria, and being creative with contact discovery, you can pull in a much deeper, more accurate list of accounts and contacts.
This framework works for any niche market where contacts are tough to find, whether you’re going after emerging industries, fragmented local markets, or specialized verticals.
Thanks for reading,
Jed