filename: .cursor/skills/collect-leads/SKILL.md
branch: main
---
name: collect-leads
description: >-
  Collect job leads with Playwright MCP from LinkedIn, Indeed, Twitter, Greenhouse,
  Lever, Workday, or generic pages. Creates leads via scripts/new_lead.py.
  Use for job search, scraping boards, /loop lead collection, or adding a posting by URL.
---

# Collect leads

Browse job boards, extract listings, and save each one via **`scripts/new_lead.py`**. Never hand-create lead files or CSV rows.

## Before browsing

1. Read **`profile/target-thesis.md`** — what to prioritize
2. Read **`goal.md`** — search keywords, avoid list, batch limits
3. Pick a source playbook from the table below
4. Optionally scan `data/leads.csv` for existing URLs (script rejects dupes anyway)
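The optional pre-scan in step 4 could look roughly like this. It is a minimal sketch, not part of the repo: the `url` column name and the helper name `existing_urls` are assumptions, so check the real header of `data/leads.csv` first. The script's own dupe check remains the source of truth.

```python
import csv
from pathlib import Path


def existing_urls(csv_path="data/leads.csv", url_column="url"):
    """Return the set of URLs already present in the leads CSV.

    ASSUMPTION: the CSV has a header row with a column named `url`.
    """
    path = Path(csv_path)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Skip rows where the URL cell is missing or empty.
        return {row[url_column].strip() for row in reader if row.get(url_column)}
```

Typical use while browsing: `if candidate_url in existing_urls(): skip it`.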

## Workflow

1. Browse/search source.
2. Extract title, company, URL, source, location, employment type.
3. Run **`scripts/new_lead.py`**:

```bash
.venv/bin/python scripts/new_lead.py \
  --title "Software Engineer Intern" \
  --company "Acme Corp" \
  --url "https://..." \
  --source "linkedin" \
  --location "Remote" \
  --employment-type "internship"
```

Add `--remote` if the role is fully remote. The script assigns the next ID (e.g. `L0002`), creates the company if needed, and syncs the CSV.

4. Open the created `leads/L0002-*.md` and fill **body** sections: Summary, Why it fits, Concerns, Next action.
5. If company looks interesting, add notes to the company file and score with **`profile/company-fit-rubric.md`**.
6. If a relevant person is found, run **`scripts/new_contact.py`** and link in lead frontmatter `contacts: [C0003]`.
7. Apply the **`job-eval`** mindset when setting `match` / `status` / `priority`.
8. Run `sync_indexes.py`, `validate.py`, and `report.py` before stopping.
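Steps 3 and 8 can be sketched as a small Python wrapper, should shelling out by hand get repetitive. The flags mirror the example command above. The helper names are hypothetical, and the `scripts/` location of `sync_indexes.py`, `validate.py`, and `report.py` is an assumption (only `new_lead.py` is shown with that path). How `new_lead.py` signals a rejected duplicate is not specified here, so the caller inspects the result itself.

```python
import subprocess

PYTHON = ".venv/bin/python"  # interpreter path taken from the example command


def new_lead_cmd(title, company, url, source, location, employment_type, remote=False):
    """Build the argv list for scripts/new_lead.py using the documented flags."""
    cmd = [
        PYTHON, "scripts/new_lead.py",
        "--title", title,
        "--company", company,
        "--url", url,
        "--source", source,
        "--location", location,
        "--employment-type", employment_type,
    ]
    if remote:
        cmd.append("--remote")
    return cmd


def closing_cmds(scripts_dir="scripts"):
    """Commands for step 8, in order.

    ASSUMPTION: the three scripts live in the same directory as new_lead.py.
    """
    names = ("sync_indexes.py", "validate.py", "report.py")
    return [[PYTHON, f"{scripts_dir}/{name}"] for name in names]


def run(cmd):
    """Run one command; check=False so the caller can read returncode and output."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

Passing an argv list instead of a shell string avoids quoting problems with titles and company names that contain spaces or quotes.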

## Source playbooks

| Source | Playbook |
|--------|----------|
| LinkedIn | [linkedin.md](linkedin.md) |
| Indeed | [indeed.md](indeed.md) |
| Twitter/X | [twitter.md](twitter.md) |
| Greenhouse | [greenhouse.md](greenhouse.md) |
| Lever | [lever.md](lever.md) |
| Workday | [workday.md](workday.md) |
| Other | [other.md](other.md) |

For company-first discovery, use **`company-scout`** instead of starting from job boards.

## Playwright MCP

- Config: `.cursor/mcp.json`
- Saved login state: `data/browser/storage-state.json`
- Snapshots: `data/browser/snapshots/`
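Before a session it can help to check whether the saved login is still usable. The sketch below assumes `storage-state.json` is in Playwright's storage-state format (a JSON object with a `cookies` list whose entries carry `expires` as Unix seconds, `-1` for session cookies). The helper name is hypothetical.

```python
import json
import time
from pathlib import Path


def login_state_summary(path="data/browser/storage-state.json", now=None):
    """Summarize the saved browser state: does it exist, and how many cookies expired?

    ASSUMPTION: the file follows Playwright's storage-state layout.
    """
    p = Path(path)
    if not p.exists():
        return {"exists": False, "cookies": 0, "expired": 0}
    state = json.loads(p.read_text(encoding="utf-8"))
    now = time.time() if now is None else now
    cookies = state.get("cookies", [])
    # Session cookies use expires == -1 and are never counted as expired here.
    expired = sum(1 for c in cookies if 0 < c.get("expires", -1) < now)
    return {"exists": True, "cookies": len(cookies), "expired": expired}
```

If most cookies are expired, logging in again and re-saving the state is likely needed before scraping sources that require login.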

## Loop mode

On each `/loop` tick:

1. One source (rotate if user asked for multiple)
2. Respect batch size in `goal.md` (typically 5–10 new leads)
3. Report: count added, skipped dupes, top match if any
4. Run `sync_indexes.py` + `validate.py` if any leads added
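One way to picture a tick, as a sketch only: the source names other than `linkedin` are assumed from the playbook file names, the helper names are hypothetical, and the batch limit still comes from `goal.md`.

```python
# ASSUMPTION: source labels match the playbook names; only "linkedin" appears in the example command.
SOURCES = ["linkedin", "indeed", "twitter", "greenhouse", "lever", "workday", "other"]


def source_for_tick(tick, requested=None):
    """Pick one source per tick, rotating through the requested sources."""
    pool = requested or SOURCES
    return pool[tick % len(pool)]


def remaining_in_batch(batch_limit, added):
    """How many more leads this tick may add before hitting the goal.md batch size."""
    return max(0, batch_limit - added)


def tick_report(added, skipped_dupes, top_match=None):
    """Format the per-tick report: count added, skipped dupes, top match if any."""
    parts = [f"added {added}", f"skipped {skipped_dupes} dupes"]
    if top_match:
        parts.append(f"top match: {top_match}")
    return ", ".join(parts)


def should_run_checks(added):
    """sync_indexes.py and validate.py only need to run if leads were added."""
    return added > 0
```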

## User pasted a URL

Skip the board search: open the URL directly, extract the fields, run `new_lead.py`, then apply job-eval to the created file.
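For a pasted URL, the `--source` value and the matching playbook can often be guessed from the hostname. A minimal sketch follows. The host suffixes are assumptions based on the usual public domains of these sites, the source labels other than `linkedin` are assumed from the playbook names, and anything unrecognized falls back to `other`.

```python
from urllib.parse import urlparse

# ASSUMPTION: typical public domains for each source; extend as needed.
HOST_SUFFIXES = [
    ("linkedin.com", "linkedin"),
    ("indeed.com", "indeed"),
    ("twitter.com", "twitter"),
    ("x.com", "twitter"),
    ("greenhouse.io", "greenhouse"),
    ("lever.co", "lever"),
    ("myworkdayjobs.com", "workday"),
]


def source_from_url(url):
    """Map a posting URL to a source label; unknown hosts map to 'other'."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, source in HOST_SUFFIXES:
        # Match the domain itself or any subdomain, but not lookalikes such as netflix.com.
        if host == suffix or host.endswith("." + suffix):
            return source
    return "other"
```

The returned label doubles as the playbook name, e.g. `greenhouse` points at `greenhouse.md`.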

## Do not

- Write `leads/*.md` from scratch with a hand-picked ID
- Append rows to `data/leads.csv` manually
- Auto-apply or scrape private data
- Collect generic B2B SaaS volume when ambitious technical targets are available