# SuiteScrape Agent Onboarding

Use this skill when an agent needs to scrape public web pages, convert URLs into AI-ready markdown/JSON/summary/raw text, ask questions about page content, create URL monitors, export stored jobs, or save scrape results into SuiteScrape memory.

## Service

- Website: https://suitescrape.com
- API base: https://api.suitescrape.com
- OpenAPI contract: https://suitescrape.com/openapi.json
- Docs: https://suitescrape.com/docs
- LLM context: https://suitescrape.com/llms-full.txt

## Authentication

Use an API key from the operator's SuiteScrape account:

```bash
export SUITESCRAPE_API_KEY=ss_your_key
```

Send it as:

```http
Authorization: Bearer ss_your_key
```

Do not print, log, commit, paste, or expose API keys. If no key is available, ask the operator to create one in the SuiteScrape workbench.

## Quick Start

```bash
curl -sS https://api.suitescrape.com/api/agent/scrape \
  -H "Authorization: Bearer $SUITESCRAPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"example.com","question":"Return the key facts as bullets.","format":"markdown"}'
```

## MCP Bridge

For Streamable HTTP MCP clients, connect directly to:

```text
https://api.suitescrape.com/mcp
```

Send the API key as:

```http
Authorization: Bearer ss_your_key
```

For Claude Desktop, Cursor, Codex-compatible MCP clients, and other local MCP hosts that need stdio, run the bridge from the SuiteScrape repo:

```json
{
  "mcpServers": {
    "suitescrape": {
      "command": "node",
      "args": ["/absolute/path/to/suitescrape-backend-fix/mcp/server.mjs"],
      "env": {
        "SUITESCRAPE_API_KEY": "ss_your_key",
        "SUITESCRAPE_API_URL": "https://api.suitescrape.com"
      }
    }
  }
}
```

Both MCP modes expose tools for plans, scraping, asking stored jobs, URL mapping, bounded crawls, job search, job export, and URL monitor creation.

## Common Workflows

### Scrape a page

Call `POST /api/agent/scrape`.

```json
{
  "url": "example.com",
  "format": "markdown"
}
```

Add `question` when you need Claude to answer from the page content.

### Ask about an existing job

Call `POST /api/agent/ask`.

```json
{
  "jobId": "job_id_here",
  "question": "What changed or matters most?"
}
```

### Find stored jobs

Call `GET /api/agent/jobs?status=completed&q=pricing&limit=25`.

Use `tag`, `q`, `status`, `limit`, and `offset` to narrow results.

### Export a result

Call `GET /api/agent/jobs/:id/export?format=markdown`.

Supported formats: `markdown`, `json`, `summary`, `raw`.

### Map a site before scraping

Call `POST /api/agent/map`.

```json
{
  "url": "iana.org/domains/reserved",
  "limit": 25
}
```

### Queue a bounded crawl

Call `POST /api/agent/crawl`.

```json
{
  "url": "iana.org/domains/reserved",
  "maxPages": 5,
  "format": "summary"
}
```

### Monitor a URL

Call `POST /api/agent/monitors`.

```json
{
  "url": "example.com/pricing",
  "name": "Pricing watch",
  "intervalHours": 24,
  "alertKeywords": ["pricing"],
  "minWordDelta": 25
}
```

## Safety Rules

- Only scrape public URLs the operator is allowed to process.
- Do not attempt to access localhost, private networks, metadata IPs, or internal URLs. SuiteScrape blocks these targets.
- Keep crawl sizes small unless the operator explicitly asks for a larger crawl.
- Prefer `map` before `crawl` so the operator can inspect target URLs.
- Use webhooks or polling for queued jobs instead of holding requests open.
- Back off on `429`.
- Treat `401` as missing or invalid API key.
- Treat `403` as a plan limit or permission issue.
- Treat `503` from AI routes as missing AI configuration or temporary AI dependency failure.

## Useful References

- Complete route schemas: https://suitescrape.com/openapi.json
- Developer docs: https://suitescrape.com/docs
- Current prices and limits: https://suitescrape.com/pricing
