Circle of Wizards

JobSpec

Turn any messy job posting into clean, structured JSON — from a URL, raw HTML, or plain text.

JobSpec is a job-posting parser. Point it at a job ad and it returns a normalized record: title, company, location, salary range, employment type, experience level, responsibilities, requirements, skills, benefits, and more. It works whether you have a live job URL, a saved HTML page, or a block of pasted text, so you can stop writing brittle per-site scrapers and HTML-regex hacks and just get fields you can store in a database.

Endpoints

Endpoint        What it returns
POST /extract   A normalized JobRecord parsed from a single posting. Send a JSON body with one of url, html, or text. Returns: title, company (name, url, industry, size), location (city, state, country, remote), employment_type, experience_level, salary (min, max, currency, period), description, responsibilities[], requirements[], nice_to_have[], skills[], benefits[], posted_date, apply_url, source_url, job_id, and cached.
GET /health     Simple liveness check. Returns {"status": "ok"}.
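
A minimal sketch of calling POST /extract from Python with only the standard library. The host name below is a placeholder, since this page does not state the real one. The X-RapidAPI-Key and X-RapidAPI-Host headers follow RapidAPI's usual convention and are assumed to apply here; copy the actual values from the RapidAPI console. The function names are illustrative, not part of JobSpec.

```python
import json
import urllib.request

# Assumption: placeholder host; the real value is shown in the RapidAPI console.
API_HOST = "jobspec.example.p.rapidapi.com"
BASE_URL = f"https://{API_HOST}"


def build_extract_request(api_key: str, **source: str) -> urllib.request.Request:
    """Build (but do not send) a POST /extract request.

    `source` should hold one of url=..., html=..., or text=...
    """
    body = json.dumps(source).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/extract",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": API_HOST,
        },
    )


def extract(api_key: str, **source: str) -> dict:
    """Send the request and return the parsed JobRecord as a dict."""
    req = build_extract_request(api_key, **source)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Typical use would be something like record = extract("YOUR_KEY", text="...pasted job ad..."), after which record["title"], record["salary"], and the other fields listed above can be read from the returned dict.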

Why this API

  • Three input modes, one schema. Pass a live url (JobSpec fetches and cleans the page for you), raw html you already have, or plain text you pasted — the output shape is identical every time.
  • One consistent record across every job board. Greenhouse, Lever, company career pages, PDFs you've turned into text: they all collapse into the same predictable JSON schema instead of a different DOM per site.
  • Salary parsing built in. Ranges like "$180k–$230k" come back as numeric salary.min and salary.max values plus salary.currency and salary.period, so you can filter and sort on them.
  • Normalized enums and skill tags. employment_type (full_time, part_time, contract, internship, temporary, volunteer) and experience_level (entry, mid, senior, lead, executive) are normalized, and skills[] are short tags like Python, AWS, React rather than full sentences.
  • Cached by default. Identical requests are served from a 24-hour cache, so repeat lookups are fast and don't re-run extraction.
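
Because salary.min and salary.max are numbers, filtering and sorting can be done directly on the parsed records. The sketch below assumes each record is a dict shaped like the /extract response. The helper name is made up for illustration, and the currency and period values you compare against (for example "USD" and "year") are assumptions, since this page does not list the exact strings JobSpec returns.

```python
def filter_by_min_salary(records, at_least, currency, period):
    """Keep records whose salary.min meets the floor, highest-paying first.

    Records with no salary, no salary.min, or a different currency/period
    are skipped rather than compared, since the amounts are not comparable.
    """
    kept = []
    for rec in records:
        salary = rec.get("salary") or {}  # salary may be missing or null
        low = salary.get("min")
        if low is None:
            continue
        if salary.get("currency") != currency or salary.get("period") != period:
            continue
        if low >= at_least:
            kept.append(rec)
    return sorted(kept, key=lambda r: r["salary"]["min"], reverse=True)
```

Matching on currency and period before comparing amounts keeps an hourly rate or a non-USD figure from being ranked against annual USD salaries.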

Typical use cases

  • Build a job board or aggregator by ingesting postings from many sources into one uniform schema.
  • Power an ATS or recruiting tool that needs structured requirements, skills, and salary out of free-form ads.
  • Normalize salary data across thousands of postings for market/compensation analysis.
  • Feed a job-matching engine with clean skills[] and requirements[] arrays instead of raw HTML.
  • Clean up scraped or pasted listings without maintaining a parser for every site layout.

Good to know

  • You must send exactly one input. Provide url, html, or text in the JSON body. An empty request returns 422. If you send a url, JobSpec fetches it server-side, strips scripts/styles/images, and parses the visible text.
  • Extraction is LLM-based. Job records are produced by a language model from the posting content, not from an official structured feed. It is accurate on standard postings but, like any model, can occasionally miss or misread fields — treat output as high-quality structured guesses, not a system of record.
  • Most fields are nullable. Any scalar the posting doesn't mention comes back as null; list fields (responsibilities, requirements, nice_to_have, skills, benefits) come back as []. Always code defensively.
  • Results are cached for ~24 hours. A repeated identical request returns the same record with cached: true. Vary your input to force a fresh parse.
  • URL fetching has limits. JobSpec follows up to 5 redirects, uses a 15-second timeout, and reads up to ~500 KB of HTML. Pages that require login or heavy JavaScript rendering, or that sit behind aggressive bot protection, may fail to fetch; in those cases pass the html or text yourself.
  • One posting per call. /extract parses a single job at a time; there is no bulk/search endpoint. (This is a structured-extraction API — it does not search or list jobs for you.)
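
Two of the points above, sending exactly one input and coding defensively around null fields, can be sketched in Python as follows. Both helper names are hypothetical, and the code assumes location.remote is a boolean (the page names the field but not its type).

```python
from typing import Optional


def build_payload(url: Optional[str] = None, html: Optional[str] = None,
                  text: Optional[str] = None) -> dict:
    """Return a JSON body holding exactly one of url, html, or text."""
    provided = {k: v for k, v in (("url", url), ("html", html), ("text", text)) if v}
    if len(provided) != 1:
        # Catch the mistake locally instead of spending a request on it.
        raise ValueError("send exactly one of url, html, or text")
    return provided


def summarize(record: dict) -> str:
    """Format a one-line summary without assuming any field is present."""
    title = record.get("title") or "(untitled)"
    company = (record.get("company") or {}).get("name") or "(unknown company)"
    location = record.get("location") or {}
    # Assumption: location.remote is a boolean flag.
    place = "remote" if location.get("remote") else (location.get("city") or "(location n/a)")
    skills = ", ".join(record.get("skills") or []) or "none listed"
    return f"{title} at {company} [{place}] skills: {skills}"
```

Using record.get(...) or a fallback at every level means a null scalar, a null nested object, or an empty list all take the same safe path.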