Greenhouse Open Jobs & Metadata Health

WezOps review of 87 open roles on the Dataiku Greenhouse board (board token dataiku, snapshot 2025-11-21).

Talent Acquisition Operations

Global Greenhouse Config

Metadata ≈ system reliability

What this page is

A live picture of your open demand – and where the data model quietly works against you.

Everything here is derived from your public Greenhouse job feed only. No stages, no candidates – just the job catalog and its metadata.
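
For orientation, here is a minimal sketch of how one record from the feed could be flattened into a row for analysis. It assumes the payload follows the general shape of Greenhouse's public Job Board API (a `jobs` list whose items carry `title`, `requisition_id`, `location.name`, `departments`, `offices` and a `metadata` list of custom fields). The `flatten_job` helper, the custom-field names used as keys, and the sample record are illustrative assumptions, not data from the snapshot.

```python
from typing import Any


def flatten_job(job: dict[str, Any]) -> dict[str, Any]:
    """Flatten one job record into a flat row (hypothetical helper)."""
    # `metadata` may be null or absent when no custom fields are exposed.
    custom = {m["name"]: m.get("value") for m in (job.get("metadata") or [])}
    return {
        "req_id": job.get("requisition_id"),
        "title": job.get("title"),
        "location": (job.get("location") or {}).get("name"),
        "departments": [d["name"] for d in job.get("departments", [])],
        "offices": [o["name"] for o in job.get("offices", [])],
        # Custom-field names below are assumptions about how the board labels them.
        "remote_flag": custom.get("Remote?"),
        "external_department": custom.get("External-Facing Department"),
    }


# Illustrative payload only; not taken from the real feed.
sample_payload = {
    "jobs": [
        {
            "title": "Account Executive",
            "requisition_id": "EVG-001",
            "location": {"name": "Remote - France"},
            "departments": [{"name": "Sales"}],
            "offices": [{"name": "Paris"}],
            "metadata": [{"name": "Remote?", "value": "Yes"}],
        }
    ]
}

rows = [flatten_job(job) for job in sample_payload["jobs"]]
```

Each row then carries the three remote signals (location string, offices, Remote? flag) side by side, which is what the later consistency checks rely on.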

Open roles: 87

Remote conflicts: 21 (Remote? vs location)

Evergreen / pipeline: about 18% of roles

Goal: make recruiter performance, time-to-fill and funnel analytics trustworthy by treating job metadata like a product.

Snapshot of open demand

Where the heat is, globally

We are only looking at open jobs from the public careers site – no closed reqs, no historical stages. The picture is GTM-heavy, global, and already shows where the metadata model will block clean TA reporting.

Open jobs: 87

Avg age ≈ 122 days. 72% are < 90 days; a handful have been live > 1 year.

Region mix (by location)

EMEA · Americas · APAC

~42% EMEA, 33% Americas, 22% APAC, 2% global / unspecified.

Evergreen / pipeline

Reqs with PIPE / EVG in the ID. Avg age ≈ 402 days.

Metadata health

Strong story, noisy schema

Demand is clear – GTM + AI – but remote flags conflict with location, and "N/A" placeholders exist.

Where you are hiring

By department & region

GTM (Sales, Solution Engineering, Revenue Operations) drives about two-thirds of demand. R&D hiring is selective – primarily senior AI/ML roles.

Department breakdown (top 5)

Chart A

Sales, Solution Engineering, and Revenue Operations together hold 55 of the 87 open roles (63.2%). R&D hiring is selective and focused on senior AI/ML roles.

Sales 30 · 34.5%

Solution Engineering 16 · 18.4%

Revenue Operations 9 · 10.3%

Marketing 7 · 8.0%

Customer Success Engineering 5 · 5.7%

14 distinct GH departments sit behind 87 open roles – just enough that "Other" bucketing matters.
8 departments have ≤2 roles each; they're often grouped under Strategy & Ops or G&A in the External-Facing Department field.

Region breakdown (by location string)

Chart B

The largest share of your open demand is EMEA-based (42.5%), followed by the Americas and a moderate APAC footprint.

EMEA (EU, UK, MEA) 37 · 42.5%

Americas (US, CA, LATAM) 29 · 33.3%

APAC (SEA, ANZ, India) 19 · 21.8%

Global / unspecified 2 · 2.3%

40+ unique location strings in the feed – many multi-city blobs that make filtering tricky.
We'd normalise these into Country → City → Hub for cleaner dashboards and recruiter workloads.

Evergreen & pipeline reqs

Always-open vs. active headcount

Evergreen and pipeline roles (identified by PIPE / EVG in the req ID) are great for talent pooling—but if dashboards mix them with real headcount, every metric drifts.

Evergreen share of the catalog

Chart C

About 18% of live roles are evergreen / pipeline. That's high enough to seriously skew time-to-fill and load-balancing reports.

Average age of an evergreen req is ~402 days, vs ~59 days for standard headcount.
Solution Engineering and Sales account for most of these; two have been open for more than 3.5 years.
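
A naming rule like this is easy to apply programmatically. The sketch below assumes the tokens PIPE or EVG appear as their own segment of the req ID (for example `EVG-123`); the exact ID format, the `is_evergreen` helper and the sample IDs are assumptions for illustration, so the pattern would need adjusting to your real conventions.

```python
import re

# Assumed rule: PIPE or EVG appears as a standalone token, i.e. not embedded
# in a longer alphabetic word.
EVERGREEN_PATTERN = re.compile(r"(?<![A-Za-z])(?:PIPE|EVG)(?![A-Za-z])", re.IGNORECASE)


def is_evergreen(req_id):
    """Return True when the req ID carries an evergreen / pipeline marker."""
    # Missing or empty IDs are treated as standard headcount.
    return bool(req_id) and EVERGREEN_PATTERN.search(req_id) is not None


# Hypothetical req IDs, not taken from the snapshot.
sample_ids = ["EVG-123", "PIPE_SE_EMEA", "R-2024-117", None]
tags = [is_evergreen(req_id) for req_id in sample_ids]
```

Keeping the rule in one function means dashboards and clean-up scripts share the same definition of "evergreen", whichever way the marker ends up being stored in Greenhouse.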

Impact on funnel metrics

Chart D

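If an unsegmented dashboard asks "what is our average time-to-fill?", the answer will be roughly two months longer than reality (judging by open-req age in this snapshot: ~122 days blended vs ~59 days for standard headcount) because it mixes in evergreen roles.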

Avg age – standard headcount ~59 days

Avg age – evergreen / pipeline ~402 days

Avg age – blended (all reqs) ~122 days

Even a handful of multi-year reqs can double your reported average.
Solution: tag evergreen programmatically (GH custom field or a naming rule), exclude from standard dashboards, report separately.
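
The blended figure is just a weighted mean of the two segments, which a few lines of arithmetic can confirm. This is a sketch under one stated assumption: the evergreen count is taken as 16, derived from "about 18%" of 87 roles rather than read from the feed.

```python
def blended_mean(groups):
    """Weighted mean over (count, mean_age_days) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * mean for count, mean in groups) / total


TOTAL_ROLES = 87
evergreen_n = round(0.18 * TOTAL_ROLES)   # assumption: "about 18%" of 87 roles
standard_n = TOTAL_ROLES - evergreen_n

# Segment averages as reported above: ~59 days standard, ~402 days evergreen.
blended = blended_mean([(standard_n, 59), (evergreen_n, 402)])
standard_only = blended_mean([(standard_n, 59)])
```

Under that assumption the blended mean comes out at roughly 122 days, matching the reported figure, while the standard-only mean stays at 59. Roughly one role in five is enough to double the headline number.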

Remote metadata layer

Conflicting signals

Remote work is encoded simultaneously in location strings, offices, and a Remote? custom field. They disagree on 14 to 21 of the 87 open roles, depending on which pair of fields you compare.

Remote vs on-site mix (Remote? field)

Chart E

The intent is clear: a third of the catalog is remote-friendly. The question is whether downstream reports can trust this flag.

On-site (Remote? = No) 56 · 64.4%

Remote (Remote? = Yes) 30 · 34.5%

Missing 1 · 1.1%

Roughly one-third of roles are remote-friendly at the metadata level.
This mix is strategically important (talent pools, salary bands, recruiter coverage).
We'd make Remote? the single source of truth and derive everything else from it.

Remote metadata conflicts

Chart F

When three fields all try to describe "remote vs on-site", they inevitably drift apart. That is exactly what we see in the snapshot.

Remote? vs location – consistent 66 jobs · 75.9%

Remote? vs location – inconsistent 21 jobs · 24.1%

Remote? vs office – consistent 73 jobs · 83.9%

Remote? vs office – inconsistent 14 jobs · 16.1%

~24% of roles have Remote? values that conflict with the location string.
~16% conflict with office settings.
We'd clean the current catalog, then enforce rules so one field drives all remote reporting.
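
One way such a rule could be expressed is sketched below. It assumes the Remote? field holds "Yes" or "No" and that a location string signals remote work by containing the word "remote"; both are assumptions about the data, and the real check behind the counts above may use different rules. The sample jobs are hypothetical.

```python
def location_says_remote(location):
    """Assumed heuristic: the location string mentions 'remote'."""
    return "remote" in (location or "").lower()


def remote_conflict(remote_flag, location):
    """True if Remote? disagrees with the location string.

    Returns None when the flag is missing, so gaps are reported
    separately instead of being counted as conflicts.
    """
    if remote_flag not in ("Yes", "No"):
        return None
    return (remote_flag == "Yes") != location_says_remote(location)


# Hypothetical jobs, not taken from the snapshot.
jobs = [
    {"remote_flag": "Yes", "location": "Remote - France"},
    {"remote_flag": "Yes", "location": "Paris, France"},
    {"remote_flag": "No", "location": "Remote, US"},
    {"remote_flag": None, "location": "Singapore"},
]
results = [remote_conflict(j["remote_flag"], j["location"]) for j in jobs]
conflict_count = sum(1 for r in results if r is True)
```

The same function could run as a validation step when a req is opened, so new conflicts are caught before they reach reporting.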

Location string quality

Chart G

The candidate-facing locations are good; the analytics-facing structure behind them needs one more layer.

Single city/country 57 · 65.5%

Remote only 15 · 17.2%

Multi-location strings 12 · 13.8%

Region labels only 3 · 3.4%

40+ distinct location strings, including multi-country blobs.
We'd treat these as display labels, and derive reporting from structured Country / City / Region fields.
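
As a first step toward structured fields, location strings can be bucketed by shape. The sketch below is illustrative only: the region labels, the multi-location delimiters (`;`, `|`, `/`, lowercase "or") and the sample strings in the usage lines are assumptions, not the rules that produced the counts above.

```python
import re

# Assumed region-level labels and multi-location delimiters.
REGION_LABELS = {"emea", "apac", "americas", "latam", "global"}
# Case-sensitive on purpose so state codes such as "OR" are not read as "or".
SEPARATORS = re.compile(r"\s*(?:;|\||/|\bor\b)\s*")


def classify_location(raw):
    """Bucket a raw location string by its shape."""
    text = (raw or "").strip()
    if not text:
        return "missing"
    parts = [part for part in SEPARATORS.split(text) if part]
    if len(parts) > 1:
        return "multi_location"
    lowered = text.lower()
    if lowered in REGION_LABELS:
        return "region_only"
    if lowered.startswith("remote"):
        return "remote_only"
    return "single_city_country"


# Hypothetical examples of each shape.
buckets = {
    loc: classify_location(loc)
    for loc in ["Paris, France", "London; Paris", "EMEA", "Remote - US"]
}
```

Strings that land in `multi_location` or `region_only` are the ones that would need manual mapping to Country / City / Region before dashboards can filter on them reliably.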

Taxonomy & next steps

From "interesting" to "operational"

With only the public job feed, we can already see enough to propose a concrete, low-friction plan: clean the schema, then wire it into recruiter performance and data-quality monitoring.

Department alignment (GH vs External-Facing)

Taxonomy

Two parallel taxonomies describe the same work: Greenhouse Department and External-Facing Department. They agree most of the time – and disagree just enough to matter.

Examples: Sales Enablement and Enterprise Data & Analytics roles are exposed as Strategy & Operations externally.
At least 1 in 7 roles changes bucket depending on which field a dashboard uses.
We'd define a single, governed mapping and enforce it via templates and validation.
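
A governed mapping can be as simple as one lookup table plus a check that every open role conforms to it. The table below is hypothetical: it reuses the department names mentioned above purely for illustration, and which pairings are actually "correct" is a decision for the TA team, not something the feed can tell us.

```python
# Hypothetical governed mapping: Greenhouse department -> external-facing department.
GOVERNED_MAPPING = {
    "Sales": "Sales",
    "Sales Enablement": "Strategy & Operations",
    "Enterprise Data & Analytics": "Strategy & Operations",
}


def mapping_violations(jobs):
    """List (req_id, reason) pairs for roles that break the governed mapping."""
    violations = []
    for job in jobs:
        expected = GOVERNED_MAPPING.get(job["gh_department"])
        if expected is None:
            violations.append((job["req_id"], "unmapped GH department"))
        elif job["external_department"] != expected:
            violations.append((job["req_id"], f"expected {expected!r}"))
    return violations


# Hypothetical roles, not taken from the snapshot.
sample_jobs = [
    {"req_id": "R-1", "gh_department": "Sales Enablement",
     "external_department": "Strategy & Operations"},
    {"req_id": "R-2", "gh_department": "Sales",
     "external_department": "Strategy & Operations"},
    {"req_id": "R-3", "gh_department": "Legal",
     "external_department": "G&A"},
]
issues = mapping_violations(sample_jobs)
```

Here `R-2` is flagged because its external label disagrees with the table, and `R-3` because its department is not mapped at all. Both are cases where a dashboard would change bucket depending on the field it reads.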

What WezOps would do

Proposal

A practical, two-phase plan: fix the schema, then turn it into a monitoring and performance layer Phil's team can actually use.

Phase 1 – Job & metadata cleanup
- Redesign & lock the department / sub-department / location / remote model.
- Normalize current open roles (remote conflicts, multi-location strings, evergreen flags).
- Introduce a small, opinionated set of data-quality KPIs for the job catalog.

Phase 2 – Recruiter performance & data-quality dashboards
- Once connected to internal Greenhouse data (stages, offers, owners), layer on time-to-fill, funnel conversion, and workload metrics.
- Index every metric by the cleaned metadata, so "Sales, EMEA, remote" means the same thing everywhere.

Outcome: a TA system where dashboards reflect how the team actually works – and where data quality is monitored like a product.
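
To make the Phase 1 KPIs concrete, here is a small sketch that turns the counts reported on this page into percentage indicators. The KPI names and the choice of these four indicators are assumptions for illustration; the counts (21, 14, 1 and 12 out of 87) come from the charts above.

```python
TOTAL_OPEN_ROLES = 87


def rate(count, total=TOTAL_OPEN_ROLES):
    """Share of the catalog affected, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts come from this snapshot; the KPI names are hypothetical.
catalog_kpis = {
    "remote_vs_location_conflict_pct": rate(21),
    "remote_vs_office_conflict_pct": rate(14),
    "remote_flag_missing_pct": rate(1),
    "multi_location_string_pct": rate(12),
}

# KPIs sorted worst-first, the order in which clean-up would likely start.
worst_first = sorted(catalog_kpis, key=catalog_kpis.get, reverse=True)
```

Tracked per snapshot, these figures would show whether the clean-up is holding or whether new reqs are reintroducing the same conflicts.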