"We improved job outcomes."
A statement. No unit, nothing to count, nothing a reviewer can act on.
SMART metrics test whether a number is defensible. This page covers the five-test framework, six design principles, SMART indicators, worked examples, and an FAQ for program teams.
Sopact reads every survey, record, and document on arrival and traces each metric back to the row of data that defends it. SMART is five separate tests — Specific, Measurable, Achievable, Relevant, Time-bound — and a metric that passes four of them is the one that breaks in front of the board. This page is for the program teams, funders, and impact funds that have to defend a number, not only report one.
By Unmesh Sheth · Founder & CEO, Sopact · Updated May 25, 2026
SMART metrics are measurements that pass five tests at once: Specific, Measurable, Achievable, Relevant, and Time-bound. George Doran introduced the acronym in 1981 to write goals that survive the next review. The meaning that matters in practice: a SMART metric is a number a team can defend — traceable back to a source row of data, and forward to a decision.
A statement. No unit, nothing to count, nothing a reviewer can act on.
A number — but over what window? Out of how many? Compared to what?
Specific, measurable, achievable, relevant, time-bound. A reviewer can act on this.
SMART is taught as one acronym, but it is really five separate filters. Each letter catches a different kind of metric error. Drop one, and a different failure walks straight through.
Specific: Names what is counted, who it covers, and which condition must hold.
Measurable: States the unit and the data source. The number has to come from somewhere.
Achievable: Anchors the target to a baseline. Not a wish, not a stretch with no floor.
Relevant: Ties the metric to the outcome the program is meant to produce.
Time-bound: Names the window. Without a window, the count never reports.
Pathway adapted from Doran, G. T. (1981), "There's a S.M.A.R.T. way to write management's goals and objectives," Management Review 70(11). The failure-mode layer is added by Sopact.
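The five filters can be sketched as a presence check on a draft metric definition. This is a minimal illustration, not Sopact's implementation: the `MetricDraft` class, its field names, and the `failed_tests` function are all hypothetical, and the check only confirms that each part has been written down, not that it is well chosen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricDraft:
    """A draft metric definition. All field names are hypothetical."""
    population: Optional[str] = None    # Specific: who is counted, under which condition
    unit: Optional[str] = None          # Measurable: what gets counted
    source: Optional[str] = None        # Measurable: system or instrument producing the count
    baseline: Optional[float] = None    # Achievable: prior number the target is anchored to
    target: Optional[float] = None      # Achievable: the number being aimed for
    outcome_step: Optional[str] = None  # Relevant: outcome in the program theory
    window: Optional[str] = None        # Time-bound: measurement window


def failed_tests(m: MetricDraft) -> list:
    """Return the names of the SMART tests the draft does not yet pass."""
    checks = {
        "Specific": bool(m.population),
        "Measurable": bool(m.unit) and bool(m.source),
        "Achievable": m.baseline is not None and m.target is not None,
        "Relevant": bool(m.outcome_step),
        "Time-bound": bool(m.window),
    }
    return [name for name, passed in checks.items() if not passed]


# A draft that names everything except the window.
draft = MetricDraft(
    population="cohort graduates placed in a training-matched role",
    unit="percent of cohort",
    source="placement tracker",
    baseline=63.0,
    target=80.0,
    outcome_step="employment in a matched role",
)
print(failed_tests(draft))  # ['Time-bound']
```

Dropping any one field lets a different failure through, which is the point of treating the letters as separate filters.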
Vocabulary varies by community. KPI consultants say SMART performance indicators and SMART performance measures. Monitoring and evaluation teams say SMART measure, SMART measures, or SMART measurement. Data analysts use SMART as a definitional pre-flight check before analysis. The five tests underneath do not change: a named population, a unit and source, a baseline, a link to the outcome, and a window.
Metrics may or may not pass the five SMART tests. Most do not — that is the gap this page closes.
A key performance indicator is a metric promoted to the small set a team reviews. SMART criteria for KPIs apply the same five tests at the indicator level.
SMART goals were the original 1981 application. The criteria moved to metrics and KPIs as the framework spread.
An indicator is chosen because it stands in for something harder to observe. SMART indicators apply the five tests with extra weight on the Relevant test.
The five-letter test is the filter. These are the design choices a team makes while drafting a metric so it passes the filter on the first try — not on the third rewrite during board-prep week.
Metrics fail the Specific test when they describe what the program does instead of who it changes. "Workshops delivered" is an activity; "graduates placed in matched roles" is a population outcome.
The population framing forces a denominator — what makes a metric comparable across cohorts.
Targets set before units are how vanity numbers happen. Decide what unit the metric counts in and which system produces the count, then set the target.
The unit constrains the target to something defensible, not aspirational.
A target without a baseline is a wish. A baseline without a target is a report. The pattern that works names the prior number, the new target, and the gain expected.
Baseline-anchored targets are the only kind a board can challenge constructively.
Relevance is the test most often skipped, because it requires knowing what the program is meant to produce. A metric off the theory of change survives reporting but informs no decision.
Relevance is what separates SMART metrics from busywork metrics.
"Improve placements" runs forever and reports never. "Placements within six months of completion, measured each quarter" is countable, finishable, and comparable.
Without a window, the metric is permanently in progress and lands no decision.
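A window such as "within six months of completion" only becomes countable once its edges are defined. The sketch below is one possible reading, with assumptions stated plainly: the window is counted in calendar months, the end date is inclusive, a placement dated before completion does not count, and the function names are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift d forward by whole calendar months, clamping the day to the
    end of the target month (for example Aug 31 + 6 months -> Feb 28 or 29)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def placed_within_window(completed: date, placed: Optional[date], months: int = 6) -> bool:
    """True if the placement falls on or after completion and no later than
    `months` calendar months after it. A missing date counts as not placed."""
    if placed is None:
        return False
    return completed <= placed <= add_months(completed, months)


completed = date(2025, 3, 14)
print(placed_within_window(completed, date(2025, 9, 14)))  # True (last day of the window)
print(placed_within_window(completed, date(2025, 9, 15)))  # False (one day late)
print(placed_within_window(completed, None))               # False (no placement recorded)
```

Whichever edge rules a team picks, writing them down once is what makes quarter-to-quarter counts comparable.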
Programs collect dozens of data points. Only a handful become SMART metrics; the rest are diagnostic context. Making every point pass the five tests is how dashboards become unreadable.
A short list of defensible metrics outperforms a long list of suggestive ones every quarter.
The first decision in metric design controls all the others. Most teams spend their reporting effort fighting the downstream consequences of an upstream choice they did not realize they were making.
| The choice | The broken way | The working way | What it decides |
|---|---|---|---|
| Picking the metric subject | Program: "Workshops delivered." Counts activity, not change. | Population: "Graduates placed in matched roles." Counts change in the people the program serves. | Whether the metric can describe outcomes or only activity. |
| Defining the unit | Implicit: "Improve engagement." No instrument can produce a number. | Explicit: "Mid-program retention, percent of cohort at session 6 of 8, source: attendance roster." | Whether the metric can be computed at all when the data lands. |
| Setting the target | No baseline: "Hit 80 percent." A round number written in a planning meeting. | Baselined: "80 percent placement, against a prior-cohort baseline of 63 percent. A 17-point gain." | Whether the target is defensible, or reopened at the next meeting. |
| Linking metric to theory | Countable: "Email open rate." An engagement signal, not the outcome the program produces. | From theory: "Wage gain at six months." Maps to the outcome step the program is built around. | Whether the metric can inform a decision or only fill a slide. |
| Setting the time window | No window: "Increase placements." Runs forever, never reports a final number. | Windowed: "Placements within six months of completion, measured each quarter." Start, end, comparison cycle. | Whether the metric ever finishes a reporting cycle. |
| Sizing the metric set | Forty KPIs: Every data point promoted to KPI. None defended deeply. | Five to seven: A short SMART set; the rest kept as diagnostic context, each defended one row deep. | Whether the team can defend any metric when a reviewer pushes. |
The first choice, the subject, controls every choice after it. A program-subject metric forces the unit toward activity counts, which drives targets toward throughput, which breaks the Relevant test. Get the subject right and the other five choices are within reach. Get it wrong and no amount of rewriting fixes the metric.
An indicator is a metric chosen because it stands in for something harder to observe directly. You cannot measure "career readiness" with one number, so you pick an indicator that tracks it. SMART indicators apply the same five tests — but the Relevant test does most of the work, because a wrong indicator passes Specific, Measurable, Achievable, and Time-bound cleanly and still measures the wrong thing.
"Graduates who completed all eight training sessions, by cohort." A SMART output indicator: it names the population, the unit, and the source, and stands in for "the program was delivered as planned."
"Share of graduates in a training-matched role six months after completion." It stands in for "the training changed employability." The Relevant test is what confirms it is the right stand-in.
"Median wage gain against a prior-cohort baseline, twelve months out." It stands in for "the program changed economic standing." Slow to move — so the window is long and named.
For indicators, the test that breaks first is Relevant. An indicator that passes Specific, Measurable, Achievable, and Time-bound but stands in for the wrong outcome is the most expensive kind of well-formed metric — it reports cleanly every quarter and points to no real decision. Name what each indicator stands in for, in writing, before it goes on the dashboard.
A workforce training program runs an eight-week cohort of about sixty graduates. The board wants one number that answers whether the program is working. The team has a draft. It does not survive a single test.
"We started the year reporting 'improved job outcomes' because that is what the funder asked for. The board asked what we meant by improved. We said placements were up. They asked compared to what. They asked how we counted a placement, and whether self-reports and verified hires were the same number, and what window we were measuring. We did not have answers. So we rewrote the metric until every question had a one-line answer."
47 of 60 graduates placed in a training-matched role within six months of program completion, against a prior-cohort baseline of 38 of 60. Specific population, named unit, a source, a baseline, a window. Every board question now has a one-line answer.
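The arithmetic behind that metric is short enough to write out. A minimal sketch using the two counts stated above; the function name `rate_pct` is hypothetical, and both cohorts are taken to have sixty graduates as the text gives.

```python
def rate_pct(count: int, cohort_size: int) -> float:
    """Share of a cohort, in percent. The denominator must be a real cohort."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100.0 * count / cohort_size


current = rate_pct(47, 60)    # this cohort: 47 of 60 placed
baseline = rate_pct(38, 60)   # prior cohort: 38 of 60 placed
gain = current - baseline     # gain in percentage points

print(f"{current:.1f}% vs {baseline:.1f}% baseline, {gain:+.1f} points")
# 78.3% vs 63.3% baseline, +15.0 points
```

Reporting the gain in percentage points, next to both raw counts, keeps the denominator visible when a reviewer asks "out of how many?".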
The five tests pass because the metric definition lives next to the data that defends it: the named field, the tracked instrument, the linked baseline, the theory tag. Sopact reads each on arrival. With a forms tool and a spreadsheet, the person writing the report has to rebuild that linkage by hand every quarter.
The five-letter test is the same everywhere. What changes by program shape is which letter fails first — and what a working metric set returns once it is fixed.
Cohort cycles with four collection touchpoints. The placement tracker lives in a spreadsheet, the surveys in a forms tool, and the IDs do not match — so quarterly reporting becomes a manual matching exercise before a single metric is computed.
Anchor every measurement to a Persistent Contact ID at intake. Each SMART metric traces to one source row.
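What a shared ID buys can be shown with two small tables. This is an illustrative sketch only: the records are made up, the field names (`contact_id`, `role_matched`, `row`) are hypothetical, and it is not a description of how Sopact stores data.

```python
# Intake records define the cohort (the denominator). Sample data, invented.
intake = [
    {"contact_id": "C001", "cohort": "2025A"},
    {"contact_id": "C002", "cohort": "2025A"},
    {"contact_id": "C003", "cohort": "2025A"},
]

# Placement tracker rows, keyed by the same ID assigned at intake.
placements = [
    {"contact_id": "C001", "role_matched": True, "row": 2},
    {"contact_id": "C003", "role_matched": False, "row": 3},
    {"contact_id": "C009", "role_matched": True, "row": 4},  # no intake record
]


def placement_metric(intake_rows, placement_rows):
    """Count matched placements among cohort members and keep the tracker
    rows that defend the count, plus any IDs that fail to match intake."""
    cohort_ids = {r["contact_id"] for r in intake_rows}
    counted = [p for p in placement_rows
               if p["contact_id"] in cohort_ids and p["role_matched"]]
    unmatched = [p["contact_id"] for p in placement_rows
                 if p["contact_id"] not in cohort_ids]
    return {
        "numerator": len(counted),
        "denominator": len(cohort_ids),
        "source_rows": [p["row"] for p in counted],
        "unmatched_ids": unmatched,
    }


print(placement_metric(intake, placements))
# {'numerator': 1, 'denominator': 3, 'source_rows': [2], 'unmatched_ids': ['C009']}
```

The `unmatched_ids` list is the manual matching exercise made visible: with one ID issued at intake it stays empty, and every counted placement points to a tracker row.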
Multi-year cohorts, long horizons. The program collects dozens of countable signals — attendance, GPA, recommendation counts — that pass Specific, Measurable, Achievable, and Time-bound but do not map to the long-term outcomes.
Grade every candidate metric against the program theory before promoting it. A short SMART set of five to seven.
20 to 50 portfolio companies, each reporting different operating metrics. A portfolio-level "jobs created" rollup mixes apples and oranges unless the fund standardizes the definition at the time of investment.
A definition contract signed at investment closing, referencing IRIS+ codes or a fund metric dictionary. The Specific test is passed once, at closing, instead of being reargued in every reporting cycle.
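One way to picture the definition contract is a rollup that only sums values reported under the agreed definition and sets the rest aside. The sketch below is an assumption-laden illustration: the company reports are invented, and the definition identifiers are hypothetical placeholders, not real IRIS+ codes.

```python
# Hypothetical identifier for the definition agreed at closing.
AGREED_DEFINITION = "fund-dict:jobs-created-v1"

# Invented sample reports from three portfolio companies.
reports = [
    {"company": "A", "definition": "fund-dict:jobs-created-v1", "value": 12},
    {"company": "B", "definition": "fund-dict:jobs-created-v1", "value": 30},
    {"company": "C", "definition": "internal:headcount-incl-contractors", "value": 45},
]


def rollup(company_reports, agreed_definition):
    """Sum values reported under the agreed definition and list the
    companies whose numbers use a different one."""
    total = 0
    excluded = []
    for r in company_reports:
        if r["definition"] == agreed_definition:
            total += r["value"]
        else:
            excluded.append(r["company"])
    return total, excluded


total, excluded = rollup(reports, AGREED_DEFINITION)
print(total, excluded)  # 42 ['C']
```

Excluding a mismatched report, instead of silently adding it, is a design choice in this sketch; a fund could equally flag it for conversion to the agreed definition.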
The five SMART tests are only as strong as the data structure underneath them. Two of the five — Measurable and Achievable — depend on capabilities a forms tool does not have.
Forms tools render questions, capture responses, and export to CSV. None of them carry a stakeholder identity across instruments, link to program records, or surface a baseline at the moment a metric is designed. The Measurable test needs a unit and a source; the Achievable test needs a baseline and a target. A forms tool plus a spreadsheet supplies all four only by matching records by hand, every reporting cycle, by the same person writing the report.
Sopact reads each response on arrival and holds the parts that defend a metric in one place. A Persistent Contact ID carries across every instrument. Program records and responses live in one workspace. The metric definition is tagged to the program theory at design time. The five SMART tests pass because what defends them is held together structurally, not stitched together procedurally each quarter.
Collecting the answer was never the hard part. Defending the metric the answer feeds is — and that needs a unit, a source, a baseline, and a theory tag that stay attached to the number from the day the data lands to the day it reaches the board.
SMART metrics are program or business measurements that pass five tests at once: Specific, Measurable, Achievable, Relevant, and Time-bound. The acronym originated in a 1981 management paper by George Doran and now applies across goal setting, performance indicators, and impact measurement. A metric that passes all five can be defended back to a source row of data and forward to a decision the team needs to make.
In the context of setting metrics, SMART stands for Specific, Measurable, Achievable, Relevant, and Time-bound. Each letter is a separate test. Specific catches vague wording. Measurable catches numbers no instrument can produce. Achievable catches targets without baselines. Relevant catches metrics off the program theory. Time-bound catches measurements that never report.
SMART metrics are the metrics a team can defend — in a board meeting, a funder review, or a planning session three quarters from now. They name what is being counted, where the number comes from, what change is realistic against a baseline, why the metric matters to the program, and which window the count covers. Most metrics fail at least one of these tests.
The SMART framework is a five-test checklist for goals, objectives, and performance indicators — one of the most cited frameworks in management literature, used in KPI design, OKR coaching, and monitoring and evaluation guidance. The framework does not generate a metric on its own. It is a filter applied to a draft metric to find which letter the metric fails.
A measurable metric names a unit and a source. The unit is what gets counted: people, dollars, days, sessions, placements. The source is the system or instrument the count comes from: an enrollment record, an exit survey, a payroll report. If a metric cannot be traced to both a unit and a source, the M in SMART is not satisfied and the number cannot be defended.
A metric reports a number. A SMART metric defends that number against five questions: what exactly is being counted, where the count comes from, whether the target is realistic given a baseline, whether the metric matches the program theory, and over what window the count applies. Most reporting failures happen because a metric was published before the five tests were applied.
A non-SMART metric: "We improved job outcomes." A SMART version: "Eighty percent of cohort graduates report a job placement matched to their training within six months of completion, against a prior-cohort baseline of sixty-three percent." The second version names the unit, the source, the baseline, the relevance, and the window. A reviewer can act on it. The first one starts a meeting about what was meant.
A SMART indicator is a metric chosen to stand in for a harder-to-observe outcome, written to pass all five tests. Because an indicator is a proxy, the Relevant test does the most work — an indicator can pass Specific, Measurable, Achievable, and Time-bound and still stand in for the wrong outcome. A SMART indicator names, in writing, what it stands in for, so the proxy can be checked.
SMART criteria for performance indicators apply the same five tests to KPIs as to goals. A SMART performance indicator names the population it covers, the data system it pulls from, a baseline against a target, the link to a strategic outcome, and the reporting window. Indicators that name only a direction — increase, improve, grow — fail the Specific and Measurable tests and produce reports the team cannot act on.
An actionable metric is one that, once it lands in front of a decision-maker, points to a next step. SMART is the structural test; actionable is the consequence. A metric that is specific, measurable, baseline-anchored, relevant to the program theory, and tied to a window almost always produces an action when the number moves. Vanity metrics point to no action because they fail at least one of the five tests, most often Relevant.
In monitoring and evaluation, SMART metrics are how output, outcome, and impact indicators get written so they can be reported quarterly without arguments. A logframe row that names a SMART indicator avoids the most common M&E failure: a quarterly review where the team disagrees about what the indicator was meant to measure in the first place.
SMART is used in data analysis as a pre-flight check on the metric definition before any computation runs. Analysts apply the five tests to confirm the metric maps to a column in a data system, has a defensible filter for the population, names a baseline window and a comparison window, and ties to a question the work actually needs answered. SMART does not replace statistical methods — it catches definitional errors that statistics cannot fix later.
Sopact reads every survey response and program record under one Persistent Contact ID, so any SMART metric can be defended back to its source row. The Specific test gets a named field. The Measurable test gets a tracked instrument and unit. The Achievable test gets a baseline pulled from prior-cohort data. The Relevant test gets a tie to the program theory captured at design time. The Time-bound test gets a collection window the system holds.
Forms tools collect responses well. They do not enforce a stakeholder identity across forms, link to program records, or surface a baseline at the moment a metric is designed. Teams using Google Forms or SurveyMonkey to build SMART metrics typically end up matching exports by hand in a spreadsheet — which is where the M and the A in SMART quietly fail. The collection part is fine; the defensibility part needs a system that holds the parts together.
Sixty minutes with someone who builds these for a living. Bring one metric your program reports on today — the vaguer it is, the better the example. We walk it through Specific, Measurable, Achievable, Relevant, and Time-bound, and name where each test would live in your data. No slideware, no demo accounts — your data, read live.