Sampling Methods Used in Compliance Verification
Sampling methods are the structured techniques verifiers use to select a subset of records, transactions, physical items, or observations from a larger population when examining compliance with regulatory requirements, standards, or contractual obligations. Because exhaustive review of every data point is rarely practical or economically feasible, sampling frames the evidence-gathering phase of compliance verification and directly shapes whether findings are defensible. The choice of method determines statistical confidence, coverage of high-risk areas, and the extent to which conclusions can be extrapolated across the full population under review.
Definition and scope
In compliance verification, sampling is the process of selecting a defined subset from a population of interest — records, employees, products, transactions, or facilities — to draw conclusions about the whole population's conformance status. The population is bounded by the verification scope and boundary-setting established before fieldwork begins.
The American Institute of Certified Public Accountants (AICPA) and the International Auditing and Assurance Standards Board (IAASB) both distinguish between statistical sampling, where every unit in the population has a known, nonzero chance of selection, and non-statistical sampling, where professional judgment governs selection without probability mathematics. The ISO 19011:2018 guidelines for auditing management systems further recognize that sampling plans must be documented to allow reproducibility and peer review.
Regulatory framing for sampling appears explicitly in federal practice: the U.S. Food and Drug Administration's Quality System Regulation (21 CFR Part 820) requires device manufacturers' sampling plans to rest on a valid statistical rationale, while the Environmental Protection Agency's Compliance Monitoring Strategy directs inspectors to use risk-based sampling prioritization. The evidence standards in compliance verification that govern these processes require verifiers to demonstrate that samples are sufficient in size and representativeness to support the assurance level claimed.
How it works
Sampling in compliance verification proceeds through five discrete phases:
- Population definition — The verifier identifies the complete set of items subject to the requirement (e.g., all purchase orders above a threshold dollar amount within the audit period, or all production batches shipped during a calendar quarter).
- Risk stratification — High-risk strata (large-value transactions, geographically remote sites, newly onboarded suppliers) are separated from low-risk strata to allow differential sampling intensity. This step connects directly to materiality in compliance verification, because immaterial low-risk items may receive lighter sampling.
- Method selection — The verifier selects a probabilistic or non-probabilistic technique based on available population data, timeline, and required assurance level (see classification below).
- Sample size determination — For statistical methods, size is calculated using confidence level (typically 90%, 95%, or 99%), expected error rate, and tolerable deviation rate. For non-statistical methods, professional standards bodies such as the AICPA publish guidance tables that link judgment-based sizes to engagement risk.
- Evaluation and projection — Errors found in the sample are evaluated and, under statistical sampling, projected to the full population to estimate total error or deviation rate.
The compliance verification process steps framework places sampling within the evidence-collection phase, after planning and before reporting.
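The size calculation in the fourth phase can be sketched for the simplest case, attribute sampling with zero expected deviations. Under that assumption, the smallest sample size n satisfies (1 - tolerable_rate)^n <= 1 - confidence, i.e. a population deviating at the tolerable rate would be unlikely to yield a clean sample. This is an illustrative sketch, not a substitute for the guidance tables mentioned above:

```python
import math

def attribute_sample_size(confidence: float, tolerable_rate: float) -> int:
    """Smallest n such that a population deviating at exactly the
    tolerable rate would produce zero deviations in the sample with
    probability no greater than (1 - confidence)."""
    risk = 1.0 - confidence  # acceptable chance of a clean sample despite deviations
    return math.ceil(math.log(risk) / math.log(1.0 - tolerable_rate))

# 95% confidence with a 5% tolerable deviation rate
print(attribute_sample_size(0.95, 0.05))  # 59
```

Raising the confidence level to 99% pushes the size to 90, reflecting that tighter assurance demands more evidence; expecting one or more deviations in the sample requires the larger sizes tabulated in professional guidance.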
Common scenarios
Financial compliance reviews — Internal Revenue Service examination procedures use statistical random sampling to project underreported income across large transaction populations. A sample of 60 to 100 invoices may be drawn from tens of thousands of transactions, with findings scaled proportionally.
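The proportional scaling described here is a ratio projection: the error found in the sample is scaled by the ratio of population value to sampled value. A minimal sketch, with hypothetical dollar figures:

```python
def projected_error(sample_error: float, sample_value: float,
                    population_value: float) -> float:
    """Ratio projection: scale the misstatement found in the sample
    by the ratio of total population value to sampled value."""
    return sample_error * (population_value / sample_value)

# Hypothetical: $4,200 of misstatement found in $150,000 of sampled
# invoices, drawn from a $9,000,000 invoice population
print(projected_error(4200, 150_000, 9_000_000))  # 252000.0
```

The projection is only defensible when the sample was drawn randomly; judgmentally selected items cannot be scaled this way.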
Environmental monitoring — EPA inspectors conducting Clean Air Act stack-testing verification often apply systematic sampling (every nth reading from a continuous monitor log) rather than random sampling, because the population is sequential time-series data where spatial patterns matter.
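The every-nth selection can be sketched as follows, assuming a population of sequential monitor readings; the random start keeps the fixed interval from aligning with any known cycle in the data:

```python
import random

def systematic_sample(records, n):
    """Fixed-interval systematic sampling: pick a random start, then
    take every k-th record, where k = len(records) // n."""
    k = len(records) // n  # sampling interval
    start = random.randrange(k)
    return records[start::k][:n]

readings = list(range(1, 10_001))  # e.g. 10,000 sequential monitor readings
sample = systematic_sample(readings, 25)
print(len(sample))  # 25
```

This sketch assumes the population is at least n records long; a periodic pattern in the log whose cycle matches the interval k would still bias the sample, which is why verifiers check for known periodicity first.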
Healthcare claims audits — The Centers for Medicare & Medicaid Services (CMS) Recovery Audit Contractors program uses stratified random sampling that separates complex claims from simple claims before drawing separate random samples from each stratum, reducing variance and improving error-rate precision. This work falls within the broader domain of healthcare compliance verification.
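The two-stage approach (stratify first, then draw an independent simple random sample from each stratum) can be sketched as below; the claim identifiers and stratum sizes are hypothetical:

```python
import random

def stratified_sample(strata: dict, sizes: dict):
    """Draw a separate simple random sample (without replacement)
    from each stratum."""
    return {name: random.sample(items, sizes[name])
            for name, items in strata.items()}

# Hypothetical claim population split by complexity
strata = {"complex": [f"C{i}" for i in range(500)],
          "simple":  [f"S{i}" for i in range(4500)]}
picked = stratified_sample(strata, {"complex": 30, "simple": 30})
print(len(picked["complex"]), len(picked["simple"]))  # 30 30
```

Note that the complex stratum is deliberately oversampled relative to its share of the population; weighting the per-stratum results back together is what delivers the variance reduction.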
Supply chain and product verification — ISO 2859-1 (Sampling procedures for inspection by attributes) governs acceptance sampling for manufactured goods. A defined Acceptable Quality Level (AQL) — for example, AQL 1.0 for critical defects — dictates sample sizes and rejection thresholds for incoming inspection lots.
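One way to reason about such a plan is its operating-characteristic (OC) value: the probability that a lot with a given true defect rate passes the sample. The sketch below uses an assumed single sampling plan of n = 80 with acceptance number Ac = 2; a real plan's n and Ac come from the ISO 2859-1 tables indexed by lot size, inspection level, and AQL:

```python
from math import comb

def accept_probability(n: int, ac: int, p: float) -> float:
    """OC value: probability that a lot with true defect rate p
    yields at most `ac` defective items in a sample of n
    (binomial approximation to the hypergeometric)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(ac + 1))

# Assumed plan n=80, Ac=2: a lot running at 1% defective
# is accepted about 95% of the time
print(round(accept_probability(80, 2, 0.01), 3))  # 0.953
```

Plotting this function across defect rates gives the plan's OC curve, which is how the producer's and consumer's risks of a given plan are evaluated.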
Workplace safety — OSHA's enforcement targeting programs apply probability-proportional-to-size (PPS) sampling when selecting establishments for programmed inspections, weighting heavier toward sites with higher injury and illness rates.
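PPS selection can be sketched with weighted random choice. The site names and rates below are hypothetical, and this simple version samples with replacement, whereas production PPS designs (such as systematic PPS) typically sample without replacement:

```python
import random

def pps_sample(sites: dict, n: int):
    """Probability-proportional-to-size selection: each site's chance
    of being drawn is weighted by its size measure (here, a
    hypothetical injury/illness rate). With replacement."""
    names = list(sites)
    weights = [sites[name] for name in names]
    return random.choices(names, weights=weights, k=n)

# Hypothetical establishments weighted by injury/illness rate
rates = {"plant_a": 8.2, "plant_b": 1.1, "plant_c": 4.7}
print(pps_sample(rates, 2))
```

Here plant_a is roughly seven times as likely as plant_b to appear in any single draw, mirroring the heavier weighting toward high-rate sites.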
Decision boundaries
The primary distinction in sampling classification is statistical versus non-statistical:
| Dimension | Statistical Sampling | Non-Statistical (Judgmental) Sampling |
|---|---|---|
| Selection mechanism | Random (simple, systematic, or stratified) | Professional judgment or heuristic |
| Quantified confidence | Yes — expressed as a percentage | No — cannot be formally quantified |
| Error projection | Mathematically defensible | Qualitative only |
| Regulatory acceptance | Required by some frameworks (FDA 21 CFR Part 820) | Accepted where risk is low or population is small |
| Documentation burden | Higher — requires documented probability model | Lower — requires rationale narrative |
Within statistical sampling, stratified random sampling outperforms simple random sampling when the population contains identifiable high-risk segments, because it ensures proportional representation from each stratum and reduces overall variance. Simple random sampling is appropriate when the population is homogeneous and no prior risk intelligence exists.
Judgmental sampling — selecting items known to be complex, high-value, or previously flagged — is appropriate for limited versus reasonable assurance verification engagements where the objective is detecting likely errors rather than estimating population-wide error rates. It cannot, however, support extrapolated conclusions.
The threshold between statistical and non-statistical selection is also governed by documentation requirements for compliance verification: any sampling plan used to support a formal verification opinion must be retained in working papers with sufficient detail to allow an independent reviewer to reproduce the selection logic.