## Notes from 10 February 2026
### State Lessons and Federal Fears
_Why the At-Will debate needs better evidence_
While civil service reform can be dismissed as the ['plumbing' of government](https://geoffmulgan.substack.com/p/in-praise-of-plumbing) - essential but invisible - the push for [Schedule Policy/Career](https://www.opm.gov/policy-data-oversight/hiring-information/hiring-authorities/schedule-policycareer/opm-answers-to-frequently-asked-schedule-policycareer-questions.pdf) in the US has transformed it into a site of [ideological conflict](https://www.govexec.com/workforce/2026/02/trump-admin-moves-finalize-return-schedule-f/411239/). The move to reclassify tens of thousands of federal positions represents more than a policy shift; it is a fundamental re-examination of whether a permanent, protected bureaucracy is an asset to democracy or an obstacle to it.
With the Trump administration moving forward on implementation ([affecting an estimated 50,000 federal positions](https://www.linkedin.com/posts/ronaldpsanders_trump-admin-moves-to-finalize-return-of-schedule-activity-7425659221204955137-iyQw/)), good-government groups, unions, and [scholars](https://donmoynihan.substack.com/p/trumps-schedule-f-rule-finalized) are sounding alarms about politicization, retaliation, and institutional damage. Given this administration's track record on civil service issues (a mix of chaotic execution and what often looks like intentional [dismantling of functioning](https://theconversation.com/how-18f-transformed-government-technology-and-why-its-elimination-matters-251333) systems [for partisan gain](https://lucabellodi.com/material/DOGE_Bellodi_Lee.pdf)), [the anxiety isn't exactly unfounded](https://www.civilservicestrong.org/update/doge-what-was-the-point).
Into this charged atmosphere, the Partnership for Public Service recently [released a report](https://ourpublicservice.org/publications/at-will-employment-what-the-federal-government-can-learn-from-states/) analyzing at-will employment practices in US state governments. I have enormous respect for PPS's work generally, and this report compiles genuinely useful empirical material. To be clear, I think the risk mechanisms they highlight - political pressure, chilling effects, and workforce instability - are plausible and important to take seriously. But I want to focus on something specific: the evidentiary standards we're applying to state-level evidence, and whether this report meets them.
Full disclosure: I find myself more persuaded by arguments that at-will arrangements, properly designed, can enhance managerial flexibility without catastrophic consequences. But my concern here is about how we handle mixed evidence in a domain where everyone admits the data are thin.
**The core problem: Evidentiary asymmetry**
Here's what troubles me most: the report appears to treat similar forms of evidence differently depending on whether claims point toward costs or benefits.
When state data fail to show clear improvements from at-will reforms, the absence of clear positive effects is presented as a central takeaway - and, at times, seems to carry more inferential weight than the limits of the evidence would justify. When the same types of data - administrator surveys, perception measures - suggest potential harms, these are treated as credible warnings worth acting on. But in domains with thin evidence, "no proof of benefit" and "proof of harm" are not equivalent statements. The literature PPS itself cites describes results as mixed and difficult to interpret, meaning we lack clear signals in either direction.
One possible approach to mixed evidence is to apply symmetrical skepticism to both upside and downside claims. Instead, PPS front-loads the analysis with Trump-era examples of arbitrary dismissals, then moves to state evidence already primed to read absence of benefit as confirmation of danger. Given the policy stakes, the report understandably emphasizes risks - though that emphasis can blur the line between synthesis and argument.
And look, advocacy has its place. But it weakens credibility when it presents itself as an evidence review, especially when the report shares a similar title and analytical structure with a [Manhattan Institute report](https://manhattan.institute/article/radical-civil-service-reform-is-not-radical-lessons-for-the-federal-government-from-the-states) making the opposite argument yet does not engage that work directly. Addressing the leading competing synthesis would, in my view, make the report more persuasive, particularly for readers coming to the question undecided.
**What "At-Will" actually means**
One analytical distinction that could be made more explicit is the one between [merit at entry and protection at exit](https://substack.com/home/post/p-183981697).
Some readers may take the report's framing to imply a closer equivalence between at-will status and patronage risk than the state variation warrants. You can have competitive, merit-based hiring and simplified removal procedures. You can have political appointments at entry with robust due process at exit. Many state positions classified as "at-will" are still filled through competitive examinations. Schedule Policy/Career, as described, would change removal protections while maintaining competitive hiring - but the report's emphasis is primarily on removal protections and politicization risk, rather than on clarifying how entry rules and merit-based hiring would be treated under the federal proposal.
This matters. If your concern is politicization - [partisan loyalty tests](https://fedscoop.com/federal-agency-jobs-trump-loyalty-question-opm-lawsuit/), expertise drain, regulatory capture - then you need to specify which institutional safeguard does the most work: competitive entry systems? Anti-discrimination enforcement? External oversight of dismissal procedures? Internal appeals processes? The report documents tremendous heterogeneity in how states combine these elements but then collapses that complexity back into a generic "at-will = bad" frame.
**The measurement problem and survey dependence**
What, exactly, are we measuring as "performance" here? The report relies heavily on surveys of HR professionals asking what they think happened after reforms. These are perceptions of change, perceptions of morale, perceptions of politicization risk.
Perceptions matter - but they are not performance data. If the theory is that at-will status improves managerial effectiveness while critics worry it degrades institutional capacity, we should be looking at actual institutional capacity: regulatory compliance quality, [service delivery](https://www.axios.com/local/salt-lake-city/2025/05/06/utah-best-state-us-world-news-report-2025) [outcomes](https://www.commonwealthfund.org/publications/scorecard/2025/jun/2025-scorecard-state-health-system-performance), procurement efficiency, [innovation metrics](https://business.adobe.com/blog/insights-from-adobes-2025-digital-government-index). Instead we're measuring whether people feel good or bad about changes, then treating those feelings as evidence about whether the government actually got better or worse at its functions.
And here's where recent research complicates the picture considerably. [Stenberg and Trondal's 2026](https://www.sv.uio.no/arena/english/research/publications/arena-working-papers/2026/wp-01-26-final.pdf) working paper analyzed four decades of survey data on bureaucratic autonomy in US states and found that perceived administrative autonomy remains remarkably stable - even in states that underwent radical civil service reforms like Georgia's and Florida's transitions. So PPS uses administrator perceptions to forecast politicization and instability, while this new analysis uses similar perception data across forty years and finds resilience through major reforms. Either perceptions aren't predictive of actual politicization - in which case we should be more cautious about using perception measures alone to forecast large-scale politicization or instability - or there's something about this specific federal reform that would break patterns that have held across decades. The report never engages this tension.
**Confounded reforms and missing variables**
PPS explicitly acknowledges that at-will reforms rarely came alone - they arrived bundled with pay-for-performance systems, decentralization, reclassification, compensation changes. This makes isolating the causal effect of at-will status essentially impossible.
Having made this acknowledgment, the report then proceeds to draw confident conclusions about what at-will status specifically does. You cannot have this both ways: the report rightly notes that bundled reforms complicate attribution, yet some of its conclusions are framed in ways that may be read as attributing effects specifically to at-will status. Either treat "at-will" as a marker for broader packages and downgrade your causal claims, or design comparisons that actually isolate components.
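To make the attribution problem concrete, here is a minimal sketch with invented numbers (nothing below comes from the PPS report or any state dataset). It assumes a hypothetical set of states where at-will status and pay-for-performance were always adopted together, and checks how well two very different stories about what at-will "did" fit the same outcomes.

```python
# Hypothetical illustration: every number is invented, not drawn from any report.
# Each state is (at_will, pay_for_performance, observed_outcome_change).
# The reforms are bundled: no state adopted one without the other.
states = [
    (1, 1, 2.0),
    (1, 1, 2.0),
    (0, 0, 0.0),
    (0, 0, 0.0),
]


def squared_error(at_will_effect, pfp_effect):
    """Total squared error of a simple additive model on the bundled data."""
    return sum(
        (outcome - (at_will_effect * aw + pfp_effect * pfp)) ** 2
        for aw, pfp, outcome in states
    )


# Story A: at-will did all the work and pay reform did nothing.
story_a = squared_error(at_will_effect=2.0, pfp_effect=0.0)
# Story B: at-will was harmful and pay reform more than compensated.
story_b = squared_error(at_will_effect=-1.0, pfp_effect=3.0)

# Identical fit: bundled data cannot choose between the two stories.
print(story_a, story_b)
```

Both stories fit the toy data equally well, which is the point: without at least some states that adopted one component but not the other, the data constrain only the combined effect of the package.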
And here's what both PPS and competing analyses overlook: the formal civil service is only part of how governments get work done. State governments operate through contractors, grantees, temporary arrangements, outsourced services. When managers face rigid hiring or dismissal rules, outsourcing becomes an escape valve. Conversely, in at-will environments, some pressure to contract out for flexibility might be lower.
If this substitution effect exists (and there's good reason to think it does!), then comparing states purely on civil service rules is misleading. You might be comparing a state that solved its flexibility problem internally (through at-will) to a state that solved it externally (through contractors). Observed stability on the payroll could mask instability in contracted workforces, and vice versa. So if we want better comparisons, we need to consider the full labor perimeter and account for how work shifts across that boundary.
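A rough sketch of what "full labor perimeter" accounting could look like, again with invented figures. The `StateWorkforce` class, its fields, and both example states are hypothetical; the only claim is arithmetic: a state can look more stable on payroll alone and less stable once contracted staff are counted.

```python
# Hypothetical illustration: all headcounts and separations are invented.
from dataclasses import dataclass


@dataclass
class StateWorkforce:
    name: str
    payroll_staff: int
    payroll_separations: int
    contract_staff: int
    contract_separations: int

    def payroll_turnover(self) -> float:
        """Annual separations as a share of civil service payroll only."""
        return self.payroll_separations / self.payroll_staff

    def perimeter_turnover(self) -> float:
        """Annual separations as a share of payroll plus contracted staff."""
        total_staff = self.payroll_staff + self.contract_staff
        total_separations = self.payroll_separations + self.contract_separations
        return total_separations / total_staff


# A protected system that leans on contractors for flexibility.
protected = StateWorkforce("protected, heavy contracting", 1000, 50, 1000, 350)
# An at-will system that keeps most work in-house.
at_will = StateWorkforce("at-will, light contracting", 1800, 270, 200, 70)

for state in (protected, at_will):
    print(
        f"{state.name}: payroll {state.payroll_turnover():.0%}, "
        f"full perimeter {state.perimeter_turnover():.0%}"
    )
```

In this toy comparison the protected state has much lower payroll turnover but higher turnover across the full perimeter, so the ranking flips depending on where the boundary is drawn. Whether real states look like this is an empirical question the sketch cannot answer.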
**On tail risks and evidence standards**
PPS deserves some credit for the type of argument they're making: essentially a precautionary claim that even if average effects are neutral, tail risks ([retaliation against whistleblowers](https://www.reuters.com/legal/government/us-federal-employees-would-lose-whistleblower-safeguards-under-trump-rule-2025-11-18/), [expertise flight](https://www.msn.com/en-us/news/us/exodus-of-staff-adds-to-faa-s-challenges/ar-AA1FHjeE), [regulatory capture](https://www.promarket.org/2025/03/12/for-my-enemies-tariffs-for-my-friends-exemptions/)) justify caution in federal governance.
That's not unreasonable logic. Tail risks can warrant precaution even when mean effects are unclear. But if you're making a tail-risk argument, you need evidence of tails, not means. Show us the mechanisms, the distributions, and the evidence that bad scenarios are likely and concentrated in critical positions. The report cites case evidence and survey-based indicators, but a tail-risk argument would ideally be paired with more direct evidence on rare-but-high-impact events (e.g., documented retaliation patterns, distributions of removals in critical units, or quasi-experimental designs where feasible).
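The difference between evidence of means and evidence of tails can be shown with a toy example. The "harm scores" below are invented for ten hypothetical agencies under two regimes, and the `CRITICAL_THRESHOLD` cutoff is an assumption chosen only for illustration.

```python
# Hypothetical illustration: invented harm scores, not real agency data.
from statistics import mean

regime_a = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]    # uniformly modest harm
regime_b = [0, 0, 0, 0, 0, 0, 0, 0, 0, 50]   # mostly nothing, one severe failure

CRITICAL_THRESHOLD = 30  # assumed cutoff for a "high-impact" event


def tail_share(scores, threshold=CRITICAL_THRESHOLD):
    """Share of agencies whose harm score reaches the critical threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


# The averages are identical, so a comparison of means sees no difference.
print(mean(regime_a), mean(regime_b))
# The tails are not: only regime_b contains a high-impact event.
print(tail_share(regime_a), tail_share(regime_b))
```

A study that reports only average effects would treat these two regimes as equivalent. A precautionary argument needs the second kind of measure, and needs it from actual records of removals, retaliation complaints, or departures in critical units rather than from perception surveys.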
**Where this leaves us**
Look, I'm not arguing Schedule Policy/Career is unambiguously good policy. The Trump administration's broader governance record [[2025-12-25|does not inspire confidence]], and alarm from good-government groups isn't paranoia but pattern recognition based on what this administration has actually done.
However, the PPS report isn't making a political argument about Trump's trustworthiness. It's making an empirical argument about what state evidence shows, and that empirical argument doesn't hold up. The evidence is mixed - at-will hasn't produced managerial revolution, but neither has it generated administrative apocalypse. Both benefits and harms remain suggestive, confounded, difficult to isolate.
A report taking mixed evidence seriously would apply consistent evidentiary standards to both sides, distinguish merit entry from exit protections, grapple honestly with bundled reforms and impossible causal identification, measure actual performance rather than feelings, account for outsourcing substitution effects, and engage the strongest opposing evidence directly.
The PPS report does valuable work compiling state data and documenting heterogeneity. The risk-focused perspective is legitimate. But it reads as a precautionary policy brief rather than a neutral evidence review, applying asymmetric standards that treat absence of proof as proof of absence on one side while accepting thin evidence as credible warning on the other.
That doesn't make their concerns about Schedule Policy/Career wrong; the Trump administration's track record gives ample reason for worry. But it does mean we can't rely on this report to settle the empirical questions it claims to address. The policy debate will continue, as it should. What we need - and what this report doesn't provide - is analysis that matches the seriousness of the stakes with evidentiary rigor, rather than filling uncertainty with confident prediction shaped more by priors than by what state experience actually demonstrates.