Chapter 8: The Organization Blind to Itself
Thesis: A large organization or a state cannot directly observe its own knowledge and activity, which are distributed, partly hidden, and at times strategically concealed. So it reaches for a proxy, the metric it can see, and this is where the Goodhart failure of the proxy move is at its most glaring; the organization then shores the proxy up with auditing (the audit trail) and redundancy.
A Colossus That Cannot See Itself
The principal-agent problem of Chapter 6 now returns at a larger scale. The principal is no longer a single person but an entire organization, a whole state; those entrusted to act are thousands upon thousands of people scattered everywhere. A new, almost absurd situation arises: this colossus cannot see itself clearly.
It wants to know how many people it has, what is being planted, who is doing what, and how well they are doing it; yet none of this knowledge resides anywhere it can read directly. This chapter asks: when the object to be verified is the organization's own knowledge, which is distributed, hidden, and apt to dodge being seen, what does the organization do? The unverifiability here stacks two of the earlier faces one atop the other: partial observability (knowledge dispersed at the edges) and adversarial pressure (the people being watched turn around and manipulate the thing being watched).
Distributed Knowledge
Hayek's 1945 essay "The Use of Knowledge in Society"¹ laid the problem bare: the knowledge on which a society runs is never concentrated in any one place; it is dispersed among countless individuals, it is local knowledge concerning a particular time and a particular place, and often it cannot even be put into words. Which machine has a small fault today, which customer is in fact about to drift away, which side path will collapse after the rain: the holders of such knowledge often do not themselves realize it is "knowledge," let alone find a way to package it up and hand it to the center. Polanyi called this layer the tacit dimension²: we can know more than we can tell.
This means the organization faces more than a case of "the information has not yet been collected." Even if everyone were loyal and cooperative, that local, tacit knowledge would still evaporate in the course of being gathered. The "whole picture of the organization" the center wants cannot, in principle, be faithfully fitted into any container that could verify it. This is the social-scale version of partial observability, and it comes with a harder floor: such knowledge is by its very nature local, and cannot be gathered into any one place.
The Urge Toward Legibility
If it cannot be seen clearly, the impulse is to find a way to make it seeable. Scott's 1998 book Seeing Like a State³ gives this urge an exact name: legibility. For a state to act upon society, it must first remake society into a shape it can read. It measures the land and draws cadastral maps, imposes fixed surnames on people who once had only nicknames, bynames, or patronymics, standardizes weights and measures, and rolls out standardized scientific forestry. These are not neutral acts of recording; they are reshaping reality itself so that reality will fit into the table. Hacking's The Taming of Chance³², Desrosières's The Politics of Large Numbers³¹, and Bowker and Star's Sorting Things Out³⁰, taken together, form a history of "making society countable."
The danger of legibility is that the map must simplify, and once an organization acts only according to the map, the things the map has erased come back to bite. Scott's most forceful case is precisely scientific forestry: to render the forest "legible, countable, and taxable," the Prussians remade the tangled natural woodland into uniform, easily inventoried single-species plantations. The first generation grew splendidly; in the second rotation and after, the soil was exhausted, pests spread, and the forest died off in swaths, so much so that German even coined a word for it, Waldsterben, the death of the forest. The cleaner the map, the more lethal the loss of the local knowledge it erased, the knowledge that had kept the system running. This is the organization manufacturing for itself the observability it lacks, at the cost of leveling, with its own hands, the very complexity that let it run.
The Proxy Metric, and Its Goodhart Collapse
The most common landing place of the urge toward legibility is the metric. The things truly cared about (health, learning, productivity, public welfare) cannot be observed directly; so the organization grasps the proxy it can see: the KPI, GDP, exam scores, paper citation counts, emergency-room waiting times.
This is exactly the proxy substitution we met in Chapter 7. But here it fails in the opposite manner, and that contrast is one of this book's main threads. The mathematician's proxy is faithful but no easier: an equivalent rewriting really is equivalent, yet it has not become any simpler to solve. The organization's proxy is precisely the reverse, easier but unfaithful: the metric is of course easy to measure, but its correspondence to the true target snaps the moment the metric itself becomes the target.
This rupture goes by many names. Goodhart, in 1975⁴, observed that once a metric is taken as a policy target, its reliability as a metric falls apart. Campbell, in 1979⁶, stated the social version of the same thing. As early as 1956, Ridgway had catalogued the "dysfunctional consequences of performance measurements"⁷; Kerr's 1975 essay "On the Folly of Rewarding A, While Hoping for B"⁸ wrote it into the common sense of management. Strathern gave it its most distilled single sentence⁵: when a measure becomes a target, it ceases to be a good measure. A deeper layer is reflexivity, which Espeland and Sauder, in 2007¹², call reactivity: a public ranking does not merely describe the world but remakes it; a ranked university will change itself to fit the ranking's algorithm, so that what the metric "measures" is precisely the behavior it has itself called into being. Bevan and Hood¹¹ documented the gaming of metrics inside the English health system; Smith, in 1995¹⁰, analyzed how the public release of performance data invites a string of unforeseen consequences; and Merton's 1936 paper on "the unanticipated consequences of purposive social action"⁹ is the wellspring of all this.
Such collapses are common in reality, and the cost is at times staggering. To hit their "cross-selling" targets, Wells Fargo employees secretly opened accounts without customers' knowledge, by the bank's later estimate some 3.5 million of them; the affair came to light in 2016, when the bank was fined 185 million dollars and disclosed that more than five thousand employees had been dismissed. The number that had been enshrined had destroyed the very customer relationship it was meant to measure. An earlier, parable-like case is told of colonial-era Delhi: to wipe out snakes, the authorities offered a bounty for dead cobras, whereupon residents simply bred cobras to collect it; when the bounty stopped, the snakes were all set free, and the snake problem grew worse than before. The "cobra effect" takes its name from this.
Why must a proxy be distorted? Principal-agent theory gives the rigorous explanation. Holmström's 1979 informativeness principle¹⁴ says that reward should be tied to signals that carry information about effort. But once effort is multidimensional and only a few of its dimensions can be measured, trouble begins. Holmström and Milgrom's 1991 multitask principal-agent analysis¹⁵ puts it plainly: when a person must attend to measurable and unmeasurable tasks at once, the more heavily you reward the measurable part, the more effort they shift away from the unmeasurable part toward the measurable. Let the true target be $G$ and the observable proxy be $P$; the two are correlated under the status quo. The problem is that this correlation is a product of behavior, not an objective law. Once $P$ is made the target of pressure,
$$\arg\max_{a} P(a)\ \quad\text{vs.}\quad\ \arg\max_{a} G(a),$$
the rational agent goes looking for actions that raise $P$ while doing nothing for, or even harming, $G$; the correlation is crushed by the very pressure to optimize. The teacher teaching to the test, the hospital scheduling patients so as to lower one particular waiting-time figure, the researcher slicing a single paper into the smallest countable units of publication, are all the same mechanism.
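The shift of effort can be made concrete with a toy calculation. What follows is not Holmström and Milgrom's model; it is a minimal sketch under assumptions of our own: an effort budget of 1 split between a measured and an unmeasured task, square-root returns on each, and an agent who cares about $G$ but is additionally paid `bonus` times $P$. The function names and the payoff form are hypothetical, chosen only for illustration.

```python
import math


def true_goal(m: float) -> float:
    """G: value from both tasks; m is effort on the measured task, 1 - m on the other."""
    return math.sqrt(m) + math.sqrt(1.0 - m)


def proxy(m: float) -> float:
    """P: the metric sees only the measured task."""
    return math.sqrt(m)


def best_effort_split(bonus: float, steps: int = 1000) -> float:
    """Effort the agent puts on the measured task.

    Assumed payoff (illustrative only): G(m) + bonus * P(m),
    maximized by brute force over an evenly spaced grid on [0, 1].
    """
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: true_goal(m) + bonus * proxy(m))


def sweep(bonuses):
    """Return (bonus, effort split, P, G) for each bonus level."""
    rows = []
    for b in bonuses:
        m = best_effort_split(b)
        rows.append((b, m, proxy(m), true_goal(m)))
    return rows


if __name__ == "__main__":
    for b, m, p, g in sweep([0.0, 0.5, 1.0, 3.0]):
        print(f"bonus={b:.1f}  effort on measured task={m:.3f}  P={p:.3f}  G={g:.3f}")
```

Under these assumptions an unrewarded agent splits effort evenly. As `bonus` grows, effort drains toward the measured task: $P$ rises at every step while $G$ falls. The correlation between the two held only so long as nobody was pressing on $P$.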
Shoring It Up With Auditing and Redundancy
The proxy on its own will collapse, so the organization adds two further moves, which are likewise moves that recur throughout this book.
The audit trail and auditing. Double-entry bookkeeping is one of humanity's oldest audit chains; Soll, in The Reckoning²², argues that the ability to keep accounts that can be checked bears directly on the rise and fall of one nation after another: those that can reckon themselves are the ones that endure. Modern financial auditing and independent inspection all amount to swapping "fraud cannot be prevented in advance" for "fraud can be detected after the fact." But this move has an ailment of its own. Power's 1997 The Audit Society²⁰ spells it out: when verification itself becomes a ritual, what the organization produces is no more than the appearance that "everything is under control," rather than control itself. The "audit culture" of Shore and Wright¹⁷ and O'Neill's reflection on "trust" in the 2002 Reith Lectures²¹ speak of the same alienation: in order to be held accountable, institutions pour enormous effort into manufacturing traces that can be inspected, while the real work gets pushed to the side.
Redundancy and consensus. Landau's underrated 1969 article¹⁶ rehabilitated "duplication and overlap": in a system whose parts are none of them fully reliable, redundancy is not waste but a source of reliability; several mutually independent checks are harder to fool all at once than a single authority. This move holds only under one precondition, independence, which the next part will stress again and again: if several checks are in fact of the same origin, a correlated failure will destroy the entire value of the redundancy in one stroke.
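The weight that independence carries shows up in a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not a model taken from Landau: each check is fooled with the same probability `p`, and with probability `shared` all the checks turn out to share one origin and fail together. Both parameter names are hypothetical.

```python
def prob_all_fooled(p: float, k: int, shared: float = 0.0) -> float:
    """Chance that every one of k checks is fooled at once.

    p      -- chance that a single check is fooled
    k      -- number of checks
    shared -- probability that the checks share one origin and so fail together
              (0.0 = fully independent, 1.0 = k copies of a single check)

    Assumed mixture (illustrative only): with probability `shared` the checks
    behave as one check; otherwise they fail independently.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= shared <= 1.0 and k >= 1):
        raise ValueError("p and shared must lie in [0, 1], and k must be at least 1")
    return shared * p + (1.0 - shared) * p ** k


if __name__ == "__main__":
    for shared in (0.0, 0.5, 1.0):
        print(f"shared={shared:.1f}  P(all 3 checks fooled)={prob_all_fooled(0.2, 3, shared):.4f}")
```

With fully independent checks the chance of fooling all of them shrinks geometrically with `k`. At `shared = 1.0` it stays at `p` no matter how many checks are stacked, which is the correlated failure that wipes out the value of the redundancy.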
Where This Chapter Leads, and the Close of Part II
With this, the four sites have all been surveyed. The person at the console, the agent set loose, the mathematician hitting the wall, and the organization blind to itself face sources of unverifiability that are wildly different: preferences hidden in the heart, future behavior in an open world, propositions undecidable in principle, knowledge that is distributed and apt to dodge. Yet what they reach out to grasp is the same small set of things.
What most deserves to be set side by side is the two opposite failures of proxy substitution. The mathematician comes to grief on faithful but no easier; the organization comes to grief on easier but unfaithful. The two ends of that 2×2 table in Chapter 7 now both have flesh on them. They are not two moves but two directions of failure of a single move, and a good proxy must dodge both ends at once, being faithful and easier alike, which is so rare that it is nearly the whole of the craft. Chapter 11 will formally join these two ends. The principal-agent skeleton, too, has grown from a snippet of code in Chapter 6 into a state here.
With this, Part II has demonstrated the moves "embedded in their sites and tangled together." They are scattered, go by changing names, and are mixed into their respective jargons. What Part III sets out to do is to pull each move out of the field in which it grew, wash it clean, name it on its own, and cover every site at one stroke. That comparative table is the true payload of this book.
References
Waypoints: 1. historical scientific judgment; 2. theoretically studied material; 3. how science progresses; 4. how to live in an unverifiable world. This section was checked source by source.
- F. A. Hayek (1945). "The Use of Knowledge in Society." American Economic Review, 35(4), 519-530. [2][4] Hayek argues that the knowledge on which a society runs is never concentrated in one place but dispersed among countless individuals, that it is local knowledge concerning a particular time and a particular place, and that it cannot be faithfully gathered by any center. This essay is the direct point of departure for this chapter's section "Distributed Knowledge," and it sets the epistemological coloring of the predicament that "the organization cannot see itself clearly."
- M. Polanyi (1966). The Tacit Dimension. Doubleday. [2] Polanyi proposes the tacit dimension of knowledge, his famous line being "we can know more than we can tell." This chapter uses it to show that a considerable part of the local knowledge dispersed at the edges simply cannot be put into words and handed up, which is a harder floor beneath the organization's difficulty in verifying itself.
- J. C. Scott (1998). Seeing Like a State: How Certain Schemes to Improve the Human Condition Have Failed. Yale University Press. [2][4] Scott proposes the concept of "legibility": in order to act upon society, a state uses such means as cadastral maps, fixed surnames, and standardized weights and measures to remake society into a shape it can read, and this simplification often erases the local knowledge that keeps the system running, leading to failures like scientific forestry. This chapter's section "The Urge Toward Legibility" is built precisely on this; it is the core reading for understanding why an organization sets about leveling complexity with its own hands.
- C. A. E. Goodhart (1975). "Problems of Monetary Management: The U.K. Experience." Papers in Monetary Economics, Vol. I. Reserve Bank of Australia. [2] Goodhart was originally speaking of monetary policy, yet he gave an insight later cited everywhere: once a statistical regularity is taken as the target of policy control, its original regularity falls apart. This is the source of the name of this chapter's section "The Proxy Metric, and Its Goodhart Collapse," and the starting point for understanding how a proxy is crushed by the pressure to optimize.
- M. Strathern (1997). "Improving Ratings: Audit in the British University System." European Review, 5(3), 305-321. [2][4] Strathern, drawing on the experience of auditing in British universities, left Goodhart's law its most distilled popular formulation: when a measure becomes a target, it ceases to be a good measure. This chapter quotes the sentence directly; it is also the single best line for explaining the abstract collapse of the proxy to a reader.
- D. T. Campbell (1979). "Assessing the Impact of Planned Social Change." Evaluation and Program Planning, 2(1), 67-90. [2][4] Campbell, from the standpoint of social-science evaluation, proposed "Campbell's law," isomorphic to Goodhart's: the more a quantitative social indicator is used for social decision-making, the more it is subject to corruption pressures, and the more it will distort the very social process it was meant to monitor. This chapter uses it to corroborate that the collapse of the proxy is not peculiar to economics but the same phenomenon discovered again and again across disciplines.
- V. F. Ridgway (1956). "Dysfunctional Consequences of Performance Measurements." Administrative Science Quarterly, 1(2), 240-247. [2][4] Ridgway, very early on, systematically catalogued the dysfunctional consequences of performance measurement, distinguishing the distortions brought by single, composite, and multiple measures. This chapter uses it to show that the gaming and distortion of metrics is a rather old problem, discovered quite early, and not some recent coinage of management studies.
- S. Kerr (1975). "On the Folly of Rewarding A, While Hoping for B." Academy of Management Journal, 18(4), 769-783. [2][4] Kerr lists a wealth of real-world examples to show that organizations often reward one kind of behavior while hoping for another that they have not rewarded, with results that naturally run counter to intent. This essay wrote the mismatch of proxy and incentive into the common sense of management, and is the classic source of this chapter's "reward A while hoping for B" mechanism.
- R. K. Merton (1936). "The Unanticipated Consequences of Purposive Social Action." American Sociological Review, 1(6), 894-904. [2][4] Merton systematically analyzed why purposive social action always brings unanticipated consequences, and sorted out their causes, such as ignorance, error, and the imperious immediacy of interest. This chapter treats it as the wellspring of a whole series of "unforeseen" phenomena, such as the gaming of metrics and the backlash of legibility.
- P. Smith (1995). "On the Unintended Consequences of Publishing Performance Data in the Public Sector." International Journal of Public Administration, 18(2-3), 277-310. [2][4] Smith classified and sorted out the string of unintended consequences invited by the public release of performance data in the public sector, such as tunnel vision, myopia, misrepresentation, measure fixation, and gaming. This chapter uses it to break the vague "the metric gets distorted" into several recognizable, concrete modes of failure.
- G. Bevan & C. Hood (2006). "What's Measured Is What Matters: Targets and Gaming in the English Public Health Care System." Public Administration, 84(3), 517-538. [2][4] Bevan and Hood empirically documented the various ways of gaming metrics in the English National Health Service under "targets and terror" governance, such as scheduling patients so as to lower waiting times, a practice that appeases the metric while doing nothing for real health. This chapter takes it as field evidence of how the gaming of metrics actually happens in public services.
- W. N. Espeland & M. Sauder (2007). "Rankings and Reactivity: How Public Measures Recreate Social Worlds." American Journal of Sociology, 113(1), 1-40. [2][4] Espeland and Sauder, using law-school rankings as their example, propose "reactivity," the reflexivity this chapter speaks of: a public measure does not merely describe the world but turns around to reshape the behavior of those being measured, so that what the metric finally measures is the very reaction it has itself called into being. This chapter's passage on the "deeper layer" of reflexivity comes from here; it pushes the failure of the proxy to the level where the metric manufactures reality.
- M. Sauder & W. N. Espeland (2009). "The Discipline of Rankings: Tight Coupling and Organizational Change." American Sociological Review, 74(1), 63-82. [2][4] This companion piece draws on Foucault's concept of discipline to analyze how rankings become embedded in organizations: institutions once loosely coupled are forced under the pressure of rankings into tight coupling, and the external measure is internalized as everyday self-surveillance and organizational change. It complements the previous entry, which treats the mechanism of reflexivity, while this one treats how rankings remake an organization's internal structure.
- B. Holmström (1979). "Moral Hazard and Observability." The Bell Journal of Economics, 10(1), 74-91. [2][4] Holmström proposes the informativeness principle: under moral hazard, the optimal reward contract should hang on all signals that carry information about the agent's effort. This chapter uses it to give a rigorous principal-agent explanation of "why the proxy is bound to be distorted," and to lead into the trouble that arises when effort is multidimensional and only a few dimensions can be measured.
- B. Holmström & P. Milgrom (1991). "Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design." The Journal of Law, Economics, and Organization, 7(Special Issue), 24-52. [2] The multitask principal-agent model shows that when a person must attend to measurable and unmeasurable tasks at once, the more heavily the measurable part is rewarded, the more they will draw effort away from the unmeasurable part. This chapter argues from it the mechanism of the proxy's collapse: pressing on the observable metric rationally induces the agent to abandon work that is hard to measure yet truly important.
- M. Landau (1969). "Redundancy, Rationality, and the Problem of Duplication and Overlap." Public Administration Review, 29(4), 346-358. [2][4] Landau rehabilitated the "duplication and overlap" so often denounced as waste: in a system whose parts are none of them fully reliable, redundancy is precisely the source of reliability, and several mutually independent checks are harder to fool all at once than a single authority. This chapter's section "Shoring It Up With Auditing and Redundancy" adopts this argument directly, and stresses that its precondition is the mutual independence of the checks.
- C. Shore & S. Wright (1999). "Audit Culture and Anthropology: Neo-Liberalism in British Higher Education." The Journal of the Royal Anthropological Institute, 5(4), 557-575. [2][4] Shore and Wright, taking British higher education as their example, propose "audit culture": under neoliberal governance, the logic of accountability and audit seeps into academic institutions, turning peers into objects of surveillance and reshaping the way people govern themselves. This chapter uses it to show how auditing is alienated from a tool into a culture that makes people spend themselves on manufacturing inspectable traces.
- J. Z. Muller (2018). The Tyranny of Metrics. Princeton University Press. [4] Muller, writing for the general reader, surveys the distortions and costs brought by overreliance on quantitative metrics in fields such as medicine, education, policing, and business, and offers judgments about when measurement should and should not be used. The book is a popular synthesis that explains the collapse of the proxy to practitioners, suitable for the reader as an introduction and a point of comparison.
- T. M. Porter (1995). Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton University Press. [2][4] Porter argues that reliance on quantification often springs from a kind of "mechanical objectivity": in situations lacking trust and demanding outward accountability, numbers are used as a tool to suppress personal judgment and ward off challenge. The book provides a deep sociological explanation of why organizations cling to legible numbers, serving as background to both this chapter's sections on legibility and on auditing.
- M. Power (1997). The Audit Society: Rituals of Verification. Oxford University Press. [2][4] Power points out that when verification itself becomes a set of rituals, what the organization produces is often the appearance that "everything is under control," rather than control itself, and society remakes itself in turn so as to be auditable. This chapter's section "Shoring It Up With Auditing and Redundancy" draws on it to spell out the ailment that the audit move carries within: the more traces, the more the real work gets pushed aside.
- O. O'Neill (2002). A Question of Trust: The BBC Reith Lectures 2002. Cambridge University Press. [4] O'Neill, in this set of Reith Lectures, reflects on the contemporary culture of accountability: the various measures of transparency and audit meant to rebuild trust often erode the very trust they were intended to foster, leaving people busy coping with inspection rather than doing the work well. This chapter cites it alongside "the audit society" to show how excessive accountability backfires.
- J. Soll (2014). The Reckoning: Financial Accountability and the Rise and Fall of Nations. Basic Books. [1][4] Soll, with double-entry bookkeeping as his thread, argues that the ability to keep accounts that can be checked bears directly on the rise and fall of one nation after another: those that can reckon themselves are the ones that endure. This chapter uses it to support the claim that "the audit trail and auditing" is one of humanity's oldest chains of verification.
- J. G. March & H. A. Simon (1958). Organizations. John Wiley & Sons. [2] March and Simon laid the foundations of modern organization theory: the rationality of an organization's members is bounded, and the organization copes with the limits of individual cognitive capacity precisely through division of labor, procedures, and information channels. The book provides a basic framework for "the organization cannot see itself," and is a classic source for understanding how information flows and decays through a hierarchy.
- H. A. Simon (1947). Administrative Behavior: A Study of Decision-Making Processes in Administrative Organization. Macmillan. [2] Simon proposes bounded rationality, understanding the organization as a set of structures that help members make decisions under limited cognitive capacity. The book is the source for understanding why an organization must rely on simplification, routine, and proxies to run, and it lays the theoretical bedrock for this chapter's account of the limits of organizational self-knowledge.
- R. M. Cyert & J. G. March (1963). A Behavioral Theory of the Firm. Prentice-Hall. [2] Cyert and March propose a behavioral theory of the firm, stressing that organizational decisions are governed by standard operating procedures, limited search, and the negotiation of goals among parties, rather than by pure optimization. The book helps in understanding the plurality and tension of goals within an organization, and is an important support for this chapter's treatment of the organization as a bounded-rationality actor.
- O. E. Williamson (1975). Markets and Hierarchies: Analysis and Antitrust Implications. Free Press. [2] Williamson, starting from transaction costs, explains why some activities are coordinated by the market and others are folded into a hierarchical organization: bounded rationality and opportunism make some transactions more efficient to complete within a hierarchy. The book provides an economic explanation of why an organization takes scattered activities under its own roof, and thereby shoulders the difficulty of verifying them.
- K. J. Arrow (1974). The Limits of Organization. W. W. Norton. [2][4] Arrow concisely explores the organization as a means of coping with the scarcity of information and with uncertainty, and the inherent limits it meets in authority, responsibility, and trust. The book points out that trust is an indispensable lubricant of social functioning that cannot be bought by contract, in distant resonance with the cost of verification examined in this chapter's sections on auditing and redundancy.
- M. Lipsky (1980). Street-Level Bureaucracy: Dilemmas of the Individual in Public Services. Russell Sage Foundation. [2][4] Lipsky points out that street-level bureaucrats such as teachers, police, and social workers exercise a great deal of discretion under conditions of scarce resources, and that their everyday coping in fact shapes how public policy actually lands. The book is an important reference for understanding why the local knowledge and discretion at the organization's edges is hard for the center to observe and verify.
- J. Q. Wilson (1989). Bureaucracy: What Government Agencies Do and Why They Do It. Basic Books. [2][4] Wilson examines in detail the actual workings of government agencies, distinguishing types of agency by whether outputs and outcomes are observable, and explains why the real effectiveness of many public agencies is hard to measure. The book provides rich real-world material for this chapter's "the organization blind to itself," and is especially helpful for understanding why proxy metrics are particularly apt to distort in the public sector.
- G. C. Bowker & S. L. Star (1999). Sorting Things Out: Classification and Its Consequences. MIT Press. [2][4] Bowker and Star examine how classification systems silently embed themselves into infrastructure, and how they shape the very reality they meant to record neutrally, with the differences flattened by classification often carrying real consequences. This chapter sets it alongside Hacking and Desrosières, gathering it into the history of "making society countable," to show that classification is an invisible link in the engineering of legibility.
- A. Desrosières (1998). The Politics of Large Numbers: A History of Statistical Reasoning (trans. C. Naish). Harvard University Press. [2] Desrosières traces the history of statistical reasoning, showing that statistical categories took shape in step with state administration, and that numbers are at once a tool for knowing society and a political act that constructs social reality. This chapter brings it into the genealogy of "making society countable," revealing the provenance of the statistical apparatus behind legibility.
- I. Hacking (1990). The Taming of Chance. Cambridge University Press. [2] Hacking examines the rise of statistical and probabilistic thought in the nineteenth century, arguing that the mass collection of population data "tamed chance" and gave birth to concepts such as "the normal" and "normality" that now pervade modern governance. This chapter cites it to show that making society countable is itself a stretch of history that remade cognition, and not a neutral act of recording.