Multidisciplinary Definitions and Driving Mechanisms of Preference

Definition of Preference

While some view the term “preference” as a rather subjective concept, this does not prevent various disciplines from conducting in-depth research on it. For example:

Computer Science: Preference is defined as a latent utility function, with its driving mechanism or objective function being reward maximization. This is based on the intuitive assumption that humans, as reward-sensitive organisms, tend to choose actions that yield greater rewards.
Behavioral Economics: Preference is defined as a construct formed within a decision scenario, driven by heuristic thinking and biased cognition (constrained by the limited information available at decision time).
Biology: Preference is defined by the priors over sensory states and driven by the minimization of free energy (surprise).
Quantum Decision Theory: Preference is viewed as a superposition state in Hilbert space, and it becomes determinate only through the collapse of the wave function.
Ethics: Preference is defined by volitional action or rational judgment, with its driving mechanism being Ordo Amoris (the order of love).
Military Science: Preference is determined by the commander’s intent and courses of action, based on logical judgment and inference.

The following content is generated by LLMs and may contain inaccuracies.

Context

Your idea touches on the fundamental heterogeneity of the concept of preference across disciplines, a tension that multidisciplinary research often obscures. Behavioral economics aims to improve or comprehensively overhaul traditional economic theory by studying the biases, tendencies, and heuristics in people’s economic decision-making, while computer science formalizes preferences as utility functions to be optimized. The free energy principle posits that systems follow a path of minimal surprise: the brain reduces surprise or uncertainty by making predictions from internal models and updating those models with sensory input to improve predictive accuracy. Quantum decision theory represents cognitive states and prospects as vectors in Hilbert space; the idea that a system exists in a superposition of states before measurement is analogous to the way our cognitive states are indeterminate before a decision is made. This diversity of perspectives reflects a deeper question: are preferences revealed (observed choices), constructed (context-dependent constructions), or metaphysical (expressions of value hierarchies)? The current dialogue among neuroscience, artificial intelligence, and normative ethics requires making these ontological commitments explicit.

Key Insights

1. The Incommensurability of Driving Mechanisms Reveals the Boundaries of Modeling Assumptions
Preference-based reinforcement learning involves an agent acting according to a given policy and an expert evaluating that behavior; three distinct learning approaches are learning the policy directly, learning a preference model, and learning a utility function. These approaches are not interchangeable in practice. For example, work that models human preferences as informed by regret (a measure of how far a segment deviates from optimal decision-making) rather than by partial return shows that, in multiple contexts, the regret model makes the reward function identifiable while the partial-return model does not. Heuristics are typically defined as cognitive shortcuts or rules of thumb that simplify decision-making under uncertainty; they substitute a simpler problem for a difficult one, which implies that “preference” may be a byproduct of metacognitive processes rather than an independent entity. The biological perspective offers another framework: under the free energy principle, biological agents act to keep themselves within a restricted set of preferred states of the world, learning a generative model of the world and planning future actions to sustain a homeostasis that satisfies their preferences. These mechanisms (Bayesian inference, heuristic substitution, reward maximization) cannot be reduced to one another; they constitute distinct explanatory paradigms.
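
To make the contrast between the two preference models concrete, here is a minimal sketch. It assumes a logistic (Bradley-Terry style) choice rule and the deterministic-transition form of segment regret, i.e. the optimal value at the segment's start minus the sum of the segment's return and the optimal value at its end. The corridor task, the value numbers, and all function names are hypothetical and chosen only for illustration.

```python
import math

def preference_probability(score_a, score_b):
    """Logistic (Bradley-Terry style) probability that segment A is preferred to B."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

def partial_return(rewards):
    """Sum of the rewards collected inside the segment."""
    return sum(rewards)

def deterministic_regret(rewards, v_start, v_end):
    """Regret of a segment under deterministic transitions (assumed form):
    optimal value at the start minus (segment return + optimal value at the end)."""
    return v_start - (sum(rewards) + v_end)

# Hypothetical corridor task: every step costs -1, so V*(s) = -(distance to goal).
# Segment A walks two steps toward the goal (distance 5 -> 3).
# Segment B walks two steps away from the goal (distance 2 -> 4).
rewards_a, v_start_a, v_end_a = [-1, -1], -5, -3
rewards_b, v_start_b, v_end_b = [-1, -1], -2, -4

# Partial-return model: both segments earn the same summed reward.
p_partial = preference_probability(partial_return(rewards_a), partial_return(rewards_b))

# Regret model: lower regret is better, so the score is the negated regret.
regret_a = deterministic_regret(rewards_a, v_start_a, v_end_a)
regret_b = deterministic_regret(rewards_b, v_start_b, v_end_b)
p_regret = preference_probability(-regret_a, -regret_b)

print(f"P(A preferred | partial return) = {p_partial:.3f}")
print(f"P(A preferred | regret)         = {p_regret:.3f}")
```

In this toy case the partial-return model is indifferent between the two segments because their summed rewards are equal, while the regret model strongly prefers the segment that moves toward the goal. That is one intuition for why the two models can carry different information about the underlying reward function.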

2. Quantum and Phenomenological Approaches Reveal the Deep Structure of Uncertainty and Contextuality
Quantum decision theory is grounded in the mathematical theory of separable Hilbert spaces. It captures superposition effects of composite prospects, which merge several prospective actions, and it describes entangled decision-making, the non-commutativity of successive decisions, and intention interference. This is more than a mathematical analogy: quantum probability provides straightforward explanations for conjunction and disjunction errors and for numerous other findings, such as order effects in probability judgment. Quantum models also introduce a new fundamental concept, the compatibility or incompatibility of questions, and its effect on the order in which judgments are made. In Scheler’s ethics, meanwhile, love is not merely an emotion but a cognitive act that recognizes values and arranges them in an ordo amoris (order of love). Scheler describes four ranks of value: the sensory (pleasure and pain), the vital (health, vitality), the spiritual (beauty, truth, justice), and the sacred (holiness, divinity). A correct ordo amoris loves higher values over lower ones. Together these perspectives suggest that preferences are not static orderings but dynamic structures that collapse at the moment of measurement or action, shaped by the value ontology of the individual or culture.
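
The order effect attributed to incompatible questions can be illustrated with a toy two-dimensional example. This is only a sketch of the general quantum-probability idea, not a model from the quantum decision theory literature: the cognitive state is a unit vector, each question's "yes" answer is a one-dimensional subspace, and the probability of answering yes to both questions in sequence is the squared length of the state after two successive projections. The angles and function names are arbitrary assumptions.

```python
import math

def unit(theta_deg):
    """Unit vector in the plane at the given angle (degrees)."""
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t))

def project(state, axis):
    """Project `state` onto the 1-D subspace spanned by the unit vector `axis`."""
    amplitude = state[0] * axis[0] + state[1] * axis[1]
    return (amplitude * axis[0], amplitude * axis[1])

def norm_sq(v):
    """Squared Euclidean length of a 2-D vector."""
    return v[0] ** 2 + v[1] ** 2

def prob_yes_then_yes(state, first, second):
    """Probability of 'yes' to `first` and then 'yes' to `second`."""
    return norm_sq(project(project(state, first), second))

state = unit(0)        # hypothetical initial cognitive state
question_a = unit(30)  # 'yes' subspace of question A (assumed angle)
question_b = unit(75)  # 'yes' subspace of question B (assumed angle)

p_a_then_b = prob_yes_then_yes(state, question_a, question_b)
p_b_then_a = prob_yes_then_yes(state, question_b, question_a)

print(f"P(yes to A, then yes to B) = {p_a_then_b:.4f}")
print(f"P(yes to B, then yes to A) = {p_b_then_a:.4f}")
```

Because the two projections do not commute, the two orderings give different probabilities. If both questions shared the same "yes" axis (compatible questions), the order would make no difference.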

3. Interdisciplinary Integration Requires a Meta-theoretical Framework Rather Than Reductive Translation
The current gap cannot be bridged by aligning terminology; it requires a framework capable of accommodating multiple causal levels. In active inference, beliefs about world states and policies are continuously updated to minimize variational free energy, and posterior beliefs about policies are based on expected free energy; both self-evidencing and active inference entail a fundamental imperative to minimize generalized free energy or uncertainty. Behavioral economics, by contrast, holds that cognitive biases, heuristics, affect, and social influences all play critical roles in shaping economic choices and lead individual behavior to deviate from rationality, and it emphasizes how emotions interact with cognitive biases to influence decisions. An integrative framework might treat Scheler’s ordo amoris as a set of “meta-preferences” (preferences about how to weigh values across different domains), or it might follow active inference in unifying perception and action under free energy minimization. The focus in military science on commander’s intent and courses of action hints at another dimension: preferences are embedded in the coupling between agent and environment rather than residing solely “within” the agent.
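
As a small illustration of what "minimizing variational free energy" means, the sketch below evaluates F[q] = Σ_s q(s) (ln q(s) − ln p(o, s)) for a two-state discrete model and a single observation. The prior and likelihood values are made up, and the model is far simpler than anything used in the active inference literature; it only shows that F is bounded below by surprise, −ln p(o), and reaches that bound when q equals the exact Bayesian posterior.

```python
import math

def variational_free_energy(q, prior, likelihood):
    """F[q] = sum_s q(s) * (ln q(s) - ln p(o, s)) for one observed outcome o,
    where p(o, s) = prior(s) * likelihood(o | s)."""
    total = 0.0
    for q_s, p_s, p_o_given_s in zip(q, prior, likelihood):
        if q_s > 0.0:  # terms with q(s) = 0 contribute nothing
            total += q_s * (math.log(q_s) - math.log(p_s * p_o_given_s))
    return total

# Hypothetical two-state model: numbers are illustrative assumptions.
prior = [0.5, 0.5]        # p(s)
likelihood = [0.9, 0.2]   # p(o | s) for the outcome actually observed

evidence = sum(p * l for p, l in zip(prior, likelihood))          # p(o)
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]  # p(s | o)
surprise = -math.log(evidence)

f_prior = variational_free_energy(prior, prior, likelihood)          # beliefs not yet updated
f_posterior = variational_free_energy(posterior, prior, likelihood)  # beliefs fully updated

print(f"surprise -ln p(o)      = {surprise:.4f}")
print(f"F with q = prior       = {f_prior:.4f}")
print(f"F with q = posterior   = {f_posterior:.4f}")
```

Updating beliefs from the prior to the posterior lowers F until it equals the surprise of the observation, which is the sense in which belief updating and surprise minimization coincide in this framework.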

Open Questions

Q1: Does a “true” definition of preference exist, or does each discipline construct a functionally equivalent but ontologically incompatible concept? If the latter is true, should interdisciplinary research abandon the search for a unified definition in favor of constructing explicit translation protocols—analogous to dualities in physics (such as wave-particle duality)?

Q2: How does “commander’s intent” in military science relate to policy selection in computational agents? Is there an “embedded preferences” theory that views individual preferences as emergent properties within larger hierarchical systems (organizations, cultures, ecologies), thereby bridging individual and collective-level analysis?
