Mathematicians' Brain Attention Mechanisms

In the current era, this article becomes even more interesting because it states that “attention is the only thing we possess,” an observation that fits well with the author’s experience in the mathematics field.

He has also long wondered what actually happens in the brain of someone who is extremely intelligent. Is it cognitive inequality, or something else?

I’ve skimmed through the book, and there are several hypotheses:

Theories about cognitive inequality: Certain differences in the structure, volume, speed, and efficiency of neural tissue result in genetic variability, but they cannot account for the magnitude of the observed cognitive inequality.
A 2016 MRI scanning study: The article cites a study with actual evidence. The research found that professional mathematicians typically activate some non-linguistic regions, particularly the parietal lobe, when processing complex mathematical statements. This occurs regardless of whether the statements belong to their own field of research.

In contrast, non-mathematicians at comparable academic levels activate language regions. In other words, when encountering a mathematical expression they cannot understand, mathematicians unconsciously switch to a special mathematical mode, in which they try to perceive and feel the existence of the mathematical objects in a particular way.

My understanding is that the person is genuinely trying to comprehend, but, possibly because one region of the brain cannot focus its attention on the problem, the brain calls upon other regions to speculate and sense.

https://substack.com/home/post/p-183239843

The following content is generated by LLMs and may contain inaccuracies.

Background

“Attention is the only thing we have”: this observation is particularly meaningful in the current AI era, dominated by the Transformer architecture. In an article published in February 2026, David Bessis proposed a conjectural theory of cognitive inequality, which sets up an interesting contrast between how mathematicians' brains work and the attention mechanisms of artificial neural networks.

The article's core question touches on the fundamentals of cognitive science: what exactly happens inside a brain when a person demonstrates extraordinary mathematical talent? Bessis, a professional mathematician, observed that mathematical progress involves not only mathematics itself but also metacognition and emotional control. This observation challenges popular assumptions of genetic determinism and instead places the focus on trainable cognitive habits.

Key Insights

1. Questionable Structural Basis of Cognitive Inequality

Some people may genetically possess more efficient neural metabolism, which could make their mathematical abilities two to ten times greater than those of ordinary people, but genes alone struggle to explain the extreme level of inequality observed. Highly heritable polygenic traits (such as height) typically follow a Gaussian distribution, whereas the distribution of mathematical talent more closely resembles a Pareto distribution. Such distributions usually stem from sequential processes, “rich get richer” mechanisms in which each step builds upon previous results.
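The contrast between additive and compounding processes can be illustrated with a small simulation. This is a minimal sketch, not taken from Bessis's article: the step counts, the ranges of the random factors, and the names `additive_trait`, `compounding_trait`, and `top_share` are all illustrative assumptions. Note also that the compounding model below produces a log-normal distribution, which is heavy-tailed but not strictly Pareto; it is used only to show how building on previous results concentrates outcomes.

```python
import random


def additive_trait(rng, steps=50):
    # Sum of many small independent contributions (a height-like trait).
    # By the central limit theorem the result is approximately Gaussian.
    return sum(rng.uniform(0.9, 1.1) for _ in range(steps))


def compounding_trait(rng, steps=50):
    # Each step multiplies the result so far, so early gains compound.
    # The ranges 0.7 to 1.4 are arbitrary illustrative values.
    value = 1.0
    for _ in range(steps):
        value *= rng.uniform(0.7, 1.4)
    return value


def top_share(values, fraction=0.01):
    # Share of the population total held by the top `fraction` of individuals.
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000
additive = [additive_trait(rng) for _ in range(n)]
compounding = [compounding_trait(rng) for _ in range(n)]

print(f"top 1% share, additive model:    {top_share(additive):.3f}")
print(f"top 1% share, compounding model: {top_share(compounding):.3f}")
```

In the additive model the top 1% of individuals hold close to 1% of the total, while in the compounding model they hold a much larger share, which is the qualitative signature of the inequality described above.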

2. Special Activation Patterns in Mathematicians' Brains

A 2016 fMRI study by Marie Amalric and Stanislas Dehaene scanned professional mathematicians and non-mathematicians of comparable academic standing. When evaluating advanced mathematical statements, whether algebraic, analytical, topological, or geometric, the professional mathematicians activated a reproducible set of bilateral frontal, intraparietal, and ventrolateral temporal regions. Crucially, brain activity during mathematical reflection bypassed the perisylvian language-related regions and the temporal regions traditionally involved in general semantic knowledge (Amalric & Dehaene, PNAS 2016).

When mathematicians think about mathematics, whether analysis, algebra, geometry, or topology, parietal and inferior temporal regions of both hemispheres are activated. By contrast, non-mathematicians facing identical mathematical statements activate language processing regions. This suggests that mathematicians unconsciously switch to a special “mathematical mode,” attempting to “see” and “feel” the existence of these abstract structures in a particular way.

3. Enormous Differences in Metacognitive Habits

Many distinguished mathematicians have tried to make one point clear: their talent is primarily a cognitive attitude. Einstein claimed “I have no special talent, I am only passionately curious”; Descartes insisted at the opening of his Discourse on Method that his mind was no better than ordinary people’s; Grothendieck emphasized that “this power is by no means some extraordinary gift” (David Bessis, Mathematica: A Secret World of Intuition and Curiosity).

One study reports that metacognitive knowledge and metacognitive monitoring are directly and positively correlated with high school students' mathematical modeling skills, and that the critical thinking dimension of computational thinking mediates the relationship between metacognition and mathematical modeling skills. In other words, sufficient metacognition can improve the critical thinking component of students' computational thinking and thereby enhance their mathematical modeling abilities.

4. Secondary Stimuli and Synaptic Connectome

Bessis proposed a critical hypothesis: there must be physical differences between the brains of exceptionally intelligent people and those of ordinary people; otherwise, where would cognitive differences come from? His conjectural theory posits that the cognitive differences measured between individuals at any given moment are primarily explained by differences in their synaptic connectomes.

This framework views the brain as a learning device rather than a computational device. Our synaptic connectome is reshaped not only by primary stimuli (raw sensory signals from the world) but also by secondary stimuli, the continuous stream of mental imagery we generate. When you read a book, the primary stimulus is the ink on the page. But if certain books make you smarter, it is not only because of the ink itself but also because of the related secondary stimuli that the book triggers and that persist for minutes, hours, days, or years: those fleeting thoughts and mental images.

5. Trainability of Attention Control

Both intelligence and metacognitive skills are considered important predictors of mathematical performance, but the role of metacognitive skills in mathematics appears to change early in secondary education. According to the monotonic development hypothesis, metacognitive skills improve with age independently of the development of intelligence (Veenman research).

In one reported intervention, metacognitive instruction produced substantial positive effects on metacognitive skills (effect size ES = 1.18, p < 0.001): students in the treatment group improved their metacognitive skills significantly more than those in the control group. This suggests that deliberate practice can cultivate more effective attention allocation and cognitive monitoring strategies.
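To give a feel for what an effect size of 1.18 means, the sketch below converts it into a more intuitive quantity. It rests on assumptions the text does not confirm: that ES is a standardized mean difference (such as Cohen's d) and that scores in both groups are roughly normal with equal variance. The function name `share_of_controls_below_treated_mean` is purely illustrative.

```python
from statistics import NormalDist


def share_of_controls_below_treated_mean(effect_size):
    # Assumption: effect_size is a standardized mean difference, and both
    # groups are normally distributed with the same standard deviation.
    # Then the average treated student sits `effect_size` standard
    # deviations above the control mean, and the standard normal CDF gives
    # the fraction of control students scoring below that point.
    return NormalDist().cdf(effect_size)


share = share_of_controls_below_treated_mean(1.18)
print(f"{share:.1%}")  # prints 88.1%
```

Under these assumptions, the average student who received metacognitive instruction would score higher than roughly 88% of the control group.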

6. The Algebraic Nature of Raven’s Matrix Test

Bessis offers an unusual perspective on IQ testing. Raven’s Progressive Matrices, one of the most g-loaded IQ tests, actually has a strong flavor of undergraduate algebra: it is all about 3-cycles and permutation matrices. He reports, subjectively, that by projecting mathematical structure onto the pictures he gains an intuitive perception of three overlaid permutation matrices (one for the background geometric shapes, one for the color of the foreground rectangles, one for the angles of the foreground rectangles), and that this intuitive perception greatly reduces the demands on “working memory.”

More importantly, scores on Raven’s Progressive Matrices have risen by about 7 IQ points per decade, more than double the rate of the Flynn effect observed on multifactor intelligence tests such as WAIS and SB. This rapid growth may be explained by the increasing prevalence of tabular structures in our cognitive environment, just as our number sense has evolved substantially over the past millennium.
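The permutation-matrix reading of a Raven-style item can be made concrete with a toy model. The sketch below is a simplification and not Bessis's own formalization or an actual Raven item: it assumes each attribute (shape, color, angle) takes three values arranged so that every value appears once per row and once per column, with each row obtained from the previous one by a 3-cycle. The names `cyclic_latin_square` and `solve_hidden_cell`, and the attribute values, are hypothetical.

```python
def cyclic_latin_square(values, shift):
    # Row r is the base row rotated by r * shift positions. With three
    # values and shift 1 or 2, each row is a 3-cycle of the previous one,
    # and each value's positions form a 3x3 permutation matrix.
    n = len(values)
    return [[values[(c + r * shift) % n] for c in range(n)] for r in range(n)]


def solve_hidden_cell(grid, domain):
    # Find the cell marked None and return the single value that does not
    # already occur in its row or its column.
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] is None:
                used = {grid[r][j] for j in range(n) if j != c}
                used |= {grid[i][c] for i in range(n) if i != r}
                (answer,) = set(domain) - used
                return answer
    raise ValueError("grid has no hidden cell")


# Three independent attribute layers, overlaid on the same 3x3 grid.
layers = {
    "shape": (["circle", "square", "triangle"], 1),
    "color": (["red", "green", "blue"], 2),
    "angle": ([0, 45, 90], 1),
}

answer = {}
for name, (domain, shift) in layers.items():
    grid = cyclic_latin_square(domain, shift)
    grid[2][2] = None  # hide the bottom-right cell, as in a typical item
    answer[name] = solve_hidden_cell(grid, domain)

print(answer)  # {'shape': 'square', 'color': 'red', 'angle': 45}
```

Seen this way, the item splits into three small independent problems, one per attribute, instead of nine composite pictures to hold in mind at once, which is consistent with the reduced load on working memory described above.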

7. The Role of Cognitive Inhibition and Confidence

Cognitive inhibition is an adaptive protection against learning from unreliable mental imagery, and unlocking creative thinking and mastery requires overcoming it. Because this process is partly regulated by social feedback, it produces a self-reinforcing cognitive stratification that solidifies with age.

The renowned mathematician Bill Thurston observed that when someone in mid-career proves a theorem widely recognized as important, their status in the community (their ranking) rises immediately and significantly; at that point they typically become more productive, becoming a center of ideas and a source of theorems. This illustrates that rising confidence, becoming central in the network of ideas, and (most importantly) discovering new ways of thinking all act together.

8. Training Mathematical Intuition

Bessis advocates consciously training one’s mathematical intuition to work more effectively. He calls this process “System 3,” extending psychologist Daniel Kahneman’s famous distinction between System 1 (automatic, unconscious ability) and System 2 (conscious, methodical reasoning) (SIAM Review).

This training is not about learning information but about expanding the range of structures one can conceive. Just as the blind boy Ben Underwood learned to “see” through tongue clicks and echolocation, mathematicians retrain their brains through continuous metacognitive practice to perceive abstract structures intuitively.

Open Questions

Can the neural mechanism of secondary stimuli be directly measured? If cognitive development is primarily mediated by secondary stimuli, could one design longitudinal neuroimaging studies that track how students' brain activation patterns evolve during key stages of mathematical learning (such as the two years of intensive training in the French prépa)? Bessis predicts that individual students' progress will correlate significantly with the strengthening and/or more frequent use of the Amalric-Dehaene “mathematical brain” activation patterns.
Can metacognitive training cross the “genius threshold”? Bessis acknowledges that the glass is only “20% full”: critical aspects of psychological habits and metacognitive methods have already solidified before children acquire language. But if cognitive stratification is driven primarily by trainable attention habits rather than genetic ceilings, are there undiscovered teaching interventions that could systematically push more people toward the extreme tail of the cognitive distribution? Or do the randomness and path dependency of early neural development set insurmountable limits on how much cognitive restructuring is achievable?
