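It's the end of the year again, so it's time to put together this year's reading list. 2020 was a truly unusual year. I didn't actually read many books; most of them were related to my field, and several were ones I had first read as an undergraduate and came back to. Books like these are classics, and every reading yields something different.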
LBRY
A decentralized video platform – could it be a competitor to YouTube?
The All in Go Stack
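After finishing my undergraduate degree I rarely had time to tinker with my own website. Most of the time I would just ssh into the server, throw a service together, and let it grow on its own. Even when a service went down I didn't care much; after all, it was something I built for myself, and for the site's users it came with ABSOLUTELY NO WARRANTY. But with the site's user numbers recently growing at an unbelievable pace (even though I hadn't done anything), I naturally started wanting the site to be more "reliable."
As one of the last TODO items of 2020, I finally completed the "architecture upgrade" of the whole of changkun.de during the first three days of the first week of my Christmas vacation. The old setup was a disorganized, dependency-heavy mix of native Nginx + Docker + tmux-served binaries + Jupyter notebook + hypervisor + crontab + …, relying on C/C++/Python/Node.js/Go/… and countless third-party dependencies. It has been fully replaced by a pure Go backend guided by one principle: build in-house wherever possible, avoid dependencies whenever possible, and if a dependency is unavoidable, pick one written in Go.
This article describes which personal and public-facing services (public, but not very visible) changkun.de has accumulated and hosted as a personal website while growing from fewer than a hundred active users per year to more than ten thousand active users per month, along with the story behind the migration.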
"Worse is Better"
I stumbled upon an excerpt from an essay called “The Rise of Worse is Better.” Its author, Richard P. Gabriel, reflects on why C and Unix succeeded. The essay discusses four major goals of software design: simplicity, correctness, consistency, and completeness. Two highly representative schools of thought have developed around these goals: the MIT school and the New Jersey school (New Jersey being where Bell Labs is located). The MIT school holds that software must first be absolutely correct and consistent, then complete, and only then simple. The essay also “satirizes” the New Jersey school for doing the opposite: it gives simplicity the highest priority and is even willing to sacrifice correctness for it. In other words, software quality (in the sense of popularity) does not increase with more features; from the perspective of practicality and ease of use, software with fewer features is often more favored by users and the market.
So now you can see why some people always complain that Go can’t do this and can’t do that, is missing this and missing that. It’s because Rob Pike from Bell Labs is a through-and-through New Jersey school person. So to sum up, Go’s characteristics are:
- Simple
- Very simple
- Nothing but simple
There are several follow-up articles on “Worse is Better”:
- Original: Richard P. Gabriel. The Rise of Worse is Better. 1989. https://www.dreamsongs.com/RiseOfWorseIsBetter.html
- Follow-up 1: Nickieben Bourbaki. Worse is Better is Worse. 1991. https://dreamsongs.com/Files/worse-is-worse.pdf
- Follow-up 2: Richard P. Gabriel. Is Worse Really Better? 1992. https://dreamsongs.com/Files/IsWorseReallyBetter.pdf
- Follow-up 3: Richard P. Gabriel. Worse is Better. 2000. https://www.dreamsongs.com/WorseIsBetter.html
- Follow-up 4: Richard P. Gabriel. Back to the Future: Worse (Still) is Better! Dec 04, 2000. https://www.dreamsongs.com/Files/ProWorseIsBetterPosition.pdf
- Follow-up 5: Richard P. Gabriel. Back to the Future: Is Worse (Still) Better? Aug 2, 2002. https://www.dreamsongs.com/Files/WorseIsBetterPositionPaper.pdf
So which school do you lean towards?
Proebsting's Law
Today I read an extra paper. Although it’s not directly related to Go, I think it offers some insightful perspective on the current state of the Go language, so I’d like to share it. The paper is called “On Proebsting’s Law.”
We all know Moore’s Law says the number of transistors on integrated circuits doubles every 18 months, but this paper studies and validates the so-called Proebsting’s Law: the performance improvement brought by compiler optimization techniques doubles every 18 years. Proebsting’s Law was proposed in 1998, and its author Todd Proebsting was probably half-joking, because he suggested that the compiler and programming language research community should reduce their focus on performance optimization and instead pay more attention to improving programmer productivity.
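To make the gap between the two doubling periods concrete, they can be converted into compound growth per year. The snippet below is only a back-of-the-envelope sketch; the helper name annualGrowth is made up for illustration, and the two periods are the ones quoted above.

```go
package main

import (
	"fmt"
	"math"
)

// annualGrowth converts a doubling period (in years) into the equivalent
// compound growth rate per year: 2^(1/period) - 1.
// The name is hypothetical; it only illustrates the arithmetic.
func annualGrowth(doublingYears float64) float64 {
	return math.Pow(2, 1/doublingYears) - 1
}

func main() {
	// Moore's Law as quoted above: doubling every 18 months (1.5 years).
	fmt.Printf("hardware:  %.1f%% per year\n", 100*annualGrowth(1.5))
	// Proebsting's Law: doubling every 18 years.
	fmt.Printf("compilers: %.1f%% per year\n", 100*annualGrowth(18))
}
```

This prints roughly 58.7% per year for hardware and 3.9% per year for compiler optimizations. Put differently, over one 18-year span the hardware figure doubles twelve times (a factor of 4096), while compiler optimizations contribute a factor of 2.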
Now, looking back at this suggestion with hindsight, we can see it’s not without merit: although Go’s compiler has gone through several major optimization versions, the techniques it uses aren’t particularly fancy – rather, they are quite traditional and conventional optimization techniques. However, this hasn’t hindered Go’s success, because what it tries to address is exactly programmer productivity:
- By forbidding circular imports, it greatly reduces the time programmers spend waiting for compilation
- Its very concise language design and feature set greatly reduce the time programmers spend thinking about how to use the language
- The Go 1 compatibility promise (programs written for an earlier Go 1 release keep compiling and running with later ones) almost entirely eliminates the migration and maintenance time caused by version upgrades
- Paper: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.434&rep=rep1&type=pdf
- Proebsting’s Law: http://proebsting.cs.arizona.edu/law.html
Telegram Bot
Because the COVID situation in Europe is still terrible, even shopping at an Apple Store requires an appointment in advance. Since I urgently needed to visit an Apple Store recently but couldn’t find any available appointment slots, I quickly hacked together a tool to check availability and send a reminder message via Telegram when an appointment becomes available. Tool link: https://changkun.de/s/apreserve
Interacting with Telegram using Go is straightforward:
- Create a bot from BotFather
- Obtain the bot’s token and the chat ID for your conversation with it
- Then you can handle messages
- BotFather: https://t.me/botfather
- Tg bot API Go bindings: https://github.com/go-telegram-bot-api/telegram-bot-api
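The steps above can also be sketched without any third-party package, by calling the Bot API's sendMessage method over HTTP. This is only a minimal sketch and not the code behind the tool linked above: the names sendMessage, capture and fakeTelegram are made up for illustration, and main talks to a local httptest server so that it runs without a real token or chat ID.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// sendMessage posts a text message through the Bot API's sendMessage method.
// apiBase would be "https://api.telegram.org" in real use; token comes from
// BotFather and chatID identifies the conversation with the bot.
func sendMessage(apiBase, token, chatID, text string) error {
	endpoint := fmt.Sprintf("%s/bot%s/sendMessage", apiBase, token)
	resp, err := http.PostForm(endpoint, url.Values{
		"chat_id": {chatID},
		"text":    {text},
	})
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("telegram: unexpected status %s", resp.Status)
	}
	return nil
}

// capture records what the stand-in server received (hypothetical helper).
type capture struct{ Path, ChatID, Text string }

// fakeTelegram starts a local server that mimics the endpoint just enough
// for this sketch; it is an assumption for testing, not part of Telegram.
func fakeTelegram(c *capture) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c.Path = r.URL.Path
		c.ChatID = r.PostFormValue("chat_id")
		c.Text = r.PostFormValue("text")
		fmt.Fprint(w, `{"ok":true}`)
	}))
}

func main() {
	var c capture
	srv := fakeTelegram(&c)
	defer srv.Close()

	// With a real bot, pass "https://api.telegram.org", the real token and chat ID.
	if err := sendMessage(srv.URL, "TOKEN", "42", "An appointment slot is available"); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Println("delivered to chat", c.ChatID)
}
```

The Go bindings linked above wrap the same HTTP calls and additionally cover receiving updates, so they are likely the better choice once the bot needs to handle incoming messages.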
Apple Silicon
How is Go’s compilation performance on darwin/arm64? I did a rough and non-rigorous comparison of Go compilation performance between an Intel Mac and an M1 Mac. This compilation report was generated with the following commands:
$ go build -gcflags='-bench=bench.out' -a
$ cat bench.out
where -a forces every package to be rebuilt, so cached build results are not reused.
The two machines compared:
- MacBook Air (M1, 2020), Apple M1, 16 GB
- Mac mini (2018), 3 GHz 6-Core Intel Core i5, 8 GB 2667 MHz DDR4
Deprecating ioutil
ioutil will be fully deprecated in Go 1.16. Although these APIs will continue to exist due to the compatibility guarantee, they are no longer recommended for use. So the question is: what should we use instead? Here are all the APIs in the ioutil package:
- Discard
- NopCloser
- ReadAll
- ReadDir
- ReadFile
- TempDir
- TempFile
- WriteFile
The corresponding replacement APIs in 1.16:
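- ioutil.Discard → io.Discard
- ioutil.NopCloser → io.NopCloser
- ioutil.ReadAll → io.ReadAll
- ioutil.ReadDir → os.ReadDir
- ioutil.ReadFile → os.ReadFile
- ioutil.TempDir → os.MkdirTemp
- ioutil.TempFile → os.CreateTemp
- ioutil.WriteFile → os.WriteFile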
In summary, three key changes:
- Discard, NopCloser, ReadAll have been moved to the io package
- ReadDir, ReadFile, WriteFile have been moved to the os package
- TempDir, TempFile have been renamed to MkdirTemp, CreateTemp and moved to the os package
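The three changes summarized above can be exercised in one small program. This is a minimal sketch that needs Go 1.16 or later; the function name roundTrip and the file names are made up for illustration. One difference worth keeping in mind while migrating: os.ReadDir returns a slice of os.DirEntry values rather than the os.FileInfo slice that ioutil.ReadDir returned.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a file in a fresh temporary directory using
// only the Go 1.16 replacements, then returns what it read back together
// with the directory listing. The name is hypothetical.
func roundTrip(content string) (string, []string, error) {
	// os.MkdirTemp replaces ioutil.TempDir.
	dir, err := os.MkdirTemp("", "ioutil-demo-")
	if err != nil {
		return "", nil, err
	}
	defer os.RemoveAll(dir)

	// io.NopCloser and io.ReadAll replace ioutil.NopCloser and ioutil.ReadAll.
	rc := io.NopCloser(strings.NewReader(content))
	defer rc.Close()
	data, err := io.ReadAll(rc)
	if err != nil {
		return "", nil, err
	}

	// os.WriteFile replaces ioutil.WriteFile.
	name := filepath.Join(dir, "note.txt")
	if err := os.WriteFile(name, data, 0644); err != nil {
		return "", nil, err
	}

	// os.CreateTemp replaces ioutil.TempFile.
	f, err := os.CreateTemp(dir, "scratch-*.tmp")
	if err != nil {
		return "", nil, err
	}
	f.Close()

	// io.Discard replaces ioutil.Discard.
	if _, err := io.Copy(io.Discard, strings.NewReader("dropped")); err != nil {
		return "", nil, err
	}

	// os.ReadFile replaces ioutil.ReadFile.
	back, err := os.ReadFile(name)
	if err != nil {
		return "", nil, err
	}

	// os.ReadDir replaces ioutil.ReadDir; entries come back sorted by name.
	entries, err := os.ReadDir(dir)
	if err != nil {
		return "", nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return string(back), names, nil
}

func main() {
	got, names, err := roundTrip("hello, Go 1.16")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got, names)
}
```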
Testing io/fs Implementations
io/fs is getting closer and closer. The functionality is great, but how do we test an implementation of it? The testing/fstest package has a function, TestFS, that does just that.
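As a minimal sketch of how such a check might look, the program below runs fstest.TestFS against the in-memory fstest.MapFS. The helper name checkFS and the file names are made up for illustration; a real test would pass its own fs.FS implementation in place of the MapFS.

```go
package main

import (
	"fmt"
	"testing/fstest"
)

// checkFS builds a small in-memory file system and asks fstest.TestFS to
// verify that it behaves like a well-formed fs.FS and contains every path
// listed in expected. The helper name is hypothetical.
func checkFS(expected ...string) error {
	fsys := fstest.MapFS{
		"hello.txt":      {Data: []byte("hello, io/fs")},
		"docs/readme.md": {Data: []byte("# readme")},
	}
	return fstest.TestFS(fsys, expected...)
}

func main() {
	// Both files exist, so no error is expected here.
	fmt.Println("existing files:", checkFS("hello.txt", "docs/readme.md"))
	// A path that is not in the file system should be reported as an error.
	fmt.Println("missing file:  ", checkFS("missing.txt"))
}
```

Inside a regular test, the same call would typically be wrapped as `if err := fstest.TestFS(fsys, ...); err != nil { t.Fatal(err) }`.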
Revisiting Asynchronous Preemption
Are you sure you understand asynchronous preemption? Today, while discussing with Cao Da (@Xargin) about how the interrupted G in the asynchronous preemption flow restores its previous execution context, I realized my understanding of asynchronous preemption was not comprehensive enough. In “Go Under The Hood,” asynchronous preemption is described as follows: let’s name the two running threads M1 and M2. The overall logic of a preemption call can be summarized as:
- M1 sends an interrupt signal (signalM(mp, sigPreempt))
- M2 receives the signal; the OS interrupts its executing code and switches to the signal handler (sighandler(signum, info, ctxt, gp))
- M2 modifies the execution context and resumes at the modified location (asyncPreempt)
- Re-enters the scheduling loop to schedule other Goroutines (preemptPark and gopreempt_m)
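The effect of these four steps can be observed from user code. The following is a small experiment rather than anything taken from the runtime: it assumes Go 1.14 or later on a platform that supports asynchronous preemption, and the function name spinAndStop is made up for illustration. With a single P, the spinning goroutine never yields voluntarily, so the calling goroutine can only get the P back if the spinner is preempted.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinAndStop runs a tight loop on the only P and reports how long it took
// until the calling goroutine was able to run again and stop it.
// The name is hypothetical.
func spinAndStop() time.Duration {
	prev := runtime.GOMAXPROCS(1) // a single P, so the spinner monopolizes it
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan struct{})
	start := time.Now()
	go func() {
		// A loop body that only performs an atomic load; on common
		// platforms this is typically compiled without a function call,
		// which leaves no cooperative preemption point.
		for atomic.LoadInt32(&stop) == 0 {
		}
		close(done)
	}()

	// Sleeping hands the P to the spinner. The caller can only continue
	// once the spinner has been preempted and the scheduler runs again.
	time.Sleep(20 * time.Millisecond)
	atomic.StoreInt32(&stop, 1)
	<-done
	return time.Since(start)
}

func main() {
	fmt.Println("regained control after", spinAndStop())
}
```

Running the same program with GODEBUG=asyncpreemptoff=1 disables signal-based preemption; in that case it may never print anything, because nothing interrupts the loop.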
This summary is not entirely correct, because it does not clearly explain the difference between preemptPark and gopreempt_m. This week, let’s briefly supplement the overall behavior of asynchronous preemption:
Assuming the system monitor acts as M1, after the system monitor sends the interrupt signal, execution arrives at asyncPreempt2, which hands off to either preemptPark or gopreempt_m.
But will it ultimately choose preemptPark or gopreempt_m? The asynchronous preemption issued by sysmon calling preemptone does not set the preemptStop flag on G, so it enters the gopreempt_m flow. gopreempt_m ultimately calls goschedImpl, which places the preempted G into the global queue to be scheduled later.
So what about the other half (preemptPark)? When we carefully examine the implementation of preemptPark, we find that the preempted G is not added to the scheduling queue at all. Instead, it directly calls schedule.
So how does the preempted G get back into the scheduling loop? It turns out that the branch where gp.preemptStop is true occurs when the GC needs it (markroot): it uses suspendG to mark the running G (gp.preemptStop = true), sends the preemption signal (preemptM), and returns the state of the interrupted G. When the GC's marking work is complete and preemption ends, it passes this state to resumeG, which ultimately calls ready to resume the interrupted G.
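The difference between the two paths can be captured in a toy model. The code below is not the runtime's implementation: it borrows the function names from the description above, collapses all run queues into one slice, and models signal delivery as a direct call. It only illustrates who is responsible for putting the preempted G back on a queue.

```go
package main

import "fmt"

type status int

const (
	running status = iota
	runnable
	preempted
)

// g and sched are drastically simplified stand-ins for the runtime's types.
type g struct {
	id          int
	status      status
	preemptStop bool
}

type sched struct {
	runq []*g // one queue standing in for the runtime's run queues
}

// gopreemptM models the sysmon path: the G is made runnable and queued
// right away, so the scheduler will pick it up again by itself.
func (s *sched) gopreemptM(gp *g) {
	gp.status = runnable
	s.runq = append(s.runq, gp)
}

// preemptPark models the GC path: the G is parked as preempted and is
// deliberately NOT queued; someone has to resume it explicitly.
func (s *sched) preemptPark(gp *g) {
	gp.status = preempted
}

// asyncPreempt2 picks one of the two paths based on gp.preemptStop.
func (s *sched) asyncPreempt2(gp *g) {
	if gp.preemptStop {
		s.preemptPark(gp)
	} else {
		s.gopreemptM(gp)
	}
}

// suspendG models the GC requesting a stop: set the flag, "send the signal"
// (here a direct call), and hand the suspended G back to the caller.
func (s *sched) suspendG(gp *g) *g {
	gp.preemptStop = true
	s.asyncPreempt2(gp)
	gp.preemptStop = false // request served
	return gp
}

// resumeG models ready: only now does the parked G become schedulable.
func (s *sched) resumeG(gp *g) {
	gp.status = runnable
	s.runq = append(s.runq, gp)
}

func main() {
	s := &sched{}

	a := &g{id: 1}
	s.asyncPreempt2(a) // sysmon-style preemption
	fmt.Println("after sysmon preemption, queued:", len(s.runq))

	b := s.suspendG(&g{id: 2}) // GC-style preemption
	fmt.Println("after suspendG, queued:", len(s.runq), "parked:", b.status == preempted)

	s.resumeG(b)
	fmt.Println("after resumeG, queued:", len(s.runq))
}
```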