retpolanne blog

Your friendly programmer catgirl 🏳️‍⚧️😺

5 March 2024

GSoC 2024 preparation - perf

by Anne Macedo

This year’s Google Summer Of Code (GSoC) will start soon and I’ve learned that people who are just starting on open source can also join. This made me pretty excited :)

Looking at the projects [1] that the Linux Foundation is planning to submit, I found an old friend of mine: BPF (under perf). [2]

I joined the #linuxfoundation-gsoc channel on IRC and pinged Arnaldo Carvalho de Melo (Acme) to talk about my interest in joining GSoC’s perf projects.

acme copied me through some threads and this blogpost will be my notepad for what’s being discussed.

BTF

One thing that was mentioned to me was investigating the limitations of representing Rust contructs in BTF. Rust is actively ignored by pahole. [3]

Perf lock contention

Namhyung Kim, who’s also copied on the emails exchange, mentioned some patches they are working too [4].

[4] shows us a way to use bpf for collecting kernel lock contention stats without the need to do a perf lock record first.

I tested on an idle postgres db running on my Debian:

sudo ./perf lock con -a -b `pgrep postgres`
contended   total wait     max wait     avg wait         type   caller

         2     23.38 us     11.92 us     11.69 us        mutex   do_epoll_wait+0x1dc

Namhyung also sent a patch for contending locks

Postgres and perf lock contention - ideas for studying kernel behaviour

An idea for doing perf lock contention analysis is trying to run real workloads that are CPU intensive on top of postgres. That’s and idea that I had with my friend, Akemi, and my girlfriend, Madu. We’re still working on this idea.

I don’t really have CPU or kernel-intensive workloads running on my computer, so the results won’t be as interesting. But once we test some interesting stuff we should see a lot of locks.

[6] shows us some example of Postgres performance gains with new kernel versions. So that means that a good and well tuned kernel could make the performance better for the database.

[7] shows us how running pgbench could expose some lock contention on the kernel caused by glibc.

Hacking everything together:

# Create a database for stress testing
sudo -u postgres psql postgres
psql (15.6 (Debian 15.6-0+deb12u1))
Type "help" for help.

postgres=# create database stress_test;
CREATE DATABASE
postgres=# 
\q

# Init pgbench
sudo -u postgres pgbench -s 10000 -i stress_test

sudo ./perf lock con -b --pid `pgrep pgbench`

Errors and troubleshooting

I got this error:

libbpf: prog 'collect_lock_syms': BPF program load failed: No such file or directory
libbpf: prog 'collect_lock_syms': -- BEGIN PROG LOAD LOG --
ldimm64 failed to find the address for kernel symbol 'runqueues'.
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: prog 'collect_lock_syms': failed to load: -2
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -2
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed

I then compiled and installed libbpf from tools/bpf - but I was getting this error:

libbpf: failed to find '.BTF' ELF section in vmlinux

Tried adding CONFIG_DEBUG_INFO_BTF=y to no avail. [8] should probably fix it :) So I’ll switch to the linux-next branch.

Nope… [9] has more info on this error.

Turns out [8] is not applied to linux, but to dwarves.

I deleted bpftool and pahole and installed those from source, but still vmlinux compiles without BTF…

I had to delete CONFIG_DEBUG_INFO_REDUCED and I think I enabled BTF :)_

Now I’m having another error:

/usr/local/bin/ld: jit_disasm.o: in function `init_context':
jit_disasm.c:(.text+0x2f4): undefined reference to `bfd_openr'

I installed the latest binutils from the source to no avail.

Then, I uninstalled binutils-dev and libbfd-dev and it worked :)

But I’m still having the previous error :c

So I installed the kernel with the configs above and was able to fix the errors above. New error showed up:

sudo perf lock con -a -b -- sleep 10
libbpf: prog 'contention_begin': failed to find kernel BTF type ID of 'contention_begin': -3
libbpf: prog 'contention_begin': failed to prepare load attributes: -3
libbpf: prog 'contention_begin': failed to load: -3
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed

Resources

[x] https://www.cs.rice.edu/~la5/doc/perf-doc/d6/d3f/target_8c.html

[y] https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html

References

[1] https://wiki.linuxfoundation.org/gsoc/google-summer-code-2024

[2] https://wiki.linuxfoundation.org/gsoc/2024-gsoc-perf

[3] https://lore.kernel.org/bpf/20240117133520.733288-2-jolsa@kernel.org/

[4] https://lore.kernel.org/all/20220729200756.666106-1-namhyung@kernel.org/

[5] https://lore.kernel.org/linux-perf-users/Zd-UmcqV0mbrKnd0@x1/

[6] https://news.ycombinator.com/item?id=3793973

[7] http://rhaas.blogspot.com/2011/08/linux-and-glibc-scalability.html

[8] https://lore.kernel.org/bpf/ZeJMnNuauQgor67O@x1/T/

[9] https://lore.kernel.org/bpf/f248cf92-038c-480f-b077-f7d56ebc55bc@nvidia.com/T/

tags: gsoc-2024 - gsoc - kernel-dev