Sunday, August 23, 2009

Single-Process Google Chrome

I was running out of RAM at work one day and realized that Google Chrome takes up a considerable amount of memory. Most of this comes from creating several renderer processes, sometimes as many as one per tab. The Chromium core supports running all renderers inside the browser process (simply pass --single-process on the command line), but unfortunately this mode is unsupported and disabled in Google Chrome (see this message on the chromium-dev mailing list). But this wasn't going to stop me...

The first thing to do was to figure out how --single-process is implemented and how it is disabled in Google Chrome official builds. The first link in the mailing list posting points us in the right direction:

From src/chrome/app/chrome_dll_main.cc:
  bool single_process =
      // This is an unsupported and not fully tested mode, so don't enable it for
      // official Chrome builds.
  if (single_process)

What we want to do is change the value of single_process after it is initialized to false but before the if. Unfortunately, here this is impossible because the compiler has optimized away that variable! This snippet of code actually does not even appear in chrome.dll since the compiler has also optimized away the call to RenderProcessHost::set_run_renderer_in_process.

We are lucky though, because single_process is never used again so if we can emulate a call to RenderProcessHost::set_run_renderer_in_process(true), we're golden. Digging deeper, that method does the following:

Google Code Search finds src/chrome/browser/renderer_host/render_process_host.h:
  static void set_run_renderer_in_process(bool value) {
    run_renderer_in_process_ = value;
  }

...with run_renderer_in_process_ being a static bool in RenderProcessHost. This means it behaves like a global variable, with storage in .data.

Now, of course, we don't have symbols for chrome.dll so we have to figure out a way of finding the variable's location in memory. This can be done by looking for other references to this variable, fingerprinting those and using them to find run_renderer_in_process_. In this particular case, we need to look for the accessors of this field, which the compiler has also inlined.

There are a few references scattered across the codebase, but remember that we are looking for ones that are easy to fingerprint and unlikely to change across versions. This means uncommon constants, code sequences, etc. I chose two places: BrowserRenderProcessHost::BrowserRenderProcessHost and RenderProcessHost::ShouldTryToUseExistingProcessHost.

First, let's look at BrowserRenderProcessHost::BrowserRenderProcessHost:
BrowserRenderProcessHost::BrowserRenderProcessHost(Profile* profile)
    : RenderProcessHost(profile),
  if (run_renderer_in_process()) {
    // We need a "renderer pid", but we don't have one when there's no renderer
    // process. So pick a value that won't clash with other child process pids.
    // Linux has PID_MAX_LIMIT which is 2^22. Windows always uses pids that are
    // divisible by 4. So...
    static int next_pid = 4 * 1024 * 1024;
    next_pid += 3;

Now this is a very nice snippet: next_pid is a static initialized to the constant 4 * 1024 * 1024, so that value will sit somewhere in .data, and an ADD instruction with 3 as its second operand will reference it. This combination is fairly uncommon, so we'll start by looking for this one. (Open up IDA.) Here is what we find (in a chrome.dll of 11,844,080 bytes):

.text:021749A8                 cmp     byte_2659CEC, 0
.text:021749AF                 jz      short loc_21749C5
.text:021749B1                 add     dword_2639A58, 3

Here run_renderer_in_process_ is at byte_2659CEC.
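The same search can be sketched in Python as a byte-pattern scan. This is only a rough illustration, not the finished script: it assumes the compiler emits exactly the CMP / JZ short / ADD sequence shown in the listing above (opcodes 80 3D, 74 and 83 05), and the names PID_FINGERPRINT and find_flag_candidates are made up for this example. The sample buffer is synthetic rather than real chrome.dll contents.

```python
import re
import struct

# Byte-level fingerprint of the three instructions in the listing above.
PID_FINGERPRINT = re.compile(
    rb"\x80\x3d(.{4})\x00"   # 80 3D <addr32> 00 : cmp byte ptr [addr], 0
    rb"\x74."                # 74 <rel8>         : jz short
    rb"\x83\x05(.{4})\x03",  # 83 05 <addr32> 03 : add dword ptr [addr], 3
    re.DOTALL,               # address bytes may be anything, including 0x0A
)


def find_flag_candidates(code):
    """Return (flag_address, next_pid_address) for every fingerprint match."""
    hits = []
    for m in PID_FINGERPRINT.finditer(code):
        (flag_addr,) = struct.unpack("<I", m.group(1))
        (pid_addr,) = struct.unpack("<I", m.group(2))
        hits.append((flag_addr, pid_addr))
    return hits


# Synthetic bytes equivalent to the listing, surrounded by filler.
sample = (
    b"\x90" * 8
    + b"\x80\x3d" + struct.pack("<I", 0x02659CEC) + b"\x00"
    + b"\x74\x14"
    + b"\x83\x05" + struct.pack("<I", 0x02639A58) + b"\x03"
    + b"\xc3"
)

for flag, pid in find_flag_candidates(sample):
    print(hex(flag), hex(pid))  # prints: 0x2659cec 0x2639a58
```

On a real binary the pattern may match more than once or not at all (for example if the compiler picks a different encoding), which is one more reason to confirm the result with a second reference.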

Now I wanted to be on the safe side and confirm this with another reference. One of the references to run_renderer_in_process_ is in RenderProcessHost itself:

From src/chrome/browser/renderer_host/
bool RenderProcessHost::ShouldTryToUseExistingProcessHost() {
  size_t renderer_process_count = all_hosts.size();
  return run_renderer_in_process() ||
         (renderer_process_count >= GetMaxRendererProcessCount());
}

Remember that that getter is inlined. In fact, GetMaxRendererProcessCount is inlined too. Let's look at the source for that as well to get an idea of what to expect.

From chrome/browser/renderer_host/
size_t GetMaxRendererProcessCount() {
  static const size_t kMaxRenderersByRamTier[] = {
    3,   // less than 256MB
    6,   // 256MB
    9,   // 512MB
    12,  // 768MB
    14,  // 1024MB
  };

  static size_t max_count = 0;
  if (!max_count) {
    size_t memory_tier = base::SysInfo::AmountOfPhysicalMemoryMB() / 256;
    if (memory_tier >= arraysize(kMaxRenderersByRamTier))
      max_count = chrome::kMaxRendererProcessCount;
    else
      max_count = kMaxRenderersByRamTier[memory_tier];
  }
  return max_count;
}

Going back to the original boolean expression in ShouldTryToUseExistingProcessHost, we expect to see a CMP on run_renderer_in_process_, followed at some point by an indexed access into kMaxRenderersByRamTier. I chose this function because kMaxRenderersByRamTier is very easy to find and unlikely to change across versions. Anyway, the assembly ends up looking like this:

.text:02038775                 cmp     byte_2659CEC, 0
.text:0203877C                 push    esi
.text:0203877D                 mov     esi, dword_2636D08
.text:020387A9 loc_20387A9:                            ; CODE XREF: sub_2038775+2Dj
.text:020387A9                 mov     eax, ds:dword_25CF518[eax*4]

So if we find kMaxRenderersByRamTier (dword_25CF518 here, the table being indexed by eax*4), we can find the reference to it, then look in its neighborhood for a "CMP m8, 0".
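As a rough sketch, this second fingerprint could look like the following in Python. It makes some simplifying assumptions that are not from the original script: the image is treated as one flat buffer in which offset + base equals the virtual address (true for a module mapped in memory, not for the file on disk), the indexed load is encoded as 8B 04 85 <addr32>, and the helper names, the 64-byte window and the base address are hypothetical.

```python
import struct


def cmp_targets(image, start, end):
    """Addresses tested by `cmp byte ptr [addr], 0` (80 3D <addr32> 00) in image[start:end]."""
    found = []
    i = image.find(b"\x80\x3d", start, end)
    while i != -1:
        if i + 7 <= end and image[i + 6] == 0:
            found.append(struct.unpack_from("<I", image, i + 2)[0])
        i = image.find(b"\x80\x3d", i + 1, end)
    return found


def flag_candidates(image, base, table_values, window=64):
    """Locate the dword table, then scan backwards from each reference to it."""
    table_off = image.find(struct.pack("<%dI" % len(table_values), *table_values))
    if table_off == -1:
        return []
    # mov eax, ds:[table + eax*4]  ->  8B 04 85 <addr32>
    ref = b"\x8b\x04\x85" + struct.pack("<I", base + table_off)
    candidates = []
    pos = image.find(ref)
    while pos != -1:
        candidates += cmp_targets(image, max(0, pos - window), pos)
        pos = image.find(ref, pos + 1)
    return candidates


# Synthetic image: a few instructions like the listing above, then the table.
BASE = 0x02000000            # hypothetical load address
TABLE = (3, 6, 9, 12, 14)    # first entries of kMaxRenderersByRamTier
TABLE_OFF = 0x100
code = (
    b"\x80\x3d" + struct.pack("<I", 0x02659CEC) + b"\x00"    # cmp byte ptr [flag], 0
    + b"\x56"                                                # push esi
    + b"\x8b\x35" + struct.pack("<I", 0x02636D08)            # mov esi, dword_2636D08
    + b"\x90" * 8                                            # filler
    + b"\x8b\x04\x85" + struct.pack("<I", BASE + TABLE_OFF)  # mov eax, [table + eax*4]
)
image = code.ljust(TABLE_OFF, b"\xcc") + struct.pack("<5I", *TABLE)

print([hex(a) for a in flag_candidates(image, BASE, TABLE)])  # prints: ['0x2659cec']
```

If both fingerprints yield the same address, that is good evidence that it really is run_renderer_in_process_.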

And of course it would be nice if a computer could do this automatically for us :) Luckily there's an awesome Python module called pydbg that lets you write debugging scripts in Python.
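To give an idea of the overall flow, here is a minimal sketch of the final step: find the flag through the first fingerprint and write a 1 into it, which emulates the side effect of set_run_renderer_in_process(true). It is deliberately not tied to any debugger API: read_memory and write_memory are hypothetical callables that a real script would back with the debugger's process-memory read and write primitives, and FakeProcess only exists so that the example runs on its own.

```python
import re
import struct

# cmp byte ptr [flag], 0 ; jz short ; add dword ptr [next_pid], 3
FINGERPRINT = re.compile(rb"\x80\x3d(.{4})\x00\x74.\x83\x05.{4}\x03", re.DOTALL)


def enable_single_process(read_memory, write_memory, text_start, text_size):
    """Find the flag via the fingerprint and set it to 1; return its address.

    read_memory(addr, size) -> bytes and write_memory(addr, data) are
    hypothetical callables supplied by the caller.
    """
    code = read_memory(text_start, text_size)
    addrs = {struct.unpack("<I", m.group(1))[0] for m in FINGERPRINT.finditer(code)}
    if len(addrs) != 1:
        raise LookupError("expected exactly one candidate, got %d" % len(addrs))
    (flag_addr,) = addrs
    write_memory(flag_addr, b"\x01")
    return flag_addr


class FakeProcess:
    """Stand-in for a debuggee: a flat byte array mapped at `base`."""

    def __init__(self, base, size):
        self.base = base
        self.mem = bytearray(size)

    def read(self, addr, size):
        off = addr - self.base
        return bytes(self.mem[off:off + size])

    def write(self, addr, data):
        off = addr - self.base
        self.mem[off:off + len(data)] = data


proc = FakeProcess(0x1000, 0x200)
proc.write(
    0x1000,
    b"\x80\x3d" + struct.pack("<I", 0x1100) + b"\x00"
    + b"\x74\x14"
    + b"\x83\x05" + struct.pack("<I", 0x1104) + b"\x03",
)
patched = enable_single_process(proc.read, proc.write, 0x1000, 0x100)
print(hex(patched), proc.read(patched, 1))  # prints: 0x1100 b'\x01'
```

Refusing to patch unless exactly one candidate is found keeps the script from writing to a wrong address when a new Chrome build changes the code around the fingerprint.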

Check out the finished product:

So... what have we learned?
  • Beware of compiler optimizations
  • If you cannot cause your target to execute a particular piece of code, try to emulate all the side-effects of that code
  • Don't give up! Anything and everything is possible in assembly :)
Until next time...

