Sunday, August 23, 2009

Single-Process Google Chrome

I was running out of RAM at work one day and realized that Google Chrome actually takes up a considerable amount of memory. Most of this is due to creating several renderers, sometimes resulting in one process per tab. The Chromium core supports running all renderers inside the browser process (simply pass --single-process at the command line) but unfortunately this is an unsupported and disabled mode in Google Chrome (see this message on the chromium-dev mailing list). But this wasn't going to stop me...

The first thing to do was to figure out how --single-process is implemented and how it is disabled in Google Chrome official builds. The first link in the mailing list posting points us in the right direction:

From src/chrome/app/chrome_dll_main.cc:
  bool single_process =
#if defined (GOOGLE_CHROME_BUILD)
    // This is an unsupported and not fully tested mode, so don't enable it for
    // official Chrome builds.
    false;
#else
    parsed_command_line.HasSwitch(switches::kSingleProcess);
#endif
  if (single_process)
    RenderProcessHost::set_run_renderer_in_process(true);

What we want to do is change the value of single_process after it is initialized to false but before the if. Unfortunately, here this is impossible because the compiler has optimized away that variable! This snippet of code actually does not even appear in chrome.dll since the compiler has also optimized away the call to RenderProcessHost::set_run_renderer_in_process.

We are lucky though, because single_process is never used again so if we can emulate a call to RenderProcessHost::set_run_renderer_in_process(true), we're golden. Digging deeper, that method does the following:

Google Code Search finds src/chrome/browser/renderer_host/render_process_host.h:
  static void set_run_renderer_in_process(bool value) {
    run_renderer_in_process_ = value;
  }

..with run_renderer_in_process_ a static bool in RenderProcessHost. This means we'll find it like a global variable with storage in .data.

Now, of course, we don't have symbols for chrome.dll so we have to figure out a way of finding the variable's location in memory. This can be done by looking for other references to this variable, fingerprinting those and using them to find run_renderer_in_process_. In this particular case, we need to look for the accessors of this field, which the compiler has also inlined.

There are a few references scattered across the codebase, but remember that we are looking for ones that are easy to fingerprint and unlikely to change across versions. This means uncommon constants, code sequences, etc. I chose two places: BrowserRenderProcessHost::BrowserRenderProcessHost and RenderProcessHost::ShouldTryToUseExistingProcessHost.

First, let's look at BrowserRenderProcessHost::BrowserRenderProcessHost:
BrowserRenderProcessHost::BrowserRenderProcessHost(Profile* profile)
    : RenderProcessHost(profile),
[...]
  if (run_renderer_in_process()) {
    // We need a "renderer pid", but we don't have one when there's no renderer
    // process. So pick a value that won't clash with other child process pids.
    // Linux has PID_MAX_LIMIT which is 2^22. Windows always uses pids that are
    // divisible by 4. So...
    static int next_pid = 4 * 1024 * 1024;
    next_pid += 3;
    SetProcessID(next_pid);
  }

Now this is a very nice snippet because we will have the constant (4 * 1024 * 1024) somewhere in .data and a reference to it from an ADD instruction with "3" as the second operand. This is fairly uncommon so we'll start looking for this one. (Open up IDA.) We come up with this (in chrome.dll 3.0.195.6 11,844,080 bytes):

.text:021749A8                 cmp     byte_2659CEC, 0
.text:021749AF                 jz      short loc_21749C5
.text:021749B1                 add     dword_2639A58, 3

Here run_renderer_in_process_ is at byte_2659CEC.
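
The fingerprint above can be expressed as a byte pattern. What follows is only a rough sketch, not the chromehack.py script and not tied to pydbg. It assumes the three instructions are encoded as 80 3D <addr32> 00 (cmp byte ptr [addr], 0), 74 <rel8> (jz short) and 83 05 <addr32> 03 (add dword ptr [addr], 3), which are the usual 32-bit x86 encodings for these forms. The function name and the sample buffer are made up for illustration.

```python
import re
import struct

# cmp byte ptr [addr32], 0 ; jz short rel8 ; add dword ptr [addr32], 3
FINGERPRINT = re.compile(
    rb"\x80\x3d(.{4})\x00"   # 80 3D <addr32> 00  (address of the flag is captured)
    rb"\x74."                # 74 <rel8>
    rb"\x83\x05.{4}\x03",    # 83 05 <addr32> 03  (next_pid += 3)
    re.DOTALL,               # let "." match every byte value, including 0x0A
)


def find_flag_addresses(code):
    """Return the absolute addresses compared against 0 by each fingerprint hit."""
    return [struct.unpack("<I", m.group(1))[0] for m in FINGERPRINT.finditer(code)]


# Hypothetical buffer rebuilt from the disassembly listing above.
sample = (
    b"\x90" * 3
    + b"\x80\x3d" + struct.pack("<I", 0x02659CEC) + b"\x00"
    + b"\x74\x14"
    + b"\x83\x05" + struct.pack("<I", 0x02639A58) + b"\x03"
)
print([hex(address) for address in find_flag_addresses(sample)])
```

In a real run the buffer would be the code section of chrome.dll as read from the process. A different build could also compile this to a different instruction sequence, so a miss does not prove the flag is gone.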

Now I wanted to be on the safe side and confirm this with another reference. One of the references to run_renderer_in_process_ is in RenderProcessHost itself:

From src/chrome/browser/renderer_host/render_process_host.cc:
bool RenderProcessHost::ShouldTryToUseExistingProcessHost() {
  size_t renderer_process_count = all_hosts.size();
[...]
  return run_renderer_in_process() ||
      (renderer_process_count >= GetMaxRendererProcessCount());
}

Remember that that getter is inlined. In fact, GetMaxRendererProcessCount is inlined too. Let's look at the source for that as well to get an idea of what to expect.

From src/chrome/browser/renderer_host/render_process_host.cc:
size_t GetMaxRendererProcessCount() {
[...]
  static const size_t kMaxRenderersByRamTier[] = {
    3,   // less than 256MB
    6,   // 256MB
    9,   // 512MB
    12,  // 768MB
    14,  // 1024MB
[...]
  };

  static size_t max_count = 0;
  if (!max_count) {
    size_t memory_tier = base::SysInfo::AmountOfPhysicalMemoryMB() / 256;
    if (memory_tier >= arraysize(kMaxRenderersByRamTier))
      max_count = chrome::kMaxRendererProcessCount;
    else
      max_count = kMaxRenderersByRamTier[memory_tier];
  }
  return max_count;
}

Going back to the original boolean expression in ShouldTryToUseExistingProcessHost, we expect to see a CMP on run_renderer_in_process_, followed at some point by an indexed access into kMaxRenderersByRamTier. The reason I chose this is kMaxRenderersByRamTier is very easy to find and unlikely to change across versions. Anyway, the assembly ends up looking like this:

.text:02038775                 cmp     byte_2659CEC, 0
.text:0203877C                 push    esi
.text:0203877D                 mov     esi, dword_2636D08
[...]
.text:020387A9 loc_20387A9:                            ; CODE XREF: sub_2038775+2Dj
.text:020387A9                 mov     eax, ds:dword_25CF518[eax*4]

So if we find kMaxRenderersByRamTier (dword_25CF518 here, the table indexed by eax*4), we can find the reference to it, then look in its neighborhood for a "CMP m8, 0".

And of course it would be nice if a computer could do this automatically for us :) Luckily there's an awesome Python module called pydbg that lets you write debugging scripts in Python.

Check out the finished product: chromehack.py.

So.. what have we learned?
  • Beware of compiler optimizations
  • If you cannot cause your target to execute a particular piece of code, try to emulate all the side-effects of that code
  • Don't give up! Anything and everything is possible in assembly :)
Until next time..

-Cat

Wednesday, June 10, 2009

Google Chrome Rocks! (or: Facebook Message Forensics)

There's a little trick I sometimes pull when my mouse accidentally slips on the back button while I'm writing something on a web page (don't ask). Before I show you a demo, I want to explain what's going on behind the scenes so you can understand what I'm doing.

When you navigate away from a page, behind the scenes one of two things happens: either the browser frees the memory associated with the page, or it keeps the rendered page in a cache in case you decide to revisit it. Even when the memory is freed, the data usually stays around for a little while longer, at least until there have been enough free()'s for the CRT to hand the memory back to the OS allocator. (Remember that the CRT is an allocation cache on top of the OS allocator.) Either way, if you've got a long enough pattern, you can just search for it in the process's memory space. It's crude, but it sometimes works. And here's how it's done:

First Step: Get the page's PID from Chrome's about:memory page
This is why this post is called "Google Chrome Rocks!" Normally this step is painful with Internet Explorer and Firefox because all pages are in the same (main) process, so they all become unresponsive when I debug the process. Chrome allows me to debug a single page, while still browsing in the other tabs.

(Yes, I have a lot of tabs open.)

Second Step: Fire up WinDbg and attach to that process
If you don't have WinDbg installed, use Process Explorer to suspend the process (right-click, Suspend). You could even get Visual Studio attached in the interim, but you will have to also suspend with Process Explorer during the hand-off to WinDbg.

Go to File | Attach to a Process... (F6) and choose the process:
Once WinDbg has attached, you're safe -- the process is suspended so its memory won't be reclaimed by the OS.

Third Step: Look for a pattern in what you were writing
It can be as simple as a word like "gliding". Of course, this will entirely depend on what you were writing.

For example, use the WinDbg command "s -a".

In my case, I can see that the last match is the right one. Use "db" to work your way backwards to the beginning of the string.

Remember that your string might be stored in Unicode and try different patterns.
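
To illustrate why the encoding matters, here is a small sketch that searches a raw byte buffer for the same word stored as ASCII and as UTF-16LE (the form WinDbg's "s -u" looks for, as opposed to "s -a"). The buffer is a made-up stand-in for process memory and the function name is hypothetical.

```python
def find_text(memory, text):
    """Return {encoding: [offsets]} for each way the text might be stored."""
    hits = {}
    for encoding in ("ascii", "utf-16-le"):
        needle = text.encode(encoding)
        offsets = []
        start = memory.find(needle)
        while start != -1:
            offsets.append(start)
            start = memory.find(needle, start + 1)
        hits[encoding] = offsets
    return hits


# Made-up "memory": one ASCII copy and one UTF-16LE copy of the same word.
dump = b"\x00\x00I was gliding\x00" + "gliding".encode("utf-16-le") + b"\x00\x00"
print(find_text(dump, "gliding"))
```

The same word shows up at different offsets depending on how it was stored, which is why a search in only one encoding can come back empty.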

Fourth Step: Extract the recovered string
The text is too long for "da", but WinDbg has a ".printf" command. Don't be afraid to use the manual to come up with crazy techniques like this one. Sadly, the comma is not documented.

Fifth Step: Paste into notepad

Once this is done, you can release the process from the debugger. Make sure to use Debug | Detach Debuggee so you don't kill the process.

Then simply hit the Forward button and paste your text back into the editor :)

What would be really fun is writing a driver that can freeze the entire system and search physical memory for a particular pattern. This would catch even memory freed by the OS. Unfortunately, under Windows, VirtualFree'd memory is zero'ed by a background kernel thread, so the window of opportunity could be quite small.

Until next time,

Good night.

Saturday, April 11, 2009

Missing Ingress Qdisc

If you're trying to set up shaping on your Linux box (either manually through "tc qdisc add dev ppp0 handle ffff: ingress", or with The Wonder Shaper), you may run into this error:
RTNETLINK answers: No such file or directory

It means that you haven't compiled the "ingress" queueing discipline in your kernel (either built-in or module). Here's where to find it:

  • Option name: CONFIG_NET_SCH_INGRESS
  • menuconfig: Networking support ---> Networking options ---> QoS and/or fair queueing ---> Ingress Qdisc
  • Module location: net/sched/sch_ingress.ko
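
If you have the kernel configuration as text, a quick check for the option might look like the sketch below. Where that text comes from is an assumption: many distributions ship it as /boot/config-<version>, and the .config in your kernel source tree works too. The function name is made up.

```python
import re


def ingress_qdisc_setting(config_text):
    """Return 'y' (built-in), 'm' (module) or None (not enabled)."""
    match = re.search(r"^CONFIG_NET_SCH_INGRESS=([ym])$", config_text, re.MULTILINE)
    return match.group(1) if match else None


# Sample text in the usual kernel .config style.
sample = "CONFIG_NET_SCHED=y\n# CONFIG_NET_SCH_INGRESS is not set\n"
print(ingress_qdisc_setting(sample))
```

A result of None for the running kernel's configuration would be consistent with the RTNETLINK error above.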

That is all.

Wednesday, March 04, 2009

Teach Google a Lesson!

Google is a fantastic search engine. The first thing I do in the morning is Gmail. Without Google Calendar, my life would be a mess. But sometimes, one human can do a better job than computers.

The PIC16F688 has been the focus of my attention for the last few months. In particular, I needed to read about its EUSART (aka UART) peripheral for a school project. Who do I ask? Google of course.

Sure enough, the second search result is the datasheet. But here's what Google didn't tell you: it's not the latest revision of the datasheet. This sucked particularly hard in my case because I had already printed a hardcopy of the relevant part of the documentation.

Luckily, Google has recently introduced a feature that lets me amend that. It's called SearchWiki and it rocks! I was tempted to also click on the "X" button to tell Google that that's the wrong answer to my query. I tried to tell Google what the right answer was by searching with my original keywords ("pic16f688"), plus some keywords that bring up the correct answer ("41203D"):

I clicked the green "Promote" icon, but alas, that result doesn't show up in the result page for just the original keywords ("pic16f688").

Nevertheless, SearchWiki lets me make sure that I never make that same mistake again. At least until Microchip releases a new revision of the datasheet... ;-)

-Cat

Sunday, February 15, 2009

Glassfish and Intermediate SSL Certificates

About a week ago, EngSoc bought a wildcard SSL certificate for our online properties. We've been running with self-signed certificates for a few years now and our users have had to put up with those security warnings.

One of the properties we wanted to secure with this certificate was a Glassfish installation that the engineering society uses for elections. I installed the certificate and confirmed it worked in Internet Explorer and Chrome -- at the time I didn't have Firefox installed on my computer so I didn't give it much thought.. but when I started getting reports that the warning was still popping up in Firefox, I had to investigate.

There are a couple of aggravating circumstances that caused this to happen -- Murphy must have been having a field day.
  1. Internet Explorer and Chrome access the same set of root CA certificates, as managed by the Windows Crypto API. Firefox uses its own certificate store.
  2. The server cert we bought is issued by an intermediate CA. This means that the server cert is not directly certified by the most commonly distributed root CA certs. Any server using this cert must send the intermediate cert along with the server cert, or else the browser cannot build a trust path from the server cert to one of the root CAs.
  3. For some reason, the Windows CA store happens to have the intermediate CA cert, while the Firefox store does not. For example, Windows could simply have a newer set of certs, but the reason is immaterial for this discussion. The consequence is that IE/Chrome do not show a warning, while FF does:


  4. Apache's mod_ssl can be configured to send this intermediate CA using the SSLCertificateChainFile directive. Glassfish does not have documentation for this, though it is possible. Keep reading.
First, we start off with a keystore.jks that contains your private key and CA-issued cert. Presumably you got to this point by generating a key pair, sending a certificate request and importing the certificate into your store.

# keytool -keystore keystore.jks -storepass changeit -list -v

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: https
Creation date: Jan 30, 2009
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
Owner: CN=*.engsoc.org, OU=EngSoc Project, O=Carleton University, L=Ottawa, ST=Ontario, C=CA
Issuer: CN=DigiCert Global CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Serial number: 780a3e6ee2948ad9a36c504fcf2c4c1
Valid from: Wed Jan 28 19:00:00 EST 2009 until: Tue Feb 02 18:59:59 EST 2010
Certificate fingerprints:
         MD5:  23:35:C5:7F:E8:77:55:4E:CE:47:FA:8E:18:E8:F0:9C
         SHA1: 9E:8E:E1:44:C8:02:4A:07:2B:6E:E9:59:34:B9:46:7B:56:AC:CE:E3
         Signature algorithm name: SHA1withRSA
         Version: 3

[...]

Note "Certificate chain length: 1". This is the server cert only and is not sufficient for Firefox to generate a trust path to one of its root CA certs. We can see that Glassfish indeed sends only this cert during the SSL handshake:

Our goal is to add the intermediate CA to this key entry and make the chain length be 2: the server cert and the intermediate CA cert.
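
To see why a chain length of 1 is not enough, here is a toy model of trust path building. It only follows issuer names and does no signature checking, so it is nothing like a real validator. The function and variable names are invented; the certificate names are the ones from the keytool output in this post.

```python
def build_trust_path(leaf, known_certs, trusted_roots):
    """Follow issuer names from the leaf until a trusted root is reached.

    known_certs maps a subject name to its issuer name. Returns the list of
    names on the path, or None if the path cannot be completed.
    """
    path = [leaf]
    current = leaf
    while True:
        issuer = known_certs.get(current)
        if issuer is None:
            return None            # we don't have the cert for this name
        if issuer in trusted_roots:
            return path + [issuer]
        if issuer in path:
            return None            # loop in the issuer names, give up
        path.append(issuer)
        current = issuer


roots = {"Entrust.net Secure Server Certification Authority"}
server_only = {"*.engsoc.org": "DigiCert Global CA"}
with_intermediate = dict(
    server_only,
    **{"DigiCert Global CA": "Entrust.net Secure Server Certification Authority"}
)

print(build_trust_path("*.engsoc.org", server_only, roots))
print(build_trust_path("*.engsoc.org", with_intermediate, roots))
```

With only the server cert, the walk stops at "DigiCert Global CA" because nothing says who issued it. Once the intermediate cert is available (sent by the server or already in the store), the path reaches the root.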

We first need to create an X.509 file that contains both certs. Luckily, this is as simple as concatenating the two certs:
# cat certs/DigiCertCA.crt certs/star_engsoc_org.crt > certs/DigiCertCA+star_engsoc_org.crt

And now we import them both, overwriting the cert associated with the key entry:
# keytool -keystore keystore.jks -storepass changeit -importcert -alias https -file certs/DigiCertCA+star_engsoc_org.crt -noprompt
Certificate reply was installed in keystore

Now we look at the key entry again:
# keytool -keystore keystore.jks -storepass changeit -list -v

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: https
Creation date: Feb 15, 2009
Entry type: PrivateKeyEntry
Certificate chain length: 2
Certificate[1]:
Owner: CN=*.engsoc.org, OU=EngSoc Project, O=Carleton University, L=Ottawa, ST=Ontario, C=CA
Issuer: CN=DigiCert Global CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Serial number: 780a3e6ee2948ad9a36c504fcf2c4c1
Valid from: Wed Jan 28 19:00:00 EST 2009 until: Tue Feb 02 18:59:59 EST 2010
Certificate fingerprints:
         MD5:  23:35:C5:7F:E8:77:55:4E:CE:47:FA:8E:18:E8:F0:9C
         SHA1: 9E:8E:E1:44:C8:02:4A:07:2B:6E:E9:59:34:B9:46:7B:56:AC:CE:E3
         Signature algorithm name: SHA1withRSA
         Version: 3

[...]

Certificate[2]:
Owner: CN=DigiCert Global CA, OU=www.digicert.com, O=DigiCert Inc, C=US
Issuer: CN=Entrust.net Secure Server Certification Authority, OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS incorp. by ref. (limits liab.), O=Entrust.net, C=US
Serial number: 4286aba0
Valid from: Fri Jul 14 13:10:28 EDT 2006 until: Mon Jul 14 13:40:28 EDT 2014
Certificate fingerprints:
         MD5:  FB:14:1E:91:00:CA:CB:77:D8:01:62:D8:8C:B8:84:48
         SHA1: 25:B7:8E:B9:36:A4:00:CE:34:13:1D:9A:6D:E8:BE:A0:4B:34:76:07
         Signature algorithm name: SHA1withRSA
         Version: 3

[...]

Indeed, there are now two certs for this key and Glassfish will now send them both during the SSL handshake:


Note also that, after connecting to the server once, Firefox added the intermediate CA cert (DigiCert Global CA) to its cert store:


Things get even more interesting when you have multiple servers using this cert. Our Apache server was properly configured to send both certs, so when a Firefox user went to an Apache site, Firefox would add the intermediate CA cert to its store. Subsequent visits to the Glassfish server would show no warnings. However, if someone visited the Glassfish server first, they would get the warning. This behaviour is reproducible by simply deleting the "DigiCert Global CA" cert from Firefox's store (it's under "Entrust.net").

Well, I hope this helped. It definitely solved our problem.

-Cat

Saturday, January 24, 2009

To Lock or Not to Lock?

Over the weekend, I was at CUSEC 2009 and an interesting computer science problem occurred to me. Let me know what you think.

Consider a multi-threaded application with (at least) two threads: the main thread and a message-based worker thread.

Consider also an indexed ADT and consider a synchronized iteration through the elements of this enumeration (shown for clarity in Java without loss of generality), running in the main thread:

List list = getList();
synchronized (list) {
    for (int i = 0; i < list.size(); i++) {
        T element = list.get(i);
        process(element);
    }
}

And consider a process(T element) method that sends a message to the worker thread to process this element. This worker thread, in turn, processes the element by accessing the original enumeration in a synchronized fashion. For example:

private void processInWorkerThread(T element) {
    List list = getList();
    synchronized (list) {
        // For example, get the index
        // within the list:
        // int index = list.indexOf(element);
        // setResult(index);
        respondToMainThread();
    }
}

When the enumeration is run, this algorithm will cause a deadlock, assuming process() blocks until the worker responds. The main thread is waiting for the worker thread to respond, the worker thread is waiting for the lock on the list, and the main thread will not release that lock until its loop finishes, which will never happen.
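
For anyone who wants to poke at this, here is a rough Python stand-in for the Java above. It is a sketch under assumptions: the queues, names and timeouts are all invented, and the timeouts exist only so the demo reports the stuck lock instead of hanging forever.

```python
import queue
import threading

items = ["a", "b", "c"]
items_lock = threading.Lock()
requests = queue.Queue()   # main thread -> worker
replies = queue.Queue()    # worker -> main thread


def worker():
    while True:
        element = requests.get()
        if element is None:
            break
        # The worker needs the same lock the main thread is still holding.
        if items_lock.acquire(timeout=0.2):
            try:
                replies.put(items.index(element))
            finally:
                items_lock.release()
        else:
            replies.put("worker could not get the lock")


def process(element):
    # Blocks until the worker answers, like process() in the post.
    requests.put(element)
    return replies.get(timeout=1.0)


def run_main_loop():
    results = []
    with items_lock:       # held for the whole iteration
        for element in items:
            results.append(process(element))
    return results


thread = threading.Thread(target=worker, daemon=True)
thread.start()
results = run_main_loop()
requests.put(None)
thread.join()
print(results)
```

Without the timeout on acquire(), the worker would wait on the lock forever while the main thread waits on the reply, which is the situation described above.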

Spotting a deadlock is one thing. Fixing a deadlock is another. And fixing a deadlock in a clean, future-proof, backwards-compatible way is practically an extinct species.

Clearly there is something wrong with the above design. Point out the flaws, what the options are, how you would fix them, and why you chose the solution that you chose. I welcome clarifying questions regarding the context or scope of the application.

Let the commenting begin!

Sunday, November 16, 2008

Teaching the Snake a New Trick

I learned Python about three years ago. At that time, I didn't know Python at all and took a job with a company that uses Python extensively. My job didn't actually consist of any Python work, but since I liked the company, I knew learning Python would be important for getting rehired. Python has since grown on me quite a bit, and whenever I am given a programming problem, I often find I come up with a Python solution the fastest. There are a few things that I love about it, but I'm not here to start a language war. I just taught the snake a new trick and I want to share it with you.

First, some background. I am a huge fan of the functional programming syntax that Python provides. Its for loops and list manipulation helpers are particularly natural and it's never been easier to mix and match imperative and functional programming.

One thing that I have been trying to do for a while but never actually found a Good Solution (TM) to is iterating over each consecutive pair of elements in a list. For example, if the list contains [1, 2, 3, 4], I want to execute some block of code with the pairs (1, 2), (2, 3), (3, 4).

The naive solution looks something like this:
L = [1, 2, 3, 4]
for i in range(len(L) - 1):
  a, b = L[i], L[i + 1]
  # work with a, b here

But come on, how ugly is that?

A much nicer solution uses the zip function as follows:
for a, b in zip(L[:-1], L[1:]):
  pass  # work with a, b here

This forms two lists, one with the last element chopped off and one which is missing its first element. The pairs of corresponding elements of these two lists are the pairs of consecutive elements in the original list (zip does this part). Neat, huh? Can anyone come up with an even cleaner solution?
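
Two possible answers, offered as a sketch rather than the last word. Since zip stops at the shorter input, the first slice can be dropped. And for arbitrary iterables, itertools.tee avoids copying the list at all. The helper name pairs is made up.

```python
from itertools import tee


def pairs(iterable):
    """Yield consecutive pairs (s0, s1), (s1, s2), ... without slicing."""
    a, b = tee(iterable)
    next(b, None)          # advance the second iterator by one element
    return zip(a, b)


L = [1, 2, 3, 4]

# Variant 1: zip stops at the shorter sequence, so L[:-1] is not needed.
for a, b in zip(L, L[1:]):
    print(a, b)

# Variant 2: works on any iterable, including generators.
for a, b in pairs(L):
    print(a, b)
```

The tee version is the better fit when the input is a generator or is too large to slice.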

I leave you with a quote about programmers from Tidbits from the Dungeon:
Programmers are in a race with the Universe to create bigger and better idiot-proof programs, while the Universe is trying to create bigger and better idiots. So far the Universe is winning. -- Rich Cook
Well, it's back to homework for me.

-Cat

Wednesday, November 12, 2008

Lenovo Battery Sudden Death Syndrome

I'm a proud owner of a Lenovo ThinkPad X61s notebook. I bought it just over a year ago and I've been thoroughly happy with it. Until a month ago.

On October 16, two days after the end of my battery's one-year warranty, I came home after leaving the laptop plugged in during the entire day and, without thinking twice, unplugged the AC adapter as I usually do to take the laptop to the couch. The laptop turned off instantly and would not power back up with the battery. The ThinkPad Power Manager had this to say:


With the battery connected, the battery light on the front of the laptop flashed rapidly orange (it's supposed to be flashing green when charging and solid green when finished charging). Note that this happened suddenly -- the day before this happened, I had about 4 to 5 hours of battery life as usual.

I immediately called customer support and was greeted by a polite yet very unhelpful man. He was unable to help me in any way because my warranty had expired 2 days prior. It's a little silly, but I don't blame him; he's just following company policy. It looked like I would have to figure this out on my own.

From what I had heard, one of the most common causes of failure of Lithium-based batteries is failure of one of the cells (this battery pack, for example, is made up of 8 cells). The charging circuitry then avoids charging or discharging the cell, since damaged Lithium cells can be quite dangerous (explosions, etc.). Sometimes, it's as easy as replacing that cell; failed cells typically have a very low voltage, usually lower than 1.5V.

So, what does a good reverse engineer do with a potentially repairable battery pack? He cracks it open. WARNING: DON'T DO THIS AT HOME. It could literally blow up in your face.
Here's what a ThinkPad X61s 92P1172 battery looks like inside:


At this point, it's worth noting how the cells are connected in such a battery. The 8 cylindrical cells (part number "LH7M2D8") are grouped into 4 pairs. The two cells in each pair are in parallel, while the pairs are in series. If the cell voltage is Vcell, then the total voltage Vtotal = 4 * Vcell.

I measured each of the cell (pair) voltages, and they were all approximately 3.97V. According to TI's Using NiMH and Li-Ion in Portable Applications (Figure 1), this is a normal voltage for a mostly-charged cell. In fact, the Power Manager applet showed a total voltage of 15.86V, which adds up. However, it's debatable whether the PM reading should be trusted, since, not knowing exactly how the PM works, there is a possibility that the battery is in fact damaged and the reading is stale data from when it was last healthy. Either way, my multimeter confirmed the cells have a healthy voltage (I don't know enough about Li cells to say for sure that this means the cells are completely healthy). I also haven't tried measuring their voltage under load (a small resistor), since that's fairly dangerous if not done properly and I would prefer to avoid any fires until it becomes absolutely necessary.
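
As a quick check of "which adds up", here is the arithmetic spelled out, using only the numbers quoted above. The variable names are mine.

```python
PAIRS_IN_SERIES = 4            # 8 cells wired as 4 series-connected parallel pairs
measured_pair_voltage = 3.97   # volts, multimeter reading per pair
reported_pack_voltage = 15.86  # volts, Power Manager reading

expected_pack_voltage = PAIRS_IN_SERIES * measured_pair_voltage
difference = reported_pack_voltage - expected_pack_voltage
print(round(expected_pack_voltage, 2), round(difference, 2))
```

The two figures differ by about 0.02V, which is small enough to put down to measurement error.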

And this is about where I got stuck. All signs so far point to a defective charge controller or a corrupted controller nonvolatile memory. I have tried doing the "reset battery gauge" procedure in Power Manager, but that results in my laptop hard-crashing (powering off) and nothing happens to the battery. I'm quite open to suggestions.

Thanks for reading, see you next time! :-)

Monday, November 10, 2008

Better iTap Learning

iTap (or T9 for you Nokia fans out there) is a widespread predictive text technology for cell phones, typically used when composing text messages (SMS). Each key on a cell phone's keypad is labelled with 3 (or 4) letters and, as you're pressing one number at a time, the phone tries to figure out which permutation of the sets of letters is most likely to be the word you are writing.

Of course, it always helps to integrate your own personal touch: new predictive text technologies actually learn from the words that you type. The newest versions even go so far as to record combinations of words and propose them to you when you're typing in case you want to repeat the same phrase. Very helpful stuff. But all this is done only based on what you write.

But what if the cell phone were to learn from your received text messages as well? Chances are your vocabulary and your friends' vocabularies are pretty close. This potentially doubles the amount of learning material and therefore doubles the learning speed. Also, by associating the learned information with a particular correspondent, the device can make intelligent choices about the words it proposes to you. (You probably didn't mean to text "whats up dawg" to your mom.)

Motorola, hire me so I can implement this for you.

Friday, October 17, 2008

What do IPsec and Larvae Have in Common?

This is a bit of a digression from the usual topic of this blog, but I found the problem interesting enough to mention.

I have a Linux server at home, on which I tend to keep the command-line rtorrent client running in a detached screen session. I recently noticed that my rtorrent started hanging randomly -- the process would be in the sleeping state with no noticeable CPU usage, but it was entirely unresponsive to key presses. All other processes on the system were unaffected.

The first thing I did was Google search for "rtorrent hang". Sure enough, someone had had this problem before and reported that rtorrent was hanging on the "madvise" system call. The comment had claimed that it was a kernel bug, which wasn't entirely unrealistic since madvise seems to be a fairly esoteric, rarely-used, and therefore rarely-tested system call. The post pointed to a newer version of rtorrent, but, to my disappointment, even the latest version from SVN didn't solve the problem. My rtorrent was still hanging. And on top of that, HTTP requests seemed to be broken in the newer version because of what I can only assume is some incompatibility with the relatively old version of libcurl on my machine.

I recompiled the original stable version of rtorrent, and thankfully HTTP torrent fetches and tracker requests worked again, but the hang was still there. I decided to investigate based on a clue left by the madvise post: use strace to see what syscall rtorrent was blocked in. This is what I found:
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 579
fcntl64(579, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(579, SOL_IP, IP_TOS, [8], 4) = 0
connect(579, {sa_family=AF_INET, sin_port=htons(37320), sin_addr=inet_addr("189.51.247.163")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(3, EPOLL_CTL_ADD, 579, {EPOLLOUT, {u32=139804600, u64=139804600}}) = 0
epoll_ctl(3, EPOLL_CTL_MOD, 579, {EPOLLOUT|EPOLLERR, {u32=139804600, u64=139804600}}) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 580
fcntl64(580, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(580, SOL_IP, IP_TOS, [8], 4) = 0
connect(580, {sa_family=AF_INET, sin_port=htons(57524), sin_addr=inet_addr("192.168.0.10")}, 16

"How was this possible??" I asked myself. You can clearly see that socket 579 was put in nonblocking mode and connect immediately returned "EINPROGRESS", as it should. But socket 580, which was doing virtually the same thing, was blocking in "connect." How was that possible? What was the difference?

It turns out the difference was in the destination IP address. I had almost forgotten that, prior to this hanging problem, I had set up an IPsec tunnel to my friend's network. His network is on 192.168.0.0/24, so when rtorrent was trying to connect to 192.168.0.10, my kernel was actually trying to establish an SA (IPsec tunnel) to my friend's network. A quick search for "ipsec blocking socket" quickly revealed that this behaviour is documented and configurable (http://lkml.org/lkml/2007/12/4/260). In fact, by simply "echo 1 > /proc/sys/net/core/xfrm_larval_drop", I solved the mystery of the hanging rtorrent. With this setting enabled, the kernel simply drops any packets associated with an IPsec tunnel that is in the process of being established -- instead of blocking the calling process until that tunnel is fully established.

This solution, of course, does have its downsides: when setting up or debugging an IPsec tunnel, you end up seeing packet loss without really knowing the reason. Temporarily turning xfrm_larval_drop off is probably a good idea while tweaking your IPsec configuration!
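
The check that would have saved me some time is simply asking whether a destination falls inside the tunnelled subnet. Here is a small sketch with Python's ipaddress module, using the two addresses from the strace output. The function and constant names are made up.

```python
import ipaddress

# My friend's network, reachable only through the IPsec tunnel.
TUNNELLED_NETWORK = ipaddress.ip_network("192.168.0.0/24")


def goes_through_tunnel(destination):
    """True if the destination address belongs to the tunnelled subnet."""
    return ipaddress.ip_address(destination) in TUNNELLED_NETWORK


for peer in ("189.51.247.163", "192.168.0.10"):
    print(peer, goes_through_tunnel(peer))
```

Only the second peer matches, and that is the connect() that needed an SA before any packet could leave.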

See you next time.

-Cat

Tuesday, September 23, 2008

Laundry Part 3

Let's recall from last time: we're studying an Atmel AT88SC0404C ("CryptoMemory") in smart card form factor. We can communicate with it via a serial bus compatible with a regular PC serial port. We identified it using the Answer-to-Reset string, which ISO 7816 specifies all such cards must send when brought out of reset. Now we want to poke around the card a bit, hopefully doing something like reading the card's balance.
From this point on, the AT88SC0404C datasheet is an essential part of our work. At the very least, it's important to quickly scan it to get an idea of how the device works. There's a couple of tables you'll find yourself referring to fairly often:
  • Figure 4-10 (AT88SC0104C, 0204C, 0404C Configuration Memory) on page 14;
  • Table 8-2 (CryptoMemory Asynchronous Command Set) on page 41;
  • Table 8-3 (Asynchronous Mode Return Status Definitions).

The chip has some fuses that provide chip-level protection. Let's try to read the fuse status byte:
ATR 3b b2 11 00 10 80 00 04
--> 00 b6 01 00 01
<-- (b6) 20 [90 00]

That (b6) is the INS byte (command) sent back to us, the 20 is the fuse byte, and the trailing [90 00] is the status code. According to the datasheet (page 24), this means that all fuses have been blown.
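
The exchange above can be picked apart mechanically. This sketch assumes the usual ISO 7816 T=0 layout of a five-byte header (CLA, INS, P1, P2, length) answered by the echoed INS, the data bytes, and two status bytes. The function name is invented and the bytes are the ones shown above.

```python
def parse_t0_read(command, response):
    """Split a T=0 read response into (data, status) for a 5-byte command."""
    cla, ins, p1, p2, length = command
    if response[0] != ins:
        raise ValueError("expected the INS byte to be echoed first")
    data = response[1:1 + length]
    status = response[1 + length:1 + length + 2]
    return data, status


command = bytes.fromhex("00 b6 01 00 01")
response = bytes.fromhex("b6 20 90 00")
data, status = parse_t0_read(command, response)
print(data.hex(), status.hex())
```

Here the single data byte is the fuse byte and the status bytes are the trailing [90 00].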

The AT88SC0404C has a great deal of configuration memory (256 B); let's try to dump it using the "Read Config Zone" command. Keep in mind that bytes that we don't have rights to read will appear as 20 (the fuse byte):
ATR 3b b2 11 00 10 80 00 04
--> 00 b6 00 00 f0
<-- (b6) 3b b2 ... 20 20 [69 00]


Here's the data, formatted for comparing with the configuration zone map on page 14 (and with potentially identifying data removed):
000000 3b b2 11 00 10 80 00 04
000008 40 40 ff ff ff ff ff ff
000010 69 x2 08 x2 28 x0 40 x0
000018 bf 00 00 00 x0 x5 xd xb
000020 df 08 df 08 df 58 df 58
000028 ff ff ff ff ff ff ff ff
000030 ff ff ff ff ff ff ff ff
000038 ff ff ff ff ff ff ff ff
000040 ff ff ff ff ff ff ff ff
000048 ff ff ff ff ff ff ff ff
000050 ff 5d ae 47 5a 06 51 db
000058 20 20 20 20 20 20 20 20
000060 ff ff ff ff ff ff ff ff
000068 20 20 20 20 20 20 20 20
000070 ff ff ff ff ff ff ff ff
000078 20 20 20 20 20 20 20 20
000080 ff ff ff ff ff ff ff ff
000088 20 20 20 20 20 20 20 20
000090 20 20 20 20 20 20 20 20
000098 20 20 20 20 20 20 20 20
0000a0 20 20 20 20 20 20 20 20
0000a8 20 20 20 20 20 20 20 20
0000b0 ff 20 20 20 ff 20 20 20
0000b8 ff 20 20 20 ff 20 20 20
0000c0 ff 20 20 20 ff 20 20 20
0000c8 ff 20 20 20 ff 20 20 20
0000d0 ff 20 20 20 ff 20 20 20
0000d8 ff 20 20 20 ff 20 20 20
0000e0 ff 20 20 20 ff 20 20 20
0000e8 88 20 20 20 ff 20 20 20
0000f0 end
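For anyone who wants to produce a dump in the same layout, here is a small sketch of a formatter: eight bytes per row, a six-digit hex offset in front, and a final "end" line carrying the total length. The function name is hypothetical, and the sample bytes are just the first two rows from above.

```python
def format_dump(data, width=8):
    """Render bytes as rows of `width` hex bytes prefixed by a 6-digit offset."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        lines.append(f"{offset:06x} " + " ".join(f"{b:02x}" for b in row))
    # Closing line mirrors the "0000f0 end" marker in the dump above.
    lines.append(f"{len(data):06x} end")
    return "\n".join(lines)


sample = bytes.fromhex("3b b2 11 00 10 80 00 04 40 40 ff ff ff ff ff ff")
print(format_dump(sample))
```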


Let's take a look at some of the interesting fields and their values:
  • Offset 18h - DCR = BFh
  • Offset 19h - Identification Number Nc = "00 00 00 x0 x5 xd xb"
  • Offset 20h - AR0 = DFh
  • Offset 21h - PR0 = 08h
  • Offset 22h - AR1 = DFh
  • Offset 23h - PR1 = 08h
  • Offset 24h - AR2 = DFh
  • Offset 25h - PR2 = 58h
  • Offset 26h - AR3 = DFh
  • Offset 27h - PR3 = 58h
  • Offset 50h - Reserved for Authentication and Encryption = "ff 5d ae 47 5a 06 51 db"
  • Offset E8h - PAC = 88h
The DCR is the Device Configuration Register, and the meaning of its bits is explained in detail in section 5.3.8. The conclusion is that the Write 7 password (master password) has been disabled, "Unlimited Checksum Reads" is asserted, "Unlimited Authentication Trials" is not allowed, and we are only allowed 4 incorrect password attempts before the device locks itself out in hardware, permanently. (Be careful not to "brick" your card!)

Next, we have an Identification Number that varies according to the number on the back of the card. It's not quite the same number, but the value of Nc does increase by 1 for every increment of the number on the back (i.e., it's just an offset).

The Access Registers are a bit more interesting. The AR values have PM(1:0)="11", meaning "no password", but since AM(1:0)="01", this means the cryptographic authentication protocol is in effect. "Encryption Required" is set to 1, which appears to mean deasserted. The rest of the bits aren't too interesting.

The Password Registers tell us that the device's user memory is split into two areas, accessed by different passwords/keys. The first half uses AK(1:0)="00" and POK(1:0)="00" and the second half uses AK(1:0)="01" and POK(1:0)="01".
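The field values quoted in the last two paragraphs can be pulled out of the raw bytes with a few shifts and masks. This is only a sketch: the bit positions (PM in bits 7:6, AM in bits 5:4 and ER in bit 3 of an Access Register; AK in bits 7:6 and POK in bits 5:4 of a Password Register) are an assumption that happens to reproduce the values above, so check them against the register diagrams in the datasheet before relying on them.

```python
def decode_access_register(ar):
    """Extract the AR fields discussed above (bit positions are assumed)."""
    return {
        "PM": (ar >> 6) & 0b11,   # password mode
        "AM": (ar >> 4) & 0b11,   # authentication mode
        "ER": (ar >> 3) & 1,      # "Encryption Required" bit
    }


def decode_password_register(pr):
    """Extract the PR fields discussed above (bit positions are assumed)."""
    return {
        "AK": (pr >> 6) & 0b11,   # authentication key selector
        "POK": (pr >> 4) & 0b11,  # program-only key selector
    }


# Offsets 20h..27h from the dump: AR0 PR0 AR1 PR1 AR2 PR2 AR3 PR3
registers = bytes.fromhex("df 08 df 08 df 58 df 58")
for zone in range(4):
    ar, pr = registers[2 * zone], registers[2 * zone + 1]
    print(zone, decode_access_register(ar), decode_password_register(pr))
```

Zones 0 and 1 should print AK and POK of 0, and zones 2 and 3 should print 1, which matches the two-halves split described above.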

The next part appears to be "Cryptograms" (section 5.3.12). This is most likely part of the cryptographic authentication protocol. The first page of the datasheet states that the chips use a "64-bit Mutual Authentication Protocol" under license from ELVA. If you look them up, they appear to be a French company that has filed patents on the topic with the US Patent Office: Method of enabling a server to authorize access to a service from portable... The patent makes reference to various cryptograms that get shuffled around to validate the identity of the device and/or host. Of course, the patent omits the details truly valuable to an implementation or for cryptanalysis. (Okay, fine, I didn't read it all. If anyone finds any juicy details, please post!)

The very last field we can look at is the Password Attempts Counter for the Write 7 password. Because I attempted to validate the Write 7 password a couple times, this counter has decreased to 88h. This means I only have one attempt before the card locks itself out completely. Ouch! Don't do this at home!
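If you'd rather have the computer tell you how close you are to the edge, here is a tiny sketch. It assumes the counter drops one set bit per nibble on every failed attempt (ff, ee, cc, 88, 00). That pattern fits the 88h above and the 4-attempt limit from the DCR, but it is an assumption to verify against the datasheet, not something established here.

```python
def attempts_remaining(pac):
    """Count the set bits left in the low nibble of a Password Attempts Counter.

    Assumed encoding: each failed attempt clears one more bit per nibble,
    so ff -> 4 left, ee -> 3, cc -> 2, 88 -> 1, 00 -> locked.
    """
    return bin(pac & 0x0F).count("1")


print(attempts_remaining(0x88))
```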

Well, that's about it for now. I ordered a kit from Atmel that includes all the libraries, in binary format, needed to use the cryptographic authentication protocol. When I have something new to report, rest assured you'll be the first to know. Until then, 73.

-Cat


Saturday, July 12, 2008

Laundry, Semi-part 2b

To all my faithful readers:

It has been far too long since I've posted here. Unfortunately, my blog tends to take a rather low priority for me, and Real Life has been knocking on my door pretty insistently the past few months. Nonetheless, I will make an effort to post a bit more often and keep you posted on my techy adventures.

See you in a bit and don't give up on me!

-Cat

Thursday, January 24, 2008

Laundry Part 2

Here's the deal. We've got what looks like an ISO 7816 smartcard. It's used for "laundry," something most of us geeks reject as part of the alternate universe we like to call the "Real World." I digress. We're trying to communicate with this smartcard, hoping to unlock its secrets...

Let's start with a review of the electrical signals defined by the ISO standard.


VCC and Ground are pretty straightforward. Clock must be provided to the smartcard, TTL-level and on the order of 3 MHz. Input/Output is a bidirectional data pin; the protocol determines whether the host or the smartcard is driving this line at a given instant. The Reset signal is active-low.

We'll be connecting the I/O pin to a MAX-232 level shifter to convert the TTL level to the PC's RS-232 levels (and vice-versa). From the PC side, you have separate signals for transmit and receive; I connected the I/O pin directly to the PC's receive; the PC's transmit is spliced in with a 1 kOhm resistor. The resistor should limit the current in case something goes wrong, and in normal operation, the current is low enough that the voltage drop across the resistor isn't high enough to disturb data sent from the PC.

Now let's talk a bit about the protocol on this I/O pin. Right after reset, the card sends a block of data called the Answer to Reset (ATR); this is documented in section 2.3.4 of the standard. The baud is specified as being the input clock frequency divided by exactly 372. The other parameters are: 8 bits, even parity and one stop bit.
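To make "8 bits, even parity and one stop bit" concrete, here is a small sketch of how a single byte would be laid out on the wire as a PC UART sees it: a start bit, eight data bits (least significant first, as UARTs normally send them), a parity bit chosen so the count of ones is even, and a stop bit. The function names are hypothetical.

```python
def even_parity_bit(byte):
    """Return the bit that makes the total number of 1s (data + parity) even."""
    return bin(byte & 0xFF).count("1") % 2


def frame_8e1(byte):
    """Return the bit sequence for one byte: start, 8 data bits LSB first, parity, stop."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [even_parity_bit(byte)] + [1]


# 0x3b is the first byte of the ATR
print(frame_8e1(0x3B))
```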

Assuming we want to work at a baud of 9600 (fairly typical baud for PC serial ports), this means we need an input frequency of 3.5712 MHz. While there are crystals out there that provide this frequency, I don't have one and I don't have a signal generator either. So I had to improvise.
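The arithmetic is just the divide-by-372 rule run backwards. A quick sketch (the constant and function names are mine):

```python
CLOCKS_PER_BIT = 372  # card clock cycles per bit during the ATR


def required_clock_hz(baud):
    """Card clock frequency needed for the ATR to come out at `baud`."""
    return baud * CLOCKS_PER_BIT


for baud in (1200, 9600):
    print(f"{baud} baud needs {required_clock_hz(baud) / 1e6:.4f} MHz")
```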

I had an Altera UP2 development board with a 25.175 MHz crystal lying around. This is an educational board with a CPLD and an FPGA (programmable logic chips); I also happened to have a working copy of Quartus usable to create designs for the chips. I basically used a binary counter as a frequency divider from the main 25.175 MHz clock. In the end, it looked something like this:

Don't be deceived by the large board; it's just an oversized clock generator.

At first, I tried using a divisor of 7, which works out to a baud of (25.175 MHz / 7 / 372 = ) 9668. This is really close to 9600, and most serial port receivers tolerate a certain margin of error, but 9668 turned out to be too far off; data became garbled after the first few bytes. If I settled for an I/O baud of 1200 bps, the required clock for the smartcard would be only 0.4464 MHz. With a divisor of 56, I would get a baud of 1208 bps, which was close enough for the serial port. I was able to get an ATR:
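Here is the same divider math as a sketch, handy if you want to try other divisors or another crystal. The names are hypothetical; the 25.175 MHz crystal and the divisors 7 and 56 are the ones from this post.

```python
CRYSTAL_HZ = 25_175_000  # UP2 board crystal
CLOCKS_PER_BIT = 372     # card clock cycles per bit during the ATR


def achieved_baud(divisor, crystal_hz=CRYSTAL_HZ):
    """Baud the card ends up using when its clock is crystal_hz / divisor."""
    return crystal_hz / divisor / CLOCKS_PER_BIT


def error_percent(actual, target):
    """Relative deviation of the actual baud from the target, in percent."""
    return (actual - target) / target * 100


for divisor, target in ((7, 9600), (56, 1200)):
    actual = achieved_baud(divisor)
    print(f"/{divisor}: {actual:.0f} baud, "
          f"{error_percent(actual, target):+.2f}% vs {target}")
```

Note that both divisors land about 0.7% above their target in relative terms, since 56 is 7 times 8 and 1200 is 9600 divided by 8.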

atr: read 8 bytes: atr: read 8 bytes: 3b b2 11 00 10 80 00 04

A quick Google search for this hex string quickly uncovered the identity of this smart card:

3B B2 11 00 10 80 00 04
Atmel memory card AT88SC0404C
http://www.atmel.com/dyn/resources/prod_documents/doc5210.pdf

Aha! Luckily, the datasheet is fairly explicit regarding the command set of the chip. But you'll have to wait until next time to see what happened when I started poking commands at it.

Tuesday, January 22, 2008

Laundry

Nowadays, it seems like real money (in the sense of cash) barely even exists anymore. We have debit cards, credit cards, PayPal accounts, wire transfers... And in a sense, this is much more convenient than hiding 35 grams of gold under your pillow. On the other hand, as good as one may be at staring contests, it's a lot easier to convince a smart card that you put money on it than to convince 35 grams of gold that it's really 45 grams.

The laundry machines at my apartment building use SmartCity smart cards. There is a refill machine that takes debit (aka Interac) as well as credit cards. Here's what the cards look like (and the transcribed text for the benefit of search engines and visually impaired readers):


"SmartCity

Smart cards by Coinamatic

Canada's Most Trusted Name in Apartment Services™"


"Please treat this card like cash. The value on this card will not be replaced if the card is lost, stolen, destroyed, or altered. Use of this card constitutes acceptance of the terms and condition stated in the SmartCity® Resident Card Information section on http://www.coinamatic.com/
Questions? 1-800-561-1972 ou customerservices@coinamatic.com"

On the back, in the bottom-left corner, is what looks like a 7-digit numeric serial number.

The electrical contacts you can see on the front side (first photo) are the typical ISO 7816 physical interface. Most (if not all) of these cards also obey the electrical interface and protocol defined by the same standard. Luckily for us, this means all we need to communicate with them is a clock generator, an RS-232 level shifter (MAX232) and a regular PC serial port.

Stay tuned for more details on what happened when I hooked the card up to my PC! For now, I've got some homework to do.

-Cat

Sunday, January 20, 2008

Back in Black

Yesterday, I came back from Montreal after having attended my 3rd CUSEC. I have attended the conference every year since my first year in Software Engineering and I have to say that I was very happy this year to see a continued commitment to high quality talks and a friendly and fun atmosphere all around.

I thoroughly enjoyed most of the keynotes, but Jeff Atwood's talk was particularly motivating to me. I remembered I had started this blog a long time ago, and abandoned it (alas, for this is the destiny of so many of my projects); Jeff reminded me that I did indeed have something to say to the world.

I'm a Software Engineer by University program; a versatile programmer by experience and a hacker at heart. It's hard to keep me from reverse engineering just about any piece of technology that happens to drop on my lap. I happily drop from the virtual world of ones and zeroes and get my hands dirty with my soldering iron.

As for most geeks, my home page speaks of me better than I can: http://vv.carleton.ca/~cat/

See you around the blogosphere!

Wednesday, December 29, 2004

Multimedia PHP Album Tutorial

...by me!

Take a look, feedback is highly appreciated:
PHP Album Tutorial

Sunday, November 14, 2004

Castles Made of Sand

Holy #$@* Jimi Hendrix was a genius!

Friday, November 12, 2004

Know Your Enemy

Come on
Yes I know my enemies
They're the teachers who taught me to fight me
Compromise
conformity
assimilation
submission
ignorance
hypocrisy
brutality
the elite
All of which are American dreams(8x)

- Rage Against The Machine - Know Your Enemy

Tuesday, October 26, 2004

Hookers and services

<Technobabble>
No, this post is not related to prostitution. It is, in fact, related to Windows hooks - just as bad. Well, no, not nearly.. Anyway!

First, I'd like to mention that rattle's article on Systemwide Windows Hooks without external DLL is what first sparked my interest in this. Thanks!

I've mainly been experimenting with WH_KEYBOARD and WH_KEYBOARD_LL. As mentioned by MSDN, the NT-specific WH_KEYBOARD_LL delivers notifications in the process that set the hook. This is especially interesting because it means no external DLL need be created. On the other hand, it makes you wonder about the numerous context switches that will happen upon keyboard input.

The second point I want to explore is exactly that - the delivery of notifications. As far as I can see, hooks use some kind of Windows messages. If, after hooking, the thread is Sleeping or SleepExing (to put the thread in an alertable wait state), hooks are not called. If, however, it is in a GetMessage or PeekMessage-based message pump, it processes messages successfully. What's more, MsgWaitForMultipleObjects returns WAIT_OBJECT_0+nCount (indicative of a message queued in the thread's message queue) when a hook is about to be called.

The strange behavior in this entire system is that, given an application which does not have any windows or other message sources than the hooks, GetMessage never returns! In fact, PeekMessage returns FALSE and never returns a valid message in *lpMsg. The hook procedure gets called, it appears, from inside them (in GetMessage and MsgWaitForMultipleObjects). I always suspected there was much more to GetMessage than meets the eye!

So, take all this, and throw most of it away. This only happens for WH_KEYBOARD_LL and WH_MOUSE_LL (and possibly WH_JOURNALRECORD and WH_JOURNALPLAYBACK). All other hooks seem to still require an external DLL. But wait... My original goal was to hook when the process was registered as a service. Things start to get even weirder in this case.

The service (it may be worth mentioning that it is set as an "interactive service" - interaction with the desktop is permitted) seems to be able to use WH_KEYBOARD, but still to a very restricted extent. (Normal processes simply didn't have their hook procedure called at all.) I observed that the hook procedure is called when the events are being delivered to a console window. As soon as events are received by other windows, the hook stops functioning, and switching back to a console window does not "re-enable" it. (Note that the message processing is still necessary for services.)

Soon to come... Hooking in other desktops and window stations. Probably yet another ugly beast, but I love it. I love it all.
</Technobabble>

Sunday, October 24, 2004

*chemistry nerd here*

Outrageously cool:
http://antoine.frostburg.edu/chem/senese/101/electrons/faq/orange-streetlights.shtml
.. especially "a nightmarish, monochromatic black-and-yellow effect". I'd like to see this.