This post is from waaaaay back in the first COVID-19 lockdown. I never got round to converting it from .org format and actually posting it, but I’m starting to look at doing something with what I discovered (possibly making a Flipper Zero Zigbee sniffing module), so I’m going to get on and publish this now.
The biggest problem you might have when pursuing your own unique brand of technological freedom is that you might end up going deeper on it than you thought you would, and you’ll learn a whole shitload but not actually produce anything.
So obviously in pursuit of having my own brand of smart-home sans the completely unnecessary yet somehow synonymous data-mining, I’ve finally managed to install a fork of Micropython on a smart switch and it’s only taken me FUCKING HOURS.
The impotent rage of a man who chooses his own problems (like wanting to be able to control the lights from magic rituals).
In all seriousness though, in this one I got to do the following for the first time:
- Debug with gdb over the J-Link
- Build and flash Micropython
- Hack code on to something not explicitly designed for it
- Use vendor SVD files to get easy access to peripherals
- Write a gdb Python plugin
- Use the hardfault registers to diagnose a problem
- Patch memset
Ikea have their own range of smart-home kit, called Tradfri. It’s super cheap, it doesn’t appear to phone home unnecessarily according to more diligent people, and most importantly, others have already hacked on it and shown that things like UART and SWD are easily accessed.
One of the projects that I quickly found and dove into was Trammell Hudson’s [[https://trmm.net/Ikea][Ikea hacking lightning talk]] along with his fork of Micropython for the SiliconLabs EFR32 and partially implemented Zigbee stack.
My plan here was to buy one of the super cheap smart switches along with a couple of the peripherals (a plug and a bulb), then hack the switch so that it could be wired up over serial to a Raspberry Pi and send instructions over Zigbee to the peripherals. That would let me write my own HTTP endpoint and have complete control over the devices without them having to go near the internet (or without me having to install an app).
I’m increasingly under the impression that it’s a pretty specious assumption to think it could all Just Work like this within any reasonable timescale, but as you’ll see I have at least learned a lot.
Getting set up was pretty easy. Using Trammell’s photos and instructions, I quickly:
- got everything set up and the firmware built
- managed to get a pretty neat setup whereby I was using an FTDI232 to not only interact with the thing over serial, but also to sub out the coin cell and supply 3.3V to the MCU
- hooked up my J-Link EDU Mini running a gdb server
- loaded up the file and set up a breakpoint for the HardFault_Handler function
- flashed the code and reset the MCU…
Aaaaand much to my surprise we pretty much instantly hit the HardFault_Handler breakpoint.
This led to me googling and almost instantly finding [[https://interrupt.memfault.com/blog/cortex-m-fault-debug][this fantastic article on debugging hardfaults on Cortex-M devices]]. It teaches you all about what peripherals are available by default on a Cortex-M which you can use to find out why your code failed.
I spent quite some time trying to figure out why the SVD files, which give the hex addresses for the peripherals some slightly more friendly names and convenience functions, didn’t seem to have the CFSR register that I needed to debug. It turns out that most vendors don’t include the core ARM peripherals in their machine-readable definitions, and more infuriatingly ARM won’t provide these either.
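For what it’s worth, on ARMv7-M cores (Cortex-M3/M4/M7) the fault registers sit at fixed, architecture-defined addresses in the System Control Block, so a missing SVD entry can be papered over with a hand-written table. The following is only a minimal sketch of that idea: the names FAULT_REGISTERS, examine_command and split_cfsr are made up for illustration and aren’t part of any SVD tooling.

```python
# Fault registers in the System Control Block on ARMv7-M (Cortex-M3/M4/M7).
# Each entry maps a name to (address, gdb unit size: b=byte, h=halfword, w=word).
# The table name and helpers below are illustrative, not from any library.
FAULT_REGISTERS = {
    "CFSR":  (0xE000ED28, "w"),  # whole Configurable Fault Status Register
    "MMFSR": (0xE000ED28, "b"),  # byte 0 of CFSR: MemManage faults
    "BFSR":  (0xE000ED29, "b"),  # byte 1 of CFSR: bus faults
    "UFSR":  (0xE000ED2A, "h"),  # upper halfword of CFSR: usage faults
    "HFSR":  (0xE000ED2C, "w"),  # HardFault status
    "MMFAR": (0xE000ED34, "w"),  # MemManage fault address
    "BFAR":  (0xE000ED38, "w"),  # bus fault address
}


def examine_command(name: str) -> str:
    """Build the gdb `x` command that dumps one register in hex."""
    address, size = FAULT_REGISTERS[name]
    return f"x/1x{size} {address:#010x}"


def split_cfsr(cfsr: int) -> dict:
    """Split a raw 32-bit CFSR value into its three sub-registers."""
    return {
        "MMFSR": cfsr & 0xFF,
        "BFSR": (cfsr >> 8) & 0xFF,
        "UFSR": (cfsr >> 16) & 0xFFFF,
    }


if __name__ == "__main__":
    for register in FAULT_REGISTERS:
        print(register, "->", examine_command(register))
```

Pasting the printed commands into a gdb session is enough to read the registers by hand, which is roughly what I was doing before automating it.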
To save myself some mental cycles I quickly threw together the following based on some simple examples online:
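import gdb

# BFSR is the second byte of CFSR (CFSR itself starts at 0xE000ED28)
BFSR_ADDR = "0xE000ed29"
BFSR_LEN = "b"  # gdb unit size: a single byte
# Bit names, index 0 is the least significant bit
BFSR_BITS = [
    "IBUSERR",
    "PRECISERR",
    "IMPRECISERR",
    "UNSTKERR",
    "STKERR",
    "LSPERR",
    "Reserved",
    "BFARVALID"
]


class CFSR(gdb.Command):
    def __init__(self):
        gdb.Command.__init__(self, "cfsr", gdb.COMMAND_DATA)

    def _get_bits(self, address, length):
        # `x/bt ADDR` prints "ADDR:\t<binary digits>"; keep only the digits
        bits = [bit for bit
                in gdb.execute(
                    f"x/{length}t {address}", True, True
                ).split("\t")[1].strip()]
        # gdb prints the most significant bit first; flip so index 0 is bit 0
        bits.reverse()
        return bits

    def invoke(self, args, from_tty):
        cmd = str(args).split(" ")
        if cmd[0] == "bfsr":
            bits = self._get_bits(BFSR_ADDR, BFSR_LEN)
            for i in range(len(bits)):
                gdb.write(f"{BFSR_BITS[i]}: {bits[i]}\n")
            gdb.flush()
            return


# Sourcing this file from gdb registers the `cfsr` command
if __name__ == "__main__":
    CFSR()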
I’d already determined by figuring this stuff out manually that the BFSR register contained the information I needed, and I just needed a more intelligible way to be able to view the output without getting into the weeds of a full implementation:
(gdb) c
Continuing.
Breakpoint 3, HardFault_Handler () at main.c:236
236 uart_str("!!!!!\\r\\n");
(gdb) cfsr bfsr
IBUSERR: 0
PRECISERR: 1
IMPRECISERR: 0
UNSTKERR: 0
STKERR: 1
LSPERR: 0
Reserved: 0
BFARVALID: 1
This roughly translates to:
- PRECISERR :: We know exactly which instruction caused the hardfault
- STKERR :: The fault happened while pushing registers onto the stack on exception entry, so the stack was probably pointing at invalid memory
- BFARVALID :: The BFAR register contains the memory address that caused the problem
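The same decoding can be done outside gdb if all you have is the raw byte. The output above corresponds to a BFSR value of 0b10010010 (0x92), and a minimal sketch of picking out the set flags looks like the following. The decode_bfsr helper is hypothetical; only the bit names come from the plugin above.

```python
# Bit names for BFSR, index 0 is the least significant bit
BFSR_BITS = [
    "IBUSERR",
    "PRECISERR",
    "IMPRECISERR",
    "UNSTKERR",
    "STKERR",
    "LSPERR",
    "Reserved",
    "BFARVALID",
]


def decode_bfsr(value: int) -> list:
    """Return the names of the flags set in a raw BFSR byte."""
    return [
        name
        for bit, name in enumerate(BFSR_BITS)
        if (value >> bit) & 1 and name != "Reserved"
    ]


if __name__ == "__main__":
    # 0x92 is the value shown by `cfsr bfsr` in the session above
    print(decode_bfsr(0x92))
```

This is handy when the value comes from somewhere other than a live debug session, such as a fault handler that prints the register over UART.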
This was pretty helpful because checking an actual backtrace appeared to be of little use - it didn’t seem to make sense.
The address in BFAR was just below the stack at the top of ROM, so this was a pretty good smoking gun, and matched almost exactly the example given in the article.
By setting a watchpoint on memory just above the bottom of the stack, I was able to see the following:
...
#18 0x00015804 in memset (s=0x21000f7c, c=c@entry=0,
n=n@entry=132) at ../../lib/libc/string0.c:86
#19 0x00015804 in memset (s=0x21000f7c, c=c@entry=0,
n=n@entry=132) at ../../lib/libc/string0.c:86
#20 0x00015804 in memset (s=0x21000f7c, c=<optimized out>, n=132)
at ../../lib/libc/string0.c:86
#21 0x0001e742 in RADIO_SeqInit ()
#22 0x0001f8bc in GENERIC_PHY_RACConfig ()
#23 0x0001f900 in GENERIC_PHY_Init ()
#24 0x0001d24c in RFHAL_Init ()
#25 0x0001bc8c in RAILCore_Init ()
#26 0x00014efe in radio_init () at radio.c:347
#27 radio_init () at radio.c:317
#28 0x000145c4 in main (argc=<optimized out>,
argv=<optimized out>) at main.c:92
(gdb)
This is pretty frustrating because the problem appears to be right in the middle of a binary blob, and my reverse-engineering skills aren’t up to much, but notably in stepping through from that point, the memset frames on the stack just keep on iterating upward.
I emailed Trammell about this and he said that he’d occasionally seen similar but not gone too deep on the debugging (which, quite frankly, made me feel like I was doing a good job), but helpfully asked if I’d tried just commenting the radio code out.
Trying this out it transpires that even on hitting the gc_init code (setting up the heap for use as garbage collected memory for Micropython, or so I assume) it shits the bed at memset again.
Reading the implementation and not being particularly au fait with scarce-resource optimizations I didn’t really know what to make of this:
void *memset(void *s, int c, size_t n) {
    if (c == 0 && ((uintptr_t)s & 3) == 0) {
        // aligned store of 0
        uint32_t *s32 = s;
        for (size_t i = n >> 2; i > 0; i--) {
            *s32++ = 0;
        }
        if (n & 2) {
            *((uint16_t*)s32) = 0;
            s32 = (uint32_t*)((uint16_t*)s32 + 1);
        }
        if (n & 1) {
            *((uint8_t*)s32) = 0;
        }
    } else {
        uint8_t *s2 = s;
        for (; n > 0; n--) {
            *s2++ = c;
        }
    }
    return s;
}
Honestly? My lockdown brain can’t comprehend that, and certainly not how it seemed to result in recursion behaviour. This was crossing a couple of areas of expertise and none of them dovetail with mine - however git log -p -- lib/libc/string0.c was very kind to me and let me know the current implementation was a performance optimization.
That meant there was an earlier implementation I could check out!
I copied the old code out from before that commit:
void *memset(void *s, int c, size_t n) {
    uint8_t *s2 = s;
    for (; n > 0; n--) {
        *s2++ = c;
    }
    return s;
}
YEAH BOIIIII I’m smart enough to understand that!
I uncommented the radio code in order to see if it all still caught fire and hey presto, not only did it not hardfault but I got a REPL over the serial console!
Honestly, fucked if I know. From the way it kept jumping back into memset, it feels like it was perhaps accidentally rewriting the stack pointer over and over again…
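One thing that can at least be checked on paper is whether the two versions write the same bytes. The call in the backtrace (s=0x21000f7c, c=0, n=132) is 4-byte aligned, so the optimized version takes its aligned branch: 132 >> 2 = 33 word stores and no leftover tail. Below is a rough Python model of both control flows. It is a sketch only: a bytearray stands in for RAM, the function names are mine, and it says nothing about alignment faults, what the compiler emits, or anything else hardware-related.

```python
def memset_bytewise(mem: bytearray, start: int, c: int, n: int) -> None:
    """Model of the older implementation: one byte store per iteration."""
    for i in range(n):
        mem[start + i] = c & 0xFF  # C truncates the int to uint8_t


def memset_optimized(mem: bytearray, start: int, c: int, n: int) -> None:
    """Model of the newer implementation's control flow."""
    if c == 0 and (start & 3) == 0:
        pos = start
        # n >> 2 word-sized stores of zero
        for _ in range(n >> 2):
            mem[pos:pos + 4] = b"\x00" * 4
            pos += 4
        # optional halfword tail
        if n & 2:
            mem[pos:pos + 2] = b"\x00" * 2
            pos += 2
        # optional single byte tail
        if n & 1:
            mem[pos] = 0
    else:
        memset_bytewise(mem, start, c, n)


def same_result(start: int, c: int, n: int, size: int = 256) -> bool:
    """Run both models on identical buffers and compare every byte."""
    a = bytearray(b"\xAA" * size)
    b = bytearray(b"\xAA" * size)
    memset_bytewise(a, start, c, n)
    memset_optimized(b, start, c, n)
    return a == b


if __name__ == "__main__":
    # 0x0f7c mirrors the low bits of the pointer seen in the backtrace
    print(same_result(start=0x0F7C, c=0, n=132, size=0x1100))
```

If the model is a fair reading of the C, the two versions agree on what ends up in memory, which would suggest the difference lies in how the loop gets compiled or executed rather than in the logic itself. That’s a hunch, not a diagnosis.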
I should probably debug this and see if it’s worth trying to fix properly in the original implementation, but I also should actually try to do the thing I originally set out to do.
I’ll update here when I decide which to do - I’m already taking time out from that to write it all down!