I learned about the Automatic Packet Reporting System (APRS) in college, and thought it was the coolest thing since sliced bread. You mean I could type out a short text message and send it to someone in another state with a handheld radio? And even send TCP packets? Holy cow, how cool would it be for someone to run a BBS over a 1200 baud modem again?
…of course I found out about all this about a decade late. In 2009, cell phones had all but rendered this technology redundant. It didn’t help that the only way to get it up and running on Linux involved archaic config files, and a mix of several different kernel and user space applications. Ugh.
Even still, I wanted to write my own implementation. I didn’t have confidence in the existing implementations, as I’d hear my radio squawk up, but no packet. It seemed like there was some really low hanging fruit to be had as far as optimizing the receiver!
…of course, I was beaten to the punch, again by about a decade. WB2OSZ wrote a piece of software called Direwolf that solved the problem SO WELL, it almost made it pointless to even try. His extensive documentation compared every popular APRS implementation to his, and showed how Direwolf ran circles around them.
As a common denominator, WA8LMF recorded about an hour of APRS traffic and put it online. Some implementations only pulled in 600 packets, whereas Direwolf pulls in well over 1000. (This would serve as the “gold standard” for testing my implementation as well.)
On top of that, his clear documentation made it stupid simple to buy a $50 Raspberry Pi and a $20 push-to-talk radio, and set up a node in a few hours. Legitimately, Direwolf is the king of APRS, and his labor of love was a great service to the ham community. I’m not going to be able to compete with that.
And yet, some 700 lines of C later, liquidwolf is now a thing. (It probably shouldn’t be.) I thought you’d enjoy reading a few things about how I went about implementing this. I’m hoping to split this up into a few articles highlighting different aspects of the code, but right now it’s fitting to just talk about how it started.
Sadly, engineering is a fractal. I want to go into every little detail, but if I did, I wouldn’t get important things like laundry done, or a proper dinner. (Currently, I’ve eaten four raw hot dogs today.) I’m going to assume you know a little bit about signal processing, and APRS. Like the code I’ve written, this post is going to be pretty rough.
Eating an Elephant Sandwich: Where to start?
When I found Liquid DSP a few years back, I knew that would be the heart of my program. While GNU Radio made it dead simple to prototype and experiment, I wanted to write a C implementation myself. It may be the pot calling the kettle black, but Direwolf’s code base is a little messy. I wanted something simple and straightforward, so offloading all the signal processing to another library would save me the headaches and cut down on code.
The hardest problem was knowing where to start. I have a bad habit of sharpening the axe for far too long, and never getting to the point of cutting down trees. One thing I’ve learned from my coworkers the past few years is to just sit down, and start writing. The code won’t be perfect, just get something to run. Work on tidying it up later.
With that in mind, I opened up Vim and started to write main.c. My primary goal was to get the infrastructure up to read and process a WAV file. Initially, I started to write a simple module to parse the header, but that proved to be too much overhead. After twenty minutes of writing code, I scrapped the idea for something stupid simple. I’d read in packed floats from a binary file.
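For a sense of how little code the packed-float route needs, here is a minimal sketch. It is not the code from liquidwolf: the name read_packed_floats is made up for illustration, and it assumes the file holds nothing but native-endian 32-bit floats with no header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read every float from a headerless file of packed,
 * native-endian 32-bit floats. Returns a malloc'd buffer (caller frees) and
 * stores the sample count in *count, or returns NULL on failure. */
float *read_packed_floats(const char *path, size_t *count)
{
    assert(path != NULL && count != NULL);
    *count = 0;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    float *buf = malloc(cap * sizeof *buf);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t got;
    /* fread returns the number of whole floats read; 0 means EOF or error */
    while ((got = fread(buf + len, sizeof *buf, cap - len, fp)) > 0) {
        len += got;
        if (len == cap) {               /* buffer full: double it */
            float *grown = realloc(buf, 2 * cap * sizeof *buf);
            if (grown == NULL) {
                free(buf);
                fclose(fp);
                return NULL;
            }
            buf = grown;
            cap *= 2;
        }
    }

    fclose(fp);
    *count = len;
    return buf;
}
```

The trade-off is that the sample rate and channel count live in your head rather than in the file, which is fine for a first prototype and annoying for everyone else.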
The first revision of code, which wasn’t even dignified with a commit, didn’t check for NULL after malloc(). It didn’t free memory. It didn’t have command line arguments. It didn’t have any clever code. It just read samples, shoved them through a few crudely constructed filters, and wrote them out to another file.
Signal processing code is messy. You’ve got a metric ton of variables scattered all over the place with varying scopes. Some are global. Some are local to the function, some only to the loop. Some are re-used for little fiddly bits where it’s painful to make another “scratch” variable. You have four indexes to different arrays, all named “i, j, jj, j2”, and this weird variable called “temp_tmp” that you’re not sure what it does, or if it’s safe to change.
You want there to be some sort of structure and sanity to your code, but there’s no way to cleanly pack 47 variables in a monolithic function. Here’s a quick run down of a few options I had.
Hopefully you can see how messy this is going to get. It seems there are two strategies for variable scope:
- Put all your variables at the top of the function, so as to be able to judge stack size for function calls.
- Put all your variables closest to where they’re used to enhance readability.
I normally prefer the first approach, but for DSP that seems to be a deal breaker, especially with loop variables. The code is easiest to read as a large monolithic function (as I’ll explain below), but putting all the variables at the top makes analyzing scope tricky, and encourages dangerous re-use. These were the rules I came up with:
- Scope all “naming” variables with anonymous braces to prevent re-use later. (Not yet implemented!)
- All loop variables MUST be created in the for statement, C99 style.
- All other variables should be declared at the point where they are created, to assist with legibility and provide hints at scope.
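To make those rules concrete, here is a small sketch of what they look like in practice. The function block_energy_no_dc is a made-up example, not something from liquidwolf; it just removes the DC offset from a block and returns the remaining energy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example of the scoping rules: strip the DC offset from a
 * block in place, then return the energy of what is left. */
float block_energy_no_dc(float *x, size_t n)
{
    assert(x != NULL);
    if (n == 0)
        return 0.0f;

    /* Rule 1: scratch values live inside anonymous braces. */
    {
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++)   /* Rule 2: index declared in the for */
            sum += x[i];

        float mean = sum / (float)n;     /* Rule 3: declared where it is created */
        for (size_t i = 0; i < n; i++)
            x[i] -= mean;
    }
    /* sum and mean no longer exist here, so they cannot be reused by accident */

    float energy = 0.0f;
    for (size_t i = 0; i < n; i++)
        energy += x[i] * x[i];
    return energy;
}
```

Each loop gets its own i, and the compiler rejects any later reference to sum or mean, which is exactly the kind of re-use the rules are meant to prevent.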
Staying “clean” while writing dirty code
DSP code roughly functions like a pipeline. Samples from your signal flow in at one point, and trickle down to the end. It’s helpful to be able to view the signal at any point in the pipeline, but this is harder to accomplish than it seems. (At least, with five minutes of code. I really didn’t want to have to learn how to use gnuplot!) Fortunately, I discovered that Audacity can import a file of packed floats. You just have to tell it how many “channels” you have!
However, some blocks don’t maintain a 1:1 ratio of samples-consumed to samples-output. What makes things worse, this relationship isn’t always integer numbers, or even constant! (My clock recovery block consumes approximately 4 samples, give or take, before outputting one sample!) To keep the samples lined up, I simply wrote dummy “0” values to the file if the code wouldn’t produce samples. This is what the code looked like.
Of course, I had to recompile the program a few times because I didn’t understand how sizeof() worked. (I thought I did! Turns out, nope!) You can see some of the “scratch” variables I have in the code, including ones I care about, and ones I don’t. The tmp variable is the most ugly, but I don’t have a good idea on how to do it more cleanly, aside from giving it a better name.
Opening the file in Audacity showed that things were working.
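The zero-padding trick can be sketched roughly like this. It is an illustration rather than the original listing: write_debug_frames is a made-up name, and a fixed "one output every ratio inputs" stage stands in for the real clock recovery block, whose ratio is not constant.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debug writer: emits 2-channel interleaved float frames.
 * Channel 0 is the raw input sample. Channel 1 is the output of a stand-in
 * slow stage, or 0.0f on ticks where that stage produced nothing, so both
 * channels stay lined up in time. Returns the number of frames written. */
size_t write_debug_frames(FILE *fp, const float *in, size_t n, size_t ratio)
{
    assert(fp != NULL && in != NULL && ratio > 0);

    size_t frames = 0;
    for (size_t i = 0; i < n; i++) {
        float frame[2];
        frame[0] = in[i];
        /* stand-in stage: one output sample for every `ratio` inputs */
        frame[1] = ((i + 1) % ratio == 0) ? in[i] : 0.0f;

        /* element size times element count: 2 floats per frame */
        if (fwrite(frame, sizeof frame[0], 2, fp) != 2)
            break;
        frames++;
    }
    return frames;
}
```

A file written this way should open in Audacity through its raw data import, with the encoding set to 32-bit float and the channel count set to 2.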
Crossing the Analog to Digital Divide
The next step was to pull packets from the waveform. Sadly, clock recovery is the most frustrating and difficult part of signal processing. While FIR filters aren’t too difficult to understand in an afternoon, clock recovery is a black box of magic and PhD level maths that crosses the realm between analog and digital. The worst part is that when you misconfigure the block, it doesn’t break in a black and white manner. It just performs poorly.
As if deframing a packet isn’t hard enough, APRS has to be special about it, and uses an old technology called HDLC. The thought is simple enough: packets start and end with a flag: 0x7E (~). The catch is when the user wants to transmit a tilde character. HDLC stuffs a “0” into the bitstream after every run of five consecutive ones in the data, so nothing in the payload can ever look like “0111 1110”. Six ones in a row can then only mean a flag.
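Bit stuffing is easier to see in code than in prose. The sketch below is not liquidwolf's HDLC module: hdlc_stuff and hdlc_unstuff are made-up names, and for clarity the bits are stored one per byte instead of being packed.

```c
#include <assert.h>
#include <stddef.h>

/* Transmit side: copy n payload bits, inserting a 0 after every run of
 * five 1s. `out` needs room for n + n/5 bits. Returns bits written. */
size_t hdlc_stuff(const unsigned char *in, size_t n, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t len = 0;
    int ones = 0;
    for (size_t i = 0; i < n; i++) {
        out[len++] = in[i];
        ones = in[i] ? ones + 1 : 0;
        if (ones == 5) {        /* five 1s in a row: break the run */
            out[len++] = 0;
            ones = 0;
        }
    }
    return len;
}

/* Receive side: drop the bit that follows five consecutive 1s.
 * Assumes `in` is stuffed payload only, with the flags already removed. */
size_t hdlc_unstuff(const unsigned char *in, size_t n, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    size_t len = 0;
    int ones = 0;
    for (size_t i = 0; i < n; i++) {
        if (ones == 5) {        /* this bit was inserted by the sender */
            ones = 0;
            continue;
        }
        out[len++] = in[i];
        ones = in[i] ? ones + 1 : 0;
    }
    return len;
}
```

Feeding a tilde (0111 1110) through hdlc_stuff yields the nine bits 0111 1101 0, so the payload never contains six ones in a row, and hdlc_unstuff recovers the original eight.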
There’s some seriously gross stuff going on with HDLC. If you transmit two frame delimiters back-to-back (i.e. 0111 1110 | 0111 1110), they can share a zero in the middle. And for some odd reason, APRS runs NRZI encoding on the bitstream, apparently because sending too many consecutive 1’s back to back makes clock recovery hard.
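NRZI itself is tiny once you know the convention. In AX.25, a 0 bit is sent as a change in tone and a 1 bit as no change. The sketch below assumes the demodulator has already sliced the signal into one level (0 or 1) per bit period; nrzi_encode and nrzi_decode are made-up names, not liquidwolf functions.

```c
#include <assert.h>
#include <stddef.h>

/* Encode: a 0 bit toggles the line level, a 1 bit leaves it alone.
 * `level` is the line state before the first bit. */
void nrzi_encode(const unsigned char *bits, size_t n,
                 unsigned char level, unsigned char *levels)
{
    assert(bits != NULL && levels != NULL);
    for (size_t i = 0; i < n; i++) {
        if (bits[i] == 0)
            level = !level;
        levels[i] = level;
    }
}

/* Decode: same level as last time means 1, a change means 0.
 * `prev` is the line state before the first received level. */
void nrzi_decode(const unsigned char *levels, size_t n,
                 unsigned char prev, unsigned char *bits)
{
    assert(levels != NULL && bits != NULL);
    for (size_t i = 0; i < n; i++) {
        bits[i] = (levels[i] == prev) ? 1 : 0;
        prev = levels[i];
    }
}
```

Because a 1 means "no change", a long run of 1s is a long stretch with no transitions for the clock recovery to lock onto. Bit stuffing caps the run at five, so the two mechanisms work as a pair.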
Given that HDLC deframing is really its own bit of code, I made it into a module to keep it loosely coupled to the DSP code. It seems trivial to implement, but let me tell you, I really banged my head into a wall implementing this. See if you can spot the bug in my code below.
When running a test of my code on known input, I found that I was receiving one less sample than I should be. I wrote a little program to test this, by printing out the state of the HDLC struct after calling the function.
Bitstream: abcd | 0111 1110 |
           Data |   Flag    |

input: a, in_packet: true, buff_len: 1
input: b, in_packet: true, buff_len: 2
input: c, in_packet: true, buff_len: 3
input: d, in_packet: true, buff_len: 4

input: 0, in_packet: true, buff_len: 5 // Don't know if this is
input: 1, in_packet: true, buff_len: 6 // data, or a frame delim
input: 1, in_packet: true, buff_len: 7 // so record it anyways
input: 1, in_packet: true, buff_len: 8

input: 1, in_packet: true, buff_len: 9
input: 1, in_packet: true, buff_len: 10
input: 1, in_packet: false, buff_len: 11 // Ah! It's a frame delim.
input: 0, in_packet: false, buff_len: 0

Your packet length is 3. Hooray =3
// WHAT!?! That's wrong, it should be 4!!
Turns out I hadn’t incremented buff_len in my code yet. That happens on line 74, and we’re currently on line 36, so we’ll never get there. I don’t need to subtract 7 from the length, I need to subtract 6. Oh that was sad.
Part of the reason it took so long for me to discover this was I didn’t put my “print debug code” inside the hdlc_execute function, because I thought it’d be cleaner to write a “dump state” function. While it was cleaner, I had to resort to GDB and step through a loooong loop until I found out what was going wrong. You can find the commit that fixes this here.
The last part to get working was the frame checksum. Checksum code is pretty scary, because it’s not obvious how it works. When the checksum fails, it’s hard to tell if it’s because your clock recovery isn’t performing well, your hacked together HDLC implementation is broken, or you’re calculating the wrong checksum. (Turns out for me, all three were going wrong at the same time!) Fortunately, GNU Radio had a functioning HDLC deframer, so I stole the checksum algorithm from that. Thanks guys!
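For reference, the AX.25 frame check sequence is the CRC usually catalogued as CRC-16/X-25: the CCITT polynomial processed least significant bit first (0x8408 in reflected form), starting from 0xFFFF, with the result inverted at the end. The bitwise sketch below is not the GNU Radio code mentioned above, and ax25_fcs is a made-up name. It is the slow, obvious version, which is handy when you need something to compare a faster table-driven version against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16/X-25, computed one bit at a time.
 * Reflected polynomial 0x8408, initial value 0xFFFF, final inversion. */
uint16_t ax25_fcs(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0x8408);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return (uint16_t)~crc;
}
```

The standard check value for this CRC is 0x906E over the nine ASCII bytes "123456789". Testing against that lets you rule out the checksum before you start blaming the clock recovery or the deframer.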
Tweaking the “Analog” Side for Receive Performance
Once I had the digital side working 100%, I could focus on the finicky “analog” side of the program, tweaking parameters and filters until things began to operate more smoothly. (I could have also worked on documentation and code hygiene, but that’s boring! Fun things first!) It was fun seeing how various tweaks crept up the received packet count from 670, to 750, to almost 900 packets.
I wanted to put the code online and just be done with it, but it was in an embarrassing state. Unreadable, variables all over the place, bad hygiene. To make it a bit more presentable, I moved the code that turned a byte buffer into a packet struct into its own file, and ran it through valgrind to make sure I wasn’t leaking memory. (I was. Everywhere.)
I also found libsndfile online, and linked that in so the project could finally read .wav files, so the user doesn’t have to mangle their data before feeding it into liquidwolf. Just a smidge lower barrier to entry for people who want to play with it. One of the unexpected benefits — I can now read .ogg files!
EDIT: An unexpected tweak came in last night from @sharebrained, the inventor of the PortaPack for HackRF! He’s a seasoned developer, and recognized a mistake I made on the analog side that picked up another 5% of the packets! Most of my tweaks right now only gather another 2 or 3 packets. 40 packets is a sizable chunk! I hope to explain this in my next post.
At the end of the day, I suppose there are a few fun points to take away.
- DSP code is ugly, and making it pretty is hard.
- To eat an elephant sandwich, do it one bite at a time. (Byte at a time? d’ohohohoho)
- Off by one errors are still a thing. I don’t know how to count anymore.
- Perfect is the enemy of good enough.
- It takes me 7 evenings to write 100 lines of code (that still probably has shocking bugs in it). I don’t even have transmit working.
- Getting a DSP project up off the ground is really hard, but once you’re in the air it’s a lot of fun.
I’d like to thank WB2OSZ for writing Direwolf, and W6KWF for his amazing thesis on APRS. Wading through old documents from the late 90’s to gather all the pieces of the puzzle is some serious effort. They truly blazed a path through the jungle of documentation, and left enough infrastructure behind to make this project much more feasible.
Thanks for reading, I hope you enjoyed it. Liquidwolf has a LONG way to go before it’s really practical, so I haven’t bothered to implement some useful features, like reading samples from a sound card, or a KISS PTY to allow it to integrate with other programs. I’d like to say “Part 2” will soon follow, but no promises. My muse is like Mary Poppins. It comes in, plays some fun games, and leaves when the wind changes.