Review: Haikubox Bird Identification System
A few months ago I heard about a new bird call identification system called Haikubox from Loggerhead Instruments.
It is a box that you plug into an electrical socket, hook up to your WIFI, hang on the wall outside, and sign into through their app; it then monitors the ambient sounds around its location on a 24x7 basis. When it hears what might be a bird call, it uploads it to their server, where it’s evaluated and, if possible, identified. The back end processing is very similar to Cornell’s Merlin app. The 24x7 aspect of this tool intrigued me, so I put myself on the wait list (thank you, supply chain).
About two weeks ago I got the email telling me it was in stock, so I made an order, and a few days later, it arrived.
It’s a small box, well weather sealed and seems to be well engineered. Setup was easy: there are two apps, one for configuring it, one to use it, downloadable from your phone or tablet’s app store. It took maybe ten minutes to get it unboxed, set up and on the air waiting for data.
I will note this is very much an enthusiast’s device. It’s not cheap: $399, which includes 5 years of server processing access, after which the server access is $59/year. For the kind of back end server processing and storage this thing uses, I don’t think that’s unreasonable. It does require electricity and an active WIFI connection, so it’s not portable like the Merlin app. You will be monitoring a specific location (most likely your house).
To cut to the chase: I love about 90% of the product. There are things it does amazingly well, and it has a couple of features I feel are missing from Merlin that I hope Cornell adopts. Its accuracy was quite good, in some ways better than Merlin. Still, there were some challenges I couldn’t get past in that last 10%, and I ultimately reached out to them and requested a refund.
But…..
As I got familiar with the tool I started having some questions and running into some challenges. One issue, very fixable: it was really unstable on my WIFI, which is a rock solid EERO 6 mesh. Nothing else on my network had any issues or outages during the time, and I could find no issues monitoring the network, but the device would go from a strong signal to a weak one and then drop off the network completely, requiring a power cycle (or three). I’ve seen this behavior before, and it seems some devices will have a tendency to shift off the stronger, nearby router to a weaker one for some reason. Very fixable, but frustrating when it’d run for hours and then go offline. I ended up putting it on a smart switch so I could remotely power cycle the thing.
The bigger issue, the one I never could get past, was privacy. If you stop and think about it, this is a 24x7 live microphone attached to the wall of your house and sharing what it hears with a server up in the cloud somewhere. The product’s web site (at the time I write this) doesn’t really go into how they encrypt and protect this data, and the more I thought about it, the more uncomfortable I was with this. I do work for a company that does computer security work, and so I think I’m more sensitive to these kinds of issues than a typical person, but… there’s a reason Siri and Alexa are disabled everywhere in my house except the HomePod that got exiled to the garage workshop.
I did talk to the support team, which was quite responsive and answered all of my crazy questions, and they are doing what they can to minimize the potential privacy implications of the device; the on-board system, if it recognizes what it thinks is human speech, stops recording, but as they noted, it’s not 100% perfect. I do think they’re taking reasonable steps to protect users, but… this is kind of a hot button for me, and it’s not well documented what and how they do the protection (to their credit, they agreed and I expect this will be improved in their online docs sometime soon). To be honest, if I were to grab my laptop, go out on the patio and sit in on a company meeting, I’d be in violation of company policy, which doesn’t want company meetings to be held within hearing of the “ladies in the tubes”.
The other big issue for me: this data is invalid for use in eBird. This surprised me, because it’s not covered at all in the Haikubox online documentation. I reached out to the regional eBird reviewer to let them know what I was doing with the box, and he shared with me eBird’s Best Practices on remote sensors, which is.. well, “nope”.
This remote sensing standard is pretty well hidden by Cornell; when I’d gone researching for guidance I never found it. But the standard is clear: you really shouldn’t put IDs found via Haiku into eBird. That… wasn’t the answer I wanted to hear, and it took me a few days to really get this straightened out in my head, but ultimately, I want to be a good citizen on eBird (more on that shortly), and I removed all of the Haiku data from my eBird listings.
We need to sort out remote sensing and eBird
It’s my belief that this remote sensing stuff is just starting to become useful, and yes, I do love living on the bleeding edge of new tech at times, and part of that is learning how to help others adopt it later. I see what Cornell’s Merlin is doing, and what Haiku is capable of and what these systems will likely be able to do in a year or three, and I think it’s going to be as big a change to birding as digital photography, or perhaps the invention of the binoculars (seriously). These tools are binoculars for the ears, and as someone who has lost significant hearing capability, I’ve found they open up a world of birding that I had lost as my ears aged and stopped working so well.
And while I understand why eBird’s position is “nope” on remote sensing, this stuff is coming, it’s just starting to shift from researcher tools into the enthusiast market, and it’s only going to get better. Now is the time for them to get ahead of this emerging tech, set rational standards for its use, and evangelize those standards out into the eBird user base. I believe, from my discussions with birders over the last number of years, that most of those who have adopted Merlin aren’t following Cornell’s Merlin Best Practices, because when I’ve asked, they had rarely even considered that best practices existed. Merlin spits out an ID, and they believe it. The same is going to be true for Haiku users, and for what I’m sure will be other, similar products that we’ll see in the next few years.
So if I were in charge here (and thankfully, I’m not), here’s what I’d ask to see happen:
First, there are two huge features the Haikubox has that Merlin doesn’t. It rates each ID with a reliability ranking (high, medium, or low), and it lets you download the clips for analysis and sharing. I found in my testing that high rankings were 100% accurate, and medium rankings were pretty good but not perfect. Low rankings were sometimes accurate (and usually I also had higher ranked sound clips for the same bird), but occasionally just plain wrong. I got a few hits for Western Bluebird, which is not expected in this location, and one of those hits came at the same time as a male Western Tanager was visiting my feeder. While I know I have a pair of Western Tanagers nesting here again this spring, Haiku hasn’t been particularly good at recognizing them, and I believe the Bluebird was a mis-ID. I also had a mid-day overflight catch of a calling Common Loon, which isn’t completely impossible, but honestly, I doubt it.
Merlin needs to adopt these two features: we really need that reliability ranking for each individual ID clip, and the ability to download clips for analysis and sharing.
To quote the Merlin best practices (and, to be honest, when I compare them with the eBird best practices, these two parts of Cornell don’t seem to be quite on the same page about how the two tools should work together):
Like all birders, Merlin can make mistakes. If you're not confident that Merlin's suggestion is correct, or if you have not considered it independently, don't report it to eBird. (Do not report whatever Merlin says without considering it first!)
But… how does one do that? Merlin doesn’t give any hints on how to build that confidence, short of getting eyeballs on the bird, which is always a good practice but not always a practical one. (I’ve been trying to get eyeballs on a Pacific-slope Flycatcher that’s taken up residence here for weeks, to no avail. I finally did get really nice looks, twice, at the Black-throated Gray Warbler that seems to be summering here with me.) Adding reliability ratings to Merlin’s clips, and allowing clips to be downloaded for analysis and sharing, would go a long way.
In my discussions with the Haikubox team, I expressed some strong disappointment that this remote sensor restriction in eBird wasn’t covered anywhere on their site, and that they hadn’t sat down with Cornell and hashed out how to make this data usable in eBird. They took this feedback with grace and indicated an intention to reach out and talk to Cornell. I am hoping they do, and that it helps us bootstrap these kinds of devices into general use by the enthusiast birding community.
In practice, I think the Merlin team, the Haiku team, and the eBird team need to sit down and figure this out, and then all three groups need to update their docs to explain the best practices clearly to users who in general aren’t going to go looking for best practices without some outreach to educate them.
I think there’s some good common ground all three can agree on:
ID clips ranked as high reliability seem to be quite accurate, and I think they could be a solid addition to eBird data. I’d go so far as to say these clips could be auto-submitted to eBird (but personally, I wouldn’t do that).
ID clips of medium reliability need some further cross-checking. But if I’ve seen a bird at a location and a week later I’m getting ID hits of medium quality, I consider that “good enough”.
ID clips of low quality should not be submitted to eBird without strong corroborating evidence, such as a visual sighting.
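For the programmers in the audience, here is a rough sketch of how those three rules might look as code. It is only an illustration in Python: the names (Reliability, triage, and the two flags) are made up for this post, and nothing here reflects how Haikubox, Merlin, or eBird actually work.

```python
from enum import Enum


class Reliability(Enum):
    """Hypothetical stand-in for the high/medium/low ranking on a clip."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def triage(reliability, seen_here_recently=False, has_visual_confirmation=False):
    """Suggest what to do with one ID clip: 'submit', 'cross-check', or 'hold'.

    seen_here_recently: I have personally seen this species at this location.
    has_visual_confirmation: strong corroborating evidence, such as a sighting.
    """
    if reliability is Reliability.HIGH:
        # High-ranked clips were accurate in my testing.
        return "submit"
    if reliability is Reliability.MEDIUM:
        # Medium is "good enough" only when backed by my own observation.
        if seen_here_recently or has_visual_confirmation:
            return "submit"
        return "cross-check"
    # Low-ranked clips need strong corroborating evidence.
    return "submit" if has_visual_confirmation else "hold"
```

Even in this form a human still makes the final call; the function only says which pile a clip lands in.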
I also think it would be a good idea for eBird to allow users to document how an ID was made; currently I disclose Merlin data in the comment field, which isn’t sufficient. Thinking about this, a new field similar to the Breeding Code data could be used to help define the data in a way future researchers could leverage or filter it. Something like:
Visual ID
Identified by Sound
Identified via automated sound tool (ranked: high/medium/low)
Allowing people to optionally tag an ID this way would allow others, later, to have some insight on whether to use or exclude those IDs in their research, at minimal hassle to the submitter.
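To make the idea concrete, here is a small sketch of what such an optional tag, and a researcher’s filter built on it, could look like. This is Python written purely for illustration: IdMethod, Observation, and keep_for_research are hypothetical names, and eBird has no such field today.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IdMethod(Enum):
    """Hypothetical values for the proposed 'how was this ID made' field."""
    VISUAL = "Visual ID"
    SOUND = "Identified by Sound"
    AUTOMATED_SOUND = "Identified via automated sound tool"


@dataclass(frozen=True)
class Observation:
    species: str
    method: Optional[IdMethod] = None  # optional: untagged records stay valid
    tool_rank: Optional[str] = None    # "high", "medium" or "low"; automated IDs only


def keep_for_research(obs, accepted_ranks=("high",)):
    """Keep everything except automated IDs whose rank is not accepted."""
    if obs.method is IdMethod.AUTOMATED_SOUND:
        return obs.tool_rank in accepted_ranks
    return True


sightings = [
    Observation("Western Tanager", IdMethod.VISUAL),
    Observation("Marbled Murrelet", IdMethod.AUTOMATED_SOUND, "high"),
    Observation("Western Bluebird", IdMethod.AUTOMATED_SOUND, "low"),
    Observation("Common Loon"),  # no tag supplied
]
kept = [obs.species for obs in sightings if keep_for_research(obs)]
```

The point of the design is that the submitter only picks a tag, and each researcher decides later which ranks to trust by changing accepted_ranks.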
My Bottom Line
I… really wanted to love the Haikubox, but between the eBird disconnects and the privacy issues, I ended up sending it back — with regrets. One very intriguing set of IDs it found was that every morning right as the dawn chorus was kicking in (4:30-5:30AM) I was hearing an overflight of multiple Marbled Murrelets, evidently heading from wherever their nesting/roosting place was deep in the woods out to either Hood Canal or the other local waters. Which fostered a whole bunch of questions, like whether I could figure out directionality of the flight, where their nesting might be and should we find it so we could protect that area from possible logging? A whole lot of stuff way above my pay grade, but I think it gives an idea of the kind of possibilities this kind of remote monitoring might open up.
But products like Haikubox need to be able to submit data to eBird in an open and constructive way, and as it stands today, any user of a Haikubox, if identified by their eBird reviewer, might get an email that they might not be totally thrilled with, in large part because Haikubox hasn’t set appropriate expectations here.
If/when things get worked out and eBird is ready to accept remote sensing data and the best practices are out there and understood, I might well buy another Haikubox — I really want to love the product and I think it could add some interesting data to my birding life. That said, the privacy issues of a live microphone still bother me, so here’s an idea for a product I’d really rather buy:
Haikubox as a trail camera: no WIFI, but battery operated and storing data on an SD card, which I could harvest, import and evaluate, and then upload to the Haiku servers for the back-end ID and storage. That would allow me to put the box at places away from human voices and other noises (like the air conditioner), and explore other parts of this property away from my house and away from the birds that seem okay with being relatively near humans. (I have 4 acres of woods here; how is the birdscape in the undeveloped areas different than near the feeders? I’d love to know.)
I am really convinced we’ll be seeing this remote sensing gear moving into the birding ecosystem over the next couple of years. I’m looking forward to dipping my toe back into it when I can. For now, I think the technology pieces are just at the start of the “good enough for early adopters” group, but the social and process issues tied to the data are a black hole. Once those start getting sorted out, then I’ll be thrilled to bring this back into my birding life.
And Finally, a Mea Culpa
Of course, it wouldn’t be me opening up the hood of some neat new tech without bruising a few fingers and slicing open a fingertip. In this case, in one of my discussions about this, I made the comment “I know the data I’m submitting to eBird is of high quality” — and almost immediately, the voice in the back of my head muttered “yeah, but the data in eBird doesn’t show that, does it?” My reaction, of course, was to tell the voice to shut up, but..
But.. it’s right. I’ve gotten sloppy about my use of Merlin data and eBird. I’m not following best practices in some cases. In most cases I am, but I don’t believe I’m doing a good job of documenting what I know in many cases. I have to fix that. I’m not entirely sure how, yet, but this realization is where that idea above of the new field eBird should support came from. I’ve been mixing visual sightings and ear ID with the Merlin data in a very sloppy way, and I have to fix that.
I may be the only person who cares if I fix that data, but now that I’ve recognized I’m doing it wrong, I have to figure out a fix and get it implemented.
Which… well, life is always complicated, but I do want to live up to high standards in what I do, and it’s sometimes humbling to have to admit I haven’t done so here.