The Backyard Brains 2018 Summer Research Fellowship is coming to a close, but not before we get some real-world scientific experience in! Our research fellows are nearing the end of their residency at the Backyard Brains lab, and they are about to begin their tenure as neuroscience advocates and Backyard Brains ambassadors. The fellows dropped in on University of Michigan’s Undergraduate Research Opportunity Program (UROP) Symposium during their final week of the fellowship, and each scientist gave a quick poster presentation about the work they’d been doing this summer! The fellows synthesized their data into the time-honored poster format and gave lightning-round pitches of their work to attendees. BYB is in the business of creating citizen scientists, and this real-world application is always a highlight of our fellowship. Check out their posters below!
Hello friends, this is Yifan again. As the end of the summer draws near, my summer research is also coming to a conclusion. The work I did over the summer was very different from what I expected. Since this is a wrap up post for an ongoing project, let us first go through what exactly I did this summer.
The above is the product flow graph for our MDP project. All the blue blocks and paths are what I worked on this summer. In previous posts I wrote about progress and accomplishments on everything except the bird detection algorithm.
In my second blog post, I talked about using a single HMM (hidden Markov model) to differentiate between a bird and a non-bird. One problem was that HMM classification takes a long time. Running HMM classification on a 30-minute long recording takes about 2 minute. Considering the fact that we need to analyze data much longer than that, we need to pre-process the recording, and ditch the less interesting parts. This way, we are only putting the interesting parts of the recording into the HMM classifier.
This figure is the runtime profile of running HMM on a full 30-minute long recording. The classification took about 2 minutes. After splitting out the interesting parts of the recording, we are only running classification on these short clips, hence reduces the runtime by a very large factor (see figure below).
One thing you might have noticed in these two graphs is that the runtime for wav_parse is also extremely long. Since there is almost no way to get around parsing the wav file itself, the time consumed here will always be a bottleneck for our algorithm. Instead of a better parsing function, I did the mature thing by blaming it all on python’s inherent performance issues. Jokes aside, I think eventually someone will need to deal with this problem, but I think optimization can wait for now.
This figure is the raw classification output using a model trained by 5 samples of a matching bird call. If the model thinks a window in the recording matches the model, it marks that window as 0, otherwise 1. Basically this mess tells us that in these 23 clips, only clip 9 and 10 does not contain the bird used to train the model.
One might ask, why don’t you have a plot or graph for this result? Don’t yell at me yet, I have my reasons… I literally have more than a hundred clips from one 30-minute recording. It’s easier for me to quickly go through the result if they are clustered together in a text file.
Although me and my mentor Stanislav had decided on trying out HMM to do the bird detection. The results aren’t very optimistic. There is the possibility that HMM is not a very good choice for this purpose after all, which means I might need to do more research to find a better solution for bird detection. Luckily, since songbird is an ongoing project, I will get my full team back again in September. Over this summer, I believed I have made some valuable contributions to this project, and hopefully that can help us achieve our initial goals and plans for this product.
This summer has been a wonderful time to me. I would like to thank all my mentors and fellows for their help along the way, it really meant a lot to me. Looking into the future, I definitely believe this project has more potential than just classifying birds, but for now I am ready to enjoy the rest of the summer in order to work harder when I come back to Ann Arbor in fall.
Hello again everyone! It’s Yifan here with the songbird project. Like my other colleagues I also attended the 4th of July parade in Ann Arbor, which was very fun. I made a very rugged cardinal helmet which looks like a rooster hat, but I guess rooster also counts as a kind of bird, so that turned out just fine.
Anyways, since the last blog post, I have shifted my work emphasis to user interface. After some discussions with my supervisors, we’ve made the decision to change the scheme a little. Instead of using machine learning to detect onsets in a recording, we are going to make an interface that allows the users to select an appropriate volume threshold to do the pre-processing. Then, we will use our machine learning classifier to further classify these interesting clips in details.
Why thresholding based on volume, one might ask? Well, volume is the most straightforward property of sound for us. During tech trek, a kid asked me a very interesting question: when you are detecting birds in a long recording, how do you know the train sound you ruled out as noise isn’t a bird that just sounds like train? Although this one should be quite obvious, we should still give the users the freedom to keep what they want in the raw data. Hence, I’ve developed a simple mechanism that allows every user to decide what they want and what they don’t want before classifying.
This figure is a quick visual representation of a 15 minute field recording after being processed by the mechanism I was talking about. As you can see, in the first plot there is a red line. That is the threshold for user to define. Anything louder than this line would be marked as “activity”; anything quieter than it would be marked as “inactivity.” The second plot shows the activity by time. However, an activity, like a bird call, might have long silence period in between each call. In order not to count those as multiple activities, we have a parameter called “inactivity window,” which is basically the silent time you need in between two activities to be counted as separate activities.
In the above figure, the inactivity window is set to 0.5 second, which is very small. That is why you can see so many separate spikes in the activity plot. Below is the the plot of the same data, but with a inactivity window of 5 seconds.
Because the inactivity window is larger now, smaller activities are now merged into longer continuous activities. This can also be customized by users. After this preprocessing procedure, we will chop up the long recording based on activities, and run smaller clips through the pre-trained classifier.
Unfortunately my laptop completely gave up on me a couple days ago, and I had to send it to repair. I would love to show more data and graphs in this blog post, but I’m afraid I have to postpone that to my last post. Anyways, I wish the best for my laptop (as well as the data in it), and see you next time!