Hello again everyone! It’s Yifan here with the songbird project. Like my colleagues, I attended the 4th of July parade in Ann Arbor, which was very fun. I made a very rugged cardinal helmet that looks more like a rooster hat, but a rooster is also a kind of bird, so it turned out just fine.
Anyway, since the last blog post, I have shifted my emphasis to the user interface. After some discussions with my supervisors, we’ve decided to change the scheme a little. Instead of using machine learning to detect onsets in a recording, we are going to build an interface that lets users select an appropriate volume threshold for pre-processing. Then we will use our machine learning classifier to classify the interesting clips in detail.
Why threshold based on volume, one might ask? Well, volume is the most straightforward property of sound. During Tech Trek, a kid asked me a very interesting question: when you are detecting birds in a long recording, how do you know the train sound you ruled out as noise isn’t a bird that just sounds like a train? Although the answer is usually obvious, we should still give users the freedom to keep whatever they want from the raw data. Hence, I’ve developed a simple mechanism that lets every user decide what to keep and what to discard before classification.
This figure is a quick visual representation of a 15-minute field recording after being processed by the mechanism I described. As you can see, the first plot contains a red line: the user-defined threshold. Anything louder than this line is marked as “activity”; anything quieter is marked as “inactivity.” The second plot shows activity over time. However, a single activity, like a bird call, might contain long silent periods between calls. To avoid counting those as multiple activities, we have a parameter called the “inactivity window”: the amount of silence required between two activities for them to count as separate.
In the figure above, the inactivity window is set to 0.5 seconds, which is very small. That is why you can see so many separate spikes in the activity plot. Below is the plot of the same data, but with an inactivity window of 5 seconds.
Because the inactivity window is larger, smaller activities are merged into longer continuous activities. This can also be customized by users. After this preprocessing step, we chop up the long recording based on the detected activities and run the smaller clips through the pre-trained classifier.
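To make the mechanism concrete, here is a simplified Python sketch of the two steps: thresholding by volume, then merging activities separated by less than the inactivity window. This is an illustration, not the project’s actual code, and the function and parameter names are my own.

```python
import numpy as np

def detect_activities(samples, rate, threshold, inactivity_window):
    """Return (start, end) sample indices of activity segments, merging
    segments separated by less than `inactivity_window` seconds."""
    active = np.abs(samples) > threshold
    # Pad with False so every activity segment has a rising and falling edge.
    padded = np.concatenate(([False], active, [False])).astype(int)
    edges = np.flatnonzero(np.diff(padded))
    starts, ends = edges[::2], edges[1::2]      # raw activity segments

    min_gap = int(inactivity_window * rate)     # silence needed to split
    merged = []
    for s, e in zip(starts, ends):
        if merged and s - merged[-1][1] < min_gap:
            merged[-1][1] = e                   # gap too short: merge
        else:
            merged.append([s, e])
    return [tuple(seg) for seg in merged]
```

With a small inactivity window, two nearby bursts stay separate; widening the window merges them into one clip, exactly as in the two plots above.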
Unfortunately, my laptop completely gave up on me a couple of days ago, and I had to send it in for repair. I would love to show more data and graphs in this blog post, but I’m afraid I have to postpone that to my last post. Anyway, I wish the best for my laptop (and the data on it), and see you next time!
Can robots think and feel? Can they have minds? Can they learn to be more like us? To do any of this, robots need brains. Scientists use “neurorobots” – robots with computer models of biological brains – to understand everything from motor control and navigation to learning and problem solving. At Backyard Brains, we are working hard to take neurorobots out of the research labs and into the hands of anyone who wants one. How would you like a robot companion with life-like habits and goals? Even better, how would you like to visualize and rebuild its brain in real-time? Now that’s neuroscience made real!
I’m Christopher Harris, a neuroscientist from Sweden who for the past few years has had a bunch of neurorobots exploring my living room floor. Last year I joined Backyard Brains to turn my brain-based rugrats into a new education technology that makes it possible for high-school students to learn neuroscience by designing neurorobot brains. Our robots have cameras, wheels, microphones and speakers, and students use a drag-and-drop interface to hook them all up with neurons and neural networks into an artificial brain. Needless to say, the range of brains and behaviors you can create is limitless! Twice already we’ve had the opportunity to pilot our neurorobots with some awesome high-school students, and we’re learning a ton about how to make brain design a great learning experience.
But hang on, is this just machine learning (ML) dressed up to look like neuroscience? Not at all. Although ML algorithms and biological brains both get their power from connecting lots of neurons into networks that learn and improve over time, there are also crucial differences. Biological neurons are complex and generate spontaneous activity, while ML neurons are silent in the absence of input. Unlike ML networks, biological brain models are ideally suited for “neuromorphic” hardware, which has extraordinary properties, including (some say) the ability to support consciousness. Finally, while ML networks are organized into neat symmetrical layers with only the occasional feedback-loop, biological brains contain a huge diversity of network structures connected by tangles of criss-crossing nerve fibres. Personally I’m a big fan of the brain’s reward system – the sprawling, dopamine-driven network that generates our attention, motivation, decision-making and learning. So rest assured, fellow reward-enthusiasts, our neurorobots have a big bright “reward button” to release dopamine into the artificial brain, reinforce its synapses and shape its personality.
Interested? If you’d like to take part in a workshop to learn brain design for neurorobots, or if you’re a teacher and would like Backyard Brains to come and give your students a hands-on learning experience they’ll never forget, please email me at email@example.com, and check back here for updates.
G’day again! I’ve got data… and it is beautiful!
More on this below… I am pleased to update my progress on my BYB project, Human EEG visual decoding!
If you missed it, here’s the post where I introduced my project!
Since my first blog post, I have collected data from 6 subjects with the stimulus presentation program I developed. The program presents 5 sets of 30 images from 4 categories (Face, House, Natural Scene, Weird pictures). Since the images are presented in randomized order, I place small, color-coded blocks in the corner of each image, which I use to record which stimulus is presented when.
I needed to build a light sensor to read the signals from these colored blocks. I used a photoresistor at first, but there was some delay in its signal, so I switched to photodiodes, which respond faster. Since I do not have an engineering background, I had to learn how to read circuits and to solder in order to build the light sensor. This was new territory for me, but it was very interesting and motivating. After building the device, I collected data from 6 subjects at 5 brain areas (CPz, C5, C6, P7, P8) that are thought to be important in processing visual stimuli.
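Once the photodiode channel is recorded alongside the EEG, pulling out stimulus onset times can be as simple as finding upward threshold crossings in its trace. This is a hypothetical sketch of that idea, not the exact code used here:

```python
import numpy as np

def stimulus_onsets(photodiode, rate, threshold):
    """Return onset times (in seconds) where the photodiode trace crosses
    the threshold from below, i.e. a color-coded block lights up."""
    above = photodiode > threshold
    rising = np.flatnonzero(~above[:-1] & above[1:]) + 1
    return rising / rate
```

These onset times are what let the later ERP analysis lock each EEG epoch to the exact moment an image appeared.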
Figure 1. Data recorded from the DIY EEG gear: 5 channels from 5 brain areas (orange, red, yellow, light green, green) and 1 channel from the photoresistor (aqua), which was later replaced by a photodiode.
Figure 2. The circuit for the photodiode (top) and the photodiodes I built (bottom).
Figure 3. Checking each channel from the Arduino. One channel (yellow), placed over the back of the head, is detecting alpha waves (10 Hz).
Figure 4. Spencer (top/mid) and Christy (bottom), our coolest interns, participating in the experiment.
With the raw EEG data collected from each subject, I averaged the trials to get the ERP (Event-Related Potential) and observe what the device had detected. ERPs provide a continuous measure of processing between a stimulus and a response, making it possible to determine which stages are affected by a specific experimental manipulation. They also provide excellent temporal resolution, since the speed of ERP recording is constrained only by the sampling rate the recording equipment can feasibly support. Thus, ERPs are well suited to research questions about the speed of neural activity.
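The averaging step itself is straightforward. Here is a rough sketch for a single channel, assuming stimulus onsets are already expressed as sample indices (the names and the baseline window are my own choices, not the project’s exact code):

```python
import numpy as np

def compute_erp(eeg, onsets, rate, window=(-0.1, 0.5)):
    """Average stimulus-locked, baseline-corrected epochs into an ERP.
    eeg: 1-D channel; onsets: stimulus onsets as sample indices."""
    pre, post = int(window[0] * rate), int(window[1] * rate)
    epochs = []
    for onset in onsets:
        if onset + pre >= 0 and onset + post <= len(eeg):
            epoch = eeg[onset + pre : onset + post].copy()
            epoch -= epoch[:-pre].mean()   # subtract pre-stimulus baseline
            epochs.append(epoch)
    return np.mean(epochs, axis=0)
```

Averaging cancels activity that is not time-locked to the stimulus, so a consistent response (like the N170 below) stands out from the noise.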
Then I performed Monte Carlo simulations to verify the statistical significance of the spikes in the ERP data. Monte Carlo simulation is a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. With 100 random samples for each category, the analysis indicated statistically significant spikes across the graph, especially at N170 for face images, which was very meaningful for my research. N170 is an ERP component that reflects the neural processing of faces, which supports that we detected faces well across subjects compared to the other categories.
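One common flavor of such a test compares the observed ERP peak against peaks from surrogate ERPs built from randomly time-shifted trials. The sketch below assumes that scheme; the post does not specify the exact resampling used, so treat this as illustrative only:

```python
import numpy as np

def monte_carlo_peak_p(epochs, n_iter=1000, seed=0):
    """P-value for the ERP peak: the fraction of surrogate ERPs (built from
    randomly circular-shifted trials) whose peak matches or exceeds it."""
    rng = np.random.default_rng(seed)
    epochs = np.asarray(epochs)
    observed = np.abs(epochs.mean(axis=0)).max()
    count = 0
    for _ in range(n_iter):
        # Destroy stimulus-locking while keeping each trial's statistics.
        shifted = [np.roll(ep, rng.integers(len(ep))) for ep in epochs]
        if np.abs(np.mean(shifted, axis=0)).max() >= observed:
            count += 1
    return count / n_iter
```

If the real peak only arises when trials are properly aligned to the stimulus, almost no surrogate reaches it, and the p-value comes out small.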
Figure 5. ERP data from 6 subjects for each category of images. A significant response at N170 (a negative peak about 170 ms after stimulus presentation) is detected for face images.
After verifying the statistical significance of the data, I used k-means clustering, a method of vector quantization that partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. K-means clustering indicated that the difference between subjects was larger than the difference between trials, and that the difference between trials was larger than the difference between categories. And, much to my excitement, the response to faces was clearly distinguished from the other categories across the averaging settings.
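For readers unfamiliar with the method, here is a minimal, self-contained toy version: a bare-bones k-means run on synthetic ERP-like vectors where one category (“faces”) has a distinct mean. The real analysis of course used the actual averaged EEG data, not these stand-ins.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means: assign rows to the nearest center, then recompute
    each center as the mean of its assigned rows."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels

# Synthetic stand-in: 30 "face" vectors with a shifted mean, 90 "other"
# vectors. If faces separate, one cluster ends up almost all faces.
rng = np.random.default_rng(1)
faces  = rng.normal(loc=1.0, scale=0.3, size=(30, 60))
others = rng.normal(loc=0.0, scale=0.3, size=(90, 60))
labels = kmeans(np.vstack([faces, others]), k=2)
face_cluster = np.bincount(labels[:30]).argmax()
purity = (labels[:30] == face_cluster).mean()
```

A high purity for the face rows mirrors the finding above: face responses form their own cluster while the other categories blur together.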
With the insights from k-means clustering, I finally applied the machine learning techniques I’d been studying to measure how accurately I could classify which category of image people were looking at during the experiment, using the raw data. I tried popular pattern classifiers such as the linear, quadratic, and cubic support vector machines, complex trees, Gaussian classifiers, k-nearest neighbors, and so on. I ran these methods on a single subject and on the set of 6 subjects, with and without averaging every 5, 10, 15, 20, 25, 30, 50, 75, or 150 vectors of EEG data.
One Subject Raw
One Subject Averaged
Six Subjects Raw
Six Subjects Averaged
Figure 6. K-means clustering results with averaging every 5, 10, 20, 50, or 75 vectors of the EEG data, for a single subject (first 2 graphs) and 6 subjects (last 2 graphs). The Y axis indicates the 4 categories of images (1: Face, 2: House, 3: Natural Scene, 4: Weird pictures), further delineated by the red lines. The graphs from 6 subjects indicate that combining multiple subjects introduces too much variation to identify faces within the group. However, the graphs from a single subject indicate that faces can be distinguished from the other three categories.
Then, with the data from k-means clustering and the machine learning classifiers mentioned above, I applied 5-fold cross-validation, with and without averaging every 5 vectors of EEG data. In 5-fold cross-validation, the data set is divided into five disjoint subsets; four subsets are used for training and the remaining one for testing, rotating so that each subset serves as the test set once. SVM showed the best performance among the classifiers, with more than 50% accuracy for each class, and the averaged data performed better, as expected.
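The classifier names above suggest a toolbox like MATLAB’s Classification Learner; an equivalent 5-fold cross-validation in Python with scikit-learn would look roughly like this. The feature vectors here are synthetic stand-ins, not the real EEG data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 50 feature vectors per category, 4 categories,
# with class means pulled apart so a linear SVM can separate them.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(50, 60)) for c in range(4)])
y = np.repeat(np.arange(4), 50)

# 5-fold CV: five disjoint folds, each used once as the test set
# while the other four folds train the classifier.
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
print("mean accuracy:", scores.mean())
```

Averaging several EEG vectors before classification plays the same role as raising the separation in this toy example: it shrinks the within-class noise, which is why the averaged data scores higher.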
One subject, SVM, no averaging
One subject, SVM, averaging 5
Six subjects, SVM, no averaging
Six subjects, SVM, averaging 5
Figure 7. The results of pattern classification with SVM. Both one subject and 6 subjects achieved good results: averaging every 5 vectors of the EEG data produced better results than no averaging, and data from a single subject produced better results than 6 subjects. (The darker the green down the diagonal, the better; that is the accuracy of predicting each class.)
So now I am working on real-time pattern classification so that I can detect what people are looking at without averaging multiple sets of data. I will perform spectral decomposition to compute and downsample the spectral power of the re-referenced EEG around each trial. The spectral features from all of the electrodes will be concatenated and used as inputs to pattern classifiers. The classifiers will be trained to recognize when each stimulus category is being processed as the target image in real time; a separate classifier will be trained for each combination of stimulus category and time bin. Next, the trained classifiers will be used to measure how strongly the prime distractor image is processed on each trial. Finally, subjects’ RTs (to the probe image) on individual trials will be aligned to the classifier output from the respective trials.
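The planned feature extraction could look something like the sketch below: per-channel band power from an FFT, concatenated across electrodes into one feature vector per trial. The band edges and names are hypothetical placeholders; the final pipeline may differ.

```python
import numpy as np

def spectral_features(trial_eeg, rate,
                      bands=((4, 8), (8, 13), (13, 30), (30, 50))):
    """Mean spectral power per frequency band and channel, concatenated
    across electrodes into one feature vector for the classifier.
    trial_eeg: (n_channels, n_samples) EEG segment around one trial."""
    n = trial_eeg.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / rate)
    power = np.abs(np.fft.rfft(trial_eeg, axis=1)) ** 2 / n
    feats = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(power[:, mask].mean(axis=1))  # one value per channel
    return np.concatenate(feats)                   # length: n_bands * n_channels
```

Because each feature summarizes a whole frequency band over a short window, vectors like these can be computed fast enough to feed a classifier in real time.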
A successful result will make this kind of neural decoding accessible to any neuroscience researcher with an affordable EEG rig, and give us an opportunity to bring state-of-the-art neurotechnology, such as brain authentication, to life. Please keep an eye on my project and feel free to ask any questions. Toodle-oo!