Just a quick update this week as I am snowed under with other work, but in a spare 5 hours I installed OpenCV on the Pandaboard – a relatively painless experience, as I am using Ubuntu Desktop 12.04, which meant I was able to pretty much just follow the standard instructions, building for ARM rather than x86.
Prior to this, all the image processing had actually been happening on the host PC, using the raw images it received over the Ethernet cable. Obviously this was only ever a prototype; it would never have been a workable solution for my project. Once I had OpenCV installed on the Pandaboard I was able to rewrite my code to move the image processing over to it, making the host PC merely a display that receives and shows the images.
I also took this opportunity to apply JPEG compression to the images before sending them, and to decompress them for display at the other end, using OpenCV’s imencode and imdecode functions. This reduced the network bandwidth from 12 MB/s (sending raw images) to 2 MB/s (at maximum quality, with considerable further reduction the more you lower the quality), while increasing the frame rate from 20 to 26 FPS. This will allow me to send the images over WiFi (once I have it set up).
As it’s exam season I have not been able to get as much project work done as I would like recently, however in a brief break between revision sessions yesterday I made a quick change to the way I draw the depth images (and video).
It has been bugging me for a while that the depth images come in with values in the rough range 0-6000, yet my screen display program was converting them to a number in the range 0-255, thereby losing a vast amount of the potential detail. Increasing the bit depth of the grayscale image wouldn’t help much either, as it would make it no easier to visually discern the difference between values. I therefore fixed the problem by mapping the depth to a colour spectrum (inspired by the hue wheel on colour pickers) rather than a grayscale ramp. This increases the range of values which I can display from 0-255 to 0-1530 – a six times improvement! I chose to continue mapping errors to black.
A comparison between the old, grayscale depth display (left) and the new, colour spectrum depth display (right). Click for full size image.
Personally I don’t think the human eye can necessarily pick up enough information to exploit the full six-times increase in the range of displayable values, but it is definitely an improvement. For example, the folds in my clothing (particularly my jumper) are far more noticeable in the right-hand image, and it’s more obvious that my arm is held in front of my body rather than parallel to it. Likewise, the corner of the room is more pronounced, where before it was just a light grey haze. While not a critical item for my project, it is a nice visual improvement that will make it both easier to track down bugs and more appealing to people I show it to.
When I next get an hour free the next small change I plan to make is adding compression to the video stream. Currently it streams at 12 MB/s and in the 10 minutes I was testing the coloured video, a total of 7 GB of images were streamed. This is not really practical, especially if I plan to use it over the University’s WiFi network. If I find the job too difficult to do in an hour I will give up as, once again, it is not a critical requirement.
Last post I got the depth data streaming from the Kinect to a connected computer. Since doing that I immediately noticed that the depth data from the Kinect comes back extremely noisy (I will endeavour to upload a video to demonstrate my point in the near future). Not only are the edges of objects ‘lumpy’ rather than smooth (a result of the Kinect’s sensing method), there are depth errors constantly appearing and disappearing from frame to frame. These depth errors are all returned from the camera as having a depth of 0 (in my images these are black areas).
Depth frame capture of my room from the Kinect.
In this old image, which I have reused, you can see several black regions. Some of these – the larger groups – will be stable from frame to frame; for example, in the above image the fireplace under the mirror will be constantly causing errors in the depth measurements. I have yet to find an explanation for this.
There is also a large amount of noise: the smaller, patchier black regions will randomly come and go from frame to frame. This is very annoying and far from ideal from an image processing point of view; however, it is also something that should be easily fixed in software.
I tried several methods to remedy the problem. The first and simplest method I tried was just to not update pixels if their new value was 0. This was exceptionally cheap and worked surprisingly well, although it did produce some tearing on moving objects, as their ‘shadow’ would be incorrectly filled in with their depth when moving in certain directions. I then tried a weighted average method, in the hope of removing the high-frequency noise (which usually lasts no more than a frame or two) while keeping the shadows cast by objects. This worked fairly well, although it was far less effective at removing the noise than the previous method: some still came through, and a flickering effect could still be seen, just subdued. Additionally, a noticeable lag could be observed on moving objects, leaving a ‘ghost trail’ behind them. Finally I tried a neighbourhood analysis method: replacing each zero-value pixel with the median of the non-zero pixels in a neighbourhood around it (or leaving it unchanged if the neighbourhood contained only zero-value pixels). This was exceptionally expensive (reducing the frame rate to just 1 or 2 FPS) and, while it did better than the weighted average approach without producing lag, it also left a halo surrounding the shadows.
For the time being I will use the first and simplest method I tried: not updating pixels if their new value is 0. While this has significant problems and introduces actual artefacts into the stream (in place of the shadows), which none of the other methods did, it is extremely effective at removing the noise and is by far the cheapest method. I may look into improving the weighted average approach at a later date, as I still believe it has potential.
I spent yesterday evening updating the website, most notably the programming and design projects I’m working on / have worked on. I will endeavour to make a blog post about as many as possible (particularly the design projects) when I get the time to provide more details and, most importantly, pictures.
Since my previous post I have been working on capturing live video from the Kinect to see what I will be working with. This will be useful later on in the project from a debugging point of view so that I can work out what the system is doing. Unfortunately it is not as simple as it might seem since I run the Pandaboard headless – thus it has no monitor to display the video stream on.
The simplest, and most obvious, solution would be to connect a monitor to the Pandaboard (it has 2 HDMI ports); however, as I intend to connect this system to the robotic base in the near future and have it moving around, this would be far from a long-term solution.
I therefore took the decision to stream the video frames over the network from the Pandaboard to a ‘host’ computer (a common name for the computer a Pandaboard communicates with, although in this case a somewhat misleading one, as it is not actually controlling the board in any way, merely receiving data from it). I do this using C’s TCP/IP socket interface, with the Pandaboard acting as the client and the host computer as the server. This is a somewhat backward arrangement; really the Pandaboard, the one sending the data, should be the server. I originally had a good reason for this orientation, but the restrictions that forced it have since been removed, so it could be rewritten. Once connected, the Pandaboard sends each raw depth frame (a 640×480 array of 16-bit depth readings) over the network to the host computer.
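The client side can be sketched roughly as follows, assuming POSIX sockets (the helper names are mine, and the IP address and port are placeholders):

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// send() may transmit fewer bytes than requested, so loop until the
// whole buffer has gone out (or the connection fails).
bool sendAll(int sock, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t sent = send(sock, data, len, 0);
        if (sent <= 0)
            return false;
        data += sent;
        len -= (size_t)sent;
    }
    return true;
}

// Pandaboard side: connect once, at startup, to the host PC acting as
// the server. Returns the socket descriptor, or -1 on failure.
int connectToHost(const char *hostIp, uint16_t port)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, hostIp, &addr.sin_addr);
    if (connect(sock, (sockaddr *)&addr, sizeof(addr)) != 0) {
        close(sock);
        return -1;
    }
    return sock;
}

// Push one raw depth frame: a 640x480 array of 16-bit readings,
// i.e. 614,400 bytes per frame.
bool sendDepthFrame(int sock, const uint16_t *frame)
{
    return sendAll(sock, (const char *)frame, 640 * 480 * sizeof(uint16_t));
}
```

Because TCP is a byte stream with fixed frame sizes on both ends, the receiver can simply read 614,400 bytes per frame; no extra framing header is needed for the raw depth stream.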
I have also implemented the streaming protocol for the RGB data.
This network streaming does produce some overhead, reducing the frame rate by about 11 FPS for each stream running: streaming just depth or just RGB data reduces the frame rate from around 30 to 19 FPS, while streaming both reduces it further, to around 8 FPS. I consider this cost acceptable as the functionality will only ever be used for debugging. I could reduce the amount of data sent either by compressing the data or by scaling the 16-bit values down to 8-bit values (something that is done on the host side before displaying them anyway) prior to transmission. Another possible extension is to switch to a more standard video streaming format which, while not necessary now, would allow the video to be streamed to a web interface at a later date. This is a bridge I will cross when I come to it.