Sophisticated, Complex and Challenging ... Resources, Flexibility and Speed ... soon, we will have a robot from Berkeley that folds a towel ... an entered apprentice as it were ... more to come in the next ten to twenty years ... RoboCop and Star Wars to follow ...
For those of you interested in learning more about embedded vision, I recommend the website of the Embedded Vision Alliance, www.embedded-vision.com, which contains extensive free educational materials.
For those who want to do some easy and fun hands-on experiments with embedded vision, try the BDTI OpenCV Executable Demo Package (for Windows), available at www.embedded-vision.com/platinum-members/bdti/embedded-vision-training/downloads/pages/introduction-computer-vision-using-op
And for those who want to start developing their own vision algorithms and applications using OpenCV, the BDTI Quick-Start OpenCV Kit (which runs under the VMware player on Windows, Mac, or Linux) makes it easy to get started: www.embedded-vision.com/platinum-members/bdti/embedded-vision-training/downloads/pages/OpenCVVMWareImage
@Alaskman: on depth-of-field focus. I briefly mentioned in my talk that the other major alternatives to mosaic subsampling are the co-sited Foveon approach and the Lytro approach, also known as a light-field or plenoptic camera, where you can control depth of field from a single captured picture. Give them a look if you are interested.
Back in the day, I maintained starlight PTZ cameras for Alyeska Pipeline. Of course, security also had video cameras everywhere, and process control used cameras for everything from boiler fireboxes to pipe examination. It became apparent that one of the Achilles' heels in all these systems was the bunch of neurons at the end of the chain. People would miss the most obvious things happening. What finally stood out was an evolving collection of software that responded to video changes automatically. A security camera can archive hours of video, but software can find the instances where a person enters the FOV. This is a machine vision application! I think THE most important role will be performed by post-processing software modules, all using the same image sequence input.
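To make the "software that responds to video changes" idea concrete, here is a minimal sketch of frame differencing in plain Python. It assumes grayscale frames stored as lists of rows; the function names and the threshold values are made up for illustration and are not taken from any particular product.

```python
def changed_fraction(prev, curr, threshold=25):
    """Fraction of pixels whose gray level changed by more than `threshold`."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > threshold:
                changed += 1
    return changed / total if total else 0.0


def find_events(frames, threshold=25, min_fraction=0.05):
    """Indices of frames that differ noticeably from the frame before them."""
    events = []
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i], threshold) >= min_fraction:
            events.append(i)
    return events


# Hypothetical 8x8 scene: an empty background, then a bright blob
# (standing in for a person) enters the field of view and leaves again.
empty = [[10] * 8 for _ in range(8)]
person = [row[:] for row in empty]
for y in range(2, 6):
    for x in range(3, 5):
        person[y][x] = 200

frames = [empty, empty, person, person, empty]
events = find_events(frames)
print(events)  # frame indices where the scene changed
```

A real system would add noise filtering and a background model that adapts to lighting, but the principle is the same: the archive is scanned by software, and a person only looks at the flagged frames.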
@Alaskaman66: on sensors optimized for particular applications.
Great question, and indeed there are infrared sensors and other types used for particular applications, but in the end it comes down to whether a market is large enough for sensor manufacturers to create such specialized sensors. In the meantime, as I alluded to in my talk, you can create specialized hardware/software blocks that take advantage of your application. For example, you can extract edges and background information directly and early in the sensor pipeline in order to do better object segmentation. Many of these blocks are available as IP cores.
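As a rough software illustration of the edge extraction mentioned above, here is a minimal Sobel-style sketch in plain Python. It is an assumption about what such a block might compute, not the implementation of any specific IP core; the function name and the |gx| + |gy| magnitude approximation are illustrative choices.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude |gx| + |gy| using 3x3 Sobel kernels.

    `img` is a grayscale image as a list of rows. Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


# Hypothetical 6x6 test image: dark left half, bright right half.
step = [[0, 0, 0, 100, 100, 100] for _ in range(6)]
edges = sobel_magnitude(step)
for row in edges:
    print(row)
```

The response is non-zero only in the two columns on either side of the step, which is the kind of compact edge map a downstream segmentation stage could consume instead of the full image.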
@Anatoliy1086: embedded vision kits available. I encourage you to go to www.xilinx.com and look at the "Applications" section where you will find several kits available. They are not specifically labelled 'embedded vision' but are arranged by industry. For example, in the Broadcast industry you would look at the RTVE (Real Time Video Engine) and in the Industrial section at the IVK (Industrial Video Kit). Thanks!
@jbswindle, no problem. Check out, then, the various Apical articles on the site, specifically those dealing with dynamic range processing. (Apical will also be presenting tomorrow, and you can re-ask your question then.)
@atlantl: on licensing IP cores: IP components are licensed not only by Xilinx but also by its partners. Whether there is a fee depends on the particular function. Typically, all the embedded IP is provided as part of the Xilinx design tools and is ready to use to build embedded systems. A Xilinx IP core (LogiCORE) is supported for multiple device families, and this support is clearly specified in the documentation and the tools.
Thanks dipert_bdti. That page, though quite interesting, seems to deal with optical distortion such as fish-eye lens distortion. I'm talking about undesired attenuated video levels, not undesired pixel displacement.
@btwolfe: on 'pipelined' bus. Well, indeed, Xilinx has adopted AXI4 as its interconnection infrastructure. It is not technically a 'bus' but a system of interconnection that promotes interoperability for all IP cores / processing blocks. AXI is an industry standard and therefore accessible to all (i.e., non-proprietary).
Perhaps one should work backward from the application: let's look at facial recognition software. What information does it need? Can the software be implemented earlier in the image processing chain? If we can toss out some of the "bells and whistles" at the sensor/image end, I would bet the bandwidth, data handling requirements, and power needs would drop substantially. Of course, the acquired working images might be unrecognizable to the human eye.
@hdw5d6: on pipeline approaches. I'm afraid I do not understand the question... sorry. The good thing about implementation using an 'all-programmable' approach in an FPGA is that you can do almost anything with your data formats. In most 'popular' approaches, there is always an IP core (pre-packaged core) that will do the job. Also, there are open source cores that, while not fully verified/validated in a specific device, provide good starting points.
I asked about shading correction for its potential contribution to feature extraction such as that required for OCR. OCR binarization algorithms attempt to deal with adjacent pixel-level changes due to both noise and background brightness changes (perhaps due to background artwork). I haven't seen anything about shading correction outside of telecine/live studio video camera environments.
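To illustrate why background brightness changes matter for binarization, here is a minimal sketch in plain Python comparing a single global threshold with a local-mean threshold. This is one common family of approaches, not a claim about what any particular OCR package does; the function name, window radius, offset, and the synthetic test image are all assumptions for illustration.

```python
def local_mean_binarize(img, radius=2, offset=15):
    """Mark a pixel as ink (1) if it is darker than its neighborhood mean
    by more than `offset`; otherwise mark it as background (0)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clip the window at the image borders.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            mean = sum(window) / len(window)
            row.append(1 if img[y][x] < mean - offset else 0)
        out.append(row)
    return out


# Hypothetical shaded page: background brightens from 100 (left) to 220 (right),
# with two ink dots that are each 60 levels darker than the local background.
page = [[100 + 10 * x for x in range(13)] for _ in range(7)]
page[3][2] -= 60
page[3][10] -= 60

local_ink = sum(sum(row) for row in local_mean_binarize(page))
global_ink = sum(1 for row in page for v in row if v < 150)
print("local-mean ink pixels:", local_ink)
print("global-threshold ink pixels:", global_ink)
```

With the shading gradient present, the global threshold labels a large part of the darker side of the page as ink, while the local-mean version picks out only the two dots. Correcting the shading before binarization would attack the same problem from the other end.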
Great question. Due to the time limitation, I did not go into lenses, but you are correct. Lens aberrations are very important to address, especially in applications where you are forced to use small lenses. These two-dimensional aberrations can be corrected digitally. What you want to look for is an IP core that does general de-warping, or one dedicated to a specific correction such as vignetting or lens shading correction (especially in the corners).
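As a rough sketch of what digital lens shading correction can look like, here is a flat-field approach in plain Python: capture a reference image of a uniformly lit target, derive a per-pixel gain map from it, and multiply each later frame by that map. This is one common calibration-based technique, assumed here for illustration; it is not the method of any specific IP core, and the function names and sample values are hypothetical.

```python
def gain_map_from_flat(flat):
    """Per-pixel gain that would make the flat reference image uniform.

    Assumes every pixel in `flat` is non-zero.
    """
    peak = max(max(row) for row in flat)
    return [[peak / v for v in row] for row in flat]


def apply_gain(img, gains, max_val=255):
    """Multiply each pixel by its gain, rounding and clipping to `max_val`."""
    return [
        [min(max_val, round(p * g)) for p, g in zip(row_img, row_gain)]
        for row_img, row_gain in zip(img, gains)
    ]


# Hypothetical 3x3 capture of a uniform white target: corners are darkest.
flat = [
    [100, 150, 100],
    [150, 200, 150],
    [100, 150, 100],
]
gains = gain_map_from_flat(flat)

# A uniform gray scene seen through the same lens shows the same falloff.
scene = [
    [50, 75, 50],
    [75, 100, 75],
    [50, 75, 50],
]
corrected = apply_gain(scene, gains)
for row in corrected:
    print(row)
```

Note that the gain also amplifies noise in the corners, so in practice the maximum gain is often limited. A hardware block would typically store a coarse gain grid and interpolate rather than keep one gain per pixel.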
Alaskaman: One opportunity would be a monochrome camera (with all the resolution of the current cameras but no Bayer color filter in front of the sensor). Unfortunately, you and I would be buying hundreds or thousands but the mobile phone camera vendors buy millions of full-color cameras.
Many sensors are provided with an integrated image signal processor (ISP) module. In some systems, if you have access to the raw data or need to implement a specific algorithm, an external processing device (FPGA) can be used. It really depends on your application.
Many of the technical characteristics of sensors and video processing are related to the peculiarities of human vision: tri-stimulus response ratios, gamma correction, color gamut spaces, frame rates, etc. Obviously, sensor manufacturers cater to the "human use" market. Why not go back to square 1 and design the sensor system optimized for machine vision? For example, if speed (frame rate) is necessary for a fast production line, maybe one doesn't need to bother with high resolution or color. Anyone out there using such an approach?
I'll take this opportunity to respond to a question from yesterday's session. Several people asked for references on the example applications I described on slide 7 in Monday's presentation. Please note that there are multiple commercially available products in each of these categories. I'm including one example of each here. (Please note that I have mangled the URLs since valid URLs are apparently rejected by the chat software. Replace the "-dot-" with a simple "." in each instance and you're good to go.)
Heart rate from video: www-dot-vitalsignscamera-dot-com/
Hello everyone. Thanks for attending. First, on the question of IP cores: these are logic cores that simplify your design and are available from FPGA vendors or from third-party partners. Sometimes, they are bundled in design suites.
What about correction for optical, mechanical, and pixel vignetting effects? In analog days this kind of shading error was corrected with scaled parabola signals added to the baseband video at horizontal and vertical rates. Gradient shading errors were corrected with scaled sawtooth addition at horizontal and vertical rates. What, if anything, is done in the digital domain to correct shading errors?
It's a shame the website doesn't offer a "test audio" player; today I'm on a different computer and would hate to discover at 14:00 that the audio player is incompatible with this computer, this browser, or my browser's particular plug-ins.
Are they robots or androids? We're not exactly sure. Each talking, gesturing Geminoid looks exactly like a real individual, starting with their creator, professor Hiroshi Ishiguro of Osaka University in Japan.
For industrial control applications, or even a simple assembly line, that machine can go almost 24/7 without a break. But what happens when the task is a little more complex? That’s where the “smart” machine would come in. The smart machine is one that has some simple (or complex in some cases) processing capability to be able to adapt to changing conditions. Such machines are suited for a host of applications, including automotive, aerospace, defense, medical, computers and electronics, telecommunications, consumer goods, and so on. This discussion will examine what’s possible with smart machines, and what tradeoffs need to be made to implement such a solution.