Deep learning and cloud computing help perform security-related tasks

Umbo Computer Vision leverages deep learning and cloud computing to provide autonomous video security systems.


Most video security systems rely on error-prone people to monitor all the hours of video captured each day by millions of cameras installed around the world. But thanks to rapid advances in artificial intelligence (AI), deep learning and cloud computing, which provide a more natural understanding of scenes, objects and people within video, autonomous surveillance technology stands poised to help humans better perform security-related tasks.


If the algorithm determines that something noteworthy has occurred, an alert is sent through the web-based interface and the Umbo mobile apps.

“Human operators have to maintain a high level of concentration and divide that attention to monitor multiple occurrences in a single location,” explains Shawn Guan, CEO, Umbo Computer Vision (San Francisco, CA, USA), a company that leverages deep learning and cloud computing to provide autonomous video security systems. “In most states within USA and in the UK it is a licensed profession, requiring an approved training course over 3 days.”


Despite this training, notes Guan, research has found that the prevalence of human error is much higher than popularly perceived. Humans are simply not that great at monitoring for rare events across multiple video streams, with an error rate that fluctuates depending on many different, unpredictable circumstances. But with millions of hours of video being recorded each day, and limited resources for monitoring, the number of monitors that a security operator must observe is on the rise.


Umbo Learning Cameras are installed like any other security camera, and Umbo Light leverages deep learning and cloud computing to help people perform security-related tasks.

That’s one reason why Umbo Computer Vision’s security camera systems are currently in operation at California’s Summerhill Residential Community, Taiwan’s National Chung-Hsing University, and other locations around the world. While the Umbo Light system, as it’s called, can take in an RTSP stream from third-party cameras such as those from Axis Communications (Lund, Sweden), the computer vision works best with Umbo cameras such as the SmartDome or the SmartBullet, explains Guan.

Unlike many previous-generation cameras, these cameras are cloud-first devices. This means that they do not use NVRs or servers. Instead, they are connected and powered by a Power over Ethernet (PoE) cable and stream video data directly to media servers and storage buckets hosted in the cloud. These servers and storage buckets are hosted by a trio of cloud providers: Amazon Web Services (Seattle, WA, USA), Microsoft Azure (Redmond, WA, USA), and Google Cloud (Mountain View, CA, USA).

“The complexities of calibration and detection are hidden from the user with a simple web-based dashboard,” says Guan. “Within this dashboard, users can easily live stream their current cameras, watch historical video footage, and review a stream of events that have recently occurred within their zones of interest. These zones of interest are drawn directly on the scene. Multiple zones can be drawn and they can be configured so that users can receive alerts from them during specific times or if the person within them lingers there for an extended period of time.”
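As a rough illustration of the zone logic Guan describes, the following Python sketch checks whether a tracked person has lingered inside a drawn zone during an alerting window. It is not Umbo's implementation: the `Zone` fields, the `should_alert` function, the track format of (timestamp, x, y) samples and the ray-casting polygon test are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Zone:
    """Hypothetical zone of interest drawn on the camera scene."""
    polygon: list         # (x, y) vertices in image coordinates
    active_from: time     # start of the alerting window
    active_until: time    # end of the alerting window
    max_dwell_s: float    # lingering threshold, in seconds


def point_in_polygon(x, y, polygon):
    """Ray-casting test: toggle on each edge crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def is_active(zone, now):
    """True if `now` is inside the window, including windows spanning midnight."""
    if zone.active_from <= zone.active_until:
        return zone.active_from <= now <= zone.active_until
    return now >= zone.active_from or now <= zone.active_until


def should_alert(zone, track, now):
    """Alert if one person stayed in the zone for at least max_dwell_s.

    `track` is an assumed list of (timestamp_s, x, y) samples, oldest first.
    """
    if not is_active(zone, now):
        return False
    entered_at = None
    for t, x, y in track:
        if point_in_polygon(x, y, zone.polygon):
            if entered_at is None:
                entered_at = t
            if t - entered_at >= zone.max_dwell_s:
                return True
        else:
            entered_at = None  # leaving the zone resets the dwell timer
    return False
```

A zone covering a loading dock, for example, might be given a 22:00 to 06:00 window and a 30-second threshold, so that a person passing through briefly does not raise an alert but one who stays does.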

Video scenes are processed by Umbo’s cloud providers using custom computer vision algorithms. Once Light determines that something noteworthy has occurred within the region of interest, it sends an alert through both the web-based interface and the Umbo mobile apps, which are available on iOS and Android. The computer vision algorithm is trained to function across many environments and conditions with as little setup and configuration as possible, according to Guan.

“The future of any technology is based on its ability to solve problems that humans deem important. Law enforcement and public safety agencies globally experience a broad range of potential security threats like violence, shootings, theft, rape, and riots to name just a few. If computer vision technologies can help security managers identify these events as they happen then it would revolutionize the way we police and protect our societies today.”

