Academic researchers develop new activity recognition algorithm

May 14, 2014
Hamed Pirsiavash, a postdoc at MIT, and his former thesis advisor, Deva Ramanan of the University of California at Irvine, have developed a new activity recognition algorithm that uses techniques from natural language processing to enable computers to more efficiently search video for actions.

While previous algorithms that perform similar tasks have been developed, the new algorithm reportedly has a number of advantages over its predecessors. According to the MIT news release, these include:

  • Execution time. The new algorithm’s execution time scales linearly with the size of the video file it’s searching, meaning that if one file is 10 times larger than another, the algorithm will take 10 times as long to search it, not 1,000 times longer, as with earlier algorithms.
  • Predicting actions. The algorithm can observe a partially completed action and output a probability that it is the action being searched for. It may revise that estimate as the video continues, but it does not have to wait until the action is complete to assess it.
  • Fixed memory. Regardless of how many frames of video the algorithm has reviewed, the amount of memory it requires is fixed, meaning that, unlike many of its predecessors, it can handle video streams of any length or size. (A rough code sketch of these three properties follows the list.)
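To make those three properties concrete, here is a minimal sketch in Python. It is not the authors' published method; it stands in a simple two-state streaming filter (background vs. inside the action) whose model, parameters, and per-frame scores are all illustrative assumptions. It makes a single pass over the frames (linear time), carries only two numbers between frames (fixed memory), and reports a running probability that the target action is underway, revising it as frames arrive.

```python
import math

def logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

class OnlineActionDetector:
    """Toy two-state forward filter: 'background' vs. 'inside the action'.

    Illustrative sketch only; a stand-in for the detector described above.
    """

    def __init__(self, p_enter=0.01, p_exit=0.05):
        # The log-probabilities of the two states are the detector's *entire*
        # persistent state, so memory stays fixed for any stream length.
        self.log_bg = math.log(1.0 - 1e-3)
        self.log_fg = math.log(1e-3)
        self.log_enter = math.log(p_enter)          # background -> action
        self.log_stay_bg = math.log(1.0 - p_enter)
        self.log_exit = math.log(p_exit)            # action -> background
        self.log_stay_fg = math.log(1.0 - p_exit)

    def update(self, log_like_fg, log_like_bg=0.0):
        """Fold in one frame; the log-likelihoods would come from some
        per-frame appearance model (assumed, not specified by the source)."""
        new_bg = logsumexp2(self.log_bg + self.log_stay_bg,
                            self.log_fg + self.log_exit) + log_like_bg
        new_fg = logsumexp2(self.log_fg + self.log_stay_fg,
                            self.log_bg + self.log_enter) + log_like_fg
        z = logsumexp2(new_bg, new_fg)              # renormalize each step
        self.log_bg, self.log_fg = new_bg - z, new_fg - z
        return math.exp(self.log_fg)  # P(action underway | frames so far)
```

One pass over the stream gives the linear scaling, and the probability can be read out mid-action rather than only at the end (`video_stream` and `score_frame()` below are hypothetical):

```python
detector = OnlineActionDetector()
for frame in video_stream:                    # one pass: linear in video length
    p = detector.update(score_frame(frame))   # score_frame(): assumed classifier
    # p is available immediately and is revised as more frames arrive
```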

Pirsiavash and Ramanan’s algorithm borrows from parsing algorithms used in natural language processing, the field of computer science concerned with the interactions between computers and human (natural) languages. In the MIT news release, Pirsiavash explains how such a parsing algorithm applies to activity prediction.

"One of the challenging problems they try to solve is, if you have a sentence, you want to basically parse the sentence, saying what is the subject, what is the verb, what is the adverb," Pirsiavash said. "We see an analogy here, which is, if you have a complex action — like making tea or making coffee — that has some subactions, we can basically stitch together these subactions and look at each one as something like verb, adjective, and adverb."

About the Author

James Carroll

Former VSD Editor James Carroll joined the team in 2013. Carroll covered machine vision and imaging from numerous angles, including application stories, industry news, market updates, and new products. In addition to writing and editing articles, Carroll managed the Innovators Awards program and webcasts.
