Camera captures voices without a microphone

Andy Wilson

June 11, 2013


Yasuhiro Oikawa of Waseda University in Tokyo pointed a high-speed camera at the throat of a volunteer with one task in mind: to capture the volunteer's voice without the use of a microphone.

Yes, you read that correctly. Oikawa and his team announced at the International Congress on Acoustics on June 3 that they used cameras to take thousands of images per second and record the motions of a person’s neck and voice box as they spoke. A computer program then turned the recorded vibrations into sound waves.

Why did they do this, you ask? Some lip-reading software programs are sophisticated enough to recognize different languages, but the end result doesn’t usually involve much more than a transcript, according to a Science News article. In addition, microphones often record too much background noise, so Oikawa and his colleagues, looking for a new method of capturing vocal tones, came up with this idea.

The article explains that the researchers pointed the camera at the throats of two volunteers and had them say the Japanese word tawara, which means straw bale or bag. The team recorded them at 10,000 frames per second (fps) and, at the same time, recorded the volunteers’ words with a standard microphone and a vibrometer for comparison. The vibrations derived from the camera data were similar to the ones from the microphone and vibrometer, Oikawa said in the article.

After running the images through a computer program, the team reconstructed the volunteers’ voices well enough to hear and understand them saying tawara. Mechanical engineer Weikang Jiang of Shanghai Jiao Tong University in China noted that Oikawa did not play audio of the reconstructed voices, but instead showed comparison photos of the sound waves and vibrations.
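The article does not say how the team's program works, so the following Python sketch is only an illustration of the general idea, not Oikawa's method. It assumes that some earlier image-processing step (not shown) has already reduced each video frame to a single skin-displacement value, and it uses a made-up 200 Hz test signal in place of real measurements. The function names, the output file name, and the test signal are all hypothetical. One thing that does follow directly from the reported numbers: at 10,000 fps each frame is one audio sample, so the highest frequency that can be represented is 5,000 Hz.

```python
import math
import os
import struct
import tempfile
import wave

FRAME_RATE = 10_000  # frames per second, the capture rate reported in the article


def displacement_to_pcm(displacements):
    """Convert per-frame displacement values to 16-bit PCM samples."""
    mean = sum(displacements) / len(displacements)
    centered = [d - mean for d in displacements]   # remove the static offset
    peak = max(abs(c) for c in centered) or 1.0    # avoid dividing by zero
    return [int(round(32767 * c / peak)) for c in centered]


def write_wav(path, samples, sample_rate=FRAME_RATE):
    """Write mono 16-bit samples to a WAV file, one sample per video frame."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


# Hypothetical input: one second of frames showing a 0.01 mm, 200 Hz
# vibration riding on a constant 1.5 mm offset. Real data would come
# from tracking the throat surface in the high-speed images.
displacements = [
    1.5 + 0.01 * math.sin(2 * math.pi * 200 * n / FRAME_RATE)
    for n in range(FRAME_RATE)
]

samples = displacement_to_pcm(displacements)
out_path = os.path.join(tempfile.gettempdir(), "reconstructed.wav")
write_wav(out_path, samples)
```

Subtracting the mean and scaling to the peak value is a deliberately simple choice; a real system would likely also need filtering to suppress camera noise and slow head or neck movement.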

Like Jiang, I am interested to hear what the audio sounds like.

About the Author

Andy Wilson

Founding Editor

Founding editor of Vision Systems Design. Industry authority and author of thousands of technical articles on image processing, machine vision, and computer science.

B.Sc., Warwick University

Tel: 603-891-9115
Fax: 603-891-9297
