SignAll is slowly but surely building a sign language translation platform
From TechCrunch - February 14, 2018

Translating is difficult work, the more so the further two languages are from one another. French to Spanish? Not a problem. Ancient Greek to Esperanto? Considerably harder. But sign language is a unique case, and translating it uniquely difficult, because it is fundamentally different from spoken and written languages. All the same, SignAll has been working hard for years to make accurate, real-time machine translation of ASL a reality.

One would think that with all the advances in AI and computer vision happening right now, a problem as interesting and beneficial to solve as this would be under siege by the best of the best. Even thinking about it from a cynical market-expansion point of view, an Echo or TV that understands sign language could attract millions of new (and very thankful) customers.

Unfortunately, that doesn't seem to be the case, which leaves it to small companies like Budapest-based SignAll to do the hard work that benefits this underserved group. And it turns out that translating sign language in real time is even more complicated than it sounds.

CEO Zsolt Robotka and chief R&D officer Márton Kajtár were exhibiting this year at CES, where I talked with them about the company, the challenges they were taking on and how they expect the field to evolve. (I'm glad to see the company was also at Disrupt SF in 2016, though I missed them then.)

Perhaps the most striking thing to me about the whole business is just how complex the problem they are attempting to solve turns out to be.

"It's multi-channel communication; it's really not just about shapes or hand movements," explained Robotka. "If you really want to translate sign language, you need to track the entire upper body and facial expressions. That makes the computer vision part very challenging."

Right off the bat that's a difficult ask, since that's a huge volume in which to track subtle movement. The setup right now uses a Kinect 2 more or less at center and three RGB cameras positioned a foot or two out. The system must reconfigure itself for each new user, since just as everyone speaks a bit differently, all ASL users sign differently.
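
To make that concrete, here is a rough, hypothetical sketch in Python of what a rig description and per-user calibration profile of the kind described might look like. The camera names, offsets and profile fields are illustrative only, not SignAll's actual system.

```python
# Hypothetical sketch (not SignAll's code) of the rig described above: one
# depth sensor roughly at center plus RGB cameras a foot or two to the sides,
# with a per-user calibration profile, since every ASL user signs differently.
from dataclasses import dataclass

@dataclass
class Camera:
    name: str
    kind: str            # "depth" or "rgb"
    offset_m: tuple      # position relative to rig center, in meters (illustrative)

@dataclass
class UserProfile:
    user_id: str
    hand_scale: float = 1.0                   # relative hand size
    signing_space_m: tuple = (1.2, 1.0, 0.8)  # width, height, depth of tracked volume

RIG = [
    Camera("kinect2", "depth", (0.0, 0.0, 0.0)),
    Camera("rgb_left", "rgb", (-0.45, 0.0, 0.10)),
    Camera("rgb_top", "rgb", (0.0, 0.45, 0.10)),
    Camera("rgb_right", "rgb", (0.45, 0.0, 0.10)),
]

def calibrate(user_id: str) -> UserProfile:
    """Stub for the per-user reconfiguration step: a real system would
    estimate body proportions and hand size from sample frames here."""
    return UserProfile(user_id=user_id)

print(calibrate("demo_user"))
```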

"We need this complex configuration because then we can work around the lack of resolution, both time and spatial (i.e. refresh rate and number of pixels), by having different points of view," said Kajtár. "You can have quite complex finger configurations, and the traditional methods of skeletonizing the hand don't work because they occlude each other. So we're using the side cameras to resolve occlusion."
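
Kajtár's point about side cameras boils down to combining evidence from multiple viewpoints: a joint that one camera loses to occlusion may still be clearly visible from another angle. Below is a minimal, hypothetical sketch of that idea, with made-up confidence values and no claim to reflect SignAll's actual pipeline.

```python
# Toy multi-view fusion: weight each camera's hand-keypoint estimate by how
# confidently that view saw each joint, so occluded joints lean on the views
# that still see them. Purely illustrative.
import numpy as np

def fuse_hand_keypoints(per_view_keypoints, per_view_confidence):
    """per_view_keypoints: (views, joints, 3) 3D estimates per camera view.
    per_view_confidence: (views, joints) detection confidence per joint.
    Returns a (joints, 3) fused estimate as a confidence-weighted average."""
    kp = np.asarray(per_view_keypoints, dtype=float)
    conf = np.asarray(per_view_confidence, dtype=float)
    weights = conf / np.clip(conf.sum(axis=0, keepdims=True), 1e-6, None)
    return (kp * weights[..., None]).sum(axis=0)

# Example: the center view loses a fingertip (confidence 0.1), a side view
# still sees it (confidence 0.9), so the fused result follows the side view.
center = [[0.0, 0.0, 0.5]]
side = [[0.02, 0.01, 0.48]]
print(fuse_hand_keypoints([center, side], [[0.1], [0.9]]))
```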

As if that wasn't enough, facial expressions and slight variations in gestures also inform what is being said, for example adding emotion or indicating a direction. And then there's the fact that sign language is fundamentally different from English or any other common spoken language. This isn't transcription; it's full-on translation.

Continue reading at TechCrunch »