Microsoft wants AI to be more helpful for people who are blind or use wheelchairs

Researchers are building diverse training data sets that include information from people with low vision and individuals living with conditions like ALS.

vadimguzhva, Getty Images/iStockphoto

People who are blind, use a wheelchair, or have autism are often early adopters of technology for everyday tasks like communicating, reading, and traveling. Artificial intelligence powers many of these services, such as voice and object recognition. In many cases, these products are trained on data from able-bodied or neurotypical people. This means that the algorithms may have a limited understanding of body types, communication styles, and facial expressions.

Microsoft is working with researchers and advocacy groups to solve this data problem and build data sets that better reflect all types of users and real-world scenarios. Microsoft put the challenges in context in a post published on Oct. 12 on the company's AI Blog:

"If a self-driving car's pedestrian detection algorithms haven't been shown examples of people who use wheelchairs or whose posture or gait is different due to advanced age, for example, they may not correctly identify those people as objects to avoid or estimate how much longer they need to safely cross a street, researchers noted.

"AI models used in hiring processes that try to read personalities or interpret sentiment from potential job candidates can misread cues and screen out qualified candidates with autism or who emote differently. Algorithms that read handwriting may not be able to cope with examples from people who have Parkinson's disease or tremors. Gesture recognition systems may be confused by people with amputated limbs or different body shapes."

"This really points to the question of how 'normal' is defined by AI systems and who gets to decide that," Kate Crawford, senior principal researcher at Microsoft Research New York and co-founder of the company's Fairness, Accountability, Transparency and Ethics (FATE) in AI group, said in the blog post.

The research topics range from personalized image recognition for people who are blind or have low vision to improved facial recognition for people with amyotrophic lateral sclerosis (ALS). Microsoft researchers also are studying how often public datasets used to train AI systems include data from people older than 80. Age correlates strongly with disability, so having data from older adults could make algorithms smarter when it comes to aging. Here are some of the projects that Microsoft is supporting with funding or technical expertise.

Object Recognition for Blind Image Training (ORBIT): This project is building a public data set from images taken by people who are blind or have low vision. The goal is to personalize image recognition so that an algorithm could identify a particular cane or set of keys. Generic object recognition can't do that.

VizWiz data set: University of Texas at Austin researchers are building on a data set that was started at Carnegie Mellon University. The goal is to work with people who are blind or have low vision to better understand their expectations of AI captioning tools and to improve how computer vision algorithms interpret photos taken by people who are blind. Danna Gurari, assistant professor at the University of Texas at Austin, is building a new public dataset to train, validate, and test image captioning algorithms. It includes more than 39,000 images taken by blind and low-vision participants.

Project Insight: This project, in collaboration with Team Gleason, will create an open dataset of facial imagery of people living with ALS to improve computer vision and train related AI models on a broader dataset. Team Gleason is a nonprofit that helps people living with ALS by providing them with innovative technology, equipment, and other support.
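
For readers who want a concrete picture of the dataset audit mentioned earlier (checking how often training data includes people older than 80), here is a minimal Python sketch. It is not Microsoft's method. The bracket boundaries, the `age` field name, and the functions `bracket_for` and `age_representation` are all assumptions made for illustration, and the sample records are made up.

```python
from collections import Counter

# Hypothetical age brackets; the article only singles out "older than 80".
BRACKETS = [(0, 17, "0-17"), (18, 39, "18-39"), (40, 64, "40-64"), (65, 80, "65-80")]


def bracket_for(age):
    """Map an age in years to a bracket label; anything above 80 is '81+'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for low, high, label in BRACKETS:
        if low <= age <= high:
            return label
    return "81+"


def age_representation(records):
    """Return the share of records per age bracket.

    Records with no recorded age are counted as 'unknown' rather than
    dropped, so gaps in the metadata stay visible in the result.
    """
    counts = Counter(
        bracket_for(r["age"]) if r.get("age") is not None else "unknown"
        for r in records
    )
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Made-up metadata for illustration only, not real dataset statistics.
sample = [{"age": 25}, {"age": 34}, {"age": 71}, {"age": 83}, {"age": None}]
shares = age_representation(sample)
print(shares.get("81+", 0.0))
```

A low or zero share for the oldest bracket would suggest that a model trained on the data has seen few examples from older adults. Many public datasets carry no age metadata at all, which is why the sketch reports "unknown" as its own category.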

Researchers and advocates can apply for grants from Microsoft's AI for Accessibility fund to support their work.
