Researchers’ Artificial Intelligence-Based Speech Sound Therapy Software Wins $2.5M NIH Grant

Three Syracuse University researchers, supported by a recent $2.5 million grant from the National Institutes of Health, are working to refine a clinically intuitive automated system that may improve treatment for speech sound disorders while alleviating the impact of a worldwide shortage of speech-language clinicians.

The project, “Intensive Speech Motor Chaining Treatment and Artificial Intelligence Integration for Residual Speech Sound Disorders,” is funded for five years. Jonathan Preston, associate professor of communication sciences and disorders, is principal investigator. Preston is the inventor of Speech Motor Chaining, a treatment approach for individuals with speech sound disorders. Co-principal investigators are Asif Salekin, assistant professor of electrical engineering and computer science, whose expertise is creating interpretable and fair human-centric artificial intelligence-based systems, and Nina Benway, a recent graduate of the communication sciences and disorders/speech-language pathology doctoral program.

Their system uses the evidence-based Speech Motor Chaining software, an extensive library of speech sounds and artificial intelligence to “think” and “hear” the way a speech-language clinician does.

The project focuses on the most effective scheduling of Speech Motor Chaining sessions for children with speech sound disorders and also examines whether artificial intelligence can enhance Speech Motor Chaining—a topic Benway explored in her dissertation. The work is a collaboration between Salekin’s Laboratory for Ubiquitous and Intelligent Sensing in the College of Engineering and Computer Science and Preston’s Speech Production Lab in the College of Arts and Sciences.

Clinical Need

In speech therapy, learners usually meet with a clinician one-on-one to practice speech sounds and receive feedback. If the artificial intelligence version of Speech Motor Chaining (“ChainingAI”) accurately replicates a clinician’s judgment, it could help learners get high-quality practice on their own between clinician sessions. That could help them achieve the intensity of practice that best helps overcome a speech disorder.

The software is meant to supplement, not replace, the work of speech clinicians. “We know that speech therapy works, but there’s a larger issue about whether learners are getting the intensity of services that best supports speech learning,” Benway says. “This project looks at whether AI-assisted speech therapy can increase the intensity of services through at-home practice between sessions with a human clinician. The speech clinician is still in charge, providing oversight, critical assessment and training the software on which sounds to say are correct or not; the software is simply a tool in the overall arc of clinician-led treatment.”

170,000 Sounds

A library of 170,000 correctly and incorrectly pronounced “r” sounds was used to train the system. The recorded sounds were made by 400-plus children over 10 years, collected by researchers at Syracuse, Montclair and New York Universities, and filed at the Speech Production Lab.

Benway wrote ChainingAI’s patent-pending speech analysis and machine learning operating code, which converts audio from speech sounds into recognizable numeric patterns. The system was taught to predict which patterns represent “correct” or “incorrect” speech. Predictions can be customized to individuals’ speech patterns.
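As a rough illustration of the idea described above, and not ChainingAI's patent-pending code (the article does not describe its features or model), the Python sketch below reduces audio to a small numeric pattern and learns which patterns count as "correct" or "incorrect." The two toy features (RMS energy and zero-crossing rate), the nearest-centroid classifier, the function names and the synthetic sine-wave "recordings" are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Features = Tuple[float, float]


def features(samples: Sequence[float]) -> Features:
    """Reduce raw audio samples to two toy numbers: RMS energy and zero-crossing rate."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (n - 1)


def train(examples: List[Tuple[Sequence[float], str]]) -> Dict[str, Features]:
    """Average the feature vectors of each label to get one centroid per label."""
    grouped: Dict[str, List[Features]] = {}
    for samples, label in examples:
        grouped.setdefault(label, []).append(features(samples))
    return {
        label: (
            sum(v[0] for v in vecs) / len(vecs),
            sum(v[1] for v in vecs) / len(vecs),
        )
        for label, vecs in grouped.items()
    }


def predict(model: Dict[str, Features], samples: Sequence[float]) -> str:
    """Return the label whose centroid is closest to the new sound's features."""
    point = features(samples)
    return min(model, key=lambda label: math.dist(point, model[label]))


def tone(freq_hz: float, n: int = 800, rate: int = 8000) -> List[float]:
    """Synthetic sine wave standing in for a recording (hypothetical data)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


# Train on labeled examples, then classify a new sound.
model = train([
    (tone(200), "correct"), (tone(250), "correct"),
    (tone(1500), "incorrect"), (tone(1700), "incorrect"),
])
print(predict(model, tone(220)))
```

In a sketch like this, customizing predictions to an individual could be as simple as retraining the centroids with that learner's own labeled examples. A real system would rely on far richer acoustic features and models.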

During speech practice, the code works in real time with Preston’s Speech Motor Chaining website to sense, sort and interpret patterns in speech audio to “hear” whether a sound is made correctly. The software provides audio feedback (announcing “correct” or “not quite”), offers tongue-position reminders and tongue-shape animations to reinforce proper pronunciation, then selects the next practice word based on whether or not the child is ready to increase word difficulty.
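The practice loop described above might look something like the following Python sketch. It is only an illustration under stated assumptions: the class name, the word lists, the five-attempt window and the 80% accuracy threshold are hypothetical and do not come from the Speech Motor Chaining software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PracticeSession:
    levels: List[List[str]]          # word lists, easiest first (hypothetical)
    window: int = 5                  # number of recent attempts to consider
    threshold: float = 0.8           # accuracy needed to move up a level
    level: int = 0
    recent: List[bool] = field(default_factory=list)
    position: int = 0

    def record(self, correct: bool) -> str:
        """Store one classifier verdict and return the spoken feedback."""
        self.recent = (self.recent + [correct])[-self.window:]
        ready = (
            len(self.recent) == self.window
            and sum(self.recent) / self.window >= self.threshold
        )
        if ready and self.level < len(self.levels) - 1:
            # The learner is ready for harder words: move up and start fresh.
            self.level += 1
            self.recent = []
            self.position = 0
        return "correct" if correct else "not quite"

    def next_word(self) -> str:
        """Cycle through the words at the current difficulty level."""
        words = self.levels[self.level]
        word = words[self.position % len(words)]
        self.position += 1
        return word


session = PracticeSession(levels=[["ray", "row"], ["rabbit", "rocket"]])
for verdict in [True, True, True, True, False]:
    print(session.next_word(), "->", session.record(verdict))
print("level:", session.level)
```

Here each `verdict` stands in for the classifier's judgment of one attempt. In practice, the clinician would decide what counts as readiness to advance.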

Early Promise

The system shows greater potential than prior systems that have been developed to detect speech sound errors, according to the researchers.

Until now, Preston says, automated systems have not been accurate enough to provide much clinical value. This study overcomes issues that hindered previous efforts: Its audio dataset of residual speech sound disorder examples is larger; it recognizes incorrect sounds more accurately; and clinical trials are assessing its therapeutic benefit.

“There has not been a clinical therapy system that has explicitly used AI machine learning to recognize correct and distorted ‘r’ sounds for learners with residual speech sound disorders,” Preston says. “The data collected so far shows this system is performing well in relation to what a human clinician would say in the same circumstances and that learners are improving speech sounds after using ChainingAI.”

So Far, Just ‘R’

The experiment is currently focused on the “r” sound, the most common speech error persisting into adolescence and adulthood, and only on American English. Eventually, the researchers hope to expand software functionality to “s” and “z” sounds, different English dialects and other languages.

Ethical AI

The researchers have considered ethical aspects of AI throughout the initiative. “We’ve made sure that ethical oversight was built into this system to assure fairness in the assessments the software makes,” Salekin says. “In its learning process, the model has been taught to adjust for age and sex of clients to make sure it performs fairly regardless of those factors.” Future refinements will adjust for race and ethnicity.

The team is also assessing appropriate candidates for the therapy and whether different scheduling of therapy visits (such as a boot camp experience) might help learners progress more quickly than longer-term intermittent sessions.

Ultimately, the researchers hope the software provides sound-practice sessions that are effective, accessible and of sufficient intensity to allow ChainingAI to routinely supplement in-person clinician practice time. Once expanded to include “s” and “z” sounds, the system would address 90% of residual speech sound disorders and could benefit many thousands of the estimated six million Americans who are impacted by these disorders.

Written by Diane Stirling

Sucheta Soundarajan

Degree:

  • PhD, Computer Science (2013, Cornell University)

Areas of Expertise:

  • Social network analysis
  • Complex systems
  • Algorithmic fairness
  • Algorithms

Current Research:

Dr. Soundarajan’s research focuses on designing algorithms for analyzing social and other complex networks, including algorithms for characterizing the hierarchical structure of networks and the evolution of social networks. She is particularly interested in designing fair network analysis algorithms for tasks such as link prediction and community/cluster detection. Her work also explores the structure of real-world complex systems, including the behavior of individual animals in herds of dairy cows, language evolution in social media ecosystems, and stratification in scientific co-authorship networks.

Selected Publications:

Sucheta Soundarajan and John Hopcroft. Use of Local Group Information to Identify Communities in Networks. ACM Transactions on Knowledge Discovery from Data (TKDD). 2015.

Sucheta Soundarajan, Tina Eliassi-Rad, and Brian Gallagher. A Guide to Selecting a Network Similarity Method. SIAM Conference on Data Mining (SDM). 2014.

Bruno Abrahao, Sucheta Soundarajan, John Hopcroft, and Robert Kleinberg. A Separability Framework for Analyzing Community Structure. ACM Transactions on Knowledge Discovery from Data (TKDD-CASIN). 2014.

Bruno Abrahao, Sucheta Soundarajan, John Hopcroft, and Robert Kleinberg. On the Separability of Structural Classes of Communities. 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). 2012.

Sucheta Soundarajan and John Hopcroft. Using Community Information to Improve the Precision of Link Prediction Methods. World Wide Web (WWW). 2012.

Farzana Rahman

Degrees:

  • Ph.D., Computer Science, Marquette University, Wisconsin, USA (2013)
  • M.S., Computer Science, Marquette University, Wisconsin, USA (2010)
  • B.S., Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), Bangladesh (2008)

Research interests:

  • Mobile and pervasive health technologies
  • Internet-of-Things
  • Computer science education
  • Impact of active learning pedagogy in CS courses
  • Broadening participation of women and underrepresented students in CS

Current research:

Her research spans the domains of mobile healthcare, healthcare data analytics, and pervasive health technologies. Broadly, her research focuses on integrating mobile and pervasive technologies into health and wellness environments to improve users’ quality of life and their mental and physical wellbeing. Her research also extends to mobile security, information and communication technology for development (ICT4D), computer science education, broadening participation in computing, best practices in undergraduate research, and how different pedagogical practices can increase diversity in CS. She is also interested in why and how people from diverse backgrounds are learning programming in the 21st century, and how new kinds of scalable programming environments or platforms can support all kinds of learners.

Teaching Interests:

  • Introduction to Programming
  • Object-Oriented Programming
  • Data Structures
  • Mobile Application Programming
  • Mobile and Pervasive Computing
  • Computer Architecture

Honors:

  • Provost LA Initiative Award, Florida International University, Spring 2018-2019
  • Best paper award, IEEE Conference on Networking Systems and Security (NSysS ’16), 2016
  • Systers Pass-It-On (PIO) Award, Anita Borg Institute, 2014
  • Best paper award, IEEE International Conference on e-Health Networking, Applications and Services (Healthcom ’12), 2012

Recent Publications:

  1. Claire Fulk, Grant Hobar, Kevin Olsen, Samy El-Tawab, Puya Ghazizadeh, and Farzana Rahman. Cloud-based Low-cost Energy Monitoring System through the Internet of Things. In Proceedings of the IEEE International Workshop of Mobile and Pervasive Internet of Things (PerIoT 2019), in Conjunction with IEEE Percom ’19. Japan, March 2019.
  2. Farzana Rahman and Samy El-Tawab. App Development for the Social Good: Teaching Socially Conscious Mobile App Development in an Upper-Level Computer Science Course. In Proceedings of the 2019 ASEE Annual Conference and Exposition (ASEE ’19), Orlando, FL, July 2019.
  3. Farzana Rahman. Leveraging Visual Programming Language and Collaborative Learning to Broaden Participation in Computer Science. In Proceedings of the 19th Annual Conference on Information Technology Education (SIGITE ’18), Ft Lauderdale, FL, Oct 2018.
  4. Saiyma Sarmin, Nafisa Anzum, Kazi Hasan Zubaer, Farzana Rahman, A. B. M. Alim Al Islam. Securing Highly-Sensitive Information in Smart Mobile Devices through Difficult-to-Mimic and Single-Time Usage Analytics. In Proceedings of the 15th EAI International Conference on Mobile and Ubiquitous Systems, Computing, Networking and Services (MobiQuitous ’18), Nov 2018.
  5. Farzana Rahman. From App Inventor to Java: Introducing Object-oriented Programming to Middle School Students Through Experiential Learning. In Proceedings of the 2018 ASEE Annual Conference and Exposition (ASEE ’18), Salt Lake City, UT, July 2018.
  6. Farzana Rahman. Healthy Hankerings: Motivating Adolescents to Combat Obesity with a Mobile Application. In Proceedings of the 20th International Conference on Human-Computer Interaction (HCI International ’18), NV, July 2018.
  7. Farzana Rahman, Perry Fizzano, Evan M. Peck, Shameem Ahmed, and Stu Thompson. How to Build a Student-Centered Research Culture for the Benefit of Undergraduate Students. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE ’18), Maryland, Feb 2018.

Vir V. Phoha

Degree:

  • Ph.D., Texas Tech University

Research Interests:

  • Cyber Security – Cyber offense and defense
  • Machine Learning
  • Smartphone and tablet security
  • Biometrics – network-based and standalone

Current Research:

My focus is original research that cuts across conventional, rigorously defined disciplines and unifies basic and common concepts across them. In particular, my research centers on security (malignant systems and active authentication, for example, touch-based authentication on mobile devices), machine learning (decision trees, statistical methods, and evolutionary methods) with a focus on large time-series data streams and static data sets, and computer networks (anomalies, optimization). I am also using these methods to build field-realizable defensive and offensive cyber-based systems.

Courses Taught:

  • Security and Machine learning; Biometrics
  • Applied Cryptography

Honors and Awards:

  • Fellow of: AAAS; AAIA; IEEE; NAI; SDPS 
  • ACM Distinguished Scientist 
  • IEEE Computer Society Distinguished Visitor (2024-2026) 
  • ACM Distinguished Speaker (2012-2015) 
  • IEEE Region 1 Technological Innovation Award, 2017

Selected Publications:

  • F. Chen, J. Xin and V. V. Phoha, “SSPRA: A Robust Approach to Continuous Authentication Amidst Real-World Adversarial Challenges,” in IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 6, no. 2, pp. 245-260, April 2024, doi: 10.1109/TBIOM.2024.3369590 
  • Jingyu Xin, Vir V. Phoha, and Asif Salekin. 2022. Combating False Data Injection Attacks on Human-Centric Sensing Applications. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 6, 2, Article 83 (July 2022), 22 pages. https://doi.org/10.1145/3534577 
  • Xinyi Zhou, Kai Shu, Vir V. Phoha, Huan Liu, and Reza Zafarani. 2022. “This is Fake! Shared it by Mistake”: Assessing the Intent of Fake News Spreaders. In Proceedings of the ACM Web Conference 2022 (WWW ’22). Association for Computing Machinery, New York, NY, USA, 3685–3694. https://doi.org/10.1145/3485447.3512264
  • Fallahi, A., Phoha, V.V. (2021). Adversarial Activity Detection Using Keystroke Acoustics. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science, vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_30
  • Xinyi Zhou, Atishay Jain, Vir V. Phoha, and Reza Zafarani. 2020. Fake News Early Detection: A Theory-driven Model. ACM Digital Threats 1, 2, Article 12 (June 2020), 25 pages. https://doi.org/10.1145/3377478 
  • B. Li, W. Wang, Y. Gao, V. V. Phoha and Z. Jin, “Wrist in Motion: A Seamless Context-Aware Continuous Authentication Framework Using Your Clickings and Typings,” in IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 2, no. 3, pp. 294-307, July 2020, doi: 10.1109/TBIOM.2020.2997004. 

Chilukuri K. Mohan

Degree(s):

  • Ph.D., State Univ. of New York at Stony Brook
  • B.Tech., Indian Institute of Technology, Kanpur

Lab/Center Affiliation(s):

  • Syracuse Evolutionary and Neural Systems Exploration (SENSE) Lab
  • Smart Grid Lab

Research Interests:

  • Artificial intelligence
  • Evolutionary algorithms
  • Data mining
  • Social networks
  • Bioinformatics

Current Research:

Recent work has involved the development of algorithms for:

  • the recognition of patterns in promoter regions of genome sequences,
  • unsupervised detection of anomalies in data (including time series), and
  • optimization problems in the design of multiband cognitive radio networks.

Other current work includes the investigation of robustness properties of social networks, as well as the use of network models in understanding the dynamics of evolutionary algorithms.

Teaching Interests:

  • Smart grid
  • Social networks
  • Evolutionary algorithms
  • Neural networks

Honors:

  • Distinguished Scholar Award, International Society of Applied Intelligence, July 2011.

Selected Publications:

Linkage Sensitive Particle Swarm Optimization (D. Devicharan and C.K. Mohan), in Handbook of Swarm Intelligence – Concepts, Principles, and Applications (eds. B.K. Panigrahi, Y. Shi, and B. Lim), pp. 119-132, 2011.

Rank-Based Outlier Detection (HuaMing Huang, Kishan Mehrotra, and Chilukuri K. Mohan), in Journal of Statistical Computation and Simulation, Oct. 2011.

Distributed In-Network Path Planning for Sensor Network Navigation in Dynamic Hazardous Environments (D. Chen, C.K. Mohan, K.G. Mehrotra, and P.K. Varshney), in Wireless Communications and Mobile Computing 12(8): 739-754, 2012.

SMAlign: Alignment of DNA Sequences with Gap Constraints (F. Alobaid, K. Mehrotra, C.K. Mohan, and R. Raina), in Proc. BICOB, Las Vegas, March 2012.

Reference Set Metrics for Multi-Objective Algorithms (C.K. Mohan and K. Mehrotra), in Proc. SEMCCO, pp.723-730, Dec. 2011.

Venkata S.S. Gandikota

Degrees:

  • Ph.D. Computer Science – Purdue University
  • MS Computer Science – Purdue University
  • MSc Mathematics – Birla Institute of Technology and Science, Goa, India
  • B.E. Computer Science – Birla Institute of Technology and Science, Goa, India

Areas of Expertise:

  • Foundations of Data Science
  • Coding & Information Theory
  • Lattice Algorithms

Dr. Gandikota’s research delves into the algorithmic principles of data recovery from noise, with an emphasis on its applications in fundamental machine learning problems. His primary objective is to delineate the conditions that enable successful data recovery while also devising efficient algorithms to achieve it.

Honors and Awards:

  • IEEE Senior Member
  • SOURCE RA Grant
  • CUSE Seed Grant

J. Cole Smith

Degrees:

  • PhD, Industrial and Systems Engineering, Virginia Tech, 2000
  • BS, Mathematical Sciences, Clemson University, 1996

Areas of Expertise:

  • Integer programming and combinatorial optimization
  • Network flows and facility location
  • Computational optimization methods
  • Large-scale optimization due to uncertainty or robustness considerations

My research interests lie in the field of mathematical optimization, especially in mixed-integer programming and combinatorial optimization. Much of my research has recently focused on network interdiction and fortification, along with bilevel mixed-integer optimization problems. I am particularly interested in interdiction problems that involve uncertain data, and/or in which there is an asymmetry of information among the players. My research has applications in areas including logistics, national security, healthcare, production, ecology, and sports. This research has recently appeared in journals such as Operations Research, Mathematical Programming, IISE Transactions, Networks, and INFORMS Journal on Computing, and has been supported by agencies including the National Science Foundation, the Office of Naval Research, the Air Force Office of Scientific Research, the Defense Threat Reduction Agency, and the Defense Advanced Research Projects Agency.

Honors:

  • 2023 Fellow, Institute for Operations Research and the Management Sciences (INFORMS)
  • 2019 Member, Academy of Distinguished Alumni for the Grado Department of Industrial and Systems Engineering at Virginia Tech
  • 2018 Fellow, Institute of Industrial and Systems Engineers
  • 2014 Glover-Klingman Prize for Best Paper in Networks (Sullivan and Smith, 2014)
  • 2010 Hamed K. Eldin Outstanding Young Industrial Engineer in Education Award

Selected Publications:

* Nguyen, D., Song, Y., and Smith, J.C., “A Two-Stage Interdiction-Monitoring Game,” Networks, 81(3), 334-358, 2023.

* Bochkarev, A.A., and Smith, J.C., “On Aligning Non-order-associated Binary Decision Diagrams,” INFORMS Journal on Computing, 35(5), 910-928, 2023.

* Curry, R.M. and Smith, J.C., “Minimum-cost Flow Problems Having Arc-activation Costs,” Naval Research Logistics, 69(2), 320-335, 2022.

* Lozano, L. and Smith, J.C., “A Binary Decision Diagram Based Algorithm for Solving a Class of Integer Two-Stage Stochastic Programs,” Mathematical Programming, 191(1), 381-404, 2022.

* Nguyen, D. and Smith, J.C., “Network Interdiction with Asymmetric Cost Uncertainty,” European Journal of Operational Research, 297(1), 239-251, 2022.

* Holzmann, T. and Smith, J.C., “The Shortest Path Interdiction Problem with Randomized Interdiction Strategies: Complexity and Algorithms,” Operations Research, 69(1), 82-99, 2021.