It’s not uncommon now for young elementary school students to do a substantial chunk of their reading practice in class with virtual tutors.
Artificial intelligence powers these digital programs, which listen to 5- and 6-year-olds read and offer feedback.
Amira Learning is one of the most widely adopted offerings on the market. Several big providers of early reading curricula and intervention materials, including Curriculum Associates, have integrated similar AI tools into their literacy programs or created their own.
AI tutors promise a solution to one of the thorniest challenges in early elementary reading instruction: providing students lots of opportunities for practice, with immediate feedback.
As students are learning how to link spoken sounds to written letters, they need regular practice with the specific phonics patterns they’re learning. Teachers need to guide this process. In a class of two dozen students, where some may need extra help with previously learned skills and others are ready to move on, finding time for everyone to have one-on-one or small group instruction can be nearly impossible.
But even as schools are adopting the new AI reading tools, research on their effect on reading outcomes among young students isn’t conclusive. Even asking how well AI reading tools work is a multifaceted question: Can they accurately parse children’s speech? Do they actually save teachers time? And do they lead to better student outcomes?
Another question is comparative. Are the results with these tools better than what would have happened in their absence? Than other digital programs? Or than teacher-led independent practice?
“There are a few studies, but none of the rigorous studies that we would need to be confident in the use of AI,” said Matt Burns, a professor of special education at the University of Florida. Burns is involved in a grant to the University of Florida Literacy Institute to study AI integration into its early reading curriculum.
In part, that’s because the AI capabilities that reading programs have now haven’t existed for long, said Henry May, a professor of education at the University of Delaware and the lead author on a recent evaluation study of one such program. “It’s only been in the last couple of years that you’re beginning to see AI-based interventions where the AI behind it is as sophisticated as they are now,” he said.
And while some schools have embraced virtual reading tutors, new research is also examining the potential for AI to help early reading teachers make data-driven instructional decisions, and generate customized reading materials, like decodable texts.
“I think that’s where AI can really be powerful,” said Burns. “To make decisionmaking more efficient, and more effective, more accurate.”
How widely used AI tutors stack up
Kids practicing reading on screens is hardly a new phenomenon. Most large curriculum companies have some digital component for skill practice.
But AI-enabled platforms work differently. Instead of listening to prerecorded questions and selecting an answer from a multiple-choice list, students engage in a verbal back-and-forth with the program, similar to how they would with a live tutor.
In theory, that back-and-forth might improve reading outcomes more than answering multiple-choice questions on a screen does. Choosing the right answer from a list of options requires different knowledge than producing that answer on your own. With a multiple-choice-heavy program, it’s possible that students might be able to advance with lucky guesses.
For a real-time, interactive program, companies have to train their speech recognition engines on children’s voices and speech patterns, which differ from adults’. They also have to teach the tools to filter out background noise, like the voices of other students using the program.
“We’ve got to make sure it works at scale, in real classrooms, in the messiness of the ecological world it lives in,” said Ran Liu, a vice president and chief AI scientist at Amira.
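For readers curious about the mechanics, the scoring step at the heart of these tools can be sketched in a few lines. The example below is a toy illustration, not Amira’s or SoapBox’s actual method: it assumes some speech-recognition engine has already transcribed the child’s reading, and it simply aligns that transcript against the target passage to flag likely miscues. Every name in it is hypothetical.

```python
# Illustrative sketch only -- not any vendor's actual implementation.
# Assumes an ASR engine has already transcribed the child's reading;
# the hard part (a child-speech acoustic model) is out of scope here.
from difflib import SequenceMatcher

def score_oral_reading(target_text: str, asr_transcript: str) -> dict:
    """Align the words the child was asked to read against the words
    the ASR engine heard, and flag likely miscues."""
    target = target_text.lower().split()
    heard = asr_transcript.lower().split()
    opcodes = SequenceMatcher(None, target, heard).get_opcodes()

    miscues, correct = [], 0
    for tag, t1, t2, _, _ in opcodes:
        if tag == "equal":      # words read as written
            correct += t2 - t1
        elif tag == "replace":  # child said a different word (substitution)
            miscues += [(w, "substituted") for w in target[t1:t2]]
        elif tag == "delete":   # word in the text never heard (omission)
            miscues += [(w, "omitted") for w in target[t1:t2]]
        # tag == "insert": extra words the child added; ignored here

    return {"accuracy": correct / max(len(target), 1), "miscues": miscues}

# A child skips "sat" and misreads "mat" as "hat":
print(score_oral_reading("the cat sat on the mat", "the cat on the hat"))
# accuracy is about 0.67; 'sat' flagged as omitted, 'mat' as substituted
```

The messiness Liu describes lives almost entirely outside a sketch like this: recognizing a 5-year-old’s speech over classroom noise accurately enough that the alignment can be trusted.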
Most of the research on Amira examines differences in student outcomes by dosage, comparing students who spent more time using the program to those who spent less. One recent study, though, used a quasi-experimental design to compare elementary schoolers in Louisiana who had access to Amira to those who didn’t. The study showed small but statistically significant positive effects of Amira use on K-3 students’ scores on DIBELS, a common measure of early literacy skills.
Research on Amira also suggests students shouldn’t use the program for long periods of time, Liu said. “We’ve learned that around 25-30-minute sessions is where we see a ceiling effect,” she said.
Besides Amira, teachers may also have encountered SoapBox, an AI “engine” that previously worked with about 40 ed-tech companies, but was acquired by Curriculum Associates.
Curriculum Associates is integrating SoapBox and its speech-recognition capabilities into its instruction and assessment product iReady. It will use the engine to power student practice in letter naming, reading nonsense words, and reading passages for fluency practice, said Amelia Kelly, the chief technology officer of SoapBox Labs and vice president of data science at Curriculum Associates.
SoapBox Labs has said that the tool accurately recognizes children’s speech, but there are no published studies on its effect on students’ reading achievement.
Still, it’s hard to know how AI tutors stack up against flesh-and-blood reading partners. Two studies of Chinese kindergarteners, for example, showed mixed results. Students who read books with AI chatbots saw gains in skills including vocabulary and syntax. But when the researchers compared reading with a parent to reading with the AI, students who read with parents showed stronger gains.
One randomized controlled study this year, though, found positive effects from the Dysolve program, which targets language processing for students with reading disabilities. The program tailors phonemic awareness activities to individual students.
Other uses for AI in reading: lesson planning, text generation
Other research examines how AI can be used in the back end of reading instruction, for planning lessons or generating text that students read.
When kindergartners and 1st graders are learning new letter-sound patterns, they need to practice reading them in stories. But teachers often struggle to find enough of these aligned books, called decodable texts. And some say that many of the decodables on the market are poorly written or boring. Could generative AI offer new options?
One study suggests off-the-shelf tools may not be cut out for the job. In a paper published this year, researchers at the University of Texas at Austin found that the tools fell short in important ways when asked to generate texts within specific instructional parameters.
The six tools the researchers tested often created texts that were more complex than the prompts requested. When the AI produced decodable text designed for beginning readers, it was often too difficult or sounded unnatural.
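To make “decodable” concrete: a text is decodable for a given student when its words use only the letter-sound patterns, plus a handful of memorized sight words, that the student has been taught so far. A check along those lines might look like the toy sketch below; the pattern inventory and sight-word list are invented stand-ins for a real phonics scope and sequence, not drawn from UFLI Foundations or any product named in this story.

```python
# Illustrative sketch only: score how decodable a passage is for a
# student who has learned short vowels and single consonants.
import re

# Hypothetical inventory: words of the form (C)V(C) built from a toy,
# deliberately incomplete set of taught letters.
TAUGHT = re.compile(r"^[bcdfghjklmnprstvwz]?[aeiou][bcdfghjklmnprstvwz]?$")
SIGHT_WORDS = {"the", "a", "to", "is"}  # irregular words taught as wholes

def decodable_fraction(passage: str) -> float:
    """Share of words the student could sound out or recognize."""
    words = re.findall(r"[a-z]+", passage.lower())
    if not words:
        return 0.0
    ok = sum(1 for w in words if w in SIGHT_WORDS or TAUGHT.match(w))
    return ok / len(words)

print(decodable_fraction("The cat sat on a big red mat."))          # 1.0
print(decodable_fraction("The curious fox explored the garden."))   # ~0.33
```

A generation pipeline could use a check like this as a gate, regenerating or editing any passage that falls below a threshold. That is one way purpose-built tools can enforce constraints that, as the Texas study found, general-purpose chatbots tend to miss.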
Some companies have tried to address this problem with tailor-made tools. One, Project Read AI, currently offers a decodable-reader generator that can create personalized stories aligned to the phonics scope and sequences of several curricula.
The company has worked closely with the University of Florida Literacy Institute to integrate its phonics program, UFLI Foundations, into some of Project Read AI’s tools. Its UFLI Portal creates small group lesson plans from scanned student reading and spelling data.
Supported by a federal grant, UFLI and Project Read AI are now working to develop and evaluate an AI-based instructional planning model, which analyzes data from student sessions with the company’s AI tutor to create individualized reading plans.
Another federal grant, to researchers at the State University of New York at Buffalo, supports the creation and evaluation of a different AI tool to generate customized reading materials for students in grades K-2.
The tool, called the AI Reading Enhancer, is currently in development. Ultimately, the goal is for it to be able to create decodable passages that align to students’ specific instructional needs, as well as their interests and cultural backgrounds, said X. Christine Wang, a professor of learning and instruction at SUNY Buffalo, and the grant’s principal investigator.
Automatic speech recognition technology would listen to students read the generated text, and then the tool would analyze student performance and provide corrective feedback.
This on-the-spot text generation would set the tool apart from other offerings currently on the market, which assign students readings from a bank of text selections. That method prioritizes quality control, Wang said, “but the compromise is less customized text.”
Wang’s approach also requires strong privacy protections, she said, as students would be sharing information with the tool. Her grant also covers work to create ethical guidelines for AI development and deployment.
“For this age group,” she said, “the guard rails have to be very rigorous.”