MIT Latest News

MIT News is dedicated to communicating to the media and the public the news and achievements of the students, faculty, staff and the greater MIT community.

A better method for planning complex visual tasks

Wed, 03/11/2026 - 12:00am

MIT researchers have developed a generative artificial intelligence-driven approach for planning long-term visual tasks, like robot navigation, that is about twice as effective as some existing techniques.

Their method uses a specialized vision-language model to perceive the scenario in an image and simulate actions needed to reach a goal. Then a second model translates those simulations into a standard programming language for planning problems, and refines the solution.

In the end, the system automatically generates a set of files that can be fed into classical planning software, which computes a plan to achieve the goal. This two-step system generated plans with an average success rate of about 70 percent, outperforming the best baseline methods that could only reach about 30 percent.

Importantly, the system can solve new problems it hasn’t encountered before, making it well-suited for real environments where conditions can change at a moment’s notice.

“Our framework combines the advantages of vision-language models, like their ability to understand images, with the strong planning capabilities of a formal solver,” says Yilun Hao, an aeronautics and astronautics (AeroAstro) graduate student at MIT and lead author of an open-access paper on this technique. “It can take a single image and move it through simulation and then to a reliable, long-horizon plan that could be useful in many real-life applications.”

She is joined on the paper by Yongchao Chen, a graduate student in the MIT Laboratory for Information and Decision Systems (LIDS); Chuchu Fan, an associate professor in AeroAstro and a principal investigator in LIDS; and Yang Zhang, a research scientist at the MIT-IBM Watson AI Lab. The paper will be presented at the International Conference on Learning Representations.

Tackling visual tasks

For the past few years, Fan and her colleagues have studied the use of generative AI models to perform complex reasoning and planning, often employing large language models (LLMs) to process text inputs.

Many real-world planning problems, like robotic assembly and autonomous driving, have visual inputs that an LLM can’t handle well on its own. The researchers sought to expand into the visual domain by utilizing vision-language models (VLMs), powerful AI systems that can process images and text.

But VLMs struggle to understand spatial relationships between objects in a scene and often fail to reason correctly over many steps. This makes it difficult to use VLMs for long-range planning.

On the other hand, scientists have developed robust, formal planners that can generate effective long-horizon plans for complex situations. However, these software systems can’t process visual inputs and require expert knowledge to encode a problem into language the solver can understand.

Fan and her team built an automatic planning system that takes the best of both methods. The system, called VLM-guided formal planning (VLMFP), utilizes two specialized VLMs that work together to turn visual planning problems into ready-to-use files for formal planning software.

The researchers first carefully trained a small model they call SimVLM to specialize in describing the scenario in an image using natural language and simulating a sequence of actions in that scenario. Then a much larger model, which they call GenVLM, uses the description from SimVLM to generate a set of initial files in a formal planning language known as the Planning Domain Definition Language (PDDL).

The files are ready to be fed into a classical PDDL solver, which computes a step-by-step plan to solve the task. GenVLM compares the results of the solver with those of the simulator and iteratively refines the PDDL files.

“The generator and simulator work together to be able to reach the exact same result, which is an action simulation that achieves the goal,” Hao says.

Because GenVLM is a large generative AI model, it has seen many examples of PDDL during training and learned how this formal language can solve a wide range of problems. This existing knowledge enables the model to generate accurate PDDL files.

A flexible approach

VLMFP generates two separate PDDL files. The first is a domain file that defines the environment, valid actions, and domain rules. It also produces a problem file that defines the initial states and the goal of a particular problem at hand.

“One advantage of PDDL is the domain file is the same for all instances in that environment. This makes our framework good at generalizing to unseen instances under the same domain,” Hao explains.
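To make the split between the two files concrete, here is a minimal Python sketch that keeps one domain definition and renders a separate problem file for each instance. The toy `grid-nav` domain, its predicates, and the helper `make_problem` are invented for illustration and do not come from the paper; only the general PDDL layout (a domain with predicates and actions, a problem with objects, initial state, and goal) is standard.

```python
# One domain file, shared by every instance in this hypothetical environment.
DOMAIN = """(define (domain grid-nav)
  (:requirements :strips)
  (:predicates (at ?c) (adjacent ?a ?b))
  (:action move
    :parameters (?from ?to)
    :precondition (and (at ?from) (adjacent ?from ?to))
    :effect (and (not (at ?from)) (at ?to))))
"""


def make_problem(name: str, cells: list[str],
                 edges: list[tuple[str, str]], start: str, goal: str) -> str:
    """Render a problem file: objects, initial state, and goal for one instance."""
    init = [f"(at {start})"] + [f"(adjacent {a} {b})" for a, b in edges]
    return (
        f"(define (problem {name})\n"
        f"  (:domain grid-nav)\n"
        f"  (:objects {' '.join(cells)})\n"
        f"  (:init {' '.join(init)})\n"
        f"  (:goal (at {goal})))\n"
    )


# Two different instances reuse the same DOMAIN text unchanged.
p1 = make_problem("p1", ["c1", "c2", "c3"], [("c1", "c2"), ("c2", "c3")], "c1", "c3")
p2 = make_problem("p2", ["a", "b"], [("a", "b")], "a", "b")
```

Only the problem text changes between `p1` and `p2`, which is the property Hao points to when explaining why the framework generalizes to unseen instances in the same domain.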

To enable the system to generalize effectively, the researchers needed to carefully design just enough training data for SimVLM so the model learned to understand the problem and goal without memorizing patterns in the scenario. When tested, SimVLM successfully described the scenario, simulated actions, and detected if the goal was reached in about 85 percent of experiments.

Overall, the VLMFP framework achieved a success rate of about 60 percent on six 2D planning tasks and greater than 80 percent on two 3D tasks, including multirobot collaboration and robotic assembly. It also generated valid plans for more than 50 percent of scenarios it hadn’t seen before, far outpacing the baseline methods.

“Our framework can generalize when the rules change in different situations. This gives our system the flexibility to solve many types of visual-based planning problems,” Fan adds.

In the future, the researchers want to enable VLMFP to handle more complex scenarios and explore methods to identify and mitigate hallucinations by the VLMs.

“In the long term, generative AI models could act as agents and make use of the right tools to solve much more complicated problems. But what does it mean to have the right tools, and how do we incorporate those tools? There is still a long way to go, but by bringing visual-based planning into the picture, this work is an important piece of the puzzle,” Fan says.

This work was funded, in part, by the MIT-IBM Watson AI Lab.

2026 MIT Sloan Sports Analytics Conference shows why data make a difference

Tue, 03/10/2026 - 5:30pm

With time dwindling in the Olympic women’s ice hockey gold medal game on Feb. 19, players for Team USA and Team Canada lined up for a key faceoff in Canada’s end. Canada had a 1-0 lead. USA had 2:23 left, and an ace up their sleeve: analytics.

USA Coach John Wroblewski pulled the goalkeeper, to get a player advantage, and had forward Alex Carpenter take the faceoff. Statistics show that Carpenter is not only very good at winning faceoffs; she also wins a lot of them cleanly. That allows her team to quickly regain possession, without too many teammates nearby. Knowing that, Wroblewski directed the USA players to spread out, largely away from the faceoff circle, in position to circulate the puck as soon as they got it back.

Carpenter won the faceoff, and Team USA quickly started a passing move. Laila Edwards soon launched a shot that longtime star Hilary Knight deflected in for the crucial, game-tying goal with 2:04 left. Team USA then won in overtime. And data-driven decision-making had also won big; indeed, it helped change the Olympics.

“What it does for a coach, the other thing these analytics do, is … it allows you to move forward with this confidence level,” Wroblewski said on Saturday at the 20th annual MIT Sloan Sports Analytics Conference (SSAC), during a hockey analytics panel where he detailed his decision-making for that faceoff, and in the gold medal game generally.

Using the data, he added, lets coaches “limit the emotion” that might cloud their in-game decisions.

“By the time you get to that decision, you’re then allowed the freedom to step away from the decision, to allow the players to go earn their medal,” Wroblewski added.

You don’t usually find coaches divulging their tactical secrets just three weeks after a big game has been played. But then, this is the MIT Sloan conference, a trailblazing forum that has helped analytics ideas spread throughout sports. Coaches, players, and analysts know any data-driven discussion will find an interested audience.

“Analytics was massive for us going into the gold medal game,” Wroblewski said.

20 years on: From classrooms to convention halls

The 20th edition of SSAC was a strong one, with many substantive panel discussions and interviews; the annual research paper, hackathon, and case study contests; mentorship events and informal networking opportunities; and more. Over 2,500 people attended the two-day event, held at Boston’s Menino Conference and Exhibition Center (MCEC). The conference was founded in 2007 by Daryl Morey, now president of basketball operations for the NBA’s Philadelphia 76ers, and Jessica Gelman, now CEO of the Kraft Analytics Group.

The first three editions of the conference were held on the MIT campus. In 2010, it first moved to the MCEC (one of two regular convention-center sites it uses), and starting in 2011, the conference became a two-day event.

Today people attend for the panels, the career opportunities, and, in some cases, to make news. NBA Commissioner Adam Silver was on hand this year, engaging in an on-stage conversation with former WNBA great Sue Bird, publicly addressing some of the key issues facing his league, and drawing wide media coverage.

First, though, Silver reflected on attending the second edition of the conference on the MIT campus in 2008, when he was deputy commissioner.

“It was literally a classroom of 20 people we were talking to,” Silver recalled. “I think it was the beginning of the moment when people were taking sports as a discipline more seriously. … I give Jessica and Daryl a lot of credit [for that].”

Addressing tanking and gambling

A core part of Silver’s comments focused on two big issues in pro basketball: tanking and gambling. About eight NBA teams appear to be tanking this season, that is, losing games in order to increase their chances of getting a high draft pick.

“We are going to make substantial changes for next year,” Silver said, although he also added: “I am an incrementalist. I think we’ve got to be a little bit careful about how huge a change we make at once. I’m not ruling anything out. But I am paying attention to that.”

To be sure, tanking has long been a part of professional basketball, as Bird noted during the conversation.

“We did it in Seattle, to be honest,” Bird said. “Breanna Stewart was coming out of college. We were in a ‘rebuild.’”

Still, in this NBA season, tanking has become an epidemic, in “a little bit of a perfect storm,” as Silver put it on Friday. And almost every proposed solution seems to have drawbacks. Perhaps the simplest cure for tanking, actually, would be robust analytical studies showing that it is not a very effective team-building strategy. If that is what the numbers reveal, of course.

Meanwhile, multiple arrests of NBA players and coaches at the beginning of the season show further that sports gambling continues to present challenges to professional sports leagues.

“I personally think there should be more regulation now, not less,” Silver said on Friday, suggesting that federal rules would simplify things in the U.S., where 39 states allow sports gambling to some extent. He also said the NBA can continue to work on monitoring data to protect against gambling scandals.

“I think there are some large-platform companies that are looking at a business opportunity to come in and in a much more sophisticated way work as a detection service with the league,” Silver said.

Through it all, Silver said, the NBA will continue to be a data-driven operation. Have you watched a game with a long instant-replay review, and gotten a little impatient? Still, have you kept watching that game? So has almost everyone.

“For years people would tell us, ‘Don’t use instant replay, because you’ll turn fans off,’” Silver said. However, he added, “The data suggests, in terms of ratings and what servers tell us, you almost never lose a fan when you’re going to replay. Because they want to see the replay and they want to see what happened.”

The minnows got big

Sports analytics took root in baseball, with its discrete pitcher-hitter actions. Legendary MLB general manager Branch Rickey employed a statistician for the great Brooklyn Dodgers of the 1950s; the famous manager Earl Weaver thought analytically with the Baltimore Orioles in the 1970s. Baseball analyst Bill James made sports analytics a viable pursuit with his annual “Baseball Abstract” bestsellers in the 1980s, and Michael Lewis’ “Moneyball” popularized it.

But data can be applied to all sports — and sometimes is most valuable when only some teams are interested in it. Take soccer. In the English Premier League, about three clubs have been heavily oriented around analytics over the last decade: Liverpool FC, Brighton FC, and Brentford FC. That has helped Liverpool win multiple titles, while Brighton and Brentford, smaller clubs, have startled many with their success.

Saturday at SSAC, Brentford’s majority owner Matthew Benham made one of his most visible public appearances, in an onstage interview with podcaster Roger Bennett. Benham first made money wagering on soccer, then invested in Brentford, his childhood club.

“The information we used in the early days was really, really rudimentary,” Benham said. In his account, his success building an analytics-based club has only partly been about the numbers.

“A lot of the success has just been in running things efficiently,” Benham said. He prefers to have management discussions that are an “exchange of views, rather than debate,” since the latter implies an interaction with a clear winner and loser. Instead, compiling independent-minded views from his executives is more important.

Brentford also uses “a combination of old-style scouting and data” for its player acquisition decisions, Benham said. Not every decision works. Brentford could have signed current Arsenal FC star Eberechi Eze for a mere 4 million pounds in 2019, and passed; Crystal Palace FC acquired Eze, then realized a windfall when Arsenal purchased his services.

Still, pressed by Bennett to specify a little more about his analytical thinking, Benham implied that strikers are valuable not only for their finishing skills, but for consistently getting open for shots on goal. Fans tend to focus too much on a player’s misses, rather than how many chances are created by their off-ball work.

“Getting in position is way, way more informative than finishing,” Benham said.

A similar insight seems to have guided Liverpool’s thinking. As it happens, a Friday panel at SSAC featured Ian Graham, who ran Liverpool’s analytics operations from 2012 to 2023, and weighed in on a number of subjects. Among other things, Graham noted, teams are too cautious when tied late in a match; soccer grants three points for a win, one for a draw, and zero for a loss, so from a tied position, the reward for winning is twice as great as the penalty for losing.

“Teams don’t go for it enough,” Graham said. “Teams think a draw is an okay result.”
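Graham's arithmetic can be checked with a short expected-points calculation. The sketch below uses a deliberately simple model, which is an assumption of this example and not something Graham described: a tied team that holds is taken to keep its one point, while a team that pushes forward wins with probability `p_win` and loses with probability `p_loss`. The function names and the probabilities are hypothetical.

```python
WIN, DRAW, LOSS = 3, 1, 0  # league points per result


def expected_points(p_win: float, p_loss: float) -> float:
    """Expected points when a match ends in a win, loss, or draw."""
    if p_win < 0 or p_loss < 0 or p_win + p_loss > 1:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    p_draw = 1.0 - p_win - p_loss
    return WIN * p_win + DRAW * p_draw + LOSS * p_loss


def gain_from_attacking(p_win: float, p_loss: float) -> float:
    """Expected points gained by going for the win instead of settling for a draw."""
    # Algebraically this is 2 * p_win - p_loss: a win adds two points
    # to the draw, while a loss costs only one.
    return expected_points(p_win, p_loss) - DRAW


# Hypothetical late-game odds: pushing forward wins 20% of the time, loses 30%.
gain = gain_from_attacking(0.2, 0.3)
```

Under this model, attacking pays off whenever the chance of winning is more than half the chance of losing, so a team can be more likely to lose than to win and still come out ahead on average.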

The limits of knowledge

Sports, of course, are ultimately played by imperfect, injury-prone, and sometimes exhausted athletes. One consistent lesson from the MIT Sloan conference involves the limits of data and plans.

“We think the data is giving us an answer, when actually it’s giving us some information, and we still have to make a choice,” said Ariana Andonian, vice president of player personnel for the Philadelphia 76ers, during a basketball panel on Saturday.

Asked about the promise of artificial intelligence for sports analytics, Sonia Raman, head coach of the WNBA’s Seattle Storm, noted that its insights might always be limited by circumstances.

“It’s not like you can just get an AI report in the middle of the game that says, ‘Get some shooting in,’” said Raman, who, prior to coaching in the WNBA and NBA, served for 12 years as head coach of the MIT women’s basketball team.

“You can have a great plan, but if it’s poorly executed, it’s way worse than a poor plan that’s well executed,” added Steven Adams, a center for the NBA’s Houston Rockets (who is currently not playing due to injury), during the same panel.

And yet, in some games and matches, the analytics do work, the plans do come to fruition, and the numbers do make a difference. When that happens, as John Wroblewski can now attest, the results are golden. 

3 Questions: Building predictive models to characterize tumor progression

Tue, 03/10/2026 - 4:50pm

Just as Darwin’s finches evolved in response to natural selection in order to endure, the cells that make up a cancerous tumor similarly counter selective pressures in order to survive, evolve, and spread. Tumors are, in fact, complex sets of cells with their own unique structure and ability to change. 

Today, artificial intelligence and machine learning tools offer an unparalleled opportunity to illuminate the generalizable rules governing tumor progression on the genetic, epigenetic, metabolic, and microenvironmental levels.

Matthew G. Jones, an assistant professor in the MIT Department of Biology, the Koch Institute for Integrative Cancer Research, and the Institute for Medical Engineering and Science, hopes to use computational approaches to build predictive models — to play a game of chess with cancer, making sense of a tumor’s ability to evolve and resist treatment with the ultimate goal of improving patient outcomes. In this interview, he describes his current work.

Q: What aspect of tumor progression are you working to explore and characterize? 

A: A very common story with cancer is that patients will respond to a therapy at first, and then eventually that treatment will stop working. The reason this largely happens is that tumors have an incredible, and very challenging, ability to evolve: the ability to change their genetic makeup, protein signaling composition, and cellular dynamics. The tumor as a system also evolves at a structural level. Oftentimes, the reason why a patient succumbs to a tumor is because either the tumor has evolved to a state we can no longer control, or it evolves in an unpredictable manner. 

In many ways, cancers can be thought of as, on the one hand, incredibly dysregulated and disorganized, and on the other hand, as having their own internal logic, which is constantly changing. The central thesis of my lab is that tumors follow stereotypical patterns in space and time, and we’re hoping to use computation and experimental technology to decode the molecular processes underlying these transformations.  

We’re focused on one specific way tumors are evolving through a form of DNA amplification called extrachromosomal DNA. Excised from the chromosome, these ecDNAs are circularized and exist as their own separate pool of DNA particles in the nucleus. 

Initially discovered in the 1960s, ecDNA were thought to be a rare event in cancer. However, as researchers began applying next-generation sequencing to large patient cohorts in the 2010s, it seemed like not only were these ecDNA amplifications conferring the ability of tumors to adapt to stresses, and therapies, faster, but that they were far more prevalent than initially thought.

We now know these ecDNA amplifications are apparent in about 25 percent of cancers, including the most aggressive cancers: brain, lung, and ovarian cancers. We have found that, for a variety of reasons, ecDNA amplifications are able to change the rule book by which tumors evolve in ways that allow them to accelerate to a more aggressive disease in very surprising ways.

Q: How are you using machine learning and artificial intelligence to study ecDNA amplifications and tumor evolution? 

A: There’s a mandate to translate what I’m doing in the lab to improve patients’ lives. I want to start with patient data to discover how various evolutionary pressures are driving disease and the mutations we observe. 

One of the tools we use to study tumor evolution is single-cell lineage tracing. Broadly, these technologies allow us to study the lineages of individual cells. When we sample a particular cell, not only do we know what that cell looks like, but we can (ideally) pinpoint exactly when aggressive mutations appeared in the tumor’s history. That evolutionary history gives us a way of studying these dynamic processes that we otherwise wouldn’t be able to observe in real time, and helps us make sense of how we might be able to intercept that evolution.

I hope we’re going to get better at stratifying patients who will respond to certain drugs, to anticipate and overcome drug resistance, and to identify new therapeutic targets.

Q: What excited you about joining the MIT community?

A: One of the things that I was really attracted to was the integration of excellence in both engineering and biological sciences. At the Koch Institute, every floor is structured to promote this interface between engineers and basic scientists, and beyond campus, we can connect with all the biomedical research enterprises in the greater Boston area. 

Another thing that drew me to MIT was the fact that it places such a strong emphasis on education, training, and investing in student success. I’m a personal believer that what distinguishes academic research from industry research is that academic research is fundamentally a service job, in that we are training the next generation of scientists. 

It was always a mission of mine to bring excellence to both computational and experimental technology disciplines. The types of trainees I’m hoping to recruit are those who are eager to collaborate and solve big problems that require both disciplines. The KI [Koch Institute] is uniquely set up for this type of hybrid lab: my dry lab is right next to my wet lab, and it’s a source of collaboration and connection, and that reflects the KI’s general vision. 

How Joseph Paradiso’s sensing innovations bridge the arts, medicine, and ecology

Tue, 03/10/2026 - 4:25pm

Joseph Paradiso thinks that the most engaging research questions usually span disciplines. 

Paradiso was trained as a physicist and completed his PhD in experimental high-energy physics at MIT in 1981. His father was a photographer and filmmaker working at MIT, MIT Lincoln Laboratory, and the MITRE Corporation, so he grew up in a house where artists, scientists, and engineers regularly gathered and interesting music was always playing. 

That mix of influences led him to the MIT Media Lab, where he is the Alexander W. Dreyfoos Professor, academic head of the Program in Media Arts and Sciences, and director of the Responsive Environments research group.

At the Media Lab, Paradiso conducts research that engages sensing of different kinds and applies it across diverse and often extreme applications. He works on developing technologies that can efficiently capture and process multiple sensing modalities, and leverages this capability in application domains like the internet of things, medicine, environmental sensing, space exploration, and artistic expression. These efforts use that information to help people better understand the world, express themselves, and connect with one another.

Early in his career, Paradiso helped pioneer the field of wireless wearable sensing. He built many systems with multiple embedded sensors that could send information from the human body in real time. One of his early flagship projects in this area was a pair of shoes fielded in 1997 for real-time augmented dance performance that embedded 16 sensors in each shoe, allowing wearers’ movements to directly generate music through algorithmic mapping. And Paradiso’s research at the Media Lab has consistently focused on sensing and using that information in new ways.

“When I would list all the sensors … people would laugh. But now, my watch is measuring most of these things,” Paradiso notes. “The world has moved.” 

That progression from early prototypes to everyday technology helped lay the groundwork for devices people now use regularly to track activity, health, and performance.

As sensing systems improved, Paradiso expanded his work from individuals to groups. He developed platforms that allowed dance ensembles to create music together through their collective motion. Achieving this required Paradiso and his team to develop new ways for compact wearable devices to communicate wirelessly at high speed, as well as new approaches to real-time data processing and extending the range of available microelectromechanical systems (MEMS) sensors.

Those same sensing platforms were later adapted for sports medicine in 2006. Working with doctors who support elite athletes, his array of compact, wearable sensors captured large amounts of high-speed motion data from multiple points on the body, aimed at helping clinicians assess injury risk, performance, and recovery on the go, without the complex equipment typically associated with biomechanical monitoring and clinical settings.

More recently, Paradiso’s research has extended beyond humans. Through collaborations with National Geographic Explorers, his team has deployed sensors in remote environments to study animal behavior, including low-power compact wearable devices to detect the environmental conditions around the animals as well as track them (currently on lions and hyenas in Botswana and goats in Chile), and acoustic sensors with onboard AI to detect and monitor populations of endangered honeybees in Patagonia. This work provides new ways to understand how ecosystems function and how the planet is changing.

Paradiso was named an IEEE Fellow in January, recognizing his achievement in wireless wearable sensing and mobile energy harvesting. This is the highest grade of membership in IEEE, the world’s leading professional association dedicated to advancing technology for the benefit of humanity.

Across art, health, and the natural world, Paradiso’s work reflects how foundational research at MIT can seed technologies that ripple outward over time, shaping new applications and opening new fields. As advances in wearable technologies drive the rush toward the ever-more-connected human, a persistent existential question lurks. 

“Where do I stop, versus others begin?” Paradiso asks. 

For him, the aim is not novelty for its own sake, but amplification: using technology to help people become more perceptive, better connected, and more aware of their place in a larger system.

MIT School of Engineering faculty receive awards in fall 2025

Tue, 03/10/2026 - 4:00pm

Each year, faculty and researchers across the MIT School of Engineering are recognized with prestigious awards for their contributions to research, technology, society, and education. To celebrate these achievements, the school periodically highlights select honors received by members of its departments, institutes, labs, and centers. The following individuals were recognized in fall 2025:

Hal Abelson, the Class of 1922 Professor in the Department of Electrical Engineering and Computer Science, received the 2025 Lifetime Achievement Award for Excellence from Open Education Global. The award honors his foundational impact on open education, Creative Commons, and open knowledge movements.

Faez Ahmed, the Henry L. Doherty Career Development Professor in Ocean Utilization in the Department of Mechanical Engineering, received an Amazon Research Award for his project “AutoDA‑Sim: A Multi‑Agent Framework for Safe, Aesthetic, and Aerodynamic Vehicle Design.” Amazon Research Awards provide unrestricted funds and AWS Promotional Credits to academic researchers investigating various research topics in multiple disciplines.

Pulkit Agrawal, an associate professor in the Department of Electrical Engineering and Computer Science, received the 2025 IROS Toshio Fukuda Young Professional Award for contributions to robot learning, policy learning, agile locomotion, and dexterous manipulation. The award recognizes outstanding contributions of an individual of the IROS community who has pioneered activities in robotics and intelligent systems.

Ahmad Bahai, a professor of the practice in the Department of Electrical Engineering and Computer Science, was elected to the 2025 class of Fellows of the National Academy of Inventors for contribution to innovation in new semiconductor devices with extensive applications in clinical grade personal sensors for a variety of biomarkers. The honor recognizes inventors whose patented work has made a meaningful global impact.

Yufeng (Kevin) Chen, an associate professor in the Department of Electrical Engineering and Computer Science, received the 2025 IROS Toshio Fukuda Young Professional Award for contributions to insect‑scale multimodal robots and soft‑actuated aerial systems. The award recognizes outstanding contributions of an individual of the IROS community who has pioneered activities in robotics and intelligent systems.

Angela Koehler, the Charles W. and Jennifer C. Johnson Professor in the Department of Biological Engineering, received the 2025 Sato Memorial International Award from the Pharmaceutical Society of Japan, recognizing advancements in pharmaceutical sciences and U.S.–Japan scientific collaboration.

Dina Katabi, the Thuan (1990) and Nicole Pham Professor in the Department of Electrical Engineering and Computer Science, was elected to the National Academy of Medicine for pioneering digital health technology that enables noninvasive, off-body remote health monitoring via AI and wireless signals, and for developing digital biomarkers for Parkinson’s progression and detection. Election to the academy is considered one of the highest honors in the fields of health and medicine, and recognizes individuals who have demonstrated outstanding professional achievement and commitment to service.

Darcy McRose, the Thomas D. and Virginia W. Cabot Career Development Professor in the Department of Civil and Environmental Engineering, was selected as a 2025 Packard Fellow for Science and Engineering. The Packard Foundation established the Packard Fellowships for Science and Engineering to allow the nation’s most promising early-career scientists and engineers flexible funding to take risks and explore new frontiers in their fields of study.

Muriel Médard, the NEC Professor of Software Science and Engineering in the Department of Electrical Engineering and Computer Science, received the 2026 IEEE Richard W. Hamming Medal for contributions to coding for reliable communications and networking. Recognized for breakthroughs in network coding and information theory, Médard’s innovations improve the reliability of data transmission in applications such as streaming video, wireless networks, and satellite communications. The award is given for exceptional contributions to information sciences, systems and technology.

Tess Smidt, an associate professor in the Department of Electrical Engineering and Computer Science, was selected as a 2025 AI2050 Fellow by Schmidt Sciences for her project, “Hierarchical Representations of Complex Physical Systems with Euclidean Neural Networks.” The program supports research that aims to help AI benefit humanity by mid‑century.
