AI Watermarking Won't Curb Disinformation
Generative AI allows people to produce piles upon piles of images and words very quickly. It would be nice if there were some way to reliably distinguish AI-generated content from human-generated content. It would help people avoid endlessly arguing with bots online, or believing what a fake image purports to show. One common proposal is that big companies should incorporate watermarks into the outputs of their AIs. For instance, this could involve taking an image and subtly changing many pixels in a way that’s undetectable to the eye but detectable to a computer program. Or it could involve swapping words for synonyms in a predictable way so that the meaning is unchanged, but a program could readily determine the text was generated by an AI.
Unfortunately, watermarking schemes are unlikely to work. So far most have proven easy to remove, and it’s likely that future schemes will have similar problems.
One kind of watermark is already common for digital images. Stock image sites often overlay text on an image that renders it mostly useless for publication. This kind of watermark is visible and is slightly challenging to remove since it requires some photo editing skills.
[Image: a stock-photo-style visible watermark overlaid on a photograph of Anemone occidentalis]
Images can also have metadata attached by a camera or image processing program, including information like the date, time, and location a photograph was taken, the camera settings, or the creator of an image. This metadata is unobtrusive but can be readily viewed with common programs. It’s also easily removed from a file. For instance, social media sites often automatically remove metadata when people upload images, both to prevent people from accidentally revealing their location and simply to save storage space.
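For instance, a few lines of Python are enough to copy an image’s pixels into a fresh file and leave the metadata behind. This is only a rough sketch of the kind of stripping an upload pipeline might perform, assuming the Pillow library; the filenames are hypothetical:

```python
# Rough sketch of metadata stripping on upload (assumes the Pillow library;
# "photo.jpg" is a hypothetical input file).
from PIL import Image

def strip_metadata(src_path: str, dst_path: str) -> None:
    img = Image.open(src_path)
    # Copy only the raw pixel data into a brand-new image, leaving EXIF data
    # (date, time, GPS coordinates, camera settings) behind.
    clean = Image.new(img.mode, img.size)
    clean.putdata(list(img.getdata()))
    clean.save(dst_path)

strip_metadata("photo.jpg", "photo-clean.jpg")
```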
A useful watermark for AI images would need two properties:
- It would need to continue to be detectable after an image is cropped, rotated, or edited in various ways (robustness).
- It couldn’t be conspicuous like the watermark on stock image samples, because the resulting images wouldn’t be of much use to anybody.
One simple technique is to manipulate the least perceptible bits of an image. For instance, to a human viewer, two squares filled with the colors #93c47d and #93c57d appear to be exactly the same shade. But to a computer it’s obvious that they differ by a single bit. Each pixel of an image is represented by a certain number of bits, and some of them make more of a perceptual difference than others. By manipulating those least important bits, a watermarking program can create a pattern that viewers won’t see, but a watermark-detecting program will. If that pattern repeats across the whole image, the watermark is even robust to cropping. However, this method has one clear flaw: rotating or resizing the image is likely to accidentally destroy the watermark.
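A toy sketch of such a least-significant-bit watermark (an illustration, not any vendor’s actual scheme) might embed and detect a repeating bit pattern like this, using NumPy on a grayscale image:

```python
# Toy least-significant-bit (LSB) watermark: illustrative only.
import numpy as np

def embed_lsb(pixels: np.ndarray, pattern: np.ndarray) -> np.ndarray:
    """Overwrite each pixel's least significant bit with a repeating pattern."""
    tiled = np.resize(pattern, pixels.shape)      # repeat the pattern across the image
    return (pixels & 0xFE) | tiled                # clear the LSB, then set it

def detect_lsb(pixels: np.ndarray, pattern: np.ndarray) -> float:
    """Fraction of pixels whose LSB matches the expected pattern."""
    tiled = np.resize(pattern, pixels.shape)
    return float(np.mean((pixels & 1) == tiled))

image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in image
secret = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)        # watermark pattern
marked = embed_lsb(image, secret)

print(detect_lsb(marked, secret))   # ~1.0: watermark present
print(detect_lsb(image, secret))    # ~0.5: no watermark, LSBs match only by chance
```

Because the pattern repeats, a cropped copy still scores near 1.0; but any operation that resamples pixels, such as rotation or resizing, scrambles the least significant bits and the score falls back toward 0.5.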
There are more sophisticated watermarking proposals that are robust to a wider variety of common edits. However, proposals for AI watermarking must pass a tougher challenge. They must be robust against someone who knows about the watermark and wants to eliminate it. The person who wants to remove a watermark isn’t limited to common edits, but can directly manipulate the image file. For instance, if a watermark is encoded in the least important bits of an image, someone could remove it by simply setting all the least important bits to 0, or to a random value (1 or 0), or to a value automatically predicted based on neighboring pixels. Just like adding a watermark, removing a watermark this way gives an image that looks basically identical to the original, at least to a human eye.
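Continuing the toy sketch above, an adversary who knows the watermark lives in the least significant bits does not even need to know the pattern; overwriting those bits with random values is enough:

```python
# Toy watermark removal: overwrite every least significant bit with noise.
import numpy as np

def scrub_lsb(pixels: np.ndarray) -> np.ndarray:
    noise = np.random.randint(0, 2, size=pixels.shape, dtype=pixels.dtype)
    return (pixels & 0xFE) | noise

# detect_lsb(scrub_lsb(marked), secret) from the earlier sketch now hovers
# around 0.5, the same as an image that was never watermarked, while the
# pixels differ from the marked image by at most one brightness step.
```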
Coming at the problem from the opposite direction, some companies are working on ways to prove that an image came from a camera (“content authenticity”). Rather than marking AI generated images, they add metadata to camera-generated images, and use cryptographic signatures to prove the metadata is genuine. This approach is more workable than watermarking AI generated images, since there’s no incentive to remove the mark. In fact, there’s the opposite incentive: publishers would want to keep this metadata around because it helps establish that their images are “real.” But it’s still a fiendishly complicated scheme, since the chain of verifiability has to be preserved through all software used to edit photos. And most cameras will never produce this metadata, meaning that its absence can’t be used to prove a photograph is fake.
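A heavily simplified sketch of the signing idea (an illustration, not the actual C2PA specification) might have the camera sign the image bytes together with its metadata. The device name, timestamp, and use of an Ed25519 key are assumptions made for this example, which uses the Python cryptography package:

```python
# Simplified content-authenticity sketch: sign image + metadata with a
# camera-held key. Illustrative only; not the real C2PA scheme.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

camera_key = Ed25519PrivateKey.generate()          # in reality, provisioned in the camera

metadata = json.dumps({
    "device": "ExampleCam 3000",                   # hypothetical device name
    "captured": "2024-01-05T14:02:00Z",            # hypothetical timestamp
}).encode()
image_bytes = b"...raw image data..."              # placeholder for the photo itself

signature = camera_key.sign(image_bytes + metadata)

# Anyone holding the camera maker's public key can verify that neither the
# image nor its metadata changed after capture; verify() raises if they did.
camera_key.public_key().verify(signature, image_bytes + metadata)
```

The hard part, as noted above, is keeping a chain like this intact through every editing tool that touches the photo afterward.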
Comparing watermarking vs content authenticity, watermarking aims to identify or mark (some) fake images; content authenticity aims to identify or mark (some) real images. Neither approach is comprehensive, since most of the images on the Internet will have neither a watermark nor content authenticity metadata.
|                      | Watermarking | Content authenticity |
|----------------------|--------------|----------------------|
| AI images            | Marked       | Unmarked             |
| (Some) camera images | Unmarked     | Marked               |
| Everything else      | Unmarked     | Unmarked             |

Text-based Watermarks
The watermarking problem is even harder for text-based generative AI. Similar techniques can be devised. For instance, an AI could boost the probability of certain words, giving itself a subtle textual style that would go unnoticed most of the time, but could be recognized by a program with access to the list of words. This would effectively be a computer version of determining the authorship of the twelve disputed essays in The Federalist Papers by analyzing Madison’s and Hamilton’s habitual word choices.
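A toy detector for such a scheme (loosely modeled on published “green list” proposals, not any deployed system) could simply measure how often a secret list of favored words appears; the word list below is invented for illustration:

```python
# Toy word-choice watermark detector; the favored-word list is made up.
FAVORED = {"moreover", "notably", "utilize", "whilst", "robust", "pivotal"}

def favored_fraction(text: str) -> float:
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in FAVORED for w in words) / len(words)

# Text from a generator nudged toward FAVORED words scores noticeably higher
# than ordinary prose of the same length; a real detector would turn this
# into a statistical test rather than eyeballing a single fraction.
print(favored_fraction("Moreover, the robust results were notably pivotal."))
```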
But creating an indelible textual watermark is a much harder task than telling Hamilton from Madison, since the watermark must be robust to someone modifying the text trying to remove it. Any watermark based on word choice is likely to be defeated by some amount of rewording. That rewording could even be performed by an alternate AI, perhaps one that is less sophisticated than the one that generated the original text, but not subject to a watermarking requirement.
There’s also a problem of whether the tools to detect watermarked text are publicly available or are secret. Making detection tools publicly available gives an advantage to those who want to remove watermarking, because they can repeatedly edit their text or image until the detection tool gives an all clear. But keeping them a secret makes them dramatically less useful, because every detection request must be sent to whatever company produced the watermarking. That would potentially require people to share private communication if they wanted to check for a watermark. And it would hinder attempts by social media companies to automatically label AI-generated content at scale, since they’d have to run every post past the big AI companies.
Since text output from current AIs isn’t watermarked, services like GPTZero and Turnitin have popped up, claiming to be able to detect AI-generated content anyway. These detection tools are so inaccurate as to be dangerous, and have already led to false charges of plagiarism.
Lastly, if AI watermarking is to prevent disinformation campaigns sponsored by states, it’s important to keep in mind that those states can readily develop modern generative AI, and probably will in the near future. A state-sponsored disinformation campaign is unlikely to be so polite as to watermark its output.
Watermarking of AI generated content is an easy-sounding fix for the thorny problem of disinformation. And watermarks may be useful in understanding reshared content where there is no deceptive intent. But research into adversarial watermarking for AI is just beginning, and while there’s no strong reason to believe it will succeed, there are some good reasons to believe it will ultimately fail.
Improving Shor’s Algorithm
We don’t have a useful quantum computer yet, but we do have quantum algorithms. Shor’s algorithm has the potential to factor large numbers faster than otherwise possible, which—if the run times are actually feasible—could break both the RSA and Diffie-Hellman public-key algorithms.
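For context, only the order-finding step of Shor’s algorithm needs a quantum computer; the rest is classical number theory. Here is a rough sketch with the quantum step replaced by brute force, so it only works for tiny numbers:

```python
# Classical skeleton of Shor's algorithm. The order() function stands in for
# the quantum subroutine; everything else runs on an ordinary computer.
from math import gcd
from random import randrange

def order(a: int, n: int) -> int:
    """Smallest r > 0 with a**r ≡ 1 (mod n), found by brute force."""
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def factor(n: int) -> int:
    while True:
        a = randrange(2, n)
        if gcd(a, n) > 1:              # lucky guess already shares a factor
            return gcd(a, n)
        r = order(a, n)                # the step a quantum computer would accelerate
        if r % 2 == 0:
            f = gcd(pow(a, r // 2, n) - 1, n)
            if 1 < f < n:
                return f               # nontrivial factor found

print(factor(3233))                    # 3233 = 53 * 61
```

The quoted improvement below concerns the cost of that quantum part.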
Now, computer scientist Oded Regev has a significant speed-up to Shor’s algorithm, at the cost of more storage.
Details are in this article. Here’s the result:
The improvement was profound. The number of elementary logical steps in the quantum part of Regev’s algorithm is proportional to ...
Germany's CO2 emissions are at their lowest in 7 decades — study
Extreme cold leaves thousands without power in Nordic countries
World’s biggest banks made $3B on green debt in 2023
Environmentalists challenge California rooftop solar decision
Federal study aims to improve offshore wind forecasting
Home buyouts raised N.Y. property values after historic flooding
Budget crunch threatens Maryland climate plans
Environmental lawyers are worried about the new youth climate case
EFF Asks Court to Uphold Federal Law That Protects Online Video Viewers’ Privacy and Free Expression
As millions of internet users watch videos online for news and entertainment, it is essential to uphold a federal privacy law that protects against the disclosure of everyone’s viewing history, EFF argued in court last month.
For decades, the Video Privacy Protection Act (VPPA) has safeguarded people’s viewing habits by generally requiring services that offer videos to the public to get their customers’ written consent before disclosing that information to the government or a private party. Although Congress enacted the law in an era of physical media, the VPPA applies to internet users’ viewing habits, too.
The VPPA, however, is under attack by Patreon. That service for content creators and viewers is facing a lawsuit in a federal court in Northern California, brought by users who allege that the company improperly shared information about the videos they watched on Patreon with Facebook.
Patreon argues that even if it did violate the VPPA, federal courts cannot enforce it because the privacy law violates the First Amendment on its face under a legal doctrine known as overbreadth. This doctrine asks whether a substantial number of the challenged law’s applications violate the First Amendment, judged in relation to the law’s plainly legitimate sweep. Courts have rightly struck down overbroad laws because they prohibit vast amounts of lawful speech. For example, the Supreme Court in Reno v. ACLU invalidated much of the Communications Decency Act’s (CDA) online speech restrictions because it placed an “unacceptably heavy burden on protected speech.”
EFF is second to none in fighting for everyone’s First Amendment rights in court, including internet users (in Reno mentioned above) and the companies that host our speech online. But Patreon’s First Amendment argument is wrong and misguided. The company seeks to elevate its speech interests over those of internet users who benefit from the VPPA’s protections.
As EFF, the Center for Democracy & Technology, the ACLU, and the ACLU of Northern California argued in their friend-of-the-court brief, Patreon’s argument is wrong because the VPPA directly advances the First Amendment and privacy interests of internet users by ensuring they can watch videos without being chilled by government or private surveillance.
“The VPPA provides Americans with critical, private space to view expressive material, develop their own views, and to do so free from unwarranted corporate and government intrusion,” we wrote. “That breathing room is often a catalyst for people’s free expression.”
As the brief recounts, courts have protected against government efforts to learn people’s book buying and library history, and to punish people for viewing controversial material within the privacy of their home. These cases recognize that protecting people’s ability to privately consume media advances the First Amendment’s purpose by ensuring exposure to a variety of ideas, a prerequisite for robust debate. Moreover, people’s video viewing habits are intensely private, because the data can reveal intimate details about our personalities, politics, religious beliefs, and values.
Patreon’s First Amendment challenge is also wrong because the VPPA is not an overbroad law. As our brief explains, “[t]he VPPA’s purpose, application, and enforcement is overwhelmingly focused on regulating the disclosure of a person’s video viewing history in the course of a commercial transaction between the provider and user.” In other words, the legitimate sweep of the VPPA does not violate the First Amendment because generally there is no public interest in disclosing any one person’s video viewing habits that a company learns purely because it is in the business of selling video access to the public.
There is a better path to addressing any potential unconstitutional applications of the video privacy law short of invalidating the statute in its entirety. As EFF’s brief explains, should a video provider face liability under the VPPA for disclosing a customer’s video viewing history, they can always mount a First Amendment defense based on a claim that the disclosure was on a matter of public concern.
Indeed, courts have recognized that certain applications of privacy laws, such as the Wiretap Act and civil claims prohibiting the disclosure of private facts, can violate the First Amendment. But generally courts address the First Amendment by invalidating the case-specific application of those laws, rather than invalidating them entirely.
“In those cases, courts seek to protect the First Amendment interests at stake while continuing to allow application of those privacy laws in the ordinary course,” EFF wrote. “This approach accommodates the broad and legitimate sweep of those privacy protections while vindicating speakers’ First Amendment rights.”
Patreon’s argument would see the VPPA gutted—an enormous loss of privacy and free expression for the public. The court should reject that argument and uphold the VPPA’s protections against the disclosure of everyone’s viewing history.
You can read our brief here.
Improving patient safety using principles of aerospace engineering
Approximately 13 billion laboratory tests are administered every year in the United States, but not every result is timely or accurate. Laboratory missteps prevent patients from receiving appropriate, necessary, and sometimes lifesaving care. These medical errors are the third-leading cause of death in the nation.
To help reverse this trend, a research team from the MIT Department of Aeronautics and Astronautics (AeroAstro) Engineering Systems Lab and Synensys, a safety management contractor, examined the ecosystem of diagnostic laboratory data. Their findings, including six systemic factors contributing to patient hazards in laboratory diagnostics tests, offer a rare holistic view of this complex network — not just doctors and lab technicians, but also device manufacturers, health information technology (HIT) providers, and even government entities such as the White House. By viewing the diagnostic laboratory data ecosystem as an integrated system, an approach based on systems theory, the MIT researchers have identified specific changes that can lead to safer behaviors for health care workers and healthier outcomes for patients.
A report of the study, which was conducted by AeroAstro Professor Nancy Leveson, who serves as head of the System Safety and Cybersecurity group, along with Research Engineer John Thomas and graduate students Polly Harrington and Rodrigo Rose, was submitted to the U.S. Food and Drug Administration this past fall. Improving the infrastructure of laboratory data has been a priority for the FDA, which contracted the study through Synensys.
Hundreds of hazards, six causes
In a yearlong study that included more than 50 interviews, the Leveson team found the diagnostic laboratory data ecosystem to be vast yet fractured. No one understood how the whole system functioned or the totality of substandard treatment patients received. Well-intentioned workers were being influenced by the system to carry out unsafe actions, MIT engineers wrote.
Test results sent to the wrong patients, incompatible technologies that strain information sharing between the doctor and lab technician, and specimens transported to the lab without guarantees of temperature control were just some of the hundreds of hazards the MIT engineers identified. The sheer volume of potential risks, known as unsafe control actions (UCAs), should not dissuade health care stakeholders from seeking change, Harrington says.
“While there are hundreds of UCAs, there are only six systemic factors that are causing these hazards,” she adds. “Using a system-based methodology, the medical community can address many of these issues with one swoop.”
Four of the systemic factors — decentralization, flawed communication and coordination, insufficient focus on safety-related regulations, and ambiguous or outdated standards — reflect the need for greater oversight and accountability. The two remaining systemic factors — misperceived notions of risk and lack of systems theory integration — call for a fundamental shift in perspective and operations. For instance, the medical community, including doctors themselves, tends to blame physicians when errors occur. Understanding the real risk levels associated with laboratory data and HIT might prompt more action for change, the report’s authors wrote.
“There’s this expectation that doctors will catch every error,” Harrington says. “It’s unreasonable and unfair to expect that, especially when they have no reason to assume the data they're getting is flawed.”
Think like an engineer
Systems theory may be a new concept to the medical community, but the aviation industry has used it for decades.
“After World War II, there were so many commercial aviation crashes that the public was scared to fly,” says Leveson, a leading expert in system and software safety. In the early 2000s, she developed the System-Theoretic Process Analysis (STPA), a technique based on systems theory that offers insights into how complex systems can become safer. The researchers used STPA in their report to the FDA. “Industry and government worked together to put controls and error reporting in place. Today, there are nearly zero crashes in the U.S. What’s happening in health care right now is like having a Boeing 787 crash every day.”
Other engineering principles that work well in aviation, such as control systems, could be applied to health care as well, Thomas says. For instance, closed-loop controls solicit feedback so a system can change and adapt. Having laboratories confirm that physicians received their patients’ test results or investigating all reports of diagnostic errors are examples of closed-loop controls that are not mandated in the current ecosystem, Thomas says.
“Operating without controls is like asking a robot to navigate a city street blindfolded,” Thomas says. “There’s no opportunity for course correction. Closed-loop controls help inform future decision-making, and, at this point in time, it’s missing in the U.S. health-care system.”
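As a hypothetical illustration of that closed-loop idea (not drawn from the MIT report), a results-delivery system might refuse to treat a lab result as delivered until the ordering physician acknowledges it, escalating anything left unconfirmed:

```python
# Hypothetical closed-loop check: lab results must be acknowledged, and
# unacknowledged results are escalated after a waiting period.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class LabResult:
    patient_id: str
    test: str
    sent_at: datetime
    acknowledged: bool = False

def overdue_results(results: list[LabResult], max_wait: timedelta) -> list[LabResult]:
    """Results that were sent but never confirmed within the allowed window."""
    now = datetime.now()
    return [r for r in results if not r.acknowledged and now - r.sent_at > max_wait]

results = [LabResult("p-001", "potassium", datetime.now() - timedelta(hours=30))]
for r in overdue_results(results, max_wait=timedelta(hours=24)):
    print(f"Escalate: {r.test} result for {r.patient_id} was never acknowledged")
```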
The Leveson team will continue working with Synensys on behalf of the FDA. Their next study will investigate diagnostic screenings outside the laboratory, such as at a physician’s office (point of care) or at home (over the counter). Since the start of the Covid-19 pandemic, nonclinical lab testing has surged in the country. About 600 million Covid-19 tests were sent to U.S. households between January and September 2022, according to Synensys. Yet, few systems are in place to aggregate these data or report findings to public health agencies.
“There’s a lot of well-meaning people trying to solve this and other lab data challenges,” Rose says. “If we can convince people to think of health care as an engineered system, we can go a long way in solving some of these entrenched problems.”
The Synensys research contract is part of the Systemic Harmonization and Interoperability Enhancement for Laboratory Data (SHIELD) campaign, an agency initiative that seeks assistance and input in using systems theory to address this challenge.
Inclusive research for social change
Pair a decades-old program dedicated to creating research opportunities for underrepresented minorities and populations with a growing initiative committed to tackling the very issues at the heart of such disparities, and you’ll get a transformative partnership that only MIT can deliver.
Since 1986, the MIT Summer Research Program (MSRP) has led an institutional effort to prepare underrepresented students (minorities, women in STEM, or students with low socioeconomic status) for doctoral education by pairing them with MIT labs and research groups. For the past three years, the Initiative on Combatting Systemic Racism (ICSR), a cross-disciplinary research collaboration led by MIT’s Institute for Data, Systems, and Society (IDSS), has joined them in their mission, helping bring the issue full circle by providing MSRP students with the opportunity to use big data and computational tools to create impactful changes toward racial equity.
“ICSR has further enabled our direct engagement with undergrads, both within and outside of MIT,” says Fotini Christia, the Ford International Professor of the Social Sciences, associate director of IDSS, and co-organizer for the initiative. “We've found that this line of research has attracted students interested in examining these topics with the most rigorous methods.”
The initiative fits well under the IDSS banner, as IDSS research seeks solutions to complex societal issues through a multidisciplinary approach that includes statistics, computation, modeling, social science methodologies, human behavior, and an understanding of complex systems. With the support of faculty and researchers from all five schools and the MIT Schwarzman College of Computing, the objective of ICSR is to work on an array of different societal aspects of systemic racism through a set of verticals including policing, housing, health care, and social media.
Where passion meets impact
Grinnell senior Mia Hines has always dreamed of using her love for computer science to support social justice. She has experience working with unhoused people and labor unions, and advocating for Indigenous peoples’ rights. When applying to college, she focused her essay on using technology to help Syrian refugees.
“As a Black woman, it's very important to me that we focus on these areas, especially on how we can use technology to help marginalized communities,” Hines says. “And also, how do we stop technology or improve technology that is already hurting marginalized communities?”
Through MSRP, Hines was paired with research advisor Ufuoma Ovienmhada, a fourth-year doctoral student in the Department of Aeronautics and Astronautics at MIT. A member of Professor Danielle Wood’s Space Enabled research group at MIT’s Media Lab, Ovienmhada received funding from an ICSR Seed Grant and NASA's Applied Sciences Program to support her ongoing research measuring environmental injustice and socioeconomic disparities in prison landscapes.
“I had been doing satellite remote sensing for environmental challenges and sustainability, starting out looking at coastal ecosystems, when I learned about an issue called ‘prison ecology,’” Ovienmhada explains. “This refers to the intersection of mass incarceration and environmental justice.”
Ovienmhada’s research uses satellite remote sensing and environmental data to characterize exposures to different environmental hazards such as air pollution, extreme heat, and flooding. “This allows others to use these datasets for real-time advocacy, in addition to creating public awareness,” she says.
Focusing especially on extreme heat, Hines used satellite remote sensing to monitor temperature fluctuations and assess the risk imposed on prisoners, up to and including death, particularly in states like Texas, where 75 percent of prisons have only partial air conditioning or none at all.
“Before this project I had done little to no work with geospatial data, and as a budding data scientist, getting to work with and understanding different types of data and resources is really helpful,” Hines says. “I was also funded and afforded the flexibility to take advantage of IDSS’s Data Science and Machine Learning online course. It was really great to be able to do that and learn even more.”
Filling the gap
Much like Hines, Harvey Mudd senior Megan Li was specifically interested in the IDSS-supported MSRP projects. She was drawn to the interdisciplinary approach, and she seeks in her own work to apply computational methods to societal issues and to make computer science more inclusive, considerate, and ethical.
Working with Aurora Zhang, a grad student in IDSS’s Social and Engineering Systems PhD program, Li used county-level data on income and housing prices to quantify and visualize how affordability based on income alone varies across the United States. She then expanded the analysis to include assets and debt to determine the most common barriers to home ownership.
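As a hypothetical sketch of that kind of calculation (not the study’s actual methodology), one might compare county-level home prices to incomes with a simple price-to-income ratio; the data and the affordability threshold below are invented:

```python
# Hypothetical county-level affordability sketch using pandas.
import pandas as pd

counties = pd.DataFrame({
    "county": ["Adams", "Baker", "Clark"],           # made-up counties
    "median_income": [58_000, 72_000, 64_000],
    "median_home_price": [240_000, 410_000, 520_000],
})

# Rule of thumb used here: a home is "affordable" on income alone if it costs
# no more than roughly three times annual household income.
counties["price_to_income"] = counties["median_home_price"] / counties["median_income"]
counties["affordable_on_income_alone"] = counties["price_to_income"] <= 3.0

print(counties.sort_values("price_to_income"))
```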
“I spent my day-to-day looking at census data and writing Python scripts that could work with it,” reports Li. “I also reached out to the Census Bureau directly to learn a little bit more about how they did their data collection, and discussed questions related to some of their previous studies and working papers that I had reviewed.”
Outside of actual day-to-day research, Li says she learned a lot in conversations with fellow researchers, particularly changing her “skeptical view” of whether or not mortgage lending algorithms would help or hurt home buyers in the approval process. “I think I have a little bit more faith now, which is a good thing.”
“Harvey Mudd is undergraduate-only, and while professors do run labs here, my specific research areas are not well represented,” Li says. “This opportunity was enormous in that I got the experience I need to see if this research area is actually something that I want to do long term, and I got more mirrors into what I would be doing in grad school from talking to students and getting to know faculty.”
Closing the loop
While participating in MSRP offered crucial research experience to Hines, the ICSR projects enabled her to engage in topics she's passionate about and work that could drive tangible societal change.
“The experience felt much more concrete because we were working on these very sophisticated projects, in a supportive environment where people were very excited to work with us,” she says.
A significant benefit for Li was the chance to steer her research in alignment with her own interests. “I was actually given the opportunity to propose my own research idea, versus supporting a graduate student's work in progress,” she explains.
For Ovienmhada, the pairing of the two initiatives solidifies the efforts of MSRP and closes a crucial loop in diversity, equity, and inclusion advocacy.
“I've participated in a lot of different DEI-related efforts and advocacy and one thing that always comes up is the fact that it’s not just about bringing people in, it's also about creating an environment and opportunities that align with people’s values,” Ovienmhada says. “Programs like MSRP and ICSR create opportunities for people who want to do work that’s aligned with certain values by providing the needed mentoring and financial support.”
New iPhone Exploit Uses Four Zero-Days
Kaspersky researchers are detailing “an attack that over four years backdoored dozens if not thousands of iPhones, many of which belonged to employees of Moscow-based security firm Kaspersky.” It’s a zero-click exploit that makes use of four iPhone zero-days.
The most intriguing new detail is the targeting of the heretofore-unknown hardware feature, which proved to be pivotal to the Operation Triangulation campaign. A zero-day in the feature allowed the attackers to bypass advanced hardware-based memory protections designed to safeguard device system integrity even after an attacker gained the ability to tamper with memory of the underlying kernel. On most other platforms, once attackers successfully exploit a kernel vulnerability they have full control of the compromised system...