The “DeepCom” AI model developed by the Microsoft and Beihang University team showed that it could effectively mimic human behavior by reading and commenting on news articles written in English and Chinese. But the original paper uploaded to the arXiv preprint server on 26 September made no mention of ethical issues regarding possible misuse of the technology. The omission sparked a backlash that eventually prompted the research team to upload an updated paper addressing those concerns.
“A paper by Beijing researchers presents a new machine learning technique whose main uses seem to be trolling and disinformation,” wrote Arvind Narayanan, a computer scientist at the Center for Information Technology Policy at Princeton University, in a Twitter post. “It’s been accepted for publication at EMLNP [sic], one of the top 3 venues for Natural Language Processing research. Cool Cool Cool [sic].”
The Microsoft and Beihang University paper has spurred discussion within the broader research community about whether machine learning researchers should follow stricter guidelines and more openly acknowledge the possible negative implications of certain AI applications. The paper is currently scheduled for presentation at the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Hong Kong on 7 November.
Both Narayanan and David Ha, a scientist at Google Brain Research, voiced their skepticism of the original paper’s suggestion that “automatic news comment generation is beneficial for real applications but has not attracted enough attention from the research community.” Ha sarcastically asked if there would be a follow-up paper about an AI model called “DeepTroll” or “DeepWumao.” (“Wumao” is the name for Chinese Internet commentators paid by the Chinese Communist Party to help manipulate public opinion by posting comments online.)
Jack Clark, a former journalist turned policy director for the OpenAI research organization, gave a more blunt rebuttal to the paper’s suggestion: “As a former journalist, I can tell you that this is a lie.”
Researchers such as Alvin Grissom II, a computer scientist at Ursinus College in Collegeville, Penn., raised questions about what types of AI research deserve to be publicized by prominent research conferences such as EMNLP. “I think there’s qualitative difference between research on fundamental problems that have the potential for misuse and applications which are specifically suited to, if not designed for, misuse,” Grissom wrote in a Twitter post.
The Microsoft and Beihang University researchers’ updated paper, which acknowledges some of the ethical concerns, was uploaded after Katyanna Quach reported on the controversy for The Register. The updated version also removed the original paper’s statement that “automatic news comment generation is beneficial for real applications.”
“We are aware of potential ethical issues with application of these methods to generate news commentary that is taken as human,” the researchers wrote in the updated paper’s conclusion. “We hope to stimulate discussion about best practices and controls on these methods around responsible uses of the technology.”
The updated paper’s conclusion about possible applications also specifically mentions that the team was “motivated to extend the capabilities of a popular chatbot.” That almost certainly refers to Microsoft’s China-based chatbot named Xiaoice. It has more than 660 million users worldwide and has become a virtual celebrity in China. Wei Wu, one of the coauthors on the DeepCom paper, holds the position of principal applied scientist for the Microsoft Xiaoice team at Microsoft Research Asia in Beijing.
The Microsoft and Beihang University researchers did not provide much additional input when reached for comment. Instead, both Wu and a Microsoft representative referred to the updated version of the paper that acknowledges the ethical issues. But the Microsoft representative was unable to refer IEEE Spectrum to a single source who could speak about the company’s research review process.
“I’d like to hear from Microsoft if they had any ethical review process in place, and whether they plan to make any changes to their processes in the future in response to the concerns about this paper,” Narayanan wrote in an email to IEEE Spectrum. His prior work includes research on how AI can learn gender and racial biases from language.
Microsoft has previously staked out a position for itself as a leader in AI ethics with initiatives such as the company’s AI and Ethics in Engineering and Research (AETHER) Committee. That committee’s advice has supposedly led Microsoft to reject certain sales of its commercialized technology in the past. It’s less clear how much AETHER is involved in screening AI research collaborations prior to the AI application and commercialization stage.
Meanwhile, Narayanan and other researchers have also asked questions about the review process for accepting papers at the EMNLP conference being held in Hong Kong. Narayanan urged conference attendees to direct questions at both the paper’s authors and the program chairs for the conference. (The EMNLP organizing committee had not responded to a request for comment as of publication time.)
“Security conferences these days require submissions to describe ethical considerations and how the authors followed ethical principles,” Narayanan wrote in a Twitter post. “Machine learning conferences should consider doing this.”