A whole industry of data brokers buys up vast quantities of electronic information from cell phone apps and web browsers and sells it to advertisers who use that data to target ads. The same industry also sells that data, including bulk cell phone location data, to police departments and federal government agencies in ways that can reveal intimate details about Americans without a warrant.

Now, privacy advocates say that the best chance for Congress to close the well-known loophole around the Fourth Amendment that allows for that sort of governmental snooping is coming up in just a few weeks.

That’s when Congress is expected to take up reauthorization of what is known as Section 702 of the Foreign Intelligence Surveillance Act, which is set to expire on April 20.

After a 2015 change to the law, federal agencies are not supposed to collect data on U.S. citizens in bulk. But some found a workaround to requesting warrants by simply buying the data instead.

Last week, some 130 civil society organizations signed on to a letter urging members of Congress to include closing the data broker loophole in FISA 702 reauthorization, citing the “unprecedented expansion of warrantless mass surveillance that is sweeping up the private information of communities across America” and the potential for the loophole to be used “to supercharge AI-powered surveillance.”

At a Senate hearing last week, Sen. Ron Wyden (D-Ore.) asked Federal Bureau of Investigation Director Kash Patel if he would commit to not buying Americans’ location data, which is usually obtained from cell phones. Patel declined to do so, instead saying the FBI “uses all tools” and “we do purchase commercially available information that’s consistent with the Constitution and the laws under the Electronic Communications Privacy Act, and it has led to some valuable intelligence for us.”

A spokesperson for the FBI declined to comment on which commercial data the FBI purchases. In 2023, then-FBI director Christopher Wray had indicated that the agency had backed away from using “commercial database information that includes location data derived from internet advertising.”

Location records from brokers are typically unlinked to a device owner’s name. But tools exist that help law enforcement track where a device has gone, where it spends every night and where it goes during working hours, said Bill Budington, a senior staff technologist at the Electronic Frontier Foundation, a privacy advocacy organization.

AI tools present new challenges for privacy

Artificial intelligence can be leveraged to make such data even more powerful. The CEO of the AI company Anthropic, Dario Amodei, warned in a statement last month that records the government can purchase can be used by AI to assemble “a comprehensive picture of any person’s life—automatically and at massive scale.”

Amodei’s unwillingness to allow Anthropic’s technology to be used for domestic mass surveillance or autonomous weapons has led to a major fight with the Pentagon, which says a private company cannot dictate how the government lawfully uses its technology.

In addition to the FBI and the Department of Defense, Immigration and Customs Enforcement is among the federal agencies that have had known contracts for tools that rely on cell phone location information sourced from data brokers. These developments come as ICE is ramping up its efforts to surveil not only immigrants who are targeted for deportation, but also protesters and people who record federal agents, using tools such as facial recognition, license plate data and administrative subpoenas to tech companies for user information.

Earlier this year, ICE posted a request on a federal procurement site seeking industry feedback about “commercial Big Data and Ad Tech” that could be used in its investigations, as was first reported by WIRED.

Last year, ICE signed a contract with the company Penlink for its program Webloc, which can be used to track the movements of mobile phones or find phones that have visited specific places, according to reporting by the tech news outlet 404 Media.

A Penlink spokesperson told NPR in a statement that the company “understands the sensitivity and complexity of data privacy” and “the vendors we use to make location data available to our customers filter out sensitive locations, such as hospitals, schools, and religious institutions.”

The statement continued, “We are committed to complying with applicable laws and regulations, as our customers are required to do, and we update our practices as those laws change.”

ICE did not respond to NPR’s request for comment about the phone tracking technology and how it’s used.

Government data purchases without a warrant are “contributing to an ever-expanding infrastructure of private sector surveillance that is hurtling us into a dystopian surveillance society,” Jeramie D. Scott, senior counsel and director of the Surveillance and Oversight Program at the Electronic Privacy Information Center, told NPR.

FISA bill is “only chance” this year to end bulk data collection

Privacy and civil liberties advocates say the upcoming FISA reauthorization debate is the best chance to close the so-called “data broker loophole” that federal agencies are using to purchase the kind of bulk data that Congress has already banned them from collecting themselves.

“This is very likely the only chance that Congress has this year to vote for meaningful privacy protections,” said Sean Vitka, executive director of Demand Progress, an advocacy group that has helped bring together an unusual coalition supporting federal surveillance reforms with backers from opposite sides of the political spectrum.

Without reform, he added, “The Trump administration is walking around with the most dangerous surveillance powers in recent history,” given recent advances in AI, expanding use of data from brokers and changes to FISA that Congress passed in 2024.

Rep. Warren Davidson (R-Ohio), along with conservative Sen. Mike Lee (R-Utah), teamed up with Democrats Rep. Zoe Lofgren and Wyden on bicameral, bipartisan FISA reform legislation that would end the data broker loophole, among several other reforms.

“This is one of those issues that really doesn’t break on party lines,” Rep. Warren Davidson (R-Ohio) told NPR. “You’re collecting data that really you would never get a warrant for, that kind of a broad dragnet sweep under normal warrant requirements.”

Kevin Dietsch/Getty Images

“This is one of those issues that really doesn’t break on party lines,” Davidson told NPR.

When the federal government purchases data from brokers, Davidson said, “You’re collecting data that really you would never get a warrant for, that kind of a broad dragnet sweep under normal warrant requirements.”

He also hopes to shut another loophole, known as the “backdoor search” loophole, by ending the practice of federal agencies searching, without a warrant, Americans’ communications that were swept up in the bulk collection of communications of foreigners outside the country.

But tying reforms to FISA’s reauthorization faces opposition from members of both parties. The White House and House Speaker Mike Johnson are both pushing for a clean reauthorization of FISA that would include no changes, and there are some Democrats who have indicated they support that plan to ensure the law does not lapse.

Still, amid opposition from members of his own party over a clean reauthorization, Johnson delayed a House vote on the issue until mid-April.

Courts have not weighed in on the practice of the federal government buying up bulk data from data brokers, making it an untested legal grey area. Privacy advocates argue the practice circumvents the Fourth Amendment and is contrary to a 2015 law that bars federal agencies from collecting bulk data on Americans. That law, the USA Freedom Act, came after former National Security Agency contractor Edward Snowden leaked classified information on how the agency was collecting Americans’ phone records.

Purchasing bulk data from data brokers is “very much not what Congress intended when it said we are banning bulk collection,” said Jake Laperruque, deputy director of the Security and Surveillance Project at the Center for Democracy and Technology. “It wasn’t, you know, ‘do bulk collection, but also pay taxpayer money for it.’ It was ‘don’t do bulk collection.’”

Privacy advocates like Laperruque also believe they have Supreme Court precedent on their side. In a 2018 case known as Carpenter v. United States, the court ruled that law enforcement needs a warrant to obtain a person’s historical cell phone location data from cell phone towers.

Laperruque said the idea that law enforcement can purchase information from data brokers they would normally need a warrant for doesn’t make sense, particularly since he said it is often possible to identify individuals from supposedly anonymized data from brokers.

“We certainly wouldn’t imagine a scenario where the police said, ‘We’re going to search your house. We don’t have a warrant, but we paid your landlord $100 to give us a spare key. So now we’re searching your house without a warrant,’” Laperruque said.

Davidson said the fact that data brokers can sell identifiable information highlights that Congress needs to deal with a broader privacy law to protect Americans’ data. “But in the meantime, you know, governments are buying their way around the Fourth Amendment and we need to close that off.”

He added that this issue is exacerbated further by artificial intelligence, which “can harvest and collect the data in a way that humans never could and do it amazingly fast.”

The recent falling out between Anthropic and the Department of Defense has further highlighted the potency of combining AI with powerful records purchased from data brokers, said Laperruque.

“What kind of new Pandora’s box do we open when we not only have these huge quantities of data, but we have tools that can start to scan and analyze patterns in unprecedented ways and at an unprecedented scale that you can never do from human analysts?” he said.

Feature image credit: Mandel Ngan/AFP via Getty Images

Sourced from NPR

By James Vincent

Anthropic has expanded the context window of its chatbot Claude to 75,000 words — a big improvement on current models. Anthropic says it can process a whole novel in less than a minute.

An often overlooked limitation for chatbots is memory. While it’s true that the AI language models that power these systems are trained on terabytes of text, the amount these systems can process when in use — that is, the combination of input text and output, also known as their “context window” — is limited. For ChatGPT it’s around 3,000 words. There are ways to work around this, but it’s still not a huge amount of information to play with.

Now, AI startup Anthropic (founded by former OpenAI engineers) has hugely expanded the context window of its own chatbot Claude, pushing it to around 75,000 words. As the company points out in a blog post, that’s enough to process the entirety of The Great Gatsby in one go. In fact, the company tested the system by doing just this — editing a single sentence in the novel and asking Claude to spot the change. It did so in 22 seconds.

You may have noticed my imprecision in describing the length of these context windows. That’s because AI language models measure information not by number of characters or words but in tokens, a semantic unit that doesn’t map precisely onto these familiar quantities. It makes sense when you think about it. After all, words can be long or short, and their length does not necessarily correspond to their complexity of meaning. (The longest definitions in the dictionary are often for the shortest words.) The use of “tokens” reflects this truth, and so, to be more precise: Claude’s context window can now process 100,000 tokens, up from 9,000 before. By comparison, OpenAI’s GPT-4 processes around 8,000 tokens (that’s not the standard model available in ChatGPT; you have to pay for access), while a limited-release full-fat model of GPT-4 can handle up to 32,000 tokens.
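To put those token counts back into familiar terms, here is a rough sketch of the conversion. It assumes the ratio implied by the figures above (100,000 tokens for roughly 75,000 words, or about 0.75 words per token). That ratio is only an approximation; it varies with the tokenizer and the text. The names `WORDS_PER_TOKEN`, `approx_words` and `CONTEXT_WINDOWS` are made up for illustration.

```python
# Assumption: 100,000 tokens is treated as roughly 75,000 words,
# i.e. about 0.75 words per token. Real ratios vary by tokenizer and text.
WORDS_PER_TOKEN = 75_000 / 100_000


def approx_words(tokens: int) -> int:
    """Estimate how many words fit in a context window given in tokens."""
    return round(tokens * WORDS_PER_TOKEN)


# Context window sizes, in tokens, as quoted in the article.
CONTEXT_WINDOWS = {
    "Claude (before)": 9_000,
    "GPT-4 (standard)": 8_000,
    "GPT-4 (limited release)": 32_000,
    "Claude (expanded)": 100_000,
}

if __name__ == "__main__":
    for name, tokens in CONTEXT_WINDOWS.items():
        print(f"{name}: {tokens:,} tokens is about {approx_words(tokens):,} words")
```

By this estimate, the jump from 9,000 to 100,000 tokens takes Claude from under 7,000 words to around 75,000.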

Right now, Claude’s new capacity is only available to Anthropic’s business partners, who are tapping into the chatbot via the company’s API. The pricing is also unknown, but is certain to be a significant bump. Processing more text means spending more on compute.

But the news shows AI language models’ capacity to process information is increasing, and this will certainly make these systems more useful. As Anthropic notes, it takes a human around five hours to read 75,000 words of text, but with its expanded context window, Claude can potentially take on the task of reading, summarizing and analyzing long documents in a matter of minutes. (Though it doesn’t do anything about chatbots’ persistent tendency to make information up.) A bigger context window also means the system is able to hold longer conversations. One factor in chatbots going off the rails is that when their context window fills up they forget what’s been said, which is why Bing’s chatbot is limited to 20 turns of conversation. More context equals more conversation.

Feature Image Credit: Anthropic

James Vincent is a senior reporter who has covered AI, robotics, and more for eight years at The Verge.

Sourced from The Verge