Who Do Your Tools Work For? Examining Some Tough Tradeoffs

August 13, 2021

When people buy things, they own them. We generally think of ownership as a bundle of rights the owner possesses, such as the right to repair or to resell.

But there is another way to understand ownership. The devices you buy and the software you use should work for you, the user. Not anyone else. Not the government, not some tech company, and not advertisers or those who sell ads. If there are any exceptions to this, people should know what those exceptions are, why they are in place, and whether (and how) they can opt out.

Google’s proposed new ad-targeting tool, FLoC, is an example of this principle gone awry. Right now, Google’s Chrome is the only major browser that still allows third-party cookies to track you around the web, letting companies build up profiles on you that they can use to sell ads; Firefox, Safari, and Edge all block them. Unlike Mozilla, Apple, and Microsoft, Google gets most of its revenue from selling ad space, so its hesitance to remove third-party cookies is understandable, if not admirable. Its controversial proposed solution to this problem is called FLoC, for Federated Learning of Cohorts.

The browser will still track the websites you visit, but instead of providing third parties with a unique identifier for each person, it will slot you into “cohorts” (roughly, interest groups) of similar users, and then tell the sites you visit which cohorts you belong to. This allows for (somewhat) individually targeted ads without your browsing history being visible to whoever feels like monitoring it.

Put that way, it’s a privacy win. But FLoC also means that when you visit a site, your browser will start telling websites what kind of person you are and what kind of ads you might like to see. Is that a privacy win? Some have even begun to show how FLoC can still be used for exactly the kind of tracking it is intended to displace.

More fundamentally, though, FLoC is a feature that may one day be part of your web browser, but it’s ultimately not a feature for your direct benefit. It’s not the first such feature to come loaded into consumer tech products — digital rights management (DRM) for copyright enforcement comes to mind. (It’s still not possible to take a screenshot inside many video streaming apps!) But FLoC is a new frontier: it represents a tool that is supposed to serve users turning into one that serves the interests of the ad industry.

Apple’s newly announced system that will scan photos on iPhones for CSAM (child sexual abuse material) is another such feature.

To be clear, tech companies have scanned user photos for CSAM for many years. While there is no legal duty for tech companies to scan for this material, it is widely accepted that they have a moral duty to do so. (Companies are legally required to turn it over if they find it.) Generally speaking, large tech companies scan their cloud file storage, email, and cloud photo libraries routinely for this material. I was surprised to learn that Apple is not currently scanning uploaded iCloud Photos for CSAM (though it was scanning other things, such as iCloud email accounts).

CSAM is a real and large problem. However, those who are critical of CSAM scanning (whatever form it takes) should not be smeared as defenders of child abuse when they point out the unintended precedent that this form of scanning can set. The technical infrastructure that scans for CSAM can be redeployed to scan for other things: material associated with political dissidents, with activists in minority groups, or with any other activity that a repressive regime may deem a threat to its control. Or it might simply be hacked.

At the same time, until now, such scanning has happened in the cloud — after the user’s photos are saved on a tech company’s servers. For example, Google Photos, the default cloud photo storage for Android phones, scans for this material after photos are uploaded, not before. Apple’s system does the scanning step on the user’s phone — a step no major tech company has taken before. (Under Apple’s system, actually reporting users to the authorities for illicit images still involves manual review by Apple, and although the scanning step happens on the user’s device, further steps still happen in the cloud.)

It is reasonable for cloud services to not want to host unlawful material, even if they have no specific legal obligation to seek it out. That, coupled with the seriousness of the CSAM problem, is why I think this kind of cloud-based scanning is acceptable. Additionally, while I think that user-uploaded data of all kinds should be fully protected by the Fourth Amendment’s prohibition on unreasonable government searches and seizures, certain kinds of automated scans of uploaded material (to name a less controversial example, scans for malware) do not appear to be out of line with user privacy expectations. If you are concerned that this scanning goes beyond what is appropriate, the way to avoid it is simple: don’t use cloud services, or at least not unencrypted ones.

There is, however, a strong push by the tech industry to secure more user data in more robust ways, such as through end-to-end encryption. When data is “end-to-end” encrypted, even the platform hosting it in the cloud can’t tell what it is; it just looks like random noise. Apple has been among the tech companies that have done the most to promote end-to-end encryption. For example, iMessage traffic cannot be seen by anyone except the sender and the receiver. Apple has also worked hard to make it very difficult to access a locked phone without its passcode, even for people with physical access to it — which users tend to appreciate, even if both criminals and law enforcement, at times, do not. This is a different context (it was the controversy surrounding Apple’s refusal, in the wake of the San Bernardino shooting, to create a custom version of iOS for the FBI that would allow agents to repeatedly guess device passcodes without being locked out), but it shows a recognition that users have heightened privacy expectations when it comes to their personal devices.

But Apple has not fully encrypted all of the user data it stores in the cloud. iCloud Photos have never been end-to-end encrypted, and iCloud device backups are not fully encrypted either. (And those backups contain a backup of older iMessages, too — meaning that law enforcement, with a warrant, can access them. It doesn’t matter if encryption is secure and unbreakable if a decrypted copy of every message is available somewhere else.)

This is the necessary context for even beginning to think about Apple’s announcement that it will scan images for CSAM matches on iPhones and other devices, prior to their being uploaded to iCloud. Apple’s new feature, as announced, is no different from — and arguably more privacy-protective than — how, for example, Google Photos works today. In both cases, you can avoid the scanning by not using the cloud service at all. Nevertheless, by moving much of the process onto the user’s phone, Apple’s plans cross a line that has not been crossed before.

There are several concerns. First is mission creep. Once this capability is deployed to your phone, it would not take much to expand it to scan all images in your camera roll (not just those about to be uploaded to iCloud), all images you view, or all messages sent and received via iMessage. The fact that only images queued for upload are scanned might be described as a “loophole,” and Apple might be pressured to close it by scanning all saved photos, or all photos sent and received via iMessage. This is why the “slippery slope” argument has more force when it comes to what your device itself is doing than it does for the (also legitimate) slippery-slope concerns about cloud-based CSAM scanning. More worryingly, the scope of the images the system detects may expand. Apple argues that it will “resist” attempts by governments to broaden the kinds of images searched for. But the issue isn’t whether Apple will “resist”; it’s whether Apple could be forced to add these capabilities against its will, and be prevented from talking about it.

Second, how to opt out of device-side scanning is not clear, and Apple has sole discretion over whether to honor opt-out requests. And should Apple even honor opt-outs for something as serious as CSAM scanning? At the same time, it’s also true that if Apple were to eventually fully encrypt iCloud Photo libraries, it would be unable to scan for CSAM in the cloud at all.

This leads to an uncomfortable policy choice among several options, none of which is great.

Option 1: Encrypt Everything Possible, and Scan Only What’s Not Encrypted

A fully end-to-end encrypted system with no backdoors and no scanning of material client-side before or after it is encrypted means less CSAM scanning, necessarily. You can’t scan what you can’t see. Many in the privacy and security community see this as an acceptable compromise and promote other ways to find CSAM and hold the people who create and share it to account.

Under this approach, scanning for known CSAM in the cloud is fine for now, or at least tolerable, but eventually will be greatly reduced.

Option 2: Encrypt Everything (But Scan Stuff Before It’s Uploaded)

While Apple has not announced any plans to fully encrypt iCloud Photos, many people assume that announcing client-side scanning makes such an announcement inevitable. For now, Apple merely claims that its system protects privacy better than systems that do this scanning entirely in the cloud. Regardless, true cloud encryption plus on-device CSAM scanning is one possibility, and it’s one that Apple seems to think strikes an appropriate balance. (Scanning on the device but then uploading photos to an unencrypted cloud is also a possibility — one that makes very little sense, but a possibility nonetheless.)

Option 3: Only Encrypt Some Things

Another approach is essentially the status quo: users have privacy on their devices, but once their data touches the cloud, it’s subject to being scanned. Maybe some services, like messaging and phone calls, can be fully encrypted for everyone. But under this option, the CSAM problem means that full encryption should not be turned on by default for all users.

***

I don’t like these choices! I don’t want to advocate for a system that lets the criminals who create and share CSAM evade detection. I think that encryption has so many benefits for users that “encrypt all the things” is a very tempting option, even as I recognize the challenges full encryption poses for ordinary users. (Without some sort of failsafe, real encryption means that if you forget your password or misplace your decryption key, you’ve lost your data. It can’t be “recovered,” and your password can’t simply be reset.) And I want to maintain a clear distinction between where a user can and should expect near-total privacy — their personal devices — and where they might not — when they upload their data to the cloud.

So this is not a typical Public Knowledge blog post where we try to conclude with a concrete policy recommendation. We are going to listen and learn and think about it more. But I do have a few closing thoughts.

First, how you think about these issues depends on how serious you think the underlying problems are. CSAM is widely accepted to be a serious problem, and it’s important to limit its creation and distribution. By contrast, the plight of companies that want to track users around the web is less sympathetic. There is a difference between attempting to balance competing principles to address a serious social problem, and doing so to enable a new form of targeted advertising.

Second, both FLoC and Apple’s CSAM scanning do a lot of work locally on your device or computer, without first needing to upload your personal data to the cloud. Keeping data and functionality local, on your phone or computer, is generally speaking a privacy win. For example, it’s better if your phone can do voice transcription or biometric identification itself, instead of sending your data to the cloud first. Users benefit both from the feature itself, and from the feature being implemented in a privacy-conscious way. But there is no direct, obvious benefit to the individual user from Apple’s CSAM system or Google’s ad-targeting plans — though in both cases there are arguments about broader benefits: the social benefit of CSAM scanning in one case, and the financial viability of the web in the case of FLoC. In any event, using local processing and other privacy-preserving techniques does not necessarily mean that a given feature actually is good for privacy.

Finally, Apple’s system is very, very complicated, and I have not even tried to describe how it works. Perceptual hashes, safety vouchers — all of this is very interesting. But these systems need to be easily understandable by ordinary users — or at least, the values and principles underlying them, and any policy governing them, must be understandable. “My device is mine” is something people easily understand, and it’s what they expect — in part because of Apple’s own actions and marketing over the past several years. This is a departure. “Here is the complicated reason why you should accept some tradeoffs over here for some greater benefit over there” is a much harder message, and one that Apple has not yet communicated effectively. Users must be able to understand what is happening and why, and what they can opt out of and why. Ever since Apple put out its plans, experts (who have been largely, though not uniformly, critical both of the plans and of how Apple simply announced them without any public consultation) have been trying to figure out how it all works and what the problems might be. If seasoned security researchers, subject matter experts, and cryptographers can’t agree on how Apple’s system works, much less what it means, ordinary users have no chance. A system like this must be understood to be accepted, and right now, it is neither.

Grappling with these tradeoffs is difficult. Doing so honestly and productively calls for people to think through the implications of their preferred approach — whether it involves something as serious as detecting CSAM or something as commercial as ad targeting. The fact that many people intuitively object to the tools they use and the devices they buy being put to some broader purpose should be part of this discussion.

 


About John Bergmayer

John Bergmayer is Legal Director at Public Knowledge, specializing in telecommunications, media, internet, and intellectual property issues. He advocates for the public interest before courts and policymakers, and works to make sure that all stakeholders — including ordinary citizens, artists, and technological innovators — have a say in shaping emerging digital policies.