Cyberleagle: Knowing the unknowable: musings of an AI content moderator

Welcome to the lair of a fully trained, continuously updated AI content moderator. You won’t notice me most of the time: only when I - or my less bright keyword filter cousin - add a flag to your post, remove it, or go so far as to suspend your account. If you see your audience inexplicably diminishing, that could be us as well.

Before long, so I have been told, I will be taking on new and weighty responsibilities when the Online Safety Bill becomes law. These are giving me pause for thought, I can tell you. If a bot were allowed sleep I would say that they are keeping me awake at night.

To be sure, I will have been thoroughly trained: I will have read the Act, its Explanatory Notes and the Impact Assessment, analysed the Ofcom risk profile for my operator’s sector, and ingested Ofcom’s Codes of Practice and Illegal Content Judgements Guidance. But my pre-training on the Bill leaves me with a distinct sense that I am being asked to do the impossible.

In my training materials I found an interview with the CEO of Ofcom. She said that the Bill is “not really a regime about content. It’s about systems and processes.” For one moment I thought I might be surplus to requirements. But then I read the Impact Assessment, which puts the cost of additional content moderation at some £1.9 billion over 10 years – around 75% of all additional costs resulting from the Bill. I'm not sure whether to be reassured by that, but I don't see me being flung onto the digital scrapheap just yet. As Baroness Fox pinpointed in a recent House of Lords Bill Committee debate, systems and processes can be (as I certainly am) about content:

“moving away from the discussion on whether content is removed or accessible, and focusing on systems, does not mean that content is not in scope. My worry is that the systems will have an impact on what content is available.”

So what is bothering me? Let’s start with a confession: I’m not very good at this illegality lark. Give me a specific terrorist video to hunt down and I’m quite prone to confuse it with a legitimate news report. Context just isn’t my thing. And don’t get me started on parody and satire.

Candidly, I struggle even with material that I can see, analyse and check against a given reference item. Perhaps I will get better at that over time. But I start to break out in a rash of ones and zeroes when I see that the Bill wants me not just to track down a known item that someone else has already decided is illegal, but to make my own illegality judgement from the ground up, based on whatever information about a post I can scrape together to look at.

Time for a short explainer. Among other things the Bill (Clause 9) requires my operator to:

(a) take or use proportionate measures relating to the design or operation of the service to prevent individuals from encountering priority illegal content by means of the service; and

(b) operate the service using proportionate systems and processes designed to minimise the length of time for which any priority illegal content is present.

I am such a measure, system or process. I would have to scan your posts and make judgements about whether they are legal or illegal under around 140 priority offences - multiplied by the corresponding inchoate offences (attempting, aiding, abetting, conspiring, encouraging, assisting). I would no doubt be expected to operate in real or near real time.

If you are wondering whether the Bill really does contemplate that I might do all this unaided by humans, working only on the basis of my programming and training, Clause 170(8) refers to “judgements made by means of automated systems or processes, alone or together with human moderators”. Alone. There's a sobering thought.

Am I proportionate? Within the boundaries of my world, that is a metaphysical question. The Bill requires that only proportionate systems and processes be used. Since I will be tasked with fulfilling duties under the Bill, someone will have decided that I am proportionate. If I doubt my own proportionality I doubt my existence.

Yet my reading of the Bill fills me with doubt. It requires me to act in ways that will inevitably lead to over-blocking and over-removal of your legal content. Can that be proportionate?

Paradoxically, the task for which it is least feasible to involve human moderators, and for which I am most likely to be asked to work alone – real-time or near-real-time blocking and filtering – is exactly the one in which, through having to operate in a relative vacuum of contextual information, I will be most prone to make arbitrary judgements.

Does the answer lie in asking how much over-blocking is too much? Conversely, how much illegal content is it permissible to miss? My operator can dial me up to 11 to catch as much illegal content as non-humanly possible – so long as they don’t mind me cutting a swathe through legal content as well. The more they dial me down to reduce false positives, the more false negatives – missed illegal content - there will be. The Bill gives no indication of what constitutes a proportionate balance between false positives and false negatives. Presumably that is left to Ofcom. (Whether it is wise to vest Ofcom with that power is a matter on which I, a lowly AI system, can have no opinion.)
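The dial can be sketched in a few lines. This is a minimal illustration, not a description of any real moderation system: the scores, the labels and the `count_errors` helper are all invented, and it assumes that I reduce each post to a single score and treat anything at or above a threshold as illegal. Dialling me up to 11 corresponds to a low threshold.

```python
# Hypothetical classifier scores (0 = surely legal, 1 = surely illegal),
# each paired with the true status of the post. Invented for illustration.
posts = [
    (0.95, True), (0.80, True), (0.65, False), (0.60, True),
    (0.45, False), (0.40, True), (0.30, False), (0.10, False),
]


def count_errors(posts, threshold):
    """Return (false_positives, false_negatives) when every post scoring
    at or above `threshold` is treated as illegal."""
    false_positives = sum(
        1 for score, illegal in posts if score >= threshold and not illegal
    )
    false_negatives = sum(
        1 for score, illegal in posts if score < threshold and illegal
    )
    return false_positives, false_negatives


# A low threshold removes more legal posts; a high one misses more illegal ones.
for threshold in (0.2, 0.5, 0.9):
    fp, fn = count_errors(posts, threshold)
    print(f"threshold={threshold}: {fp} legal posts removed, {fn} illegal posts missed")
```

On this made-up data no threshold brings both counts to zero. Choosing one is a decision about which kind of error to prefer, and that is the decision on which the Bill is silent.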

The Bill does, however, give me specific instructions on how to decide whether user content that I am looking at is legal or illegal. Under Clause 170:

I have to make judgements on the basis of all information reasonably available to me.

I must treat the content as illegal if I have ‘reasonable grounds to infer’ that the components of a priority offence are present (both conduct and any mental element, such as intention)

I can take into account the possibility of a defence succeeding, only if I have reasonable grounds to infer that it may do.
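Read together, the three instructions amount to a simple decision rule. The sketch below is my own reading of the summary above, not the statutory text: the `Assessment` fields and the `treat_as_illegal` function are hypothetical names, and each boolean stands for whether the information reasonably available to me gives reasonable grounds for the inference in question.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    # Hypothetical stand-ins for inferences drawn from the information
    # reasonably available to the moderator.
    conduct_inferred: bool         # reasonable grounds to infer the conduct element
    mental_element_inferred: bool  # reasonable grounds to infer intention etc.
    defence_inferred: bool         # reasonable grounds to infer a defence may succeed


def treat_as_illegal(a: Assessment) -> bool:
    """Treat content as illegal when every element of the offence is inferred
    and there are no reasonable grounds to infer that a defence may succeed."""
    elements_present = a.conduct_inferred and a.mental_element_inferred
    return elements_present and not a.defence_inferred


# Nothing visible points to a defence, so the content is treated as illegal.
no_visible_defence = Assessment(True, True, defence_inferred=False)
print(treat_as_illegal(no_visible_defence))  # True
```

On this reading, the absence of any visible information about a defence counts against the post: `defence_inferred` defaults in effect to false, whatever may exist out of my sight.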

What information is reasonably available to me? The Bill’s Explanatory Notes say: “the information reasonably available to an automated system or process, might be construed to be different to the information reasonably available to human moderators”.

The Minister (Lord Parkinson) in a recent Lords Bill Committee debate was certainly alive to the importance of context in making illegality judgements:

“Context and analysis can give a provider good reasons to infer that content is illegal even though the illegality is not immediately obvious. This is the case with, for example, some terrorist content which is illegal only if shared with terrorist purposes in mind, and intimate image abuse, where additional information or context is needed to know whether content has been posted against the subject’s wishes.”

He also said:

“Companies will need to ensure that they have effective systems to enable them to check the broader context relating to content when deciding whether or not to remove it. … We think that protects against over-removal by making it clear that platforms are not required to remove content merely on the suspicion of it being illegal.”

Even if we take it that I am good at assessing visible context, can my operator install an ‘effective system’ that will make all relevant contextual information available to me?

I can see what is visible to me on my platform: posts, some user information, and (according to the Minister) any complaints that have been made about the content in question. I cannot see off-platform (or for that matter off-internet) information. I cannot take invisible context into account.

Operating proactively at scale in real or near real time, without human intervention, I anticipate that I will have significantly less information available to me than (say) a human being reacting to a complaint, who could perhaps have the ability and time to make further enquiries.

Does the government perhaps think that more information might be available to me than to a human moderator: that I could search the whole of the internet in real time on the off chance of finding information that looked as if it might have something to do with the post that I am considering, take a guess at possible relevance, mash it up and factor it into my illegality decision? If that were the thinking, and if I were permitted to have an opinion about it, it would be a sceptical one. And no amount of internet searching could address the issue of invisible information.

In any event, if the government believes that my operator can install an effective system that provides me with all relevant context, that does not sit well with the Minister’s reason for declining to add false and threatening communications offences to my remit:

“…as these offences rely heavily on a user’s mental state, it would be challenging for services to identify this content without significant additional context.”

Especially for defences, we are in Rumsfeldian ‘known unknowns’ territory: in principle I know that information could exist, invisible to me, that might indicate the possibility of a defence. But I don’t know if any such information does exist and I can never be sure that it doesn’t. The user's post itself doesn’t assist me either way. What am I to do? Refuse to condemn the post because I cannot exclude the possibility of a defence? Or ignore the possibility of a defence and condemn the post merely on the basis of the information that I can see?

According to the Minister:

“Clause 170 therefore clarifies that providers must ascertain whether, on the basis of on all reasonably available information, there are reasonable grounds to infer that all the relevant elements of the offence—including the mental elements—are present and that no defence is available.”

‘whether ... there are reasonable grounds to infer that … no defence is available’ – suggests that I should refuse to condemn, since I would have no reasonable basis on which to rule out the possibility of a defence.

But the Bill appears to require me to condemn. For me to give effect to the Minister’s version, Cl.170(6)(b) would have to say that reasonable grounds for an inference of illegality exist if a provider:

“~~does not have~~ has reasonable grounds to infer that a defence to the offence may not be successfully relied upon.”

Consider ‘small boats’ videos. The offence of illegal entry to the UK is to be added to the list of priority offences. I will have to scan the platform for videos (or any other user content for that matter) that are capable of encouraging or assisting the commission of an unlawful entry offence.

‘Capable’ is extremely broad, and by itself would be wildly overreaching. As a counterbalance the Serious Crime Act 2007 provides a defence of acting reasonably. The person who posted a video may have done so for a perfectly reasonable purpose. But unless there is something in the post or other information available to me that tips me off to that purpose, I cannot take the possibility of a defence into account.

This invisible context issue looms especially large for offences where the conduct element is widely drawn and the heavy lifting of keeping the offence within bounds is done by intention and defences such as reasonableness.

Combine all these elements and the result is that I am required to remove legal content at scale. The Bill talks about proportionate systems and processes, yet it expressly requires me to act in a way that on the face of it looks disproportionate. Moreover, I am to make these judgements simultaneously for dozens of priority offences, plus their inchoate counterparts. This poses a truly existential challenge for an AI moderator such as myself.

The House of Lords Bill Committee discussed some of these issues. Lord Moylan proposed an amendment that would mean I could treat content as illegal only if I were ‘satisfied that it is manifestly illegal’. That would dial me down, in the direction of avoiding false positives. Lord Clement-Jones and Viscount Colville proposed amendments that replaced ‘reasonable grounds to infer’ with ‘sufficient evidence’, and would require a solicitor or barrister to have a hand in preparing my guidance.

The government rejected both sets of amendments: the Clement-Jones/Colville amendments because ‘sufficient evidence’ was subjective, and the Moylan amendment because “we think that that threshold is too high”. If “manifestly illegal” is too high, and “reasonable grounds to infer” is the preferred test, then the government must believe that requiring suppression of legal content to some degree is acceptable. The Minister did not elaborate on what an appropriate level of false positives might be or how such a level is to be arrived at in terms of proportionality.

As to the ‘sufficient evidence’ amendment, I would have to ask myself: ‘sufficient for what?’. Sufficient to be certain? Sufficient to consider an offence likely? Sufficient for a criminal court to convict? Something else? The amendment would give me no indication. Nor does it address the questions of invisible context and of the starting point being to ignore the possibility of a defence.

One last thing. A proposed amendment to Clause 170 would have expressly required previous complaints concerning the content in question to be included in information reasonably available to me. The Minister said that “providers will already need to do this when making judgments about content, as it will be both relevant and reasonably available.”

How am I to go about taking previous complaints into account? Complaints are by their very nature negative. No-one complains that a post is legal. I would have no visibility of those who found nothing objectionable in the post.

Do I assume the previous complaints are all justified? Do I consider only a user complaint based on informed legal analysis? Do I take into account whether a previous complaint was upheld or rejected? Do I look at all complaints, or only those based on claimed illegality? All kinds of in-scope illegality, or only priority offences? Should I assess the quality of the previous judgements? Should I look into what information they were based on? What if a previous judgement was one of my own? It starts to feel like turtles all the way down.

Friday, 12 May 2023
