The Online Safety Act Network (OSAN) recently published a 10-point plan to amend the Online Safety Act. The plan includes:
“Insert a definition of safety by
design into the Act to make clear to Ofcom and services what Parliament
intended”.
From a technical drafting perspective, clarification might be
welcome. The Act says that it “seeks to secure that regulated services are safe
by design”. That was added at the last minute, in a clause describing the
overall purpose of the legislation but lacking any definition of ‘safe’
or ‘safe by design’. I discussed here the undoubted difficulties in interpreting what is
now Section 1 of the Act.
Of course, before we can craft a definition of safety by
design we have to know what it is intended to mean. For myself, I have always regarded
much of the theory underlying safety by design as fragile, at least within the
context of the Online Safety Act. But putting those doubts on one side, I did
think that I had a reasonable idea of what safety by design was intended to be
about.
Now I am not so sure.
To recap, this is how I thought safety by design was meant
to apply to regulation of online platforms:
- Safety by design requires safety to be considered at the design stage, not as an afterthought.
- Safety by design should then be applied iteratively via periodic risk assessments, incorporating feedback learned during operation of the service.
- Safety by design focuses on platform systems and processes, identifying and addressing those that create or exacerbate a risk of harm (however defined).
- Safety by design is not, or at least not primarily, about systems for content moderation.
- Safety by design favours non-content-specific, systems-focused measures.
- Safety by design is not about automated content detection and filtering.
Safety by design proponents have long criticised the Online
Safety Act for being too content-focused. Rather than more and better content
moderation, platforms should have to design safety into their systems and
processes from the outset. This, so the theory goes, would result in less harm
(however that might be conceptualised) occurring on platforms and less need for
ex-post content moderation.
There have been variations on these themes: for instance,
that a systems focus can include friction measures targeted at specific kinds
of content, but which stop short of requiring removal; or that measures can
focus on harm arising from certain kinds of content without focusing specifically
on the content itself.
Nevertheless, as a general proposition I understood safety by design (a.k.a. ‘systems and processes’) to be about addressing from the outset the design of risk-creating system features, combined with a preference for non-content-related or content-agnostic measures over content-specific measures.
If that is right, safety by design has two elements. It articulates
a general approach to safety but is also exclusionary: systems for content
moderation (or at least automated filtering systems) are not a safety by design
measure. The UK government, it should be said, has taken the opposite view. It regards
automated content filtering as a safety by design measure. That is the most
obvious difference of view that, if I am right in my understanding of safety by design, would have to be resolved in crafting a statutory definition.
To the extent that safety by design proponents embrace systems
for content moderation, that has tended to be as a fall-back where safety
by design measures have not squeezed harm out of the system.
Thus Professor Lorna Woods’ October 2024 paper for OSAN, Safety
by Design, although allowing for ex post measures
in a residual role, differentiated them from a primary focus on design
choices:
“At the moment, content
moderation seems to be in tension with the design features that are influencing
the creation of the content in the first place, making moderation a harder job.
So a ‘by design’ approach is a necessary precondition for ensuring that other ex
post responses have a chance of success.
While a “by design” approach is
important, it is not sufficient on its own; there will be a need to keep
reviewing design choices and updating them, as well as perhaps considering ex
post measures to deal with residual issues that cannot be designed out,
even if the incidence of such issues has been reduced.”
She distinguished safety by design from techno-solutionism:
“Designing for safety (or some
other societal value) does not equate to techno-solutionism (or
techno-optimism); the reliance on a “magic box” to solve society’s woes or
provide a quick fix. Rather, what it acknowledges is that each technology may
have weaknesses and disadvantages, as well as benefits. Further, the design may
embody the social values and interests of its creators. A product (or some
of its features) may be part of the problem. The objective of “safety by
design” is – like product safety – to reduce the tendency of a given feature or
service to create or exacerbate such issues.” (emphasis added)
One might think that automated content filtering is the
paradigm example of regulatory techno-solutionism. Indeed, as to the Online
Safety Act itself, Professor Woods noted its emphasis on systems for content
moderation:
“What is rather more explicit in
the [Online Safety Act] safety duties is the focus on filtering and moderation,
which may have a design element (i.e. the tools are made available within the
system and designed to work with the system) but seem more ex post in the way
they work.”
Elsewhere Professor Woods has included reactive content
take-down systems within safety by design, but as the “last port of call”. (Introducing
the Systems Approach and the Statutory Duty of Care (chapter in Perspectives
on Platform Regulation, Nomos, 2021).)
We can find other examples of safety by design proponents
expressing concern about the Online Safety Act’s focus on systems for content
moderation.
Carnegie UK was OSAN’s online safety policy predecessor and,
through the work of Professor Woods and William Perrin, was the originator of
the proposal for a statutory duty of care. Carnegie UK said in its June 2019 submission to the Online Harms White Paper consultation:
“Worryingly, there are references
to proactive action in relation to a number of forms of content (and not just
the very severe child sexual abuse and exploitation and terrorist content)
which in the light of the emphasis in the codes could be taken to mean a
requirement for upload filtering and general monitoring to support that.”
Demos’ submission to the draft Online Safety Bill Committee
in September 2021 identified as a primary risk:
“A focus on regulation and
moderation of content rather than platform systems which affect the risk
of harm arising from that content” (emphasis in original)
and said:
“Although the Bill sets out a systems-based approach, there is a focus on reducing harm through content takedown measures, measuring the incidence of harms online and a focus on enforcing terms and conditions. ... we are concerned that in implementation this will turn into a ‘content-based approach’ by proxy, by prioritising the regulation of content moderation systems above other systems and design changes.”
Demos' April 2022 position paper on the Online Safety Bill argued that:
“The Bill treats a ‘systems’
approach as meaning a ‘systems for dealing with content’ approach…”
The Demos position paper also expressed particular concern about the “strong risk
of infringing on either privacy or freedom of expression” in Ofcom’s ability to
require use of proactive content moderation technology.
The 5Rights Foundation’s response to Ofcom’s final Illegal Harms
Code of Practice in December 2024 said:
“The legislation has a clear
objective that services are made “safe by design” but the majority of Ofcom’s
proposed measures are not designed to prevent harm occurring in the first place
– instead focusing on content moderation and reporting tools. While greater
requirements on governance and accountability are welcome, this in itself will
not ensure safety by design.”
If content-focused measures, or at least automated filtering, are not a variety of safety by design, then a definition of safety by design for insertion in the Act could be expected to exclude measures of that kind; albeit how it could do so, when the Act specifically contemplates the imposition of automated content detection and filtering, is a conundrum.
But since the
government officially regards automated content filtering as a safety by design
measure, it would seem highly unlikely that a definition contradicting that could
find its way into the Act.
Ofcom’s Online Safety Act implementation
With the closing of Ofcom’s Summer 2025 consultation on additional safety measures, we can assess how far the Code of Practice measures
recommended or proposed by Ofcom to date are – and are not – focused on systems
for content moderation.
The consultation in fact provides a dual opportunity: to
analyse Ofcom’s existing and proposed measures from a safety by design
perspective, and to look at how safety by design proponents have reacted to Ofcom’s
newest proposals for automated content filtering (in Ofcom terminology, ‘proactive
technologies’).
Non-content, reactive content-related and proactive content-related
How far have non-content safety by design principles found
expression in Ofcom’s implementation of the Online Safety Act?
Regardless of whether there is overlap between systems measures
and content moderation measures, we can still conceive of functionality-oriented measures that do not require the platform to make judgements about content, nor involve directly limiting dissemination of content at all. A friction measure
such as a warning ‘Did you mean to post without reading the linked article?’
would be an example of such a non-content measure.
Thus we can break down the measures so far recommended or
proposed by Ofcom into non-content and content-related. The latter can be
further divided into reactive and proactive.
In total, across the Illegal Content, Protection of Children
and draft Additional Measures Codes for U2U services, there are (on my
reckoning) 73 non-content measures, 27 reactive content-related measures and 12
proactive content-related measures. For illegal content, most of the proactive
measures are contained in the Additional Measures consultation and are based on
content detection and filtering technology of various kinds.
However, a closer look at the 73 non-content measures
reveals that 50 of them are administrative, procedural or information
provision: appointing an accountable individual, preparing various written
documents, training, complaints and appeals procedures, publishing user support
materials and so on. Whilst those are aspects of wider systems design,
non-content measures addressed to features and functionality are of more
immediate interest.
That leaves 23 non-content measures: 11 in the Illegal
Content codes, all of which relate to children (in two cases only partially),
and 12 in the Protection of Children codes.
Most of the 23 non-content measures concern technical
functionality of the platform. The measures are limited (as required by the
Act) to UK users and relate to:
- Implementing an age-assurance process (ICU B1, PCU B1)
- Use of highly effective age assurance (HEAA) (PCU B2 to B7) (Age assurance does of course indirectly affect the content available to users who are not verified as over-18, as the result of content-related measures predicated on age assurance.)
- Safety defaults for child users concerning connection lists, account recommendations and direct messaging (ICU F1)
- Removal of five kinds of functionality from child-user livestreams (ICU F3)
- Options for user account blocking, disabling comments (for child users, or in some circumstances all registered users) (ICU J1, ICU J2)
- Enabling children to give negative feedback on content recommender systems (PCU E3)
- Providing information to children, when they restrict content or interactions with other accounts, as to the effect of doing so and further options available (PCU F2)
- Options for user blocking and muting, disabling comments (users not determined to be adults by use of HEAA) (PCU J1, PCU J2)
- Positive consent to group chat invitations (users not determined to be adults by use of HEAA) (PCU J3)
These examples illustrate that non-content measures are
feasible, albeit some of those measures are, at least in part, precursors to
content-related measures. Most obviously, age assurance underpins not only some
of the non-content measures listed above, but also measures about content that
should be hidden from under-18s.
Generally, it is striking how many of Ofcom’s non-content
functionality measures are concerned with denying functionality to
under-18s, or restricting interactions with them.
As to content-based measures, the Additional Measures
consultation marks a decided shift towards automated content detection. Should
these be welcomed as a version of safety by design, deprecated as systems for
content moderation, or regarded as a means of addressing residual issues that
cannot be designed out?
Safety by design or ex-post?
OSAN’s cross-cutting response to Ofcom’s Additional Measures consultation takes issue with Ofcom’s
description of some content-related measures, including proactive technology,
as being ‘safety by design’:
“While some of the proposed
measures - including automated content moderation (para 1.51) and
livestreaming (p27) - are framed by Ofcom as being “safer by design”, these are
primarily about ex-post mitigations for harmful content (reporting content, or
relying on user action after harm has occurred) or introducing a form of
safety tech (proactive tech measures) rather than embedding safe design at the
level of systems and processes. There is still no understanding of what
good service redesign should look like to ensure a more holistic orientation
towards safety.” (emphasis added)
However, OSAN’s companion detailed response to
the Additional Measures Consultation characterises Ofcom’s proactive technology
proposals as safety by design:
“We broadly support the move
towards requiring proactive technology as a safety-by-design approach to user
safety”.
The detailed response (but not the cross-cutting response) would
therefore seem to endorse the government’s view of safety by design.
OSAN also suggested that Ofcom’s principles-based proactive
technology proposals could be extended to include intimate image abuse.
Recommender systems
The Demos Digital submission
endorsed Ofcom’s proposed content-specific approach to recommender systems:
“The Demos Digital team agrees
with Ofcom’s proposal to exclude illegal content from recommender systems until
the content has been reviewed by content moderation teams.”
After pointing out that “Automated content identification
tools are known to struggle with reliability and bias”, Demos Digital then
suggested improvements including:
“Because of these risks of
inconsistency, Ofcom should provide specific guidance for platforms’
responsible use of automated content identification tools, including:
transparency reporting; quality control standards for automated identification
systems, including bias, reliability and accuracy; impact assessments for
evaluating the automated systems; and model parameters for identifying illegal
content. We believe this would alleviate some of the risks of automated content
identification systems – such as inconsistencies, inaccuracies, and bias –
which could result in the over-exclusion of legal content, or under-exclusion
of illegal content.”
At the level of principle it is difficult to see how this reflects a systems-based approach, other than in the sense of systems for
moderating content.
Parenthetically, even if a tendency to bias could be
alleviated, there is still the insoluble problem that automated content
identification tools do not have access to off-platform contextual information
that can affect legality of the user content in question.
In its comments on recommender systems OSAN supports
limitations on the reach of “content that is harmful in nature”, if accompanied
by freedom of expression safeguards such as explanations of how the systems
work in practice, and notification of creators when their content is affected
so as to allow them to use complaints and appeals processes.
Live-streaming
For live-streaming, OSAN has suggested
some concrete ways in which Ofcom’s proposed Additional Measures could go
further: building in a delay to livestreaming and turning off livestreaming by
default for under-18s or under-16s. It describes these as safety by design
measures:
“15. Ofcom’s proposals focus on responding
to harm after it occurs and content moderation rather than preventing it in the
first place. There is no requirement for live-feed delays, which are standard
practice in traditional broadcasting, to prevent harmful or illegal content
from being aired in real time. Safety-by-design means including proactive
measures such as time-delay buffers and real-time risk assessment. There is
plenty of guidance available to broadcasters on this topic.” (emphasis added)
However, it then describes them as ex-post measures:
“17. More broadly, we would
recommend that Ofcom consider a greater array of ex-post features
- e.g. borrowing from broadcasting good practice and building more delay
into a live stream as a feature.” (emphasis added)
Is time delay an example of safety by design or an ex-post feature? The distinction would not necessarily matter much, were it not for the fact that a statutory definition of safety by design is proposed. But either way, although a time delay is of itself a non-content measure, its purpose is to enable the platform to make judgements about the content being live-streamed and (if thought necessary) to shut down the stream. OSAN describes that as real-time risk assessment. In the context of the Act, those would have to be judgements about illegality or (for child-accessible streams) content harmful to children.
For children, OSAN contemplates a non-content-related
measure: turning live-streaming off by default for children, whether under-16
or under-18. It also observes that “A strong understanding of safety-by-design
would mean that where livestreaming cannot be delivered safely it shouldn’t be
in place.”
Finally, OSAN cites Ofcom’s proposed limitation on
livestream screen capture and recording for under-18s (part of ICU F3) as an
example of friction.
Safety by design in context
As implementation of the Online Safety Act has progressed, it is perhaps not surprising if it has become less clear how safety by design should translate into concrete measures. The theory of online safety by design, founded on the notion of risk-creating features, was formulated in the context of a range of services and harms that differed greatly from those in scope of the Online Safety Act. The range of services within the Act is far broader and the kinds of harm are much more specific.
In July 2018 Woods and Perrin, working with Carnegie UK,
proposed a:
“Virtuous circle of harm reduction on social media. Repeat this cycle in perpetuity or until behaviours have fundamentally changed and harm is designed out.” (Harm Reduction in Social Media, 17 July 2018)
As to kinds of services, the proposal was aimed at around 10 social media companies each with over 1 million users. By January 2019, after discussion with various stakeholders, the authors had decided to extend the proposal to cover ‘social media and other internet platforms’ regardless of size. Now the Act covers an estimated 25,000 UK services (100,000 or more worldwide), 80% of which are micro-businesses (fewer than 10 employees).
On the face of it the underlying premise of the harm reduction cycle seems to be that what a user does on a platform is primarily the result of its design. However, the authors of the proposal say that their argument is not that we are 'pathetic dots' in the face of engineered determinism, but that the architecture of the platform nudges us towards certain behaviour (Woods and Perrin, Online harm reduction - a statutory duty of care and a regulator, April 2019.)
Even if it can be said that algorithmically driven social media platforms nudge us towards certain behaviour, how would that apply outside that specific milieu, for instance to plain vanilla discussion forums? And if, even on those large social media platforms, design only nudges rather than determines user behaviour, how far can harm really be designed out of the system?
As to kinds of harms, the safety by design theory is
premised on platforms being risk creators. We always then have to ask, risk of
what? In the context of the Online Safety Act that means connecting a given
feature to a created or exacerbated risk of one of the specific kinds of
criminality in scope of the Act, or of specific kinds of content harmful to
children.
Within the context of the Act, the theory has never been
easy to render into concrete expression:
- If the idea is that a user’s decision to post, say, an illegal offer to ferry illegal immigrants across the Channel is down to the design of the platform, that seems implausible.
- If the idea is that platform design can prevent such content being encountered, but without descending into content moderation and filtering, how is that to be done? Similarly if the concern is to prevent specific kinds of content being repeated or stimulated.
- If it means that recommender algorithms could be designed in ways that lessen the likelihood of their disseminating illegal content, it would have to be explained how that can be achieved without trespassing into content filtering.
- If the idea is that platform functionality can be designed to make it harder or slower to post, share or comment on user content generally, or to impose volume limits (a ‘circuit-breaker’), that would fit the theory. However, that kind of friction measure would necessarily strike against desirable and undesirable content alike, raising human rights proportionality issues.
- If the idea is that some functionalities should be banned, that would fit a version of the theory that holds that some functionalities cannot be designed safely. But the more general purpose the functionality in question, the greater the impact on legitimate content and the greater the human rights challenge.
- If the idea is that harm to children can be prevented by platform design which, for instance, reduces opportunities for adults to contact children, that would fit the theory.
If no connection can be found between a given technical or
business model feature of a platform and a risk of a user deciding to behave
illegally in a particular way, then the regulator will look somewhere other
than those design features to counter illegality: to other design features or,
failing that, to systems for moderation.
Professor Woods has suggested that designers should ask
themselves: ‘What happens when the bad people get hold of this feature?’ (Introducing
the Systems Approach and the Statutory Duty of Care, ibid.) However, that question
could be asked of any general purpose functionality, risk-creating or not. On
the face of it the question is about possible uses, not whether the feature in
question creates or exacerbates a risk of a particular illegal or harmful use. It
could be asked of the very act of providing a forum to which users can post. If
we are not careful, we rapidly fall into the trap of characterising speech as a
risk, not a fundamental right.
It is telling that Ofcom adopted that same approach in its statutory
Risk Register: rather than attempt to identify functionalities that inherently
create or exacerbate risk of illegality or content harmful to children, it sought
to identify features that are used by malefactors as well as by law-abiding
users: correlation rather than causation. That led it to list as risk factors general
purpose functionality such as the ability to create hyperlinks.
If safety by design turns out to be a poor fit with much of the Online Safety Act, it should be acknowledged that the originators of the safety
by design theory never wanted illegality to be the touchstone in the first
place. Professor Woods said:
“These categories of harm should
be identified by reference to their impact on the victim, not by reference to
whether the speech might be considered illegal or not.” (Introducing the
Systems Approach and the Statutory Duty of Care, ibid.)
That risks a leap from the frying pan (attributing risk of
illegal behaviour to a platform feature) into the fire (pursuing nebulous and
subjective kinds of harm). That aside, it would be no surprise if the theory turns
out not to map easily on to the Act. It is one thing to say that, for instance, chasing ‘Likes’ trains users to produce ‘response-creating content’ (Introducing the
Systems Approach and the Statutory Duty of Care, ibid). It is something else to show that a feature creates a risk of a user committing a specific criminal
offence.
It may not be fanciful to think that something has got lost along the way from the 10 or so large social media platforms that the Carnegie UK
authors had in mind for their original 2018 proposals, to the broad variety of
100,000 UK and overseas services in scope of the Online Safety Act. If, in essence, the theory was always really about large social media companies, their curation
and engagement algorithms and their data-driven business models, it would not
be a shock to find that it turns out to have little or no application
beyond that.
For platforms where user agency is the predominant factor,
and design decisions cannot realistically be regarded as likely to increase or
decrease the likelihood of illegality or relevant content harm, logic would suggest that issues that
cannot be designed out would most likely be at the forefront, not residual. A
fruitless quest for specific illegality- or harm-inducing features could then
easily result in a theoretical focus on systems and processes lapsing into
systems for content moderation, thence to proactive content filtering
technologies.
As to a statutory definition of safety by design, if systems for content moderation, including automated content filtering, are now to some extent embraced as an aspect of safety by design, it is difficult to see how a corresponding statutory definition could place meaningful limits on the kinds of concrete measures contemplated. It would also seem to have moved a very long way from the original conception of safety by design.
If the reality is that we do not have a clear idea of how safety by design is meant to
translate into concrete regulatory measures within the context of the Act, that would not be a good starting point for crafting a statutory definition.
The alternative, of course, is that I have always had safety by design wrong and that Parliament knew exactly what it intended in Section 1. If so, mea culpa.