Cambridge Analytica

Is profiling acceptable?

Some level of profiling based on user activities, inference and prediction may be acceptable in commercial contexts. Marketing needs to be able to make some predictions to function at all, and this can be acceptable as long as it happens with consent and transparency. For instance, we may infer that “people who buy pushchairs will probably be susceptible to adverts about toys for young children”. Here we should distinguish between demonstrated statistical associations and simple guesses. For example, the data may show that buyers of pushchairs actually do not buy toys immediately afterwards, maybe because they’ve already made a significant purchase. This kind of insight based on profiling is more accurate, but it brings more risks of intrusion.

There is a very thin line for the acceptability of such profiling. Facebook allows targeting around life events such as birthdays or marriage anniversaries. This can be annoying, and it is easy to get some inferences wrong.

Things get more complicated when sensitive data such as health, politics, sexuality or religion are involved. For example, there are predictions that may sound fairly common sense but are actually problematic, such as: “people who like East London Mosque probably won’t buy bacon”. Using any such sensitive data for profiling is regulated more strictly than other data in Europe for good reason. Some EU countries, such as Spain, explicitly ban the creation of databases with the political views of individuals.

Is psychological profiling acceptable?

Profiling based on psychological traits is a complex area, even if it is less regulated in principle than health or biometrics. This involves making psychological assessments about individuals you have never met. From an ethical point of view, the profiling and psychometrics deployed by internet researchers seem questionable. There is a fundamental difference between establishing a statistical relation between behaviours and attempting to profile the essential qualities of an individual with the intention to use those attributes to then make completely new connections. It is unclear what, if any, benefits such profiling brings to the subjects involved and in what circumstances it would be acceptable in other contexts, let alone without consent.

Mixing sensitive data, such as political views, with psychological profiling is even more dangerous and not something we can see a justification for. When we add the objective of using such psychological and political profiling for manipulation, and for targeting sophisticated propaganda that operates at the unconscious level in the aggressive manner displayed by CA, the whole enterprise becomes completely unacceptable.

The fact that such political profiling is targeting not just individuals as isolated units but a whole population, with the objective of fundamentally altering the material, social and political conditions of a whole country - including the removal of fundamental rights in the case of Brexit - should be an emergency wake-up call for citizens and policymakers alike.

Are the claims about micro-targeting credible?

There are two issues here: the ability to target individuals and the accuracy of the psychological predictions. The targeting of individuals certainly happens, but it is less clear how much effect this has and whether the behavioural proxies used for segmentation reflect the actual mental state of the subjects.

Chris Wylie has claimed that some of the adverts using SCL profiling had a full conversion rate of up to 7%. Conversion here means actually taking action - signing a petition, joining an event, or making a donation. If true, this is extremely high, as average rates for simple click-through on Facebook ads are under 1%^[1].

There is some academic evidence that personality can relate to politics, with researchers finding that high openness and low conscientiousness are robust indicators of being socially liberal rather than conservative^[2]. This is the kind of insight CA will hope to exploit. Other researchers, including those whose names have appeared in the scandal albeit not directly implicated, have also found they can predict personality from digital records^[3]. This extends to predicting other non-psychological attributes, such as gender or race.

There is no full consensus on the impact of such an approach, though. Other academics, such as Jessica Baldwin-Philippi, pour scorn on the idea that psychometrics are a revolutionary tool in political campaigning. Analysing the evolution of digital political campaigns, she finds that Facebook has indeed allowed for more precise micro-targeting, but that this is not necessarily helped by psychometrics. She argues instead that Twitter’s role in broadening the communications space of campaigns would have had a bigger impact, for example, on Trump’s victory^[4].

We believe that a precautionary principle should be applied here, with extra monitoring and controls over online political advertising based on profiling. There is a need for more research and absolute transparency to override commercial secrecy. These precautionary measures, however, should not lead to rushed permanent legislation based on panic rather than evidence, which may stifle freedom of expression and restrict minority political views.

Is it OK to process the data of USA citizens in the UK and would Data Protection rules apply?

Cambridge Analytica is a company incorporated in the USA and the main accusations relate to the US presidential campaign and US citizens’ data^[5]. Generally, EU data protection would not apply to the processing of personal data of US citizens for a service offered in the USA by a US company, but we believe that in this case there are grounds for UK data protection to apply.

The UK Data Protection Act applies when an organisation is “established” in the United Kingdom and the data are processed in the context of that establishment, or if a non-EU company uses equipment in the UK^[6]. Cambridge Analytica has its main office in London, where we understand that the bulk of the processing takes place, and from where it offers commercial services. This was the criterion used to determine that Google was “established” in Spain - and subjected to Spanish data protection law - in the famous Costeja case on the right to be forgotten. It seems likely that Cambridge Analytica can be found to be “established” in the UK for data protection purposes. In addition, the original collection of the data was carried out by a Cambridge University academic.

European data protection does not discriminate against non-EU nationals. Recital 2 of both the current EU Data Protection Directive and the GDPR uses almost the same wording: the fundamental rights and freedoms of natural persons are protected ‘whatever their nationality or residence’.

Did the Cambridge Analytica modelling results get used elsewhere, e.g. for UK targeting?

Chris Wylie and the Guardian newspaper claim that the Canadian company AggregateIQ was a front for SCL activities and have shown documents demonstrating they share underlying intellectual property. AggregateIQ was paid almost £4 million by Vote Leave and connected groups^[7]. If the claims are true, it is likely that Cambridge Analytica’s models were used for Brexit. Chris Wylie has stated categorically that these interventions were critical to the Brexit victory.

Even if the exact models were not used, we know that online psychometrics are becoming more widespread, and that Cambridge University researchers from the department that provides the intellectual basis for the whole enterprise^[8] have been offering their knowledge to other firms. Cambridge University should come clean on who their staff have worked for.

It seems very possible that UK citizens have been targeted similarly to those in the US, which is very concerning. We also understand that the parent company SCL, a British firm that routinely works for the UK government, may have undermined the democratic process in some 200 elections around the world.

Are Facebook's rules about sharing friends' data acceptable, and did these apply in the EU?

Facebook offered third party app developers an unacceptable level of access to their users’ data without appropriate consent or even basic transparency. A 2010 survey of 1800 Facebook apps found that some 11% enabled access to friends’ data in the same way as in this case^[9]. This option to harvest third party data was available in the EU.

Facebook stopped this kind of access in 2014-15, with friends’ data only being accessible to apps if these friends had also signed into the app. Mark Zuckerberg has claimed that this was due to privacy concerns^[10]. However, according to an ex-employee this was likely due to worries about feeding potential competitors that could reverse engineer their databases and social graph^[11].

Will CA or Facebook be open to data protection complaints?

The ICO is carrying out an investigation and we hope some preliminary results are made public soon. Based on the publicly available evidence the legality of data practices in this situation is not straightforward. There are three main parties involved: Facebook, the Cambridge University researcher Aleksandr Kogan (operating under a company called GSR) and Cambridge Analytica.

Kogan set up a company called Global Science Research (GSR) - allegedly under Cambridge Analytica’s direct instructions - and got 270,000 people to volunteer their Facebook data in exchange for a small fee. The app was ostensibly for research and not commercial use, although this is disputed by Kogan. The app then used Facebook’s API to further collect data on the “friends” of these people, covering at least 50 million users.

Cambridge Analytica gave Kogan around $1 million to build this dataset, which was then combined with other data to model political preferences. Wylie claims this dataset formed the basis of the whole of Cambridge Analytica’s online operations.

There is a dispute over who is responsible here. Facebook claims that Kogan abused their terms and conditions by selling data to Cambridge Analytica, while Kogan argues that he just played by the book, that everything was transparent to Facebook, and that many other apps were engaged in similar collection.

Kogan seems to be right on this last point but may have misled Facebook in other areas. If it is true, as some argue, that the data was collected for “research purposes” but then used for marketing, political or otherwise, Kogan may have legal questions to answer, in addition to other ethical issues or breaches of Facebook’s rules.

Facebook may have to answer for its oversharing of data, which enabled default access by apps to data from the “friends” of its users without consent. In 2009 US privacy organisations complained about Facebook’s data practices, including sharing with app developers. The company avoided fines by signing a “consent decree” with the US Federal Trade Commission in 2011, agreeing not to share its users’ data without their consent^[12]. Facebook claims that every user involved gave their consent, but this is simply not true. Most of the 89 million Facebook users affected did not even know this was happening, even if somewhere in the terms of service it said that it was possible. The company now stands accused in the US of violating this agreement and could be liable for extensive fines^[13]. These former practices could also be a problem in the UK if the ICO decides to follow up.

Cambridge Analytica could also have a problem, as they are processing ordinary personal data to deduce the data subject's political views or likely political support. Under the UK Data Protection Act 1998, this is likely to count as processing personal data (social media data) to reveal sensitive personal data attributes (political views), which requires an extra ground for the processing^[14]. This will likely have to be “explicit consent”, which is absent, and therefore the processing should not have occurred.

Facebook is in a similar position: if they facilitate access to data of their users that is going to be processed to generate sensitive data profiles, they would need to obtain explicit consent. Facebook may plead ignorance of such processing, but still stands to be accused of failing to take appropriate care of the data. A major point of contention is whether or not this episode involved a data breach - which Facebook, worried about its reputation, strenuously denies. In fairness, there was no hacking, and nobody lost or stole the data as such; Kogan simply used standard procedures. However, the Principle of Information Security in the UK Data Protection Act is much broader: “appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data”.

Consent is also problematic for Facebook more widely in relation to their own internal processing of sensitive data such as political views, race or sexual orientation. It would be a stretch to argue that Facebook users have given the required explicit consent for sensitive data simply by signing up with its click-through end user agreement.

Facebook has been selling itself as a political campaigning platform. The platform has been profiling its users in the US and the EU using sensitive data, but seems to make a distinction for its users in the EU - who are covered by stricter privacy rules - in the use that it allows of such data. This needs to be clarified, as the company’s policies in this area are very confusing.

References

[1] https://www.wordstream.com/blog/ws/2017/02/28/facebook-advertising-benchmarks
[2] https://onlinelibrary.wiley.com/doi/full/10.1111/j.1467-9221.2008.00668.x
[3] http://www.pnas.org/content/early/2013/03/06/1218772110
[4] https://jessebp.files.wordpress.com/2011/04/jbp_myths_datadriven_forum_final.pdf
[5] For now we will put aside issues with AggregateIQ and claims that effectively all these companies form a single entity around SCL.
[6] https://www.legislation.gov.uk/ukpga/1998/29/section/5
[7] See all the invoices here: https://bit.ly/2E9qEyn
[8] https://www.psychometrics.cam.ac.uk/
[9] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.591.8519&rep=rep1&type=pdf
[10] https://www.facebook.com/zuck/posts/10104712037900071
[11] https://www.theguardian.com/news/2018/mar/20/facebook-data-cambridge-analytica-sandy-parakilas
[12] https://newsroom.fb.com/news/2018/03/suspending-cambridge-analytica/
[13] https://epic.org/privacy/facebook/EPIC-et-al-ltr-FTC-Cambridge-FB-03-20-18.pdf
[14] Schedule 3, Data Protection Act 1998
