Digital Methods and technicity-of-the-mediums.
From regimes of functioning to digital research.
Janna Joceli Omena
Doctoral thesis in Digital Media.
2021
Thesis submitted to meet the requirements for obtaining the degree of Doctor of Philosophy
in Digital Media by the UT Austin-Portugal Digital Media programme, carried out under the
scientific guidance of Researcher Professor Dr. Jorge Martins Rosa and Dr. Tommaso
Venturini. This work was funded by the Foundation for Science and Technology, grant
number PD/BD/128252/2016.
Thesis submission: 28 December 2020.
Revised submission: 11 June 2021.
Doctoral exams: 21 September 2021.
Jury composition:
Cristina Ponte (president of the jury)
Full professor, NOVA FCSH, Universidade
Nova de Lisboa, Portugal
Tommaso Venturini (supervisor)
Chargé de recherche, CNRS Center for Internet
and Society, France
Bernhard Rieder
Associate Professor, University of Amsterdam,
The Netherlands
Ariadna Matamoros-Fernández
Lecturer in Digital Media, Queensland
University of Technology, Australia
Liliana Bounegru
Lecturer in Digital Methods, Department of
Digital Humanities, King’s College London,
United Kingdom
Paulo Nuno Vicente
Assistant Professor, NOVA FCSH, Universidade
Nova de Lisboa, Portugal
Carla Morais
Assistant Professor, Faculdade de Engenharia,
Universidade do Porto, Portugal
To Colin Mcmillan,
a beloved friend and master.
(in memoriam)
ACKNOWLEDGEMENTS
Gratidão é a memória do coração
Dom Paulo Garcia
I want to thank God for bringing me here (what a journey!), for always being there as
my strength and hope in difficult times. You are my every good thing.
Finishing this dissertation in time was a burning and fascinating process that could
never have been completed without the support of a list of people who have in
different ways played crucial roles in my PhD journey.
I start with iNOVA Media Lab, the environment that enabled me to
create and coordinate the SMART Data Sprint and a research group named Social Media
Research Techniques (SMART), a golden opportunity to develop research
in an area not yet existing at NOVA FCSH. I would like to thank, particularly, Paulo
Nuno Vicente and António Granado. Paulo invited me to join the iNOVA Media
Lab from the start, believed in my work and opened several opportunities for me. I
will always be grateful for that, thank you meu coordenador! In my first year as a PhD
student, António’s lectures served as an inspiration to this thesis in so many ways
(from a research project that turned into the SMART Data Sprint, to a good
conversation after class about politics and Instagram). António’s support and
invitation to collaborate with him were priceless. És o maior! Many thanks also to all
my colleagues at iNOVA Media Lab for inspiring conversations throughout the past
years, particularly those who are part of SMART research group: Ana Marta Flores,
Elena Pilipets, Jason Chao and Rita Sepúlveda.
Universidade Nova de Lisboa has been my home since 2013, when I began a master's
degree at NOVA FCSH. That same year I met Jorge Martins Rosa, who became
my master’s and doctoral supervisor. I am thankful to him for all these years, for
being always available either to listen or to develop collaborative projects. As a
supervisor, he has always considered my focused interest in digital methods, giving
me the freedom to seek my own paths. This has certainly made a difference in my
journey. A big thanks for all the support given by NOVA Desporto, to the athlete-students, and especially to all my basketball teammates and coaches who were
definitely part of this work, although not exactly academically. A big thanks to
Roberto Henriques for inviting me to teach social media analytics at NOVA
Information Management School, four groups in total, which coincided with the
beginning of the pandemic. What a challenge! This opportunity had a direct impact
on the reflections of this dissertation, besides being valuable to developing my
teaching skills. Before I began my PhD studies, two incredible minds unintentionally
introduced me to the concept of technicity through conversations that aroused my curiosity
and later turned my attention and interest to it. A big thanks to Manuel Bogalheiro for
the several inspirational talks we had, and to Bernhard Rieder who has inspired a great
deal of this dissertation. Two conversations I had with Bernhard were definitely game
changers on my PhD journey: the first was in July 2015 (at that time I hadn't even
begun my PhD course); the second was when I was already at the most advanced
stage of the thesis. Bernhard has always cared and been available either by responding
to my email requests or accepting the invitations to collaborate with SMART Data
Sprint. He also has shared his work, a few times asking for my general impressions
(when I certainly had more to receive than to give). I cannot thank him enough for
that.
I am indebted to Tommaso Venturini for his valuable suggestions and criticism in his
capacity as thesis co-supervisor. Being supervised by him was certainly one of the
best things that happened to me in the PhD journey. Tommaso has been an inspiring
advisor, always asking crucial questions to refine and complicate my thinking and
helping me to see the value of my work. This thesis would not be what it is without
Tommaso’s help and intervention. I could not be more grateful to have him as my
supervisor (I have learnt so much!); he has been the best supervisor ever.
This thesis has greatly benefited from the Digital Methods Initiative, where I
learned to do digital methods and also to share knowledge. My first collaborative
experience with DMI took place during the 2014 Winter School in Amsterdam. This
was undoubtedly a decisive factor in my journey to the development of this thesis.
Over the years, at DMI Summer Schools, I have met amazing researchers and started
to work in collaboration with some of them. For this, I want to thank everyone who
makes DMI a reality and encourage them all to keep going. You make the difference!
Many thanks to the super talented designer Beatrice Gobbo (whom I first met at the DMI
Summer School in 2017) for collaborating on the key visualisations that are part of this
thesis.
Last but not least, I want to say how grateful I am to my family and dearest friends,
for their love and support. My great thanks to Isis Cavalcanti, Maria do Carmo,
Carlisi Omena, Jocelina Omena, Carlos and Rosa Omena. One million thanks
to João Fonseca for his love and relentless support, for understanding my bad
moods, for celebrating with me all achievements and failures. Thanks to Cristina,
Guida, Orlando and Luís Fonseca for your warm welcome and care. A heartfelt thanks
to Natália de Santana and Andrea Veruska; even physically separated by the
ocean, they have both always been present, always cheering me up. My special
gratitude to Margarida Sousa and Katielle Silva, not only for their
friendship but for their kindness of heart, offering me support in difficult times. I want
to express my gratitude to Carla Nave, Ana Marta Flores, Camila Wohlmuth and
Ariana Mencaroni for their friendship, great company and PhD-related conversations
watered with a rosé or red wine. You are part of this! Many thanks to my dear friend
Inês Amaral for her always kind and supportive words and also to my dears Liliana
Rosa, Ricardo Miguel, Frédéric Gaspar (Doudou), Carlos Amaral, Ana Miguel, Luís
Silva, Elsa Caetano and Luis Oliveira Martins for all their support and care, for all
the great moments together. They all have inspired me to keep going and I am simply
so grateful to them.
ACKNOWLEDGEMENTS CO-AUTHORED ARTICLES
This doctoral thesis has benefited so much from the collaboration of amazing
researchers, resulting in several articles and data sprint reports, a book chapter, two
grant awards and a book edition.
Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for Hashtag
Engagement Research. Social Media and Society.
https://doi.org/10.1177/2056305120940697
Chapter 4 corresponds to a two-year collaboration with Elaine Teixeira Rabello and André
Mintz. This article originated from an unexpected encounter with two incredible people and
researchers (today my friends) at the DMI Summer School 2017. It was my third experience
in this kind of event, and I was almost ready to pitch a project about hashtags and political
polarisation in Brazil. I say almost because the project only became viable after meeting
André (who on the first day of the sprint sat on my right side) and Elaine (who on the same
day sat on my left side). I am so grateful to have met them both, fruitful relationships of work
and friendship (#instagood forever!). The article thus systematised approaches explored in
two data sprints (DMI Summer School 2017 and SMART Data Sprint 2018), with early
results presented at the ECREA Digital Culture and Communication Section Conference
(November 2017, Brighton, UK). I also would like to thank the researchers and designers
who participated in the data sprint projects.
Omena, J. J. & Granado, A. (2020). Call into the platform! Merging platform
grammatisation and practical knowledge to study digital networks, Icono 14, 18 (1), 89-122. doi: 10.7195/ri14.v18i1.1436
Chapter 5 corresponds to a three-year collaboration with António Granado that started in early
2017, when António invited and challenged me to use digital methods to study how
Portuguese Universities were using Facebook to communicate. We presented some
preliminary findings at the Science Communication Conference - SciCom PT 2017. In 2018,
we focused on Google Vision API and the visual exploration of different networks. The results
of this work culminated in an article published in Icono 14 and also accepted at the 8th
ECREA Conference 2020 by the Visual Cultures committee (the conference was postponed
to 2021 due to the pandemic). I also would like to thank Fábio Gouveia for helping us to
select a specific group of images in our folder with over 22,000 images, which facilitated
the analysis.
Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H., Pilipets, E.,
& Azhar, H. (2020). APIs de Visão Computacional: investigando mediações
algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1).
https://doi.org/10.12957/logos.2020.51523
Chapter 6 corresponds to an article written in collaboration with Tarcízio Silva, André Mintz,
Beatrice Gobbo, Taís Oliveira, Helen Takamitsu, Elena Pilipets and Hamdan Azhar. This
article was born from an invitation I made to Tarcízio Silva, in mid 2018, to join the SMART
Data Sprint 2019 at Universidade Nova de Lisboa. I helped Tarcízio to develop the project
before it was presented at the data sprint. During the event, and together with André Mintz
and Beatrice Gobbo, I helped in developing and operationalising the research design and
methods. A version of this study was also presented at the National Symposium on Science,
Technology and Society in 2019 in the city of Belo Horizonte, Brazil.
Janna Joceli Omena & Inês Amaral (2019). Sistema de leitura de redes digitais
multiplataforma. In: Métodos Digitais: Teoria-Prática-Crítica, edited by Janna Joceli
Omena. Lisboa: ICNOVA. ISBN: 978‐972‐9347‐34‐4
This book chapter was written in collaboration with Inês Amaral, who is a colleague and
also a dear friend. I first met Inês Amaral in 2014, when she was tutoring a workshop on
network analysis with Gephi, which I attended while trying to make sense of how to read
networks afforded by Facebook Graph API through Gephi. I am so grateful for this meeting,
which later saw us join forces to develop research projects. Two reasons justify the
importance of this chapter in this thesis: i) it first introduced the triangular understanding of platform
grammatisation, cultures of use and software affordances as pillars to be considered when
doing digital methods; and ii) it first introduced the narrative
affordances of ForceAtlas2 to read networks through fixed layers of interpretation. These
approaches have been tried and tested within and outside data sprint environments (since
2018, and they continue to be developed).
Warren Pearce, Suay M. Özkula, Amanda K. Greene, Lauren Teeling, Jennifer S.
Bansard, Janna Joceli Omena & Elaine Teixeira Rabello (2018): Visual cross-platform analysis: digital methods to research social media images, Information,
Communication & Society, DOI: 10.1080/1369118X.2018.1486871
This paper led by Warren Pearce and Suay Özkula is already considered an agenda-setting
digital methods article. I conceptualised and developed methods for the Instagram analysis
and also contributed to the topic of cross-platform analysis with digital methods. Through
this collaborative research experience, I could see in practice how crucial the roles of
technical knowledge and technical practices are in the production of knowledge.
Rosa, J. M., Omena, J. J., & Cardoso, D. (2018). Watchdogs in the Social Network:
A Polarized Perception? Observatório (OBS*) 12 (5), Special issue “As Formas
Contemporâneas dos Conflitos e das Apostas Digitais”.
DOI: 10.15847/obsOBS12520181367
In this article, co-authored with Jorge Martins Rosa and Daniel Cardoso, I had the opportunity
to contribute to the research design and implementation with digital methods. Here we have
explored a navigational research practice for interpreting digital networks according to their
visual affordances.
Awarded grants & Book edition
In 2018, I received a research grant from UT Austin | Portugal Digital Media Program. The
project was related to the technicity of social media platforms and the life of Instagram bots.
I want to thank Jason Chao for his collaboration on matters related to machine learning.
Between 2018 and 2019 I edited the book Métodos Digitais: teoria-prática-crítica,
composed of reference articles and original texts, also designed for the teaching of digital
methods in Portuguese.
In 2020, I received another research grant from the Center for Advanced Internet Studies
(CAIS) but this time to promote working group meetings. I am coordinating this project,
Stick & Flow: A critical framework for investigating bot engagement on social media, with
Elena Pilipets.
This thesis has benefited from both research projects, the book edition and these collaborations.
Digital Methods and technicity-of-the-mediums.
From regimes of functioning to digital research.
Janna Joceli Omena
Abstract
Digital methods are taken here as a research practice crucially situated in the technological
environment that it explores and exploits. Through software-oriented analysis, this
research practice proposes to repurpose online methods and data for social and medium
research, but it is not yet considered a proper type of fieldwork because these methods are
new and still in the process of being described. These methods impose proximity with software
and reflect an environment inhabited by technicity. Thus, this dissertation is concerned
with a key element of the digital methods research approach: the computational (or
technical) mediums as carriers of meaning (see Berry, 2011; Rieder, 2020). The central
idea of this dissertation is to address the role of technical knowledge, practice and expertise
(as problems and solutions) in the full range of digital methods, taking the technicity of
the computational mediums and digital records as objects of study. By focusing on how
the concept of technicity matters in digital research, I argue that not only do digital
methods open an opportunity for further enquiry into this concept, but they also benefit
from such enquiry, since the working materials of this research practice are the media, their
methods, mechanisms and data. In this way, the notion of technicity-of-the-mediums is
used in two senses: it points, on the one hand, to the effort to become acquainted with the
mediums (from a conceptual, technical and empirical perspective) and, on the other hand, to
the object of technical imagination (the capacity to consider the features and practical
qualities of technical mediums as an ensemble and as a solution to methodological problems).
From the standpoint of non-developer researchers and the perspective of software practice,
the understanding of digital technologies starts from direct contact, comprehension and
different uses of (research) software and the web environment. The journey of digital
methods is only fulfilled by technical practice, experimentation and exploration. Two main
arguments are put forward in this dissertation. The first states that we can only repurpose
what we know well, which means that we need to become acquainted with the mediums
from a conceptual-technical-practical perspective; whereas the second argument states
that the practice of digital methods is enhanced when researchers make room for, grow
and establish a sensitivity to the technicity-of-the-mediums. The main contribution of this
dissertation is to develop a series of conceptual and practical principles for digital research.
Theoretically, this dissertation suggests a broader definition of medium in digital methods
and introduces the notion of the technicity-of-the-mediums, together with three distinct but related
aspects to consider (platform grammatisation, cultures of use and software
affordances), as an attempt to defuse some of the difficulties related to the use of digital
methods. Practically, it presents concrete methodological approaches providing new
analytical perspectives for social media research and digital network studies, while
suggesting a way of carrying out digital fieldwork which is substantiated by technical
practices and imagination.
Keywords: digital methods, technicity, technical practices, networks, vision APIs, social
media.
Métodos Digitais e tecnicidade-dos-mediums.
De regimes de funcionamento à pesquisa digital.
Janna Joceli Omena
Resumo
Os métodos digitais são aqui tomados como uma prática de investigação crucialmente
situada no ambiente tecnológico que explora e do qual tira benefício. Esta prática de
pesquisa propõe a reorientação dos métodos online e dos dados para a pesquisa social e do
meio através da análise orientada por software, prática ainda não considerada como um
tipo adequado de trabalho de campo porque estes métodos são novos e a sua descrição está
ainda numa fase incipiente. Estes métodos obrigam a adquirir familiaridade com o
software e refletem um ambiente habitado pela tecnicidade. Esta dissertação diz assim
respeito a um elemento-chave da abordagem de investigação dos métodos digitais: os
meios computacionais (ou técnicos) enquanto portadores de significado (ver Berry, 2011;
Rieder, 2020). A ideia central desta dissertação é a de refletir sobre o papel do
conhecimento técnico, da prática técnica e da aquisição de competências (como problemas
e como soluções) em todo o âmbito dos métodos digitais, assumindo a tecnicidade dos
meios computacionais e dos registos digitais como objetos de estudo. Ao centrar-me na
forma como o conceito de tecnicidade é fundamental na investigação digital, argumento
que não só os métodos digitais abrem uma oportunidade para uma investigação mais
aprofundada deste conceito, mas também que beneficiam deste tipo de investigação, uma
vez que a matéria-prima desta prática de pesquisa são os meios, os seus métodos,
mecanismos e dados. Deste modo, a noção de tecnicidade-dos-meios é utilizada em dois
sentidos: apontando, por um lado, para a necessidade de conhecimento dos meios (duma
perspetiva conceptual, técnica e empírica) e, por outro, para o objeto da imaginação
técnica (a capacidade de tomar as características e as qualidades práticas dos meios
computacionais como um conjunto [ensemble] e como uma solução para problemas
metodológicos). Segundo o ponto de vista dos pesquisadores que não estão familiarizados
com o desenvolvimento de software (ou de ferramentas digitais) bem como da perspectiva
da prática do software, a compreensão das tecnologias digitais deve partir de um contato
direto, da compreensão e dos diferentes usos do software e do ambiente da web. O percurso
dos métodos digitais só pode ser concretizado pela prática técnica, pela experimentação e
pela exploração. Dois argumentos principais são apresentados nesta dissertação. O
primeiro afirma que só podemos tirar proveito daquilo que conhecemos de forma
aprofundada, o que significa que é necessário que nos familiarizemos com os meios numa
perspetiva conceptual-técnica-prática, enquanto o segundo argumento afirma que a prática
dos métodos digitais é aperfeiçoada quando os investigadores estão recetivos a,
amadurecem e adquirem uma sensibilidade para a tecnicidade-dos-meios. A principal
contribuição desta dissertação é o desenvolvimento de um conjunto de princípios
conceptuais e práticos para a pesquisa digital. Teoricamente, esta dissertação propõe uma
definição mais ampla de meio nos métodos digitais, introduz o conceito de tecnicidade-dos-meios e aponta para três facetas distintas mas relacionadas – referimo-nos à
gramatização das plataformas, às culturas de utilização e às affordances do software –,
como uma solução para minorar algumas das dificuldades relacionadas com a utilização
dos métodos digitais. Na prática, apresenta abordagens metodológicas concretas que
fornecem novas perspetivas analíticas para a investigação dos media sociais e para os
estudos de redes digitais, ao mesmo tempo que sugere uma forma de levar a cabo trabalho
de campo digital que é substanciada por práticas técnicas e pela imaginação técnica.
Palavras-chave: métodos digitais, tecnicidade, práticas técnicas, redes, APIs, media
sociais
TABLE OF CONTENTS
DIGITAL METHODS AS AN ENVIRONMENT INHABITED BY TECHNICITY....................1
SITUATING THE PROBLEMS OF METHODS IN DIGITAL RESEARCH ......................................................................... 2
New media, new methods, new issues ......................................................................................... 2
The practice of digital methods demands a proximity with computational mediums .................. 6
ACCOUNTING FOR THE TECHNICITY-OF-THE-MEDIUMS IN DIGITAL METHODS ....................................................... 7
Part one: reflecting the intersection of methods, technicity and digital fieldwork ..................... 11
Part two: mobilising technicity-of-the-mediums and three pillars of the digital methods
approach to design research and present concrete methodological approaches ...................... 13
LIMITATIONS AND FURTHER WORK ........................................................................................................... 17
1 UNPACKING DIGITAL METHODS ...................................................................................... 20
UNDERSTANDING DIGITAL METHODS ....................................................................................................... 21
An historical perspective: the reasoning behind the methods .................................................... 22
Terminology and definitions ....................................................................................................... 24
Taking technological grammar into account .............................................................................. 27
DOING DIGITAL METHODS ..................................................................................................................... 29
The art of querying...................................................................................................................... 29
Technical practices, the makers and users of software .............................................................. 32
A checklist of questions related to the practice of digital methods ............................................ 39
Data sprints as a form of learning .............................................................................................. 41
THE MANY CHALLENGES OF DIGITAL METHODS ........................................................................................... 43
Four underlying principles ........................................................................................................... 43
Characterising technical knowledge and practice in digital methods......................................... 45
A call for a broader definition of “medium” in digital methods .................................................. 47
2 THREE ATTEMPTS TO UNDERSTAND TECHNICITY ..................................................... 52
FIRST ATTEMPT TO UNDERSTAND TECHNICITY (MEDIA THEORY) ..................................................................... 53
Ways of thinking technicity in Digital Media field studies .......................................................... 54
Technicity as a domain in reiterative and transformative practices ........................................... 57
Common perceptions and appropriations .................................................................................. 59
SECOND ATTEMPT TO UNDERSTAND TECHNICITY (A PHILOSOPHICAL PERSPECTIVE).............................................. 61
The awareness component: machine as elements, individuals and ensembles .......................... 63
The human function: to be acquainted with machines ............................................................... 68
From orders of thought to activity and back............................................................................... 74
The value of technical elements and technical imagination ....................................................... 77
THIRD ATTEMPT TO UNDERSTAND TECHNICITY (WITH DIGITAL METHODS) ......................................................... 80
The context and content of the network ..................................................................................... 80
Building a computer vision-based network (being acquainted with computational mediums) .. 84
Reading digital networks (orders of technical-practical thoughts) ............................................. 87
3 DIGITAL FIELDWORK.......................................................................................................... 95
GETTING ACQUAINTED WITH THE WEB ENVIRONMENT .................................................................................. 96
A technical comprehension of the web: an overview .................................................................. 97
An architecture of participation: the role of web applications and APIs................................... 103
From the web as a platform concept to platformisation .......................................................... 108
DIGITAL TECHNOLOGIES AND THE WEB AS THE LAST STAGE OF GRAMMATISATION ............................................ 110
Making visible and tangible different types of memory, behavior, knowledge ........................ 111
From the metaphor of capture to an in-depth look over technological grammar .................... 113
THREE PILLARS OF THE DIGITAL METHODS APPROACH ................................................................................. 117
Platform Grammatisation ......................................................................................................... 119
Cultures of use .......................................................................................................................... 123
The affordances of software ..................................................................................................... 131
INTRODUCTION TO CHAPTER FOUR ................................................................................ 138
9
4 HASHTAG ENGAGEMENT RESEARCH............................................................................ 142
INTRODUCTION .................................................................................................................................. 143
REVISITING THE ROLE OF HASHTAGS ........................................................................................................ 144
Situating Hashtag Engagement ................................................................................................ 147
REASONING WITH AND THROUGH THE MEDIUM ........................................................................................ 148
Technicity .................................................................................................................................. 149
Platform Grammatisation ......................................................................................................... 150
THE 3L PERSPECTIVE FOR STUDYING HASHTAG ENGAGEMENT ...................................................................... 151
Layer 1: High-Visibility Versus Ordinary .................................................................................... 152
Layer 2: Hashtagging Activity ................................................................................................... 153
Layer 3: Visual and Textual Content ......................................................................................... 153
THE PRAXIS OF HASHTAG ENGAGEMENT RESEARCH ................................................................................... 154
Political Context, Scholarly Approaches, and Framing of the Brazilian Case ............................ 154
Operationalising the 3L Perspective.......................................................................................... 156
FINDINGS .......................................................................................................................................... 158
High-Visibility Versus Ordinary ................................................................................................. 158
Hashtagging Activity ................................................................................................................. 160
Visual and Textual ..................................................................................................................... 163
CONCLUSION ..................................................................................................................................... 169
INTRODUCTION TO CHAPTER FIVE .................................................................................. 172
5 DIGITAL NETWORKS ......................................................................................................... 177
THE CASE OF PORTUGUESE UNIVERSITIES ON FACEBOOK ............................................................................ 178
MATERIAL AND METHODS .................................................................................................................... 180
RESULTS ........................................................................................................................................... 183
Seeing beyond like connections ................................................................................................ 183
The imagery of Portuguese Universities ................................................................................... 190
DISCUSSION....................................................................................................................................... 200
INTRODUCTION TO CHAPTER SIX..................................................................................... 202
6 INTERROGATING COMPUTER VISION APIS ................................................................. 206
INTRODUCTION .................................................................................................................................. 207
COMPUTER VISION AND THE STUDY OF IMAGES ........................................................................................ 209
Computer Vision APIs: What are they? ..................................................................................... 211
Interrogating APIs and stock images websites: absences and hyper-visibilities ....................... 214
Granularity and standardisation in the semantic spaces of APIs .............................................. 219
NETWORKS OF SEMANTIC SPACES AND TYPICALITY ..................................................................................... 224
CONCLUSION ..................................................................................................................................... 229
TECHNICITY OF THE MEDIUMS IN DIGITAL METHODS .................................................................. 232
Part 1: For a technical culture of knowledge in digital research ............................................ 235
Part 2: From technical knowledge and technical practices to new forms of enquiring ......... 239
DEVELOPING A SENSITIVITY TO THE TECHNICITY-OF-THE-MEDIUMS IN DIGITAL METHODS ................................... 244
Making room for the computational mediums as carriers of meaning...................................... 247
Grow a sensitivity to technical elements while practising digital methods .............................. 247
Establishing a sensitivity to the technicity-of-the-mediums ..................................................... 248
REFERENCES ...................................................................................................................................... 251
APPENDICES ...................................................................................................................................... 274
DIGITAL METHODS AS AN ENVIRONMENT INHABITED BY TECHNICITY
INTRODUCTION
“[…] but the ways of thought are more important
than the subject matter” Marvin Minsky, 1988, p.323
“My thesis today is that you cannot distinguish between the
technical question and the sociological question”
Tim Berners-Lee, 12 October 1995
Situating the problems of methods in digital research
New media, new methods, new issues
The need to develop new methods adapted to study new media is more present than
ever in media studies and has sparked a vibrant discussion on technical expertise and
knowledge, but also on practical and infrastructural issues, such as the need for
research ethics, digital literacy, new data infrastructures and new
graduate/undergraduate curricula in universities (see Lazer, Pentland, et al., 2009;
Marres, 2017; Rieder & Röhle, 2019). Over the past decade, different approaches have
emerged aiming to develop new theoretical frameworks and empirical fieldwork to
comprehend our digital life and culture. The quest for more stable methods and
epistemological regimes is reflected in innovative solutions and techniques to work
with digital and digitalised sources and methods in digital humanities (see Berry, 2011;
Warwick, Terras and Nyhan, 2012)¹; cultural analytics (see Manovich, 2009, 2020)
and computational social sciences² (see Lazer, Brewer, et al., 2009; Lazer, Pentland,
et al., 2009).
On the one hand, new methods are constantly being developed to use digital
technologies to respond to research tasks and to anticipate the needs of the expanding
fields cited above. On the other hand, as criticised by Alexander Galloway, the new
methods also reflect scholars’ individual appropriations: “what results is a field of
infinite customization, where each thinker has a method tailored to his or her
preferences” (Galloway, 2014, p. 109). The ease with which new methods can be
created or customised can be problematic if it is not accompanied by an effort to think
with or reflect on the software that supports these methods and to acknowledge that
we are facing a complex methodological moment in which concepts and knowledge
are mobilized through computational mediums (see Rieder & Röhle, 2019).
1 For instance, quantitative studies that map and measure in detail cultural patterns by adopting
different visualisation techniques to make sense of big data samples, known as cultural analytics (see
Manovich, 2009, 2020). See a list of digital humanities journals and some initiatives in the following
links: https://zenodo.org/record/3406564,
https://guides.library.harvard.edu/c.php?g=310256&p=2071428
Here you find the official links for Manovich’s and Rogers’ respective initiatives:
http://lab.culturalanalytics.info/ and https://wiki.digitalmethods.net/Dmi/DmiAbout
2 Computational Social Sciences encompasses “language, location and movement, networks, images,
and video, with the application of statistical models that capture multifarious dependencies within
data” (D. M. J. Lazer et al., 2020, p. 1060).
In this context, media scholars have been questioning the influence, bias and
consequences of digital technologies “for social scientific ways of knowing” (Ruppert,
Law, & Savage, 2013, p. 24), more precisely how digital technologies and data are
“reconfiguring social science methods and the very assumptions about what we know
about social and other relations” (Ruppert et al., 2013, p. 30). Scholars have, for
example, reflected on how one can understand platform infrastructures as
performative (van Dijck, Poell, & de Waal, 2018); how software and its methods
“affect the way we generate, present and legitimise knowledge in the humanities
and social sciences” (Rieder & Röhle, 2019); and how digital databases (and their
navigational practice) could modify social theory (Latour, Jensen, Venturini,
Grauwin, & Boullier, 2012). A common factor that brings these efforts together is the
need for a heterogeneous understanding of digital technologies (Ruppert, Law &
Savage, 2013), which need to be considered as an object and resource of
social life but also as a means of sociological enquiry (see Latour et al., 2012; Marres,
2017; Rieder, 2020; Rogers, 2013). The discussion on the consequences of digital
methods, however, has to a large extent remained conceptual and its empirical
investigation is still underdeveloped.
Another characteristic of the new methods is that, in one way or another, they all
embody some form of computational turn in which the forms and features of the
computational medium supporting the enquiry become “an intrinsic part of the
research” (Berry, 2011, p. 3), a transformation that, according to David M. Berry
(2011), runs deeper than the rise of quantitative approaches to analysis. These
approaches, in fact, neither aim nor are able to single out what is
inherent to a computational medium or how this affects the research that is carried out
through it; they aim instead to measure data by using statistical or mathematical analysis. The
transformations Berry and others refer to relate to a technical understanding³ of the
computational medium which leads to new research insights (Bogost & Montfort,
2009) but also to new ways of designing and responding to research questions in digital
media studies (see Bounegru, Gray, Venturini, & Mauri, 2017; Gerlitz & Rieder, 2018;
Rieder, Matamoros-Fernández, & Coromina, 2018; Rogers, 2019). As a response to
these transformations, the digital methods approach proposes to exploit the
mechanisms of digital platforms (e.g. the ranking systems of search engines results)
and their outputs (e.g. available as retrieved or scraped data). It encourages scholars to
pay greater attention to how platforms work (to their forms, functions and
mechanisms) and how they handle digitally native data.
Digital methods (Rogers, 2013; 2019) make use of what is created by and for digital
media and focus on data and methods that are “digitally native” rather than digitised⁴,
e.g. uniform resource locators (URLs), hashtags, hyperlinks and ranking. By following
this approach, researchers should consider the methods of digital media and their
effects but also the concepts, practices and substance of the computational mediums
they exploit. Accordingly, researchers should follow, repurpose and think along with
the medium, while it evolves and changes (Rogers, 2013). By carrying out research
with and about computational mediums, the web environment and data, digital
methods also end up repurposing social research (Venturini, et al., 2017). But how can
researchers develop a research mindset that helps in thinking along with the medium?
How can they mobilise empirical evidence in the digital methods approach? In what
sense can digital methods be considered a type of fieldwork?
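As a small illustration of what working with “digitally native” objects can look like in practice, the following Python sketch pulls hashtags and URLs out of post texts and counts hashtag use. It is only a sketch under stated assumptions: the sample posts, the `native_objects` helper and the two regular expressions are hypothetical and are not taken from any tool discussed in this dissertation.

```python
import re
from collections import Counter

# Hypothetical sample posts; in a real study these would come from an API or a scraper.
posts = [
    "Voting starts today #election https://example.org/a",
    "Long queues at the polls #election #democracy",
]

# Simplified patterns (assumptions): a hashtag is '#' followed by word characters,
# a URL is 'http(s)://' followed by non-whitespace characters.
HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")


def native_objects(text):
    """Return the lowercased hashtags and the URLs found in one post."""
    return {
        "hashtags": [tag.lower() for tag in HASHTAG.findall(text)],
        "urls": URL.findall(text),
    }


# Count how often each hashtag appears across the whole sample.
hashtag_counts = Counter(
    tag for post in posts for tag in native_objects(post)["hashtags"]
)
```

Even this toy example involves methodological decisions (lowercasing hashtags, what counts as a URL) that shape what the researcher later sees in the data.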
In this context, this dissertation strives to understand, explore and critique the digital
methods approach and its proposal to reconceive some preconceived ideas of digital
research – e.g. the longing for huge amounts of data (realizing that more is not always
better); the need to separate qualitative and quantitative (searching instead for
quali-quantitative methods); and the tendency to consider research tasks like data collection
as mechanical and automated activities (extra effort is required to engage with the
fieldwork, and researchers need to be acquainted with the medium). The problem of
methods here is addressed from the standpoint of non-developer researchers who
“must pay close attention to more practical aspects of digital research” (Marres, 2017,
p.93) and take technical knowledge and the technological environment into account. Even
when they don’t have advanced technical skills or experience in coding or software
making, scholars need to be utterly attentive to the features of the computational
medium in itself and in relation to other mediums and to the ways in which its elements
influence the design and implementation of research methods.
3 Technical understanding is here understood as relating both to the knowledge and methods of a
particular medium and to the practical skills and methods required to re-purpose media for
research.
4 There is a difference of meaning between “born” and “native” digital material. David Berry (2011), in
digital humanities, refers to electronic literature, interactive fiction and web-based artefacts as
“born-digital materials”. These, in many cases, are digitalised objects.
This dissertation seeks to support the argument about the dissolution of the quali/quanti
divide in digital research in practice, and acknowledge that “digital traceability and
datascape navigation makes data and methods more continuous”, and thus, “the
micro/macro distinction appears less significant” (Venturini et al., 2017, p. 7). By
doing this, I am entering into the new media, new methods, new issues debate on the
theory and practice of digital methods, while aiming to interrogate the role of
computational mediums and the technological environment in digital research. By
referring to the technological environment, I wish to promote a technical comprehension
of digital methods and research software. This methodological viewpoint is
particularly salient in chapters 2 and 3. Discussing two case studies in my own
research, I try to demonstrate how a proximity with computational mediums has
changed the way knowledge was generated and presented.
I use the expression “computational (or technical) mediums” in a sense that
encompasses but also exceeds the notion of communication media, inviting researchers
to consider media not only as communication platforms but also as living substances
and mediating devices. This means that both the platform(s) from which the data is derived
(e.g. social media) or by which it is processed (e.g. algorithmic techniques, computer vision
APIs) and research software are taken as mediums of “expressing a will and a means
to know” because they have their own concepts, language and practices (see Rieder &
Röhle 2018, p.123). Computational mediums here stand for research software, digital
platforms and associated algorithmic techniques, which can be captured by APIs’
results or scraping and crawling methods. When opting for the plural form mediums, I
want to avoid the association with media in the sense of communication media, while
stressing their potential to change the ways in which we ask and respond to research
questions and produce (scientific) knowledge.
In support of my use of medium, Berry argues that researchers should be “concentrated
around the underlying computationality of the forms held within a computational
medium”, thus inviting scholars “to look at the digital component of the digital
humanities in the light of its medium specificity, as a way of thinking about how
medial changes produce epistemic changes”⁵ (Berry, 2011, p.4). Although not
developing the notion itself, Berry is certainly referring to the role of software, tools
and digital devices in digital research. Throughout this thesis, I provide concrete
examples of the active role of computational mediums in different digital methods and
their relationship with researchers’ technical knowledge. Being sensitive to the
medium, I argue, we become capable of choosing and organising an ensemble of
computational tools for research purposes, while acknowledging their influence and
bias. We should, therefore, care about and make room for computational mediums in the
design and implementation of digital methods and consider them as important as the
contents or the objects of our research.
The practice of digital methods demands a proximity with computational mediums
Digital methods require some technical knowledge of digital platforms and
computational tools, and ask researchers to care about the role of these mediums
in research. As argued in the previous section, on the one hand, there is a need to recognise
the participation of technology in the doing of social research but, on the other,
researchers also need to understand computational mediums and their methods more
practically (see Marres, 2017; Vis, 2013), paying attention to how they “affect the way
we generate, present and legitimize knowledge in the humanities and social sciences”
(Rieder and Röhle, 2017). Some attention to technicity is always required to use such
methods, in line with the concerns of both the contemporary philosophers of technology
and the media scholars who have argued that technology, tools and software call for
new modes of reflexivity (see Marres, 2011; Hoel, 2012; Latour et al., 2012; Ruppert,
Law & Savage, 2013; Stiegler, 2018).
5 Berry (2011, p.4) explains that this approach draws from recent work in software studies and critical
code studies, “but it also thinks about the questions raised by platform studies, namely the specifics of
general computability made available by specific platforms (Fuller, 2008; Manovich, 2008; Montfort
& Bogost, 2009)”.
As a way to tackle the problems of methods in digital media studies, particularly
through the lens of a digital methods approach, I am suggesting that researchers should
take into account the role of technicity and the layers of technical mediation mobilized
by each method. Two main reasons justify the choice of thinking digital methods
through the technicity embedded in their practices:
§ Digital methods take as their field the web environment, what is natively digital
and the computational medium, its methods and intellectual substance.
§ As a research approach, digital methods are still on a path of stabilisation and
standardisation (like many other approaches in digital research) which
facilitates the study of their inner workings⁶.
This section served as a way to situate the problems of methods and to raise specific
questions pointing to the purpose and value of this dissertation. The next section
presents the objectives, research questions and main arguments of this dissertation,
summarising the purpose of each chapter.
Accounting for the technicity-of-the-mediums in digital methods
This dissertation discusses computational mediums as carriers of meaning (see Berry,
2011; Rieder, 2020) in the practice of digital methods. The key objective of this thesis
is to address the role of technical knowledge, practice and expertise (as problems and
solutions) in digital methods. It uses the concept of “technicity-of-the-mediums” in
two ways: on the one hand, to refer to the effort to become acquainted with the
mediums (from conceptual, technical and empirical perspectives) and, on the other
hand, to emphasise the object of technical imagination (which reflects the capacity of
predicting and combining the practical qualities of computational mediums as an
ensemble and as a solution to methodological problems). Here, the understanding of
digital media and methods starts from direct contact with the fieldwork and is nourished
by technical knowledge and imagination, a journey only fulfilled by technical practice,
experimentation and exploration. I understand technical practices as all the processes
required to implement the digital methods approach, which would also critically account
for the technicity-of-the-mediums. Technical imagination, in turn, is taken as a result
of the researcher’s engagement in using and knowing about computational mediums in
different and applied research contexts.
6 Digital methods have been part of my research interests since the very beginning of my master’s
studies in 2013. My first collaborative experience with these methods, in a data sprint environment,
took place during the 2014 DMI Winter School in Amsterdam. Since then, I have never stopped
challenging myself, nor been bored by the practice of digital methods. Self-learning and participation in and
organisation of multiple data sprints gave me the experience of using and adapting tools for the
collection, curation, analysis and visualisation of digital data, particularly from social media. The
challenge of knowing how to solve problems through creative methodological approaches is the reason
why I am fascinated with digital methods. Over the past seven years, I have been developing studies on
hashtag engagement, visual network analysis, computer vision API-based networks and bot agency,
resulting in several peer-reviewed journal papers and awarded grants.
I raise questions on the intersection between methods, technicity and digital fieldwork,
while suggesting that researchers should be acquainted with the computational
mediums and take seriously their regimes of functioning. I call this attitude
establishing a sensitivity to the technicity-of-the-mediums and I argue that it constitutes
a crucial building block in the practice of digital methods. In this context, I attempt to
provide some answers to the following questions:
§ What is it like to design and implement research with a digital methods approach? To
what extent can these methods be considered a type of fieldwork?
§ Why does the notion of technicity matter for, and contribute to, digital research?
To answer these questions, this dissertation is divided into two parts. The first draws
on a review of the literature on digital methods, philosophy of technology and media
theory. It discusses the crucial role of technicity in digital research, relying on the work
of Gilbert Simondon and Bernhard Rieder. I suggest that researchers have to do
fieldwork in order to reason with and about the computational mediums, and to become
familiar with the technical environment, while understanding the relationship between
software affordances, platforms’ culture of use and their technical grammatisation. The
second part mobilises the technicity-of-the-mediums in digital methods in a series of
case studies concerning hashtag political engagement (Instagram), networks of likes
and timeline images (Facebook) and web vision APIs (Microsoft, IBM, Google). This
part is substantiated by the conceptual discussion presenting a methodological
framework that can be replicated in different case studies.
This dissertation investigates how assuming a “technicity” perspective can enhance
the practice of social research, serving as an invitation to pay greater attention to the
computational mediums and, by doing so, to iteratively enrich the research process,
the elaboration of research questions and the creation of theoretical concepts. A
conceptual, technical and empirical understanding of computational mediums can help
scholars to avoid fatal misalignments between their research objective and the
constraints and potentials of the mediums they work on and with. Acknowledging the
technicity of the mediums makes it impossible to resort to flawed reasoning such as:
“Because one thing is policy making for AI, and another is the call for us to know what AI
technically are or can do. This has nothing to do with policy making”
“You know, the theoretical part of the project is a thing, and another is the technical aspects
of data collection and analysis, I don’t care about those, the results are what matter and the
quantity of data”
“Well, why teenagers should understand what recommending systems are? This is just too
much for them. I just want to raise awareness about the power of YouTube algorithms”
“I want to develop this cool new theoretical concept about algorithmic culture but now I
need some data and some nice visualizations to support my idea”
I’ve encountered sentences like these in the most varied contexts of academic life:
colloquiums, project meetings and informal conversations. Statements such as these
show that Simondon’s theory of technology is still relevant and applicable to the practice
of digital methods, as is Rieder’s warning about the pressing need for a technical culture
of knowledge. By joining this warning, this dissertation alerts social researchers to
the fundamental role of the technicity perspective for knowledge production and
methodological innovation. Caring for the technicity of the medium can encourage
scholars to consider conceptualisation, technique and methods as a unity, rather than
separating concepts and theories from technique and methods. Thus, the “technicity”
perspective would offer scholars doing social research a new viewpoint on their study
objects, allowing new interpretations of a phenomenon or issue that are empirically
based and substantiated by the practice of digital methods.
By changing the way in which we do research, a “technicity” perspective would also
have an impact on the way we deal with ethical problems, revealing how the standard
procedures of research ethics are sometimes inadequate for methods grounded on web
environments and digital platforms (see Tiidenberg, 2020). Research with digital
methods cannot rely on typical solutions (informed consent, for instance, is unfeasible in
most projects of this type) and has to face different ethical dilemmas that “arise and
need to be addressed during all steps of the research process, from planning, research
conduct, publication, and dissemination” (Markham & Buchanan, 2012, p.5).
Digital methods researchers take advantage of the data policies of digital platforms,
which often imply that by creating an account on social media, users give their consent
to share some of their personal data and some of the records of their online behaviours
with the platform and with third parties. Asking permission or giving detailed
information about the case study to each and every participant would be an impossible
task precisely because of the number of users as well as the difficulty of reaching them.
A possible solution is to avoid studying individual people and instead investigate
larger issues at the level of public debate (see Markham, 2017). Information on the
social media profile description of politicians or botted accounts, for instance, can be
used to make sense of social-technical and political issues. Here, the patterns and
characteristics of the data are seen collectively and not focused on a single individual.
Methods addressing who is vulnerable (research individuals and populations) and what
is sensitive (studied data or behaviour) (see Tiidenberg, 2020) take a different shape
from the technicity perspective. For instance, researching sensitive subjects, such as how
pornographic content is spread by botted accounts, can produce results that
deliberately expose sexy and pornographic images of teenage girls being used in the web porn
market (as I demonstrate in chapter 2). Ethical questions may then be raised about the
image analysis, but it is also possible to argue that new ways of detecting the existence
of teenage pornography sites can help researchers in reporting such activities to the
authorities.
In data treatment, there is sometimes the option to anonymize the results in order to
ensure the anonymity of the users. In data analysis and subsequent dissemination of
results, there is a continuing concern about finding ways to avoid disclosing personal
information and harming users. However, anonymization is not always a benefit for
research (e.g. when studying political polarization or social movements). In such cases, some
anonymization strategies are applied during the analysis process to ensure the anonymity
of ordinary users, while finding ways to avoid improper exposure of the results or
harm to public figures involved in the study.
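One way such a strategy could be operationalised is sketched below in Python: ordinary usernames are replaced by keyed hashes, while accounts on an explicit list of public figures are left readable. This is a minimal sketch under stated assumptions, not the procedure followed in the case studies. The key, the `PUBLIC_FIGURES` list and the `pseudonymise` function are hypothetical, and keyed hashing is pseudonymisation rather than full anonymisation, since anyone holding the key can re-link the identifiers.

```python
import hashlib
import hmac

# Assumption: a private project key, stored separately from any published data.
SECRET_KEY = b"replace-with-a-private-project-key"

# Hypothetical allow-list of accounts treated as public figures in a given study.
PUBLIC_FIGURES = {"official_party_account"}


def pseudonymise(username: str) -> str:
    """Keep public figures readable; replace ordinary users with a stable pseudonym."""
    if username in PUBLIC_FIGURES:
        return username
    # HMAC-SHA256 gives the same pseudonym for the same user across the dataset,
    # so network and co-occurrence analyses still work on the pseudonymised data.
    digest = hmac.new(SECRET_KEY, username.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]
```

Whether public figures should remain identifiable at all is a decision that depends on the study; the allow-list only shows where that decision would be encoded.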
The main contribution of this dissertation is to develop a series of conceptual and
practical principles for digital research and digital methods. Theoretically, this
dissertation suggests a broader definition of medium in digital methods, while
introducing the concept of the technicity-of-the-mediums and three aspects to consider
when doing digital methods (platform grammatisation, cultures of use and software
affordances) as an attempt to defuse some of the difficulties related to the use of
methods. Practically, it presents concrete methodological approaches providing new
analytical perspectives for social media research, while suggesting a way of carrying
out digital fieldwork. By doing so, I hope to conceptually and empirically contribute
to the making of digital methods as a (proper) fieldwork.
Part one: reflecting on the intersection of methods, technicity and digital fieldwork
This part contains three chapters that focus on the intersection between methods,
technicity and digital fieldwork, and in which I will present the need for developing a
sensitivity to the technicity-of-the-mediums in contemporary digital research as a
solution for some of the challenges faced by digital methods. A literature review on
digital methods is also provided, together with some attempts to understand the
concepts of technicity and grammatisation, while connecting Gilbert Simondon and
Bernard Stiegler’s philosophical reflections to the practice of these methods.
1 Unpacking digital methods
With the purpose of introducing and criticising the foundations of digital methods, this
chapter aims at understanding what it means to design and implement research with
these methods, while addressing the need to take technological grammar into
account. Here I will make clear that, when using digital methods, knowledge and
findings are constantly mediated but also informed by software. Every stage of the
process impacts the following one, and the tools are always entangled with researchers’
analytical decisions (showing that these methods exist in an environment inhabited by
technicity).
This chapter recognises different forms of technical practices that can be based on the
making or the use of software. Here, I will argue in line with Rieder (2020)⁷, and from
the standpoint of software-using rather than making, that non-developer researchers
“inhabit a knowledge domain that revolves around technicity, but also includes
practical and normative notions concerning application, process, evaluation, and so
forth. This domain cannot be subsumed into others” (p.76). The chapter concludes by
addressing the many challenges of digital methods, including a call for a broader
definition of “medium” in these methods.
2 Three attempts to understand technicity
This chapter explores how the concept of technicity matters in the practice of Internet
research. The main objective of this chapter is first to understand technicity in the
context of media studies and second to address the question of how technicity
concretely contributes to digital methods. The last attempt to understand technicity
provides a description of the process of building/interpreting computer vision-based
networks, which illustrates what I call the technicity-of-the-mediums in digital
methods, the development of a “digital intellect”⁸ (see Berry, 2011) and an
understanding of the computational medium in its entelechy – in an active rather than
static mode. Technicity-of-the-mediums in digital methods requires a certain
proximity with software but also a practical awareness of software forms, functions,
operations, and intellectual substance.
I will discuss several aspects of Simondon’s philosophy of technology to situate the
specific angle from which I consider the notion of technicity, in the sense of becoming
familiar with the technical medium from a conceptual-technical-practical perspective.
Digital methods can benefit from this vision of technicity, although these methods deal
with objects that differ from the ones originally considered by Simondon in his
reflection on industrial objects.
7 From the standpoint of software-using rather than software-making.
8 Berry (2011) grounds digital Bildung as the development of a digital intellect (or mental mode),
as opposed to a digital intelligence. Berry’s comparison is based on the work of
Richard Hofstadter (1963), who explains that “intellect… is the critical, creative, and contemplative
side of mind. Whereas intelligence seeks to grasp, manipulate, re-order, adjust, intellect examines,
ponders, wonders, theorizes, criticizes, imagines. Intelligence will seize the immediate meaning in a
situation and evaluate it. Intellect evaluates evaluations and looks for the meanings of situations as a
whole… Intellect [is] a unique manifestation of human dignity” (Hofstadter, 1963: 25 in Berry, 2011,
pp.7-8). Regardless of terminology (Berry’s digital intellect or Bildung, Rogers’ digital methods
mind-frame, Marres’ device-aware sociology, Rieder’s technical culture), there is a requirement to
build not only practical but also mental modes when doing digital methods.
3 Digital fieldwork
This chapter aims to present the technological environment that the digital methods
approach takes as a point of departure to study social phenomena. It introduces
fundamental aspects of the web environment from a methodological viewpoint, while
reflecting on the notion of digital grammatisation in the practice of digital methods,
based on the work of Bernard Stiegler and Philip E. Agre. The chapter demonstrates
how technical expertise can contribute to new forms of enquiring and describes the
triangular relationship between software affordances, platforms’ cultures
of use and technical grammatisation. In doing this, I attempt to defuse some of the
difficulties related to the use of digital methods, while suggesting a way of carrying
out digital fieldwork. This background knowledge constitutes the first level of the
technicity-of-the-mediums and can be defined as the practical awareness that allows
researchers to understand not only in theory but also in practice what it means to study
collective phenomena “through interfaces and data structures” (Rieder,
Abdulla, Poell, Woltering, & Zack, 2015, p. 4).
Part two: mobilising technicity-of-the-mediums and three pillars of the digital methods approach to design research and present concrete methodological approaches
This part contains three chapters that mobilise a sensitivity to the technicity-of-the-mediums
and the three key aspects to consider when doing digital methods (software
affordances, platforms’ grammatisation and cultures of use) to design research and
present concrete methodological approaches to social and medium research. The
chapters complement and support the arguments developed in Part 1.
This part brings together three peer-reviewed publications, written in collaboration
with various colleagues, that have attempted to use a digital methods approach to
investigate questions related to political polarisation through Instagram (the case of
the “impeachment-cum-coup” of Brazilian president Dilma Rousseff); institutional
communication of Portuguese Universities on Facebook; and, how computer vision
APIs interpret stock images related to different nationalities. The case studies follow
the key guiding principles to an ethical approach to internet research by balancing the
rights of subjects with the social benefits of research (Markham & Buchanan, 2012).
In these case studies, we considered the role of application programming interfaces
(the APIs of Instagram, Facebook and Google Vision), the graph layout algorithm
ForceAtlas2, extraction software (Netvizz, Visual Tagnet Explorer, YTDT,
TumblrTool) and analysis software (Gephi, RawGraphs, Google Vision API). In
parallel to this, we consider how technological grammars and digital records (hashtags,
engagement metrics, likes, image URLs) were carried out, rearranged and modified by
the computational mediums and the researcher intervention. The chapters are thus
substantiated by methodological innovation and new analytical perspectives for social
media research and digital network studies, as I summarise in the next subsections.
Each chapter describes a different case study where I tried to make room for, grow and
establish a sensitivity to the technicity-of-the-mediums, and to understand what it
means to design and implement research with digital methods. These three verbs refer
to three different ways to deal with the technicity of the mediums, whose difference
will be addressed in this thesis and which I will discuss in detail in the conclusion.
§
To make room for technicity relates to the efforts of becoming acquainted with the
fieldwork and being aware of technical mediations. Researchers are invited to
become familiar with computational mediums (their functioning, potentials and
limitations) and train their mind to see the web, digital records, media and
computational mediums as a means of enquiry, as source and methods of
investigation.
§
To grow a sensitivity to the technicity-of-the-mediums stands for developing a
technical mindset by engaging with technical practices. Researchers understand how
the meaning carried by computational mediums can be as important as the object of
study, thereby rethinking their research questions and the conditions of proof in digital
research.
§
To establish a sensitivity to the technicity-of-the-mediums illustrates a conceptual, technical and
empirical understanding of computational mediums as crucial meaningful objects of
attention. Researchers are able to use digital methods, knowing what computational
mediums affect them, when and how.
The chapters reflect my quest to comprehend the modes of living (or nature) of
computational mediums, while seeking to grasp their epistemological role in digital
research. To do so and to embrace the unstable nature of the web platforms, I cared
about the forms of use, the concepts and the materialisation of computational mediums
throughout the design and implementation of different methods and research designs,
related to hashtags, digital networks or computer vision. In the introduction to each
chapter, I will try to explain how the technicity approach has been used in the case
studies, highlighting in particular the aspects that are not included in the text of the
published articles. In addition, I provide more specific methodological guidance and
reflections, justifying the choice of the articles.
4 Hashtag Engagement Research
This chapter seeks to contribute to the field of digital research by critically accounting
for the relationship between hashtags and their forms of grammatisation. The chapter
approaches hashtags as sociotechnical formations that serve social media research not
only as criteria for corpus selection but also as occasions to display the complexity of
online engagement and its entanglement with the technicity of web platforms.
Therefore, the study of hashtag engagement requires an understanding of the technicity
of online platforms. In this respect, we propose the three-layered (3L) perspective for
addressing hashtag engagement. The first layer contemplates potential differences in
the use of hashtags by high-visibility users and ordinary users. The second focuses on
hashtagging activity and the repurposing of how hashtags can be differently embedded
into social media databases. The last layer looks at the images and texts to which
hashtags are related. To operationalise this framework, we draw on the case of the
“impeachment-cum-coup” of Brazilian president Dilma Rousseff. When used together,
the three layers add value to one another, allowing researchers to investigate both
high-visibility and ordinary groups.
Drawing on the three pillars of the digital methods approach and the technicity-of-the-mediums, the three-layer perspective can be applied to different platforms.
5 Digital Networks
This chapter discusses the infrastructural aspects of Facebook and asks what one can
learn from the connections between Facebook pages (through likes) and from a list of
(timeline) image URLs. It uses network visualisation as a means to reimagine
Facebook grammatisation for studying how Portuguese Universities use the platform
to communicate, and it interrogates how digital networks contribute to communication
studies.
Following the digital methods approach and through the notion of calling into the
platform (Omena & Granado, 2020), we operationalise digital research about
Facebook to map and analyse like connections as institutional interests and timeline
images as institutional visual culture. To this end, we build and analyse two distinctive
networks. The first comprises all connections made by 15 Portuguese Universities’
Facebook Pages (the acts of liking other pages or being liked in return). This network
captures the connections made by each page from its created time until March 2019. The
second network is built upon the affordances of Google Vision API, displaying the
connections between timeline images and their labels (a description of the content of
the image itself). To analyse these digital networks, we relied on visual network
analysis (Venturini, Jacomy, & Pereira, 2015; Venturini, Jacomy, & Jensen, 2019).
While exploring and analysing the networks, we considered Facebook itself and how
activity and connections are (re)arranged and made available through its Graph API
and output files, Gephi’s data laboratory, Portuguese Universities’ Facebook Pages
and respective websites. We also give attention to how Google Vision API labels
images and its output files (including the folder with the downloaded images).
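The second network described above, which connects timeline images to their labels, can be sketched as a simple edge list. The following is only a minimal illustration under stated assumptions and not the procedure used in the published study: the file names, the labels and the function names are hypothetical, and the labelling output is assumed to be already available as a mapping from each image to its list of labels (each label appearing once per image).

```python
import csv
import io
from collections import Counter

# Hypothetical labelling output: image file name -> labels returned by a vision API.
# Both the file names and the labels are invented for illustration.
image_labels = {
    "page_a_001.jpg": ["Student", "Campus", "Event"],
    "page_a_002.jpg": ["Event", "Poster"],
    "page_b_001.jpg": ["Campus", "Building"],
}


def build_edges(labels_by_image):
    """Return one (image, label) edge for every label attached to an image."""
    return [
        (image, label)
        for image, labels in labels_by_image.items()
        for label in labels
    ]


def label_degree(edges):
    """Count how many images each label is connected to."""
    return Counter(label for _, label in edges)


def edges_to_csv(edges):
    """Serialise the edges as a CSV table with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


edges = build_edges(image_labels)
degrees = label_degree(edges)
csv_text = edges_to_csv(edges)
```

An edge table with Source and Target columns is a layout that network software such as Gephi can import, so that the resulting image-label network can then be explored through visual network analysis.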
Besides providing new ways to design and implement research that can be repurposed
for different studies, the main contribution of this chapter lies in embracing the
methods of the medium, a navigational research practice and the technicity-of-the-mediums as key components for digital social sciences.
6 Interrogating Vision APIs
This chapter presents the results of a study of computer vision Application
Programming Interfaces (APIs) and their interpretation of representations in stock
images. Computer vision is a field of computer science dedicated to the development
of algorithms for visual data interpretation, but the methods for its critical application
are still under construction. The study thus draws upon three computer vision APIs
(Google, IBM and Microsoft) for the analysis of 16,000 stock images related to the
following demonyms: Brazilian, Nigerian, Austrian and Portuguese (these keywords
were searched in two of the main Western stock image sites: Shutterstock and Adobe
Stock).
The main contribution of this chapter concerns the special attention given to the
potentials of vision APIs and respective machine learning models to interpret a
collection of images. In other words, the attempt to understand and interrogate how
automated image classification systems may facilitate or compromise the study of
natively digital images.
Limitations and further work
Part one provides a review of ideas derived from the work of the philosophers of technology
(chapter 2) and tries to connect them to some practical problems in digital methods. Its
purpose therefore is not to exhaustively discuss one or the other, but to establish a
meaningful conversation between them. In chapter 2, for example, my use of technicity
does not address all the elements of Simondon’s broader metaphysics, nor does it attempt
to develop a philosophical approach to science and technology (see Bogalheiro, 2017).
What Simondon presents as concretisation⁹, for instance, will not be addressed here,
since I am not investigating the long historical trajectories of technical objects and how
they evolve over time¹⁰ (Simondon, 2017). Rather than a complex process of
technical evolution (see Rieder, 2020), my use of technicity refers to the awareness
component from conceptual, technical and empirical perspectives. In addition, I will
neither address a theoretical discussion on the concept of technique, nor provide a
literature review on the concepts of technical knowledge or technical practices. The
latter will be approached from my experience in practising digital methods.
⁹ Concretisation is “a useful concept for thinking about technological artefacts and their evolution over
time” (Iliadis, 2015, p. 86). For Bernhard Rieder (2020), the notion of concretisation is essential for
understanding contemporary computing; he translates the term according to Simondon’s
perspective as “the march from a more abstract or modular state, where elements are arranged so
that every part only fulfils a single function without synergy with the others, to a state of
integration where mutual adjustments yield optimal functioning” (p.69). That would be the
movement from a modular technical object to an integrated technical object.
¹⁰ For the past five years, I have been investigating and following changes in social media APIs
over time, in terms of forms, functions and data access regimes, as well as the growth of web API
categories over time. However, this work cannot be regarded as a long trajectory and it is not
contemplated here.
In chapter 3, when introducing a methodological vision of the web, digital platforms
and software affordances, this dissertation does not go into depth in the history of the web,
or in the theoretical debate on software studies, nor does it provide a detailed description of
web technologies. This chapter, moreover, does not provide ethnographic and
anthropological reflections on fieldwork. It also does not cover what it means to
know or do fieldwork from the lens of digital ethnography or anthropology.
Part two is composed of texts that have been published as peer-reviewed articles. Each
of them is introduced by a separate section which will explain how a sensitivity to the
technicity-of-the-mediums has changed the way in which the cases were carried out. I
will make clear that these studies served also as a means for me to understand the
notion of technicity in practice, while seeking to unpack what it means to design and
implement research with digital methods. In the chapters, however, the technicity
approach can perhaps be seen in between the lines rather than as the central theme.
The main reason for this is the moment at which these publications were written
and submitted, together with the assumption that the technicity of the medium was of little or
secondary interest to the calls for papers. The section introducing each chapter will
hopefully provide more details on the connection between the paper and the idea of
the technicity of the mediums.
Although the case studies flag the different ways in which ethical concerns are seen
from the perspective of technicity, this dissertation does not develop a discussion about
the ethical problems in digital research.
Part 1 and part 2 are interdependent because the conceptual and theoretical efforts of
this dissertation (in part 1) cannot be fully understood without a direct exposure to the
actual practices of digital methods for social or medium research (in part 2). The
criticism addressed to digital methods and the suggestions on how to carry out digital
fieldwork in part 1 could not have been proposed without the case studies in part 2
(and much additional practical effort outside the context of this dissertation). Through the
connection between its two parts, this dissertation will hopefully help to improve our
understanding of the digital methods approach, because learning how to solve
methodological problems using technical knowledge and imagination, as I suggest,
requires practising the method itself with special care for the technicity-of-the-mediums.
1 UNPACKING DIGITAL METHODS
CHAPTER 1
Understanding Digital Methods
This section defines digital methods from an historical perspective and in specific
terminology, while addressing the importance of taking technological grammar into
account. It questions what it means to follow the medium. What are the methods of the
medium? What does medium specificity refer to? How exactly can one repurpose
dominant devices and for what? What makes a difference in digital methods and why?
By shedding some light on how these methods are different from traditional knowledge
structures and other digital research approaches, this enquiry aims to provide a clear
understanding of digital methods. Table 1.1 characterises digital methods, but it also
summarises what will be addressed and explained throughout this chapter.
Digital Methods
learn from: the medium dynamic nature; the technicity-of-the-mediums; the data sprint practice
follow & repurpose: medium dynamic nature; medium specificity; the content of software
are always: situational & imaginative; experimental & collaborative; challenging & demanding
require: natively digital mindset; software-makers & software-users; technical practices
work with: web data & unstable media and methods; extraction, analysis and visualisation software; situated software & query design led research
offer: post-demographic studies; chain methodology; methodological innovation
Table 1.1. Understanding digital methods.
An historical perspective: the reasoning behind the methods
The short text entitled “The future of STS on the Web, or: what I learned (naively)
making the EASST website” became a seminal article for digital methods about building
tools to study web data. This mindset and proposal were introduced by Richard Rogers
in 1996, more than a decade before the release of his Digital Methods book in 2013.
In the text, Rogers (1996) reflected upon preliminary ideas¹¹ about websites that
would later change the way many scholars think of, and apply, methods about and with the web.
By considering the web as a show and tell section, Rogers argues that websites can
represent an issue spectrum with the potential to be more than what is being presented.
His first idea looks at organisational websites as spaces for positioning statements
(such as politics) in which URLs, and not only content, would point to specific issues
or interests of a given organisation, providing insightful perspectives through the
analysis of hyperlink connections. The second idea, with web activism as an example,
suggests a technique to create “a healthy database” that starts with small gestures such
as “sending ‘subscribe’ messages to the leading web activist lists” by email (p. 26).
After that:
the webmaster then has to filter the incoming messages (maybe once
a day), and upload the calls to action on the site, arranging them by
date and by topic, perhaps with an overlay on geographical map
indicating physical origins and destinations of the activities. Email
links could be set up to the originator and the intended recipient,
allowing for two-way protest and/or information exchange. (Rogers,
1996, p. 26)
In this way, Rogers argued one could map, follow and display web activism without
having to be physically present at the demonstrations. It is interesting to note that this
idea involves a good understanding of how activist movements were using the web to
communicate or protest. It furthermore requires monitoring (not always as an
automatic process) combined with some technical skills (the webmaster tasks) and
imaginative thinking (what one can do with a list of emails).
¹¹ The text, written when making the European Association for the Study of Science and
Technology (EASST) website, presents four ideas: 1. Evolving discourses sites; 2.
Activists loop sites; 3. Reflexive webmetrics sites; 4. Virtual presence only sites. Here I
emphasise the first three.
The third idea concerns what can be quantified and in which terms content is measured.
The example used in this idea is followed by an argument in which Rogers justifies
why all academic journals should go online. He argued that, when measuring how
many hits each article receives on a website, one can obtain a metric of “awareness”.
In this sense, different measurements would point to alternative interpretations.
In Rogers’ 1996 text, we find the necessity of getting into a field in which the technical
functionality/potentiality of websites cannot be dissociated from their uses and
appropriations. A few years after “The future of STS on the Web”, and following the
specificity of the medium and the natively digital, Rogers (2009a, 2010) starts
questioning whether methods are to be changed in the context of Internet research¹²:
how to capture and analyse the natively digital objects? What do they offer? What kind
of new approaches are worthwhile? How can online medium methods be reimagined
for social and cultural research?
In 2010, Rogers advocated a bold proposal for Internet-related research “where we no
longer need to go off-line, or to digitize method, in order to study the online” (Rogers,
2010, p. 243). In other words, the suggestion is to take the Internet as a research site,
a place to ground findings. Back then, Rogers’ ideas (1996, 2009a, 2010) reflected
fundamental aspects of what today we understand as thinking along with natively
digital objects and methods, as well as online grounding research. That means, on one
side, learning and following how content is embedded into website design and, on the
other side, knowing how actors make use of the forms and functions provided by the
web environment. Consequently and implicitly in this process, the creation and use of
software for research purposes come along with the invitation to change the way we
see the medium (Internet platforms) but also the way we design research questions.
The basis of Rogers’ Digital Methods reflects both a particular way of thinking along
with the medium and with what it has to offer, and the development and use of
research software.
¹² When interviewed by Michele Mauri (link available at http://densitydesign.org/2014/05/an-interview-with-richard-rogers-repurposing-the-web-for-social-and-cultural-research/), Rogers
“situates digital methods as the study of digital that does not lean on the notion of remediation, or
merely redoing online what already exists in other media” (see also
https://wiki.digitalmethods.net/Dmi/MoreIntro).
This methodological proposal was originally developed as a counterpoint to the simple
application of existing methods to online environments (Rogers, 2009, 2010;
Rogers and Lewthwaite, 2019). With time and considering that when studying society
through Internet platforms one cannot dismiss the study of the platforms themselves
(Venturini and Rogers, 2019), digital methods have also transformed into ways of
studying the medium culture. Knowledge of the medium thus became a key matter of
concern when using digital methods.
Terminology and definitions
Digital methods are a particular form of research practice that is crucially situated in
the technological environment that it explores and exploits. In the philosophy
underlying these methods, the Internet is not taken as a parallel dimension of our social
life, but as a research site (source of data, method and technique) expected to be a
sphere where one can make and ground findings about society (Rogers, 2013, 2019).
That is:
Broadly speaking, digital methods may be considered the
deployment of online tools and data for the purposes of social and
medium research. More specifically, they derive from online
methods, or methods of the medium, which are reimagined and
repurposed for research. The methods to be repurposed are often
built into dominant devices for recommending sources or drawing
attention to oneself or one’s posts. (Rogers, 2017, p. 75; Rogers,
2019, p. 21)
Thus, research through digital methods should “follow the methods of the medium as
they evolve, learn from how the dominant devices treat natively digital objects, and
think along with those object treatment and devices so as to recombine or build on top
of them” (Rogers, 2013, p. 5). Through this lens, natively digital objects¹³ (e.g. URLs,
hashtags, tweets) and dominant devices (e.g. Google, Instagram, App stores) would
offer a method of research that learns about society through studying the web in its
own language (Rogers, 2010). In this spirit, Rogers once suggested that social
scientists would no longer need to go offline in order to study societal changes, because
they could ground and conceptualise online the “research that follows the medium, captures
its dynamics, and makes grounded claims” (Rogers, 2013, p. 13).
¹³ There is a difference of meaning between “born” and “native” digital material. While David Berry
(2011), in digital humanities, refers to electronic literature, interactive fiction and web-based artefacts
as “born-digital materials” (in many cases digitised), Richard Rogers (2013, 2015a) makes clear that
“digitally native” refers only to what is created for digital media (in the computational sense), rather
than digitised. Examples are uniform resource locators (URLs), hashtags, hyperlinks and rankings,
as opposed to Internet archives or digitised material.
The term online groundedness (Rogers, 2013) refers to a type of research that must
consider “when and under what conditions may findings be grounded with web data”
and methods (Rogers, 2019, p. 5). Thus, to ground online findings is a matter of asking
whether or not digital methods are suitable for a given research scenario, because when
dealing with Internet platforms as means of research, “the investigated phenomenon
must be to some extent performed or, at least, reflected in such platforms” (Venturini
et al., 2018, p. 4). Otherwise, the methods would not be advisable or helpful.
It is evident that the proposal of digital methods does not fit either a “methods as usual”
approach (see Marres, 2011) or the transposition of existing methods to online
environments, because digital methods always “investigate how digital infrastructures
can be re-purposed for social enquiry” (Marres, 2017, p. 43). Web data collection,
analysis or visualisation are not digital methods simply because they imply digital tools
and data. Defining the perimeter of digital methods requires asking questions such as:
what does it mean to follow the medium? What are the methods of the medium? What
does medium-specificity refer to? How exactly can one repurpose dominant devices
and for what? How can one make meaningful research questions using these methods?
Some clarifications are needed, starting with the meaning of medium in digital
methods, which stands for dominant digital platforms and search engines along with
their inputs (web content and data) and outputs (what is available for data collection).
Likewise, the methods-of-the-medium are methods “that are in some sense built into
the web” (Rogers & Lewthwaite, 2019, p. 14) and piggyback on technologies such as
Google’s PageRank algorithm or social media platforms’ recommendation systems. In
this sense, a research practice that follows the methods-of-the-medium would “take
advantage of distinctive features of digital infrastructures, devices and practices”
(Marres, 2017, p.82) for social enquiry. The effort to follow the medium thus describes
“a particular form of medium-specific research” (Rogers, 2013, p.25).
In methodological terms, as explained by the digital sociologist Noortje Marres (2017),
to follow the medium means a “continuity in methodology development across
different media and technological settings” (p.82). It is here that the researcher must
see and pay attention to medium specificity as “the material and socio-technical
qualities of the media technologies used to implement method” (Marres, 2017, p. 83),
which means being always exposed to the unstable conditions of digital platforms and
search engines. To follow the medium, in other words, means being attentive to the
medium specificity. From a theoretical standpoint, Rogers defines the specificity of
the medium differently from most of the earlier literature. To him, this concept is not
related to media’s ontological distinctiveness as in “McLuhan’s sense engagement,
Williams’s socially shaped forms, Hayles’s materiality or other theorists’ properties and
features, whether they are outputs (cultural forms) or inputs (forms of production)”
(2013, p.26). Instead, in digital methods, he argues that medium specificity can either
refer to the sense of preferred means of studying digital platforms (e.g. studying the
dominant forms of platform content through most engaged-with content over time) or
the sense of looking at the methods of the medium (Rogers, 2013; 2019). His
comprehension of medium specificity is one from an epistemological standpoint – or
one of method, rather than from an ontological perspective (properties and features).
For this reason, he claims, Internet research should be reoriented and thought of as “a
source of data, method, and technique” (p.27).
While I will try to introduce conceptually the basis of medium-specific research in
digital methods, this approach can only be fully comprehended through practices. The
first task of this approach is to follow and learn from the medium and its methods.
This means addressing questions such as: how does the platform work? How do people
engage with it? Which digital records are available and how are they handled? The
second task is to take the answers to these questions as new ways in which data,
methods and computing techniques implemented online can be used, “then think
through what kinds of other sorts of research can be done with them, how these sorts
of techniques can be repurposed” (Rogers, 2010, p. 259). These two tasks provide a
summary of what it means to follow, learn and re-purpose the medium when using
digital methods. A practical example of that is provided in the last section of chapter
2, and the case studies in chapters 4, 5 and 6 also follow the basis of medium-specific
research.
Taking technological grammar into account
Some existing quali-quanti digital methods approaches show how to repurpose
dominant media and their corresponding methods, as well as data. Rather than going
through detailed techniques, project recipes or successful research design protocols
(see Bounegru et al., 2017; Rogers, 2019; wiki.digitalmethods.net;
smart.inovamedialab.org), but without ignoring the importance of the latter, I wish
to emphasise the chain of knowledge and actions prior to any data collection or the
making of research questions. This means comprehending what is being re-purposed
and for what; at the same time, it means thinking about what can serve as a point of departure
to query platforms (such as building lists of keywords of organisations, hashtags,
image URLs, etc.) and the use of a selection of software/tools as work material. Figure
1.1 helps us to better understand digital methods’ qualitative fronts, exposing what we
need to know before we start formulating the research questions.
For instance, and as Rogers (2019) suggests, when knowing in advance that Google
directs, by default, all search entries to the local domain (e.g. google.co.uk), thus returning
the results in the local language (including advertisements), one may raise a few
questions about what types of sources are returned. In this line of thinking, he argues
that one can study societal concerns through looking at Google search results as well
as enquiring into its ranking system for medium-led research. For that matter, Google’s
sense of the ‘local’ enables researchers to conduct cross-country analysis “by showing
the extent to which Google returns transnational, regional or some (other) combination
of results in its local domain engines” (p.115). One could look at the top 100 sites per
country as a way of profiling different countries, detecting which countries would rely
on the mega-upload sites or local-based domains, as Rogers explains:
Have you looked at the top 100 sites per country? It’s interesting in
the sense that you can profile a country according to what kind of
sites are in that top 100. Which kinds of countries are relying on the
mega-upload sites? Just to give you one short example. So one can
think about different sorts of Web indicators for ideas about the
societal condition. (Rogers, 2010, pp. 259–260)
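The kind of cross-country indicator suggested here can be sketched as the share of locally hosted sources in a list of results. The following is a minimal illustration under stated assumptions, not a method taken from Rogers: the result lists are invented, the function names are hypothetical, and a source is crudely treated as “local” only when its hostname ends with the country-code top-level domain, which misses local sites hosted on generic domains such as .com.

```python
from urllib.parse import urlparse

# Hypothetical result lists: country-code TLD -> URLs returned by that local engine.
# The selection of URLs is invented for illustration.
results_by_country = {
    "pt": [
        "https://www.publico.pt/a",
        "https://en.wikipedia.org/wiki/X",
        "https://www.rtp.pt/b",
    ],
    "br": [
        "https://www.youtube.com/watch",
        "https://g1.globo.com/c",
        "https://www.uol.com.br/d",
    ],
}


def is_local(url, cc_tld):
    """Crude heuristic: a source is local if its hostname ends with the country TLD."""
    host = urlparse(url).hostname or ""
    return host.endswith("." + cc_tld)


def local_share(urls, cc_tld):
    """Return the fraction of URLs whose hostname falls under the country TLD."""
    if not urls:
        return 0.0
    return sum(is_local(url, cc_tld) for url in urls) / len(urls)


shares = {cc: local_share(urls, cc) for cc, urls in results_by_country.items()}
```

Comparing such shares across local domain engines gives only a first indication of whether results lean towards local or transnational sources; the heuristic itself would need to be refined through close reading of the returned sources.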
Figure 1.1. Seeing the qualitative fronts of digital methods: the re-purposing of dominant
devices and available digital grammars: what for, points of departure and work material.
(Based on Omena et al., 2020; Rogers, 2019)
Using the Google Vision API is another example of repurposing the medium and its
methods. Machine vision web services were not originally made for research purposes,
but they may be of great assistance in analysing large image datasets or studying image
circulation (see D’Andréa & Mintz, 2019; Omena et al., 2019; Ricci, Colombo,
Meunier, & Brilli, 2017; see also chapters 4 and 5). One can also interrogate pretrained machine learning models through closely looking at their outputs and, for
instance, comparing different services and their capacities and limitations for labelling
a collection of digital images (see Silva et al., 2020; see chapter 6). When re-imagining
technological grammars, for instance, one should consider how social media capture
and reorganise hashtags and their different modes of action to study collectively
formed actions (see chapter 4). In the same spirit, one can use YouTube video/channel
content, creators and related metadata to map issue networks (see Rieder, Coromina,
& Matamoros-Fernández, 2020) or to study the recommendation system of the
platform through the visual exploration of networks (see Omena et al., 2020).
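Comparing how different vision services label the same collection of images can be sketched as a measure of overlap between label sets. The example below is only an illustration under stated assumptions: the service names, image identifiers and labels are hypothetical, labels are assumed to be lowercased beforehand, and the Jaccard index is used here as one simple overlap measure rather than as the procedure followed in the cited studies.

```python
from itertools import combinations

# Hypothetical outputs: service name -> image id -> set of (lowercased) labels.
labels_by_service = {
    "service_a": {
        "img1": {"person", "smile", "beach"},
        "img2": {"food", "plate"},
    },
    "service_b": {
        "img1": {"person", "beach", "sky"},
        "img2": {"food", "dish"},
    },
}


def jaccard(labels_a, labels_b):
    """Share of labels two services have in common for one image."""
    union = labels_a | labels_b
    return len(labels_a & labels_b) / len(union) if union else 1.0


def pairwise_agreement(outputs):
    """Mean Jaccard overlap for every pair of services, over the images both labelled."""
    agreement = {}
    for first, second in combinations(sorted(outputs), 2):
        shared_images = outputs[first].keys() & outputs[second].keys()
        scores = [
            jaccard(outputs[first][image], outputs[second][image])
            for image in sorted(shared_images)
        ]
        agreement[(first, second)] = sum(scores) / len(scores) if scores else None
    return agreement


agreement = pairwise_agreement(labels_by_service)
```

A low overlap does not say which service is right; it signals where the outputs diverge and therefore where a closer, qualitative look at the images and labels is needed.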
Through these examples, though not exhaustive, we primarily learn that the baseline
scenario of digital methods refers to technical knowledge about medium forms and
functions, the cultures of use inherent to Internet platforms, as well as the software
outputs and content. Consequently, when re-purposing the medium, digital methods’
approaches (and respective techniques) also repurpose social research in return
(Venturini, Jacomy, Meunier, & Latour, 2017) but they require, in addition, a vision of
technicity. In all cases illustrated, the key points of departure for the implementation
of the methods relate to the researcher’s ability to search as a form of research and
to make lists of available digital records such as hyperlinks, hashtags and URLs. Not
separated from this is the technical and practical knowledge about the work material
(computational mediums) as both instruments (to perform a particular piece of work
e.g. data collection) and beings that function (beyond participating in each stage of the
research, they add concepts and technical content to the object of study, re-adjusting
and re-shaping it). I will return to this matter in chapters 2 and 3.
To take technological grammar into account means that we should take the methods of
the medium and available digital records as part of the research design, not only as
instruments but as active and influential participants.
Doing Digital Methods
The art of querying
The act of querying platforms and query design are at the core of digital methods. The
technique of building lists of words to be used as keywords or issue language informs
the foundations of digital methods (see Rogers, 2017, 2019). Keywords are here
understood as “the connections people are currently making of a word or phrase,
whether established or neologistic” (Rogers, 2019, p. 37). In this way, whether and
how people/organisations/tech-companies are using/engaging with keywords matters.
In Internet platforms, keywords can become search queries formed by hashtags, video
or channel ids, username accounts, image URLs, among others. These queries can be
used to collect and see things and their respective relations, connections,
representations or controversies, which explains why the technique of building a list
of keywords is the very starting point of digital methods.
Query design, therefore, is neither a trivial practice nor a purely technical question,
precisely because keywords in digital methods denote positioning efforts (programs
and anti-programs; see Akrich and Latour, 1992) as well as neutrality efforts (Rogers,
2017; Rogers, 2019). Therefore, query design should be synonymous with spending time
navigating the platform to explore one’s subject of study: to monitor, to collect data
and to conduct some (visual) exploratory analysis. After all, “the ways in which actors
label the phenomena in which they are engaged can be subtle and complicated”
(Venturini et al., 2018, p.18). For instance, one may think of #coxinha and #mortadela
as random food-related hashtags, but when situating these tags in the context of
Brazilian impeachment-cum-coup protests in 2016, the hashtags’ meaning shifts to
pejorative nicknames for antagonistic protesters (see chapter 4). This only reinforces
the need to dedicate some time to defining our query design, instead of, as is most
common, going for what seems logical in our mind or for popular and trendy words that may not
reflect the actual language in use.
On the one hand, query design exposes a close relation with data collection methods (e.g.
manual, API calling, crawling, scraping) and exploratory data analysis, and on the
other, it justifies that the choice of words not only matters but also requires scholars to
consider search as research (see Rogers, 2015). That is, the formulation of specified
and underspecified queries to make research findings with engine outputs (Rogers,
2013; Taibi et al., 2016), together with the researcher’s capacity to make good queries. In other
words, this is a technique of building search queries as research questions (Rogers,
2019), which is not an easy task, as the forms and cultures of use of platforms are
constantly changing and so are the ways in which platforms impose, capture and
reorganise digital records.
The two types of queries serve different research purposes. When specified (e.g. “white
lives matter”, “#foraBolsonaro”), the query is used for studying dominant voice,
commitment and alignment. When not sufficiently detailed (e.g.
“abortion”, “quarantine”), the underspecified (or ambiguous) queries serve to uncover
differences and distinct hierarchies of societal concerns (see Rogers, 2019). In practice,
for instance, when defining a list of hashtags to study politically polarised debates, this
process should draw on immersive observation of the context and on previous
exploratory analysis, such as using co-hashtag networks, Excel’s pivot tables and basic
formulas (e.g. VLOOKUP). This practice helps the researcher to verify whether the
chosen hashtags have a clear connection with the topic, and also to detect when
hashtags may be indicators of new connections on the topic or counter-reactions (see
chapter 4).
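The exploratory step described above (checking which hashtags travel together before fixing a query list) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the captions are invented, the function name co_hashtag_counts is hypothetical, and the counting stands in for what one would otherwise do with a pivot table or a co-hashtag network in Gephi.

```python
import re
from collections import Counter
from itertools import combinations

# Matches a hashtag and captures the tag text without the leading "#".
HASHTAG = re.compile(r"#(\w+)")

def co_hashtag_counts(posts):
    """Count how often each pair of hashtags appears in the same post."""
    pairs = Counter()
    for text in posts:
        # Lower-case and de-duplicate so that "#Coxinha" and "#coxinha" match.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        for a, b in combinations(tags, 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical captions, invented for illustration only.
posts = [
    "Na rua hoje #coxinha #impeachment",
    "#mortadela #golpe na avenida",
    "#Coxinha #impeachment #brasil",
]

counts = co_hashtag_counts(posts)
for (a, b), n in counts.most_common():
    print(a, b, n)
```

Pairs that recur across many posts suggest hashtags belonging to the same program (or anti-program), whereas unexpected pairs are candidates for closer reading before they enter the final query list.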
To facilitate the process of making research questions with lists of words to be used as
keywords or issue language, Rogers (2019) suggests that, when formulating queries,
“it is pertinent to consider keywords as being part of programmes, anti-programmes or
efforts at neutrality” (2019, p. 28). Chapter 4 brings the case of political polarisation
in Brazil, exemplifying how a good list of program and anti-program hashtags was
built; while chapter 6 relies on underspecified queries to perceive how different
nationalities (such as Brazilians, Nigerians, Austrians and Portuguese) are depicted by
stock images (Shutterstock and Adobe Stock).
In this line of thought, querying platforms and searching as research serve as a form
of mapping or locating issue networks (Rogers, 2018; see also chapters 4 and 6), and
of studying trends, dominant voice, commitment and alignment (see Rogers, 2018,
2019, also chapter 5)14. Once again, we are faced with what is expected when building queries
as research questions: an understanding of the ways and words adopted by different
groups that are part of the same phenomena. On the one hand, the search queries we use
influence what we can obtain from digital platforms and social media and,
consequently, the records available for the analysis. On the other, the search queries
we use have a direct impact on the way we set research questions as well as on the
ways we choose to respond to them.
So far, we have learnt that thinking along with the medium reflects and requires having
a good knowledge of the medium and mastering the art of querying. This also means
that, when using digital methods, to be in direct contact with the subject of study stands
for surfing the studied environment (e.g. Instagram) but also interacting with the
content of study within this environment (both via navigating the end-user interface as
well as using research tools for the purpose of data collection and exploratory
analysis). Such a methodological proposal demands innovative ways of applying
methods because it works in an environment that is not made for academic research,
thus, posing significant challenges to digital research (see Ruppert, Law and Savage,
2013; Marres, 2017; Venturini et al., 2018). In the face of this, I will call attention to
the problem of digital methods illiteracy in the next sections, by exposing the working
material and technical practices related to these methods. In addition, I advocate data
14
When querying platforms, Rogers (2019) calls our attention to the use of quotation marks to avoid
“equivalent keywords”.
sprints as a form of learning the practice of digital methods. In chapter 3, I will return
to the problem of digital methods illiteracy, but this time describing what it means
to be in direct contact with the fieldwork, suggesting a way of carrying out digital
fieldwork.
Technical practices, the makers and users of software
Digital methods reunite available online data (hyperlink, retweet, timestamp, like, etc.)
and metadata, considering how platforms or search engines handle them. Hashtags,
emojis, Facebook reactions, lists of link domains, image URLs or retweets (among
others) are taken as collections of situated representations, through which we can look
at things (either part of dominant debates or more issue-specific subjects). They have
distinct orders of worth to different stakeholders’ interests and different forms of
appropriation (Gerlitz, 2016; Gillespie, 2010; Highfield, 2018; Marres, 2017), serving
as a basis for exploratory and experimental research with digital methods. When taking
available online data as work material, researchers are invited to recognise and deal
with what is often considered as a problem or bias of digital research - the media
instability and ephemerality as well as the incompleteness of online data. This also
indicates that researchers need to exert some vigilance (see Venturini et al., 2018, p.4)
but also engage with technical practices to repurpose the online data.
To store, manipulate and analyse online data, researchers must rely on another type of
working material: situated software15, that is, the development of techniques and
software applications for specific research questions16, and then the generalisation of
this application and their re-use in different research contexts (Rogers, 2010, p. 259;
Rogers & Lewthwaite, 2019). The use or re-use of existing tools is very common in
the everyday routine of many researchers and digital media-oriented laboratories,
15
In order to situate how software is used in the practice of digital methods, Rogers refers to situated
software, a term coined by Clay Shirky (2004), who is an American writer and thinker about the social
and economic effects of Internet technologies. Shirky developed this idea out of his teaching
experience at NYU's Interactive Telecommunications Program (ITP); he defines situated software as
“software designed in and for a particular social situation or context”. The full document was shared
in the "Networks, Economics, and Culture" mailing list, available at:
https://www.gwern.net/docs/technology/2004-03-30-shirky-situatedsoftware.html
16
If the operationalisation of digital methods is informed by “research problem-oriented software”
(see Rogers and Lewthwaite, 2019), we can also say that simply engaging with open source tools (for
extraction, analysis and visualisation) or mastering code skills are not necessarily related to repurposing the methods of the platform.
particularly because of the costs and challenges connected to making and maintaining
software/tools. The making of software in digital methods also serves the purpose of
interrogating the medium itself (see Rieder, 2020).
The Lippmannian Device17 (Fig. 1.2) is an example of situated software because it was
made for gaining “a rough sense of source’s partisanship and distribution of concerns”
(Rogers, 2019, p. 130). This web scraper queries different search engines (one at a
time), asking whether one or more keywords occur in different URLs. Before data
scraping, some choices have to be made such as opting for the total results per query
(maximum of 1000) and the search engine – e.g. Baidu, Bing, DuckDuckGo, Google,
Naver, Parsijoo, Seznam.cz, Sogou, Yahoo (Japan), Yandex. By selecting Google,
researchers can specify a number of parameters in the advanced options, e.g. the
domain, region, language, and also the period of time (e.g. past 24 hours, week or
month) and where the term(s) (e.g. “coronavirus”) appear in the article (e.g. anywhere,
headline, body of the text, URL). As a result of these decisions, the limit of requests to
Google can be reached when running the scraper. In these cases, data collection must
be monitored because, to keep the scraper running, researchers are constantly
required to answer CAPTCHA requests.
To illustrate the use of the Lippmannian Device for social research, for example, we
can ask how Portuguese Journalism has managed the initial period of the pandemic.
This case may start with a list of the main newspapers in Portugal18 (URLs) and
underspecified keywords related to COVID-19 (keywords, e.g. coronavírus,
quarentena, estado de emergência and pandemia). Then, after running the scraper over
different periods of time, it is possible to monitor the mentions of the keywords over
time according to Portuguese newspapers, and analyse related textual content by using
the word tree visualisation technique (see Wattenberg & Viégas, 2008) (see Fig. 1.3).
In this way, and as the Lippmannian Device proposes, one may gain a sense of how
the main newspapers in Portugal distribute attention to a set of issues concerning
COVID-19. Figure 1.3 illustrates that, showing on the left the number of times
(maximum of 350 per word) that the Portuguese newspapers (nodes) have mentioned
17
The scraper is named after the American journalist Walter Lippmann, author of Public Opinion,
which is a reference work in media studies. Rogers (2019) explains that Lippmann called for a
coarse means of showing partisanship in his book The Phantom Public (1927).
https://wiki.digitalmethods.net/Dmi/ToolLippmannianDevice
18
Based on https://en.wikipedia.org/wiki/List_of_newspapers_in_Portugal.
coronavírus, DGS, estado de emergência, quarentena and pandemia. On the right,
when looking at the sentences related to the word register, we see that the number of
Figure 1.2. On the left a screenshot of the Lippmannian Device by Digital Methods Initiative
(DMI) being used and, on the right, one of its output files (the .csv file) open in a
spreadsheet.
Figure 1.3. Exploring and visualising the .csv file provided by the Lippmannian Device. On
the left, the occurrence of words related to COVID-19 in major Portuguese newspapers
between March 26 and April 2, 2020. Beeswarm plot by RawGraphs. On the right a word
tree containing the initial part of the article descriptions identified by the scraper when
searching for the keywords. Word tree by Jason Davies19.
19
https://www.jasondavies.com/wordtree/
deaths and the new cases of COVID-19 were widely reported in the news between
March 26 and April 2, 2020. Besides some insights on the object of study, which
however is not what I want to highlight here, and the illustration of situated software,
this example also tells us that other types of software are required in the practice of
digital methods. That is, researchers need to deal with practices other than data
extraction.
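The kind of overview discussed in this example (how often each source mentions each keyword) rests on a simple counting logic, which the following Python sketch makes explicit. It is neither the Lippmannian Device nor a description of its code: the source names, the headlines and the function mentions_per_source are hypothetical, and the records stand in for whatever an extraction tool returns.

```python
from collections import defaultdict

def mentions_per_source(records, keywords):
    """Return {source: {keyword: number of records mentioning it}}."""
    # Every new source starts with a zero count for each keyword.
    table = defaultdict(lambda: dict.fromkeys(keywords, 0))
    for source, text in records:
        lowered = text.lower()
        for kw in keywords:
            # Plain substring match; a real study may need stricter matching.
            if kw.lower() in lowered:
                table[source][kw] += 1
    return dict(table)

keywords = ["coronavírus", "quarentena", "pandemia"]

# Hypothetical (source, headline) pairs, invented for illustration only.
records = [
    ("jornal-a.example", "Coronavírus: novos casos durante a quarentena"),
    ("jornal-a.example", "Pandemia obriga a novas medidas"),
    ("jornal-b.example", "Quarentena prolongada"),
]

result = mentions_per_source(records, keywords)
for source, counts in result.items():
    print(source, counts)
```

A table of this shape is what is then handed over to visualisation software, which illustrates why the practice of digital methods extends beyond data extraction.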
In regard to software, the development of situated research tools20 is then justifiable
and even imperative; such tools likewise demand to be tested and used in a situation
and context. In this sense, the methods require technical creators (programmers,
methodologists, analysts, designers) and technical practices (either through software-making or software-using), both driven by “how to” questions as well as sociological
imagination21. Table 1.2 provides a comparative perspective on software makers and
users who, in the practices of digital methods, pay particular attention to the technical
content of software through technical practices, as Rieder (2020) suggests from the
perspective of a developer. Here, I am suggesting a demystification of technical
practices in the context of digital methods, often regarded as a speciality of developers
or tool makers who are users of programming language. Non-developer researchers
such as methodologists, analysts and designers also take part in technical practices,
particularly when they create methods of research or provide methodological innovation
through software-using. We are thus recognising different forms of technical practices
that can be based on the making or the usage of software.
In alignment with Rieder (2020), a good understanding of software would not be
provided by the contextual comprehension of how social, political or economic
issues have become entangled with algorithmic information ordering, for instance, but
instead through the “basic materialities and conditions of production” of software
(p.33). Here, however, I would like to devote attention to software potentialities in
solving specific problems or research questions when using digital methods. That is
20
Although building tools in digital methods is synonymous with a specific and situated research
purpose, the international academic community has certainly benefited from its open source tools.
21
“Digital methods, however seek to introduce a sociological imagination or a social research outlook
to the study of online devices” (Rogers, 2015, p. 2).
particularly related to software conditions of use, rather than their conditions of
production.
Technical creators and practices in digital methods

Technical practices
   Software-makers (developer researchers): The practice of software-making: building technical objects and taking into account “computing’s highly layered and modular character” a as technical creation
   Software-users (non-developer researchers): The practice of software-using: building methods of research with, through and about medium technical specificity as technical creation (a methodological creative use of software)

Researcher as
   Software-makers: Users of programming language; Makers and coordinators of software particular forms of practices; Builders of tools/software
   Software-users: Users of software; Coordinators of software particular forms of practices in methodological processes; Builders of methodological innovation

Attention to
   Both: The technical content of software - materialities, (particular) potentialities, functioning, outputs
   Software-makers: Conditions of production and implementation; “The methods and mechanisms that constitute and inform operation” (p.53)
   Software-users: Conditions of use and implementation; Software operation and what does it mean?

Technical Culture
   Software-makers: “A deeper understanding of software and software-making” as a response to the challenge posed by the emergence of a technical culture
   Software-users: A practical understanding of the active role of software and software-using in the full-range of digital methods as a response to the call for a technical culture in digital research

Table 1.2. Technical creators and practices in digital methods (adapted from Rieder, 2020).
When working with software, researchers ought to pay serious attention to the content
of software, understanding its materialities, (particular) potentialities, functioning and
outputs, while being aware that part of software technical substance “sits at the centre of
technical practice” (Rieder, 2020, p. 54). This includes, as suggested in table 1.2,
the use of programming language and digital infrastructures to create tools or, by using
software, to create new methodologies. Although sharing concerns with software
technical substance, the makers (developer researchers) may care about “the methods
and mechanisms that constitute and inform operation” (p. 53), whereas the users (non-developer researchers) seek to understand software operation.
In the practices of digital methods some questions are raised, yet they are answered in
different ways. For instance, what does the software do? For what purpose? How does
it work (affordances, limitations and potentialities)? What are the software’s conditions of
production and use? What is required for its implementation? That is to say that the
range of digital tools necessary to the realisation of digital methods calls for a certain
technical expertise when making or using software, confirming that “software
demands an engagement with its technicity and the tools of realist description” (Fuller,
2008, p. 9).
While the matter of technicity will be discussed in chapter 2, I want to turn attention
to the peculiar vision of digital methods as a technical ensemble (suggested in figure
1.4 and more specifically exemplified in chapter 3, figure 3.13). That is, when
applying the full range of digital methods, the researcher is not dealing with “purposeful
assemblages” or cascades of devices and inscriptions (data), as suggested by Ruppert,
Law and Savage (2013), nor are they working with “an assemblage of research
methods” as defined by Helmond (2015b, p. 20). But, by placing the possibilities and
potentialities of a range of software into action and with one final purpose, the
researcher becomes a builder and coordinator of a technical ensemble. To start
understanding this, we may look at the workflow of the methods in figure 1.4. I will
return to this discussion at the end of this chapter and in chapters 2 and 3.
The process starts with the existence of grammatised actions22 (e.g. tagged content,
shared links, liked publications, followed accounts) made available through social media APIs (Fig.
1.4). These grammars are not only aligned with current forms of appropriation and
usage of a given platform culture, but also entangled with its mechanisms (e.g. most
recent tagged-posts or stories made available by Instagram, related videos by YouTube
or top 100 ranked URLs by Google, recommended apps by Apple’s App Store).
Extraction software is thus required to access contents, for instance when using
YouTube Data Tools (Rieder, 2015) to communicate with the YouTube API. This is a task
that occurs between software (YTDT and the YouTube Data API) but which responds to the
intervention of a researcher who is aware of its possibilities and restrictions. Such a form of
communication is, perhaps, an unnoticed detail in the face of the ease with which one
can pull data from digital platforms. The restrictions can refer to data access regimes
and, consequently, data issues such as completeness, consistency and architectural
complexity that must be taken into consideration (see Bucher, 2013; Ho, 2020; Rieder,
Abdulla, Poell, Woltering, & Zack, 2015). But that is something one can verify in API
documentation or in the extraction software outputs.
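To make this software-to-software communication less abstract, the sketch below assembles (without sending) a request to the search endpoint of the YouTube Data API v3. It is an illustration under stated assumptions and not a description of how YouTube Data Tools is implemented: the function build_search_url is hypothetical, "YOUR_API_KEY" is a placeholder, and the parameters shown are only a small subset of those listed in the API documentation.

```python
from urllib.parse import urlencode

# Search endpoint of the YouTube Data API v3.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"

def build_search_url(query, api_key, max_results=25):
    """Assemble a search request URL; nothing is sent over the network."""
    params = {
        "part": "snippet",          # return basic metadata for each result
        "q": query,                 # the researcher's query design enters here
        "type": "video",            # restrict results to videos
        "maxResults": max_results,  # the API documents an upper limit of 50 per call
        "key": api_key,             # placeholder; a real key is issued by Google
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)

# A specified query, kept in quotation marks.
url = build_search_url('"white lives matter"', "YOUR_API_KEY", max_results=50)
print(url)
```

Even in this reduced form, the request shows where the restrictions mentioned above take hold: the number of results per call, the need for an access key and the fields that the platform chooses to return.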
22
When online activity is rendered, organised and re-arranged by digital platforms. In chapter 3, I will
address the notions of grammatisation and grammatised actions.
Different data format files are often made available such as comma-separated value
(CSV), JavaScript object notation (JSON), graph dataset format (GDF) and tab-separated value (TAB). This work material is, then, transposed to mining, analysis and
visualisation software such as Excel, OpenRefine, Gephi, Tableau and ImagePlot. At
this point, we can clearly see layers of technical mediation that take place in the
practical work using digital methods which should be considered as relevant in the
research process. In addition, by offering “a general research strategy, or set of moves,
that have certain affinities with an online project, mash-up, or chain methodology"
(Rogers, 2019, p.10), the methods, consequently, impose a certain proximity with the
software.
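One of these layers of technical mediation, the passage from counted records to a file that network software can open, can be sketched as follows. The example writes a small weighted edge list in the GDF text format that Gephi reads (a "nodedef>" section followed by an "edgedef>" section). The function to_gdf and the hashtag pairs are hypothetical and invented for illustration; extraction tools define their own columns.

```python
def to_gdf(edge_weights):
    """Serialise {(node_a, node_b): weight} as GDF text."""
    # Collect every node that appears in at least one edge.
    nodes = sorted({node for pair in edge_weights for node in pair})
    lines = ["nodedef>name VARCHAR,label VARCHAR"]
    lines += [f"{node},{node}" for node in nodes]
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,weight DOUBLE")
    lines += [
        f"{a},{b},{float(weight)}"
        for (a, b), weight in sorted(edge_weights.items())
    ]
    return "\n".join(lines)

# Hypothetical co-occurrence counts, invented for illustration only.
edges = {("coxinha", "impeachment"): 2, ("golpe", "mortadela"): 1}

gdf_text = to_gdf(edges)
print(gdf_text)
```

Saving this text with a .gdf extension would allow it to be opened in Gephi. The sketch assumes node names without commas, since quoting rules are left out for brevity.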
Figure 1.4. The workflow of social media analysis with digital methods (Twitter screenshot).
This visual workflow was also presented in a lecture given by Bernhard Rieder
at the Universidade Nova de Lisboa in 201523.
When looking at figure 1.4, it is possible to understand not only what digital methods
mean in practical terms, but also what would come along with them, and, to
comprehend that the methods prescribe a particular attention to technological grammar
while accounting for the content and particular forms of software practice (this
discussion will be elaborated in the following chapters).
23
Slides are available at https://www.slideshare.net/bernhardrieder/analyzing-social-media-with-digital-methods-possibilities-requirements-and-limitations.
On this basis, and according to such workflow, when questioning whether we were
trained to repurpose digital media and data for research in the same way we were taught
to apply online surveys or questionnaires, the answer would be no, mostly not, as I
have learned by both participating in and organising numerous data sprints. The
techniques used to build questionnaires, like the traditional sociological approach
(e.g. demographic features), do not fit the reality or the practices of digital platforms
(Venturini et al., 2018), since this work field uses different materials and methods.
There is a difference between making a list of questions for the purpose of gathering
information directly from people (questionnaire) and making research questions with
a list of keywords (natively digital objects) for the purpose of gathering information
within the web environment (query design).
The practice of digital methods requires researcher skills and knowledge different from
those demanded in the use of questionnaires, such as: the art of querying, the attention
given to both software technical substance and the layers of technical mediation in the
full implementation of the methods. The repurposing of digital media and data does not
end in the spreadsheet, or in following the advantages of statistical packages as we see
in the analysis of online questionnaires. Both analytical proposals require some
training but not in the same manner.
A checklist of questions related to the practice of digital methods
The best way to start understanding the impact of digital methods on social research is
to look at the forms in which research questions can be asked and answered. That is as
fundamental as the practical awareness of the medium effects and substance. The
following bullet points gather a checklist of questions, serving as guidance to think
more thoroughly on the practice of digital methods (based on Omena, 2019; Rogers,
2019; Venturini et al. 2018)24. This mode of inquiry goes in parallel with the technical
practices of digital methods, taking a more active role when the query design is
defined, while it also exposes the relevance of software in their operation.
24
See also the following interviews with Richard Rogers: http://densitydesign.org/2014/05/an-interview-with-richard-rogers-repurposing-the-web-for-social-and-cultural-research/ (conducted by
Michele Mauri from Density Design Lab) and
http://revistadisena.uc.cl/index.php/Disena/article/view/841 (conducted by Sarah Lewthwaite).
§ Basic questions with natively digital objects and methods:
Is the Internet a sphere to ground findings?
Are digital methods suitable for my research?
Do I have keywords (tags, URLs, link domains, retweets) as starting points? Crawling,
Scraping or API calling? Why?
Social media, search engines, web archives or other platforms? Why?
Are these “good” hashtags/hyperlinks/URLs/expert-list? Why?
What is the logic of the recommender algorithm?
What is captured by platforms’ APIs and how are records connected by/through them?
§ Questions related to building-lists:
What type of query? (e.g. specified, underspecified) Are there expert lists?
For dominant voice and concern: Who are the specific actors that give voice to a problem and
to its specific areas? What is considered and ignored? For how long?
For commitment: How about the longevity or durability of this concern? Are those concerned
committed? Which issues were fleeting?
For positioning and alignment: Who is using the same language? What particular keywords?
How is this problem specifically articulated and counter articulated?
§ Questions related to data collection and analysis:
For technological grammar: What platform(s)? Why? How does it work? Which specific
mechanism should I look at? (e.g. ranking system, recommendation algorithms) How are
digital records made available?
For software: Which software? Why? What are the entry points to collect data? What are the
technical specificities? What are the limitations? What type of files are required or generated?
For digital records: What records are available? How are they captured, re-organised and
made available? How can they be studied? Why? What media items or metadata can I get?
For the analytical decision: Who to look at? Why? How? (e.g. high-visible and/or ordinary
content, actors and practices) What to look at? Why? How? (link domains, engagement
metrics, comments, timeframe) Single or cross-platform analysis? Why? How? What for?
Networks, rank flows, grids, scatter plots, bee swarms or another type of visualisation to
explore my dataset?
For the-repurpose-of: What questions can be asked? How to take advantage of platform data
and mechanisms? What technical specificities count? Why? For what purpose?
Data sprints as a form of learning
Based on my own experience, I want to argue that fully understanding the
methodological proposal and philosophy of digital methods requires a process of
learning by doing in a collaborative and interdisciplinary environment facilitated by
data sprints. But first, a brief explanation of data sprints, which are defined as
“intensive research and coding workshops where participants coming from different
academic and non-academic background convene physically to work together on a set
of data and research questions” (Venturini et al., 2018, p. 1)25. In other words, data
sprints promote forms of implementing exploratory and inventive ways of reading,
seeing and analysing platform data, bringing “social scientists, developers and data
designers together with relevant domain experts to explore research questions and
create prototype digital methods projects” (Munk, Madsen, & Jacomy, 2019, p. 110).
Within a limited timeframe, usually around five intensive days, a data sprint offers
multiple forms of conducting research.
Due to the enormous time pressure, analytical decisions are to be made fast and wisely,
to avoid the risk of getting “wrong” results or “not well executed” projects (Rogers &
Lewthwaite, 2019). At the end of the week, projects’ results and preliminary findings
are presented and after the sprint, project reports are made available online, providing
time to develop and improve, as well as to finalise what was left incomplete
during the week due to time constraints. For these reasons, data sprints tend to serve
well-researched projects at different stages of development, providing rich and
substantial insights either for experimental and exploratory studies or for confirmatory
ones (Omena, 2019).
A pioneer in the data sprint approach is the well-known Digital Methods Initiative
(DMI) from the University of Amsterdam, which has two directors, Richard Rogers
and Erik Borra (technical director), and more than 13 editions of its summer schools26.
This initiative has inspired the creation of SMART Data Sprint at Universidade Nova
25
In the context of SMART Data Sprint, the following video was created with the purpose to explain
the data sprint approach: https://www.youtube.com/watch?v=bveMpEtAvug
26
See https://wiki.digitalmethods.net/Dmi/DmiAbout.
de Lisboa, founded and coordinated by myself and also considered a reference sprint,
which is now heading to its sixth edition27. This is the justification for using my own
experiences28 to validate my argument on data sprints as a form of learning digital
methods by doing. Over the years, I have seen how the data sprint approach can be a
valuable source for practically introducing the lines of research inquiry, techniques and
potentialities attuned to the digital methods proposal for (non) academic studies.
Additionally, for those who decide to make room for the computational medium and
its methods in the research process, data sprints have proven to be a change of course
in the way scholars think of digital research and make use of available digital records
and methods.
From this background, and despite normally being considered as a technique for
organising collaboration, data sprints are also a means of learning digital methods.
Participants learn by listening (keynotes) and most importantly by doing. For instance,
they learn how to do specific tasks and implement digital methods’ techniques, such
as how to: plot a collection of images, build image-hashtag networks, explore the
visual affordances of networks, query vision APIs or use software as a tool to solve
research problems. It is thus a golden opportunity for those who seek to learn from the
digital methods approach, and best used when participants prepare in advance. As a
form of preparation, before the sprint, participants are asked to follow tutorials, install
and explore the utilities of web-based plug-ins or software. By doing so, while there
and not starting from scratch, they can take greater advantage of the workshops.
Another way of learning is when engaging with a working group, where participants
tend to choose a project by affinity with the object of study or by the proposed methods
and techniques (or both). Projects can offer a good environment to understand not only
how to make research questions but also, in practical terms, how to respond to
these questions. During this process, the intervention and the role of information
designers are important in both teaching innovative visual methodologies and
generating adequate visualisations that will assist data visual exploratory analysis and
27
See https://smart.inovamedialab.org/.
28
My first contact with digital methods, in a data sprint environment, took place during the 2014 DMI
Winter School in Amsterdam. Since then, self-learning, participation in and organisation of multiple
data sprints gave me the experience in using digital methods for interdisciplinary research projects and
in helping to conceptualise and develop methodological steps that these and other projects are using in
different research areas.
the results of the projects (see Ciuccarelli and Elli, 2019; Mauri, Gobbo and Colombo,
2019). However, such a contribution is also expected from experienced researchers
who are familiar with the use of digital methods; thus, not restricted to the intervention
of the designers.
The hands-on methods and very specific techniques taught in the workshops, combined
with the conditions in which one can experience the full range of digital methods in a
collaborative environment, make it possible for researchers to implement the methods,
while they learn how to repurpose digital media and data through technical practices.
Data sprints are thus an invitation to face and solve analytical and technical challenges
in practical terms.
The many challenges of Digital Methods
In the previous sections we learnt that what makes the difference in digital methods is
an invitation to first learn from medium specificity (following its logics, forms and
dynamics) and, consequently, to repurpose what is given by the methods of Internet
platforms for social, cultural or medium research. When scrutinising online dominant
devices and their methods, particular techniques to formulate queries are required. Key
to this process is the researcher’s ability to define a list of words (e.g. URLs,
hashtags, video or image IDs, social media accounts) as issue language. Such an ability
underpins search as research, which is followed by a proper understanding and use of
the work material (digital records and software) and technical practices for these
methods. Under the premise of a medium research perspective, the functional logic of
work in digital methods thus invites researchers to think the subject of study in, with
and through a practical-technical research process.
Four underlying principles
Figure 1.5 illustrates what was discussed in the previous section but also describes
what is at stake in the implementation of digital methods: the ability to think
the subject of study through different but interconnected operations, such as mastering
the art of querying platforms while understanding platform grammatisation (see
chapter 3). It equally involves knowing that the methods deal with thick layers of
technical mediation, as they impose a certain proximity with the software. All
operations are interconnected. That means, for example, that grammatised actions
inform the making of queries as research questions (query design) and, at the same
time, are reflected in research software and visualisation models. These four operations
correspond to the implementation of four key principles that underpin the practice of
digital methods.
Figure 1.5. Four key principles that underpin the practice of digital methods.
(adapted from Omena, 2019).
The first principle assumes that methods take an interdependent position in the
research process, from its conception to the decisions made during the analytical
procedure; while in the second, we learn that the platform infrastructures play active
roles in research decision-making processes and should be accounted for. After all,
platforms are not intermediaries; their mechanisms intervene, shape and organise what
we see (Gillespie, 2015, 2018b) and, consequently, how we read the subject of study.
Through these lenses, once again, we understand that one cannot study society through
digital platforms without studying the platform itself. The third principle is the
requirement of (a minimum of) practical expertise in applied research with digital
methods, for example the ability to extract, mine and visualise data. The fourth
principle concerns understanding digital methods as both interpretative and
quantitative. This will be demonstrated through the different methodological approaches
and case studies I propose in this dissertation, in particular those concerning hashtag
engagement research, digital network studies and medium research.
These principles reflect what drives and what is always present in the implementation
of digital methods: an imaginative, collaborative, and experimental endeavour. That
was something implicit in the previous section. However, this reality also exposes
some practical problems: how can one specifically learn from online objects
and methods? How can we recognise and apprehend “the mode switch” (Rogers, 2019)
required by digital methods practices? To learn from the medium and recombine, reuse or
repurpose its methods is no simple task. In this respect, I will suggest an
unusual way of unpacking digital methods, using problems as the point of departure and
adjectives for a more accurate description.
Characterising technical knowledge and practice in digital methods
In the light of my experience, I will now broadly address the problems of technical
knowledge and practices in digital methods. To help in this effort, a list of adjectives
will be used to introduce a reality rarely emphasised in digital methods’ literature.
Overwhelming.
Uncomfortable.
Challenging.
Unpredictable.
Demanding.
Complex.
Extremely challenging.
Fascinating.
Digital methods are overwhelming. To carry out research based on this method, one is
supposed to have basic knowledge about platform mechanisms: to know how online
devices treat web data; to formulate queries as research questions; to ground findings
based on “the uncontrolled methods-of-the-medium” (Rogers, 2019). There is another
implicit requirement, that of using research software while getting familiar with the
web environment. Consequently, when designing research, many practical questions
arise: where to start? Which platform and why? Which tools? How to make research
questions as queries? How to collect data? What is next? How to map issue networks
using these methods?
These concerns lead us to another issue: how uncomfortable and challenging digital
methods can be. The presence of how-to questions requires constant learning about
both thinking along with the medium and knowing how to handle it in practice. This
reminds us that research methods evolve with the context of the technical mediums.
Therefore, by acting in conjunction with as well as in response to existing
mediums, digital methods take researchers out of their comfort zone. The methods are,
thus, driven by cascading and cyclical processes. Adding to that, and since researchers
depend on “the availability and exploitability of digital objects” (Rogers, 2013, p.1),
we perceive digital methods as an unpredictable approach. As we know, the web
environment is live and not static, which means the methods we use to
capture such a landscape may be short-lived. So, again, the methods come with another
implicit requirement: monitoring. There is no way around this, as one cannot criticise
the continuously changing platform mechanisms or study social phenomena embedded
in such infrastructures without watching a situation carefully for a period of time (in
the web environment), while being aware of the existence of suitable tools capable of
collecting and working with online data and methods.
Therefore, the methods work with instability by default, which is, rather than
something negotiable, a starting point. However, by default, who wants to deal with
instability or cope with unpredictability? By default, who wants to feel obliged to
monitor the subject of study on the web or learn technical stuff? Who wants to
understand software? When using digital methods, there is no option but to face several
degrees of uncertainty, and this reality is certainly not a welcoming research scenario
for scholars who are used to working with web data as if it were survey data. In these
cases, the major problem is to ignore the entanglements of data with medium
specificity and software functioning, which is why digital methods, being so
demanding, require researchers to have a minimum technical knowledge.
To further complicate this issue, these methods are not exclusively about data
practices. The difficulties encountered in the use of digital methods are not only
technical, but also conceptual. At this stage, knowledge and findings are constantly
mediated but also informed by tools inserted in a network of methods. Every stage of
the process impacts another and, as it goes, the tools carry their modes of being
entangled with the researcher’s analytical decisions (successes and errors) in the
subject of study-related content. This also means digital methods deal with an
environment inhabited by technicity. That is a complex task and the first contact with
these methods can be extremely challenging (although always fascinating), because
one cannot tell from where to begin, what to look at and how to cope with the full
spectrum offered by the methods.
The list of adjectives can help us to comprehend that research results and findings
using digital methods also correspond to knowledge about the conditions, methods and
specificities of the medium. On top of that, there is the mandatory call for working
with a set of tools/software. In accordance, and in the spirit of Hoel (2012), when using
digital methods, we should consider that knowledge is not only acquired through
symbolic forms but primarily through material instruments (e.g. from the methods of
platforms to research tools) combined with practice (e.g. knowing how to). This is a
process in which “theoretical-practical-technical modes of reasoning interpenetrate
each other not only sometimes but in principle” (Hoel, 2012, p.75).
Next, I will discuss the problem of knowledge in digital methods (both technical and
practical) by questioning how we can develop a research mindset that might help us to
think along with the medium. Moreover, I want to argue in favour of re-thinking the
definition of medium in digital methods. A discussion about the missing pieces of
digital methods will be held, with the purpose of supporting my argument,
demonstrating, also, how digital methods deal with an environment inhabited by
technicity.
A call for a broader definition of “medium” in digital methods
In a recent interview, Richard Rogers emphasises the importance of staying in digital
methods’ mind-frame and then continuing to work on a piece of research within this mind-frame.
He is certainly referring to the repurposing of digital media and data. In fact,
finding a way to such a mindset may be the greatest challenge of the digital methods
approach. It is something less related to knowing how to code or do things with
software and more related to a certain proximity with the software in its own ways.
How can one keep working on a piece of research within digital methods’ mind-frame?
How can one concretely learn from the media and, then, repurpose their methods and
data for social and medium research? In answering these questions, we may fully
benefit from the mindset underlying the philosophy of digital methods. When doing
this, we also fill a missing part in the methods and in digital social research as well,
which is something that forces us to develop “a digital Bildung” (Berry, 2011;
Rieder & Röhle, 2018). In other words, as Rieder and Röhle (2018, p.123)
argue, “we have to be able to think with and in technology as a medium of expressing
a will and a means to know”.
To further complicate matters, and under the requested mind-frame, the methods invite
us to rethink the conditions of proof in digital research. Rogers (2019) presents two
main ways in which researchers could do that. First, by assuming online as a site of
grounding and, secondly, by questioning “whether the medium, or media dynamics is
overdetermining the outcomes” (p.21). By taking up this proposal, and considering the
complete assembly of the methods, how is one supposed to repurpose dominant
devices and data without considering other mediums which are part of the methods?
Here we come across another missing piece, the call for a broader definition of
the medium in digital methods. As we learnt, thick layers of technical mediation are
inherent to the methods, as illustrated in figure 1.4 (see also Rieder and Röhle, 2018).
That is to say that, when using digital methods, beyond recognising platform
mechanisms (ranking, crawling, scraping), we should also consider research software
(e.g. Gephi) and web-based applications (e.g. YTDT, vision APIs) as mediums to be
learnt from and repurposed, since these mediums have particular forms of practices
and modes of operation that take an active role in digital research.
Researchers are thus required to pay attention to the content of software because it
points not only “to concepts – but also to objects, practices and skill sets” that Rieder
and Röhle consider to have considerable internal heterogeneity and variation (2018, p.
10). By enhancing the notion of medium in digital methods, as well as its effects, we
may read Rogers’ statement from a different perspective:
With Digital Methods one of the key points is that you cannot take
the content out of the medium, and merely analyse the content. You
have to analyse the medium together with the content. It’s crucial to
realize that there are medium effects, when striving to do any kind
of social and cultural research project with web data. (Rogers, 2014)
The medium in digital methods is being perceived in a broader sense here, not only “for
clues and guidance” (Rogers, 2010, p.249), but also as real substance in research. We
thus may want to consider not only platform mechanisms but also research software and
web-based applications as “a medium of expressing a will and a means to know” (Rieder
and Röhle, 2018, p. 123, see also Rieder, 2020). In other words, when doing digital
research, we should also include the forms held within a computational medium, as
David Berry (2011) suggested to us almost 10 years ago. Generally, what is hereby
proposed may further complicate the problems of knowing how and knowing why when
dealing with/using digital methods. In 2013, Evelyn Ruppert, John Law and Mike
Savage were already warning us that digital devices demand “a better analytical grasp”,
one not offered in social theory, or in technical accounts of method (2013, p. 30). The
authors were referring to the capacity of exploring “fields of devices as relational
spaces”, “the chains of relations and practices enrolled in the social science apparatus”
(Ruppert, Law & Savage, 2013, pp.40-41). After almost a decade, the call for such an
analytical grasp has become fundamental in digital research and even more specialised,
although still poorly documented in technical and practical accounts of method.
This helps us to identify another missing piece of the digital methods approach that I want
to emphasise here, something related to the role played by technical and practical
knowledge of the mediums: in other words, the need for a practical awareness of the
technical reality of the (computational) mediums involved in the methodological
process. Figure 1.6 illustrates that in a very old-fashioned way by describing a
step-by-step protocol required to build a network of the followers-of-the-followers of an
Instagram bot profile (Mary__loo025), which speaks to bot detection techniques and
studies. The most interesting finding in this example was the detection of private bot
accounts as central and bridging nodes within the network. Figure 1.6, purposefully,
shares what precedes the network visualisation as its substance, rather than showing a
beautiful network29.
This result was made possible by a combination of factors, such as: previous
exploratory work on Instagram bots using digital methods (see Omena, 2017; Omena
et al., 2019), techniques of interpreting digital networks (Venturini et al., 2015;
Venturini, Jacomy, & Jensen, 2019) and a good understanding of Instagram specificity
29
The network visualisation of this example is available at
https://wiki.digitalmethods.net/Dmi/SummerSchool2020GoodEnoughPublics
(e.g. [bot] accounts can be private or public) combined with Gephi affordances. This
context helped in deciding what should be highlighted within the network: the node
colour of private accounts. By doing so, when looking at the position of nodes, it was
possible to see and confirm the role of bots’ private accounts in the market of fake
followers. Here, a fundamental requirement was the researcher’s ability to think about
the subject of study (botted accounts and their agency) in the context of Instagram
cultures of use and grammatisation, beyond mastering the use of research software.
This triangular relation will be further discussed in chapter 3.
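The logic of this example can be approximated in a short Python sketch. It does not reproduce the protocol in Figure 1.6 or any tool's output: the follower lists, the account names and the set of private accounts are invented, and the "bridging" measure is a crude proxy (accounts that follow two or more collected accounts) rather than the reading of node position performed in Gephi. The column names `Id`, `Label`, `Source` and `Target` follow the convention commonly used for spreadsheet import in Gephi, which should be checked against the version in use.

```python
import csv
import io


def build_network_tables(followers, private_accounts):
    """Turn follower lists into node and edge tables.

    `followers` maps an account to the list of accounts following it
    (hypothetical, already collected data). Edges run follower -> followed.
    `private_accounts` is a set of account names known to be private.
    """
    nodes = set(followers)
    edge_rows = []
    for account, follower_list in followers.items():
        for follower in follower_list:
            nodes.add(follower)
            edge_rows.append({"Source": follower, "Target": account})
    node_rows = [
        {"Id": name, "Label": name, "private": name in private_accounts}
        for name in sorted(nodes)
    ]
    return node_rows, edge_rows


def candidate_bridges(edge_rows, minimum=2):
    """List accounts that follow at least `minimum` distinct collected accounts."""
    followed = {}
    for row in edge_rows:
        followed.setdefault(row["Source"], set()).add(row["Target"])
    return sorted(name for name, targets in followed.items() if len(targets) >= minimum)


def to_csv(rows, fieldnames):
    """Serialise rows to CSV text; saving it to a file is left to the reader."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data, for illustration only.
followers = {
    "seed_bot": ["acc_a", "acc_b"],
    "acc_a": ["acc_c"],
    "acc_b": ["acc_c"],
}
private_accounts = {"acc_c"}

node_rows, edge_rows = build_network_tables(followers, private_accounts)
bridges = candidate_bridges(edge_rows)
nodes_csv = to_csv(node_rows, ["Id", "Label", "private"])
edges_csv = to_csv(edge_rows, ["Source", "Target"])
```

Carrying the `private` attribute into the node table is what later allows the node colour to be mapped to account privacy, so the analytical decision described above is already prepared at the data-building stage.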
Through this short example, we may recognise a technical mental endeavour in
conducting research which goes side-by-side with a practical awareness of the
technical reality of the computational mediums involved in the methodological
process. Here is where the notion of technicity, entangled with the mindset of digital
methods (in thoughts and actions) arises and helps.
Figure 1.6. A description of technical knowledge and technical practices in digital methods30.
30
There are some steps that precede the use of PhantomBuster, for instance creating a research
account, buying followers and tracing them in order to make a list of botted accounts (profile URLs) to
be used as the entry point for data collection, in this case to create a network of Instagram bot
followers-of-the-followers.
In response to this, as I argue, the extra effort needed in digital methods would point
towards the development of a sensitivity to the technicity-of-the-mediums (see chapter
2), facilitating the understanding of digital methods as a technical ensemble: a
situation in which researchers put together and coordinate or create the
(computational) mediums which are part of the complete process of digital methods,
while they connect technicities for solving research problems. That is the situated and
practical perspective that this dissertation proposes for the digital methods approach, a
proposal that aims at methodological innovation in digital research, calling for
inventive/imaginative forms of research enquiry and practice.
In such a reality, the methods reinforce that researchers need to be aware of the
potentialities and limitations afforded by the mediums, while putting into action their
conditions of possibility. In this spirit, when implementing the methods, we need “to
test different ways and to experiment as many times as necessary”, learning by trial
and error in practical terms (Omena, 2019, p. 9). This process is refined and
transformed by repetition (practising), but mainly when we make room for medium
substance and ways of being. These are the requirements to fully benefit from the
advantages of the digital methods approach. These issues will be addressed in the
following chapter, which also questions how the concept of technicity matters in the
practice of digital/Internet research.
2 THREE ATTEMPTS TO UNDERSTAND TECHNICITY
CHAPTER 2
The most powerful cause of alienation in the world of today
is based on misunderstanding of the machine.
Gilbert Simondon, 1980, p.1.
If one wants to understand a being completely,
one must study it by considering it in its entelechy,
and not in its inactivity or its static states.
Gilbert Simondon, 2009, p. 19.
First attempt to understand technicity (Media Theory)
Technicity, a term originally borrowed from philosophy, refers to the relationship between
technology and humanity (or humans). When speaking of the concept of technicity, the works
of Martin Heidegger, Gilbert Simondon, and Bernard Stiegler31 are the most widely
acknowledged; these authors refuse to think of technology or technical objects as mere tools
because they play a pivotal role in society, and they are a constitutive part of ourselves and
our culture32. As a reference for a mode of reasoning about the relations between technology and
society, technicity alludes to the operative functioning of technology, that is its constitutive
mode of ordering or governing which simultaneously transforms society and what we are
(Bucher, 2012; Hoel & Van Der Tuin, 2013; Rieder, 2020; Rieder, Abdulla, Poell, Woltering,
& Zack, 2015; Simondon, 2017). Technicity thus refers to a field of knowledge which works
in theoretical, technical, and practical frameworks, but it is first and foremost technical in
essence (see Rieder, 2020).
The role of technicity for grasping technical mediations and established relations between
machines and humans has been a matter of concern in digital media field studies and related areas.
This section proposes to review the appropriations, purposes, framings, and conceptual basis
of technicity in different fields of study. I want to address questions, such as: Why does
technicity matter and in which ways? How are scholars defining and using this concept? To
31
Following these relevant philosophers of technology, James Ash (2012) summarises three ways of
understanding technicity: i) as a persuasive logic for thinking about the world - representing Martin
Heidegger’s thought; ii) as a mode of existence of technical objects - relating to Gilbert Simondon’s
philosophy; and, iii) as an originary condition for human life itself - the proposition of Bernard
Stiegler. Considering that mapping all these philosophical thoughts on technicity is neither an easy task
nor the goal of this dissertation, I will draw particular attention to what we can learn about
technicity from a very specific angle on the work of Gilbert Simondon combined with what we can
learn from the appropriations of this concept in digital media studies.
32
A non-Aristotelian tradition of thought on technology or one that is not based on the concept of
instrumentality (see Frabetti, 2011).
understand technicity, what must be looked at? How? What can we learn from that? By so
doing, and in a first attempt, I will introduce what technicity is.
Ways of thinking technicity in Digital Media field studies
In Game Studies, the media theorists Jon Dovey and Helen Kennedy (2006) introduced
technicity as a concept “to account for particular formations of identity and power which lie
at the heart of computer game cultures” (p.16). In their book Game Cultures, the definition of
technicity is taken from cyberculture studies but is also enhanced through the work of
Pierre Bourdieu. From the former comes the attention to attitudes towards
technology, its adoption and the corresponding practices; from the latter, the inclusion of
issues like taste and cultural capital. In this sense, the deployment of technicity would look at
a network of relations with software, from the ways specific skills in specific
technologies are privileged in a given context to the ways “in which specific
technologies bring us into new relationships with machines” (Dovey & Kennedy, 2006, p. 16).
The technicity of video games would then be unveiled by appreciating “the technologies and
techniques of production, design, implementation, and appropriation of videogame systems”
(Crogan & Kennedy, 2009, p. 113). However, when looking at techniques, the player, the
game and the environment are not isolated elements but relational. In practical terms, if one
wants to understand (or study) the technicity of video games, gameplay or game cultures, one
may look at different techniques such as those related to game design and play, to rules,
exceptions and practices, to reading and criticising games, or to the capability of understanding
cheating in digital games or gamers’ expertise (Crogan & Kennedy, 2009; Kuecklich, 2009;
Toft-Nielsen & Nørgård, 2015). The first insight we can take from Game Studies is this strong
bidirectional relation between the player, with particular skills in particular game practices,
and the game in itself and its forms of appropriations within a situation.
In a context in which the use of technology (the game) combined with a degree of expertise
matters, the player performance reflects such close relations between players and gaming. In
this respect, Toft-Nielsen and Nørgård (2015) say that the notion of technicity in gaming ties
into the very idea of the entanglements of the players’ kinaesthetic performance, competence
and expertise. For the authors, player expertise cannot be limited to technological
competence; rather, they “recast the purpose of corporeality and gender in relation to expertise”33. That
means that, through the lens of technicity, the authors understand “performing gaming
33
Here expertise “manifests itself through the kinaesthetic performance of moving hands and bodies,
expressing intimate and skilful rhythmic timing and patterns as gaming expertise” (Toft-Nielsen &
Nørgård, 2015, p. 356).
expertise as a compound concept in which identity, technology, gender and corporeality”
(p.355) take place.
Technicity is central not only in Game Studies; as Patrick Crogan and Helen
Kennedy34 argue, this notion is pivotal for the elaboration of “a more rigorous and focused
perspective on the theorization of technology” (p.107). In gaming
research, they argue, the players and their cultural and collective involvements should
be taken “as processes of becoming intertwined with lineages of technological
development and disjunction which are the condition of these processes” (p.107). To
grasp these processes, the authors explain, a focus on ludic technicity is required.
Dwelling on and in technicity seeks to sustain critical attention on the
processes through which the human, as always social, connected
individual—connected through techniques, technologies, and dynamic
traditions of practice—lives a particular existence. The dynamic of
technical development is the medium or environment of this becoming, and
individual and collective identity is at best a metastable state that
accommodates or regulates provisionally the flow of transformations in
human–technical relations. (Crogan & Kennedy, 2009, p. 109)
From this argument, we derive the idea that technicity does not exist in a particular
technical object/machine but is rather accommodated in a process that assembles
socio-technical relations and individuals. This conceptual perception becomes even
more tangible when appreciating technicity under the lens of Internet research.
In the context of platforms-software studies, Sabine Niederer and José van Dijck (2010)
uncover the role of technicity as a knowledge instrument by inquiring into Wikipedia’s
“dynamic nature” and accounting for its system of partially automated content
management. The understanding and analysis of Wikipedia as a sociotechnical
system, as the authors justify, are important steps towards a better comprehension of
“the powerful information technologies that shape our everyday life and the coded
mechanisms behind our informational practices and cultural experiences” (p.1384).
Thus, one of the key strengths of this way of studying Wikipedia is its consideration
of the technical specificities of how humans and bots contribute to the management of
content. More recently, in Networked Content Analysis, Niederer (2019) presents the
34
In 2009, the authors coordinated the special issue on Technologies between Games and Culture in
Sage’s new journal “Games and Culture”. This journal was established in early 2006.
concept of technicity as a synonym for the notion of medium-specificity often used in
digital methods (see Rogers, 2013, pp.25-26). The starting point is web content and
how it is embedded into digital platforms. In this sense technicity echoes how the forms
and functions of platforms shape web content. In other words, technicity refers to what
Niederer (2019) has described as “platform-specific aspects of content” or networked
content (p.35). In this sense, following her proposal, there is technicity in web content.
Taina Bucher (2012) turns to Michel Foucault’s concept of ‘governmentality’35 to
propose the term “technicity of attention”, which explains that digital platforms
operate “as an implementation of an attention economy36 directed at governing modes
of participation within the system” (Bucher 2012, 1). Bucher looks at the details of
Facebook Graph API and describes some of these with the intention to understand how
this platform generates and manages attention. As a digital infrastructure, the problem
would be the vast collection of practices established and hosted by Facebook; “a
controlled environment that users act in, but have little power to change” (Rieder et
al., 2015, p. 3). In this context, Bucher prescribes technicity as a way to condition
participation on social media. In other words, she understands technicity as “a mode
of governmentality that pertains to technologies” (op. cit., 3)37, thus working as a way
to simultaneously understand the modes of software governance and how it
“propagate[s] a certain social order of continued (user) participation” (Bucher 2012, 17).
The technicity of Facebook is addressed differently by a group of New Media,
Journalism, Language and Communication scholars, through a proposal to study the
role the “We are all Khaled Said” Facebook page played in the Egyptian revolution of
2011, intertwined with a particular interest in exploring analytical opportunities for
data-led studies. They bring to the debate the critical role of technicity in digital
research. That is related, first and foremost, to an introduction to either a technical
language or knowledge about social media platforms, which the authors call technical
fieldwork. In a sense, a first level of perceiving technicity would raise awareness
35
“Refers to the rationalities that underlie the ‘techniques and procedures for directing human
behaviour’ (Foucault, 1997, p.81)” or, better put, “mentalities or modes of thoughts that are
immanent to ‘government’ or the ‘conduct of conduct’” (Rose et al., 2006).
36
See Goldhaber (1997): The Attention Economy and the Net.
37
The mode of governance of Facebook, she explains, can take place in three different forms: i) “an
automated, ii) anticipatory and iii) a personalised way of operating the implementation of attention
economy” (Bucher 2012, 2).
of how “human practice is channelled through interfaces and data structures” (Rieder
et al. 2015, p.4).
In this article led by Bernhard Rieder, the authors concretely exemplify how essential
technical fieldwork is to platforms-software studies. They provide a thick
description of the platform’s application programming interface (API); from an overall
history of its main changes, additions and limitations to an accurate view of
Facebook’s data structures - calling special attention to data issues such as
completeness, consistency, and architectural complexity. The technical knowledge of
the APIs is, therefore, critical for the practice of data or platform driven research
(Bucher, 2013; Rieder et al., 2015, see also Omena, Rabello & Mintz 2020).
Technicity as a domain in reiterative and transformative practices
From Game Studies to Internet research, we see that technicity constitutes a complex
reality that integrates human-technical relations but, at the same time, requires more
concrete means, technical in nature, to be comprehended. To complement our
investigation into the different uses of the notion of technicity, I want to present the
work of Martin Dodge and Rob Kitchin, who help us to understand the concept through
both theoretical and practical perspectives. The authors thus introduce technicity as the
power of technologies to either make things happen or to solve ongoing relational
problems; this is “the constant making anew of a domain in reiterative and
transformative practices” (Dodge & Kitchin, 2005, p. 162; Kitchin & Dodge, 2007).
Their definition is based on the work of Adrian Mackenzie, who developed the concept
of technicity on the basis of Gilbert Simondon’s philosophy of technology38.
Technicity refers to the extent to which technologies mediate,
supplement, and augment collective life; the extent to which
technologies are fundamental to the constitution and grounding of
human endeavour; and the unfolding or evolutive power of
technologies to make things happen in conjunction with people
(Mackenzie 2002). For an individual technical element such as a
saw, its technicity might be its hardness and flexibility (a product of
human knowledge and production skills) that enables it, in
conjunction with human mediation, to cut well (note that the
constitution and use of the saw is dependent on both human and
technology; they are inseparable). (Dodge and Kitchin, 2005, p. 169)
38
Here, the understanding of technicity is not to be confused with technical mediation or
medium-specificity, although it has a direct correlation with both of these.
In the first place, technicity serves as a theoretical framework that helps to understand
the effect of software (code) and how it modulates sociospatial relations by making a
difference “to the form, function and meaning of space39” (Dodge and Kitchin, 2005,
p.171). They reflect on the technicity of code, which is taken as something
“contingent, negotiated, and nuanced; [that is] realized through its practice by people
in relation to historical and geographical context” (op. cit., p.170). In this realisation,
there is a high level of mutual dependency. On the one side, “if the code fails, then
the object fails to operate” (op. cit., p.178); by object they mean what is part of our
daily routine, from washing machines to transport and logistics networks.
Consequently, when an object fails, it compromises our living, namely domestic
living, travelling, working, communicating and consuming. On the other side, if a
technology user does not perform her role along with software, the action occurs
differently or does not result in something meaningful. That is to say, and as denoted
by the definition of technicity, “code (software) and its effects are peopled” (op. cit.,
p.170). In Dodge and Kitchin’s (2005) essay, we clearly see that there is no technicity
of code without human mediation.
39
Dodge and Kitchin (2005) interpret space as something produced by social relations and material
social practices. Rather than static, the functions of space alter with time; it is something that “gains
its form, function and meaning in practice” (p.172).
In a second step, and as a complement to the theoretical vision, Dodge and Kitchin
(2005) invite us to consider technicity in its practical terms. The work of code can
only be unfolded in practice and in conjunction with people. To show this, and instead of
focusing on code, they examined the nature of maps or, more precisely, mapping
[technical] practices: “how maps are (re)made in diverse ways (technically, socially,
politically) by people within particular contexts and cultures as solutions to relational
problems” (Kitchin & Dodge, 2007, p. 243). Following the work of Bruno Latour,
Adrian Mackenzie and Gilbert Simondon, the authors draw our attention to something
that must go in parallel with the awareness of society as “hybrid assemblages of human
and non-humans” (Latour 1993): that is, while looking at what constitutes maps, we
should also pay attention to their process of becoming (see Kitchin & Dodge, 2007, p.
335), i.e. the interpretation of maps should not be separated from all related practices
that constitute the process of mapmaking. In this sense, we may want to read technical
objects as Kitchin and Dodge read maps: as something “of-the-moment, brought into
being through practices” (p.331).
After reviewing why and in which ways technicity matters for different fields of study,
as well as noticing how this notion has been used, conceptualised and applied in digital
media research, one important observation stands out. The technicity of (games,
gameplayer, code, software, social media) is a complex concept that requires a close
relationship with software through technical knowledge and practices. This is central
to my thesis, which grants “technology a new role in knowledge and existence by
pointing to its involvement in processes of becoming” (Hoel, 2018, p. 420); in doing
so, I act in opposition to the traditional and classic views on technology “conceived as
external to being” (Hoel, 2018, p. 420, see also Marres 2017). While this argument is
not new, it is fundamental to my use of the notion of technicity to understand digital
methods.
Common perceptions and appropriations
To close this section, I want to emphasise what all these studies, perceptions and
appropriations of technicity have in common and what we can learn from them. By
doing this, we begin to understand why technicity matters and what we should look at
to gain a firm grasp of it, also situating the relationship of technicity in digital methods.
First and foremost, they all focus on the mode of existence of technical objects,
embracing it as part of a relational process and proposing ways of reading the machine
and its relations with us and with other tools. This is a reflexive exercise on i) paying
attention to the techniques of appropriation of videogames and their entanglements
with the performance and expertise of the gameplayer in order to account for identity
formation; or ii) comprehending the automated editing systems of Wikipedia to
evaluate how the encyclopaedia manages content; or iii) knowing how Facebook Open
Graph works in order to study participation. This exercise is not only about the effect
of software or the ways in which we make use of it, but also about processes of
becoming.
Through the lens of technicity, the comprehension of technologies is not limited to
technical forms and functions (what they do, how they work) or to technical expertise
(uses and practices); it also makes room for how technologies become what they
are in conjunction with people. That is something only unfolded in technical practice
and through its relational aspects, which are always situated in a given time and
context.
A second aspect refers to the comprehension of technicity as the understanding of a
network of relations within software. That involves knowledge about the nature of
machines that one can grasp through direct interaction with the software. In terms of
content, technicity refers to software functioning and potentialities but also practices.
Descriptions of forms, functions, limitations and hidden schemas take a prominent role
here. I thus recall Fuller (2008, 9) in asserting that “software demands an engagement with
its technicity and the tools of realist description”. However, technicity cannot be
defined solely as a technical description or medium specificity because it only exists
through human mediation and performance. In this sense and following Rieder (2020),
we need to pay serious attention to the content of software, while considering that part
of its technical substance “sits at the centre of technical practice” (op. cit., p. 54).
Technical practices sit at the centre of digital methods, and so does the content of
technicity, which resides simultaneously in machines (software, if you wish) and in us.
These practices thus become meaningful and must be accounted for when using digital
methods.
One last element that we learn from the different ways of thinking technicity is that
this concept refers to a field of knowledge that combines theoretical, technical and
practical frameworks. In this spirit, the attempt to present an actual application of
technicity is a common concern across all cases. From
these studies, whether based on philosophy of technology or media studies, we can derive
fundamental advice on how to grasp technicity and why it is important, as well as an
awareness of technical infrastructures and their modes of functioning intertwined with
the co-constitutive relations with us (as researchers) in processes of becoming with
technology. In this sense, technicity directly refers to a domain of technical expertise
and iterative technical practices, in both theory and practice (see Rieder, 2020; Dodge
& Kitchin, 2005; Kitchin & Dodge, 2007), that requires action/practice to exist.
However, an in-depth development of the term ‘technicity’ seems to
be disregarded, although these studies clearly indicate how we can read technicity. For
instance: through a “description of the Facebook platform and how it works” (Bucher,
2012); through understanding the “dynamic nature” of Wikipedia regarding human
and bot contributions to the management of content (Niederer & van Dijck, 2010); or
through listing the limitations and analytical possibilities afforded by social media
APIs (Rieder et al., 2015). In the context of digital methods, I want to argue that the
concept of technicity relates not only to the description of the forms and operations
of the medium, but also to a practical awareness about and with medium
functioning. The next section will help in this task by comprehending technicity
through the lens of Gilbert Simondon’s philosophical perspective.
Second attempt to understand technicity (a philosophical perspective)
In the previous section, the concept of technicity was introduced through the
perspective of Digital Media Studies. Here, the intention is to provide further reflection
following the perspective of the French philosopher Gilbert Simondon, with the
purpose of strengthening my argument on the role of technicity in digital methods. In this
second attempt at understanding technicity, the book Engines of Order: A
Mechanology of Algorithmic Techniques by Bernhard Rieder (2020) serves as a
substantial support to my interpretation and efforts to comprehend what technicity is.
In a document that served as a complement to his PhD thesis in 1958, Simondon (1980,
2017) develops a particular vision of technical objects, more precisely of their mode of
existence, which must be taken as an assessment of their value in our lives. According to
Simondon, this is the lynchpin of a new model of culture that knows and recognises
the essence of technical objects. This should replace the current model of culture,
which produces alienation due to a misunderstanding of the role of machines in our
lives. That is the reason, he argues, our culture is unbalanced; to avoid devaluation
of or confusion about machines, we need to become aware of the mode of existence
of technical objects and understand that these objects (or technology itself) are “indeed
human” (Simondon, 1980, p.1)40. Only in this way can we grasp the technicity of
“beings that function” (Simondon, 2017; see also Rieder, 2020).
40
“Using the vocabulary of Actor-Network Theory, we could say that an object’s technicity realizes
‘its script, its “affordance”, its potential to take hold of passersby and force them to play roles in its
story’ (Latour, 1999, p.177). Simondon’s philosophy, however, cautions us to not move too quickly to
the heterogeneous but flat assemblages Actor-Network Theory conceives. In fact, Latour’s more
recent An Inquiry into Modes of Existence (2013) follows Simondon in arguing that such modes
delineate their own substances in ways that are more profound than a mere incommensurability
between language games, because they admit other beings than words into the fold of what makes a
mode specific (Latour, 2013, p.20). Being itself is marked by difference and, as Peters claims,
‘[o]ntology is not flat; it is wrinkly, cloudy, and bunched’ (2015, p.30)” (Rieder, 2020, p. 56).
These principles align with the studies discussed in the previous section. However, the
application of technicity has a narrower focus in media theory, while in Simondon we
see a broader spectrum (human-world-machine). Here, I propose to frame the
perspective of technicity precisely within the reality of digital methods, following
Simondon’s very peculiar appreciation of technical objects, which involves aspects not
considered by the previous studies (e.g. the distinction between individuals, elements
and ensembles). A second aspect concerns his arguments about the role of man before
technical objects: not only are we the builders and coordinators of machines but we
also live among them. Therefore, and in line with Simondon (1980), the following
sections attempt to expose a threefold dimension for understanding technicity, which
consists of i) the nature of machines, ii) the values involved in their mutual relationship
(machine-machine) and iii) their relationship with man (machine-man).
I will be using technicity in two senses, pointing on the one hand to the effort to become
acquainted with the mediums and, on the other hand, to the object of technical
imagination, a perspective that can be useful and relevant to the practice of digital
methods. To be acquainted with the medium reflects the researchers’ attitude to
understand the medium in its own right and in relation with others; then, by
knowing it well, we start reasoning with and about the technical medium, using this
knowledge as a means for enquiry and for answering research questions. I advocate
that this attitude, allied to the practice of digital methods, results in the development of
a technical imagination, which is defined by Simondon as “a particular sensitivity to
the technicity of [technical] elements; it is this sensitivity to technicity that allows for
the discovery of possible assemblages; the inventor does not proceed ex nihilo, from
matter that [s]he gives form to, but from elements that are already technical [...]” (see
Simondon, 2017, p. 74). A technical imagination enables researchers to connect
technicities and to think along with their “stable behaviours, expressing the
characteristics of (technical) elements, rather than simple qualities”; characteristics
that “are powers, in the fullest sense of the term, which is to say capacities for
producing or undergoing an effect in a determinate manner” (Simondon, 2017, p.75).
In other words, when doing digital methods, technical imagination reflects the
researcher’s capacity to predict and combine the practical qualities of software
to solve methodological problems and to respond to research questions.
The awareness component: machine as elements, individuals and ensembles
The first step towards understanding the technicity of beings that function is the
awareness component (of structures, functions and operationalisations), which allows
us to see machines in their own right (Simondon, 1980; 2017). This stage starts with
the comprehension of the machine as a (technical) element, a (technical) individual
and a (technical) ensemble. On the one hand, explains Rieder (2020), this proposal can
be interpreted as part of a metaphysics of technology but, on the other, it “can also be
used as a conceptual device orienting the analysis of specific technical domains”
(p.65), while acknowledging that a precise distinction between element, individual and
ensemble can be difficult to draw when working with digital technologies.
In the context of digital methods, distinguishing between element,
individual and ensemble is a first step towards recognising the technical medium in itself
and in relation to others. It also serves as a departure point to think along with what is
already technical, while paying attention to the medium’s capacities for producing or undergoing
an effect in a determinate manner in the methodological process41. This attitude has
the potential to affect the course of action in methodology, intervening either directly
or indirectly in the ways we design research and think about or study digital records. Before
exposing how this tripartite distinction can help the practice of digital methods, we
will start with a general understanding of elements, individuals and ensembles,
followed by some examples viewed from the perspective of digital methods.
Simondon compares technical elements to organs, functional units
that could not subsist on their own, and individuals to living bodies.
(Simondon, 2017, p.62) In this analogy, ensembles appear as
societies, characterised by more precarious or tumultuous dynamics
of relation, stabilisation, and perturbation. Computer networks
connecting functionally separate machines are obvious candidates
for technical ensembles, but the utility of the concept is broader.
Simondon argues that technical and economic value are almost
completely separate on the level of the element, while individuals
and ensembles connect to broader realities such as social
arrangements and practices, the latter pointing to the fact that
contemporary technology functions as a series of industries. (Rieder,
2020, p. 69)
41
In italics I’m borrowing and adapting Simondon’s words to the context of digital methods.
Technical individuals, such as software, relate to the integration of ‘pure’ technical
forces and “how an associated milieu is formed and stabilized, how continuous
operation is realized” (Rieder, 2020, p. 68). These are taken as a stable and integrated
system made of elements. For instance, “the application software ‘fits’ the processor
speed, memory capacities, screen dimensions, sensor arrays, and so forth” (op. cit.).
Individuals exist in a state of combination, and so does the technicity within these technical
beings (Simondon, 2017). In digital research, and despite its need for a computer to
operate on, a good example of a technical individual is Gephi (Bastian, Heymann, &
Jacomy, 2009). Built with Java SE 6 on top of the NetBeans platform, the first version
of Gephi was released in July 2008 (0.6.0), and it has been operating under the stable
release 0.9.2 since September 2017 (Heymann, 2014; Wikipedia, 2015).
This eleven-year-old network analysis and visualisation software may represent what
Simondon describes as an operational unit closest to the human scale (2017, p.77) that
“combine and arrange elements into functioning and integrated wholes, realizing
technical schemata that evolve over time” (Rieder, 2020, p.61)42. Accordingly,
software like RawGraphs (Mauri, Elli, Caviglia, Uboldi, & Azzi, 2017) and Facepager
(Jünger & Keyling, 2019) or even social media can be seen as technical individuals.
In these cases, self-regulation is required, a “structural coherence capable of assuring
their function and stability over time” (op. cit. p.65).
42
As mentioned before, Simondon’s reflections on technical objects were not related to software,
web-based APIs or digital tools for research. Therefore, the distinctions between individual,
elements and ensemble need to be adapted to the context of digital research. In fact, one could say
that Gephi is an ensemble, rather than an individual, precisely because it is written in Java and stands
on NetBeans. However, as we will see, both technical ensembles and technical individuals exist in a
state of combination (made of elements, e.g. Java). Gephi is illustrated as an individual here because it
is a stable software (Heymann, 2014). When Simondon refers to a technical ensemble, he explains
there is functional separation in it (meaning the gathering of technical objects not necessarily having
to operate within one another). An ensemble can also be temporary. A technical individual is thus,
“first and foremost, characterized by strong and stable technical or causal relations between the
constitutive elements that establish and maintain its functioning”, as Rieder (2020, p.58) explains.
Technical elements are functional units that cannot operate on their own but are
carriers of meaning with the potential to be repurposed. The valve spring and
modulation transformer are examples of elements in Simondon, who explains that
these units enhance the quality of individuals, as valve springs do in automobile
engines. The springs serve “to move an intake, or an exhaust, valve according to the
head-discharge curve of a cam so that the valve is in contact with its seat to prevent
compression leakage. In the meantime, a valve spring is required to impose appropriate
tension on the valve so as not to increase the friction loss of the valve operating
system” (Yoshihara, 2011). Over the years, valve springs became lighter and smaller
(e.g. by considering non-metallic inclusions in steel), an attribute that responded to the
environmental regulations for automobiles. Consequently, the technical evolution of
these elements has reduced fuel consumption and carbon dioxide emissions (op. cit.).
This example helps us to comprehend Simondon (2017) when he says technical
elements are the true engines of progress, while the ensembles or individuals are more
connected to transformations and changes, precisely because the element “expresses
and preserves what has been acquired via a technical ensemble so as to be transported
into a new period” (p.73); only they have the capability “to transmit technicity from
one age to another” (p.76). They “do not dissolve when forming individuals”, but they
are capable of constituting their own trajectories (see Rieder, 2020, p.68).
In other words, elements transport a concretised technical reality, while the individual
and the ensemble contain “this technical reality without being able to transport and
transmit it; elements have a transductive property that makes them the true bearers of
technicity, just as seeds transport the properties of a species and go on to make new
individuals” (Simondon, 2017, pp.73-74). That is a form of explaining that technicity
exists in the purest way within technical elements (Simondon, 2017, p.74), “because
they are not yet combined into systems that, so to speak, put certain demands on them”
(Rieder, 2020, p.65). In the context of digital methods, some candidates for technical
elements can be algorithmic techniques (e.g. information ordering), machine learning
models for studying natively digital images (e.g. Google Vision API’s web detection
module) or even graph layout algorithms. These elements, in research design and
analyses, should become units of knowledge when apprehended by technical
imagination (see Rieder, 2020, p. 103). This will be discussed in the next section.
A second illustration, attempting to transpose the notion of elements to the practice of
digital methods, is the case of graph layout algorithms, which are indispensable for digital
network analysis. Take as an example one of the constitutive elements in Gephi:
ForceAtlas2 (Jacomy, Venturini, Heymann, & Bastian, 2014). The first version of this
layout algorithm, simply known as Force Atlas, arrived together with Gephi in 2008,
but improvements were made, and in June 2011 the stable version was released
(Jacomy, 2011; Jacomy et al., 2014). This force-directed layout serves small to
medium-size graphs. It works by opposing attraction to degree-dependent repulsion;
these forces create a movement that converges to a balanced state (in positioning the
nodes within the network). The final configuration is expected to help the
interpretation of the data. Although it is a default layout in Gephi, this algorithm43 can
also be installed in R (Analyx, 2015) and has been implemented for Python 2 and 3 (Chippada,
2017).
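To make the interplay of forces described above more tangible, what follows is a minimal
sketch of a force-directed layout in Python. It is not the Gephi implementation of
ForceAtlas2. It only assumes, as I read Jacomy et al. (2014), that attraction between linked
nodes is proportional to their distance and that repulsion between any two nodes is
proportional to (degree + 1) of both and inversely proportional to their distance. The
function name toy_force_layout, its parameters (iterations, repulsion, step, seed) and the
cap on displacement are illustrative assumptions; gravity, adaptive speed and other
optimisations are deliberately left out.

```python
import math
import random


def toy_force_layout(edges, iterations=300, repulsion=1.0, step=0.05, seed=1):
    """Illustrative force-directed layout (NOT the Gephi implementation).

    Assumptions of this sketch: attraction between linked nodes grows linearly
    with their distance; repulsion between any two nodes grows with
    (degree + 1) of both nodes and shrinks with their distance.
    """
    nodes = sorted({node for edge in edges for node in edge})
    degree = {node: 0 for node in nodes}
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1

    rng = random.Random(seed)  # fixed seed so that the sketch is repeatable
    pos = {node: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for node in nodes}

    for _ in range(iterations):
        force = {node: [0.0, 0.0] for node in nodes}

        # Repulsion: every pair of nodes pushes apart, hubs more strongly.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                push = repulsion * (degree[a] + 1) * (degree[b] + 1) / dist
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist

        # Attraction: each edge pulls its two ends together, linearly in distance.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            force[a][0] -= dx
            force[a][1] -= dy
            force[b][0] += dx
            force[b][1] += dy

        # Move each node a small step along its net force (capped for stability).
        for node in nodes:
            fx, fy = force[node]
            magnitude = math.hypot(fx, fy)
            if magnitude > 0:
                move = min(step * magnitude, 0.5)
                pos[node][0] += fx / magnitude * move
                pos[node][1] += fy / magnitude * move

    return pos


if __name__ == "__main__":
    # Two small clusters joined by one bridge edge (hypothetical toy data).
    toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                 ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
    for node, (x, y) in toy_force_layout(toy_edges).items():
        print(f"{node}: ({x:.2f}, {y:.2f})")
```

Under these assumptions the balanced state can be worked out by hand for the simplest case:
two linked nodes each have degree 1, so with repulsion=1.0 they stop moving where
(1 + 1)(1 + 1)/d = d, that is, at a distance d = 2. The positions are thus an effect of the
layout's own parameters as much as of the data, which is the point made in the text.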
Digging deeper into ForceAtlas2, and considering both how it works and a series
of experimental studies using social media and search engine records, we see that this
force-directed layout may provide a narrative thread that has fixed layers of interpretation44
(such as centre, mid-zone, periphery and isolated elements) and multiple forms of
reading (see Omena & Amaral, 2019; Omena et al., 2020). The operational qualities
of this graph layout are in between something to be created (findings, reflections,
revelations, stories, narratives, controversies) and something that already exists (data
entered in the software, e.g. Gephi; grammatised actions available in digital platforms).
By giving a shape to situated connections and arranging those in different zones
within the network, ForceAtlas2 has the capacity to produce a very particular effect on
the knowledge to be acquired and shared about a given context (see a practical example
of this in the next section).
43
Recently, and unsatisfied with the results afforded by this force-directed algorithm, Mathieu Jacomy
(co-founder of Gephi) has developed a quantitative metric named connected-closeness to read digital
networks: “a valid metric for interpreting distances in a network map” (Mathieu Jacomy, 2020b,
2020a).
44
Since 2018 this proposal has been tried and tested in and outside the context of data sprints; for
instance, it was used to read networks of recommendation, paying attention to the recommendation of
similar apps in Google Play Store (e.g.
https://wiki.digitalmethods.net/Dmi/SummerSchool2018AppStoresBiasObjectionableQueries,
https://smart.inovamedialab.org/past-editions/smart-2019/project-reports/journalism-apps/) or to the
related videos suggested by YouTube algorithms (e.g.
https://wiki.digitalmethods.net/Dmi/SummerSchool2018MappingWarAtrocities). The fixed layers of
interpretation were also tested in networks of hashtags, networks of following accounts and networks
built on top of computer vision APIs and concerning image circulation (e.g.
https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/,
https://smart.inovamedialab.org/past-editions/smart-2019/project-reports/interrogating-vision-apis/,
https://wiki.digitalmethods.net/Dmi/SummerSchool2020GoodEnoughPublics,
https://smart.inovamedialab.org/2021-platformisation/project-reports/investigating-cross-platform/).
Technical ensembles, less specific than elements but existing in a state of combination
like individuals, refer to broader realities; they “constitute systems that are not
characterized by the creation of an associated milieu, but rather by a form of coupling
that merely connects outputs to inputs, while remaining separate in terms of actual
functioning” (Rieder, 2020, p.68). In this way, an ensemble goes beyond itself by
enabling “the construction of new technical beings in the form of individuals”
(Simondon, 2017, p.72). However, these newly created beings can be temporary or
even occasional (Simondon, 2017). In addition, there is functional separation in
technical ensembles, meaning that the gathered technical objects do not necessarily have
to operate within one another but can operate in different environments (spaces) or systems.
Rieder (2020) points out that Simondon sees a systems perspective based on a theory
of information as key to understanding what technical ensembles are, while setting
another example in which “computer networks connecting functionally separate
machines are obvious candidates for technical ensembles, but the utility of the concept
is broader” (op. cit., p.69). Here, the concept of technical ensemble is used as a form of
representation of the full range of digital methods, in which the distinction between
individuals, elements and ensembles needs to be adapted to digital research.
To conclude this topic, and following Simondon, I want to emphasise a shared
characteristic between elements and ensembles: they are characterised by a degree of
technical perfection. This refers to a practical quality “or at the very least material and
structural basis of certain practical qualities; in this way a good tool is not simply one
that is well put together and well crafted” (Simondon, 2017, p.72). In other words, no
matter whether the object is free of or full of flaws and limitations, what counts is its
capacity to accomplish a purpose. In the context of digital methods practice, the media
involved, their methods, forms and structures are never optimal, but they certainly
contain practical qualities that are capable of fulfilling a research purpose.
The distinction between individuals, elements and ensembles can help the practice of
digital methods in different ways. First, when comprehending technical mediums
at different levels (also knowing what they offer and how they operate), researchers
become capable of knowing what type of technical objects are required, when to use
these and why. This perspective is relevant in research design with digital methods
because it compels researchers to understand the potentials and practical qualities of
digital technologies, in the same way that they take seriously the context and content
of the object of study to accomplish research purposes.
Second, and throughout the practice of digital methods, we come to understand the
impressions, effects and re-adjustments left by a range of software on digital records
(see chapter 3 and figure 3.13); in particular, how crucial the role of technical elements
is in the analysis. Here, the content of grammatised actions (e.g. hashtag-based data
set) is paired up with the content of technical individuals (e.g. extraction and
visualisation software) and elements (e.g. Google Vision automated image annotation
in a collection of images associated with a particular list of hashtags). This understanding
not only helps researchers to think along with and repurpose (if necessary)
computational mediums, but also enables them to discover new arrangements, using
a technical imagination to create methodological solutions.
Third, the concept of a technical ensemble helps researchers to envision and design the
full implementation of digital methods: a situation where the practical qualities and
potentials of software, web-based APIs and digital tools (technical individuals and
elements) are combined and coordinated to work together towards a research
purpose. Here, researchers must define what range of software (technical individuals)
should form a chain methodology to solve research problems, valuing particular
elements as carriers of meaning.
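The idea of a chain methodology, in which separate tools are coupled so that the output of
one becomes the input of the next, can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the three stages and their names (extract_hashtags,
build_cooccurrence, strongest_ties) are hypothetical stand-ins for extraction, network
building and analysis software, and the sample posts are invented toy data, not records
from any study.

```python
from collections import Counter
from itertools import combinations


def extract_hashtags(posts):
    """Stage 1 (stand-in for an extraction tool): list the hashtags of each post.

    Simplification: a hashtag is any whitespace-separated token starting with '#'.
    """
    return [sorted({word.lower() for word in post.split() if word.startswith("#")})
            for post in posts]


def build_cooccurrence(hashtag_lists):
    """Stage 2 (stand-in for network building): count hashtag pairs used together."""
    pairs = Counter()
    for tags in hashtag_lists:
        pairs.update(combinations(tags, 2))
    return pairs


def strongest_ties(pairs, top=3):
    """Stage 3 (stand-in for analysis): keep the most frequent pairs."""
    return pairs.most_common(top)


def run_chain(data, stages):
    """Couple the stages: each stage's output becomes the next stage's input."""
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    toy_posts = ["#climate #protest today", "#protest #climate march", "#climate #art"]
    chain = [extract_hashtags, build_cooccurrence, strongest_ties]
    print(run_chain(toy_posts, chain))
```

Each stage remains functionally separate and could be replaced by another tool without
touching the others, as long as the format passed along the chain is respected; deciding
which stages to couple, and in which order, is the researcher's task.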
When adapted to digital research, the tripartite distinction can make a difference in the
practice of digital methods, as this thesis expects to show (though not always using the
terms individuals, elements and ensembles).
The human function: to be acquainted with machines
[…] yet, in order for the human function to make sense, it is necessary
for every man employed with a technical task to surround the
machine both from above and from below, to have an understanding of
it in some way, and to look after its elements as well as its integration
into functional ensemble. For it is a mistake to establish a
hierarchical distinction between the care given to elements and the
care given to ensembles. Technicity is not a reality that can be
hierarchized; it exists as a whole inside its elements and propagates
transductively throughout the technical individual and ensembles:
through the individual, ensembles are made of elements, and from
the elements issue forth (Simondon, 2017, p.80).
The second step towards understanding the technicity of beings that function concerns the
relations between man and machines, particularly at the levels of technical individuals
and ensembles. Simondon proposes two situations to illustrate this relationship: in the
first, machines are the helpers while we are the bearers of technicity (artisanal
coupling); the second concerns industrial coupling45, in which machines are no longer
simple servants, but gain a status that requires regulation, supervision and organisation.
In artisanal work, for instance bow drilling or the skill of shoeing a horse, man (as
technical individual) “ensures an internal distribution and self-regulation of the task
through his body” by mastering the tools (Simondon, 2017, p. 77), thus assuming the
function of technical individuation. In this case, affirms Simondon (2017), man is the
bearer of technicity, while the tools are the helpers.
Conversely, as we know, this situation changes when looking at technical ensembles
of industrial and digital societies. To Simondon (2017), this is the moment when the
essential foundation of technical individualisation shifts: it is, from then on, within the
machines, rather than in humans. “Man directs and adjusts or regulates the machine,
the tool bearer; he realises grouping of machines, but does not himself bear tools”
(Simondon, 2017, p. 78). By disengaging from a more artisanal function, man can now
become either “organiser of the ensemble of technical individuals, or helper of
technical individuals”, playing a sort of auxiliary role by providing the machine with
elements. Simondon thus distinguishes two types of technical individuality: in the
first, the role of man is above the machine (organiser); in the second, it is below it
(helper). However, this “does not mean the man cannot be a technical individual in any
shape or form and work in conjunction with the machine” (p.78), as Simondon
explains:
This man-machine relation is realised when man applies his action
to the natural world through the machine; the machine is then a
vehicle for action and information, in relation with three terms: man,
machine, and world, the machine being which is between man and
world. In this case, man preserves some traits of technicity defined
in particular by the necessity of apprenticeship. The machine thus
essentially serves the purpose of a relay, an amplifier of movements,
but it is still man who preserves within himself the centre of this
complex technical individual that is the reality constituted by man
and machine. (Simondon, 2017, pp.78-79)
45 In software engineering coupling “is the degree of interdependence between software modules; a
measure of how closely connected two routines or modules are” (Wikipedia, 2020).
Reflections on these man-machine relationships can fruitfully be adapted
to digital methods, here taken as research practices made of technical creators, both
software-makers and software-users (see chapter 1). Transporting Simondon’s
perspective (2017, pp.78-79) to the context of digital research, we could say that the
researcher becomes the bearer of technicity when they decide to work in conjunction with the
medium and its methods (software, web-based application, etc.). Technical tasks and
practices are synonymous with this activity. When it comes to machine-machine
relations, man can only intervene as a living being, always as a servant or organiser of
machines. Think of core activities in digital methods such as monitoring software
functioning. However, even while claiming that “the real regulating power of culture”
pertains to machines, Simondon also makes evident that machines are only perfect in
the presence of man, who is the “permanent organizer and a living interpreter of the
interrelationships of machines” (Simondon, 1980, p. 4). Once again, the role of man
towards technical objects is indispensable.
In the environment of digital research, the analogy of the conductor can shed light on
the role of the researcher as the one who directs technical ensembles, as well as
delineates close relations with the elements and their integration into the ensemble.
The conductor can direct his musicians only because, like them, and
with a similar intensity, he can interpret the piece of music
performed; he determines the tempo of their performance, but as he
does so his interpretative decisions are affected by the actual
performance of the musicians; in fact, it is through him that the
members of the orchestra affect each other’s interpretation; for each
of them he is the real, inspiring form of the group’s existence as
group; he is the central focus of interpretation of all of them in
relation to each other (Simondon, 1980, p.4).
In the full range of digital methods, the researcher should take the same position as the
conductor, who is familiar with the whole and the smaller parts of what makes an
orchestra, the general and the tiny details produced by each musical instrument; only
in this way is she able to conduct her musicians. The conductor, although in the
position of the one who directs the musicians, is nevertheless on the same level as the
latter, because her decisions cannot be taken individually or in isolation but depend on
and are affected by the performance of the musicians while they play. On one hand, if
she weren’t there, there would be no “group’s existence as a group”, but on the other
hand, she could never interpret any piece of music without her instruments. Together
they bring musical notes to life, both delivering the performance and providing a
unique experience. Likewise, in digital methods, the researcher fulfils her function
before and through a technical ensemble composed of the medium that she
investigates and the tools that she uses to investigate it.
How then to become acquainted with the medium and its functional meanings within
an ensemble? It is necessary, according to Simondon, “for every man employed with a
technical task to surround the machine both from above and from below, to have an
understanding of it in some way, and to look after its elements as well as its integration
into functional ensemble” (Simondon, 2017, p.78). This also means that beings that
function signify not only through what they do or through how they operate (Rieder,
2020), but through the comprehension of their relational mode of functioning.
For instance, drawing on Simondon’s “mechanological perspective” on software,
Rieder (2020) introduces a mode of thinking and capturing the ‘interior life’ and
‘sociability’ of information retrieval “in terms that are not bound to an exterior finality
or productivity” (p.16). Consequently, and from a series of practices in software-making,
his book contributes to a way of studying how these engines of order
somehow “adjudicate digital life”. Yet, how to become acquainted with the medium
and its functional meanings when one is neither a developer nor has coding skills?
Such a technical task, I contend, is not exclusive to developers, coders, those who make
software or hack systems. Non-developer researchers can also interrogate the
fundamental nature of the medium from a technical perspective.
Building machines is not the only way of being around them. Using them is another.
Although he barely discusses the use of technical objects, Simondon states that we
may treat these objects “as meaningful only in relation to a use and utility” (Simondon,
2017, p.16 in Rieder, 2020, p. 57). He further asserts that we are engineers of
transformation, as both inventors and users of technical objects; so, the desire for
change resides in us, rather than in the object itself (Simondon, 2017, p.71).
In the context of digital methods, as a user, to be acquainted with the computational
medium would mean speaking its language conceptually and technically while using
the medium empirically without losing sight of its role in the parts and the whole of digital
methods (see table 2.1). This also requires questioning the medium itself and its
relation to other mediums. By doing this and counting on the researcher’s empirical
experience and technical imagination, the computational mediums become active
actors in the research process with substantial content for enquiry and for answering
research questions, and not only tools.
In order to become familiar with the technical medium, new media scholars need
neither the same knowledge as a computer scientist nor an ability to invent
algorithms; but they must “understand technologies well enough to connect them to
culture” and have the “willingness to use new and challenging methods of thinking
and investigation” (Bogost & Montfort, 2009, p. 5). More important than creating
computer systems or becoming an expert coder is “knowing the best questions to ask
about existing ones and how to go about answering them” (idem).
This illustrates not only the spirit of platform studies, proposing to think about the
software environment as a platform, in the belief that “technical understanding can
lead to new sorts of insights” (Bogost & Montfort, 2009, p. 1), but also concerns an
understanding about the technicity-of-the-mediums in the context of digital methods.
That is, an understanding of the medium in its own right and in relation with others
(see table 2.1). Then, and by knowing it well, we start reasoning with and about the
mediums, using this knowledge as a means for enquiry and for answering research
questions. In this sense, the knowledge gained from practical experience is used as a
point of departure for designing research questions, a value needed in digital research
and methods.
[To understand the] medium: conceptually, technically, empirically

Being acquainted with the medium in itself
  Conceptually: What is this medium? What does it bear? How does it work and for what
    purpose? What is permanent or subject to updates or replacement? Does it connect to
    other fields of study? Which, and how?
  Technically: To understand the medium’s purpose, capabilities, potentialities and
    limitations in technical terms, and what they mean for research, considering its
    technical infrastructure.
  Empirically: What is required in order to use it? When using it, what problems and
    solutions are implicit to the medium? What technical skills may be required in the
    context of collaborative research (data sprints)?

Being acquainted with the medium in relation to other mediums
  Conceptually: What other mediums does this medium relate to or communicate with? Is
    what the medium carries modified? How? For what purpose and to what effect? In the
    research context, which of its technical elements matters the most? Why?
  Technically: To understand the medium’s relational aspects (with other mediums) and
    their respective purpose and effects in technical terms. To understand the technical
    element that matters.
  Empirically: To understand, see, explore and experiment with its relational aspects
    and their respective purpose and effects. To identify, explore and experiment with
    the particular element(s) that matter for research.

Using technical imagination as a means for enquiry
  Conceptually: What questions can(not) be asked? Which medium potentialities or
    methods can be used? Why? What digital records can(not) serve this purpose?
  Technically: To know about the mediums’ methods and elements that are appropriated
    or have the potential to respond to research questions or to be repurposed.

Using technical imagination as a means for answering research questions
  Conceptually: How can questions (not) be answered? Which ensemble of mediums
    can(not) serve this purpose? What medium element can(not) make the difference?
  Technically: To know how to orchestrate different mediums in order to answer a
    particular research question.

Using technical imagination (both as a means for enquiry and for answering research questions)
  Empirically: To dedicate some time to go through each step of the digital methods
    approach, testing, exploring and experimenting with its possibilities, analytical
    affordances and limitations. To perform the full range of digital methods, either
    following projects’ recipes or experimenting with and testing new arrangements.

Table 2.1. How to be acquainted with the medium in the context of digital methods?
From orders of thought to activity and back
The last step to understand technicity addresses its material reality which is “thus
mirrored by a ‘mental and practical universe’ that humans come to know and draw on
to create technical objects” (Simondon, 2017, p.252 in Rieder, 2020, p.76). The
challenge here is to advance the notion of technical mentality46 as a fundamental
principle with a crucial practical importance. Digital methods cut across technical
skills and practices unfolding new ways of thinking along with the medium just as
creative styles of enquiry and interpretative strategies; which are founding principles
in these methods (Marres, 2017; Omena, 2019; Rogers, 2019).
In this context, I want to argue that the development of a sensitivity to the technicity
of the mediums requires not only practical or technical efforts but also mental schemas,
the capacity to develop a particular mode of reasoning that echoes the technical
potentialities in the research process allowing for new arrangements. In line with
Simondon’s work (2009, 2017), what follows is an attempt to describe technicity as a
phase, while also offering a brief reflection on technical thought.
When Simondon refers to technicity as a phase he is making a strong statement by
affirming that technicity is one of the fundamental elements that make us what we are
(the other being religion). He sees technicity as a phase fundamental to “the mode of
existence of the whole constituted by man and the world” (2017, p. 73); here the whole
(l’ensemble) is taken as a system which is mediated by technical objects and formed
by man and the world. While philosophy has historically considered technology
either as autonomous or defined by its use, Simondon sees technology as an expression
of life (p.65; see also Rieder, 2020), accounting for the reality of the technical object itself. In
this sense, the philosopher of technology guides the reader into a chapter that aims at
grasping technicity as a mediation between the theory of knowledge and the theory of
action. While I will not delve into Simondon’s metaphysics, it is important to describe
this perspective that served as inspiration for this dissertation.
46 “Technical mentality offers a unique mode of knowledge that essentially uses the analogical transfer
and the paradigm, and grounds itself on the discovery of common modes of functioning - or of
regimes of operation - in otherwise different orders” (Simondon, 2009, p. 17). Despite still evolving,
and thereby incomplete, a technical mentality proposes schemas of intelligibility that would be
particularly adequate to grasp regimes of technical operations implying also their functional modes.
By defining technicity as a phase, Simondon understands it relationally:
one cannot conceive of a phase except in relation to another or to
several other phases; in a system of phases there is a relation of
equilibrium and of reciprocal tensions; it is the actual system of all
phases taken together that is the complete reality, not each phase in
itself; a phase is only a phase in relation to others. (Simondon, 2017,
p. 174)
As a phase, technicity simultaneously precedes and takes place with and in the technical
objects. It precedes them by being related to figural structures, positioning itself as
something prior to any split between subjectivity and objectivity, and by relating to a
trajectory of awareness (Simondon, 2017). Consider, for instance, the beginning of an action,
what Simondon exemplifies as “the desire for conquest” or “a sense of competition”;
the things, places and moments (key points) that lead from the beginning of an action to its
own realisation; or what he describes as “the birth of a network of privileged points of
exchange between the being and the milieu” (p.182). What precedes the technical
objects, thus, consists in this capacity for connecting key-points, perceiving also that
these points “objectivise themselves in the form of concretized tools and instruments”
(p.181). In this mental universe, for instance, I would suggest thinking of key-points
as technical elements. Positioned as mediators between man and the world, these
elements are taken as a place of exchange, to be navigated and explored rather than
dominated or possessed. By passing from figural structure(s) to technics, and in line
with Simondon, technicity takes place with and in technical objects, when technical
objects can be recognised as a reality in themselves.
The suggestion of taking elements as key-points has a purpose, that is my attempt to
redirect philosophical ideas to what I will further discuss and illustrate in this chapter
– namely the role of technical thought and mentality in digital methods. Simondon,
however, takes the act of climbing a mountain as an example of this mental state of
thought that precedes an action.
To climb a slope in order to go toward the summit, is to make one’s
way toward the privileged place that commands the entire mountain
chain, not in order to dominate or possess it, but in order to exchange
a relationship of friendship with it. Man and nature are not strictly
speaking enemies before this connection at this key-point, but are
simply strangers to each other. (Simondon, 2017, p.179)
Only after it has been climbed does the summit become a place of exchange; in this
process, man can simultaneously be influenced by or act upon it. Here the shift from
the idea of climbing a mountain (thought mediation) to the realisation of this activity
(technical mediation and human function) is objectivised in technics. On that basis,
Simondon reminds us that technics not only have the power to ‘modify a privileged
place’ (key-point), but “can also completely create the functionality of privileged
points” (p.182).
The notion of phase thus implies seeing technicity as something coming-into-being –
e.g. from looking at “the operation of a system with potentials in its reality” (p.169) to
the actual activity and attitude towards turning potentiality into actuality; which makes
us realise that technicity can be anything but motionless or static. Furthermore, it
cannot be entirely contained within technical objects, or exhaust itself within them. On
the contrary, “technicity precedes and goes beyond objects” (op. cit. p.179). In the
habitat of digital research, and as permanent organiser/coordinator of technical
ensembles, the researcher not only regards highly what precedes and takes place with
and in technical objects but, through technics, has the power to modify or create new
methodological arrangements. It is here that we see the need for developing a technical
mentality along with the process of digital methods.
As a phase, technicity becomes at the same time a problem and a solution47. In
Simondon’s words, this would be something that surrounds “the deepest reality of
technics” which, although constituted by theoretical knowledge, is realised in praxis
(p.171). As a result, more than a phase, technicity devolves into a phase-shift; turning
itself into something transitory as well as something definitive, explains Simondon.
Technicity is then transitory because of its capacity to split itself into theory and praxis,
dividing itself into two orders of thought: theoretical and practical. On one hand, there are the
representative aspects of (scientific) knowledge, what Simondon refers
to as grounding realities that spring forth from representative orders of thought. On the
other hand, there are the active aspects of praxis, what he refers to as figural realities that
47 Turning itself into "a permanent reminder of rupture" of the capacity of resolving problems
(solution) and of the position of becoming a problem (Simondon, 2017, p.174).
spring forth from active orders of thought48. Technicity is definitive because it refers
to a particular domain of knowledge49 that concerns the functioning of the technical
object; here there is a particular consideration for the elements. That is where we see the key
to (and the beauty of) developing a sensitivity to the technicity of the mediums which, as
I argue, can be constituent of digital methods practice; something that also pushes us
towards orders of technical-practical thoughts, something that “cannot be completely
systematised” because these thoughts “lead to a plurality of different values”
(Simondon, 2017, pp.216-217).
The value of technical elements and technical imagination
To exemplify “the concern for the element”, which is something enabled by technics,
Simondon (2017) invokes how Descartes explains the functioning of the heart:
[…] decomposing a complete cycle into simple successive
operations and showing that the functioning of the whole is the result
of the play of elements necessitated by their particular disposition
(for example that of each valve). Descartes doesn’t ask himself why
the heart is made in this way, with the valves and cavities, but how
it functions given that this is how it is made. The application of
schemas drawn from technics does not account for the existence of
the totality, taken in its unity, but only for the point by point and
instant by instant functioning of this totality. (Simondon, 2017,
pp.187-188)
The importance of the elements is not a matter of knowing what the heart is made of,
but how it functions, having the ability to understand its overall functioning
(fonctionnement d’ensemble) as a series of elementary processes and mediations
(Simondon, 2017). In this logic, “technicity introduces the search for a how through
the decomposition of an overall phenomenon into elementary operations” (p.188). This
also points to the possession of a content (at the level of element) which is primarily
technical. The particular concern for the elements mirrors a certain schematism of
48 In this sense, it is important to understand better the ground (fond) and the figure. The former
corresponds to “the functions of totality that are independent of each application of technical
gestures”, while the latter “the figure, made of definitive and particular schemas, specifies each
technique as a manner of acting”.
49 For instance, Rieder (2020) presents technicity as a notion related to the domain of making software
“where what programs do and how they do it is specified or, better, designed”. Here, technicity is a
notion related to the domain of using software in the practices of digital methods.
mental structures50 that resides in technical thought, which is taken as “the paradigm
of all inductive thinking, whether in the theoretical order, or in the practical order”
(Simondon, 2017, p. 188). According to Simondon (2017, p.214), inductive thinking
can be defined by its content, “the form of theoretical thought that arises from out of
the fragmentation of technics”, and by its method, “the thought that goes from particular
elements and experiences to the whole of the collection and to a general affirmation,
seizing the validity of the general enunciation by way of the accumulation of the
validity of particular experiences”. Coming from technics, this way of thinking
remains relational and pluralistic, because it is empirical in its origin (see Simondon,
2017, p.217).
The example of the functioning of the heart helps us to make sense of technicity as
something transitory (orders of thought) and definitive (domain of knowledge), but
also as schemas of technical thought. These, however, cannot be easily transmitted or
explained. As we have learnt from Simondon (2009; 2017), schemas of thought are
poorly understood from the order of expression; they further presuppose a node of
expressive communication, modalities of attitude towards what is either theoretical
or practical, in and about technical objects. That is the reason why, on one hand,
Simondon (2009; 2017) dwells on technical activity51 to explain practical and
technical thoughts and, on the other hand, affirms that the application of such
modes of intelligibility requires the development of a technical mentality which “can
be developed into schemas of action and into values”, turning itself into a thought-network.
As mentioned before, the reality of technical objects in Simondon echoes industrial
objects; his reflections nonetheless have a powerful potential for digital research,
not excluding the possibility of a practical application. On the contrary, the work
process introduced by digital methods can be precisely the way for grounding (and
50 For the apprehension of technical individuals and ensembles.
51 The technical activity always faces two opposite but complementary realities that speak of technical
thought (Simondon, 2017). For Simondon (2017), the technical activity and its limits are exposed as
such when it fails: “Through its failure, technical thought discovers that the world cannot be entirely
incorporated into technics” (p.215). But he also presents a complementary perspective, as it leads to
the discovery of new possibilities: actions that fail expose also counter-structures attached to technical
operations demanding human intervention through technical gestures.
taking advantage of) such mental modes into action. Beyond getting acquainted with
the medium, cognitive schemas of thought (technical and practical) are important
precisely because they bear not only on technical invention and the creation of technical
objects but also on methodological innovation.
Grounded by and in technical thought, Simondon explains that invention is “the taking
charge of the system of actuality through the system of virtualities, the creation of a
unique system in the basis of these two systems” (2017, p.61). That is to say, to
paraphrase him, to be in an intermediary position dominating what is between the
abstract and the concrete, taking charge of the medium’s actual activity through its technical
potentials for the creation of one thing. In Simondon, invention is thus the creation of
a technical individual that requires from the inventor an intuitive knowledge of
technicity, particularly of the element. That is what Simondon (2017) describes as “the
level of schemas” which “presupposes the pre-existence and coherence of
representations that covers the object’s technicity with symbols belonging to an
imaginative systematic and an imaginative dynamic” (p.74). Imagination here is the
capacity of prediction of the practical qualities of technical objects. In this sense,
technical imagination thus can be defined as
“a particular sensitivity to the technicity of elements; it is this
sensitivity to technicity that enables the discovery of possible
assemblages; the inventor does not proceed ex nihilo (from scratch),
starting from matter that he gives form to, but from elements that are
already technical” (Simondon, 2017, p.74)
I argue that such a state of mind and sensitivity, combined with a technical practice
certainly inherent to the full range of digital methods, have the potential to fill a gap
in digital research: not only conceptually but in and through methods (practice). This is a
thought-network that echoes in modes of enquiring (research questions) and in gathering
different computational mediums and their respective methods (methodology) for a
final purpose (research goals). In this case, technicity mirrors a practical reality driven
by a technical imagination.
The points discussed above serve to provide a conceptual basis for my main argument
that the practice of digital methods is enhanced when researchers make room for, grow
and establish a sensitivity to the technicity-of-the-mediums.
Third attempt to understand technicity (with digital methods)
In attempting to illustrate more clearly the mental and practical modes of what I am
calling the technicity-of-the-mediums in digital methods, this section provides a
description of the process of building/interpreting computer vision-based networks. It
is an exercise in repurposing pre-trained machine learning models to study bot agency
in social media through networks. The research question here concerns how the visual
content shared by botted accounts travels across domains52. To answer this
question, I take the example of a digital methods protocol to expose what it takes to
interpret computer vision networks, taken as an ensemble of machines, data, methods
and research practices. This register concerns networks created through Instagram and
Tumblr images53 and reconstructed through the Google Vision API, a model that searches
and detects the web pages in which full or similar matching images have been shared
across the web.
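To make the idea of such a network more tangible, the following is a minimal sketch, not the protocol used in this research, of how web detection results could be turned into weighted image-domain edges. It assumes that the responses have already been collected and stored as dictionaries following the JSON shape of the Vision API web detection output (a `webDetection` object containing a `pagesWithMatchingImages` list whose items carry a `url`); the image identifiers, the sample URLs and the function names `domain_of` and `image_domain_edges` are hypothetical and serve only as illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(page_url: str) -> str:
    """Return the host of a page URL, lowercased and without a leading 'www.'."""
    host = urlparse(page_url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def image_domain_edges(results: dict) -> Counter:
    """Count how many matching pages link each image to each web domain."""
    edges = Counter()
    for image_id, response in results.items():
        # Images without any web match simply contribute no edge.
        pages = response.get("webDetection", {}).get("pagesWithMatchingImages", [])
        for page in pages:
            url = page.get("url")
            if url:
                edges[(image_id, domain_of(url))] += 1
    return edges


# Hypothetical, hand-written responses for three images (not real data).
sample = {
    "img_001": {"webDetection": {"pagesWithMatchingImages": [
        {"url": "https://www.example.com/post/1"},
        {"url": "https://example.com/post/2"},
        {"url": "https://blog.example.org/a"},
    ]}},
    "img_002": {"webDetection": {"pagesWithMatchingImages": [
        {"url": "https://blog.example.org/b"},
    ]}},
    "img_003": {"webDetection": {}},
}

edges = image_domain_edges(sample)
for (image_id, domain), weight in sorted(edges.items()):
    print(image_id, domain, weight)
```

The resulting edge list (image, domain, weight) is a bipartite network that could then be exported to a network analysis tool; how domains are normalised is a design choice that shapes what the network makes visible.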
The context and content of the network
The context of this network is ongoing research on bot activity in social media
platforms, which started in late 2017 through close observation and exploratory data
analysis of botted accounts on Instagram, or instabots. The first step was to identify
botted accounts. To do so, I carried out three investigative activities: i) mapping,
describing and using applications that boost engagement on Instagram; ii)
understanding the black market54 for automated engagement through botted accounts,
52 My “bot” journey started with the study of hashtag engagement on Instagram. I was exploring the
workers’ and the conservative protests in Brazil, March 2017. Through visual exploration of the
network, cross-analysed with a list of the most active unique users of hashtags, I detected not
only actors taking a position on the polarised protests but also political far-right botted accounts
(see blog post here: https://thesocialplatforms.wordpress.com/2017/12/21/insta-bots-andthe-black-market-of-social-media-engagement/). In March 2017, pro-Bolsonaro bot accounts were
using the main hashtags of the protests to get visibility in the debate, but also using a list of specific
hashtags that would point to Jair Messias Bolsonaro as the future president of Brazil (see the co-tag
network here: https://thesocialplatforms.files.wordpress.com/2017/04/insta-tarde_15m-2017foratemer-grevegeral-diretasjacc81-modularity-zoom-ok-final-version1.jpg?w=1024). As a result, I
started questioning the role of these automated beings in Instagram engagement and their possible impact
on the Brazilian General Elections in 2018. Later, mainstream media and academic work showed how heavily
the strategic political campaign of the presidential candidate had invested in automation.
53 In these types of networks, images can be captured by different data collection techniques and entry
points, such as keywords, hashtags, websites, or a list of social media username accounts.
54 Black market here is understood in the sense of subverting the official terms of use and data policies
of social media platforms; hidden automated practices that drive fake followers and engagement (likes
and comments) to accomplish a particular purpose in a non-authentic manner. For instance, the spread
of politically controversial ideas or misinformation, or making and maintaining popular profiles.
thus cross-analysing the functioning of applications with the modes of bot engagement
through these apps and on the platform; and finally, iii) the exercise of tracing botted
accounts by purchasing engagement (likes and comments) and followers (see Omena,
2017). After this and other exploratory studies, the username pattern proved to be
valuable as a starting point for identifying social bots on Instagram. This confirmation
came from analysis that looked closely at username patterns and their relationship with
accounts’ visual content (whether the account contains posts), profile info (whether it has a profile
image, whether it is private or public, whether there is a discrepancy between the number of followers
and following) and the pace of commenting.
The username characteristics were thus identified through two bot detection techniques:
forcing bots into action and exploratory data analysis (see Figure 2.1). The former reveals
the visible bots, those immediately detected after using hashtag lists or after the
purchase of engagement metrics and followers. These bots’ usernames would either
be weird (e.g. swph965, __beta__1, awesome.vs.amazing) or try to mimic real
accounts (e.g. sabrinaejr, williamc_clarke, b.ianca._) and thus be hard to differentiate from
the accounts of human users in a dataset55. In the latter technique, usernames were often
constituted by a sequence of numbers accompanied by letters or underscores (e.g.
6151, 98.00715, ______160.0cm, 6.13kkk, _0318_m), among other atypical
combinations. Clusters of these botted profiles could only be identified when
working with spreadsheets (by filtering usernames from A to Z or Z to A) or in the
processes of exploratory data analysis/visualisation and visual network analysis56.
Detecting this type of botted account required some experience in the practices of digital
methods as well as previous knowledge of bots’ culture of use on Instagram.
Considering they are not as easily detected as when purchasing engagement, I am referring to these
accounts as hidden bots.
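The username pattern of hidden bots described above can be approximated with a simple heuristic. The sketch below is only an illustration under stated assumptions, not the detection procedure used in this research: the rule (at least three digits, and at least as many digits as letters), the threshold `min_digits` and the function name `digit_heavy` are hypothetical, and any flagged account would still need to be checked against profile information and visual content. Sorting the list mimics the A to Z ordering of a spreadsheet, which brings similar usernames next to each other.

```python
def digit_heavy(username: str, min_digits: int = 3) -> bool:
    """Flag usernames made mostly of digits (hypothetical rule of thumb)."""
    digits = sum(ch.isdigit() for ch in username)
    letters = sum(ch.isalpha() for ch in username)
    return digits >= min_digits and digits >= letters


# Usernames quoted in the text: number-based ones and ones mimicking real accounts.
usernames = [
    "6151", "98.00715", "______160.0cm", "6.13kkk", "_0318_m",
    "sabrinaejr", "williamc_clarke", "b.ianca._",
]

# Sorting clusters similar patterns together, as in a spreadsheet column.
flagged = [name for name in sorted(usernames) if digit_heavy(name)]
print(flagged)
```

Usernames that mimic real accounts pass through this filter unflagged, which is consistent with the point made above: pattern matching is a starting point for identification, not a substitute for the researcher's acquaintance with the medium.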
55 Unless one opts to filter the most active unique users in the spreadsheet, a technique that
works well for politics-related content, at least in the Brazilian context.
56 Watch other bot detection techniques here:
https://www.youtube.com/playlist?list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM
Figure 2.1. Two ways of detecting social bots.
The agency of instabots not only influences and shapes public and political debates
(Murthy et al., 2016; Woolley, 2016; Woolley & Howard, 2016) but plays a role in
the process of interpreting data. For instance, when working with
hashtag-based content networks, instabots probably influence the shape of the
network, indirectly affecting its interpretation. Although bots on Instagram are usually
associated with celebrities, photographers, marketing professionals and influencers,
through username-pattern (combined with profile image) and data exploratory
analysis57, botting activity was efficiently detected in the context of health studies,
politics and demonstrations. For instance, cross-platform studies concerning Zika
Virus and Dengue hashtag-related content (Rabello et al., 2018); polarised political
protests in Brazil; for example, in 2016 the demonstrations of pro/anti impeachment
of Dilma Rousseff and in 201758 the stay/get out Michel Temer protests (see Omena
et al. 2017; Omena, Rabello & Mintz, 2020); and, when exploring hashtag networks
in the context of Brazilian presidential elections in 2018 or investigating the following
57
See some examples of using username and image profile as indicators to detect instabots through
Google Spreadsheets:
https://www.youtube.com/playlist?list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM
58
See slides presentation available at: https://www.slideshare.net/jannajoceli/why-look-at-socialmedia-apis-81702316
network of Bolsonaro non-official campaign accounts and their image profile in
201959.
This research agenda was later expanded to automated agency on Tumblr, particularly
porn bots. In 2019, together with two research collaborators, namely Jason Chao and
Elena Pilipets, we combined our different backgrounds and expertise (coding, digital
methods and media theory) to enhance the previous work by proposing an innovative and
replicable conceptual-theoretical-practical framework to address bot engagement on
Instagram and Tumblr (see Omena, Chao, Pilipets et al., 2019)60.
Regarding the content of the network, there are three key constituent elements: the
visual content posted by what is identified here as hidden bots, the logic of
spatialisation of the graph layout algorithm (ForceAtlas2) and computer vision61
potentialities of Google Cloud’s services – namely searching on the web for full
matching images. This feature thus provides a different type of network from those
based on the image classification capacity of vision APIs; the so-called computer
vision image-label networks, which have been gaining space in digital research over
the years by allowing the interpretation of a network of images and their descriptive
labels (Colombo, 2018; Geboers & Van De Wiele, 2020; Mintz et al., 2019; Omena,
Rabello & Mintz, 2017; Omena & Granado, 2020; Ricci, Colombo, Meunier, & Brilli,
2017; Silva et al., 2020).
I am, however, addressing the unknown and unexplored computer vision image-domain networks that allow the interpretation of a network of images and their
respective sites of circulation (through the detection of web pages in which fully
matched images appear). This type of network offers particular visual interpretations
of the images that stick within or flow out of digital platforms. When using Google
Vision to search for full matching Instagram and Tumblr bot images across the web,
59
See blog posts available at https://thesocialplatforms.wordpress.com/2018/10/22/elenao-vs-elesim/
and https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/
60
This collaboration has grown, resulting in a working group funded by the Center for Advanced
Internet Studies (CAIS), Bochum, Germany (https://www.cais.nrw/arbeitsgemeinschaften/criticalframework-for-investigating-bot-engagement/).
61
Object recognition, identification and detection reflect one of the most thriving fields of computing
and artificial intelligence (Porikli, Shan, Snoek, Sukthankar, & Wang, 2018), for example serving
national governments with face recognition as a means of identifying terrorists or big tech companies with
automated content moderation (e.g. adult, violence, offensive or unwanted content). The affordances
of computer vision have also been repurposed for social and medium research.
one can identify whether the images (extracted from these platforms) were found only
on the Instagram (or Tumblr) environment (sticking within the platform) or have flown
out of it, reaching other web environments such as blogs, news media or social media
platforms. In addition, researchers can identify clusters of images showing up at
specific domains, as well as clusters of different domains sharing the same image. This is a
multi-perspective study, which by default offers general and detailed perspectives on
the visual content shared by both dominant link domains (e.g. social media) and
clusters of link domains (e.g. mainstream and local media, non-profit organisations)
and on the modes of circulation of images across platforms; moreover, it clearly exposes
platform specificities and cultures of use (see Omena et al. 2019, Omena et al. 2020).
It is crucial, however, to know in advance where the entry list of images comes
from (e.g. hashtagged content, newsfeed of specific social media accounts, search
engine results) and how Google’s vision API suggests a list of pages or URLs in which
images are found (e.g. based on Google Image Search ranking systems and its
Knowledge Graph).
Building a computer vision-based network (being acquainted with computational
mediums)
The making of this network (in collaboration with Jason Chao and Elena Pilipets)
required some time in order to obtain the final sample (see Omena, Chao, Pilipets et
al., 2019) (see Figure 2.2). First, we had to define our entry points to scrape data.
Since bot detection on Instagram was already established, it was then necessary to
discover, test and adapt the forms of detecting hidden bots on Tumblr. After all, these
platforms have different cultures of use. The observation and monitoring of the notes
section was therefore defined as an exclusive strategy for bot detection on Tumblr. While the
exploration of co-tag networks served well for both platforms, some exploratory data
analysis techniques served only Instagram, for instance: using Excel
filters to verify account names; visualising the most active unique users (in mentioning
hashtags62); or searching for profile image repetition or profiles without images.
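The Instagram-only checks listed above (most active unique users, repeated profile images, profiles without images) can be sketched with a few counting operations. The rows, column names and function names below are hypothetical and stand in for whatever tabular export is available; this is an illustration under those assumptions, not a reproduction of the tools we used.

```python
from collections import Counter

# Hypothetical rows, as they might appear in a tabular export of hashtag users.
rows = [
    {"username": "user_a", "profile_image": "img1.jpg"},
    {"username": "user_b", "profile_image": "img1.jpg"},
    {"username": "user_a", "profile_image": "img1.jpg"},
    {"username": "user_c", "profile_image": ""},
    {"username": "user_a", "profile_image": "img1.jpg"},
]


def most_active(rows, top=3):
    """Count how many rows (e.g. hashtag mentions) each account produced."""
    return Counter(r["username"] for r in rows).most_common(top)


def repeated_profile_images(rows):
    """Map each profile image to the distinct accounts using it, if more than one."""
    accounts = {}
    for r in rows:
        if r["profile_image"]:
            accounts.setdefault(r["profile_image"], set()).add(r["username"])
    return {img: users for img, users in accounts.items() if len(users) > 1}


def without_profile_image(rows):
    """List accounts that have no profile image at all."""
    return sorted({r["username"] for r in rows if not r["profile_image"]})


print(most_active(rows))
print(repeated_profile_images(rows))
print(without_profile_image(rows))
```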
62
The former Instagram API Platform allowed third parties to access this type of information
through the no longer available DMI tools, namely Instagram Hashtag Explorer, later renamed Visual
Tagnet Explorer (both developed by Bernhard Rieder), which produced a tabular file with information on
the users related to the use of a particular hashtag.
Scraping data was the second step, with results close to five hundred thousand media
items; the storage and initial analysis of the data relied on distributed
computing and the use of a virtual machine in the cloud – Google Cloud Storage. To
make the project practically viable, we worked with the 30 most recent posts published
by each botted account. Eventually, we found that some accounts no longer existed while
others had changed their usernames, which, as expected, limited the data collection
(see the final sample details in the visual protocol below)63.
Figure 2.2. Research diagram protocol for building a vision API-based network based on
Google’s Web Page Detection (full matching images) module.
During the next step (figure 2.2), relying on the scraper results of the query for hidden
bots, we selected the available list of post-image URLs, and then requested the module
Web Entities and Pages from the Google Vision API, asking precisely for the feature full
matching images. This informed us of where each image had appeared on the Web,
63
One may argue that following this very same research procedure, outside a collaborative research
environment, becomes an unfeasible practice. However, there are alternative scripts available and
many of them may be found on GitHub; basic skills in running scripts (e.g. in Mac’s Terminal or
Anaconda) and a bit of curiosity could be the solution. In that case, for instance, one may combine
Memespector (Rieder, 2017) for using Google’s Vision API and Image-network plotter (Mintz, 2018)
for plotting the images of the network into SVG files in order to build her own computer vision
network.
searching for a maximum of 1000 different URLs per image. Before converting our
comma-separated values (CSV) file into a graph file format, such as graphic data files
(GDF) or graph exchange XML format (GEXF), some data exploration of the dataset had
to be conducted64.
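To make this step more concrete, the sketch below turns web detection results into the image-to-page rows of a CSV file. It assumes responses shaped like the JSON returned by the Vision API web detection feature, where each entry of `pagesWithMatchingImages` may carry a `fullMatchingImages` list. The sample data, URLs and function names are invented for illustration, and no request is sent to the API.

```python
import csv
import io


def full_match_pages(web_detection):
    """Return URLs of pages that contain at least one full matching image."""
    pages = web_detection.get("pagesWithMatchingImages", [])
    return [p["url"] for p in pages if p.get("fullMatchingImages")]


def to_csv(results):
    """Write one (image_url, page_url) row per full match and return CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image_url", "page_url"])
    for image_url, web_detection in results.items():
        for page_url in full_match_pages(web_detection):
            writer.writerow([image_url, page_url])
    return buffer.getvalue()


# Invented sample: one bot image, one page with a full match, one with a partial match.
sample = {
    "https://example.org/bot/img1.jpg": {
        "pagesWithMatchingImages": [
            {"url": "https://www.instagram.com/p/abc/",
             "fullMatchingImages": [{"url": "https://example.org/bot/img1.jpg"}]},
            {"url": "https://example.net/post/1",
             "partialMatchingImages": [{"url": "https://example.net/thumb.jpg"}]},
        ]
    }
}

print(to_csv(sample))
```

Only the page with a full match is kept, so the resulting table holds exactly the pairs that later become edges between image nodes and link domains.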
Since we had tracked some porn bots, one of the objectives was to verify the existence
of porn websites. To do so, only the images that hit 10 different URLs were selected.
From this list, we identified the unique link domains and their frequency of occurrence.
DMI Harvest Tool65 was used to identify the unique link domains66: a total of 4,249
unique hosts. After that, we followed a protocol of “search as research” for detecting
porn websites: by querying the spreadsheet (image URL column) with the keywords
“porn” and “sex”, 125 unique porn hosts were identified. When verifying the
frequency of occurrence of unique link domains, beyond social media and image
repositories, the porn domains appeared with a modest frequency of occurrence (a
total of 809) compared to the total occurrences in the dataset (25,862). This
exploratory exercise provided us with extra features to be included as node attributes
in the bipartite network.
• Node type 1: image. Attributes: full_matching_image_count, to size the images
within the network, and username, to identify the botted account responsible for
uploading one or more images.
• Node type 2: link domain. Created attribute: isPorn, in order to use a different colour
for porn websites.
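The exploratory exercise behind these attributes (unique hosts, their frequency and the keyword query) can be sketched as follows. The research itself relied on the DMI Harvest Tool and on spreadsheet queries; the code below is an assumed equivalent using Python’s standard library, with invented URLs and hypothetical function names.

```python
from collections import Counter
from urllib.parse import urlsplit

# Keywords used in the "search as research" query described in the text.
KEYWORDS = ("porn", "sex")


def host_of(url):
    """Extract the host (link domain) from a page URL."""
    return urlsplit(url).hostname or ""


def host_frequencies(page_urls):
    """Count how often each unique host occurs in the list of page URLs."""
    return Counter(host_of(u) for u in page_urls)


def is_porn(host):
    """Keyword-based flag, analogous to the isPorn attribute (a heuristic)."""
    return any(k in host.lower() for k in KEYWORDS)


# Invented page URLs for illustration.
page_urls = [
    "https://www.instagram.com/p/abc/",
    "https://www.instagram.com/p/def/",
    "https://example-porn-site.test/gallery/1",
]

frequencies = host_frequencies(page_urls)
for host, count in frequencies.most_common():
    print(host, count, is_porn(host))
```

A plain substring match also catches unrelated hosts (a domain containing "essex", for instance), so the flagged list still requires manual verification.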
64
CSV is a multiplatform format often present in digital methods, as are the GDF and GEXF files used
for visual network analysis. Geographic data files (GDF) are commonly used for the creation and
structuring of road network data (Open Street Map Wiki, 2017), but GDF is also a file format that stands for
‘graph dataset format’, used by GUESS (Adar & Kim, 2007), an exploratory data analysis software for
networks, and Gephi. GEXF was created by the Gephi community project as a mature language for
“describing complex network structures”; by using extensible markup language (XML), GEXF is both
extensible and suitable for real specific applications (Gephi Community Project, 2009). Visual
network analysis projects such as GraphRecipes and MiniVan work with the GEXF file format.
65
https://wiki.digitalmethods.net/Dmi/ToolHarvester
66
See this list here:
https://docs.google.com/spreadsheets/d/1DDmiDVYEy3kb1CHIB8KHB9Fi0WQEMsjDjftmP0lIoko/
edit?usp=sharing
The final step was to download the GEXF created with Table2Net67, upload
it to Gephi and spatialise the network using ForceAtlas2. The next challenge
would be to address the question of reading the network.
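For readers without access to Table2Net, the conversion of a two-column table into a bipartite GEXF file can be approximated with the Python standard library. The sketch below is not Table2Net’s implementation; the function name, the attribute ids and the sample rows are assumptions, and only the node type and the isPorn flag are written as attributes.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def bipartite_gexf(rows, porn_hosts=()):
    """Build GEXF text from (image_id, link_domain) pairs."""
    gexf = ET.Element("gexf", xmlns=GEXF_NS, version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute", id="0", title="type", type="string")
    ET.SubElement(attrs, "attribute", id="1", title="isPorn", type="boolean")
    nodes = ET.SubElement(graph, "nodes")
    edges = ET.SubElement(graph, "edges")
    seen = set()

    def add_node(node_id, node_type):
        # Each image or link domain becomes a node only once.
        if node_id in seen:
            return
        seen.add(node_id)
        node = ET.SubElement(nodes, "node", id=node_id, label=node_id)
        values = ET.SubElement(node, "attvalues")
        ET.SubElement(values, "attvalue", {"for": "0", "value": node_type})
        if node_type == "domain":
            flag = "true" if node_id in porn_hosts else "false"
            ET.SubElement(values, "attvalue", {"for": "1", "value": flag})

    for i, (image, domain) in enumerate(rows):
        add_node(image, "image")
        add_node(domain, "domain")
        # An edge records that the image appeared on that link domain.
        ET.SubElement(edges, "edge", id=str(i), source=image, target=domain)
    return ET.tostring(gexf, encoding="unicode")


# Invented sample rows.
rows = [("img1", "instagram.com"), ("img1", "pinterest.com"), ("img2", "instagram.com")]
gexf_text = bipartite_gexf(rows)
print(gexf_text)
```

The resulting text can be saved with a .gexf extension and opened in Gephi, where the type attribute distinguishes images from link domains.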
Reading digital networks (orders of technical-practical thoughts)
After passing through some layers of technical mediation, what we see when looking
at the network in Gephi is a second order of grammatisation. The final visual metaphor
neither entirely represents Instagram and Tumblr’s grammatisation nor quite exposes
the Google Vision API in its full potential. The scraping of grammatised actions (image
URLs pertaining to public social media botted accounts) linked to one technical element
of the vision API (web detection) was transformed into a new corpus, which passed
through another modification, due to software affordances and exploratory data
analysis, turning itself into another corpus: a graph file format which gains shape
through the work of a force-directed algorithm (ForceAtlas2) and life through its
interpretation, results and impact to come.
Only then is this computer vision image-domain network ready to be interpreted, to
respond to research questions (either those previously asked or new ones to come),
and to provide findings. It is indeed an oligoptic vision of bot activity and respective
image circulation, resulting from the digital methods approach.
Within this framework, the most difficult challenge before starting the process of
reading the network was precisely the accurate comprehension of the potential
narrative thread afforded by the work of ForceAtlas2 (Jacomy et al., 2014). The main
argument is that ForceAtlas2 offers fixed layers of interpretation (centre, mid-zone,
periphery, and isolated elements) and multiple forms of reading (see Omena & Amaral,
2019; Omena et al., 2019). The basic principles of this technique account for the work
of ForceAtlas2 in action combined with empirical evidence based on several analyses I
have conducted over the past years68. ForceAtlas2 balances attraction between connected nodes against repulsion
by degree (the total of connections a node has received or made). “The force-directed
drawing has the specificity of placing each node depending on the other nodes. This
67
https://medialab.github.io/table2net/
68
Networks (either monopartite or bipartite) constituted by different digital records (hashtags, images,
links, posts, comments, users, etc).
process depends only on the connections between nodes. Eventual attributes of nodes
are never taken into account” (Jacomy et al., 2014, p. 2).
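The interplay of attraction and repulsion by degree can be illustrated with a deliberately simplified sketch. It follows the basic forces described by Jacomy et al. (2014), with attraction proportional to the distance between connected nodes and repulsion proportional to (deg + 1)(deg + 1) divided by the distance. It leaves out gravity, adaptive speed and the Barnes-Hut optimisation, so it should not be read as Gephi’s implementation; the function names and the `speed` and `k_r` values are assumptions.

```python
import math


def repulsion(deg_a, deg_b, dist, k_r=1.0):
    """Repulsion by degree: better connected nodes push others further away."""
    return k_r * (deg_a + 1) * (deg_b + 1) / dist


def step(pos, edges, k_r=1.0, speed=0.01):
    """Apply one simplified layout iteration and return the new positions."""
    degree = {n: 0 for n in pos}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    force = {n: [0.0, 0.0] for n in pos}
    nodes = list(pos)
    # Every pair of nodes repels, whether connected or not.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = repulsion(degree[a], degree[b], dist, k_r)
            force[a][0] += f * dx / dist
            force[a][1] += f * dy / dist
            force[b][0] -= f * dx / dist
            force[b][1] -= f * dy / dist
    # Only connected nodes attract; the pull grows linearly with distance.
    for a, b in edges:
        dx = pos[b][0] - pos[a][0]
        dy = pos[b][1] - pos[a][1]
        force[a][0] += dx
        force[a][1] += dy
        force[b][0] -= dx
        force[b][1] -= dy
    return {n: (pos[n][0] + speed * force[n][0],
                pos[n][1] + speed * force[n][1]) for n in pos}


def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


edges = [("image", "domain")]
far = step({"image": (0.0, 0.0), "domain": (10.0, 0.0)}, edges)
near = step({"image": (0.0, 0.0), "domain": (1.0, 0.0)}, edges)
print(distance(far, "image", "domain"), distance(near, "image", "domain"))
```

In this toy case, a connected pair placed far apart is pulled together while a pair placed very close is pushed apart, and node attributes play no role at any point, as the quotation above states.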
The detailed description that follows reflects not only a conceptual understanding of
ForceAtlas2 but also some technical practices and, even more, some technical imagination
(see figure 2.3). We thus assume that in the centre of the network we can see the nodes
that gather more diversity and variety in their connections; these are, in some cases,
the most connected nodes (e.g. in the networks of co-occurrences of hashtags) or the
most popular nodes (e.g. networks of recommendation, the case of similar apps or
related videos on YouTube). In the mid-zone, we find influential or bridging actors as
well as empty zones (lack of connections), while the periphery is a space that reveals
different perspectives and particularities either in terms of content or platform
specificity. This area of the network is usually very rich and interesting for analysis.
The isolated elements, where they exist, also deserve our attention.
Figure 2.3. The network of Instagram and Tumblr bots’ image circulation according to a
visual technique for interpreting the narrative affordances of ForceAtlas2. Network built
upon the full matching images feature of Google vision APIs’ web page detection.
Given the certainty of the fixed zones of node positioning within the network, we tried to
make sense of the spatialisation of the network by looking at the relational aspect of the
dataset and how the network was built. However, considering the innovative proposal
of this network, we questioned how we should interpret the different zones of the
network and what we expected to find there. What should we look at? How exactly should we
read the connections between images and links? After all, the link domains within this
network are sites where full matching images have appeared.
Some technical questions concerning particularities of the vision API were addressed to my
colleague Jason Chao. There was also an attempt to compare this
network spatialisation with the computer vision image-label network and others.
However, such a comparison was problematic because image-label networks concern
visual semantic spaces: in the peripheral zone we normally see clusters that point to
particular visualities with a more detailed classification (labels) to describe the images,
whereas a more generic labelling takes place in the centre of the network (see Omena
& Granado, 2020). These networks, moreover, present a more static representation of
the content, contrary to the depiction of those built upon the full matching images
feature of Google vision APIs. The comparative exercise was then extended to
networks of recommendation, such as related videos on YouTube and similar apps on
the Google Play Store. In these latter networks, however, the centre zone is usually occupied by
the most recommended videos or apps within the whole network. In addition,
although they represent a flow of recommendation, both networks are monopartite (whereas we
were dealing with a bipartite network).
In theory, we were aware that the network of Instagram and Tumblr bots’ image
circulation involves movement and a dynamic perspective due to its two types of nodes
(images and link domains) and the way connections were made between nodes (the
appearance of an image in different URLs). In practice, however, we were unable to
move forward to interpret and analyse the network from such a dynamic perspective, even after looking at the network in Gephi, talking with my colleagues and comparing
this with other networks. The solution to the interpretative problem was to both
mentally revisit and visually (using Gephi) go over what constitutes that network: a
step-by-step process, during which we took into consideration: i) the way the network
was made and how it was built (what precedes the network visualisation), and ii) the
network visualisation in itself – what nodes are, how connections are made, and the
role played by ForceAtlas2 (what takes place with and in the network).
At the limit of our thinking capacity, we finally had a clear understanding of what was
known in theory but not yet in practical terms. There was no point in comparing this
network to others; we were getting distracted by the content of the network (different
digital records) and the ways of reading what informs the connections between nodes
(e.g. the existence of co-related hashtags). The “aha moment” finally arrived,
prompted by what I could see: the positioning of the nodes (the anatomy and
functioning of ForceAtlas2) and the shape of the clusters in the periphery zone
(reflecting the list of Instagram and Tumblr image URLs used to run Google Vision
API) (figure 2.4). It was also prompted by technical imagination (reflected in the capacity of
combining the practical qualities of ForceAtlas2 and Google Vision web detection
with the choices made in the query design).
Once again, understanding what precedes the network visualisation, that is, the act of
mentally revisiting each step of the methods and paying close attention to technical
elements (how does Google Vision API’s full matching image detection module function?
how has it added a new layer of meaning to our list of image URLs?), allowed us to
read the network spatialisation. If all images within the network originated from
two platforms, it was reasonable to see two big clusters in the periphery zone
(gathering platform-specific visual content). In other words, if we used a collection of
images from Instagram and Tumblr as the entry point to run Google Vision API, and
considering the number of images, we should expect the vision API to detect the place
of origin of those images (Instagram and Tumblr), particularly because, as popular
references, social media appear at the top layer of Google results. This results in the two
big clusters we see in figure 2.4. Within these clusters, the nodes pointing outward
(and not positioned close to the centre of the network) show the images that had
appeared only within each platform (Tumblr or Instagram), whereas the nodes
positioned closer to the centre point to the imagery that flows out of these
platforms, thus reaching different websites, such as porn hubs (in red). Following this
mindset, the small clusters we see connected only with Tumblr point to images that
also flow out of the platform, but they hit link domains other than those located at the
centre of the network. Some of these are specific porn websites, such as those devoted
to Asian porn or teenager pornography (see figure 2.5).
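The distinction between images that stick within a platform and images that flow out of it can also be expressed directly on the edge list, independently of the layout. The sketch below is an assumed reformulation of that reading rather than a step of the original protocol; the labels "sticks" and "flows", the function names and the sample data are introduced here for illustration.

```python
def on_platform(host, platform_domain):
    """True if the host is the platform domain or one of its subdomains."""
    return host == platform_domain or host.endswith("." + platform_domain)


def stick_or_flow(edges, origin):
    """Label each image by where its full matches were found.

    edges: (image, host) pairs; origin: image -> platform domain of origin.
    """
    hosts = {}
    for image, host in edges:
        hosts.setdefault(image, set()).add(host)
    return {
        image: "sticks" if all(on_platform(h, origin[image]) for h in found) else "flows"
        for image, found in hosts.items()
    }


# Invented sample: one Instagram image found only on Instagram,
# one Tumblr image also found on an external website.
edges = [
    ("img1", "www.instagram.com"),
    ("img2", "blog.tumblr.com"),
    ("img2", "example.org"),
]
origin = {"img1": "instagram.com", "img2": "tumblr.com"}

print(stick_or_flow(edges, origin))
```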
Figure 2.4. Computer vision image-domain network: reading how the imagery of Instagram
and Tumblr botted accounts travels across domains. Nodes are images and link domains (a
total of 14,788); edges (a total of 33,503) indicate whether a given image happened to appear
in different domains. The visual content was published by what we called hidden bots.
Analysis by Janna Joceli Omena, Jason Chao & Elena Pilipets (see Omena et al., 2019).
Figure 2.5. Images posted by Tumblr hidden bots. Links of a recurring “innocent” imagery
redirect Tumblr users to Asian porn websites (above). Teenager sensual image (below) that
has appeared both within Tumblr and in teen porn websites.
Figure 2.6. Understanding the narrative affordance of ForceAtlas2 in practice. The network
of Instagram and Tumblr bots’ image circulation. On the right, screenshots highlight link
domains (green nodes), on the left images (pink nodes). (image source:
https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/)
Beyond the stick and flow of image circulation, we further understood that node
positioning can either indicate (dominant) domains capable of gathering a more
diverse range of images - in the centre (e.g. Twitter, Pinterest), or clusters of images
connecting different link domains from both central and peripheral zones (see figure
2.6). In this sense, we were able to raise questions: how images travel across platforms
(what sticks within the platform and what flows out), where they appear (link
domains), and what types of visual content are attached to a given link domain or
clusters of link domains (and vice versa).
After we finally understood the network spatialisation, the analytical process took
place (see figure 2.7) following a navigational research practice and the technique of
visual network analysis (Venturini et al., 2019, 2015), which draws our attention to the
position, size and colour of nodes as key interpretative aspects. Beyond what we could
see on the network, we considered a combination of factors, such as the relational
nature of digital records - images and their origins (platform), authors (botted
accounts) and context (e.g. how bots were detected, time period), the web environment
and the narrative affordance of ForceAtlas2. At this stage, our intervention was crucial
because interpretation means navigating from Gephi overview/data laboratory to
the web environment, and back to Gephi, then moving to the spreadsheet and
looking at a big screen, while making annotations on printed versions of the network69.
A navigational procedure was mandatory, as was our technical awareness.
Figure 2.7. The analytical process and the researcher intervention
69
Here the interpretation of the network was led by a trajectory of technical awareness (see Figure 3.13
in chapter 3, which illustrates the research protocol diagram of this network).
Through the analysis of hidden Instagram and Tumblr bots, considering and comparing
it with the analysis of the purchased bots (see Omena et al. 2019), two modes of bot
agency were detected. The discrete bots exist mainly through the act of giving likes
(sometimes following others); they serve the purpose of boosting engagement
without attracting attention and never create content. The imitative
bots, on the other hand, mimic real people by distributing mainstream content, serving as aggregators of
followers/following and active agents of giving or receiving likes. Apart from that,
another significant finding reveals that Instagram and Tumblr botted accounts are not
only programmed to upload content or to follow trending hashtags in order to gain
visibility, but also follow and respect platforms’ culture of use and specificities
(Omena et al., 2019).
The description of the Instagram and Tumblr bot image network aimed to illustrate the
role of technicity before the full range of digital methods, while also attempting to
describe the use of a technical imagination. It was furthermore an attempt to describe
processes that are hardly ever documented/registered in digital methods literature –
and I am not considering data sprints’ reports here. Although this section may sound
more descriptive and less conceptual in relation to the previous sections, the case
exposed here reflects precisely an example of what it is to think along with a network
of methods, connecting technicities (in mental and practical forms). This background
knowledge constitutes the awareness of the technicity-of-the-mediums where we
assume that researchers are familiar with the digital fieldwork and the technical
practices of digital methods. When in this position, researchers are able to use the
technical mediation and substance inherent to digital methods as a constitutive part of
research, also considering a triangular relationship between software affordances and
platforms’ cultures of use and grammatisation. The next chapter will demonstrate how
technical expertise can specifically contribute to new forms of enquiring by describing
that triangular relationship and suggesting a way of carrying out digital fieldwork.
3 DIGITAL FIELDWORK
CHAPTER 3
The digital is not a land of abundance. It is not a place where
information pours in freely or easily; not a place where
computational tricks, powerful as they may be, can replace the hard
work necessary to mine, nurture and refine inscriptions. Digital
methods do not spare us from walking the walk, but they give us the
chance to experiment new pathways. Tommaso Venturini, Mathieu
Jacomy, Axel Meunier and Bruno Latour, 2017.
Getting acquainted with the web environment
This section presents the technological environment that the digital methods approach
takes as a point of departure to ground claims about social phenomena. It promotes a
technical comprehension of the web environment in line with Venturini and Rogers’
argument that “research through media platforms should always be also research
about media platforms” (Venturini & Rogers, 2019, p. 6). Here we look at the web as
a network of connected (HTML) pages in which particular knowledge of its
infrastructure is required to study social phenomena through the Web. Some
fundamental aspects of the web environment will be discussed from a
methodological viewpoint, paying particular attention to the role of web applications
and application programming interfaces.
The web environment sets the scene for different ways of researching in which scholars
need to adapt their research design in line with what is available in the blogosphere,
Internet archives, social media, search engines, etc. Moreover, they should know how
digital records are made available (methods like hyperlink networks70, crawling and
ranking, recommendation systems, Graph API infrastructure), how they are used - what is
grammatised (social buttons, hashtags, locale), through what types of content (textual,
video, image, gifs, memes), and for what purposes (for political debate, elections,
spread of disinformation, social causes, etc.). It is therefore imperative to understand
the web environment, from its methods and functioning to its forms of use (see Rogers,
2018).
70
See the work of Han Woo Park available at:
https://www.researchgate.net/publication/200772676_Hyperlink_network_analysis_A_new_method_f
or_the_study_of_social_structure_on_the_web
Such sensitivity to the technical architecture of the web is now, more than ever, crucial
in the content, methods, sources and techniques of medium research (D’Andréa, 2020;
Marres, 2017; Rieder & Röhle, 2018; Rogers, 2019; Watts, 2007) and justifies my
attempt to introduce the web environment through its functional modes and particular
relations with digital research and methods.
A technical comprehension of the web: an overview
The World Wide Web (WWW), or simply the Web, was created to be interactive, safe
and synonymous with a decentralised space in which we would all have the right to
express our opinion (Berners-Lee, 1995). While the Web is still, in many ways, a space
for free speech71, the increasing power of private companies such as Alphabet-Google, Apple, Amazon, Facebook and Microsoft (a.k.a. GAFAM) is breaking up this
space into a series of platform walled gardens (Poell, Nieborg, & van Dijck, 2019; Rogers,
2018). The role of web platforms as custodians of the world’s data (Morozov, 2018)
is problematic and complex, implicating many social and economic issues. I want,
however, to emphasise a technical comprehension of the Web as crucial to carrying
out research based on the digital methods approach.
In this spirit, we may start by defining the web as an information system where we can
access research material through Uniform Resource Locators (URLs) or Uniform
Resource Names (URNs). A URL identifies a resource by location (e.g.
https://smart.inovamedialab.org/2021-platformisation/) and a URN by name, for
instance when using the International Standard Book Number (ISBN) to locate a book
(978‐972‐9347‐34‐4)72. Resources identified by Uniform Resource Identifiers (URIs) – either URLs or
URNs – are transferred via an application layer protocol for data communication,
namely the Hypertext Transfer Protocol (HTTP), which is only accessible over the
Internet73 (see Berners-Lee, Fielding and Masinter, 2005). That is to say, every
piece of information stored on the web needs a universal identifier
standardised as a URI. In Figure 3.1 we see the generic URI syntax “which consists
71
Considering that anyone, with Internet access, can create channels or find means to communicate,
but this comes with some costs such as those related to constant surveillance, privacy issues and the
digital labour.
72
World Wide Web (2020). Retrieved September 20, 2020, from
https://en.wikipedia.org/wiki/World_Wide_Web
73
See also: Hypertext Transfer Protocol (2020). Retrieved September 20, 2020, from
https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
of a hierarchical sequence of components referred to as the scheme, authority, path,
query, and fragment” (Berners-Lee et al., 2005) and a few examples of web content
originated from YouTube, Tumblr, Facebook and Twitter.
Figure 3.1. Understanding the generic syntax of URIs.
According to Berners-Lee et al. (2005), a scheme consists of a sequence of characters
beginning with a letter and followed by any combination of letters, digits, plus ("+"),
period ("."), or hyphen ("-"), e.g. https, which stands for Hypertext Transfer Protocol
Secure, an extension of HTTP used for secure communication74. The authority
component, always preceded by two slashes (//), comprises a host such as
youtube.com, but it can also contain optional user information. The Domain Name System
(DNS) also has its own hierarchy, being composed of a hostname (en), a second-level
domain (wikipedia) and a top-level domain (org): en.wikipedia.org (the domain name
is wikipedia.org). The last two are useful analytical tools when using digital methods,
as are country-code top-level domains (ccTLDs), which indicate domain extensions
for regions or countries, e.g. www.aruki.pt.
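The domain hierarchy described above can be read off a URL with a short script. The sketch below uses a naive split on periods, which is an assumption that fails for multi-part suffixes such as .co.uk; the function name and the returned keys are hypothetical, and a scheme is prepended to www.aruki.pt only so that the host can be parsed.

```python
from urllib.parse import urlsplit


def domain_levels(url):
    """Split a host into top-level domain, second-level domain and subdomains."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    top_level = labels[-1]
    return {
        "host": host,
        "top_level": top_level,
        "second_level": labels[-2] if len(labels) > 1 else "",
        "subdomains": labels[:-2],
        # Two-letter top-level domains are country codes (ccTLDs).
        "is_country_code": len(top_level) == 2,
    }


print(domain_levels("https://en.wikipedia.org/wiki/World_Wide_Web"))
print(domain_levels("https://www.aruki.pt"))
```

Grouping a list of link domains by top-level or second-level domain in this way is one route to the analytical uses mentioned in the text.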
The path component, often separated by a slash (/), specifies the unique location of a
resource within a file system. Using social media as an example, the path may show which environment
within YouTube is being accessed (e.g. channels, video) or point to the exact location of an image or
Facebook post (see Figure 3.1). After the path, preceded by a question mark (?)
74
All URI schemes are required to be registered with the Internet Assigned Numbers Authority
(IANA).
highlighted in yellow (Figure 3.1), there can be an optional query component, revealing
that, within the YouTube environment, we are looking at a specific video playlist among
other existing ones. Finally, the optional fragment component indicates what is often
taken as an id attribute of a specific element. The fragment component is another
valuable source for digital methods that helps the researcher identify unique
actors/content in a dataset or explore visual content by using the IMAGE formula in
a Google Spreadsheet75.
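The five components of the generic URI syntax can be inspected with Python’s standard library. In the sketch below, the playlist URL is the one cited earlier in this thesis, while the fragment (#example-fragment) is a hypothetical addition made only to show where that component sits.

```python
from urllib.parse import parse_qs, urlsplit

# The fragment after "#" is hypothetical and added for illustration only.
uri = ("https://www.youtube.com/playlist"
       "?list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM#example-fragment")

parts = urlsplit(uri)
components = {
    "scheme": parts.scheme,       # e.g. https
    "authority": parts.netloc,    # host, optionally with user info and port
    "path": parts.path,           # environment within the platform
    "query": parts.query,         # optional, follows the question mark
    "fragment": parts.fragment,   # optional, follows the hash sign
}

for name, value in components.items():
    print(f"{name}: {value}")

# The query can be decomposed further into keys and values.
playlist_id = parse_qs(parts.query)["list"][0]
print(playlist_id)
```

Applied to a whole column of URLs, the same decomposition yields hosts, paths or identifiers that can serve as units of analysis.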
Media, textual and visual content available on the Web are composed through the
Hypertext Markup Language (HTML) and Extensible Markup Language (XML).
HTML describes the content and presentation of a web document, while with XML this
description and presentation become a text-based format that allows machine
consumption (Helmond, 2015a). Along with HTML, Cascading Style Sheets (CSS) and
JavaScript represent the cornerstone technologies of the web; while CSS serves to
specify the presentation of web pages, JavaScript is used to specify their behaviour
(Flanagan, 2011). CSS is “a stylesheet language used to describe the presentation of a
document written in HTML or XML” (Mozilla Developer Networks, 2020), describing
how elements should be rendered on media, including colours, layout and fonts.
According to the World Wide Web Consortium (W3C), CSS is a standardised language
across browsers which also “allows one to adapt the presentation to different types of
devices, such as large screens, small screens, or printers” (World Wide Web
Consortium, n.d.).
JavaScript is a text-based programming language that played a significant role in the
functional diversification of the web. Created by Netscape in the early days of the web,
JavaScript is the technology that adds dynamic behaviour to web pages
(Flanagan, 2011; Mindfire Solutions, 2017; Mozilla Developers Network, 2020),
and it was thus instrumental in turning the web from a static and informational environment
into a dynamic and interactive one. It supports object-oriented and
functional programming styles for modifying HTML content in response to events/actions,
and is used on both the front end (client side) and the back end (server side) (Coding
Arena, 2018; Mindfire Solutions, 2017). This brief definition explains why JavaScript
75
As a practical example, watch this playlist on detecting instabots:
https://www.youtube.com/watch?v=D1Jo84tfbY&list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM
is known as the programming language of the web and a crucial brick for the
development of web applications.
The web’s dynamic architecture makes web content meaningful not only to its users
(navigating, participating, creating) but also to computers (getting, learning, evolving),
proving to be a revolution in knowledge representation (Anderson & Wolff, 2010) but
also carrying endless challenges for research. In the practices of digital methods, the
value of machine-readable web content (with what is inherent to it, how it works and
to what effects) becomes a crucial technological grammar for driving research, for
instance through the use and knowledge of web crawlers, scrapers and APIs.
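To make the idea of machine-readable web content more concrete, the sketch below shows the core step of a scraper: reading HTML and extracting its hyperlinks. It uses only Python's standard html.parser module; the class name and the sample page are hypothetical, and a real crawler would also fetch pages over the network and respect the site's access rules.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up page standing in for content fetched by a crawler.
sample_html = """
<html><body>
  <p>See <a href="https://en.wikipedia.org/wiki/Hyperlink">this entry</a>
  and <a href="/about">our about page</a>.</p>
</body></html>
"""

extractor = LinkExtractor()
extractor.feed(sample_html)
print(extractor.links)
```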
From the viewpoint of methodology, a technical understanding of the web
environment comprises one final element: the fractal design and layered structure
of the web (Figure 3.2). In 1995, Tim Berners-Lee explained the fractal design of the
web at the MIT/Brown Vannevar Bush Symposium, describing web pages as
“representative of people, organizations, or concepts” and affirming that we all
need to be part of the fractal pattern (Berners-Lee, 1995). This is a structure
containing connections of widely varying “length” but with involvement at all scales.
To exemplify this idea, he refers to the MIT building as a system in which “the whole
operates because the parts interoperate, and the way the whole works and whether the
whole works is defined by how the parts interoperated” (Berners-Lee, 1995). This
representation of the web becomes clearer through Franck Ghitalla and
Mathieu Jacomy’s76 work, which is inspired by network science and graph theory
(Figure 3.2).
76
In February 2019, Mathieu Jacomy presented the argument of the web as layers at the
Digital Media Winter Institute at Universidade Nova de Lisboa, where he was tutoring a workshop about
Networks of (dis)information (see https://smart.inovamedialab.org/workshops/2020_networks-ofdisinformation/). The author shared a list of videos with all workshop participants, in which one
can watch his lecture on the web as layers, available at
https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b-887f-ab53007ef5c4.
Figure 3.2. At the top, a screenshot of Tim Berners-Lee’s lecture on Hypertext and
Our Collective Destiny at the MIT/Brown Vannevar Bush Symposium in 1995 (30m)77. At the
bottom, a screenshot of Mathieu Jacomy’s lecture at Aalborg University in December 2019,
where he presents the notion of the web as layers78.
Like Berners-Lee’s illustration, Ghitalla and Jacomy (2019) explain the map of
the web from the inside, rather than from above. For the authors, the web is made up of
four layers. When web browsing or crawling for research purposes, we will first
77
Source: https://www.dougengelbart.org/content/view/258/000/
78
Source: https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b-887f-ab53007ef5c4
encounter the most well-known websites or platforms constituting the surface of the
web (first layer), such as Google, Wikipedia, social media or national institutions.
The aggregates come right after the most cited websites or platforms; this is where we
see homophily, or the bonding with similar others. Here we may find more
specialised content/actors, divided into a high layer of aggregates (the core,
second layer) and a lower one (the periphery, third layer), followed by an extremely
specialised zone, also known as the deep web (fourth layer). That adds to our
understanding that “on the web, anyone can have a voice, but it does not mean that
everyone will have the same impact” (Ooghe-Tabanou et al., 2018, p. 16). A key
structural element of this web hierarchy is the hyperlink, through which web
information has to be connected in order to become known and to gain an audience,
as Benjamin Ooghe-Tabanou and colleagues (2018) explain.
But why does the notion of the web as layers matter? From the standpoint of digital
research, if one wants to study what happens within the web environment to better
understand society, one needs to understand how information/content is structured
within this environment. As explained by Jacomy (2019), we can understand the
different layers of the web and their properties through search engine results (first
results, specific query results, unindexed content), the visibility of links (known to
all, known to amateurs and experts, forgotten) and the content (generic, specialised).
Paying attention to the notion of the web as layers also helps us to understand the role
of hyperlinks as a “data-rich analytical device” (Helmond, 2013; Ooghe-Tabanou et al.,
2018). The case presented in chapter 2 to study social bots also serves to answer
that question. Through the analysis of a computer vision API-based network, we
identified social media platforms (in particular Twitter and Pinterest) and blog sites
in the first layer of the network. That means that fully matching images posted by Instagram
and Tumblr bots were detected by the Google Vision API at the surface of the web. The
second layer showed us two main poles constituted by the studied platforms (Instagram
and Tumblr), exposing the imagery that sticks within or flows out of these
platforms. This zone was followed by a more isolated zone within the network where
we saw specific porn websites (third layer), such as Asian girls and teen porn websites.
Although it passes through some layers of technical mediation, the network reflects the
layered web.
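The link between hyperlinks and visibility can be illustrated with a toy computation. The sketch below counts how often each site is cited in a small, entirely hypothetical hyperlink network and ranks the sites accordingly. It is only a crude proxy for the idea that the most cited sites sit at the surface of the web; it is not the method used by Ghitalla and Jacomy, and it does not assign sites to the four layers.

```python
from collections import Counter

# Hypothetical hyperlink network: (source site, cited site).
hyperlinks = [
    ("blog-a.example", "wikipedia.org"),
    ("blog-b.example", "wikipedia.org"),
    ("forum.example", "wikipedia.org"),
    ("blog-a.example", "blog-b.example"),
    ("forum.example", "blog-b.example"),
    ("blog-b.example", "forum.example"),
]

# In-degree: how many times each site is cited by others.
in_degree = Counter(target for _, target in hyperlinks)
sites = {site for link in hyperlinks for site in link}

# Rank from most to least cited; ties are broken alphabetically.
ranking = sorted(sites, key=lambda site: (-in_degree[site], site))
for site in ranking:
    print(site, in_degree[site])
```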
An architecture of participation: the role of web applications and APIs
Tim O’Reilly’s definition of Web 2.0 offers another important technical
understanding of the web. O’Reilly describes it as a network of connected devices
empowered by human and non-human forms of use. Anticipating later research
lines in Media Studies, such as Tarleton Gillespie’s The Politics of Platforms, published
in 2010, O’Reilly calls our attention to the multi-sided publics of the web (including
users) and to the modes of data/service centralisation and decentralisation belonging to web
applications.
“I said I’m not fond of definitions but I woke up this morning
with the start of one in my head: Web 2.0 is the network as
platform, spanning all connected devices; Web 2.0
applications are those that make the most of the intrinsic
advantages of that platform: delivering software as a
continually updated service that gets better the more people use it,
consuming and remixing data from multiple sources,
including individual users, while providing their own data and
services in a form that allows remixing by others, creating
network effects through an “architecture of participation”,
and going beyond the page metaphor of Web 1.0 to deliver
rich user experiences” (O’Reilly 2005)
In 2005, this compact definition of the web perhaps did not serve as a reference for
research methods, as research efforts were then invested in other technological
aspects of the web. However, over the years, and after the emergence of social media
APIs with loose restrictions on data access, digital media scholars started to value
both the creation/use of web research apps and the role of APIs as an object of study.
They have shown great interest in comprehending the agency and effects of how
social media platforms deliver software in a dynamic and continually updated format,
while allowing others (third-party applications) to access data and add new
functionalities to the platforms themselves (see Lomborg & Bechmann, 2014).
It is, therefore, important to define web applications and APIs in the context of digital
research. A web application is a software program that uses web technologies (e.g.
JavaScript and HTML) to complete specific tasks and achieve specific outcomes,
such as creating, reading, updating and deleting information (see Gibb, 2016; Microsoft Virtual
Academy, 2018)79. Web apps are accessed through web browsers and run on a web
server, and are thus expected to “be quicker to build, simpler to distribute, and easier
to iterate” (Miller, 2018). That also means that web apps provide interaction and
functionality within the web environment, using a combination of components for their
front-end and back-end interfaces, organised as follows:
Components of a web app in the front-end interface:
- Hypertext Markup Language (HTML): for defining the content and structure of a web page
- Cascading Style Sheets (CSS): for defining the style and layout of a web page, i.e. the presentation of web content
- JavaScript (a text-based programming language): for creating scripts that control the behaviour of a web page, i.e. how an app will dynamically respond to actions
Components of a web app in the back-end interface:
- Database: for storing and organizing an app’s data
- The back-end code of the app: for controlling data access and how it is used
- Web server: for hosting the app and allowing others to access it
Adapted from Microsoft Virtual Academy (see Coding Arena, 2018)
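The back-end components listed above can be illustrated with the four basic operations a web app performs on its data (create, read, update and delete). The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table and column names are hypothetical, and a real app would expose these operations through a web server rather than a script.

```python
import sqlite3

# In-memory database standing in for the app's back-end storage.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

# Create
cursor = connection.execute("INSERT INTO notes (body) VALUES (?)", ("first draft",))
note_id = cursor.lastrowid

# Read
created = connection.execute(
    "SELECT body FROM notes WHERE id = ?", (note_id,)
).fetchone()[0]

# Update
connection.execute("UPDATE notes SET body = ? WHERE id = ?", ("final text", note_id))
updated = connection.execute(
    "SELECT body FROM notes WHERE id = ?", (note_id,)
).fetchone()[0]

# Delete
connection.execute("DELETE FROM notes WHERE id = ?", (note_id,))
remaining = connection.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

print(created, updated, remaining)
connection.close()
```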
Google apps (e.g. Google Docs, Sheets or Slides), Slack, Trello, Pinterest and even
Twitter are some examples of web apps (see Coding Arena, 2018; Miller, 2018). In
the context of digital methods, the creation/use of research apps allows researchers to
carry out web-based studies in the most diverse research areas, mediating the execution
of specific tasks. For example: the retrieval of data from social media APIs with
YouTube Data Tools (Rieder, 2015), but also exploratory data analysis with DMI-TCAT
(Borra & Rieder, 2014) and 4CAT (Peeters & Hagen, 2018); the extraction of
URLs from text, source code or search engine results, producing a clean list of URLs
with the Harvester from DMI Tools; and the making of networks of Wikipedia related
pages based on their “See also” section with Seealsology from DensityDesign &
médialab Sciences Po80.
79
Mobile apps, by contrast, are built for a specific platform (e.g. iOS for the Apple iPhone or Android for a
Samsung device) and run directly on a mobile device.
Web research apps demand our attention not only for what they can offer or do, but
also because they have “epistemic orientations that have repercussions for the production
of academic knowledge” (Borra & Rieder, 2014, p. 2). They open new ways of doing
research and reflecting on media and society.
As data retrieval from social media is a primary task in the practices of digital methods,
we need to define and technically understand web application programming interfaces
(APIs). APIs constrain the way in which developers can create research apps (Sturm,
Pollard, & Graig, 2017); they also gather technologies for capturing and (re)organising
acts and actions within the web environment, to which researchers can gain access.
Even though we recognise that the golden age of social media API-driven research has
passed (see Freelon, 2018; Perriam, Birkbak, & Freeman, 2020)81, attention should be
given to what APIs are, how they work and what we should look at when using digital
methods.
Rick Sturm and colleagues define APIs as “critical software elements supporting data
and/or functionality interchange between diverse entities. They are also utilised for
standardising data access within a software engineering organization to facilitate faster
software development” (Sturm, Pollard & Graig, 2017). In other words, they are a
programming interface and infrastructure that defines interactions between
multiple pieces of software, while allowing the use, access and exchange of data.
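In practice, this exchange often takes the form of a request expressed as a URL with parameters and a response returned as structured data. The sketch below shows both halves without touching the network: the endpoint, the parameter names and the response body are all made up for illustration, and no real platform API is implied.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters; no real platform API is implied.
BASE_URL = "https://api.example.org/v1/posts"
params = {"page_id": "12345", "fields": "id,message,created_time", "limit": 2}
request_url = f"{BASE_URL}?{urlencode(params)}"

# Made-up response body, shaped like the JSON many web APIs return.
response_body = """
{"data": [
  {"id": "1", "message": "Hello", "created_time": "2020-01-01T10:00:00Z"},
  {"id": "2", "message": "World", "created_time": "2020-01-02T11:30:00Z"}
]}
"""

posts = json.loads(response_body)["data"]
messages = [post["message"] for post in posts]
print(request_url)
print(messages)
```

What matters for research is that the fields one can request, and the limits placed on them, are defined by the platform rather than by the researcher.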
APIs respond to the principle of information hiding, that is, the criteria applied to
divide a system into modules, proposed by David Lorge Parnas in 1971 (Parnas, 1971). This
principle “prescribes that software modules hide implementation details from other
modules in order to decrease the dependency between them” (de Souza, Redmiles,
80
See https://wiki.digitalmethods.net/Dmi/ToolHarvester,
https://densitydesign.github.io/strumentaliaseealsology/?fbclid=IwAR1zOtvsfpTU9emI4OtrdbTYq7ka4ROzruk03H3Zjl-HOaOlzcq5cmCSZnA,
https://medialab.github.io/graph-recipes/#!/upload
81
Social media APIs have tight restrictions and offer very limited access to public data. As a response to
Facebook’s restrictive API measures, Anja Bechmann (the director of DATALAB – Center for Digital
Social Research) created an Open Doc to list publications that could not have existed without access
to API data. The list is available here:
https://docs.google.com/document/d/15YKeZFSUc1j03b4lW9YXxGmhYEnFx3TSy68qCrX9BEI/edit?usp=sharing
Cheng, Millen, & Patterson, 2004, p. 1). By being constituted by public and non-public
(or private) properties, the API infrastructure can separate function from implementation.
The public properties of APIs are visible to the client and should include the
specifications of functionality, whereas the non-public properties must be kept secret,
following the criteria of decomposability and composability (Meyer, 1998). For
instance, let us “assume a module changes, but the changes apply only to its secret
elements, leaving the public ones untouched; then other modules who use it, called its
clients, will not be affected” (p.51).
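The scenario in the quotation can be sketched in a few lines of Python. The two classes below share the same public methods but store their data differently, and the client function keeps working with either one. All names are hypothetical, and the leading underscore is only a Python convention for marking a detail as non-public, not an enforced secret.

```python
class PostStore:
    """Public interface: add() and count(). Storage details stay private."""

    def __init__(self):
        self._items = []  # private detail; clients should not rely on it

    def add(self, post: str) -> None:
        self._items.append(post)

    def count(self) -> int:
        return len(self._items)


class PostStoreV2(PostStore):
    """Same public interface, different hidden implementation (a dict)."""

    def __init__(self):
        self._items_by_id = {}

    def add(self, post: str) -> None:
        self._items_by_id[len(self._items_by_id)] = post

    def count(self) -> int:
        return len(self._items_by_id)


def client(store: PostStore) -> int:
    """A client that only uses the public methods."""
    store.add("a post")
    store.add("another post")
    return store.count()


print(client(PostStore()), client(PostStoreV2()))
```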
Such an infrastructure is also recognised as the open-closed principle of APIs because
software modules should be simultaneously open (for extension and adaptation) and
closed (to avoid modifications that affect clients) (Meyer, 1998), in order to facilitate
specific operations and characteristics such as interoperability (when two or more
applications cooperate, exchanging and making use of information) and modularity
(the capacity of breaking up applications into modules in a way that they can be
recombined) (see Bucher, 2013).
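A minimal sketch of this principle, under the assumption of a hypothetical export module: the original class is left untouched (closed), a new format is added by extension (open), and the client code works with both.

```python
class Exporter:
    """Closed for modification: clients rely on export() keeping this contract."""

    def export(self, rows: list) -> str:
        return "\n".join(",".join(str(v) for v in row) for row in rows)


class TabExporter(Exporter):
    """Open for extension: a new format added without editing Exporter."""

    def export(self, rows: list) -> str:
        return "\n".join("\t".join(str(v) for v in row) for row in rows)


def save_report(exporter: Exporter, rows: list) -> str:
    """Client code that works with the original class and any extension."""
    return exporter.export(rows)


rows = [["page", "likes"], ["A", 10]]
print(save_report(Exporter(), rows))
print(save_report(TabExporter(), rows))
```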
We can look at the Facebook Graph API and the Netvizz application (Rieder,
2013) (Figure 3.3) as a practical example of web APIs and apps in the context of digital
methods. Netvizz was created in 2009 by Bernhard Rieder and worked as a
research tool for almost ten years. When embedded into Facebook, the application
offered a service to the platform’s users, adding new functionality such as the retrieval
of link stats and data from Pages, Groups and users’ like or friendship connections (see
Figure 3.3). Over the years, and more specifically after the Cambridge Analytica scandal
in 2018, the restrictions increased. For example, access to user reactions to
Facebook Public Pages’ posts (user-post bipartite graph) and the ability to identify Page fans per
country were suspended, closing a rich scenario for the study of social phenomena82.
82
In 2018, Jorge Martins Rosa, Daniel Cardoso and I used a user-post bipartite graph to guide content
analysis. We analysed comments: those made by commenters from the periphery (a total of 242,
34.42%) and those made by commenters from the centre of the network (a total of 461, 65.58%). The
study focused on the reactions to a Facebook post, published on July 9, 2017, in which the admins of the
Page revealed their names. The page (Os Truques da Imprensa) used to publish critical remarks on the
news of Portuguese national media, often generating heated debates on the platform, either being
praised for its role as a watchdog, or discredited as allegedly serving as the spokesperson for a hidden
political agenda. Our goal was to evaluate the debate generated by this post, particularly concerning
the polarisation of positions and arguments among those that engaged with the post. The article is
available at http://obs.obercom.pt/index.php/obs/article/view/1367/pdf.
Figure 3.3. Screenshot of the Netvizz research app (above) and, as observed through this app, the
changes in the Facebook Graph API data access regime over the years (below).
The open-closed structure of the Facebook Graph API can also be illustrated through
requests for more detailed information, such as whether a list of publications was
sponsored or not. In 2015, this information was available to social media marketing
companies or clients of Social Bakers (https://www.socialbakers.com/), but not to
Netvizz users84 or Facebook users. A last characteristic of social media APIs that I
84
In 2017, António Granado and I analysed a substantial amount of Facebook data
generated by the official Facebook Pages of 15 Portuguese universities. We suspected that a number of posts
were sponsored, due to a discrepancy observed between the engagement metrics and the content
want to emphasise is their capacity to facilitate the integration of content through
platform-specific hyperlinks such as like or share buttons, which Gerlitz and
Helmond (2013) call the like economy. As Anne Helmond (2015a; 2015b)
explains, such a like economy (afforded by APIs) enables dynamics of
decentralisation (when social media functionalities/buttons are pulled out and put into
other websites) and recentralisation (data collection through social buttons and
associated plugins) of data flows.
This dynamic “creates new forms of connectivity between websites beyond
hyperlinks, introducing an alternative fabric of the web” (Helmond, 2015b, p. 159),
particularly when turning the presence of social buttons and associated plugins into
valuable data. In this way, social media APIs play a crucial role in shifting the currency
of the web from web-native (hits and links) to platform-native (likes, shares, retweets)
(Gerlitz & Helmond, 2013; Helmond, 2015b). In other words, “social media platforms
are creating currencies that are tied to the mechanics and logics of their own platform
infrastructure” (Helmond, 2015b, p. 150).
From the web as a platform concept to platformisation
The enactments of platform programmability (through web-based APIs), namely the
platforms’ capacity to respond to data, service or functionality requests, also
constitute “the foundation for the Web as a platform concept” (Murugesan, 2007, p.
36; O’Reilly, 2005; see also Helmond, 2015a). The notion of platforms leads us to
consider the regimes of functioning of application programming interfaces (APIs) and
how they contribute to the decentralisation and recentralisation of platform
functionality and data. Marc Andreessen (2007b, September 16), for instance, explains
how web platforms are defined through the programmable potentials afforded by their
APIs. Access and functionalities can be broken down into three levels: API access
(level 1), Plug-in API (level 2), and Runtime Environment API (level 3). At level 1,
applications call into the platform via web services to access data and services,
e.g. pulling out YouTube content and Facebook like buttons to put them into a blog or
another page. At level 2, as illustrated before with Netvizz, there is not only data and
of the publications. After consulting the communication departments of a few universities, our
suspicion was confirmed: most of the posts with high engagement metrics were sponsored.
service access, but it is also possible to inject functionality into a given platform. That is
also the case of the apps embedded in the Facebook environment, from gaming and
dating categories to applications such as This is your digital life, which violated
Facebook’s terms of service by collecting and misusing its users’ data in favour of
Donald Trump’s presidential campaign in 2016, in the so-called Cambridge Analytica data
scandal85. At level 3, Internet platforms allow developers to run their apps inside
the platform itself, for instance the virtual experience provided by Second Life86.
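The difference between the first two levels can be sketched with a toy model. The class below is a hypothetical stand-in for a platform, not a description of any real one: one method lets outside applications read data (level 1), while another lets third parties register new functionality that runs on the platform's data (level 2). Level 3, where third-party code runs inside the platform's own runtime, is not modelled here.

```python
class Platform:
    """Toy platform exposing data (level 1) and a plug-in hook (level 2)."""

    def __init__(self, posts):
        self._posts = list(posts)
        self._plugins = {}

    # Level 1: outside applications call in to read data.
    def get_posts(self):
        return list(self._posts)

    # Level 2: third parties inject new functionality into the platform.
    def register_plugin(self, name, function):
        self._plugins[name] = function

    def run_plugin(self, name):
        return self._plugins[name](self.get_posts())


platform = Platform(["first post", "second post"])

# Level 1 use: an external page pulls content out of the platform.
embedded = platform.get_posts()[0]

# Level 2 use: a hypothetical plug-in that counts words across posts.
platform.register_plugin("word_count", lambda posts: sum(len(p.split()) for p in posts))
print(embedded, platform.run_plugin("word_count"))
```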
The characteristics of web APIs, in particular social media APIs, are central to the
platformisation of the web. This notion, coined by Anne Helmond, refers to “the rise
of the platform as the dominant infrastructural and economic model of the web and the
consequences of the expansion of social media platforms into other spaces online”
(2015a, p. 1). According to Helmond, three factors explain this phenomenon, namely:
the separation of content and presentation, the modularisation of content and features
and, finally, the interfacing with databases. These factors reflect not only how
platforms enable their programmability87 “through the exchange of data, content, and
functionality with third parties” (p.5) but also the infrastructure of the web
environment and its relationship with web-based research apps/software, as discussed in
the previous sections.
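The first of these factors, the separation of content and presentation, is what makes content easy for software to consume. As a minimal sketch, the made-up XML feed below describes two entries through named elements and carries no styling at all, so a few lines of Python's standard xml.etree.ElementTree module are enough to turn it into structured records. The element names are assumptions for illustration, not a real platform format.

```python
import xml.etree.ElementTree as ET

# Made-up feed: content is described by named elements, with no styling.
feed_xml = """
<feed>
  <entry>
    <title>First video</title>
    <link>https://example.org/watch/1</link>
  </entry>
  <entry>
    <title>Second video</title>
    <link>https://example.org/watch/2</link>
  </entry>
</feed>
"""

root = ET.fromstring(feed_xml.strip())
entries = [
    {"title": entry.findtext("title"), "link": entry.findtext("link")}
    for entry in root.findall("entry")
]
print(entries)
```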
While Tim O’Reilly’s prediction for the web has been confirmed (the web has indeed
become a provider of services, rather than just a place to find information), other
dynamics have also shaped the evolution of the web. In particular, the majority of web services
ended up concentrated in the hands of a few private companies (e.g. GAFAM,
Baidu, Alibaba and Tencent), which now govern public spaces, penetrating the core of
our economic and civic life (Dijck, 2020). This evolving dynamic process, called
platformisation, thus reflects “the inter-penetration of the digital infrastructures,
economic processes, and governmental frameworks of platforms in different economic
sectors and spheres of life” (Poell et al., 2019: 6 in Dijck, 2020, p. 4).
85
https://en.wikipedia.org/wiki/Facebook%E2%80%93Cambridge_Analytica_data_scandal
86
https://secondlife.com/
87
Helmond provides technical descriptions of APIs and explains how Extensible Markup Language
(XML), by structuring content in a text-based format, facilitates machine consumption, enabling “the
circulation of content through modular elements/components” (p.6).
The web is less and less seen as an open platform for everyone to use, and more and
more as a closed environment ruled by specific platforms’ walled gardens.
Digital technologies and the web as the last stage of grammatisation
The term grammatisation was coined by the linguist Sylvain Auroux (1992) while delineating
the technical process of description, formalisation and discretisation of human
behaviours into representations, so that they could be reproduced (Crogan & Kinsley,
2012; Petit, 2012; Stiegler, 2011). The author uses the alphabet (alphabetisation) as
the first example of what constitutes a process of grammatisation, which starts with
the exteriorisation of human gestures, acts or knowledge. Central to this are, of course,
the grammar and the dictionary, referred to by Auroux (1992) as the pillars
of our metalinguistic knowledge, being simultaneously representations of languages
and techniques (or external tools) that alter communication spaces.
A grammar provides general procedures for creating/decomposing statements, while
the dictionary provides the items to be arranged/interpreted according to these
procedures. Here, the grammar must contain, at least, a categorisation of units,
examples and rules, and its content should be relatively stable, e.g. spelling, syntax,
and morphology (Auroux, 1992). The author furthermore argues that spaces of
communication are simultaneously constituted and modified (Auroux, 1992; Petit,
2012)88 through grammatisation, which he considers an industrial revolution in itself
(after a first, the invention of writing, and a second, the print revolution)89 (see
Stiegler, 2011, p. 172).
88
The definition and contextualisation of grammatisation according to Victor Petit can be found in
“Vocabulaire d’Ars Industrialis” (http://arsindustrialis.org/grammatisation), an international
association created in 2005 on the initiative of the philosopher Bernard Stiegler. “Ars Industrialis set
itself the goal of imagining a new type of arrangement between culture, technology, industry and
politics around a renewal of the life of the spirit” (see http://arsindustrialis.org/lassociation).
89
Auroux uses, for instance, the example of how landscapes’ views and modes of transportation have
radically changed along with the roads, the canals, the railways, and the landing fields, in order to
explain how “grammatisation has profoundly changed the ecology of communication and the state of
the linguistic patrimony of humanity" (Auroux, 1992, p. 70).
Here, the notion of grammatisation is transferred to the digital environment, framed in
the context of digital technologies and web platforms and thought through its
relationship with social media and research software. In this sense, researchers may
consider the formalisation of web content (online objects and activities such as links,
URLs, like buttons, comments) as technological grammar, while looking at the
entanglements of web infrastructure and applications’ informative sources that
explain and articulate “discourse-made-machinery” (Agre, 1994). To help us reflect
on this reality, this section provides an overview of digital grammatisation through the
philosophical work of Bernard Stiegler, followed by Philip E. Agre’s software-specific
frame.
Making visible and tangible different types of memory, behaviour, knowledge
Bernard Stiegler extends and diverts Sylvain Auroux’s conceptualisation90 by
framing digital technologies and the Internet as “the last stage of grammatisation
and a new kind of writing”. He argues that, because of the new condition of
memorisation91 imposed by technological evolution, we (human beings) can only be
understood from the point of view of technological complexity or the “realisation
of memory92” (Stiegler, 2018), which stands for processes of exteriorisation,
production and discretisation of intellectual structures. Consequently, he explains that
such an impression of memory can be observed, apprehended or studied through digital
90
Thinking through the work of Auroux, and based on the alphabet, Stiegler sees digital
grammatisation as similar to the process of writing a letter, in which the impression of memory is
simultaneously a technical and a logical condition. “A becoming-letter of the sound of speech [la
parole] which precedes all logic and all grammar, all science of language and all science in general,
which is the techno-logical condition (in the sense that it is always already technical and logical) of all
knowledge, and which begins with its exteriorization” (Stiegler, De la misère symbolique 1, pp. 111-114 apud Stiegler, 2011, p.172).
91
This idea has particularly emerged with Leroi-Gourhan’s analysis of a protohuman fossil
(Zinjanthropus boisei), then raising the thesis that techniques are a vector of memory: “[H]e showed
that a crucial biological differentiation of the cerebral cortex took place in the passage from what he
called the Australanthropian to the Neanderthal. He showed that, from the Neanderthal onward, the
cortical system was practically at the end of its evolution: the neural equipment of the Neanderthal is
remarkably similar to ours. Nevertheless, from the Neanderthal to us, technics evolves to an
extraordinary extent” (Stiegler, 2010, p.73).
92
According to Stiegler, this principle can be explained because the process of becoming human
reflects both the recollection of the past or ideas or soul (anamnesis) and its exterior traces or the
technical supplement to memory (hypomnesis). In this sense, and quoting Leroi-Gourhan, he states
that technicity or techno-logic (in Homo sapiens) “is no longer geared to cell development but seems
to exteriorise itself completely—to lead, as it were, a life of its own” (Leroi-Gourhan, 1993, pp.137139 apud Stiegler, 2018, p.30).
grammatisation. This notion thus reflects on how technologies affect the development
of human nature, while playing a role in the process of the impression of memory.
To Stiegler, grammatisation is a form of spatialisation of time93 (Stiegler, 2012), as it
makes visible different types of memory, behaviour or knowledge, e.g. when the temporal
flows of speech are transformed into textual web content, which is stored in the back
end but also available in the end-user interfaces of social media. These behavioural
fluxes or detemporalised forms of speech make what Stiegler (2012) calls a spatial
object, which is an object synoptically visible and tangible, thus making “possible an
understanding that is both analytic (discretised94) and synthetic (unified)” (Stiegler,
2012, p. 2). After typing “the Internet sucks” on my Tumblr dashboard and clicking post,
my opinion about the Internet is no longer a temporalised form of speech but a spatial
object. This evolves into a permanent link95 composed of text, image and a timestamp:
a visible and tangible object that can be traced, saved, stored or shared by others. Here,
the retention96 and materialisation of my opinion were only possible through a technical
materialisation process afforded by Tumblr, which also serves as a good illustration of
digital grammatisation, described by Stiegler as:
all technical processes that enable behavioural fluxes or flows to be
made discrete (in the mathematical sense) and to be reproduced;
behavioural flows through which the experiences of human beings
(speaking, working, perceiving, interacting and so on) are expressed
or imprinted. If grammatisation is understood in this way, then the
digital is the most recent stage of grammatisation, a stage in which
all behavioural models can now be grammatised and integrated
through a planetary-wide industry of the production, collection,
exploitation, and distribution of digital traces. (Stiegler, 2012, p.2)
93
Stiegler sees behaviour as a form of time.
94
In computing, discrete means individually separate and distinct for the purposes of easier
calculation, whilst unified means to be part of a whole, but operate as a single entity.
95
https://joceliii.tumblr.com/post/173795180479/the-internet-sucks
96
Retention “refers to what is retained, through a mnesic function itself constitute of a consciousness,
that is, of a psychical apparatus” (Stiegler, 2012, p.2) (mnesic refers to what pertains to memory).
Stiegler (2012) borrows the term retention from Husserl, who distinguishes two types of retention:
i) primary retention (perception) that “means retained in the course of a perception, and through the
process of this perception, but in the present” (primary retention is not yet a memory but “the course
of a present experience”); ii) secondary retention (imagination) “is the constitutive element of a
mental state which is always based on memory” (idem, pp. 2-3). Stiegler introduces a third type of
retention, hypomnesic (an artificial or technical supplement to memory): the tertiary retention
“in which consists the grammatisation of the flow of retentions” - more generally, any technical
materialisation process” (p.3).
Stiegler (2018) recognises the techno-logic of grammatisation as constituent building
blocks of both technology and culture, and along with the current status of the Internet
for sociability97, he asserts that we live in a revolutionary moment (similar to the
invention of writing) and should change our modes of conceptualisation and conditions
of interpretation accordingly.
Moreover, Stiegler (2010) asserts that grammatisation follows the logic of
processes and dynamic compositions98, and not of hierarchies or totalising systems.
Thus, another way to understand grammatisation is to approach it as a complex
process divided into four steps: i) the inscription of memory, carried out by the
equipment for the input of data; ii) its preservation, operated by a technical supplement
to memory, which constitutes databases; iii) its processing by software, which may
take different forms; and finally, iv) its transmission or publication: “the data thus
processed are transmitted on networks. One accesses these networks through
interfaces” (see Stiegler, 2018, p. 27), such as web-based APIs. This process has its
own language, its own memory, and its own knowledge. Consequently, the technics
and methods applied to grasp the process of digital grammatisation “cannot therefore
be considered as merely ‘means’ serving ‘ends’ that would not themselves be technological” (Stiegler, 2018, p. 32). Rather, we should consider the conditions imposed by
digital technologies and how they inscribe, preserve, process and transmit behavioural
fluxes or flows. This will be discussed in the next section.
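The four steps can be made more tangible with a minimal sketch. It is an illustration under stated assumptions and not a description of how Tumblr or any other platform is implemented: the function names, the table layout and the sample author are hypothetical, and an in-memory SQLite database merely stands in for the "technical supplement to memory".

```python
import json
import sqlite3
from datetime import datetime, timezone


def inscribe(text, author):
    """Step i: turn a typed opinion into a discrete record (input of data)."""
    return {
        "author": author,
        "text": text,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def preserve(db, record):
    """Step ii: store the record in a database and return its stable id."""
    cur = db.execute(
        "INSERT INTO posts (author, text, created_at) VALUES (?, ?, ?)",
        (record["author"], record["text"], record["created_at"]),
    )
    db.commit()
    return cur.lastrowid


def process(db, post_id):
    """Step iii: software derives a new attribute from the stored trace."""
    author, text, created_at = db.execute(
        "SELECT author, text, created_at FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return {
        "id": post_id,
        "author": author,
        "text": text,
        "created_at": created_at,
        "word_count": len(text.split()),
    }


def transmit(processed):
    """Step iv: publish the processed data through an interface (here, JSON)."""
    return json.dumps(processed, ensure_ascii=False)


# Hypothetical database layout, created only for this illustration.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, text TEXT, created_at TEXT)"
)
post_id = preserve(db, inscribe("the Internet sucks", "hypothetical_user"))
payload = transmit(process(db, post_id))
print(payload)
```

Even in this toy version, what is transmitted is not the utterance itself but a record shaped by each step: a timestamp added at inscription, an identifier assigned at preservation and a derived count added during processing.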
From the metaphor of capture to an in-depth look over technological grammar
In 1994, Philip E. Agre proposed a reflection on two (not mutually exclusive) models
of privacy as cultural phenomena: the surveillance model has its origins in the realm
of public debates; while the capture model is rooted in the practical application of
computing (see Agre, 1994, p. 107). He explains that such perceptions employ distinct
metaphors. The surveillance model is informed by modes of observation built on visual
metaphors such as Orwell’s “Big Brother is watching” and the capture model employs
linguistic metaphors in which “human activities are systematically reorganised to
allow computers to track them in real time” (Agre, 1994, p. 101).
97
That is what, in 1986, Stiegler would refer to as the emergence of new technologies.
98
Here the author follows the work of Jacques Derrida.
From the standpoint of the philosophy and the practices of digital methods, Agre’s
essay remains relevant these days for reasons other than technology and privacy
matters. Agre’s capture model will help us to look at the technical context and
environment of social media grammars as methodological language. By providing a
comprehensive understanding of digital grammatisation, it highlights how
technological grammar can no longer be seen outside the technicity of the software.
Three key aspects of his work are particularly relevant.
The first relates to common aspects found in all tracking systems (or schemas): the
entity, the computer and the agent. The entity refers to what is about to be tracked and
captured; each entity may have “a definitive identity that remains stable over time”
(p.104). In the social media context, examples include clicking on links, choosing a
reaction other than “like” on Facebook, commenting on a post or uploading an image.
Entities have changeable states, but “a definitive identity”. For instance, when one
uploads a video on YouTube, a unique identifier (id attribute) is immediately created,
serving as a definitive identity to locate the video but also to collect different types of
information about it99. These activities can be facilitated by back-end communication
between extraction software and the YouTube Data API, but also through the front-end
interface of YouTube. Quantifiable parameters are attached to a video, such as the number
of views, likes, dislikes and comments or video recommendations, and these are also
captured over time.
These, in Agre’s lexicon, contain “some mathematically definable representation
schema, which is capable of expressing a certain formal space of state of affairs”
(Agre, 1994, p.105).
The computer is responsible for representing the changing states of an entity;
consequently, it also provides social and technical means that keep “the
correspondence between the representation and the reality” (op. cit, p.104). By
computer, the author refers to databases or distributed systems, which can only offer
representation schemas of entities based on what they can capture and store over time, “and this
trajectory, can be either literal or metaphorical, or both, depending on what aspects of
99
By using YouTube Data Tools (Rieder, 2015), one can retrieve video info and statistics to generate
a network of relations between videos, based on the reference “related to video id” of YouTube Data
API.
the entity are represented” (p.105). In the context of social media, we can, for example,
think of APIs as a computer in Agre’s terms.
The last element present in all tracking systems is a human or automated agent that can
request or retrieve information from a database. In the practices of digital methods,
agents are research software (e.g. FacePager or YTDT), scrapers, crawlers and
HTML/Python scripts, which are used or created by multiple stakeholders (e.g.
developers, scholars, users, companies, governments) to request, create or retrieve data
from the web or social media environment (this activity can also be tracked).
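The relation between entity, computer and agent can be sketched in a few lines of code. This is a hedged illustration of Agre's schema rather than a model of any real platform or API: the class names, the video identifier and the metric values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """What is tracked: a stable identity with changeable states."""
    entity_id: str
    states: list = field(default_factory=list)  # (timestamp, metrics) pairs


class Computer:
    """Keeps the representation of each entity and its trajectory over time."""

    def __init__(self):
        self._entities = {}

    def capture(self, entity_id, timestamp, **metrics):
        """Record one state of an entity, creating the entity if needed."""
        entity = self._entities.setdefault(entity_id, Entity(entity_id))
        entity.states.append((timestamp, dict(metrics)))

    def trajectory(self, entity_id, metric):
        """Return the recorded values of one metric, ordered by timestamp."""
        entity = self._entities[entity_id]
        ordered = sorted(entity.states, key=lambda state: state[0])
        return [(t, m[metric]) for t, m in ordered if metric in m]


def agent_request(computer, entity_id, metric):
    """An agent (e.g. a research script) retrieving what has been captured."""
    return computer.trajectory(entity_id, metric)


# Invented identifier and values, used only to exercise the sketch.
computer = Computer()
computer.capture("video_abc123", "2020-11-04", views=250, likes=19)
computer.capture("video_abc123", "2020-11-01", views=100, likes=7)
views = agent_request(computer, "video_abc123", "views")
print(views)
```

The agent only ever obtains what the computer has represented: states that were never captured, or metrics that the schema does not define, are simply not available to research.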
The second important aspect of Agre’s essay relates to a sound understanding of the capture
system itself. This model implies the match (or mismatch) of epistemological versus
ontological principles (Agre, 1994). On one side, the notion of capture refers to the
acquisition of information by computers; on the other, it expresses a semantic
distinction, i.e. “acquiring the data” versus “modelling the reality it reflects” (p.106).
Agre refuses the idea of a literal descriptive system and, instead, introduces the
capture model as a metaphor system, in which the ways of understanding activity are
grounded in the practical application of computing.
This is a lesson we may want to learn from Agre, transposing it to the ways in which
we study online activity while using and designing research with digital methods. The
way in which Agre unpacks capture systems100 illustrates more substantial perspectives
on what technological grammar is. Capture here means a complex phenomenon in
which “every domain of activity has its own historical logic, its own vocabulary of
actions, and its own distinctive social relations of representation” (p.116).
The third central proposal in Agre’s essay is what he calls grammars of action or the
formalisation of languages for representing human activity. These grammars specify a
set of unitary actions that have many and varied manifestations. In social media, a
simple example is the act of adding the # symbol before words, numbers or emojis
turning them into hashtags that can stand for ideas, opinions or positioning efforts.
Hashtags indicate searchable topics recognised by users, recommendation algorithms
and automated beings (bots). These actions are embedded in the platform databases
under predefined and specific properties, e.g. what comes after the # symbol? Where
100
Agre approaches the following interconnected elements: grammars of action, capture and
functionality, capture in society, and the political economy of capture.
can hashtags be used? (e.g. in captions, overlapping images or videos) How many
times? What is associated with the use of hashtags (e.g. the content of tagged posts may
contain the username, captions, co-related tags, image URLs, publication date,
location)? Moreover, social media databases can capture the relationship between
hashtags and their predefined and specific properties as well as a set of actions related
to hashtagging (e.g. liking or commenting on posts, using filters). This helps us to
understand what Agre refers to as the capture model, which also provides an accurate
definition of what we are referring to as platform grammatisation (see Gerlitz & Rieder,
2018): “the situation that results when grammars of action are imposed upon human
activities, and when the newly reorganised activities are represented by computers in
real time” (p.109).
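To illustrate how a grammar of action such as hashtagging becomes computable, the following minimal sketch extracts hashtags from captions and counts which tags co-occur. It rests on assumptions that should be read as such: the rule for what may follow the # symbol is deliberately simplified and does not reproduce any platform's actual rule, and the captions are invented.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified, assumed rule for what may follow the # symbol: word characters
# only. Real platforms apply their own rules (e.g. emoji hashtags).
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return the lower-cased hashtags of a caption, without duplicates."""
    seen = []
    for tag in HASHTAG.findall(caption):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen


def co_occurrences(captions):
    """Count how often two hashtags appear together in the same caption."""
    pairs = Counter()
    for caption in captions:
        tags = sorted(extract_hashtags(caption))
        pairs.update(combinations(tags, 2))
    return pairs


# Invented captions, used only to exercise the sketch.
captions = [
    "Stay home! #SafeHands #felizemcasa",
    "Washing routine #safehands #tutorial",
    "Dancing indoors #felizemcasa #safehands",
]
pairs = co_occurrences(captions)
print(pairs.most_common(3))
```

The design choice worth noting is that the regular expression already imposes a grammar: whatever it fails to match (for instance an emoji after the # symbol) silently drops out of the corpus.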
Accordingly, we need a better understanding of the five-phased cycle that encompasses
this process or how grammars of action are embedded into computers. These phases
are not independent of each other, but rather concur or arise simultaneously. They
remind researchers of how attentive they should be when working with technological
grammar, taking into consideration every rule, piece of information, form of use and
articulation attached to it.
1. analysis (the basic ontologies - objects, variables, relations - that will inform
different forms of activity)
2. articulation (how grammars are “strung together to form sensible stretches of
activity” p.110)
3. imposition (grammars of action are often a result of a normative force,
“participants in the (articulated) activity may or may not participate in the
process and may or may not resist to it” p.110)
4. instrumentation (sociotechnical means are provided either by the capture
system itself, by those orienting their activities through the capture machinery,
and the institutional consequences of these relations)
5. elaboration (after being recorded and stored, activity becomes available to be
accessed and merged with other records).
When transposing Agre’s five-phased cycle to social media research, to understand
how grammars of action are articulated, we should be aware of the basic ontologies of
APIs, bearing in mind that the imposition of grammars may not signify a full
appropriation or use. For instance, one can simply not use hashtags nor provide
Instagram with a list of close friends when publishing a Story. Moreover, we should
not forget that technological grammar can oversimplify the acts it intends to represent
but also provides means for either accurate or inaccurate descriptions of actions.
As new media studies have shown, the available grammars of action
serve research as powerful means to monitor and better comprehend social issues. In
this regard, the next section introduces three aspects required by even the simplest digital
methods project.
Three pillars of the digital methods approach
Through the years, and in the many data sprints that I have attended/organised, I have
often noticed how the crucial effort to understand with and about the computational
mediums can be easily brushed aside by more pressing practical issues. Projects tend
to disregard the specificity of the medium and the platform’s cultures, and digital
methods become just a way to do stuff with web data, rather than an invitation to think
along with the media.
As previously argued, even the simplest digital methods project requires some
technical knowledge, practice and expertise. This background knowledge constitutes
the first level of the technicity-of-the-mediums and can be defined as the practical
awareness that allows researchers to understand not only in theory but also in practice
what it means to study collective phenomena “through interfaces and data structures”
(Rieder et al., 2015, p. 4).
This section demonstrates how technical expertise can specifically contribute to new
forms of enquiring. It proposes a schematic way to raise awareness of the technical
mediation inherent to digital methods101, describing a triangular relationship between
software affordances and platforms’ cultures of use and grammatisation (see Fig.3.4).
This proposal attempts to defuse some of the difficulties related to the use of digital
101
Instead of discussing what is evident and present in applied research, for instance how the Internet
and new technologies may lead to a revolution in the making of social science (Watts, 2007) or the
importance of learning particular technical skills.
methods, while it suggests a way of carrying out digital fieldwork. The schema
represents a combination of awareness (in theory, imagination and practice) and
software practice and draws attention to three distinct but related aspects, while
engaging with the specific modes of the technical mediums. In other words, intellectual
and practical operations are not separated but paired up. The visual schema in figure
3.4 corresponds to a need of digital research, i.e. the fact that scholars must care about
the specificities of the medium and data, particularly “where and how they happen,
who and what they are attached to and the relations they forge, how they get assembled,
where they travel, their multiple arrangements and mobilizations, and, of course, their
instabilities, durabilities and how they sometimes get disaggregated too” (Ruppert,
Law, & Savage, 2013, pp.31-32). In other words, the understanding of the domain of its
particular potentialities (practical awareness) is required in order to achieve a
(research) purpose.
Figure 3.4. Getting to know the fieldwork through understanding the triangular relationship
between platform’s cultures of use, grammatisation and software.
In this scenario, it is crucial to understand the role of the medium’s elements, features
and qualities and how they relate to the subject of study in the interpretative process.
Figure 3.4 is also meant to serve as a set of guidelines that assist the researcher to go
through the mental-practical schemas demanded by digital methods. In the following
pages, I will introduce the three pillars of the knowledge triangle separately.
Platform Grammatisation
Platforms here, in general terms, are taken as “socio-technical assemblages and
complex institutions” (Gillespie, 2018a, p. 255), and in a detailed assessment they are
“(re-)programmable digital infrastructures that facilitate and shape personalised
interactions among end-users and complementors, organised through the systematic
collection, algorithmic processing, monetisation, and circulation of data” (Poell et
al., 2019, p. 3) but also apprehended as arenas for processes of capitalisation,
monetisation and proprietary enclosure (see Mackenzie, 2019; Rieder, Coromina, &
Matamoros-Fernández, 2020). Mackenzie (2019), however, argues that platforms
should “no longer [be] rendered as a social network of users (individuals and
organisations), or even more starkly as an advertising medium (Skeggs & Yuill,
2016)”, because “the platform itself becomes an experimental system for observing
the world and testing how the world responds to changes in the platform on many
different scales” (Mackenzie, 2019, p. 2003). In his perspective, there is a shift from
platforms as a place of connectivity to being predictive operations in which platforms
“modulate connectivity and begin to infrastructuralise it” (op. cit.).
In the context of digital methods, platform grammatisation refers to the technological
processes inherent to the web environment and APIs in which and through which
online communication, acts and actions are structured, captured and merged with other
records, yet made available only to a limited extent through data retrieval methods such as
crawling, scraping or API calling. In other words, it refers to the situations where users deal
with predefined technological grammar, produced and delineated by software, to structure
their activity (Gerlitz & Rieder, 2018). This alludes to the operationalisation of platforms and
the particular and pervasive agency of their technical functioning (see Rieder, Abdulla,
Poell, Woltering, & Zack, 2015), intertwined with and in online data.
As an example, let us see how TikTok102 structures video content and metadata. To do
this, we should consult the mobile app’s front-end and back-end interfaces (see Fig.3.5)
as referential starting points to understand which acts and actions one can
perform on a TikTok video (by looking at the end-user interface), and how these acts and
102
TikTok is a Chinese mobile video-sharing application which gained worldwide recognition in
2020, in particular during the quarantine restrictions and lockdowns provoked by the impact of
COVID-19. Let’s also pretend we are not familiar with TikTok but know the basics in advance; that
the app follows social media logic of communication and interaction (e.g. one needs to have an
account in order to produce content, to be able to follow others or to be followed).
actions are re-arranged and made available through back-end infrastructures. The latter
can be verified by reading the official or unofficial API documentation, and by
exploring the output files (e.g. TAB, CSV, GEXF) provided by data extraction tools.
In this process, a series of questions should be addressed, such as: what is it possible to
do with technological grammar? What are the standard grammatised actions and what
are those which indicate platform-specific cultures? In what forms are they organised
and articulated? What can be accessed and subsequently repurposed? How? Are there
new or no longer used grammatised actions103? Figures 3.5 and 3.6 help us to respond
to these questions, showing the possible actions and reactions to a TikTok video. Each
action is recorded through fields such as headers; text; create time; author id and name;
music id, name and author; image and video URLs; like, share and comment counts; and
whether a hashtag was used. When implementing digital methods, one can be tempted
to use and explore all the available information on TikTok. Instead, we should
first apply this technical knowledge to ask how technological grammar can serve
research purposes and answer research questions, also considering the culture of use and
appropriation of the grammars of action.
When looking at figures 3.5 and 3.6, we notice that TikTok’s grammatised actions offer
more levels of specificity to be exploited than the record of engagement metrics, which
drives us to rethink the use of quantification (or what can be calculated)104. For
instance, and based on figure 3.6, a list of image URLs originating from the TikTok
environment may serve as a point of departure for visual content analysis or for
investigating the sites of image circulation within and beyond TikTok. Likewise, by
comparing which music names and authors are associated with a hashtag, one can get
a sense of the dominant and ordinary forms of signification emerging on TikTok.
By looking at the app’s technical interfaces, and with some background on its
usage culture, we may arrive at more elaborate perceptions of the potential
of TikTok’s grammatised actions for digital research.
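The two repurposing moves mentioned above (listing image URLs and comparing the sounds attached to a hashtag) can be sketched as follows. This is a minimal example under explicit assumptions: the records are invented, and the key names (hashtags, music_name, music_author, image_url) are hypothetical stand-ins that do not reproduce the schema of any scraper or API.

```python
from collections import Counter, defaultdict

# Hypothetical records standing in for rows of a scraper output file.
# Key names and values are invented for illustration only.
videos = [
    {"hashtags": ["felizemcasa", "safehands"], "music_name": "Sound A",
     "music_author": "Author X", "image_url": "https://example.org/cover1.jpg"},
    {"hashtags": ["felizemcasa"], "music_name": "Sound A",
     "music_author": "Author X", "image_url": "https://example.org/cover2.jpg"},
    {"hashtags": ["safehands"], "music_name": "Sound B",
     "music_author": "Author Y", "image_url": "https://example.org/cover3.jpg"},
]


def sounds_by_hashtag(records):
    """Count which (music name, music author) pairs accompany each hashtag."""
    table = defaultdict(Counter)
    for record in records:
        sound = (record["music_name"], record["music_author"])
        for tag in record["hashtags"]:
            table[tag.lower()][sound] += 1
    return table


def image_urls_for(records, hashtag):
    """List the image URLs of the videos tagged with a given hashtag."""
    wanted = hashtag.lower()
    return [r["image_url"] for r in records
            if wanted in {t.lower() for t in r["hashtags"]}]


sounds = sounds_by_hashtag(videos)
print(sounds["felizemcasa"].most_common(1))
print(image_urls_for(videos, "safehands"))
```

The resulting list of URLs could then feed a visual content analysis, while the sound counts give a first, purely quantitative hint that still has to be read against the platform's cultures of use.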
103
For example, the development of emoji hashtags on Instagram in 2015, Facebook Reactions in
2016, LinkedIn Reactions in 2019 or when Twitter changed the favourite star button to the
heart-shaped button in 2015 (see https://blog.twitter.com/official/en_us/a/2015/hearts-on-twitter.html)
104
For instance, the overall engagement in a publication or the total number of followers or mentions
reflect a common path for social science researchers to think about political and social issues, giving
substance to the idea of reality calculability.
Figure 3.5 TikTok grammatisation: looking at the end-user interface and reading unofficial
API documentation. Source: https://github.com/drawrowfly/tiktok-scraper#getVideoMeta
Figure 3.6 TikTok grammatisation: identifying grammatised actions through the exploration
of the scraper output file. Source: https://github.com/drawrowfly/tiktok-scraper#getVideoMeta
Figure 3.7 shows TikTok’s end-user interface from both navigation and usage
perspectives. When watching videos, TikTok offers the classic social media buttons
(like, comment, share, mention, use emojis), but also two others that indicate particular
forms of use: the audio and WhatsApp buttons. The former signals the central role of
sounds in the app’s creative environment, whereas the latter only shows up after the user
has watched the same video for the third time in a row, directly suggesting that the user
should share the video on WhatsApp. In Discover, an information board directs the user to
topics concerning the pandemic, e.g. official information about the cases in Portugal,
related hashtags such as #safehands and #felizemcasa (happy at home), and TikTok’s
new effects, followed by trending hashtags. There, users can search for top
video content, users, videos, sounds and hashtags. Here we may learn that hashtags are
as important as sounds on TikTok.
Figure 3.7. Being acquainted with TikTok grammatisation.
Other grammars of action emerge when using and experimenting with the app (Figure
3.7). The facial effects are diverse and very interactive (e.g. type here
to write, open your mouth, touch the screen, drag your finger, blink your eyes). Sounds
can also be searched for when creating a video, as we see in the image on the left
(Figure 3.7), where a user can search for sounds among her favourite ones and those
belonging to different categories imposed by TikTok, such as those shown in
Figure 3.8. Interestingly, soon after a video is published, the app once again
suggests that the user share the content on WhatsApp. All these possibilities
point to what users can do (not to be confused with what users do).
In short, I am suggesting that to get a good picture of grammatisation it is necessary to
spend time exploring the platform environment (front/back-end) both as an observer
and as a user, but also to carry out exploratory data analyses and visualisations to gain
a better understanding of how actions are articulated and how these articulations may
serve research purposes.
Figure 3.8. TikTok’s list of audio categories.
The notion of platform grammatisation should guide researchers to use their knowledge
of the ways in which grammatised actions are altered and rearranged by computing
as methodological language. It helps researchers to make sense of data
retrieved/scraped from digital platforms. In this sense, and as previously discussed,
taking grammatisation into account demands new ways of conceptualising the subject
of study. Here, social media content cannot be separated from its carrier (see Niederer,
2019): platform interfaces and infrastructures. The following chapters adopt this
perspective in research oriented by Instagram hashtag engagement and Facebook
natively digital images.
Moreover, the study and repurposing of grammatised actions require new skills rooted
in the technical practices of digital methods. For this reason, scholars such as Noortje Marres
and Richard Rogers argue that we should never forget to take into account
technological grammar in its context and environment. As a result, it is time to
consider software as message, including its technical schema and functional makeup
(see Manovich, 2014; Rieder, 2020) in the making of digital methods.
Cultures of use
Cultures of use here refer to the modes of life, the common meanings and the forms of
signification that emerge and circulate within a given platform. This perspective,
adapted from Raymond Williams’s conception of the word culture, entails that cultures
of use are expressed by technological grammar shaped by platform’s infrastructure and
technical mechanisms. To account for platform cultures of use, we should ask about
the common practices of the platform, its native objects and how they are used, the
role of recommendation and ranking systems in everyday usage. These questions are
helpful for turning search queries into research questions but may not be
sufficient to grasp platform cultures of use. Additionally, we may want to question
how differently publics use social media (or other digital platforms) and engage with
digital grammars. For what purpose? Who are the influential or dominant actors? Are
the users resistant to what is imposed and why? How do cultures of use
change over time, and what makes them change?
Cultures of use here is in the plural because social media platforms have multiple
publics and forms of engagement which change from one platform to another but also
inside the platform itself. Within the Instagram environment one can study politics,
far-right movements, influencers, social bots, health, or porn-related issues, for
example, by looking at hashtags, stories, following network or visual content. The
research approach adopted should be responsive to cultures of use. When looking at
specific social issues across platforms, we need to recognise that different platforms
breed different cultures of use. For instance, even when the subject of study is the same
(e.g. pregnancy, emoji hashtags, social bots, climate change, Zika virus), the
appropriation of digital grammars and the production of (visual and textual) content
related to it might differ across platforms, while also diverging internally (see Bogers,
Niederer, Bardelli, & De Gaetano, 2020; Highfield, 2018; Omena et al., 2019; Pearce
et al., 2018; Rabello et al., 2018; Rogers, 2018). These differences are also noticed
when looking at high-visibility actors, content and uses versus ordinary actors, content
and uses (see chapter 4).
A more technical perspective on platforms’ usage cultures is proposed by Weltevrede
and Borra (2016), following their concept of device cultures:
Device cultures can then be defined as the interaction between users
and platform; how activity is imagined, curated, and prescribed into
the platform architecture; how affordances are activated by the
(un)intended uses and practices that take place on and within
platforms; and how the data is collected and processed by the
platform. In other words, the platform architecture suggests certain
practices and uses, and contains the traces of platform activity in the
form of data, whether unprocessed or in aggregate—or any
algorithmically processed form, such as the variants on popular,
trending, or relevant content. (Weltevrede & Borra, 2016, p. 2)
While this definition invites us to think about and interpret the reasoning of connective
action105 on its own terms, as suggested by Bennett and Segerberg (2012), it
simultaneously draws our attention to the modes of programmability associated with
social media and other web platforms (Helmond, 2015a; Mackenzie, 2019;
Murugesan, 2007; Poell et al., 2019). We learn that platform cultures of use must be
understood from within their natively digital environment, that is, by apprehending platform
cultures of use through the lens of social media programmability (Helmond, 2015a).
Once again, we return to the language and reality of the web environment, particularly
what its infrastructure enables and by what means. Moreover, as discussed in the
previous section, we need to take technological grammars seriously, respecting the
environment they come from. After all, they are considered as a “language of sociality”
(Marres, 2017) and “connectivity” (Van Dijck, 2013) serving, thereby, as both entry
points and corpus for research. The forms of articulation and re-arrangement of
technological grammars matter. So, while we pay attention to what is going on in
platforms, we should look at how these forms of signification are articulated by
platform grammatisation.
Let us use TikTok as an example. To get a sense of cultures of use on TikTok, we
need to use the app, navigating through all its possibilities. This will give the
researcher a practical sense of the technological environment and grammar offered in
and through TikTok’s rules, default features and dynamics for the creation of short
videos using lip-sync, dance, and face effects. I refer especially to an understanding of
how the grammars of TikTok are imposed and articulated, rather than hastening the
105
Bennett and Segerberg introduce connective action as an expression of personalised action
formations, which, however, have the same issues or claims (e.g. environment, rights, women's
equality) found in older movements or party concerns (collective action). The authors explain that
“people may still join actions in large numbers, but the identity reference is more derived through
inclusive and diverse large-scale personal expression rather than through common group or
ideological identification” (p.744). The authors highlight two important elements in large-scale
connective action formations: i) political content in the form of easily personalised ideas and ii) various
personal communication technologies that enable sharing these themes.
observation (or first impression) to look at the content that these grammars carry
or at what the most popular users often do with them.
I navigated the application regularly during the lockdown in
Portugal and, as a lurker, my first impressions were positive due to the funny and very
creative memetic video content suggested in my feed. To bring happiness to everyone
is the aim of TikTok, according to Zhang Nan (the CEO of Douyin, TikTok’s Chinese
version) (Zhang, 2020). In a recent article on the infrastructuralisation of TikTok,
Zongyi Zhang (2020) says the platform distinguishes itself from the logic of the traditional
creative video industry, explaining also how hashtags, hashtag challenges and music
structure and organise actions. Moreover, the author explains how TikTok’s new algorithm
mechanism reacts to a popular video: “if the creator produces an extremely popular video, past
content of him will also be reposted and recapture the attention of the algorithm. This
compensation mechanism means not only an opportunity but also a compulsion for
creators” (Zhang, 2020, p. 11).
Figure 3.9 shows, on the right, the dancing challenge based on Cardi B’s song WAP, which
was ranked in my feed on November 4, 2020, and, on the left, the Netflix
#tudumchallenge suggested by the Discover information board on the same day.
After clicking on #wapchallenge and scrolling down, one listens to the same part of
the song106 while watching different videos: for instance, users imitating the singer’s
dancing moves or adapting them to other dance styles such as ballet, tutorials teaching
how to dance WAP, and users singing the song in 1940s or Celine Dion style, among
many other creative possibilities. In the #tudumchallenge, we listen to the sound we
hear when entering Netflix, followed by a remixed version of it. One cannot tell what to
expect from video content here except users’ imaginative creativity (always consistent
with the platform’s cultures of use)107. On TikTok, one must be performative enough to
communicate and create videos using lip-sync, different filters and Augmented Reality
effects, or dancing (non-)professionally in either indoor or outdoor environments.
106
In particular the part of the song that says: “Now from the top, make it drop, that's some wet ass
pussy. Now get a bucket and a mop, that's some wet ass pussy. I'm talkin' WAP, WAP, WAP, that's
some wet ass pussy. Macaroni in a pot, that's some wet ass pussy, huh”.
107
https://www.tiktok.com/amp/tag/tudumchallenge?lang=en
Figure 3.9. TikTok’s cultures of use: hashtag challenges.
The #tudumchallenge and #wapchallenge are good examples of the common and
expected video-making practices on TikTok, reflecting what the platform proposes to
its users. However, usage culture does not have to respond to what is imposed or
expected by TikTok, because users may propose other forms of engagement or
resist what is imposed. This happens, for instance, when users appropriate TikTok’s
technological grammar to support a cause not at all related to the making of
entertaining videos, such as the rape case of the Brazilian Mariana Ferrer, in which the
accused was found not guilty108.
This decision, communicated in September 2020, is an unprecedented judgment in the
country (see Alves, 2020). According to the Brazilian Public Ministry, André de
Camargo Aranha had a conduct in which there was a will but not full consciousness109
for rape. Up until November 4, 2020, almost 40 million views were counted in videos
hashtagged with #justiçapormariferrer (justice for Mari Ferrer), #justicapormariferrer
or #justicapormaribferrer (see Figure 3.10). In addition to disagreement with the judge's
decision, great indignation and criticism were provoked by an excerpt from the
hearing, in which the defence lawyer humiliates
Mariana Ferrer by showing sensual photos, implying that her posture as a victim is false
and saying that the crying on her Instagram accounts is fake110.
108
In December 2018, the 23-year-old Mariana Ferrer reported to the Brazilian police that she had been
raped in a nightclub in Santa Catarina, but she was unable to remember who had raped her; thus, she
believed she had been drugged. At the time, the event promoter was a virgin. In May 2019, Mariana shared
her story on Instagram in an attempt to arouse public interest and see her case move faster in the
Public Ministry. After investigation and based on genetic material (saliva in the glass) and an internal
security video record, the police identified the rapist: the businessman André de Camargo Aranha.
According to the article of Schirlei Alves from The Intercept Brasil (Alves, 2020) and The Intercept
YouTube live (see https://www.youtube.com/watch?v=hsm4poTWjMs), in the first statement, the
accused says that he did not touch Mariana and, in the second statement he claims that he had oral sex
with her. However, and beyond other evidence, the corpus delicti examination proved that Mariana was
indeed raped, also proving the rupture of her hymen. Mariana’s defence has already appealed
against the judge's decision. Estadão made the full hearing available here:
https://www.youtube.com/watch?v=P0s9cEAPysY
Brazilians responded to the case not only on TikTok but across social media platforms, engaging with
the hashtag #justiçapormariferrer (justice for Mari Ferrer). See:
https://www.tumblr.com/search/justi%C3%A7apormariferrer,
https://www.instagram.com/explore/tags/justi%C3%A7apormariferrer/,
https://twitter.com/search?q=%23justicapormariferrer&src=typeahead_click,
https://www.facebook.com/hashtag/justi%C3%A7apormariferrer
109
The Intercept Brasil used the expression 'rape by mistake' (estupro culposo) to
summarise the case and explain it to the lay audience.
110
See https://www.youtube.com/watch?v=X--JAQShBBw.
The expression rape by mistake (estupro culposo), used by The Intercept Brasil to
summarise the case and explain it to the lay audience (see Alves, 2020), was also
reappropriated by TikTok users, who showed their disagreement with and revolt about the case
through the hashtags #estuproculposonaoexiste (rape by mistake does not exist) and
#estuproculposonao (rape by mistake no).
Figure 3.10. Otherwise engaging with TikTok: #justiçapormariferrer (justice for Mari
Ferrer). Screenshot taken November 4, 2020.
What also informs cultures of use is when users do not accept how the technical design
of platforms grammatises acts. They thus resist what is imposed, finding a way out
of the rules. A good example is the role of 4chan’s anonymised users (anons), known
as the bakers, in resisting the platform’s limitations. The digital methods-based study
of 4chan’s /pol/, ‘Politically Incorrect’ board111, has shown how the bakers coordinate
the creation and maintenance of threads as a means to continue conversation on this
board (Bach, Tsapatsaris, Szpirt, & Custodis, 2018). In this study, Daniel Bach and
colleagues detected a coordinated rhythm in 4chan/pol comments containing
“President Trump General” and “Trump General”; a steady rate of around 1000 threads
from January 2016 to February 2018. This finding reveals that political conversations
in 4chan can be “centrally orchestrated by an elite who seem to dictate how the
conversation is framed”, rather than being anarchic, chaotic, or random as is expected
in this environment (see Bach et al., 2018; Knuttile, 2011; Tuters, Jokubauskaitė, &
Bach, 2018).
To better understand this specific culture of use, allow me to briefly introduce 4chan
(https://www.4chan.org). This imageboard platform maintains a culture of anonymity
(Knuttile, 2011) in which anyone may post anything anonymously in its various
themed boards; however, “boards only allow a finite number of comments before
threads must be purged” (Tuters, Jokubauskaitė, & Bach, 2018). The grammatisation
of 4chan is quite simple:
Most 4chan boards consist of 10 pages, each containing 20 threads
for a total of 200 active threads at all times. When someone replies
to a thread, it is ‘bumped’, meaning it becomes the top post on the
board — but only until another post is bumped afterwards. If it
reaches 300 comments, a thread can no longer be bumped. If this
happens, or when a thread stops garnering any reactions, it starts to
slowly descend towards the bottom of the board. If a thread falls
outside the 200 active threads, it is deleted or locked so no one can
comment anymore. (Bach et al., 2018)
In response to such a transitory and fleeting mode of being, the bakers resist 4chan’s
grammatisation and fight its ephemerality by adhering to specific practices, such as
coordinating posting times and using standard templates when commenting.
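The board mechanics quoted above can be sketched as a small simulation. This is only a minimal illustration in Python, assuming the figures reported by Bach et al. (2018): 200 active threads and a bump limit of 300 comments. The class and method names are hypothetical and do not reproduce 4chan's actual software; whether the 300th comment itself still bumps a thread is also an assumption made here.

```python
from collections import OrderedDict

MAX_ACTIVE_THREADS = 200  # 10 pages x 20 threads (Bach et al., 2018)
BUMP_LIMIT = 300          # comments after which a thread is no longer bumped


class Board:
    """Toy model of an imageboard; the first key is the top of the board."""

    def __init__(self, max_threads=MAX_ACTIVE_THREADS, bump_limit=BUMP_LIMIT):
        self.max_threads = max_threads
        self.bump_limit = bump_limit
        self.threads = OrderedDict()  # thread id -> number of comments
        self.purged = []              # ids of threads that fell off the board

    def new_thread(self, thread_id):
        """A new thread enters at the top and may push the last one out."""
        self.threads[thread_id] = 0
        self.threads.move_to_end(thread_id, last=False)
        while len(self.threads) > self.max_threads:
            bottom_id, _ = self.threads.popitem(last=True)
            self.purged.append(bottom_id)

    def reply(self, thread_id):
        """Comment on a thread; bump it unless the bump limit was passed."""
        if thread_id not in self.threads:
            return False  # purged threads accept no further comments
        self.threads[thread_id] += 1
        if self.threads[thread_id] <= self.bump_limit:
            self.threads.move_to_end(thread_id, last=False)
        return True
```

Seen through this sketch, a thread survives only while comments keep bumping it, and once the bump limit is passed it can only sink. This is consistent with the bakers' practice of creating and maintaining successive threads rather than relying on a single one.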
On the one hand, we have come to reflect on cultures of use as shaped by how
platforms develop their infrastructure “not just in anticipation of inappropriate content
activity, but in response to it” (Gillespie, 2018, p. 264; see also van Dijck, 2013).
Scholars have become more attentive to the particularities and stylistic conventions that
emerge from social media, finding ways of recognising “how registers of meaning
111
https://boards.4chan.org/pol/
and affect are produced” by, in and through these infrastructures - what Martin Gibbs
and colleagues called “platform vernaculars” (Gibbs, Meese, Arnold, Nansen, &
Carter, 2015, p. 258; see also Flores, 2019; Geboers & Van De Wiele, 2020; Pearce et
al., 2018; Pilipets, 2019). This approach allows researchers to “examine the
specificities of social media platforms”, while paying special attention to the particular
forms of participation that occur on them (Gibbs et al, 2015, p.258).
On the other hand, we learn that understanding cultures of use is also a matter of asking
what it is like to experience or to ‘be’ on a platform (see Knuttile, 2011), combined with the
critical analytics data approach (see Bach, Tsapatsaris, Szpirt, & Custodis, 2018;
Pilipets et al., 2019; Tuters et al., 2018) suggested by the practices of digital methods.
Here social media are sites for causes and windows on an issue (Niederer, 2019),
rather than only a place to praise the quality of being well-known, thus avoiding giving
attention to vanity metrics (see Rogers, 2018).
The affordances of software
Software here stands for all the computational mediums that take part in the full range
of digital methods because they “re-adjusts and re-shapes everything [they are] applied
to – or at least, [they have] a potential to do this” (Manovich, 2014, p. 80). That means
that we assume social media, search engines and web platforms as software along with
other mediums such as Gephi and vision APIs. Here, the concept of software has less
to do with “a foundational understanding of computing” (Rieder, 2020)112 and more to
do with a consideration of what software has to say through its materialities,
potentialities, functioning, outputs and relational aspects. That is an invitation to
become familiar with “a properly technical substance that sits at the centre of technical
practice” (Rieder, 2020a, p. 54), and to take an active position towards the use of software,
which is required in the implementation of digital methods.
Therefore, a conceptual or a technical understanding of software is not enough for
these methods; empirical knowledge is also essential (as I argued in chapter 2). This
section thus addresses software as the last pillar of knowledge illustrated in figure 3.4,
paying attention to the content of software which concerns an awareness of software
operation from the standpoint of non-developer researchers. To contextualise the
112
“Which seeks to settle its ontological status in order to develop a clear, axiomatic basis that
supports the deductive style of reasoning analytical philosophy favors” (Rieder, 2020, p.52).
relational aspects of software with the other aspects to be considered when practising
digital methods (cultures of use and platform grammatisation), we will pay attention
to the notion of software affordances.
In the context of human-machine interaction, affordances mirror the perceived and
hidden properties of software (see Gaver, 1991; Norman, 1988).
[...] the term affordance refers to the perceived and actual properties
of the thing, primarily those fundamental properties that determine
just how the thing could possibly be used. [...]. Affordances provide
strong clues to the operations of things. Plates are for pushing.
Knobs are for turning. Slots are for inserting things into. Balls are
for throwing or bouncing. When affordances are taken advantage of,
the user knows what to do just by looking: no picture, label, or
instruction needed. (Norman 1988, p.9)
In the context of software and platform studies, Taina Bucher and Anne Helmond
(2017) introduce a notion of software affordances that is more useful and appropriate to the
context of digital methods. In approaching the question of affordances in social
media, the authors propose that we should pay attention not only to what technology does
to users but also to what software can do for users and “what platforms afford to other
kinds of users beside end-user” (Bucher & Helmond, 2017, p. 16). They affirm
that
[…] by approaching the question of affordance from a relational and
multi-layered perspective, the question is not just whose action
possibilities we are talking about, but also how those action
possibilities come into existence by drawing together (sometimes
incompatible) entities into new forms of meaningfulness. (Bucher &
Helmond, 2017 p.18)
A similar vision of software affordances is presented by Ruppert et al. (2013), who
warned researchers to be attentive to the specificities of the materiality of digital
devices and to explore “the chains of relations and practices enrolled in the social
science apparatus” (methods) (p. 41). Although the authors do not use the term
“affordance”, they discuss what digital devices (software) can do for researchers or
afford them to do, and what researchers might do to these devices. In this sense, software affordances can
also be taken as the materiality, productive and mediating capacities of software which,
according to Ruppert et al. (2013) are not explored in social theory.
In practical matters, and considering software affordances from a relational perspective
with platform grammatisation and cultures of use, we should be addressing questions
such as: how can we study platforms with and through software? What are the
elements/particularities of software that we should be aware of or master? What are
the grammatised actions required to query platforms using extraction software or web
research apps? In addition, there are basic requirements but also guidelines for a good
understanding of the triad grammatisation-cultures of use-software:
• The need to be aware of (extraction, mining, analysis and visualisation) software,
while becoming familiar with its technical substance.
• The need to understand how software takes part in the research design, analytical tasks
and presentation of findings.
• The need to recognise that software cannot act or perform alone but is, instead,
conditioned by our choices, decisions and knowledge.
To illustrate the issues raised above, two examples will be developed. The first relates
to data collection and visualisation, whereas the second illustrates the implementation
of digital methods for studying online images through vision APIs-based networks.
Research is always influenced by the technological grammars or the technological
environment under investigation. We can only capture what is stored within platforms’
databases and web environments. After choosing the extraction tool (e.g. YouTube Data
Tools) and the platform (e.g. YouTube), we need to understand how software responds
to the platform’s grammatisation and its regime of data access. This means asking, for instance, what entry
points are available to query platforms (e.g. key terms, location, hashtags, object id);
how far back in time data can be retrieved; what the standard output files are (e.g.
TAB, CSV, HTML, GDF, JSON etc.); and what sort of information comes with these
files. The answers determine what questions can be posed or answered.
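Some of these questions can be put directly to an output file. The following is a minimal sketch in Python, assuming a comma-separated export with one row per item and a column holding an ISO-formatted publication date. The column names (`videoId`, `publishedAt`, `viewCount`) and the two sample rows are hypothetical and invented for illustration; they are not the documented output of any specific extraction tool.

```python
import csv
import io
from datetime import datetime


def describe_output(csv_text, date_field, delimiter=","):
    """Summarise which fields a tabular output offers and its time span."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    rows = list(reader)
    dates = [datetime.fromisoformat(row[date_field]) for row in rows]
    return {
        "fields": reader.fieldnames,          # what information comes with the file
        "rows": len(rows),                    # how many items were retrieved
        "earliest": min(dates) if dates else None,  # how far back the data goes
        "latest": max(dates) if dates else None,
    }


# Hypothetical sample, invented for illustration only.
SAMPLE = """videoId,title,publishedAt,viewCount
abc123,On technicity,2019-05-02T10:00:00,1520
def456,Technicity and media,2016-11-20T08:30:00,87
"""

summary = describe_output(SAMPLE, date_field="publishedAt")
print(summary["fields"], summary["earliest"], summary["latest"])
```

A summary of this kind gives a quick sense of the fields and the time span a given file makes available before any research question is fixed.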
Another aspect requiring our attention is to understand that every decision matters
when using data extraction software; the choice of words combined with the
parameters chosen to collect data give us information about what we can get and how
we can see. Figure 1 shows what happens to the network visualisation when simply
switching one parameter of YouTube Data Tools (YTDT) (Rieder, 2015): at the top,
videos are ranked by relevance and, at the bottom, by date. Technicity was the keyword
used in the video network module, which provided two GDF files; in one, the network is
organised by what YouTube considers more relevant to the search query (technicity)
while, in the other, YTDT delivers a reverse chronological order based on the videos’
creation date.
As illustrated in Figure 1, after retrieving data from social media, the output files must
go elsewhere and be submitted to many other technical mediations. These are
situations in which both software and the researcher’s decisions intervene, re-adjusting
and re-shaping representations of online activity. For instance, when opening a file
with Gephi, we must add other layers of technical meaning provided by force-directed
algorithms (e.g. ForceAtlas2) and metrics like degree (from graph theory and network
analysis). Here, we not only meet other fields of study (e.g. network studies) but also
other technical mediums (besides YouTube and YTDT) with their own methods, rules
and language for making networks readable and interpretable. Accordingly, there is a
call for understanding the basic principles of network analysis, not just Gephi itself.
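To make one of these principles concrete, the sketch below reads a small network in GDF form and computes node degree by hand. It is an illustration in Python under stated assumptions: the file follows the common GDF layout (a `nodedef>` header, node rows, an `edgedef>` header, edge rows), and the three sample nodes are invented. It does not reproduce Gephi's own implementation, nor the ForceAtlas2 layout, which is left out entirely.

```python
import csv
from collections import Counter


def parse_gdf(text):
    """Parse a simple GDF text into (nodes, edges)."""
    nodes, edges, section, columns = {}, [], None, []
    for line in text.strip().splitlines():
        if line.startswith("nodedef>") or line.startswith("edgedef>"):
            section, header = line.split(">", 1)
            # Column definitions look like "name VARCHAR"; keep the name only.
            columns = [col.strip().split(" ")[0] for col in header.split(",")]
            continue
        values = next(csv.reader([line]))
        record = dict(zip(columns, values))
        if section == "nodedef":
            nodes[record[columns[0]]] = record
        elif section == "edgedef":
            edges.append((record[columns[0]], record[columns[1]]))
    return nodes, edges


def degree(edges):
    """Degree: the number of edges touching each node (direction ignored)."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


# Hypothetical sample network, invented for illustration only.
SAMPLE_GDF = """nodedef>name VARCHAR,label VARCHAR,viewcount INT
v1,Video one,1520
v2,Video two,87
v3,Video three,40
edgedef>node1 VARCHAR,node2 VARCHAR
v1,v2
v1,v3
"""

nodes, edges = parse_gdf(SAMPLE_GDF)
degrees = degree(edges)
print(degrees.most_common())
```

Here the node `v1` has the highest degree because it is connected to both other videos. A count of this kind is what a degree-based node size makes visible in a network visualisation.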
Figure 1. Effects of the choices made before extracting data.
Figure 3.12 helps us in understanding such entanglements, highlighting what we see after
feeding Gephi with the output provided by YTDT (e.g. through the nodes’ position within
the network, we may have a sense of how YouTube suggests relevant videos).
Furthermore, we see what YouTube users have created (e.g. title, published date,
category) and how other users have reacted to video content containing the word
technicity (e.g. number of views, comments, likes, dislikes). The position of nodes
(videos) within the network makes visible YouTube grammatisation (e.g. video
recommendation system) and cultures of use and is influenced by the affordances of
the algorithm ForceAtlas2. Node size and colour can either display YouTube
grammatised actions (e.g. commentcount, dislike, video category) or Gephi statistics
(e.g. modularity, average degree).
Figure 3.12. The entanglements of data with grammatisation, software functioning
and mediums-specificity.
The process of mapping a network of related videos with and about YouTube is a
process of accumulation and transformation in which my dataset/corpus remains
situated and contextualised, but it also gains new arrangements and technical substance
to be considered when interpreting the network. Crucial in this process are the three
pillars of the digital methods approach presented above.
The second example concerns the implementation of digital methods for studying
online images through vision APIs-based networks. Figure 3.13 illustrates the role of
software affordances in this process, using a research protocol diagram to expose how
the researcher’s intervention, software-specific potentialities and technical practices add
new form and meaning to a collection of natively digital images, while transforming
these throughout the operationalisation of the methods. This protocol is directly related
to the description of the process of building/interpreting computer vision-based
networks in chapter 2 (or my attempt to illustrate more clearly the mental and practical
modes of what I am calling the technicity-of-the-mediums in digital methods).
In short, the process starts with a list of social media image URLs, which are interpreted
by web vision APIs, re-arranged as a network by the researcher and other software
and, finally, analysed according to the network’s visual affordances. The result corresponds neither
to the grammatisation of the platform from which the images were extracted, nor to the
grammatisation of the vision API itself. Online images are merged, re-adjusted and
re-shaped by other computational mediums inherent to the practices of digital methods.
The final computer vision-based network thus carries another order of grammatisation,
one co-created with technical mediums.
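The re-arrangement step can be sketched as follows. This is a minimal Python illustration, assuming the vision API output has already been reduced to a list of (label, score) pairs per image URL. The function names, the example URLs, the labels and the score threshold are all hypothetical, and no call to an actual vision API is made.

```python
def build_image_label_network(annotations, min_score=0.0):
    """Turn {image_url: [(label, score), ...]} into bipartite nodes and edges."""
    nodes = {}  # node id -> node type ("image" or "label")
    edges = []  # (image_url, label, score)
    for image_url, labels in annotations.items():
        nodes[image_url] = "image"
        for label, score in labels:
            if score < min_score:
                continue  # a researcher's decision: drop low-confidence labels
            nodes[label] = "label"
            edges.append((image_url, label, score))
    return nodes, edges


def images_by_label(edges):
    """Group images that share a label (the basis of image clusters)."""
    groups = {}
    for image_url, label, _score in edges:
        groups.setdefault(label, []).append(image_url)
    return groups


# Hypothetical annotations, invented for illustration only.
annotations = {
    "https://example.org/img1.jpg": [("Food", 0.99), ("Dish", 0.60)],
    "https://example.org/img2.jpg": [("Food", 0.90)],
}

nodes, edges = build_image_label_network(annotations, min_score=0.7)
clusters = images_by_label(edges)
print(clusters)
```

Even in this toy version, the threshold chosen by the researcher changes which labels exist in the network at all. This is one way in which the resulting network is co-created rather than simply given by the platform or the API.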
The resulting chain of methods is built on top of existing but different technological
grammars, going beyond the limitations of each of those individual grammars and
methods. It is a methodological process that demands a good understanding of the
technicity-of-the-mediums, but which also leads the researcher to work with a second
order of grammatisation (see also the last section of chapter 2). The result is the
creation of new methodological grammars on the basis of existing technological
grammars and software affordances.
Figure 3.13. The methodological process for creating computer vision-based networks to
study online images. Concept by Janna Joceli Omena and design by Beatrice Gobbo.
Introduction to chapter four
This chapter was originally published as: Omena, J. J., Rabello, E. T., & Mintz, A. G.
(2020). Digital Methods for Hashtag Engagement Research. Social Media +
Society. https://doi.org/10.1177/2056305120940697
The chapter exemplifies the first steps in digital methods research and introduces the
attitude of making room for the technicity of the mediums. It also marks a change in
thinking about my research object, from political polarization to the role of
platform/software and of methods. The chapter began to take shape in 2016 following
a growing political polarization in Brazil, taking advantage of previous knowledge
about Instagram grammatisation and the possible analytical approaches afforded by
hashtag-based data, for example, networks of hashtag co-occurrence and account-based
analysis such as verifying who were the most active users using specific hashtags. At
that time, Instagram’s API Platform was relatively generous in offering access to posts
and metadata associated with a list of hashtags, not only in quantity but also in a temporal
perspective. It allowed researchers to go back days, months, and even years in time.
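The two analytical approaches mentioned above can be sketched in a few lines. This is an illustrative Python example, assuming each post is available as a (user, list of hashtags) pair. The user names and hashtags below are invented, and the functions are not the procedure used by any particular tool.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(posts):
    """Count how often two hashtags appear together in the same post."""
    pairs = Counter()
    for hashtags in posts:
        unique = sorted(set(tag.lower() for tag in hashtags))
        pairs.update(combinations(unique, 2))
    return pairs


def most_active_users(records, hashtag, top=3):
    """Rank users by the number of their posts that contain a given hashtag."""
    counts = Counter(
        user
        for user, hashtags in records
        if hashtag in (tag.lower() for tag in hashtags)
    )
    return counts.most_common(top)


# Hypothetical records, invented for illustration only.
records = [
    ("user_a", ["protest", "Brazil"]),
    ("user_b", ["protest", "brazil", "democracy"]),
    ("user_a", ["protest"]),
]

pairs = cooccurrence([hashtags for _user, hashtags in records])
active = most_active_users(records, "protest")
print(pairs.most_common(1), active)
```

Each pair count corresponds to the weight of an edge in a hashtag co-occurrence network, while the user ranking supports the account-based reading of the same dataset.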
Data collection occurred in several iterations from March 13 to March 31, 2016 and
was supported by Visual Tagnet Explorer (Rieder, 2015). The datasets were organised
in a datasheet113. Data collection was carried out by closely following the political
polarisation movement through the lens of Instagram. Data analysis was initiated in
the context of a data sprint in July 2017, leading to the publication of a paper in co-authorship with Elaine Teixeira Rabello and André Mintz. My main contribution to
this article concerns the methodological design and conceptualization, and the analytical
proposal to consider dominant and ordinary actors, uses and contents associated with
hashtag engagement.
First, the technicity approach here means paying close attention to Instagram’s API to make
sense of how researchers can access, treat and repurpose the platform’s grammatised
actions (technically and practically). Second, the chapter poses research questions
aligned with the technicity of the mediums and the object of study (Brazilian protests
mobilised towards the “impeachment-cum-coup” of Brazilian president Dilma
Rousseff and framed by Instagram hashtag engagement). Third, and by seeking ways
113
https://drive.google.com/file/d/0B7j2W-Xfs9qBTVRhNzYydHdQd3c/view?usp=sharing
to apply “critical analytics” for social media research (Rogers, 2018), the chapter
introduces the three-layered (3L) perspective as a way of making sense of hashtag
engagement. The 3L perspective assembles hashtag engagement, their related content,
and the actors involved by distinguishing dominant and ordinary groups embedded in
social media practices and mechanisms. As an alternative to the customary choice of
focussing on the most popular content, I suggested that we consider not only what is highly
visible but also what is kept out of the spotlight. Finally, this chapter reflects my efforts
in mobilising the three key aspects of the practice of digital methods (software
affordances, platform’s grammatisation and cultures of use) using a quali-quanti methods approach. We started with the meticulous choice of a list of hashtags, followed by
the visualization of co-occurrence networks to identify other potential hashtags. After
collecting data and considering our analytical proposal (hashtags in favour and against
the impeachment of Dilma), we excluded offensive hashtags (coxinha march vs.
mortadela day) due to the low number of posts and one ambiguous hashtag (coup)
which could be both for and against the impeachment. Data was merged into two
databases and we used basic Excel formulas (e.g. VLOOKUP, SUMIF, percentile,
remove duplicates) to distinguish high-visible from ordinary actors according to the
total of likes and comments their posts received in the days of the protests. We then
used spreadsheets114, basic exploratory visualisations115, the web (verifying the post
URLs) and ImageSorter116 (to make sense of all visual content) to qualitatively analyse
the top 40 high-visible actors of both the pro-impeachment and anti-coup groups. The
image-label networks and the co-term networks serve as other examples that provided
a macro perspective of the case study (highlighting visual content associated with each
protest), while the analysis of the #nãovaitergolpe co-occurrence network pointed to a very
specific situation regarding the appropriation of a hashtag, from which
we were able to detect a shift in the original meaning of #nãovaitergolpe.
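The spreadsheet step that separates high-visible from ordinary actors can be approximated in code. The sketch below is a Python illustration under explicit assumptions: the percentile cut-off (90) is hypothetical, since the text does not state which value was used; the nearest-rank percentile computed here is a swapped-in simplification and does not replicate Excel's interpolated percentile; and the sample posts are invented.

```python
import math
from collections import defaultdict


def engagement_per_actor(posts):
    """Sum likes and comments per actor (comparable to a SUMIF step)."""
    totals = defaultdict(int)
    for post in posts:
        totals[post["user"]] += post["likes"] + post["comments"]
    return dict(totals)


def percentile(values, p):
    """Nearest-rank percentile of a list of numbers (0 < p <= 100)."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


def split_actors(posts, p=90):
    """Label actors at or above the p-th percentile as high-visible."""
    totals = engagement_per_actor(posts)
    threshold = percentile(list(totals.values()), p)
    high = {user: n for user, n in totals.items() if n >= threshold}
    ordinary = {user: n for user, n in totals.items() if n < threshold}
    return high, ordinary


# Hypothetical posts, invented for illustration only.
posts = [
    {"user": "a", "likes": 100, "comments": 10},
    {"user": "a", "likes": 50, "comments": 5},
    {"user": "b", "likes": 3, "comments": 0},
    {"user": "c", "likes": 10, "comments": 2},
    {"user": "d", "likes": 1, "comments": 1},
]

high, ordinary = split_actors(posts, p=90)
print(high, ordinary)
```

Lowering or raising the cut-off changes who counts as high-visible, so the threshold is itself an analytical decision to be reported alongside the findings.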
114
High-visible actors: https://drive.google.com/file/d/0B7j2WXfs9qBNnF3WDN2MmdsekE/view?usp=sharing&resourcekey=0-36LeRkc7nnlS8FYKDa2HWQ
High-visible actors (anti-coup protests): https://drive.google.com/file/d/0B7j2WXfs9qBbVYwZXg3YkNqVHc/view?usp=sharing&resourcekey=0-c3ulgDn7ktI556ijkqWHng
High-visible actors (pro-impeachment protests): https://drive.google.com/file/d/0B7j2WXfs9qBNS1TXzQxUHpFWlE/view?usp=sharing&resourcekey=0-6nDhzC56SIYkv53UTiwfsA
115
For instance: https://drive.google.com/file/d/0B7j2WXfs9qBMGdmblN6ZklMYXM/view?usp=sharing&resourcekey=0-EgJYDvReaxNXEk9w_OQu8g
116
For instance: https://drive.google.com/file/d/0B7j2WXfs9qBNUxHNjhEVnRURWc/view?usp=sharing&resourcekey=0-382uZW7S0WjTxqXNl2-2fw
As regards methods, on the one hand, this chapter can be characterised as a naive look
at the potentialities of computer vision, particularly since the analysis took place just
a few months after the launch of Google Vision API in May 2017. At the time, using
computer vision networks for research purposes was a methodological novelty with
room to be developed and criticised. This naive appropriation of digital technologies
and methods allows me to reflect on the situation of researchers (or students) who
are required to handle a series of other analytical decisions without mastering all the
technical details of the methods. The use of computer vision was here purely
exploratory, and the operations required to build the networks of the pro- and anti-impeachment protests in Brazil have perhaps stolen some time from the critical
questioning of the technical constraints of Google Vision API. The outputs were not
contested but interpreted. This is another challenge we need to recognise when using
digital methods. Nevertheless, the results showed that the approach can be promising
when dealing with large collections of images. On the other hand, this chapter
introduces an experimental and situational approach (3L perspective to hashtag
engagement) that has been developed and replicated in other research contexts117,
confirming not only its capacity to provide rich insight and an in-depth vision of the
case study but also its methodological contribution to digital research.
There are some practical aspects that can be learnt from this case study, as highlighted
below:
• The requirement of a careful data curation process considering what hashtags are
being used, the role of the extraction software and of the exploratory visualisations in
the making of a good data sample.
• When considering the technicity of the mediums, the formulation of research
questions should go hand in hand with the subject of study itself and in relation to the
ensemble of computational mediums put together by the researcher, as well as the
technical practices required by the methods.
117
For instance, and in the context of the 2021 SMART Data Sprint, see the following projects:
https://smart.inovamedialab.org/2021-platformisation/project-reports/gramming-covid19-reframingthe-pandemic/, https://smart.inovamedialab.org/2021-platformisation/project-reports/homesofinstafor-a-lockdownlife/, https://smart.inovamedialab.org/2021-platformisation/projectreports/vivasylibresnosqueremos/,
• There are no ready-made data for social research. Checking, cleaning, re-working, including or
deleting data for the analysis are essential tasks, from the use of basic Excel formulas and
data cleaning techniques to mastering the use of research software. These practices
are not explicitly available in the digital methods literature, nor have they yet been
standardised.
• In the analysis of data and especially with a more qualitative approach, the web should
always be a source of consultation and analysis.
• In research oriented by hashtag engagement, researchers are invited to consider both
high-visibility and ordinary users. At all levels of analysis, unique actors must be
identified (e.g. users, link domains, image URLs and video ids) and subsequently
distinguished into highly and less visible. These analytical decisions (or strategies) help
researchers to better situate and contextualise the data sample, while also exposing the
hierarchical structure of the Web combined with how Instagram makes data available.
• When analysing digital networks, great descriptive efforts are required before
reaching possible findings or insights.
• The use of computer vision facilitates the interpretation of large image datasets which
are labelled with confidence scores and ranked by topicality. For instance, the labels
assigned to pastéis de nata would be ranked by topicality (dish, food, cuisine) and
respective scores, as demonstrated below:
o Dish (0.9934035), Food (0.9903261), Cuisine (0.9864208), Egg tart (0.9717019), Baked goods (0.9347744), Ingredient (0.9207317), Dessert (0.88936204), Custard tart (0.8410596), Pastel (0.8360902), Pastry (0.8034706)
• When building a network of images and labels provided by computer vision, topic
categorization creates image clusters, facilitating the interpretation of the visual
content.
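The label output given for pastéis de nata can be handled programmatically. The sketch below, in Python, parses a string of the form `Label(score),Label(score)` into pairs and keeps only labels above a confidence threshold. The function names and the threshold of 0.9 are assumptions made for illustration, and the string format is taken from the example in the list above rather than from any API specification.

```python
import re

# One label followed by its score in parentheses, e.g. "Egg tart(0.9717019)".
LABEL_PATTERN = re.compile(r"\s*([^,()]+?)\s*\(([0-9.]+)\)")

# The labels and scores from the pastéis de nata example in the text.
PASTEIS_LABELS = (
    "Dish(0.9934035),Food(0.9903261),Cuisine(0.9864208),"
    "Egg tart(0.9717019),Baked goods(0.9347744),Ingredient(0.9207317),"
    "Dessert(0.88936204),Custard tart(0.8410596),Pastel(0.8360902),"
    "Pastry(0.8034706)"
)


def parse_labels(text):
    """Parse 'Label(score),Label(score)' into a list of (label, score)."""
    return [(name, float(score)) for name, score in LABEL_PATTERN.findall(text)]


def top_labels(labels, min_score=0.9):
    """Keep the names of labels at or above a (hypothetical) threshold."""
    return [name for name, score in labels if score >= min_score]


labels = parse_labels(PASTEIS_LABELS)
confident = top_labels(labels, min_score=0.9)
print(confident)
```

With this threshold, the more general labels (dish, food, cuisine) are retained while more specific ones such as custard tart are dropped. The example shows how a filtering choice shapes which labels end up connecting images in a network.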
This chapter feeds into the main arguments of this dissertation, by presenting a case
study that took into account the need to become acquainted with computational
mediums from a conceptual-technical-practical perspective, while introducing a
methodological framework.
4 HASHTAG ENGAGEMENT RESEARCH118
CHAPTER 4
118
This chapter was originally published as: Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020).
Digital Methods for Hashtag Engagement Research. Social Media +
Society. https://doi.org/10.1177/2056305120940697
Introduction
In 2007, when Chris Messina made a tweet suggesting the use of # to organize content,
he could not have predicted how the movement of adding the hash symbol before a
word, a sequence of characters, or an emoji would become an everyday social practice
inside and outside of web platforms. The adoption of the # symbol goes beyond the
labelling of trackable content or elements; instead, it is now undertaken as “multiple,
open-ended, and contingent phenomen[on]” in society (Rambukkana, 2015, p. 5) that
serves digital research as a storytelling device.
At the same time, the use of hashtags points to controversial and tricky activities
(projected to create, induce, or keep alive a given debate/conversation). Either way,
these activities have demanded medium-specific methods and research (Gerlitz &
Rieder, 2018; Rogers, 2013). In alignment with new media scholars (Highfield &
Leaver, 2016; Langlois & Elmer, 2013; Rieder & Röhle, 2017; van Dijck, 2013), we
argue that social media research faces multiple challenges related to its complexity,
both in terms of the amount of information that circulates online and, especially, of
the need to investigate how to carry out research with the indispensable technical
knowledge. This involves raising questions, for instance, regarding how to approach
hashtags through platform mechanisms and how to handle the affordances and
limitations imposed by their infrastructure (see Marres, 2017; Rieder et al., 2015).
Against this background, this article proposes a framework to tackle the problem of
the methods applied to understanding collectively formed actions mediated by social
media platforms, that is, what we refer to as “hashtag engagement.” To that end, we
acknowledge “methods” as not only complementary to digital research but in an
interdependent position (Latour, 2010; Rogers, 2013) and, consequently, the study of
“hashtag engagement” as something that requires technical knowledge and (a
minimum) practical expertise on applied research with digital methods. In this
regard, we incorporate the notions of technicity (Simondon, 2009, 2017) and platform
grammatisation (Agre, 1994; Gerlitz & Rieder, 2018; Stiegler, 2006, 2012) to better
understand the complexity and challenges of hashtagging for digital research.
Furthermore, we present the three-layered (3L) perspective which aims to
“repurpose” the way we reason about hashtag engagement, moving from folksonomy
aspects to their multiple and complex role in and through social media. Under the
lens of digital methods (Rogers, 2013, 2019) and distinguishing high-visibility versus
ordinary actors and related content, the 3L approach aims toward providing a novel
way for reasoning and doing research about hashtag engagement. To conceptually
and practically introduce our proposal, we draw on the case of the “impeachment-cum-coup” of Brazilian president Dilma Rousseff. The demonstrations of March
2016 are particularly meaningful as they marked a heightened peak of political
polarisation in Brazil. We then took advantage of Instagram both as a source of
historical data generated by millions of citizens and as a site of research. We first
revisit the role of hashtags and situate “hashtag engagement” to underpin the 3L
perspective.
Revisiting the role of hashtags
The use of hashtags is undoubtedly a part of our digital life. There is a hashtag for
almost every social interest, for example, political causes or protests (#elenão vs.
#elesim), branding or advertising campaigns (#PepsiGenerations), genre
representation (#femboy), the awareness of illness (#microcefalia), erotic content
(#22), tourism (#RiodeJaneiro), gastronomy (#foodporn), memories (#tbt), and so
on. As natively digital objects (Liu, 2009; Rogers, 2013), hashtags may serve as
indexes for their functions, meanings, and practices. That is to say, one can search for,
navigate, or engage with hashtags, while others can monitor, trace, and retrieve small
or large datasets linked to them. Engaging with hashtags may express local or global
conversations, compact or large events, and controversial or non-controversial issues
(Bruns & Burgess, 2011; Burgess et al., 2015; Highfield, 2018; Pearce et al., 2020;
Tiidenberg & Baym, 2017). It is also essential to recall that hashtagging is not an
exclusively human activity but often the fuel behind effective bot activity (Bessi &
Ferrara, 2016; Omena et al., 2019; Wilson, 2017), which is also used on social media for
political and marketing purposes. This means that, beyond the capacity to represent
communities, publics, discourses, or sociopolitical formations, hashtags can be
perceived as sociotechnical networks, both as “the medium and the message”
(Rambukkana, 2015). The act of engaging with hashtags is not a new theme within
Social Media Studies, particularly for Twitter. This platform is the most common focus
of hashtag-led studies, with a vast theoretical and empirical literature that addresses
the relationship between hashtags and social formations (see Bode et al., 2014; Bruns
& Burgess, 2011; Burgess et al., 2015; Small, 2011). Moreover, the use of political
hashtags is a prevailing criterion in corpus selection (Jungherr, 2014, 2015). On
Instagram, however, scholars have approached hashtags in selfie studies (Tifentale,
2015), commemoration and celebration (Gibbs et al., 2015), geolocalisation and sociospatial divisions (Boy & Uitermark, 2016), and as innovative visual methods to
research emoji hashtags (Highfield, 2018) or climate change images (Pearce et al.,
2020). Hashtags also serve as a path either to training data for the development of
automatic image annotation (Giannoulakis & Tsapatsoulis, 2016) or to addressing
human behavior (see Cortese et al., 2018; Tiidenberg & Baym, 2017).
On Instagram, the use of hashtags began in 2011,119 promoted by the platform
community team through an initiative named “Weekend Hashtag Project”: a weekly
campaign that stimulates a culture of hashtag use in association with artistic and
creative photographic styles, giving users a chance to have their publications featured
by Instagram. Beginning at the end of 2011, weekly suggestions were prompted every
Friday, such as #throughthefence and #middleoftheroad in November, and
#vanishingpoint in December120. Over time, the prefix “WHP”121 became compulsory
for those who wanted to join the project and the weekly announcements moved beyond
the Instagram Blog on Tumblr to other platforms such as Twitter and Facebook. After
Instagram, a new tagging practice has also emerged throughout the #insta tags
family—for example, #instagood, #instamood, #instadaily, #instalike, #instalove.
These tags, moving across platforms, not only gave rise to ready-made hashtag
thematic lists to boost (automated) engagement122, but have also pushed the boundaries
of hashtagging, and challenged hashtag-based studies.
119
Although it is said that hashtag use began in the same year that Instagram was launched, in fact, we
were able to detect two accounts that used hashtags in 2010, namely, cindy44
(see https://www.instagram.com/p/B7Ho/, https://www.instagram.com/p/CNLr/,
https://www.instagram.com/p/DPv4/) and natsuke (see https://www.instagram.com/p/vlyc/). The first profile, which
belongs to a female creative director, adopted the tags #cindy44, #donkey, #throughthefence, #jj and
#birds in the month that Instagram was created, October 2010, and the second profile also used the tag
#throughthefence, but in December.
120
See http://blog.instagram.com/post/13120184445/throughthefence;
https://twitter.com/instagram/status/141220040329531392;
https://twitter.com/instagram/status/148826765953990656.
121
See a few examples: #streetartistry in 2012, #whpemptyspaces in 2013, #whpmirrormirror in 2014,
#whpboomerang in 2015, #whpidentity in 2016, #whpinthekitchen in 2017 and #whp in 2018.
Beyond serving as a description of visual content (Giannoulakis & Tsapatsoulis, 2016)
or as an index for a topic, a hashtag is also a register for the realm of feelings, ideas,
and beliefs (Papacharissi, 2015). To demonstrate this, #BrasilContraOGolpe [Brazil
against the coup] may serve as a good example. In late March 2016, this tag emerged
from Dilma Rousseff’s supporters and “democracy advocates.” Activists, intellectuals,
journalists, politicians, and ordinary users started using #BrasilContraOGolpe as a
reference to the impeachment process against the president - considered by many as a
“modern coup” (Jinkings et al., 2016). Pro-impeachment supporters, however, have
also adopted the usage of the tag, but shifted its original meaning to support their
arguments: claiming that the real coup would be that of keeping Dilma Rousseff and
her Workers’ Party (PT) in charge of the government. This meaning shift, especially
concerning polarised debates in programs and anti-programs (see Akrich & Latour,
1992; Rogers, 2018), is an example of double-sense hashtags.
To locate these modes of appropriation, a technical understanding of the platforms’
functional forms of living (technicity) must be entangled with the process of doing
digital methods (Rogers, 2019). Studies based on hashtags, however, should not
conflate different platforms but, rather, apply different analytical procedures to each
one (see Highfield, 2018; Highfield & Leaver, 2015; Rogers, 2017). Conversely,
hashtags can be viewed as “problematic” content for digital research due to their failure
to cover certain sensitive issues that tend to be disguised, such as pro-eating disorder
content (see Gerrard, 2018). The collective adoption of tags can also be employed as
a comparative source to grasp hashtagging activity in different platforms, which can
be used to adapt methodological approaches (Highfield & Leaver, 2016). Despite
unveiling different layers of reasoning, the logics of the hashtag adoption, and its
consequences in a given context, these studies do not necessarily address hashtagging
as a collective action movement. As an alternative, we introduce the idea of
discussing hashtag engagement rather than hashtag adoption, in conjunction with the
technicity of Instagram and its grammatisation process.
122
These lists of hashtags were originally adopted to increase views on publications and,
consequently, to boost likes, followers, and comments with the help of applications and their
automated mechanisms.
Situating Hashtag Engagement
What, then, does the word engagement in hashtag engagement refer to? Engagement
is taken as actions, metrics, and research indicators. For instance, one can argue that
hashtag engagement is commonly associated with the act of using tags to engage with
news, activism, brand strategies, event-based information, politics, demonstrations,
automation practices, or specific debates. However, the term engagement has been
either used to name platform-afforded metrics (or the totality of commensurable
activities in a media item) or taken as an indicator for research design. Engagement
metrics have thus become part of general digital media literacy as well as parameters
for selecting data samples to be further analysed. Partly encouraged by terminology
adopted by platforms themselves123, these metrics have even merged with the very
notion of engagement in common parlance124.
On this topic, Marres (2017) refers to the analytic figure of power-law as a critical
issue in “the re-validation of hierarchical forms of social and public life” (p. 71).
According to Marres, by feeding power laws back to users in the form of trending lists,
digital platforms not only inform what goes on in digital settings but also serve “as an
instrument that influences collective action”. And, while these can be understood as
actual and faithful results of how users generally relate to the media, Gillespie (2017)
draws attention to how the platform metaphor may hide inherent biases and active
intervention of the internet high-tech companies, while suggesting a smooth standing
point from which users can participate equally and fairly. These two remarks remind
us that hashtag engagement also responds to platform infrastructures and mechanisms.
In this scenario, we understand that social media engagement can be approached under
a dual logic. In one way, it prioritises the sum of actions media items receive from
many actors. Alternatively, engagement with a topic can be perceived by the recurring
use of natively digital objects (Rogers, 2013) or grammars of action (Agre, 1994) from
many actors about a topic, that is, many people using particular terms, hashtags, or
images. Following platform mechanisms, the first logic is reflected in the most
engaged list or what is dominant in terms of popularity and influence, parameters
commonly taken for sampling purposes in social media research. The second logic
refers to the diffuse posting of content related to particular issues that do not
necessarily reach large numbers of likes, shares, or similar actions. That is where we
would also find ordinary posts kept out of the spotlight, in a distribution that is similar
to C. Anderson’s (2008) notion of the long tail.
123
Platforms’ documentation commonly refers to these metrics as “post engagement” and offers their
analytic products as a way to “measure engagement.”
See https://analytics.twitter.com or https://www.facebook.com/business/help/735720159834389
124
The term and the problem to which it refers have a history that long precedes social media platforms,
and have been related to different meanings besides the ones they have come to convey nowadays. For these
reasons, as discussed by Rafael Grohmann (2018), it is important to critically and carefully consider
the term’s polysemy when it is used conceptually.
The dual logic of social media engagement thus raises concerns in research methods,
particularly the understanding of the high-visibility and ordinary lists: what different
stories can they tell? How may these lists complement or contradict one another? Some
researchers have addressed specific concerns regarding how the practice of
emphasising high-visibility content or the logic of popularity may lead to social media
studies driven by engagement parameters (Marres & Weltevrede, 2012; Rosa et al.,
2018). On the other hand, there is a long-standing debate around what ordinary means
and why it matters for Cultural, Communication and Media Studies. For instance, in
attempting to describe the ordinariness of culture, Williams (1989) explained how
difficult it is to interpret the ordinary or unknown audience. In his view, ordinary
people do not belong to “the normal description of the masses”; they belong to the
unknown or unseen structures (Williams, 1989, p. 98).
This article thus proposes, from a standpoint of quali-quantitative methods (Latour et
al., 2012; Moats & Borra, 2018; Venturini, 2010), an alternative perspective to
addressing engagement in social media research; a call to embrace not only highly
visible content, but also ordinary, less-visible content for the interpretation of hashtag-mediated actions.
Reasoning with and through the medium
The study of hashtag engagement also requires a grasp of the functioning of the
platform itself (technicity) along with the platform’s techno-materialisation process
(grammatisation), which “enable(s) behavioural fluxes or flows to be made discrete (in the
mathematical sense) and to be reproduced” (Stiegler, 2012, p. 2). In this regard,
we incorporate the notions of technicity and grammatisation, which not only
complement one another but are crucial for social media research and, accordingly, for
the concretisation of the 3L approach.
Technicity
The philosophy of Gilbert Simondon (2009, 2017) reminds us of the crucial role of
technicity for an understanding of “the mode of existence of the whole constituted by
man and the world” (2017, p. 173)—a reality mediated by technical objects. The
reasoning proposed in this article derives from Simondon’s ideas on the essence of
technicity (2017) and the technical mentality (2009). Technicity, in a specific manner,
refers to the notion of “function” as being associated with the technical and practical
forms of knowledge of technical objects and how they relate to us. On this basis,
technicity would simultaneously precede and take place with and in technical objects:
first by being related to figural structures or the realm of ideas and, second, by the
recognition of technical objects as a practical reality. This movement (from
representative aspects to the praxis of techniques), consequently, divides technicity
into two orders of thought: theory and praxis. In this way, technicity concomitantly
triggers not only theoretical but also technical and practical knowledge on the
functioning of technical objects and their relationship with human beings.
A technical mentality thus implies thinking hashtag engagement with, in, and through
social media platforms. Rather than only looking at the content, a study based on the
technicity of Instagram should also consider the functioning of its technical interfaces
and algorithmic techniques. One example of this would be to take advantage of
application program interface (API) documentation using the knowledge about
platform data access regimes, end-points, and their respective limitations and rate
limits to repurpose social media research125. This article aligns with concerns raised
by scholars such as Rieder et al. (2015) and Langlois and Elmer (2013), by looking at
what is in social media technical interfaces as a way to perceive how social media
grammars (hashtags) have been rendered and made available. In doing so, we propose,
in practical ways, a more techno-aware understanding of social life (Marres, 2017) in
pursuit of studying “hashtag engagement” on social media.
125
Some examples of the limitations of the Instagram Graph API for getting hashtagged media include the
fact that one cannot request the username field or query more than 30 unique hashtags within a 7-day
period.
Platform Grammatisation
When referring to grammatisation, we are addressing an extension of the concept
forged by Auroux (1994)—a process of description, formalisation, and discretization
of human behaviors into representations, so that they can be reproduced (Crogan &
Kinsley, 2012). This is what the French philosopher of technology, Bernard Stiegler
(2006, 2012), called the process of digital grammatisation in which “all behavioural
models can now be grammatised and integrated through a planetary-wide industry of
the production, collection, exploitation, and distribution of digital traces” (Stiegler,
2012, p. 2). More recently, Gerlitz and Rieder (2018), envisioning the infrastructural
aspects of Twitter, presented an updated definition of grammatisation: when users
inscribe themselves into predefined forms and options produced and delineated by
technical interfaces (software) to structure their activity. Beyond providing a way of
looking at things, platform grammatisation simultaneously produces standardisation
of actions (e.g., likes) and formalises these activities to calculability. This is a relevant
concept for digital methods-based research, due to its strong focus on media-specificity, which, in the case of social media, is very much defined by their
grammatisation of social activity.
Next, we borrow Agre’s (1994) technical understanding of “grammars of action” or
the representative forms of “discourse-made-machinery,” such as hashtagging,
commenting, posting, replying, and so on. In this sense, hashtags are no longer text,
but, by being clicked, they enact a navigational function. Thus, hashtag engagement is
embedded into the platform databases that predefine specific properties (e.g., a tagged
post has a caption, an image, or video and date of publication), the relationship between
them (e.g., hashtags appear in Instagram posts), and a set of actions (e.g., liking or
commenting on posts, using filters; see Gerlitz & Rieder, 2018). When considering
how social media databases store and organise actions attached to the # symbol, we
verify multiple forms of storing and further accessing hashtag data. As an illustration,
through the former Instagram Platform API, it was possible to recall the number of
times a profile mentioned a given tag (suggesting a form of appropriation) or the
provision of ways of seeing correlations among tags (through a co-tag network).
Meanwhile, the current Instagram Graph API only allows the search for the most
popular or recently published tagged content.
In other words, and despite their prestructured form (#), hashtags can be differently
embedded into social media databases, thus permitting different ways of reading
hashtag engagement. Along with this grammatisation process, hashtags can also
acquire different meanings and purposes in the modes they are used and, therefore,
researched. That is what we refer to here as “the grammars of hashtags”: how social
media capture and reorganise the different modes of actions attached to hashtagging.
The 3L perspective for studying hashtag engagement
The 3L perspective assembles hashtag engagement, their related content, and the
actors involved by distinguishing dominant and ordinary groups embedded in social
media practices and mechanisms. The practical awareness of the platform
grammatisation and technicity is the basis that concretely informs the 3L approach.
This kind of knowledge, we argue, provides practical ways of reasoning with and
through the functioning of the platform itself and its conjunction with hashtag
engagement. Just like digital methods (Rogers, 2013, 2019), the 3L perspective must
follow and evolve with the medium, its methods, and the affordances of digital data.
Following the lexicon and proposal of Rogers (2018), this perspective also serves as a
form of “critical analytics” or “alt metrics” for social media research by locating issue
networks and creating indicators that are alternatives to marketing-like measures.
We understand hashtag engagement as collectively formed actions mediated by
technical interfaces. In other words, these are grammatised actions that move toward
descriptions of images and feelings or toward particular topics of discussion (or
issues), which require a (minimum) collective level of commitment. These
sociotechnical formations, differently inscribed within web platforms, offer a framed
(but sturdy) perception of society while providing social media research with different
levels of analysis. Through the lens of the 3L perspective and along with the proposal
of sociologist Bruno Latour (2010; Latour et al., 2012), the study of hashtag
engagement allows analysis to move between the levels of the element (micro) and of
the aggregates (macro)126. With Latour and others (Omena et al., 2019; Venturini et
al., 2015, 2018), we embrace a “navigational practice” not restricted to either of those
levels but a research practice that goes from micro to macro and back, taking any of
them as a starting point for the inquiry. Few studies, however, have been developed on
methods for researching hashtag engagement on Instagram on such bases. This is a
contribution we expect to make with our 3L perspective for hashtag engagement
studies on (but not restricted to) Instagram.
In what follows, we explain each layer comprising the integrated 3L approach.
Although presented in a linear sequence, they must be taken together, as layers of the
same object.
Layer 1: High-Visibility Versus Ordinary
On this analytical level, unique actors are identified and subsequently distinguished
according to the modes of activity and engagement metrics received by their posts over
time (the acts of hashtagging or interacting with tagged content). In so doing, we
attempt to cover both high-visibility and ordinary actors and related content, as well
as answer the following questions: who are the high-visibility and the ordinary actors?
Who dominates the debate? What is the visual and textual content related to them?
What are the sites of image circulation? How about the distribution of users, posts, and
engagement?
The main challenge is in proposing a threshold for delimiting high-visibility from
ordinary hashtag usage, its related actors, and content127. Driven by Rogers’s (2018)
alternative metrics to study issue networks in social media research, we considered the
persistence of user activity over time as it is inscribed in platform engagement
metrics. Thereby, it is an attempt to address what the social media digital attention
economy either emphasises or not. In this logic, high-visibility actors and content are
understood as the minority, which exhibits comparatively high and consistent
engagement metrics (e.g., like and comment counts) across the observed time span.
This would indicate not only the scale of their audience but also their ability to receive
responses to their publications. Conversely, ordinary actors and content would be the
majority, exhibiting comparatively lower engagement metrics, reaching a smaller
audience. Of course, these categories are not empirically self-evident. Rather, the
threshold needs to be arbitrarily defined by grounded criteria.
126
Latour’s proposal is based on Gabriel Tarde’s social theory, particularly his idea of quantification.
The importance and influence of Gabriel Tarde’s work is recognized by Bruno Latour when he places
Tarde as the main precursor of Actor-Network Theory.
127
Actor activity is understood as their tagging or uploading of tagged content over time, whereas metrics
of engagement mean the total of likes and comments in a publication. In other platforms, engagement
metrics could also include reposting (share, retweet, reblog, etc.), among other actions.
Layer 2: Hashtagging Activity
The second layer relates to the repurposing of hashtagging activity for grasping the
grammars of hashtags. By this, we mean the ways in which social media platforms
capture and reorganise the different modes attached to hashtagging. Far from being
neutral intermediaries (Latour, 2005), hashtags are taken as entities to which the
activities of users, bots, and platform algorithms converge and through which they
mutually transform one another. Although such entanglement can be very complex, it
is possible, in line with digital methods’ perspective (Rogers, 2013), to repurpose
hashtags as traces from which one may infer those activities.
Besides framing the most active actors or serving as qualitative parameters to inquire
into high-visibility and ordinary groups, the intensity and rhythm of hashtag mentions
may indicate actors very committed to specific issue spaces, as well as potential botted
accounts (see Omena et al., 2019). Patterns of concomitant hashtag use can indicate
different hashtagging practices, including shifts of meaning, purposeful deviations, as
well as hashtag ambiguity and ironic usage. We argue that different approaches should
be embraced to read the forms of appropriation and frequency of use regarding one or
more hashtags.
Looking at the affordances of Instagram for hashtagging activity, this layer seeks to
answer questions such as the following: What can frequency of hashtag use reveal
about high-visibility and ordinary groups? What can the number of times hashtags are
mentioned by a given account tell us about particular actors or automated agency?
How can the co-occurrences of hashtags indicate different hashtagging practices? How
do hashtags mediate actors’ engagement with a cause?
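As an illustration of the first two questions, the frequency with which each account mentions a hashtag can be counted directly from post captions. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the post structure (the `account` and `caption` fields), the regular expression used to detect hashtags, and the sample data are all hypothetical.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by word characters (Unicode-aware in Python 3).
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtag_mentions_by_account(posts):
    """Count how often each account mentions each hashtag across its captions."""
    mentions = Counter()
    for post in posts:
        for tag in HASHTAG_PATTERN.findall(post["caption"]):
            mentions[(post["account"], tag.lower())] += 1
    return mentions


def most_insistent_accounts(mentions, tag, top=3):
    """Rank accounts by the number of times they mentioned a single hashtag."""
    wanted = tag.lstrip("#").lower()
    counts = Counter(
        {account: n for (account, t), n in mentions.items() if t == wanted}
    )
    return counts.most_common(top)


# Hypothetical sample posts, for illustration only.
sample_posts = [
    {"account": "account_a", "caption": "#BrasilContraOGolpe nas ruas"},
    {"account": "account_a", "caption": "de novo #brasilcontraogolpe #democracia"},
    {"account": "account_b", "caption": "#BrasilContraOGolpe"},
]

sample_mentions = hashtag_mentions_by_account(sample_posts)
print(most_insistent_accounts(sample_mentions, "#BrasilContraOGolpe"))
```

Accounts with unusually high or rhythmically regular counts would then be candidates for closer qualitative inspection, whether as highly committed actors or as potentially automated ones.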
Layer 3: Visual and Textual Content
Finally, hashtag engagement should also be related to the content of the posts within
which they are mentioned. The third layer focuses on visual and textual content,
providing an overview of the diversity and richness of narratives attributed to
particular hashtags. Here, the focus is on understanding the images and texts to which
hashtags are brought into relation, taken as constituent parts of their meanings and
related practices. In that regard, and accounting for high-visibility and ordinary groups,
this layer asks: what stories can the visual and textual content tell? What are the visual and
textual compositions or meanings related to certain hashtags? How about the sites of
image production and circulation?
The quali-quantitative approach is particularly relevant at this analytical level.
Considering our interest in massive ordinary posts, this approach would be laborious—
not to say unfeasible. However, distant reading methods for both texts and visual
content can be mobilised for identifying recurring patterns (Dixon, 2012) among the
dataset, without losing sight of their manifestations. This is the main challenge of this
layer, whose operationalisation will be detailed further.
The praxis of Hashtag Engagement Research
Political Context, Scholarly Approaches, and Framing of the Brazilian Case
The case study approaches two antagonistic protests staged in Brazil in March 2016,
during a rise in political animosity in the country. On the 13th of that month, protesters
went to the streets in many cities in support of an ongoing parliamentary process to
remove President Dilma Rousseff from office. Five days later, on the 18th, protesters
contrary to the removal took their turn, expressing concern that the proposed
impeachment lacked legal cause and would thus be qualified as a “parliamentary coup”
(Jinkings et al., 2016). In respect to the terminology used by each of the groups in
defining themselves—and wary of not prematurely resolving the implied controversy
(Latour, 2005; Venturini, 2010)—we chose to refer to the protests, respectively, as
“pro-impeachment” and “anti-coup.”
It is essential to understand this case within a broader political context. Addressing
Brazilian demonstrations staged between 2013 and 2016, Alonso (2017) discusses
elements that could have facilitated their emergence with an interest in the styles of
mobilisation of each cycle of demonstrations. These include the wave of global
autonomist protests starting in 2010 (from Tunisia to Wall Street); Brazil’s
international visibility due to the sports events it would host in the following years;
corruption scandals and their spectacularisation; and the rapid reconfiguration of
Brazilian social strata (see P. Anderson, 2011; Lima, 2010), which destabilised
symbols of social hierarchy (race, income, and education, among others).
This four-year period, culminating in 2016, is commonly divided into three protest
waves. First is that of the so-called “June Journeys”: mass demonstrations which, at
their peak in June 2013, brought an estimated one million people to the streets. They
marked the emergence of an autonomist and leaderless style of demonstration, which
took governments and traditional movements by surprise, but which also culminated
in ideologically ambiguous protests coalescing agendas across the political
spectrum—from anarchist to pro-dictatorship demands. Next was what Alonso (2017)
refers to as the 2015 “Patriot cycle,”128 following the 2014 presidential elections,
which Rousseff won by a very narrow margin. To the right of the political spectrum,
allegedly nonpartisan groups achieved prominence, especially on social media (Omena
& Rosa, 2017). They were able to mobilise a wide range of conservative political
strands, from major players in the financial and industrial sectors to religious
fundamentalists and conservative citizens from higher economic strata.
The case studied in this article is part of the third wave, more directly tied to Rousseff’s
impeachment process, which, officially, pursued accusations of administrative
misconduct (which came to be known as “fiscal pedaling”) in December 2015. Most
protests took place in 2016, when the aforementioned conservative groups were
prominent established players in Brazilian protests. The polarisation already
experienced in the second wave was magnified by the reconfiguration of the public
agenda, with antagonistic groups of supporters and detractors of Rousseff’s deposition
becoming delineated.
Despite the actual judicial arguments of the process, public debate inherited much of
the agenda of the previous wave, with pro-impeachment demonstrators focusing on
corruption scandals, targeting the Workers’ Party, and mobilising mostly citizens from
higher economic strata. Calls for Rousseff’s ousting were accompanied by several
misogynistic depictions of Rousseff, the first-ever female president of Brazil, as
discussed by scholarly inquiries of the case (see Corrêa, 2017). Hatred against left-leaning
activists and marginalised segments of the population, commonly associated
with a progressive agenda, was also increasingly manifest in that context. Anti-coup
demonstrators’ discourses focused on the defense of democracy and often exhibited
explicit partisan stances.
128
Alonso indicates March and April of that year, but we would extend the cycle’s scope to protests
staged later in 2015 as well.
Although this event has prompted scholarly inquiries on several aspects of the process,
there are surprisingly few works that investigate how protesters represented
themselves in that context. The impeachment process has been more often studied in
regard to how it was reported by the press or by groups leading the protests (see Fausto
Neto, 2016; Tavares et al., 2016), with little attention paid to ordinary protesters’ visual
depiction of the event or to Instagram as a site of observations129. In what follows, we
will present a study of this case based on our 3L perspective, building upon Instagram’s
culture of use and affordances.
Operationalising the 3L Perspective
Taking advantage of Instagram’s Platform API, which at the time allowed researchers
to go back days, months, and even years in time, data collection occurred in several
iterations from March 13 to March 31, 2016. Our study relied on Visual Tagnet
Explorer (Rieder, 2015) to collect publicly available posts according to queries based
on hashtags. Chosen upon immersive observation of the context and through previous
exploratory data collection and analysis (co-hashtag networks and Excel’s pivot table),
the selected hashtags (Table 4.1) corresponded to the following criteria: having a
significant number of mentions, bearing clear connection with the topic, being an
indicator of counter-reactions, or being an indicator of new connections on the topic.
The datasets were later filtered by matching the dates of the posts and the protests,
limiting the scope to the two dates - March 13 for pro-impeachment and March 18 for
anti-coup. The final combined dataset included 19,231 unique Instagram accounts with
a total of 22,423 posts.
129
Regarding self-representation, an exception is a work by França and Bernardes (2016), which
approaches visual depictions of the 2015 demonstrations, albeit from a different theoretical and
methodological standpoint. Regarding digital platforms, Twitter and Facebook were most commonly
studied with regard to impeachment-related demonstrations (see Alzamora & Bicalho, 2016; Moraes
& Quadros, 2016; Ribeiro et al., 2016).
Table 4.1. List of hashtags selected for the case study
Following the 3L perspective, the distinction of high-visibility from ordinary was
based on the combination of two factors: first, detecting unique actors (Instagram
accounts) and then the testing of different thresholds for the average platform
engagement metrics (sum of like and comment counts) of the users’ posts over time.
In so doing, we expected to find a viable threshold that could single out a
minority group of users that received a large portion of the total sum of engagement
metrics of all posts in the dataset. Through this process, we came to define the
threshold at the 98th percentile of average engagement per post, per user. Using this
boundary, we found similar distributions for both pro-impeachment and anti-coup
datasets. In both cases, high-visibility actors were a minority responsible for roughly
4% of all the posts in each dataset; yet, they received around 50% of all engagement-related activity. Through this procedure, we sought to distinguish the most visible (and,
therefore, most popular and influential) actors and their related content from the rest.
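The thresholding step described above can be illustrated with a minimal sketch. It is not the script used in the study: the field names (`user`, `likes`, `comments`), the nearest-rank percentile, and the choice to treat accounts strictly above the threshold as high-visibility are all assumptions made for illustration.

```python
from collections import defaultdict


def split_by_visibility(posts, percentile=98):
    """Split accounts into high-visibility and ordinary groups.

    `posts` is an iterable of dicts with the hypothetical keys
    'user', 'likes' and 'comments' (assumed field names).
    """
    totals = defaultdict(lambda: [0, 0])  # user -> [engagement sum, post count]
    for post in posts:
        entry = totals[post["user"]]
        entry[0] += post["likes"] + post["comments"]
        entry[1] += 1
    if not totals:
        raise ValueError("no posts given")

    # Average engagement per post, per user.
    averages = {user: total / count for user, (total, count) in totals.items()}

    # Nearest-rank percentile, computed with integer arithmetic.
    ranked = sorted(averages.values())
    rank = max(1, (percentile * len(ranked) + 99) // 100)
    threshold = ranked[rank - 1]

    # Assumption: accounts strictly above the threshold are high-visibility.
    high = {user for user, avg in averages.items() if avg > threshold}

    all_posts = sum(count for _, count in totals.values())
    all_engagement = sum(total for total, _ in totals.values())
    high_posts = sum(totals[user][1] for user in high)
    high_engagement = sum(totals[user][0] for user in high)
    return {
        "threshold": threshold,
        "high_visibility": high,
        "post_share": high_posts / all_posts,
        "engagement_share": high_engagement / all_engagement if all_engagement else 0.0,
    }
```

Running the function with several values of `percentile` and comparing `post_share` with `engagement_share` mirrors the testing of different thresholds mentioned in the text.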
Next, for the analysis of hashtagging activity, we focused on hashtags’ frequency of
use and their concomitant mentioning. The former was taken as an indicator of popular
tags, which we compared between high-visibility and ordinary users in each protest.
The concomitant mentioning of hashtags was observed through co-occurrence networks
built with Gephi Version 0.9.2 (Gephi Consortium, 2017), taken as analytical devices to
observe patterns of hashtagging practices130.
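As an illustration of how a hashtag co-occurrence network can be derived from captions, the following sketch counts pairs of hashtags that appear in the same post and writes them as a weighted edge list that Gephi can import as a CSV file. It is a simplified stand-in, not the procedure of the collection tool used in the study; the regular expression and the lower-casing of tags are assumptions.

```python
import csv
import re
from collections import Counter
from itertools import combinations

# Assumption: a hashtag is '#' followed by word characters (Unicode-aware).
HASHTAG = re.compile(r"#(\w+)")


def cohashtag_edges(captions):
    """Count how often two hashtags are mentioned in the same caption."""
    edges = Counter()
    for caption in captions:
        # Deduplicate and sort so each pair has one canonical orientation.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(caption)})
        edges.update(combinations(tags, 2))
    return edges


def write_edge_list(edges, handle):
    """Write a weighted, undirected edge list with Source/Target/Weight columns."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
```

Each row of the resulting file is one co-occurrence relationship, with the weight giving the number of posts in which the two tags appear together.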
For the visual dimension, we relied on an experimental approach
based on that proposed by Ricci et al. (2017). Post images were automatically labeled
based on their content using a computer vision API, Google Cloud Vision API
Version 1.0 (Google, 2017)131. The automated image classification was later
combined with Gephi and a custom Python script (Mintz, 2018) for building a
computer vision-based network: the so-called image-label network, in which we can
see clusters of images connected by their descriptive labels. For the textual content,
we resorted to two analytical tools: CorTexT Manager (Lisis Laboratory, 2017) and
Textanalysis (Rieder, n.d.). The former, advanced by topic modeling algorithms,
allowed us to visualise co-term networks of Instagram captions and their related
hashtags (clustered by political positioning). Textanalysis served our case study to
compare the use of emojis in the captions of posts by high-visibility and ordinary
users.
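The structure of an image-label network can be sketched as a bipartite graph in which images and labels are two types of nodes and each label attribution is an edge. The sketch below does not reproduce the custom script cited above; the input format (a mapping from image identifier to a list of label and score pairs) and the `img:` and `label:` prefixes are hypothetical.

```python
def build_image_label_network(annotations):
    """Build a bipartite image-label network.

    `annotations` maps an image identifier to a list of (label, score)
    pairs, a hypothetical shape for computer vision label output.
    Returns (nodes, edges): nodes maps node id -> node type, and edges
    is a list of (image node, label node, score) tuples.
    """
    nodes = {}
    edges = []
    for image_id, labels in annotations.items():
        image_node = f"img:{image_id}"
        nodes[image_node] = "image"
        for label, score in labels:
            # Labels are shared across images, so they act as connectors.
            label_node = f"label:{label.lower()}"
            nodes[label_node] = "label"
            edges.append((image_node, label_node, score))
    return nodes, edges
```

In such a network, the node count is the number of images plus the number of distinct labels, and images that share many labels are pulled together by a force-directed layout.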
Findings
In this section, we present the findings of the case study of the “impeachment-cum-coup” of Brazilian president Dilma Rousseff. We applied the 3L perspective to study
political polarisation in Brazil through the lens of hashtag engagement and considering
two national demonstrations: the pro-impeachment (March 13) and anti-coup (March
18) protests.
High-Visibility Versus Ordinary
Through the distinction made at this stage, we were able to inquire of high-visibility
actors and their related content. Who are they? What can activity over time tell us
about high-visibility actors? To what visual elements are they attached? We identified
a very particular structure in both pro-impeachment and anti-coup groups (Table 4.2):
on one side, a group of actors who obtain high levels of engagement metrics with very
few publications, while on the other, a group of actors with a large number of
publications over the day of protests also getting high levels of engagement metrics
(see Omena et al., 2017).
130 All network visualisations used in this study were based on the visual network analysis technique
(Venturini et al., 2015; Venturini et al., 2018).
131 Bernhard Rieder’s (2017) Memespector script was used for interfacing with Google’s
API. https://github.com/bernorieder/memespector
In a more specific example, Figure 4.1 shows the configuration of high-visibility actors
(dots) positioned according to received engagement metrics (vertical axis) along the
day of the protests (horizontal axis). At the top, the actress Viviane Araújo points to a
trending characteristic in the dominant visuality among public figures: selfies. The
classic imagery of the crowds, in contrast, is mainly promoted by non-official campaign
accounts and the organiser of the protests, namely chegadecorruptos,
foracorruptos_rn and vemprarua. Other visual elements addressed by the high-visibility
actors in the pro-impeachment protests express support for the then Federal Judge
Sérgio Moro and Operation Car Wash, or feature humorous images (e.g.,
Dilma in the shape of a Zika mosquito) and aggressive messages addressed to Dilma
Rousseff and Lula.
Table 4.2. The high-visibility actors in Brazilian protests. Instagram, March 2016.
Figure 4.1. High-visibility actors of the pro-impeachment protests in Brazil, March 13, 2016.
Composition, engagement flow over time, and visual elements (scatter plot design by
Beatrice Gobbo).
There were also some unexpected findings, such as an account dedicated to pets
(petscharm) among high-visibility actors. This Instagram account published a series of
images of dogs wearing Brazil’s football shirt or the Brazilian flag, elements also
worn or carried by protesters. In regard to actors’ activity and their associated
engagement metrics, we saw ongoing posting activity over March 13, 2016 and,
between 3 p.m. and 9 p.m., high peaks of engagement that may correspond to the
simultaneous protest acts across different cities in Brazil (Figure 4.1). It is also
important to point out deleted non-official campaign accounts on Instagram, such as
operacaolavajatooficial (official operation car wash), which led us to question their
authenticity and role.
Hashtagging Activity
As a next step in the analysis of hashtag engagement, we considered the grammars of
hashtags by reading Instagram’s different forms of capturing hashtagging. Looking at
referential tags and their use frequency, we noticed different preferences among high-visibility and ordinary actors (Figure 4.2).
For instance, in pro-impeachment protests, #foradilma (get out Dilma) and #forapt (get
out PT) were more frequent among ordinary users, while #vemprarua (come to the
street) was slightly more frequent among high-visibility ones. In anti-coup protests,
ordinary actors gave preference to #naovaitergolpe (there won’t be a coup), while
high-visibility actors opted for #vemprademocracia (come to democracy). The
different cultures of appropriation among high-visibility and ordinary actors provide a
more accurate description of hashtag engagement practices.
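The comparison of tag preferences rests on a proportional frequency, which the caption of Figure 4.2 defines as the number of mentions over the number of posts. A minimal sketch of that calculation follows. The input keys (`group`, `hashtags`) are hypothetical, a tag is counted once per post, and the top tags are selected within each group, which may differ from how the figure itself was filtered.

```python
from collections import Counter


def proportional_frequency(posts, top_n=10):
    """Share of posts in each group that mention each hashtag.

    `posts` is an iterable of dicts with the hypothetical keys 'group'
    (e.g. 'high-visibility' or 'ordinary') and 'hashtags' (a list of tags).
    """
    post_counts = Counter()
    mention_counts = {}
    for post in posts:
        group = post["group"]
        post_counts[group] += 1
        # Assumption: a tag repeated within one post counts once.
        tags = {tag.lower() for tag in post["hashtags"]}
        mention_counts.setdefault(group, Counter()).update(tags)
    return {
        group: {
            tag: count / post_counts[group]
            for tag, count in counts.most_common(top_n)
        }
        for group, counts in mention_counts.items()
    }
```

Normalising by the number of posts in each group is what makes the small high-visibility group comparable with the much larger ordinary group.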
Now, we turn our attention to hashtag mentions and related actors, more precisely,
who are the high-visibility actors and how many times they mention specific tags.
Beyond seeing tag preferences among high-visibility and ordinary actors, the
contribution of this analysis lies in the detection of Instagram accounts that are
strongly committed to given hashtags. So far, and unlike occasional mentions, we have seen that the
persistence of hashtag mentions over time may refer to those actors responsible for
keeping the debate regarding protesters’ grievances alive. Conversely, accounts with
few mentions can equally reach high engagement metrics by being related to public
figures, humorous or artistic visual content (e.g., tiacrey, lalanoleto, artedadepressao),
or politicians and activists (e.g., humbertocostapt, fernando.domingos.sim).
To take a concrete example, in the pro-impeachment protests, the most committed
actors by hashtag mention were mainly non-official campaign accounts—namely,
chegadecorruptos, foracorruptos_rn, operaçãolavajatooficial, petscharm, and the
organisers of the protests (vemprarua). The behaviour of these Instagram accounts
points to an automated agency (see Omena et al., 2019). Regarding the anti-coup
protests, non-official campaign accounts (e.g., rosangelacct, transitivaedireta,
liliferrer14) also took part in the “most active list” by hashtag mentions, but so did
alternative media (e.g., medianinja) and one of the organisers of the protest (cutbrasil).
Regarding non-official campaign accounts, we found strong suggestions that third-party applications were being used to boost engagement metrics.
Figure 4.2. Proportional frequency of hashtag mentions (number of mentions over the
number of posts) for high-visibility and ordinary groups. Filtered to the 10 most mentioned
hashtags of each dataset. Visualisation created with Tableau Desktop (Version 10.4.6; 2018).
The visual exploration of co-occurring hashtag networks added value to the hashtagging
activity perspective. Rather than following the typical cluster analysis to study the
partisan use of hashtags and related topics, we approached emblematic hashtags
adopted by pro- and anti-programs as a form of seeing a shift in meaning. That is what
we call double-sense hashtags. After scrutinising the #nãovaitergolpe (there won’t be
a coup) co-occurrence network (Figure 4.3), we were able to detect purposeful shifts of
the hashtag’s meaning: for instance, hashtags supporting the impeachment process
and connected to the main slogan of the pro-impeachment protests, “come to the
street”. In addition, we found tags addressing messages directly related to the now-former
presidents of Brazil (“get out Dilma,” “get out Lula”) and the association of an
inflatable puppet wearing a prison uniform, named Pixuleco, with Lula.
Figure 4.3. #nãovaitergolpe co-occurring network related to anti-coup protests in
Brazil, 18 March 2016. Instagram Platform. Network attributes: 1,250 nodes
(hashtags) and 11,487 edges (co-occurrences). Visualisation created with Gephi,
layout: Force Atlas 2 (Jacomy et al., 2014), “LinLog mode” option enabled.
Visual and Textual
Visual content was analysed through an image-label network built upon pre-trained
machine learning models of Google Cloud Vision API. We interpreted this network
by describing clusters of images brought together by formal similarity; an exercise of
relabelling the image classification provided by the vision API (Figures 4.4 and 4.5).
Through this approach, we found that both pro-impeachment and anti-coup visualities
exhibited a similar overall pattern, annotated by three major clusters: selfies and
portraits, crowds, and graphic pictures (banners, image macros, text, etc.). Although
minor, both networks had food and beverage clusters, which we have also found to be
related to the protests themselves. Each of the groups had pejorative, food-based nicknames for
antagonist protesters: “coxinhas” (a popular Brazilian treat
made with chicken) and “mortadela” (a popular type of sausage), used, respectively,
by anti-coup and pro-impeachment protesters.
Several unique clusters were detected in each network, pointing to a particular visual
culture. The pro-impeachment network (see Figure 4.4) had a large cluster of variations of the
Brazilian flag, which shows its strong connection with patriotic iconography. A
prominent cluster of dog pictures was also found, which indicates the trivialisation of
political engagement, while also possibly relating to how pets are commonly treated
and represented by middle-class Brazilians. Lying between individual and group
portraits was a significant number of images of people wearing sunglasses, which seems to
relate to how these accessories are status symbols within Brazil. In contrast, the
anti-coup image-label network (see Figure 4.5) had a comparatively smaller cluster of
individual or small group portraits, with crowd photos being more prominent. The
Brazilian flag was much less featured, while other symbols, such as red protest t-shirts
and newspaper clippings, stood out. Within the individual portrait cluster, bearded
faces composed a small but meaningful cluster which relates to a typical expression of
political identity on the left.
To compare visual content between high-visibility and ordinary groups of each protest,
we resorted to a quantitative approach of label attribution frequency (Figure 4.6).
Regarding the image-label networks, the pro-impeachment dataset had a higher
occurrence of labels which relate to close-up portraits (e.g., “sunglasses,” “facial
expression,” “face”). These labels were slightly more common in the ordinary group
than in the high-visibility one. In the anti-coup dataset, labels related to collective
imagery were more common (e.g., “festival,” “demonstration,” “event”), indicating a
different representational tendency for this protest. These labels were also more
common among the high-visibility group than the ordinary group.
Figure 4.4. Image-label network of the pro-impeachment protests, March 13, 2016, Brazil.
Original Instagram images plotted according to relative node positions of a bipartite network
built with Google Cloud Vision API’s Version 1.0 (Google, 2017) “Label Detection” data.
Network attributes: 18,986 nodes (1,358 labels and 17,628 images) and 80,479 edges.
Layout: Force Atlas 2 (Jacomy et al., 2014), “Prevent overlap” option enabled.
Figure 4.5. Image-label network of the anti-coup protests, March 18, 2016, Brazil. Original
Instagram images plotted according to relative node positions of the bipartite network built
with Google Cloud Vision API’s Version 1.0 (Google, 2017) “Label Detection” data.
Network attributes: 2,872 nodes (587 labels and 2,285 images) and 10,508 edges. Layout:
Force Atlas 2 (Jacomy et al., 2014), “Prevent overlap” option enabled.
Figure 4.6. Proportional frequency of Google Cloud Vision API Version 1.0 (Google, 2017)
label attributions (number of attributions over the number of posts) for high-visibility and
ordinary groups. Filtered to the 15 most used attributed labels of each dataset. Visualisation
created with Tableau Desktop (Version 10.4.6; 2018).
Moreover, labels indicating colours were among the most frequent in both datasets:
yellow and green for the pro-impeachment protests and red for the anti-coup protests.
These colours are associated, respectively, with the Brazilian flag or the national football
uniform (pro-impeachment) and with the Workers’ Party or other left-wing movements
(anti-coup). Colours, here, indicate a statement of Brazilians’ position.
Seeking to identify the specificities of the discourse adopted in each of the political
perspectives (anti-coup and pro-impeachment) and groups (high-visibility and
ordinary), we visualised textual content (Instagram captions) in different levels of
analysis (Figure 4.7) through co-term networks.
We first visualised the textual content of both protests gathered in four main clusters
(Figure 4.7, left): two related to anti-coup positioning, and the other two connected to
the pro-impeachment group. In the latter, we see expected slogans against Dilma and
surprising national anthem terms; while in the anti-coup clusters there are appeals for
the impeachment process to end and for respecting the results of the 2014 democratic
elections in Brazil. In opposition to this broad perspective, we separated the co-term
networks by closely looking at the high-visibility and ordinary groups. The high-visibility network (Figure 4.7, center) shows more isolated clusters, scarcely
interconnected. The places where the protests occurred are what connect the polarised
debate. In the ordinary textual network (Figure 4.7, right), the main component shows
more dense connections, thus reproducing concerns similar to those we have already
mentioned.
Figure 4.7. Textual analysis of Brazilian protests in March 2016 via co-term networks.
Instagram captions and related hashtags were clustered according to political positioning (the
pro-impeachment and anti-coup selected hashtags), and according to co-occurrences of the
50 top terms in Instagram captions. Nodes are terms and edges co-mentioning relationships.
Software analysis: CorTexT Manager (Lisis Laboratory, 2017).
The richness of these different narratives is found in isolated clusters that reveal very
particular concerns, belonging solely to one group. This was the case of
terms urging Brazilians not to be moved by hatred but to “protest peacefully” as a
part of high-visibility textual content, and of the specific terms associated with an
alternative media account, namely Mídia Ninja (Figure 4.7, center). Another
example, now in the ordinary network (right side), entails nationalistic rhetoric
referring to the Brazilian national anthem. Finally, but no less important, while high-visibility
actors acknowledged Brazilians for their participation in the pro-impeachment
demonstrations, the ordinary actors expressed how proud they were of
being present at the protest.
Figure 4.8. The appropriation of emojis according to high-visibility
and ordinary groups; emojis organised according to frequency of use.
Ultimately, mixing the visual and textual content, we observed the use of emojis in
Instagram captions. Emojis (the pictographic successors of “emoticons”) have had a significant role
in computer-mediated communication, serving as a path to sharpen emotional
expressiveness in text-based interactions. In our perspective, these objects are
interesting because they can be apprehended in terms of representativeness (high-visibility
and ordinary) and positioning (pro-impeachment vs. anti-coup), and not only as an act
of tagging per se.
Figure 4.8 depicts the appropriation of emojis in high-visibility and ordinary groups,
ranked according to frequency of use. At a glance, representative colours may be seen
in pro-impeachment icons (yellow and green) as well as in symbolic icons for the anti-coup group (tulip and raised fist). This points to different use preferences, also serving
as a reinforcement of the visuality (Instagram images) attached to the polarised groups.
However, when comparing the appropriation of emojis by different groups, while the
ordinary group has a heart among the most used emojis, high-visibility accounts opted
for the globe showing the Americas, smiling face with sunglasses, and a party popper.
In addition, the skin tone of emojis reveals an interesting perspective about race
(represented by squares in Figure 4.8), with a predominance of light skin and medium
skin tones among protesters, except for the high-visibility accounts of the anti-coup
demonstrations, which had medium-dark and dark skin tones.
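Skin tones can be read from captions because Unicode encodes them as five modifier characters (U+1F3FB to U+1F3FF) placed after a base emoji. The sketch below counts these modifiers in a list of captions. It is an illustrative assumption about how such a count could be made, not a description of the Textanalysis tool used in the study.

```python
from collections import Counter

# The five Unicode emoji skin-tone modifiers (Fitzpatrick scale).
SKIN_TONES = {
    "\U0001F3FB": "light",
    "\U0001F3FC": "medium-light",
    "\U0001F3FD": "medium",
    "\U0001F3FE": "medium-dark",
    "\U0001F3FF": "dark",
}


def count_skin_tones(captions):
    """Count emoji skin-tone modifiers across a collection of captions."""
    counts = Counter()
    for caption in captions:
        for char in caption:
            tone = SKIN_TONES.get(char)
            if tone is not None:
                counts[tone] += 1
    return counts
```

Applied separately to the captions of each group and protest, the counts can be compared in the same way as the emoji frequencies.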
Conclusion
This chapter sought to critically and methodologically contribute to digital research by
looking at the specific case of hashtag engagement. Through digital methods, we
introduced the 3L perspective: a hands-on approach that operationalises new forms of
digital social enquiry. It has, at its core, the entanglement of the technicity of Instagram
and its grammatisation process as a lens for hashtag engagement analysis, as in the
appraisal of what is trendy in Hashtag Studies or Social Media Research and what is
often kept out of research concerns; that is, both high-visibility and ordinary actors,
actions, and related hashtagged content. The core outcome of this kind of research is
the perception of high-visibility as a mirror of the digital attention economy of
social media. However, in being re-signified through the detection of unique
actors combined with platform metrics over time, it serves as an alternative approach
to social media vanity metrics. By enquiring into hashtag political engagement on
Instagram, we confirmed the importance of including high-visibility versus ordinary
groups (Layer 1), hashtagging activity (Layer 2), and its related visuality and textuality
(Layer 3) as layers of the same object of study.
Through the case of the impeachment-cum-coup of Brazilian president Dilma Rousseff
in 2016, substantial differences between the high-visibility and ordinary groups were
uncovered, both in terms of hashtag usage culture and related content. By looking at
the structural shape of high-visibility groups in Layer 1, we found that impactful visual
content requires little effort from public figures, politicians, and artists (often with one
post), while continuous activity over time is a mandatory task for non-official
campaign accounts and independent media (often with a high number of posts). In
Layer 2, the different ways in which hashtags are captured by social media databases
expose different cultures of appropriation. The choice of tags and their intensities of
use change between high-visibility and ordinary actors. These grammatised actions
also point to very particular behaviours, from the double-sense hashtags to an
automated agency. With the third layer, we navigate from the whole (all images and
textual content) to its parts (what pertains to high-visibility and ordinary) and back and
forth. When cross-read, the three layers add value to one another, providing a rich and
in-depth vision of the case study. This could not be understood without uncollapsing
hashtags, often treated as monolithic indices, without internal differences.
In this scope, the 3L approach adds value to social media research by accounting for
how the functional/practical relationship between technicity and platform
grammatisation concretely informs the process of reasoning with and through the
medium. However, it is essential to observe the significant changes in social media
APIs and their impact on research, as argued by Venturini and Rogers (2019): a call
for researchers to gain independence from standardised pathways. For instance, and
after the implementation of Instagram Graph API, the tool used in this study is now
obsolete (see Rieder, 2016), leading us back to scraping-based tools as an alternative
to pursuing the 3L perspective, e.g., Instaloader (Version 4.2.6, 2019). Another point
concerns the inherent limitations of our proposal, which certainly does not exhaust
the possibilities for exploring modes of engagement beyond unique actors and their
respective metrics and activities. Further possibilities include following hashtags and
accounting for their algorithmically driven placement in users’ feeds, or accounting for
the biases and limitations of computer vision and machine learning as analytical
instruments (see Mintz et al., 2019).
Furthermore, the challenges of applying digital methods for hashtag engagement
research concern how to deal with the ephemeral ways of being of social media and
their changeable ways of grammatising actions. Regardless of the possible changes in
platforms and research tools or protocols, the conjunction of the 3L pertains to key
points often addressed in social media research. With this knowledge, and by positioning
the notions of technicity and grammatisation as a practical matter, this chapter may
contribute to what Rogers calls a medium-specific theory. Therefore, and as it follows
the ways in which platforms operate, the techniques and enquiry proposed by 3L shall
evolve through time. We also hope that this framework can enhance the understanding
of hashtag engagement and, regardless of platform-specific derivations, be
further applicable to different platforms.
Introduction to chapter five
This chapter was originally published as: Omena, J. J. & Granado, A. (2020). Call into
the platform! Merging platform grammatisation and practical knowledge to study
digital networks, Icono 14, 18 (1), 89-122. doi: 10.7195/ri14.v18i1.1436
The chapter follows the invitation of professor António Granado, at the beginning of
2016, to start a project about the Portuguese Universities on Facebook, raising
questions such as: how do Portuguese universities make use of Facebook? What and
how do they communicate? How do digital platforms serve as a tool or bridge for
science communication? How can visual network analysis help to respond to these
questions? My contribution in this chapter was written at times between 2016 and
2019, moving from the exploration of networks of Facebook page like connections
and page data to the making, analysis and visualisation of Google Vision-based
networks. Like other projects that seek to develop research with digital methods, our
project started with a sort of mimic research, that is, by imitating research protocols
that had been previously successful, but it ended up taking its own form.
The reason why I chose to include this chapter is that it illustrates a purposeful use of
digital networks as a research device and the possibility of experimenting with different
visual techniques from a media research perspective. That is, to give space and time to
the practice of digital methods, learning how to think along and repurpose the medium
and digital records, and, more practically, how to differentiate data descriptions from
findings (or trying to do that!). It was a good opportunity to take advantage of the open-ended
research procedure underpinning the methods.
The chapter thus describes a situation where the researchers have already developed a
certain proximity with computational mediums and developed a sensitivity to their
technicity. The chapter takes seriously the acquired knowledge about the relationship
between software affordances, platforms’ culture of use and their technical
grammatisation and reflects on the iterative and navigational practices demanded by
the method. Here time is crucial; time to explore and describe what the networks had
to offer, time to question new analytical possibilities, time to present practical solutions
through the methods, and to start the analysis all over again.
The technicity approach can be noticed in the way the research questions were asked
and in the way technical imagination was used to solve these questions. Technical
knowledge (about Facebook grammatisation, Google Vision’s label detection and
ForceAtlas2) and practices (using research software, reading networks, creating data
visualisations) were crucial. Finally, such knowledge became a fundamental factor in
the iterative and navigational technical practices, which use the Web not only as a
source of data but also as a place to find less structured information, supporting
qualitative analysis, e.g. using page and post IDs or URLs (available on spreadsheets
or Gephi) to locate publications on Facebook, situating, for instance, contexts related
to the page like network.
For instance, we took Facebook grammatisation into account, using Page like
connections and data as a means to map and analyse the institutional interests of
Portuguese higher education. This showed us relevant information in mapping the
profile of Portuguese universities. We welcomed new research questions throughout
the analysis, e.g. by asking about the dominant and ordinary associations of Portuguese
Universities and using Facebook Page category to respond to that. By using Google
Vision API outputs (labels based on confidence score and topicality rating to classify
images), we were able to make sense of a historical image dataset by rearranging all
timeline images and labels as networks.
To answer the main and emerging research questions, basic and advanced
visualizations were produced, e.g. a bee swarm chart to examine the mood of
Portuguese Universities over the years according to the face detection module of Google
Vision API, and a network grid detecting the visual patterns and particularities of each
University by observing the presence or absence of colours represented by image clusters
(posters, animals, people, etc.). Through the bee swarm chart, we were able to identify
what brings joy to four universities by closely analysing images tagged as very likely
or likely to contain joy in their content (information provided by computer vision
outputs). Through the network grid, we visually compared the image clusters across
the different university pages. Here, the mastery of Gephi and the ability to create
basic visualisations with RawGraphs proved crucial for producing knowledge and
posing new research questions.
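The selection of images tagged as very likely or likely to contain joy can be sketched as a simple filter over the rows of a spreadsheet. The likelihood values `VERY_LIKELY` and `LIKELY` are those returned by Google Cloud Vision face detection; the column names (`page`, `image_filename`, `joy_likelihood`) are hypothetical and would need to be adapted to the actual export.

```python
# Likelihood values that Google Cloud Vision face detection can return
# and that are treated here as indicating joy.
JOY_LEVELS = {"VERY_LIKELY", "LIKELY"}


def joyful_images(rows, pages):
    """Group filenames of joyful images by page.

    `rows` is an iterable of dicts (e.g. from csv.DictReader) with the
    hypothetical columns 'page', 'image_filename' and 'joy_likelihood'.
    Only the pages listed in `pages` are kept.
    """
    selected = {page: [] for page in pages}
    for row in rows:
        if row["page"] in selected and row["joy_likelihood"] in JOY_LEVELS:
            selected[row["page"]].append(row["image_filename"])
    return selected
```

The resulting lists of filenames, one per university page, can then serve as queries for locating the corresponding image files.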
In regard to a discussion of the methods, this chapter continues and expands the use of
computer vision networks as a research device, also making use of basic visualizations
to answer meaningful questions. Here Google vision outputs were interpreted and
further explored (though not necessarily questioned), thus letting the methods,
computational mediums and researchers’ intervention tell the story. For instance, after
noticing that the dominant visual content contained students, professors, academic
staff, board members in the most varied types of events, we asked about the mood of
the Portuguese Universities and about the accuracy of Google’s face detection in detecting
different facial expressions. The detection of expressions such as "joy" proved
reliable, in contrast to the lack of precision in the classification of images
detected with surprise, sorrow or anger faces. The latter required a navigational mode
of analysis: from the bee swarm chart to the search for an image id in the spreadsheet
(filtering by face expressions and published results) and, afterwards, in the folder of
images.
The analysis of images with "joy" faces, pertaining to four specific universities and
identified through the bee swarm chart, was a more complex methodological process.
We came across a recurrent situation when using digital methods: knowing what to do
and why, but not how to do it. In situations like this, we see the advantage of using or
activating a technical imagination to solve emerging methodological problems. An
imagination that results from some experience in the practice of digital methods, e.g.
downloading and visualising a collection of images. In this case study, for instance,
we considered analysing specific groups of images (faces with “joy”) because we were
aware that all images (downloaded from the web) were stored in a folder and named
with the format *name*.*ext*, where the image filename follows the image id available in
the image URL.
Image URL
https://instagram.fhio3-1.fna.fbcdn.net/v/t51.288515/e35/64816216_895583534126353_811098829469214962_n.jpg?_nc_ht=instagram.fhio31.fna.fbcdn.net&_nc_cat=101&_nc_ohc=EGRkWUNDcXMAX9wcHSI&tp=1&oh=fe4ddb96d47d4f15367eb4b49f7080a4&oe=6059A868
Image filename
64816216_895583534126353_811098829469214962_n.jpg
We knew beforehand that the images’ filenames could be filtered on a spreadsheet
serving as a query to locate the images inside the image folder, but we did not know how
to get a list of images out of it. In other words, we knew what to do (get the images
labelled with joy for four different universities, relocating these into four separate folders)
and why to do it (to verify the motive of joy by analysing the visual content with
ImageSorter). The difficulty here was to operationalise the task of locating a list of
images in a folder using their filenames and then copying the selected images into another folder
to be analysed qualitatively with the help of ImageSorter. A simple command line
solved this challenge, thanks to a colleague’s help (Fábio Gouveia)132. What I want to
emphasize here is the use of technical knowledge to facilitate the analysis. We learned
that technical imagination helps to resolve methodological issues and supports the
creation of research tools.
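The task described here, locating a list of images in a folder by filename and copying them elsewhere, can be sketched in a few lines. The command line actually used is not given in the text, so what follows is only one possible way of doing it, with hypothetical function names; it also shows how a filename can be derived from an image URL.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring the query string."""
    return PurePosixPath(urlsplit(url).path).name


def copy_listed_images(filenames, source_dir, target_dir):
    """Copy the listed files from source_dir to target_dir.

    Returns two lists: the filenames that were copied and those that
    could not be found in source_dir.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for name in filenames:
        candidate = source_dir / name
        if candidate.is_file():
            shutil.copy2(candidate, target_dir / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing
```

Calling `copy_listed_images` once per university, with the filenames filtered on the spreadsheet, would produce the four separate folders to be inspected qualitatively. Reporting the missing filenames helps to spot images that were never downloaded.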
In this chapter, most of the results, analyses and challenges could not have been
foreseen or included in the initial plans, demonstrating the open-endedness of digital
methods. There are some practical aspects that can be learnt from this case study, as
highlighted below:
§
It is necessary to avoid giving too much importance to quantifiable metrics (such as
number of followers, likes or posts). We should take advantage of what technological
grammar can offer as units of knowledge for a quali-quanti appreciation of the dataset,
also counting on a technical imagination for this task.
§
To take advantage of technological grammar as methodological language, practical
skills are required, such as mastering basic Excel formulas, knowing how to handle
data and using research software for data visualization and analysis. These practices
must go hand-in-hand with knowing how to think along with the relationship between
software affordances, platforms’ culture of use and their platform grammatisation.
§
The analysis of a digital network is never restricted to its visualization: the way nodes
are connected and positioned can point to other levels of analysis contemplating
content other than connections (e.g. moving from the network description and findings
to the scrutiny of facial expressions to identify what brings joy to the Universities’
community).
§
Moreover, when reading digital networks through ForceAtlas2, the spatialization
provided by this force-directed layout may provide a narrative thread that has fixed
layers of interpretation (centre, mid-zone, periphery, isolated elements) that guide and
facilitate reading.
132
Inspired by this and similar needs to search for images in a folder, Jason Chao and I
recently created a simple command-line-based tool to help with that task, the Offline Image Query and
Extraction Tool, https://github.com/jason-chao/offline-image-query.
§
When building a temporal image dataset, it is necessary to be attentive to possible
changes of unique identifiers (a Facebook Page id can change over time, e.g. pages
can be deactivated) as well as to the short life span of social media image URLs (images
should be downloaded as soon as data collection is complete).
§
Literacy in using (research) software should go hand-in-hand with knowledge
of the relationship between software affordances, platforms’ culture of use and
their technical grammatisation.
§
In the analysis of data and especially through a more qualitative look, the web should
always be a source of consultation and analysis.
§
Digital methods require room for experimentation and time to deliver proper results.
When using these methods, researchers should be open to new research questions that
emerge in the exploration and visual analysis of the data.
The analysis proposed in this chapter offers macro and micro perspectives of the
Portuguese Universities on Facebook, moving from general to specific visions with
both quantitative (a general overview of the networks) and qualitative (looking at
specific content and actors in the network) approaches. Therefore, the chapter supports the
argument about the dissolution of the quali/quanti divide by taking technological
grammar into account (rethinking conditions and means to knowledge production),
mastering software practices (know-how) and being aware of the potentials of
computational mediums (in technical and practical terms).
5 DIGITAL NETWORKS133
CHAPTER 5
133
This chapter was originally published as: Omena, J. J. & Granado, A. (2020). Call into the platform!
Merging platform grammatisation and practical knowledge to study digital networks, Icono 14, 18 (1), 89-122. doi: 10.7195/ri14.v18i1.1436
The case of Portuguese Universities on Facebook
Three visions of how to approach the digital have been modelling research in the field
of Social Sciences and Communication. The first comprises the mastering of online
questionnaires, surveys and interviews to enquire into our digital life. Although taken as
key research methods, the proposal of migrating the social sciences instrumentarium
online does not properly respond to the affordances of digital platforms and data
(Rogers, 2015; Marres, 2017). A second vision conforms to mixed methods or what
Marres (2017) refers to as an affirmative approach to grasp the digital, that is, to treat
“digital devices as an empirical resource for enquiry” (p.125) and also to affirm the
role of bias in processes like issue formation. Despite being well suited to online
environments, this vision remains focused on the instrumental capacities of the digital,
just as the first one. In both cases, the appreciation of technology is somehow broken
and thus not seriously taken as “hybrid assemblages” (Latour, Jensen, Venturini,
Grauwin, & Boullier, 2012). Conversely, this chapter foregrounds “the medium-specificity
of social phenomena” (Marres, 2017, p. 117), when digital platforms are
both object and method of study (Latour et al., 2012).
We bypass the so-called digitisation of methods and opt for the deployment of online
mechanisms, tools and data for conducting social or medium research (Rogers, 2013;
2015; Marres, 2017). This introduces the third vision for approaching the digital - from
the inside out or the incorporation of the methods of the medium to reimagine social
and medium research (Rogers, 2019). In this line of thought, we regard web platforms
as sociological machines (Marres, 2017) that are qualified by digital instruments for
data capture, analysis and feedback. Consequently, we consider digital infrastructures,
with Marres (2017) and Rogers (2019), as promoters of methodological innovation.
Drawing on the case of Portuguese Universities on Facebook, this chapter emphasises
the importance of combining knowledge on platform grammatisation with data
research practices (capture, mining, analysis and visualisation). It furthermore
considers the notion of grammatisation (Gerlitz & Rieder, 2018; Omena, Rabello &
Mintz, 2020) as a path to understand how social media delineate, (re)organise and
structure online activity through software, for example, social media application
programming interfaces (APIs). That is what we refer to as call into the platform,
which points to the functional understanding of web platforms as a fundamental basis
for Digital Social Sciences and Communication. Thus, technical layers of knowledge
about the platform itself are required, entangled with a digital methods
perspective (Rieder, 2015; Rogers, 2019).
The proposal of calling into the platform to study digital networks, therefore, envisions
the infrastructural aspects of Facebook and its forms of grammatising online activity.
One way of understanding Facebook grammatisation is to grasp how its Graph API
delineates predefined forms of activities and their specific properties. These range from the
existence of a Facebook Page and related metadata134 to very peculiar characteristics,
for instance, what a given page “is”, as informed by the “page category” (e.g. College &
University). In platform functionality and comprehension, we ask what one can learn
from the connections between Facebook Pages (through likes) and from a list of
(timeline) image URLs.
We further raise questions on the affordances and limitations of Facebook as a source
for digital Social Sciences research through the exercise of reading digital networks.
What can be studied from single page like connections? What can we foresee from a
historical dataset of images featured in Facebook Pages timeline? How to reimagine
platform grammatisation to study the institutional communication of Portuguese
Universities? To address these challenges, two distinctive networks135 will be
explored, shedding light on the institutional connections and the visual culture of
higher education in Portugal. These networks emerge from different situations: one
afforded by Facebook Graph API (the like network of Facebook Pages) and another
afforded by digital data (the timeline image-label network136). The first (see Fig 5.1)
comprises all connections made by a given page; the act of liking other pages or being
liked in return (a monopartite network). The second is built upon the affordances of
computer vision APIs in describing large image datasets, and the advantages of
134
E.g. Page description, post date of publication, reactions, shares, comments, posts per hour
(post_activity), how users interact with or talk about a page (talking_about_count), the total number
of likes a page has received (fan_count) and whether users can or cannot post in a Page (users_can_post).
https://developers.facebook.com/docs/graph-api/reference/page/
135
As socio-technical formations, digital networks offer ways of understanding social and cultural
phenomena (including institutional communication).
136
For further details on image-label networks see the work of Ricci, Colombo, Meunier, & Brilli,
2017; Omena et al. 2017; and Mintz et al., 2019.
software and data for building and plotting a network of images and their descriptive
labels (a bipartite network).
Figure 5.1. The act of liking other pages in Facebook end-user interface.
Given this scenario, we take the visual affordances of networks and data relational
nature as an analytical framework (Venturini, Jacomy, Bounegru, & Gray, 2018;
Venturini, Jacomy, & Pereira, 2015; Venturini, Jacomy, & Jensen, 2019; and Omena,
Chao, Pilipets et al. 2019). Thus, overlooking statistical metrics, we argue with
Venturini and Rogers (2019) that digital social sciences research through web
platforms should always be research about these platforms. That is to say: one cannot
study society through a web platform without studying the platform itself. In what
follows, we operationalise research about Facebook as a form of grasping the
institutional connections and visual culture of Portuguese Universities through this
platform.
Material and Methods
The material presented in this chapter is part of a larger research project about the
Portuguese Universities on Facebook, which started in March 2017. Since then and
following the advantages of API research (Venturini & Rogers, 2019), the public page
metadata of the 14 Portuguese Public Universities, as well as that of one private (see
Table 5.1), have been collected and archived through the application
Netvizz137 (Rieder, 2013). The list of 15 universities complies with the Council of
137
Netvizz is no longer available for research purposes; it stopped working on September 4, 2019.
Deans of Portuguese Universities (CRUP) that lists all Portuguese Public Universities
and the Portuguese Catholic University (the oldest private higher education institution
in Portugal). CRUP represents more than 80 percent of all students enrolled at
Portuguese Universities.
Facebook Page ID was the entry point for data collection, advanced by the Netvizz
modules Page Like Network and Page Timeline Images. Table 5.1 shows the number of
extracted pages for each Like Network (crawl depth 1), with data extraction in March
2019, and the number of all timeline images collected. This latter dataset is a
substantial and representative sample because it brings together all images uploaded
by the 15 Portuguese Universities: from the date of creation of each page to March
2018.
Page      University                                           Page ID            Extracted pages    Extracted images
Created                                                                           (Page Like         (Timeline Images
Time                                                                              Network Module)    Module)
Feb 2009  Universidade do Porto                                51541308379        69 pages           1,272 images
Sep 2009  Universidade da Beira Interior                       143419211198       42 pages           1,719 images
Mar 2010  Universidade Católica Portuguesa                     342519554742       16 pages           528 images
Sep 2010  Universidade de Aveiro                               114882798568553    23 pages           1,055 images
Oct 2010  Universidade de Coimbra                              159654804074269    4 pages            66 images
Apr 2010  ISCTE – Instituto Universitário de Lisboa            116726201675273    152 pages          616 images
Jan 2011  Universidade de Trás-os-Montes e Alto Douro (UTAD)   110296129010034    391 pages          7,054 images
Mar 2011  Universidade do Algarve                              190354481008389    528 pages          279 images
Feb 2011  Universidade do Minho                                111501795592752    121 pages          2,731 images
May 2011  Universidade Aberta de Portugal                      157257697673304    128 pages          982 images
Jan 2012  Universidade Nova de Lisboa                          331782250179584    33 pages           365 images
Feb 2012  Universidade de Lisboa                               263694373699158    53 pages           562 images
Nov 2012  Universidade de Évora                                398588126903994    149 pages          2,772 images
Feb 2014  Universidade dos Açores                              1696811123898577   1 page             1,813 images
Jul 2014  Universidade da Madeira                              590410784413143    48 pages           784 images
Total of pages and images:                                                        1,758 pages        22,598 images
Table 5.1. List of Portuguese Universities according to Facebook Page IDs, created time by
Facebook Page Transparency, and the outputs of data extraction through Netvizz138.
138
In March 2017, the University of Coimbra had 363 timeline images registered and in March 2018, we
detected the page had changed its Facebook id from 170094986358297 to 159654804074269. The
The research protocol diagram139 is explained as follows (see Figure 5.2). The entry
points for data collection were page ids. Through calling the Facebook Graph API,
Netvizz returns both .tab and .gdf files that contain metrics and data afforded by
Facebook Platform (e.g. Reactions button, post published date, comments). Some of
these metrics were attached to node attributes (e.g. page category, post-activity, fan
count, talking about count) or edge property (the act of liking connects pages) within
the like network, while others facilitated the process of building up an image network
(e.g. a list of image URLs). For data analysis and visualisation, we used Gephi
(Bastian, Heymann, & Jacomy, 2009) and Graph Recipes140, and for the image-label
network we relied also on Python scripts: one for interfacing with Google’s Vision
API141 and the other for plotting images into .svg files, the Image Network Plotter142.
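To make the protocol step above more tangible, the following is a minimal sketch of how a .gdf file can be split into node and edge records. It assumes the general GDF layout (a `nodedef>` header followed by node rows, then an `edgedef>` header followed by edge rows); the column names and the sample rows are hypothetical and do not reproduce an actual Netvizz output.

```python
import csv
import io


def parse_gdf(text):
    """Split GDF text into node and edge records (two lists of dicts)."""
    nodes, edges = [], []
    columns, target = [], None
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        first = row[0]
        if first.startswith("nodedef>") or first.startswith("edgedef>"):
            target = nodes if first.startswith("nodedef>") else edges
            row[0] = first.split(">", 1)[1]
            # Header cells look like "name VARCHAR"; keep only the column name.
            columns = [cell.strip().split(" ")[0] for cell in row]
        elif target is not None:
            target.append(dict(zip(columns, row)))
    return nodes, edges


# Hypothetical sample, for illustration only.
SAMPLE = """nodedef>name VARCHAR,label VARCHAR,category VARCHAR,fan_count INT
1,Universidade A,College & University,1200
2,Jornal B,Media/News Company,98000
edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN
1,2,true
"""

nodes, edges = parse_gdf(SAMPLE)
```

In this reading, page metadata travel as node attributes (e.g. category, fan count) while each act of liking becomes a directed edge, which is the structure later opened in Gephi.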
For the automated image content analysis, the option was Google’s Vision API, due
to its descriptive capacities, which tend towards higher levels of specificity when labelling
large image datasets in comparison to other vision APIs, such as IBM or Microsoft
(Mintz et al., 2019). To analyse the 22,594 images generated by Portuguese
Universities, two specific properties of Google’s computer vision API were chosen:
the description of images and the detection of facial expressions, namely label and face
detection. For the latter, RawGraphs (Mauri, Elli, Caviglia, Uboldi, & Azzi, 2017) and
ImageSorter143 served as important tools to analyse, navigate and visualise the results
related to face detection.
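As an illustration of what requesting label and face detection involves, the sketch below builds the JSON body for the Vision API REST method `images:annotate`, asking for `LABEL_DETECTION` and `FACE_DETECTION` on a list of image URLs. It is not the Memespector script used in this study; the function name and the example URL are hypothetical, and no request is actually sent.

```python
import json


def build_annotate_request(image_urls, max_labels=10):
    """Build the JSON body for a Vision API images:annotate call.

    One request entry per image URL, each asking for labels and faces.
    """
    requests = []
    for url in image_urls:
        requests.append({
            "image": {"source": {"imageUri": url}},
            "features": [
                {"type": "LABEL_DETECTION", "maxResults": max_labels},
                {"type": "FACE_DETECTION"},
            ],
        })
    return {"requests": requests}


# Hypothetical URL, for illustration only.
body = build_annotate_request(["https://example.org/timeline/image_001.jpg"])
payload = json.dumps(body, indent=2)
```

The label annotations returned for each image feed the image-label network, whereas face annotations carry likelihood fields for expressions such as joy, which is what allows images to be filtered by facial expression before a qualitative look.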
For the scrutiny of the two distinctive digital networks (page like and image-label
networks), we relied mostly on visual network analysis (Venturini et al., 2015; 2019;
Omena, Chao, Pilipets et al. 2019). This approach draws our attention to the visual
affordances of the networks, rather than focusing only on statistical metrics. The
total of 66 images collected, though, corresponds to the latter page id, and all images were uploaded
between 21 June 2013 and 13 March 2017.
139
Research protocol diagrams present the entire research process “in a compact visual form” (see the
work of Niederer and Colombo 2019); they visually inform the research steps advanced by the digital
methods approach.
140
https://medialab.github.io/graph-recipes/#!/upload
141
That corresponds to an expanded version of Memespector, originally developed in PHP by Bernhard
Rieder, and later ported to Python and expanded by André Mintz. Available at:
https://github.com/amintz/memespector-python. See Google Vision API here:
https://cloud.google.com/vision/.
142
https://github.com/amintz/image-network-plotter
143
https://visual-computing.com/project/imagesorter/
position, size and colour of the nodes are fundamental aspects in this analytical
process, as well as the spatialisation of the network, here provided by ForceAtlas2.
This force-directed algorithm, commonly used for studying networks that emerge from
social media, supports the interpretation of data by creating a balanced state in the
spatialisation of the network (Jacomy, Heymann, Venturini, & Bastian, 2014).
Modularity calculation (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008) was also
used to identify clusters: the detection of institutional interests within the like network
and the different modes of visual representations within image-label networks. Adding
to that, we took into account a critical framework for reading digital networks (Omena
& Amaral, 2019) which simultaneously reflects technical-practical knowledge on
platform grammatisation, the narrative affordances of ForceAtlas2 and Gephi
software.
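To clarify what the modularity calculation measures, the sketch below evaluates the modularity score Q of a given partition, using Q = sum over clusters c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside cluster c and d_c the sum of the degrees of its nodes. This is only the quantity that the method of Blondel et al. (2008) seeks to maximise, not the cluster-detection procedure itself; the function treats the network as undirected, and the toy network is hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted network.

    edges: list of (node, node) pairs; community_of: dict node -> cluster id.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same cluster
    degree_sum = defaultdict(int)  # summed node degrees per cluster
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy network: two triangles joined by a single bridging edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

q_split = modularity(toy_edges, two_clusters)
q_single = modularity(toy_edges, one_cluster)
```

Splitting the toy network into its two triangles scores higher than keeping all nodes in one cluster, which is the intuition behind reading modularity classes as groups of pages that are more connected among themselves than with the rest of the network.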
Figure 5.2. Research protocol diagram: combining the knowledge of platform
grammatisation with the praxis of data capture and data analysis for studying Portuguese
Universities Digital Networks.
Results
Seeing beyond like connections
The analysis of Portuguese Universities Page like network is organised according to
an overview of the page’s profile, institutional connections and the narrative thread
afforded by the spatialisation of the network, by looking at its whole and parts, as well
as the central and bridging nodes (Latour et al., 2012; Venturini et al., 2019; Omena
& Amaral, 2019). The question of clusters’ formations is also addressed, moving to an
in-depth analysis of page categories as a path to unveil the universities’ specific
(institutional) interests. Starting with a general overview, the scatterplot below (see
Figure 5.3) displays different Facebook metrics (variables) attached to a given
university (nodes). Through the size of the nodes, we can see that Facebook’s typical
forms of measurement can be very contradictory if taken as analytical parameters.
For instance, a high degree of post activity may not relate to users’ engagement,
measured by talking about count (see the universities of UTAD, Coimbra and Algarve). By the
same token, a high fan count (see Aberta) does not necessarily indicate high levels of
engagement. Additionally, we
notice that more than half of the universities allow users to post (green nodes), but
these publications are kept hidden from their Page timeline.
Figure 5.3. An overview of Portuguese Universities according to the following Facebook
metrics: post activity (posts per hour and based on the last 50 posts); talking about
count (attention metric); users can post (whether a page allows users to publish posts on the
page); fan count (number of likes a page has received). Scatterplot made in RawGraphs and
edited in Inkscape.
When moving towards the heat map of the Portuguese Universities page like network
(see Figure 5.4), we first noticed that Media and News Companies (see the nodes
among Público and Expresso), followed by Calouste Gulbenkian Foundation, are at the
heart of the interests of this network. A second aspect is the high density of connections
that surrounds UTAD and Algarve universities and, with a lower density, ISCTE,
Porto, NOVA and Lisboa. A third aspect concerns Madeira University, which is detached
from the main component within the network and, ironically, geographically isolated.
The university is based in Funchal, on the island of Madeira, part of the Madeira Archipelago. Following
the positioning of the nodes, one final observation relates to the central role of
Universia Portugal (an academic Portal), European Commission, Forum Estudante (an
academic and professional-oriented magazine) and the Association of Portuguese
Speaking Universities (AULP).
The exercise of replacing edges with a density heat map provides a vision of the whole
network and signalises the matters of common concern among Portuguese Universities
(Media and News Journal), showing agglomerations (clusters) and central actors.
Despite recognising the value of the nodes positioned in the centre of the network
(under the categories of Media/News Company and Newspapers), those were removed
with the intention of improving clusters’ visualisation. This is illustrated by the
emergence of three small clusters attached to the universities of Aveiro, Beira Interior
and Minho (see Fig. 5.5). Furthermore, new central actors for Portuguese Universities
were detected: the General Direction of Portuguese Higher Education, MIT Portugal
Program (an international collaboration centre), FNAC Portugal and Futurália (the
biggest Education Fair in Portugal). Subsequently, we read the spatialisation of the
network through the visual affordances of ForceAtlas2144. In the centre of the network,
we can see the very connected pages that play a key role in the whole network, while
influential actors and bridging nodes145 take part in the mid-zone, for instance, the
Portuguese television and radio channels (e.g. RTP2, SIC TV, Rádio Comercial),
AULP and the job opportunities office as bridging nodes. In the mid-zone, small
144
In practical terms, this force-directed layout provides a narrative thread that has fixed layers of
interpretation but multiple forms of reading (see Omena & Amaral, 2019).
145
When a node connects nodes of different clusters.
clusters can be seen with connections mainly related to Porto, Nova, Lisboa and Évora
universities.
Figure 5.4. The heatmap of Portuguese Universities page like network in March 2019. The
map shows 44 visible nodes (out of 1,522) and contains 18,988 edges. Node size by
indegree. Visualisation by Graph Recipes (Heatmap dessert) and Gephi.
In the periphery (see Fig. 5.5), while the large holes denote fewer connections with the
centre or mid-zone, the existence of clusters corresponds to the particular interests
of Algarve, UTAD, Minho, Beira Interior and ISCTE, for instance, faculties,
laboratories, student associations, schools. As an isolated element, Madeira only takes
part in the network due to the likes the page has received from a medical service’s and
a high school’s Facebook Pages. Madeira university itself has only connected to pages
featured by “Portuguese in...” or “Portuguese immigrants in...”, evoking the presence
of Portuguese in European, South American and African cities or countries. Those are,
somehow, unexpected connections for an official higher education institutional
Facebook Page.
Figure 5.5. Reading the page like network of Portuguese Universities according to the
narrative thread afforded by the force directed layout ForceAtlas2. The nodes categorised by
Media/News Company and Newspapers were removed to highlight cluster formation and to
perceive the universities’ particular interests. 1,426 nodes, 16,125 edges. Node sized by
degree.
Through the observation of the shape of the edges in large and small clusters, a
visual pattern is identified suggesting how clusters are formed. An outward movement
from the central node to its neighbours alludes to the fact that cluster formation is
substantially based on the act of liking (see Algarve, UTAD and ISCTE in Fig. 5.5).
To confirm the visual hypothesis, we used degree centrality to analyse cluster
formation within the page like network: by sizing nodes according to the number of
connections made by a page (degree), and the total of likes a page received (indegree)
or made (outdegree).
Thus, we reimagined page likes (see table 5.2) to define whether cluster formation is
based on page activity (by means of liking pages), reciprocity (by a balanced number
of likes given vs. received), popularity (by means of receiving likes), or little
reciprocity. In so doing, we avoid misinterpretation in the process of reading
digital networks, such as taking larger clusters as more important or overlooking the
role of hidden elements.
Algarve has the largest cluster mainly because it has liked more than 500 pages. In
contrast, the small cluster of Minho is based on reciprocal connections, while Porto
seems to be popular among other institutional Facebook Pages. When accounting for
how connections are made within a network (data relational nature), node size or
visibility should only inform about different characteristics that serve as basis for
understanding the cluster formation and its narrative thread.
Clusters/Page                                  Indegree             Outdegree
                                               [receiving likes]    [liking other pages]
Universidade do Algarve                        178                  527
Universidade de Trás-os-Montes e Alto Douro    117                  390
Universidade de Évora                          54                   148
Universidade Aberta de Portugal                19                   127
ISCTE                                          75                   151
Universidade do Minho                          131                  120
Universidade de Lisboa                         66                   52
Universidade Católica Portuguesa               37                   15
Universidade do Porto                          140                  68
Universidade de Aveiro                         81                   22
Universidade da Beira Interior                 75                   41
Universidade NOVA de Lisboa                    66                   33
Universidade da Madeira                        2                    47
Formation based on: Page activity [by means of liking pages]; Popularity [by means of
receiving likes]; Reciprocity [by balanced number of likes received and made]; Little reciprocity.
Table 5.2: Analysing cluster formation and nodes’ size on the basis of in-degree and outdegree.
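The reasoning behind Table 5.2 can be sketched as a small routine that counts likes received (indegree) and likes made (outdegree) from a list of directed page-like edges and then labels the relation between the two. The `balance` tolerance is an assumption introduced here for illustration, since no numeric threshold is stated in this chapter, and the category of little reciprocity is left out for the same reason.

```python
from collections import Counter


def degrees(edges):
    """Count likes received (indegree) and likes made (outdegree) per page.

    edges: list of (liking_page, liked_page) pairs.
    """
    indegree = Counter(target for _, target in edges)
    outdegree = Counter(source for source, _ in edges)
    return indegree, outdegree


def formation(indegree, outdegree, balance=0.25):
    """Label cluster formation from the two degree values.

    `balance` is an assumed tolerance, not a value taken from the study.
    """
    largest = max(indegree, outdegree)
    if largest == 0:
        return "isolated"
    if abs(indegree - outdegree) / largest <= balance:
        return "reciprocity"
    return "page activity" if outdegree > indegree else "popularity"


# Indegree and outdegree values as listed in Table 5.2.
algarve = formation(178, 527)
minho = formation(131, 120)
porto = formation(140, 68)
```

With this tolerance, the three examples follow the reading given in the text: Algarve is driven by page activity, Minho by reciprocity and Porto by popularity. A different tolerance would move the boundary between categories, which is why such labels should support, rather than replace, the visual reading of the network.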
After understanding cluster formation, the Facebook parameter Page
Category146 served as a form of identifying the interests of Portuguese Universities
(see Fig. 5.6) within the network. Not surprisingly, College & University, School,
Community, Non-profit Organization, Community College, High School, Government
Organisation, and Media/News Company are the dominant categories within the
network. Page Category also reveals multiple dimensions of sociality: media and
communication (TV, radio, newspaper, website); culture (museums, library, art,
musician/band); business (company, consulting or advertising or marketing agency,
agriculture service); public services and health concerns (medical and health, hospital);
entertainment (golf course and country, cultural centre, movie theatre) and gastronomy
(Thai restaurant, dessert shop); public figures and news personality; and very specific
interests like Barbershop, Car Dealership, Shopping Mall, Sports League and Events.
However, when searching for politically related categories, civic engagement, social
movements or causes, nothing was found.
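The two operations described above, counting how often each page category appears and searching the categories for particular themes, can be sketched as follows. The function names, the sample list and the keywords are hypothetical; in practice the input would be the page category attribute of every node in the like network.

```python
from collections import Counter


def category_frequencies(categories):
    """Return (category, count) pairs, most frequent first."""
    return Counter(categories).most_common()


def matching_categories(categories, keywords):
    """Return the distinct categories containing any keyword (case-insensitive)."""
    wanted = [keyword.lower() for keyword in keywords]
    return sorted({c for c in categories if any(k in c.lower() for k in wanted)})


# Hypothetical sample of page categories, for illustration only.
sample = ["College & University", "School", "College & University",
          "Media/News Company", "Community"]

frequencies = category_frequencies(sample)
political = matching_categories(sample, ["politic", "cause", "movement"])
```

The frequency pairs are the kind of input a treemap needs (size by frequency of appearance), while an empty result from the keyword search corresponds to the observation that a theme is absent from the network.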
Figure 5.6. What are the dominant (highlighted in white) and ordinary (in colour)
associations linked to Portuguese Universities? Treemap of Page Categories: size refers to
the frequency of appearance.
146
Facebook used to have six broad categories of pages: Local business or place; Company, Organisation
or Institution; Brand or product; Artist, Band or Public Figure; Entertainment; and Cause or Community.
Each category had a long sub-category list. In July 2019, we verified that the platform now only offers two
broad categories of pages: i) Business or Brand and ii) Community or Public Figure. The latter also
consist of sub-categories, but presently they are only visible when searched, see
https://www.facebook.com/pages/creation/.
Back to the mainstream categories, and driven by College & University and School,
three groups that stand for the institutional profile and specific interests of each
university were detected:
1. Schools from varied districts/regions of Portugal, with special attention to
school groups (Algarve and UTAD), polytechnic institutes (Évora), university
departments or services (Minho).
2. Internal stakeholders: faculties, institutes, departments, research centres, and
courses from each university. For instance, Coimbra, Lisboa and NOVA focus
on their faculties or institutes, while Aveiro pays more attention to
departments, Beira Interior to students’ nuclei from different courses, and
Católica, including its branches in Porto and Braga, to courses.
3. International universities and Portuguese higher education147 exclusively
represented by ISCTE – IUL and local learning centres.
Contrasting with this description, Aberta University is uniquely positioned, showing a
combination of the three groups, due to its balanced range of connectivity. Another
affordance of looking at page categories is unfolding official pages that are set aside
by the university or the other way around. A good example is the lack of connection
between Coimbra University and its Department of Physics, the geophysical and
astronomic laboratory, Rádio
Universidade de Coimbra, and others148. Additionally, page categories enable the discovery of pages oriented
towards specific audiences, for example, Brazilian students149.
The imagery of Portuguese Universities
What can we foresee from a historical dataset of Facebook images timelines? How to
repurpose the methods of the medium to study the visuality of higher education in
Portugal? As natively digital objects, online images have uniform resource identifiers
147
For instance, Münster University, University of Groningen, Universidad Carlos III de Madrid, UNED Universidad Nacional de Educación a Distancia, Universidade da Cidade de Macau, and Portuguese
Higher Education: Universidade do Porto, Instituto Superior Técnico, Universidade da Beira Interior,
Universidade Lusíada de Lisboa, Escola Superior de Comunicação Social, Universidade de Lisboa,
Laboratório de Ciências da Comunicação, ISCTE Business School.
148
Turismos Universidade de Coimbra, Relações Internacionais Universidade de Coimbra - International Office, Sociologia, Universidade de Coimbra, Clube de Robótica da Universidade de Coimbra.
149
https://www.facebook.com/UniversidadeCoimbraBrasil/
(URLs) that often provide a component preceded by an equal sign (=) which
points to a unique identifier (id) of a given image, such as
https://scontent.flis91.fna.fbcdn.net/v/t1.09/1937274_144956263379_7335679_n.jpg?_nc_cat=108&_nc_ht=scontent.flis91.fna&oh=2a719e0099e71e-4049305a22ea628887&oe=5D536D7D.
By taking advantage of the images’ URLs afforded by Facebook Graph API and merging these
into the computer vision services of Google Cloud API, we were able to describe and
interpret the imagery of Portuguese Universities.
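To inspect what such an image URL carries, the sketch below splits the example URL into its host, its filename and its key=value parameters using Python's standard URL parser. The function name is hypothetical, and which of these parts Facebook treats as the image identifier is not asserted here; the sketch only makes the components readable.

```python
from pathlib import PurePosixPath
from urllib.parse import parse_qs, urlparse


def describe_image_url(url):
    """Split an image URL into host, filename and key=value parameters."""
    parts = urlparse(url)
    return {
        "host": parts.netloc,
        "filename": PurePosixPath(parts.path).name,
        "parameters": parse_qs(parts.query),  # each key maps to a list of values
    }


# The example URL exactly as it appears in the text above.
EXAMPLE = ("https://scontent.flis91.fna.fbcdn.net/v/t1.09/"
           "1937274_144956263379_7335679_n.jpg?_nc_cat=108"
           "&_nc_ht=scontent.flis91.fna"
           "&oh=2a719e0099e71e-4049305a22ea628887&oe=5D536D7D")

described = describe_image_url(EXAMPLE)
```

The filename offers a stable handle for naming downloaded files, which matters because, as noted earlier, social media image URLs have a short life span and should be resolved soon after data collection.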
A total of 22,594 images were plotted in a bipartite network, in which nodes are images
(90.62%) and labels (9.38%), while edges (161,474 in total) show the connections
made by a number of labels that describe one or more images. The co-occurrences of
similar descriptive visual content (labels) inform about the position of the images
within the network. In alignment with Rose’s (2016) proposal for interpreting visual
material, the first level of analysis enquires about the site of the image itself. The multi-sited composition and meanings indicate different formations of institutional interests
that are invested in the visuality of the page timelines. What visualities describe this
culture? What can the dominant visualities and their inherent meanings tell? What is
not there (and why)? Through the computer vision API-based network, in Figure 5.7,
we may see the plot of all images featured in the timeline of the 15 Portuguese
Universities’ Facebook Pages, spatialised according to correlated labels and, at the
bottom, the analytical exercise of relabelling the machine vision by categorising
clusters. Next, we will show how the scrutiny of timeline images can provide general
and specific insights on the visual culture of Portuguese Universities over the years,
more specifically from each university page creation date to March 2018.
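The structure of this bipartite network can be sketched as follows: every image and every label becomes a node, and an edge links an image to each label that describes it. The function names and the sample data are hypothetical. As a rough check on the figures above, if 22,594 images make up 90.62% of the nodes, the network holds roughly 24,900 nodes, of which about 2,300 are labels, and 161,474 edges amount to about seven labels per image; these are back-of-the-envelope estimates derived from the reported percentages, not values reported by the study.

```python
def build_image_label_network(labels_by_image):
    """Build a bipartite network from a dict mapping image name -> labels.

    Node ids are prefixed so that an image and a label can never collide.
    """
    image_nodes = {f"image:{name}" for name in labels_by_image}
    label_nodes = set()
    edges = []
    for name, labels in labels_by_image.items():
        for label in sorted(set(labels)):  # ignore repeated labels per image
            label_nodes.add(f"label:{label}")
            edges.append((f"image:{name}", f"label:{label}"))
    return image_nodes, label_nodes, edges


def node_shares(image_nodes, label_nodes):
    """Share of each node type in the whole network (assumes it is not empty)."""
    total = len(image_nodes) + len(label_nodes)
    return {"images": len(image_nodes) / total, "labels": len(label_nodes) / total}


# Hypothetical sample, for illustration only.
sample = {
    "img_001.jpg": ["audience", "auditorium"],
    "img_002.jpg": ["audience", "poster"],
}

images, labels, edges = build_image_label_network(sample)
shares = node_shares(images, labels)
```

Because the two sample images share the label "audience", a force-directed layout would pull them towards each other, which is how co-occurring labels end up positioning images with similar visual content in the same region of the network.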
In Figure 5.7, at the top, we see quite a homogeneous network, except for the strong
concentration of images in dark hues on the right side and the colourful agglomeration
at the top. These clusters represent the most common visual representation associated
with Portuguese Universities official Facebook Pages: Portugal sentado (Portugal
while seated) and posters. The former is named after a Portuguese journalistic
expression that relates to the (bad) habit of publishing photos of people seated in press
conferences, auditoriums, all kinds of meetings, parliament, scientific conferences, or
even sports events; in other words, something to be avoided in the
newspapers. Posters depict all sorts of written content: screenshots of news,
institutional newsletters, banners to promote academic events or to celebrate
commemorative dates.
In the peripheral zone of the image-label network, we find the formation of discrete
clusters that point to different visualities with a more detailed classification (labels) to
describe the images, whereas a more generic labelling takes place in the centre of the
network. After analysing the different regions of the network, ten clusters were
detected besides those two already mentioned (see Fig. 5.7): people in academic
events, people in outdoor/indoor events, school buildings, head shot pictures, sports,
musical performance, sky and grass, animals, labs, and history in black and white.
In a general view (see Fig. 5.8), the visual depiction of Portuguese Universities mainly
outlines the tedious pictures of audiences (sitting in an auditorium) or keynote speakers
in academic conferences. Added to that are pre- and post-conference conversations,
organisers or participants posing for pictures, posters, presentations, institutional
partnerships (e.g. shaking hands or signing contracts) and prize winners.
Beside Portugal sentado and nearly in the centre of the network (see Fig. 5.7), there
are two other clusters of which the main focus is people in academic events (outdoor
and indoor activities), small and big groups chatting or posing for a photo, professors
or students being interviewed.
Figure 5.7. The imagery of Portuguese Universities on Facebook from 2009 to 2018.
Another strong visual identity is the graphical depiction of posters and banners with
the most diverse types of announcements: beyond banners related to conferences, symposiums, workshops,
sports or new students, we find the number of likes achieved by a page (fan
count) and the celebration of the ‘international day of’ (e.g. water, statistics). A
very particular visuality is also uncovered, specifically news clippings, which indicate
how much Portuguese Universities value it when mainstream or traditional Portuguese
newspapers headline or mention academic research, professors or students. In this
scenario, and between 2009 and 2018, we may infer that the dominant visualities of
Portuguese Universities conform to people attending events and institutional posters.
The architecture of school buildings earned a place of honour, in particular the
overwhelming presence of the Institute of Social Sciences (ICS, Minho University; see
Fig. 5.8). We also visualise head shot pictures of local, national and international
researchers (and perhaps a small number of students), and a full cluster dedicated to
university sports (see Fig. 5.8). From collective (including wheelchair categories) to
individual modalities, the rule seems to be the depiction of victorious teams: images
of teams celebrating the victory, athletes on the podium, holding medals or a trophy.
At the bottom of the network lies the musical performance cluster (see Fig. 5.7),
composed of cultural events in the shape of concerts, choirs, orchestras and musicians
with their instruments. There is also a group of images put together because sky
and grass are their main visual composition.
The ordinary visual content (less substantial in numbers) brings wild and
domestic animals such as birds, cows, owls, tigers, monkeys, dogs and cats; the
stereotypical images of labs, namely researchers working with a telescope; and people
who made history in black and white pictures. The latter basically consists of historical
photos published by Porto. The visual description and historical perspective of
Portuguese Universities seem to lack, however, the everyday life of students and non-stereotypical imagery of scientific research, while overvaluing academic events
and institutional communication through banners and news clippings.
Figure 5.8. The visual history of Portuguese Universities from 2009 to 2018: image tree map
based on clusters detected in the image-label network (portugal sentado, posters, people in
academic events, people in outdoor/indoor events, school buildings, head shot pictures,
sports, musical performance, sky and grass, animals, labs, and history in black and white).
After having a general (but also detailed) perspective on the imagery of Portuguese
Universities on Facebook, in the second level of analysis, we questioned the visual
choices and patterns attached to each university. Fifteen bipartite image-label networks
were arranged in a grid to respond to this question (Fig. 5.9). The focus of our
analytical observations here is the presence or absence of colours, which indicates
different visual patterns and particularities. In the grid, each university's image-label
network is shown with the number of images published since the page creation date. These
individual characteristics help to situate the different networks. The vision API-based
network grid provides not only an innovative technique to approach Portuguese
Universities’ timeline images but also their visual history on Facebook.
Figure 5.9. Vision API-based network grid. Seeing the visual patterns of a given Portuguese
University from its page creation date to March 2018.
As previously described, pictures of sitting audiences, posters and people at academic
conferences correspond to the dominant imagery of the majority of Portuguese
Universities on Facebook – with the exception of Coimbra University. In contrast to this
portrayal, the practice of sports and musical performance appear to have little visual
space among universities, at least if compared with the main image categories. For
instance, there is a minor representation in Açores, Lisboa and Católica and no visual
mention in Madeira. Aberta University does not contemplate sport practices. Particular
characteristics can be seen through the animals cluster, which is almost exclusive to
UTAD but also represented in Porto and, to a lesser extent, in Aveiro. The head shot
pictures seem to please all universities, except in the cases of Açores, Madeira and
Coimbra, which do not invest in such a style. The vision API-based network grid (see
Fig. 5.9) offers an effective and direct way of reading the choices and patterns that
constitute the imagery of Portuguese Universities on Facebook. Such a technique can be
replicated in further similar studies.
Since people are at the core of visual communication, the last level of analysis took
advantage of the face detection module made available by Google Vision API. The
main objective was to repurpose machine vision to assess the mood of Portuguese
higher education. Results demonstrate an overwhelming depiction of happiness over
the years, but also detect other facial expressions such
as surprise, sorrow and anger (see Fig. 5.10). In terms of consistency, and considering
the page creation date, Porto, Minho, Aveiro, ISCTE and Madeira have recurrently
been publishing images that contain joy on their respective Facebook Pages. In contrast,
in terms of volume and duration, Trás-os-Montes e Alto Douro (775 images), Açores
(603 images), Évora (447 images) and Beira Interior (285 images) stand out as the
Facebook Pages with more images that are very likely or likely to express joy. But
what are those images related to? What would be the motive for great happiness or
pleasure? After plotting images separately (see appendices), it was possible to uncover
the different motives that drive the visual imagery of UTAD, Açores, Évora and Beira
Interior.
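The counting step described above can be sketched as follows. This is a minimal illustration and not the procedure actually used in the study: it assumes that each image has already been annotated and stored as a record holding the page name and the face annotations in the JSON shape returned by the Vision API (fields such as joyLikelihood with values such as LIKELY or VERY_LIKELY). The record layout, the function names and the sample data are hypothetical.

```python
from collections import Counter

# Likelihood values that the analysis treats as a detected expression.
LIKELY = {"LIKELY", "VERY_LIKELY"}

# Expression name -> field name in a Vision API face annotation (JSON form).
EXPRESSION_FIELDS = {
    "joy": "joyLikelihood",
    "surprise": "surpriseLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
}


def expressions_in_image(face_annotations):
    """Return the expressions rated likely or very likely in at least one face."""
    found = set()
    for face in face_annotations:
        for name, field in EXPRESSION_FIELDS.items():
            if face.get(field) in LIKELY:
                found.add(name)
    return found


def count_by_page(records):
    """Count images per (page, expression); an image counts once per expression."""
    counts = Counter()
    for record in records:
        for name in expressions_in_image(record.get("faceAnnotations", [])):
            counts[(record["page"], name)] += 1
    return counts


# Hypothetical records, one per image (not data from the study).
sample = [
    {"page": "UTAD", "faceAnnotations": [
        {"joyLikelihood": "VERY_LIKELY", "sorrowLikelihood": "VERY_UNLIKELY"},
        {"joyLikelihood": "LIKELY"},
    ]},
    {"page": "UTAD", "faceAnnotations": [{"joyLikelihood": "POSSIBLE"}]},
    {"page": "Évora", "faceAnnotations": [{"sorrowLikelihood": "LIKELY"}]},
]

print(count_by_page(sample))
```

Counting images rather than faces keeps a group photo with many smiling people from weighing more than a single portrait; counting faces instead would be an equally defensible choice, depending on the question asked.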
The happiness in UTAD is mainly conveyed by students, staff or professors posing for
a picture in various types of academic events, and also by the depiction of smiling
speakers and audiences. There is also joy in head shot pictures, news clippings, people
drinking wine, in a few selfies, sports, tech-related pics, and the inauguration of new
facilities. In a similar spirit, the visuality of Beira Interior has a great focus on students
participating in outdoor and indoor events, although it also brings university staff and
board members to these events, some selfies, head shot pictures and the record of
award ceremonies. The smiling audience, the act of posing for pictures in academic
events or official ceremonies, are also key visual characteristics in Évora, in addition
to the overwhelming presence of students. Other types of events prompt the happy
visuality of Açores: official ceremonies to award the best graduated students (both in
an auditorium and in the rectory building); events where the faculty staff and board
members get together (e.g. in the rectory building, outdoor picnic). Photos of students
in classrooms, in the rectory building or in outdoor events are also common.
Figure 5.10. Computer vision to examine the mood of Portuguese Universities over the years
according to the face detection module of Google Vision API. Four main expressions were
detected as very likely or likely to appear in the universities’ Facebook images timeline: joy,
surprise, sorrow and anger.
In addition to the academic events that put together students, professors, staff and
board members, there is a common reason that brings joy to the higher-educational
environment: the visit of the president of Portugal, Marcelo Rebelo de Sousa. His
presence is synonymous with a series of pictures in the Facebook timeline images of
UTAD, Açores, Évora and Beira Interior universities. The International Exchange
Erasmus Students Network also connects to the joy of Évora and Beira Interior, and
although images of sporting achievement are few, they often connect to happiness.
When further analysing other facial expressions, the results supplied by computer vision
can be deceptive. The Facebook timeline images that contain surprise (64 images) do
not exactly depict faces showing that something unexpected may have happened.
Rather, we see facial expressions that may be tricky for machine learning algorithms,
such as raised eyebrows or the opening and closing of the mouth when talking or giving a
presentation. Although useless for detecting genuinely surprised expressions, the results
provided by machine vision demonstrate a type of visuality that should perhaps be
avoided in institutional communication, since it does not provide a pleasant visual
identity (e.g. a, b, c, d, e).
The sorrow-related images (a total of 14) are mainly groups of students in different
situations, such as: in learning environments, appearing to be concentrated and sitting
in a small auditorium or standing in a laboratory paying attention to the professor (e.g.
in Açores and in UTAD); in academic events (e.g. 1 and 2 from UTAD, and
e.g. 3 from Aveiro); or students being interviewed by a television reporter (Algarve).
A few sorrow images also come from institutional and academic events, for
instance university staff involved in organising or planning activities in the course of
an educational event (Beira Interior), a picture zooming in to focus on the audience
sitting in an auditorium (Açores), and a visit by the president of
Portugal, Marcelo Rebelo de Sousa, who is entering the University facilities
accompanied by the dean of Évora University (Ana Costa Freitas) and a few board
members. There is also the case of detecting sorrow faces in banners: one image
promoting an exhibition about the miracles of Nossa Senhora de Fátima, consisting of
an old black and white photo with three children staring at a camera looking
sad (UTAD), and another with Alfred Hitchcock looking down at another
filmmaker (Évora).
The five images with anger faces derive from the practice of sports, except for the one
promoting a Masters in Theatre (Évora University in 2017). The other cases bring
winning athletes such as João Paulo Fernandes getting second place in the London
Paralympic Games 2012 (Aveiro); the celebration of Porto’s female Rugby team for
achieving third place in the European University Games 2014 (Porto); Minho's female
five-a-side football athletes in the middle of the court celebrating (probably the
victory) in 2017 (Minho); or the applications for EA Campus 2014 (ISCTE).
The thick description of face detection, in particular surprise, sorrow and anger,
intentionally calls our attention to the lack of precision and limitations of computer
vision APIs.
Discussion
The study of Portuguese Universities’ Facebook Pages discloses some practical and
institutional implications for both research and communication practices. For the latter,
we should consider that the use of social media for institutional communication is
indeed a very recent activity: Facebook is a teenager (15 years old) and the oldest
Portuguese Universities’ pages on this platform were created in 2009 (Porto and Beira
Interior). For the last ten years, Portuguese Universities have been learning how to
communicate using Facebook, while they use it and rely on its affordances.
However, it also becomes clear that Portuguese Universities are using Facebook as just
another platform to “shout out” their activities. At the same time, most universities do
not seem to be spending enough time building their social media networks or giving
attention to some aspects of their missions. Research activities or accomplishments,
for instance, are very seldom referred to, which means that science communication is
practically nonexistent. Connection with the outside world is also neglected.
When seeing the dominant images shared by Portuguese Universities, we find mostly
“Portugal Sentado” (people seated while listening to conferences) and conference
posters, showing very little imagination and a somewhat careless attitude towards
social media best practices and potentialities for communication. By using the same
type of photos time and time again, universities are perpetuating the stigma of a boring
academic environment. The “institutional” visuality of academia reproduced by
Portuguese Universities fails in taking advantage of the attention economy of
Facebook. In other words: no “clickworthy” and “shareable” content.
From a research perspective, the innovative approach of vision API-based networks sheds
light on how the universities in Portugal make use of, manage and give priority to their
visual content over the years. The digital visual methods adopted here come along with
thick descriptions, technical knowledge and practical expertise, highlighting the need
to question the methods of the medium critically. That is the case of recognising
the lack of precision and limitations of computer vision in the analytical process or
being aware of the problems with web data (knowing how to deal with it!).
Our proposal of calling into the platform is an invitation to conduct communication
and social research through the lens of medium-specificity. Following Latour’s
oligoptic vision of society, it derives from the explorations in the context of device-aware sociology (Marres 2017) and technical-practical fieldwork. This reality
challenges new media researchers to take advantage of the intrinsic properties and
dynamic nature of digital platforms, and here lies the main contribution of this chapter:
a practical walk through the possibilities of repurposing the technicity of networks for
studying societal (institutional) phenomena on Facebook; a call for another culture
of making research questions and designing research; a call for embracing an open-ended process in which new questions are always welcome.
Introduction to chapter six
This chapter was originally published as: Silva, T., Mintz, A., Omena, J. J., Gobbo, B.,
Oliveira, T., Takamitsu, H., Pilipets, E., & Azhar, H. (2020). APIs de Visão
Computacional: investigando mediações algorítmicas a partir de estudo de bancos de
imagens. Logos, 27(1). doi:https://doi.org/10.12957/logos.2020.51523. The data
sprint report of the article is available from https://smart.inovamedialab.org/pasteditions/smart-2019/project-reports/interrogating-vision-apis/
This chapter investigates the image classification capacities of different computer
vision APIs (Google, Microsoft, IBM), while asking how different nationalities
(Australian, Brazilian, Nigerian, Portuguese) are represented by stock image websites
(Shutterstock and Adobe Stock). I have contributed to this chapter by suggesting the
angle of research and proposing ways to interpret the image-label networks. Together
with André Mintz and Beatrice Gobbo, I helped in developing and operationalising the
research design and methods.
The chapter illustrates a situation in which researchers are familiar with and care for
the technicity of computational mediums (platforms and also research software) and
presents a creative visual method (e.g. comparative matrixes of vision API outputs
through image-label networks). It also serves as an example of when and how the
technicity of the mediums is misaligned with the research objective, which
particularly highlights our lack of knowledge about how machine learning models
deliver labels to classify objects and scenes in an image. That said, I want to first
describe the impact of such lack of knowledge on how we think and develop ideas
about a specific issue. After that, I discuss potential visual methodologies to interrogate
and compare the labelling capacities of three computer vision APIs.
What were the research questions and why should they not have been asked?
This case study is also interesting because we started off on the wrong foot, especially
considering two of our initial research questions:
• Can we investigate national representations through computer vision tools?
• How are cultural specificities made visible by computer vision APIs?
These questions led us to look for national representations and cultural specificities
through the labels provided by vision APIs to classify the material content within an
image, which brought us to wrong conclusions. For example, we assumed that
computer vision has a limited scope for image classification because it lacks the
capacity to detect cultural, racial and gender specificities (e.g. food, music, dance,
customs related to a country). The article thus affirms the bias reproduced by
proprietary vision APIs, stating that this is linked to the geopolitical position of these
companies. And, even more, it insinuates that the outputs have racial bias. Although
we have indeed managed to get a sense of how different nationalities are represented
in stock image websites by describing common and unique image clusters for each
nationality studied (Australian, Brazilian, Nigerian, Portuguese), we were unable to
see cultural specificities.
My point here is that computer vision image classification provides general and very
specific labels afforded by machine learning models (see example below). The textual
descriptions are assigned to an image (e.g. labels or tags such as food) and these are
always accompanied by high or low confidence scores (from 0 to 1) and ranked by
topicality rating. This informs us about both the probability of the textual descriptions
assigned to an image and the hierarchical way of classifying what is in an
image. The labelling characteristics and potentialities of computer vision would allow
different modes of understanding visual content, but not exactly identifying cultural
specificities (not by labels but through our interpretation of image clusters). This is to
say that the two research questions emphasised here should be dropped or at least
rethought. If we had to ask about cultural specificities, we could have used another
way such as opting for the detection of web entities (see below).
Google Vision Labels
Food (0.9772421); Ingredient (0.8929239); Fruit
(0.8815268); Staple food (0.86995584); Recipe
(0.8641755); Dish (0.85230696); Cuisine (0.8481042);
Yellow (0.8465541); Produce (0.7728272); Baked
goods (0.7722937)
Google Vision Web Entities
Lisbon (1.0512); Egg tart (1.02405); Puff pastry
(0.92190003); Pastel (0.91665006); Tart (0.8598);
Custard (0.80205); Pastel de nata (0.7354); Dessert
(0.7092); Custard tart (0.7072); Cream (0.6104)
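The contrast between the two outputs can be made concrete with a short sketch. It assumes a response already retrieved and stored in the JSON shape of the Vision API (labelAnnotations, and webDetection with its webEntities); the response below is a hypothetical, trimmed record whose values are copied from the example above, and the function names and the threshold are illustrative. Since the web entity scores in the example exceed 1, the sketch treats them only as a ranking and not as a probability, which is an assumption of this illustration.

```python
# Hypothetical, trimmed response for one image; values copied from the example above.
response = {
    "labelAnnotations": [
        {"description": "Food", "score": 0.9772421},
        {"description": "Ingredient", "score": 0.8929239},
        {"description": "Fruit", "score": 0.8815268},
    ],
    "webDetection": {
        "webEntities": [
            {"description": "Lisbon", "score": 1.0512},
            {"description": "Egg tart", "score": 1.02405},
            {"description": "Pastel de nata", "score": 0.7354},
        ]
    },
}


def labels(resp, min_score=0.0):
    """Return (label, score) pairs at or above min_score, highest score first."""
    pairs = [
        (item["description"], item["score"])
        for item in resp.get("labelAnnotations", [])
        if item["score"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def web_entities(resp):
    """Return (entity, score) pairs ranked by score; entities without a name are skipped."""
    entities = resp.get("webDetection", {}).get("webEntities", [])
    pairs = [(e["description"], e["score"]) for e in entities if e.get("description")]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


print("Generic labels:", labels(response, min_score=0.89))
print("Web entities:", web_entities(response))
```

Read side by side, the labels describe what is materially in the picture (food, ingredient), while the web entities point to the context in which the image circulates (Lisbon, pastel de nata), which is why the latter is the more plausible entry point for questions about cultural specificities.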
But what can we learn from this? Well, I would say that in the same way that the
practice of collecting and visualizing data can be overly addictive in the first contact
with digital methods, researchers are susceptible to technological innovation, falling
into the temptation to make use of what is new (vision APIs) and apparently easy (e.g.
YTDT combined with Gephi) and trying things out without quite knowing where they
are going or what they are asking. One’s ability to use software (or being an expert in
it) does not necessarily mean being aware of the technicity of the medium; the same
can be said for one’s skill in generating data visualizations or code. Our knowledge of
image networks was solid, but we had a poor understanding of the functioning of image
classification by computer vision and, because of that, our results were flawed. As
argued in chapter 2, we should distinguish computational mediums at different levels
(e.g. individual and elements), taking special care with the “elements” (e.g. algorithmic
techniques) as they are carriers of meaning. Here we lacked the ability to understand
the practical qualities of image classification. Digital methods protocols are only as
robust as the weakest link of their methodological chain. Our tendency to frame (or
blame) algorithmic techniques as either racist or culturally ignorant agents portrays a
common (and perhaps not always conscious) attitude towards technical objects: the
tendency to judge them before getting to know them. We learnt that the use of
computer vision APIs must be accompanied by a minimum knowledge of their
features and of the technical element in use, such as image classification. Otherwise, the
research questions and design can be misguided, leading to erroneous assumptions.
How did the practice of methods guide us to new ways of asking questions and solving
problems?
From a medium research perspective, the chapter asks whether there are differences
between computer vision API providers, particularly in their ability to classify
images. For this purpose, the article proposes promising visual methodologies to both
interrogate the computer vision API outputs and compare their descriptive capacities
by comparing different image-label networks in matrixes. First, by taking advantage
of the spatialization logic of ForceAtlas2, we grasp the mode of spatial distribution of
images and labels through a visualization displaying networks in shades of gray.
Second, we categorize image clusters according to their dominant visual typology and
use colors to emphasize the common and unique clusters between the three APIs. In
this way, from the center to the periphery of the networks, we were able to understand
the generality and specificity of the labels applied by each vision API and the variety
of objects defined by the scope of the dataset and their topical specificity.
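The logic behind this comparison can be approximated without a visual layout. The sketch below is a simplification under stated assumptions: it does not reproduce ForceAtlas2 or the cluster-based colouring described above, but uses the degree of each label in a bipartite image-label network as a rough proxy for generality (highly connected labels tend to be pulled towards the centre of a force-directed layout), and compares the label vocabularies of the three APIs instead of their image clusters. All image names and labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical labels returned by three APIs for the same three images.
outputs = {
    "google": {"img1": ["food", "dish", "custard"],
               "img2": ["food", "fruit"],
               "img3": ["sky", "grass"]},
    "microsoft": {"img1": ["food", "plate"],
                  "img2": ["food", "fruit"],
                  "img3": ["sky", "outdoor"]},
    "ibm": {"img1": ["food", "dessert"],
            "img2": ["fruit"],
            "img3": ["sky", "nature"]},
}


def edges(image_labels):
    """Bipartite edge list: one (image, label) pair per assigned label."""
    return [(image, label) for image, found in image_labels.items() for label in found]


def label_degree(image_labels):
    """Number of images attached to each label; a high degree suggests a generic label."""
    degree = defaultdict(int)
    for _, label in edges(image_labels):
        degree[label] += 1
    return dict(degree)


def common_and_unique(api_outputs):
    """Labels shared by all APIs, and labels used by only one of them."""
    vocab = {
        api: {label for found in images.values() for label in found}
        for api, images in api_outputs.items()
    }
    common = set.intersection(*vocab.values())
    unique = {}
    for api, own in vocab.items():
        others = set().union(*(v for name, v in vocab.items() if name != api))
        unique[api] = own - others
    return common, unique


common, unique = common_and_unique(outputs)
print("Label degree (google):", label_degree(outputs["google"]))
print("Common labels:", sorted(common))
print("Unique labels:", {api: sorted(found) for api, found in unique.items()})
```

The edge list produced by edges() is also the kind of table that network software such as Gephi can import, so the same data could feed both this tabular reading and the spatialized one.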
In this chapter, digital networks are used as research devices for reading technical
content (the range and specificity of labelling across APIs) and the content of the
images in a comparative way (by attributing colours to common and unique image
clusters).
This chapter thus feeds into the main arguments of this dissertation by presenting a
case study that considers computational mediums from a conceptual-technical-practical
perspective, and by introducing a medium research perspective to get to know the
potentials of computer vision (through the comparative analysis of image classification
by different vision APIs) and innovative visual methodologies using comparative
matrixes of image-label networks. The chapter is also valuable as a description of a
failure: showing how seriously the lack of basic knowledge about computer vision
features can harm research results.
CHAPTER 6
6 INTERROGATING COMPUTER VISION APIS [150]
[150] This chapter was originally published as: Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T.,
Takamitsu, H., Pilipets, E., & Azhar, H. (2020). APIs de Visão Computacional: investigando
mediações algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1).
doi:https://doi.org/10.12957/logos.2020.51523
Introduction
The current internet and communication landscape is characterised by a duality
between the multiplication of media technologies, which is present in the fields of
audience and production (Napoli, 2008), and the growing and controversial
platformisation of the web and, consequently, the economy of digital attention. If the idea of
liberating the transmitting pole, or the relative decentralization of content production,
predominated during the dissemination of information and communication
technologies (Castells, 2002; Lemos, 2002), today the commercial panorama
demonstrates a considerable concentration of the digital in a few companies. The acronym
GAFAM refers to the corporations Google, Amazon, Facebook, Apple and
Microsoft and to an oligopoly power that concerns not only business, but also the
economy of attention and even the interpretation of social reality, since they
concentrate a good part of the data generated daily by people and companies, as well
as the technical resources to interpret them.
Helmond (2015) observed the platformisation of the web by social media organisations
as an effective effort to manage data flows and web economics in creating and
reconfiguring value from digital traces and data. This trend promoted hierarchical integration
and inter-programmability among web environments that can no longer be seen in isolation, deepening the inadequacy of
traditional research methods for their understanding (Pearce et al., 2018). Platforms
such as Facebook, for example, extended into other digital environments by providing
easy authentication features or distributed comments. Amazon, in turn, was one of the
first e-commerce cases to apply the marketplace idea: the site becomes an intermediary
between consumers and retailers, of various scales, who need the website to access
their customers. But Facebook is much more than a social networking site and Amazon
is much more than an e-commerce site. These organisations have become
conglomerates in competition for cutting-edge areas of technology, such as robotics
and artificial intelligence (AI), driven by the data that passes through their servers in
the transactions and communications they host. Srnicek (2017) explains the advantage
of these corporations over traditional business models, since platforms are positioned
between users and suppliers and, at the same time, dominate the space where
transactions take place, having privileged access not only to data from the companies
that use them, but also to exclusive data that arises from the privileged position of
having information from several competitors.
In this scenario, AI platforms emerge both from these large new corporations in the
digital market and from companies that were born from the traditional computing
market (such as Microsoft and IBM). Among the different services offered, such as
natural language processing and recommendations based on consumption patterns, is
the computational interpretation of images, one of the AI frontiers and a fundamental
demand of the contemporary media setting. In social media, for example, images have
become increasingly central to communication. In this respect, the literature in the field
refers to a "visual turn" of digital platforms (Faulkner, Vis & D'Orazio, 2018),
whose publications are more and more visual, including social media specifically
focused on this type of content (such as Instagram, Snapchat and TikTok).
This development, however, has not been accompanied in a proportional way by
academic research, which still encounters difficulties in investigating this reality.
Research on images has typically been limited to small qualitative studies
(Laestadius, 2017), but the study of visual data in social media is moving considerably
towards multidimensional approaches that see "qualitative data on a quantitative scale"
(D'Orazio, 2014), due to the volume, the characteristics of circulation and its
complexities. Studying images in social media is especially challenging due to the
complexity of computational processing of visual data - to which the sub-discipline of
Computer Science called Computer Vision is dedicated. However, this is a growing
demand. Some social media platforms, for example, process all images published on
their site through computer vision, for purposes such as moderation, ad segmentation
and accessibility for people with visual impairments.
In a sense, much of the work of the academic researcher interested in understanding
the circulation of images may be accomplished and expanded through the use of
automatic tagging of the semantic content in the images, available in a "packaged"
form in programming libraries or computer vision providers. Most large digital media
and technology companies have launched their own AI platforms serving these
purposes, such as IBM Watson, Amazon Web Services, Microsoft Azure and Google
Cloud Platform. However, these platforms and their mode of
operation remain largely unexplored, which demands not only reflection on their potential
for research applications, but also an investigation of how they interpret the processed
visual data.
Therefore, the purpose of this chapter is to pursue this dual task of exploring the
analytical possibilities of computer vision APIs - such as those mentioned above - while at the same time questioning the very constitution of these platforms [151]. In this
approach, the study is aligned with "digital methods", according to the proposal of
Richard Rogers (2013, 2015), which is defined, among other aspects, by the critical
and reflective repurposing of operative instances of the web (native objects of the
digital) and the data generated by the platforms in a social research sensitive to the
medium (cf. Omena, 2019). In particular, we examine the biases inherent in computer
vision APIs, which will be evaluated in analytical and investigative practice. For such
a framework, the study also mobilises elements of a decolonial technological
critique, which emphasises that technologies are not neutral things but mediating
cultural artefacts that embody relations of power and representation (Haas,
2003).
Computer Vision and the study of images
In the field of computer science, the sub-discipline specifically focused on the
computational interpretation of visual data is called Computer Vision. Historically, it
is one of the first problems proposed, in the 1950s, for the development of AI, then, as
a derivation of cybernetics (Cardon, Cointet & Mazières, 2018). Among its first
developments are programs for the computational reconstitution of three-dimensional
spaces and objects from photographic images (Manovich, 1993; Roberts, 1963). With
the elaboration of algorithmic models for image interpretation, computer vision allows
the incorporation of photographs and videos - among other types of recording - as data
input for robot navigation, forensic science and information management systems -
beyond, of course, war and surveillance applications.
[151] This article derives from collective research within the SMART Datasprint 2019, an event
organised by the iNOVA Media Lab of Universidade Nova de Lisboa. The research report is
available at: https://smart.inovamedialab.org/smart-2019/project-reports/interrogating-vision-apis/. A
version of this study was also presented at the National Symposium on Science, Technology and
Society, organised by ESOCITE (Brazilian Association of Social Studies of Science and Technology)
in Belo Horizonte, Brazil.
Importantly, however, the fundamental problem of computer vision corresponds to
what, in the jargon of the area, is referred to as an "ill-posed problem", that is: a
problem for which it is impossible to achieve a single or optimal solution, but only
probabilistic approaches (Smeulders et al., 2000). Far from being definitive, any
computational interpretation of an image will always be an interpretation that will
inform both about the analysed image and - fundamentally - about the program that
produced it.
In the context of platformisation, especially with the increasing centrality of visual
content in the use practices of the platforms, image recognition programs from
computer vision play a fundamental role. Given the constituent character of
datafication and algorithmic mediations in the very definition of social media
platforms (D'Andréa, 2018; van Dijck, 2013), image recognition programs allow the
integration of visual content with the operation of platforms. Also, it is precisely the
massive availability of these contents that enables the contemporary development of
computer vision under the paradigm of machine learning (Alpaydin, 2016). This
paradigm is characterised by the inductive nature of its operation, in which the
algorithm is not explicitly elaborated in the program, but rather inferred by the program
itself from a large volume of training data (Mackenzie, 2017).
However, several problems emerge from this relatively recent configuration. A typical
counterargument addressed to the advocates of the machine learning paradigm
concerns the unintelligibility of the decision system produced (Cardon et al., 2018).
After the training process, the resulting models offer probabilistic predictions based
on the training data, but it becomes difficult to measure and (even more) intervene in
the system architecture to correct possible errors and biases. According to
Cardon et al., this problem of unintelligibility was counterbalanced in the scientific
controversy surrounding machine learning by the relative effectiveness then verified
in system applications. This was due to the volume of data used in training which, in
theory, would lead to a reduction in errors and biases. However, as much contemporary
criticism has pointed out, the quantitative volume of data does not necessarily reflect
its quality to infer general models. Data collected from the web - "in the wild" - tend
to reproduce invisibility and oppression schemes that, due to their systemic character,
are not compensated by the large volume of data used in training. On the contrary, it
210
can be seen that the models inferred from this data reify and, in doing so, reinforce the
dynamics of invisibilization and marginalization.
These programs have also become research tools and objects in the humanities and social
sciences, in applications with varying degrees of critical engagement. Recent works have
mobilised them as tools to understand cultural and behavioural trends in large databases,
and also to map differences between image recognition systems, such as biases and
preconceptions crystallised in databases, models and algorithms. Considering the first
case, examples include semantic mapping of cities (Ricci et al., 2017; Rykov et al.,
2016); comparative studies of selfie cultures (Tifentale & Manovich, 2015); visual
persuasion analysis (Hussain, 2017; Joo et al., 2014); description of home styles in
political communication (Anastasopoulos et al., 2016); classification of political
propaganda (Qi et al., 2016); the study of online political engagement modes (Omena,
Rabello & Mintz, 2017); as well as image circulation dynamics (Omena et al., 2019;
D'Andréa & Mintz, 2019; Silva, Barciela & Meirelles, 2018). For the study of the
biases and differences between machine learning systems, approaches focus on their
gender (Hendricks et al., 2018) and race (Buolamwini, 2017) implications. Some of
these studies have resulted in practical outputs such as good practice guides (Osoba
& Welser IV, 2017) or a proposal for carefully designed machine learning databases
(Buolamwini & Gebru, 2018).
Computer Vision APIs: What are they?
Application Programming Interface (API) describes a way of structuring computer
programs that allows their interoperability with other systems. By means of an API, a
computer program can be designed to package certain functions and data resources so
that they can be accessed by an external program. In the context of distributed
computing and web platformisation, APIs enable the public or commercial availability of
computational services and data. In practical terms, this means that thousands of
individuals or companies can use the digital services and data of a supplier company
automatically and in a standardised manner, usually on a pay-per-use basis.
In the area of Computer Vision resources, companies such as IBM, Microsoft, Google,
Amazon and Clarifai stand out, as well as niche providers such as TinEye and Kairos.
Among the functions performed are, for example, image classification,
facial recognition and optical character recognition. Made available in recent years by
large Silicon Valley companies, these APIs now constitute a platform model of computer
vision that has spread with applications in various areas - including research on social
media platforms152.
To illustrate a typical feature of these tools, we can see in Figure 6.1 a demonstration
of the IBM Watson Visual Recognition feature. Using the vendor's pre-trained model, the
system can identify objects and categories such as "fabric", "gray" (the colour),
fabric types, and other entities with varying degrees of confidence. With tens of
thousands of pre-trained labels and classes, the feature can be applied immediately by
developers for different purposes. Among the case studies provided by IBM is the use
of the feature to quickly identify types of vehicle damage, for example.
Figure 6.1. Screenshot of the Watson Visual Recognition demo on the company's official
website.
Computer vision APIs refer to these identified entities as labels, tags or classes,
depending on the provider. But this is only one of the groups
of resources offered. The options available are growing; nevertheless, we can highlight
five other groups: a) automated recognition of text in images, turning the visual
resource into a textual one; b) resources linked to the identification of people and their
characteristics, such as age, gender and facial expressions; c) discovery of equivalent
or similar images on the web, as well as extraction of related information from the
semantic web; d) vertical models, such as recognition of celebrities, tourist spots or
types of food; and e) detection of explicit content, such as violence and pornography,
applied for moderation and content filtering.
152
Alongside these commercial APIs, it is worth highlighting open-source and public image
recognition models, such as those made available by the Keras project (Chollet et al., 2015).
Keras in particular facilitates the development of applications with image recognition models
restricted to the typology of the ImageNet database (Deng et al., 2009), with 1000 pre-trained
image categories.
In Table 6.1 we summarise the main resources made explicitly available by the API
providers Google, IBM and Microsoft, with which we engage more directly in this study. As
may be seen below, some features are unique - Google, for example, integrates its
resources with search engines, allowing both reverse search of the image on the web
and the extraction of information linked to these images by scanning the websites
where they are used.
Feature                        Google   IBM   Microsoft
Labels / Tags / Classes        Yes      Yes   Yes
Semantic Web entities          Yes      No    No
Food classes                   No       Yes   No
Automatic subtitles            No       No    Yes
Explicit content detection     Yes      Yes   Yes
Face detection                 Yes      No    Yes
Facial expressions             Yes      No    No
Celebrities                    No       No    Yes
Touristic spots / Locations    Yes      No    Yes
Gender                         No       Yes   Yes
Age                            No       Yes   Yes
Text recognition               Yes      No    Yes
Text language                  Yes      No    No
Reverse search on the web      Yes      No    No
Table 6.1. Comparison of some of the Computer Vision API features
The resources cited are widely available and employed by commercial organisations and
governmental bodies. Among these applications, collaborations of
large technology companies with US military and repressive projects have resulted in
protests by their employees. In 2018, Google employees staged protests against
the company's use of AI technologies to optimise drone attacks153. In an episode
indicative of the resistance to discussion of artificial intelligence ethics, one of the
employees engaged in the protest, Meredith Whittaker, left the company after being
pressured to distance herself from research in ethics and technology at New York
University154. In 2019, Amazon employees sought to prevent the company from
supporting the immigration agency responsible for tracking undocumented immigrants
in the USA and imprisoning them in institutions analogous to concentration camps in the
border region155.
At the interface with digital research methods, computer vision APIs have been used
for creative research designs as well as for the reflexive critique of these
resources themselves. We can understand these initiatives in relation to Farida Vis's (2013)
proposal to question how data collected or processed through computational resources
such as AI and Big Data are thought about, and to imagine other
possible applications.
Interrogating APIs and stock image websites: absences and hyper-visibilities
Returning to the complex nature of the media ecosystems referred to in the introduction,
the role of commercial image websites must be highlighted. Stock image companies
have existed since the 1920s (Bruhn, 2003), providing photographs to media
companies, publishers and advertising agencies, for use by producers in different
formats and media, such as magazines, newspapers, out-of-home media and
packaging. In the 21st century, with the internet and the falling cost of photographic
production, the microstock format became popular, expanding the size of consumer
and producer networks. Microstock is a model in which stock image
companies become an interface through which photographers and professional or amateur
studios can sell their photographs individually and at low cost to all kinds of
clients.
153
https://www.nytimes.com/2018/06/01/technology/google-pentagon-project-maven.html
154
https://www.theguardian.com/us-news/2019/apr/22/google-mass-protests-employee-retaliation
155
https://www.theguardian.com/us-news/2019/jul/11/amazon-ice-protest-immigrant-tech
The cultural analysis of the representational role played by stock image websites is
relatively recent. Frosh (2001) discusses the different meanings and layers of
references for media producers, buyers and end consumers. A crucial concept evoked
by the author is invisibility: most media consumers are not familiar with the concept
of stock images; they only consume and interpret the visuality in its aesthetic and
representative contents in the final contexts.
This invisibility is also partly present in the academic field. Scientific papers on
the observation and critical analysis of image repositories are scarce, especially when
it comes to digital methods and social computing. Among the most recent
investigations on the production and circulation of image repositories, one can cite the
work of Pritchard and Whiting (2015) on gender and ageing; an article by Giorgia Aiello
and Anna Woodhouse (2016) on gender stereotyping; sprint reports coordinated by
Aiello (2016, 2017) on stock image websites in relation to photojournalism and
gender representations; West's study on gender stereotyping and class invisibility in
images representative of work environments (West, 2018); and a method for
increasing equity in image recommendation systems (Karako & Manggala, 2018). Our
proposal is to contribute to this set of experiments on stock image websites, through
the lens of digital methods and computer vision APIs - while critically investigating
these tools.
The project sought to investigate Computer Vision APIs as search engines while
experimentally applying them to the study of the visual representation of different
countries through the images resulting from the search for their gentilics, or demonyms
(specifically: Brazilians, Portuguese, Nigerians and Austrians). This dual approach is
based on the inflection proposed by Noortje Marres
and David Moats (2015) to the founding principle of symmetry in the field of Social
Studies of Science and Technology (STS). In considerations focused on the study of
online controversies, the authors advocate the need for a symmetrical approach to the
'content' of controversies and the means by which they manifest themselves. In the
case of this study, symmetry occurs between the representations of stock images and
the algorithmic mediations of images in digital networks, also mobilised as a
methodological research tool. To this end, the study is based on the analysis of large visual
datasets, collected from stock image websites (Shutterstock and Adobe Stock), with
the aim of deriving descriptions and identifying patterns of comparison and, at the
same time, developing considerations on the potentials and limits of this
methodological application.
Thus, the delimitation of research questions pertinent to computer vision technology
and, at the same time, to stock images, converged on the following questions:
• What are the differences between computer vision API providers?
• How do computer vision APIs "understand" the same photos?
• How are the ontologies of each API's labels distinguished?
• Can we investigate national representations with computer vision tools?
• How do stock images represent the visuality of the analysed countries?
• How are cultural specificities made visible through the use of computer vision APIs?
To answer these questions, we developed a protocol and analysis based on mixed
methods. The protocol began with data collection from image repositories, proceeded to
the processing of these images by different computer vision APIs and ended with
the analysis of this data through a variety of approaches and techniques aimed at
different aspects of the questions posed. These specificities will be made explicit
throughout the analysis.
The research protocol yields a diversity of viewpoints that allows the isolation of
variables for, at least, three axes of comparative research: between computer vision
APIs, stock image websites and nationalities (see Figure 6.2). In this study we have
chosen to focus on computer vision APIs, as already highlighted, and the different
nationalities, given our interest in the cultural specificities of APIs. Different image
banks are useful for diversifying the biases associated with each nationality across
different sources; however, this aspect is not central to the scope of the study. Therefore,
image banks are mobilised in an undifferentiated way in the analysis.
Figure 6.2. Comparative axes of analysis
The data collection was performed through a scraping technique, where the source
code of the image banks' web pages was processed in order to find the images' URLs.
The images in their preview versions (thumbnails) were then downloaded for local
processing. We used a Python script developed specifically for this purpose (Mintz,
2019). For every nationality we collected 2000 images from each stock image website
(see Figure 6.3).
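The collection step described above can be illustrated with a minimal sketch. It is not the script used in the study (Mintz, 2019), whose internals are not reproduced here; it only assumes that thumbnails appear as `<img>` tags in the page source. The class name, the function name and the sample markup are hypothetical, and real stock image pages may require different selectors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageURLParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the address of the page.
            self.urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url, limit=2000):
    """Return up to `limit` unique image URLs, in page order."""
    parser = ImageURLParser(base_url)
    parser.feed(html)
    unique = list(dict.fromkeys(parser.urls))  # de-duplicate, keep order
    return unique[:limit]


# Hypothetical markup standing in for a search results page.
sample_page = """
<div class="results">
  <img src="/thumbs/a.jpg" alt="first">
  <img src="https://cdn.example.org/thumbs/b.jpg">
  <img src="/thumbs/a.jpg">
  <img alt="no source">
</div>
"""
thumbnail_urls = extract_image_urls(sample_page, "https://stock.example.org/search")
```

The `limit` default mirrors the 2000 images collected per nationality and website; downloading the files themselves is left out of the sketch.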
As mentioned in the previous section, three providers of computer vision were selected
for the study: Google Cloud Vision API, Microsoft Azure Computer Vision API and
IBM Watson Visual Recognition API (see Table 6.1). In the scope of the study, images
were processed through their respective modules that provide descriptive tags or labels:
Label Detection for Google, Image Classification for IBM, and Tags for Microsoft.
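To make the notion of a labelling module more concrete, the sketch below builds the body of a Label Detection request and reads labels from a response. The field names follow Google's public REST documentation for the Cloud Vision `images:annotate` method as we understand it; the image address and the response are invented for illustration, no request is actually sent, and the IBM and Microsoft services use different request formats.

```python
import json

ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_url, max_results=10):
    """Build the JSON body for one LABEL_DETECTION request."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_url}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def parse_labels(response_body):
    """Return (description, score) pairs from a response body."""
    annotations = response_body["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]


body = build_label_request("https://stock.example.org/thumbs/a.jpg")
payload = json.dumps(body)  # what would be POSTed to ENDPOINT with an API key

# Illustrative response, not real API output.
fake_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Food", "score": 0.97},
                {"description": "Custard tart", "score": 0.88},
            ]
        }
    ]
}
labels = parse_labels(fake_response)
```

Each label arrives with a confidence score, which is why downstream analyses can either keep every label returned or apply a threshold.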
The images were submitted in batch to the APIs using a Python script called Data
Inspector (Guerin Jr. & Silva, 2018), in turn based on the Memespector script. The
Memespector script was originally developed in PHP by Bernhard Rieder (Rieder, Den
Tex & Mintz, 2018) and then adapted to Python and expanded by André Mintz
(2018b). The operating logic is common to all these programs: a list of images
(composed of URLs and other tabular data in CSV format) is taken as input, and each
image is then loaded and processed through the APIs, returning annotated tabular data
(in CSV format) and relational data, or graphs (in GEXF format) as a result. In this
way, the resulting processed datasets could be interrogated through statistical, corpus
linguistics, social network analysis and visual critical-exploratory analysis techniques.
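The tabular half of this operating logic can be sketched as follows. This is an illustration under stated assumptions, not the code of Memespector or Data Inspector: `annotate` is a hypothetical stand-in that returns canned labels instead of calling an API, and the column names `image_url` and `labels` are our own.

```python
import csv
import io


def annotate(image_url):
    """Hypothetical stand-in for an API call; returns a list of labels."""
    canned = {
        "https://example.org/1.jpg": ["food", "dessert"],
        "https://example.org/2.jpg": ["sky", "mountain"],
    }
    return canned.get(image_url, [])


def annotate_table(csv_text, url_column="image_url"):
    """Read a CSV of images, add a `labels` column, return CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames + ["labels"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Several labels per image are packed into one cell.
        row["labels"] = ";".join(annotate(row[url_column]))
        writer.writerow(row)
    return out.getvalue()


input_csv = (
    "id,image_url\n"
    "1,https://example.org/1.jpg\n"
    "2,https://example.org/2.jpg\n"
)
annotated_csv = annotate_table(input_csv)
```

Keeping the original columns and appending the annotations means the output can still be joined back to any metadata collected with the images.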
Among the results, we want to group observations into two points, respectively linked
to (hyper-)visibility and absence practices.
Figure 6.3. Data collection protocol
Granularity and standardisation in the semantic spaces of APIs
As seen in the example illustrated in Figure 6.1, each image processed by the computer
vision resources is annotated by the system with terms that refer to objects,
states, qualities or characteristics. A first challenge, which speaks significantly to our
research questions, is the fact that the providers of these resources do not share the
complete list of possible tags. Thus, it is not possible to assess the ability of a
resource to understand a certain semantic group before performing empirical tests.
API                                    Austrian   Brazilian   Nigerian   Portuguese
Microsoft Azure Computer Vision API    317        561         485        501
IBM Watson Visual Recognition API      1,632      2,044       1,846      1,991
Google Cloud Vision API                2,037      2,170       1,145      1,992
Table 6.2. Number of tags assigned by each computer vision API
to each of the studied datasets (n = 4,000).
Considering that an essentialist understanding is, by definition, standardising and
simplifying, a first question is: to what extent do computer vision providers offer an
effectively relevant number of tags that approximate to a certain degree the complexity
of the scenes photographed? In Table 6.2 we can see the number of unique tags assigned
to each dataset, comparing providers and datasets from each country. The most
significant difference is in the number of Microsoft tags, which falls far behind that of
IBM and Google. In addition to this quantitative aspect, it also seems relevant to consider
the relationships established between the tags and the types of clusters formed by their
co-occurrence.
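A count of unique tags of the kind reported in Table 6.2 can be obtained along the following lines. The sketch assumes that annotations are available as (dataset, API, image, label) records and that labels are compared case-insensitively; both are our assumptions, and the sample records are invented.

```python
from collections import defaultdict

# Hypothetical annotation records: (dataset, api, image_id, label)
annotations = [
    ("Portuguese", "google", "img1", "custard tart"),
    ("Portuguese", "google", "img2", "Custard tart"),
    ("Portuguese", "google", "img2", "tile"),
    ("Portuguese", "microsoft", "img1", "donut"),
    ("Austrian", "google", "img3", "mountain"),
]


def unique_tag_counts(rows):
    """Count distinct labels per (dataset, api) pair."""
    seen = defaultdict(set)
    for dataset, api, _image_id, label in rows:
        seen[(dataset, api)].add(label.lower())
    return {key: len(labels) for key, labels in seen.items()}


counts = unique_tag_counts(annotations)
```

Because the count is of distinct labels rather than of label assignments, a label applied to hundreds of images still contributes only once.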
Figure 6.4. Comparative matrix of image-label bimodal networks
for each country and each API.
Figure 6.4 shows a set of views that helps to understand the topology of the
relationships between labels assigned by APIs for each analysed dataset. We refer to
this topology as semantic spaces. In the comparative matrix, the networks for each dataset
(columns) were initially composed by merging the labels of the three APIs into the
same network, which was then submitted to the ForceAtlas2 spatialisation algorithm.
Afterwards, to compose the matrix, this combined network was filtered for each API, so
that it was possible to compare the distribution of labels and their relationships within
this joint spatialisation. Hence, it is evident how the low number of tags of Microsoft's
API is reflected in densely connected networks with little topological differentiation
between clusters. It is also noticeable how, although IBM and Google return
quantitatively comparable numbers of tags, the IBM API traces a more undifferentiated
semantic space than Google's. In the case of the latter, the density of connections is
less diffusely distributed across the network and tends to condense into specific
clusters, suggesting a greater degree of specialisation in labelling. We will see
next how this general feature plays out in specific cases.
Seeking to qualitatively understand the APIs' sensitivity to culturally specific images,
the analysis included the examination of small groups or individual images and how
APIs label them. While the analysis ultimately unfolds into the evaluation of the labels
applied to the image, in the light of the contextual knowledge of the countries and
cultures being represented, this approach involves the challenge of delimiting, from
the large datasets, these specific and quantitatively smaller cases. Filtering specific
label occurrences allows the analysis of small groups of images linked by related
labels, through mixed exploratory methods.
The bimodal networks of images and labels (which we will see in detail in the
following section) served as exploratory devices that allow the examination of
emerging visual patterns between the datasets, as well as possible misclassifications
or exceptions. The Google Sheets spreadsheet software was also used due to its image
preview functionality, through its 'IMAGE' formula. Consequently, it became possible
to filter the datasets according to particular tags, in order to scrutinise the images to
which they were applied by the API. This redirection of the analysis can be
likened to reverse engineering the API, since by taking the tags as a starting
point, it is possible to infer aspects of the training data that fed the algorithms and their
possible biases.
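The filtering step can be sketched in a few lines. The record structure and function names below are hypothetical; the only external element is the spreadsheet `IMAGE` formula mentioned above, which takes the address of an image and renders it inside a cell.

```python
def images_with_label(rows, label):
    """Return URLs of images that received `label` (case-insensitive)."""
    wanted = label.lower()
    return [
        row["image_url"]
        for row in rows
        if wanted in (name.lower() for name in row["labels"])
    ]


def sheet_preview_formula(url):
    """Spreadsheet cell content that renders the image at `url`."""
    return f'=IMAGE("{url}")'


# Hypothetical annotated records.
rows = [
    {"image_url": "https://example.org/1.jpg", "labels": ["Wig", "Smile"]},
    {"image_url": "https://example.org/2.jpg", "labels": ["Turban", "Tradition"]},
    {"image_url": "https://example.org/3.jpg", "labels": ["wig", "Fashion accessory"]},
]
wig_images = images_with_label(rows, "wig")
preview_cells = [sheet_preview_formula(url) for url in wig_images]
```

Pasting the generated cells next to the label columns gives a sheet in which every image tagged with a given label can be inspected at a glance.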
Identifying food, dishes and cooking is something that computer vision providers
include and, as we will see, are able to accomplish to a large extent. But is the
recognition of regional specificities sufficient? As one example, we can take the Pastel
de Nata (see Figure 6.5), a famous Portuguese pastry that appears abundantly in the
dataset related to the country. The labels attributed to a group of three images
of Pastel de Nata demonstrate the low degree of cultural specificity of the
APIs. In view of the results, it is clear that Google presented the most
accurate results, even assigning the label "pastel".
Moreover, it also identified "custard tart", a generic name for the recipe that also
relates to similar pastries from England and France. IBM Watson assigned labels like
"brioche" and "Yorkshire pudding", which are not the same pastry but have a similar
shape to the Pastel de Nata. The Microsoft API used the descriptions "doughnut" and
"donut", which are very different from the Portuguese pastry and typically related to
American culture.
Figure 6.5. Labels assigned to three images of pastéis de nata by APIs.
Non-food labels were filtered for easier reading.
Another topic addressed in the study concerns how computer vision APIs label
non-white phenotypic characteristics and non-western accessories. The cases in question
emerge from an exploratory assessment of the representation of indigenous and black
populations in the datasets of photos of Brazilians and Nigerians and their labelling by
APIs. While browsing photos of black people, we noticed how
hair and accessories were tagged, particularly by the Google Cloud Vision API.
We found that the "wig" label was consistently assigned to black women with frizzy
hair or wearing turbans, in addition to the "Dastar" label on turbans of Brazilian women
from the Bahia region, when, in fact, the Dastar is a specifically Indian turban worn by
followers of Sikhism. Additionally, in regard to turbans, the label "fashion accessory" was
often attributed to clothing in the Nigerian dataset. This, in a way, flattens the
religious and traditional symbolic charge carried by the items. In the case of
Brazilian Bahian turbans, the "Tradition" and "Ceremony" labels appeared more
frequently, bringing them closer to religious characteristics.
Figure 6.6. Labels attributed to images of black women with frizzy hair and wearing turbans.
This analysis gives us some clues about the cultural limitations of the APIs' labels
and leads us to questions such as which variations of hair types
are included in the API and why there is no "frizzy hair" label for pictures of black
women. This duality between visibility and hyper-visibility has been explored by
researchers of the implications of AI technologies and computer vision, in particular
for racial and gender issues (Buolamwini, 2017; Buolamwini & Gebru, 2018; Noble,
2018). One aspect highlighted by the work of these researchers is how these
technologies should be understood in view of the context in which they operate,
reflecting not only biases contained in the datasets, but also forms of exclusion that
have conditioned the constitution of both datasets and software, including the lack of
diversity in development teams.
The idea of post-colonial computing was reviewed by Ali (2016) as he discusses how
the examination of cultural and power relations in computing, human-computer
interaction and information and communication technologies has been undertaken.
The decolonial turn (Quijano, 2010), however, assumes that although the
traditional designs of colonialism have been superseded, "an ongoing legacy of
colonialism in contemporary societies in the form of social discrimination" and
"practices and legacies of European colonialism in social orders and forms of
knowledge" persist (Ali, 2016, p. 4). Considering the abyssal gap between
corporations like Google, IBM and Microsoft on the one hand, and developers and
communicators on the peripheries on the other, in terms of the potential for creating
databases and computer vision ontologies, the pressures of cost-benefit, effectiveness
and network effects tend to concentrate projects around those providers. The practices
of standardisation of content and tools would be at the core of computer science itself,
which, by its nature, "forces one to assume a perspective of the positivist paradigm, in
addition to its own empirical analytical approach to experimental sciences"156 (Portilla,
2013, p. 98).
Networks of semantic spaces and typicality
In order to analyse both the labelling performed by each API and the visual
characteristics of the data addressed, the study used visualisations of bimodal networks,
treating the assignment of labels to images as relational data. This is a
common approach in similar research and is implemented in the Python version of the
Memespector script (Mintz, 2018b). Image-label networks are generated by
considering images and descriptive tags as two types of nodes. Edges represent the
assignment of a label to an image.
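The construction of such a network can be sketched as follows, treating images and labels as two node types and each label assignment as an edge between them. This is not the Memespector code: the node identifiers are our own convention, and the export writes a deliberately minimal file following our reading of the GEXF 1.2 draft format, which has not been checked against every Gephi version.

```python
import xml.etree.ElementTree as ET


def build_bimodal_network(assignments):
    """Build nodes and edges from (image_id, label) pairs."""
    nodes = {}  # node id -> node type ("image" or "label")
    edges = set()
    for image_id, label in assignments:
        image_node, label_node = f"image:{image_id}", f"label:{label}"
        nodes[image_node] = "image"
        nodes[label_node] = "label"
        edges.add((image_node, label_node))
    return nodes, sorted(edges)


def to_gexf(nodes, edges):
    """Serialise the network as a minimal GEXF string."""
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute", id="0", title="type", type="string")
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, node_type in nodes.items():
        name = node_id.split(":", 1)[1]
        node = ET.SubElement(nodes_el, "node", id=node_id, label=name)
        values = ET.SubElement(node, "attvalues")
        ET.SubElement(values, "attvalue", {"for": "0", "value": node_type})
    edges_el = ET.SubElement(graph, "edges")
    for i, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i), source=source, target=target)
    return ET.tostring(gexf, encoding="unicode")


# Hypothetical label assignments.
assignments = [("1", "food"), ("2", "food"), ("1", "pastry")]
nodes, edges = build_bimodal_network(assignments)
gexf_text = to_gexf(nodes, edges)
```

Storing the node type as an attribute is what later allows images and labels to be filtered or coloured separately in the network software.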
This data representation allows its processing and visualisation through software such
as Gephi (Gephi, 2017), which provides several analysis tools, from layout algorithms
to statistical modules based on graph theory. For this
study, the analysis relied mainly on visual network analysis (Grandjean &
Jacomy, 2019; Venturini, Jacomy & Jensen, 2019), which focuses on describing
properties of datasets according to the networks' topological traits based on the nodes'
position and size. ForceAtlas2 (Jacomy et al., 2014) was the main layout algorithm
used in network spatialisation, and modularity calculation (Blondel et al., 2008) was
applied to identify the main clusters.
156
Original: “obliga a asumir una mirada desde el paradigma positivista, dado además su enfoque
empírico analítico propio de las ciencias experimentales”. Author’s translation.
Modularity divides the network into sections according to their possible community
structure, assigning codes to each partition. Depending on the data represented, these
partitions may lead the researcher to discover significant clusters, such as thematic or
geographic groups or general semantic concepts. The exploration of network
visualisations was carried out in printed formats combined with their visualisation on
the screens for the identification of clusters. In addition to the graphical visualisation
of networks, we used versions in which the images themselves were plotted at the
position of their corresponding node, using the Image Network Plotter script (Mintz,
2018a). This representation facilitates the transit between the visual aspect of the
images and the relative position they occupy in the network, according to the
computational reading performed by the APIs (see Figure 6.7).
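For reference, the quantity that the modularity calculation seeks to maximise can be written as below. The notation is the standard one associated with Blondel et al. (2008) and is given here as a reminder rather than as a description of any particular software implementation.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the sum of the weights of the edges attached to node $i$, $m = \frac{1}{2} \sum_{i,j} A_{ij}$ is the total edge weight, $c_i$ is the community to which node $i$ is assigned, and $\delta(c_i, c_j)$ equals 1 when the two nodes share a community and 0 otherwise. Higher values of $Q$ indicate partitions with more edges inside communities than would be expected if edges were placed at random with the same node degrees.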
Figure 6.7. Representations of the image-label network generated from the same dataset: on
the left, the relationships between the labels/tags; on the right, the images spatially
arranged according to the labels they share in the bimodal network.
The analysis needs to consider interrelated aspects of the analysed datasets and of the API
used for the analysis. Force-directed layout algorithms such as ForceAtlas2 work
under assumptions of power-law distributions and preferential attachment (Jacomy et al., 2014),
implying a very particular logic of reading the semantic space of the computer vision
APIs. Importantly, the analysed bimodal networks are formed by the confluence
between the tagging logic and the singular configuration of the visual datasets
approached. Therefore, the mode of spatial distribution of images and labels, from the
centre to the periphery of the networks, seems to be the result of at least three factors:
a) the generality or specificity of the labels; b) the variety of objects defined by the
scope of the dataset and their topical specificity; and c) the topological characteristics
of the semantic space resulting from the associations between labels in each service.
For example, in Figure 6.7 we see quite particular clusters positioned on the periphery
of the networks as a result of the Google API tagging: musical instruments, dog types,
and food. The formation of these clusters is a result of both the prominence of these
topics in the data and the high degree of granularity of the Google API when describing
them. We found quite specific tags for dogs, such as 'spanish water dog', 'lagotto
romagnolo' and 'serra de aires dog', for example.
Considering these analysis parameters, the networks between images and tags allowed
us to identify topical categories within each set of visual data, and the relative
predominance of each was taken as an indicator of the typicality of each national
representation (see Figure 6.8). By analysing these networks side by side, it was
possible to define common topical categories that can be observed in the various
datasets, given their presence and relative prominence in all cases. These were:
"Nature", "Food" and "People". Additionally, through this analysis, a unique category
was also identified as emerging in each set: "carnival" for Brazilian; "azulejos" (tiles) for
Portuguese; "cidade" (city) for Austrian; and "dinheiro" (money) for Nigerian.
The network views presented in Figure 6.8 allow us to understand the different degrees
of specialisation of labeling by each API. In this figure, in particular, one can also
observe how the clusters formed by the labelling serve as indicators of the visualities
constituted in the image banks for each nationality.
The categories shared among the countries form communities of variable dimensions
in each case, but in a relatively balanced way in most of them. The Austrian case stands
out, however, because its natureza (nature) cluster is significantly
larger than in the others. In the Portuguese case, the pessoas (people) cluster is
comparatively smaller than in the other cases addressed, while the country-specific
cluster, related to azulejos (tiles), is quite pronounced. In the Brazilian and Nigerian
cases, the food and people clusters were the most pronounced.
Figure 6.8. Comparative matrix of image-label networks according to the gentilic and
computer vision API.
Going to the level of the labels themselves, some of these general perceptions
become more specific around some of the more frequent terms in each case. Figure 6.9
shows the 10 most frequent terms for each nationality dataset, according to the Google
API. The frequency in each case is compared with the frequency of the same term in the
other nationalities. Note the already mentioned exceptionality of the figurations in the
Austrian dataset, the only case where the most frequent tags/labels are not related to
food but to landscape categories - "mountain", "natural land", "sky" etc. The
Portuguese dataset brings terms related to tile images after those related to food -
"design" and "textile". Both Brazilian and Nigerian datasets bring tags related to
people after those related to food - "fun" and "smile".
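A comparison of this kind can be sketched as follows. The label lists are invented, and the use of relative frequencies (each label's share of all label occurrences in a dataset) is our own normalisation choice, made so that datasets of different sizes remain comparable; the figure itself may rely on raw counts.

```python
from collections import Counter

# Hypothetical label occurrences per nationality dataset.
labels_by_dataset = {
    "Austrian": ["mountain", "sky", "mountain", "food"],
    "Portuguese": ["food", "food", "tile", "sky"],
}


def relative_frequencies(labels):
    """Share of label occurrences taken by each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def top_labels_compared(data, dataset, k=10):
    """Top-k labels of `dataset` with their frequency in every dataset."""
    freqs = {name: relative_frequencies(labels) for name, labels in data.items()}
    top = [label for label, _ in Counter(data[dataset]).most_common(k)]
    return {
        label: {name: freqs[name].get(label, 0.0) for name in data}
        for label in top
    }


comparison = top_labels_compared(labels_by_dataset, "Austrian", k=2)
```

A label that ranks high in one dataset and is absent from the others, such as "mountain" in this toy example, is the kind of pattern that the side-by-side comparison is meant to surface.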
In marketing and consumer studies, the topic of typicality is approached from
cognitive psychology studies as a parameter of a product or brand's strength (Loken &
Ward, 1990). The notion is understood as the strength of attachment of an individual
instance to a general category. Or, in another way, as the measure of how exemplary
227
an instance is in understanding a category. In the case of this study, we could say the
typicality of a figuration is to what extent it would be representative of a nationality in
the context of image banks and the media products that use them. The typicality, in
this context, would be related to the frequency with which certain figurations appear
in the search for the gentilic of each country. Bruhn (2003) highlights the allegorical
character of stock images, which seems to aim at the representation of categories.
However, this factor needs to be combined with the algorithmic mediations mobilised
in the analysis, allowing the grouping of these figurations. Considering the operation
of machine learning, there would be a particular dynamic of typicity conformation of
a category in the training process by examples. Factors that contribute to the
constitution of the category involve the frequency of linking certain figurations and
the visual attributes shared between these figurations; aspects that are ultimately
inaccessible because the technologies are closed off behind APIs, and that relate to the constitution of
their training bases and neural network architecture.
Figure 6.9. Comparison of the 10 most frequent labels per gentilic.
Among the indications of Ali (2016) for the adoption of a decolonial perspective in
computing is, at a minimum, the consideration of the geopolitics and body politics involved in
the different engagements of production and thinking about computing. Drawing,
building, researching or theorising on phenomena cannot be done without looking at
the power relations inherent in the concentration of ownership, development and sale
of technologies in the USA and Silicon Valley, for example. Aspects of the typicality
observed in the study seem to relate to this aspect. The bias reproduced by the
technologies in question is linked to the geopolitical position of these companies and
largely defines the technological mediations of contemporary digital communication.
The production of codes, datasets and interfaces follows the logic of "mirroring"
through which producers think of users similar to themselves (Haas, 2012).
In line with the perspective of digital methods, as mentioned above, the mobilisation
of these tools as analysis devices must be combined with their continuous critical
consideration as part of the study. Critical and public scrutiny of AI systems,
automation, indexing and categorisation of content may generate remedies that
minimise the problems presented (Raji & Buolamwini, 2019). The centrality assumed
by computer vision APIs in the interpretation of visual data should, therefore, be
considered critically.
Conclusion
Studies of large corpora on digital platforms have focused on quantifiable aspects of
the environment or on the interfaces themselves (Laestadius, 2017). Mixed methods
such as our visual network analysis approach enable researchers to constantly change
between levels of visualisation and data exploration. Network visualisations can
generate unique insights through the spatialisation and modularity of tens of thousands
of images. At the same time, these networks allow filtering specific instances related
to a theme or context.
We discovered patterns about what is commonly related to different countries through
their gentilic. Food, Nature and People emerge as salient categories related to each of
the four countries. Since all of these categories are related to concepts usually
perceived as linked to places and cultures, as well as to the visuality of image banks,
we consider that this indicates the suitability of the methods used for the comparison
between countries.
At the same time, the unique characteristics pointed to particular discoveries. In each
country a specific theme has emerged, linked to its culture and tourist strategies which,
therefore, creates demand for image banks. While positive and negative stereotypes
played a relevant role in these categories (such as "Carnival" for Brazil), the offer of
albums by photographers and studios may have biased some results. In the case of
Nigeria and Austria, for example, prolific albums directed the dataset (and its images
and labels/tags) to heighten their themes. This highlights an important variable not
included in the study: the ratio between the number of platform images under study
and the number of producers. That is, in our project, if fewer content producers use a
certain tag, each producer has a relatively greater influence on the analysis. To move forward
on this issue, future studies may compare peripheral countries with hegemonic
countries in the global media industries, such as the USA, the UK and Japan.
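The variable left out of the study can be approximated with a short Python sketch. It assumes that a producer identifier is available for each image; the function `producer_concentration` and the producer names are hypothetical and do not model the metadata of any stock image site.

```python
from collections import Counter


def producer_concentration(image_producers):
    """Return (images per producer, share of images by the top producer).

    image_producers holds one producer identifier per image in a dataset.
    """
    per_producer = Counter(image_producers)
    total = sum(per_producer.values())
    ratio = total / len(per_producer)
    top_share = per_producer.most_common(1)[0][1] / total
    return ratio, top_share


# Hypothetical toy data: eight images, six of them from a single studio.
ratio, top_share = producer_concentration(
    ["studio_a"] * 6 + ["photo_b", "photo_c"]
)
```

A high share for the most prolific producer would flag datasets, such as those described for Nigeria and Austria, in which a few albums steer the labels.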
About the computer vision providers, the study sheds light on their differences,
limitations, and ways of reappropriating them to explore culture and representations
embedded in digital platforms with digital methods. Commercial services such as
Google Cloud Vision API, Microsoft Azure Computer Vision API and IBM Watson
Visual Recognition are considered "black boxes" (Buolamwini & Gebru, 2018; Latour,
2001; Pasquale, 2016), demanding auditing methods for assessing their mode of
operation, accuracy or coverage. Neither the list of labels nor their total number is
disclosed by the providers. We hope that studies such as ours add to the field and
help other researchers interested in using computer vision APIs for social research.
In addition, the different providers do not overlap or equate in understanding complex
cultural data, such as images, so using them uncritically can be a methodological
problem. Some of the procedures performed in this study (such as the clustering
dictionary) show the feasibility of combining two or more providers by merging
annotated datasets and networks. Future experiments with other providers are
recommended to expand the scope of the findings.
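As a rough illustration of what merging annotated datasets from several providers can involve, the Python sketch below combines per-image labels and records which provider produced each one. It is not the clustering dictionary used in the study: the function `merge_annotations`, the provider keys and the sample labels are assumptions made for the example.

```python
def merge_annotations(*provider_annotations):
    """Merge (provider, {image_id: [labels]}) pairs into
    {image_id: {label: set_of_providers}}, normalising label case."""
    merged = {}
    for provider, annotations in provider_annotations:
        for image_id, labels in annotations.items():
            slot = merged.setdefault(image_id, {})
            for label in labels:
                slot.setdefault(label.strip().lower(), set()).add(provider)
    return merged


# Hypothetical outputs of two providers for the same image collection.
google = {"img1": ["Food", "Dish"], "img2": ["Sky"]}
microsoft = {"img1": ["food", "plate"]}

merged = merge_annotations(("google", google), ("microsoft", microsoft))

# Labels on which more than one provider agrees, per image.
shared = {
    image_id: sorted(label for label, who in labels.items() if len(who) > 1)
    for image_id, labels in merged.items()
}
```

Keeping the provider attached to every label makes the lack of overlap between providers visible rather than hiding it in a single merged vocabulary.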
Experts on the researched topics can direct multifaceted investigations on the datasets.
International comparisons with international teams can advance understanding of both
method (computer vision) and object (image banks).
Finally, cultural stereotypes and "typicities" about a country and its populations are
reproduced in content providers; therefore, understanding the ways in which this
happens is important for fairer media ecosystems. Image bank providers are important
to advertising agencies, public relations and publishing companies around the world
and the microstock business model extends its impacts also to small and medium
enterprises and public institutions, making understanding their visual cultures and
productive routines especially important due to their pervasiveness.
TECHNICITY OF THE MEDIUMS IN DIGITAL METHODS
CONCLUSION
Summary
This dissertation has addressed the role of technical knowledge, practice and expertise
both as a problem and as a solution, in a variety of digital methods. The computational
mediums (and respective regimes of functioning), the web environments and technical
procedures were taken as key elements in the practice of digital methods. By
demonstrating how technicity influences the ways we generate, present and legitimise
knowledge in digital research, I have argued that the practice of digital methods is
enhanced when researchers make room for, grow and establish a sensitivity to the
technicity of the computational mediums. To substantiate my argument, I proposed a
reflection on the intersections between digital methods, technicity and digital
fieldwork, while mobilising the concept of technicity-of-the-mediums and discussing
three crucial aspects of the digital methods approach.
This thesis investigates what it is like to design and implement research with a digital
methods approach by making clear that this approach requires researchers to develop
a mind-frame that accounts for, investigates and re-purposes technological grammar
for social enquiry. This dissertation shows that repurposing digital media and data for
social and medium research is a challenging activity that demands extra effort. Using
digital methods is not about enquiring technologies from the outside in (see Marres,
2017), but about understanding how to work with socio-technical assemblages and
how to think along with a network of methods. This is what it takes to put research
strategies and methods into action. It requires taking extra care when analysing
grammatised actions, that “have not been created by or for the social sciences”
(Venturini et al., 2018, p. 4). The question of what extra efforts mean in practical
terms has been answered, first, by exposing the many challenges in digital methods
(chapter 1) and, then, proposing ways to defuse some of the difficulties related to the
use of these methods (chapters 2 and 3).
To describe how the notion of technicity matters for and contributes to digital research,
this dissertation introduces the concept of technicity-of-the-mediums which is
mobilised in a series of case studies (chapters 4, 5 and 6), resulting in specific
methodological approaches. These provide new analytical perspectives for social
media research and digital network studies, as I will argue in the following sections.
With regard to what extent digital methods can be considered a type of fieldwork, this
dissertation has presented a technical understanding of the web environments (from a
methodological standpoint), while proposing and discussing the role of three distinct
but related aspects to be considered when doing digital methods: software affordances,
platforms’ cultures of use and grammatisation (chapter 3).
Below is a summary of conceptual and practical contributions:
§
The expression computational (technical) mediums stands for research software,
digital platforms and associated algorithmic techniques, referring to media not only as
communication platforms but also as mediating devices, which demand, as a
consequence, as much attention as the contents or the objects of our research.
§
The concept of technicity-of-the-mediums serves as an invitation to become
acquainted with the computational mediums in the practice of digital methods. It is
related to the relationship among the computational mediums, the fieldwork and the
researcher(s) and her/his object of study, thus demanding iterative and navigational
technical practices.
§
Three pillars of the digital methods approach refer to the awareness of a triangular
relationship between software affordances, platforms’ cultures of use and their
grammatisation. While suggesting a way of carrying out digital fieldwork, this
proposal highlights some of the difficulties related to the use of digital methods, i.e.
the need to care about the specificities of the medium and data.
§
The notion of second order of grammatisation follows the creation of new
methodological grammars by researchers when using digital methods. It is a second
order because it is based on existing technological grammars such as what is captured
by platforms (units of expression and communication), what is afforded by software
(e.g. force-directed algorithms, metrics like network diameter and modularity) or by
the outputs of other computational mediums such as computer vision APIs. The final
results (about the research object and content) entirely represent neither platform
grammars nor the affordances of (research) software, but a mix of both and of the
researcher's agency.
Overall, it may be concluded that technical knowledge, technical practices and
technical imagination can concretely enhance the design and implementation of
research with digital methods.
Part 1: For a technical culture of knowledge in digital research
Chapter 1 explains in different ways that, in the practice of digital methods, researchers
should develop a digital methods’ mind-frame, being able to stay in it and then keep
working in a piece of research that accounts for, investigates and re-purposes
technological grammar and digital records for social enquiry. To that end, it is
important to consider the computational mediums (e.g. APIs, extraction and analysis
software) not only as a way to perform a particular piece of work (as instruments) but
also as active participants in each stage of the research, because they add technical
substance to the object of study while re-adjusting and re-shaping it. In chapter 1, I am
therefore suggesting a broader definition of medium that encompasses but also exceeds
that of medium of communication. Computational (or technical) mediums (e.g.
software or digital tools) have a proper domain of being and meaning (see Rieder,
2020). Consequently, researchers should not only look carefully at the thick layers of
technical mediation (inherent to the methods) but recognise the forms of practices and
modes of operation held within software, web apps or APIs as “a medium of expressing
a will and a means to know” (Rieder & Röhle, 2018, p. 123; see also Berry, 2011;
Rieder, 2020).
To unfold this argument, I used a network of the followers-of-the-followers of an
Instagram bot profile (Mary__loo025) to call attention to the role of technical
knowledge in the practice of digital methods. Through this example, I have also
demonstrated the importance of devoting some time to the art of querying as well as
the craft of becoming close to computational mediums. Here is where the concept of
technicity-of-the-mediums can help. Chapter 2 thus presents three attempts to
understand technicity from the lens of media studies, of a philosophical perspective
and of digital methods.
After reviewing why and in which ways technicity matters in media studies, I
identified different efforts to understand and use this concept: from giving attention to
the processes through which people connect “through techniques, technologies, and
dynamic traditions of practice” (Crogan & Kennedy, 2009, p. 109) to the focus on how
the domains of knowledge and transformative practices relate to technicity (Dodge &
Kitchin, 2005; Kitchin & Dodge, 2007; Niederer & Dijck, 2010). Although, clearly,
technicity is a complex and compound concept, which demands a close relationship
with software through technical knowledge and practices, technicity in media studies
is not generally referenced as a means to (re)think the way we design and implement
digital methods. The literature review indicated how the notion of technicity has been
used for purposes such as taking into account identity formation, understanding how
content is managed by digital platforms or studying participation in social media. In
this dissertation, I argue that technicity can be used to (re)think the design and
implementation of methods. In chapter 2, another attempt to understand technicity
follows Gilbert Simondon’s philosophical reflections and tries to connect them to the
practice of digital methods – which has been one of the greatest challenges of this
dissertation. While recognising that such a proposal deserves further development, I
have tried my best to import the concept of technicity into the context of digital methods.
This dissertation concludes that an understanding of technicity entails an
awareness of technical mediums at different levels, in particular when
these are, in Simondon’s terms, individuals (software) and elements (e.g. machine
learning models of vision APIs, algorithmic techniques). Researchers are thus required
to be acquainted with the former, while knowing when and why to value the latter in
the research design. This means they also should know how to use the potentials,
practical qualities and meanings of technical mediums for the research purposes. This
distinction is crucial to help researchers:
§
To think with and repurpose (if necessary) technical mediums; learning to value
elements in the full range of digital methods while keeping in mind the way in which
they add new meanings to the subject of analysis.
§
To devise research designs with digital methods and decide what pieces of software
should be stacked in a methodological chain to solve research problems, pushing
researchers towards orders of technical-practical thought.
§
To implement digital methods through a technical ensemble composed of the medium
under investigation and the tools used to investigate it.
§
To enable researchers to discover new arrangements when doing digital methods,
using a technical imagination to create methodological solutions or raise new
questions.
In this thesis, I have also suggested that researchers can think of digital methods as
technical ensembles, connecting the practical qualities and potentials of computational
mediums to the research purposes. Moreover, they should invoke their technical
imagination to better combine software inputs and outputs and to accomplish the
activities that compose a digital methods protocol.
Chapter 2 also provides a description of the process of building/interpreting a computer
vision-based network to illustrate more clearly the mental and practical modes of what
I am calling the technicity-of-the-mediums in digital methods. A first lesson drawn
from this is that the subject of study is never only the content of such a network but also the
technicity of the tool used to produce it. We need to move beyond discussions about
how to get data or how to make beautiful network visualisations and start making room
for what precedes the network visualisation. Central elements for this purpose are
technical knowledge but also a good understanding of the object of study within the
web environment.
A second lesson refers to how a computer vision-based network can offer insights
impossible to achieve through traditional research practices, but only through technical
practices and imagination. For instance, in the making of the network, we created
specific node attributes (with a spreadsheet and Table2Net) - namely the authors of
the visual content, the year of its creation and the classification of link domains
(whether a porn website or not), using colour (or label) to identify them in the network
diagram. Those interventions have helped us to respond to a particular research
question concerning porn bots, enabling us to analyse their agency but, at the same
time, to perceive them in the general context of the network.
In chapter 3 we continued and expanded the journey of understanding how technical
knowledge contributes to new forms of enquiring in digital research, and here we learnt
a way of carrying out digital fieldwork. The chapter contends that new media
researchers should have a technical understanding of the web environments and
develop a practical awareness of a triangular relationship between software
affordances, platforms’ cultures of use and grammatisation. This is what it means
to get acquainted with the technological environment in which digital methods ground
their claims about social phenomena. For instance, within the web environments, we
should take Uniform Resource Identifiers (URIs) as research material because
every piece of information stored on the web is standardised by a URI. Here, the
syntax of URLs is a means for exploring the data set, for responding to research
questions or identifying unique actors or records. An example of this is given in chapter
2: after following the protocol of search as research for detecting pornography
websites through image URLs, it was possible to identify and visualise 125
unique pornography hosts within the network. As mentioned above, this strategy gave
us a new direction for visual network analysis and generated new insightful findings.
Such an attitude requires a capacity of considering the features of technical mediums
as an ensemble and as a solution to methodological problems.
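The idea of treating URL syntax as research material can be sketched with Python's standard library. This is an illustration under stated assumptions, not the protocol followed in chapter 2: the function `unique_hosts` and the example URLs are hypothetical, and stripping a leading `www.` is one possible normalisation choice among others.

```python
from urllib.parse import urlsplit


def unique_hosts(urls):
    """Extract the set of unique hosts from a list of (image) URLs."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname  # lower-cased; None if no host part
        if not host:
            continue
        if host.startswith("www."):
            host = host[4:]
        hosts.add(host)
    return hosts


# Hypothetical image URLs as they might appear in a data set.
image_urls = [
    "https://www.example.com/a.jpg",
    "http://EXAMPLE.com/b.png",
    "https://cdn.example.org/c.jpg",
    "not a url",
]
hosts = unique_hosts(image_urls)
```

The resulting hosts can then be added as a node attribute, for instance to classify link domains before visualising the network.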
To understand the layered structure of online connectivity (Ghitalla & Jacomy, 2019)
is another aspect that researchers should keep in mind when using search engines
results or crawlers’ outputs, because the hierarchical structure of the web delivers
content arranged in a particular order (which is reflected in the data obtained): content
which is known by everyone (first layer), by amateurs and experts (second layer), and
simply ignored (third) (Jacomy, 2019).
No less important is knowing how web apps and APIs may serve research. These
mediate the execution of specific research tasks and demand our attention because they
have “epistemic orientations that have repercussions for the production of academic
knowledge” (Borra & Rieder, 2014, p. 2). Learning how to read social media APIs’
documentation (not always transparent or user friendly) is helpful to better understand
the situations when users deal with predefined technological grammar, produced and
delineated by software, to structure their activity (Gerlitz & Rieder, 2018). All these
possibilities, however, exist in a web that is less and less seen as an open platform for
everyone to use, and more and more as a closed environment ruled by specific
platforms’ walled gardens.
Finally, the concluding remarks of chapter 3 introduce the three pillars of the digital
methods approach, which relate to technical knowledge and practical awareness of:
§
Platform grammatisation: the technological processes inherent to the web
environment and APIs in which and through which online actions are structured,
recorded and collected through crawling, scraping or API calling.
§
Cultures of use: the modes of life, the common meanings and the forms of
signification that emerge and circulate within a given platform, encompassing what is
expressed by technological grammar and shaped by platform’s infrastructure and
technical mechanisms.
§
Software affordances: the materiality, productive and mediating capacities of
software to be considered from a relational perspective with platform grammatisation
and cultures of use.
Part 2: From technical knowledge and technical practices to new forms of
enquiring
Chapter 4 presents a hands-on approach to study hashtag engagement on social media
through the three pillars of the digital methods approach. The “impeachment-cum-coup” of Brazilian president Dilma Rousseff was studied under different but related
perspectives. The first contemplates the differences between high-visibility and
ordinary hashtag usage culture, the related actors, and content. The second focuses on
hashtagging activity and the last layer looks into the images and texts to which
hashtags are related. When read together, the three levels of analysis add value to one
another, providing a rich and in-depth vision of the case study. This approach thus
promises an enhanced understanding of hashtag engagement and can be applied to
different platforms.
Chapter 4 opens a window of opportunity for digital research, showing Instagram as
an environment for the study of political debates through hashtags (in 2016, the
platform was not considered a means to this end). In compliance with some extra effort
required by the digital methods approach, the choice of hashtags in this chapter takes
seriously the art of querying. It considered program and anti-program (situating
hashtags as positioning efforts) and resulted from immersive observation and
monitoring as well as exploratory data collection and analysis (co-hashtag networks
and Excel’s pivot table) throughout the month of the protests (March 2016).
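The co-hashtag networks mentioned above rest on a simple principle: two hashtags are linked whenever they appear in the same caption. A minimal Python sketch of this principle follows. The function `co_hashtag_edges`, the regular expression and the placeholder hashtags are assumptions for illustration and do not reproduce the tools used in the chapter.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def co_hashtag_edges(captions):
    """Count how often each pair of hashtags co-occurs in a caption."""
    edges = Counter()
    for caption in captions:
        # De-duplicate and sort so that each pair has one canonical form.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(caption)})
        edges.update(combinations(tags, 2))
    return edges


# Hypothetical captions with placeholder hashtags.
captions = [
    "Marching today #TagA #TagB",
    "#tagb #taga on the streets",
    "#TagC",
]
edges = co_hashtag_edges(captions)
```

The edge weights can be exported as a weighted edge list for exploration in network analysis software, which supports the choice of hashtags during query design.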
The extra care needed for the analysis of grammatised actions is reflected in the
criticism of some common sampling practices in digital research – e.g. focusing only
on the most engaged items or what is dominant in terms of popularity and influence.
It thus proposes that researchers consider also what is “ordinary”, that is users, posts
and practices kept out of the spotlight. To do that, on the first level of analysis (high-visibility versus ordinary), unique actors must be identified and distinguished using
engagement metrics. Two analytical strategies are proposed: the identification of
unique actors (e.g. users, link domains, image URLs and video ids) and the analysis
of high-visible and ordinary elements (actors, content and/or cultures of use). Caring
about unique actors gives researchers a sense of what the data set can represent; it helps
to situate and contextualise it and provides a partial but robust perception of the digital
records, framed by Instagram grammatisation, software affordances and our vision on
the issue but solidly built with a good query design.
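The two analytical strategies can be sketched in Python as follows. This is a simplified illustration, not the procedure used in chapter 4: it assumes that engagement is the sum of likes and comments per unique user and that the high-visibility group is a fixed number of top-ranked actors. The function `split_by_visibility` and the field names `user`, `likes` and `comments` are hypothetical.

```python
from collections import defaultdict


def split_by_visibility(posts, top_n=1):
    """Aggregate engagement per unique user and split users into
    (high_visibility, ordinary) lists ranked by engagement."""
    engagement = defaultdict(int)
    for post in posts:
        engagement[post["user"]] += post["likes"] + post["comments"]
    # Rank by engagement (descending), breaking ties alphabetically.
    ranked = sorted(engagement, key=lambda user: (-engagement[user], user))
    return ranked[:top_n], ranked[top_n:]


# Hypothetical posts collected for a hashtag.
posts = [
    {"user": "a", "likes": 100, "comments": 10},
    {"user": "b", "likes": 2, "comments": 0},
    {"user": "a", "likes": 50, "comments": 5},
    {"user": "c", "likes": 3, "comments": 1},
]
high_visibility, ordinary = split_by_visibility(posts, top_n=1)
```

Where the cut-off between the two groups is placed is itself a methodological decision and should be documented alongside the query design.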
Enquiring into political hashtag engagement on Instagram, this methodological
framework confirms the importance of including high-visibility but also ordinary
groups. It also revealed a particular structure concerning high-visible actors: there are
two types of high-visible actors with different
Instagram usage practices throughout the protest day, with low and high engagement in
post publication. We found that producing an impact requires little effort from public
figures, politicians, and artists (often with one post), as expected, while continuous
activity over time is necessary for non-official campaign accounts and independent
media (often with a high number of posts). Both cases are part of the high-visibility
group of protesters. When comparing high-visibility with ordinary, we detected
different patterns between these groups related to textual and visual narratives (through
analysing captions and images), forms of expressing feelings (through looking at
emojis) and positioning efforts (through the choice of hashtags).
Another criticism addressed in this chapter relates to superficial approaches to hashtag
activity – e.g. when the analysis ignores the forms in which hashtags are captured and
re-arranged by platforms (grammatisation) as well as the forms of appropriation and
frequency of hashtags (cultures of use). In this regard, the second level of analysis
(hashtagging activity) looks at referential tags and their use frequency. Again, we
noticed different preferences between high-visibility and ordinary actors. Through a
visual network exploration, chapter 4 also suggests that we approach emblematic
hashtags as a form of seeing a shift in meaning (what we called double-sense
hashtags), rather than following the typical cluster analysis to study the partisan use
of hashtags and related topics.
On the third level of analysis (visual and textual), different digital networks were part
of the study including a network of images and labels afforded by Google Vision API.
This type of network, still under-exploited, has shown great analytical value to study
online images. Through the network we produced, we were able to see the stereotypes
that characterise different Brazilian political positions (e.g. colours: yellow and green
for the pro-impeachment protests; red for the anti-coup protests) and political identity
(e.g. bearded faces on the left, sunglasses on the right). In addition, in
the pro-impeachment related images, we see a higher occurrence of labels which relate
to close-up portraits (e.g., “sunglasses,” “facial expression,” “face”), while labels
related to collective imagery were more common in the anti-coup data set (e.g.,
“festival,” “demonstration,” “event”).
The richness of different narratives was found by analysing the networks of
co-occurrence of terms and emojis, extracted from Instagram captions. The networks
revealed particular concerns raised only by one group. One example of this is the
argument made by anti-coup high-visible actors that Brazilians were not moved by
hatred but wanted to “protest peacefully”. Another example is the nationalistic
rhetoric, which appeared exclusively in pro-impeachment ordinary actors. In addition,
we saw ordinary actors showing that they were proud to participate in the protest, while
high-visibility actors acknowledged Brazilians for their participation. The emoji
analysis revealed an interesting perspective about race, with a predominance of light
skin and medium skin tones among protesters, except for the high-visibility accounts
of the anti-coup demonstrations, which had medium-dark and dark skin tones.
Chapter 5 expands the use of computer vision-based networks for social research, also
providing new ways to design and implement research through repurposing Facebook
likes and visual content. We studied how Portuguese Universities use the platform to
communicate (using two types of networks); we map and analyse like connections as
proxies of institutional interests and timeline images as institutional visual culture. By
doing so, chapter 5 exposes what researchers can learn from the connections between
Facebook Pages (through likes) and from a list of (timeline) image URLs. This study,
like the hashtag engagement approach in chapter 4, can be repurposed for different
studies. The main contribution of this chapter lies in embracing the methods of the
medium, a navigational research practice and the technicity-of-the-mediums as key
components for digital social sciences.
The network of likes between pages revealed not only common interests among all
universities (e.g. Portuguese and global Media/News Company, Newspapers and
Education-related Pages), but also some peculiar behaviours (more or less active in
terms of making institutional connections), dimensions of sociality (e.g. bonding with
pages categorised as business, public figures, restaurants, entertainment, barbershop,
shopping mall), interests (e.g. from investing in connections with internal stakeholders
to establishing a bond with international universities) and lack of interest (when
searching for politics-related categories, civic engagement, social movements or
political parties, nothing was found). This is a network that helps researchers to map
institutional profiles and that may serve other cases well (e.g. social or
environmental causes and politically oriented pages). On other platforms, we can
consider following networks (Instagram or Twitter) and channel networks (YouTube).
Using the image-label network, researchers can carry out two analyses that
complement one another. The first is the analysis of image clusters, providing a global
view of the site of the images in themselves. We found that Portuguese Universities
are perpetuating the idea of a boring academic environment (e.g. by using institutional
posters or photos of people seated in an auditorium and listening to a conference). The
second level of the analysis provides very specific insights concerning each university's
image-sharing culture (e.g. images containing animals are almost exclusive to UTAD).
This analysis benefits from some knowledge of Facebook grammatisation, the
affordances of Gephi and Google Vision API combined with some technical expertise
and practical awareness. Here colours were associated with the image clusters, while
the labels (second node type) were made to disappear by colouring them with the same
colour as the background. The final visualisation provides a network grid that gathers
15 networks (one for each university), which allowed us to take advantage of the visual
affordances of the networks, analysing each university individually.
Chapter 6 interrogates three computer vision APIs (Google, IBM and Microsoft)
by analysing 16,000 images related to Brazilians, Nigerians, Austrians and Portuguese
through the search for their demonyms in two of the main Western stock image sites
(Shutterstock and Adobe Stock). This served as an example of when and how the
technicity of the mediums is misaligned with the research objective, teaching us that
one’s ability to use software (or being an expert in it) does not necessarily mean being
aware of the technicity of the medium; the same can be said for one’s skill to generate
data visualisations or code. Our knowledge of image networks was solid, but we had a
poor understanding of the functioning of image classification by computer vision.
Consequently, we addressed research questions not consistent with the technicity of
the medium (by framing algorithmic techniques as either racist or culturally ignorant
agents) which in turn led us to flawed reasoning and results. Too much confidence in
knowing how to do or how to proceed at the level of methods or the eagerness for
methodological experimentation can take away the attitude of identifying and
interrogating the technical elements that are carriers of meaning. Thus, researchers'
attention ends up focusing only on methods.
We did, however, obtain insightful results when asking how different computer vision APIs
classify the same collections of images. This study presents important findings that
researchers may want to consider when comparing or using computer vision APIs to
study online images, summarised in four elements (or the need to be aware of):
• The different range of image labelling (capacity in terms of the number of labels)
• The modes of image labelling (use of words, e.g. in the IBM Watson API colours are
part of the generic labels used to classify visual content)
• The granularity of image labelling (how specific a label can be, e.g. serra de aires
dog)
• The lack of precision (e.g. inaccurate detection of facial expressions such as surprise,
sorrow or anger)
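The first of these elements can be made concrete with a few lines of Python. This is a minimal illustration and not the procedure used in chapter 6: the provider keys (api_a, api_b, api_c) and every label are hypothetical, and the Jaccard index is introduced here only as one possible measure of overlap between label sets.

```python
# Hypothetical label outputs for ONE image from three vision APIs.
# Provider keys and labels are illustrative, not results from the study.
outputs = {
    "api_a": ["food", "ingredient", "fruit", "staple food", "recipe", "dish", "cuisine"],
    "api_b": ["food", "fruit", "red color"],
    "api_c": ["food", "plate", "table", "indoor"],
}


def label_range(labels):
    """Range of labelling: how many distinct labels an API returned."""
    return len(set(labels))


def jaccard(a, b):
    """Share of labels two APIs have in common (0 = none, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Number of distinct labels per provider.
ranges = {api: label_range(labels) for api, labels in outputs.items()}

# Pairwise overlap between providers, each pair computed once.
overlap = {
    (x, y): jaccard(outputs[x], outputs[y])
    for x in outputs
    for y in outputs
    if x < y
}
```

A low overlap between two providers does not by itself indicate an error; it may simply reflect different modes or granularity of labelling.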
These elements are reflected in the spatialisation of the network
(forming clusters that reveal not only the image content but also the range and
granularity of image labelling) and, at the same time, shed light on the differences and
limitations among the three computer vision providers. Chapters 5 and 6 show us that
image clusters within the image-label network follow the logic of image classification by
computer vision, that is, the provision of a topicality rating and confidence scores for
the textual descriptions, e.g. Food (0.9772421); Ingredient (0.8929239); Fruit
(0.8815268); Staple food (0.86995584); Recipe (0.8641755); Dish (0.85230696);
Cuisine (0.8481042). This means that, in the image-label network, we would see clusters
constituted by generic labels, such as food, text, buildings or musical instruments, that
are followed by more specific or descriptive labels (e.g. when looking at the food
cluster, we can make sense of the variety of food types pertaining to a group of images,
such as apple pie, rice or beans). This kind of knowledge matters for interpreting the
network because, when we look at image-label networks, we understand how images
come to be positioned in different areas of the network (how connections are made) and
what is behind the formation of clusters (general labels followed by more specific
ones).
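The reading of clusters described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image identifiers and the labels of img_2 and img_3 are hypothetical (only the scores of img_1 are the example quoted above), and treating a label shared by every image as "generic" is a crude heuristic introduced here, not the method used in the case studies.

```python
from collections import defaultdict

# image id -> list of (label, confidence score).
# The scores for "img_1" are the example quoted in the text; the other two
# images and all image ids are hypothetical.
annotations = {
    "img_1": [("Food", 0.9772421), ("Ingredient", 0.8929239), ("Fruit", 0.8815268),
              ("Staple food", 0.86995584), ("Recipe", 0.8641755),
              ("Dish", 0.85230696), ("Cuisine", 0.8481042)],
    "img_2": [("Food", 0.95), ("Apple pie", 0.81)],
    "img_3": [("Food", 0.93), ("Rice", 0.88), ("Dish", 0.80)],
}


def build_edges(annotations, min_score=0.0):
    """Return bipartite edges (image, label, score), dropping low-score labels."""
    return [(img, label, score)
            for img, labels in annotations.items()
            for label, score in labels
            if score >= min_score]


def label_degree(edges):
    """Count how many images each label is attached to."""
    degree = defaultdict(int)
    for _img, label, _score in edges:
        degree[label] += 1
    return dict(degree)


edges = build_edges(annotations)
degree = label_degree(edges)

# Heuristic (assumption): a label attached to every image acts as the generic
# hub of the cluster; a label attached to a single image is a specific one.
generic = sorted(l for l, d in degree.items() if d == len(annotations))
specific = sorted(l for l, d in degree.items() if d == 1)
```

In this toy case the generic label holds the three images together, while the specific labels sit at the periphery of the cluster, each tied to a single image.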
Part 1 and part 2 illustrate in different ways what the technicity perspective has to
offer to digital research. Understanding that computational mediums influence the
interpretative process and have a voice of their own helps to rethink the way research
questions are asked, by taking technological grammar into account as a factor that
contributes to research efforts (not just as an issue or bias). Accounting for technicity
suggests that conceptualisation depends on what is obtained through practical
operations (method), which are not separate from close observation of one’s object of
study.
Developing a sensitivity to the technicity-of-the-mediums in digital methods
In this dissertation I have argued and illustrated how the practice of digital methods is
enhanced when researchers make room for, grow and establish a sensitivity to the
technicity-of-the-mediums. Researchers benefit from that, developing a capacity to
research with and about technological grammar and computational mediums.
To show the connection between part 1 and part 2 of this research, I introduce below
a figure that depicts three distinct but connected ways in which researchers can develop
a sensitivity to the technicity-of-the-mediums in digital methods. When doing this, I
argue, researchers may develop a capacity to research with and about technological
grammar and computational mediums. This is the main contribution of this
dissertation, which may serve as a building block in the practice of digital methods.
The concept of technicity-of-the-mediums is related to the relationship among the
technical mediums, the fieldwork and the researcher(s) and her object of study, thus
referring to technical knowledge but also demanding iterative and navigational
technical practices. In this way, the concept points to processes of getting acquainted
with the computational mediums from conceptual, technical and empirical
perspectives (making room for the sensitivity to technicity), and in the practice of
digital methods (developing such sensitivity). This involves an engagement with the
digital fieldwork as well as with technical practices, which takes time and requires
extra effort from the researcher (establishing the importance of this approach).
Using the case studies, I want to present in practical terms the situations that relate to
the attitudes of making room for (chapter 4), growing (chapter 5) and establishing
(chapter 6) a sensitivity to the technicity-of-the-mediums. Below, I summarise how the
technicity approach has been put to use in the case studies, while the figure helps to
explain the technicity approach in a more metaphorical way. In the figure, the pixels in
the background of the image represent the knowledge of the researcher, which changes
and evolves over time and through technical practices (the other shapes and the
connecting line are explained later).
• Chapter 4 exemplifies the attitude of making room for the technicity of the mediums
and the philosophy underlying digital methods. By paying close attention to
Instagram’s application programming interface as a way of making sense of how
researchers can access, treat and repurpose the platform’s technological grammar, by
knowing how to make use of extraction and analysis software, and by being aware
of the potentialities of computer vision, we attempted to pose research questions
aligned both with the object of study and with the technicity of the mediums.
• Chapter 5 illustrates a situation in which researchers have already developed a
certain proximity with computational mediums and the practice of digital methods.
This chapter exemplifies the attitude of growing a sensitivity to the technicity of the
mediums by taking seriously the relationship between software affordances,
platforms’ culture of use and their technical grammatisation. As in chapter 4, the
analysis proposed here offers macro and micro perspectives on the research object,
moving from general to specific views with both quantitative (a general overview of
the networks) and qualitative (a look at specific content and actors in the network)
approaches.
• Finally, chapter 6 presents an example of the use of digital methods as a tool to
create innovative visual methodologies, using comparative matrices of image-label
networks to compare the outputs of three different vision APIs. The chapter also
serves as a counter-example, describing the misalignment between the technicity of the
mediums and the research objective. It demonstrates that the absence of basic
knowledge about a computer vision feature can harm research results.
Developing a sensitivity to the technicity-of-the-mediums in digital methods.
The pixels in the background of the image represent knowledge, which begins with
the researcher's area of expertise and changes and evolves over time and
through technical practices. Design by Beatrice Gobbo and concept by Janna Joceli Omena.
Making room for the computational mediums as carriers of meaning
The attitude of making room for the technicity-of-the-mediums refers to the efforts to
become acquainted with the fieldwork and to be aware of technical mediation.
When researchers come into contact with the digital methods approach for the first time,
they are invited to train their minds to see the web, digital records, media and
computational mediums as a means of enquiry, as a source and method of investigation.
In the figure, the cubic forms (different from the rest) tell us about this challenging
encounter, a change of mindset in which the meaning carried by computational
mediums is as important as the object of study.
It is from this direct contact with web environments that researchers start to understand
what they should look at and what for. They should look at the web from a
methodological and technical standpoint, understanding how online devices treat web
data. To learn, researchers should understand the triangular relationship between
platforms’ cultures of use, their grammatisation and software affordances,
without losing sight of the object of study. In doing so, they should consider the
purpose, potentialities and limitations of the computational mediums, while practising
the full range of digital methods. It is through the practice of using software (to extract,
analyse and visualise digital records) that researchers start to understand what they
should know. They should know how to navigate digital platforms and API
documentation and how to identify, in the data obtained, what is seen there. They should
know how to follow software instructions and how to work with different file formats,
without losing sight of the relational aspects of the data.
All these efforts point to an understanding that is conceptual more than practical. Here
the use of software serves as an awareness tool, so that researchers can start seeing
things differently. What is technical is already there.
Growing a sensitivity to technical elements while practising digital methods
To grow a sensitivity to the technicity-of-the-mediums, researchers need to engage
with technical practices from the standpoint of software-using. This is a situation in
which researchers, already familiar with the fieldwork, start developing projects with
digital methods. They are invited to think along with a network of methods, while they
learn how to implement the methods. In the figure, we see that the cubic forms (the
web, digital records, media and computational mediums) are no longer strange but
something familiar; they have come to make sense. Through technical practices,
researchers understand how the meaning carried by computational mediums can be
as important as the object of study, thereby rethinking conditions of proof in digital
research.
It is by practising digital methods that researchers start to understand what they should
know and what for. They should know that different fields of study (e.g. graph
theory or network studies, information visualisation, web technologies and statistics)
are attached to computational mediums. It is through the continuous use of digital
methods (individually and through data sprints) that researchers can learn how to
think along with and repurpose the medium and digital records. To this end, and never
losing sight of the object of study, they start with a sort of mimic research, imitating
successful research protocols or following tried-and-tested research
recipes step by step. Through iterative and navigational technical practices, researchers
develop a sensitivity to technical elements (graph layout algorithms, vision API
modules, algorithmic techniques) as meaningful objects of attention, while they begin
to value and then appropriate the navigational practice in analysis. With time,
researchers develop the capacity to consider the features and practical qualities of
technical mediums as an ensemble.
All these efforts accommodate more practical than conceptual understanding; here
the use of software serves as a knowledge tool, so that researchers can start doing
things differently. What is technical is already there.
Establishing a sensitivity to the technicity-of-the-mediums
All the efforts made so far culminate in a balanced state. Here, what is conceptual,
technical and empirical is combined as if it were one thing (always connected with
the object of study). It is from this perspective that researchers become creators and
interpreters of a second order of grammatisation. The research protocol diagram below
illustrates that (but in a different way compared to the one discussed in chapter 3). On
the one hand, it portrays a concrete methodological process for creating computer
vision-based networks to study online images. On the other hand, it uses an
overlapping layer to underline the crucial role of technical knowledge, technical
practices and the researcher’s act of connecting techniques for research purposes.
Researchers orchestrate a technical ensemble (the full range of digital methods),
knowing which computational mediums should be part of it, at what time and in what
way they perform. They also know which of these are meaningful objects of attention.
That is, for instance, the case of the API that captured and made available the images,
the vision API that added new technological grammar to the images, the software
(Gephi) used to analyse the images and the force-directed algorithm (ForceAtlas2) used
to spatialise the network. Researchers organise a technical ensemble and follow the
protocol of a navigational practice in analysis, understanding that their decisions are
affected by the performance of the technical mediums.
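The hand-over between the mediums of such an ensemble can be sketched as follows: labels returned by a vision API become an edge list that Gephi can import, after which ForceAtlas2 spatialises the network inside Gephi. This is only a sketch under stated assumptions. It assumes Gephi's spreadsheet import with Source, Target and Weight column headers; the image identifiers, the function name edges_to_csv and the score of img_2 are hypothetical.

```python
import csv
import io

# (image id, label, confidence score) triples, e.g. derived from a vision API.
# The img_1 scores are the example quoted earlier in this chapter; the image
# ids and the img_2 row are hypothetical.
edges = [
    ("img_1", "Food", 0.9772421),
    ("img_1", "Fruit", 0.8815268),
    ("img_2", "Food", 0.95),
]


def edges_to_csv(edges):
    """Serialise image-label edges as CSV text with an edge-table header.

    Assumption: the header names Source, Target and Weight match what the
    network software expects when importing an edge table.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for image_id, label, score in edges:
        writer.writerow([image_id, label, score])
    return buffer.getvalue()


csv_text = edges_to_csv(edges)
```

Keeping the confidence score as the edge weight is itself a methodological decision: it lets the layout algorithm pull an image closer to the labels the vision API is more confident about, which is one way in which the performance of a technical medium enters the analysis.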
With time, researchers develop the capacity to use the features of technical mediums
as an ensemble and as a solution to methodological problems. As we see in the figure,
the colours of the pixels in the background of the image have changed, representing
what has been gained through processes of getting acquainted with the computational
mediums and the web environment. This situation, however, is not definitive because,
whenever necessary, the researcher should go back to previous steps (as represented
by the lines in the figure).
Although the methods, media and technological grammar are unstable and change over
the years, there is something permanent in the logic of thinking about digital research,
which may come from the awareness of the technicity-of-the-mediums combined with the
practice of digital methods.
The research protocol diagram for building/interpreting computer vision networks as an
example of researchers as creators and interpreters of a second order of grammatisation.
Design by Beatrice Gobbo and concept by Janna Joceli Omena.
References
Adar, E., & Kim, M. (2007). SoftGUESS: Visualization and Exploration of Code
Clones in Context. Retrieved from http://depfind.sourceforge.net/
Agre, P. E. (1994). Surveillance and capture: Two models of privacy. The
Information Society, 10, 110–127.
https://doi.org/10.1080/01972243.1994.9960162
Aiello, G. & Woodhouse, A. (2016). When corporations come to define the visual
politics of gender: The case of Getty Images. Journal of Language and Politics,
v. 15, n. 3, p. 351-366.
Aiello, G., et al. (2016). A critical genealogy of the Getty Images Lean In Collection.
Retrieved August, 19, 2019 from
https://wiki.digitalmethods.net/Dmi/WinterSchool2016CriticalGenealogyGettyI
magesLeanIn
Aiello, G., et al. (2017). Taking stock: Can news images be generic? Retrieved on the
August, 19, 2019 from https://wiki.digitalmethods.net/Dmi/TakingStock
Akrich, M., & Latour, B. (1992). A summary of a convenient vocabulary for the
semiotics of human and nonhuman assemblies. In W. Bijker & J. Law (Eds.),
Shaping technology/building society: Studies in sociotechnical change. (pp.
259–264). Cambridge: MIT Press.
Alonso, A. (2017, June). The politics of the streets: protests in São Paulo from Dilma
to Temer. Novos Estudos CEBRAP. http://bdpi.usp.br/item/002837619
Alpaydin, E. (2016). Machine learning: the new AI. MIT Press.
Alves, S. (2020). Julgamento de Influencer Mariana Ferrer Termina com sentença
inédita de ‘Estupro Culposo’ e Advogado humilhando jovem. The Intercept
Brasil. Retrieved from https://theintercept.com/2020/11/03/influencer-marianaferrer-estupro-culposo/
Alzamora, G. C., & Bicalho, L. A. G. (2016). The representation of the impeachment
day mediated by hashtags on Twitter and Facebook: semiosis in hybrid
networks. Interin, 21(2), 100–121.
Analyx. (2015). GitHub - analyxcompany/ForceAtlas2: This is the R implementation
of the Force Atlas 2 graph layout designed for Gephi. Retrieved June 4, 2020,
from https://github.com/analyxcompany/ForceAtlas2
Anderson, C. (2008). The long tail: Why the future is selling less of more. Hachette
Books.
Anderson, C., & Wolff, M. (2010). The Web is Dead - Long Live the Web.
https://doi.org/10.1177/146045820000600401
Anderson, P. (2011). Lula’s Brazil. London Review of Books, 7, 3–12.
https://www.lrb.co.uk/v33/n07/perry-anderson/lulas-brazil
Auroux, S. (1992). The technological revolution of grammatization. University of
Campinas.
Andreessen, M. (2007). The three kinds of platforms you meet on the Internet.
Retrieved January 13, 2016, from
https://web.archive.org/web/20071002031605/http://blog.pmarca.com/2007/09/the-threekinds.html
Ash, J. (2012). Technology, technicity, and emerging practices of temporal
sensitivity in videogames. Environment and Planning A, 44(1), 187–203.
https://doi.org/10.1068/a44171
Auroux, S. (1992). A Revolução Tecnológica da Gramatização (E. P. O., Trans.).
Campinas, SP: Editora da Unicamp.
Bach, D., Tsapatsaris, M. R., Szpirt, M., & Custodis, L. (2018). The Baker’s Guild:
The Secret Order Countering 4chan’s Affordances. Retrieved from
https://oilab.eu/the-bakers-guild-the-secret-order-countering-4chans-affordances/
Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An Open Source Software
for Exploring and Manipulating Networks. Third International AAAI
Conference on Weblogs and Social Media, 361–362.
https://doi.org/10.1136/qshc.2004.010033
Bennett, Lance and Segerberg, A. (2012). The logic of connective action.
Information, Communication & Society, 15(5), 739–768.
https://doi.org/10.1080/1369118X.2012.670661
Berners-Lee, T, Fielding, R. T., & Masinter, L. M. (2005). Uniform Resource
Identifiers (URI): Generic Syntax. Retrieved from
https://doi.org/10.17487/RFC3986
Berners-Lee, Tim. (1995). Hypertext and Our Collective Destiny. Retrieved
September 20, 2020, from
https://www.dougengelbart.org/content/view/258/000/
Berry, D. M. (2011). The computational turn: thinking about the digital humanities.
Culture Machine, 12.
Bessi, A., & Ferrara, E. (2016, November). Social bots distort the 2016 U.S.
Presidential election online discussion. First Monday.
https://firstmonday.org/ojs/index.php/fm/article/view/7090/5653
Blondel, V. D. et al. (2008). Fast unfolding of communities in large networks.
Journal of statistical mechanics: theory and experiment, v. 2008, n. 10, p.
P10008.
Blondel, V. D., Guillaume, J.L., Lambiotte, R., & Lefebvre, E. (2008). Fast
unfolding of communities in large networks. Journal of Statistical Mechanics:
Theory and Experiment, 10008(10), 6.
Bode, L., Vraga, E. K., Borah, P., & Shah, D. V. (2014). A new space for political
behavior: Political social networking and its democratic consequences. Journal
of Computer-Mediated Communication, 19, 414–429.
https://doi.org/10.1111/jcc4.12048
Bogers, L., Niederer, S., Bardelli, F., & De Gaetano, C. (2020). Confronting bias in
the online representation of pregnancy. Convergence, 1–23.
https://doi.org/10.1177/1354856520938606
Bogost, I., & Montfort, N. (2009). Platform Studies: Frequently Questioned
Answers. Digital Arts and Culture, 12–15(December), 1–6.
Borra, E., & Rieder, B. (2014). Programmed Method: Developing a Toolset for
Capturing and Analysing Tweets. Aslib Journal of Information Management,
Vol. 66(No. 3), 262–278.
Bounegru, L., Gray, J., Venturini, T., & Mauri, M. (2017). A field guide to Fake
News and other information disorders: a collection of recipes for those who love
to cook with digital methods. Retrieved from https://fakenews.publicdatalab.org/
Boy, J. D., & Uitermark, J. (2016). How to study the city on Instagram. PLOS ONE,
11(6), Article e0158161. https://doi.org/10.1371/journal.pone.0158161
Bruhn, M. (2003). Visualization services: Stock photography and the picture industry.
Genre: Forms of Discourse and Culture, v. 36, n. 3-4, p. 365-381.
Bruns, A., & Burgess, J. E. (2011, October 17). The use of Twitter hashtags in the
formation of ad hoc publics [Conference session]. Proceedings of the 6th
European Consortium for Political Research (ECPR) General Conference,
Reykjavik.
Bucher, T. (2012). A technicity of attention: How software “makes sense.” Culture
Machine, 13, 1–23.
Bucher, T. (2013). Objects of Intense Feeling: The Case of the Twitter API.
Computational Culture. Retrieved from
http://computationalculture.net/article/objects-of-intense-feeling-the-case-of-thetwitter-api
Buolamwini, J. & Gebru, T. (2018). Gender shades: Intersectional accuracy
disparities in commercial gender classification. In: Conference on fairness,
accountability and transparency. p. 77-91.
Buolamwini, J. (2017). Gender shades: Intersectional phenotypic and demographic
evaluation of face datasets and gender classifiers (Master Thesis, Massachusetts
Institute of Technology). Retrieved from
https://dspace.mit.edu/bitstream/handle/1721.1/114068/1026503582-MIT.pdf
Burgess, Robin, Remi Jedwab, Edward Miguel, Ameet Morjaria, and Gerard Padró i
Miquel. 2015. "The Value of Democracy: Evidence from Road Building in
Kenya." American Economic Review, 105 (6): 1817-51. DOI:
10.1257/aer.20131031
Cardon, D., Cointet, J. & Mazieres, A.(2018). Neurons spike back: The Invention of
Inductive Machines and the Artificial Intelligence Controversy.
Castells, M. (2002). A sociedade em rede. São Paulo: Paz e Terra.
Chippada, B. (2017). GitHub - bhargavchippada/forceatlas2: Fastest Gephi’s
ForceAtlas2 graph layout algorithm implemented for Python and NetworkX.
Retrieved June 4, 2020, from https://github.com/bhargavchippada/forceatlas2
Chollet, F. et al. (2015). Keras. Recovered at https://keras.io
Ciuccarelli, P., & Elli, T. (2019). Beyond visualisation. In Reassembling the
Republic of Letters in the Digital Age (pp. 299–314). Göttingen University
Press. https://doi.org/10.17875/gup2019-1146
Coding Arena. (2018). Build Your First Web App In Visual Studio - Microsoft
Virtual Academy - Coding Arena. Retrieved from
https://www.youtube.com/watch?v=mgAtiR-1is4
Colombo, G. (2018). The design of composite images: Displaying digital visual
content for social research.
Corrêa, L. G. (2017). Does impeachment have gender? Circulation of images and
texts about Dilma Rousseff in Brazilian and British press. In P. C. Castro (Ed.),
A circulação discursiva entre produção e reconhecimento (pp. 279–292). Edufal.
Cortese, D. K., Szczypka, G., Emery, S., Wang, S., Hair, E., & Vallone, D. (2018).
Smoking selfies: Using Instagram to explore young women’s smoking
behaviors. Social Media + Society, 4(3).
https://doi.org/10.1177/2056305118790762
Crogan, P., & Kennedy, H. (2009). Technologies Between Games and Culture, 4(2),
107–114. https://doi.org/10.1177/0022022103251753
Crogan, P., & Kinsley, S. (2012). Paying Attention: Towards a Critique of the
Attention Economy, 13(1997), 1–29. Retrieved from
http://eprints.uwe.ac.uk/17039/1/463-965-1-PB.pdf
D'Orazio, F. (2014). The Future of Social Media Research. In Woodfield, K. (org.),
Social media in social research: blogs on blurring the boundaries.
D’Andréa, C. (2018). Cartografando controvérsias com as plataformas digitais:
apontamentos teórico-metodológicos. Galáxia (São Paulo), n. 38, p. 28-39.
D’Andréa, C. (2020). Pesquisando plataformas online: conceitos e métodos.
Salvador: EDUFBA. Retrieved from
https://repositorio.ufba.br/ri/handle/ri/32043
D’Andréa, C., & Mintz, A. (2019). Studying the live cross-platform circulation of
images with computer vision API: An experiment based on a sports media event.
International Journal of Communication, 13, 1825–1845.
de Souza, C. R. B., Redmiles, D., Cheng, L.-T., Millen, D., & Patterson, J. (2004).
Sometimes you need to see through walls, (September 2015), 63.
https://doi.org/10.1145/1031607.1031620
Deng, J. et al.(2009). Imagenet: A large-scale hierarchical image database. In: IEEE
conference on computer vision and pattern recognition, 248–255
Dixon, D. (2012). Analysis tool or research methodology: Is there an epistemology
for patterns? In D. M. Berry (Ed.), Understanding digital humanities (pp. 191–
209). Palgrave Macmillan. https://doi.org/10.1057/9780230371934
Dodge, M., & Kitchin, R. (2005). Code and the Transduction of Space. Annals of the
Association of American Geographers, 95(1), 162–180.
https://doi.org/10.1111/j.1467-8306.2005.00454.x
Dovey, J., & Kennedy, H. W. (2006). Game cultures : computer games as new
media. Open University Press.
Fausto Neto, A. (2016). Impeachment according to the logics of the “fabrication” of
the event. Rizoma, 4(2), 8–36. https://doi.org/10.17058/rzm.v4i2.8602
Flanagan, D. (2011). JavaScript: The Definitive Guide. (M. Loukides, Ed.) (6th ed.).
USA: O’Reilly Media. Retrieved from
https://books.google.pt/books?id=4RChxt67lvwC&printsec=frontcover#v=onep
age&q&f=false
Flores, A. M. M. (2019). Produção e Consumo de Vídeos em 360° - Tendências para
o Jornalismo Brasileiro no YouTube. In J.J. Omena (Ed.), Métodos Digitais:
Teoria-Prática-Crítica (pp. 183–201). Lisbon: ICNOVA. Retrieved from
https://www.researchgate.net/publication/340814578_Producao_e_consumo_de
_videos_em_360_-_tendencias_para_o_jornalismo_brasileiro_no_YouTube
França, V. V., & Bernardes, M. (2016). Images, beliefs and truth in the protests of
2013 and 2015. Rumores, 10(19), 8–24. https://doi.org/10.11606/issn.1982677X.rum.2016.112718
Freelon, D. (2018). Computational research in the post-API age. Political
Communication, 35, 665–668. https://doi.org/10.1080/10584609.2018.1477506
Frosh, Paul. (2001). Inside the image factory: stock photography and cultural
production. Media, Culture & Society, v. 23, n. 5, p. 625-646.
Fuller, M. (2008). Software Studies. A Lexicon. (Matthew Fuller, Ed.). Cambridge,
MA; London, England: The MIT Press.
Galloway, A. R. (2014). The Cybernetic Hypothesis. Differences, 25(1), 107–131.
https://doi.org/10.1215/10407391-2420021
Gaver, W. W. (1991). Technology affordances. In Conference on Human Factors in
Computing Systems - Proceedings (pp. 79–84). New York, New York, USA:
Association for Computing Machinery. https://doi.org/10.1145/108844.108856
Geboers, M. A., & Van De Wiele, C. T. (2020). Machine Vision and Social Media
Images: Why Hashtags Matter. Social Media + Society, 6(2).
https://doi.org/10.1177/2056305120928485
Geboers, M., et al. (2019). Tracing relational affect on social platforms through
image recognition. Retrieved from the Universiteit van Amsterdam’s website:
https://wiki.digitalmethods.net/Dmi/SummerSchool2019TracingAffect
Gephi Community Project. (2009). GEXF File Format. Retrieved June 14, 2020,
from https://gephi.org/gexf/format/
Gerlitz, C., & Rieder, B. (2018). Tweets Are Not Created Equal: investigating
Twitter’s client ecosystem. International Journal of Communication : IJoC, 12.
Retrieved from https://dare.uva.nl/search?identifier=4da1d406-1213-4103-8237eef5ae786948
Gerlitz, Carolin, & Helmond, A. (2013). The like economy: Social buttons and the
data-intensive web. New Media & Society, 15(8), 1348–1365.
https://doi.org/10.1177/1461444812472322
Gerlitz, Carolin. (2016). What Counts? Reflections on the Multivalence of Social
Media Data. Digital Culture & Society, 2(2). https://doi.org/10.14361/dcs-20160203
Gerrard, Y. (2018). Beyond the hashtag: Circumventing content moderation on social
media. New Media and Society, 20(12), 4492–4511.
https://doi.org/10.1177/1461444818776611
Giannoulakis, S., & Tsapatsoulis, N. (2016). Evaluating the descriptive power of
Instagram hashtags. Journal of Innovation in Digital Ecosystems, 3(2), 114–129.
https://doi.org/10.1016/j.jides.2016.10.001
Gibb, R. (2016). What is a web application? Retrieved from
https://blog.stackpath.com/web-application/
Gibbs, M., Meese, J., Arnold, M., Nansen, B., & Carter, M. (2015). #Funeral and
Instagram: death, social media, and platform vernacular. Information
Communication and Society, 18(3), 255–268.
https://doi.org/10.1080/1369118X.2014.987152
Gillespie, T. (2010). The politics of “platforms.” New Media & Society, 12(3), 347–
364. https://doi.org/10.1177/1461444809342738
Gillespie, T. (2017). The platform metaphor, revisited. The Alexander Von
Humboldt Institute for Internet and Society. https://www.hiig.de/en/theplatform-metaphor-revisited
Gillespie, Tarleton. (2015). Platforms Intervene. Social Media + Society, 1(1), 1–2.
https://doi.org/10.1177/2056305115580479
Gillespie, Tarleton. (2018a). Governance of and by platforms. In J. Burgess, T. Poell,
& A. Marwick (Eds.), The Sage handbook of social media (pp. 254–278). Sage.
Retrieved from https://www.microsoft.com/enus/research/publication/governance-of-and-by-platforms/
Gillespie, Tarleton. (2018b). Platforms Are Not Intermediaries. 2 GEO. L. TECH.
REV. 198, (2), 198–216. https://doi.org/10.1177/1527476411433519
Google. (2017). Google Cloud Vision API (Version 1.0) [Computer software].
https://cloud.google.com/vision
Grandjean, M. & Jacomy, M.(2019). Translating Networks: Assessing
Correspondence Between Network Visualisation and Analytics. Digital
Humanities, 10, 2019.
Grohmann, R. (2018). The notion of engagement: meanings and traps for
communication research. Revista FAMECOS, 25(3), 29387.
https://doi.org/10.15448/1980-3729.2018.3.29387
Haas, A. M. (2012). Race, rhetoric, and technology: A case study of decolonial
technical communication theory, methodology, and pedagogy. Journal of
Business and Technical Communication, v. 26, n. 3, p. 277-310.
Helmond, A. (2013). The Algorithmization of the Hyperlink | Computational
Culture. Computational Culture, 3. Retrieved from
http://computationalculture.net/the-algorithmization-of-the-hyperlink/
Helmond, A. (2015a). The Platformization of the Web: Making Web Data Platform
Ready. Social Media + Society, 1(2), 2056305115603080.
https://doi.org/10.1177/2056305115603080
Helmond, A. (2015b). The web as platform: Data flows in social media. PhD
dissertation. University of Amsterdam. Retrieved from
http://www.annehelmond.nl/wordpress/wpcontent/uploads//2015/08/Helmond_WebAsPlatform.pdf
Heymann, S. (2014). Gephi. Encyclopedia of Social Networks and Mining
(ESNAM).
Hendricks, Lisa Anne et al.(2018). Women also snowboard: Overcoming bias in
captioning models. European Conference on Computer Vision. Springer, Cham,
2018. p. 793-811.
Highfield, T. (2018). Emoji hashtags // hashtag emoji: Of platforms, visual affect,
and discursive flexibility. First Monday, 23(9), 1–16.
https://doi.org/10.5210/fm.v23i9.9398
Highfield, T., & Leaver, T. (2015, January). A methodology for mapping Instagram
hashtags. First Monday, 20(1). https://doi.org/10.5210/fm.v20i1.5563
Highfield, T., & Leaver, T. (2016). Instagrammatics and digital methods: Studying
visual social media, from selfies and GIFs to memes and emoji. Communication
Research and Practice, 2(1), 47–62.
Ho, J. C. T. (2020). How biased is the sample? Reverse engineering the ranking
algorithm of Facebook’s Graph application programming interface. Big Data
and Society, 7(1). https://doi.org/10.1177/2053951720905874
Hoel, A. S. (2018). Technicity. In Posthuman Glossary (Rosi Braid, pp. 420–423).
Bloomsbury.
Hoel, A. S., & Van Der Tuin, I. (2013). The ontological force of technicity: Reading
Cassirer and Simondon diffractively. Philosophy and Technology, 26(2), 187–
202. https://doi.org/10.1007/s13347-012-0092-5
Hussain, Zaeem et al.(2017). Automatic understanding of image and video
advertisements. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition. p. 1705-1715.
Iliadis, A. (2015). Two examples of concretization. Platform, 6(1), 86–95.
Instaloader. (2019). (Version 4.2.6) [Computer software].
https://github.com/instaloader/instaloader
Jacomy, M. (2019). The Web as Layers. Retrieved from
https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b887f-ab53007ef5c4
Jacomy, Mathieu, Venturini, T., Heymann, S., & Bastian, M. (2014). ForceAtlas2, a
continuous graph layout algorithm for handy network visualization designed for
the Gephi software. PLoS ONE, 9(6).
https://doi.org/10.1371/journal.pone.0098679
Jacomy, Mathieu. (2011). ForceAtlas2, the new version of our home-brew Layout.
Retrieved June 4, 2020, from
https://gephi.wordpress.com/2011/06/06/forceatlas2-the-new-version-of-ourhome-brew-layout/
Jacomy, Mathieu. (2020a). A validity metric for interpreting distances in a network
map. Retrieved June 4, 2020, from https://observablehq.com/@jacomyma/avalidity-metric-for-interpreting-distances-in-a-networkm?collection=@jacomyma/quest-for-connected-closeness
Jacomy, Mathieu. (2020b). Making complex networks interpretable with a metric –
Reticular. Retrieved June 4, 2020, from https://reticular.hypotheses.org/1603
Jinkings, I., Doria, K., & Cleto, M. (Eds.). (2016). Why do we shout coup? To
understand the impeachment and political crisis in Brazil. Boitempo Editorial.
Joo, Jungseock et al. (2014). Visual persuasion: Inferring communicative intents of
images. In: Proceedings of the IEEE conference on computer vision and pattern
recognition. p. 216-223.
Jünger, J., & Keyling, T. (2019). Facepager. An application for automated data
retrieval on the web.
Jungherr, A. (2014, February 27). Twitter in politics: A comprehensive literature
review. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2402443
Jungherr, A. (2015). Twitter use in election campaigns: A systematic literature
review. Journal of Information Technology & Politics, 13(1), 72–91.
Karako, C. & Manggala, P. (2018). Using image fairness representations in diversity-based re-ranking for recommendations. In: Adjunct Publication of the 26th
Conference on User Modeling, Adaptation and Personalization.
Kitchin, R., & Dodge, M. (2007). Rethinking maps. Progress in Human Geography,
31(3), 331–344. https://doi.org/10.1177/0309132507077082
Knuttila, L. (2011). User unknown: 4chan, anonymity and contingency. First
Monday, 16(10). Retrieved from https://doi.org/10.5210/fm.v16i10.3665
Kuecklich, J. (2009). A Techno-Semiotic Approach to Cheating in Computer Games
Or How I Learned to Stop Worrying and Love the Machine. Games and Culture,
4(2), 158–169. https://doi.org/10.1177/1555412008325486
Laestadius, L. (2017). Instagram. In L. Sloan & A. Quan-Haase (Eds.), The SAGE
handbook of social media research methods (pp. 573–592). Los Angeles; London:
SAGE Publications Ltd.
Langlois, G., & Elmer, G. (2013). The research politics of social media platforms.
Culture Machine, 14, 1–17.
Latour, B. (2001). Um coletivo de humanos e não-humanos: no labirinto de Dédalo.
A esperança de Pandora: ensaios sobre a realidade dos estudos científicos. São
Paulo: EDUSC.
Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory.
Oxford University Press.
Latour, B. (2010). Tarde’s idea of quantification. In M. Candea (Ed.), The social
after Gabriel Tarde: Debates and assessments (pp. 145–162). Routledge.
Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). “The whole
is always smaller than its parts” - a digital test of Gabriel Tarde’s monads.
British Journal of Sociology, 63(4), 590–615. https://doi.org/10.1111/j.1468-4446.2012.01428.x
Lazer, D. M. J., Pentland, A., Watts, D. J., Aral, S., Athey, S., Contractor, N., …
Wagner, C. (2020). Computational social science: Obstacles and opportunities.
Science, 369(6507), 1060–1062. https://doi.org/10.1126/science.aaz8170
Lazer, D., Brewer, D., Christakis, N., Fowler, J., & King, G. (2009). Life in the
network: the coming age of computational social science. Science, 323(5915),
721–723. https://doi.org/10.1126/science.1167742
Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A. L., Brewer, D., … Van
Alstyne, M. (2009, February 6). Social science: Computational social science.
Science. American Association for the Advancement of Science.
https://doi.org/10.1126/science.1167742
Lemos, A. (2002). Cibercultura, tecnologia e vida social na cultura contemporânea.
Porto Alegre: Sulina.
Lima, M. (2010). Racial inequalities and public policy: affirmative action in the Lula
government. Novos Estudos CEBRAP, 87, 77–95.
Lisis Laboratory. (2017). CorTexT Manager [Computer software].
https://managerv2.cortext.net
Liu, A. (2009). Digital humanities and academic change. English Language Notes,
47, 17–35.
Loken, B. & Ward, J. (1990). Alternative approaches to understanding the
determinants of typicality. Journal of Consumer Research, v. 17, n. 2, p. 111-126.
Lomborg, S., & Bechmann, A. (2014). Using APIs for Data Collection on Social
Media. The Information Society, 30(4), 256–265.
https://doi.org/10.1080/01972243.2014.915276
Mackenzie, A. (2019, November 10). From API to AI: platforms and their opacities.
Information Communication and Society, pp. 1989–2006.
https://doi.org/10.1080/1369118X.2018.1476569
Manovich, L. (1993). The engineering of vision from constructivism to computers
(doctoral thesis), University of Rochester. Retrieved from
http://manovich.net/EV/EV.PDF
Manovich, L. (2009). Cultural Analytics: Visualizing Cultural Patterns in the Era of
“More Media.” New York, 1–4.
Manovich, L. (2014). Software is the message. Journal of Visual Culture, 13(1), 79–
81. https://doi.org/10.1177/1470412913509459
Manovich, L. (2020). Cultural Analytics. The MIT Press. Retrieved from
https://mitpress.mit.edu/books/cultural-analytics
Markham, A. N., & Buchanan, E. (2012). Ethical decision-making and internet research:
Recommendations from the AoIR ethics working committee (version 2.0).
Retrieved from http://aoir.org/reports/ethics2.pdf
Markham, A. (2017). Impact model for ethics: Notes from a talk. Retrieved from
https://annettemarkham.com/2017/07/impact-model-ethics/
Marres, N. (2011). Re-distributing methods: digital social research as participatory
research. Sociological Review. Retrieved from http://eprints.gold.ac.uk/6846
Marres, N. (2017). Digital sociology: the reinvention of social research. Bristol:
Polity Press.
Marres, N. & Moats, D. (2015). Mapping controversies with social media: The case
for symmetry. Social Media + Society.
Marres, N., & Weltevrede, E. (2012). Scraping the social? Issues in live social
research. Journal of Cultural Economy, 6(3), 313–335.
Mauri, M., Elli, T., Caviglia, G., Uboldi, G., & Azzi, M. (2017). RAWGraphs. In
Proceedings of the 12th Biannual Conference on Italian SIGCHI Chapter CHItaly ’17 (pp. 1–5). New York, New York, USA: ACM Press.
https://doi.org/10.1145/3125571.3125585
Mauri, M., Gobbo, B., & Colombo, G. (2019). O papel do designer no contexto do
data sprint. In J.J. Omena (Ed.), Métodos digitais: Teoria-Prática-Crítica (pp.
161–180). Lisbon: ICNOVA.
Meyer, B. (1998). Object-oriented software construction (2nd ed.). Santa Barbara
(California), USA: Prentice Hall.
Miller, P. (2018). Web apps are only getting better. Retrieved from
https://www.theverge.com/circuitbreaker/2018/4/11/17207964/web-apps-quality-pwa-webassembly-houdini
Mindfire Solutions. (2017). How important is JavaScript for Modern Web
Developers? Retrieved from https://medium.com/@mindfiresolutions.usa/how-important-is-javascript-for-modern-web-developers-2854309b9f52
Mintz, A. (2019). Stock scraper (software). Retrieved from
https://github.com/amintz/stock-scraper
Mintz, A., Silva, T., Gobbo, B., Pilipets, E., Azhar, H., Takamitsu, H., Omena, J. J.,
& Oliveira, T. (2019). Interrogating vision APIs [Smart Data Sprint 2019].
Universidade Nova de Lisboa. https://smart.inovamedialab.org/smart-2019/project-reports/interrogating-vision-apis/
Mintz, A. (2018a). Image Network Plotter (software). Retrieved from
https://github.com/amintz/image-network-plotter
Mintz, A. (2018b). Memespector Python (software). Retrieved from
https://github.com/amintz/memespector-python
Moats, D., & Borra, E. (2018). Quali-quantitative methods beyond networks:
Studying information diffusion on Twitter with the Modulation Sequencer. Big
Data & Society, 5(1). https://doi.org/10.1177/2053951718772137
Moraes, T. P. B., & Quadros, D. G. (2016). The crisis of Dilma Rousseff government
in 140 characters on Twitter: from #impeachment to #foradilma. Em debate:
Periódico de Opinião Pública e Conjuntura Política, 8(1), 14–21. http://bibliotecadigital.tse.jus.br/xmlui/handle/bdtse/3290
Morozov, E. (2018). There is a leftwing way to challenge big tech for our
data. Here it is. Retrieved September 29, 2020, from
https://www.theguardian.com/commentisfree/2018/aug/19/there-is-a-leftwing-way-to-challenge-big-data-here-it-is?CMP=twt_gu
Mozilla Developer Networks. (2020). CSS: Cascading Style Sheets. Retrieved from
https://developer.mozilla.org/en-US/docs/Web/CSS
Mozilla Developers Network. (2020). What is JavaScript? Retrieved November 9,
2020, from https://developer.mozilla.org/en-US/docs/Web/JavaScript/About_JavaScript
Munk, A. K., Madsen, A. K., & Jacomy, M. (2019). Thinking through the databody:
Sprints as experimental situations. In Å. Mäkitalo, T. Nicewonger, & M. Elam
(Eds.), Designs for Experimentation and Inquiry: Approaching Learning and
Knowing in Digital Transformation (1st ed., pp. 110–128). London: Routledge.
Murthy, D., Powell, A. B., Tinati, R., Anstead, N., Carr, L., Halford, S. J., & Weal,
M. (2016). Bots and political influence: A sociotechnical investigation of social
network capital. International Journal of Communication, 10(June), 4952–4971.
Murugesan, S. (2007). Understanding Web 2.0. IEEE Computer Society, (August),
1–10. https://doi.org/10.1109/MITP.2007.78
Napoli, P. M. (2008). Toward a model of audience evolution: New technologies and
the transformation of media audiences. McGannon Center Working Paper
Series.
Niederer, S. & Colombo, G. (2019). Visual Methodologies for Networked Images:
Designing Visualizations for Collaborative Research, Cross-platform Analysis,
and Public Participation. Diseña, (14), 40-67.
Niederer, S., & van Dijck, J. (2010). Wisdom of the crowd or technicity of content?
Wikipedia as a sociotechnical system. New Media and Society, 12(8), 1368–
1387. https://doi.org/10.1177/1461444810365297
Noble, Safiya Umoja. (2018). Algorithms of oppression: How search engines
reinforce racism. New York: NYU Press.
Norman, D. (1988). The Psychology of Everyday Things. New York: Basic Books.
O’Reilly, T. (2005). What is Web 2.0: Design patterns and business models for the
next generation of software. Retrieved from
https://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html
Omena, J. J. & Amaral, I. (2019). Sistema de leitura de redes digitais
multiplataforma. In Janna Joceli Omena (Ed.), Métodos Digitais: Teoria-Prática-Crítica (pp. 121–140). Lisbon: ICNOVA. Retrieved from
https://www.researchgate.net/publication/339434985_Sistema_de_leitura_de_redes_digitais_multiplataforma
Omena, J. J., & Rosa, J. M. (2017). “Brazil went to the streets”- Again! Studies of
protests on social networks. In C. Camponez, F. Pinheiro, J. Fernandes, M.
Gomes, & R. Sobreira (Eds.), Comunicação e Transformações Sociais, Vol II:
Comunicação Política, Comunicação Organizacional e Institucional e Cultura
Visual (Atas do IX Congresso da SopCom) (pp. 51–74). Associação Portuguesa
de Ciências da Comunicação.
Omena, J.J. (2019). O que são métodos digitais? In J.J. Omena (Ed.), Métodos
Digitais: teoria-prática-crítica (pp. 1–15). Lisbon: ICNOVA.
Omena, J.J., & Granado, A. (2020). Call into the platform! Revista ICONO14
Revista Científica de Comunicación y Tecnologías Emergentes, 18(1), 89–122.
https://doi.org/10.7195/ri14.v18i1.1436
Omena, J.J., Chao, J., Pilipets, E., Kollanyi, B., Zilli, B., Flaim, G., … Nero, S.
(2019). Bots and the black market of social media engagement.
https://doi.org/10.13140/RG.2.2.30518.52804
Omena, Janna Joceli, Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for
Hashtag Engagement Research. Social Media + Society, (July-September), 1–
18. https://doi.org/10.1177/2056305120940697
Omena, Janna Joceli, Rabello, E. T., & Mintz, A. (2017). Visualising hashtag
engagement: imagery of political polarisation on Instagram. Retrieved June 12,
2020, from
https://wiki.digitalmethods.net/Dmi/InstagramLivenessVisualisingengagement
Omena, Janna Joceli, Deem, A., Gobbo, B., Van Geenen, D., Alves, D., Kannasto,
E., … Israel Turin, V. (2020). Cross-platform digital networks: Exploring the
narrative affordances of force-directed layouts and data relations nature.
Retrieved from https://smart.inovamedialab.org/2020-digital-methods/project-reports/cross-platform-digital-networks/
Omena, Janna Joceli. (2017). Insta Bots and the black market of social media
engagement – The Social Platforms. Retrieved June 12, 2020, from
https://thesocialplatforms.wordpress.com/2017/12/21/insta-bots-and-the-black-market-of-social-media-engagement/
Omena, Janna Joceli. (2019). Métodos Digitais: teoria-prática-crítica. (Janna Joceli
Omena, Ed.). Lisbon: ICNOVA. Retrieved from
https://www.icnova.fcsh.unl.pt/metodos-digitais-teoria-pratica-critica/
Ooghe-Tabanou, B., Jacomy, M., Girard, P., & Plique, G. (2018). Hyperlink is not
dead! In ACM International Conference Proceeding Series (Vol. 2, pp. 12–18).
New York, New York, USA: Association for Computing Machinery.
https://doi.org/10.1145/3240431.3240434
Osoba, O. A. & Welser IV, W. (2017). An Intelligence in Our Image: The Risks of
Bias and Errors in Artificial Intelligence. Rand Corporation.
Papacharissi, Z. (2015). Affective publics: Sentiment, technology, and politics.
Oxford University Press.
Parnas, D. L. (1971). Information Distribution aspects of design methodology.
Retrieved from http://nova.campusguides.com/hpdilll
Pasquale, F. (2016). The Black Box Society – The Secret Algorithms That Control
Money and Information. Cambridge, Massachusetts London, England: Harvard
University Press.
Pearce, W., Özkula, S. M., Greene, A. K., Teeling, L., Bansard, J. S., Omena, J. J., &
Rabello, E. T. (2018). Visual cross-platform analysis: digital methods to
research social media images. Information Communication and Society, 23(2),
161–180. https://doi.org/10.1080/1369118X.2018.1486871
Peeters, S., & Hagen, S. (2018). 4CAT: Capture and Analysis Toolkit.
Perriam, J., Birkbak, A., & Freeman, A. (2020). Digital methods in a post-API
environment. International Journal of Social Research Methodology, 23(3), 277–
290. https://doi.org/10.1080/13645579.2019.1682840
Petit, V. (2012). Ars Industrialis. Retrieved February 3, 2018, from
http://arsindustrialis.org/grammatisation
Pilipets, E. (2019). From Netflix Streaming to Netflix and Chill: The (Dis)Connected
Body of Serial Binge-Viewer. Social Media + Society, 5(4), 205630511988342.
https://doi.org/10.1177/2056305119883426
Pilipets, E., Flores, A. M. M., Flaim, G., Skazedonig, M., Sepúlveda, R., & Del
Nero, S. (2019). From “tumblr purge” to “female nipples.” Retrieved from
https://smart.inovamedialab.org/2020-digital-methods/project-reports/tumblr-purge-female-nipples/
Poell, T., Nieborg, D., & van Dijck, J. (2019). Platformisation. Internet Policy
Review, 8(4). https://doi.org/10.14763/2019.4.1425
Porikli, F., Shan, S., Snoek, C., Sukthankar, R., & Wang, X. (2018). Deep Learning
for Visual Understanding: Part 2 [From the Guest Editors]. IEEE Signal
Processing Magazine. https://doi.org/10.1109/MSP.2017.2766286
Portilla, J. H. I. (2013). Ciencias de la computación: ¿un reto para el pensamiento
decolonial? Revista Criterios, v. 20, n. 1, p. 91-99.
Pritchard, K. & Whiting, R. (2015). Taking stock: A visual analysis of gendered
ageing. Gender, Work & Organization, v. 22, n. 5, p. 510-528.
Quijano, A. (2010). Colonialidade do poder e classificação social. In: SANTOS, B.
(org). Epistemologias do Sul. São Paulo: Cortez, p. 84-130.
Rabello, E., Matta, G., Omena, J. J., Silva, T., Teixeira, A., Cano-Orón, L., … Costa,
A. R. (2018). Visualising engagement on Zika epidemic: public health and social
insights from platform data analysis.
https://doi.org/10.13140/RG.2.2.26627.32800/1
Raji, I. D. & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of
publicly naming biased performance results of commercial AI products. In:
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society.
p. 429-435.
Rambukkana, N. (Ed.). (2015). Hashtag publics: The power and politics of discursive
networks. Peter Lang.
Ribeiro, M. M., & Ortellado, M. (2016). Digital profile of protesters of 13th and 18th
of March. El País.
https://brasil.elpais.com/brasil/2016/03/28/opinion/1459128271_535467.html
Ricci, D., Colombo, G., Meunier, A., & Brilli, A. (2017, June 28–30). Designing
digital methods to monitor and inform Urban Policy. The case of Paris and its
urban nature initiative [Conference session]. Proceedings of the 3rd International
Conference on Public Policy (ICPP3), Singapore.
Rieder, B. (2015). Analyzing Social Media with Digital Methods. Possibilities,
Requirements, and Limitations. Retrieved October 26, 2015, from
https://www.slideshare.net/bernhardrieder/analyzing-social-media-with-digital-methods-possibilities-requirements-and-limitations
Rieder, B. (2015). Visual tagnet explorer [Computer software].
https://tools.digitalmethods.net/netvizz/instagram/
Rieder, B. (2016). Closing APIs and the public scrutiny of very large online
platforms. http://thepoliticsofsystems.net/2016/05/closing-apis-and-the-public-scrutiny-of-very-large-online-platforms/
Rieder, B. (2017). Memespector. Retrieved from
https://github.com/bernorieder/memespector
Rieder, B. (n.d.). Textanalysis [Computer software]. http://labs.polsys.net/tools/textanalysis/
Rieder, Bernhard, & Röhle, T. (2018). Digital Methods From Challenges to Bildung.
In M. T. Schaefer & K. van Es (Eds.), The Datafied Society (pp. 109–124).
Amsterdam University Press. https://doi.org/10.1515/9789048531011-010
Rieder, Bernhard, & Röhle, T. (2019). Métodos digitais: dos desafios à Bildung. In
Janna Joceli Omena (Ed.), Métodos Digitais: teoria-prática-crítica (pp. 19–36).
Lisbon: ICNOVA. Retrieved from https://www.icnova.fcsh.unl.pt/metodos-digitais-teoria-pratica-critica/
Rieder, Bernhard, Abdulla, R., Poell, T., Woltering, R., & Zack, L. (2015). Data
critique and analytical opportunities for very large Facebook Pages: Lessons
learned from exploring “We are all Khaled Said.” Big Data & Society, 2(2),
2053951715614980. https://doi.org/10.1177/2053951715614980
Rieder, Bernhard, Coromina, Ò., & Matamoros-Fernández, A. (2020). Mapping
YouTube. First Monday, 25(8). https://doi.org/10.5210/fm.v25i8.10667
Rieder, Bernhard, Matamoros-Fernández, A., & Coromina, Ò. (2018). From ranking
algorithms to ‘ranking cultures’: Investigating the modulation of visibility in
YouTube search results. Convergence, 24(1), 50–68.
https://doi.org/10.1177/1354856517736982
Rieder, Bernhard. (2013). Studying Facebook via Data Extraction: The Netvizz
Application. Proceedings of WebSci ’13, the 5th Annual ACM Web Science
Conference, 346–355. https://doi.org/10.1145/2464464.2464475
Rieder, Bernhard. (2015). YouTube Data Tools (Version 1.11).
Rieder, Bernhard. (2020). Engines of Order: a mechanology of algorithmic
techniques. Amsterdam University Press.
Roberts, L. G. (1963). Machine perception of three-dimensional solids. Retrieved
from
https://www.researchgate.net/publication/220695992_Machine_Perception_of_Three-Dimensional_Solids
Rogers, R. (2017). Foundations of Digital Methods: Query Design. In M. T. Schäfer
& K. van Es (Eds.), The Datafied Society. Studying Culture through Data (pp.
75–94). Amsterdam: Amsterdam University Press.
https://doi.org/10.5117/9789462981362
Rogers, R. (2018). Otherwise Engaged: Social Media from Vanity Metrics to Critical
Analytics. International Journal of Communication, 12, 450–472.
Rogers, Richard, & Lewthwaite, S. (2019). Teaching Digital Methods: Interview
with Richard Rogers. Interviewer: S. Lewthwaite. Revista Diseña, (14), 12–37.
https://doi.org/10.7764/disena.14.12-37
Rogers, Richard. (1996). The future of STS on the web, or: What I learned (naively)
making the EASST website. EASST Review, 15(2), 25–27.
https://www.researchgate.net/publication/239841669_The_Future_of_Science_and_Technology_Studies_on_the_Web
Rogers, Richard. (2009). The End of the Virtual: Digital Methods.
Amsterdam: Amsterdam University Press.
https://doi.org/10.5117/9789056295936
Rogers, Richard. (2010). Internet Research: The Question of Method—A Keynote
Address from the YouTube and the 2008 Election Cycle in the United States
Conference. Journal of Information Technology & Politics, 7(2–3), 241–260.
https://doi.org/10.1080/19331681003753438
Rogers, Richard. (2013). Digital Methods. Cambridge, MA: MIT Press.
Rogers, Richard. (2015). Digital Methods for Web Research. In R. Scott & S.
Kosslyn (Eds.), Emerging Trends in the Behavioral and Social Sciences (pp. 1–
22). Hoboken, NJ: Wiley. https://doi.org/10.1002/9781118900772
Rogers, Richard. (2018). Digital Methods for Cross-platform Analysis. In J. Burgess
et al. (Eds.), The SAGE Handbook of Social Media (pp. 91–108). 55 City
Road, London: SAGE Publications Ltd.
https://doi.org/10.4135/9781473984066.n6
Rogers, Richard. (2019). Doing digital methods. London: Sage.
Rosa, J. M., Omena, J. J., & Cardoso, D. (2018). Watchdogs in the social network: A
polarized perception? Observatório, 12(5), 98–117.
Rose, G. (2016). Visual Methodologies (4th Ed.). UK: Open University.
Ruppert, E., Law, J., & Savage, M. (2013). Reassembling Social Science Methods:
The Challenge of Digital Devices. Theory, Culture & Society, 30(4), 22–46.
https://doi.org/10.1177/0263276413484941
Rykov et al. (2016). Semantic and geospatial mapping of Instagram Images in Saint-Petersburg. Proceedings of the AINL FRUCT 2016 Conference, Saint-Petersburg, Russia, 10-12 November 2016. Retrieved from
http://ieeexplore.ieee.org/servlet/opac?punumber=7889413
Shirky, C. (2004). Situated Software. Retrieved July 23, 2020, from
https://www.gwern.net/docs/technology/2004-03-30-shirky-situatedsoftware.html
Silva, T., Barciela, P. & Meirelles, P. (2019). Mapeando Imagens de Desinformação
e Fake News Político-Eleitorais com Inteligência Artificial. 3o CONEC:
Congresso Nacional de Estudos Comunicacionais Da PUC Minas Poços de
Caldas - Convergência e Monitoramento, 413–427, 2018. Retrieved from
https://conec.pucpcaldas.br/wp-content/uploads/2019/06/anais2018.pdf
Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H. T., …
Azhar, H. (2020). APIs de Visão Computacional: investigando mediações
algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1), 25–54.
https://doi.org/10.12957/logos.2020.51523
Silva, T.; Meirelles, P.; Apolonio, B. (2018). Visão Computacional nas Mídias
Sociais: Estudando imagens de #Férias no Instagram. Presented at I Encontro
Norte e Nordeste da ABCiber, São Luís.
Simondon, G. (1980). On the Mode of Existence of Technical Objects. University of
Western Ontario (Vol. 1). https://doi.org/10.1017/CBO9781107415324.004
Simondon, G. (2009). Technical Mentality. Parrhesia, 17–27. Retrieved from
http://www.mediafire.com/?emywtgzmmmn
Simondon, G. (2017). On the mode of existence of technical objects. Minnesota,
USA.: University of Minnesota Press.
Small, T. A. (2011). What the hashtag? A content analysis of Canadian politics on
Twitter. Information, Communication & Society, 14(6), 872–895.
Smeulders et al. (2000). Content-based image retrieval at the end of the early years.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349–
1380.
Srnicek, N. (2017). Platform Capitalism. Cambridge, UK; Malden, MA: Polity Press.
Stiegler, B. (2006). Anamnesis and hypomnesis: The memories of desire. In L.
Armand & A. Bradley (Eds.), Technicity (pp. 15–41). Litteraria Pragensia.
Stiegler, B. (2011). The Decadence of Industrial Democracies. Cambridge, UK:
Polity Press.
Stiegler, B. (2018). Technologies of memory and imagination.
Stiegler, Bernard. (2012). Die Aufklärung in the Age of Philosophical Engineering.
Computational Culture. https://doi.org/10.1017/CBO9781107415324.004
Sturm, R., Pollard, C., & Graig, J. (2017). Application Performance Management
(APM) in the digital Enterprise: Management of Traditional Applications.
Retrieved from
https://www.sciencedirect.com/book/9780128040188/application-performance-management-apm-in-the-digital-enterprise
Tableau Desktop. (2018). Tableau (Version 10.4.6) [Computer software].
https://www.tableau.com/products/desktop
Taibi, D., Rogers, R., Marenzi, I., Nejdl, W., Ahmad, Q. A. I., & Fulantelli, G.
(2016). Search as research practices on the web: The SaR-Web platform for
cross-language engine results analysis. WebSci 2016 - Proceedings of the 2016
ACM Web Science Conference, 367–369.
https://doi.org/10.1145/2908131.2908201
Tavares, F. D. M. B., Berger, C., & Vaz, P. B. (2016). A foreseen coup: Lula,
Dilma and the pro-impeachment discourse on Veja magazine. Pauta Geral:
Estudos em Jornalismo, 3(2), 20–44.
http://www.revistas2.uepg.br/index.php/pauta/article/view/9174
Tiidenberg, K. (2020). Research Ethics, Vulnerability, and Trust on the Internet. In
J. Hunsinger, M. Allen, & L. Klastrup (Eds.), Second International Handbook of
Internet Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-1555-1_55
Tifentale, A. (2015). Making sense of the selfie: Digital image-making and image-sharing in social media. Scriptus Manet, 1, 47–59.
Tifentale, A. & Manovich, L. (2015). Selfiecity: Exploring photography and self-fashioning in social media. In Postdigital Aesthetics (pp. 109–122). Springer.
Tiidenberg, K., & Baym, N. K. (2017). Learn it, buy it, work it: Intensive pregnancy
on Instagram. Social Media + Society, 3(1).
https://doi.org/10.1177/2056305116685108
Toft-Nielsen, C., & Nørgård, R. T. (2015). Expertise as gender performativity and
corporeal craftsmanship. Convergence, 21(3), 343–359.
https://doi.org/10.1177/1354856515579843
Tuters, M., Jokubauskaitė, E., & Bach, D. (2018). Post-Truth Protest: How 4chan
Cooked Up the Pizzagate Bullshit. M/C Journal, 21(3).
https://doi.org/10.5204/mcj.1422
van Dijck, José, Poell, T., & de Waal, M. (2018). The Platform Society (Vol. 1).
Oxford University Press. https://doi.org/10.1093/oso/9780190889760.001.0001
Van Dijck, José. (2013). The Culture of Connectivity: A Critical History of Social
Media. New York: Oxford University Press.
Van Dijck, José. (2020). Seeing the forest for the trees: Visualizing platformization
and its governance. New Media and Society.
https://doi.org/10.1177/1461444820940293
Venturini, T. (2010). Diving in magma: How to explore controversies with actor-network theory. Public Understanding of Science, 19(3), 258–273.
Venturini, T., Jacomy, M., & Pereira, D. (2015). Visual Network Analysis.
SciencesPo Media Lab working paper.
Venturini, T., Jacomy, M., Bounegru, L., & Gray, J. (2018). Visual Network
Exploration for Data Journalists. In S. Eldridge II & B. Franklin (Eds.),
Handbook to Developments in Digital Journalism Studies. Abingdon:
Routledge.
Venturini, Tommaso, & Rogers, R. (2019). “API-based research” or how can digital
sociology and digital journalism studies learn from the Cambridge Analytica
affair. Digital Journalism, (Forthcoming).
https://doi.org/10.1080/21670811.2019.1591927
Venturini, Tommaso, Bounegru, L., Gray, J., & Rogers, R. (2018). A reality
check(list) for digital methods. New Media & Society, 20(11), 4195–4217.
https://doi.org/10.1177/1461444818769236
Venturini, Tommaso, Jacomy, M., & Jensen, P. (2019). What do we see when we
look at networks. An introduction to visual network analysis and force-directed
layouts. SSRN, (1). Retrieved from
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3378438
Venturini, Tommaso, Jacomy, M., Meunier, A., & Latour, B. (2017). An unexpected
journey: A few lessons from Sciences Po médialab’s experience. Big Data &
Society, 4(2), 205395171772094. https://doi.org/10.1177/2053951717720949
Vis, F. (2013). A critical reflection on Big Data: Considering APIs, researchers and
tools as data makers. First Monday, 18(10).
https://doi.org/10.5210/fm.v18i10.4878
Warwick, C., Terras, M., & Nyhan, J. (2012). Digital Humanities in Practice. (C.
Warwick, M. Terras, & J. Nyhan, Eds.). Cambridge University Press.
https://doi.org/10.29085/9781856049054
Wattenberg, M., & Viégas, F. B. (2008). The word tree, an interactive visual
concordance. IEEE Transactions on Visualization and Computer Graphics,
14(6), 1221–1228. https://doi.org/10.1109/TVCG.2008.172
Watts, D. J. (2007). A twenty-first century science. Nature, 445(7127), 489.
https://doi.org/10.1038/445489a
Weltevrede, E., & Borra, E. (2016). Platform affordances and data practices: The
value of dispute on Wikipedia. Big Data and Society, 3(1), 1–16.
https://doi.org/10.1177/2053951716653418
West, C. (2018). The Lean In Collection: Women, Work, and the Will to Represent.
Open Cultural Studies, 2(1), 430–439.
Wiki, O. S. M. (2017). GDF. Retrieved June 14, 2020, from
https://wiki.openstreetmap.org/wiki/GDF
Wikipedia. (2015). Gephi. Retrieved June 4, 2020, from
https://en.wikipedia.org/wiki/Gephi
Wikipedia. (2020). Coupling (computer programming). Retrieved June 5, 2020, from
https://en.wikipedia.org/wiki/Coupling_(computer_programming)#cite_note-ISOIECTR19759_2005-2
Williams, R. (1989). Culture is ordinary. In R. Williams (Ed.), Resources of hope,
culture, democracy, socialism (pp. 3–14). Verso.
Wilson, C. (2017, April 6). I spent two years botting on Instagram— Here’s what I
learned [Blog post]. PetaPixel. https://petapixel.com/2017/04/06/spent-two-years-botting-instagram-heres-learned/
Woolley, S. C. (2016). Automating Power: Social bot interference in global politics.
First Monday, 21(4).
Woolley, S. C., & Howard, P. N. (2016). Political Communication, Computational
Propaganda, and Autonomous Agents. International Journal of Communication,
10, 4882–4890.
World Wide Web Consortium. (n.d.). HTML & CSS. Retrieved November 9, 2020,
from https://www.w3.org/standards/webdesign/htmlcss#whatcss
Yoshihara, N. (2011). Development History of Wire Rods for Valve Springs.
KOBELCO TECHNOLOGY REVIEW. Retrieved from
https://www.semanticscholar.org/paper/Development-History-of-Wire-Rods-for-Valve-Springs-Yoshihara/3ca2f39b7e0c18167e116331a9383f4488acc83b
Zhang, Z. (2020). Infrastructuralization of Tik Tok: transformation, power
relationships, and platformization of video entertainment in China. Media,
Culture and Society. https://doi.org/10.1177/0163443720939452
Appendices