Digital Methods and technicity-of-the-mediums. From regimes of functioning to digital research.

Janna Joceli Omena
Doctoral thesis in Digital Media, 2021

Thesis submitted to meet the requirements for obtaining the degree of Doctor of Philosophy in Digital Media by the UT Austin-Portugal Digital Media programme, carried out under the scientific guidance of Researcher Professor Dr. Jorge Martins Rosa and Dr. Tommaso Venturini. This work was funded by the Foundation for Science and Technology, grant number PD/BD/128252/2016.

Thesis submission: 28 December 2020. Revised submission: 11 June 2021. Doctoral exams: 21 September 2021.

Jury composition:
Cristina Ponte (president of the jury), Full Professor, NOVA FCSH, Universidade Nova de Lisboa, Portugal
Tommaso Venturini (supervisor), Chargé de recherche, CNRS Center for Internet and Society, France
Bernhard Rieder, Associate Professor, University of Amsterdam, The Netherlands
Ariadna Matamoros-Fernández, Lecturer in Digital Media, Queensland University of Technology, Australia
Liliana Bounegru, Lecturer in Digital Methods, Department of Digital Humanities, King’s College London, United Kingdom
Paulo Nuno Vicente, Assistant Professor, NOVA FCSH, Universidade Nova de Lisboa, Portugal
Carla Morais, Assistant Professor, Faculdade de Engenharia, Universidade do Porto, Portugal

To Colin Mcmillan, a beloved friend and master. (in memoriam)

ACKNOWLEDGEMENTS

Gratidão é a memória do coração
Dom Paulo Garcia

I want to thank God for bringing me here (what a journey!), for always being there as my strength and hope in difficult times. You are my every good thing.

Finishing this dissertation in time was a burning and fascinating process that could never have been completed without the support of a list of people who have, in different ways, played crucial roles in my PhD journey. Starting with iNOVA Media Lab, the environment that enabled me to create and coordinate the SMART Data Sprint and a research group named Social Media Research Techniques (SMART), a golden opportunity to develop research in an area that did not yet exist at NOVA FCSH. I would like to thank, particularly, Paulo Nuno Vicente and António Granado. Paulo invited me to join the iNOVA Media Lab from the start, believed in my work and opened several opportunities for me. I will always be grateful for that, thank you meu coordenador! In my first year as a PhD student, António’s lectures served as an inspiration to this thesis in so many ways (from a research project that turned into the SMART Data Sprint, to a good conversation after class about politics and Instagram). António’s support and invitation to collaborate with him were priceless. És o maior! Many thanks also to all my colleagues at iNOVA Media Lab for inspiring conversations throughout the past years, particularly those who are part of the SMART research group: Ana Marta Flores, Elena Pilipets, Jason Chao and Rita Sepúlveda.

Universidade Nova de Lisboa has been my home since 2013, when I began a master's degree at NOVA FCSH. That same year I met Jorge Martins Rosa, who has since been my master’s and doctoral supervisor. I am thankful to him for all these years, for being always available either to listen or to develop collaborative projects.
As a supervisor, he has always considered my focused interest in digital methods, giving me the freedom to seek my own paths. This has certainly made a difference in my journey.

A big thanks for all the support given by NOVA Desporto, to the athlete-students, and especially to all my basketball teammates and coaches, who were definitely part of this work, although not exactly academically. A big thanks to Roberto Henriques for inviting me to teach social media analytics at NOVA Information Management School, four groups in total, which coincided with the beginning of the pandemic. What a challenge! This opportunity had a direct impact on the reflections of this dissertation, besides being valuable for developing my teaching skills.

Before I began my PhD studies, two incredible minds unintentionally introduced me to the concept of technicity, conversations that aroused my curiosity and later turned my attention and interest to it. A big thanks to Manuel Bogalheiro for the several inspirational talks we had, and to Bernhard Rieder, who has inspired a great deal of this dissertation. Two conversations I had with Bernhard were definitely game changers on my PhD journey: the first was in July 2015 (at that time I hadn't even begun my PhD course); the second was when I was already at the most advanced stage of the thesis. Bernhard has always cared and been available, either by responding to my email requests or by accepting the invitations to collaborate with SMART Data Sprint. He has also shared his work, a few times asking for my general impressions (when I certainly had more to receive than to give). I cannot thank him enough for that.

I am indebted to Tommaso Venturini for his valuable suggestions and criticism in his capacity as thesis co-supervisor. Being supervised by him was certainly one of the best things that happened to me in the PhD journey. Tommaso has been an inspiring advisor, always asking crucial questions to refine and complicate my thinking and helping me to see the value of my work. This thesis would not be what it is without Tommaso’s help and intervention. I could not be more grateful to have him as my supervisor (I have learnt so much!); he has been the best supervisor ever.

This thesis has greatly benefited from the Digital Methods Initiative, from which I learned to do digital methods and also to share knowledge. My first collaborative experience with DMI took place during the 2014 Winter School in Amsterdam. This was undoubtedly a decisive factor in my journey towards the development of this thesis. Over the years, at DMI Summer Schools, I have met amazing researchers and started to work in collaboration with some of them. For this, I want to thank everyone who makes DMI a reality and encourage them all to keep going. You make the difference! Many thanks to the super talented designer Beatrice Gobbo (whom I first met at the DMI Summer School in 2017) for collaborating on the key visualisations of this thesis.

Last but not least, I want to say how grateful I am to my family and dearest friends, for their love and support. My great thanks to Isis Cavalcanti, Maria do Carmo, Carlisi Omena, Jocelina Omena, Carlos and Rosa Omena. One million thanks to João Fonseca for his love and relentless support, for understanding my bad moods, for celebrating with me all achievements and failures. Thanks to Cristina, Guida, Orlando and Luís Fonseca for your warm welcome and care.
A heartfelt thanks to Natália de Santana and Andrea Veruska: even though physically separated by the ocean, they have both always been present, always cheering me up. My special gratitude to Margarida Sousa and Katielle Silva, not only for their friendship but for their kindness of heart, offering me support in difficult times. I want to express my gratitude to Carla Nave, Ana Marta Flores, Camila Wohlmuth and Ariana Mencaroni for their friendship, great company and PhD-related conversations watered with a rosé or red wine. You are part of this! Many thanks to my dear friend Inês Amaral for her always kind and supportive words, and also to my dears Liliana Rosa, Ricardo Miguel, Frédéric Gaspar (Doudou), Carlos Amaral, Ana Miguel, Luís Silva, Elsa Caetano and Luis Oliveira Martins for all their support and care, for all the great moments together. They have all inspired me to keep going and I am simply so grateful to them.

ACKNOWLEDGEMENTS: CO-AUTHORED ARTICLES

This doctoral thesis has benefited so much from the collaboration of amazing researchers, resulting in several articles and data sprint reports, a book chapter, two grant awards and a book edition.

Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for Hashtag Engagement Research. Social Media + Society. https://doi.org/10.1177/2056305120940697

Chapter 4 corresponds to a two-year collaboration with Elaine Teixeira Rabello and André Mintz. This article originated from an unexpected encounter with two incredible people and researchers (today my friends) at the DMI Summer School 2017. It was my third experience in this kind of event, and I was almost ready to pitch a project about hashtags and political polarisation in Brazil. I say almost because the project only became viable after meeting André (who on the first day of the sprint sat on my right side) and Elaine (who on the same day sat on my left side). I am so grateful to have met them both, fruitful relationships of work and friendship (#instagood forever!). The article thus systematised approaches explored in two data sprints (DMI Summer School 2017 and SMART Data Sprint 2018), with early results presented at the ECREA Digital Culture and Communication Section Conference (November 2017, Brighton, UK). I would also like to thank the researchers and designers who participated in the data sprint projects.

Omena, J. J., & Granado, A. (2020). Call into the platform! Merging platform grammatisation and practical knowledge to study digital networks. Icono 14, 18(1), 89-122. https://doi.org/10.7195/ri14.v18i1.1436

Chapter 5 corresponds to a three-year collaboration with António Granado that started in early 2017, when António invited and challenged me to use digital methods to study how Portuguese universities were using Facebook to communicate. We presented some preliminary findings at the Science Communication Conference - SciCom PT 2017. In 2018, we focused on Google Vision API and the visual exploration of different networks. The results of this work culminated in an article published in Icono 14 that was also approved at the 8th ECREA Conference 2020 by the Visual Cultures committee (the conference was postponed to 2021 due to the pandemic). I would also like to thank Fábio Gouveia for helping us to select a specific group of images in our folder of over 22,000 images, which facilitated the analysis.

Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H., Pilipets, E., & Azhar, H. (2020).
APIs de Visão Computacional: investigando mediações algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1). https://doi.org/10.12957/logos.2020.51523

Chapter 6 corresponds to an article written in collaboration with Tarcízio Silva, André Mintz, Beatrice Gobbo, Taís Oliveira, Helen Takamitsu, Elena Pilipets and Hamdan Azhar. This article was born from an invitation I made to Tarcízio Silva, in mid 2018, to join the SMART Data Sprint 2019 at Universidade Nova de Lisboa. I helped Tarcízio to develop the project before it was presented at the data sprint. During the event, and together with André Mintz and Beatrice Gobbo, I helped in developing and operationalising the research design and methods. A version of this study was also presented at the National Symposium on Science, Technology and Society in 2019 in the city of Belo Horizonte, Brazil.

Omena, J. J., & Amaral, I. (2019). Sistema de leitura de redes digitais multiplataforma. In Métodos Digitais: Teoria-Prática-Crítica, edited by Janna Joceli Omena. Lisboa: ICNOVA. ISBN: 978-972-9347-34-4

This book chapter was written in collaboration with Inês Amaral, who is a colleague and also a dear friend. I first met Inês Amaral in 2014, when she was tutoring a workshop on network analysis with Gephi, which I attended while trying to make sense of how to read networks afforded by the Facebook Graph API through Gephi. I am so grateful for this meeting, which later saw us join forces to develop research projects. Two reasons justify the importance of this chapter for this thesis: i) the triangular understanding of platform grammatisation, cultures of use and software affordances as pillars to be considered when doing digital methods was first introduced in this book chapter; and ii) so were the narrative affordances of ForceAtlas2 for reading networks through fixed layers of interpretation. These approaches have been tried and tested within and outside data sprint environments (since 2018, and continue to be developed).

Pearce, W., Özkula, S. M., Greene, A. K., Teeling, L., Bansard, J. S., Omena, J. J., & Rabello, E. T. (2018). Visual cross-platform analysis: digital methods to research social media images. Information, Communication & Society. https://doi.org/10.1080/1369118X.2018.1486871

This paper, led by Warren Pearce and Suay Özkula, is already considered an agenda-setting digital methods article. I conceptualised and developed methods for the Instagram analysis and also contributed to the topic of cross-platform analysis with digital methods. Through this collaborative research experience, I could see in practice how crucial the roles of technical knowledge and technical practices are in the production of knowledge.

Rosa, J. M., Omena, J. J., & Cardoso, D. (2018). Watchdogs in the Social Network: A Polarized Perception? Observatório (OBS*), 12(5), Special issue “As Formas Contemporâneas dos Conflitos e das Apostas Digitais”. https://doi.org/10.15847/obsOBS12520181367

In this article, co-authored with Jorge Martins Rosa and Daniel Cardoso, I had the opportunity to contribute to the research design and implementation with digital methods. Here we explored a navigational research practice for interpreting digital networks according to their visual affordances.

Awarded grants & book edition

In 2018, I received a research grant from the UT Austin-Portugal Digital Media Program. The project was related to the technicity of social media platforms and the life of Instagram bots.
I want to thank Jason Chao for his collaboration on matters related to machine learning. Between 2018 and 2019 I edited the book Métodos Digitais: teoria-prática-crítica, composed of reference articles and original texts, also designed for the teaching of digital methods in Portuguese. In 2020, I received another research grant, this time from the Center for Advanced Internet Studies (CAIS), to promote working group meetings. I am coordinating this project, Stick & Flow: A critical framework for investigating bot engagement on social media, with Elena Pilipets. This thesis has benefited from both research projects, from the book edition and from these collaborations.

Digital Methods and technicity-of-the-mediums. From regimes of functioning to digital research.
Janna Joceli Omena

Abstract

Digital methods are taken here as a research practice crucially situated in the technological environment that it explores and exploits. Through software-oriented analysis, this research practice proposes to repurpose online methods and data for social and medium research, a practice not yet considered a proper type of fieldwork because these methods are new and still in the process of being described. These methods impose proximity with software and reflect an environment inhabited by technicity. This dissertation is thus concerned with a key element of the digital methods research approach: the computational (or technical) mediums as carriers of meaning (see Berry, 2011; Rieder, 2020). The central idea of this dissertation is to address the role of technical knowledge, practice and expertise (as problems and solutions) in the full range of digital methods, taking the technicity of the computational mediums and digital records as objects of study. By focusing on how the concept of technicity matters in digital research, I argue that not only do digital methods open an opportunity for further enquiry into this concept, but they also benefit from such enquiry, since the working materials of this research practice are the mediums, their methods, mechanisms and data. In this way, the notion of technicity-of-the-mediums is used in two senses: pointing, on the one hand, to the effort to become acquainted with the mediums (from a conceptual, technical and empirical perspective) and, on the other hand, to the object of technical imagination (the capacity of considering the features and practical qualities of technical mediums as an ensemble and as a solution to methodological problems). From the standpoint of non-developer researchers and the perspective of software practice, the understanding of digital technologies starts from direct contact with, comprehension of and different uses of (research) software and the web environment. The journey of digital methods is only fulfilled by technical practice, experimentation and exploration. Two main arguments are put forward in this dissertation. The first states that we can only repurpose what we know well, which means that we need to become acquainted with the mediums from a conceptual-technical-practical perspective; the second states that the practice of digital methods is enhanced when researchers make room for, grow and establish a sensitivity to the technicity-of-the-mediums. The main contribution of this dissertation is to develop a series of conceptual and practical principles for digital research.
Theoretically, this dissertation suggests a broader definition of medium in digital methods and introduces the notion of the technicity-of-the-mediums and three distinct but related aspects to consider – namely platform grammatisation, cultures of use and software affordances – as an attempt to defuse some of the difficulties related to the use of digital methods. Practically, it presents concrete methodological approaches providing new analytical perspectives for social media research and digital network studies, while suggesting a way of carrying out digital fieldwork which is substantiated by technical practices and imagination.

Keywords: digital methods, technicity, technical practices, networks, vision APIs, social media.

Métodos Digitais e tecnicidade-dos-mediums. De regimes de funcionamento à pesquisa digital.
Janna Joceli Omena

Resumo

Os métodos digitais são aqui tomados como uma prática de investigação crucialmente situada no ambiente tecnológico que explora e do qual tira benefício. Esta prática de pesquisa propõe a reorientação dos métodos online e dos dados para a pesquisa social e do meio através da análise orientada por software, prática ainda não considerada como um tipo adequado de trabalho de campo porque estes métodos são novos e a sua descrição está ainda numa fase incipiente. Estes métodos obrigam a adquirir familiaridade com o software e refletem um ambiente habitado pela tecnicidade. Esta dissertação diz assim respeito a um elemento-chave da abordagem de investigação dos métodos digitais: os meios computacionais (ou técnicos) enquanto portadores de significado (ver Berry, 2011; Rieder, 2020). A ideia central desta dissertação é a de refletir sobre o papel do conhecimento técnico, da prática técnica e da aquisição de competências (como problemas e como soluções) em todo o âmbito dos métodos digitais, assumindo a tecnicidade dos meios computacionais e dos registos digitais como objetos de estudo. Ao centrar-me na forma como o conceito de tecnicidade é fundamental na investigação digital, argumento que não só os métodos digitais abrem uma oportunidade para uma investigação mais aprofundada deste conceito, mas também que beneficiam deste tipo de investigação, uma vez que a matéria-prima desta prática de pesquisa são os meios, os seus métodos, mecanismos e dados. Deste modo, a noção de tecnicidade-dos-meios é utilizada em dois sentidos: apontando, por um lado, para a necessidade de conhecimento dos meios (duma perspetiva conceptual, técnica e empírica) e, por outro, para o objeto da imaginação técnica (a capacidade de tomar as características e as qualidades práticas dos meios computacionais como um conjunto [ensemble] e como uma solução para problemas metodológicos). Segundo o ponto de vista dos pesquisadores que não estão familiarizados com o desenvolvimento de software (ou de ferramentas digitais) bem como da perspectiva da prática do software, a compreensão das tecnologias digitais deve partir de um contato direto, da compreensão e dos diferentes usos do software e do ambiente da web. O percurso dos métodos digitais só pode ser concretizado pela prática técnica, pela experimentação e pela exploração. Dois argumentos principais são apresentados nesta dissertação.
O primeiro afirma que só podemos tirar proveito daquilo que conhecemos de forma aprofundada, o que significa que é necessário que nos familiarizemos com os meios numa perspetiva conceptual-técnica-prática, enquanto o segundo argumento afirma que a prática dos métodos digitais é aperfeiçoada quando os investigadores estão recetivos a, amadurecem e adquirem uma sensibilidade para a tecnicidade-dos-meios. A principal contribuição desta dissertação é o desenvolvimento de um conjunto de princípios conceptuais e práticos para a pesquisa digital. Teoricamente, esta dissertação propõe uma definição mais ampla de meio nos métodos digitais, introduz o conceito de tecnicidade-dos-meios e aponta para três facetas distintas mas relacionadas – referimo-nos à gramatização das plataformas, às culturas de utilização e às affordances do software –, como uma solução para minorar algumas das dificuldades relacionadas com a utilização dos métodos digitais. Na prática, apresenta abordagens metodológicas concretas que fornecem novas perspetivas analíticas para a investigação dos media sociais e para os estudos de redes digitais, ao mesmo tempo que sugere uma forma de levar a cabo trabalho de campo digital que é substanciada por práticas técnicas e pela imaginação técnica.

Palavras-chave: métodos digitais, tecnicidade, práticas técnicas, redes, APIs, media sociais

TABLE OF CONTENTS

DIGITAL METHODS AS AN ENVIRONMENT INHABITED BY TECHNICITY .... 1
SITUATING THE PROBLEMS OF METHODS IN DIGITAL RESEARCH .... 2
New media, new methods, new issues .... 2
The practice of digital methods demands a proximity with computational mediums .... 6
ACCOUNTING FOR THE TECHNICITY-OF-THE-MEDIUMS IN DIGITAL METHODS .... 7
Part one: reflecting the intersection of methods, technicity and digital fieldwork .... 11
Part two: mobilising technicity-of-the-mediums and three pillars of the digital methods approach to design research and present concrete methodological approaches .... 13
LIMITATIONS AND FURTHER WORK .... 17

1 UNPACKING DIGITAL METHODS .... 20
UNDERSTANDING DIGITAL METHODS .... 21
An historical perspective: the reasoning behind the methods .... 22
Terminology and definitions .... 24
Taking technological grammar into account .... 27
DOING DIGITAL METHODS .... 29
The art of querying .... 29
Technical practices, the makers and users of software .... 32
A checklist of questions related to the practice of digital methods .... 39
Data sprints as a form of learning .... 41
THE MANY CHALLENGES OF DIGITAL METHODS .... 43
Four underlying principles .... 43
Characterising technical knowledge and practice in digital methods .... 45
A call for a broader definition of “medium” in digital methods .... 47

2 THREE ATTEMPTS TO UNDERSTAND TECHNICITY .... 52
FIRST ATTEMPT TO UNDERSTAND TECHNICITY (MEDIA THEORY) .... 53
Ways of thinking technicity in Digital Media field studies .... 54
Technicity as a domain in reiterative and transformative practices .... 57
Common perceptions and appropriations .... 59
SECOND ATTEMPT TO UNDERSTAND TECHNICITY (A PHILOSOPHICAL PERSPECTIVE) .... 61
The awareness component: machine as elements, individuals and ensembles .... 63
The human function: to be acquainted with machines .... 68
From orders of thought to activity and back .... 74
The value of technical elements and technical imagination .... 77
THIRD ATTEMPT TO UNDERSTAND TECHNICITY (WITH DIGITAL METHODS) .... 80
The context and content of the network .... 80
Building a computer vision-based network (being acquainted with computational mediums) .... 84
Reading digital networks (orders of technical-practical thoughts) .... 87

3 DIGITAL FIELDWORK .... 95
GETTING ACQUAINTED WITH THE WEB ENVIRONMENT .... 96
A technical comprehension of the web: an overview .... 97
An architecture of participation: the role of web applications and APIs .... 103
From the web as a platform concept to platformisation .... 108
DIGITAL TECHNOLOGIES AND THE WEB AS THE LAST STAGE OF GRAMMATISATION .... 110
Making visible and tangible different types of memory, behavior, knowledge .... 111
From the metaphor of capture to an in-depth look over technological grammar .... 113
THREE PILLARS OF THE DIGITAL METHODS APPROACH .... 117
Platform Grammatisation .... 119
Cultures of use .... 123
The affordances of software .... 131
INTRODUCTION TO CHAPTER FOUR .... 138

4 HASHTAG ENGAGEMENT RESEARCH .... 142
INTRODUCTION .... 143
REVISITING THE ROLE OF HASHTAGS .... 144
Situating Hashtag Engagement .... 147
REASONING WITH AND THROUGH THE MEDIUM .... 148
Technicity .... 149
Platform Grammatisation .... 150
THE 3L PERSPECTIVE FOR STUDYING HASHTAG ENGAGEMENT .... 151
Layer 1: High-Visibility Versus Ordinary .... 152
Layer 2: Hashtagging Activity .... 153
Layer 3: Visual and Textual Content .... 153
THE PRAXIS OF HASHTAG ENGAGEMENT RESEARCH .... 154
Political Context, Scholarly Approaches, and Framing of the Brazilian Case .... 154
Operationalising the 3L Perspective .... 156
FINDINGS .... 158
High-Visibility Versus Ordinary .... 158
Hashtagging Activity .... 160
Visual and Textual .... 163
CONCLUSION .... 169
INTRODUCTION TO CHAPTER FIVE .... 172

5 DIGITAL NETWORKS .... 177
THE CASE OF PORTUGUESE UNIVERSITIES ON FACEBOOK .... 178
MATERIAL AND METHODS .... 180
RESULTS .... 183
Seeing beyond like connections .... 183
The imagery of Portuguese Universities .... 190
DISCUSSION .... 200
INTRODUCTION TO CHAPTER SIX .... 202

6 INTERROGATING COMPUTER VISION APIS .... 206
INTRODUCTION .... 207
COMPUTER VISION AND THE STUDY OF IMAGES .... 209
Computer Vision APIs: What are they? .... 211
Interrogating APIs and stock images websites: absences and hyper-visibilities .... 214
Granularity and standardisation in the semantic spaces of APIs .... 219
NETWORKS OF SEMANTIC SPACES AND TYPICALITY .... 224
CONCLUSION .... 229

TECHNICITY OF THE MEDIUMS IN DIGITAL METHODS .... 232
Part 1: For a technical culture of knowledge in digital research .... 235
Part 2: From technical knowledge and technical practices to new forms of enquiring .... 239
DEVELOPING A SENSITIVITY TO THE TECHNICITY-OF-THE-MEDIUMS IN DIGITAL METHODS .... 244
Making room for the computational mediums as carriers of meaning .... 247
Grow a sensitivity to technical elements while practising digital methods .... 247
Establishing a sensitivity to the technicity-of-the-mediums .... 248

REFERENCES .... 251
APPENDICES .... 274
DIGITAL METHODS AS AN ENVIRONMENT INHABITED BY TECHNICITY

INTRODUCTION

“[…] but the ways of thought are more important than the subject matter”
Marvin Minsky, 1988, p.323

“My thesis today is that you cannot distinguish between the technical question and the sociological question”
Tim Berners-Lee, 12 October 1995

Situating the problems of methods in digital research

New media, new methods, new issues

The need to develop new methods adapted to study new media is more present than ever in media studies and has sparked a vibrant discussion on technical expertise and knowledge, but also on practical and infrastructural issues, such as the need for research ethics, digital literacy, new data infrastructures and new graduate/undergraduate curricula in universities (see Lazer, Pentland, et al., 2009; Marres, 2017; Rieder & Röhle, 2019). Over the past decade, different approaches have emerged aiming to develop new theoretical frameworks and empirical fieldwork to comprehend our digital life and culture. The quest for more stable methods and epistemological regimes is reflected in innovative solutions and techniques to work with digital and digitalised sources and methods in digital humanities (see Berry, 2011; Warwick, Terras and Nyhan, 2012)1; cultural analytics (see Manovich, 2009, 2020); and computational social sciences2 (see Lazer, Brewer, et al., 2009; Lazer, Pentland, et al., 2009). On the one hand, new methods are constantly being developed to use digital technologies to respond to research tasks and to anticipate the needs of the expanding fields cited above. On the other hand, as criticised by Alexander Galloway, the new methods also reflect scholars’ individual appropriations; “what results is a field of infinite customization, where each thinker has a method tailored to his or her preferences” (Galloway, 2014, p. 109). The ease with which new methods can be created or customised can be problematic if it is not accompanied by an effort to think with or reflect on the software that supports these methods, and to acknowledge that we are facing a complex methodological moment in which concepts and knowledge are mobilised through computational mediums (see Rieder & Röhle, 2019). In this context, media scholars have been questioning the influence, bias and consequences of digital technologies “for social scientific ways of knowing” (Ruppert, Law, & Savage, 2013, p. 24), more precisely how digital technologies and data are “reconfiguring social science methods and the very assumptions about what we know about social and other relations” (Ruppert et al., 2013, p. 30).

1 For instance, quantitative studies that map and measure in detail cultural patterns by adopting different visualisation techniques to make sense of big data samples, known as cultural analytics (see Manovich, 2009, 2020). See a list of digital humanities journals and some initiatives in the following links: https://zenodo.org/record/3406564, https://guides.library.harvard.edu/c.php?g=310256&p=2071428. Here you find the official links for Manovich’s and Rogers’ respective initiatives: http://lab.culturalanalytics.info/ and https://wiki.digitalmethods.net/Dmi/DmiAbout
2 Computational Social Sciences encompasses “language, location and movement, networks, images, and video, with the application of statistical models that capture multifarious dependencies within data” (D. M. J. Lazer et al., 2020, p. 1060)
Scholars have, for example, reflected on how one can understand platform infrastructures as performative (van Dijck, Poell, & de Waal, 2018); how software and its methods “affect the way we generate, present and legitimise knowledge in the humanities and social sciences” (Rieder & Röhle, 2019); and how digital databases (and their navigational practice) could modify social theory (Latour, Jensen, Venturini, Grauwin, & Boullier, 2012). A common factor bringing these efforts together is the need for a heterogeneous understanding of digital technologies (Ruppert, Law & Savage, 2013), which need to be considered not only as an object and resource of social life but also as a means of sociological enquiry (see Latour et al., 2012; Marres, 2017; Rieder, 2020; Rogers, 2013). The discussion on the consequences of digital methods, however, has to a large extent remained conceptual, and its empirical investigation is still underdeveloped.

Another characteristic of the new methods is that, in one way or another, they all embody some form of computational turn in which the forms and features of the computational medium supporting the enquiry become “an intrinsic part of the research” (Berry, 2011, p. 3), a transformation that, according to David M. Berry (2011), runs deeper than the rise of quantitative approaches of analysis. These approaches, in fact, neither aim to nor are able to single out what is inherent to a computational medium or how this affects the research that is carried out through it; they measure data by using statistical or mathematical analysis. The transformations Berry and others refer to relate to a technical understanding3 of the computational medium, which leads to new research insights (Bogost & Montfort, 2009) but also to new ways of designing and responding to research questions in digital media studies (see Bounegru, Gray, Venturini, & Mauri, 2017; Gerlitz & Rieder, 2018; Rieder, Matamoros-Fernández, & Coromina, 2018; Rogers, 2019).

As a response to these transformations, the digital methods approach proposes to exploit the mechanisms of digital platforms (e.g. the ranking systems of search engine results) and their outputs (e.g. available as retrieved or scraped data). It encourages scholars to pay greater attention to how platforms work (to their forms, functions and mechanisms) and how they handle digitally native data. Digital methods (Rogers, 2013, 2019) make use of what is created by and for digital media and focus on data and methods that are “digitally native” rather than digitised4, e.g. uniform resource locators (URLs), hashtags, hyperlinks and ranking. By following this approach, researchers should consider the methods of digital media and their effects but also the concepts, practices and substance of the computational mediums they exploit. Accordingly, researchers should follow, repurpose and think along with the medium, while it evolves and changes (Rogers, 2013). By carrying out research with and about computational mediums, the web environment and data, digital methods also end up repurposing social research (Venturini et al., 2017). But how can researchers develop a research mindset that helps in thinking along with the medium? How can they mobilise empirical evidence in the digital methods approach? In what sense can digital methods be considered a type of fieldwork?
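As a first, deliberately simple illustration of what repurposing natively digital objects can look like in practice, the sketch below extracts and ranks hashtags and hyperlinks from a small sample of post texts. The sample posts, patterns and counts are invented for illustration only and do not come from the thesis's case studies; the point is that the hashtag, the URL and the ranked list are themselves objects of the medium that the researcher turns into research material.

```python
# A minimal, illustrative sketch (not part of the thesis's own pipelines):
# extracting "natively digital" objects (hashtags, hyperlinks) from a small,
# invented sample of posts and ranking them by frequency.
import re
from collections import Counter

posts = [
    "Protesters gather downtown #ForaDilma https://example.org/news/1",
    "Counter-demonstration planned #ForaTemer #Democracia",
    "Live coverage https://example.org/live #ForaDilma",
]

hashtag_pattern = re.compile(r"#\w+")
url_pattern = re.compile(r"https?://\S+")

hashtags = Counter(tag.lower() for post in posts for tag in hashtag_pattern.findall(post))
urls = Counter(url for post in posts for url in url_pattern.findall(post))

# Ranking is itself a method of the medium: the ordered list becomes the finding.
print(hashtags.most_common())  # [('#foradilma', 2), ('#foratemer', 1), ('#democracia', 1)]
print(urls.most_common())
```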
In this context, this dissertation strives to understand, explore and critique the digital methods approach and its proposal to reconceive some preconceived ideas of digital research – e.g. the longing for huge amounts of data (realising that more is not always better); the need to separate qualitative and quantitative (searching instead for quali-quantitative methods); and the tendency to consider research tasks like data collection as mechanical and automated activities (extra efforts are required to engage with the fieldwork; researchers need to be acquainted with the medium). The problem of methods here is addressed from the standpoint of non-developer researchers who “must pay close attention to more practical aspects of digital research” (Marres, 2017, p.93) and take technical knowledge and the technological environment into account. Even when they do not have advanced technical skills or experience in coding or software making, scholars need to be utterly attentive to the features of the computational medium in itself and in relation to other mediums, and to the ways in which its elements influence the design and implementation of research methods. This dissertation seeks to support, in practice, the argument about the dissolution of the quali/quanti divide in digital research, and acknowledges that “digital traceability and datascape navigation makes data and methods more continuous”, and thus “the micro/macro distinction appears less significant” (Venturini et al., 2017, p. 7). By doing this, I am entering into the new media, new methods, new issues debate on the theory and practice of digital methods, while aiming to interrogate the role of computational mediums and the technological environment in digital research. In referring to the technological environment, I wish to promote a technical comprehension of digital methods and research software. This methodological viewpoint is particularly salient in chapters 2 and 3. Discussing two case studies from my own research, I try to demonstrate how a proximity with computational mediums has changed the way knowledge was generated and presented.

I use the expression “computational (or technical) mediums” in a sense that encompasses but also exceeds the notion of communication media, inviting researchers to consider media not only as communication platforms, but also as living substances and mediating devices. This means that both the platforms from which data is derived (e.g. social media) or by which it is processed (e.g. algorithmic techniques, computer vision APIs) and research software are taken as mediums of “expressing a will and a means to know” because they have their own concepts, language and practices (see Rieder & Röhle 2018, p.123). Computational mediums here stand for research software, digital platforms and associated algorithmic techniques, whose outputs can be captured through API results or scraping and crawling methods.

3 Technical understanding here is both understood as related to the knowledge and methods of a particular medium as well as the practical skills and methods required to re-purpose media for research.
4 There is a difference of meaning between “born” and “native” digital material. David Berry (2011), in digital humanities, refers to electronic literature, interactive fiction and web-based artefacts as “born-digital materials”. These, in many cases, are digitalised objects.
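Returning to the point that the outputs of computational mediums can be captured through API results or scraping and crawling methods, the schematic sketch below shows what such a capture can look like at its most basic: querying a JSON endpoint and reshaping part of the response into research records. The endpoint, parameters and field names are hypothetical placeholders rather than any interface used in this thesis; what the sketch makes visible is that the platform's own structure (fields, limits, ranking) already frames what the researcher can repurpose.

```python
# A schematic sketch (not from the thesis) of capturing a computational medium's
# output through an API. The endpoint, parameters and field names are invented.
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example-platform.com/v1/search"  # hypothetical endpoint
params = {"query": "#climatechange", "count": 100}

url = BASE_URL + "?" + urllib.parse.urlencode(params)
with urllib.request.urlopen(url) as response:   # the API mediates what is visible
    payload = json.load(response)

# What the platform returns already embodies its regimes of functioning;
# the researcher repurposes that structure as data.
records = [
    {"id": item.get("id"), "text": item.get("text"), "likes": item.get("like_count")}
    for item in payload.get("items", [])
]
print(len(records), "records captured")
```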
When opting for the plural form mediums, I want to avoid the association with media in the sense of communication media, while stressing their potential to change the ways in which we ask and respond to research questions and produce (scientific) knowledge. In support of my use of medium, Berry argues that researchers should be “concentrated around the underlying computationality of the forms held within a computational medium”, thus proposing that scholars “look at the digital component of the digital humanities in the light of its medium specificity, as a way of thinking about how medial changes produce epistemic changes5” (Berry, 2011, p.4). Although not developing the notion itself, Berry is certainly referring to the role of software, tools and digital devices in digital research. Throughout this thesis, I provide concrete examples of the active role of computational mediums in different digital methods and their relationship with researchers’ technical knowledge. By being sensitive to the medium, I argue, we become capable of choosing and organising an ensemble of computational tools for research purposes, while acknowledging their influence and bias. We should, therefore, care about and make room for computational mediums in the design and implementation of digital methods and consider them as important as the contents or the objects of our research.

The practice of digital methods demands a proximity with computational mediums

Digital methods require some technical knowledge of digital platforms and computational tools, demanding that researchers care about the role of these mediums in research. As argued in the previous section, on the one hand, there is a need to recognise the participation of technology in the doing of social research but, on the other, researchers also need to understand computational mediums and their methods more practically (see Marres, 2017; Vis, 2013), paying attention to how they “affect the way we generate, present and legitimize knowledge in the humanities and social sciences” (Rieder and Röhle, 2017). Some attention to technicity is always required to use such methods, in line with the concerns of both the contemporary philosophers of technology and the media scholars who have argued that technology, tools and software call for new modes of reflexivity (see Marres, 2011; Hoel, 2012; Latour et al., 2012; Ruppert, Law & Savage, 2013; Stiegler, 2018). As a way to tackle the problems of methods in digital media studies, particularly through the lens of a digital methods approach, I am suggesting that researchers should take into account the role of technicity and the layers of technical mediation mobilised by each method. Two main reasons justify the choice of thinking digital methods through the technicity embedded in their practices:

§ Digital methods take as their field the web environment, what is natively digital and the computational medium, its methods and intellectual substance.
§ As a research approach, digital methods are still on a path of stabilisation and standardisation (like many other approaches in digital research), which facilitates the study of their inner workings6.

5 Berry (2011, p.4) explains that this approach draws from recent work in software studies and critical code studies, “but it also thinks about the questions raised by platform studies, namely the specifics of general computability made available by specific platforms (Fuller, 2008; Manovich, 2008; Montfort & Bogost, 2009)”.
This section served as a way to situate the problems of methods and to raise specific questions pointing to the purpose and value of this dissertation. The next section presents the objectives, research questions and main arguments of this dissertation, summarising the purpose of each chapter.

Accounting for the technicity-of-the-mediums in digital methods

This dissertation discusses computational mediums as carriers of meaning (see Berry, 2011; Rieder, 2020) in the practice of digital methods. The key objective of this thesis is to address the role of technical knowledge, practice and expertise (as problems and solutions) in digital methods. It uses the concept of “technicity-of-the-mediums” in two ways: on the one hand, to refer to the effort to become acquainted with the mediums (from conceptual, technical and empirical perspectives) and, on the other hand, to emphasise the object of technical imagination (which reflects the capacity of predicting and combining the practical qualities of computational mediums as an ensemble and as a solution to methodological problems). Here, the understanding of digital media and methods starts from direct contact with the fieldwork and is nourished by technical knowledge and imagination, a journey only fulfilled by technical practice, experimentation and exploration. I understand technical practices as all the processes required to implement the digital methods approach, which would also critically account for the technicity-of-the-mediums. Technical imagination, in turn, is taken as a result of the researcher's engagement in using and knowing about computational mediums in different and applied research contexts. I raise questions on the intersection between methods, technicity and digital fieldwork, while suggesting that researchers should be acquainted with the computational mediums and take seriously their regimes of functioning. I call this attitude establishing a sensitivity to the technicity-of-the-mediums, and I argue that it constitutes a crucial building block in the practice of digital methods. In this context, I attempt to provide some answers to the following questions:

§ What is it like to design and implement research with a digital methods approach? To what extent can these methods be considered a type of fieldwork?
§ Why does the notion of technicity matter for, and how does it contribute to, digital research?

To answer these questions, this dissertation is divided into two parts. The first draws on a review of the literature on digital methods, philosophy of technology and media theory.

6 Digital methods have been part of my research interests since the very beginning of my master’s studies in 2013. My first collaborative experience with these methods, in a data sprint environment, took place during the 2014 DMI Winter School in Amsterdam. Since then, I have never stopped challenging myself, or been bored by the practice of digital methods. Self-learning, participation in and organisation of multiple data sprints gave me the experience of using and adapting tools for the collection, curation, analysis and visualisation of digital data, particularly from social media. The challenge of knowing how to solve problems through creative methodological approaches is the reason why I am fascinated with digital methods. Over the past seven years, I have been developing studies on hashtag engagement, visual network analysis, computer vision API-based networks and bot agency, resulting in several peer-reviewed journal papers and awarded grants.
It discusses the crucial role of technicity in digital research, relying on the work of Gilbert Simondon and Bernhard Rieder. I suggest that researchers have to do fieldwork in order to reason with and about the computational mediums, and to become familiar with the technical environment, while understanding the relationship between software affordances, platforms’ cultures of use and their technical grammatisation. The second part mobilises the technicity-of-the-mediums in digital methods in a series of case studies concerning hashtag political engagement (Instagram), networks of likes and timeline images (Facebook) and web vision APIs (Microsoft, IBM, Google). This part is substantiated by the conceptual discussion, presenting a methodological framework that can be replicated in different case studies.

This dissertation investigates how assuming a “technicity” perspective can enhance the practice of social research, serving as an invitation to pay greater attention to the computational mediums and, by doing this, to iteratively enrich the research process, the elaboration of research questions and the creation of theoretical concepts. A conceptual, technical and empirical understanding of computational mediums can help scholars to avoid fatal misalignments between their research objective and the constraints and potentials of the mediums they work on and with. Acknowledging the technicity of the mediums makes it impossible to resort to flawed reasoning such as:

“Because one thing is policy making for AI, and another is the call for us to know what AI technically are or can do. This has nothing to do with policy making”

“You know, the theoretical part of the project is a thing, and another is the technical aspects of data collection and analysis, I don't care about those, the results are what matter and the quantity of data”

“Well, why should teenagers understand what recommending systems are? This is just too much for them. I just want to raise awareness about the power of YouTube algorithms”

“I want to develop this cool new theoretical concept about algorithmic culture but now I need some data and some nice visualizations to support my idea”

I have encountered sentences like these in the most varied contexts of academic life: colloquia, project meetings and informal conversations. Statements such as these show that Simondon’s theory of technology is still relevant and applicable to the practice of digital methods, as is Rieder's warning about a pressing need for a technical culture of knowledge. In joining this warning, this dissertation alerts social researchers to the fundamental role of the technicity perspective for knowledge production and methodological innovation. Caring for the technicity of the medium can encourage scholars to consider conceptualisation, technique and methods as a unity, rather than separating concepts and theories from technique and methods. Thus, the “technicity” perspective would offer scholars doing social research a new viewpoint on their study objects, allowing new interpretations of a phenomenon or issue that are empirically based and substantiated by the practice of digital methods.

By changing the way in which we do research, a “technicity” perspective would also have an impact on the way we deal with ethical problems, revealing how the standard procedures of research ethics are sometimes inadequate for methods grounded on web environments and digital platforms (see Tiidenberg, 2020).
Research with digital methods cannot rely on typical solutions (informed consent, for instance, is unfeasible in most projects of this type) and has to face different ethical dilemmas that “arise and need to be addressed during all steps of the research process, from planning, research conduct, publication, and dissemination” (Markham & Buchanan, 2012, p.5). Digital methods researchers take advantage of the data policies of digital platforms, which often imply that, by creating an account on social media, users give their consent to share some of their personal data and some of the records of their online behaviours with the platform and with third parties. Asking permission or giving detailed information about the case study to each and every participant would be an impossible task, precisely because of the number of users as well as the difficulty of reaching them. A possible solution is to avoid studying individual people and instead investigate larger issues at the level of public debate (see Markham, 2017). Information in the social media profile descriptions of politicians or botted accounts, for instance, can be used to make sense of socio-technical and political issues. Here, the patterns and characteristics of the data are seen collectively and not focused on a single individual. Methods addressing who is vulnerable (researched individuals and populations) and what is sensitive (studied data or behaviour) (see Tiidenberg, 2020) take a different shape from the technicity perspective. For instance, researching sensitive subjects, such as how pornographic content is spread by botted accounts, can produce results that deliberately expose sexy and porn images of teenage girls being used in the web porn market (as I demonstrate in chapter 2). Ethical questions may then be raised about the image analysis, but it is also possible to argue that new ways of detecting the existence of teenage pornographic sites can help researchers report such activities to the authorities. In data treatment, there is sometimes the option to anonymise the results in order to ensure the anonymity of the users. In data analysis and the subsequent dissemination of results, there is a continuing concern about finding ways to avoid disclosing personal information and harming users. However, anonymisation is not always a benefit for research (e.g. when studying political polarisation or social movements); in such cases, some anonymisation strategies are adopted during the analysis process to ensure the anonymity of ordinary users, while finding ways to avoid improper exposure of the results or causing harm to public figures involved in the study.

The main contribution of this dissertation is to develop a series of conceptual and practical principles for digital research and digital methods. Theoretically, this dissertation suggests a broader definition of medium in digital methods, while introducing the concept of the technicity-of-the-mediums and three aspects to consider when doing digital methods (platform grammatisation, cultures of use and software affordances) as an attempt to defuse some of the difficulties related to the use of these methods. Practically, it presents concrete methodological approaches providing new analytical perspectives for social media research, while suggesting a way of carrying out digital fieldwork. By doing so, I hope to conceptually and empirically contribute to the making of digital methods as a (proper) fieldwork.
Part one: reflecting the intersection of methods, technicity and digital fieldwork This part contains three chapters that focus on the intersection between methods, technicity and digital fieldwork, and in which I will present the need for developing a sensitivity to the technicity-of-the-mediums in contemporary digital research as a solution for some of the challenges faced by digital methods. A literature review on digital methods is also provided, together with some attempts to understand the concepts of technicity and grammatisation, while connecting Gilbert Simondon and Bernard Stiegler’s philosophical reflections to the practice of these methods. 1 Unpacking digital methods With the purpose of introducing and criticising the foundations of digital methods, this chapter aims at understanding what it means to design and implement research with these methods, while addressing the need of taking technological grammar into account. Here I will make clear that, when using digital methods, knowledge and findings are constantly mediated but also informed by software. Every stage of the process impacts the following and the tools are always entangled with researcher’s analytical decisions (showing that these methods exist in an environment inhabited by technicity). 11 This chapter recognises different forms of technical practices that can be based on the making or the use of software. Here, I will argue in line with Rieder (2020)7, and from the standpoint of software-using rather than making, that non-developer researchers “inhabit a knowledge domain that revolves around technicity, but also includes practical and normative notions concerning application, process, evaluation, and so forth. This domain cannot be subsumed into others” (p.76). The chapter concludes by addressing the many challenges of digital methods, including a call for a broader definition of “medium” in these methods. 2 Three attempts to understand technicity This chapter explores how the concept of technicity matters in the practice of Internet research. The main objective of this chapter is firstly to understand technicity in the context of media studies and second to address the question of how technicity concretely contributes to digital methods. The last attempt to understand technicity provides a description of the process of building/interpreting computer vision-based networks, which illustrates what I call the technicity-of-the-mediums in digital methods, the development of a “digital intellect”8 (see Berry, 2011) and an understanding of the computational medium in its entelechy – in an active rather than static mode. Technicity-of-the-mediums in digital methods requires a certain proximity with software but also a practical awareness of software forms, functions, operations, and intellectual substance. I will discuss several aspects of Simondon’s philosophy of technology to situate the specific angle from which I consider the notion of technicity, in the sense becoming familiar to the technical medium from a conceptual-technical-practical perspective. Digital methods can benefit from this vision of technicity, although these methods deal 7 From the standpoint of software-using rather than software-making. 8 Berry (2011) grounds digital Bildung as the development of a digital intellect (or mental mode) which is something opposed to a digital intelligence. Berry’s comparison is based on the work of Richard Hofstadter (1963), who explains that “intellect… is the critical, creative, and contemplative side of mind. 
Whereas intelligence seeks to grasp, manipulate, re-order, adjust, intellect examines, ponders, wonders, theorizes, criticizes, imagines. Intelligence will seize the immediate meaning in a situation and evaluate it. Intellect evaluates evaluations and looks for the meanings of situations as a whole… Intellect [is] a unique manifestation of human dignity. (Hofstadter, 1963: 25 in Berry, 2011 pp.7-8). Regardless of terminology (Berry’s digital intellect or Bildung, Rogers’ digital methods mind-frame, Marres’ device-aware sociology, Rieder’s technical culture), there is a requirement on building not only practical but mental modes when doing digital methods. 12 with objects that differ from the ones originally considered by Simondon in his reflection on industrial objects. 3 Digital fieldwork This chapter aims to present the technological environment that the digital methods approach takes as a point of departure to study social phenomena. It introduces fundamental aspects of the web environment from a methodological viewpoint, while reflecting on the notion of digital grammatisation in the practice of digital methods, based on the work of Bernard Stiegler and Philip E. Agre. The chapter demonstrates how technical expertise can contribute to new forms of enquiring and describe the triangular relationship existing between software affordances and platforms’ cultures of use and technical grammatisation. In doing this, I attempt to defuse some of the difficulties related to the use of digital methods, while suggesting a way of carrying out digital fieldwork. This background knowledge constitutes the first level of the technicity-of-the-mediums and can be defined as the practical awareness that allows researchers to understand not only in theory but also in practice what it means to study collective phenomena “through interfaces and data structures” (Bernhard Rieder, Abdulla, Poell, Woltering, & Zack, 2015, p. 4). Part two: mobilising technicity-of-the-mediums and three pillars of the digital methods approach to design research and present concrete methodological approaches This part contains three chapters that mobilise a sensitivity to the technicity-of-themediums and the three key aspects to consider when doing digital methods (software affordances, platform’s grammatisation and cultures of use) to design research and present concrete methodological approaches to social and medium research. The chapters complement and support the arguments developed in Part 1. This part brings together three peer-reviewed publications, written in collaboration with various colleagues, that have attempted to use a digital methods approach to investigate questions related to political polarisation through Instagram (the case of 13 the “impeachment-cum-coup” of Brazilian president Dilma Rousseff); institutional communication of Portuguese Universities on Facebook; and, how computer vision APIs interpret stock images related to different nationalities. The case studies follow the key guiding principles to an ethical approach to internet research by balancing the rights of subjects with the social benefits of research (Markham & Buchanan, 2012). In these case studies, we considered the role of application programming interfaces (the APIs of Instagram, Facebook and Google Vision), the graph layout algorithm ForceAtlas2, extraction software (Netvizz, Visual Tagnet Explorer, YTDT, TumblrTool) and analysis software (Gephi, RawGraphs, Google Vision API). 
In parallel to this, we consider how technological grammars and digital records (hashtags, engagement metrics, likes, image URLs) were carried out, rearranged and modified by the computational mediums and by the researcher's interventions. The chapters are thus substantiated by methodological innovation and new analytical perspectives for social media research and digital network studies, as I summarise in the next subsections. Each chapter describes a different case study in which I tried to make room for, grow and establish a sensitivity to the technicity-of-the-mediums, and to understand what it means to design and implement research with digital methods. These three verbs refer to three different ways of dealing with the technicity of the mediums, whose difference will be addressed in this thesis and which I will discuss in detail in the conclusion.

§ To make room for technicity relates to the effort of becoming acquainted with the fieldwork and being aware of technical mediations. Researchers are invited to become familiar with computational mediums (their functioning, potentials and limitations) and to train their minds to see the web, digital records, media and computational mediums as means of enquiry, as sources and methods of investigation.

§ To grow a sensitivity to the technicity-of-the-mediums stands for developing a technical mindset by engaging with technical practices. Researchers understand how the meaning carried by computational mediums can be as important as the object of study, thereby rethinking their research questions and the conditions of proof in digital research.

§ To establish the technicity-of-the-mediums illustrates a conceptual, technical and empirical understanding of computational mediums as crucial, meaningful objects of attention. Researchers are able to use digital methods knowing which computational mediums affect them, when and how.

The chapters reflect my quest to comprehend the modes of living (or nature) of computational mediums, while seeking to grasp their epistemological role in digital research. To do so, and to embrace the unstable nature of web platforms, I cared about the forms of use, the concepts and the materialisation of computational mediums throughout the design and implementation of different methods and research designs related to hashtags, digital networks or computer vision. In the introduction to each chapter, I will try to explain how the technicity approach has been used in the case studies, highlighting in particular the aspects that are not included in the text of the published articles. In addition, I provide more specific methodological guidance and reflections, justifying the choice of the articles.

4 Hashtag Engagement Research

This chapter seeks to contribute to the field of digital research by critically accounting for the relationship between hashtags and their forms of grammatisation. The chapter approaches hashtags as sociotechnical formations that serve social media research not only as criteria for corpus selection but also as occasions to display the complexity of online engagement and its entanglement with the technicity of web platforms. Therefore, the study of hashtag engagement requires an understanding of the technicity of online platforms. In this respect, we propose the three-layered (3L) perspective for addressing hashtag engagement. The first layer contemplates potential differences in the use of hashtags by high-visibility users and ordinary users.
The second focuses on hashtagging activity and the repurposing of how hashtags can be differently embedded into social media databases. The last layer looks at the images and texts to which hashtags are related. To operationalise this framework, we draw on the case of the "impeachment-cum-coup" of Brazilian president Dilma Rousseff. When used together, the three layers add value to one another, allowing the investigation of both high-visibility and ordinary groups. Drawing on the three pillars of the digital methods approach and the technicity-of-the-mediums, the three-layer perspective can be applied to different platforms.

5 Digital Networks

This chapter discusses the infrastructural aspects of Facebook and asks what one can learn from the connections between Facebook pages (through likes) and from a list of (timeline) image URLs. It uses network visualization as a means to reimagine Facebook grammatisation for studying how Portuguese Universities use the platform to communicate, and it interrogates how digital networks contribute to communication studies. Following the digital methods approach and through the notion of calling into the platform (Omena & Granado, 2020), we operationalise digital research about Facebook to map and analyse like connections as institutional interests and timeline images as institutional visual culture. To this end, we build and analyse two distinctive networks. The first comprises all connections made by 15 Portuguese Universities' Facebook Pages (the acts of liking other pages or being liked in return). This network captures the connections made by each page from its creation up to March 2019. The second network is built upon the affordances of the Google Vision API, displaying the connections between timeline images and their labels (descriptions of the content of the image itself). To analyse these digital networks, we relied on visual network analysis (Venturini, Jacomy, & Pereira, 2015; Venturini, Jacomy, & Jensen, 2019). While exploring and analysing the networks, we considered Facebook itself and how activity and connections are (re)arranged and made available through its Graph API and output files, Gephi's data laboratory, and the Portuguese Universities' Facebook Pages and respective websites. We also give attention to how the Google Vision API labels images and to its output files (including the folder with the downloaded images). Besides providing new ways to design and implement research that can be repurposed for different studies, the main contribution of this chapter lies in embracing the methods of the medium, a navigational research practice and the technicity-of-the-mediums as key components for the digital social sciences.

6 Interrogating Vision APIs

This chapter presents the results of a study of computer vision Application Programming Interfaces (APIs) and their interpretation of representations in stock images. Computer vision is a field of computer science dedicated to the development of algorithms for visual data interpretation, but the methods for its critical application are still under construction. The study thus draws upon three computer vision APIs (Google, IBM and Microsoft) for the analysis of 16,000 stock images related to the following keywords: Brazilian, Nigerian, Austrian and Portuguese (the demonyms were searched in two of the main Western stock image sites: Shutterstock and Adobe Stock).
The main contribution of this chapter concerns the special attention given to the potential of vision APIs and their respective machine learning models to interpret a collection of images; in other words, the attempt to understand and interrogate how automated image classification systems may facilitate or compromise the study of natively digital images.

Limitations and further work

Part one provides a review of ideas derived from the work of philosophers of technology (chapter 2) and tries to connect them to some practical problems in digital methods. Its purpose, therefore, is not to exhaustively discuss one or the other, but to establish a meaningful conversation between them. In chapter 2, for example, my use of technicity does not address all the elements of Simondon's broader metaphysics, nor does it attempt to develop a philosophical approach to science and technology (see Bogalheiro, 2017). What Simondon presents as concretisation9, for instance, will not be addressed here, since I am not investigating the long historical trajectories of technical objects and how they evolve over time10 (Simondon, 2017). Rather than a complex process of technical evolution (see Rieder, 2020), my use of technicity refers to its awareness component, from conceptual, technical and empirical perspectives. In addition, I will neither address a theoretical discussion of the concept of technique, nor provide a literature review on the concepts of technical knowledge or technical practices. These will be approached from my experience in practising digital methods. In chapter 3, when introducing a methodological vision of the web, digital platforms and software affordances, this dissertation does not go in depth into the history of the web or the theoretical debate in software studies, nor does it provide a detailed description of web technologies. This chapter, moreover, does not offer ethnographic or anthropological reflections on fieldwork. It also does not cover what it means to know or do fieldwork through the lens of digital ethnography or anthropology.

9 Concretisation is "a useful concept for thinking about technological artefacts and their evolution over time" (Iliadis, 2015, p. 86). To Bernhard Rieder (2020), the notion of concretisation is essential for understanding contemporary computing; he translates the term, following Simondon's perspective, as "the march from a more abstract or modular state, where elements are arranged so that every part only fulfils a single function without synergy with the others, to a state of integration where mutual adjustments yield optimal functioning" (p. 69). That would be the movement from a modular technical object to an integrated technical object.

10 For the past five years, I have been investigating and following changes in social media APIs over time, in terms of forms, functions and data access regimes, as well as the growth of web API categories. However, this work cannot be framed as a long trajectory and is not contemplated here.

Part two is composed of texts that have been published as peer-reviewed articles. Each of them is introduced by a separate section which explains how a sensitivity to the technicity-of-the-mediums has changed the way in which the cases were carried out. I will make clear that these studies also served as a means for me to understand the notion of technicity in practice, while seeking to unpack what it means to design and implement research with digital methods.
In the chapters, however, the technicity approach can perhaps be seen between the lines rather than as the central theme. The main reason for this is the time and moment when these publications were written and submitted, as well as the fact that the calls for papers treated the technicity of the medium as an object of little or secondary interest. The section introducing each chapter will hopefully provide more details on the connection between the paper and the idea of the technicity of the mediums. Although the case studies flag the different ways in which ethical concerns are seen from the perspective of technicity, this dissertation does not develop a discussion of the ethical problems in digital research.

Part 1 and part 2 are interdependent because the conceptual and theoretical efforts of this dissertation (in part 1) cannot be fully understood without direct exposure to the actual practices of digital methods for social or medium research (in part 2). The criticism addressed to digital methods and the suggestions on how to carry out digital fieldwork in part 1 could not have been proposed without the case studies in part 2 (and much extra practical effort outside the context of this dissertation). Through the connection between its two parts, this dissertation will hopefully help to improve our understanding of the digital methods approach, because learning how to solve methodological problems using technical knowledge and imagination, as I suggest, requires practising the method itself with special care for the technicity-of-the-mediums.

1 UNPACKING DIGITAL METHODS

CHAPTER 1

Understanding Digital Methods

This section defines digital methods from an historical perspective and in specific terminology, while addressing the importance of taking technological grammar into account. It questions what it means to follow the medium. What are the methods of the medium? What does medium specificity refer to? How exactly can one repurpose dominant devices, and for what? What makes a difference in digital methods, and why? By shedding some light on how these methods differ from traditional knowledge structures and other digital research approaches, this enquiry aims to provide a clear understanding of digital methods. Table 1.1 characterises digital methods, but it also summarises what will be addressed and explained throughout this chapter.

Table 1.1. Understanding digital methods.
Digital methods:
learn from: the medium dynamic nature; the technicity-of-the-mediums; the data sprint practice
follow & repurpose: medium dynamic nature; medium specificity; the content of software
are always: situational & imaginative; experimental & collaborative; challenging & demanding
require: natively digital mindset; software-makers & software-users; technical practices
work with: web data & unstable media and methods; extraction, analysis and visualisation software; situated software & query design led research
offer: post-demographic studies; chain methodology; methodological innovation

An historical perspective: the reasoning behind the methods

The short text entitled "The future of STS on the Web, or: what I learned (naively) making the EASST website" became a seminal digital-methods article about building tools to study web data. This mindset and proposal were introduced by Richard Rogers in 1996, more than a decade before the release of his Digital Methods book in 2013.
In the text Rogers (1996) reflected upon preliminary ideas11 about websites that, later, would change the way many scholars think and apply methods about and with the web. By considering the web as a show and tell section, Rogers argues that websites can represent an issue spectrum with the potential to be more than what is being presented. His first idea looks at organisational websites as spaces for positioning statements (such as politics) in which URLs, and not only content, would point to specific issues or interests of a given organisation, providing insightful perspectives through the analysis of hyperlink connections. The second idea, with web activism as an example, suggests a technique to create “a healthy database” that starts with small gestures such as “sending ‘subscribe’ messages to the leading web activist lists” by email (p. 26). After that: the webmaster then has to filter the incoming messages (maybe once a day), and upload the calls to action on the site, arranging them by date and by topic, perhaps with an overlay on geographical map indicating physical origins and destinations of the activities. Email links could be set up to the originator and the intended recipient, allowing for two-way protest and/or information exchange. (Rogers, 1996, p. 26) In this way, Rogers argued one could map, follow and display web activism without having to be physically present at the demonstrations. It is interesting to note that this idea involves a good understanding of how activist movements were using the web to communicate or protest. It furthermore requires monitoring (not always as an automatic process) combined with some technical skills (the webmaster tasks) and imaginative thinking (what one can do with a list of emails). 11 When making the European Association of the Study of Science and Technology (EASST) website, four ideas were presented in this text: 1. Evolving discourses sites; 2. Activists loop sites; 3. Reflexive webmetrics sites; 4. Virtual presence only sites. Here I am giving emphases to the first three. 22 The third idea concerns what can be quantified and in which terms content is measured. The example used in this idea is followed by an argument in which Rogers justifies why all academic journals should go online. He argued that, when measuring how many hits each article receives on a website, one can obtain a metric of “awareness”. In this sense, different measurements would point to alternative interpretation. In Rogers’ 1996 text, we find the necessity of getting into a field in which the technical functionality/potentiality of websites cannot be dissociated from their uses and appropriations. A few years after “The future of STS on the Web”, and following the specificity of the medium and the natively digital, Rogers (2009a, 2010) starts questioning whether methods are to be changed in the context of Internet research12: how to capture and analyse the natively digital objects? What do they offer? What kind of new approaches are worthwhile? How can online medium methods be reimagined for social and cultural research? In 2010, Rogers advocated a bold proposal for Internet-related research “where we no longer need to go off-line, or to digitize method, in order to study the online” (Rogers, 2010, p. 243). In other words, the suggestion is to take the Internet as a research site, a place to ground findings. 
Back then, Rogers’ ideas (1996, 2009a, 2010) reflected fundamental aspects of what today we understand as thinking along with natively digital objects and methods, as well as online grounding research. That means, on one side, learning and following how content is embedded into website design and, on the other side, knowing how actors make use of the forms and functions provided by the web environment. Consequently and implicitly in this process, the creation and use of software for research purposes come along with the invitation to change the way we see the medium (Internet platforms) but also the way we design research questions. The basis of Rogers’ Digital Methods reflects both a particular way of thinking along with the medium and with what it has to offer, just as the development and use of research software. 12 When interviewed by Michele Mauri (link available at http://densitydesign.org/2014/05/aninterview-with-richard-rogers-repurposing-the-web-for-social-and-cultural-research/), Rogers “situates digital methods as the study of digital that does not lean on the notion of remediation, or merely redoing online what already exists in other media” (see also https://wiki.digitalmethods.net/Dmi/MoreIntro). 23 This methodological proposal was originally developed as a counterpoint to the simple application of existing methods applied to online environment (Rogers, 2009, 2010; Rogers and Lewthwaite, 2019). With time and considering that when studying society through Internet platforms one cannot dismiss the study of the platforms themselves (Venturini and Rogers, 2019), digital methods have also transformed into ways of studying the medium culture. Knowledge of the medium thus became a key matter of concern when using digital methods. Terminology and definitions Digital methods are a particular form of research practice that is crucially situated in the technological environment that it explores and exploits. In the philosophy underlying these methods, the Internet is not taken as a parallel dimension of our social life, but as a research site (source of data, method and technique) expected to be a sphere where one can make and ground findings about society (Rogers, 2013, 2019). That is: Broadly speaking, digital methods may be considered the deployment of online tools and data for the purposes of social and medium research. More specifically, they derive from online methods, or methods of the medium, which are reimagined and repurposed for research. The methods to be repurposed are often built into dominant devices for recommending sources or drawing attention to oneself or one´s posts. (Rogers, 2017, p. 75; Rogers, 2019, p. 21) Thus, research through digital methods should “follow the methods of the medium as they evolve, learn from how the dominant devices treat natively digital objects, and think along with those object treatment and devices so as to recombine or build on top of them” (Rogers, 2013, p. 5). Through this lens, natively digital objects13 (e.g. URLs, hashtags, tweets) and dominant devices (e.g. Google, Instagram, App stores) would offer a method of research that learns about society through studying the web in its 13 That is a difference of meaning between “born” and “native” digital material. 
While David Berry (2011), in digital humanities, refers to electronic literature, interactive fiction and web-based artefacts as “born-digital materials” (in many cases digitalised), Richard Rogers (2013, 2015a) makes clear that “digitally native” refers only to what is created for digital media (in the computational sense), rather than digitised. For example, uniform resource locators (URLs), hashtags, hyperlinks and ranking, instead of Internet archives or digitalised material. 24 own language (Rogers, 2010). In this spirit, Rogers has once suggested that social scientists would no longer need to go offline in order to study societal changes, because they could ground and conceptualise the “research that follows the medium, captures its dynamics, and makes grounded claims” (Rogers, 2013, p. 13) online. The term online groundedness (Rogers, 2013) refers to a type of research that must consider “when and under what conditions may findings be grounded with web data” and methods (Rogers, 2019, p. 5). Thus, to ground online findings is a matter of asking whether or not digital methods are suitable for a given research scenario, because when dealing with Internet platforms as means of research, “the investigated phenomenon must be to some extent performed or, at least, reflected in such platforms” (Venturini et al., 2018, p. 4). Otherwise, the methods would not be advisable or helpful. It is evident that the proposal of digital methods does not fit either a “methods as usual” approach (see Marres, 2011) or the transposition of existing methods to online environment, because digital methods always “investigate how digital infrastructures can be re-purposed for social enquiry” (Marres, 2017, p. 43). Web data collection, analysis or visualisation are not digital methods simply because they imply digital tools and data. Defining the perimeter of digital methods requires asking questions such as: what does it mean to follow the medium? What are the methods of the medium? What does medium-specificity refer to? How exactly can one repurpose dominant devices and for what? How can one make meaningful research questions using these methods? Some clarifications are needed, starting with the meaning of medium in digital methods, which stands for dominant digital platforms and search engines along with their inputs (web content and data) and outputs (what is available for data collection). Likewise, the methods-of-the-medium are methods “that are in some sense built into the web” (Rogers & Lewthwaite, 2019, p. 14) and piggyback on technologies such as Google’s PageRank algorithm or social media platforms’ recommendation systems. In this sense, a research practice that follows the methods-of-the-medium would “take advantage of distinctive features of digital infrastructures, devices and practices” (Marres, 2017, p.82) for social enquiry. The effort to follow the medium thus describes “a particular form of medium-specific research” (Rogers, 2013, p.25). In methodological terms, as explained by the digital sociologist Noortje Marres (2017), to follow the medium means a “continuity in methodology development across 25 different media and technological settings” (p.82). It is here that the researcher must see and pay attention to medium specificity as “the material and socio-technical qualities of the media technologies used to implement method” (Marres, 2017, p. 83), which means being always exposed to the unstable conditions of digital platforms and search engines. 
To follow the medium, in other words, means be attentive to the medium specificity. From a theoretical standpoint, Rogers defines the specificity of the medium differently from most of the earlier literature. To him, this concept is not related to media’s ontological distinctiveness as in “Mcluhan’s sense engagement, William’s socially shaped forms, Hayles’s materiality or other theorists’ properties and features, whether they are outputs (cultural forms) or inputs (forms of production)” (2013, p.26). Instead, in digital methods, he argues that medium specificity can either refer to the sense of preferred means of studying digital platforms (e.g. studying the dominant forms of platform content through most engaged-with content overtime) or the sense of looking at the methods of the medium (Rogers, 2013; 2019). His comprehension of medium specificity is one from an epistemological standpoint – or one of method, rather than from an ontological perspective (properties and features). For this reason, he claims, Internet research should be reoriented and thought of as “a source of data, method, and technique” (p.27). While I will try to introduce conceptually the basis of medium-specific research in digital methods, this approach can only be fully comprehended through practices. The first task of this approach is to follow and learn from the medium and its methods. Thus, addressing questions such as how does the platform work? How do people engage with it? Which digital records are available and how are they handled? The second task is to take the answers to these questions as new ways in which data, methods and computing techniques implemented online can be used, “then think through what kinds of other sorts of research can be done with them, how these sorts of techniques can be repurposed” (Rogers, 2010, p. 259). These two tasks provide a summary of what it means to follow, learn and re-purpose the medium when using digital methods. A practical example of that is provided in the last section of chapter 2, and the case studies in chapters 4, 5 and 6 also follow the basis of medium-specific research. 26 Taking technological grammar into account Some existing digital methods quali-quanti approaches show how to repurpose dominant media and their corresponding methods, as well as data. Rather than going through detailed techniques, project recipes or successful research design protocols (see Bounegru et al., 2017; Rogers, 2019; wiki.digitalmethods.net; smart.inovamedialab.org), but without ignoring the importance of these latter, I wish to emphasise the chain of knowledge and actions prior to any data collection or the making of research questions. This means comprehending what is being re-purposed and for what; at the same time, we thinking about what can serve as a point of departure to query platforms (such as building lists of keywords of organisations, hashtags, images URLs, etc.) and the use of a selection of software/tools as work material. Figure 1.1 helps us to better understand digital methods’ qualitative fronts, exposing what we need to know before we start formulating the research questions. For instance, and as Rogers (2019) suggests, when knowing in advance that Google directs, by default, all search entries to the local domain (e.g. google.uk), thus returning the results in the local language (including advertisements), one may raise a few questions about what types of sources are returned. 
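Rogers' observation about local domains can be made tangible with a small sketch that simply assembles the "same" query for several country-specific Google domains. The domain list and the hl/gl parameters below are illustrative assumptions to be checked against current documentation, and in an actual study the result pages would normally be collected with a dedicated research browser or tool rather than with ad hoc requests.

```python
from urllib.parse import urlencode

# Illustrative mapping of local Google domains to interface-language (hl)
# and geolocation (gl) parameters; a real study would need to verify the
# current list of country domains and parameters.
LOCAL_DOMAINS = {
    "Portugal": ("www.google.pt", {"hl": "pt-PT", "gl": "pt"}),
    "United Kingdom": ("www.google.co.uk", {"hl": "en-GB", "gl": "uk"}),
    "Brazil": ("www.google.com.br", {"hl": "pt-BR", "gl": "br"}),
    "Austria": ("www.google.at", {"hl": "de-AT", "gl": "at"}),
}


def local_queries(keyword: str) -> dict:
    """Build the equivalent query for each local domain, for cross-country comparison."""
    urls = {}
    for country, (domain, params) in LOCAL_DOMAINS.items():
        query = urlencode({"q": keyword, **params})
        urls[country] = f"https://{domain}/search?{query}"
    return urls


for country, url in local_queries("climate change").items():
    print(f"{country}: {url}")
```

Comparing the sources returned by such equivalent queries is one crude way of operationalising the cross-country comparison discussed here.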
In this line of thinking, he argues that one can study societal concerns through looking at Google search results as well as inquiring its ranking system for a medium-led research. For that matter, Google’s sense of the ‘local’ enables researchers to conduct cross-country analysis “by showing the extent to which Google returns transnational, regional or some (other) combination of results in its local domain engines” (p.115). One could look at the top 100 sites per country as a way of profiling different countries, detecting which countries would rely on the mega-upload sites or local-based domains, as Rogers explains: Have you looked at the top 100 sites per country? It’s interesting in the sense that you can profile a country according to what kind of sites are in that top 100. Which kinds of countries are relying on the mega-upload sites? Just to give you one short example. So one can think about different sorts of Web indicators for ideas about the societal condition. (Rogers, 2010, pp. 259–260) 27 Figure 1.1. Seeing the qualitative fronts of digital methods: the re-purposing of dominant devices and available digital grammars: what for, points of departure and work material. (Based on Omena et al., 2020; Rogers, 2019) Using Google Vision APIs is another example of repurposing the medium and its methods. Machine vision web services were not originally made for research purposes, but they may be of great assistance in analysing large image datasets or studying image circulation (see D’Andréa & Mintz, 2019; Omena et al., 2019; Ricci, Colombo, Meunier, & Brilli, 2017; see also chapters 4 and 5). One can also interrogate pretrained machine learning models through closely looking at their outputs and, for instance, comparing different services and their capacities and limitations for labelling a collection of digital images (see Silva et al., 2020; see chapter 6). When re-imagining technological grammars, for instance, one should consider how social media capture and reorganise hashtags and their different modes of action to study collectively formed actions (see chapter 4). In the same spirit, one can use YouTube video/channel content, creators and related metadata to map issue networks (see Rieder, Coromina, & Matamoros-Fernández, 2020) or to study the recommendation system of the platform through the visual exploration of networks (see Omena et al., 2020). Through these examples, though not exhaustive, we primarily learn that the baseline scenario of digital methods refers to technical knowledge about medium forms and functions, the cultures of use inherent to Internet platforms, as well as the software outputs and content. Consequently, when re-purposing the medium, digital methods’ 28 approaches (and respective techniques) also repurpose social research in return (Venturini, Jacomy, Meunier, & Latour, 2017) but they require, in addition, a vision of technicity. In all cases illustrated, the key points of departure for the implementation of the methods relate to the researcher ability of searching as a form of research and making lists of available digital records such as hyperlinks, hashtags and URLs. Not separated from this is the technical and practical knowledge about the work material (computational mediums) as both instruments (to perform a particular piece of work e.g. data collection) and beings that function (beyond participating in each stage of the research, they add concepts and technical content to the object of study, re-adjusting and re-shaping it). 
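To give a concrete sense of this work material, the sketch below shows the kind of request that repurposing the Google Vision API involves: it sends one image URL to the label-detection feature and returns the labels and confidence scores that could later populate an image-label network. This is a minimal illustration, not the batch tooling used in the case studies; the endpoint and request format follow the public REST documentation at the time of writing, the API key and image URL are placeholders, and quotas and pricing apply.

```python
import requests  # third-party: pip install requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = f"https://vision.googleapis.com/v1/images:annotate?key={API_KEY}"


def label_image(image_url: str, max_results: int = 10) -> list:
    """Ask the Vision API for label annotations of one web-hosted image."""
    body = {
        "requests": [{
            "image": {"source": {"imageUri": image_url}},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }
    response = requests.post(ENDPOINT, json=body, timeout=30)
    response.raise_for_status()
    annotations = response.json()["responses"][0].get("labelAnnotations", [])
    # Each annotation carries a textual description and a confidence score.
    return [(a["description"], a["score"]) for a in annotations]


# Hypothetical usage:
# for label, score in label_image("https://example.org/timeline_image.jpg"):
#     print(f"{label}\t{score:.2f}")
```

Research tools wrap calls of this kind in batch routines over lists of image URLs; keeping the raw JSON responses is what later allows the labels to be treated as digital records in their own right.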
I will return to this matter in chapters 2 and 3. To take technological grammar into account means that we should take the methods of the medium and available digital records as part of the research design, not only as instruments but as active and influential participants. Doing Digital Methods The art of querying The act of querying platforms and query design is at the core of digital methods. The technique of building lists of words to be used as keywords or issue language informs the foundations of digital methods (see Rogers, 2017, 2019). Keywords are here understood as “the connections people are currently making of a word or phrase, whether established or neologistic” (Rogers, 2019, p. 37). In this way, whether and how people/organisations/tech-companies are using/engaging with keywords matters. In Internet platforms, keywords can become search queries formed by hashtags, video or channel ids, username accounts, image URLs, among others. These queries can be used to collect and see things and their respective relations, connections, representations or controversies, which explains why the technique of building a list of keywords is the very starting point of digital methods. Query design, therefore, is neither trivial practice or a purely technical question, precisely because keywords in digital methods denote positioning efforts – program and anti-programs (see Akrich and Latour, 1992) as well as neutrality efforts (Rogers, 2017; Rogers, 2019). Therefore, query design should be synonymous of spending time 29 navigating the platform to explore one’s subject of study: to monitor, to collect data and to conduct some (visual) exploratory analysis. After all, “the ways in which actors label the phenomena in which they are engaged can be subtle and complicated” (Venturini el al., 2018, p.18). For instance, one may think of #coxinha and #mortadela as random food-related hashtags, but when situating these tags in the context of Brazilian impeachment-cum-coup protests in 2016, the hashtags meaning shifts to pejorative nicknames for antagonistic protesters (see chapter 4). This only reinforces the need of dedicating some time to defining our query design, instead of, as is most common, go for what is logical in our mind or popular and trendy words that may not reflect the actual language in use. On one hand, query design exposes a close relation with data collection methods (e.g. manual, API calling, crawling, scraping) and exploratory data analysis, and on the other, it justifies that the choice of words not only matters but also requires scholars to consider search as research (see Rogers, 2015). That is the formulation of specified and underspecified queries, to make research findings with engine outputs (Rogers, 2013; Taibi et al., 2016) and the researcher capacity to make good queries. In other words, this is a technique of building search queries as research questions (Rogers, 2019) which is not an easy task, as the forms and cultures of use of platforms are constantly changing and so are the ways in which platforms impose, capture and reorganise digital records. The two types of queries serve different research purposes, when specified (e.g. “white lives matter”, “#foraBolsonaro”) the query is used for studying dominant voice, commitment and alignment. Whereas, when not being sufficiently detailed (e.g. “abortion”, “quarantine”), the underspecified (or ambiguous) queries serve to uncover differences and distinct hierarchies of societal concerns (see Rogers, 2019). 
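Query design of this kind is usually supported by exploratory analysis of how candidate hashtags actually co-occur in the collected material. The sketch below, assuming a hypothetical CSV of posts with a space-separated hashtags column, builds a simple weighted co-hashtag network that can be opened in Gephi; it is one possible way of performing the exploratory check described in the next paragraph, not the exact procedure of chapter 4.

```python
import csv
from itertools import combinations

import networkx as nx  # third-party: pip install networkx


def co_hashtag_network(posts_csv: str, column: str = "hashtags") -> nx.Graph:
    """Build a weighted co-occurrence network from a CSV with one post per row.

    Assumes (hypothetically) that hashtags are stored in one column as a
    space-separated string, e.g. "#coxinha #mortadela".
    """
    graph = nx.Graph()
    with open(posts_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            tags = sorted({t.lower() for t in row[column].split() if t.startswith("#")})
            # Link every pair of hashtags appearing in the same post.
            for a, b in combinations(tags, 2):
                if graph.has_edge(a, b):
                    graph[a][b]["weight"] += 1
                else:
                    graph.add_edge(a, b, weight=1)
    return graph


# Hypothetical usage: export for visual network analysis in Gephi.
# g = co_hashtag_network("instagram_posts.csv")
# nx.write_gexf(g, "co_hashtags.gexf")
```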
In practice, for instance, when defining a list of hashtags to study political polarised debates, this process should respond to immersive observation of the context and through previous exploratory analysis such as using co-hashtag networks, Excel’s pivot table and basic formulas (e.g. VLOOKUP). This practice helps the researcher to verify whether the chosen hashtags have clear connection with the topic, helping also to detect when hashtags may be indicators of new connections on the topic or counter-reactions (see chapter 4). 30 To facilitate the process of making research questions with lists of words to be used as keywords or issue language, Rogers (2019) suggests that, when formulating queries, “it is pertinent to consider keywords as being part of programmes, anti-programmes or efforts at neutrality” (2019, p. 28). Chapter 4 brings the case of political polarisation in Brazil, exemplifying how a good list of program and anti-program hashtags was built; while chapter 6 relies on underspecified queries to perceive how different nationalities (such as Brazilians, Nigerians, Austrians and Portuguese) are depicted by stock images (Shutterstock and Adobe Stock). In this line of thought, querying platforms and searching as research serve as a form of mapping or locating issue networks (Rogers, 2018; see also chapters 4 and 6), and of studying trends, dominant voice, commitment and alignment (see Rogers, 2018, 2019, also chapter 5)14. Once again, we are faced with what is expected to build queries as research questions; an understanding of the ways and words adopted by different groups part of the same phenomena. On the one hand, the search queries we use influences what we can obtain from digital platforms and social media and, consequently, the records available for the analysis. On the other, the search queries we use have a direct impact on the way we set research questions as well as on the ways we choose to respond to them. So far, we have learnt that thinking along with the medium reflects and requires having a good knowledge of the medium and mastering the art of querying. This also means that, when using digital methods, to be in direct contact with the subject of study stands for surfing the studied environment (e.g. Instagram) but also interacting with the content of study within this environment (both via navigating the end-user interface as well as using research tools for the purpose of data collection and exploratory analysis). Such methodological proposal demands innovative ways of applying methods because it works in an environment that is not made for academic research, thus, posing significant challenges to digital research (see Ruppert, Law and Savage, 2013; Marres, 2017; Venturini et al., 2018). In the face of this, I will call attention to the problem of digital methods illiteracy in the next sections, by exposing the working material and technical practices related to these methods. In addition, I advocate data 14 When querying platforms, Rogers (2019) calls our attention to the use of quotation marks to avoid “equivalent keywords”. 31 sprints as a form of learning the practice of digital methods. In chapter 3, I will return to the problem of digital methods illiteracy, but this time describing what does it means to be in direct contact with the fieldwork, suggesting a way of carrying out digital fieldwork. Technical practices, the makers and users of software Digital methods reunite available online data (hyperlink, retweet, timestamp, like, etc.) 
and metadata, considering how platforms or search engines handle them. Hashtags, emojis, Facebook reactions, lists of link domains, image URLs or retweets (among others) are taken as collections of situated representations, through which we can look at things (either part of dominant debates or more issue-specific subjects). They have distinct orders of worth to different stakeholders interests and different forms of appropriation (Gerlitz, 2016; Gillespie, 2010; Highfield, 2018; Marres, 2017), serving as a basis for exploratory and experimental research with digital methods. When taking available online data as work material, researchers are invited to recognise and deal with what is often considered as a problem or bias of digital research - the media instability and ephemerality as well as the incompleteness of online data. This also indicates that researchers need to exert some vigilance (see Venturini et al., 2018, p.4) but also engage with technical practices to repurpose the online data. To stock, manipulate and analyse online data, researchers must rely on another type of working material: situated software15, that is, the development of techniques and software applications for specific research questions16, and then the generalisation of this application and their re-use in different research contexts (Rogers, 2010, p. 259; Rogers & Lewthwaite, 2019). The use or re-use of existing tools is very common in the everyday routine of many researchers and digital media-oriented laboratories, 15 In order to situate how software is used in the practice of digital methods, Rogers refers to situated software, a term coined by Clay Shirky (2004), who is an American writer and thinker about the social and economic effects of Internet technologies. Shirky developed this idea out of his teaching experience at NYU's Interactive Telecommunications Program (ITP); he defines situated software as “software designed in and for a particular social situation or context”. The full document was shared in the "Networks, Economics, and Culture" mailing list, available at: https://www.gwern.net/docs/technology/2004-03-30-shirky-situatedsoftware.html 16 If the operationalisation of digital methods is informed by “research problem-oriented software” (see Rogers and Lewthwaite, 2019), we can also say that simply engaging with open source tools (for extraction, analysis and visualisation) or mastering code skills are not necessarily related to repurposing the methods of the platform. 32 particularly because of the costs and challenges connected to making and maintaining software/tools. The making of software in digital methods also serves the purpose of interrogating the medium itself (see Rieder, 2020). The Lippmannian Device17 (Fig. 1.2) is an example of situated software because it was made for gaining “a rough sense of source’s partisanship and distribution of concerns” (Rogers, 2019, p. 130). This web scraper queries different search engines (one at a time), asking whether one or more keywords occur in different URLs. Before data scraping, some choices have to be made such as opting for the total results per query (maximum of 1000) and the search engine – e.g. Baidu, Bing, DuckDuckGo, Google, Naver, Parsijoo, Seznam.cz, Sogou, Yahoo (Japan), Yandex. By selecting Google, researchers can specify a number of parameters in the advanced options, e.g. the domain, region, language, and also the period of time (e.g. past 24 hours, week or month) and where the term(s) (e.g. “coronavirus”) appear in the article (e.g. 
anywhere, headline, body of the text, URL). As a result of these decisions, the limit of requests to Google can be reached when running the scraper. In these cases, data collection must be monitored because to keep the scraper running, researchers are constantly demanded to answer CAPTCHA requests. To illustrate the use of the Lippmannian Device for social research, for example, we can ask how Portuguese Journalism has managed the initial period of the pandemic. This case may start with a list of the main newspapers in Portugal18 (URLs) and underspecified keywords related to COVID-19 (keywords, e.g. coronavírus, quarentena, estado de emergência and pandemia). Then, after running the scraper over different periods of time, it is possible to monitor the mentions of the keywords over time according to Portuguese newspapers, and analyse related textual content by using the word tree visualisation technique (see Wattenberg & Viégas, 2008) (see Fig. 1.3). In this way, and as proposes the Lippmannian Device, one may gain a sense of how the main newspapers in Portugal distribute attention to a set of issues concerning COVID-19. Figure 1.3 illustrates that, showing on the left the number of times (maximum of 350 per word) that the Portuguese newspapers (nodes) have mentioned 17 The scraper is named after the American Journalist Walter Lippman, author of Public Opinion which is a referential book in media studies. Rogers (2019) explains that Lippmann has called for a coarse means of showing partisanship in his book The Phantom Public (1927). https://wiki.digitalmethods.net/Dmi/ToolLippmannianDevice 18 Based on https://en.wikipedia.org/wiki/List_of_newspapers_in_Portugal. 33 coronavírus, DGS, estado de emergência, quarentena and pandemia. On the right, when looking at the sentences related to the word register, we see that the number of Figure 1.2. On the left a screenshot of the Lippmannian Device by Digital Methods Initiative (DMI) being used and, on the right, one of its output files (the csv. file) open on a spreadsheet. Figure 1.3. Exploring and visualising the csv. file provided by the Lippmannian Device. On the left, the occurrence of words related to COVID-19 in major Portuguese newspapers between March 26 and April 2, 2020. Beeswarm plot by RawGraphs. On the right a word tree containing the initial part of the article descriptions identified by the scraper when searching for the keywords. Word tree by Jason Davies19. 19 https://www.jasondavies.com/wordtree/ 34 deaths and the new cases of COVID-19 were a widely reported in the news between March 26 and April 2, 2020. Besides some insights on the object of study, which however is not what I want to highlight here, and the illustration of situated software, this example also tells us that other types of software are required in the practice of digital methods. That is, researchers need to deal with practices other than data extraction. In regard to software, it is then justifiable and even imperative the development of situated research tools20, which likewise demand to be tested and used in a situation and context. In this sense, the methods require technical creators (programmers, methodologists, analysts, designers) and technical practices (either through softwaremaking or software-using) – both driven by how to questions as well as sociological imagination21. 
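As a small illustration of such a software-using practice, and returning to the Portuguese newspapers example, the sketch below explores an output file of the kind shown in Figure 1.2 with pandas. The column names are assumptions rather than the actual schema of the Lippmannian Device CSV, which should be inspected before reusing the sketch.

```python
import pandas as pd  # third-party: pip install pandas

# Assumed columns: "source" (newspaper domain), "keyword" (query term) and
# "text" (the snippet or description returned by the scraper).
results = pd.read_csv("lippmannian_output.csv")

# How often does each newspaper mention each keyword?
mentions = results.groupby(["source", "keyword"]).size().unstack(fill_value=0)
print(mentions)

# Keep the snippets mentioning a given word as raw material for a word tree.
quarantine = results[results["text"].str.contains("quarentena", case=False, na=False)]
quarantine["text"].to_csv("quarentena_snippets.csv", index=False)
```

The resulting counts are the kind of table behind the beeswarm plot in Figure 1.3, while the filtered snippets can feed a word tree visualisation.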
Table 1.2 provides a comparative perspective on software makers and users who, in the practices of digital methods, pay particular attention to the technical content of software through technical practices, as Rieder (2020) suggests from the perspective of a developer. Here, I am suggesting a demystification of technical practices in the context of digital methods, which are often regarded as a speciality of developers or tool makers, that is, of users of programming languages. Non-developer researchers such as methodologists, analysts and designers also take part in technical practices; in particular, they create methods of research or provide methodological innovation through software-using. We are thus recognising different forms of technical practices that can be based on the making or the use of software. In alignment with Rieder (2020), a good understanding of software would not be provided by a contextual comprehension of how social, political or economic issues have become entangled with algorithmic information ordering, for instance, but rather through the "basic materialities and conditions of production" of software (p. 33). Here, however, I would like to devote attention to software's potentialities for solving specific problems or research questions when using digital methods. That is particularly related to software's conditions of use, rather than its conditions of production.

20 Although building tools in digital methods is synonymous with a specific and situated research purpose, the international academic community has certainly benefited from its open source tools.

21 "Digital methods, however seek to introduce a sociological imagination or a social research outlook to the study of online devices" (Rogers, 2015, p. 2).

Table 1.2. Technical creators and practices in digital methods (adapted from Rieder, 2020).

Technical practices
Software-makers (developer researchers): the practice of software-making, building technical objects and taking into account "computing's highly layered and modular character" as technical creation.
Software-users (non-developer researchers): the practice of software-using, building methods of research with, through and about medium technical specificity as technical creation (a methodological creative use of software).

Researcher as
Software-makers: users of programming language; makers and coordinators of software particular forms of practices; builders of tools/software.
Software-users: users of software; coordinators of software particular forms of practices in methodological processes; builders of methodological innovation.

Attention to technical culture
Shared focus: the technical content of software (materialities, particular potentialities, functioning, outputs).
Software-makers: conditions of production and implementation; "the methods and mechanisms that constitute and inform operation" (p. 53); "a deeper understanding of software and software-making" as a response to the challenge posed by the emergence of a technical culture.
Software-users: conditions of use and implementation; software operation and what it means; a practical understanding of the active role of software and software-using in the full range of digital methods, as a response to the call for a technical culture in digital research.
When working with software, researchers ought to pay serious attention to the content of software, understanding its materialities, (particular) potentialities, functioning and outputs, while aware that part of software technical substance “sits at the centre of technical practice” (Rieder, 2020, p, 54). For instance, and as suggested in table 1.2, the use of programming language and digital infrastructures to create tools or, by using software, to create new methodologies. Although sharing concerns with software technical substance, the makers (developer researchers) may care about “the methods and mechanisms that constitute and inform operation” (p. 53), whereas the users (nondeveloper researchers) seek to understand software operation. In the practices of digital methods some questions are raised, yet they are answered in different ways. For instance, what does the software do? For what purpose? How does it work (affordances, limitations and potentialities)? What are software conditions of production and use? What is required for its implementation? That is to say that the range of digital tools necessary to the realisation of digital methods call for a certain 36 technical expertise when making or using software, confirming that “software demands an engagement with its technicity and the tools of realist description” (Fuller, 2008, p. 9). While the matter of technicity will be discussed in chapter 2, I want to turn attention to the peculiar vision of digital methods as a technical ensemble (suggested in figure 1.4 and more specifically exemplified in the chapter 3, figure 3.13). That is, when applying the full range of digital methods, the research is not dealing with “purposeful assemblages” or cascades of devices and inscriptions (data), as suggested by Ruppert, Law and Savage (2013), neither are they working with “an assemblage of research methods” as defined by Helmond (2015b, p. 20). But, by placing the possibilities and potentialities of a range of software into action and with one final purpose, the researcher becomes a builder and coordinator of a technical ensemble. To start understanding this, we may look at the workflow of the methods in figure 1.4. I will return to this discussion at the end of this chapter and in chapters 2 and 3. The process starts with the existence of grammatised actions22 (e.g. tagged content, shared links, liked publications, followed accounts) yield in social media APIs (Fig. 1.4). These grammars are not only aligned with current forms of appropriation and usage of a given platform culture, but also entangled with its mechanisms (e.g. most recent tagged-posts or stories made available by Instagram, related videos by YouTube or top 100 ranked URLs by Google, recommended apps by Apple’s App Store). An extraction software is, thus, required to access contents, for instance when using YouTube Data Tools (Rieder, 2015) to communicate with YouTube API. This is a task that occurs between software (YTDT and YT Data API) but which responds to the researcher intervention who is aware of its possibilities and restrictions. Such form of communication is, perhaps, an unnoticed detail in face of the ease within which one can pull data from digital platforms. The restrictions can refer to data access regimes and, consequently, data issues such as completeness, consistency and architectural complexity that must be taken into consideration (see Bucher, 2013; Ho, 2020; Rieder, Abdulla, Poell, Woltering, & Zack, 2015). 
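A minimal sketch of this software-to-software communication is given below: a single request to the YouTube Data API v3 search endpoint, whose results are flattened into rows that could be written to a CSV file. The endpoint and parameters reflect the public documentation at the time of writing, the API key is a placeholder, and pagination and error handling are deliberately left aside.

```python
import csv

import requests  # third-party: pip install requests

API_KEY = "YOUR_API_KEY"  # placeholder; each request counts against a daily quota
ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def search_videos(query: str, max_results: int = 25) -> list:
    """Return basic metadata for videos matching a query (a single API call)."""
    params = {
        "part": "snippet",
        "q": query,
        "type": "video",
        "maxResults": max_results,
        "key": API_KEY,
    }
    response = requests.get(ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    return [
        {
            "videoId": item["id"]["videoId"],
            "title": item["snippet"]["title"],
            "channelTitle": item["snippet"]["channelTitle"],
            "publishedAt": item["snippet"]["publishedAt"],
        }
        for item in response.json().get("items", [])
    ]


# Hypothetical usage: one of the tabular outputs in the workflow of Figure 1.4.
# rows = search_videos("digital methods")
# with open("videos.csv", "w", newline="", encoding="utf-8") as handle:
#     writer = csv.DictWriter(handle, fieldnames=rows[0].keys())
#     writer.writeheader()
#     writer.writerows(rows)
```

Even such a small call runs up against the platform's data access regime: daily quotas, result caps and fields that may change from one API version to the next.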
That, however, is something one can verify in the API documentation or in the extraction software outputs.

22 When online activity is rendered, organised and re-arranged by digital platforms. In chapter 3, I will address the notions of grammatisation and grammatised actions.

Different data format files are often made available, such as comma-separated values (CSV), JavaScript object notation (JSON), graph dataset format (GDF) and tab-separated values (TAB). This work material is then transposed to mining, analysis and visualisation software such as Excel, OpenRefine, Gephi, Tableau and ImagePlot. At this point, we can clearly see layers of technical mediation that take place in the practical work with digital methods and that should be considered relevant to the research process. In addition, by offering "a general research strategy, or set of moves, that have certain affinities with an online project, mash-up, or chain methodology" (Rogers, 2019, p. 10), the methods consequently impose a certain proximity with the software.

Figure 1.4. The workflow of social media analysis with digital methods (Twitter screenshot). This visual workflow was also presented in a lecture given by Bernhard Rieder at the Universidade Nova de Lisboa in 2015.
23 Slides are available at https://www.slideshare.net/bernhardrieder/analyzing-social-media-withdigital-methods-possibilities-requirements-and-limitations.

When looking at figure 1.4, it is possible to understand not only what digital methods mean in practical terms, but also what comes along with them, and to comprehend that the methods prescribe a particular attention to technological grammar while accounting for the content and particular forms of software practice (this discussion will be elaborated in the following chapters).

On this basis, and according to such a workflow, when questioning whether we were trained to repurpose digital media and data for research in the same way we were taught to apply online surveys or questionnaires, the answer would be no, mostly not, as I have learned by both participating in and organising numerous data sprints. The techniques used to build questionnaires, like the traditional sociological approach they follow (e.g. demographic features), do not fit the reality or the practices of digital platforms (Venturini et al., 2018), since this field of work uses different materials and methods. There is a difference between making a list of questions for the purpose of gathering information directly from people (questionnaire) and making research questions with a list of keywords (natively digital objects) for the purpose of gathering information within the web environment (query design). The practice of digital methods requires skills and knowledge different from those demanded by the use of questionnaires, such as the art of querying and the attention given both to software's technical substance and to the layers of technical mediation in the full implementation of the methods. The repurposing of digital media and data does not end in the spreadsheet, or in following the advantages of statistical packages, as we see in the analysis of online questionnaires. Both analytical proposals require some training, but not in the same manner.
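Returning to the workflow in figure 1.4, the sketch below gives a concrete, if simplified, picture of the handoff between extraction output and visualisation software: a hypothetical CSV of posts is turned into a co-hashtag network file that Gephi can open. File and column names are assumptions rather than the output of any particular extraction tool; the point is the general move from tabular work material to a graph format, one of the layers of technical mediation described above.

```python
import csv
import itertools
import networkx as nx

# Hypothetical extraction output: one row per post, with a "hashtags" column
# listing the post's hashtags separated by semicolons (names are assumptions).
G = nx.Graph()
with open("posts.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        tags = sorted({t.strip().lower() for t in row["hashtags"].split(";") if t.strip()})
        for a, b in itertools.combinations(tags, 2):
            # two hashtags co-occurring in the same post become a weighted edge
            if G.has_edge(a, b):
                G[a][b]["weight"] += 1
            else:
                G.add_edge(a, b, weight=1)

# GEXF is one of the graph formats Gephi reads; the resulting file can then be
# spatialised and explored visually there.
nx.write_gexf(G, "co_hashtag_network.gexf")
```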
A checklist of questions related to the practice of digital methods

The best way to start understanding the impact of digital methods in social research is reflected in the forms in which research questions can be asked and answered. That is as fundamental as the practical awareness of the medium's effects and substance. The following bullet points gather a checklist of questions, serving as guidance to think more thoroughly about the practice of digital methods (based on Omena, 2019; Rogers, 2019; Venturini et al., 2018)24. This mode of inquiry goes in parallel with the technical practices of digital methods, taking a more active role when the query design is defined, while also exposing the relevance of software in their operation.

24 See also the following interviews with Richard Rogers: http://densitydesign.org/2014/05/aninterview-with-richard-rogers-repurposing-the-web-for-social-and-cultural-research/ (conducted by Michele Mauri from Density Design Lab) and http://revistadisena.uc.cl/index.php/Disena/article/view/841 (conducted by Sarah Lewthwaite).

§ Basic questions with natively digital objects and methods:
Is the Internet a sphere to ground findings? Are digital methods suitable for my research? Do I have keywords (tags, URLs, link domains, retweets) as starting points? Crawling, scraping or API calling? Why? Social media, search engines, web archives or other platforms? Why? Are these "good" hashtags/hyperlinks/URLs/expert lists? Why? What is the logic of the recommender algorithm? What is captured by platforms' APIs and how are records connected by/through them?

§ Questions related to building lists:
What type of query (e.g. specified, underspecified)? Are there expert lists?
For dominant voice and concern: Who are the specific actors that give voice to a problem and to its specific areas? What is considered and ignored? For how long?
For commitment: How about the longevity or durability of this concern? Are those concerned committed? Which issues were fleeting?
For positioning and alignment: Who is using the same language? What particular keywords? How is this problem specifically articulated and counter-articulated?

§ Questions related to data collection and analysis:
For technological grammar: Which platform(s)? Why? How do they work? At which specific mechanisms should I look (e.g. ranking systems, recommendation algorithms)? How are digital records made available?
For software: Which software? Why? What are the entry points to collect data? What are the technical specificities? What are the limitations? What types of files are required or generated?
For digital records: What records are available? How are they captured, re-organised and made available? How can they be studied? Why? What media items or metadata can I get?
For the analytical decision: Who to look at? Why? How? (e.g. highly visible and/or ordinary content, actors and practices) What to look at? Why? How? (link domains, engagement metrics, comments, timeframe) Single or cross-platform analysis? Why? How? What for? Networks, rank flows, grids, scatter plots, bee swarms or another type of visualisation to explore my dataset?
For the-repurpose-of: What questions can be asked? How to take advantage of platform data and mechanisms? What technical specificities count? Why? For what purpose?

Data sprints as a form of learning

Based on my own experience, I want to advocate that fully understanding the methodological proposal and philosophy of digital methods requires a process of learning by doing in a collaborative and interdisciplinary environment facilitated by data sprints.
But first, a brief explanation of data sprints, which are defined as "intensive research and coding workshops where participants coming from different academic and non-academic background convene physically to work together on a set of data and research questions" (Venturini et al., 2018, p. 1)25. In other words, data sprints promote forms of implementing exploratory and inventive ways of reading, seeing and analysing platform data, bringing "social scientists, developers and data designers together with relevant domain experts to explore research questions and create prototype digital methods projects" (Munk, Madsen, & Jacomy, 2019, p. 110). Within a limited timeframe, usually around five intensive days, a data sprint offers multiple forms of conducting research. Due to the enormous time pressure, analytical decisions have to be made fast and wisely, to avoid the risk of getting "wrong" results or "not well executed" projects (Rogers & Lewthwaite, 2019). At the end of the week, the projects' results and preliminary findings are presented and, after the sprint, project reports are made available online, providing time to develop and improve, as well as to finalise the incomplete tasks not carried out during the week due to time constraints. For these reasons, data sprints tend to serve well-researched projects at different stages of development, providing rich and substantial insights either for experimental and exploratory studies or for confirmatory ones (Omena, 2019).

25 In the context of SMART Data Sprint, the following video was created with the purpose of explaining the data sprint approach: https://www.youtube.com/watch?v=bveMpEtAvug

A pioneer in the data sprint approach is the well-known Digital Methods Initiative (DMI) at the University of Amsterdam, which has two directors, Richard Rogers and Erik Borra (technical director), and more than 13 editions of its summer schools26. This initiative has inspired the creation of the SMART Data Sprint at Universidade Nova de Lisboa, founded and coordinated by myself, also considered a reference among data sprints and now heading to its sixth edition27. This is the justification for using my own experiences28 to validate my argument on data sprints as a form of learning digital methods by doing.

26 See https://wiki.digitalmethods.net/Dmi/DmiAbout.

Over the years, I have seen how the data sprint approach can be a valuable source for practically introducing the lines of research inquiry, techniques and potentialities attuned to the digital methods proposal for (non-)academic studies. Additionally, for those who decide to make room for the computational medium and its methods in the research process, data sprints have proven to be a change of course in the way scholars think of digital research and make use of available digital records and methods. From this background, and despite normally being considered a technique for organising collaboration, data sprints are also a means of learning digital methods. Participants learn by listening (keynotes) and, most importantly, by doing. For instance, they learn how to carry out specific tasks and implement digital methods' techniques, such as how to plot a collection of images, build image-hashtag networks, explore the visual affordances of networks, query vision APIs or use software as a tool to solve research problems. It is thus a golden opportunity for those who seek to learn from the digital methods approach, and best used when participants prepare in advance.
As a form of preparation, before the sprint, participants are asked to follow tutorials and to install and explore the utilities of web-based plug-ins or software. By doing so, they do not start from scratch once there and can take greater advantage of the workshops. Another way of learning is by engaging with a working group, where participants tend to choose a project by affinity with the object of study or with the proposed methods and techniques (or both). Projects can offer a good environment for understanding not only how to formulate research questions but also, in practical terms, how to respond to these questions. During this process, the intervention and the role of information designers are important both in teaching innovative visual methodologies and in generating adequate visualisations that will assist visual exploratory data analysis and the results of the projects (see Ciuccarelli and Elli, 2019; Mauri, Gobbo and Colombo, 2019). However, such a contribution is also expected from experienced researchers who are familiar with the use of digital methods; it is thus not restricted to the intervention of the designers. The hands-on methods and very specific techniques taught in the workshops, combined with the conditions in which one can experience the full range of digital methods in a collaborative environment, make it possible for researchers to implement the methods while they learn how to repurpose digital media and data through technical practices. Data sprints are thus an invitation to face and solve analytical and technical challenges in practical terms.

27 See https://smart.inovamedialab.org/.
28 My first contact with digital methods, in a data sprint environment, took place during the 2014 DMI Winter School in Amsterdam. Since then, self-learning, participation in and organisation of multiple data sprints gave me the experience in using digital methods for interdisciplinary research projects and in helping to conceptualise and develop methodological steps that these and other projects are using in different research areas.

The many challenges of Digital Methods

In the previous sections we learnt that what makes the difference in digital methods is an invitation to first learn from medium specificity (following its logics, forms and dynamics) and, consequently, to repurpose what is given by the methods of Internet platforms for social, cultural or medium research. When scrutinising online dominant devices and their methods, particular techniques to formulate queries are required. Key to this process is the researcher's ability to define a list of words (e.g. URLs, hashtags, video or image IDs, social media accounts) as issue language. Such an ability underpins search as research, which is followed by a proper understanding and use of the work material (digital records and software) and of the technical practices these methods require. Under the premise of a medium research perspective, the functional logic of work in digital methods thus invites researchers to think the subject of study in, with and through a practical-technical research process.

Four underlying principles

Figure 1.5 illustrates what was discussed in the previous section but also describes what is at stake in the implementation of digital methods. That is the ability of thinking the subject of study through different but interconnected operations, such as mastering the art of querying platforms while understanding platform grammatisation (see chapter 3).
In the same way, it means knowing that the methods deal with thick layers of technical mediation, which impose a certain proximity with the software. All operations are interconnected. That means, for example, that grammatised actions inform the making of queries as research questions (query design) and, at the same time, are reflected in research software and visualisation models. These four operations correspond to the implementation of four key principles that underpin the practice of digital methods.

Figure 1.5. Four key principles that underpin the practice of digital methods (adapted from Omena, 2019).

The first principle assumes that methods take an interdependent position in the research process, from its conception to the decisions made during the analytical procedure, while in the second we learn that platform infrastructures play active roles in research decision-making processes and should be accounted for. After all, platforms are not intermediaries; their mechanisms intervene, shape and organise what we see (Gillespie, 2015, 2018b) and, consequently, how we read the subject of study. Through these lenses, once again, we understand that one cannot study society through digital platforms without studying the platform itself. The third principle is the requirement of (a minimum of) practical expertise in applied research with digital methods, such as the ability to carry out data extraction, mining and visualisation. The fourth principle concerns understanding digital methods as both interpretative and quantitative. This will be demonstrated through the different methodological approaches and case studies I am proposing in this dissertation, in particular concerning hashtag engagement research, digital network studies and medium research.

These principles reflect what drives and what is always present in the implementation of digital methods: an imaginative, collaborative and experimental endeavour. That was something implicit in the previous section. However, this reality also exposes some practical problems, such as: how can one specifically learn from online objects and methods? How can we recognise and apprehend "the mode switch" (Rogers, 2019) required by digital methods practices? To learn from the medium and to recombine, reuse or repurpose its methods are not simple tasks. In this respect, I will suggest an unusual way of unpacking digital methods, using problems as the point of departure and adjectives for a more accurate description.

Characterising technical knowledge and practice in digital methods

In the light of my experience, I will now broadly address the problems of technical knowledge and practices in digital methods. To help in this effort, a list of adjectives will be used to introduce a reality rarely emphasised in the digital methods literature. Overwhelming. Uncomfortable. Challenging. Unpredictable. Demanding. Complex. Extremely challenging. Fascinating.

Digital methods are overwhelming. To carry out research based on these methods, one is supposed to have basic knowledge about platform mechanisms: to know how online devices treat web data; to formulate queries as research questions; to ground findings based on "the uncontrolled methods-of-the-medium" (Rogers, 2019). There is another implicit requirement, that of using research software while getting familiar with the web environment. Consequently, when designing research, many practical questions arise: where to start? Which platform and why? Which tools? How to make research questions as queries? How to collect data?
What is next? How to map issue networks using these methods? These concerns lead us to another issue: how uncomfortable and challenging digital methods can be. The presence of how-to questions requires constant learning about both thinking along with the medium and knowing how to practically handle it. This, as a reminder, informs us that research methods evolve with the context of technical mediums. Therefore, by acting in conjunction with as well as in response to the existing medium, digital methods take researchers out of their comfort zone. The methods are thus driven by cascading and cyclical processes.

Adding to that, and since researchers depend on "the availability and exploitability of digital objects" (Rogers, 2013, p. 1), we perceive digital methods as an unpredictable approach. As we know, the web environment is live and not static, which obviously means the methods we use to capture such a landscape may be short-lived. So, again, the methods come with another implicit requirement: monitoring. There is no way around it, as one cannot criticise the continuously changing platform mechanisms or study social phenomena embedded in such infrastructures without watching a situation carefully for a period of time (in the web environment), while being aware of the existence of suitable tools capable of collecting and working with online data and methods. Therefore, the methods work with instability by default, which is, rather than something negotiable, a starting point. However, who wants to deal with instability or cope with unpredictability by default? Who wants to feel obliged to monitor the subject of study on the web or to learn technical stuff? Who wants to understand software? When using digital methods, there is no option but to face several degrees of uncertainty, and this reality is certainly not a welcoming research scenario for scholars who are used to working with web data as if it were survey data. In these cases, the major problem is to ignore the entanglements of data with medium specificity and software functioning, which is why digital methods, being so demanding, require researchers to have a minimum of technical knowledge.

To further complicate this issue, these methods are not exclusively about data practices. The difficulties encountered in the use of digital methods are not only technical, but also conceptual. At this stage, knowledge and findings are constantly mediated, but also informed, by tools inserted in a network of methods. Every stage of the process impacts another and, as it goes, the tools carry their modes of being, entangled with the researcher's analytical decisions (successes and errors) concerning content related to the subject of study. This also means digital methods deal with an environment inhabited by technicity. That is a complex task, and the first contact with these methods can be extremely challenging (although always fascinating), because one cannot tell where to begin, what to look at and how to cope with the full spectrum offered by the methods.

The list of adjectives can help us to comprehend that research results and findings obtained with digital methods also correspond to knowledge about the conditions, methods and specificities of the medium. On top of that, there is the mandatory call for working with a set of tools/software. Accordingly, and in the spirit of Hoel (2012), when using digital methods, we should consider that knowledge is not only acquired through symbolic forms but primarily through material instruments (e.g.
from the methods of platforms to research tools) combined with practice (e.g. knowing how to). This is a process in which "theoretical-practical-technical modes of reasoning interpenetrate each other not only sometimes but in principle" (Hoel, 2012, p. 75). Next, I will discuss the problem of knowledge in digital methods (both technical and practical) by questioning how we can develop a research mindset that might help us to think along with the medium. Moreover, I want to argue in favour of re-thinking the definition of medium in digital methods. A discussion about the missing pieces of digital methods will be held, with the purpose of supporting my argument and also demonstrating how digital methods deal with an environment inhabited by technicity.

A call for a broader definition of "medium" in digital methods

In a recent interview, Richard Rogers emphasises the importance of staying in the digital methods mind-frame and then keeping on working on a piece of research within this mind-frame. He is certainly referring to the repurposing of digital media and data. In fact, finding a way to such a mindset may be the greatest challenge of the digital methods approach. It is something less related to knowing how to code or do things with software and more related to a certain proximity with the software in its own ways. How can one keep working on a piece of research within the digital methods mind-frame? How can one concretely learn from the media and then repurpose their methods and data for social and medium research? In answering these questions, we may fully benefit from the mindset underlying the philosophy of digital methods. When doing this, we also fill a missing part in the methods and in digital social research as well, which is something that forces us to develop "a digital Bildung" (Berry, 2011; Rieder & Röhle, 2018). In other words, as Rieder and Röhle (2018, p. 123) argue, "we have to be able to think with and in technology as a medium of expressing a will and a means to know".

To further complicate matters, and under the requested mind-frame, the methods invite us to rethink the conditions of proof in digital research. Rogers (2019) presents two main ways in which researchers could do that: first, by assuming the online as a site of grounding and, secondly, by questioning "whether the medium, or media dynamics is overdetermining the outcomes" (p. 21). By taking up this proposal, and considering the complete assembly of the methods, how is one supposed to repurpose dominant devices and data without considering the other mediums which are part of the methods? Here we come across another missing piece, the call for a broader definition of the medium in digital methods.

As we learnt, thick layers of technical mediation are inherent to the methods, as illustrated in figure 1.4 (see also Rieder and Röhle, 2018). That is to say that, when using digital methods, beyond recognising platform mechanisms (ranking, crawling, scraping), we should also consider research software (e.g. Gephi) and web-based applications (e.g. YTDT, vision APIs) as mediums to be learnt from and repurposed, since these mediums have particular forms of practices and modes of operation that take an active role in digital research. Researchers are thus required to pay attention to the content of software because it points not only "to concepts – but also to objects, practices and skill sets" that Rieder and Röhle consider to have considerable internal heterogeneity and variation (2018, p. 10).
By enhancing the notion of medium in digital methods, as well as its effects, we may read Rogers' statement from a different perspective:

With Digital Methods one of the key points is that you cannot take the content out of the medium, and merely analyse the content. You have to analyse the medium together with the content. It's crucial to realize that there are medium effects, when striving to do any kind of social and cultural research project with web data. (Rogers, 2014)

The medium in digital methods is being perceived in a broader sense here, not only "for clues and guidance" (Rogers, 2010, p. 249), but also as real substance in research. We may thus want to consider platform mechanisms, but also research software and web-based applications, as "a medium of expressing a will and a means to know" (Rieder and Röhle, 2018, p. 123; see also Rieder, 2020). In other words, when doing digital research, we should also include the forms held within a computational medium, as David Berry (2011) suggested to us almost 10 years ago. Generally, what is hereby proposed may further complicate the problems of knowing how and knowing why when dealing with or using digital methods. In 2013, Evelyn Ruppert, John Law and Mike Savage were already warning us that digital devices demand "a better analytical grasp", one not offered in social theory or in technical accounts of method (2013, p. 30). The authors were referring to the capacity to explore "fields of devices as relational spaces", "the chains of relations and practices enrolled in the social science apparatus" (Ruppert, Law & Savage, 2013, pp. 40-41). After almost a decade, the call for such an analytical grasp has become fundamental in digital research and even more specialised, although still poorly documented in technical and practical accounts of method.

This helps us to identify another missing piece of the digital methods approach that I want to emphasise here, something related to the role played by technical and practical knowledge of the mediums. In other words, the urge for a practical awareness of the technical reality of the (computational) mediums involved in the methodological process. Figure 1.6 illustrates that in a very old-fashioned way by describing a step-by-step protocol required to build a network of the followers-of-the-followers of an Instagram bot profile (Mary__loo025), which speaks to bot detection techniques and studies. The most interesting finding in this example was the detection of private bot accounts as central and bridging nodes within the network. Figure 1.6 purposefully shares what precedes the network visualisation as its substance, rather than showing a beautiful network29. This result was made possible by a combination of factors, such as previous exploratory work on Instagram bots using digital methods (see Omena, 2017; Omena et al., 2019), techniques of interpreting digital networks (Venturini et al., 2015; Venturini, Jacomy, & Jensen, 2019) and a good understanding of Instagram specificity (e.g. [bot] accounts can be private or public) combined with Gephi affordances. This context helped in deciding what should be highlighted within the network: the node colour of private accounts. By doing so, when looking at the position of nodes, it was possible to see and confirm the role of bots' private accounts in the market of fake followers.

29 The network visualisation of this example is available at https://wiki.digitalmethods.net/Dmi/SummerSchool2020GoodEnoughPublics
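The sketch below is not the protocol of Figure 1.6 itself, but a minimal illustration, under assumed inputs, of the kind of network construction it describes: hypothetical CSV exports of follower lists (of the sort a collector such as PhantomBuster can produce, although the file layout and column names here are invented) are assembled into a followers-of-the-followers graph, with a private/public attribute attached to each node so that it can be carried into Gephi and used, for instance, to colour private accounts.

```python
import csv
import glob
import os
import networkx as nx

# Assumed inputs: one CSV per seed (botted) account, named <seed>.csv, with
# columns "follower" and "is_private"; both the layout and the names are
# hypothetical, not an actual PhantomBuster export schema.
G = nx.DiGraph()
for path in glob.glob("followers_of_followers/*.csv"):
    seed = os.path.splitext(os.path.basename(path))[0]
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            follower = row["follower"].strip().lower()
            G.add_edge(follower, seed)  # edge direction: follower -> followed account
            G.nodes[follower]["private"] = row.get("is_private", "").strip().lower() == "true"
    G.nodes[seed].setdefault("private", False)

# Export for Gephi, where node colour (private accounts), position and
# bridging roles can be read through the network's visual affordances.
nx.write_gexf(G, "bot_followers_network.gexf")
```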
Here, a fundamental requirement was the researcher's ability to think about the subject of study (botted accounts and their agency) in the context of Instagram cultures of use and grammatisation, beyond mastering the use of research software. This triangular relation will be further discussed in chapter 3. Through this short example, we may recognise a technical mental endeavour in conducting research which goes side by side with a practical awareness of the technical reality of the computational mediums involved in the methodological process. Here is where the notion of technicity, entangled with the mindset of digital methods (in thoughts and actions), arises and helps.

Figure 1.6. A description of technical knowledge and technical practices in digital methods30.
30 There are some steps that precede the use of PhantomBuster, for instance to create a research account, to buy followers and trace them in order to make a list of botted accounts (profile URLs) to be used as the entry point for data collection; in this case, to create a network of Instagram bot followers-of-the-followers.

In response to this, as I argue, the extra effort needed in digital methods points towards the development of a sensitivity to the technicity-of-the-mediums (see chapter 2), facilitating the understanding of digital methods as a technical ensemble. A situation in which researchers put together and coordinate or create the (computational) mediums which are part of the complete process of digital methods, while they connect technicities for solving research problems. That is a situated and practical perspective that this dissertation proposes for the digital methods approach. A proposal that aims at methodological innovation in digital research, calling for inventive and imaginative ways of research enquiry and practice. In such a reality, the methods reinforce that researchers need to be aware of the potentialities and limitations afforded by the mediums, while putting into action their conditions of possibility. In this spirit, when implementing the methods, we need "to test different ways and to experiment as many times as necessary", learning by trial and error in practical terms (Omena, 2019, p. 9). A process that is refined and transformed by repetition (practising), but mainly when we make room for the medium's substance and ways of being. These are the requirements to fully benefit from the advantages of the digital methods approach. These issues will be addressed in the following chapter, which also questions how the concept of technicity matters in the practice of digital/Internet research.

CHAPTER 2. THREE ATTEMPTS TO UNDERSTAND TECHNICITY

The most powerful cause of alienation in the world of today is based on misunderstanding of the machine.
Gilbert Simondon, 1980, p. 1.

If one wants to understand a being completely, one must study it by considering it in its entelechy, and not in its inactivity or its static states.
Gilbert Simondon, 2009, p. 19.

First attempt to understand technicity (Media Theory)

Technicity, a term originally borrowed from philosophy, refers to the relationship between technology and humanity (or humans). When speaking of the concept of technicity, the works of Martin Heidegger, Gilbert Simondon and Bernard Stiegler31 are the most widely acknowledged; these authors refuse to think of technology or technical objects as mere tools because they play a pivotal role in society and are a constitutive part of ourselves and our culture32.
As a reference for a mode of reasoning about the relations between technology and society, technicity alludes to the operative functioning of technology, that is, its constitutive mode of ordering or governing, which simultaneously transforms society and what we are (Bucher, 2012; Hoel & Van Der Tuin, 2013; Rieder, 2020; Rieder, Abdulla, Poell, Woltering, & Zack, 2015; Simondon, 2017). Technicity thus refers to a field of knowledge which works across theoretical, technical and practical frameworks, but it is first and foremost technical in essence (see Rieder, 2020). The role of technicity in grasping technical mediations and the established relations between machines and humans has been a matter of concern in digital media studies and related fields. This section proposes to review the appropriations, purposes, framings and conceptual basis of technicity in different fields of study. I want to address questions such as: Why does technicity matter and in which ways? How are scholars defining and using this concept? To understand technicity, what must be looked at? How? What can we learn from that? By so doing, and in a first attempt, I will introduce what technicity is.

31 Following these relevant philosophers of technology, James Ash (2012) summarises three ways of understanding technicity: i) as a persuasive logic for thinking about the world, representing Martin Heidegger's thought; ii) as a mode of existence of technical objects, relating to Gilbert Simondon's philosophy; and iii) as an originary condition for human life itself, the proposition of Bernard Stiegler. Considering that mapping all these philosophical thoughts on technicity is neither an easy task nor the goal of this dissertation, I will draw particular attention to what we can learn about technicity from a very specific angle on the work of Gilbert Simondon, combined with what we can learn from the appropriations of this concept in digital media studies.
32 A non-Aristotelian tradition of thought on technology, or one that is not based on the concept of instrumentality (see Frabetti, 2011).

Ways of thinking technicity in Digital Media field studies

In Game Studies, the media theorists Jon Dovey and Helen Kennedy (2006) introduced technicity as a concept "to account for particular formations of identity and power which lie at the heart of computer game cultures" (p. 16). In their book Game Cultures, the definition of technicity is taken from cyberculture studies but is also enhanced through the work of Pierre Bourdieu. From the former comes the attention to attitudes towards technology, its adoption and the corresponding practices; from the latter, the inclusion of issues like taste and cultural capital. In this sense, the deployment of technicity would look at a network of relations with software, from the ways specific skills in specific technologies are privileged in a given context to the ways "in which specific technologies bring us into new relationships with machines" (Dovey & Kennedy, 2006, p. 16). The technicity of video games would then be unveiled by appreciating "the technologies and techniques of production, design, implementation, and appropriation of videogame systems" (Crogan & Kennedy, 2009, p. 113). However, when looking at techniques, the player, the game and the environment are not isolated elements but relational.
In practical terms, if one wants to understand (or study) the technicity of video games, gameplay or game cultures, one may look at different techniques, such as those related to game design and play, to rules, exceptions and practices, to reading and criticising games, or to the capability of understanding cheating in digital games or gamers' expertise (Crogan & Kennedy, 2009; Kuecklich, 2009; Toft-Nielsen & Nørgård, 2015). The first insight we can take from Game Studies is this strong bidirectional relation between the player, with particular skills in particular game practices, and the game in itself and its forms of appropriation within a situation. In a context in which the use of technology (the game) combined with a degree of expertise matters, the player's performance reflects such close relations between players and gaming. In this respect, Toft-Nielsen and Nørgård (2015) say that the notion of technicity in gaming ties into the very idea of the entanglements of the players' kinaesthetic performance, competence and expertise. To the authors, player expertise cannot be limited to technological competence, but must "recast the purpose of corporeality and gender in relation to expertise"33. That means that, through the lens of technicity, the authors understand "performing gaming expertise as a compound concept in which identity, technology, gender and corporeality" (p. 355) take place.

33 Here expertise "manifests itself through the kinaesthetic performance of moving hands and bodies, expressing intimate and skilful rhythmic timing and patterns as gaming expertise" (Toft-Nielsen & Nørgård, 2015, p. 356).

Technicity is central not only in Game Studies but, as Patrick Crogan and Helen Kennedy34 argue, this notion is pivotal for the elaboration of "a more rigorous and focused perspective on the theorization of technology" (p. 107). In gaming research, they argue, players and their cultural and collective involvements should be taken "as processes of becoming intertwined with lineages of technological development and disjunction which are the condition of these processes" (p. 107). To grasp these processes, the authors explain, a focus on ludic technicity is required.

Dwelling on and in technicity seeks to sustain critical attention on the processes through which the human, as always social, connected individual—connected through techniques, technologies, and dynamic traditions of practice—lives a particular existence. The dynamic of technical development is the medium or environment of this becoming, and individual and collective identity is at best a metastable state that accommodates or regulates provisionally the flow of transformations in human–technical relations. (Crogan & Kennedy, 2009, p. 109)

From this argument, we derive the idea that technicity does not exist in a particular technical object/machine but is rather accommodated in a process that assembles socio-technical relations and individuals. This conceptual perception becomes even more tangible when appreciating technicity under the lens of Internet research. In the context of platforms-software studies, Sabine Niederer and José van Dijck (2010) uncover the role of technicity as a knowledge instrument by inquiring into Wikipedia's "dynamic nature" and accounting for its system of partially automated content management.
The understanding and analysis of Wikipedia as a socio-technical system, as the authors justify, are important steps towards a better comprehension of "the powerful information technologies that shape our everyday life and the coded mechanisms behind our informational practices and cultural experiences" (p. 1384). Thus, one of the key strengths of this way of studying Wikipedia is the consideration of its technical specificities, namely how humans and bots contribute to the management of content. More recently, in Networked Content Analysis, Niederer (2019) presents the concept of technicity as a synonym for the notion of medium-specificity often used in digital methods (see Rogers, 2013, pp. 25-26). The starting point is web content and how it is embedded into digital platforms. In this sense, technicity echoes how the forms and functions of platforms shape web content. In other words, technicity refers to what Niederer (2019) has described as "platform-specific aspects of content" or networked content (p. 35). In this sense, following her proposal, there is technicity in web content.

34 In 2009, the authors coordinated the special issue on Technologies between Games and Culture in Sage's new journal "Games and Culture". This journal was established in early 2006.

Taina Bucher (2012) turns to Michel Foucault's concept of 'governmentality'35 to propose the term "technicity of attention", which explains that digital platforms operate "as an implementation of an attention economy36 directed at governing modes of participation within the system" (Bucher 2012, 1). Bucher looks at the details of the Facebook Graph API and describes some of them with the intention of understanding how this platform generates and manages attention. As a digital infrastructure, Facebook poses the problem of the vast collection of practices it establishes and hosts; "a controlled environment that users act in, but have little power to change" (Rieder et al., 2015, p. 3). In this context, Bucher prescribes technicity as a way to condition participation on social media. In other words, she understands technicity as "a mode of governmentality that pertains to technologies" (op. cit., 3)37, thus working as a way to simultaneously understand the modes of software governance and how it "propagate[s] a certain social order of continued (user) participation" (Bucher 2012, 17).

35 "Refers to the rationalities that underlie the 'techniques and procedures for directing human behaviour' (Foucault, 1997, p. 81)" or, better said, "mentalities or modes of thoughts that are immanent to 'government' or the 'conduct of conduct'" (Rose et al., 2006).
36 See Goldhaber (1997): The Attention Economy and the Net.

The technicity of Facebook is addressed differently by a group of New Media, Journalism, Language and Communication scholars, through a proposal to study the role the "We are all Khaled Said" Facebook page played in the Egyptian revolution of 2011, intertwined with a particular interest in exploring analytical opportunities for data-led studies. They bring to the debate the critical role of technicity in digital research. That is related, first and foremost, to the introduction of a technical language or knowledge about social media platforms; what the authors called technical fieldwork.
In a sense, a first level of perceiving technicity would raise the awareness of how "human practice is channelled through interfaces and data structures" (Rieder et al. 2015, p. 4). In this article, led by Bernhard Rieder, the authors concretely exemplify how essential technical fieldwork is to platforms-software studies. They provide a thick description of the platform's application programming interface (API): from an overall history of its main changes, additions and limitations to an accurate view of Facebook's data structures, calling special attention to data issues such as completeness, consistency and architectural complexity. The technical knowledge of the APIs is, therefore, critical for the practice of data- or platform-driven research (Bucher, 2013; Rieder et al., 2015; see also Omena, Rabello & Mintz, 2020).

37 The mode of governance of Facebook, she explains, can take place in three different forms: i) "an automated, ii) anticipatory and iii) a personalised way of operating the implementation of attention economy" (Bucher 2012, 2).

Technicity as a domain in reiterative and transformative practices

From Game Studies to Internet research, we see that technicity constitutes a complex reality that integrates human-technical relations but, at the same time, requires more concrete means, technical in nature, to be comprehended. To complement our investigation into the different uses of the notion of technicity, I want to present the work of Martin Dodge and Rob Kitchin, who help us to understand the concept through both theoretical and practical perspectives. The authors introduce technicity as the power of technologies to either make things happen or to solve ongoing relational problems; this is "the constant making anew of a domain in reiterative and transformative practices" (Dodge & Kitchin, 2005, p. 162; Kitchin & Dodge, 2007). Their definition is based on the work of Adrian Mackenzie, who developed the concept of technicity on the basis of Gilbert Simondon's philosophy of technology38.

Technicity refers to the extent to which technologies mediate, supplement, and augment collective life; the extent to which technologies are fundamental to the constitution and grounding of human endeavour; and the unfolding or evolutive power of technologies to make things happen in conjunction with people (Mackenzie 2002). For an individual technical element such as a saw, its technicity might be its hardness and flexibility (a product of human knowledge and production skills) that enables it, in conjunction with human mediation, to cut well (note that the constitution and use of the saw is dependent on both human and technology; they are inseparable). (Dodge and Kitchin, 2005, p. 169)

38 Here, the understanding of technicity is not to be confused with technical mediation or medium-specificity, although it has a direct correlation with these latter.

In the first place, technicity serves as a theoretical framework that helps to understand the effect of software (code) and how it modulates sociospatial relations by making a difference "to the form, function and meaning of space39" (Dodge and Kitchin, 2005, p. 171). They reflect on the technicity of code, which is taken as something "contingent, negotiated, and nuanced; [that is] realized through its practice by people in relation to historical and geographical context" (op. cit., p. 170). In this realisation, there is a high level of mutual dependency though: on one side, "if the code fails, then the object fails to operate" (op. cit., p. 178);
by object they mean what is part of our daily routine, from washing machines to transport and logistics networks. Consequently, when an object fails, it compromises our living – namely domestic living, travelling, working, communicating and consuming. Vice versa, if a technology user does not perform her role along with software, the action occurs differently or does not result in something meaningful. That is to say, as denoted by the definition of technicity, "code (software) and its effects are peopled" (op. cit., p. 170). In the Dodge and Kitchin (2005) essay, we clearly see that there is no technicity of code without human mediation.

In a second moment, and as a complement to the theoretical vision, Dodge and Kitchin (2005) invite us to consider technicity in its practical terms. The work of code can only be unfolded in practice and in conjunction with people. To do so, and instead of focusing on code, they examined the nature of maps or, more precisely, mapping [technical] practices: "how maps are (re)made in diverse ways (technically, socially, politically) by people within particular contexts and cultures as solutions to relational problems" (Kitchin & Dodge, 2007, p. 243). Following the work of Bruno Latour, Adrian Mackenzie and Gilbert Simondon, the authors draw our attention to something that must go in parallel with the awareness of society as "hybrid assemblages of human and non-humans" (Latour 1993): that is, while looking at what constitutes maps, we should also pay attention to their process of becoming (see Kitchin & Dodge, 2007, p. 335), i.e. the interpretation of maps should not be separated from all the related practices that constitute the process of mapmaking. In this sense, we may want to read technical objects as Kitchin and Dodge read maps: as something "of-the-moment, brought into being through practices" (p. 331).

39 Dodge and Kitchin (2005) interpret space as something produced by social relations and material social practices. Rather than static, the functions of space alter with time; it is something that "gains its form, function and meaning in practice" (p. 172).

After reviewing why and in which ways technicity matters for different fields of study, as well as noticing how this notion has been used, conceptualised and applied in digital media research, one important observation should be made. The technicity of (games, gameplayer, code, software, social media) is a complex concept that requires a close relationship with software through technical knowledge and practices. This is central to my thesis, which grants "technology a new role in knowledge and existence by pointing to its involvement in processes of becoming" (Hoel, 2018, p. 420); in doing so, I act in opposition to the traditional and classic views on technology "conceived as external to being" (Hoel, 2018, p. 420; see also Marres, 2017). While this argument is not new, it is fundamental to my use of the notion of technicity to understand digital methods.

Common perceptions and appropriations

To close this section, I want to emphasise what all these studies, perceptions and appropriations of technicity have in common and what we can learn from them. By doing this, we begin to understand why technicity matters and what we should look at to gain a firm grasp of it, also situating the relationship of technicity with digital methods.
First and foremost, they all focus on the mode of existence of technical objects, embracing it as part of a relational process and proposing ways of reading the machine and its relations with us and with other tools. This is a reflexive exercise on i) paying attention to the techniques of appropriation of videogames and their entanglements with the performance and expertise of the gameplayer in order to account for identity formation; or ii) comprehending the automated editing systems of Wikipedia to evaluate how the encyclopaedia manages content; or iii) knowing how Facebook Open Graph works in order to study participation. This exercise is not only about the effect of software or the ways in which we make use of it, but also about processes of becoming. Through the lens of technicity, the comprehension of technologies is not limited to technical forms and functions (what they do, how they work) or to technical expertise (uses and practices), but makes room for how/what technologies become what they are in conjunction with people. That is something only unfolded in technical practice and through its relational aspects, which are always situated in a given time and context.

A second aspect refers to the comprehension of technicity as the understanding of a network of relations within software. That involves knowledge about the nature of machines that one can grasp through direct interaction with the software. In terms of content, technicity refers to software functioning and potentialities, but also to practices. Descriptions of forms, functions, limitations and hidden schemas take a prominent role here. I thus recall Fuller (2008, p. 9) in asserting that "software demands an engagement with its technicity and the tools of realist description". However, technicity cannot be defined solely as technical description or medium specificity, because it only exists through human mediation and performance. In this sense, and following Rieder (2020), we need to pay serious attention to the content of software, while considering that part of its technical substance "sits at the centre of technical practice" (op. cit., p. 54). Technical practices sit at the centre of digital methods, and so does the content of technicity, which resides simultaneously in machines (software, if you wish) and in us. These practices become meaningful and crucial to account for when using digital methods.

One last element that we learn from the different ways of thinking technicity is that this concept refers to a field of knowledge that combines theoretical, technical and practical frameworks. In this spirit, the attempt to present an actual application of technicity is a common concern among all these cases. From these studies, whether based on the philosophy of technology or on media studies, we can derive fundamental advice on how to grasp technicity and why it is important, as well as an awareness of technical infrastructures and their modes of functioning intertwined with the co-constitutive relations with us (as researchers) in processes of becoming with technology. In this sense, technicity directly refers to a domain of technical expertise and iterative technical practices in both theory and practice (see Rieder, 2020; Dodge & Kitchin, 2005; Kitchin & Dodge, 2007) that requires action/practice to exist. However, it should be noted that an in-depth development of the term 'technicity' seems to be disregarded, although these studies clearly indicate how we can read technicity.
For instance: through a "description of the Facebook platform and how it works" (Bucher, 2012); through understanding the "dynamic nature" of Wikipedia regarding human and bot contributions to the management of content (Niederer & van Dijck, 2010); or through listing the limitations and analytical possibilities afforded by social media APIs (Rieder et al., 2015). In the context of digital methods, I want to argue that the concept of technicity relates not only to the description of the forms and operations of the medium, but also acknowledges a practical awareness about and with medium functioning. The next section will help in this task by comprehending technicity through the lens of Gilbert Simondon's philosophical perspective.

Second attempt to understand technicity (a philosophical perspective)

In the previous section, the concept of technicity was introduced through the perspective of Digital Media Studies. Here, the intention is to provide further reflection following the perspective of the French philosopher Gilbert Simondon, with the purpose of strengthening my argument on the role of technicity in digital methods. In this second attempt at understanding technicity, the book Engines of Order - A Mechanology of Algorithmic Techniques by Bernhard Rieder (2020) serves as a substantial support to my interpretation and efforts to comprehend what technicity is.

In a document that served as a complement to his PhD thesis in 1958, Simondon (1980, 2017) develops a particular vision of technical objects, more precisely of their mode of existence, which must be taken as an assessment of their value in our lives. According to Simondon, this is the lynchpin of a new model of culture that knows and recognises the essence of technical objects. This should replace the current model of culture, which produces alienation due to a misunderstanding of the role of machines in our lives. That is the reason, he argues, our culture is unbalanced; to avoid devaluations of or confusion towards the machine, we need to become aware of the mode of existence of technical objects and understand that these objects (or technology itself) are "indeed human" (Simondon, 1980, p. 1)40. Only in this way would we grasp the technicity of "beings that function" (Simondon, 2017; see also Rieder, 2020).

40 "Using the vocabulary of Actor-Network Theory, we could say that an object's technicity realizes 'its script, its "affordance", its potential to take hold of passersby and force them to play roles in its story' (Latour, 1999, p.177). Simondon's philosophy, however, cautions us to not move too quickly to the heterogeneous but flat assemblages Actor-Network Theory conceives. In fact, Latour's more recent An Inquiry into Modes of Existence (2013) follows Simondon in arguing that such modes delineate their own substances in ways that are more profound than a mere incommensurability between language games, because they admit other beings than words into the fold of what makes a mode specific (Latour, 2013, p.20). Being itself is marked by difference and, as Peters claims, '[o]ntology is not flat; it is wrinkly, cloudy, and bunched' (2015, p.30)" (Rieder, 2020, p. 56).

These principles align with the studies discussed in the previous section. However, the application of technicity has a narrower focus in media theory, while in Simondon we see a broader spectrum (human-world-machine). Here, I am proposing to frame the perspective of technicity precisely within the reality of digital methods, following Simondon's very peculiar appreciation of technical objects, which involves aspects not considered by the previous studies (e.g. the distinction between individuals, elements and ensembles). A second aspect concerns his arguments about the role of man before technical objects: not only are we the builders and coordinators of machines, but we also live among them.
Therefore, and according to Simondon (1980), the following sections attempt to expose a threefold dimension for understanding technicity, which consists of i) the nature of the machine, ii) the values involved in their mutual relationship (machine-machine) and iii) their relationship with man (machine-man). I will be using technicity in two senses, pointing on the one hand to the effort to become acquainted with the mediums and, on the other hand, to the object of technical imagination, a perspective that can be useful and relevant to the practice of digital methods. To be acquainted with the medium reflects the researcher's attitude of understanding the medium in its own right and in relation with others; then, by knowing it well, we start reasoning with and about the technical medium, using this knowledge as a means for enquiry and for answering research questions. I advocate that this attitude, allied to the practice of digital methods, results in the development of a technical imagination, which is defined by Simondon as "a particular sensitivity to the technicity of [technical] elements; it is this sensitivity to technicity that allows for the discovery of possible assemblages; the inventor does not proceed ex nihilo, from matter that [s]he gives form to, but from elements that are already technical [...]" (see Simondon, 2017, p. 74). A technical imagination enables researchers to connect technicities and to think along with their "stable behaviours, expressing the characteristics of (technical) elements, rather than simple qualities"; characteristics that "are powers, in the fullest sense of the term, which is to say capacities for producing or undergoing an effect in a determinate manner" (Simondon, 2017, p. 75). In other words, when doing digital methods, technical imagination reflects the researcher's capacity to predict and combine the practical qualities of software in order to solve methodological problems and to respond to research questions.

The awareness component: machine as elements, individuals and ensembles

The first step to understanding the technicity of beings that function comprises the awareness component (of structures, functions and operationalisations), which allows us to see machines in their own right (Simondon, 1980; 2017). This stage starts with the comprehension of the machine as a (technical) element, a (technical) individual and a (technical) ensemble. On the one hand, explains Rieder (2020), this proposal can be interpreted as part of a metaphysics of technology but, on the other, it "can also be used as a conceptual device orienting the analysis of specific technical domains" (p. 65), while acknowledging that a precise distinction between element, individual and ensemble can be difficult to draw when working with digital technologies.
In the context of digital methods, the advantage of distinguishing between element, individual and ensemble is that it offers a first step towards recognising the technical medium in itself and in relation to others, serving also as a departure point to think along with what is already technical, while paying attention to the medium's capacities for producing or undergoing an effect in a determinate manner in the methodological process41. This attitude has the potential to affect the course of action in methodology, intervening directly or indirectly in the ways we design research and think about or study digital records. Before exposing how this tripartite distinction can help the practice of digital methods, we will start with a general understanding of elements, individuals and ensembles, followed by some examples viewed from the perspective of digital methods.

Simondon compares technical elements to organs, functional units that could not subsist on their own, and individuals to living bodies. (Simondon, 2017, p.62) In this analogy, ensembles appear as societies, characterised by more precarious or tumultuous dynamics of relation, stabilisation, and perturbation. Computer networks connecting functionally separate machines are obvious candidates for technical ensembles, but the utility of the concept is broader. Simondon argues that technical and economic value are almost completely separate on the level of the element, while individuals and ensembles connect to broader realities such as social arrangements and practices, the latter pointing to the fact that contemporary technology functions as a series of industries. (Rieder, 2020, p. 69)

Technical individuals, such as software, relate to the integration of 'pure' technical forces and "how an associated milieu is formed and stabilized, how continuous operation is realized" (Rieder, 2020, p. 68). They are taken as a stable and integrated system made of elements. For instance, "the application software 'fits' the processor speed, memory capacities, screen dimensions, sensor arrays, and so forth" (op. cit.). Individuals exist in a state of combination, and so technicity exists within these technical beings (Simondon, 2017). In digital research, a good example of a technical individual – despite needing a computer to run on – is Gephi (Bastian, Heymann, & Jacomy, 2009). Built with Java SE 6 on top of the NetBeans platform, the first version of Gephi was released in July 2008 (0.6.0), and it has been running under the stable release 0.9.2 since September 2017 (Heymann, 2014; Wikipedia, 2015). This eleven-year-old network analysis and visualisation software may represent what Simondon describes as an operational unit closest to the human scale (2017, p.77), one of those individuals that "combine and arrange elements into functioning and integrated wholes, realizing technical schemata that evolve over time" (Rieder, 2020, p.61)42. Accordingly, software like RawGraphs (Mauri, Elli, Caviglia, Uboldi, & Azzi, 2017) and Facepager (Jünger & Keyling, 2019), or even social media, can be seen as technical individuals. In these cases, self-regulation is required, a "structural coherence capable of assuring their function and stability over time" (op. cit. p.65). Technical elements are functional units that cannot operate on their own but are carriers of meaning with the potential to be repurposed.

41 In italics, I am borrowing and adapting Simondon's words to the context of digital methods.
42 As mentioned before, Simondon's reflections on technical objects were not related to software, web-based APIs or digital tools for research. Therefore, the distinctions between individual, elements and ensemble need to be adapted to the context of digital research. In fact, one could say that Gephi is an ensemble, rather than an individual, precisely because it is written in Java and stands on NetBeans. However, as we will see, both technical ensembles and technical individuals exist in a state of combination (made of elements, e.g. Java). Gephi is illustrated as an individual here because it is stable software (Heymann, 2014). When Simondon refers to a technical ensemble, he explains that there is functional separation in it (meaning the gathered technical objects do not necessarily have to operate within one another). An ensemble can also be temporary. A technical individual is thus, "first and foremost, characterized by strong and stable technical or causal relations between the constitutive elements that establish and maintain its functioning", as Rieder (2020, p.58) explains.

The valve spring and the modulation transformer are examples of elements in Simondon, who explains that these units enhance the quality of individuals, as in the case of valve springs in automobile engines. The springs serve "to move an intake, or an exhaust, valve according to the head-discharge curve of a cam so that the valve is in contact with its seat to prevent compression leakage. In the meantime, a valve spring is required to impose appropriate tension on the valve so as not to increase the friction loss of the valve operating system" (Yoshihara, 2011). Over the years, valve springs became lighter and smaller (e.g. by considering non-metallic inclusions in steel), an attribute that responded to environmental regulations for automobiles; consequently, the technical evolution of these elements has reduced fuel consumption and carbon dioxide emissions (op. cit.). This example helps us to comprehend Simondon (2017) when he says technical elements are the true engines of progress, while the ensembles or individuals are more connected to transformations and changes, precisely because the element "expresses and preserves what has been acquired via a technical ensemble so as to be transported into a new period" (p.73); only they have the capability "to transmit technicity from one age to another" (p.76). They "do not dissolve when forming individuals", but they are capable of constituting their own trajectories (see Rieder, 2020, p.68). In other words, elements transport a concretised technical reality, while the individual and the ensemble contain "this technical reality without being able to transport and transmit it; elements have a transductive property that makes them the true bearers of technicity, just as seeds transport the properties of a species and go on to make new individuals" (Simondon, 2017, pp.73-74). That is a way of explaining that technicity exists in its purest form within technical elements (Simondon, 2017, p.74), "because they are not yet combined into systems that, so to speak, put certain demands on them" (Rieder, 2020, p.65). In the context of digital methods, some candidates for technical elements are algorithmic techniques (e.g. information ordering), machine learning models for studying natively digital images (e.g. Google Vision API's web detection module) or even graph layout algorithms.
These elements, in research design and analyses, should become units of knowledge when apprehended by technical imagination (see Rieder, 2020, p. 103). This will be discussed in the next section. A second illustration, attempting to transpose the notion of elements to the practice of digital methods, is the case of graph layout algorithms; indispensable/crucial for digital network analysis. Taking as an example is one of the constitutive elements in Gephi: ForceAtlas2 (Jacomy, Venturini, Heymann, & Bastian, 2014). The first version of this 65 layout algorithm, simply known as Force Atlas, arrived together with Gephi in 2008, but improvements were made, and in June 2011 the stable version was released (Jacomy, 2011; Jacomy et al., 2014). This force-directed layout serves small to medium-size graphs and it responds to attraction force vs. repulsion by degree and these forces create a movement that converges to a balanced state (in positioning the nodes within the network). The final configuration is expected to help the interpretation of the data. Although being a default layout in Gephi, ForceAtlas243 can be installed in R (Analyx, 2015) and implemented for Python 2 and 3 (Chippada, 2017). Digging deeper into ForceAtlas2 and based on how it works, considering also a series of experimental studies using social media and search engines records, this forcedirected layout may provide a narrative thread that has fixed layers of interpretation44 such as centre, mid-zone, periphery, and isolated elements and multiple forms of reading (see Omena & Amaral, 2019; Omena et al., 2020). The operational qualities of this graph layout are in between something to be created (findings, reflections, revelations, stories, narratives, controversies) and something that already exists (data entered in the software, e.g. Gephi), grammatised actions available in digital platform. By giving a shape for situated connections and arranging those in different zones within the network, ForceAtlas2 has the capacity to produce a very particular effect in the knowledge to be acquired and shared about a given context (see a practical example of this in the next section). 43 Recently, and unsatisfied with the results afforded by this force-directed algorithm, Mathieu Jacomy (co-founder of Gephi) has developed a quantitative metric named connected-closeness to read digital networks: “a valid metric for interpreting distances in a network map” (Mathieu Jacomy, 2020b, 2020a). 44 Since 2018 this proposal has been tried-and-tested in and out the context of data sprints; for instance, it was used to read networks of recommendation, paying attention to the recommendation of similar apps in Google Play Store (e.g. https://wiki.digitalmethods.net/Dmi/SummerSchool2018AppStoresBiasObjectionableQueries, https://smart.inovamedialab.org/past-editions/smart-2019/project-reports/journalism-apps/) or to the related videos suggested by YouTube algorithms (e.g. https://wiki.digitalmethods.net/Dmi/SummerSchool2018MappingWarAtrocities). The fixed layers of interpretation was also tested in networks of hashtags, networks of following accounts and networks built on top of computer vision APIs and concerning image circulation (e.g. 
https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/, https://smart.inovamedialab.org/past-editions/smart-2019/project-reports/interrogating-vision-apis/, https://wiki.digitalmethods.net/Dmi/SummerSchool2020GoodEnoughPublics, https://smart.inovamedialab.org/2021-platformisation/project-reports/investigating-cross-platform/). 66 Technical ensembles, less specific than elements but existing in a state of combination like individuals, refer to broader realities; they “constitute systems that are not characterized by the creation of an associated milieu, but rather by a form of coupling that merely connects outputs to inputs, while remaining separate in terms of actual functioning” (Rieder, 2020, p.68). In this way, an ensemble goes beyond itself by enabling “the construction of new technical beings in the form of individuals” (Simondon, 2017, p.72). However, these newly created beings can be temporary or even occasional (Simondon, 2017). In addition, there is functional separation in technical ensembles meaning the gathering of technical objects do not necessarily have to operate within one another but in different environments (space) or systems. Rieder (2020) points out that Simondon sees a systems perspective based in a theory of information as key to understand what technical ensembles are, while setting another example in which “computer networks connecting functionally separate machines are obvious candidates for technical ensembles, but the utility of the concept is broader” (op.cit. p.69). Here, the concept of technical ensemble is used as a form of representation of the full range of digital methods, in which the distinction between individuals, elements and ensembles needs to be adapted to digital research. To conclude this topic, and following Simondon, I want to emphasise a shared characteristic between elements and ensembles: they are characterised by a degree of technical perfection. This refers to a practical quality “or at the very least material and structural basis of certain practical qualities; in this way a good tool is not simply one that is well put together and well crafted” (Simondon, 2017, p.72). In order words, no matter whether the object is free of or full of flaws and limitations, what counts is its capacity to accomplish a purpose. In the context of digital methods practice, the media involved, its methods, forms and structures are never optimal, but they certainly contain practical qualities that are capable of fulfilling a research purpose. The distinction between individual, elements and ensemble can help the practice of digital methods in different ways. First, and when comprehending technical mediums at different levels (also knowing what they offer and how they operate), researchers become capable of knowing what type of technical objects are required, when to use these and why. This perspective is relevant in the research design with digital methods because it compels researchers to understand the potentials and practical qualities of 67 digital technologies, in the same way that they take seriously the context and content of the object of study to accomplish research purposes. Second, and throughout the practice of digital methods, we come to understand the impressions, effects and re-adjustments left by a range of software on digital records (see chapter 3 and figure 3.13); in particular, how crucial the role of technical elements is in the analysis. Here, the content of grammatised actions (e.g. 
hashtag-based data set) is paired up with the content of technical individuals (e.g. extraction and visualisation software) and elements (e.g. Google Vision automated image annotation in a collection of images associated to a particular list of hashtags). This understanding helps researchers not only to think along with and repurpose (if necessary) computational mediums, but it also enables them to discover new arrangements, using a technical imagination to create methodological solutions. Third, the concept of a technical ensemble helps researchers to envision and design the full implementation of digital methods. A situation where the practical qualities and potentials of software, web-based APIs and digital tools (technical individuals and elements) are combined and coordinated to work together, united to achieve a research purpose. Here, researchers must define what range of software (technical individuals) should form a chain methodology to solve research problems, valuing particular elements as carriers of meaning. When adapted to digital research, the tripartite distinction can make a difference in the practice of digital methods, as this thesis expects to show (though not always using the terms individuals, elements and ensembles). The human function: to be acquainted with machines […] yet, in order for the human function make sense, it is necessary for every man employed with a technical task to surround the machine both from above and from below, to have an understand of it in some way, and to look after its elements as well as its integration into functional ensemble. For it is a mistake to establish a hierarchical distinction between the care given to elements and the care given to ensembles. Technicity is not a reality that can be hierarchized; it exists as a whole inside its elements and propagates transductively throughout the technical individual and ensembles: through the individual, ensembles are made of elements, and from the elements issue forth (Simondon, 2017, p.80). 68 The second step to understand the technicity of beings that function entangles the relations between man and machines, particularly at the levels of technical individuals and ensembles. Simondon proposes two situations to illustrate this relationship: in the first, machines are the helpers while we are the bearers of technicity (artisanal coupling); the second concerns industrial coupling45 in which machines are no longer simple servants, but gain a status that requires regulation, supervision and organisation. In the artisanal work, for instance bow drilling or the skill of shoeing a horse, man (as technical individual) “ensures an internal distribution and self-regulation of the task through his body” by mastering the tools (Simondon, 2017, p. 77); thus, assuming the function of technical individuation. In this case, affirms Simondon (2017), man is the bearer of technicity, while the tools are the helpers. Conversely, as we know, this situation changes when looking at technical ensembles of industrial and digital societies. To Simondon (2017), this is the moment when the essential foundation of technical individualisation shifts: it is, from then on, within the machines, rather than in humans. “Man directs and adjusts or regulates the machine, the tool bearer; he realises grouping of machines, but does not himself bear tools” (Simondon, 2017, p. 78). 
By disengaging from a more artisanal function, man can now become either “organiser of the ensemble of technical individuals, or helper of technical individuals”, playing a sort of auxiliary role by providing the machine with elements. That is what Simondon places as two types of technical individuality: in the first one, the role of man is above the machine (regulator) and below it (organiser) in the other. However, it “does not mean the man cannot be a technical individual in any shape or form and work in conjunction with the machine” (p.78), as explains Simondon: This man-machine relation is realised when man applies his action to the natural world through the machine; the machine is then a vehicle for action and information, in relation with three terms: man, machine, and world, the machine being which is between man and world. In this case, man preserves some traits of technicity defined in particular by the necessity of apprenticeship. The machine thus essentially serves the purpose of a relay, an amplifier of movements, but it is still man who preserves within himself the centre of this complex technical individual that is the reality constituted by man and machine. (Simondon, 2017, pp.78-79) 45 In software engineering coupling “is the degree of interdependence between software modules; a measure of how closely connected two routines or modules are” (Wikipedia, 2020). 69 The reflections of these sorts of relationships (man-machine) can fruitfully be adapted to digital methods, here taken as research practices made of technical creators, both software-makers and software-users (see chapter 1). Transporting Simondon’s perspective (2017, pp.78-79) to the context of digital research, we could say that the researcher becomes the bearer when they decide to work in conjunction with the medium and its methods (software, web-based application, etc.). Technical tasks and practices are synonymous with this activity. When it comes to machine-machine relations, man can only intervene as a living being; always as a servant or organiser of machines. Think of core activities in digital methods such as monitoring software functioning. However, even claiming that “the real regulating power of culture” pertains to machines, Simondon also makes evident that machines are only perfect in the presence of man who is the “permanent organizer and a living interpreter of the interrelationships of machines” (Simondon, 1980, p. 4). Once again, the role of man towards technical objects is indispensable. In the environment of digital research, the analogy of the conductor can shed light on the role of the researcher as the one who directs technical ensembles, as well as delineates close relations with the elements and their integration into the ensemble. The conductor can direct his musicians only because, like them, and with a similar intensity, he can interpret the piece of music performed; he determines the tempo of their performance, but as he does so his interpretative decisions are affected by the actual performance of the musicians; in fact, it is through him that the members of the orchestra affect each other’s interpretation; for each of them he is the real, inspiring form of the group’s existence as group; he is the central focus of interpretation of all of them in relation to each other (Simondon, 1980, p.4). 
In the full range of digital methods, the researcher should take the same position of the conductor; who is familiar with the whole and the smaller parts of what makes an orchestra, the general and the tiny details produced by each musical instrument; only in this way is she able to conduct her musicians. The conductor, although in the position of the one who directs the musicians, is nevertheless at the same level of these latter; because her decisions cannot be taken individually or isolated but depend on 70 and are affected by the performance of the musicians while they play. On one hand, if she weren´t there, there would be no “group’s existence as a group”, but on the other hand, she could never interpret any piece of music without her instruments. Together they bring musical notes to life; both delivering the performance and providing a unique experience. Likewise, in digital methods, the researcher fulfils her function before and through a technical ensemble composed by the medium that she investigates and the tools that she uses to investigate it. How then to become acquainted with the medium and its functional meanings within an ensemble? It is necessary, according to Simondon, “to every man employed with a technical task to surround the machine both from above and from below, to have an understanding of it in some way, and to look after its elements as well as its integration into functional ensemble” (Simondon, 2017, p.78). This also means that beings that function signify not only through what they do or through how they operate (Rieder, 2020), but through the comprehension of their relational mode of functioning. For instance, drawing on Simondon’s “mechanological perspective” on software, Rieder (2020) introduces a mode of thinking and capturing the ‘interior life’ and ‘sociability’ of information retrieval “in terms that are not bound to an exterior finality or productivity” (p.16). Consequently, and from a series of practices in softwaremaking, his book contributes to a way of studying how these engines of order somehow “adjudicate digital life”. Yet, how to become acquainted with the medium and its functional meanings when one is neither a developer nor has coding skills? Such technical task, I contend, is not exclusive of developers, coders, those who make software or hack systems. Non-developer researchers can also interrogate the fundamental nature of the medium from a technical perspective. Building machines is not the only ways of being around them. Using them is another. Although barely talking about the use of technical objects, Simondon states that we may treat these objects “as meaningful only in relation to a use and utility” (Simondon, 2017, p.16 in Rieder, 2020, p. 57). He furthermore ensures that we are engineers of transformation, as both inventors and users of technical objects; so, the desire for change resides in us, rather than in the object itself (Simondon, 2017, p.71). In the context of digital methods, as a user, to be acquainted to the computational medium would mean speaking its language conceptually and technically while using 71 the medium empirically without losing sight of its role in parts and the whole of digital methods (see table 2.1). This also requires questioning the medium itself and its relation to other mediums. 
By doing this and counting on the researcher's empirical experience and technical imagination, the computational mediums become active actors in the research process, with substantial content for enquiry and for answering research questions – not only tools. In order to become familiar with the technical medium, new media scholars need neither the same knowledge as a computer scientist nor an ability to invent algorithms; but they must "understand technologies well enough to connect them to culture" and have the "willingness to use new and challenging methods of thinking and investigation" (Bogost & Montfort, 2009, p. 5). More important than creating computer systems or becoming an expert coder is "knowing the best questions to ask about existing ones and how to go about answering them" (idem). This illustrates not only the spirit of platform studies, which proposes to think about the software environment as a platform, in the belief that "technical understanding can lead to new sorts of insights" (Bogost & Montfort, 2009, p. 1), but also concerns an understanding of the technicity-of-the-mediums in the context of digital methods. That is, an understanding of the medium in its own right and in relation with others (see table 2.1). Then, and by knowing it well, we start reasoning with and about the mediums, using this knowledge as a means for enquiry and for answering research questions. In this sense, the knowledge gained from practical experience is used as a point of departure for designing research questions, a value needed in digital research and methods.

[To understand the] medium: conceptually, technically and empirically

Being acquainted with the medium in itself
Conceptually: What is this medium? What does it bear? How does it work and for what purpose? What is permanent or subject to updates or replacement? Does it connect to other fields of study? Which and how?
Technically: To understand the medium's purpose, capabilities, potentialities and limitations in technical terms and what they mean for research, considering its technical infrastructure.
Empirically: What is required in order to use it? When using it, what problems and solutions are implicit to the medium? What technical skills may be required in the context of collaborative research (data sprints)?

Being acquainted with the medium in relation to other mediums
Conceptually: What other mediums is this medium related to or does it communicate with? Is what the medium carries modified? How? For what purpose and with which effect? In the research context, what is the technical element that matters the most? Why?
Technically: To understand the medium's relational aspects (with other mediums) and their respective purposes and effects in technical terms. To understand the technical element that matters.
Empirically: To understand, see, explore and experiment with its relational aspects and their respective purposes and effects. To identify, explore and experiment with the particular element(s) that matter for research.

Using technical imagination as a means for enquiry
Conceptually: What questions can(not) be asked? Which medium potentialities or methods can be used? Why? What digital records can(not) serve this purpose?
Technically: To know about the medium's methods and elements that are appropriate or have the potential to respond to research questions or to be repurposed.
Empirically: To dedicate some time to going through each step of the digital methods approach - testing, exploring and experimenting with its possibilities, analytical affordances and limitations.

Using technical imagination as a means for answering research questions
Conceptually: How can questions (not) be answered? Which ensemble of mediums can(not) serve this purpose? What medium element can(not) make the difference?
Technically: To know how to orchestrate different mediums in order to answer a particular research question.
Empirically: To perform the full range of digital methods, either following projects' recipes or experimenting and testing new arrangements.

Table 2.1. How to be acquainted with the medium in the context of digital methods?

From orders of thought to activity and back

The last step to understand technicity addresses its material reality, which is "thus mirrored by a 'mental and practical universe' that humans come to know and draw on to create technical objects" (Simondon, 2017, p.252 in Rieder, 2020, p.76). The challenge here is to advance the notion of technical mentality46 as a fundamental principle with crucial practical importance. Digital methods cut across technical skills and practices, unfolding new ways of thinking along with the medium as well as creative styles of enquiry and interpretative strategies, which are founding principles of these methods (Marres, 2017; Omena, 2019; Rogers, 2019). In this context, I want to argue that the development of a sensitivity to the technicity of the mediums requires not only practical or technical efforts but also mental schemas, the capacity to develop a particular mode of reasoning that echoes the technical potentialities in the research process, allowing for new arrangements. In line with Simondon's work (2009, 2017), what follows is an attempt to describe technicity as a phase, also providing a brief reflection on technical thought.

When Simondon refers to technicity as a phase he is making a strong statement, affirming that technicity is one of the fundamental elements that make us what we are (the other being religion). He sees technicity as a phase fundamental to "the mode of existence of the whole constituted by man and the world" (2017, p. 73); here the whole (l'ensemble) is taken as a system which is mediated by technical objects and formed by man and the world. While philosophy has historically considered technology either as autonomous or as defined by its use, Simondon sees technology as an expression of life (p.65; see also Rieder, 2020), accounting for the reality of the technical object itself. In this sense, the philosopher of technology guides the reader into a chapter that aims at grasping technicity as a mediation between the theory of knowledge and the theory of action. While I will not delve into Simondon's metaphysics, it is important to describe this perspective, which served as inspiration for this dissertation.

46 "Technical mentality offers a unique mode of knowledge that essentially uses the analogical transfer and the paradigm, and grounds itself on the discovery of common modes of functioning - or of regimes of operation - in otherwise different orders" (Simondon, 2009, p. 17). Although still evolving, and thereby incomplete, a technical mentality proposes schemas of intelligibility that would be particularly adequate to grasp regimes of technical operations, implying also their functional modes.
74 By defining technicity as a phase, Simondon understand it relationally: one cannot conceive of a phase except in relation to another or to several other phases; in a system of phases there is a relation of equilibrium and of reciprocal tensions; it is the actual system of all phases taken together that is the complete reality, not each phase in itself; a phase is only a phase in relation to others. (Simondon, 2017, p. 174) As phase, technicity simultaneously precedes and takes place with and in the technical objects: it precedes by being related with figural structures, positioning itself as something prior to any split of subjectivity and objectivity; by also relating to a trajectory of awareness (Simondon, 2017). For instance, the beginning of an action, what Simondon exemplifies as "the desire for conquest" or "a sense of competition"; things, places and moments (key points) that go from the beginning of an action to its own realisation; or as he explains as “the birth of a network of privileged points of exchange between the being and the milieu” (p.182). What precedes the technical objects, thus, consists in this capacity of connecting key-points, perceiving also that these points “objectivise themselves in the form of concretized tools and instruments” (p.181). In this mental universe, for instance, I would suggest thinking of key-points as technical elements. Positioned itself as mediator between man and the world, these elements are taken as a place of exchange or to be navigated and explored, rather than dominated or possessed. By passing from figural structure(s) to technics, and in line with Simondon, technicity takes place with and in technical objects, when technical objects can be recognised as a reality in themselves. The suggestion of taking elements as key-points has a purpose, that is my attempt to redirect philosophical ideas to what I will further discuss and illustrate in this chapter – namely the role of technical thought and mentality in digital methods. Simondon, however, takes the act of climbing a mountain as an example of this mental state of thought that precedes an action. To climb a slope in order to go toward the summit, is to make one´s way toward the privileged place that commands the entire mountain chain, not in order to dominate or possess it, but in order to exchange a relationship of friendship with it. Man and nature are not strictly 75 speaking enemies before this connection at this key-point, but are simply strangers to each other. (Simondon, 2017, p.179) Only after has it been climbed, the summit becomes a place of exchange; in this process, man can simultaneously be influenced by or act upon it. Here the shift from the idea of climbing a mountain (thought mediation) to the realisation of this activity (technical mediation and human function) is objectivised in technics. On that basis, Simondon reminds us that technics not only have the power to ‘modify a privileged place’ (key-point), but “can also completely create the functionality of privileged points” (p.182). The notion of phase thus implies seeing technicity as something coming-into-being – e.g. from looking at “the operation of a system with potentials in its reality” (p.169) to the actual activity and attitude towards turning potentiality into actuality; which makes us realise that technicity can be anything but motionless or static. Furthermore, it cannot be entirely contained within technical objects, or exhaust itself within them. 
On the contrary, “technicity precedes and goes beyond objects” (op. cit. p.179). In the habitat of digital research, and as permanent organiser/coordinator of technical ensembles, the researcher not only regards highly what precedes and takes places with and in technical objects but, through technics, has the power to modify or create new methodological arrangements. It is here that we see the need for developing a technical mentality along with the process of digital methods. As a phase, technicity becomes at the same time a problem and a solution47. In Simondon’s words, this would be something that surrounds “the deepest reality of technics” which, although constituted by theoretical knowledge, is realised in praxis (p.171). As a result, more than a phase, technicity devolves into a phase-shift; turning itself into something transitory as well as something definitive, explains Simondon. Technicity is then transitory because of its capacity to split itself into theory and praxis, dividing itself in two orders of thought: theoretical and practical. On one hand, representative aspects of (scientific) knowledge that would be what Simondon refers to as grounding realities that bring forth from representative orders of thought. On the other hand, the active aspects of the praxis, what he refers to as figural realities that 47 Turning itself into "a permanent reminder of rupture" of the capacity of resolving problems (solution) and of the position of becoming a problem (Simondon, 2017, p.174). 76 spring forth from active orders of thought48. Technicity is definitive because it refers to a particular domain of knowledge49 that concerns the functioning of the technical object; here is a particular consideration for the elements. That is where we see the key for (also beauty of) developing a sensitivity to the technicity of the mediums which, as I argue, can be constituent of digital methods practice; something that also pushes us towards orders of technical-practical thoughts, something that “cannot be completely systematised” because these thoughts “lead to a plurality of different values” (Simondon, 2017, pp.216-217). The value of technical elements and technical imagination To exemplify “the concern for the element”, which is something enabled by technics, Simondon (2017) invokes how Descartes explains the functioning of the heart: […] decomposing a complete cycle into simple successive operations and showing that the functioning of the whole is the result of the play of elements necessitated by their particular disposition (for example that of each valve). Descartes doesn´t ask himself why the heart is made in this way, with the valves and cavities, but how it functions given that this is how it is made. The application of schemas drawn from technics does not account for the existence of the totality, taken in its unity, but only for the point by point and instant by instant functioning of this totality. (Simondon, 2017, pp.187-188) The importance of the elements is not a matter of knowing what the heart is made of, but how it functions, having the ability to understand its overall functioning (fonctionnement d’ensemble) as a series of elementary processes and mediations (Simondon, 2017). In this logic, “technicity introduces the search for a how through the decomposition of an overall phenomenon into elementary operations” (p.188). This also points to the possession of a content (at the level of element) which is primarily technical. 
The particular concern for the elements mirrors a certain schematism of 48 In this sense, it is important to understand better the ground (fond) and the figure. The former corresponds to “the functions of totality that are independent of each application of technical gestures”, while the latter “the figure, made of definitive and particular schemas, specifies each technique as a manner of acting”. 49 For instance, Rieder (2020) presents technicity as a notion related to the domain of making software “where what programs do and how they do it is specified or, better, designed”. Here, technicity is a notion related to the domain of using software in the practices of digital methods. 77 mental structures50 that resides in technical thought, which is taken as “the paradigm of all inductive thinking, whether in the theoretical order, or in the practical order” (Simondon, 2017, p. 188). According to Simondon (2017, p.214), inductive thinking can be defined by its content, “the form of theoretical thought that arises from out of the fragmentation of technics”, and for method “the thought that goes from particular elements and experiences to the whole of the collection and to a general affirmation, seizing the validity of the general enunciation by way of the accumulation of the validity of particular experiences”. Coming from technics, this way of thinking remains relational and pluralistic, because it is empirical in its origin (see Simondon, 2017, p.217). The example of the functioning of the heart helps us to make sense of technicity as something transitory (orders of thought) and definitive (domain of knowledge), but also as schemas of technical thought. These, however, cannot be easily transmitted or explained, as we have learnt from Simondon (2009; 2017), schemas of thought are poorly understood from the order of expression, but they further presuppose a node of expressive communication, modalities of attitude towards what is either theoretical and practical; in and about technical objects. That is the reason, on one hand, why Simondon (2009; 2017) overthinks technical activity51 to explain practical and technical thoughts; and, on the other hand, he affirms that the application of such modes of intelligibility requires the development of a technical mentality which “can be developed into schemas of action and into values” turning itself into a thoughtnetwork. As mentioned before, the reality of technical objects in Simondon echoes industrial objects, but despite that his reflections have a powerful potential for digital research, not excluding the possibility of a practical application. On the contrary, the work process introduced by digital methods can be precisely the way for grounding (and 50 For the apprehension of technical individuals and ensembles. 51 The technical activity always faces two opposite but complementary realities that speak of technical thought (Simondon, 2017). For Simondon (2017), the technical activity and its limits are exposed as such when it fails: “Through its failure, technical thought discovers that the world cannot be entirely incorporated into technics” (p.215). But he also presents a complementary perspective, as it leads to the discovery of new possibilities: actions that fail expose also counter-structures attached to technical operations demanding human intervention through technical gestures. 78 taking advantage of) such mental modes into action. 
Beyond getting acquainted with the medium, cognitive schemas of thought (technical and practical) are precisely important because they reflect upon technical invention and the creation of technical objects but also methodological innovation. Grounded by and in technical thought, Simondon explains that invention is “the taking charge of the system of actuality through the system of virtualities, the creation of a unique system in the basis of these two systems” (2017, p.61). That is to say, to paraphrase him, to be in an intermediary position dominating what is between the abstract and the concrete, taking charge of medium actual activity through its technical potentials for the creation of one thing. In Simondon, invention is thus the creation of a technical individual that requires from the inventor an intuitive knowledge of technicity, particularly of the element. That is what Simondon (2017) describes as “the level of schemas” which “presupposes the pre-existence and coherence of representations that covers the object’s technicity with symbols belonging to an imaginative systematic and an imaginative dynamic” (p.74). Imagination here is the capacity of prediction of the practical qualities of technical objects. In this sense, technical imagination thus can be defined as “a particular sensitivity to the technicity of elements; it is this sensitivity to technicity that enables the discovery of possible assemblages; the inventor does not proceed ex nihilo (from scratch), starting from matter that he gives to, but from elements that are already technical” (Simondon, 2017, p.74) I argue that such state of mind and sensitivity combined with a technical practice, certainly inherent to the full range of digital methods, have the potential to fill a gap in digital research: not only conceptually but in and through methods (practice). A thought-network that echoes in modes of enquiring (research questions), gathering different computational mediums and their respective methods (methodology) for a final purpose (research goals). In this case, technicity mirrors a practical reality driven by a technical imagination. The points discussed above serve to provide a conceptual basis for my main argument that the practice of digital methods is enhanced when researchers make room for, grow and establish a sensitivity to the technicity-of-the-mediums. 79 Third attempt to understand technicity (with digital methods) In attempting to illustrate more clearly the mental and practical modes of what I am calling the technicity-of-the-mediums in digital methods, this section provides a description of the process of building/interpreting computer vision-based networks. It is an exercise in repurposing pre-trained machine learning models to study bot agency in social media through networks. The research question here concerns how the visual content shared by botted accounts travels across domains52. In order to answer to this question, I take the example of a digital methods protocol to expose what it takes to interpret computer vision networks - taken as an ensemble of machines, data, methods and research practices. This register concerns networks created through Instagram and Tumblr images53 and reconstructed through Google Vision API - a model that searches and detects the web pages in which full or similar matching images have been shared across the web. 
The context and content of the network

The context of this network is ongoing research on bot activity in social media platforms, which started in late 2017 through close observation and exploratory data analysis of botted accounts on Instagram, or instabots. The first step was to identify botted accounts. To do so, I carried out three investigative activities: i) mapping, describing and using applications that boost engagement on Instagram; ii) understanding the black market54 for automated engagement through botted accounts, thus cross-analysing the functioning of applications with the modes of bot engagement through these apps and on the platform; and finally, iii) the exercise of tracing botted accounts by purchasing engagement (likes and comments) and followers (see Omena, 2017).

52 My "bot" journey started with the study of hashtag engagement on Instagram. I was exploring the workers' and the conservative protests in Brazil, March 2017. Through visual exploration of the network, cross-analysed with a list of the most active unique users of hashtags, not only were actors taking a position on the polarised protests detected, but political far-right botted accounts also popped up (see blog post here: https://thesocialplatforms.wordpress.com/2017/12/21/insta-bots-andthe-black-market-of-social-media-engagement/). In March 2017, pro-Bolsonaro bot accounts were using the main hashtags of the protests to get visibility in the debate, but also using a list of specific hashtags that would point to Jair Messias Bolsonaro as the future president of Brazil (see the co-tag network here: https://thesocialplatforms.files.wordpress.com/2017/04/insta-tarde_15m-2017foratemer-grevegeral-diretasjacc81-modularity-zoom-ok-final-version1.jpg?w=1024). As a result, I started questioning the role of these automated beings in Instagram engagement and their possible impact on the Brazilian General Elections in 2018. Later, mainstream media and academic work proved how much the strategic political campaign of the presidential candidate had invested in automation.
53 In these types of networks, images can be captured by different data collection techniques and entry points, such as keywords, hashtags, websites, or a list of social media username accounts.
54 Black market here is understood in the sense of subverting the official terms of use and data policies of social media platforms; hidden automated practices that drive fake followers and engagement (likes and comments) to accomplish a particular purpose in a non-authentic manner, for instance the spread of controversial political ideas or misinformation, or the making and maintaining of popular profiles.

After this and other exploratory studies, username patterns proved to be valuable as a starting point for identifying social bots on Instagram. This confirmation came from analyses that looked closely at username patterns and their relationship with accounts' visual content (whether they contain posts), profile info (whether there is a profile image, whether the account is private or public, whether there is a discrepancy between the number of followers and following) and the pace of commenting. The username characteristics were thus identified through two bot detection techniques: forcing bots into action and exploratory data analysis (see Figure 2.1). The former reveals the visible bots - those immediately detected after using hashtag lists or after the purchase of engagement metrics and followers. These bots' usernames would either be weird (e.g. swph965, __beta__1, awesome.vs.amazing) or try to mimic real accounts (e.g. sabrinaejr, williamc_clarke, b.ianca._), and are thus hard to differentiate from the accounts of human people in a dataset55. In the latter technique, usernames were often constituted by a sequence of numbers accompanied by letters or underscores (e.g. 6151, 98.00715, ______160.0cm, 6.13kkk, _0318_m), among other atypical combinations. Clusters of these botted profiles could only be identified when working with spreadsheets (by filtering usernames from A to Z or Z to A) or in the processes of exploratory data analysis/visualisation and visual network analysis56. To detect this type of botted account required some experience in the practices of digital methods as well as previous knowledge of bots' culture of use on Instagram. Considering that they are not as easily detected as when purchasing engagement, I am referring to these accounts as hidden bots.

55 Unless one opts for filtering the most active unique users in the spreadsheet, a technique that works well for politics-related content, at least in the Brazilian context.
56 Watch other bot detection techniques here: https://www.youtube.com/playlist?list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM

Figure 2.1. Two ways of detecting social bots.

The agency of instabots not only influences and shapes public and political debates (Murthy et al., 2016; Woolley, 2016; Woolley & Howard, 2016) but also plays a role in the process of interpreting data. For instance, when working with networks of hashtag-based content, instabots probably influence the shape of the network, indirectly affecting its interpretation. Although bots on Instagram are usually associated with celebrities, photographers, marketing professionals and influencers, through username patterns (combined with profile images) and exploratory data analysis57, botting activity was efficiently detected in the context of health studies, politics and demonstrations. For instance, cross-platform studies concerning Zika Virus and Dengue hashtag-related content (Rabello et al., 2018); polarised political protests in Brazil, for example, in 2016 the demonstrations for and against the impeachment of Dilma Rousseff and in 201758 the stay/get out Michel Temer protests (see Omena et al. 2017; Omena, Rabello & Mintz, 2020); and the exploration of hashtag networks in the context of the Brazilian presidential elections in 2018, or the investigation of the following network of Bolsonaro's non-official campaign accounts and their profile images in 201959. This research agenda was later expanded to Tumblr automated agency, particularly porn bots. In 2019, together with two research collaborators, namely Jason Chao and Elena Pilipets, we combined our different backgrounds and expertise (coding, digital methods and media theory) to enhance the previous work, proposing an innovative and replicable conceptual-theoretical-practical framework to address bot engagement on Instagram and Tumblr (see Omena, Chao, Pilipets et al., 2019)60.

57 See some examples of using usernames and profile images as indicators to detect instabots through Google Spreadsheets: https://www.youtube.com/playlist?list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM
58 See the slide presentation available at: https://www.slideshare.net/jannajoceli/why-look-at-socialmedia-apis-81702316
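To give a more concrete sense of the username-pattern heuristic described above, the minimal sketch below (in Python, using pandas) filters a spreadsheet of account names with a simple regular expression. It is an illustration rather than the procedure actually used in this research: the file name, the column name, the regular expression and the 0.6 threshold are assumptions introduced only for the example.

```python
import re
import pandas as pd

# Hypothetical input: a CSV with one row per account, including a "username"
# column (file and column names are assumptions for illustration).
accounts = pd.read_csv("instagram_accounts.csv")

# Heuristic inspired by the patterns described above: usernames made mostly of
# digits, dots and underscores (e.g. "6151", "______160.0cm", "_0318_m").
mostly_numeric = re.compile(r"^[\d_.]+[a-z]{0,3}$")

def looks_like_hidden_bot(username: str) -> bool:
    """Flag usernames dominated by digits, dots and underscores."""
    name = str(username).lower()
    digits_and_marks = sum(ch.isdigit() or ch in "._" for ch in name)
    return bool(mostly_numeric.match(name)) or digits_and_marks / max(len(name), 1) > 0.6

accounts["suspected_hidden_bot"] = accounts["username"].apply(looks_like_hidden_bot)

# Sorting usernames (the A-Z / Z-A filtering mentioned above) makes clusters of
# similar patterns visible for manual inspection.
print(accounts.sort_values("username")[["username", "suspected_hidden_bot"]].head(20))
```

A filter of this kind only surfaces candidates for closer inspection; as described above, the actual detection still relied on cross-reading usernames with profile images, posting behaviour and visual network analysis.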
Regarding the content of the network, there are three key constituent elements: the visual content posted by what is identified here as hidden bots, the logic of spatialisation of the graph layout algorithm (ForceAtlas2) and the computer vision61 potentialities of Google Cloud's services – namely searching the web for full matching images. This feature thus provides a different type of network from those based on the image classification capacity of vision APIs, the so-called computer vision image-label networks, which have been gaining space in digital research over the years by allowing the interpretation of a network of images and their descriptive labels (Colombo, 2018; Geboers & Van De Wiele, 2020; Mintz et al., 2019; Omena, Rabello & Mintz, 2017; Omena & Granado, 2020; Ricci, Colombo, Meunier, & Brilli, 2017; Silva et al., 2020). I am, however, addressing the unknown and unexplored computer vision image-domain networks, which allow the interpretation of a network of images and their respective sites of circulation (through the detection of the web pages in which fully matched images appear). This type of network offers particular visual interpretations of the images that stick within or flow out of digital platforms. When using Google Vision to search for full matching Instagram and Tumblr bot images across the web, one can identify whether the images (extracted from these platforms) were found only in the Instagram (or Tumblr) environment (sticking within the platform) or have flown out of it, reaching other web environments such as blogs, news media or social media platforms. In addition, researchers can identify clusters of images showing up at specific domains, as well as clusters of different domains sharing the same image. That is a multi-perspective study which, by default, offers general and detailed perspectives on the visual content shared by both dominant link domains (e.g. social media) and clusters of link domains (e.g. mainstream and local media, non-profit organisations), and on the modes of circulation of images across platforms; moreover, it clearly exposes platform specificities and cultures of use (see Omena et al. 2019, Omena et al. 2020). It is crucial, however, to know in advance where the entry list of images comes from (e.g. hashtagged content, the newsfeed of specific social media accounts, search engine results) and how Google's vision module suggests a list of pages or URLs in which images are found (e.g. based on Google Image Search ranking systems and its Knowledge Graph).

59 See blog posts available at https://thesocialplatforms.wordpress.com/2018/10/22/elenao-vs-elesim/ and https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/
60 This collaboration has grown, resulting in a working group funded by the Center for Advanced Internet Studies (CAIS), Bochum, Germany (https://www.cais.nrw/arbeitsgemeinschaften/criticalframework-for-investigating-bot-engagement/).
61 Object recognition, identification and detection reflect one of the most thriving fields of computing and artificial intelligence (Porikli, Shan, Snoek, Sukthankar, & Wang, 2018), for example serving national governments with face recognition as a way to identify terrorists, or big tech companies with automated content moderation (e.g. adult, violent, offensive or unwanted content). The affordances of computer vision have also been repurposed for social and medium research.
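As an illustration of how the full matching images feature can be requested, the minimal sketch below queries web detection for a single image URL with the google-cloud-vision Python client. This is not the code used in the project; it assumes that Google Cloud credentials are configured, and the image URL is a placeholder.

```python
from google.cloud import vision

# Assumes Google Cloud credentials are configured (GOOGLE_APPLICATION_CREDENTIALS).
client = vision.ImageAnnotatorClient()

def web_detection_for(image_url: str) -> dict:
    """Ask the Vision API's web detection feature where an image appears on the web."""
    image = vision.Image(source=vision.ImageSource(image_uri=image_url))
    response = client.web_detection(image=image)
    web = response.web_detection

    return {
        "image": image_url,
        # URLs of images that fully match the query image.
        "full_matching_images": [img.url for img in web.full_matching_images],
        # Web pages on which matching images were found; these page URLs are
        # the raw material for the image-domain network described above.
        "pages_with_matching_images": [page.url for page in web.pages_with_matching_images],
    }

# Placeholder URL for illustration only.
result = web_detection_for("https://example.com/bot-post-image.jpg")
print(len(result["pages_with_matching_images"]), "pages found")
```

The list of pages with matching images is what later becomes the link-domain side of the bipartite network.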
Building a computer vision-based network (being acquainted with computational mediums)

The making of this network (in collaboration with Jason Chao and Elena Pilipets) required some time in order to obtain the final sample (see Omena, Chao, Pilipets et al., 2019) (see Figure 2.2). First, we had to define our entry points to scrape data. Since bot detection on Instagram was already established, it was then necessary to discover, test and adapt the forms of detecting hidden bots on Tumblr. After all, these platforms have different cultures of use. So, the observation and monitoring of the notes section was defined as an exclusive strategy for bot detection on Tumblr. While the exploration of co-tag networks served both platforms well, there were some exploratory data analysis techniques that served only Instagram, for instance: using Excel filters to verify account names; visualising the most active unique users (in mentioning hashtags62); or searching for profile image repetition or profiles without images.

62 The former Instagram API Platform allowed third parties to have this type of information when using the no longer available DMI tools, namely Instagram Hashtag Explorer, later renamed Visual Tagnet Explorer (both developed by Bernhard Rieder), which provided a tabular file with information on the users related to the use of a particular hashtag.

Scraping data was the second step, with results close to five hundred thousand media items; the storage and initial analysis of the data were grounded in distributed computing and the use of a virtual machine in the cloud – Google Cloud Storage. To make the project practically viable, we worked with the most recent 30 posts published by each botted account. Eventually, we found no-longer-existing accounts while others had changed their usernames, thus, and as expected, limiting the data collection (see the final sample details in the visual protocol below)63.

63 One may argue that following this very same research procedure, outside a collaborative research environment, becomes an unfeasible practice. However, there are alternative scripts available, many of which may be found on GitHub; basic skills in running scripts (e.g. in Mac's Terminal or Anaconda) and a bit of curiosity could be the solution. In that case, for instance, one may combine Memespector (Rieder, 2017) for using Google's Vision API and Image-network plotter (Mintz, 2018) for plotting the images of the network into SVG files in order to build one's own computer vision network.

Figure 2.2. Research diagram protocol for building a vision API-based network based on Google's Web Page Detection (full matching images) module.

During the next step (figure 2.2), relying on the scraper results of the query for hidden bots, we selected the available list of post-image URLs and then requested the module Web Entities and Pages from the Google Vision API, asking precisely for the feature full matching images. This informed us of where each image has appeared on the web, searching for a maximum of 1000 different URLs per image. Before converting our comma-separated values file (CSV) into a graph file format, such as graphic data files (GDF) or graph exchange XML format (GEXF), some data exploration of the dataset had to be conducted64. Since we had tracked some porn bots, one of the objectives was to verify the existence of porn websites. To do so, only the images that hit 10 different URLs were selected.
From this list, we identified the unique link domains and their frequency of occurrence. The DMI Harvester tool65 was used to identify the unique link domains66 – a total of 4,249 unique hosts. After that, we followed a "search as research" protocol for detecting porn websites: by querying the spreadsheet (image URL column) with the keywords "porn" and "sex", 125 unique porn hosts were identified. When verifying the frequency of occurrence of unique link domains, beyond social media and image repositories, the porn domains popped up with a discrete frequency of occurrences (a total of 809) compared to the total occurrences in the dataset (25,862). This exploratory exercise provided extra features to be included as node attributes in the bipartite network.

• Node type 1: image. Attributes: full_matching_image_count, to size the images within the network, and username, to identify the botted account responsible for uploading one or more images.
• Node type 2: link domain. Created attribute: isPorn, in order to use a different colour to see porn websites.

64 CSV is a multiplatform format often present in digital methods, as are the GDF and GEXF files used for visual network analysis. Geographic data files (GDF) are commonly used for the creation and structuring of road network data (Open Street Map Wiki, 2017), but GDF is also a file format that stands for 'graph dataset format', used by GUESS (Adar & Kim, 2007), an exploratory data analysis software for networks, and by Gephi. GEXF was created by the Gephi community project as a mature language for "describing complex network structures"; by using extensible markup language (XML), GEXF is both extensible and suitable for real specific applications (Gephi Community Project, 2009). Visual network analysis projects such as GraphRecipes and MiniVan work with the GEXF file format.
65 https://wiki.digitalmethods.net/Dmi/ToolHarvester
66 See the list here: https://docs.google.com/spreadsheets/d/1DDmiDVYEy3kb1CHIB8KHB9Fi0WQEMsjDjftmP0lIoko/edit?usp=sharing

The final step was to download the GEXF created with Table2Net67, upload it to Gephi and spatialise the network using ForceAtlas2. The next challenge would be to address the question of reading the network.

Reading digital networks (orders of technical-practical thoughts)

After passing through some layers of technical mediation, what we see when looking at the network in Gephi is a second order of grammatisation. The final visual metaphor neither entirely represents Instagram and Tumblr's grammatisation nor quite exposes the Google Vision API in its full potential. The scraping of grammatised actions (image URLs pertaining to public social media botted accounts), linked to one technical element of the vision API (web detection), was transformed into a new corpus, which passed through another modification, due to software affordances and exploratory data analysis, turning itself into yet another corpus: a graph file that gains a shape through the work of a force-directed algorithm (ForceAtlas2) and life through its interpretation, results and impact about to come. Only then is this computer vision image-domain network ready to be interpreted, to respond to research questions – either those previously asked or the new ones to come – and to provide findings. It is indeed an oligoptic vision of bot activity and respective image circulation, resulting from the digital methods approach.
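Before moving to the question of reading, the construction steps described above (extracting unique link domains, flagging probable porn hosts with the "search as research" keywords, and exporting a bipartite image-domain graph for Gephi) can be approximated in a short script. The sketch below is an illustration under assumed column names (image_url, page_url, username, full_matching_image_count) and an assumed input file; it is not the exact pipeline used in the study, which relied on the DMI Harvester and Table2Net.

```python
# A rough approximation of the network-building steps, under assumed column names.
from urllib.parse import urlparse

import networkx as nx
import pandas as pd

PORN_KEYWORDS = ("porn", "sex")

# One row per (image, page) pair returned by the web detection step.
df = pd.read_csv("full_matching_images.csv")
df["domain"] = df["page_url"].map(lambda url: urlparse(url).netloc.lower())

# Keep only images whose copies were found on at least 10 different URLs,
# mirroring the threshold used above for the porn-website check.
hits = df.groupby("image_url")["page_url"].nunique()
df = df[df["image_url"].isin(hits[hits >= 10].index)]

graph = nx.Graph()
for row in df.itertuples(index=False):
    graph.add_node(row.image_url, node_type="image", username=row.username,
                   full_matching_image_count=int(row.full_matching_image_count))
    graph.add_node(row.domain, node_type="link_domain",
                   isPorn=any(k in row.domain for k in PORN_KEYWORDS))
    graph.add_edge(row.image_url, row.domain)

nx.write_gexf(graph, "image_domain_network.gexf")  # then spatialise with ForceAtlas2 in Gephi
```

Note that, unlike the procedure described above, this sketch applies the keyword check to the extracted domains rather than to the image URL column; it is offered only to make the bipartite structure and its node attributes concrete.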
Within this framework, the most difficult challenge before starting the process of reading the network was precisely the accurate comprehension of the potential narrative thread afforded by the work of ForceAtlas2 (Jacomy et al., 2014). The main argument is that ForceAtlas2 offers fixed layers of interpretation (centre, mid-zone, periphery, and isolated elements) and multiple forms of reading (see Omena & Amaral, 2019; Omena et al., 2019). The basic principles of this technique account for the work of ForceAtlas2 in action, combined with empirical evidence based on several analyses I have made over the past years68. ForceAtlas2 responds to attraction versus repulsion forces according to degree (the total of connections a node has received or made). "The force-directed drawing has the specificity of placing each node depending on the other nodes. This process depends only on the connections between nodes. Eventual attributes of nodes are never taken into account" (Jacomy et al., 2014, p. 2). The detailed description that follows reflects not only a conceptual understanding of ForceAtlas2 but also some technical practice and, even more, some technical imagination (see Figure 2.3). In the centre of the network, we can thus assume, we see the nodes that gather more diversity and variety in their connections, these being, in some cases, the most connected nodes (e.g. in networks of hashtag co-occurrences) or the most popular nodes (e.g. networks of recommendation, as in the case of similar apps or related videos on YouTube). In the mid-zone, we find influential or bridging actors as well as empty zones (lack of connections), while the periphery is a space that reveals different perspectives and particularities, either in terms of content or platform specificity. This area of the network is usually very rich and interesting for analysis. The isolated elements, once they exist, also deserve our attention.

67 https://medialab.github.io/table2net/
68 Networks (either monopartite or bipartite) constituted by different digital records (hashtags, images, links, posts, comments, users, etc.).

Figure 2.3. The network of Instagram and Tumblr bots' image circulation according to a visual technique for interpreting the narrative affordances of ForceAtlas2. Network built upon the full matching images feature of the Google Vision API's web page detection.

Certain of the fixed zones of node positioning within the network, we tried to make sense of the spatialisation of the network by looking at the relational aspect of the dataset and how the network was built. However, considering the innovative proposal of this network, we questioned how we should interpret the different zones of the network and what we expected to find there. What should we look at? How exactly should we read the connections between images and links? After all, the link domains within this network are sites where full matching images have appeared. Some technical questions were addressed to my colleague Jason Chao, concerning particularities of the vision API. There was also an attempt to compare this network's spatialisation with that of the computer vision image-label networks and others. However, such a comparison was problematic because image-label networks concern visual semantic spaces: in the peripheral zone we normally see clusters that point to particular visualities, with a more detailed classification (labels) describing the images, whereas a more generic labelling takes place in the centre of the network (see Omena & Granado, 2020).
These networks, moreover, present a more static representation of the content, contrary to the depiction offered by those built upon the full matching images feature of the Google Vision API. The comparative exercise was then extended to networks of recommendation, such as related videos on YouTube and similar apps on the Google Play Store. But in these latter networks the centre zone is usually occupied by the most recommended videos or apps within the whole network. Added to that, and although they represent a flow of recommendation, both networks are monopartite (whereas we were dealing with a bipartite network). In theory, we were aware that the network of Instagram and Tumblr bots' image circulation involves movement and a dynamic perspective, due to its two types of nodes (images and link domains) and the way connections were made between nodes (the appearance of an image in different URLs). In practice, however, we were unable to move forward to interpret and analyse the network from such a dynamic perspective, even after looking at the network in Gephi, talking with my colleagues and comparing it with other networks. The solution to the interpretative problem was to both mentally revisit and visually (using Gephi) go over what constitutes that network: a step-by-step process, during which we took into consideration i) the way the network was made and how it was built (what precedes the network visualisation), and ii) the network visualisation in itself – what nodes are, how connections are made, and the role played by ForceAtlas2 (what takes place with and in the network). At the limit of our thinking capacity, we finally had a clear understanding of what was known in theory but not yet in practical terms. There was no point in comparing this network to others; we were getting distracted by the content of the network (different digital records) and by the ways of reading what informs the connections between nodes (e.g. the existence of co-related hashtags). The "aha moment" had finally come, advanced by what I could see – the positioning of the nodes (the anatomy and functioning of ForceAtlas2) and the shape of the clusters in the periphery zone (reflecting the list of Instagram and Tumblr image URLs used to run the Google Vision API) (Figure 2.4) – and also by technical imagination (reflected in the capacity to combine the practical qualities of ForceAtlas2 and Google Vision web detection with the choices made in the query design). Once again, understanding what precedes the network visualisation, the act of mentally revisiting each step of the method, paying close attention to technical elements (how does Google Vision API's full matching image detection module function? how has it added a new layer of meaning to our list of image URLs?), allowed us to read the network spatialisation. If all images within the network originated from two platforms, it was reasonable to see two big clusters in the periphery zone (gathering platform-specific visual content). In other words, if we used a collection of images from Instagram and Tumblr as the entry point to run the Google Vision API, and considering the number of images, we should expect the vision API to detect the place of origin of those images (Instagram and Tumblr); as popular references, in particular, social media appear at the top layer of Google results, resulting in the two big clusters we see in Figure 2.4.
Within these clusters, the nodes pointing outward (and not positioned close to the centre of the network) show the images that have appeared only within each platform (Tumblr or Instagram), whereas the nodes positioned closer to the central part point to the imagery that flows out of these platforms, thus reaching different websites, such as porn hubs (in red). Following this mindset, the small clusters we see connected only with Tumblr point to images that also flow out of the platform, but they hit link domains other than those located at the centre of the network. Some of these are specific porn websites, such as those devoted to Asian porn or teenage pornography (see Figure 2.5).

Figure 2.4. Computer vision image-domain network: reading how the imagery of Instagram and Tumblr botted accounts travels across domains. Nodes are images and link domains (a total of 14,788); edges (a total of 33,503) indicate whether a given image happened to appear in different domains. The visual content was published by what we called hidden bots. Analysis by Janna Joceli Omena, Jason Chao & Elena Pilipets (see Omena et al., 2019).

Figure 2.5. Images posted by Tumblr hidden bots. Links of a recurring "innocent" imagery redirect Tumblr users to Asian porn websites (above). A sensual image of a teenager (below) that has appeared both within Tumblr and on teen porn websites.

Figure 2.6. Understanding the narrative affordance of ForceAtlas2 in practice. The network of Instagram and Tumblr bots' image circulation. On the right, screenshots highlight link domains (green nodes); on the left, images (pink nodes). (Image source: https://thesocialplatforms.wordpress.com/2019/12/07/reading-digital-networks/)

Beyond the stick and flow of image circulation, we further understood that node positioning can indicate either (dominant) domains capable of gathering a more diverse range of images, in the centre (e.g. Twitter, Pinterest), or clusters of images connecting different link domains in both central and peripheral zones (see Figure 2.6). In this sense, we were able to raise questions: how images travel across platforms (what sticks within the platform and what flows out), where they appear (link domains), and what types of visual content are attached to a given link domain or clusters of link domains (and vice versa). After we finally understood the network spatialisation, the analytical process took place (see Figure 2.7), following a navigational research practice and the technique of visual network analysis (Venturini et al., 2019, 2015), which draws our attention to the position, size and colour of nodes as key interpretative aspects. Beyond what we could see in the network, we considered a combination of factors, such as the relational nature of digital records – images and their origins (platform), authors (botted accounts) and context (e.g. how bots were detected, time period) – the web environment and the narrative affordance of ForceAtlas2. At this stage, our intervention was crucial because interpretation means navigating from the Gephi overview/data laboratory to the web environment, and back to Gephi, then moving to the spreadsheet and looking at a big screen, while making annotations on printed versions of the network69. A navigational procedure was mandatory, as was our technical awareness.

Figure 2.7.
The analytical process and the researcher's intervention.

69 Here the interpretation of the network was led by a trajectory of technical awareness (see Figure 3.13 in chapter 3, which illustrates the research protocol diagram of this network).

Through the analysis of hidden Instagram and Tumblr bots, and by comparing it with the analysis of the purchased bots (see Omena et al., 2019), two modes of bot agency were detected. The discrete bots exist mainly through the act of giving likes (sometimes following others), serving the purpose of boosting engagement without attracting attention and never creating content. The imitative bots, on the other hand, mimic real people by distributing mainstream content, serving as aggregators of followers/following and as active agents in giving or receiving likes. Apart from that, another significant finding reveals that Instagram and Tumblr botted accounts are not only programmed to upload content or to follow trending hashtags in order to reach visibility; they also follow and respect the platforms' cultures of use and specificities (Omena et al., 2019).

The description of the Instagram and Tumblr bot image network aimed to illustrate the role of technicity in the full range of digital methods, also attempting to describe the use of a technical imagination. It was, furthermore, an attempt to describe processes that are hardly ever documented or registered in the digital methods literature – and I am not considering data sprint reports here. Although this section may sound more descriptive and less conceptual than the previous sections, the case exposed here reflects precisely an example of what it is to think along with a network of methods, connecting technicities (in mental and practical forms). This background knowledge constitutes the awareness of the technicity-of-the-mediums, where we assume that researchers are familiar with the digital fieldwork and the technical practices of digital methods. When in this position, researchers are able to use the technical mediation and substance inherent to digital methods as a constitutive part of research, also considering a triangular relationship between software affordances, platforms' cultures of use and grammatisation. The next chapter will demonstrate how technical expertise can specifically contribute to new forms of enquiring by describing that triangular relationship and suggesting a way of carrying out digital fieldwork.

CHAPTER 3: DIGITAL FIELDWORK

The digital is not a land of abundance. It is not a place where information pours in freely or easily; not a place where computational tricks, powerful as they may be, can replace the hard work necessary to mine, nurture and refine inscriptions. Digital methods do not spare us from walking the walk, but they give us the chance to experiment new pathways.
Tommaso Venturini, Mathieu Jacomy, Axel Meunier and Bruno Latour, 2017.

Getting acquainted with the web environment

This section presents the technological environment that the digital methods approach takes as a point of departure to ground claims about social phenomena, promoting a technical comprehension of the web environment and, following Venturini and Rogers, the argument that "research through media platforms should always be also research about media platforms" (Venturini & Rogers, 2019, p. 6). Here we look at the web as a network of connected (HTML) pages, of which a particular knowledge of its infrastructure is required to study social phenomena through the web.
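As a simple illustration of this view of the web as a network of connected (HTML) pages, the sketch below fetches a single page and lists the hyperlinks that tie it to other pages. The starting URL and the use of requests and BeautifulSoup are illustrative choices, not requirements of the digital methods approach.

```python
# A minimal sketch: one page of the web and its outgoing hyperlinks.
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def outgoing_links(page_url: str) -> set[str]:
    """Return the set of absolute http(s) URLs hyperlinked from a web page."""
    html = requests.get(page_url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for anchor in soup.find_all("a", href=True):
        absolute = urljoin(page_url, anchor["href"])  # resolve relative links
        if urlparse(absolute).scheme in ("http", "https"):
            links.add(absolute)
    return links

if __name__ == "__main__":
    for link in sorted(outgoing_links("https://en.wikipedia.org/wiki/World_Wide_Web")):
        print(link)
```

Repeating this operation recursively is, in essence, what crawlers do, and it is this hyperlink structure that the layered view of the web discussed below relies upon.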
Some fundamental aspects of the web environment will be discussed from a methodological viewpoint, paying particular attention to the role of web applications and application programming interfaces. The web environment sets the scene for different ways of researching, in which scholars need to adapt their research design in line with what is available in the blogosphere, Internet archives, social media, search engines, etc. Moreover, they should know how digital records are made available (methods such as hyperlink networks70, crawling and ranking, recommendation systems, Graph API infrastructure), how the web is used – what is grammatised (social buttons, hashtags, locale), through what types of content (textual, video, image, gifs, memes), and for what purposes (political debate, elections, the spread of disinformation, social causes, etc.). It is therefore imperative to understand the web environment, from its methods and functioning to its forms of use (see Rogers, 2018).

70 See the work of Han Woo Park, available at: https://www.researchgate.net/publication/200772676_Hyperlink_network_analysis_A_new_method_for_the_study_of_social_structure_on_the_web

Such sensitivity to the technical architecture of the web is now, more than ever, crucial in the content, methods, sources and techniques of medium research (D'Andréa, 2020; Marres, 2017; Rieder & Röhle, 2018; Rogers, 2019; Watts, 2007) and justifies my attempt to introduce the web environment through its functional modes and particular relations with digital research and methods.

A technical comprehension of the web: an overview

The World Wide Web (WWW), or simply the web, was created to be interactive, safe and synonymous with a decentralised space in which we would all have the right to express our opinion (Berners-Lee, 1995). While the web is still, in many ways, a space for free speech71, the increasing power of private companies such as Alphabet-Google, Apple, Amazon, Facebook and Microsoft (a.k.a. GAFAM) is breaking this space up into a series of platform walled gardens (Poell, Nieborg, & van Dijck, 2019; Rogers, 2018). The role of web platforms as custodians of the world's data (Morozov, 2018) is problematic and complex, implicating many social and economic issues. I want, however, to give emphasis to a technical comprehension of the web as crucial to carrying out research based on the digital methods approach. In this spirit, we may start by defining the web as an information system in which we can access research material through Uniform Resource Locators (URLs) or Uniform Resource Names (URNs). URLs identify a resource by location (e.g. https://smart.inovamedialab.org/2021-platformisation/) and URNs by name, for instance when using the International Standard Book Number (ISBN) to locate a book (978‐972‐9347‐34‐4)72. Uniform Resource Identifiers (URIs) – either URLs or URNs – are transferred via an application layer protocol for data communication, namely the Hypertext Transfer Protocol (HTTP), which is only accessible over the Internet73 (see Berners-Lee, Fielding and Masinter, 2005). That is to say, the universal identifier of every piece of information stored on the web needs to be standardised as a URI.

71 Considering that anyone with Internet access can create channels or find means to communicate, but this comes with some costs, such as those related to constant surveillance, privacy issues and digital labour.
72 World Wide Web (2020).
Retrieved September 20, 2020, from https://en.wikipedia.org/wiki/World_Wide_Web
73 See also: Hypertext Transfer Protocol (2020). Retrieved September 20, 2020, from https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol

In Figure 3.1 we see the generic URI syntax, "which consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment" (Berners-Lee et al., 2005), together with a few examples of web content originating from YouTube, Tumblr, Facebook and Twitter.

Figure 3.1. Understanding the generic syntax of URIs.

According to Berners-Lee et al. (2005), schemes consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-"), e.g. https, which stands for Hypertext Transfer Protocol Secure, an extension of HTTP used for secure communication74. The authority component, always preceded by two slashes (//), comprises a host such as youtube.com, but it can also contain optional user information. The Domain Name System (DNS) also has its own hierarchy, being composed of a hostname (en), a second-level domain (wikipedia) and a top-level domain (org): en.wikipedia.org (the domain name is wikipedia.org). The last two are useful analytical tools when using digital methods, as are country-code top-level domains (ccTLDs), which indicate domain extensions for regions or countries, e.g. www.aruki.pt. The path component, often separated by a slash (/), specifies the unique location of a resource, organised hierarchically like a file system. Using social media as an example, the path may show which environment within YouTube we are in (e.g. channels, video) or point to the exact location of an image or Facebook post (see Figure 3.1). After the path, preceded by a question mark (?) highlighted in yellow (Figure 3.1), there can be an optional query component, revealing, for instance, that within the YouTube environment we are looking at a specific video playlist among other existing ones. Finally, the optional fragment component indicates what is often taken as an id attribute of a specific element. The fragment component is another valuable source for digital methods, helping the researcher to identify unique actors/content in a dataset or to explore visual content by using the IMAGE formula in a Google Spreadsheet75.

74 All URI schemes are required to be registered with the Internet Assigned Numbers Authority (IANA).

Media, textual and visual content available on the web are composed through the Hypertext Markup Language (HTML) and the Extensible Markup Language (XML). HTML describes the content and presentation of a web document, while XML turns that description and presentation into a text-based format that allows machine consumption (Helmond, 2015a). Along with HTML, Cascading Style Sheets (CSS) and JavaScript represent the cornerstone technologies of the web; while CSS serves to specify the presentation of web pages, JavaScript is used to specify their behaviour (Flanagan, 2011). CSS is "a stylesheet language used to describe the presentation of a document written in HTML or XML" (Mozilla Developer Network, 2020), describing how elements should be rendered on media, including colours, layout and fonts. According to the World Wide Web Consortium (W3C), CSS is a standardised language across browsers which also "allows one to adapt the presentation to different types of devices, such as large screens, small screens, or printers" (World Wide Web Consortium, n.d.).
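Before turning to JavaScript, it is worth noting that the URI anatomy just described is directly machine-readable: a few lines of Python's standard library are enough to decompose any URL into scheme, authority, path, query and fragment, which is often the first move when cleaning a list of URLs for analysis. The URL below is a generic placeholder in the style of the Figure 3.1 examples.

```python
# A minimal sketch: decomposing a URL into the URI components discussed above.
from urllib.parse import parse_qs, urlparse

url = "https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID#comments"
parts = urlparse(url)

print(parts.scheme)    # 'https'            -> the scheme
print(parts.netloc)    # 'www.youtube.com'  -> the authority (host)
print(parts.path)      # '/watch'           -> the path
print(parts.query)     # 'v=VIDEO_ID&list=PLAYLIST_ID' -> the query component
print(parts.fragment)  # 'comments'         -> the fragment component

# Query parameters as analytical handles, e.g. the video id and the playlist.
print(parse_qs(parts.query))  # {'v': ['VIDEO_ID'], 'list': ['PLAYLIST_ID']}
```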
JavaScript, a text-based programming language, played a significant role in the functional diversification of the web. Created by Netscape in the early days of the web, JavaScript is the technology that allows dynamic behaviour to be added to web pages (Flanagan, 2011; Mindfire Solutions, 2017; Mozilla Developer Network, 2020), being thus instrumental in turning the web from a static and informational environment into a dynamic and interactive one. Accordingly, it supports object-oriented and functional programming styles to modify HTML content in response to events/actions, being used on both the front end (or client side) and the back end (or server side) (Coding Arena, 2018; Mindfire Solutions, 2017). This brief definition explains why JavaScript is known as the programming language of the web and a crucial brick in the development of web applications.

75 As a practical example, watch this playlist on detecting instabots: https://www.youtube.com/watch?v=D1Jo84tfbY&list=PLuAgGxzD7fdxKJVTbYM5PtzMmnT94_1ZM

The web's dynamic architecture makes web content meaningful not only for its users (navigating, participating, creating) but also for computers (getting, learning, evolving), proving to be a revolution in knowledge representation (Anderson & Wolff, 2010) but also carrying endless challenges for research. In the practices of digital methods, the value of machine-readable web content (with what is inherent to it, how it works and to what effects) becomes a crucial technological grammar for driving research, such as the use and knowledge of web crawlers, scrapers and APIs. From the viewpoint of methodology, a technical understanding of the web environment comprises one final element: the fractal design and layered structure of the web (Figure 3.2). In 1995, Tim Berners-Lee explained the fractal design of the web at The MIT/Brown Vannevar Bush Symposium, describing web pages as if they were "representative of people, organizations, or concepts" and affirming that we all need to be part of the fractal pattern (Berners-Lee, 1995). This is a structure containing connections of widely varying "length" but with involvement at all scales; to exemplify this idea, he refers to the MIT building as a system in which "the whole operates because the parts interoperate, and the way the whole works and whether the whole works is defined by how the parts interoperated" (Berners-Lee, 1995) (Figure 3.2). This representation of the web becomes clearer through Franck Ghitalla and Mathieu Jacomy's76 work, which is inspired by network science and graph theory (Figure 3.2).

76 In February 2019, Mathieu Jacomy presented the argument of the web as layers in the context of the Digital Media Winter Institute at Universidade Nova de Lisboa, where he was tutoring a workshop about networks of (dis)information (see https://smart.inovamedialab.org/workshops/2020_networks-of-disinformation/). The author shared a list of videos with all workshop participants in which one can watch his lecture on the web as layers, available at https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b-887f-ab53007ef5c4.

Figure 3.2. At the top, a screenshot of Tim Berners-Lee's lecture about Hypertext and Our Collective Destiny at The MIT/Brown Vannevar Bush Symposium 1995 (30m)77. At the bottom, a screenshot of Mathieu Jacomy's lecture at Aalborg University in December 2019, where he presents the notion of the web as layers78.

Like Berners-Lee's illustration, Ghitalla and Jacomy (2019) also explain the map of the web from the inside, rather than from above. To the authors, the web is made of four layers.
When web browsing or crawling for research purposes, we will first encounter the most well-known websites or platforms constituting the surface of the web (the first layer), such as Google, Wikipedia, social media, national institutions, etc. The aggregates come right after the most cited websites or platforms; here we see homophily happening, or the bonding with similar others. In this zone we may see more specialised content and actors, divided into a higher layer of aggregates (the core, second layer) and a lower one (the periphery, third layer), followed by an extremely specialised zone, also known as the deep web (fourth layer). This adds to our understanding that "on the web, anyone can have a voice, but it does not mean that everyone will have the same impact" (Ooghe-Tabanou et al., 2018, p. 16). A key structural element of this web hierarchy is the hyperlink, with which web information has to be connected in order to become known and to gain an audience, as Benjamin Ooghe-Tabanou and colleagues (2018) explain.

77 Source: https://www.dougengelbart.org/content/view/258/000/
78 Source: https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b-887f-ab53007ef5c4

But why does the notion of the web as layers matter? From the standpoint of digital research, if one wants to study what happens within the web environment to better understand society, one needs to understand how information and content are structured within this environment. As explained by Jacomy (2019), we can understand the different layers of the web and their properties through search engine results (first results, specific query results, unindexed content), the visibility of links (known to all, known to amateurs and experts, forgotten) and the content (generic, specialised). Paying attention to the notion of the web as layers also helps us to understand the role of hyperlinks as a "data-rich analytical device" (Helmond, 2013; Ooghe-Tabanou et al., 2018). The case presented in chapter 2 to study social bots also serves to respond to that question. Through the analysis of a computer vision API-based network, we identified social media platforms – in particular Twitter and Pinterest – and blog sites in the first layer of the network. That means full matching images posted by Instagram and Tumblr bots were detected by the Google Vision API at the surface of the web. The second layer showed us two main poles constituted by the studied platforms (Instagram and Tumblr), exposing the imagery that sticks within or flows out of these platforms. This zone was followed by a more isolated zone within the network where we saw specific porn websites (third layer), such as Asian girls and teen porn websites. Although passing through some layers of technical mediation, the network reflects the layered web.

An architecture of participation: the role of web applications and APIs

Tim O'Reilly's definition of Web 2.0 serves as another, and important, technical comprehension of the web. O'Reilly describes it as a network of connected devices empowered by human and non-human forms of use. Anticipating upcoming research lines in Media Studies, such as Tarleton Gillespie's The Politics of Platforms, published in 2010, O'Reilly calls our attention to the multi-sided publics of the web (including users) and to the data/services centralisation-decentralisation modes belonging to web applications.
"I said I'm not fond of definitions, but I woke up this morning with the start of one in my head: Web 2.0 is the network as platform, spanning all connected devices; Web 2.0 applications are those that make the most of the intrinsic advantages of that platform: delivering software as a continually-updated service that gets better the more people use it, consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others, creating network effects through an "architecture of participation", and going beyond the page metaphor of Web 1.0 to deliver rich user experiences" (O'Reilly, 2005).

In 2005, this compact definition of the web perhaps did not serve as a reference for research methods, at a time when research efforts were invested in other technological aspects of the web. Over the years, however, and after the emergence of social media APIs with loose restrictions on data access, digital media scholars started to value both the creation/use of web research apps and the role of APIs as objects to be considered, showing great interest in comprehending the agency and effects of how social media platforms deliver software in a dynamic and continually updated format, while allowing others (third-party applications) to access data and add new functionalities to the platform (see Lomborg & Bechmann, 2014). It is, therefore, important to define web applications and APIs in the context of digital research.

A web application is a software program that completes specific tasks and achieves specific outcomes by using web technologies (e.g. JavaScript and HTML), such as creating, reading, updating and deleting information (see Gibb, 2016; Microsoft Virtual Academy, 2018)79. Web apps are accessed through web browsers and run on a web server, and are thus expected to "be quicker to build, simpler to distribute, and easier to iterate" (Miller, 2018). That also means that web apps provide interaction and functionality within the web environment, using a combination of components for their front-end and back-end interfaces, organised as follows:

§ Components of a web app in the front-end interface:
Hypertext Markup Language (HTML): for defining the content and structure of a webpage
Cascading Style Sheets (CSS): for defining the style and layout of a web page, the presentation of web content
JavaScript (a text-based programming language): for creating scripts that control the behaviour of a webpage, controlling how an app will dynamically respond to actions

§ Components of a web app in the back-end interface:
Database: for storing and organising an app's data
The back-end code of the app: for controlling data access and how it is used
Web server: for hosting the app and allowing others to access it

Adapted from Microsoft Virtual Academy (see Coding Arena, 2018)

Google apps (e.g. Google Docs, Sheets or Slides), Slack, Trello, Pinterest and even Twitter are some examples of web apps (see Coding Arena, 2018; Miller, 2018). In the context of digital methods, the creation and use of research apps allow researchers to carry out web-based studies in the most diverse research areas, mediating the execution of specific tasks.

79 While mobile apps are built for a specific platform (e.g. iOS for the Apple iPhone or Android for a Samsung device), running directly on a mobile device.
For example: the retrieval of data from social media APIs with YouTube Data Tools (Rieder, 2015), but also exploratory data analysis with DMI-TCAT (Borra & Rieder, 2014) and 4CAT (Peeters & Hagen, 2018); the extraction of URLs from text, source code or search engine results, producing a clean list of URLs with the Harvester from DMI Tools; and the making of networks of Wikipedia related pages based on their "See also" section with Seealsology, from DensityDesign & médialab Sciences Po80. Web research apps demand our attention not only for what they can offer or do, but because they have "epistemic orientations that have repercussions for the production of academic knowledge" (Borra & Rieder, 2014, p. 2). They open new ways of doing research and reflecting about media and society. As data retrieval from social media is a primary task in the practices of digital methods, we need to define and technically understand web application programming interfaces (APIs). APIs constrain the way in which developers can create research apps (Sturm, Pollard, & Craig, 2017); they also gather technologies for capturing and (re)organising acts and actions within the web environment, to which researchers can gain access. Even though we recognise that the golden age of social media API-driven research has passed (see Freelon, 2018; Perriam, Birkbak, & Freeman, 2020)81, attention should be given to what APIs are, how they work and what we should look at when using digital methods. Rick Sturm and colleagues define APIs as "critical software elements supporting data and/or functionality interchange between diverse entities. They are also utilised for standardising data access within a software engineering organization to facilitate faster software development" (Sturm, Pollard & Craig, 2017). In other words, they are an interface programming language and infrastructure which defines interactions between multiple software applications, while allowing the use, access and exchange of data. APIs respond to the principle of information hiding, or the criteria applied to divide a system into modules, proposed by David Lorge Parnas in 1971 (Parnas, 1971). This principle "prescribes that software modules hide implementation details from other modules in order to decrease the dependency between them" (de Souza, Redmiles, Cheng, Millen, & Patterson, 2004, p. 1). By being constituted by public and non-public or private properties, API infrastructures can separate function from implementation.

80 See https://wiki.digitalmethods.net/Dmi/ToolHarvester, https://densitydesign.github.io/strumentalia-seealsology/?fbclid=IwAR1zOtvsfpTU9emI4OtrdbTYq7ka4ROzruk03H3Zjl-HOaOlzcq5cmCSZnA, https://medialab.github.io/graph-recipes/#!/upload
81 Social media APIs now have high restrictions and very limited access to public data. As a response to Facebook's API restrictive measures, Anja Bechmann (the director of DATALAB – Center for Digital Social Research) created an open document listing publications that could not have existed without access to API data. The list is available here: https://docs.google.com/document/d/15YKeZFSUc1j03b4lW9YXxGmhYEnFx3TSy68qCrX9BEI/edit?usp=sharing
The public properties of APIs are visible to the client and should include the specifications of functionality, while the non-public properties must remain secret, following the criteria of decomposability and composability (Meyer, 1998). For instance, let us "assume a module changes, but the changes apply only to its secret elements, leaving the public ones untouched; then other modules who use it, called its clients, will not be affected" (p. 51). Such an infrastructure is also recognised as the open-closed principle of APIs, because software modules should be simultaneously open (for extension and adaptation) and closed (to avoid modifications that affect clients) (Meyer, 1998), in order to facilitate specific operations and characteristics such as interoperability (when two or more applications cooperate, exchanging and making use of information) and modularity (the capacity of breaking up applications into modules in a way that they can be recombined) (see Bucher, 2013). We can look at the Facebook Graph API and the Netvizz application (Rieder, 2013) (Figure 3.3) as a practical example of web APIs and apps in the context of digital methods. Netvizz was created in 2009 by Bernhard Rieder and worked as a research tool for almost ten years. When embedded into Facebook, the application offered a service to the platform's users, adding new functionality, such as the retrieval of link stats and of data from Pages, Groups and users' like or friendship connections (see Figure 3.3). Over the years, more specifically after the Cambridge Analytica scandal in 2018, the restrictions increased. For example, access to user reactions to Facebook Public Pages' posts (the user-post bipartite graph) or to the identification of Page fans per country was suspended, closing a rich scenario for the study of social phenomena82.

82 In 2018, Jorge Martins Rosa, Daniel Cardoso and I used a user-post bipartite graph to guide content analysis. We analysed comments: those made by commenters from the periphery of the network (a total of 242, or 34.42%) and those made by commenters from its centre (a total of 461, or 65.58%). The study was focused on the reactions to a Facebook post, on July 9, 2017, in which the admins of the Page revealed their names. The Page (Os Truques da Imprensa) used to publish critical remarks on the news of Portuguese national media, often generating heated debates on the platform, either being praised for its role as a watchdog, or discredited as allegedly serving as the spokesperson for a hidden political agenda. Our goal was to evaluate the debate generated by this post, particularly concerning the polarisation of positions and arguments among those who engaged with the post. The article is available at http://obs.obercom.pt/index.php/obs/article/view/1367/pdf.

Figure 3.3. Screenshot of the Netvizz research app (above) and, by using this app, the changes in the Facebook Graph API data access regime over the years (below).

The open-closed structure of the Facebook Graph API can also be illustrated through the requirement of more detailed information, such as whether a list of publications was sponsored or not. In 2015, this information was available to social media marketing companies or clients of Social Bakers (https://www.socialbakers.com/), but not to Netvizz users84 or Facebook users.
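To make the request-response logic of such platform APIs concrete, the sketch below asks the YouTube Data API v3 (the service behind YouTube Data Tools) for the public statistics of a single video. It assumes a valid API key stored in an environment variable and a placeholder video id; it is offered as a generic illustration, not as a reconstruction of the Netvizz example.

```python
# A minimal sketch of calling a platform web API: the YouTube Data API v3.
# Assumes an API key in the environment variable YT_API_KEY; VIDEO_ID is a placeholder.
import os
import requests

ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"

def video_statistics(video_id: str, api_key: str) -> dict:
    """Return the snippet and statistics the API exposes for a given video id."""
    params = {"part": "snippet,statistics", "id": video_id, "key": api_key}
    response = requests.get(ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    items = response.json().get("items", [])
    return items[0] if items else {}

if __name__ == "__main__":
    item = video_statistics("VIDEO_ID", os.environ["YT_API_KEY"])
    if item:
        print(item["snippet"]["title"])
        print(item["statistics"])  # e.g. viewCount, likeCount, commentCount
```

The same open-closed logic applies here: the researcher sees only the public interface (endpoint, parameters, returned fields), while the platform's internal implementation remains hidden and can change without notice, as the Netvizz case shows.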
A last characteristic of social media APIs that I want to emphasise is their capacity to facilitate the integration of content through platform-specific hyperlinks, such as like or share buttons, a dynamic that Gerlitz and Helmond (2013) call the like economy. That is the case when social media functionalities/buttons are pulled out and placed on a website. As Anne Helmond (2015a; 2015b) explains, such a like economy (afforded by APIs) enables dynamics of decentralisation (when pulling out social media functionalities/buttons to place them on other websites) and recentralisation (data collection through social buttons and associated plugins) of data flows. This dynamic "creates new forms of connectivity between websites beyond hyperlinks, introducing an alternative fabric of the web" (Helmond, 2015b, p. 159), particularly when turning the presence of social buttons and associated plugins into valuable data. In this way, social media APIs play a crucial role in shifting the currency of the web from web-native (hits and links) to platform-native (likes, shares, retweets) (Gerlitz & Helmond, 2013; Helmond, 2015b). In other words, "social media platforms are creating currencies that are tied to the mechanics and logics of their own platform infrastructure" (Helmond, 2015b, p. 150).

84 In 2017, and in collaboration with António Granado, we analysed substantial Facebook data generated by the official Facebook Pages of 15 Portuguese universities. We suspected that a number of posts were sponsored, due to a discrepancy observed between the engagement metrics and the content of the publications. After consulting the communication departments of a few universities, our suspicion was confirmed: most of the posts with high engagement metrics were sponsored.

From the web as a platform concept to platformisation

The enactments of platform programmability (through web-based APIs), namely the platforms' capacity to respond to data, service or functionality requests, also constitute "the foundation for the Web as a platform concept" (Murugesan, 2007, p. 36; O'Reilly, 2005; see also Helmond, 2015a). The notion of platforms leads us to consider the regimes of functioning of application programming interfaces (APIs) and how they contribute to the decentralisation and recentralisation of platform functionality and data. Marc Andreessen (2007b, September 16), for instance, explains how web platforms are defined through the programmable potentials afforded by their APIs. Access and functionalities can be broken down into three levels: API access (level 1), Plug-in API (level 2), and Runtime Environment API (level 3). In Level 1, applications call into the platform via web services to access data and services, e.g. pulling out YouTube content and Facebook like buttons to put them into a blog or another page. In Level 2, as illustrated before with Netvizz, there is not only data and
services access, but it is also possible to inject functionality into a given platform. That is the case of the apps embedded in the Facebook environment, from gaming and dating categories to applications such as This Is Your Digital Life, which violated Facebook's terms of service by collecting and misusing users' data in favour of Donald Trump's 2016 presidential campaign, in the so-called Cambridge Analytica data scandal85. At Level 3, Internet platforms allow developers to run their apps inside the platform itself, as in the virtual experience provided by Second Life86.

The characteristics of web APIs, in particular social media APIs, are central to the platformisation of the web. This notion, coined by Anne Helmond, refers to "the rise of the platform as the dominant infrastructural and economic model of the web and the consequences of the expansion of social media platforms into other spaces online" (2015a, p. 1). According to Helmond, three factors explain this phenomenon, namely: the separation of content and presentation, the modularisation of content and features and, finally, the interfacing with databases. These factors reflect not only how platforms enable their programmability87 "through the exchange of data, content, and functionality with third parties" (p. 5) but also the infrastructure of the web environment and its relationship with web-based research apps/software, as discussed in the previous sections. While Tim O'Reilly's prediction for the web has been confirmed (the web has indeed become a provider of services, rather than just a place to find information), other dynamics have also shaped the evolution of the web. That is, the majority of web services ended up being concentrated in the hands of a few private companies (e.g. GAFAM, Baidu, Alibaba and Tencent), which now govern public spaces, penetrating the core of our economic and civic life (Dijck, 2020). This evolving dynamic process, called platformisation, thus reflects "the inter-penetration of the digital infrastructures, economic processes, and governmental frameworks of platforms in different economic sectors and spheres of life" (Poell et al., 2019: 6 in Dijck, 2020, p. 4). The web is less and less seen as an open platform for everyone to use, and more and more as a closed environment ruled by specific platforms' walled gardens.

85 https://en.wikipedia.org/wiki/Facebook%E2%80%93Cambridge_Analytica_data_scandal
86 https://secondlife.com/
87 Helmond provides technical descriptions of APIs and explains how the Extensible Markup Language (XML), by structuring content in a text-based format, facilitates machine consumption, enabling "the circulation of content through modular elements/components" (p. 6).

Digital technologies and the web as the last stage of grammatisation

Grammatisation was forged by the linguist Sylvain Auroux (1992) to delineate the technical process of description, formalisation and discretisation of human behaviours into representations, so that they could be reproduced (Crogan & Kinsley, 2012; Petit, 2012; Stiegler, 2011). The author uses the alphabet (alphabetisation) as the first example of what constitutes a process of grammatisation, which starts with the exteriorisation of human gestures, acts or knowledge. Central to this are, of course, the grammar and the dictionary, referred to by Auroux (1992) as the pillars of our metalinguistic knowledge, being simultaneously representations of languages and techniques (or external tools) that alter communication spaces. A grammar provides general procedures for creating/decomposing statements, while the dictionary provides the items to be arranged/interpreted according to these procedures. Here, the grammar must contain, at least, a categorisation of unities, examples and rules, and its content should be relatively stable, e.g. spelling, syntax, and morphology (Auroux, 1992).
The author, furthermore, argues that spaces of communication are simultaneously constituted and modified through grammatisation (Auroux, 1992; Petit, 2012)88, which he considers an industrial revolution in itself (the first being the invention of writing and the second the print revolution)89 (see Stiegler, 2011, p. 172).

88 The definition and contextualisation of grammatisation according to Victor Petit can be found in the "Vocabulaire d'Ars Industrialis" (http://arsindustrialis.org/grammatisation), an international association created in 2005 on the initiative of the philosopher Bernard Stiegler. "Ars Industrialis set itself the goal of imagining a new type of arrangement between culture, technology, industry and politics around a renewal of the life of the spirit" (see http://arsindustrialis.org/lassociation).
89 Auroux uses, for instance, the example of how views of the landscape and modes of transportation have radically changed along with the roads, the canals, the railways, and the landing fields, in order to explain how "grammatisation has profoundly changed the ecology of communication and the state of the linguistic patrimony of humanity" (Auroux, 1992, p. 70).

Here, the notion of grammatisation is transferred to the digital environment, framed in the context of digital technologies and web platforms and thought through its relationship with social media and research software. In this sense, researchers may consider the formalisation of web content (online objects and activities such as links, URLs, like buttons, comments) as technological grammar, while looking at the entanglements of web infrastructure and applications' informative sources that explain and articulate "discourse-made-machinery" (Agre, 1994). To help us reflect on this reality, this section provides an overview of digital grammatisation through the philosophical work of Bernard Stiegler, followed by Philip E. Agre's software-specific frame.

Making visible and tangible different types of memory, behaviour, knowledge

Bernard Stiegler extends and diverts Sylvain Auroux's conceptualisation90 by framing digital technologies and the Internet as "the last stage of grammatisation and a new kind of writing". He argues that, because of the new condition of memorisation91 imposed by technological evolution, we (human beings) can only be understood through the point of view of technological complexity or the "realisation of memory92" (Stiegler, 2018), which stands for processes of exteriorisation, production and discretisation of intellectual structures.

90 Thinking through the work of Auroux, and based on the alphabet, Stiegler sees digital grammatisation as similar to the process of writing a letter, in which the impression of memory is simultaneously a technical and a logical condition. "A becoming-letter of the sound of speech [la parole] which precedes all logic and all grammar, all science of language and all science in general, which is the techno-logical condition (in the sense that it is always already technical and logical) of all knowledge, and which begins with its exteriorization" (Stiegler, De la misère symbolique 1, pp. 111-114 apud Stiegler, 2011, p. 172).
Consequently, he explains that such an impression of memory can be observed, apprehended or studied through digital grammatisation. This notion thus reflects how technologies affect the development of human nature, while playing a role in the process of the impression of memory.

91 This idea particularly emerged with Leroi-Gourhan's analysis of a protohuman fossil (Zinjanthropus boisei), which raised the thesis that techniques are a vector of memory: "[H]e showed that a crucial biological differentiation of the cerebral cortex took place in the passage from what he called the Australanthropian to the Neanderthal. He showed that, from the Neanderthal onward, the cortical system was practically at the end of its evolution: the neural equipment of the Neanderthal is remarkably similar to ours. Nevertheless, from the Neanderthal to us, technics evolves to an extraordinary extent" (Stiegler, 2010, p. 73).
92 According to Stiegler, this principle can be explained because the process of becoming human reflects both the recollection of the past or ideas or soul (anamnesis) and its exterior traces or the technical supplement to memory (hypomnesis). In this sense, and quoting Leroi-Gourhan, he states that technicity or techno-logic (in Homo sapiens) "is no longer geared to cell development but seems to exteriorise itself completely—to lead, as it were, a life of its own" (Leroi-Gourhan, 1993, pp. 137-139 apud Stiegler, 2018, p. 30).

To Stiegler, grammatisation is a form of spatialisation of time93 (Stiegler, 2012), as it makes visible different types of memory, behaviour or knowledge, e.g. when the temporal flows of a speech are transformed into web textual content, which is stored in the back end but also made available in the end-user interfaces of social media. These behavioural fluxes or detemporalised forms of speech make what Stiegler (2012) calls a spatial object: an object synoptically visible and tangible, thus making "possible an understanding that is both analytic (discretised94) and synthetic (unified)" (Stiegler, 2012, p. 2). After typing "the Internet sucks" on my Tumblr dashboard and clicking post, my opinion about the Internet is no longer a temporalised form of speech but a spatial object. It evolves into a permanent link95 composed of text, image and a timestamp: a visible and tangible object that can be traced, saved, stored or shared by others. Here, the retention96 and materialisation of my opinion was only possible through a technical materialisation process afforded by Tumblr, which also serves as a good illustration of digital grammatisation, described by Stiegler as:

all technical processes that enable behavioural fluxes or flows to be made discrete (in the mathematical sense) and to be reproduced; behavioural flows through which the experiences of human beings (speaking, working, perceiving, interacting and so on) are expressed or imprinted. If grammatisation is understood in this way, then the digital is the most recent stage of grammatisation, a stage in which all behavioural models can now be grammatised and integrated through a planetary-wide industry of the production, collection, exploitation, and distribution of digital traces. (Stiegler, 2012, p. 2)

93 Stiegler sees behaviour as a form of time.
94 In computing, discrete means individually separate and distinct for the purposes of easier calculation, whilst unified means being part of a whole but operating as a single entity.
95 https://joceliii.tumblr.com/post/173795180479/the-internet-sucks
96 Retention "refers to what is retained, through a mnesic function itself constitutive of a consciousness, that is, of a psychical apparatus" (Stiegler, 2012, p. 2) (mnesic refers to what pertains to memory). Stiegler (2012) borrows the term retention from Husserl, who distinguishes two types of retention: i) primary retention (perception), which "means retained in the course of a perception, and through the process of this perception, but in the present" (primary retention is not yet a memory but "the course of a present experience"); ii) secondary retention (imagination), which "is the constitutive element of a mental state which is always based on memory" (idem, pp. 2-3). Stiegler introduces a third type of retention, hypomnesic (an artificial or technical supplement to memory): tertiary retention, "in which consists the grammatisation of the flow of retentions" – more generally, any technical materialisation process (p. 3).

Stiegler (2018) recognises the techno-logic of grammatisation as a constituent building block of both technology and culture and, along with the current status of the Internet for sociability97, he asserts that we live in a revolutionary moment (similar to the invention of writing) and should change our modes of conceptualisation and conditions of interpretation accordingly. Moreover, Stiegler (2010) asserts that grammatisation follows the logic of processes and dynamic compositions98, not of hierarchies or totalising systems. Thus, another way to understand grammatisation rests on approaching it as a complex process divided into four steps: i) the memory's inscription, carried out by the equipment for the input of data; ii) its preservation, operated by a technical supplement to memory, which constitutes databases; iii) its processing by software, which may take different forms; and, finally, iv) its transmission or publication – "the data thus processed are transmitted on networks. One accesses these networks through interfaces" (see Stiegler, 2018, p. 27), such as web-based APIs. This process has its own language, its own memory, and its own knowledge. Consequently, the technics and methods applied to grasp the process of digital grammatisation "cannot therefore be considered as merely 'means' serving 'ends' that would not themselves be technological" (Stiegler, 2018, p. 32). Rather, we should consider the conditions imposed by digital technologies and how they inscribe, preserve, process and transmit behavioural fluxes or flows. This will be discussed in the next section.

97 That is what, in 1986, Stiegler would refer to as the emergence of new technologies.
98 Here the author follows the work of Jacques Derrida.

From the metaphor of capture to an in-depth look at technological grammar

In 1994, Philip E. Agre proposed a reflection on two (not mutually exclusive) models of privacy as cultural phenomena: the surveillance model has its origins in the realm of public debates, while the capture model is rooted in the practical application of computing (see Agre, 1994, p. 107). He explains that these perceptions employ distinct metaphors. The surveillance model is informed by modes of observation built on visual metaphors such as Orwell's "Big Brother is watching", whereas the capture model employs linguistic metaphors in which "human activities are systematically reorganised to allow computers to track them in real time" (Agre, 1994, p. 101).
From the standpoint of the philosophy and the practices of digital methods, Agre's essay remains relevant these days for reasons other than technology and privacy matters. Agre's capture model will help us to look at the technical context and environment of social media grammars as methodological language. By providing a comprehensive understanding of digital grammatisation, it highlights how technological grammar can no longer be seen outside the technicity of the software. Three key aspects of his work are particularly relevant. The first relates to common aspects found in all tracking systems (or schemas): the entity, the computer and the agent. The entity refers to what is about to be tracked and captured; each entity may have "a definitive identity that remains stable over time" (p.104). In the social media context, this includes clicking on links, choosing a reaction other than "like" on Facebook, making a comment or uploading an image. Entities have changeable states but "a definitive identity": when one uploads a video to YouTube, for instance, a unique identifier (id attribute) is immediately created, serving as a definitive identity to localise the video but also to collect different types of information about it99. These activities can be facilitated by back-end communication between an extraction software and the YouTube Data API, but also through the front-end interface of YouTube. Quantifiable parameters are attached to a video, such as the number of views, likes/dislikes/comments and video recommendations, which are also captured over time. These, in Agre's lexicon, contain "some mathematically definable representation schema, which is capable of expressing a certain formal space of state of affairs" (Agre, 1994, p.105). The computer is responsible for representing the changing states of an entity; consequently, it also provides social and technical means that keep "the correspondence between the representation and the reality" (op. cit., p.104). By computer, the author refers to databases or distributed systems which can only offer representation schemas of entities based on what they can capture and store over time, "and this trajectory, can be either literal or metaphorical, or both, depending on what aspects of the entity are represented" (p.105). 99 By using YouTube Data Tools (Rieder, 2015), one can retrieve video info and statistics to generate a network of relations between videos, based on the reference "related to video id" of the YouTube Data API. In the context of social media, we can, for example, think of APIs as a computer in Agre's terms. The last aspect present in all tracking systems is a human or automated agent that can request/retrieve information from a database. In the practices of digital methods, agents are research software (e.g. FacePager or YTDT), scrapers, crawlers or HTML/Python scripts, which are used or created by multiple stakeholders (e.g. developers, scholars, users, companies, governments) to request, create or retrieve data from the web or social media environments (an activity that can also be tracked). The second important aspect in Agre's essay relates to the perception of the capture system itself. This model implies the match (or mismatch) of epistemological versus ontological principles (Agre, 1994). On one side, the notion of capture refers to the acquisition of information by computers; on the other, it expresses a semantic distinction, i.e. "acquiring the data" versus "modelling the reality it reflects" (p.106).
Agre refuses the idea of a literal descriptive system and, instead, introduces the capture model as a metaphor system, in which the ways of understanding activity have their basis in the practical application of computing. This is a lesson we may want to learn from Agre, transposing it to the ways in which we study online activity while using and designing research with digital methods. The way in which Agre unpacks capture systems100 offers a more substantial perspective on what technological grammar is. Capture here means a complex phenomenon in which "every domain of activity has its own historical logic, its own vocabulary of actions, and its own distinctive social relations of representation" (p.116). The third central proposal in Agre's essay is what he calls grammars of action, or the formalisation of languages for representing human activity. These grammars specify a set of unitary actions that have many and varied manifestations. In social media, a simple example is the act of adding the # symbol before words, numbers or emojis, turning them into hashtags that can stand for ideas, opinions or positioning efforts. Hashtags indicate searchable topics recognised by users, recommendation algorithms and automated agents (bots). These actions are embedded in platform databases under predefined and specific properties, e.g. what comes after the # symbol? Where can hashtags be used (e.g. in captions, overlapping images or videos)? How many times? What is associated with the use of hashtags (e.g. the content of tagged posts may contain the username, captions, co-related tags, image URLs, publication date, location)? 100 Agre approaches the following interconnected elements: grammars of action, capture and functionality, capture in society, and the political economy of capture. Moreover, social media databases can capture the relationship between hashtags and their predefined and specific properties, as well as a set of actions related to hashtagging (e.g. liking or commenting on posts, using filters). This helps us to understand what Agre refers to as the capture model, providing also an accurate definition of what we are referring to as platform grammatisation (see Gerlitz & Rieder, 2018): "the situation that results when grammars of action are imposed upon human activities, and when the newly reorganised activities are represented by computers in real time" (p.109). Accordingly, we need a better understanding of the five-phased cycle that encompasses this process, or how grammars of action are embedded into computers. These phases are not independent of each other but rather concur or arise simultaneously. They remind researchers of how attentive they should be when working with technological grammar, taking into consideration every rule, piece of information, form of use and articulation attached to it. 1. analysis (the basic ontologies – objects, variables, relations – that will inform different forms of activity) 2. articulation (how grammars are "strung together to form sensible stretches of activity" p.110) 3. imposition (grammars of action are often a result of a normative force; "participants in the (articulated) activity may or may not participate in the process and may or may not resist to it" p.110) 4. instrumentation (sociotechnical means are provided either by the capture system itself or by those orienting their activities towards the capture machinery, together with the institutional consequences of these relations) 5.
elaboration (after being recorded and stored, activity becomes available to be accessed and merged with other records). When transposing Agre's five-phased cycle to social media research, to understand how grammars of action are articulated, we should be aware of the basic ontologies of APIs, bearing in mind that the imposition of grammars may not signify full appropriation or use. For instance, one can simply not use hashtags nor provide Instagram with a list of close friends when publishing a Story. Moreover, we should not forget that technological grammar can oversimplify the acts it intends to represent, but also provides means for either accurate or inaccurate descriptions of actions. As for available grammars of action, as new media studies have shown, they serve research as powerful means to monitor and better comprehend social issues. In this regard, the next section will introduce three aspects required by even the simplest digital methods project. Three pillars of the digital methods approach Through the years, and in the many data sprints that I have attended or organised, I have often noticed how the crucial effort to understand with and about the computational mediums can be easily brushed aside by more pressing practical issues. Projects tend to disregard the specificity of the medium and the platform's cultures, and digital methods become just a way to do stuff with web data rather than an invitation to think along with the media. As previously argued, even the simplest digital methods project requires some technical knowledge, practice and expertise. This background knowledge constitutes the first level of the technicity-of-the-mediums and can be defined as the practical awareness that allows researchers to understand not only in theory but also in practice what it means to study collective phenomena "through interfaces and data structures" (Rieder et al., 2015, p. 4). This section demonstrates how technical expertise can specifically contribute to new forms of enquiry. It proposes a schematic way to raise awareness of the technical mediation inherent to digital methods101, describing a triangular relationship between software affordances and platforms' cultures of use and grammatisation (see Fig.3.4). This proposal attempts to defuse some of the difficulties related to the use of digital methods, while suggesting a way of carrying out digital fieldwork. 101 Instead of discussing what is evident and present in applied research, for instance how the Internet and new technologies may lead to a revolution in the making of social science (Watts, 2007) or the importance of learning particular technical skills. The schema represents a combination of awareness (in theory, imagination and practice) and software practice, and draws attention to three distinct but related aspects while engaging with the specific modes of the technical mediums. In other words, intellectual and practical operations are not separated but paired up. The visual schema in figure 3.4 corresponds to a need of digital research, i.e. the fact that scholars must care about the specificities of the medium and data, particularly "where and how they happen, who and what they are attached to and the relations they forge, how they get assembled, where they travel, their multiple arrangements and mobilizations, and, of course, their instabilities, durabilities and how they sometimes get disaggregated too" (Ruppert, Law, & Savage, 2013, pp. 31-32).
In other words, an understanding of the domain of the medium's particular potentialities (practical awareness) is required in order to achieve a (research) purpose. Figure 3.4. Getting to know the fieldwork through understanding the triangular relationship between platform's cultures of use, grammatisation and software. In this scenario, it is crucial to understand the role of the medium's elements, features and qualities and how they relate to the subject of study in the interpretative process. Figure 3.4 is also meant to serve as a set of guidelines that assist the researcher in going through the mental-practical schemas demanded by digital methods. In the following pages, I will introduce the three pillars of the knowledge triangle separately. Platform Grammatisation Platforms here, in general terms, are taken as "socio-technical assemblages and complex institutions" (Gillespie, 2018a, p. 255) and, in a more detailed assessment, as "(re-)programmable digital infrastructures that facilitate and shape personalised interactions among end-users and complementors, organised through the systematic collection, algorithmic processing, monetisation, and circulation of data" (Poell et al., 2019, p. 3), but also apprehended as arenas for processes of capitalisation, monetisation and proprietary enclosure (see Mackenzie, 2019; Rieder, Coromina, & Matamoros-Fernández, 2020). Mackenzie (2019), however, argues that platforms should "no longer [be] rendered as a social network of users (individuals and organisations), or even more starkly as an advertising medium (Skeggs & Yuill, 2016)", because "the platform itself becomes an experimental system for observing the world and testing how the world responds to changes in the platform on many different scales" (Mackenzie, 2019, p. 2003). In his perspective, there is a shift from platforms as a place of connectivity to platforms as predictive operations which "modulate connectivity and begin to infrastructuralise it" (op. cit.). In the context of digital methods, platform grammatisation refers to the technological processes inherent to the web environment and APIs in which and through which online communication, acts and actions are structured, captured and merged with other records, yet made available only in limited ways through data retrieval methods such as crawling, scraping or API calling. In other words, it refers to the situations where users deal with predefined technological grammar, produced and delineated by software, to structure their activity (Gerlitz & Rieder, 2018). That alludes to the operationalisation of platforms and the particular and pervasive agency of their technical functioning (see Rieder, Abdulla, Poell, Woltering, & Zack, 2015) intertwined with and in online data. As an example, let us see how TikTok102 structures video content and metadata. To do this, we should consult the mobile app front-end and back-end interfaces (see Fig.3.5) as referential starting points to understand which acts and actions one can perform on a TikTok video (by looking at the end-user interface), and how these acts and actions are re-arranged and made available through back-end infrastructures. 102 TikTok is a Chinese mobile video-sharing application which gained worldwide recognition in 2020, in particular during the quarantine restrictions and lockdowns provoked by the impact of COVID-19. Let us also pretend that we are not familiar with TikTok but know the basics in advance: that the app follows the social media logic of communication and interaction (e.g.
one needs to have an account in order to produce content, to be able to follow others or to be followed). The latter can be verified by reading the official or unofficial API documentation, and by exploring the output files (e.g. TAB, CSV, GEXF) provided by data extraction tools. In this process, a series of questions should be addressed, such as: what is it possible to do with technological grammar? What are the standard grammatised actions and which ones indicate platform-specific cultures? In what forms are they organised and articulated? What can be accessed and subsequently repurposed? How? Are there new or no longer used grammatised actions103? Figures 3.5 and 3.6 help us to respond to these questions, showing the possible actions and reactions to a TikTok video. Each piece of action is recorded: for instance, headers; text; create time; author id and name; music id, name and author; image and video URLs; like, share and comment counts; whether a hashtag was used. When implementing digital methods, one can be tempted to use and explore all the available information on TikTok. Contrary to that, we should first apply this technical knowledge to ask how technological grammar can serve research purposes and answer research questions, also considering the culture of use and appropriation of the grammars of action. When looking at figures 3.5 and 3.6, we notice that TikTok grammatised actions have more levels of specificity to be exploited than the record of engagement metrics, which drives us to rethink the use of quantification (or what can be calculated)104. For instance, and based on figure 3.6, a list of image URLs originating from the TikTok environment may serve as a point of departure for visual content analysis or for investigating the sites of image circulation within and beyond TikTok. Meanwhile, by comparing which music names and authors are associated with a hashtag, one can get a sense of the dominant and ordinary forms of signification emerging on TikTok. By looking at the app's technical interfaces, and with some background on its usage culture, we may come across more elaborate perceptions of the potential of TikTok's grammatised actions for digital research. 103 For example, the development of emoji hashtags on Instagram in 2015, Facebook Reactions in 2016, LinkedIn Reactions in 2019, or when Twitter changed the favourite star button to the heart-shaped button in 2015 (see https://blog.twitter.com/official/en_us/a/2015/hearts-on-twitter.html). 104 For instance, the overall engagement with a publication or the total number of followers or mentions reflect a common path for social sciences researchers to think about political and social issues, giving substance to the idea of reality calculability. Figure 3.5 TikTok grammatisation: looking at the end-user interface and reading unofficial API documentation. Source: https://github.com/drawrowfly/tiktok-scraper#getVideoMeta Figure 3.6 TikTok grammatisation: identifying grammatised actions through the exploration of the scraper output file. Source: https://github.com/drawrowfly/tiktok-scraper#getVideoMeta Figure 3.7 shows the TikTok end-user interface from both navigation and usage perspectives. When watching videos, TikTok offers the classic social media buttons (like, comment, share, mention, use emojis), but also two others that indicate particular forms of use: the audio and WhatsApp buttons.
The former points to the central role of sounds in the app's creative environment, whereas the latter only shows up after watching the same video for the third time in a row, directly suggesting that the user should share the video on WhatsApp. In discover, an information board drives the user to topics concerning the pandemic, e.g. official information about the cases in Portugal, related hashtags such as #safehands and #felizemcasa (happy at home), and TikTok's new effects, followed by trending hashtags. There, users can search for top video content, users, videos, sounds and hashtags. Here we may learn that hashtags are as important as sounds on TikTok. Figure 3.7. Being acquainted with TikTok grammatisation. Other grammars of action emerge when using and experimenting with the app (Figure 3.7): the different types of facial effects are diverse and very interactive (e.g. type here to write, open your mouth, touch the screen, drag your finger, blink your eyes). Sounds can also be searched for when creating a video, as we see in the image on the left (Figure 3.7), where a user can search for sounds based on her favourite ones and on those belonging to the different categories imposed by TikTok, such as those we see below in Figure 3.8. What is interesting is that, soon after publishing a video, and once again, the app suggests that the user share the content on WhatsApp. All these possibilities point to what users can do (not to be confused with what users do). In short, I am suggesting that to get a good picture of grammatisation it is necessary to spend time exploring the platform environment (front/back-end) both as an observer and as a user, but also to implement exploratory data analyses and visualisations to gain a better understanding of how actions are articulated and how these articulations may serve research purposes. Figure 3.8. TikTok's list of audio categories. The notion of platform grammatisation shall guide researchers in using the knowledge about the ways in which grammatised actions are altered and rearranged by computing as methodological language. It helps researchers to make sense of data retrieved/scraped from digital platforms. In this sense, and as previously discussed, taking grammatisation into account demands new ways of conceptualising the subject of study. Here, social media content cannot be separated from its carrier (see Niederer, 2019): platform interfaces and infrastructures. The following chapters adopt this perspective in research oriented by Instagram hashtag engagement and Facebook natively digital images. Moreover, the study and repurposing of grammatised actions require new skills rooted in digital methods technical practices. For this reason, scholars such as Noortje Marres and Richard Rogers argue that we should never forget to take technological grammars into account in their context and environment. As a result, it is time to consider software as message, including its technical schema and functional makeup (see Manovich, 2014; Rieder, 2020), in the making of digital methods. Cultures of use Cultures of use here refer to the modes of life, the common meanings and the forms of signification that emerge and circulate within a given platform. This perspective, adapted from Raymond Williams's conception of the word culture, entails that cultures of use are expressed by technological grammar shaped by the platform's infrastructure and technical mechanisms.
To account for platform cultures of use, we should ask about the common practices of the platform, its native objects and how they are used, and the role of recommendation and ranking systems in everyday usage. These questions are helpful for turning search queries into research questions but may not be sufficient to grasp platform cultures of use. Additionally, we may want to question how differently publics use social media (or other digital platforms) and engage with digital grammars. For what purpose? Who are the influential or dominant actors? Are users resistant to what is imposed, and why? How (or what) makes cultures of use change over time? Cultures of use appears here in the plural because social media platforms have multiple publics and forms of engagement, which change from one platform to another but also inside the platform itself. Within the Instagram environment one can study politics, far-right movements, influencers, social bots, health, or porn-related issues, for example, by looking at hashtags, stories, following networks or visual content. The research approach adopted should be responsive to cultures of use. When looking at specific social issues across platforms, we need to recognise that different platforms breed different cultures of use. For instance, even when the subject of study is the same (e.g. pregnancy, emoji hashtags, social bots, climate change, Zika virus), the appropriation of digital grammars and the production of (visual and textual) content related to it might differ across platforms, while also diverging internally (see Bogers, Niederer, Bardelli, & De Gaetano, 2020; Highfield, 2018; Omena et al., 2019; Pearce et al., 2018; Rabello et al., 2018; Rogers, 2018). These differences are also noticed when looking at high-visibility actors, content and uses versus ordinary actors, content and uses (see chapter 4). A more technical perspective on platforms' usage cultures is proposed by Weltevrede and Borra (2016), following their concept of device cultures: Device cultures can then be defined as the interaction between users and platform; how activity is imagined, curated, and prescribed into the platform architecture; how affordances are activated by the (un)intended uses and practices that take place on and within platforms; and how the data is collected and processed by the platform. In other words, the platform architecture suggests certain practices and uses, and contains the traces of platform activity in the form of data, whether unprocessed or in aggregate—or any algorithmically processed form, such as the variants on popular, trending, or relevant content. (Weltevrede & Borra, 2016, p. 2) While this definition invites us to think about and interpret the reasoning of connective action105 on their own terms, as suggested by Bennett and Segerberg (2012), it simultaneously draws our attention to the modes of programmability associated with social media and other web platforms (Helmond, 2015a; Mackenzie, 2019; Murugesan, 2007; Poell et al., 2019). We learnt that platform cultures of use require comprehension from within their natively digital environment; that is, apprehending platform cultures of use through the lens of social media programmability (Helmond, 2015a). Once again, we return to the language and reality of the web environment, particularly what its infrastructure enables and by what means. Moreover, as discussed in the previous section, we need to take technological grammars seriously, respecting the environment they come from.
After all, they are considered a "language of sociality" (Marres, 2017) and of "connectivity" (Van Dijck, 2013), serving, thereby, as both entry points and corpus for research. The forms of articulation and re-arrangement of technological grammars matter. So, while we pay attention to what is going on in platforms, we should look at how these forms of signification are articulated by platform grammatisation. Let us use TikTok as an example. To get a sense of cultures of use on TikTok, we need to use the app while navigating through all its possibilities. This will give the researcher a practical sense of the technological environment and grammar offered in and through TikTok's rules, default features and dynamics for the creation of short videos using lip-sync, dance and face effects. I refer especially to an understanding of how the grammars of TikTok are imposed and articulated, rather than hastening the observation (or first impression) towards the content that these grammars carry or towards what the most popular users often do with them. 105 Bennett and Segerberg introduce connective action as an expression of personalised action formations which, however, carry the same issues or claims (e.g. environment, rights, women's equality) found in older movements or party concerns (collective action). The authors explain that "people may still join actions in large numbers, but the identity reference is more derived through inclusive and diverse large-scale personal expression rather than through common group or ideological identification" (p.744). The authors highlight two important elements in large-scale connective action formations: i) political content in the form of easily personalised ideas and ii) various personal communication technologies that enable the sharing of these themes. I have navigated the application with a certain constancy during the lockdown in Portugal and, as a lurker, my first impressions were positive due to the funny and very creative memetic video content suggested in my feed. To bring happiness to everyone is the aim of TikTok, according to Zhang Nan, the CEO of Douyin, TikTok's Chinese version (Zhang, 2020). In a recent article on the infrastructuralisation of TikTok, Zongyi Zhang (2020) notes that the platform distinguishes itself from the logic of the traditional creative video industry, explaining also how hashtags, hashtag challenges and music structure and organise actions, and how TikTok's algorithmic mechanism reacts to a popular video: "if the creator produces an extremely popular video, past content of him will also be reposted and recapture the attention of the algorithm. This compensation mechanism means not only an opportunity but also a compulsion for creators" (Zhang, 2020, p. 11). Figure 3.9 shows, on the right, the dancing challenge based on Cardi B's song WAP, which was ranked in my feed on November 4, 2020, and, on the left, the Netflix #tumdumchallenge suggested by the discover information board on the same day. After clicking on #wapchallenge and scrolling down, one listens to the same part of the song106 while watching different videos: for instance, users imitating or adapting the singer's dancing moves to other dance styles such as ballet, tutorials teaching how to dance to WAP, users singing the song in a 1940s or Celine Dion style, among many other creative possibilities. In the #tumdumchallenge, we listen to the sound we hear when entering Netflix, followed by a remix version of it.
One cannot tell what to expect from video content here except users' imaginative creativity (always consistent with the platform's cultures of use)107. On TikTok, one must be performative enough to communicate and create videos using lip-sync, different filters and Augmented Reality effects, or dancing (non-)professionally in either indoor or outdoor environments. 106 In particular the part of the song that says: "Now from the top, make it drop, that's some wet ass pussy. Now get a bucket and a mop, that's some wet ass pussy. I'm talkin' WAP, WAP, WAP, that's some wet ass pussy. Macaroni in a pot, that's some wet ass pussy, huh". 107 https://www.tiktok.com/amp/tag/tudumchallenge?lang=en Figure 3.9. TikTok's cultures of use: hashtag challenges. The #tumdumchallenge and #wapchallenge are good examples of the common and expected video-making practices on TikTok, reflecting what the platform proposes to its users. However, usage culture does not have to respond to what is imposed or expected by TikTok, because users may propose other forms of engagement or resist what is imposed. For instance, users may appropriate TikTok's technological grammar to support a cause not at all related to the making of entertaining videos, such as the Brazilian Mariana Ferrer rape case, in which the accused was found not guilty108. This decision, communicated in September 2020, is an unprecedented judgment in the country (see Alves, 2020). According to the Brazilian Public Ministry, André de Camargo Aranha's conduct involved a will but not full consciousness109 of rape. Up until November 4, 2020, almost 40 million views were counted for videos hashtagged with #justiçapormariferrer (justice for Mari Ferrer), #justicapormariferrer or #justicapormaribferrer (see Figure 3.10). Beyond disagreement with the judge's decision, great indignation and criticism were provoked by an excerpt from the hearing in which the defence lawyer humiliates Mariana Ferrer by showing sensual photos, implying that she adopts a false victim's posture and saying that her crying on her Instagram accounts is fake110. 108 In December 2018, the 23-year-old Mariana Ferrer reported to the Brazilian police that she had been raped in a nightclub in Santa Catarina, but she was unable to remember who had raped her; thus, she believed she had been drugged. At the time, Mariana worked as an event promoter and was a virgin. In May 2019, Mariana shared her story on Instagram as an attempt to arouse public interest and see her process move faster in the Public Ministry. After investigation, and based on genetic material (saliva in a glass) and an internal security video record, the police identified the rapist: the businessman André de Camargo Aranha. According to the article by Schirlei Alves in The Intercept Brasil (Alves, 2020) and The Intercept's YouTube live (see https://www.youtube.com/watch?v=hsm4poTWjMs), in his first statement the accused said that he did not touch Mariana and, in his second statement, he claimed that he had oral sex with her. However, and beyond other evidence, the corpus delicti exam proved that Mariana was indeed raped, also proving the rupture of her hymen. Mariana's defence has already appealed against the judge's decision. Estadão made the full hearing available here: https://www.youtube.com/watch?v=P0s9cEAPysY Brazilians responded to the case not only on TikTok but across social media platforms, engaging with the hashtag #justiçapormariferrer (justice for Mari Ferrer).
See: https://www.tumblr.com/search/justi%C3%A7apormariferrer, https://www.instagram.com/explore/tags/justi%C3%A7apormariferrer/, https://twitter.com/search?q=%23justicapormariferrer&src=typeahead_click, https://www.facebook.com/hashtag/justi%C3%A7apormariferrer 109 The Intercept Brasil used the expression 'rape by mistake' (estupro culposo) to summarise the case and explain it to the lay audience. 110 See https://www.youtube.com/watch?v=X--JAQShBBw. The expression rape by mistake (estupro culposo), used by The Intercept Brasil to summarise the case and explain it to the lay audience (see Alves, 2020), was also reappropriated by TikTok users, who showed their disagreement and revolt about the case through the hashtags #estuproculposonaoexiste (rape by mistake does not exist) and #estuproculposonao (rape by mistake no). Figure 3.10. Otherwise engaging with TikTok: #justiçapormariferrer (justice for Mari Ferrer). Screenshot taken November 4, 2020. What also informs cultures of use is when users do not accept how the technical design of platforms grammatises acts. They thus resist what is imposed, finding a way out of the rules. A good example is the role of 4chan's anonymised users (anons), known as the bakers, in resisting the platform's limitations. The digital methods-based study of 4chan's /pol/, 'Politically Incorrect' board111, has shown how the bakers coordinate the creation and maintenance of threads as a means to continue conversation on this board (Bach, Tsapatsaris, Szpirt, & Custodis, 2018). In this study, Daniel Bach and colleagues detected a coordinated rhythm in 4chan /pol/ comments containing "President Trump General" and "Trump General": a steady rate of around 1000 threads from January 2016 to February 2018. This finding reveals that political conversations on 4chan can be "centrally orchestrated by an elite who seem to dictate how the conversation is framed", rather than being anarchic, chaotic, or random as is expected in this environment (see Bach et al., 2018; Knuttila, 2011; Tuters, Jokubauskaitė, & Bach, 2018). To better understand this specific culture of use, allow me to briefly introduce 4chan (https://www.4chan.org). This imageboard platform maintains a culture of anonymity (Knuttila, 2011) in which anyone may post anything anonymously in its various themed boards; however, "boards only allow a finite number of comments before threads must be purged" (Tuters, Jokubauskaitė, & Bach, 2018). The grammatisation of 4chan is quite simple: Most 4chan boards consist of 10 pages, each containing 20 threads for a total of 200 active threads at all times. When someone replies to a thread, it is 'bumped', meaning it becomes the top post on the board — but only until another post is bumped afterwards. If it reaches 300 comments, a thread can no longer be bumped. If this happens, or when a thread stops garnering any reactions, it starts to slowly descend towards the bottom of the board. If a thread falls outside the 200 active threads, it is deleted or locked so no one can comment anymore. (Bach et al., 2018) In response to such a transitory and fleeting mode of being, the bakers resist 4chan's grammatisation and fight its ephemerality by adhering to specific practices involving posting times and the use of standard templates when commenting. 111 https://boards.4chan.org/pol/
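Because the thread mechanics quoted above are precisely what the bakers' routines work against, a toy simulation can make this grammatisation easier to grasp. The sketch below follows the figures given in the quotation (200 active threads, a 300-comment bump limit); it is an illustrative model only, not 4chan's actual implementation.

```python
from collections import deque

BOARD_CAPACITY = 200   # 10 pages x 20 threads, as in the quoted description
BUMP_LIMIT = 300       # after 300 comments a thread can no longer be bumped

class Board:
    """Toy model of 4chan-style thread ephemerality (illustrative only)."""
    def __init__(self):
        self.threads = deque()   # index 0 = top of the board
        self.comments = {}       # thread_id -> comment count

    def post_thread(self, thread_id):
        self.threads.appendleft(thread_id)
        self.comments[thread_id] = 0
        self._prune()

    def reply(self, thread_id):
        if thread_id not in self.comments:
            return                                   # thread already purged
        self.comments[thread_id] += 1
        if self.comments[thread_id] < BUMP_LIMIT:
            self.threads.remove(thread_id)
            self.threads.appendleft(thread_id)       # bumped to the top

    def _prune(self):
        while len(self.threads) > BOARD_CAPACITY:
            dead = self.threads.pop()                # falls off the board
            del self.comments[dead]

# A "general" thread survives only as long as someone keeps bumping it,
# which is precisely the routine the bakers coordinate.
board = Board()
board.post_thread("ptg-001")
for i in range(150):
    board.post_thread(f"other-thread-{i}")   # unrelated activity pushes it down
board.reply("ptg-001")                       # one bump returns it to the top
print(board.threads[0])                      # -> "ptg-001"
```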
On one side, we have come to the point of reflecting on cultures of use as shaped by how platforms develop their infrastructure "not just in anticipation of inappropriate content activity, but in response to it" (Gillespie, 2018, p. 264; see also van Dijck, 2013). Scholars have become more attentive to the particularities and stylistic conventions that emerge from social media by finding ways of recognising "how registers of meaning and affect are produced" by, in and through these infrastructures - what Martin Gibbs and colleagues called "platform vernaculars" (Gibbs, Meese, Arnold, Nansen, & Carter, 2015, p. 258; see also Flores, 2019; Geboers & Van De Wiele, 2020; Pearce et al., 2018; Pilipets, 2019). This approach allows researchers to "examine the specificities of social media platforms", while paying special attention to the particular forms of participation that occur on them (Gibbs et al., 2015, p.258). On the other side, we learn that understanding cultures of use is also a matter of asking what it is like to experience or to 'be' on a platform (see Knuttila, 2011), combined with the critical analytics approach to data (see Bach, Tsapatsaris, Szpirt, & Custodis, 2018; Pilipets et al., 2019; Tuters et al., 2018) suggested by the practices of digital methods. Here social media are sites for causes and windows on an issue (Niederer, 2019), rather than only a place to praise the quality of being well-known, thus avoiding giving attention to vanity metrics (see Rogers, 2018). The affordances of software Software here stands for all the computational mediums that take part in the full range of digital methods, because software "re-adjusts and re-shapes everything [it is] applied to – or at least, [it has] a potential to do this" (Manovich, 2014, p. 80). That means we treat social media, search engines and web platforms as software, along with other mediums such as Gephi and vision APIs. Here, the concept of software has less to do with "a foundational understanding of computing" (Rieder, 2020)112 and more to do with a consideration of what software has to say through its materialities, potentialities, functioning, outputs and relational aspects. That is an invitation to become familiar with "a properly technical substance that sits at the centre of technical practice" (Rieder, 2020a, p. 54), and to take an active position towards the use of software, which is required in the implementation of digital methods. Therefore, a conceptual or a technical understanding of software is not enough for these methods; empirical knowledge is essential (as I argued in chapter 2). This section thus addresses software as the last pillar of knowledge illustrated in figure 3.4, paying attention to the content of software, which concerns an awareness of software operation from the standpoint of non-developer researchers. 112 "Which seeks to settle its ontological status in order to develop a clear, axiomatic basis that supports the deductive style of reasoning analytical philosophy favors" (Rieder, 2020, p.52). To contextualise the relational aspects of software with the other aspects to be considered when practising digital methods (cultures of use and platform grammatisation), we will pay attention to the notion of software affordances. In the context of human-machine interaction, affordances mirror the perceived and hidden properties of software (see Gaver, 1991; Norman, 1988). [...]
the term affordance refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used. [...]. Affordances provide strong clues to the operations of things. Plates are for pushing. Knobs are for turning. Slots are for inserting things into. Balls are for throwing or bouncing. When affordances are taken advantage of, the user knows what to do just by looking: no picture, label, or instruction needed. (Norman 1988, p.9) In the context of software and platform studies, Taina Bucher and Anne Helmond (2017) introduce a notion of software affordances that is more useful and appropriate for the context of digital methods. In approaching the question of affordances in social media, the authors propose that we should not only pay attention to what technology does to users but also to what software can do for users and "what platforms afford to other kinds of users beside end-user" (Bucher & Helmond, 2017, p.16). They affirm that […] by approaching the question of affordance from a relational and multi-layered perspective, the question is not just whose action possibilities we are talking about, but also how those action possibilities come into existence by drawing together (sometimes incompatible) entities into new forms of meaningfulness. (Bucher & Helmond, 2017, p.18) A similar vision of software affordances is presented by Ruppert et al. (2013), who warn researchers to be attentive to the specificities of the materiality of digital devices and to explore "the chains of relations and practices enrolled in the social science apparatus" (methods) (p.41). Although the authors do not use the term "affordance", they discuss what digital devices (software) can do for researchers, or afford them to do, and what researchers might do to these devices. In this sense, software affordances can also be taken as the materiality and the productive and mediating capacities of software which, according to Ruppert et al. (2013), are not explored in social theory. In practical matters, and considering software affordances in a relational perspective with platform grammatisation and cultures of use, we should be addressing questions such as: how can we study platforms with and through software? What are the elements/particularities of software that we should be aware of or master? What are the grammatised actions required to query platforms using extraction software or web research apps? In addition, there are basic requirements but also guidelines for a good understanding of the triad grammatisation-cultures of use-software: § The need to be aware of (extraction, mining, analysis and visualisation) software, while becoming familiar with its technical substance. § The need to understand how software takes part in the research design, analytical tasks and presentation of findings. § The need to recognise that software cannot act or perform alone but is, instead, conditioned by our choices, decisions and knowledge. To illustrate the issues raised above, two examples will be developed. The first relates to data collection and visualisation, whereas the second illustrates the implementation of digital methods for studying online images through vision API-based networks. Research is always influenced by the technological grammars or the technological environment under investigation. We can only capture what is stored within platform databases and the web environment. After choosing the extraction tool (e.g. YouTube Data Tools) and the platform (e.g.
YouTube), we need to understand how software responds to its grammatisation and its regime of data access. For instance, asking what entry points are available to query platforms (e.g. key terms, location, hashtags, object id); how far back in time data can be retrieved; what the standard output files are (e.g. TAB, CSV, HTML, GDF, JSON etc.); what sort of information comes with these files; and, consequently, what questions can be posed or answered. Another aspect requiring our attention is to understand that every decision matters when using data extraction software; the choice of words, combined with the parameters chosen to collect data, gives us information about what we can get and how we can see. In Figure 3.11 we see what happens with network visualisation when simply switching one parameter of YouTube Data Tools (YTDT) (Rieder, 2015): at the top, videos ranked by relevance and, at the bottom, by date. Technicity was the keyword used in the video network module, which provided two GDF files; in one, the network is organised by what YouTube considers more relevant to the search query (technicity) while, in the other, YTDT delivers a reverse chronological order based on the videos' creation date. As illustrated in Figure 3.11, after retrieving data from social media, the output files must go elsewhere and be submitted to many other technical mediations. These are situations in which both software and the researcher's decisions intervene, re-adjusting and re-shaping representations of online activity. For instance, when opening a file with Gephi, we must add other layers of technical meaning provided by force-directed algorithms (e.g. ForceAtlas2) and metrics like degree (from graph theory and network analysis). Here, we not only meet other fields of study (e.g. network studies) but also other technical mediums (besides YouTube and YTDT) with their own methods, rules and language for making networks readable and interpretable. Accordingly, there is a call for understanding the basic principles of network analysis, not just Gephi itself. Figure 3.11. Effects of the choices made before extracting data. Figure 3.12 helps us to understand such entanglements, highlighting what we see after feeding Gephi with the output provided by YTDT (e.g. through the position of nodes within the network, we may have a sense of how YouTube suggests relevant videos). Furthermore, we see what YouTube users have created (e.g. title, published date, category) and how other users have reacted to video content containing the word technicity (e.g. number of views, comments, likes, dislikes). The position of nodes (videos) within the network makes YouTube grammatisation (e.g. the video recommendation system) and cultures of use visible, and is influenced by the affordances of the ForceAtlas2 algorithm. Node size and colour can display either YouTube grammatised actions (e.g. commentCount, dislikes, video category) or Gephi statistics (e.g. modularity, average degree). Figure 3.12. The entanglements of data with grammatisation, software functioning and mediums-specificity. The process of mapping a network of related videos with and about YouTube is a process of accumulation and transformation in which my dataset/corpus remains situated and contextualised, but also gains new arrangements and technical substance to be considered when interpreting the network. Crucial in this process are the three pillars of the digital methods approach presented above.
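To make the first example more concrete for readers outside YTDT, the sketch below addresses the YouTube Data API v3 search endpoint directly (YTDT builds on the same API), showing how a single argument, order, switches the corpus from relevance-ranked to reverse-chronological results for the query technicity. The API key is a placeholder, and the comparison at the end is merely indicative of how different the two corpora can be.

```python
from googleapiclient.discovery import build  # pip install google-api-python-client

API_KEY = "YOUR_API_KEY"  # placeholder: a valid YouTube Data API v3 key is assumed
youtube = build("youtube", "v3", developerKey=API_KEY)

def search_videos(query, order):
    """Return (videoId, title, publishedAt) tuples for one page of results."""
    response = youtube.search().list(
        q=query,
        part="id,snippet",
        type="video",
        order=order,        # 'relevance' (YouTube's ranking) vs 'date' (newest first)
        maxResults=50,
    ).execute()
    return [
        (item["id"]["videoId"],
         item["snippet"]["title"],
         item["snippet"]["publishedAt"])
        for item in response["items"]
    ]

# Two corpora for the same query, shaped by a single parameter.
by_relevance = search_videos("technicity", order="relevance")
by_date = search_videos("technicity", order="date")
overlap = {v for v, *_ in by_relevance} & {v for v, *_ in by_date}
print(f"{len(overlap)} videos appear in both result sets")
```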
The second example concerns the implementation of digital methods for studying online images through vision API-based networks. Figure 3.13 illustrates the role of software affordances in this process, using a research protocol diagram to expose how the researcher's intervention, software-specific potentialities and technical practices add new form and meaning to a collection of natively digital images, while transforming them throughout the operationalisation of the methods. This protocol is directly related to the description of the process of building/interpreting computer vision-based networks in chapter 2 (or my attempt to illustrate more clearly the mental and practical modes of what I am calling the technicity-of-the-mediums in digital methods). In short, the process starts with a list of social media image URLs, which are interpreted by web vision APIs, re-arranged as a network by the researcher and other software and, finally, analysed according to their visual affordances. The result corresponds neither to the grammatisation of the platform from which the images were extracted nor to the grammatisation of the vision API in itself. Online images are merged, re-adjusted and re-shaped by other computational mediums inherent to the practices of digital methods. The final computer vision-based network thus carries another order of grammatisation, one co-created with technical mediums. The chain of methods is built on top of existing but different technological grammars, going beyond the limitations of each of those individual grammars and methods. It is a methodological process that demands a good understanding of the technicity-of-the-mediums, but which also leads the researcher to work with a second order of grammatisation (see also the last section of chapter 2). The result is the creation of new methodological grammars on the basis of existing technological grammars and software affordances. Figure 3.13. The methodological process for creating computer vision-based networks to study online images. Concept by Janna Joceli Omena and design by Beatrice Gobbo. Introduction to chapter four This chapter is originally published as: Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for Hashtag Engagement Research. Social Media + Society. https://doi.org/10.1177/2056305120940697 The chapter exemplifies the first steps in digital methods research and introduces the attitude of making room for the technicity of the mediums. It also marks a change in thinking about my research object, from political polarization to the role of platform/software and of methods. The chapter began to take shape in 2016, following growing political polarization in Brazil, and took advantage of previous knowledge about Instagram grammatisation and the possible analytical approaches afforded by hashtag-based data, for example, networks of hashtag co-occurrence or account-based analysis, such as verifying who were the most active users of specific hashtags. At that time, Instagram's API Platform was relatively generous in offering access to posts and metadata associated with a list of hashtags, not only in quantity but also in temporal reach: it allowed researchers to go back days, months, and even years in time. Data collection occurred in several iterations from March 13 to March 31, 2016 and was supported by Visual Tagnet Explorer (Rieder, 2015). The datasets were organised in a datasheet113. 113 https://drive.google.com/file/d/0B7j2W-Xfs9qBTVRhNzYydHdQd3c/view?usp=sharing
Data collection was carried out by closely following the political polarisation movement through the lens of Instagram. Data analysis was initiated in the context of a data sprint in July 2017, leading to the publication of a paper in co-authorship with Elaine Teixeira Rabello and André Mintz. My main contribution to this article refers to the methodological design and conceptualisation, and to the analytical proposal of considering dominant and ordinary actors, uses and contents associated with hashtag engagement. The technicity approach here means, first, paying close attention to Instagram's API to make sense of how researchers can access, treat and repurpose the platform's grammatised actions (technically and practically). Second, the chapter poses research questions aligned with the technicity of the mediums and the object of study (Brazilian protests mobilised towards the "impeachment-cum-coup" of Brazilian president Dilma Rousseff and framed by Instagram hashtag engagement). Third, and by seeking ways to apply "critical analytics" to social media research (Rogers, 2018), the chapter introduces the three-layered (3L) perspective as a way of making sense of hashtag engagement. The 3L perspective assembles hashtag engagement, the related content and the actors involved by distinguishing dominant and ordinary groups embedded in social media practices and mechanisms. As an alternative to the customary choice of focusing on the most popular content, I suggested that we consider not only what is highly visible but also what is kept out of the spotlight. Finally, this chapter reflects my efforts in mobilising the three key aspects of the practice of digital methods (software affordances, platform grammatisation and cultures of use) through a quali-quantitative approach to methods. I started with the meticulous choice of a list of hashtags, followed by the visualisation of co-occurrence networks to identify other potential hashtags. After collecting data and considering our analytical proposal (hashtags in favour of and against the impeachment of Dilma), we excluded offensive hashtags (coxinha march vs. mortadela day) due to the low number of posts, and one ambiguous hashtag (coup), which could be both for and against the impeachment. Data was merged into two databases and we used basic Excel formulas (e.g. VLOOKUP, SUMIF, percentile, remove duplicates) to distinguish high-visible from ordinary actors according to the total of likes and comments their posts received on the days of the protests. We then used spreadsheets114, basic exploratory visualisations115, the web (verifying the post URLs) and ImageSorter116 (to make sense of all visual content) to qualitatively analyse the top 40 high-visible actors of both the pro-impeachment and anti-coup groups. The image-label networks and the co-term networks serve as other examples that provided a macro perspective of the case study (highlighting visual content associated with each protest), while the analysis of the #nãovaitergolpe co-occurrence network pointed to a very specific and particular situation regarding the appropriation of a hashtag, from which we were able to detect a shift in the original meaning of #nãovaitergolpe.
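The high-visible/ordinary split described above can also be reproduced outside Excel. The sketch below is a minimal pandas version of that workflow; the file and column names are assumptions, and the 90th-percentile cut-off is illustrative rather than the exact threshold used in the study.

```python
import pandas as pd

# Posts collected per hashtag; file and column names are hypothetical.
posts = pd.read_csv("instagram_posts.csv")  # columns: username, likes, comments, ...

# Aggregate engagement per actor (the SUMIF step in the spreadsheet workflow).
actors = (
    posts.groupby("username", as_index=False)[["likes", "comments"]]
    .sum()
    .assign(engagement=lambda df: df["likes"] + df["comments"])
)

# Percentile-based cut-off (the PERCENTILE step); 0.90 is an illustrative choice.
threshold = actors["engagement"].quantile(0.90)
actors["visibility"] = actors["engagement"].apply(
    lambda e: "high-visible" if e >= threshold else "ordinary"
)

# Top 40 high-visible actors, ready for qualitative inspection.
top40 = actors.sort_values("engagement", ascending=False).head(40)
print(actors["visibility"].value_counts())
print(top40[["username", "engagement"]])
```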
114 High-visible actors: https://drive.google.com/file/d/0B7j2WXfs9qBNnF3WDN2MmdsekE/view?usp=sharing&resourcekey=0-36LeRkc7nnlS8FYKDa2HWQ High-visible actors (anti-coup protests): https://drive.google.com/file/d/0B7j2WXfs9qBbVYwZXg3YkNqVHc/view?usp=sharing&resourcekey=0-c3ulgDn7ktI556ijkqWHng High-visible actors (pro-impeachment protests): https://drive.google.com/file/d/0B7j2WXfs9qBNS1TXzQxUHpFWlE/view?usp=sharing&resourcekey=0-6nDhzC56SIYkv53UTiwfsA 115 For instance: https://drive.google.com/file/d/0B7j2WXfs9qBMGdmblN6ZklMYXM/view?usp=sharing&resourcekey=0-EgJYDvReaxNXEk9w_OQu8g 116 For instance: https://drive.google.com/file/d/0B7j2WXfs9qBNUxHNjhEVnRURWc/view?usp=sharing&resourcekey=0-382uZW7S0WjTxqXNl2-2fw As regards methods, on the one hand, this chapter can be characterised as a naive look at the potentialities of computer vision, particularly since the analysis took place just a few months after the launch of the Google Vision API in May 2017. At the time, using computer vision networks for research purposes was a methodological novelty with room to be developed and criticised. This naive appropriation of digital technologies and methods allows me to reflect on the situation of researchers (or students) who are required to handle a series of other analytical decisions without mastering all the technical details of the methods. The use of computer vision here was purely exploratory, and the operations required to build the networks of the pro- and anti-impeachment protests in Brazil have perhaps stolen some time from the critical questioning of the technical constraints of the Google Vision API. The outputs were not contested but interpreted. This is another challenge we need to recognise when using digital methods. Nevertheless, the results showed that the approach can be promising when dealing with large collections of images. On the other hand, this chapter introduces an experimental and situational approach (the 3L perspective on hashtag engagement) that has been developed and replicated in other research contexts117, confirming not only its capacity to provide rich insight and an in-depth vision of the case study but also its methodological contribution to digital research. There are some practical aspects that can be learnt from this case study, as highlighted below: § The requirement of a careful data curation process, considering what hashtags are being used and the role of the extraction software and of the exploratory visualisations in the making of a good data sample. § When considering the technicity of the mediums, the formulation of research questions should go hand in hand with the subject of study itself and in relation to the ensemble of computational mediums put together by the researcher, as well as the technical practices required by the methods. 117 For instance, and in the context of the 2021 SMART Data Sprint, see the following projects: https://smart.inovamedialab.org/2021-platformisation/project-reports/gramming-covid19-reframingthe-pandemic/, https://smart.inovamedialab.org/2021-platformisation/project-reports/homesofinstafor-a-lockdownlife/, https://smart.inovamedialab.org/2021-platformisation/projectreports/vivasylibresnosqueremos/, § There are no ready-made data for social research. Checking, cleaning, re-working, including or deleting data for the analysis are essential tasks, from the use of basic Excel formulas and data cleaning techniques to mastering the use of research software.
These practices are not explicitly available in the digital methods literature, nor have they yet been standardised. § In the analysis of data, and especially with a more qualitative approach, the web should always be a source of consultation and analysis. § In research oriented by hashtag engagement, researchers are invited to consider both high-visibility and ordinary users. At all levels of analysis, unique actors must be identified (e.g. users, link domains, image URLs and video ids) and subsequently distinguished as highly or less visible. These analytical decisions (or strategies) help researchers to better situate and contextualise the data sample, while also exposing the hierarchical structure of the Web combined with how Instagram makes data available. § When analysing digital networks, great descriptive efforts are required before reaching possible findings or insights. § The use of computer vision facilitates the interpretation of large image datasets, which are labelled with confidence scores and ranked by topicality. For instance, the labels assigned to pastéis de nata would be ranked by topicality (dish, food, cuisine) and respective scores, as demonstrated below: o Dish (0.9934035), Food (0.9903261), Cuisine (0.9864208), Egg tart (0.9717019), Baked goods (0.9347744), Ingredient (0.9207317), Dessert (0.88936204), Custard tart (0.8410596), Pastel (0.8360902), Pastry (0.8034706) § When building a network of images and labels provided by computer vision, topic categorisation creates image clusters, facilitating the interpretation of the visual content. This chapter feeds into the main arguments of this dissertation by presenting a case study that took into account the need to become acquainted with computational mediums from a conceptual-technical-practical perspective, while introducing a methodological framework. CHAPTER 4 HASHTAG ENGAGEMENT RESEARCH118 118 This chapter was originally published as: Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for Hashtag Engagement Research. Social Media + Society. https://doi.org/10.1177/2056305120940697 Introduction In 2007, when Chris Messina made a tweet suggesting the use of # to organize content, he could not have predicted how the movement of adding the hash symbol before a word, a sequence of characters, or an emoji would become an everyday social practice inside and outside of web platforms. The adoption of the # symbol goes beyond the labelling of trackable content or elements; instead, it is now undertaken as a "multiple, open-ended, and contingent phenomen[on]" in society (Rambukkana, 2015, p. 5) that serves digital research as a storytelling device. At the same time, the use of hashtags points to controversial and tricky activities (projected to create, induce, or keep alive a given debate/conversation). Either way, these activities have demanded medium-specific methods and research (Gerlitz & Rieder, 2018; Rogers, 2013). In alignment with new media scholars (Highfield & Leaver, 2016; Langlois & Elmer, 2013; Rieder & Röhle, 2017; van Dijck, 2013), we argue that social media research faces multiple challenges related to its complexity, both in terms of the amount of information that circulates online and, especially, of the need to investigate how to carry out research with the indispensable technical knowledge.
This involves raising questions, for instance, regarding how to approach hashtags through platform mechanisms and how to handle the affordances and limitations imposed by their infrastructure (see Marres, 2017; Rieder et al., 2015). Against this background, this article proposes a framework to tackle the problem of the methods applied to understanding collectively formed actions mediated by social media platforms, that is, what we refer to as “hashtag engagement.” To that end, we acknowledge “methods” as not only complementary to digital research but in an interdependent position (Latour, 2010; Rogers, 2013) and, consequently, the study of “hashtag engagement” as something that requires technical knowledge and (a minimum) practical expertise on applied research with digital methods. In this regard, we incorporate the notions of technicity (Simondon, 2009, 2017) and platform grammatisation (Agre, 1994; Gerlitz & Rieder, 2018; Stiegler, 2006, 2012) to better understand the complexity and challenges of hashtagging for digital research. Furthermore, we present the three-layered (3L) perspective which aims to “repurpose” the way we reason about hashtag engagement, moving from folksonomy aspects to their multiple and complex role in and through social media. Under the 143 lens of digital methods (Rogers, 2013, 2019) and distinguishing high-visibility versus ordinary actors and related content, the 3L approach aims toward providing a novel way for reasoning and doing research about hashtag engagement. To conceptually and practically introduce our proposal, we draw on the case of the “impeachmentcum-coup” of Brazilian president Dilma Rousseff. The demonstrations of March 2016 are particularly meaningful as they marked a heightened peak of political polarisation in Brazil. We then took advantage of Instagram both as a source of historical data generated by millions of citizens and as a site of research. We first revisit the role of hashtags and situate “hashtag engagement” to underpin the 3L perspective. Revisiting the role of hashtags The use of hashtags is undoubtedly a part of our digital life. There is a hashtag for almost every social interest, for example, political causes or protests (#elenão vs. #elesim), branding or advertising campaigns (#PepsiGenerations), genre representation (#femboy), the awareness of illness (#microcefalia), erotic content (#22), tourism (#RiodeJaneiro), gastronomy (#foodporn), memories (#tbt), and so on. As natively digital objects (Liu, 2009; Rogers, 2013), hashtags may serve as indexes for their functions, meanings, and practices. That is to say, one can search for, navigate, or engage with hashtags, while others can monitor, trace, and retrieve small or large datasets linked to them. Engaging with hashtags may express local or global conversations, compact or large events, and controversial or non-controversial issues (Bruns & Burgess, 2011; Burgess et al., 2015; Highfield, 2018; Pearce et al., 2020; Tiindenberg & Baym, 2017). It is essential also to recall that hashtagging is not exclusively human activity, but often the fuel behind effective bot activity (Bessi & Ferrara, 2016; Omena et al., 2019; Wilson, 2017) also used on social media for political and marketing purposes. And that means, beyond the capacity to represent communities, publics, discourses, or sociopolitical formations, hashtags can be perceived as sociotechnical networks, both as “the medium and the message” (Rambukkana, 2015). 
The act of engaging with hashtags is not a new theme within Social Media Studies, particularly for Twitter. This platform is the most common focus 144 of hashtag-led studies, with a vast theoretical and empirical literature that addresses the relationship between hashtags and social formations (see Bode et al., 2014; Bruns & Burgess, 2011; Burgess et al., 2015; Small, 2011). Moreover, the use of political hashtags is a prevailing criterion in corpus selection (Jungherr, 2014, 2015). On Instagram, however, scholars have approached hashtags in selfie studies (Tifentale, 2015), commemoration and celebration (Gibbs et al., 2015), geolocalisation and sociospatial divisions (Boy & Uitermark, 2016), and as innovative visual methods to research emoji hashtags (Highfield, 2018) or climate change images (Pearce et al., 2020). Also, hashtags serve as a path to either training data for the development of automatic image annotation (Giannoulakis & Tsapatsoulis, 2016) or for addressing human behavior (see Cortese et al., 2018; Tiidenberg & Baym, 2017). On Instagram, the use of hashtags began in 2011,119 promoted by the platform community team through an initiative named “Weekend Hashtag Project”: a weekly campaign that stimulates a culture of hashtag use in association with artistic and creative photographic styles, giving users a chance to have their publications featured by Instagram. Beginning at the end of 2011, weekly suggestions were prompted every Friday, such as #throughthefence and #middleoftheroad in November, and #vanishingpoint in December120. Over time, the prefix “WHP”121 became compulsory for those who wanted to join the project and the weekly announcements moved beyond the Instagram Blog on Tumblr to other platforms such as Twitter and Facebook. After Instagram, a new tagging practice has also emerged throughout the #insta tags family—for example, #instagood, #instamood, #instadaily, #instalike, #instalove. These tags, moving across platforms, not only gave rise to ready-made hashtag 119 Although it is said that hashtag use began in the same year that Instagram was launched, in fact, we were able to detect two accounts that used hashtags in 2010—namely, cindy44 (see https://www.instagram.com/p/B7Ho/, https://www.instagram.com/p/CNLr/, https://www.instagra m.com/p/DPv4/) and natsuke (see https://www.instagram.com/p/vlyc/). The first profile, which belongs to a female creative director, adopted the tags #cindy44, #donkey, #throughthefence, #jj and #birds in the month that Instagram was created, October 2010, and the second profile also used the tag #throughthefence, but in December. 120 See http://blog.instagram.com/post/13,120,184,445/throughthefence; https://twitter.com/instagram/s tatus/141,220,040,329,531,392; https://twitter.com/instagram/status/148,826,765,953,990,656. 121 See a few examples: #streetartistry in 2012, #whpemptyspaces in 2013, #whpmirrormirror in 2014, #whpboomerang in 2015, #whpidentity in 2016, #whpinthekitchen in 2017 and #whp in 2018. 145 thematic lists to boost (automated) engagement122, but have also pushed the boundaries of hashtagging, and challenged hashtag based studies. Beyond serving as a description of visual content (Giannoulakis & Tsapatsoulis, 2016) or as an index for a topic, a hashtag is also a register for the realm of feelings, ideas, and beliefs (Paparachissi, 2015). To demonstrate this, #BrasilContraOGolpe [Brazil against the coup] may serve as a good example. 
In late March 2016, this tag emerged from Dilma Rousseff’s supporters and “democracy advocates.” Activists, intellectuals, journalists, politicians, and ordinary users started using #BrasilContraOGolpe as a reference to the impeachment process against the president - considered by many as a “modern coup” (Jinkings et al., 2016). Pro-impeachment supporters, however, have also adopted the usage of the tag, but shifted its original meaning to support their arguments: claiming that the real coup would be that of keeping Dilma Rousseff and her Labour Party (PT) in charge of the government. This meaning shift, especially concerning polarised debates in programs and anti-programs (see Akrich & Latour, 1992; Rogers, 2018), is an example of double-sense hashtags. To locate these modes of appropriation, a technical understanding of the platforms’ functional forms of living (technicity) must be entangled with the process of doing digital methods (Rogers, 2019). Studies based on hashtags, however, should not conflate different platforms but, rather, apply different analytical procedures to each one (see Highfield, 2018; Highfield & Leaver, 2015; Rogers, 2017). Conversely, hashtags can be viewed as “problematic” content for digital research due to their failure to cover certain sensitive issues that tend to be disguised, such as pro-eating disorder content (see Gerrard, 2018). The collective adoption of tags can also be employed as a comparative source to grasp hashtagging activity in different platforms, which can be used to adapt methodological approaches (Highfield & Leaver, 2016). Despite unveiling different layers of reasoning, the logics of the hashtag adoption, and its consequences in a given context, these studies do not necessarily address hashtagging as a collective action movement. Alternatively, we further introduce the idea of discussing hashtag engagement rather than the hashtag adoption, conflating with the technicity of Instagram and its grammatisation process. 122 These lists of hashtags were originally adopted to increase views on publications and, consequently, to boost likes, followers, and comments with the help of applications and their automated mechanisms. 146 Situating Hashtag Engagement What, then, does the word engagement in hashtag engagement refer to? Engagement is taken as actions, metrics, and research indicators. For instance, one can argue that hashtag engagement is commonly associated with the act of using tags to engage with news, activism, brand strategies, event-based information, politics, demonstrations, automation practices, or specific debates. However, the term engagement has been either used to name platform-afforded metrics (or the totality of commensurable activities in a media item) or taken as an indicator for research design. Engagement metrics have thus become part of general digital media literacy as well as parameters for selecting data samples to be further analysed. Partly encouraged by terminology adopted by platforms themselves123, these metrics have even merged with the very notion of engagement in common parlance124. On this topic, Marres (2017) refers to the analytic figure of power-law as a critical issue in “the re-validation of hierarchical forms of social and public life” (p. 71). According to Marres, by feeding power laws back to users in the form of trending lists, digital platforms not only inform what goes on in digital settings but also serve “as an instrument that influences collective action”. 
And, while these can be understood as actual and faithful results of how users generally relate to the media, Gillespie (2017) draws attention to how the platform metaphor may hide inherent biases and active intervention of the internet high-tech companies, while suggesting a smooth standing point from which users can participate equally and fairly. These two remarks remind us that hashtag engagement also responds to platform infrastructures and mechanisms. In this scenario, we understand that social media engagement can be approached under a dual logic. In one way, it prioritises the sum of actions media items receive from many actors. Alternatively, engagement with a topic can be perceived by the recurring use of natively digital objects (Rogers, 2013) or grammars of action (Agre, 1994) from many actors about a topic—that is, many people using particular terms, hashtags, or images. Following platform mechanisms, the first logic is reflected on the most engaged list or what is dominant in terms of popularity and influence—parameters commonly taken for sampling purposes in social media research. The second logic refers to the diffuse posting of content related to particular issues that do not necessarily reach large numbers of likes, shares, or similar actions. That is where we would also find ordinary posts kept out of the spotlight—in a distribution that is similar to C. Anderson’s (2008) notion of the long tail. The dual logic of social media engagement thus raises concerns in research methods, particularly the understanding of the high-visibility and ordinary lists: what different stories can they tell? How may these lists complement or contradict one another? Some researchers have addressed specific concerns regarding how the practice of emphasising high-visibility content or the logic of popularity may lead to social media studies driven by engagement parameters (Marres & Weltevrede, 2012; Rosa et al., 2018). On the contrary, there is a long-standing debate around what ordinary means and why it matters for Cultural, Communication and Media Studies. For instance, in attempting to describe the ordinariness of culture, Williams (1989) explained how difficult it is to interpret the ordinary or unknown audience. In his view, ordinary people do not belong to “the normal description of the masses”; they belong to the unknown or unseen structures (Williams, 1989, p. 98). This article thus proposes, from a standpoint of quali-quantitative methods (Latour et al., 2012; Moats & Borra, 2018; Venturini, 2010), an alternative perspective to addressing engagement in social media research; a call to embrace not only highly visible content, but also ordinary, less-visible content for the interpretation of hashtag-mediated actions.
123 Platforms’ documentation commonly refers to these metrics as “post engagement” and offer their analytic products as a way to “measure engagement.” See https://analytics.twitter.com or https://www.facebook.com/business/help/735,720,159,834,389
124 The term and the problem to which it refers has a history that long precedes social media platforms, and has been related to different meanings besides the ones it has come to convey nowadays. For these reasons, as discussed by Rafael Grohmann (2018), it is important to critically and carefully consider the term’s polysemy when it is used conceptually.
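To make the dual logic discussed above more concrete, the sketch below illustrates how one might inspect the long-tail distribution of engagement before separating high-visibility from ordinary actors. This is a minimal, illustrative example only; the input file and its column names ("username", "likes_count", "comments_count") are assumptions, not part of the original study.

```python
# A minimal sketch (not the original study's code): inspecting the long-tail
# distribution of engagement before deciding where "high visibility" ends.
# Assumes a hypothetical CSV of hashtagged posts with the columns
# "username", "likes_count" and "comments_count".
import pandas as pd
import matplotlib.pyplot as plt

posts = pd.read_csv("hashtag_posts.csv")
posts["engagement"] = posts["likes_count"] + posts["comments_count"]

# Sum engagement per unique actor and sort from most to least engaged.
per_user = (posts.groupby("username")["engagement"]
                 .sum()
                 .sort_values(ascending=False))

# A log-scaled rank plot makes the head/long-tail shape visible.
plt.plot(range(1, len(per_user) + 1), per_user.values)
plt.yscale("log")
plt.xlabel("actors ranked by total engagement")
plt.ylabel("total engagement (log scale)")
plt.title("Long tail of hashtag engagement")
plt.show()
```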
Reasoning with and through the medium The study of hashtag engagement also requires a grasping of the functioning of the platform itself (technicity) along with the platform techno-materialisation process— which “enable (s) behavioural fluxes or flows to be made discrete (in the mathematical sense) and to be reproduced” (Stiegler, 2012, p. 2) (grammatisation). In this regard, we incorporate the notions of technicity and grammatisation, which not only 148 complement one another but are crucial for social media research and, accordingly, for the concretisation of the 3L approach. Technicity The philosophy of Gilbert Simondon (2009, 2017) reminds us of the crucial role of technicity for an understanding of “the mode of existence of the whole constituted by man and the world” (2017, p. 173)—a reality mediated by technical objects. The reasoning proposed in this article derives from Simondon’s ideas on the essence of technicity (2017) and the technical mentality (2009). Technicity, in a specific manner, refers to the notion of “function” as being associated with the technical and practical forms of knowledge of technical objects and how they relate to us. On this basis, technicity would simultaneously precede and take place with and in technical objects: first by being related to figural structures or the realm of ideas and, second, by the recognition of technical objects as a practical reality. This movement (from representative aspects to the praxis of techniques), consequently, divides technicity into two orders of thought: theory and praxis. In this way, technicity concomitantly triggers not only theoretical but also technical and practical knowledge on the functioning of technical objects and their relationship with human beings. A technical mentality thus implies thinking hashtag engagement with, in, and through social media platforms. Rather than only looking at the content, a study based on the technicity of Instagram should also consider the functioning of its technical interfaces and algorithmic techniques. One example of this would be to take advantage of application program interface (API) documentation using the knowledge about platform data access regimes, end-points, and their respective limitations and rate limits to repurpose social media research125. This article aligns with concerns raised by scholars such as Rieder et al. (2015) and Langlois and Elmer (2013), by looking at what is in social media technical interfaces as a way to perceive how social media grammars (hashtags) have been rendered and made available. In doing so, we propose, in practical ways, a more techno-aware understanding of social life (Marres, 2017) in pursuit of studying “hashtag engagement” on social media. 125 Some examples of the limitations of Instagram Graph API for getting hashtagged media include the fact that one cannot request username field or query more than 30 unique hashtags within a 7-day period. 149 Platform Grammatisation When referring to grammatisation, we are addressing an extension of the concept forged by Auroux (1994)—a process of description, formalisation, and discretization of human behaviors into representations, so that they can be reproduced (Crogan & Kinsley, 2012). 
This is what the French philosopher of technology, Bernard Stiegler (2006, 2012), called the process of digital grammatisation in which “all behavioural models can now be grammatised and integrated through a planetary-wide industry of the production, collection, exploitation, and distribution of digital traces” (Stiegler, 2012, p. 2). More recently, Gerlitz and Rieder (2018), envisioning the infrastructural aspects of Twitter, presented an updated definition of grammatisation: when users inscribe themselves into predefined forms and options produced and delineated by technical interfaces (software) to structure their activity. Beyond providing a way of looking at things, platform grammatisation simultaneously produces standardisation of actions (e.g., likes) and formalises these activities to calculability. This is a relevant concept for digital methods-based research, due to its strong focus on mediaspecificity, which, in the case of social media, is very much defined by their grammatisation of social activity. Next, we borrow Agre’s (1994) technical understanding of “grammars of action” or the representative forms of “discourse-made-machinery,” such as hashtagging, commenting, posting, replying, and so on. In this sense, hashtags are no longer text, but, by being clicked, they enact a navigational function. Thus, hashtag engagement is embedded into the platform databases that predefine specific properties (e.g., a tagged post has a caption, an image, or video and date of publication), the relationship between them (e.g., hashtags appear in Instagram posts), and a set of actions (e.g., liking or commenting on posts, using filters; see Gerlitz & Rieder, 2018). When considering how social media databases store and organise actions attached to the # symbol, we verify multiple forms of storing and further accessing hashtag data. As an illustration, through the former Instagram Platform API, it was possible to recall the number of times a profile mentioned a given tag (suggesting a form of appropriation) or the provision of ways of seeing correlations among tags (through a co-tag network). Meanwhile, the current Instagram Graph API only allows the search for the most popular or recently published tagged content. 150 In other words, and despite its prestructured form (#), hashtags can be differently embedded into social media databases permitting, then, different ways of reading hashtag engagement. Along with this grammatisation process, hashtags can also acquire different meanings and purposes in the modes they are used and, therefore, researched. That is what we refer here as “the grammars of hashtags,” how social media capture and reorganise the different modes of actions attached to hashtagging. The 3L perspective for studying hashtag engagement The 3L perspective assembles hashtag engagement, their related content, and the actors involved by distinguishing dominant and ordinary groups embedded in social media practices and mechanisms. The practical awareness of the platform grammatisation and technicity is the basis that concretely informs the 3L approach. This kind of knowledge, we argue, provides practical ways of reasoning with and through the functioning of the platform itself and its conjunction with hashtag engagement. Just as digital methods (Rogers, 2013, 2019) the 3L perspective must follow and evolve with the medium, its methods, and the affordances of digital data. 
Following the lexicon and proposal of Rogers (2018), this perspective also serves as a form of “critical analytics” or “alt metrics” for social media research by locating issue networks and creating indicators that are alternatives to marketing-like measures. We understand hashtag engagement as collectively formed actions mediated by technical interfaces. In other words, grammatised actions that move toward descriptions of images and feelings or toward particular topics of discussion (or issues), which require a (minimum) collective level of commitment. These sociotechnical formations, differently inscribed within web platforms, offer a framed (but sturdy) perception of society while providing social media research with different levels of analysis. Through the lens of the 3L perspective and along with the proposal of sociologist Bruno Latour (2010; Latour et al., 2012), the study of hashtag engagement allows analysis to move between the levels of the element (micro) and of 151 the aggregates (macro)126. With Latour and others (Omena, et al., 2019; Venturini et al., 2015, 2018), we embrace a “navigational practice” not restricted to either of those levels but a research practice that goes from micro to macro and back, taking any of them as a starting point for the inquiry. Few studies, however, have been developed on methods for researching hashtag engagement on Instagram on such bases. This is a contribution we expect to make with our 3L perspective for hashtag engagement studies on (but not restricted to) Instagram. In what follows, we explain each layer comprising the integrated 3L approach. Although presented in a linear sequence, they must be taken together, as layers of the same object. Layer 1: High-Visibility Versus Ordinary On this analytical level, unique actors are identified and subsequently distinguished according to the modes of activity and engagement metrics received by their posts over time (the acts of hashtagging or interacting with tagged content). In so doing, we attempt to cover both high-visibility and ordinary actors and related content, as well as answer the following questions: who are the high-visibility and the ordinary actors? Who dominates the debate? What is the visual and textual content related to them? What are the sites of image circulation? How about the distribution of users, posts, and engagement? The main challenge is in proposing a threshold for delimiting high-visibility from ordinary hashtag usage, its related actors, and content127. Driven by Rogers’s (2018) alternative metrics to study issue networks in social media research, we considered the persistence of user activity over time as they are inscribed in platform engagement metrics. Thereby, it is an attempt to address what the social media digital attention economy either emphasises or not. In this logic, high-visibility actors and content are understood as the minority, which exhibit comparatively high and consistent engagement metrics (e.g. likes and comments counts) across the observed time span. 126 Latour’s proposal is based on Gabriel Tarde’s social theory, particularly his idea of quantification. The importance and influence of Gabriel Tarde’s work is recognized by Bruno Latour when he places Tarde as the main precursor of Actor-Network Theory. 127 Actor activity is understood in their tagging or uploading tagged content overtime, whereas metrics of engagement means the total of likes and comments in a publication. 
In other platforms, engagement metrics could also include reposting (share, retweet, reblog, etc.), among other actions. 152 This would indicate not only the scale of their audience but also their ability to receive responses to their publications. Conversely, ordinary actors and content would be the majority, exhibiting comparatively lower engagement metrics, reaching a smaller audience. Of course, these categories are not empirically self-evident. Rather, the threshold needs to be arbitrarily defined by grounded criteria. Layer 2: Hashtagging Activity The second layer relates to the repurposing of hashtagging activity for grasping the grammars of hashtags. By this, we mean the ways in which social media platforms capture and reorganise the different modes attached to hashtagging. Far from being neutral intermediaries (Latour, 2005), hashtags are taken as entities to which the activities of users, bots, and platform algorithms converge and through which they mutually transform one another. Although such entanglement can be very complex, it is possible, in line with digital methods’ perspective (Rogers, 2013), to repurpose hashtags as traces from which one may infer those activities. Besides framing the most active actors or serving as qualitative parameters to inquire into high-visibility and ordinary groups, the intensity and rhythm of hashtag mentions may indicate actors very committed to specific issue spaces, as well as potential botted accounts (see Omena et al., 2019). Patterns of concomitant hashtag use can indicate different hashtagging practices, including shifts of meaning, purposeful deviations, as well as hashtag ambiguity and ironic usage. We argue that different approaches should be embraced to read the forms of appropriation and frequency of use regarding one or more hashtags. Looking at the affordances of Instagram to hashtagging activity, this layer seeks to answer questions such as the following: What can frequency of hashtag use reveal about high-visibility and ordinary groups? What can the number of times hashtags are mentioned by a given account tell us about particular actors or automated agency? How can the co-occurrences of hashtags indicate different hashtagging practices? How do hashtags mediate actors’ engagement with a cause? Layer 3: Visual and Textual Content Finally, hashtag engagement should also be related to the content of the posts within which they are mentioned. The third layer focuses on visual and textual content, providing an overview of the diversity and richness of narratives attributed to 153 particular hashtags. Here, the focus is on understanding the images and texts to which hashtags are brought to relation, taken as constituent parts of their meanings and related practices. In that regard, and accounting for high-visibility and ordinary groups, this layer asks: what stories can the visual and textual tell? What are the visual and textual compositions or meanings related to certain hashtags? How about the sites of image production and circulation? The quali-quantitative approach is particularly relevant at this analytical level. Considering our interest in massive ordinary posts, this approach would be laborious— not to say unfeasible. However, distant reading methods for both texts and visual content can be mobilised for identifying recurring patterns (Dixon, 2012) among the dataset, without losing sight of their manifestations. This is the main challenge of this layer, whose operationalisation will be detailed further. 
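As a practical illustration of Layer 2 above, the sketch below shows one way the concomitant use of hashtags might be turned into a weighted co-occurrence edge list that Gephi can read. It is a hedged example, not the authors' code; the input file and the "caption" column name are assumptions.

```python
# A minimal sketch (an assumption, not the authors' script): turning hashtag
# co-occurrence in post captions into a weighted edge list for Gephi.
import re
from itertools import combinations
from collections import Counter
import pandas as pd

posts = pd.read_csv("hashtag_posts.csv")  # hypothetical export of the dataset
edges = Counter()

for caption in posts["caption"].dropna():
    # Extract hashtags; Python's \w is Unicode-aware, so accented tags match too.
    tags = sorted(set(t.lower() for t in re.findall(r"#(\w+)", caption)))
    # Every unordered pair of tags used in the same post counts as one co-occurrence.
    for pair in combinations(tags, 2):
        edges[pair] += 1

edge_list = pd.DataFrame(
    [(a, b, w) for (a, b), w in edges.items()],
    columns=["Source", "Target", "Weight"],
)
edge_list.to_csv("cotag_edges.csv", index=False)  # import into Gephi as an edge table
```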
The praxis of Hashtag Engagement Research Political Context, Scholarly Approaches, and Framing of the Brazilian Case The case study approaches two antagonistic protests staged in Brazil in March 2016, during a rise in political animosity in the country. On the 13th of that month, protesters went to the streets in many cities in support of an ongoing parliamentary process to remove President Dilma Rousseff from office. Five days later, on the 18th, protesters contrary to the removal took their turn, expressing concern that the proposed impeachment lacked legal cause and would thus be qualified as a “parliamentary coup” (Jinkings et al., 2016). In respect to the terminology used by each of the groups in defining themselves—and wary of not prematurely resolving the implied controversy (Latour, 2005; Venturini, 2010)—we chose to refer to the protests, respectively, as “pro-impeachment” and “anti-coup.” It is essential to understand this case within a broader political context. Addressing Brazilian demonstrations staged between 2013 and 2016, Alonso (2017) discusses elements that could have facilitated their emergence with an interest in the styles of mobilisation of each cycle of demonstrations. These include the wave of global autonomist protests starting in 2010 (from Tunisia to Wall Street), Brazil’s international visibility due to the sports events it would host in following years; 154 corruption scandals and their spectacularisation; and the rapid reconfiguration of Brazilian social strata (see P. Anderson, 2011; Lima, 2010), which destabilised symbols of social hierarchy (race, income, and education, among others). This four-year period, culminating in 2016, is commonly divided into three protest waves. First is that of the so-called “June Journeys”: mass demonstrations which, at their peak in June 2013, brought an estimated one million people to the streets. They marked the emergence of an autonomist and leaderless style of demonstration, which took governments and traditional movements by surprise, but which also culminated in ideologically ambiguous protests coalescing agendas across the political spectrum—from anarchist to pro-dictatorship demands. Next was what Alonso (2017) refers to as the 2015 “Patriot cycle,”128 following the 2014 presidential elections, which Rousseff won by a very narrow margin. To the right of the political spectrum, allegedly nonpartisan groups achieved prominence, especially on social media (Omena & Rosa, 2017). They were able to mobilise a wide range of conservative political strands, from major players in the financial and industrial sectors to religious fundamentalists and conservative citizens from higher economic strata. The case studied in this article is part of the third wave, more directly tied to Rousseff’s impeachment process, which, officially, pursued accusations of administrative misconduct (which came to be known as “fiscal pedaling”) in December 2015. Most protests took place in 2016, when the aforementioned conservative groups were prominent established players in Brazilian protests. The polarisation already experienced in the second wave was magnified by the reconfiguration of the public agenda, with antagonistic groups of supporters and detractors of Rousseff’s deposition becoming delineated. 
Despite the actual judicial arguments of the process, public debate inherited much of the agenda of the previous wave, with pro-impeachment demonstrators focusing on corruption scandals, targeting the Workers’ Party, and mobilising mostly citizens from higher economic strata. Calls for Rousseff’s ousting were accompanied by several misogynistic depictions of Rousseff—the first-ever female president of Brazil— as discussed by scholarly inquiries of the case (see Corrêa, 2017). Hatred against left- 128 Alonso indicates March and April of that year, but we would extend the cycle’s scope to protests staged later in 2015 as well. 155 leaning activists and marginalised segments of the population, commonly associated with a progressive agenda, was also increasingly manifest in that context. Anti-coup demonstrators’ discourses focused on the defense of democracy and often exhibited explicit partisan stances. Although this event has prompted scholarly inquiries on several aspects of the process, there are surprisingly few works that investigate how protesters represented themselves in that context. The impeachment process has been more often studied in regard to how it was reported by the press or by groups leading the protests (see Fausto Neto, 2016; Tavares et al., 2016), with little attention paid to ordinary protesters’ visual depiction of the event or to Instagram as a site of observations129. In what follows, we will present a study of this case based on our 3L perspective, building upon Instagram’s culture of use and affordances. Operationalising the 3L Perspective Taking advantage of Instagram’s API Platform, which at the time allowed researchers to go back days, months, and even years in time, data collection occurred in several iterations from March 13 to March 31, 2016. Our study relied on Visual Tagnet Explorer (Rieder, 2015) to collect publicly available posts according to queries based on hashtags. Chosen upon immersive observation of the context and through previous exploratory data collection and analysis (co-hashtag networks and Excel’s pivot table), the selected hashtags (Table 4.1) corresponded to the following criteria: having a significant number of mentions, bearing clear connection with the topic, being an indicator of counter-reactions, or being an indicator of new connections on the topic. The datasets were later filtered by matching the dates of the posts and the protests, limiting the scope to the two dates - March 13 for pro-impeachment and March 18 for anti-coup. The final combined dataset included 19,231 unique Instagram accounts with a total of 22,423 posts. 129 Regarding self-representation, an exception is a work by França and Bernardes (2016), which approaches visual depictions of the 2015 demonstrations, albeit from a different theoretical and methodological standpoint. Regarding digital platforms, Twitter and Facebook were most commonly studied with regard to impeachment-related demonstrations (see Alzamora & Bicalho, 2016; Moraes & Quadros, 2016; Ribeiro et al., 2016). 156 Table 4.1. List of hashtags selected for the case study Following the 3L perspective, the distinction of high-visibility from ordinary was based on the combination of two factors: first, detecting unique actors (Instagram accounts) and then the testing of different thresholds for the average platform engagement metrics (sum of like and comment counts) of the users’ posts over time. 
In so doing, we expected to find a viable threshold that could single out a minority group of users receiving a large portion of the total sum of engagement metrics of all posts in the dataset. Through this process, we came to define the threshold at the 98th percentile of average engagement per post, per user. Using this boundary, we found similar distributions for both pro-impeachment and anti-coup datasets. In both cases, high-visibility actors were a minority responsible for roughly 4% of all the posts in each dataset; yet, they received around 50% of all engagement-related activity. Through this procedure, we sought to distinguish the most visible (and, therefore, most popular and influential) actors and their related content from the rest. Next, for the analysis of hashtagging activity, we focused on hashtags’ frequency of use and their concomitant mentioning. The former was taken as an indicator of popular tags, which we compared between high-visibility and ordinary users in each protest. The concomitant mentioning of hashtags was observed through co-occurrence networks built in Gephi Version 0.9.2 (Gephi Consortium, 2017), taken as analytical devices to observe patterns of hashtagging practices130. For the visual dimension, we relied on an experimental approach based on that proposed by Ricci et al. (2017). Post images were automatically labeled based on their content using a computer vision API—Google Cloud Vision API Version 1.0 (Google, 2017)131. The automated image classification was later combined with Gephi and a custom Python script (Mintz, 2018) for building a computer vision-based network: the so-called image-label networks, in which we can see clusters of images connected by their descriptive labels. For the textual content, we resorted to two analytical tools: CorTexT Manager (Lisis Laboratory, 2017) and Textanalysis (Rieder, n.d.). The former, advanced by topic modeling algorithms, allowed us to visualise co-term networks of Instagram captions and their related hashtags (clustered by political positioning). Textanalysis served our case study to compare the use of emojis in the captions of posts by high-visibility and ordinary users.
130 All network visualisations used in this study were based on the visual network analysis technique (Venturini et al., 2015; Venturini et al., 2018).
131 Bernhard Rieder’s (2017) Memespector script was used for interfacing with Google’s API. https://github.com/bernorieder/memespector
Findings
In this section, we present the findings of the case study of the “impeachment-cum-coup” of Brazilian president Dilma Rousseff. We applied the 3L perspective to study political polarisation in Brazil through the lens of hashtag engagement, considering two national demonstrations: the pro-impeachment (March 13) and anti-coup (March 18) protests.
High-Visibility Versus Ordinary
Through the distinction made at this stage, we were able to inquire of high-visibility actors and their related content. Who are they? What can activity over time tell us about high-visibility actors? To what visual elements are they attached? We identified a very particular structure in both pro-impeachment and anti-coup groups (Table 4.2): on one side, a group of actors who obtain high levels of engagement metrics with very few publications, while on the other, a group of actors with a large number of publications over the day of protests also getting high levels of engagement metrics (see Omena et al., 2017).
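The Layer 1 threshold described above (the 98th percentile of average engagement per post, per user) could be operationalised roughly as in the sketch below. This is an assumption-laden illustration, not the original analysis script; the file and column names are hypothetical.

```python
# A minimal sketch of the Layer 1 distinction: average engagement per post,
# per user, split at the 98th percentile to separate high-visibility from
# ordinary actors. Column and file names are assumptions.
import pandas as pd

posts = pd.read_csv("hashtag_posts.csv")
posts["engagement"] = posts["likes_count"] + posts["comments_count"]

# Average engagement per post for each unique actor, and the 98th-percentile cut.
avg_per_user = posts.groupby("username")["engagement"].mean()
threshold = avg_per_user.quantile(0.98)

high_visibility = avg_per_user[avg_per_user >= threshold].index
posts["group"] = posts["username"].isin(high_visibility).map(
    {True: "high-visibility", False: "ordinary"}
)

# Sanity check: share of posts and of engagement captured by each group.
print(posts.groupby("group")["engagement"].agg(["count", "sum"]))
```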
In a more specific example, Figure 4.1 shows the configuration of high-visibility actors (dots) positioned according to received engagement metrics (vertical axis) along the day of the protests (horizontal axis). At the top, the actress Viviane Araújo points to a trending characteristic in the dominant visuality among public figures: selfies, whereas the classic imagery of the crowds is mainly promoted by non-official campaign accounts and the organiser of the protests—namely, chegadecorruptos, foracorruptos_rn and vemprarua. Other visual elements addressed by the high-visibility actors in pro-impeachment protests expose support for the then Federal Judge Sérgio Moro and the Operation Car Wash, or the appearance of humorous images (e.g., Dilma in the shape of the Zika mosquito) and aggressive messages addressed to Dilma Rousseff and Lula.
Table 4.2. The high-visibility actors in Brazilian protests. Instagram, March 2016.
Figure 4.1. High-visibility actors of the pro-impeachment protests in Brazil, March 13, 2016. Composition, engagement flow over time, and visual elements (scatter plot design by Beatrice Gobbo).
There were also some unexpected findings: first, an account dedicated to pets (petscharm) among high-visibility actors. This Instagram account published a series of images of dogs wearing Brazil’s football garment or the Brazilian flag, elements also worn or carried by protesters. In regard to actors’ activity and their associated engagement metrics, we saw ongoing posting activity over March 13, 2016 and, between 3 p.m. and 9 p.m., high peaks of engagement that may correspond to the simultaneous protest acts across different cities in Brazil (Figure 4.1). It is also important to point out deleted non-official campaign accounts on Instagram, such as opereacaolavajatooficial (official operation car wash), which leads us to question their authenticity and role.
Hashtagging Activity
As a next step in the analysis of hashtag engagement, we considered the grammars of hashtags by reading Instagram’s different forms of capturing hashtagging. Looking at referential tags and their use frequency, we noticed different preferences among high-visibility and ordinary actors (Figure 4.2). For instance, in pro-impeachment protests, #foradilma (get out Dilma) and #forapt (get out PT) were more frequent among ordinary users, while #vemprarua (come to the street) was slightly more frequent among high-visibility ones. In anti-coup protests, ordinary actors gave preference to #naovaitergolpe (there won’t be a coup), while high-visibility actors opted for #vemprademocracia (come to democracy). The different cultures of appropriation among high-visibility and ordinary actors provide a more accurate description of hashtag engagement practices. Now, we turn our attention to hashtag mentions and related actors, more precisely, who the high-visibility actors are and how many times they mention specific tags. Beyond seeing tag preferences among high-visibility and ordinary actors, the contribution of this analysis is in the detection of Instagram accounts strongly committed to given hashtags. So far, and unlike occasional mentions, we have seen that the persistence of hashtag mentions over time may refer to those actors responsible for keeping the debate regarding protesters’ grievances alive.
Conversely, accounts with few mentions can equally reach high engagement metrics by being related to public figures, humorous or artistic visual content (e.g., tiacrey, lalanoleto, artedadepressao), or politicians and activists (e.g., humbertocostapt, fernando.domingos.sim). To take a concrete example, in the pro-impeachment protests, the most committed actors by hashtag mention were mainly non-official campaign accounts—namely, chegadecorruptos, foracorruptos_rn, operaçãolavajatooficial, petscharm, and the organisers of the protests (vemprarua). The behaviour of these Instagram accounts points to an automated agency (see Omena et al., 2019). Regarding the anti-coup protests, non-official campaign accounts (e.g., rosangelacct, transitivaedireta, liliferrer14) also took part in the “most active list” by hashtag mentions, but so did alternative media (e.g., medianinja) and one of the organisers of the protest (cutbrasil). Regarding non-official campaign accounts, we found strong suggestions that thirdparty applications were being used to boost engagement metrics. 161 Figure 4.2. Proportional frequency of hashtag mentions (number of mentions over the number of posts) for high-visibility and ordinary groups. Filtered to the 10 most mentioned hashtags of each dataset. Visualisation created with Tableau Desktop (Version 10.4.6; 2018). The visual exploration of co-occurring hashtag network added value to the hashtagging activity perspective. Rather than following the typical cluster analysis to study the partisan use of hashtags and related topics, we approached emblematic hashtags adopted by pro- and anti-programs as a form of seeing a shift in meaning. That is what we call double-sense hashtags. After scrutinising #nãovaitergolpe (there won’t be coup) (Figure 4.3) co-occurrence network, we were able to detect purposeful shifts of the hashtag’s meaning—for instance, hashtags supporting the impeachment process and connected to the main slogan of the pro- impeachment protests “come to the street”. In addition, tags addressing messages directly related to the now-former presidents of Brazil—“get out Dilma,” “get out Lula,” and the association of an inflatable puppet wearing prison uniform, named Pixuleco, with Lula. 162 Figure 4.3. #nãovaitergolpe co-occurring network related to anti-coup protests in Brazil, 18 March 2016. Instagram Platform. Network attributes: 1,250 nodes (hashtags) and 11,487 edges (co-occurrences). Visualisation created with Gephi, layout: Force Atlas 2 (Jacomy et al., 2014), “LinLog mode” option enabled. Visual and Textual Visual content was analysed through an image-label network built upon pre-trained machine learning models of Google Cloud Vision API. We interpreted this network by describing clusters of images brought together by formal similarity; an exercise of relabelling the image classification provided by the vision API (Figures 4.4 and 4.5). Through this approach, we found that both pro-impeachment and anti-coup visualities exhibited a similar overall pattern, annotated by three major clusters: selfies and portraits, crowds, and graphic pictures (banners, image macros, text, etc.). However minor, both networks had food and beverage clusters, which we have also found to be related to the protests themselves. 
Each of the groups had pejorative nicknames for antagonist protesters which were based on food: “coxinhas” (a popular Brazilian treat 163 made with chicken) and “mortadela” (a popular type of sausage), respectively, used by anti-coup and pro-impeachment protesters. Several unique clusters were detected in each network, pointing to a particular visual culture. The pro-impeachment (see Figure 4.4) had a large cluster of variations of the Brazilian flag, which shows its strong connection with patriotic iconography. A prominent cluster of dog pictures was also found, which indicates the trivialisation of political engagement, while also possibly relating to how pets are commonly treated and represented by middle-class Brazilians. Lying between individual and group portraits were a significant amount of people wearing sunglasses, which seems to relate to how these accessories are status symbols within Brazil. Contrary to this, the anti-coup image-label network (see Figure 4.5) had a comparatively smaller cluster of individual or small group portraits, with crowd photos being more prominent. The Brazilian flag was much less featured, while other symbols, such as red protest t-shirts and newspaper clippings, stood out. Within the individual portrait cluster, bearded faces composed a small but meaningful cluster which relates to a typical expression of political identity in the left. To compare visual content between high-visibility and ordinary groups of each protest, we resorted to a quantitative approach of label attribution frequency (Figure 4.6). Regarding the image-label networks, the pro-impeachment dataset had a higher occurrence of labels which relate to close-up portraits (e.g., “sunglasses,” “facial expression,” “face”). These labels were slightly more common in the ordinary group than in the high-visibility one. In the anti-coup dataset, labels related to collective imagery were more common (e.g., “festival,” “demonstration,” “event”), indicating a different representational tendency for this protest. These labels were also more common among the high-visibility group than the ordinary group. 164 Figure 4.4. Image-label network of the pro-impeachment protests, March 13, 2016, Brazil. Original Instagram images plotted according to relative node positions of a bipartite network built with Google Cloud Vision API’s Version 1.0 (Google, 2017) “Label Detection” data. Network attributes: 18,986 nodes (1,358 labels and 17,628 images) and 80,479 edges. Layout: Force Atlas 2 (Jacomy et al., 2014), “Prevent overlap” option enabled. Figure 4.5. Image-label network of the anti-coup protests, March 18, 2016, Brazil. Original Instagram images plotted according to relative node positions of the bipartite network built with Google Cloud Vision API’s Version 1.0 (Google, 2017) “Label Detection” data. 165 Network attributes: 2,872 nodes (587 labels and 2,285 images) and 10,508 edges. Layout: Force Atlas 2 (Jacomy et al., 2014), “Prevent overlap” option enabled. Figure 4.6. Proportional frequency of Google Cloud Vision API Version 1.0 (Google, 2017) label attributions (number of attributions over a number of posts) for high-visibility and ordinary groups. Filtered to the 15 most used attributed labels of each dataset. Visualisation created with Tableau Desktop (Version 10.4.6; 2018). 
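For readers who want a sense of the underlying data feeding these image-label networks, the sketch below shows how label annotations (description, score and topicality) might be requested for a folder of images and stored as image–label pairs. The published study interfaced with the API through Bernhard Rieder's Memespector script; the use of the official google-cloud-vision Python client here is an illustrative assumption, and the file paths are hypothetical.

```python
# An illustrative sketch only: the study used the Memespector script, not this
# code. The official google-cloud-vision client is used here to show the kind
# of label data (description, score, topicality) the image-label networks rely on.
import csv
import glob
from google.cloud import vision

client = vision.ImageAnnotatorClient()  # requires Google Cloud credentials

with open("image_labels.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["image", "label", "score", "topicality"])
    for path in glob.glob("images/*.jpg"):  # hypothetical folder of post images
        with open(path, "rb") as f:
            image = vision.Image(content=f.read())
        response = client.label_detection(image=image)
        for label in response.label_annotations:
            # One row per image-label pair: the bipartite edges of the network.
            writer.writerow([path, label.description, label.score, label.topicality])
```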
Moreover, labels indicating colours were among the top occurring in both datasets: yellow and green for the pro-impeachment protests; red for the anti-coup protests, beyond being, respectively, associated with the Brazilian flag or the national football uniform (pro-impeachment) and to the Workers’ Party or other left-wing movements (anti-coup). Colours, here, indicate a statement of Brazilians’ position. Seeking to identify the specificities of the discourse adopted in each of the political perspectives (anti-coup and pro-impeachment) and groups (high-visibility and 166 ordinary), we visualised textual content (Instagram captions) in different levels of analysis (Figure 4.7) through co-term networks. We first visualised the textual content of both protests gathered in four main clusters (Figure 4.7, left): two related to anti-coup positioning, and the other two connected to the pro-impeachment group. In the latter, we see expected slogans against Dilma and surprising national anthem terms; while in the anti-coup clusters there are appeals for the impeachment process to end and for respecting the results of the 2014 democratic elections in Brazil. In opposition to this broad perspective, we separated the co-term networks by closely looking at the high-visibility and ordinary groups. The highvisibility network (Figure 4.7, center) shows more isolated clusters, scarcely interconnected. The places where the protests occurred are what connect the polarised debate. In the ordinary textual network (Figure 4.7, right), the main component shows more dense connections, thus reproducing concerns similar to those we have already mentioned. Figure 4.7. Textual analysis of Brazilian protests in March 2016 via co-term networks. Instagram captions and related hashtags were clustered according to political positioning (the pro-impeachment and anti-coup selected hashtags), and according to co-occurrences of the 50 top terms in Instagram captions. Nodes are terms and edges co-mentioning relationships. Software analysis: CorTexT Manager (Lisis Laboratory, 2017). The richness of these different narratives is found in isolated clusters that reveal very particular concerns, belonging solely to one group. It was the case of the appearance of terms suggesting Brazilians not be moved by hatred but to “protest peacefully” as a part of high-visibility textual content and the specific terms associated with an alternative media account—namely, Mídia Ninja (Figure 4.7, center). Another 167 example, now in the ordinary network (right side), entails nationalistic rhetoric referring to the Brazilian national anthem. Finally, but no less important, while highvisibility actors acknowledged Brazilians for their participation in the proimpeachment demonstrations, the ordinary actors expressed how proud they were of being present at the protest. Figure 4.8. The appropriation of emojis according to high- visibility and ordinary groups; emojis organised according to frequency of use. Ultimately, mixing the visual and textual content, we observed the use of emojis in Instagram captions. Emojis (formerly called “emoticons”) have had a significant role in computer-mediated communication, serving as a path to sharpen emotional expressiveness on text-based interactions. In our perspective, these objects are interesting because they can be apprehended in terms of representativeness (high-vis and ordinary) and positioning (pro-impeachment vs. anti- coup), and not only as an act of tagging per se. 
168 Figure 4.8 depicts the appropriation of emojis in high-visibility and ordinary groups, ranked according to frequency of use. At a glance, representative colours may be seen in pro-impeachment icons (yellow and green) as well as in symbolic icons for the anticoup group (tulip and raised fist). This points to different use preferences, also serving as a reinforcement of the visuality (Instagram images) attached to the polarised groups. However, when comparing the appropriation of emojis by different groups, while the ordinary group has a heart among the most used emojis, high-visibility accounts opted for the globe showing the Americas, smiling face with sunglasses, and a party popper. In addition, the skin tone of emojis reveals an interesting perspective about race (represented by squares in Figure 4.8), with a predominance of light skin and medium skin tones among protesters, except for the high-visibility accounts of the anti-coup demonstrations, which had medium-dark and dark skin tones. Conclusion This chapter sought to critically and methodologically contribute to digital research by looking at the specific case of hashtag engagement. Through digital methods, we introduced the 3L perspective: a hands-on approach that operationalises new forms of digital social enquiry. It has, in its core, the entanglement of the technicity of Instagram and its grammatisation process as a lens for hashtag engagement analysis, as in the appraisal of what is trendy in Hashtag Studies or Social Media Research and what is often kept out of research concerns; that is, both high-visibility and ordinary actors, actions, and related hashtagged content. The core outcome of this kind of research is the assumption/perception of that high-visibility as a mirror of the social media digital attention economy. However, in being re-signified through the detection of unique actors combined with platform metrics over time, it serves as an alternative approach to social media vanity metrics. By enquiring of hashtag political engagement on Instagram, we confirmed the importance of including high-visibility versus ordinary groups (Layer 1), hashtagging activity (Layer 2), and its related visuality and textuality (Layer 3) as layers of the same object of study. Through the case of the impeachment-cum-coup of Brazilian president Dilma Rousseff in 2016, substantial differences between the high-visibility and ordinary groups were 169 uncovered—both in terms of hashtag usage culture and related content. By looking at the structural shape of high- visibility groups in Layer 1, we found that impactful visual content requires little effort from public figures, politicians, and artists (often with one post), while continuous activity over time is a mandatory task for non-official campaign accounts and independent media (often with a high number of posts). In Layer 2, the different ways in which hashtags are captured by social media databases expose different cultures of appropriation. The choice of tags and their intensities of use changes between high-visibility and ordinary actors. These grammatised actions also point to very particular behaviors—from the double-sense hashtags to an automated agency. With the third layer, we navigate through the whole (all images and textual content) to its parts (what pertains to high-visibility and ordinary) and back and forth. When cross- read, the three layers add value to one another, providing a rich and in-depth vision of the case study. 
This could not be understood without uncollapsing hashtags, often treated as monolithic indices without internal differences. In this scope, the 3L approach adds value to social media research by accounting for how the functional/practical relationship between technicity and platform grammatisation concretely informs the process of reasoning with and through the medium. However, it is essential to observe the significant changes in social media APIs and their impact on research, as argued by Venturini and Rogers (2019): a call for researchers to gain independence from standardised pathways. For instance, after the implementation of the Instagram Graph API, the tool used in this study is now obsolete (see Rieder, 2016), leading us back to scraping-based tools as an alternative for pursuing the 3L perspective, e.g., Instaloader (Version 4.2.6, 2019). Another point concerns the inherent limitations of our proposal, which certainly does not exhaust the possibilities of exploring modes of engagement beyond unique actors and their respective metrics and activities. For instance, one could follow hashtags to account for their algorithmically driven placement in users’ feeds, or account for the biases and limitations of computer vision and machine learning as analytical instruments (see Mintz et al., 2019). Furthermore, the challenges of applying digital methods for hashtag engagement research concern how to deal with the ephemeral ways of being of social media and their changeable ways of grammatising actions. Regardless of the possible changes in platforms and research tools or protocols, the conjunction of the 3L layers pertains to key points often addressed in social media research. With this knowledge, and positioning the notions of technicity and grammatisation as a practical matter, this chapter may contribute to what Rogers calls a medium-specific theory. Therefore, and as it follows the ways in which platforms operate, the techniques and enquiry proposed by the 3L shall evolve through time. We also hope that this framework can enhance the understanding of hashtag engagement and, regardless of platform-specific derivations, be further applicable to different platforms.
Introduction to chapter five
This chapter was originally published as: Omena, J. J. & Granado, A. (2020). Call into the platform! Merging platform grammatisation and practical knowledge to study digital networks, Icono 14, 18 (1), 89-122. doi: 10.7195/ri14.v18i1.1436
The chapter follows the invitation of professor António Granado, at the beginning of 2016, to start a project about the Portuguese universities on Facebook, raising questions such as: how do Portuguese universities make use of Facebook? What and how do they communicate? How do digital platforms serve as a tool or bridge for science communication? How can visual network analysis help to respond to these questions? My contribution to this chapter was written at different times between 2016 and 2019, starting with the exploration of networks of Facebook page-like connections and page data and moving to the making, analysis and visualisation of a Google Vision-based network. Like other projects that seek to develop research with digital methods, ours started as a sort of mimic research, that is, by imitating research protocols that had previously been successful, but it ended up taking its own form.
The reason why I chose to include this chapter is that it illustrates a purposeful use of digital networks as a research device and the possibility of experimenting with different visual techniques from a media research perspective. That is, it gives space and time to the practice of digital methods, to learning how to think along with and repurpose the medium and digital records, and, more practically, to learning how to differentiate data descriptions from findings (or trying to do that!). It was a good opportunity to take advantage of the open-ended research procedure underpinning the methods. The chapter thus describes a situation where the researchers had already developed a certain proximity with computational mediums and a sensitivity to their technicity. The chapter takes seriously the acquired knowledge about the relationship between software affordances, platforms’ culture of use and their technical grammatisation, and reflects on the iterative and navigational practices demanded by the method. Here time is crucial: time to explore and describe what the networks had to offer, time to question new analytical possibilities, time to present practical solutions through the methods, and time to start the analysis all over again. The technicity approach can be noticed in the way the research questions were asked and in the way technical imagination was used to address them. Technical knowledge (about Facebook grammatisation, Google Vision’s label detection and ForceAtlas2) and practices (using research software, reading networks, creating data visualisations) were crucial. Finally, such knowledge became a fundamental factor in the iterative and navigational technical practices, which use the Web not only as a source of data but also as a place to find less structured information that supports qualitative analysis, e.g. using page and post IDs or URLs (available in spreadsheets or Gephi) to locate publications on Facebook and thereby situating, for instance, contexts related to the page like network. For instance, we took Facebook grammatisation into account, using Page like connections and data as a means to map and analyse the institutional interests of Portuguese higher education. This provided relevant information for mapping the profile of Portuguese universities. We welcomed new research questions throughout the analysis, e.g. by asking about the dominant and ordinary associations of Portuguese Universities and using the Facebook Page category to respond to that. By using Google Vision API outputs (labels based on confidence score and topicality rating to classify images), we were able to make sense of a historical image dataset by rearranging all timeline images and labels as networks. To answer the main and emerging research questions, basic and advanced visualisations were produced, e.g. a bee swarm chart to examine the mood of Portuguese Universities over the years according to the face detection module of the Google Vision API, and a network grid detecting the visual patterns and particularities of each university by observing the presence or absence of colours represented by image clusters (posters, animals, people, etc.). Through the bee swarm chart, we were able to identify what brings joy to four universities by closely analysing images tagged as very likely or likely to contain joy in their content (information provided by computer vision outputs). Through the network grid, we visually compared the image clusters across the different university pages.
Here, the mastery of Gephi and the ability to create basic visualisations with RawGraphs proved crucial for producing knowledge and posing new research questions. In regard to a discussion of the methods, this chapter continues and expands the use of computer vision networks as a research device, also making use of basic visualisations to answer meaningful questions. Here Google Vision outputs were interpreted and further explored (though not necessarily questioned), thus letting the methods, computational mediums and researchers’ intervention tell the story. For instance, after noticing that the dominant visual content contained students, professors, academic staff and board members in the most varied types of events, we asked about the mood of the Portuguese Universities and about the accuracy of Google’s face detection in recognising different facial expressions. "Joy" proved to be a reliable label, in contrast to the lack of precision in the classification of images detected with surprise, sorrow or anger faces. The latter required a navigational mode of analysis: from the bee swarm chart to the search for an image id in the spreadsheet (filtering by face expressions and published results) and, afterwards, in the folder of images. The analysis of images with "joy" faces, pertaining to four specific universities and identified through the bee swarm chart, was a more complex methodological process. We came across a recurrent situation when using digital methods: knowing what to do and why, but not how to do it. In situations like this, we see the advantage of using or activating a technical imagination to solve emerging methodological problems, an imagination that results from some experience in the practice of digital methods, e.g. downloading and visualising a collection of images. In this case study, for instance, we considered analysing specific groups of images (faces with “joy”) because we were aware that all images (downloaded from the web) were stored in a folder and named with the format *name*.*ext*; the image filename follows the image id available in the image URL. Image URL: https://instagram.fhio3-1.fna.fbcdn.net/v/t51.288515/e35/64816216_895583534126353_811098829469214962_n.jpg?_nc_ht=instagram.fhio31.fna.fbcdn.net&_nc_cat=101&_nc_ohc=EGRkWUNDcXMAX9wcHSI&tp=1&oh=fe4ddb96d47d4f15367eb4b49f7080a4&oe=6059A868 Image filename: 64816216_895583534126353_811098829469214962_n.jpg We knew beforehand that the images’ filenames could be filtered on a spreadsheet, serving as a query to locate the images inside the image folder, but we didn't know how to get a list of images out of it. In other words, we knew what to do (get the images labelled with joy by four different universities, relocating these into four separate folders) and why to do it (to verify the motive of joy by analysing the visual content with ImageSorter). The difficulty here was to operationalise the task of locating a list of images in a folder using their filenames and then copying the selected images into another folder to be analysed qualitatively with the help of ImageSorter. A simple command line solved this challenge, thanks to a colleague’s help (Fábio Gouveia)132. What I want to emphasise here is the use of technical knowledge to facilitate the analysis. We learned that technical imagination helps to resolve methodological issues and supports the creation of research tools.
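The original one-liner is not reproduced here; what follows is a minimal Python sketch of the same task, assuming a hypothetical spreadsheet export (face_detection.csv) with image id, university and joy likelihood columns, and a source folder containing all downloaded images.

```python
import csv
import shutil
from pathlib import Path

# Hypothetical file and folder names, used for illustration only.
SPREADSHEET = "face_detection.csv"   # assumed columns: image_id, university, joy_likelihood
SOURCE_DIR = Path("all_images")      # folder with all downloaded timeline images
TARGET_DIR = Path("joy_images")      # one sub-folder per university will be created here

with open(SPREADSHEET, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        # Keep only images classified as (very) likely to contain joy.
        if row["joy_likelihood"] not in ("VERY_LIKELY", "LIKELY"):
            continue
        # The filename follows the image id found in the image URL, e.g. <id>.jpg.
        matches = list(SOURCE_DIR.glob(f"{row['image_id']}.*"))
        if not matches:
            continue  # image not (or no longer) in the folder
        destination = TARGET_DIR / row["university"]
        destination.mkdir(parents=True, exist_ok=True)
        shutil.copy(matches[0], destination)  # copy, leaving the original folder intact
```

The resulting per-university folders can then be opened directly in ImageSorter for the qualitative reading described above.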
In this chapter, most of the results, analyses and challenges could not have been foreseen or included in the initial plans, demonstrating the open-endedness of digital methods. There are some practical aspects that can be learnt from this case study, as highlighted below: § It is necessary to avoid giving too much importance to quantifiable metrics (such as the number of followers, likes or posts). We should take advantage of what technological grammar can offer as units of knowledge for a quali-quanti appreciation of the dataset, also counting on a technical imagination for this task. § To take advantage of technological grammar as methodological language, practical skills are required, such as mastering basic Excel formulas, knowing how to handle data and using research software for data visualisation and analysis. These practices must go hand-in-hand with knowing how to think along with the relationship between software affordances, platforms’ culture of use and their technical grammatisation. § The analysis of a digital network is never restricted to its visualisation; the way nodes are connected and positioned can point to other levels of analysis contemplating content other than connections (e.g. moving from the network description and findings to the scrutiny of facial expressions to identify what brings joy to the universities’ communities). § Moreover, when reading digital networks through ForceAtlas2, the spatialisation provided by this force-directed layout may provide a narrative thread with fixed layers of interpretation (centre, mid-zone, periphery, isolated elements) that guide and facilitate reading. 132 Inspired by this and similar needs to search for images in a folder, recently, and together with Jason Chao, we created a simple command-line-based tool to help with that task, the Offline Image Query and Extraction Tool, https://github.com/jason-chao/offline-image-query. § When building a temporal image dataset, it is necessary to be attentive to possible changes of unique identifiers (a Facebook Page id can change over time, e.g. when pages are deactivated) as well as to the short life span of social media image URLs (images should be downloaded as soon as data collection is complete). § Literacy in using (research) software should go hand-in-hand with knowledge about the relationship between software affordances, platforms’ culture of use and their technical grammatisation. § In the analysis of data, and especially through a more qualitative look, the web should always be a source of consultation and analysis. § Digital methods require room for experimentation and time to deliver proper results. When using these methods, researchers should be open to new research questions that emerge in the exploration and visual analysis of the data. The analysis proposed in this chapter offers macro and micro perspectives on the Portuguese Universities on Facebook, moving from general to specific visions with both quantitative (a general overview of the networks) and qualitative (a look at specific content and actors in the network) analyses. Therefore, the chapter supports the argument about the dissolution of the quali/quanti divide by taking technological grammar into account (rethinking the conditions and means of knowledge production), mastering software practices (know-how) and being aware of the potential of computational mediums (in technical and practical terms). 5 DIGITAL NETWORKS133 CHAPTER 5 133 This chapter was originally published as: Omena, J. J. & Granado, A. (2020).
Call into the platform! Merging platform grammatisation and practical knowledge to study digital networks, Icono 14, 18 (1), 89-122. doi: 10.7195/ri14.v18i1.1436 The case of Portuguese Universities on Facebook Three visions of how to approach the digital have been shaping research in the field of Social Sciences and Communication. The mastering of online questionnaires, surveys and interviews to enquire into our digital life comprises the first. Although taken as key research methods, the proposal of migrating the social sciences instrumentarium online does not properly respond to the affordances of digital platforms and data (Rogers, 2015; Marres, 2017). A second vision corresponds to mixed methods or what Marres (2017) refers to as an affirmative approach to grasping the digital, that is, to treat “digital devices as an empirical resource for enquiry” (p.125) and also to affirm the role of bias in processes like issue formation. Despite being well suited to the online environment, this vision remains focused on the instrumental capacities of the digital, just like the first one. In both cases, the appreciation of technology is somehow truncated, and technologies are thus not seriously taken as “hybrid assemblages” (Latour, Jensen, Venturini, Grauwin, & Boullier, 2012). Conversely, this chapter foregrounds “the medium-specificity of social phenomena” (Marres, 2017, p. 117), when digital platforms are both object and method of study (Latour et al., 2012). We bypass the so-called digitisation of methods and opt for the deployment of online mechanisms, tools and data for conducting social or medium research (Rogers, 2013; 2015; Marres, 2017). This introduces the third vision for approaching the digital - from the inside out, or the incorporation of the methods of the medium to reimagine social and medium research (Rogers, 2019). In this line of thought, we regard web platforms as sociological machines (Marres, 2017) that are qualified by digital instruments for data capture, analysis and feedback. Consequently, with Marres (2017) and Rogers (2019), we consider digital infrastructures as promoters of methodological innovation. Drawing on the case of Portuguese Universities on Facebook, this chapter emphasises the importance of combining knowledge on platform grammatisation with data research practices (capture, mining, analysis and visualisation). It furthermore considers the notion of grammatisation (Gerlitz & Rieder, 2018; Omena, Rabello & Mintz, 2020) as a path to understand how social media delineate, (re)organise and structure online activity through software, for example, social media application programming interfaces (APIs). That is what we refer to as calling into the platform, which points to the functional understanding of web platforms as a fundamental basis for Digital Social Sciences and Communication. Thus, technical layers of knowledge about the platform itself are required, entangled with the digital methods perspective (Rieder, 2015; Rogers, 2019). The proposal of calling into the platform to study digital networks, therefore, envisions the infrastructural aspects of Facebook and its forms of grammatising online activity. One way of understanding Facebook grammatisation is to grasp how its Graph API delineates predefined forms of activities and their specific properties: from the existence of a Facebook Page and related metadata134 to very peculiar characteristics, for instance, what a given page “is” as informed by the “page category” (e.g. College & University).
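For illustration, a minimal sketch of such a Graph API call is given below. The token, page id, API version and continued availability of these public page fields are assumptions; the field names follow those used later in this chapter (category, fan_count, talking_about_count).

```python
import requests

# Hypothetical credentials and page id, for illustration only; since 2018 such
# calls require app review and an appropriate access token.
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"
PAGE_ID = "51541308379"  # one of the Facebook Page ids collected for this study (see Table 5.1)

fields = "name,category,fan_count,talking_about_count,about"
url = f"https://graph.facebook.com/v3.2/{PAGE_ID}"
response = requests.get(url, params={"fields": fields, "access_token": ACCESS_TOKEN})
page = response.json()

# The "category" field is the grammatised answer to what a page "is",
# e.g. "College & University".
print(page.get("category"), page.get("fan_count"), page.get("talking_about_count"))
```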
In terms of platform functionality and comprehension, we ask what one can learn from the connections between Facebook Pages (through likes) and from a list of (timeline) image URLs. We further raise questions on the affordances and limitations of Facebook as a source for digital Social Sciences research through the exercise of reading digital networks. What can be studied from single page like connections? What can we foresee from a historical dataset of images featured in Facebook Page timelines? How can platform grammatisation be reimagined to study the institutional communication of Portuguese Universities? To address these challenges, two distinctive networks135 will be explored, shedding light on the institutional connections and the visual culture of higher education in Portugal. These networks emerge from different situations: one afforded by the Facebook Graph API (the like network of Facebook Pages) and another afforded by digital data (the timeline image-label network136). The first (see Fig. 5.1) comprises all connections made by a given page; the act of liking other pages or being liked in return (a monopartite network). The second is built upon the affordances of computer vision APIs in describing large image datasets, and the advantages of software and data for building and plotting a network of images and their descriptive labels (a bipartite network). 134 E.g. Page description, post date of publication, reactions, shares, comments, posts per hour (post_activity), how users interact with or talk about a page (talking_about_count), the total number of likes a page has received (fan_count) and whether users can or cannot post on a Page (users_can_post), https://developers.facebook.com/docs/graph-api/reference/page/ 135 As socio-technical formations, digital networks offer ways of understanding social and cultural phenomena (including institutional communication). 136 For further details on image-label networks see the work of Ricci, Colombo, Meunier, & Brilli, 2017; Omena et al. 2017; and Mintz et al., 2019. Figure 5.1. The act of liking other pages in the Facebook end-user interface. Given this scenario, we take the visual affordances of networks and the relational nature of data as an analytical framework (Venturini, Jacomy, Bounegru, & Gray, 2018; Venturini, Jacomy, & Pereira, 2015; Venturini, Jacomy, & Jensen, 2019; Omena, Chao, Pilipets et al. 2019). Thus, looking beyond statistical metrics, we argue with Venturini and Rogers (2019) that digital social sciences research through web platforms should always be research about these platforms. That is to say: one cannot study society through a web platform without studying the platform itself. In what follows, we operationalise research about Facebook as a form of grasping the institutional connections and visual culture of Portuguese Universities through this platform. Material and Methods The material presented in this chapter is part of a larger research project about the Portuguese Universities on Facebook, which started in March 2017. Since then, and following the advantages of API research (Venturini & Rogers, 2019), the public page metadata of the 14 Portuguese Public Universities, as well as that of one private university (see Table 5.1), have been collected and archived through the application Netvizz137 (Rieder, 2013). 137 Netvizz is no longer available for research purposes; it stopped working on September 4, 2019.
The list of 15 universities complies with the Council of Deans of Portuguese Universities (CRUP), which lists all Portuguese Public Universities and the Portuguese Catholic University (the oldest private higher education institution in Portugal). CRUP represents more than 80 percent of all students enrolled at Portuguese Universities. The Facebook Page ID was the entry point for data collection, advanced by the Netvizz modules Page Like Network and Page Timeline Images. Table 5.1 shows the number of extracted pages for each like network (crawl depth 1), with data extraction in March 2019, and the number of all timeline images collected. This latter dataset is a substantial and representative sample because it brings together all images uploaded by the 15 Portuguese Universities, from the date of creation of each page to March 2018. [Table 5.1, garbled in extraction, lists for each of the 15 universities the page creation time (the earliest, Porto and Beira Interior, in 2009; the most recent, Açores and Madeira, in 2014), the Facebook Page ID, the number of pages extracted through the Page Like Network module and the number of images extracted through the Page Timeline Images module, for a total of 22,598 images.] Table 5.1. List of Portuguese Universities according to Facebook Page IDs, created time by Facebook Page Transparency, and the outputs of data extraction through Netvizz138. 138 In March 2017, the University of Coimbra had 363 timeline images registered, and in March 2018 we detected that the page had changed its Facebook id from 170094986358297 to 159654804074269. The 66 images collected therefore correspond to the latter page id, and all images were uploaded between 21 June 2013 and 13 March 2017. The research protocol diagram139 is explained as follows (see Figure 5.2). The entry points for data collection were page ids. Through calling the Facebook Graph API, Netvizz returns both .tab and .gdf files that contain metrics and data afforded by the Facebook Platform (e.g. the Reactions button, post published date, comments). Some of these metrics were attached to node attributes (e.g. page category, post activity, fan count, talking about count) or edge properties (the act of liking connects pages) within the like network, while others facilitated the process of building up an image network (e.g. a list of image URLs). For data analysis and visualisation, we used Gephi (Bastian, Heymann, & Jacomy, 2009) and Graph Recipes140, and for the image-label network we relied also on Python scripts: one for interfacing with Google’s Vision API141 and the other for plotting images into .svg files, the Image Network Plotter142.
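Because social media image URLs have a short life span (see the practical notes in the previous chapter introduction), the list of image URLs returned in the Netvizz files has to be resolved into local files soon after collection. A minimal sketch of that step, assuming a hypothetical tab-separated export with an image_url column (Netvizz’s actual column names may differ):

```python
import csv
from pathlib import Path
from urllib.parse import urlparse

import requests

INPUT_FILE = "page_timeline_images.tab"   # hypothetical Netvizz export
OUTPUT_DIR = Path("all_images")
OUTPUT_DIR.mkdir(exist_ok=True)

with open(INPUT_FILE, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        url = row["image_url"]
        # Keep the unique image id embedded in the URL path as the local filename.
        filename = Path(urlparse(url).path).name
        target = OUTPUT_DIR / filename
        if target.exists():
            continue  # already downloaded
        reply = requests.get(url, timeout=30)
        if reply.ok:
            target.write_bytes(reply.content)
```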
For the automated image content analysis, the option was Google's Vision API, due to its descriptive capacities, which tend to higher levels of specificity when labelling large image datasets in comparison to other vision APIs, such as IBM's or Microsoft's (Mintz et al., 2019). To analyse the 22,594 images generated by Portuguese Universities, two specific properties of Google's computer vision API were chosen: the description of images and the detection of face expressions, namely label and face detection. For the latter, RawGraphs (Mauri, Elli, Caviglia, Uboldi, & Azzi, 2017) and ImageSorter143 served as important tools to analyse, navigate and visualise the results related to face detection. For the scrutiny of the two distinctive digital networks (the page like and image-label networks), we relied mostly on visual network analysis (Venturini et al., 2015; 2019; Omena, Chao, Pilipets et al. 2019). This approach draws our attention to the visual affordances of the networks, rather than focusing only on statistical metrics. The position, size and colour of the nodes are fundamental aspects in this analytical process, as is the spatialisation of the network, here provided by ForceAtlas2. This force-directed algorithm, commonly used for studying networks that emerge from social media, supports the interpretation of data by creating a balanced state in the spatialisation of the network (Jacomy, Heymann, Venturini, & Bastian, 2014). Modularity calculation (Blondel, Guillaume, Lambiotte, & Lefebvre, 2008) was also used to identify clusters: the detection of institutional interests within the like network and of the different modes of visual representation within the image-label networks. Adding to that, we took into account a critical framework for reading digital networks (Omena & Amaral, 2019) which simultaneously reflects technical-practical knowledge on platform grammatisation, the narrative affordances of ForceAtlas2 and Gephi software. 139 Research protocol diagrams present the entire research process “in a compact visual form” (see the work of Niederer and Colombo 2019); they visually inform the research steps advanced by the digital methods approach. 140 https://medialab.github.io/graph-recipes/#!/upload 141 That corresponds to an expanded version of Memespector, originally developed in PHP by Bernhard Rieder, and later ported to Python and expanded by André Mintz. Available at: https://github.com/amintz/memespector-python. See the Google Vision API here: https://cloud.google.com/vision/. 142 https://github.com/amintz/image-network-plotter 143 https://visual-computing.com/project/imagesorter/ Figure 5.2. Research protocol diagram: combining the knowledge of platform grammatisation with the praxis of data capture and data analysis for studying Portuguese Universities' digital networks. Results Seeing beyond like connections The analysis of the Portuguese Universities page like network is organised according to an overview of the pages' profiles, institutional connections and the narrative thread afforded by the spatialisation of the network, looking at its whole and parts, as well as at the central and bridging nodes (Latour et al., 2012; Venturini et al., 2019; Omena & Amaral, 2019). The question of cluster formation is also addressed, moving to an in-depth analysis of page categories as a path to unveil the universities' specific (institutional) interests.
Starting with a general overview, the scatterplot below (see Figure 5.3) displays different Facebook metrics (variables) attached to a given university (nodes). Through the size of the nodes, we can see that Facebook's typical forms of measurement can be very contradictory if taken as analytical parameters. For instance, a high degree of post activity may not relate to users' engagement, the talking about count (see the universities of UTAD, Coimbra and Algarve). By the same token, a high fan count (see Aberta), a high degree of activity or a large number of fans does not necessarily indicate high levels of engagement. Additionally, we notice that more than half of the universities allow users to post (green nodes), but these publications are kept hidden from their Page timelines. Figure 5.3. An overview of Portuguese Universities according to the following Facebook metrics: post activity (posts per hour and based on the last 50 posts); talking about count (attention metric); users can post (whether a page allows users to publish posts on the page); fan count (number of likes a page has received). Scatterplot made in RawGraphs and edited in Inkscape. When moving towards the heat map of the Portuguese Universities page like network (see Figure 5.4), we first noticed that Media and News Companies (see the nodes around Público and Expresso), followed by the Calouste Gulbenkian Foundation, are at the heart of the interests of this network. A second aspect is the high density of connections that surrounds the UTAD and Algarve universities and, with a lower density, ISCTE, Porto, NOVA and Lisboa. A third aspect concerns Madeira University, which is detached from the main component within the network and, ironically, also geographically isolated: the university is located in Funchal, on the island of Madeira, part of the Madeira Archipelago. Following the positioning of the nodes, one final observation relates to the central role of Universia Portugal (an academic portal), the European Commission, Forum Estudante (an academic and professionally oriented magazine) and the Association of Portuguese Speaking Universities (AULP). The exercise of replacing edges with a density heat map provides a vision of the whole network and signals the matters of common concern among Portuguese Universities (media and news pages), showing agglomerations (clusters) and central actors. Despite recognising the value of the nodes positioned in the centre of the network (under the categories of Media/News Company and Newspapers), these were removed with the intention of improving the visualisation of clusters. This is illustrated by the emergence of three small clusters attached to the universities of Aveiro, Beira Interior and Minho (see Fig. 5.5). Furthermore, new central actors for Portuguese Universities were detected: the General Directorate of Portuguese Higher Education, the MIT Portugal Program (an international collaboration centre), FNAC Portugal and Futurália (the biggest education fair in Portugal). Subsequently, we read the spatialisation of the network through the visual affordances of ForceAtlas2144. In the centre of the network, we can see the highly connected pages that play a key role in the whole network, while influential actors and bridging nodes145 take part in the mid-zone, for instance, the Portuguese television and radio channels (e.g. RTP2, SIC TV, Rádio Comercial), with AULP and the job opportunities office as bridging nodes.
In the mid-zone, small clusters can be seen with connections mainly related to the Porto, Nova, Lisboa and Évora universities. 144 In practical terms, this force-directed layout provides a narrative thread that has fixed layers of interpretation but multiple forms of reading (see Omena & Amaral, 2019). 145 When a node connects nodes of different clusters. Figure 5.4. The heatmap of the Portuguese Universities page like network in March 2019. The map shows 44 visible nodes (out of 1,522) and contains 18,988 edges. Node size by indegree. Visualisation by Graph Recipes (Heatmap dessert) and Gephi. In the periphery (see Fig. 5.5), while the large holes denote fewer connections with the centre or mid-zone, the existing clusters correspond to the particular interests of Algarve, UTAD, Minho, Beira Interior and ISCTE: for instance, faculties, laboratories, student associations and schools. As an isolated element, Madeira only takes part in the network due to the likes the page has received from a medical service's and a high school's Facebook Pages. Madeira university itself has only connected to pages named “Portuguese in...” or “Portuguese immigrants in...”, evoking the presence of Portuguese people in European, South American and African cities or countries. These are somewhat unexpected connections for an official higher education institutional Facebook Page. Figure 5.5. Reading the page like network of Portuguese Universities according to the narrative thread afforded by the force-directed layout ForceAtlas2. The nodes categorised as Media/News Company and Newspapers were removed to highlight cluster formation and to perceive the universities' particular interests. 1,426 nodes, 16,125 edges. Nodes sized by degree. Through the observation of the shape of the edges in large and small clusters, a visual pattern emerges suggesting how clusters are formed. An outward movement from the central node to its neighbours alludes to the fact that cluster formation is substantially based on the act of liking (see Algarve, UTAD and ISCTE in Fig. 5.5). To confirm this visual hypothesis, we used degree centrality to analyse cluster formation within the page like network: by sizing nodes according to the number of connections made by a page (degree), and the total of likes a page received (indegree) or made (outdegree). Thus, we reimagined page likes (see Table 5.2) to define whether cluster formation is based on page activity (by means of liking pages), reciprocity (a balanced number of likes given vs. received), popularity (by means of receiving likes), or little reciprocity. In so doing, we avoid misinterpretations in the process of reading digital networks, such as taking larger clusters as more important or overlooking the role of hidden elements. Algarve has the largest cluster mainly because it has liked more than 500 pages. In contrast, the small cluster of Minho is based on reciprocal connections, while Porto seems to be popular among other institutional Facebook Pages. When accounting for how connections are made within a network (the relational nature of data), node size or visibility should only inform about the different characteristics that serve as a basis for understanding cluster formation and its narrative thread.
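A minimal sketch of this degree-based reading, assuming a hypothetical GEXF export of the directed page like network (an edge A → B meaning “A likes B”) with page names as node labels, and purely illustrative thresholds for distinguishing the formation types:

```python
import networkx as nx

# Hypothetical export of the page like network from Gephi or Netvizz.
graph = nx.read_gexf("page_like_network.gexf")  # assumed to be a directed graph

for name in ["Universidade do Algarve", "Universidade do Minho", "Universidade do Porto"]:
    indeg = graph.in_degree(name)    # likes received
    outdeg = graph.out_degree(name)  # likes given
    if outdeg > 2 * indeg:
        formation = "page activity (liking other pages)"
    elif indeg > 2 * outdeg:
        formation = "popularity (receiving likes)"
    else:
        formation = "reciprocity (balanced likes given and received)"
    print(name, indeg, outdeg, formation)
```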
[Table 5.2, garbled in extraction, lists for each university page its indegree (likes received) and outdegree (likes given to other pages) and the resulting basis of cluster formation: page activity (by means of liking pages, e.g. Algarve), popularity (by means of receiving likes, e.g. Porto), reciprocity (a balanced number of likes received and given, e.g. Minho) or little reciprocity.] Table 5.2: Analysing cluster formation and node size on the basis of indegree and outdegree. After understanding cluster formation, the Facebook parameter Page Category146 served as a way of identifying the interests of Portuguese Universities (see Fig. 5.6) within the network. Not surprisingly, College & University, School, Community, Non-profit Organization, Community College, High School, Government Organisation, and Media/News Company are the dominant categories within the network. Page Category also reveals multiple dimensions of sociality: media and communication (TV, radio, newspaper, website); culture (museums, library, art, musician/band); business (company, consulting or advertising or marketing agency, agriculture service); public services and health concerns (medical and health, hospital); entertainment (golf course and country club, cultural centre, movie theatre) and gastronomy (Thai restaurant, dessert shop); public figures and news personalities; and very specific interests like Barbershop, Car Dealership, Shopping Mall, Sports League and Events. However, when searching for politically related categories, civic engagement, social movements or causes, nothing was found. Figure 5.6. What are the dominant (highlighted in white) and ordinary (in colour) associations linked to Portuguese Universities? Treemap of Page Categories: size refers to the frequency of appearance. 146 Facebook used to have six broad categories of pages: Local business or place; Company, Organisation or Institution; Brand or product; Artist, Band or Public Figure; Entertainment; and Cause or Community. Each category had a long sub-category list. In July 2019, we verified that the platform now only offers two broad categories of pages: i) Business or Brand and ii) Community or Public Figure. These are also constituted by sub-categories but, presently, they are only visible when searched; see https://www.facebook.com/pages/creation/. Back to the mainstream categories, and driven by College & University and School, three groups that stand for the institutional profile and specific interests of each university were detected: 1. Schools from varied districts/regions of Portugal, with special attention to school groups (Algarve and UTAD), polytechnic institutes (Évora), and university departments or services (Minho). 2. Internal stakeholders: faculties, institutes, departments, research centres, and courses from each university. For instance, Coimbra, Lisboa and NOVA focus on their faculties or institutes, while Aveiro pays more attention to departments, Beira Interior to students’ nuclei from different courses, and Católica, including its branches in Porto and Braga, to courses. 3. International universities and Portuguese higher education147, exclusively represented by ISCTE – IUL and local learning centres.
Contrasting with this description, Aberta University is uniquely positioned, showing a combination of the three groups due to its balanced range of connectivity. Another affordance of looking at page categories is that of unfolding official pages that are set aside by the university, or the other way around. A good example is the lack of connection between Coimbra University and its Department of Physics, the Geophysical and Astronomic Laboratory, Rádio Universidade de Coimbra, and others148. Page categories additionally allow the discovery of pages oriented towards specific audiences, for example, Brazilian students149. 147 For instance, Münster University, University of Groningen, Universidad Carlos III de Madrid, UNED Universidad Nacional de Educación a Distancia, Universidade da Cidade de Macau, and Portuguese higher education: Universidade do Porto, Instituto Superior Técnico, Universidade da Beira Interior, Universidade Lusíada de Lisboa, Escola Superior de Comunicação Social, Universidade de Lisboa, Laboratório de Ciências da Comunicação, ISCTE Business School. 148 Turismos Universidade de Coimbra, Relações Internacionais Universidade de Coimbra - International Office, Sociologia, Universidade de Coimbra, Clube de Robótica da Universidade de Coimbra. 149 https://www.facebook.com/UniversidadeCoimbraBrasil/ The imagery of Portuguese Universities What can we foresee from a historical dataset of Facebook timeline images? How can the methods of the medium be repurposed to study the visuality of higher education in Portugal? As natively digital objects, online images have uniform resource locators (URLs) that embed a unique identifier (id) of a given image, such as https://scontent.flis91.fna.fbcdn.net/v/t1.09/1937274_144956263379_7335679_n.jpg?_nc_cat=108&_nc_ht=scontent.flis91.fna&oh=2a719e0099e71e-4049305a22ea628887&oe=5D536D7D. By taking advantage of the image URLs afforded by the Facebook Graph API and feeding these into the computer vision services of the Google Cloud API, we were able to describe and interpret the imagery of Portuguese Universities. A total of 22,594 images were plotted in a bipartite network, in which nodes are images (90.62%) and labels (9.38%), while edges (161,474 in total) show the connections made by the labels that describe one or more images. The co-occurrence of similar descriptive visual content (labels) informs the position of the images within the network. In alignment with Rose's (2016) proposal for interpreting visual material, the first level of analysis enquires about the site of the image itself. The multi-sited composition and meanings indicate different formations of institutional interests that are invested in the visuality of the page timelines. What visualities describe this culture? What can the dominant visualities and their inherent meanings tell us? What is not there (and why)? Through the computer vision API-based network, in Figure 5.7, we may see the plot of all images featured in the timelines of the 15 Portuguese Universities' Facebook Pages, spatialised according to correlated labels and, at the bottom, the analytical exercise of relabelling the machine vision by categorising clusters. Next, we will show how the scrutiny of timeline images can provide general and specific insights on the visual culture of Portuguese Universities over the years, more specifically from each university's page creation date to March 2018.
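Before turning to the figure, a minimal, simplified sketch of how such an image-label bipartite network can be assembled for Gephi is given below; it stands in for the Memespector and Image Network Plotter scripts cited above and assumes a hypothetical CSV with one row per image and a semicolon-separated list of its Vision API labels.

```python
import csv
import networkx as nx

# Hypothetical input, e.g.:
# image_id,labels
# 1937274_144956263379_7335679_n,"Crowd;Auditorium;Event"
graph = nx.Graph()

with open("image_labels.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        image_node = f"img_{row['image_id']}"
        graph.add_node(image_node, type="image")
        for label in (part.strip() for part in row["labels"].split(";")):
            graph.add_node(label, type="label")
            # An edge links an image to each label that describes it; images sharing
            # labels are then pulled together by ForceAtlas2 in Gephi.
            graph.add_edge(image_node, label)

nx.write_gexf(graph, "image_label_network.gexf")  # to be opened and spatialised in Gephi
```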
In Figure 5.7, at the top, we see quite a homogeneous network, except for the strong concentration of images in dark hues on the right side and the colourful agglomeration at the top. These clusters represent the most common visual representations associated with Portuguese Universities' official Facebook Pages: Portugal sentado (Portugal while seated) and posters. The former is named after a Portuguese journalistic expression that relates to the (bad) habit of publishing photos of people seated in press conferences, auditoriums, all kinds of meetings, parliament, scientific conferences, or even sports events; in other words, something to be avoided in the newspapers. Posters depict all sorts of written content: screenshots of news, institutional newsletters, banners to promote academic events or to celebrate commemorative dates. In the peripheral zone of the image-label network, we find the formation of discrete clusters that point to different visualities, with a more detailed classification (labels) describing the images, whereas a more generic labelling takes place in the centre of the network. After analysing the different regions of the network, ten clusters were detected besides the two already mentioned (see Fig. 5.7): people in academic events, people in outdoor/indoor events, school buildings, head shot pictures, sports, musical performance, sky and grass, animals, labs, and history in black and white. In a general view (see Fig. 5.8), the visual depiction of Portuguese Universities mainly outlines the tedious pictures of audiences (sitting in an auditorium) or keynote speakers at academic conferences. Added to that are pre- and post-conference conversations, organisers or participants posing for pictures, posters, presentations, institutional partnerships (e.g. shaking hands or signing contracts) and prize winners. Besides Portugal sentado, and nearly in the centre of the network (see Fig. 5.7), there are two other clusters whose main focus is people in academic events (outdoor and indoor activities), small and big groups chatting or posing for a photo, and professors or students being interviewed. Figure 5.7. The imagery of Portuguese Universities on Facebook from 2009 to 2018. Another strong visual identity is the graphical depiction of posters and banners with the most diverse types of announcements: beyond conference, symposium, workshop, sports or new-student-related banners, there are also the number of likes achieved by a page (fan count) and celebrations of the ‘international day of’ (e.g. water, statistics). A very particular visuality is also uncovered, specifically news clippings, which indicate how much Portuguese Universities value it when mainstream or traditional Portuguese newspapers headline or mention academic research, professors or students. In this scenario, and between 2009 and 2018, we may infer that the dominant visualities of Portuguese Universities conform to people attending events and institutional posters. The architecture of school buildings earned a place of honour, in particular the overwhelming presence of the Institute of Social Sciences (ICS, Minho University; see Fig. 5.8). We also visualise head shot pictures of local, national and international researchers (and perhaps a small number of students), and a full cluster dedicated to university sports (see Fig. 5.8).
From collective (including wheelchair categories) to individual modalities, the rule seems to be the depiction of victorious teams: images of teams celebrating the victory, athletes on the podium, holding medals or a trophy. At the bottom of the network sits the musical performance cluster (see Fig. 5.7), composed of cultural events in the shape of concerts, choirs, orchestras and musicians with their instruments. There is also a group of images brought together because sky and grass are their main visual composition. The ordinary visual content (less substantial in numbers) brings wild and domestic animals such as birds, cows, owls, tigers, monkeys, dogs and cats; stereotypical images of labs, namely researchers working with a telescope; and people who made history, in black and white pictures. The latter basically comprises historical photos published by Porto. The visual description and historical perspective of Portuguese Universities seem to lack, however, the everyday life of students and non-stereotypical imagery of scientific research, while overvaluing academic events and institutional communication through banners and news clippings. Figure 5.8. The visual history of Portuguese Universities from 2009 to 2018: image tree map based on the clusters detected in the image-label network (Portugal sentado, posters, people in academic events, people in outdoor/indoor events, school buildings, head shot pictures, sports, musical performance, sky and grass, animals, labs, and history in black and white). After having gained a general (but also detailed) perspective on the imagery of Portuguese Universities on Facebook, in the second level of analysis we questioned the visual choices and patterns attached to each university. Fifteen bipartite image-label networks were arranged in a grid to respond to this question (Fig. 5.9). The focus of our analytical observations here is the presence or absence of colours, which indicates different visual patterns and particularities. In the grid, each university's image-label network is accompanied by its number of images and page creation time. These individual characteristics help to situate the different networks. The vision API-based network grid provides not only an innovative technique to approach Portuguese Universities' timeline images but also their visual history on Facebook. Figure 5.9. Vision API-based network grid. Seeing the visual patterns of a given Portuguese University from its page creation date to March 2018. As previously described, pictures of sitting audiences, posters and people in academic conferences correspond to the dominant imagery of the majority of Portuguese Universities on Facebook – with the exception of Coimbra University. In contrast to this portrayal, the practice of sports and musical performance appear to have little visual space among universities, at least when compared with the main image categories. For instance, there is a minor representation in Açores, Lisboa and Católica and no visual mention in Madeira, while Aberta University does not feature sports at all. Particular characteristics can be seen through the animals cluster, which is almost exclusive to UTAD but is also represented in Porto and, to a lesser extent, in Aveiro. The head shot pictures seem to please all universities, except in the cases of Açores, Madeira and Coimbra, which do not invest in this style.
The vision API-based network grid (see Fig. 5.9) offers an effective and direct way of reading the choices and patterns that constitute the imagery of Portuguese Universities on Facebook. Such a technique can be replicated in further similar studies. Since people are at the core of visual communication, the last level of analysis took advantage of the face detection module made available by the Google Vision API. The main objective was to repurpose machine vision to assess the mood of Portuguese higher education. Results demonstrate an overwhelming depiction of happiness over the years, but also detect other face expressions such as surprise, sorrow and anger (see Fig. 5.10). In terms of consistency, and considering page creation dates, Porto, Minho, Aveiro, ISCTE and Madeira have recurrently been publishing images that contain joy on their respective Facebook Pages. In terms of volume and duration, by contrast, Trás-os-Montes e Alto Douro (775 images), Açores (603 images), Évora (447 images) and Beira Interior (285 images) stand out as the Facebook Pages with the most images that are very likely or likely to express joy. But what are those images related to? What would be the motive for great happiness or pleasure? After plotting the images separately (see appendices), it was possible to uncover the different motives that drive the visual imagery of UTAD, Açores, Évora and Beira Interior. The happiness in UTAD mainly features students, staff or professors posing for a picture at various types of academic events, as well as the depiction of smiling speakers and audiences. There is also joy in head shot pictures, news clippings, people drinking wine, a few selfies, sports, tech-related pictures, and the inauguration of new facilities. In a similar spirit, the visuality of Beira Interior has a great focus on students participating in outdoor and indoor events, although it also brings university staff and board members to these events, some selfies, head shot pictures and the registering of award ceremonies. The smiling audience and the act of posing for pictures at academic events or official ceremonies are also key visual characteristics in Évora, in addition to the overwhelming presence of students. Other types of events prompt the happy visuality of Açores: official ceremonies to award the best graduating students (both in an auditorium and in the rectory building) and events where the faculty staff and board members get together (e.g. in the rectory building, an outdoor picnic). Photos of students in classrooms, in the rectory building or at outdoor events are also common. Figure 5.10. Computer vision to examine the mood of Portuguese Universities over the years according to the face detection module of the Google Vision API. Four main expressions were detected as very likely or likely to appear in the universities' Facebook timeline images: joy, surprise, sorrow and anger. In addition to the academic events that bring together students, professors, staff and board members, there is a common reason that brings joy to the higher-educational environment: the visit of the president of Portugal, Marcelo Rebelo de Sousa. His presence is synonymous with a series of pictures in the Facebook timeline images of the UTAD, Açores, Évora and Beira Interior universities. The International Exchange Erasmus Student Network also connects to the joy of Évora and Beira Interior, and although sporting achievements are few in number, these images often connect to happiness. When further analysing the other face expressions, the results supplied by computer vision can be deceptive.
The Facebook timeline images that contain surprise (64 images) do not exactly depict faces showing that something unexpected may have happened. Rather, we see face expressions that may be tricky for machine learning algorithms, such as raised eyebrows or opening and closing the mouth when talking or giving a presentation. Although useless for detecting real surprised expressions, the results provided by machine vision demonstrate a type of visuality that should perhaps be avoided in institutional communication, since it does not provide a pleasant visual identity (e.g. a, b, c, d, e). The sorrow-related images (a total of 14) are mainly groups of students in different situations, such as: in learning environments, appearing to be concentrated and sitting in a small auditorium or standing in a laboratory paying attention to the professor (e.g. in Açores and in UTAD); in academic events (e.g. 1 and 2 from UTAD, and e.g. 3 from Aveiro); or students being interviewed by a television reporter (Algarve). Institutional and academic events also account for a few sorrow images, for instance university staff involved in organising or planning activities in the course of an educational event (Beira Interior), a picture zooming in to focus on the audience sitting in an auditorium (Açores), and a presidential visit in which the president of Portugal, Marcelo Rebelo de Sousa, is entering the university facilities accompanied by the dean of Évora University (Ana Costa Freitas) and a few board members. There is also the case of detecting sorrow faces in banners: one image promoting an exhibition about the miracles of Nossa Senhora de Fátima, constituted by an old black-and-white photo of three children staring at the camera looking sad (UTAD), and another with Alfred Hitchcock looking down at another filmmaker (Évora). The five images with anger faces derive from the practice of sports, except for the one promoting a Masters in Theatre (Évora University in 2017). The other cases bring winning athletes, such as João Paulo Fernandes taking second place at the London Paralympic Games 2012 (Aveiro); the celebration of Porto's female rugby team for achieving third place in the European University Games 2014 (Porto); Minho's female five-a-side football athletes in the middle of the court celebrating (probably the victory) in 2017 (Minho); or the applications for EA Campus 2014 (ISCTE). The thick description of face detection results, in particular surprise, sorrow and anger, intentionally calls our attention to the lack of precision and the limitations of computer vision APIs. Discussion The study of Portuguese Universities' Facebook Pages discloses some practical and institutional implications for both research and communication practices. For the latter, we should consider that the use of social media for institutional communication is indeed a very recent activity: Facebook is a teenager (15 years old) and the oldest Portuguese Universities' pages on this platform were created in 2009 (Porto and Beira Interior). For the last ten years, Portuguese Universities have been learning how to communicate using Facebook while they use it and rely on its affordances. However, it also becomes clear that Portuguese Universities are using Facebook as just another platform to “shout out” their activities. At the same time, most universities do not seem to be spending enough time building their social media networks or giving attention to some aspects of their missions.
Research activities or accomplishments, for instance, are very seldom referred to, which means that science communication is practically nonexistent. Connection with the outside world is also neglected. When looking at the dominant images shared by Portuguese Universities, we find mostly “Portugal Sentado” (people seated while listening to conferences) and conference posters, showing very little imagination and a somewhat careless attitude towards social media best practices and potentialities for communication. By using the same type of photos time and time again, universities are perpetuating the stigma of a boring academic environment. The “institutional” visuality of academia reproduced by Portuguese Universities fails to take advantage of the attention economy of Facebook. In other words: no “clickworthy” and “shareable” content. From a research perspective, the innovative approach of vision API-based networks sheds light on how the universities in Portugal make use of, manage and give priority to their visual content over the years. The digital visual methods adopted here come along with thick descriptions, technical knowledge and practical expertise, highlighting the need to question the methods of the medium critically. That is the case when recognising the lack of precision and the limitations of computer vision in the analytical process, or when being aware of the problems with web data (knowing how to deal with them!). Our proposal of calling into the platform is an invitation to conduct communication and social research through the lens of medium-specificity. Following Latour's oligoptic vision of society, it derives from explorations in the context of device-aware sociology (Marres, 2017) and technical-practical fieldwork. This reality challenges new media researchers to take advantage of the intrinsic properties and dynamic nature of digital platforms, and here lies the main contribution of this chapter: a practical walk through the possibilities of repurposing the technicity of networks for studying societal (institutional) phenomena on Facebook; a call for another culture of asking research questions and designing research; a call for embracing an open-ended process in which new questions are always welcome. Introduction to chapter six This chapter was originally published as: Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H., Pilipets, E., & Azhar, H. (2020). APIs de Visão Computacional: investigando mediações algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1). doi: https://doi.org/10.12957/logos.2020.51523 The data sprint report of the article is available from https://smart.inovamedialab.org/pasteditions/smart-2019/project-reports/interrogating-vision-apis/ This chapter investigates the image classification capacities of different computer vision APIs (Google, Microsoft, IBM) while asking how different nationalities (Australian, Brazilian, Nigerian, Portuguese) are represented by stock image websites (Shutterstock and Adobe Stock). I contributed to this chapter by suggesting the angle of research and proposing ways to interpret the image-label networks. Together with André Mintz and Beatrice Gobbo, I helped to develop and operationalise the research design and methods. The chapter illustrates a situation in which researchers are familiar with and care for the technicity of computational mediums (platforms and also research software) and presents a creative visual method (e.g.
comparative matrixes of vision API outputs through image-label networks). It also serves as an example of when and how the technicity of the mediums is misaligned with the research objective, which particularly highlights our lack of knowledge about how machine learning models deliver labels to classify objects and scenes in an image. That said, I want to first describe the impact of such a lack of knowledge on how we think and develop ideas about a specific issue. After that, I discuss potential visual methodologies to interrogate and compare the labelling capacities of three computer vision APIs. What were the research questions and why should they not have been asked? This case study is also interesting because we started off on the wrong foot, especially considering two of our initial research questions: § Can we investigate national representations through computer vision tools? § How are cultural specificities made visible by computer vision APIs? These questions led us to seek national representations and cultural specificities through the labels provided by vision APIs to classify the material content within an image, which brought us to wrong conclusions; for example, to assume that computer vision has a limited scope for image classification because it lacks the capacity to detect cultural, racial and gender specificities (e.g. food, music, dance, customs related to a country). The article thus affirms the bias reproduced by proprietary vision APIs, stating that this is linked to the geopolitical position of these companies. And, even more, it insinuates that the outputs have racial bias. Although we did indeed manage to get a sense of how different nationalities are represented on stock image websites by describing the image clusters common and unique to each nationality studied (Australian, Brazilian, Nigerian, Portuguese), we were unable to capture cultural specificities. My point here is that computer vision image classification provides general and very specific labels afforded by machine learning models (see the example below). Textual descriptions are assigned to an image (e.g. labels or tags such as food), and these are always accompanied by high or low confidence scores (from 0 to 1) and ranked by topicality rating. This informs us about the probability of the textual descriptions assigned to an image and follows a hierarchical way of classifying what is in an image. The labelling characteristics and potentialities of computer vision allow different modes of understanding visual content, but not exactly the identification of cultural specificities (which happens not through labels but through our interpretation of image clusters). This is to say that the two research questions emphasised here should be dropped or at least rethought. If we had to ask about cultural specificities, we could have taken another route, such as opting for the detection of web entities (see below). Google Vision Labels: Food (0.9772421); Ingredient (0.8929239); Fruit (0.8815268); Staple food (0.86995584); Recipe (0.8641755); Dish (0.85230696); Cuisine (0.8481042); Yellow (0.8465541); Produce (0.7728272); Baked goods (0.7722937) Google Vision Web Entities: Lisbon (1.0512); Egg tart (1.02405); Puff pastry (0.92190003); Pastel (0.91665006); Tart (0.8598); Custard (0.80205); Pastel de nata (0.7354); Dessert (0.7092); Custard tart (0.7072); Cream (0.6104) But what can we learn from this?
Well, I would say that, in the same way that the practice of collecting and visualising data can be overly addictive in the first contact with digital methods, researchers are susceptible to technological innovation, falling into the temptation to make use of what is new (vision APIs) and apparently easy (e.g. YTDT combined with Gephi) and trying things out without quite knowing where they are going or what they are asking. One’s ability to use software (or even expertise in it) does not necessarily mean being aware of the technicity of the medium; the same can be said of one’s skill in generating data visualisations or code. Our knowledge of image networks was solid, but we had a poor understanding of the functioning of image classification by computer vision, and because of that our results were flawed. As argued in chapter 2, we should distinguish computational mediums at different levels (e.g. individuals and elements), taking special care with the “elements” (e.g. algorithmic techniques) as they are carriers of meaning. Here we lacked the ability to understand the practical qualities of image classification. Digital methods protocols are only as robust as the weakest link of their methodological chain. Our tendency to frame (or blame) algorithmic techniques as either racist or culturally ignorant agents portrays a common (and perhaps not always conscious) attitude towards technical objects: the tendency to judge them before getting to know them. We learnt that the use of computer vision APIs must be accompanied by at least a minimum knowledge of their features and of the technical element in use, such as image classification. Otherwise, the research questions and design can be misguided, leading to erroneous assumptions.

How did the practice of methods guide us to new ways of asking questions and solving problems?

From a medium research perspective, the chapter asks whether there are differences between computer vision API providers, particularly in their ability to classify images. For this purpose, the article proposes promising visual methodologies to both interrogate the computer vision APIs’ outputs and compare their descriptive capacities by comparing different image-label networks in matrixes. First, by taking advantage of the spatialisation logic of ForceAtlas2, we grasp the mode of spatial distribution of images and labels through a visualisation displaying the networks in shades of grey. Second, we categorise image clusters according to their dominant visual typology and use colours to emphasise the common and unique clusters between the three APIs. In this way, from the centre to the periphery of the networks, we were able to understand the generality and specificity of the labels applied by each vision API and the variety of objects defined by the scope of the dataset and their topical specificity. In this chapter, digital networks are used as research devices for reading technical content (the range and specificity of labelling across APIs) and the content of the images in a comparative way (by attributing colours to common and unique image clusters).
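To make the distinction between generic labels and web entities (discussed above) more tangible, the short sketch below shows how both feature types can be requested for a single image through Google’s Vision REST endpoint, and how their confidence scores can be read. It is a minimal illustration under stated assumptions rather than the protocol actually used in the chapter (which relied on Memespector-type scripts, described later): the image URL is hypothetical, the API key is assumed to be available as an environment variable, and the endpoint and response fields follow Google’s public documentation at the time of writing.

```python
# Minimal sketch: requesting generic labels and web entities for one image from the
# Google Cloud Vision REST API. Assumes a valid API key in the GOOGLE_VISION_KEY
# environment variable; endpoint and field names follow Google's public documentation
# at the time of writing and may change.
import os
import requests

API_KEY = os.environ["GOOGLE_VISION_KEY"]
ENDPOINT = f"https://vision.googleapis.com/v1/images:annotate?key={API_KEY}"

image_url = "https://example.org/pastel-de-nata.jpg"  # hypothetical image URL

payload = {
    "requests": [{
        "image": {"source": {"imageUri": image_url}},
        "features": [
            {"type": "LABEL_DETECTION", "maxResults": 10},  # generic labels with scores
            {"type": "WEB_DETECTION", "maxResults": 10},    # web entities (more specific)
        ],
    }]
}

response = requests.post(ENDPOINT, json=payload).json()["responses"][0]

# Generic labels: broad material descriptions such as "Food" or "Dish".
for label in response.get("labelAnnotations", []):
    print("label:", label["description"], round(label["score"], 3))

# Web entities: names learnt from web co-occurrence, e.g. "Pastel de nata" or "Lisbon".
for entity in response.get("webDetection", {}).get("webEntities", []):
    print("web entity:", entity.get("description", "?"), entity.get("score"))
```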
This chapter thus feeds into the main arguments of this dissertation, by presenting a case study that considers computational mediums from a conceptual-technical-practical perspective, by introducing a medium research perspective to get to know the potentials of computer vision (through the comparative analysis of image classification across different vision APIs) and innovative visual methodologies using comparative matrixes of image-label networks. The chapter is also valuable as a description of a failure: showing how seriously the lack of basic knowledge about computer vision features can harm research results.

CHAPTER 6
INTERROGATING COMPUTER VISION APIS150

150 This chapter was originally published as: Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H., Pilipets, E., & Azhar, H. (2020). APIs de Visão Computacional: investigando mediações algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1). doi:https://doi.org/10.12957/logos.2020.51523

Introduction

The current internet and communication scenery is characterised by a duality between the multiplication of media technologies, present in the field of audiences as well as of production (Napoli, 2008), and the growing and controversial platformisation of the web and, consequently, the economy of digital attention. If the idea of liberating the transmitting pole, or the relative decentralisation of content production, predominated during the dissemination of information and communication technologies (Castells, 2002; Lemos, 2002), today the commercial panorama demonstrates a considerable concentration of the digital in a few companies. The acronym GAFAM refers to the corporations Google, Amazon, Facebook, Apple and Microsoft and to an oligopoly whose power is not only commercial, but also concerns the economy of attention and even the interpretation of social reality, since these companies concentrate a good part of the data generated daily by people and companies, as well as the technical resources to interpret them. Helmond (2015) observed the platformisation of the web by social media organisations as an effective effort to manage data flows and web economics, creating and reconfiguring value from digital traces and data. This trend promoted integration and inter-programmability between web environments in a hierarchical manner, so that these environments cannot be seen in isolation, deepening the inadequacy of traditional research methods for their understanding (Pearce et al., 2018). Platforms such as Facebook, for example, expanded into other digital environments by providing easy authentication features or distributed comments. Amazon, in turn, was one of the first e-commerce cases to apply the marketplace idea: the site becomes an intermediary between consumers and retailers, of various scales, who need the website to access their customers. But Facebook is much more than a social networking site and Amazon is much more than an e-commerce site. These organisations have become conglomerates in competition for cutting-edge areas of technology, such as robotics and artificial intelligence (AI), driven by the data that passes through their servers in the transactions and communications they host.
Srnicek (2017) explains the advantage of these corporations over traditional business models, since platforms are positioned between users and suppliers and, at the same time, dominate the space where transactions take place, having privileged access not only to data from the companies 207 that use them, but also to exclusive data that arises from the privileged position of having information from several competitors. In this scenario, AI platforms emerge both from these large new corporations in the digital market and from companies that were born from the traditional computing market (such as Microsoft and IBM). Among the different services offered, such as natural language processing and recommendations based on consumption patterns, is the computational interpretation of images, one of the AI frontiers and a fundamental demand of the contemporary media setting. In social media, for example, images have become increasingly central to communication. In this respect, the literature in the field refers to a "visual turnaround" of digital platforms (Faulkner, Vis & D'orazio, 2018), whose publications are more and more visual, including with social media specifically focused on this type of content (such as Instagram, Snapchat and Tiktok). This development, however, has not been accompanied in a proportional way by academic research, which still encounters difficulties in investigating this reality. Research focused on images has typically been focused on small qualitative studies (Laestadius, 2017), but the study of visual data in social media is moving considerably towards multidimensional approaches that see "qualitative data on a quantitative scale" (D'Orazio, 2014), due to the volume, the characteristics of circulation and its complexities. Studying images in social media is especially challenging due to the complexity of computational processing of visual data - to which the sub-discipline of Computer Science called Computer Vision is dedicated. However, this is a growing demand. Some social media platforms, for example, process all images published on their site through computer vision, for purposes such as moderation, ad segmentation and accessibility for people with visual impairments. In a sense, much of the work of the academic researcher interested in understanding the circulation of images may be accomplished and expanded through the use of automatic tagging of the semantic content in the images, available in a "packaged" form in programming libraries or computer vision providers. Most large digital media and technology companies have launched their own AI platforms serving these purposes, such as IBM Watson, Amazon Web Services, Microsoft Azure and Google Cloud Platform. However, the knowledge about these platforms and their mode of operation is largely unexplored, which demands reflection not only on their potential 208 for the applications in research, but also to investigate how they interpret the processed visual data. Therefore, the purpose of this chapter is to pursue this dual task of exploring the analytical possibilities of computer vision APIs - such as those mentioned above while at the same time questioning the very constitution of these platforms151. 
In this approach, the study is aligned with the "digital methods", according to the proposal of Richard Rogers (2013, 2015), which is defined, among other aspects, by the critical and reflective repurposing of operative instances of the web (native objects of the digital) and the data generated by the platforms in a social research sensitive to the media (cf. Omena, 2019). In particular, we examine the biases inherent in computer vision APIs, which will be evaluated in analytical and investigative practice. For such a framework, the study also mobilises elements of decolonial-based technological critique that emphasises that technologies are not just neutral things but incorporated cultural artefacts of power and representation relations, as mediating artefacts (Haas, 2003). Computer Vision and the study of images In the field of computer science, the sub-discipline specifically focused on the computational interpretation of visual data is called Computer Vision. Historically, it is one of the first problems proposed, in the 1950s, for the development of AI, then, as a derivation of cybernetics (Cardon, Cointet & Mazières, 2018). Among its first developments are programs for the computational reconstitution of three-dimensional spaces and objects from photographic images (Manovich, 1993; Roberts, 1963). With the elaboration of algorithmic models for image interpretation, computer vision allows the incorporation of photographs and videos - among other types of recording - as data input for robot navigation, forensic science and information management systems beyond, of course, war and surveillance applications. 151 This article derives from a collective research within the SMART Datasprint 2019, an event organised by the iNOVA Media Lab, da Universidade Nova de Lisboa. The research report is available at: https://smart.inovamedialab.org/smart-2019/project-reports/interrogating-vision-apis/. A version of this study was also presented at the National Symposium on Science, Technology and Society, organised by ESOCITE (Brazilian Association of Social Studies of Science and Technology) in Belo Horizonte, Brazil. 209 Importantly, however, the fundamental problem of computer vision corresponds to what, in the jargon of the area, is referred to as a "misplaced problem", that is: a problem for which it is impossible to achieve a single or optimal solution, but only probabilistic approaches (Smeulders et al., 2000). Far from being definitive, any computational interpretation of an image will always be an interpretation that will inform both about the analysed image and - fundamentally - about the program that produced it. In the context of platformisation, especially with the increasing centrality of visual content in the use practices of the platforms, image recognition programs from computer vision play a fundamental role. Given the constituent character of datafication and algorithmic mediations in the very definition of social media platforms (D'Andréa, 2018; van Dijck, 2013), image recognition programs allow the integration of visual content with the operation of platforms. Also, it is precisely the massive availability of these contents that enables the contemporary development of computer vision under the paradigm of machine learning (Alpaydin, 2016). This paradigm is characterised by the inductive nature of its operation, in which the algorithm is not explicitly elaborated in the program, but rather inferred by the program itself from a large volume of training data (Mackenzie, 2017). 
However, several problems emerge from this relatively recent configuration. A typical counter argument addressed to the advocates of the machine learning paradigm concerns the unintelligibility of the decision system produced (Cardon et al., 2018). After the training process, the resulting models offer probabilistic predictions based on the training data, but it becomes difficult to measure and (even more) intervene in the system architecture to correct eventual impertinences and biases. According to Cardon et al., this problem of unintelligibility was counterbalanced in the scientific controversy surrounding machine learning by the relative effectiveness then verified in system applications. This was due to the volume of data used in training which, in theory, would lead to a reduction in errors and biases. However, as much contemporary criticism has pointed out, the quantitative volume of data does not necessarily reflect its quality to infer general models. Data collected from the web - "in the wild" - tend to reproduce invisibility and oppression schemes that, due to their systemic character, are not compensated by the large volume of data used in training. On the contrary, it 210 can be seen that the models inferred from this data reify and, in doing so, reinforce the dynamics of invisibilization and marginalization. These programs have also become research tools and objects in humanities and social sciences, in applications with variable critical degree. Recent works have mobilised them as tools to understand cultural and behavioral trends in large databases, and also to map differences between image recognition systems, such as biases and preconceptions crystallised in databases, models and algorithms. Considering the first case, examples include semantic mapping of cities (Ricci et al., 2017; Rykov et al., 2016); comparative studies of selfie cultures (Tifentale & Manovich, 2015); visual persuasion analysis (Hussain, 2017; Joo et al., 2014); description of home styles in political communication (Anastasopoulous et al, 2016); classification of political propaganda (Qi et. al, 2016); the study of online political engagement modes (Omena, Rabeloo & Mintz, 2017); as well as image circulation dynamics (Omena et al, 2019; D'Andrea & Mintz, 2019; Silva, Barciela & Meirelles, 2018). For the study of the biases and differences between machine learning systems, approaches focus on their gender (Hendricks et al., 2018) and race (Buolamwini, 2017) implications. Some of these studies have resulted in impact information such as good practices guides (Osoba & Welser IV, 2017) or a proposal for carefully designed machine learning databases (Buolamwini & Gebru, 2018). Computer Vision APIs: What are they? Application Programming Interface (API) describes a way of structuring computer programs that allows their interoperability with other systems. By means of an API, a computer program can be designed to package certain functions and data resources so that they can be accessed by an external program. In the context of distributed computing and web platforming, APIs enable public or commercial availability of computational services and data. In practical terms, it means that thousands of individuals or companies can automatically and in a standardized manner, usually on demand based payment, use the digital services and data of a supplier company. 
In the area of providing Computer Vision resources, companies such as IBM, Microsoft, Google, Amazon and Clarifai stand out, as well as niche providers such as TinEye and Kairos. Among the functions performed are, for example, image classification, facial recognition and optical character recognition. Made available in recent years by large Silicon Valley companies, these APIs now constitute a platform model of computer vision that has spread to applications in various areas - including research on social media platforms152. To illustrate a typical feature of these tools, we can see in Figure 6.1 a demonstration of the IBM Watson Visual Recognition feature. In the pre-trained vendor model, the system can identify objects and categories such as fabric, the colour gray, fabric types and other entities, with diverse degrees of confidence. With tens of thousands of pre-trained labels and classes, the feature can be applied immediately by developers for different purposes. Among the case studies provided by IBM is the use of the feature to quickly identify types of vehicle damage, for example.

Figure 6.1. Screenshot of the Watson Visual Recognition demo on the company's official website.

The computer vision APIs call these identified entities labels, tags or classes, depending on each provider's terminology. But this is only one of the groups of resources offered. The options available are growing; nevertheless, we can highlight five other groups: a) automated recognition of text in images, turning a visual resource into a textual one; b) resources linked to the identification of people and their characteristics, such as age, gender and facial expressions; c) discovery of equivalent or similar images on the web, as well as extraction of related information from the semantic web; d) vertical models, such as recognition of celebrities, tourist spots or types of food; and e) detection of explicit content, such as violence and pornography, applied for moderation and content filtering.

152 Alongside these commercial APIs, it is worth highlighting open-code and public models of image recognition, such as those made available by the Keras project (Chollet et al., 2015). This one in particular facilitates the development of applications with restricted image recognition models based on the typology of the ImageNet database (Deng et al., 2009), with 1000 pre-trained image categories.

In Table 6.1 we summarise the main resources explicitly made available by the Google, IBM and Microsoft API providers, with which we engage more directly in this study. As may be seen below, some features are unique - Google, for example, integrates its resources with search engines, allowing both reverse search of the image on the web and the extraction of information linked to these images by scanning the websites where they are used.

Feature                        Google   IBM   Microsoft
Labels / Tags / Classes        Yes      Yes   Yes
Semantic Web entities          Yes      No    No
Food classes                   No       Yes   No
Automatic captions             No       No    Yes
Explicit content detection     Yes      Yes   Yes
Face detection                 Yes      No    Yes
Facial expressions             Yes      No    No
Celebrities                    No       No    Yes
Touristic spots / Locations    Yes      No    Yes
Gender                         No       Yes   Yes
Age                            No       Yes   Yes
Text recognition               Yes      No    Yes
Text language                  Yes      No    No
Reverse search on the web      Yes      No    No

Table 6.1. Comparison of some of the Computer Vision API features

The resources cited are widely available and employed by organisations in commercial and governmental sectors.
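Because the providers name and structure their outputs differently (labels, tags or classes, as noted above), any comparative analysis first requires flattening their responses into a common format. The sketch below illustrates one possible way of doing so, assuming the raw JSON responses have already been saved to disk under a hypothetical responses/<provider>/ folder; the field names follow each provider's documented response formats as I understand them - not the code used in the study - and should be checked against current documentation.

```python
# Minimal sketch: normalising already-saved JSON responses from the three providers
# into a single table of (image, provider, label, score) records, the shape needed
# for the comparative analyses discussed in this chapter. Response fields assumed:
# Google: responses[0].labelAnnotations[].description/score
# Microsoft: tags[].name/confidence
# IBM: images[0].classifiers[].classes[].class/score
import csv
import json
from pathlib import Path

def google_labels(resp):
    for ann in resp.get("responses", [{}])[0].get("labelAnnotations", []):
        yield ann["description"].lower(), ann["score"]

def microsoft_tags(resp):
    for tag in resp.get("tags", []):
        yield tag["name"].lower(), tag["confidence"]

def ibm_classes(resp):
    for clf in resp.get("images", [{}])[0].get("classifiers", []):
        for cls in clf.get("classes", []):
            yield cls["class"].lower(), cls["score"]

PARSERS = {"google": google_labels, "microsoft": microsoft_tags, "ibm": ibm_classes}

with open("image_label_scores.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["image", "provider", "label", "score"])
    # Hypothetical folder layout: responses/<provider>/<image_id>.json
    for path in Path("responses").glob("*/*.json"):
        provider, image_id = path.parent.name, path.stem
        with open(path) as f:
            resp = json.load(f)
        for label, score in PARSERS[provider](resp):
            writer.writerow([image_id, provider, label, score])
```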
Among these applications, the collaborations of large technology companies with USA military and repressive projects have resulted in protests by their employees. In 2018, Google employees organised protests against the company's use of AI technologies to optimise drone attacks153. In an episode indicative of the struggles surrounding the discussion of artificial intelligence ethics, one of the employees engaged in the protest, Meredith Whittaker, left the company after being pressured to distance herself from her research on ethics and technology at New York University154. In 2019, Amazon employees sought to prevent the company from supporting the immigration agencies responsible for tracking undocumented immigrants in the USA and for their imprisonment in concentration-camp-like institutions in the border region155.

At the interface with digital research methods, computer vision APIs have been used for creative research designs as well as for the reflective critique of these very resources. We can understand these initiatives in relation to Farida Vis's (2013) proposal to question other possible ways of thinking about the data collected or processed through computational resources such as AI and Big Data, and to imagine other possible applications.

Interrogating APIs and stock image websites: absences and hyper-visibilities

Returning to the complex nature of the media ecosystems referred to in the introduction, the role of commercial image websites must be highlighted. Stock image companies have existed since the 1920s (Bruhn, 2003), providing photographs to media companies, publishers and advertising agencies, and being used by producers in different formats and media, such as magazines, newspapers, out-of-home media and packaging. In the 21st century, with the internet and the cheapening of photographic production, the microstock format became popular, expanding the size of consumer and producer networks. The so-called microstock is a model through which stock image companies become an interface between photographers and professional or amateur studios that can sell their photographs individually and at low cost to all kinds of clients.

153 https://www.nytimes.com/2018/06/01/technology/google-pentagon-project-maven.html
154 https://www.theguardian.com/us-news/2019/apr/22/google-mass-protests-employee-retaliation
155 https://www.theguardian.com/us-news/2019/jul/11/amazon-ice-protest-immigrant-tech

The cultural analysis of the representative role played by stock image websites is relatively recent. Frosh (2001) discusses the different meanings and layers of reference for media producers, buyers and end consumers. A crucial concept evoked by the author is invisibility: most media consumers are not familiar with the concept of stock images; they only consume and interpret this visuality, in its aesthetic and representative content, in its final contexts. This invisibility is also partially present in the academic field. Scientific papers on the observation and critical analysis of image repositories are scarce, especially when we talk about digital methods and social computing.
Among the most recent investigations on the production and circulation of image repositories, one can cite the work of Pritchard and Whiting (2015) on gender and aging; an article by Giorgia Aiello and Anna Woodhouse (2016) on gender stereotyping; sprints reports coordinated by Aiello (2016, 2017) on stock images websites in relation to photojournalism and gender representations; West's study on gender stereotyping and class invisibility in images representative of work environments (West, 2018); and, a method for increasing equity in image recommendation systems (Karako & Manggala, 2018). Our proposal is to contribute to this set of experiments on stock images websites, through the lens of digital methods and computer vision APIs - while critically investigating these tools. The project sought to investigate Computer Vision APIs as search engines while experimentally applying them to the study of visual representation of different countries through the images resulting from the search for their gentílicos/gentilic/population (specifically: Brazilians, Portuguese, Nigerians and Austrians). This dual approach is based on the inflection proposed by Noortje Marres and David Moats (2015) to the founding principle of symmetry in the field of Social Studies of Science and Technology (STS). In considerations focused on the study of online controversies, the authors advocate the need for a symmetrical approach to the 'content' of controversies and the means by which they manifest themselves. In the case of this study, symmetry occurs between the representations of stock images and the algorithmic mediations of images in digital networks, also mobilised as a research methodological tool. To this end, the study is based on the approach of large visual 215 datasets, collected from stock image websites (Shutterstock and Adobe Stock), with the aim of deriving descriptions and identifying patterns of comparison and, at the same time, developing considerations on the potentials and limits of this methodological application. Thus, the delimitation of research questions pertinent to computer vision technology and, at the same time, to stock images, converged on the following questions: § What are the differences between computer vision APIs providers? § How do computer vision APIs "understand" the same photos? § How are the ontologies of each API's labels distinguished? § Can we investigate national representations with computer vision tools? § How do stock images represent the visuality of the analysed countries? § How are cultural specificities made visible through the use of computer vision APIs? To answer these questions, we developed a protocol and analysis based on mixed methods. A first step involved data collection from image repositories, going through the processing of these images by different computer vision APIs and, finally, ending in the analysis of this data through a variety of approaches and techniques aimed at different aspects of the questions posed. These specificities will be made explicit throughout the analysis. The research protocol yields a diversity of viewpoints that allows the isolation of variables for, at least, three axes of comparative research: between computer vision APIs, stock image websites and nationalities (see Figure 6.2). In this study we have chosen to focus on computer vision APIs, as already highlighted, and the different nationalities, given our interest in the cultural specificities of APIs. 
Different image banks are useful for diversifying the sources, and thus the biases, of each nationality dataset; however, this aspect is not so central to the scope of the study. Therefore, the image banks are mobilised in an undifferentiated way in the analysis.

Figure 6.2. Comparative axes of analysis

The data collection was performed through a scraping technique, in which the source code of the image banks' web pages was processed in order to find the images' URLs. The images in their preview versions (thumbnails) were then downloaded locally for processing. We used a Python script developed specifically for this purpose (Mintz, 2019). For every nationality we collected 2000 images from each stock image website (see Figure 6.3). As mentioned in the previous section, three providers of computer vision were selected for the study: Google Cloud Vision API, Microsoft Azure Computer Vision API and IBM Watson Visual Recognition API (see Table 6.1). Within the scope of the study, images were processed through their respective modules that provide descriptive tags or labels - Label Detection for Google, Image Classification for IBM, and Tags for Microsoft. The images were submitted in batches to the APIs using a Python script called Data Inspector (Guerin Jr. & Silva, 2018), in turn based on the Memespector script. The Memespector script was originally developed in PHP by Bernhard Rieder (Rieder, Den Tex & Mintz, 2018) and then adapted to Python and expanded by André Mintz (2018b). The operating logic is common to all these programs: a list of images (composed of URLs and other tabular data in CSV format) is taken as input, and each image is then loaded and processed through the APIs, returning annotated tabular data (in CSV format) and relational data, or graphs (in GEXF format), as a result (a schematic sketch of this logic is given below, after Table 6.2). In this way, the resulting processed datasets could be interrogated through statistical, corpus linguistics, social network analysis and visual critical-exploratory analysis techniques. Among the results, we want to group our observations into two points, respectively linked to practices of (hyper-)visibility and absence.

Figure 6.3. Data collection protocol

Granularity and standardisation in the semantic spaces of APIs

As seen in the example illustrated in Figure 6.1, each image processed by the computer vision resources yields a set of terms that refer to objects, states, qualities or characteristics. A first challenge, which speaks significantly to our research questions, is the fact that the providers of these resources do not share the complete list of possible tags. Thus, it is not possible to analyse the ability of a resource to understand a certain semantic group without performing empirical tests.

API                                    Austrian   Brazilian   Nigerian   Portuguese
Microsoft Azure Computer Vision API      317        561         485         501
IBM Watson Visual Recognition API      1,632      2,044       1,846       1,991
Google Cloud Vision API                2,037      2,170       1,145       1,992

Table 6.2. Number of tags assigned by each computer vision API to each of the studied datasets (n = 4,000).

Considering that an essentialist understanding is, by definition, standardising and simplifying, a first question is: to what extent do computer vision providers offer an effectively relevant number of tags that approximates, to a certain degree, the complexity of the scenes photographed? In Table 6.2 we can see the number of unique tags assigned to each dataset, comparing providers and datasets for each country.
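The schematic sketch referred to above illustrates the operating logic shared by the Memespector-family scripts - a list of image records in, an annotated table out - without reproducing their actual code. File and column names are hypothetical, and the annotate() function is a placeholder standing in for any of the provider calls illustrated earlier in this chapter.

```python
# Schematic sketch of the batch logic described above: read a CSV of scraped image
# records, download each thumbnail, pass the image to an annotation function and
# write an annotated CSV. All file names are hypothetical.
import csv
import os
import requests

def annotate(image_url):
    """Placeholder: return a list of (label, score) pairs for one image.
    Replace with a real provider call, e.g. the Google Vision request shown earlier."""
    return []

os.makedirs("thumbnails", exist_ok=True)
with open("scraped_images.csv") as src, open("annotated_images.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)      # expected columns: nationality, website, image_url
    writer = csv.writer(dst)
    writer.writerow(["nationality", "website", "image_url", "labels"])
    for i, row in enumerate(reader):
        url = row["image_url"]
        try:
            thumb = requests.get(url, timeout=30)   # download the preview (thumbnail) version
            thumb.raise_for_status()
        except requests.RequestException:
            continue                                # skip unreachable images
        with open(f"thumbnails/{i}.jpg", "wb") as img:
            img.write(thumb.content)
        labels = annotate(url)                      # descriptive tags returned by a vision API
        writer.writerow([row["nationality"], row["website"], url,
                         "; ".join(f"{label} ({score:.2f})" for label, score in labels)])
```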
The most significant difference is in the number of Microsoft tags, and far behind we find IBM and Google. In addition to this quantitative aim, it also seems relevant to consider the relationships established between the tags and the types of clusters formed by their cooccurrence. 219 Figure 6.4. Comparative matrix of image-label bimodal networks for each country and each API. Figure 6.4 shows a set of views that helps to understand the topology of the relationships between labels assigned by APIs for each analysed dataset. We refer to this topology as semantic spaces. In the comparative matrix, the nets for each dataset (columns) were initially composed by merging the labels of the three APIs into the same net, then, submitted to the ForceAtlas2 spatialisation algorithm. Afterwards, to compose the matrix, this combined net was filtered for each API, so that it was possible to compare the labels distribution and relationships in this joint spatialisation. Hence, it is evident how the low number of tags of Microsoft's API is reflected in the greatly connected networks and, therefore, with little topological differentiation between clusters. It is also noticeable how, although quantitatively IBM and Google return comparable numbers of tags, the IBM API traces a semantic space more undifferentiated than Google's. In the case of the latter, the density of connections is less diffusely distributed across the network and tends to condense into specific 220 clusters, suggesting a greater degree of specialisation in labelling. We will see in sequence how this general feature is performed in specific cases. Seeking to qualitatively understand the APIs sensitivity to culturally specific images, the analysis included the examination of small groups or individual images and how APIs label them. While the analysis ultimately unfolds into the evaluation of the labels applied to the image, in the light of the contextual knowledge of the countries and cultures being represented, this approach involves the challenge of delimiting, from the large datasets, these specific and quantitatively smaller cases. Filtering specific label occurrences allows the analysis of small groups of images linked by related labels, through mixed exploratory methods. The bi-modal networks of images and labels (which we will see in detail in the following section) have composed exploratory devices that allow the examination of emerging visual patterns between the datasets, as well as possible misclassifications or exceptions. The Google Sheet tabular data software was also used due to its image preview functionality with its 'IMAGE' formula. Consequently, it has become possible to filter the datasets according to particular tags, in order to scrutinise the images in which they were applied by the API. This redirection of the analysis may be approximated to the API’s reverse engineering, since by taking the tags as a starting point, it is possible to infer aspects of the training data that fed the algorithms and their possible biases. Identifying food, dishes and cooking is something that computer vision providers include and, as we will see, are able to accomplish to a large extent. But is the recognition of regional specificities enough? As one example, we can speak of Pastel de Nata (see Figure 6.5), a famous Portuguese pastry that appears abundantly in the country related dataset. 
Considering the labels attributed to a group of three images that we find with Pastel de Nata, we found a case that demonstrates the low degree of APIs cultural specificity. In view of the results, clearly that Google presented the most accurate results, even assigning the label "pastel". Moreover, it also identified "custard tart", a generic name for the recipe, that also relates to similar pastry from England and France. IBM Watson assigned labels like "brioche" and "Yorkshire pudding", which are not the same patty but have a similar format to Pastel de Nata. Microsoft API used the descriptions "doughnut" and "donut", 221 which are very different compared to Portuguese food and typically related to American culture. Figure 6.5. Labels assigned to three images of pastéis de nata by APIs. Non-food labels were filtered for easier reading. Another topic addressed in the study concerns how computer vision APIs label nonwhite phenotypic characteristics and non-western accessories. The cases in question emerge from an exploratory assessment of the representation of indigenous and black population in the datasets of photos of Brazilians and Nigerians and their labelling by APIs. During navigation through photos of black people, a remark emerged about how hair and accessories were tagged, particularly by the Google Cloud Vision API. We found that the "wig" label was consistently assigned to black women with frizzy hair or wearing turbans, in addition to the "Dastar" label on turbans of Brazilian women from the Bahia region, when, in fact, Dastar is a specifically Indian turban used by Sikhism followers. Additionally, in regard to turbans, the label "fashion accessory" has often been attributed to clothing in the Nigerian dataset. This, in a way, simplifies the religious and traditional symbolic charge contained in the items. In the case of Brazilian Bahian turbans, the "Tradition" and "Ceremony" labels appeared more frequently, bringing them closer to religious characteristics. 222 Figure 6.6. Labels attributed to images of black women with frizzy hair and wearing turbans. This analysis gives us some clues about the limitations of the APIs in the cultural sense of their labels and leads us to questions, such as which of the variations of hair types are included in the API and why is there no "frizzy hair" label in pictures of black women. This duality between visibility and hyper-visibility has been explored by researchers of the implications of AI technologies and computer vision, in particular for racial and gender issues (Buolamwini, 2017; Buolamwini & Gebru, 2018; Noble, 2018). One aspect highlighted by the work of these researchers is how these technologies should be understood in view of the context in which they operate, reflecting not only biases contained in the datasets, but also forms of exclusion that have conditioned the constitution of both datasets and software, including the little diversity of the development teams. The idea of post-colonial computing was reviewed by Ali (2016) as he discusses how the examination of cultural relations and power in computing, human-computer interaction and information and communication technologies has been undertaken. 
The decolonial turn (Quijano, 2010), however, assumes that although the traditional designs of colonialism have been superseded, “an ongoing legacy of colonialism in contemporary societies in the form of social discrimination” and “practices and legacies of European colonialism in social orders and forms of knowledge” persist (Ali, 2016, p. 4). Considering the abyssal gap between corporations like Google, IBM and Microsoft, on the one hand, and developers and communicators on the peripheries, on the other, in terms of the potential for creating databases and computer vision ontologies, the pressures of cost-benefit, effectiveness and network effects tend to concentrate projects around those providers. The practices of standardisation of content and tools would be at the core of computer science itself, which, by its nature, “forces one to assume a perspective from the positivist paradigm, given moreover its empirical-analytical approach, proper to the experimental sciences”156 (Portilla, 2013, p. 98).

156 Original: “obliga a asumir una mirada desde el paradigma positivista, dado además su enfoque empírico analítico propio de las ciencias experimentales”. Author’s translation.

Networks of semantic spaces and typicality

In order to analyse both the labelling performed by each API and the visual characteristics of the data addressed, the study used visualisations of bimodal networks, processing the assignment of labels to images as relational data. This is a common approach in similar research and is programmed into the Python version of the Memespector script (Mintz, 2018b). Image-label networks are generated by considering images and descriptive tags as two types of nodes. Edges represent the assignment of a label to an image. This data representation allows processing and visualisation through software such as Gephi (Gephi, 2017), which provides several analysis tools, from layout algorithms to statistical modules based on the methodological framework of graph theory. For this study, the analysis was mainly based on visual network analysis (Grandjean & Jacomy, 2019; Venturini, Jacomy & Jensen, 2019), which focuses on describing properties of datasets according to the networks’ topological traits, based on the nodes’ position and size. ForceAtlas2 (Jacomy et al., 2014) was the main layout algorithm used for network spatialisation, and modularity calculation (Blondel et al., 2008) was applied to identify the main clusters. Modularity divides the network into sections according to their possible community structure, assigning a code to each partition. Depending on the data represented, these partitions may lead the researcher to discover significant clusters, such as thematic or geographic groups or general semantic concepts. The exploration of network visualisations was carried out in printed formats combined with on-screen visualisation for the identification of clusters. In addition to the graphical visualisation of the networks, we used versions in which the images themselves were plotted at the position of their corresponding node, using the Image Network Plotter script (Mintz, 2018a). This representation facilitates the transit between the visual aspect of the images and the relative position they occupy in the network, according to the computational reading performed by the APIs (see Figure 6.7).
Figure 6.7. Representations of the image-label network generated from the same dataset: on the left, the relationship between the labels/tags; on the right, the images spatially arranged according to the bimodal relationship given by the commonality of the labels.

The analysis needs to consider, in an articulated way, aspects of the analysed datasets and of the API used for the analysis. Layout algorithms based on force systems, like ForceAtlas2, work under the power law and the preferential attachment process (Jacomy et al., 2014), implying a very particular logic for reading the semantic space of the computer vision APIs. Importantly, the analysed bimodal networks are formed by the confluence between the tagging logic and the singular configuration of the visual datasets approached. Therefore, the mode of spatial distribution of images and labels, from the centre to the periphery of the networks, seems to be the result of at least three factors: a) the generality or specificity of the labels; b) the variety of objects defined by the scope of the dataset and their topical specificity; and c) the topological characteristics of the semantic space resulting from the associations between labels in each service. For example, in Figure 6.7 we see quite particular clusters positioned on the periphery of the networks as a result of the Google API tagging: musical instruments, dog types and food. The formation of these clusters is a result of both the prioritisation of these topics in the data and the high degree of granularity of the Google API when describing them. We found quite specific tags for dogs, such as 'Spanish water dog', 'Lagotto Romagnolo' and 'Serra de Aires dog', for example.

Considering these analysis parameters, the networks between images and tags allowed us to identify topical categories within each set of visual data, and the relative predominance of each was taken as an indicator of the typicality of each national representation (see Figure 6.8). By analysing these networks side by side, it was possible to define common topical categories that can be observed in the various datasets, given their presence and relative prominence in all cases. These were: "Nature", "Food" and "People". Additionally, through this analysis, a unique category was also identified as emerging in each set: "carnival" for the Brazilian dataset; "azulejos" (tiles) for the Portuguese; "cidade" (city) for the Austrian; and "dinheiro" (money) for the Nigerian. The network views presented in Figure 6.8 allow us to understand the different degrees of specialisation in the labelling by each API. In this figure, in particular, one can also observe how the clusters formed by the labelling serve as indicators of the visualities constituted in the image banks for each nationality. The categories shared among the countries form communities of variable dimensions in each case, but in a relatively balanced way in most of them. The Austrian case stands out, however, because its cluster concerning natureza (nature) is significantly larger than in the others. In the Portuguese case, the pessoas (people) cluster is also comparatively smaller than in the other cases addressed, while the country-specific cluster, related to azulejos (tiles), is quite pronounced. In the Brazilian and Nigerian cases, the food and people clusters were the most pronounced.

Figure 6.8. Comparative matrix of image-label networks according to the gentilic and computer vision API.
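As a hedged illustration of the network-building step described in this section, the sketch below constructs the bimodal image-label network for one provider from the normalised table introduced earlier (a hypothetical file), detects clusters with a modularity-based method and exports a GEXF file that Gephi can spatialise with ForceAtlas2. Note that the study merged the labels of the three APIs into a single network before filtering it per API, and used Gephi's own modularity module; networkx's Louvain implementation is used here only as a stand-in that follows the same Blondel et al. (2008) approach and may yield slightly different partitions.

```python
# Minimal sketch, assuming the (image, provider, label, score) CSV from the earlier
# example: build the bimodal image-label network for one provider, cluster it and
# export a GEXF file for spatialisation with ForceAtlas2 in Gephi.
import csv
import networkx as nx

PROVIDER = "google"            # hypothetical value; one network per provider
G = nx.Graph()

with open("image_label_scores.csv") as f:
    for row in csv.DictReader(f):
        if row["provider"] != PROVIDER:
            continue
        image, label = "img_" + row["image"], "label_" + row["label"]
        G.add_node(image, type="image")     # "type" lets Gephi distinguish the two node sets
        G.add_node(label, type="label")
        G.add_edge(image, label, weight=float(row["score"]))  # edge = label assigned to image

# Modularity-based clustering (Blondel et al., 2008) to approximate Gephi's clusters.
communities = nx.community.louvain_communities(G, weight="weight", seed=42)
for cluster_id, nodes in enumerate(communities):
    for node in nodes:
        G.nodes[node]["cluster"] = cluster_id

print(nx.number_of_nodes(G), "nodes,", nx.number_of_edges(G), "edges,",
      len(communities), "clusters")

# Gephi can open this file directly; ForceAtlas2 can then be applied there for the
# spatialisation used in the chapter's comparative matrixes.
nx.write_gexf(G, f"image_label_network_{PROVIDER}.gexf")
```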
Going to the level of the labels themselves, some of these general perceptions are specified around some of the more frequent terms in each case. Figure 6.9 shows the 10 most frequent terms for each nationality datasets, according to the Google API. The frequency in each case is compared with the frequency of the same term in other nationalities. It is noted the already mentioned exceptionality of the figurations in the Austrian dataset, the only case where the most frequent tags/labels are not related to food but to landscape categories - "mountain", "natural land", "sky" etc. The Portuguese dataset brings terms related to tile images after those related to food "design" and "textile". Both Brazilian and Nigerian datasets bring tags related to people after those related to food - "fun" and "smile". In marketing and consumer studies, the topic of typicality is approached from cognitive psychology studies as a parameter of a product or brand's strength (Loken & Ward, 1990). The notion is understood as the strength of attachment of an individual instance to a general category. Or, in another way, as the measure of how exemplary 227 an instance is in understanding a category. In the case of this study, we could say the typicality of a figuration is to what extent it would be representative of a nationality in the context of image banks and the media products that use them. The typicality, in this context, would be related to the frequency with which certain figurations appear in the search for the gentilic of each country. Bruhn (2003) highlights the allegorical character of stock images, which seems to aim at the representation of categories. However, this factor needs to be combined with the algorithmic mediations mobilised in the analysis, allowing the grouping of these figurations. Considering the operation of machine learning, there would be a particular dynamic of typicity conformation of a category in the training process by examples. Factors that contribute to the constitution of the category involve the frequency of linking certain figurations and the visual attributes shared between these figurations; aspects that are ultimately unfeasible by the closure of technologies in APIs and that relate to the constitution of their training bases and neural network architecture. Figure 6.9. Comparison of the 10 most frequent labels per gentilic. 228 Among the indications of Ali (2016) for the adoption of a decolonial perspective in computing is, as a minimum, geopolitics and policies consideration of the bodies in the different engagements of production and thinking about computing. Drawing, building, researching or theorising on phenomena cannot be done without looking at the power relations inherent in the concentration of ownership, development and sale of technologies in the USA and Silicon Valley, for example. Aspects of the typicality observed in the study seem to relate to this aspect. The bias reproduced by the technologies in question is linked to the geopolitical position of these companies and largely defines the technological mediations of contemporary digital communication. The production of codes, datasets and interfaces follows the logic of "mirroring" through which producers think of users similar to themselves (Haas, 2012). In line with the perspective of digital methods, as mentioned above, the mobilisation of these tools as analysis devices must be combined with their continuous critical consideration as part of the study. 
Critical and public scrutiny with AI systems, automation, indexing and categorisation of content may generate remedies that minimise the problems presented (Raji & Buolamwini, 2019). The centrality assumed by computer vision APIs in the interpretation of visual data should, therefore, be considered critically. Conclusion Studies of large corpora on digital platforms have focused on quantifiable aspects of the environment or on the interfaces themselves (Laestadius, 2017). Mixed methods such as our visual network analysis approach enable researchers to constantly change between levels of visualisation and data exploration. Network visualisations can generate unique insights through the spatialisation and modularity of tens of thousands of images. At the same time, these networks allow filtering specific instances related to a theme or context. We discovered patterns about what is commonly related to different countries through their gentilic. Food, Nature and People emerge as salient categories related to each of the four countries. Since all of these categories are related to concepts usually perceived as linked to places and cultures, as well as to the visuality of image banks, 229 we consider that this indicates the suitability of the methods used for the comparison between countries. At the same time, the unique characteristics pointed to particular discoveries. In each country a specific theme has emerged, linked to its culture and tourist strategies which, therefore, creates demand for image banks. While positive and negative stereotypes played a relevant role in these categories (such as "Carnival" for Brazil), the offer of albums by photographers and studios may have biased some results. In the case of Nigeria and Austria, for example, prolific albums directed the dataset (and its images and labels/tags) to heighten their themes. This highlights an important variable not included in the study: the ratio between the number of platform images under study and the number of producers. That is, in our project, if fewer content producers use a certain tag, each producer can influence the analysis more relatively. To move forward on this issue, future studies may compare peripheral countries with hegemonic countries in the global media industries, such as the USA, the UK and Japan. About the computer vision providers, the study sheds light on their differences, limitations, and ways of reappropriating them to explore culture and representations embedded in digital platforms with digital methods. Commercial services such as Google Cloud Vision API, Microsoft Azure Computer Vision API and IBM Watson Visual Recognition are considered "black boxes" (Buolamwini & Gebru, 2018; Latour, 2001; Pasquale, 2016), demanding auditing methods for assessing their mode of operation, accuracy or coverage. Neither the list of labels nor their total number are disclosed by the providers. We hope that studies such as ours add to the field, in order to help other researchers interested in using computer vision APIs for social research. In addition, the different providers do not overlap or equate in understanding complex cultural data, such as images, so, using them uncritically can be a methodological problem. Some of the procedures performed in this study (such as the clustering dictionary) show the feasibility of combining two or more providers by merging annotated datasets and networks. Future experiments with other providers are recommended to expand the scope of the findings. 
Experts on the researched topics can direct multifaceted investigations on the datasets. International comparisons with international teams can advance understanding of both method (computer vision) and object (image banks). 230 Finally, cultural stereotypes and "typicities" about a country and its populations are reproduced in content providers; therefore, understanding the ways in which this happens is important for fairer media ecosystems. Image bank providers are important to advertising agencies, public relations and publishing companies around the world and the microstock business model extends its impacts also to small and medium enterprises and public institutions, making understanding their visual cultures and productive routines especially important due to their pervasiveness. 231 TECHNICITY OF THE MEDIUMS IN DIGITAL METHODS C ONCLUSION 232 Summary This dissertation has addressed the role of technical knowledge, practice and expertise both as a problem and as a solution, in a variety of digital methods. The computational mediums (and respective regimes of functioning), the web environments and technical procedures were taken as key elements in the practice of digital methods. By demonstrating how technicity influences the ways we generate, present and legitimise knowledge in digital research, I have argued that the practice of digital methods is enhanced when researchers make room for, grow and establish a sensitivity to the technicity of the computational mediums. To substantiate my argument, I proposed a reflection on the intersections between digital methods, technicity and digital fieldwork, while mobilising the concept of technicity-of-the-mediums and discussing three crucial aspects of the digital methods approach. This thesis investigates what it is like to design and implement research with a digital methods approach by making clear that this approach requires researchers to develop a mind-frame that accounts for, investigates and re-purposes technological grammar for social enquiry. This dissertation shows that repurposing digital media and data for social and medium research is a challenging activity that demands extra effort. Using digital methods is not about enquiring technologies from the outside in (see Marres, 2017), but about understanding how to work with socio-technical assemblages and how to think along with a network of methods. This is what it takes to put research strategies and methods into action. It requires taking extra care when analysing grammatised actions, that “have not been created by or for the social sciences” (Venturini et al., 2018, p. 4). The question of what do extra efforts mean in practical terms has been answered, first, by exposing the many challenges in digital methods (chapter 1) and, then, proposing ways to defuse some of the difficulties related to the use of these methods (chapters 2 and 3). To describe how the notion of technicity matters for and contribute to digital research, this dissertation introduces the concept of technicity-of-the-mediums which is mobilised in a series of case studies (chapters 4, 5 and 6), resulting in specific methodological approaches. These provide new analytical perspectives for social media research and digital network studies, as I will argue in the following sections. 
With regard to what extent digital methods can be considered a type of fieldwork, this dissertation has presented a technical understanding of the web environments (from a 233 methodological standpoint), while proposing and discussing the role of three distinct but related aspects to be considered when doing digital methods: software affordances and platforms’ cultures of use and grammatisation (chapter 3). Below a summary of conceptual and practical contributions: § The expression of computational (technical) mediums stands for research software, digital platforms and associated algorithmic techniques, referring to media not only as communication platform, but also as mediators’ devices, which demand, as a consequence, as much attention as the contents or the objects of our research. § The concept of technicity-of-the-mediums serves as an invitation to become acquainted with the computational mediums in the practice of digital methods. It is related to the relationship among the computational mediums, the fieldwork and the researcher(s) and her/his object of study, thus demanding iterative and navigational technical practices. § Three pillars of the digital methods approach refer to the awareness of a triangular relationship between software affordances, platforms’ cultures of use and their grammatisation. While suggesting a way of carrying out digital fieldwork, this proposal highlight some of the difficulties related to the use of digital methods, i. e. the need to care about the specificities of the medium and data. § The notion of second order of grammatisation follows the creation of new methodological grammars by researchers when using digital methods. It is a second order because it is based on existing technological grammars such as what is captured by platforms (units of expression and communication), what is afforded by software (e.g. force-directed algorithms, metrics like network diameter and modularity) or by the outputs of other computational mediums such as computer vision APIs. The final results (about the research object and content) neither entirely represents platform grammars nor the affordances of (research) software, but a mix of both and of the researcher agency. Overall, it may be concluded that technical knowledge, technical practices and technical imagination can concretely enhance the design and implementation of research with digital methods. 234 Part 1: For a technical culture of knowledge in digital research Chapter 1 explains in different ways that, in the practice of digital methods, researchers should develop a digital methods’ mind-frame, being able to stay in it and then keep working in a piece of research that accounts for, investigates and re-purposes technological grammar and digital records for social enquiry. To that end, it is important to consider the computational mediums (e.g. APIs, extraction and analysis software) not only as a way to perform a particular piece of work (as instruments) but also as active participants in each stage of the research, because they add technical substance to the object of study while re-adjusting and re-shaping it. In chapter 1, I am therefore suggesting a broader definition of medium that encompasses but also exceeds that of medium of communication. Computational (or technical) mediums (e.g. software or digital tools) have a proper domain of being and meaning (see Rieder, 2020). 
Consequently, researchers should not only look carefully at the thick layers of technical mediation (inherent to the methods) but also recognise the forms of practice and modes of operation held within software, web apps or APIs as "a medium of expressing a will and a means to know" (Rieder & Röhle, 2018, p. 123; see also Berry, 2011; Rieder, 2020). To unfold this argument, I used a network of the followers-of-the-followers of an Instagram bot profile (Mary__loo025) to call attention to the role of technical knowledge in the practice of digital methods. Through this example, I have also demonstrated the importance of devoting time to the art of querying as well as to the craft of becoming close to computational mediums. Here is where the concept of technicity-of-the-mediums can help.

Chapter 2 thus presents three attempts to understand technicity: through the lens of media studies, from a philosophical perspective and from the standpoint of digital methods. After reviewing why and in which ways technicity matters in media studies, I identified different efforts to understand and use this concept: from giving attention to the processes through which people connect "through techniques, technologies, and dynamic traditions of practice" (Crogan & Kennedy, 2009, p. 109) to the focus on how the domains of knowledge and transformative practices relate to technicity (Dodge & Kitchin, 2005; Kitchin & Dodge, 2007; Niederer & Dijck, 2010). Although technicity is clearly a complex and compound concept, which demands a close relationship with software through technical knowledge and practices, technicity in media studies is not generally referenced as a means to (re)think the way we design and implement digital methods. The literature review indicated how the notion of technicity has been used for purposes such as accounting for identity formation, understanding how content is managed by digital platforms or studying participation in social media. In this dissertation, I argue that technicity can also be used to (re)think the design and implementation of methods.

In chapter 2, another attempt to understand technicity follows Gilbert Simondon's philosophical reflections and tries to connect them to the practice of digital methods, which has been one of the greatest challenges of this dissertation. While recognising that such a proposal deserves further development, I have tried my best to import the concept of technicity into the context of digital methods. This dissertation concludes that an understanding of technicity entails an awareness of technical mediums at different levels, in particular when these are, in Simondon's terms, individuals (software) and elements (e.g. the machine learning models of vision APIs, algorithmic techniques). Researchers are thus required to be acquainted with the former, while knowing when and why to value the latter in the research design. This means they should also know how to use the potentials, practical qualities and meanings of technical mediums for the research purposes. This distinction is crucial to help researchers:

§ To think with and repurpose (if necessary) technical mediums, learning to value elements in the full range of digital methods while keeping in mind the way in which they add new meanings to the subject of analysis.

§ To devise research designs with digital methods and decide which pieces of software should be stacked in a methodological chain to solve research problems, pushing researchers towards orders of technical-practical thought.
§ To implement digital methods through a technical ensemble composed of the medium under investigation and the tools used to investigate it.

§ To enable researchers to discover new arrangements when doing digital methods, using a technical imagination to create methodological solutions or raise new questions.

In this thesis, I have also suggested that researchers can think of digital methods as technical ensembles, connecting the practical qualities and potentials of computational mediums to the research purposes. Moreover, they should invoke their technical imagination to better combine software inputs and outputs and to accomplish the activities that compose a digital methods protocol. Chapter 2 also provides a description of the process of building/interpreting a computer vision-based network to illustrate more clearly the mental and practical modes of what I am calling the technicity-of-the-mediums in digital methods. A first lesson drawn from this is that the content of such a network is never only the subject of study but also the technicity of the tools used to produce it. We need to move beyond discussions about how to get data or how to make beautiful network visualisations and start making room for what precedes the network visualisation. Central elements for this purpose are technical knowledge but also a good understanding of the object of study within the web environment. A second lesson refers to how a computer vision-based network can offer insights impossible to achieve through traditional research practices, but only through technical practices and imagination. For instance, in the making of the network, we created specific node attributes (with a spreadsheet and Table2Net), namely the authors of the visual content, the year of its creation and the classification of link domains (whether a porn website or not), using colour (or labels) to identify them in the network diagram. Those interventions helped us to respond to a particular research question concerning porn bots, enabling us to analyse their agency while, at the same time, perceiving them in the general context of the network.

In chapter 3 we continued and expanded the journey of understanding how technical knowledge contributes to new forms of enquiring in digital research, and here we learnt a way of carrying out digital fieldwork. The chapter contends that new media researchers should have a technical understanding of the web environments and develop a practical awareness of a triangular relationship between software affordances and platforms' cultures of use and grammatisation. This is what it means to get acquainted with the technological environment in which digital methods ground their claims about social phenomena. For instance, within the web environments, we should take Uniform Resource Identifiers (URIs) as research material, because every piece of information stored on the web is identified by a URI. Here, the syntax of URLs is a means for exploring the data set, for responding to research questions or for identifying unique actors or records. An example of this is given in chapter 2: after following the protocol of search as research for detecting pornography websites through image URLs, it was possible to identify and visualise 125 unique pornography hosts within the network. As mentioned above, this strategy gave us a new direction for visual network analysis and generated new insightful findings.
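A minimal sketch of this use of URL syntax as research material, assuming a list of captured image URLs and a seed list of adult-content hosts (both hypothetical here); the thesis carried out the equivalent step through search as research, spreadsheets and Table2Net.

```python
from urllib.parse import urlparse

# Hypothetical inputs: image URLs captured from the platform and a seed list of
# adult-content hosts assembled through the "search as research" step.
image_urls = [
    "https://cdn.example-adult-site.com/media/img_001.jpg",
    "https://scontent.cdninstagram.com/v/t51.2885-15/img_002.jpg",
]
adult_hosts = {"example-adult-site.com", "another-adult-host.net"}  # hypothetical seed list

def host(url: str) -> str:
    """Extract the host part of a URL, dropping a leading 'www.'."""
    netloc = urlparse(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

unique_hosts = {host(u) for u in image_urls}
flagged = {h for h in unique_hosts if any(h == a or h.endswith("." + a) for a in adult_hosts)}

print(len(unique_hosts), "unique hosts;", len(flagged), "flagged as pornography hosts")
# The flag can then be written back as a node attribute and mapped to colour in the network.
```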
Such an attitude requires a capacity to consider the features of technical mediums as an ensemble and as a solution to methodological problems. Understanding the layered structure of online connectivity (Ghitalla & Jacomy, 2019) is another aspect that researchers should keep in mind when using search engine results or crawlers' outputs, because the hierarchical structure of the web delivers content arranged in a particular order (which is reflected in the data obtained): content known by everyone (first layer), content known by amateurs and experts (second layer), and content that is simply ignored (third layer) (Jacomy, 2019). No less important is knowing how web apps and APIs may serve research. These mediate the execution of specific research tasks and demand our attention because they have "epistemic orientations that have repercussions for the production of academic knowledge" (Borra & Rieder, 2014, p. 2). Learning how to read social media APIs' documentation (not always transparent or user-friendly) helps to better understand the situations in which users deal with a predefined technological grammar, produced and delineated by software, to structure their activity (Gerlitz & Rieder, 2018). All these possibilities, however, exist in a web that is less and less seen as an open platform for everyone to use, and more and more as a closed environment ruled by specific platforms' walled gardens. Finally, the concluding remarks of chapter 3 introduce the three pillars of the digital methods approach, which relate to technical knowledge and practical awareness of:

§ Platform grammatisation: the technological processes inherent to the web environment and APIs in which and through which online actions are structured, recorded and collected through crawling, scraping or API calling.

§ Cultures of use: the modes of life, the common meanings and the forms of signification that emerge and circulate within a given platform, encompassing what is expressed by technological grammar and shaped by the platform's infrastructure and technical mechanisms.

§ Software affordances: the materiality and the productive and mediating capacities of software, to be considered from a relational perspective with platform grammatisation and cultures of use.

Part 2: From technical knowledge and technical practices to new forms of enquiring

Chapter 4 presents a hands-on approach to studying hashtag engagement on social media through the three pillars of the digital methods approach. The "impeachment-cum-coup" of Brazilian president Dilma Rousseff was studied from different but related perspectives. The first contemplates the differences between high-visibility and ordinary hashtag usage cultures, the related actors, and content. The second focuses on hashtagging activity, and the last layer looks into the images and texts to which hashtags are related. When read together, the three levels of analysis add value to one another, providing a rich and in-depth vision of the case study. This approach thus promises an enhanced understanding of hashtag engagement and can be applied to different platforms. Chapter 4 opens a window of opportunity for digital research, showing Instagram as an environment for the study of political debates through hashtags (in 2016, the platform was not considered a means to this end). In line with the extra effort required by the digital methods approach, the choice of hashtags in this chapter takes seriously the art of querying.
It considered program and anti-program (situating hashtags as positioning efforts) and resulted from immersive observation and monitoring as well as exploratory data collection and analysis (co-hashtag networks and Excel pivot tables) throughout the month of the protests (March 2016). The extra care needed for the analysis of grammatised actions is reflected in the criticism of some common sampling practices in digital research, e.g. focusing only on the most engaged items or on what is dominant in terms of popularity and influence. The chapter thus proposes that researchers also consider what is "ordinary", that is, the users, posts and practices kept out of the spotlight. To do that, on the first level of analysis (high-visibility versus ordinary), unique actors must be identified and distinguished using engagement metrics. Two analytical strategies are proposed: the identification of unique actors (e.g. users, link domains, image URLs and video IDs) and the analysis of high-visibility and ordinary elements (actors, content and/or cultures of use). The attention to unique actors gives researchers a sense of what the data set can represent; it helps to situate and contextualise it, and it provides a partial but robust perception of the digital records, framed by Instagram's grammatisation, software affordances and our vision of the issue, but solidly built on a good query design.

Enquiring into political hashtag engagement on Instagram, this methodological framework confirms the importance of including both high-visibility and ordinary groups. It also revealed a particular structure concerning high-visibility actors. In regard to the latter, the evidence lies in two types of high-visibility actors with different Instagram usage practices throughout the protest day: low and high engagement in post publication. We found that producing an impact requires little effort from public figures, politicians and artists (often with one post), as expected, while continuous activity over time is necessary for non-official campaign accounts and independent media (often with a high number of posts). Both cases are part of the high-visibility group of protesters. When comparing high-visibility with ordinary actors, we detected different patterns between these groups related to textual and visual narratives (through analysing captions and images), forms of expressing feelings (through looking at emojis) and positioning efforts (through the choice of hashtags).

Another criticism addressed in this chapter relates to superficial approaches to hashtag activity, e.g. when the analysis ignores the forms in which hashtags are captured and re-arranged by platforms (grammatisation) as well as the forms of appropriation and frequency of hashtags (cultures of use). In this regard, the second level of analysis (hashtagging activity) looks at referential tags and their use frequency. Again, we noticed different preferences between high-visibility and ordinary actors. Through a visual network exploration, chapter 4 also suggests that we approach emblematic hashtags as a way of seeing a shift in meaning (what we called double-sense hashtags), rather than following the typical cluster analysis to study the partisan use of hashtags and related topics. On the third level of analysis (visual and textual), different digital networks were part of the study, including a network of images and labels afforded by the Google Vision API. This type of network, still under-exploited, has shown great analytical value for studying online images.
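As an illustration of how such an image-label network can be assembled once the vision API annotations are at hand, the minimal sketch below assumes a hypothetical CSV of image-label pairs and uses networkx; in the thesis the equivalent step was carried out with tools cited in the references, such as Memespector, Table2Net and Gephi.

```python
import csv
import networkx as nx

# Hypothetical input: one row per (image, label) pair exported from a vision API
# annotation step, with optional node attributes such as author and year.
G = nx.Graph()
with open("image_labels.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):  # expected columns: image_url, label, author, year
        image, label = row["image_url"], row["label"].strip().lower()
        G.add_node(image, kind="image", author=row.get("author", ""), year=row.get("year", ""))
        G.add_node(label, kind="label")
        G.add_edge(image, label)

# Export the bipartite image-label network for visual exploration in Gephi,
# where it can be spatialised with ForceAtlas2 and coloured by node attributes.
nx.write_gexf(G, "image_label_network.gexf")
```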
Through the network we produced, we were able to see the stereotypes that characterise different Brazilian political positions (e.g. colours: yellow and green for the pro-impeachment protests; red for the anti-coup protests) and political identities (e.g. bearded faces on the left, sunglasses on the right). In addition, in the pro-impeachment-related images, we see a higher occurrence of labels related to close-up portraits (e.g. "sunglasses," "facial expression," "face"), while labels related to collective imagery were more common in the anti-coup data set (e.g. "festival," "demonstration," "event"). The richness of the different narratives was found by analysing the networks of co-occurrence of terms and emojis extracted from Instagram captions. The networks revealed particular concerns raised by only one group. One example is the argument made by anti-coup high-visibility actors that Brazilians were not moved by hatred but wanted to "protest peacefully". Another example is the nationalistic rhetoric, which appeared exclusively among pro-impeachment ordinary actors. In addition, we saw ordinary actors showing that they were proud to participate in the protest, while high-visibility actors acknowledged Brazilians for their participation. The emoji analysis revealed an interesting perspective on race, with a predominance of light and medium skin tones among protesters, except for the high-visibility accounts of the anti-coup demonstrations, which used medium-dark and dark skin tones.

Chapter 5 expands the use of computer vision-based networks for social research, also providing new ways to design and implement research by repurposing Facebook likes and visual content. We studied how Portuguese universities use the platform to communicate (using two types of networks), mapping and analysing like connections as proxies of institutional interests and timeline images as institutional visual culture. By doing so, chapter 5 exposes what researchers can learn from the connections between Facebook Pages (through likes) and from a list of (timeline) image URLs. This study, like the hashtag engagement approach in chapter 4, can be repurposed for different studies. The main contribution of this chapter lies in embracing the methods of the medium, a navigational research practice and the technicity-of-the-mediums as key components for the digital social sciences. The network of likes between pages revealed not only common interests among all universities (e.g. Portuguese and global media/news companies, newspapers and education-related Pages), but also some peculiar behaviours (more or less active in terms of making institutional connections), dimensions of sociality (e.g. bonding with pages categorised as business, public figures, restaurants, entertainment, barbershop, shopping mall), interests (e.g. from investing in connections with internal stakeholders to establishing a bond with international universities) and lack of interest (when searching for politics-related categories, civic engagement, social movements or political parties, nothing was found). This is a network that helps researchers to map institutional profiles and that may serve other cases well (e.g. social or environmental causes and politically oriented pages). On other platforms, we can consider following networks (Instagram or Twitter) and channel networks (YouTube). Using the image-label network, researchers can carry out two analyses that complement one another.
The first is the analysis of image clusters, providing a global vision of the image set in itself. We found that Portuguese universities are perpetuating the idea of a boring academic environment (e.g. by using institutional posters or photos of people seated in an auditorium listening to a conference). The second level of analysis provides very specific insights concerning each university's image-sharing culture (e.g. images containing animals are almost exclusive to UTAD). This analysis benefits from some knowledge of Facebook's grammatisation and of the affordances of Gephi and the Google Vision API, combined with some technical expertise and practical awareness. Here, colours were associated with the image clusters, while the labels (the second node type) were made to disappear by colouring them with the same colour as the background. The final visualisation provides a network grid that gathers 15 networks (one for each university), which allowed us to take advantage of the visual affordances of the networks and analyse each university individually.

Chapter 6 interrogates three computer vision APIs (Google, IBM and Microsoft), analysing 16,000 images related to Brazilians, Nigerians, Austrians and Portuguese, retrieved through searches for these demonyms in two of the main Western stock image sites (Shutterstock and Adobe Stock). This served as an example of when and how the technicity of the mediums is misaligned with the research objective, teaching us that one's ability to use software (or being an expert in it) does not necessarily mean being aware of the technicity of the medium; the same can be said of one's skill in generating data visualisations or code. Our knowledge of image networks was solid, but we had a poor understanding of how image classification by computer vision functions. Consequently, we addressed research questions not consistent with the technicity of the medium (by framing algorithmic techniques as either racist or culturally ignorant agents), which in turn led us to flawed reasoning and results. Too much confidence in knowing how to do or how to proceed at the level of methods, or the eagerness for methodological experimentation, can take away the attitude of identifying and interrogating the technical elements that are carriers of meaning. Thus, the researcher's attention ends up focusing only on methods. We did, however, obtain insightful results when asking how different computer vision APIs classify the same collections of images. This study presents important findings that researchers may want to consider when comparing or using computer vision APIs to study online images, summarised in four elements to be aware of:

§ The range of image labelling (capacity in terms of number of labels)

§ The modes of image labelling (the use of words, e.g. in the IBM Watson API colours are part of the generic labels used to classify visual content)

§ The granularity of image labelling (how specific a label can be, e.g. "serra de aires dog")

§ The lack of precision (e.g. inaccurate detection of facial expressions such as surprise, sorrow or anger)

While these elements are reflected in the spatialisation of the network (forming clusters that reveal not only the image content but also the range and granularity of image labelling), they also shed light on the differences and limitations among the three computer vision providers.
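To make concrete the kind of output these comparisons work with, the sketch below requests label annotations for a single local image, assuming the google-cloud-vision Python client and valid credentials (IBM Watson and Microsoft Azure expose analogous endpoints, each with its own label range and granularity); the file name is hypothetical.

```python
from google.cloud import vision

# Assumes the GOOGLE_APPLICATION_CREDENTIALS environment variable points to a
# valid service-account key.
client = vision.ImageAnnotatorClient()

with open("stock_image_001.jpg", "rb") as f:  # hypothetical local image
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    # Each annotation pairs a textual description with a confidence score,
    # the raw material behind clusters such as Food (0.977), Ingredient (0.893), etc.
    print(f"{label.description} ({label.score:.3f})")
```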
Chapters 5 and 6 show us that the image clusters within an image-label network follow the logic of image classification by computer vision, that is, the provision of a topicality rating and confidence scores for the textual descriptions, e.g. Food (0.9772421); Ingredient (0.8929239); Fruit (0.8815268); Staple food (0.86995584); Recipe (0.8641755); Dish (0.85230696); Cuisine (0.8481042). This means that, in the image-label network, we would see clusters constituted by generic labels, such as food, text, buildings or musical instruments, which are followed by more specific or descriptive labels (e.g. when looking at the food cluster, we can make sense of the variety of food types pertaining to a group of images, such as apple pie, rice or beans). This kind of knowledge is noteworthy for interpreting the network because, when we look at image-label networks, we understand how images come to be positioned in different areas of the network (how connections are made) and what is behind the formation of clusters (general labels followed by more specific ones).

Parts 1 and 2 illustrate in different ways what the technicity perspective has to offer to digital research. Understanding that computational mediums influence the interpretative process and have a voice of their own helps to rethink the way research questions are asked, by taking technological grammar into account as a factor that contributes to research efforts (not just an issue or a bias). Accounting for technicity suggests that conceptualisation depends on what is obtained through practical operations (method), which are not separate from close observation of one's object of study.

Developing a sensitivity to the technicity-of-the-mediums in digital methods

In this dissertation I have argued and illustrated how the practice of digital methods is enhanced when researchers make room for, grow and establish a sensitivity to the technicity-of-the-mediums. Researchers benefit from that, developing a capacity to research with and about technological grammar and computational mediums. To show the connection between part 1 and part 2 of this research, I introduce below a figure that depicts three distinct but connected ways in which researchers can develop a sensitivity to the technicity-of-the-mediums in digital methods. When doing this, I argue, researchers may develop a capacity to research with and about technological grammar and computational mediums. This is the main contribution of this dissertation, which may serve as a building block in the practice of digital methods. The concept of technicity-of-the-mediums concerns the relationship among the technical mediums, the fieldwork and the researcher(s) and her object of study, thus referring to technical knowledge but also demanding iterative and navigational technical practices. In this way, the concept points to processes of getting acquainted with the computational mediums from conceptual, technical and empirical perspectives (making room for the sensitivity to technicity) and in the practice of digital methods (developing such sensitivity). This involves an engagement with the digital fieldwork as well as with technical practices, which takes some time and requires extra effort from the researcher (establishing the importance of this approach). I want to present practically the situations which relate to the attitudes of making room for (chapter 4), growing (chapter 5) and establishing (chapter 6) a sensitivity to the technicity-of-the-mediums by using the case studies.
Below, I summarise how the technicity approach has been put to use in the case studies, while the figure helps to explain the technicity approach in a more metaphorical way. In it, the pixels in the background of the image represent the knowledge of the researcher, which changes and evolves over time and through technical practices (the other shapes and the connecting line are explained later).

§ Chapter 4 exemplifies the attitude of making room for the technicity of the mediums and the philosophy underlying digital methods. By paying close attention to Instagram's application programming interface as a way of making sense of how researchers can access, treat and repurpose the platform's technological grammar, by knowing how to make use of extraction and analysis software, and by being aware of the potentialities of computer vision, we attempted to pose research questions aligned both with the object of study and with the technicity of the mediums.

§ The results of chapter 5 illustrate a situation in which researchers have already developed a certain proximity with computational mediums and the practice of digital methods. This chapter exemplifies the attitude of growing a sensitivity to the technicity of the mediums by taking seriously the relationship between software affordances, platforms' cultures of use and their technical grammatisation. As in chapter 4, the analysis proposed here offers macro and micro perspectives of the research object, moving from general to specific visions with both quantitative (a general overview of the networks) and qualitative (a closer look at specific content and actors in the network) approaches.

§ Finally, chapter 6 presents an example of the use of digital methods as a tool to create innovative visual methodologies, using comparative matrices of image-label networks to compare the outputs of three different vision APIs. The chapter also serves as a counter-example, describing the misalignment between the technicity of the mediums and the research objective. It demonstrates how the absence of basic knowledge about a computer vision feature can harm research results.

Figure: Developing a sensitivity to the technicity-of-the-mediums in digital methods. The pixels in the background of the image represent knowledge, which begins with the researcher's area of expertise and changes and evolves over time and through technical practices. Design by Beatrice Gobbo and concept by Janna Joceli Omena.

Making room for the computational mediums as carriers of meaning

The attitude of making room for the technicity-of-the-mediums refers to the effort to become acquainted with the fieldwork and to be aware of technical mediation. When researchers get in contact with the digital methods approach for the first time, they are invited to train their minds to see the web, digital records, media and computational mediums as means of enquiry, as sources and methods of investigation. In the figure, the cubic forms (different from the rest) tell us about this challenging encounter, a change of mindset in which the meanings carried by computational mediums are as important as the object of study. It is from this direct contact with web environments that researchers start to understand what they should look at and what for. They should look at the web from a methodological and technical standpoint, understanding how online devices treat web data.
To learn, researchers should understand the triangular relationship between platforms' cultures of use and grammatisation while exploring software affordances, without losing sight of the object of study. In doing so, they should consider the purpose, potentialities and limitations of the computational mediums while practising the full range of digital methods. It is through the practice of software (to extract, analyse and visualise digital records) that researchers start to understand what they should know. They should know how to navigate digital platforms and API documentation and identify what is seen there in the data obtained. They should know how to follow software instructions and how to work with files of different formats, without losing sight of the relational aspects of the data. All these efforts point to an understanding that is more conceptual than practical. Here the use of software serves as an awareness tool, so that researchers can start seeing things differently. What is technical is already there.

Growing a sensitivity to technical elements while practising digital methods

To grow a sensitivity to the technicity-of-the-mediums, researchers need to engage with technical practices from the standpoint of software use. This is a situation in which researchers, already familiar with the fieldwork, start developing projects with digital methods. They are invited to think along with a network of methods while they learn how to implement the methods. In the figure, we see that the cubic forms (the web, digital records, media and computational mediums) are no longer strange but familiar; they have come to make sense. Through technical practices, researchers understand how the meanings carried by computational mediums can be as important as the object of study, thereby rethinking the conditions of proof in digital research. It is by practising digital methods that researchers start to understand what they should know and what for. They should know that different fields of study (e.g. graph theory and network studies, information visualisation, web technologies and statistics) are attached to computational mediums. It is from the continuous use of digital methods (individually and through data sprints) that researchers learn how to think along with and repurpose the medium and digital records. To this end, and never losing sight of the object of study, they start with a sort of mimetic research, imitating successful research protocols or following step-by-step tried-and-tested research recipes. Through iterative and navigational technical practices, researchers develop a sensitivity to technical elements (graph layout algorithms, web vision API modules, algorithmic techniques) as meaningful objects of attention, while they begin to value and then appropriate the navigational practice in analysis. With time, researchers develop the capacity to consider the features and practical qualities of technical mediums as an ensemble. All these efforts accommodate a more practical than conceptual understanding; here the use of software serves as a knowledge tool, so that researchers can start doing things differently. What is technical is already there.

Establishing a sensitivity to the technicity-of-the-mediums

All the efforts made so far culminate in a balanced state. Here, what is conceptual, technical and empirical is combined as if they were one thing (always connected with the object of study).
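As one illustration of treating a graph layout algorithm as a meaningful object of attention rather than a default button, the minimal sketch below runs the Python ForceAtlas2 implementation listed in the references (Chippada, 2017) twice on the same network; the file name, iteration counts and printed comparison are hypothetical choices, not the thesis's own protocol.

```python
import networkx as nx
from fa2 import ForceAtlas2  # Python ForceAtlas2 implementation (Chippada, 2017)

G = nx.read_gexf("image_label_network.gexf")  # hypothetical file from an earlier step

layout = ForceAtlas2()  # default parameters; each setting is itself an interpretive choice
# Two runs with different iteration counts: node positions, and therefore the
# apparent distances read off the map, are outputs of the algorithm's regime of
# functioning, not properties of the data alone.
pos_short = layout.forceatlas2_networkx_layout(G, pos=None, iterations=200)
pos_long = layout.forceatlas2_networkx_layout(G, pos=None, iterations=2000)

some_node = next(iter(G.nodes))
print(some_node, "after 200 iterations:", pos_short[some_node])
print(some_node, "after 2000 iterations:", pos_long[some_node])
```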
It is from this perspective that researchers become creators and interpreters of a second order of grammatisation. The research protocol diagram below illustrates this (though in a different way from the one discussed in chapter 3). On the one hand, it portrays a concrete methodological process for creating computer vision-based networks to study online images. On the other hand, it uses an overlapping layer to underline the crucial role of technical knowledge, technical practices and the researcher's act of connecting techniques for research purposes. Researchers orchestrate a technical ensemble (the full range of digital methods), knowing which computational mediums should be part of it, at what time and in what way they perform. They also know which of these are meaningful objects of attention. That is, for instance, the case of the API that captured and made available the images, the vision API that added a new technological grammar to the images, the software (Gephi) used to analyse the images and the force-directed algorithm used to spatialise the network (ForceAtlas2). Researchers organise a technical ensemble and follow the protocol of a navigational practice in analysis, understanding that their decisions are affected by the performance of the technical mediums. With time, researchers develop the capacity to use the features of technical mediums as an ensemble and as a solution to methodological problems. As we see in the figure, the colours of the pixels in the background of the image have changed, representing what has been gained through the processes of getting acquainted with the computational mediums and the web environment. This situation, however, is not definitive because, whenever necessary, the researcher should go back to previous steps (as represented by the lines in the figure). Although the methods, media and technological grammar are unstable and change over the years, there is something permanent in the logic of thinking digital research, which may come from an awareness of the technicity-of-the-mediums combined with the practice of digital methods.

Figure: The research protocol diagram for building/interpreting computer vision networks as an example of researchers as creators and interpreters of a second order of grammatisation. Design by Beatrice Gobbo and concept by Janna Joceli Omena.

References

Adar, E., & Kim, M. (2007). SoftGUESS: Visualization and Exploration of Code Clones in Context. Retrieved from http://depfind.sourceforge.net/

Agre, P. E. (1994). Surveillance and capture: Two models of privacy. The Information Society, 10, 110–127. https://doi.org/10.1080/01972243.1994.9960162

Aiello, G. & Woodhouse, A. (2016). When corporations come to define the visual politics of gender: The case of Getty Images. Journal of Language and Politics, v. 15, n. 3, p. 351-366.

Aiello, G., et al. (2016). A critical genealogy of the Getty Images Lean In Collection. Retrieved August 19, 2019 from https://wiki.digitalmethods.net/Dmi/WinterSchool2016CriticalGenealogyGettyImagesLeanIn

Aiello, G., et al. (2017). Taking stock: Can news images be generic? Retrieved August 19, 2019 from https://wiki.digitalmethods.net/Dmi/TakingStock

Akrich, M., & Latour, B. (1992). A summary of a convenient vocabulary for the semiotics of human and nonhuman assemblies. In W. Bijker & J. Law (Eds.), Shaping technology/building society: Studies in sociotechnical change (pp. 259–264). Cambridge: MIT Press.

Alonso, A. (2017, June).
The politics of the streets: protests in São Paulo from Dilma to Temer. Novos Estudos CEBRAP. http:// bdpi.usp.br/item/002837619 Alpaydin, E. (2016). Machine learning: the new AI. MIT Press. Alves, S. (2020). Julgamento de Influencer Mariana Ferrer Termina com sentença inédita de ‘Estupro Culposo’ e Advogado humilhand jovem. The Intercept Brasil. Retrieved from https://theintercept.com/2020/11/03/influencer-marianaferrer-estupro-culposo/ Alzamora, G. C., & Bicalho, L. A. G. (2016). The representation of the impeachment day mediated by hashtags on Twitter and Facebook: semiosis in hybrid networks. Interin, 21(2), 100–121. Analyx. (2015). GitHub - analyxcompany/ForceAtlas2: This is the R implementation of the Force Atlas 2 graph layout designed for Gephi. Retrieved June 4, 2020, from https://github.com/analyxcompany/ForceAtlas2 251 Anderson, C. (2008). The long tail: Why the future is selling less of more. Hachette Books. Anderson, C., & Wolff, M. (2010). The Web is Dead - Long Live the Web. https://doi.org/10.1177/146045820000600401 Anderson, P. (2011). Lula’s Brazil. London Review of Books, 7, 3–12. https://www.lrb.co.uk/v33/n07/perry-anderson/lulas-brazil Auroux, S. (1992). The technological revolution of grammatization.University of Campinas. Andreessen, M. (2007). The three kinds of platforms you meet on the Internet. Retrieved January 13, 2016, from https://web.archive.org/web/20071002031605/http://blog.pmarca.com/2007/09/t he-threekinds.%0Dhtml Ash, J. (2012). Technology, technicity, and emerging practices of temporal sensitivity in videogames. Environment and Planning A, 44(1), 187–203. https://doi.org/10.1068/a44171 Auroux, S. (Translated by E. P. O. (1992). A Revolução Tecnológica da Gramatização. Campinas, SP: Editora da Unicamp. Bach, D., Tsapatsaris, M. R., Szpirt, M., & Custodis, L. (2018). The Baker’s Guild: The Secret Order Countering 4chan’s Affordances. Retrieved from https://oilab.eu/the-bakers-guild-the-secret-order-countering-4chans-affordances/ Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An Open Source Software for Exploring and Manipulating Networks. Third International AAAI Conference on Weblogs and Social Media, 361–362. https://doi.org/10.1136/qshc.2004.010033 Bennett, Lance and Segerberg, A. (2012). The logic of connective action. Information, Communication & Society, 15(5), 739–768. https://doi.org/10.1080/1369118X.2012.670661 Berners-Lee, T, Fielding, R. T., & Masinter, L. M. (2005). Uniform Resource Identifiers (URI): Generic Syntax. Retrieved from https://doi.org/10.17487/RFC3986 Berners-Lee, Tim. (1995). Hypertext and Our Collective Destiny. Retrieved September 20, 2020, from https://www.dougengelbart.org/content/view/258/000/ Berry, D. M. (2011). The computational turn: thinking about the digital humanities. Culture Machine, 12. 252 Bessi, A., & Ferrara, E. (2016, November). Social bots distort the 2016 U.S. Presidential election online discussion. First Monday. https://firstmonday.org/ojs/index.php/fm/article/ view/7090/5653 Blondel, V. D. et al. (2008). Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, v. 2008, n. 10, p. P10008. Blondel, V. D., Guillaume, J.L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 10008(10), 6. Bode, L., Vraga, E. K., Borah, P., & Shah, D. V. (2014). A new space for political behavior: Political social networking and its democratic consequences. 
Journal of Computer-Mediated Communication, 19, 414–429. http://https://doi.org/10.1111/ jcc4.12048 Bogers, L., Niederer, S., Bardelli, F., & De Gaetano, C. (2020). Confronting bias in the online representation of pregnancy. Convergence, 1–23. https://doi.org/10.1177/1354856520938606 Bogost, I., & Montfort, N. (2009). Platform Studies: Frequently Questioned Answers. Digital Arts and Culture, 12–15(December), 1–6. Borra, E., & Rieder, B. (2014). Programmed Method: Developing a Toolset for Capturing and Analysing Tweets. Aslib Journal of Information Management, Vol. 66(No. 3), 262–278. Bounegru, L., Gray, J., Venturini, T., & Mauri, M. (2017). A field guide to Fake News and other information disorders: a collection of recipes for those who love to cook with digital methods. Retrieved from https://fakenews.publicdatalab.org/ Boy, J. D., & Uitermark, J. (2016). How to study the city on Instagram. PLOS ONE, 11(6), Article e0158161. https://doi. org/10.1371/journal.pone.0158161 Bruhn, M.(2003). Visualization services: Stock photography and the picture industry. Genre: Forms of Discourse and Culture, v. 36, n. 3-4, p. 365-381, 2003. Bruns, A., & Burgess, J. E. (2011, October 17). The use of Twitter hashtags in the formation of ad hoc publics [Conference ses- sion]. Proceedings of the 6th European Consortium for Political Research (ECPR) General Conference, Reykjavik. Bucher, T. (2012). A technicity of attention: How software “makes sense.” Culture Machine, 13, 1–23. 253 Bucher, T. (2013). Objects of Intense Feeling: The Case of the Twitter. Computational Culture. Retrieved from http://computationalculture.net/article/objects-of-intense-feeling-the-case-of-thetwitter-api Buolamwini, J. & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In: Conference on fairness, accountability and transparency. p. 77-91. Buolamwini, J. (2017). Gender shades: Intersectional phenotypic and demographic evaluation of face datasets and gender classifiers (Master Thesis, Massachusetts Institute of Technology). Retrieved from https://dspace.mit.edu/bitstream/handle/1721.1/114068/1026503582-MIT.pdf Burgess, Robin, Remi Jedwab, Edward Miguel, Ameet Morjaria, and Gerard Padró i Miquel. 2015. "The Value of Democracy: Evidence from Road Building in Kenya." American Economic Review, 105 (6): 1817-51. DOI: 10.1257/aer.20131031 Cardon, D., Cointet, J. & Mazieres, A.(2018). Neurons spike back: The Invention of Inductive Machines and the Artificial Intelligence Controversy. Castells, M. (2002). A sociedade em rede. São Paulo: Paz e Terra. Chippada, B. (2017). GitHub - bhargavchippada/forceatlas2: Fastest Gephi’s ForceAtlas2 graph layout algorithm implemented for Python and NetworkX. Retrieved June 4, 2020, from https://github.com/bhargavchippada/forceatlas2 Chollet, F. et al. (2015). Keras. Recovered at https://keras.io Ciuccarelli, P., & Elli, T. (2019). Beyond visualisation. In Reassembling the Republic of Letters in the Digital Age (pp. 299–314). Göttingen University Press. https://doi.org/10.17875/gup2019-1146 Coding Arena. (2018). Build Your First Web App In Visual Studio - Microsoft Virtual Academy - Coding Arena. Retrieved from https://www.youtube.com/watch?v=mgAtiR-1is4 Colombo, G. (2018). The design of composite images: Displaying digital visual content for social research. Corrêa, L. G. (2017). Does impeachment have gender? Circulation of images and texts about Dilma Rousseff in Brazilian and British press. In P. C. 
Castro (Ed.), A circulação discursiva entre produção e reconhecimento (pp. 279–292). Edufal. 254 Cortese, D. K., Szczypka, G., Emery, S., Wang, S., Hair, E., & Vallone, D. (2018). Smoking selfies: Using Instagram to explore young women’s smoking behaviors. Social Media + Society, 4(3). https://doi.org/10.1177/2056305118790762 Crogan, P., & Kennedy, H. (2009). Technologies Between Games and Culture, 4(2), 107–114. https://doi.org/10.1177/0022022103251753 Crogan, P., & Kinsley, S. (2012). Paying Attention: Towards a Critique of the Attention Economy, 13(1997), 1–29. Retrieved from http://eprints.uwe.ac.uk/17039/1/463-965-1-PB.pdf D'Orazio, F. (2014). The Future of Social Media Research. In Woodfield, K. (org.), Social media in social research: blogs on blurring the boundaries. D’Andréa, C. (2018). Cartografando controvérsias com as plataformas digitais: apontamentos teórico-metodológicos. Galáxia (São Paulo), n. 38, p. 28-39. D’Andréa, C. (2020). Pesquisando plataformas online: conceitos e métodos. Salvador: EDUFBA. Retrieved from https://repositorio.ufba.br/ri/handle/ri/32043 D’Andréa, C., & Mintz, A. (2019). Studying the live cross-platform circulation of images with computer vision API: An experiment based on a sports media event. International Journal of Communication, 13, 1825–1845. de Souza, C. R. B., Redmiles, D., Cheng, L.-T., Millen, D., & Patterson, J. (2004). Sometimes you need to see through walls, (September 2015), 63. https://doi.org/10.1145/1031607.1031620 Deng, J. et al.(2009). Imagenet: A large-scale hierarchical image database. In: IEEE conference on computer vision and pattern recognition, 248–255 Dixon, D. (2012). Analysis tool or research methodology: Is there an epistemology for patterns? In D. M. Berry (Ed.), Understanding digital humanities (pp. 191– 209). Palgrave Macmillan. https://doi.org/10.1057/9780230371934 Dodge, M., & Kitchin, R. (2005). Code and the Transduction of Space. Annals of the Association of American Geographers, 95(1), 162–180. https://doi.org/10.1111/j.1467-8306.2005.00454.x Dovey, J., & Kennedy, H. W. (2006). Game cultures : computer games as new media. Open University Press. Fausto Neto, A. (2016). Impeachment according to the logics of the “fabrication” of the event. Rizoma, 4(2), 8–36. https://doi. org/10.17058/rzm.v4i2.8602 255 Flanagan, D. (2011). JavaScript: The Definitive Guide. (M. Loukides, Ed.) (6th ed.). USA: O’Reilly Media. Retrieved from https://books.google.pt/books?id=4RChxt67lvwC&printsec=frontcover#v=onep age&q&f=false Flores, A. M. M. (2019). Produção e Consumo de Vídos em 360° - Tendências para o Jornalismo Brasileiro no YouTube. In J.J. Omena (Ed.), Métodos Digitais: Teoria-Prática-Crítica (pp. 183–201). Lisbon: ICNOVA. Retrieved from https://www.researchgate.net/publication/340814578_Producao_e_consumo_de _videos_em_360_-_tendencias_para_o_jornalismo_brasileiro_no_YouTube França, V. V., & Bernardes, M. (2016). Images, beliefs and truth in the protests of 2013 and 2015. Rumores, 10(19), 8–24. https:// doi.org/10.11606/issn.1982677X.rum.2016.112718 Freelon, D. (2018). Computational research in the post-API age Deen. Political Communication, 35, 665–668. https://doi.org/10.1080/10584609.2018.1477506 Frosh, Paul. (2001). Inside the image factory: stock photography and cultural production. Media, Culture & Society, v. 23, n. 5, p. 625-646. Fuller, M. (2008). Software Studies. A Lexicon. (Matthew Fuller, Ed.). Cambridge, MA; London, England: The MIT Press. Galloway, A. R. (2014). The Cybernetic Hypothesis. 
Differences, 25(1), 107–131. https://doi.org/10.1215/10407391-2420021 Gaver, W. W. (1991). Technology affordances. In Conference on Human Factors in Computing Systems - Proceedings (pp. 79–84). New York, New York, USA: Association for Computing Machinery. https://doi.org/10.1145/108844.108856 Geboers, M. A., & Van De Wiele, C. T. (2020). Machine Vision and Social Media Images: Why Hashtags Matter. Social Media + Society, 6(2). https://doi.org/10.1177/2056305120928485 Geboers, M., et al. (2019). Tracing relational affect on social platforms through image recognition. Retrieved from the Universiteit van Amsterdam’s website: https://wiki.digitalmethods.net/Dmi/SummerSchool2019TracingAffect Gephi Community Project. (2009). GEXF File Format. Retrieved June 14, 2020, from https://gephi.org/gexf/format/ Gerlitz, C., & Rieder, B. (2018). Tweets Are Not Created Equal: investigating Twitter’s client ecosystem. International Journal of Communication : IJoC, 12. 256 Retrieved from https://dare.uva.nl/search?identifier=4da1d406-1213-4103-8237eef5ae786948 Gerlitz, Carolin, & Helmond, A. (2013). The like economy: Social buttons and the data-intensive web. New Media & Society, 15(8), 1348–1365. https://doi.org/10.1177/1461444812472322 Gerlitz, Carolin. (2016). What Counts? Reflections on the Multivalence of Social Media Data. Digital Culture & Society, 2(2). https://doi.org/10.14361/dcs-20160203 Gerrard, Y. (2018). Beyond the hashtag: Circumventing content moderation on social media. New Media and Society, 20(12), 4492–4511. https://doi.org/10.1177/1461444818776611 Giannoulakis, S., & Tsapatsoulis, N. (2016). Evaluating the descrip- tive power of Instagram hashtags. Journal of Innovation in Digital Ecosystems, 3(2), 114–129. https://doi.org/10.1016/j. jides.2016.10.001 Gibb, R. (2016). What is a web application? Retrieved from https://blog.stackpath.com/web-application/ Gibbs, M., Meese, J., Arnold, M., Nansen, B., & Carter, M. (2015). #Funeral and Instagram: death, social media, and platform vernacular. Information Communication and Society, 18(3), 255–268. https://doi.org/10.1080/1369118X.2014.987152 Gillespie, T. (2010). The politics of “platforms.” New Media & Society, 12(3), 347– 364. https://doi.org/10.1177/1461444809342738 Gillespie, T. (2017). The platform metaphor, revisited. The Alexander Von Humboldt Institute for Internet and Society. https://www.hiig.de/en/theplatform-metaphor-revisited Gillespie, Tarleton. (2015). Platforms Intervene. Social Media + Society, 1(1), 1–2. https://doi.org/10.1177/2056305115580479 Gillespie, Tarleton. (2018a). Governance of and by platforms. In A. Burgess, Jean; Poell, Thomas & Marwick (Ed.), The Sage handbook of social media (SAGE Publi, pp. 254–278). Sage. Retrieved from https://www.microsoft.com/enus/research/publication/governance-of-and-by-platforms/ Gillespie, Tarleton. (2018b). Platforms Are Not Intermediaries. 2 GEO. L. TECH. REV. 198, (2), 198–216. https://doi.org/10.1177/1527476411433519 257 Google. (2017). Google Cloud Vision API (Version 1.0) [Computer software]. https://cloud.google.com/vision Grandjean, M. & Jacomy, M.(2019). Translating Networks: Assessing Correspondence Between Network Visualisation and Analytics. Digital Humanities, 10, 2019. Grohmann, R. (2018). The notion of engagement: meanings and traps for communication research. Revista FAMECOS, 25(3), 29387. https://doi.org/10.15448/1980-3729.2018.3.29387 Haas, A. M. (2012). 
Race, rhetoric, and technology: A case study of decolonial technical communication theory, methodology, and pedagogy. Journal of Business and Technical Communication, v. 26, n. 3, p. 277-310. Helmond, A. (2013). The Algorithmization of the Hyperlink | Computational Culture. Computational Culture, 3. Retrieved from http://computationalculture.net/the-algorithmization-of-the-hyperlink/ Helmond, A. (2015a). The Platformization of the Web: Making Web Data Platform Ready. Social Media + Society, 1(2), 2056305115603080. https://doi.org/10.1177/2056305115603080 Helmond, A. (2015b). The web as platform: Data flows in social media. PhD dissertation. University of Amsterdam. Retrieved from http://www.annehelmond.nl/wordpress/wpcontent/uploads//2015/08/Helmond_WebAsPlatform.pdf Heymann, S. (2014). Gephi. Encyclopedia of Social Networks and Mining (ESNAM). Hendricks, Lisa Anne et al.(2018). Women also snowboard: Overcoming bias in captioning models. European Conference on Computer Vision. Springer, Cham, 2018. p. 793-811. Highfield, T. (2018). Emoji hashtags // hashtag emoji: Of platforms, visual affect, and discursive flexibility. First Monday, 23(9), 1–16. https://doi.org/10.5210/fm.v23i9.9398 Highfield, T., & Leaver, T. (2015, January). A methodology for mapping Instagram hashtags. First Monday, 20(1). https://doi. org/10.5210/fm.v20i1.5563 Highfield, T., & Leaver, T. (2016). Instagrammatics and digital methods: Studying visual social media, from selfies and GIFs to memes and emoji. Communication Research and Practice, 2(1), 47–62. 258 Ho, J. C. T. (2020). How biased is the sample? Reverse engineering the ranking algorithm of Facebook’s Graph application programming interface. Big Data and Society, 7(1). https://doi.org/10.1177/2053951720905874 Hoel, A. S. (2018). Technicity. In Posthuman Glossary (Rosi Braid, pp. 420–423). Bloomsbury. Hoel, A. S., & Van Der Tuin, I. (2013). The ontological force of technicity: Reading Cassirer and Simondon diffractively. Philosophy and Technology, 26(2), 187– 202. https://doi.org/10.1007/s13347-012-0092-5 Hussain, Zaeem et al.(2017). Automatic understanding of image and video advertisements. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. p. 1705-1715. Iliadis, A. (2015). Two examples of concretization. Platform, 6(1), 86–95. Instaloader. (2019). (Version 4.2.6) [Computer software]. https:// github.com/instaloader/instaloader Jacomy, M. (2019). The Web as Layers. Retrieved from https://panopto.aau.dk/Panopto/Pages/Viewer.aspx?id=48cfe5ff-5503-431b887f-ab53007ef5c4 Jacomy, Mathieu, Venturini, T., Heymann, S., & Bastian, M. (2014). ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE, 9(6). https://doi.org/10.1371/journal.pone.0098679 Jacomy, Mathieu. (2011). ForceAtlas2, the new version of our home-brew Layout. Retrieved June 4, 2020, from https://gephi.wordpress.com/2011/06/06/forceatlas2-the-new-version-of-ourhome-brew-layout/ Jacomy, Mathieu. (2020a). A validity metric for interpreting distances in a network map. Retrieved June 4, 2020, from https://observablehq.com/@jacomyma/avalidity-metric-for-interpreting-distances-in-a-networkm?collection=@jacomyma/quest-for-connected-closeness Jacomy, Mathieu. (2020b). Making complex networks interpretable with a metric – Reticular. Retrieved June 4, 2020, from https://reticular.hypotheses.org/1603 Jinkings, I., Doria, K., & Cleto, M. (Eds.). (2016). Why do we shout coup? 
To understand the impeachment and political crisis in Brazil. Boitempo Editorial. 259 Joo, Jungseock et al. (2014). Visual persuasion: Inferring communicative intents of images. In: Proceedings of the IEEE conference on computer vision and pattern recognition. p. 216-223. Jünger, J., & Keyling, T. (2019). Facepager. An application for automated data retrieval on the web. Jungherr, A. (2014, February 27). Twitter in politics: A compre- hensive literature review. SSRN Electronic Journal. https://doi. org/10.2139/ssrn.2402443 Jungherr, A. (2015). Twitter use in election campaigns: A system- atic literature review. Journal of Information Technology & Politics, 13(1), 72–91. Karako, C. & Manggala, P. (2018). Using image fairness representations in diversitybased re-ranking for recommendations. In: Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization. Kitchin, R., & Dodge, M. (2007). Rethinking maps. Progress in Human Geography, 31(3), 331–344. https://doi.org/10.1177/0309132507077082 Knuttile, L. (2011). User unknown: 4chan, anonymity and contingency. First Monday, 16(10). Retrieved from https://doi.org/10.5210/fm.v16i10.3665 Kuecklich, J. (2009). A Techno-Semiotic Approach to Cheating in Computer Games Or How I Learned to Stop Worrying and Love the Machine. Games and Culture, 4(2), 158–169. https://doi.org/10.1177/1555412008325486 Laestadius, L. (2017). Instagram. In Sloan, L.; Quan-Haase, A. (Orgs.), The SAGE handbook of social media research methods (p.573–592). Los Angeles ; London: SAGE Publications Ltd. Langlois, G., & Elmer, G. (2013). The research politics of social media platforms. Culture Machine, 14, 1–17. Latour, B. (2001). Um coletivo de humanos e não-humanos: no labirinto de Dédalo. A esperança de Pandora: ensaios sobre a realidade dos estudos científicos. São Paulo: EDUSC. Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press. Latour, B. (2010). Tarde’s idea of quantification. In M. Candea (Ed.), The social after Gabriel Tarde: Debates and assess- ments (pp. 145–162). Routledge. Latour, B., Jensen, P., Venturini, T., Grauwin, S., & Boullier, D. (2012). “The whole is always smaller than its parts” - a digital test of Gabriel Tardes’ monads. 260 British Journal of Sociology, 63(4), 590–615. https://doi.org/10.1111/j.14684446.2012.01428.x Lazer, D. M. J., Pentland, A., Watts, D. J., Aral, S., Athey, S., Contractor, N., … Wagner, C. (2020). Computational social science: Obstacles and opportunities. Science, 369(6507), 1060–1062. https://doi.org/10.1126/science.aaz8170 Lazer, D., Brewer, D., Christakis, N., Fowler, J., & King, G. (2009). Life in the network: the coming age of computational social science. Science, 323(5915), 721–723. https://doi.org/10.1126/science.1167742.Life Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A. L., Brewer, D., … Van Alstyne, M. (2009, February 6). Social science: Computational social science. Science. American Association for the Advancement of Science. https://doi.org/10.1126/science.1167742 Lemos, A. (2002). Cibercultura, tecnologia e vida social na cultura contemporânea. Porto Alegre: Sulina. Lima, M. (2010). Racial inequalities and public policy: affirmative action in the Lula government. Novos Estudos CEBRAP, 87, 77–95. Lisis Laboratory. (2017). CorTexT Manager [Computer software]. https://managerv2.cortext.net Liu, A. (2009). Digital humanities and academic change. English Language Notes, 47, 17–35. Loken, B. & Ward, J. (1990). 
Alternative approaches to understanding the determinants of typicality. Journal of Consumer Research, v. 17, n. 2, p. 111126. Lomborg, S., & Bechmann, A. (2014). Using APIs for Data Collection on Social Media. The Information Society, 30(4), 256–265. https://doi.org/10.1080/01972243.2014.915276 Mackenzie, A. (2019, November 10). From API to AI: platforms and their opacities. Information Communication and Society, pp. 1989–2006. https://doi.org/10.1080/1369118X.2018.1476569 Manovich, L. (1993). The engineering of vision from constructivism to computers (doctoral thesis), University of Rochester. Retrieved from http://manovich.net/EV/EV.PDF Manovich, L. (2009). Cultural Analytics: Visualizing Cultural Patterns in the Era of “More Media.” New York, 1–4. 261 Manovich, L. (2014). Software is the message. Journal of Visual Culture, 13(1), 79– 81. https://doi.org/10.1177/1470412913509459 Manovich, L. (2020). Cultural Analytics. The MIT Press. Retrieved from https://mitpress.mit.edu/books/cultural-analytics Markham AN, Buchanan E (2012) Ethical decision-making and internet research, recommendations from the AoIR ethics working committee (version 2.0). Retreived from: http://aoir.org/ reports/ethics2.pdf Markham A (2017) Impact model for ethics: notes from a talk. Retrieved from: https:// annettemarkham.com/2017/07/impact-model-ethics/ Marres, N. (2011). Re-distributing methods: digital social research as participatory research. Sociological Review. Retrieved from http://eprints.gold.ac.uk/6846 Marres, N. (2017). Digital sociology: the reinvention of social research. Bristol: Polity Press. Marres, N. & Moats, D. (2015). Mapping controversies with social media: The case for symmetry. Social Media+ Society. Marres, N., & Weltevrede, E. (2012). Scraping the social? Issues in live social research. Journal of Cultural Economy, 6(3), 313–335. Mauri, M., Elli, T., Caviglia, G., Uboldi, G., & Azzi, M. (2017). RAWGraphs. In Proceedings of the 12th Biannual Conference on Italian SIGCHI Chapter CHItaly ’17 (pp. 1–5). New York, New York, USA: ACM Press. https://doi.org/10.1145/3125571.3125585 Mauri, M., Gobbo, B., & Colombo, G. (2019). O papel do designer no contexto do data sprint. In J.J. Omena (Ed.), Métodos digitais: Teoria-Prática-Crítica (pp. 161–180). Lisbon: ICNOVA. Meyer, B. (1998). Object-oriented software construction (Second Edi). Santa Barbara (California), USA: Prentice Hall. Miller, P. (2018). Web apps are only getting better. Retrieved from https://www.theverge.com/circuitbreaker/2018/4/11/17207964/web-appsquality-pwa-webassembly-houdini Mindfire Solutions. (2017). How important is JavaScript for Modern Web Developers? Retrieved from https://medium.com/@mindfiresolutions.usa/howimportant-is-javascript-for-modern-web-developers-2854309b9f52 Mintz, A. (2019) Stock scraper (software). Retrieved from https://github.com/amintz/stock-scraper 262 Mintz, A., Silva, T., Gobbo, B., Pilipets, E., Azhar, H., Takamistu, H., Omena, J. J., & Oliveira, T. (2019). Interrogating vision APIs [Smart Data Sprint 2019]. Universidade Nova de Lisboa. https://smart.inovamedialab.org/smart2019/project-reports/ interrogating-vision-apis/ Mintz, A.(2018a) Image Network Plotter (software). Retrieved from https://github.com/amintz/image-network-plotter Mintz, A.(2018b). Memespector Python (software). Retrieved from https://github.com/amintz/memespector-python Moats, D., & Borra, E. (2018). Quali-quantitative methods beyond networks: Studying information diffusion on Twitter with the Modulation Sequencer. 
Moraes, T. P. B., & Quadros, D. G. (2016). The crisis of Dilma Rousseff government in 140 characters on Twitter: From #impeachment to #foradilma. Em Debate: Periódico de Opinião Pública e Conjuntura Política, 8(1), 14–21. http://bibliotecadigital.tse.jus.br/xmlui/handle/bdtse/3290
Morozov, E. (2018). There is a leftwing way to challenge big tech for our data. Here it is. Retrieved September 29, 2020, from https://www.theguardian.com/commentisfree/2018/aug/19/there-is-a-leftwing-way-to-challenge-big-data-here-it-is?CMP=twt_gu
Mozilla Developer Network. (2020). CSS: Cascading Style Sheets. Retrieved from https://developer.mozilla.org/en-US/docs/Web/CSS
Mozilla Developer Network. (2020). What is JavaScript? Retrieved November 9, 2020, from https://developer.mozilla.org/en-US/docs/Web/JavaScript/About_JavaScript
Munk, A. K., Madsen, A. K., & Jacomy, M. (2019). Thinking through the databody: Sprints as experimental situations. In Å. Mäkitalo, T. Nicewonger, & M. Elam (Eds.), Designs for Experimentation and Inquiry: Approaching Learning and Knowing in Digital Transformation (1st ed., pp. 110–128). London: Routledge.
Murthy, D., Powell, A. B., Tinati, R., Anstead, N., Carr, L., Halford, S. J., & Weal, M. (2016). Bots and political influence: A sociotechnical investigation of social network capital. International Journal of Communication, 10, 4952–4971.
Murugesan, S. (2007). Understanding Web 2.0. IEEE Computer Society, 1–10. https://doi.org/10.1109/MITP.2007.78
Napoli, P. M. (2008). Toward a model of audience evolution: New technologies and the transformation of media audiences. McGannon Center Working Paper Series.
Niederer, S., & Colombo, G. (2019). Visual Methodologies for Networked Images: Designing Visualizations for Collaborative Research, Cross-platform Analysis, and Public Participation. Diseña, (14), 40–67.
Niederer, S., & van Dijck, J. (2010). Wisdom of the crowd or technicity of content? Wikipedia as a sociotechnical system. New Media & Society, 12(8), 1368–1387. https://doi.org/10.1177/1461444810365297
Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. New York: NYU Press.
Norman, D. (1988). The Psychology of Everyday Things. New York: Basic Books.
O'Reilly, T. (2005). What is Web 2.0: Design patterns and business models for the next generation of software. Retrieved from https://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html
Omena, J. J., & Amaral, I. (2019). Sistema de leitura de redes digitais multiplataforma. In J. J. Omena (Ed.), Métodos Digitais: Teoria-Prática-Crítica (pp. 121–140). Lisbon: ICNOVA. Retrieved from https://www.researchgate.net/publication/339434985_Sistema_de_leitura_de_redes_digitais_multiplataforma
Omena, J. J., & Rosa, J. M. (2017). “Brazil went to the streets” - Again! Studies of protests on social networks. In C. Camponez, F. Pinheiro, J. Fernandes, M. Gomes, & R. Sobreira (Eds.), Comunicação e Transformações Sociais, Vol. II: Comunicação Política, Comunicação Organizacional e Institucional e Cultura Visual (Atas do IX Congresso da SopCom) (pp. 51–74). Associação Portuguesa de Ciências da Comunicação.
Omena, J. J. (2019). O que são métodos digitais? In J. J. Omena (Ed.), Métodos Digitais: Teoria-Prática-Crítica (pp. 1–15). Lisbon: ICNOVA.
Omena, J. J., & Granado, A. (2020). Call into the platform! Revista ICONO14 Revista Científica de Comunicación y Tecnologías Emergentes, 18(1), 89–122. https://doi.org/10.7195/ri14.v18i1.1436
Omena, J. J., Chao, J., Pilipets, E., Kollanyi, B., Zilli, B., Flaim, G., … Nero, S. (2019). Bots and the black market of social media engagement. https://doi.org/10.13140/RG.2.2.30518.52804
Omena, J. J., Rabello, E. T., & Mintz, A. G. (2020). Digital Methods for Hashtag Engagement Research. Social Media + Society, (July–September), 1–18. https://doi.org/10.1177/2056305120940697
Omena, J. J., Rabello, E. T., & Mintz, A. (2017). Visualising hashtag engagement: Imagery of political polarisation on Instagram. Retrieved June 12, 2020, from https://wiki.digitalmethods.net/Dmi/InstagramLivenessVisualisingengagement
Omena, J. J., Deem, A., Gobbo, B., Van Geenen, D., Alves, D., Kannasto, E., … Israel Turin, V. (2020). Cross-platform digital networks: Exploring the narrative affordances of force-directed layouts and data relations nature. Retrieved from https://smart.inovamedialab.org/2020-digital-methods/project-reports/cross-platform-digital-networks/
Omena, J. J. (2017). Insta Bots and the black market of social media engagement. The Social Platforms. Retrieved June 12, 2020, from https://thesocialplatforms.wordpress.com/2017/12/21/insta-bots-and-the-black-market-of-social-media-engagement/
Omena, J. J. (Ed.). (2019). Métodos Digitais: Teoria-Prática-Crítica. Lisbon: ICNOVA. Retrieved from https://www.icnova.fcsh.unl.pt/metodos-digitais-teoria-pratica-critica/
Ooghe-Tabanou, B., Jacomy, M., Girard, P., & Plique, G. (2018). Hyperlink is not dead! In ACM International Conference Proceeding Series (Vol. 2, pp. 12–18). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/3240431.3240434
Osoba, O. A., & Welser IV, W. (2017). An Intelligence in Our Image: The Risks of Bias and Errors in Artificial Intelligence. RAND Corporation.
Papacharissi, Z. (2015). Affective publics: Sentiment, technology, and politics. Oxford University Press.
Parnas, D. L. (1971). Information distribution aspects of design methodology. Retrieved from http://nova.campusguides.com/hpdilll
Pasquale, F. (2016). The Black Box Society: The Secret Algorithms That Control Money and Information. Cambridge, MA; London: Harvard University Press.
Pearce, W., Özkula, S. M., Greene, A. K., Teeling, L., Bansard, J. S., Omena, J. J., & Rabello, E. T. (2018). Visual cross-platform analysis: Digital methods to research social media images. Information, Communication & Society, 23(2), 161–180. https://doi.org/10.1080/1369118X.2018.1486871
Peeters, S., & Hagen, S. (2018). 4CAT: Capture and Analysis Toolkit.
Perriam, J., Birkbak, A., & Freeman, A. (2020). Digital methods in a post-API environment. International Journal of Social Research Methodology, 23(3), 277–290. https://doi.org/10.1080/13645579.2019.1682840
Petit, V. (2012). Ars Industrialis. Retrieved February 3, 2018, from http://arsindustrialis.org/grammatisation
Pilipets, E. (2019). From Netflix Streaming to Netflix and Chill: The (Dis)Connected Body of Serial Binge-Viewer. Social Media + Society, 5(4). https://doi.org/10.1177/2056305119883426
Pilipets, E., Flores, A. M. M., Flaim, G., Skazedonig, M., Sepúlveda, R., & Del Nero, S. (2019). From “tumblr purge” to “female nipples.” Retrieved from https://smart.inovamedialab.org/2020-digital-methods/project-reports/tumblr-purge-female-nipples/
Poell, T., Nieborg, D., & van Dijck, J. (2019). Platformisation. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1425
Porikli, F., Shan, S., Snoek, C., Sukthankar, R., & Wang, X. (2018). Deep Learning for Visual Understanding: Part 2 [From the Guest Editors]. IEEE Signal Processing Magazine. https://doi.org/10.1109/MSP.2017.2766286
Portilla, J. H. I. (2013). Ciencias de la computación: ¿un reto para el pensamiento decolonial? Revista Criterios, 20(1), 91–99.
Pritchard, K., & Whiting, R. (2015). Taking stock: A visual analysis of gendered ageing. Gender, Work & Organization, 22(5), 510–528.
Quijano, A. (2010). Colonialidade do poder e classificação social. In B. Santos (Ed.), Epistemologias do Sul (pp. 84–130). São Paulo: Cortez.
Rabello, E., Matta, G., Omena, J. J., Silva, T., Teixeira, A., Cano-Orón, L., … Costa, A. R. (2018). Visualising engagement on Zika epidemic: Public health and social insights from platform data analysis. https://doi.org/10.13140/RG.2.2.26627.32800/1
Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (pp. 429–435).
Rambukkana, N. (Ed.). (2015). Hashtag publics: The power and politics of discursive networks. Peter Lang.
Ribeiro, M. M., & Ortellado, M. (2016). Digital profile of protesters of 13th and 18th of March. El País. https://brasil.elpais.com/brasil/2016/03/28/opinion/1459128271_535467.html
Ricci, D., Colombo, G., Meunier, A., & Brilli, A. (2017, June 28–30). Designing digital methods to monitor and inform urban policy: The case of Paris and its urban nature initiative [Conference session]. Proceedings of the 3rd International Conference on Public Policy (ICPP3), Singapore.
Rieder, B. (2015). Analyzing Social Media with Digital Methods: Possibilities, Requirements, and Limitations. Retrieved October 26, 2015, from https://www.slideshare.net/bernhardrieder/analyzing-social-media-with-digital-methods-possibilities-requirements-and-limitations
Rieder, B. (2015). Visual Tagnet Explorer [Computer software]. https://tools.digitalmethods.net/netvizz/instagram/
Rieder, B. (2016). Closing APIs and the public scrutiny of very large online platforms. http://thepoliticsofsystems.net/2016/05/closing-apis-and-the-public-scrutiny-of-very-large-online-platforms/
Rieder, B. (2017). Memespector. Retrieved from https://github.com/bernorieder/memespector
Rieder, B. (n.d.). Textanalysis [Computer software]. http://labs.polsys.net/tools/textanalysis/
Rieder, B., & Röhle, T. (2018). Digital Methods: From Challenges to Bildung. In M. T. Schäfer & K. van Es (Eds.), The Datafied Society (pp. 109–124). Amsterdam University Press. https://doi.org/10.1515/9789048531011-010
Rieder, B., & Röhle, T. (2019). Métodos digitais: dos desafios à Bildung. In J. J. Omena (Ed.), Métodos Digitais: Teoria-Prática-Crítica (pp. 19–36). Lisbon: ICNOVA. Retrieved from https://www.icnova.fcsh.unl.pt/metodos-digitais-teoria-pratica-critica/
Rieder, B., Abdulla, R., Poell, T., Woltering, R., & Zack, L. (2015). Data critique and analytical opportunities for very large Facebook Pages: Lessons learned from exploring “We are all Khaled Said.” Big Data & Society, 2(2). https://doi.org/10.1177/2053951715614980
Rieder, B., Coromina, Ò., & Matamoros-Fernández, A. (2020). Mapping YouTube. First Monday, 25(8). https://doi.org/10.5210/fm.v25i8.10667
Rieder, B., Matamoros-Fernández, A., & Coromina, Ò. (2018). From ranking algorithms to ‘ranking cultures’: Investigating the modulation of visibility in YouTube search results. Convergence, 24(1), 50–68. https://doi.org/10.1177/1354856517736982
Rieder, B. (2013). Studying Facebook via Data Extraction: The Netvizz Application. In Proceedings of WebSci ’13, the 5th Annual ACM Web Science Conference (pp. 346–355). https://doi.org/10.1145/2464464.2464475
Rieder, B. (2015). YouTube Data Tools (Version 1.11) [Computer software].
Rieder, B. (2020). Engines of Order: A Mechanology of Algorithmic Techniques. Amsterdam University Press.
Roberts, L. G. (1963). Machine perception of three-dimensional solids. Retrieved from https://www.researchgate.net/publication/220695992_Machine_Perception_of_Three-Dimensional_Solids
Rogers, R. (2017). Foundations of Digital Methods: Query Design. In M. T. Schäfer & K. van Es (Eds.), The Datafied Society: Studying Culture through Data (pp. 75–94). Amsterdam: Amsterdam University Press. https://doi.org/10.5117/9789462981362
Rogers, R. (2018). Otherwise Engaged: Social Media from Vanity Metrics to Critical Analytics. International Journal of Communication, 12, 450–472.
Rogers, R., & Lewthwaite, S. (2019). Teaching Digital Methods: Interview with Richard Rogers. Diseña, (14), 12–37. https://doi.org/10.7764/disena.14.12-37
Rogers, R. (1996). The future of STS on the web, or: What I learned (naively) making the EASST website. EASST Review, 15(2), 25–27. https://www.researchgate.net/publication/239841669_The_Future_of_Science_and_Technology_Studies_on_the_Web
Rogers, R. (2009). The End of the Virtual: Digital Methods. Amsterdam: Amsterdam University Press. https://doi.org/10.5117/9789056295936
Rogers, R. (2010). Internet Research: The Question of Method. A keynote address from the YouTube and the 2008 Election Cycle in the United States conference. Journal of Information Technology & Politics, 7(2–3), 241–260. https://doi.org/10.1080/19331681003753438
Rogers, R. (2013). Digital Methods. Cambridge, MA: MIT Press.
Rogers, R. (2015). Digital Methods for Web Research. In R. Scott & S. Kosslyn (Eds.), Emerging Trends in the Behavioral and Social Sciences (pp. 1–22). Hoboken, NJ: Wiley. https://doi.org/10.1002/9781118900772
Rogers, R. (2018). Digital Methods for Cross-platform Analysis. In J. Burgess et al. (Eds.), The SAGE Handbook of Social Media (pp. 91–108). London: SAGE Publications. https://doi.org/10.4135/9781473984066.n6
Rogers, R. (2019). Doing digital methods. London: Sage.
Rosa, J. M., Omena, J. J., & Cardoso, D. (2018). Watchdogs in the social network: A polarized perception? Observatório, 12(5), 98–117.
Rose, G. (2016). Visual Methodologies (4th ed.). UK: Open University.
Ruppert, E., Law, J., & Savage, M. (2013). Reassembling Social Science Methods: The Challenge of Digital Devices. Theory, Culture & Society, 30(4), 22–46. https://doi.org/10.1177/0263276413484941
Rykov et al. (2016). Semantic and geospatial mapping of Instagram images in Saint Petersburg. In Proceedings of the AINL FRUCT 2016 Conference, Saint Petersburg, Russia, 10–12 November 2016. Retrieved from http://ieeexplore.ieee.org/servlet/opac?punumber=7889413
Shirky, C. (2004). Situated Software. Retrieved July 23, 2020, from https://www.gwern.net/docs/technology/2004-03-30-shirkysituatedsoftware.html
Silva, T., Barciela, P., & Meirelles, P. (2019). Mapeando Imagens de Desinformação e Fake News Político-Eleitorais com Inteligência Artificial. In 3º CONEC: Congresso Nacional de Estudos Comunicacionais da PUC Minas Poços de Caldas - Convergência e Monitoramento (pp. 413–427). Retrieved from https://conec.pucpcaldas.br/wp-content/uploads/2019/06/anais2018.pdf
Silva, T., Mintz, A., Omena, J. J., Gobbo, B., Oliveira, T., Takamitsu, H. T., … Azhar, H. (2020). APIs de Visão Computacional: investigando mediações algorítmicas a partir de estudo de bancos de imagens. Logos, 27(1), 25–54. https://doi.org/10.12957/logos.2020.51523
Silva, T., Meirelles, P., & Apolonio, B. (2018). Visão Computacional nas Mídias Sociais: Estudando imagens de #Férias no Instagram. Presented at the I Encontro Norte e Nordeste da ABCiber, São Luís.
Simondon, G. (1980). On the Mode of Existence of Technical Objects (Vol. 1). University of Western Ontario.
Simondon, G. (2009). Technical Mentality. Parrhesia, 17–27.
Simondon, G. (2017). On the Mode of Existence of Technical Objects. Minneapolis: University of Minnesota Press.
Small, T. A. (2011). What the hashtag? A content analysis of Canadian politics on Twitter. Information, Communication & Society, 14(6), 872–895.
Smeulders et al. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349–1380.
Srnicek, N. (2017). Platform Capitalism. Cambridge, UK; Malden, MA: Polity Press.
Stiegler, B. (2006). Anamnesis and hypomnesis: The memories of desire. In L. Armand & A. Bradley (Eds.), Technicity (pp. 15–41). Litteraria Pragensia.
Stiegler, B. (2011). The Decadence of Industrial Democracies. Cambridge, UK: Polity Press.
Stiegler, B. (2012). Die Aufklärung in the Age of Philosophical Engineering. Computational Culture.
Stiegler, B. (2018). Technologies of memory and imagination.
Sturm, R., Pollard, C., & Craig, J. (2017). Application Performance Management (APM) in the Digital Enterprise: Management of Traditional Applications. Retrieved from https://www.sciencedirect.com/book/9780128040188/application-performance-management-apm-in-the-digital-enterprise
Tableau Desktop. (2018). Tableau (Version 10.4.6) [Computer software]. https://www.tableau.com/products/desktop
Taibi, D., Rogers, R., Marenzi, I., Nejdl, W., Ahmad, Q. A. I., & Fulantelli, G. (2016). Search as research practices on the web: The SaR-Web platform for cross-language engine results analysis. In WebSci 2016 - Proceedings of the 2016 ACM Web Science Conference (pp. 367–369). https://doi.org/10.1145/2908131.2908201
Tavares, F. D. M. B., Berger, C., & Vaz, P. B. (2016). A foreseen coup: Lula, Dilma and the pro-impeachment discourse on Veja magazine. Pauta Geral: Estudos em Jornalismo, 3(2), 20–44. http://www.revistas2.uepg.br/index.php/pauta/article/view/9174
Tiidenberg, K. (2020). Research Ethics, Vulnerability, and Trust on the Internet. In J. Hunsinger, M. Allen, & L. Klastrup (Eds.), Second International Handbook of Internet Research. Dordrecht: Springer. https://doi.org/10.1007/978-94-024-1555-1_55
Tifentale, A. (2015). Making sense of the selfie: Digital image-making and image-sharing in social media. Scriptus Manet, 1, 47–59.
Tifentale, A., & Manovich, L. (2015). Selfiecity: Exploring photography and self-fashioning in social media. In Postdigital Aesthetics (pp. 109–122). Springer.
Tiidenberg, K., & Baym, N. K. (2017). Learn it, buy it, work it: Intensive pregnancy on Instagram. Social Media + Society, 3(1). https://doi.org/10.1177/2056305116685108
Toft-Nielsen, C., & Nørgård, R. T. (2015). Expertise as gender performativity and corporeal craftsmanship. Convergence, 21(3), 343–359. https://doi.org/10.1177/1354856515579843
Tuters, M., Jokubauskaitė, E., & Bach, D. (2018). Post-Truth Protest: How 4chan Cooked Up the Pizzagate Bullshit. M/C Journal, 21(3). https://doi.org/10.5204/mcj.1422
van Dijck, J., Poell, T., & de Waal, M. (2018). The Platform Society. Oxford University Press. https://doi.org/10.1093/oso/9780190889760.001.0001
van Dijck, J. (2013). The Culture of Connectivity: A Critical History of Social Media. New York: Oxford University Press.
van Dijck, J. (2020). Seeing the forest for the trees: Visualizing platformization and its governance. New Media & Society. https://doi.org/10.1177/1461444820940293
Venturini, T. (2010). Diving in magma: How to explore controversies with actor-network theory. Public Understanding of Science, 19(3), 258–273.
Venturini, T., Jacomy, M., & Pereira, D. (2015). Visual Network Analysis. Sciences Po médialab working paper.
Venturini, T., Jacomy, M., Bounegru, L., & Gray, J. (2018). Visual Network Exploration for Data Journalists. In S. Eldridge II & B. Franklin (Eds.), Handbook to Developments in Digital Journalism Studies. Abingdon: Routledge.
Venturini, T., & Rogers, R. (2019). “API-based research” or how can digital sociology and digital journalism studies learn from the Cambridge Analytica affair. Digital Journalism. https://doi.org/10.1080/21670811.2019.1591927
Venturini, T., Bounegru, L., Gray, J., & Rogers, R. (2018). A reality check(list) for digital methods. New Media & Society, 20(11), 4195–4217. https://doi.org/10.1177/1461444818769236
Venturini, T., Jacomy, M., & Jensen, P. (2019). What do we see when we look at networks: An introduction to visual network analysis and force-directed layouts. SSRN. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3378438
Venturini, T., Jacomy, M., Meunier, A., & Latour, B. (2017). An unexpected journey: A few lessons from Sciences Po médialab's experience. Big Data & Society, 4(2). https://doi.org/10.1177/2053951717720949
Vis, F. (2013). A critical reflection on Big Data: Considering APIs, researchers and tools as data makers. First Monday, 18(10). https://doi.org/10.5210/fm.v18i10.4878
Warwick, C., Terras, M., & Nyhan, J. (Eds.). (2012). Digital Humanities in Practice. Cambridge University Press. https://doi.org/10.29085/9781856049054
Wattenberg, M., & Viégas, F. B. (2008). The word tree, an interactive visual concordance. IEEE Transactions on Visualization and Computer Graphics, 14(6), 1221–1228. https://doi.org/10.1109/TVCG.2008.172
Watts, D. J. (2007). A twenty-first century science. Nature, 445(7127), 489. https://doi.org/10.1038/445489a
Weltevrede, E., & Borra, E. (2016). Platform affordances and data practices: The value of dispute on Wikipedia. Big Data & Society, 3(1), 1–16. https://doi.org/10.1177/2053951716653418
West, C. (2018). The Lean In Collection: Women, Work, and the Will to Represent. Open Cultural Studies, 2(1), 430–439.
OpenStreetMap Wiki. (2017). GDF. Retrieved June 14, 2020, from https://wiki.openstreetmap.org/wiki/GDF
Wikipedia. (2015). Gephi. Retrieved June 4, 2020, from https://en.wikipedia.org/wiki/Gephi
Wikipedia. (2020). Coupling (computer programming). Retrieved June 5, 2020, from https://en.wikipedia.org/wiki/Coupling_(computer_programming)
Williams, R. (1989). Culture is ordinary. In R. Williams (Ed.), Resources of Hope: Culture, Democracy, Socialism (pp. 3–14). Verso.
Wilson, C. (2017, April 6). I spent two years botting on Instagram – Here's what I learned [Blog post]. PetaPixel. https://petapixel.com/2017/04/06/spent-two-years-botting-instagram-heres-learned/
Woolley, S. C. (2016). Automating Power: Social bot interference in global politics. First Monday, 21(4).
Woolley, S. C., & Howard, P. N. (2016). Political Communication, Computational Propaganda, and Autonomous Agents. International Journal of Communication, 10, 4882–4890.
World Wide Web Consortium. (n.d.). HTML & CSS. Retrieved November 9, 2020, from https://www.w3.org/standards/webdesign/htmlcss#whatcss
Yoshihara, N. (2011). Development History of Wire Rods for Valve Springs. KOBELCO Technology Review. Retrieved from https://www.semanticscholar.org/paper/Development-History-of-Wire-Rods-for-Valve-Springs-Yoshihara/3ca2f39b7e0c18167e116331a9383f4488acc83b
Zhang, Z. (2020). Infrastructuralization of Tik Tok: Transformation, power relationships, and platformization of video entertainment in China. Media, Culture & Society. https://doi.org/10.1177/0163443720939452
Appendices