Capítulo / Chapter IV | Cinema – Tecnologia / Technology

From photographic representation to statistical abstraction: a self-portrait on (in)visibility

Marta Amorim

Centro de Investigação em Arte e Comunicação, Universidade do Algarve, Portugal

Bruno Mendes da Silva

Centro de Investigação em Arte e Comunicação, Universidade do Algarve, Portugal

Abstract

This paper examines the creative process behind the conceptualization of a video composed of AI-generated images, structured around a digitized version of a classic studio portrait captured with an analog large-format camera, which is progressively deconstructed and reconfigured into an abstraction.
The research finds grounding in artworks that employ transformation and disappearance as a means of generating new visual forms, whether through mechanical or technological interventions, by artists such as Robert Rauschenberg, Wolf Vostell, Stan Brakhage, Hiroshi Sugimoto, and Thomas Ruff, whose practices—despite differences in medium, methodology, and historical context—navigate the paradox of destruction as creation.
The use of generative AI in the project aligns with a lineage of experimental approaches to abstraction in video art. The image-making process, rooted in human-machine collaboration, unfolds through an iterative approach in which the creative vision of the author is negotiated with the statistical mechanisms of AI-driven synthesis. A dual-input strategy is employed, combining image and text prompts, with the latter incorporating terms related to unseen layers of digital technology, based on the premise that subjectivity can foster abstraction.
Situated within broader discussions on the evolving relationship between technology and visual culture, this study interrogates the conditions defining the current landscape of abstract moving-image production. It examines the transition to probabilistic imaging as a shift that expands abstraction beyond aesthetic concerns, framing it as both a form of conceptual invisibility and a critical mode of inquiry into underlying mechanisms of contemporary digital imagery.

Keywords: AI-generated imaging, Moving images, Image deconstruction, Abstraction, Invisibility in digital culture.

Introduction

In the twenty-first century, image making has expanded from analog and digital photographic processes to the latent space of generative models, where pictures condense out of probability distributions. This transition not only introduces a new aesthetics of abstraction, but also raises ontological questions about the nature of images shaped by algorithmic systems, a line of inquiry explored throughout this study in relation to operational opacity and the dissolution of representational form.

The audiovisual piece developed within this project takes as its point of departure a digitized studio portrait originally captured using a large-format analog camera, composed according to the visual conventions of traditional portraiture: neutral backdrop, controlled lighting, and a poised, three-quarter view. This choice establishes a deliberate tension, anchoring the image in a representational mode historically linked to stability, authorship, and referential clarity, while subjecting it to a process of algorithmic abstraction driven by statistical inference and latent-space computation. This juxtaposition produces a conceptual anachronism, placing the enduring materiality of analog photography in dialogue with the volatility of generative AI. The resulting interaction occurs through a series of iterative experiments, procedural uncertainty, and algorithmic serendipity, an exploratory impulse that echoes the formative period of video art and is exemplified by works such as Shigeko Kubota’s Self-Portrait, from which this project draws inspiration. In that seminal work, Kubota records herself and manipulates the image, turning it into a site of modulation, disruption, and visual transformation, pushing the technological medium toward its limits to uncover new aesthetic and conceptual possibilities. Extending this tradition, Kubota’s analog circuitry is replaced here with AI image-generation models such as Stable Diffusion and Midjourney, carrying forward that sense of experimentation into contemporary neural network practices.

When submitting a figurative image to algorithmic abstraction within the framework of destruction-as-creation, aesthetic conventions of representation are reconfigured as the act of erasure becomes a process negotiated with a black-box computational system, rather than one enacted through direct intervention. This allows image degradation and procedural distortion to be mobilized as critical strategies, while enabling abstraction to emerge as a methodological lens through which to interrogate the underlying operations that structure data-driven image construction, ultimately opening a space to reflect on the shifting conditions of visibility in digital culture.

The structure of this discussion unfolds in four sections, each corresponding to a different dimension of the project: section one surveys historical precedents of destruction-as-creation, situating the project in the lineage of debates on materiality and indexicality; section two outlines the technical process behind image generation, highlighting how a dual-input method enables a more controlled transition from figuration to abstraction; section three focuses on the construction of the audiovisual composition, interpreting its formal and temporal choices as a contemporary resonance of avant-garde film practices; and section four theorizes abstraction as a form of invisibility, linking the visual strategies employed to wider reflections on digital visual culture and the opacity of algorithmic systems. The conclusion synthesizes these threads to articulate how the project situates itself within current discussions on the aesthetics and politics of machine-generated images.

Destruction-as-creation

The deliberate deconstruction of the image has long served as a means of reconfiguring artistic creation itself. By erasing, fragmenting, or subjecting the image to processes of degradation, artists challenge the conventions that define visual representation, not only in terms of aesthetics, but also in relation to authorship, temporality, and material integrity. These gestures do not signify an effacement, but rather assert an alternative mode of making that operates through subtraction, disruption, and the exposure of underlying structures. In this context, destruction emerges not as an endpoint but as a generative force that displaces the notion of artistic mastery and reframes the act of seeing as contingent, unstable, and shaped by conceptual tensions. Contingent, in the sense that meaning becomes dependent on specific conditions of reception—historical, material, or contextual—rather than residing in the image itself; unstable, because what is seen resists fixation: it fluctuates, collapses into ambiguity, or remains incomplete; and shaped by conceptual tensions, as it demands more than direct visual perception, calling for an interpretative effort directed at the implicit ideas, contradictions, and ambiguities embedded within the image.

A particularly emblematic instance of such practice can be found in Robert Rauschenberg’s Erased de Kooning Drawing, which originated from the artist’s inquiry into whether a drawing could be made using an eraser (Roberts 2013). The artwork consists of the barely visible traces of a drawing by Willem de Kooning, which Rauschenberg obtained with the artist’s consent and subsequently erased over the course of several weeks. This meticulous removal was not a spontaneous act, but a deliberate intervention with the purpose of challenging the values of Abstract Expressionism and its reverence for the artist’s gesture as a direct expression of identity and authenticity, both understood as reflections of subjectivity. Erasure operates here not only as a negation of the image but as a conceptual repositioning: by removing de Kooning’s marks through the repetitive, almost mechanical movements of an eraser, Rauschenberg does not inscribe his own signature, he interrupts the notion of authorship altogether (Foster et al. 2016). The resulting image—a barely perceptible residue on paper—empties the surface of the visible signs of drawing and instead foregrounds absence, withdrawal, and the displacement of meaning from visual presence to conceptual context. Taken together, these elements radically subvert traditional concepts of composition, originality, and the existential dimension attached to the creative act, positioning Erased de Kooning Drawing as a statement on the possibility of an image that emerges through subtraction rather than addition.

Wolf Vostell’s TV-Dé-coll/age series advances this subtractive logic within the domain of mass media, treating the television broadcast as volatile material to be disrupted, fragmented, and reassembled (Meigh-Andrews 2014). Vostell’s approach to the image was inseparable from his treatment of the material conditions of its transmission; he regarded the apparatus as extending beyond a carrier of content, functioning instead as a medium that could be dismantled, misused, or even destroyed. The screen, therefore, is not conceived as a passive surface but as a site of tension, rupture, and potential reconfiguration that lays bare both the technological matrix and the ideological underpinnings. This strategy, which he encapsulated under the term décollage, interrupts the dominant flow of information and dispels the illusion of continuity and neutrality intrinsic to television, prompting a critical interrogation of the medium, its modes of production and reception, and structures of power. From this standpoint, destruction does not oppose creation and is never purely iconoclastic; instead it is part of a broader attempt to destabilize institutionalized meanings, allowing agency to shift toward the viewer (Hanhardt 1992).

Stan Brakhage radicalizes the logic of destruction by focusing on the fragile emulsion of the celluloid surface. Through frame-by-frame interventions of scratching, scoring, hand-painting, and collage, he eradicates stable figuration not by removing images but by overlaying them with marks that sever ties to photographic indexicality (Sitney 2002). In the process, attention is diverted from projection-mediated illusion to the tangible matter of the strip itself: dust motes, brushstrokes, even organic fragments become luminous events once they pass through the projector’s beam and are experienced as torrents of flicker and chromatic burst. Ultimately, this disruption of the filmstrip seeks to emancipate an “untutored eye” (Brakhage 1963, 25), deliberately disabling cinema’s normal descriptive role. This allows the film to register how visual impressions arise in the mind before they are shaped by language or cinematic conventions, and thereby turns each work into a study of the act of seeing itself rather than a straightforward depiction of external reality. In this sensory terrain, interpretive agency shifts to the spectator, who must actively negotiate a personal, phenomenological encounter with light.

Echoing Rauschenberg’s speculative impulse yet redirecting it toward cinema through photography, Hiroshi Sugimoto began his series Theaters as an experiment premised on the possibility of distilling a complete cinematic narrative into a single exposure (Fried [2008] 2012). Working with a large-format camera, Sugimoto opens the shutter for the entire duration of a feature film, capturing every frame, every cut, and every narrative element. However, this accumulation obliterates the sequence of images that once carried the story, resulting in a single rectangle of incandescent light, while the architecture of the theater remains sharply rendered. The work reverses cinema’s frame-by-frame logic, enacting a radical temporal compression that fuses the discrete frames through which film produces movement and narrative coherence into a static abstraction that nullifies both. This total exposure replaces linear unfolding with a luminous screen—a blank field that resists interpretation and functions as a mute surface, foregrounding the conditions of projection rather than the content it once displayed. Sugimoto’s gesture converts this obliteration into a new visual form: by exhausting representation in a single, prolonged act of capture, he produces an image that is simultaneously an archive of duration and a monochrome abstraction, compelling viewers to interrogate the temporal and spatial foundations of the cinematic experience, and to reflect on the common ground that binds film to photography—namely, their shared reliance on time and light (Baschiera 2020).

Continuing within the photographic field, Thomas Ruff appears to extend the paradigm of layering reminiscent of Brakhage’s practice. The artist immerses himself in the internet’s endless image-stream to create his Substrates series, working from the premise that these digital residues have become detached from meaning and now function only as electronic stimuli, representing “visual nothingness” (Schellmann 2014, 90). Using hentai source files as raw material, Ruff performs successive superimpositions and digital manipulations until every figurative trace dissolves into chromatic blur. The original motifs are not erased; they are overwritten by these cumulative layers whose elements spread, dissolve, and leak into one another, resulting in prints—vast, immersive, and vivid color fields—that withhold referential certainty and confront the viewer. Substrates uses this additive procedure, overwhelming the picture plane rather than rupturing it, so that its saturated density redirects attention from depicted content to the underlying mechanics of photography itself. Here, abstraction is used to expose how internet images exist in perpetual flux, are continually reshaped by the shifting contexts they enter, and, as a result of such constant mutation, can no longer serve as reliable documents of truth (Gunti 2020).

Taken together, these artistic practices reveal destruction as a method of making that unsettles established visual grammars and exposes the material and procedural conditions of representation. Whether through erasure, degradation or accumulation, each gesture reorients the viewer’s relation to the image, opening up a space in which abstraction becomes a critical operation that foregrounds the technological, material, and cognitive frameworks through which visual meaning is produced and disrupted.

From prompt to composition

This section examines the strategies adopted to reconfigure a photographic portrait by subjecting it to a generative process that produces a series of algorithmically abstracted variations, which are subsequently assembled into an audiovisual composition. The creative pipeline relied on a dual-input setup, combining a text-based instruction with the source image, which was exposed to two distinct procedures, each aligned with the mode of operation of the models employed. On the one hand, it was used as a direct input, undergoing iterative transformation; on the other, it functioned as a visual reference, informing the system’s interpretation of the accompanying textual instruction. Such methods enable greater authorial intervention in formal and aesthetic construction, while also embracing the unpredictability inherent in AI systems, placing the artist in a position that oscillates between the ongoing process of prompt refinement and a critical evaluation of the generated outcomes in order to calibrate their alignment with the conceptual framework of the project. This feedback loop between intention, algorithmic translation, and aesthetic judgment unfolds within a context of dynamic collaboration, reframing the system as an active creative agent capable of producing unforeseen visual results through its probabilistic nature and intrinsic constraints. This challenges the conventional perception of the technology as a mere tool expected to reproduce established aesthetic paradigms.

These generative processes ultimately converge in the construction of an audiovisual composition, in which the created images are not simply sequenced, but articulated within a non-linear structure governed by formal and spatial relationships.

Text as instruction

The crafting of the text-based instruction involved a strategic articulation of language and concept, and was developed through an iterative, research-driven process embedded in the project’s theoretical framework. This approach was informed by the study presented in Amorim and Mendes da Silva (2025), which examines how speculative formulations using subjective terms can facilitate the generation of abstract visual outcomes. The selected vocabulary reflects Hito Steyerl’s (2016) proposition that “not seeing anything intelligible is the new normal,” and consists of expressions associated with the unseen, explicitly referenced in her essay A Sea of Data: Apophenia and Pattern (Mis-Recognition).

The composition of the prompt—abstract renderings of intercepted broadcasts, machinic perception, signal, noise, lines, color, patterns, electric charges, radio waves, light pulses, a sea of data, speculative photography—relies on terms that destabilize legibility, resisting the depiction of recognizable subjects. Eschewing the hierarchical structures commonly used to guide photorealistic image generation—which emphasize concrete nouns and clear stylistic directives to optimize visual coherence—it operates within a logic of concealment and transformation, privileging ambiguity and indeterminacy. Accordingly, the instruction opens with abstract renderings of intercepted broadcasts, not only to foreground its non-representational stance, but also to preempt mimetic accuracy. This initial segment is followed by expressions that suggest distortion—machinic perception, signal, noise—while subsequent entries introduce references that further complicate representation, such as electric charges, radio waves, and light pulses. Positioned midway, terms such as lines, color, and patterns introduce vocabulary historically associated with early abstraction in painting and avant-garde cinema, functioning as a mid-sequence modulation between the technical lexicon and the more speculative elements drawn from physics and electromagnetic phenomena. Although the model does not interpret the prompt as a coherent statement, responding instead to individual words and phrases based on how they relate to images contained in its training set, it nonetheless assigns greater weight to those that appear earlier in the sequence and progressively less to those placed toward the end. Operating under these conditions, when concrete terms—those that tend to guide the image toward more organized visual structures—are combined with more ambiguous expressions—those that lead to less predictable results, often producing diffuse textures, visual noise, or unresolved forms—they do not interact through semantic integration, but through statistical proximity. The result is a composite in which structured elements coexist with areas of fragmentation and visual unintelligibility, a possibility that could be attributed to the activation of diverse regions within the latent space and the expanded range of combinations it may afford.

The sequence culminates in the inclusion of speculative photography, chosen to preserve a conceptual link with the project’s photographic origin while refusing to confine the image to the conventions of photographic realism. Rather than imposing a stylistic constraint, this final expression evokes a visual imaginary no longer bound to indexical representation, signaling instead a generative orientation that can embrace processes of transformation and distortion, and is conceptually aligned with the unpredictable nature of the models’ algorithmic inference.
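
For concreteness, the complete instruction can be represented as a single ordered string, with the most heavily weighted expressions placed first; the sketch below is illustrative only, and the variable name is not part of the project’s tooling.

```python
# The text-based instruction, ordered so that earlier expressions
# receive greater weight during generation (see discussion above).
PROMPT = (
    "abstract renderings of intercepted broadcasts, "
    "machinic perception, signal, noise, "
    "lines, color, patterns, "
    "electric charges, radio waves, light pulses, "
    "a sea of data, speculative photography"
)
```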

Image as input

As defined by Qiao, Liu, and Chilton (2022), an initial image is “an image given to the model that initializes the generation from the chosen image instead of random noise”, which means that photographic material can be used as the point of departure for the generative process, replacing the standard noise-based initialization employed in text-only prompting. By anchoring generation in a pre-existing visual composition, this technique enhances precision and provides greater control over the resulting imagery, reducing interpretative ambiguities frequently observed when models rely exclusively on textual descriptions (Qiao, Liu, and Chilton 2022).

To implement this approach, Deforum Stable Diffusion (v0.7.1)—a modified version of Stable Diffusion—was employed for its ability to integrate photographic sources as conditioning inputs within the image-to-image method. This model allows for fine-tuned control over how closely the output adheres to the structure and content of the initial image, a process that requires adjusting specific parameters, most notably strength, which governs the degree of influence exerted by the original image and ranges from 0 to 1; a higher value constrains the model more tightly to the source, producing outputs that retain most of its visual features, whereas a lower value grants the system greater generative latitude, allowing it to depart more significantly from the input structure. Following several rounds of iterative testing, an intermediate value of 0.6 was found to offer the desired compromise, preserving key formal elements of the original portrait in the early frames, while setting the conditions for a progressive erosion of figuration, a necessary condition for the visual structure to evolve toward abstraction. This parameter setting was therefore not merely a technical adjustment, but part of a deliberate strategy to incrementally dismantle visibility, an approach that proved essential to the project’s exploration of abstraction.

Although the model includes a native animation function, the process deliberately refrained from using its automatic video rendering capabilities, focusing instead on generating and preserving the complete sequence of individual frames produced during the animation cycle. The system was therefore used not to create a final video, but rather as a tool to generate a progression of still images that could later be shaped through artistic decision-making, allowing for greater authorial control over the rhythm and visual direction, particularly through interventions in the temporal pacing that would not have been possible within the constraints of automated rendering.

The images were conceived with sequential progression as a guiding principle, making use of the model’s capacity to generate seamless transitions across iterations. Through parameter adjustments and refinement of the textual prompt, the generative process was directed toward producing outputs aligned with the intended temporal unfolding, ensuring formal coherence and consistency throughout the sequence. In the initial phase, smaller batches of images were generated and tested in order to identify the sequence that effectively reflected the desired visual trajectory, which was subsequently expanded into a longer composition comprising 960 frames. From the outset, the image dimensions were defined within the system parameters to correspond to a video-compatible aspect ratio, creating the necessary conditions for integration into a moving-image context.
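
Although the project relied on the Deforum Stable Diffusion notebook, the feedback loop described here can be approximated in Python with the diffusers library. In the minimal sketch below, the checkpoint, file paths, and resolution are illustrative assumptions, PROMPT is the constant sketched in the previous subsection, and the Deforum-style retention of 0.6 is applied as the complement of diffusers’ strength argument, which measures departure from the initial image rather than adherence to it.

```python
import os

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Illustrative stand-in for the Deforum image-to-image setup described above.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

RETENTION = 0.6          # Deforum-style strength: share of the previous frame preserved
NUM_FRAMES = 960         # length of the sequence reported above

os.makedirs("frames", exist_ok=True)
# The digitized portrait, resized to a video-compatible 16:9 frame (assumed dimensions).
frame = Image.open("portrait_scan.png").convert("RGB").resize((768, 432))

for i in range(NUM_FRAMES):
    frame.save(f"frames/{i:05d}.png")        # keep every still for later editing
    frame = pipe(
        prompt=PROMPT,                       # instruction defined in the previous sketch
        image=frame,                         # each output seeds the next iteration
        strength=1.0 - RETENTION,            # complement of the retention value
        guidance_scale=7.5,
        num_inference_steps=30,
    ).images[0]
```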

Image as reference

The second approach to image production in this project made use of the Midjourney model (v. 7.0), whose multimodal capacity, by allowing for the combination of image references with textual instructions, enables more precise control over the visual characteristics of the results, which, as Oppenlaender (2023) argues, exhibit greater nuance, complexity, and diversity, thereby aligning more closely with the intended artistic direction. In contrast with the previous strategy, this method employs a different operational logic, adding an image to the prompt not to initiate a generative sequence, but rather to establish a formal reference. This integration allows the model to analyze and interpret essential features of the input image—which may include compositional structure, color dynamics, textural qualities, and other expressive elements—and use them as a basis for stylistic orientation, allowing the generated outputs to echo the reference while remaining formally distinct, instead of reproducing or modifying it.

The process unfolded through the adjustment of specific parameters, with image weight serving as the primary control for calibrating the impact of the reference image on the generated results. Ranging from 0 to 3, this setting defines a continuum between expressive deviation and formal consistency, with lower values encouraging the model to abstract from the reference image, whereas higher values yield outputs that draw more heavily from its visual identity. Extensive experimentation revealed that generating non-representational images—particularly those devoid of human-like forms—depended on setting the image weight parameter to a notably low value, with 0.3 emerging as the upper threshold beyond which recognizable features would tend to re-emerge, thus compromising the project’s abstract aesthetic objectives.

Unlike Stable Diffusion, which is optimized for temporal progression and animation, Midjourney remains primarily oriented toward the production of still image sets, offering no integrated mechanism for temporal sequencing. Despite the absence of frame-by-frame continuity, the system allows for the creation of subtle variations within a single output, a feature that proved instrumental in producing visually consistent sets of images, which could later be combined into small sequences that support a sense of motion.

To reinforce the still-image logic explored through this model, a 4:5 aspect ratio was adopted, establishing a formal link with the original photographic self-portrait that served as a point of departure. In contrast to the video-oriented framing employed in the previous phase, this format choice marks a deliberate shift away from a time-based perspective, opening possibilities for a fragmented and layered visuality, reflective of the project’s exploratory stance toward the image as a space of deconstruction.
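
Midjourney is driven by prompt strings rather than scripted calls, so the corresponding step reduces to assembling the prompt itself. The sketch below reuses the PROMPT constant from the earlier sketch and sets the documented --iw (image weight) and --ar (aspect ratio) parameters to the values reported above; the reference-image URL is a placeholder.

```python
# Hypothetical assembly of the Midjourney (v7) prompt described above.
reference_url = "https://example.org/portrait_scan.png"   # placeholder for the uploaded reference
midjourney_prompt = f"{reference_url} {PROMPT} --iw 0.3 --ar 4:5 --v 7"
```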

Given the discontinuous nature of the generated outputs, this workflow required an extended curatorial stage, comparable—as Oppenlaender (2022) suggests—to the photographic editing process, in which artistic coherence is actively shaped through thorough selection and grouping of images. In addition to internal formal and expressive affinities, curatorial decisions also considered how each image might relate to the previously established sequence, laying the groundwork for visual connections to emerge.

Audiovisual composition

Lev Manovich’s (2001, 322) formulation of spatial montage, defined as the coexistence of “a number of images, potentially of different sizes and proportions” within a single frame, provides the structural foundation for this audiovisual composition, setting it apart from the conventions of linear editing by prioritizing simultaneity over succession. Occupying a fixed mid-screen position, the primary video sequence presents the progressive visual transformation of the portrait, while the still images, appearing either alone or in brief sequences, are strategically juxtaposed across the frame as they gradually emerge. These fragments are not positioned arbitrarily; their arrangement is guided by a set of compositional criteria—such as visual continuity, chromatic resonance, or pattern affinity—intended to establish formal correspondences with the central sequence, a strategy that encourages a spatial rather than temporal reading. Temporality, however, is not excluded but simply no longer structured by linear succession, emerging instead from the simultaneous coexistence of visual elements and thereby acquiring a more open and interpretative definition. This reconfiguration of time foregrounds movement as the primary means of articulating space in abstract animation, as Gascard (1983) suggests. Even in the absence of traditional narrative progression, it becomes possible to temporalize purely visual spatial relations, creating a dynamic formal field continually renegotiated through motion.
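
As an illustration only, a spatial montage of this kind could be assembled with a compositing library such as moviepy (the 1.x API is assumed); clip names, positions, and timings below are placeholders rather than the project’s actual edit.

```python
from moviepy.editor import ColorClip, CompositeVideoClip, ImageClip, VideoFileClip

# Black canvas for the full duration of the central sequence
# (960 stills at 8 images per second, roughly 120 seconds).
canvas = ColorClip(size=(1920, 1080), color=(0, 0, 0), duration=120)

# The progressively abstracted portrait, fixed at mid-screen.
central = VideoFileClip("central_sequence.mp4").set_position("center")

# A still fragment entering the frame at a placeholder time and position.
fragment = (
    ImageClip("midjourney/fragment_01.png")
    .set_start(24)
    .set_duration(8)
    .set_position((60, 120))
)

montage = CompositeVideoClip([canvas, central, fragment])
montage.write_videofile("spatial_montage.mp4", fps=24)
```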

The video sequence employs animation on threes, meaning that each image remains on screen for three frames, which results in a reduced framerate of eight images per second (Teh, Perumal, and Hamid 2023). Commonly used in the early days of animation—when each frame had to be manually crafted through labor-intensive and time-consuming processes—this technique resonates with the demands of producing AI-based animations, where short sequences require the synthesis of thousands of individual images. The iterative nature of prompting, which typically entails generating a high number of outputs to yield a result that meets the desired visual criteria, results in substantial computational expenditure and extended processing time, as well as significant curatorial labor. Beyond its pragmatic rationale, this choice is aesthetically motivated: adopting a low framerate exposes the discontinuity between frames, accentuating the constructed nature of motion and enhancing the abstract quality of the images. By rejecting cinematic fluidity, the images reveal themselves as shifting visual structures, rather than a seamless rendering.
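
Assuming the per-frame stills saved by the earlier loop and a standard ffmpeg installation, this pacing can be realized by reading the stills at eight images per second and duplicating each onto a 24 fps timeline, so that every image is held for three frames; file paths are illustrative.

```python
import subprocess

# Read one still every 1/8 s and write a 24 fps stream:
# each image is repeated three times ("animation on threes").
subprocess.run(
    [
        "ffmpeg",
        "-framerate", "8",            # input rate: 8 stills per second
        "-i", "frames/%05d.png",
        "-vf", "fps=24",              # output timeline at 24 fps
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "central_sequence.mp4",       # the central clip used in the montage sketch above
    ],
    check=True,
)
```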

The sound component plays a subtle yet significant role in the construction of the audiovisual composition, taking the form of a synthesized voice-over that punctuates the video in isolated segments, interspersed with silence. Though emotionally neutral and slightly robotic, the voice approximates natural speech patterns, producing a paradoxical presence that is simultaneously synthetic and evocative of human expression, despite being fundamentally disembodied. Its content—a reconfigured assemblage of phrases adapted from Akiko Busch’s How to Disappear: Notes on Invisibility in a Time of Transparency (2019)—forms a cohesive and seemingly original discourse that, rather than describing the visual transformations directly, reflects on the very processes of disappearance and deconstruction undergone by the image, articulating the theme of invisibility in a more abstract register. Like the AI-generated images, this synthetic voice resulted from a manual and curated process in which the excerpts were carefully selected and recomposed to resonate with the visual trajectory, thereby reinforcing the aesthetic and conceptual direction of the project, while establishing another layer in the compositional system.
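
The text does not name the speech synthesizer employed; purely as an illustration, a neutral synthetic voice-over segment could be rendered with an off-the-shelf engine such as pyttsx3, as in the sketch below, where the spoken text is a placeholder rather than one of the curated excerpts.

```python
import pyttsx3

# Placeholder line standing in for a curated excerpt adapted from Busch (2019).
excerpt = "A placeholder sentence standing in for one of the curated excerpts."

engine = pyttsx3.init()
engine.setProperty("rate", 150)   # slow, even delivery for a neutral, slightly robotic register
engine.save_to_file(excerpt, "voiceover_segment_01.wav")
engine.runAndWait()
```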

Abstraction as invisibility

Building on the visual strategies and compositional choices explored throughout the project, this section examines how abstraction can operate as a mode of invisibility. Rather than functioning as a purely formal exercise, it becomes a lens to understand the emergence of images through computational procedures that begin with formlessness and are shaped by statistical inference. In this context, invisibility emerges not only as an aesthetic condition, but also as an interpretative framework that allows for a deeper understanding of how the production, perception, and meaning of images are being reconfigured within contemporary visual practices.

Unlike traditional artistic abstraction, which often stems from an artist’s intentional reduction of form, AI generation emerges from the way systems translate and reconstruct visual information, since machine vision—trained on vast datasets from which it learns statistical patterns—operates by processing inputs as numerical representations, establishing an image construction regime fundamentally different from visual perception. This results in images whose genesis remains obscured, shaped by operations unfolding within an inaccessible space governed by mathematical models, probability distributions, and algorithmic decision-making.

Within the scope of this project—where the figurative elements of a conventional portrait are rendered invisible through AI mediation—abstraction does not function as a visual simplification, but marks an ontological turn; the image, rather than being gradually reduced to its most basic elements, revealing something essential about the nature of form, color, or composition, is progressively reinterpreted by a system that works according to principles that remain largely hidden from human perception. This opaque dimension is crucial to establish a connection between abstraction and invisibility, insofar as it configures a mode of visuality in which what is seen is always contingent upon what remains unseen. In this context, the increasing unreadability of visual forms reflects not only a process of aesthetic abstraction, but also the influence of algorithmic unpredictability, through which randomness and machinic logic are embedded into the generative process, informing its operations from within. As a result, the visual outputs cannot be attributed solely to human intervention, since they are shaped by this underlying computational architecture whose internal mechanisms resist full comprehension, rendering the conditions of image production inaccessible.

The idea of invisibility at work in this project operates on multiple levels: beyond the technological opacity of algorithmic systems, it also resonates with a broader critique of visual culture, particularly the saturation of digital imagery that absorbs individual presence into a continuous stream of mediated representations. In this “society of transparency”, as theorized by Byung-Chul Han (2014), visibility, once associated with empowerment, becomes a mechanism of control and commodification, under which subjects are incessantly exposed and made legible to systems that harvest data under the guise of openness and participation. This overexposure, however, does not guarantee presence or recognition; it produces, instead, a paradoxical form of invisibility, simultaneously hyper-visible and socially irrelevant, eclipsed by a relentless accumulation of images that flattens any trace of singularity. This tension finds expression in the dissolution of figuration presented in the video component of this project, which can be viewed as a metaphor for the effects of a system in which recognizability is eroded by excess and visibility is determined by algorithmic circulation.

Within this critical context, it is also relevant to consider Hito Steyerl’s How Not to Be Seen: A Fucking Didactic Educational .MOV File (2013), a video that frames invisibility both as an act of resistance and as a tactical response to the practices of surveillance, offering a reflection on how images and identities are controlled within contemporary digital infrastructures. By situating invisibility within a contemporary media landscape shaped by algorithmic governance, the audiovisual element of this project aligns with Steyerl’s critical framework and can also be understood as a strategy for avoiding representation in the digital sphere. From this perspective, invisibility is not simply about disappearance, but about navigating the conditions under which one appears—or is erased—within systems of visual control. This positioning exposes the logic of digital visibility, where appearing often entails submission to constraints and loss of control. Abstraction in this work thus becomes also a form of withdrawal, a way of suspending the demand to be seen, shifting attention from what is represented to what resists representation, and displacing the very terms on which visibility is granted.

Conclusion

This study set out to explore how a single analog studio portrait can be re-imagined through generative AI as an ever-eroding field of abstraction, and what that metamorphosis discloses about image-making in the algorithmic age. By bringing into dialogue historical precedents of destruction-as-creation with a contemporary, multimodal production workflow, the project demonstrates that the impulse to dismantle visibility now unfolds inside probabilistic latent spaces. In that migration from the tactile to the statistical, the act of erasure became a negotiated choreography between authorial intent and the black-box logic of machine inference. The portrait’s gradual dissolution thus operates simultaneously as a formal device, a conceptual metaphor, and an inquiry into how authorship, materiality, and the very ontological status of the image are being redrawn by AI systems.

The dual-input prompting strategies employed underscore the productive friction between control and contingency. Parameter tuning became a critical practice for pacing visibility, staging rupture, and sculpting the viewer’s path from figuration to abstraction. Likewise, the spatial montage architecture and low-frame-rate pacing situate the resulting video in a lineage of early experimental cinema, foregrounding the fact that motion is not automated, but carefully assembled from stills generated by AI. The synthesized voice-over adds a final recursive layer: language that narrates disappearance is itself produced by the same computational paradigm that renders the image invisible.

Framing abstraction as a mode of invisibility reveals two intertwined dimensions: first, the machinic opacity of generative models that turn every output into a partial view of an invisible process; second, a culturally embedded form derived both from the visual overexposure diagnosed by Han and from Steyerl’s conception of tactical withdrawal under surveillance capitalism. The project therefore argues that contemporary abstraction is not merely the stripping away of recognizable form but a deliberate redirection of attention, not toward the image itself, but toward how and why it appears.

By situating AI-driven abstraction within a century-long discourse on image deconstruction, this research develops three lines of argument: (1) generative workflows can be harnessed deliberately to choreograph disappearance rather than photographic fidelity; (2) creative parameter control and curatorial labor constitute a new form of artistic authorship, one that is procedural, iterative, and dialogic with the machines; and (3) abstraction today can function as both an aesthetic strategy and a critical methodology for interrogating the infrastructures of visibility that govern digital culture in the age of machine learning. Ultimately, the portrait’s passage from photographic index to statistical haze invites us to reconsider the image not as a window onto reality but as an interface that both reveals and conceals the algorithmic forces shaping contemporary visuality.

Endnotes

This work is supported by national funds through the FCT – Foundation for Science and Technology, I.P., within the scope of the project UIDP/04019/2020.

Bibliography

Baschiera, Stefano. 2020. “The Cinematic Dispositif and Its Ghost; Sugimoto’s Theaters.” In Theorizing Film through Contemporary Art. Expanding Cinema, edited by Jill Murphy and Laura Rascaroli, 195–212. Amsterdam: Amsterdam University Press. https://doi.org/10.1017/9789048542024.011.

Brakhage, Stan. 1963. Metaphors on Vision. Edited by P. Adams Sitney. New York: Film Culture Inc.

Busch, Akiko. 2019. How to Disappear: Notes on Invisibility in a Time of Transparency. New York: Penguin Press.

Foster, Hal, Rosalind E. Krauss, Yve-Alain Bois, Benjamin H. D. Buchloh, and David Joselit. 2016. Art since 1900: Modernism, Antimodernism, Postmodernism. 3rd ed. London: Thames & Hudson.

Fried, Michael. (2008) 2012. Why Photography Matters as Art as Never Before. New Haven: Yale University Press.

Gascard, Lorettann Devlin. 1983. “Motion Painting: ‘Abstract’ Animation as an Art Form.” Leonardo 16 (4): 293–97. https://doi.org/10.2307/1574955.

Gunti, Claus. 2020. Digital Image Systems: Photography and New Technologies at the Düsseldorf School. Bielefeld: transcript.

Han, Byung-Chul. 2014. A Sociedade da Transparência. Lisboa: Relógio D’Água.

Hanhardt, John G. 1992. “De-Collage and Television: Wolf Vostell in New York, 1963-64.” Visible Language 26 (1/2): 109–23.

Manovich, Lev. 2001. The Language of New Media. Cambridge, Massachusetts: The MIT Press.

Meigh-Andrews, Chris. 2014. A History of Video Art. 2nd ed. New York: Bloomsbury.

Oppenlaender, Jonas. 2022. “The Creativity of Text-To-Image Generation.” In 25th International Academic Mindtrek Conference, 192–202. https://doi.org/10.1145/3569219.3569352.

———. 2023. “A Taxonomy of Prompt Modifiers for Text-To-Image Generation.” Behaviour & Information Technology, November, 1–14. https://doi.org/10.1080/0144929x.2023.2286532.

Qiao, Han, Vivian Liu, and Lydia Chilton. 2022. “Initial Images: Using Image Prompts to Improve Subject Representation in Multimodal AI Generated Art.” In C&C ’22: Proceedings of the 14th Conference on Creativity and Cognition. ACM. https://doi.org/10.1145/3527927.3532792.

Roberts, Sarah. 2013. “Erased de Kooning Drawing.” San Francisco Museum of Modern Art. https://d1hhug17qm51in.cloudfront.net/www-media/2018/10/03215343/SFMOMA_RRP_Erased_de_Kooning_Drawing.pdf.

Schellmann, Jörg, ed. 2014. Thomas Ruff, Editions 1988-2014. Catalogue Raisonné. Ostfildern: Hatje Cantz.

Sitney, P. Adams. 2002. Visionary Film: The American Avant-Garde 1943-2000. New York: Oxford University Press.

Steyerl, Hito. 2013. “How Not to Be Seen: A Fucking Didactic Educational .MOV File.” Website Video. https://www.artforum.com/video/hito-steyerl-how-not-to-be-seen-a-fucking-didactic-educational-mov-file-2013-165845/.

———. 2016. “A Sea of Data: Apophenia and Pattern (Mis-Recognition).” E-Flux Journal, no. 72 (March): 1–12. https://www.e-flux.com/journal/72/60480/a-sea-of-data-apophenia-and-pattern-mis-recognition/.

Teh, Sharafina, Vimala Perumal, and Hushinaidi Abdul Hamid. 2023. “Investigating How Frame Rates in Different Styles of Animation Affect the Psychology of the Audience.” International Journal of Creative Multimedia 4 (2): 10–31. https://doi.org/10.33093/ijcm.2023.4.2.2.