Introduction

Organic synthesis assumes a pivotal role in the production of pharmaceuticals, and the conventional batch-wise approach has enjoyed extensive utilization for decades in both laboratory and industrial contexts. However, the intricate manual pathways for multi-step synthesis of active pharmaceutical ingredients (API) from basic precursors frequently manifest as overly complex [1]. This entails sequential batch-wise transport of intermediates, incorporating isolation and purification of the ultimate product. This established approach is not only inefficient but also time-intensive, consequently inflating costs during the initial stages of pharmaceutical development [2].

To enhance synthetic efficiency and overcome these challenges, flow synthesis was introduced into organic chemistry and has gained substantial traction in both industrial and laboratory settings since the 1900s [3,4,5,6,7]. Conventional flow synthesis, exemplified by prominent syntheses such as those of ibuprofen [8] and trinitrotoluene (TNT) [9], affords better control over reaction variables, including temperature, time, and reagent quantities, than the traditional batch approach. Nonetheless, it also raises specific control challenges involving reaction magnitude, residence time, thermal and pressure regulation, auxiliary reagents, intermediates, and scale-up, since any deviation in these parameters can affect the final product [10, 11].

Across the past decade, the pharmaceutical domain has witnessed an upsurge in the adoption of fully automated flow synthesis systems, particularly for macromolecular synthesis [12, 13]. This immensely efficient automated paradigm has substantially curtailed time requirements [14], operational intricacy, and susceptibility to chemical exposure and contamination [15]. Leveraging coding and algorithms for comprehensive control, the automated configuration has streamlined the synthesis process, thereby expediting reactions and simplifying the conveyance of intermediates. As a result, entirely automated flow chemistry has emerged as a safer and more efficacious alternative for pharmaceutical synthesis, spanning both laboratory and industrial arenas [16,17,18]. Nevertheless, it is pivotal to acknowledge that flow synthesis may still necessitate a larger quantity of reactants in comparison to manual synthesis [19].

This review is geared towards an exploration of the progression of flow synthesis and the assimilation of automated platforms within the pharmaceutical sector. Additionally, an in-depth analysis of the rapid strides in program engineering, which have facilitated the application of machine learning and high-throughput screening for reaction forecasting and enhancement, will be undertaken. Moreover, the pivotal role of in-line instruments as integral modules in flow systems will be discussed. These instruments enable real-time monitoring of reactions and meticulous oversight of synthesis advancement, particularly when optimization becomes imperative. Through this all-encompassing appraisal, our objective is to cast illumination upon recent advancements in automated flow chemistry for the synthesis of pharmaceutical compounds. This coverage will span diverse dimensions, encompassing biologics, small organic molecules, carbohydrates, and the utilization of machine learning for synthesis refinement and high-throughput screening.

Automated flow synthesis of bioactive macromolecular pharmaceuticals

Automated flow peptide and protein synthesis

Peptide drugs have constituted a significant category within the realm of pharmaceutical innovation, offering a myriad of global advantages [20, 21]. These advantages encompass the treatment of a spectrum of chronic ailments, including but not limited to diabetes [22,23,24], cancer [25,26,27], and hepatitis [28,29,30]. Peptide drugs, serving as therapeutic agents, present a diverse array of merits, notably encompassing heightened biological activity, pronounced specificity, and minimal toxicity [31]. These attributes, in turn, foster further explorations in this domain.

In the domain of chemical synthesis, the methodology for peptide synthesis originally revolved around the solution-phase paradigm [32]. Nonetheless, the isolation and purification procedures within the synthetic progression impede the efficiency and cost-effectiveness of peptide production. In 1962, Merrifield introduced the inaugural solid-phase peptide synthesis (SPPS) method for peptide assembly [33]. This methodology entailed the incremental addition of amino acids onto a solid resin particle that had been specifically modified with covalent bonding groups. This tailored modification facilitates the progressive growth of peptides on the solid support. The SPPS approach encompasses sequential phases of deprotection, coupling, washing, and cleavage. Among these solid-phase processes, the resultant peptides remain affixed to the resin, with the washing protocol designed to eliminate surplus reagents. Consequently, the need for intermediary steps of isolation and purification is obviated.

However, Merrifield's technique necessitated the utilization of t-butoxycarbonyl (Boc) protected amino acids, mandating an acid treatment (trifluoroacetic acid, TFA) at each deprotection juncture. This approach, however, was characterized by instability and forceful reactivity. Subsequently, two decades later, Andrews proffered an optimized course of action, substituting Boc with fluorenylmethoxycarbonyl (Fmoc) [34]. This protecting group can be cleaved under basic conditions (20% piperidine), thereby obviating the recurrent reliance on strong acidic reagents. Notably, this optimized pathway augmented the compatibility of SPPS with continuous flow systems, employing a polymerized dimethylacrylamide-based monomer mixture. This mixture facilitated rapid diffusion of reactants, devoid of pressure fluctuations.

Advancements continued, as in 2014, Simon and colleagues pioneered a rapid flow synthesis of peptides employing a semi-automated system [35] (Fig. 1a). This innovation drastically curtailed synthesis duration by eliminating the need for solution delivery, thereby markedly enhancing peptide synthesis efficiency. The refined approach featured a DMF-soluble polymer on the resin, combined with Fmoc protected amino acids. This amalgamation heightened their aptitude for seamless flow within a closed system. Consequently, the synthetic duration per residue plummeted from 15–30 min [36, 37] to a mere 3 min (Fig. 1b). This optimization was achieved by streamlining manual processes, including the introduction of pumps for delivering Fmoc protected amino acids into the reactor equipped with a heating module. While valve switching for reagent selection still mandated manual operation, this system laid the groundwork for the automation of synthesis methodologies.

Fig. 1

Semi-automated flow synthesis of peptides a The instrument flowchart for semi-automated synthesis of peptides b The technical roadmap for flow synthesis of peptides

Three years later, Mijalis and colleagues developed an automated flow peptide synthesis (AFPS) system based on solid-phase synthesis (SPS) [38] (Fig. 2a). The system boasted high yield, elevated purity, and minimal epimerization, particularly of cysteine and histidine. The AFPS design incorporated three reagent storage units, three HPLC pumps, three heating loops, a single reactor, a robotic arm for sample and resin manipulation, and an ultraviolet–visible (UV–vis) spectrophotometer. The UV–vis detector monitored the removal of the Fmoc protecting group, serving as an indicator of deprotection efficiency, coupling yield, and peptide aggregation.
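The in-line UV–vis readout lends itself to a simple per-cycle calculation. The following is a minimal sketch, not the AFPS control software: it integrates each Fmoc-removal absorbance peak with the trapezoidal rule and flags cycles whose peak area falls below a chosen fraction of the first cycle. The function names, the sample traces, and the 0.8 threshold are illustrative assumptions and do not come from the cited work.

```python
from typing import List, Sequence


def peak_area(times_s: Sequence[float], absorbance: Sequence[float]) -> float:
    """Integrate an absorbance trace over time with the trapezoidal rule."""
    if len(times_s) != len(absorbance):
        raise ValueError("times and absorbance must have the same length")
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        area += 0.5 * (absorbance[i] + absorbance[i - 1]) * dt
    return area


def flag_low_cycles(areas: Sequence[float], min_fraction: float = 0.8) -> List[int]:
    """Return 1-based cycle numbers whose area is below min_fraction of cycle 1."""
    reference = areas[0]
    return [n for n, a in enumerate(areas, start=1) if a < min_fraction * reference]


# Hypothetical traces: (time in s, absorbance in arbitrary units) for three cycles.
traces = [
    ([0, 1, 2, 3, 4], [0.0, 1.0, 2.0, 1.0, 0.0]),
    ([0, 1, 2, 3, 4], [0.0, 0.9, 1.9, 0.9, 0.0]),
    ([0, 1, 2, 3, 4], [0.0, 0.5, 1.0, 0.5, 0.0]),
]
areas = [peak_area(t, a) for t, a in traces]
print(areas)
print(flag_low_cycles(areas))  # the third hypothetical cycle is flagged
```

A falling peak area from one cycle to the next would suggest that less Fmoc was released, which is why the trace can stand in for coupling yield; a real implementation would also need baseline correction and peak-width analysis.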

Fig. 2

Fully automated flow synthesis of peptides a The instrument flowchart for AFPS system b The technical roadmap for peptide flow synthesis

Within the AFPS framework, the mixture of amino acid, the base N,N-diisopropylethylamine (DIEA), and the coupling agent O-(7-azabenzotriazol-1-yl)-N,N,N′,N′-tetramethyluronium hexafluorophosphate (HATU) is rapidly heated to 90 °C within the tubular reactor. The elevated temperature shortened the coupling time required to form an amide bond to a mere 7 s. Furthermore, the incorporation of an individual amino acid residue is accomplished in just 40 s, a substantial improvement over the previous 3 min (Fig. 2b). The system's versatility also extends to therapeutic development. Notably, it has been employed in the development of antimicrobial agents such as Histatin, Alarin, Amyloid β, Bradykinin, Catestatin, Neurotensin, and Neuropeptide Y, encompassing a total of 43 antimicrobials. The system has also been used in the exploration of tumor neoantigens [39], further underlining its significance in cutting-edge medical research.
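As a worked illustration of the per-residue times quoted in this section, the sketch below estimates the chain-assembly time for a 30-mer, a length chosen here purely for illustration. It assumes every residue takes the same time and ignores resin loading, cleavage, and purification, so the figures are rough estimates rather than reported results.

```python
def assembly_time_min(n_residues: int, seconds_per_residue: float) -> float:
    """Total chain-assembly time in minutes, assuming a constant cycle time."""
    return n_residues * seconds_per_residue / 60.0


# Per-residue cycle times quoted in the text, converted to seconds.
CYCLE_TIMES_S = {
    "conventional, 15 min per residue": 15 * 60,
    "conventional, 30 min per residue": 30 * 60,
    "semi-automated flow, 3 min per residue": 3 * 60,
    "AFPS, 40 s per residue": 40,
}

for label, seconds in CYCLE_TIMES_S.items():
    print(f"{label}: {assembly_time_min(30, seconds):.0f} min for a 30-mer")
```

Under these assumptions the same 30-mer drops from 450–900 min of assembly time to 90 min and then to 20 min.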

Although the AFPS system enables rapid synthesis of 30-mer peptides, the synthesis of peptides exceeding 50-mer remains problematic. The challenges arise predominantly from coupling and side reactions, including deletions, truncations, and aggregation, which have long been impediments. Even with advancements in native chemical ligation, practical synthesis protocols continue to be restricted by specific peptide segments. In 2020, the Pentelute group devised an optimized AFPS system with generalized parameters such as flow rate, reagent concentration, and coupling reagents, aiming to mitigate side reactions and enhance purity [40]. By changing the coupling reagent from HATU to a combination of HATU and (7-azabenzotriazol-1-yloxy)tripyrrolidinophosphonium hexafluorophosphate (PyAOP) [41], together with an increase in reagent concentration, the synthetic purity of glucagon-like peptide-1 (GLP-1) rose from 53% to 70%. To counteract the aspartimide formation promoted by elevated temperatures, adding formic acid to the piperidine deprotection solution, along with protecting the backbone with dimethoxybenzyl glycine, proved the most effective approach. The final optimization focused on retaining chirality by preventing the epimerization of amino acids such as cysteine (Cys) and histidine (His). Fmoc-Cys(Trt)-OH and Fmoc-His(Boc)-OH, reacted at 60 °C, yielded less than 2% D-epimer. Consequently, using the optimized AFPS method, nine proteins of up to 160-mer in size, each with distinct functionalities, could be synthesized rapidly, all within a span of 6.5 h. The procedure yielded proteins of elevated purity, minimal aspartimide formation, and negligible epimerization. Furthermore, the synthesized proteins, encompassing proinsulin, barstar, lysozyme, HIV-1 protease, and a total of 10 therapeutic peptides, preserved both their structural integrity and biological functionality, underscoring the success of the refined system.

Nonetheless, because side reactions are inevitable in SPPS of peptides exceeding 50-mer in length, longer sequences are typically synthesized using a convergent strategy. This approach, while effective, takes longer than the "single-shot" synthesis method. To address this problem, in 2023 Saebi and colleagues automated the synthesis of a 214-mer protein, the bacteriocin pyocin S2 (PyS2NTD) [42]. This endeavor achieved the longest "single-shot" synthesis using the AFPS approach and was completed in only 10 h. The achievement was predicated upon an optimized combination of reagent concentration (0.4 M), Fmoc deprotection additive (formic acid), and reaction temperature (90 °C), all of which preserved the biological functionality of the synthesized protein.

The automated SPPS platform represents a substantial advancement, significantly curtailing the synthetic timeline for peptides by obviating the need for labor-intensive manual interventions inherent in conventional strategies. With a notable leap in synthetic velocity, measured in orders of magnitude (from hours to seconds), these automated SPPS pathways are poised to catalyze the development of therapeutic peptide synthesis and the exploration of their inherent biological attributes.

Automated flow oligonucleotides synthesis

Nucleic acid therapy has been regarded as an important avenue owing to its capacity to modulate gene expression and achieve target specificity [43, 44]. Nucleic acid was identified in the mid-nineteenth century, and over the subsequent decades scientists continued to examine this compound extracted from white blood cells [45]. Due to its distinct biological properties, it finds utility in a broad spectrum of diseases attributed to genetic expression [46,47,48,49,50,51,52,53,54]. In recent decades, a specially modified type of nucleic acid, namely antisense oligonucleotides (ASOs), has been introduced into gene therapy, showcasing unique attributes such as the incorporation of bases that can naturally and rapidly target any genetic disease, and a modified backbone that protects against nuclease cleavage [55, 56]. With heightened stability and affinity, four phosphorodiamidate morpholino oligomers (PMOs), which feature a six-membered morpholino ring and a phosphorodiamidate-linked backbone, namely Eteplirsen [57], Golodirsen [58], Viltolarsen [59], and Casimersen [60], have secured approval for treating Duchenne muscular dystrophy (DMD). These PMOs represent the sole four approved targeted drugs for DMD, underscoring the significant role of PMOs within the pharmaceutical realm. Nevertheless, protracted synthesis obstructs the further advancement of therapeutic ASOs, because screening efforts require library production and sequence optimization. Hence, the pursuit of faster and more streamlined strategies becomes imperative for future progress.

Traditionally, the evolution of PMOs was hampered by the intricacies associated with synthesis optimization. Conventional synthesis methods necessitated 180 min for the coupling of a single PMO monomer, implying that weeks could be consumed for the synthesis of a 20-mer PMO. In 2021, Li and colleagues introduced a fully automated microscale flow synthesizer known as "Tiny Tides", comprising six modules: nitrogen-protected reagent containers, chemically inert valves, HPLC pumps, reaction vessels, UV–vis detectors, and computers equipped with the Mechwolf program [61, 62] (Fig. 3a). To enhance reaction performance, the synthesis pathway underwent optimization involving temperature (90 °C), deprotection reagent (3,5-lutidine TFA salt), neutralization reagent (N-ethylmaleimide), and coupling base (DIEA). This optimization led to the successful synthesis of a 4-mer PMO with a purity of 99%. Subsequently, following the same protocol, a more practical 18-mer PMO corresponding to the β-thalassemia gene sequence IVS2-654 12 was synthesized within a mere 3.5 h, attaining an 85% purity (Fig. 3b). Notably, three 20-mer PMO sequences targeting splice donor and acceptor sites for exon 46 skipping, a crucial splice alteration for DMD, were synthesized in just one day, contrasting with the previous month-long timeframe, and achieved a minimum purity of 85%. Consequently, with further validation via severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) studies, this system not only presented heightened synthetic efficiency but also underscored the proficiency of a high-temperature automated flow system for the demanding task of biopolymer synthesis.

Fig. 3

Fully automated flow synthesis of PMOs a The instrument flowchart for automated PMO synthesis b The technical roadmap for PMO flow synthesis

As a charge-neutral ASO class, peptide nucleic acids (PNAs) also capture the attention of scientists due to their enhanced stability resulting from the amide backbone, alongside high affinity and selectivity. However, PNAs exhibit poor cellular uptake and low solubility in clinical settings, which constrains their applicability in clinical trials. Consequently, the incorporation of a cell-penetrating peptide (CPP) into PNAs has been proposed, leading to the development of CPP-conjugated PNAs (PPNAs) that can facilitate enhanced cellular uptake. Nevertheless, conventional PNA synthesis is associated with potential side reactions such as aggregation, deletion, rearrangement, isomerization, and nucleobase addition. These reactions inevitably limit the yield and purity of the synthesized product. Furthermore, the elevated cost of PNA monomers, coupled with the sensitivity of these side reactions, further hampers the efficiency of therapeutic PNA screening and production.

In 2022, Li and colleagues devised a fully automated synthesis platform based on “Tiny Tides”, with minor modifications tailored specifically for expeditious PNA synthesis [63] (Fig. 4a). This platform achieved rapid synthesis of PNAs, requiring 10 s per amide bond and a 3-min cycle at 70 °C, in contrast to the previous 32-min cycle at room temperature (Fig. 4b). The resulting PNAs exhibited a purity of 95%. With this platform, a set of 8 long PPNAs (comprising more than 15 base pairs) was efficiently synthesized in a single day. These PPNAs were designed to target the transcription regulatory site (TRS) and translation start site (AUG) of the SARS-CoV-2 virus. The synthesized sequences exhibited purity levels of at least 90%. In inhibition tests, the 8 synthesized PPNAs reduced the viral titer of SARS-CoV-2 by 95%, showing the therapeutic potential of sequences synthesized using this platform.

Fig. 4

Rapid automated flow synthesis of PPNAs a The instrument flowchart for automated PPNA synthesis b The technical roadmap for PPNA flow synthesis

Based on the reported platform, Li and colleagues additionally demonstrated that in-line analysis can aid in training a machine learning (ML) model to predict the synthesis yield of PNA sequences [64] (Fig. 5). The system successfully facilitated the rapid generation of 239 PNA pre-chain and nucleotide combinations, encompassing therapeutic PNAs for genetic and viral diseases, cardiovascular disorders, and cancer, through the utilization of a tenfold ML model. With the assistance of this rapid automated system, the trained dataset provided suitable input for the machine learning algorithm, which accurately predicted PNA synthesis efficiency and sequences with a remarkable accuracy rate of 93%. This advancement significantly enhances the capacity for the rational design of PNAs.
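The phrase "tenfold ML model" is not defined in the text. If it refers to ten-fold cross-validation, which is an assumption made here and not a statement about the cited work, the data-splitting step could be sketched as follows. The function name is hypothetical, the sample count of 239 simply reuses the number of PNA combinations quoted above, and no model is trained in this sketch.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, validation) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives k folds whose sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


splits = k_fold_indices(239, k=10)
print([len(validation) for _, validation in splits])
```

Each sample is held out exactly once, so a model's reported accuracy can be averaged over the ten validation folds without ever scoring a sequence it was trained on.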

Fig. 5

Machine-learning assisted automated flow synthesis of PPNA

Hence, automated flow synthesis proved capable of offering a swift and cost-effective route for the production of therapeutic oligonucleotide sequences. Moreover, the integration of rapid synthesis with a machine learning model-trained dataset enabled precise predictions regarding synthetic yields. Such predictions hold the potential to expedite the subsequent advancement of oligonucleotide drugs.

Automated flow polysaccharides synthesis

Polysaccharides, as the most abundant biopolymers on Earth, serve a multitude of structural and modulatory roles [65]. They find application in addressing chronic conditions such as diabetes [66] and tumors [67]. Unlike peptides and nucleic acids, polysaccharide monomers establish connections in a nonlinear manner, yielding limitless compositional variations [68]. However, the intricate nature of this architecture often renders the complete synthesis of polysaccharides more intricate compared to oligonucleotides and peptides. The distinct structure of carbohydrate monomers necessitates targeted protection of multiple -OH groups throughout the polymerization process. This challenge has stimulated the advancement of polysaccharide synthesis through flow chemistry techniques.

Since the inception of solid-phase synthesis for oligosaccharides in 1971 [69], researchers gained the capacity to construct more intricate carbohydrates, including N-acylated di- and trisaccharides. A diverse array of monomers could be harnessed on a solid support. Nevertheless, the protracted and labor-intensive synthesis duration continued to underscore the necessity for refining oligosaccharide synthesis methods. In 2001, the Seeberger group documented an automated solid-phase approach for polysaccharide synthesis, drawing inspiration from automated peptide synthesis principles [70] (Fig. 6a). An olefin linker-modified resin, which maintained stability over numerous deprotection and coupling cycles, was employed. Through solid-phase automated synthesis, they managed to condense the total synthesis timeline from 14 days to a mere 20 h, accompanied by an almost five-fold enhancement in overall yield for heptamannoside (Fig. 6b).

Fig. 6

Automated solid-phase synthesis of oligosaccharides a Applied Biosystems Inc. Model 433A peptide synthesizer b The technical roadmap for automated oligosaccharides flow synthesis

However, in the realm of oligosaccharides, sequence length remained confined to a maximum of 92-mer. In 2020, an automated glycan assembly (AGA) system was introduced by Joseph et al. They applied this system to the total synthesis of a 100-mer polymannoside with differentially protected monomer blocks, as well as a 151-mer polymannoside achieved through convergent [31 + 30 + 30 + 30 + 30] block coupling [65]. A photocleavable linker based on polystyrene Merrifield resin was employed. Each monomer underwent a four-step cycle (acid wash, glycosylation, capping, cleavage). The total synthesis of the 100-mer required 188 h and gave a 5% yield. The synthesis of the 151-mer involved a branched 31-mer polymannoside, which took 50 h and was obtained in 12% yield. Additionally, the synthesis included 4 units of 30-mer polymannosides, each requiring 45 h and yielding 30%. Ultimately, a fully deprotected 151-mer polymannoside was successfully synthesized with a 78% isolated yield, considering the branched 31-mer polymannoside as the acceptor.
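To put the overall yields in perspective, the following sketch back-calculates an average per-cycle yield from the 5% overall yield quoted for the 100-mer and checks the block arithmetic for the 151-mer. It assumes, purely for illustration, that the loss is spread evenly over 100 monomer cycles and that cleavage and purification losses are folded into that average; the function name is hypothetical.

```python
def average_step_yield(overall_yield: float, n_steps: int) -> float:
    """Geometric-mean yield per step that reproduces the overall yield."""
    if not 0.0 < overall_yield <= 1.0:
        raise ValueError("overall_yield must be in (0, 1]")
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return overall_yield ** (1.0 / n_steps)


# Block sizes from the convergent [31 + 30 + 30 + 30 + 30] coupling.
blocks = [31, 30, 30, 30, 30]
print(sum(blocks))

# 5% overall yield over an assumed 100 monomer cycles.
per_cycle = average_step_yield(0.05, 100)
print(f"{per_cycle:.1%} average yield per cycle")
```

Under these assumptions a 5% overall yield still corresponds to roughly 97% per cycle, which illustrates why even small per-step losses limit the length reachable by purely linear assembly.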

Beyond the assembly of oligosaccharides themselves, the manual synthesis of protected monosaccharides often consumed substantial time, impeding oligosaccharide development. This was particularly true for special monosaccharides frequently utilized in bioactive natural products and microbial agents. Addressing this, in 2021, Yalamanchili et al. devised an automated continuous flow synthesis method for protected 2,6-dideoxy and 3-amino-2,3,6-trideoxy monosaccharides [71] (Fig. 7b). They efficiently synthesized deoxy-sugar donors (in 74–131.5 min, instead of several days) such as L-rhamnose, L-olivose hemiacetal 18, L-olivose thioglycoside 19, L-oliose 20, L-digitoxose 21, L-boivinose 22, L-ristosamine 23, and L-megosamine 24 (Fig. 7c). This was achieved with relatively high yields (15%–30%) using a Python-controlled system akin to Mechwolf, operating within a specially modified flow system.

Fig. 7

Automated synthesis of monosaccharide a The instrument flowchart for automated monosaccharides synthesis b The technical roadmap for L-olivose flow synthesis c Synthesized monosaccharides

Although synthetic methods for oligosaccharides have evolved through solid-phase synthesis, the demand for a substantial quantity of building blocks remains a limitation. Subsequently, the following year, Yao et al. from Peking University introduced a swift solution-phase synthesis method for oligosaccharides. This approach encompassed a universally applicable and highly efficient automated solution-phase synthesizer (Fig. 8a), coupled with a pre-activation one-pot multicomponent and continuous multiplicative synthesis strategy [72]. In this progression, the automated multiplicative synthesis (AMS) protocol efficiently synthesized a 1080-mer arabinan within five multiplying amplification steps (1 × 6 × 5 × 4 × 3 × 3 = 1080), achieving a 33% isolated yield from monosaccharides (Fig. 8b).
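The multiplicative arithmetic can be made explicit with a short sketch. It takes the five multiplication factors quoted above (6, 5, 4, 3, 3) and prints the chain length after each amplification step; the function name is hypothetical and the code models only the length bookkeeping, not the chemistry.

```python
from itertools import accumulate
from operator import mul
from typing import List, Sequence


def chain_lengths(factors: Sequence[int]) -> List[int]:
    """Chain length after each multiplicative step, starting from a monomer."""
    return list(accumulate(factors, mul))


factors = [6, 5, 4, 3, 3]
print(chain_lengths(factors))  # [6, 30, 120, 360, 1080]
```

Five multiplicative steps reach a length that a monomer-by-monomer linear assembly would need on the order of a thousand coupling cycles to match, which is the core advantage of the multiplicative strategy.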

Fig. 8

Fully automated solution-phase synthesis of polysaccharides a The instrument flowchart for automated solution-phase polysaccharides synthesis b The technical roadmap for 1080-mer polysaccharides flow synthesis

In both solid-phase and solution-phase synthesis, automated flow pathways can offer a higher mass transfer rate, precise temperature control, and eliminate the need for manual interventions through program-based automation, thus typically reducing synthesis time. With accelerated synthesis, scientists can readily refine pathways for improved yield and purity, thereby enhancing synthetic efficiency. The automated flow system incorporates a central control computer accessible even to inexperienced operators. Additionally, the in-line analysis module facilitates real-time process monitoring, further augmenting pathway optimization. Ultimately, rapid synthesis also aids in generating datasets for predicting therapeutic compounds, thus advancing the future of drug discovery.

Automated flow synthesis of small-molecule APIs or drug-like compounds

In the realm of pharmaceutical advancements, small molecules such as analgesics [73], antibiotics [74], and treatments for tumors hold paramount importance. However, in contrast to polymers that are typically crafted through repetitive processes involving similar functional units, the methodologies for synthesizing small molecules are often directed towards specific targets. This approach yields a multitude of byproducts, fostering diversity in production but also presenting challenges in devising a universal synthesis method.

In 2019, Steiner and colleagues introduced an automated robotic system named "Chemputer" [75] (Fig. 9a). This innovation relied upon four fundamental synthesis protocols: reactions, workup, isolation, and purification. A standardized reporting format was employed, supported by a chemical programming language. Control over the system was exercised through the utilization of "Chempiler", a distinct low-level instruction program. This program combined the open-source GraphML with the chemical assembly language known as ChASM. The synthesis scheme would be articulated using the chemical descriptive language (χDL), subsequently translated into a ChASM file tailored to the specific protocol. In addition to the software aspect, the system incorporated a flow-based architecture encompassing four modules: a reaction flask, a temperature-regulated jacketed filtration system, an automated liquid–liquid separation module, and a solvent evaporation module. This comprehensive platform demonstrated its competence by autonomously synthesizing three Active Pharmaceutical Ingredients (APIs): sildenafil 31, rufinamide 32, and nytol 33. Yields and purities of products and intermediates achieved through this automated process were comparable to or better than those achieved manually (Table 1).
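The layering described above, in which a high-level synthesis description is translated into low-level hardware instructions, can be illustrated with a toy example. This is a sketch of the general idea only: the `Step` class, the action names, and the instruction mnemonics are all invented for illustration and are not χDL, ChASM, or Chempiler syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Instruction = Tuple[str, ...]


@dataclass(frozen=True)
class Step:
    """One high-level step of a procedure (hypothetical format)."""
    action: str
    params: Dict[str, str] = field(default_factory=dict)


def lower(step: Step) -> List[Instruction]:
    """Translate one high-level step into invented low-level instructions."""
    p = step.params
    if step.action == "add":
        return [("MOVE", p["reagent"], p["vessel"], p["volume_mL"])]
    if step.action == "stir":
        return [
            ("START_STIR", p["vessel"]),
            ("WAIT", p["time_s"]),
            ("STOP_STIR", p["vessel"]),
        ]
    if step.action == "filter":
        return [("MOVE", p["vessel"], "filter", "all"), ("APPLY_VACUUM", "filter")]
    raise ValueError(f"unknown action: {step.action}")


def compile_procedure(steps: List[Step]) -> List[Instruction]:
    """Flatten a list of high-level steps into one instruction list."""
    return [instruction for step in steps for instruction in lower(step)]


procedure = [
    Step("add", {"reagent": "reagent_A", "vessel": "reactor", "volume_mL": "10"}),
    Step("stir", {"vessel": "reactor", "time_s": "600"}),
    Step("filter", {"vessel": "reactor"}),
]
for instruction in compile_procedure(procedure):
    print(instruction)
```

Keeping the chemist-facing description separate from the hardware-facing instructions is what allows the same written procedure to be re-run on a differently wired platform.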

Fig. 9

Automated flow synthesis of small-molecule APIs a The instrument flowchart for automated organic synthesis b The technical roadmap for Sildenafil flow synthesis

Table 1 Comparative Analysis of Yield and Synthesis Time in Manual vs. Automated Synthesis [76, 77]

In 2022, Rohrbach et al. established an open database containing a diverse array of experiments within an optimized Chemputer consisting of seven modules: a reaction module, a separator, a conductivity sensor, a jacketed filter, several reagent flasks, a rotary evaporator, and a chromatography system [78] (Fig. 10a). Initially, the database comprised 103 compounds with pathways denoted in χDL codes. Among these, 53 compounds were physically validated on the automated platform, with yields and purities consistent with those described in the literature, thus confirming the enhancement in synthesis throughput. Notably, the pharmaceutical compound atropine, utilized as an anticholinergic medication for nerve agent poisoning treatment, was successfully synthesized without human intervention, serving as a demonstration of the system's drug development capabilities.

Fig. 10

Automated flow synthesis based on “Chemputer” a The instrument flowchart for automated organic synthesis b Ten validated representative reactions

Although the automated system and its control program have reached an advanced stage of development, part of the process is still configured manually by expert chemists. In 2019, Coley and colleagues introduced an artificial intelligence (AI)-based platform proficient in designing and retrosynthesizing target compounds [79]. In this system, chemical recipe files (CRFs) were translated by chemists, incorporating supplementary parameters like residence time, concentrations, and equivalents. Employing CRFs, the robotic platform constructed an entire closed flow system crafted from temperature- and pressure-resistant materials, selecting the necessary reagents and conditions (e.g., temperature, pressure) (Fig. 11a). As illustrated by Jamison and Jensen, the platform effectively automated the synthesis of 15 APIs or drug-like compounds, encompassing aspirin 34, secnidazole 35, lidocaine 36, diazepam 37, a pair of chiral drugs 38–39, a set of 5 angiotensin-converting enzyme (ACE) inhibitors 40–44, and a cluster of 4 nonsteroidal anti-inflammatory drugs (NSAIDs) 45–48 (Fig. 11b). This was achieved through 8 specific retrosynthetic routes and 9 distinct process configurations.
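Parameters such as residence time, concentrations, and equivalents translate directly into pump settings. As a minimal sketch, assuming an idealized two-stream reactor with no volume change on mixing, the total flow rate follows from Q = V/τ, and the split between the two streams follows from the molar ratio c_B·Q_B = equiv·c_A·Q_A. The class and field names below are hypothetical and are not the actual CRF schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TwoStreamStep:
    """Hypothetical recipe entry for one reactor fed by streams A and B."""
    reactor_volume_mL: float
    residence_time_min: float
    conc_a_M: float
    conc_b_M: float
    equiv_b: float  # equivalents of B relative to A


def flow_rates(step: TwoStreamStep) -> Tuple[float, float]:
    """Return (Q_A, Q_B) in mL/min for the requested residence time."""
    total = step.reactor_volume_mL / step.residence_time_min  # Q = V / tau
    ratio = step.equiv_b * step.conc_a_M / step.conc_b_M      # Q_B / Q_A
    q_a = total / (1.0 + ratio)
    q_b = total - q_a
    return q_a, q_b


# Illustrative values only: 10 mL reactor, 5 min residence time,
# 0.5 M substrate, 1.0 M reagent, 1.2 equivalents of reagent.
step = TwoStreamStep(10.0, 5.0, conc_a_M=0.5, conc_b_M=1.0, equiv_b=1.2)
q_a, q_b = flow_rates(step)
print(f"Q_A = {q_a:.2f} mL/min, Q_B = {q_b:.2f} mL/min")
```

With these illustrative inputs the pumps would run at 1.25 and 0.75 mL/min, which delivers 0.625 and 0.75 mmol/min respectively, i.e. the requested 1.2 equivalents.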

Fig. 11

AI-driven automated flow synthesis. CRF: chemical recipe file a The instrument flowchart for computer-aided automated organic synthesis b Synthesized APIs

Advancements in machine-assisted technology led to the development of computer-aided synthesis planning (CASP) tools, which combine curated human reactions with algorithm-based learning from published literature, reducing the necessity for manual research and configuration. Nonetheless, due to limitations in available data, CASP faces challenges in specifying conditions such as concentration and temperature without human intervention. In 2022, Nambiar et al. outlined a CASP-proposed synthesis pathway, integrated with an automated flow synthesis platform featuring process analytical technology (PAT) [80]. This was exemplified through the synthesis of the API sonidegib, an antineoplastic agent used to treat locally advanced recurrent basal cell carcinoma (BCC). The Bayesian optimization algorithm successfully generated predictive mathematical process models encompassing various objectives (yield, productivity, cost) and optimal settings for categories (reagents), as well as continuous reaction conditions (temperature, time, stoichiometry). Additional support was derived from in-line analytics including FT-IR and LC–MS. Consequently, this platform demonstrated the successful application of machine assistance in initial formulation, experimental execution, and data collection.

In 2020, Collins and colleagues developed a fully automated chemical synthesizer named “AutoSyn”, designed for the synthesis of a wide range of organic small molecules utilizing standard reactions [81]. The system comprised the Cityscape flow synthesis platform, a reagent delivery system, an analytical platform, and a control system. The synthesis platform was constructed with a base plate consisting of a substrate layer and a manifold layer for the controlled flow of components. Surface-mount components incorporated various compartments for chemical reactions, including reactors, separators, back pressure regulators, valves, and sensors (Fig. 12a). Through the application of “AutoSyn”, ten FDA-approved drugs were successfully synthesized: imatinib, diphenhydramine 49, ibuprofen 50, warfarin 51, nevirapine 52, tramadol 53, fluconazole 54, diazepam 55, hydroxychloroquine 56, and tranexamic acid 57 (Fig. 12b). The AutoSyn platform offers thousands of synthesis pathways, enabling the synthesis of milligram to even gram quantities of various small molecule drugs within a matter of hours. It can effectively replicate reactions between nearly all types of flow synthesis.

Fig. 12

Automated flow synthesis based on “AutoSyn”. (a) Instrument flowchart for automated chemical synthesis. (b) Synthesized APIs.

Although flow chemistry has significantly improved efficiency and synthetic accessibility in small-molecule production, certain challenges persist. These include incompatibility of reagents and solvents between steps, by-product formation, and potential clogging. To address these issues, Wu et al. introduced a fully automated solid-phase system in 2021 for the synthesis of the registered anti-tumor drug prexasertib [82]. The platform comprises a high-pressure pump, a peristaltic pump, four multiway selection valves, a stainless-steel column reactor, a digital heating plate, and three back-pressure regulators (Fig. 13a). For system control, a computer-based control and regulation framework (CRF) was established in three stages: solution-batch synthesis, solid-phase synthesis (SPS) in batch, and automated SPS-flow synthesis. Using this approach, prexasertib 58 was synthesized in 65% yield within 32 h. Although the overall yield is only modestly higher than that of the previous method (57%), no intermediate purification was required, which demonstrates the viability of the automated solid-phase system. Moreover, the SPS-flow-based CRF enabled the synthesis of diverse molecules, producing a library of 23 prexasertib analogues with only a single-step modification of the six-step CRF (Fig. 13b). This underscores the utility of the automated solid-phase system not only for developing drugs that share a common core structure but also for synthesis more broadly.

Fig. 13

Fully automated solid-phase synthesis of prexasertib. (a) Instrument flowchart for automated solid-phase organic synthesis. (b) Technical roadmap for prexasertib flow synthesis.

In summary, although the non-iterative structure of small molecules and the need for manual optimization can lengthen the synthesis process, an algorithm-based program supporting an automated flow system offers notable advantages. These include the ability to optimize synthetic pathways autonomously through learned models and to select and assemble the appropriate system and reagents for a given target molecule automatically. In addition, solid-phase synthesis (SPS) has further streamlined small-molecule synthesis by replacing isolation and purification with a straightforward washing protocol. These successful outcomes underscore the efficacy of SPS methods for synthesizing small-molecule active pharmaceutical ingredients (APIs), particularly those sharing a core structure.

In-line analysis assisted automated synthesis and screening

In a fully automated flow system, closed-flow synthesis modules help mitigate risks and reduce time-related costs throughout the entire process. In-line analysis is a pivotal component of such a continuous flow system because it eliminates manual intervention. In-line instruments help establish a closed system, minimizing the contamination and yield loss that often accompany manual operations. Within the automated framework, analytical instruments remain under the control of a programmed interface, which allows scientists to monitor and fine-tune reaction parameters during a run, an option unavailable in manual operations. This real-time visibility in turn facilitates reaction optimization when outcomes are unsatisfactory.

In flow synthesis, high-performance liquid chromatography (HPLC) is widely adopted as an in-line instrument for distinguishing residual reactants from products. In the system introduced by Seeberger in 2020, normal-phase HPLC was employed for purification and reverse-phase HPLC for quantification. In the automated synthesis of peptides and PNAs, ultraviolet (UV) spectroscopy is frequently used as an analytical tool because of the pronounced absorption of the protecting group; the system tracks synthesis progress by monitoring the absorption spectrum. Certain platforms also incorporate nuclear magnetic resonance (NMR) as an in-line analytical tracer to enhance precision and accuracy [83]. To further improve analytical precision, efficiency, and versatility, scientists also use other instruments such as ultra-performance liquid chromatography (UPLC) [84]. These highly sensitive and accurate instruments reduce misinterpretation and ambiguity in reaction data. However, the gains in accuracy and sensitivity come with increased maintenance, expense, and pre-processing steps, which add to the complexity of the overall system.
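As an illustration of how a UV trace can be used to track synthesis progress, the sketch below integrates the absorbance peak recorded during each deprotection step and reports each cycle's peak area relative to the first cycle. It assumes that peak area is proportional to the amount of protecting group released; the traces, the sampling interval, the 0.9 warning threshold, and the function names are hypothetical and are not taken from any of the cited platforms.

```python
def peak_area(absorbance, dt_s, baseline=0.0):
    """Trapezoidal integral of a baseline-corrected absorbance trace (AU*s)."""
    corrected = [max(a - baseline, 0.0) for a in absorbance]
    return sum((corrected[i] + corrected[i + 1]) * 0.5 * dt_s
               for i in range(len(corrected) - 1))


def relative_deprotection(traces, dt_s):
    """Area of each cycle's deprotection peak divided by the first cycle's area."""
    areas = [peak_area(t, dt_s) for t in traces]
    if not areas or areas[0] <= 0.0:
        raise ValueError("first cycle must have a positive peak area")
    return [a / areas[0] for a in areas]


if __name__ == "__main__":
    # Invented traces sampled every 1 s; the third cycle shows a smaller peak.
    traces = [
        [0.0, 0.5, 1.0, 0.5, 0.0],
        [0.0, 0.5, 1.0, 0.5, 0.0],
        [0.0, 0.4, 0.8, 0.4, 0.0],
    ]
    for cycle, ratio in enumerate(relative_deprotection(traces, dt_s=1.0), start=1):
        # Hypothetical threshold: flag cycles whose peak drops below 90% of cycle 1.
        flag = "" if ratio >= 0.9 else "  <- possible incomplete coupling"
        print(f"cycle {cycle}: {ratio:.2f}{flag}")
```

A falling ratio from one cycle to the next suggests that less protecting group was available to remove, which an automated platform could use as a trigger to repeat or extend a coupling step.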

Although certain instruments are too sensitive to function as in-line components, some devices designed for particular reaction types have been integrated into automated systems. In 2020, Mo and colleagues engineered an automated system for high-throughput electroorganic chemistry [85]. They introduced a microfluidic platform strategy to address electrode-related influences. Given the specific requirements for analyzing electroorganic reactions, cyclic voltammetry (CV) was employed for kinetic measurements alongside in-line liquid chromatography–mass spectrometry (LC–MS). However, CV requires a semi-infinite stationary solution, whereas an automated system maintains a continuous liquid flow. To adapt in-line CV to the system, Jensen's group introduced a micro-scale pocket positioned ahead of the electrodes to meet the minimal analytical prerequisites. This configuration integrates a precision instrument into the flow system, effectively bridging the gap between instruments designed for static environments and those tailored for continuous flow setups (Fig. 14) [86].

Fig. 14

Summary of commonly used in-line analysis technology. Abbreviations: UV–Vis, Ultraviolet–visible spectroscopy; IR, Infrared spectroscopy; NMR, Nuclear magnetic resonance spectroscopy

With increasing demand for targeted disease treatments, escalating costs and extended timelines have become significant hurdles in drug discovery. The need for advanced technology in pharmaceutical research and development, particularly high-throughput screening of extensive libraries, has therefore been on the rise. In 2018, Goodman and colleagues introduced an automated high-throughput system for generating libraries of macromolecules and protein-loaded nanoparticles [87]. This robot-based system comprises three pumps, two synthesis racks, a sonicator, a stainless-steel nozzle, and two fixed plastic nozzles connected to a computer, and its parameters can be adjusted through the user interface. The system synthesized 24 copolymers in 144 h (about 6 h per copolymer on average), whereas traditional polymer synthesis takes almost a whole day for a single copolymer. Building on this, in 2021, Gao and colleagues developed an automated system for the on-the-fly, nanomole-scale synthesis of a large unpurified library [88]. The system aimed to rapidly discover new binders to the menin protein. The setup included a 384-well destination/reaction plate, a mass spectrometer (MS), a differential scanning fluorimetry (DSF) unit, a resynthesis module, a microscale thermophoresis (MST) setup, and a co-crystal structure analysis component. The library consisted of 1536 drug-like heterocycles based on the Groebke–Blackburn–Bienaymé three-component reaction (GBB-3CR). Hits identified by DSF/TSA (thermal shift assay) were subsequently resynthesized and purified to evaluate their affinity for menin. Characterization classified the library into 323 high-yield, 281 medium-yield, and 932 no-yield compounds, all screened within 24 h. Consequently, the use of microscale reactions (0.5 mmol) in this system contributes to both environmental and economic advancement.
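The library figures quoted above can be checked with a few lines of arithmetic. The short sketch below uses only the counts and the 24 h screening window reported in the text; the variable names are illustrative.

```python
# Counts reported in the text for the GBB-3CR library screen.
outcomes = {"high": 323, "medium": 281, "none": 932}
library_size = 1536
screen_hours = 24

total = sum(outcomes.values())
assert total == library_size  # the three classes account for every compound

for label, count in outcomes.items():
    print(f"{label:>6}: {count:4d} ({100 * count / library_size:.1f}%)")

# Fraction of the library in which product was formed (high or medium yield).
success_fraction = (outcomes["high"] + outcomes["medium"]) / library_size
print(f"reactions giving product: {100 * success_fraction:.1f}%")
print(f"throughput: {library_size / screen_hours:.0f} compounds per hour")
```

The three classes sum to the full library of 1536 compounds, roughly 39% of the reactions gave product, and the screen corresponds to 64 compounds per hour.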

In the pharmaceutical field, the palladium-catalyzed Suzuki–Miyaura reaction stands out as the most commonly employed catalytic pathway for carbon–carbon bond formation. However, the pre-catalytic process involving palladium leads to a reaction duration of 8 h. To shorten the reaction cycle, Christensen and colleagues devised an automated kinetic profiling approach for complex catalytic systems [89]. The setup consists of a Chemspeed Swing platform for automated synthesis [90], coupled with a UPLC-MS system for sample analysis. With this system, the reaction time was reduced from 400 to 80 min.
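As a rough illustration of what kinetic profiling extracts from time-course samples, the sketch below fits an apparent first-order rate constant to conversion data and estimates the time needed to reach a target conversion. The first-order model, the function names, and the sample data are assumptions made for illustration only; the catalytic system studied in [89] is described as complex and need not follow this simple rate law.

```python
import math


def fit_first_order(times_min, conversions):
    """Least-squares slope through the origin of -ln(1 - X) versus t.

    For first-order consumption of the substrate, -ln(1 - X) = k * t.
    """
    num = 0.0
    den = 0.0
    for t, x in zip(times_min, conversions):
        if not 0.0 <= x < 1.0:
            raise ValueError("conversion must be in [0, 1)")
        y = -math.log(1.0 - x)
        num += t * y
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one sample at t > 0")
    return num / den  # apparent rate constant in 1/min


def time_to_conversion(k_per_min, target):
    """Time at which a first-order reaction reaches the target conversion."""
    return -math.log(1.0 - target) / k_per_min


if __name__ == "__main__":
    # Synthetic samples generated from k = 0.05 1/min (invented data).
    times = [5, 10, 20, 40]
    conv = [1.0 - math.exp(-0.05 * t) for t in times]
    k = fit_first_order(times, conv)
    print(f"k = {k:.3f} 1/min, t(95%) = {time_to_conversion(k, 0.95):.1f} min")
```

Under this simple model, a five-fold reduction in reaction time, such as the reported change from 400 to 80 min, would correspond to a five-fold increase in the apparent rate constant.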

In the pharmaceutical industry, regulatory compliance requires meticulous monitoring of each stage of large-scale production so that any deficient process can be corrected. Although certain precise instruments demand tight control over specific parameters, an in-line analysis module remains crucial for visualizing the entire process and ensuring successful scale-up. Instruments that require precisely prepared or static solutions, however, often find limited applicability in a flowing system. Precise regulation of detection concentration and retention time is therefore pivotal to the success of in-line analysis.

Conclusion

Over the past few decades, scientists have consistently worked to advance flow chemistry systems and automated platforms in order to expedite manufacturing and research in response to escalating demands in drug development and API production. Through continual improvement of hardware and control software, numerous flow-based methodologies and automated systems have been devised to support various domains of chemical synthesis. These domains encompass solid-phase synthesis of proteins, nucleic acids, and carbohydrates, iterative chemical synthesis, and chemical description languages. The evolution of universal automated platforms has progressively led to a notable increase in reaction rates while reducing experimental variability by instituting standardized conditions for discrete synthetic pathways. This in turn facilitates the refinement of data collected from the existing literature and enables the creation of comprehensive databases tailored for machine learning. These databases then strengthen the progress of predictive product development.

Looking forward, technological advances could further optimize intelligent algorithms, enhancing their capacity for autonomous learning, decision-making, and responsiveness to varying conditions. In addition, as connectivity between isolated programs increases through online databases, streamlined collaboration among different groups can foster the expansion and renewal of existing databases. This collaborative effort will support the ongoing improvement of intelligent synthesis algorithms. Furthermore, prospective advances in engineering and hardware, such as the emergence of robotic chemists [91], have the potential to stimulate innovation in both laboratory settings and industrial production environments. In summary, the continued evolution of automated synthesis, driven by program-based control systems, promises safer, faster, and simpler synthetic environments for pathways that traditionally relied on manual operation.