The FDA's new artificial intelligence (AI)-based model for characterizing protein aggregation could directly affect the safety assurance of biosimilars and other therapeutic proteins.
The FDA is both a regulatory agency and a deep science factory: it creates new technologies, publishes scientific papers, and shares its perspective on making biosimilars affordable by removing redundant testing. For example, on March 10, 2023, the FDA disclosed an artificial intelligence (AI)-based model for characterizing protein aggregation that will directly affect the safety assurance of therapeutic proteins, particularly biosimilars.
A significant issue regarding the safety of biosimilar therapeutic proteins, such as monoclonal antibodies, is their characterization and comparison with the reference products. One critical parameter is protein aggregation, which can be caused by various stress conditions or by the presence of container leachables or silicone oil in pre-filled syringes. Protein aggregates, even in minute quantities, can trigger an immune response.
Although a biosimilar product is compared with its reference product for protein aggregates, this property can be elusive. The stresses encountered during manufacturing, shipping, storage, and administration of protein drugs can promote aggregation by different molecular mechanisms, creating particles with a wide variety of sizes, shapes, and compositions. So far, it has not been possible to fully avoid this risk.
The FDA recently conducted a study that characterized protein aggregate particles with flow imaging microscopy (FIM), an imaging technique that can record many images of individual subvisible particles from a single sample. While these image sets are rich in structural information, manual extraction is cumbersome and often questionable because it relies on human-defined features such as aspect ratio, compactness, or pixel intensity, leaving most of the complex morphological information encoded in an FIM image underutilized.
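To make the idea of human-defined features concrete, the following is a minimal Python sketch of how aspect ratio, compactness, and pixel intensity might be computed for a single particle image. It is illustrative only: the function name shape_features, the intensity threshold, the bounding-box definition of aspect ratio, and the compactness formula 4πA/P² are assumptions chosen for the example, not the method used by the FDA or by any FIM instrument software.

```python
import math

def shape_features(image, threshold=0.5):
    """Compute a few hand-defined descriptors for one grayscale image.

    `image` is a list of rows of floats in [0, 1]. Pixels above
    `threshold` are treated as particle (assumed convention).
    """
    mask = [[px > threshold for px in row] for row in image]
    coords = [(r, c) for r, row in enumerate(mask)
              for c, on in enumerate(row) if on]
    if not coords:
        raise ValueError("no particle pixels above threshold")

    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1   # bounding-box height in pixels
    width = max(cols) - min(cols) + 1    # bounding-box width in pixels
    area = len(coords)                   # number of particle pixels

    # Perimeter: count pixel edges that face background or the image border.
    n_rows, n_cols = len(mask), len(mask[0])
    perimeter = 0
    for r, c in coords:
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < n_rows and 0 <= cc < n_cols
            if not (inside and mask[rr][cc]):
                perimeter += 1

    return {
        "aspect_ratio": width / height,
        "compactness": 4 * math.pi * area / perimeter ** 2,
        "mean_intensity": sum(image[r][c] for r, c in coords) / area,
    }

# Usage: a 2 x 4 pixel rectangular "particle" on a dark background.
img = [[0.0] * 6 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2, 3, 4):
        img[r][c] = 1.0
features = shape_features(img)
```

Each image is reduced to three numbers here, which illustrates why such descriptors leave most of the morphological information in an image unused.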
To overcome the shortcomings of current optical image analysis, the FDA applied an artificial intelligence/machine learning (AI/ML) approach, specifically convolutional neural networks (CNNs or ConvNets), a class of artificial neural networks that has proven helpful in many areas of image analysis.
CNNs enable the automatic extraction of data-driven features (i.e., measurable characteristics or properties) encoded in images. These complex features (e.g., fingerprints specific to stressed proteins) can be used to monitor the morphology of particles in biotherapeutics and to track the consistency of particles in a drug product. The FDA used ParticleSentry AI software to identify protein aggregation, batch variation, and anomalies during the biological development and manufacturing processes.
The AI model is trained using estimations of the most discriminatory parameters from images properly labeled as stressed or unstressed. This approach to classifying aggregates is similar to what has recently been reported for the reading of chest x-rays. Supervised learning techniques allow CNNs to extract feature information from raw images and correlate these features with the experimental conditions that generate particle images with different morphologies. Supervised learning relies on pre-defined labels (e.g., ‘non-stressed’ and ‘stressed’) associated with individual images for the network's training.
Once trained, the CNN can predict which pre-defined labels best apply to a new image that has not been used in training. This approach is helpful in root-cause analysis when the conditions that introduce aggregation cannot be anticipated.
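The supervised workflow described above (labeled images in, a predicted label out for an unseen image) can be sketched in a few lines of Python with NumPy. This is a deliberately simplified stand-in, not a trained CNN and not the FDA's model: the convolution filters are fixed rather than learned, only a final logistic layer is fitted, and the images are synthetic. The names make_image, conv_features, and predict, the noise levels, and the training settings are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed 3x3 filters (assumption: a real CNN would learn its filters).
KERNELS = [
    np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float),    # Laplacian
    np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float),  # Sobel x
]

def make_image(stressed, size=16):
    """Synthetic particle image; 'stressed' ones get a rougher texture."""
    yy, xx = np.mgrid[:size, :size]
    blob = np.exp(-((yy - size / 2) ** 2 + (xx - size / 2) ** 2) / 18.0)
    noise_sd = 0.3 if stressed else 0.02
    return blob + rng.normal(0.0, noise_sd, (size, size))

def conv_features(img):
    """Convolve, apply ReLU, then global-average-pool each filter response."""
    feats = []
    for k in KERNELS:
        windows = np.lib.stride_tricks.sliding_window_view(img, k.shape)
        response = np.einsum("ijkl,kl->ij", windows, k)
        feats.append(np.maximum(response, 0.0).mean())
    return np.array(feats)

def make_dataset(n):
    labels = rng.integers(0, 2, n)   # 0 = non-stressed, 1 = stressed
    X = np.stack([conv_features(make_image(bool(y))) for y in labels])
    return X, labels

X_train, y_train = make_dataset(200)
X_test, y_test = make_dataset(100)   # images never seen during training

# Standardize features using training statistics only.
mu, sd = X_train.mean(axis=0), X_train.std(axis=0)

def standardize(X):
    return (X - mu) / sd

# Fit the logistic layer by plain gradient descent on the labeled images.
w, b = np.zeros(X_train.shape[1]), 0.0
Z = standardize(X_train)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
    grad = p - y_train
    w -= 0.5 * (Z.T @ grad) / len(y_train)
    b -= 0.5 * grad.mean()

def predict(X):
    """Return 1 (stressed) or 0 (non-stressed) for each feature row."""
    return (standardize(X) @ w + b > 0).astype(int)

accuracy = (predict(X_test) == y_test).mean()
```

The held-out set plays the role of new images that were not used in training. In a real CNN the filters themselves would also be adjusted during training, which is what lets the network discover features that nobody defined by hand.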
The fingerprinting approach is motivated by processes (or events) occurring during manufacturing that lead to potential aggregation-inducing stresses. The precise nature of process upsets is not known a priori, so traditional supervised learning approaches are not applicable, and the CNN is used in a fingerprint mode instead. Rather than aiming to predict classes, the network is optimized to reduce the dimension of the spatially correlated image pixel intensities, resulting in a new lower-dimensional (e.g., two-dimensional) representation of each image. The lower-dimensional representation can help analyze or “curate” the complex morphology encoded in a heterogeneous collection of FIM images, since the full images can readily be mapped to the lower-dimensional representation enabled by the CNN.
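The idea of mapping each image to a point in a two-dimensional space can be illustrated with a short Python sketch. Note the substitution: the example uses principal component analysis (PCA), a linear technique, as a stand-in for the CNN-based embedding described above, and it runs on random arrays rather than FIM images. The function names fit_embedding and embed are hypothetical.

```python
import numpy as np

def fit_embedding(images, n_components=2):
    """Learn a linear map from flattened images to n_components dimensions."""
    X = images.reshape(len(images), -1).astype(float)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions of greatest variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def embed(images, mean, components):
    """Map full images to their low-dimensional 'fingerprint' coordinates."""
    X = images.reshape(len(images), -1).astype(float)
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
baseline_images = rng.random((50, 16, 16))   # placeholder for FIM images

mean, components = fit_embedding(baseline_images)
fingerprints = embed(baseline_images, mean, components)   # shape (50, 2)

# An image that was not used to fit the map lands in the same 2-D space.
new_point = embed(rng.random((1, 16, 16)), mean, components)  # shape (1, 2)
```

Because no labels are needed to fit the map, a cloud of points from a reference batch can be compared with points from a later batch even when the cause of any difference is unknown, which is the situation the fingerprint approach is meant to address.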
The pre-market research case study by the FDA used FIM to evaluate the impact of applied stressors (e.g., freeze-thawing, agitation) on model protein formulations (globulins and monoclonal antibodies). As a comparator with which one could validate image measurements, the investigators used abraded ethylene tetrafluoroethylene (ETFE) particles that mimic proteinaceous aggregates generated in a protein solution and are stable for up to 3 years, making the ETFE reference standard a promising tool in helping to validate a new “CNN-based” analytical method.
Application of AI/ML in the form of CNNs has enabled the processing of extensive collections of images with high efficiency and accuracy by distinguishing complex “textural features” not readily delineated with existing image processing software. These findings apply to a range of products in pharmaceuticals and biopharmaceuticals to monitor changes in product attributes (e.g., particles/aggregates) during manufacturing.
For example, flow microscopy combined with CNN image analysis can detect tiny shifts in protein aggregate populations due to stresses resulting from unknown process upsets, providing potential new strategies for monitoring product quality attributes.
In addition, using a reference standard such as ETFE that is stable over time and possesses optical properties similar to those of protein aggregates helps validate and evaluate the robustness of the analytical procedure.
The FDA has now provided biosimilar developers with a remarkable tool to demonstrate biosimilarity, particularly for proteins subject to uncertainty in their immunogenic properties caused mainly by aggregation during manufacturing and, more importantly, at any point before the drug reaches the patient. It is now up to developers to decide how to use this technology to improve the safety of their biosimilar products and bring higher assurance of biosimilarity to the FDA.
 Chiu K, Racz R, Burkhart K, et al. New science, drug regulation, and emergent public health issues: The work of FDA's division of applied regulatory science. Front Med (Lausanne). 2023;9:1109541. doi:10.3389/fmed.2022.1109541
 Homayounieh F, Digumarthy S, Ebrahimian S, et al. An artificial intelligence–based chest x-ray model on human nodule detection accuracy from a multicenter study. JAMA Netw Open. 2021;4(12):e2141096. doi:10.1001/jamanetworkopen.2021.41096