Over the next five years, global data creation is projected to increase by more than 50% year-over-year.
Because of this growth and complexity, it is essential to categorize, weed out, contextualize, process, and visualize data. Done right, this lets both humans and machines use the data properly. Doing so is increasingly challenging across the life sciences value chain. With so much data already accumulated, many organizations are asking what to do with it, how to analyze it, and how to leverage its predictive power. It’s one thing to collect data and another thing to make use of it.
Regulators are paying attention too. Year after year, the FDA finds more data integrity issues, which are costly for manufacturers and, ultimately, for patients who need quality medication on time.
Improving pharma manufacturing and operations starts with using data well and using it right. The three key factors impacting and potentially complicating the process of acquiring, understanding, and using data are:
- The Five Vs: Data Volume, Variety, Velocity, Validity, and Veracity (still need to know more about those? Check out our eBook)
- The Paper Problem
- Data Integrity
The third piece of the puzzle, Data Integrity, is the focus of this blog, and is the crux of the challenge facing manufacturers. Data Integrity in pharmaceutical manufacturing is more than just a concept.
It’s a set of principles that refer to data that is Attributable, Legible, Contemporaneous, Original, Accurate, Complete, Consistent, Enduring, and Available (ALCOA+).
The ALCOA+ principles are intended to ensure that all of the data the pharmaceutical industry generates, including data from clinical trials, manufacturing, and quality control, meets the criteria listed above. They apply to any data that has a GMP impact, whether it was recorded electronically, manually on paper, or in a combination of the two (hybrid format).
Let’s break it down…
- Attributable: The data records who, or which device, generated it and when.
- Legible: Data must be recorded and stored in a durable medium that will ensure readability throughout the retention period.
- Contemporaneous: Data must be recorded at the moment it is generated or observed.
- Original: Data to be preserved must be the original data or true copies.
- Accurate: The data must be verified as free of errors through a repeatable calculation, algorithm, or analysis, and any change must be documented.
- Plus (+)
- Complete: All of the information that allows for recreating an event must be recorded, including the critical data and metadata.
- Consistent: Data is consistent chronologically and follows the expected sequence.
- Enduring: The data will last for the whole retention period without losing readability.
- Available: The data should be accessible whenever needed during the retention period, by both users and auditors.
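To make a few of these principles concrete, here is a minimal Python sketch of how Attributable, Contemporaneous, and Consistent could be checked on a set of records. It is an illustration only: the `Record` fields, the `check_records` helper, and the five-minute recording window are assumptions made for this example, not a regulatory threshold or part of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)  # frozen: an entry cannot be modified in place once created
class Record:
    value: float
    unit: str
    recorded_by: str       # Attributable: user or device ID (hypothetical field)
    observed_at: datetime  # when the event was observed
    recorded_at: datetime  # when the entry was written


def check_records(records, max_delay=timedelta(minutes=5)):
    """Return (index, problem) findings. max_delay is an assumed tolerance."""
    findings = []
    previous = None
    for i, rec in enumerate(records):
        if not rec.recorded_by:
            findings.append((i, "not attributable"))
        if rec.recorded_at - rec.observed_at > max_delay:
            findings.append((i, "not contemporaneous"))
        if previous is not None and rec.recorded_at < previous.recorded_at:
            findings.append((i, "out of sequence"))
        previous = rec
    return findings


t0 = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
records = [
    Record(7.1, "pH", "operator-01", t0, t0 + timedelta(minutes=1)),
    # No author, and written almost two hours after the observation.
    Record(7.3, "pH", "", t0 + timedelta(minutes=10), t0 + timedelta(hours=2)),
]
findings = check_records(records)
print(findings)
```

In this example the first record passes, while the second is flagged as both not attributable and not contemporaneous. A real system would also need an audit trail for changes and controls for the remaining principles.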
The principles of ALCOA+ are important for ensuring the integrity of data in the pharmaceutical industry, which is critical for the safety and efficacy of drugs. Adhering to these principles can also help protect companies from potential legal or regulatory issues that may arise from inaccurate or incomplete data.
But Data Integrity (DI) does not apply only to data acquisition - it also applies to results and outputs. When AI models support human decision-making in critical processes along the full drug life cycle, the letters A, L, C, O, and A also describe the characteristics required of the predictions and recommendations that smart systems like Aizon's provide. Any prediction, recommendation, cluster, recognized pattern, or detected anomaly must always link back to the data used to create the model. It must also link to the corresponding accuracy and performance figures, the units of measure, a timestamp, and the input data used for the prediction. Finally, the result must be understandable and clearly interpretable.
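As a rough illustration of what such a traceable output could look like, the Python sketch below bundles a prediction with the links described above. Every name in it (`PredictionRecord`, `make_prediction_record`, the field names, the sample values) is hypothetical and chosen for this example; it does not describe how the Aizon platform stores predictions.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


def fingerprint(payload: dict) -> str:
    """SHA-256 of the payload serialized as JSON with sorted keys."""
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


@dataclass(frozen=True)
class PredictionRecord:
    value: float
    unit: str               # unit of measure of the predicted value
    model_version: str      # which model produced the result
    training_data_ref: str  # link to the data used to create the model
    model_metrics: dict     # accuracy and performance figures
    inputs: dict            # input data used for this prediction
    inputs_hash: str        # lets a reviewer detect altered inputs
    created_at: str         # UTC timestamp in ISO 8601 format


def make_prediction_record(value, unit, model_version,
                           training_data_ref, model_metrics, inputs):
    return PredictionRecord(
        value=value,
        unit=unit,
        model_version=model_version,
        training_data_ref=training_data_ref,
        model_metrics=model_metrics,
        inputs=inputs,
        inputs_hash=fingerprint(inputs),
        created_at=datetime.now(timezone.utc).isoformat(),
    )


record = make_prediction_record(
    value=92.4,
    unit="% yield",
    model_version="yield-model-1.3",
    training_data_ref="dataset-2023-Q4",
    model_metrics={"r2": 0.91},
    inputs={"temperature_c": 4.0, "ph": 7.1},
)
print(record.inputs_hash)
```

Because the hash is computed over a key-sorted serialization, the same inputs always produce the same fingerprint, which gives a reviewer a simple way to confirm that the stored inputs are the ones the prediction was made from.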
Let’s take a look at a very specific case: Plasma derivatives constitute a specific class of drug biomanufacturing with very particular requirements. The most critical factor is that the raw material is human blood, donated by individuals and managed by donation centers. The critical quality attributes (CQAs) of the raw material are varied and heterogeneous, and their values determine the performance of the final product derived from the manufacturing process.
Spreadsheets, emails, PDFs, and (in the best cases) applications are examples of the data sources that inform the CQAs. When subject matter experts (SMEs) prepare and set up the operations that transform plasma into intermediate and final products, they need to be able to work with these different data sources and unify them into a single record, which will ultimately be the source of truth for later actions.
Automating data extraction from multiple origins and centralizing the information in a standard structure is a job for systems like the Aizon platform. This ensures that data is processed in a homogeneous structure that is consistent with Data Integrity standards.
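To give a feel for what centralizing in a standard structure means in practice, here is a small Python sketch that maps two differently shaped sources onto one common record layout and keeps track of where each value came from. The source formats, field names, lot IDs, and values are all invented for this example and are not taken from any real system.

```python
import csv
import io
import json


def from_csv(text, source):
    """Map rows of a spreadsheet export onto the common structure."""
    return [
        {
            "lot_id": row["lot"],
            "attribute": row["attribute"],
            "value": float(row["value"]),
            "unit": row["unit"],
            "source": source,  # provenance: where the value came from
        }
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text, source):
    """Map an application export with different field names."""
    return [
        {
            "lot_id": item["lotId"],
            "attribute": item["name"],
            "value": float(item["result"]),
            "unit": item["uom"],
            "source": source,
        }
        for item in json.loads(text)
    ]


spreadsheet = "lot,attribute,value,unit\nP-001,total_protein,61.5,g/L\n"
app_export = '[{"lotId": "P-002", "name": "total_protein", "result": 58.0, "uom": "g/L"}]'

unified = from_csv(spreadsheet, "spreadsheet") + from_json(app_export, "lims")
for entry in unified:
    print(entry)
```

Each source gets its own small mapping function, so every record downstream has the same fields regardless of origin, and the `source` field preserves the link back to where the value was first recorded.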
If you’re ready to get started with better data management, take a look at the article: Four Reasons to Digitize your Life Sciences Manufacturing Data.