Blockchain for clinical trials
Recently, Hyper38 was tasked with a research project aiming to apply blockchain technology in clinical trials. We have identified eleven points where a blockchain-based solution can create a significant positive impact. In this paper, we present those eleven points in the context of the standard clinical trial process.
What is a blockchain?
A blockchain is a protocol of trust and a particular kind of database. Records in a blockchain are grouped and saved together in so-called blocks. Besides its data, every block contains a cryptographic hash (a unique signature) of the previous block. A blockchain is a distributed database managed by a peer-to-peer network whose nodes collectively adhere to a protocol for internode communication and for validating new blocks. Each node in the network holds a copy of the entire blockchain. This design makes a blockchain resistant to modification of data. The data in any block cannot be altered retroactively without altering all subsequent blocks, which requires the collusion of the network majority.
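The hash linking described above can be illustrated with a minimal sketch. It is not a real blockchain: there is no network, no consensus, and no validation protocol. The function names and the block layout are assumptions made for this example only.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, records: list) -> None:
    """Append a block that stores the hash of the block before it."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "records": records, "prev_hash": prev_hash})


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["record A", "record B"])
append_block(chain, ["record C"])
append_block(chain, ["record D"])

chain_ok_before = is_valid(chain)   # links are intact
chain[0]["records"][0] = "altered"  # retroactive edit of an early block
chain_ok_after = is_valid(chain)    # the next block no longer matches
```

Hiding the edit would require recomputing the stored hash in every later block, and in a distributed setting doing so on a majority of the nodes, which is the property the paragraph above relies on.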
The blockchain was invented in 2008 by a pseudonymous person or persons named Satoshi Nakamoto. It was defined as a new protocol for a peer-to-peer electronic cash system using a cryptocurrency called bitcoin. Cryptocurrencies differ from traditional fiat currencies because they are not created or controlled by countries. This protocol established a set of rules, described above, that ensure the integrity of the data exchanged among billions of devices without going through a trusted third party. This seemingly subtle act set off a spark that has excited, terrified, or otherwise captured the imagination of the computing world and has spread like wildfire to businesses, governments, privacy advocates, social development activists, media theorists, and journalists, to name a few, everywhere. "This is the thing! This is the distributed trust network that the Internet always needed and never had," said Marc Andreessen, the co-creator of the first commercial Web browser, Netscape, and a prominent investor in technology ventures.
The blockchain is a protocol of trust, a platform for everyone to know what is right without relying on third-party institutions.
What is a clinical trial?
The clinical trial is “the most definitive tool for evaluation of the applicability of clinical research.” It represents “a key research activity with the potential to improve the quality of health care and control costs through careful comparison of alternative treatments.” On many occasions, it has been called “the gold standard” against which all other clinical research is measured. A properly planned and executed clinical trial is the best experimental technique for assessing an intervention’s effectiveness. It also contributes to the identification of possible harms.
A clinical trial is a prospective study comparing the effects and value of an intervention against a control in human beings. Note that a clinical trial is prospective rather than retrospective. Study participants must be followed forward in time. They need not all be followed from an identical calendar date. In fact, this will occur only rarely. However, each participant must be followed from a well-defined point in time, which becomes time zero or baseline for that person in the study.
A clinical trial must employ one or more intervention techniques. These may be single or combinations of diagnostic, preventive, or therapeutic drugs, biologics, devices, regimens, procedures, or educational approaches. Intervention techniques should be applied to participants in a standard fashion in an effort to change some outcome. A trial contains a control group against which the intervention group is compared. At baseline, the control group must be sufficiently similar in relevant respects to the intervention group in order that differences in outcome may reasonably be attributed to the action of the intervention.
Unlike in animal studies, the investigator in a clinical trial cannot dictate what an individual should do. The investigator can only strongly encourage participants to avoid certain medications or procedures which might interfere with the trial. Since it may be impossible to have “pure” intervention and control groups, an investigator may not be able to compare interventions, but only intervention strategies. Strategies refer to attempts at getting all participants to adhere to their originally assigned intervention to the best of their ability. When planning a trial, the investigator should recognize the difficulties inherent in studies with human subjects and attempt to estimate the magnitude of participants’ failure to adhere strictly to the protocol.
The ideal clinical trial is the one that is randomized and double-blinded. Deviation from this standard has potential drawbacks. In some clinical trials compromise is unavoidable, but often deficiencies can be prevented or minimized by employing fundamental features of design, conduct and analysis.
A number of people distinguish between demonstrating the “efficacy” of an intervention and the “effectiveness” of an intervention. They also refer to “explanatory” trials, as opposed to “pragmatic” or “practical” trials. Efficacy or explanatory trials refer to what the intervention accomplishes in an ideal setting. Effectiveness or pragmatic trials refer to what the intervention accomplishes in actual practice, taking into account the inclusion of participants who may incompletely adhere to the protocol or who may not respond to an intervention for other reasons.
Classically, trials of pharmaceutical agents have been divided into phases I through IV.
Figure 1 Correlation between development phases and types of study
A good summary of the phases of clinical trials and the kinds of questions addressed at each phase was prepared by the International Conference on Harmonisation. Figure 1, taken from that document, illustrates that research goals can overlap with more than one study phase. Although pharmacology studies in humans that examine drug tolerance, metabolism, and interactions, and describe pharmacokinetics and pharmacodynamics, are generally done as phase I, some pharmacology studies may be done in other trial phases. Therapeutic exploratory studies, which look at the effects of various doses and typically use biomarkers as the outcome, are generally thought of as phase II. However, they may sometimes be incorporated into other phases. The usual phase III trial consists of therapeutic confirmatory studies, which demonstrate clinical usefulness and examine the safety profile. But such studies may also be done in phase II or phase IV trials. Therapeutic use studies, which examine the drug in broad or special populations and seek to identify uncommon adverse effects, are almost always phase IV (or post-approval) trials.
Well-designed and sufficiently large randomized clinical trials are the best method to establish which interventions are effective and generally safe and thereby improve public health. Unfortunately, a minority of recommendations in clinical practice guidelines are based on evidence from randomized trials, the type of evidence needed to have confidence in the results. Thus, although trials provide the essential foundation of evidence, they do not exist for many commonly used therapies and preventive measures. Improving the capacity, quality, and relevance of clinical trials is a major public health priority.
The aim of this white paper is to demonstrate how the application of blockchain technology, as a protocol of trust, can improve the overall processes of clinical trials.
Over the course of a clinical trial, many steps need to be accomplished in an accurate and timely manner. Various groups of people with a myriad of roles are involved in the process. Some of them take part in only some steps, while others are engaged in the entire lifecycle of the study, which typically lasts for several years. Along the way, a great deal of data is collected. Some of the data are part of the data collection process, and some consist of the documents supporting the trial protocol and the responsibilities of the participants.
If we look at the guidance for developing a clinical trial protocol published as the SPIRIT 2013 Statement (Standard Protocol Items: Recommendations for Interventional Trials), we can see the content of a typical clinical trial protocol and the organization that supports it:
- Background of the study
- Primary question and response variable
- Secondary question and response variables
- Subgroup hypotheses
- Adverse effects
- Design of the study
- Study population
- Inclusion criteria
- Exclusion criteria
- Sample size assumptions and estimates
- Enrollment of participants
- Informed consent
- Assessment of eligibility
- Baseline examination
- Intervention allocation (e.g., randomization method)
- Description and schedule
- Measures of compliance
- Follow-up visit description and schedule
- Ascertainment of response variables
- Data collection
- Quality control
- Assessment of adverse events
- Type and frequency
- Data analysis
- Interim monitoring, including data monitoring committee role
- Final analysis
- Termination policy
- Study population
- Participating investigators
- Statistical unit or data coordinating center
- Laboratories and other special units
- Clinical center(s)
- Study administration
- Steering committees and subcommittees
- Monitoring committee
- Funding organization
- Participating investigators
At every step in this list, piles of information are generated. If you look from the points of view of the various study participants, you will recognize the different types of data they create, collect, or exchange with others. We are not talking only about the data collection procedures here, which are the most important, but also about the many other documents produced from day one of the trial until its end. You can imagine how many decisions, revisions, forms, training sessions, and meetings accumulate over time.
In the following figure, you can see the industry-standard clinical trial model with all the participants involved:
Figure 2 The Industry Modified Clinical Trial Model
Figure 2 depicts how many participants are required in the process and the relations between them. This figure is essential for us in two ways. First, it gives us an idea of how much information flows between all the parties communicating with each other via documents, emails, meetings, decisions, etc. Second, and most important to us, it shows clinical data management as a strictly centralized unit, marked in Figure 2 as the Site and Data Management Center. To emphasize this centralization of data management, the next figure describes the data collection flow in more detail:
Figure 3 Source data flow
Figure 3 shows the various data flows in the data collection processes during clinical trials. Our goal is not to explain each source and why it is important, but to show that there is a central unit where all the data goes. At first glance, there is nothing wrong with a central database collecting all the information.
PROBLEM #1 - Centralization: Centralized data storage can be a single point of failure for data integrity protection. Information that is centrally stored could be compromised by participants with an interest in data tampering. Even though there are standards in place for data protection, centralized databases are by design vulnerable to attacks from both insiders and outsiders. We need a decentralized system that all interested parties can access independently.
Valid and informative results from clinical trials depend on data that are of high enough quality and sufficiently robust to address the question posed. Such data in clinical trials are collected from several sources - medical records, interviews, questionnaires, participant examinations, laboratory determinations, or public sources like national death registries. Data elements vary in their importance, but having valid data regarding key descriptors of the population, the intervention, and primary outcome measures is essential to the success of the trial.
Avoiding problems in data collection represents a challenge. There are many reasons for poor quality. The problems encompass missing data, erroneous, falsified, and fabricated data, large variability, and long delays in data submission. Even with the best planning, data quality needs to be monitored throughout the trial, and corrective actions taken to deal with unacceptable problems.
During all phases of a study, sufficient effort should be spent to ensure that all data critical to the interpretation of the trial, i.e., those relevant to the main questions posed in the protocol, are of high quality.
The data collected should focus on the answers to the questions posed in the protocol. Essential data vary by the type of trial, and they include:
- baseline information, such as inclusion and exclusion criteria that define the population
- measures of adherence to the study intervention
- important concomitant interventions
- primary response variables
- important secondary response variables
- adverse effects with emphasis on predefined serious events
- other prespecified response variables
Data are collected to answer questions about benefits, risks, and the ability to adhere to the intervention being tested. Trials must collect data on baseline covariates or risk factors for at least three purposes:
- to verify eligibility and describe the population studied
- to verify the randomization did balance the important known risk factors
- to allow for limited subgroup analysis
Data must be collected on the primary and secondary response variables specified in the protocol and in some cases, tertiary level variables. Some measures of adherence to the interventions specified in the protocol are necessary as well as important concomitant medications used during the trial. That is, to validly test the intervention, the trial must describe how much of the intervention the participant was exposed to and what other interventions were used. Collection of adverse events is challenging for many reasons.
Problems in data collection
There are three major types of data collection problems:
- Missing data
- Incorrect, fabricated, and falsified data
- Delayed submission
1. Missing data
First, incomplete and irretrievably missing data can arise, for example, from the inability of participants to provide the necessary information, from inadequate assessments such as physical examinations, from laboratory mishaps, from carelessness in completing data entry, or from inadequate quality control within electronic data management systems. Missing outcome data due to the withdrawal of participant consent or loss to follow-up can make results unreliable. When the results of the Anti-Xa Therapy to Lower Cardiovascular Events in Addition to Standard Therapy in Subjects with Acute Coronary Syndrome (ATLAS-ACS 2) trial, which tested rivaroxaban following acute coronary syndromes, were reviewed by an FDA Advisory Committee, drug approval was not recommended, in large part because over 10% of participants had incomplete follow-up. The percentage of missing critical data in a study is considered one indicator of the quality of the data and, therefore, of the quality of the trial.
PROBLEM #2 - Missing data: A system for early detection of missing data and continuous monitoring of the data entry processes is of crucial importance.
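As a rough illustration of early detection, the sketch below reports which required fields are empty in a submitted record and what fraction of required values is missing overall. The field names and the list of required fields are hypothetical; a real trial would take them from its protocol and case report forms.

```python
# Hypothetical set of fields a protocol might mark as critical.
REQUIRED_FIELDS = ["participant_id", "visit_date", "primary_outcome"]


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in one record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


def missing_rate(records: list) -> float:
    """Fraction of required values that are missing across all records."""
    total = len(records) * len(REQUIRED_FIELDS)
    if total == 0:
        return 0.0
    missing = sum(len(missing_fields(r)) for r in records)
    return missing / total


records = [
    {"participant_id": "P001", "visit_date": "2018-03-01", "primary_outcome": 1},
    {"participant_id": "P002", "visit_date": "", "primary_outcome": None},
]

for record in records:
    gaps = missing_fields(record)
    if gaps:
        print(record["participant_id"], "is missing:", gaps)
```

Running such a check at the moment of data entry, rather than at database lock, is what would make the monitoring continuous.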
2. Incorrect, fabricated and falsified data
Erroneous data may not be recognized and, therefore, can be even more troublesome than incomplete data. For study purposes, a specified condition may be defined in a particular manner. A clinic staff member may unwittingly use a clinically acceptable definition, but one that is different from the study definition. Specimens may be mislabeled. In one clinical trial, the investigators appropriately suspected mislabeling errors when, in a glucose tolerance test, the fasting levels were higher than the 1-h levels in some participants. Badly calibrated equipment can be a source of error. In addition, incorrect data may be entered on a form.
The most troublesome types of erroneous data are those that are falsified or entirely fabricated. The pressure to recruit participants may result in alterations of laboratory values, blood pressure measurements, and critical dates in order to qualify otherwise ineligible participants for enrollment.
PROBLEM #3 - Incorrect, fabricated, falsified data: Prevent data fabrication by design. Leave no opportunity for data tampering.
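One way to make later alterations detectable is to store a cryptographic fingerprint of each record at the time of entry and compare against it afterwards. The sketch below assumes records are simple dictionaries and that the fingerprint would be written to a tamper-resistant ledger. Both the record layout and the function names are illustrative assumptions, not a description of an existing system.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """SHA-256 of a canonical JSON encoding, independent of key order."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def unchanged(record: dict, stored_fingerprint: str) -> bool:
    """True if the record still matches the fingerprint taken at entry."""
    return fingerprint(record) == stored_fingerprint


entry = {"participant_id": "P001", "systolic_bp": 152, "visit_date": "2018-03-01"}
stored = fingerprint(entry)  # would be anchored in the ledger at entry time

edited = dict(entry, systolic_bp=138)  # value altered after the fact
print(unchanged(entry, stored), unchanged(edited, stored))
```

A fingerprint only shows that a record changed after it was entered. It cannot show that the value was wrong or fabricated at the moment of entry, so it complements rather than replaces source data verification.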
3. Delayed submission
Delayed submission of participant data, especially from the clinical sites in multicenter trials, is often a major issue. The importance of timely submissions is directly related to effective data quality monitoring; see below.
PROBLEM #4 - Delayed data submission: Create a system of automatic rewards for on-time data submissions.
Even though every effort is made to obtain high-quality data, a formal monitoring or surveillance system is crucial. When errors are found, this system enables the investigator to take corrective action. Monitoring is most effective when it is current so that when deficiencies are identified, measures can be instituted to fix the problem as early as possible. Additionally, monitoring allows an assessment of data quality when interpreting study results. Numerous procedures, including drug handling and the process of informed consent, should be monitored. Minimizing missing data, particularly of the primary outcome and major safety outcomes, is crucially important.
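A reward rule of this kind could be as simple as the sketch below. The seven-day submission window and the point value are hypothetical placeholders; the source does not specify either, and a real scheme would be defined in the site contracts.

```python
from datetime import date

SUBMISSION_WINDOW_DAYS = 7  # hypothetical deadline after the visit
REWARD_POINTS = 10          # hypothetical reward per on-time submission


def is_on_time(visit: date, submitted: date,
               window_days: int = SUBMISSION_WINDOW_DAYS) -> bool:
    """A submission is on time if it arrives within the window after the visit."""
    delay = (submitted - visit).days
    return 0 <= delay <= window_days


def reward_for(submissions: list) -> int:
    """Total reward for a list of (visit_date, submission_date) pairs."""
    return sum(
        REWARD_POINTS
        for visit, submitted in submissions
        if is_on_time(visit, submitted)
    )


submissions = [
    (date(2018, 3, 1), date(2018, 3, 4)),   # submitted after 3 days
    (date(2018, 3, 1), date(2018, 3, 20)),  # submitted after 19 days
]
print(reward_for(submissions))
```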
Monitoring of Data
During the study, data entered into the system should be checked for completeness, internal consistency, and consistency with other data fields. There should be a system to ensure that important source data matches what is in the database. When the data fields disagree, the group responsible for ensuring consistent and accurate data should ensure that a system is in place to correct the discrepancy. Dates and times are particularly prone to error.
A system should be in place to constantly monitor data completeness and currency to find evidence of missing participant visits or visits that are off schedule in order to correct any problems. The frequency of missing or late visits may be associated with the intervention. Differences between groups in missed visits may bias the study results. To improve data quality, it may be necessary to observe actual clinic procedures.
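The check for missing or off-schedule visits can be sketched as follows. Each visit is scheduled as a number of days after the participant's own baseline, and a tolerance window decides whether a visit counts as on schedule. The seven-day tolerance and the status labels are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Optional

VISIT_WINDOW_DAYS = 7  # hypothetical tolerance around the scheduled date


def visit_status(baseline: date, scheduled_day: int,
                 actual: Optional[date], today: date) -> str:
    """Classify one scheduled visit relative to the participant's baseline."""
    target = baseline + timedelta(days=scheduled_day)
    window = timedelta(days=VISIT_WINDOW_DAYS)
    if actual is None:
        # No visit recorded: missing only once the window has closed.
        return "missing" if today > target + window else "pending"
    if abs((actual - target).days) > VISIT_WINDOW_DAYS:
        return "off_schedule"
    return "on_schedule"


baseline = date(2018, 1, 1)
print(visit_status(baseline, 30, date(2018, 2, 2), date(2018, 3, 1)))
print(visit_status(baseline, 30, None, date(2018, 3, 1)))
```

Counting these statuses per intervention group would give an early signal of the between-group differences in missed visits that the text warns may bias results.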
Monitoring of procedures
Extreme laboratory values should be checked. Values incompatible with life, such as a potassium level of 10 mEq/L, are obviously incorrect. Other less extreme values (e.g., total cholesterol of 125 mg/dL in male adults in the United States who are not taking lipid-lowering agents) should be questioned. They may be correct, but it is unlikely. Finally, values should be compared with previous ones from the same participant. Certain levels of variability are expected, but when these levels are exceeded, the value should be flagged as a potential outlier. For example, unless the study involves administering a lipid-lowering therapy, any determination which shows a change in serum cholesterol of 20% or more from one visit to the next should be repeated.
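These two checks, a plausibility range and a visit-to-visit change limit, can be sketched as below. The 20% change threshold comes from the cholesterol example above. The potassium range is an illustrative placeholder only, not clinical guidance, and the field names are hypothetical.

```python
# Illustrative limits only; a real trial would take these from its protocol.
PLAUSIBLE_RANGE = {"potassium_mEq_L": (2.0, 8.0)}
MAX_RELATIVE_CHANGE = 0.20  # 20% change between consecutive visits


def flag_value(name: str, value: float, previous: float = None) -> list:
    """Return the list of reasons a laboratory value should be reviewed."""
    flags = []
    low, high = PLAUSIBLE_RANGE.get(name, (float("-inf"), float("inf")))
    if not low <= value <= high:
        flags.append("implausible")
    if previous is not None and previous != 0:
        if abs(value - previous) / abs(previous) >= MAX_RELATIVE_CHANGE:
            flags.append("repeat_measurement")
    return flags


print(flag_value("potassium_mEq_L", 10.0))
print(flag_value("cholesterol_mg_dL", 150.0, previous=200.0))
print(flag_value("cholesterol_mg_dL", 190.0, previous=200.0))
```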
Monitoring of drug handling
In a drug study, the quality of drug preparations should be monitored throughout the trial. This includes periodically examining containers for possible mislabeling and for proper contents (both quality and quantity). Investigators should carefully look for discoloration and breaking or crumbling of capsules or tablets. The actual bottle content of pills should not vary by more than 1% or 2%. The number of pills in a bottle is important to know if pill count will be used to measure participant adherence. Products with a short shelf life require frequent production of small batches. Records should be maintained for study drugs prepared, examined, and used.
The dispensing of medication should also be monitored. Checking has two aspects. First, were the proper drugs sent from the pharmacy or pharmaceutical company to the clinic? If the study is double-blind, the clinic staff will be unable to check this. They must assume that the medication has been properly coded. Second, when the study is blinded, the clinic personnel need to be absolutely sure that the code number on the container is the proper one. Labels and drugs should be identical except for the code; therefore, extra care is essential.
The drug manufacturer assigns lot, or batch, numbers to each batch of drugs prepared. If contamination or problems in preparation are detected, then only those drugs from the problem batch need to be recalled. The use of batch numbers is especially important in clinical trials since the recall of all drugs can severely delay, or even ruin, the study. When only some drugs are recalled, the study can usually manage to continue. Therefore, the lot number of the drug as well as the name or code number should be listed in the participant’s study record.
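Recording the lot number in each participant's study record is what makes a targeted recall possible. A minimal sketch, with a hypothetical record layout and made-up lot numbers:

```python
def participants_affected(study_records: list, recalled_lots: set) -> list:
    """Return the IDs of participants who received a drug from a recalled lot."""
    return sorted(
        {r["participant_id"] for r in study_records if r["lot_number"] in recalled_lots}
    )


study_records = [
    {"participant_id": "P001", "drug_code": "A17", "lot_number": "LOT-001"},
    {"participant_id": "P002", "drug_code": "B42", "lot_number": "LOT-002"},
    {"participant_id": "P003", "drug_code": "A17", "lot_number": "LOT-001"},
]

print(participants_affected(study_records, {"LOT-001"}))
```

Only the participants returned by the lookup need replacement medication, so the rest of the study can continue.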
There are three general types of audits: routine audits of a random sample of records, structured audits, and audits for cause. Site visits are commonly conducted in long-term multicenter trials. In many non-industry sponsored trials, a 5-10% random sample of study forms may be audited for the purpose of verifying accurate transfer of data from hospital source records. More complete audits are usually performed in industry-sponsored trials. While the traditional model has been for study monitors (or clinical research associates) to visit the sites in order to verify that the entered data are correct, a more appropriate role may be to perform selected source-data verification for critical variables and to spend more time ensuring that appropriate systems are in place and training has been performed.
The purpose of audits for cause is to respond to allegations of possible scientific misconduct. This could be expanded to include any unusual performance pattern, such as enrolling participants well in excess of the number contracted for or anticipated. This type of audit looks for fabrication, falsification, or plagiarism in proposing, performing, or reviewing research findings.
The FDA (the Food and Drug Administration in the USA) and other national regulatory agencies conduct periodic audits as well as investigations of violations of applicable law. These may include clinical investigator fraud, such as falsifying documentation and enrolling ineligible patients.
PROBLEM #7 - Quality control: Many steps performed during a trial have to be controlled for quality. A robust surveillance system is required to provide early signals of problems and enable quick resolution.
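Drawing the routine random sample can be sketched in a few lines. The form identifiers are made up, and the fixed seed is an assumption chosen so that the selection can be reproduced and checked later; a real audit plan would define its own sampling procedure.

```python
import random


def audit_sample(form_ids: list, fraction: float, seed: int) -> list:
    """Select a reproducible random sample of study forms for auditing."""
    if not form_ids:
        return []
    size = max(1, round(fraction * len(form_ids)))
    rng = random.Random(seed)  # seeded so the same selection can be re-derived
    return sorted(rng.sample(form_ids, size))


form_ids = [f"FORM-{i:04d}" for i in range(1, 201)]  # 200 hypothetical forms
selected = audit_sample(form_ids, fraction=0.05, seed=2018)
print(len(selected), "forms selected for audit")
```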
Assessment and Reporting of Harm
An adverse effect has been described as “a noxious or unintended response to a medical product in which a causal relationship is at least a reasonable possibility.” Harm is the sum of all adverse effects and is used to determine the benefit-harm balance of an intervention. Risk is the probability of developing an adverse effect. Severe is a measure of intensity. Serious is an assessment of medical consequence. Expected adverse events or effects are those that are anticipated based on prior knowledge. Unexpected adverse events or effects are findings not previously identified in nature, severity, or degree of incidence.
Assessment of harm is more complex than the assessment of the benefit of an intervention. The measures of favorable effects are or should be prespecified in the protocol and they are limited in number. In contrast, the number of adverse events is typically very large, and they are rarely prespecified in the protocol. Some may not even be known at the time of trial initiation. These facts introduce analytic challenges.
PROBLEM #8 - Assessment and reporting of harm: Careful attention needs to be paid to the assessment, analysis, and reporting of adverse effects to permit valid assessment of harm from interventions.
Participant adherence (compliance)
The terms compliance and adherence are often used interchangeably. An international consensus statement crafted by the World Health Organization and the International Society for Pharmacoeconomics and Outcomes Research defined medication adherence as “the extent to which a patient acts in accordance with the prescribed interval and dose of the dosing regime.” The term adherence implies active participant involvement in the decision to take a medication, use a device, or engage in a behavior change.
Medication adherence is a major challenge for patients, the consequences of which affect clinical practitioners and investigators alike. As many as one-third of all prescriptions are reportedly never filled and, among those filled, a large proportion is associated with incorrect administration. Even among patients who receive medication at no cost from their health plans, rates of nonadherence reach nearly 40%. Nonadherence has been estimated to cause nearly 125,000 deaths per year in the U.S. and has been linked to 10% of hospital admissions and 23% of nursing home admissions. Poor medication adherence in the U.S. has a resultant cost of approximately $100 billion a year.
Factors in improving the likelihood of medication adherence in clinical trials are:
- Trial design - Simple schedule (once or twice daily dosing) that fits into a daily routine
- Relationships and communication - Enhanced relationship of study coordinator with the participant with regular communication
- Passive monitoring - Electronic monitoring tools
- Education - Medication usage skills
- Reminders - Alarms (e.g., electronic reminders to medication schedule) and associations (e.g., put medication beside toothbrush or use a behavior trigger)
- Incentives - Monetary or other rewards
Monitoring adherence is important in a clinical trial for two reasons: first, to identify any problems so steps can be taken to enhance adherence; second, to be able to relate the trial findings to the level of adherence.
PROBLEM #9 - Participant adherence: Many potential adherence problems can be prevented or minimized before participant enrollment. Once a participant is enrolled, measures to monitor and enhance participant adherence are essential.
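One simple way to quantify adherence is the pill count mentioned in the drug-handling discussion: compare the number of pills that left the bottle with the number the schedule called for. The formula below is a common convention, assumed here for illustration and not prescribed by the source; the parameter names are hypothetical.

```python
def pill_count_adherence(dispensed: int, returned: int,
                         doses_per_day: int, days_elapsed: int) -> float:
    """Fraction of scheduled doses that were apparently taken."""
    expected = doses_per_day * days_elapsed
    if expected <= 0:
        raise ValueError("expected number of doses must be positive")
    taken = dispensed - returned
    return taken / expected


# 60 pills dispensed, 12 returned, twice-daily dosing over 28 days.
adherence = pill_count_adherence(dispensed=60, returned=12,
                                 doses_per_day=2, days_elapsed=28)
print(f"{adherence:.0%}")
```

A pill count assumes that every pill missing from the bottle was actually taken, so it is usually read alongside other measures such as electronic monitoring.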
The investigator’s ethical responsibility to the study participants demands that safety and clinical benefit be monitored during trials. If data partway through the trial indicate that the intervention is harmful to the participants, early termination of the trial should be considered. If these data demonstrate a clear definitive benefit from the intervention, the trial may also be stopped early because continuing would be unethical to the participants in the control group. In addition, if differences in primary and possibly secondary response variables are so unimpressive that the prospect of a clear result is extremely unlikely, it may not be justifiable in terms of time, money, and effort to continue the trial. Also, monitoring of response variables can identify the need to collect additional data to clarify questions of benefit or toxicity that may arise during the trial. Finally, monitoring may reveal logistical problems or issues involving data quality that need to be promptly addressed. Thus, there are ethical, scientific, and economic reasons for the interim evaluation of a trial.
PROBLEM #10 - Independent monitoring: During the trial, response variables need to be monitored for early dramatic benefits or potentially harmful effects or futility. Monitoring should be done by a person or group independent of the investigator.
For many clinical trials, there are national and local regulations that must be followed in order to conduct clinical research. Furthermore, in order for an industry sponsor to market a medical product, regulatory agency approval is required in most countries of the world. Different countries have different names for their agencies: Food and Drug Administration (FDA) in the United States, European Medicines Agency (EMA), Swissmedic in Switzerland, etc., but their purpose is the same. Regulatory agency rules and guidelines still differ among countries, and these differences may contribute to different approval decisions. Importantly, rules of conduct also differ among the countries in which a multinational trial might be conducted.
When designing and conducting a clinical trial, investigators must know and follow national, state, and institutional regulations that are designed to protect research integrity and participant safety. Clinical trials are required to report certain procedures and ongoing events to the regulatory agencies. Regulatory agencies conduct periodic checks of trials to ensure that everything is aligned with the rules.
The following list summarizes key actions and responsibilities required of investigators (both lead and other) conducting trials that fall under the purview of the FDA:
- Principles of research
- Informed consent process
- Knowledge of basic regulations
- 45 CFR 46 (Common rule)
- 21 CFR 50 (FDA Regulations)
- 45 CFR 160 (Privacy act)
- IND or IDE Completion
- Pre-clinical materials and references
- Final protocol (allowing FDA up to 30 days for review)
- Information showing competence(s) of investigators
- Information showing the adequacy of facilities
- Registration with clinicaltrials.gov
- Other quality assurance activities
- Reporting to IRB(s) and FDA
- Serious adverse events
- Submission of data and documents to the FDA (if seeking product approval)
- Completed case report forms
- Presentation to the advisory committee
- Possible conduct of post-approval studies
- Publication of trial results
- Timely submission of data to clinicaltrials.gov
PROBLEM #11 - Regulatory: Various collected data and performed procedures must be submitted to the regulatory agencies for approval and be available for periodic checks. Investigators are obligated to know and follow the regulatory rules.
The Blockchain Solution
A blockchain-based software solution for conducting all clinical trial processes in a trusted, secure, private, transparent, and innovative way opens the field of clinical trials to technological innovation. We propose a unified solution that addresses all 11 problems identified above.
To be continued.
 Tapscott, Don, and Alex Tapscott. Blockchain Revolution: How the Technology behind Bitcoin Is Changing Money, Business, and the World. Portfolio Penguin, 2016.
 NIH Inventory of Clinical Trials: Fiscal Year 1979. Volume 1. National Institutes of Health, Division of Research Grants, Research Analysis and Evaluation Branch, Bethesda, MD
 Tricoci PL, Allen JM, Kramer JM, Califf RM, Smith SC Jr. Scientific evidence underlying the ACC/AHA clinical practice guidelines, JAMA 2009;301:831-841; erratum in JAMA 2009;301:1544.69.
 Friedman, Lawrence M., et al. Fundamentals of Clinical Trials. Springer, 2015.
 Chan A-W, Tetzlaff JM, Altman DG, et al. SPIRIT 2013 Statement: defining standard protocol items for clinical trials. Ann Intern Med 2013;158:200-207.
 Fisher MR, Roecker EB, DeMets DL. The role of an independent statistical analysis center in the industry-modified National Institutes of Health model. Drug Inf J 2001;35:115-129.
 Society for Clinical Data Management. eSource Implementation in Clinical Research: A Data Management Perspective. A White Paper.
 Briefing Information for the May 23, 2012 Meeting of the Cardiovascular and Renal Drugs Advisory Committee. Available at https://wayback.archive-it.org/7993/20170404150433/https://www.fda.gov/AdvisoryCommittees/CommitteesMeetingMaterials/Drugs/CardiovascularandRenalDrugsAdvisoryCommittee/ucm304754.htm
 International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. Clinical Safety Data Management: Definition and Standards for Expedited Reporting, E2A, October 27, 1994.
 Cramer JA, Roy A, Burrel A, et al. Medication compliance and persistence: terminology and definitions. Value Health 2008;11:44-47.
 Peterson AM, Takiya L, Finley R. Meta-analysis of trials of interventions to improve medication adherence. Am J Health-Syst Pharm 2003;60:657-665.
 Fleming TR, Harrington DP. Counting Processes and Survival Analysis. Wiley, 2011.
 Cox DR, Oakes D. Analysis of Survival Data. Taylor & Francis, 1984.
 Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Wiley, 2011.