Statistics, computer science, and nursing work together to analyze data and inform patient care.
- The many devices we use in everyday practice can be mined for data to create new knowledge and improve patient care.
- Creating and manipulating Excel spreadsheets is not big data.
- Informatics nurses, statisticians, and computer scientists work as a team to turn big data into useful information that can be translated into nursing practice.
By Melinda Higgins, PhD; Roy L. Simpson, DNP, RN, FAAN, DPNAP, FACMI; William Gregory Johnson
You’ve probably been hearing a lot in the news lately about big data. But what is it? And what does it mean for nurses? How does it affect our lives and our patients’ lives? How can it help us develop new knowledge for our profession?
All the devices we use professionally and personally have one thing in common—they produce data that can be mined; in other words, “big data.” Most of us didn’t encounter big data in our nursing curriculum, but we need to know about it now. We can use big data to advance knowledge through technology and innovation, and it can have a significant impact on our practice.
This article reviews the interdisciplinary team required for big data analytics, along with some processes and suggestions for using big data in your practice.
Big data = better patient care
In their landmark 2015 article, Brennan and Bakken aptly stated, “Nursing needs big data and big data needs nursing.” The authors noted that big data arises out of scholarly inquiry, which can occur through everyday observations using tools such as smartwatches with physical fitness programs, cardiac devices like ECGs, and Twitter and Facebook accounts. When the information in these devices and programs is mined, it can be analyzed to help create new knowledge and improve patient care.
To make all of this happen, we need centers that advance big data science, such as the one housed at Emory University’s Nell Hodgson Woodruff School of Nursing. Emory’s center established three components of big data for advancing nursing science: a data dictionary of research and activities related to research and its funding, an educational database of over 1 million patients seen at Emory University Healthcare, and detailed biological measurements for advancing precision healthcare and nursing research.
The success of the center depends on team members from three disciplines: statistics, computer science, and professional nursing. Each team member is prepared at the doctoral level.
The statistician aligns correct statistical analysis approaches to address each research question, defines proper statistical methods, and coordinates with the computer scientist and RN to understand the data structure and format. Then the interdisciplinary team works to understand the underlying system that generates the data (for example, the electronic health record [EHR]) and accurately identifies the outcomes of interest (for example, readmissions within 30 days) that need to be estimated and evaluated.
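To make an outcome such as “readmissions within 30 days” concrete, here is a minimal sketch in Python. The admission records, the tuple layout, and the function name `flag_30_day_readmissions` are hypothetical and for illustration only. A real EHR extract would need the team to agree on details such as transfers, planned readmissions, and which stays count as index admissions.

```python
from datetime import date

# Hypothetical admissions: (patient_id, admit_date, discharge_date)
admissions = [
    ("A", date(2023, 1, 1), date(2023, 1, 5)),
    ("A", date(2023, 1, 20), date(2023, 1, 22)),  # 15 days after prior discharge
    ("B", date(2023, 2, 1), date(2023, 2, 3)),
    ("B", date(2023, 4, 1), date(2023, 4, 2)),    # 57 days after prior discharge
]

def flag_30_day_readmissions(rows, window_days=30):
    """Flag each stay as True if the same patient is admitted again
    within window_days of that stay's discharge date."""
    by_patient = {}
    for pid, admit, discharge in sorted(rows, key=lambda r: (r[0], r[1])):
        by_patient.setdefault(pid, []).append((admit, discharge))

    flags = []
    for pid, stays in by_patient.items():
        for i, (admit, discharge) in enumerate(stays):
            next_admit = stays[i + 1][0] if i + 1 < len(stays) else None
            readmitted = (
                next_admit is not None
                and 0 <= (next_admit - discharge).days <= window_days
            )
            flags.append((pid, discharge, readmitted))
    return flags

flags = flag_30_day_readmissions(admissions)
readmission_rate = sum(1 for f in flags if f[2]) / len(flags)
```

In this toy data set, only patient A's first stay is followed by a readmission inside the window, so one of four stays is flagged.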
When working with big data, you may encounter issues related to software and computing resources, data formatting, and data choice. Many statistical software packages run into problems when the organization’s computer system doesn’t have enough memory to hold the data being analyzed. However, rapid developments in this area have produced new methods to manage these situations. The statistician must work closely with the team to ensure that the computing resources are adequate to handle all of the data sources for storage, preprocessing, and cleaning.
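One common way to work around memory limits is to read a file one row at a time instead of loading it all at once. The sketch below assumes a hypothetical CSV export with a `heart_rate` column. An in-memory string stands in for the large file so the example runs on its own. It illustrates the general idea, not a method the authors prescribe.

```python
import csv
import io

# Stand-in for a large export file; the column names are hypothetical
fake_file = io.StringIO("patient_id,heart_rate\n1,72\n2,88\n3,\n4,95\n")

def streaming_mean(handle, column):
    """Compute the mean of one column while holding only one row in memory."""
    count, total = 0, 0.0
    for row in csv.DictReader(handle):
        value = row[column]
        if value == "":
            continue  # skip missing values rather than guessing at them
        count += 1
        total += float(value)
    mean = total / count if count else None
    return mean, count

mean_hr, n_values = streaming_mean(fake_file, "heart_rate")
```

Because only running totals are kept, the same function works whether the file has four rows or four hundred million.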
In addition to software and computing resource issues, data may be unstructured (not always numeric and not in a rectangular “spreadsheet” format) and messy (missing data, outliers, mixed data types). This is where the RN aligns the nomenclatures and taxonomies of practice to the data, building on the work of the American Nurses Association (ANA) database committee, which identified nursing vocabulary in the late 1980s and early 1990s. The team works to help with all preprocessing steps to understand data quality and limitation issues that affect the final analyses, modeling, and subsequent inferences to be drawn.
Keep in mind that not all data should be included in the analyses. Instead, data sources and amounts should be purposefully sampled with as much care and consideration as enrolling a sample from a larger population of interest for a clinical trial.
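Stratified random sampling is one possible way to sample data purposefully. The sketch below draws the same number of records from each nursing unit so that a small unit isn't swamped by a large one. The record layout, the unit names, and the function `stratified_sample` are assumptions made for illustration. The right sampling design depends on the research question.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to per_stratum records at random from each group defined by key."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    sample = []
    for name in sorted(groups):
        members = groups[name]
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical records: 20 from the ICU and 80 from a med-surg unit
records = [
    {"id": i, "unit": "ICU" if i % 5 == 0 else "med-surg"} for i in range(100)
]
sample = stratified_sample(records, "unit", per_stratum=10)
```

Here the sample contains 10 ICU records and 10 med-surg records, even though the ICU contributes only a fifth of the original data.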
Data from the hundreds of devices that nurses use enable the transformation of information into actionable knowledge at the bedside. However, health data flow into EHR repositories with varying characteristics (such as format and volume), which can overwhelm a computer system.
A computer scientist focused on data science has the skills and understanding to tame the volume and veracity issues (bias, noise, uncertainty) of this information and turn it into quality assets that help nurses deliver targeted care. Advances in computer processors and algorithms also enable mining of data generated from health devices worn by patients. Data from these devices are transmitted over the internet so nurses can interpret the information to create actionable outcomes.
The team informatics RN understands the foundations of nursing, has patient care knowledge, and uses data to inform nursing practice. In addition, the RN understands practice theory and how to implement it at the bedside within the workflow and context of the organization through the lens of a nurse. He or she also understands the independent and dependent variables of the practice; the alignment of legal, ethical, and regulatory requirements (for example, privacy regulations and institutional review board requirements); and the criteria for research versus quality and safety analytics.
In addition to identifying and interpreting important data sources through the lens of a clinician, the RN understands the science behind the nursing process, how the life cycle of data affects nursing practice, and how feedback loops for quality and practice can be developed based on evidence. The RN evaluates the unintended and intended biases of the process and helps integrate the ANA Code of Ethics for Nurses with Interpretive Statements and the patient’s bill of rights into the context of the information age.
From data to knowledge
Big data lets us analyze vast numbers of data elements. For example, when all of the data in the EHR are processed, they’re cleaned so that missing values are identified, unrealistic or meaningless data points are removed, and redundant and conflicting data are eliminated. This is where computer scientists and statisticians come in.
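The cleaning steps just described can be sketched in a few lines of Python. The rows, the field names, and the plausibility limits below are hypothetical and are not clinical guidance. The sketch counts missing values, drops implausible ones, and removes exact duplicates. It doesn't try to resolve conflicting entries, which usually takes clinical judgment.

```python
# Hypothetical vital-sign rows; None marks a missing value
rows = [
    {"patient_id": 1, "time": "08:00", "heart_rate": 72},
    {"patient_id": 1, "time": "08:00", "heart_rate": 72},    # exact duplicate
    {"patient_id": 2, "time": "08:05", "heart_rate": None},  # missing
    {"patient_id": 3, "time": "08:10", "heart_rate": 900},   # implausible
    {"patient_id": 4, "time": "08:15", "heart_rate": 64},
]

# Assumed limits for illustration only, not clinical guidance
HR_MIN, HR_MAX = 20, 300

def clean(rows):
    """Return the kept rows plus a count of each problem found."""
    seen = set()
    kept = []
    report = {"duplicate": 0, "missing": 0, "implausible": 0}
    for row in rows:
        key = (row["patient_id"], row["time"], row["heart_rate"])
        if key in seen:
            report["duplicate"] += 1
            continue
        seen.add(key)

        hr = row["heart_rate"]
        if hr is None:
            report["missing"] += 1
            kept.append(row)  # keep the row so missingness can be handled later
            continue
        if not HR_MIN <= hr <= HR_MAX:
            report["implausible"] += 1
            continue
        kept.append(row)
    return kept, report

kept, report = clean(rows)
```

Keeping a report of what was flagged or dropped lets the whole team review the cleaning decisions before any analysis begins.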
To process the data and transform them into a format for meaningful analysis, the data need to be smoothed, aggregated, normalized, and discretized. This work also requires clustering and binning, histogram analysis, and hierarchy evaluation. In other words, we need knowledge from disciplines outside of nursing. Don’t let anyone tell you that creating an Excel spreadsheet is big data. Similarly, manipulating a spreadsheet isn’t even close to what’s needed to interpret big data. Interpreting big data requires computer coding and statistical programming skills.
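Three of these transformations (smoothing, normalizing, and discretizing) can be sketched as follows. The heart rate values, the window size, and the cut points of 60 and 100 are assumptions chosen only to make the example run. Each function shows just one of several ways to carry out its step.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each value with the values just before it."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def min_max_normalize(values):
    """Rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def discretize(values, edges, labels):
    """Assign each value the label of the first bin whose upper edge exceeds it."""
    result = []
    for v in values:
        for edge, label in zip(edges, labels):
            if v < edge:
                result.append(label)
                break
        else:
            result.append(labels[-1])  # at or above the last edge
    return result

heart_rates = [60, 62, 120, 64, 66]  # hypothetical readings with one spike
smoothed = moving_average(heart_rates)
scaled = min_max_normalize(heart_rates)
bins = discretize(heart_rates, edges=[60, 100], labels=["low", "mid", "high"])
```

Smoothing dampens the single spike, normalizing puts the readings on a common scale, and discretizing turns numbers into categories that are easier to count and compare.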
After processing, we begin mining the data for new knowledge, so we can illuminate nursing phenomena. We want to shine a light on what we do and how we make a difference in patients’ lives. For example, in the 1980s, ANA made what was an audacious statement for the times: Every patient needs a nurse. Aiken’s research showing the impact of RN staffing on patient morbidity and mortality followed. Her work illuminated the need for nursing practice at a professional level of advanced knowledge to avoid costly complications or even death. Our work matters, and to show that it matters and advance the profession, we need big data.
Melinda Higgins is an associate research professor and senior biostatistician at Emory University Nell Hodgson Woodruff School of Nursing in Atlanta, Georgia. Roy L. Simpson is a professor and assistant dean of technology at Emory University Nell Hodgson Woodruff School of Nursing. William Gregory Johnson is a brain and behavior neuroscience fellow at Georgia State University in Atlanta.
Brennan PF, Bakken S. Nursing needs big data and big data needs nursing. J Nurs Scholarsh. 2015;47(5):477-84.
Emerson JW, Kane MJ. Don’t drown in the data. Signif (Oxf). 2012;9(4):38-9.
Hansell PS. Advances in nursing research methodology: Big data analytics the future. Int J Nurs Clin Pract. 2017;4:220-2.
Kannry J, Sengstack P, Thyvalikakath TP, et al. The chief clinical informatics officer (CCIO): AMIA Task Force report on CCIO knowledge, education, and skillset requirements. Appl Clin Inform. 2016;7(1):143-76.
McCormick KA, Lang N, Zielstorff R, Milholland DK, Saba V, Jacox A. Toward standard classification schemes for nursing language: Recommendations of the American Nurses Association steering committee on databases to support clinical nursing practice. J Am Med Inform Assoc. 1994;1(6):421-7.
Rose S. Big data and the future. Signif (Oxf). 2012;9(4):47-8.
Wlodarczak P, Ally M, Soar J. Data mining in IoT: Data analysis for a new paradigm on the internet. In: Proceedings of the International Conference on Web Intelligence; 2017.