From Weeks to Hours: A Case for Automation in Databook Development
Metro SHAPE (Survey of the Health of All the Population and the Environment) 2014 is the latest implementation of a long-standing series of health surveillance surveys that assess the health status of the population living within Hennepin County, Minnesota. As in previous iterations, the survey focused on 10 sub-county geographies to demonstrate differences in health based on where people live. One major change for Metro SHAPE 2014 was that it was extended beyond Hennepin County to five additional counties, including seven local public health agencies, in the Twin Cities metro area. The data dissemination plan for the project involved developing databooks for each participating county, as well as for each sub-county area within Hennepin County. Each databook is a set of 96 tables that break the data down by various demographic characteristics (age, gender, income, race, etc.) and comprises upwards of 25,000 individual data elements. These databooks serve as public health surveillance data reports for each participating public health agency, and are ready reference sources for internal and external stakeholders who are seeking data on particular subjects.
A quality improvement (QI) project was identified during planning for the survey project. The methods previously used to develop databooks relied on outdated technology and were not replicable. In addition, they did not directly generate the Adobe files that were the intended product. With existing processes and technology, the team determined that it would likely take several months of labor to complete the project. The aim of the QI project was to develop and implement an automation system that would decrease the time to complete a databook, increase data accuracy, and facilitate comparability across jurisdictions. In addition, expediting the production of databooks would allow staff to consider additional analyses of the data and produce additional databooks.
Developing databooks using the Metro SHAPE 2014 data is a primary mode of managing population health surveillance for participating health departments. The SHAPE survey collects data from thousands of residents and covers a wide range of topics, including nutrition and physical activity, health care utilization, mental health and social connectedness, disability and quality of life, and many more. The comprehensive databooks serve as resources that let staff and external stakeholders easily access and understand the data, so it is critical that they are produced in a timely and accurate manner.
Because developing the databooks had been labor intensive, the research team initially focused on the single required databook for each local public health jurisdiction. Once automation greatly expedited production and reduced manual entry error, the research team was able to respond to additional requests for databooks.
The basic steps needed to prepare each databook were to run a set of analysis procedures in STATA for the indicators in each of the 96 tables, and then format the results into a visually appealing table in an InDesign document.
The team analyzed the older report-development process using a fishbone diagram to examine the causes of delays in the process. The steps of the original process were: 1) perform analyses using STATA; 2) save output into a text log file; 3) use a routine written in the dBase programming language to extract the relevant results; 4) import those results into Excel; 5) create tables in Excel in the desired format; and 6) copy those tables into InDesign. Contributing causes of delay included the high number of variables and tables in the databooks, the time involved in manual analysis and transfer of results, the complexity of the raw output files, and formatting.
The team researched other methods, consulted with other local public health practitioners, and developed a new system that would address the causes of delay. The new process translated complicated analysis output into user-friendly data tables using software already available to staff. It chained four components: 1) STATA .do syntax files with putexcel commands; 2) Excel macros and formulae; 3) XML structures; and 4) .vbs script files for Adobe InDesign. Together these formed an assembly line that moved information from the STATA analyses into the standard format needed for each of the 96 tables in a databook. The pilot test of the project was the development of the Metro Databook, which provided data for the widest project geography: the six counties that participated in the project.
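The XML step in this assembly line can be illustrated with a minimal sketch. It is written in Python purely for illustration; the team's own pipeline used STATA, Excel macros, and .vbs scripts, and its actual XML structure is not described here. The element names (`Table`, `Title`, `Row`, `Group`, `Percent`, `CI`), the function `rows_to_xml`, and the sample values are all hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table_id, title, rows):
    """Serialize one databook table as an XML string.

    `rows` is a list of dicts with the hypothetical keys
    'group', 'percent', and 'ci'. The element names are
    assumptions, not the structure the team actually used.
    """
    table = ET.Element("Table", attrib={"id": str(table_id)})
    ET.SubElement(table, "Title").text = title
    for row in rows:
        row_el = ET.SubElement(table, "Row")
        ET.SubElement(row_el, "Group").text = row["group"]
        # Format percentages consistently so every table looks alike.
        ET.SubElement(row_el, "Percent").text = f"{row['percent']:.1f}"
        ET.SubElement(row_el, "CI").text = row["ci"]
    return ET.tostring(table, encoding="unicode")


# Made-up illustrative values, not SHAPE survey results.
example_rows = [
    {"group": "Age 18-24", "percent": 12.3, "ci": "10.1-14.5"},
    {"group": "Age 25-34", "percent": 45.6, "ci": "42.0-49.2"},
]
xml_text = rows_to_xml(1, "Example indicator", example_rows)
```

Because every table passes through the same function, formatting decisions such as the number of decimal places are made once rather than repeated by hand in each of the 96 tables.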
The innovation, automating production of the databooks, was enhanced through the use of QI tools. The team leading the SHAPE projects, the Hennepin County Public Health Assessment Team, identified the production time for completing one databook as the benchmark measure for this project. Once the automation tools were developed, the Assessment Team shortened the time needed to create a databook from an average of seven to eight days to one day. In addition to decreasing the time necessary to create databooks, automating the process reduced the errors that manual data entry and translation had introduced.
Because the production of databooks was expedited to this degree, the Assessment Team identified an opportunity to create additional geographic-specific databooks within Hennepin County that aligned with the geography served by the county’s decentralized Human Service Centers. With databooks that represented the residents living in the service delivery area of each Human Service Center, presentations of the data and the discussions that followed were more focused, and human services staff were more engaged. One direct result of these presentations and other work is an exciting expansion of future SHAPE survey projects, in which surveys will be completed with clients receiving services at the Human Service Centers.
Development of the automated process required considerable technical knowledge of advanced features of STATA, Microsoft Excel, and Adobe InDesign, as well as the coding skills to create tables that met the particular requirements of this project. Replication of this process for future SHAPE projects or other projects will require similar levels of technical knowledge and skill to successfully develop and test the process.
The ability to develop databooks targeted to the regions served by the county’s Human Service Centers gave the team a tool for sharing geographic-specific results with our human services colleagues. The human services staff were very engaged in the project and felt the data were useful. This kick-started the idea of collaborating on the next iteration of the survey.
The team learned the utility of developing automated processes. Whenever a process involves steps done repeatedly, automation may be able to help. However, building the automation for the first time will be difficult and time consuming, and automation is not the solution for every problem. It is important to weigh the costs and benefits of developing the code and processes and of testing the automation.
There was a significant learning curve in developing this automation. The technical competency needed to develop and launch automation could be a real barrier.
Each skill learned through this automation translated into other aspects of work. As an example, the skills gained in learning putexcel were put to use in a mapping application for the same project. This created opportunities for cross-training of staff and benefited other projects.
The QI project provided opportunities to continue quality improvement efforts on this project. The research team added an additional measure of reliability, relative standard error (RSE), midway through reporting. This added reliability measure improved the quality of the databooks, but without the automated process it would have delayed their release, potentially by several months. One key element of the automation design was to develop code that read parameters rather than relying on hard-coded values, which provides more flexibility for changes later in the process.
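The two ideas in this paragraph, the RSE measure and parameter-driven code, can be sketched together. RSE is conventionally the standard error expressed as a percentage of the estimate, RSE = 100 × SE / estimate. The sketch below is in Python for illustration only and is not the team's code; the `TableSpec` fields, the function names, and the 30 percent flagging threshold are hypothetical assumptions, since the threshold the team used is not stated.

```python
from dataclasses import dataclass


@dataclass
class TableSpec:
    """Parameters for one table, read by the code instead of hard-coded."""
    table_id: int
    indicator: str
    breakdown: str
    rse_flag_threshold: float = 30.0  # hypothetical cutoff, in percent


def relative_standard_error(estimate, standard_error):
    """Return RSE as a percentage, or None when the estimate is zero."""
    if estimate == 0:
        return None
    return 100.0 * standard_error / abs(estimate)


def build_cell(spec, estimate, standard_error):
    """Assemble one table cell, flagging estimates with a high RSE."""
    rse = relative_standard_error(estimate, standard_error)
    unreliable = rse is None or rse > spec.rse_flag_threshold
    return {
        "table_id": spec.table_id,
        "estimate": estimate,
        "rse": rse,
        "flag": unreliable,
    }


# Hypothetical specs; adding a table means adding a row here, not new code.
SPECS = [
    TableSpec(1, "hypothetical_indicator_a", "age_group"),
    TableSpec(2, "hypothetical_indicator_b", "income", rse_flag_threshold=25.0),
]

# Made-up numbers for illustration, not survey results.
cells = [build_cell(spec, 10.0, 4.0) for spec in SPECS]
```

With this design, adding a measure such as RSE midway through reporting touches one function and one parameter, and every table picks up the change on the next run.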
Automation requires technical skills and programming that should not reside with one person. It was necessary to cross-train and ensure systems were well documented. Assessment Team staff worked to train across systems so multiple staff could understand the process and make corrections as needed. In addition, staff have used components of this automated process in other projects and applications, furthering our understanding of this ability to automate.
This process was developed using out-of-the-box features of software that was already available (Adobe InDesign, STATA, and Microsoft products), along with XML standards and general-purpose coding languages. There were no added software costs.
Assessment Team staff documented and shared this process at a training with others in Hennepin County’s Research and Evaluation Community of Practice, a collaboration of many of the research staff within the county.