Rally Process

COLLABORATIVE DATA SCIENCE RALLIES

A key part of the discovery phase of Ki involves data science rallies, which are efficient, sprint-like efforts adapted from the Agile software development method. This approach to data science is producing actionable answers to specific questions about child growth and development. Ki models are used during rally sprints, which usually run for 2 weeks and address a specific question, hypothesis, and deliverable. A rally succeeds by teaming data scientists with domain experts, who stay in continual communication throughout the course of the rally.

The overarching process involves using new modeling and data visualization methods to understand and analyze data from the knowledge base, and provide insights about child health assessment and interventions. Rally members begin each sprint with a planning session to determine sprint objectives. Throughout the rally, team members review the analysis and modeling methods and results, and report on progress and impediments.

When the rally is completed, team members report the outcomes and review the experience to help improve future rallies and other projects. Rally reports are submitted, and next steps are recommended.

Rallies are a cyclical process and start with a question


1. Rally Question(s): First, a domain expert forms the rally questions from the prioritized list of HBGDki questions.
2. Team Forms: Around the same time, a team is assembled and data sets are selected. The Global Health Analytics Platform (GHAP), which is the HBGDki data repository, contains over 170M observations from 174 studies.
3. Modeling Technique & Data Sets Selected: A preliminary analysis plan is established, including proposed statistical modeling techniques.
4. Sprint Begins: Each rally (represented by a number, e.g., Rally 1) is made up of a series of sprints (represented by a letter, e.g., Rally 1A), each of which lasts about 2 to 4 weeks and takes place on a cloud-based collaborative platform.
5. Daily Stand-Up: Every day, members report on the rally's progress and any impediments.
6. Analyze & Discuss: During this time, data scientists analyze the data (using GHAP), and intermediate results are discussed with domain experts. The interaction between data scientists and domain experts drives iteration toward answering the rally questions. As results are validated, rally members document the methods, results, and key findings in a rally template on Open Science Framework.
7. Present Outcomes: Outcomes of the rally are then summarized and presented.
8. Review: Members reflect and review to inform future rallies. Following this point, there are three paths to go down: continue with another iteration of the same rally; start a new, different rally; or end the rally when all the questions have been answered.
9. Next Steps A, B: Another sprint is run if the rally question has not been fully answered (A); if the rally question has been answered, the rally is complete and a new rally can begin (B).
10. Deliverables & Outcomes A, B: Outputs of a rally can include reusable assets such as tools, models, and methods (A), and/or artifacts for external impact such as apps/tools for the field, and/or insights to inform policies or investment decision making (B).
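The sprint naming and the "run another sprint until the question is answered" loop described in steps 4, 8, and 9 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the Ki tooling: the function names `sprint_label` and `run_rally`, the `question_answered` callback, and the cap of 26 sprints are all assumptions made for the example.

```python
from string import ascii_uppercase
from typing import Callable, List


def sprint_label(rally_number: int, sprint_index: int) -> str:
    """Label a sprint with the rally number plus a letter (Rally 1A, Rally 1B, ...)."""
    if not 0 <= sprint_index < len(ascii_uppercase):
        raise ValueError("this sketch only labels sprints A through Z")
    return f"Rally {rally_number}{ascii_uppercase[sprint_index]}"


def run_rally(
    rally_number: int,
    question_answered: Callable[[int], bool],
    max_sprints: int = 26,
) -> List[str]:
    """Run sprints until the review step reports that the question is answered.

    question_answered is a hypothetical stand-in for step 8 (Review): it takes
    the zero-based sprint index and returns True when the rally question is
    fully answered (path B), or False when another sprint is needed (path A).
    """
    completed = []
    for sprint_index in range(max_sprints):
        completed.append(sprint_label(rally_number, sprint_index))
        if question_answered(sprint_index):
            break  # path B: the rally is complete and a new rally can begin
    return completed


# Example: the question is answered at the end of the third sprint.
sprints = run_rally(1, lambda i: i >= 2)
print(sprints)  # ['Rally 1A', 'Rally 1B', 'Rally 1C']
```

In practice the decision to continue is made by the team during the review step; the callback only marks where that decision enters the cycle.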
[Figure: HBGDki collaborative team science overview. An approach based on Agile software principles that produces actionable answers to HBGD questions. Diagram labels: Data Scientists, Rally Master, Domain Experts, Ki Partner, platform; rally process steps 1 to 8; check off question on to-do list; generate new questions; 9A. Questions not fully answered: run another sprint; 9B. Questions answered: run a new rally; 10A. Reusable assets (e.g., datasets, tools, methods, and/or models); 10B. Impact (e.g., app/tool for the field, policy, and/or investment decision making).]