Enabling AI Innovation via Data and Model Sharing
This talk will provide an overview of the NSF Convergence Accelerator program (C-Accel) and of the Accelerator's Track D, Enabling AI Innovation via Data and Model Sharing. The NSF C-Accel is a new use-inspired research program that focuses on transitioning research to practice in national priority areas. Its projects involve deep multidisciplinary collaborations addressing complex societal challenges (i.e., “convergence”); multi-institutional and multi-sector partnerships; and clear deliverables that provide societal impact. The cohort of FY20 C-Accel projects will kick off their Phase I efforts in October 2020.
AI today is characterized by the ubiquitous use of machine learning and deep neural networks across a breathtaking range of applications, powered by the rapid accumulation of huge amounts of heterogeneous data. The ability to easily and effectively share, discover, and reuse these data and data-driven models will drive innovations with, and in, AI. However, current practices are rather ad hoc, often depending on the experience and expertise of individual data scientists who use bespoke methods tied to specific applications and/or application domains. While applications in different domains and disciplines could use the same or slightly modified models and modeling tools, such sharing is limited in practice. Furthermore, modeling results often have poor reproducibility and do not encode the full context, e.g., when and how a model works versus when it may fail. The provenance of data and models should be captured, along with the original intent and context behind the knowledge discovery process. This would improve the transparency of data analysis procedures, workflows, and predictive analytics algorithms for end users.
The NSF C-Accel Track D supports multidisciplinary, use-inspired research projects that provide data and model sharing to enable AI innovation, for both open and sensitive/protected data and models.
Data-centric efforts focus on tools, platforms, and/or protocols for data preprocessing and preparation; making data FAIR (Findable, Accessible, Interoperable, Reusable); developing metadata for provenance and context information; data sanitization and encryption for sensitive data; and more, all in the context of enabling AI innovation with the data. Model-centric efforts focus on annotation and sharing of data-driven models; authoring and publishing lineage and provenance information for models; providing appropriate contextual information to enable reuse; supporting reproducibility; and the like, with specific datasets and application domains as exemplars. The C-Accel Track D also encourages projects to address issues of ethics, fairness, and bias in AI.
Session ID: LG0916
Presentation Type: Live Session (Replay Available)
Date / Time: [Day 3] Wed. Sep. 16, 2020 @ 09:00 ET (US)