Please join us for our next KRYPTON seminar of the Fall 2021 semester. We will meet at the usual time, 1600h, on Friday, October 15, 2021.
We will continue to offer a virtual option, but will also meet in person: I will be in Room 2241 ENGR for anyone who wants to attend in person.
You can check Krypton events through our calendar at:
https:[log in to unmask]&ctz=America/New_York&pli=1
**********************************************
Krypton Seminar Series - Fall 2021
Date: October 15, 2021
Time: 4:00PM - 5:30PM
Venue: Zoom and Room 2241 ENGR
Link for remote participation: https://gmu.zoom.us/j/92356938266
***********************************************
James Lee
Ph.D. candidate in Systems Engineering and Operations Research, GMU
***********************************************
Title: A Systems Engineering Approach to Data Science Project Knowledge Management
Abstract: Data scientists currently rely on manual processes in which either a new solution is created from scratch or previous solutions are searched for and modified to fit new problems. Searching for candidates, selecting among them, and evaluating whether each is an appropriate solution is often time-consuming.
Although data science is a relatively new term, it is an interdisciplinary field that spans many mature areas such as database management, statistics, and computer/software engineering. The trend is for data scientists to work mainly in Python or R, but there exist many high-quality scripts written in other programming languages such as SQL, Java, and JavaScript, each of which demonstrates its own strengths in its respective field. It would be ideal to take these existing scripts and knowledge and stitch them together to solve new problems, but interoperability is a challenge. Finally, software development projects face traceability challenges: the actual implementation, which happens in an environment external to the design documentation repository, i.e., a systems engineering modeling tool, is seldom linked back to the design repository, leaving a broken link between the design and the implementation of the product.
The objective of this work is to improve the initial solution search and selection process for data science projects, enable interoperability and reuse of existing solutions from different disciplines in a single integrated workflow, and advance traceability between the design and implementation of data science projects. I propose a method for documenting software solutions in a central repository. The repository provides efficient and effective prototyping of solutions for problems that have similar characteristics to previously developed solutions, and it allows modules written in different programming languages to be executed in a single workflow.