Why is it important for data scientists to seek transparency?


Transparency is especially important in data science projects and machine learning programs, partly because of the complexity and sophistication that drive them. These programs are “learning” (generating probabilistic results) rather than following predetermined, step-by-step instructions, and as a result it can be hard to understand how the technology reaches its conclusions. The “black box” problem, in which machine learning algorithms are not fully explainable to human decision-makers, is a major one in this field.

With that in mind, mastery of explainable machine learning, or “explainable AI,” will likely be a main focus when companies recruit data scientists. Already DARPA, the agency whose research led to the internet, is funding a multimillion-dollar program in explainable AI, trying to promote the skills and resources needed to create machine learning and artificial intelligence technologies that are transparent to humans.

One way to think about it is that there is often a “literacy stage” of talent development and a “hyperliteracy stage.” For a data scientist, the traditional literacy stage would be knowledge of how to put together machine learning programs, how to build algorithms with languages like Python, and how to construct neural networks and work with them. The hyperliteracy stage would be the ability to master explainable AI: to provide transparency in the use of machine learning algorithms, and to preserve that transparency as these programs work toward their goals and the goals of their handlers.
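As a rough illustration of what this kind of transparency work can look like in Python, the sketch below probes an opaque model with permutation importance: it shuffles one input column at a time and measures how much accuracy drops. This is only one of many explainability techniques, not a method described in this article, and the model, feature names and data are all hypothetical stand-ins invented for the example.

```python
import random


def black_box_model(row):
    # Hypothetical stand-in for an opaque model: it approves (1) when
    # income exceeds 50 and silently ignores shoe size.
    income, shoe_size = row
    return 1 if income > 50 else 0


def accuracy(model, rows, labels):
    # Fraction of rows where the model's prediction matches the label.
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=20, seed=0):
    # For each feature, shuffle its column and record the average drop in
    # accuracy. A larger drop suggests the model relies more on that feature.
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [
                row[:j] + (value,) + row[j + 1:]
                for row, value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Made-up data: (income, shoe_size) pairs, labeled by the model itself so
# that baseline accuracy is 1.0.
rows = [(20, 7), (35, 11), (45, 9), (48, 12), (55, 8), (60, 10), (75, 7), (90, 11)]
labels = [black_box_model(r) for r in rows]

scores = permutation_importance(black_box_model, rows, labels, n_features=2)
for name, score in zip(["income", "shoe_size"], scores):
    print(f"{name}: {score:.3f}")
```

Shuffling a feature the model ignores leaves accuracy unchanged, so its score stays at zero, while shuffling a feature the model depends on lowers accuracy. A result like this gives a human reviewer a first hint about what is driving the model's decisions without opening the model itself.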

Another way to explain the importance of transparency in data science is that the data sets being used keep becoming more sophisticated, and therefore potentially more intrusive into people’s lives. A further driver of explainable machine learning and data science is the European Union’s General Data Protection Regulation (GDPR), which took effect in May 2018 to try to curb unethical use of personal data. Using the GDPR as a test case, experts can see how the need to explain data science projects fits into privacy and security concerns, as well as business ethics.

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University. Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record. More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…

