Question

How can new machine learning capabilities enable the mining of stock documents for financial data?

Answer

One of the exciting new frontiers of machine learning and AI is the use of entirely new types of resources to predict stock movement and investment outcomes. Scientists and engineers are exploring a variety of ways to do this, and the results could significantly change how investment strategies are built in the financial world.

One of the basic ideas behind this type of stock research is computational linguistics, which involves the modeling of natural language. Experts are investigating how to use text documents, from SEC filings to shareholder letters to other peripheral text-based resources, in order to augment or fine-tune stock analysis or to develop entirely new analyses.


The important caveat is that all of this has only become feasible through recent advances in neural networks, machine learning and natural language analysis. Prior to the advent of ML/AI, computing technologies mostly relied on conventional, explicitly coded logic to "read" inputs, and text documents were too unstructured to be useful. But with the progress made in natural language analysis within the last few years, scientists are finding that it is possible to "mine" natural language for quantifiable results or, in other words, results that can be computed in some way.
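As a rough illustration of what turning language into a "quantifiable result" can mean, the sketch below scores a sentence by counting positive and negative words. It is a deliberately simple dictionary-based approach, not one of the neural methods described above, and the word lists and the `tone_score` function are hypothetical names invented for this example.

```python
import re

# Hypothetical toy word lists, for illustration only. Real studies use
# much larger, finance-specific dictionaries or trained models.
POSITIVE = {"growth", "profit", "improved", "strong", "gain"}
NEGATIVE = {"loss", "decline", "risk", "weak", "impairment"}


def tone_score(text: str) -> float:
    """Return (positive count - negative count) / total words.

    Empty text scores 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


# 4 positive words, 1 negative word, 9 words in total.
sample = "Profit improved on strong growth despite a small loss."
print(tone_score(sample))
```

A score like this is a single number per document, which is what makes it usable as an input alongside conventional financial data.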

Some of the best evidence and most useful examples of this come from various dissertations and doctoral work available on the web. In a paper, "Applications of Machine Learning and Computational Linguistics in Financial Economics," published April 2016, Lili Gao capably explains relevant processes specific to the mining of corporate SEC filings, shareholder calls, and social media messages.

"Extracting meaningful signals from unstructured and high dimensional text data is not an easy task," Gao writes. "However, with the development of machine learning and computational linguistic techniques, processing and statistically analyzing textual documents tasks can be accomplished, and many applications of statistical text analysis in social sciences have proven to be successful." The abstract goes on to discuss modeling and calibration, and the full document shows in detail how this type of analysis works.

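To see why text counts as "high dimensional," it can help to look at a document-term matrix, in which every distinct word becomes its own column. The following is a minimal sketch under that assumption; it is not taken from Gao's paper, and `document_term_matrix` is a hypothetical helper written for this example.

```python
import re
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, rows), where each row holds word counts for one document."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    # One column per distinct word across all documents, sorted alphabetically.
    vocab = sorted({w for toks in tokenized for w in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[w] for w in vocab])
    return vocab, rows


docs = ["Sales rose and margins rose", "Sales fell"]
vocab, matrix = document_term_matrix(docs)
print(vocab)
for row in matrix:
    print(row)
```

Even these two short sentences produce five columns. A real corpus of filings would produce many thousands, which is the scale problem the quotation refers to.
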
Other sources for active projects include a GitHub project brief and an IEEE resource that deals specifically with getting valuable financial information from "Twitter sentiment analysis."

The bottom line is that the use of these new NLP models is driving quick innovation in using all sorts of text documents, not just for financial analysis, but for other kinds of cutting-edge discovery, blurring that traditionally established line between "language" and "data."

Justin Stoltzfus
Contributor

Justin Stoltzfus is an independent blogger and business consultant assisting a range of businesses in developing media solutions for new campaigns and ongoing operations. He is a graduate of James Madison University. Stoltzfus spent several years as a staffer at the Intelligencer Journal in Lancaster, Penn., before the merger of the city’s two daily newspapers in 2007. He also reported for the twin weekly newspapers in the area, the Ephrata Review and the Lititz Record. More recently, he has cultivated connections with various companies as an independent consultant, writer and trainer, collecting bylines in print and Web publications, and establishing a reputation…