Over the last several decades, computer science has become increasingly prominent in finance and economics. Leading financial institutions now depend on computers for much of their analysis. For instance, firms like JPMorgan Chase and Barclays utilize supercomputers to assess the risk of stock portfolios or forecast the future value of assets. How is this accomplished?
In this article, I will examine how computers assess sentiment and apply machine learning to enhance investment returns.
## Sentiment Analysis
The investment landscape has transformed from relying on intuition to placing trust in computer programs. The era of traders engaging in frantic buying and selling on trading floors is fading. These floor traders are being replaced by quantitative analysts, commonly referred to as quants. Quants create computer algorithms that focus heavily on mathematical principles to perform trades and predict a stock’s future performance.
Acknowledging the market’s evolution, Dow Jones has established a lexicon that enables computers to analyze the potential movements of a stock effortlessly. Another term for lexicon is dictionary. The Dow Jones Lexicon (DJL) compiles financial news and translates it into a language comprehensible to computers. Additionally, the DJL comprises six distinct dictionaries, all crafted by Bill McDonald, a finance professor at the University of Notre Dame. Nonetheless, financial institutions have the flexibility to develop their own dictionaries tailored to their particular requirements.
With a lexicon, a computer can distinguish between positive and negative news. This insight then guides traders on whether it is prudent to execute a trade for that stock.
Let us evaluate a news article to understand how the sentiment lexicon operates. This piece is from CNBC and was published on November 26, 2021, shortly after scientists in South Africa identified a new variant of COVID.
The introductory line of the article includes the word “dropped.” A computer examining this text snippet would classify it in the negative category and lower the sentiment score, suggesting a bearish stock market. However, the location of the word also holds significance. Keywords found in headlines or the opening lines carry more impact on the sentiment score than if they were deep within the text.
In the accompanying screenshot, terms like “down,” “dropped,” “lost,” and “fell” are evident. These, again, are interpreted negatively by the computer, resulting in a reduced sentiment score.
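To make the scoring idea concrete, here is a minimal sketch of a lexicon-based score with position weighting. It is not the Dow Jones Lexicon or any firm's actual algorithm. The negative words come from the example above, while the positive words, the weights, the function names, and the sample text are all assumptions chosen for illustration.

```python
import re

# Hypothetical mini-lexicon: the negative words come from the article's
# example; the positive words and all weights are illustrative assumptions.
NEGATIVE = {"down", "dropped", "lost", "fell"}
POSITIVE = {"up", "rose", "gained", "rallied"}

HEADLINE_WEIGHT = 2.0  # assumed: a headline word counts double
BODY_WEIGHT = 1.0      # assumed: a body word counts once


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def score_text(text, weight):
    """Add `weight` for each positive word and subtract it for each negative one."""
    score = 0.0
    for word in tokenize(text):
        if word in POSITIVE:
            score += weight
        elif word in NEGATIVE:
            score -= weight
    return score


def sentiment_score(headline, body):
    """Combine headline and body scores; a negative total reads as bearish."""
    return score_text(headline, HEADLINE_WEIGHT) + score_text(body, BODY_WEIGHT)


# Made-up text for illustration, not a quotation from the CNBC article.
headline = "Stocks dropped as a new variant emerged"
body = "The index fell sharply and travel shares lost ground."
print(sentiment_score(headline, body))
```

In this sketch, "dropped" in the headline moves the score twice as far as "fell" or "lost" in the body, which mirrors the point that where a keyword sits affects how much it counts.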
So, what does this process look like from the computer’s perspective? It turns out to be relatively straightforward. Developing a sentiment analysis program can involve XML. XML, or Extensible Markup Language, is closely related to HTML but, like HTML, it is a markup language rather than a programming language. HTML specifies the structure and layout of a document, while XML describes the data within that document. For instance, in HTML, a programmer inserts predefined tags to create certain elements. The `<h1>` tag, for example, indicates the primary heading of a webpage (h for heading; 1 for the first heading). Conversely, XML enables a programmer to customize their code further. Instead of merely using a generic `<h1>` tag, XML allows a programmer to give the tag a descriptive name of their own to clarify its purpose. This clearer definition makes the code easier to interpret and lets it match a business’s requirements.
Note: the following code was compiled by sqlauthority.com
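```xml
<SampleXML>
  <Colors>
    <Color1>White</Color1>
    <Color2>Blue</Color2>
    <Color3>Black</Color3>
    <Color4>Green</Color4>
    <Color5>Red</Color5>
  </Colors>
  <Fruits>
    <Fruits1>Apple</Fruits1>
    <Fruits2>Pineapple</Fruits2>
    <Fruits3>Grapes</Fruits3>
    <Fruits4>Melon</Fruits4>
  </Fruits>
</SampleXML>
```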
In the example depicted, quants or traders would adhere to a similar format, altering the tags to align with their specific sentiment analysis algorithm.
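As a rough illustration of that idea, the sketch below stores a tiny sentiment lexicon as XML and reads it with Python's standard `xml.etree.ElementTree` module. The tag names (`SentimentLexicon`, `Negative`, `Positive`, `Word`), the `weight` attribute, and the word list are all hypothetical; a real firm would define its own schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon contents; tag and attribute names are assumptions,
# not the format used by Dow Jones or any particular firm.
LEXICON_XML = """
<SentimentLexicon>
  <Negative>
    <Word weight="1.0">dropped</Word>
    <Word weight="1.0">fell</Word>
    <Word weight="1.0">lost</Word>
  </Negative>
  <Positive>
    <Word weight="1.0">gained</Word>
    <Word weight="1.0">rose</Word>
  </Positive>
</SentimentLexicon>
"""


def load_lexicon(xml_text):
    """Return a dict mapping each word to a signed weight."""
    root = ET.fromstring(xml_text.strip())
    lexicon = {}
    # Words under <Negative> get a negative sign, words under <Positive> a positive one.
    for category, sign in (("Negative", -1.0), ("Positive", 1.0)):
        for node in root.findall(f"{category}/Word"):
            word = node.text.strip().lower()
            lexicon[word] = sign * float(node.get("weight", "1.0"))
    return lexicon


lexicon = load_lexicon(LEXICON_XML)
print(lexicon["dropped"])
print(lexicon["rose"])
```

Keeping the word list in XML rather than in the program itself means analysts could add or reweight terms without touching the scoring code.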
Sentiment analysis merely scratches the surface of the capabilities computers possess on Wall Street. While I won’t cover every aspect, I will proceed with the article by discussing the role of Machine Learning in predicting market movements.
## Machine Learning
Machine Learning (ML) is a subset of Artificial Intelligence (AI) in which a system learns patterns from data so that it can make informed decisions, much as Siri does. In this scenario, the system is a trading algorithm rather than Siri. For additional insights on ML and AI, refer to a prior post I authored.
To optimize the likelihood of success, the computer must undergo a training phase. Four characteristics of a stock are used during training: its opening price, daily high, daily low, and trading volume. The following data pertains to Microsoft stock.
| Date | Open