Over the last several decades, computer science has become increasingly prevalent in finance and economics. Leading financial institutions depend more and more on computers to produce results. For instance, JPMorgan Chase and Barclays rely on supercomputers to assess the risk of a stock portfolio or forecast an asset’s future value. How is this accomplished?
In this article, I will delve into how computers assess sentiment and employ machine learning techniques to optimize investment returns.
## Sentiment Analysis
The investment landscape has transitioned from relying on gut feelings to placing confidence in computer programs. The era of traders hastily buying and selling stocks on the trading floor is diminishing. Floor traders are being replaced by quantitative analysts, often referred to as quants. Quants create computer algorithms with a strong focus on mathematics to execute trades and predict the future performance of stocks.
Recognizing the changes in the market, Dow Jones has developed a lexicon (another term for a dictionary) that enables computers to analyze potential stock movements. The Dow Jones Lexicon (DJL) gathers financial news and translates it into a format that computers can process. The DJL includes six distinct dictionaries, all crafted by Bill McDonald, a finance professor at the University of Notre Dame. Financial firms can also create customized dictionaries to meet their specific requirements.
When a computer uses a lexicon, it can differentiate between positive and negative news. This helps traders decide whether it would be prudent to trade the stock in question.
Let’s evaluate a news article to see how the sentiment lexicon operates. This article is from CNBC and was published on November 26, 2021, shortly after scientists in South Africa identified a new COVID variant.
This is the introductory line of the article. You will observe that one of the initial words is “dropped.” A computer examining this text fragment would classify the word as negative and lower the sentiment score, signaling bearish sentiment toward the stock market. The position of the word also matters. Words in the headline or the opening lines carry more weight in the sentiment score than words located deeper in the article.
In this image, you’ll notice words such as “down,” “dropped,” “lost,” and “fell.” Again, these terms are interpreted by the computer as negative, thereby reducing the sentiment score.
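The scoring idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the Dow Jones Lexicon itself: the word lists, the weights, the function names, and the sample text are all assumptions made up for the example.

```python
import re

# Hypothetical mini-lexicon; a production lexicon holds thousands of entries.
NEGATIVE = {"down", "dropped", "lost", "fell"}
POSITIVE = {"up", "gained", "rose", "rallied"}

# Assumed weights: headline words count most, then the opening paragraph.
HEADLINE_WEIGHT = 3.0
LEAD_WEIGHT = 2.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_text(text, weight):
    """Add `weight` per positive word and subtract it per negative word."""
    score = 0.0
    for word in tokenize(text):
        if word in POSITIVE:
            score += weight
        elif word in NEGATIVE:
            score -= weight
    return score


def sentiment_score(headline, paragraphs):
    """Combine headline, lead, and body scores into one number."""
    total = score_text(headline, HEADLINE_WEIGHT)
    for index, paragraph in enumerate(paragraphs):
        weight = LEAD_WEIGHT if index == 0 else BODY_WEIGHT
        total += score_text(paragraph, weight)
    return total


# Made-up sample text, not the CNBC article.
headline = "Stocks dropped on variant fears"
paragraphs = [
    "Major indexes fell sharply.",
    "Some travel shares lost ground while a few drugmakers rose.",
]
print(sentiment_score(headline, paragraphs))  # -5.0
```

A negative total suggests bearish news and a positive total suggests bullish news. Real systems would also handle negation ("did not fall"), word stems, and company-specific terms, which this sketch ignores.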
So, what does this process look like from the computer’s perspective? Creating a basic sentiment analysis program is fairly straightforward, and the data it relies on can be described in XML. XML stands for Extensible Markup Language and is akin to HTML; both are markup languages rather than programming languages. While HTML outlines how a document should be displayed, XML describes the data contained within the document. In HTML, a programmer chooses from a fixed set of tags to create certain elements. The `<h1>` tag, for example, indicates the primary heading of a webpage (where h represents heading, and 1 stands for the first level of heading). XML, by contrast, lets programmers define their own tags. Instead of using a broadly defined HTML tag, programmers can designate a descriptive tag of their own that clarifies what is being described. This makes the data easier to interpret and tailors it to a business’s specific needs.
Note: the following code is from sqlauthority.com
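```xml
<SampleXML>
  <!-- Element names here are illustrative -->
  <Colors>
    <Color>White</Color>
    <Color>Blue</Color>
    <Color>Black</Color>
    <Color>Green</Color>
    <Color>Red</Color>
  </Colors>
  <Fruits>
    <Fruit>Apple</Fruit>
    <Fruit>Pineapple</Fruit>
    <Fruit>Grapes</Fruit>
    <Fruit>Melon</Fruit>
  </Fruits>
</SampleXML>
```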
In the scenario above, quants or traders would adhere to a similar format as demonstrated, only modifying the tags to suit the algorithm for sentiment analysis.
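To make that concrete, here is a rough sketch of how a program might read a sentiment lexicon stored as XML, using Python’s standard `xml.etree.ElementTree` module. The tag names (`SentimentLexicon`, `Negative`, `Positive`, `Word`), the `weight` attribute, and the `load_lexicon` function are hypothetical; a firm would choose its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon document; tag and attribute names are invented
# for this example and would differ from firm to firm.
LEXICON_XML = """
<SentimentLexicon>
  <Negative>
    <Word weight="1.0">dropped</Word>
    <Word weight="1.0">fell</Word>
  </Negative>
  <Positive>
    <Word weight="1.0">rose</Word>
  </Positive>
</SentimentLexicon>
"""


def load_lexicon(xml_text):
    """Return a dict mapping each word to a signed sentiment weight."""
    root = ET.fromstring(xml_text)
    lexicon = {}
    # Negative words get a negative weight, positive words a positive one.
    for sign, group in ((-1.0, "Negative"), (1.0, "Positive")):
        for element in root.findall(f"./{group}/Word"):
            word = element.text.strip().lower()
            weight = float(element.get("weight", "1.0"))
            lexicon[word] = sign * weight
    return lexicon


lexicon = load_lexicon(LEXICON_XML)
print(lexicon)
```

Because the tags describe the data rather than its appearance, the same file could be extended with new categories or weights without changing how the program reads it.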
Sentiment analysis merely scratches the surface of what computers can do on Wall Street. I won’t cover everything, but next I will discuss how machine learning factors into predicting market movements.
## Machine Learning
Machine Learning (ML) falls under the umbrella of Artificial Intelligence (AI): it lets systems such as Siri learn from data in order to make decisions. In this context, the system is a trading algorithm rather than Siri. You can read more about ML and AI in a prior post I published.
To give it the highest probability of success, the algorithm must first be trained. Four stock characteristics are used for this training: the opening price, the day’s high, the day’s low, and trading volume. The following data is from Microsoft stock.
| Date | Open