Over the last several decades, computer science has increasingly made its way into finance and economics. Leading financial institutions are turning to computers for improved outcomes. For instance, JPMorgan Chase and Barclays use supercomputers to assess a stock portfolio’s risk or forecast an asset’s potential value. How is this accomplished?
This article will delve into how computers evaluate sentiment and employ [machine learning](https://www.sas.com/en_us/insights/analytics/machine-learning.html#:~:text=Machine%20learning%20is%20a%20method,decisions%20with%20minimal%20human%20intervention.) to optimize investment returns.
## Sentiment Analysis
The investment landscape has evolved from relying on gut feelings to placing confidence in computer algorithms. The era of traders hastily buying and selling stocks on the [trading floor](https://www.investopedia.com/terms/t/trading_floor.asp) is diminishing. On the rise are quantitative analysts, commonly referred to as quants. Quants create computer algorithms grounded in mathematics to carry out trades and predict a stock’s future performance.
Acknowledging this market transition, [Dow Jones](https://www.investopedia.com/ask/answers/who-or-what-is-dow-jones/) has established a lexicon, or *dictionary*, that enables computers to swiftly evaluate stock movements. The Dow Jones Lexicon (DJL) aggregates financial news and translates it into a format that machines can process. The DJL comprises six distinct dictionaries, all designed by Bill McDonald, a finance professor at the University of Notre Dame. Financial firms can also develop tailored dictionaries to meet their own requirements.
Equipped with a lexicon, a computer can distinguish between positive and negative news. This insight helps traders determine whether it is prudent to carry out a trade for that stock.
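To make the idea concrete, here is a minimal Python sketch of lexicon-based classification. The word lists and the function names `sentiment_score` and `classify` are hypothetical and chosen only for illustration; they are not taken from the DJL, and a real dictionary would contain far more terms.

```python
import re

# Hypothetical mini-lexicon for illustration only (not the actual DJL contents).
POSITIVE = {"gained", "rose", "surged", "rallied"}
NEGATIVE = {"dropped", "fell", "lost", "down"}


def sentiment_score(text: str) -> int:
    """Return the count of positive words minus the count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for word in words:
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score


def classify(text: str) -> str:
    """Label text as positive, negative, or neutral based on its score."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `classify("Stocks dropped sharply and the index fell")` would label the sentence as negative, because two of its words appear in the negative list and none appear in the positive list.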
Let’s dissect a news article to illustrate how the sentiment lexicon functions. [This](https://www.cnbc.com/2021/11/26/stock-futures-open-to-close-market-news.html) article from CNBC was published on November 26, 2021, mere days after scientists in South Africa identified a new COVID variant.
*CNBC*
The opening line of the article starts with the word “dropped.” A computer analyzing this excerpt would classify it in the negative category and lower the sentiment score, signaling that the stock market is in a bearish trend. The position of the word also matters. Keywords in headlines or the opening lines influence the sentiment score more than the same keywords buried in the body of the text.
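One way to picture position weighting is to multiply each keyword hit by a weight that depends on where it appears. The sketch below is an assumption about how such a scheme could look, not a description of any firm's actual method; the weights (3, 2, 1), the word list, and the function names are all hypothetical.

```python
import re

# Hypothetical negative terms and position weights, for illustration only.
NEGATIVE = {"dropped", "fell", "lost", "down"}
HEADLINE_WEIGHT = 3.0  # assumed: headline hits count the most
LEAD_WEIGHT = 2.0      # assumed: opening lines count more than the body
BODY_WEIGHT = 1.0


def count_negative(text: str) -> int:
    """Count how many words in the text appear in the negative list."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in NEGATIVE)


def weighted_negative_score(headline: str, lead: str, body: str) -> float:
    """Return a score that falls further for hits in more prominent positions."""
    penalty = (
        HEADLINE_WEIGHT * count_negative(headline)
        + LEAD_WEIGHT * count_negative(lead)
        + BODY_WEIGHT * count_negative(body)
    )
    return -penalty
```

With these assumed weights, the word “dropped” in a headline lowers the score three times as much as the same word in the body.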
*CNBC*
In this screenshot, you can find keywords such as “down,” “dropped,” “lost,” and “fell.” These are words that the computer would interpret negatively, thus reducing the sentiment score.
What does this process look like on the computer side? It turns out to be a relatively straightforward operation. The data behind a sentiment analysis program, such as its lexicon, can be described with XML. XML, which stands for Extensible Markup Language, is closely related to HTML; like HTML, it is a markup language rather than a programming language. HTML describes how a document should be displayed, while XML describes the data contained within the document. In HTML, a developer uses a fixed set of predefined tags to create specific elements. The `<h1>` tag, for example, indicates the main heading of a webpage (h for heading; 1 for first heading). XML, on the other hand, allows developers to customize their markup further. Instead of relying on a vaguely defined, general-purpose tag, XML permits the programmer to define custom tags whose names clarify what is being described. This makes the data easier to interpret and lets it cater to a business’s specifications.
Note: the following code was compiled by sqlauthority.com
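```xml
<!-- Element names are illustrative placeholders -->
<Sample>
  <Colors>
    <Color>White</Color>
    <Color>Blue</Color>
    <Color>Black</Color>
    <Color>Green</Color>
    <Color>Red</Color>
  </Colors>
  <Fruits>
    <Fruit>Apple</Fruit>
    <Fruit>Pineapple</Fruit>
    <Fruit>Grapes</Fruit>
    <Fruit>Melon</Fruit>
  </Fruits>
</Sample>
```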
Quants or traders would adopt a structure similar to the example above, simply modifying the tags to suit their sentiment analysis algorithm.
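As a rough illustration of how a program might consume such a file, the Python sketch below parses a small XML lexicon with the standard library's `xml.etree.ElementTree` module. The tag names (`lexicon`, `negative`, `positive`, `word`), the sample words, and the `load_lexicon` helper are assumptions made for this example, not the format of any real dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML lexicon; tag names and words are illustrative assumptions.
LEXICON_XML = """
<lexicon>
  <negative>
    <word>dropped</word>
    <word>fell</word>
    <word>lost</word>
  </negative>
  <positive>
    <word>gained</word>
    <word>rose</word>
  </positive>
</lexicon>
"""


def load_lexicon(xml_text: str) -> dict[str, set[str]]:
    """Parse the XML and map each category tag to its set of words."""
    root = ET.fromstring(xml_text.strip())
    lexicon = {}
    for category in root:
        # Collect the text of every <word> child, skipping empty elements.
        lexicon[category.tag] = {
            word.text.strip().lower()
            for word in category.findall("word")
            if word.text
        }
    return lexicon


lexicon = load_lexicon(LEXICON_XML)
```

After this runs, `lexicon["negative"]` holds the three negative words, and adding a new category only requires a new tag in the XML rather than a change to the code.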
Sentiment analysis merely scratches the surface of the capabilities computers possess on Wall Street. While I