Over the last several decades, computer science has gained significant traction in finance and economics. Leading financial institutions now rely on computing power to guide investment decisions. For instance, JPMorgan Chase and Barclays use supercomputers to assess the risk of a stock portfolio or forecast an asset’s potential value. How is this accomplished?

In this piece, I will delve into how computers interpret sentiment and employ machine learning to enhance investment returns.

## Sentiment Analysis

The investment landscape has transitioned from relying on gut feelings to placing confidence in computer algorithms. The era of stock traders hastily buying and selling shares on the trading floor is fading. Quantitative analysts, or quants for short, are now taking the place of floor traders. Quants design computer algorithms with a strong focus on mathematics to execute trades and predict the future behavior of stocks.

Acknowledging this market evolution, Dow Jones has created a lexicon, or *dictionary*, that enables computers to efficiently evaluate stock movement. The Dow Jones Lexicon (DJL) aggregates financial news and translates it into a form that computers can process. The DJL comprises six distinct dictionaries, all produced by Bill McDonald, a finance professor at the University of Notre Dame. Financial firms can also develop their own dictionaries tailored to their specific requirements.

Equipped with a lexicon, a computer can differentiate between positive and negative news. This provides traders with insights on whether it would be prudent to carry out a trade for that stock.

Let’s scrutinize a news article to illustrate how the sentiment lexicon operates. This article originates from CNBC and was published on November 26, 2021, shortly after scientists in South Africa identified a new COVID variant.

The opening line of the article features the word “dropped.” A computer analyzing this fragment would classify it as negative and lower the sentiment score, suggesting a bearish market. However, the position of the word is also crucial. Keywords found in headlines or the initial lines carry more significance in the sentiment score compared to those buried deeper within the text.

In this screenshot, keywords like “down,” “dropped,” “lost,” and “fell” stand out. Again, these words will be interpreted by the computer as negative, which further decreases the sentiment score.
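The scoring idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the Dow Jones Lexicon itself: the word list, the three weights, and the sample headline and sentences are all made up for the example, and a production lexicon would be far larger.

```python
import re

# Hypothetical mini-lexicon: word -> polarity (-1 negative, +1 positive).
# These entries are illustrative only, not taken from any real lexicon.
LEXICON = {
    "dropped": -1, "down": -1, "lost": -1, "fell": -1,
    "rose": 1, "gained": 1, "up": 1, "rallied": 1,
}

# Assumed position weights: earlier text counts for more.
HEADLINE_WEIGHT = 3.0
LEAD_WEIGHT = 2.0   # first sentence of the body
BODY_WEIGHT = 1.0   # everything after the first sentence


def score_text(text, weight):
    """Sum the lexicon polarities of the words in `text`, scaled by `weight`."""
    words = re.findall(r"[a-z]+", text.lower())
    return weight * sum(LEXICON.get(word, 0) for word in words)


def sentiment_score(headline, body_sentences):
    """Return a position-weighted score; a negative value suggests bearish news."""
    total = score_text(headline, HEADLINE_WEIGHT)
    for index, sentence in enumerate(body_sentences):
        weight = LEAD_WEIGHT if index == 0 else BODY_WEIGHT
        total += score_text(sentence, weight)
    return total


# Made-up sample text, not quoted from the CNBC article.
headline = "Stocks dropped on new variant fears"
body = [
    "The index fell sharply at the open.",
    "Travel shares lost ground and ended the day down.",
]
print(sentiment_score(headline, body))  # -7.0
```

The headline's single negative word contributes more to the score than the two negative words in the last sentence combined, which mirrors the point that position matters as much as vocabulary.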

So, how is this processed on the computer’s side? It turns out to be a relatively straightforward procedure. Developing a sentiment analysis program can involve XML. XML, short for Extensible Markup Language, is closely related to HTML; both are markup languages rather than programming languages. HTML describes how a document should be displayed, while XML describes the data the document contains. For instance, HTML uses a fixed set of tags that programmers insert to create specific elements. The `<h1>` tag signifies a page’s top-level heading (h for heading; 1 for the primary heading). XML, by contrast, lets programmers define their own tags. Instead of a generic `<h1>` tag, a programmer can name a tag after the data it holds, clarifying what is being defined. This makes the markup easier to interpret and aligns it with a business’s requirements.

Note: the next code was compiled by sqlauthority.com

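```xml
<!-- Tag names are illustrative; only the values are from the source -->
<Sample>
  <Colors>
    <Color>White</Color>
    <Color>Blue</Color>
    <Color>Black</Color>
    <Color>Green</Color>
    <Color>Red</Color>
  </Colors>
  <Fruits>
    <Fruit>Apple</Fruit>
    <Fruit>Pineapple</Fruit>
    <Fruit>Grapes</Fruit>
    <Fruit>Melon</Fruit>
  </Fruits>
</Sample>
```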

Quants or traders would adopt a format similar to the example above, changing only the tags to fit their sentiment analysis algorithm.
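To make that concrete, here is a rough sketch of how a program might read a sentiment lexicon stored as XML, using Python's standard `xml.etree.ElementTree` module. The tag names (`lexicon`, `negative`, `positive`, `word`) and the `weight` attribute are assumptions invented for this example; an actual firm's format would differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical lexicon document; tag and attribute names are invented
# for illustration and do not come from any real product.
LEXICON_XML = """
<lexicon>
  <negative>
    <word weight="1.0">dropped</word>
    <word weight="1.0">fell</word>
  </negative>
  <positive>
    <word weight="1.0">rose</word>
  </positive>
</lexicon>
"""


def load_lexicon(xml_text):
    """Parse the XML and return a dict mapping each word to a signed weight."""
    root = ET.fromstring(xml_text.strip())
    lexicon = {}
    for polarity, sign in (("negative", -1.0), ("positive", 1.0)):
        # Find every <word> nested directly under <negative> or <positive>.
        for node in root.findall(f"./{polarity}/word"):
            weight = float(node.get("weight", "1.0"))
            lexicon[node.text.strip()] = sign * weight
    return lexicon


lexicon = load_lexicon(LEXICON_XML)
print(lexicon)  # {'dropped': -1.0, 'fell': -1.0, 'rose': 1.0}
```

Because the tags say what each group of words means, the loader needs no extra configuration to tell positive terms from negative ones.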

Sentiment analysis merely scratches the surface of what computers can do on Wall Street. I cannot cover everything, so the rest of this article turns to the role of machine learning in forecasting market movements.

## Machine Learning

Machine Learning (ML) is a subset of Artificial Intelligence (AI) in which systems learn patterns from data in order to make informed decisions; voice assistants like Siri are a familiar example. In this scenario, the learners are trading algorithms rather than Siri.

To maximize the likelihood of success, the computer must undergo a training process. Four features of a stock are used for training: its opening price, the day’s high, the day’s low, and its trading volume. The following data pertains to Microsoft stock.
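Before training, the four features are usually put on a common scale, since prices and volumes differ by several orders of magnitude. The sketch below assumes min-max scaling, one common choice, and uses made-up rows rather than real Microsoft prices; the function name `min_max_scale` is hypothetical.

```python
# Made-up daily rows: (open, high, low, volume). Not real Microsoft data.
rows = [
    (100.0, 110.0, 95.0, 1_000_000),
    (105.0, 120.0, 100.0, 1_500_000),
    (110.0, 115.0, 105.0, 2_000_000),
]


def min_max_scale(rows):
    """Rescale each column to the range [0, 1] independently."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    scaled_columns = []
    for column in columns:
        low, high = min(column), max(column)
        span = high - low
        # A constant column has no spread, so map it to 0.0.
        scaled_columns.append(
            [(value - low) / span if span else 0.0 for value in column]
        )
    # Transpose back so each inner list is one trading day.
    return [list(day) for day in zip(*scaled_columns)]


scaled = min_max_scale(rows)
for day in scaled:
    print(day)
# [0.0, 0.0, 0.0, 0.0]
# [0.5, 1.0, 0.5, 0.5]
# [1.0, 0.5, 1.0, 1.0]
```

After scaling, a volume of two million shares and a price of 110 both sit at 1.0 in their own columns, so neither feature dominates the training signal purely because of its units.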

Note: the subsequent code was compiled by Analytics Vidhya

Normalizing
