How Computers Are Changing Wall Street

How Computers Influence Wall Street: Sentiment Analysis and Machine Learning

In recent decades, the influence of computer science on finance and economics has expanded significantly. Major financial institutions such as JPMorgan Chase and Barclays now depend on advanced computational systems to evaluate portfolio risk and forecast asset values. But how do computers actually accomplish this?

In this piece, we will delve into two key methods by which computers are transforming finance: sentiment analysis and machine learning.

Sentiment Analysis: Gauging Market Sentiment

The era of relying solely on instinct and bustling trading floors for stock investments is long gone. The conventional trader has largely been supplanted by the “quant” (short for quantitative analyst), who builds sophisticated mathematical algorithms to execute trades and anticipate stock movements.

How Computers Understand News

In light of this technological evolution, financial data firms such as Dow Jones have created tools like the Dow Jones Lexicon (DJL). This lexicon, which functions much like a dictionary, enables machines to interpret financial news and identify positive or negative sentiment. Companies can refine their sentiment analysis further with specialized dictionaries, such as those compiled by finance professor Bill McDonald of the University of Notre Dame, or with custom lexicons built for particular needs.

For example, when examining a headline from CNBC about a new variant of COVID-19 leading to market declines, a machine would identify negative keywords such as “dropped,” “down,” “fell,” and “lost.” These terms would decrease the sentiment score, indicating a bearish market attitude. Additionally, the position of words matters—those in headlines and introductory paragraphs have more influence than those hidden within the text.
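The scoring idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word list, the weights, and the function names are hypothetical and are not taken from the Dow Jones Lexicon or any other real dictionary.

```python
import re

# Hypothetical mini-lexicon: word -> sentiment weight.
# These entries are illustrative only, not from any real lexicon.
LEXICON = {
    "dropped": -1.0, "down": -1.0, "fell": -1.0, "lost": -1.0,
    "gained": 1.0, "rose": 1.0, "up": 1.0,
}

# Assumed position weights: headline words count double.
HEADLINE_WEIGHT = 2.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(headline, body):
    """Sum lexicon weights, giving headline words more influence."""
    score = 0.0
    for word in tokenize(headline):
        score += HEADLINE_WEIGHT * LEXICON.get(word, 0.0)
    for word in tokenize(body):
        score += BODY_WEIGHT * LEXICON.get(word, 0.0)
    return score


score = sentiment_score(
    "Stocks fell as new variant spreads",
    "The index dropped sharply and lost ground through the afternoon.",
)
print(score)  # negative score -> bearish reading
```

A negative total suggests a bearish reading and a positive total a bullish one. A production system would likely use a far larger lexicon and more careful weighting.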

Developing Sentiment Analysis Systems

Sentiment analysis systems are generally built on XML (Extensible Markup Language), a markup language for tagging and structuring data. Whereas HTML describes how content is presented, XML describes what the data means. For example:

<SampleXML>
    <Colors>
        <Color1>White</Color1>
        <Color2>Blue</Color2>
        <Color3>Black</Color3>
    </Colors>
    <Fruits>
        <Fruits1>Apple</Fruits1>
        <Fruits2>Pineapple</Fruits2>
    </Fruits>
</SampleXML>

Likewise, financial sentiment systems use custom tags to categorize news content, allowing programs to organize and respond to news developments quickly.
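To make this concrete, here is a minimal sketch of how a program might read tagged news with Python's standard `xml.etree.ElementTree` module. The tag names (`NewsItem`, `Headline`, `Ticker`, `Category`, `Sentiment`) are hypothetical and chosen only for illustration. They do not come from any vendor's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names are illustrative, not a real vendor schema.
NEWS_XML = """
<NewsItem>
    <Headline>Stocks fell as new variant spreads</Headline>
    <Ticker>MSFT</Ticker>
    <Category>Markets</Category>
    <Sentiment>negative</Sentiment>
</NewsItem>
"""


def parse_news_item(xml_text):
    """Return a dict mapping each child tag name to its text content."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


item = parse_news_item(NEWS_XML)
print(item["Ticker"], item["Sentiment"])
```

Because every field is labeled, a downstream program can route or filter stories by ticker or category without having to interpret free text.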

While sentiment analysis provides essential real-time insights into public sentiment and the effects of news, it represents just one aspect of financial forecasting. This is where machine learning elevates the precision of predictions.

Machine Learning: Training Computers to Anticipate Markets

Machine Learning (ML)—a branch of Artificial Intelligence (AI)—empowers computers to learn from data and enhance their predictions over time with minimal human intervention.

In finance, ML models are trained to forecast stock prices from historical features such as:

Open price
High price (the peak for the day)
Low price (the minimum trading price for the day)
Trading volume (number of shares traded)

Here’s an illustration of historical stock data for Microsoft:

Date	Open	High	Low	Close	Adj Close	Volume
1990-01-02	0.6059	0.6163	0.5981	0.6163	0.4473	53,033,600
1990-01-03	0.6215	0.6267	0.6146	0.6198	0.4498	113,772,800

Data Normalization

Before a model can learn from this data, each feature is scaled (normalized) to the range 0 to 1. Normalization puts features with very different magnitudes, such as prices and trading volumes, on a common scale, which helps the model learn more quickly and effectively.

Example (Normalized Data):

Date	Open	High	Low	Volume
1990-01-02	0.00013	0.00010	0.00013	0.06484
1990-01-03	0.00026	0.00020	0.00027	0.14467
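One common way to scale values into the 0 to 1 range is min-max scaling, sketched below in plain Python. The article does not state which scaling method produced the table above, so treat this as an assumption. The very small values in the table suggest the minimum and maximum were taken over a much longer price history than the two rows shown. The input numbers here are made up for illustration.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly so min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up sample values, used only to show the arithmetic.
sample = [10.0, 20.0, 40.0]
scaled = min_max_scale(sample)
print(scaled)
```

In practice each column (Open, High, Low, Volume) would be scaled separately, using the minimum and maximum of that column.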

Preparing the Machine for Learning

To facilitate training, the data is divided into:

A training set (to educate the model)
A testing set (to assess its performance)

Python code snippet:

import pandas as pd

# Define the target variable (df is assumed to hold the historical price data)
output_var = pd.DataFrame(df['Adj Close'])

# Choose the features
features = ['Open', 'High', 'Low', 'Volume']
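For time-ordered data such as daily prices, one simple way to form the training and testing sets is to split chronologically, so the model is tested only on dates later than those it trained on. The sketch below assumes this approach and an 80/20 split. The function name and the fraction are illustrative and are not necessarily the author's method.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Stand-in for ten days of data, oldest first.
rows = list(range(10))
train, test = chronological_split(rows)
print(len(train), len(test))
```

Keeping the rows in date order avoids training on information from the future, which would make test results look better than they really are.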

Utilizing a time…
