Predicting stock price patterns after a rights-issue announcement with deep learning

Charlie_the_wanderer
7 min read · May 1, 2021

In this post, I’d like to talk about predicting the pattern a stock price follows.
Generally, there are two methodologies for predicting a stock price.
One is fundamental analysis, which focuses on analyzing a company’s financial statements. Although the statements and the indices derived from them are relevant for predicting a price, they don’t always give the answer.
That’s because figures in financial statements, such as sales and profit, can easily be manipulated by the company in its own favor. There are also many things that financial statements can’t represent at all.
For example, the stock prices of some biotech or pharmaceutical companies sometimes spike even though the companies aren’t making a profit from their business. In those cases, financial indices such as PER, ROI, and profit become meaningless, since the only thing propping the price up is people’s expectations for the company’s future.

Therefore, you need to take people’s psychology into account as well, and that is what the other approach, technical analysis, is all about.
It focuses on the stock chart, where people’s psychology shows up as candlestick bars forming certain patterns. While it suits short-term analysis when trading stocks, it is limited in predicting long-term prices, since it doesn’t consider the fundamental value of a company.

As you can see, there’s no panacea. As a result, most stock investors combine both methodologies to predict a stock price as accurately as they can.

However, the method I used is a little different from those two: it analyzes events such as rights issues, bonus issues, and IPOs to predict the pattern a stock price follows after the announcement. The methodologies mentioned above can also be used alongside this approach. The advantage of this method is that you don’t have to predict every movement of a stock price, which is almost impossible even for experts. Since certain patterns have historically been observed after an event announcement, you can concentrate on predicting which pattern will occur. That makes it more feasible than predicting the price itself.
In particular, I focused on rights issues.

To do so, I had to collect historical data on both financial statements and rights-issue announcements for all listed companies.

First, to fetch financial statements, I used the Open API of DART, a website operated by the Korean government where every corporate announcement is published. I extracted four years’ worth of quarterly financial statements, along with the rights-issue announcements made within that period, for over 2,200 listed companies. Since M&A activity and differences in each company’s document format caused a lot of missing values and inconsistent formats, I used Python regular expressions to extract what I needed.
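To give a feel for this kind of clean-up, here is a minimal sketch of normalizing amount cells with a regular expression. The pattern and the `parse_amount` helper are my own assumptions for illustration, not the code used for this project, and real DART documents would need more cases than this.

```python
import re

# Hypothetical pattern: optional accounting-style parentheses or a minus sign,
# digits with optional thousands separators, and an optional decimal part.
_AMOUNT = re.compile(r"\(?-?\d[\d,]*(?:\.\d+)?\)?")


def parse_amount(raw):
    """Convert a raw statement cell such as '1,234' or '(5,678)' to a float.

    Returns None when the cell holds no number (for example '-' or '').
    """
    match = _AMOUNT.search(raw)
    if match is None:
        return None
    token = match.group(0)
    # Negative amounts are often written in parentheses instead of with '-'.
    negative = token.startswith("(") or "-" in token
    digits = re.sub(r"[^\d.]", "", token)
    value = float(digits)
    return -value if negative else value
```

Cells that come back as `None` can then be treated as missing values in a later step.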

It took a couple of days to extract all the data and store it in my database.

Let’s look at one example of those events.

The image above shows the stock price movement of 다스코 after its rights-issue announcement.
Though it depends on the type of rights issue, a stock price generally tends to plummet after the announcement. The announcement signals that the company’s financial condition is suffering, and some shareholders who don’t want to subscribe to the new shares simply sell what they hold before the base date (the record date for allotting the new shares). The price then goes up after the listing date, on the expectation that the company will invest the raised capital in new business.

Understandably, that’s not always the case, as the image below shows.

Patterns vary depending on a company’s financial condition and the economic situation at that point, so my goal was to predict the pattern after a rights-issue announcement with a deep learning model trained on all the historical rights-issue data.

There were 1,751 rights-issue announcements during the period. That figure includes all offering methods: offering to shareholders, public offering, and allocation to third parties. Since factors such as the offering method, the ratio of newly offered stock to existing stock, and the reason the company needs more funds strongly affect the stock price, I extracted them from the announcement documents.

Then, I extracted the relevant data from the financial statements of each company that had announced a rights issue.

To save processing time, I used Python’s multiprocessing module. For each announcement, I extracted the five quarters of financial information preceding it from the company’s quarterly financial statements.
More specifically, I extracted assets, liabilities, equity, cash flows, revenue, and profit, since those numbers capture important aspects of a company: its stability, liquidity, and profitability.
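A rough sketch of how this step could be parallelized is below. The function names and the `(quarter_ends, announced_on)` task layout are assumptions made for illustration; the real extraction worked against stored statements rather than bare dates.

```python
from datetime import date
from multiprocessing import Pool


def last_five_quarters(task):
    """Return the five most recent quarter-end dates before the announcement.

    `task` is a (quarter_ends, announced_on) tuple. Returns None when fewer
    than five earlier quarters exist, so the caller can drop that company.
    """
    quarter_ends, announced_on = task
    earlier = sorted(q for q in quarter_ends if q < announced_on)
    if len(earlier) < 5:
        return None
    return earlier[-5:]


def extract_all(tasks, workers=4):
    """Run last_five_quarters over many announcements in parallel."""
    with Pool(processes=workers) as pool:
        return pool.map(last_five_quarters, tasks)
```

When running this as a script, call `extract_all` from inside an `if __name__ == "__main__":` block, which the multiprocessing module needs on platforms that start worker processes by spawning.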

Companies with missing values in that period were dropped during preprocessing.

Then, I merged the two datasets, one from corporate announcements and one from financial statements. Since they also contain categorical data, such as the offering method and ‘corp_cls’ (which indicates the market a company is listed in), I had to find a way to feed both numerical and categorical data into the model at the same time.
Appropriate feature scaling was also needed to normalize all the entries.

There are generally two ways of doing that: Min-Max normalization and Z-score normalization. However, I couldn’t simply apply either of them, because numerical values such as assets, liabilities, equity, profit, and sales lose their meaning relative to each other once they are normalized.
For example, when you analyze a company’s financial condition, ratios such as profit to sales, equity to liabilities, and liabilities to assets are significant because the ratios themselves serve as financial indices. When each column is normalized separately, those relationships can be lost.
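The effect is easy to see on a toy example. The sketch below uses made-up figures for three companies and plain-Python versions of the two scalers; it only illustrates the problem and is not the preprocessing code from this project.

```python
def min_max(values):
    """Scale a column to the [0, 1] range."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def z_score(values):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]


# Made-up figures: the first company has the highest profit margin (20%).
sales = [100.0, 500.0, 1000.0]
profit = [20.0, 50.0, 30.0]

raw_margin = [p / s for p, s in zip(profit, sales)]

# After column-wise scaling, the first company is 0.0 in both columns,
# so its 20% margin can no longer be read off the scaled values.
scaled_sales = min_max(sales)
scaled_profit = min_max(profit)
```

The same thing happens with `z_score`: each column is shifted and scaled on its own, so the relationship between two columns for a single company is not preserved.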

To avoid this, I converted the financial data into ratios before normalization, as shown below.
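As a minimal sketch of the idea, the function below computes the three ratios named earlier from one row of raw figures. The dictionary keys and the choice of exactly these three ratios are assumptions for illustration; the full set of ratios used in the project may differ.

```python
def to_ratios(row):
    """Turn raw statement figures into ratio features.

    `row` is a dict with the hypothetical keys 'assets', 'liabilities',
    'equity', 'sales', and 'profit'. A ratio is None when its denominator
    is zero, so the caller can decide how to handle that row.
    """
    def safe_div(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "profit_to_sales": safe_div(row["profit"], row["sales"]),
        "equity_to_liabilities": safe_div(row["equity"], row["liabilities"]),
        "liabilities_to_assets": safe_div(row["liabilities"], row["assets"]),
    }
```

Because each ratio is computed within a single company’s row, scaling the ratio columns afterwards no longer destroys the relationship between the underlying figures.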

Then, I applied one-hot encoding to the categorical values so they could be handled together with the numerical values during training. I also converted cash flows into a categorical value of either plus or minus, since I thought the sign is a more important signal than the number itself.
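Both conversions are small enough to sketch in plain Python. The category names in `OFFERING_METHODS` are hypothetical labels for the three offering methods mentioned earlier, and treating a cash flow of exactly zero as "minus" is my own assumption.

```python
def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def cash_flow_sign(amount):
    """Collapse a cash-flow figure to 1 (positive) or 0 (zero or negative)."""
    return 1 if amount > 0 else 0


# Hypothetical names for the offering methods described in the text.
OFFERING_METHODS = ["shareholders", "public", "third_party"]
```

The resulting 0/1 values can be concatenated with the scaled ratio features to form one input vector per announcement.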

Target labels were generated from historical stock prices. If a company’s stock price after the listing date was higher than its price before the rights-issue announcement, I gave it a positive label; if not, I gave it a negative label. In other words, I set this up as a binary classification problem.
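Here is a minimal sketch of that labeling rule. Exactly which trading days are compared is not spelled out above, so this version assumes the last close before the announcement and the first close on or after the listing date; `make_label` and its arguments are hypothetical names.

```python
from datetime import date


def make_label(closes, announced_on, listed_on):
    """Return 1 if the price after listing beats the pre-announcement price.

    `closes` maps a trading date to that day's closing price. Returns None
    when either reference price is unavailable.
    """
    before = [d for d in closes if d < announced_on]
    after = [d for d in closes if d >= listed_on]
    if not before or not after:
        return None
    base_price = closes[max(before)]   # last close before the announcement
    later_price = closes[min(after)]   # first close on or after the listing date
    return 1 if later_price > base_price else 0
```

Rows that come back as `None` would have to be dropped or filled in before training.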

One thing to note is that the dataset was skewed: it had more examples with negative labels than with positive labels. That has to be taken into account when measuring the accuracy and performance of the model.

The model was built with TensorFlow. I used dense layers, with dropout layers to reduce overfitting. The summary of the model is below.

With the model above, I reached a validation accuracy of 0.67 on average. I also tried traditional ML models such as SVM and RandomForest, but their results were not much different.

Although that’s far from a perfect score, it’s safe to say the model does better than a coin toss, and better than always predicting the majority class, since the validation accuracy is higher than 0.57, the share of negative samples in the dataset.
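The 0.57 figure is the accuracy you would get by always predicting the negative label, which makes it the natural baseline for a skewed dataset. A small sketch of that comparison is below; the label list is synthetic and only reproduces the 57/43 split.

```python
def majority_baseline(labels):
    """Accuracy of always predicting the most common label."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return max(positives, negatives) / len(labels)


def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Synthetic labels with the same class balance as the dataset (57% negative).
labels = [0] * 57 + [1] * 43
baseline = majority_baseline(labels)
```

A model is only adding information if its validation accuracy is above `baseline`, not merely above 0.5.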

There are some limits, however. It remains to be seen whether the model generalizes well to unseen data and performs as expected. The accuracy also isn’t high enough to rely on by itself, so there is much room for improvement.

All in all, I think this model can give an investor relevant information that, statistically, improves the odds of winning in the decision-making process over the long run.

If you have any thoughts or ideas about this post, feel free to leave a comment below. Thanks for reading.
