Candlestick Pattern Backtest - Engulfing Patterns

Do candlestick patterns really work?

Author

Bashar Ul Fattah

Candlestick Patterns

When first looking into trading, I saw that everyone was talking about candlestick patterns. However, I could never wrap my head around how certain patterns could predict the direction of price. To me, it always seemed like astrology. You can’t just look at the formation of stars and say that something will happen in the future.

While candlestick patterns do depict the price action within a given time frame, that data alone can’t be enough to go on, right? The best way to find out is to backtest it.

So, that’s what I’m going to do in this blog. I’ll test it on a year’s worth of data.

Now, there are dozens of these patterns, and testing every single one of them would be beyond the scope of this blog. Instead, I’ve decided to test only one kind of pattern: the “Engulfing Patterns”, mostly because they are quite popular and frankly the easiest to explain.

Important

This blog is for informational purposes only and is not financial advice of any kind.

Github Repo: https://github.com/Zentropic/engulfing-patterns-test

Engulfing Candlestick Patterns

The engulfing pattern consists of two candles of opposite kinds: one bullish and one bearish. The pattern is formed when the second candle’s body completely engulfs the preceding candle’s body. Hence the name “engulfing”.

We have two kinds of engulfing patterns: the bullish engulfing pattern and the bearish engulfing pattern. Each supposedly predicts a reversal in its respective direction.

(a) Bullish Engulfing Pattern
(b) Bearish Engulfing Pattern
Figure 1: Engulfing Patterns
Bullish Engulfing Pattern
A bullish engulfing pattern is one where the first candle is bearish and the second candle opens lower than the first candle’s close but closes higher than the first candle’s open.
Bearish Engulfing Pattern
A bearish engulfing pattern is one where the first candle is bullish and the second candle opens higher than the first candle’s close but closes lower than the first candle’s open.

The rationale is that when a bullish engulfing pattern appears, the price is supposed to go up, and when a bearish engulfing pattern appears, the price is supposed to go down.
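As a minimal sketch of the two definitions above, the function below classifies a pair of candles from their open and close prices only. The helper name `engulfing_type` and the sample prices are made up for illustration; they are not part of the data set or the repo.

```python
def engulfing_type(prev_open, prev_close, curr_open, curr_close):
    """Classify two consecutive candles by their bodies (hypothetical helper)."""
    # Bullish: the first candle is bearish, the second opens below the
    # first candle's close and closes above the first candle's open.
    if prev_open > prev_close and curr_open < prev_close and curr_close > prev_open:
        return "bullish"
    # Bearish: the first candle is bullish, the second opens above the
    # first candle's close and closes below the first candle's open.
    if prev_close > prev_open and curr_open > prev_close and curr_close < prev_open:
        return "bearish"
    # Anything else is not an engulfing pattern under these definitions.
    return None


print(engulfing_type(10.0, 9.5, 9.4, 10.2))   # bullish
print(engulfing_type(9.5, 10.0, 10.1, 9.3))   # bearish
print(engulfing_type(10.0, 9.5, 9.6, 10.2))   # None: second open is not below first close
```

Note that the comparisons are strict, so a second candle that opens exactly at the first candle's close does not count as engulfing here.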

Backtest

Now let’s get into the testing part. As mentioned before, I want to test it on a year’s worth of data. For this, I’ll use OHLCV data from 2023-05-25 to 2024-05-25 for securities listed on the DSE.

Let’s import the required packages and the data.

import pandas as pd
import numpy as np
import vectorbt as vbt
import re
import warnings

warnings.simplefilter("ignore")

vbt.settings.plotting["layout"]["template"] = "vbt_dark"
vbt.settings.portfolio["size_granularity"] = 1
vbt.settings.portfolio["freq"] = "D"
# Dataframe containing the data
df = pd.read_pickle('dse-all-securities-2023-05-25-to-2024-05-25.pkl')
df.head()
1JANATAMF 1STPRIMFMF ... ZAHINTEX ZEALBANGLA
Open High Low Close Volume Trade Ltp Open High Low ... Volume Trade Ltp Open High Low Close Volume Trade Ltp
DATE
2023-05-25 6.1 6.1 6.1 6.1 10696.0 NaN 6.1 14.6 14.6 14.2 ... NaN NaN NaN 142.8 153.3 141.3 147.8 58145.0 NaN 147.3
2023-05-28 6.1 6.1 6.1 6.1 1452.0 NaN 6.1 14.6 15.0 14.1 ... NaN NaN NaN 148.7 157.9 148.0 150.4 20229.0 NaN 150.4
2023-05-29 6.1 6.1 6.1 6.1 115800.0 NaN 6.1 15.0 16.2 15.0 ... 1100.0 NaN 9.0 150.4 157.9 146.0 147.3 41375.0 NaN 147.3
2023-05-30 6.1 6.1 6.1 6.1 104286.0 NaN 6.1 16.2 16.6 15.5 ... 1100.0 NaN 9.0 145.0 154.0 145.0 146.2 12217.0 NaN 146.2
2023-05-31 6.1 6.1 6.1 6.1 10166.0 NaN 6.1 15.6 15.6 15.2 ... 10.0 NaN 9.0 145.3 151.5 144.0 145.4 3861.0 NaN 145.4

5 rows × 4823 columns

Not all symbols in this data frame have data from the start date to the end date, as they might have been listed on the exchange much more recently. We will keep them, as it keeps the timeline more accurate for a broad test. However, the data frame doesn’t only contain data on stocks but also T-bills, bonds and mutual funds.

cols = df.columns.get_level_values(0)
unwanted_cols = [col for col in cols if
                 "BOND" in col or
                 "MF" in col or
                 re.search(r'TB\d+', col)]

print(f'For example: {[unwanted_cols[1], unwanted_cols[200], unwanted_cols[400]]}')
For example: ['1JANATAMF', 'PBLPBOND', 'TB10Y0730']

We don’t want them in our test. Let’s see how many of these need to be removed.

print(f'Total symbols: {len(set(cols))}')
print(f'Number of unwanted symbols: {len(set(unwanted_cols))}')
Total symbols: 689
Number of unwanted symbols: 303

Let’s drop these columns, which should hopefully remove all the non-stock symbols. Even if one or two are left, that shouldn’t impact our test significantly.

stock_data = df.drop(unwanted_cols, axis=1)

This is how the new dataframe looks.

stock_data.head()
AAMRANET AAMRATECH ... ZAHINTEX ZEALBANGLA
Open High Low Close Volume Trade Ltp Open High Low ... Volume Trade Ltp Open High Low Close Volume Trade Ltp
DATE
2023-05-25 77.3 78.6 76.5 77.0 1662559.0 NaN 77.0 37.1 37.1 35.8 ... NaN NaN NaN 142.8 153.3 141.3 147.8 58145.0 NaN 147.3
2023-05-28 77.5 78.0 76.6 76.8 1598668.0 NaN 76.8 36.0 36.5 35.3 ... NaN NaN NaN 148.7 157.9 148.0 150.4 20229.0 NaN 150.4
2023-05-29 77.0 80.9 77.0 79.2 2549974.0 NaN 79.2 35.5 36.4 35.5 ... 1100.0 NaN 9.0 150.4 157.9 146.0 147.3 41375.0 NaN 147.3
2023-05-30 80.0 80.8 78.8 79.6 1544119.0 NaN 80.1 35.6 36.4 35.5 ... 1100.0 NaN 9.0 145.0 154.0 145.0 146.2 12217.0 NaN 146.2
2023-05-31 80.4 81.8 79.9 80.2 3407082.0 NaN 80.2 36.3 38.7 36.3 ... 10.0 NaN 9.0 145.3 151.5 144.0 145.4 3861.0 NaN 145.4

5 rows × 2702 columns

To define the logic for the backtest, we only need the Open and Close columns. So we will create two separate dataframes, one with the Open prices and one with the Close prices, each with one column per symbol.

def split_open_close(df):
    open_mask = df.columns.get_level_values(1) == 'Open'
    close_mask = df.columns.get_level_values(1) == 'Close'

    open_prices = df.loc[:, open_mask]
    open_prices.columns = open_prices.columns.get_level_values(0)
    
    close_prices = df.loc[:, close_mask]
    close_prices.columns = close_prices.columns.get_level_values(0)

    return open_prices, close_prices

open_prices, close_prices = split_open_close(stock_data)

This is how the Open and Close price dataframes look.

open_prices.head(2)
AAMRANET AAMRATECH ABBANK ACFL ACI ACIFORMULA ACMELAB ACMEPL ACTIVEFINE ADNTEL ... UTTARAFIN VAMLRBBF VFSTDL WALTONHIL WATACHEM WMSHIPYARD YPL ZAHEENSPIN ZAHINTEX ZEALBANGLA
DATE
2023-05-25 77.3 37.1 9.9 26.5 260.2 159.0 86.8 35.4 19.3 131.7 ... NaN NaN 22.2 1047.7 200.2 11.0 22.0 10.2 NaN 142.8
2023-05-28 77.5 36.0 9.9 26.5 260.2 158.5 86.8 35.4 19.3 128.7 ... 33.8 7.4 22.2 1047.7 200.2 11.0 21.5 10.0 NaN 148.7

2 rows × 386 columns

close_prices.head(2)
AAMRANET AAMRATECH ABBANK ACFL ACI ACIFORMULA ACMELAB ACMEPL ACTIVEFINE ADNTEL ... UTTARAFIN VAMLRBBF VFSTDL WALTONHIL WATACHEM WMSHIPYARD YPL ZAHEENSPIN ZAHINTEX ZEALBANGLA
DATE
2023-05-25 77.0 36.0 9.9 26.5 260.2 157.7 86.1 35.4 19.3 128.0 ... 33.8 7.4 22.2 1047.7 200.2 11.0 21.3 10.0 9.0 147.8
2023-05-28 76.8 35.5 9.9 26.5 260.2 157.7 86.0 35.4 19.3 124.1 ... 33.8 7.4 22.2 1047.7 200.2 11.0 19.8 10.0 9.0 150.4

2 rows × 386 columns
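The same split can probably be written more compactly with pandas’ `DataFrame.xs`, which selects one label from a column level and drops that level. This is only a sketch on a toy frame: the symbols `AAA` and `BBB` and their prices are invented, and it assumes the field names sit on the second column level, as they do in the output above.

```python
import pandas as pd

# Toy frame with the same (symbol, field) column layout as the real data.
columns = pd.MultiIndex.from_product([["AAA", "BBB"], ["Open", "Close"]])
toy = pd.DataFrame(
    [[10.0, 10.5, 20.0, 19.5],
     [10.6, 10.2, 19.4, 19.9]],
    columns=columns,
)

# Select one field across all symbols; the field level is dropped,
# leaving one column per symbol.
toy_open = toy.xs("Open", axis=1, level=1)
toy_close = toy.xs("Close", axis=1, level=1)

print(toy_open)
print(toy_close)
```

Either approach should give one column per symbol; the mask-based function above just makes the selection step more explicit.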

It’s time to define the logic and create the entries and exits. We enter when there’s a bullish engulfing pattern and exit when there’s a bearish one. We won’t go the usual route of going short on a bearish signal, as short selling isn’t possible on the DSE.

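# Bullish engulfing: the previous candle is bearish and today's body engulfs it
entries = (open_prices.shift() > close_prices.shift()) &\
          (open_prices < close_prices.shift()) &\
          (close_prices > open_prices.shift())

# Bearish engulfing: the previous candle is bullish and today's body engulfs it
exits = (close_prices.shift() > open_prices.shift()) &\
        (open_prices > close_prices.shift()) &\
        (close_prices < open_prices.shift())

# Shift the signals forward by one bar so orders are placed on the next candle
entries = entries.vbt.fshift()
exits = exits.vbt.fshift()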

The entries and exits are defined exactly as explained before: two opposite candles, where the second one’s body completely engulfs the first one’s. The entries and exits are also shifted forward by one candle to avoid look-ahead bias, since a pattern is only confirmed once the second candle has closed.
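To make the shift concrete, here is a small sketch on made-up prices for a single symbol. It uses plain pandas `shift` with `fill_value=False` as a stand-in for `vbt.fshift()`, on the assumption that the two behave the same way for boolean signals.

```python
import pandas as pd

# Made-up prices for one symbol; not taken from the DSE data.
toy_open = pd.Series([10.0, 9.4, 10.3, 10.4])
toy_close = pd.Series([9.5, 10.2, 10.4, 10.1])

# Bullish engulfing, known only once the second candle has closed.
raw_signal = (
    (toy_open.shift() > toy_close.shift())
    & (toy_open < toy_close.shift())
    & (toy_close > toy_open.shift())
)

# Delay the signal by one bar so the order lands on the following candle.
delayed_signal = raw_signal.shift(1, fill_value=False)

print(raw_signal.tolist())      # [False, True, False, False]
print(delayed_signal.tolist())  # [False, False, True, False]
```

The pattern completes on the second bar, but the trade is only taken on the third, which is the earliest point at which the signal could have been acted on.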

Now we will run the test. For this, I’m using a package called vectorbt, which lets us run the whole thing far more efficiently. There is an option to set the commission rate, but I will not set one, because we want to see the performance without the effect of settlement delay and commissions. As for the initial capital, I’ll set it to BDT 100k just for the sake of it.

Note

DSE has a T+2 day settlement period for A and B category stocks and a T+3 day settlement period for N and Z category stocks. That aspect will not be simulated in this particular test.

pf = vbt.Portfolio.from_signals(close_prices, entries, exits, init_cash=100_000)

Results

Let’s look at the results. First, let’s look at the overall performance across all stocks. The stats will be aggregated by taking the average across all stocks.

pf.stats(metrics=["start_value", "end_value",
                 "total_return", "benchmark_return",
                 "max_dd", "win_rate",
                 "avg_winning_trade", "avg_losing_trade"])
Start Value              100000.000000
End Value                 96563.376943
Total Return [%]             -3.436623
Benchmark Return [%]        -14.183825
Max Drawdown [%]             19.003600
Win Rate [%]                 27.740575
Avg Winning Trade [%]        10.171921
Avg Losing Trade [%]         -8.536604
Name: agg_func_mean, dtype: float64

The -3.4% return does look good compared with the buy & hold return of -14.18%, but it’s still negative. The win rate of 27.7% isn’t compelling either, because the average winning trade (10.17%) is not much larger than the average losing trade (-8.54%). That gap is too small to make up for losing more than 70% of the trades.

Anyway, let’s see what the distribution of the returns looks like.
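A rough back-of-the-envelope check of that reasoning is sketched below. It plugs the aggregated numbers from the stats above into a simple expectancy formula. Keep in mind that these figures are averages of per-stock statistics, so combining them this way is only an approximation and not something vectorbt reports.

```python
# Aggregated figures copied from the stats output above.
win_rate = 27.740575 / 100
avg_win = 10.171921    # average winning trade, in percent
avg_loss = -8.536604   # average losing trade, in percent

# Approximate expected return per trade, in percent.
expected_return_per_trade = win_rate * avg_win + (1 - win_rate) * avg_loss

# Win rate at which wins and losses of these sizes would break even.
breakeven_win_rate = -avg_loss / (avg_win - avg_loss)

print(f"Expected return per trade: {expected_return_per_trade:.2f}%")
print(f"Break-even win rate: {breakeven_win_rate:.1%}")
```

Under these assumptions the expectation comes out at roughly -3.3% per trade, and the win rate would need to be near 46% just to break even.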

pf.total_return().vbt.histplot().show()

Distribution of returns

As we can see, most of the stocks have a negative return. However, there’s an outlier, “KBPPWBIL”, sitting at the far right of the chart at 381.5%. Let’s take a closer look at its stats.

pf.stats(column=("KBPPWBIL"))
Start                         2023-05-25 00:00:00
End                           2024-05-23 00:00:00
Period                          241 days 00:00:00
Start Value                              100000.0
End Value                                481508.6
Total Return [%]                         381.5086
Benchmark Return [%]                  1311.278195
Max Gross Exposure [%]                  99.992319
Total Fees Paid                               0.0
Max Drawdown [%]                        26.712969
Max Drawdown Duration            59 days 00:00:00
Total Trades                                    2
Total Closed Trades                             1
Total Open Trades                               1
Open Trade PnL                           301900.5
Win Rate [%]                                100.0
Best Trade [%]                          79.619565
Worst Trade [%]                         79.619565
Avg Winning Trade [%]                   79.619565
Avg Losing Trade [%]                          NaN
Avg Winning Trade Duration       16 days 00:00:00
Avg Losing Trade Duration                     NaT
Profit Factor                                 inf
Expectancy                                79608.1
Sharpe Ratio                             4.557742
Calmar Ratio                             36.72315
Omega Ratio                              2.878554
Sortino Ratio                           13.170218
Name: KBPPWBIL, dtype: object

Looking only at the 381.5% return would be a bad idea, because this particular stock, “KBPPWBIL”, somehow has a buy & hold return of about 1,311%, a gain of roughly 13 times the initial capital. Taking that into consideration, 381.5% is actually quite a bad performance. Not to mention, this specific stock is way out of the norm.
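To put the two percentages on the same footing, the sketch below converts them into ending capital for the BDT 100k starting value. The inputs are copied from the stats above; the variable names are only illustrative.

```python
start_value = 100_000

# Returns for KBPPWBIL, in percent, copied from the stats output above.
strategy_return_pct = 381.5086
buy_hold_return_pct = 1311.278195

strategy_end = start_value * (1 + strategy_return_pct / 100)
buy_hold_end = start_value * (1 + buy_hold_return_pct / 100)

print(f"Strategy end value:   {strategy_end:,.1f}")
print(f"Buy & hold end value: {buy_hold_end:,.1f}")
print(f"Strategy captured {strategy_end / buy_hold_end:.0%} of the buy & hold end value")
```

The strategy’s figure matches the reported End Value of 481,508.6, while simply holding the stock would have ended at about 1.41 million.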

Here’s the plot of the trades and the equity curve:

pf.plot(column=("KBPPWBIL"), subplots=["trades", "cum_returns"]).show()

I will not draw any conclusions as to whether candlestick patterns work. That would require much, much more thorough research. But this specific pattern doesn’t seem to perform all that well on this set of data.