Feature Importance in Option Pricing with Captum
Deep neural networks map a set of features through layers of many neurons to predict a target value. Many applications of deep neural networks care only about the accuracy of the prediction, not about how the features relate to the target. Other problems, however, are concerned with how each feature affects the target.
For example, you may be predicting default on auto loans and want to know whether debt-to-income levels affect the probability of default. If they do, you may change your auto loan origination methods. A logistic regression gives you a prediction of default, along with an estimate of how much a given increase in debt-to-income raises the probability of default. A deep neural network alone gives you only a prediction. However, research in feature attribution, and the Captum library, aim to change that.
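To make the logistic regression claim concrete, consider the standard one-feature case. The symbols are introduced here for illustration only: \(x\) stands for debt-to-income, \(p(x)\) for the probability of default, and \(\beta_0, \beta_1\) for the fitted coefficients.

\[p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}, \qquad \frac{\partial p}{\partial x} = \beta_1 \, p(x) \, \bigl(1 - p(x)\bigr)\]

The sign and size of \(\beta_1\) therefore describe directly how the default probability responds to debt-to-income. A deep neural network has no single coefficient that can be read this way.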
In these notes we use a different example of the need for feature attribution: option pricing. When using option pricing models (like Black-Scholes [1]), the option value is only one of the important outputs of the model. The model also yields the Greeks, which measure how sensitive the option price is to the input values. For example, an option's Delta tells how much the option's price will change for a given change in the underlying's price.
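As a small illustration of what a Greek is, the sketch below computes the Delta of a European call under Black-Scholes in two ways: from the closed-form expression and by bumping the underlying's price and repricing. The function names and the input values are made up for this example and do not come from the FTR data set.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_d1(spot, strike, rate, vol, maturity):
    """The d1 term shared by the Black-Scholes price and Delta."""
    return (math.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (
        vol * math.sqrt(maturity)
    )


def bs_call_price(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = bs_d1(spot, strike, rate, vol, maturity)
    d2 = d1 - vol * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def bs_call_delta(spot, strike, rate, vol, maturity):
    """Closed-form Delta: the derivative of the call price with respect to spot."""
    return norm_cdf(bs_d1(spot, strike, rate, vol, maturity))


# Illustrative inputs (assumptions, not taken from the source data).
spot, strike, rate, vol, maturity = 100.0, 100.0, 0.01, 0.2, 1.0

analytic_delta = bs_call_delta(spot, strike, rate, vol, maturity)

# Central finite difference: reprice after bumping the spot up and down.
bump = 1e-4
fd_delta = (
    bs_call_price(spot + bump, strike, rate, vol, maturity)
    - bs_call_price(spot - bump, strike, rate, vol, maturity)
) / (2.0 * bump)

print(f"analytic Delta:          {analytic_delta:.6f}")
print(f"finite-difference Delta: {fd_delta:.6f}")
```

The two numbers should agree closely. Feature attribution for a neural network pursues the same idea where no closed-form derivative is available.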
Here we use deep learning to value an interesting type of option: a Financial Transmission Right (FTR) option. An FTR pays the difference between the congestion prices at two points on the electricity grid. An FTR option pays that difference if it is positive, and $0 otherwise. To simulate FTR option values, we simulate electricity price processes at two points on the grid by generating two correlated processes, each of which follows the stochastic differential equation:
\[dE_t = \kappa(\mu - E_t)dt + \sigma E_t dB_t\]
where:
- \(E_t\) is the electricity price at a given node
- \(\kappa\) is the rate of mean reversion
- \(\mu\) is the mean electricity price at the node
- \(\sigma\) is the price volatility at the node
- \(B_t\) is a standard Brownian motion, correlated between the two nodes
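The process above can be sketched with a simple Euler-Maruyama discretization. This is a minimal pure-Python illustration, not the simulation that produced the data below. The source does not state the time step, the horizon, the direction of the spread, or the discounting convention, so the choices here are assumptions: a payoff of max(E1 - E2, 0) on the terminal prices, a unit time step, and discounting at `risk_free`. The argument names mirror the column names of the data set, and the function name is hypothetical.

```python
import math
import random


def simulate_ftr_option_value(
    e1_start, e2_start, mu1, mu2, k1, k2, vol1, vol2, corr,
    risk_free=0.0, n_steps=24, dt=1.0, n_paths=2000, seed=0,
):
    """Monte Carlo estimate of an FTR option value under the assumed conventions."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    total_payoff = 0.0
    for _ in range(n_paths):
        e1, e2 = e1_start, e2_start
        for _ in range(n_steps):
            # Two standard normal shocks with correlation `corr`.
            z1 = rng.gauss(0.0, 1.0)
            z2 = corr * z1 + math.sqrt(1.0 - corr**2) * rng.gauss(0.0, 1.0)
            # Euler-Maruyama step of dE = k (mu - E) dt + vol E dB.
            e1 += k1 * (mu1 - e1) * dt + vol1 * e1 * sqrt_dt * z1
            e2 += k2 * (mu2 - e2) * dt + vol2 * e2 * sqrt_dt * z2
        # Assumed payoff: the terminal spread if positive, otherwise zero.
        total_payoff += max(e1 - e2, 0.0)
    discount = math.exp(-risk_free * n_steps * dt)
    return discount * total_payoff / n_paths


# Illustrative parameters (assumptions, not a row of the data set).
value = simulate_ftr_option_value(
    e1_start=30.0, e2_start=28.0, mu1=35.0, mu2=25.0,
    k1=0.5, k2=0.5, vol1=0.03, vol2=0.05, corr=0.3,
    risk_free=0.0002,
)
print(f"simulated FTR option value: {value:.3f}")
```

Because the conventions are guesses, this sketch should not be expected to reproduce the `option_value` column in the data.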
Import data from Monte Carlo simulations:
     risk_free      vol1      vol2       corr        k1        k2   e1_start   e2_start        mu1        mu2  option_value
0     0.000246  0.022983  0.051694  -0.275666  0.547739  0.586659  26.736341  27.136899  25.858735  34.003442      6.797245
1     0.000058  0.029828  0.005129  -0.022649  0.720233  0.207708  33.555252  26.736119  29.976300  33.063499      5.440954
2     0.000321  0.053443  0.020939   0.391795  0.308738  0.046923  31.241577  27.143763  33.058117  27.189379     10.974798
3     0.000130  0.033620  0.016250  -0.003000  0.654085  0.382512  34.044292  33.778042  34.696238  25.894681      8.243242
4     0.000210  0.057207  0.021190   0.484918  0.575378  0.668296  30.335520  28.391513  25.502045  32.433721      6.613631
..         ...       ...       ...        ...       ...       ...        ...        ...        ...        ...           ...
995   0.000229  0.024420  0.032109   0.168135  0.370507  0.621441  26.101353  34.332895  28.746192  32.059997      3.265470
996   0.000084  0.006779  0.007712  -0.466865  0.243271  0.127567  30.873648  32.967921  34.333053  26.092348      2.313836
997   0.000076  0.015852  0.042681   0.262823  0.070187  0.369278  33.930473  34.468095  27.403043  33.487219      8.132854
998   0.000142  0.059682  0.062720  -0.049992  0.602649  0.508839  31.113604  30.427855  31.950044  32.877230     12.834470
999   0.000111  0.047834  0.032335  -0.123484  0.526323  0.366141  25.256368  28.618138  25.730874  25.107557      7.182105

[9000 rows x 11 columns]
1. Model
Here we define the model and import trained weights.
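import torch
from torch import nn

# Fully connected network: 10 input features, five hidden layers of 400 units, 1 output.
model = nn.Sequential(
    nn.Linear(10, 400),
    nn.ReLU(),
    nn.Linear(400, 400),
    nn.ReLU(),
    nn.Linear(400, 400),
    nn.ReLU(),
    nn.Linear(400, 400),
    nn.ReLU(),
    nn.Linear(400, 400),
    nn.ReLU(),
    nn.Linear(400, 1)
)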
Now we feed in the weights from a previously trained model:
model.load_state_dict(torch.load("./FTR_model_july_27_4000_inputs.pt"))
<All keys matched successfully>
Here are the model's predictions versus the actual option values.
Figure 1: Prediction Error
2. Feature Attribution
Figure 2: Feature Importance
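The source does not say which attribution method produced Figure 2. As one possibility, the sketch below hand-rolls integrated gradients in plain PyTorch, the method that Captum provides as `captum.attr.IntegratedGradients`. Everything in it is an assumption made for illustration: the small stand-in network has random weights instead of the trained FTR model, the inputs are random placeholder rows, the baseline is all zeros, and the feature order is assumed to match the data columns.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in network with the same 10 inputs and 1 output as the model above.
# Its weights are random, so the attributions are only illustrative.
demo_model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
demo_model.eval()

# Assumed feature order, taken from the column names of the data set.
feature_names = ["risk_free", "vol1", "vol2", "corr", "k1", "k2",
                 "e1_start", "e2_start", "mu1", "mu2"]


def integrated_gradients(model, inputs, baseline, n_steps=300):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    # Interpolation weights at the midpoints of n_steps equal slices of [0, 1].
    alphas = (torch.arange(n_steps, dtype=inputs.dtype) + 0.5) / n_steps
    grad_sum = torch.zeros_like(inputs)
    for alpha in alphas:
        # Point on the straight line from the baseline to the input.
        point = (baseline + alpha * (inputs - baseline)).detach().requires_grad_(True)
        # Rows are independent, so summing the outputs yields per-row gradients.
        output = model(point).sum()
        (grad,) = torch.autograd.grad(output, point)
        grad_sum += grad
    # Average gradient along the path, scaled by the input displacement.
    return (inputs - baseline) * grad_sum / n_steps


inputs = torch.rand(5, 10)            # placeholder rows, not the FTR data
baseline = torch.zeros_like(inputs)   # all-zeros reference point (an assumption)
attributions = integrated_gradients(demo_model, inputs, baseline)

# One simple importance score per feature: mean absolute attribution.
importance = attributions.abs().mean(dim=0)
for name, score in zip(feature_names, importance.tolist()):
    print(f"{name:>10s}  {score:.4f}")
```

A useful sanity check is that each row's attributions sum approximately to the difference between the model output at the input and at the baseline. With the trained model and real feature rows in place of the stand-ins, the same scores could be plotted as a feature importance chart.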
Footnotes:
[1] Interestingly, one of the first uses of neural networks in finance was to learn the Black-Scholes model from generated prices.