Replies: 5 comments
-
Hi @cf4869, this is related to #230 and #231. Rather than using discrete valuations, try fitting valuation as an ordinal variable:

```python
model = cl.BarnettZehnwirth(formula='C(origin)+C(development)+valuation').fit(abc)
```

Here the regression knows to extrapolate beyond the end of the triangle since it sees valuation as an ordinal variable. I would like to see …
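A quick illustration of the point above (plain numpy/sklearn, not the chainladder internals, with made-up data): an ordinal predictor yields a slope that extrapolates to unseen periods, whereas a one-hot (categorical) encoding has no column for a future period at all.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical log-scale data with a constant 5% trend per valuation period.
valuation = np.arange(8).reshape(-1, 1)   # ordinal encoding: 0, 1, ..., 7
y = 1.0 + 0.05 * valuation.ravel()

# Ordinal fit: one slope coefficient, usable beyond the training range.
ordinal = LinearRegression().fit(valuation, y)
future = ordinal.predict([[10]])          # 1.0 + 0.05 * 10 = 1.5

# Categorical (one-hot) fit: one coefficient per observed period, and no
# column exists for period 10, so there is nothing to extrapolate with.
one_hot = np.eye(8)[valuation.ravel()]
categorical = LinearRegression().fit(one_hot, y)
```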
-
Hi @jbogaardt, thanks for the answer. So in that case the signal should be picked up by using valuation as one of the parameters, but we still have non-random residuals on the valuation-date graph. And if we fit only ordinal variables, we have non-random residuals on all of them. Does this make sense?
-
I'll have to dust off the paper; it's been a while and my understanding is a little hazy. Fitting features as ordinal/continuous rather than strictly categorical, you get a single regression coefficient for that axis. For example, this model has three coefficients (plus intercept):

```python
import chainladder as cl

abc = cl.load_sample('abc')
cl.BarnettZehnwirth(formula='origin+development+valuation').fit(abc).coef_
```

A single coefficient need not cover an entire axis, though. Hypothetically you could choose one coefficient for the 3 oldest origin years, another coefficient for origins 4 and 5, and a final coefficient for origin years 6 and later. This is how that would look:

```python
cl.BarnettZehnwirth(
    formula='C(np.where(origin<=2, 0, np.where(origin<5,1,2)))+development+valuation'
).fit(abc).coef_
```

Actually, getting back to your original issue, you could create a model that uses discrete valuations and just extrapolates future valuations from the last available:

```python
cl.BarnettZehnwirth(
    formula='C(origin)+C(development)+C(np.minimum(valuation, 9))'
).fit(abc).coef_
```

The point I am trying to make in all this is that the residual analysis gives you insight into how you should structure your formula; it's not guaranteed to be random for any particular formula.
-
Hi @jbogaardt, thanks for clarifying; it appears that using discrete valuations completely absorbs the signal, resulting in random residuals in all directions. However, because of the multicollinearity, the model may be overparameterized in this case. As a result, fitting fewer parameters by combining a few levels could be a viable option. So, aside from grouping levels, is it possible to fix a few parameters in the fitted model? For example, if we know the origin trend is 2% prior to 1979 and 5% after, how do we feed this information into the model so that we only need to fit development and valuation?
-
Do you mean to insert offsets for specific parameters rather than fitting the parameters from the data? Unfortunately, no. Under the hood, the regression is carried out by `sklearn.linear_model.LinearRegression`, which doesn't support offset parameters. The `statsmodels` GLM implementation does support offsets, but …
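Although `LinearRegression` has no offset argument, the usual workaround with any least-squares fitter is to subtract the known component from the response and fit the remainder. A minimal sketch with made-up data, assuming a 2% origin trend is known a priori and only the valuation trend must be estimated (this is outside chainladder's API, so it would require extracting the design matrix yourself):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
origin = rng.integers(0, 7, size=50)
valuation = rng.integers(0, 7, size=50)
y = 1.0 + 0.02 * origin + 0.05 * valuation  # noiseless "true" model

offset = 0.02 * origin                       # the known part, held fixed
model = LinearRegression().fit(valuation.reshape(-1, 1), y - offset)
# model.coef_[0] recovers the 0.05 valuation trend
```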
-
```python
import chainladder as cl

abc = cl.load_sample('abc')
len(abc.origin)
len(abc.development)
len(abc.valuation)
model = cl.BarnettZehnwirth(formula='C(origin)+C(development)+C(valuation)').fit(abc)
```