Skip to content

Fix handling of zero weights#638

Open
nalimilan wants to merge 2 commits intomasterfrom
nl/zerowts
Open

Fix handling of zero weights#638
nalimilan wants to merge 2 commits intomasterfrom
nl/zerowts

Conversation

@nalimilan
Copy link
Member

Observations with zero weights have to be skipped for probability and analytic weights. Values have been checked against R. Avoid duplication by using a single method for all LinPredModels (though this means we cannot call nobs from functions which don't get passed the full model).

Also fix name of internal function link, which actually returns the distribution.

Improves on #487. Cc: @gragusa

Observations with zero weights have to be skipped for probability and analytic weights.
Values have been checked against R. Avoid duplication by using a single method
for all `LinPredModel`s (though this means we cannot call `nobs` from functions
which don't get passed the full model).

Also fix name of internal function `link`, which actually returns the distribution.
@test dof_residual(lmod) == dof_residual(glmod) == sum(!iszero, wts) - 2
@test deviance(lmod) ≈ deviance(glmod) ≈ 24.500813008130084
@test coef(lmod) ≈ coef(glmod) ≈ [3.6707317073170724, 0.20731707317073195]
@test stderror(lmod) ≈ stderror(glmod) ≈ [1.7617126628715203, 0.23455696986842048]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gragusa I get slightly different standard errors with svyglm. Is this expected? Generally in my tests I got exactly the same values up to a least four decimals.

> X = 1:10
> y = c(1, 4, 6, 2, 3, 5, 6, 7, 1, 6)
> wts = c(1, 0, 4, 0, 5, 2, 6, 4, 2, 6)
> svyd <- svydesign(~1, weights=wts, data=data.frame(X=X, y=y, wts=wts))

> summary(svyglm(y ~ X, svyd))

Call:
svyglm(formula = y ~ X, design = svyd)

Survey design:
svydesign(~1, weights = wts, data = data.frame(X = X, y = y, 
    wts = wts))

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   3.6707     1.7371   2.113    0.079 .
X             0.2073     0.2313   0.896    0.405  
---
Signif. codes:  0***0.001**0.01*0.05.0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 3.500116)

@nalimilan nalimilan added the bug label Jan 11, 2026
@nalimilan nalimilan added this to the Release 2.0 milestone Jan 11, 2026
@codecov
Copy link

codecov bot commented Jan 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.96%. Comparing base (17a568c) to head (add9660).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #638      +/-   ##
==========================================
+ Coverage   96.94%   96.96%   +0.01%     
==========================================
  Files           8        8              
  Lines        1213     1219       +6     
==========================================
+ Hits         1176     1182       +6     
  Misses         37       37              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

src/glmfit.jl Outdated
wts = weights(r)
d = sum(r.devresid)
return wts isa ProbabilityWeights ? d * nobs(r) / sum(wts) : d
return wts isa ProbabilityWeights ? d * sum(!iszero, wts) / sum(wts) : d
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this change unnecessary since nobs is fixed below?

Suggested change
return wts isa ProbabilityWeights ? d * sum(!iszero, wts) / sum(wts) : d
return wts isa ProbabilityWeights ? d * nobs(r) / sum(wts) : d

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was because I defined the method only on LinPredModel and not on the response object. But I've discovered that we have ModResp that covers both LMs and GLMs so I've changed the method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants