Fix autobatch size issue #1073
Conversation
…items, plus updated the logic in generation size to respect what the user asks
```python
if config.model_parallel is False and self.config.dtype not in ["4bit", "8bit"]:
    logger.info(f"Using Data Parallelism, putting model on device {self._device}")
    self.model = self.model.to(self._device)
if config.compile:
```
duplicate code, already exists in _create_auto_model
```python
)
# model.to(self.device)
model.eval()
torch.set_grad_enabled(False)
```
set at the module level
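A minimal sketch of that suggestion, assuming the model-creation helper lives in an inference-only module (create_model and the transformers import are illustrative, not the PR's actual code):

```python
import torch

# Module level: this file only ever runs inference, so gradients are disabled
# once here instead of inside every model-creation helper.
torch.set_grad_enabled(False)


def create_model(name: str):
    from transformers import AutoModelForCausalLM  # assumed dependency

    model = AutoModelForCausalLM.from_pretrained(name)
    model.eval()
    # No per-model torch.set_grad_enabled(False) call needed anymore.
    return model
```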
```python
        continuation = continuation.lstrip()
        return continuation

    def _model_call(self, inputs: torch.Tensor) -> torch.Tensor:
```
legacy function
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@pcuenca this should fix the issue you had with autobatch size, can you take a look? I'm not sure it's 100% perfect, I'm still seeing some memory that isn't deallocated in the model, but I suspect it should already be helpful for your use case.
The PR does 2 main things:
- calls del on the created objects to force the memory release of attached resources
- updates the logic in generation size to respect what the user asks for

The rest of the modifs are nits (duplicated code / legacy functions that I removed); they could go in another PR but they were thematically linked.
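For illustration, a minimal sketch of both ideas under stated assumptions: run_once, batch_size_from_user and detected are hypothetical names, and transformers is only used to make the example self-contained; this is not the PR's actual implementation.

```python
import gc

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def run_once(model_name: str, batch_size_from_user: int | None) -> None:
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Respect what the user asks for: only fall back to an auto-detected
    # value (placeholder here) when no explicit batch size was requested.
    detected = 32
    batch_size = batch_size_from_user if batch_size_from_user else detected
    print(f"running generation with batch_size={batch_size}")
    # ... run generation / evaluation with `batch_size` ...

    # Drop the last references so Python can free the attached resources
    # (weights, tokenizer buffers), then release the cached GPU blocks.
    del model
    del tokenizer
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```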