text-splitters: Set strict mypy rules #30900

Open: wants to merge 1 commit into master

@cbornet (Collaborator) commented Apr 17, 2025

  • Add strict mypy rules
  • Fix mypy violations
  • Add error codes to all type ignores
  • Add ruff rule PGH003
  • Bump mypy version to 1.15
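
As a rough illustration of the "error codes on all type ignores" item: PGH003 flags a blanket `# type: ignore` and accepts one that names an error code. The sketch below mimics that distinction with a simplified regular expression. The pattern and the `find_blanket_ignores` helper are assumptions made for this example, not ruff's actual implementation.

```python
import re

# Simplified pattern (an assumption for this sketch): a "type: ignore"
# comment that is NOT immediately followed by "[error-code]".
BLANKET_IGNORE = re.compile(r"#\s*type:\s*ignore(?!\[)")


def find_blanket_ignores(source: str) -> list[int]:
    """Return 1-based line numbers that carry a blanket type ignore."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if BLANKET_IGNORE.search(line)
    ]


# Hypothetical source text: line 1 would be flagged, line 2 would pass.
SAMPLE = (
    "a = load()  # type: ignore\n"
    "b = load()  # type: ignore[no-any-return]\n"
    "c = 1\n"
)

print(find_blanket_ignores(SAMPLE))
```

Naming the error code keeps each ignore narrow: if a different error later appears on the same line, mypy still reports it.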

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Ɑ: text splitters Related to text splitters package 🤖:nit Small modifications/deletions, fixes, deps or improvements to existing code or docs labels Apr 17, 2025

@sydney-runkle (Collaborator) left a comment

Thanks!

@@ -70,7 +74,7 @@ ignore_missing_imports = "True"
 target-version = "py39"
 
 [tool.ruff.lint]
-select = ["E", "F", "I", "T201", "D"]
+select = ["E", "F", "I", "PGH003", "T201", "D"]

Collaborator

Can we add UP (pyupgrade)? Reminded of this bc of the List -> list changes.

Collaborator

Can be done in a different PR

Collaborator Author

Yes, can be done in another PR
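
For context on the "List -> list" remark in this thread: with a py39 target, the pyupgrade rules (for example UP006) would rewrite `typing.List[...]` annotations to the built-in generic `list[...]`. A minimal before/after sketch, using hypothetical function names that are not part of the package:

```python
from typing import List


# Before: typing.List, the spelling pyupgrade would flag on Python 3.9+.
def split_words_old(text: str) -> List[str]:
    return text.split()


# After: the built-in generic, available since Python 3.9 (PEP 585).
def split_words_new(text: str) -> list[str]:
    return text.split()


# The change is annotation-only; runtime behavior is identical.
print(split_words_old("a b") == split_words_new("a b"))
```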

 token_ids_with_start_and_end_token_ids = self.tokenizer.encode(
     text,
     max_length=self._max_length_equal_32_bit_integer,
     truncation="do_not_truncate",
 )
-return token_ids_with_start_and_end_token_ids
+return cast("list[int]", token_ids_with_start_and_end_token_ids)

Collaborator

Surprised we have to cast this

Collaborator Author

It's because mypy can't infer the type, so it falls back to Any, and strict mode rejects returning Any from a method with a declared return type.
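
A minimal sketch of the situation described here, assuming a hypothetical `FakeTokenizer` whose `encode` is typed as returning `Any` (a stand-in for an untyped third-party tokenizer, not the real library API). Under strict mypy, returning that value from a function annotated `-> list[int]` is reported as `no-any-return`; `cast` states the intended type without changing runtime behavior.

```python
from typing import Any, cast


class FakeTokenizer:
    """Hypothetical stand-in for a tokenizer whose encode() is untyped."""

    def encode(self, text: str) -> Any:
        # Illustrative only: one "token id" per character.
        return [ord(ch) for ch in text]


def encode_text(tokenizer: FakeTokenizer, text: str) -> list[int]:
    token_ids = tokenizer.encode(text)  # mypy sees this as Any
    # "return token_ids" would be flagged under strict mode, because the
    # function promises list[int] but the value is Any.
    # cast() is a no-op at runtime; it only informs the type checker.
    return cast("list[int]", token_ids)


print(encode_text(FakeTokenizer(), "ab"))
```

Quoting the type in `cast("list[int]", ...)` means the type expression is not evaluated at runtime. Since `cast` performs no check, it is only as trustworthy as the assumption about what `encode` actually returns.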

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Apr 17, 2025
@sydney-runkle (Collaborator)

Happy to merge once CI is happy

@cbornet cbornet force-pushed the text-splitters-mypy branch 2 times, most recently from 4a5e08f to 7f762cf Compare April 18, 2025 09:44
@cbornet cbornet force-pushed the text-splitters-mypy branch from 7f762cf to 748e778 Compare April 18, 2025 09:54
@cbornet (Collaborator, Author) commented Apr 18, 2025

> Happy to merge once CI is happy

It now is 😃
