Better handling of treated input in RegressionDiscontinuity #450
Hi @drbenvincent, I’ve opened this PR to address #440. Please review when you have a moment. Thanks!
Pull Request Overview
This PR improves the input validation for the treated column in RegressionDiscontinuity to ensure it is of boolean type. It adds a validation function and corresponding tests to provide clearer error messages and prevent misuse of integer-coded treatments.
- Added a data type check using pandas for the treated column.
- Introduced tests to verify that a ValueError is raised when non-boolean data is provided.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
causalpy/experiments/test_treated_column_valid.py | Added tests to validate that the treated column is boolean. |
causalpy/experiments/regression_discontinuity.py | Updated input validation to enforce that the treated column is of boolean type. |
Comments suppressed due to low confidence (2)
causalpy/experiments/regression_discontinuity.py:193
- Consider using a more robust check (e.g., using pd.api.types.is_bool_dtype) for the treated column instead of comparing dtypes to the string 'bool' to ensure consistency with the test validation function.
if not self.data['treated'].dtype == 'bool':
causalpy/experiments/regression_discontinuity.py:194
- There is a minor typographical issue in the error message: add a space after 'bool.' to improve readability.
raise ValueError("The 'treated' column must be of type bool.Please convert your data accordingly.")
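The two suppressed suggestions above can be combined into one check. The following is a minimal sketch, not the code in this PR: `validate_treated_column` is a hypothetical standalone helper, and only `pandas.api.types.is_bool_dtype` is a real pandas function. It swaps the string comparison for `is_bool_dtype` and adds the missing space to the error message.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype


def validate_treated_column(data: pd.DataFrame, column: str = "treated") -> None:
    """Hypothetical helper: raise ValueError unless `column` has a boolean dtype."""
    if not is_bool_dtype(data[column]):
        raise ValueError(
            f"The '{column}' column must be of type bool. "
            "Please convert your data accordingly."
        )


# A boolean column passes silently.
ok = pd.DataFrame({"treated": [True, False, True]})
validate_treated_column(ok)

# An integer-coded column is rejected with a readable message.
bad = pd.DataFrame({"treated": [1, 0, 1]})
try:
    validate_treated_column(bad)
except ValueError as err:
    message = str(err)
```

One practical difference from comparing against the string 'bool': `is_bool_dtype` also accepts pandas' nullable `boolean` extension dtype, which may or may not be desirable here.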
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
- Coverage   94.40%   93.60%   -0.80%
==========================================
  Files          31       32       +1
  Lines        1985     2003      +18
==========================================
+ Hits         1874     1875       +1
- Misses        111      128      +17
After reviewing the PR, I noticed that the new tests in `test_treated_column_valid.py` don’t directly test the `input_validation` method in the `RegressionDiscontinuity` class. Instead, they test a separate helper function that mimics some of the validation logic. While this is helpful, it would be more beneficial to test the actual `input_validation` method to ensure the validation works as expected within the context of the class.
Would you mind updating the tests to directly call `input_validation`? This way, we can ensure the validation logic is fully covered in the real implementation.
Let me know if you have any questions or need guidance on how to proceed. Thanks for your hard work. I’m looking forward to seeing the updated tests!
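To illustrate the shape of test being asked for, here is a rough sketch that exercises the validation through the class rather than through a copied helper. `ToyExperiment` is a hypothetical stand-in, not CausalPy's `RegressionDiscontinuity`, and its constructor signature is an assumption made only so the example runs on its own. In the real test suite the same assertions would target the actual class, typically with `pytest.raises`.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype


class ToyExperiment:
    """Hypothetical stand-in for the experiment class; not CausalPy's code."""

    def __init__(self, data: pd.DataFrame) -> None:
        self.data = data
        # Validation runs as part of construction, so tests go through the class.
        self.input_validation()

    def input_validation(self) -> None:
        has_treated = "treated" in self.data.columns
        if has_treated and not is_bool_dtype(self.data["treated"]):
            raise ValueError("The 'treated' column must be of type bool.")


def raises_value_error(data: pd.DataFrame) -> bool:
    """Return True if constructing the experiment raises ValueError."""
    try:
        ToyExperiment(data)
    except ValueError:
        return True
    return False


int_coded = pd.DataFrame({"x": [0.1, 0.6], "treated": [0, 1]})
bool_coded = pd.DataFrame({"x": [0.1, 0.6], "treated": [False, True]})

# Integer-coded treatment must be rejected; boolean treatment must pass.
assert raises_value_error(int_coded)
assert not raises_value_error(bool_coded)
```

Because the check is reached via the constructor, a regression in the class's own validation would fail these tests, which a test of a separate helper cannot guarantee.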
This PR addresses issue #440 by improving the handling of the treated column in RegressionDiscontinuity.
Added a data validation step to ensure the treated column is of boolean type.
Included a test (test_treated_column_valid.py) to check that an exception is raised when the treated column contains integers instead of booleans.
This ensures clearer error messages and prevents mysterious errors when using integer-coded treatments.
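For users who hit the new error with 0/1-coded data, the conversion is a one-line cast. This is a small sketch with made-up data. The column names `x` and `treated` and the guard against non-0/1 codes are assumptions for illustration, not part of this PR.

```python
import pandas as pd

# Illustrative data with an integer-coded treatment indicator.
df = pd.DataFrame({"x": [0.2, 0.4, 0.6, 0.8], "treated": [0, 0, 1, 1]})

# Guard against values other than 0/1 before casting, since astype(bool)
# would silently map any non-zero value to True.
if not df["treated"].isin([0, 1]).all():
    raise ValueError("Expected only 0/1 codes in 'treated'.")

df["treated"] = df["treated"].astype(bool)
```

After the cast, the `treated` column has a boolean dtype and would satisfy the check described above.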
📚 Documentation preview 📚: https://causalpy--450.org.readthedocs.build/en/450/