Normalize and store email addresses #17925

Open
miketheman opened this issue Apr 8, 2025 · 0 comments · May be fixed by #17946

Email addresses are currently stored in a varchar(254) column in the database, with non-null and unique constraints.

However, because some email providers allow extra characters in addresses (for example, a +tag suffix), and because the column is not citext (case-insensitive), the unique constraint does not stop us from storing values that are effectively duplicates.

Proposal:

  • add a new, empty normalized_email citext column to the Emails model
  • populate the column with the normalized value of each email address when an email is added
  • backfill existing records
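
To make the proposal concrete, here is a minimal sketch of what "normalized" could mean. The exact rules are not decided in this issue, so both rules here (lowercasing, and dropping a +tag suffix from the local part) are assumptions, and normalize_email is a hypothetical helper name, not existing code.

```python
def normalize_email(address: str) -> str:
    """Return a normalized form of an email address.

    Assumed rules (not settled in this issue):
      * lowercase the whole address
      * drop anything after the first "+" in the local part
    """
    # Split on the last "@" so the domain is always the final component.
    local, _, domain = address.strip().rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Keep the original local part if stripping the tag would leave it empty.
    local = local.split("+", 1)[0] or local
    return f"{local}@{domain}".lower()


print(normalize_email("User+pypi@Example.COM"))
```

The same function could be called both when an address is added and from the backfill, so old and new rows are normalized identically.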

This effort could be complemented by adding a domain column to the table, populated and backfilled in the same way as the normalized column, so that the data representation is clear and unambiguous and does not rely on string splitting.
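
As a rough sketch of the domain idea: the value would be derived once, at write time, and stored, so later queries never need to split strings. email_domain is a hypothetical helper name, and lowercasing the domain is an assumption.

```python
def email_domain(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    # Split on the last "@"; everything after it is the domain.
    _, sep, domain = address.strip().rpartition("@")
    if not sep or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower()


print(email_domain("User+pypi@Example.COM"))
```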

This effort should be preceded by some queries on the table to determine how many such duplicates we might expect to see.
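
One way to size the problem, sketched in Python rather than SQL: group addresses by their normalized form and keep only the groups with more than one member. The normalization rules and the sample addresses below are made up for illustration; a real check would run against the actual table.

```python
from collections import defaultdict


def normalize_email(address: str) -> str:
    """Assumed rules: lowercase, and drop a "+tag" suffix from the local part."""
    local, _, domain = address.strip().rpartition("@")
    local = local.split("+", 1)[0] or local
    return f"{local}@{domain}".lower()


def find_collisions(addresses):
    """Map each normalized address to the stored addresses that collapse into it."""
    groups = defaultdict(list)
    for address in addresses:
        groups[normalize_email(address)].append(address)
    # Only groups with more than one member are effective duplicates.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Invented sample data, not taken from the real table.
sample = [
    "user@example.com",
    "User@Example.com",
    "user+pypi@example.com",
    "other@example.org",
]
collisions = find_collisions(sample)
print(len(collisions), "normalized address(es) with duplicates")
```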
