How are community reports structured across hierarchy levels? #1850
Comments
Also, when generating community reports at the leaf level, the system includes the entity and relation IDs that each finding is based on, which allows for traceability. However, if the context window is exceeded and the entity and relation information is replaced at the intermediate level with a summary (i.e., the lower-level community report), does that mean this traceability is lost?
When generating a summary based on leaf-level community information, how were the edges selected? (Could you please answer this question based on your paper?)
This is correct. The community context building happens here: https://github.com/microsoft/graphrag/blob/main/graphrag/index/operations/summarize_communities/build_mixed_context.py
Note that sufficiently large contexts will end up truncated even with sub-summarization.

For community building, we only include edges that reside entirely within the community. We include all nodes. This happens here: https://github.com/microsoft/graphrag/blob/main/graphrag/index/workflows/create_communities.py
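To illustrate the edge rule stated above ("only edges that reside entirely within the community"), here is a minimal sketch. It is not the code from create_communities.py; the function name `intra_community_edges` and the tuple-based edge representation are assumptions made only for this example.

```python
from typing import Iterable


def intra_community_edges(
    edges: Iterable[tuple[str, str]],
    members: set[str],
) -> list[tuple[str, str]]:
    """Keep only edges whose source AND target both belong to the community.

    Hypothetical helper: edges that cross the community boundary are dropped,
    while every node in `members` is kept regardless of its edges.
    """
    return [(src, dst) for src, dst in edges if src in members and dst in members]


# Toy graph: the edge ("C", "D") leaves the community, so it is excluded.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
community = {"A", "B", "C"}
kept = intra_community_edges(edges, community)
```

In this toy example, "C" remains a member of the community even though one of its edges is filtered out, which matches the "we include all nodes" part of the answer.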
This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.
This issue has been closed after being marked as stale for five days. Please reopen if needed.
I was going through the code and had a question about the "From Local to Global" paper. From my understanding, the paper generates community reports at both the leaf level and higher (intermediate) levels.
For leaf-level communities, the report includes all the nodes and edges. At the intermediate levels, it includes all the nodes and edges from its sub-communities. However, if this information exceeds the context window, the system uses a summarized version (i.e., the community report of the lower-level community) instead.
Is this understanding correct? If so, could you tell me which file and prompt implement this behavior? And if the implementation has changed from the original paper, could you let me know how it works now?
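The fallback described in this question can be sketched roughly as follows. This is only an illustration of the idea, not the repository's implementation: the names `SubCommunity` and `sketch_mixed_context`, the whitespace-based token count, and the "largest sub-community first" substitution order are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SubCommunity:
    name: str
    local_context: str  # raw entity/relationship text (assumed format)
    report: str         # previously generated community report


def count_tokens(text: str) -> int:
    # Assumption: whitespace tokens stand in for a real tokenizer.
    return len(text.split())


def sketch_mixed_context(subs: list[SubCommunity], max_tokens: int) -> str:
    # Start with the raw local context of every sub-community.
    parts = {s.name: s.local_context for s in subs}

    def total() -> int:
        return sum(count_tokens(p) for p in parts.values())

    # While over budget, swap the largest raw contexts for their reports.
    by_size = sorted(subs, key=lambda s: count_tokens(s.local_context), reverse=True)
    for sub in by_size:
        if total() <= max_tokens:
            break
        parts[sub.name] = sub.report

    context = "\n".join(parts[s.name] for s in subs)

    # Even after substitution the context may not fit, so truncate.
    words = context.split()
    if len(words) > max_tokens:
        context = " ".join(words[:max_tokens])
    return context


subs = [
    SubCommunity("c1", "a b c d e f", "r1 summary"),
    SubCommunity("c2", "x y", "r2"),
]
mixed = sketch_mixed_context(subs, max_tokens=5)
```

With a budget of 5 tokens, only the larger sub-community is replaced by its report, so the result mixes one report with one raw context. With a very small budget, the final truncation step still applies.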