
[FEAT] summarized chat completion context #6217


Draft: wants to merge 23 commits into main.

Conversation

@SongChiYoung (Contributor) commented Apr 5, 2025

🧠 Summary

This PR introduces the structural design for a new context type: SummarizedChatCompletionContext.
It allows message summarization to be triggered within agent-local contexts, based on user-defined conditions — decoupling summarization from termination.

This draft focuses on infrastructure only: the new context class, message summarization condition interfaces, and logical composition (AND / OR) are all in place — but the system is not yet wired to active agents or tested in workflows.


✨ Motivation

In complex multi-agent systems (e.g., SocietyOfMindAgent or deeply nested teams), agent-internal messages often grow long and redundant.
These messages:

  • Leak into outer context evaluation
  • Pollute termination conditions
  • Waste tokens

To address this, summarization must happen before termination — inside the agent context itself.
This PR introduces a new kind of ChatCompletionContext that enables exactly that.

See: Discussion #6160


🧱 Key Components

1. SummarizedChatCompletionContext (New)

  • Subclass of ChatCompletionContext
  • Accepts:
    • summarizing_func: a function that takes messages and non_summarized_messages and returns a summary
    • summarizing_condition: a subclass of MessageCompletionCondition
  • Automatically triggers summary() when the condition is met:

    await self._summarizing_condition(self._messages)
    if self._summarizing_condition.triggered:
        await self.summary()

  • Can support async summarization later
2. MessageCompletionCondition Interface (New)

  • Abstract base class for defining summarization triggers
  • Tracks a .triggered state
  • Follows a call(messages) pattern
  • Includes a reset() method for reuse
  • Supports and / or composition via:
    • AndMessageCompletionCondition
    • OrMessageCompletionCondition
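A self-contained sketch of this interface may help; it is synchronous for brevity (the PR's conditions are awaited), uses plain strings in place of message objects, and its class and method details are assumptions that may differ from the actual implementation:

```python
from abc import ABC, abstractmethod
from typing import List


class MessageCompletionCondition(ABC):
    """Summarization trigger with a .triggered state and reset() for reuse."""

    def __init__(self) -> None:
        self._triggered = False

    @property
    def triggered(self) -> bool:
        return self._triggered

    @abstractmethod
    def __call__(self, messages: List[str]) -> None:
        """Inspect the message history and update the triggered flag."""

    def reset(self) -> None:
        self._triggered = False

    def __and__(self, other: "MessageCompletionCondition") -> "MessageCompletionCondition":
        return AndMessageCompletionCondition(self, other)

    def __or__(self, other: "MessageCompletionCondition") -> "MessageCompletionCondition":
        return OrMessageCompletionCondition(self, other)


class MaxMessageCompletion(MessageCompletionCondition):
    """Fires once the history reaches max_messages entries."""

    def __init__(self, max_messages: int) -> None:
        super().__init__()
        self._max_messages = max_messages

    def __call__(self, messages: List[str]) -> None:
        if len(messages) >= self._max_messages:
            self._triggered = True


class _Composite(MessageCompletionCondition):
    """Shared plumbing for AND / OR composition."""

    def __init__(self, *conditions: MessageCompletionCondition) -> None:
        super().__init__()
        self._conditions = conditions

    def reset(self) -> None:
        super().reset()
        for c in self._conditions:
            c.reset()


class AndMessageCompletionCondition(_Composite):
    def __call__(self, messages: List[str]) -> None:
        for c in self._conditions:
            c(messages)
        self._triggered = all(c.triggered for c in self._conditions)


class OrMessageCompletionCondition(_Composite):
    def __call__(self, messages: List[str]) -> None:
        for c in self._conditions:
            c(messages)
        self._triggered = any(c.triggered for c in self._conditions)


cond = MaxMessageCompletion(max_messages=3) | MaxMessageCompletion(max_messages=5)
cond(["m1", "m2", "m3"])
print(cond.triggered)  # True: the 3-message threshold was reached
cond.reset()
```

The composition operators return new condition objects, mirroring how termination conditions compose elsewhere in AutoGen.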

Related issue number

#6160

Checks

ToDo

  • Tools-style serialization of user-defined summary functions (was blocked; serialization is now working!)
  • pyright / mypy checks confirmed
  • Documentation (if we need more than docstrings)
  • Fix docstrings
  • Fix function/argument names (if needed)
  • Test code / coverage

@SongChiYoung (Contributor Author)

Just to add a bit of context:

  • This implementation heavily references and reuses the structure of existing termination logic.
  • While I believe this new context could potentially replace or unify several existing ChatCompletionContext variants, any such refactoring is out of scope for this PR and will not be addressed here.

@ekzhu (Collaborator) left a comment

Perhaps we can start with a concrete implementation of summarization? As a first version, an actual implementation would provide more value than a templated version with scaffolding like summarization_func and various conditions.

For example, a simple implementation that uses a model client to convert a list of LLMMessage into a single message is already very useful. It can be triggered for every 10 messages, for example.

A highly opinionated but concrete implementation gets feedback and usage, and users can pick it up quickly and provide feedback.
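The "summarize every N messages" idea can be sketched without any model client at all; maybe_summarize and the lambda summarizer below are illustrative placeholders, not part of any AutoGen API:

```python
from typing import Callable, List


def maybe_summarize(
    messages: List[str],
    interval: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Collapse the history into a single message once it reaches `interval` entries."""
    if len(messages) < interval:
        return messages
    return [summarize(messages)]


history = [f"msg{i}" for i in range(10)]
print(maybe_summarize(history, 10, lambda ms: f"summary of {len(ms)} messages"))
# → ['summary of 10 messages']
```

In the concrete version being proposed, the summarize callable would be a model-client call that condenses a list of LLMMessage into one message.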

@SongChiYoung (Contributor Author)

@ekzhu
Got it — that makes sense!

Just to confirm: is the current structure and file placement generally okay?

I actually have more progress locally — I'm currently working on porting the termination logic over (even if it's slightly imperfect for now).

My network is a bit unstable at the moment, so I’ll push everything once I get a better connection. Thanks!

@SongChiYoung (Contributor Author) commented Apr 7, 2025

@ekzhu
It works now!

Example code (works on my Mac):

    import asyncio
    from pprint import pprint

    from autogen_agentchat.agents import AssistantAgent
    from autogen_core.model_context import SummarizedChatCompletionContext
    from autogen_core.model_context.conditions import MaxMessageCompletion
    from autogen_ext.models.openai import OpenAIChatCompletionClient
    from autogen_ext.summary import buffered_summary

    client = OpenAIChatCompletionClient(model="claude-3-haiku-20240307")
    print(client.model_info)

    # Trigger summarization once the context reaches 2 messages,
    # keeping a 2-message buffer.
    context = SummarizedChatCompletionContext(
        summarizing_func=buffered_summary(buffer_count=2),
        summarizing_condition=MaxMessageCompletion(max_messages=2),
    )
    agent = AssistantAgent(
        "helper",
        model_client=client,
        system_message="You are a helpful agent",
        model_context=context,
    )

    async def run():
        # Ask two questions and inspect the (possibly summarized) context after each.
        res = await agent.run(task="What is the capital of France?")
        pprint(res)
        pprint(await context.get_messages())
        res = await agent.run(task="What is the capital of Korea?")
        pprint(res)
        pprint(await context.get_messages())

    asyncio.run(run())
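For intuition, here is a plain-Python sketch of what a buffer-based summarizing function such as buffered_summary might do; this is an assumption about its behavior (the real one operates on autogen LLMMessage objects), with strings standing in for messages and a placeholder string standing in for the produced summary:

```python
from typing import Callable, List


def buffered_summary(buffer_count: int) -> Callable[[List[str], List[str]], List[str]]:
    """Return a summarizing function that collapses everything except the
    last `buffer_count` messages into a single placeholder summary."""

    def summarize(messages: List[str], non_summarized_messages: List[str]) -> List[str]:
        if len(messages) <= buffer_count:
            return messages  # nothing to collapse yet
        head, tail = messages[:-buffer_count], messages[-buffer_count:]
        summary = f"[summary of {len(head)} earlier messages]"
        return [summary] + tail

    return summarize


func = buffered_summary(buffer_count=2)
print(func(["m1", "m2", "m3", "m4"], []))
# → ['[summary of 2 earlier messages]', 'm3', 'm4']
```

The factory shape matters: buffered_summary(buffer_count=2) is called once at configuration time and returns the function the context invokes on each trigger.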

@SongChiYoung (Contributor Author) commented Apr 10, 2025

Serialization is now working!

Example code:

def test13():
    import asyncio
    from pprint import pprint

    from autogen_agentchat.agents import AssistantAgent
    from autogen_core.model_context import SummarizedChatCompletionContext
    from autogen_core.model_context.conditions import MaxMessageCompletion
    from autogen_ext.models.openai import OpenAIChatCompletionClient
    from autogen_ext.summary import (
        buffered_summary,
        buffered_summarized_chat_completion_context,
    )

    client = OpenAIChatCompletionClient(model="claude-3-haiku-20240307")
    print(client.model_info)

    async def run(agent, context):
        # Ask two questions and inspect the (possibly summarized) context after each.
        res = await agent.run(task="What is the capital of France?")
        pprint(res)
        pprint(await context.get_messages())
        res = await agent.run(task="What is the capital of Korea?")
        pprint(res)
        pprint(await context.get_messages())

    # 1. Explicit construction from a summarizing function and a condition.
    context = SummarizedChatCompletionContext(
        summarizing_func=buffered_summary(buffer_count=2),
        summarizing_condition=MaxMessageCompletion(max_messages=2),
    )
    agent = AssistantAgent(
        "helper",
        model_client=client,
        system_message="You are a helpful agent",
        model_context=context,
    )
    asyncio.run(run(agent, context))

    print("=====================")
    print(agent.dump_component())
    print("=====================")

    # 2. The same setup via the convenience factory.
    context = buffered_summarized_chat_completion_context(
        buffer_count=2,
        max_messages=2,
    )
    agent = AssistantAgent(
        "helper",
        model_client=client,
        system_message="You are a helpful agent",
        model_context=context,
    )
    asyncio.run(run(agent, context))

    config = agent.dump_component()
    print("=====================")
    print(config)
    print("=====================")

    # 3. Round-trip: restore the agent (and its summarized context) from the config.
    agent = AssistantAgent.load_component(config)

    async def run_restored(agent):
        pprint(await agent.run(task="What is the capital of France?"))
        pprint(await agent.run(task="What is the capital of Korea?"))

    asyncio.run(run_restored(agent))


if __name__ == "__main__":
    test13()

@SongChiYoung (Contributor Author) commented Apr 10, 2025

Comments and docstrings are now fixed!
Please let me know if other documentation is needed.

@SongChiYoung SongChiYoung marked this pull request as ready for review April 10, 2025 13:46
@SongChiYoung (Contributor Author)

@ekzhu
This PR is now fully functional, tested, and ready for review.

All core features are implemented:

  • SummarizedChatCompletionContext
  • Condition-based summarization triggers (e.g., MaxMessageCompletion)
  • Buffered summary strategy
  • Component-based serialization and config restoration
  • Complete unit test coverage with mypy / pyright strict compatibility

Looking forward to feedback!

@SongChiYoung SongChiYoung changed the title [DRAFT] summarized chat completion context [FEAT] summarized chat completion context Apr 10, 2025
@SongChiYoung (Contributor Author)

Next Step Idea (will be a separate PR):

As a next step, I’m planning to implement an LLM-based summarizer using AutoGen agents — such as CodeExecutorAgent or AssistantAgent — to generate summaries directly from recent messages.

The goal is to:

  • Enable more semantically aware summarization via LLMs
  • Support prompt-based, model-guided strategies (e.g., one-shot, chain-of-thought)
  • Leverage AutoGen’s existing agent infrastructure for summarization tasks

This will be added as an optional summarizer type, while still fully supporting the current function-based summarization approach — no breaking changes, just an additional pluggable option.

The interface will remain compatible with SummarizedChatCompletionContext, using the same summarizing_func slot and a tool-style wrapper for serialization.

Looking forward to exploring this in a follow-up PR!

@ekzhu (Collaborator) left a comment

I can see a lot of progress has been made, going well beyond the initial scope of the issue.

I still hold my last point:

Perhaps we can start with a concrete implementation of summarization? As a first version, an actual implementation would provide more value than a templated version with scaffolding like summarization_func and various conditions.

For example, a simple implementation that uses a model client to convert a list of LLMMessage into a single message is already very useful. It can be triggered for every 10 messages, for example.

A highly opinionated but concrete implementation gets feedback and usage, and users can pick it up quickly and provide feedback.

I think instead of building all the scaffolding using conditions, we can start with a much simpler user experience. For example,

    from autogen_core.model_context import SummarizerChatCompletionContext

    summarizer_context = SummarizerChatCompletionContext(
        model_client=model_client,
        summary_prompt="Summarize the conversation so far for your own memory",
        summary_format="This portion of conversation has been summarized as follow: {summary}",
        summary_interval=10,  # trigger for every 10 messages
        summary_start=2,      # the summary replaces history starting from the 3rd message
        summary_end=-2,       # ...and ending at the 2nd-to-last message
    )

    agent = AssistantAgent("assistant", ..., model_context=summarizer_context)

In this example, the model_client is used to perform the summary. This is most likely how ChatGPT performs model-context summarization, and most users just want something like this that works out of the box.
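One reading of the summary_start / summary_end parameters above is that the produced summary replaces the slice messages[summary_start:summary_end]. The helper below is a hypothetical sketch of that semantics, with a string join standing in for the model-client call; the function name and default format string are illustrative only:

```python
from typing import List


def apply_summary(
    messages: List[str],
    summary_start: int,
    summary_end: int,
    summary_format: str = "This portion of conversation has been summarized as follow: {summary}",
) -> List[str]:
    """Replace messages[summary_start:summary_end] with one formatted summary."""
    portion = messages[summary_start:summary_end]
    summary = " / ".join(portion)  # placeholder for a model-client summarization call
    return (
        messages[:summary_start]
        + [summary_format.format(summary=summary)]
        + messages[summary_end:]
    )


msgs = [f"m{i}" for i in range(10)]
out = apply_summary(msgs, summary_start=2, summary_end=-2)
print(len(out))  # 5: 2 kept + 1 summary + 2 kept
```

Under this reading, the first summary_start messages (e.g., the system prompt) and the most recent messages always survive verbatim, which matches the stated intent of the parameters.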

@SongChiYoung (Contributor Author)

@ekzhu
Thanks for the thoughtful feedback. I can definitely add a model_client-based summarization style as a high-level option.

That said, I still believe the current structure using summarizing_func and summarizing_condition is conceptually aligned with how AutoGen users already use termination_condition in GroupChat. So from that perspective, I think it could actually feel more intuitive and familiar to many users.

Also, as I understand it, summarization is a highly experimental and research-driven area. I believe this kind of extensible and general-purpose structure can enable AutoGen users - especially research-oriented users - to explore novel summarization strategies much more easily.

That said, I’ll work on bridging the two - providing a simple preset (like SummarizerChatCompletionContext) that’s powered by the current modular infrastructure underneath.

Appreciate your guidance as always!

@ekzhu (Collaborator) commented Apr 15, 2025

If we add a new user-facing concept for every new feature, the framework will quickly turn into chaos.

There is indeed value in more structured code for this component, but not now. We can always add that code later; we don't need to create too much structure. The original LangChain code base has been criticized by many people for introducing too many abstractions, and I hope we don't follow the same path.

My suggestion is to follow Keep It Simple, Stupid (KISS): just the minimal code required for the minimal viable use case. This way, it is much easier to write unit tests that create high coverage.

@ekzhu (Collaborator) commented Apr 15, 2025

More on my previous comment.

I agree with you that the conditions are similar to the termination conditions already in the framework. In the future there will be space for those, perhaps in different forms. But let's get users to use the basic feature first and gather feedback.

For this PR, let's not add the new concepts. Please only create the implementation for a new SummarizerChatCompletionContext class and unit tests for it.

@SongChiYoung (Contributor Author) commented Apr 18, 2025

@ekzhu
Got it — in that case, how about turning this into a 3rd-party extension instead, without including the more complex summary engine in core?

I can of course implement the simpler version you suggested, but in my case that approach doesn't work well, especially when dealing with long contexts and nuanced summarization (e.g., ChatGPT or Claude sometimes "forget" key parts in long chats).
If you'd like, I can still work on that simpler version in a separate PR, as you advised.

Also, if possible, please don’t close or significantly alter this PR for now — I’d like to reference it in a follow-up PR to the 3rd-party extensions page.

This PR was structured with a bit more abstraction because I was exploring ideas like:

  • summarizing only non-user messages
  • source-aware summarization (e.g., per-agent)
  • meta-summary: generating multiple summaries from the same context, then summarizing those

These are admittedly more experimental, but reflect the direction I was aiming for.
So perhaps it makes more sense to keep that direction in a community extension — happy to maintain it if that’s preferred.

Let me know what you think!

Also, just to note — if someday the AutoGen community finds this kind of structured summarization useful (and others express similar needs), I’d be happy to contribute the extension back into core and hand over maintenance. No attachment on my end — just want the idea to be available if it's ever helpful.

@ekzhu (Collaborator) commented Apr 18, 2025

Got it — in that case, how about turning this into a 3rd-party extension instead, without including the more complex summary engine in core?

I think this will be a good outcome.

Let's just have a separate PR to add the extension to the list.

@SongChiYoung (Contributor Author)

Got it — in that case, how about turning this into a 3rd-party extension instead, without including the more complex summary engine in core?

I think this will be a good outcome.

Let's just have a separate PR to add the extension to the list.

Thanks! I built it at https://github.com/SongChiYoung/autogen-contextplus
and opened a PR to update the third-party extensions doc.

ekzhu added a commit that referenced this pull request Apr 21, 2025
DOC: add extensions - autogen-oaiapi and autogen-contextplus

contextplus is a user-defined autogen model_context.
It was discussed in #6217 and #6160.

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
@ekzhu ekzhu marked this pull request as draft April 21, 2025 19:31
@ekzhu (Collaborator) commented Apr 21, 2025

Converted to draft for now
