Python: Allow Kernel Functions from Prompt for image and audio content (#11403)
### Motivation and Context
I noticed that even though the input and prompt rendering already match what
you would want for image and audio generation, we didn't support using a
prompt function with those services. This PR adds that support, with two
samples. It unlocks the following scenarios:
- Running text-to-speech pipelines with set intro/outro statements
- Creating function calls for image generation with limited scope and a
lot of set pieces.
### Description
- Adds a `get_image_content` method to the `TextToImageClientBase` class
- Adds the option to select a TextToImage or TextToAudio client in the
service selector (non-streaming only)
- Adds branches for those service types in
`KernelFunctionFromPrompt._invoke_internal`
- Adds handling of the output as a `FunctionResult`
- Adds samples for both
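
The branching described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: every class here (`FakeTextToImageClient`, `FakeTextToAudioClient`, and the simplified `ImageContent`, `AudioContent` and `FunctionResult`) is a hypothetical stand-in defined in the snippet itself, and `invoke_prompt` is an assumed name for the dispatch step, so signatures in `semantic_kernel` may differ.

```python
import asyncio
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; the real content and
# result types live in semantic_kernel and carry more fields.
@dataclass
class ImageContent:
    uri: str


@dataclass
class AudioContent:
    data: bytes


@dataclass
class FunctionResult:
    function_name: str
    value: object


class FakeTextToImageClient:
    """Assumed stand-in for a text-to-image service."""

    async def get_image_content(self, description: str) -> ImageContent:
        # A real client would call an image model; this fakes a URI.
        return ImageContent(uri=f"fake://image/{len(description)}")


class FakeTextToAudioClient:
    """Assumed stand-in for a text-to-audio service."""

    async def get_audio_content(self, text: str) -> AudioContent:
        # A real client would synthesize speech; this just encodes the text.
        return AudioContent(data=text.encode("utf-8"))


async def invoke_prompt(name: str, rendered_prompt: str, service: object) -> FunctionResult:
    """Dispatch an already rendered prompt to the selected service (non-streaming)."""
    if isinstance(service, FakeTextToImageClient):
        value = await service.get_image_content(rendered_prompt)
    elif isinstance(service, FakeTextToAudioClient):
        value = await service.get_audio_content(rendered_prompt)
    else:
        raise TypeError(f"Unsupported service type: {type(service).__name__}")
    # Whatever the service returned is wrapped in a single result object.
    return FunctionResult(function_name=name, value=value)


if __name__ == "__main__":
    prompt = "Welcome. Today's story. Goodbye."
    result = asyncio.run(invoke_prompt("speak", prompt, FakeTextToAudioClient()))
    print(type(result.value).__name__)
```

The point of the sketch is only the shape: the prompt is rendered once, the selected service type decides which call is made, and the content object comes back wrapped in a function result.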
### Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
python/semantic_kernel/connectors/ai/open_ai/prompt_execution_settings/open_ai_text_to_image_execution_settings.py (+20 -10)
@@ -41,6 +41,26 @@ class OpenAITextToImageExecutionSettings(PromptExecutionSettings):