feat: Persistent Task Store and Distributed event bus implementation …#759
hashtekconsulting wants to merge 4 commits into a2aproject:main
…for A2A server capability
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the A2A server's scalability by introducing robust mechanisms for handling task state and event propagation in multi-instance production environments. It moves beyond in-memory solutions to leverage AWS services like DynamoDB, SNS, and SQS, ensuring data consistency and reliable event delivery across a cluster of servers. The changes allow for seamless operation behind a load balancer, where clients can connect to any instance and still receive all relevant events and access task data.
Code Review
This pull request introduces a significant feature for scaling A2A server instances by implementing a persistent task store using DynamoDB and a distributed event bus using AWS SNS/SQS. While the implementation is well-structured, it's crucial to address security concerns related to SQS IAM policy construction, which currently uses string replacement and could be susceptible to policy injection. Furthermore, logging of SQS/SNS message bodies should be restricted to prevent PII leakage, and error handling needs to ensure untrusted input is not included in logged exception messages. My review also includes minor suggestions to improve efficiency and typing for better maintainability.
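To illustrate the policy-injection concern, here is a minimal sketch of building the queue policy as a structured object and serializing it once, rather than substituting strings into a template. The helper name `buildQueuePolicy` and the two ARN patterns are assumptions made for illustration; they are not taken from this PR, and the patterns only approximate the ARN formats.

```typescript
// Hypothetical helper (not from the PR): builds an SQS queue access policy
// that lets a single SNS topic deliver messages to a single queue.

// Approximate ARN shapes. Anything containing quotes, braces, or whitespace
// fails the match and is rejected before it can reach the policy document.
const SQS_ARN = /^arn:aws[a-z-]*:sqs:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]{1,80}(\.fifo)?$/;
const SNS_ARN = /^arn:aws[a-z-]*:sns:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]{1,256}(\.fifo)?$/;

function buildQueuePolicy(queueArn: string, topicArn: string): string {
  // Error messages deliberately omit the rejected value, so untrusted
  // input is not echoed into logs.
  if (!SQS_ARN.test(queueArn)) {
    throw new Error("queueArn is not a well-formed SQS ARN");
  }
  if (!SNS_ARN.test(topicArn)) {
    throw new Error("topicArn is not a well-formed SNS ARN");
  }

  // The policy is plain data; JSON.stringify handles all escaping.
  const policy = {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "sns.amazonaws.com" },
        Action: "sqs:SendMessage",
        Resource: queueArn,
        Condition: { ArnEquals: { "aws:SourceArn": topicArn } },
      },
    ],
  };
  return JSON.stringify(policy);
}
```

The returned string is what would be supplied as the queue's `Policy` attribute. Because the ARNs are validated first and never concatenated into JSON text, a crafted value cannot add statements or widen the principal.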
feat: Persistent Task Store and Distributed Event Bus/Queue for scaling A2A server instances in a production environment
By default the SDK ships with InMemoryTaskStore and DefaultExecutionEventBusManager, which are perfect for a single-process server. In a production deployment with multiple server instances behind a load balancer you need:
Persistent task state — so any instance can serve a tasks/get request regardless of which instance originally handled the task.
Distributed SSE fan-out — so a client that opens an SSE stream on Instance B receives events published by the executor running on Instance A.
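The two requirements above can be sketched with in-process stand-ins. This is only an illustration under stated assumptions: the `Task` and `TaskStore` shapes are simplified, and `SharedTableTaskStore`, `FanOutTopic`, and `ServerInstance` are hypothetical names, not the classes added by this PR. A `Map` plays the role of the shared database table, and a subscriber list plays the role of a topic with one queue per instance.

```typescript
// Simplified, hypothetical shapes for illustration only.
interface Task {
  id: string;
  status: string;
}

interface TaskStore {
  save(task: Task): Promise<void>;
  load(taskId: string): Promise<Task | undefined>;
}

// Stand-in for a store backed by an external table that every instance can reach.
class SharedTableTaskStore implements TaskStore {
  private readonly table: Map<string, string>;

  constructor(table: Map<string, string>) {
    this.table = table;
  }

  async save(task: Task): Promise<void> {
    this.table.set(task.id, JSON.stringify(task));
  }

  async load(taskId: string): Promise<Task | undefined> {
    const raw = this.table.get(taskId);
    return raw === undefined ? undefined : (JSON.parse(raw) as Task);
  }
}

type TaskEvent = { taskId: string; kind: string };
type Listener = (event: TaskEvent) => void;

// Stand-in for a topic that copies each published event to every subscriber.
class FanOutTopic {
  private readonly subscribers: Listener[] = [];

  subscribe(listener: Listener): void {
    this.subscribers.push(listener);
  }

  publish(event: TaskEvent): void {
    for (const deliver of this.subscribers) {
      deliver(event);
    }
  }
}

// One server process: it persists task state to the shared store and relays
// every event it receives from the topic to its own locally connected clients.
class ServerInstance {
  readonly store: TaskStore;
  readonly streamed: TaskEvent[] = [];
  private readonly topic: FanOutTopic;

  constructor(store: TaskStore, topic: FanOutTopic) {
    this.store = store;
    this.topic = topic;
    topic.subscribe((event) => {
      this.streamed.push(event);
    });
  }

  async runTask(taskId: string): Promise<void> {
    await this.store.save({ id: taskId, status: "working" });
    this.topic.publish({ taskId, kind: "status-update" });
  }
}
```

With two `ServerInstance` objects constructed over the same `Map` and the same `FanOutTopic`, a task started through one instance can be loaded through the other instance's store, and the event it publishes appears in the other instance's `streamed` list. With purely in-memory, per-process components, neither would hold.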
This feature introduces new components to support A2A server scalability in production environments