Building a Real-Time Gmail Processing Pipeline with Pub/Sub Webhooks

How we used SmythOS to process Gmail messages in real time without constantly polling the API

The Problem: When Email Polling Just Doesn’t Cut It

Picture this: You’re building an email automation system that needs to respond to incoming Gmail messages within seconds. Your first instinct might be to poll the Gmail API every few seconds to check for new emails. But as your user base grows, you quickly run into several problems:

  • Rate Limiting: The Gmail API enforces usage quotas that make frequent polling unsustainable
  • Latency: Polling intervals create unavoidable delays in processing
  • Resource Waste: The vast majority of your API calls return “no new messages”
  • Scalability Issues: More users = more polling = more API quota consumed

We faced exactly this challenge when building our Gmail Conversational Agent. We needed a system that could:

  • Process emails in real time (sub-second response times)
  • Handle multiple Gmail accounts simultaneously
  • Scale efficiently without hitting API limits
  • Maintain reliability and avoid missing messages

The solution? Gmail Push Notifications via Google Cloud Pub/Sub webhooks, implemented using the SmythOS platform.

The Architecture: Real-Time Gmail Pipeline

Using SmythOS’s visual workflow builder, we architected our real-time Gmail processing system:

Gmail Account → Gmail Push Notifications → Google Cloud Pub/Sub → SmythOS Webhook → Processing Pipeline

1. Setting Up Gmail Push Notifications

The first component in our SmythOS workflow is the Gmail Watch API setup. This component tells Gmail to send us notifications when new emails arrive by setting up a “watch” on the user’s mailbox targeting the INBOX label.
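As a rough illustration of what that watch setup sends, here is a minimal Python sketch of the request body for the Gmail users.watch method. The project and topic names are placeholders, and the helper name build_watch_request is hypothetical; in our workflow the SmythOS component issues the call, so this shows only the shape of the request.

```python
import json

# REST endpoint for the Gmail users.watch method ("me" = the authorized user).
GMAIL_WATCH_URL = "https://gmail.googleapis.com/gmail/v1/users/me/watch"


def build_watch_request(project_id: str, topic_id: str) -> dict:
    """Return a users.watch request body that targets only the INBOX label.

    project_id and topic_id are placeholders for your own Google Cloud
    project and Pub/Sub topic.
    """
    return {
        "topicName": f"projects/{project_id}/topics/{topic_id}",
        "labelIds": ["INBOX"],
        # Notify only for changes on the listed labels.
        "labelFilterBehavior": "INCLUDE",
    }


if __name__ == "__main__":
    body = build_watch_request("my-project", "gmail-notifications")
    print("POST", GMAIL_WATCH_URL)
    print(json.dumps(body, indent=2))
```

The request must be sent with an OAuth access token for the mailbox owner, and the Pub/Sub topic must allow Gmail to publish to it.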

2. Webhook Endpoint: Receiving the Notifications

When a new email arrives, Gmail publishes a notification to our Pub/Sub topic, and the Pub/Sub push subscription delivers it as a POST request to our SmythOS webhook endpoint. The SmythOS API Endpoint component automatically handles the incoming payload, whose Base64-encoded data field decodes to a small JSON object containing the mailbox’s email address and a history ID.
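To make the payload shape concrete, here is a small Python sketch of decoding such a notification. It assumes the standard Pub/Sub push envelope (a message object with a data field) and the emailAddress and historyId keys that Gmail puts in the decoded JSON; the function name decode_gmail_notification is ours, not part of SmythOS.

```python
import base64
import json


def decode_gmail_notification(envelope: dict) -> dict:
    """Extract the email address and history ID from a Pub/Sub push envelope."""
    data = envelope["message"]["data"]
    # Restore any stripped padding; urlsafe_b64decode accepts both the
    # standard and the URL-safe Base64 alphabets.
    padded = data + "=" * (-len(data) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return {
        "email_address": payload["emailAddress"],
        # Normalize to a string whether the ID arrives quoted or as a number.
        "history_id": str(payload["historyId"]),
    }


if __name__ == "__main__":
    inner = json.dumps({"emailAddress": "user@example.com", "historyId": "9876543210"})
    sample = {
        "message": {
            "data": base64.b64encode(inner.encode()).decode(),
            "messageId": "1234567890",
        },
        "subscription": "projects/my-project/subscriptions/my-subscription",
    }
    print(decode_gmail_notification(sample))
```

Note that the notification carries no message content or message ID, only a pointer (the history ID) used to look up what changed.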

The Technical Challenges We Solved

Challenge 1: Intelligent Message Filtering

Not every Gmail notification represents an email we want to process. We needed to filter out:

  • Emails we sent ourselves
  • Draft emails
  • Emails in specific folders

We solved this using SmythOS’s multi-component filtering approach:

  1. Gmail History API Component: Retrieves the full message history using the history ID from the notification
  2. Parse History Code Component: Processes history items in reverse chronological order, extracting only messages that are truly incoming (INBOX label, no SENT or DRAFT labels)
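The filtering rule in step 2 can be sketched in Python roughly as follows. This is an illustration rather than our actual component code: it assumes the response shape of the Gmail users.history.list method (a history array whose records may contain messagesAdded entries), and the function name is hypothetical.

```python
def extract_incoming_message_ids(history_response: dict) -> list[str]:
    """Return IDs of newly added messages that look truly incoming.

    A message qualifies if it carries the INBOX label and neither the
    SENT nor the DRAFT label. Records are walked newest first.
    """
    excluded = {"SENT", "DRAFT"}
    seen: set[str] = set()
    incoming: list[str] = []
    # history.list returns records oldest first, so reverse them.
    for record in reversed(history_response.get("history", [])):
        for added in record.get("messagesAdded", []):
            message = added.get("message", {})
            message_id = message.get("id")
            labels = set(message.get("labelIds", []))
            if not message_id or message_id in seen:
                continue
            if "INBOX" in labels and not (labels & excluded):
                seen.add(message_id)
                incoming.append(message_id)
    return incoming


if __name__ == "__main__":
    sample = {
        "history": [
            {"id": "1", "messagesAdded": [{"message": {"id": "m1", "labelIds": ["INBOX", "UNREAD"]}}]},
            {"id": "2", "messagesAdded": [{"message": {"id": "m2", "labelIds": ["SENT"]}}]},
            {"id": "3", "messagesAdded": [{"message": {"id": "m3", "labelIds": ["INBOX"]}}]},
        ]
    }
    print(extract_incoming_message_ids(sample))
```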

Challenge 2: Preventing Duplicate Processing

We needed to ensure each email was processed exactly once, even if we received multiple notifications for the same message.

Our solution? A lightweight deduplication system using SmythOS’s Google Sheets integration as a rolling cache:

  1. Google Sheets Get Values Component: Retrieves the current list of processed message IDs
  2. Deduplication Logic Component: Checks if the incoming message ID already exists in our tracking sheet
  3. Rolling Window Management: Maintains a maximum of 20 recent message IDs, automatically removing the oldest when the limit is exceeded
  4. Google Sheets Update Component: Stores new message IDs back to the sheet for future reference
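The core of steps 2 and 3 is a bounded list of recent IDs. A minimal Python sketch, assuming the processed IDs have already been read from the sheet into a list (oldest first) and will be written back afterwards; the function name and signature are illustrative only:

```python
MAX_TRACKED_IDS = 20  # size of the rolling window described above


def check_and_record(
    processed_ids: list[str], message_id: str, limit: int = MAX_TRACKED_IDS
) -> tuple[bool, list[str]]:
    """Return (is_new, updated_ids) for an incoming message ID.

    processed_ids is ordered oldest first. If the ID is new it is appended
    and the list is trimmed to the most recent `limit` entries.
    """
    if message_id in processed_ids:
        return False, processed_ids
    updated = processed_ids + [message_id]
    return True, updated[-limit:]


if __name__ == "__main__":
    ids = [f"msg-{n}" for n in range(20)]
    is_new, ids = check_and_record(ids, "msg-20")
    print(is_new, len(ids), ids[0])
```

A window this small is a trade-off: it keeps the sheet tiny and easy to inspect, but a duplicate notification for a message older than the last 20 would not be caught.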

Why Google Sheets? It provided:

  • Persistent storage across webhook calls
  • Easy human-readable debugging
  • Built-in concurrency handling through SmythOS’s Google Sheets API integration
  • No additional database infrastructure needed

Challenge 3: Managing Gmail Watch Subscriptions

Gmail watch subscriptions expire after at most 7 days (Google’s documentation recommends calling watch again once per day). We needed a way to renew them automatically before they expired.

SmythOS solved this elegantly with its built-in scheduling capabilities. We configured SmythOS to automatically call our Gmail Watch setup workflow every week, ensuring continuous monitoring without manual intervention.

The scheduled approach provides:

  • Automated Renewal: SmythOS scheduler calls the Gmail Watch API setup every 7 days
  • Proactive Management: Subscriptions are renewed before they expire, preventing gaps in monitoring
  • Set-and-Forget Operation: Once configured, the system maintains itself without manual oversight
  • Built-in Reliability: SmythOS’s scheduling service handles the timing and execution automatically

The beauty of using SmythOS’s scheduling feature is that it transforms what could be a complex cron job or background service into a simple, visual workflow that runs automatically. No server maintenance, no cron configuration—just reliable, scheduled execution.
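For readers implementing renewal outside a scheduler, the timing check can be sketched in Python. The Gmail watch response includes an expiration field given as epoch milliseconds; the one-day safety margin and the function name needs_renewal are assumptions made for this example, not settings from our workflow.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(
    expiration_ms: str,
    now: datetime,
    margin: timedelta = timedelta(days=1),
) -> bool:
    """Return True once `now` is within `margin` of the watch expiration.

    expiration_ms is the `expiration` value from the watch response,
    expressed as milliseconds since the Unix epoch.
    """
    expires_at = datetime.fromtimestamp(int(expiration_ms) / 1000, tz=timezone.utc)
    return now >= expires_at - margin


if __name__ == "__main__":
    expires = datetime(2024, 1, 8, tzinfo=timezone.utc)
    expiration_ms = str(int(expires.timestamp() * 1000))
    print(needs_renewal(expiration_ms, datetime(2024, 1, 1, tzinfo=timezone.utc)))
    print(needs_renewal(expiration_ms, datetime(2024, 1, 7, 12, tzinfo=timezone.utc)))
```

Renewing with some margin, rather than exactly at the expiry boundary, avoids a gap if a scheduled run is delayed.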

The Complete Pipeline in Action

Here’s how our SmythOS workflow handles everything when a new email arrives:

  1. Email Arrives: User receives email in Gmail
  2. Gmail Notification: Gmail sends push notification to our Pub/Sub topic
  3. SmythOS Webhook Triggered: Our API Endpoint component receives the notification
  4. Parse Notification: SmythOS components extract message info from the notification payload
  5. Deduplication Check: Google Sheets components verify we haven’t processed this message before
  6. Fetch Full Message: Gmail API component retrieves complete email content
  7. Smart Filtering: Multi-layer filtering components ensure it’s an email we should process
  8. AI Processing: GenAI LLM component analyzes email content and generates appropriate responses
  9. Send Response: Gmail Send component delivers the reply, maintaining proper thread context
  10. Update Tracking: Google Sheets components record message ID to prevent future duplicates
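Step 9 is worth a closer look, because keeping a reply in the right thread depends on a few details. The Python sketch below builds a request body for the Gmail users.messages.send method, assuming we already have the thread ID and the original email’s Message-ID header. The function name and parameters are illustrative; the SmythOS Gmail Send component handles this for us.

```python
import base64
from email.message import EmailMessage


def build_reply_payload(
    to_addr: str,
    subject: str,
    body: str,
    thread_id: str,
    original_message_id: str,
) -> dict:
    """Build a users.messages.send body that replies within a thread.

    original_message_id is the Message-ID header of the email being
    answered (for example "<abc@mail.example.com>"), not the Gmail API ID.
    """
    msg = EmailMessage()
    msg["To"] = to_addr
    # Keep the subject aligned with the thread.
    msg["Subject"] = subject if subject.lower().startswith("re:") else f"Re: {subject}"
    # These headers let mail clients associate the reply with the original.
    msg["In-Reply-To"] = original_message_id
    msg["References"] = original_message_id
    msg.set_content(body)
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw, "threadId": thread_id}


if __name__ == "__main__":
    payload = build_reply_payload(
        "sender@example.com",
        "Hello",
        "Thanks for your message.",
        "thread-123",
        "<abc@mail.example.com>",
    )
    print(payload["threadId"], len(payload["raw"]))
```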

Performance & Reliability Results

After implementing this SmythOS-based system, we achieved:

  • Sub-second processing: From email arrival to processing start in under 500 ms
  • Zero API quota issues: Push notifications eliminated polling overhead entirely
  • Linear scalability: API usage tracks the number of emails actually received rather than the number of accounts multiplied by a polling rate, thanks to the event-driven architecture

Conclusion

By switching from API polling to Gmail Push Notifications with SmythOS, we transformed our email processing from a rate-limited nightmare into a real-time, scalable solution. The result: sub-second processing times, zero API quota issues, and linear scalability across hundreds of accounts. What would traditionally take months to build, including webhook servers, deduplication logic, and scheduled renewals, became a visual workflow deployed in days. Stop polling, start pushing.