I’m posting here since we no longer have access to support channels per the changes announced yesterday. We believe this is a bug in Nylas webhook delivery, and if so, it’s a serious one.
Issue: We have customers repeatedly reporting missing emails in our system. After investigation, it appears we’re not receiving webhooks we expected based on how we understood the webhook system to work.
What we’re seeing:
Missing message.created webhooks - We’ve identified numerous messages where Nylas logs show only message.updated events were sent, with no corresponding message.created event. We expected each new message to generate a message.created event, although the documentation isn’t definitive on this.
Workaround challenges - We’ve been attempting to use message.updated as a workaround for missing message.created events, but this isn’t reliable since not all messages will have an update event.
Impact: This results in emails being completely missing from our system, causing significant customer complaints.
I’d prefer not to post customer email addresses, grant IDs, and message IDs on a public forum. If someone from Nylas engineering could reach out, I can provide specific examples and logs to help investigate. We’re open to exploring if there’s something on our end, but the pattern suggests a webhook delivery issue.
The quick summary is that some email providers don’t differentiate between creation and update notifications so Nylas is left to try to programmatically determine which webhook trigger to send out. And while we catch most, there are edge cases where the wrong one will be sent.
We recommend using both message webhook triggers and keeping a local cache of messages. If you get a webhook regarding a message ID that you don’t have cached, you can safely assume it is a new message.
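As a rough illustration of the recommendation above, here is a minimal Python sketch that treats the first webhook for an unknown message ID as a new message, whichever trigger delivered it. The payload shape (`type`, `data.object.id`, `data.object.grant_id`) is an assumption and should be checked against real deliveries. `FirstSeenRouter` and `on_new` are hypothetical names, and the in-memory set stands in for whatever persistent store you would actually use.

```python
from typing import Callable

MESSAGE_TRIGGERS = ("message.created", "message.updated")


class FirstSeenRouter:
    """Treats the first webhook for an unseen message ID as a new message."""

    def __init__(self, on_new: Callable[[dict], None]) -> None:
        # In-memory stand-in for a persistent cache of known message IDs.
        self._seen = set()
        self._on_new = on_new

    def handle(self, event: dict) -> bool:
        """Return True if the event was handled as a new message."""
        if event.get("type") not in MESSAGE_TRIGGERS:
            return False  # not a message webhook; ignored in this sketch
        obj = event["data"]["object"]  # assumed payload shape
        key = (obj.get("grant_id"), obj["id"])
        if key in self._seen:
            return False  # already known: a repeat delivery or a later update
        self._seen.add(key)
        self._on_new(event)
        return True
```

With this in place, a `message.updated` that arrives for an uncached ID is routed to `on_new`, and any later event for the same ID is dropped.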
Please let us know if you did not receive any message webhook at all for a new message and we can investigate further.
I’ve reviewed both the webhook documentation and webhook quickstart guide and found no mention of this behavior. I may have missed it, but I think this is critical information which should be in the main webhook documentation, not just a knowledge base article. The main docs state webhooks have “at least once” delivery guarantee, which implies we’ll get at least one message.created per message. Now it seems the true meaning is “if we’re able to distinguish a created event, then it will be delivered at least once” - but I think you can see why this would be interpreted as “we can rely on getting a message.created event for each message.”
Per the KB article, the underlying architectural change allowed you to eliminate your message storage layer. While this may have simplified your infrastructure, it shifts significant complexity to customers. We have no need to handle message.updated events otherwise - we don’t care about read status or folder changes. The only reason we’d need to process these events is to work around missing message.created webhooks. This means we now need to handle multiple message types per message and process significantly more webhook traffic.
Your recommendation to keep a local cache essentially asks us to solve a generic email problem that isn’t specific to our business domain. This feels like exactly the type of provider-specific complexity that Nylas’s value proposition is supposed to abstract away. While our existing system would technically work without this caching layer, we have fairly complex logic that determines if a message is relevant to our platform. We don’t want to run this expensive processing multiple times for each underlying message, only to reject it at the end as a duplicate, so we’ll need to build in such a layer that stores at least recently seen message IDs to short-circuit and discard duplicates. I recognize there were security reasons for eliminating the message storage layer as well, but the point remains - this architectural decision shifts complexity to customers.
As a minor point, the KB article doesn’t mention Microsoft specifically - it only mentions “Graph”. When I see “Graph” I don’t think “Microsoft” since the whole purpose of Nylas is to abstract away details of underlying APIs. Would suggest docs explicitly say “Microsoft Graph” for clarity. FYI we use Nylas exclusively with Microsoft at this point so I was looking for an explicit Microsoft reference.
We’re still wary that we may be missing webhooks altogether, but we need to investigate more thoroughly and it will take time to collect data. We’ll follow up if we find there are messages for which we don’t receive either event type.
@chitreshd Thanks for following up. To be clear, we’ve always received message.created events for the vast majority of newly created messages. The issue is that we were missing these for a small percentage of emails.
Your support team pointed us to that knowledge base article, which is how we found out that message.created is not guaranteed and we need to handle message.updated as well, which we would otherwise not do since we don’t care about read status, etc. Since then, we’ve had to upgrade our database server and application servers and burn days of developer time working around this issue and doing it in an efficient way, since we’re now handling 4X the volume of webhooks just to work around this issue.
Again, my main point and main request is that you make it so that message.created is guaranteed so we no longer have to handle message.updated. At a minimum, you could implement the same “hack” we have: storing every message ID and checking to see if we’ve seen it before so that if we get a message.updated for an email we had not previously seen a message.created for, we will handle it as a message.created. Are you planning on doing this (or have you done it — is this the change you’re referring to)? Thanks.
We added some more tricks to separate message.created from message.updated while staying close to Provider changes.
We have thought about the “hack” you mentioned and will roll it out next quarter in a controlled manner. The issue is that we want to stay as close as possible to the source of truth, i.e. the provider. So if the provider sends us only message.updated (as bad as that may sound in your case), we want to be truthful to downstream customers.
Again, we are planning to introduce a setting that customers can toggle to leverage the “hack” or “optimization” if they so choose.
@chitreshd thanks, great to hear that you’re working on a solution. One thought: would it make sense to create a new type of webhook: message.first_seen?
This would fire whenever Nylas encounters a message for the first time, regardless of whether the email provider reported it as created or updated. It would solve the edge case issues while being clearer about what’s actually happening - you’re seeing this message for the first time in the Nylas system.
How would you consume it any differently from treating message.updated as an upsert?
First, to be clear, we don’t actually do upserts for message.updated. Just “blindly upsert everything” does not work for us. We’re not interested in 95% of the messages our users receive, and we don’t want to store messages we don’t care about. We have a whole layer of business logic to determine if a given message we’ve been notified of is one that we actually care about. This is quite expensive to run, and the big issue with having to handle message.updated is that we’re now running this expensive logic, on average, 4X per message rather than 1X (because we get on average three message.updated events per message).
Because this additional processing overwhelmed our database, we had to spend a lot of developer time optimizing this flow. One thing we put in place was logic to determine “have we seen this message before” efficiently to avoid the expensive business logic I mentioned.
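For anyone building a similar short-circuit, one possible shape for the “have we seen this message before” check is a bounded cache of recently seen IDs. This is only a sketch under assumptions, not the implementation described above: `RecentlySeen` and its `capacity` parameter are hypothetical, and a production version would more likely sit in a shared store than in process memory.

```python
from collections import OrderedDict


class RecentlySeen:
    """Bounded set of recently seen message IDs with least-recently-used eviction."""

    def __init__(self, capacity: int = 100_000) -> None:
        self._capacity = capacity
        self._ids = OrderedDict()

    def check_and_add(self, message_id: str) -> bool:
        """Return True if message_id was already seen; record it either way."""
        if message_id in self._ids:
            self._ids.move_to_end(message_id)  # refresh recency
            return True
        self._ids[message_id] = None
        if len(self._ids) > self._capacity:
            self._ids.popitem(last=False)  # evict the least recently seen ID
        return False
```

A webhook handler would call `check_and_add` first and return early when it yields True, so the expensive relevance logic runs at most once per message for as long as the ID stays in the cache.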
I’ve mentioned this before, but to make sure we’re on the same page: we do not care about message.updated events other than the fact that we’re currently required to handle them to avoid completely missing some messages for which message.created is never sent.
So, what I’m asking for is some way to completely avoid handling message.updated events at all. My suggestion was a new event, message.first_seen, which would give us notifications of:
any true message.created events
plus any message.updated events for which no message.created was ever sent
Or it could just cover the second category, and we could continue to subscribe to both this and message.created on our end. Doesn’t matter much to us. But the point is to avoid the 99+% of message.updated events for emails we’ve already seen and only get notifications of the ones we would have missed.
IIUC, you want a way to distinguish true message.created and message.created because we saw it first.
A more formal way to do this is to leverage the source field in the webhook payload, which is what I am planning to use. This is more of a CloudEvents convention.
ref: spec/cloudevents/spec.md at v1.0.2 · cloudevents/spec · GitHub
A true message.created will have { "source": "/google/emails/realtime" }
An inferred one will have { "source": "/nylas/inferred" } (we haven’t finalized the value, but something along those lines)
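If the source-based approach ships roughly as described, consumer-side handling might look like the Python sketch below. The value `/nylas/inferred` is the placeholder mentioned above and is explicitly not finalized, so treat it as an assumption. `is_inferred_created` is a hypothetical helper, and the top-level `type` and `source` fields are assumed to follow the CloudEvents-style envelope.

```python
# Placeholder from the post above; the real value has not been finalized.
INFERRED_SOURCE = "/nylas/inferred"


def is_inferred_created(event: dict) -> bool:
    """True for a message.created that was inferred rather than reported by the provider."""
    return (
        event.get("type") == "message.created"
        and event.get("source") == INFERRED_SOURCE
    )
```

A consumer subscribed only to message.created could use this to log or count inferred events separately from provider-reported ones.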
Filtering
Well, maintaining such post-filtering sucks! Sorry to hear that. We are also working on some features next quarter to help our customers here, and we would love to get more feedback on that front. We can help you offload some of that compute. It actually helps us too, as we’re not spending unnecessary money on wasteful egress.
IIUC, you want a way to distinguish true message.created and message.created because we saw it first
This is not what I’m saying. What I want is not to have to subscribe to message.updated at all. Let me state what we’re trying to accomplish again, hopefully in a way that’s more clear:
We have no interest and no need to handle message.updated other than to ensure that we don’t miss messages altogether
Handling message.updated has blown up the volume of webhooks we need to handle by 4X
We want to go back to not having to subscribe to message.updated at all
I’ve suggested one way to accomplish this, but I’m happy to implement whatever you choose as long as it accomplishes the goal of not forcing us to track message.updated events, 99.9% of which we have no interest in, while allowing us not to miss any emails.
The source field may be a hack we can use in the interim, so that’s good to know about, but not having to deal with all of these requests in the first place is our goal.
I just wanted to add here that we’ve been seeing a similar issue to @enaia for the past few weeks.
Our users are frequently complaining of missing emails, and we are finding that a small percentage of users are receiving no message.created or message.updated events at all; they only seem to get status events handled by Nylas, like message.opened.
This is resulting in a considerable loss of trust in our product. Nylas support was able to help with the first batch, and seems to have fixed this for some of our users, but new users are continuing to encounter the issue.
There seems to be a critical, long-lived issue right now that results in some accounts not receiving any message created/updated events, and it isn’t being communicated on the status page.