Recommended way to sync all historical user emails

So I’ve been looking through the Nylas docs for some hours now and can’t find definitive instructions for syncing all historical user emails using the Email API. I found notes on using the /messages endpoint effectively, but nothing specifically about syncing all-time messages. (Maybe I didn’t search well enough.) In short, I have the following questions:

  1. What is the recommended way to sync all-time user data (actually, I only want to sync the email subject lines) without running into errors? I know there is a high likelihood of errors when trying to do something like this.
  2. How do I make the process as efficient as possible in terms of time and the bandwidth requested from the provider, since, according to this page, Nylas v3 doesn’t store any end-user data and only forwards requests to the provider? I need to avoid getting blocked by the provider, particularly if the user has several GBs of email data. I don’t mind it taking more time if that is absolutely needed.

Hello @AbdulramonJemil, as of v3 we don’t store any end-user data :frowning: We provide direct access to the provider…

Let me ask if there’s any way to do what you want to do…AFAIK, that’s not possible…but let me confirm…

I guess it should still be possible by fetching the emails little by little, e.g. 50 emails every 5 seconds for a Google provider. The usage limit for the Gmail API is 250 quota units per user per second, and the Gmail messages.list API (which I believe Nylas uses under the hood for the list messages endpoint) costs only 5 quota units, so a single call every 5 seconds should be fine (in theory it could be up to 50 messages.list calls per second, no?). It should work especially well when selecting only specific fields with the select query parameter. Isn’t this the case, @Blag?
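To make the pacing idea concrete, here is a minimal sketch in Python. It assumes the figures quoted above (250 quota units per user per second, 5 units per list call) and does not confirm how Nylas maps its calls to the provider. `fetch_page` is a hypothetical callable standing in for one request to the list messages endpoint; it is assumed to accept a cursor and return a list of messages plus the next cursor, or `None` when there are no more pages.

```python
import time
from typing import Callable, Iterator, Optional, Tuple

# Figures quoted in the post above; treated as assumptions, not verified limits.
QUOTA_UNITS_PER_USER_PER_SECOND = 250
LIST_CALL_COST_UNITS = 5

# One page of results: (messages, next_cursor or None when finished).
Page = Tuple[list, Optional[str]]


def max_list_calls_per_second() -> int:
    """Theoretical ceiling on list calls per second under the assumed quota."""
    return QUOTA_UNITS_PER_USER_PER_SECOND // LIST_CALL_COST_UNITS


def paced_sync(
    fetch_page: Callable[[Optional[str]], Page],
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield messages page by page, pausing between page requests.

    `fetch_page` is hypothetical: plug in whatever performs the real request.
    """
    cursor: Optional[str] = None
    while True:
        messages, cursor = fetch_page(cursor)
        yield from messages
        if cursor is None:
            break  # last page reached, no need to wait again
        sleep(interval_seconds)
```

Under these assumptions, one list call every 5 seconds sits far below the theoretical ceiling of 50 calls per second. The sketch does not account for any extra provider calls that may be made per message, which I have not verified.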

@AbdulramonJemil Sorry to ask, but…what’s the use case for needing ALL subject lines? Is it for things like total message counts, etc., or something more?

We can help you with “best practices” to avoid Google rate limiting, not only per account but also across their entire Google account. :wink:

I was looking to do something along the lines of similarity search based on subject lines, so the lines would be indexed in a vector DB. I’m still trying to figure everything out, though, as I know people can have several thousand email messages.

I am looking into a similar use case

What do you think of the pattern I stated above? You just notify the user that it’ll take some time to get everything set up. Vectors do require a lot of storage though, in case vector storage is what you’re looking to do.

I am considering a similar use case. For now, my application only stores the Nylas id plus some custom fields that are internal to the application. However, I am considering syncing threads, emails and attachments into the database for each user.

I was also thinking of fetching data little by little until the sync process for a user is complete. The goal is optimized search, including similarity search, as well as faster response times.
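For the "little by little until the sync is complete" part, a rough sketch of a resumable sync in Python might look like the following. Everything here is an assumption for illustration: `fetch_page` is a hypothetical callable for one page request (cursor in, messages plus next cursor out), the checkpoint is a plain JSON file, and only subject lines are kept. A real application would likely use its database instead.

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple

# One page of results: (messages, next_cursor or None when finished).
Page = Tuple[list, Optional[str]]


def load_checkpoint(path: Path) -> dict:
    """Return the saved sync state, or a fresh state if none exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"cursor": None, "done": False, "subjects": []}


def sync_step(path: Path, fetch_page: Callable[[Optional[str]], Page]) -> dict:
    """Fetch one more page, store its subject lines, and persist the cursor.

    Safe to call repeatedly (e.g. from a scheduled job): after a crash or
    restart it resumes from the last saved cursor instead of starting over.
    """
    state = load_checkpoint(path)
    if state["done"]:
        return state  # nothing left to fetch for this user
    messages, next_cursor = fetch_page(state["cursor"])
    state["subjects"].extend(m.get("subject", "") for m in messages)
    state["cursor"] = next_cursor
    state["done"] = next_cursor is None
    path.write_text(json.dumps(state))
    return state
```

Calling `sync_step` once per scheduler tick keeps each run short, and the interval between ticks becomes the knob for staying under whatever rate limit applies.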

Do you guys have any recommended way of syncing historical data? Also, keeping rate limits in mind, what do you think would be the optimal number of emails / threads / attachments to fetch per user in a given amount of time? Like @AbdulramonJemil said, will fetching 50 emails every 5 seconds pose issues? If so, what batch size and interval would be optimal?