Exporting Events using API
I'm looking at exporting the Events data (and others too, but focusing on Events):
Using the "Try it now" with a startedAt and endedAt date range and a Limit of 50, I get data that starts with:
"response": { "count": 3697, "items": [ {
However, there are not 3697 results; there are 50, which is what I expected. If I remove the limit, I get 500 back, and Limit prevents me from exceeding 1000.
So I thought I would try to chunk the data using Offset & Limit, but Export/Events.json doesn't use Offset.
Perhaps I could iterate based on startedAt using a date and time, as the description says I can:
startedAt: Only include events that happened after this date (YYYY-MM-DD, YYYY-MM-DD HH:mm:ss or timestamp)
But when I try a date and time, I get back:
"errorType": "general", "description": "Parameter startedAt is not valid. Ensure format is YYYY-MM-DD.",
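For anyone hitting the same error, here is a minimal sketch of a formatter that sticks to the plain-date form the endpoint actually accepted for me (Python; the helper name is mine, not part of the Showpad API):

```python
from datetime import datetime

def started_at_param(dt: datetime) -> str:
    """Format startedAt as plain YYYY-MM-DD.

    The docs also list 'YYYY-MM-DD HH:mm:ss' and a Unix timestamp, but in my
    tests only the plain date avoided the 'Parameter startedAt is not valid' error.
    """
    return dt.strftime("%Y-%m-%d")

print(started_at_param(datetime(2020, 3, 15, 14, 30)))  # → 2020-03-15
```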
With Export/shares.json, there are no parameters - I just get all the shares, 1272 so far and counting. I like that; it's simple.
So, my question is:
What is the suggested way to export all the Events without loss?
Regards,
Steve Salt
Application Developer
-
Official comment
Hi Steve,
The default maximum number of returned items in the Events API endpoint is 500. As you have found out, you can change the limit and set it anywhere between 50 and 1000.
There are millions of events in the database and so we have not made it possible to return everything in a single request. There is a way to fetch this information in batches that you can scroll through until you have all the events that you need.
The response comes with a header called X-Showpad-Scroll-Id that contains a token and remains valid for a limited time; it is comparable to an authentication key. If you make the same request again but set this header on your request, you will get the second batch of events. You then get back another X-Showpad-Scroll-Id that you can use to fetch the third batch, and so on. This way, your application can loop until it has fetched all the events that you want.
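The loop described above can be sketched as follows (Python, with the HTTP request abstracted behind a `fetch` callable so the sketch runs on its own; in real use, `fetch` would wrap your HTTP client, send the token in the X-Showpad-Scroll-Id request header, and read the new token from the response header):

```python
def scroll_all(fetch, limit=1000):
    """Collect every item by requesting batches until one comes back empty.

    `fetch(scroll_id, limit)` performs one request and returns
    (items, next_scroll_id); pass scroll_id=None on the first call.
    """
    items, scroll_id = [], None
    while True:
        batch, scroll_id = fetch(scroll_id, limit)
        if not batch:
            break
        items.extend(batch)
    return items

def fake_fetch(scroll_id, limit):
    # Stand-in for a real request: a datastore of 120 events, where the
    # "token" encodes how far the scroll has advanced.
    data = list(range(120))
    start = int(scroll_id) if scroll_id else 0
    return data[start:start + limit], str(start + limit)

print(len(scroll_all(fake_fetch, limit=50)))  # → 120
```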
You can learn more about the X-Showpad-Scroll-Id header in our Help Center article. I hope this explains why you aren't able to get this information in a single export and gives you the information you need for your reporting.
- Elaine
-
Hi,
I obtain the data using cURL as an Ajax request from within a PHP application. For anyone else doing this, you will need to set the following to get access to the headers, and specifically X-Showpad-Scroll-Id.
// register a callback that receives each response header line
curl_setopt($this->curl, CURLOPT_HEADERFUNCTION, array(&$this, 'methodName'));
Regards,
Steve
-
Hi Elaine Mui - we also have to retrieve Showpad tables, and we're doing that with a cron job. Because of the huge amount of data it's not exactly fast, but we're working on optimisations.
One thing I don't understand is why every other Showpad API works via the offset method and only Events does not. Sure, X-Showpad-Scroll-Id works fine, but why the different approach? And why do we only get a maximum of 1000 records at a time?
I also noticed that when our script runs for a few hours, we sometimes get a 503 server error from Showpad (which makes our script stop :/)
-
Hi Ben,
Thanks for asking in the community. I've sourced some answers from internal experts on the topic.
The events are in a dedicated datastore, while other data is in the Showpad content database. The scrolling mechanism was a better fit for that datastore than simple limit/offset, which is not a good choice when data is constantly changing and gets updated or added.
For example, sometimes an offline iPad comes online and sends events from days or weeks ago to the server, so during your export request there can suddenly be different events, which would mess up the limit/offset mechanism. When you start the scroll request, the server stores and keeps the context alive, like a classic database cursor. Our docs describe it as follows: "A scrolled search takes a snapshot in time — it doesn’t see any changes that are made to the index after the initial search request has been made. It does this by keeping the old datafiles around, so that it can preserve its “view” on what the index looked like at the time it started." So no events get updated or added during the requests, which ensures consistency.
The 1000-record limit was chosen because most Showpad APIs also limit responses to 1000 items. It's a balance: customers don't have to make thousands of requests, but our servers also don't have to send megabytes of data in a single response.
Not sure about the 503, but it could be the services behind this API scaling, one of the database nodes briefly hitting its memory limit, or a network issue somewhere. Our person more knowledgeable on this topic said he always retries up to 5 times with exponential backoff (something AWS does by default and recommends, particularly since the export process could take hours and a single error could stop the flow: https://docs.aws.amazon.com/general/latest/gr/api-retries.html).
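That retry-with-backoff pattern can be sketched as follows (a minimal Python illustration; a production version would catch only transient errors such as HTTP 503 rather than every exception):

```python
import random
import time

def with_retries(call, max_tries=5, base_delay=1.0):
    """Run `call()` and retry failures with exponential backoff.

    Sleeps base_delay * 2**attempt plus a little jitter between tries;
    after max_tries failures the last error is re-raised.
    """
    for attempt in range(max_tries):
        try:
            return call()
        except Exception:
            if attempt == max_tries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

The jitter spreads concurrent clients apart so they do not all retry at the same instant, which is the behaviour the AWS guidance linked above recommends.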
If you experience further issues, please reach out to our support team using the "Help" option below.
Hope this helps!
- Mary
-
Thank you for the explanation Mary Hauber - after more optimisations the script now runs nice & smooth & pretty fast!
I do have another question about the Events API.
When we do a call for a certain timestamp we always get events back starting from 2 hours earlier, so we have to fix this by adding 2 hours to the time we want. Why is this happening? If we don't add that workaround then all the records in that 2-hour window are duplicated at the next cron-job call (which again is not a big issue as we don't allow duplicates, but it seems a bit strange that we ask for a certain time and don't get it back).
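A minimal sketch of that workaround (Python; the 2-hour offset is purely empirical, presumably a timezone gap between the server and the caller rather than documented behaviour):

```python
from datetime import datetime, timedelta

# Empirically observed gap between the requested time and the returned events;
# adjust or remove if the cause turns out to be a timezone mismatch on our side.
SERVER_OFFSET = timedelta(hours=2)

def adjusted_cutoff(cutoff: datetime) -> datetime:
    """Shift the requested start time forward so the response begins where expected."""
    return cutoff + SERVER_OFFSET

print(adjusted_cutoff(datetime(2020, 6, 1, 10, 0)))  # → 2020-06-01 12:00:00
```

Deduplicating on the event id after the fetch, as described above, remains the safety net in case the offset ever changes.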