Bulk data export
The data export API serves as a proxy to a portion of accesso's data lake. Callers can use the API to retrieve the data they have been authorized to access. This guide provides an overview of the export endpoint and some basic best practices for using the API.
Exporting data
To export data, the following steps are performed:
- Acquiring a service token, if a valid/non-expired one is not already in the caller’s possession.
- Initiating the export and fetching the first page of results.
- Fetching remaining pages of results.
1. Acquiring a service token
See the Service Tokens documentation for instructions on how to obtain a service token.
2. Initiating the export and fetching first page of results
Because an export can return a large amount of data, the export API returns its results in pages. Any single request returns at most 1,000 records. When an export is requested, only the first page of results is returned initially. Subsequent pages must be explicitly requested.
The -G option on the curl commands below sends the parameters passed to the -d option as query string parameters. This format is used in this guide for readability of the multiple parameters.
The following request is used to invoke an export:
curl -X GET -G \
'https://api.{region}.te2.io/v1/export/{dataType}' \
-d 'startTime={startTime}' \
-d 'endTime={endTime}' \
-d 'venueId={venueId}' \
-H 'Cache-Control: no-cache' \
-H 'Authorization: Bearer {token}'
Parameters used above include:
- dataType: Data set from which to export. The dataTypes to which a user has access are driven by configuration. Examples of dataTypes to which consumers are generally granted access include, but are not limited to:
  - user_locations: Records of when a user enters/exits a venue and when a user’s device emits location data while within the bounds of a venue.
  - user_registrations: Records of when a user registered an account.
  - user_tag_updates: Records of user tag changes (set, update, remove).
  - user_updates: Records of changes to a user’s profile (set, update, remove).
  - ticket_updates: Records of updates to the ticket-to-user mapping when receiving orders from Passport and during ticket registration.
- startTime: Starting date and time (inclusive) from which to export. Expected to be in ISO-8601 date-time format including timezone (e.g., 2019-10-15T16:29:05.734Z).
- endTime: Ending date and time (inclusive) up to which to export. Expected to be in ISO-8601 date-time format including timezone (e.g., 2019-10-30T16:29:05.734Z).
- venueId: Optional. ID of a venue. If not provided, the export will cover all venues for which you are authorized to retrieve data. Learn how to fetch a list of venues.
- token: Service token.
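The request above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `build_export_url` and `build_headers` are hypothetical, and the region value in the usage note is a placeholder. It only shows how the documented parameters map onto the query string and headers of the curl command.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_export_url(region: str, data_type: str, start_time: str,
                     end_time: str, venue_id: Optional[str] = None) -> str:
    """Build the URL for the initial export request (hypothetical helper)."""
    params = {"startTime": start_time, "endTime": end_time}
    # venueId is optional; leaving it out covers all venues you are authorized for.
    if venue_id is not None:
        params["venueId"] = venue_id
    # urlencode percent-encodes characters such as ":" in the timestamps.
    return f"https://api.{region}.te2.io/v1/export/{data_type}?{urlencode(params)}"


def build_headers(token: str) -> Dict[str, str]:
    """Headers matching the curl example (hypothetical helper)."""
    return {"Cache-Control": "no-cache", "Authorization": f"Bearer {token}"}
```

For example, `build_export_url("example-region", "user_locations", "2019-10-15T16:29:05.734Z", "2019-10-30T16:29:05.734Z")` produces a URL without a `venueId` parameter, which requests data for every authorized venue.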
A successful request will return the first page of results and a status of 200.
The body of the response has the following format:
{
  "data": [
    {
      "property1": "value1",
      "property2": "value2"
    }
  ],
  "queryExecutionId": "{queryExecutionId}",
  "nextToken": "{nextToken}"
}
Fields in the response include:
- queryExecutionId: ID of the query. Used to retrieve the next page of results.
- nextToken: Token used as a result bookmark. Used to retrieve the next page of results. If null, there are no additional pages of data to export.
- data: Array of records as JSON objects. The exact properties of these records are highly dependent on the dataType requested.
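As an illustration of how a caller might read this response, here is a small Python sketch. The `ExportPage` type and `parse_export_page` function are hypothetical names introduced for this example. The field names come from the response format described above.

```python
import json
from typing import Any, Dict, List, NamedTuple, Optional


class ExportPage(NamedTuple):
    """One page of export results (hypothetical container type)."""
    records: List[Dict[str, Any]]
    query_execution_id: str
    next_token: Optional[str]

    @property
    def has_more(self) -> bool:
        # A null nextToken means there are no additional pages to export.
        return self.next_token is not None


def parse_export_page(body: str) -> ExportPage:
    """Decode a response body into its three documented fields."""
    payload = json.loads(body)
    return ExportPage(
        records=payload.get("data", []),
        query_execution_id=payload["queryExecutionId"],
        next_token=payload.get("nextToken"),
    )
```

Because the record properties depend on the dataType requested, the sketch keeps each record as a plain dictionary rather than mapping it to a fixed schema.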
3. Fetching remaining pages of results
To fetch the remaining pages of results, if any exist, the following request is used:
curl -X GET -G \
'https://api.{region}.te2.io/v1/export/{dataType}' \
-d 'queryExecutionId={queryExecutionId}' \
-d 'nextToken={nextToken}' \
-H 'Cache-Control: no-cache' \
-H 'Authorization: Bearer {token}'
Parameters used above include:
- dataType: Must match the dataType used when invoking the export.
- queryExecutionId: Must match the queryExecutionId returned with the first page of results.
- nextToken: Must match the nextToken returned with the previous page of results.
- token: Service token.
A successful request will return a response in the same format as the initial page of results.
Retrieval flow
This is a general example of what an export looks like using the above steps. Your exact setup may differ slightly, but the basic flow will most likely remain the same.
# Fetch a service token.
service_token <- fetch_service_token()
# Fetch the initial page of results for the chosen data type.
result_page, query_execution_id, next_token <- fetch_first_page(
    service_token,
    data_type,
    start_time,
    end_time,
    venue_id
)
# Store the first page of results.
results <- result_page
# So long as there are additional pages, keep retrieving data
# and appending the returned data to the result set.
# The data type must match the one used for the first page.
while next_token != null do:
    result_page, query_execution_id, next_token <- fetch_next_page(
        service_token,
        data_type,
        query_execution_id,
        next_token
    )
    results <- results + result_page
end while
# Return your exported data.
return results
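The flow above could be written in Python roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: `export_all`, `make_http_fetcher`, and the `FetchPage` type are hypothetical names, the service token is assumed to have been obtained already, and error handling (non-200 statuses, token expiry, retries) is left out.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, List, Optional
from urllib.parse import urlencode

# A fetcher takes the query string parameters for one request
# and returns the decoded JSON body of the response.
FetchPage = Callable[[Dict[str, str]], Dict[str, Any]]


def make_http_fetcher(region: str, data_type: str, token: str) -> FetchPage:
    """Return a fetcher bound to one dataType, so every page uses the same one."""
    base_url = f"https://api.{region}.te2.io/v1/export/{data_type}"
    headers = {"Cache-Control": "no-cache", "Authorization": f"Bearer {token}"}

    def fetch_page(params: Dict[str, str]) -> Dict[str, Any]:
        request = urllib.request.Request(f"{base_url}?{urlencode(params)}", headers=headers)
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return fetch_page


def export_all(fetch_page: FetchPage, start_time: str, end_time: str,
               venue_id: Optional[str] = None) -> List[Dict[str, Any]]:
    """Fetch the first page, then follow nextToken until it is null."""
    params = {"startTime": start_time, "endTime": end_time}
    if venue_id is not None:
        params["venueId"] = venue_id
    page = fetch_page(params)
    results = list(page["data"])
    # Later requests reuse the queryExecutionId returned with the first page.
    query_execution_id = page["queryExecutionId"]
    next_token = page.get("nextToken")
    while next_token is not None:
        page = fetch_page({"queryExecutionId": query_execution_id, "nextToken": next_token})
        results.extend(page["data"])
        next_token = page.get("nextToken")
    return results
```

Passing the fetcher in as a parameter keeps the paging logic independent of the HTTP layer, so it can be exercised with canned pages before pointing it at the real endpoint.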
Rules governing export request parameters
Any of the following will result in a returned status of 422:
- an unknown or disallowed dataType is provided
- queryExecutionId is not provided, and neither startTime nor endTime is provided
- the startTime is after the endTime
- the startTime is too far in the past
- the endTime is in the future
- a queryExecutionId is provided without a nextToken
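Some of these rules can be checked on the client before a request is sent. The Python sketch below is illustrative only: `find_violations` and `parse_iso` are hypothetical helpers, and the sketch assumes a time-range request needs both startTime and endTime. It does not check the dataType or the "too far in the past" rule, because the allowed values and the cutoff are not specified in this guide.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def parse_iso(value: str) -> datetime:
    """Parse an ISO-8601 timestamp that includes a timezone."""
    # fromisoformat only accepts a trailing "Z" from Python 3.11, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_violations(params: Dict[str, str], now: Optional[datetime] = None) -> List[str]:
    """Return descriptions of the rule violations that can be detected locally."""
    now = now or datetime.now(timezone.utc)
    problems: List[str] = []
    has_query = "queryExecutionId" in params
    # Assumption: a time-range request must carry both startTime and endTime.
    has_range = "startTime" in params and "endTime" in params
    if not has_query and not has_range:
        problems.append("provide either queryExecutionId or startTime and endTime")
    if has_query and "nextToken" not in params:
        problems.append("queryExecutionId requires nextToken")
    if has_range:
        start = parse_iso(params["startTime"])
        end = parse_iso(params["endTime"])
        if start > end:
            problems.append("startTime is after endTime")
        if end > now:
            problems.append("endTime is in the future")
    return problems
```

An empty list means no locally detectable problem was found. The API remains the authority, so a request that passes these checks can still be rejected with a 422.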
Best practices
Do:
- use the venueId parameter. Pull back only the data you need and no more.
Don’t:
- set the endTime to the current time. The data returned by the export is streaming data. It can take several minutes for data to arrive in accesso’s data lake. Thus, if you query too close to the present time, you may receive incomplete data for your date range. It is recommended not to set endTime to a value later than 15 minutes before the current time.
- repeatedly execute the same request in a short period. Execute once and cache the response if you need the data for an extended period of time.
- attempt to pull back exceedingly large amounts of data in a single request. Limit your requests to periods of no more than a day or so at a time.
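One way to follow the time-range recommendations is to split a long range into shorter windows and cap the end of the range at 15 minutes before the current time. The Python sketch below is one possible approach. The `export_windows` and `to_iso` helpers are hypothetical, and the one-day window and 15-minute margin are defaults taken from the guidance above, not limits enforced by the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional, Tuple


def to_iso(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2019-10-15T16:29:05.734Z."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def export_windows(start: datetime, end: datetime, now: Optional[datetime] = None,
                   max_span: timedelta = timedelta(days=1),
                   settle: timedelta = timedelta(minutes=15)) -> Iterator[Tuple[str, str]]:
    """Yield (startTime, endTime) pairs no longer than max_span each."""
    now = now or datetime.now(timezone.utc)
    # Never ask for data newer than `settle` before now; it may still be arriving.
    end = min(end, now - settle)
    cursor = start
    while cursor < end:
        window_end = min(cursor + max_span, end)
        yield to_iso(cursor), to_iso(window_end)
        cursor = window_end
```

Since startTime and endTime are both inclusive, a record stamped exactly on a window boundary could appear in two adjacent windows, so callers may want to deduplicate records or offset one of the boundaries.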