Skip to end of metadata
Go to start of metadata

You can import data from REST API sources which return JSON, XML or CSV data. Data import is similar to Excel and CSV file upload but you do not need to export source data to file and upload to eazyBI. In addition you can schedule regular daily import from provided REST API data source.

Please at first read about eazyBI cubes, dimensions and measures - you will need to understand multi-dimensional data model to be able to map source file columns to cube dimensions and measures.

In this tutorial we will use GitHub issues REST API data source as example.

 

On this page:

Create new source application

Go to Source Data tab and Add new source application and select REST API application type.

If you have created already another similar REST API data source then you can export its definition and paste it in Import definition to create new REST API source application with the same parameters.

REST API source parameters

In the next step you will need to provide REST API source parameters which will be used to retrieve the data. See example:

In Source data URL provide REST API URL which returns the data. In this example, we are using GitHub issues REST API to get all issues from https://github.com/rails/rails/issues. In addition, we have added parameters per_page=100 (to indicate that we want to get 100 issues per page) and state=closed to indicate that we want to get just closed issues.

In Pagination parameters specify if you can get all data just with one request or you need to make many requests to get all result pages. In this example, we are using Page parameter and we specify that we need to use an additional page parameter to get all pages (it will add parameters page=1page=2, ... and so on to get all result pages. If page size can be specified as a parameter then add it in URL input (as we already provided per_page=100 in this example).

Another option is to use Offset and limit parameter if we can get all pages using parameters e.g. offset=0 and limit=100offset=100 and limit=100, ... and so on.

The last option is to use Next page URL and specify a path to a JSON results property which contains the URL of the next page. For example, if REST API uses the HAL convention for specifying the next page URL then enter _links.next.href. If the property will not be specified then the Link header of the HTTP response will be checked if it contains a URL with the attribute rel="next".

It is recommended to use page or offset and limit parameters if possible as it will enable parallel REST API requests and will make the import faster.

Next, we need to specify if and how the requests should be authenticated in Authentication parameters. You can use simple Basic authentication and provide username and password or specify authentication HTTP header name and value, as well as use OAuth 1.0a or OAuth 2.0 authentication. If you need to use OAuth authentication then typically you will need to register eazyBI as consumer / client in the provider application.

In this example GitHub uses OAuth 2.0 for their REST API authentication. At first we need to register eazyBI application in our GitHub account settings / Applications:

Please specify there Authorization callback URL as Redirect URL that you see in eazyBI REST API source parameters page.

After registration you will get your application Client ID and Client Secret - please paste them in eazyBI REST API source parameters page corresponding fields.

If you will create several REST API sources in your eazyBI account that all retrieve data from the same source site (GitHub in this example) then you can use the same Client ID and Client Secret for all these REST API sources.

Please enter Authorize URL and Token URL parameters according to REST API source OAuth authentication documentation (in case of GitHub OAuth implementation they are https://github.com/login/oauth/authorize and https://github.com/login/oauth/access_token.

Finally specify wether REST API will return JSON, XML or CSV. In case of JSON you can specify JSONPath expression which returns JSON array of objects that we want to import in eazyBI (it is necessary if REST API returns more complex object and array of data for import are in some lower level JSON attribute). Similarly in case of XML you can specify XPath expression which returns list of XML nodes that should be imported in eazyBI.

In our GitHub issues example we do not need to provide JSONPath expression as REST API will return simple array of issue objects.

Click Continue to process to the next step.

Custom JavaScript code

You can use custom JavaScript code to modify received JSON, XML or CSV data before importing into eazyBI. Click Add custom JavaScript code to show code editor.

You can use the doc variable to access received data object properties and modify or add additional properties. If necessary you can define additional JavaScript functions that you need to use in your code.

Here is example of JavaScript code which will change title property to capitalized version (first capital letter and then lowercase letters):

function capitalize(s) {
  return s.charAt(0).toUpperCase() + s.slice(1).toLowerCase();
}
doc.title = capitalize(doc.title);

If you would like to skip some data rows and do not import them in eazyBI then use return false; in these cases. Here is example which will skip data rows which do not have title property:

if (!doc.title) return false;

You can also create new properties for the doc object when you need to construct eazyBI dimension level names or calculate additional measures. Here is example how to create full_name property:

doc.full_name = doc.first_name + " " + doc.last_name;

The following feature (returning an array of data rows) is available just in eazyBI Cloud and starting from eazyBI add-on for JIRA version 4.0.3.

You can map one source data row to multiple data rows that should be imported into eazyBI. For example, if doc is an invoice object with several lines then you can return an array of invoice lines which contains both invoice and line attributes:

return _.map(doc.lines, function(line) {
  return {
    invoice_number: doc.number,
    invoice_customer: doc.customer,
	invoice_date: doc.date,
    line_product: line.product,
    line_item_count: line.item_count,
    line_item_amount: line.item_amount
  };
});

You can use Underscore.js functions (like _.each) in your custom JavaScript code.

 

Please contact eazyBI support if you need help to write custom JavaScript code for your specific needs.

Authorize access to REST API source

In case of OAuth authentication in the first time you will be redirected to source application authorization page where you will need to authorize eazyBI access to source application data:

After successful authorization you will be redirected back to eazyBI Source columns mapping page. If there will be any authentication errors then you will see corresponding error messages - please review your authentication settings and try again.

Source columns mapping

REST API source columns mapping is similar to Excel or CSV file columns mapping where you specify which source data fields should be imported as corresponding eazyBI dimensions or measures. Please review Excel and CSV file upload documentation page if you have not yet done so.

You can click Generate default names toolbar button to turn on automatic dimension and level and measure name generation from column names (which is turned on by default when doing file upload). And when you have mapped columns that you want to import in eazyBI then you can click Hide unmapped columns to hide columns that will not be imported into eazyBI.

See example of GitHub issues columns mapping:

When you need to import several columns as the same dimension attributes then click Show options and specify the following options:

  • ID column - if attribute is unique integer ID attribute of dimension member (like issue id attribute in this example). If ID column is not specified then dimension members will be identified by key column and IDs will be automatically generated for each new key column value.
  • Key column - if attribute is unique numeric or string key of dimension member (like issue number here)
  • Name column - if attribute is longer name that should be displayed as dimension member name (if not specified then key column will be used as name column). But keep in mind that name column still needs to be unique within dimension (therefore in this case we will not use title as name as it might not be unique).
  • Ordinal column - if this attribute should be used to order dimension members.
  • Property name - specify if attribute should be imported as additional property for dimension member which is identified by ID or key column (like issue title in this example).
  • When importing date or datetime field as dimension property you can specify additional options:
    • Date count measure - specify measure name that should store count of imported rows in this date (in this example Issues created measure will show the count of issues that were created in created_at date or Issues closed will show the count of issues that were closed in closed_at date).
    • Date count dimension - specify time dimension which should be used for this measure (by default Time dimension name is suggested).

After mapping all necessary columns you can click Start import. If there will be any mapping errors then they will be shown and columns with errors will be highlighted. If you need to save draft of mapping then click Back to edit and confirm that you want to save changes.

Importing of source data

If source columns mapping was saved without any validation errors then REST API source application will be queued for background import. You will see the updated count of imported rows during the import:

And after successful import it will automatically refresh the status of source application:

You can later visit Source Data tab again and click Import button again to import the latest data from REST API source. During each import it will at first delete all data that were imported previously from this source and then import new data. In addition you can also click Delete data to delete imported data from this source (you need to delete imported data also if you want to change source columns mapping).

Export definition

As it was mentioned in the beginning you can export REST API source application definition:

and copy this definition and paste in Import definition field when creating different source application to create a new copy of this definition which can be modified later.

GitHub issues source application definition example

If you would like to use this GitHub issues import example from this tutorial then you can use the following source application definition:

{
  "application_type": "rest_api",
  "application_params": {
    "source_params": {
      "url": "https://api.github.com/repos/rails/rails/issues?per_page=100&state=closed",
      "pagination": "page",
      "page_parameter": "page",
      "authentication_type": "oauth2",
      "oauth2_authorize_url": "https://github.com/login/oauth/authorize",
      "oauth2_token_url": "https://github.com/login/oauth/access_token",
      "content_type": "json"
    },
    "columns_options": {
      "hide_unmapped": true
    }
  },
  "source_cube_name": "GitHub issues",
  "columns": [
    {
      "name": "url",
      "data_type": "string"
    },
    {
      "name": "labels_url",
      "data_type": "string"
    },
    {
      "name": "comments_url",
      "data_type": "string"
    },
    {
      "name": "events_url",
      "data_type": "string"
    },
    {
      "name": "html_url",
      "data_type": "string"
    },
    {
      "name": "id",
      "data_type": "integer",
      "dimension": "Issue",
      "id_column": true
    },
    {
      "name": "number",
      "data_type": "integer",
      "dimension": "Issue",
      "key_column": true
    },
    {
      "name": "title",
      "data_type": "string",
      "dimension": "Issue",
      "property": "Title"
    },
    {
      "name": "user\nlogin",
      "data_type": "string",
      "dimension": "Reporter",
      "key_column": true
    },
    {
      "name": "user\nid",
      "data_type": "integer",
      "dimension": "Reporter",
      "id_column": true
    },
    {
      "name": "user\navatar_url",
      "data_type": "string"
    },
    {
      "name": "user\ngravatar_id",
      "data_type": "string"
    },
    {
      "name": "user\nurl",
      "data_type": "string"
    },
    {
      "name": "user\nhtml_url",
      "data_type": "string"
    },
    {
      "name": "user\nfollowers_url",
      "data_type": "string"
    },
    {
      "name": "user\nfollowing_url",
      "data_type": "string"
    },
    {
      "name": "user\ngists_url",
      "data_type": "string"
    },
    {
      "name": "user\nstarred_url",
      "data_type": "string"
    },
    {
      "name": "user\nsubscriptions_url",
      "data_type": "string"
    },
    {
      "name": "user\norganizations_url",
      "data_type": "string"
    },
    {
      "name": "user\nrepos_url",
      "data_type": "string"
    },
    {
      "name": "user\nevents_url",
      "data_type": "string"
    },
    {
      "name": "user\nreceived_events_url",
      "data_type": "string"
    },
    {
      "name": "user\ntype",
      "data_type": "string"
    },
    {
      "name": "user\nsite_admin",
      "data_type": "string"
    },
    {
      "name": "labels",
      "data_type": "string"
    },
    {
      "name": "state",
      "data_type": "string",
      "dimension": "Status"
    },
    {
      "name": "comments",
      "data_type": "integer"
    },
    {
      "name": "created_at",
      "data_type": "datetime",
      "dimension": "Issue",
      "property": "Created at",
      "date_count_measure": "Issues created",
      "date_count_dimension": "Time"
    },
    {
      "name": "updated_at",
      "data_type": "datetime",
      "dimension": "Issue",
      "property": "Updated at"
    },
    {
      "name": "closed_at",
      "data_type": "datetime",
      "dimension": "Issue",
      "property": "Closed at",
      "date_count_measure": "Issues closed",
      "date_count_dimension": "Time"
    },
    {
      "name": "pull_request\nhtml_url",
      "data_type": "string"
    },
    {
      "name": "pull_request\ndiff_url",
      "data_type": "string"
    },
    {
      "name": "pull_request\npatch_url",
      "data_type": "string"
    },
    {
      "name": "body",
      "data_type": "string"
    },
    {
      "name": "milestone\nurl",
      "data_type": "string"
    },
    {
      "name": "milestone\nlabels_url",
      "data_type": "string"
    },
    {
      "name": "milestone\nid",
      "data_type": "integer",
      "dimension": "Milestone",
      "id_column": true
    },
    {
      "name": "milestone\nnumber",
      "data_type": "integer",
      "dimension": "Milestone",
      "key_column": true
    },
    {
      "name": "milestone\ntitle",
      "data_type": "string",
      "scale": 1,
      "dimension": "Milestone",
      "name_column": true
    },
    {
      "name": "milestone\ndescription",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nlogin",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nid",
      "data_type": "integer"
    },
    {
      "name": "milestone\ncreator\navatar_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\ngravatar_id",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nurl",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nhtml_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nfollowers_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nfollowing_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\ngists_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nstarred_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nsubscriptions_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\norganizations_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nrepos_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nevents_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nreceived_events_url",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\ntype",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreator\nsite_admin",
      "data_type": "string"
    },
    {
      "name": "milestone\nopen_issues",
      "data_type": "integer"
    },
    {
      "name": "milestone\nclosed_issues",
      "data_type": "integer"
    },
    {
      "name": "milestone\nstate",
      "data_type": "string"
    },
    {
      "name": "milestone\ncreated_at",
      "data_type": "datetime"
    },
    {
      "name": "milestone\nupdated_at",
      "data_type": "datetime"
    },
    {
      "name": "milestone\ndue_on",
      "data_type": "string"
    },
    {
      "name": "assignee\nlogin",
      "data_type": "string",
      "dimension": "Assignee",
      "key_column": true
    },
    {
      "name": "assignee\nid",
      "data_type": "integer",
      "dimension": "Assignee",
      "id_column": true
    },
    {
      "name": "assignee\navatar_url",
      "data_type": "string"
    },
    {
      "name": "assignee\ngravatar_id",
      "data_type": "string"
    },
    {
      "name": "assignee\nurl",
      "data_type": "string"
    },
    {
      "name": "assignee\nhtml_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nfollowers_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nfollowing_url",
      "data_type": "string"
    },
    {
      "name": "assignee\ngists_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nstarred_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nsubscriptions_url",
      "data_type": "string"
    },
    {
      "name": "assignee\norganizations_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nrepos_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nevents_url",
      "data_type": "string"
    },
    {
      "name": "assignee\nreceived_events_url",
      "data_type": "string"
    },
    {
      "name": "assignee\ntype",
      "data_type": "string"
    },
    {
      "name": "assignee\nsite_admin",
      "data_type": "string"
    }
  ]
}

You will need to register your own client application in GitHub account settings and get your Client ID and Client Secret.

 

If you need any help with REST API source data import then please contact eazyBI support.

 

  • No labels