Building meaningful charts using Xero’s Reporting API

Xero’s Reporting API is an extremely powerful endpoint which provides us with the data required to assemble their most common reports, such as Balance Sheets or Profit & Loss. Unfortunately, with this power comes a level of abstraction that makes it quite tough to wrangle the data into a meaningful chart or report.

In this post, I will demonstrate what the structure of this endpoint means, why it is actually excellent (for us nerds), and the SQL required to coerce it into something that humans (not nerds) can understand.

The structure

Please note: In this tutorial I’ll be using the Profit & Loss report, but the same concepts apply to all of Xero’s reporting endpoints

When you think about your Profit & Loss report, you probably imagine it visually - a list of items down the left hand side, with corresponding values for the month in a separate column. Something, like this (taken from Xero’s own dashboards):

But that’s not how a data-driven person sees it. We see a huge range of considerations when trying to convert this into a consistent format suitable for an API. No, we see something more like this:

  • Reports are divided into sections, with section totals at the bottom of each. This also indicates that there might be sub-sections (or sub-sub-sections) in some reports

  • The column that contains the totals on the right, also contains a heading, which is of type string. This means the numbers may be type string as well - making things like sum or avg difficult

  • Different variations show a different number of columns, such as a YTD roll-up

  • …you get the idea…

Abstracting the data

The variance in formats makes it prohibitive to generate a different endpoint for each permutation, so what the Xero team did was abstract the data into its common features - rows & cells. Let’s have a look at what the API returns (I’ve truncated to only the Gross Profit section):

{
  "Id": "896ccad4-d9f9-457c-9453-3998692ae1d6",
  "Status": "OK",
  "ProviderName": "API Explorer",
  "DateTimeUTC": "\/Date(1686862568907)\/",
  "Reports": [
    {
      "ReportID": "ProfitAndLoss",
      "ReportName": "Profit and Loss",
      "ReportType": "ProfitAndLoss",
      "ReportTitles": [
        "Profit & Loss",
        "Demo Company (NZ)",
        "1 June 2023 to 30 June 2023"
      ],
      "ReportDate": "15 June 2023",
      "UpdatedDateUTC": "\/Date(1686862568907)\/",
      "Fields": [],
      "Rows": [
        {
          "RowType": "Header",
          "Cells": [
            {
              "Value": ""
            },
            {
              "Value": "30 Jun 23"
            }
          ]
        },
        {
          "RowType": "Section",
          "Title": "Income",
          "Rows": [
            {
              "RowType": "Row",
              "Cells": [
                {
                  "Value": "Sales",
                  "Attributes": [
                    {
                      "Value": "7d05a53d-613d-4eb2-a2fc-dcb6adb80b80",
                      "Id": "account"
                    }
                  ]
                },
                {
                  "Value": "6398.52",
                  "Attributes": [
                    {
                      "Value": "7d05a53d-613d-4eb2-a2fc-dcb6adb80b80",
                      "Id": "account"
                    }
                  ]
                }
              ]
            },
            {
              "RowType": "SummaryRow",
              "Cells": [
                {
                  "Value": "Total Income"
                },
                {
                  "Value": "6398.52"
                }
              ]
            }
          ]
        },
        {
          "RowType": "Section",
          "Title": "Less Cost of Sales",
          "Rows": [
            {
              "RowType": "Row",
              "Cells": [
                {
                  "Value": "Purchases",
                  "Attributes": [
                    {
                      "Value": "573a170b-6792-4cfa-b8ce-6f8fd27f5458",
                      "Id": "account"
                    }
                  ]
                },
                {
                  "Value": "730.43",
                  "Attributes": [
                    {
                      "Value": "573a170b-6792-4cfa-b8ce-6f8fd27f5458",
                      "Id": "account"
                    }
                  ]
                }
              ]
            },
            {
              "RowType": "SummaryRow",
              "Cells": [
                {
                  "Value": "Total Cost of Sales"
                },
                {
                  "Value": "730.43"
                }
              ]
            }
          ]
        },
        {
          "RowType": "Section",
          "Title": "",
          "Rows": [
            {
              "RowType": "Row",
              "Cells": [
                {
                  "Value": "Gross Profit"
                },
                {
                  "Value": "5668.09"
                }
              ]
            }
          ]
        },
        {
          "RowType": "Section",
          "Title": "",
          "Rows": [
            {
              "RowType": "Row",
              "Cells": [
                {
                  "Value": "Net Profit"
                },
                {
                  "Value": "4986.98"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

Wow, that is quite a lot to take in. You’ll probably recognize a few things like your product/service names and the totals. But it’s not immediately obvious how to coerce this into a tabular format.

Step 1/2 - Storing the data in a relational database

The format of the API response is called JSON - and it’s a common format for APIs to return data in. It’s possible to query and manipulate JSON data, but your reporting tool probably doesn’t have this facility and besides, you have to be pretty nerdy.

So, our first step is to convert this JSON into a relational-database format. Reporting tools love relational data. At SyncHub, below is the structure we came up with.

If you have a SyncHub subscription connected to Xero, these tables will automatically appear in your database, and the data will be kept up to date automatically too.

As you can see, we retained the abstract nature of the data (rows & cells), and augmented with a few other quality-of-life fields such as typed columns and foreign keys back to the Account records.

At this point you’re probably thinking that we’re just replacing one confusing solution for another. And from a human’s point of view, you’re right. But from your reporting tool’s point of view there is one major new benefit - SQL queries.

Step 2/2 - Querying your Xero reports using SQL queries

Now, this data does not lend itself to your run-of-the-mill SELECT queries. No, for this you need something more powerful - the PIVOT query. I won’t go into details about exactly how the PIVOT query works - you can search Google or ask ChatGPT about that. But in short, in our case it lets us use the imported column values (i.e. month names) as columns headings. I’ll cut straight to the code, pasted below.

Note that I’m using MS SQL, which is the default database provided free as part of your SyncHub subscription. If you’re using an alternative such as Postgres, you’ll have to tweak accordingly.

declare @ReportID nvarchar(500)
declare @ColumnNames table (ColumnName nvarchar(500), CellNumber int)

-- Get the most recent report
select top 1 @ReportID = RemoteID from [CONNECTIONS.xero].ProfitAndLossReportByMonth
order by [From] desc


-- Some reports do not have a column name in the initial column, as it is implicit. Our pivot below however
-- needs a column name, so we'll create one here
declare @DefaultColumnName nvarchar(100) = 'Item'

-- Get our column names. Most typically these are the months that your report covers (e.g. 'Feb 1979')
-- Note that this logic is duplicated in the dynamic query below - REF_COLUMN_NAMES
insert into @ColumnNames
select
	case when isnull(cc.Value, '') = '' then 'Item' else cc.Value end as ColumnName,
	cc.CellNumber
from [CONNECTIONS.xero].ReportRowCell cc
inner join [CONNECTIONS.xero].ReportRow header on (cc.ReportRowRemoteID = header.RemoteID and header.RowType = 'Header')
where header.ReportRemoteID = @ReportID
and cc.IsDeleted = 0 and header.IsDeleted = 0

-- Check that we have data
if not exists (select 1 from @ColumnNames)
begin
	raiserror('No data is present - please make sure your data is synced and up to date', 16, 1)
	return
end

-- Pull the column names into a variable so we can dynamically inject into our query below
declare @Columns nvarchar(max) = '' 
select @Columns += QUOTENAME(ColumnName) + ','
from @ColumnNames
order by CellNumber
set @Columns = left(@Columns, len(@Columns) - 1)

-- Dynamically execute our SQL so that we may inject these column names using string concatenation
declare @SQL nvarchar(max) = N'
;with allcells as (
	select 
		rr.Title, 
		rr.RowType, 
		rr.RowNumber, 
		rr.RemoteID as RowRemoteID,
		columnNames.ColumnName,
		cc.[Value] as CellValue

	from [CONNECTIONS.xero].ReportRow rr
	left join [CONNECTIONS.xero].ReportRowCell cc on rr.RemoteID = cc.ReportRowRemoteID
	-- Cross reference to find the corresponding column name, as per the same logic in REF_COLUMN_NAMES above
	-- We need this so that we can JOIN later
	left join (
		select
			case when isnull(cc.Value, '''') = '''' then @DefaultColumnName else cc.Value end as ColumnName,
			cc.CellNumber
		from [CONNECTIONS.xero].ReportRowCell cc
		inner join [CONNECTIONS.xero].ReportRow header on (cc.ReportRowRemoteID = header.RemoteID and header.RowType = ''Header'')
		where header.ReportRemoteID = @ReportID
		and cc.IsDeleted = 0 and header.IsDeleted = 0
	) columnNames on cc.CellNumber = columnNames.CellNumber
	where rr.ReportRemoteID = @ReportID and rr.IsDeleted = 0
), cellsByColumn as (
	-- Pivot the cells so that they are based on the column name
	select *
	from allCells
	pivot (
		-- Pivot requires that we use some kind of aggregation, so we'll just use min() here - noting that there is only one value anyway
		min(CellValue)
		for ColumnName in (' + @Columns + ')
	) as pt
)
-- Now just format the final report, using all the (dynically generated) month names as columns
select  
	Title, ' + @Columns + '
from cellsByColumn 
where Rowtype <> ''Header''
order by RowNumber
'

-- The work above merely concatenated a long SQL string. Here, we execute it to actually query the database, using sp_executeSQL
exec sp_executeSQL @SQL, N'@ReportID nvarchar(500), @DefaultColumnName nvarchar(100)', @ReportID, @DefaultColumnName

If you execute this against your SyncHub data store, you’ll get something along the lines of the results below:

So, how can you use this?

I won’t lie to you - that is a gnarly query. But, that is the nature of BI sometimes, and think how impressed everybody in the office will be! So let’s move on to the final step - how exactly can you use this query in your reports?

Using your own reporting tool

Unfortunately, we can’t provide guidance for every reporting tool, but (assuming your tool supports SQL queries, which it almost certainly does), what you can essentially do is copy/paste this into your editor and tweak accordingly.

Note: you must replace the schema placeholder [CONNECTIONS.xero] with whatever the schema name of your SyncHub data store is.

There is also the chance that your reporting tool won’t let you execute dynamically-assembled strings. In this case, you’ll have to either:

  • hard-code the column names (changing every month, as required)

  • use our Insights option below, then query the resulting table from your reporting tool

Using SyncHub Insights

For a much easier solution, you can use this query in our Insights platform. Insights will periodically execute the query, and store the results in a new database table. From there, you can either:

  • query your new database table from your reporting tool, using a simple select * from… query; or

  • use Insights itself to render and share your reports

To conclude

Xero is great. Their Reporting API is great. SyncHub is great.

Happy reporting everybody!

Previous
Previous

The Flexi Table

Next
Next

Introducing…SyncHub Insights