Developing Financial Applications using XBRL

XBRL (eXtensible Business Reporting Language) is a data format for company financial reporting which can be easily consumed by software applications. The ‘Language’ in the name is actually a misnomer, XBRL is simply a data format as opposed to a computer language capable of performing operations. XBRL is a variant of the XML format which ‘marks-up’ data in a tag based syntax which is human readable. For example, to report Revenues the below code could be used :

<us-gaap:Revenues> 1500000 </us-gaap:Revenues>

This is clearly a simple format and for this reason XML based formats have become popular across a wide range of applications, since Office 2007 both Microsoft Word and Microsoft Excel use XML based formats (.docx and .xlsx respectively).

However, despite the promise of an easily consumable format for financial reporting data, XBRL is deeply flawed and has several major issues.

Issues with XML

XML is very useful for representing basic data structures, however as the complexity of the data structure grows the XML file(s) becomes extremely large and unwieldy. A set of company financial reports is a large and heavily structured set of data and so the XBRL files to represent them is very large. For a single quarterly financial report there will typically be at least six interlinked documents which will in total be about five times the size of the PDF.

XML has never been a performant data format, even for simple data structures, however it becomes incredibly sluggish for heavily structured data. To read and parse and single XBRL document took a dual-processor server between 50 to 80 seconds. There are a multitude of optimisations which could be applied but none are going to be sufficient to enable scaling and compete with database access times which are measured in milliseconds.

Thus, for any serious financial analysis we will need to develop a database schema and first read the XBRL document into the database which will then be the source of all analysis queries.

The XBRL Spec

Many of the elements for the XBRL specification were directly ported from the regulatory reporting requirements. The reporting requirements are sufficient for human analysis of financial reports but perform very poorly when translated into a data format to be consumed by software applications.

When reading financial reports it is very obvious which figures relate to the current reporting period. However to determine this for an XBRL document is not nearly so easy. For any given line item such as Revenue there are numerous Revenue items which relate to various periods and business segments, only one of these elements relates to the firm’s current Revenue for the actual period and determining which is error-prone. For starters, XBRL does not define the start and end dates for the current period and some companies use non-standard dates (for example Apple ends its reporting periods not on the last day of the period but the last business day). Thus to extract the actual reporting elements related to the current period some relatively convoluted (and hence error-prone) code logic is required.

In financial reports a zero balance is often not required to be reported. This works fine when reading reports but for a data format it is not very robust. For example, there is no debt on Apple’s Balance Sheet so the Debt elements are simply not included in the XBRL document. If we read the data into an application and attempt to calculate a metric which uses debt as an input an error will be thrown since the Debt element is simply not found as opposed to being found and having a zero balance. Thus, a zero balance will need to be programatically substituted when no element is found but this is definitely not robust (for example, we would like to know if an element could not be read properly for the XBRL doc and so was excluded for this reason).

A major omission from the XBRL spec is non-financial data which is often crucial for an analyst. For example, Apple includes unit sales (eg number of iPhone sales) in its reports but not in it XBRL filings.

XBRL Taxonomies

The above issues pale into insignificance when compared with the minefield of XBRL taxonomies.

The XBRL taxonomy is basically a template for the tagging and structuring various items which comprise a financial report.

XBRL allows firms a lot of latitude in selecting and even creating taxonomies, as well as changing taxonomies between reporting periods. Taking Apple again as an example, up until 2010 it reported its Fixed Assets using the XBRL tag aaplPropertyPlantAndEquipmentAndCapitalizedSoftwareNet , from 2011 it changed to using us-gaapPropertyPlantAndEquipmentGross. There was no notification of the change or mapping information so software parsing the two documents would not be able to associate the items as being the same balance for different periods.
The differences between XBRL documents of different companies is even greater, making financial comparisons between companies almost impossible.

The above issues (and especially the taxonomy issue) means that the manual intervention is almost always necessary original promise of XBRL to allow fully automated data analysis still unfulfilled.