
Big Data accessibility for SEC reporting? Not yet. Columbia report explains why.

[This post by Hudson Hollister is cross-posted on the Data Transparency Coalition's blog.]

Last Tuesday, Columbia Business School’s Center for Excellence in Accounting and Security Analysis released a definitive report evaluating the implementation of a structured data format for the financial statements that public companies file with the U.S. Securities and Exchange Commission. Over a year in the making and based on extensive discussions and surveys with corporate filers, investors, data and filing vendors, regulators, and others, the report illuminates the promise of structured data to better serve investors, improve the enforcement of securities laws, and make the U.S. capital market more efficient. It also reveals serious flaws in the SEC’s approach thus far – flaws which have prevented the promise from being realized.

The Columbia report is a call to action by both the SEC and Congress. The Data Transparency Coalition is going to pursue that action in 2013.

In 2009, the SEC adopted a requirement for public companies to file each financial statement in the eXtensible Business Reporting Language (XBRL) alongside the regular plain-text version. The requirement was slowly phased in over four years, starting with the largest companies and eventually covering all public companies. The XBRL format imposes a data structure on the financial statements and their notes and footnotes by assigning electronic tags to each item and defining how the items relate to one another.
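To make "assigning electronic tags to each item" concrete, here is a minimal sketch of how software might read tagged facts. The instance fragment is hand-written and heavily simplified, the figures and the all-zero company identifier are made up, and the function name `read_facts` is an assumption for illustration; real filings are far larger and carry more attributes.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified instance fragment. The company identifier and
# all figures are invented for illustration only.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
      xmlns:us-gaap="http://fasb.org/us-gaap/2012-01-31">
  <context id="FY2012">
    <entity><identifier scheme="http://www.sec.gov/CIK">0000000000</identifier></entity>
    <period><startDate>2012-01-01</startDate><endDate>2012-12-31</endDate></period>
  </context>
  <unit id="USD"><measure>iso4217:USD</measure></unit>
  <us-gaap:Revenues contextRef="FY2012" unitRef="USD" decimals="0">1000000</us-gaap:Revenues>
  <us-gaap:NetIncomeLoss contextRef="FY2012" unitRef="USD" decimals="0">150000</us-gaap:NetIncomeLoss>
</xbrl>"""

XBRLI = "{http://www.xbrl.org/2003/instance}"


def read_facts(xml_text):
    """Return one dict per tagged fact, with its concept, period end, unit and value."""
    root = ET.fromstring(xml_text)

    # Each context says which entity and reporting period a fact belongs to.
    period_end = {}
    for ctx in root.findall(XBRLI + "context"):
        end = ctx.find(f"{XBRLI}period/{XBRLI}endDate")
        period_end[ctx.get("id")] = end.text if end is not None else None

    facts = []
    for el in root:
        ctx_ref = el.get("contextRef")
        if ctx_ref is None:
            continue  # context and unit elements are not facts
        # ElementTree stores tags as "{namespace}LocalName".
        namespace, _, concept = el.tag[1:].partition("}")
        facts.append({
            "concept": concept,
            "namespace": namespace,
            "period_end": period_end.get(ctx_ref),
            "unit": el.get("unitRef"),
            "value": int(el.text),
        })
    return facts


facts = read_facts(SAMPLE)
by_concept = {f["concept"]: f["value"] for f in facts}
net_margin = by_concept["NetIncomeLoss"] / by_concept["Revenues"]
```

Because every number arrives with its concept, period and unit attached, a ratio such as net margin can be computed without anyone retyping figures from a document.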

Judging by potential impact, this is the most ambitious data transparency program ever undertaken by the U.S. government. The XBRL reporting requirement transformed all of the public financial statements in the world’s largest capital market from cumbersome text, which must be manually transcribed to allow quantitative analysis by investors and regulators, into an open, standardized, machine-readable format.

In theory, replacing unstructured text with structured data should, by now, have triggered revolutions and disruptions all over the financial industry. The SEC’s XBRL reporting requirement should, by now, have opened up corporate financial statements in the United States to Big Data platforms and applications.

  • Investors and analysts serving them should, by now, have started using powerful new software tools to compare and analyze the newly-structured financial statements – and to mash financial figures together with other data sources. They should be making better decisions, evaluating a broader universe of companies, and democratizing the financial industry.
  • Aggregators like Bloomberg and Google Finance should, by now, have started saving money and improving accuracy by ingesting corporate financial data directly from the SEC’s structured XBRL feed instead of manually entering the numbers into their own systems (or paying someone else to do that).
  • The SEC should, by now, have incorporated structured corporate financial data into its own review processes, instead of relying on manual reviews of the financial statements in Forms 10-K and 10-Q.
  • Other federal agencies should, by now, have started automatically checking the financial performance of companies as reported to the SEC before bestowing contracts or loan guarantees (among many other possible uses).

None of these things is happening on a large scale – yet. The Columbia report explains why. The Columbia report also hints at what the SEC and Congress can, and should, do about it.

 

What does the Columbia report tell us?

  • Investors are demanding structured data – not unstructured text – to track companies’ financial performance. The Columbia authors “have no doubt that [investors'] analysis of companies will continue to be based off increasing amounts of data that are structured and delivered to users in an interactive [structured] format” (p. i). “[T]here is clear demand for timely, structured, machine-readable data including information in financial reports, and … this need can be met via XBRL as long as the XBRL-tagged data can reduce the total processing costs of acquiring and proofing the data, and that the data are easily integrated (mapped) into current processes” (p. 20).
  • Nonetheless, most investors are not making any use of the structured-data financial statements that public companies are now submitting to the SEC. Fewer than ten percent of the Columbia study’s non-scientific sample of investors said they were using XBRL data downloaded directly from the SEC or from XBRL US (p. 61). Instead, most investors were getting their corporate financial information from aggregators like Bloomberg and Google Finance – some free, some not. Moreover, aggregators told Columbia that they were not using XBRL data either. Aggregators were mostly still electronically scraping the old-fashioned plain-text financial statements (which are still being filed alongside the new structured-data financial statements) and manually verifying the numbers – or paying others to do that “labor-intensive” work for them (pp. 26-27).
  • Two problems explain why most investors have not begun to use structured-data financial statements. First, they don’t yet trust the data. “XBRL-tagged SEC data are generally perceived by investors as unreliable,” say the Columbia authors, both because of errors in numbers and categorization and because of companies’ use of unnecessary extensions, hindering comparability (p. 28). Columbia’s review of the quality of structured-data financial statements filed with the SEC (conducted two years ago) revealed that fully 73% of filings had data quality errors (p. 32). Moreover, investors reported “a large number of seemingly unnecessary company-specific tags” (p. 21). Investors surveyed by Columbia were “especially hesitant about using the data until they are comfortable that the XBRL data matches the [plain-text] data in SEC filings” (p. 21). Aggregators, too, were holding off until accuracy and comparability improved.
  • Second, investors don’t yet have a wide range of software tools to compare and analyze structured-data financial statements. “End users are also looking for easy-to-use XBRL consumption and analysis tools that do not require programming or query language knowledge. In general, these users are not willing or able to incur the significant disruption to their workflow that they perceived would be required to incorporate XBRL data without state-of-the-art consumption and analytics tools” (p. 24).
  • If these two problems were fixed, investors could make enthusiastic and productive use of structured-data financial statements. “[T]he potential for interactive data to democratize financial information and transform transparency remains stronger than ever, and many participants, including most investors and analysts, wish that the data were useful today,” say the Columbia authors (p. 4). For instance, “virtually all investors” frequently use information that is available only in the footnotes of corporate financial statements to make their decisions – information that is now submitted and published in XBRL as part of companies’ structured-data filings (p. 48). “With respect to the detailed-tagged footnote data, in particular, several investors and analysts have communicated to us that they view XBRL data as potentially an excellent solution to manually collecting the data they need” (p. 31).
  • Even if most investors aren’t directly using structured-data financial statements, there will be indirect benefits to investors and the markets if the SEC starts using such data for its own reviews. The study reported that “the SEC has begun to review the data to identify filer-wide, as well as individual company filing and financial reporting issues. XBRL data could significantly enhance the efficiency of the Division of Corporate Finance’s review of filings and facilitate a ‘red-flag’ ex-ante approach to regulatory oversight” (p. 25). “Representatives from the FASB and the SEC have both stated on the record that, in their opinions, the amount of time that it takes them to conduct their respective analyses has been reduced significantly by their use of the XBRL-tagged data” (p. 26). Even imperfectly implemented, the XBRL mandate could indirectly benefit investors and the markets by improving the SEC’s review and enforcement processes.
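The two reliability complaints in the list above, company-specific extension tags and tagged values that disagree with the plain-text figures, are the kind of thing software can screen for. The sketch below is only an illustration under stated assumptions: the facts, the `abc` company prefix, and the function names `extension_share` and `mismatches` are all hypothetical, and it assumes the plain-text figures have already been extracted and keyed by concept name, which is the hard part in practice.

```python
# Prefixes treated as "standard" here: the US GAAP taxonomy and the
# document-and-entity-information taxonomy. Anything else is counted
# as a company-specific extension. This is a simplification.
STANDARD_PREFIXES = {"us-gaap", "dei"}


def extension_share(facts):
    """Fraction of facts that use a company-specific (non-standard) tag."""
    if not facts:
        return 0.0
    custom = sum(1 for f in facts if f["prefix"] not in STANDARD_PREFIXES)
    return custom / len(facts)


def mismatches(facts, text_values):
    """List (concept, tagged value, plain-text value) where the two disagree.

    Facts with no plain-text counterpart are skipped rather than flagged.
    """
    problems = []
    for f in facts:
        expected = text_values.get(f["concept"])
        if expected is not None and expected != f["value"]:
            problems.append((f["concept"], f["value"], expected))
    return problems


# Hypothetical filing: one sign error and one extension tag.
tagged = [
    {"prefix": "us-gaap", "concept": "Revenues", "value": 1000000},
    {"prefix": "us-gaap", "concept": "NetIncomeLoss", "value": -150000},
    {"prefix": "abc", "concept": "AdjustedWidgetRevenue", "value": 400000},
    {"prefix": "us-gaap", "concept": "Assets", "value": 5000000},
]
# Hypothetical figures as they appear in the plain-text statement.
plain_text = {"Revenues": 1000000, "NetIncomeLoss": 150000, "Assets": 5000000}

share = extension_share(tagged)
errors = mismatches(tagged, plain_text)
```

A screen like this cannot say whether an extension was necessary; it can only surface candidates for a human, or a regulator, to review.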

The SEC’s XBRL reporting requirement could deliver transformative data transparency. But it has not. So far its impact has been incremental, not transformative.

To be sure, the problems identified by the Columbia study are problems of execution, not shortcomings of XBRL itself or of the concept of structured data. Investors and the analysts serving them “would like to have the U.S. regulatory filings tagged in a structured (e.g., XBRL) format that would meet their information requirements” (p. 5). For the SEC to eliminate the XBRL reporting requirement entirely – as some filers seem to hope that it will – would be a backward move and a tragic mistake.

Nevertheless, structured data for financial statements is, without doubt, “at a critical stage in its development. Without a serious reconsideration of the technology, coupled with a focus on facile usability of the data, and value-add consumption tools, it will at best remain of marginal benefit to the target audience of both its early proponents and the SEC’s mandate—investors and analysts” (p. ii). 

 

How can these problems be fixed?

How can the SEC fix these problems of reliability and analysis and deliver transformative transparency? The Columbia report suggests four answers:

  • First, insist on accuracy and quality! The SEC does not require companies to amend their filings to correct tagging errors and unnecessary extensions. The Columbia report suggests strongly that it should. The Columbia authors fault “the reticence (or inability) of regulators and filers to ensure that the interactive filings data are accurate and correctly-tagged from day one of their release to the public and forward (or, to communicate to the market for this information that they were not insisting on this and why)” (p. 37). It is “critical” to reduce errors and extensions, either through “greater regulatory oversight and potentially requiring the audit of this data” or through third-party quality checks (pp. 42-43). The SEC’s own interests should motivate it to insist on accuracy once it becomes “serious about using the data in its Corporate Finance function and even for enforcement, as it should” (p. 43) (emphasis added). The need to improve quality might require the SEC and the Financial Accounting Standards Board to consider simplifying the underlying XBRL taxonomy (pp. i, 14, 43).
  • Second, communicate that structured data is not a supplemental feature of a regulatory filing. Rather, it is the filing! The Columbia authors explain that “the reliability of the data has been compromised by the way filers have approached their XBRL filings … [perceiving] XBRL-tagging [as] an additional task in the financial reporting documentation process rather than as a part of the internal data systems” (p. 29). The SEC framed its XBRL reporting rule as a requirement to “create an XBRL-tagged reproduction of the paper or HTML presentations of their filings” (p. 37), rather than “making individual data points available for the end user to utilize or present as they required” (p. 39). Since filers think structured-data financial statements are “incremental to their existing [plain-text] filings, they do not perceive any user need” (p. 35) – and take few pains to ensure that investors using their structured data filings get an accurate picture of their finances. “We believe this presentation-centric step hindered or diverted what should have been an important evolution from a paper presentation-centric view of financial reporting information to a far more transparent and effective data-centric one” (p. 37). One way to correct this situation would be to move to a data format that is both human-readable and machine-readable, combining the plain text and structured-data tags in a single filing. Inline XBRL would do exactly that, and in fact the SEC is considering adopting this format (n. 48).
  • Third, encourage the development of software tools that make structured-data financial statements come alive! This is something of a chicken-and-egg problem. More software tools will be created as investors demand them. But effective, lightweight, cheap XBRL analysis tools are already on offer – notably Calcbench.
  • Fourth, expand the mandate! The Columbia report is clear that investors want more regulatory information tagged and structured, not less (p. 28):

i. The data that are required by the SEC to be XBRL-tagged are all relevant in varying degrees to some subset of the investor/analyst population, but more data are required than currently mandated—e.g., earnings release, MD&A, etc.

ii. If anything, users require more, not less, types of machine-readable data to be made available, because a significant amount of information they require are not from SEC filings or financial statements.

iii. The primary focus on data in the SEC filings of annual and quarterly financial statements seriously limits the perceived ongoing usefulness and relevance of the data.

Over and over, the report points out that the SEC’s current mandate for structured data is limited to the financial statements and accompanying notes (pp. 14, 18, 21, 24, 34-35, 42). Everything else that companies must file with the SEC under the U.S. securities laws is still submitted only in plain text. These other materials – earnings releases, corporate actions, executive compensation disclosures, proxy statements, officer and director lists, management discussions – could be valuable if tagged. But they are not. Investors “view access to the full array of footnote, management discussion and analysis (MD&A), and earnings release numerical data as the main reason to consider adapting their workflow to incorporate XBRL-tagged filings” (p. 21). But this demand is “pent-up” because such items are not – yet – included in the SEC’s mandate (p. 24).
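The single human-readable and machine-readable filing described under the second recommendation can be pictured with a small example. This is a hand-written, simplified sketch of the Inline XBRL idea, not a real filing: the sentence, the figure, and the function name `inline_facts` are assumptions for illustration, and the parsing ignores details such as number formats and sign attributes that a real processor would have to handle.

```python
import xml.etree.ElementTree as ET

IX = "{http://www.xbrl.org/2013/inlineXBRL}"

# A made-up XHTML page: a person reads the sentence, while software
# reads the tag wrapped around the number inside it.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <p>Revenues for the year were
      $<ix:nonFraction name="us-gaap:Revenues" contextRef="FY2012"
          unitRef="USD" scale="3" decimals="-3">1,000</ix:nonFraction> thousand.</p>
  </body>
</html>"""


def inline_facts(xhtml_text):
    """Map each tagged concept name to its numeric value."""
    root = ET.fromstring(xhtml_text)
    facts = {}
    for el in root.iter(IX + "nonFraction"):
        shown = el.text.strip()               # the figure as displayed, e.g. "1,000"
        scale = int(el.get("scale", "0"))     # power of ten applied to the displayed figure
        facts[el.get("name")] = int(shown.replace(",", "")) * 10 ** scale
    return facts


values = inline_facts(PAGE)
readable = " ".join("".join(root_text for root_text in ET.fromstring(PAGE).itertext()).split())
```

Here `values` holds the number for software and `readable` holds the sentence for a person, both drawn from the same document, so the tagged figure and the displayed figure cannot drift apart.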

What lies ahead? 

The path forward for the SEC is clear. First, the agency must take the basic steps that are necessary to improve the quality of structured-data financial statements. Second, to tap the full potential of structured data, the agency must stop requiring the simultaneous submission of plain-text and structured-data versions of financial statements. It should instead collect a single structured-data version. That would encourage companies, analysts, and the SEC’s own staff to focus on data, not on documents. Third, data transparency requires full standardization as well as publication. Fourth, the agency must expand its structured-data mandate by phasing in more disclosures: earnings releases, management’s discussion and analysis, executive compensation, proxy disclosures, ownership structure, board and officer lists, insider trading reports – and, eventually, everything.

If the SEC is unwilling to act, Congress could insist. Our Coalition will call for the reintroduction, this year, of the Financial Industry Transparency Act. That bipartisan proposal, first introduced in 2010 by Reps. Darrell Issa (R-CA), Edolphus Towns (D-NY), and Spencer Bachus (R-AL), would require these steps as a matter of law.

More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.
