Big Data in Financial Analytics

Big Data and UIMA use cases

Big Data & Text Analytics: As the analysis of large amounts of unstructured data gains a major place in enterprise computing, more use cases are emerging in this area. While the "Big" in Big Data makes the term almost synonymous with Massively Parallel Processing frameworks like Hadoop, the underlying success of Big Data relies on effective content analytics of the unstructured data. I highlighted this line of thought in my earlier article, Big Data Analytics Thinking Outside Of Hadoop.

Unstructured Content Analytics is defined as the process of gaining new insights from unstructured data by employing text mining, image recognition, voice recognition and other related analytical techniques.

The material below explains one such use case of Big Data & Text Analytics: getting meaningful insights from financial reports.

Financial Reports & Analytics: All publicly traded companies in the USA and elsewhere are required to disclose their corporate information to their shareholders. These annual financial statements are available as downloadable reports on the corporate websites of public companies. Apart from the annual report, other documents such as investor newsletters, quarterly earnings presentations, conference calls by the CFO and other investor relations material also shape an organization's financial standing in the eyes of the investor.

Most investors and investment analyst firms currently use their specialized knowledge to understand these financial statements and create meaningful insights from them. However, these analytics are mostly limited to the structured portions of the financial statements and pay much less attention to the unstructured side.

To explain this further:

  • For example, an annual report may contain statements such as the balance sheet, income, equity and cash flow statements. These are highly structured and organized according to accounting principles, so that any qualified financial analyst can understand them.
  • At the same time, a typical financial statement also contains a lot of unstructured information about the organization's growth strategies, road map, optimism, future vision, how the business model is aligned to changing times, and so on.

So an effective analysis of a financial statement covers not only the structured information but also the unstructured data available in it.

Big Data, UIMA & Financial Report Analytics: The following Big Data aligned technologies can be used effectively in analysing financial reports to derive meaningful insights from large volumes of unstructured data.

  • UIMA : UIMA stands for Unstructured Information Management Architecture and is the major industry standard for content analytics.

  • Annotators : UIMA™ annotators do the real work of extracting structured information from unstructured data. You can write your own annotators. Though annotators form part of the UIMA framework, a lot of custom development goes into creating annotators specific to the needs of the finance industry. When documents are processed through the document processing pipeline, the annotators extract concepts, words, phrases, classifications, and named entities from unstructured content and mark these extractions as annotations. The annotations are added to the index as tokens or facets and are used as the source for content analysis.

  • Taxonomies : Taxonomies play a major role in identifying the topics of interest within a document using UIMA. In UIMA, a type system defines the various types of objects that may be discovered in the document. Types in a UIMA type system may be organized into a taxonomy. For example, Company may be defined as a subtype of Organization.
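
To make the annotator and type system ideas above concrete, here is a minimal sketch in plain Python. It does not use the UIMA API; the type names, the regular expressions and the helper functions (`annotate`, `select`, `is_subtype`) are all hypothetical and only illustrate the idea of marking typed spans of text and querying them through a type hierarchy.

```python
import re
from dataclasses import dataclass

# Hypothetical type system: each type maps to its parent (None for the root).
TYPE_PARENTS = {
    "Annotation": None,
    "Organization": "Annotation",
    "Company": "Organization",  # Company is a subtype of Organization
    "Money": "Annotation",
}


def is_subtype(type_name, ancestor):
    """Return True if type_name equals ancestor or inherits from it."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = TYPE_PARENTS[type_name]
    return False


@dataclass(frozen=True)
class Annotation:
    type_name: str
    begin: int  # character offset where the covered text starts
    end: int    # character offset just past the covered text
    text: str


# Illustrative patterns only; a real annotator would use far richer rules.
PATTERNS = {
    "Company": re.compile(r"\b[A-Z][a-z]+ (?:Inc|Corp|Ltd)\b"),
    "Money": re.compile(r"\$\d+(?:\.\d+)? (?:million|billion)"),
}


def annotate(text):
    """Run every pattern over the text and return annotations in text order."""
    found = []
    for type_name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append(
                Annotation(type_name, match.start(), match.end(), match.group())
            )
    return sorted(found, key=lambda a: a.begin)


def select(annotations, type_name):
    """Return annotations whose type is type_name or one of its subtypes."""
    return [a for a in annotations if is_subtype(a.type_name, type_name)]


sample = "Acme Corp reported revenue of $4.2 billion, while Globex Inc expects growth."
for ann in select(annotate(sample), "Organization"):
    print(ann.type_name, ann.begin, ann.end, ann.text)
```

Because the query goes through `is_subtype`, asking for Organization also returns the Company annotations, which is the practical benefit of arranging types into a taxonomy.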

Realizing Financial Statement Analytics & Role of XBRL: There are not many UIMA annotators or text extraction implementations specific to financial statements. However, the Apache UIMA community does provide one relevant option: the AlchemyAPI Annotator, a set of annotators that wrap the AlchemyAPI.

AlchemyAPI's (http://www.alchemyapi.com/api/) Categorization service can be used to categorize text, HTML, or web-based content, assigning the most likely topic category (news, sports, business, etc.). The business categories include topics like Business and Finance News, SEC filings, etc.

Several text analytics concepts, such as the ones below, can be applied to financial statements:

  • Named Entity Extraction : Identify people, companies, organizations, cities, geographic features, and other typed entities within HTML pages and text documents/content.
  • Concept Tagging : Automatically tag documents and text in a manner similar to human-based tagging.
  • Keyword / Term Extraction : Extract important terms and "topic" keywords from HTML pages and text documents/content. Advanced statistical and linguistic algorithms analyze your content, "tagging" it with the most important words and phrases.
  • Sentiment Analysis : Identify positive, negative and neutral sentiment within HTML pages and text documents/content.
  • Relation Extraction : Identify facts and Subject-Action-Object relations within HTML pages and text documents/content.
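
As a rough illustration of two of the concepts listed above, keyword extraction and sentiment analysis, the following sketch uses only word counts and small word lists. It does not call the AlchemyAPI; the stop words, the positive and negative word lists and the function names are assumptions made for this example, and production services rely on much more advanced statistical and linguistic models.

```python
import re
from collections import Counter

# Tiny illustrative word lists; real systems use curated lexicons and models.
STOPWORDS = {
    "the", "a", "an", "of", "and", "in", "to", "our", "we", "is", "are",
    "for", "with", "this", "that", "was", "by", "on",
}
POSITIVE = {"growth", "strong", "improved", "optimistic", "record"}
NEGATIVE = {"loss", "decline", "weak", "risk", "impairment"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(text, k=3):
    """Return the k most frequent tokens that are not stop words."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


def sentiment(text):
    """Classify text by counting positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


letter = (
    "We delivered strong growth this year. Growth in services was a record, "
    "and growth in margins improved."
)
print(top_keywords(letter, 1))
print(sentiment(letter))
```

A lexicon count like this ignores negation and context ("no growth" still counts as positive), so it is only a starting point for the kind of analysis described in the list.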

Apart from the already developed and community-supported annotators, we could develop new annotators that make the best use of the taxonomies already established for the financial industry in the form of XBRL.

XBRL stands for eXtensible Business Reporting Language. It is a language for the electronic communication of business information, providing major benefits in the preparation, analysis and communication of business information. It is one of a family of "XML" languages, which are a standard means of communicating information between businesses and on the internet.

XBRL taxonomies are the dictionaries which the language uses. These are the categorization schemes which define the specific tags for individual items of data (such as "net profit"). National jurisdictions have different accounting regulations, so each may have its own taxonomy. There are already well-established, approved taxonomies for financial reporting, such as XBRL US GAAP, as listed at http://www.xbrl.org/FRTApproved.
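
To show what tagged data of this kind looks like to a program, here is a small sketch that reads facts from a simplified, XBRL-like XML fragment using Python's standard library. The namespace `http://example.com/taxonomy/2012` and the element names `NetProfit` and `Revenue` are hypothetical stand-ins for a real taxonomy, and the fragment omits the context and unit definitions that a complete XBRL instance would contain.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, standing in for a real taxonomy.
EX_NS = "http://example.com/taxonomy/2012"

# Simplified fragment: each child element is one tagged fact.
INSTANCE = f"""
<xbrl xmlns:ex="{EX_NS}">
  <ex:NetProfit contextRef="FY2012" unitRef="USD">1250000</ex:NetProfit>
  <ex:Revenue contextRef="FY2012" unitRef="USD">9800000</ex:Revenue>
</xbrl>
"""


def read_facts(xml_text):
    """Return one dict per fact tagged with a concept from the EX_NS taxonomy."""
    root = ET.fromstring(xml_text.strip())
    prefix = "{" + EX_NS + "}"  # ElementTree writes tags as {namespace}name
    facts = []
    for elem in root:
        if elem.tag.startswith(prefix):
            facts.append(
                {
                    "concept": elem.tag[len(prefix):],
                    "context": elem.get("contextRef"),
                    "unit": elem.get("unitRef"),
                    "value": int(elem.text),
                }
            )
    return facts


for fact in read_facts(INSTANCE):
    print(fact)
```

The point of the sketch is that the tag name itself carries the meaning of the number, which is what lets software compare the same concept across many companies' filings.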

As is evident from the architecture of UIMA and the annotators' entity extraction process, these established taxonomies can play a major role in areas like concept tagging, which can help in getting meaningful insights from the large amounts of textual and other unstructured content in financial statements.
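
One simple way to picture taxonomy-driven concept tagging is a dictionary lookup: labels from the taxonomy are searched for in free text, and each hit is tagged with the corresponding concept. The sketch below assumes a small hand-written label dictionary; the labels, the concept names and the `tag_concepts` function are illustrative and are not taken from any published XBRL taxonomy or UIMA annotator.

```python
import re

# Hypothetical label-to-concept dictionary, in the spirit of a taxonomy's
# label set; the concept names are illustrative, not real XBRL elements.
TAXONOMY_LABELS = {
    "net profit": "NetProfit",
    "net income": "NetProfit",
    "operating expenses": "OperatingExpenses",
    "cash flow": "CashFlow",
}


def tag_concepts(text):
    """Return (begin, end, concept) tuples for every label found in the text."""
    tags = []
    for label, concept in TAXONOMY_LABELS.items():
        # Whole-phrase, case-insensitive match for each taxonomy label.
        pattern = re.compile(r"\b" + re.escape(label) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            tags.append((match.start(), match.end(), concept))
    return sorted(tags)


narrative = "Net profit rose as operating expenses fell; net income per share followed."
for begin, end, concept in tag_concepts(narrative):
    print(concept, "->", narrative[begin:end])
```

Mapping several labels ("net profit", "net income") to one concept is what lets narrative text be linked back to the same item that appears in the structured statements.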

Summary: As enterprises and analytics vendors adopt Big Data as part of the mainstream, this adoption will be more meaningful if the technology is used to support new business use cases. Financial analytics is one such important area, and with the support of frameworks like UIMA coupled with industry-established taxonomies, such analytics are quite possible and worth implementing.

More Stories By Srinivasan Sundara Rajan

Srinivasan is passionate about ownership and driving things on his own. With his breadth and depth in enterprise technology, he could run any aspect of the IT industry and make it a success.

He is a seasoned enterprise IT expert, mainly in the areas of solution, integration and architecture, across structured and unstructured data sources, especially in the manufacturing domain.

He currently works as Technology Head for GAVS Technologies.
