Welcome!

@BigDataExpo Authors: Yeshim Deniz, Elizabeth White, Pat Romanski, Carmen Gonzalez, Liz McMillan

Related Topics: @BigDataExpo, @CloudExpo, @ThingsExpo

@BigDataExpo: Article

Excel for #BigData Analysis | @CloudExpo #BI #Analytics #DigitalTransformation

How propelling instant results to the Excel edge democratizes advanced analytics

HTI Labs in London provides the means and governance with its Schematiq tool to bring critical data services to the interface users want.

The next BriefingsDirect Voice of the Customer digital transformation case study explores how powerful and diverse financial information is newly and uniquely delivered to the ubiquitous Excel spreadsheet edge.

We'll explore how HTI Labs in London provides the means and governance with its Schematiq tool to bring critical data services to the interface users want. By leveraging the best of instant cloud-delivered data with spreadsheets, Schematiq democratizes end-user empowerment while providing powerful new ways to harness and access complex information.

To learn how complex cloud core-to-edge processes and benefits can be better managed and exploited we're joined by Darren Harris, CEO and Co-Founder of HTI Labs, and Jonathan Glass, CTO and Co-Founder of HTI Labs, based in London. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Let's put some context around this first. What major trends in the financial sector led you to create HTI Labs, and what are the problems you're seeking to solve?

Harris

Harris: Obviously, in finance, spreadsheets are widespread and are being used for a number of varying problems. A real issue started a number of years ago, where spreadsheets got out of control. People were using them everywhere, causing lots of operational risk processes. They wanted to get their hands around it for governance, and there were loads that we needed to eradicate -- Excel-type issues.

That led to the creation of centralized teams that locked down rigid processes and effectively took away a lot of the innovation and discovery process that traders are using to spot opportunities and explore data.

Through this process, we're trying to help with governance to understand the tools to explore, and [deliver] the ability to put the data in the hands of people ... [with] the right balance.

So by taking the best of regulatory scrutiny around what a person needs, and some innovation that we put into Schematiq, we see an opportunity to take Excel to another level -- but not sacrifice the control that’s needed.

Gardner: Jonathan, are there technology trends that allowed you to be able to do this, whereas it may not have been feasible economically or technically before?

Upstream capabilities

Glass: There are lot of really great back-end technologies that are available now, along with the ability to either internally or externally scale compute resources. Essentially, the desktop remains quite similar. Excel has remained quite the same, but the upstream capabilities have really grown.

Glass

So there's a challenge. Data that people feel they should have access to is getting bigger, more complex, and less structured. So Excel, which is this great front-end to come to grips with data, is becoming a bit of bottleneck in terms of actually keeping up with the data that's out there that people want.

Gardner: So, we're going to keep Excel. We're not going to throw the baby out with the bathwater, so to speak, but we are going to do something a little bit different and interesting. What is it that we're now putting into Excel and how is that different from what was available in the past?

Harris: Schematiq extends Excel and allows it to access unstructured data. It also reduces the complexity and technical limitations that Excel has as an out-of-the-box product.

We have the notion of a data link that's effectively in a single cell that allows you to reference data that’s held externally on a back-end site. So, where people used to ingest data from another system directly into Excel, and effectively divorce it from the source, we can leave that data where it is.

It's a paradigm of take a question to the data; don’t pull the data to the question. That means we can leverage the power of the big-data platforms and how they process an analytic database on the back-end, but where you can effectively use Excel as the front screen. Ask questions from Excel, but push that query to the back-end. That's very different in terms of the model that most people are used to working with in Excel.

Gardner: This is a two-way street. It's a bit different. And you're also looking at the quality, compliance, and regulatory concerns over that data.

Harris: Absolutely. An end-user is able to break down or decompose any workflow process with data and debug it the same way they can in a spreadsheet. The transparency that we add on top of Excel’s use with Schematiq allows us to monitor what everybody is doing and the function they're using. So, you can give them agility, but still maintain the governance and the control.

In organizations, lots of teams have become disengaged. IT has tried to create some central core platform that’s quite restrictive, and it's not really serving the users. They have gotten disengaged and they've created what Gartner referred to as the Shadow BI Team, with databases under their desk, and stuff like that.

By bringing in Schematiq we add that transparency back, and we allow IT and the users to have an informed discussion -- a very analytic conversation -- around what they're using, how they are using it, where the bottlenecks are. And then, they can work out where the best value is. It's all about agility and control. You just can't give the self-service tools to an organization and not have the transparency for any oversight or governance.

To the edge

Gardner: So we have, in a sense, brought this core to the edge. We've managed it in terms of compliance and security. Now, we can start to think about how creative we can get with what's on that back-end that we deliver. Tell us a little bit about what you go after, what your users want to experiment with, and then how you enable that.

Glass: We try to be as agnostic to that as we can, because it's the creativity of the end-user that really drives value.

We have a variety of different data sources, traditional relational databases, object stores, OLAP cubes, APIs, web queries, and flat files. People want to bring that stuff together. They want some way that they can pull this stuff in from different sources and create something that's unique. This concept of putting together data that hasn't been put together before is where the sparks start to fly and where the value really comes from.

Gardner: And with Schematiq you're enabling that aggregation and cleansing ability to combine, as well as delivering it. Is that right?

The iteration curve is so much tighter and the cost of doing that is so much less. Users are able to innovate and put together the scenario of the business case for why this is a good idea.

Harris: Absolutely. It's that discovery process. It may be very early on in a long chain. This thing may progress to be something more classic, operational, and structured business intelligence (BI), but allowing end-users the ability to cleanse, explore data, and then hand over an artifact that someone in the core team can work with or use as an asset. The iteration curve is so much tighter and the cost of doing that is so much less. Users are able to innovate and put together the scenario of the business case for why this is a good idea.

The only thing I would add to the sources that Jon has just mentioned is with HPE Haven OnDemand, [you gain access to] the unstructured analytics, giving the users the ability to access and leverage all of the HPE IDOL capabilities. That capability is a really powerful and transformational thing for businesses.

They have such a set of unstructured data [services] available in voice and text, and when you allow business users access to that data, the things they come up with, their ideas, are just quite amazing.

Technologists always try to put themselves in the minds of the users, and we've all historically done a bad job of making the data more accessible for them. When you allow them the ability to analyze PDFs without structure, to share that, to analyze sentiment, to include concepts and entities, or even enrich a core proposition, you're really starting to create innovation. You've raised the awareness of all of these analytics that exist in the world today in the back-end, shown end-users what they can do, and then put their brains to work discovering and inventing.

Gardner: Many of these financial organizations are well-established, many of them for hundreds of years perhaps. All are thinking about digital transformation, the journey, and are looking to become more data-driven and to empower more people to take advantage of that. So, it seems to me you're almost an agent of digital transformation, even in a very technical and sophisticated sector like finance.

Making data accessible

Glass: There are a lot of stereotypes in terms of who the business analysts are and who the people are that come up with ideas and intervention. The true power of democratization is making data more accessible, lowering the technical barrier, and allowing people to explore and innovate. Things always come from where you least expect them.

Gardner: I imagine that Microsoft is pleased with this, because there are some people who are a bit down on Excel. They think that it's manual, that it's by rote, and that it's not the way to go. So, you, in a sense, are helping Excel get a new lease on life.

Glass: I don’t think we're the whole story in that space, but I love Excel. I've used it for years and years at work. I've seen the power of what it can do and what it can deliver, and I have a bit of an understanding of why that is. It’s the live nature of it, the fact that people can look at data in a spreadsheet, see where it’s come from, see where it’s going, they can trust it, and they can believe in it.

That’s why what we're trying to do is create these live connections to these upstream data sources. There are manual steps, download, copy/paste, move around the sheet, which is where errors creep in. It’s where the bloat, the slowness, and the unreliability can happen, but by changing that into a live connection to the data source, it becomes instant and it goes back to being trusted, reliable, and actionable.

Harris: There's something in the DNA, as well, of how people interact with data and so we can lay out effectively the algorithm or the process of understanding a calculation or a data flow. That’s why you see a lot of other systems that are more web-based or web-centric and replicate an Excel-type experience.

The user starts to use it and starts to think, "Wow, it’s just like Excel," and it isn’t. They hit a barrier, they hit a wall, and then they hit the "export" button. Then, they put it back [into Excel] and create their own way to work with it. So, there's something in the DNA of Excel and the way people lay things out. I think of [Excel] almost like a programing environment for non-programers. Some people describe it as a functional language very much like Haskell, and the Excel functions they write were effectively then working and navigating through the data.

Gardner: No need to worry that if you build it, will they come; they're already there.

Harris: Absolutely.

Gardner: Tell us a bit about HTI Labs and how your company came about, and where you are on your evolution.

Cutting edge

Harris: HTI labs was founded in 2012. The core backbone of the team actually worked for the same tier 1 investment bank, and we were building risk and trading systems for front-office teams. We were really, I suppose, the cutting edge of all the big data technologies that were being used at the time -- real-time, disputed graphs and cubes, and everything.

As a core team, it was about taking that expertise and bringing it to other industries. Using Monte Carlo farms in risk calculations, the ability to export data at speed and real-time risk. These things were becoming more centric to other organizations, which was an opportunity.

At the moment, we're focusing predominately on energy trading. Our software is being used across a number of other sectors and our largest client has installed Schematiq on 120 desktops, which is great. That’s a great validation of what we're doing. We're also a member of the London Stock Exchange Elite Program, based in London for high-growth companies.

Glass: Darren and I met when we were working for the same company. I started out as a quant doing the modeling, the map behind pricing, but I found that my interest lay more in the engineering. Rather than doing it once, can I do it a million times, can I do these things reliably and scale them?

The algorithms are built, but the key to making them so much more improved is the feedback loop between your domain users, your business users, and how they can enrich and train effectively these algorithms.

Because I started in a front-office environment, it was very spreadsheet-dominated, it was very VBA-dominated. There's good and bad in that. A lot of those lessened, and Darren and I met up. We crossed the divide together from the top-down, big IT systems and the bottom-up end-user best-developed spreadsheets, and so on. We found a middle ground together, which we feel is a quite powerful combination.

Gardner: Back to where this leads. We're seeing more-and-more companies using data services like Haven OnDemand and starting to employ machine learning, artificial intelligence (AI), and bots to augment what the humans do so well. Is there an opportunity for that to play here, or maybe it already is? The question basically is, how does AI come to bear on what you can deliver out to the Excel edge?

Harris: I think what you see is that out of the box, you have a base unit of capability. The algorithms are built, but the key to making them so much more improved is the feedback loop between your domain users, your business users, and how they can enrich and train effectively these algorithms.

So, we see a future where the self-service BI tools that they use to interact with data and explore would almost become the same mechanism where people will see the results from the algorithms and give feedback to send back to the underlying algorithm.

Gardner: And Jonathan, where do you see the use of bots, particularly perhaps with an API model like Haven OnDemand?

The role of bots

Glass: The concept for bots is replicating an insight or a process that somebody might already be doing manually. When people create these data flows and analyses that they maybe run once so it’s quite time-consuming to run. The real exciting possibility is that you make these things run 24×7. So, you start receiving notifications, rather than having to pull from the data source. You start receiving notifications from your own mailbox that you have created. You look at those and you decide whether that's a good insight or a bad insight, and you can then start to train it and refine it.

The training and refining is that loop that potentially goes back to IT, gets back through a development loop, and it’s about closing that loop and tightening that loop. That's the thing that really adds value to those opportunities.

Gardner: Perhaps we should unpack Schematiq a bit to understand how one might go back and do that within the context of your tool. Are there several components of the tool, one of which might lend itself to going back and automating?

Glass: Absolutely. You can imagine the spreadsheet has some inputs and some outputs. One of the components within the Schematiq architecture is the ability to take a spreadsheet, to take the logic and the process that’s embedded in our spreadsheet, and turn it into an executable module of code, which you can host on your server, you can schedule, you can run as often as you like, and you can trigger based on events.

It’s very much all about empowering the end-user to connect, create, govern, share instantly and then allow consumption from anybody on any device.

It’s a way of emitting code from a spreadsheet. You take some of the insight, you take without a business analysis loop and a development loop, and you take the exact thing that the user, the analyst, has programmed. You make it into something that you can run, commoditize, and scale. That’s quite an important way in which we reduce that development loop. We create that cycle that’s tight and rapid.

Gardner: Darren, would you like to explain the other components that make-up Schematiq?

Harris: There are four components of Schematiq architecture. There's the workbench that extends Excel and allows the ability to have large structured data analytics. We have the asset manager, which is really all about governance. So, you can think of it like source control for Excel, but with a lot more around metadata control, transparency, and analytics on what people are using and how they are using it.

There's a server component that allows you just to off-load and scale analytics horizontally, if they do that, and build repeatable or overnight processes. The last part is the portal. This is really about allowing end-users to instantly share their insights with other people. Picking up from Jon’s point about the compound executable, but it’s defined in Schematiq. That can be off-loaded to a server and exposed as another API to a computer, the mobile, or even a function.

So, it’s very much all about empowering the end-user to connect, create, govern, share instantly and then allow consumption from anybody on any device.

Market for data services

Gardner: I imagine, given the sensitive nature of the financial markets and activities, that you have some boundaries that you can’t cross when it comes to examining what’s going on in between the core and the edge.

Tell me about how you, as an organization, can look at what’s going on with the Schematiq and the democratization, and whether that creates another market for data services when you see what the demand entails.

Harris: It’s definitely the case that people have internal datasets they create and that they look after. People are very precious about them because they are hugely valuable, and one of the things that we strive to help people do is to share those things.

Across the trading floor, you might effectively have a dozen or more different IT infrastructures, if you think of what’s existing on the desk as being a miniature infrastructure that’s been created. So, it's about making easy for people to share these things, to create master datasets that they gain value from, and to see that they gain mutual value from that, rather than feeling closed in, and don’t want to share this with their neighbors.

If we work together and if we have the tools that enable us to collaborate effectively, then we can all get more done and we can all add more value.

If we work together and if we have the tools that enable us to collaborate effectively, then we can all get more done and we can all add more value.

Gardner: It's interesting to me that the more we look at the use of data, the more it opens up new markets and innovation capabilities that we hadn’t even considered before. And, as an analyst, I expect to see more of a marketplace of data services. You strike me as an accelerant to that.

Harris: Absolutely. As the analytics are coming online and exposed by API’s, the underlying store that’s used is becoming a bit irrelevant. If you look at what the analytics can do for you, that’s how you consume the insight and you can connect to other sources. You can connect from Twitter, you connect from Facebook, you can connect PDFs, whether it’s NoSQL, structured, columnar, rows, it doesn’t really matter. You don’t see that complexity. The fact that you can just create an API key, access it as consumer, and can start to work with it is really powerful.

There was the recent example in the UK of a report on the Iraq War. It’s 2.2 million words, it took seven years to write, and it’s available online, but there's no way any normal person could consume or analyze that. That’s three times the complete works of Shakespeare.

Using these APIs, you can start to pull out mentions, you can pull out countries, locations and really start to get into the data and provide anybody with Excel at home, in our case, or any other tool, the ability to analyze and get in there and share those insights. We're very used to media where we get just the headline, and that spin comes into play. People turn things on their, head and you really never get to delve into the underlying detail.

What’s really interesting is when democratization and sharing of insights and collaboration comes, we can all be informed. We can all really dig deep, and all these people that work there, the great analysts, could start to collaborate and delve and find things and find new discoveries and share that insight.

Gardner: All right, a little light bulb just went off in my head whereas we would go to a headline and a new story and we might have a hyperlink to a source. I could get a headline and a news story, open up my Excel spreadsheet, get to the actual data source behind the entire story and then probe and plumb and analyze that any which way I wanted to.

Harris: Yes, Exactly. I think the most savvy consumer now, the analyst, is starting to demand that transparency. We've seen in the UK, words, election messages and quotes and even financial stats where people just don’t believe the headlines. They're demanding transparency in that process, and so governance can only be really a good thing.

You may also be interested in:

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifiying readers/listeners in discreet market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderatiing, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.

@BigDataExpo Stories
Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech.
@GonzalezCarmen has been ranked the Number One Influencer and @ThingsExpo has been named the Number One Brand in the “M2M 2016: Top 100 Influencers and Brands” by Analytic. Onalytica analyzed tweets over the last 6 months mentioning the keywords M2M OR “Machine to Machine.” They then identified the top 100 most influential brands and individuals leading the discussion on Twitter.
Cognitive Computing is becoming the foundation for a new generation of solutions that have the potential to transform business. Unlike traditional approaches to building solutions, a cognitive computing approach allows the data to help determine the way applications are designed. This contrasts with conventional software development that begins with defining logic based on the current way a business operates. In her session at 18th Cloud Expo, Judith S. Hurwitz, President and CEO of Hurwitz & ...
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 20th Cloud Expo, which will take place on June 6-8, 2017 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 add...
With billions of sensors deployed worldwide, the amount of machine-generated data will soon exceed what our networks can handle. But consumers and businesses will expect seamless experiences and real-time responsiveness. What does this mean for IoT devices and the infrastructure that supports them? More of the data will need to be handled at - or closer to - the devices themselves.
Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like “How is my application doing” but no id...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
Multiple data types are pouring into IoT deployments. Data is coming in small packages as well as enormous files and data streams of many sizes. Widespread use of mobile devices adds to the total. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will look at the tools and environments that are being put to use in IoT deployments, as well as the team skills a modern enterprise IT shop needs to keep things running, get a handle on all this data, and deli...
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value S...
Automation is enabling enterprises to design, deploy, and manage more complex, hybrid cloud environments. Yet the people who manage these environments must be trained in and understanding these environments better than ever before. A new era of analytics and cognitive computing is adding intelligence, but also more complexity, to these cloud environments. How smart is your cloud? How smart should it be? In this power panel at 20th Cloud Expo, moderated by Conference Chair Roger Strukhoff, pane...
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs oft...
The age of Digital Disruption is evolving into the next era – Digital Cohesion, an age in which applications securely self-assemble and deliver predictive services that continuously adapt to user behavior. Information from devices, sensors and applications around us will drive services seamlessly across mobile and fixed devices/infrastructure. This evolution is happening now in software defined services and secure networking. Four key drivers – Performance, Economics, Interoperability and Trust ...
SYS-CON Events announced today that Grape Up will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company specializing in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the U.S. and Europe, Grape Up works with a variety of customers from emergi...
@ThingsExpo has been named the Most Influential ‘Smart Cities - IIoT' Account and @BigDataExpo has been named fourteenth by Right Relevance (RR), which provides curated information and intelligence on approximately 50,000 topics. In addition, Right Relevance provides an Insights offering that combines the above Topics and Influencers information with real time conversations to provide actionable intelligence with visualizations to enable decision making. The Insights service is applicable to eve...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
Cybersecurity is a critical component of software development in many industries including medical devices. However, code is not always written to be robust or secure from the unknown or the unexpected. This gap can make medical devices susceptible to cybersecurity attacks ranging from compromised personal health information to life-sustaining treatment. In his session at @ThingsExpo, Clark Fortney, Software Engineer at Battelle, will discuss how programming oversight using key methods can incre...
Quickly find the root cause of complex database problems slowing down your applications. Up to 88% of all application performance issues are related to the database. DPA’s unique response time analysis shows you exactly what needs fixing - in four clicks or less. Optimize performance anywhere. Database Performance Analyzer monitors on-premises, on VMware®, and in the Cloud, including Amazon® AWS and Azure™ virtual machines.
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software in the hope of capturing value in IoT. Although IoT is relatively new in the market, it has already gone through many promotional terms such as IoE, IoX, SDX, Edge/Fog, Mist Compute, etc. Ultimately, irrespective of the name, it is about deriving value from independent software assets participating in an ecosystem as one comprehensive solution.
SYS-CON Events announced today that CA Technologies has been named "Platinum Sponsor" of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and 21st International Cloud Expo, which will take place in November in Silicon Valley, California.