Welcome!

@BigDataExpo Authors: Liz McMillan, Pat Romanski, Shelly Palmer, Elizabeth White, William Schmarzo

Related Topics: @BigDataExpo, Microservices Expo, Containers Expo Blog, Machine Learning , @CloudExpo, SDN Journal

@BigDataExpo: Article

From Metrics to Success

Ford scours for more big data to bolster quality, improve manufacturing, streamline processes

Ford has exploited the strengths of big data analytics by directing them internally to improve business results. In doing so, they scour the metrics from the company’s best processes across myriad manufacturing efforts and through detailed outputs from in-use automobiles -- all to improve and help transform their business.

So explains Michael Cavaretta, PhD, Technical Leader of Predictive Analytics for Ford Research and Advanced Engineering in Dearborn, Michigan. Cavaretta is one of a group of experts assembled this week for The Open Group Conference in Newport Beach, California.

Cavaretta has led multiple data-analytic projects at Ford to break down silos inside the company to best define Ford’s most fruitful data sets. Ford has successfully aggregated customer feedback, and extracted all the internal data to predict how best new features in technologies will improve their cars.

As a contributor to the The Open Group conference and its focus on "Big Data -- The Transformation We Need to Embrace Today," Cavaretta explains how big data is fostering business transformation by allowing deeper insights into more types of data efficiently, and thereby improving processes, quality control, and customer satisfaction.

The interview was moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: What's different now in being able to get at this data and do this type of analysis from five years ago?

Cavaretta: The biggest difference has to do with the cheap availability of storage and processing power, where a few years ago people were very much concentrated on filtering down the datasets that were being stored for long-term analysis. There has been a big sea change with the idea that we should just store as much as we can and take advantage of that storage to improve business processes.

Gardner: How did we get here? What's the process behind the benefits?

Sea change in attitude

Cavaretta: The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises. There's this idea that you don't need to spend so much time figuring out what data you want to store and worry about the cost associated with it, and more about data as an asset. There is value in being able to store it, and being able to go back and extract different insights from it. This really comes from this really cheap storage, access to parallel processing machines, and great software.

Cavaretta

I like to talk to people about the possibility that big data provides and I always tell them that I have yet to have a circumstance where somebody is giving me too much data. You can pull in all this information and then answer a variety of questions, because you don't have to worry that something has been thrown out. You have everything.

You may have 100 questions, and each one of the questions uses a very small portion of the data. Those questions may use different portions of the data, a very small piece, but they're all different. If you go in thinking, "We’re going to answer the top 20 questions and we’re just going to hold data for that," that leaves so much on the table, and you don't get any value out of it.

The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises.

We're a big believer in mash-ups and we really believe that there is a lot of value in being able to take even datasets that are not specifically big-data sizes yet, and then not go deep, not get more detailed information, but expand the breadth. So it's being able to augment it with other internal datasets, bridging across different business areas as well as augmenting it with external datasets.

A lot of times you can take something that is maybe a few hundred thousand records or a few million records, and then by the time you’re joining it, and appending different pieces of information onto it, you can get the big dataset sizes.

Gardner: You’re really looking primarily at internal data, while also availing yourself of what external data might be appropriate. Maybe you could describe a little bit about your organization, what you do, and why this internal focus is so important for you.

Internal consultants
Cavaretta: I'm part of a larger department that is housed over in the research and advanced-engineering area at Ford Motor Company, and we’re about 30 people. We work as internal consultants, kind of like Capgemini or Ernst & Young, but only within Ford Motor Company. We’re responsible for going out and looking for different opportunities from the business perspective to bring advanced technologies. So, we’ve been focused on the area of statistical modeling and machine learning for I’d say about 15 years or so.

And in this time, we’ve had a number of engagements where we’ve talked with different business customers, and people have said, "We'd really like to do this." Then, we'd look at the datasets that they have, and say, "Wouldn’t it be great if we would have had this. So now we have to wait six months or a year."

These new technologies are really changing the game from that perspective. We can turn on the complete fire-hose, and then say that we don't have to worry about that anymore. Everything is coming in. We can record it all. We don't have to worry about if the data doesn’t support this analysis, because it's all there. That's really a big benefit of big-data technologies.

The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts who are really interested in looking at things from a data-driven perspective. From when I first came in to now, the biggest change has been when Alan Mulally came into the company, and really pushed the idea of data-driven decisions.

The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts.

Before, we were getting a lot of interest from people who are really very focused on the data that they had internally. After that, they had a lot of questions from their management and from upper level directors and vice-president saying, "We’ve got all these data assets. We should be getting more out of them." This strategic perspective has really changed a lot of what we’ve done in the last few years.

Gardner: Are we getting to the point where this sort of Holy Grail notion of a total feedback loop across the lifecycle of a major product like an automobile is really within our grasp? Are we getting there, or is this still kind of theoretical. Can we pull it altogether and make it a science?

Cavaretta: The theory is there. The question has more to do with the actual implementation and the practicality of it. We still are talking a lot of data where even with new advanced technologies and techniques that’s a lot of data to store, it’s a lot of data to analyze, there’s a lot of data to make sure that we can mash-up appropriately.

And, while I think the potential is there and I think the theory is there. There is also a work in being able to get the data from multiple sources. So everything which you can get back from the vehicle, fantastic. Now if you marry that up with internal data, is it survey data, is it manufacturing data, is it quality data? What are the things do you want to go after first? We can’t do everything all at the same time.

Highest value

Our perspective has been let’s make sure that we identify the highest value, the greatest ROI areas, and then begin to take some of the major datasets that we have and then push them and get more detail. Mash them up appropriately and really prove up the value for the technologists.

Gardner: Clearly, there's a lot more to come in terms of where we can take this, but I suppose it's useful to have a historic perspective and context as well. I was thinking about some of the early quality gurus like Deming and some of the movement towards quality like Six Sigma. Does this fall within that same lineage? Are we talking about a continuum here over that last 50 or 60 years, or is this something different?

Cavaretta: That’s a really interesting question. From the perspective of analyzing data, using data appropriately, I think there is a really good long history, and Ford has been a big follower of Deming and Six Sigma for a number of years now.

The difference though, is this idea that you don't have to worry so much upfront about getting the data. If you're doing this right, you have the data right there, and this has some great advantages. You’ll have to wait until you get enough history to look for somebody’s patterns. Then again, it also has some disadvantage, which is you’ve got so much data that it’s easy to find things that could be spurious correlations or models that don’t make any sense.

The piece that is required is good domain knowledge, in particular when you are talking about making changes in the manufacturing plant. It's very appropriate to look at things and be able to talk with people who have 20 years of experience to say, "This is what we found in the data. Does this match what your intuition is?" Then, take that extra step.

Gardner: How has the notion of the Internet of things being brought to bear on your gathering of big data and applying it to the analytics in your organization?

Cavaretta: It is a huge area, and not only from the internal process perspective -- RFID tags within the manufacturing plans, as well as out on the plant floor, and then all of the information that’s being generated by the vehicle itself.

The Ford Energi generates about 25 gigabytes of data per hour. So you can imagine selling couple of million vehicles in the near future with that amount of data being generated. There are huge opportunities within that, and there are also some interesting opportunities having to do with opening up some of these systems for third-party developers. OpenXC is an initiative that we have going on to add at Research and Advanced Engineering.

Huge number of sensors

We have a lot of data coming from the vehicle. There’s huge number of sensors and processors that are being added to the vehicles. There's data being generated there, as well as communication between the vehicle and your cell phone and communication between vehicles.

There's a group over at Ann Arbor Michigan, the University of Michigan Transportation Research Institute (UMTRI), that’s investigating that, as well as communication between the vehicle and let’s say a home system. It lets the home know that you're on your way and it’s time to increase the temperature, if it’s winter outside, or cool it at the summer time.

The amount of data that’s been generated there is invaluable information and could be used for a lot of benefits, both from the corporate perspective, as well as just the very nature of the environment.

Gardner: Just to put a stake in the ground on this, how much data do cars typically generate? Do you have a sense of what now is the case, an average?

Cavaretta: The Energi, according to the latest information that I have, generates about 25 gigabytes per hour. Different vehicles are going to generate different amounts, depending on the number of sensors and processors on the vehicle. But the biggest key has to do with not necessarily where we are right now but where we will be in the near future.

With the amount of information that's being generated from the vehicles, a lot of it is just internal stuff. The question is how much information should be sent back for analysis and to find different patterns? That becomes really interesting as you look at external sensors, temperature, humidity. You can know when the windshield wipers go on, and then to be able to take that information, and mash that up with other external data sources too. It's a very interesting domain.

With the amount of information that's being generated from the vehicles, a lot of it is just internal stuff.

Gardner: What skills do you target for your group, and what ways do you think that you can improve on that?

Cavaretta: The skills that we have in our department, in particular on our team, are in the area of computer science, statistics, and some good old-fashioned engineering domain knowledge. We’ve really gone about this from a training perspective. Aside from a few key hires, it's really been an internally developed group.

Targeted training

The biggest advantage that we have is that we can go out and be very targeted with the amount of training that we have. There are such big tools out there, especially in the open-source realm, that we can spin things up with relatively low cost and low risk, and do a number of experiments in the area. That's really the way that we push the technologies forward.

Talking with The Open Group really gives me an opportunity to be able to bring people on board with the idea that you should be looking at a difference in mindset. It's not "Here’s a way that data is being generated, look, try and conceive of some questions that we can use, and we’ll store that too." Let's just take everything, we’ll worry about it later, and then we’ll find the value.

It's important to be thinking about data as an asset, rather than as a cost. You even have to spend some money, and it may be a little bit unsafe without really solid ROI at the beginning. Then, move towards pulling that information in, and being able to store it in a way that allows not just the high-level data scientist to get access to and provide value, but people who are interested in the data overall. Those are very important pieces.

The last one is how do you take a big-data project, how do you take something where you’re not storing in the traditional business intelligence (BI) framework that an enterprise can develop, and then connect that to the BI systems and look at providing value to those mash-ups. Those are really important areas that still need some work.

There are many companies, especially large enterprises, that are looking at their data assets and wondering what can they do to monetize this, not only to just pay for the efficiency improvement but as a new revenue stream.

Gardner: For those organizations that want to get started on this, how do you get started?

Understand that it maybe going to be a little bit more costly and the ROI isn't going to be there at the beginning.

Cavaretta: We're definitely a huge believer in pilot projects and proof of concept, and we like to develop roadmaps by doing. So get out there. Understand that it's going to be messy. Understand that it maybe going to be a little bit more costly and the ROI isn't going to be there at the beginning.

But get your feet wet. Start doing some experiments, and then, as those experiments turn from just experimentation into really providing real business value, that’s the time to start looking at a more formal aspect and more formal IT processes. But you've just got to get going at this point.

You may also be interested in:

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifiying readers/listeners in discreet market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderatiing, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.

@BigDataExpo Stories
SYS-CON Events announced today that Ryobi Systems will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ryobi Systems Co., Ltd., as an information service company, specialized in business support for local governments and medical industry. We are challenging to achive the precision farming with AI. For more information, visit http:...
SYS-CON Events announced today that mruby Forum will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. mruby is the lightweight implementation of the Ruby language. We introduce mruby and the mruby IoT framework that enhances development productivity. For more information, visit http://forum.mruby.org/.
SYS-CON Events announced today that Mobile Create USA will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Mobile Create USA Inc. is an MVNO-based business model that uses portable communication devices and cellular-based infrastructure in the development, sales, operation and mobile communications systems incorporating GPS capabi...
SYS-CON Events announced today that Daiya Industry will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Daiya Industry specializes in orthotic support systems and assistive devices with pneumatic artificial muscles in order to contribute to an extended healthy life expectancy. For more information, please visit https://www.daiyak...
SYS-CON Events announced today that Fusic will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Fusic Co. provides mocks as virtual IoT devices. You can customize mocks, and get any amount of data at any time in your test. For more information, visit https://fusic.co.jp/english/.
SYS-CON Events announced today that Keisoku Research Consultant Co. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Keisoku Research Consultant, Co. offers research and consulting in a wide range of civil engineering-related fields from information construction to preservation of cultural properties. For more information, vi...
SYS-CON Events announced today that MIRAI Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MIRAI Inc. are IT consultants from the public sector whose mission is to solve social issues by technology and innovation and to create a meaningful future for people.
SYS-CON Events announced today that Interface Corporation will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Interface Corporation is a company developing, manufacturing and marketing high quality and wide variety of industrial computers and interface modules such as PCIs and PCI express. For more information, visit http://www.i...
Today companies are looking to achieve cloud-first digital agility to reduce time-to-market, optimize utilization of resources, and rapidly deliver disruptive business solutions. However, leveraging the benefits of cloud deployments can be complicated for companies with extensive legacy computing environments. In his session at 21st Cloud Expo, Craig Sproule, founder and CEO of Metavine, will outline the challenges enterprises face in migrating legacy solutions to the cloud. He will also prese...
SYS-CON Events announced today that Enroute Lab will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Enroute Lab is an industrial design, research and development company of unmanned robotic vehicle system. For more information, please visit http://elab.co.jp/.
SYS-CON Events announced today that Nihon Micron will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Nihon Micron Co., Ltd. strives for technological innovation to establish high-density, high-precision processing technology for providing printed circuit board and metal mount RFID tags used for communication devices. For more inf...
DevOps at Cloud Expo – being held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real r...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of...
SYS-CON Events announced today that TMC has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo and Big Data at Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Global buyers rely on TMC’s content-driven marketplaces to make purchase decisions and navigate markets. Learn how we can help you reach your marketing goals.
Cloud-based disaster recovery is critical to any production environment and is a high priority for many enterprise organizations today. Nearly 40% of organizations have had to execute their BCDR plan due to a service disruption in the past two years. Zerto on IBM Cloud offer VMware and Microsoft customers simple, automated recovery of on-premise VMware and Microsoft workloads to IBM Cloud data centers.
Real IoT production deployments running at scale are collecting sensor data from hundreds / thousands / millions of devices. The goal is to take business-critical actions on the real-time data and find insights from stored datasets. In his session at @ThingsExpo, John Walicki, Watson IoT Developer Advocate at IBM Cloud, will provide a fast-paced developer journey that follows the IoT sensor data from generation, to edge gateway, to edge analytics, to encryption, to the IBM Bluemix cloud, to Wa...
The last two years has seen discussions about cloud computing evolve from the public / private / hybrid split to the reality that most enterprises will be creating a complex, multi-cloud strategy. Companies are wary of committing all of their resources to a single cloud, and instead are choosing to spread the risk – and the benefits – of cloud computing across multiple providers and internal infrastructures, as they follow their business needs. Will this approach be successful? How large is the ...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
The “Digital Era” is forcing us to engage with new methods to build, operate and maintain applications. This transformation also implies an evolution to more and more intelligent applications to better engage with the customers, while creating significant market differentiators. In both cases, the cloud has become a key enabler to embrace this digital revolution. So, moving to the cloud is no longer the question; the new questions are HOW and WHEN. To make this equation even more complex, most ...