Deep Learning at Stanford

by Joseph Rickert

Last week, I had the opportunity to participate in the Second Academy of Science and Engineering (ASE) Conference on Big Data Science and Computing at Stanford University. Since the conference was held simultaneously with two other conferences, one on Social Computing and the other on Cyber Security, it was definitely not an R crowd, and not even a typical Big Data crowd. Talks from the three programs were intermixed throughout the day, so at any given moment you could find yourself looking for common ground in a conversation with mostly R-aware but language-impartial fellow attendees. I don’t know whether this method of organization was the desperate result of necessity or genius, but I thought it worked out very well and made for a stimulating interaction dynamic. The ASE conference must have been a difficult program to set up. The organizers, however, did a wonderful job mashing talks and themes together to make for an excellent experience.

There were several very good talks at the conference; however, the tutorial on Deep Learning and Natural Language Processing given by Richard Socher was truly outstanding. Richard is a PhD student in Stanford’s Computer Science Department studying under Chris Manning and Andrew Ng. Very rarely do you come across such a polished speaker with complete and casual command of complex material. And, while the delivery was impressive, the content was jaw dropping. Richard walked through the Deep Learning methodology and tools being developed in Stanford’s AI lab and showed a number of areas where Deep Learning techniques are yielding notable results; for example, a system for single-sentence sentiment detection that improved positive/negative sentence classification by 5.4%. Have a look at Andrew Ng’s or Christopher Manning’s lists of publications to get a good idea of the outstanding work that is being done in this area.
A key concept covered in the tutorial is the ability to represent natural language structures, parse trees for example, in a finite-dimensional vector space, and to build the theoretical and software tools in such a way that the same method can be used to deconstruct and represent other hierarchies. One of Richard's slides indicated how structures built for Natural Language Processing (NLP) can also be used to represent images. This ability to bring a powerful, integrated set of tools to many different areas seems to be a key reason why neural nets and Deep Learning are suddenly getting so much attention.

In a tutorial similar to the one Richard gave on Saturday, Richard and Chris Manning attribute the recent resurgence of Deep Learning to three factors: new methods for unsupervised pre-training (Restricted Boltzmann Machines (RBMs), autoencoders and contrastive estimation); more efficient parameter estimation methods; and a better understanding of model regularization.

The software used in the NLP and Deep Learning work at Stanford seems to be mostly based on Python and C. (See Theano and SENNA, for example.) So far, it does not appear that much Deep Learning work is being done with R. However, things are looking up. 0xdata’s H2O Deep Learning implementation is showing impressive results, and this algorithm is available in the h2o R package. Also, the R package darch and the very recent deepnet package, both of which offer implementations of Restricted Boltzmann Machines, indicate that Deep Learning researchers are working in R.

Finally, to get a quick overview of the area, have a look at the book Deep Learning: Methods and Applications by Li Deng and Dong Yu of Microsoft Research, which is available online.
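To make the idea of mapping a parse tree into a finite-dimensional vector space a little more concrete, here is a minimal Python sketch of recursive composition. It is not the Stanford group's code or model: the toy vocabulary, the dimension `DIM`, the weight matrix `W`, the bias `b` and the function `encode` are all assumptions made up for illustration, and the weights are random rather than trained. The only point it shows is that one shared composition function, applied bottom-up, turns a tree of any shape into a vector of the same fixed size.

```python
import math
import random

DIM = 4  # size of every word and phrase vector (arbitrary, for illustration)
rng = random.Random(0)

def random_vector(n, scale=1.0):
    """Return n Gaussian random numbers as a plain list."""
    return [rng.gauss(0.0, scale) for _ in range(n)]

# Hypothetical toy vocabulary with random, untrained word vectors.
VOCAB = {w: random_vector(DIM) for w in ["the", "movie", "was", "great"]}

# One shared composition matrix (DIM x 2*DIM) and bias, reused at every node.
W = [random_vector(2 * DIM, scale=0.1) for _ in range(DIM)]
b = [0.0] * DIM

def encode(tree):
    """Map a binary parse tree (nested 2-tuples of words) to a DIM-vector."""
    if isinstance(tree, str):          # leaf: look up the word vector
        return VOCAB[tree]
    left, right = tree                 # internal node: encode both children
    children = encode(left) + encode(right)   # concatenation, length 2*DIM
    # parent = tanh(W . [left; right] + b)
    return [math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
            for row, bias in zip(W, b)]

# ((the movie) (was great)) as a nested tuple
sentence = (("the", "movie"), ("was", "great"))
vec = encode(sentence)
print(len(vec))
```

Because `encode` only needs a binary tree whose leaves have vectors, the same function would apply unchanged to any other hierarchy with that shape, which is the reuse the tutorial emphasized. In a real system the word vectors and `W` would be learned from data rather than drawn at random.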


More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.
