The Apache Software Foundation Announces Apache™ Tajo™ as a Top-Level Project

Advanced Open Source data warehousing system in Apache Hadoop in use by Gruter, Korea University, and SK Telecom, among others, for processing Web-scale data sets

FOREST HILL, Md., April 1, 2014 /PRNewswire-USNewswire/ -- The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 170 Open Source projects and initiatives, announced today that Apache Tajo has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

"It's a pleasure to graduate from the Apache Incubator," said Hyunsik Choi, Vice President of Apache Tajo. "This milestone further reinforces our hard work in bringing a much-needed big data solution under the Apache banner."

Dubbed an "SQL-on-Hadoop" solution, Apache Tajo is a robust relational and distributed big data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract, transform, load) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities.

The Tajo project began in 2010 at Korea University's Database Lab, and entered the Apache Incubator in March 2013. Apache Tajo is in use at Gruter, Korea University, and SK Telecom, among others, for its ability to analyze massive data sets in real time.

"Apache Tajo has earned its place as a top-level project in the ASF. It's an excellent example of a community building around a core piece of technology. Not to mention, the technology itself is quite cool. Tajo has a large role to play in the Apache Hadoop ecosystem," said Jakob Homan, Staff Software Engineer at LinkedIn, and ASF Member.

"Tajo project is a really good example that how company and Open Source community can benefit from each other. Its real open community has assisted me to solve lots of practical problems, and I have opportunities to make Tajo more robust and have richer functionalities," said Keuntae Park, IT manager of SK Telecom and contributor to Apache Tajo. "I feel much affection for Tajo project and it's my great pleasure to participate in its growth, graduation, and becoming of top-level project."

"Tajo is one of the most promising projects for SQL-on-Hadoop. Many contributors have been improving Tajo by developing various interesting features. It's an honor for me to work with such a wonderful community," said Jihoon Son, Ph.D. candidate at Korea University and contributor to Apache Tajo.

"Apache Tajo has been a model community through the Incubator. They have demonstrated meritocracy on lists in the face of some pretty awesome and complex software for Big Data Analytics," said Chris Mattmann, Apache Tajo Incubator Mentor at the ASF, and Chief Architect, Instrument and Science Data Systems Section at NASA JPL. "We are currently evaluating the use of Tajo in projects for Radio Astronomy at JPL, as well as in the context of our Airborne Snow Observatory (ASO) project for Big Data query processing and storage. I'm really excited to see where Tajo is headed along with the other Big Data stacks at the ASF including Spark and Mesos."

"The key to a successful Open Source community lies in its diversity and active participation," added Choi. "As Apache Tajo continues to grow, we welcome contributions with code, documentation, testing, submitting patches, and other valuable forms of feedback."

Availability and Oversight
As with all Apache products, Apache Tajo software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Tajo, visit http://tajo.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 170 leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 400 individual Members and 3,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide. Thousands of software solutions are distributed under the Apache License, and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.

"Apache", "Apache HTTP Server", "Apache Mesos", "Mesos", "Apache Spark", "Spark", "Apache Tajo", "Tajo", and "ApacheCon" are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

SOURCE Apache Software Foundation
