From Data Silos to Data Lakes: Realizing the Accessible Dream

January 16, 2014 Paul M. Davis

The era of data silos is nearing its end. The ongoing cycle of data science, and the rapid development of applications built upon its models and insights, will not wait for an IT infrastructure that stores critical information in numerous disconnected locations. The speed and scalability of Hadoop have given rise to the concept of the data lake, which is key to Pivotal’s vision of a unified PaaS. In an article at Forbes, Edd Dumbill characterizes the data lake as “a dream” given the current enterprise climate, but one that remains “an accessible dream.”

In his article, Dumbill offers a succinct and useful definition of the data lake concept:

“The data lake dream is of a place with data-centered architecture, where silos are minimized, and processing happens with little friction in a scalable, distributed environment. Applications are no longer islands, and exist within the data cloud, taking advantage of high bandwidth access to data and scalable computing resource. Data itself is no longer restrained by initial schema decisions, and can be exploited more freely by the enterprise.”

The move from data silos to data lakes will accelerate data-driven insights, app development, iteration, and time to value. But this transition doesn’t happen overnight. Dumbill describes it as a four-part process for an enterprise.

When Hadoop first enters the picture, it primarily serves as an input, with disparate applications and sources contributing data for analysis. Over time, as more data sources are integrated into a growing Hadoop system, this changes into an ongoing cycle of input and output: data drives insight, insight produces data-aware apps, and those apps in turn contribute back to the growing wealth of information.

The data lake’s opportunities and impacts are well-documented on this blog. It is set to transform corporate IT and security operations, require closer collaboration between data scientists and app developers, spur competition and innovation, and drive new value opportunities.

As Dumbill states in his article, many enterprises remain in the early stages of this transition, but that is quickly changing. Because consumer giants such as Google and Facebook already boast these capabilities, enterprises have an imperative to catch up.

“As business is increasingly digital, access to data will become a critical priority,” Dumbill writes, “as will speed of development and deployment. The data lake is a dream that can match those demands.” Providing the knowledge and infrastructure necessary to meet this challenge and enable the “consumer-grade enterprise” is fundamental to the Pivotal One vision.

Learn more about Pivotal and the Data Lake

Capgemini Is Co-Innovating with Pivotal to Provide the Business Data Lake. Find out how.
Pivotal One is the World’s First Comprehensive Multi-Cloud Enterprise PaaS.
Want to start fishing in your own data lake? Contact the new Pivotal and Capgemini CoE.
