This Month In Data Science: November 2015

November 30, 2015 Paul M. Davis

With the Presidential race heating up, the increasing importance of data science within the candidates’ campaigns received attention in November. In other news, Stanford’s School of Engineering held the first Women in Data Science conference, the National Science Foundation invested in cross-disciplinary data science partnerships, and the discipline’s ability to track and explain global consumer trends came into focus. Here’s our roundup of the biggest data science news of the month, both from Pivotal and beyond.

How Campaign Data Scientists Figure Out The Formula To Sway Your Vote For President

In a thread on Quora, Democratic Campaign Data Manager Luke Riley demystified how the 2016 Presidential hopefuls are utilizing data science. Though he cited micro-targeting as an important tool that data science adds to the process, he also emphasized the importance of his field experience gathering data door-to-door for four previous campaigns before joining the Obama 2012 team. During that time he learned “the depth of information available and the limitations behind data collection in that environment.”

Who’s Who In The Booming World Of Data Science

While “data scientist” is a blanket term used to refer to a variety of practitioners, a recent infographic from DataCamp breaks down who performs which key roles in the field, from data architects to business analysts. Moreover, the infographic lists the primary skills required for each role, names major companies hiring those specific practitioners, and compares the national average salaries of the different roles.

Needed: More Women In Data Science

Stanford’s School of Engineering held the first Women in Data Science conference on November 2nd. During the all-female gathering, women practitioners shared their research, discussed the importance of diversity in understanding the questions and answers posed by data research, and explored the challenges facing prospective women data scientists.

Data Science At Petabyte Scale Is Helping Explain Global Trends

Wired speaks to James Crawford, founder and CEO of Orbital Insight, which uses advanced image processing and data science techniques to track sales trends on a global scale. The company draws on a range of data sources, including satellite and drone imagery, to gain insight into consumer activity as well as to track mining, manufacturing, and shipping activities.

Nate Silver Predicts 2016 Presidential Race At Salesforce World Tour

Speaking at the Salesforce World Tour on November 18th, star statistician Nate Silver offered some preliminary predictions for who will be the front-runners in the 2016 Presidential Election. Despite the insurgence of Bernie Sanders, Silver stated that Hillary Clinton remains the firm front-runner in the Democratic race. On the Republican side, Silver hedged his bets, noting that there are few firm endorsements of candidates at this point, and that some factors are unpredictable: “You’ve never had a Trump or a Carson be a major candidate before,” he stated, referring to the current front-runners in polls.

Establishing A Brain Trust For Data Science

The National Science Foundation announced awards totaling $5 million and the establishment of “Big Data Regional Innovation Hubs,” which bring together academic researchers and leading corporations to drive innovation and share insights. The Hubs will prioritize a number of major topics that researchers and scientists are focused on, including healthcare, management of natural resources, agriculture, smart cities, precision medicine, energy and manufacturing, and finance.

This Month In Pivotal Data Science

The New Flo for Spring XD

Flo for Spring XD is an incredibly powerful tool with a graphical canvas and DSL access. This first production-ready release adds batch workflows while addressing the most prominent challenges raised by Spring XD users since the beta process. Ultimately, Flo makes integration easier, improves the speed and quality of development, and addresses organizational needs. This post provides background on Flo, explains the challenges it addresses, reviews the Flo solution and features, then talks about the journey ahead.

Data, Why Did It Have To Be Data?

In this episode of the Pivotal podcast, host Coté once again chats with Andrew Clay Shafer about the sundry challenges of transforming to a Cloud Native enterprise. They cover the changing focus we’re seeing among Pivotal customers: moving up the stack from infrastructure to the application layers. Then they discuss the difficulties of handling the data layer, and wrap up with some change management tactics for getting the “rank and file” inspired and bought into the Cloud Native lifestyle.

Try Out Pivotal Greenplum With A Sandbox Virtual Machine

Pivotal Greenplum became the first open source massively parallel data warehouse in late October. Anyone can now clone the GitHub repo and build the product, known in its open source form as Greenplum Database, but another segment of the community just wants to try out the product’s functionality without going through that process. For that group, we now have the Pivotal Greenplum Sandbox Virtual Machine, which combines the open source Greenplum Database, the commercially available Pivotal Greenplum Command Center management tool, Apache MADlib (incubating), PostGIS, PL/R, PL/Perl, and PL/Java into an easy-to-use virtual machine that runs in either VirtualBox or VMware Fusion.

How WellCare Accelerated Big Data Delivery To Improve Analytics

In the healthcare industry, big data management is becoming an increasingly high priority. This webinar is presented by executives from Pivotal, Attunity, and WellCare, a joint customer. In it, they share a number of industry data points covering reporting at scale, using real-time data, where Apache Hadoop™ and massively parallel SQL on Hadoop systems fit, and more. WellCare also shares the story of their journey to improve mission-critical query times from 30 days to seven.

Now Open: Pivotal Big Data Center Of Excellence In Denver

Pivotal is expanding our partner support. With data and analytics becoming a key differentiator for successful businesses, enterprises and start-ups alike are increasingly building scale-out big data platforms. Pivotal is expanding our presence in Denver to increase the number of hardware platforms we are certified with, helping to reduce risk and accelerate time to value for our customers. Read more about this new capability and how to participate.
