The promise and risks of big data analysis received plenty of attention in recent weeks. While data science’s ability to optimize performance and facilitate more effective operations was seen in the oil and gas, shipping, and even restaurant industries, the risk that “algorithmic bias” will reinforce existing social inequities received serious scrutiny.
Here’s our roundup of the biggest data science news of the month, both from Pivotal and beyond.
As oil prices continue to fall, from over $100 a barrel a year ago to $38 in September, companies are increasingly adopting big data tools to manage their operations more cost-effectively and efficiently.
UPS was an early adopter of data-driven approaches to optimizing its delivery chain, and that influence has since filtered into the company’s DNA. This feature at Datanami takes a look at how the company is shifting from “descriptive” to predictive analytics, and the potential efficiencies and improvements UPS may reap as a result.
Independent retailers and restaurateurs have not traditionally been quick to adopt new technologies, if the continued poor quality of restaurant websites is any indication. But a new generation of tech-savvy retailers and restaurant owners is quickly adopting data science techniques to optimize their businesses and increase customer satisfaction.
In this report for Truth-Out, Lauren Kirchner explores the problematic presumptions and unintended consequences of data-driven decision making, and the risk that “algorithmic bias” will continue to enforce, or even bolster, existing race, gender, and class inequities.
Stanford launched a new podcast this month that aims to investigate and discuss some of the issues of representation and bias that Kirchner wrote about in her feature. The biweekly podcast, produced by Worldview Stanford, will “examine how big data and cyber technologies are changing the relationships between people, technology and social institutions.”
Phys.org profiles the startup Agrimetrics, which aims to provide big data services across all aspects of the food chain, so that producers, processors and retailers can better produce and distribute safe and affordable food on a worldwide scale.
The explosive growth of job opportunities for data scientists will come as no surprise to those already working in big data-related industries, but the extent of that growth has now been confirmed by a study of LinkedIn data, which reveals that the number of employed data scientists has doubled in the past four years. Also fascinating are the numbers of data scientists employed by respective companies and the rate of hiring, with Microsoft and Facebook appearing as standouts.
This Month In Pivotal Data Science
As malware techniques continue to evolve, it becomes increasingly challenging to detect network security threats, especially Advanced Persistent Threats (APTs) that are orchestrated by sophisticated adversaries. An increasingly common strategy adopted by APT actors to carry out targeted attacks is the watering hole technique. Watering hole attacks target a group of users in an organization by infecting the websites that these users visit most often. In this blog post, Anirudh Kondaveeti and Jin Yu discuss the application of sequential pattern mining to detect coordinated network attacks such as watering hole attacks.
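To give a rough sense of what sequential pattern mining means here, the sketch below counts ordered sequences of visited domains that recur across several users' browsing sessions. It is a toy illustration under our own assumptions, not the method described in the post: the function name `frequent_sequences`, the session data, and the `.example` domains are all hypothetical, and a real system would work from proxy or DNS logs with a far more scalable algorithm.

```python
from collections import defaultdict
from itertools import combinations


def frequent_sequences(sessions, length=2, min_support=2):
    """Return ordered domain subsequences shared by at least `min_support` users.

    `sessions` maps a user name to the ordered list of domains they visited.
    (Hypothetical input format, chosen only for this illustration.)
    """
    support = defaultdict(set)
    for user, domains in sessions.items():
        # combinations() keeps the input order, so each tuple is an ordered
        # (not necessarily contiguous) subsequence of the user's session.
        for seq in set(combinations(domains, length)):
            support[seq].add(user)
    # Support is the number of distinct users whose session contains the sequence.
    return {
        seq: len(users)
        for seq, users in support.items()
        if len(users) >= min_support
    }


# Made-up sessions: two users reach the same unfamiliar domain after one shared site.
sessions = {
    "alice": ["news.example", "cdn.example", "dropper.example"],
    "bob": ["mail.example", "news.example", "dropper.example"],
    "carol": ["news.example", "wiki.example"],
}

patterns = frequent_sequences(sessions, length=2, min_support=2)
for seq, count in patterns.items():
    print(" -> ".join(seq), "seen for", count, "users")
```

In this made-up data, the only sequence shared by two users is a visit to `news.example` followed by `dropper.example`, which is the kind of recurring path an analyst might then inspect more closely.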
Pivotal is proud to announce that Pivotal GemFire was cited as a leader in the newly published The Forrester Wave™: In-Memory Data Grids, Q3 2015 report from Forrester Research. While we are proud to report that GemFire's score was among the second-highest in the strategy category, this post also explores a strong point that Forrester underscores in the report: “AD&D pros should not make the mistake of turning to IMDGs only when performance at scale becomes an issue. It will become an issue sooner or later.” A free download of the report is included in this post.
In this post, we continue our blog series on multivariate time series by applying this modeling approach to forecasting for virtual machine capacity planning. The technique can also be applied broadly to other areas, such as monitoring industrial equipment or vehicle engines.
The pressure for real-time data in applications is picking up at the same rate that applications are gravitating toward modern Cloud Native architectures. Last month at Spring One 2GX, Pivotal announced the release of Spring Cloud Data Flow, which moves many of the capabilities of Spring XD to a Cloud Native architecture. In this episode, host Simon Elisha walks us through the changes and how they fit into Cloud Native application architectures.
In this post, Anirudh Kondaveeti, a Principal Data Scientist at Pivotal, provides an in-depth, real-world example of how data science applies to mechanical and materials engineering in the semiconductor manufacturing industry. Step-by-step, he covers de-noising, preprocessing, feature extraction, dimensionality reduction, outlier detection, and clustering to show how yield and profitability are improved.
Upcoming Pivotal Events
- PGConf Silicon Valley, Nov 17 – 18, 2015, South San Francisco, CA
- PostgreSQL Japan, Nov 27, 2015, Tokyo, Japan
- Strata + Hadoop World in Singapore, Dec 1 – 3, 2015, Singapore
- Gartner AADI Las Vegas 2015, Dec 1 – 3, 2015, Las Vegas, NV
About the Author