IDG Contributor Network: In an effort to get its house in order, Docker containerizes and ships out its CEO

Big changes for Docker Inc. What’s behind them, and what does it mean?

News that Docker Inc., the commercial organization behind the open source Docker initiative, has replaced its CEO will come as a shock to many, but not to others. I recently covered some PR blunders that Docker has made, but these are, to be honest, simply symptoms of a far more serious malady: a company with a massive valuation, huge interest, and a growing ecosystem that is under pressure to tie it all together into a viable business. Quite simply, Docker (the open source project) is far more successful than its eponymous commercial entity, and there is pressure for that to change. Indeed, I recently took part in a panel discussing the latest DockerCon conference and what it all means for the business.

Computerworld Cloud Computing

IBM Strengthens Effort to Support Open Source Spark for Machine Learning

IBM is providing substantial resources to the Apache Software Foundation’s Spark project to prepare the platform for machine learning tasks such as pattern recognition and object classification. The company plans to offer Spark as a service on Bluemix and has dedicated 3,500 researchers and developers to assist in its maintenance and further development.

In 2009, the AMPLab at the University of California, Berkeley developed the Spark framework, which was open sourced a year later and subsequently became an Apache project. This framework, which runs on a server cluster, can process data up to 100 times faster than Hadoop MapReduce. Given that data and analytics are embedded in business and society, from applications to the Internet of Things (IoT), Spark provides essential advancements in large-scale data processing.

First, it significantly improves the performance of data-dependent applications. Second, it radically simplifies the development of intelligent applications, which are fueled by data. In its effort to accelerate innovation in the Spark ecosystem, IBM has decided to build Spark into its own predictive analytics and machine learning platforms.

IBM Watson Health Cloud will use Spark to serve healthcare providers and researchers as they gain access to new population health data. At the same time, IBM will release its SystemML machine learning technology as open source. IBM is also collaborating with Databricks to advance Spark's capabilities.

IBM will commit more than 3,500 researchers and developers to work on Spark-related projects in more than a dozen laboratories worldwide. Big Blue plans to open a Spark Technology Center in San Francisco for the data science and developer communities. IBM will also train more than one million data scientists and data engineers on Spark through partnerships with DataCamp, AMPLab, Galvanize, MetiStream, and Big Data University.

A typical large corporation has hundreds or thousands of data sets residing in different databases across its computer systems. A data scientist can design an algorithm to plumb the depths of any one database, but developing that algorithm takes about 90 working days of data science effort. Today, applying it to another system means another quarter of work to adjust the algorithm so that it works. Spark cuts that time in half. A Spark-based system can access and analyze any database without additional development or delay.

Another of Spark's virtues is ease of use: developers can concentrate on designing the solution rather than building an engine from scratch. Spark advances large-scale data processing because it improves the performance of data-dependent applications, radically simplifies the process of developing intelligent solutions, and provides a platform capable of unifying all kinds of information on real work schemes.

Many experts consider Spark the successor to Hadoop, but its adoption remains slow. Spark works very well for machine learning tasks, which normally require running large clusters of computers. The latest version of the platform, which came out recently, extends the range of machine learning algorithms it can run.


CloudTimes