What is it?

Data Mining Hub (DMH) is a platform for developing data mining and machine learning algorithms using an iterative approach. It is also a business tool that helps analyze large amounts of data and extract useful and important information from it.

DMH has two roles: the customer (business) side and the scientist side. A customer describes a task (problem), and scientists try to solve it.

DMH lets scientists take part in solving interesting problems, compete with other scientists and, of course, get paid if a customer chooses their algorithm. An algorithm that was not selected in one iteration can still be selected in a later one. If the original data has not changed, DMH automatically migrates the algorithm's results from the previous iteration to the new one. Scientists can also improve their algorithm and get paid in the next iteration.

For a customer, DMH is a single integration point with a large number of scientists and an easy way to apply different algorithms to the same data.

In short, the DMH operating principle can be described as follows:

  1. A customer creates a task, provides a description, and defines the acceptable budget, duration and decision-making period for each iteration.
  2. The customer uploads the data that scientists will use.
  3. The customer confirms the task, after which the data becomes available to scientists.
  4. Using the data, scientists create their algorithms, upload them to DMH and set the cost of using each algorithm.
  5. The customer chooses the algorithm they prefer and transfers the payment to the scientist.
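
The five steps above can be read as a small state machine. The following Python sketch is only an illustration of that reading, not DMH's actual API: the names `Task`, `Submission`, `TaskState` and all methods are hypothetical, and the rule that an algorithm's cost must not exceed the task budget is an assumption about what "acceptable budget" means.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class TaskState(Enum):
    """Hypothetical task states, one per customer-side step."""
    DRAFT = auto()        # step 1: task created with description and budget
    DATA_LOADED = auto()  # step 2: customer has uploaded the data
    CONFIRMED = auto()    # step 3: data is available to scientists
    PAID = auto()         # step 5: an algorithm was chosen and paid for


@dataclass
class Submission:
    """An uploaded algorithm with the usage cost set by its author (step 4)."""
    scientist: str
    cost: float


@dataclass
class Task:
    description: str
    budget: float
    state: TaskState = TaskState.DRAFT
    submissions: list = field(default_factory=list)

    def load_data(self) -> None:
        if self.state is not TaskState.DRAFT:
            raise ValueError("data can only be loaded into a draft task")
        self.state = TaskState.DATA_LOADED

    def confirm(self) -> None:
        if self.state is not TaskState.DATA_LOADED:
            raise ValueError("data must be loaded before the task is confirmed")
        self.state = TaskState.CONFIRMED

    def submit(self, scientist: str, cost: float) -> Submission:
        # Scientists only see the data after the customer confirms the task.
        if self.state is not TaskState.CONFIRMED:
            raise ValueError("data is not available to scientists yet")
        submission = Submission(scientist, cost)
        self.submissions.append(submission)
        return submission

    def choose(self, submission: Submission) -> float:
        """Pick an algorithm and return the payment due to its author."""
        if self.state is not TaskState.CONFIRMED:
            raise ValueError("task is not open for a decision")
        if submission not in self.submissions:
            raise ValueError("unknown submission")
        # Assumption: the cost may not exceed the customer's budget.
        if submission.cost > self.budget:
            raise ValueError("cost exceeds the acceptable budget")
        self.state = TaskState.PAID
        return submission.cost
```

In this sketch, calling `submit` before `confirm` raises an error, which mirrors step 3: scientists cannot work on a task until the customer has confirmed it.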

DMH differs from similar platforms, such as Kaggle and Algomost, in the following ways:

  1. a task is divided into iterations
  2. an author owns algorithm code, a customer only rents it
  3. calculations, evaluation and money manipulations are performed by DMH
  4. a scientist does not need to verify their qualifications

You may also be interested in the workflow, the data format, the estimation algorithm or an example of hub usage.