Using Machine Learning with Object Storage:
Machine learning packages such as sklearn and the Microsoft ML package, as well as complex reporting queries for, say, dashboards, can also utilize object storage. These analytical capabilities are usually leveraged inside database systems, but nothing prevents applying them to objects in object storage.
This works very well for several reasons:
1) The analysis can be run on all data at once. The more the data, the better the analysis, and object storage is among the largest stores available. Consequently the backend, and the cloud services in particular, are better prepared for this task.
2) The cloud services are elastic: they can pull in as many resources as needed to execute the queries, which works well for map-reduce processing.
3) Object storage is also suited to doing this processing once for every client and application. Different views and viewmodels can use the same computation so long as the results are part of the storage.
4) Performance increases dramatically when the computations are as close to the data as possible. This has been one of the arguments for pushing the machine learning package into SQL Server, for example.
5) Such compute- and data-intensive operations are hardly required on the frontend, where the data on a given page may be very limited. Moreover, optimizations only happen when the compute and storage are elastic, so that the operations can be studied, cached, and replayed.
6) Complex queries can already be reduced to a few primitives, which can be made available as query operators over object storage. Callers are then left the choice to implement higher-order operations themselves using these primitives or their own custom operators.
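To make points 3 and 6 concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: ObjectStore is a hypothetical in-memory stand-in for a real object store, and scan, where, select, aggregate, and cached_query are made-up names for the kind of primitives described above, not the API of any product. It composes a reporting query from the primitives and writes the result back into the store so that every later caller reuses it.

```python
import json
from functools import reduce


class ObjectStore:
    """Hypothetical in-memory stand-in for an object store (key -> bytes)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def keys(self, prefix=""):
        return sorted(k for k in self._objects if k.startswith(prefix))

    def __contains__(self, key):
        return key in self._objects


def scan(store, prefix):
    """Primitive: yield the decoded JSON record of every object under a prefix."""
    for key in store.keys(prefix):
        yield json.loads(store.get(key))


def where(records, predicate):
    """Primitive: keep only the records that satisfy the predicate."""
    return (r for r in records if predicate(r))


def select(records, projection):
    """Primitive: project each record to the value of interest."""
    return (projection(r) for r in records)


def aggregate(values, combine, initial):
    """Primitive: fold all values into a single result."""
    return reduce(combine, values, initial)


def cached_query(store, result_key, compute):
    """Run compute() once and keep the result in the store for later callers."""
    if result_key in store:
        return json.loads(store.get(result_key))
    result = compute()
    store.put(result_key, json.dumps(result).encode("utf-8"))
    return result


def total_sales(store, region):
    """A higher-order query built only from the primitives above."""
    rows = scan(store, "sales/")
    in_region = where(rows, lambda r: r["region"] == region)
    amounts = select(in_region, lambda r: r["amount"])
    return aggregate(amounts, lambda a, b: a + b, 0)


# Sample data, invented for this sketch.
store = ObjectStore()
store.put("sales/001.json", json.dumps({"region": "west", "amount": 120}).encode("utf-8"))
store.put("sales/002.json", json.dumps({"region": "east", "amount": 80}).encode("utf-8"))
store.put("sales/003.json", json.dumps({"region": "west", "amount": 30}).encode("utf-8"))

west_total = cached_query(store, "results/sales-west.json",
                          lambda: total_sales(store, "west"))
```

A second call to cached_query with the same result key returns the stored value without scanning the sales objects again, which is the "compute once for every client" idea from point 3.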
#codingexercise
Determine if a sum is a perfect number. A perfect number is one whose proper factors (its factors excluding the number itself) add up to the number.
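// Requires: using System.Collections.Generic; using System.Linq;
// Returns true when the given factors add up to exactly sum.
// Note: the list is consumed (emptied from the end) by the call.
bool IsPerfectSum(ref List<int> factors, int sum)
{
    if (sum == 0 && factors.Count > 0) return false; // factors remain after the sum is used up
    if (sum == 0) return true;                       // all factors consumed, nothing left of the sum
    if (sum < 0) return false;
    if (factors.Count == 0) return false;            // sum > 0 but no factors remain
    // sum > 0 and factors.Count > 0
    var last = factors.Last();
    factors.RemoveAt(factors.Count - 1);
    if (last > sum)
    {
        return false;
    }
    // The original called an undefined IsSubsetSum here; recursing on
    // IsPerfectSum is assumed to be the intent.
    return IsPerfectSum(ref factors, sum - last);
}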
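The routine above takes the list of factors as given. As a cross-check of the definition, here is a small Python sketch that derives the proper factors and tests the number directly. The names proper_divisors and is_perfect are hypothetical helpers for this illustration and are not part of the exercise.

```python
def proper_divisors(n):
    """Return the positive divisors of n, excluding n itself, in ascending order."""
    if n < 2:
        return []
    small, large = [1], []
    d = 2
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:          # avoid listing a square root twice
                large.append(n // d)
        d += 1
    return small + large[::-1]


def is_perfect(n):
    """A number is perfect when its proper divisors add up to the number itself."""
    return n > 0 and sum(proper_divisors(n)) == n
```

For example, the proper divisors of 28 are 1, 2, 4, 7, and 14, which add up to 28, so is_perfect(28) is true, while 12 has proper divisors adding up to 16 and is not perfect.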