A motivation to use S3 over a document store:
Cost is one of the main drivers for the choice of cloud
technologies. Developers, however, are usually motivated by programmability
and functionality. For example, a document store like DynamoDB is a fully
managed NoSQL database service that provides fast and predictable performance
with seamless scalability. It might be the convenient choice for schema-less
storage, for its table representation, and for its frequent use with an
in-memory cache for low latency. But when the operations on the resources
stored in the table are just plain create, update, get and delete, a
web-accessible object store like S3 is sufficient for storing such objects.
When we calculate the cost of a small application, the
monthly charges might look something like this:
API Gateway 0.04 USD
Cognito 10.00 USD
DynamoDB 75.02 USD
S3 2.07 USD
Lambda 0.00 USD
Web Application Firewall 8.00 USD
In this case, the justification for using S3 is clear from the
cost savings for low-overhead resources that need nothing more than cloud
persistence.
It is in this context that application modernization has
the potential to drive down costs by moving certain persistence to S3 instead
of DynamoDB. The only consideration is that realizing these cost savings
requires a newer S3 feature called Amazon S3 Select. The
bookkeeping operations on the other objects can be achieved by querying a
ledger object that makes progressive updates without deleting earlier entries.
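One way to picture such a ledger is as CSV text that only ever grows. The sketch below is an assumption about how it might look, not a prescribed format: the column layout (timestamp, key, action), the function names and the sample keys are all hypothetical, and the S3 upload and download steps are left out so that only the append-and-replay logic is shown.

```javascript
// Hypothetical ledger format: one CSV line per event, "timestamp,key,action".
// Entries are only ever appended; earlier lines are never edited or removed.
function appendEntry(ledgerText, entry) {
  const line = [entry.timestamp, entry.key, entry.action].join(',');
  return ledgerText + line + '\n';
}

// Replay the ledger from the top to find the latest action for each key.
function latestActions(ledgerText) {
  const latest = new Map();
  for (const line of ledgerText.split('\n')) {
    if (line === '') continue; // skip the trailing empty piece
    const [timestamp, key, action] = line.split(',');
    latest.set(key, { timestamp, action });
  }
  return latest;
}

// Illustrative data only.
let ledger = '';
ledger = appendEntry(ledger, { timestamp: 't1', key: 'users/alice.json', action: 'create' });
ledger = appendEntry(ledger, { timestamp: 't2', key: 'users/bob.json', action: 'create' });
ledger = appendEntry(ledger, { timestamp: 't3', key: 'users/alice.json', action: 'delete' });

console.log(latestActions(ledger).get('users/alice.json').action);
```

Because every event stays in the text, the full history remains queryable, while the replay gives the current state of each object.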
Using Amazon S3 Select, we can query for a subset of data
from an S3 object by using simple SQL expressions. The selectObjectContent API
in the AWS SDK for JavaScript is used for this purpose.
Let us use a CSV file that is uploaded as an S3 object with the key
target-file.csv in the bucket named my-bucket in the us-west-2
region. This CSV contains entries with username and age attributes. If we were
to select users with an age greater than 20, the SQL query would appear as:
SELECT username FROM S3Object WHERE cast(age as int) > 20
With the AWS SDK for JavaScript, we write this as:
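// S3 client from the AWS SDK for JavaScript (v2)
const S3 = require('aws-sdk/clients/s3');
const s3 = new S3({ region: 'us-west-2' });

// Request parameters. The CSV is assumed to have a header row
// (username,age) so that the query can refer to columns by name.
const params = {
  Bucket: 'my-bucket',
  Key: 'target-file.csv',
  ExpressionType: 'SQL',
  Expression: 'SELECT username FROM S3Object WHERE cast(age as int) > 20',
  InputSerialization: { CSV: { FileHeaderInfo: 'USE' } },
  OutputSerialization: { CSV: {} },
};

s3.selectObjectContent(params, (err, data) => {
  if (err) {
    // handle error
    return;
  }

  // data.Payload is a stream of events sent back by S3
  const eventStream = data.Payload;

  eventStream.on('data', (event) => {
    if (event.Records) {
      // event.Records.Payload is a buffer containing
      // a single record, partial records, or multiple records
      process.stdout.write(event.Records.Payload.toString());
    } else if (event.Stats) {
      console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
    } else if (event.End) {
      console.log('SelectObjectContent completed');
    }
  });

  // Handle errors encountered during the API call
  eventStream.on('error', (err) => {
    switch (err.name) {
      // Check against specific error codes that need custom handling
    }
  });

  eventStream.on('end', () => {
    // Finished receiving events from S3
  });
});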
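As the comment in the handler notes, event.Records.Payload may hold a single record, partial records, or multiple records, so a caller that wants whole rows has to stitch the chunks back together. The following is a minimal sketch of one way to do that. The createRecordSplitter helper is hypothetical (it is not part of the AWS SDK), it assumes newline-delimited CSV output, and the chunks are simulated with Buffer.from so the sketch runs without an AWS account.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: buffers partial records across chunks and calls
// onRecord once for every complete, newline-terminated record.
function createRecordSplitter(onRecord) {
  const decoder = new StringDecoder('utf8'); // copes with split multi-byte characters
  let pending = '';

  function drain() {
    const lines = pending.split('\n');
    pending = lines.pop(); // the last piece may be an incomplete record
    for (const line of lines) {
      if (line !== '') onRecord(line);
    }
  }

  return {
    // Feed one Buffer, for example event.Records.Payload.
    push(chunk) {
      pending += decoder.write(chunk);
      drain();
    },
    // Call once when the stream has finished, for example on event.End.
    flush() {
      pending += decoder.end();
      drain();
      if (pending !== '') onRecord(pending);
      pending = '';
    },
  };
}

// Simulated chunks: the record "bob" is split across two payloads.
const usernames = [];
const splitter = createRecordSplitter((line) => usernames.push(line));
splitter.push(Buffer.from('alice\nbo'));
splitter.push(Buffer.from('b\ncarol\n'));
splitter.flush();

console.log(usernames);
```

In the handler, splitter.push(event.Records.Payload) would take the place of the process.stdout.write call, with splitter.flush() invoked when the End event arrives.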