Today we discuss a URL shortening service. This involves:
1) shortening - take a URL and return a much smaller URL, generally an endpoint plus a hash
2) redirection - take a short URL and redirect to the original URL.
With the primary use cases identified, we can start the design. Although it may not be necessary to include every use case in the design, we list some others: custom URLs, analytics, automatic link expiration, manual link removal, and interface options such as a UI, an API, and a command line.
The use cases define the problem we are trying to solve. Spending some time up front on them helps us plan the next steps.
Any system can solve a given use case, but the design becomes more specific as we introduce constraints.
These constraints include high availability with strict Service Level Agreements. Suppose there are 100 million new URLs per month and 1 billion requests per month, which is about 400 requests per second. 99% of these requests need to be served successfully. 10% of the requests may be for shortening and the other 90% for redirection. Over 5 years, this adds up to 6 billion URLs.
With this estimate, we proceed to the storage and processing requirements for the design. Each URL may take, say, 1,000 bytes, and its hash may take twelve bytes for Unicode. This results in roughly 6 TB for all URLs.
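As a quick sanity check, the arithmetic behind these estimates can be written out directly (the variable names and the 30-day month are assumptions for illustration):

```csharp
using System;

// Back-of-the-envelope check of the capacity estimates above.
long urlsPerMonth = 100_000_000L;
long requestsPerMonth = 1_000_000_000L;
long secondsPerMonth = 30L * 24 * 3600;                      // ~2.6 million seconds
long requestsPerSecond = requestsPerMonth / secondsPerMonth; // ~400 req/s
long urlsIn5Years = urlsPerMonth * 12 * 5;                   // 6 billion URLs
long bytesPerUrl = 1000;                                     // assumed average record size
double totalTerabytes = urlsIn5Years * bytesPerUrl / 1e12;   // ~6 TB

Console.WriteLine($"{requestsPerSecond} req/s, {urlsIn5Years:N0} URLs, {totalTerabytes} TB");
```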
The storage acts like a huge hash table that stores new mappings and retrieves the URL for a given hash.
The hash itself is computed as the last six characters of the Base64 encoding of the MD5 of the original URL, taken with a random salt.
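A minimal sketch of that hashing step (the `ShortHash` name and the salt value are illustrative, not from the original; the trailing "==" Base64 padding of a 16-byte MD5 digest is stripped first so that all six characters carry information):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// MD5 of the salted URL, Base64-encode, then take the last six characters.
static string ShortHash(string url, string salt)
{
    using var md5 = MD5.Create();
    byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(url + salt));
    string b64 = Convert.ToBase64String(digest).TrimEnd('='); // drop "==" padding
    return b64.Substring(b64.Length - 6);                     // six-character key
}

Console.WriteLine(ShortHash("https://example.com/some/very/long/path", "s3cr3t"));
```

In practice the salt would be stored alongside the mapping, and a collision on the six-character key would be retried with a fresh salt.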
To scale shortening and redirection independently, we can create a separate service for each. Each service may be hosted on several identical web servers behind a load balancer. The data may live on a common server that stores all the state. Databases can easily scale to the volume we have estimated. Caches can improve response times, and the most frequently used redirections can be served from cache. Other bottlenecks can be addressed similarly.
#codingexercise
Find the number of increasing paths in a matrix:
For example, with the matrix
1 2
1 3
the increasing paths are 1, 1, 2, 3, {1,2}, {1,3}, {2,3}, and {1,2,3}, for a total of eight.
We can do this with recursion and memoization. getCount returns the number of increasing paths that start at cell (x, y); dp must be initialized to -1:
int getCount(int[,] matrix, int[,] dp, int x, int y)
{
    if (dp[x, y] != -1)
        return dp[x, y];
    int[] dx = new int[] { 0, 1, -1, 0 };
    int[] dy = new int[] { 1, 0, 0, -1 };
    int result = 1; // the element by itself
    for (int i = 0; i < dx.Length; i++)
    {
        int m = x + dx[i];
        int n = y + dy[i];
        if (isValid(m, n, matrix) && matrix[m, n] > matrix[x, y])
        {
            result += getCount(matrix, dp, m, n);
        }
    }
    dp[x, y] = result;
    return result;
}
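For completeness, here is a runnable version with a bounds-checking isValid (referenced but not shown above, so this definition is an assumption) and a driver that sums the counts over every starting cell:

```csharp
using System;

// Bounds check for a cell (m, n) of the matrix.
static bool isValid(int m, int n, int[,] matrix) =>
    m >= 0 && m < matrix.GetLength(0) && n >= 0 && n < matrix.GetLength(1);

// Number of increasing paths starting at (x, y), memoized in dp.
static int getCount(int[,] matrix, int[,] dp, int x, int y)
{
    if (dp[x, y] != -1)
        return dp[x, y];
    int[] dx = { 0, 1, -1, 0 };
    int[] dy = { 1, 0, 0, -1 };
    int result = 1; // the element by itself
    for (int i = 0; i < dx.Length; i++)
    {
        int m = x + dx[i];
        int n = y + dy[i];
        if (isValid(m, n, matrix) && matrix[m, n] > matrix[x, y])
            result += getCount(matrix, dp, m, n);
    }
    dp[x, y] = result;
    return result;
}

// Total number of increasing paths, summed over all starting cells.
static int CountAllPaths(int[,] matrix)
{
    int rows = matrix.GetLength(0), cols = matrix.GetLength(1);
    var dp = new int[rows, cols];
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            dp[i, j] = -1; // -1 marks "not computed yet"
    int total = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            total += getCount(matrix, dp, i, j);
    return total;
}

Console.WriteLine(CountAllPaths(new int[,] { { 1, 2 }, { 1, 3 } })); // prints 8
```

This matches the example above: the 2x2 matrix has eight increasing paths.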