Thursday, May 25, 2017

For the next few days, I will write about System Design problems which are increasingly becoming popular in interviews. These problems delve into planning, design, architecture and deployment considerations and are not as focused as the coding exercises. Let us take the example of URL shortening service.
The use cases for shortening service include the following:
shortening - take a url and return a much smaller url generally with an endpoint and a hash
redirection - take a short url and redirect to the original url.
With the primary use cases, we can start the design. Although it may not be necessary to include use cases in the design, we just list some others as custom url, analytics, automatic link expiration, manual link removal and providing interface options such as UI, API and command line.
The use cases help define the problem we are trying to solve. Spending some time up front on the use cases, helps us plan for the next steps.
Any system can solve a given use case but the design gets more specific as we introduce constraints in the system.
These constraints include high availability with high Service Level Agreements. There may be 100 million urls per month with 1 billion requests per month. The traffic may be over 400 requests per second.  99% of these requests need to be responded. 10% of the requests may be for shortening and the other 90% maybe for redirection. Adding them up, we get over 6 billion urls in 5 years.
With this estimation, we proceed to the storage and processing requirements for the design.  Each URL may take say 1000 bytes and its hash may take twelve bytes for unicode.  This may result in over 6 TBs for all urls.
The storage acts like a huge hash table which stores new mappings and retrieves a url for a given hash.
The hash itself is computed as the last six characters of the base64 of the md5 of the original url taken with random salt.
In order to scale the shortening and the redirection, we can create separate services for these so that each can scale independently. Services may be take by several web servers that are identical and behind a load balancer. The data may be a common server that stores all the states. Databases can easily scale to the data we have estimated. Caches can improve response times and the most used redirections can easily be served from cache. Other bottlenecks may similarly be improved.
courtesy: HiredInTech
#codingexercise
Shuffle :
        static void shuffle(ref List<int> A)
        {
            Random randObj = new Random( 1 );
            for (int i = A.Count/2; i < A.Count; i++)
            {
                var j = randObj.Next(A.Count/2);
                int temp = A[i];
                A[i] = A[j];
                A[j] = temp;
            }
        }
Determine cycle in graph:
Bellman-Ford(G, w, s)
Initialize-single-source(G,s)
for i = 1 to number of vertices -1
    for each edge (u,v) belongs to edges E in G
          relax(u,v,w)
for each edge(u,v) belongs to edges E in G
     if (v.d  > u.d  + w(u,v))
         return False
return True
The topological sort can also detect cycles but this works with negative edges too.

No comments:

Post a Comment