Monday, September 23, 2013

In the previous posts, we looked at possible solutions for keyword detection. Most of those relied on a large corpus for statistical modeling. I want to look into solutions that run on the client side, or that at least rely only on web requests to a central server. Corpus-based keyword detection parses a large volume of text and gathers statistics from it. That is practical for server side computing, but a client tool such as a Word document plugin hardly has that luxury unless it relies on web requests and responses. If the solution works on the server side, the client can send its text to the server for processing. If the server side is exposed through APIs, several kinds of clients can connect to it and have their text processed.
This means we can write a website that works with different handheld devices for a variety of text. Since the text can be extracted from several types of documents, the proposed service can work with any of them.
The APIs can be simple: they take input text and return the list of keywords found in it in the form of word offsets, leaving the pagination and rendering to the client. This works well because a text, or any collection of words, can be addressed by relative offsets, which gives a consistent and uniform way to access any incoming text.
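As a rough illustration of the offset-based contract described above, here is a minimal sketch. The names KeywordService and FindKeywordOffsets are made up for this example, and the keyword set is simply passed in, whereas a real service would presumably derive it from corpus statistics.

```csharp
using System;
using System.Collections.Generic;

public static class KeywordService
{
    // Hypothetical helper: returns the zero-based word offsets at which a keyword occurs.
    // Assumption: words are whitespace separated and the keyword set is supplied by the caller.
    public static List<int> FindKeywordOffsets(string text, ISet<string> keywords)
    {
        var offsets = new List<int>();
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
                               StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i++)
        {
            // Strip common punctuation so that "text;" still matches "text".
            var word = words[i].Trim('.', ',', ';', ':', '!', '?', '"', '(', ')');
            if (keywords.Contains(word))
            {
                offsets.Add(i);
            }
        }
        return offsets;
    }
}
```

A client would send the text, get back only the offsets, and decide for itself how to highlight or paginate the matching words. Passing a HashSet<string> built with StringComparer.OrdinalIgnoreCase makes the lookup case-insensitive.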
Server side processing also lets us build our corpus from the incoming text, if that is permissible. The incoming text gives us more representative text, which is what makes this strategy important. Representative text matters not only for the collection itself but also because it yields the common keywords that are most prevalent. Identifying this subset alone can help with pure client side checks and processing.
Along the lines of the previous post, some common lambda expressions:
Func<bool> tester = () => true;
Func<string, string> selector = str => str.ToUpper();
Func<string, int, string[]> extract = (s,i) => i > 0 ? s.Split(delimiters,i) :  s.Split(delimiters);
Func<string, NumberStyles, IFormatProvider, int> parser = int.Parse;
button1.Click += async (sender, e) =>  { await Task.Delay(1000); textbox1.Text = "Voila!"; };
var firstLargerThanIndexNumbers = numbers.TakeWhile( (n,index) => n > index);
var query = people.Join( pets,
                                    person => person,
                                    pet => pet.Owner,
                                    (person, pet) => new { Pet = pet.Name, OwnerName = person.Name });
grades.OrderByDescending(grade => grade).Take(3);
Enumerable.Repeat("Repetition", 15);
var squares = Enumerable.Range(1, 10).Select(x => x * x);

Sunday, September 22, 2013

Sentence reversal, from an MSDN sample:

var sentence = "the quick brown fox jumped over the lazy dog";
var words = sentence.Split(' ');

var reversedSentence  = words.Aggregate((sent, next) =>  next + " " + sent);

Console.WriteLine(reversedSentence);
Here's a quick reference on WCF contract attributes, jargon, etc.
1) MSMQ - message delivery guarantees such as delivery when the receiver is offline, transactional delivery of messages, durable storage of messages, and exception management through dead letter and poison message queues. Dead letter queues are used when messages expire or are purged from other queues. Poison message queues are used when the max retry count is exceeded. Security is integrated with AD. The MSMQ management console facilitates administration.
NetMsmqBinding features:
ExactlyOnce, TimeToLive (default 1 day), QueueTransferProtocol (SRMP to expose over HTTP), ReceiveRetryCount, MaxRetryCycles, UseMsmqTracing, UseActiveDirectory, UseSourceJournal.
Client transactions - TransactionScope. Queued service contract - TransactionFlow(TransactionFlowOption.Allowed).
Security features:
Authentication - mutual, of sender and receiver. Authorization - access level. Integrity. Confidentiality.
SecurityMode - None, Transport, Message, Both, TransportWithMessageCredential and TransportCredentialOnly. With TransportCredentialOnly, client credentials are passed with the transport layer.
Client credentials are set through the ClientCredentials class: the Windows property, HttpDigest property, UserName property, ClientCertificate property, ServiceCertificate property, IssuedToken property and Peer property.
SecurityPrincipal - roles, identity. ServiceSecurityContext - claims, identity.
Claims based security model - security tokens and claims, using ClaimType, Right and Resource.
An X.509 token has a claim set, in which a list of claims selected from an indexer set is issued by a particular issuer. Authorization calls can be based on custom claims.
Windows CardSpace is used for creating, managing and sharing digital identities in a secure and reliable manner. CardSpace usually has a local STS that can issue SAML tokens. When a card is used to authenticate, a local or remote STS looks at the claims in the card to generate a token.
In federated security, AAA is sometimes delegated to an STS. The client application authenticates to the STS to request a token for a particular user. The STS returns a signed and encrypted token that can be presented to the relying party.
Exception handling - SOAP faults have faultcode, faultstring, faultactor, and detail.
Exception, SystemException, CommunicationException, FaultException, FaultException<T>.
Binding features - transport protocols, message encoding, message version, transport security, message security, duplex, reliable messaging and transactions.
Reliability - implemented via an RM buffer on both the client and server side, plus session maintenance. RequireOrderedDelivery, retry attempts, session throttling. Reliable sessions can be configured with Acknowledgement Interval, Flow Control, Inactivity Timeout, Max Pending Channels, Max Retry Count, Max Transfer Window Size, Ordered etc.
Here are some more design patterns, continuing from the previous post:

Decorator Pattern :  Here we don't change the interface like we did in the adapter. We wrap the object to add new functionality. Extension methods are one example, and the Stream class is a more specific one: a BufferedStream wraps an existing stream, such as a FileStream, and adds buffering to it.
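To make the wrapping idea concrete, here is a small sketch that does not depend on the Stream classes. ITextSource, StringSource, TrimmingSource and UpperCaseSource are invented for this illustration; the point is only that each decorator implements the same interface as the object it wraps.

```csharp
using System;

// Hypothetical component interface shared by the wrapped object and its decorators.
public interface ITextSource
{
    string Read();
}

// The plain component: just hands back the text it was given.
public class StringSource : ITextSource
{
    private readonly string text;
    public StringSource(string text) { this.text = text; }
    public string Read() { return text; }
}

// Decorator: same interface, adds trimming on top of any ITextSource.
public class TrimmingSource : ITextSource
{
    private readonly ITextSource inner;
    public TrimmingSource(ITextSource inner) { this.inner = inner; }
    public string Read() { return inner.Read().Trim(); }
}

// Decorator: same interface, adds upper-casing on top of any ITextSource.
public class UpperCaseSource : ITextSource
{
    private readonly ITextSource inner;
    public UpperCaseSource(ITextSource inner) { this.inner = inner; }
    public string Read() { return inner.Read().ToUpperInvariant(); }
}
```

Decorators can be stacked in any order, for example new UpperCaseSource(new TrimmingSource(new StringSource("  hello  "))), and the caller still only sees an ITextSource.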

Iterator pattern :  This is evident from the IEnumerable pattern in .Net. This is very helpful with LINQ expressions where you can take the collection as an IEnumerable and invoke the standard query operators. The GetEnumerator() method on IEnumerable, together with MoveNext() and Current on the IEnumerator it returns, enables the iteration.
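A short sketch of how this looks in practice: a C# iterator method written with yield return gets its IEnumerator generated by the compiler, and the result can be fed straight into the standard query operators. IteratorDemo and Evens are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IteratorDemo
{
    // Hypothetical iterator: lazily yields the even numbers from 0 up to and including limit.
    // The compiler generates the GetEnumerator/MoveNext/Current plumbing for us.
    public static IEnumerable<int> Evens(int limit)
    {
        for (int i = 0; i <= limit; i += 2)
        {
            yield return i;
        }
    }

    // Because Evens returns IEnumerable<int>, the LINQ operators apply directly.
    public static List<int> SquaresOfEvens(int limit)
    {
        return Evens(limit).Select(x => x * x).ToList();
    }
}
```

A foreach over IteratorDemo.Evens(10) calls GetEnumerator() once and then MoveNext() for each element, which is the same protocol LINQ uses internally.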

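Observer pattern : This is used to notify one class of changes in another. Here we use an interface so that any object can subscribe to notifications, which the subject delivers by calling the Notify method on each observer.
using System.Collections.Generic;

public class State { }   // placeholder for whatever data the notification carries

public interface IObserver
{
    void Notify(State s);
}
public class Subject
{
    private readonly List<IObserver> observers = new List<IObserver>();
    public void Add(IObserver o) { observers.Add(o); }
    public void Remove(IObserver o) { observers.Remove(o); }
    public void NotifyObservers(State s)
    {
        // Push the new state to every subscribed observer.
        observers.ForEach(x => x.Notify(s));
    }
}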

Strategy pattern : When we can switch between different ways of doing the same task, such as sorting with a parameter that selects the comparison, we implement the strategy pattern. Here the IComparer interface enables different comparisons between elements, so that the same sorting operation on the same input can yield different results depending on the comparison used.
public class ArrayList : IList, ICollection, IEnumerable, ICloneable
{
    public virtual void Sort(IComparer comparer);
}
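As a sketch of the same idea with the generic collections, the example below sorts one input with two different IComparer<string> strategies. LengthComparer and StrategyDemo are hypothetical names; List<T>.Sort(IComparer<T>) and StringComparer.Ordinal are the standard library pieces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical strategy: orders strings by length instead of alphabetically.
public class LengthComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return x.Length.CompareTo(y.Length);
    }
}

public static class StrategyDemo
{
    // The sorting operation stays the same; only the comparison strategy varies.
    public static List<string> SortWith(IEnumerable<string> items, IComparer<string> strategy)
    {
        var list = new List<string>(items);   // copy so the input is left untouched
        list.Sort(strategy);
        return list;
    }
}
```

Calling SortWith with StringComparer.Ordinal gives an alphabetical order, while passing new LengthComparer() orders the same words from shortest to longest.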

Saturday, September 21, 2013

I want to post some samples of select design patterns:
1) Builder pattern - This pattern separates the construction of a complex object from its representation so that the same construction process can create different representations.
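using System.Collections.Generic;

// Token and TokenType are placeholders for whatever the RTF tokenizer produces.
public enum TokenType { Char, Font, Para }

public class Token
{
    public TokenType Type;
    public char Char;
    public string Font;
    public string Para;
}

public class RTFReader
{
    private readonly TextConverter builder;
    private readonly IEnumerator<Token> tokens;

    public RTFReader(TextConverter t, IEnumerable<Token> input)
    {
        builder = t;
        tokens = input.GetEnumerator();
    }

    public void ParseRTF()
    {
        // Advance through the tokens and hand each one to the matching builder step.
        while (tokens.MoveNext())
        {
            var t = tokens.Current;
            switch (t.Type)
            {
                case TokenType.Char:
                    builder.ConvertCharacter(t.Char);
                    break;
                case TokenType.Font:
                    builder.ConvertFont(t.Font);
                    break;
                case TokenType.Para:
                    builder.ConvertParagraph(t.Para);
                    break;
            }
        }
    }
}

// Each concrete converter (plain text, TeX, a text widget, ...) builds its own representation.
public abstract class TextConverter
{
    public abstract void ConvertCharacter(char c);
    public abstract void ConvertFont(string font);
    public abstract void ConvertParagraph(string para);
}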

BTW - StringBuilder in .Net is not an example of the Builder design pattern.

Factory Pattern: Here we use an interface for creating an object but let the sub-classes decide which class to instantiate. In the .Net library, for example, the WebRequest class is used to make a request and receive a response.

public abstract class WebRequest {

// declaration only, as listed in the documentation
public static WebRequest Create(string requestUriString);
}
The Create method returns an instance of a WebRequest subclass such as HttpWebRequest, FileWebRequest or FtpWebRequest, chosen by the URI scheme.
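The following sketch imitates that shape with made-up types, to show how a static Create method can pick the concrete class from the URI scheme. ResourceRequest, HttpResourceRequest and FileResourceRequest are hypothetical and are not part of the .Net library; the real WebRequest uses a registry of prefixes rather than a fixed chain of checks.

```csharp
using System;

// Hypothetical base class: callers only ever see this type.
public abstract class ResourceRequest
{
    public string Address { get; private set; }

    protected ResourceRequest(string address) { Address = address; }

    // Factory method: chooses the concrete subclass from the URI scheme.
    public static ResourceRequest Create(string address)
    {
        if (address.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            address.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            return new HttpResourceRequest(address);
        }
        if (address.StartsWith("file://", StringComparison.OrdinalIgnoreCase))
        {
            return new FileResourceRequest(address);
        }
        throw new NotSupportedException("No request type for: " + address);
    }
}

public class HttpResourceRequest : ResourceRequest
{
    public HttpResourceRequest(string address) : base(address) { }
}

public class FileResourceRequest : ResourceRequest
{
    public FileResourceRequest(string address) : base(address) { }
}
```

The caller writes ResourceRequest.Create("file:///C:/code/readme.txt") and never names the concrete type, so new schemes can be added without touching the calling code.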

Adapter Pattern: This is often confused with the Decorator pattern, but they are different.
The Decorator pattern extends functionality dynamically. The Adapter makes one interface work with another, so it changes the interface, unlike the Decorator.

The SqlClient data provider is an example of the adapter pattern. Each provider is an adapter for its specific database. A class adapter uses multiple inheritance to adapt interfaces; in C# that means inheriting from one class and implementing the target interfaces.

public sealed class SqlDataAdapter : DbDataAdapter, IDbDataAdapter, IDataAdapter, ICloneable
{
}

The key here is to inherit one and implement the other.
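Here is a tiny class adapter along those lines, with invented types: FahrenheitThermometer stands in for an existing class we cannot change, and ITemperatureSensor for the interface the rest of the code expects. The fixed reading of 212.0 is just sample data.

```csharp
using System;

// Hypothetical target interface that client code is written against.
public interface ITemperatureSensor
{
    double ReadCelsius();
}

// Hypothetical existing class with an incompatible interface.
public class FahrenheitThermometer
{
    public double ReadFahrenheit() { return 212.0; }   // sample reading
}

// Class adapter: inherit the existing class, implement the expected interface.
public class ThermometerAdapter : FahrenheitThermometer, ITemperatureSensor
{
    public double ReadCelsius()
    {
        // Convert the inherited Fahrenheit reading to Celsius.
        return (ReadFahrenheit() - 32.0) * 5.0 / 9.0;
    }
}
```

Client code that takes an ITemperatureSensor can now be handed a ThermometerAdapter without knowing the underlying device reports Fahrenheit.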

Friday, September 20, 2013

In the previous post, we mentioned how we can index and store text with Lucene so that we can build a source index server. I also mentioned a caveat: unlike the Java version, which may have a method to add files recursively from a directory, the Lucene.Net library does not come with one. So we enumerate the files ourselves and build the index this way:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Lucene.Net;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Store;

namespace SourceSearch
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Count() != 1)
            {
                Console.WriteLine("Usage: SourceSearch <term>");
                return;
            }

            // Resolve the per-user local application data folder and keep the index there.
            var indexAt = SimpleFSDirectory.Open(new DirectoryInfo(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData)));
            using (var indexer = new IndexWriter(
                indexAt,
                new SimpleAnalyzer(),
                IndexWriter.MaxFieldLength.UNLIMITED))
            {

                var src = new DirectoryInfo(@"C:\code\text");

                src.EnumerateFiles("*.cs", SearchOption.AllDirectories).ToList()
                    .ForEach(x =>
                        {
                            using (var reader = File.OpenText(x.FullName))
                            {
                                var doc = new Document();
                                doc.Add(new Field("contents", reader));
                                doc.Add(new Field("title", x.FullName, Field.Store.YES, Field.Index.ANALYZED));
                                indexer.AddDocument(doc);
                            }
                        });

                indexer.Optimize();
                Console.WriteLine("Total number of files indexed : " + indexer.MaxDoc());
            }

            using (var reader = IndexReader.Open(indexAt, true))
            {
                var pos = reader.TermPositions(new Term("contents", args.First().ToLower()));
                while (pos.Next())
                {
                    Console.WriteLine("Match in document " + reader.Document(pos.Doc).GetValues("title").FirstOrDefault());
                }
            }
        }
    }
}