Friday, July 12, 2013

Publishing load test results.

In Visual Studio, when we open a load test, there is an "Open and Manage Results" option in the toolbar. This brings up a dialog box listing the results associated with the load test. Each result can be selected and opened. Opening a result brings up the summary view by default. This view can then be copied and pasted into the body of an e-mail for reporting. Alternatively, it can be exported to a file on a file share.

SQL Server Reporting Services provides great functionality for designing custom reports. These reports can draw data using SQL queries. They can also be delivered on a schedule through e-mail subscriptions.
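For example, a custom report that trends results across runs could be driven by a simple query. Here is a sketch; the LoadTestRun table and its columns below are illustrative placeholders, not the actual results store schema:

-- Trend of the last 30 days of runs for an SSRS report (hypothetical schema).
SELECT LoadTestName, StartTime, RequestsPerSec, AvgResponseTime
FROM LoadTestRun
WHERE StartTime >= DATEADD(day, -30, GETDATE())
ORDER BY StartTime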

Team Foundation Server enables automation of a performance test cycle.  The steps involved in a performance test cycle are as follows:
1. Understand the process and compliance criteria
2. Understand the system and the project plan
3. Identify performance acceptance criteria
4. Plan performance-testing activities
5. Design tests
6. Configure the test environment
7. Implement the test design
8. Execute the work items
9. Report results and archive data
10. Modify the plan and gain approvals for modifications
11. Return to activity 5
12. Prepare the final report.

The first step involves getting buy-in on performance testing up front and complying with any applicable standards. The second step is to determine the use-case scenarios and their priority. The third is to determine the requirements and goals for performance testing, as established with stakeholders, project documentation, usability studies and competitive analysis; the goals should be articulated in a measurable way and recorded. Planning means adding performance-testing work items to the project plan and scheduling them, which lines up the activities ahead of time. Designing performance tests involves identifying usage scenarios and user variances and generating test data. Tests are designed based on real operations and data to produce more credible results and enhance the value of performance testing; they include component-level tests. Next, configure the environments using load-generation and application-monitoring tools and isolated network environments, and ensure compatibility, all of which takes time. Test designs are then implemented to simulate a single user or virtual users.

Next, work items are executed in the order of their priority; their results are evaluated, recorded and communicated, and the test plan is adapted accordingly. Results are then reported and archived; even runs that are not fully usable are sometimes archived with appropriate labels. After each testing phase, it is important to review the performance test plan: mark the items that have been completed and evaluated, and submit modifications for approval. Repeat the iterations. Finally, prepare a report to be submitted to the relevant stakeholders for acceptance.

Thursday, July 11, 2013

Mailing data and database objects best practices:
Data can be attached to mail in several different file formats including Excel, RTF, CSV, text, and HTML.
Data can also be included in the body of the mail.
You need access to MAPI or SMTP to send mail.
When sending a data access page, share the database so that users can interact with the page.
Create the data access page using UNC paths so that they are not mapped to local drives.
Store the database and the page on the same server.
Publish from a trusted intranet security zone.
Send a pointer instead of a copy of the HTML source code.
For intranet users, UNC paths and domain authentication alleviate security concerns, while the same mechanisms can be used to demand permissions from external users.
Always send the page to yourself and view the code before mailing others.
System-generated mail for periodic activities or alerts is common practice in most workplaces. There are several layers from which such mail can be generated. SQL Server has an extended stored procedure, xp_sendmail (superseded in later versions by Database Mail and sp_send_dbmail), that sends messages through a mail server; it needs to be enabled via server configuration. It can be invoked directly from stored procedures, which are very close to the data.
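Here is a minimal sketch of mailing a query result from the database layer with Database Mail; the profile name and recipient are assumptions and must match an existing Database Mail profile:

-- One-time server configuration to enable the Database Mail extended procedures.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'Database Mail XPs', 1
RECONFIGURE
GO

-- Mail out a query result; 'DefaultProfile' and the recipient are placeholders.
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'DefaultProfile',
    @recipients = 'team@example.com',
    @subject = 'Nightly table list',
    @query = 'SELECT name, object_id FROM sys.tables',
    @attach_query_result_as_file = 1
GO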
SSRS is another layer from which well-formed reports can be mailed out; these are designed and sent from SSRS itself. TFS or the source control server is yet another place that can send mail, and automated performance reports can be sent out this way.

Tuesday, July 9, 2013

REST API: resource versus API throttling

REST APIs should associate cost with the resources consumed rather than with the APIs themselves, because there is no practical limit on the number of calls made per API and calls vary widely in how much they cost to serve. If response sizes can be reduced with an inline filter, that translates directly to savings for both the sender and the receiver.

Some common causes of performance degradation are:
premature optimization
guessing
caching everything
fighting the framework

Performance can be improved by:
1) finding the target baseline
2) knowing the current state
3) profiling to find bottlenecks
4) removing bottlenecks
5) repeating the above

Request distribution per hour, most-requested resources, HTTP statuses returned, request durations, failed requests and so on all help with the analysis. Server logs can carry all this information, and tools that parse the logs to extract it help. Process ID and memory usage can be added directly to the server logs. Server-side and client-side performance metrics help to isolate issues.
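If the server logs are loaded into a table, these breakdowns become simple aggregations. A sketch, assuming a hypothetical RequestLog table with RequestTime, Path, StatusCode and DurationMs columns:

-- Request distribution per hour.
SELECT DATEPART(hour, RequestTime) AS RequestHour, COUNT(*) AS Requests
FROM RequestLog
GROUP BY DATEPART(hour, RequestTime)
ORDER BY RequestHour

-- HTTP statuses returned.
SELECT StatusCode, COUNT(*) AS Hits
FROM RequestLog
GROUP BY StatusCode

-- Slowest paths on average.
SELECT TOP 10 Path, AVG(DurationMs) AS AvgDurationMs
FROM RequestLog
GROUP BY Path
ORDER BY AvgDurationMs DESC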

Benchmarks are available for performance testing of APIs. A CDN should not factor into performance measurements; use a static file response as the baseline. Separate out I/O-bound and CPU-bound processes.

Courtesy: Rails performance best practices

Monday, July 8, 2013

Full text and semantic extraction in SQL Server 2012

Here are some sample queries for semantic extraction of keyphrases in SQL Server 2012.
-- The Documents table, with DocumentID and DocumentTitle columns, is the sample schema used throughout.
DECLARE @Title nvarchar(255), @DocID int
SET @Title = 'TestDoc.docx'

SELECT @DocID = DocumentID
FROM Documents
WHERE DocumentTitle = @Title

-- Finds the key phrases in a document.
SELECT @Title as Title, keyphrase, score
FROM SEMANTICKEYPHRASETABLE(Documents, *, @DocID)
ORDER by score DESC

-- Finds similar documents.
SELECT @Title as SourceTitle, DocumentTitle as MatchedTitle,
DocumentID, score
FROM SEMANTICSIMILARITYTABLE(Documents, *, @DocID)
INNER JOIN Documents ON DocumentID = matched_document_key
ORDER BY score DESC

-- Finds the key phrases that make two documents similar or related.
-- @SourceDocID and @MatchedDocID (and their titles) are looked up the same way as @DocID above.
SELECT @SourceTitle as SourceTitle, @MatchedTitle as MatchedTitle, keyphrase, score
FROM SEMANTICSIMILARITYDETAILSTABLE(Documents, DocumentContent, @SourceDocID, DocumentContent, @MatchedDocID)
ORDER BY score DESC

You can use FileTables to store documents in SQL Server. These are special tables built on top of FILESTREAM.
A FileTable enables an application to access files and documents as if they were stored in the file system,
without requiring any changes to the application.
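A sketch of creating one, assuming FILESTREAM is already enabled on the instance and the database has a FILESTREAM filegroup; the database and directory names are illustrative:

-- Allow non-transacted access so the files are reachable through the Windows share.
ALTER DATABASE MyDocsDB
SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'MyDocsDB')
GO

-- Create the FileTable; it gets the fixed FileTable schema automatically.
CREATE TABLE DocumentStore AS FILETABLE
WITH (FILETABLE_DIRECTORY = 'DocumentStore',
      FILETABLE_COLLATE_FILENAME = database_default)
GO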

You can enable semantic search on columns using a semantic index.
To create a semantic index when there is no full-text index:

CREATE FULLTEXT CATALOG ft AS DEFAULT
GO

CREATE UNIQUE INDEX ui_ukDescription
ON MyTable(DescriptionID)
GO

CREATE FULLTEXT INDEX ON MyTable
(Description LANGUAGE 1033 STATISTICAL_SEMANTICS)
KEY INDEX ui_ukDescription
WITH STOPLIST = SYSTEM
GO


Or, to add semantic indexing to a column that already has a full-text index:
ALTER FULLTEXT INDEX ON MyTable
    ALTER COLUMN Description
        ADD STATISTICAL_SEMANTICS
    WITH NO POPULATION
GO
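Note that semantic indexing also depends on the semantic language statistics database, which is installed separately and then registered; a sketch of the registration, assuming the conventional database name semanticsdb:

-- Register the separately installed semantic language statistics database.
EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'semanticsdb'
GO

-- Verify the registration.
SELECT * FROM sys.fulltext_semantic_language_statistics_database
GO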

Sunday, July 7, 2013

Application Partition for DNS

Application partitions are user-defined partitions that have a custom replication scope. Domain controllers can be configured to host any application partition irrespective of their domains, so long as they are in the same forest. This decouples the DNS data and its replication from the domain context: you can configure AD to replicate DNS data only between the domain controllers running the DNS service within a domain or forest.
The built-in DNS application partitions, DomainDnsZones and ForestDnsZones, are created automatically; the former replicates to the DNS servers of a domain and the latter to the DNS servers of the entire forest. The System folder in the domain partition is the legacy root-level location for storing DNS data.
Aging and scavenging: As DNS records build up, some entries become stale when clients change their names or move, and they are difficult to maintain as the number of hosts increases. The Microsoft DNS server therefore provides a process called scavenging that scans all the records in a zone and removes those that have not been refreshed within a certain period. When clients register themselves with dynamic DNS, their registrations are renewed every 24 hours by default. Windows DNS stores this timestamp as an attribute of the DNS record, and scavenging uses it. Manually created records have their timestamp set to zero, so they are excluded from scavenging.
"A "no-refresh interval" for the scavenging configuration option is used to limit the amount of unnecessary replication because it defines how often the DNS sever will accept the DNS registration refresh and update the DNS record.
 This is how often the DNS server will propagate a timestamp refresh from the client to the directory or file-system. Another option called the refresh interval specifies how long the DNS server must wait following a refresh for a record to be eligible for scavenging and this is typically seven days.

Friday, July 5, 2013

Active directory conditional forwarding

Active Directory has a feature whereby one or more IP addresses can be specified, to which name queries not handled by the local DNS server are forwarded. The conditional forwarder definitions are also replicated via Active Directory. Together with the forward and reverse lookup zones, these can be set via the DNS MMC console. DNS servers are usually primary or secondary in nature: the primary stores all the records of the zone, and the secondary gets the contents of its zone from the primary. Updates can flow from the primary to the secondary, or the secondary may pull updates periodically or on demand; all updates have to be made on the primary. Either type of server can resolve name queries that come from hosts for its zones. The contents of the zone file can also be stored in Active Directory in a hierarchical structure. The DNS data can then be replicated among all DCs of the domain, with each DC holding a writable copy. When DNS is integrated with Active Directory, the DNS objects can be updated on any DC via LDAP operations, or through DDNS against DCs that act as DNS servers.
The DNS "island" issue sometimes occurs due to improper configuration. AD requires proper DNS resolution to replicate changes and when using integrated DNS, the DC replicates DNS changes throught AD replication.  This is the classic chicken and egg problem. If the DC configured as name server points to itself and its IP address changes, the DNS records will successfully be updated locally but other DCs cannot resolve this DC's IP address unless they point to it. This causes replication fail and effectively renders the DC with the changed IP address an island to itself. This can be avoided when the forest root domain  controllers that are the name servers are configured to point at root servers other than themselves.