Cluster computing: Threads and Process

Thursday, January 31, 2013

Threads and Process

"An operating system thread is a unit of control without additional private OS context and without a private address space. Each OS thread has full access to the memory of the other threads executing within the same multithreaded OS process. Thread execution is scheduled by the operating system kernel scheduler and these threads are often called "kernel threads" or k-threads.
A Lightweight Thread package is an application-level construct that supports multiple threads within a single OS process. Unlike OS threads scheduled by the OS, lightweight threads are scheduled by an application-level thread scheduler. The difference between a lightweight thread and a kernel thread is that a lightweight thread is scheduled in user-space without kernel scheduler involvement or knowledge. The combination of the user-space scheduler and all of its lightweight threads runs within a single OS process and appears to the OS scheduler as a single thread of execution. Lightweight threads have the advantage of faster thread switches when compared to OS threads since there is no need to do an OS kernel mode switch to schedule the next thread. Lightweight threads have the disadvantage, however, that any blocking operation such as a synchronous I/O by any thread will block all threads in the process. This prevents any of the other threads from making progress while one thread is blocked waiting for an OS resource. Lightweight thread packages avoid this by (1) issuing only asynchronous (non-blocking) I/O requests and (2) not invoking any OS operations that could block. Generally, lightweight threads offer a more difficult programming model than writing software based on either OS processes or OS threads.
Some DBMSs implement their own lightweight thread (LWT) packages. These are a special case of general LWT packages. We refer to these threads as DBMS threads and simply threads when the distinction between DBMS, general LWT and OS threads is unimportant to the discussion.
A DBMS client is the software component that implements the API used by the application programs to communicate with a DBMS. Some example database access APIs are JDBC, ODBC, and OLE/DB. In addition, there are a wide variety of proprietary database access API sets.
Some programs are written using embedded SQL, a technique of mixing programming language statements with database access statements. This was first delivered in IBM COBOL and PL/I and, much later, in SQL/J which implements embedded SQL for Java. Embedded SQL is processed by preprocessors that translate the embedded SQL statements into direct calls to data access APIs. Calls made to these APIs are marshaled by the DBMS client component and sent to the DBMS over some communications protocol. The protocols are usually proprietary and often undocumented. In the past, there have been several efforts to standardize client-to-database communication protocols, with Open Group DRDA being perhaps the best known, but none have achieved broad adoption.
A DBMS worker is the thread of execution in the DBMS that does work on behalf of a DBMS client. A 1:1 mapping exists between a DBMS worker and a DBMS client: the DBMS worker handles all SQL requests from a single DBMS client. The DBMS client sends SQL requests to the DBMS server. The worker executes each request and returns the result to the client."
Reference: Architecture of a Database System - Hellerstein, Stonebraker, Hamilton
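
To make the idea of user-space scheduling in the quoted passage more concrete, here is a minimal sketch in Python. It is not taken from the paper and is not how any particular DBMS does it: the names worker and run_round_robin are made up for illustration, and Python generators stand in for lightweight threads. Each generator hands control back at a yield, and a small application-level scheduler decides which one runs next, all inside a single OS thread.

```python
from collections import deque


def worker(name, steps, log):
    """Hypothetical lightweight thread: a generator that yields to give up control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point; the kernel scheduler is not involved


def run_round_robin(tasks):
    """Hypothetical user-space scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished; do not requeue it
        ready.append(task)      # task still has work; put it at the back


log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

The sketch also shows the disadvantage the passage describes: if one worker made a blocking call (for example a synchronous read) between two yields, the scheduler loop itself would be stuck in that call and no other worker could run until it returned.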
