What are Web Servers?
Web Server Communication
Web servers are one of the endpoints in communication through the World Wide Web. According to its inventor, Tim Berners-Lee, the World Wide Web is “the universe of network-accessible information, an embodiment of human knowledge.” While the latter part of this definition is arguable, the former offers a starting point for understanding the magnitude of the Web.
The World Wide Web is the global structure of electronically connected information. It refers to the global connections between computers that allow users to search for documents or web pages by requesting results from a web server. These documents are hypertext based (written in HTML, the Hypertext Markup Language), allowing users to travel to other pages and extend their research through links. They are delivered using a standardized protocol, HTTP (Hypertext Transfer Protocol, usually written in lowercase letters in web addresses), making HTML documents intelligible across hardware and software variations.
This information travels between web browsers and web servers. Communication begins with a user request made through a web browser. The request is delivered to a web server as an HTTP message. The server then processes the request, which can be anything from a general search to a specific task, and returns the results over the same protocol. The results are written in HTML, the language of web pages, whose links let users move quickly from one page to another.
HTML is also essential for displaying many of the interactive features on web pages, such as linking web pages to other objects, like images. An important distinction when defining web servers is between hardware and software. A web server is the computer (hardware) that stores and delivers web pages; it is also the computer program (software) that performs the functions outlined above. This article will provide a basic overview of web servers. It will begin with a brief history, and then define the terms and components of how web server communication works in more detail. The article will conclude with a description of common web server features.
Web Servers History
The World Wide Web was developed by Tim Berners-Lee between 1989 and 1991 for his employer, CERN (the European Organization for Nuclear Research). In 1990 he wrote the program for the World Wide Web. This program was the first web browser and HTML editor. It was the first program to use both FTP and HTTP.
FTP stands for “File Transfer Protocol” and is used to transfer files over a network. A protocol is the set of standards that defines and controls connection, communication, and the transfer of data between two computer endpoints. It determines the format and defines the terms of transmission.
As previously mentioned, HTTP is the protocol that supports hypertext documents. Both of these protocols are used for communication over the Internet and the World Wide Web. The source code for the World Wide Web was made public in 1993, making it available to everyone with a computer. The technology continued to develop and, between 1991 and 1994, spread from communication among scientific organizations to universities and, finally, to industry. By 1994, computers could transfer data between each other through a cable linking ports across various operating systems (OSs). Operating systems manage a computer’s hardware and software resources.
The first web server, also written by Berners-Lee, ran on NeXTSTEP, the operating system for NeXT computers. The other technology authored by Berners-Lee that is required for Web communication is the URL (Uniform Resource Locator). URLs are the uniform global identifiers for documents on the Web, which make those documents easy to locate. Berners-Lee is also responsible for writing the initial specifications for HTML. The first web server in the United States was installed on December 12, 1991, at SLAC (Stanford Linear Accelerator Center), a U.S. Department of Energy laboratory.
In 1994, Berners-Lee created the World Wide Web Consortium (W3C) to regulate and standardize the various technologies required for Web construction and communication. It was created to ensure compatibility between vendors or industry members by having them agree on certain core standards. This ensures that web pages are intelligible across different operating systems and software packages. After 2000, the Web grew explosively. As of March 2007, there were 110 million web sites on the World Wide Web.
Understanding Web Server Terminology
Definition of Terms, Process, Components and Features
At the most basic level, the process for web communication works as follows: a computer runs a web browser that allows it to request, receive, and display HTML documents (web pages). Web browsers are the software applications, running on individual computers, that allow users to access and view these web pages. The most popular web browsers are Internet Explorer, Mozilla, Firefox, and Safari (for Mac). After the user types in the URL (or address) and presses Return, the request is sent to a server machine that runs the web server. The web server is the program that delivers the files that make up web pages. Every web site requires a web server on the computer that hosts it. The most popular web server program is Apache. The server machine then returns the requested web page.
Communication over the Internet can be broken down into two parties: clients and servers. The machines providing services are servers. Clients are the machines used to connect to those services. For example, a personal computer requesting web pages according to search parameters (defined by keywords) does not provide any services to other computers. This is the client. If the client requests a search from, for example, the search engine Yahoo!, Yahoo! is the server, providing the hardware machinery to service the request. As previously mentioned, each server machine answering requests over the Internet requires a web server program like Apache to deliver the search result as HTML that the client’s browser can display.
Web servers translate URL path components into locations in the local file system. The translation depends on the server’s root directory, the top directory of a file system that is usually organized hierarchically as an inverted tree. URL paths resemble file paths in UNIX-like operating systems.
A typical client request reads, for example, “http://www.example.com/path/file.html”. The client web browser translates this URL into an HTTP request by connecting to “www.example.com” and asking for “/path/file.html”. The web server then appends the requested path to its root directory path. The result identifies a file in the server’s local file system, or hierarchy of directories. The server reads the file and responds to the browser’s request. The response contains the requested document, in this case a web page.
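The path translation described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular server implements it: the root directory `/var/www/html` and the function name `url_to_file_path` are assumptions made for the example.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Hypothetical root directory; a real server reads this from its configuration.
DOCUMENT_ROOT = "/var/www/html"


def url_to_file_path(url, root=DOCUMENT_ROOT):
    """Return the local file path that a URL's path component maps to."""
    # Keep only the path part, e.g. "/path/file.html", and decode %xx escapes.
    url_path = unquote(urlsplit(url).path)
    # Collapse "." and ".." segments so the result cannot climb above the root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    # Append the requested path to the root directory path.
    return posixpath.join(root, normalized.lstrip("/"))


print(url_to_file_path("http://www.example.com/path/file.html"))
```

With the assumed root, the example URL maps to `/var/www/html/path/file.html`. Normalizing the path before joining it is a design choice that keeps a request such as `/../../etc/passwd` inside the root directory.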
Web Servers Features
Although web servers differ in specifics, there are certain basic characteristics shared by all web servers. These include HTTP support and logging. As previously mentioned, HTTP is the standard communications protocol for processing requests between client browsers and web servers. This protocol provides the standardized rules for representing data, authenticating requests, and detecting errors.
The purpose of protocols is to make data transfer and services user-friendly. In computing, a protocol determines the nature of the connection between two communicating endpoints (wired or wireless) and verifies the existence of the other endpoint. It also negotiates the various characteristics of the connection and determines how to begin, end, and format a request. It signals any errors or corruption in files and alerts the user to the appropriate steps to take. HTTP is the request/response protocol used specifically for communicating HTML documents, HTML being the language in which hypertext, or web pages, is written. However, responses can also take the form of raw text, images, or other types of documents.
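To make the request/response format concrete, here is a small Python sketch that builds the text of a minimal HTTP/1.1 GET request and splits a raw response into its parts. The helper names `build_request` and `parse_response` are made up for this illustration, and the sketch handles only simple text responses.

```python
def build_request(host, path):
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # names the site being requested
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# A hand-written sample response, used here instead of a live server.
sample = "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\n\r\nNo such page"
print(build_request("www.example.com", "/path/file.html"))
print(parse_response(sample))
```

The status code in the response (404 in the sample) is how the protocol signals an error to the client, and the `Content-Type` header is how it says whether the body is HTML, raw text, an image, or another kind of document.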
The other basic web server characteristic is logging. This is a feature that allows the program to automatically record events. This record can then be used as an audit trail to diagnose problems. Web servers log detailed information recording client requests and server responses. This information is stored in log files and can be analyzed to better understand user behavior (such as keyword preferences), generate statistics, and run a more efficient web site.
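As an illustration of log analysis, the following Python sketch counts requests per page and per status code. It assumes log lines laid out in the style of the Common Log Format; the actual layout depends on how a given server is configured, and the sample entries are invented for the example.

```python
import re
from collections import Counter

# Pattern for a Common Log Format style line (an assumed layout).
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

# Invented sample entries: client address, timestamp, request, status, size.
SAMPLE_LOG = """\
192.0.2.1 - - [10/Mar/2007:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.7 - - [10/Mar/2007:13:56:01 +0000] "GET /missing.html HTTP/1.1" 404 209
192.0.2.1 - - [10/Mar/2007:13:57:12 +0000] "GET /index.html HTTP/1.1" 200 2326
"""


def summarize(log_text):
    """Count requests per path and per status code."""
    paths, statuses = Counter(), Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        paths[match["path"]] += 1
        statuses[int(match["status"])] += 1
    return paths, statuses


paths, statuses = summarize(SAMPLE_LOG)
print(paths.most_common(1))  # the most requested page
print(statuses)              # how many requests succeeded or failed
```

Counts like these are the raw material for the statistics mentioned above: the most requested pages show what visitors look for, and a rising number of 404 responses points to broken links.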
There are many other practical features common to a variety of web servers. Configuration files or external user interfaces determine how much, and at what level of sophistication, users can interact with the server. This establishes the configurability of the server. Some servers also provide authentication features that require users to register with the server through a username and password before being allowed access to resources or the execution of requests.
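One common way to carry a username and password in a request is HTTP Basic authentication, in which the browser sends an `Authorization` header containing the Base64-encoded text `username:password`. The article does not say which scheme a given server uses, so the Python sketch below is only one possibility; the `USERS` table and the function name `is_authorized` are hypothetical.

```python
import base64
import hmac

# Hypothetical user table; a real server would store password hashes, not plain text.
USERS = {"alice": "s3cret"}


def is_authorized(authorization_header):
    """Check an 'Authorization: Basic ...' header value against USERS."""
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers malformed Base64 and invalid UTF-8
        return False
    username, separator, password = decoded.partition(":")
    expected = USERS.get(username)
    if not separator or expected is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(password.encode(), expected.encode())


header = "Basic " + base64.b64encode(b"alice:s3cret").decode("ascii")
print(is_authorized(header))
```

Base64 is an encoding, not encryption, so this scheme protects a password only when the connection itself is encrypted.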
Web servers must also be able to manage static and dynamic content. Static content exists as a file in a file system. Dynamic content is content (text, images, form fields) on a web page that changes according to specific contexts or conditions. Dynamic content is produced by some other program or script (a script being written in a user-friendly programming language that connects existing components to execute a specific task) or through an API (Application Programming Interface) that the web server calls upon. It is often slower to load than static content, since it frequently has to be pulled from a remote database. It provides a greater degree of user interactivity and tailors responses to user requests.
To handle dynamic content, a web server must support at least one interface such as the following:
JSP (JavaServer Pages);
PHP (a programming language that creates dynamic web pages);
ASP (Active Server Pages, developed by Microsoft);
ASP.NET (also developed by Microsoft, it is the successor of ASP).
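The difference between static and dynamic content can be sketched as follows. Python stands in here for technologies such as JSP, PHP, or ASP, and the function names, the greeting logic, and the `visitor` parameter are all invented for the illustration.

```python
from datetime import datetime
from html import escape
from pathlib import Path


def static_page(root, name):
    """Static content: return the bytes of a file exactly as stored."""
    return (Path(root) / name).read_bytes()


def dynamic_page(visitor, now=None):
    """Dynamic content: build the HTML at request time from the request's context."""
    now = now or datetime.now()
    greeting = "Good morning" if now.hour < 12 else "Good afternoon"
    # Escape user-supplied text before placing it in the page.
    html = f"<html><body><p>{greeting}, {escape(visitor)}!</p></body></html>"
    return html.encode("utf-8")


print(dynamic_page("Ada", datetime(2007, 3, 1, 9, 0)).decode("utf-8"))
```

The static function returns the same bytes for every visitor until the file changes, while the dynamic function produces a different page depending on who asks and when, which is the tailoring described above.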