This tutorial will give you an overview of the C programming language. We will cover some of the history of C, why people use it, where it is being used, and the basic structure of programs in C.
History
The C language was developed at AT&T Bell Labs in the early 1970s by Dennis Ritchie. It was based on an earlier Bell Labs language “B” which itself was based on the BCPL language. Since early on, C has been used with the Unix operating system, but it is not bound to any particular O/S or hardware.
C has gone through some revisions since its introduction. The American National Standards Institute developed the first standardized specification for the language in 1989, commonly referred to as C89. Before that, the only specification was an informal one from the book “The C Programming Language” by Brian Kernighan and Dennis Ritchie.
The next major revision was published in 1999 and is referred to as the C99 standard. It introduced new features such as // comments and variable-length arrays, new data types such as long long and _Bool, and some other changes.
Advantages
Here are some advantages of programming in C:
- C is a general purpose programming language, meaning that it is not limited to any one specific kind of programming. This is different from languages like COBOL, which was built for business applications, and FORTRAN, which was built for scientific calculations. You can write all sorts of software using C.
- C is not a very high-level language. A high-level language tries to isolate the programmer from the hardware as much as possible. In contrast, C allows you to directly access memory addresses, create bit fields and structures and map them to memory, perform bitwise operations and so on. C facilitates hardware programming.
- Not being high-level also means there is little overhead; it is highly efficient and provides fast execution speed.
- There are C language compilers and development tools available for many different platforms, from small embedded systems to large mainframes and supercomputers.
- C has been around for almost 40 years. In that time there has been much software written in C. If there is some functionality you need in a C program you are writing, chances are someone has already written it. It may even be available for free.
Uses of C
In spite of its age, C is still being heavily used in industry. Several surveys have placed C as one of the most popular languages currently in use.
C is a very good choice for writing software to control hardware. The kernels of Unix and its derivatives are written in C (with some small pieces in assembly). Most firmware and device drivers are written in C as well.
C is also used in a lot of real-time systems programming. While the language itself does not have any real-time features, it can be combined with platform-specific libraries or libraries that implement the POSIX real-time interfaces. C is a very efficient language that does not require many supporting libraries to run and does not have much overhead, which is desirable in low-memory embedded systems. Combining real-time libraries with C lets a program meet timing constraints and gives it the other features needed for real-time programming.
Because C is efficient and fast, it is sometimes used as the implementation language for other programming languages. The interpreters for languages like PHP and Perl have been written in C. Many computationally intensive libraries and applications like MATLAB have been written in it too, for the same reason.
We have only talked about a few specialized domains where C is used. In addition to those, there are many other applications of all kinds that are written in C.
Structure of a C Program
In this section we will take a look at the structure of a C program. Remember that many of the concepts, terms and syntax shown in this section will be reviewed in detail in other tutorials. This is only an introduction.
A C program may be made up of one or more files called “source files”. There is a kind of source file that is used to define constants, macros, function prototypes, type definitions, etc. called a “header file”. Header files are basically used to share things between other source files. By convention, source file names have the extension “.c” and header file names have the extension “.h”.
How you enter C statements into one or more files, how you run the compiler on those files, and how you run the resulting executable is completely dependent on what system you are using and what tools you have. Most systems have some kind of text editor for creating and modifying files.
Each compiler is different, so you must consult your compiler’s documentation for information on how to run it and how to set different options. There are also Integrated Development Environments (IDEs) that let you edit, compile, run and sometimes debug a program, all with a friendly user interface. The examples in this tutorial were written using a text editor on a Linux system and compiled with the gcc compiler.
Hello World
Let us look at a very basic C program. We will write the canonical “Hello World” program in a file called hello.c. Here are the contents of file hello.c:
1  #include <stdio.h>
2
3  int main(void)
4  {
5  	printf("Hello World!\n");
6  	return 0;
7  }
I have added line numbers here to make it easier to refer to specific lines; the actual C source code does not have line numbers. We can run the C compiler on this program (the $ is the command prompt, and what I type follows it):
$ gcc hello.c
The compiler does not give any errors and has created an executable file called “a.out”. As mentioned, how you run the compiler on your system and what executable file it creates depends on your system and compiler or IDE. We run the program and get the output:
$ ./a.out
Hello World!
Let us take a look at this example in detail, starting with how the file is compiled.
A C compiler translates source files into machine instructions and then links them with any libraries needed to run the program, creating an executable file. When a compiler processes a source file, one of the first things it does is carry out the preprocessor directives. These are various commands that control how the compiler processes the source file.
All preprocessor commands start with the hash (#) mark. We will take a more detailed look at preprocessor commands in another tutorial, but for now we will talk about the include directive seen on the first line of hello.c. This directive gives the name of another source file to the compiler to include in this file. The file name is surrounded by double quotes or angle brackets.
When the compiler sees this command it switches processing to the named file, then back to the original file. It is like saying to the compiler “include the contents of the file stdio.h here”. Typically the include directives are put near the top of a source file and are used to include header files.
The stdio.h file is a standard header file included with all C compilers. We will talk about header files more a little later.
C is a procedural language. A procedural language can break a program up into several procedures (also called subroutines), and each procedure can issue commands and invoke other procedures. Though C’s procedures are called “functions”, that does not make C a “functional” programming language – that term is used for another type of programming paradigm. In this tutorial we will use C’s terminology and call procedures “functions”.
A C program should have one function called “main”. When the program is run this is the function that gets executed first. On line 3 of hello.c we have our main function. C functions can take parameters (also called arguments) and return values, similar to functions in math. On line 3 the int signifies that the main function returns an integer value. The void indicates that main does not take any parameters.
There are two classic ways to declare the main function. One is as seen on line 3, the other is like this:
int main( int argc, char *argv[] )
This form is used when you wish to pass parameters to the main function. In addition to these, a C compiler may allow other forms of main(). A couple of common ones are:
void main(void)
void main( int ac, char *av[] )
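These are the same as the two main() declarations seen before except that they return void, that is, they do not return any value. The C standard itself only specifies the two int forms and leaves any others up to the compiler, so the int forms are the portable choice.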
The body of a function is placed between curly braces { } like on lines 4 and 7 in hello.c. In this program the body of the main() function only calls function printf() to print the string “Hello World!” and a new line (the \n at the end of the string) to the output screen on line 5, and returns the value 0 on line 6. printf() is a standard function used to write formatted output to the screen or whatever output device you have. This function is in the standard libraries that the C compiler links with programs. We will talk more about the printf() function below.
In C, every statement is terminated by a semicolon (;) character. A compound statement (also called a block) is a sequence of statements inside curly braces { }. As you have seen, the body of a function is a compound statement. In addition, most places that accept a single statement can also accept a compound statement, such as after the if, while, case statements.
In a C source file, by convention the statements in between curly braces are indented by a certain amount of space. You can see that lines 5 and 6 of hello.c above have been indented with a tab. This is only to make the code easier for people to read; it is not required by the compiler. The compiler ignores white-space (spaces, tabs, new lines) unless it occurs in a character constant or string. For example the statement:
sum = a+b;
could have been written like this:
sum = a + b ;
The amount to indent statements inside a block is up to you. Some people indent a tab stop, others indent 2 or 4 spaces; use whatever you prefer.
Comments
You can put comments inside a C program in two ways. The first way is a comment block. The sequence /* introduces a comment block and the */ sequence ends it. All text inside those sequences is ignored by the compiler. The second way is to use a double slash //. The text from the double slash to the end of the line is ignored. Here are some examples:
 1  #include <stdio.h>
 2
 3  int main(void)
 4  {
 5  	/* this is a comment */
 6  	// this is another comment
 7  	printf("This is printed.\n"); /* this is not */
 8  	/* this
 9  	   is a
10  	   comment
11  	   over multiple
12  	   lines. */
13  	printf("/* This is not a comment */\n"); // this is a comment
14  	return 0;
15  }
The output of this program is:
This is printed.
/* This is not a comment */
Lines 7 and 13 show that you can mix statements and comments on the same line. Lines 8-12 show a comment block that spans multiple lines. Line 13 shows that you cannot have a comment inside a string (a string is a sequence of characters in double quotes), because the comment is included in the string as shown in the output.
Declarations
In C, every variable or function has a type – a kind of value that it can hold. Some common types are int (for holding integers), char (characters) and double (high precision real numbers). A variable must be declared before it is used. Let us look at some declarations:
int number;
double e = 2.71828;
char hello[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
This shows the general pattern of a variable declaration – a type name followed by the variable name and optional initial value. The first line declares a variable called number that can hold integers. The second line declares e as a real number and initializes it to the value 2.71828. The third declares hello as an array (the square brackets [] denote an array) of characters and initializes it to the characters H, e, l, l, o and the null character.
Every variable has a “scope” – a region of the program where the variable is visible to the compiler and can be used. A variable declared inside a block has a scope within that block. A variable declared in the arguments to a function is visible only inside that function. If a variable is declared outside of any block or function, it is said to have “file” scope – it is visible from the point it is declared to the end of the file.
When a variable is referenced, the compiler will look for that variable starting from the current block and work outward to any enclosing blocks and finally to file scope variables. This implies that a variable in the current block with the same name as one in an outer block or file scope will hide the outer one. Let us see an example that will hopefully make these concepts clear:
 1  #include <stdio.h>
 2
 3  int n = 99;
 4
 5  void print_n( n )
 6  int n;
 7  {
 8  	printf( "print_n first printf, n = %d\n", n );
 9
10  	{
11  		int n = 2;
12  		printf( "print_n second printf, n = %d\n", n );
13
14  		{
15  			int n = 3;
16  			printf( "print_n third printf, n = %d\n", n );
17  		}
18
19  		printf( "print_n fourth printf, n = %d\n", n );
20  	}
21
22  	printf( "print_n fifth printf, n = %d\n", n );
23  }
24
25  int main(void)
26  {
27  	printf("in main, n = %d\n", n );
28  	print_n( 1 );
29  	printf("back in main, n = %d\n", n );
30  	return 0;
31  }
The output of this program follows:
in main, n = 99
print_n first printf, n = 1
print_n second printf, n = 2
print_n third printf, n = 3
print_n fourth printf, n = 2
print_n fifth printf, n = 1
back in main, n = 99
The example defines a file scope variable n on line 3, and a function print_n() that takes one integer argument called n. When main() is run, it first prints the value of n (line 27). Since there is no n defined in main() the compiler looks outside of the function and finds the file scope n on line 3. As you see from the output it has value 99, because it was initialized with that value on line 3.
Then main() calls the print_n() function and passes it the value 1. This initializes the n parameter of print_n() declared on line 6 with the value 1. This n now hides the file scope n declared on line 3. It is important to note that this is temporary; the original n’s value is still there, it is just hidden by this new n. Line 8 prints the value of n, which you can see is 1.
Line 10 introduces a new block and line 11 declares another variable n and initializes it to 2. This variable n now hides the one declared on line 6. Line 12 prints the value of n again, but now it refers to the n on line 11, which has the value 2. Lines 14-17 repeat the process by introducing another block with another n.
In the remaining printf() calls you can see that once a block ends, any variables declared inside that block disappear as well. If any variables in a block hid other variables, those hidden ones are visible again once the block ends. For example, on line 19, the n variable refers to the one declared on line 11 which is no longer hidden by the declaration on line 15.
You may have noticed that in this example the parameter for function print_n() is declared on a different line. C allows you to declare the types of the parameters for a function inside the parentheses or in separate declarations before the { brace. The second form is the older style; it has been removed in the C23 standard, so new code should use the first form. For example, this function definition:
int compute( double n, long y, char * str )
{
	// do something here
}
is the same as this one:
int compute( n, y, str )
double n;
long y;
char* str;
{
	// do something here
}
Function Prototypes
Just as variables need to be declared before being used, functions should also be declared before being called. Unlike with variables, however, older C compilers will still allow you to call a function without declaring it, because the compiler makes some assumptions about the called function. (The C99 standard removed this rule, so modern compilers warn about such a call or reject it.) If those assumptions are wrong you will get bad results, or worse, your program will crash. Let us take a look at an example. This time the program will be split between two files, file1.c and file2.c:
/*
 * file1.c
 */
int main(void)
{
	int m = 1;
	int n = 2;

	add( m, n );
	return 0;
}
1  /*
2   * file2.c
3   */
4  #include <stdio.h>
5
6  void add( double a, double b )
7  {
8  	printf("%g + %g = %g\n", a, b, a+b );
9  }
In file1.c in main() we simply call a function called add() and pass it two integer parameters. In file2.c on lines 6-9 we define the add() function as returning nothing (void) and taking two double parameters. It just prints the values of the parameters and the result of adding them. When we compile these files and run the program we get:
$ gcc file1.c file2.c
$ ./a.out
4.24399e-314 + 1.73554e-305 = 1.73554e-305
We expected the output to be “1 + 2 = 3”. Why did we get these strange results? It is because when the compiler processes file1.c it knows nothing about the add() function. It does not know that add() takes two double parameters, not ints. So it assumes that passing two int arguments to add() is OK. When add() is executed it expects double values for a and b, but since it was passed ints the values of a and b are garbage. We can fix this by declaring a “prototype” of the add() function before calling it in file1.c. Prototypes tell the compiler what a function returns, how many arguments it takes and the arguments’ types. We will add a prototype for the add() function in file1.c:
 1  /*
 2   * file1.c
 3   */
 4  void add( double, double );
 5
 6  int main(void)
 7  {
 8  	int m = 1;
 9  	int n = 2;
10
11  	add( m, n );
12  	return 0;
13  }
The prototype is on line 4. It tells the compiler that the add() function takes two double arguments and returns void. Now when file1.c is compiled, when add() is called on line 11 the compiler knows to convert the ints to doubles before calling add(). Now we get the correct output when the program is run:
1 + 2 = 3
Prototypes are typically put in header files. For a tiny program like this it is not really necessary, but for larger programs it is a good way to organize things. Let us move the add() prototype into a header file just to see how it is done. Create a file file2.h with the prototype and include that file from both file1.c and file2.c.
/*
 * file1.c
 */
#include "file2.h"

int main(void)
{
	int m = 1;
	int n = 2;

	add( m, n );
	return 0;
}
/*
 * file2.h
 */
void add( double, double );
Compiling and running the program gives the same output:
$ gcc file1.c file2.c
$ ./a.out
1 + 2 = 3
Note that we did not have to specify the header file to gcc. It is included when the other files are compiled.
You can see that in all the previous examples in this tutorial, anywhere we used the printf() function we included the standard I/O header file stdio.h. That file contains the prototype for printf() and various other functions, as well as type definitions, constants and other things.
Printing
In this section we will take a more detailed look at the printf() function that you have already seen used in several examples. It is one of the main ways to display output in a C program.
The printf() function is unusual in that it can take a variable number and types of arguments. The first argument to printf() is the format string. This is always required. It tells printf() what to print and how to print it. The second and further arguments are dependent on the format string.
The format string can contain conversion specifications. These are sequences of characters beginning with a % that tell printf() what to convert and how to print it. For example, the “%d” conversion specification tells printf() to convert an int argument to a string representing its decimal value. Each conversion specification applies to the next argument to printf(). Let us see how this works:
#include <stdio.h>

int main(void)
{
	printf("second printf argument is %d, third is %d\n", 12, 34 );
	return 0;
}
The output of this program is:
second printf argument is 12, third is 34
As you can see, the first %d used the second argument to printf() and the second %d used the third argument. Both of those arguments are integer constants, which have type int and are passed to printf() as such. The printf() function has many conversion specifiers. Here are some of them, what arguments they expect and how they appear when printed:
Specifier   Expected type   Printed as
d, i        int             decimal number
o           unsigned int    octal number
u           unsigned int    decimal number
x, X        unsigned int    hexadecimal number
e, E        double          [-]d.ddde±dd, where d is a decimal digit and [-] means a minus sign is printed if the value is negative
f, F        double          [-]ddd.ddd
g, G        double          essentially acts as an f or e specifier, whichever is more compact
c           char            a single character
s           const char *    a string
%           none            prints a single % character
With each conversion specifier you can also give a width, a precision and a justification. These are specified as a numeric width followed by a period, then by a numeric precision, between the % and the conversion specifier. For example “%20.3f” specifies a width of 20 and a precision of 3 for a double argument.
The width gives the minimum number of characters to print (but does not specify the maximum) when converting. You can put a '-' character before the width to left justify the output; otherwise it is right justified. The precision value is used in different ways for different specifiers. For the %f and %e specifiers it gives the number of places after the decimal point. For %d the precision gives the minimum number of digits to print (the number is padded with leading zeroes if necessary), and for %s it gives the maximum number of characters to print.
Let us look at some examples. This program prints different types of variables with a field width, precision and justification:
 1  #include <stdio.h>
 2
 3  int main(void)
 4  {
 5  	char c = 'x';
 6  	int i = 1234;
 7  	double d = 4.982761036e+3;
 8  	char *s = "This is a character string";
 9
10
11  	/* default */
12  	printf("Default\n");
13  	printf("|1234567890123456789012345678901234567890|\n");
14  	printf("|%c|\n", c );
15  	printf("|%d|\n", i );
16  	printf("|%f|\n", d );
17  	printf("|%s|\n", s );
18
19  	/* right justified */
20  	printf("\nRight Justified\n");
21  	printf("|1234567890123456789012345678901234567890|\n");
22  	printf("|%20c|\n", c );
23  	printf("|%20d|\n", i );
24  	printf("|%20f|\n", d );
25  	printf("|%20s|\n", s );
26
27  	/* left justified */
28  	printf("\nLeft Justified\n");
29  	printf("|1234567890123456789012345678901234567890|\n");
30  	printf("|%-20c|\n", c );
31  	printf("|%-20f|\n", d );
32
33  	/* precision without width */
34  	printf("\nPrecision without width\n");
35  	printf("|1234567890123456789012345678901234567890|\n");
36  	printf("|%.5d|\n", i );
37  	printf("|%.5f|\n", d );
38  	printf("|%.5s|\n", s );
39
40  	/* both width and precision */
41  	printf("\nPrecision and width\n");
42  	printf("|1234567890123456789012345678901234567890|\n");
43  	printf("|%20.5d|\n", i );
44  	printf("|%20.5f|\n", d );
45  	printf("|%20.5s|\n", s );
46  	return 0;
47  }
The output of this program is:
Default
|1234567890123456789012345678901234567890|
|x|
|1234|
|4982.761036|
|This is a character string|

Right Justified
|1234567890123456789012345678901234567890|
|                   x|
|                1234|
|         4982.761036|
|This is a character string|

Left Justified
|1234567890123456789012345678901234567890|
|x                   |
|4982.761036         |

Precision without width
|1234567890123456789012345678901234567890|
|01234|
|4982.76104|
|This |

Precision and width
|1234567890123456789012345678901234567890|
|               01234|
|          4982.76104|
|               This |
Lines 12-17 print the character, integer, real number and string with the default formatting. Lines 20-25 print them right justified in a field 20 characters wide. Since the width only gives the minimum width, the string which is 26 characters long overflows the field. Lines 28-31 print the character and real number left justified in a field 20 characters wide.
Lines 34-38 print the integer, real number and the string with a precision of 5 specified. For the integer you can see that a leading 0 has been added to the number. The real number’s fractional part .761036 is rounded to .76104 to give it 5 digits after the decimal point. And for the string the precision gives the maximum number of characters to print, which results in “This ” (the fifth character is the space after the word).
Lines 41 to 45 specify both width and precision.
The options and capabilities of printf() could take up an entire tutorial in itself. We have only discussed a few of its features. For more info consult your C compiler documentation.