Structure
A struct is an aggregate type, meaning that it is made up of other objects. The objects that make up a struct are called its members. The members of a struct may be accessed by the ‘.’ or ‘->’ operators. A struct may be named or unnamed. Let’s look at some examples:
#include <stdio.h>
#include <string.h>
struct person
{
    char first[100];
    char last[100];
    int age;
    struct {
        char addrline[500];
        char city[100];
        char state[30];
        char zip[15];
    } address;
};

void printperson( struct person *p )
{
    printf( "%s %s is %d years old\n",
            p->first, p->last, p->age );
}

int main( void )
{
    struct person a;
    struct person b[100];

    strcpy( a.first, "Bob" );
    strcpy( a.last, "Smith" );
    a.age = 33;
    strcpy( a.address.addrline, "123 Main Street" );
    strcpy( a.address.city, "St. Louis" );
    strcpy( a.address.state, "Missouri" );
    strcpy( a.address.zip, "63141" );

    b[7] = a;
    printperson( &b[7] );
    return 0;
}
The output is:
Bob Smith is 33 years old
Lines 3-14 declare a named structure called person. Its members are two character arrays first and last, an integer age and a nested unnamed struct. The nested unnamed struct is referred to by the identifier address.
Line 24 declares a to be a struct person, and line 25 declares an array of struct persons of size 100. Lines 27 – 33 show how to access the members of a struct using the ‘.’ operator. Line 35 shows that it is possible to use the assignment operator with structs – each member of a is copied to the corresponding member of b[7]. Line 36 calls printperson() with a pointer to the 8th element of b.
The printperson() function on lines 16-20 shows how to access a struct’s members through the ‘->’ operator when you have a pointer to a struct.
Structures will be discussed more in another tutorial.
Unions
A union is similar to a struct except that only one of its members may hold a value at any one time. Unions are declared the same way as structs, with the ‘union’ keyword replacing ‘struct’, and their members are accessed with the same ‘.’ and ‘->’ operators. Also, like structs, they can be named or unnamed and can be nested.
Since a union can only have one of its members holding a value at one time, it only needs enough memory to hold its biggest member. It can use that space for all of its members.
Here is an example comparing structs and unions:
#include <stdio.h>

struct a
{
    int num;
    float flt;
};

union b
{
    int num;
    float flt;
};

void print_a( struct a * ap )
{
    printf("a.num = %d, a.flt = %f\n", ap->num, ap->flt );
}

void print_b( union b * bp )
{
    printf("b.num = %d, b.flt = %f\n", bp->num, bp->flt );
}

int main( void )
{
    struct a a;
    union b b;

    printf( "size of struct a = %zu bytes\n", sizeof( a ) );
    printf( "size of union b = %zu bytes\n", sizeof( b ));

    a.num = 9;
    a.flt = -45.73f;

    b.num = 9;
    b.flt = -45.73f;

    print_a( &a );
    print_b( &b );

    b.num = 9;
    print_b( &b );
    return 0;
}
The output of this program is:
size of struct a = 8 bytes
size of union b = 4 bytes
a.num = 9, a.flt = -45.730000
b.num = -1036588155, b.flt = -45.730000
b.num = 9, b.flt = 0.000000
Lines 3-7 declare a struct a that has integer and float members. Lines 9-13 declare a union that is composed of the same types of members. However, from the output produced by lines 30-31 you can see that the union takes only 4 bytes while the struct takes 8. That is because the struct needs space for both the int (4 bytes on this system) and the float (another 4 bytes). The union can hold either an integer value or a float value at one time, so it only needs 4 bytes total. The same 4 bytes in memory are used to hold both members.
Lines 33-34 assign values to the members of the struct a and line 39 prints out the values of the struct. Both values are printed correctly.
Lines 36-37 assign values to the members of the union b and line 40 prints out those values. However, notice that only the b.flt value is correct. That is because the union only holds one value at a time, and the last member assigned was b.flt. On line 42 b.num is assigned and the values printed again. This time b.num has the correct value while b.flt does not.
There will be more information about structs and unions in another tutorial.
Pointers to functions
In C, functions have a kind of type too. The function’s type is defined by the type it returns and the types of its arguments. For example the function: double fabs( double x ) has the type “function taking 1 double argument and returning a double”. All functions taking 1 double parameter and returning a double value have the same type.
Just as a variable pointer can point to another variable, a pointer to a function can point to another function. A pointer to a function is declared like this: return-type (*name)( type1, type2, … ). For example:
float (*fnp)( short, char [20] );
This declares fnp as a pointer to a function that returns a float and takes 2 parameters: a short and a character array of size 20 (as with any array parameter, the compiler treats char [20] here as char *). The parentheses around *fnp are necessary because without them you are declaring fnp as a function that returns a pointer to a float and takes a short and character array as parameters.
Here is another example. In it we use strcmp() and strcasecmp(), which take two strings as arguments and return an integer. These functions are used to compare strings; strcasecmp() ignores differences between lower and upper case while strcmp() does not. (strcmp() is part of standard C; strcasecmp() is a POSIX function declared in <strings.h>.)
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp() (POSIX) */
int main( void )
{
    const char * str1 = "abcd";
    const char * str2 = "ABCD";
    int c;
    int ( * cmpfn )( const char *, const char * );

    printf("Ignore case? ");
    c = getchar();

    if ( c == 'y' || c == 'Y' )
        cmpfn = &strcasecmp;
    else
        cmpfn = &strcmp;

    if ( (*cmpfn)( str1, str2 ) == 0 )
        printf( "%s and %s are the same\n", str1, str2 );
    else
        printf( "%s and %s are different\n", str1, str2 );
    return 0;
}
Here is the output from a couple of runs:
Ignore case? y
abcd and ABCD are the same
Ignore case? n
abcd and ABCD are different
Line 9 defines cmpfn as a pointer to a function that returns an int and takes two const char * arguments. Lines 11-12 ask the user whether to ignore case or not. If the user enters ‘y’ or ‘Y’, line 15 sets cmpfn to point to strcasecmp(); otherwise line 17 sets it to point to strcmp(). On line 19, the dereference operator calls the function that cmpfn is pointing to.
There will be more details on these in future tutorials.
User defined types
C allows you to define your own type names. You can give new names to existing types to make your code clearer or to abbreviate types with long names. A new type name is defined using the typedef keyword, followed by the existing type, the new name and a semicolon: typedef existing-type new-name; Let’s see a simple example:
#include <stdio.h>

typedef float Pounds;
typedef unsigned long long BigInt;

Pounds kg_to_pounds( float kg )
{
    return kg * 2.2031;
}

int main( void )
{
    Pounds p = kg_to_pounds( 10 );
    printf("10 kg is about %f pounds\n", p );

    BigInt debt = 14348938876642ULL;
    printf("The debt is about %llu\n", debt );
    return 0;
}
The output of the program is:
10 kg is about 22.031000 pounds
The debt is about 14348938876642
Line 3 defines a new type Pounds that is the same as a float. Line 4 defines BigInt as an unsigned long long. Lines 6-9 define a function that returns a value of type Pounds. Line 13 declares variable p having type Pounds and line 16 declares debt as having type BigInt. These newly defined types behave exactly the same as the types they are based on and they can be used anywhere the original type can be used. (It is not necessary to use a capitalized name for a new type’s name. Some C programmers follow that convention, others don’t.)
Typedefs can be used with derived types such as arrays, structs and unions. Here are some examples:
#include <stdio.h>

typedef char NameString[50];

typedef struct
{
    char streetaddr[200];
    char city[50];
    char state[3];    /* two letters plus the terminating '\0' */
    char zip[15];
} Address;

struct person
{
    NameString first_name;
    NameString last_name;
    int age;
    Address addr;
};

typedef struct person Person;

int main( void )
{
    Person bob = { "Bob", "Smith", 33,
        { "123 Main Street", "St. Louis", "MO", "63141" }};

    printf( "%s %s lives at %s in %s\n", bob.first_name,
        bob.last_name, bob.addr.streetaddr, bob.addr.city );
    return 0;
}
The output of the program is:
Bob Smith lives at 123 Main Street in St. Louis
Line 3 declares NameString as an array of 50 characters. Lines 5-11 define the unnamed struct as type Address. Line 21 declares Person as a new type name for struct person. Line 25 declares bob as a variable of type Person and initializes its members.
Here is an example of a typedef for a function pointer. Looking again at the example in the section on Pointers to Functions above, we could have used a typedef like this:
typedef int (*CompareFunc)( const char *, const char * );
This defines CompareFunc as a “pointer to function that takes two const char * arguments and returns an int” type. We could have used it on line 9 of that example to declare cmpfn:
CompareFunc cmpfn;
Storage Class
Variables and functions have what is called a storage class. When you declare a variable or function you can define its storage class with one of these keywords: extern, static, auto or register.
In the explanations of different storage classes below, we use the terms “file scope” and “block scope”. An object having file scope means it is not declared inside a { } block of statements. An object with block scope is one that is declared inside a { } block.
Auto
The auto storage class applies to block scope variables. Storage for an auto variable is reserved automatically when execution enters the block in which it is defined, and the variable is initialized (if an initial value is given) each time execution reaches its declaration. The storage is released automatically when execution leaves the enclosing block.
All block level variables that do not specify a storage class are implicitly auto, so using the auto keyword is redundant. It is hardly ever used in practice. Some examples:
#include <stdio.h>

void func( void )
{
    printf( "In func()\n" );

    auto int a = 8;
    auto struct { float f; char *p; } b;

    printf( "Second print line\n" );

    {
        auto long double ldbl = 2334.332L;
        /* some calculations here */
    }

    printf( "Third print line\n" );
}
Storage for the integer a and the structure b is reserved when func() is called, before the first printf() call executes; a receives its initial value when execution reaches its declaration, after that first printf() call. Both are released when func() returns. Storage for ldbl is reserved and initialized after the 2nd printf() call, when execution enters the inner block, and it is released before the 3rd printf() call.
This code behaves exactly the same without the auto keyword.
Extern
Declaring a variable or function as extern means it is visible outside of the file in which it is defined. A C program may be composed of several source files and libraries, each of which defines objects such as variables and functions. An extern object must be defined in only one place. For example, say you are writing a program that is made up of two source files, and both files use the name count:
// file1.c
//
extern void count()
{
    ...
}
// file2.c
//
extern int count;
int count = 0;
...
You would get an error trying to compile these two files into a program because count is an extern object that is defined in two different files.
If a storage class is not declared for a function or for a variable in the file scope, it is extern by default.
Let’s look at an example of a program using extern:
//
// file1.c
//
#include <stdio.h>

int main( void )
{
    extern double add( double, double );
    double a = 31.9388;
    extern double b;

    printf( "%f + %f = %f\n", a, b, add( a, b ));
    return 0;
}
//
// file2.c
//
double b = -1.9387;

extern double add( double a, double b )
{
    /* the parameter b hides the file scope b inside this function */
    return a + b;
}
The output of the program is:
31.938800 + -1.938700 = 30.000100
This example compiles two C source files into one program. In file1.c on line 8 we tell the compiler that a function called add() that takes two doubles as arguments and returns a double is defined elsewhere. On line 10 we do the same for a double variable named b.
In file2.c, on line 4 b is declared a double and is implicitly given storage class extern. That is because it is a variable in file scope with no storage class defined. In contrast, a variable such as a on line 9 of file1.c is not extern since it is a block scope variable; it is implicitly given the auto storage class.
On lines 6-10 we define add() with external storage class, though the extern is not necessary since all functions are implicitly extern unless they are given another storage class. (This makes the extern on line 8 in file1.c unnecessary as well.)
When the two files are compiled and linked into one program, the linker matches the references in file1.c to an external function add() and an external variable b with their definitions in file2.c, so the program builds successfully.
Static
The static storage class does two things. For file scope objects, the declared object (variable or function) is visible only within the file in which it is declared. This allows each source file to have a separate copy of an object with the same name, which is the opposite of extern.
The second thing static does is allocate storage for a variable (file or block scope) at the start of the program, initialize it at the start and keep that storage until the program terminates.
Let’s see an example:
//
// file1.c
//
#include <stdio.h>

static double b = 7.56;

static double add( double n1, double n2 )
{
    return n1+n2;
}

int main( void )
{
    double a = 10;

    printf( "%f + %f = %f\n", a, b, add( a, b ));

    extern void file2func( void );
    file2func();
    file2func();
    file2func();
    return 0;
}
//
// file2.c
//
#include <stdio.h>

static char * b = "b is a string\n";

static double add( double n1, double n2 )
{
    return 2*n1 + n2*n2;
}

// file2func is extern by default
void file2func( void )
{
    // count is set to 0 once and keeps its value between calls
    static int count = 0;

    ++count;
    printf( "file2func() has been called %d times\n", count );

    if ( add( 2, 1 ) != 5 )
        printf( "add() is broken\n" );
}
The output of the program is:
10.000000 + 7.560000 = 17.560000
file2func() has been called 1 times
file2func() has been called 2 times
file2func() has been called 3 times
In this example, once again, two source files are combined into one program. Both file1.c and file2.c have file scope variables called b, but they are not the same type. Both files also have a function called add() with the same type but a different definition: add() in file1.c just returns the sum of its arguments, while add() in file2.c returns the sum of 2 times its first argument and the square of its second argument. If add() and b had extern storage class this would not be allowed and building the program would fail, but since they have static storage class it is accepted. Each file has its own separate copy of b and add().
Line 17 in file1.c prints the result of calling add(), but which add() function does it call? Because both add() functions are declared static, each is only visible inside the file in which it is defined. So the file1.c add() gets called and the result is printed. A similar thing happens on line 22 of file2.c – that calls the file2.c add() function and the result is compared to 5. Since the result is 5, the printf() on line 23 is not executed.
In file2.c, line 17 shows that the static storage class can be used with block scope variables too. This will allocate space for count and initialize it with the value 0 even before main() is called. As you can see from the program output, the storage for count is kept until the end of the program unlike auto variables that disappear when execution goes past the enclosing block.
Register
The register keyword gives a hint to the compiler that you want the fastest access possible to a variable. Typically this means the compiler will try to put that variable in a CPU register since accessing a register is much faster than accessing memory. But the compiler is not required to do that, and in fact it is free to ignore the register keyword completely. There is no guarantee that declaring a variable with register storage class will speed up access to it. Consult your implementation’s documentation on how it treats register variables.
The register keyword applies to block level variables. It can also be used in function parameters or in for loops. Let’s see some examples:
#include <stdio.h>

double sum( int n, register double * arr )
{
    register double s = 0;

    for ( register int i = 0; i < n; ++i )
        s += arr[i];

    return s;
}

int main( void )
{
    double d[] = { 643.75, 982, -38.89, -3.2e2 };
    int d_size = sizeof(d) / sizeof(d[0]);

    printf( "sum of %d elements = %g\n", d_size, sum( d_size, d ));
    return 0;
}
The output of the program is:
sum of 4 elements = 1266.86
In this program the sum() function takes an integer and a pointer to the first element of an array of doubles. The integer n is the size of the array arr. The function goes through the array adding each element to s, finally returning s. The arr parameter to the function, s and the loop variable i are all declared with storage class register. For this example, having these variables in registers won’t make much difference. With an array of thousands or millions of elements the speedup could be noticeable, provided the compiler honors the hint and would not have placed the variables in registers anyway.
One caveat with register variables is that you cannot take their address, because if they are stored in registers they don’t have a memory address. If you try to take the address of a register variable with the & operator you will get an error.
Type Qualifiers
Type qualifiers are used to modify the way variables may be accessed. Type qualifiers may be one or more of const, volatile or restrict.
The const qualifier lets the compiler know this variable should not be modified.