Bucket sort is a sorting algorithm that works by inserting the elements of the sorting array intro buckets, then, each bucket is sorted individually. The idea behind bucket sort is that if we know the range of our elements to be sorted, we can set up buckets for each possible element, and just toss elements into their corresponding buckets. We then empty the buckets in order, and the result is a sorted list. It is similar to radix sort.
How it works:
Initially we have to set up an array of empty buckets, then put each object into its bucket. After this, we sort each bucket with elements, and then we pass through each bucket in order and gather all the elements into the original array. Also, the number of buckets can be determined by the programmer.
Step by step example :
Having the following list, let’s try to use bucket sort to arrange the numbers from lowest to greatest:
Unsorted list:
6 | 9 | 7 | 1 | 5 | 4 | 2 | 8 | 3 |
For this example, let us use 3 buckets, first is 1-3, second is 4-6, and last is 7-9.
1 | – | 3 | 4 | – | 6 | 7 | – | 9 |
At the first run, the algorithm passes through the array, and selects the proper elements which then are moved into the bucket:
1 | 5 | 4 | 2 | 8 | 3 |
6 | 9 | 7 | ||||||
1 | – | 3 | 4 | – | 6 | 7 | – | 9 |
And the same happens for the rest of elements, at the end of the pass, it will look like this:
1 | 2 | 3 | 6 | 5 | 4 | 9 | 7 | 8 |
1 | – | 3 | 4 | – | 6 | 7 | – | 9 |
Now, we sort each bucket individually, using either bucket sort or a different algorithm :
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
1 | – | 3 | 4 | – | 6 | 7 | – | 9 |
Now, we take the elements in the order from the buckets, and insert them into the original array. Also, the elements are in the proper order.
Sample code:
#include < iostream >
using namespace std;
#define m 10
void bucketsort (int *a, int n)
{
int buckets [m];
for (int j=0; j < m; ++j)
buckets[j]=0;
for (int i=0; i < n; ++i)
++buckets[a[i]];
for (int i=0, j=0; j < m; ++j)
for (int k=buckets[j]; k > 0; --k)
a[i++]=j;
}
int main ()
{
int n;
int *a;
cout << "Please insert the number of elements to be sorted: ";
cin >> n; // The total number of elements
a = (int *)calloc(n, sizeof(int));
cout << "nThe elements must be lower than the value of m = " << m << endl;
for(int i=0;i< n;i++)
{
cout << "Input " << i << " element: ";
cin >>a[i]; // Adding the elements to the array
}
cout << "Unsorted list:" << endl; // Displaying the unsorted array
for(int i=0;i< n;i++)
{
cout << a[i] << " ";
}
bucketsort(a,n);
cout << "nSorted list:" << endl; // Display the sorted array
for(int i=0;i < n;i++)
{
cout << a[i] << " ";
}
return 0;
}
Output :
Code explanation :
At first, we define the value of m, which means that all the elements we will introduce for the array will have to be lower than m. Next, we make buckets for the size of m, and we make them null and in the end, we add the elements to the proper buckets. It is not required another type of sorting algorithm, in this example we use bucket sort only, as we use a bucket for each element of the array, this might seem familiar with radix sort.
Complexity:
Bucket sort can be seen as a generalization of counting sort; in fact, if each bucket has size 1 then bucket sort degenerates to counting sort. The variable bucket size of bucket sort allows it to use O(n) memory instead of O(M) memory, where M is the number of distinct values; in exchange, it gives up counting sort’s O(n + M) worst-case behavior.
The time complexity of bucket sort is: O(n + m) where: m is the range input values, n is the total number of values in the array. Bucket sort beats all other sorting routines in time complexity. It should only be used when the range of input values is small compared with the number of values. In other words, occasions when there are a lot of repeated values in the input. Bucket sort works by counting the number of instances of each input value throughout the array. It then reconstructs the array from this auxiliary data. This implementation has a configurable input range, and will use the least amount of memory possible.
Advantages:
- the user knows the range of the elements;
- time complexity is good compared to other algorithms.
Disadvantages:
- you are limited to having to know the greatest element;
- extra memory is required;
Conclusion:
If you know the range of the elements, the bucket sort is quite good, with a time complexity of only O(n+k). At the same time, this is a major drawback, since it is a must to know the greatest elements. It is a distribution sort and a cousin of radix sort.