Understanding Arrays: A Comprehensive Guide to Array Data Structures
May 22, 2023 / 21 min read /
--- views
Introduction to Arrays
Array is a data structure that stores a collection of elements of the same data type. The elements of an array are arranged in a contiguous block of memory, and each element is assigned a unique index. Arrays are often used to store lists of data, such as the names of students in a class or the scores of a sporting event.
Arrays can be used to store a variety of data types, including integers, floats, strings, and objects. They are a versatile data structure that can be used to store a wide range of data.
Here are some of the advantages of using arrays:
- Efficient in terms of memory usage 🧠.
- Easy to access and manipulate 📝.
- Used to store a large amount of data 💾.
Understanding Arrays
Arrays are the simplest data structures, and we use them to store a list of items like a list of strings, numbers, objects, and literally anything else that gets stored sequentially in memory. For example if we allocate an array of five integers these integers get stored in memory like this let's say the address of the first item in memory is 12 as you probably know integers take 4 bytes of memory so the second item would be stored at the memory location 13 the third item would be stored at the memory location 14 and so on for this very reason looking up items in an array by their index is super fast we give our array an index and it will figure out where exactly in memory it should access now what do you think is the runtime complexity of this operation ⏰? It's O of 1 because the calculation of the memory address is very simple it doesn't involve any loops or complex logic so if you need to store a list of items and access them by their index arrays are the optimal data structures for you now.
Let's look at the limitations or weaknesses of arrays 📉. Arrays are static, which means when we allocate them, we should specify their size, and this size cannot change later on. So, we need to know ahead of time how many items we want to store in an array. Now, if we don't know, we have to make a guess. If our guess is wrong, we'll waste memory because we'll have cells that are never filled. If our guess is right, our array gets filled quickly, and to add another item, we'll have to resize the array, which means we should allocate a larger array and then copy all the items in the old array into the new array.
These operations can be costly. Because arrays have a fixed size, in situations where we don't know ahead of time how many items we want to store in them or when we need to add or remove a lot of items from them, they don't perform well in those cases.
Array complexity:
- Access/Lookup:
O(1)
- Search:
O(n)
- Insertion/Add:
O(n)
- Deletion/Remove:
O(n)
Array Types and Variations
There are two main types of arrays: one-dimensional arrays and multi-dimensional arrays.
- One-dimensional Array
- Multi-dimensional Arrays
And there are two variations of arrays:
- Static Array
- Dynamic Array
One-dimensional Arrays
One-dimensional arrays store data in a single, linear order. The elements of a one-dimensional array are accessed by their index, which is a number that indicates the position of the element in the array.
Linear Array
The Linear Array is the simplest way to store a bunch of items in computer memory. It consists in allocating a sequential space in the computer memory, and writing your items sequentially in that space, marking the end of the sequence with a special NULL
token.
Each object in an array occupies the same amount of space in memory. Imagine an array starting at memory address s, where each item occupy b bytes. The nth item in the array can be obtained fetching b bytes starting from memory position s + (b × n).
Linear Array Operations and Manipulations
- Accessing Array Elements O(1): Retrieve the value of an element at a specific index by using the index notation. For example,
array[index]
returns the value at the given index.
1array = [1, 2, 3, 4, 5]2index = 23value = array[index]4print(value) # Output: 3
- Modifying Array Elements O(1): Assign a new value to an element at a specific index by using the assignment operator. For example,
array[index] = new_value;
updates the value at the given index.
1array = [1, 2, 3, 4, 5]2index = 33new_value = 554array[index] = new_value5print(array) # Output: [1, 2, 3, 55, 5]
- Array Traversal O(n): Iterate over the elements of the array using loops such as
for
orwhile
to perform operations on each element.
1array = [1, 2, 3, 4, 5]2for element in array:3print(element)4# Output:5# 106# 207# 308# 409# 50
- Array Left Shift N Cells O(n): Left shifting an array involves moving each element to the left by a specified number of positions. For instance, if we left shift the
[1, 2, 3, 4, 5, 6]
array by 3 positions, the result would be:[4, 5, 6, 0, 0, 0]
. Here the three leftmost elements, 1, 2, and 3 moved to the left by three positions. Since there are no spaces to the left of the three elements, they are replaced by 0s. The remaining elements are shifted to the left by three positions. The three rightmost elements are replaced by 0s.
shiftLeft.py
1def shiftLeft(arr: list, n: int) -> None:2capacity = len(arr)3if n > capacity or n < 0:4raise IndexError("Index Error.")5else:6for i in range(capacity):7if i < capacity - n:8arr[i] = arr[i + n]9else:10arr[i] = 0111213arr = [1, 2, 3, 4, 5, 6]14shiftLeft(arr, 3)15print(arr)
- Array Right Shift N Cells O(n): Right shifting an array involves moving each element to the right by a specified number of positions. For instance, if we right shift the
[1, 2, 3, 4, 5, 6]
array by 3 positions, the result would be:[0, 0, 0, 1, 2, 3]
. Here the three rightmost elements, 4, 5, and 6 moved to the right by three positions. Since there are no spaces to the right of the three elements, they are replaced by 0s. The remaining elements are shifted to the right by three positions. The three leftmost elements are replaced by 0s.
shiftRight.py
1def shiftRight(arr: list, n: int) -> None:2capacity = len(arr)3if n > capacity or n < 0:4raise IndexError("Index Error.")5else:6for i in range(capacity - 1, -1, -1):7if i >= n:8arr[i] = arr[i - n]9else:10arr[i] = 01112arr = [1, 2, 3, 4, 5, 6]13shiftRight(arr, 3)14print(arr)
- Array Left Rotation N Cells O(n): Left rotation of an array involves moving each element to the left by a specified number of positions. For instance, if we left rotate the
[1, 2, 3, 4, 5, 6]
array by 3 positions, the result would be:[4, 5, 6, 1, 2, 3]
. Here the three leftmost elements, 1, 2, and 3 moved to the left by three positions. The remaining elements are shifted to the left by three positions. The three rightmost elements are moved to the leftmost positions.
rotateLeft.py
1def rotateLeft(arr: list, n: int) -> None:2capacity = len(arr)3if n > capacity or n < 0:4raise IndexError("Index Error.")5else:6temp = arr[:n]7for i in range(capacity - n):8arr[i] = arr[i + n]9for i in range(n):10arr[capacity - n + i] = temp[i]111213arr = [1, 2, 3, 4, 5, 6]14rotateLeft(arr, 3)15print(arr)
- Array Right Rotation N Cells O(n): Right rotation of an array involves moving each element to the right by a specified number of positions. For instance, if we right rotate the
[1, 2, 3, 4, 5, 6]
array by 3 positions, the result would be:[4, 5, 6, 1, 2, 3]
. Here the three rightmost elements, 4, 5, and 6 moved to the right by three positions. The remaining elements are shifted to the right by three positions. The three leftmost elements are moved to the rightmost positions.
rotateRight.py
1def rotateRight(arr: list, n: int) -> None:2capacity = len(arr)3if n > capacity or n < 0:4raise IndexError("Index Error.")5else:6temp = arr[capacity - n:]7for i in range(capacity - 1, n - 1, -1):8arr[i] = arr[i - n]9for i in range(n):10arr[i] = temp[i]111213arr = [1, 2, 3, 4, 5, 6]14rotateRight(arr, 3)15print(arr)
- Deleting/Removing an Element from an Array O(n): Let's talk about removing an element. Here are a couple of different scenarios. If you want to remove the last item, that's pretty easy; we can quickly look it up by its index and clear the memory. So here we have o of 1, which is our best-case scenario, but when doing Big O analysis, we should think about the worst-case scenario. What is the worst-case scenario? Here, when we want to remove an item from the beginning of the array, we have to shift all the items on the right one step to the left to fill in the hole. The more items we have, the more this shifting operation is going to cost, so for the worst-case scenario, deletion is an O of n operation.
Deleting an element from an array involves removing the element from the array and shifting the remaining elements to the left by one position. For instance, if we delete the element 3 from the array [1, 2, 3, 4, 5, 6]
, the result would be: [1, 2, 4, 5, 6, None]
. Here the element 3 is removed from the array and the remaining elements are shifted to the left by one position. To perform this operation, we need to know the index of the element to be deleted also the size of the array. If the index is not known, we need to search for the element in the array and then delete it.
deleteElement.py
1def remove(arr: list, size: int, idx: int) -> None:2if size == 0:3raise ValueError("Cannot remove from empty list.")4elif idx > size or idx < 0:5raise IndexError("Index Error.")6else:7for i in range(idx, size):8arr[i] = arr[i + 1]9arr[size - 1] = None101112arr = [1, 2, 3, 4, 5, None, None]13remove(arr, 0, 2)14print(arr)
- Inserting an Element into an Array O(n): In the best-case scenario, inserting an element at the end of an array is a straightforward process. Since the item is added at the last position, no shifting of elements is required. The index of the inserted element can be easily determined, resulting in a constant time complexity of O(1). This scenario is optimal and highly efficient.
When we want to insert an item at the beginning of an array, the situation becomes more complex. To make room for the new element, all existing elements must be shifted one step to the right. The time complexity of this operation is directly proportional to the number of elements in the array, denoted as O(n), where 'n' represents the array size. The larger the array, the more costly the shifting operation becomes.
Inserting an element into an array involves inserting the element at the end of the array and shifting the remaining elements to the right by one position. For instance, if we insert the element 10 into the array [1, 2, 3, 4, 5, None, None]
, the result would be: [1, 2, 10, 3, 4, 5, None, None]
. Here the element 10 is inserted at the index of 2 and the remaining elements are shifted to the right by one position. To perform this operation, we need to know the index at which the element is to be inserted and the size of the array. If the index is not known, we need to search for the element in the array and then insert it.
insertElement.py
1def insert(arr: list, size: int, idx: int, val: int) -> None:2if size == len(arr):3raise ValueError("Cannot insert into full list.")4elif idx > size or idx < 0:5raise IndexError("Index Error.")6else:7for i in range(size, idx, -1):8arr[i] = arr[i - 1]9arr[idx] = val101112arr = [1, 2, 3, 4, 5, None, None]13insert(arr, 5, 2, 10)14print(arr)
Circular Array/Circular buffer
A circular array, also known as a circular buffer or ring buffer, is a data structure that operates as a fixed-size array with a circular behavior. In a circular array, the elements are stored in a continuous block of memory, and when the end of the array is reached, the next element wraps around to the beginning of the array.
The circular behavior of the array allows for efficient utilization of memory and enables cyclic or wraparound operations without the need to shift elements. It is especially useful in scenarios where data needs to be processed in a continuous loop or when implementing a queue or buffer with a fixed capacity.
To get the desired element index of a circular array 3 values are necessary.
- Start index 🏁: This is the index from which you want to begin calculating the relative position. It could be any valid index within the range of the circular array.
Circular queue starting and ending index. Image by Programiz - Relative position 📍: It indicates the relative position of the desired element from the start index. If you want to find the index of an element that is, for example, three positions ahead of the start index, relative position would be 3.
- Size of the array 📏: It represents the total number of elements in the circular array. It determines the range of valid indices in the circular array.
With these three values, you can use the following mathematical formula to calculate the index of the desired element:
1index = (start_index + relative_position) % array_size
In this formula, the %
operator represents the modulo operation, which calculates the remainder after division.
Circular Array/Circular buffer Operations and Manipulations
- Create a Circular Array: To create a circular array, we need to know the size of the array and the start index. The size of the array determines the range of valid indices, and the start index indicates the index from which the relative position is calculated. The start index can be any valid index within the range of the circular array.
1def createCircularArray(size: int, start_index: int, linear_arr: list) -> list:2circular_arr = [None] * size3for i in range(size):4circular_arr[i] = linear_arr[(start_index + i) % len(linear_arr)]5return circular_arr
- Circular Array Traversal: To traverse a circular array, we need to know the start index and the size of the array. The starting index denotes the reference point for calculating relative positions, while the size of the array determines the permissible range of indices. Any valid index within the circular array's range can be chosen as the starting index.
1def traverseCircularArray(start_index: int, size: int, circular_arr: list) -> None:2for i in range(size):3print(circular_arr[(start_index + i) % len(circular_arr)])
- Circular Array Linearization: To linearize a circular array, we need to know the start index and the size of the array. The start index indicates the index from which the relative position is calculated, and the size of the array determines the range of valid indices. The start index can be any valid index within the range of the circular array.
1def linearizeCircularArray(start_index: int, size: int, circular_arr: list) -> list:2linear_arr = [None] * size3for i in range(size):4linear_arr[i] = circular_arr[(start_index + i) % len(circular_arr)]5return linear_arr
- Enqueue (Inserting an Element into a Circular Array): To insert an element into a circular array, we need to know the start index, the size of the array, and the index at which the element is to be inserted. The start index indicates the index from which the relative position is calculated, and the size of the array determines the range of valid indices. The start index can be any valid index within the range of the circular array.
enqueue.py
1def enqueue(start_index: int, size: int, circular_arr: list, idx: int, val: int) -> None:2if size == len(circular_arr):3raise ValueError("Cannot insert into full list.")4elif idx > size or idx < 0:5raise IndexError("Index Error.")6else:7for i in range(size, idx, -1):8circular_arr[(start_index + i) % len(circular_arr)] = circular_arr[(start_index + i - 1) % len(circular_arr)]9circular_arr[(start_index + idx) % len(circular_arr)] = val
- Dequeue (Deleting an Element from a Circular Array): To delete an element from a circular array, we need to know the start index, the size of the array, and the index of the element to be deleted. The start index indicates the index from which the relative position is calculated, and the size of the array determines the range of valid indices. The start index can be any valid index within the range of the circular array.
dequeue.py
1def dequeue(start_index: int, size: int, circular_arr: list, idx: int) -> None:2if size == 0:3raise ValueError("Cannot delete from empty list.")4elif idx >= size or idx < 0:5raise IndexError("Index Error.")6else:7for i in range(idx, size - 1):8circular_arr[(start_index + i) % len(circular_arr)] = circular_arr[(start_index + i + 1) % len(circular_arr)]9circular_arr[(start_index + size - 1) % len(circular_arr)] = None
Use cases:
- Memory Management: Circular arrays are used to implement circular buffers, which are used for memory management. A circular buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams. For example, a circular buffer can be used to store audio data from a microphone or a video stream from a camera. The buffer is filled with data from the stream, and the data is read from the buffer at the same rate as it is written to the buffer. This allows the data to be processed in real-time, without any delays or interruptions.
- Networking applications: Circular queues are used in networking applications to buffer data that is being received or transmitted. Data is added to the queue when it is received, and it is removed from the queue when it is transmitted. This ensures that data is not lost or delayed, and that it is processed in a timely manner.
- CPU scheduling: manage processes that are waiting to be executed.
- Music players: manage playlists.
Multi-dimensional Arrays
Multi-dimensional arrays 🗄️ store data in multiple dimensions. The elements of a multi-dimensional array are accessed by their index, which is a list of numbers that indicates the position of the element in the array. For example, a 2D array, or two-dimensional array, is an array of arrays, meaning it is a matrix of rows and columns (think of a table). A 3D array adds another dimension, turning it into an array of arrays of arrays. Multi-dimensional arrays are often used to store data that is naturally organized in a tabular format, such as a spreadsheet or a database.
Unlike a single-dimensional array, which consists of a linear sequence of elements, a multi-dimensional array is structured as a matrix or a grid with rows and columns. It can be thought of as an array of arrays, where each element in the array is itself an array.
Multi-dimensional arrays find applications in various domains, including:
Matrices and grids: Multi-dimensional arrays are widely used for tasks involving matrices and grids, such as mathematical operations, image processing, and game development. Matrices can represent spatial data, pixel values, game boards, and much more.
Scientific and statistical analysis: In scientific and statistical computations, multi-dimensional arrays are essential for storing and manipulating large datasets. For example, a three-dimensional array can represent a cube of data points in a scientific simulation.
Multimedia processing: Multi-dimensional arrays are employed extensively in multimedia processing, such as audio and video signal processing. These arrays can hold time-series data or pixel values for images and videos, allowing for efficient analysis and manipulation.
Geographic information systems (GIS): GIS applications often involve handling geographical data with multi-dimensional characteristics. Arrays can be used to represent maps, satellite imagery, and terrain elevation data.
Static Arrays vs. Dynamic Arrays
Two commonly employed types of arrays are static arrays and dynamic arrays. Understanding the differences between these array types is crucial for making informed decisions when it comes to optimizing memory usage and program performance.
Static Arrays:
Static arrays, also known as fixed-size arrays, are data structures with a predetermined size that cannot be changed once initialized. They are created with a fixed number of elements and have a contiguous block of memory allocated to store those elements. Static arrays are typically declared at compile-time and occupy a fixed amount of memory throughout the program's execution.
Benefits of Static Arrays:
Efficiency: Static arrays offer excellent performance due to their contiguous memory allocation. Accessing elements in a static array is straightforward, as the index directly maps to a specific memory location. This characteristic makes static arrays efficient for random access operations.
Simplicity: Static arrays are simple to use and understand. They have a clear and predictable structure, and the elements can be accessed and modified using indices, making it easy to manipulate data stored in the array.
Trade-offs of Static Arrays:
Fixed Size Limitation: The most significant limitation of static arrays is their fixed size. Once declared, the size cannot be changed, which can be problematic when dealing with scenarios where the number of elements varies dynamically.
Memory Wastage: If a static array is declared with a larger size than needed, it can lead to memory wastage. Conversely, if the array is too small to accommodate the required elements, it can result in data loss or program errors.
Dynamic Arrays:
Dynamic arrays, also known as resizable arrays or arrays lists, provide a flexible alternative to static arrays. They allow for the allocation and deallocation of memory at runtime, enabling the array size to be modified dynamically.
Benefits of Dynamic Arrays:
Flexibility: The primary advantage of dynamic arrays is their ability to grow or shrink in size as needed. This flexibility allows for efficient memory utilization and eliminates the fixed size limitation of static arrays.
Versatility: Dynamic arrays provide a variety of operations that simplify array management, such as adding elements, removing elements, and resizing the array. This versatility is particularly useful when the number of elements is uncertain or subject to change.
Trade-offs of Dynamic Arrays:
Memory Overhead: Dynamic arrays require additional memory to manage their size and dynamically allocate memory. This overhead can be a concern when dealing with large arrays or systems with limited memory resources.
Performance Impact: Dynamic arrays may introduce performance overhead due to the need for resizing and memory reallocation. When the array grows beyond its current capacity, a new, larger block of memory needs to be allocated, and the existing elements need to be copied over. This process can be time-consuming, especially for large arrays.
Reference Videos