Data Structures
π Overviewβ
Data structures are ways to organize and store data for efficient access and modification.
π Key Conceptsβ
Time Complexityβ
| Operation | Array | Linked List | Hash Table | BST | Heap |
|---|---|---|---|---|---|
| Access | O(1) | O(n) | O(1) avg | O(log n) | O(1) |
| Search | O(n) | O(n) | O(1) avg | O(log n) | O(n) |
| Insert | O(n) | O(1)* | O(1) avg | O(log n) | O(log n) |
| Delete | O(n) | O(1)* | O(1) avg | O(log n) | O(log n) |
* At head or after an existing reference
ποΈ Common Data Structuresβ
Arrayβ
Pros:
- Random access O(1)
- Good memory locality
- Easy to implement
Cons:
- Fixed size (static array)
- Insert/delete expensive
# Python
arr = [1, 2, 3, 4, 5]
arr.append(6) # O(1) amortized
arr.pop() # O(1)
arr.insert(0, 0) # O(n)
Linked Listβ
Pros:
- Dynamic size
- Insert/delete O(1) with reference
- No contiguous memory required
Cons:
- No random access
- Extra memory for pointers
- Poor cache locality
Hash Tableβ
Collision Resolution:
- Chaining: Each bucket is a linked list
- Open Addressing: Find the next empty slot (linear probing, quadratic probing)
# Python dict/HashSet
hash_map = {"key": "value"}
hash_map["new_key"] = "new_value" # O(1) average
Common pitfalls:
- Bad hash function β many collisions
- Load factor too high β degraded performance
- No order guarantee (except Python 3.7+, LinkedHashMap)
Tree - Binary Search Tree (BST)β
Properties:
- Left subtree < root
- Right subtree > root
- In-order traversal = sorted array
Operations:
- Search: O(h), h = height
- Insert: O(h)
- Delete: O(h)
Balance:
- Unbalanced BST β O(n)
- Balanced (AVL, Red-Black) β O(log n)
Heapβ
Min-Heap:
- Parent β€ Children
- Root = minimum element
- Complete binary tree
Common Operations:
- Insert: O(log n)
- Extract Min/Max: O(log n)
- Peek: O(1)
- Build Heap: O(n)
import heapq
# Min-heap
heap = [3, 1, 4, 1, 5]
heapq.heapify(heap) # O(n)
heapq.heappush(heap, 2) # O(log n)
min_val = heapq.heappop(heap) # O(log n)
π― Use Casesβ
| Data Structure | Use Case |
|---|---|
| Array | Random access, fixed size data |
| Linked List | Frequent insert/delete, unknown size |
| Hash Table | Fast lookup, caching |
| BST | Sorted data, range queries |
| Heap | Priority queue, top K elements |
| Graph | Relationships, networks |
| Trie | Prefix search, autocomplete |
β Interview Questionsβ
Easyβ
- Valid Anagram - Hash table
- Contains Duplicate - Hash set
- Reverse Linked List - Two pointers
Mediumβ
- Group Anagrams - Hash table + sorting
- LRU Cache - Hash map + doubly linked list
- Top K Frequent Elements - Heap
Hardβ
- Merge K Sorted Lists - Heap
- Serialize and Deserialize Binary Tree - Tree + String
- LFU Cache - Hash map + doubly linked list + frequency