Algorithm Interview Questions: Data Structures and Search
Understanding the tradeoffs between arrays and linked lists is essential for algorithm design:
- Memory allocation: Arrays use contiguous memory (better cache locality); linked lists scatter across memory with pointer references
- Access time: Array O(1) random access; linked list O(n) traversal required
- Insertion/deletion: Arrays require O(n) shifts; linked lists require O(1) pointer updates when position is known
- Size: Arrays have fixed capacity (dynamic arrays resize with amortized O(1)); linked lists grow dynamically without reallocation
- Cache performance: Arrays dominate due to spatial locality; linked lists suffer cache misses
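To make the insertion tradeoff concrete, here is a minimal sketch (illustrative only, names are ours): inserting after a known linked-list node touches two pointers, while inserting near the front of a Python list shifts every later element.

```python
class Node:
    """Minimal singly linked list node (illustrative only)."""
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_after(node, data):
    """O(1): two pointer updates, regardless of list length."""
    new_node = Node(data)
    new_node.next = node.next
    node.next = new_node
    return new_node

# Python list insertion near the front shifts all later elements: O(n)
arr = [1, 3, 4]
arr.insert(1, 2)          # shifts 3 and 4 one slot to the right

# Linked list insertion after a known node: O(1), no shifting
head = Node(1)
head.next = Node(3)
insert_after(head, 2)     # list is now 1 -> 2 -> 3
```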
Sorting Linked Lists with Merge Sort
Merge sort is the optimal choice for linked lists—O(n log n) time and O(log n) space for the recursion stack. Unlike quicksort, it requires no random access and performs consistently regardless of input distribution.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def merge_sorted_lists(l1, l2):
    """Merge two sorted linked lists"""
    dummy = Node(0)
    current = dummy
    while l1 and l2:
        if l1.data <= l2.data:
            current.next = l1
            l1 = l1.next
        else:
            current.next = l2
            l2 = l2.next
        current = current.next
    # Attach remaining nodes
    current.next = l1 if l1 else l2
    return dummy.next

def merge_sort_list(head):
    """Sort linked list using merge sort"""
    if not head or not head.next:
        return head
    # Find middle using slow/fast pointer technique
    slow, fast = head, head.next
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    # Split list at middle
    mid = slow.next
    slow.next = None
    left = merge_sort_list(head)
    right = merge_sort_list(mid)
    return merge_sorted_lists(left, right)
```
The slow/fast pointer approach avoids materializing the middle index—a natural fit for linked list traversal without random access.
Sorting Arrays with Quicksort
Quicksort remains standard for arrays despite O(n²) worst-case complexity—average O(n log n) performance, superior cache locality, and in-place operation make it practical. Production implementations often use introsort, switching to heapsort if recursion depth exceeds 2⌊log n⌋ to prevent pathological cases.
```python
def quicksort(arr, low=0, high=None):
    """Sort array using quicksort algorithm"""
    if high is None:
        high = len(arr) - 1
    if low < high:
        pi = partition(arr, low, high)
        quicksort(arr, low, pi - 1)
        quicksort(arr, pi + 1, high)
    return arr

def partition(arr, low, high):
    """Partition array around pivot"""
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1
```
For production code, use Python’s built-in sorted() (Timsort) unless you need in-place sorting or specific memory constraints. The sorted() function handles edge cases and performs optimizations automatically.
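The introsort idea mentioned above can be sketched as a quicksort that tracks recursion depth and falls back to a heap-based sort once the 2⌊log₂ n⌋ budget is spent. This is a simplified illustration, not any library's actual implementation:

```python
import heapq
import math

def introsort(arr):
    """Quicksort with a depth limit; degenerate inputs fall back to a heap sort."""
    if len(arr) > 1:
        depth_limit = 2 * int(math.log2(len(arr)))
        _introsort(arr, 0, len(arr) - 1, depth_limit)
    return arr

def _introsort(arr, low, high, depth_limit):
    if low >= high:
        return
    if depth_limit == 0:
        # Recursion budget exhausted: sort this slice with a heap instead
        arr[low:high + 1] = heapq.nsmallest(high - low + 1, arr[low:high + 1])
        return
    # Lomuto partition, as in quicksort above
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    pi = i + 1
    _introsort(arr, low, pi - 1, depth_limit - 1)
    _introsort(arr, pi + 1, high, depth_limit - 1)
```

Here heapq.nsmallest over the full slice stands in for an in-place heapsort; a real implementation would heapify in place.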
String Searching with KMP Algorithm
Implement substring search with O(n + m) time complexity using the Knuth-Morris-Pratt algorithm, avoiding naive O(n·m) scanning:
```python
def strstr(haystack, needle):
    """Find first occurrence of needle in haystack using KMP"""
    if not needle:
        return 0
    if len(needle) > len(haystack):
        return -1

    # Build failure function (prefix table)
    def build_failure_table(pattern):
        table = [0] * len(pattern)
        j = 0
        for i in range(1, len(pattern)):
            while j > 0 and pattern[i] != pattern[j]:
                j = table[j - 1]
            if pattern[i] == pattern[j]:
                j += 1
            table[i] = j
        return table

    failure = build_failure_table(needle)
    j = 0
    for i in range(len(haystack)):
        while j > 0 and haystack[i] != needle[j]:
            j = failure[j - 1]
        if haystack[i] == needle[j]:
            j += 1
        if j == len(needle):
            return i - len(needle) + 1
    return -1
```
The failure table encodes how to skip comparisons when mismatches occur, eliminating backtracking. This is superior to Python’s built-in str.find() for interview contexts where you need to demonstrate algorithmic knowledge, though in production code str.find() is preferable.
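To see what the failure table encodes, the same construction can be run standalone: table[i] is the length of the longest proper prefix of pattern[:i+1] that is also a suffix of it.

```python
def build_failure_table(pattern):
    """KMP prefix table: longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    j = 0
    for i in range(1, len(pattern)):
        while j > 0 and pattern[i] != pattern[j]:
            j = table[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        table[i] = j
    return table

print(build_failure_table("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```

For instance, table[4] == 3 because "aba" is both a prefix and a suffix of "ababa"; on a mismatch there, the search can resume at needle index 3 instead of restarting.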
String Reversal
For standard use, Python’s slice notation is idiomatic and optimal:
```python
def reverse_string(s):
    """Reverse string using slice"""
    return s[::-1]
```
For in-place reversal on mutable sequences (lists):
```python
def reverse_string_inplace(s):
    """Reverse list in-place"""
    left, right = 0, len(s) - 1
    while left < right:
        s[left], s[right] = s[right], s[left]
        left += 1
        right -= 1
```
The slice approach creates O(n) temporary space for immutable strings—unavoidable due to string immutability. The two-pointer approach is O(1) space for lists.
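If the data is byte-oriented, a bytearray offers a middle ground: a mutable sequence where the two-pointer swap runs genuinely in place. A small sketch, assuming byte-level reversal is acceptable:

```python
def reverse_bytes_inplace(buf):
    """Reverse a bytearray in place with two pointers: O(1) extra space."""
    left, right = 0, len(buf) - 1
    while left < right:
        buf[left], buf[right] = buf[right], buf[left]
        left += 1
        right -= 1
    return buf

buf = bytearray(b"hello")
reverse_bytes_inplace(buf)   # buf is now b"olleh"
```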
Cycle Detection in Linked Lists
Floyd’s cycle detection (tortoise and hare) uses O(1) space and O(n) time:
```python
def has_cycle(head):
    """Detect if linked list contains a cycle"""
    if not head:
        return False
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False

def find_cycle_start(head):
    """Return the node where cycle begins, or None if no cycle"""
    if not head:
        return None
    slow = fast = head
    # Detect cycle
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            break
    else:
        return None
    # Find cycle start: move slow to head, advance both one step at a time
    slow = head
    while slow != fast:
        slow = slow.next
        fast = fast.next
    return slow
```
When the slow and fast pointers meet, a cycle exists. Resetting slow to the head and advancing both one step at a time makes them meet at the cycle start. This works because the distance from the head to the cycle start equals the distance from the meeting point forward to the cycle start, modulo the cycle length, a consequence of the fast pointer having traveled exactly twice as far as the slow one.
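A self-contained check of this property (Node and the search are redefined here so the snippet runs on its own):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def find_cycle_start(head):
    """Floyd's algorithm: detect a cycle, then locate its first node."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            break
    else:
        return None            # fast ran off the end: no cycle
    slow = head
    while slow is not fast:
        slow = slow.next
        fast = fast.next
    return slow

# Build 1 -> 2 -> 3 -> 4 -> (back to 3)
nodes = [Node(i) for i in range(1, 5)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b
nodes[3].next = nodes[2]       # create the cycle at node 3
start = find_cycle_start(nodes[0])
```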
Fisher-Yates Shuffle
Generate uniformly random permutations in O(n) time and O(1) extra space:
```python
import random

def shuffle(arr):
    """Shuffle array in-place using Fisher-Yates algorithm"""
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]
    return arr
```
Iterate backward through the array, swapping each element with a randomly selected element from the range [0, i]. This ensures uniform distribution—crucial for algorithms like randomized quicksort. Avoid naive shuffle approaches (randomly swapping any two elements) as they produce skewed distributions.
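Randomness makes exact outputs untestable, but two invariants remain checkable: the result is a permutation of the input, and permutation frequencies come out near-uniform over many trials. A quick empirical sketch (trial count chosen arbitrarily):

```python
import random
from collections import Counter

def fisher_yates(arr):
    """Shuffle in place; each of the n! orderings is equally likely."""
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)   # randint is inclusive on both ends
        arr[i], arr[j] = arr[j], arr[i]
    return arr

# The 6 permutations of [0, 1, 2] should each appear roughly 1000 times
counts = Counter(tuple(fisher_yates([0, 1, 2])) for _ in range(6000))
```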
String to Integer Conversion
Parse signed integers with boundary checking:
```python
def parse_int(s):
    """Convert string to integer, returning None if invalid"""
    if not s or not isinstance(s, str):
        return None
    s = s.strip()
    if not s:
        return None
    start = 0
    sign = 1
    if s[0] in ['+', '-']:
        sign = -1 if s[0] == '-' else 1
        start = 1
    if start == len(s):          # a bare sign with no digits is invalid
        return None
    result = 0
    for i in range(start, len(s)):
        if not s[i].isdigit():
            return None
        result = result * 10 + int(s[i])
    return result * sign
```
Handle leading whitespace and optional sign. For production code, prefer Python’s built-in int(s) with exception handling for robustness. This implementation demonstrates manual parsing logic expected in interviews.
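That production alternative amounts to a thin wrapper over int(), for example:

```python
def parse_int_builtin(s):
    """Production-style conversion: let int() do the parsing."""
    try:
        return int(s)            # int() already skips surrounding whitespace
    except (ValueError, TypeError):
        return None
```

int() rejects bare signs and non-digit characters with ValueError, and non-string inputs such as None with TypeError, so both cases collapse into the same fallback.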
Generating All Permutations
Recursively generate permutations by selecting each element as the first item:
```python
def permute(elements):
    """Generate all permutations of elements"""
    if len(elements) <= 1:
        return [elements]
    result = []
    for i in range(len(elements)):
        current = elements[i]
        remaining = elements[:i] + elements[i+1:]
        for perm in permute(remaining):
            result.append([current] + perm)
    return result
```
This has O(n!) time complexity (unavoidable for n! outputs) and O(n!) space for storing results. For larger inputs or streaming use cases, use a generator to reduce memory overhead:
```python
def permute_generator(elements):
    """Generate permutations one at a time"""
    if len(elements) <= 1:
        yield elements
    else:
        for i in range(len(elements)):
            current = elements[i]
            remaining = elements[:i] + elements[i+1:]
            for perm in permute_generator(remaining):
                yield [current] + perm
```
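In production Python, itertools.permutations does the same job in C. A quick sanity check that a hand-rolled generator agrees with it (modulo tuples versus lists):

```python
from itertools import permutations

def permute_generator(elements):
    """Yield permutations of a list one at a time."""
    if len(elements) <= 1:
        yield list(elements)
    else:
        for i in range(len(elements)):
            remaining = elements[:i] + elements[i + 1:]
            for perm in permute_generator(remaining):
                yield [elements[i]] + perm

ours = sorted(permute_generator([1, 2, 3]))
stdlib = sorted(list(p) for p in permutations([1, 2, 3]))
```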
Sorted Array to Balanced Binary Search Tree
Build a height-balanced BST from a sorted array by recursively choosing the middle element:
```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def sorted_array_to_bst(arr):
    """Convert sorted array to height-balanced BST"""
    def build(low, high):
        if low > high:
            return None
        mid = (low + high) // 2
        node = TreeNode(arr[mid])
        node.left = build(low, mid - 1)
        node.right = build(mid + 1, high)
        return node
    return build(0, len(arr) - 1)
```
Selecting the middle element ensures O(log n) height and balanced structure. Time complexity is O(n), space is O(log n) for recursion stack on a balanced tree.
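An in-order traversal of the result must reproduce the sorted input, and the height must stay logarithmic, which gives a compact correctness check (TreeNode and the builder are repeated so the snippet is self-contained):

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def sorted_array_to_bst(arr):
    def build(low, high):
        if low > high:
            return None
        mid = (low + high) // 2
        node = TreeNode(arr[mid])
        node.left = build(low, mid - 1)
        node.right = build(mid + 1, high)
        return node
    return build(0, len(arr) - 1)

def inorder(node):
    """Left, root, right: sorted order for any valid BST."""
    return inorder(node.left) + [node.val] + inorder(node.right) if node else []

def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    return 1 + max(height(node.left), height(node.right)) if node else 0

root = sorted_array_to_bst(list(range(15)))   # 15 nodes -> height 4
```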
Level-Order Tree Traversal
Use a queue to traverse the tree by level, useful for serialization and tree analysis:
```python
from collections import deque

def level_order_traversal(root):
    """Return list of lists, each containing one level's node values"""
    if not root:
        return []
    result = []
    queue = deque([root])
    while queue:
        level_size = len(queue)
        current_level = []
        for _ in range(level_size):
            node = queue.popleft()
            current_level.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        result.append(current_level)
    return result
```
Capture level_size before draining the queue so nodes from different levels are never mixed; this is what makes the grouping correct. Use collections.deque for O(1) popleft operations rather than a plain list, whose pop(0) costs O(n).
Reversing a Linked List
Iterative approach with proper pointer management:
```python
def reverse_linked_list(head):
    """Reverse linked list iteratively"""
    prev = None
    current = head
    while current:
        next_temp = current.next
        current.next = prev
        prev = current
        current = next_temp
    return prev
```
Recursive variant (use cautiously—stack depth equals list length, risking stack overflow on large lists):
```python
def reverse_linked_list_recursive(head, prev=None):
    """Reverse linked list recursively"""
    if not head:
        return prev
    next_temp = head.next
    head.next = prev
    return reverse_linked_list_recursive(next_temp, head)
```
Prefer the iterative approach for production code: O(1) auxiliary space versus O(n) recursion depth, which matters on systems with small stack limits. Test both with empty lists and single-node lists to verify boundary handling.
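Those boundary cases pin down easily with a few assertions; Node and the conversion helpers are redefined here so the checks run standalone:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def reverse_linked_list(head):
    """Reverse linked list iteratively."""
    prev = None
    current = head
    while current:
        next_temp = current.next
        current.next = prev
        prev = current
        current = next_temp
    return prev

def from_list(values):
    """Build a linked list from a Python list (test helper)."""
    head = None
    for v in reversed(values):
        node = Node(v)
        node.next = head
        head = node
    return head

def to_list(head):
    """Flatten a linked list back to a Python list (test helper)."""
    out = []
    while head:
        out.append(head.data)
        head = head.next
    return out
```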
