Multithreading and Parallelism

Definition

Multithreading enables concurrent execution of code within a single process. C# and .NET provide multiple layers of abstraction: raw threads (System.Threading.Thread), the ThreadPool, Task-based Asynchronous Pattern (TAP), and parallel programming APIs.

Key distinction: concurrency is dealing with many things at once (switching between tasks); parallelism is doing many things at once (simultaneous execution on multiple cores).

// Sequential — one at a time
var stopwatch = Stopwatch.StartNew();
Transform(data1);
Transform(data2);
Transform(data3);
Console.WriteLine($"Sequential: {stopwatch.ElapsedMilliseconds}ms");

// Parallel — all at once
stopwatch.Restart();
Parallel.Invoke(
    () => Transform(data1),
    () => Transform(data2),
    () => Transform(data3)
);
Console.WriteLine($"Parallel: {stopwatch.ElapsedMilliseconds}ms");

Core Concepts

Thread Fundamentals

The System.Threading.Thread class wraps an OS thread. Each thread has its own stack (~1MB), priority, and can be foreground or background.

// Creating and starting a thread
var thread = new Thread(() =>
{
    Console.WriteLine($"Running on thread: {Thread.CurrentThread.ManagedThreadId}");
});
thread.Name = "Worker";
thread.IsBackground = true;
thread.Start();
thread.Join(); // Wait for completion

Foreground vs Background threads:

| Type | Behavior | Use Case |
| --- | --- | --- |
| Foreground | Keeps the process alive until the thread finishes | Critical work that must complete |
| Background | Terminated when all foreground threads finish | Auxiliary work that can be abandoned |

// ThreadPool — managed pool of worker threads
ThreadPool.QueueUserWorkItem(_ =>
{
    Console.WriteLine("Running on ThreadPool thread");
});

Prefer ThreadPool over manual Thread creation

Manual thread creation allocates ~1MB stack per thread and doesn't reuse threads. The ThreadPool manages a pool of threads, reusing them and throttling creation. For almost all scenarios, use Task.Run (which uses the ThreadPool) instead of new Thread().

Task-Based Asynchronous Pattern (TAP)

Task and Task<T> are the modern unit of asynchronous work. They represent an operation that may complete in the future.

// CPU-bound work offloaded to ThreadPool
Task<long> task = Task.Run(() =>
{
    long result = 0; // long — the sum of 0..99,999,999 overflows int
    for (int i = 0; i < 100_000_000; i++) result += i;
    return result;
});

// Task<T> — get the result
long sum = await task;
Console.WriteLine($"Sum: {sum}");

// Fan-out — run multiple tasks concurrently
var tasks = urls.Select(url => FetchDataAsync(url)).ToArray();
string[] results = await Task.WhenAll(tasks);

// Task.WhenAny — alternatively, process each task as it completes
while (tasks.Length > 0)
{
    Task<string> completed = await Task.WhenAny(tasks);
    tasks = tasks.Where(t => t != completed).ToArray();
    Console.WriteLine(await completed);
}

Task lifecycle states:

| State | Meaning |
| --- | --- |
| Created | Task initialized but not scheduled |
| WaitingToRun | Scheduled, waiting for a ThreadPool thread |
| Running | Currently executing |
| RanToCompletion | Completed successfully |
| Faulted | Completed with an exception |
| Canceled | Was canceled via CancellationToken |

Task.Run vs async/await

Task.Run is for CPU-bound work that should run on a background thread. For I/O-bound work (network, file, database), use async/await directly — see Async/Await for the full guide.
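
A minimal sketch of the split (the byte array and file name here are placeholders):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

byte[] data = Enumerable.Range(0, 1000).Select(i => (byte)(i % 256)).ToArray();

// CPU-bound: offload the computation to a ThreadPool thread with Task.Run
int checksum = await Task.Run(() => data.Sum(b => (int)b));
Console.WriteLine($"Checksum: {checksum}");

// I/O-bound: await the asynchronous API directly; no thread sits blocked on the disk
await File.WriteAllTextAsync("demo.txt", "hello");
string text = await File.ReadAllTextAsync("demo.txt");
Console.WriteLine(text);
```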

Synchronization Primitives

When multiple threads access shared mutable state, synchronization prevents race conditions.

// lock statement — simplest synchronization
private readonly object _lock = new();
private int _counter = 0;

public void Increment()
{
    lock (_lock)
    {
        _counter++;
    }
}

Comparison of synchronization primitives:

| Primitive | Scope | Async-Friendly | Use Case |
| --- | --- | --- | --- |
| lock | Single process | No | Simple mutual exclusion |
| Monitor | Single process | No | Timed lock, Pulse/Wait |
| Mutex | Cross-process | No | Cross-process exclusion |
| SemaphoreSlim | Single process | Yes (WaitAsync) | Limiting concurrency |
| ReaderWriterLockSlim | Single process | No | Multiple readers, single writer |
| AutoResetEvent | Single process | No | Signal one waiting thread |
| ManualResetEventSlim | Single process | No | Signal all waiting threads |

// SemaphoreSlim — limit concurrent access (async-friendly)
private readonly SemaphoreSlim _semaphore = new(3, 3); // Max 3 concurrent

public async Task ProcessAsync(string url)
{
    await _semaphore.WaitAsync();
    try
    {
        await FetchDataAsync(url);
    }
    finally
    {
        _semaphore.Release();
    }
}

// ReaderWriterLockSlim — multiple readers OR single writer
private readonly ReaderWriterLockSlim _rwLock = new();
private readonly Dictionary<string, string> _cache = new();

public string? Get(string key)
{
    _rwLock.EnterReadLock();
    try { return _cache.GetValueOrDefault(key); }
    finally { _rwLock.ExitReadLock(); }
}

public void Set(string key, string value)
{
    _rwLock.EnterWriteLock();
    try { _cache[key] = value; }
    finally { _rwLock.ExitWriteLock(); }
}

Concurrent Collections

System.Collections.Concurrent provides thread-safe collections:

| Collection | Description | Key Methods |
| --- | --- | --- |
| ConcurrentDictionary<TKey, TValue> | Thread-safe dictionary | TryAdd, AddOrUpdate, GetOrAdd |
| ConcurrentQueue<T> | Lock-free FIFO queue | Enqueue, TryDequeue |
| ConcurrentStack<T> | Lock-free LIFO stack | Push, TryPop |
| ConcurrentBag<T> | Unordered, thread-local storage | Add, TryTake |
| BlockingCollection<T> | Bounding and blocking wrapper | Add, Take, CompleteAdding |

// ConcurrentDictionary — atomic operations
var counts = new ConcurrentDictionary<string, int>();
counts.TryAdd("apple", 0);

// Thread-safe increment
counts.AddOrUpdate("apple", 1, (_, old) => old + 1);

// Get or add
int count = counts.GetOrAdd("banana", _ => ExpensiveLookup("banana"));

// BlockingCollection — producer-consumer pattern
var collection = new BlockingCollection<string>(boundedCapacity: 10);

// Producer
Task.Run(() =>
{
    for (int i = 0; i < 100; i++)
        collection.Add($"Item {i}"); // Blocks if at capacity

    collection.CompleteAdding(); // Signal no more items
});

// Consumer
Task.Run(() =>
{
    foreach (var item in collection.GetConsumingEnumerable())
        Process(item); // Blocks until an item is available or CompleteAdding is called
});

Compound operations still need care

Concurrent collections protect individual operations (TryAdd, TryGetValue), but compound operations (read-then-write across multiple calls) are not atomic. Use AddOrUpdate or GetOrAdd for atomic compound operations.
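
A sketch of the difference (the key name is arbitrary):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var counts = new ConcurrentDictionary<string, int>();

// NOT atomic: another thread can write "apple" between the check and the assignment
// if (!counts.ContainsKey("apple")) counts["apple"] = 1;

// Atomic compound update: AddOrUpdate retries internally until the
// read-modify-write succeeds, so concurrent increments are never lost
Parallel.For(0, 1000, _ => counts.AddOrUpdate("apple", 1, (_, old) => old + 1));
Console.WriteLine(counts["apple"]); // 1000
```

Note that GetOrAdd may invoke its value factory more than once under contention; only one result is stored, but the factory itself should be side-effect free.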

Parallel Class and PLINQ

The System.Threading.Tasks.Parallel class and PLINQ provide high-level APIs for CPU-bound parallelism.

// Parallel.For — parallelize a counted loop
Parallel.For(0, 1000, i =>
{
    results[i] = Compute(i);
});

// Parallel.ForEach — parallelize iteration
Parallel.ForEach(items, item =>
{
    Process(item);
});

// With options
var options = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount,
    CancellationToken = cts.Token
};
Parallel.ForEach(data, options, item => Transform(item));

// PLINQ — parallel LINQ queries
var results = data.AsParallel()
    .Where(x => x.IsValid)
    .OrderBy(x => x.Priority)
    .Select(x => x.Value)
    .ToArray();

// Control parallelism
var ordered = data.AsParallel()
    .AsOrdered() // Preserve original order
    .WithDegreeOfParallelism(4)
    .WithCancellation(cts.Token)
    .Select(ExpensiveTransform)
    .ToList();

Parallel is for CPU-bound work

Parallel.For/ForEach and PLINQ are designed for CPU-bound work. For I/O-bound operations, use Task.WhenAll — parallelism APIs would waste ThreadPool threads waiting for I/O.

CancellationToken

Cooperative cancellation pattern for stopping long-running operations gracefully.

// Creating and using a CancellationToken
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
CancellationToken token = cts.Token;

// Pass to Task.Run
var task = Task.Run(() =>
{
    for (int i = 0; i < 1000; i++)
    {
        token.ThrowIfCancellationRequested(); // Throws OperationCanceledException
        Process(i);
    }
}, token);

// Cancel from another context
cts.Cancel(); // Triggers cancellation

// Checking without throwing
if (token.IsCancellationRequested) return;

// Combining multiple tokens
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(token1, token2);
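
The linked-token line above is typically used to combine a caller's token with a per-operation timeout (FetchWithTimeoutAsync and the delay are illustrative):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

async Task<string> FetchWithTimeoutAsync(CancellationToken callerToken)
{
    // Whichever fires first (the caller's cancellation or the 5s timeout)
    // cancels the linked token
    using var timeoutCts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
    using var linked = CancellationTokenSource.CreateLinkedTokenSource(
        callerToken, timeoutCts.Token);

    await Task.Delay(50, linked.Token); // stand-in for real I/O
    return "done";
}

Console.WriteLine(await FetchWithTimeoutAsync(CancellationToken.None));
```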

Thread Safety Patterns

// 1. Immutable state — the safest approach
public record Config(string Host, int Port, string Database); // Immutable by default

// 2. Interlocked — atomic operations without locks
private int _counter = 0;
Interlocked.Increment(ref _counter); // Thread-safe increment
int original = Interlocked.CompareExchange(ref _counter, 10, 5); // Set to 10 if currently 5; returns the original value

// 3. Thread-local storage (trackAllValues: true is required for the Values property)
var threadLocal = new ThreadLocal<int>(() => 0, trackAllValues: true);
threadLocal.Value++; // Each thread has its own copy
int sum = threadLocal.Values.Sum();

// 4. AsyncLocal — flows with async context
static readonly AsyncLocal<string> _correlationId = new();
_correlationId.Value = Guid.NewGuid().ToString(); // Available throughout async call chain

Deadlocks

A deadlock occurs when two or more threads each hold a lock and wait for a lock the other holds, creating a circular dependency.

Classic deadlock scenario:

private readonly object _lockA = new();
private readonly object _lockB = new();

// Thread 1: locks A then B
lock (_lockA)
{
    Thread.Sleep(100);
    lock (_lockB) { /* Deadlock if Thread 2 holds _lockB */ }
}

// Thread 2: locks B then A
lock (_lockB)
{
    Thread.Sleep(100);
    lock (_lockA) { /* Deadlock if Thread 1 holds _lockA */ }
}

Prevention strategies:

  1. Lock ordering — Always acquire locks in the same order across all threads
  2. Lock timeout — Use Monitor.TryEnter with a timeout instead of lock
  3. Single lock — Use one lock when possible
  4. Keep critical sections small — Minimize time holding locks

// Lock ordering — always acquire _lockA before _lockB
lock (_lockA)
{
    lock (_lockB)
    {
        // Safe — all code follows the same order
    }
}

// Lock timeout
if (Monitor.TryEnter(_lock, TimeSpan.FromSeconds(5)))
{
    try { /* work */ }
    finally { Monitor.Exit(_lock); }
}
else
{
    // Handle timeout — log, retry, or fail
}

Never hold a lock across await

The lock statement does not allow await inside its body (compile error), but pairing Monitor.Enter with await is possible and dangerous: Monitor locks are thread-affine, and after an await the continuation may resume on a different thread, so Monitor.Exit can throw or the lock can be left held forever. Use SemaphoreSlim with WaitAsync instead:

// WRONG — the compiler prevents this, but Monitor.Enter + await would not be caught
lock (_lock)
{
    await DoSomethingAsync(); // Compile error
}

// CORRECT — use SemaphoreSlim for async
await _semaphore.WaitAsync();
try
{
    await DoSomethingAsync();
}
finally
{
    _semaphore.Release();
}

When to Use

| Scenario | Recommended Approach |
| --- | --- |
| CPU-bound work (computation) | Task.Run or Parallel.For |
| I/O-bound work (network, file, DB) | async/await — see Async/Await |
| Producer-consumer pipeline | BlockingCollection<T> or Channel<T> |
| Shared mutable state | lock, SemaphoreSlim, or concurrent collections |
| Fan-out multiple independent tasks | Task.WhenAll |
| Limit concurrent access | SemaphoreSlim |
| Cross-process synchronization | Mutex or named EventWaitHandle |
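
The table mentions Channel<T>, which is not shown elsewhere in this section; a minimal producer-consumer sketch using System.Threading.Channels might look like this:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Bounded channel: WriteAsync awaits (applies backpressure) when the buffer is full
var channel = Channel.CreateBounded<int>(capacity: 10);

var producer = Task.Run(async () =>
{
    for (int i = 0; i < 100; i++)
        await channel.Writer.WriteAsync(i);
    channel.Writer.Complete(); // signal no more items
});

var consumer = Task.Run(async () =>
{
    long sum = 0;
    await foreach (int item in channel.Reader.ReadAllAsync())
        sum += item;
    return sum;
});

await producer;
long total = await consumer;
Console.WriteLine(total); // 4950
```

Channel.CreateUnbounded<T> skips the backpressure when the producer should never block.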

Common Pitfalls

  1. Using Thread directly instead of Task — Manual threads are expensive (~1MB stack each) and not reused. Prefer Task.Run or ThreadPool.

  2. Locking on this, typeof(...), or string literals — These are publicly accessible, making external code a deadlock risk. Use a private readonly object _lock = new().

  3. Deadlock from sync-over-async — Calling .Result or .Wait() on async code can deadlock in UI/ASP.NET contexts. Always use await.

  4. Forgetting to cancel — Not passing CancellationToken to long-running operations leads to runaway tasks. Always support cancellation.

  5. Race conditions with compound operations — Checking then acting (e.g., if (!dict.ContainsKey(key)) dict[key] = value) is not atomic. Use TryAdd or AddOrUpdate.

  6. Excessive parallelism — Setting MaxDegreeOfParallelism too high causes ThreadPool starvation. The default is usually optimal.
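
Pitfall 3 in code form (GetDataAsync is a stand-in for any async call):

```csharp
using System;
using System.Threading.Tasks;

// WRONG in a UI or classic ASP.NET context: .Result blocks the very thread
// the awaited continuation needs to resume on, producing a deadlock
// string data = GetDataAsync().Result;

// CORRECT: awaiting frees the calling thread while the operation is in flight
string data = await GetDataAsync();
Console.WriteLine(data);

async Task<string> GetDataAsync()
{
    await Task.Delay(10); // stand-in for real I/O
    return "payload";
}
```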

Key Takeaways

  1. Prefer Task and Task<T> over raw Thread for almost all scenarios.
  2. Use lock for simple critical sections; SemaphoreSlim for async-compatible and concurrency-limiting scenarios.
  3. Concurrent collections are thread-safe per operation, but compound operations still need care.
  4. Always support CancellationToken in long-running methods.
  5. Parallel.For/ForEach and PLINQ are for CPU-bound work; Task.WhenAll is for I/O-bound.
  6. Avoid deadlocks by establishing lock ordering, never holding a lock across await, and using Monitor.TryEnter with timeouts.
  7. For the async/await programming model, see the dedicated Async/Await guide.

Interview Questions

Q: What is the difference between Thread and Task?

Thread is a low-level OS thread wrapper that allocates its own stack (~1MB). Task is a higher-level abstraction that runs on the ThreadPool, supports continuation chaining, exception aggregation, and cancellation. Always prefer Task over manual Thread creation.

Q: What is the difference between lock and Monitor?

lock is syntactic sugar that compiles to Monitor.Enter/Monitor.Exit in a try/finally. Monitor provides additional features: TryEnter with timeout, Pulse/Wait for signaling. Use lock for simple cases; Monitor when you need timeouts or signaling.
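
The Pulse/Wait signaling mentioned above can be sketched as a tiny blocking queue (a simplified illustration; BlockingCollection<T> covers this in production):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class SimpleBlockingQueue<T>
{
    private readonly Queue<T> _queue = new();
    private readonly object _lock = new();

    public void Enqueue(T item)
    {
        lock (_lock)
        {
            _queue.Enqueue(item);
            Monitor.Pulse(_lock); // wake one thread blocked in Wait
        }
    }

    public T Dequeue()
    {
        lock (_lock)
        {
            // Wait releases the lock, blocks until pulsed, then reacquires it;
            // the while-loop guards against spurious or stolen wakeups
            while (_queue.Count == 0)
                Monitor.Wait(_lock);
            return _queue.Dequeue();
        }
    }
}
```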

Q: How do you handle cancellation in long-running tasks?

Pass a CancellationToken from CancellationTokenSource to the task. The task checks token.ThrowIfCancellationRequested() periodically. The caller calls cts.Cancel() to signal cancellation cooperatively.

Q: What is a deadlock and how do you prevent it?

A deadlock occurs when two or more threads each hold a lock and wait for a lock the other holds. Prevent by always acquiring locks in a consistent order, using timeouts with Monitor.TryEnter, keeping critical sections small, and never holding a lock while calling await.

Q: When would you use ConcurrentDictionary over Dictionary with lock?

Use ConcurrentDictionary when multiple threads frequently read and write to the same dictionary. Its fine-grained locking and atomic methods (TryAdd, AddOrUpdate) perform better than a single lock around a regular Dictionary. For read-heavy or single-writer scenarios, a regular Dictionary with a ReaderWriterLockSlim may suffice.
