Go Interview Transcript: Concurrency Safety of Go Maps

WeChat official account: GopherYes

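Interviewer: Hello, let's talk about maps in Go. First, is Go's built-in map safe for concurrent use?

Candidate: No. Go's built-in map is not safe for concurrent use. If multiple goroutines read and write the same map at the same time without synchronization, there is a data race, and the program can misbehave or crash outright.

To demonstrate, here is an example that is not safe for concurrent use: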

package main

import (
    "fmt"
    "sync"
)

func main() {
    m := make(map[int]int)
    var wg sync.WaitGroup

    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            m[n] = n
        }(i)
    }

    wg.Wait()
    fmt.Println("map size:", len(m))
}

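This code can fail in several ways:
1. Concurrent writes can corrupt the map's internal data
2. The runtime may detect the conflict and abort with "fatal error: concurrent map writes", which recover cannot catch
3. If the program happens to finish, the final map size may not be the expected 100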

Interviewer: OK. So when we need a map in a concurrent setting, what solutions would you suggest?

Candidate: In Go there are three main approaches:

1. Protect the map with a sync.Mutex

type SafeMap struct {
    sync.Mutex
    m map[int]int
}

func (sm *SafeMap) Set(key, value int) {
    sm.Lock()
    defer sm.Unlock()
    sm.m[key] = value
}

func (sm *SafeMap) Get(key int) (int, bool) {
    sm.Lock()
    defer sm.Unlock()
    val, exists := sm.m[key]
    return val, exists
}
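
As a minimal sketch of this first approach, the program below reruns the earlier 100-goroutine experiment against the mutex-protected type. The NewSafeMap constructor and the Len method are additions for illustration and are not part of the snippet above.

```go
package main

import (
    "fmt"
    "sync"
)

// SafeMap guards a plain map with a single mutex.
type SafeMap struct {
    sync.Mutex
    m map[int]int
}

// NewSafeMap is a hypothetical constructor; the inner map must be
// initialized before the first Set, or the write would panic.
func NewSafeMap() *SafeMap {
    return &SafeMap{m: make(map[int]int)}
}

func (sm *SafeMap) Set(key, value int) {
    sm.Lock()
    defer sm.Unlock()
    sm.m[key] = value
}

func (sm *SafeMap) Get(key int) (int, bool) {
    sm.Lock()
    defer sm.Unlock()
    val, exists := sm.m[key]
    return val, exists
}

// Len is a hypothetical helper that reads the size under the lock.
func (sm *SafeMap) Len() int {
    sm.Lock()
    defer sm.Unlock()
    return len(sm.m)
}

func main() {
    sm := NewSafeMap()
    var wg sync.WaitGroup

    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            sm.Set(n, n) // every write is serialized by the mutex
        }(i)
    }

    wg.Wait()
    fmt.Println("map size:", sm.Len()) // prints "map size: 100"
}
```

Because every access goes through the same lock, the 100 distinct keys all land in the map and the race detector (go run -race) should stay silent.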

2. Use sync.RWMutex, which allows concurrent reads while writes remain exclusive

type SafeMap struct {
    sync.RWMutex
    m map[int]int
}

func (sm *SafeMap) Get(key int) (int, bool) {
    sm.RLock()
    defer sm.RUnlock()
    val, exists := sm.m[key]
    return val, exists
}
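
The snippet above only shows the read side. A fuller sketch, under the assumption that writes take the exclusive lock, might look like the following. The type is renamed RWSafeMap here (a hypothetical name) so it does not clash with the mutex version, and NewRWSafeMap is likewise an illustrative helper.

```go
package main

import (
    "fmt"
    "sync"
)

// RWSafeMap lets many readers proceed in parallel; writers are exclusive.
type RWSafeMap struct {
    sync.RWMutex
    m map[int]int
}

func NewRWSafeMap() *RWSafeMap {
    return &RWSafeMap{m: make(map[int]int)}
}

// Set takes the write lock, blocking all readers and other writers.
func (sm *RWSafeMap) Set(key, value int) {
    sm.Lock()
    defer sm.Unlock()
    sm.m[key] = value
}

// Get takes the read lock, so concurrent Gets do not block each other.
func (sm *RWSafeMap) Get(key int) (int, bool) {
    sm.RLock()
    defer sm.RUnlock()
    val, exists := sm.m[key]
    return val, exists
}

func main() {
    sm := NewRWSafeMap()
    for i := 0; i < 10; i++ {
        sm.Set(i, i*i)
    }

    results := make([]int, 10)
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            v, _ := sm.Get(n)
            results[n] = v // each goroutine writes its own slice element
        }(i)
    }
    wg.Wait()

    sum := 0
    for _, v := range results {
        sum += v
    }
    fmt.Println("sum of squares:", sum) // prints "sum of squares: 285"
}
```

The read lock pays off when reads clearly dominate; under heavy write traffic it tends to behave much like a plain mutex.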

3. Use sync.Map, the concurrency-safe map provided by the Go standard library

var m sync.Map

// Store
m.Store("key", "value")

// Load
value, ok := m.Load("key")

// Delete
m.Delete("key")
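
Since sync.Map stores keys and values as interface{}, callers usually need a type assertion on the way out. One way to hide that, sketched here with a hypothetical StringCache wrapper (not something from the standard library), is to put a small typed layer on top of Store, Load, Delete, and Range:

```go
package main

import (
    "fmt"
    "sync"
)

// StringCache is a hypothetical typed wrapper around sync.Map.
// Its zero value is ready to use, like sync.Map itself.
type StringCache struct {
    m sync.Map
}

func (c *StringCache) Put(key, value string) {
    c.m.Store(key, value)
}

// Get loads the value and asserts it back to string.
func (c *StringCache) Get(key string) (string, bool) {
    v, ok := c.m.Load(key)
    if !ok {
        return "", false
    }
    s, ok := v.(string)
    return s, ok
}

func (c *StringCache) Remove(key string) {
    c.m.Delete(key)
}

// Len counts entries with Range; sync.Map has no built-in length.
func (c *StringCache) Len() int {
    n := 0
    c.m.Range(func(_, _ interface{}) bool {
        n++
        return true // keep iterating
    })
    return n
}

func main() {
    var c StringCache
    c.Put("lang", "Go")
    c.Put("topic", "maps")

    v, ok := c.Get("lang")
    fmt.Println(v, ok)   // prints "Go true"
    fmt.Println(c.Len()) // prints "2"

    c.Remove("lang")
    _, ok = c.Get("lang")
    fmt.Println(ok) // prints "false"
}
```

Note that Len walks the whole map, so it is only a rough count if other goroutines are writing at the same time.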

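Interviewer: Can you explain in more detail why the ordinary map is not safe for concurrent use? What is the mechanism behind it?

Candidate: It comes down to how the map is implemented. In the Go source (runtime/map.go in releases before Go 1.24, which moved to a Swiss-table design), the map header looks roughly like this: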

type hmap struct {
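    count     int    // number of elements
    flags     uint8  // state flags
    B         uint8  // log2 of the number of buckets
    noverflow uint16 // approximate number of overflow buckets
    hash0     uint32 // hash seed

    buckets    unsafe.Pointer // bucket array
    oldbuckets unsafe.Pointer // previous bucket array, used while growing
    // ... other fields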
}

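The root causes of the lack of concurrency safety are:
1. The map's internal operations (such as insert and delete) are not atomic
2. Growing the map rewrites the bucket structure while it is in use
3. Multiple goroutines operating at the same time produce data races

Concretely, even a simple write consists of several steps:

Compute the hash of the key
Locate the bucket for that hash
Find an empty slot in the bucket
Write the data

If another goroutine runs the same steps at the same time, for example by choosing the same empty slot or by triggering growth midway, the result is unpredictable.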

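Interviewer: How does sync.Map solve these concurrency problems? Can you walk through how it is implemented?

Candidate: The core of sync.Map's design is separating the read path from the write path so that the common case avoids the lock. Its classic structure (the implementation was reworked in Go 1.24) looks roughly like this: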

type Map struct {
    mu Mutex
    read atomic.Value // readOnly
    dirty map[interface{}]*entry
    misses int
}

type readOnly struct {
    m       map[interface{}]*entry
    amended bool // true if dirty contains keys that are not in m
}

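Its main optimizations are:

1. Two-level storage:

read map: read without a lock
dirty map: a writable map that requires the lock

2. Read optimization:

Read from the read map first
Load the read map through atomic.Value so reads are safe without a lock

3. Write path:

First try to update the entry in the read map
If that fails, take the lock and operate on the dirty map

4. Dynamic promotion:

When the number of misses on the read map reaches the size of the dirty map, the dirty map is promoted to become the new read map

The actual read flow is roughly as follows: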

func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
    // First, read the read map without taking the lock
    read, _ := m.read.Load().(readOnly)
    e, ok := read.m[key]
    if !ok && read.amended {
        // Not in the read map, and dirty holds newer keys: lock and check dirty
        m.mu.Lock()
        // Double-check: dirty may have been promoted while we waited for the lock
        read, _ = m.read.Load().(readOnly)
        e, ok = read.m[key]
        if !ok && read.amended {
            e, ok = m.dirty[key]
            // Record the miss; enough misses promote the dirty map
            m.missLocked()
        }
        m.mu.Unlock()
    }
    // ... return the result
}

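Interviewer: So is sync.Map the best choice in every concurrent scenario?

Candidate: No. sync.Map has specific use cases and limitations:

Good fit:
1. Reads clearly outnumber writes
2. The set of keys grows over time, as in a cache that only grows
3. Lookups mostly hit existing keys

Poor fit:
1. Writes are frequent
2. The keys are a small, fixed set known in advance
3. The map must be traversed in order
4. Keys need sorting or custom comparison

Performance advice:

For write-heavy workloads, a plain map guarded by sync.Mutex may be more efficient
For balanced read/write workloads, consider a custom scheme such as sharded locks

Interviewer: Finally, can you share a best practice from real work for handling maps under concurrency?

Candidate: In a high-concurrency cache, I used a sharded-lock scheme to improve the map's concurrent performance: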

type ShardedMap struct {
    shards     []map[string]interface{}
    locks      []sync.RWMutex
    shardCount int
}

func NewShardedMap(shardCount int) *ShardedMap {
    sm := &ShardedMap{
        shards:     make([]map[string]interface{}, shardCount),
        locks:      make([]sync.RWMutex, shardCount),
        shardCount: shardCount,
    }
    for i := 0; i < shardCount; i++ {
        sm.shards[i] = make(map[string]interface{})
    }
    return sm
}

func (sm *ShardedMap) getShard(key string) (map[string]interface{}, *sync.RWMutex) {
    hash := fnv32(key)
    shardIndex := hash % uint32(sm.shardCount)
    return sm.shards[shardIndex], &sm.locks[shardIndex]
}

func (sm *ShardedMap) Set(key string, value interface{}) {
    shard, lock := sm.getShard(key)
    lock.Lock()
    defer lock.Unlock()
    shard[key] = value
}

// fnv32 computes the 32-bit FNV-1 hash of key.
func fnv32(key string) uint32 {
    hash := uint32(2166136261) // FNV offset basis
    for i := 0; i < len(key); i++ {
        hash *= 16777619 // FNV prime
        hash ^= uint32(key[i])
    }
    return hash
}
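
The listing above covers only the write path. A self-contained variant with a read path and a concurrent usage example might look like the sketch below. ShardedCache, its Get and Len methods, and the choice of the standard library's hash/fnv (FNV-1a) in place of the hand-written hash are all assumptions made for illustration.

```go
package main

import (
    "fmt"
    "hash/fnv"
    "sync"
)

// shard pairs one map with the lock that protects it.
type shard struct {
    mu sync.RWMutex
    m  map[string]interface{}
}

// ShardedCache is a hypothetical variant of the sharded map idea.
type ShardedCache struct {
    shards []*shard
}

func NewShardedCache(shardCount int) *ShardedCache {
    if shardCount < 1 {
        shardCount = 1 // avoid a modulo-by-zero in shardFor
    }
    sc := &ShardedCache{shards: make([]*shard, shardCount)}
    for i := range sc.shards {
        sc.shards[i] = &shard{m: make(map[string]interface{})}
    }
    return sc
}

// shardFor hashes the key and picks the shard that owns it.
func (sc *ShardedCache) shardFor(key string) *shard {
    h := fnv.New32a()
    h.Write([]byte(key))
    return sc.shards[h.Sum32()%uint32(len(sc.shards))]
}

func (sc *ShardedCache) Set(key string, value interface{}) {
    s := sc.shardFor(key)
    s.mu.Lock()
    defer s.mu.Unlock()
    s.m[key] = value
}

// Get only takes the read lock of the one shard that owns the key.
func (sc *ShardedCache) Get(key string) (interface{}, bool) {
    s := sc.shardFor(key)
    s.mu.RLock()
    defer s.mu.RUnlock()
    v, ok := s.m[key]
    return v, ok
}

// Len sums the shard sizes, locking one shard at a time.
func (sc *ShardedCache) Len() int {
    n := 0
    for _, s := range sc.shards {
        s.mu.RLock()
        n += len(s.m)
        s.mu.RUnlock()
    }
    return n
}

func main() {
    sc := NewShardedCache(16)
    var wg sync.WaitGroup
    for g := 0; g < 8; g++ {
        wg.Add(1)
        go func(g int) {
            defer wg.Done()
            for i := 0; i < 100; i++ {
                sc.Set(fmt.Sprintf("g%d-k%d", g, i), i)
            }
        }(g)
    }
    wg.Wait()
    fmt.Println("entries:", sc.Len()) // prints "entries: 800"
}
```

Writers that hash to different shards never contend with each other, which is the point of the design; the shard count is a tuning knob that depends on the workload.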