LeakCanary

Posted on 2020-01-06, updated on 2020-07-05, filed under Android Knowledge Points

Overview of How It Works

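1. Detecting retained objects
LeakCanary hooks into the Android lifecycle to detect when an Activity or Fragment is destroyed. Each destroyed object is handed to an ObjectWatcher, which holds it through a WeakReference. If that weak reference still has not been cleared after waiting 5 seconds and running a GC, the object is considered retained and is a potential memory leak.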

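2. Dumping the heap
Once the number of retained objects reaches a threshold, LeakCanary dumps the Java heap into a .hprof file stored on the Android file system. The dump freezes the app for a short time, and a toast is shown while it happens.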

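3. Analyzing the heap
LeakCanary parses the .hprof file with the Shark library, locates the retained objects, and finds the chain of references that keeps each of them from being collected. It then uses built-in knowledge of the Android framework to work out which instance is responsible for the leak. Finally, each full reference chain is narrowed down to a shorter sub-chain of suspect references, and leaks that share the same sub-chain are grouped together.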

Setup

Add the dependency:

dependencies {
    // debugImplementation because LeakCanary should only run in debug builds.
    debugImplementation 'com.squareup.leakcanary:leakcanary-android:2.1'
}

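Source Code Analysis

Initialization

Starting with version 2.0, LeakCanary initializes itself through a ContentProvider. This relies on two properties of ContentProviders: during the build, the manifests of all modules (including their ContentProvider declarations) are merged into a single AndroidManifest.xml, and at app startup the providers are installed automatically, before Application.onCreate() is called.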

<!-- The AndroidManifest.xml of the leakcanary-object-watcher-android module declares a ContentProvider -->
<provider
    android:name="leakcanary.internal.AppWatcherInstaller$MainProcess"
    android:authorities="${applicationId}.leakcanary-installer"
    android:exported="false"/>

// class: AppWatcherInstaller
override fun onCreate(): Boolean {
  val application = context!!.applicationContext as Application
  // Installation is delegated to InternalAppWatcher
  InternalAppWatcher.install(application)
  return true
}

// class: InternalAppWatcher
fun install(application: Application) {
  SharkLog.logger = DefaultCanaryLog()
  SharkLog.d { "Installing AppWatcher" }
  checkMainThread()
  if (this::application.isInitialized) {
    return
  }
  InternalAppWatcher.application = application

  val configProvider = { AppWatcher.config }
  // Activities and Fragments are handled separately; each gets its own watcher.
  ActivityDestroyWatcher.install(application, objectWatcher, configProvider)
  FragmentDestroyWatcher.install(application, objectWatcher, configProvider)
  onAppWatcherInstalled(application)
}

Activity lifecycle monitoring relies on Application.ActivityLifecycleCallbacks.

// class: ActivityDestroyWatcher
private val lifecycleCallbacks =
  object : Application.ActivityLifecycleCallbacks by noOpDelegate() {
    override fun onActivityDestroyed(activity: Activity) {
      if (configProvider().watchActivities) {
        objectWatcher.watch(
            activity, "${activity::class.java.name} received Activity#onDestroy() callback"
        )
      }
    }
}

companion object {
    fun install(
      application: Application,
      objectWatcher: ObjectWatcher,
      configProvider: () -> Config
    ) {
      val activityDestroyWatcher =
        ActivityDestroyWatcher(objectWatcher, configProvider)
      // Register the callbacks
      application.registerActivityLifecycleCallbacks(activityDestroyWatcher.lifecycleCallbacks)
    }
  }

Fragment lifecycle monitoring, in turn, builds on the Activity's ActivityLifecycleCallbacks: when an Activity is created, FragmentManager.registerFragmentLifecycleCallbacks is called to register a listener for its Fragments' lifecycles.

// class: AndroidOFragmentDestroyWatcher
override fun onFragmentViewDestroyed(
    fm: FragmentManager,
    fragment: Fragment
  ) {
    val view = fragment.view
    if (view != null && configProvider().watchFragmentViews) {
      objectWatcher.watch(
          view, "${fragment::class.java.name} received Fragment#onDestroyView() callback " +
          "(references to its views should be cleared to prevent leaks)"
      )
    }
  }

  override fun onFragmentDestroyed(
    fm: FragmentManager,
    fragment: Fragment
  ) {
    if (configProvider().watchFragments) {
      objectWatcher.watch(
          fragment, "${fragment::class.java.name} received Fragment#onDestroy() callback"
      )
    }
  }
}

override fun invoke(activity: Activity) {
  val fragmentManager = activity.fragmentManager
  // Register the callbacks
  fragmentManager.registerFragmentLifecycleCallbacks(fragmentLifecycleCallbacks, true)
}

In the end, both Activity and Fragment pass their own references to ObjectWatcher.watch() to be monitored. This is where LeakCanary's reference-monitoring logic begins.

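Reference Monitoring

References and GC

Java has four kinds of references:

Strong reference: the garbage collector never reclaims a strongly reachable object; when memory runs out, the JVM throws an OutOfMemoryError instead.
Soft reference: an object that is only softly reachable is reclaimed only when the JVM is running low on memory.
Weak reference: an object that is only weakly reachable is reclaimed at the next garbage collection, whether or not memory is low.
Phantom reference: has no effect on the object's lifetime, as if no reference existed; its get() always returns null, and it is only useful together with a ReferenceQueue.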
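
A minimal Java sketch of how these kinds look from the caller's side while the target is still strongly held. The class and method names are made up for illustration. Soft-reference clearing only shows up under real memory pressure, so the sketch does not try to force it.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical helper, not part of LeakCanary.
class ReferenceKinds {
    /** Returns what get() reports for soft, weak, and phantom references to a live target. */
    static Object[] describe(Object target) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        // A phantom reference never hands its referent back: get() is always null.
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);
        return new Object[] { soft.get(), weak.get(), phantom.get() };
    }
}
```

While `target` is strongly reachable, the soft and weak references both return it, and the phantom reference returns null.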

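When the garbage collector determines that an object is reachable only through soft, weak, or phantom references, it adds those reference objects to the ReferenceQueue they were registered with (if any). So if a soft, weak, or phantom reference object shows up in its reference queue, the object it pointed to has been, or is about to be, reclaimed.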

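Once the object that a soft, weak, or phantom reference points to has been reclaimed, the reference object itself is of no further use. If a program accumulates many of them, memory is wasted (note that the reference objects we create are themselves ordinary objects held by strong references, so they are not reclaimed automatically). We can therefore release the reference objects found in the reference queue ourselves. Typical usage: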

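// Without a queue: only get() reveals whether the referent is gone.
WeakReference<ArrayList> weakReference = new WeakReference<ArrayList>(list);

// With a queue: the queue is typed by the referent (ArrayList), not by the reference.
WeakReference<ArrayList> queuedReference = new WeakReference<ArrayList>(list, new ReferenceQueue<ArrayList>());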

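This associates the object with a ReferenceQueue, which makes it possible to tell whether the object has been collected. And because a weak reference, by definition, does not affect whether its referent is collected, it is well suited to monitoring an object's GC status.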
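
The queue mechanism can be tried out with a small Java sketch. The helper names below are hypothetical. Because System.gc() is only a request, the sketch retries for a bounded time rather than assuming a single call is enough.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of LeakCanary.
class ReferenceQueueDemo {
    /** Creates a weak reference whose referent has no strong reference at all. */
    static WeakReference<Object> weakRefToGarbage(ReferenceQueue<Object> queue) {
        return new WeakReference<>(new Object(), queue);
    }

    /** Requests GC repeatedly until ref appears on the queue or the timeout expires. */
    static boolean awaitEnqueued(WeakReference<?> ref, ReferenceQueue<Object> queue, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                System.gc(); // a request, not a guarantee
                Reference<?> polled = queue.remove(100); // block for up to 100 ms
                if (polled == ref) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }
}
```

A reference to an object that is still strongly held never reaches the queue, which is the signal a leak detector looks for.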

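Java offers two ways to request a GC manually. Both are only requests; the JVM does not guarantee that a collection runs.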

System.gc();

Runtime.getRuntime().gc();

Monitoring

Both Activity and Fragment rely on their respective lifecycle callbacks to report destruction, and then call ObjectWatcher.watch to start monitoring the destroyed object.

/**
 * Watches the provided [watchedObject].
 *
 * @param description Describes why the object is watched.
 */
// class: ObjectWatcher
@Synchronized fun watch(
  watchedObject: Any,
  description: String
) {
  if (!isEnabled()) {
    return
  }
  
  removeWeaklyReachableObjects()
  val key = UUID.randomUUID()
      .toString()
  val watchUptimeMillis = clock.uptimeMillis()
  // Create a KeyedWeakReference: a WeakReference that also carries the key and the time watching started.
  val reference =
    KeyedWeakReference(watchedObject, key, description, watchUptimeMillis, queue)
  SharkLog.d {
    "Watching " +
        (if (watchedObject is Class<*>) watchedObject.toString() else "instance of ${watchedObject.javaClass.name}") +
        (if (description.isNotEmpty()) " ($description)" else "") +
        " with key $key"
  }

  watchedObjects[key] = reference
  checkRetainedExecutor.execute {
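    // Runs later on checkRetainedExecutor: if the object is still being watched by then, moveToRetained records the time and notifies the listeners that it is retained.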
    moveToRetained(key)
  }
}

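// `queue` is the ReferenceQueue that the KeyedWeakReferences are registered with.
// Every time an object is watched, references already sitting in the queue are removed from the watchedObjects map, because the objects they pointed to have already been collected.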
private fun removeWeaklyReachableObjects() {
  // WeakReferences are enqueued as soon as the object to which they point to becomes weakly
  // reachable. This is before finalization or garbage collection has actually happened.
  var ref: KeyedWeakReference?
  do {
    ref = queue.poll() as KeyedWeakReference?
    if (ref != null) {
      watchedObjects.remove(ref.key)
    }
  } while (ref != null)
}

@Synchronized 
private fun moveToRetained(key: String) {
  removeWeaklyReachableObjects()
  val retainedRef = watchedObjects[key]
  if (retainedRef != null) {
    retainedRef.retainedUptimeMillis = clock.uptimeMillis()
    onObjectRetainedListeners.forEach { it.onObjectRetained() }
  }
}
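
The interplay of watch, removeWeaklyReachableObjects, and moveToRetained can be condensed into a plain-Java sketch. This is a simplified stand-in, not LeakCanary's code: the class name MiniObjectWatcher is made up, there is no executor or listener, and the caller is assumed to invoke moveToRetained itself after whatever delay it chooses.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified watcher for illustration only.
class MiniObjectWatcher {
    /** A weak reference that also remembers its key and when it was marked retained. */
    static final class KeyedWeakReference extends WeakReference<Object> {
        final String key;
        long retainedUptimeMillis = -1L; // -1 means "not retained yet"

        KeyedWeakReference(Object referent, String key, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.key = key;
        }
    }

    private final Map<String, KeyedWeakReference> watchedObjects = new HashMap<>();
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    /** Starts watching; the caller should call moveToRetained(key) after a delay. */
    synchronized KeyedWeakReference watch(Object watchedObject) {
        removeWeaklyReachableObjects();
        String key = UUID.randomUUID().toString();
        KeyedWeakReference reference = new KeyedWeakReference(watchedObject, key, queue);
        watchedObjects.put(key, reference);
        return reference;
    }

    /** Marks the object as retained if its reference has not been enqueued yet. */
    synchronized void moveToRetained(String key) {
        removeWeaklyReachableObjects();
        KeyedWeakReference retained = watchedObjects.get(key);
        if (retained != null) {
            retained.retainedUptimeMillis = System.currentTimeMillis();
        }
    }

    /** Counts watched objects that were marked retained and are still tracked. */
    synchronized int retainedObjectCount() {
        removeWeaklyReachableObjects();
        int count = 0;
        for (KeyedWeakReference reference : watchedObjects.values()) {
            if (reference.retainedUptimeMillis != -1L) {
                count++;
            }
        }
        return count;
    }

    // Anything found on the queue is no longer strongly reachable, so stop tracking it.
    private void removeWeaklyReachableObjects() {
        KeyedWeakReference ref;
        while ((ref = (KeyedWeakReference) queue.poll()) != null) {
            watchedObjects.remove(ref.key);
        }
    }
}
```

An object counts as retained only if it was marked and its reference has never been enqueued. Calling Reference.enqueue() by hand is a convenient way to simulate a collection when experimenting with the sketch.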

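Now that the objects to monitor have been registered, how does LeakCanary decide that one of them has actually leaked? Besides installing the Activity and Fragment watchers, InternalAppWatcher.install also calls onAppWatcherInstalled(application), which is in fact the invoke method of InternalLeakCanary.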

// class: InternalAppWatcher
init {
  val internalLeakCanary = try {
    val leakCanaryListener = Class.forName("leakcanary.internal.InternalLeakCanary")
    leakCanaryListener.getDeclaredField("INSTANCE")
        .get(null)
  } catch (ignored: Throwable) {
    NoLeakCanary
  }
  @kotlin.Suppress("UNCHECKED_CAST")
  onAppWatcherInstalled = internalLeakCanary as (Application) -> Unit
}

// class: InternalLeakCanary
override fun invoke(application: Application) {
  this.application = application
  // InternalLeakCanary implements OnObjectRetainedListener and registers itself here, so it receives an onObjectRetained() callback whenever an object is moved to retained. That callback simply forwards to heapDumpTrigger.onObjectRetained(), so the real work happens in HeapDumpTrigger.
  AppWatcher.objectWatcher.addOnObjectRetainedListener(this)
  // Create the heapDumper, gcTrigger, and heapDumpTrigger objects used for GC and heap dumps.
  val heapDumper = AndroidHeapDumper(application, leakDirectoryProvider)

  val gcTrigger = GcTrigger.Default

  val configProvider = { LeakCanary.config }

  val handlerThread = HandlerThread(LEAK_CANARY_THREAD_NAME)
  handlerThread.start()
  val backgroundHandler = Handler(handlerThread.looper)

  heapDumpTrigger = HeapDumpTrigger(
      application, backgroundHandler, AppWatcher.objectWatcher, gcTrigger, heapDumper,
      configProvider
  )
  application.registerVisibilityListener { applicationVisible ->
    this.applicationVisible = applicationVisible
    heapDumpTrigger.onApplicationVisibilityChanged(applicationVisible)
  }
  addDynamicShortcut(application)

  disableDumpHeapInTests()
}

// Callback
override fun onObjectRetained() {
    if (this::heapDumpTrigger.isInitialized) {
      heapDumpTrigger.onObjectRetained()
    }
  }

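HeapDumpTrigger is mainly responsible for the following:

Polling, on a background thread, for objects that are still retained
If any are retained, triggering a GC so that objects that did not leak are reclaimed
After the GC, comparing the number of objects still retained with the configured threshold; if the threshold is reached, calling heapDumper.dumpHeap() to dump the heap to a file and handing it to HeapAnalyzerService for analysis
Showing notifications that reflect the retained count

The main logic lives in the checkRetainedObjects method.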

private fun checkRetainedObjects(reason: String) {
  val config = configProvider()
  // A tick will be rescheduled when this is turned back on.
  if (!config.dumpHeap) {
    SharkLog.d { "Ignoring check for retained objects scheduled because $reason: LeakCanary.Config.dumpHeap is false" }
    return
  }

  var retainedReferenceCount = objectWatcher.retainedObjectCount
  
  if (retainedReferenceCount > 0) {
    // Trigger a GC so that only objects that cannot be reclaimed remain
    gcTrigger.runGc()
    retainedReferenceCount = objectWatcher.retainedObjectCount
  }

  if (checkRetainedCount(retainedReferenceCount, config.retainedVisibleThreshold)) return

  if (!config.dumpHeapWhenDebugging && DebuggerControl.isDebuggerAttached) {
    showRetainedCountNotification(
        objectCount = retainedReferenceCount,
        contentText = application.getString(
            R.string.leak_canary_notification_retained_debugger_attached
        )
    )
    scheduleRetainedObjectCheck(
        reason = "debugger is attached",
        rescheduling = true,
        delayMillis = WAIT_FOR_DEBUG_MILLIS
    )
    return
  }

  val now = SystemClock.uptimeMillis()
  val elapsedSinceLastDumpMillis = now - lastHeapDumpUptimeMillis
  if (elapsedSinceLastDumpMillis < WAIT_BETWEEN_HEAP_DUMPS_MILLIS) {
    showRetainedCountNotification(
        objectCount = retainedReferenceCount,
        contentText = application.getString(R.string.leak_canary_notification_retained_dump_wait)
    )
    scheduleRetainedObjectCheck(
        reason = "previous heap dump was ${elapsedSinceLastDumpMillis}ms ago (< ${WAIT_BETWEEN_HEAP_DUMPS_MILLIS}ms)",
        rescheduling = true,
        delayMillis = WAIT_BETWEEN_HEAP_DUMPS_MILLIS - elapsedSinceLastDumpMillis
    )
    return
  }

  SharkLog.d { "Check for retained objects found $retainedReferenceCount objects, dumping the heap" }
  dismissRetainedCountNotification()
  dumpHeap(retainedReferenceCount, retry = true)
}
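
The order of the early returns in checkRetainedObjects can be modeled as a pure function, shown here as a Java sketch. Everything in it is illustrative: the enum and parameter names are invented, the GC step is left out, and the threshold check (checkRetainedCount, whose body is not shown in this article) is assumed to be a plain "fewer than threshold" comparison.

```java
// Hypothetical decision table for illustration; not LeakCanary's API.
class RetainedCheckSketch {
    enum Action { IGNORE, WAIT_FOR_MORE, WAIT_FOR_DEBUGGER, WAIT_BETWEEN_DUMPS, DUMP_HEAP }

    /** Walks through the same checks, in the same order, as the method above. */
    static Action decide(
            boolean dumpHeapEnabled,
            int retainedCountAfterGc,
            int retainedThreshold,
            boolean debuggerAttached,
            boolean dumpHeapWhenDebugging,
            long elapsedSinceLastDumpMillis,
            long waitBetweenDumpsMillis) {
        if (!dumpHeapEnabled) {
            return Action.IGNORE; // config.dumpHeap is false
        }
        if (retainedCountAfterGc < retainedThreshold) {
            return Action.WAIT_FOR_MORE; // assumption: simple threshold comparison
        }
        if (!dumpHeapWhenDebugging && debuggerAttached) {
            return Action.WAIT_FOR_DEBUGGER; // reschedule, do not dump while debugging
        }
        if (elapsedSinceLastDumpMillis < waitBetweenDumpsMillis) {
            return Action.WAIT_BETWEEN_DUMPS; // too soon after the previous dump
        }
        return Action.DUMP_HEAP;
    }
}
```

Writing the checks this way makes it clear that a heap dump only happens when every earlier guard has passed.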

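Summary

Activities and Fragments register system lifecycle listeners and, on onDestroy, hand their own references to ObjectWatcher for monitoring. The monitoring is driven by HeapDumpTrigger, which polls the retained objects, calls AndroidHeapDumper to dump the heap to a file, and then relies on HeapAnalyzerService for the analysis.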

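Dumping and Analyzing the Heap

Dumping the heap:

HPROF is a native JVM TI agent that shipped with the JDK (it was removed in JDK 9). JVM TI stands for JVM Tool Interface: a standard C/C++ programming interface exposed by the JVM that serves as the common foundation for debuggers, profilers, monitors, thread analyzers, and similar tools, and that mainstream Java virtual machines implement. The hprof tool is built on this interface and can be thought of as a simple profiler agent.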

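LeakCanary also stores heap dumps in the hprof file format. The format itself is simple: a header followed by a sequence of records. There are, however, a great many record types; see the HPROF Agent documentation for the full list.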
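
To make the "header plus records" layout concrete, here is a Java sketch that walks a tiny synthetic file. It assumes the commonly documented binary layout: a null-terminated version string, a 4-byte identifier size, and an 8-byte timestamp, followed by records that each start with a 1-byte tag, a 4-byte time offset, and a 4-byte body length. The class and method names are invented, and this is not how Shark reads files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for illustration only.
class HprofLayoutSketch {
    /** Header fields plus the tag of every top-level record, in file order. */
    static final class Summary {
        final String version;
        final int identifierSize;
        final long timestampMillis;
        final List<Integer> recordTags;

        Summary(String version, int identifierSize, long timestampMillis, List<Integer> recordTags) {
            this.version = version;
            this.identifierSize = identifierSize;
            this.timestampMillis = timestampMillis;
            this.recordTags = recordTags;
        }
    }

    static Summary summarize(byte[] hprofBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(hprofBytes))) {
            // Header: null-terminated version string, u4 identifier size, u8 timestamp.
            StringBuilder version = new StringBuilder();
            int b;
            while ((b = in.readUnsignedByte()) != 0) {
                version.append((char) b);
            }
            int identifierSize = in.readInt();
            long timestampMillis = in.readLong();
            // Records: u1 tag, u4 time offset, u4 body length, then the body.
            List<Integer> tags = new ArrayList<>();
            while (in.available() > 0) {
                tags.add(in.readUnsignedByte());
                in.readInt(); // time offset, unused here
                int bodyLength = in.readInt();
                in.skipBytes(bodyLength); // skip the body without decoding it
            }
            return new Summary(version.toString(), identifierSize, timestampMillis, tags);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a tiny synthetic file in the same layout (not a real heap dump). */
    static byte[] syntheticDump() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.write("JAVA PROFILE 1.0.3".getBytes(StandardCharsets.US_ASCII));
            out.writeByte(0);
            out.writeInt(4);              // identifier size
            out.writeLong(1578268800000L); // arbitrary timestamp
            byte[] text = "main".getBytes(StandardCharsets.UTF_8);
            out.writeByte(0x01);          // STRING record: 4-byte id + UTF-8 bytes
            out.writeInt(0);
            out.writeInt(4 + text.length);
            out.writeInt(1);
            out.write(text);
            out.writeByte(0x2C);          // HEAP_DUMP_END record with an empty body
            out.writeInt(0);
            out.writeInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Because every record declares its own body length, a reader can skip record types it does not care about. That is what keeps the format easy to scan despite the large number of record types.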

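Android also provides a convenient method, Debug.dumpHprofData(filePath), that dumps the heap to an hprof file at the given path (the app needs write access to that location). LeakCanary then parses the records in the hprof file with the Shark library, which is fairly efficient; Shark's HprofReader and HprofWriter handle the reading and writing needed to extract the required information.

The dump itself is implemented in the AndroidHeapDumper class.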

override fun dumpHeap(): File? {
    val heapDumpFile = leakDirectoryProvider.newHeapDumpFile() ?: return null

    val waitingForToast = FutureResult<Toast?>()
    showToast(waitingForToast)

    if (!waitingForToast.wait(5, SECONDS)) {
      SharkLog.d { "Did not dump heap, too much time waiting for Toast." }
      return null
    }

    val notificationManager =
      context.getSystemService(Context.NOTIFICATION_SERVICE) as NotificationManager
    if (Notifications.canShowNotification) {
      val dumpingHeap = context.getString(R.string.leak_canary_notification_dumping)
      val builder = Notification.Builder(context)
          .setContentTitle(dumpingHeap)
      val notification = Notifications.buildNotification(context, builder, LEAKCANARY_LOW)
      notificationManager.notify(R.id.leak_canary_notification_dumping_heap, notification)
    }

    val toast = waitingForToast.get()

    return try {
      Debug.dumpHprofData(heapDumpFile.absolutePath)
      if (heapDumpFile.length() == 0L) {
        SharkLog.d { "Dumped heap file is 0 byte length" }
        null
      } else {
        heapDumpFile
      }
    } catch (e: Exception) {
      SharkLog.d(e) { "Could not dump heap" }
      // Abort heap dump
      null
    } finally {
      cancelToast(toast)
      notificationManager.cancel(R.id.leak_canary_notification_dumping_heap)
    }
  }
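
dumpHeap waits on a FutureResult before dumping, but that class is not shown in this article. The Java sketch below is one plausible shape for such a helper, assuming it is a latch plus a value holder; the class name and the method name await are this sketch's own choices.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the FutureResult used above.
class FutureResultSketch<T> {
    private final AtomicReference<T> holder = new AtomicReference<>();
    private final CountDownLatch latch = new CountDownLatch(1);

    /** Blocks until set() is called or the timeout elapses; returns false on timeout. */
    boolean await(long timeout, TimeUnit unit) {
        try {
            return latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Only valid after await() has returned true. */
    T get() {
        if (latch.getCount() > 0) {
            throw new IllegalStateException("Call await() and check its result first");
        }
        return holder.get();
    }

    /** Publishes the result and releases any waiting thread. */
    void set(T result) {
        holder.set(result);
        latch.countDown();
    }
}
```

With this shape, the background thread that dumps the heap can wait a bounded time for the main thread to show the toast, and give up if the main thread is too busy.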

Analysis:

HeapDumpTrigger relies on HeapAnalyzerService for the analysis. HeapAnalyzerService is a foreground service; when it receives the analysis Intent, it calls HeapAnalyzer's analyze method, which is where the analysis actually takes place.

fun analyze(
  heapDumpFile: File,
  leakingObjectFinder: LeakingObjectFinder,
  referenceMatchers: List<ReferenceMatcher> = emptyList(),
  computeRetainedHeapSize: Boolean = false,
  objectInspectors: List<ObjectInspector> = emptyList(),
  metadataExtractor: MetadataExtractor = MetadataExtractor.NO_OP,
  proguardMapping: ProguardMapping? = null
): HeapAnalysis {
  val analysisStartNanoTime = System.nanoTime()

  if (!heapDumpFile.exists()) {
    val exception = IllegalArgumentException("File does not exist: $heapDumpFile")
    return HeapAnalysisFailure(
        heapDumpFile, System.currentTimeMillis(), since(analysisStartNanoTime),
        HeapAnalysisException(exception)
    )
  }

  return try {
    listener.onAnalysisProgress(PARSING_HEAP_DUMP)
    Hprof.open(heapDumpFile)
        .use { hprof ->
          // 1. Build the graph
          // This is how the Shark library, written specifically for LeakCanary, is used.
          // HprofHeapGraph.indexHprof indexes the dumped instances, classes, arrays, and so on, keyed by record id and stored in maps for later lookup.
          val graph = HprofHeapGraph.indexHprof(hprof, proguardMapping)
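          // 2. Find the leaks
          // Starting from the GC roots, this searches for the shortest reference chain that leads to each leaking object, then builds a LeakTrace from that chain.
          // The resulting LeakTrace is what gets reported.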
          val helpers =
            FindLeakInput(graph, referenceMatchers, computeRetainedHeapSize, objectInspectors)
          helpers.analyzeGraph(
              metadataExtractor, leakingObjectFinder, heapDumpFile, analysisStartNanoTime
          )
        }
  } catch (exception: Throwable) {
    HeapAnalysisFailure(
        heapDumpFile, System.currentTimeMillis(), since(analysisStartNanoTime),
        HeapAnalysisException(exception)
    )
  }
}
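
The "shortest chain from the GC roots" idea can be illustrated with a breadth-first search over a toy object graph. This Java sketch is not Shark's algorithm: objects are plain strings, the graph is a map from holder to referents, and the class name is invented.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical toy model for illustration only.
class ShortestReferencePath {
    /** Breadth-first search from the GC roots; returns the shortest chain ending at leaking, or an empty list. */
    static List<String> find(Map<String, List<String>> references, Collection<String> gcRoots, String leaking) {
        Map<String, String> parent = new HashMap<>(); // visited node -> node that referenced it
        Deque<String> toVisit = new ArrayDeque<>();
        for (String root : gcRoots) {
            if (!parent.containsKey(root)) {
                parent.put(root, null); // roots have no parent
                toVisit.add(root);
            }
        }
        while (!toVisit.isEmpty()) {
            String current = toVisit.poll();
            if (current.equals(leaking)) {
                // Walk the parent links back to the root to rebuild the chain.
                LinkedList<String> path = new LinkedList<>();
                for (String node = current; node != null; node = parent.get(node)) {
                    path.addFirst(node);
                }
                return path;
            }
            for (String next : references.getOrDefault(current, Collections.<String>emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    toVisit.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

Breadth-first order guarantees that the first time the leaking object is reached, it is through a chain with the fewest references, which is usually the easiest chain to read.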

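Summary

1. How does LeakCanary use ObjectWatcher to monitor lifecycles?
LeakCanary uses Application's ActivityLifecycleCallbacks and FragmentManager's FragmentLifecycleCallbacks to observe Activity and Fragment lifecycles. Once onDestroy is called, ObjectWatcher wraps the Activity or Fragment in a KeyedWeakReference and watches it; HeapDumpTrigger's polling and GC triggering then determine when to show a notification.

2. How does LeakCanary dump and analyze the .hprof file?
It obtains the hprof file with the Android platform's built-in Debug.dumpHprofData method and parses it with its own Shark library to produce a LeakTrace.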