Android Performance

Android Systrace Basics - Vsync Explained

2019/12/01

This is the seventh article in the Systrace series. It introduces the Vsync mechanism in Android and examines how each frame is displayed, as seen from Systrace. Vsync is a critical mechanism to understand when reading a trace. Although you cannot see or feel it while using a phone, Systrace shows how the Android system, paced by Vsync signals, renders and composites each frame in an orderly way to keep the frame rate stable.

The purpose of this series is to view the overall operation of the Android system from a different perspective using Systrace, while also providing an alternative angle for learning the Framework. Perhaps you’ve read many articles about the Framework but can never remember the code, or you’re unclear about the execution flow. Maybe from Systrace’s graphical perspective, you can gain a deeper understanding.

Series Article Index

  1. Introduction to Systrace
  2. Systrace Basics - Prerequisites for Systrace
  3. Systrace Basics - Why 60 fps?
  4. Android Systrace Basics - SystemServer Explained
  5. Systrace Basics - SurfaceFlinger Explained
  6. Systrace Basics - Input Explained
  7. Systrace Basics - Vsync Explained
  8. Systrace Basics - Vsync-App: Detailed Explanation of Choreographer-Based Rendering Mechanism
  9. Systrace Basics - MainThread and RenderThread Explained
  10. Systrace Basics - Binder and Lock Contention Explained
  11. Systrace Basics - Triple Buffer Explained
  12. Systrace Basics - CPU Info Explained
  13. Systrace Smoothness in Action 1: Understanding Jank Principles
  14. Systrace Smoothness in Action 2: Case Analysis - MIUI Launcher Scroll Jank Analysis
  15. Systrace Smoothness in Action 3: FAQs During Jank Analysis
  16. Systrace Responsiveness in Action 1: Understanding Responsiveness Principles
  17. Systrace Responsiveness in Action 2: Responsiveness Analysis - Using App Startup as an Example
  18. Systrace Responsiveness in Action 3: Extended Knowledge on Responsiveness
  19. Systrace Thread CPU State Analysis Tips - Runnable
  20. Systrace Thread CPU State Analysis Tips - Running
  21. Systrace Thread CPU State Analysis Tips - Sleep and Uninterruptible Sleep

Main Content

Vsync signals can be generated by hardware or simulated via software, though hardware generation is now standard. The Hardware Composer (HWC) produces VSYNC events and sends them to SurfaceFlinger via callbacks. DispSync then generates VSYNC_APP and VSYNC_SF signals from these for use by Choreographer and SurfaceFlinger.

In the article Detailed Explanation of Android Rendering Mechanism Based on Choreographer, we mentioned that Choreographer works with Vsync to give App-level rendering a stable point in time at which to process Messages. The system uses the Vsync period to control when each frame is drawn. Most current phones have a 60Hz refresh rate, that is, one refresh every 16.6ms. To match this, the system sets the Vsync period to 16.6ms and wakes Choreographer once per period to perform the app's drawing work; this is the primary purpose of Choreographer.
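As a quick sanity check of the numbers above, the Vsync period follows directly from the refresh rate. The helper below is a minimal sketch; the function name is illustrative and not part of any Android API.

```python
def vsync_period_ns(refresh_rate_hz: float) -> int:
    """Return the Vsync period in whole nanoseconds for a refresh rate in Hz."""
    if refresh_rate_hz <= 0:
        raise ValueError("refresh rate must be positive")
    # One second (1e9 ns) divided by the number of refreshes per second.
    return int(1_000_000_000 / refresh_rate_hz)


if __name__ == "__main__":
    for hz in (60, 90, 120):
        period = vsync_period_ns(hz)
        print(f"{hz} Hz -> {period} ns ({period / 1e6:.2f} ms)")
```

For 60Hz this gives 16666666 ns, the same refresh value that SurfaceFlinger reports in its dump.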

While Choreographer handles Vsync for the rendering layer (App), SurfaceFlinger handles it for the composition layer. SurfaceFlinger composites all prepared Surfaces when Vsync arrives.

The diagram below shows VSYNC_APP and VSYNC_SF within the SurfaceFlinger process in Systrace.

Android Graphics Data Flow

First, we must understand the high-level direction of graphics data. From app drawing to screen display, the process consists of several stages:

  1. Stage 1: Upon receiving Vsync-App, the App performs measure, layout, and draw (constructing a DisplayList containing OpenGL commands and data) on the main thread. This corresponds to the doFrame operation in Systrace’s main thread.
  2. Stage 2: The CPU uploads data (via sharing or copying) to the GPU. On ARM devices, CPU and GPU usually share memory. This corresponds to the flush drawing commands operation in Systrace’s rendering thread.
  3. Stage 3: Notify the GPU to render. Real devices typically don’t block waiting for GPU completion; the CPU returns to other tasks immediately. The Fence mechanism is used for synchronization.
  4. Stage 4: swapBuffers and notification to SurfaceFlinger for layer composition. This corresponds to the eglSwapBuffersWithDamageKHR operation in Systrace’s rendering thread.
  5. Stage 5: SurfaceFlinger starts composition. If previous GPU tasks aren’t finished, it waits (Fence mechanism). Composition still relies on the GPU, but as a subsequent task. This corresponds to onMessageReceived (including handleTransaction, handleMessageInvalidate, handleMessageRefresh) in SurfaceFlinger’s main thread. SurfaceFlinger delegates some work to the Hardware Composer to offload the GPU; only layers HWC can’t handle (or those explicitly assigned to OpenGL) use OpenGL.
  6. Stage 6: Final composited data is placed in the Frame Buffer and becomes visible upon refresh.

The following official diagram shows how frame data flows across processes from left to right:

Graphic Data Flow in Systrace

By mapping the abstract data flow above to Systrace, we get this:

The diagram above includes SurfaceFlinger, App, and hwc processes. Let’s follow the numbers:

  1. First Vsync signal arrives; SurfaceFlinger and App receive it simultaneously.
  2. SurfaceFlinger receives Vsync-sf and begins compositing the App’s previous frame buffer.
  3. App receives Vsync-app and begins rendering the current frame buffer (stages 1-4 above).
  4. Second Vsync signal arrives. SurfaceFlinger receives Vsync-sf, retrieves the buffer the App rendered in step 3, and begins composition (stage 5). Simultaneously, the App receives Vsync-app and starts rendering the next frame buffer (stages 1-4).
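The numbered steps above can be sketched as a small simulation. This is a simplified model, not how SurfaceFlinger is implemented: it assumes Offset = 0, assumes the App always finishes a frame within one Vsync period, starts from an empty queue, and uses a plain deque to stand in for the BufferQueue.

```python
from collections import deque


def simulate(vsync_count: int) -> list:
    """Log what SurfaceFlinger and the App do on each Vsync (Offset = 0)."""
    queue = deque()  # stands in for the BufferQueue between App and SurfaceFlinger
    log = []
    for n in range(1, vsync_count + 1):
        # Vsync-sf: composite whatever the App queued during the previous period.
        if queue:
            log.append(f"vsync {n}: SF composites frame {queue.popleft()}")
        else:
            log.append(f"vsync {n}: SF has no new buffer")
        # Vsync-app: render frame n now; it is queued before the next Vsync.
        log.append(f"vsync {n}: App renders frame {n}")
        queue.append(n)
    return log


if __name__ == "__main__":
    for line in simulate(3):
        print(line)
```

In this model every frame is composited exactly one Vsync after the App started rendering it, which is the one-period pipeline delay visible in the trace.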

Vsync Offset

As mentioned, HWC produces hardware Vsync, which DispSync converts into VSYNC_APP and VSYNC_SF.

(Figure: DispSync architecture)

Both app and sf signals have offsets relative to hw_vsync_0, namely phase-app and phase-sf:

Vsync Offset refers to the difference between VSYNC_APP and VSYNC_SF (i.e., phase-sf - phase-app), which manufacturers can configure. If the offset is non-zero, the App and SurfaceFlinger don’t receive the signal simultaneously; they receive it separated by the Offset (usually 0-16.6ms).

Most manufacturers leave this at 0, so App and SurfaceFlinger receive signals simultaneously.

Check the values with adb shell dumpsys SurfaceFlinger:

Offset = 0: (sf phase - app phase = 0)

Sync configuration: [using: EGL_ANDROID_native_fence_sync EGL_KHR_wait_sync]
DispSync configuration:
app phase 1000000 ns, sf phase 1000000 ns
early app phase 1000000 ns, early sf phase 1000000 ns
early app gl phase 1000000 ns, early sf gl phase 1000000 ns
present offset 0 ns refresh 16666666 ns

Offset ≠ 0 (SF phase - app phase = 4 ms)

Sync configuration: [using: EGL_ANDROID_native_fence_sync EGL_KHR_wait_sync]

VSYNC configuration:
         app phase:   2000000 ns          SF phase:   6000000 ns
   early app phase:   2000000 ns    early SF phase:   6000000 ns
GL early app phase:   2000000 ns GL early SF phase:   6000000 ns
    present offset:         0 ns      VSYNC period:  16666666 ns
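To read the Offset out of a saved dump instead of by eye, a small parser is enough. The sketch below assumes only the two line layouts shown above; other Android versions may print these lines differently, and the function name is illustrative.

```python
import re

# Matches the plain "app phase ... sf phase ..." line in both layouts above.
# The "early" and "GL early" lines are skipped because "app phase" must be
# the first thing on the line.
PHASE_RE = re.compile(
    r"^\s*app phase:?\s+(\d+) ns,?\s+sf phase:?\s+(\d+) ns",
    re.IGNORECASE | re.MULTILINE,
)


def vsync_offset_ns(dump: str) -> int:
    """Return (sf phase - app phase) in nanoseconds from dumpsys text."""
    match = PHASE_RE.search(dump)
    if match is None:
        raise ValueError("no 'app phase / sf phase' line found")
    app_phase, sf_phase = (int(group) for group in match.groups())
    return sf_phase - app_phase


if __name__ == "__main__":
    sample = (
        "VSYNC configuration:\n"
        "         app phase:   2000000 ns          SF phase:   6000000 ns\n"
        "   early app phase:   2000000 ns    early SF phase:   6000000 ns\n"
    )
    print(vsync_offset_ns(sample) / 1e6, "ms")
```

The dump text can be captured with adb shell dumpsys SurfaceFlinger and passed in as a string.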

Now let’s see how Offset looks in Systrace.

Offset = 0

When Offset is 0, App and SurfaceFlinger receive Vsync simultaneously:

As shown, a buffer rendered by the App must wait for the next Vsync-SF before it is composited, so composition starts about 16.6ms after the App began the frame. You might wonder: if SurfaceFlinger could start composition soon after the App swaps the buffer into the BufferQueue, could we save the 0-16.6ms that the buffer spends waiting?

Yes, that’s exactly what the Offset mechanism does. The App receives Vsync first, performs rendering, and after the Offset delay, SurfaceFlinger receives Vsync and begins composition. If the app’s buffer is ready, SurfaceFlinger can include it in the current composition, allowing the user to see it sooner.

Offset ≠ 0

Below is an example with a 4ms Offset: SurfaceFlinger receives Vsync 4ms after the App does.

Pros and Cons of Offset

The challenge is determining the optimal Offset—one reason many manufacturers don’t configure it. Its effectiveness depends on device performance and usage scenarios.

  1. If Offset is too short: The app might not finish rendering before SurfaceFlinger starts composition. If no previous buffers are ready, SurfaceFlinger’s current composition won’t include the new frame, delaying it until the next Vsync-SF. The delay effectively becomes Vsync period + Offset instead of the expected Offset.
  2. If Offset is too long: It loses its purpose.
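The trade-off can be made concrete with a toy latency model. This is a sketch under simplifying assumptions: a single frame with no backlog in the BufferQueue, a fixed 16.6ms period, and a known render time. The function name and parameters are illustrative only.

```python
PERIOD_MS = 16.6  # Vsync period used throughout this article


def composition_delay_ms(render_ms: float, offset_ms: float,
                         period_ms: float = PERIOD_MS) -> float:
    """Time from Vsync-app until the first Vsync-sf that can pick up the frame."""
    if not 0 <= offset_ms < period_ms:
        raise ValueError("offset must be in [0, period)")
    if render_ms <= 0:
        raise ValueError("render time must be positive")
    delay = offset_ms  # first Vsync-sf after this Vsync-app
    # Every Vsync-sf the frame misses pushes composition back one full period.
    while render_ms > delay:
        delay += period_ms
    return delay


if __name__ == "__main__":
    for render, offset in ((8.0, 0.0), (3.0, 4.0), (8.0, 4.0)):
        delay = composition_delay_ms(render, offset)
        print(f"render={render} ms, offset={offset} ms -> composed after {delay:.1f} ms")
```

With Offset = 0 the frame always waits a full period. With a 4ms Offset, a frame that renders in 3ms is picked up after 4ms, while a frame that takes 8ms misses that Vsync-sf and is picked up only after Vsync period + Offset.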

HW_Vsync

Note that hardware Vsync isn’t requested every time. It’s only requested from HWC if the last composition was more than 500ms ago.

Using Launcher scrolling as an example, you can see the HW_VSYNC state in the SurfaceFlinger process trace.

Subsequent Vsync requests from Apps can occur either with or without HW_VSYNC.

Without HW_VSYNC

With HW_VSYNC

With HW_VSYNC enabled, DispSync uses the most recent hardware Vsync timestamps to make its prediction. It typically keeps somewhere between 3 and 32 samples, and it calculates SW_VSYNC once it has received 6 of them. After that, as long as the error measured against incoming Present Fences stays within the threshold, hardware Vsync is turned off; otherwise it stays on so that SW_VSYNC can be recalibrated. For details, see Generation and Transmission of SW-VSYNC. Here is their conclusion:

SurfaceFlinger implements the HWC2::ComposerCallback interface to receive hardware VSYNC callbacks and pass them to DispSync. DispSync records these timestamps. After collecting enough (currently ≥6), it calculates the SW-VSYNC period, mPeriod. DispSyncThread uses this period to wake up periodically and notify its listeners: SurfaceFlinger and all apps that need to render. Listeners register via Connection objects in EventThread, which is linked to DispSyncThread through DispSyncSource. When SW-VSYNC arrives, EventThread notifies its connections, triggering SurfaceFlinger composition and App frame drawing. Once enough hardware VSYNCs are received and errors are within tolerance, hardware VSYNC is disabled via EventControlThread.
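The idea of deriving a software period from hardware timestamps can be illustrated with a much-simplified model. This is not the DispSync algorithm: it only averages the gaps between consecutive samples, while the real implementation also derives a phase and handles outliers. The function names and the threshold parameter are hypothetical.

```python
MIN_SAMPLES = 6  # sample count quoted in the text above


def estimate_period_ns(timestamps_ns):
    """Average the gaps between consecutive hardware Vsync timestamps.

    Returns None until at least MIN_SAMPLES timestamps are available.
    """
    if len(timestamps_ns) < MIN_SAMPLES:
        return None
    gaps = [later - earlier
            for earlier, later in zip(timestamps_ns, timestamps_ns[1:])]
    return sum(gaps) // len(gaps)


def can_disable_hw_vsync(predicted_ns, present_fence_ns, threshold_ns):
    """True when the prediction error is within the (hypothetical) tolerance."""
    return abs(predicted_ns - present_fence_ns) <= threshold_ns


if __name__ == "__main__":
    samples = [i * 16_666_666 for i in range(6)]
    print("estimated period:", estimate_period_ns(samples), "ns")
```

Once the period is known, each later Present Fence timestamp can be compared against the predicted Vsync time; in this sketch a small error means hardware Vsync can stay off, and a large error means it should be re-enabled to collect fresh samples.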

Other Addresses for This Article

To be updated.

References

  1. VSYNC
  2. Sync and Vsync
  3. Android Graphics Architecture
  4. Generation and Transmission of SW-VSYNC
  5. DispSync Implementation Details

About Me && Blog

Below is my personal intro and related links. I look forward to exchanging ideas with fellow professionals. “When three walk together, one can always be my teacher!”

  1. Blogger Intro: Includes personal WeChat and WeChat group links.
  2. Blog Content Navigation: A guide for my blog content.
  3. Curated Excellent Blog Articles - Android Performance Optimization Must-Knows: Welcome to recommend projects/articles.
  4. Android Performance Optimization Knowledge Planet: Welcome to join and thank you for your support~

One walks faster alone, but a group walks further together.
