
Synchronization Overview and Case Study on Arm Architecture

Date: 2025-12-25  Author: Diven

White paper download link: Synchronization Overview and Case Study on Arm Architecture

https://developer.arm.com/documentation/107630/latest/

 

1. Introduction


 

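As Arm servers have become more widely deployed in recent years, more cloud vendors are offering Arm-based cloud instances, and more developers are writing software for Arm platforms.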

 

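Synchronization is a hot topic in software migration and optimization. Arm-based servers usually have more CPU cores than servers based on other architectures, so a deep understanding of synchronization matters even more.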

 

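The most significant difference between Arm and x86 CPUs is the memory model: the Arm architecture has a weak memory model, unlike the TSO (Total Store Order) model of the x86 architecture. Because the memory models differ, a program may run well on one architecture but hit performance problems or bugs on another. The more relaxed memory model of Arm servers allows more compiler and hardware optimizations that improve system performance, but at the cost of being harder to understand and making it easier to write incorrect code.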

 

We wrote this document to share synchronization expertise for the Arm architecture and to help developers coming from other architectures develop on Arm systems.

 

2. Synchronization methods on the Armv8-A architecture


 

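This document first introduces synchronization on the Armv8-A architecture, including atomic operations, Arm memory ordering, and data access barrier instructions.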

 

2.1 Atomic operations

 

Implementing a lock requires atomic access. The Arm architecture defines two types of atomic access:

 

  • Load exclusive and store exclusive

  • Atomic operations, which were introduced in the Armv8.1-A Large System Extensions (LSE)

 

2.1.1 Exclusive load and store

 

  • LDREX/LDXR: The load-exclusive instruction performs a load from an addressed memory location, and the PE (for example, the CPU) also marks the physical address being accessed as an exclusive access. The exclusive access mark is checked by store-exclusive instructions.

  • STREX/STXR: The store-exclusive instruction tries to store a value from a register to memory if the PE has exclusive access to the memory address. It returns a status value of 0 if the store was successful, or 1 if no store was performed.
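The load-exclusive/store-exclusive retry pattern can be sketched in portable C++. This is a minimal sketch, not code from the white paper: `atomic_max` and `atomic_max_demo` are hypothetical names, and the assumption that `compare_exchange_weak` is lowered to an LDXR/STXR loop holds only for builds without LSE and depends on the compiler and its flags.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically raise `target` to at least `value` and
// return the value observed before any update.
// compare_exchange_weak may fail spuriously, which mirrors a store-exclusive
// that returns 1 because exclusive access to the address was lost.
inline int atomic_max(std::atomic<int>& target, int value) {
    int observed = target.load(std::memory_order_relaxed);
    // Retry until the store succeeds or no update is needed.
    while (observed < value &&
           !target.compare_exchange_weak(observed, value,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // On failure, `observed` has been refreshed with the current value.
    }
    return observed;
}

// Small single-threaded check of the helper.
inline int atomic_max_demo() {
    std::atomic<int> high_water{3};
    int before = atomic_max(high_water, 7);  // updates 3 -> 7
    assert(before == 3);
    atomic_max(high_water, 5);               // no update: 7 is already larger
    return high_water.load();
}
```

The loop body is empty on purpose: a failed attempt only needs to re-test the condition with the refreshed value, just as a failed store-exclusive branches back to the load-exclusive.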

2.1.2 LSE atomic operations

 

LDXR/STXR use a try-and-test mechanism. LSE is different: each instruction performs the atomic access directly. The main instructions are:
  • Compare and Swap instructions, CAS, and CASP. These instructions perform a read from memory and compare it against the value held in the first register. If the comparison is equal, the value in the second register is written to memory. If the write is performed, the read and write occur atomically such that no other modification of the memory location can take place between the read and write.

  • Atomic memory operation instructions, LD<OP> and ST<OP>, where <OP> is one of ADD, CLR, EOR, SET, SMAX, SMIN, UMAX, and UMIN. Each instruction atomically loads a value from memory, performs an operation on the values, and stores the result back to memory. The LD<OP> instructions save the originally read value in the destination register of the instruction.

  • Swap instruction, SWP. This instruction atomically reads a location from memory into a register and writes a different supplied value back to the same memory location.
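These single-instruction atomics correspond to the read-modify-write members of `std::atomic`. The sketch below is illustrative only; `lse_style_demo` is a hypothetical name, and the instruction families in the comments are what compilers commonly select when targeting Armv8.1-A or later (for example with `-march=armv8.1-a`), which is an assumption about code generation rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Each call below is one atomic read-modify-write operation.
// Without LSE, the same calls are usually built from LDXR/STXR loops.
inline std::uint64_t lse_style_demo() {
    std::atomic<std::uint64_t> word{0x0F};

    std::uint64_t before_add = word.fetch_add(1);      // LDADD family: returns the old value
    std::uint64_t before_or  = word.fetch_or(0x100);   // LDSET family: atomic bit set
    std::uint64_t before_swp = word.exchange(0x42);    // SWP family: unconditional swap

    std::uint64_t expected = 0x42;
    bool swapped = word.compare_exchange_strong(expected, 0x99);  // CAS family

    assert(before_add == 0x0F);
    assert(before_or == 0x10);
    assert(before_swp == 0x110);
    assert(swapped);
    return word.load();
}
```

All calls use the default `std::memory_order_seq_cst`; weaker orderings would select different variants of the same instruction families.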

 

2.2 Arm memory ordering

 

The Arm architecture defines a weak memory model, so memory accesses may not be observed in program order.
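A small message-passing sketch shows what this can mean for a program. It is a hedged illustration, not an example from the white paper: `relaxed_message_passing` is a hypothetical name, and both variables use relaxed atomics so that the code is free of data races while still imposing no ordering between the two stores or the two loads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// With relaxed ordering only, the reader may legally observe flag == 1 and
// still read the old payload value 0 on a weakly ordered machine.
inline int relaxed_message_passing() {
    std::atomic<int> payload{0};
    std::atomic<int> flag{0};
    int seen = -1;

    std::thread writer([&] {
        payload.store(42, std::memory_order_relaxed);
        flag.store(1, std::memory_order_relaxed);   // may become visible before payload
    });
    std::thread reader([&] {
        while (flag.load(std::memory_order_relaxed) == 0) { }  // spin until the flag is set
        seen = payload.load(std::memory_order_relaxed);         // 42 or, legally, 0
    });

    writer.join();
    reader.join();
    assert(seen == 0 || seen == 42);
    return seen;
}
```

On most runs the result is 42. The point is that nothing in the code rules out 0, which is why the barriers and acquire/release instructions exist.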

 

2.3 Arm data access barrier instructions

 

The Arm architecture defines barrier instructions to enforce the ordering of memory accesses.

DMB – Data Memory Barrier
Explicit memory accesses before the DMB are observed before any explicit memory access after the DMB.
  • It does not guarantee when the operations happen, only the order in which they are observed.

     LDR X0, [X1]     ; Must be seen by the memory system before the STR
     DMB SY
     ADD X2, X2, #1   ; May be executed before or after the memory system sees the LDR
     STR X3, [X4]     ; Must be seen by the memory system after the LDR
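In portable C++, the closest counterpart to a DMB is `std::atomic_thread_fence`. The following is a minimal sketch under stated assumptions: `fenced_message_passing` is a hypothetical name, and the expectation that GCC and Clang emit a DMB variant for these fences on AArch64 is an assumption about compiler behavior, not something taken from the white paper.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Message passing ordered by standalone fences instead of ordered accesses.
inline int fenced_message_passing() {
    int payload = 0;                  // plain data, published through `ready`
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release fence: the payload write is ordered before the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) { }  // spin until published
        // Acquire fence: the payload read is ordered after the flag load.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

As with the DMB, the fences constrain only the order in which the accesses are observed, not when they happen.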
DSB – Data Synchronization Barrier
A DSB is more restrictive than a DMB.
  • Use a DSB when necessary, but do not overuse it.

No instruction after a DSB will execute until:
  • All explicit memory accesses before the DSB in program order have completed

  • Any outstanding cache/TLB/branch predictor operations have completed

     DC ISW           ; Operation must have completed before the DSB can complete
     STR X0, [X1]     ; Access must have completed before the DSB can complete
     DSB SY
     ADD X2, X2, #3   ; Cannot be executed until the DSB completes

DMB and DSB are two-way barriers that restrict reordering in both directions. Armv8-A also provides one-way barriers, the load-acquire and store-release instructions, which restrict reordering in one direction only.

Load-Acquire (LDAR)
  • All accesses after the LDAR are observed by the memory system after the LDAR.

  • Accesses before the LDAR are not affected.

Store-Release (STLR)
  • All accesses before the STLR are observed by the memory system before the STLR.

  • Accesses after the STLR are not affected.
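The one-way semantics correspond to `std::memory_order_acquire` and `std::memory_order_release` in C++. The sketch below is illustrative: `acquire_release_message_passing` is a hypothetical name, and the statement that compilers usually emit LDAR and STLR for these accesses on AArch64 is an assumption about typical code generation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Message passing with one-way barriers attached to the flag accesses.
inline int acquire_release_message_passing() {
    int payload = 0;                  // plain data, published through `ready`
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Store-release: the earlier payload write is visible before `ready`.
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // Load-acquire: the later payload read cannot move ahead of this load.
        while (!ready.load(std::memory_order_acquire)) { }
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Only the accesses on one side of each operation are constrained, which is usually cheaper than a full two-way barrier.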

 

3. C++ memory model


 

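With a language-level memory model, developers in most cases do not need to write architecture-specific assembly code. They can rely on a well-designed language-level memory model to write high-quality code without worrying about architectural differences.

C++ memory model:
https://en.cppreference.com/w/cpp/header/atomic

The white paper provides a mapping between the C++ memory model and its Armv8-A implementation.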
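As a rough illustration of such a mapping, the sketch below annotates each C++ memory order with the instruction that GCC and Clang typically generate for AArch64. The annotations are an assumption based on common compiler output rather than the white paper's own table, they vary with compiler version and `-march` setting, and `memory_order_tour` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>

// Single-threaded walk through the memory orders; comments give the usual
// AArch64 lowering, which is not guaranteed by the C++ standard.
inline int memory_order_tour() {
    std::atomic<int> x{0};

    x.store(1, std::memory_order_relaxed);      // usually STR
    x.store(2, std::memory_order_release);      // usually STLR
    x.store(3, std::memory_order_seq_cst);      // usually STLR

    int a = x.load(std::memory_order_relaxed);  // usually LDR
    int b = x.load(std::memory_order_acquire);  // usually LDAR (LDAPR when RCpc is available)
    int c = x.load(std::memory_order_seq_cst);  // usually LDAR

    std::atomic_thread_fence(std::memory_order_acquire);  // usually DMB ISHLD
    std::atomic_thread_fence(std::memory_order_seq_cst);  // usually DMB ISH

    assert(a == 3 && b == 3 && c == 3);
    return a + b + c;
}
```

Inspecting the generated assembly for a specific toolchain is the reliable way to confirm which instructions are actually used.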

 

4.


 

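To help readers understand the material, the white paper analyzes three typical cases in depth. Because synchronization-related programming is very complex, correctness and performance must be weighed carefully. We recommend first using heavier barrier instructions to guarantee correct logic, and then improving performance by removing redundant barriers or switching to lighter barriers where necessary. A deep understanding of the Arm memory model and the related instructions is necessary for accurate, high-performance synchronization programming.

The appendix also introduces memory model tools (the litmus test suite), which help in understanding the memory model and verifying programs on various architectures.

For a more complete treatment of the above, see the white paper “Synchronization Overview and Case Study on Arm Architecture”.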
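The "heavier first, then lighter" advice can be illustrated with a small spinlock. This is a sketch and not one of the white paper's three case studies: `SpinLock` and `locked_counter_demo` are hypothetical names, and the comments mark where a first version would use the default sequentially consistent ordering before switching to the lighter orderings shown.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock for illustration only.
class SpinLock {
public:
    void lock() {
        // Heavier starting point: locked_.exchange(true) with seq_cst.
        // Lighter variant: acquire is enough to keep the critical section
        // from moving above the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) { }
    }
    void unlock() {
        // Heavier starting point: locked_.store(false) with seq_cst.
        // Lighter variant: release keeps the critical section above the unlock.
        locked_.store(false, std::memory_order_release);
    }
private:
    std::atomic<bool> locked_{false};
};

// Increments a plain counter under the lock from several threads.
inline long locked_counter_demo(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;  // plain variable, protected by `lock`
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    assert(counter == static_cast<long>(threads) * iterations);
    return counter;
}
```

A call such as `locked_counter_demo(4, 1000)` should return 4000 with either set of orderings. The relaxation is worth making only after the heavier version has been shown to be correct.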

References

  1. Arm, “Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile Documentation”
    https://developer.arm.com/docs/ddi0487/latest

  2. “The software suite diy7”
    http://diy.inria.fr/

  3. “A working example of how to use the herd7 Memory Model Tool”
    https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/how-to-use-the-memory-model-tool

  4. “How to generate litmus tests automatically with the diy7 tool”
    https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/generate-litmus-tests-automatically-diy7-tool

  5. “Running litmus tests on hardware using litmus7”
    https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/running-litmus-tests-on-hardware-litmus7
Reviewing editor: 李倩

 

