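Swift: Parsing a Struct the Way HandyJSON Does
- HandyJSON
- Parsing a struct from the source code
- Obtaining TargetStructMetadata
- Obtaining TargetStructDescriptor
- Implementing TargetRelativeDirectPointer
- FieldDescriptor and FieldRecord
- Computing offsets with fieldOffsetVectorOffset
- Verifying the code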
HandyJSON
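HandyJSON is a framework developed by Alibaba for converting JSON data into the corresponding model in Swift. Compared with other popular Swift JSON libraries, its distinguishing feature is that it supports pure Swift classes and is simple to use. When deserializing (converting JSON into a Model), it does not require the Model to inherit from NSObject (it is not based on the KVC mechanism), nor does it require you to define a Mapping function for the Model. Once you define the Model class and declare that it conforms to the HandyJSON protocol, HandyJSON parses the values from the JSON string on its own, using each property's name as the key. However, because HandyJSON is built on Swift's metadata, it may stop working outright if the layout of that metadata changes. Alibaba has been maintaining the framework, so it can be expected to follow changes in the Swift source.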
HandyJSON on GitHub
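Parsing a struct from the source code
Obtaining TargetStructMetadata
Because HandyJSON is built on Swift's metadata, parsing a struct requires understanding that metadata first. The rest of this section traces it through the Swift source.
First, searching Metadata.h for StructMetadata shows that its real type is TargetStructMetadata.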
using StructMetadata = TargetStructMetadata<InProcess>;
Next, looking at TargetStructMetadata shows that it inherits from TargetValueMetadata, which in turn inherits from TargetMetadata.
struct TargetStructMetadata : public TargetValueMetadata<Runtime> {
struct TargetValueMetadata : public TargetMetadata<Runtime> {
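We can therefore reconstruct the layout of TargetStructMetadata by following this inheritance chain.
The code shows that the first member of TargetStructMetadata is Kind (inherited from TargetMetadata). The only other member is Description, which points to the type's descriptor.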
struct TargetMetadata {
......
private:
/// The kind. Only valid for non-class metadata; getKind() must be used to get
/// the kind value.
StoredPointer Kind;
......
}
struct TargetValueMetadata : public TargetMetadata<Runtime> {
using StoredPointer = typename Runtime::StoredPointer;
TargetValueMetadata(MetadataKind Kind,
const TargetTypeContextDescriptor<Runtime> *description)
: TargetMetadata<Runtime>(Kind), Description(description) {}
// Points to the description of the metadata
/// An out-of-line description of the type.
TargetSignedPointer<Runtime, const TargetValueTypeDescriptor<Runtime> * __ptrauth_swift_type_descriptor> Description;
......
}
This gives the following layout for TargetStructMetadata:
struct TargetStructMetadata<T> {
    // StoredPointer Kind; on 64-bit systems StoredPointer is uint64_t, so Int has the same size
    var kind: Int
    // Typed as UnsafeMutablePointer<T> for now; T is a placeholder until the descriptor layout is analyzed below
    var typeDescriptor: UnsafeMutablePointer<T>
}
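As a quick check of that first word, the sketch below reads Kind straight out of a type's metadata. It relies on the runtime's MetadataKind values (0x200 for a struct, 0x201 for an enum); `metadataKind(of:)`, `Point`, and `Direction` are hypothetical names introduced only for illustration.

```swift
// Hypothetical helper: read the first pointer-sized word (Kind) of a type's metadata.
func metadataKind(of type: Any.Type) -> Int {
    // An `Any.Type` value is a pointer to the type's metadata record.
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    return metadata.load(as: Int.self)
}

// Hypothetical sample types.
struct Point { var x = 0; var y = 0 }
enum Direction { case left, right }

// Print the kinds in hexadecimal.
print(String(metadataKind(of: Point.self), radix: 16))
print(String(metadataKind(of: Direction.self), radix: 16))
```

With the MetadataKind values above, the struct reports 200 and the enum 201. Class metadata is deliberately left out, since its first word is not a plain kind value.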
Obtaining TargetStructDescriptor
Next we look at Description. The source shows that, for a struct, Description has the type TargetStructDescriptor.
const TargetStructDescriptor<Runtime> *getDescription() const {
return llvm::cast<TargetStructDescriptor<Runtime>>(this->Description);
}
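Looking up TargetStructDescriptor shows that it inherits from TargetValueTypeDescriptor and has two members of its own: NumFields (the number of stored properties) and FieldOffsetVectorOffset (the offset, within the metadata, of the field offset vector).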
class TargetStructDescriptor final
: public TargetValueTypeDescriptor<Runtime>,
public TrailingGenericContextObjects<TargetStructDescriptor<Runtime>,
TargetTypeGenericContextDescriptorHeader,
/*additional trailing objects*/
TargetForeignMetadataInitialization<Runtime>,
TargetSingletonMetadataInitialization<Runtime>,
TargetCanonicalSpecializedMetadatasListCount<Runtime>,
TargetCanonicalSpecializedMetadatasListEntry<Runtime>,
TargetCanonicalSpecializedMetadatasCachingOnceToken<Runtime>> {
......
/// The number of stored properties in the struct.
/// If there is a field offset vector, this is its length.
uint32_t NumFields; // number of stored properties
/// The offset of the field offset vector for this struct's stored
/// properties in its metadata, if any. 0 means there is no field offset
/// vector.
uint32_t FieldOffsetVectorOffset; // offset of the field offset vector within the metadata
TargetValueTypeDescriptor inherits from TargetTypeContextDescriptor, which has three members: Name (the name of the type), AccessFunctionPtr (a pointer to the type's metadata access function), and Fields (a pointer to the type's field descriptor).
class TargetValueTypeDescriptor
: public TargetTypeContextDescriptor<Runtime> {
public:
static bool classof(const TargetContextDescriptor<Runtime> *cd) {
return cd->getKind() == ContextDescriptorKind::Struct ||
cd->getKind() == ContextDescriptorKind::Enum;
}
};
class TargetTypeContextDescriptor
: public TargetContextDescriptor<Runtime> {
public:
/// The name of the type.
// Type name
TargetRelativeDirectPointer<Runtime, const char, /*nullable*/ false> Name;
/// A pointer to the metadata access function for this type.
///
/// The function type here is a stand-in. You should use getAccessFunction()
/// to wrap the function pointer in an accessor that uses the proper calling
/// convention for a given number of arguments.
// Pointer to this type's metadata access function
TargetRelativeDirectPointer<Runtime, MetadataResponse(...),
/*Nullable*/ true> AccessFunctionPtr;
/// A pointer to the field descriptor for the type, if any.
// Pointer to the type's field descriptor
TargetRelativeDirectPointer<Runtime, const reflection::FieldDescriptor,
/*nullable*/ true> Fields;
......
}
TargetTypeContextDescriptor in turn inherits from the base class TargetContextDescriptor, which has two members: Flags (flags describing the context, including its kind and version) and Parent (the parent, that is, enclosing, context; null if this is a top-level context).
/// Base class for all context descriptors.
template<typename Runtime>
struct TargetContextDescriptor {
/// Flags describing the context, including its kind and format version.
// Flags describing the context (kind and version)
ContextDescriptorFlags Flags;
/// The parent context, or null if this is a top-level context.
// Parent (enclosing) context; null at the top level
TargetRelativeContextPointer<Runtime> Parent;
......
}
At this point the layout of TargetStructDescriptor is clear, so we can write it out and replace the generic placeholder T in TargetStructMetadata.
struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}
struct TargetStructDescriptor {
    // Flags describing the context, including kind and version
    var flags: Int32 // ContextDescriptorFlags is 32 bits wide
    // Parent (enclosing) context; null for a top-level context
    var parent: TargetRelativeContextPointer<UnsafeRawPointer> // relative address
    // Type name
    var name: TargetRelativeDirectPointer<CChar> // relative address
    // Pointer to this type's metadata access function
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer> // relative address
    // Pointer to the type's field descriptor
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor> // relative address
    // Number of stored properties
    var numFields: Int32
    // Offset of the field offset vector within the metadata
    var fieldOffsetVectorOffset: Int32
}
// Notes on the member types follow
/// Common flags stored in the first 32-bit word of any context descriptor.
// Flags is a 32-bit value
struct ContextDescriptorFlags {
private:
uint32_t Value;
}
Implementing TargetRelativeDirectPointer
As for the relative address type TargetRelativeDirectPointer, searching the source shows that it is an alias for RelativeDirectPointer.
template <typename Runtime, typename Pointee, bool Nullable = true>
using TargetRelativeDirectPointer
= typename Runtime::template RelativeDirectPointer<Pointee, Nullable>;
Then, in RelativePointer.h, we find that RelativeDirectPointer inherits from the base class RelativeDirectPointerImpl, which has a single member, RelativeOffset (the offset), along with a method that resolves the offset to the real memory address.
template <typename T, bool Nullable = true, typename Offset = int32_t,
typename = void>
class RelativeDirectPointer;
/// A direct relative reference to an object that is not a function pointer.
// Offset is int32_t by default
template <typename T, bool Nullable, typename Offset>
class RelativeDirectPointer<T, Nullable, Offset,
typename std::enable_if<!std::is_function<T>::value>::type>
: private RelativeDirectPointerImpl<T, Nullable, Offset>
{
......
}
/// A relative reference to a function, intended to reference private metadata
/// functions for the current executable or dynamic library image from
/// position-independent constant data.
template<typename T, bool Nullable, typename Offset>
class RelativeDirectPointerImpl {
private:
/// The relative offset of the function's entry point from *this.
Offset RelativeOffset;
......
// Resolve the address from the offset; the result is a pointer to the generic type T
PointerTy get() const & {
// Check for null.
if (Nullable && RelativeOffset == 0)
return nullptr;
// The value is addressed relative to `this`.
uintptr_t absolute = detail::applyRelativeOffset(this, RelativeOffset);
return reinterpret_cast<PointerTy>(absolute);
}
......
}
/// Apply a relative offset to a base pointer. The offset is applied to the base
/// pointer using sign-extended, wrapping arithmetic.
// Compute the address from the offset
template<typename BasePtrTy, typename Offset>
static inline uintptr_t applyRelativeOffset(BasePtrTy *basePtr, Offset offset) {
static_assert(std::is_integral<Offset>::value &&
std::is_signed<Offset>::value,
"offset type should be signed integer");
auto base = reinterpret_cast<uintptr_t>(basePtr);
// We want to do wrapping arithmetic, but with a sign-extended
// offset. To do this in C, we need to do signed promotion to get
// the sign extension, but we need to perform arithmetic on unsigned values,
// since signed overflow is undefined behavior.
auto extendOffset = (uintptr_t)(intptr_t)offset;
// Address of the pointer itself + the stored (relative) offset: shift in memory to reach the value
return base + extendOffset;
}
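We can then write the layout of TargetRelativeDirectPointer:
// Generic over Pointee
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32
    // Resolve the target address from the offset.
    // Must be called in place (through a pointer to the descriptor): the offset is
    // relative to the address of `self`, so a copied value resolves to the wrong address.
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by offset bytes, then treat the result as a pointer to Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}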
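The address arithmetic can be tried in isolation. The following is a minimal sketch, not the runtime's code: it builds a small buffer by hand, stores an Int32 offset at its start, and resolves it relative to the address of the offset field itself. `resolveRelative(at:as:)` is a hypothetical helper, and treating 0 as null mirrors the nullable C++ variant shown above.

```swift
// Hypothetical helper: resolve a 32-bit relative offset stored at `field`.
// The target address is the address of the field itself plus the stored offset.
func resolveRelative<Pointee>(at field: UnsafeRawPointer,
                              as type: Pointee.Type) -> UnsafePointer<Pointee>? {
    let offset = field.load(as: Int32.self)
    if offset == 0 { return nil }   // 0 is treated as null
    return (field + Int(offset)).assumingMemoryBound(to: Pointee.self)
}

// Build a small buffer: an Int32 offset at byte 0 that points 8 bytes ahead,
// where a NUL-terminated string is stored.
let buffer = UnsafeMutableRawPointer.allocate(byteCount: 32, alignment: 4)
buffer.storeBytes(of: Int32(8), as: Int32.self)
let text: [UInt8] = Array("LLPerson".utf8) + [0]
(buffer + 8).copyMemory(from: text, byteCount: text.count)

// Resolve the offset and copy the string out before freeing the buffer.
let resolvedName = resolveRelative(at: UnsafeRawPointer(buffer), as: CChar.self)
    .map { String(cString: $0) }
print(resolvedName ?? "nil")
buffer.deallocate()
```

Taking the field's address as an explicit parameter avoids the pitfall of copying the struct that holds the offset, which would silently change the base address.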
We can likewise revise TargetStructDescriptor to:
struct TargetStructDescriptor {
    // Flags describing the context, including kind and version
    var flags: Int32
    // Parent (enclosing) context; null for a top-level context
    var parent: Int32 // not resolved here, so declared as Int32 for now
    // Type name
    var name: TargetRelativeDirectPointer<CChar>
    // Pointer to this type's metadata access function
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    // Pointer to the type's field descriptor
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    // Number of stored properties
    var numFields: Int32
    // Offset of the field offset vector within the metadata
    var fieldOffsetVectorOffset: Int32
}
// TargetRelativeContextPointer is not resolved here; the source shows it is stored as an int32_t, so Int32 serves as a stand-in
template<typename Runtime,
template<typename _Runtime> class Context = TargetContextDescriptor>
using TargetRelativeContextPointer =
RelativeIndirectablePointer<const Context<Runtime>,
/*nullable*/ true, int32_t,
TargetSignedContextPointer<Runtime, Context>>;
FieldDescriptor and FieldRecord
Next we look at FieldDescriptor, which appears in the source as follows:
// Field descriptors contain a collection of field records for a single
// class, struct or enum declaration.
class FieldDescriptor {
const FieldRecord *getFieldRecordBuffer() const {
return reinterpret_cast<const FieldRecord *>(this + 1);
}
public:
const RelativeDirectPointer<const char> MangledTypeName;
const RelativeDirectPointer<const char> Superclass;
FieldDescriptor() = delete;
const FieldDescriptorKind Kind;
const uint16_t FieldRecordSize;
const uint32_t NumFields;
......
// Returns all fields; each one is wrapped in a FieldRecord
llvm::ArrayRef<FieldRecord> getFields() const {
return {getFieldRecordBuffer(), NumFields};
}
......
}
// FieldDescriptorKind is backed by uint16_t (UInt16 in Swift)
enum class FieldDescriptorKind : uint16_t {
......
}
FieldRecord has the following layout in the source:
class FieldRecord {
const FieldRecordFlags Flags;
public:
const RelativeDirectPointer<const char> MangledTypeName;
const RelativeDirectPointer<const char> FieldName;
......
}
// Field records describe the type of a single stored property or case member
// of a class, struct or enum.
// FieldRecordFlags wraps a uint32_t (32 bits wide)
class FieldRecordFlags {
using int_type = uint32_t;
......
}
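The two layouts above can be exercised with raw loads. The sketch below walks from a struct's metadata to its FieldDescriptor and reads each FieldRecord's FieldName. It assumes the byte offsets implied by the members listed above, that reflection metadata has not been stripped from the binary, and a platform without pointer authentication on the descriptor pointer. `storedPropertyNames(of:)` and `Pet` are hypothetical names.

```swift
// Hypothetical helper: list the stored property names of a struct type by
// walking metadata -> type descriptor -> FieldDescriptor -> FieldRecord.
func storedPropertyNames(of type: Any.Type) -> [String] {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    precondition(metadata.load(as: Int.self) == 0x200, "expects a struct type")
    // Word 1 of struct metadata is the pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)
    // Fields is the fifth 32-bit member of the descriptor (byte offset 16).
    let fieldsMember = descriptor + 16
    let fieldsOffset = fieldsMember.load(as: Int32.self)
    guard fieldsOffset != 0 else { return [] }   // no field descriptor
    let fieldDescriptor = fieldsMember + Int(fieldsOffset)
    // Header: MangledTypeName(4) Superclass(4) Kind(2) FieldRecordSize(2) NumFields(4).
    let recordSize = Int(fieldDescriptor.load(fromByteOffset: 10, as: UInt16.self))
    let count = Int(fieldDescriptor.load(fromByteOffset: 12, as: UInt32.self))
    // The records start right after the 16-byte header (`this + 1` in C++).
    let records = fieldDescriptor + 16
    return (0..<count).map { index -> String in
        // FieldRecord: Flags(4) MangledTypeName(4) FieldName(4).
        let nameMember = records + index * recordSize + 8
        let nameOffset = nameMember.load(as: Int32.self)
        let name = (nameMember + Int(nameOffset)).assumingMemoryBound(to: CChar.self)
        return String(cString: name)
    }
}

struct Pet { var name = "Milo"; var age = 3 }   // hypothetical sample type
print(storedPropertyNames(of: Pet.self))
```

Stepping by FieldRecordSize instead of a hard-coded 12 bytes follows the descriptor's own bookkeeping, so the walk does not depend on the record size staying fixed.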
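Computing offsets with fieldOffsetVectorOffset
Finally, FieldOffsetVectorOffset (the offset of the field offset vector within the metadata) is used to locate the vector that stores each property's byte offset within an instance. The relevant code in the source is:
// StoredPointer is pointer-sized, and asWords + offset advances in pointer-sized words. This excerpt appears to be the class-metadata variant; the struct variant returns const uint32_t *.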
/// Get a pointer to the field offset vector, if present, or null.
const StoredPointer *getFieldOffsets() const {
assert(isTypeMetadata());
auto offset = getDescription()->getFieldOffsetVectorOffset();
if (offset == 0)
return nullptr;
auto asWords = reinterpret_cast<const void * const*>(this);
return reinterpret_cast<const StoredPointer *>(asWords + offset);
}
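Indexing an Int32 pointer with this offset directly gives wrong data, because asWords + offset advances in pointer-sized words rather than in 32-bit units. HandyJSON's source handles this as follows:
// On 64-bit platforms the offset is multiplied by 2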
return Int(UnsafePointer<Int32>(pointer)[vectorOffset * (is64BitPlatform ? 2 : 1) + $0])
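The word-versus-Int32 distinction is easy to verify against the standard library. The sketch below reads the field offset vector with raw loads and compares the result with MemoryLayout.offset(of:). It assumes the descriptor layout derived earlier (NumFields at byte 20, FieldOffsetVectorOffset at byte 24) and a platform without pointer authentication on the descriptor pointer. `storedPropertyOffsets(of:)` and `Account` are hypothetical names.

```swift
// Hypothetical helper: read the field offset vector of a struct type.
func storedPropertyOffsets(of type: Any.Type) -> [Int] {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    precondition(metadata.load(as: Int.self) == 0x200, "expects a struct type")
    let wordSize = MemoryLayout<UnsafeRawPointer>.size
    // Word 1 of struct metadata is the pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: wordSize, as: UnsafeRawPointer.self)
    // NumFields and FieldOffsetVectorOffset follow the five 32-bit members before them.
    let numFields = Int(descriptor.load(fromByteOffset: 20, as: UInt32.self))
    let vectorOffset = Int(descriptor.load(fromByteOffset: 24, as: UInt32.self))
    guard vectorOffset != 0 else { return [] }   // 0 means there is no vector
    // The offset counts pointer-sized words from the start of the metadata...
    let vector = metadata + vectorOffset * wordSize
    // ...while each entry of the vector is a 32-bit byte offset into an instance.
    return (0..<numFields).map {
        Int(vector.load(fromByteOffset: $0 * 4, as: UInt32.self))
    }
}

// Hypothetical sample type.
struct Account {
    var id: Int32 = 1
    var balance: Double = 0
    var owner: String = ""
}

let runtimeOffsets = storedPropertyOffsets(of: Account.self)
let layoutOffsets = [
    MemoryLayout<Account>.offset(of: \Account.id),
    MemoryLayout<Account>.offset(of: \Account.balance),
    MemoryLayout<Account>.offset(of: \Account.owner),
].compactMap { $0 }
print(runtimeOffsets, layoutOffsets)
```

Multiplying by the word size in bytes is equivalent to HandyJSON's "times 2" on an Int32 pointer for 64-bit platforms, and it stays correct where pointers are 4 bytes wide.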
With the analysis so far, we arrive at a fairly clear set of layouts:
// Resolves a memory address from a relative offset; generic over Pointee
struct TargetRelativeDirectPointer<Pointee> {
    var offset: Int32
    // Resolve the target address from the offset.
    // Must be called in place, because the offset is relative to the address of `self`.
    mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
        let offset = self.offset
        return withUnsafePointer(to: &self) { p in
            // Advance by offset bytes, then treat the result as a pointer to Pointee
            return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
        }
    }
}
struct TargetStructMetadata {
    var kind: Int
    var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}
struct TargetStructDescriptor {
    var flags: Int32
    var parent: Int32
    var name: TargetRelativeDirectPointer<CChar>
    var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
    var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
    var numFields: Int32
    var fieldOffsetVectorOffset: Int32
    func getFieldOffsets(_ metadata: UnsafeRawPointer) -> UnsafePointer<Int32> {
        print(metadata)
        // fieldOffsetVectorOffset counts pointer-sized words, so scale it to Int32 units
        // (a factor of 2 on 64-bit platforms)
        let int32PerWord = MemoryLayout<Int>.size / MemoryLayout<Int32>.size
        return metadata.assumingMemoryBound(to: Int32.self).advanced(by: Int(self.fieldOffsetVectorOffset) * int32PerWord)
    }
    // Used when resolving metatypes: the generic arguments start at word 2 of the metadata
    var genericArgumentOffset: Int {
        return 2
    }
}
struct FieldDescriptor {
    var MangledTypeName: TargetRelativeDirectPointer<CChar>
    var Superclass: TargetRelativeDirectPointer<CChar>
    var kind: UInt16
    var fieldRecordSize: UInt16 // uint16_t in the C++ source
    var numFields: UInt32 // uint32_t in the C++ source
    var fields: FieldRecordBuffer<FieldRecord>
}
struct FieldRecord {
    var fieldRecordFlags: UInt32 // uint32_t in the C++ source
    var mangledTypeName: TargetRelativeDirectPointer<CChar>
    var fieldName: TargetRelativeDirectPointer<UInt8>
}
// Accessor for the FieldRecords that trail the FieldDescriptor header
struct FieldRecordBuffer<Element> {
    var element: Element
    mutating func buffer(n: Int) -> UnsafeBufferPointer<Element> {
        return withUnsafePointer(to: &self) {
            let ptr = $0.withMemoryRebound(to: Element.self, capacity: 1) { start in
                return start
            }
            return UnsafeBufferPointer(start: ptr, count: n)
        }
    }
    mutating func index(of i: Int) -> UnsafeMutablePointer<Element> {
        return withUnsafePointer(to: &self) {
            return UnsafeMutablePointer(mutating: UnsafeRawPointer($0).assumingMemoryBound(to: Element.self).advanced(by: i))
        }
    }
}
Verifying the code
Now let us verify this layout with code.
// Declaration of the Swift runtime entry point used below. It is missing from the
// original listing; the parameter and return types here are an assumption modeled on the call site.
@_silgen_name("swift_getTypeByMangledNameInContext")
func swift_getTypeByMangledNameInContext(
    _ typeNameStart: UnsafePointer<CChar>,
    _ typeNameLength: Int,
    _ context: UnsafeRawPointer?,
    _ genericArguments: UnsafePointer<UnsafeRawPointer?>?
) -> UnsafeRawPointer?

protocol BrigeProtocol {}
extension BrigeProtocol {
    // Rebind the raw pointer to the concrete type through the protocol and return the value
    static func get(from pointor: UnsafeRawPointer) -> Any {
        // Self is the real (dynamic) type
        pointor.assumingMemoryBound(to: Self.self).pointee
    }
}
struct BrigeMetadataStruct {
    let type: Any.Type
    let witness: Int
}
func custom(type: Any.Type) -> BrigeProtocol.Type {
    let container = BrigeMetadataStruct(type: type, witness: 0)
    let cast = unsafeBitCast(container, to: BrigeProtocol.Type.self)
    return cast
}
// The LLPerson struct
struct LLPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}
// Create an instance
var p = LLPerson()
// Bit-cast LLPerson's metatype to a pointer to our TargetStructMetadata; LLPerson.self is the address of its metadata
let ptr = unsafeBitCast(LLPerson.self as Any.Type, to: UnsafeMutablePointer<TargetStructMetadata>.self)
// Get the struct's name
let namePtr = ptr.pointee.typeDescriptor.pointee.name.getmeasureRelativeOffset()
print("struct name: \(String(cString: namePtr))")
// Get the number of properties
let numFields = ptr.pointee.typeDescriptor.pointee.numFields
print("number of properties: \(numFields)")
// Get the field offset vector from the metadata
let offsets = ptr.pointee.typeDescriptor.pointee.getFieldOffsets(UnsafeRawPointer(ptr).assumingMemoryBound(to: Int.self))
print("----------- start fetch field -------------")
for i in 0..<numFields {
    // Get the property name
    let fieldName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.fieldName.getmeasureRelativeOffset()
    print("----- field \(String(cString: fieldName)) -----")
    // Get the property's offset, measured in bytes
    let fieldOffset = offsets[Int(i)]
    print("offset of \(String(cString: fieldName)): \(fieldOffset) bytes")
    // This is the Swift-mangled type name; it has to be resolved to the real type
    let typeMangleName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.mangledTypeName.getmeasureRelativeOffset()
    // print("\(String(cString: typeMangleName))")
    let genericVector = UnsafeRawPointer(ptr).advanced(by: ptr.pointee.typeDescriptor.pointee.genericArgumentOffset * MemoryLayout<UnsafeRawPointer>.size).assumingMemoryBound(to: Any.Type.self)
    // Resolve the type with the runtime function swift_getTypeByMangledNameInContext, which takes four arguments
    let fieldType = swift_getTypeByMangledNameInContext(
        typeMangleName, // the mangled name
        256, // length of the mangled name; it should be computed, but HandyJSON simply passes 256
        UnsafeRawPointer(ptr.pointee.typeDescriptor), // the context: the type descriptor
        UnsafeRawPointer(genericVector).assumingMemoryBound(to: Optional<UnsafeRawPointer>.self)) // the current generic arguments, used to resolve symbols
    // Bit-cast fieldType to Any.Type
    let type = unsafeBitCast(fieldType, to: Any.Type.self)
    // Bridge through the protocol to recover the real type
    let value = custom(type: type)
    // Get a pointer to the instance p as UnsafeRawPointer, bound to the 1-byte type Int8.
    // The offsets are in bytes; without this the pointer would advance in units of the struct's size.
    let instanceAddress = withUnsafePointer(to: &p){return UnsafeRawPointer($0).assumingMemoryBound(to: Int8.self)}
    print("fieldType: \(type) \nfieldValue: \(value.get(from: instanceAddress.advanced(by: Int(fieldOffset))))")
}
print("----------- end fetch field -------------")
Printed output:
The memory addresses also reveal how the properties are laid out.
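One way to sanity-check the names, values, and offsets produced by the metadata walk is to compare them with what the standard library reports. The sketch below uses only Mirror and MemoryLayout. It re-declares LLPerson so that it stands alone, and the constant names are chosen for illustration.

```swift
// Same shape as the LLPerson struct used in the verification code.
struct LLPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

let person = LLPerson()

// Mirror exposes the stored properties in declaration order.
let mirroredFields = Mirror(reflecting: person).children.map {
    (label: $0.label ?? "?", value: $0.value)
}
for field in mirroredFields {
    print("\(field.label): \(field.value)")
}
let mirroredLabels = mirroredFields.map { $0.label }

// MemoryLayout reports each stored property's byte offset within an instance.
let expectedOffsets = [
    MemoryLayout<LLPerson>.offset(of: \LLPerson.age),
    MemoryLayout<LLPerson>.offset(of: \LLPerson.name),
    MemoryLayout<LLPerson>.offset(of: \LLPerson.nameTwo),
].compactMap { $0 }
print(expectedOffsets)
```

If the hand-written layouts are right, the property names and byte offsets printed by the verification loop should match these values one for one.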