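The first step in studying a file system is to understand how it lays out its data on the device, so let's start with a brief overview.
F2FS on-disk layout
F2FS divides the whole storage space into six areas: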
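- Superblock (SB): holds basic partition information and the F2FS parameters that are fixed when the partition is formatted and cannot be changed afterwards.
- Checkpoint (CP): stores the file system state, the bitmaps of the valid NAT/SIT sets (explained below), the orphan inode list (a file that is deleted while it is still referenced cannot be freed immediately, so it is recorded in this list and released at the next mount), and the owner information of the currently active segments. As in other log-structured file systems, an F2FS checkpoint is a consistent snapshot of the file system state at a given point in time, which can be used for recovery after a crash or power loss. The two F2FS checkpoints each occupy one Segment. What differs from the above is that F2FS decides whether a checkpoint is valid by comparing the version stored in its first and last blocks.
- Segment Information Table (SIT): holds, for every segment in the Main Area (explained below), the number of valid blocks and a bitmap marking which blocks are valid. The SIT is mainly used during cleaning to choose the segments to migrate and to identify the valid data inside them.
- Node Address Table (NAT): used to locate every node block in the Main Area (inode, direct node and indirect node blocks). In other words, the NAT stores the actual on-disk address of each inode or index node.
- Segment Summary Area (SSA): holds the owner information (a reverse index) of every block in the Main Area, namely the parent inode number and the offset inside it. Before a valid block is migrated, its SSA entry is used to look up the number of its parent node.
- Main Area: made up of 4 KB blocks, each of which is allocated to store either data (file or directory contents) or an index (an inode or a block of data-block pointers). A fixed number of consecutive blocks forms a Segment, and Segments in turn form Sections and Zones (as described earlier). A Segment stores either data or indexes, never both, so Segments are divided into data segments and node segments.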
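To make the six areas a little more concrete, here is a minimal Python sketch that maps a byte offset in the image to the area it falls in. The start block of each area is taken from the superblock dump shown later in this article (cp_blkaddr 0x200, sit_blkaddr 0x600, nat_blkaddr 0xa00, ssa_blkaddr 0xe00, main_blkaddr 0x1000); they only hold for this particular 100 MB image. The names `AREAS` and `area_of` are made up for the illustration and are not part of any F2FS tool.

```python
import bisect

BLOCK_SIZE = 4096  # F2FS block size in this image

# (start block, area name), valid only for the 100 MB image in this article.
AREAS = [
    (0x0000, "SB"),
    (0x0200, "CP"),
    (0x0600, "SIT"),
    (0x0A00, "NAT"),
    (0x0E00, "SSA"),
    (0x1000, "Main"),
]
STARTS = [start for start, _ in AREAS]


def area_of(byte_offset):
    """Return the name of the area that contains the given byte offset."""
    blk = byte_offset // BLOCK_SIZE
    # bisect_right finds the first area starting after blk; step back one.
    return AREAS[bisect.bisect_right(STARTS, blk) - 1][1]


print(area_of(0x400))       # SB
print(area_of(0x200000))    # CP
print(area_of(0x1201000))   # Main
```

The three offsets printed here are the ones that show up in the hexdump sections further down.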
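On the sdcard, create a 100 MB file named f2fs_device. (The image used in this walkthrough came out as exactly 100 × 1024 × 1024 bytes. With GNU dd, bs=1MB means 1,000,000 bytes, so use bs=1M there to get the same size.)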
dd if=/dev/zero of=/sdcard/f2fs_device bs=1MB count=100
Format the file f2fs_device as an f2fs file system
make_f2fs /sdcard/f2fs_device
Bind f2fs_device to a loop device to create a virtual block device. If it reports that the device is busy, replace 13 with another number
losetup /dev/block/loop13 /sdcard/f2fs_device
Create a directory named f2fs_root_dir
mkdir /sdcard/f2fs_root_dir
Mount loop13 on the f2fs_root_dir directory
mount -t f2fs /dev/block/loop13 /sdcard/f2fs_root_dir
2. Writing some data
Create a file 1.txt in the directory and write hello world into it
T1_PRO:/sdcard/f2fs_root_dir # touch 1.txt
T1_PRO:/sdcard/f2fs_root_dir # echo "hello world" > 1.txt
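3. Analysis with hexdump
Pull the image file f2fs_device. Be careful not to pull the f2fs_root_dir path by mistake
adb pull /sdcard/f2fs_device
Use hexdump directly to look at the raw data of the block device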
hexdump -C f2fs_device
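3.1 The beginning of the dump
The first thing you see is a pile of hexadecimal numbers like the ones below. This is the raw data of the block device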
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
#super block 1 start
00000400 10 20 f5 f2 01 00 0b 00 09 00 00 00 03 00 00 00 |. ..............|
00000410 0c 00 00 00 09 00 00 00 01 00 00 00 01 00 00 00 |................|
00000420 00 00 00 00 00 64 00 00 00 00 00 00 2a 00 00 00 |.....d......*...|
00000430 31 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00 |1...............|
00000440 01 00 00 00 2a 00 00 00 00 02 00 00 00 02 00 00 |....*...........|
00000450 00 06 00 00 00 0a 00 00 00 0e 00 00 00 10 00 00 |................|
00000460 03 00 00 00 01 00 00 00 02 00 00 00 b6 71 aa 9d |.............q..|
00000470 8e e2 40 64 81 df 52 71 22 74 8d ac 00 00 00 00 |..@d..Rq"t......|
00000480 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000870 00 00 00 00 00 00 00 00 00 00 00 00 22 00 00 00 |............"...|
00000880 6a 70 67 00 00 00 00 00 67 69 66 00 00 00 00 00 |jpg.....gif.....|
00000890 70 6e 67 00 00 00 00 00 61 76 69 00 00 00 00 00 |png.....avi.....|
000008a0 64 69 76 78 00 00 00 00 6d 34 61 00 00 00 00 00 |divx....m4a.....|
000008b0 6d 34 76 00 00 00 00 00 6d 34 70 00 00 00 00 00 |m4v.....m4p.....|
000008c0 6d 70 34 00 00 00 00 00 6d 70 33 00 00 00 00 00 |mp4.....mp3.....|
000008d0 33 67 70 00 00 00 00 00 77 6d 76 00 00 00 00 00 |3gp.....wmv.....|
000008e0 77 6d 61 00 00 00 00 00 6d 70 65 67 00 00 00 00 |wma.....mpeg....|
000008f0 6d 6b 76 00 00 00 00 00 6d 6f 76 00 00 00 00 00 |mkv.....mov.....|
00000900 61 73 78 00 00 00 00 00 61 73 66 00 00 00 00 00 |asx.....asf.....|
00000910 77 6d 78 00 00 00 00 00 73 76 69 00 00 00 00 00 |wmx.....svi.....|
00000920 77 76 78 00 00 00 00 00 77 76 00 00 00 00 00 00 |wvx.....wv......|
00000930 77 6d 00 00 00 00 00 00 6d 70 67 00 00 00 00 00 |wm......mpg.....|
00000940 6d 70 65 00 00 00 00 00 72 6d 00 00 00 00 00 00 |mpe.....rm......|
00000950 6f 67 67 00 00 00 00 00 6f 70 75 73 00 00 00 00 |ogg.....opus....|
00000960 66 6c 61 63 00 00 00 00 6a 70 65 67 00 00 00 00 |flac....jpeg....|
00000970 76 69 64 65 6f 00 00 00 61 70 6b 00 00 00 00 00 |video...apk.....|
00000980 73 6f 00 00 00 00 00 00 65 78 65 00 00 00 00 00 |so......exe.....|
00000990 64 62 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |db..............|
000009a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
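... (a large amount of data omitted)
3.1.1 Superblock (SB)
The superblock (SB) starts at 0x00000400, that is, at an offset of 1 KB rather than at 0.
There are two identical copies of the superblock. This is part of the f2fs design and protects against corruption. The two structures are 4 KB apart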
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400 10 20 f5 f2 01 00 0b 00 09 00 00 00 03 00 00 00 |. ..............|//SB1
*
00001400 10 20 f5 f2 01 00 0b 00 09 00 00 00 03 00 00 00 |. ..............|//SB2
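The bytes at 0x400 can be decoded by hand, but a few lines of Python make it easier. The sketch below is only an illustration: the field order is my reading of `struct f2fs_super_block` in the kernel headers and should be checked against the header for your kernel version, and the hex string is simply copied from the dump above (offsets 0x400 to 0x46f). All fields are little-endian.

```python
import struct

# Bytes 0x400..0x46f of the image, copied from the hexdump above.
SB_HEX = """
10 20 f5 f2 01 00 0b 00 09 00 00 00 03 00 00 00
0c 00 00 00 09 00 00 00 01 00 00 00 01 00 00 00
00 00 00 00 00 64 00 00 00 00 00 00 2a 00 00 00
31 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00
01 00 00 00 2a 00 00 00 00 02 00 00 00 02 00 00
00 06 00 00 00 0a 00 00 00 0e 00 00 00 10 00 00
03 00 00 00 01 00 00 00 02 00 00 00 b6 71 aa 9d
"""
raw = bytes.fromhex("".join(SB_HEX.split()))

# Assumed field order (from the kernel's f2fs_super_block definition).
FIELDS = [
    "magic", "major_ver", "minor_ver",
    "log_sectorsize", "log_sectors_per_block", "log_blocksize",
    "log_blocks_per_seg", "segs_per_sec", "secs_per_zone",
    "checksum_offset", "block_count",
    "section_count", "segment_count", "segment_count_ckpt",
    "segment_count_sit", "segment_count_nat", "segment_count_ssa",
    "segment_count_main",
    "segment0_blkaddr", "cp_blkaddr", "sit_blkaddr", "nat_blkaddr",
    "ssa_blkaddr", "main_blkaddr",
    "root_ino", "node_ino", "meta_ino",
]
LAYOUT = "<IHH6IIQ7I6I3I"  # little-endian, no padding, 108 bytes in total

sb = dict(zip(FIELDS, struct.unpack_from(LAYOUT, raw, 0)))

print(hex(sb["magic"]))                              # 0xf2f52010
print(sb["block_count"] * (1 << sb["log_blocksize"]))  # 104857600 bytes
print(hex(sb["cp_blkaddr"]), hex(sb["main_blkaddr"]))  # 0x200 0x1000
```

Under that assumed layout the numbers are self-consistent: 2 + 2 + 2 + 1 + 42 segments for CP, SIT, NAT, SSA and Main add up to the segment_count of 49, and block_count × 4 KB gives the 100 MB image size.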
3.1.2 Checkpoint (CP)
The checkpoint (CP) starts at 0x00200000, that is, at 2 MB. A Segment is 2 MB in size and the checkpoint is segment-aligned
*
00200000 3b 51 5b 23 00 00 00 00 00 28 00 00 00 00 00 00 |;Q[#.....(......|#CP
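The 2 MB figure can be checked against the superblock values instead of being taken on faith. This is a small sketch using the values read from the superblock dump of this image (log_blocksize 12, log_blocks_per_seg 9, cp_blkaddr 0x200). That the second checkpoint sits exactly one segment after the first is inferred from the statement that each checkpoint occupies one Segment.

```python
# Values taken from the superblock dump of this particular image.
LOG_BLOCKSIZE = 12        # 2**12 = 4096-byte blocks
LOG_BLOCKS_PER_SEG = 9    # 2**9  = 512 blocks per segment
CP_BLKADDR = 0x200        # first block of the checkpoint area

block_size = 1 << LOG_BLOCKSIZE
segment_size = block_size << LOG_BLOCKS_PER_SEG
cp_offset = CP_BLKADDR * block_size

print(hex(segment_size))                # 0x200000 (2 MB)
print(hex(cp_offset))                   # 0x200000
print(cp_offset % segment_size == 0)    # True: segment-aligned
# Assumed position of the second checkpoint, one segment further on.
print(hex(cp_offset + segment_size))    # 0x400000
```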
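3.2 The file 1.txt
The index (inode) for the whole 1.txt file runs from address 0x01201000 to 0x01202000, which is 0x1000 bytes = 4 KB
A question to keep in mind: why is the file content "hello world" stored in the inode block rather than in a data block?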
01201000 b6 81 00 0b 00 00 00 00 00 00 00 00 01 00 00 00 |................|
01201010 0c 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
01201020 4f 26 70 5e 00 00 00 00 57 26 70 5e 00 00 00 00 |O&p^....W&p^....|
01201030 57 26 70 5e 00 00 00 00 a2 f4 2a 02 9e 08 d3 1d |W&p^......*.....|
01201040 9e 08 d3 1d af 1e c5 19 00 00 00 00 00 00 00 00 |................|
01201050 00 00 00 00 03 00 00 00 05 00 00 00 31 2e 74 78 |............1.tx|
01201060 74 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |t...............|
01201070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01201160 00 00 00 00 00 00 00 00 00 00 00 00 68 65 6c 6c |............hell|
01201170 6f 20 77 6f 72 6c 64 0a 00 00 00 00 00 00 00 00 |o world.........|
01201180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01201f00 00 00 00 00 00 00 00 00 00 00 00 00 11 20 f5 f2 |............. ..|
01201f10 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
01201f20 00 00 00 00 06 07 18 00 73 65 6c 69 6e 75 78 75 |........selinuxu|
01201f30 3a 6f 62 6a 65 63 74 5f 72 3a 75 6e 6c 61 62 65 |:object_r:unlabe|
01201f40 6c 65 64 3a 73 30 00 00 00 00 00 00 00 00 00 00 |led:s0..........|
01201f50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
01201fe0 00 00 00 00 00 00 00 00 04 00 00 00 04 00 00 00 |................|
01201ff0 01 00 00 00 3a 51 5b 23 21 79 00 61 02 12 00 00 |....:Q[#!y.a....|
01202000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
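The inode block above can be picked apart the same way. The sketch below decodes the first fields and the 24-byte node footer at the end of the block. The offsets (i_pino at byte 84, i_namelen at 88, i_name at 92, footer in the last 24 bytes) are my reading of `struct f2fs_inode` and `struct node_footer` in the kernel headers, so treat them as assumptions. The hex strings are copied from the dump above.

```python
import struct


def unhex(text):
    """Turn a whitespace-separated hex dump into bytes."""
    return bytes.fromhex("".join(text.split()))


# Bytes 0x01201000..0x0120106f of the image.
head = unhex("""
b6 81 00 0b 00 00 00 00 00 00 00 00 01 00 00 00
0c 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
4f 26 70 5e 00 00 00 00 57 26 70 5e 00 00 00 00
57 26 70 5e 00 00 00 00 a2 f4 2a 02 9e 08 d3 1d
9e 08 d3 1d af 1e c5 19 00 00 00 00 00 00 00 00
00 00 00 00 03 00 00 00 05 00 00 00 31 2e 74 78
74 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
""")
# Last 24 bytes of the block, 0x01201fe8..0x01201fff.
footer = unhex("04 00 00 00 04 00 00 00 01 00 00 00 "
               "3a 51 5b 23 21 79 00 61 02 12 00 00")

# i_mode, i_advise, i_inline, i_uid, i_gid, i_links, i_size, i_blocks
mode, advise, inline, uid, gid, links, size, blocks = struct.unpack_from(
    "<HBBIIIQQ", head, 0)
pino, namelen = struct.unpack_from("<II", head, 84)
name = head[92:92 + namelen].decode("ascii")

# nid, ino, flag, cp_ver, next_blkaddr
nid, ino, flag, cp_ver, next_blkaddr = struct.unpack("<IIIQI", footer)

print(hex(mode), hex(inline), size)   # 0x81b6 0xb 12
print(pino, name)                     # 3 1.txt
print(nid, ino, hex(next_blkaddr))    # 4 4 0x1202
```

In the footer, nid equals ino, which is how a node block identifies itself as an inode. The values agree with what dump.f2fs prints for the same inode in section 4.3.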
3.3
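The last line of the dump shows the largest offset, 0x06400000 = 100 MB (100 × 1024 × 1024 bytes), which matches the 100 MB block device we created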
*
06400000
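4. Analysis with dump.f2fs
Besides reading the raw hexdump to work out the f2fs on-disk structures, we can also inspect them with the dump.f2fs tool. Note that dump.f2fs is not built by default in the Android source tree. I will write a separate article on how to enable it on Android.
4.1 dump.f2fs usage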
Usage: dump.f2fs [options] device
[options]:
-d debug level [default:0]
-i inode no (hex)
-n [NAT dump nid from #1~#2 (decimal), for all 0~-1]
-s [SIT dump segno from #1~#2 (decimal), for all 0~-1]
-S sparse_mode
-a [SSA dump segno from #1~#2 (decimal), for all 0~-1]
-b blk_addr (in 4KB)
-V print the version number and exit
4.2 dump.f2fs -n
Dump the node address table (NAT)
1|T1_PRO:/sdcard $ dump.f2fs -n 0~-1 f2fs_device
Info: No support kernel version!
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 204800 (100 MB)
Info: MKFS version
"4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
Info: FSCK version
from "4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
to "4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
Info: superblock features = 0 :
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 204800 (100 MB)
Info: CKPT version = 235b513e
Info: checkpoint state = c5 : nat_bits crc compacted_summary unmount
Done.
Look at the generated file dump_nat
T1_PRO:/sdcard $ cat dump_nat
nid: 3 ino: 3 offset: 0 blkaddr: 4098 pack:2
nid: 4 ino: 4 offset: 0 blkaddr: 4609 pack:2
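Note the blkaddr 4609 for nid 4. In hexadecimal that is 0x1201, which should look familiar: the first line of the inode dumped for 1.txt in section 3.2 is at address 0x01201000. So blkaddr 4609 is the location on the storage device of the structure whose nid is 4, and the byte offset is blkaddr × 4 KB. Remember that the Main Area described at the beginning is made up of 4 KB blocks, and this is exactly what we see here.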
01201000 b6 81 00 0b 00 00 00 00 00 00 00 00 01 00 00 00 |................|
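The conversion is just a multiplication, but it is worth writing down because it is used all the time when jumping between dump.f2fs output and hexdump output. A minimal sketch, where MAIN_BLKADDR (0x1000) and the 512 blocks per segment are the values read from this image's superblock and the helper names are made up for the illustration:

```python
BLOCK_SIZE = 4096       # bytes per block
BLOCKS_PER_SEG = 512    # from log_blocks_per_seg = 9 in this image
MAIN_BLKADDR = 0x1000   # first block of the Main Area in this image


def blk_to_offset(blkaddr):
    """Byte offset in the image of a given block address."""
    return blkaddr * BLOCK_SIZE


def main_segment(blkaddr):
    """(segment number, block index inside it) relative to the Main Area."""
    return divmod(blkaddr - MAIN_BLKADDR, BLOCKS_PER_SEG)


for nid, blkaddr in [(3, 4098), (4, 4609)]:   # entries from dump_nat
    print(nid, hex(blk_to_offset(blkaddr)), main_segment(blkaddr))
# 3 0x1002000 (0, 2)
# 4 0x1201000 (1, 1)
```

With these numbers, the root directory inode (nid 3) and the inode of 1.txt (nid 4) land in two different segments of the Main Area.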
4.3 dump.f2fs -i
Dump the inode structure that belongs to an inode number
T1_PRO:/sdcard $ dump.f2fs -i 4 f2fs_device
Info: No support kernel version!
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 204800 (100 MB)
Info: MKFS version
"4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
Info: FSCK version
from "4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
to "4.14.117+ #1 SMP PREEMPT Mon Mar 9 21:37:48 CST 2020"
Info: superblock features = 0 :
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 204800 (100 MB)
Info: CKPT version = 235b513e
[print_node_info: 275] Node ID [0x4:4] is inode
i_mode [0x 81b6 : 33206]
i_advise [0x 0 : 0]
i_uid [0x 0 : 0]
i_gid [0x 0 : 0]
i_links [0x 1 : 1]
i_size [0x c : 12]
i_blocks [0x 1 : 1]
i_atime [0x5e70264f : 1584408143]
i_atime_nsec [0x 22af4a2 : 36369570]
i_ctime [0x5e702657 : 1584408151]
i_ctime_nsec [0x1dd3089e : 500369566]
i_mtime [0x5e702657 : 1584408151]
i_mtime_nsec [0x1dd3089e : 500369566]
i_generation [0x19c51eaf : 432348847]
i_current_depth [0x 0 : 0]
i_xattr_nid [0x 0 : 0]
i_flags [0x 0 : 0]
i_inline [0x b : 11]
i_pino [0x 3 : 3]
i_dir_level [0x 0 : 0]
i_namelen [0x 5 : 5]
i_name [1.txt]
i_ext: fofs:0 blkaddr:0 len:0
i_addr[ofs] [0x 0 : 0]
i_addr[ofs + 1] [0x6c6c6568 : 1819043176]
i_addr[ofs + 2] [0x6f77206f : 1870078063]
i_addr[ofs + 3] [0x a646c72 : 174353522]
i_addr[0x3] points data block [0xa646c72]
i_nid[0] [0x 0 : 0]
i_nid[1] [0x 0 : 0]
i_nid[2] [0x 0 : 0]
i_nid[3] [0x 0 : 0]
i_nid[4] [0x 0 : 0]
xattr: e_name_index:6 e_name:selinux e_name_len:7 e_value_size:24 e_value:
753A6F626A6563745F723A756E6C6162656C65643A733000
Do you want to dump this file into ./lost_found/? [Y/N] y
Info: checkpoint state = c5 : nat_bits crc compacted_summary unmount
Done.
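If you answer y, 1.txt is dumped into the ./lost_found/ directory
It looks like this command could be used later on to rebuild files directly from the raw data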
Do you want to dump this file into ./lost_found/? [Y/N] y
T1_PRO:/sdcard $ cat ./lost_found/1.txt
hello world
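A few of the fields printed by dump.f2fs are easier to read once decoded. The sketch below interprets i_mode, i_inline and i_atime from the output above. The i_mode and timestamp decoding use only the Python standard library. The i_inline flag values are quoted from memory of the kernel's f2fs header (F2FS_INLINE_XATTR, F2FS_INLINE_DATA, F2FS_INLINE_DENTRY, F2FS_DATA_EXIST), so please verify them against your kernel source.

```python
import stat
from datetime import datetime, timezone

I_MODE = 0x81B6        # i_mode from the dump.f2fs output
I_INLINE = 0x0B        # i_inline
I_ATIME = 0x5E70264F   # i_atime, seconds since the epoch

# Assumed flag values, taken from the kernel's f2fs header.
INLINE_FLAGS = {
    0x01: "INLINE_XATTR",
    0x02: "INLINE_DATA",
    0x04: "INLINE_DENTRY",
    0x08: "DATA_EXIST",
}

print(stat.S_ISREG(I_MODE), stat.filemode(I_MODE))
# True -rw-rw-rw-
print([name for bit, name in INLINE_FLAGS.items() if I_INLINE & bit])
# ['INLINE_XATTR', 'INLINE_DATA', 'DATA_EXIST']
print(datetime.fromtimestamp(I_ATIME, tz=timezone.utc).isoformat())
# 2020-03-17T01:22:23+00:00
```

If those flag values are right, i_inline = 0xb says that this inode carries inline xattrs (the selinux label seen in the dump) and inline data.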
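Look at the inode's i_addr values. Don't these numbers look familiar too? Read as little-endian bytes they are simply hello world. So why does i_addr store hello world directly instead of pointing at data blocks? Because F2FS supports inline data (data stored directly in the inode) for small files of up to about 3.4 KB, which saves storage space and improves access performance for the many small files found on Android.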
i_addr[ofs + 1] [0x6c6c6568 : 1819043176]
i_addr[ofs + 2] [0x6f77206f : 1870078063]
i_addr[ofs + 3] [0x a646c72 : 174353522]
01201160 00 00 00 00 00 00 00 00 00 00 00 00 68 65 6c 6c |............hell|
01201170 6f 20 77 6f 72 6c 64 0a 00 00 00 00 00 00 00 00 |o world.........|
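To see that the three i_addr words really are the file content, pack them back into little-endian bytes. The second half of the sketch estimates the inline capacity from the hexdump in section 3.2: the text starts at offset 0x16c inside the inode block and the xattr magic (11 20 f5 f2) appears at offset 0xf0c. Treating that magic as the start of the inline xattr area, and therefore the end of the inline data area, is my assumption.

```python
import struct

# i_addr[ofs + 1..3] as printed by dump.f2fs
words = [0x6C6C6568, 0x6F77206F, 0x0A646C72]

data = struct.pack("<3I", *words)   # little-endian 32-bit words
print(data)                         # b'hello world\n'
print(len(data))                    # 12, the same as i_size

# Offsets inside the 4 KB inode block, read from the hexdump in 3.2.
INLINE_DATA_START = 0x16C    # where "hell" begins
INLINE_XATTR_START = 0xF0C   # assumed start of the inline xattr area

capacity = INLINE_XATTR_START - INLINE_DATA_START
print(capacity, capacity / 1024)    # 3488 3.40625
```

The result of 3488 bytes lines up with the "about 3.4 KB" limit for inline data mentioned above.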
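5. Summary
This little bit is nowhere near enough to understand the f2fs on-disk structures. So why start a study of a file system with its storage layout? Because much of the file system code is written around that layout. As I see it, a file system is really the translator and manager of the raw data on a block device.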