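Preface

When I copied files from a Mac to a Windows machine in the past, an extra file would show up in the folders: the .DS_Store file this post is about. I paid no attention to it at the time.

Yesterday, while writing a feature that reads files as streams, parsing kept failing. It turned out that a .DS_Store file had appeared in the folder, and the code that walks the folder tripped over it. I won't go through the debugging process here.

That made me curious about the .DS_Store file, so I am writing down and sharing what I found.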

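Main text

Introduction

.DS_Store, short for Desktop Services Store, is a hidden file that macOS uses to store a folder's custom attributes, such as icon positions, view settings, and background color. It is roughly the counterpart of desktop.ini on Windows. By default, a .DS_Store file is placed inside each folder.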

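Parsing

Opening the file in a text editor shows only garbled characters, because it is a binary file. Let's try to decode it.

One option is Notepad++: add the Hex-Editor plugin under Plugins > Plugins Admin. With the plugin installed, Notepad++ can open a file as hexadecimal.

Opening .DS_Store in Notepad++ with this plugin shows its hexadecimal content.

Another option is VSCode with the hexdump for vscode extension, which also displays the file in hexadecimal.

I will skip the installation steps for both; related articles cover them if you are interested.

Here we open the file with VSCode and the extension, as shown in the figure below, and try to interpret the hexadecimal data.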
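
If no hex editor is at hand, a few lines of Java give a similar view. This is only a sketch and not how either plugin works internally; the class name HexDumpSketch, the 16-bytes-per-row layout, and the 64-byte limit are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class HexDumpSketch {

    /** Formats bytes as rows of 16: offset, hex column, printable-ASCII column. */
    static String hexDump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int row = 0; row < data.length; row += 16) {
            StringBuilder hex = new StringBuilder();
            StringBuilder ascii = new StringBuilder();
            for (int i = row; i < Math.min(row + 16, data.length); i++) {
                int b = data[i] & 0xFF; // treat the byte as unsigned
                hex.append(String.format("%02x ", b));
                ascii.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
            }
            out.append(String.format("%08x  %-48s %s%n", row, hex, ascii));
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data;
        if (args.length > 0) {
            // Dump at most the first 64 bytes of the file given on the command line.
            byte[] all = Files.readAllBytes(Paths.get(args[0]));
            data = Arrays.copyOf(all, Math.min(all.length, 64));
        } else {
            // No file given: dump the eight magic bytes that readHeader checks for.
            data = new byte[] {0, 0, 0, 1, 'B', 'u', 'd', '1'};
        }
        System.out.print(hexDump(data));
    }
}
```

Run it with the path of a .DS_Store file as the first argument. A valid file should start with 00 00 00 01 followed by the ASCII characters Bud1.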

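For a description of the file's binary structure, we can refer to the article "Parsing the .DS_Store file format".

According to that article, once the .DS_Store file is read as a tree of binary blocks, its structure is roughly:

File header (Header)
Root block
- Offsets
- Table of contents (Toc)
- Free list (FreeList)

File header: the header is used for validation, that is, to decide whether the file really is a .DS_Store file (see the readHeader method in the code).

Offsets: this part records the offsets of the tree's (leaf) blocks in the file. Those blocks hold the actual directory information, such as file names. Reaching them requires traversing the tree (see the readOffsets method in the code).

Table of contents: the table of contents follows directly after the offsets. It usually holds a single entry named DSDB with the value 1. This entry refers to the ID of the first block we will traverse (see the readTOC method in the code).

Free list: the last part is the free list, which records the blocks in the tree that are unused or free (see the readFreelist method in the code).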
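
To make the header layout concrete, here is a minimal sketch that builds a synthetic 36-byte header and validates it. The layout (two magic values, root offset, root size, the offset repeated, then 16 unknown bytes) is taken from the readHeader method in the full listing; the class name HeaderSketch and the sample offset and size values are made up for illustration.

```java
import java.nio.ByteBuffer;

class HeaderSketch {
    static final int MAGIC1 = 0x00000001;
    static final int MAGIC2 = 0x42756431; // ASCII "Bud1"
    static final int HEADER_LENGTH = 36;  // 5 ints + 16 unknown bytes

    /** Returns {rootOffset, rootSize}; throws if the bytes do not look like a .DS_Store header. */
    static int[] parseHeader(byte[] data) {
        if (data.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("too short for a header");
        }
        ByteBuffer buf = ByteBuffer.wrap(data); // ByteBuffer is big-endian by default
        int magic1 = buf.getInt();
        int magic2 = buf.getInt();
        if (magic1 != MAGIC1 || magic2 != MAGIC2) {
            throw new IllegalArgumentException("magic bytes do not match");
        }
        int offset = buf.getInt();
        int size = buf.getInt();
        int offsetCopy = buf.getInt();
        if (offset != offsetCopy) {
            throw new IllegalArgumentException("the two root offsets differ");
        }
        return new int[] {offset, size};
    }

    public static void main(String[] args) {
        // Synthetic header built only for this example; the remaining 16 bytes stay zero.
        ByteBuffer sample = ByteBuffer.allocate(HEADER_LENGTH);
        sample.putInt(MAGIC1).putInt(MAGIC2).putInt(0x2000).putInt(0x800).putInt(0x2000);
        int[] root = parseHeader(sample.array());
        System.out.println("root offset=" + root[0] + " size=" + root[1]);
        // prints "root offset=8192 size=2048"
    }
}
```

The sketch rejects the file when either magic value is wrong, since both must be present for the data to be a .DS_Store file.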

Next we parse this binary structure in Java. The code is as follows:

// DataBlock.java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DataBlock {
    private byte[] data;
    private int pos;
    private boolean debug;

    /**
     * Returns a byte array of the given length, read from data at the given offset or at pos.
     * If offset==0 (no offset is given) , pos will be increased by length.
     * Throws Exception if offset+length > this.data.length
     * @Params: [length, offset]
     * @Return: byte[]
     */
    public byte[] offsetRead(int length,int offset){
        int offsetPosition;
        if(offset==0){
            offsetPosition = this.pos;
        }else{
            offsetPosition = offset;
        }
        if(this.data.length < offsetPosition +length){
            throw new RuntimeException("Offset+Length > this.data.length");
        }
        if(offset==0){
            this.pos+=length;
        }
        byte[] value =new byte[length];
        System.arraycopy(this.data,offsetPosition,value,0,length);
        if(debug){
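            // Arrays.toString prints the bytes; formatting the array directly would only print its identity hash.
            System.out.println(String.format("Reading: %s-%s => %s",offsetPosition, offsetPosition+length, java.util.Arrays.toString(value)));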
        }
        return value;
    }

    /**
     * Increases pos by length without reading data!
     * @Params: [length]
     * @Return: void
     */
    public void skip(int length){
        this.pos+=length;
    }

    /**
     * Extracts a file name from the current position.
     * @Params: []
     * @Return: java.lang.String
     */
    public String readFileName(){
        //The length of the file name in bytes.
        int length = ByteBuffer.wrap(offsetRead(4,0)).getInt();
        //The file name in UTF-16, which is two bytes per character.
        String fileName = new String(offsetRead(2 * length,0), StandardCharsets.UTF_16BE);
        //A structure ID that I haven't found any use of.
        int structureId = ByteBuffer.wrap(offsetRead(4,0)).getInt();
        //Now read the structure type as a string of four characters and decode it to ascii.
        String structureType = new String(offsetRead(4,0), StandardCharsets.US_ASCII);
        if(debug){
            System.out.println("Structure type "+ structureType);
        }
        //If we don't find a match, skip stays < 0 and we will do some magic to find the right skip due to somehow broken .DS_Store files..
        int skip = -1;
        //Source: http://search.cpan.org/~wiml/Mac-Finder-DSStore/DSStoreFormat.pod
        while (skip < 0){
            if(structureType.equals("bool")){
                skip = 1;
            }else if(structureType.equals("type") || structureType.equals("long")  || structureType.equals("shor")
            || structureType.equals("fwsw") || structureType.equals("fwvh") || structureType.equals("icvt")
            || structureType.equals("lsvt") || structureType.equals("vSrn") || structureType.equals("vstl")){
                skip = 4;
            }else if(structureType.equals("comp") || structureType.equals("dutc") || structureType.equals("icgo")
            || structureType.equals("icsp") || structureType.equals("logS") || structureType.equals("lg1S")
            || structureType.equals("lssp") || structureType.equals("modD") || structureType.equals("moDD")
            || structureType.equals("phyS") || structureType.equals("ph1S")){
                skip = 8;
            }else if(structureType.equals("blob")){
                skip = ByteBuffer.wrap(offsetRead(4,0)).getInt();
            }else if(structureType.equals("ustr") || structureType.equals("cmmt") || structureType.equals("extn")
            || structureType.equals("GRP0")){
                skip = 2 * ByteBuffer.wrap(offsetRead(4,0)).getInt();
            }else if(structureType.equals("BKGD")){
                skip = 12;
            }else if(structureType.equals("ICVO") || structureType.equals("LSVO") || structureType.equals("dscl")){
                skip = 1;
            }else if(structureType.equals("Iloc") || structureType.equals("fwi0")){
                skip = 16;
            }else if(structureType.equals("dilc")){
                skip = 32;
            }else if(structureType.equals("lsvo")){
                skip = 76;
            }else if(structureType.equals("icvo")){

            }else if(structureType.equals("info")){

            }else {

            }
            if(skip <= 0){
                //We somehow didn't find a matching type. Maybe this file name's length value is broken. Try to fix it!
                //This is a bit voodoo and probably not the nicest way. Beware, here be dragons!
                if(debug){
                    System.out.println("Re-reading!");
                }
                // Rewind 8 bytes, so that we can re-read structure_id and structure_type
                skip(-1 * 2 * 0x4);
                // Append the next UTF-16 code unit to the name instead of replacing the name read so far.
                fileName += new String(offsetRead(0x2,0), StandardCharsets.UTF_16BE);
                //re-read structure_id and structure_type
                structureId = ByteBuffer.wrap(offsetRead(4,0)).getInt();
                structureType = new String(offsetRead(4,0), StandardCharsets.US_ASCII);
                //Look-ahead and check if we have  structure_type==Iloc followed by blob.
                //If so, we're interested in blob, not Iloc. Otherwise continue!
                String futureStructureType = new String(offsetRead(4,this.pos), StandardCharsets.US_ASCII);
                if(debug){
                    System.out.println(String.format("Re-read structure_id %s / structure_type %s",structureId, structureType));
                }
                if ((!structureType.equals("blob")) && (!futureStructureType.equals("blob"))){
                    structureType = "";
                    if(debug){
                        System.out.println("Forcing another round!");
                    }
                }
            }
        }
        // Skip bytes until the next (file name) block
        skip(skip);
        if(debug){
            System.out.println(String.format("Filename %s",fileName));
        }
        return fileName;
    }

    public DataBlock() {
    }

    public DataBlock(byte[] data, int pos, boolean debug) {
        this.data = data;
        this.pos = pos;
        this.debug = debug;
    }

    public byte[] getData() {
        return data;
    }

    public void setData(byte[] data) {
        this.data = data;
    }

    public int getPos() {
        return pos;
    }

    public void setPos(int pos) {
        this.pos = pos;
    }

    public boolean isDebug() {
        return debug;
    }

    public void setDebug(boolean debug) {
        this.debug = debug;
    }
}
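
// DS_Store.java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DS_Store extends DataBlock{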
    private byte[] data;
    private boolean debug;

    private DataBlock root;
    private List<Integer> offsets;
    private Map<String,Integer> toc;
    private Map<Integer,List<Integer>> freeList;

    /**
     * Constructor of DS_Store
     * @Params: [data, debug]
     * @Return:
     */
    public DS_Store(byte[] data, boolean debug) {
        super(data,0,debug);
        this.data = data;
        this.debug = debug;
        this.root = readHeader();
        this.offsets = readOffsets();
        this.toc = readTOC();
        this.freeList = readFreelist();
    }

    /**
     * Checks if this.data is actually a .DS_Store file by checking the magic bytes.
     * It returns the file's root block.
     * @Params: []
     * @Return: com.zwt.framework.utils.util.dsstore.DataBlock
     */
    private DataBlock readHeader(){

        // We read at least 32+4 bytes for the header!
        if (this.data.length < 36){
            throw new RuntimeException("Length of data is too short!");
        }
        // Check the magic bytes for .DS_Store
        int magic1 = ByteBuffer.wrap(this.offsetRead(4,0)).getInt();
        int magic2 = ByteBuffer.wrap(this.offsetRead(4,0)).getInt();
        // Both values must match, so reject the file if either one differs.
        if(magic1 != 0x1 || magic2 != 0x42756431){
            throw new RuntimeException("Magic bytes do not match!");
        }
        // After the magic bytes, the offset follows two times with block's size in between.
        // Both offsets have to match and are the starting point of the root block
        int offset = ByteBuffer.wrap(this.offsetRead(4,0)).getInt();
        int size = ByteBuffer.wrap(this.offsetRead(4,0)).getInt();
        int offset2 = ByteBuffer.wrap(this.offsetRead(4,0)).getInt();
        if(debug){
            System.out.println(String.format("Offset 1: %s",offset));
            System.out.println(String.format("Size: %s",size));
            System.out.println(String.format("Offset 2: %s",offset2));
        }
        if(offset!=offset2){
            throw new RuntimeException("Offsets do not match!");
        }
        //Skip 16 bytes of unknown data...
        skip(4*4);

        return new DataBlock(this.offsetRead(size, offset+4),0, this.debug);
    }

    /**
     * Reads the offsets which follow the header
     * @Params: []
     * @Return: java.util.List<java.lang.Integer>
     */
    private List<Integer> readOffsets(){

        int startPos = this.root.getPos();
        // First get the number of offsets in this file.
        int count = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
        if(debug){
            System.out.println(String.format("Offset count: %s",count));
        }
        // Always appears to be zero!
        this.root.skip(4);

        // Iterate over the offsets and get the offset addresses.
        List<Integer> offsets = new ArrayList<>();
        for(int i=0;i<count;i++){
            // Address of the offset.
            int address = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
            if(debug){
                System.out.println(String.format("Offset %s is %s",i, address));
            }
            if (address == 0){
                // We're only interested in non-zero values
                continue;
            }
            offsets.add(address);
        }

        // Calculate the end of the address space (filled with zeroes) instead of dumbly reading zero values...
        int sectionEnd = startPos + (count / 256 + 1) * 256 * 4 - count*4;

        // Skip to the end of the section
        this.root.skip(sectionEnd);
        if(debug){
            System.out.println(String.format("Skipped %s bytes to %s",sectionEnd,this.root.getPos()));
            System.out.println(String.format("Offsets: %s",offsets));
        }
        return offsets;
    }

    /**
     * Reads the table of contents (TOCs) from the file.
     * @Params: []
     * @Return: java.util.Map<java.lang.String,java.lang.Integer>
     */
    private Map<String,Integer> readTOC(){
        if(debug){
            System.out.println(String.format("POS %s",this.root.getPos()));
        }
        // First get the number of ToC entries.
        int count = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
        if(debug){
            System.out.println(String.format("Toc count: %s",count));
        }

        Map<String,Integer> toc = new HashMap<>();
        // Iterate over all ToCs
        for(int i=0;i<count;i++){
            // Get the length of a ToC's name
            // Mask with 0xFF so the length byte is read as unsigned.
            int tocLen = this.root.offsetRead(1,0)[0] & 0xFF;
            // Read the ToC's name
            String tocName = new String(this.root.offsetRead(tocLen,0), StandardCharsets.UTF_8);
            // Read the address (block id) in the data section
            int blockId = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
            // Add all values to the dictionary
            toc.put(tocName,blockId);
        }
        if(debug){
            System.out.println(String.format("Toc %s",toc));
        }
        return toc;
    }

    /**
     * Read the free list from the header.
     * The free list has n=0..31 buckets with the index 2^n
     * @Params: []
     * @Return: java.util.Map<java.lang.Integer,java.util.List<java.lang.Integer>>
     */
    private Map<Integer,List<Integer>> readFreelist(){
        Map<Integer,List<Integer>> freelist = new HashMap<>();
        for(int i=0;i<32;i++){
            freelist.put(1<<i,new ArrayList<>());
            // Read the amount of blocks in the specific free list.
            int blkcount = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
            for(int j=0;j<blkcount;j++){
                // Read blkcount block offsets.
                int freeOffset = ByteBuffer.wrap(this.root.offsetRead(4,0)).getInt();
                freelist.get(1<<i).add(freeOffset);
            }
        }
        if(debug){
            System.out.println(String.format("Freelist: %s",freelist));
        }
        return freelist;
    }


    /**
     * Create a DataBlock from a given block ID (e.g. from the ToC)
     * @Params: [blockId]
     * @Return: com.zwt.framework.utils.util.dsstore.DataBlock
     */
    public DataBlock blockById(int blockId){
        // First check if the block_id is within the offsets range
        if(blockId < 0 || blockId >= this.offsets.size()){
            throw new RuntimeException("BlockID out of range!");
        }

        // Get the address of the block
        int addr = this.offsets.get(blockId);

        // Do some necessary bit operations to extract the offset and the size of the block.
        // The address without the last 5 bits is the offset in the file
        int offset = addr >> 0x5 << 0x5;
        //The address' last five bits are the block's size.
        int size = 1 << (addr & 0x1f);
        if(debug){
            System.out.println(String.format("New block: addr %s offset %s size %s",addr, offset + 0x4, size));
        }
        // Return the new block
        return new DataBlock(this.offsetRead(size, offset + 0x4),0, this.debug);
    }

    /**
     * Traverses a block identified by the given block_id and extracts the file names.
     * @Params: [blockId]
     * @Return: java.util.List<java.lang.String>
     */
    public List<String> traverse(int blockId){
        // Get the responsible block by its ID
        DataBlock node = this.blockById(blockId);
        // Extract the pointer to the next block
        int nextPointer =  ByteBuffer.wrap(node.offsetRead(4,0)).getInt();
        // Get the number of next blocks or records
        int count =  ByteBuffer.wrap(node.offsetRead(4,0)).getInt();
        if(debug){
            System.out.println(String.format("Next Ptr %s with %s ",nextPointer,count));
        }
        List<String> filenames =new ArrayList<>();

        // If a next_pointer exists (>0), iterate through the next blocks recursively
        // If not, we extract all file names from the current block
        if(nextPointer > 0){
            for(int i=0;i<count;i++){
                // Get the block_id for the next block
                int nextId =  ByteBuffer.wrap(node.offsetRead(4,0)).getInt();
                if(debug){
                    System.out.println(String.format("Child: %s",nextId));
                }
                // Traverse it recursively
                List<String>  files = this.traverse(nextId);
                filenames.addAll(files);
                // Also get the filename for the current block.
                String filename = node.readFileName();
                if(debug){
                    System.out.println(String.format("Filename: %s", filename));
                }
                filenames.add(filename);
            }
            // Now that we traversed all children of the next_pointer, traverse the pointer itself.
            // TODO: Check if that is really necessary as the last child should be the current node... (or so?)
            List<String> files = this.traverse(nextPointer);
            filenames.addAll(files);
        }else{
            // We're probably in a leaf node, so extract the file names.
            for(int i=0;i<count;i++){
                String f = node.readFileName();
                filenames.add(f);
            }
        }
        return filenames;
    }


    /**
     * Traverse from the root block and extract all file names.
     * @Params: []
     * @Return: java.util.List<java.lang.String>
     */
    public List<String> traverseRoot(){
        // Get the root block from the ToC 'DSDB'
        DataBlock root = this.blockById(this.toc.get("DSDB"));
        // Read the following root block's ID, so that we can traverse it.
        int rootId =  ByteBuffer.wrap(root.offsetRead(4,0)).getInt();
        if(debug){
            System.out.println(String.format("Root-ID %s", rootId));
        }
        // Read other values that might be useful, but that we're not interested in... (at least right now)
        int internalBlockCount =  ByteBuffer.wrap(root.offsetRead(4,0)).getInt();
        int recordCount =  ByteBuffer.wrap(root.offsetRead(4,0)).getInt();
        int blockCount =  ByteBuffer.wrap(root.offsetRead(4,0)).getInt();
        int unknown =  ByteBuffer.wrap(root.offsetRead(4,0)).getInt();
        // traverse from the extracted root block id.
        return this.traverse(rootId);
    }


    public DS_Store() {
    }

    public byte[] getData() {
        return data;
    }

    public void setData(byte[] data) {
        this.data = data;
    }

    public boolean isDebug() {
        return debug;
    }

    public void setDebug(boolean debug) {
        this.debug = debug;
    }
}
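
// DS_StoreParser.java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;

public class DS_StoreParser {
    /**
     * Reads a .DS_Store file and returns its bytes.
     * Throws an exception if the file does not exist
     * or if reading fails.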
     * @Params: [fileName]
     * @Return: byte[]
     */
    public static byte[] readFile(String fileName){
        File file = new File(fileName);
        if((!file.exists())||(!file.isFile())){
            throw new RuntimeException(".DS_Store file does not exist!");
        }
        try (FileInputStream fis = new FileInputStream(file);
             ByteArrayOutputStream bos = new ByteArrayOutputStream()){
            byte[] b = new byte[1024];
            int len;
            while((len = fis.read(b)) != -1) {
                bos.write(b, 0, len);
            }
            return bos.toByteArray();
        }catch (IOException e){
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new RuntimeException("Error reading .DS_Store file!", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = readFile("/Users/zhangwentong/Desktop/DS_Store/bak.DS_Store");
        DS_Store store = new DS_Store(data,true);

        List<String> files = store.traverseRoot();
        System.out.println("Count: "+ files.size());
        for(int i=0;i<files.size();i++){
            System.out.println(files.get(i));
        }
    }

}

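The code above is fairly involved. If you are interested, look at the references at the end first to get familiar with the .DS_Store file structure; the code is much easier to follow after that.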
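
One step worth isolating is how blockById turns an entry of the offsets table into a position and a size: the low 5 bits of the address hold the base-2 logarithm of the block size, and the remaining bits are the offset. The sketch below restates only that arithmetic; the class name BlockAddressSketch and the sample address 0x200b are hypothetical.

```java
class BlockAddressSketch {

    /** Offset of the block: the address with its low 5 bits cleared. */
    static int offsetOf(int address) {
        return address & ~0x1f; // same result as address >> 5 << 5
    }

    /** Size of the block in bytes: 2 raised to the value stored in the low 5 bits. */
    static int sizeOf(int address) {
        return 1 << (address & 0x1f);
    }

    public static void main(String[] args) {
        int address = 0x200b; // made-up sample: offset 0x2000, size exponent 11
        System.out.println("offset=" + offsetOf(address) + " size=" + sizeOf(address));
        // prints "offset=8192 size=2048"
    }
}
```

Note that the listing adds 4 to this offset before reading (offset + 0x4 in blockById and readHeader), so the block in this example would be read starting at byte 8196 of the file.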

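Running the code produces the following output:

The output contains file and directory names: the names of everything that was in the directory the .DS_Store file came from. This can lead to security problems.

What kind of security problems? Let's take a look.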

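Information security issues

When this file ends up on a web server, it can do real harm.

The harm comes from the file names it contains. macOS creates a .DS_Store file in almost every folder.

Information leakage (sensitive files):

See https://en.internetwache.org/scanning-the-alexa-top-1m-for-ds-store-files-12-03-2018/ for details. In the project described there, Internetwache.org scanned the root directories of the Alexa Top 1M websites and showed that some of them do expose this file, which leaks information. By parsing the file they found database backups, configuration files, cache files, and even secret keys.

A common cause is that .DS_Store gets committed while collaborating through Git, and the project is then deployed as-is.

PS: the file names stored in .DS_Store only reflect the directory contents on the local macOS machine. This means some files in the parsed list may not exist on the server or machine being examined.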

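Solutions

Even so, when transferring or uploading files from macOS we should try to avoid producing this file.

Besides recording directory information, the file can break programs that walk a directory and read every file (as I described at the start), and in version control it can cause unnecessary conflicts.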
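
For the traversal problem, the simplest defence is to filter the file out before parsing anything. The following is a minimal sketch, assuming the program only needs the regular files directly inside one directory; the class name SkipDsStoreSketch and the method listDataFiles are hypothetical and are not the code from my original feature.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SkipDsStoreSketch {

    /** Lists the regular files directly inside dir, leaving out .DS_Store. */
    static List<Path> listDataFiles(Path dir) throws IOException {
        // Files.list holds an open directory handle, so close it with try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> !p.getFileName().toString().equals(".DS_Store"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Uses the directory given as the first argument, or the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : listDataFiles(dir)) {
            System.out.println(file);
        }
    }
}
```

Filtering by exact name keeps the rule narrow. A broader alternative would be to skip every hidden file, but that could also drop files the program is supposed to read.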

There are roughly the following ways to deal with .DS_Store files.

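When transferring files, we can usually just delete the .DS_Store files in the directory. For example, when files go from macOS to Windows, these .DS_Store files are junk and serve no purpose.
For files committed to version control, we can add .DS_Store to the .gitignore file so that it is ignored on commit.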
To find out how many .DS_Store files exist on a server (Linux) or on macOS, use the following command.
find . -name '*.DS_Store'
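To delete them, use the following command.
find . -name '*.DS_Store' -type f -delete
These two commands find and delete the .DS_Store files in the current directory and all of its subdirectories.
PS: deleting them on macOS affects things such as icon positions in the affected folders, so take care.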
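We can also stop macOS from creating .DS_Store files on network volumes (network shares) with the following command. The DSDontWriteNetworkStores key covers network stores; for USB drives a separate key, DSDontWriteUSBStores, is commonly reported.
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool TRUE
To allow .DS_Store files to be created there again, use:
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool FALSE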
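To stop .DS_Store files from being created on macOS altogether, there is Asepsis.
Asepsis works by intercepting every creation of or write to a .DS_Store file and redirecting it to /usr/local/.dscage. Finder keeps working as usual, and these useless files no longer clutter the file system.
Unfortunately, starting with OS X 10.11 El Capitan, Apple enabled System Integrity Protection (SIP), which prevents Asepsis from being installed and running properly. The author of Asepsis has dropped further support, because he does not want users to disable a critical system security service just to use the tool.
Still, there are guides online for continuing to use Asepsis, such as the article "禁止.DS_store生成" (Preventing .DS_Store creation), if you are interested.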
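
The same find-and-delete check can be done from Java, for example as a step before deploying a directory. This is a sketch of the idea behind the find commands above, not a replacement for them: it matches the exact name .DS_Store, and the class name FindDsStoreSketch, its two methods, and the --delete argument are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FindDsStoreSketch {

    /** Recursively collects every regular file named .DS_Store under root. */
    static List<Path> findDsStore(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals(".DS_Store"))
                    .collect(Collectors.toList());
        }
    }

    /** Deletes the given files and returns how many were removed. */
    static int deleteAll(List<Path> files) throws IOException {
        int deleted = 0;
        for (Path file : files) {
            Files.delete(file);
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> found = findDsStore(root);
        found.forEach(System.out::println);
        System.out.println("Found: " + found.size());
        // Only delete when explicitly asked to, via the hypothetical --delete argument.
        if (args.length > 1 && args[1].equals("--delete")) {
            System.out.println("Deleted: " + deleteAll(found));
        }
    }
}
```

Listing first and deleting only on request makes it easy to review what would be removed, which matters on macOS where deleting the file resets folder view settings.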

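Summary

In most cases a .DS_Store file is useless and causes no great harm, but we should stay alert to it, especially when it shows up on a server. I decided to look into it only after it caused a bug for me, and parsing it in code was good practice. There are not many articles online about parsing this file, so I am sharing some references below and hope they help you understand it better.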

References

Source code

The code above is available at: GitHub .DS_Store Parser

SakuraTears's Blog

Some Questions About the .DS_Store File
