YAML is a file format, and as with many formats, some of us love it and some of us hate it. It seems to be better than XML in many ways, and perhaps easier to deal with than JSON. It might not be better than CSV/TSV/delimited formats for large transfers, but for many of us, it’s a nice format for configuration items.
While the format felt fairly intuitive to me, and it’s not hard to write, it is quite persnickety about whitespace. That makes a plugin, like the Red Hat YAML extension for VS Code, important for preventing mistakes. As easy as the format is to read and understand, it’s also easy to get the whitespace wrong as you indent and add subkeys.
I was watching an AWS talk, and there was an interesting note about using YAML for control planes and being sure that you have some sort of checksum if you do. Why? Because otherwise you can’t be sure you have the entire file. A YAML file could be truncated in any file transfer, and what remains could still appear to be valid. Hearing that made me realize that those annoying closing tags in XML and closing braces in JSON might have some value.
Those of you who work with YAML, how are you sure you got the entire file? Is there something you’d program in? Do you checksum the file and pass that along? Do you include a required, closing key:value pair of some sort? I don’t, but I might think about doing so in any place where an invalid or incomplete file might cause me problems. This certainly seems like something you’d want in a control file, like one used for Kubernetes.
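To make those two ideas concrete, here is a minimal Python sketch of both checks: a SHA-256 digest passed along with the file, and a required last line. The sentinel name `end_of_file: true`, the helper names, and the sample content are all made up for illustration; nothing here is a YAML or Kubernetes convention.

```python
import hashlib

# Hypothetical sentinel line; the key name is an assumption for illustration.
SENTINEL = "end_of_file: true"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(data).hexdigest()


def digest_matches(data: bytes, expected_digest: str) -> bool:
    """True if the received bytes hash to the digest the sender published."""
    return sha256_hex(data) == expected_digest


def ends_with_sentinel(data: bytes) -> bool:
    """True if the last non-blank line of the file is the agreed sentinel."""
    text = data.decode("utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == SENTINEL


# Sender side: compute the digest and ship it alongside the file.
original = b"replicas: 3\nimage: web:1.4\nend_of_file: true\n"
digest = sha256_hex(original)

# Receiver side: simulate a transfer that was cut off after the first line.
truncated = original[:12]  # b"replicas: 3\n", which is still a valid YAML mapping

print(digest_matches(original, digest))   # True
print(digest_matches(truncated, digest))  # False
print(ends_with_sentinel(original))       # True
print(ends_with_sentinel(truncated))      # False
```

The digest is the stronger of the two, since it also catches corruption in the middle of the file, but it needs a second channel to carry the expected value. The sentinel line is cruder and only catches a cut-off tail, though it travels inside the file itself.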
In most cases, we assume if we can read a file, then we have the complete file. I don’t know of many customers that require some sort of checksum or validation for a file. Certainly, if a CSV or TSV was missing rows, the file might still appear valid to an import process. XML and JSON should have a closing tag or character, so we’d hope we could catch this, but maybe not.
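As a rough illustration of that difference, the sketch below cuts a small JSON document and a small CSV file short and feeds both to Python’s standard parsers. The sample data and helper names are invented for the example.

```python
import csv
import io
import json

# Made-up sample payloads for illustration.
full_json = '{"service": "web", "replicas": 3}'
full_csv = "id,name\n1,alpha\n2,beta\n3,gamma\n"


def json_parses(text: str) -> bool:
    """True if the text is a complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def csv_row_count(text: str) -> int:
    """Number of rows (header included) the csv module reads from the text."""
    return len(list(csv.reader(io.StringIO(text))))


# Simulate a transfer that stopped partway through each file.
cut_json = full_json[:20]  # ends inside the second key, no closing brace
cut_csv = full_csv[:16]    # ends cleanly after the first data row

print(json_parses(full_json))   # True
print(json_parses(cut_json))    # False
print(csv_row_count(full_csv))  # 4
print(csv_row_count(cut_csv))   # 2
```

The truncated JSON fails loudly, while the truncated CSV loads without complaint and simply has fewer rows. The "maybe not" still applies, though: a stream of newline-delimited JSON records that is cut on a line boundary would parse line by line just as quietly as the CSV does.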
Moving data around in files, especially data used to drive processes, should include some error handling. That means having some way to detect whether part of a file is missing. There are ways, but it seems that in many cases we’ve gotten lazy about implementing them in file transfers. Certainly, I don’t see people adding a checksum to their YAML files, which seems like something that we’d want to require.