11.11.2012 at 06:20 #341
An abstract format for encapsulating data while retaining the technical integrity of the original file, in order to create “artifact-like” virtual objects that are regarded the way a similar “IRL” physical object would be.
My proposal starts from this: files can be copied over and over and over, and this is supposed to work. But because of inherent technological limits, essentially the degradation of storage media over time, there exists within the user a fear that keeps them from regarding a virtual file the way they would a physical object.
The virtual object is known to be tricky. Two photos may pass as identical when rendered to a screen, yet the “files” (the abstract representations, built upon layers of computer science history, that somehow encapsulate the 1s and 0s) can look completely different if one inspects the blocks that store the data on the drive or storage medium. A file system (FS) is the lowest level the user can reach within a working UNIX system. UNIX makes clever use of this, and the entire basis of the shell allows programs to be chained into one another through pipes:
yolk$ echo "Hello" | cat -
The file system is where this artifacting must take place. Everything beneath it must be treated as working as intended, since that layer of abstraction is recreated again and again in every complex virtual environment or computer system. Nothing above the file system can function without it, which is why the “artifacting” belongs here.
A file represents a single entity within the system. To the file system, it is just a name. The data attached to that name is opaque to the file system.
A checksum is a hash of the data contained within that file. For a given algorithm it is always the same length, and it is usually written as a string of hexadecimal digits.
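To illustrate the fixed-length property, here is a minimal sketch. It assumes a POSIX shell with either GNU `md5sum` or the BSD/macOS `md5` command installed; the helper name `md5_of_stdin` is hypothetical and only exists for this illustration.

```shell
# Print the MD5 digest of standard input as hex digits.
# Assumption: either GNU md5sum or BSD/macOS md5 is installed.
md5_of_stdin() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  -"; keep only the digest.
        md5sum | cut -d ' ' -f 1
    else
        # BSD/macOS md5 prints only the digest when reading stdin.
        md5
    fi
}

short=$(printf 'a' | md5_of_stdin)
long=$(printf 'a much longer piece of input data' | md5_of_stdin)

# Print the length of each digest; MD5 digests are 32 hex digits
# regardless of how much data went in.
echo "${#short} ${#long}"
```

The two inputs differ in size, but both digests come out at the same length, which is what makes a checksum usable as a uniform file name.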
For an artifact to be created I believe that the filename should be the checksum of the data it represents.
For example, let me take a photo named cat.jpg. To create an artifact, that is, a file whose virtual representation can be immediately identified as valid, we first need to take a checksum of that file.
Several checksum methods exist, including SHA-1 and MD5. MD5 is sufficient for this example, which only needs to detect accidental corruption; it is not collision-resistant, so a stronger hash would be the safer choice against deliberate tampering.
yolk$ md5 cat.jpg
MD5 (cat.jpg) = 3f31e2da1437edee6bf1fbc27f7ecc89
We can now rename this file to that checksum.
yolk$ mv cat.jpg 3f31e2da1437edee6bf1fbc27f7ecc89
Or in one fell swoop:
yolk$ mv cat.jpg `md5 -q cat.jpg`
This works because the file name is the one thing separate from the data it represents (the data includes any metadata, unless the metadata lives within the FS itself, but such attempts have been unsuccessful, e.g. WinFS). When we take a checksum of the file, the result should match the file's own name, so the integrity of the data is instantly and efficiently verified:
yolk$ md5 3f31e2da1437edee6bf1fbc27f7ecc89
MD5 (3f31e2da1437edee6bf1fbc27f7ecc89) = 3f31e2da1437edee6bf1fbc27f7ecc89
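The rename and verify steps above can be sketched as a pair of shell functions. This is only a sketch under stated assumptions: the names `checksum_of`, `make_artifact`, and `verify_artifact` are hypothetical, and it assumes a POSIX shell with either GNU `md5sum` or BSD/macOS `md5` installed.

```shell
# Print the MD5 digest of the file given as $1.
# Assumption: either GNU md5sum or BSD/macOS md5 is installed.
checksum_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | cut -d ' ' -f 1
    else
        md5 -q "$1"
    fi
}

# Rename a file to the checksum of its contents, keeping it in the
# same directory, and print the new path.
# Note: POSIX sh has no "local", so dir and sum are plain variables.
make_artifact() {
    [ -f "$1" ] || return 1
    dir=$(dirname "$1")
    sum=$(checksum_of "$1")
    [ -n "$sum" ] || return 1
    mv -- "$1" "$dir/$sum" && printf '%s\n' "$dir/$sum"
}

# Succeed only if the file's name equals the checksum of its contents.
verify_artifact() {
    [ -f "$1" ] || return 1
    [ "$(basename "$1")" = "$(checksum_of "$1")" ]
}
```

With these loaded, `make_artifact cat.jpg` would perform the rename, and `verify_artifact` on the resulting file returns success while the contents are intact and failure once a single byte has changed.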
The file can still be probed with something like the Unix file command:
yolk$ file 3f31e2da1437edee6bf1fbc27f7ecc89
3f31e2da1437edee6bf1fbc27f7ecc89: JPEG image data, JFIF standard 1.01
It can also still be opened with the proper program or application.
Just some thoughts that I would like people’s input on, please.
- This topic was modified 8 years ago by yolk. Reason: formatting errors
- This topic was modified 8 years ago by yolk. Reason: example of unix pipes
- This topic was modified 8 years ago by yolk. Reason: more tags; why not. link to winfs on wikipedia
- This topic was modified 8 years ago by yolk. Reason: formatting
- This topic was modified 8 years ago by yolk. Reason: added `file` example