Technology Musings
Usenet is a distributed message service that lets users post items and read items posted by others. Typically, an end-user connects to a single Usenet server, either provided by their ISP or purchased from a premium service such as NewsDemon. That server provides a single point of entry for the user, but it exchanges messages with other Usenet servers in a mesh network. A mesh network is similar to the infamous P2P networking, with the added ability for data to flow through multiple hops to reach a destination. In air travel terms, P2P is like taking only direct flights, whereas mesh is like using layovers to reach your destination.
Usenet has evolved since its introduction in 1980 and now serves a purpose far broader than its original one of ASCII communication. With the popularization of binary-to-ASCII conversion (such as uuencoding), it became practical to post files on the text-only messaging system. However, there are size limits on Usenet messages, so a large file may be split across thousands of messages or more.
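To make the binary-to-ASCII idea concrete, here is a minimal Python sketch. It uuencodes a few bytes with the standard library and estimates how many messages a file would need. The function names and the 500,000-byte payload per message are my own assumptions for illustration; real limits vary by server and posting software.

```python
import binascii

def uuencode_lines(data: bytes) -> list[bytes]:
    """Encode binary data as uuencoded ASCII lines (45 input bytes per line)."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]

def uudecode_lines(lines: list[bytes]) -> bytes:
    """Reverse uuencode_lines: decode each ASCII line back to raw bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)

def messages_needed(file_size: int, payload_per_message: int) -> int:
    """Ceiling division: messages needed if each carries payload_per_message bytes."""
    return -(-file_size // payload_per_message)

sample = bytes(range(90))          # arbitrary binary data, including non-printable bytes
encoded = uuencode_lines(sample)   # two ASCII lines, safe for a text-only system
restored = uudecode_lines(encoded)

# Hypothetical numbers: a 700 MiB file at 500,000 payload bytes per message.
count = messages_needed(700 * 1024 * 1024, 500_000)
```

With those assumed numbers, the file needs well over a thousand messages, which is why an index of all the pieces becomes so useful.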
NZB was developed by Newzbin.com as a method of creating a portable index of files on Usenet. The file format is open, XML-based, and free for anyone to use. The specification is fairly simple: an NZB may contain identifiers for multiple files, the messages that make up each file, and any groups those messages were posted to. This information can be read by a newsreader with NZB support, allowing automated download of the files listed in the NZB.
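As a rough illustration of what a newsreader does with an NZB, the sketch below parses a tiny hand-written sample with Python's standard XML parser. The sample content (poster, subject, group, and message IDs) is made up, and the element and attribute names reflect the layout commonly seen in NZB files; treat it as a sketch rather than a complete reader.

```python
import xml.etree.ElementTree as ET

# Namespace commonly declared on the root <nzb> element.
NZB_NS = {"nzb": "http://www.newzbin.com/DTD/2003/nzb"}

# Hypothetical sample; the message IDs and group are invented.
SAMPLE_NZB = """<nzb xmlns="http://www.newzbin.com/DTD/2003/nzb">
  <file poster="poster@example.com" date="1200000000" subject="example.part1.rar (1/2)">
    <groups>
      <group>alt.binaries.example</group>
    </groups>
    <segments>
      <segment bytes="51200" number="2">part2of2.abc@example.com</segment>
      <segment bytes="102400" number="1">part1of2.abc@example.com</segment>
    </segments>
  </file>
</nzb>"""

def parse_nzb(text: str) -> list[dict]:
    """Return one dict per <file>: subject, groups, and segments in posting order."""
    root = ET.fromstring(text)
    files = []
    for file_el in root.findall("nzb:file", NZB_NS):
        segments = [
            (int(seg.get("number")), int(seg.get("bytes")), seg.text.strip())
            for seg in file_el.findall("nzb:segments/nzb:segment", NZB_NS)
        ]
        files.append({
            "subject": file_el.get("subject"),
            "groups": [g.text.strip() for g in file_el.findall("nzb:groups/nzb:group", NZB_NS)],
            "segments": sorted(segments),  # sort by segment number
        })
    return files

files = parse_nzb(SAMPLE_NZB)
message_ids = [msg_id for _, _, msg_id in files[0]["segments"]]
```

The resulting list of message IDs is what a newsreader would request from the server, one article at a time, before stitching the decoded parts back together.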
Where do you get these NZB files? A favorite place of mine is Binsearch.
Due to the mesh-style network of Usenet, it is common for some of the messages that make up a large file to never reach every server. Because of this, the parity volume set archive (known as parchive) was developed. A parchive set consists of both index and recovery files. The index files contain a calculated value (hash) for each file, which is used to verify that the file is valid and not corrupt. If a file is corrupted or missing, the recovery files can be used to fix the data or reconstruct the file.
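The verify-then-repair idea can be sketched in a few lines of Python. To be clear about the assumptions: this uses simple XOR parity, which can rebuild only a single missing block, whereas real PAR2 files use Reed-Solomon coding and can recover several. MD5 stands in as the hash, and the block contents and function names are invented for the example.

```python
import hashlib

def md5_hex(block: bytes) -> str:
    """Hash used as the 'index' entry that verifies a block."""
    return hashlib.md5(block).hexdigest()

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

def recover_missing(received: list, parity: bytes) -> bytes:
    """Rebuild the single missing block (marked None) from the rest plus parity."""
    present = [b for b in received if b is not None]
    return xor_blocks(present + [parity])

# Poster's side: three equal-length blocks, their hashes, and one parity block.
blocks = [b"usenet__", b"parchive", b"recovery"]
index = [md5_hex(b) for b in blocks]
parity = xor_blocks(blocks)

# Downloader's side: block 1 never arrived.
received = [blocks[0], None, blocks[2]]
rebuilt = recover_missing(received, parity)
is_valid = md5_hex(rebuilt) == index[1]
```

The hash check at the end mirrors what the index files are for: after a repair, the rebuilt data is verified against the value recorded by the poster.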
RAR is a proprietary archiving format similar to zip, gzip, and 7z. Although the format is not free to use for compression (it requires a paid license from the author), extraction is implemented in a freely distributed utility called unrar, available for most operating systems. Despite its non-free nature, RAR owes part of its success to its handling of split volumes. A split volume allows a file to be spread across multiple smaller archive files. For example, a 700MB movie file might be archived and split into 50MB volumes. The advantage of smaller volumes is due in part to verification limitations and inherent Usenet deficiencies: a damaged or missing piece affects only one small volume rather than the whole file.
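The arithmetic behind split volumes is simple enough to show in Python. This sketch only cuts raw bytes into fixed-size pieces; it does not produce real RAR volumes, and the function names are my own.

```python
def volume_count(total_size: int, volume_size: int) -> int:
    """Ceiling division: how many volumes a file of total_size needs."""
    return -(-total_size // volume_size)

def split_bytes(data: bytes, volume_size: int) -> list[bytes]:
    """Cut data into consecutive pieces of at most volume_size bytes."""
    return [data[i:i + volume_size] for i in range(0, len(data), volume_size)]

# The example from the text: a 700MB file in 50MB volumes.
volumes_for_movie = volume_count(700, 50)

# A tiny stand-in for a real file, split and then rejoined.
pieces = split_bytes(b"abcdefghij", 4)
rejoined = b"".join(pieces)
```

Only the last piece may be shorter than the others, and joining the pieces in order gives back the original data.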
NFO files are not required in any way, but they often accompany the main files. It’s no coincidence that the extension looks much like the word “info”, as that is the purpose of the file. An NFO file often includes the release details, quality, source, and the group that released the files. The files follow no specific format and are simple ASCII text, though several programs exist for the explicit purpose of displaying them in a friendly way. The information is just as easily viewed in any text editor.
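So how do all these components fit together? It goes something like this: the NZB file tells the newsreader which messages to fetch, the newsreader downloads and decodes them into RAR volumes and parchive files, the parchive files verify (and if necessary repair) the volumes, the volumes are extracted to produce the original file, and the NFO file describes what you just downloaded.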
So, why bother with downloading binaries from Usenet? Speeds are fast, often very fast. Because ISP servers sit relatively close to the end-user, it is often possible to max out the rated line speed with downloads. Although that topology doesn’t always hold for subscription services, speed is a selling point and has become a requirement for successful newsgroup providers.
Unlike with P2P applications, you are not dependent on the popularity of content with other users. This affects not only the speed of a download, but also the availability of a file. If you’re searching for an obscure file, it may be much easier to find on the newsgroups. Many services provide file retention in excess of an entire year. As long as a single person has posted the file within the retention time, anyone can access it.
Another important difference from P2P applications is the relative privacy you have when downloading from Usenet. With BitTorrent, you are sharing with hundreds of users, all of whom must be able to see that you’re transferring the file for the system to work. Usenet, however, uses a client-server model, so other users cannot see what you’re downloading; only you and your provider can. Additionally, for the extra-paranoid, many providers now offer SSL-encrypted connections.
It’s for all these reasons that I find Usenet to be a superior resource for locating and downloading binary files.