You will find an archive of Usenet postings here:
[login to view URL]
There are about 2GB of messages.
The goal of this project is to take these messages and convert them into puffball format. The full description of puffball format can be found here:
[login to view URL]
The content of each message will go in the "content" field, and the username will be formed from the user's email address by replacing the at sign with a dot. Each message will be "signed" with a key generated just for that user. We will provide you with the functions (in JavaScript) to sign the content. The most complicated field is "parents", which needs to reference all of the messages that the user is replying to (a user may have replied to more than one message). However, the way the archives are structured should make it easy to locate the messages being replied to from the header information. You should confirm this!
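As a rough illustration of the username and "parents" mapping described above, here is a minimal Python sketch. It rests on several assumptions that need checking against the actual archive: that each message can be read as a standard RFC 822-style article with From, References and In-Reply-To headers, that the last entry in References is the message being replied to, and that Message-IDs can later be translated into whatever reference the puffball format expects. The function names are hypothetical, and signing is left out because it will be done with the JavaScript functions we provide.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Matches one Message-ID token such as <abc123@example.com>
MESSAGE_ID = re.compile(r"<[^<>\s]+>")


def username_from_address(from_header):
    """Build the username by replacing the at sign in the address with a dot."""
    _, address = parseaddr(from_header)
    return address.replace("@", ".")


def parent_ids(msg):
    """Return the Message-IDs this article replies to (assumed header layout).

    Assumption: In-Reply-To, when present, lists every message replied to;
    otherwise the last ID in References is taken as the single parent.
    """
    in_reply_to = MESSAGE_ID.findall(msg.get("In-Reply-To", ""))
    if in_reply_to:
        return in_reply_to
    references = MESSAGE_ID.findall(msg.get("References", ""))
    return references[-1:]


def draft_puff(raw_message):
    """Map one raw article to an unsigned, partial puff (hypothetical helper)."""
    msg = message_from_string(raw_message)
    return {
        "username": username_from_address(msg.get("From", "")),
        "content": msg.get_payload(),
        # Still Usenet Message-IDs; they must be mapped to puff references later.
        "parents": parent_ids(msg),
    }
```

A second pass over the archive would then be needed to replace each Message-ID in "parents" with the identifier of the corresponding puff, once those puffs have been created and signed.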
You can write the function that parses the archives and creates the puffs in Python, PHP, Perl, JavaScript (with Node), or as a Linux shell script. We will want the code you create, as well as the puffs it creates.
Hello, I would like to participate in your project. I've read the puff definition and I understand it (it is pretty straightforward), but I do not understand the mapping against the Usenet file I downloaded. Could you please post a specific Usenet file from the site so I can analyze it?
I have extensive working experience with programming (mainly Python) and regular expressions, having worked as an instructor at 4Linux. I believe I'm the ideal professional for this project.