Notes on Peer-To-Peer: Harnessing the Power of Disruptive Technologies

Peer-To-Peer (amazon, oreilly) is an old book by internet standards (published in March of 2001), but it’s chock full of interesting thoughts and perspectives.

· On Gnutella: did you know that you can watch what other people are searching for? The book has a screenshot of Gnutella client v0.56; I use Gnucleus, where you can do the same thing by clicking Tools –> Statistics –> Log. Betcha you didn’t know that an entire company was founded on that idea, did you?

· footnote: Chaffing and Winnowing: Confidentiality without Encryption by Ronald L. Rivest

· On the ‘small-world’ model and the importance of bridges: “… The key to understanding the result lies in the distribution of links within social networks. In any social grouping, some acquaintances will be relatively isolated and contribute few new contacts, whereas others will have more wide-ranging connections and be able to serve as bridges between far-flung social clusters. These bridging vertices play a critical role in bringing the network closer together… It turns out that the presence of even a small number of bridges can dramatically reduce the lengths of paths in a graph, …” — Reference “Collective dynamics of ‘small-world’ networks” published in Nature, download the PDF.

· Publius looks like some cool software.

· Crowds: “Crowds is a system whose goals are similar to that of mix networks but whose implementation is quite different. Crowds is based on the idea that people can be anonymous when they blend into a crowd. As with mix networks, Crowds users need not trust a single third party in order to maintain their anonymity. A crowd consists of a group of web surfers all running the Crowds software. When one crowd member makes a URL request, the Crowds software on the corresponding computer randomly chooses between retrieving the requested document or forwarding the request to a randomly selected member of the crowd…. ” — Read more about Crowds at the official site.
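Just to make that forwarding rule concrete, here’s a toy simulation (my own sketch, not the actual Crowds protocol, which also routes replies back along the same path and encrypts the links between members):

// Toy simulation of the Crowds forwarding rule (my own sketch, not the
// real implementation): each member flips a biased coin and either
// forwards the request to a random crowd member or retrieves the page.
using System;

class CrowdsSketch
{
    const double ForwardProbability = 0.75; // illustrative value only
    static Random rng = new Random();

    static void HandleRequest(string url, int member, int crowdSize)
    {
        while (rng.NextDouble() < ForwardProbability)
        {
            int next = rng.Next(crowdSize); // may even pick ourselves
            Console.WriteLine("Member {0} forwards to member {1}", member, next);
            member = next;
        }
        Console.WriteLine("Member {0} retrieves {1} for the crowd", member, url);
    }

    static void Main()
    {
        HandleRequest("http://example.com/", 0, 10);
    }
}

From the server’s point of view, the request could have originated with any member of the crowd, which is where the anonymity comes from.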

· Tragedy of the Commons. This idea was mentioned in chapter 16 on Accountability and is talked about in various other books I’ve read, but I’m not sure that I ever recorded the source. The idea came from Garrett Hardin in a paper written in 1968 called “The Tragedy of the Commons”, which you can read on his site.

· On accountability and how we really *are* all just six degrees apart. Read the PGP Web of Trust statistics if you don’t believe it.

· On reputation systems, specifically Advogato’s Trust Metric

· On reputation scoring systems, a good system “… will possess many of the following qualities:

  • Accurate for long-term performance: The system reflects the confidence (the likelihood of accuracy) of a given score. It can also distinguish between a new entity of unknown quality and an entity with bad long-term performance.
  • Weighted toward current behavior: The system recognizes and reflects recent trends in entity performance. For instance, an entity that has behaved well for a long time but suddenly goes downhill is quickly recognized and no longer trusted.
  • Efficient: It is convenient if the system can recalculate a score quickly. Calculations that can be performed incrementally are important.
  • Robust against attacks: The system should resist attempts of any entity or entities to influence scores other than by being more honest or having higher quality.
  • Amenable to statistical evaluation: It should be easy to find outliers and other factors that can make the system rate scores differently.
  • Private: No one should be able to learn how a given rater rated an entity except the rater himself.
  • Smooth: Adding any single rating or small number of ratings doesn’t jar the score much.
  • Understandable: It should be easy to explain to people who use these scores what they mean — not only so they know how the system works, but so they can evaluate for themselves what the score implies.
  • Verifiable: A score under dispute can be supported with data.”
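To make a few of those qualities concrete (efficient, weighted toward current behavior, smooth), here’s a toy score calculation of my own, not something from the book: each rating folds in with an O(1) incremental update, and a decay factor makes recent behavior count more than old behavior.

// Toy reputation score (my own sketch, not from the book). Updates are
// O(1) and incremental; the decay factor weights recent ratings more
// heavily, so a long-good vendor who turns bad loses trust quickly,
// and a single new rating only nudges an established score.
class ReputationSketch
{
    const double Decay = 0.9;  // arbitrary illustrative choice
    double weightedSum = 0.0;  // decayed sum of ratings
    double totalWeight = 0.0;  // decayed count of ratings

    // rating is +1 for a good transaction, -1 for a bad one
    public void AddRating(int rating)
    {
        weightedSum = Decay * weightedSum + rating;
        totalWeight = Decay * totalWeight + 1.0;
    }

    // Normalized to the range [-1, +1]
    public double Score
    {
        get { return totalWeight == 0.0 ? 0.0 : weightedSum / totalWeight; }
    }
}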

· On reputation scoring systems: “Vulnerabilities from overly simple scoring systems are not limited to “toy” systems like Instant Messenger. Indeed, eBay suffers from a similar problem. In eBay, the reputation score for an individual is a linear combination of good and bad ratings, one for each transaction. Thus, a vendor who has performed dozens of transactions and cheats on only 1 out of every 4 customers will have a steadily rising reputation, whereas a vendor who is completely honest but has done only 10 transactions will be displayed as less reputable. As we have seen, a vendor could make a good profit (and build a strong reputation!) by being honest for several small transactions and then being dishonest for a single large transaction.”

· The book was written when Reputation Technologies was still a distinct company, but I thought this list of Reputation and Asset Management vendors was interesting in that reputation is something that is becoming more and more important… For instance, when was the last time you purchased something from eBay where the vendor had a bad rating? Never, right? Did you ever stop to think about how the vendor in question got a bad rating? Since when is eBay a good judge of someone’s character? Why do we trust eBay’s reputation algorithms?

· On the optimal size of an organization: “Business theorists have observed that the ability to communicate broadly and deeply through the Internet at low cost is driving a process whereby large businesses break up into a more competitive system of smaller component companies. They call this process ‘deconstruction.’ This process is an example of Coase’s Law, which states that other things being equal, the cost of a transaction — negotiating, paying, dealing with errors or fraud — between firms determines the optimal size of the firm. When business transactions between firms are expensive, it’s more economical to have larger firms, even though larger firms are considered less efficient because they are slower to make decisions. When transactions are cheaper, small firms can replace the larger integrated entity.”

· “Why Johnny Can’t Encrypt: A Usability Evaluation of PGP 5.0 (pdf)” from Alma Whitten

Notes on “Things A Computer Scientist Rarely Talks About”

I picked up “Things A Computer Scientist Rarely Talks About” by Donald Knuth at Barnes & Noble a couple weeks back on a whim after spending 45 minutes looking through the fascinating science/technology section at the back of the Natick store. (sidenote: some Barnes and Nobles have fabulous science/technology/computer science/engineering sections with rows and rows of books… and some have “JavaScript for Dummies”. Why is that?)

It’s not a book about computer science but is rather the transcribed text of his series of public lectures about interactions between faith and computer science (which you can view online). Couple quotes I deemed noteworthy for one reason or another:

· On page 28 he talks about how he used randomization when grading papers while teaching at Stanford. Reminder to read up on “zero knowledge proofs” sometime.

· The basis of his lectures was a book he wrote called “3:16 Bible Texts Illuminated” which aimed to gain insight into the Bible by taking 59 random snapshots (verses) and studying them in detail. His son was inspired indirectly by this book: “… to start up the H-20 project, which is designed to answer the question ‘What is Massachusetts?’ … He and my daughter have a book of maps of Massachusetts at a large scale; they live fairly near campus, at coordinates H-20 in the relevant map of Cambridge. So they’re going to try and visit H-20 on all the other pages of their book. That should give terrific insights into the real nature of Massachusetts.”

· On learning: “… I learned that the absolute best way to find out what you don’t understand is to try to express something in your own words. If I had been operating only in input mode, looking at other translations but not actually trying to output the thoughts they expressed, I would never have come to grips with the many shades of meaning that lurk just below the surface. In fact, I would never have realized that such shades of meaning even exist, if I had just been inputting. The exercise of producing output, trying to make a good translation by yourself, is a tremendous help to your education.”

· A quote from Peter Gomes at the beginning of his book called “The Good Book”: “… The notion that [the texts of the Bible] have meaning and integrity, intention, contexts and subtexts, and that they are part of an enormous history of interpretation that has long involved some of the greatest thinkers in the history of the world, is a notion often lost on those for whom the text is just one more of the many means the church provides to massage the egos of its members.”

· One of the questions asked was about Douglas Hofstadter’s book “Le Ton Beau de Marot: In Praise of the Music of Language”.

· “My experience suggests that the optimum way to run a research think tank would be to take people’s nice offices away from them and to make them live in garrets, and even to insist that they do non-researchy things. That’s a strange way to run a research center, but it might well be true that the imposition of such constraints would bring out maximum creativity.” — after mentioning that he was able to come up with several relatively important ideas (attribute grammars, Knuth-Bendix completion, LL(k) parsing) during the “most hectic year of his life”.

· On aesthetics according to C. S. Peirce: “Aesthetics deals with things that are admirable; ethics deals with things that are right or wrong; logic deals with things that are true or false.”

· “Somehow the whole idea of art and aesthetics and beauty underlies all the scientific work I do. Whatever I do, I try to do it in a way that has some elegance; I try to create something that I think is beautiful. Instead of just getting a job done, I prefer to do my work in a way that pleases me in as many senses as possible…. I like especially to be associated with art, in the sense of making things of beauty.”

· Planet Without Laughter: “… It’s a marvelous parable on many levels, about the limits of rationality. You can read it to get insight about all religions, and about the question of form over substance in religion.”

· Eugene Wigner, a Princeton physicist: “It is good that the completion of our scientific work is an unattainable ideal. Striving toward it is attracting many of us, and gives much pleasure and satisfaction… If science were completed, the satisfaction which research, the furthering of human knowledge, had provided, would disappear. Also, even more men would strive for power and domination…. We know that there are facts and insights which we cannot communicate to animals — no animal is familiar, for instance, with the associative law of multiplication… Is it not possible that our understanding of nature also has limitations?… I hope that, even if this should be true, we will be able to continue the extension of our knowledge indefinitely, … even if the limit thereof will always remain widely separated from the complete knowledge and understanding of nature.”

· On artificial life: “… the Game of Life illustrates the power of evolutionary mechanisms. Stable configurations arise out of random soup, usually very quickly; and many of those configurations have properties analogous to biological organisms.”

· Stuart Sutherland, in the 1996 edition of the International Dictionary of Psychology: “Consciousness: The having of perceptions, thoughts and feelings; awareness. The term is impossible to define except in terms that are unintelligible without a grasp of what consciousness means. Consciousness is a fascinating but elusive phenomenon: it is impossible to specify what it is, what it does, or why it evolved. Nothing worth reading has ever been written on it.”

.NET HttpRequest.ValidateInput()

I mentioned that v1.1 of ASP.NET by default validates input received from QueryString, Form and Cookie scope. You can turn off this validation site-wide by tweaking the web.config:

<configuration>
  <system.web>
    <pages validateRequest="false" />
  </system.web>
</configuration>

But then you’re left with no validation, right? Wrong. You can use the ValidateInput() method of the HttpRequest object programmatically in any code that has access to the HttpRequest instance. Very useful stuff.
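For example, with validateRequest="false" in web.config, a page that still wants the check could do something like this (a minimal sketch; the page class and field names are made up):

// Minimal sketch: site-wide validation is off, but this page opts back in.
// ValidateInput() arms the check; it actually fires (throwing an
// HttpRequestValidationException) the next time Form, QueryString,
// or Cookies are read.
using System;
using System.Web.UI;

public class CommentPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);
        Request.ValidateInput();

        // This access now throws if the value looks like markup or script.
        string comment = Request.Form["comment"];
        if (comment != null)
            Response.Write(Server.HtmlEncode(comment));
    }
}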

One question though: What is potentially dangerous data according to Microsoft? And can you modify that definition? I’m guessing the answers are: a) we’ll never know and b) no. Given their track record, does it make sense to trust Microsoft to validate the input you receive from client browsers when the browser they created can’t be trusted?

More on the out method parameter

I’m sure this is boring for about 99% of you, but I’m writing for my own benefit anyway. I mentioned the ‘out method parameter’ yesterday because I saw it used in a custom library I’m using, and today I found out that the Double class uses it as well. I think it’s a great example of how it should be used. It’s used in the TryParse method:

public static bool TryParse(
   string s,
   NumberStyles style,
   IFormatProvider provider,
   out double result
);

The TryParse method is like the Parse method, except this method does not throw an exception if the conversion fails. If the conversion succeeds, the return value is true and the result parameter is set to the outcome of the conversion. If the conversion fails, the return value is false and the result parameter is set to zero.
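A quick usage sketch:

// TryParse signals failure through the return value instead of throwing,
// and hands the parsed number back through the out parameter.
using System;
using System.Globalization;

class TryParseDemo
{
    static void Main()
    {
        double result;
        if (Double.TryParse("3.14", NumberStyles.Float,
                            CultureInfo.InvariantCulture, out result))
            Console.WriteLine("Parsed: " + result);           // Parsed: 3.14
        if (!Double.TryParse("oops", NumberStyles.Float,
                             CultureInfo.InvariantCulture, out result))
            Console.WriteLine("Failed, result = " + result);  // result is 0
    }
}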

I like it because throwing an exception (i.e., what the Parse() method does) violates rule #39 of Effective Java Programming, which says to “… Use exceptions only for exceptional conditions.” Using an out parameter feels cleaner and simpler. Anyone else think so?

out method parameter in C#

Just discovered this C# tidbit called the out method parameter. Let’s say you have a method:

public String ReplaceAll(String toReplace, String replaceWith)

and you want to know how many replacements were actually made. With a regular method you can only return the modified string. The out method parameter gives you the ability to return another variable. The modified method would look like this:

public String ReplaceAll(String toReplace, String replaceWith, out int numReplaced)

and then concretely:

String myString = "aaron";
int replaced;
String result = myString.ReplaceAll("a", "k", out replaced);
Console.WriteLine("The method ReplaceAll() replaced " + replaced + " characters.");
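Of course, String doesn’t actually have a ReplaceAll method; if you wrote one yourself it would have to be a helper that takes the string as an argument. A rough sketch (hypothetical code, just to show the out parameter being assigned):

using System;
using System.Text;

public class StringUtil
{
    // Hypothetical helper: replaces every occurrence of toReplace and
    // reports the number of replacements through the out parameter.
    public static String ReplaceAll(String input, String toReplace,
                                    String replaceWith, out int numReplaced)
    {
        if (toReplace == null || toReplace.Length == 0)
            throw new ArgumentException("toReplace must be non-empty");

        numReplaced = 0;
        StringBuilder sb = new StringBuilder();
        int start = 0;
        int index = input.IndexOf(toReplace);
        while (index >= 0)
        {
            sb.Append(input, start, index - start);  // copy text before the match
            sb.Append(replaceWith);
            numReplaced++;
            start = index + toReplace.Length;
            index = input.IndexOf(toReplace, start);
        }
        sb.Append(input.Substring(start));           // copy the tail
        return sb.ToString();
    }
}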

The ref method parameter is a similar idea, but you have to initialize the variable before passing it to the method.
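Here’s a minimal sketch of the difference (my own example):

// out: the caller needn't initialize the variable; the method must assign it.
// ref: the caller must initialize it first; the method may read and modify it.
using System;

class RefVsOut
{
    static void SetAnswer(out int x) { x = 42; }
    static void AddOne(ref int x) { x = x + 1; }

    static void Main()
    {
        int a;          // fine to leave unassigned when passing as out
        SetAnswer(out a);

        int b = 10;     // must be definitely assigned before passing as ref
        AddOne(ref b);

        Console.WriteLine(a + " " + b);  // prints: 42 11
    }
}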

The Philosophy of Ruby: An interview with Yukihiro Matsumoto

Bill Venners just posted the first installment in a series of articles with Yukihiro Matsumoto, the creator of the programming language Ruby. Specifically, they talk about how Ruby wasn’t designed to be the ‘perfect’ language (but rather a language that feels good when used), and “… the danger of orthogonality, granting freedom with guidance, the principle of least surprise and the importance of the human in computer endeavors.”

I thought the quote “Language designers want to design the perfect language.” could also be re-phrased as “Programmers want to feel like their language is the perfect language.” I know this blog is being syndicated through fullasagoog.com (as a ColdFusion blog) and through markme.com (as a Java blog), and I read a lot of the blogs on both sites, as well as some of the blogs on weblogs.asp.net and javablogs.com. It’s interesting that all of the above-mentioned sites (not to mention Slashdot) are generally short-sighted when it comes to the subject of which language is better (see the discussions re: Java as the SUV of programming languages, PHP vs. ASP.NET, MX vs. .NET) and hammer away at how x is better than y. I think Yukihiro is right: there isn’t a ‘perfect’ programming language and there never will be. Macromedia employees probably aren’t encouraged to say this, but I’d encourage anyone writing a ColdFusion application to try writing a similar application in ASP.NET, or in Java using Struts, or in ASP… or even Ruby. You’ll be amazed at how much you’ll learn.