Spiga

Who will filter the stream first?

January 26, 10 by Craig

Facebook is where I have more noise than on any other social site, though twitter may tie facebook in the sheer amount of content I receive in my feed. In terms of the ratio of what I care about to what I see, facebook is a lot better, thanks to its news feed versus live feed. However, the news feed is still very often off. I wrote some time back about web 3.0, and how showing me what I want to see is essentially what the web will become: taking the vast amount of content and distilling it into what I want to see. People seem to be taking very haphazard shots at it, and it’s quite a letdown.

I’ll start with twitter. Twitter does no filtering of the content on its own. Instead it puts the control in the user’s hands, letting me create filters based on friends. This means I have to take the time to go through all 600+ people I follow and group them into lists, then navigate to each list when I want to view a given topic. This is not only time intensive, it still doesn’t accomplish what I want, which in a lot of cases, especially on twitter, is information by topic.

Moving on to facebook, they at least take care of the process (almost transparently) of working out who I want to see. If someone shows up, I can simply say hide from the news feed. I have a strong hunch that when I click out of the news feed and go to someone’s profile, it weights that person to show up more frequently. This is a very logical deduction to make, and in most cases I’m pretty pleased with the result. The big problem is that it’s still all about the people, not about the content. If I clicked on someone because they mentioned coming to visit California, I may not have talked to them in 2 years, but would simply like to offer my help when they visit. That doesn’t mean I want to get updates about them after they visit.

Facebook is definitely a leader in this space. First, they’re one of the few with enough content in a feed that filtering even matters. Second, they get the user past analysis paralysis, which is a positive move; however, the classification is wrong. Whether it’s twitter, facebook, or some other service that hasn’t emerged yet, filtering a mass of information down to what a user cares about will be huge.

Amazon and Netflix have done this for products, why has no one tried this for information?


2 responses for this post

  1. Matt Says:

    To answer your last question: because products have a defined set of meta-data and a context in which that data is useful. Amazon and Netflix also have years upon years’ worth of behavioral patterns stored to determine similar interest among items.

    The problem with the information on Facebook and other social sites is that it isn’t nearly as easily placed into context using artificial intelligence. The amount of AI work needed to do something like the example you listed with “California” could easily result in you seeing that someone is currently enjoying a Tupac and Dr. Dre track, something you aren’t very likely interested in.

    I think the better question becomes: how do you convert a user’s data into actual information? The first obvious step to me is meta-data, even if it means having the user voluntarily qualify their shared data and add extra information to it. Figuring out this meta-data from a 140-character sentence seems impossible without some type of user input. I think we already have a lot of tools to help us make a more educated guess: geolocation, “tagging” of other users, etc., but we still have a long way to go.

  2. Craig Says:

    I agree it COULD result in misclassification, but a lot is being done quite well with unsupervised clustering that’s proving effective. Users on at least some services, such as twitter, are already tagging other users and validating which items fall in a category via tagging. By expanding the classification beyond just California, to California and Traveling, you could clearly identify that as an area of interest, thereby eliminating things pertaining only to California or only to Traveling. Essentially any filtering you could do would be some improvement, and how effective you could be would be restricted as much by the amount you wanted to see in your feed as by how good your filtering was. Only the users who greatly limited what they wanted in a feed would miss out. Since you’re already opting in to relevant users, rather than working on the entire corpus of twitter, you have to worry less about the information overload that conflicting classifications might cause.

    I think your initial question really becomes the driver, as the input to filtering the stream. You begin with unsupervised clustering via voluntary tagging, like a hashtag on twitter, and couple that with geolocation. From there you have the classification of information and can tie that to how the user interacts.
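
To make the “California and Traveling” idea from the comments concrete, here is a minimal sketch in Python. It is not how twitter or facebook filter anything; the `Post` type, the `extract_hashtags` and `filter_feed` helpers, and the sample feed are all hypothetical, and it assumes the only signals available are voluntary hashtags plus any tags a user attached by hand. A post is kept only when it carries every tag in at least one interest, so something tagged with California alone is dropped.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical feed item: author, text, and any manually added tags."""
    author: str
    text: str
    tags: set[str] = field(default_factory=set)


def extract_hashtags(text: str) -> set[str]:
    """Collect lowercase hashtags from the text, trimming trailing punctuation."""
    found = set()
    for word in text.split():
        if word.startswith("#"):
            tag = word[1:].strip(".,!?").lower()
            if tag:
                found.add(tag)
    return found


def filter_feed(posts: list[Post], interests: list[set[str]]) -> list[Post]:
    """Keep posts whose tags contain ALL tags of at least one interest."""
    kept = []
    for post in posts:
        post_tags = {t.lower() for t in post.tags} | extract_hashtags(post.text)
        if any(interest <= post_tags for interest in interests):
            kept.append(post)
    return kept


# Made-up sample feed for illustration only.
feed = [
    Post("ann", "Flying out next week #California #Traveling"),
    Post("bob", "Listening to #California Love"),
    Post("cat", "Airport delays again #traveling"),
    Post("dan", "Road trip planning", tags={"california", "traveling"}),
]

# One interest that requires both tags together.
interests = [{"california", "traveling"}]
kept_authors = [p.author for p in filter_feed(feed, interests)]
print(kept_authors)
```

Requiring the full tag set is the simplest possible rule; it trades recall for precision, which matches the point above that users who limit their feed the most are the ones who risk missing something.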
