posted by Eduardo Favaretto on June 8, 2008 at 08:53 PM
True Knowledge: semantic search for facts
True Knowledge is an Internet search company based in Cambridge, England. It is building its own knowledge base, importing data from sources (like Wikipedia) and encouraging its user base to contribute facts and add new knowledge in a more structured manner. A simple way of thinking about this technology: "a website where you can ask questions about any subject and get a direct response".
The True Knowledge system is a database capable of holding any human factual knowledge in a form that both computers and humans can understand. It models the universe as a vast collection of "entities" and "facts" about those entities.
An entity can be anything at all, for example: Albert Einstein, the color blue, Germany, a specific length of time. A fact is a relationship between entities, such as: "Albert Einstein was born in Germany".
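To make the entity/fact idea concrete, here is a minimal sketch in Python. It is not True Knowledge's actual implementation: the `Fact` tuple, the `FactStore` class and its `ask` method are hypothetical names, and the only assumption taken from the description above is that a fact is a relationship between entities.

```python
from collections import namedtuple

# Hypothetical representation: a fact links two entities through a relation.
Fact = namedtuple("Fact", ["subject", "relation", "obj"])


class FactStore:
    """Toy in-memory store of facts about entities (illustration only)."""

    def __init__(self):
        self.facts = set()

    def add(self, subject, relation, obj):
        # Adding the same fact twice has no effect, because facts live in a set.
        self.facts.add(Fact(subject, relation, obj))

    def ask(self, subject=None, relation=None, obj=None):
        """Return the facts matching every field that is not None."""
        return sorted(
            f for f in self.facts
            if (subject is None or f.subject == subject)
            and (relation is None or f.relation == relation)
            and (obj is None or f.obj == obj)
        )


store = FactStore()
store.add("Albert Einstein", "was born in", "Germany")
store.add("Albert Einstein", "is a", "person")

answers = store.ask(subject="Albert Einstein", relation="was born in")
print(answers[0].obj)
```

A question such as "Where was Albert Einstein born?" then becomes a lookup with one field left open, which is why a store like this can return a direct answer instead of a list of links.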
The technology is designed to allow anyone on the Internet to add new knowledge to the system, and this knowledge is then instantly available for the rest of the planet to use. It sounds complicated, and it may offer little to a "normal user" who just wants a good search experience: get in, get info, get out.
I have received an invitation to test the service [beta version with access restricted to invited users]. After logging in, I typed in "Who is Christopher Columbus?" and got an immediate answer, as follows:
 It was great to see the message: "We don't have an identifying image of this person. Add one."
I was glad to find that I could add a picture [and contribute my own knowledge] about "Christopher Columbus", much as on Wikipedia. So I uploaded a picture of Christopher Columbus and it was instantly there, ready to be viewed by other users, as follows:

Congratulations to the True Knowledge team on its approach to building public knowledge bases. There is still hard work ahead, especially on the quality of the answers and on the user experience.
posted by Eduardo Favaretto on May 23, 2008 at 10:30 AM
Powerset gives wings to Wikipedia
I have tested Powerset's first product, released two weeks ago. It is a new idea that reinvents the search and discovery experience for 3 million pages of Wikipedia articles, giving users a better way to digest and navigate content quickly. It tries to understand natural language searches and competes with the keyword-matching engines that dominate search today.
Powerset differs from Google and other search engines in that it linguistically parses sentences, finding subjects, verbs, objects, synonyms, and other elements, using technology licensed from Xerox PARC. It extracts and indexes concepts, relationships and meanings, rather than keywords.
Instead of being limited to keywords, Powerset allows users to enter phrases, questions or keywords. It works better in English than in Portuguese [or rather, it does not work in Portuguese yet]. To check that, I ran some simple tests, giving Powerset and Google two phrases, "First man on the moon" and "Quem descobriu o Brasil" ("Who discovered Brazil"), and one two-keyword query, "Carmen Miranda".
For the phrase "First man on the moon", both search engines returned good links (see images below).

For the second phrase, "Quem descobriu o Brasil" (in Portuguese), Powerset "understood" only some related information, returning content from a Wikipedia page about Carnival (???), while Google gave me a lot of results to be "explored"...


Finally, when I gave these search engines just the two keywords, "Carmen Miranda", I got a good answer from Google
and an excellent answer from Powerset: all the information about the "artist", detailed and compiled in a way that is easy to understand.

posted by Eduardo Favaretto on February 12, 2008 at 08:27 AM
Microsoft + Yahoo = Microhoo? Not yet.
Yahoo!'s Board of Directors concluded that Microsoft's unsolicited proposal is not in the best interests of Yahoo! and its stockholders. "(...) After careful evaluation, the Board believes that Microsoft's proposal substantially undervalues Yahoo! including our global brand, large worldwide audience, significant recent investments in advertising platforms and future growth prospects, free cash flow and earnings potential, as well as our substantial unconsolidated investments. The Board of Directors is continually evaluating all of its strategic options in the context of the rapidly evolving industry environment and we remain committed to pursuing initiatives that maximize value for all stockholders. (...)" - from Yahoo!'s official Press Release (February 11, 2008): "Yahoo! Board of Directors Says Microsoft's Proposal Substantially Undervalues Yahoo!".
It’s not the first time Microsoft "has offered" to buy Yahoo!, as Dave Winer (Scripting News) covered in a post (May 4, 2007). The offer on the table this time: US$44.6 billion - see Microsoft's official Press Release (February 1, 2008). Danny Sullivan (Search Engine Land) wrote a great post discussing the merits of the deal.
I noticed last month in Brazil (it seems to happen in many other countries too) that Microsoft has been using "Yahoo! Search Marketing" as "a sponsored link service" on the site www.live.com. In short: if you pay for an online campaign at Yahoo!, the same ads will be displayed on Microsoft's search site too. Friendly companies?
posted by Eduardo Favaretto on December 7, 2007 at 03:29 PM
Why does Google (and others...) show only 1000 results?
Can anyone answer this question for me? Technically, I have noticed in a practical experiment at ::buscas.com that there is a "deep process" involved in seeking a keyword across an entire text field in complex database structures (over 5 million records, for example). That is, when the keyword is at the beginning of the indexed field (e.g. LINK in LINKING), it is easy and quick to find, but when the same keyword is a partial string inside a bigger word (e.g. LINK in FIRSTLINKMEDIA), as a combined or mixed string, it takes time to find. Because of that [and other "unknown" reasons], most of the very popular search engines [Live Search, Yahoo and Dogpile, besides Google] limit user searches to a few thousand results, even when the query matches several million valid records. Maybe users don't want to wait more than a few milliseconds... too much time (???).
By the way, if I receive 52,700,000 results for the keyword "Brazil" (my country's name in English), but I can "only" see the summaries of the "first" 1000 links (99 pages), why do the other 52,699,000 need to exist?
It is fair to say that there is a lot of information stored in databases [i.e. on the machine side], but humans are responsible for creating the "filters" or "indexes" that decide which records will be "visible". Would that be PageRank's side? Thanks, Larry and Sergey.
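The difference between finding a keyword at the beginning of an indexed field and finding it in the middle of a word can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the word list is made-up sample data, the function names are hypothetical, and a real search engine or database uses far more elaborate index structures than a sorted list.

```python
import bisect

# Made-up sample data standing in for an indexed text field.
words = sorted(["FIRSTLINKMEDIA", "LINKING", "LINKS", "MEDIA", "SEARCH"])


def prefix_search(sorted_words, prefix):
    """Use the sort order: binary-search to the first candidate, then read on."""
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # sorted order guarantees no later word can match
        matches.append(word)
    return matches


def substring_search(all_words, fragment):
    """No shortcut from the sort order: every word has to be examined."""
    return [word for word in all_words if fragment in word]


prefix_hits = prefix_search(words, "LINK")
substring_hits = substring_search(words, "LINK")
print(prefix_hits)
print(substring_hits)
```

The prefix lookup only touches the few neighbouring entries that start with LINK, while the substring lookup has to scan the whole list to discover FIRSTLINKMEDIA. With millions of records, that full scan is the slow "deep process".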
posted by Eduardo Favaretto on September 27, 2007 at 04:35 PM
Search 2.0, Search 3.0: what will be the next (r)evolution?
"Today a typical Google search returns up to hundreds of thousands or even millions of results -- but we only really look at the first page or two of results. What about the other results we don't look at? There is a lot of room to improve the productivity of search, and the help people deal with increasingly large collections of information. Keyword search doesn't understand the meaning of information, let alone its structure. Natural language search is a little better at understanding the meaning of information -- but it still won't help with the structure of information. To really improve productivity significantly as the Web scales, we will need forms of search that are data-structure-aware -- that are able to search within and across data structures, not just unstructured text or semi structured HTML. This is one of the key benefits of the coming Semantic Web: it will enable the Web to be navigated and searched just like a database.", said Radar Networks CEO and Founder Nova Spivack in his blog.
O'Reilly Media co-founder Dale Dougherty, editor and publisher of MAKE magazine, coined the term Web 2.0 in 2004, contrary to popular perception. In a recent web interview, he said a few words about the origin of the term: "a new generation was going to be coming forward, and they would do things and think differently than the previous generation... the next new technology was once again the Web".
Google CEO Eric Schmidt was recently at the "Seoul Digital Forum" (South Korea) and launched into a great definition of Web 3.0: "applications that are pieced together" - with the characteristics that the apps are relatively small, the data is in the cloud (Internet), the apps can run on any device (PC or mobile), the apps are very fast and very customizable, and are distributed "virally" (social networks, email, etc).
What kind of web search model can we expect to have in 3 years? For some reason, I can't stop asking myself: so, what will be the next step? We have been noticing some "signs" regarding the evolution of the interface between us and search engines. The way we interact with them today needs to become easier, because we now need to find other kinds of content, not just HTML pages: video, audio files, music, podcasts, presentation files, PDF documents, high-definition photos, feeds, and content for mobile devices (iPod, PDA, smartphones, etc.). My perception is that people do not have enough time for anything else... Whether the technology is 2.0 or 3.0, the most important step in my opinion is to combine technology and user experience (continuous interaction) to help people find what they want, precisely, wherever they are, using any device rather than only computers.